
February 11, 2010 10:0 WSPC/148-RMP J070-S0129055X10003874 Reviews in Mathematical Physics Vol. 22, No. 1 (2010) 1–53 ...

Author: H. Araki | V. Bach | J. Yngvason (Editors)



Reviews in Mathematical Physics, Vol. 22, No. 1 (2010) 1–53
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10003874

SPECTRAL THEORY OF NO-PAIR HAMILTONIANS

OLIVER MATTE
Mathematisches Institut, Ludwig-Maximilians-Universität, Theresienstraße 39, D-80333 München, Germany
[email protected]

EDGARDO STOCKMEYER∗
Institut für Mathematik, Johannes Gutenberg-Universität, Staudingerweg 9, D-55099 Mainz, Germany
[email protected]

Received 27 October 2008
Revised 18 August 2009

We prove an HVZ theorem for a general class of no-pair Hamiltonians describing an atom or positively charged ion with several electrons in the presence of a classical external magnetic field. Moreover, we show that there exist infinitely many eigenvalues below the essential spectrum and that the corresponding eigenfunctions are exponentially localized. The novelty is that the electrostatic and magnetic vector potentials as well as a nonlocal exchange potential are included in the projection determining the model. As a main technical tool, we derive various commutator estimates involving spectral projections of Dirac operators with external fields. Our results apply to all coupling constants $e^2 Z < 1$.

Keywords: Dirac operator; Brown and Ravenhall; no-pair operator; pseudo-relativistic; Furry picture; intermediate pictures; HVZ theorem; exponential localization.

Mathematics Subject Classification 2000: 81Q10, 47B25

1. Introduction

The relativistic dynamics of a single electron moving in the potential of a static nucleus, $V_C \le 0$, in the presence of an external classical magnetic field $B = \mathrm{curl}\,A$ is generated by the Dirac operator^a

$$D_{A,V_C} := \alpha\cdot(-i\nabla + A) + \beta + V_C. \tag{1.1}$$

∗ On leave from Mathematisches Institut, Ludwig-Maximilians-Universität, Theresienstraße 39, D-80333 München, Germany.
a Energies are measured in units of $mc^2$, $m$ denoting the rest mass of an electron and $c$ the speed of light. Length is measured in units of $\hbar/(mc)$, which is the Compton wave length divided by $2\pi$. $\hbar$ is Planck's constant divided by $2\pi$. In these units, the square of the elementary charge equals the fine structure constant, $e^2 \approx 1/137.037$.

Here an electron is a state lying in the positive spectral subspace of $D_{A,V_C}$. A ground state of the one-electron atom modeled by $D_{A,V_C}$ can be characterized as an energy minimizing bound state of the restriction of $D_{A,V_C}$ to its positive spectral subspace, $\Lambda^+_{A,V_C} L^2(\mathbb{R}^3,\mathbb{C}^4)$, where

$$\Lambda^+_{A,V_C} = \frac{1}{2} + \frac{1}{2}\,\mathrm{sgn}(D_{A,V_C}). \tag{1.2}$$

This is confirmed by Dirac's interpretation of the negative spectral subspace as a completely filled sea of virtual electrons which, on account of Pauli's exclusion principle, forces an additional electron to attain a state of positive energy. On the other hand, it is well known that there is no canonical a priori given atomic or molecular Hamiltonian generating the relativistic time evolution of $N > 1$ interacting electrons. Guided by non-relativistic quantum mechanics one might naively propose to start with the formal expression

$$\sum_{j=1}^{N} D^{(j)}_{A,V_C} + \sum_{1\le j<k\le N} W_{jk}, \tag{1.3}$$

where the superscript $(j)$ means that the operator below acts on the $j$th electron and $W_{jk} \ge 0$ is the interaction potential between the $j$th and $k$th electron. It then turns out, however, that (1.3) suffers from the phenomenon of continuum dissolution which is also known as the Brown–Ravenhall disease [9]. That is, the eigenvalue problem associated to (1.3) has no normalizable solutions; see, e.g., [43, 48]. A frequently used ansatz to find a reasonable and semi-bounded Hamiltonian for an $N$-electron atomic or molecular system again incorporates the concept of a Dirac sea. Namely, one projects (1.3) onto the $N$-fold antisymmetric tensor product of a suitable one-electron subspace, i.e. one considers operators of the form

$$H_N := \Lambda^{+,N}_{A,V}\Big(\sum_{j=1}^{N} D^{(j)}_{A,V_C} + \sum_{1\le j<k\le N} W_{jk}\Big)\Lambda^{+,N}_{A,V}, \tag{1.4}$$

where

$$\Lambda^{+,N}_{A,V} := \bigotimes_{j=1}^{N} \Lambda^{+(j)}_{A,V}. \tag{1.5}$$

Here $\Lambda^+_{A,V}$ is defined as in (1.1) and (1.2) but with $V_C$ replaced by a new potential $V$. A Hamiltonian of this kind was first introduced by Brown and Ravenhall in [9]. We emphasize that $H_N$ can formally be derived from quantum electrodynamics (QED) by a procedure that neglects the creation and annihilation of electron-positron pairs [47], the latter being defined with respect to $\Lambda^+_{A,V}$. For this reason operators of the form (1.4) are often called no-pair Hamiltonians.

Models of this type are widely used as a starting point for numerical computations in quantum chemistry. We refer the reader to the recent textbook [43] for a detailed exposition of the application of no-pair models in quantum chemistry, for examples of molecular systems which can be studied effectively by these methods, and an exhaustive reference list. Roughly speaking, no-pair models are supposed to give a good description of heavy atoms and molecules where relativistic effects play an important role in the understanding of their chemical properties but where QED effects can be neglected since their contributions live on a negligible energy scale; compare [43, Chap. 8]. In particular, the binding energies of a molecular system are low enough so that processes involving pair creation and annihilation do not need to be taken into account in the investigation of its chemical properties.

Another broad field of application for no-pair models is the theoretical and numerical study of highly ionized heavy atoms; see, e.g., [11, 29, 30, 45] for a review. In fact, since very accurate spectroscopic data is available for highly charged heavy ions they provide important tests of relativistic atomic structure theory and QED. We quote from the introduction to [11]: “Although the proper point of departure for relativistic atomic structure calculations is quantum electrodynamics, very few atomic structure calculations have been carried out entirely within the QED framework. Indeed, almost all relativistic calculations of the structure of many-electron atoms are based on some variant of the Hamiltonian introduced a half century ago by Brown and Ravenhall to understand the helium fine structure”. As already indicated above, various QED effects like retardation of the electron-electron interaction, electron self-energy, and vacuum polarization are not accounted for by the no-pair Hamiltonian. However, by comparing the splitting between eigenvalues of the no-pair operator with experimental values and taking corrections due to the finite mass of the nucleus into account one can infer the size of the omitted QED corrections.
The good agreement of the discrepancy found in this way with theoretical computations of QED effects provides a test of QED in the strong fields of highly charged nuclei; see, e.g., [29, §3]. In particular, the no-pair energy levels may serve as a first approximation for more accurate and complicated QED calculations; see, for instance, [6]. In practice, the eigenvalues of the no-pair operator and the finite nuclear mass corrections can be determined by means of the (formal) relativistic many-body perturbation theory (MBPT) as described in [11, 29, 30, 45]. There exist variants of MBPT where the negative-energy states discarded in the no-pair approximation are re-introduced in perturbative expansions, which becomes important in certain physical situations [12, 14]. We remark that in the articles cited above the electron–electron interaction is often given by the Coulomb–Breit potential and $A$ is equal to zero.

In the discussion of no-pair models in quantum chemistry and in atomic physics (see, e.g., [43, Chap. 8] and [11]) it is assumed that the spectrum of the Hamiltonian in (1.4) shows the usual qualitative features well known, for instance, for multi-particle Schrödinger operators. That is, the essential spectrum is supposed to cover some positive half-axis and there should exist infinitely many eigenvalues below the ionization threshold (provided $N$ is not too large). In view of the variety and number of applications, it therefore seems worthwhile to complement the treatment of no-pair models in the quantum chemical and physical literature by mathematically rigorous results and, in particular, to confirm the assumptions on the spectral properties of $H_N$ by providing mathematical proofs.

We would like to give some further comments on the connection between no-pair Hamiltonians and certain computational techniques in quantum chemistry. Namely, there exists a block-diagonalization scheme which is used to represent the formal Coulomb–Dirac operator in (1.3) as a two-fold direct sum of operators acting on two-spinors [23, 26]. For the one-particle Coulomb–Dirac operator $D_{A,V_C}$, the two blocks are unitarily equivalent to the restrictions of $D_{A,V_C}$ to its positive and negative spectral subspaces, respectively. In the general case, the upper left block turns out to be unitarily equivalent to the no-pair Hamiltonian. One may then expand the upper left block with respect to the Coulomb coupling constant. The partial sums in this expansion then give reasonable approximations for Hamiltonians describing a relativistic molecular system and are implemented in numerical algorithms in quantum chemistry [43, Chaps. 11 and 12]. It has been rigorously shown in [25, 46] that, for sufficiently small Coulomb coupling constants and sufficiently large orders in the expansion, each partial sum in the expansion has a distinguished self-adjoint realization and that the sequence of partial sums converges in the norm resolvent sense. In particular, the spectra of the partial sums, which are directly studied numerically, converge to the spectrum of the no-pair Hamiltonian $H_N$.

Obviously, the question arises how to choose the new potential $V$ determining the projection (1.5) or, in other words, how to fix the Dirac sea for one electron in the presence of the others in a physically efficient way. Various possibilities are discussed from a physical and numerical point of view in [29, 45, 47, 48]; see also [12] where the potential dependence of MBPT results is discussed and a possibility to eliminate this dependence is proposed.
The choice $V = 0$ is referred to as the free picture, or Brown–Ravenhall model [9]. It has by now been studied in many mathematical works [2–5, 10, 17, 20, 21, 24, 27, 28, 35, 37–40, 50–52]. This is due to the fact that the free projection, $\Lambda^+_{0,0}$, can be calculated explicitly both in momentum and position space [2, 40]. The free picture is considered as one extreme case in a family of intermediate pictures [48]. The opposite extreme case, sometimes called the Furry picture (see, e.g., [45, §III.F] and [48]), is given by substituting the (negative) Coulomb potential $V_C$ for $V$. Other members of that family are obtained by choosing $V$ to be equal to $V_C$ plus some additional positive and in general nonlocal operator. The Furry and intermediate pictures give better numerical results in comparison to the free picture [47, 48] and are the preferred choices in MBPT. The additional non-local term may, for instance, incorporate the interaction with the remaining electrons. An example would be the Hartree–Fock potential generated by a set of appropriately chosen orbitals which is in fact a choice often employed in relativistic MBPT [11, 29, 30, 45].

In this paper, we do not aim to contribute to the subtle question of optimizing the choice of $V$. Rather we try to keep the assumptions on $V$ as general as our techniques permit. Namely, we consider a class of potentials which can be written as $V = V_C + V_H + V_E$, where $V_C$ may have several Coulomb-type singularities, $V_H$ is a bounded potential function vanishing at infinity, and $V_E$ is a compact nonlocal operator that behaves nicely under conjugation with exponential weights. As already mentioned above, our goal then is to establish some basic qualitative spectral properties of $H_N$. First, we show that $H_N$ is well-defined on a natural dense subspace (which is not obvious) and, thus, has a self-adjoint Friedrichs extension. We further locate its essential spectrum and, assuming that the number of electrons does not exceed the sum of all nuclear charges, we prove that there exist infinitely many eigenvalues below the ionization threshold. Moreover, we show that the corresponding eigenfunctions are exponentially localized. Our results apply to all nuclear charges $Z < e^{-2} \approx 137.037$. The “easy part” of the HVZ theorem, i.e. the upper bound on the ionization threshold, holds for a certain class of possibly unbounded magnetic fields. The “hard part”, i.e. the lower bound on the ionization threshold, as well as our results on existence of eigenvectors and their exponential localization are derived for bounded magnetic fields.

Although the general strategy of our proofs is fairly standard, the discussion of general no-pair models poses a variety of new mathematical problems. As an essential and novel technical input, necessary to obtain any of the results mentioned above, we first derive various commutator estimates involving the spectral projection $\Lambda^+_{A,V}$, exponential weights, and cut-off functions. They describe the non-local properties of $\Lambda^+_{A,V}$ in an $L^2$-sense and might be of independent interest. Similar estimates are obtained in [36] in the case $V = 0$. The case where $V$ does not vanish requires, however, more care due to the complex extension theory for singular Dirac operators. In order to derive the exponential localization estimate, we employ a strategy from [1] which has been complemented by a number of useful observations in [19].
The argument presented in [1] is advantageous for us since no a priori knowledge on the spectrum is required to prove the localization. (In particular, no eigenvalue equations are exploited in the argument.) Inspired by a remark in [19] we rather argue in the opposite direction and infer the lower bound on the ionization threshold from the localization estimate. This possibility is very convenient in the analysis of the no-pair Hamiltonian $H_N$ since the corresponding argument is very simple and requires only form bounds on the interaction potentials $W_{ij}$. The comparably bad behavior of singular Dirac operators and the resulting weak control on the interaction potentials also complicates the derivation of the upper bound on the ionization threshold, which is obtained by a modified version of the standard Weyl criterion [13] where a suitable strictly monotonic function of the operator is considered. In order to prove the existence of infinitely many bound states we employ minimax principles proceeding along the lines of [40] where the Brown–Ravenhall model (with $A = 0$) is considered. The main problem here is to replace those arguments in [40] where explicit position or momentum space representations of $\Lambda^+_{0,0}$ are used by new and somewhat more abstract ones.

We remark that our results on spectral projectors also allow one to analyze the Hamiltonian $H_N$ in the free picture proceeding along the discussion of the Furry and intermediate pictures presented here. There is, however, a subtlety to consider: Namely, for vanishing magnetic fields, it is known that the one-particle Brown–Ravenhall operator is stable if and only if $Z \le Z_c := \frac{1}{e^2}\,\frac{2}{(2/\pi)+(\pi/2)} \approx 124.2$ [17]. In the presence of an exterior $B$-field one can show that $H_1$ is still bounded below in the free picture, for all $Z \le Z_c$, provided the vector potential is locally bounded and Lipschitz continuous in a neighborhood of the nuclei [36]. (For smaller values of the coupling constant, one can actually prove the stability of matter of the second kind in the free picture treating a gauge fixed vector potential as a variable in the minimization and adding the field energy to the multi-particle Hamiltonian; see [34, 35], where the quantized electro-magnetic field is considered. In this situation it is essential that the vector potential is included in the projection for otherwise the model is always unstable if $N > 1$ [21].)

Finally, we comment on some closely related recent work. In the free picture and for vanishing exterior magnetic field, an HVZ theorem and the existence of infinitely many eigenvalues below the essential spectrum have been proved in [40], for nuclear charges $Z \le Z_c$. The case $N = 2$ is also treated in [27]. A more general HVZ theorem that applies to different particle species and a wider class of interaction potentials and exterior fields in the free picture has been established in [37]. Moreover, the reduction to irreducible representations of the groups of rotation and reflection and permutations of identical particles is considered in [37]. The $L^2$-exponential localization of eigenvectors in the Brown–Ravenhall model is studied in [38, 39], improving and generalizing earlier results from [2]. In all these works, the authors employ explicit position space representations of $\Lambda^+_{0,0}$. An HVZ theorem in the free picture with constant magnetic field is established in [28], again using explicit representations for the projection based on Mehler's formula.
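The two charge thresholds quoted in the introduction, $Z < e^{-2}$ for the models studied here and $Z \le Z_c$ for the free picture, can be checked numerically. The following is a minimal sketch; the function name `critical_charge` and the constant name `E2` are illustrative choices, and the value of $e^2$ is the one stated in the text's footnote on units.

```python
import math

# Square of the elementary charge in the units of the text: e^2 ≈ 1/137.037.
E2 = 1.0 / 137.037


def critical_charge(e2: float = E2) -> float:
    """Critical charge of the free picture: Z_c = (1/e^2) * 2 / ((2/pi) + (pi/2))."""
    return (1.0 / e2) * 2.0 / (2.0 / math.pi + math.pi / 2.0)


z_c = critical_charge()   # threshold of the Brown-Ravenhall (free picture) operator
z_max = 1.0 / E2          # threshold e^2 Z < 1 covered by the present results

print(f"Z_c   = {z_c:.1f}")
print(f"1/e^2 = {z_max:.3f}")
```

The gap between the two numbers is the range of nuclear charges that is accessible in the Furry and intermediate pictures but not in the free picture.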
By employing somewhat more abstract arguments we are able to study a wider class of projections in this paper. Similar results on the spectral projectors are used in [36] to study the regularity of the eigenvectors of $H_1$ in the free picture and to derive pointwise exponential decay estimates for their partial derivatives of any order (for $Z \le Z_c$ and assuming that all partial derivatives of $A$ increase more slowly than any exponential function). The rate of exponential decay found in [36] is actually the same as the one known for the Chandrasekhar operator and, hence, seems to be the optimal one. For many-electron atoms it is, however, more difficult to prove the exponential localization and, as in [2, 38], we shall only get suboptimal bounds on the decay rate in the present article. For recent developments and numerous references to the literature on HVZ theorems we refer to [33].

The article is organized as follows. In Sec. 2, we introduce our mathematical model precisely and state our main theorems. Section 3 is devoted to the study of some non-local properties of $\Lambda^+_{A,V}$ expressed in terms of various commutator estimates which form the basis of the spectral analysis of $H_N$. Moreover, it contains results that allow one to compare the projections $\Lambda^+_{A,V}$ and $\Lambda^+_{A,0}$. In Sec. 4, we derive the exponential localization estimate for $H_N$ and in Sec. 5 infer a lower bound on the threshold energy. Section 6 deals with Weyl sequences and, finally, in Sec. 7 we show that $H_N$ possesses infinitely many eigenvalues below the threshold energy.

Some frequently used notation. Open balls in $\mathbb{R}^3$ with radius $R > 0$ and center $z \in \mathbb{R}^3$ are denoted by $B_R(z)$. Spectral projections of a self-adjoint operator, $T$, on some Hilbert space are denoted by $E_\lambda(T)$ and $E_I(T)$, if $\lambda \in \mathbb{R}$ and $I$ is an interval. $\mathcal{D}(T)$ denotes the domain of the operator $T$ and $\mathcal{Q}(T)$ its form domain. The characteristic function of a subset $M \subset \mathbb{R}^n$ is denoted by $\mathbb{1}_M$. $C, C', C'', \ldots$ denote constants whose values might change from one estimate to another.

2. The Model and Main Results

In our choice of units the free Dirac operator reads

$$D_0 := -i\alpha\cdot\nabla + \beta := -i\sum_{j=1}^{3}\alpha_j\,\partial_{x_j} + \beta.$$

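Here $\alpha = (\alpha_1, \alpha_2, \alpha_3)$ and $\beta =: \alpha_0$ are $4\times 4$ hermitian matrices which satisfy the Clifford algebra relations

$$\{\alpha_i, \alpha_j\} = 2\delta_{ij}\,\mathbb{1}, \qquad 0 \le i, j \le 3. \tag{2.1}$$

In Dirac's representation, which we fix throughout the paper, they are given as

$$\alpha_j = \begin{pmatrix} 0 & \sigma_j \\ \sigma_j & 0 \end{pmatrix}, \quad j = 1, 2, 3, \qquad \beta = \begin{pmatrix} \mathbb{1} & 0 \\ 0 & -\mathbb{1} \end{pmatrix},$$

where $\sigma_1, \sigma_2, \sigma_3$ are the standard Pauli matrices. $D_0$ is a self-adjoint operator in the Hilbert space $\mathscr{H} := L^2(\mathbb{R}^3, \mathbb{C}^4)$ with domain $H^1(\mathbb{R}^3, \mathbb{C}^4)$. Its spectrum is purely absolutely continuous and given by the union of two half-lines,

$$\sigma(D_0) = \sigma_{\mathrm{ac}}(D_0) = (-\infty, -1] \cup [1, \infty). \tag{2.2}$$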
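The relations (2.1) and the shape of the spectrum (2.2) can be checked numerically. The sketch below builds the matrices of Dirac's representation with NumPy, verifies the anticommutators, and diagonalizes the Fourier symbol $\alpha\cdot p + \beta$ of $D_0$ at one sample momentum. The helper names `clifford_ok` and `free_symbol` and the sample vector `p` are illustrative assumptions, not notation from the paper.

```python
import numpy as np

# Standard Pauli matrices sigma_1, sigma_2, sigma_3.
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Dirac representation: alpha_j has sigma_j in the off-diagonal blocks, beta = diag(1, 1, -1, -1).
alpha = [np.block([[Z2, s], [s, Z2]]) for s in sigma]
beta = np.block([[I2, Z2], [Z2, -I2]])
gammas = [beta] + alpha  # index 0 is beta =: alpha_0


def clifford_ok(mats, tol=1e-12):
    """Check {a_i, a_j} = 2 delta_ij 1 for all pairs in the list."""
    n = mats[0].shape[0]
    for i, a in enumerate(mats):
        for j, b in enumerate(mats):
            target = 2.0 * (i == j) * np.eye(n)
            if not np.allclose(a @ b + b @ a, target, atol=tol):
                return False
    return True


def free_symbol(p):
    """Fourier symbol alpha . p + beta of D_0 at a momentum p in R^3."""
    return sum(pj * aj for pj, aj in zip(p, alpha)) + beta


p = np.array([0.3, -1.2, 2.0])           # arbitrary sample momentum
eigs = np.linalg.eigvalsh(free_symbol(p))  # ascending order
energy = np.sqrt(1.0 + p @ p)              # expected eigenvalues are -energy and +energy, each twice
```

Since $\sqrt{1+|p|^2}$ ranges over $[1,\infty)$ as $p$ varies, the symbol eigenvalues $\pm\sqrt{1+|p|^2}$ sweep out exactly the two half-lines in (2.2).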

Next, we formulate our precise hypotheses on the exterior electrostatic potential $V_C$ and on the potential $V$ determining the Dirac sea. We think that, at least with regard to the commutator estimates in Sec. 3, it is interesting to keep the conditions on $V_C$ and $V$ fairly general.

Hypothesis 2.1. There is a finite set $Y \subset \mathbb{R}^3$, $\#Y < \infty$, such that $V_C \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3\setminus Y, \mathscr{L}(\mathbb{C}^4))$ is almost everywhere hermitian and

$$V_C(x) \to 0, \qquad |x| \to \infty. \tag{2.3}$$


Moreover, there exist $\gamma \in (0, 1)$ and $\varepsilon > 0$ such that the balls $B_\varepsilon(y)$, $y \in Y$, are mutually disjoint and, for $0 < |x| < \varepsilon$ and $y \in Y$,

$$\|V_C(y + x)\| \le \frac{\gamma}{|x|}. \tag{2.4}$$

Example 2.1. The main example for a potential satisfying Hypothesis 2.1 is certainly the Coulomb potential generated by a finite number of static nuclei,

$$V_C(x) = -\sum_{y\in Y} \frac{e^2 Z_y}{|x - y|}\,\mathbb{1}, \qquad x \in \mathbb{R}^3\setminus Y.$$

In this case the restriction on the strength of the singularities of $V_C$ imposed in Hypothesis 2.1 allows for all nuclear charges, $0 \le Z_y < e^{-2} \approx 137.037$, $y \in Y$.

Hypothesis 2.2. $V = V_C + V_H + V_E$, where $V_C$ fulfills Hypothesis 2.1 and $V_H \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3, \mathscr{L}(\mathbb{C}^4))$ is an almost everywhere hermitian matrix-valued function dropping off to zero at infinity,

$$V_H(x) \to 0, \qquad |x| \to \infty. \tag{2.5}$$

$V_E$ is compact and has the following property: There exist $m > 0$ and some increasing function $c : [0, m) \to (0, \infty)$ such that, for every $F \in C^1(\mathbb{R}^3, \mathbb{R})$ with $|\nabla F| \le a < m$,

$$\forall \chi \in C^1(\mathbb{R}^3, [0, 1]) : \quad \|[e^{F} V_E e^{-F}, \chi]\| \le c(a)\,\|\nabla\chi\|_\infty, \tag{2.6}$$

$$\|[V_E, e^{F}]\,e^{-F}\| \le c(a)\,\|\nabla F\|_\infty, \tag{2.7}$$

$$\lim_{R\to\infty} \|\mathbb{1}_{\mathbb{R}^3\setminus B_R(0)}\, e^{F} V_E e^{-F}\, \mathbb{1}_{\mathbb{R}^3\setminus B_R(0)}\| = 0. \tag{2.8}$$

Example 2.2. (i) Possible choices for $V_H$ and $V_E$ satisfying the conditions of Hypothesis 2.2 are the Hartree and non-local exchange potentials corresponding to a set of exponentially localized orbitals $\varphi_1, \ldots, \varphi_M \in \mathscr{H}$, $|\varphi_i(x)| \le C e^{-m|x|}$, $1 \le i \le M$, for some $C \in (0, \infty)$. Their Hartree potential is given as

$$V_H(x) := e^2 \sum_{i=1}^{M} \Big(|\varphi_i|^2 * \frac{1}{|\cdot|}\Big)(x), \qquad x \in \mathbb{R}^3.$$

It incorporates the presence of $M$ electrons in a fixed state into the Dirac sea by a smeared out background density. The exchange potential corresponding to $\varphi_1, \ldots, \varphi_M$ is the integral operator with matrix-valued kernel

$$V_E(x, y) := e^2 \sum_{i=1}^{M} \frac{\varphi_i(x)\,\varphi_i^*(y)}{|x - y|}.$$


It is a correction to the Hartree potential accounting for the Pauli principle. In the sense of quadratic forms it then holds $V_C \le V = V_C + V_H + V_E \le 0$, which justifies the notion “intermediate picture”. These choices of $V_H$ and $V_E$ are discussed, e.g., in [11, 29, 30, 45].

(ii) More generally, we may set $V_H := \varrho * |\cdot|^{-1}$, for some $0 \le \varrho \in L^1 \cap L^{5/3}(\mathbb{R}^3)$. In this case we find some $C \in (0, \infty)$ such that $0 \le V_H \le C/|\cdot|$. Moreover, standard theorems on integral operators show that every kernel with values in the set of hermitian $(4\times 4)$-matrices satisfying

$$\|V_E(x, y)\| \le C\,\frac{e^{-m|x-y|}}{\langle x\rangle^{\rho}\,|x - y|\,\langle y\rangle^{\rho}},$$

for some $m, \rho, C > 0$, yields a compact operator satisfying the conditions of Hypothesis 2.2.

As a first consequence of Hypothesis 2.2 we find, for every locally bounded vector potential $A : \mathbb{R}^3 \to \mathbb{R}^3$, a distinguished self-adjoint realization of the Dirac operator

$$D_{A,V} = \alpha\cdot(-i\nabla + A) + \beta + V,$$

whose essential spectrum is again contained in $(-\infty, -1] \cup [1, \infty)$; see Lemma 3.2 below, where we recall some important well-known facts on Dirac operators with singular potentials. Therefore, it makes sense to define the spectral projections,

$$\Lambda^+_{A,V} := E_{[e_0,\infty)}(D_{A,V}), \qquad \Lambda^-_{A,V} := \mathbb{1} - \Lambda^+_{A,V}, \tag{2.9}$$

where

$$e_0 \in \varrho(D_{A,V}) \cap (-1, 1). \tag{2.10}$$

For later reference we introduce the parameter

$$\delta_0 := \sqrt{1 - e_0^2}. \tag{2.11}$$
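A finite-dimensional analogue may help to see how the definitions (2.9)–(2.11) fit together with formula (1.2). The sketch below is a toy model only: a Hermitian matrix with a spectral gap stands in for $D_{A,V}$, and the names `spectral_projection`, `P_plus`, and `delta0` as well as the choice $e_0 = 0$ are assumptions made for illustration.

```python
import numpy as np


def spectral_projection(D, e0):
    """E_{[e0, inf)}(D) for a Hermitian matrix D, computed from its eigendecomposition."""
    w, U = np.linalg.eigh(D)
    mask = (w >= e0).astype(float)
    return (U * mask) @ U.conj().T   # U diag(mask) U^*


rng = np.random.default_rng(0)
n = 8
# Toy stand-in for a Dirac operator: a "mass term" with eigenvalues +1 and -1
# plus a small Hermitian perturbation, so that a gap around 0 survives.
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
D = np.diag([1.0] * (n // 2) + [-1.0] * (n // 2)) + 0.02 * (M + M.conj().T)

e0 = 0.0                         # a point in the gap, playing the role of e_0 in (2.10)
delta0 = np.sqrt(1.0 - e0**2)    # the parameter of (2.11)

P_plus = spectral_projection(D, e0)
P_minus = np.eye(n) - P_plus

# sgn(D - e0) via the functional calculus, for comparison with (1.2)
w, U = np.linalg.eigh(D - e0 * np.eye(n))
sgn = (U * np.sign(w)) @ U.conj().T
```

In this toy setting `P_plus` agrees with `0.5 * (np.eye(n) + sgn)`, which is the matrix version of (1.2) with the reference point shifted from $0$ to $e_0$.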

Many of our technical results on $D_{A,V}$, for instance the various commutator estimates of Sec. 3, actually hold under the mere assumption that the components of $A$ are locally bounded. Of course, if not all eigenvalues of $D_{A,V}$ are larger than $-1$ and $e_0$ is chosen between $-1$ and the lowest eigenvalue, the physical relevance of the $N$-particle Hamiltonian $H_N$ becomes rather questionable. We remark that such situations are not excluded by our hypotheses. For instance, if $V_C$ is the Coulomb potential and the intensity of a constant exterior magnetic field is increased, then the lowest eigenvalue of $D_{A,V_C}$ eventually reaches the lower continuum [16]. Nevertheless, our theorems hold for any choice of $e_0$ as in (2.10).


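In order to define the atomic no-pair Hamiltonian precisely we first set

$$\mathscr{H}_N := \bigotimes_{i=1}^{N} \mathscr{H}, \qquad \mathscr{H}_N^+ := \Lambda^{+,N}_{A,V}\,\mathscr{H}_N, \qquad N \in \mathbb{N}, \qquad \mathscr{H}^+ := \mathscr{H}_1^+,$$

where $\Lambda^{+,N}_{A,V}$ is given by (1.5) and (2.9). We let $W : \mathbb{R}^3 \times \mathbb{R}^3 \to [0, \infty]$ denote the interaction potential between two electrons.

Hypothesis 2.3. There is some $\gamma > 0$ such that, for all $x, y \in \mathbb{R}^3$, $x \ne y$,

$$0 \le W(x, y) = W(y, x) \le \gamma\,|x - y|^{-1}. \tag{2.12}$$

When we consider $N$ electrons located at $x_1, \ldots, x_N \in \mathbb{R}^3$ we denote their common position variable by $X = (x_1, \ldots, x_N)$. Furthermore, we often write $W_{jk}$ for the maximal multiplication operator in $\mathscr{H}_N$ induced by the function $(\mathbb{R}^3)^N \ni X \mapsto W(x_j, x_k)$. For $N \in \mathbb{N}$, we introduce a symmetric, semi-bounded operator acting in $\mathscr{H}_N^+$ by

$$\mathcal{D}(\mathring{H}_N) := \Lambda^{+,N}_{A,V}\,\mathscr{D}_N, \qquad \mathscr{D}_N := \bigotimes_{i=1}^{N} \mathscr{D}, \qquad \mathscr{D} := C_0^\infty(\mathbb{R}^3, \mathbb{C}^4), \tag{2.13}$$

$$\mathring{H}_N \Phi := \Lambda^{+,N}_{A,V}\Big(\sum_{i=1}^{N} D^{(i)}_{A,V_C} + \sum_{1\le i<j\le N} W_{ij}\Big)\Lambda^{+,N}_{A,V}\,\Phi, \qquad \Phi \in \mathcal{D}(\mathring{H}_N). \tag{2.14}$$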
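The structure of (2.14) can be imitated with matrices. The following sketch is a finite-dimensional toy for $N = 2$, not a discretization of the actual operator: Hermitian matrices `D_proj` and `D_kin` play the roles of $D_{A,V}$ and $D_{A,V_C}$, a non-negative diagonal matrix `W` plays the role of $W_{12}$, and all names and numerical values are hypothetical. For simplicity the example takes `D_proj = D_kin`, which corresponds to the Furry picture $V = V_C$.

```python
import numpy as np


def positive_projection(D, e0=0.0):
    """Spectral projection E_{[e0, inf)}(D) of a Hermitian matrix D."""
    w, U = np.linalg.eigh(D)
    return (U * (w >= e0)) @ U.conj().T


def no_pair_two_particle(D_proj, D_kin, W):
    """Toy analogue of (2.14) for N = 2.

    D_proj : Hermitian n x n matrix fixing the projection (role of D_{A,V})
    D_kin  : Hermitian n x n matrix entering the energy   (role of D_{A,V_C})
    W      : Hermitian n^2 x n^2 matrix                   (role of W_12)
    Returns the projected matrix and the unprojected formal sum, cf. (1.3).
    """
    n = D_proj.shape[0]
    I = np.eye(n)
    P = positive_projection(D_proj)
    P2 = np.kron(P, P)                                     # Lambda^+ (x) Lambda^+
    H_formal = np.kron(D_kin, I) + np.kron(I, D_kin) + W   # D^(1) + D^(2) + W_12
    return P2 @ H_formal @ P2, H_formal


rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n))
D = np.diag([1.0, 1.5, -1.0, -1.5]) + 0.05 * (M + M.T)   # gapped one-particle matrix
W = np.diag(rng.uniform(0.0, 0.3, size=n * n))            # non-negative "repulsion"

H_np, H_formal = no_pair_two_particle(D, D, W)
print("lowest eigenvalue, formal sum:", np.linalg.eigvalsh(H_formal).min())
print("lowest eigenvalue, no-pair   :", np.linalg.eigvalsh(H_np).min())
```

In this toy the formal sum has negative eigenvalues coming from the negative spectral subspace, while the projected matrix is non-negative. Matrices are always bounded, so the example illustrates only the sign structure produced by the projection, not the semi-boundedness statement for the unbounded operator.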

Proposition 2.1. Assume that $V$ fulfills Hypothesis 2.2, $W$ fulfills Hypothesis 2.3, and that $A \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3, \mathbb{R}^3)$ satisfies $\|e^{-\tau_0|x|} A\|_\infty < \infty$, for some $0 \le \tau_0 < \min\{\delta_0, m\}$. Then the operator $\mathring{H}_N$ given by (2.13) and (2.14) is well-defined, symmetric, and semi-bounded from below.

Proof. The only claim that is not obvious is that $W_{ij}\,\Lambda^{+,N}_{A,V}\,\phi$ is again square-integrable, for every $\phi \in \mathscr{D}_N$. This follows, however, from Corollary 3.3.

We denote the Friedrichs extension of $\mathring{H}_N$ by $H_N$. Note that we do not require the elements in the domain of $H_N$ to be anti-symmetric since in our proofs it is convenient to consider $H_N$ as an operator acting in the full tensor product. Of course, in the end we shall be interested in the restriction of $H_N$ to the anti-symmetric (fermionic) subspace of $\mathscr{H}_N^+$. We denote the anti-symmetrization operator on $\mathscr{H}_N$ by $\mathcal{A}_N$,

$$(\mathcal{A}_N \Phi)(X) = \frac{1}{N!}\sum_{\sigma\in S_N} \mathrm{sgn}(\sigma)\,\Phi(x_{\sigma(1)}, \ldots, x_{\sigma(N)}), \qquad \Phi \in \mathscr{H}_N, \tag{2.15}$$

where $S_N$ is the group of permutations of $\{1, \ldots, N\}$, and define the no-pair Hamiltonian by

$$H_N^{\mathcal{A}} := H_N \restriction_{\mathcal{A}_N \mathscr{H}_N^+}.$$
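The anti-symmetrization operator (2.15) is easy to experiment with on a grid. The sketch below stores a scalar function of $N$ variables as an $N$-index NumPy array and averages over all permutations of the axes with signs. It ignores the spinor components in $\mathbb{C}^4$, which in the full setting are permuted together with the position variables; the function names and the sample array are illustrative assumptions.

```python
import itertools
import math

import numpy as np


def permutation_sign(perm):
    """Sign of a permutation given as a tuple of 0-based indices (parity of inversions)."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def antisymmetrize(phi):
    """Grid version of (2.15) for a real function stored as an N-index array.

    np.transpose(phi, perm) realizes the permutation of the arguments (up to
    inversion of the permutation, which changes neither the sign nor the sum
    over the whole group).
    """
    n = phi.ndim
    out = np.zeros_like(phi, dtype=float)
    for perm in itertools.permutations(range(n)):
        out += permutation_sign(perm) * np.transpose(phi, perm)
    return out / math.factorial(n)


rng = np.random.default_rng(2)
phi = rng.standard_normal((3, 3, 3))   # N = 3 "particles", 3 grid points each
a = antisymmetrize(phi)
```

Applying `antisymmetrize` twice gives the same array as applying it once, reflecting that $\mathcal{A}_N$ is a projection, and the result changes sign when two axes are exchanged.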


Our first main result is the following theorem, where

$$E_N^{\mathcal{A}} := \inf \sigma(H_N^{\mathcal{A}}), \qquad N \in \mathbb{N}, \qquad E_0^{\mathcal{A}} := 0. \tag{2.16}$$

Theorem 2.1 (Exponential Localization). Assume that $V$ and $W$ fulfill Hypotheses 2.2 and 2.3, respectively. If $A \in C^1(\mathbb{R}^3, \mathbb{R}^3)$ and $B = \mathrm{curl}\,A$ is bounded and if $I \subset \mathbb{R}$ is an interval with $\sup I < E_{N-1}^{\mathcal{A}} + 1$, then there exists $b \in (0, \infty)$ such that $\mathrm{Ran}(E_I(H_N^{\mathcal{A}})) \subset \mathcal{D}(e^{b|X|})$ and

$$\|e^{b|X|}\,E_I(H_N^{\mathcal{A}})\| < \infty.$$

Proof. This theorem is proved in Sec. 4.

Remark 2.1. In the case $N = 1$ the assertion of Theorem 2.1 still holds true under the assumptions of Proposition 2.1. This follows from the proof of Theorem 2.1. In fact, for $N = 1$, we do not have to control error terms involving the interaction $W$, which is the only reason why $B$ is assumed to be bounded in Theorem 2.1. If also $V = V_C$, then we obtain an exponential localization estimate for an eigenvector, $\phi_E$, with eigenvalue $E \in (-1, 1)$ of the Dirac operator $D_{A,V_C}$. The estimate on the decay rate which could be extracted from our proof is, however, suboptimal due to error terms coming from the projections; see also [7] for decay estimates for Dirac operators.

Next, we introduce a hypothesis which is used to prove the “easy part” of the HVZ theorem below.

Hypothesis 2.4. (i) For every $\lambda \ge 1$, there exist radii, $1 \le R_1 < R_2 < \cdots$, $R_n \nearrow \infty$, and $\psi_1, \psi_2, \ldots \in \mathscr{D}$ such that

$$\|\psi_n\| = 1, \qquad \mathrm{supp}(\psi_n) \subset \mathbb{R}^3\setminus B_{R_n}(0), \qquad \lim_{n\to\infty} \|(D_A - \lambda)\psi_n\| = 0. \tag{2.17}$$

(ii) $A \in C^1(\mathbb{R}^3, \mathbb{R}^3)$ and $B = \mathrm{curl}\,A$ has the following property: There are $b_1 \in (0, \infty)$ and $0 \le \tau < \min\{\delta_0, m\}$ ($m$ and $\delta_0$ are the parameters appearing in Hypothesis 2.2 and (2.11)) such that, for all $x, y \in \mathbb{R}^3$,

$$|B(x) - B(y)| \le b_1\,e^{\tau|x-y|}. \tag{2.18}$$

Example 2.3 ([22]). We recall a result from [22] which provides a large class of examples where Hypothesis 2.4(i) is fulfilled: Suppose that $A \in C^\infty(\mathbb{R}^3, \mathbb{R}^3)$, $B = \mathrm{curl}\,A$, and set, for $x \in \mathbb{R}^3$ and $\nu \in \mathbb{N}$,

$$\epsilon_0(x) := |B(x)|, \qquad \epsilon_\nu(x) := \frac{\sum_{|\alpha|=\nu} |\partial^\alpha B(x)|}{1 + \sum_{|\alpha|<\nu} |\partial^\alpha B(x)|}.$$


Suppose further that there exist $\nu \in \mathbb{N}_0$, $z_1, z_2, \ldots \in \mathbb{R}^3$, and $\rho_1, \rho_2, \ldots > 0$ such that $\rho_n \nearrow \infty$, the balls $B_{\rho_n}(z_n)$, $n \in \mathbb{N}$, are mutually disjoint and

$$\sup\{\epsilon_\nu(x) \mid x \in B_{\rho_n}(z_n)\} \to 0, \qquad n \to \infty.$$

Then there is a Weyl sequence, $\psi_1, \psi_2, \ldots$, that satisfies the conditions of Hypothesis 2.4(i).

The fact that the additional assumption of Part (ii) of the next theorem yields a lower bound on the ionization threshold is an observation made in [19] for Schrödinger operators.

Theorem 2.2 (HVZ). Assume that $V$ and $W$ fulfill Hypotheses 2.2 and 2.3, respectively. Then the following assertions hold true:

(i) If Hypothesis 2.4 is fulfilled also, then $\sigma_{\mathrm{ess}}(H_N^{\mathcal{A}}) \supset [E_{N-1}^{\mathcal{A}} + 1, \infty)$.

(ii) Assume additionally that, for every interval $I \subset \mathbb{R}$ with $\sup I < E_{N-1}^{\mathcal{A}} + 1$, there is some $g \in C(\mathbb{R}, (0, \infty))$ such that $g(r) \to \infty$, as $r \to \infty$, and $g(|X|)\,E_I(H_N^{\mathcal{A}}) \in \mathscr{L}(\mathcal{A}_N \mathscr{H}_N)$. Then $\sigma_{\mathrm{ess}}(H_N^{\mathcal{A}}) \subset [E_{N-1}^{\mathcal{A}} + 1, \infty)$. In particular, this inclusion is valid if $A \in C^1(\mathbb{R}^3, \mathbb{R}^3)$ and $B = \mathrm{curl}\,A$ is bounded.

Proof. (i) follows directly from Lemma 6.3 and (ii) follows from Theorems 5.1 and 2.1.

To show the existence of infinitely many eigenvalues below the bottom of the essential spectrum of $H_N^{\mathcal{A}}$ we certainly need a condition on the relationship between $V_C$, $W$, and the magnetic field. To formulate it we set, for $\delta, R > 0$,

$$S_\delta(R) := \{x \in \mathbb{R}^3 : (1 - \delta)R \le |x| \le (1 + \delta)R\}, \tag{2.19}$$

$$v(\delta, R) := \sup_{x\in S_\delta(R)}\ \sup_{|v|=1}\ \langle v \mid V_C(x)\,v\rangle_{\mathbb{C}^4}. \tag{2.20}$$

Hypothesis 2.5. (i) $V$ fulfills Hypothesis 2.2.

(ii) $A \in C^1(\mathbb{R}^3, \mathbb{R}^3)$ and $B = \mathrm{curl}\,A$ is bounded.

(iii) There exist radii $1 \le R_1 < R_2 < \cdots$, $R_n \nearrow \infty$, some constant $\delta \in (0, 1/N)$, and a sequence of spinors, $\psi_1, \psi_2, \ldots \in \mathscr{D}$, with vanishing lower spinor components, $\psi_n = (\psi_{n,1}, \psi_{n,2}, 0, 0)^\top$, $n \in \mathbb{N}$, such that

$$\|\psi_n\| = 1, \qquad \mathrm{supp}(\psi_n) \subset \{R_n < |x| < (1 + \delta/2)R_n\}, \qquad 2R_n \le R_{n+1}, \tag{2.21}$$

for all $n \in \mathbb{N}$, and

$$\|(D_A - 1)\psi_n\| = O(1/R_n), \qquad n \to \infty. \tag{2.22}$$

(iv) $W$ fulfills Hypothesis 2.3 and, for every $\delta \in (0, 1/N)$, we find some $\varepsilon \in (0, 1)$ such that

$$\limsup_{n\to\infty}\ R_n\Big\{v(\delta, R_n) + (N - 1)\sup_{|x-y|\ge(1-\varepsilon)R_n} W(x, y)\Big\} < 0.$$
February 11, 2010 10:0 WSPC/148-RMP

J070-S0129055X10003874

Spectral Theory of No-Pair Hamiltonians

13

Example 2.4. $V = V_C + V_H + V_E$ and $W$ fulfill Hypothesis 2.5(i) and (iv), if $V_C$ is given as in Example 2.1 with $\sum_{y\in\mathcal{Y}} Z_y \ge N$, $V_H$ and $V_E$ are given as in Example 2.2(i) or (ii), and $W$ is the Coulomb repulsion, $W(x,y) = e^2/|x-y|$. Hypothesis 2.5(iii) is fulfilled under the following strengthened version of the condition given in [22]: Suppose again that $A \in C^\infty(\mathbb{R}^3, \mathbb{R}^3)$, $B = \mathrm{curl}\,A$, and let $B_{\rho_n}(z_n)$ denote the balls appearing in Example 2.3. Suppose additionally that there is some $C \in (0,\infty)$ such that $\rho_n < |z_n| \le C\rho_n$, for all $n \in \mathbb{N}$, and that either

$$\sup\{|B(x)| : x \in B_{\rho_n}(z_n)\} \le C/|z_n|^2, \qquad n \in \mathbb{N},$$

or

$$\forall\, n \in \mathbb{N} :\ |B(z_n)| \ge 1/C \qquad\text{and}\qquad \sup\{\nu(x) \,|\, x \in B_{\rho_n}(z_n)\} = o(\rho_n^{-\nu}).$$

Then we find a Weyl sequence $\psi_1, \psi_2, \ldots$ satisfying the conditions in Hypothesis 2.5(iii). This follows by inspecting and adapting all relevant proofs in [22]. We leave this procedure to the reader since it is straightforward but a little bit lengthy.

Theorem 2.3 (Existence of Bound States). Assume that $V$, $W$, and $A$ fulfill Hypothesis 2.5. Then $H_N^A$ has infinitely many eigenvalues below the infimum of its essential spectrum, $\inf \sigma_{\mathrm{ess}}(H_N^A) = E_{N-1}^A + 1$.

Proof. This theorem is proved in Sec. 7.
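The condition in Hypothesis 2.5(iv) can be checked by hand in the simplest Coulomb setting, and the following Python sketch does just that. It is only an illustration under stated assumptions: a single nucleus at the origin with $V_C(x) = -e^2 Z/|x|$ times the identity matrix (the precise form of Example 2.1 is not restated in this section), $W(x,y) = e^2/|x-y|$, and hypothetical helper names `threshold_quantity` and `admissible_eps`. Under these assumptions $R\,v(\delta,R) = -e^2 Z/(1+\delta)$ and $R\,(N-1)\sup W = e^2 (N-1)/(1-\varepsilon)$, so the quantity in (iv) does not depend on $R_n$.

```python
def threshold_quantity(Z, N, delta, eps, e2=1.0):
    """R * { v(delta, R) + (N-1) * sup_{|x-y| >= (1-eps) R} W(x, y) }.

    Assumes V_C(x) = -e2*Z/|x| (one nucleus at the origin, illustrative)
    and W(x, y) = e2/|x-y|. The result is independent of R.
    """
    # sup of -e2*Z/|x| over the shell S_delta(R) is attained at |x| = (1+delta)*R
    attraction = -e2 * Z / (1.0 + delta)
    # sup of e2/|x-y| over |x-y| >= (1-eps)*R is attained at |x-y| = (1-eps)*R
    repulsion = e2 * (N - 1) / (1.0 - eps)
    return attraction + repulsion


def admissible_eps(Z, N, delta):
    """Return some eps in (0, 1) making the quantity negative, or None.

    The quantity is negative iff (N-1)*(1+delta) < Z*(1-eps).
    """
    bound = 1.0 - (N - 1) * (1.0 + delta) / Z
    return 0.5 * bound if bound > 0 else None
```

For $Z \ge N$ and $\delta < 1/N$ one has $(N-1)(1+\delta) < (N^2-1)/N < Z$, so an admissible $\varepsilon$ always exists in this toy setting, in line with the requirement $\sum_y Z_y \ge N$ of Example 2.4.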

3. Spectral Projections of the Dirac Operator

In this section, we study spectral projections of Dirac operators with singular potentials in magnetic fields. We start by recalling some basic well-known facts about Dirac operators in Sec. 3.1. A crucial role is played by the resolvent identity stated in that subsection which applies to Coulomb singularities with coupling constants up to $e^2 Z < 1$. We remark that the domains of the Dirac operators studied here are not known in general and actually change when the strength of a Coulomb-type potential is increased. Consequently, the usual resolvent identities are not applicable and all formal manipulations involving Dirac operators and their spectral projections have to be treated carefully in the whole paper.

In Sec. 3.2 we derive some norm estimates on resolvents of Dirac operators which are conjugated with exponential weight functions. We verify that the conjugated resolvent stays bounded provided the weights increase with an exponential rate smaller than $\sqrt{1 - (\Re z)^2}$, where $z \in (-1,1) + i\mathbb{R}$ is the spectral parameter. The simple Neumann-type argument we employ to prove this for non-vanishing electrostatic potentials might be new.

In Sec. 3.3, we derive the main technical tools of this paper, namely, various commutator estimates involving spectral projections of singular Dirac operators. Some long and technical proofs are postponed to Sec. 3.5. Finally, in Sec. 3.4, we study the difference of projections with and without electrostatic potentials.


3.1. Basic properties of Dirac operators with singular potentials in magnetic fields

In the next lemma, we collect various well-known results on Dirac operators which play an important role in the whole paper. To this end we let $H^s_c := H^s_c(\mathbb{R}^3, \mathbb{C}^4)$ denote all elements of $H^s := H^s(\mathbb{R}^3, \mathbb{C}^4)$, $s \in \mathbb{R}$, having compact support. Moreover, we denote the canonical extension of $D_0$ to an element of $\mathscr{L}(H^{1/2}, H^{-1/2})$ by $\check{D}_0$. It shall sometimes be convenient to consider the singular part of $V_C$,

$$V_C^s(x) := \sum_{y \in \mathcal{Y}} \varrho(x-y)\,V_C(x), \qquad x \in \mathbb{R}^3, \tag{3.1}$$

where $\varrho \in C_0^\infty(\mathbb{R}^3, [0,1])$ equals 1 on $B_{\varepsilon/2}(0)$ and 0 outside $B_\varepsilon(0)$. Here $\varepsilon$ is the parameter appearing in Hypothesis 2.1. We let $V_C^s(x) = S(x)\,|V_C^s|(x)$ denote the polar decomposition of $V_C^s(x)$. By Hardy's inequality we know that $V_C^s$ is a bounded operator from $H^1(\mathbb{R}^3, \mathbb{C}^4)$ to $L^2(\mathbb{R}^3, \mathbb{C}^4)$. By duality and interpolation it possesses a unique extension $\check{V}_C^s \in \mathscr{L}(H^{1/2}, H^{-1/2})$. Given some $A \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3, \mathbb{R}^3)$ we set $A^s := (1-\vartheta)A$, where $\vartheta \in C_0^\infty(\mathbb{R}^3, [0,1])$ is equal to 1 on some ball containing $\mathrm{supp}(V_C^s)$. We let $\alpha\cdot A^s(x) = U(x)\,|\alpha\cdot A^s(x)|$ denote the polar decomposition of $\alpha\cdot A^s(x)$ and note that the operator sum $\check{D}_0 + \alpha\cdot A^s + \check{V}_C^s$ is well-defined as an element of $\mathscr{L}(H_c^{1/2}, H_c^{-1/2})$, for every $A \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3, \mathbb{R}^3)$. So $V_C^s$ and $A^s$ have disjoint support by definition. As a consequence the application of the following lemma eventually becomes more convenient.

Lemma 3.1 ([8, 41, 42, 44]). Assume that $A \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3, \mathbb{R}^3)$ and $V_C$ fulfills Hypothesis 2.1. Then there is a unique self-adjoint operator, $D_{A^s,V_C^s}$, such that:

(i) $\mathcal{D}(D_{A^s,V_C^s}) \subset H^{1/2}_{\mathrm{loc}}(\mathbb{R}^3, \mathbb{C}^4)$.

(ii) For all $\psi \in H_c^{1/2}(\mathbb{R}^3, \mathbb{C}^4)$ and $\phi \in \mathcal{D}(D_{A^s,V_C^s})$,

$$\langle \psi \,|\, D_{A^s,V_C^s}\phi\rangle = \langle |D_0|^{1/2}\psi \,|\, \mathrm{sgn}(D_0)|D_0|^{1/2}\phi\rangle + \langle |\alpha\cdot A^s|^{1/2}\psi \,|\, U|\alpha\cdot A^s|^{1/2}\phi\rangle + \langle |V_C^s|^{1/2}\psi \,|\, S|V_C^s|^{1/2}\phi\rangle.$$

Proof. In [44, Proposition 4.3] it is observed that the claim follows from [8, Theorem 1.3] and [41, 42].

Consequently, we may define a self-adjoint operator,

$$D_{A,V} := D_{A^s,V_C^s} + \alpha\cdot(A - A^s) + (V_C - V_C^s) + V_H + V_E \tag{3.2}$$

on the domain $\mathcal{D}(D_{A,V}) = \mathcal{D}(D_{A^s,V_C^s})$. Notice that in (3.2) we only add bounded operators to $D_{A^s,V_C^s}$. We state some of its properties in the following lemma where

$$R_{A,V}(z) := (D_{A,V} - z)^{-1}, \qquad z \in \varrho(D_{A,V}). \tag{3.3}$$


Lemma 3.2 ([8, 41, 42, 44]). Assume that $A \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3, \mathbb{R}^3)$ and that $V$ fulfills Hypothesis 2.2. Then the following assertions hold true:

(a) $1_{B_R(0)}(D_{A,V} - i)^{-1}$ is compact, for every $R > 0$.

(b) $\sigma_{\mathrm{ess}}(D_{A,V}) = \sigma_{\mathrm{ess}}(D_A)$, $\sigma(D_A) \subset (-\infty,-1] \cup [1,\infty)$.

(c) $D_{A,V}$ is essentially self-adjoint on

$$\mathscr{D}_e := \{\phi \in H_c^{1/2}(\mathbb{R}^3, \mathbb{C}^4) : \check{D}_0\phi + \alpha\cdot A\phi + \check{V}_C^s\phi \in L^2(\mathbb{R}^3, \mathbb{C}^4)\} \tag{3.4}$$

and, for $\phi \in \mathscr{D}_e$, $D_{A,V}\phi$ is given as a sum of four vectors in $H^{-1/2}$,

$$D_{A,V}\phi = \check{D}_0\phi + \alpha\cdot A\phi + \check{V}_C^s\phi + (V - V_C^s)\phi.$$

Moreover, $\mathscr{D}_e = \mathcal{D}(D_{A,V}) \cap \mathscr{E}'$, where $\mathscr{E}'$ denotes the dual space of $C^\infty(\mathbb{R}^3, \mathbb{C}^4)$.

(d) For $\chi \in C_0^\infty(\mathbb{R}^3)$ and $\phi \in \mathcal{D}(D_{A,V})$, we have $\chi\phi \in \mathscr{D}_e \subset \mathcal{D}(D_{A,V})$ and

$$[D_{A,V}, \chi]\phi = -i(\alpha\cdot\nabla\chi)\phi + [V_E, \chi]\phi.$$

In particular, for $z \in \varrho(D_{A,V})$,

$$[\chi, R_{A,V}(z)] = R_{A,V}(z)\,[D_{A,V}, \chi]\,R_{A,V}(z) = R_{A,V}(z)\big(-i(\alpha\cdot\nabla\chi) + [V_E, \chi]\big)R_{A,V}(z). \tag{3.5}$$

(e) If $A$ is bounded, then $\mathcal{D}(D_{A,V}) \subset H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)$.

Proof. Since $V_E$ is compact it is clear that all assertions hold true as soon as they hold for $V_E = 0$, which we assume in the following. To prove (a) we write

$$1_{B_R(0)}(D_{A,V} - i)^{-1} = \big(1_{B_R(0)}|D_0|^{-1/2}\big)\big(|D_0|^{1/2}\chi(D_{A,V} - i)^{-1}\big),$$

where $\chi \in C_0^\infty(\mathbb{R}^3, [0,1])$ equals 1 in a neighborhood of $B_R(0)$. Then we use that $1_{B_R(0)}|D_0|^{-1/2}$ is compact and that $|D_0|^{1/2}\chi(D_{A,V} - i)^{-1}$ is bounded by Lemma 3.1 and the closed graph theorem. By standard arguments, we obtain the identity $\sigma_{\mathrm{ess}}(D_{A,V}) = \sigma_{\mathrm{ess}}(D_A)$ from (a) since $V$ drops off to zero at infinity; see, e.g., [49, §4.3.4]. The inclusion $\sigma(D_A) \subset (-\infty,-1] \cup [1,\infty)$ follows from supersymmetry arguments; see, e.g., [49, §5.6]. The assertions in (c) follow from [8, §2], (d) follows from [8, Lemma G], and (e) from [41].

Next, we recall the useful resolvent identity (3.6) (see, e.g., [18, 53]) which is used very often in the sequel. It should be regarded as a substitute for the second resolvent identity which is typically not applicable in order to compare two different Dirac operators in this paper. For, in general, the domain of one of these Dirac operators is not included in the domain of the other. The vector potential $\tilde{A}$ in Eq. (3.6) below could for instance be the gradient of some gauge potential or just be


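equal to zero. We recall another well-known resolvent identity [41] in the beginning of Sec. 3.5.

Lemma 3.3. Assume that $V$ fulfills Hypothesis 2.2, and that $A, \tilde{A} \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3, \mathbb{R}^3)$. Let $\tilde{V}$ be either $V_C^s$ (given by (3.1)) or 0, let $z \in \varrho(D_{\tilde{A},\tilde{V}}) \cap \varrho(D_{A,V})$ and $\chi \in C^\infty(\mathbb{R}^3, \mathbb{R})$ be constant outside some ball in $\mathbb{R}^3$, and assume that $(V_C - \tilde{V})\chi$ and $\alpha\cdot(A - \tilde{A})\chi$ are bounded. Then

$$\chi R_{A,V}(z) = \chi R_{\tilde{A},\tilde{V}}(z) + R_{\tilde{A},\tilde{V}}(z)\, i\alpha\cdot(\nabla\chi)\,\big(R_{\tilde{A},\tilde{V}}(z) - R_{A,V}(z)\big) - R_{\tilde{A},\tilde{V}}(z)\,\chi\big(V - \tilde{V} + \alpha\cdot(A - \tilde{A})\big)R_{A,V}(z). \tag{3.6}$$

Proof. Let $\phi \in \tilde{\mathscr{D}}_e := \{\psi \in H_c^{1/2} \,|\, \check{D}_0\psi + \alpha\cdot\tilde{A}\psi + \tilde{V}\psi \in L^2\}$. Since $\chi$ can be written as $\chi = c + \vartheta$, for some $c \in \mathbb{R}$ and $\vartheta \in C_0^\infty(\mathbb{R}^3, \mathbb{R})$, Lemmas 3.2(c) and (d) imply that $\chi\phi \in \tilde{\mathscr{D}}_e$. By the definition of $\mathscr{D}_e$ in (3.4) and the assumptions on $\chi$ it further follows that $\chi\phi \in \mathscr{D}_e \subset \mathcal{D}(D_{A,V})$ and

$$D_{\tilde{A},\tilde{V}}\,\chi\phi = D_{A,V}\,\chi\phi + \{-V + \tilde{V} - \alpha\cdot(A - \tilde{A})\}\chi\phi.$$

Therefore, we obtain

$$\begin{aligned}
\big(R_{\tilde{A},\tilde{V}}(z) - R_{A,V}(z)\big)\chi\,(D_{\tilde{A},\tilde{V}} - z)\phi
&= \big(R_{\tilde{A},\tilde{V}}(z) - R_{A,V}(z)\big)\big((D_{\tilde{A},\tilde{V}} - z)\chi + i\alpha\cdot(\nabla\chi)\big)\phi \\
&= \chi\phi - R_{A,V}(z)\big(D_{A,V} - z - V + \tilde{V} - \alpha\cdot(A - \tilde{A})\big)\chi\phi + \big(R_{\tilde{A},\tilde{V}}(z) - R_{A,V}(z)\big)i\alpha\cdot(\nabla\chi)\phi \\
&= R_{A,V}(z)\big(V - \tilde{V} + \alpha\cdot(A - \tilde{A})\big)\chi\,R_{\tilde{A},\tilde{V}}(z)(D_{\tilde{A},\tilde{V}} - z)\phi \\
&\quad + \big(R_{\tilde{A},\tilde{V}}(z) - R_{A,V}(z)\big)i\alpha\cdot(\nabla\chi)\,R_{\tilde{A},\tilde{V}}(z)(D_{\tilde{A},\tilde{V}} - z)\phi.
\end{aligned}$$

As $D_{\tilde{A},\tilde{V}}$ is essentially self-adjoint on $\tilde{\mathscr{D}}_e$, we know that $(D_{\tilde{A},\tilde{V}} - z)\tilde{\mathscr{D}}_e$ is dense, which together with the calculation above implies

$$\big(R_{\tilde{A},\tilde{V}}(z) - R_{A,V}(z)\big)\chi = \big(R_{\tilde{A},\tilde{V}}(z) - R_{A,V}(z)\big)i\alpha\cdot(\nabla\chi)\,R_{\tilde{A},\tilde{V}}(z) + R_{A,V}(z)\big(V - \tilde{V} + \alpha\cdot(A - \tilde{A})\big)\chi\,R_{\tilde{A},\tilde{V}}(z). \tag{3.7}$$

Taking the adjoint of (3.7) (with $z$ replaced by $\bar{z}$) we obtain (3.6).

3.2. Conjugation of $R_{A,V}(z)$ with exponential weights

As a preparation for the localization estimates for the spectral projections, we shall now study the conjugation of $R_{A,V}(z)$ with exponential weight functions $e^F$ acting as multiplication operators on $\mathscr{H}$. To this end we recall that $e_0 \in (-1,1)$ is an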


element of the resolvent set of $D_{A,V}$ and set

$$\delta_0 := \inf\{|e_0 - \lambda| : \lambda \in \sigma(D_{A,V})\} > 0, \tag{3.8}$$

$$\triangle_0 := \min\{1 - e_0,\ e_0 + 1\}, \qquad \Gamma := e_0 + i\mathbb{R}. \tag{3.9}$$

Notice that the decay rate in the following lemma is determined only by the decay rate $m$ appearing in Hypothesis 2.2 and the number $\triangle_0$ defined in (2.11). In the next proof and henceforth we shall often use the abbreviations

$$D_A := D_{A,0}, \qquad R_A(z) := (D_A - z)^{-1}, \qquad z \in \varrho(D_A). \tag{3.10}$$

We remark that, for $V = 0$, the bound (3.11) below follows from a well-known computation (see, e.g., [7]) which is recalled in the next proof. For non-vanishing, singular potentials $V$ a bound on the operator norm of the conjugated resolvent seems to be less well known and the Neumann-type argument we use to prove it might be a new observation.

Lemma 3.4. Assume that $A \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3, \mathbb{R}^3)$ and that $V$ fulfills Hypothesis 2.2. Let $0 < a < \min\{\triangle_0, m\}$. Then there is some $C_a \in (0,\infty)$ such that, for all $F \in C^\infty(\mathbb{R}^3, \mathbb{R})$ with $F(0) = 0$, $F \ge 0$ or $F \le 0$, $\|\nabla F\|_\infty \le a$, and all $z = e_0 + i\eta \in \Gamma$,

$$\|e^F R_{A,V}(z)\,e^{-F}\| \le \frac{C_a}{\sqrt{1 + \eta^2}}. \tag{3.11}$$

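Proof. First, we assume that $F$ is constant outside some ball in $\mathbb{R}^3$. Then it suffices to treat the case $F \ge 0$, since otherwise we could consider the adjoint of $e^F R_{A,V}(z)e^{-F}$. Since $F$ is smooth and constant outside some compact set a straightforward calculation (see [7]) using Lemma 3.2(d) yields, for $z \in \mathbb{C}$ and $\varphi \in \mathcal{D}(D_A) \cap \mathscr{E}'$,

$$\begin{aligned}
&\frac{1}{4\varepsilon}\|e^F(D_A - z)e^{-F}\varphi\|^2 + 3\varepsilon\|\alpha\cdot(-i\nabla + A)\varphi\|^2 + 3\varepsilon(1 + |z|^2)\|\varphi\|^2 + 3\varepsilon\langle\varphi \,|\, |\nabla F|^2\varphi\rangle \\
&\qquad \ge \Re\langle e^{-F}(D_A + \bar{z})e^F\varphi \,|\, e^F(D_A - z)e^{-F}\varphi\rangle \\
&\qquad = \|\alpha\cdot(-i\nabla + A)\varphi\|^2 + \langle\varphi \,|\, (1 - \Re z^2 - |\nabla F|^2)\varphi\rangle.
\end{aligned}$$

This and the assumption $|\nabla F| \le a$ permit to get, for $z = e_0 + i\eta \in \Gamma$, that is, $\Re z^2 = e_0^2 - \eta^2$, and for every $0 < \varepsilon < (1 - e_0^2 - a^2)/9 = (\triangle_0^2 - a^2)/9$,

$$\|e^F R_A(z)\,e^{-F}\| \le \frac{1}{\sqrt{4\varepsilon}}\,\frac{1}{\sqrt{1 - e_0^2 - a^2 - 9\varepsilon + \eta^2/2}} \le \frac{C_{a,\varepsilon}}{\sqrt{1 + \eta^2}}. \tag{3.12}$$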


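We choose $\varepsilon := (1 - e_0^2 - a^2)/10$ in what follows. Next, we pick some $R > \max\{|y| : y \in \mathcal{Y}\}$ and $\chi \in C^\infty(\mathbb{R}^3, [0,1])$ such that $\chi \equiv 0$ on $B_R(0)$, $\chi \equiv 1$ on $\mathbb{R}^3 \setminus B_{R+2}(0)$, and $\|\nabla\chi\|_\infty \le 1$. We set $\bar{\chi} := 1 - \chi$. Furthermore, we let $\tilde{\chi}$ denote the characteristic function of $\mathbb{R}^3 \setminus B_R(0)$. We choose $R$ so large (depending on $a$, but not on $F$; recall (2.8)) that

$$\sup_{|x| \ge R}\|V_C(x)\| + \sup_{|x| \ge R}\|V_H(x)\| + \|\tilde{\chi}\,e^F V_E\,e^{-F}\tilde{\chi}\| \le \frac{1}{2C_{a,\varepsilon}}. \tag{3.13}$$

Conjugating (3.6) with exponential weights and rearranging the terms we find, for $z \in \Gamma$,

$$\begin{aligned}
&\big\{1 + e^F R_A(z)e^{-F}\big(\tilde{\chi}V_C + \tilde{\chi}V_H + \chi e^F V_E e^{-F}\tilde{\chi}\big)\big\}\,\chi e^F R_{A,V}(z)e^{-F} \\
&\qquad = \chi e^F R_A(z)e^{-F} - \big(e^F R_A(z)e^{-F}\big)\big(e^F i\alpha\cdot\nabla\chi\big)\big(R_{A,V}(z) - R_A(z)\big)e^{-F} \\
&\qquad\quad - \big(e^F R_A(z)e^{-F}\big)\big(\chi e^F V_E e^{-F}\bar{\chi}\big)\big(1_{B_{R+2}(0)}e^F\big)R_{A,V}(z)e^{-F}.
\end{aligned}$$

Here the operator $\{\cdots\}$ on the left side can be inverted by means of a Neumann series and $\|\{\cdots\}^{-1}\| \le 2$ by (3.12) and (3.13). Furthermore, we recall the identity

$$\|\alpha\cdot v\|_{\mathscr{L}(\mathbb{C}^4)} = |v|, \qquad v \in \mathbb{R}^3, \tag{3.14}$$

which follows from the Clifford algebra relations (2.1), and observe that, by the choice of $\chi$, the assumption on $F$, and (3.14),

$$\|e^F i\alpha\cdot\nabla\chi\| \le e^{a(R+2)}, \qquad \|e^{-F}\| \le 1.$$

Moreover, we have, for $z = e_0 + i\eta \in \Gamma$,

$$\|R_A(z)\| \le \frac{1}{\sqrt{\triangle_0^2 + \eta^2}}, \qquad \|R_{A,V}(z)\| \le \frac{1}{\sqrt{\delta_0^2 + \eta^2}}. \tag{3.15}$$

Using these remarks together with (2.7) and (3.12), we obtain

$$\|\chi e^F R_{A,V}(z)e^{-F}\| \le \frac{C_a'\,e^{R+2}}{\sqrt{1 + \eta^2}}, \qquad z = e_0 + i\eta \in \Gamma.$$

This estimate implies the assertion if $F$ is constant outside some ball since, certainly, $\|\bar{\chi}\,e^F R_{A,V}(z)e^{-F}\| \le e^{a(R+2)}(\delta_0^2 + \eta^2)^{-1/2}$.

Let us now assume that $F \ge 0$ is not necessarily bounded. Let $F_1, F_2, \ldots \in C^\infty(\mathbb{R}^3, [0,\infty))$ be constant near infinity and such that $F_n = F$ on $B_n(0)$ and $F_n \to F$. Then $e^{-F_n}R_{A,V}(z)e^{F_n}\phi \to e^{-F}R_{A,V}(z)e^{F}\phi$ by the dominated convergence theorem, for every $\phi \in \mathscr{D}$. Since $e^{-F_n}R_{A,V}(z)e^{F_n}$ obeys the estimate (3.11) uniformly in $n$, we see that the densely defined operator $e^{-F}R_{A,V}(z)e^{F}{\restriction}_{\mathscr{D}}$ is bounded and satisfies (3.11), too. But this is the case if and only if its adjoint, $e^F R_{A,V}(z)e^{-F} = (e^{-F}R_{A,V}(z)e^{F})^*$, is an element of $\mathscr{L}(\mathscr{H})$ and satisfies (3.11) as well.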
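The identity (3.14) used in the proof above can be checked directly with a few lines of Python. The sketch below assumes the standard (Dirac) representation $\alpha_j = \begin{pmatrix} 0 & \sigma_j \\ \sigma_j & 0 \end{pmatrix}$, which is one common choice satisfying the Clifford relations; the text itself only refers to (2.1). The helper names are hypothetical. Since $\alpha\cdot v$ is Hermitian, verifying $(\alpha\cdot v)^2 = |v|^2\,1$ is enough to conclude $\|\alpha\cdot v\| = |v|$.

```python
# Pauli matrices sigma_1, sigma_2, sigma_3
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def alpha(j):
    """4x4 matrix alpha_j = [[0, sigma_j], [sigma_j, 0]] (assumed representation)."""
    s = SIGMA[j]
    a = [[0j] * 4 for _ in range(4)]
    for r in range(2):
        for c in range(2):
            a[r][c + 2] = complex(s[r][c])
            a[r + 2][c] = complex(s[r][c])
    return a


def alpha_dot(v):
    """The matrix alpha . v = sum_j v_j alpha_j for a real 3-vector v."""
    mats = [alpha(j) for j in range(3)]
    return [[sum(v[j] * mats[j][r][c] for j in range(3)) for c in range(4)]
            for r in range(4)]


def matmul(a, b):
    """Plain 4x4 complex matrix product."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def square_deviation(v):
    """Largest entry of (alpha . v)^2 - |v|^2 * identity in absolute value."""
    m = alpha_dot(v)
    m2 = matmul(m, m)
    n2 = sum(x * x for x in v)
    return max(abs(m2[r][c] - (n2 if r == c else 0.0))
               for r in range(4) for c in range(4))
```

A vanishing deviation, up to rounding, for arbitrary real vectors $v$ is what (3.14) predicts.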


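In the applications of the previous lemma, the following observation is very useful.

Lemma 3.5. Assume that $A \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3, \mathbb{R}^3)$ and that $V$ fulfills Hypothesis 2.2. Let $0 < a < \min\{\triangle_0, m\}$. Then there is some $C_a \in (0,\infty)$ such that, for all $F \in C^\infty(\mathbb{R}^3, \mathbb{R})$ with $F(0) = 0$, $F \ge 0$ or $F \le 0$, $\|\nabla F\|_\infty \le a$, which are constant outside some ball in $\mathbb{R}^3$, and for all $\phi \in \mathscr{H}$,

$$\int_\Gamma \big\||D_{A,V}|^{1/2}e^F R_{A,V}(z)e^{-F}\phi\big\|^2\,|dz| \le C_a\|\phi\|^2, \tag{3.16}$$

and, for $\phi \in \mathcal{D}(|D_{A,V}|^{1/2})$,

$$\int_\Gamma \big\|e^F R_{A,V}(z)e^{-F}|D_{A,V}|^{1/2}\phi\big\|^2\,|dz| \le C_a\|\phi\|^2. \tag{3.17}$$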

Proof. For later reference we additionally pick some $\chi \in C^\infty(\mathbb{R}^3, \mathbb{R})$ which is constant outside some large ball and infer from Lemma 3.2(e) that, for $z \in \Gamma$,

$$[R_{A,V}(z), \chi e^F] = R_{A,V}(z)\{i\alpha\cdot(\nabla\chi + \chi\nabla F) + [\chi e^F, V_E]e^{-F}\}e^F R_{A,V}(z). \tag{3.18}$$

The special case $\chi \equiv 1$ implies

$$e^F R_{A,V}(z)e^{-F} = R_{A,V}(z) - R_{A,V}(z)\{i\alpha\cdot\nabla F + [e^F, V_E]e^{-F}\}e^F R_{A,V}(z)e^{-F}. \tag{3.19}$$

Taking the adjoint and replacing $F$ by $-F$ and $\bar{z}$ by $z$ we also get

$$e^F R_{A,V}(z)e^{-F} = R_{A,V}(z) - e^F R_{A,V}(z)e^{-F}\{i\alpha\cdot\nabla F + e^F[V_E, e^{-F}]\}R_{A,V}(z). \tag{3.20}$$

Now, let $T$ be a self-adjoint operator on some Hilbert space, $\mathscr{K}$, such that $(-\delta_0, \delta_0) \subset \varrho(T)$. Then, for $\phi \in \mathscr{K}$,

$$\int_{\mathbb{R}} \big\||T|^{1/2}(T - i\eta)^{-1}\phi\big\|^2\,d\eta = \int_{\mathbb{R}}\int_{\mathbb{R}} \frac{|\lambda|}{\lambda^2 + \eta^2}\,d\eta\,d\|E_\lambda(T)\phi\|^2 = \pi\|\phi\|^2, \tag{3.21}$$

and it is elementary to check that, for $\eta \in \mathbb{R}$,

$$\big\||T|^{1/2}(T - i\eta)^{-1}\big\| \le \frac{\delta_0^{1/2}\,1_{(-\delta_0,\delta_0)}(\eta)}{\sqrt{\delta_0^2 + \eta^2}} + \frac{1_{(-\delta_0,\delta_0)^c}(\eta)}{\sqrt{2|\eta|}}. \tag{3.22}$$

Using (3.21) and (3.22) with $T = D_{A,V} - e_0$ and taking (2.8), (3.11), (3.14), and (3.15) into account, we readily derive the asserted estimate (3.16) from (3.19). The second estimate (3.17) is obtained analogously by means of (3.20).
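The two scalar facts behind (3.21) and (3.22) are easy to test numerically. The following Python sketch is an illustration only; the function names are hypothetical and the supremum is approximated on a logarithmic grid rather than computed exactly. It uses that $\int_{-L}^{L} |\lambda|/(\lambda^2+\eta^2)\,d\eta = 2\arctan(L/|\lambda|) \to \pi$, and that $\lambda \mapsto \lambda/(\lambda^2+\eta^2)$ attains its maximum $1/(2|\eta|)$ at $\lambda = |\eta|$, which explains the two cases in (3.22).

```python
import math


def scalar_integral(lam, cutoff):
    """Integral of |lam| / (lam^2 + eta^2) over eta in [-cutoff, cutoff].

    Evaluated via the antiderivative arctan(eta / |lam|); tends to pi
    as cutoff -> infinity, for every lam != 0.
    """
    return 2.0 * math.atan(cutoff / abs(lam))


def sup_symbol(eta, delta0, lam_max=1e4, steps=20000):
    """Grid approximation of sup over lam >= delta0 of sqrt(lam / (lam^2 + eta^2)).

    By symmetry in lam this also covers lam <= -delta0.
    """
    best = 0.0
    for k in range(steps + 1):
        lam = delta0 * (lam_max / delta0) ** (k / steps)  # logarithmic grid
        best = max(best, math.sqrt(lam / (lam * lam + eta * eta)))
    return best


def bound_322(eta, delta0):
    """Right-hand side of (3.22)."""
    if abs(eta) < delta0:
        return math.sqrt(delta0 / (delta0 ** 2 + eta ** 2))
    return 1.0 / math.sqrt(2.0 * abs(eta))
```

For any spectral gap $\delta_0 > 0$ the grid supremum should never exceed `bound_322`, and for $|\eta| < \delta_0$ the two agree because the supremum is then attained at the edge $\lambda = \delta_0$.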


3.3. Commutators

In this subsection, we derive the crucial technical prerequisites for the spectral analysis of $H_N$, namely various commutator estimates involving the projection $\Lambda^+_{A,V}$, cut-off functions, and exponential weights $e^F$. Roughly speaking, these estimates allow us to adapt many arguments known from the spectral analysis of partial differential operators that involve partitions of unity and conjugations with exponential weights to our non-local model. Our standard assumptions on the cut-off and weight functions are

$$\chi \in C^\infty(\mathbb{R}^3, [0,1]) \ \text{is constant outside some ball} \tag{3.23}$$

and

$$F \in C^\infty(\mathbb{R}^3, \mathbb{R}),\quad F \ge 0 \text{ or } F \le 0,\quad F(0) = 0,\quad |\nabla F| \le a,\quad F \text{ is constant outside some ball.} \tag{3.24}$$

To shorten the presentation, we generalize our estimates to unbounded $F$ only if this is explicitly used in this article.

Proposition 3.1. Assume that $A \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3, \mathbb{R}^3)$ and that $V$ fulfills Hypothesis 2.2 and let $0 < a_0 < \min\{\triangle_0, m\}$. Then there is some constant $C_{a_0} \in (0,\infty)$ such that, for all $a \in [0, a_0]$ and $\chi$, $F$ satisfying (3.23) and (3.24),

$$\big\||D_{A,V}|^{1/2}[\Lambda^+_{A,V}, \chi e^F]e^{-F}|D_{A,V}|^{1/2}\big\| \le C_{a_0}(\|\nabla\chi\|_\infty + a). \tag{3.25}$$

Proof. We shall employ the identity

$$[\Lambda^+_{A,V}, \chi e^F] = \frac{1}{2}[\mathrm{sgn}(D_{A,V} - e_0), \chi e^F] \tag{3.26}$$

and the representation of the sign function as a Cauchy principal value (see, e.g., [31, p. 359]),

$$\mathrm{sgn}(D_{A,V} - e_0)\psi = \int_\Gamma R_{A,V}(z)\psi\,\frac{dz}{\pi} := \lim_{R\to\infty}\int_{-R}^{R} R_{A,V}(e_0 + i\eta)\psi\,\frac{d\eta}{\pi}, \tag{3.27}$$

for $\psi \in \mathscr{H}$, where $\Gamma$ is defined in (3.9). Taking also (2.6), (2.7), and (3.18) into account we obtain

$$\begin{aligned}
&\big|\langle |D_{A,V}|^{1/2}\phi \,|\, [\Lambda^+_{A,V}, \chi e^F]e^{-F}|D_{A,V}|^{1/2}\psi\rangle\big| \\
&\quad \le \int_\Gamma \big\||D_{A,V}|^{1/2}R_{A,V}(\bar{z})\phi\big\|\ \big\|i\alpha\cdot(\nabla\chi + \chi\nabla F) + [\chi e^F, V_E]e^{-F}\big\|\ \big\|e^F R_{A,V}(z)e^{-F}|D_{A,V}|^{1/2}\psi\big\|\,\frac{|dz|}{2\pi} \\
&\quad \le C_{a_0}\big(\|\nabla\chi\|_\infty + \|\chi\nabla F\|_\infty + a\big)\Big(\int_\Gamma \big\||D_{A,V}|^{1/2}R_{A,V}(z)\phi\big\|^2\,\frac{|dz|}{2\pi}\Big)^{1/2}\Big(\int_\Gamma \big\|e^F R_{A,V}(z)e^{-F}|D_{A,V}|^{1/2}\psi\big\|^2\,\frac{|dz|}{2\pi}\Big)^{1/2},
\end{aligned} \tag{3.28}$$

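for $\phi, \psi \in \mathcal{D}(|D_A|^{1/2}) \supset \mathrm{Ran}(R_A(z))$. By virtue of (3.16) and (3.17), we first infer that

$$[\Lambda^+_{A,V}, \chi e^F]e^{-F}|D_{A,V}|^{1/2}\psi \in \mathcal{D}\big((|D_{A,V}|^{1/2})^*\big) = \mathcal{D}(|D_{A,V}|^{1/2}).$$

We conclude by recalling that an operator $T : \mathcal{D}(T) \to \mathscr{K}$ on some Hilbert space $\mathscr{K}$ is bounded if and only if

$$\sup\{|\langle\phi \,|\, T\psi\rangle| : \phi \in X,\ \psi \in \mathcal{D}(T),\ \|\phi\| = \|\psi\| = 1\} \tag{3.29}$$

is finite, in which case it is equal to the norm of $T$. Here $X \subset \mathscr{K}$ is a subspace with $\overline{X} \supset \mathrm{Ran}(T)$.

Given some suitable weight function, $F$, we abbreviate

$$\Lambda^F_{A,V} := e^F\Lambda^+_{A,V}\,e^{-F}. \tag{3.30}$$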

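Corollary 3.1. Assume that $A \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3, \mathbb{R}^3)$ and that $V$ fulfills Hypothesis 2.2. Let $0 < a < \min\{\triangle_0, m\}$. Then there is some $C(a) \in (0,\infty)$ such that, for all $F \in C^\infty(\mathbb{R}^3, \mathbb{R})$ satisfying $F(0) = 0$, $F \ge 0$ or $F \le 0$, and $\|\nabla F\|_\infty \le a$, we have $\Lambda^F_{A,V} \in \mathscr{L}(\mathscr{H})$ and $\|\Lambda^F_{A,V}\| \le C(a)$.

Proof. First, we assume that $F$ satisfies (3.24). In this case the claim follows from Proposition 3.1 because $[e^F, \Lambda^+_{A,V}]e^{-F} = \Lambda^F_{A,V} - \Lambda^+_{A,V}$. If $F$ is unbounded, then we apply an approximation argument similar to the one at the end of the proof of Lemma 3.4.

Corollary 3.2. Assume that $A \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3, \mathbb{R}^3)$ and that $V$ fulfills Hypothesis 2.2 and let $0 < a_0 < \min\{\triangle_0, m\}$. Then there is some constant $C \in (0,\infty)$ such that, for all $a \in [0, a_0]$, $\chi$, $F$ satisfying (3.23), (3.24), and $\|\nabla\chi\|_\infty \le 1$, $L \in \mathscr{L}(\mathscr{H})$, and $\varphi \in \mathscr{H}$,

$$\big|\langle\varphi \,|\, \Lambda^F_{A,V}\chi L\chi\Lambda^F_{A,V}\varphi\rangle - \langle\varphi \,|\, \chi\Lambda^+_{A,V}L\Lambda^+_{A,V}\chi\varphi\rangle\big| \le (a + \|\nabla\chi\|_\infty)\,C\,\|L\|\,\|\varphi\|^2. \tag{3.31}$$

Moreover, for all $\varphi \in \mathcal{D}(D_{A,V})$,

$$\begin{aligned}
&\big|\langle\varphi \,|\, \Lambda^F_{A,V}\chi D_{A,V_C}\chi\Lambda^F_{A,V}\varphi\rangle - \langle\varphi \,|\, \chi\Lambda^+_{A,V}D_{A,V_C}\Lambda^+_{A,V}\chi\varphi\rangle\big| \\
&\qquad \le (a + \|\nabla\chi\|_\infty)\inf_{0 < \varepsilon \le 1}\big\{\varepsilon\langle\varphi \,|\, \chi\Lambda^+_{A,V}D_{A,V_C}\Lambda^+_{A,V}\chi\varphi\rangle + C\varepsilon^{-1}\|\varphi\|^2\big\}.
\end{aligned} \tag{3.32}$$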


If $V_C = V = 0$, then (3.32) still holds true, if $\chi D_A\chi$ is replaced by $\chi|D_A|\chi$ on the left side.

Proof. The proof of (3.31) is a rather obvious application of Proposition 3.1 and in fact a simpler analogue of the derivation of (3.32) below. Once (3.31) is verified, it suffices to prove (3.32) with $D_{A,V_C}$ replaced by $D_{A,V}$ since $V_H$ and $V_E$ are bounded. Without loss of generality we may further assume that $D_{A,V}$ is positive on the range of $\Lambda^+_{A,V}$. For otherwise we could add a suitable constant to $D_{A,V}$. To prove (3.32) we first recall that $\Lambda^+_{A,V}$ maps the domain of $D_{A,V}$ into itself and by Lemma 3.2(d) we know that multiplication with $\chi$ or $e^{\pm F}$ leaves $\mathcal{D}(D_{A,V})$ invariant, too. We thus have the following identity on $\mathcal{D}(D_{A,V})$,

$$\begin{aligned}
\Lambda^F_{A,V}\chi D_{A,V}\chi\Lambda^F_{A,V} - \chi\Lambda^+_{A,V}D_{A,V}\Lambda^+_{A,V}\chi
&= e^F[\Lambda^+_{A,V}, \chi e^{-F}]\,D_{A,V}\Lambda^+_{A,V}\chi + \chi\Lambda^+_{A,V}D_{A,V}\,[\chi e^F, \Lambda^+_{A,V}]e^{-F} \\
&\quad + e^F[\Lambda^+_{A,V}, \chi e^{-F}]\,D_{A,V}\,[\chi e^F, \Lambda^+_{A,V}]e^{-F}.
\end{aligned}$$

It follows that the absolute value on the left-hand side of (3.32) is less than or equal to

$$\big\||D_{A,V}|^{1/2}\Lambda^+_{A,V}\chi\varphi\big\|\sum_{\sharp = \pm}\big\||D_{A,V}|^{1/2}[\Lambda^+_{A,V}, \chi e^{\sharp F}]e^{-\sharp F}\varphi\big\| + \sum_{\sharp = \pm}\big\||D_{A,V}|^{1/2}[\Lambda^+_{A,V}, \chi e^{\sharp F}]e^{-\sharp F}\varphi\big\|^2,$$

which together with Proposition 3.1 implies (3.32). The last statement of the corollary is valid since the argument above works equally well with $|D_{A,V}|$ in place of $D_{A,V}$ because $\Lambda^+_{A,V}|D_{A,V}| = \Lambda^+_{A,V}D_{A,V}$.

In order to carry out explicit computations it is important to know that functions in the image set $\Lambda^+_{A,V}\mathscr{D}$ still have a certain regularity. This is ensured by the following lemma. As a first consequence we shall see in the corollary below that $H_N$ is actually well-defined on $\mathscr{D}_N$.

Lemma 3.6. Assume that $A \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3, \mathbb{R}^3)$ satisfies $\|Ae^{-\tau_0|x|}\|_\infty < \infty$, for some $0 \le \tau_0 < \min\{m, \triangle_0\}$, and that $V$ fulfills Hypothesis 2.2. Then $\Lambda^+_{A,V}\phi \in H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)$, for every $\phi \in \mathscr{D}$.

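Proof. Let $\phi \in \mathscr{D}$. We pick some $\chi \in C_0^\infty(\mathbb{R}^3, [0,1])$ with $\chi \equiv 1$ on $\mathrm{supp}(\phi)$. Furthermore, we pick $\zeta \in C_0^\infty(\mathbb{R}^3, [0,1])$ such that $\zeta \equiv 1$ on $\mathrm{supp}(\chi) \cup B_R(0)$, where $R > \max\{|y| : y \in \mathcal{Y}\}$. We set $\bar{\zeta} := 1 - \zeta$. Since $\mathcal{D}(D_{A,V}) \subset H^{1/2}_{\mathrm{loc}}(\mathbb{R}^3, \mathbb{C}^4)$ and the spectral projection $\Lambda^+_{A,V}$ maps the domain of $D_{A,V}$ into itself it follows that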


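$\zeta\Lambda^+_{A,V}\phi \in H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)$. Furthermore, we pick a (smooth, locally finite) partition of unity on $\mathbb{R}^3$, $\{J_\nu\}_{\nu\in\mathbb{N}}$, $\sum_{\nu=1}^\infty J_\nu = 1$, such that $\sum_{\nu=1}^\infty |\nabla J_\nu| \le C$, for some constant $C \in (0,\infty)$. Setting $\zeta_\nu := J_\nu\bar{\zeta}$, $\nu \in \mathbb{N}$, $\tilde{\phi} := \Lambda^+_{A,V}(D_{A,V} - i)\phi$, and using (3.6), we obtain

$$\begin{aligned}
\bar{\zeta}\Lambda^+_{A,V}\phi &= \sum_{\nu=1}^\infty \zeta_\nu R_{A,V}(i)\Lambda^+_{A,V}(D_{A,V} - i)\phi \\
&= \bar{\zeta}R_0(i)\tilde{\phi} - \sum_{\nu=1}^\infty R_0(i)\,i\alpha\cdot(\nabla\zeta_\nu)\big(R_{A,V}(i) - R_0(i)\big)\tilde{\phi} \qquad (3.33) \\
&\quad - \sum_{\nu=1}^\infty R_0(i)\,\zeta_\nu(V + \alpha\cdot A)R_{A,V}(i)\tilde{\phi}. \qquad (3.34)
\end{aligned}$$

Here the sum in (3.33) commutes with the first resolvent and the strong limit $\sum_{\nu=1}^\infty i\alpha\cdot(\nabla\zeta_\nu)$ defines a bounded operator on $L^2(\mathbb{R}^3, \mathbb{C}^4)$. To treat (3.34) we first use that $\sum_{\nu=1}^\infty \zeta_\nu V = \bar{\zeta}V$ is bounded. Next, we pick some $F \in C^\infty(\mathbb{R}^3, [0,\infty))$ vanishing on some ball containing 0 and $\mathrm{supp}(\phi)$ and satisfying $F(x) = a|x| - a'$, for $x$ outside some sufficiently large ball with $\tau_0 < a < \min\{m, \triangle_0\}$, $a' > 0$. Then we write

$$(\alpha\cdot A)R_{A,V}(i)\tilde{\phi} = \big(e^{-F}\alpha\cdot A\big)\big(e^F R_{A,V}(i)e^{-F}\big)\big(e^F\Lambda^+_{A,V}e^{-F}\big)\big(D_{A,V_C+V_H} + i\alpha\cdot\nabla F + e^F V_E e^{-F}\big)\phi.$$

Using (2.7), Lemma 3.4 and Corollary 3.1 we see that $(\alpha\cdot A)R_{A,V}(i)\tilde{\phi}$ is an element of $L^2(\mathbb{R}^3, \mathbb{C}^4)$. These remarks imply that $\bar{\zeta}\Lambda^+_{A,V}\phi$ belongs to $\mathrm{Ran}(R_0(i)) + \mathrm{Ran}(\bar{\zeta}R_0(i)) = H^1(\mathbb{R}^3, \mathbb{C}^4)$.

We may now conclude that $H_N$ is well-defined on the dense domain $\mathscr{D}_N$ defined in (2.13).

Corollary 3.3. Assume that $A \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3, \mathbb{R}^3)$ satisfies $\|Ae^{-\tau_0|x|}\|_\infty < \infty$, for some $0 \le \tau_0 < \min\{m, \triangle_0\}$, and that $V$ fulfills Hypothesis 2.2. Then, for $\Psi \in \mathscr{D}_N$ and $1 \le i < j \le N$,

$$\int_{\mathbb{R}^{3N}} \frac{1}{|x_i - x_j|^2}\,\big|\Lambda^{+,N}_{A,V}\Psi(X)\big|^2\,dX < \infty.$$

Proof. Let $\phi, \psi \in \mathscr{D}$. Thanks to Lemma 3.6 we know that both $\Lambda^+_{A,V}\phi$ and $\Lambda^+_{A,V}\psi$ belong to $H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)$ and, hence, to $L^3(\mathbb{R}^3, \mathbb{C}^4)$ by the Sobolev inequality for $|\tfrac{1}{i}\nabla|$. An application of the Hardy–Littlewood–Sobolev inequality thus yields

$$\int_{\mathbb{R}^6} \frac{1}{|x - y|^2}\,|\Lambda^+_{A,V}\phi(x)|^2\,|\Lambda^+_{A,V}\psi(y)|^2\,dx\,dy < \infty.$$

This estimate clearly implies the full assertion.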


In our applications it is important to control commutators that are multiplied with square-roots of the electron-electron interactions $W(x_i, x_j)$. In order to formulate an appropriate estimate we set

$$W_y(x) := W(x,y) = W(y,x), \qquad x, y \in \mathbb{R}^3, \tag{3.35}$$

in what follows. The proof of the next proposition looks somewhat lengthy and is hence postponed to Sec. 3.5. This is due to the fact that the singularity of $W_y$ may be located anywhere and that we allow for unbounded magnetic fields. We remark that, even in the case $V = 0$, a diamagnetic inequality is not very useful in this context since, for unbounded magnetic fields, one cannot compare $|-i\nabla + A|$ with $|D_A|$. We tackle this problem by a procedure that involves a partition of unity, local gauge transformations, and exponential decay estimates which control the correlation between different regions in position space. As a result we obtain a commutator estimate which can be chosen to depend only on the local magnitude of $|B|$ either at the singularity $y$ or on the support of the involved cut-off function. For any function $\chi$ on $\mathbb{R}^3$ we use the notation

$$\|B\|_{\infty,\chi} := \sup\{|B(x)| : x \in \mathrm{supp}(\chi)\}. \tag{3.36}$$

Proposition 3.2. Assume that $A \in C^1(\mathbb{R}^3, \mathbb{R}^3)$ and $B = \mathrm{curl}\,A$ satisfies (2.18) and that $V$ fulfills Hypothesis 2.2. Let $0 \le a_0 < \min\{m, \triangle_0\}$ and $\mathcal{N} \subset \mathbb{R}^3$ be a neighborhood of the set of singularities, $\mathcal{Y}$, of $V_C$. Then there is some constant, $C_{a_0,\mathcal{N}} \in (0,\infty)$, such that, for all $a \in [0, a_0]$, all $\chi$, $F$ satisfying (3.23), (3.24) which are constant on $\mathcal{N}$, and all $y \in \mathbb{R}^3$,

$$\big\|W_y^{1/2}[\Lambda^+_{A,V}, e^F\chi]e^{-F}\big\| \le C_{a_0,\mathcal{N}}\big(1 + \min\{|B(y)|, \|B\|_{\infty,\chi}\}\big)(a + \|\nabla\chi\|_\infty). \tag{3.37}$$

If $V_E = 0$, then $\|B\|_{\infty,\chi}$ can be replaced by $\|B\|_{\infty,\chi\nabla F + \nabla\chi}$ in (3.37).

Corollary 3.4. Assume that $A \in C^1(\mathbb{R}^3, \mathbb{R}^3)$ and $B = \mathrm{curl}\,A$ is bounded and that $V$ fulfills Hypothesis 2.2. Then we find, for every $\varepsilon > 0$, some constant $C_{a_0,\varepsilon} \in (0,\infty)$ such that, for all $F$ satisfying (3.24), $\varphi \in \mathscr{D}_N$, and $1 \le i \le N$,

$$\begin{aligned}
&\big|\langle\varphi \,|\, (1\otimes e^F)\Lambda^{+,N}_{A,V}W_{iN}\Lambda^{+,N}_{A,V}(1\otimes e^{-F})\varphi\rangle - \langle\varphi \,|\, \Lambda^{+,N}_{A,V}W_{iN}\Lambda^{+,N}_{A,V}\varphi\rangle\big| \\
&\qquad \le a\big\{\varepsilon\langle\varphi \,|\, \Lambda^{+,N}_{A,V}W_{iN}\Lambda^{+,N}_{A,V}\varphi\rangle + C_{a_0,\varepsilon}\|\varphi\|^2\big\},
\end{aligned}$$

where $e^F$ acts only on the last variable.

Proof. This corollary is proved by means of Proposition 3.2 in the same way as Corollary 3.2. We also recall that $\Lambda^{+,N}_{A,V}\mathscr{D}_N \subset \mathcal{D}(W_{ij})$.

The technique used in the proof of Proposition 3.2 also yields the following result whose proof can be found in Sec. 3.5, too:


Lemma 3.7. Assume that $A \in C^1(\mathbb{R}^3, \mathbb{R}^3)$ and $B = \mathrm{curl}\,A$ satisfies (2.18) and that $V$ fulfills Hypothesis 2.2. Then there is some constant $C \in (0,\infty)$ such that, for all $\psi \in \mathcal{D}(D_{A,V})$,

$$\big\|W_y^{1/2}\Lambda^+_{A,V}\psi\big\| \le C\big(1 + \min\{|B(y)|, \|B\|_{\infty,\psi}\}\big)\|(D_{A,V} - i)\psi\|. \tag{3.38}$$

3.4. Differences of projections

In our applications it is eventually necessary to have some control on the difference between $\Lambda^+_{A,V}$ and $\Lambda^+_A := \Lambda^+_{A,0}$.

Lemma 3.8. Assume that $A \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3, \mathbb{R}^3)$ and that $V$ fulfills Hypothesis 2.2. Then there is some $C \in (0,\infty)$ such that, for all $\zeta \in C^\infty(\mathbb{R}^3, [0,1])$ which are constant outside some ball such that $\zeta V_C$ is bounded,

$$\big\||D_A|^{1/2}\zeta(\Lambda^+_A - \Lambda^+_{A,V})\big\| \le C(\|\zeta V\| + \|\nabla\zeta\|_\infty).$$

In particular, $\zeta\Lambda^+_{A,V}\varphi \in \mathcal{D}(|D_A|^{1/2})$, for every $\varphi \in \mathcal{D}(D_A)$.

Proof. Due to (3.27) the norm in the statement (if it exists) is bounded from above by

$$\sup_{\substack{\phi \in \mathcal{D}(|D_A|^{1/2}),\ \psi \in \mathscr{H} \\ \|\phi\| = \|\psi\| = 1}}\ \int_\Gamma \big|\langle |D_A|^{1/2}\phi \,|\, \zeta(R_A(z) - R_{A,V}(z))\psi\rangle\big|\,\frac{|dz|}{\pi}.$$

We next use (3.6), (3.15), and (3.17) to conclude that the asserted bound holds true.

We note the following trivial consequence of the previous lemma: Namely, we pick some $\theta \in C_0^\infty(\mathbb{R}^3, [0,1])$ with $\theta \equiv 1$ on $B_1(0)$ and $\theta \equiv 0$ outside $B_2(0)$, and set $\theta_R(x) := \theta(x/R)$, for $R \ge 1$, $x \in \mathbb{R}^3$. By virtue of Hypothesis 2.2 and Lemma 3.8 we then have, for every $\zeta$ as in the statement of Lemma 3.8,

$$\big\||D_A|^{1/2}(1 - \theta_R)\zeta(\Lambda^+_A - \Lambda^+_{A,V})\big\| \le C\Big(\|(1 - \theta_R)V\| + \frac{\|\nabla\theta\|_\infty}{R}\Big) \to 0, \tag{3.39}$$

as $R$ tends to infinity.

Corollary 3.5. Assume that $A \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3, \mathbb{R}^3)$ and that $V$ fulfills Hypothesis 2.2. Then there is some $C \in (0,\infty)$ such that, for every $\zeta \in C^\infty(\mathbb{R}^3, \mathbb{R})$, which is constant outside some ball and such that $\zeta V_C^s = 0$ and $\|\zeta V\| + \|\nabla\zeta\|_\infty \le 1$, and


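every $\varphi \in \mathscr{D}$,

$$\begin{aligned}
&\big|\langle\varphi \,|\, \Lambda^+_{A,V}\zeta D_{A,V_C}\zeta\Lambda^+_{A,V}\varphi\rangle - \langle\varphi \,|\, \Lambda^+_A\zeta D_A\zeta\Lambda^+_A\varphi\rangle\big| \\
&\qquad \le (\|\zeta V\| + \|\nabla\zeta\|_\infty)\inf_{0 < \varepsilon \le 1}\Big\{\varepsilon\langle\varphi \,|\, \Lambda^+_A\zeta|D_A|\zeta\Lambda^+_A\varphi\rangle + \frac{C}{\varepsilon}\|\varphi\|^2\Big\},
\end{aligned} \tag{3.40}$$

and

$$\begin{aligned}
&\big|\langle\varphi \,|\, \zeta\Lambda^+_{A,V}D_{A,V_C}\Lambda^+_{A,V}\zeta\varphi\rangle - \langle\varphi \,|\, \zeta\Lambda^+_A D_A\Lambda^+_A\zeta\varphi\rangle\big| \\
&\qquad \le (\|\zeta V\| + \|\nabla\zeta\|_\infty)\inf_{0 < \varepsilon \le 1}\Big\{\varepsilon\langle\varphi \,|\, \zeta\Lambda^+_A D_A\Lambda^+_A\zeta\varphi\rangle + \frac{C}{\varepsilon}\|\varphi\|^2\Big\}.
\end{aligned} \tag{3.41}$$

The last estimate still holds true (with a new constant $C$), if $D_A$ and $D_{A,V_C}$ are replaced by $D_A - 1$ and $D_{A,V_C} - 1$, respectively.

Proof. Let $\varphi \in \mathscr{D}$ and let $\theta_R$ be the cut-off function constructed in the paragraph preceding (3.39). On account of Lemma 3.6 we know that $\Lambda^+_{A,V}\varphi \in H^{1/2}$ and, hence, we infer from Lemma 3.2(c) that $\theta_R\zeta\Lambda^+_{A,V}\varphi \in H_c^{1/2}$ belongs to the domain of $D_A$. Applying also the formula appearing in (ii) of Lemma 3.1 and using $\zeta V_C^s = 0$ we obtain

$$\langle\varphi \,|\, \Lambda^+_{A,V}\zeta D_{A,V_C^s}\zeta\Lambda^+_{A,V}\varphi\rangle = \lim_{R\to\infty}\langle\theta_R\zeta\Lambda^+_{A,V}\varphi \,|\, D_{A,V_C^s}\zeta\Lambda^+_{A,V}\varphi\rangle = \lim_{R\to\infty}\langle D_A\theta_R\zeta\Lambda^+_{A,V}\varphi \,|\, \zeta\Lambda^+_{A,V}\varphi\rangle.$$

Writing $\delta\Lambda^+ := \Lambda^+_{A,V} - \Lambda^+_A$ we further get

$$\begin{aligned}
\langle D_A\theta_R\zeta\Lambda^+_{A,V}\varphi \,|\, \zeta\Lambda^+_{A,V}\varphi\rangle
&= \langle D_A\theta_R\zeta\Lambda^+_A\varphi \,|\, \zeta\Lambda^+_A\varphi\rangle + \langle D_A\theta_R\zeta\Lambda^+_A\varphi \,|\, \zeta\,\delta\Lambda^+\varphi\rangle \\
&\quad + \langle\theta_R\zeta\,\delta\Lambda^+\varphi \,|\, D_A\zeta\Lambda^+_A\varphi\rangle + \langle D_A\theta_R\zeta\,\delta\Lambda^+\varphi \,|\, \zeta\,\delta\Lambda^+\varphi\rangle.
\end{aligned}$$

By virtue of Lemma 3.8, we know that $\zeta\,\delta\Lambda^+\varphi \in \mathcal{D}(|D_A|^{1/2})$ and it is easy to see that $D_A\theta_R\zeta\Lambda^+_A\varphi \to D_A\zeta\Lambda^+_A\varphi$, as $R \to \infty$. Using also (3.39) we arrive at

$$\big|\langle\varphi \,|\, \Lambda^+_{A,V}\zeta D_{A,V_C^s}\zeta\Lambda^+_{A,V}\varphi\rangle - \langle D_A\zeta\Lambda^+_A\varphi \,|\, \zeta\Lambda^+_A\varphi\rangle\big| \le 2\big\||D_A|^{1/2}\zeta\Lambda^+_A\varphi\big\|\,\big\||D_A|^{1/2}\zeta\,\delta\Lambda^+\varphi\big\| + \big\||D_A|^{1/2}\zeta\,\delta\Lambda^+\varphi\big\|^2.$$

Therefore, we obtain (3.40) by applying Lemma 3.8 once again. (3.41) follows from a straightforward combination of (3.40) and Corollary 3.2. The last statement of Corollary 3.5 follows from (3.41) and Lemma 3.8.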


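3.5. Proofs of Proposition 3.2 and Lemma 3.7

First, we recall a useful resolvent identity. We remind the reader that $V_C^s = S|V_C^s|$ denotes the polar decomposition of the potential defined in (3.1) and set

$$M(z) := |V_C^s|^{1/2}R_0(z)|V_C^s|^{1/2}, \qquad z \in \varrho(D_0). \tag{3.42}$$

Lemma 3.9 ([32, 41, 42]). Assume that $V_C$ fulfills Hypothesis 2.1 and let $V_C^s$ be given by (3.1). Then there exist $\eta_0 > 0$ and $\gamma_0 \in (\gamma, 1)$ such that, for every $\eta \in \mathbb{R} \setminus (-\eta_0, \eta_0)$, we have $\|M(i\eta)\| \le \gamma_0$ and

$$R_{0,V_C^s}(i\eta) = R_0(i\eta) - R_0(i\eta)|V_C^s|^{1/2}\big(1 + SM(i\eta)\big)^{-1}S|V_C^s|^{1/2}R_0(i\eta). \tag{3.43}$$

Proof. The inequality $\big\||\cdot|^{-1/2}R_0(i\eta)|\cdot|^{-1/2}\big\| \le 1$ has been conjectured in [41] and proved in [32]. By means of this inequality and the arguments of [42, pp. 2 and 3 (with $k_s \equiv 1$)] we find some $\gamma_0 \in (\gamma, 1)$ such that $\|M(i\eta)\| \le \gamma_0$ provided $|\eta|$ is large enough. The resolvent formula (3.43) then follows from [41, Lemma 2.2 and Theorem 2.2].

Proof of Proposition 3.2. We pick some $\zeta \in C_0^\infty(\mathbb{R}^3, [0,1])$ such that $\zeta = 1$ in a neighborhood of $\mathcal{Y}$ and $\mathrm{supp}(\zeta) \subset \mathring{\mathcal{N}}$. We set $\bar{\zeta} := 1 - \zeta$. Moreover, we pick some $\varrho \in C_0^\infty(\mathbb{R}^3, [0,1])$ with $\varrho \equiv 1$ on $B_{1/2}(0)$ and $\mathrm{supp}(\varrho) \subset B_1(0)$ and set $\varrho_y(x) := \varrho(x - y)$, $x \in \mathbb{R}^3$. On account of Proposition 3.1 it suffices to consider

$$\langle\varrho_y W_y^{1/2}\phi \,|\, [\Lambda^+_{A,V}, e^F\chi]e^{-F}\psi\rangle = \int_\Gamma \langle\varrho_y W_y^{1/2}\phi \,|\, \zeta\,\mathscr{T}(z)\psi\rangle\,\frac{dz}{2\pi} + \int_\Gamma \langle\varrho_y W_y^{1/2}\phi \,|\, \bar{\zeta}\,\mathscr{T}(z)\psi\rangle\,\frac{dz}{2\pi}, \tag{3.44}$$

for $\phi \in H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)$ and $\psi \in \mathscr{H}$, where, by (3.18) and (3.27),

$$\mathscr{T}(z) := R_{A,V}(z)\,T\,e^F R_{A,V}(z)e^{-F}, \qquad z \in \Gamma, \tag{3.45}$$

$$T := i\alpha\cdot(\chi\nabla F + \nabla\chi) + [e^F\chi, V_E]e^{-F} = O(\|\nabla\chi\|_\infty + a). \tag{3.46}$$

To study the first integral in (3.44) we write, using (3.6),

$$\zeta R_{A,V}(z) = \zeta R_{0,V_C^s}(z) + R_{0,V_C^s}(z)\,i\alpha\cdot\nabla\zeta\,\big(R_{0,V_C^s}(z) - R_{A,V}(z)\big) - R_{0,V_C^s}(z)\,\zeta\{V - V_C^s + \alpha\cdot A\}R_{A,V}(z),$$

where $V_C^s = S|V_C^s|$ is defined in (3.1). Since $\mathcal{D}(D_{0,V_C^s}) \subset H^{1/2}(\mathbb{R}^3, \mathbb{C}^4)$ due to Lemma 3.2(e) we find $C, C' \in (0,\infty)$ such that, for all $y \in \mathbb{R}^3$ and all $z \in \Gamma$,

$$\big\|W_y^{1/2}R_{0,V_C^s}(z)\big\| \le C\big\||D_0|^{1/2}R_{0,V_C^s}(z)\big\| \le C'. \tag{3.47}$$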


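By definition of $\mathscr{T}(z)$ and $T$ and by (3.11) we thus have

$$\int_\Gamma \big|\langle\varrho_y W_y^{1/2}\phi \,|\, \zeta\{\mathscr{T}(z) - R_{0,V_C^s}(z)\,T\,e^F R_{A,V}(z)e^{-F}\}\psi\rangle\big|\,\frac{|dz|}{2\pi} \le C_{a,\zeta}(a + \|\nabla\chi\|_\infty)\|\phi\|\,\|\psi\|\int_{\mathbb{R}}\frac{d\eta}{1 + \eta^2}. \tag{3.48}$$

To treat the remaining part of the first integral in (3.44) we employ the first resolvent formula and (3.43) to obtain, for $\eta \in \mathbb{R}$ with $|\eta| \ge \eta_0$ and $z = e_0 + i\eta$ ($\eta_0$ is the parameter appearing in Lemma 3.9),

$$R_{0,V_C^s}(z) = R_0(i\eta) + e_0 R_{0,V_C^s}(z)R_{0,V_C^s}(i\eta) - R_0(i\eta)|V_C^s|^{1/2}\big(1 + SM(i\eta)\big)^{-1}S|V_C^s|^{1/2}|D_0|^{-1/2}|D_0|^{1/2}R_0(i\eta). \tag{3.49}$$

Here the operator $(1 + SM(i\eta))^{-1}$ is uniformly bounded for $|\eta| \ge \eta_0$. Moreover,

$$\big\|W_y^{1/2}R_0(i\eta)|V_C^s|^{1/2}\big\| \le C, \tag{3.50}$$

uniformly in $y \in \mathbb{R}^3$ and $\eta \in \mathbb{R}$, which is a simple consequence of Kato's inequality. In view of (3.48), (3.49) and (3.22) it is therefore clear that it suffices to discuss the contribution coming from the bare resolvent $R_0(i\eta)$ in (3.49). On account of (3.11), (3.21) and (3.46) we find by means of the Cauchy–Schwarz inequality,

$$\int_{|\eta| \ge \eta_0} \big|\langle R_0(-i\eta)W_y^{1/2}\varrho_y\zeta\phi \,|\, T e^F R_{A,V}(e_0 + i\eta)e^{-F}\psi\rangle\big|\,\frac{d\eta}{2\pi} \le \|T\|\,\big\|(|D_0|^{-1/2}W_y^{1/2})\varrho_y\zeta\phi\big\|\,\|\psi\| = O(a + \|\nabla\chi\|_\infty)\|\phi\|\,\|\psi\|.$$

Since the remaining part of the integral over $\{|\eta| < \eta_0\}$ does not pose any further problem, we see altogether that the first integral in (3.44) is absolutely convergent and of order $O(\|\nabla\chi\|_\infty + a)$.

Next, we treat the second integral on the right-hand side of (3.44). Since we eventually have to change the gauge locally, we pick a (smooth) partition of unity, $\{J_\nu\}_{\nu\in\mathbb{Z}^3}$, on $\mathbb{R}^3$ such that $\mathrm{supp}(J_\nu) \subset B_1(\nu)$, for every $\nu \in \mathbb{Z}^3$. We can certainly assume that $\sum_{\nu\in\mathbb{Z}^3}|\nabla J_\nu| \le C$, for some $C \in (0,\infty)$. We set

$$\mathscr{G}(\chi) := \{\nu \in \mathbb{Z}^3 : J_\nu\chi \ne 0\}, \qquad \mu := \sum_{\nu\in\mathscr{G}(\chi)} J_\nu, \qquad \bar{\mu} := 1 - \mu,$$

so that $\mu\chi = \chi$, $\bar{\mu}\chi = 0$, and re-write the operator defined in (3.46) as

$$\begin{aligned}
T &= i\alpha\cdot(\chi\nabla F + \nabla\chi) + \chi[e^F, V_E]e^{-F} + [\chi, V_E] \\
&= \mu\{i\alpha\cdot(\chi\nabla F + \nabla\chi) + \chi[e^F, V_E]e^{-F} + [\chi, V_E]\} - \{\bar{\mu}V_E\chi\}\mu \\
&=: \mu U_1 - U_2\mu = \sum_{\nu\in\mathscr{G}(\chi)}(J_\nu U_1 - U_2 J_\nu).
\end{aligned}$$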


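For every $\nu \in \mathbb{Z}^3$, there is some gauge potential, $g_\nu \in C^2(\mathbb{R}^3, \mathbb{R})$, such that $A - \nabla g_\nu = A_\nu$, where $A_\nu$ is defined by

$$A_\nu(x) := \int_0^1 B(\nu + t(x - \nu)) \wedge t(x - \nu)\,dt, \qquad x \in \mathbb{R}^3. \tag{3.51}$$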
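Formula (3.51) is the Poincaré (or transversal) gauge centered at $\nu$, and for a divergence-free field $B$ it satisfies $\mathrm{curl}\,A_\nu = B$. The following Python sketch checks this numerically. It reads $\wedge$ as the cross product in $\mathbb{R}^3$ and uses an illustrative test field $B(p) = (0, 0, p_1)$; both the test field and the helper names are assumptions made for the example. The $t$-integral is evaluated with a composite Simpson rule and the curl with central differences.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def poincare_gauge(B, nu, x, n=64):
    """A_nu(x) = int_0^1 B(nu + t (x - nu)) x t (x - nu) dt, composite Simpson rule (n even)."""
    d = [x[i] - nu[i] for i in range(3)]

    def integrand(t):
        p = [nu[i] + t * d[i] for i in range(3)]
        return cross(B(p), [t * di for di in d])

    h = 1.0 / n
    acc = [0.0, 0.0, 0.0]
    for k in range(n + 1):
        w = 1 if k in (0, n) else (4 if k % 2 else 2)  # Simpson weights
        val = integrand(k * h)
        for i in range(3):
            acc[i] += w * val[i]
    return [a * h / 3.0 for a in acc]


def curl(A, x, h=1e-3):
    """Central-difference approximation of curl A at the point x."""
    def partial(i, j):  # derivative of component A_i with respect to x_j
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        return (A(xp)[i] - A(xm)[i]) / (2.0 * h)

    return [partial(2, 1) - partial(1, 2),
            partial(0, 2) - partial(2, 0),
            partial(1, 0) - partial(0, 1)]


def sample_field(p):
    """Divergence-free test field B(p) = (0, 0, p_1) (illustrative assumption)."""
    return [0.0, 0.0, p[0]]
```

Because $A_\nu(x)$ only involves $B$ along the segment from $\nu$ to $x$, one obtains $|A_\nu(x)| \le \tfrac{1}{2}|x - \nu|\sup|B|$ on that segment, which is what makes this gauge convenient for estimates depending only on the local size of $B$.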

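By virtue of (3.6) we obtain with $\tilde{A}_\nu := \nabla g_\nu$,

$$\begin{aligned}
\varrho_y\bar{\zeta}R_{A,V}(z)\,T &= \sum_{\nu\in\mathscr{G}(\chi)} \varrho_y\bar{\zeta}R_{A,V}(z)(J_\nu U_1 - U_2 J_\nu) \\
&= \sum_{\nu\in\mathscr{G}(\chi)} \varrho_y\bar{\zeta}R_{\tilde{A}_\nu}(z)(J_\nu U_1 - U_2 J_\nu) \\
&\quad + \sum_{\nu\in\mathscr{G}(\chi)} R_{\tilde{A}_\nu}(z)\,i\alpha\cdot\nabla(\varrho_y\bar{\zeta})\big(R_{\tilde{A}_\nu}(z) - R_{A,V}(z)\big)(J_\nu U_1 - U_2 J_\nu) \\
&\quad - \sum_{\nu\in\mathscr{G}(\chi)} R_{\tilde{A}_\nu}(z)\,\varrho_y\bar{\zeta}(V + \alpha\cdot A_\nu)R_{A,V}(z)(J_\nu U_1 - U_2 J_\nu) \\
&=: S_1 + S_2 - S_3.
\end{aligned} \tag{3.52}$$

The proof of Proposition 3.2 is finished as soon as we have shown Lemma 3.10 below.

Lemma 3.10. In the situation above there exists a $\chi$- and $F$-independent constant, $C_{a_0} \in (0,\infty)$, such that, for $j = 1, 2, 3$,

$$I_j := \int_\Gamma \big|\langle W_y^{1/2}\phi \,|\, S_j e^F R_{A,V}(z)e^{-F}\psi\rangle\big|\,|dz| \le C_{a_0}\big(1 + \min\{|B(y)|, \|B\|_{\infty,\chi}\}\big)(\|\nabla\chi\|_\infty + a)\|\phi\|\,\|\psi\|.$$

Proof of Lemma 3.10. In our estimations below, that involve non-local operators, we exploit the fact that the interference between spatially separated regions decays exponentially. Therefore, we start by introducing appropriate exponential weight functions: We pick some $\tilde{a} \in (\tau, \min\{\triangle_0, m\})$ and a convex, even function $\tilde{f} \in C^\infty(\mathbb{R}, [0,\infty))$ such that $\tilde{f} \equiv 0$ on $[-2, 2]$, $\tilde{f}(t) = \tilde{a}|t| - 3\tilde{a}$, for $|t| \ge 4$, and $0 \le \tilde{f}' \le 1$ on $(0,\infty)$. We define $f_\nu(x) := \tilde{f}(|x - \nu|)$, $x \in \mathbb{R}^3$, so that $f_\nu \equiv 0$ on $B_2(\nu)$ and $f_\nu \ge \tilde{a}\,\mathrm{dist}(\cdot, B_2(\nu)) - \tilde{a}$ with equality outside $B_4(\nu)$. Moreover, $|\nabla f_\nu| \le \tilde{a}$. We further pick some non-decreasing $\theta \in C^\infty(\mathbb{R}, \mathbb{R})$ such that $\theta(t) = t$, for $t \le 1$, $\theta \equiv 2$ on $[3,\infty)$ and $\theta' \le 1$. We set $\theta_{\nu,y}(t) := (|\nu - y| + 1)\,\theta(t/(|\nu - y| + 1))$, $t \in \mathbb{R}$, and $f_{\nu,y} := \theta_{\nu,y}\circ f_\nu$, so that $f_{\nu,y}$ is bounded, $f_{\nu,y} = f_\nu \ge \tilde{a}\,\mathrm{dist}(\cdot, B_2(\nu)) - \tilde{a}$ on $B_1(y) \supset \mathrm{supp}(\varrho_y)$, and $|\nabla f_{\nu,y}| \le \tilde{a}$. By construction $e^{f_{\nu,y}}J_\nu = J_\nu$. Setting

$$\psi_{\nu,y}(z) := \big(J_\nu U_1 - e^{f_{\nu,y}}U_2\,e^{-f_{\nu,y}}J_\nu\big)e^F R_{A,V}(z)e^{-F}\psi$$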


we thus have (Jν U1 − U2 Jν )eF RA,V (z)e−F ψ = e−fν,y ψ ν,y (z). Observing that µχ = 0 implies efν,y U2 e−fν,y = µ[efν,y VE e−fν,y , χ] and employing (2.6) and (3.11) we further ﬁnd some constant C ∈ (0, ∞) such that, for all z = e0 + iη ∈ Γ, ν ∈ Z3 , and y ∈ R3 , a + ∇χ ∞ ψ ν,y (z) ≤ C ψ . 1 + η2

(3.53)

Taking these remarks into account we obtain |e−fν,y RAeν (¯ z )efν,y Wy1/2 (y e−fν,y )ζφ | ψ ν,y (z)||dz|. I1 ≤ Γ

ν∈G (χ)

ν = 0, whence (2.1), (2.12) and ν = ∇gν is a gradient we have curl A Now, since A Hardy’s inequality imply ν )ϕ ≤ C |D e |ϕ , Wy ϕ ≤ C ∇(eigν ϕ) = C (−i∇ + A Aν for ϕ ∈ D, with some ν- and y-independent constant C ∈ (0, ∞). Standard argu1/2 ments now imply that |DAeν |−1/2 Wy is a bounded operator whose norm is uniformly bounded in ν and y. Setting φν := |DAeν |−1/2 Wy1/2 (y e−fν,y )ζφ and applying (3.17) we thus ﬁnd

I1 ≤

Γ

ν∈G (χ)

·

Γ

RAeν (¯ z )e

φν (a + ∇χ ∞ )

|DAeν |

1/2 φν |dz| 2

R

dη 1 + η2

1/2 ψ

sup{e−fν,y (x) : x ∈ B1 (y)} φ (a + ∇χ ∞ ) ψ

ν∈Z3

1/2

ν∈G (χ)

≤ C

fν,y

1/2 2 ψν,y (z) |dz|

≤C

e

−fν,y

≤ C (a + ∇χ ∞ ) φ ψ .


For I2 , we obtain the estimate by means of (3.22), (3.11) and (3.53), |dz| |DAeν |−1/2 Wy1/2 φ |DAeν |1/2 RAeν (z) I2 ≤ Γ

ν∈G (χ)

· |∇(y ζ)|e−fν,y efν,y (RA,V (z) − RAeν (z))e−fν,y ψ ν,y (z) sup{e−fν,y (x) : x ∈ B1 (y)} φ ψ ≤ C( ∇χ ∞ + a) ν∈Z3

≤ C ( ∇χ ∞ + a) φ ψ . To derive a bound on I3 we employ the special properties of the gauge transformed vector potentials Aν . Namely, we make use of the bound y (x)e−fν,y (x) |Aν (x)| ≤ y (x)

|x − ν| (b1 e−(˜a−τ )|x−ν|+3˜a + |B(ν)|e−˜a|x−ν|+3˜a ), 2 (3.54)

for all ν ∈ Z3 and x ∈ R3 , which follows from (2.18). Since also |B(ν)| ≤ |B(y)| + |B(ν) − B(y)| and |x − y| ≤ 1, if y (x) = 0, we infer again from (2.18) that y e−fν,y |Aν | ∞ ν∈G (χ)

≤ C

ν∈G (χ)

e

−ˆ a|y−ν|

1 + min |B(y)|, sup |B(ν)|

,

(3.55)

ν∈G (χ)

for some suﬃciently small a ˆ > 0. Using these observations and the uniform boundfν,y −fν,y edness of ζe V e , which is implied by Hypothesis 2.2 and the choice of ζ, we ﬁnd some χ-, F -, and y-independent constant C ∈ (0, ∞) such that RAe(¯ z )Wy1/2 φ { y e−fν,y ∞ ζefν,y V e−fν,y I3 ≤ ν∈G (χ)

Γ

+ y e−fν,y |Aν | ∞ } efν,y RA,V (z)e−fν,y ψ ν,y (z) |dz| ≤ C (1 + min{|B(y)|, B ∞,χ })( ∇χ ∞ + a) φ ψ . This completes the proof of Lemma 3.10 and, at the same time, the proof of Proposition 3.2. (The last assertion of Proposition 3.2 follows by inspecting the arguments above.) Proof of Lemma 3.7. We use the notation introduced in the proofs of Proposition 3.2 and Lemma 3.10 in the following. We already know from Corollary 3.6 1/2 that the vector Λ+ A,V ψ belongs to D(Wy ), but we do not have any control on the norm on the left in (3.38) yet. It is certainly suﬃcient to derive the asserted bound

1/2

1/2

with Wy replaced by y Wy . As in the proof of Corollary 3.6, we ﬁrst pick some ζ ∈ C0∞ (R3 , [0, 1]) (independent of ψ) such that ζ ≡ 1 on some large open ball containing Y and set ζ = 1 − ζ. By the closed graph theorem |D0 |1/2 ζRA,V (i) is bounded whence 1/2 ζRA,V (i) (DA,V − i)ψ . y Wy1/2 ζΛ+ A,V ψ ≤ C |D0 |

We denote the characteristic function of the support of ψ by χ. To treat the remaining piece of the norm we set ψ ν := Λ+ A,V (DA,V − i)Jν ψ, and write analogously to (3.52), |Wy1/2 φ | y ζΛ+ A,V ψ| ≤ |Wy1/2 φ | y ζRA,V (i)ψ ν | ν∈G (χ)

≤

ν∈G (χ)

+

|Wy1/2 y ζφ | RAeν (i)ψ ν |

ν∈G (χ)

+

ν∈G (χ)

|Wy1/2 φ | RAeν (i)α · ∇(y ζ)(RA,V (i) − RAeν (i))ψ ν | |Wy1/2 φ | RAeν (i)y ζ(V + α · Aν )RA,V (i)ψ ν |

=: Q1 + Q2 + Q3 , where φ ∈ H 1/2 (R3 , C4 ). Again we use exponential weights constructed in the beginning of the proof of Lemma 3.10 and abbreviate −fν,y fν,y )e (DA,V − i)e−fν,y Jν ψ, ψν,y := (efν,y Λ+ A,V e

so that by Corollary 3.1, ˜) ψ ≤ C (DA,V − i)ψ , ψν,y ≤ C (DA,V − i)ψ + O( ∇Jν ∞ + a where C, C ∈ (0, ∞) neither depend on ν nor y. Writing also efν,y RAeν (i)e−fν,y = RAeν (i)(1 − iα · ∇fν,y efν,y RAeν (i)e−fν,y ) and using (3.11) we thus obtain Q1 ≤

ν∈G (χ)

|DAeν |−1/2 Wy1/2 y e−fν,y φ

· |DAeν |1/2 efν,y RAeν (i)e−fν,y ψν,y ≤ C φ (DA,V − i)ψ .


Using (3.55) we further ﬁnd |DAeν |−1/2 Wy1/2 φ |DAeν |1/2 RAeν (i) Q3 ≤ ν∈G (χ)

· { y e−fν,y ∞ ζefν,y V e−fν,y + y e−fν,y |Aν | ∞ } ψν,y ≤ C (1 + min{|B(y)|, B ∞,ψ }) φ (DA,V − i)ψ . The remaining term, $Q_2$, can be dealt with similarly.

4. Exponential Localization

In this section, we prove Theorem 2.1. To this end, we adapt an argument from [1] and some useful improvements of the latter from [19] to our non-local situation. In the proof below, we present the general strategy of the argument. In doing so, we refer to three technical lemmata whose proofs are postponed to the end of this section. The argument from [1] is advantageous here since it does not require any a priori knowledge on the spectrum of $H_N$. It rather gives the possibility to prove the exponential localization of spectral projections directly and to infer results on the nature of the spectrum from the localization estimate. In particular, the argument avoids the use of eigenvalue equations which are, for instance, exploited in Agmon type estimates. Throughout this section we always assume that the assumptions of Theorem 2.1 are fulfilled.

Proof of Theorem 2.1. Since $H_N$ is bounded from below we may suppose that $\inf I>-\infty$. By assumption we have $\sup I<E^A_{N-1}+1$. Moreover, we consider $H_N$ as an operator on the unprojected $N$-particle space $\mathscr{H}_N$. In this case we have to keep in mind that 0 becomes an infinitely degenerate eigenvalue of $H_N$. Our goal is to show that there are $b,C\in(0,\infty)$ such that $\int_{\mathbb{R}^{3N}}e^{2b|X|}|\Phi(X)|^2\,dX\le C$, for all normalized $\Phi\in\mathrm{Ran}(E_I(H_N))$ such that $\Phi=\mathcal{A}_N\Phi$ and $\Phi=\Lambda^{+,N}_{A,V}\Phi$. Borrowing an idea from [38] we simplify the problem by using the bounds
$$e^{2b|X|}\le\max_{j=1,\dots,N}e^{2b\sqrt{N}|x_j|}\le\sum_{j=1}^{N}e^{2b\sqrt{N}|x_j|},\qquad X=(x_1,\dots,x_N)\in(\mathbb{R}^3)^N,$$
and the anti-symmetry of $\Phi=\mathcal{A}_N\Phi$. (We are not aiming to derive good estimates on the decay rate here.) Indeed, it suffices to show that there exist $a,C\in(0,\infty)$ such that
$$\int_{\mathbb{R}^{3N}}e^{2a|x_N|}|\Phi(X)|^2\,dX\le C, \tag{4.1}$$
for all $\Phi=\mathcal{A}_N\Phi=\Lambda^{+,N}_{A,V}\Phi\in\mathrm{Ran}(E_I(H_N))$, $\|\Phi\|=1$. Then Theorem 2.1 holds true with $b=a/\sqrt{N}$. Furthermore, it suffices to show that (4.1) holds true with $a|x_N|$ replaced


by $F(x_N)$, for every (bounded) $F:\mathbb{R}^3\to\mathbb{R}$ satisfying (3.24). This is in fact an obvious consequence of the monotone convergence theorem applied to the integrals $\int_{\mathbb{R}^{3N}}e^{2F_n(x_N)}|\Phi(X)|^2\,dX$ with $\Phi$ as above, where $F_1,F_2,\dots$ is a suitable increasing sequence of functions satisfying (3.24) and converging to $a|x_N|$. Therefore, it suffices to find some $a>0$ such that
$$\|(\mathcal{A}_{N-1}\otimes e^F)E_I(H_N)(\mathcal{A}_{N-1}\otimes1)\Lambda^{+,N}_{A,V}\|<\infty, \tag{4.2}$$
for every $F$ satisfying (3.24), where $\mathcal{A}_{N-1}$ denotes anti-symmetrization of the first $N-1$ variables and $e^F$ acts only on the $N$th electron variable.

We start by introducing a comparison operator. To this end we pick some $\chi\in C^\infty(\mathbb{R}^3,[0,1])$ such that $\chi\equiv1$ outside $B_2(0)$ and $\chi\equiv0$ on $B_1(0)$ and set $\chi_R:=\chi(\cdot/R)$ and $\bar\chi_R:=1-\chi_R$, for $R\ge1$. Furthermore, we define orthogonal projections
$$P_{N-1}:=\mathcal{A}_{N-1}\Lambda^{+,N-1}_{A,V},\qquad P_{N-1}^\perp=1_{\mathscr{H}_{N-1}}-P_{N-1},$$
$$Q_N:=(\mathcal{A}_{N-1}\otimes1)\Lambda^{+,N}_{A,V},\qquad Q_N^\perp=1_{\mathscr{H}_N}-Q_N.$$
Then the comparison operator is defined, a priori on the domain $\mathscr{D}_N\subset\mathscr{H}_N$, by
$$\begin{aligned}
\widetilde H_N&:=Q_NH_NQ_N+H^A_{N-1}\otimes\Lambda^-_{A,V}+E^A_{N-1}P_{N-1}^\perp\otimes1+P_{N-1}\otimes\Lambda^+_{A,V}(1-E_1^A)\bar\chi_R\Lambda^+_{A,V}+Q_N^\perp\\
&=H^A_{N-1}\otimes1+E^A_{N-1}P_{N-1}^\perp\otimes1 &&(4.3)\\
&\quad+P_{N-1}\otimes\Lambda^+_{A,V}\{D_{A,V_C}+(1-E_1^A)\bar\chi_R\}\Lambda^+_{A,V}+Q_N^\perp &&(4.4)\\
&\quad+\sum_{i=1}^{N-1}Q_NW_{iN}Q_N. &&(4.5)
\end{aligned}$$
We denote the Friedrichs extension of $\widetilde H_N$ again by the same symbol. (The idea to introduce an additional cut-off function $\bar\chi_R$ in (4.4) to compensate for the Coulomb singularity in the last variable $x_N$ is borrowed from [19]; together with the other additional terms in (4.3) and (4.4) it ensures that the spectrum of $\widetilde H_N$ stays away from the interval $I$.) Notice that on $\mathscr{D}_N$ we have
$$H^A_{N-1}\otimes1+E^A_{N-1}P_{N-1}^\perp\otimes1\ge E^A_{N-1}1_{\mathscr{H}_N}.$$
Furthermore, Lemma 4.3 below implies that
$$P_{N-1}\otimes\Lambda^+_{A,V}\{D_{A,V_C}+(1-E_1^A)\bar\chi_R\}\Lambda^+_{A,V}+Q_N^\perp\ge1-o(1)P_{N-1}\otimes1,$$
as $R$ tends to infinity. We now pick some $\varepsilon>0$ with $\sup I<E^A_{N-1}+1-\varepsilon$. Then the above remarks imply
$$\langle\psi|\widetilde H_N\psi\rangle\ge(E^A_{N-1}+1-\varepsilon/2)\|\psi\|^2,\qquad\psi\in\mathscr{D}_N, \tag{4.6}$$
for all sufficiently large $R\ge1$. Next, we define
$$H_N':=Q_NH_NQ_N+H^A_{N-1}\otimes\Lambda^-_{A,V}.$$
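The way the two operator inequalities above combine into the lower bound on the comparison operator can be spelled out as follows. This is a sketch, not the authors' wording: $\widetilde H_N$ is used here as a name for the comparison operator in (4.3)–(4.5), $c_R$ is the constant of Lemma 4.3, and the steps use $W_{iN}\ge0$ and $0\le P_{N-1}\otimes1\le1$.

```latex
% Sketch: lower bound on the comparison operator, term by term on \mathscr{D}_N.
% Notation (assumed for illustration): \widetilde H_N = sum of the lines (4.3), (4.4), (4.5).
\begin{align*}
  \text{(4.3):}\quad
    & H^A_{N-1}\otimes 1 + E^A_{N-1}P^\perp_{N-1}\otimes 1 \;\ge\; E^A_{N-1}, \\
  \text{(4.4):}\quad
    & P_{N-1}\otimes\Lambda^+_{A,V}\{D_{A,V_C}+(1-E_1^A)\bar\chi_R\}\Lambda^+_{A,V} + Q_N^\perp
      \;\ge\; 1 - c_R\,P_{N-1}\otimes 1 \;\ge\; 1 - c_R, \\
  \text{(4.5):}\quad
    & \sum_{i=1}^{N-1} Q_N W_{iN} Q_N \;\ge\; 0 .
\end{align*}
% Adding the three lines and choosing R so large that c_R \le \varepsilon/2:
\[
  \langle \psi \,|\, \widetilde H_N \psi\rangle
  \;\ge\; \bigl(E^A_{N-1} + 1 - c_R\bigr)\|\psi\|^2
  \;\ge\; \bigl(E^A_{N-1} + 1 - \varepsilon/2\bigr)\|\psi\|^2 ,
  \qquad \psi\in\mathscr{D}_N .
\]
```

Since $\sup I<E^A_{N-1}+1-\varepsilon$, this leaves a gap of at least $\varepsilon/2$ between $I$ and the spectrum of the comparison operator.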


Then $\widetilde H_N$ and $H_N'$ have the same domain since they differ by a bounded operator on their common form core $\mathscr{D}_N$. We further pick some $\tilde\chi_I\in C_0^\infty(\mathbb{R},[0,1])$, such that $\tilde\chi_I\equiv1$ on $I$ and $\mathrm{supp}(\tilde\chi_I)\subset(-\infty,E^A_{N-1}+1-\varepsilon)$. Then $\tilde\chi_I(\widetilde H_N)=0$ by (4.6). As in [1] we now observe that
$$Q_NE_I(H_N)Q_N=Q_NE_I(H_N')Q_N=(\tilde\chi_I(H_N')-\tilde\chi_I(\widetilde H_N))Q_N. \tag{4.7}$$

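We preserve the symbol $\tilde\chi_I$ to denote an almost analytic extension of $\tilde\chi_I$ (see, e.g., [15]) to a smooth, compactly supported function on the complex plane satisfying
$$\mathrm{supp}(\tilde\chi_I)\subset(-\infty,E^A_{N-1}+1-\varepsilon)+i(-\delta,\delta),\qquad\partial_{\bar z}\tilde\chi_I(z)=O_N(|\Im z|^N),\quad N\in\mathbb{N}, \tag{4.8}$$
where $\partial_{\bar z}=\frac12(\partial_{\Re z}+i\partial_{\Im z})$. Here we may choose $\delta>0$ as small as we please. We shall apply the Helffer–Sjöstrand formula (see, e.g., [15]),
$$\tilde\chi_I(T)=\int_{\mathbb{C}}(T-z)^{-1}\,d\tilde\chi_I(z),\qquad d\tilde\chi_I(z):=\frac{i}{2\pi}\,\partial_{\bar z}\tilde\chi_I(z)\,dz\wedge d\bar z,$$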

which holds for every self-adjoint operator $T$ on some Hilbert space. By means of (4.7), we then find the representation
$$Q_NE_I(H_N)Q_N=\int_{\mathbb{C}}[(H_N'-z)^{-1}-(\widetilde H_N-z)^{-1}]\,d\tilde\chi_I(z)\,Q_N. \tag{4.9}$$
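For orientation, the measure in the Helffer–Sjöstrand formula can be unpacked in real coordinates. The sketch below is an illustration only: the coordinates $z=u+iv$, the constant $C_1$ (the case $N=1$ of (4.8)), and the notation $\tilde\chi_I$ for the almost analytic extension are names introduced here.

```latex
% Sketch: the measure d\tilde\chi_I in real coordinates z = u + iv (u, v are illustrative names).
\[
  dz\wedge d\bar z = (du + i\,dv)\wedge(du - i\,dv) = -2i\,du\wedge dv
  \quad\Longrightarrow\quad
  \frac{i}{2\pi}\,\partial_{\bar z}\tilde\chi_I(z)\,dz\wedge d\bar z
  = \frac{1}{\pi}\,\partial_{\bar z}\tilde\chi_I(u+iv)\,du\,dv .
\]
% Taking N = 1 in (4.8), |\partial_{\bar z}\tilde\chi_I(z)| \le C_1 |\Im z| on the compact support, hence
\[
  \int_{\mathbb{C}} \frac{|d\tilde\chi_I(z)|}{|\Im z|}
  \le \frac{C_1}{\pi}\,\bigl|\operatorname{supp}(\tilde\chi_I)\bigr| < \infty ,
\]
% where |supp(.)| is the Lebesgue measure of the support. For self-adjoint T,
\[
  \|(T - z)^{-1}\| \le \frac{1}{|\Im z|}, \qquad z\in\mathbb{C}\setminus\mathbb{R},
\]
% so the operator-valued integral in the Helffer--Sjostrand formula converges in norm.
```

This is the reason why it is enough to control weighted resolvents with a bound proportional to $1/|\Im z|$, or better, in the estimates that follow.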

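For some $F$ as in (3.24) (which acts only on the last variable in what follows), we abbreviate
$$\Lambda^{F,N}_{A,V}:=e^F\Lambda^{+,N}_{A,V}e^{-F}.$$
Then (4.9) and the second resolvent identity together with the trivial identities $Q_N^\perp Q_N=0=(P_{N-1}^\perp\otimes1)Q_N$ yield
$$\begin{aligned}
&\|(\mathcal{A}_{N-1}\otimes e^F)E_I(H_N)(\mathcal{A}_{N-1}\otimes1)\Lambda^{+,N}_{A,V}\|\\
&\quad\le\int_{\mathbb{C}}\|e^F(\widetilde H_N-z)^{-1}P_{N-1}\otimes\{\Lambda^+_{A,V}(1-E_1^A)\bar\chi_R\Lambda^+_{A,V}\}\|\,\|Q_N(H_N'-z)^{-1}Q_N\|\,|d\tilde\chi_I(z)|\\
&\quad\le(1-E_1^A)\int_{\mathbb{C}}\|e^F(\widetilde H_N-z)^{-1}e^{-F}\|\,\|\Lambda^{F,N}_{A,V}\|\,\|e^F\bar\chi_R\|\,\frac{|d\tilde\chi_I(z)|}{|\Im z|}\\
&\quad\le C_{a,R}\int_{\mathbb{C}}\|e^F(\widetilde H_N-z)^{-1}e^{-F}\|\,\frac{|d\tilde\chi_I(z)|}{|\Im z|}.
\end{aligned}\tag{4.10}$$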


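In the last step, we apply Proposition 3.1 and $\|e^F\bar\chi_R\|\le e^{2aR}$. By (4.8) $|d\tilde\chi_I(z)|/|\Im z|$ is a finite measure. To conclude the proof of Theorem 2.1 it thus remains to show that the norm of $e^F(\widetilde H_N-z)^{-1}e^{-F}$ is uniformly bounded in all $z\in\mathrm{supp}(\tilde\chi_I)\setminus\mathbb{R}$ and $F$ satisfying (3.24). This is done in the rest of this proof.

Since $F$ satisfies (3.24) we know that $1_{N-1}\otimes e^F$ is an isomorphism on $\mathscr{H}_N$. Therefore, the densely defined operators $e^F\widetilde H_Ne^{-F}$ and $\widetilde H_N$ have the same resolvent set and
$$R_F(z):=e^F(\widetilde H_N-z)^{-1}e^{-F}=(e^F\widetilde H_Ne^{-F}-z)^{-1},\qquad z\in\varrho(\widetilde H_N). \tag{4.11}$$
In particular, $e^F\widetilde H_Ne^{-F}$ is closed because its resolvent set is not empty. Using the identity $(R_F(z)^{-1})^*=(R_F(z)^*)^{-1}$ we readily verify that $(e^F\widetilde H_Ne^{-F})^*=e^{-F}\widetilde H_Ne^F$. Since $e^{\pm F}$ maps $\mathscr{D}_N$ into itself we further have
$$\mathscr{D}_N\subset\mathcal{D}(e^{\pm F}\widetilde H_Ne^{\mp F})=e^{\pm F}\mathcal{D}(\widetilde H_N)\subset e^{\pm F}\mathcal{Q}(\widetilde H_N). \tag{4.12}$$
The following two lemmata, whose proofs are postponed to the end of this section, show that $e^F\widetilde H_Ne^{-F}$ is a small form perturbation of $\widetilde H_N$. We define $T:\mathscr{D}_N\to\mathscr{H}_N$ by
$$T\varphi:=e^F\widetilde H_Ne^{-F}\varphi-\widetilde H_N\varphi,\qquad\varphi\in\mathscr{D}_N. \tag{4.13}$$

Lemma 4.1. Assume that $F:\mathbb{R}^3\to\mathbb{R}$ satisfies (3.24). Then we have, as $a>0$ tends to zero,
$$|\langle\varphi|T\varphi\rangle|\le a\langle\varphi|\widetilde H_N\varphi\rangle+O(a)\langle\varphi|\varphi\rangle,\qquad\varphi\in\mathscr{D}_N. \tag{4.14}$$

Lemma 4.2. There exist constants $c_1,c_2\in(0,\infty)$ such that, for all $F:\mathbb{R}^3\to\mathbb{R}$ satisfying (3.24) and all $\varphi\in\mathscr{D}_N$,
$$|\langle e^{\pm F}\varphi|\widetilde H_Ne^{\pm F}\varphi\rangle|\le c_1\|e^{\pm F}\|^2\langle\varphi|\widetilde H_N\varphi\rangle+c_2\|e^{\pm F}\|^2\|\varphi\|^2. \tag{4.15}$$
In particular, $e^{\pm F}\mathcal{Q}(\widetilde H_N)\subset\mathcal{Q}(\widetilde H_N)$.

If $a<1/2$ then Lemma 4.1 implies that $(e^F\widetilde H_Ne^{-F})\restriction_{\mathscr{D}_N}$ has a distinguished, sectorial, closed extension, $\widetilde H_N^F$, that is the only closed extension having the properties $\mathcal{D}(\widetilde H_N^F)\subset\mathcal{Q}(\widetilde H_N)$, $\mathcal{D}((\widetilde H_N^F)^*)\subset\mathcal{Q}(\widetilde H_N)$, and $i\eta\in\varrho(\widetilde H_N^F)$, for all $\eta\in\mathbb{R}$ with sufficiently large absolute value; see [31]. Thanks to (4.11), (4.12) and Lemma 4.2, we know that $e^F\widetilde H_Ne^{-F}$ is a closed extension enjoying these properties, whence
$$\widetilde H_N^F=e^F\widetilde H_Ne^{-F}.$$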


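We are now prepared to derive a uniform bound on the norm under the integral sign in (4.10). For $z\in\mathrm{supp}(\tilde\chi_I)$ and $\varphi\in\mathscr{D}_N$, we obtain
$$\Re\langle\varphi|(\widetilde H_N^F-z)\varphi\rangle=\langle\varphi|(\widetilde H_N-\Re z)\varphi\rangle+\Re\langle\varphi|T\varphi\rangle\ge(1-a)\Big\langle\varphi\,\Big|\,\Big(\widetilde H_N-\frac{\Re z}{1-a}\Big)\varphi\Big\rangle-O(a)\|\varphi\|^2. \tag{4.16}$$
By (4.6) and (4.8), we thus find $a\in(0,1/2)$ and $R\in[1,\infty)$ such that, for all $z\in\mathrm{supp}(\tilde\chi_I)$ and $\varphi\in\mathscr{D}_N$,
$$\Re\langle\varphi|(\widetilde H_N^F-z)\varphi\rangle\ge\frac{\varepsilon}{4}\|\varphi\|^2.$$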
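The choice of $a$ can be made explicit by a short computation. The sketch below is illustrative only: $E$ abbreviates $E^A_{N-1}$, $C$ denotes the constant hidden in the $O(a)$ term of (4.16), and it is assumed that $R$ is fixed first (so that (4.6) holds) and that $C$ may then be chosen independently of $a$.

```latex
% Sketch: from (4.16), (4.6) and (4.8) to the \varepsilon/4 bound.
% Shorthand (assumed): E := E^A_{N-1}; C := constant in the O(a) term; \|\varphi\| = 1.
\begin{align*}
  \Re\langle\varphi\,|\,(\widetilde H_N^F - z)\varphi\rangle
    &\ge (1-a)\,\langle\varphi\,|\,\widetilde H_N\varphi\rangle - \Re z - C a
       && \text{by (4.16)} \\
    &\ge (1-a)\,(E + 1 - \varepsilon/2) - (E + 1 - \varepsilon) - C a
       && \text{by (4.6) and } \Re z < E + 1 - \varepsilon \\
    &= \frac{\varepsilon}{2} - a\,\bigl(E + 1 - \varepsilon/2 + C\bigr) .
\end{align*}
% Hence the right-hand side is at least \varepsilon/4 as soon as
\[
  a\,\bigl(|E + 1 - \varepsilon/2| + C\bigr) \le \frac{\varepsilon}{4} .
\]
```

The bound on $\Re z$ used in the second line is the support condition on the almost analytic extension in (4.8).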

This inequality implies that, for $z\in\mathrm{supp}(\tilde\chi_I)$, the numerical range of $\widetilde H_N^F-z$ is contained in the half space $\{\zeta\in\mathbb{C}:\Re\zeta\ge\varepsilon/4\}$ [31, Theorem VI.1.18 and Corollary VI.2.3]. Moreover, by (4.11) the deficiency of $\widetilde H_N^F-z$ is zero, for all $z\in\mathbb{C}\setminus\mathbb{R}$, and we may hence estimate the norm of $(\widetilde H_N^F-z)^{-1}$ by the inverse distance of $z$ to the numerical range of $\widetilde H_N^F$ [31, Theorem V.3.2]. We thus arrive at
$$\|(\widetilde H_N^F-z)^{-1}\|\le\frac{4}{\varepsilon},\qquad z\in\mathrm{supp}(\tilde\chi_I),$$
which together with (4.10) proves Theorem 2.1.

Lemma 4.3. For every sufficiently large $R\ge1$, there is some $c_R\in(0,\infty)$ such that $c_R\to0$, as $R\to\infty$, and, for all $\varphi\in\mathscr{D}$,
$$\langle\varphi|\Lambda^+_{A,V}[D_{A,V_C}+(1-E_1^A)\bar\chi_R]\Lambda^+_{A,V}\varphi\rangle\ge\|\Lambda^+_{A,V}\varphi\|^2-c_R\|\varphi\|^2. \tag{4.17}$$

Proof. To begin with we introduce a scaled partition of unity. Namely, we pick some $\tilde\mu\in C_0^\infty(\mathbb{R}^3,[0,1])$ such that $\tilde\mu\equiv1$ on $B_2(0)$ and observe that $\theta:=\tilde\mu^2+(1-\tilde\mu)^2$ is strictly positive. We further set, for $R\ge1$ and $x\in\mathbb{R}^3$,
$$\mu_1(x)\equiv\mu_{R,1}(x):=\tilde\mu(x/R)/\theta^{1/2}(x/R),\qquad\mu_2(x)\equiv\mu_{R,2}(x):=(1-\tilde\mu(x/R))/\theta^{1/2}(x/R),$$
so that $\mu_1^2+\mu_2^2=1$. Since $\mu_1\nabla\mu_1+\mu_2\nabla\mu_2=\nabla(\mu_1^2+\mu_2^2)/2=0$ it follows that, for $\varphi\in\mathscr{D}$,
$$\langle\varphi|\Lambda^+_{A,V}[D_{A,V_C}+(1-E_1^A)\bar\chi_R]\Lambda^+_{A,V}\varphi\rangle=\sum_{j=1,2}\langle\varphi|\Lambda^+_{A,V}[\mu_jD_{A,V_C}\mu_j+(1-E_1^A)\mu_j^2\bar\chi_R]\Lambda^+_{A,V}\varphi\rangle=:\sum_{j=1,2}Y_j. \tag{4.18}$$
To treat the summand with $j=1$ we use that, by construction, $\mu_1\bar\chi_R=\mu_1$, for every $R\ge1$. Taking also Corollary 3.2 and (2.16) into account we find, for all $R\ge1$


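and $\varphi\in\mathscr{D}$,
$$\begin{aligned}
Y_1&\ge(1-1/R)\langle\mu_1\varphi|\Lambda^+_{A,V}[D_{A,V_C}-E_1^A]\Lambda^+_{A,V}\mu_1\varphi\rangle+\|\mu_1\Lambda^+_{A,V}\varphi\|^2-O(1/R)\|\varphi\|^2\\
&\ge\|\mu_1\Lambda^+_{A,V}\varphi\|^2-O(1/R)\|\varphi\|^2.
\end{aligned}\tag{4.19}$$
We next turn to the summand with $j=2$ in (4.18) where $\mu_2^2\bar\chi_R\ge0$. Applying successively Corollaries 3.2 and 3.5, Proposition 3.1, and Lemma 3.8 we deduce that, for all $\varphi\in\mathscr{D}$ and every $\varepsilon>0$,
$$\begin{aligned}
\langle\varphi|\Lambda^+_{A,V}\mu_2D_{A,V_C}\mu_2\Lambda^+_{A,V}\varphi\rangle&\ge(1-\varepsilon)\langle\varphi|\mu_2\Lambda^+_AD_A\Lambda^+_A\mu_2\varphi\rangle-o_\varepsilon(1)\|\varphi\|^2\\
&\ge(1-\varepsilon)\|\Lambda^+_A\mu_2\varphi\|^2-o_\varepsilon(1)\|\varphi\|^2\\
&\ge(1-\varepsilon)^2\|\mu_2\Lambda^+_A\varphi\|^2-o_\varepsilon(1)\|\varphi\|^2\\
&\ge(1-\varepsilon)^3\|\mu_2\Lambda^+_{A,V}\varphi\|^2-o_\varepsilon(1)\|\varphi\|^2,
\end{aligned}\tag{4.20}$$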

as R → ∞. We conclude by combining (4.18)–(4.20) and using µ21 + µ22 = 1. N e−F − H N Proof of Lemma 4.1. We have to study the contribution to T = eF H F coming from each term in (4.3)–(4.5). The terms in (4.3) commute with e and hence give no contribution. In order to estimate the contribution coming from the left term in (4.4) we ﬁrst observe that Corollary 3.1 and (3.32) imply the following identities on D, + −F F = ΛF eF Λ+ A,V (DA,VC + iα · ∇F )ΛA,V A,V DA,VC ΛA,V e F = ΛF A,V DA,VC ΛA,V + O(a) + = (1 + a)Λ+ A,V DA,VC ΛA,V + O(a).

The term in (4.4) involving the cut-oﬀ function χR yields a contribution of order O(a), too, due to Corollary 3.2 (with L = (1 − E1A )χR and χ = 1). To account for + the projection on the right in (4.4) we write Q⊥ N = 1HN − PN −1 ⊗ ΛA,V and use F ⊥ −F ⊥ Proposition 3.1 to obtain e QN e −QN = O(a). Finally, we apply Corollary 3.4 to all terms in (4.5) — this is the only place in this section where we use the assumption that B is bounded — and arrive at ! N −1 (N ) WiN QN ϕ + O(a) ϕ 2 . |ϕ | T ϕ| ≤ a ϕ QN DA,VC + i=1

A A Since HN −1 ≥ EN −1 this completes the proof of Lemma 4.1.

Proof of Lemma 4.2. We drop the ±-signs in (4.15) since they do not play any role in this proof. It is clear that we only have to comment on those terms in (4.3)–(4.5) that involve unbounded operators. Since $H_{N-1}\otimes1$ commutes with $e^F$


and since HN −1 ≥ ENA−1 we ﬁrst ﬁnd, for ϕ ∈ DN , eF ϕ | (HN −1 ⊗ 1 − ENA−1 PN⊥−1 ⊗ 1)eF ϕ ≤ e2F ϕ | (HN −1 ⊗ 1 − ENA−1 PN⊥−1 ⊗ 1)ϕ.

(4.21)

By virtue of Proposition 3.2 we can estimate |ϕ | eF QN WiN QN eF ϕ|, for ϕ ∈ DN , as 1/2

1/2

1/2

WiN QN eF ϕ 2 ≤ 2 eF WiN QN ϕ 2 + 2 WiN [eF , QN ]e−F eF ϕ 2 1/2

≤ 2 eF 2 WiN QN ϕ 2 + O(a2 ) eF 2 ϕ 2 .

(4.22)

(If B is unbounded, then the O-symbol in (4.22) depends on the supremum of |B| on supp(∇F ).) It remains to prove that there are constants c3 , c4 ∈ (0, ∞) such that + F ϕ | eF Λ+ A,V DA,VC ΛA,V e ϕ + F 2 2 ≤ c3 eF 2 ϕ | Λ+ A,V DA,VC ΛA,V ϕ + c4 e ϕ ,

(4.23)

for ϕ ∈ D. Moreover, since VH and VE are bounded it suﬃces to prove this estimate := DA,V − e0 , which is positive on the range of Λ+ . with DA,VC replaced by D A,V We abbreviate Λ± := Λ± A,V in the rest of this proof and seek for bounds on both terms on the right side of + )1/2 eF Λ ϕ 2 . + )1/2 eF ϕ 2 ≤ 2 (DΛ (4.24) (DΛ =±

+ )1/2 [eF , Λ− ]ϕ and is not greater than Here the norm with = − equals (DΛ some O(a) eF ϕ due to Proposition 3.1. We next deﬁne + + 1 ≥ 1. ˆ := Λ+ (DA,V − e0 )Λ+ + 1 = Λ+ DΛ D ˆ −1/2 ≤ 1 and + )1/2 D In fact, because of (DΛ ˆ 1/2 Λ+ ϕ 2 + [D ˆ 1/2 eF Λ+ ϕ 2 ≤ eF D ˆ 1/2 , eF ]Λ+ ϕ 2 D ˆ 1/2 ϕ 2 + D ˆ 1/2 ϕ 2 ˆ 1/2 [D ˆ −1/2 , eF ]Λ+ D ≤ eF 2 Λ+ D we shall see that (4.23) holds true as soon as we have shown that ˆ 1/2 [D ˆ −1/2 , eF ]Λ+ = O(a) eF . D

(4.25)

To check whether (4.25) is correct we ﬁrst note that, on D, ˆ eF ] = Λ+ [D, eF ] + [Λ+ , eF ]D [D, = −Λ+ iα · ∇F eF + ([VE , eF ]e−F )eF + [Λ+ , eF ]D.

(4.26)


We apply the norm-convergent integral representation 1 ∞ 1 dt √ , T −1/2 = π 0 T +t t

(4.27)

which holds for any strictly positive operator, T , on some Hilbert space. For φ, ψ ∈ D, it implies dt 1 ∞ ˆ 1/2 −1 ˆ F Λ+ 1/2 −1/2 F + ˆ ˆ D φ [D, e ] ψ √ . (4.28) , e ]Λ ψ = D φ | [D ˆ ˆ π 0 t D+t D+t We estimate the contribution of the ﬁrst term on the right side of (4.26) to (4.28) as D ˆ 1/2 φ

+ O(a) eF Λ+ F Λ ≤ iα · ∇F e ψ φ ψ , D ˆ +t ˆ +t (1 + t)3/2 D

t ≥ 0.

(4.29)

In view of (2.7) the second term in (4.26) can be dealt with similarly. To account for the third term in (4.26) we apply Proposition 3.1 and obtain, for t ≥ 0, ! 1 O(a) eF D 1/2 + F −F F + ˆ [Λ , e ]e }e {D Λ ψ ≤ φ ψ . (4.30) φ ˆ +t ˆ +t D 1+t D Equations (4.28)–(4.30) show that (4.25) holds true, which completes the proof of Lemma 4.2. A 5. The Lower Bound on inf σess (HN )

In order to prove the “hard part” of the HVZ theorem, Theorem 2.2(ii), we employ an idea we learned from [19]: One may use a localization estimate for spectral projections to prove their compactness. Of course one might try to derive a lower bound on the ionization threshold by a more direct argument, for instance, by following the general strategy presented in [33]. Since we have already derived an exponential localization estimate we find it, however, more convenient here to adapt the observation from [19] to our non-local model. Another advantage of the proof below is that we can work with the square root of $H_N$. This is important since only form bounds on perturbations of $H_N$ are available.

Theorem 5.1. Let the assumptions of Theorem 2.2(ii) be fulfilled and let $I\subset\mathbb{R}$ be an interval with $\sup I<1+E^A_{N-1}$. Then the spectral projection $E_I(H_N^A)$ is a compact operator on $\mathcal{A}_N\mathscr{H}_N^+$. In particular,
$$\sigma_{\mathrm{ess}}(H_N^A)\subset[1+E^A_{N-1},\infty).$$

Proof. Let $g\in C(\mathbb{R},(0,\infty))$ satisfy $g(r)\to\infty$, $r\to\infty$, and $g(|X|)E_I(H_N^A)\in\mathscr{L}(\mathcal{A}_N\mathscr{H}_N)$ and set $h:=1/g$. We let R denote a smoothed characteristic function


of the closed ball in R3 with radius R > 0 and center 0 and set χR (X) := R (x1 ) · · · R (xN ), for X = (x1 , . . . , xN ) ∈ (R3 )N . First, we argue that it suﬃces to A )χR (−∆+1)1/8 is a (densely deﬁned) bounded operator from HN show that EI (HN + to AN HN . In fact, let us assume that this is the case. Since (−∆ + 1)−1/8 h(|X|) A ) is bounded, it then follows that is compact and g(|X|)EI (HN A A EI (HN )[χR h(|X|)]g(|X|)EI (HN ) A A = EI (HN )χR (−∆ + 1)1/8 [(−∆ + 1)−1/8 h(|X|)]g(|X|)EI (HN )

is compact. Since χR h(|X|) converges to h(|X|) in the operator norm, as R A A A ) = EI (HN )h(|X|)g(|X|)EI (HN ) tends to inﬁnity, it further follows that EI (HN is compact, too.

(j) A )χR (−∆ + 1)1/8 is bounded we set S := 1 + N To verify that EI (HN j=1 |DA,V | and write, for some suﬃciently large c > 0, A )χR (−∆ + 1)1/8 EI (HN A 1/2 = EI (HN )(HN + c)1/2 {(HN + c)−1/2 Λ+,N }{S −1/2 χR (−∆ + 1)1/8 }. A,V S (5.1)

Here the left curly bracket in (5.1) is a bounded operator from HN to HN+ since +,N Λ+,N A,V SΛA,V ≤ HN + c, provided c is large enough, due to the positivity of the interaction potentials and the boundedness of VH and VE . To see that the right curly bracket in (5.1) is a bounded operator in HN we ﬁrst notice that it is a restriction of S −1/2 T ∗ , where T := (−∆ + 1)1/8 χR is closed. It thus remains to show that T S −1/2 = T ∗∗ S −1/2 = (S −1/2 T ∗ )∗ belongs to L (HN ). To this end (i) (i) we recall that (−∆(i) + 1)1/4 R (|DA,V | + 1)−1 is bounded on L2 (R3i , C4 ) since 1/2

D(DA,V ) ⊂ Hloc (R3 , C4 ). It follows that (i)

(i)

(−∆(i) + 1)1/4 χR S −1 = (−∆(i) + 1)1/4 χR (|DA,V | + 1)−1 (|DA,V | + 1)S −1 is bounded, for i = 1, . . . , N , and, hence, χR (−∆ + 1)1/4 χR S −1 ∈ L (HN ). Since χR (−∆ + 1)1/4 χR is a restriction of T ∗ T we see that T ∗ T S −1 ∈ L (HN ), which implies |T |S −1/2 ∈ L (HN ) and, hence, T S −1/2 ∈ L (HN ). 6. Weyl Sequences In this section, we prove the “easy part” of our HVZ theorem, namely Part (i) of Theorem 2.2 asserting that A ) ⊃ [ENA−1 + 1, ∞). σess (HN A This is done by constructing suitable Weyl sequences for HN . The diﬃculties we encounter are similar to those in [40] where the Brown–Ravenhall model (free picture without magnetic ﬁeld) is considered. We have, however, to replace those arguments in [40] that require explicit momentum or position space representations


of the free projection $\Lambda^+_0$ by more abstract ones; see, e.g., Lemma 6.2. Another new technical complication is caused by the related facts that $\Lambda^+_{A,V}$ maps the dense subspace $\mathscr{D}$ merely into $H^{1/2}$ when $V$ has a strong Coulombic singularity and that, compared to the free picture, it is more difficult to control the singularities of the interaction potentials. For this reason we shall eventually study the square root of $H_N$ rather than $H_N$ itself.

We fix some spectral parameter $\lambda\ge1$ throughout the whole section and $\{\psi_n\}_{n\in\mathbb{N}}$ will always denote a corresponding Weyl sequence as in Hypothesis 2.4(i). In this and the following section we shall repeatedly employ the following sequence of cut-off functions: We pick some $\chi\in C^\infty(\mathbb{R},[0,1])$ such that $\chi\equiv0$ on $(-\infty,1-\varepsilon/4]$ and $\chi\equiv1$ on $[1,\infty)$. Here $\varepsilon\in(0,1)$ is a fixed parameter whose value becomes important only in Sec. 7. We set $\chi_n(x):=\chi(|x|/R_n)$, for $x\in\mathbb{R}^3$ and $n\in\mathbb{N}$, where $R_n$ is given by Hypothesis 2.4(i). Then $\chi_n\psi_n=\psi_n$ and $\|\nabla\chi_n\|_\infty=R_n^{-1}\|\nabla\chi\|_\infty\to0$, as $n\to\infty$. To begin with we draw two simple conclusions from our hypotheses:

Lemma 6.1. Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and $V$ fulfill Hypotheses 2.4(i) and 2.2, respectively. Then
$$\lim_{n\to\infty}\|(D_{A,V}-\lambda)\psi_n\|=0,\qquad\lim_{n\to\infty}\|(D_{A,V_C}-\lambda)\Lambda^+_{A,V}\psi_n\|=0. \tag{6.1}$$

Proof. The first identity is clear from the hypotheses. To treat the second we employ the cut-off functions defined in the paragraph preceding the statement of this lemma and abbreviate $V_{HE}:=V_H+V_E$. By means of Proposition 3.1 and $\|V_{HE}\chi_n\|\to0$ we then obtain
$$\|V_{HE}\Lambda^+_{A,V}\psi_n\|\le\|V_{HE}\chi_n\Lambda^+_{A,V}\psi_n\|+\|V_{HE}[\Lambda^+_{A,V},\chi_n]\psi_n\|\to0,$$
as $n$ tends to infinity. Therefore, the second identity follows from the first.

Lemma 6.2. Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and $V$ fulfill Hypotheses 2.4(i) and 2.2, respectively. Let $\varepsilon>0$ and set $I_\varepsilon:=(\lambda-\varepsilon,\lambda+\varepsilon)$. Then we have, as $n$ tends to infinity,
$$\|E_{I_\varepsilon}(D_{A,V})\psi_n\|\to1,\qquad\text{in particular,}\qquad\|\Lambda^+_{A,V}\psi_n\|\to1. \tag{6.2}$$

Proof. Clearly, $\|E_{I_\varepsilon}(D_{A,V})\psi_n\|\le1$ since $\psi_n$ is normalized. Suppose that there is some $\delta>0$ such that $\liminf\|E_{I_\varepsilon}(D_{A,V})\psi_n\|^2\le1-\delta$. Then we have $\lim_{\ell\to\infty}\|E_{I_\varepsilon}(D_{A,V})\psi_{n_\ell}\|^2\le1-\delta$, for an appropriate subsequence, and
$$\lim_{\ell\to\infty}\|(D_{A,V}-\lambda)\psi_{n_\ell}\|^2\ge\varepsilon^2\liminf_{\ell\to\infty}\int_{\mathbb{R}}(1-1_{I_\varepsilon}(s))\,d\|E_s(D_{A,V})\psi_{n_\ell}\|^2=\varepsilon^2-\varepsilon^2\lim_{\ell\to\infty}\|E_{I_\varepsilon}(D_{A,V})\psi_{n_\ell}\|^2\ge\varepsilon^2\delta>0.$$
This is a contradiction to (6.1).
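The spectral-theorem step behind this contradiction argument can be written out directly. The sketch below is an illustration, not the authors' text; it assumes $\psi$ is a normalized vector in the domain of $D_{A,V}$ and writes $E_s$ for the spectral family of $D_{A,V}$.

```latex
% Sketch: quantitative form of the argument, for normalized \psi in the domain of D_{A,V}.
% Outside I_\varepsilon = (\lambda-\varepsilon, \lambda+\varepsilon) one has (s-\lambda)^2 \ge \varepsilon^2, so
\begin{align*}
  \|(D_{A,V}-\lambda)\psi\|^2
    &= \int_{\mathbb{R}} (s-\lambda)^2 \, d\|E_s\psi\|^2 \\
    &\ge \varepsilon^2 \int_{\mathbb{R}} \bigl(1 - 1_{I_\varepsilon}(s)\bigr)\, d\|E_s\psi\|^2
     = \varepsilon^2\bigl(1 - \|E_{I_\varepsilon}(D_{A,V})\psi\|^2\bigr) .
\end{align*}
% Rearranged:
\[
  1 \;\ge\; \|E_{I_\varepsilon}(D_{A,V})\psi\|^2
    \;\ge\; 1 - \varepsilon^{-2}\,\|(D_{A,V}-\lambda)\psi\|^2 .
\]
```

Applied to $\psi=\psi_n$, the right-hand side tends to $1$ by the first limit in (6.1), which gives the first convergence in (6.2) without passing to subsequences.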


In the following we show that $E^A_{N-1}+\lambda$ belongs to $\sigma_{\mathrm{ess}}(H_N)$ by means of a suitable Weyl sequence. Instead of applying Weyl's criterion directly to $H_N$ we shall, however, use a slightly strengthened version of it in Lemma 6.3 (see, e.g., [13]) which allows one to work with quadratic forms. As already mentioned above this is important since, for instance, it seems that one cannot expect Proposition 3.2 to hold with $W^{1/2}$ replaced by $W$. (At least not for large nuclear charges $e^2Z\ge\sqrt{3}/2$.) To construct the Weyl sequence we pick, for every $n\in\mathbb{N}$, some
$$\Phi_n=\mathcal{A}_{N-1}\Phi_n\in\Lambda^{+,N-1}_{A,V}\mathscr{D}_{N-1}\quad\text{such that}\quad\langle\Phi_n|H^A_{N-1}\Phi_n\rangle<E^A_{N-1}+\frac1n,\qquad\|\Phi_n\|=1. \tag{6.3}$$
This is possible since $H_{N-1}$ is defined as a Friedrichs extension starting from $\Lambda^{+,N-1}_{A,V}\mathscr{D}_{N-1}$. We further set
$$\Upsilon_n(x):=\int_{\mathbb{R}^{3(N-2)}}|\Phi_n(x,X')|^2\,dX'. \tag{6.4}$$

Next, we pick 0 < a < min{m, 0 }, r ∈ (0, 1 − ε/4), and r ∈ (0, 1) such that (1 − r)a > (1 + r)τ,

s := r + r − 1 > 0.

(6.5)

Here τ appears in (2.18). We further pick some cut-oﬀ function, ϑ ∈ C ∞ (R, [0, 1]), such that ϑ ≡ 0 on (−∞, s/2] and ϑ ≡ 1 on [s, ∞). By Lemma 3.6 we know that (1) |D0 |1/2 Φn ∈ HN −1 , where the superscript (1) again indicates that the operator acts on the ﬁrst variable. Therefore, we ﬁnd a subsequence, {Rkn }n∈N , of {Rk }k∈N such that, for every n ∈ N, 1 (1) (6.6) ||D0 |1/2 ϑ(x1 /Rkn )Φn (X)|2 dX < , n R3(N −1) As a candidate for a Weyl sequence we then try {AN Ψn }n∈N , where +,N Ψn := Φn ⊗ Λ+ A,V ψkn ∈ ΛA,V DN ,

n ∈ N.

(6.7)

To simplify the notation we again write n instead of kn in the following. Finally, we pick some c > 1 and set f (t) := (t + c)−1/2 (t − ENA−1 − λ),

t > −c.

Lemma 6.3. Let the assumptions of Theorem 2.2(i) be fulﬁlled. If, in the situation described above, c > 1 is suﬃciently large, then AN Ψn ∈ D(f (HN )), for every n ∈ N, and w-lim AN Ψn = 0, n→∞

lim inf AN Ψn > 0, n→∞

lim f (HN )AN Ψn = 0.

n→∞

(6.8)

In particular, ENA−1 + λ ∈ σess (HN ). Proof. First, suppose that (6.8) holds true. If c > 1 is chosen suﬃciently large, then f is strictly monotonically increasing on σ(HN ). If I is some small open


interval around ENA−1 + λ we thus get EI (HN ) = Ef (I) (f (HN )). By (6.8) and the Weyl criterion applied to f (HN ) it follows that ∞ = dim Ran(Ef (I) (f (HN ))) = dim Ran(EI (HN )). To verify (6.8) we ﬁrst notice that Ψn 0, as n → ∞, because of (2.17). Exactly as in [40, §4] we can also check that lim inf AN Ψn > 0. So it suﬃces to show that f (HN )Ψn → 0, as HN commutes with AN . Since ψn and Φn are +,N normalized and Ψn = Φn ⊗ Λ+ A,V ψn ∈ ΛA,V DN we obtain f (HN )Ψn 1

1

1

≤ (HN + c)− 2 (HN −1 − ENA−1 ) 2 ⊗ 1H + (HN −1 − ENA−1 ) 2 Φn + (DA,VC − λ)Λ+ A,V ψn +

N −1

(6.10) 1

1

(6.9)

1

+ 2 2 (HN + c)− 2 Λ+,N A,V WiN WiN (Φn ⊗ ΛA,V ψn ) .

(6.11)

i=1

We ﬁrst show that the operator norm in (6.11) is actually ﬁnite. In fact, 1/2

1/2

+,N −1/2 ≤ 1, (HN + c)−1/2 Λ+,N A,V WiN = WiN ΛA,V (HN + c) +,N −1 + + A , and, hence, since W ≥ 0, Λ+ A,V DA,VC ΛA,V ≥ −C ΛA,V , HN −1 ≥ EN −1 ΛA,V 1/2

+,N +,N 2 WiN Λ+,N A,V φ = φ | ΛA,V WiN ΛA,V φ ≤ φ | (HN + c)φ,

for φ ∈ Λ+,N A,V DN . Using similar estimates and (6.3) it is straightforward to check that the term in (6.9) converges to zero provided c > 1 is suﬃciently large. The norm in (6.10) tends to zero by Lemma 6.1. The claim now follows from Lemma 6.4 below which implies that the remaining norm in (6.11) tends to zero, too. The ﬁrst inequality of the following lemma is used in the proof of Lemma 6.3 and the second one in Sec. 7. Lemma 6.4. There are κ, C ∈ (0, ∞) such that, for all n ∈ N, 1 2 W (x, y)Υn (y)|Λ+ sup W (x, y) + Ce−κRn + . A,V ψn (x)| d(x, y) ≤ n R6 |x−y|≥ (1−r )Rn

If B is bounded, then there is some C ∈ (0, ∞) such that, for all n ∈ N, 2 W (x, y)Υn (y)|Λ+ A,V ψn (x)| d(x, y) R6

≤

2 −aεRn /2 sup W (x, y) Λ+ A,V ψn + C (1 + B ∞ )e

|x−y|≥ (1−ε)Rn

+C

{|y|≥εRn /2}

Υn (y)dy(1 + B ∞ ) (DA,V + i)ψn 2 .


Proof. For n ∈ N, we pick a weight function, Fn ∈ C ∞ (R3 , [0, ∞)), with Fn ≡ 0 on R3 \BRn (0), Fn ≥ (1 − r)aRn − a on BrRn (0) and ∇Fn ∞ ≤ a. Here a and r are the parameters from (6.5) and a > 0 is some ﬁxed, n-independent constant. Since ψn = χn ψn and 1BrRn (0) χn = 0 we obtain 2 1BrRn (0) (x)W (x, y)Υn (y)|Λ+ A,V ψn (x)| d(x, y) {|x−y|<(1−r )Rn }

≤ 1BrRn (0) e−Fn ∞ ≤ C e−(1−r)aRn

sup |y|≤(r+1−r )Rn

sup |y|≤(1+r)Rn

−Fn Wy1/2 eFn [Λ+ ]ψn 2 Υn 1 A,V , χn e

(1 + |B(y)|) ≤ C e−([1−r]a−[1+r]τ )Rn .

(6.12)

In the last two steps we make use of Proposition 3.2 and (2.18). Next, if |x − y| ≤ (1 − r )Rn and 1BrRn (0) (x) = 0, then |y| ≥ (r + r − 1)Rn = sRn , and by the choice of ϑ (see the paragraph below (6.5)) it follows that 2 (1 − 1BrRn (0) (x))W (x, y)Υn (y)|Λ+ A,V ψn (x)| d(x, y) {|x−y|≤(1−r )Rn }

≤

sup |x|≥rRn

R3

2 W (x, y)ϑ(y/Rn )Υn (y)dy Λ+ A,V ψn .

(6.13)

On account of (6.5), Kato's inequality, and (6.6) the first asserted estimate follows from (6.12) and (6.13). The second one is derived similarly by means of Lemma 3.7 and the replacements $r\to1-\varepsilon/2$, $r'\to\varepsilon$. Note that $1_{B_{(1-\varepsilon/2)R_n}(0)}\chi_n=0$, which is used to derive the analogue of (6.12).

7. Existence of Eigenvalues

In this section, we prove Theorem 2.3 which asserts that $H_N^A$ possesses infinitely many eigenvalues below $\inf\sigma_{\mathrm{ess}}(H_N^A)=1+E^A_{N-1}$. We proceed along the lines of [40, §6] with a few changes. In particular, as in the previous section we replace the arguments of [40] that employ explicit position or momentum space representations of $\Lambda^+_0$ by more abstract ones. A crucial observation is the new argument used to prove Lemma 7.1. Throughout this section we always assume without further notice that the assumptions of Theorem 2.3, i.e. Hypothesis 2.5, are fulfilled.

Proof of Theorem 2.3. We proceed by induction on $N$ and start with the induction step. So, we pick $N\in\mathbb{N}$, $N\ge2$, and assume that $H^A_{N-1}$ possesses infinitely many eigenvalues below $E^A_{N-2}+1$. In particular, we can pick a normalized ground state of $H^A_{N-1}$, which we denote by $\Phi$. Moreover, we denote the transposition operator which flips the $i$th and $N$th electron variable by $\pi_{iN}$, $1\le i<N$, and set $\pi_{NN}:=1$. The vectors $\psi_1,\psi_2,\dots$ are the elements of the sequence appearing in Hypothesis 2.5.


Now, let $d \in \mathbb{N}$. By Lemma 7.7 below we know that, for all sufficiently large $m_0 \in \mathbb{N}$, the set $\{A_N(\Phi\otimes\Lambda^+_{A,V}\psi_n)\}_{n=m_0}^{m_0+d}$, where

\[
A_N(\Phi\otimes\Lambda^+_{A,V}\psi_n) = \frac{1}{N}\sum_{i=1}^{N}(-1)^{N-i}\,\pi_{iN}(\Phi\otimes\Lambda^+_{A,V}\psi_n), \qquad n \in \mathbb{N},
\]

is linearly independent. Our goal then is to show that the expectation of

\[
\tilde H_N := H_N - E^A_{N-1} - 1
\]

with respect to any linear combination of the vectors $\{A_N(\Phi\otimes\Lambda^+_{A,V}\psi_n)\}_{n=m_0}^{m_0+d}$ is strictly negative provided $m_0 \in \mathbb{N}$ is large enough. Since $d$ is arbitrary the assertion of Theorem 2.3 then follows from the minimax principle. For $c_{m_0},\ldots,c_{m_0+d} \in \mathbb{C}$, and

\[
\Psi := \sum_{n=m_0}^{m_0+d} c_n \sum_{i=1}^{N}\frac{(-1)^{N-i}}{N^{1/2}}\,\pi_{iN}(\Phi\otimes\Lambda^+_{A,V}\psi_n),
\tag{7.1}
\]

we obtain as in [40] by means of the anti-symmetry of $\Phi$,

\[
\langle\Psi\,|\,\tilde H_N\Psi\rangle \le \sum_{n=m_0}^{m_0+d}|c_n|^2\,\langle\Phi\otimes\Lambda^+_{A,V}\psi_n\,|\,\tilde H_N(\Phi\otimes\Lambda^+_{A,V}\psi_n)\rangle
\tag{7.2}
\]
\[
\qquad + (N-1)\sum_{n,m=m_0}^{m_0+d}|c_n||c_m|\,\big|\langle\pi_{1N}(\Phi\otimes\Lambda^+_{A,V}\psi_n)\,|\,\tilde H_N(\Phi\otimes\Lambda^+_{A,V}\psi_m)\rangle\big|
\tag{7.3}
\]
\[
\qquad + \sum_{\substack{n,m=m_0\\ m\neq n}}^{m_0+d}|c_n||c_m|\,\big|\langle\Phi\otimes\Lambda^+_{A,V}\psi_n\,|\,\tilde H_N(\Phi\otimes\Lambda^+_{A,V}\psi_m)\rangle\big|.
\tag{7.4}
\]

Combining the eigenvalue equation $(H_{N-1} - E^A_{N-1})\Phi = 0$ with Lemmata 7.1–7.4, Hypothesis 2.5, and (6.2), we find some $\delta_0 > 0$ such that the scalar product in (7.2) is bounded from above by $-\delta_0 R_{m_0}^{-1} + o(R_{m_0}^{-1})$, as $m_0$ gets large. Here the numbers $R_1, R_2, \ldots$ are those appearing in Hypothesis 2.5. Lemmata 7.5 and 7.6 imply that the scalar products in (7.3) and (7.4) are of order $O(R_{m_0}^{-K})$, as $m_0 \to \infty$, for every $K \in \mathbb{N}$. By the Cauchy–Schwarz inequality we find some $\delta_0' > 0$ such that

\[
\langle\Psi\,|\,\tilde H_N\Psi\rangle \le -\delta_0'\sum_{n=m_0}^{m_0+d}|c_n|^2,
\]

for all $c_{m_0},\ldots,c_{m_0+d} \in \mathbb{C}$, if $m_0$ is sufficiently large (depending on $d$). This concludes the induction step. Finally, the case $N = 1$ is treated in the same way as the induction step $N \to N+1$ (setting $E^A_0 := 0$ and ignoring $\Phi$, $W$, and the term (7.3)).
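The Cauchy–Schwarz step in the proof above can be spelled out as follows. This is only a sketch: $C_K$ is a hypothetical name for the constant implicit in the $O(R_{m_0}^{-K})$ bounds, and those bounds are assumed to hold uniformly for $m_0 \le n, m \le m_0 + d$.

```latex
\[
\sum_{n,m=m_0}^{m_0+d}|c_n||c_m|
  = \Big(\sum_{n=m_0}^{m_0+d}|c_n|\Big)^{2}
  \le (d+1)\sum_{n=m_0}^{m_0+d}|c_n|^{2},
\]
% so the off-diagonal contributions (7.3) and (7.4) are together bounded by
\[
N\,C_K\,(d+1)\,R_{m_0}^{-K}\sum_{n=m_0}^{m_0+d}|c_n|^{2},
\qquad K \in \mathbb{N}.
\]
```

Taking $K \ge 2$, this error is $o(R_{m_0}^{-1})\sum_n |c_n|^2$ for fixed $d$, so it cannot compensate the negative leading contribution coming from (7.2) once $m_0$ is large.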


To show that the contribution coming from the (one-particle) kinetic energy of $\psi_n$ decreases faster than its negative potential energy we make use of the requirement that the $\psi_n$ have vanishing lower spinor components, $\psi_n = (\psi_{n,1},\psi_{n,2},0,0)^\top$, $n \in \mathbb{N}$. This has also been used in [40] together with explicit formulas for $\Lambda^+_0$. We replace these arguments by the following observation:

Lemma 7.1. There is some $C \in (0,\infty)$ such that

\[
0 \le \langle\Lambda^+_A\psi_n\,|\,(D_A-1)\Lambda^+_A\psi_n\rangle \le C\,R_n^{-2}, \qquad n \in \mathbb{N}.
\]
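The proof of Lemma 7.1 rests on two algebraic facts about spinors with vanishing lower components: $(\beta - 1)\psi = 0$ and $p\,\alpha_i\psi = 0$, where $p$ projects onto the upper two components. The following minimal sketch checks both pointwise for a single spinor value. It assumes the standard (Dirac) representation of $\alpha_i$ and $\beta$, which is an assumption of this example; the helper names `matvec`, `block_offdiag`, and `check_upper_spinor` are hypothetical.

```python
# Sketch: algebraic identities for spinors with vanishing lower components.
# Assumption: standard Dirac representation, beta = diag(1, 1, -1, -1) and
# alpha_i = [[0, sigma_i], [sigma_i, 0]].

def matvec(M, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def block_offdiag(s):
    """Build the 4x4 matrix [[0, s], [s, 0]] from a 2x2 block s."""
    return [
        [0, 0, s[0][0], s[0][1]],
        [0, 0, s[1][0], s[1][1]],
        [s[0][0], s[0][1], 0, 0],
        [s[1][0], s[1][1], 0, 0],
    ]

# Pauli matrices sigma_1, sigma_2, sigma_3.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ALPHA = [block_offdiag(s) for s in SIGMA]
BETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
# Projection p onto the first two spinor components.
P_UPPER = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

def check_upper_spinor(psi):
    """Return ((beta - 1) psi, [p alpha_i psi for i = 1, 2, 3])."""
    beta_minus_one_psi = [b - x for b, x in zip(matvec(BETA, psi), psi)]
    p_alpha_psi = [matvec(P_UPPER, matvec(a, psi)) for a in ALPHA]
    return beta_minus_one_psi, p_alpha_psi
```

For a spinor such as `[1 + 2j, -0.5, 0, 0]` both returned quantities vanish identically, whereas a spinor with a nonzero lower component violates the first identity. Since the matrices are constant, the same holds with $\psi$ replaced by $\partial_i\psi$, which is what the proof uses.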

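Proof. Since the last two components of $\psi_n$ are zero we have $(\beta - 1)\psi_n = 0$. If we denote the projection onto the first two spinor components, $L^2(\mathbb{R}^3,\mathbb{C}^4) \ni (\varphi_1,\varphi_2,\varphi_3,\varphi_4)^\top \mapsto (\varphi_1,\varphi_2,0,0)^\top$, by $p$ then we also have $p\,\alpha_i\psi_n = 0 = p\,\alpha_i\partial_i\psi_n$, $i = 1,2,3$, and, therefore, $p\,(D_A - 1)\psi_n = 0$. Moreover, $[p, D_A^2] = 0$ and, hence, $[p, |D_A|^{-1}] = 0$. This implies

\[
\begin{aligned}
\big|\langle\Lambda^+_A\psi_n\,|\,(D_A-1)\Lambda^+_A\psi_n\rangle\big|
&= \tfrac12\langle\psi_n\,|\,p\,(D_A-1)\psi_n\rangle + \tfrac12\langle\mathrm{sgn}(D_A)\psi_n\,|\,(D_A-1)\psi_n\rangle \\
&= \tfrac12\langle(D_A-1)\psi_n\,|\,|D_A|^{-1}(D_A-1)\psi_n\rangle + \tfrac12\langle\psi_n\,|\,|D_A|^{-1}p\,(D_A-1)\psi_n\rangle \\
&\le \tfrac12\|(D_A-1)\psi_n\|^2 = O(1/R_n^2).
\end{aligned}
\]

In the last step we apply Hypothesis 2.5.

In the following we split $V_C$ into a singular and regular part, $V_C = V_C^s + V_C^r$, where $V_C^s$ is defined in (3.1). By Hypothesis 2.1 $V_C^r$ is bounded.

Lemma 7.2. As $n$ tends to infinity,

\[
\langle\Lambda^+_{A,V}\psi_n\,|\,(D_{A,V_C}-1)\Lambda^+_{A,V}\psi_n\rangle
= \langle\Lambda^+_{A,V}\psi_n\,|\,V_C^r\,\Lambda^+_{A,V}\psi_n\rangle + \langle\Lambda^+_A\psi_n\,|\,(D_A-1)\Lambda^+_A\psi_n\rangle + o(R_n^{-1}).
\]

Proof. We let $\chi_n$, $n \in \mathbb{N}$, denote the cut-off functions introduced in the paragraph preceding Lemma 6.1. Then the assertion follows from Corollary 3.5 applied to $D_{A,V_C^s} - 1$ with $\zeta = \chi_n$, since by Lemma 7.1 and Hypothesis 2.2,

\[
\big(\|\chi_n V\| + \|\nabla\chi_n\|\big)\,\langle\Lambda^+_A\psi_n\,|\,(D_A-1)\Lambda^+_A\psi_n\rangle\,\|\psi_n\|^2 = o(R_n^{-1}).
\]

In the next lemma, we single out the leading order negative contribution to (7.2).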


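Lemma 7.3. There is some constant $C \in (0,\infty)$ such that, for all sufficiently large $n \in \mathbb{N}$,

\[
\langle\Lambda^+_{A,V}\psi_n\,|\,V_C^r\,\Lambda^+_{A,V}\psi_n\rangle \le v(\delta,R_n)\,\|\Lambda^+_{A,V}\psi_n\|^2 + C\,e^{-R_n/C},
\]

where $v(\delta,R_n)$ is given by (2.20).

Proof. We pick some even function $f \in C^\infty(\mathbb{R},[0,\infty))$ such that $f \equiv 1$ on $[\delta,\infty)$, $f \equiv 0$ on $[0,\delta/2]$, and $|f'| \le 4/\delta$. (Recall (2.21).) For some $a \in (0, \delta\min\{{}_0, m\}/4)$, we define exponential weights, $F_n(x) := aR_n f(|x|/R_n)$, $n \in \mathbb{N}$. Using the notation introduced in (2.19) and (2.20) we then obtain, for all sufficiently large $n \in \mathbb{N}$,

\[
\langle\Lambda^+_{A,V}\psi_n\,|\,V_C^r\,\Lambda^+_{A,V}\psi_n\rangle \le \langle\Lambda^+_{A,V}\psi_n\,|\,1_{S_\delta(R_n)}V_C^r\,\Lambda^+_{A,V}\psi_n\rangle + \|V_C^r\|\,\|1_{\mathbb{R}^3\setminus S_\delta(R_n)}e^{-F_n}\|\,\|e^{F_n}\Lambda^+_{A,V}e^{-F_n}\|,
\]

where, by (2.20) and Pythagoras' theorem,

\[
\begin{aligned}
\langle\Lambda^+_{A,V}\psi_n\,|\,1_{S_\delta(R_n)}V_C^r\,\Lambda^+_{A,V}\psi_n\rangle
&\le v(\delta,R_n)\big(\|\Lambda^+_{A,V}\psi_n\|^2 - \|1_{\mathbb{R}^3\setminus S_\delta(R_n)}\Lambda^+_{A,V}\psi_n\|^2\big) \\
&\le v(\delta,R_n)\,\|\Lambda^+_{A,V}\psi_n\|^2 + |v(\delta,R_n)|\,\|1_{\mathbb{R}^3\setminus S_\delta(R_n)}e^{-F_n}\|^2\,\|e^{F_n}\Lambda^+_{A,V}e^{-F_n}\|^2.
\end{aligned}
\]

By (2.19), (2.21), and the choice of $F_n$ we know that $\|1_{\mathbb{R}^3\setminus S_\delta(R_n)}e^{-F_n}\| \le C e^{-a\delta R_n/2}$, which implies the assertion of the lemma.

From now on, we always assume that the induction hypothesis made in the proof of Theorem 2.3 is fulfilled and that $\Phi$ is a normalized ground state eigenvector of $H^A_{N-1}$. So, $H^A_{N-1}\Phi = E^A_{N-1}\Phi$, $E^A_{N-1} < E^A_{N-2} + 1$. Given $\delta \in (0, \tfrac{1}{N})$ we pick some $\varepsilon \in (0,1)$ as in Hypothesis 2.5(i). Then the following assertion is valid:

Lemma 7.4. As $n$ tends to infinity, we have, for $1 \le i \le N-1$,

\[
\langle\Phi\otimes\Lambda^+_{A,V}\psi_n\,|\,W_{iN}\,\Phi\otimes\Lambda^+_{A,V}\psi_n\rangle \le \sup_{|x-y|\ge(1-\varepsilon)R_n} W(x,y)\,\|\Lambda^+_{A,V}\psi_n\|^2 + O(R_n^{-\infty}).
\]

Proof. This follows from Lemma 6.4 with $\Upsilon_n(y) = \int_{\mathbb{R}^{3(N-2)}}|\Phi(y,X)|^2\,dX$ and the exponential decay of $\Phi$, which is ensured by Theorem 2.1 and the induction hypothesis.

Now, we turn to the discussion of the terms in (7.3).

Lemma 7.5. As $n$ and $m$ tend to infinity,

\[
\sigma_{nm} := \big|\langle\pi_{1N}(\Phi\otimes\Lambda^+_{A,V}\psi_n)\,|\,\tilde H_N(\Phi\otimes\Lambda^+_{A,V}\psi_m)\rangle\big| = O(R_{\min\{n,m\}}^{-\infty}).
\]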


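Proof. We pick $\chi \in C_0^\infty(\mathbb{R}^3,[0,1])$ such that $\chi \equiv 1$ on $B_{1/4}(0)$ and $\chi \equiv 0$ outside $B_{1/2}(0)$ and set $\chi_n := \chi(\cdot/R_n)$ and $\bar\chi_n := 1 - \chi_n$, for $n \in \mathbb{N}$. As in [40] we find

\[
\begin{aligned}
\sigma_{nm} &\le \big|\langle\{\chi_n\Lambda^+_{A,V}\psi_n\}\otimes\Phi\,|\,\Phi\otimes\{(D_{A,V_C}-1)\Lambda^+_{A,V}\psi_m\}\rangle\big| \\
&\quad + \big|\langle\{\Lambda^+_{A,V}\psi_n\}\otimes\Phi\,|\,\{\bar\chi_n^{(1)}\Phi\}\otimes\{(D_{A,V_C}-1)\Lambda^+_{A,V}\psi_m\}\rangle\big| \\
&\quad + \sum_{i=1}^{N-1}\big|\langle\{\chi_n\Lambda^+_{A,V}\psi_n\}\otimes\Phi\,|\,W_{iN}\,\Phi\otimes(\Lambda^+_{A,V}\psi_m)\rangle\big| \\
&\quad + \sum_{i=1}^{N-1}\big|\langle\{\Lambda^+_{A,V}\psi_n\}\otimes\Phi\,|\,W_{iN}\,(\bar\chi_n^{(1)}\Phi)\otimes(\Lambda^+_{A,V}\psi_m)\rangle\big| \\
&=: Y_1 + Y_2 + \sum_{i=1}^{N-1}Y_{3i} + \sum_{i=1}^{N-1}Y_{4i}.
\end{aligned}
\]

For the first two summands we find

\[
Y_1 + Y_2 \le \|(D_{A,V_C}-1)\Lambda^+_{A,V}\psi_m\|\,\big(\|\chi_n\Lambda^+_{A,V}\psi_n\| + \|\bar\chi_n^{(1)}\Phi\|\big),
\]

where the right-hand side is of order $O(R_{\min\{n,m\}}^{-\infty})$ due to the exponential localization of $\Phi$ and the support properties of $\psi_n$ and $\chi_n$. Moreover, we observe that, for $i = 2,\ldots,N-1$,

\[
\begin{aligned}
Y_{3i} &\le \|\chi_n\Lambda^+_{A,V}\psi_n\|\,\|W_{iN}^{1/2}\Phi\|\,\|\Phi\|\,\sup_{y\in\mathbb{R}^3}\|W_y^{1/2}\Lambda^+_{A,V}\psi_m\|, \\
Y_{4i} &\le \|\Lambda^+_{A,V}\psi_n\|\,\|W_{iN}^{1/2}\Phi\|\,\|\bar\chi_n^{(1)}\Phi\|\,\sup_{y\in\mathbb{R}^3}\|W_y^{1/2}\Lambda^+_{A,V}\psi_m\|.
\end{aligned}
\]

Here the norms $\|W_{iN}^{1/2}\Phi\|$, $i = 2,\ldots,N-1$, are actually finite since $\Phi \in \mathrm{Ker}(H_{N-1} - E^A_{N-1})$ implies

\[
\|W_{iN}^{1/2}\Phi\|^2 = \langle\Phi\,|\,\Lambda^{+,N-1}_{A,V}\,W_{iN}\,\Lambda^{+,N-1}_{A,V}\Phi\rangle \le (E^A_{N-1} + C)\,\|\Phi\|^2,
\]

for some constant $C \in (0,\infty)$. Finally,

\[
\begin{aligned}
Y_{31} &\le \sup_{y\in\mathbb{R}^3}\|W_y^{1/2}\chi_n\Lambda^+_{A,V}\psi_n\|\,\|\Phi\|^2\,\sup_{y\in\mathbb{R}^3}\|W_y^{1/2}\Lambda^+_{A,V}\psi_m\|, \\
Y_{41} &\le \sup_{y\in\mathbb{R}^3}\|W_y^{1/2}\Lambda^+_{A,V}\psi_n\|\,\|\Phi\|\,\|\bar\chi_n^{(1)}\Phi\|\,\sup_{y\in\mathbb{R}^3}\|W_y^{1/2}\Lambda^+_{A,V}\psi_m\|.
\end{aligned}
\]

We pick $f \in C^\infty(\mathbb{R},[0,\infty))$ such that $f \equiv 0$ on $[1,\infty)$, $f \equiv 1$ on $(-\infty,1/2]$, and $-3 \le f' \le 0$, and set $F_n(x) = aR_n f(|x|/R_n)$, $x \in \mathbb{R}^3$, $n \in \mathbb{N}$, where $a \in (0, \min\{{}_0, m\}/3)$. Since $\chi_n\psi_n = 0$, we find

\[
\sup_{y\in\mathbb{R}^3}\|W_y^{1/2}\chi_n\Lambda^+_{A,V}\psi_n\| \le \sup_{|x|\le R_n/2}e^{-F_n(x)}\,\sup_{y\in\mathbb{R}^3}\big\|W_y^{1/2}[\Lambda^+_{A,V},\chi_n e^{F_n}]e^{-F_n}\big\|\,\|\psi_n\|.
\]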


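This estimate, the exponential decay of $\Phi$, and Lemma 3.7 imply that the terms $Y_{3i}$ and $Y_{4i}$, $1 \le i \le N-1$, vanish of order $O(R_{\min\{n,m\}}^{-\infty})$ also.

Finally, we discuss the terms in (7.4).

Lemma 7.6. As $n$ tends to infinity, it holds, for all $m > n$,

\[
\big|\langle\Phi\otimes\Lambda^+_{A,V}\psi_n\,|\,\tilde H_N(\Phi\otimes\Lambda^+_{A,V}\psi_m)\rangle\big| = O(R_n^{-\infty}).
\]

Proof. We pick a family of smooth weight functions, $\{F_{k\ell}\}_{k,\ell\in\mathbb{N}}$, such that $F_{k\ell} \equiv 0$ on $\mathrm{supp}(\psi_k)$, $F_{k\ell}$ is constant outside some ball containing $\mathrm{supp}(\psi_k)$ and $\mathrm{supp}(\psi_\ell)$, $\|\nabla F_{k\ell}\|_\infty \le a < \min\{{}_0, m\}$, and

\[
g_{k\ell} := \|e^{-F_{k\ell}-F_{\ell k}}\|_\infty \le C\,e^{-a'\min\{R_k,R_\ell\}}, \qquad k,\ell \in \mathbb{N},
\]

where $a, a' \in (0,\min\{{}_0, m\})$ and $C \in (0,\infty)$ do not depend on $k,\ell \in \mathbb{N}$. In view of (2.21) it is easy to see that such a family exists. Then we observe that

\[
\begin{aligned}
&\big|\langle\Phi\otimes\Lambda^+_{A,V}\psi_n\,|\,\tilde H_N(\Phi\otimes\Lambda^+_{A,V}\psi_m)\rangle\big| \\
&\quad\le \big|\langle\Lambda^+_{A,V}\psi_n\,|\,(D_{A,V}-1)\psi_m\rangle\big| + \big|\langle\Lambda^+_{A,V}\psi_n\,|\,(V_H+V_E)\Lambda^+_{A,V}\psi_m\rangle\big| \\
&\qquad + \sum_{1\le i<N}\big|\langle W_{iN}^{1/2}\,\Phi\otimes\Lambda^+_{A,V}\psi_n\,|\,W_{iN}^{1/2}\,\Phi\otimes\Lambda^+_{A,V}\psi_m\rangle\big| \\
&\quad\le \;\ldots\; + (i\alpha\cdot\nabla F_{mn} + V_H + [e^{F_{mn}},V_E]\,e^{-F_{mn}})\psi_m\|\big\} \\
&\qquad + g_{nm}\,\|e^{F_{mn}}(V_H+V_E)e^{-F_{mn}}\|\,\sup_{k\neq\ell}\|e^{F_{k\ell}}\Lambda^+_{A,V}e^{-F_{k\ell}}\psi_k\|^2 \\
&\qquad + g_{nm}\,(N-1)\,\sup_{k\neq\ell}\,\sup_{y\in\mathbb{R}^3}\|W_y^{1/2}e^{F_{k\ell}}\Lambda^+_{A,V}e^{-F_{k\ell}}\psi_k\|^2.
\end{aligned}
\]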

By virtue of Proposition 3.2 and Lemma 3.7, we know that all terms behind the factors $g_{nm}$ appearing here are uniformly bounded, which shows that the assertion holds true.

Applying the above arguments in an easier situation, we obtain the following lemma.

Lemma 7.7. For every $d \in \mathbb{N}$, there is some $n_0 \in \mathbb{N}$ such that the set of vectors $\{A_N(\Phi\otimes\Lambda^+_{A,V}\psi_n)\}_{n=m_0}^{m_0+d}$ is linearly independent, for all $m_0 \in \mathbb{N}$, $m_0 \ge n_0$.

Proof. We pick $\Psi$ as in (7.1) and estimate $\|\Psi\|^2$ from below by an obvious analogue of (7.2)–(7.4) with $\tilde H_N$ replaced by the identity. Now, by virtue of Lemma 6.2 there is some $m_1 \in \mathbb{N}$ such that $\|\Phi\otimes\Lambda^+_{A,V}\psi_n\| = \|\Lambda^+_{A,V}\psi_n\| \ge 1/2$, for all $n \ge m_1$. The proof of Lemma 7.5 shows that

\[
\big|\langle\pi_{1N}(\Phi\otimes\Lambda^+_{A,V}\psi_n)\,|\,\Phi\otimes\Lambda^+_{A,V}\psi_m\rangle\big| = O(R_{\min\{n,m\}}^{-\infty}).
\]

Furthermore, by employing the exponential weights from the proof of Lemma 7.6 we see that

\[
\big|\langle\Phi\otimes\Lambda^+_{A,V}\psi_n\,|\,\Phi\otimes\Lambda^+_{A,V}\psi_m\rangle\big| = \big|\langle\psi_n\,|\,\Lambda^+_{A,V}\psi_m\rangle\big| \le C\,e^{-\min\{R_n,R_m\}/C}.
\]

Altogether we find some $C \in (0,\infty)$ such that, for $d \in \mathbb{N}$ and all sufficiently large $m_0 \in \mathbb{N}$,

\[
\|\Psi\|^2 \ge \frac12\sum_{n=m_0}^{m_0+d}|c_n|^2 - \frac{C(N-1)}{R_{m_0}}\sum_{n,m=m_0}^{m_0+d}|c_n||c_m| - \frac{C}{R_{m_0}}\sum_{\substack{n,m=m_0\\ n\neq m}}^{m_0+d}|c_n||c_m|.
\]

Hence, the Cauchy–Schwarz inequality implies that, for sufficiently large $m_0$, $\Psi$ in (7.1) is zero if and only if $c_{m_0} = \cdots = c_{m_0+d} = 0$.

Acknowledgments

It is a pleasure to thank Hubert Kalf, Sergey Morozov, and Heinz Siedentop for useful remarks and helpful discussions. Moreover, we thank Sergey Morozov for making parts of his manuscripts [38] available to us prior to publication.

References

[1] V. Bach, J. Fröhlich and I. Sigal, Quantum electrodynamics of confined nonrelativistic particles, Adv. Math. 137 (1998) 299–395.
[2] V. Bach and O. Matte, Exponential decay of eigenfunctions of the Bethe–Salpeter operator, Lett. Math. Phys. 55 (2001) 53–62.
[3] A. A. Balinsky and W. D. Evans, On the virial theorem for the relativistic operator of Brown and Ravenhall, and the absence of embedded eigenvalues, Lett. Math. Phys. 44 (1998) 233–248.
[4] A. A. Balinsky and W. D. Evans, Stability of one-electron molecules in the Brown–Ravenhall model, Comm. Math. Phys. 202 (1999) 481–500.
[5] A. A. Balinsky and W. D. Evans, On the spectral properties of the Brown–Ravenhall operator, J. Comput. Appl. Math. 148 (2002) 239–255.
[6] P. Beiersdorfer, M. H. Chen, K. T. Cheng and J. Sapirstein, Transition energies of the 3s–3p_{3/2} resonance lines in sodiumlike to phosphoruslike uranium, Phys. Rev. A 68 (2003) 022507, 7 pp.
[7] A. Berthier and V. Georgescu, On the point spectrum of Dirac operators, J. Funct. Anal. 71 (1987) 309–338.
[8] A. M. Boutet de Monvel and R. Purice, A distinguished self-adjoint extension for the Dirac operator with strong local singularities and arbitrary behavior at infinity, Rep. Math. Phys. 34 (1994) 351–360.


[9] G. E. Brown and D. G. Ravenhall, On the interaction of two electrons, Proc. Roy. Soc. London A 208 (1951) 552–559.
[10] R. Cassanas and H. Siedentop, The ground-state energy of heavy atoms according to Brown and Ravenhall: Absence of relativistic effects in leading order, J. Phys. A 39 (2006) 10405–10414.
[11] K. T. Cheng, M. H. Chen and W. R. Johnson, Accurate relativistic calculations including QED contributions for few-electron systems, in Relativistic Electronic Structure Theory. Part 2: Applications, ed. P. Schwerdtfeger, Theoretical and Computational Chemistry, Vol. 14 (Elsevier, 2002), pp. 120–187.
[12] K. T. Cheng, M. H. Chen and J. Sapirstein, Potential independence of the solution to the relativistic many-body problem and the role of negative energy states in heliumlike ions, Phys. Rev. A 59 (1999) 259–266.
[13] H. L. Cycon, R. G. Froese, W. Kirsch and B. Simon, Schrödinger Operators, Texts and Monographs in Physics (Springer, Berlin-Heidelberg, 1987).
[14] A. Derevianko, W. R. Johnson, D. P. Plante and I. M. Savukov, Negative-energy contributions to transition amplitudes in heliumlike ions, Phys. Rev. A 58 (1998) 4453–4461.
[15] M. Dimassi and J. Sjöstrand, Spectral Asymptotics in the Semi-Classical Limit, London Math. Soc. Lecture Note Series, Vol. 268 (Cambridge University Press, Cambridge, 1999).
[16] J. Dolbeault, M. J. Esteban and M. Loss, Relativistic hydrogenic atoms in strong magnetic fields, Ann. Henri Poincaré 8 (2007) 749–779.
[17] W. D. Evans, P. Perry and H. Siedentop, The spectrum of relativistic one-electron atoms according to Bethe and Salpeter, Comm. Math. Phys. 178 (1996) 733–746.
[18] V. Georgescu and M. Măntoiu, On the spectral theory of singular Dirac type Hamiltonians, J. Operator Theory 46 (2001) 289–321.
[19] M. Griesemer, Exponential decay and ionization thresholds in non-relativistic quantum electrodynamics, J. Funct. Anal. 210 (2004) 321–340.
[20] M. Griesemer and H. Siedentop, A minimax principle for the eigenvalues in spectral gaps, J. London Math. Soc. (2) 60 (1999) 490–500.
[21] M. Griesemer and C. Tix, Instability of a pseudo-relativistic model of matter with self-generated magnetic field, J. Math. Phys. 40 (1999) 1780–1791.
[22] B. Helffer, J. Nourrigat and X. P. Wang, Sur le spectre de l'équation de Dirac (dans R^3 ou R^2) avec champ magnétique, Ann. Sci. École Norm. Sup. 22(4) (1989) 515–533.
[23] B. A. Heß, M. Reiher and A. Wolf, The generalized Douglas–Kroll transformation, J. Chem. Phys. 117 (2002) 9215–9226.
[24] G. Hoever and H. Siedentop, Stability of the Brown–Ravenhall operator, Math. Phys. Electr. J. 5 (1999) Paper 6, 11 pp.
[25] M. Huber and E. Stockmeyer, Perturbative implementation of the Furry picture, Lett. Math. Phys. 79 (2007) 99–108.
[26] G. Jansen and B. A. Heß, Revision of the Douglas–Kroll transformation, Phys. Rev. A 39 (1989) 6016–6017.
[27] D. H. Jakubaßa-Amundsen, The HVZ theorem for a pseudo-relativistic operator, Ann. Henri Poincaré 8 (2007) 337–360.
[28] D. H. Jakubaßa-Amundsen, Heat kernel estimates and spectral properties of a pseudorelativistic operator with magnetic field, J. Math. Phys. 49 (2008) 032305, 22 pp.
[29] W. R. Johnson, Relativistic many-body theory applied to highly-charged ions, in Many-Body Theory of Atomic Structure and Photoionization, ed. T. N. Chang (World Scientific, 1993), pp. 19–46.


[30] W. R. Johnson, Relativistic many-body perturbation theory for highly charged ions, in Many-Body Atomic Physics, eds. J. J. Boyle and M. S. Pindzola (University Press, 1998), pp. 39–64.
[31] T. Kato, Perturbation Theory for Linear Operators, Classics in Mathematics (Springer, Berlin-Heidelberg, 1995).
[32] T. Kato, Holomorphic families of Dirac operators, Math. Z. 183 (1983) 399–406.
[33] Y. Last and B. Simon, The essential spectrum of Schrödinger, Jacobi, and CMV operators, J. d'Analyse Math. 98 (2006) 183–220.
[34] E. H. Lieb and M. Loss, Stability of a model of relativistic quantum electrodynamics, Comm. Math. Phys. 228 (2002) 561–588.
[35] E. H. Lieb, H. Siedentop and J. P. Solovej, Stability and instability of relativistic electrons in classical electromagnetic fields, J. Statist. Phys. 89 (1997) 37–59.
[36] O. Matte and E. Stockmeyer, On the eigenfunctions of no-pair operators in classical magnetic fields, Integr. Equ. Oper. Theory 65 (2009) 255–283.
[37] S. Morozov, Essential spectrum of multiparticle Brown–Ravenhall operators in external field, Documenta Math. 13 (2008) 51–79.
[38] S. Morozov, Multi-particle Brown–Ravenhall operators in external fields, PhD thesis, Universität München (2008).
[39] S. Morozov, Exponential decay of eigenfunctions of Brown–Ravenhall operators, J. Phys. A 42 (2009) 475206, 16 pp.
[40] S. Morozov and S. Vugalter, Stability of atoms in the Brown–Ravenhall model, Ann. Henri Poincaré 7 (2006) 661–687.
[41] G. Nenciu, Self-adjointness and invariance of the essential spectrum for Dirac operators defined as quadratic forms, Comm. Math. Phys. 48 (1976) 235–247.
[42] G. Nenciu, Distinguished self-adjoint extension for the Dirac operator with potential dominated by multicenter Coulomb potentials, Helvetica Phys. Acta 50 (1977) 1–3.
[43] M. Reiher and A. Wolf, Relativistic Quantum Chemistry (Wiley-VCH, Weinheim, 2009).
[44] R. Richard and R. Tiedra de Aldecoa, On the spectrum of magnetic Dirac operators with Coulomb-type perturbations, J. Funct. Anal. 250 (2007) 625–641.
[45] J. Sapirstein, Theoretical methods for the relativistic atomic many-body problem, Rev. Modern Phys. 70 (1998) 55–76.
[46] H. Siedentop and E. Stockmeyer, The Douglas–Kroll–Heß method: Convergence and block-diagonalization of Dirac operators, Ann. Henri Poincaré 7 (2006) 45–58.
[47] J. Sucher, Foundations of the relativistic theory of many-electron atoms, Phys. Rev. A 22 (1980) 348–362.
[48] J. Sucher, Relativistic many-electron Hamiltonians, Phys. Scripta 36 (1987) 271–281.
[49] B. Thaller, The Dirac Equation, Texts and Monographs in Physics (Springer, Berlin-Heidelberg, 1992).
[50] C. Tix, Strict positivity of a relativistic Hamiltonian due to Brown and Ravenhall, Bull. London Math. Soc. 30 (1998) 283–290.
[51] C. Tix, Self-adjointness and spectral properties of a pseudo-relativistic Hamiltonian due to Brown and Ravenhall, preprint (1997) 20 pp.; mp arc 97-441.
[52] C. Tix, Lower bound for the ground state energy of the no-pair Hamiltonian, Phys. Lett. B 405 (1997) 293–296.
[53] J. Xia, On the contribution of the Coulomb singularity of arbitrary charge to the Dirac Hamiltonian, Trans. Amer. Math. Soc. 351 (1999) 1989–2023.

February 11, 2010 10:1 WSPC/148-RMP

J070-S0129055X10003886

Reviews in Mathematical Physics Vol. 22, No. 1 (2010) 55–89 c World Scientiﬁc Publishing Company DOI: 10.1142/S0129055X10003886

ON THE LONG TIME BEHAVIOR OF FREE STOCHASTIC SCHRÖDINGER EVOLUTIONS

ANGELO BASSI
Department of Physics, University of Trieste, Strada Costiera 11, 34151 Trieste, Italy and Istituto Nazionale di Fisica Nucleare, Trieste Section, Via Valerio 2, 34127 Trieste, Italy
[email protected]

DETLEF DÜRR∗ and MARTIN KOLB†
Mathematisches Institut der L.M.U., Theresienstr. 39, 80333 München, Germany
∗[email protected]
†[email protected]

Received 10 March 2009
Revised 15 August 2009

We discuss the time evolution of the wave function which is the solution of a stochastic Schrödinger equation describing the dynamics of a free quantum particle subject to spontaneous localizations in space. We prove global existence and uniqueness of solutions. We observe that there exist three time regimes: the collapse regime, the classical regime and the diffusive regime. Concerning the latter, we assert that the general solution converges almost surely to a diffusing Gaussian wave function having a finite spread both in position as well as in momentum. This paper corrects and completes earlier works on this issue.

Keywords: Collapse models; GRW-model; Hilbert space valued diffusions; large time behavior.

Mathematics Subject Classification 2000: 60H30, 60J60, 82C31, 81S99, 35R60

1. Introduction

Stochastic differential equations (SDEs) in infinite dimensional spaces are a subject of growing interest within the mathematical physics and physics communities working in quantum mechanics; they are currently used in models of spontaneous wave function collapse [1–14], in the theory of continuous quantum measurement [15, 17–25], and in the theory of open quantum systems [26–28]. In the first case, the Schrödinger equation is modified by adding appropriate nonlinear and stochastic terms which induce the (random) collapse of the wave function in space; in this way, one achieves the goal of a unified description of microscopic quantum phenomena and macroscopic classical ones, avoiding the occurrence of macroscopic quantum superpositions. Current research focuses on designing experiments which discriminate between collapse and non-collapse theories, see references in [16]. In the second case, using the projection postulate, stochastic terms in the Schrödinger equation are used to describe the effect of a continuous measurement. In the third case, slightly generalizing the notion of continuous measurement to generic interactions with environments, SDEs are used as phenomenological equations describing the interaction of a quantum system with an environment, the stochastic terms encoding the effect of the environment on the system. Looking directly at the stochastic differential equation for the wave function, rather than the deterministic equation of the Lindblad type for the statistical operator, has some advantages with respect to the standard master equation approach, e.g. for faster numerical simulations [29].

Among the different SDEs which have been considered so far, the following equation, defined in the Hilbert space $\mathcal{H} \equiv L^2(\mathbb{R})$, is of particular interest [17–19, 30–37, 39]:

\[
d\psi_t = \Big[-\frac{i}{\hbar}\frac{p^2}{2m}\,dt + \sqrt{\lambda}\,(q - \langle q\rangle_t)\,dW_t - \frac{\lambda}{2}(q - \langle q\rangle_t)^2\,dt\Big]\psi_t, \qquad \psi_0 = \psi.
\tag{1.1}
\]

The first term on the right-hand side represents the usual quantum Hamiltonian of a free particle in one dimension, $p$ being the momentum operator. The second and third terms of the equation, as we shall see, induce the localization of the wave function in space; $q$ is the position operator and $\langle q\rangle_t$ denotes the quantum expectation $\langle\psi_t|q\psi_t\rangle$ of $q$ with respect to $\psi_t$. The parameter $\lambda$ is a fixed positive constant which sets the strength of the collapse mechanism, while $W_t$ is a standard Wiener process defined on a probability space $(\Omega,\mathcal{F},\mathbb{P})$ with filtration $\{\mathcal{F}_t, t \ge 0\}$.

Equation (1.1) plays a special role among the SDEs in Hilbert spaces because it is the simplest exactly solvable equation describing the time evolution of a nontrivial physical system. Within the theory of continuous quantum measurement, it describes a measurement-like process designed to measure the position of a free quantum particle; within decoherence theory it represents one of the possible unravellings of the master equation first derived by Joos and Zeh [40]. Within collapse models (like GRW-models), it may describe the evolution of a free quantum particle (or the center of mass of an isolated system) subject to spontaneous localizations in space [1, 2] in the following sense. Realistic models of spontaneous wave function collapse are based on a more complicated stochastic differential equation: The difference between Eq. (1.1) and the equations of the standard localization models such as GRW [1] and CSL [2] is most easily described on the level of the Lindblad equations for the respective statistical operators $\rho_t := \mathbb{E}_{\mathbb{P}}[|\psi_t\rangle\langle\psi_t|]$, induced by the stochastic dynamics of the wave function. By virtue of Eq. (1.1) (see, e.g., [9]):

\[
\frac{d}{dt}\rho_t = -\frac{i}{2m\hbar}[p^2,\rho_t] - \frac{\lambda}{2}[q,[q,\rho_t]],
\tag{1.2}
\]


with the "Lindblad term" in position representation

\[
-\frac{\lambda_{\mathrm{GRW}}\,\alpha}{4}\,(x-y)^2\,\rho_t(x,y).
\tag{1.3}
\]

For the GRW dynamics as described in [1] the corresponding Lindblad term of the GRW master equation in the position representation reads:

\[
-\lambda_{\mathrm{GRW}}\big[1 - e^{-\alpha(x-y)^2/4}\big]\rho_t(x,y).
\tag{1.4}
\]

When the distances involved are smaller than the length $1/\sqrt{\alpha} \simeq 10^{-5}$ cm characterizing the model we have that

\[
-\lambda_{\mathrm{GRW}}\big[1 - e^{-\alpha(x-y)^2/4}\big] \simeq -\frac{\lambda_{\mathrm{GRW}}\,\alpha}{4}\,(x-y)^2, \qquad \text{for } |x-y| \ll \frac{1}{\sqrt{\alpha}}.
\tag{1.5}
\]
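The quality of the small-distance approximation (1.5) can be checked numerically. The sketch below compares the exact GRW factor $1 - e^{-\alpha d^2/4}$ with its quadratic approximation $\alpha d^2/4$ for a distance $d = |x-y|$; the overall rate $\lambda_{\mathrm{GRW}}$ cancels in the relative error. The function names are hypothetical, and the value $\alpha = 10^{10}\,\mathrm{cm}^{-2}$ used in the usage note is simply what $1/\sqrt{\alpha} \simeq 10^{-5}$ cm implies.

```python
import math

def grw_factor(d, alpha):
    """Exact factor 1 - exp(-alpha * d**2 / 4) from (1.4), without the rate."""
    # expm1 keeps precision when alpha * d**2 / 4 is tiny.
    return -math.expm1(-alpha * d * d / 4.0)

def quadratic_factor(d, alpha):
    """Quadratic approximation alpha * d**2 / 4 from (1.5)."""
    return alpha * d * d / 4.0

def relative_error(d, alpha):
    """Relative deviation of the quadratic approximation from the exact factor."""
    exact = grw_factor(d, alpha)
    return abs(quadratic_factor(d, alpha) - exact) / exact
```

With `alpha = 1e10` (in cm^-2), `relative_error(1e-8, alpha)` is far below one part in a million, which corresponds to atomic distances, whereas at `d = 1e-5` cm, i.e. at the characteristic length itself, the relative error exceeds ten percent.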

Accordingly, the stochastic dynamics of Eq. (1.1) approximates, at least on the statistical level, the GRW dynamics for all atomic and subatomic distances. Since this is a regime of growing interest [41, 13, 42, 43] it is reasonable to study the simpler Eq. (1.1) first.

Equation (1.1) is nonlinear. Nonlinearity is a fundamental ingredient because only in this way is it possible to reproduce the collapse of the wave function. It is well known how to "linearize" the equation, i.e. how to express its solutions as a function of the solutions of a suitable linear SDE [31, 44]. We briefly review this procedure. Let us consider the following linear SDE:

\[
d\phi_t = \Big[-\frac{i}{\hbar}\frac{p^2}{2m}\,dt + \sqrt{\lambda}\,q\,d\xi_t - \frac{\lambda}{2}q^2\,dt\Big]\phi_t, \qquad \phi_0 = \phi,
\tag{1.6}
\]

defined in the same Hilbert space $\mathcal{H} \equiv L^2(\mathbb{R})$; the stochastic process $\xi_t$ is a standard Wiener process with respect to the probability space $(\Omega,\mathcal{F},\mathbb{Q})$ and filtration $\{\mathcal{F}_t, t \ge 0\}$, where $\mathbb{Q}$ is a new probability measure whose relation with $\mathbb{P}$ will soon be established. This equation does not conserve the norm of the state vector, as the evolution is not unitary; we therefore introduce the normalized state vectors:

\[
\psi_t = \begin{cases} \phi_t/\|\phi_t\| & \text{if } \|\phi_t\| \neq 0, \\ 0 & \text{otherwise.} \end{cases}
\tag{1.7}
\]

A standard application of Itô calculus shows that, if $\phi_t$ solves Eq. (1.6), then $\psi_t$ defined in (1.7) solves the following nonlinear SDE:

\[
d\psi_t = \Big[-\frac{i}{\hbar}\frac{p^2}{2m}\,dt + \sqrt{\lambda}\,(q - \langle q\rangle_t)\big(d\xi_t - 2\sqrt{\lambda}\,\langle q\rangle_t\,dt\big) - \frac{\lambda}{2}(q - \langle q\rangle_t)^2\,dt\Big]\psi_t,
\tag{1.8}
\]

for the same initial condition $\psi = \phi$.

Equation (1.8) is a well defined collapse equation; however, it is not suitable for physical applications, as the collapse does not occur with the correct quantum probabilities. This can be seen by analyzing the time evolution of particular solutions, such as Gaussian wave functions; it can also be easily understood by noting that there is no fundamental difference between Eqs. (1.8) and (1.6), since any solution of Eq. (1.8) can be obtained from a solution of Eq. (1.6) simply by normalizing the wave function. In turn, Eq. (1.6) does not contain any information as to why the wave function should collapse according to the Born probability rule, i.e. the Wiener process $\xi_t$ is not forced to pick most likely those values necessary to reproduce quantum probabilities, during the collapse process.

The way to include such a feature into the dynamical evolution of the wave function is to replace the measure $\mathbb{Q}$ with a new measure (which will turn out to be the measure $\mathbb{P}$ previously introduced) so that the process $\xi_t$, according to the new measure, is forced to take with higher probability the values which account for quantum probabilities. This is precisely the key idea behind the original GRW model of spontaneous wave function collapse [1]: the wave function is more likely to collapse where it is more appreciably different from zero. The mathematical structure of the GRW model suggests that the square modulus $\|\phi_t\|^2$ should be used as density for the change of measure. We now formalize these steps.

In [31], Holevo has proven that for initial condition $\|\phi_0\|^2 = 1$ the process $(\|\phi_t\|^2)_{t\ge 0}$ is a martingale satisfying the equation

\[
\|\phi_t\|^2 = \|\phi_0\|^2 + 2\sqrt{\lambda}\int_0^t \langle q\rangle_s\,\|\phi_s\|^2\,d\xi_s.
\tag{1.9}
\]
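The structure of (1.9) can be made plausible by a formal Itô computation starting from the linear equation (1.6). The following is only a heuristic sketch: domain questions for the unbounded operators $p^2$ and $q$ are ignored, and $\hbar$ is assumed to enter the Hamiltonian term as $-(i/\hbar)\,p^2/2m$.

```latex
\begin{align*}
d\|\phi_t\|^2
  &= \langle d\phi_t|\phi_t\rangle + \langle\phi_t|d\phi_t\rangle
     + \langle d\phi_t|d\phi_t\rangle \\
  &= 2\sqrt{\lambda}\,\langle\phi_t|q\,\phi_t\rangle\,d\xi_t
     - \lambda\,\langle\phi_t|q^2\phi_t\rangle\,dt
     + \lambda\,\langle\phi_t|q^2\phi_t\rangle\,dt \\
  &= 2\sqrt{\lambda}\,\langle q\rangle_t\,\|\phi_t\|^2\,d\xi_t .
\end{align*}
```

Here the two Hamiltonian contributions cancel because $p^2$ is symmetric, the Itô rule $(d\xi_t)^2 = dt$ produces the term $+\lambda\langle\phi_t|q^2\phi_t\rangle\,dt$, and $\langle\phi_t|q\,\phi_t\rangle = \langle q\rangle_t\|\phi_t\|^2$ by the normalization (1.7). The absence of a $dt$ term is what makes $\|\phi_t\|^2$ a (local) martingale at this formal level.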

We shall always work with normalized initial states. The martingale $\|\phi_t\|^2$ can be used as a Radon–Nikodym derivative to generate a new probability measure $\mathbb{P}$ from $\mathbb{Q}$, according to the usual formula:

\[
\mathbb{P}[E] := \mathbb{E}_{\mathbb{Q}}\big[1_E\,\|\phi_t\|^2\big], \qquad \forall\,E \in \mathcal{F}_t, \quad \forall\,t < +\infty,
\tag{1.10}
\]

where $1_E$ is the indicator function relative to the measurable subset $E$. We recall that the martingale property, together with the property $\mathbb{E}_{\mathbb{Q}}[\|\phi_t\|^2] = 1$, guarantees consistency among different times, so that (1.10) defines indeed a unique probability measure $\mathbb{P}$ on $\mathcal{F}$. In the following, for simplicity we will write $d\mathbb{P}/d\mathbb{Q} \equiv \|\phi_t\|^2$. One can then show that Eq. (1.8), with the stochastic dynamics defined on the probability space $(\Omega,\mathcal{F},\mathbb{P})$ in place of $(\Omega,\mathcal{F},\mathbb{Q})$, correctly describes the desired physical situations.

A drawback of the change of measure is that the equation is defined in terms of the stochastic process $\xi_t$, which is no longer a Wiener process with respect to the measure $\mathbb{P}$, as it was with respect to the measure $\mathbb{Q}$. This can be a source of many difficulties, e.g. when analyzing the properties of the solutions of the equation. The disadvantage can be removed by resorting to Girsanov's theorem, which connects Wiener processes defined on the same measurable space, but with respect to different probability measures. According to this theorem, the process

\[
W_t := \xi_t - 2\sqrt{\lambda}\int_0^t \langle q\rangle_s\,ds,
\tag{1.11}
\]

is a Wiener process with respect to $(\Omega,\mathcal{F},\mathbb{P})$ and filtration $\{\mathcal{F}_t, t \ge 0\}$, and thus is the natural process for describing the stochastic dynamics with respect to the measure $\mathbb{P}$. It is immediate to see that, once written in terms of $W_t$, Eq. (1.8) reduces to Eq. (1.1), thus the link between Eqs. (1.6) and (1.1) is established. The above discussion should also have given a first idea of why SDEs like Eq. (1.1) are those which are used in quantum mechanics to describe the collapse of the wave function; we will come back to this point later in the paper.

The first important problem to address concerns the status of the solutions of Eq. (1.6). In [31], Holevo has proven the existence and uniqueness of topological weak solutions of a rather general class of SDEs with unbounded operators, to which Eq. (1.6) belongs. (See the end of the section for the notation.) The problem of the existence and uniqueness of topological strong solutions of Eq. (1.6) has been addressed in [30]; there however, the proof relies on the expansion of wave functions in terms of Gaussian states, which in general is problematic and requires special care, as shown in [45]. An explicit representation of the strong solution of Eq. (1.6) has been given in [37]; the representation is written in terms of path integrals and is not particularly suitable for analyzing the time evolution of the general solution. A much more convenient representation, given in terms of the Green's function of Eq. (1.6), was first derived in [32, 35]; the Green's function reads:

\[
G_t(x,y) = K_t\exp\Big[-\frac{\alpha_t}{2}(x^2+y^2) + \beta_t\,xy + \bar a_t\,x + \bar b_t\,y + \bar c_t\Big];
\tag{1.12}
\]

the coefficients $K_t$, $\alpha_t$ and $\beta_t$ are deterministic and equal to

\[
K_t = \sqrt{\frac{\lambda}{\upsilon\pi\sinh\upsilon t}},
\tag{1.13}
\]
\[
\alpha_t = \frac{2\lambda}{\upsilon}\coth\upsilon t,
\tag{1.14}
\]
\[
\beta_t = \frac{2\lambda}{\upsilon}\sinh^{-1}\upsilon t,
\tag{1.15}
\]

while the remaining coefficients are functions of the Wiener process $\xi_t$:

\[
\bar a_t = \sqrt{\lambda}\,\sinh^{-1}\upsilon t\int_0^t \sinh\upsilon s\;d\xi_s,
\tag{1.16}
\]
\[
\bar b_t = 2i\,\frac{\hbar\lambda}{m\upsilon}\int_0^t \frac{\bar a_s}{\sinh\upsilon s}\,ds,
\tag{1.17}
\]
\[
\bar c_t = \frac{i\hbar}{m}\int_0^t \bar a_s^2\,ds.
\tag{1.18}
\]

In the above expressions, we have introduced the following two constants:

\[
\upsilon \equiv \frac{1+i}{2}\,\omega, \qquad \omega \equiv 2\sqrt{\frac{\hbar\lambda}{m}}.
\tag{1.19}
\]
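The deterministic coefficients of the Green's function are easy to evaluate numerically. The following minimal sketch rests on several assumptions that should be checked against the printed formulas: that (1.19) reads $\omega = 2\sqrt{\hbar\lambda/m}$ and $\upsilon = (1+i)\,\omega/2$, that $K_t$ in (1.13) carries a square root, and that $\sinh^{-1}$ in (1.15) denotes the reciprocal $1/\sinh$ rather than the inverse function. The function names are hypothetical.

```python
import cmath
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J s

def collapse_constants(lam, mass, hbar=HBAR):
    """Return (omega, upsilon), assuming omega = 2*sqrt(hbar*lam/mass)."""
    omega = 2.0 * math.sqrt(hbar * lam / mass)
    upsilon = 0.5 * (1 + 1j) * omega
    return omega, upsilon

def green_coefficients(t, lam, mass, hbar=HBAR):
    """Return (K_t, alpha_t, beta_t) for t > 0 under the stated assumptions."""
    _, upsilon = collapse_constants(lam, mass, hbar)
    z = upsilon * t
    sinh_z = cmath.sinh(z)
    k_t = cmath.sqrt(lam / (upsilon * math.pi * sinh_z))
    alpha_t = (2.0 * lam / upsilon) * cmath.cosh(z) / sinh_z  # coth(z)
    beta_t = (2.0 * lam / upsilon) / sinh_z                   # 1 / sinh(z)
    return k_t, alpha_t, beta_t
```

In units with `hbar = lam = mass = 1` one gets `omega = 2` and `upsilon = 1 + 1j`, so that `upsilon**2` equals `2j`, i.e. $2i\hbar\lambda/m$. For $\omega t \gg 1$ the coefficient `alpha_t` approaches the constant $2\lambda/\upsilon$ while `beta_t` decays exponentially, which is consistent with $\omega$ setting the time scale of the dynamics.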


As we shall see, the parameter ω, which has the dimensions of a frequency, will set the time scales for the collapse of the wave function. The representation in terms of the Green’s function (1.12), as we have said, is particularly suitable for analyzing the time evolution of the general solution of Eq. (1.6), and thus of Eq. (1.1), even though we will see that, when studying the long time behavior, another representation is more convenient. Our ﬁrst result concerns the meaning of the solution of Eq. (1.6) in terms of (1.20) φt (x) := dyGt (x, y)φ(y) for given initial condition φ. Theorem 1.1. Let φt be deﬁned as in (1.20); then the following three statements hold true with Q-probability 1: (1) φ ∈ L2 (R) ⇒ φt ∈ L2 (R), (2) φ ∈

L2B (R) 2

⇒ φt is a topological strong solution of Eq. (1.6),

(3) φ ∈ L (R) ⇒ limt→0 φt − φ = 0,

(1.21) (1.22) (1.23)

where $L^2_B(\mathbb{R})$ is the subspace of all bounded functions of $L^2(\mathbb{R})$.

Having the explicit solution of Eq. (1.6), and thus of Eq. (1.1), the next relevant problem is to unfold its physical content. Previous analyses of similar equations [2,8,10,14,39] have shown that one can identify three regimes, which are more or less well separated depending on the value of the parameters $\lambda$ and $m$.

(1) Collapse regime. A wave function having an initially large spread localizes in space, the localization occurring in agreement with the Born probability rule.

(2) Classical regime. The localized wave function moves in space like a classical free particle, since the fluctuations due to the Wiener process can be safely ignored. That a well localized Schrödinger wave function should move along a classical path is connected to the validity of Ehrenfest- or Egorov-type theorems [38].

(3) Diffusive regime. Eventually, the random fluctuations become dominant and the wave function starts to diffuse appreciably.

It is not an easy task to spell out these regimes and their properties rigorously. We shall, however, be a bit more specific on this in the following section. We shall afterwards focus on the simplest regime, namely the diffusive one, which has been studied intensively in previous years [7, 19, 26, 34, 35, 39], and we shall prove a remarkable property of the solutions of Eq. (1.1): any solution converges almost surely to a Gaussian wave function having a fixed spread.

Theorem 1.2. Let $\psi_t$ be a solution of Eq. (1.1); then, under conditions which we will specify, the following property holds true with $\mathbb{P}$-probability 1:

\[ \lim_{t\to\infty} \|\psi_t - \psi_t^\infty\| = 0, \tag{1.24} \]


where $\psi_t^\infty$, defined in (5.2), is a Gaussian wave function with a fixed spread both in position and momentum.

Theorems 1.1 and 1.2 have been extensively discussed before in the literature [7, 19, 26, 30, 34–36, 39], which shows that the community has devoted much attention to the problem. However, the existing proofs are either incomplete or flawed. Concerning Theorem 1.1, Statement (3) in particular was not proven [30, 34–36]. While Statements (1) and (2) are rather straightforward conclusions from the Gaussian kernel of the propagator, the third statement is much more subtle and does not follow from purely analytical arguments. Concerning Theorem 1.2, none of the previous proofs is decisive. In [35, 36], the major flaw was that it was overlooked that the eigenfunction expansion of the relevant dissipative (not self-adjoint) operator does not give rise to an orthonormal basis. In [19], the long time behavior was analyzed by expanding the general solution in terms of coherent states, while in [26,39] it was analyzed by scrutinizing the time evolution of the spread in position of the solution; in [45] it has been shown that neither approach is conclusive. Finally, [7] proposed Theorem 1.2 as a conjecture, but showed stability of $\psi_t^\infty$ only against small perturbations. Building on previous work of Holevo, Mora and Rebolledo recently enhanced in [46, 47] the general theory of stochastic Schrödinger equations. In particular, they developed criteria for the existence of regular invariant measures for a large class of stochastic Schrödinger equations as an important step towards an understanding of the large time behavior. Until now, however, the only complete and detailed results on the large time behavior seem to be Theorems 1.1 and 1.2.

We conclude this introductory section by summarizing the content of the paper. In Sec. 2, we will present a qualitative analysis of the time evolution of the general solution of Eq. (1.1); we will discuss the three regimes previously introduced, giving also numerical estimates, and we will set out the main problems which we aim at solving. In Sec. 3, we will analyze the structure of the Green's function (1.12) and prove Theorem 1.1. In Sec. 4, we will introduce another representation of the general solution of Eq. (1.1), which is more suitable for analyzing its long time behavior. Sec. 5 will be devoted to the proof of Theorem 1.2. Finally, Sec. 6 will contain some concluding remarks and an outlook.

Notation. We will work in the complex and separable Hilbert space $L^2(\mathbb{R})$, with the norm and the scalar product given, respectively, by $\|\cdot\|$ and $\langle\cdot|\cdot\rangle$. We will also consider the subspace $L^2_B(\mathbb{R})$ of all bounded functions of $L^2(\mathbb{R})$. Given an operator $O$, we denote with $D(O)$ its domain and with $R(O)$ its range. Since the real and imaginary parts of some coefficients appear in some expressions, we introduce, for ease of readability, the symbols $z^R$ or $z_R$ to denote the real part of the complex number $z$, and $z^I$ or $z_I$ to denote its imaginary part. Given the linear SDE (1.6), a topological strong solution is an $L^2$-valued process such that for any $t > 0$,

\[ \phi_t = \phi - \frac{i}{\hbar}\int_0^t \frac{p^2}{2m}\,\phi_s\,ds + \sqrt{\lambda}\int_0^t q\,\phi_s\,d\xi_s - \frac{\lambda}{2}\int_0^t q^2\phi_s\,ds \tag{1.25} \]


holds with $\mathbb{Q}$-probability 1. A topological weak solution is instead an $L^2$-valued process such that for any $t > 0$ and for any $\chi \in D(p^2) \cap D(q^2)$,

\[ \langle\chi|\phi_t\rangle = \langle\chi|\phi\rangle - \frac{i}{\hbar}\,\frac{1}{2m}\int_0^t \langle p^2\chi|\phi_s\rangle\,ds + \sqrt{\lambda}\int_0^t \langle q\chi|\phi_s\rangle\,d\xi_s - \frac{\lambda}{2}\int_0^t \langle q^2\chi|\phi_s\rangle\,ds \tag{1.26} \]

holds with $\mathbb{Q}$-probability 1. Topological strong and weak solutions for the nonlinear SDE (1.1) are defined in a similar way. There is also a distinction between strong and weak solutions in a stochastic sense [48], depending on whether the probability space, the filtration and the Wiener process are given a priori (strong solution) or whether they can be constructed in such a way as to solve the required SDE (weak solution). Throughout the paper, we will deal only with strong solutions in the stochastic sense.

2. Time Evolution of the General Solution

We begin our discussion with a qualitative analysis of the time evolution of the general solution of Eq. (1.1); we will identify the three regimes introduced in the previous section, corresponding to three different behaviors of the wave function. These regimes of course depend on the value of the mass $m$ of the particle and also on the value of the coupling constant $\lambda$, which sets the strength of the collapse mechanism. As discussed, e.g., in [39], it is physically appropriate to take $\lambda$ proportional to the mass $m$ according to the formula:

\[ \lambda := \lambda_0\,\frac{m}{m_0}, \tag{2.1} \]

where $\lambda_0$ is now assumed to be a universal coupling constant, while $m_0$ is taken equal to the mass of a nucleon ($\simeq 1.67 \times 10^{-27}$ kg). To be definite, in the following we take $\lambda_0 \simeq 1.00 \times 10^{-2}$ m$^{-2}$ sec$^{-1}$, so that the localization mechanism has the same strength as that of the GRW model [1]. Although, as we discussed in the introduction, Eq. (1.1) is also used in the theory of continuous measurement and in the theory of decoherence, for brevity and clarity we will refer in the following only to its application within models of spontaneous wave function collapse.

1. The collapse regime

The first important effect of the dynamics embodied in Eq. (1.1) is that a wave function which initially is well spread out in space becomes rapidly localized. This is most easily seen through the Green's function representation of the solution. The Green's function $G_t(x, y)$ in (1.12) can be rewritten as follows

\[ G_t(x, y) = K_t \exp\left[-\frac{\tilde\alpha_t}{2}\,x^2 + \tilde a_t\,x + \tilde c_t\right] \exp\left[-\frac{\alpha_t}{2}\,(y - Y_t^x)^2\right] \tag{2.2} \]


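where we have introduced the new parameters:

\[ \tilde\alpha_t = \alpha_t - \frac{\beta_t^2}{\alpha_t} = \frac{2\lambda}{\upsilon}\tanh\upsilon t, \tag{2.3} \]

\[ \tilde a_t = \bar a_t + \frac{\beta_t \bar b_t}{\alpha_t}, \tag{2.4} \]

\[ \tilde c_t = \bar c_t + \frac{\bar b_t^2}{2\alpha_t}, \tag{2.5} \]

\[ Y_t^x = \frac{\beta_t x + \bar b_t}{\alpha_t}. \tag{2.6} \]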
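The parameters (2.3)-(2.6) are what one obtains by completing the square in $y$. The following Python sketch checks this numerically. It rests on two assumptions that this section does not restate: that the exponent of (1.12) is the quadratic form $-\frac{\alpha_t}{2}(x^2+y^2) + \beta_t xy + \bar a_t x + \bar b_t y + \bar c_t$, and that $\alpha_t = (2\lambda/\upsilon)\coth\upsilon t$, $\beta_t = (2\lambda/\upsilon)/\sinh\upsilon t$, as read off from Mehler's formula (3.17). All function names are illustrative.

```python
import cmath

def completed_square(alpha, beta, a_bar, b_bar, c_bar):
    """Parameters (2.3)-(2.6), obtained by completing the square in y."""
    alpha_tilde = alpha - beta**2 / alpha            # (2.3)
    a_tilde = a_bar + beta * b_bar / alpha           # (2.4)
    c_tilde = c_bar + b_bar**2 / (2 * alpha)         # (2.5)
    centre = lambda x: (beta * x + b_bar) / alpha    # (2.6), Y_t^x
    return alpha_tilde, a_tilde, c_tilde, centre

def exponent_assumed(x, y, alpha, beta, a_bar, b_bar, c_bar):
    """Assumed quadratic exponent of the Green's function (1.12)."""
    return (-alpha / 2 * (x * x + y * y) + beta * x * y
            + a_bar * x + b_bar * y + c_bar)

def exponent_rewritten(x, y, alpha, beta, a_bar, b_bar, c_bar):
    """Exponent of the factorised form (2.2)."""
    alpha_tilde, a_tilde, c_tilde, centre = completed_square(
        alpha, beta, a_bar, b_bar, c_bar)
    return (-alpha_tilde / 2 * x * x + a_tilde * x + c_tilde
            - alpha / 2 * (y - centre(x)) ** 2)

def mehler_alpha_beta(lam, upsilon, t):
    """alpha_t and beta_t as read off from Mehler's formula (assumption)."""
    alpha = 2 * lam / upsilon / cmath.tanh(upsilon * t)
    beta = 2 * lam / upsilon / cmath.sinh(upsilon * t)
    return alpha, beta

if __name__ == "__main__":
    coeffs = (1.3 + 0.4j, 0.7 - 0.2j, 0.1 + 0.5j, -0.3 + 0.2j, 0.05j)
    gap = (exponent_assumed(0.8, -1.1, *coeffs)
           - exponent_rewritten(0.8, -1.1, *coeffs))
    print("mismatch between the two exponents:", abs(gap))
```

Under the same assumptions, `completed_square(*mehler_alpha_beta(lam, upsilon, t), 0, 0, 0)[0]` should agree with $(2\lambda/\upsilon)\tanh\upsilon t$, which is the second equality in (2.3).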

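The $y$-part of $G_t(x, y)$ is a Gaussian function whose spread in position (equal to $1/\sqrt{\alpha^R_t}$) rapidly decreases in time, and afterwards remains very small. In particular, we have:

\[ \alpha^R_t = \frac{2\lambda}{\omega}\,\frac{\sinh\omega t - \sin\omega t}{\cosh\omega t - \cos\omega t} \simeq \begin{cases} \dfrac{2}{3}\lambda t \simeq (3.99 \times 10^{24}\ \mathrm{m^{-2}\,kg^{-1}\,sec^{-1}})\, m\, t & t \ll \omega^{-1}, \\[2mm] \dfrac{2\lambda}{\omega} \simeq (2.39 \times 10^{29}\ \mathrm{m^{-2}\,kg^{-1}})\, m & t \to +\infty, \end{cases} \tag{2.7} \]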
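The numerical coefficients in (2.7) can be reproduced from the values quoted in the text. The sketch below is only a consistency check: it uses $\lambda_0 \simeq 1.00\times10^{-2}$ m$^{-2}$ sec$^{-1}$, $m_0 \simeq 1.67\times10^{-27}$ kg and $\omega \simeq 5.01\times10^{-5}$ sec$^{-1}$ as given, and the function names are illustrative.

```python
import math

LAMBDA0 = 1.00e-2   # m^-2 s^-1, universal coupling constant quoted in the text
M0 = 1.67e-27       # kg, nucleon mass quoted in the text
OMEGA = 5.01e-5     # s^-1, frequency quoted in the text

def coupling(mass):
    """lambda = lambda0 * m / m0, Eq. (2.1)."""
    return LAMBDA0 * mass / M0

def alpha_real(t, mass):
    """Exact expression for alpha_t^R on the left of (2.7)."""
    x = OMEGA * t
    ratio = (math.sinh(x) - math.sin(x)) / (math.cosh(x) - math.cos(x))
    return 2 * coupling(mass) / OMEGA * ratio

def small_time_coefficient():
    """Coefficient of m*t in the regime t << 1/omega: (2/3) lambda0/m0."""
    return 2 * LAMBDA0 / (3 * M0)

def large_time_coefficient():
    """Coefficient of m in the limit t -> infinity: 2 lambda0/(m0 omega)."""
    return 2 * LAMBDA0 / (M0 * OMEGA)

if __name__ == "__main__":
    print(f"small-time coefficient: {small_time_coefficient():.3e} m^-2 kg^-1 s^-1")
    print(f"large-time coefficient: {large_time_coefficient():.3e} m^-2 kg^-1")
```

The small-time branch follows from $\sinh x - \sin x \simeq x^3/3$ and $\cosh x - \cos x \simeq x^2$ for $x = \omega t \ll 1$, and the large-time branch from the ratio of hyperbolic functions tending to 1.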

with $\omega \simeq 5.01 \times 10^{-5}$ sec$^{-1}$ independent of the mass of the particle.

Let us introduce a length $\ell$, and let us say that a wave function is localized when its spread is smaller than $\ell$. For the sake of definiteness, we take $\ell \simeq 1.00 \times 10^{-7}$ m, corresponding to the width of the collapsing Gaussian of the GRW model. By means of this length, we can define the collapse time $t_1$ as the time when the spread of the $y$-part of the Green's function $G_t(x, y)$ becomes smaller than $\ell$. By using the small time approximation of $\alpha^R_t$ given in (2.7), we can set:

\[ t_1 := \frac{3}{2\lambda\ell^2} \simeq \frac{2.51 \times 10^{-11}\ \mathrm{kg\,sec}}{m}. \tag{2.8} \]
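To get a feeling for the orders of magnitude in (2.8), the collapse time can be evaluated for a few masses. This is a minimal sketch using only the constants quoted in the text; the sample masses are arbitrary illustrations. Since (2.8) relies on the small-time branch of (2.7), the resulting value is meaningful only when it is much smaller than $\omega^{-1}$.

```python
LAMBDA0 = 1.00e-2   # m^-2 s^-1, as quoted in the text
M0 = 1.67e-27       # kg, as quoted in the text
ELL = 1.00e-7       # m, localization length quoted in the text

def collapse_time(mass, ell=ELL):
    """t1 = 3 / (2 lambda ell^2) with lambda = lambda0 * m / m0, Eq. (2.8)."""
    lam = LAMBDA0 * mass / M0
    return 3.0 / (2.0 * lam * ell**2)

if __name__ == "__main__":
    # Illustrative masses in kg (not taken from the paper).
    for mass in (1e-15, 1e-6, 1e-3, 1.0):
        print(f"m = {mass:8.1e} kg  ->  t1 = {collapse_time(mass):.2e} s")
```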

As we see, and as we expect, this time decreases for increasing masses, i.e. for increasing values of $\lambda$, and is very small for macroscopic particles. Let us assume that the initial state $\phi(x)$ is not already localized, and in particular that it does not change appreciably on the scale set by $\ell$; this is a physically reasonable assumption when $\phi$ represents the state of the center of mass of a macroscopic object. In this case, from the time $t_1$ on, the $y$-part of the Green's function $G_t(x, y)$ acts like a Dirac delta on $\phi(x)$, and the solution at time $t$ of the linear equation can be written as follows:

\[ \phi_t(x) \simeq \sqrt{\frac{2\pi}{\alpha_t}}\; K_t \exp\left[-\frac{\tilde\alpha_t}{2}\,x^2 + \tilde a_t\,x + \tilde c_t\right] \phi(Y_t^x). \tag{2.9} \]


This is a Gaussian state whose spread is controlled by $\tilde\alpha_t$, which evolves in time in a way similar to $\alpha_t$; in particular:

\[ \tilde\alpha^R_t = \frac{2\lambda}{\omega}\,\frac{\sinh\omega t + \sin\omega t}{\cosh\omega t + \cos\omega t} \simeq \begin{cases} 2\lambda t \simeq (1.20 \times 10^{25}\ \mathrm{m^{-2}\,kg^{-1}\,sec^{-1}})\, m\, t & t \ll \omega^{-1}, \\[2mm] \dfrac{2\lambda}{\omega} \simeq (2.39 \times 10^{29}\ \mathrm{m^{-2}\,kg^{-1}})\, m & t \to +\infty. \end{cases} \tag{2.10} \]

As we see, the spread $1/\sqrt{\tilde\alpha^R_t}$ is well below $\ell$ for any $t \geq t_1$. We can then conclude that, for times greater than the collapse time, any state initially well spread out in space is mapped into a very well localized wave function.

An important issue is where the wave function collapses to, given that the initial state is spread out in space. We now show that the position of the wave function after the collapse is distributed in very good agreement with the Born probability rule. A reasonable measure of where the wave function is, after it has collapsed, is given by the quantum average of the position operator $\langle q\rangle_t$. Accordingly, the probability for the collapsed wave function to lie within a Borel measurable set $A$ of $\mathbb{R}$ can be simply defined to be $\mathbb{P}^{\mathrm{coll}}_t[A] := \mathbb{P}[\omega : \langle q\rangle_t \in A]$. Though this probability is mathematically well defined for any Borel measurable subset $A$, it is physically meaningful only when $A$ represents an interval $\Delta$ much larger than the spread of the wave function itself, or a union of such intervals. In such a case, as discussed in [49], one can show that:

\[ \mathbb{P}^{\mathrm{coll}}_t[A] \simeq \mathbb{E}_{\mathbb{P}}\left[\|P_\Delta \psi_t\|^2\right] \equiv \int_\Delta p_t(x)\,dx, \tag{2.11} \]

where $P_\Delta(x)$ is the characteristic function of the interval $\Delta$ of the real axis and $p_t(x) = \mathbb{E}_{\mathbb{P}}[|\psi_t(x)|^2]$. The idea behind the approximate equality (2.11) is that when $\psi_t$ lies within $\Delta$, then $P_\Delta\psi_t \simeq \psi_t$, so that $\|P_\Delta\psi_t\|^2$ is almost equal to 1, while when it lies outside $\Delta$, it is practically 0. The critical situations, which require special care, are those when the wave function lies at the edges of $\Delta$. In [39] it has been proven that:

\[ p_t(x) = \sqrt{\frac{\mu_t}{\pi}} \int dy\; e^{-\mu_t y^2}\, p^{\mathrm{Sch}}_t(x + y), \tag{2.12} \]

\[ \mu_t = \frac{3 m m_0}{2\hbar^2\lambda_0 t^3} \simeq (2.27 \times 10^{43}\ \mathrm{m^{-2}\,kg^{-1}\,sec^{3}})\,\frac{m}{t^3}, \tag{2.13} \]

where $p^{\mathrm{Sch}}_t(x) = |\psi^{\mathrm{Sch}}_t(x)|^2$ and $\psi^{\mathrm{Sch}}_t(x)$ is the solution of the standard free-particle Schrödinger equation for the given initial condition $\phi(x)$. For the times we are considering ($t = t_1$), the Gaussian term in (2.12) is much more peaked than any typical quantum probability distribution $p^{\mathrm{Sch}}_t(x)$, and consequently acts like a Dirac delta on it; accordingly, $p_t(x) \simeq p^{\mathrm{Sch}}_t(x)$. Finally, for macroscopic systems and for the


times we are considering, the wave function solution of the free-particle Schrödinger equation does not change appreciably, implying that $p^{\mathrm{Sch}}_t(x) \simeq p^{\mathrm{Sch}}_0(x) = |\phi(x)|^2$, which means precisely that the collapse probability is distributed in agreement with the Born probability rule.

2. The classical regime

After time $t_1$, we are left with a wave function which, when $m$ is the mass of a macroscopic particle, is very well localized in space, almost point-like. This is the way in which collapse models reproduce the particle-like behavior of classical systems within the framework of a wave-like dynamics. The relevant question now is to unfold the time evolution of the position and momentum of the wave function, to see whether it matches Newton's laws.

When the wave function is well localized in space ($t > t_1$), one can reasonably assume that it can be approximated by the Gaussian state to which, as we shall see, it asymptotically converges. We will analyze the time evolution of such a Gaussian state in the following, and we will see that its mean position $\bar x_t$ and momentum $\hbar\bar k_t$ evolve in time as follows (see Eqs. (5.28) and (5.29)):

\[ \bar x_t = \bar x_{t_1} + \frac{\hbar}{m}\,\bar k_{t_1}(t - t_1) + \sqrt{\lambda}\,\frac{\hbar}{m}\int_{t_1}^{t} W_s\,ds + \sqrt{\frac{\hbar}{m}}\,(W_t - W_{t_1}), \tag{2.14} \]

\[ \bar k_t = \bar k_{t_1} + \sqrt{\lambda}\,(W_t - W_{t_1}). \tag{2.15} \]

We can easily recognize in the deterministic parts of the above equations the free-particle equations of motion of classical mechanics, describing a particle moving along a straight line with constant velocity; the remaining terms are the fluctuations around the classical motion, driven by the Brownian motion $W_t$. The important feature of the above equations is that these fluctuations, for macroscopic masses, are very small for very long times. As a matter of fact, if we estimate the Brownian motion fluctuations by setting $W_t \sim \sqrt{t}$, we have for the stochastic terms in Eq. (2.14):

\[ \sqrt{\lambda}\,\frac{\hbar}{m}\int_{t_1}^{t} W_s\,ds \simeq \frac{2}{3}\,\sqrt{\lambda}\,\frac{\hbar}{m}\,t^{3/2} \simeq (1.63 \times 10^{-22}\ \mathrm{m\,kg^{1/2}\,sec^{-3/2}})\,\frac{t^{3/2}}{\sqrt{m}}, \tag{2.16} \]

\[ \sqrt{\frac{\hbar}{m}}\,(W_t - W_{t_1}) \simeq \sqrt{\frac{\hbar t}{m}} \simeq (1.02 \times 10^{-17}\ \mathrm{m\,kg^{1/2}\,sec^{-1/2}})\,\sqrt{\frac{t}{m}}. \tag{2.17} \]

We see that the random fluctuations decrease with the square root of the mass $m$ of the particle, which means that the bigger the system, the more deterministic its motion. This is how collapse models recover classical determinism at the macroscopic level from a fundamentally stochastic theory.

We can introduce a time $t_2$, defined as the time after which the fluctuations become larger than a given length $L$; we can set, e.g., $L \simeq 1.00 \times 10^{-3}$ m. Since the fluctuations


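in (2.16) grow faster than those in (2.17), we can set:

\[ t_2 \simeq \left(\frac{3 L m}{2\hbar\sqrt{\lambda}}\right)^{2/3} \simeq (3.55 \times 10^{12}\ \mathrm{sec\,kg^{-1/3}})\,\sqrt[3]{m} \simeq (1.13 \times 10^{5}\ \mathrm{year\,kg^{-1/3}})\,\sqrt[3]{m}. \tag{2.18} \]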

The time $t_2$ defines the time interval $[t_1, t_2]$ during which the classical regime holds. As we can see, for macroscopic systems this is a very long time, much longer than the time during which a macro-object can be kept isolated from the rest of the universe, so that its dynamics is described by Eq. (1.1). To summarize, during the classical regime, which for macroscopic systems lasts very long, the wave function behaves, for all practical purposes, like a point moving deterministically in space according to Newton's laws. In other words, the wave function reproduces the motion of a classical particle.

3. The diffusive regime

After time $t_2$, two new effects become dominant. First, the wave function converges towards a Gaussian state, as we shall prove. Second, the motion becomes more and more erratic: the dynamics begins to depart from the classical one, showing its intrinsic stochastic nature.

A thorough mathematical analysis of these time regimes and their main properties is still lacking. In this paper, as we have anticipated, we focus only on the long time behavior of the solutions of Eq. (1.1), leaving the study of the remaining properties as open problems for future research.

3. Solution of the Equation

In the first part of this section, we derive the Green's function (1.12) in a way which will make clear the connection between Eq. (1.6) and the equation of the so-called non-self-adjoint (NSA) harmonic oscillator [52–54]. This connection is important for two reasons. From a physical point of view, it will provide deep insight into how the collapse of the wave function actually works. From a mathematical point of view, it will allow us to prove rigorously both Theorems 1.1 and 1.2 presented in the introductory section. A way to connect Eq. (1.6) with that of the NSA harmonic oscillator is to apply suitable transformations to the wave function so as to transform the SDE into a Schrödinger-like equation. We will do this in two steps. We present this section in detail for convenience, although the approach goes back to Kolokoltsov [35].

1. Reduction of Eq. (1.6) to a linear differential equation with random coefficients

The idea is to remove the stochastic differential term $\sqrt{\lambda}\,q\,d\xi_t$ from Eq. (1.6): borrowing the language of quantum mechanics, we shift to a sort of interaction picture by defining a suitable operator which maps the solution of Eq. (1.6) to the solution


of a new equation which does not have that stochastic term. To this end, let us consider the operator $Q_a : D(Q_a) \subseteq L^2(\mathbb{R}) \to L^2(\mathbb{R})$ defined as follows:

\[ Q_a\,\phi(x) = e^{ax}\phi(x), \qquad a \in \mathbb{C}; \tag{3.1} \]

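where $D(Q_a)$ is defined as the set of all $\phi(x) \in L^2(\mathbb{R})$ such that $e^{ax}\phi(x) \in L^2(\mathbb{R})$. It should be noted that, in general, the operator $Q_a$ is unbounded and its domain $D(Q_a)$ is dense in $L^2(\mathbb{R})$ but does not coincide with it. We will settle all technical issues in the second part of the section. We now define the vector:

\[ \phi^{(1)}_t = Q_{-\sqrt{\lambda}\xi_t}\,\phi_t; \tag{3.2} \]

an easy application of Itô calculus shows that $\phi^{(1)}_t$ satisfies the differential equation:

\[ d\phi^{(1)}_t = \left[-\frac{i}{\hbar}\,Q_{-\sqrt{\lambda}\xi_t}\,\frac{p^2}{2m}\,Q^{-1}_{-\sqrt{\lambda}\xi_t} - \lambda q^2\right]\phi^{(1)}_t\,dt, \qquad \phi^{(1)}_0 = \phi. \tag{3.3} \]

The stochastic differential $\sqrt{\lambda}\,q\,d\xi_t$ has disappeared; in turn, the free Hamiltonian $p^2/2m$ has been replaced by the operator $Q_{-\sqrt{\lambda}\xi_t}\,(p^2/2m)\,Q^{-1}_{-\sqrt{\lambda}\xi_t}$ which, due to the specific commutation relations between $q$ and $p$, takes the simple form:

\[ Q_{-\sqrt{\lambda}\xi_t}\,p^2\,Q^{-1}_{-\sqrt{\lambda}\xi_t} = p^2 - 2i\hbar\sqrt{\lambda}\,\xi_t\,p - \lambda\hbar^2\xi_t^2. \tag{3.4} \]

Equation (3.3) can then be rewritten as follows:

\[ i\hbar\,\frac{d}{dt}\phi^{(1)}_t = \left[\frac{p^2}{2m} - \frac{i\hbar}{m}\sqrt{\lambda}\,\xi_t\,p - i\hbar\lambda q^2 - \frac{\lambda\hbar^2}{2m}\,\xi_t^2\right]\phi^{(1)}_t. \tag{3.5} \]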
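The conjugation identity (3.4) can be spot-checked numerically. The sketch below works in units with $\hbar = 1$ (an assumption made only for this illustration), takes $p = -i\hbar\,d/dx$, writes $a$ for the real number $\sqrt{\lambda}\,\xi_t$ at a fixed time, and applies both sides of (3.4) to a Gaussian test function using central finite differences. With real $a$ and a real test function both sides are real, because $-2i\hbar a\,p f = -2\hbar^2 a f'$. The helper names are illustrative.

```python
import math

HBAR = 1.0  # units with hbar = 1, chosen only for this illustration

def gaussian(x):
    """Real test function f(x) = exp(-x^2/2)."""
    return math.exp(-x * x / 2)

def d1(f, x, h=1e-4):
    """Central finite-difference first derivative."""
    return (f(x + h) - f(x - h)) / (2 * h)

def d2(f, x, h=1e-4):
    """Central finite-difference second derivative."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

def conjugated_p_squared(f, x, a):
    """Left-hand side of (3.4) applied to f: e^{-ax} p^2 (e^{ax} f)(x)."""
    g = lambda u: math.exp(a * u) * f(u)
    return math.exp(-a * x) * (-HBAR**2) * d2(g, x)

def shifted_p_squared(f, x, a):
    """Right-hand side of (3.4): (p^2 - 2 i hbar a p - hbar^2 a^2) f(x)."""
    return (-HBAR**2 * d2(f, x)
            - 2 * HBAR**2 * a * d1(f, x)
            - HBAR**2 * a * a * f(x))

if __name__ == "__main__":
    for x, a in ((0.3, 0.7), (-1.2, 0.4), (0.0, -1.5)):
        gap = conjugated_p_squared(gaussian, x, a) - shifted_p_squared(gaussian, x, a)
        print(f"x = {x:5.2f}, a = {a:5.2f}: mismatch = {abs(gap):.2e}")
```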

Equation (3.5) is a standard differential equation with random coefficients; note that the operator on the right-hand side is not self-adjoint, due to the presence of the second and third terms. The last term of Eq. (3.5) is a multiple of the identity operator and can be removed by defining:

\[ \phi^{(2)}_t = \exp\left[-\frac{i\lambda\hbar}{2m}\int_0^t \xi_s^2\,ds\right]\phi^{(1)}_t; \tag{3.6} \]

we then obtain:

\[ i\hbar\,\frac{d}{dt}\phi^{(2)}_t = \left[\frac{p^2}{2m} - i\hbar\lambda q^2 - \frac{i\hbar}{m}\sqrt{\lambda}\,\xi_t\,p\right]\phi^{(2)}_t. \tag{3.7} \]

The third term on the right-hand side contains a time dependent coefficient, and the next step aims at removing it.

2. Reduction of Eq. (3.7) to a differential equation with constant coefficients

The idea we now follow is to perform a transformation similar to a boost. We introduce the operator $P_a : D(P_a) \subseteq L^2(\mathbb{R}) \to L^2(\mathbb{R})$ defined as:

\[ P_{ia/\hbar}\,\phi(x) = \phi(x + a), \qquad a \in \mathbb{C}, \tag{3.8} \]


where $D(P_a)$ is the set of all $\phi(x) \in L^2(\mathbb{R})$ which can be analytically continued to the line $x + a$ in the complex plane $\mathbb{C}$, and such that $\phi(x + a) \in L^2(\mathbb{R})$. Like $Q_a$, $P_a$ is in general an unbounded operator, and its domain $D(P_a)$, though dense, does not coincide with $L^2(\mathbb{R})$; we will come back to this point later in this section. We define the operator:

\[ V_t = \exp(-i a_t/\hbar)\,P_{i b_t/\hbar}\,Q_{-i c_t/\hbar}, \tag{3.9} \]

where the coefficients $a_t$, $b_t$ and $c_t$, yet to be determined, will turn out to be complex random functions of time. One can easily verify that:

\[ V_t\,q\,V_t^{-1} = q + b_t, \tag{3.10} \]

\[ V_t\,p\,V_t^{-1} = p + c_t, \tag{3.11} \]

and similarly for higher powers of $q$ and $p$. Let us define the vector:

\[ \varphi_t = V_t\,\phi^{(2)}_t, \tag{3.12} \]

which solves the equation:

\[ i\hbar\,\frac{d}{dt}\varphi_t = \left[\frac{p^2}{2m} - i\hbar\lambda q^2 - \left(\dot b_t - \frac{1}{m}c_t + \frac{i\hbar}{m}\sqrt{\lambda}\,\xi_t\right)p + (\dot c_t - 2i\hbar\lambda b_t)\,q + \left(\dot a_t + \dot c_t b_t + \frac{1}{2m}c_t^2 - \frac{i\hbar}{m}\sqrt{\lambda}\,\xi_t c_t - i\hbar\lambda b_t^2\right)\right]\varphi_t. \tag{3.13} \]

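The time-dependent part of the equation can be removed by requiring that $a_t$, $b_t$ and $c_t$ satisfy the first-order differential equations:

\[ \begin{cases} m\dot b_t - c_t = -i\hbar\sqrt{\lambda}\,\xi_t, & b_0 = 0, \\ \dot c_t - 2i\hbar\lambda b_t = 0, & c_0 = 0, \end{cases} \tag{3.14} \]

and

\[ \dot a_t + i\hbar\lambda b_t^2 + \frac{1}{2m}c_t^2 - \frac{i\hbar}{m}\sqrt{\lambda}\,\xi_t c_t = 0, \qquad a_0 = 0. \tag{3.15} \]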

The first two equations form a non-homogeneous linear system of first-order differential equations, which has a unique $\mathbb{Q}$-a.s. continuous random solution; the third equation instead determines the global factor $a_t$, which is also random. With such a choice for the three parameters, Eq. (3.13) becomes:

\[ i\hbar\,\frac{d}{dt}\varphi_t = \left[\frac{p^2}{2m} - i\hbar\lambda q^2\right]\varphi_t, \qquad \varphi_0 = \phi, \tag{3.16} \]

which is the equation of the so-called non-self-adjoint (NSA) harmonic oscillator, whose solution and most important properties are well known. Before continuing, we note that in the case of a more general Hamiltonian $H = p^2/2m + V(q)$ appearing in Eq. (1.1) in place of just the free evolution $p^2/2m$, the potential $V(q)$ would have


been transformed, when going from Eq. (3.7) to Eq. (3.16), according to the rule $V_t V(q) V_t^{-1} = V(q + b_t)$; in this case, we would not be able to remove the time-dependent terms from the equation completely, and we would not be able to reduce the original equation to one whose solution is known. However, besides the free particle case, all equations containing terms at most quadratic in $q$ and $p$ (among them, the important case of the harmonic oscillator) can be solved in a similar way.

The solution of Eq. (3.16) admits a representation in terms of the Green's function, also known as Mehler's formula:

\[ G^{\mathrm{NSA}}_t(x, y) = \sqrt{\frac{\lambda}{\upsilon\pi\sinh\upsilon t}}\;\exp\left[-\frac{\lambda}{\upsilon}\,(x^2 + y^2)\coth\upsilon t + 2\,\frac{\lambda}{\upsilon}\,\frac{xy}{\sinh\upsilon t}\right], \tag{3.17} \]

with $\upsilon$ and $\omega$ defined as in (1.19). In this way, we have established the link between the solutions of the SDE (1.6) and those of the equation for the NSA harmonic oscillator (3.16), which we summarize in the following lemma, whose proof is straightforward.

Lemma 3.1. Let $T^{\mathrm{NSA}}_t$ be the evolution operator represented by the Green's function $G^{\mathrm{NSA}}_t(x, y)$ and $T_t$ the one represented by $G_t(x, y)$; then:

\[ T_t \equiv \exp(i\vartheta_t/\hbar)\,Q_{\sqrt{\lambda}\xi_t + (i c_t/\hbar)}\,P_{-i b_t/\hbar}\,T^{\mathrm{NSA}}_t, \tag{3.18} \]

where the two random functions $b_t$ and $c_t$ solve the linear system (3.14), and $\vartheta_t$, which includes all global (i.e. independent of $x$) phase factors, solves the equation:

\[ \dot\vartheta_t = -i\hbar\lambda b_t^2 - \frac{1}{2m}c_t^2 + \frac{i\hbar}{m}\sqrt{\lambda}\,\xi_t c_t + \frac{\lambda\hbar^2}{2m}\,\xi_t^2, \qquad \vartheta_0 = 0. \tag{3.19} \]

We now proceed to prove in which sense $\phi_t := T_t\phi$ is the topological strong solution of Eq. (1.6) for the given initial condition $\phi$. We first need to establish some properties of the Green's function $G^{\mathrm{NSA}}_t(x, y)$ which will be necessary for the subsequent theorem.

Lemma 3.2. The absolute value of $G^{\mathrm{NSA}}_t(x, y)$ is equal to:

\[ |G^{\mathrm{NSA}}_t(x, y)| = \sqrt{\frac{2\lambda}{\pi\omega\sqrt{\cosh\omega t - \cos\omega t}}}\;\exp\left[-\frac{\lambda}{\omega}\,(x^2 + y^2)\,p_t + 4\,\frac{\lambda}{\omega}\,xy\,q_t\right], \tag{3.20} \]

where we have introduced the following quantities:

\[ p_t = \frac{\sinh\omega t - \sin\omega t}{\cosh\omega t - \cos\omega t}, \tag{3.21} \]

\[ q_t = \frac{\sinh(\omega t/2)\cos(\omega t/2) - \cosh(\omega t/2)\sin(\omega t/2)}{\cosh\omega t - \cos\omega t}; \tag{3.22} \]
February 11, 2010 10:1 WSPC/148-RMP

70

J070-S0129055X10003886

A. Bassi, D. D¨ urr & M. Kolb

note that the function pt is positive for any t > 0. The integral of |GNSA (x, y)|2 t with respect to y is equal to:

λ p2t − 4qt2 2 2λ NSA 2 exp −2 dy|Gt (x, y)| = x . (3.23) πω(sinh ωt − sin ωt) ω pt A simple calculation shows that p2t − 4qt2 > 0 for any t > 0; this means that (x, ·), taken as a function of y, belongs to L2 (R) for any x ∈ R and t > 0; GNSA t moreover : (x, ·)2 < +∞ for any t > 0. (3.24) dxGNSA t Finally, the following expression holds true:

λ p2t − 4qt2 2 2λ bx NSA 2 exp −2 dy|e Gt (x + a, y)| = x πω(sinh ωt − sin ωt) ω pt qt (qt aR + q¯t aI ) + 2 pt aR + p¯t aI − 4 + 2bR x pt (qt aR + q¯t aI )2 + pt (a2R − a2I ) + 2¯ p t aR aI − 4 , pt (3.25) with p¯t =

sinh ωt + sin ωt , cosh ωt − cos ωt

(3.26)

q¯t =

sinh ωt/2 cos ωt/2 + cosh ωt/2 sin ωt/2 . cosh ωt − cos ωt

(3.27)
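The closed form (3.20) can be compared numerically with the modulus of Mehler's formula (3.17). The sketch below assumes $\upsilon = \frac{1+i}{2}\,\omega$, which is how the definition (1.19), not restated in this section, is read here; it also checks the sign of $p_t^2 - 4q_t^2$ on a few sample times. Parameter values and function names are illustrative.

```python
import cmath
import math

def mehler_kernel(x, y, t, lam, omega):
    """G^NSA_t(x, y) from (3.17), assuming upsilon = (1 + i) * omega / 2."""
    ups = (1 + 1j) * omega / 2
    prefactor = cmath.sqrt(lam / (ups * math.pi * cmath.sinh(ups * t)))
    exponent = (-(lam / ups) * (x * x + y * y) / cmath.tanh(ups * t)
                + 2 * (lam / ups) * x * y / cmath.sinh(ups * t))
    return prefactor * cmath.exp(exponent)

def p_and_q(t, omega):
    """The quantities p_t and q_t of (3.21) and (3.22)."""
    wt, half = omega * t, omega * t / 2
    denom = math.cosh(wt) - math.cos(wt)
    p = (math.sinh(wt) - math.sin(wt)) / denom
    q = (math.sinh(half) * math.cos(half)
         - math.cosh(half) * math.sin(half)) / denom
    return p, q

def modulus_closed_form(x, y, t, lam, omega):
    """Right-hand side of (3.20)."""
    p, q = p_and_q(t, omega)
    denom = math.cosh(omega * t) - math.cos(omega * t)
    prefactor = math.sqrt(2 * lam / (math.pi * omega * math.sqrt(denom)))
    return prefactor * math.exp(-(lam / omega) * (x * x + y * y) * p
                                + 4 * (lam / omega) * x * y * q)

if __name__ == "__main__":
    lam, omega = 1.3, 0.9  # arbitrary illustrative values
    for x, y, t in ((0.4, -0.7, 0.5), (1.1, 0.3, 2.0), (-0.6, -0.2, 6.0)):
        direct = abs(mehler_kernel(x, y, t, lam, omega))
        closed = modulus_closed_form(x, y, t, lam, omega)
        print(f"t = {t}: |G| = {direct:.6e}, closed form = {closed:.6e}")
```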

The above formulas imply that, for any $a, b \in \mathbb{C}$, for any $x \in \mathbb{R}$ and for any $t > 0$, the function $e^{bx}G^{\mathrm{NSA}}_t(x + a, \cdot)$ belongs to $L^2(\mathbb{R})$ and:

\[ \int dx\,\|e^{bx}G^{\mathrm{NSA}}_t(x + a, \cdot)\|^2 < +\infty. \tag{3.28} \]

We are now in a position to state and prove the main theorem of this section.

Theorem 3.1. Let $P_a$ and $Q_a$ be defined, respectively, as in (3.8) and (3.1); let $b_t$ and $c_t$ solve the linear system (3.14) and $\vartheta_t$ be the solution of Eq. (3.19). Finally, let $\phi_t = T_t\phi$, with $\phi \in L^2(\mathbb{R})$ and $T_t$ defined as in (3.18). Then the following three statements hold true with probability 1:

(1) $T_t : L^2(\mathbb{R}) \to L^2(\mathbb{R})$ defines a bounded operator for every $t > 0$. (3.29)

(2) $\phi \in L^2_B(\mathbb{R}) \Rightarrow \phi_t$ is a topological strong solution of Eq. (1.6). (3.30)

(3) $\phi \in L^2(\mathbb{R}) \Rightarrow \lim_{t\to 0}\|\phi_t - \phi\| = 0$. (3.31)


Proof. Statement 1. Let $\phi$ belong to $L^2(\mathbb{R})$; since $G^{\mathrm{NSA}}_t(x, \cdot)$ also belongs to $L^2(\mathbb{R})$ for any $x \in \mathbb{R}$ and $t > 0$, Hölder's inequality implies that $G^{\mathrm{NSA}}_t(x, \cdot)\phi$ belongs to $L^1(\mathbb{R})$; accordingly, the operator $T^{\mathrm{NSA}}_t$ is well defined for any $t > 0$, and maps any $L^2(\mathbb{R})$-function into a measurable function. By using the Schwarz inequality together with relation (3.24), we have:

\[ \int dx\left|\int dy\,G^{\mathrm{NSA}}_t(x, y)\,\phi(y)\right|^2 \leq \|\phi\|^2\int dx\,\|G^{\mathrm{NSA}}_t(x, \cdot)\|^2 < +\infty; \tag{3.32} \]

thus $T^{\mathrm{NSA}}_t\phi$ belongs to $L^2(\mathbb{R})$ for any $\phi$ in $L^2(\mathbb{R})$ and for any $t > 0$.

In a similar way, since $G^{\mathrm{NSA}}_t(x + a, \cdot)$ also belongs to $L^2(\mathbb{R})$ for any $a \in \mathbb{C}$ and because of (3.28), one proves that $P_a T^{\mathrm{NSA}}_t\phi$ belongs to $L^2(\mathbb{R})$ for any $\phi \in L^2(\mathbb{R})$, for any complex $a$ and for any $t > 0$, i.e. that $D(P_a)$ contains $R(T^{\mathrm{NSA}}_t)$. Using once more the same inequalities and (3.28), one shows also that $Q_b P_a T^{\mathrm{NSA}}_t\phi$ belongs to $L^2(\mathbb{R})$ for any $\phi$ in $L^2(\mathbb{R})$, for any $a, b \in \mathbb{C}$ and $t > 0$.

Remark. Actually a stronger statement is true, as can be readily seen from the Gaussian form of the Green's function $G_t$ of the operator $T_t$: for positive $t$, it maps $L^2(\mathbb{R})$ to the Schwartz space $\mathcal{S}(\mathbb{R})$. We shall need this information in the proof of Statement 3.

Statement 2. Let us consider the vector $\varphi_t := T^{\mathrm{NSA}}_t\phi$, with $\phi \in L^2_B(\mathbb{R})$. By construction, $\varphi_t$ solves Eq. (3.16), once one proves that the integration

\[ \int dy\,G^{\mathrm{NSA}}_t(x, y)\,\phi(y) \tag{3.33} \]

can be exchanged with the first and second partial derivatives with respect to $x$ and with the first partial derivative with respect to $t$. We note that the function $G^{\mathrm{NSA}}_t(x, y)\phi(y)$ satisfies the following two properties: (i) the function $y \mapsto G^{\mathrm{NSA}}_t(x, y)\phi(y)$ is measurable and integrable on $\mathbb{R}$ for any $t > 0$ and for any $x \in \mathbb{R}$; (ii) the first and second partial derivatives with respect to $x$ and the first partial derivative with respect to $t$ exist for any $t > 0$, $x \in \mathbb{R}$ and $y \in \mathbb{R}$, and can be bounded uniformly with respect to $t$ and $x$. Accordingly, one can apply, e.g., [50, Theorem 12.13, p. 199] to conclude that the operations of integration and differentiation can be exchanged.

Having proved that $\varphi_t$ solves Eq. (3.16), a direct application of Itô calculus proves that $\phi_t$, defined as in (3.18), is a topological strong solution of Eq. (1.6).

Statement 3. Let $\phi = \phi_0 \in C_c^\infty(\mathbb{R})$ be given. Since $\phi_t$ solves Eq. (1.6) in a strong sense, it also solves the SDE in a weak sense; hence, using, e.g., [31, Eq. (1.1)], one has:

\[ \lim_{t\to 0}\langle\varphi|\phi_t\rangle = \langle\varphi|\phi_0\rangle \qquad \forall\,\varphi \in C_c^\infty(\mathbb{R}). \tag{3.34} \]

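We extend (3.34) to the general case of $\varphi \in L^2(\mathbb{R})$. Since $C_c^\infty(\mathbb{R})$ is dense in $L^2(\mathbb{R})$, for any $\varphi \in L^2(\mathbb{R})$ there exists a sequence $\{\varphi_n \in C_c^\infty(\mathbb{R}),\ n \in \mathbb{N}\}$ which converges to $\varphi$. By the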


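triangle and Schwarz inequalities we get

\[ |\langle\varphi|\phi_t\rangle - \langle\varphi|\phi_0\rangle| \leq |\langle\varphi_n|\phi_t\rangle - \langle\varphi_n|\phi_0\rangle| + \|\varphi - \varphi_n\|\,\|\phi_t\| + \|\varphi - \varphi_n\|\,\|\phi_0\|. \tag{3.35} \]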

The first term on the right-hand side can be made arbitrarily small because of (3.34); the second and third terms can also be made arbitrarily small by choosing $n$ sufficiently large, while $\|\phi_t\|$ can be bounded as it converges to $\|\phi_0\|$ for $t \to 0$, due to Eq. (1.9). This proves that:

\[ \lim_{t\to 0}\langle\varphi|\phi_t\rangle = \langle\varphi|\phi_0\rangle \qquad \forall\,\varphi \in L^2(\mathbb{R}). \tag{3.36} \]

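Statement 3 for test functions $\phi \in C_c^\infty(\mathbb{R})$ now follows directly from Eq. (1.9), Eq. (3.36) and the observation that $\|\phi_t - \phi_0\|^2 = \|\phi_t\|^2 + \|\phi_0\|^2 - 2\langle\phi_0|\phi_t\rangle^R$. It remains to extend the strong continuity of $T_t$ from the subspace $C_c^\infty(\mathbb{R})$ to $L^2(\mathbb{R})$. For this, observe that for $\phi \in C_c^\infty(\mathbb{R})$, $(\|\phi_t\|^2)_{t\geq 0}$ defines a stochastic process with continuous paths, and by Holevo's result (cf. Eq. (1.9)) it is a martingale. For given $f \in L^2(\mathbb{R})$, choose a sequence $(\varphi^n)_{n\in\mathbb{N}} \subset C_c^\infty(\mathbb{R})$ which converges to $f$ in $L^2(\mathbb{R})$. Doob's inequality for submartingales implies that for all $n, m \in \mathbb{N}$, $T > 0$ and $\lambda > 0$

\[ \mathbb{Q}\left[\sup_{0\leq t\leq T}\left|\|\varphi^n_t\|^2 - \|\varphi^m_t\|^2\right| > \lambda\right] \leq \frac{1}{\lambda}\,\mathbb{E}_{\mathbb{Q}}\left[\left|\|\varphi^n_T\|^2 - \|\varphi^m_T\|^2\right|\right]. \tag{3.37} \]

We now show that

\[ \lim_{n,m\to\infty}\mathbb{E}_{\mathbb{Q}}\left[\left|\|\varphi^n_T\|^2 - \|\varphi^m_T\|^2\right|\right] = 0. \tag{3.38} \]

The elementary inequality

\[ \left|\|\varphi^n_t\|^2 - \|\varphi^m_t\|^2\right| \leq (\|\varphi^n_t\| + \|\varphi^m_t\|)\,\|\varphi^n_t - \varphi^m_t\| \]

implies that

\[ \begin{aligned} \mathbb{E}_{\mathbb{Q}}\left[\left|\|\varphi^n_t\|^2 - \|\varphi^m_t\|^2\right|\right] &\leq \mathbb{E}_{\mathbb{Q}}\left[(\|\varphi^n_t\| + \|\varphi^m_t\|)\,\|\varphi^n_t - \varphi^m_t\|\right] \\ &\leq \left(\mathbb{E}_{\mathbb{Q}}\left[(\|\varphi^n_t\| + \|\varphi^m_t\|)^2\right]\right)^{1/2}\left(\mathbb{E}_{\mathbb{Q}}\left[\|\varphi^n_t - \varphi^m_t\|^2\right]\right)^{1/2} \\ &\leq \sqrt{2}\left(\mathbb{E}_{\mathbb{Q}}[\|\varphi^n_t\|^2] + \mathbb{E}_{\mathbb{Q}}[\|\varphi^m_t\|^2]\right)^{1/2}\|\varphi^n - \varphi^m\| \\ &= \sqrt{2}\left(\|\varphi^n\|^2 + \|\varphi^m\|^2\right)^{1/2}\|\varphi^n - \varphi^m\|. \end{aligned} \]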

The right-hand side converges to 0 as $n, m \to \infty$. Therefore, the sequence of stochastic processes $(\|\varphi^n_t\|^2)_{t\geq 0}$ is a Cauchy sequence in the complete metric space $(D, d)$ of adapted processes with right continuous paths having left limits, where the metric $d$ is defined as (see [51, pp. 56–57] for background concerning this topology)

\[ d(X, Y) = \sum_{n=1}^{\infty}\frac{1}{2^n}\,\mathbb{E}_{\mathbb{Q}}\left[\min\left(1, \sup_{0\leq s\leq n}|(X - Y)_s|\right)\right] \qquad (X, Y \in D). \]


Therefore $(\|\varphi^n_t\|^2)_{t\geq 0}$ converges locally uniformly in probability to a stochastic process. This stochastic process again has to be continuous almost surely, since a subsequence of $(\|\varphi^n_t\|^2)_{t\geq 0}$ converges locally uniformly with probability one. Since $\lim_{n\to\infty}\|\varphi^n_t\|^2 = \|f_t\|^2$ almost surely, we know that $[0, \infty) \ni t \mapsto \|f_t\|^2$ is continuous, in particular $\lim_{t\to 0}\|f_t\| = \|f\|$ almost surely, and by the lemma of Fatou it defines a positive continuous supermartingale. Therefore, it has a unique decomposition $\|f_t\|^2 = M_t - A_t$, where $(M_t)_{t\geq 0}$ is a continuous martingale and $(A_t)_{t\geq 0}$ is an increasing process. In fact, as we shall show now, the increasing process is identically 0, i.e. $(\|f_t\|^2)_{t\geq 0}$ is a positive martingale for every $f \in L^2(\mathbb{R})$. To see this, recall from the remark above that for positive $\varepsilon$ the function $f_\varepsilon$ almost surely belongs to the Schwartz space and in particular to the domain of the generator. By Holevo's result cited above, $(\|T_{t-\varepsilon}f_\varepsilon\|^2)_{t\geq\varepsilon}$ is a continuous martingale. Therefore, $A_t = 0$ for $t > 0$ and hence it equals 0 almost surely.

In order to ensure strong convergence $\lim_{t\to 0}\|f_t - f\| = 0$ we need only show that weak convergence holds, i.e. $\lim_{t\to 0}\langle\phi|f_t\rangle = \langle\phi|f\rangle$. Observing

\[ |\langle\phi|f_t\rangle - \langle\phi|f\rangle| \leq |\langle\phi|f_t\rangle - \langle\phi|\varphi^n_t\rangle| + |\langle\phi|\varphi^n_t\rangle - \langle\phi|\varphi^n\rangle| + |\langle\phi|\varphi^n\rangle - \langle\phi|f\rangle|, \]

it suffices to show that for some $T > 0$, $\lim_{n\to\infty}\sup_{t\leq T}|\langle\phi|f_t\rangle - \langle\phi|\varphi^n_t\rangle| = 0$. But $\sup_{t\leq T}|\langle\phi|f_t\rangle - \langle\phi|\varphi^n_t\rangle| \leq \|\phi\|\sup_{t\leq T}\|f_t - \varphi^n_t\|$. Therefore, we need only establish that $\lim_{n\to\infty}\sup_{t\leq T}\|f_t - \varphi^n_t\| = 0$. This is done by an argument similar to the one above, namely we show that for every $\varepsilon > 0$

\[ \lim_{n\to\infty}\mathbb{Q}\left[\sup_{t\leq T}\|f_t - \varphi^n_t\|^2 > \varepsilon\right] = 0, \]

because then there exists a subsequence which is almost surely convergent to 0. But as we showed above, $(\|g_t\|^2)_{t\geq 0}$ is a martingale for every $g \in L^2(\mathbb{R})$. Hence $(\|f_t - \varphi^n_t\|^2)_{t\geq 0}$ is a martingale and we can again apply Doob's inequality as we did before.

Remark 1. The Gaussian form of the Green's function (1.12) is a consequence of the fact that Eq. (1.6) contains terms which are at most quadratic in $q$ and $p$. This in particular implies that the dynamics preserves the shape of initially Gaussian wave functions; in fact, as shown e.g. in [30, 34, 35, 39], a state

\[ \phi_t(x) = \exp\left[-\sigma_t(x - x^m_t)^2 + i k^m_t x + \varsigma_t\right], \tag{3.39} \]

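is a solution of Eq. (1.6) provided that the two real parameters $x^m_t$, $k^m_t$ and the two complex parameters $\sigma_t$, $\varsigma_t$ satisfy the following stochastic differential equations:

\[ d\sigma_t = \left[\lambda - \frac{2i\hbar}{m}\,(\sigma_t)^2\right]dt, \tag{3.40} \]

\[ dx^m_t = \frac{\hbar}{m}\,k^m_t\,dt + \frac{\sqrt{\lambda}}{2\sigma^R_t}\left[d\xi_t - 2\sqrt{\lambda}\,x^m_t\,dt\right], \tag{3.41} \]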
February 11, 2010 10:1 WSPC/148-RMP

74

J070-S0129055X10003886

A. Bassi, D. D¨ urr & M. Kolb

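\[ dk^m_t = -\sqrt{\lambda}\,\frac{\sigma^I_t}{\sigma^R_t}\left[d\xi_t - 2\sqrt{\lambda}\,x^m_t\,dt\right], \tag{3.42} \]

\[ d\varsigma^R_t = \left[\lambda(x^m_t)^2 + \frac{\hbar}{m}\,\sigma^I_t + \frac{\lambda}{4\sigma^R_t}\right]dt + \sqrt{\lambda}\,x^m_t\left[d\xi_t - 2\sqrt{\lambda}\,x^m_t\,dt\right], \tag{3.43} \]

\[ d\varsigma^I_t = -\left[\frac{\hbar}{2m}\,(k^m_t)^2 + \frac{\hbar}{m}\,\sigma^R_t - \frac{\lambda\sigma^I_t}{4(\sigma^R_t)^2}\right]dt + \sqrt{\lambda}\,\frac{\sigma^I_t}{\sigma^R_t}\,x^m_t\left[d\xi_t - 2\sqrt{\lambda}\,x^m_t\,dt\right]. \tag{3.44} \]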
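Equation (3.40) is a deterministic Riccati equation, and the closed form $\sigma_t = (\lambda/\upsilon)\coth(\upsilon t + \kappa)$ quoted in the text for its solution can be checked by finite differences. The sketch below assumes that (3.40) reads $\dot\sigma_t = \lambda - (2i\hbar/m)\sigma_t^2$ and that $\upsilon^2 = 2i\hbar\lambda/m$, which is how the definition (1.19), not restated in this section, is read here. It works in illustrative units $\hbar = m = \lambda = 1$, and the value of $\kappa$ is arbitrary.

```python
import cmath

HBAR, MASS, LAM = 1.0, 1.0, 1.0                  # illustrative units only
UPSILON = cmath.sqrt(2j * HBAR * LAM / MASS)     # assumed: upsilon^2 = 2 i hbar lambda / m
KAPPA = 0.3 + 0.2j                               # arbitrary choice fixing the initial condition

def sigma(t):
    """Candidate solution sigma_t = (lambda/upsilon) coth(upsilon t + kappa)."""
    return LAM / UPSILON / cmath.tanh(UPSILON * t + KAPPA)

def riccati_rhs(s):
    """Drift of (3.40): lambda - (2 i hbar / m) sigma^2."""
    return LAM - 2j * HBAR / MASS * s * s

def riccati_residual(t, h=1e-5):
    """|d sigma/dt - rhs| with a central finite-difference derivative."""
    derivative = (sigma(t + h) - sigma(t - h)) / (2 * h)
    return abs(derivative - riccati_rhs(sigma(t)))

if __name__ == "__main__":
    for t in (0.1, 0.5, 2.0):
        print(f"t = {t}: residual = {riccati_residual(t):.2e}")
```

Under these assumptions the check works because $\frac{d}{dt}\coth(\upsilon t + \kappa) = -\upsilon/\sinh^2(\upsilon t + \kappa)$ and $1 - \coth^2 = -1/\sinh^2$.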

In particular, the solution of Eq. (3.40) is $\sigma_t = (\lambda/\upsilon)\coth(\upsilon t + \kappa)$, where $\kappa$ sets the initial condition. These results will be useful in the subsequent analysis.

4. Representation of the Solution in Terms of Eigenstates of the NSA Harmonic Oscillator

We now turn to the problem of analyzing the long time behavior of the solution of the (norm-preserving) nonlinear Eq. (1.1). The representation of the solution $\phi_t$ of Eq. (1.6) in terms of the Green's function (1.12) is not suitable for controlling the long time behavior; it turns out to be more convenient to express $\phi_t$ in terms of the eigenstates of the NSA harmonic oscillator, resorting to the connection which we previously established between Eqs. (1.6) and (3.16). In this way, as we shall see, the collapse process will be manifest: the coefficients of the superposition will decrease exponentially in time, and the higher the associated eigenstate, the faster the damping. Accordingly, when normalization is also taken into account, only the ground state, which has a Gaussian shape, survives in the large time limit.

We first recall a few basic features of the Hamiltonian of the NSA harmonic oscillator,

\[ H \equiv \frac{p^2}{2m} - i\hbar\lambda q^2, \tag{4.1} \]

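which has been studied in particular by Davies in a series of papers [52, 53] and reviewed in his recent book [54]. The eigenvalues of $H$ are complex and equal to:

\[ \lambda_n \equiv \frac{1 - i}{2}\,\hbar\omega_n, \qquad \omega_n \equiv \left(n + \frac{1}{2}\right)\omega, \tag{4.2} \]

and the corresponding eigenvectors are:

\[ \phi^{(n)}(x) \equiv \sqrt{z}\;e^{-z^2x^2/2}\,\bar H_n(zx), \qquad z^2 \equiv (1 - i)\sqrt{\frac{\lambda m}{\hbar}}, \tag{4.3} \]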

¯ n (x) is the normalized Hermite polynomial of degree n. Since the argument where H ¯ n in (4.3) is complex, these eigenstates are not orthogonal; it can be shown of H that they are linearly independent and form a complete set, however they do not form a basis. As such, they cannot directly used to expand an initial state into a superposition of the eigenstates of H. This problem can be circumvented in the following way, also discussed by Davies.
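The non-orthogonality of the eigenstates (4.3) can be made concrete with a small numerical experiment. The sketch below is illustrative only: it assumes units $\hbar = m = \lambda = 1$ (so that $z^2 = 1 - i$), and the helper names `hermite`, `phi` and `integrate` are hypothetical. It compares the ordinary Hermitian overlap of $\phi^{(0)}$ and $\phi^{(2)}$ with the bilinear pairing $\int \phi^{(n)}\phi^{(m)}\,dx$, taken without complex conjugation.

```python
import cmath
import math

# Assumed units (not from the paper): hbar = m = lam = 1, so z^2 = (1 - i).
Z = cmath.sqrt(1 - 1j)


def hermite(n, u):
    """Physicists' Hermite polynomial H_n(u) via the three-term recurrence."""
    h_prev, h = 1.0 + 0j, 2 * u
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2 * u * h - 2 * k * h_prev
    return h


def phi(n, x):
    """Eigenfunction (4.3): sqrt(z) exp(-z^2 x^2 / 2) Hbar_n(z x)."""
    norm = math.sqrt(math.sqrt(math.pi) * 2 ** n * math.factorial(n))
    return cmath.sqrt(Z) * cmath.exp(-(Z * x) ** 2 / 2) * hermite(n, Z * x) / norm


def integrate(f, half_width=10.0, steps=20000):
    """Trapezoidal rule on [-half_width, half_width]."""
    dx = 2 * half_width / steps
    total = 0.5 * (f(-half_width) + f(half_width))
    for j in range(1, steps):
        total += f(-half_width + j * dx)
    return total * dx


bilinear_00 = integrate(lambda x: phi(0, x) * phi(0, x))              # pairing without conjugation
bilinear_02 = integrate(lambda x: phi(0, x) * phi(2, x))              # pairing without conjugation
hermitian_02 = integrate(lambda x: phi(0, x).conjugate() * phi(2, x))  # usual L^2 inner product
```

Under these assumptions the bilinear pairing behaves like a Kronecker delta, whereas the Hermitian overlap of $\phi^{(0)}$ and $\phi^{(2)}$ is clearly nonzero, which is why an expansion in these states needs non-orthogonal projections.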


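It is easy to see that the sequences $\{\phi^{(n)}\}$ and $\{\overline{\phi^{(n)}}\}$ form a bi-orthonormal system; one then defines the (non-orthogonal) projection operators:

\[ P_n\phi \equiv \langle \phi | \phi^{(n)} \rangle\,\phi^{(n)} = \alpha_n\,\phi^{(n)}, \tag{4.4} \]

which satisfy the relations:

\[ \|P_n\| = \|\phi^{(n)}\|^2, \qquad P_n P_m = \delta_{n,m}P_n, \qquad \text{and} \qquad \lim_{n\to+\infty}\frac{\ln\|P_n\|}{n} = 2c, \tag{4.5} \]

where $c$ is an appropriate constant [54]. As we see, although the states $\phi^{(n)}$ are normalized, in the sense that

\[ \int_{-\infty}^{+\infty} \phi^{(n)}(x)\,\phi^{(m)}(x)\,dx = \delta_{n,m}, \tag{4.6} \]

the norm of the projection operators $P_n$ grows exponentially as $n \to +\infty$. Finally, the following equality holds true [54]:

\[ T_t^{\mathrm{NSA}} = \sum_{n=0}^{\infty} e^{-(1+i)\omega_n t/2}\,P_n \qquad \text{for } t > 4c/\omega. \tag{4.7} \]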

A remarkable property of the above representation of the solution of Eq. (3.16) in terms of the eigenstates of the operator (4.1) is that it holds not for any $t \geq 0$, as one would naively expect, but only for $t > 4c/\omega$. The reason is that the norm of the projection operators $P_n$ grows exponentially with $n$, so one has to wait for $t$ to be large enough in order for the term $e^{-n\omega t/2}$ to suppress the exponential growth of the projectors. From a physical point of view, recalling the discussion of Sec. 2, since the constant $c$ is of order 1 [54] and $\omega \simeq 5.01 \times 10^{-5}\ \mathrm{s}^{-1}$, we see that the representation (4.7) holds true only in part of the classical regime and in the diffusive regime, which is the one we are interested in studying now, but not in the physically more crucial collapse regime.

We now apply the above results to our problem; we will first proceed in an informal way, and at the end we will prove the relevant theorems. Let $\phi \in L^2(\mathbb{R})$; then, according to (3.18) and (4.7):

\[ \phi_t(x) = T_t\phi = e^{[\sqrt{\lambda}\,\xi_t + i c_t/\hbar]x + i\vartheta_t/\hbar} \sum_{n=0}^{+\infty} \alpha_n\,e^{-(1+i)\omega_n t/2}\,\phi^{(n)}(x - b_t) \tag{4.8} \]
\[ \phantom{\phi_t(x)} = e^{-z^2(x-\bar{x}_t)^2/2 + i\bar{k}_t x + \gamma_t}\,\sqrt{z} \sum_{n=0}^{+\infty} \alpha_n\,e^{-(1+i)\omega_n t/2}\,\bar{H}_n[z(x - b_t)], \tag{4.9} \]

where $\alpha_n = \langle \phi | \phi^{(n)} \rangle$ (see Eq. (4.4)), while the two real parameters $\bar{x}_t$, $\bar{k}_t$ and the complex parameter $\gamma_t$ are defined as follows:

\[ \bar{x}_t = b_t^R + b_t^I - (2/m\omega)\,c_t^I + (\omega/2\sqrt{\lambda})\,\xi_t, \tag{4.10} \]
\[ \bar{k}_t = (m\omega/\hbar)\,b_t^I + (1/\hbar)(c_t^R - c_t^I) + \sqrt{\lambda}\,\xi_t, \tag{4.11} \]
\[ \gamma_t = -(1-i)(m\omega/4\hbar)(b_t^2 - \bar{x}_t^2) + (i/\hbar)\,\vartheta_t. \tag{4.12} \]
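The passage from (4.8) to (4.9) is a completion of the square, and the definitions (4.10)–(4.12) can be checked by comparing the two exponents numerically. The sketch below assumes $\omega = 2\sqrt{\hbar\lambda/m}$ and $z^2$ as in (4.3); the sample values standing in for $b_t$, $c_t$, $\xi_t$ and $\vartheta_t$ are hypothetical and serve only to exercise the identity.

```python
import math

# Hypothetical sample values (not from the paper), chosen only to exercise the identity.
HBAR, M, LAM = 1.3, 0.7, 2.1
OMEGA = 2 * math.sqrt(HBAR * LAM / M)        # assumed definition of omega
Z2 = (1 - 1j) * math.sqrt(LAM * M / HBAR)    # z^2 from (4.3)
b, c = 0.4 - 0.9j, -1.1 + 0.5j               # stand-ins for b_t and c_t
xi, theta = 0.8, 0.25                        # stand-ins for xi_t and vartheta_t

# Definitions (4.10), (4.11) and (4.12).
x_bar = b.real + b.imag - (2 / (M * OMEGA)) * c.imag + (OMEGA / (2 * math.sqrt(LAM))) * xi
k_bar = (M * OMEGA / HBAR) * b.imag + (c.real - c.imag) / HBAR + math.sqrt(LAM) * xi
gamma = -(1 - 1j) * (M * OMEGA / (4 * HBAR)) * (b ** 2 - x_bar ** 2) + 1j * theta / HBAR


def exponent_4_8(x):
    """Exponent of (4.8), including the Gaussian carried by phi^(n)(x - b_t)."""
    return (math.sqrt(LAM) * xi + 1j * c / HBAR) * x + 1j * theta / HBAR - Z2 * (x - b) ** 2 / 2


def exponent_4_9(x):
    """Exponent of (4.9)."""
    return -Z2 * (x - x_bar) ** 2 / 2 + 1j * k_bar * x + gamma


max_gap = max(abs(exponent_4_8(x) - exponent_4_9(x)) for x in (-2.0, -0.3, 0.0, 1.0, 3.5))
```

The two exponents agree up to rounding error for every test point, with $\bar{x}_t$ and $\bar{k}_t$ real by construction.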


By resorting to Eqs. (3.14) and (3.19), and after a rather long calculation, we obtain the following set of SDEs for these parameters:

\[ d\bar{x}_t = \frac{\hbar}{m}\,\bar{k}_t\,dt + \sqrt{\frac{\hbar}{m}}\left[d\xi_t - 2\sqrt{\lambda}\,\bar{x}_t\,dt\right], \tag{4.13} \]
\[ d\bar{k}_t = \sqrt{\lambda}\left[d\xi_t - 2\sqrt{\lambda}\,\bar{x}_t\,dt\right], \tag{4.14} \]
\[ d\gamma_t^R = \left[\lambda\bar{x}_t^2 + \frac{\omega}{4}\right] dt + \sqrt{\lambda}\,\bar{x}_t\left[d\xi_t - 2\sqrt{\lambda}\,\bar{x}_t\,dt\right], \tag{4.15} \]
\[ d\gamma_t^I = -\left[\frac{\hbar}{2m}\,\bar{k}_t^2 + \frac{\omega}{4}\right] dt - \sqrt{\lambda}\,\bar{x}_t\left[d\xi_t - 2\sqrt{\lambda}\,\bar{x}_t\,dt\right]; \tag{4.16} \]

the initial conditions are $\bar{x}_0 = \bar{k}_0 = \gamma_0 = 0$. Note that these equations are equivalent to (3.41)–(3.44), with $\sigma_t = \sigma_\infty = \lambda/\upsilon = z^2/2$, $\bar{x}_t = x_t^m$, $\bar{k}_t = k_t$ and $\gamma_t = \varsigma_t + (1+i)\omega t/4$; as a matter of fact, the above equations describe the time evolution (according to Eq. (1.6)) of the ground state of the NSA harmonic oscillator, which is:

\[ \phi_t^\infty(x) = \exp\left[-\frac{z^2}{2}(x - \bar{x}_t)^2 + i\bar{k}_t x + \gamma_t - \frac{1+i}{4}\,\omega t\right], \qquad \phi_0^\infty(x) = \phi^{(0)}(x). \tag{4.17} \]

As we shall prove in the next section, this is the state to which, apart from normalization, any initial state converges in the long time limit, hence the name $\phi_t^\infty$.

As we see, due to the stochastic part of the dynamics, the argument of the Gaussian weighting factor and that of the Hermite polynomials of Eq. (4.9) are different functions of time, while for analyzing the long time behavior of the wave function, it is more convenient that both arguments display the same time dependence. We thus modify the argument of the Hermite polynomials, to make it equal to that of the weighting factor. To this end, let us define $\zeta_t = \bar{x}_t - b_t$; we can then write:

\[ \bar{H}_n[z(x - b_t)] = \frac{1}{\sqrt{\sqrt{\pi}\,2^n n!}}\,H_n[z(x - \bar{x}_t) + z\zeta_t] \]
\[ \phantom{\bar{H}_n[z(x - b_t)]} = \frac{1}{\sqrt{\sqrt{\pi}\,2^n n!}} \sum_{m=0}^{n} \binom{n}{m} (2z\zeta_t)^{n-m}\,H_m[z(x - \bar{x}_t)] \]
\[ \phantom{\bar{H}_n[z(x - b_t)]} = \sum_{m=0}^{n} \frac{\sqrt{n!}}{\sqrt{m!}\,(n-m)!}\,(\sqrt{2}\,z\zeta_t)^{n-m}\,\bar{H}_m[z(x - \bar{x}_t)], \tag{4.18} \]

where $H_m$ is the standard (not normalized) Hermite polynomial of degree $m$; in going from the first to the second line, we have used property (A.2). Resorting to the above relation, we can rewrite Eq. (4.9) as follows:

\[ \phi_t(x) = e^{i\bar{k}_t x + \gamma_t - (1+i)\omega t/4} \sum_{m=0}^{+\infty} \bar{\alpha}_t^{(m)}\,e^{-(1+i)m\omega t/2}\,\phi^{(m)}(x - \bar{x}_t); \tag{4.19} \]


the functions $\phi^{(m)}$ are the eigenstates defined in (4.3), while the time dependent coefficients $\bar{\alpha}_t^{(m)}$ are defined as follows:

\[ \bar{\alpha}_t^{(m)} = \sum_{k=0}^{+\infty} \alpha_{k+m}\,\frac{\sqrt{(k+m)!}}{\sqrt{m!}\,k!}\,(\sqrt{2}\,z\bar{\zeta}_t)^k, \tag{4.20} \]

where we have introduced the new quantity $\bar{\zeta}_t \equiv e^{-(1+i)\omega t/2}\,\zeta_t$. Equations (4.19) and (4.20) represent the two main formulas, which we will use in the next section to analyze the large time behavior. Before doing this, we need to set these formulas on a rigorous ground; we do this with the following two results.

Lemma 4.1. Let $\phi \in L^2(\mathbb{R})$ and $\alpha_n = \langle \phi | \phi^{(n)} \rangle$, with $\phi^{(n)}$ defined as in (4.3). Then the series (4.20) defining $\bar{\alpha}_t^{(m)}$ is a.s. convergent for any $m$ and any $t > 0$. Moreover, one has the following bound on the coefficients:

\[ |\bar{\alpha}_t^{(m)}| \leq N_t\,e^{(c+1/2)m}, \qquad N_t \equiv A \sum_{k=0}^{+\infty} \frac{e^{k(c+1)}\,|\sqrt{2}\,z\bar{\zeta}_t|^k}{\sqrt{k^k}} \quad \text{a.s.}, \tag{4.21} \]

where $A$ is a constant independent of the Brownian motion $\xi_t$.

Proof. Because of (4.5), there exists a constant $C_1$ such that:

\[ |\alpha_n| \leq \|\phi\|\,\|\phi^{(n)}\| = \|\phi^{(n)}\| \leq C_1 e^{nc}. \tag{4.22} \]

Secondly, using Stirling's formula, there exists a constant $C_2$ such that:

\[ C_2^{-1}\sqrt{2\pi n}\,n^n e^{-n} < n! < C_2\sqrt{2\pi n}\,n^n e^{-n}, \tag{4.23} \]

for $n > 1$; we can then write the following estimate:

\[ \frac{\sqrt{(k+m)!}}{\sqrt{m!}\,k!} \leq \frac{C_2^2}{\sqrt{2\pi}}\,\sqrt[4]{\frac{k+m}{mk^2}}\;\frac{(k+m)^{(k+m)/2}\,e^{-(k+m)/2}}{m^{m/2}e^{-m/2}\,k^k e^{-k}} \leq \frac{C_2^2}{\sqrt{\pi}}\,e^{-k(\ln k - 2)/2 + m/2}; \tag{4.24} \]

in the second line, we have used the inequality $(k+m)\ln(k+m) \leq k\ln k + m\ln m + k + m$. Using Eqs. (4.22) and (4.24), we have the following bound:

\[ \left|\alpha_{k+m}\,\frac{\sqrt{(k+m)!}}{\sqrt{m!}\,k!}\,(\sqrt{2}\,z\bar{\zeta}_t)^k\right| \leq \frac{C_1 C_2^2}{\sqrt[4]{\pi}}\,\frac{e^{k(c+1)}\,|\sqrt{2}\,z\bar{\zeta}_t|^k}{\sqrt{k^k}}, \qquad k, m \geq 1. \tag{4.25} \]

The cases $k = 0$ and $m = 0$ can be treated separately, giving the same bound, with the only possible difference of an overall constant factor. This proves convergence of the series defined in (4.20) and the bound (4.21).
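The two elementary ingredients of the estimate (4.24) can be spot-checked numerically. The sketch below is only a sanity check over a finite grid of integers, not a proof; the function names are hypothetical. It verifies the inequality $(k+m)\ln(k+m) \leq k\ln k + m\ln m + k + m$ and compares the exponent $-k(\ln k - 2)/2 + m/2$ with the exact logarithm of $\sqrt{(k+m)!}/(\sqrt{m!}\,k!)$, leaving the constant prefactor aside.

```python
import math


def log_ratio(k, m):
    """log of sqrt((k+m)!) / (sqrt(m!) * k!), computed with lgamma to avoid overflow."""
    return 0.5 * math.lgamma(k + m + 1) - 0.5 * math.lgamma(m + 1) - math.lgamma(k + 1)


def entropy_gap(k, m):
    """k ln k + m ln m + k + m - (k+m) ln(k+m); nonnegative when the inequality holds."""
    return k * math.log(k) + m * math.log(m) + k + m - (k + m) * math.log(k + m)


pairs = [(k, m) for k in range(1, 60) for m in range(1, 60)]
min_gap = min(entropy_gap(k, m) for k, m in pairs)
# Slack of the exponent in (4.24): the bound -k(ln k - 2)/2 + m/2 minus the exact log-ratio.
min_slack = min(-k * (math.log(k) - 2) / 2 + m / 2 - log_ratio(k, m) for k, m in pairs)
```

On this grid both quantities stay strictly positive, so the exponential bound holds there even without the help of the constant in front.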


Theorem 4.1. Let the conditions of Lemma 4.1 be satisfied; let moreover $\bar{\zeta}_t \equiv e^{-(1+i)\omega t/2}\,\zeta_t$, where $\zeta_t = \bar{x}_t - b_t$ with $\bar{x}_t$ and $b_t$ solutions of Eqs. (4.13) and (3.14), respectively. Then the series defined in (4.19) is a.s. norm convergent for $t > \bar{t} \equiv (4c+1)/\omega$. In addition, the following equality holds true:

\[ T_t\phi = e^{i\bar{k}_t x + \gamma_t - (1+i)\omega t/4} \sum_{m=0}^{+\infty} \bar{\alpha}_t^{(m)}\,e^{-(1+i)m\omega t/2}\,\phi^{(m)}(x - \bar{x}_t), \qquad t > \bar{t}, \tag{4.26} \]

where $T_t$ is the evolution operator associated to the Green's function (1.12).

Proof. According to (4.5) and (4.21), one has:

\[ \left\|\bar{\alpha}_t^{(m)}\,e^{-(1+i)m\omega t/2}\,\phi^{(m)}[z(x - \bar{x}_t)]\right\| \leq C_1 N_t\,e^{(2c+1/2-\omega t/2)m}, \tag{4.27} \]

from which the conclusion follows. Comparing the two expressions of Eqs. (3.18) and (4.19) when the initial state $\phi$ is an eigenstate $\phi^{(n)}$, we see that they coincide on the dense subspace of all finite linear combinations of $\phi^{(n)}$, and hence on the whole of $L^2(\mathbb{R})$.

5. The Long Time Behavior

We are now in a position to study the long time behavior of the solution of Eq. (1.1). Looking at expressions (4.19) for the solution $\phi_t$ and (4.20) for the coefficients $\bar{\alpha}_t^{(m)}$, it should be clear what the long time behavior of the normalized solution $\psi_t = \phi_t/\|\phi_t\|$ is: whatever the initial condition, at any time $t > 0$ the wave function $\phi_t$ picks up a component on the ground state $\phi^{(0)}(x - \bar{x}_t)$, since $\bar{\alpha}_t^{(0)} \neq 0$ as long as at least one of the coefficients $\alpha_k$ is not null, which is always the case. Equation (4.19) on the other hand shows that each term of the superposition has an exponential damping factor, which is stronger the higher the eigenvalue. Accordingly, after normalization, only the eigenstate with the weakest damping factor survives, which is the ground state. Hence we expect that the general solution of Eq. (1.1) converges a.s., in the large time limit, to the ground state $\phi^{(0)}(x - \bar{x}_t)$, which is a Gaussian state. That this is true is proven in the following theorem.

Theorem 5.1. Let $\phi_t$ be a strong solution of Eq. (1.6) that admits, for $t > \bar{t}$, a representation as in (4.26). Let $\psi_t \equiv \phi_t/\|\phi_t\|$ (when $\|\phi_t\| \neq 0$), which can be written as follows:

\[ \psi_t = \psi_t^\infty + e^{i(\bar{k}_t x + \gamma_t^I - \omega t/4)} \sum_{m=1}^{+\infty} \frac{\bar{\alpha}_t^{(m)}}{r_t}\,e^{-(1+i)m\omega t/2}\,\phi_m(x - \bar{x}_t), \tag{5.1} \]

with:

\[ \psi_t^\infty := \frac{\bar{\alpha}_t^{(0)}}{r_t}\,e^{i(\bar{k}_t x + \gamma_t^I - \omega t/4)}\,\phi_0(x - \bar{x}_t), \tag{5.2} \]
\[ r_t := \left\|\sum_{m=0}^{+\infty} \bar{\alpha}_t^{(m)}\,e^{-(1+i)m\omega t/2}\,\phi_m(x - \bar{x}_t)\right\|. \tag{5.3} \]


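Then, with $\mathbb{P}$-probability 1:

\[ \lim_{t\to\infty} \|\psi_t - \psi_t^\infty\| = 0. \tag{5.4} \]

Note that, apart from global factors, $\psi_t^\infty$ is the ground state of the NSA harmonic oscillator, randomly displaced both in position space and in momentum space.

Proof. According to Eq. (5.1), all we need to prove is that, with $\mathbb{P}$-probability 1:

\[ \lim_{t\to\infty} \left\|\sum_{m=1}^{+\infty} \frac{\bar{\alpha}_t^{(m)}}{r_t}\,e^{-(1+i)m\omega t/2}\,\phi_m(x - \bar{x}_t)\right\| = 0. \tag{5.5} \]

Resorting to (4.27), and summing the geometric series with ratio $e^{(2c+1/2-\omega t/2)} = e^{-\omega(t-\bar{t})/2}$, one can write the following bound:

\[ \left\|\sum_{m=1}^{+\infty} \frac{\bar{\alpha}_t^{(m)}}{r_t}\,e^{-(1+i)m\omega t/2}\,\phi_m(x - \bar{x}_t)\right\| \leq C_1\,\frac{N_t}{r_t}\,\frac{e^{-\omega(t-\bar{t})/2}}{1 - e^{-\omega(t-\bar{t})/2}}, \tag{5.6} \]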

thus all we need to establish is the long time behavior of $r_t$ and $N_t$. Lemmas 5.1 and 5.2 (see Eqs. (5.7) and (5.12)) state that, with $\mathbb{P}$-probability 1, $r_t$ converges asymptotically to a finite and non-null random variable, while $N_t$ converges to a finite random variable. From the above properties, the conclusion of the theorem follows immediately.

In the remainder of the section, we prove the required lemmas.

Lemma 5.1. Let $r_t$ be defined as in (5.3). Then, with $\mathbb{P}$-probability 1,

\[ \lim_{t\to\infty} r_t = r_\infty \quad \text{finite and not null}. \tag{5.7} \]

Proof. According to Eqs. (4.19) and (5.3), the following equality holds:

\[ \|\phi_t\| = e^{\gamma_t^R - \omega t/4}\,r_t; \tag{5.8} \]

resorting to the stochastic differentials (1.9) and (4.15) for $\|\phi_t\|^2$ and $\gamma_t^R$, respectively, one can write the following stochastic differential equation for $r_t^2$:

\[ dr_t^2 = \left[2\sqrt{\lambda}\,(q_t - \bar{x}_t)\,d\xi_t + 4\lambda(\bar{x}_t^2 - q_t\bar{x}_t)\,dt\right] r_t^2, \qquad r_0^2 = 1. \tag{5.9} \]

By using relation (1.11), the above equation can be re-written in terms of the Wiener process $W_t$ as follows:

\[ dr_t^2 = \left[2\sqrt{\lambda}\,(q_t - \bar{x}_t)\,dW_t + 4\lambda(q_t - \bar{x}_t)^2\,dt\right] r_t^2, \qquad r_0^2 = 1, \tag{5.10} \]

whose solution is:

\[ r_t^2 = \exp\left[2\sqrt{\lambda}\int_0^t (q_s - \bar{x}_s)\,dW_s + 2\lambda\int_0^t (q_s - \bar{x}_s)^2\,ds\right]. \tag{5.11} \]
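That (5.11) solves (5.10) follows from Itô's formula, and the relation can also be illustrated numerically. The sketch below is a rough illustration under stated assumptions, not the authors' procedure: $\lambda = 1$, and the random difference $q_s - \bar{x}_s$ is replaced by a hypothetical deterministic stand-in `h(s)` decaying like $e^{-s/2}$. It integrates (5.10) with the Euler–Maruyama scheme and compares the result with the closed form (5.11) evaluated on the same Brownian increments.

```python
import math
import random

random.seed(0)

LAM = 1.0                      # assumed value of lambda
T, STEPS = 1.0, 100_000
DT = T / STEPS


def h(s):
    """Hypothetical deterministic stand-in for q_s - xbar_s, decaying like exp(-s/2)."""
    return math.exp(-s / 2)


r2_euler = 1.0                 # Euler-Maruyama iterate of (5.10), with r_0^2 = 1
ito_integral = 0.0             # running sum approximating int_0^t h dW
time_integral = 0.0            # running sum approximating int_0^t h^2 ds
for n in range(STEPS):
    s = n * DT
    dw = random.gauss(0.0, math.sqrt(DT))
    r2_euler *= 1.0 + 2 * math.sqrt(LAM) * h(s) * dw + 4 * LAM * h(s) ** 2 * DT
    ito_integral += h(s) * dw
    time_integral += h(s) ** 2 * DT

r2_closed = math.exp(2 * math.sqrt(LAM) * ito_integral + 2 * LAM * time_integral)  # (5.11)
relative_gap = abs(r2_euler - r2_closed) / r2_closed
```

The residual gap is the discretization error of the scheme and shrinks as the step is refined; note the factor $2\lambda$ rather than $4\lambda$ in the exponent, which is the Itô correction.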


The crucial point is to establish the behavior of the difference $q_t - \bar{x}_t$ between the mean position of the general solution $\psi_t$ and the mean position of the "asymptotic" state $\psi_t^\infty$. Since $\psi_t$ converges to $\psi_t^\infty$, we expect $q_t - \bar{x}_t$ to vanish asymptotically. That this is actually true with $\mathbb{P}$-probability 1 is proven in Lemma 5.3 (see Eq. (5.15)), where indeed it is shown that the convergence is exponentially fast. This fact, together with (5.11), concludes the proof of the lemma.

Lemma 5.2. Let $N_t$ be defined as in (4.21). Then, with $\mathbb{P}$-probability 1,

\[ \lim_{t\to\infty} N_t = N_\infty \quad \text{finite}. \tag{5.12} \]

Proof. Looking back at Eq. (4.21), we see that in order to prove this lemma it is sufficient to show that $\bar{\zeta}_t$ tends to a finite limit as $t \to \infty$, with $\mathbb{P}$-probability 1. According to our previous definition, $\bar{\zeta}_t$ is equal to:

\[ \bar{\zeta}_t = e^{-(1+i)\omega t/2}\,(\bar{x}_t - b_t). \tag{5.13} \]

Equations (3.14) and (4.10), together with the change of measure (1.11), lead to the following stochastic differential equation for $\bar{\zeta}_t$ in terms of the Wiener process $W_t$:

\[ d\bar{\zeta}_t = \frac{\omega}{2\sqrt{\lambda}}\,e^{-(1+i)\omega t/2}\left[dW_t + 2\sqrt{\lambda}\,(q_t - \bar{x}_t)\,dt\right], \qquad \bar{\zeta}_0 = 0. \tag{5.14} \]

Once again, the large time behavior of $q_t - \bar{x}_t$ (see Eq. (5.15)) yields the conclusion of the lemma.

Lemma 5.3. Let $q_t \equiv \langle \psi_t | q | \psi_t \rangle$ and let $\bar{x}_t$ be defined as in (4.10). Then, with $\mathbb{P}$-probability 1:

\[ h_t \equiv q_t - \bar{x}_t = O(e^{-\omega t/2}). \tag{5.15} \]

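Proof. Let us consider the Gaussian solution of Eq. (1.6):

\[ \phi_t^G(x) \equiv G_t(x, 0) = K_t \exp\left[-\frac{\alpha_t}{2}\,x^2 + \bar{a}_t x + \bar{c}_t\right] \tag{5.16} \]
\[ \phantom{\phi_t^G(x) \equiv G_t(x, 0)} = K_t \exp\left[-\frac{\alpha_t}{2}\,(x - \bar{x}_t^G)^2 + i\bar{k}_t^G x + \tilde{c}_t\right], \tag{5.17} \]

where $G_t(x, y)$ is the Green's function defined in (1.12) and

\[ \bar{x}_t^G = \frac{\bar{a}_t^R}{\alpha_t^R}, \qquad \bar{k}_t^G = \bar{a}_t^I - \frac{\alpha_t^I}{\alpha_t^R}\,\bar{a}_t^R, \qquad \tilde{c}_t = \bar{c}_t + \frac{\alpha_t}{2}\,(\bar{x}_t^G)^2. \tag{5.18} \]

Note that $\bar{x}_t^G$ is the mean position of the Gaussian state $\phi_t^G$, while $\hbar\bar{k}_t^G$ is its average momentum. Obviously we can write:

\[ h_t = (q_t - \bar{x}_t^G) + (\bar{x}_t^G - \bar{x}_t); \tag{5.19} \]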


Lemma B.1 proves that $q_t - \bar{x}_t^G$ has the required asymptotic behavior (see Eq. (B.1)), so all we need to show is that also $\bar{x}_t^G - \bar{x}_t$ behaves as required. Lemma B.1 was first proven in [35]; for completeness, we reproduce it in Appendix B, adapting it to our notation. The proof of the lemma is instructive because it makes clear why it is convenient to analyze $q_t - \bar{x}_t^G$ separately from $\bar{x}_t^G - \bar{x}_t$.

By letting the ground state of the NSA harmonic oscillator evolve according to the Green's function $G_t(x, y)$, one can express $\bar{x}_t$ in terms of the functions (1.13)–(1.18); a straightforward calculation leads to the following result:

\[ \bar{x}_t^G - \bar{x}_t = \frac{\omega}{2\lambda}\left[(p_t^{-1} - 1)\,\bar{a}_t^R - \left(\frac{\beta_t\bar{b}_t}{\alpha_t + \alpha_\infty}\right)^{R}\right], \tag{5.20} \]

where $\alpha_\infty \equiv \lim_{t\to\infty}\alpha_t = 2\lambda/\upsilon$. By inspecting expressions (3.21) and (1.15), we recognize that $p_t^{-1} - 1 = O(e^{-\omega t})$ and $|\beta_t| = O(e^{-\omega t/2})$, thus in order to prove the lemma all we have to do is to control the long time behavior of $\bar{a}_t$, which in turn sets the asymptotic behavior of $\bar{b}_t$ through (1.17). Inverting Eq. (5.18) we get:

\[ \bar{a}_t = \alpha_t\,\bar{x}_t^G + i\bar{k}_t^G, \tag{5.21} \]

thus we can control $\bar{a}_t$ by controlling $\bar{x}_t^G$ and $\bar{k}_t^G$. These two quantities, being the average position and (modulo $\hbar$) average momentum of the Gaussian solution (5.17), satisfy the stochastic differential equations (3.41) and (3.42), with $\alpha_t/2$ in place of $\sigma_t$. By using the change of measure (1.11), we can re-express these equations in terms of the Wiener process $W_t$ as follows:

\[ d\bar{x}_t^G = \left[\frac{\hbar}{m}\,\bar{k}_t^G + \frac{2\lambda}{\alpha_t^R}\,f_t\right] dt + \frac{\sqrt{\lambda}}{\alpha_t^R}\,dW_t, \tag{5.22} \]
\[ d\bar{k}_t^G = -2\lambda\,\frac{\alpha_t^I}{\alpha_t^R}\,f_t\,dt - \sqrt{\lambda}\,\frac{\alpha_t^I}{\alpha_t^R}\,dW_t, \tag{5.23} \]

with $f_t \equiv q_t - \bar{x}_t^G$. By integrating the second equation, and by using the strong law of large numbers applied to $W_t$, Eq. (B.1) for $f_t$ and the fact that $\alpha_t$ has a finite asymptotic limit, one can show that, with $\mathbb{P}$-probability 1, the process $\bar{k}_t^G$ grows slower than $t^2$, for $t \to \infty$. By integrating now the first equation, and by using the same properties as before, one can show that $\bar{x}_t^G$ grows slower than $t^3$, for $t \to \infty$ and again with $\mathbb{P}$-probability 1. According to Eqs. (5.21) and (1.17), we then have, with $\mathbb{P}$-probability 1:

\[ \bar{a}_t = o(t^3) \ \text{as } t \to \infty, \qquad \lim_{t\to\infty} \bar{b}_t = \bar{b}_\infty \quad \text{finite}. \tag{5.24} \]

This proves that $\bar{x}_t^G - \bar{x}_t$ has the required asymptotic behavior, hence the conclusion of the lemma.


In this way, we have proven that any initial state is $\mathbb{P}$-a.s. norm convergent to the Gaussian state (5.2), which can be written as follows:

\[ \psi_t^\infty \equiv \sqrt[4]{\frac{z_R^2}{\pi}}\,\exp\left[-\frac{z^2}{2}\,(x - \bar{x}_t)^2 + i\bar{k}_t x + i\left(\gamma_t^I - \frac{\omega}{4}\,t\right)\right], \tag{5.25} \]

which has a fixed finite spread both in position and in momentum, given by [39]:

\[ \Delta q = \langle \psi_t^\infty | (q - \bar{x}_t)^2 | \psi_t^\infty \rangle^{1/2} = \sqrt{\frac{\hbar}{m\omega}}, \tag{5.26} \]
\[ \Delta p = \langle \psi_t^\infty | (p - \hbar\bar{k}_t)^2 | \psi_t^\infty \rangle^{1/2} = \sqrt{\frac{\hbar m\omega}{2}}. \tag{5.27} \]

This is close to the minimum allowed by Heisenberg's uncertainty relations, as $\Delta q\,\Delta p = \hbar/\sqrt{2}$. Note also that, the more massive the particle, the smaller the spread in position of the asymptotic Gaussian state: this is a well known effect of the localizing property of Eq. (1.1). Finally, Eqs. (4.13) and (4.14), together with the change of measure (1.11), tell how the average position $\bar{x}_t$ and momentum $\bar{k}_t$ evolve in time, as a function of the Wiener process $W_t$:

\[ d\bar{x}_t = \frac{\hbar}{m}\,\bar{k}_t\,dt + \omega h_t\,dt + \frac{\omega}{2\sqrt{\lambda}}\,dW_t, \tag{5.28} \]
\[ d\bar{k}_t = 2\lambda h_t\,dt + \sqrt{\lambda}\,dW_t, \tag{5.29} \]

which imply that there exist two random variables $X$ and $K$ such that [35]:

\[ \bar{x}_t = X + \frac{\hbar}{m}\,Kt + \sqrt{\lambda}\,\frac{\hbar}{m}\int_0^t W_s\,ds + \sqrt{\frac{\hbar}{m}}\,W_t + O(e^{-\omega t/2}), \tag{5.30} \]
\[ \bar{k}_t = K + \sqrt{\lambda}\,W_t + O(e^{-\omega t/2}). \tag{5.31} \]

These parameters fully describe the time evolution of the Gaussian state (5.25).

6. Conclusions and Outlook

In Sec. 2, we identified three interesting time regimes during which the wave function, depending on the values of the parameters $\lambda$ and $m$, evolves in a different way. In the central sections of this paper, we have analyzed the long time behavior, which pertains to the third regime, the diffusive one. There are many other properties of the solutions of Eq. (1.1) which deserve to be analyzed, and in this concluding section we would like to point out a number of interesting open problems.

I. Collapse regime

Let $\ell$ be the length which discriminates between a localized and a non-localized wave function, i.e. such that, denoting by $\Delta_\psi q$ the spread in position of a wave function $\psi$, we say that $\psi$ is localized in space whenever $\Delta_\psi q \leq \ell$. In our case, we must take $\ell > \sqrt{\hbar/m\omega}$, where $\sqrt{\hbar/m\omega}$ is the asymptotic spread (see Eq. (5.26)).

Problem I.1: Find bounds on the collapse time

Let $\psi_t$ be the solution of Eq. (1.1), for a given initial condition $\psi \in L^2(\mathbb{R})$ such that $\Delta_\psi q > \ell$. Let us define the collapse time $T^\psi_{\mathrm{COL}}$ as the first time at which the wave function is localized in space:

\[ T^\psi_{\mathrm{COL}} := \min\{t : \Delta_{\psi_t} q \leq \ell\}. \tag{6.1} \]

How is $T^\psi_{\mathrm{COL}}$ distributed? Find best possible bounds (depending on the parameters defining the model) for the distribution function. The dependence of the collapse time on the parameters of the model is physically relevant, since for macroscopic bodies the collapse is supposed to happen on a much shorter time scale, producing a classical macroscopic body. It must happen well before diffusion becomes effective. Bounds on the collapse time will hopefully lead to experimentally testable deviations from linear quantum mechanics (i.e. the theory in which the superposition principle holds on all scales).

Problem I.2: Collapse probability

Let $\bar{\psi} := \psi_t$, for $t = \mathbb{E}_{\mathbb{P}}[T^\psi_{\mathrm{COL}}]$. Let $\bar{x} := \langle \bar{\psi} | q | \bar{\psi} \rangle$ be the position of the wave function at the average time at which it is localized in space. Show that the distribution of $\bar{x}$ is close to the Born probability given by $|\psi(x)|^2$.

II. Classical regime

In the classical regime, the wave function is expected to move, on the average, like a classical free particle.

Problem II.1: Classical motion

Let $\bar{q}_t$ and $\bar{p}_t$ be the (quantum) average position and momentum of $\psi_t$. Let $t > T^\psi_{\mathrm{COL}}$. Show that the random trajectories $\bar{q}_\cdot$ and $\bar{p}_\cdot$ are, with high probability and for a reasonably large amount of time, close to the classical trajectories. The closeness will of course depend on the parameters defining the model.

III. Diffusive regime

This regime, which sets in when the classical regime ends, has been analyzed in this paper: as we have seen, the wave function keeps diffusing in the Hilbert space, eventually taking a Gaussian shape, as described in Sec. 5.

Acknowledgments

The work was supported by the EU grant No. MEIF CT 2003-500543 and by DFG (Germany). We thank GianCarlo Ghirardi, Lajos Diósi and the referees for helpful comments and an anonymous referee for pointing out a flaw in a previous version.


Appendix A. Properties of Hermite Polynomials

We list here the main properties of Hermite polynomials, which are used in the paper. The primary definition of the Hermite polynomials is

\[ H_n(z) = n! \sum_{m=0}^{\lfloor n/2 \rfloor} \frac{(-1)^m (2z)^{n-2m}}{m!\,(n-2m)!}, \tag{A.1} \]

where $z$ is any complex number. These polynomials satisfy the following addition rule:

\[ H_n(z_1 + z_2) = \sum_{m=0}^{n} \binom{n}{m} (2z_2)^{n-m}\,H_m(z_1). \tag{A.2} \]

When the argument is real ($z = x \in \mathbb{R}$), they form an orthogonal set with respect to the weight $\exp[-x^2]$; the normalized Hermite polynomials are:

\[ \bar{H}_n(x) = \frac{1}{N_n}\,H_n(x), \qquad N_n = \sqrt{\sqrt{\pi}\,2^n n!}. \tag{A.3} \]
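The addition rule (A.2), which underlies the re-expansion (4.18), is easy to test for complex arguments. The following minimal sketch evaluates both sides directly from the explicit sum (A.1); the test points `z1` and `z2` are arbitrary and the function names are hypothetical.

```python
import math


def hermite(n, z):
    """H_n(z) from the explicit sum (A.1); z may be complex."""
    return math.factorial(n) * sum(
        (-1) ** m * (2 * z) ** (n - 2 * m) / (math.factorial(m) * math.factorial(n - 2 * m))
        for m in range(n // 2 + 1)
    )


def addition_rule(n, z1, z2):
    """Right-hand side of the addition rule (A.2)."""
    return sum(math.comb(n, m) * (2 * z2) ** (n - m) * hermite(m, z1) for m in range(n + 1))


z1, z2 = 0.7 - 0.4j, -0.3 + 1.1j   # arbitrary complex test points
max_relative_gap = max(
    abs(hermite(n, z1 + z2) - addition_rule(n, z1, z2)) / max(1.0, abs(hermite(n, z1 + z2)))
    for n in range(9)
)
```

For degrees 0 through 8 the two sides agree up to rounding error, as expected for a polynomial identity.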

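Appendix B. Lemma 3.1 in [34]

Lemma B.1. Let $\phi \in L^2(\mathbb{R})$, $\|\phi\| = 1$ and let $\phi_t = T_t\phi$. Then, with $\mathbb{P}$-probability 1:

\[ f_t \equiv q_t - \bar{x}_t^G = O(e^{-\omega t/2}), \tag{B.1} \]

where $q_t = \langle \psi_t | q | \psi_t \rangle$, and $\bar{x}_t^G$ has been defined in (5.18).

Proof. Using the expression (1.12) for $G_t(x, y)$ together with the Schwarz inequality, we can derive the following bound on $\phi_t$:

\[ |\phi_t(x)|^2 \leq |K_t|^2 \sqrt{\frac{\pi}{\alpha_t^R}}\,\exp\left[-2\,\frac{\lambda}{\omega}\,\frac{p_t^2 - 4q_t^2}{p_t}\,x^2 + 2\left(\bar{a}_t^R + 2\,\frac{\bar{b}_t^R q_t}{p_t}\right) x + 2\bar{c}_t^R + \frac{\omega}{2\lambda}\,\frac{(\bar{b}_t^R)^2}{p_t}\right], \tag{B.2} \]

which holds for any $t > 0$. The above inequality implies that it is sufficient to consider $\phi \in L^2(\mathbb{R})$ such that:

\[ |\phi(x)| \leq C e^{-Ax^2}, \tag{B.3} \]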


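where $C$ and $A$ are random variables. A direct calculation leads to the following expression for the quantum average $\langle \phi_t | q | \phi_t \rangle$ (here ${}^*$ denotes complex conjugation):

\[ \langle \phi_t | q | \phi_t \rangle = |K_t|^2 \sqrt{\frac{\pi}{\alpha_t^R}}\,\exp\left[2\bar{c}_t^R + \frac{(\bar{a}_t^R)^2}{\alpha_t^R}\right] \int dy_1\,dy_2\;\phi^*(y_1)\,\phi(y_2) \left[\frac{\beta_t^* y_1 + \beta_t y_2}{2\alpha_t^R} + \frac{\bar{a}_t^R}{\alpha_t^R}\right] \]
\[ \qquad \cdot \exp\left[-\frac{1}{2}\left(\alpha_t^* - \frac{\beta_t^{*2}}{2\alpha_t^R}\right) y_1^2 - \frac{1}{2}\left(\alpha_t - \frac{\beta_t^2}{2\alpha_t^R}\right) y_2^2\right] \]
\[ \qquad \cdot \exp\left[\left(\bar{b}_t^* + \frac{\beta_t^*\,\bar{a}_t^R}{\alpha_t^R}\right) y_1 + \left(\bar{b}_t + \frac{\beta_t\,\bar{a}_t^R}{\alpha_t^R}\right) y_2 + \frac{|\beta_t|^2}{2\alpha_t^R}\,y_1 y_2\right]. \tag{B.4} \]

As we shall soon see, all exponential terms in the above expression can be controlled. The crucial factors are the two within square brackets: the first term decays exponentially in time, since $\beta_t = O(e^{-\omega t/2})$, while $\alpha_t$ has a finite asymptotic limit; the term $\bar{a}_t^R/\alpha_t^R$, instead, does not decay in time (see the discussion in connection with the proof of Lemma 5.3). Since $\|\phi_t\|^2$ is equal to the expression (B.4) without the terms in square brackets, and because of (5.18), we have that

\[ f_t\,\|\phi_t\|^2 = \langle \phi_t | q | \phi_t \rangle - \frac{\bar{a}_t^R}{\alpha_t^R}\,\|\phi_t\|^2 \tag{B.5} \]
\[ \phantom{f_t\,\|\phi_t\|^2} = |K_t|^2 \sqrt{\frac{\pi}{\alpha_t^R}}\,\exp\left[2\bar{c}_t^R + \frac{(\bar{a}_t^R)^2}{\alpha_t^R}\right] \int dy_1\,dy_2\;\phi^*(y_1)\,\phi(y_2)\,\frac{\beta_t^* y_1 + \beta_t y_2}{2\alpha_t^R} \]
\[ \qquad \cdot \exp\left[-\frac{1}{2}\left(\alpha_t^* - \frac{\beta_t^{*2}}{2\alpha_t^R}\right) y_1^2 - \frac{1}{2}\left(\alpha_t - \frac{\beta_t^2}{2\alpha_t^R}\right) y_2^2\right] \]
\[ \qquad \cdot \exp\left[\left(\bar{b}_t^* + \frac{\beta_t^*\,\bar{a}_t^R}{\alpha_t^R}\right) y_1 + \left(\bar{b}_t + \frac{\beta_t\,\bar{a}_t^R}{\alpha_t^R}\right) y_2 + \frac{|\beta_t|^2}{2\alpha_t^R}\,y_1 y_2\right]. \tag{B.6} \]

According to the discussion above, we expect the quantity $f_t\,\|\phi_t\|^2$ to decay exponentially in time, as we shall now prove; this is the reason why, in proving Lemma 5.3, it was convenient to split the difference $h_t$ as done in Eq. (5.19). Using the inequality $y_1 y_2 \leq (y_1^2 + y_2^2)/2$, we can write:

\[ |f_t|\,\|\phi_t\|^2 = \left|\langle \phi_t | q | \phi_t \rangle - \frac{\bar{a}_t^R}{\alpha_t^R}\,\|\phi_t\|^2\right| \tag{B.7} \]
\[ \phantom{|f_t|\,\|\phi_t\|^2} \leq \frac{|\beta_t|}{2\alpha_t^R}\,|K_t|^2 \sqrt{\frac{\pi}{\alpha_t^R}}\,\exp\left[2\bar{c}_t^R + \frac{(\bar{a}_t^R)^2}{\alpha_t^R}\right] \int dy_1\,dy_2\;|\phi(y_1)|\,|\phi(y_2)|\,(|y_1| + |y_2|)\,g(y_1)\,g(y_2), \tag{B.8} \]

with:

\[ g(y) \equiv \exp\left[-\frac{1}{2}\left(\alpha_t^R - \frac{(\beta_t^R)^2}{\alpha_t^R}\right) y^2 + \left(\bar{b}_t^R + \frac{\beta_t^R\,\bar{a}_t^R}{\alpha_t^R}\right) y\right]. \tag{B.9} \]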


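Next, by using the inequality $g(y_1)\,g(y_2) \leq (g(y_1)^2 + g(y_2)^2)/2$ and the symmetry between $y_1$ and $y_2$, we have:

\[ |f_t|\,\|\phi_t\|^2 \leq \frac{|\beta_t|}{2\alpha_t^R}\,|K_t|^2 \sqrt{\frac{\pi}{\alpha_t^R}}\,\exp\left[2\bar{c}_t^R + \frac{(\bar{a}_t^R)^2}{\alpha_t^R}\right] \int dy_1\,dy_2\;|\phi(y_1)|\,|\phi(y_2)|\,(|y_1| + |y_2|)\,g(y_1)^2. \tag{B.10} \]

Now, a direct computation shows that

\[ \|G_t(\cdot, y)\|^2 \equiv \int dx\,|G_t(x, y)|^2 = |K_t|^2 \sqrt{\frac{\pi}{\alpha_t^R}}\,\exp\left[2\bar{c}_t^R + \frac{(\bar{a}_t^R)^2}{\alpha_t^R}\right] g(y)^2; \tag{B.11} \]

the key point is that, since $G_t(x, y)$ solves Eq. (1.6), then $\|G_t(\cdot, y)\|^2$ is a positive martingale with respect to the measure $\mathbb{Q}$, for any value of $y$; we call $\mathrm{Mar}_{\mathbb{Q}}(t, y)$ this martingale. We can then write:

\[ |f_t|\,\|\phi_t\|^2 \leq \frac{|\beta_t|}{2\alpha_t^R} \int dy_1\,dy_2\;|\phi(y_1)|\,|\phi(y_2)|\,(|y_1| + |y_2|)\,\mathrm{Mar}_{\mathbb{Q}}(t, y_1) \]
\[ \phantom{|f_t|\,\|\phi_t\|^2} \leq \frac{|\beta_t|}{2\alpha_t^R} \int dy\;e^{-Ay^2}\,(A_1|y| + A_2)\,\mathrm{Mar}_{\mathbb{Q}}(t, y), \tag{B.12} \]

where $A_1$ and $A_2$ are suitable constants. In going from the first to the second line, we have used (B.3). The quantity

\[ \frac{1}{2\alpha_t^R} \int dy\;e^{-Ay^2}\,(A_1|y| + A_2)\,\mathrm{Mar}_{\mathbb{Q}}(t, y) \tag{B.13} \]

is another positive martingale with respect to $\mathbb{Q}$, which we call $\mathrm{Mar}_{\mathbb{Q}}(t)$. We arrive in this way at the inequality:

\[ |f_t| \leq |\beta_t|\,\frac{\mathrm{Mar}_{\mathbb{Q}}(t)}{\|\phi_t\|^2}. \tag{B.14} \]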

Since $\mathrm{Mar}_{\mathbb{Q}}(t)$ is a positive martingale with respect to $\mathbb{Q}$, then $\mathrm{Mar}_{\mathbb{P}}(t) = \mathrm{Mar}_{\mathbb{Q}}(t)/\|\phi_t\|^2$ is a positive martingale with respect to $\mathbb{P}$ which, by Doob's convergence theorem, has a $\mathbb{P}$-a.s. finite limit for $t \to +\infty$. The conclusion of the lemma then follows from Eq. (1.15), according to which $\beta_t = O(e^{-\omega t/2})$.

References

[1] G. C. Ghirardi, A. Rimini and T. Weber, Unified dynamics for microscopic and macroscopic systems, Phys. Rev. D 34 (1986) 470–491.
[2] G. C. Ghirardi, P. Pearle and A. Rimini, Markov processes in Hilbert space and continuous spontaneous localization of systems of identical particles, Phys. Rev. A 42 (1990) 78–89.
[3] G. C. Ghirardi, R. Grassi and P. Pearle, Relativistic dynamical reduction models: General framework and examples, Found. Phys. 20 (1990) 1271–1316.


[4] P. Pearle, Reduction of the state vector by a nonlinear Schrödinger equation, Phys. Rev. D 13 (1976) 857–868.
[5] P. Pearle, Combining stochastic dynamical state-vector reduction with spontaneous localization, Phys. Rev. A 39 (1989) 2277–2289.
[6] P. Pearle, Collapse Models, in Open Systems and Measurement in Relativistic Quantum Theory, eds. F. Petruccione and H.-P. Breuer (Springer-Verlag, Berlin, 1999).
[7] L. Diósi, Localized solution of simple nonlinear quantum Langevin-equation, Phys. Lett. A 132 (1988) 233–236.
[8] L. Diósi, Models for universal reduction of macroscopic quantum fluctuations, Phys. Rev. A 40 (1989) 233–236.
[9] L. Diósi, Relativistic theory for continuous measurement of quantum fields, Phys. Rev. A 42 (1990) 5086–5092.
[10] S. L. Adler, D. C. Brody, T. A. Brun and L. P. Hughston, Martingale models for quantum state reduction, J. Phys. A 34 (2001) 8795–8820.
[11] S. L. Adler and T. A. Brun, Generalized stochastic Schrödinger equations for state vector collapse, J. Phys. A 34 (2001) 4797–4809.
[12] S. L. Adler, Quantum Theory as an Emergent Phenomenon. The Statistical Mechanics of Matrix Models as the Precursor of Quantum Field Theory (Cambridge University Press, Cambridge, 2004).
[13] A. Bassi, E. Ippoliti and S. L. Adler, Towards quantum superpositions of a mirror: An exact open systems analysis, Phys. Rev. Lett. 94 (2005) 030401, 4 pp.
[14] A. Bassi, E. Ippoliti and B. Vacchini, On the energy increase in space-collapse models, J. Phys. A 38 (2005) 8017–8038.
[15] V. P. Belavkin, Non-demolition measurements, non-linear filtering and dynamic programming in quantum stochastic processes, in Lecture Notes in Control and Information Science, ed. A. Blaquière, Vol. 121 (Springer-Verlag, Berlin, 1988).
[16] S. L. Adler and A. Bassi, Is quantum theory exact? Science 325 (2009) 275–276.
[17] V. P. Belavkin and P. Staszewski, A quantum particle undergoing continuous observation, Phys. Lett. A 140 (1989) 359–362.
[18] V. P. Belavkin and P. Staszewski, Nondemolition observation of a free quantum particle, Phys. Rev. A 45 (1992) 1347–1357.
[19] D. Chruściński and P. Staszewski, On the asymptotic solutions of Belavkin's stochastic wave equation, Phys. Scripta 45 (1992) 193–199.
[20] A. Barchielli, Direct and heterodyne detection and other applications of quantum stochastic calculus to quantum optics, Quantum Opt. 2 (1990) 423–441.
[21] A. Barchielli, On the quantum theory of measurements continuous in time, Proceedings of the XXV Symposium on Mathematical Physics (Toruń, 1992), Rep. Math. Phys. 33 (1993) 21–34.
[22] A. Barchielli and A. S. Holevo, Constructing quantum measurement processes via classical stochastic calculus, Stochastic Process. Appl. 58 (1995) 293–317.
[23] Ph. Blanchard and A. Jadczyk, On the interaction between classical and quantum systems, Phys. Lett. A 175 (1993) 157–164.
[24] Ph. Blanchard and A. Jadczyk, Event-enhanced quantum theory and piecewise deterministic dynamics, Ann. Physik 4(8) (1995) 583–599.
[25] Ph. Blanchard and A. Jadczyk, Events and piecewise deterministic dynamics in event-enhanced quantum theory, Phys. Lett. A 203 (1995) 260–266.
[26] J. Halliwell and A. Zoupas, Quantum state diffusion, density matrix diagonalization, and decoherent histories: A model, Phys. Rev. D 52 (1995) 7294–7307.
[27] J. Halliwell and A. Zoupas, Post-decoherence density matrix propagator for quantum Brownian motion, Phys. Rev. D 55 (1997) 4697–4704.


[28] H.-P. Breuer and F. Petruccione, The Theory of Open Quantum Systems (Oxford University Press, New York, 2002).
[29] H.-P. Breuer, U. Dorner and F. Petruccione, Numerical integration methods for stochastic wave function equations, Comp. Phys. Comm. 132 (2000) 30–43.
[30] D. Gatarek and N. Gisin, Continuous quantum jumps and infinite-dimensional stochastic equations, J. Math. Phys. 32 (1991) 2152–2157.
[31] A. S. Holevo, On dissipative stochastic equations in a Hilbert space, Probab. Theory Related Fields 104 (1996) 483–500.
[32] V. P. Belavkin and V. N. Kolokol'tsov, Quasiclassical asymptotics of quantum stochastic equations, Teoret. Mat. Fiz. 89 (1991) 163–177 (Russian); translation in Theoret. and Math. Phys. 89(2) (1991) 1127–1138.
[33] V. N. Kolokol'tsov, Application of quasiclassical methods to the study of Belavkin's quantum filtering equation, Mat. Zametki 50 (1991) 153–156 (Russian); translation in Math. Notes 50 (1991) 1204–1206.
[34] V. N. Kolokol'tsov, Scattering theory for the Belavkin equation describing a quantum particle with continuously observed coordinate, J. Math. Phys. 36 (1995) 2741–2760.
[35] V. N. Kolokol'tsov, Localization and analytic properties of the solutions of the simplest quantum filtering equation, Rev. Math. Phys. 10 (1998) 801–828.
[36] V. N. Kolokol'tsov, Semiclassical Analysis for Diffusion and Stochastic Processes, Lecture Notes in Mathematics, Vol. 1724 (Springer-Verlag, Berlin, 2000).
[37] S. Albeverio, V. N. Kolokol'tsov and O. G. Smolyanov, Continuous quantum measurement: Local and global approaches, Rev. Math. Phys. 9 (1997) 907–920.
[38] S. Teufel, Adiabatic Perturbation Theory in Quantum Dynamics, Lecture Notes in Mathematics, Vol. 1821 (Springer-Verlag, Berlin, 2003).
[39] A. Bassi, Collapse models: Analysis of the free particle dynamics, J. Phys. A 38 (2005) 3173–3192.
[40] E. Joos and H. D. Zeh, The emergence of classical properties through interaction with the environment, Z. Phys. B 59 (1985) 223–243.
[41] W. Marshall, C. Simon, R. Penrose and D. Bouwmeester, Towards quantum superpositions of a mirror, Phys. Rev. Lett. 91 (2003) 130401, 4 pp.
[42] J. Z. Bernád, L. Diósi and T. Geszti, Quest for quantum superpositions of a mirror: High and moderately low temperatures, Phys. Rev. Lett. 97 (2006) 250404, 4 pp.
[43] S. L. Adler, A density tensor hierarchy for open system dynamics: Retrieving the noise, J. Phys. A 40 (2007) 8959–8990.
[44] A. Barchielli, Some stochastic differential equations in quantum optics and measurement theory: The case of diffusive processes, in Contributions in Probability: In Memory of Alberto Frigerio, ed. C. Cecchini (Forum, Udine, 1996), pp. 43–55.
[45] A. Bassi and D. Dürr, On the long-time behavior of Hilbert space diffusion, Europhys. Lett. 84 (2008) 10005.
[46] C. M. Mora and R. Rebolledo, Regularity of solutions to linear stochastic Schrödinger equations, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 10 (2007) 237–259.
[47] C. M. Mora and R. Rebolledo, Basic properties of nonlinear stochastic Schrödinger equations driven by Brownian motions, Ann. Appl. Probab. 18 (2008) 591–619.
[48] R. S. Liptser and A. N. Shiryaev, Statistics of Random Processes (Springer-Verlag, Berlin, 2001).
[49] A. Bassi, G. C. Ghirardi and D. G. M. Salvetti, The Hilbert-space operator formalism within dynamical reduction models, J. Phys. A 40 (2007) 13755–13772.
[50] R. G. Bartle, A Modern Theory of Integration, Graduate Studies in Mathematics, Vol. 32 (American Mathematical Society, Providence, RI, 2001).


[51] P. E. Protter, Stochastic Integration and Differential Equations (Springer-Verlag, Berlin, 2004).
[52] E. B. Davies, Pseudo-spectra, the harmonic oscillator and complex resonances, R. Soc. Lond. Proc. Ser. A Math. Phys. Eng. Sci. 455 (1999) 585–599.
[53] E. B. Davies and A. B. J. Kuijlaars, Spectral asymptotics of the non-self-adjoint harmonic oscillator, J. London Math. Soc. (2) 70(2) (2004) 420–426.
[54] E. B. Davies, Linear Operators and Their Spectra, Cambridge Studies in Advanced Mathematics, Vol. 106 (Cambridge University Press, Cambridge, 2007).

February 11, 2010 11:24 WSPC/148-RMP

J070-S0129055X10003904

Reviews in Mathematical Physics Vol. 22, No. 1 (2010) 91–115 © World Scientific Publishing Company DOI: 10.1142/S0129055X10003904

FROM GLOBAL SYMMETRIES TO LOCAL CURRENTS: THE FREE (SCALAR) CASE IN FOUR DIMENSIONS

GERARDO MORSELLA∗ and LUCA TOMASSINI† Department of Mathematics, Tor Vergata University, via della Ricerca Scientifica I-00133 Roma, Italy ∗[email protected] †[email protected] Received 4 May 2009 Revised 15 October 2009

Within the framework of algebraic quantum field theory, we propose a new method of constructing local generators of (global) gauge symmetries in field theoretic models, starting from the existence of unitary operators implementing locally the flip automorphism on the doubled theory. We show, in the simple example of the internal symmetries of a multiplet of free scalar fields, that through the pointlike limit of such local generators the conserved Wightman currents associated with the symmetries are recovered. Keywords: Quantum Noether theorem; split property; flip automorphism. Mathematics Subject Classification 2010: 81T05, 46L45

1. Introduction

One of the most important features of field theoretic models is the existence of local conserved currents corresponding to space-time and internal (gauge) symmetries. While in the framework of classical Lagrangian field theory a clarification of this issue comes from Noether's theorem (which provides an explicit formula for the conserved current associated to any continuous symmetry of the Lagrangian itself), it is well known that in the quantum case several drawbacks contribute to make the situation more confusing. For example, symmetries which are present at the classical level can disappear upon quantization due to renormalization effects.

In [1, 2], a different approach to the problem was outlined in the context of algebraic quantum field theory. It consisted of two main steps: (1) given double cones $O$, $\hat O$ with bases $B$, $\hat B$ in the time-zero plane centered at the origin and such that $\bar O \subset \hat O$, start from generators $Q$ of global space-time or gauge transformations and construct local ones, i.e. operators $J^Q_{O,\hat O}$ generating the correct symmetry on the field algebra $\mathcal F(O)$ and localized in $\hat O$ (i.e. affiliated to $\mathcal F(\hat O)$); and (2) these local generators should play the role of integrals of (time components of) Wightman currents over $B$ with a smooth cut-off in $\hat B$ and possibly some smearing in time, so that one is led to conjecture that

$$\frac{1}{\lambda^3}\int_{\mathbb R^4} f(x)\,\alpha_x\big(J^Q_{\lambda O,\lambda\hat O}\big)\,dx \to c\,j_0^Q(f) \tag{1.1}$$

holds, in a suitable sense, as $\lambda \to 0$. Here $\alpha$ denotes space-time translations, $j_0^Q(x)$ the sought-for Wightman current, $f \in \mathcal S(\mathbb R^4)$ any test function and $c$ a constant which (in view of the above interpretation of $J^Q_{O,\hat O}$) would be expected to satisfy

$$\mathrm{vol}(B) \le c \le \mathrm{vol}(\hat B). \tag{1.2}$$

It is important to note that there is a large ambiguity in the choice of the local generators: since their action in $O' \cap \hat O$ is not fixed by the above requirements we are free to add perturbations in $\mathcal F(O' \cap \hat O)$. Thus, the limit (1.1) is not to be expected to converge in full generality, but we can still hope that a "canonical" choice or construction of the local generators might solve the problem (see below).

The first problem above was completely solved in [1] for the case of Abelian gauge transformation groups, while in [2, 3] the general case (including discrete and space-time symmetries and supersymmetries) was treated. The final result was that in physically reasonable theories what was called by the authors a canonical local unitary implementation of global symmetries exists and if a part of them actually constitutes a Lie group the corresponding canonical local generators provide a local representation of the associated Lie (current) algebras. A key assumption was identified in the so-called split property (for double cones), which holds in theories with a realistic thermodynamic behavior [4]. It expresses a strong form of statistical independence between the regions $O$ and $\hat O'$ and is equivalent to the existence of normal product states $\varphi$ on $\mathcal F(O) \vee \mathcal F(\hat O)'$ such that $\varphi(AB) = \omega(A)\omega(B)$ ($\omega$ being the vacuum state) for $A \in \mathcal F(O)$ and $B \in \mathcal F(\hat O)'$ [5].

However, the above-mentioned construction crucially depends on such a highly elusive object as the unique vector representative of the state $\varphi$ in the (natural) cone

$$\mathcal P^\natural_\Omega\big(\mathcal F(O) \vee \mathcal F(\hat O)'\big) = \overline{\Delta^{1/4}\big(\mathcal F(O) \vee \mathcal F(\hat O)'\big)_+\Omega} \tag{1.3}$$

(see [6]), where $\Omega$ indicates the vacuum vector and $\Delta$ the modular operator of the pair $(\mathcal F(O) \vee \mathcal F(\hat O)', \Omega)$, so that finding an explicit expression of the local generators appears as an almost hopeless task. This makes it extremely hard to proceed to the above-mentioned second step, i.e. the determination of the current fields themselves. Notwithstanding this, the reconstruction of the energy momentum tensor of a certain (optimal) class of 2-dimensional conformal models was carried out in [7], while partial results for the U(1)-current in the free massless 4-dimensional case were obtained in [8], showing that for the local generators of [3] the drawbacks briefly discussed after Eq. (1.1) might be less severe. However, in both cases the existence of a unitary implementation of dilations was crucial for handling the limit $\lambda \to 0$.


In what follows, we restrict our attention to the case of continuous symmetries and propose a new method for obtaining local generators based on the existence of local unitary implementations of the flip automorphism, a requirement actually equivalent, under standard assumptions, to the split property [9]. This method turns out to be particularly suited for carrying out step (2) above, at least in the free field case.

To be more specific, we consider a quantum field theory defined by a net $O \to \mathcal F(O)$ of von Neumann algebras on open double cones in Minkowski 4-dimensional spacetime acting irreducibly on a Hilbert space $\mathcal H$ with scalar product $\langle\cdot,\cdot\rangle$ satisfying the following standard assumptions:

(1) there is a unitary strongly continuous representation $V$ on $\mathcal H$ of a compact Lie group $G$, which acts locally on $\mathcal F$,

$$V(g)\mathcal F(O)V(g)^* = \mathcal F(O), \qquad g \in G,$$

and we set $\beta_g := \mathrm{Ad}\,V(g)$;

(2) (split property) for each pair of double cones $O_1 \Subset O_2$ (i.e. $\bar O_1 \subset O_2$) there exists a type I factor $\mathcal N$ such that $\mathcal F(O_1) \subset \mathcal N \subset \mathcal F(O_2)$.

To such a theory, we associate the doubled theory $O \to \tilde{\mathcal F}(O) := \mathcal F(O)\,\bar\otimes\,\mathcal F(O)$, with the corresponding unitary representation of $G$ given by $\tilde V(g) := V(g) \otimes V(g)$. In this situation, it is well known that for each pair of double cones $O_1 \Subset O_2$ there exists a local implementation of the flip automorphism of $\tilde{\mathcal F}(O_1)$, i.e. a unitary operator $W_{O_1,O_2} \in \tilde{\mathcal F}(O_2)$ such that

$$W_{O_1,O_2}\,F_1 \otimes F_2\,W_{O_1,O_2}^* = F_2 \otimes F_1, \qquad F_1, F_2 \in \mathcal F(O_1). \tag{1.4}$$

Assume now, for the argument's sake, that there is a 1-parameter subgroup $\theta \in \mathbb R \to g_\theta \in G$ of $G$, such that the generator $Q$ of the corresponding unitary group $\theta \to V(g_\theta)$ is a bounded operator on $\mathcal H$. Considering the conditional expectation (Fubini mapping) $E_\Phi : B(\mathcal H)\,\bar\otimes\,B(\mathcal H) \to B(\mathcal H)$ defined by

$$E_\Phi(A_1 \otimes A_2) = \langle\Phi, A_2\Phi\rangle A_1, \qquad A_1, A_2 \in B(\mathcal H),$$

where $\Phi \in \mathcal H$ is such that $\|\Phi\| = 1$, we can define the operator

$$J^Q_{O_1,O_2} := \Xi^\Phi_{O_1,O_2}(Q) := E_\Phi\big(W_{O_1,O_2}(1 \otimes Q)W_{O_1,O_2}^*\big), \tag{1.5}$$

and it is then easy to see that such operator gives a local implementation of the infinitesimal symmetry generated by $Q$ in the following natural sense:

$$J^Q_{O_1,O_2} \in \mathcal F(O_2), \qquad [J^Q_{O_1,O_2}, F] = [Q, F], \quad \forall\, F \in \mathcal F(O_1). \tag{1.6}$$

We also note that for this last equation to hold, it is sufficient that $W_{O_1,O_2}$ is only a semi-local implementation of the flip, i.e. a unitary in $\mathcal F(O_2)\,\bar\otimes\,B(\mathcal H)$ for which (1.4) holds.
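The algebra behind (1.5) and (1.6) can be checked in a finite-dimensional toy model. The sketch below is only an illustration under stated assumptions: it takes $\mathcal H = \mathbb C^2$, uses the global swap on $\mathbb C^2 \otimes \mathbb C^2$ as the flip $W$, and picks a hypothetical bounded charge `Q`, observable `F` and unit vector `phi`. No localization is modelled, so it shows only that the Fubini mapping applied to $W(1 \otimes Q)W^*$ reproduces the commutator with $Q$.

```python
# Toy model (assumption): H = C^2 and W is the swap on C^2 (x) C^2.
# Matrices are nested lists of complex numbers; only the standard library is used.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def kron(a, b):
    # Kronecker product: the pair (i, k) is mapped to the index i * len(b) + k.
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

def fubini(x, phi, d):
    # E_Phi(A1 (x) A2) = <Phi, A2 Phi> A1, extended linearly to any x on C^d (x) C^d.
    return [[sum(phi[k].conjugate() * x[i * d + k][j * d + l] * phi[l]
                 for k in range(d) for l in range(d))
             for j in range(d)] for i in range(d)]

d = 2
one = [[1 + 0j, 0j], [0j, 1 + 0j]]
Q = [[1 + 0j, 0j], [0j, -1 + 0j]]   # hypothetical bounded generator
F = [[0j, 1 + 0j], [0j, 0j]]        # hypothetical observable, [Q, F] = 2 F
# Swap unitary: W (e_i (x) e_k) = e_k (x) e_i.
W = [[1 + 0j if (r // d == c % d and r % d == c // d) else 0j
      for c in range(d * d)] for r in range(d * d)]
phi = [0.6 + 0j, 0.8j]              # an arbitrary unit vector Phi

# J = E_Phi(W (1 (x) Q) W*), the finite-dimensional analogue of (1.5).
J = fubini(matmul(matmul(W, kron(one, Q)), dagger(W)), phi, d)
```

In this toy setting `J` coincides with `Q` up to rounding, and `commutator(J, F)` agrees with `commutator(Q, F)`, which is the content of (1.6) when the flip is implemented exactly.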


The assumption of boundedness for $Q$ is of course very strong, and it is not expected to be satisfied in physically interesting models. In the unbounded case it is however possible, in the slightly more restrictive setting of [2, 3], to make sense of Eqs. (1.5) and (1.6) producing a self-adjoint operator $J^Q_{O_1,O_2}$ affiliated to $\mathcal F(O_2)$ and implementing the commutator with $Q$ on a suitable dense subalgebra of $\mathcal F(O_1)$. More explicitly, assume that the triple $\Lambda = (\mathcal F(O_1), \mathcal F(O_2), \Omega)$ is a standard split W$^*$-inclusion in the sense of [10] and consider the unitary standard implementation $U_\Lambda : \mathcal H \to \mathcal H \otimes \mathcal H$ of the isomorphism

$$\eta : F_1F_2' \in \mathcal F(O_1) \vee \mathcal F(O_2)' \to F_1 \otimes F_2' \in \mathcal F(O_1)\,\bar\otimes\,\mathcal F(O_2)'.$$

This was used in [3] to define the universal localizing map $\psi_\Lambda : B(\mathcal H) \to B(\mathcal H)$,

$$\psi_\Lambda(T) = U_\Lambda^*(T \otimes 1)U_\Lambda, \qquad T \in B(\mathcal H),$$

where the standard type-I factor $\mathcal N_\Lambda = \psi_\Lambda(B(\mathcal H))$ satisfies $\mathcal F(O_1) \subset \mathcal N_\Lambda \subset \mathcal F(O_2)$. For the commutant standard inclusion $\Lambda' = (\mathcal F(O_2)', \mathcal F(O_1)', \Omega)$ [10], one has $\psi_{\Lambda'}(T) = U_\Lambda^*(1 \otimes T)U_\Lambda$. For any unitarily equivalent triple $\Lambda_0 = (V_0\mathcal F(O_1)V_0^*, V_0\mathcal F(O_2)V_0^*, V_0\Omega)$, one finds $U_{\Lambda_0} \cdot V_0 = V_0 \otimes V_0 \cdot U_\Lambda$. Notice that in the case of gauge transformations $\Lambda = \Lambda_0$ and so

$$U_\Lambda V(g) = V(g) \otimes V(g) \cdot U_\Lambda. \tag{1.7}$$

It is then straightforward to verify that, with $Z_{1,3}$ the unitary interchanging the first and third factors in $\mathcal H \otimes \mathcal H \otimes \mathcal H \otimes \mathcal H$, the operator $W_\Lambda = (U_\Lambda^* \otimes U_\Lambda^*)Z_{1,3}(U_\Lambda \otimes U_\Lambda)$ is a local implementation of the flip. Setting $g = g_\theta$ in (1.7) and differentiating with respect to $\theta$, a simple computation shows that

$$W_\Lambda(1 \otimes Q)W_\Lambda^* = \psi_\Lambda(Q) \otimes 1 + 1 \otimes \psi_{\Lambda'}(Q) = J^Q_\Lambda \otimes 1 + 1 \otimes J^Q_{\Lambda'},$$

where $J^Q_\Lambda$, $J^Q_{\Lambda'}$ are the canonical local implementations of [2, 3], which of course satisfy (1.6). Choosing now $\Phi = U_\Lambda^*(\Omega \otimes \Omega)$, we see that $\Xi^\Phi_{O_1,O_2}(Q) = J^Q_\Lambda$. The above construction (1.5) therefore includes the canonical one as a particular case.

As remarked above, the control of the limit (1.1) for such operators does not seem within reach of the presently known techniques. However, we shall see in Sec. 3 below that if $Q$ is the (unbounded) generator of a 1-parameter subgroup of a compact Lie gauge group acting on a finite multiplet of free scalar fields of mass $m \ge 0$, it is possible to provide a different explicit (semi-)local implementation of the flip $W_{O_1,O_2}$ such that the limit (1.1) can actually be performed for the corresponding generator $J^Q_{O_1,O_2}$ (which is self-adjoint and satisfies (1.6) in the same sense as $J^Q_\Lambda$).

The rest of the paper is organized as follows. In Sec. 2, we introduce a new class of test functions spaces and use it to obtain estimates concerning certain free field bilinears; as it is shown in the Appendix, these estimates also allow to


establish the existence of the above-mentioned unitaries. This is used in Sec. 3, where we go into the study of our models of 4-dimensional free fields. We focus on the case of a single charged free field with U(1) symmetry, the multiplet case being an easy generalization discussed at the end. We elaborate on the explicit realization of local unitaries implementing the flip automorphisms introduced for the neutral field case in [9], make use of the multiple commutator theorem in [11] to get an expression for the corresponding local generators of the U(1) symmetry and prove their (essential) self-adjointness on a suitable domain. Finally, convergence of the limit (1.1) is proved and the constant $c$ there shown to satisfy (1.2) (and in particular to be different from zero).

2. Test Functions Spaces and N-Bounds for Free Field Bilinears

We collect here some technical results, needed in the following section, on the extension of bilinear expressions in two commuting complex free scalar fields $\phi_i$, $i = 1, 2$, and their derivatives, to suitable spaces of tempered distributions. Using this, we will also obtain useful $N$-bounds for such operators.

The Hilbert space $\tilde{\mathcal H}$ on which the fields $\phi_i$ act is the bosonic second quantization of $K = L^2(\mathbb R^3) \otimes \mathbb C^4$. For $\Phi \in \tilde{\mathcal H}$, we denote by $\Phi^{(n)}$ its component in $K^{\otimes_S n}$ (the symmetrized $n$-fold tensor power of $K$) and by $\tilde D_0$ we indicate the dense space of $\Phi \in \tilde{\mathcal H}$ such that $\Phi^{(n)} = 0$ for all but finitely many $n \in \mathbb N_0$. Let $\tilde N$ be the number operator, defined by $(\tilde N\Phi)^{(n)} = n\Phi^{(n)}$ on the domain $D(\tilde N)$ of vectors $\Phi \in \tilde{\mathcal H}$ such that $\sum_n n^2\|\Phi^{(n)}\|^2 < \infty$. Fixing an orthonormal basis $(e_i^\tau)_{i=1,2}^{\tau=+,-}$ of $\mathbb C^4$, we can identify elements $\Phi \in K^{\otimes_S n}$ with collections $\Phi = (\Phi^{\tau_1\dots\tau_n}_{i_1\dots i_n})^{\tau_1\dots\tau_n=+,-}_{i_1\dots i_n=1,2}$ of functions on $\mathbb R^{3n}$, such that $\Phi^{\tau_1\dots\tau_n}_{i_1\dots i_n}(\mathbf p_1, \dots, \mathbf p_n)$ is symmetric for the simultaneous interchange of $(\tau_k, i_k, \mathbf p_k)$ and $(\tau_h, i_h, \mathbf p_h)$, and

$$\sum_{\tau_1,\dots,\tau_n=+,-}\ \sum_{i_1,\dots,i_n=1,2}\int_{\mathbb R^{3n}} d\mathbf p_1\cdots d\mathbf p_n\,|\Phi^{\tau_1\dots\tau_n}_{i_1\dots i_n}(\mathbf p_1, \dots, \mathbf p_n)|^2 < \infty.$$

We introduce then the operators on $\tilde{\mathcal H}$

$$c_i^{\tau,-}(\psi) = a(\bar\psi \otimes e_i^\tau), \qquad c_i^{\tau,+}(\psi) = a(\psi \otimes e_i^{-\tau})^*,$$

where $\psi \in L^2(\mathbb R^3)$ and $a(\xi)$, $\xi \in K$, is the usual Fock space annihilation operator. Their commutation relations are

$$[c_i^{\tau,\sigma}(\psi), c_j^{\rho,\varepsilon}(\varphi)] = -\sigma\,\delta_{ij}\,\delta_{\tau,-\rho}\,\delta_{\sigma,-\varepsilon}\int_{\mathbb R^3} d\mathbf p\,\psi(\mathbf p)\varphi(\mathbf p).$$

Introducing also the maps $j_\sigma : \mathcal S(\mathbb R^4) \to L^2(\mathbb R^3)$, $\sigma = +, -$, defined by $j_\sigma f(\mathbf p) := \sqrt{2\pi/\omega_m(\mathbf p)}\,\hat f(\sigma\omega_m(\mathbf p), \sigma\mathbf p)$ (where $\hat f(p) = \int_{\mathbb R^4}\frac{dx}{(2\pi)^2} f(x)e^{ipx}$ is the Fourier transform of $f$ and $\omega_m(\mathbf p) = \sqrt{|\mathbf p|^2 + m^2}$) and the notation $\phi_i^\dagger(f) := \phi_i(\bar f)^*$,


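we have

$$\phi_i(f) = \frac{1}{\sqrt 2}\sum_{\sigma=+,-} c_i^{-,\sigma}(j_\sigma f), \qquad \phi_i^\dagger(f) = \frac{1}{\sqrt 2}\sum_{\sigma=+,-} c_i^{+,\sigma}(j_\sigma f).$$

With the notation $\partial := \partial_0$, we have, for $f \in \mathcal S(\mathbb R^8)$ and $\Phi \in \tilde D_0$,

$$\big(:\partial^l\phi_i\,\partial^k\phi_j^\dagger:(f)\Phi\big)^{(n)} = \sum_{\sigma,\varepsilon} :\partial^l c_i^{-,\sigma}\,\partial^k c_j^{+,\varepsilon}:(f)^{(n)}\,\Phi^{(n-\sigma-\varepsilon)}, \tag{2.1}$$

where $:\partial^l c_i^{-,\sigma}\,\partial^k c_j^{+,\varepsilon}:(f)^{(n)} : K^{\otimes_S(n-\sigma-\varepsilon)} \to K^{\otimes_S n}$ is a bounded operator whose expression can be obtained from the formal expression of $\phi_i$ in terms of creation and annihilation operators. For instance, if $\Phi \in K^{\otimes_S n}$,

$$\big(:\partial^l c_i^{-,+}\,\partial^k c_j^{+,-}:(f)^{(n)}\Phi\big)^{\tau_1\dots\tau_n}_{i_1\dots i_n}(\mathbf p_1, \dots, \mathbf p_n) = \sum_{r=1}^n \delta_{\tau_r,+}\,\delta_{i,i_r}\,i^{l+k}(-1)^k\pi \int_{\mathbb R^3} d\mathbf p\,\omega_m(\mathbf p)^{k-1/2}\omega_m(\mathbf p_r)^{l-1/2}\,\hat f(p_{r,+}, -p_+)\,\Phi^{+\tau_1\dots\hat\tau_r\dots\tau_n}_{j\,i_1\dots\hat i_r\dots i_n}(\mathbf p, \mathbf p_1, \dots, \hat{\mathbf p}_r, \dots, \mathbf p_n),$$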

where the hat over an index means that the index itself must be omitted and where we have introduced the convention (which we will use systematically in the following) of denoting simply by $q_\sigma \in \mathbb R^4$ the 4-vector $(\sigma\omega_m(\mathbf q), \mathbf q)$, $\sigma = +, -$. We now want to show that such operators can be extended to suitable spaces of tempered distributions on $\mathbb R^8$, which in turn are left invariant by the operation induced by the commutator of field bilinears.

Definition 2.1. We denote by $\hat{\mathcal C}$ the space of functions $f \in C^\infty(\mathbb R^8)$ such that for all $r \in \mathbb N$, $\alpha, \beta \in \mathbb N_0^4$,

$$\|f\|_{r,\alpha,\beta} = \sup_{(p,q)\in\mathbb R^8} |(1 + |p + q|)^r\,\partial_p^\alpha\partial_q^\beta f(p, q)| < \infty.$$

Introducing the notation $\tilde f(p, q) := f(q, p)$ and the expressions

$$(T^{k,l}(f)\Phi)(\mathbf p) := \int_{\mathbb R^3} d\mathbf q\,\omega_m(\mathbf p)^{k-1/2}\omega_m(\mathbf q)^{l-1/2} f(p_+, -q_+)\Phi(\mathbf q),$$

$$\Phi_f^{k,l,\sigma}(\mathbf p, \mathbf q) := f(\sigma p_+, \sigma q_+)\,\omega_m(\mathbf p)^{k-1/2}\omega_m(\mathbf q)^{l-1/2},$$

where $k, l = 0, 1$ and $\sigma = +, -$, we denote by $\hat{\mathcal C}^{k,l}$ the space of functions $f \in \hat{\mathcal C}$ such that $T^{k,l}(|f|), T^{l,k}(|\tilde f|) : L^2(\mathbb R^3) \to L^2(\mathbb R^3)$ are bounded operators and $\Phi_f^{k,l,\sigma} \in L^2(\mathbb R^6)$. Furthermore, we introduce on $\hat{\mathcal C}^{k,l}$ the seminorm

$$\|f\|_{k,l} := \max\{\|T^{k,l}(|f|)\|, \|T^{l,k}(|\tilde f|)\|, \|\Phi_f^{k,l,\sigma}\|_{L^2(\mathbb R^6)}\}.$$

The spaces $\hat{\mathcal C}^{k,l}$ depend also on the mass $m$ appearing in $\omega_m$, but we have avoided to indicate this explicitly in order not to burden the notations. It is clear that functions in $\hat{\mathcal C}$ are bounded with all their derivatives and therefore $\hat{\mathcal C}^{k,l} \subset \mathcal S'(\mathbb R^8)$. We denote then by $\mathcal C^{k,l}$ the space of distributions $f \in \mathcal S'(\mathbb R^8)$ such that $\hat f \in \hat{\mathcal C}^{k,l}$. It is also easy to verify that $\mathcal S(\mathbb R^8) \subset \mathcal C^{k,l}$.

Lemma 2.1. The expression

$$\hat C^{l,k}(f, g)(p, q) := (-1)^l\pi\sum_{\sigma=\pm}\sigma(i\sigma)^{k+l}\int_{\mathbb R^3} d\mathbf k\,\omega_m(\mathbf k)^{l+k-1} f(p, -\sigma k_+)\,g(\sigma k_+, q), \tag{2.2}$$

defines a bilinear map $\hat C^{l,k} : \hat{\mathcal C}^{l',l} \times \hat{\mathcal C}^{k,k'} \to \hat{\mathcal C}^{l',k'}$, such that $\|\hat C^{l,k}(f, g)\|_{l',k'} \le 2\pi\|f\|_{l',l}\|g\|_{k,k'}$.

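Proof. We start by showing that if $f, g \in \hat{\mathcal C}$ then $\hat C^{l,k}(f, g) \in \hat{\mathcal C}$. Setting $\varepsilon = 2/|p + q|$, and $e = \varepsilon(p + q)/2$, it is clearly sufficient to show that, as $\varepsilon \to 0$,

$$I_{h,r}(\varepsilon) := \int_{\mathbb R^3} \frac{|x|^{h/2}\,dx}{(1 + |x + \varepsilon^{-1}e|)^r(1 + |x - \varepsilon^{-1}e|)^r} \le O(\varepsilon^{s(r,h)}), \tag{2.3}$$

where $h = k + l - 1 = -1, 0, 1$, and $s(r, h) \to +\infty$ as $r \to +\infty$. Consider first the case $h = 0$. Choosing the $x_3$ axis along $e$ and evaluating the integral in prolate spheroidal coordinates $x_1 = \varepsilon^{-1}\sqrt{(u^2 - 1)(1 - v^2)}\cos\phi$, $x_2 = \varepsilon^{-1}\sqrt{(u^2 - 1)(1 - v^2)}\sin\phi$, $x_3 = \varepsilon^{-1}uv$, one gets

$$I_{0,r}(\varepsilon) = 2\pi\varepsilon^{2r-3}\left[\int_{1+\varepsilon}^{+\infty} du\,J_{r-1}(u) + \varepsilon^2\int_{1+\varepsilon}^{+\infty} du\,J_r(u) - 2\varepsilon\int_{1+\varepsilon}^{+\infty} du\,u\,J_r(u)\right],$$

where, by recursion,

$$J_r(u) := \int_{-1}^{1}\frac{dv}{(u^2 - v^2)^r} = \sum_{k=1}^{r-1}\frac{2\,(2r-3)\cdots(2r-2k+1)}{(2r-2)\cdots(2r-2k)}\,\frac{1}{u^{2k}(u^2-1)^{r-k}} + \frac{(2r-3)!!}{(2r-2)!!}\,\frac{1}{u^{2r-1}}\log\frac{u+1}{u-1},$$

which easily gives estimate (2.3) with $s(r, 0) = r - 3$. Take now $h = -1$. Dividing the integration region into the subregions $\{|x| \le 1\}$, $\{|x| > 1\}$ and using the Cauchy–Schwarz inequality in the first integral, one gets

$$I_{-1,r}(\varepsilon) \le \left(\int_{|x|\le 1}|x|^{-1}dx\right)^{1/2} I_{0,2r}(\varepsilon)^{1/2} + I_{0,r}(\varepsilon) \le O(\varepsilon^{r-3}).$$

Finally, for $h = 1$, taking into account the bound $|x|^{1/2}/\big((1 + |x + \varepsilon^{-1}e|)(1 + |x - \varepsilon^{-1}e|)\big) \le 1/2$, one gets $I_{1,r}(\varepsilon) \le O(\varepsilon^{r-4})$.

We now show that if $f \in \hat{\mathcal C}^{l',l}$, $g \in \hat{\mathcal C}^{k,k'}$, then $\hat C^{l,k}(f, g) \in \hat{\mathcal C}^{l',k'}$. We introduce the notation $K_\Psi$ to denote the Hilbert–Schmidt operator on $L^2(\mathbb R^3)$ with kernel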


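$\Psi \in L^2(\mathbb R^6)$. It is then easy to verify that, if $\Phi \in L^2(\mathbb R^3)$,

$$\|T^{l',k'}(|\hat C^{l,k}(f, g)|)\Phi\|_2 \le \pi\big(\|T^{l',l}(|f|)T^{k,k'}(|g|)|\Phi|\|_2 + \|K_{\Phi^{l',l,+}_{|f|}}K_{\Phi^{k,k',-}_{|g|}}|\Phi|\|_2\big),$$

$$\|T^{k',l'}(|\widetilde{\hat C^{l,k}(f, g)}|)\Phi\|_2 \le \pi\big(\|T^{k',k}(|\tilde g|)T^{l,l'}(|\tilde f|)|\Phi|\|_2 + \|K_{\Phi^{l',l,-}_{|f|}}K_{\Phi^{k,k',+}_{|g|}}|\Phi|\|_2\big),$$

so that $T^{l',k'}(|\hat C^{l,k}(f, g)|)$ and $T^{k',l'}(|\widetilde{\hat C^{l,k}(f, g)}|)$ are bounded. Furthermore one has, for $\Psi \in L^2(\mathbb R^6)$,

$$|\langle\Phi^{l',k',+}_{\hat C^{l,k}(f,g)}, \Psi\rangle_{L^2(\mathbb R^6)}| \le \pi\big(\langle\Phi^{k,k',+}_{|g|}, (T^{l',l}(|f|)^* \otimes 1)|\Psi|\rangle_{L^2(\mathbb R^6)} + \langle\Phi^{l',l,+}_{|f|}, (1 \otimes T^{k',k}(|\tilde g|)^*)|\Psi|\rangle_{L^2(\mathbb R^6)}\big),$$

$$|\langle\Phi^{l',k',-}_{\hat C^{l,k}(f,g)}, \Psi\rangle_{L^2(\mathbb R^6)}| \le \pi\big(\langle\Phi^{l',l,-}_{|f|}, (1 \otimes T^{k,k'}(|g|))|\Psi|\rangle_{L^2(\mathbb R^6)} + \langle\Phi^{k,k',-}_{|g|}, (T^{l,l'}(|\tilde f|) \otimes 1)|\Psi|\rangle_{L^2(\mathbb R^6)}\big),$$

so that by Riesz theorem $\Phi^{l',k',\sigma}_{\hat C^{l,k}(f,g)} \in L^2(\mathbb R^6)$. The bound on $\|\hat C^{l,k}(f, g)\|_{l',k'}$ now follows at once from the above estimates.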
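The closed form for $J_r(u)$ used in the proof can be checked numerically. The sketch below is an illustration only: it codes the sum-plus-logarithm expression for $J_r(u)$, which follows from the standard reduction $J_r = \frac{2}{(2r-2)u^2(u^2-1)^{r-1}} + \frac{2r-3}{(2r-2)u^2}J_{r-1}$ with $J_1(u) = u^{-1}\log\frac{u+1}{u-1}$, and compares it with a composite Simpson quadrature. The function names and the quadrature resolution are choices made here, not part of the paper.

```python
import math

def double_factorial(n):
    # n!! with the convention (-1)!! = 0!! = 1.
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def J_closed(r, u):
    # Closed form of J_r(u) = int_{-1}^{1} dv / (u^2 - v^2)^r for u > 1, r >= 1.
    total = 0.0
    for k in range(1, r):
        coeff = 1.0
        for j in range(1, k):        # numerator (2r-3)(2r-5)...(2r-2k+1)
            coeff *= 2 * r - 2 * j - 1
        for j in range(1, k + 1):    # denominator (2r-2)(2r-4)...(2r-2k)
            coeff /= 2 * r - 2 * j
        total += coeff * 2.0 / (u ** (2 * k) * (u * u - 1.0) ** (r - k))
    log_coeff = double_factorial(2 * r - 3) / double_factorial(2 * r - 2)
    return total + log_coeff * math.log((u + 1.0) / (u - 1.0)) / u ** (2 * r - 1)

def J_simpson(r, u, steps=2000):
    # Composite Simpson rule on [-1, 1]; steps must be even.
    h = 2.0 / steps
    f = lambda v: 1.0 / (u * u - v * v) ** r
    s = f(-1.0) + f(1.0)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * f(-1.0 + i * h)
    return s * h / 3.0
```

For moderate values such as `u = 1.5` or `u = 3.0` and `r` between 1 and 4 the two functions agree to many digits. The logarithmic term grows like $\log(1/\varepsilon)$ as $u \to 1 + \varepsilon$, while the terms in the sum carry the inverse powers of $u^2 - 1$.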

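For $(f, g) \in \mathcal C^{l',l} \times \mathcal C^{k,k'}$, we write $C^{l,k}(f, g) := \hat C^{l,k}(\hat f, \hat g)^\vee$.

Proposition 2.1. The following statements hold for any $i, j \in \{1, 2\}$, $k, l \in \{0, 1\}$, $n \in \mathbb N$, $\sigma, \varepsilon \in \{+, -\}$, with $n - \sigma - \varepsilon \ge 0$.

(1) The map $f \in \mathcal S(\mathbb R^8) \to\ :\partial^l c_i^{-,\sigma}\,\partial^k c_j^{+,\varepsilon}:(f)^{(n)} \in B(K^{\otimes_S(n-\sigma-\varepsilon)}, K^{\otimes_S n})$ can be extended to a map (denoted by the same symbol) from $\mathcal C^{l,k}$ to $B(K^{\otimes_S(n-\sigma-\varepsilon)}, K^{\otimes_S n})$, such that

$$\|:\partial^l c_i^{-,\sigma}\,\partial^k c_j^{+,\varepsilon}:(f)^{(n)}\| \le \pi\|\hat f\|_{l,k}(n + 2). \tag{2.4}$$

(2) For each $f \in \mathcal C^{l,k}$ the operator $:\partial^l\phi_i\,\partial^k\phi_j^\dagger:(f)$, defined on $\tilde D_0$ by formula (2.1), satisfies

$$\|(\tilde N + 1)^{-1/2} :\partial^l\phi_i\,\partial^k\phi_j^\dagger:(f)\,(\tilde N + 1)^{-1/2}\| \le \upsilon\|\hat f\|_{l,k}, \tag{2.5}$$

$$\|(\tilde N + 1)^{-1/2}[\tilde N, :\partial^l\phi_i\,\partial^k\phi_j^\dagger:(f)](\tilde N + 1)^{-1/2}\| \le \upsilon\|\hat f\|_{l,k}, \tag{2.6}$$

for some $\upsilon > 0$. If furthermore $(f, g) \in \mathcal C^{l',l} \times \mathcal C^{k,k'}$, there holds, on $\tilde D_0$,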

[: ∂ l φi ∂ l φ†i : (f ), : ∂ k φj ∂ k φ†j : (g)]

= δij : ∂ l φi ∂ k φ†j : (C l,k (f, g)) − δi ,j : ∂ k φj ∂ l φ†i : (C k ,l (g, f )) ,+ + il+l +k+k π 2 δi ,j δi,j ((−1)l+l Φlfˆ,l,− , Φk,k L2 (R6 ) g ˆ k,k ,− − (−1)k+k Φlfˆ,l,+ , Φ L2 (R6 ) )1. g ˆ

(2.7)


Proof. (1) Deﬁne the contraction operator Π(ψ) : K ⊗(n+2) → K ⊗n , ψ ∈ K ⊗2 , by Π(ψ)ψ1 ⊗· · ·⊗ψn+2 = ψ, ψ1 ⊗ψ2 ψ3 ⊗· · ·⊗ψn+2 . It is easily seen from the usual expressions of creation and annihilation operators (see, e.g., [12, Sec. X.7]) that for f ∈ S (R8 ) : ∂ l ci−,+ ∂ k cj+,− : (f )(n) = il (−i)k π

n

+ Vr ((T l,k (fˆ) ⊗ |e+ i ej |) ⊗ 1 ⊗ · · · ⊗ 1),

r=1

: ∂ l ci−,+ ∂ k cj+,+

(n)

: (f )

: ∂ l ci−,− ∂ k cj+,− : (f )(n)

il+k π

1,n

− ∗ = Wr,s Π(Φl,k,+ ⊗ (e+ i ⊗ ej )) , fˆ n(n − 1) r=s + = (−i)l+k π (n + 1)(n + 2)Π(Φl,k,− ⊗ (e− i ⊗ ej )), fˆ

where for ψi ∈ K, i = 1, . . . , n, Vr ψ1 ⊗ · · · ⊗ ψn = ψ2 ⊗ · · · ⊗ ψ1 ⊗ · · · ⊗ ψn , r th place

Wr,s ψ1 ⊗ · · · ⊗ ψn = ψ3 ⊗ · · · ⊗ ψ1 ⊗ · · · ⊗ ψ2 ⊗ · · · ⊗ ψn . rth place

sth place

Thus the above formulas provide an extension of $:\partial^l c_i^{-,\sigma}\,\partial^k c_j^{+,\varepsilon}:(\cdot)^{(n)}$ to $\mathcal C^{l,k}$ and the bound (2.4) holds.

(2) The bounds (2.5) and (2.6), with $\upsilon = 4\pi(\sqrt 3 + 1)$, follow easily from (2.4). Equation (2.7) is obtained by a straightforward (if lengthy) calculation, using the above expressions for $:\partial^l c_i^{-,\sigma}\,\partial^k c_j^{+,\varepsilon}:(\cdot)^{(n)}$.

Remark 2.1. It is not difficult to see that the above defined extension of $:\partial^l c_i^{-,\sigma}\,\partial^k c_j^{+,\varepsilon}:(\cdot)^{(n)}$ to $\mathcal C^{l,k}$ is unique in the family of linear maps $S : \mathcal C^{l,k} \to B(K^{\otimes_S(n-\sigma-\varepsilon)}, K^{\otimes_S n})$ which are sequentially continuous when $B(K^{\otimes_S(n-\sigma-\varepsilon)}, K^{\otimes_S n})$ is equipped with the strong operator topology and $\mathcal C^{l,k}$ is equipped with the topology induced by the family of seminorms

$$\|f\|_{k,l,\Psi} = \max\{\|T^{k,l}(|\hat f|)\Psi\|, \|T^{l,k}(|\tilde{\hat f}|)\Psi\|, \|\Phi^{k,l,\sigma}_{\hat f}\|_{L^2(\mathbb R^6)}\}, \qquad \Psi \in L^2(\mathbb R^3),$$

with respect to which $\mathcal S(\mathbb R^8)$ is sequentially dense in $\mathcal C^{l,k}$. On the other hand, we point out the fact that, according to Eq. (2.7), the linear span of extended field bilinears is stable under the operation of taking commutators. Together with Proposition A.1 in the Appendix, this implies that in the construction of the local symmetry generator carried out in the following section, Eq. (3.8), only the above defined extensions are relevant.

According to the results in [12, Sec. X.5], the bounds (2.5) and (2.6) imply that $:\partial^l\phi_i\,\partial^k\phi_j^\dagger:(f)$ can be extended to an operator, denoted by the same symbol, whose domain contains $D(\tilde N)$.


3. Reconstruction of the Free Field Noether Currents

We start by considering the theory of a complex free scalar field $\phi$ of mass $m \ge 0$. The Hilbert space of the theory is the symmetric Fock space $\mathcal H = \Gamma(L^2(\mathbb R^3) \otimes \mathbb C^2)$. As customary, we denote by $D_0 \subset \mathcal H$ the space of finite particle vectors, and by $N$ the number operator $N = d\Gamma(1)$, with domain $D(N)$. The local field algebras are defined as usual by

$$\mathcal F(O) := \{e^{i[\phi(f) + \phi(f)^*]^-} : f \in \mathcal D(O)\}'',$$

and if we consider

$$V(\theta) := \Gamma\left(1 \otimes \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix}\right),$$

we obtain a continuous unitary representation of U(1) (i.e. a $2\pi$-periodic representation of $\mathbb R$) on $\mathcal H$, $\theta \in \mathbb R \to V(\theta)$, which induces a group of gauge automorphisms $\beta_\theta := \mathrm{Ad}\,V(\theta)$ of $\mathcal F$ such that $\beta_\theta(\phi(f)) = e^{i\theta}\phi(f)$. We denote by $Q$ the self-adjoint generator of this group. It is easy to see that $\|(N + 1)^{-1/2}Q(N + 1)^{-1/2}\| \le 1$ and $[N, Q] = 0$, so that thanks to Nelson's commutator theorem (cf. [12, Sec. X.5]) $D(N) \subset D(Q)$. Furthermore we introduce the unitary operator $Z$ on $\mathcal H$ such that $Z\phi(f)Z^* = -\phi(f)$, $Z\Omega = \Omega$.

In order to find an explicit representation of the (semi-)local implementation of the flip automorphism we consider, following [9], the doubled theory $O \to \tilde{\mathcal F}(O) := \mathcal F(O)\,\bar\otimes\,\mathcal F(O)$, generated by the two commuting complex scalar fields $\phi_1(f) := \phi(f) \otimes 1$, $\phi_2(f) := 1 \otimes \phi(f)$. There is a continuous unitary representation of U(1) on $\tilde{\mathcal H} = \mathcal H \otimes \mathcal H$, $\zeta \in \mathbb R \to Y(\zeta)$, which induces a group of gauge automorphisms $\gamma_\zeta := \mathrm{Ad}\,Y(\zeta)$ of $\tilde{\mathcal F}$ such that

$$\gamma_\zeta(\phi_1(f)) = \cos\zeta\,\phi_1(f) - \sin\zeta\,\phi_2(f), \qquad \gamma_\zeta(\phi_2(f)) = \sin\zeta\,\phi_1(f) + \cos\zeta\,\phi_2(f). \tag{3.1}$$
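A quick way to see how the rotation (3.1) relates to the flip is to track its action on the coefficients of $a_1\phi_1 + a_2\phi_2$. The sketch below is an illustration under stated assumptions: it represents $\gamma_\zeta$ by the $2 \times 2$ matrix read off from (3.1), represents conjugation by $1 \otimes Z$ as the sign change $\phi_2 \to -\phi_2$ (from $Z\phi(f)Z^* = -\phi(f)$), and composes the two at $\zeta = \pi/2$. The names `gamma`, `SIGN` and `compose` are introduced here for the example only.

```python
import math

def gamma(zeta):
    # Matrix of gamma_zeta on coefficient vectors (a1, a2) of a1*phi1 + a2*phi2.
    # Columns are the images of phi1 and phi2 as given in (3.1).
    c, s = math.cos(zeta), math.sin(zeta)
    return [[c, s], [-s, c]]

# Conjugation by 1 (x) Z: phi1 -> phi1, phi2 -> -phi2.
SIGN = [[1.0, 0.0], [0.0, -1.0]]

def compose(a, b):
    # Matrix of "apply b first, then a".
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Rotate by pi/2, then undo the sign picked up by phi2.
flip = compose(SIGN, gamma(math.pi / 2))
```

Up to rounding, `flip` is the permutation matrix exchanging the two components: $\gamma_{\pi/2}$ sends $\phi_1 \to -\phi_2$ and $\phi_2 \to \phi_1$, and the sign change on the second field turns this into $\phi_1 \leftrightarrow \phi_2$.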

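In Proposition A.1 in the Appendix it is shown that the Noether current of this U(1) symmetry

$$J_\mu(x) = \phi_1(x)\partial_\mu\phi_2(x)^* + \phi_1(x)^*\partial_\mu\phi_2(x) - \partial_\mu\phi_1(x)\phi_2(x)^* - \partial_\mu\phi_1(x)^*\phi_2(x) \tag{3.2}$$

is a well-defined Wightman field that when smeared with an $h \in \mathcal S_{\mathbb R}(\mathbb R^4)$ gives an operator which is essentially self-adjoint on $D(\tilde N)$, and generates a group of unitaries which locally implements the symmetry: given 3-dimensional open balls $B_r$, $B_{r+\delta}$ centered at the origin of radii $r + \delta > r > 0$ together with functions $\varphi \in \mathcal D_{\mathbb R}(B_{r+\delta-\tau})$, $\psi \in \mathcal D_{\mathbb R}((-\tau, \tau))$ such that $\tau < \delta/2$, $\varphi(\mathbf x) = 1$ for each $\mathbf x \in B_{r+\tau}$ and $\int_{\mathbb R}\psi = 1$, it holds that

$$e^{i\zeta J_0(\psi\otimes\varphi)} \in \tilde{\mathcal F}(O_{r+\delta}), \qquad e^{i\zeta J_0(\psi\otimes\varphi)}\,F\,e^{-i\zeta J_0(\psi\otimes\varphi)} = \gamma_\zeta(F), \quad \forall\, F \in \tilde{\mathcal F}(O_r), \tag{3.3}$$

where $O_r$, $O_{r+\delta}$ are the double cones with bases $B_r$, $B_{r+\delta}$, respectively. It then follows easily that setting $h_\lambda := \psi_\lambda \otimes \varphi_\lambda$ with $\varphi_\lambda(\mathbf x) = \varphi(\lambda^{-1}\mathbf x)$ and $\psi_\lambda(t) = \lambda^{-1}\psi(\lambda^{-1}t)$, the unitary operator

$$W_{\lambda O_r,\lambda O_{r+\delta}} := (1 \otimes Z)\,e^{i\frac{\pi}{2}J_0(h_\lambda)} \in \mathcal F(\lambda O_{r+\delta})\,\bar\otimes\,B(\mathcal H), \tag{3.4}$$

is a semi-local implementation of the flip automorphism on $\tilde{\mathcal F}(\lambda O_r)$ for each $\lambda > 0$. In what follows, we will keep the functions $\varphi$, $\psi$ fixed and we will assume that $\varphi(R\mathbf x) = \varphi(\mathbf x)$ for each $R \in O(3)$. For a function $h \in \mathcal S(\mathbb R^4)$, we introduce the distribution $h_\delta \in \mathcal S'(\mathbb R^8)$ defined by $h_\delta(x, y) = h(x)\delta(x - y)$ (i.e. $\langle h_\delta, f\rangle = \int_{\mathbb R^4} dx\,h(x)f(x, x)$ for $f \in \mathcal S(\mathbb R^8)$).

Proposition 3.1. Let the operator $W_{\lambda O_r,\lambda O_{r+\delta}}$ be defined as above. The operator $\Xi_{\lambda O_r,\lambda O_{r+\delta}}(Q)$ defined on $D(N)$ by

$$\Xi_{\lambda O_r,\lambda O_{r+\delta}}(Q)\Phi = P_1\,W_{\lambda O_r,\lambda O_{r+\delta}}(1 \otimes Q)W^*_{\lambda O_r,\lambda O_{r+\delta}}\,\Phi \otimes \Omega, \qquad \Phi \in D(N), \tag{3.5}$$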

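where $P_1(\Phi_1 \otimes \Phi_2) = \langle\Omega, \Phi_2\rangle\Phi_1$, is essentially self-adjoint. Furthermore, there are distributions $K^{l,k}_{n,m}(\lambda) \in \mathcal C^{l,k}$, $n \in \mathbb N$, $l, k = 0, 1$, $m \ge 0$, defined recursively by

$$K^{1,0}_{1,m}(\lambda) = -K^{0,1}_{1,m}(\lambda) := (h_\lambda)_\delta, \qquad K^{0,0}_{1,m}(\lambda) = K^{1,1}_{1,m}(\lambda) = 0, \tag{3.6}$$

$$K^{l,k}_{n+1,m}(\lambda) = i(-1)^n\sum_{r=0}^{1}\big[(-1)^{l+1}C^{1-l,r}((h_\lambda)_\delta, K^{r,k}_{n,m}(\lambda)) + (-1)^k C^{r,1-k}(K^{l,r}_{n,m}(\lambda), (h_\lambda)_\delta)\big], \tag{3.7}$$

such that, for all $\Phi \in D(N)$,

$$\Xi_{\lambda O_r,\lambda O_{r+\delta}}(Q)\Phi = \sum_{n=1}^{+\infty}\frac{\pi^{2n}}{(2n)!\,4^n}\left(\sum_{l,k}^{0,1} :\partial^l\phi\,\partial^k\phi^\dagger:(K^{l,k}_{2n,m}(\lambda))\Phi\right), \tag{3.8}$$

the series being absolutely convergent for all $\lambda \in (0, 1]$.

Proof. We start by observing that, for all $\Phi \in \mathcal H$ for which the right-hand side of (3.5) is defined, one has

$$\Xi_{\lambda O_r,\lambda O_{r+\delta}}(Q)\Phi = P_1\,e^{i\frac{\pi}{2}J_0(h_\lambda)}(1 \otimes Q)\,e^{-i\frac{\pi}{2}J_0(h_\lambda)}\,\Phi \otimes \Omega. \tag{3.9}$$

It follows from this formula that $\Xi_{\lambda O_r,\lambda O_{r+\delta}}(Q)$ is well-defined (and symmetric) on $D(N)$: according to formula (A.1) in the Appendix for $J_0(h_\lambda)$, Proposition 2.1(2) and [11, Lemma 2], we have $e^{i\frac{\pi}{2}J_0(h_\lambda)}D(\tilde N) \subset D(\tilde N)$ and $D(N) \subset D(Q)$ as remarked above.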


Recalling now the definition of $Q$ one has on $D(\tilde N)$

$$Q_1(\lambda) := i[J_0(h_\lambda), 1 \otimes Q] = \sum_{j=1}^{2}\big[:\partial\phi_j\,\phi^\dagger_{j'}:((h_\lambda)_\delta) - :\phi_j\,\partial\phi^\dagger_{j'}:((h_\lambda)_\delta)\big] = \sum_{j=1}^{2}\sum_{l,k}^{0,1} :\partial^l\phi_j\,\partial^k\phi^\dagger_{j'}:(K^{l,k}_{1,m}(\lambda)),$$

where $j' = 3 - j$. Proceeding now inductively using formula (2.7), one verifies that there are operators $Q_n(\lambda)$ such that, on $\tilde D_0$,

$$Q_{n+1}(\lambda) = i[J_0(h_\lambda), Q_n(\lambda)], \tag{3.10}$$

$$Q_{2n}(\lambda) = \sum_{j=1}^{2}\sum_{l,k}^{0,1}(-1)^{j+1} :\partial^l\phi_j\,\partial^k\phi^\dagger_j:(K^{l,k}_{2n,m}(\lambda)), \qquad Q_{2n+1}(\lambda) = \sum_{j=1}^{2}\sum_{l,k}^{0,1} :\partial^l\phi_j\,\partial^k\phi^\dagger_{j'}:(K^{l,k}_{2n+1,m}(\lambda)), \tag{3.11}$$

where the distributions $K^{l,k}_{n,m}(\lambda) \in \mathcal C^{l,k}$ satisfy (3.7). It is also easy to verify inductively that the distributions $K^{l,k}_{n,m}(\lambda)$ are real ($g \in \mathcal S'$ being real if $\overline{\langle g, f\rangle} = \langle g, \bar f\rangle$), so that $Q_n(\lambda)$ is symmetric. Arguing again by induction, it follows from (3.7) and Lemma 2.1, that

$$\|\hat K^{l,k}_{n,m}(\lambda)\|_{l,k} \le (8\pi)^{n-1}\big(\max\{\|\widehat{(h_\lambda)_\delta}\|_{0,1}, \|\widehat{(h_\lambda)_\delta}\|_{1,0}\}\big)^n \le (8\pi)^{n-1}\|h\|_{\mathcal S}^n,$$

where $\|h\|_{\mathcal S}$ is some fixed Schwartz norm of $h$. The last inequality above follows from Lemma A.1 and from the observation that, switching for a moment to the notation $\|\cdot\|^{(m)}_{l,k}$ in order to make explicit the dependence on the mass $m$ of the seminorms $\|\cdot\|_{l,k}$, one has

$$\|\widehat{(h_\lambda)_\delta}\|^{(m)}_{l,1-l} = \|\hat h_\delta\|^{(\lambda m)}_{l,1-l}, \qquad l = 0, 1.$$

Using now the bounds in Proposition 2.1(2) and the results in [12, Sec. X.5], we see that $Q_n(\lambda)$ can be extended to an operator (denoted by the same symbol) which is essentially self-adjoint on any core for $\tilde N$. The domain $\tilde D_0$ being such a core, Eq. (3.10) can be assumed to hold weakly on $D(\tilde N) \times D(\tilde N)$ and we are therefore in the position of applying [11, Theorem 1∞] to obtain

$$e^{i\frac{\pi}{2}J_0(h_\lambda)}(1 \otimes Q)\,e^{-i\frac{\pi}{2}J_0(h_\lambda)} = 1 \otimes Q + \sum_{n=1}^{+\infty}\frac{1}{n!}\Big(\frac{\pi}{2}\Big)^n Q_n(\lambda)$$

and the series converges strongly absolutely on $D(\tilde N)$. Combining this with (3.9), and the fact that

$$P_1 :\partial^l\phi_j\,\partial^k\phi^\dagger_{j'}:(K^{l,k}_{2n+1,m}(\lambda))\,\Phi \otimes \Omega = 0 = P_1 :\partial^l\phi_2\,\partial^k\phi^\dagger_2:(K^{l,k}_{2n,m}(\lambda))\,\Phi \otimes \Omega,$$

Eq. (3.8) readily follows, upon identification of $\phi_1(f) = \phi(f) \otimes 1$ with $\phi(f)$.
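The weights in (3.8) are the even-order coefficients $(\pi/2)^{2n}/(2n)!$ of the commutator series. Combined with the bound $(8\pi)^{2n-1}\|h\|_{\mathcal S}^{2n}$ on the seminorms of $\hat K^{l,k}_{2n,m}(\lambda)$, they give a majorant that sums to a hyperbolic cosine. The sketch below is a numerical illustration only: the parameter `s` stands in for the Schwartz norm of $h$ and its values are arbitrary, and the truncation order is a choice made here.

```python
import math

def coefficient(n):
    # Weight of the 2n-th commutator in (3.8): (pi/2)^(2n) / (2n)!.
    return (math.pi / 2.0) ** (2 * n) / math.factorial(2 * n)

def majorant(s, terms=60):
    # Sum over n >= 1 of coefficient(n) * (8*pi)^(2n-1) * s^(2n),
    # where s plays the role of the Schwartz norm of h (assumed value).
    return sum(coefficient(n) * (8.0 * math.pi) ** (2 * n - 1) * s ** (2 * n)
               for n in range(1, terms + 1))

def closed_form(s):
    # Since (pi/2)^(2n) (8 pi)^(2n) = (4 pi^2)^(2n), the sum is (cosh(4 pi^2 s) - 1) / (8 pi).
    return (math.cosh(4.0 * math.pi ** 2 * s) - 1.0) / (8.0 * math.pi)
```

The truncated sum `majorant(s)` and `closed_form(s)` agree to rounding for small `s`, which makes the absolute convergence of the series explicit at the level of these scalar bounds.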


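It remains to prove that $\Xi_{\lambda O_r,\lambda O_{r+\delta}}(Q)$ is essentially self-adjoint on $D(N)$, but this again follows from the easily obtained $N$-bounds

$$\|(N + 1)^{-1/2}\,\Xi_{\lambda O_r,\lambda O_{r+\delta}}(Q)\,(N + 1)^{-1/2}\| \le \gamma\cosh(4\pi^2\|h\|_{\mathcal S}),$$

$$\|(N + 1)^{-1/2}[N, \Xi_{\lambda O_r,\lambda O_{r+\delta}}(Q)](N + 1)^{-1/2}\| \le \gamma\cosh(4\pi^2\|h\|_{\mathcal S}), \tag{3.12}$$

where $\gamma > 0$ is a suitable numerical constant.

We now show that the unitary group generated by the operator $\Xi_{O_r,O_{r+\delta}}(Q)$ defined in the above proposition provides a local implementation of the U(1) symmetry.

Proposition 3.2. For each $\theta \in \mathbb R$ and $F \in \mathcal F(O_r)$ there holds:

$$e^{i\theta\Xi_{O_r,O_{r+\delta}}(Q)} \in \mathcal F(O_{r+\delta}), \qquad e^{i\theta\Xi_{O_r,O_{r+\delta}}(Q)}\,F\,e^{-i\theta\Xi_{O_r,O_{r+\delta}}(Q)} = \beta_\theta(F).$$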

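Proof. Since the free field enjoys Haag duality property, it is sufficient to show that

$$e^{i\theta\Xi_{O_r,O_{r+\delta}}(Q)}\,e^{i[\phi(f) + \phi(f)^*]^-}\,e^{-i\theta\Xi_{O_r,O_{r+\delta}}(Q)} = e^{i[\phi(f) + \phi(f)^*]^-}$$

if $\mathrm{supp}\,f \subset O'_{r+\delta}$ and that

$$e^{i\theta\Xi_{O_r,O_{r+\delta}}(Q)}\,e^{i[\phi(f) + \phi(f)^*]^-}\,e^{-i\theta\Xi_{O_r,O_{r+\delta}}(Q)} = e^{i[e^{i\theta}\phi(f) + e^{-i\theta}\phi(f)^*]^-}$$

if $\mathrm{supp}\,f \subset O_r$. Applying once again [11, Theorem 1∞] and keeping in mind the previously obtained $N$-bounds for $\Xi_{O_r,O_{r+\delta}}(Q)$, Eq. (3.12), one sees that in order to achieve this, it is enough to show that for all $\Phi_1, \Phi_2 \in D(N)$

$$\langle\Xi_{O_r,O_{r+\delta}}(Q)\Phi_1, \phi(f)\Phi_2\rangle - \langle\phi(f)^*\Phi_1, \Xi_{O_r,O_{r+\delta}}(Q)\Phi_2\rangle = 0 \tag{3.13}$$

for $\mathrm{supp}\,f \subset O'_{r+\delta}$ and

$$\langle\Xi_{O_r,O_{r+\delta}}(Q)\Phi_1, \phi(f)\Phi_2\rangle - \langle\phi(f)^*\Phi_1, \Xi_{O_r,O_{r+\delta}}(Q)\Phi_2\rangle = \langle\Phi_1, \phi(f)\Phi_2\rangle \tag{3.14}$$

for $\mathrm{supp}\,f \subset O_r$. In order to prove the latter equation we compute

$$\begin{aligned}
\langle\Xi_{O_r,O_{r+\delta}}(Q)\Phi_1, \phi(f)\Phi_2\rangle
&= \langle(1 \otimes Q)e^{-i\frac{\pi}{2}J_0(h)}\Phi_1 \otimes \Omega,\ e^{-i\frac{\pi}{2}J_0(h)}(\phi(f) \otimes 1)\Phi_2 \otimes \Omega\rangle \\
&= \langle(1 \otimes Q)e^{-i\frac{\pi}{2}J_0(h)}\Phi_1 \otimes \Omega,\ (1 \otimes \phi(f))e^{-i\frac{\pi}{2}J_0(h)}\Phi_2 \otimes \Omega\rangle \\
&= \langle(1 \otimes \phi(f)^*)e^{-i\frac{\pi}{2}J_0(h)}\Phi_1 \otimes \Omega,\ (1 \otimes Q)e^{-i\frac{\pi}{2}J_0(h)}\Phi_2 \otimes \Omega\rangle \\
&\quad + \langle e^{-i\frac{\pi}{2}J_0(h)}\Phi_1 \otimes \Omega,\ (1 \otimes \phi(f))e^{-i\frac{\pi}{2}J_0(h)}\Phi_2 \otimes \Omega\rangle \\
&= \langle\phi(f)^*\Phi_1, \Xi_{O_r,O_{r+\delta}}(Q)\Phi_2\rangle + \langle\Phi_1, \phi(f)\Phi_2\rangle,
\end{aligned}$$

where in the second and fourth equalities we used (3.1) and (3.3), and in the third equality the fact that, as noted in the proof of Proposition 3.1, $e^{-i\frac{\pi}{2}J_0(h)}\Phi_i \otimes \Omega \in D(\tilde N)$ and that for $\tilde\Phi_1, \tilde\Phi_2 \in D(\tilde N)$, there holds

$$\langle(1 \otimes Q)\tilde\Phi_1, (1 \otimes \phi(f))\tilde\Phi_2\rangle - \langle(1 \otimes \phi(f)^*)\tilde\Phi_1, (1 \otimes Q)\tilde\Phi_2\rangle = \langle\tilde\Phi_1, (1 \otimes \phi(f))\tilde\Phi_2\rangle$$

which in turn is an easy consequence of the commutation relation

$$[Q, \phi(f)]\Phi = \phi(f)\Phi, \qquad \Phi \in D(N),$$

of the fact that $\tilde N$ is the closure of $N \otimes 1 + 1 \otimes N$ and of the $\tilde N$-bounds holding for $1 \otimes Q$ and $1 \otimes \phi(f)$. The proof of (3.13) being analogous, we get the statement.

In the following lemma, we collect some properties of the distributions $K^{l,k}_{n,m} := K^{l,k}_{n,m}(1)$ which will be needed further on. We will use systematically the notations

$$\|f\|_\alpha := \sup_{p\in\mathbb R^4}(1 + |p_0| + |\mathbf p|)^\alpha|f(p)|, \qquad \|\varphi\|_\alpha := \sup_{\mathbf p\in\mathbb R^3}(1 + |\mathbf p|)^\alpha|\varphi(\mathbf p)|,$$

$$\|\psi\|_{1,\infty} := \max\{\|\psi\|_\infty, \|\psi'\|_\infty\}, \qquad \|\varphi\|_{1,\alpha} := \max\{\|\varphi\|_\alpha, \|\partial_1\varphi\|_\alpha, \dots, \|\partial_3\varphi\|_\alpha\},$$

for $f \in \mathcal S(\mathbb R^4)$, $\varphi \in \mathcal S(\mathbb R^3)$, $\psi \in \mathcal S(\mathbb R)$ and $\alpha > 0$.

Lemma 3.1. The following statements hold.

(1) The functions $\hat K^{l,k}_{n,m}$ enjoy the following symmetry properties:

$$\hat K^{l,k}_{n,m}(p, q) = -\hat K^{k,l}_{n,m}(q, p), \qquad \hat K^{l,k}_{n,m}(p_0, R\mathbf p, q_0, R\mathbf q) = \hat K^{l,k}_{n,m}(p, q) \tag{3.15}$$

for all $p = (p_0, \mathbf p)$, $q = (q_0, \mathbf q) \in \mathbb R^4$, and all $R \in O(3)$.

(2) Given $\alpha > 5$ there exists a constant $C_1 > 0$ such that, uniformly for all $m \in [0, 1]$ and all smearing functions $\varphi \in \mathcal D_{\mathbb R}(B_{r+\delta-\tau})$, $\psi \in \mathcal D_{\mathbb R}((-\tau, \tau))$,

$$|\hat K^{l,k}_{n,m}(p, q)| \le \frac{C_1^{n-1}}{4\pi^2}\|\hat\psi\|_\infty^n\|\hat\varphi\|_\alpha^n(1 + |\mathbf p|)^{2-l}(1 + |\mathbf q|)^{2-k}, \qquad n \in \mathbb N, \tag{3.16}$$

for all $p = (p_0, \mathbf p)$, $q = (q_0, \mathbf q) \in \mathbb R^4$.

(3) For each $n \in \mathbb N$, the function $(p, q, m) \in \mathbb R^8 \times [0, 1] \to \hat K^{l,k}_{n,m}(p, q)$ is continuous.

(4) For each $n \in \mathbb N$, the function $(p, q, m) \in \mathbb R^8 \times [0, 1/e] \to \hat K^{l,k}_{n,m}(p, q)$ is of class $C^1$. Moreover, given $\alpha > 5$, there exists a constant $C_2 \ge C_1$ such that uniformly for all $m \in [0, 1/e]$ and all smearing functions $\varphi \in \mathcal D_{\mathbb R}(B_{r+\delta-\tau})$, $\psi \in \mathcal D_{\mathbb R}((-\tau, \tau))$,

$$\left|\frac{\partial}{\partial u_\mu}\hat K^{l,k}_{n,m}(p, q)\right| \le \frac{C_1^{n-1}}{4\pi^2}\|\hat\psi\|_{1,\infty}^n\|\hat\varphi\|_{1,\alpha}^n(1 + |\mathbf p|)^{2-l}(1 + |\mathbf q|)^{2-k}, \tag{3.17}$$

$$\left|\frac{\partial}{\partial m}\hat K^{l,k}_{n,m}(p, q)\right| \le m|\log m|\,\frac{C_2^{n-1}}{4\pi^2}\|\hat\psi\|_{1,\infty}^n\|\hat\varphi\|_{1,\alpha}^n(1 + |\mathbf p|)^{2-l}(1 + |\mathbf q|)^{2-k}, \tag{3.18}$$

for all $p = (p_0, \mathbf p)$, $q = (q_0, \mathbf q) \in \mathbb R^4$, and where $u$ in (3.17) is $p$ or $q$.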

February 11, 2010 11:24 WSPC/148-RMP

J070-S0129055X10003904

From Global Symmetries to Local Currents


Proof. (1) Both properties in (3.15) follow easily by induction from the recursive definition of $\hat K^{l,k}_{n,m}$, taking into account rotational invariance of the function $\varphi$.

(2) We start by observing that, by interchanging $\mathbf{k}$ with $-\mathbf{k}$ in the $\sigma = -1$ summand, formula (2.2) can be rewritten as

\[ \hat C^{l,k}(f, g)(p, q) := (-1)^l \pi \sum_{\sigma = \pm} \sigma (i\sigma)^{k+l} \int_{\mathbb{R}^3} d\mathbf{k}\, \omega_m(\mathbf{k})^{l+k-1} f(p, -k_\sigma)\, g(k_\sigma, q), \qquad (3.19) \]

where we recall that $k_\sigma = (\sigma\omega_m(\mathbf{k}), \mathbf{k})$. Since $\alpha > 5$, there exists a fixed constant

\[ B_1 > \int_{\mathbb{R}^3} \frac{d\mathbf{k}}{|\mathbf{k}|(1 + |\mathbf{p} - \mathbf{k}|)^\alpha}, \quad \int_{\mathbb{R}^3} \frac{d\mathbf{k}\, |\mathbf{k}|^s}{(1 + |\mathbf{k}|)^\alpha}, \qquad s = 0, 1, 2, \quad \mathbf{p} \in \mathbb{R}^3. \]

It is then easily computed that for $h = -1, 0, 1$, $j = 1, 2$ and $m \in [0, 1]$,

\[ \int_{\mathbb{R}^3} d\mathbf{k}\, \frac{\omega_m(\mathbf{k})^h (1 + |\mathbf{k}|)^j}{(1 + |\mathbf{p} - \mathbf{k}|)^\alpha} \le 7B_1 (1 + |\mathbf{p}|)^{h+j}, \]

so that estimate (3.16) follows by induction from (3.7) and the above expression for $\hat C^{l,k}$, provided one defines $C_1 := 14B_1/\pi$ and keeps in mind that $\hat h_\delta(p, q) = \frac{1}{4\pi^2} \hat\psi(p_0 + q_0)\, \hat\varphi(\mathbf{p} + \mathbf{q})$.

(3) Using (3.16) and the fact that $\hat h \in \mathcal{S}(\mathbb{R}^4)$, we obtain a bound to the integrands in $\hat C^{1-l,r}(\hat h_\delta, \hat K^{r,k}_{n,m})$ and $\hat C^{r,1-k}(\hat K^{l,r}_{n,m}, \hat h_\delta)$ with an integrable function of $\mathbf{k}$, uniformly for $(p, q, m)$ in a prescribed neighborhood of any given $(\bar p, \bar q, \bar m) \in \mathbb{R}^8 \times [0, 1]$. By a straightforward application of Lebesgue's dominated convergence theorem, the continuity of $(p, q, m) \to \hat K^{l,k}_{n,m}(p, q)$ follows then by induction from the recursive relation (3.7).

(4) Since $\hat K^{l,k}_{n,m} \in \hat C^{l,k}$, we already know that it is differentiable with respect to the components of $p$ and $q$. The estimate (3.17) and the continuity of $(p, q, m) \to \frac{\partial}{\partial u_\mu} \hat K^{l,k}_{n,m}(p, q)$ then follow by an easy adaptation of the inductive arguments of points (2) and (3) above, using also (3.16). In order to show that $\hat K^{l,k}_{n,m}$ is continuously differentiable in $m$ and satisfies (3.18), we proceed again by induction using (3.7). The $m$-derivative of the integrands in $\hat C^{1-l,r}(\hat h_\delta, \hat K^{r,k}_{n,m})$ is given, apart from numerical constants, by

\[ \frac{m(r - l)}{\omega_m(\mathbf{k})^{2+l-r}}\, \hat h(p - k_\sigma) \hat K^{r,k}_{n,m}(k_\sigma, q) - \frac{\sigma m}{\omega_m(\mathbf{k})^{1+l-r}} \left[ \partial_0 \hat h(p - k_\sigma) \hat K^{r,k}_{n,m}(k_\sigma, q) - \hat h(p - k_\sigma) \frac{\partial \hat K^{r,k}_{n,m}}{\partial p_0}(k_\sigma, q) \right] + \omega_m(\mathbf{k})^{r-l}\, \hat h(p - k_\sigma) \frac{\partial}{\partial m} \hat K^{r,k}_{n,m}(k_\sigma, q). \]

It is now straightforward to verify, using (3.16), (3.17) and the inductive hypothesis (3.18), that it is possible to bound the last three terms in the above


expression with an integrable function of $\mathbf{k}$, uniformly for $(p, q, m)$ in a given neighborhood of a fixed $(\bar p, \bar q, \bar m) \in \mathbb{R}^8 \times [0, 1/e]$. The same reasoning also applies to the first term when $2 + l - r < 3$ and also when $2 + l - r = 3$ for $|\mathbf{k}| \ge 1/2$. For $|\mathbf{k}| \le 1/2$ and $2 + l - r = 3$ the first term can be bounded uniformly in a neighborhood of $(\bar p, \bar q)$ by the function $m(m + |\mathbf{k}|)^{-3}$, apart from a constant (depending on the chosen neighborhood). By maximizing the function $x \to x^3 |\log x|^\beta / (m + x)^3$ in the interval $[0, 1/2]$, with $\beta > 1$, one finds the bound

β −β/3 3 mW0 e 3 m 3m ≤ (m + |k|)3 β |k|3 |log|k||β

3 ,

where $W_0$ is the principal branch of Lambert's $W$ function [13]. From the asymptotic expansion of $W_0$ given in [13, Eq. (4.20)] it is then easily seen that the numerator on the right-hand side converges to 0 as $m \to 0$; since the function $\mathbf{k} \to |\mathbf{k}|^{-3} |\log|\mathbf{k}||^{-\beta}$ is integrable for $|\mathbf{k}| \le 1/2$, interchangeability of derivation with respect to $m$ and integration with respect to $\mathbf{k}$ in $\hat C^{1-l,r}(\hat h_\delta, \hat K^{r,k}_{n,m})$ for all values of $l, r, k = 0, 1$ follows. A completely analogous argument applies of course to $\hat C^{r,1-k}(\hat K^{l,r}_{n,m}, \hat h_\delta)$, so that we conclude that $\hat K^{l,k}_{n+1,m}$ is continuously differentiable in $m$. To complete the inductive step, it remains to be shown that estimate (3.18) holds for $\frac{\partial}{\partial m} \hat K^{l,k}_{n+1,m}$. In order to do that, we argue again in a similar way as in point (2) by choosing constants $B_2, B_3 > 0$ such that

\[ B_2 \ge \int_{\mathbb{R}^3} \frac{d\mathbf{k}}{|\mathbf{k}|^t (1 + |\mathbf{p} - \mathbf{k}|)^\alpha}, \quad \int_{\mathbb{R}^3} \frac{d\mathbf{k}\, |\mathbf{k}|^s}{(1 + |\mathbf{k}|)^\alpha}, \qquad s = 0, 1, \quad t = 0, 1, 2, \quad \mathbf{p} \in \mathbb{R}^3, \]

\[ B_3 \ge \log\left(1 + \sqrt{1 + m^2}\right) - \frac{1}{\sqrt{1 + m^2}}, \qquad m \in [0, 1/e]. \]

Taking now into account the identity

\[ \int_0^1 \frac{x^2}{(m^2 + x^2)^{3/2}}\, dx = \log\left(1 + \sqrt{1 + m^2}\right) - \frac{1}{\sqrt{1 + m^2}} - \log m, \]

it is easy to verify that the estimate

\[ \left| \frac{\partial}{\partial m} \hat C^{1-l,r}(\hat h_\delta, \hat K^{r,k}_{n,m})(p, q) \right| \le \frac{m|\log m|}{8\pi^3} [16\pi(1 + B_3) + 16B_2 + 7B_1] \times C_2^{n-1} \|\hat\psi\|_{1,\infty}^{n+1} \|\hat\varphi\|_{1,\alpha}^{n+1} (1 + |\mathbf{p}|)^{2-l} (1 + |\mathbf{q}|)^{2-k} \]

holds for all $m \in [0, 1/e]$ together with a similar one for $\frac{\partial}{\partial m} \hat C^{r,1-k}(\hat K^{l,r}_{n,m}, \hat h_\delta)$. Choosing $C_2 := \frac{2}{\pi} [16\pi(1 + B_3) + 16B_2 + 7B_1] \ge C_1$, one finally gets (3.18) for $\hat K^{l,k}_{n+1,m}$.
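The integral identity used in point (4) of the proof can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names `lhs_integral` and `rhs_closed_form` and the Simpson step count are illustrative choices. It compares a composite Simpson approximation of the integral over $[0, 1]$ with the closed form $\log(1 + \sqrt{1 + m^2}) - 1/\sqrt{1 + m^2} - \log m$ for a few masses in $(0, 1/e]$. The $-\log m$ term is what makes the integral grow like $|\log m|$ as $m \to 0$, which is presumably where the factor $m|\log m|$ in (3.18) comes from.

```python
import math


def lhs_integral(m: float, steps: int = 20000) -> float:
    """Composite Simpson approximation of int_0^1 x^2 / (m^2 + x^2)^(3/2) dx."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = 1.0 / steps

    def integrand(x: float) -> float:
        return x * x / (m * m + x * x) ** 1.5

    total = integrand(0.0) + integrand(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def rhs_closed_form(m: float) -> float:
    """Closed form log(1 + sqrt(1 + m^2)) - 1/sqrt(1 + m^2) - log(m), for m > 0."""
    root = math.sqrt(1.0 + m * m)
    return math.log(1.0 + root) - 1.0 / root - math.log(m)


if __name__ == "__main__":
    for m in (0.05, 0.1, 0.2, 1.0 / math.e):
        print(m, lhs_integral(m), rhs_closed_form(m))
```

The check is restricted to $m > 0$, since both sides diverge logarithmically at $m = 0$.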


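In the next theorem, which is our main result, we denote by $D_{0,\mathcal{S}}$ the dense subspace of $\mathcal{H}$ of finite particle vectors such that the $n$-particle wave functions are in $\mathcal{S}(\mathbb{R}^{3n})$ for each $n \in \mathbb{N}$.

Theorem 3.1. There holds, for each $f \in \mathcal{S}(\mathbb{R}^4)$ and each $\Phi \in D_{0,\mathcal{S}}$,

\[ \lim_{\lambda \to 0} \frac{1}{\lambda^3} \int_{\mathbb{R}^4} dx\, f(x)\, \alpha_x(\Xi_{\lambda O_r, \lambda O_{r+\delta}}(Q))\Phi = c\, j_0(f)\Phi, \qquad (3.20) \]

where $j_0(f) = {:}\partial\phi\phi^\dagger - \phi\partial\phi^\dagger{:}(f_\delta)$ is the Noether current associated to the U(1) symmetry of the charged Klein–Gordon field of mass $m \ge 0$ smeared with the test function $f$ and

\[ c = -(2\pi)^4 \sum_{n=1}^{+\infty} \frac{\pi^{2n}}{4^n (2n)!} \left( \hat K^{0,1}_{2n,0}(0, 0) + i \frac{\partial \hat K^{0,0}_{2n,0}}{\partial p_0}(0, 0) \right). \qquad (3.21) \]

Proof. Since $D_{0,\mathcal{S}}$ is translation invariant and contained in $D(N)$, according to Proposition 2.1 and the estimates given in the proof of Proposition 3.1 there exists a $\upsilon > 0$ such that, for each $x \in \mathbb{R}^4$,

\[ \|\alpha_x({:}\partial^l\phi\partial^k\phi^\dagger{:}(K^{l,k}_{2n,m}(\lambda)))\Phi\| \le \upsilon (8\pi)^{2n-1} \|h\|_{\mathcal{S}}^{2n} \|(N + 1)\Phi\|, \]

and

\[ \|\alpha_x({:}\partial^l\phi\partial^k\phi^\dagger{:}(K^{l,k}_{2n,m}(\lambda)))\Phi - \alpha_y({:}\partial^l\phi\partial^k\phi^\dagger{:}(K^{l,k}_{2n,m}(\lambda)))\Phi\| \le \upsilon (8\pi)^{2n-1} \|h\|_{\mathcal{S}}^{2n} \|(U(x)^* - U(y)^*)(N + 1)\Phi\| + \|(U(x) - U(y))\, {:}\partial^l\phi\partial^k\phi^\dagger{:}(K^{l,k}_{2n,m}(\lambda)) U(y)^*\Phi\|, \]

so that the function $x \to \alpha_x(\Xi_{\lambda O_r, \lambda O_{r+\delta}}(Q))\Phi$ is continuous and bounded in norm for each $\Phi \in D_{0,\mathcal{S}}$, the integral in (3.20) exists in the Bochner sense and furthermore it is possible to interchange the integral and the series.

Given now $K \in C^{l,k}$, it is easy to see that the pointwise product $\hat K \hat f_\delta$ still belongs to $\hat C^{l,k}$ and $\|\hat K \hat f_\delta\|_{l,k} \le \frac{1}{(2\pi)^2} \|\hat f\|_\infty \|\hat K\|_{l,k}$, so that we can define $K * f := (2\pi)^4 (\hat K \hat f_\delta)^\vee \in C^{l,k}$. It is then straightforward to check that

\[ \int_{\mathbb{R}^4} dx\, f(x)\, \alpha_x({:}\partial^l\phi\partial^k\phi^\dagger{:}(K^{l,k}_{2n,m}(\lambda)))\Phi = {:}\partial^l\phi\partial^k\phi^\dagger{:}(K^{l,k}_{2n,m}(\lambda) * f)\Phi. \]

Furthermore one has $\hat K^{l,k}_{2n,m}(\lambda)(p, q) = \lambda^{2+l+k} \hat K^{l,k}_{2n,\lambda m}(\lambda p, \lambda q)$ and, with the notation $(\delta_\lambda K)^\wedge(p, q) = \hat K(\lambda p, \lambda q)$, we see that we are left with the calculation of

\[ \lim_{\lambda \to 0} \sum_{l,k}^{0,1} \lambda^{l+k-1} \sum_{n=1}^{+\infty} \frac{\pi^{2n}}{4^n (2n)!}\, {:}\partial^l\phi\partial^k\phi^\dagger{:}(\delta_\lambda K^{l,k}_{2n,\lambda m} * f)\Phi. \qquad (3.22) \]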

As a ﬁrst step in this calculation, we show that it is possible to interchange the limit and the series. Of course, it is suﬃcient to consider vectors Φ with vanishing n-particles components except for n = N with any ﬁxed N ∈ N. For simplicity, we will give here only the relevant estimates in the case m > 0, the case m = 0 being


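treated in a similar way. Using then the notations for creation and annihilation operators and for wave functions introduced in Sec. 2 and the formulas in the proof of Proposition 2.1, we have

\[ \|{:}\partial^l c^{-,+} \partial^k c^{+,-}{:}(\delta_\lambda K^{l,k}_{2n,\lambda m} * f)^{(N)}\Phi\| \le 16\pi^5 N \|((T^{l,k}((\delta_\lambda K^{l,k}_{2n,\lambda m})^\wedge \hat f_\delta) \otimes |e_+\rangle\langle e_+|) \otimes 1 \otimes \cdots \otimes 1)\Phi\|, \]

together with the estimate, for $\lambda \in [0, 1/m]$,

\[ |[((T^{l,k}((\delta_\lambda K^{l,k}_{2n,\lambda m})^\wedge \hat f_\delta) \otimes |e_+\rangle\langle e_+|) \otimes 1 \otimes \cdots \otimes 1)\Phi]_{\tau_1 \dots \tau_N}(\mathbf{p}_1, \dots, \mathbf{p}_N)| \le \frac{C_1^{n-1} B_1}{4\pi^2} \|\hat\psi\|_\infty^n \|\hat\varphi\|_\alpha^n \|\hat f\|_\beta \frac{(1 + |\mathbf{p}_1|)^{2-l}\, \omega_m(\mathbf{p}_1)^{l-1/2}}{(1 + |\mathbf{p}_2|)^\alpha \cdots (1 + |\mathbf{p}_N|)^\alpha} \times \int_{\mathbb{R}^3} d\mathbf{q}\, \frac{\omega_m(\mathbf{q})^{k-1/2} (1 + |\mathbf{q}|)^{2-k}}{(1 + |\mathbf{q}|)^\gamma (1 + |\mathbf{p}_1 - \mathbf{q}|)^\beta}, \]

where we have used (3.16) and the fact that $\Phi \in D_{0,\mathcal{S}}$ (which gives the constant $B_1 > 0$). It is now easy to see that the right-hand side is a square integrable function of $(\mathbf{p}_1, \dots, \mathbf{p}_N)$ if $\alpha > 3/2$, $\beta > 3$, $\gamma > 15/2$ and therefore we get

\[ \|{:}\partial^l c^{-,+} \partial^k c^{+,-}{:}(\delta_\lambda K^{l,k}_{2n,\lambda m} * f)^{(N)}\Phi\| \le B_2 C_1^{n-1} \|\hat\psi\|_\infty^n \|\hat\varphi\|_\alpha^n, \]

where $B_2 > 0$ is a constant depending on $m$, $f$, $\Phi$ but not on $n$ and $\lambda$. A similar estimate holds then for $\|{:}\partial^l c^{-,-} \partial^k c^{+,+}{:}(\delta_\lambda K^{l,k}_{2n,\lambda m} * f)^{(N)}\Phi\|$. Furthermore we have

\[ \|{:}\partial^l c^{-,-} \partial^k c^{+,-}{:}(\delta_\lambda K^{l,k}_{2n,\lambda m} * f)^{(N-2)}\Phi\| \le 16\pi^5 N(N - 1)\, \|\Phi^{l,k,-}_{(\delta_\lambda K^{l,k}_{2n,\lambda m})^\wedge \hat f_\delta}\|_{L^2(\mathbb{R}^6)} \|\Phi\|, \]

with

\[ \|\Phi^{l,k,-}_{(\delta_\lambda K^{l,k}_{2n,\lambda m})^\wedge \hat f_\delta}\|^2_{L^2(\mathbb{R}^6)} \le \frac{C_1^{2(n-1)} B_3}{16\pi^4} \|\hat\psi\|_\infty^{2n} \|\hat\varphi\|_\alpha^{2n} \|\hat f\|_\beta^2 \times \int_{\mathbb{R}^6} d\mathbf{p}\, d\mathbf{q}\, \frac{(1 + |\mathbf{p}|)^3 (1 + |\mathbf{q}|)^3}{(1 + |\mathbf{p}| + |\mathbf{q}|)^{2\beta}}, \]

for some $\beta > 6$. A similar estimate holds for $\|{:}\partial^l c^{-,+} \partial^k c^{+,+}{:}(\delta_\lambda K^{l,k}_{2n,\lambda m} * f)^{(N+2)}\Phi\|$. In summary, we get, uniformly for $\lambda \in [0, 1/m]$,

\[ \|{:}\partial^l\phi\partial^k\phi^\dagger{:}(\delta_\lambda K^{l,k}_{2n,\lambda m} * f)\Phi\| \le B_4 C_1^{n-1} \|\hat\psi\|_\infty^n \|\hat\varphi\|_\alpha^n, \]

with $B_4$ independent of $\lambda$ and $n$, so that, if $l + k \ge 1$, it is possible to interchange the limit and the sum in (3.22). The term in (3.22) with $l = k = 0$ needs however a separate treatment, due to the divergent prefactor $\lambda^{-1}$. We first observe that, due to the first relation in (3.15), we have $\hat K^{0,0}_{n,m}(0, 0) = 0$. Using bounds (3.17) and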


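(3.18), we thus obtain the estimate

\[ \left| \frac{1}{\lambda} \hat K^{0,0}_{2n,\lambda m}(\lambda p_\sigma, \lambda q_{\sigma'}) \right| = \left| \frac{1}{\lambda} \int_0^\lambda d\mu \left. \frac{d}{d\lambda} \hat K^{0,0}_{2n,\lambda m}(\lambda p_\sigma, \lambda q_{\sigma'}) \right|_{\lambda = \mu} \right| \le \frac{3C_2^{n-1}}{4\pi^2} \|\hat\psi\|_{1,\infty}^n \|\hat\varphi\|_{1,\alpha}^n (m + |\mathbf{p}| + |\mathbf{q}|)(1 + |\mathbf{p}|)^2 (1 + |\mathbf{q}|)^2, \]

valid for $\sigma, \sigma' = \pm$ and for $\lambda \in [0, \lambda_0]$, with $\lambda_0 := \min\{1/em, 1\}$. Then a straightforward adaptation of the above arguments easily gives, uniformly for $\lambda \in [0, \lambda_0]$,

\[ \frac{1}{\lambda} \|{:}\phi\phi^\dagger{:}(\delta_\lambda K^{l,k}_{2n,\lambda m} * f)\Phi\| \le B_5 C_2^{n-1} \|\hat\psi\|_{1,\infty}^n \|\hat\varphi\|_{1,\alpha}^n, \qquad (3.23) \]

with $B_5 > 0$ a constant independent of $\lambda$ and $n$. The same estimates above, being uniform in $\lambda \in [0, 1/m]$, together with use of Lemma 3.1(3), allow us also to conclude that

\[ \lim_{\lambda \to 0} {:}\partial^l\phi\partial^k\phi^\dagger{:}(\delta_\lambda K^{l,k}_{2n,\lambda m} * f)\Phi = (2\pi)^4 \hat K^{l,k}_{2n,0}(0, 0)\, {:}\partial^l\phi\partial^k\phi^\dagger{:}(f_\delta)\Phi. \qquad (3.24) \]

Furthermore there holds

\[ \lim_{\lambda \to 0} \left( \frac{1}{\lambda} \delta_\lambda K^{0,0}_{2n,\lambda m} * f \right)^\wedge(p, q) = (2\pi)^4 (p_0 - q_0) \frac{\partial \hat K^{0,0}_{2n,0}}{\partial p_0}(0, 0)\, \hat f(p + q), \]

since, as a consequence of (3.15), we have

\[ \frac{\partial \hat K^{0,0}_{2n,0}}{\partial p_i}(0, 0) = 0 = \frac{\partial \hat K^{0,0}_{2n,0}}{\partial q_i}(0, 0), \quad i = 1, 2, 3, \qquad \frac{\partial \hat K^{0,0}_{2n,0}}{\partial p_0}(0, 0) = -\frac{\partial \hat K^{0,0}_{2n,0}}{\partial q_0}(0, 0). \]

Exploiting again the uniformity in $\lambda \in [0, \lambda_0]$ of the estimates leading to (3.23), we finally get

\[ \lim_{\lambda \to 0} \frac{1}{\lambda}\, {:}\phi\phi^\dagger{:}(\delta_\lambda K^{0,0}_{2n,\lambda m} * f)\Phi = -(2\pi)^4 i \frac{\partial \hat K^{0,0}_{2n,0}}{\partial p_0}(0, 0)\, {:}\partial\phi\phi^\dagger - \phi\partial\phi^\dagger{:}(f_\delta)\Phi. \]

Together with (3.24), this gives the statement.

We stress that vanishing of the constant $c$ in the previous theorem is still by no means ruled out. That in general this is not the case can be seen by choosing the time-smearing function $\psi \in \mathcal{D}_{\mathbb{R}}((-\tau, \tau))$ sufficiently close to a $\delta$ function and the space-smearing function $\varphi \in \mathcal{D}_{\mathbb{R}}(B_{r+\delta-\tau})$ sufficiently close to a characteristic function.

Proposition 3.3. Assume that the time-smearing function $\psi$ used in the construction of $\Xi_{\lambda O_r, \lambda O_{r+\delta}}(Q)$ satisfies $\psi(t) = \tau^{-1}\psi_1(\tau^{-1}t)$, where $\psi_1 \in \mathcal{D}_{\mathbb{R}}((-1, 1))$ is such that $\int_{\mathbb{R}} \psi_1 = 1$, and that the space-smearing function $\varphi$ is such that $\varphi \in \mathcal{D}_{\mathbb{R}}(B_{r+\delta/2+\varepsilon})$, $0 \le \varphi \le 1$ and $\varphi(x) = 1$ for all $x \in B_{r+\delta/2-\varepsilon}$, with $\varepsilon < \delta/2 - \tau$. Then, denoting with $c(\tau, \varepsilon)$ the corresponding constant given by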


Eq. (3.21), there holds

\[ \lim_{\varepsilon \to 0} \lim_{\tau \to 0} c(\tau, \varepsilon) = \frac{4}{3}\pi \left( r + \frac{\delta}{2} \right)^3. \qquad (3.25) \]

Proof. By induction, it is straightforward to prove the following formula for $\hat K^{l,k}_{n,0}$:

\[ \hat K^{l,k}_{n,0}(p, q) = \frac{(-1)^{k+n-1} i^{n-l-k} \eta_n}{(2\pi)^{n+1}} \sum_{r_1, \dots, r_{n-2}}^{0,1} \sum_{\sigma_1, \dots, \sigma_{n-1}} \prod_{j=1}^{n-1} \sigma_j^{r_j - r_{j-1}} \times \int_{\mathbb{R}^{3(n-1)}} \prod_{j=1}^{n-1} d\mathbf{k}_j\, |\mathbf{k}_j|^{r_j - r_{j-1}}\, \hat h(p - k_{1,\sigma_1}) \hat h(k_{1,\sigma_1} - k_{2,\sigma_2}) \cdots \hat h(k_{n-1,\sigma_{n-1}} + q), \]

where $\eta_n = i$ for $n$ even and $\eta_n = -1$ for $n$ odd and $r_0 := l$, $r_{n-1} := 1 - k$. Since $\hat\psi(p_0) = \hat\psi_1(\tau p_0) \to (2\pi)^{-1/2}$ as $\tau \to 0$ and $k_{j,\sigma_j} = (\sigma_j |\mathbf{k}_j|, \mathbf{k}_j)$, it is easy to see that in the limit $\tau \to 0$ the dependence on the $\sigma_j$'s drops off the integral in the second line of the above equation and therefore

\[ \lim_{\tau \to 0} \hat K^{0,1}_{2n,0}(0, 0) = \frac{(-1)^n 2^{2n-1}}{(2\pi)^{3n+1}}\, \hat\varphi * \cdots * \hat\varphi(0) = \frac{(-1)^n 4^n}{2(2\pi)^4} \int_{\mathbb{R}^3} dx\, \varphi(x)^{2n}. \]

Analogously, since $\hat\psi'(p_0) = \tau\hat\psi_1'(\tau p_0) \to 0$ as $\tau \to 0$, one has from the above formula

\[ \lim_{\tau \to 0} \frac{\partial \hat K^{0,0}_{2n,0}}{\partial p_0}(0, 0) = 0. \]

But, thanks to the estimates (3.16), (3.17), the convergence of the series (3.21) is uniform in $\tau$, so that one has

\[ \lim_{\tau \to 0} c(\tau, \varepsilon) = -\frac{1}{2} \sum_{n=1}^{+\infty} \frac{(-1)^n \pi^{2n}}{(2n)!} \int_{\mathbb{R}^3} dx\, \varphi(x)^{2n}. \]

Since $\varphi$ is bounded above by the characteristic function of the ball $B_{r+\delta}$ for $\varepsilon < \delta/2$, the convergence of the above series is also uniform in $\varepsilon$ so that, taking into account that $\varphi$ converges to the characteristic function of the ball $B_{r+\delta/2}$ when $\varepsilon \to 0$, we finally get (3.25).

It is straightforward to extend the above analysis to treat the case of the net $O \to \mathcal{F}(O)$ generated by a multiplet of free scalar fields $\phi_a$, $a = 1, \dots, d$, with the action of a compact Lie group $G$ defined by

\[ V(g)\phi_a(f)V(g)^* = \sum_{b=1}^d v(g)_{ab}\, \phi_b(f), \qquad g \in G, \]

where $v$ is a $d$-dimensional unitary representation.
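The last step of the proof of Proposition 3.3 can be made explicit with a short numerical sketch. Assuming, as in the limit $\varepsilon \to 0$, that $\varphi$ is the characteristic function of the ball of radius $r + \delta/2$, every power $\varphi^{2n}$ has the same integral (the volume of the ball), and the remaining series sums to $\cos\pi - 1 = -2$. The function names and the sample values of $r$ and $\delta$ below are illustrative assumptions only.

```python
import math


def cosine_series_partial_sum(terms: int = 30) -> float:
    """Partial sum of sum_{n>=1} (-1)^n pi^(2n) / (2n)!, whose limit is cos(pi) - 1 = -2."""
    return sum(
        (-1) ** n * math.pi ** (2 * n) / math.factorial(2 * n)
        for n in range(1, terms + 1)
    )


def limiting_constant(r: float, delta: float, terms: int = 30) -> float:
    """-(1/2) * series * volume, assuming phi is the indicator of the ball B_{r + delta/2}."""
    volume = 4.0 / 3.0 * math.pi * (r + delta / 2.0) ** 3
    return -0.5 * cosine_series_partial_sum(terms) * volume


if __name__ == "__main__":
    r, delta = 1.0, 0.5  # illustrative values, not taken from the paper
    print(cosine_series_partial_sum())
    print(limiting_constant(r, delta), 4.0 / 3.0 * math.pi * (r + delta / 2.0) ** 3)
```

Under this assumption the prefactor $-\frac{1}{2} \cdot (-2) = 1$, so the limiting constant is exactly the volume appearing on the right-hand side of (3.25).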


More precisely, consider the 1-parameter subgroup $\theta \in \mathbb{R} \to g^\xi_\theta \in G$ associated to a Lie algebra element $\xi \in \mathfrak{g}$ and correspondingly the global generator $Q^\xi$ of $\theta \to V(g^\xi_\theta)$, which satisfies on $D(N)$

\[ [Q^\xi, \phi_a(f)] = -i \sum_{b=1}^d t(\xi)_{ab}\, \phi_b(f), \]

$\xi \to t(\xi)$ being the representation of $\mathfrak{g}$ (through antihermitian matrices) associated to $v$. Then, considering again the U(1) symmetry of the doubled theory and the associated Noether current $J_0$, it is possible to define a semi-local implementation of the flip as in Eq. (3.4) and to construct a local implementation $\Xi_{\lambda O_r, \lambda O_{r+\delta}}(Q^\xi)$ of $Q^\xi$ as in Eq. (3.5), which is essentially self-adjoint on $D(N)$ and for which an expansion analogous to (3.8) holds:

\[ \Xi_{\lambda O_r, \lambda O_{r+\delta}}(Q^\xi)\Phi = \sum_{n=1}^{+\infty} \frac{\pi^{2n}}{4^n (2n)!} \sum_{l,k}^{0,1} \left[ \sum_{a,b=1}^d t(\xi)_{ab}\, {:}\partial^l\phi_a \partial^k\phi_b^\dagger{:}(K^{l,k}_{2n,m}(\lambda))\Phi \right], \]

where $K^{l,k}_{2n,m}(\lambda)$ are the distributions defined in (3.6) and (3.7). Finally, the analogue of formula (3.20) holds, where on the right-hand side the appropriate Noether current

\[ j_0^\xi(f) = \sum_{a,b=1}^d t(\xi)_{ab}\, {:}\phi_a \partial\phi_b^\dagger - \partial\phi_a \phi_b^\dagger{:}(f_\delta) \]

appears and the normalization constant $c$ is again given by (3.21).

4. Summary and Outlook

In the present work we have shown that it is in principle possible to construct operators implementing locally a given infinitesimal symmetry of a local net of von Neumann algebras (local generators), starting from the existence of unitary operators implementing (semi-)locally the flip automorphism on the tensor product of the net with itself. In particular, in a large class of free scalar field models our construction provides an efficient tool to obtain manageable local generators of this kind through the explicit expression of the local flip given in Eq. (3.4). Moreover, we showed that it is possible to recover, up to a well-determined strictly positive normalization constant, the associated Noether currents through a natural scaling limit of these generators in which the localization region shrinks to a point. As expected, the above-mentioned constant is found to depend only on the volume of the initial localization region of the generator and not on the mass and isospin of the model. The existence of this limit depends in this case on control of the energy behavior of the generators (namely the existence of H-bounds) rather than on dilation invariance of the (thus massless) theory, which was a key ingredient of previous similar results [7, 8].


These results have been obtained in the spirit of giving a consistency check towards a full quantum Noether theorem according to the program set down in [1] and recalled in the Introduction. In order to proceed further in this direction it is apparent that two main problems have to be tackled. First, it is necessary to extend the construction of local generators proposed in the Introduction to a suitably general class of theories. Second, it would be desirable to gain a deeper understanding of the general properties granting the existence and non-triviality of the pointlike limit of the free generators, which are presently under investigation. Among other things, this is likely connected with the problem of clarifying whether it is generally possible, through a suitable choice of the local flip implementation, to gain control over the “boundary part” of the local symmetry implementation, whose arbitrariness is considered to be an important obstruction for the reconstruction of Noether currents. The methods of [14] can be expected to be useful to put this analysis in a more general framework. Finally, we believe that our method could help to shed some light on the difficult problem of obtaining sharply localized charges from global ones.

Acknowledgments

We would like to thank Sergio Doplicher for originally suggesting the problem to one of us and for his constant support and encouragement, and Sebastiano Carpi for several interesting and useful discussions. We also thank the referees for suggesting several improvements in the exposition. This work was supported by MIUR, GNAMPA-INDAM, the SNS, the Marie Curie Research Training Network MRTN-CT-2006-031962 EU-NCG and the ERC Advanced Grant 227458 “Operator Algebras and Conformal Field Theory”.

Appendix. Local Implementation of the Doubled Theory U(1) Symmetry

In this Appendix, we show that the smeared Noether current associated to the U(1) symmetry of the theory of two complex free scalar fields of mass $m \ge 0$, Eq. (3.1), is represented by a self-adjoint operator which generates a group locally implementing the symmetry. Although this material is more or less standard, we include it here both for the convenience of the reader and because the proof of self-adjointness of (Wick-ordered) bilinear expressions in the free field (and its derivatives) can be found in the literature only for mass $m > 0$ (see [15, 16]). For this reason, we will only emphasize the main differences in the (possibly) massless case. To begin with, the main estimates in the appendix of [15], which are valid only for $m > 0$, have to be sharpened as in the following lemma.

Lemma A.1. Let $h \in \mathcal{S}(\mathbb{R}^4)$, and consider the tempered distribution $h_\delta(x, y) = h(x)\delta(x - y)$. Then $h_\delta \in C^{0,1} \cap C^{1,0}$ for all $m \ge 0$, and $\|\hat h_\delta\|_{0,1}, \|\hat h_\delta\|_{1,0} \le \|h\|_{\mathcal{S}}$, where $\|\cdot\|_{\mathcal{S}}$ is some Schwartz norm independent of $m$ varying in bounded intervals.


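Proof. One has $\hat h_\delta(p, q) = \frac{1}{(2\pi)^2} \hat h(p + q)$, which implies $\hat h_\delta \in \hat C$. We denote by $w(\mathbf{p}, \mathbf{q})$ the integral kernel defining $T^{1,0}(|\hat h_\delta|)$. It is easy to see that, for $|\mathbf{q}| \ge 1$,

\[ \sqrt{\frac{\omega_m(\mathbf{p})}{\omega_m(\mathbf{q})}} \le (1 + |\mathbf{p} - \mathbf{q}|)^{1/2}, \]

and therefore, being $\hat h \in \mathcal{S}(\mathbb{R}^4)$, there exists a $C_1 > 0$ and an $r > 3$ such that

\[ \int_{\mathbb{R}^3} d\mathbf{p} \left| \int_{|\mathbf{q}| > 1} d\mathbf{q}\, w(\mathbf{p}, \mathbf{q})\Phi(\mathbf{q}) \right|^2 \le \int_{\mathbb{R}^3} d\mathbf{p} \left( \int_{|\mathbf{q}| > 1} d\mathbf{q}\, \frac{C_1 |\Phi(\mathbf{q})|}{(1 + |\mathbf{p} - \mathbf{q}|)^r} \right)^2 \le C_1^2 \left( \int_{\mathbb{R}^3} \frac{d\mathbf{p}}{(1 + |\mathbf{p}|)^r} \right)^2 \|\Phi\|^2_{L^2(\mathbb{R}^3)}, \]

where use was made of the Young inequality $\|f * g\|_{L^2} \le \|f\|_{L^1} \|g\|_{L^2}$. On the other hand, there exist $C_2 > 0$ and $s > 2$ such that, for $|\mathbf{p}| > 1$,

\[ \left| \int_{|\mathbf{q}| \le 1} d\mathbf{q}\, w(\mathbf{p}, \mathbf{q})\Phi(\mathbf{q}) \right| \le C_2 \int_{|\mathbf{q}| \le 1} d\mathbf{q}\, \sqrt{\frac{\omega_m(\mathbf{p})}{|\mathbf{q}|}}\, \frac{|\Phi(\mathbf{q})|}{(1 + |\mathbf{p} - \mathbf{q}|)^s} \le C_2 \frac{\sqrt{\omega_m(\mathbf{p})}}{|\mathbf{p}|^s} \int_{|\mathbf{q}| \le 1} d\mathbf{q}\, \frac{|\Phi(\mathbf{q})|}{\sqrt{|\mathbf{q}|}} \le C_2 \frac{\sqrt{2\pi\omega_m(\mathbf{p})}}{|\mathbf{p}|^s} \|\Phi\|_{L^2(\mathbb{R}^3)}, \]

and a $C_3 > 0$ such that, for $|\mathbf{p}| \le 1$,

\[ \left| \int_{|\mathbf{q}| \le 1} d\mathbf{q}\, w(\mathbf{p}, \mathbf{q})\Phi(\mathbf{q}) \right| \le C_3 \int_{|\mathbf{q}| \le 1} d\mathbf{q}\, \sqrt{\frac{\omega_m(\mathbf{p})}{|\mathbf{q}|}}\, |\Phi(\mathbf{q})| \le \sqrt{2\pi}\, C_3 (1 + m^2)^{1/4} \|\Phi\|_{L^2(\mathbb{R}^3)}. \]

Putting these inequalities together, we obtain

\[ \|T^{1,0}(|\hat h_\delta|)\Phi\|_{L^2(\mathbb{R}^3)} \le \left[ \sqrt{2}\, C_1 \int_{\mathbb{R}^3} \frac{d\mathbf{p}}{(1 + |\mathbf{p}|)^r} + \sqrt{2\pi}\, C_2 \left( \int_{|\mathbf{p}| > 1} d\mathbf{p}\, \frac{\omega_m(\mathbf{p})}{|\mathbf{p}|^{2s}} \right)^{1/2} + 2\pi\sqrt{2/3}\, C_3 (1 + m^2)^{1/4} \right] \|\Phi\|_{L^2(\mathbb{R}^3)}, \]

so that, since the constants $C_i$ can be expressed by Schwartz norms of $h$, we conclude that $\|T^{1,0}(|\hat h_\delta|)\| \le \|h\|_{\mathcal{S}}$ for a suitable Schwartz norm $\|\cdot\|_{\mathcal{S}}$. The proofs that $\|T^{1,0}(|\hat{\tilde h}_\delta|)\|, \|T^{0,1}(|\hat h_\delta|)\|, \|T^{0,1}(|\hat{\tilde h}_\delta|)\| \le \|h\|_{\mathcal{S}}$ are completely analogous and it is immediate to see that $\Phi^{0,1,\sigma}_{\hat h_\delta}, \Phi^{1,0,\sigma}_{\hat h_\delta} \in L^2(\mathbb{R}^6)$, $\sigma = \pm$, and that their norms can be bounded by $\|h\|_{\mathcal{S}}$.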
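The Young inequality $\|f * g\|_{L^2} \le \|f\|_{L^1} \|g\|_{L^2}$ invoked in the proof has an exact discrete analogue for finite sequences, which is easy to test. The sketch below is only an illustration on sequences indexed by integers, not on $\mathbb{R}^3$ as in the lemma; the helper names `convolve`, `l1_norm` and `l2_norm`, the sequence lengths and the random seed are all assumptions made for this example.

```python
import math
import random


def convolve(f, g):
    """Full discrete convolution: out[k] = sum over i + j = k of f[i] * g[j]."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out


def l1_norm(v):
    """Sum of absolute values (discrete L^1 norm)."""
    return sum(abs(x) for x in v)


def l2_norm(v):
    """Square root of the sum of squares (discrete L^2 norm)."""
    return math.sqrt(sum(x * x for x in v))


rng = random.Random(0)  # fixed seed so the run is reproducible
f = [rng.uniform(-1.0, 1.0) for _ in range(50)]
g = [rng.uniform(-1.0, 1.0) for _ in range(80)]

lhs = l2_norm(convolve(f, g))
rhs = l1_norm(f) * l2_norm(g)

if __name__ == "__main__":
    print(lhs, rhs, lhs <= rhs)
```

In the proof the role of $f$ is played by the integrable weight $(1 + |\mathbf{p}|)^{-r}$ with $r > 3$, and the role of $g$ by $|\Phi|$.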


This lemma, together with Proposition 2.1, shows that the timelike component $J_0(h)$ of the current (3.2) is well-defined for $h \in \mathcal{S}(\mathbb{R}^4)$. Using the fact that $|p_i| \le \omega_m(\mathbf{p})$, the proof above shows that the spacelike components $J_i(h)$, $i = 1, 2, 3$, are well-defined too.

Proposition A.1. The following statements hold.

(1) For each $h \in \mathcal{S}(\mathbb{R}^4)$, the operator $J_\mu(h)$ defined on $D(\tilde N)$ by

\[ J_\mu(h) := \sum_{j=1}^2 (-1)^j [{:}\partial_\mu\phi_j \phi_{j'}^\dagger{:}(h_\delta) - {:}\phi_j \partial_\mu\phi_{j'}^\dagger{:}(h_\delta)], \qquad (A.1) \]

where $j' = 3 - j$, defines a Wightman field such that $J_\mu(h)$ is essentially self-adjoint for real $h$.

(2) If $h \in \mathcal{D}_{\mathbb{R}}(O)$, $O$ a double cone, then $e^{i\zeta J_\mu(h)} \in \tilde{\mathcal{F}}(O)$, $\zeta \in \mathbb{R}$.

(3) Given a 3-dimensional open ball $B_r$ of radius $r$ centered at the origin together with functions $\varphi \in \mathcal{D}_{\mathbb{R}}(\mathbb{R}^3)$, $\psi \in \mathcal{D}_{\mathbb{R}}((-\tau, \tau))$ such that $\varphi(x) = 1$ for each $x \in B_{r+\tau}$ and $\int_{\mathbb{R}} \psi = 1$, it holds that

\[ e^{i\zeta J_0(\psi \otimes \varphi)} F e^{-i\zeta J_0(\psi \otimes \varphi)} = \gamma_\zeta(F), \qquad \forall F \in \tilde{\mathcal{F}}(O_r), \qquad (A.2) \]

where $O_r$ is the double cone with base $B_r$.

Proof. (1) According to Lemma A.1 one has $\|\hat h_\delta\|_{0,1}, \|\hat h_\delta\|_{1,0} \le \|h\|_{\mathcal{S}}$, so that $J_\mu$ is a Wightman field and $J_\mu(h)$ is symmetric for real $h$. Given now a $\Phi \in \mathcal{K}^{\otimes_S n}$, $J_\mu(h)^p \Phi$ is the sum of $16^p$ vectors of the form

\[ {:}\partial_\mu^{l_p} c_{j_p}^{-,\sigma_p} \partial_\mu^{k_p} c_{j'_p}^{+,\varepsilon_p}{:}(h)^{(n_p)} \cdots {:}\partial_\mu^{l_1} c_{j_1}^{-,\sigma_1} \partial_\mu^{k_1} c_{j'_1}^{+,\varepsilon_1}{:}(h)^{(n_1)} \Phi \]

with $n_j = n_{j-1} + \sigma_j + \varepsilon_j$, $j = 1, \dots, p$ ($n_0 := n$). Therefore, by (2.4),

\[ \|J_\mu(h)^p \Phi\| \le \left( \frac{4\|h\|_{\mathcal{S}}}{\pi} \right)^p (n + 2(p + 1)) \cdots (n + 4)\, \|\Phi\|, \]

and we see that $\Phi$ is an analytic vector for $J_\mu(h)$. Since any element in $\tilde D_0$ is a finite sum of such vectors, essential self-adjointness of $J_\mu(h)$ follows.

(2) A straightforward but lengthy calculation shows that, on $\tilde D_0$,

\[ [J_\mu(h), \phi_j(f) + \phi_j(f)^*] = (-1)^{j+1} i(\phi_{j'}(g) + \phi_{j'}(g)^*), \qquad g = h(\partial_\mu\Delta * f) + \partial_\mu(h(\Delta * f)), \qquad (A.3) \]

where, as customary, $\Delta$ is the Fourier transform of $\frac{1}{2\pi i} \varepsilon(p_0)\delta(p^2 - m^2)$. Since supp $\Delta$ is contained in the closed light cone and $\tilde D_0$ is an invariant dense set of analytic vectors for both $J_\mu(h)$ and $\phi_j(f) + \phi_j(f)^*$, we see by standard arguments that $e^{i\zeta J_\mu(h)}$ commutes with $e^{i[\phi_j(f) + \phi_j(f)^*]^-}$ if supp $h$ is spacelike from supp $f$, i.e. $e^{i\zeta J_\mu(h)} \in \tilde{\mathcal{F}}(O')' = \tilde{\mathcal{F}}(O)$ if supp $h \subset O$.


(3) Take $f \in \mathcal{D}(O_r)$. Since supp $\Delta * f$ does not intersect $[-\tau, \tau] \times \{x : \varphi(x) \ne 1\}$ we have that

\[ \psi \otimes \varphi(\partial_0\Delta * f) + \partial_0(\psi \otimes \varphi(\Delta * f)) = \psi \otimes 1(\partial_0\Delta * f) + \partial_0(\psi \otimes 1(\Delta * f)). \]

On the other hand, a calculation shows that, thanks to $\int_{\mathbb{R}} \psi = 1$,

\[ \Delta * (\psi \otimes 1(\partial_0\Delta * f) + \partial_0(\psi \otimes 1(\Delta * f))) = \Delta * f, \]

and, since $\Delta * f_1 = 0$ implies $f_1 = (\Box + m^2)f_2$ with $f_i \in \mathcal{S}(\mathbb{R}^4)$, the commutation relations (A.3) become

\[ [J_0(\psi \otimes \varphi), \phi_j(f) + \phi_j(f)^*] = (-1)^{j+1} i(\phi_{j'}(f) + \phi_{j'}(f)^*). \]

Furthermore, thanks to the estimates (2.5) and (2.6) we can apply the multiple commutator theorems in [11] to conclude, as in the proof of [9, Theorem 2], that (A.2) holds.

References

[1] S. Doplicher, Local aspects of superselection rules, Comm. Math. Phys. 85 (1982) 73–86.
[2] S. Doplicher and R. Longo, Local aspects of superselection rules. II, Comm. Math. Phys. 88 (1983) 399–409.
[3] D. Buchholz, S. Doplicher and R. Longo, On Noether's theorem in quantum field theory, Ann. Phys. 170 (1986) 1–17.
[4] D. Buchholz and E. H. Wichmann, Causal independence and the energy level density of states in local quantum field theory, Comm. Math. Phys. 106 (1986) 321–344.
[5] D. Buchholz, Product states for local algebras, Comm. Math. Phys. 36 (1974) 287–304.
[6] S. Stratila, Modular Theory in Operator Algebras (Abacus Press, Bucharest, 1981).
[7] S. Carpi, Quantum Noether's theorem and conformal field theory: A study of some models, Rev. Math. Phys. 11 (1999) 519–532.
[8] L. Tomassini, Sul teorema di Noether quantistico: Studio del campo libero di massa zero in quattro dimensioni, Master's thesis, Università di Roma “La Sapienza” (1999).
[9] C. D'Antoni and R. Longo, Interpolation by type I factors and the flip automorphism, J. Funct. Anal. 51 (1983) 361–371.
[10] S. Doplicher and R. Longo, Standard and split inclusions of von Neumann algebras, Invent. Math. 75 (1984) 493–536.
[11] J. Fröhlich, Application of commutator theorems to the integration of representations of Lie algebras and commutation relations, Comm. Math. Phys. 54 (1977) 135–150.
[12] M. Reed and B. Simon, Methods of Modern Mathematical Physics. Vol. II: Fourier Analysis, Self-Adjointness (Academic Press, New York, 1975).
[13] R. M. Corless, G. H. Gonnet, D. E. G. Hare, D. J. Jeffrey and D. E. Knuth, On the Lambert W function, Adv. Comput. Math. 5 (1996) 329–359.
[14] H. Bostelmann, Phase space properties and the short distance structure in quantum field theory, J. Math. Phys. 46 (2005) 052301, 17 pp.
[15] J. Langerholc and B. Schroer, On the structure of the von Neumann algebras generated by local functions of the free Bose field, Comm. Math. Phys. 1 (1965) 215–239.
[16] S. Albeverio, B. Ferrario and M. W. Yoshida, On the essential self-adjointness of Wick powers of relativistic fields and of fields unitary equivalent to random fields, Acta Appl. Math. 80 (2004) 309–334.

March 10, 2010 10:13 WSPC/S0129-055X 148-RMP J070-S0129055X10003916

Reviews in Mathematical Physics Vol. 22, No. 2 (2010) 117–192
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10003916

PERTURBATIVE DEFORMATIONS OF CONFORMAL FIELD THEORIES REVISITED

IGOR KRIZ
Mathematics Department, University of Michigan, Ann Arbor, MI 48109-1109 USA

Received 30 March 2009
Revised 12 October 2009

The purpose of this paper is to revisit the theory of perturbative deformations of conformal field theory from a mathematically rigorous, purely worldsheet point of view. We specifically include the case of N = (2, 2) conformal field theories. From this point of view, we find certain surprising obstructions, which appear to indicate that, contrary to previous findings, not all deformations along marginal fields exist perturbatively. This includes the case of deformation of the Gepner model of the Fermat quintic along certain cc fields. In other cases, including Gepner models of K3-surfaces and the free field theory, our results coincide with known predictions. We give a partial interpretation of our results via renormalization and mirror symmetry.

Keywords: N = (2, 2) conformal field theories; perturbative deformation; Gepner model.

Mathematics Subject Classification 2010: 83E30, 53D37, 81T15

1. Introduction

Recently, there has been renewed interest in the mathematics of the moduli space of conformal field theories, in particular, in connection with speculations about elliptic cohomology. The purpose of this paper is to investigate this space by perturbative methods from first principles and from a purely “worldsheet” point of view. It is conjectured that at least at generic points, the moduli space of CFT's is a manifold, and in fact, its tangent space consists of marginal fields, i.e. primary fields of weight (1, 1) of the conformal field theory (that is in the bosonic case; in the supersymmetric case there are modifications which we will discuss later). This then means that there should exist an exponential map from the tangent space at a point to the moduli space, i.e. it should be possible to construct a continuous 1-parameter set of conformal field theories by “turning on” a given marginal field. There is a more or less canonical mathematical procedure for applying a “Pexp” type construction to the field which has been turned on, and obtaining a perturbative expansion in the deformation parameter. This process, however, returns certain


cohomological obstructions, similar to Gerstenhaber's obstructions to the existence of deformations of associative algebras [26–29]. Physically, these obstructions can be interpreted as changes of dimension of the deforming field, and can occur, in principle, at any order of the perturbative path. The primary obstruction is well known, and was used, e.g., by Ginsparg in his work on c = 1 conformal field theories [30]. The obstruction also occurred in earlier work, see [45–47, 63–65, 61], from the point of view of continuous lines in the space of critical models. In the models considered, notably the Baxter model [11], the Ashkin–Teller model [8] and the Gaussian model [48], vanishing of the primary obstruction did correspond to a continuous line of deformations, and it was therefore believed that the primary obstruction tells the whole story. (A similar story also occurs in the case of deformations of boundary sectors, see [1, 2, 12, 22, 51, 52, 58, 38].) In a certain sense, the main point of the present paper is analyzing, or giving examples of, the role of the higher obstructions. We shall see that these obstructions can be non-zero in cases where the deformation is believed to exist, most notably in the case of deforming the Gepner model of the Fermat quintic along a cc field, cf. [3, 55, 23, 60, 44, 66, 67, 14, 15, 17]. Some discussion of marginality of primary fields in N = 2-supersymmetric theories to higher order exists in the literature. Notably, Dixon [19] verified the vanishing, for any N = (2, 2)-theory and any linear combination of cc, ac, ca and aa fields, of an amplitude integral which physically expresses the change of central charge (a similar calculation is also given in Distler–Greene [18]). Earlier work of Zamolodchikov [70, 71] showed that the renormalization β-function vanishes for theories where c does not change during the renormalization process. However, we find that the calculation [19] does not guarantee that the primary field would remain marginal along the perturbative deformation path, due to subtleties involving singularities of the integral. The obstruction we discuss in this paper is an amplitude integral which physically expresses directly the change of dimension of the deforming field, and it turns out this may not vanish. We will return to this discussion in Sec. 3 below. This puzzle of having obstructions where none should appear will not be fully explained in this paper, although a likely interpretation of the result will be discussed. It is possible that our effect does not impact the general question of the existence of the nonlinear σ-model, which is widely believed to exist (e.g., [3, 55, 23, 60, 44, 66, 67, 14, 15, 17]), but simply concerns questions of its perturbative construction. One caveat is that the case we investigate here is still not truly physical, since we specialize to the case of cc fields, which are not real. The actual physical deformations of CFT's should occur along real fields, e.g., a combination of a cc field and its complex-conjugate aa field (we give a discussion of this in the case of the free field theory at the end of Sec. 4). The obstructions discussed here however are not linear, and hence a priori the case of the corresponding real field in the Gepner model is much more difficult to analyze; in particular, it requires regularization of the deforming parameter, and is not discussed here. Nevertheless, it is still surprising that an obstruction occurs for a single cc field; for example, this


does not happen in the case of the (compactiﬁed or uncompactiﬁed) free ﬁeld theory. Also, there is strong evidence that obstructions to deformations along cc ﬁelds and the corresponding real ﬁelds are equivalent (see the remark after Example 2 in Sec. 6). Since an nth order obstruction indeed means that the marginal ﬁeld gets deformed into a ﬁeld of non-zero weight, which changes to the order of the nth power of the deformation parameter, usually [30, 45–47, 63–65, 61], when obstructions occur, one therefore concludes that the CFT does not possess continuous deformations in the given direction. Other interpretations are possible. One thing to observe is that our conclusion is only valid for purely perturbative theories where we assume that all ﬁelds have power series expansion in the deformation parameters with coeﬃcients which are ﬁelds in the original theory. This is not the only possible scenario. Therefore, as remarked above, our results merely indicate that in the case when our algebraic obstruction is non-zero, non-perturbative corrections must be made to the theory to maintain the presence of marginal ﬁelds along the deformation path. In fact, evidence in favor of this interpretation exists in the form of the analysis of Nemeschansky and Sen [55,35] of higher order corrections to the β-function of the nonlinear σ-model. Grisaru, Van de Ven and Zanon [35] found that the four-loop contribution to the β-function of the nonlinear σ model for Calabi–Yau manifolds is non-zero, and [55] found a recipe how to cancel this singularity by deforming the manifold to metric which is non-Ricci ﬂat at higher orders of the deformation parameter. 
The expansion [4] used in this analysis is around the 0 curvature tensor, but assuming for the moment that a similar phenomenon occurs if we expanded around the Fermat quintic vacuum, then there are no fields present in the Gepner model which would correspond perturbatively to these higher order corrections in the direction of non-Ricci flat metric: bosonically, such fields would have to have critical conformal dimension classically, since the σ-model Lagrangian is classically conformally invariant for non-Ricci flat target Kähler manifolds. However, quantum mechanically, there is a one-loop correction proportional to the Ricci tensor, thus indicating that fields expressing such perturbative deformations would have to be of generalized weight (cf. [39–42]). Fields of generalized weight, however, are not present in the Gepner model, which is a rational CFT, and more generally are excluded by unitarity (see discussions in Remarks after Theorems 2 and 3 in Sec. 3 below). Thus, although this argument is not completely mathematical, renormalization analysis seems to confirm our finding that deformations of the Fermat quintic model must in general be non-perturbative. It is also noteworthy that the β-function is known to vanish to all orders for K3-surfaces because of N = (4, 4) supersymmetry. Accordingly, we also find that the phenomenon we see for the Fermat quintic is not present in the case of the Fermat quartic (see Sec. 7 below). It is also worth noting that other non-perturbative phenomena such as instanton corrections also arise when passing from K3-surfaces to Calabi–Yau 3-folds ([14, 15, 17]). Finally, one must also remark that the proof of [55] of the β-function cancellation

is not mathematically complete because of convergence questions, and thus one still cannot exclude even the scenario that not all nonlinear σ models would exist as exact CFT’s, thus creating some type of “string landscape” picture also in this context (cf. [20]). We should remark that this scenario also has a compelling interpretation from the point of view of the relationship between classical and quantum geometry (see the end of the Concluding Remarks). In this paper, we shall be mostly interested in the strictly perturbative picture. The main point of this paper is an analysis of the algebraic obstructions in certain canonical cases. We discuss two main kinds of examples, namely the free ﬁeld theory (both bosonic and N = 1-supersymmetric), and the Gepner models of the Fermat quintic and quartic, which are exactly solvable N = 2-supersymmetric conformal ﬁeld theories which should be the nonlinear σ-models of the Fermat quintic Calabi– Yau 3-fold and the Fermat quartic K3-surface (in the case of the Fermat quartic, this was actually proved in [54]). In the case of the free ﬁeld theory, what happens is essentially that all non-trivial gravitational deformations of the free ﬁeld theory are algebraically obstructed. In the case of a free theory compactiﬁed on a torus, the only gravitational deformations which are algebraically unobstructed come from linear change of metric on the torus. (We will focus on gravitational deformations; there are other examples, for example the sine-Gordon interaction [69, 13], which are not discussed in detail here.) The Gepner case deserves special attention. From the moduli space of Calabi– Yau 3-folds, there is supposed to be a σ-model map into the moduli space of CFT’s. 
In fact, when we have an exactly solvable Calabi–Yau σ-model, one gets operators in CFT corresponding to the cohomology groups $H^{1,1}$ and $H^{2,1}$, which measure deformations of complex structure and Kähler metric, respectively, and these in turn give rise to infinitesimal deformations. Now the Fermat quintic

\[ x^5 + y^5 + z^5 + t^5 + u^5 = 0 \tag{1} \]

in $\mathbb{CP}^4$ has a model conjectured by Gepner [24, 25] which is embedded in the tensor product of 5 copies of the N = 2-supersymmetric minimal model of central charge 9/5. The weight (1/2, 1/2) cc and ac fields correspond to the 100 infinitesimal deformations of complex structure and 1 infinitesimal deformation of Kähler metric of the quintic (1). Despite the numerical matches in dimension, however, it is not quite correct to say that the gravitational deformations, corresponding to the moduli space of Calabi–Yau manifolds, occur by turning on cc and ac fields. This is because, to preserve unitarity, a physical deformation can only occur when we turn on a real field, and the fields in question are not real. In fact, the complex conjugate of a cc field is an aa field, and the complex conjugate of an ac field is a ca field. The complex conjugate must be added to get a real field, and a physical deformation (we discuss this calculationally in the case of the free field theory in Sec. 4). In this paper, we do not discuss deformations of the Gepner model by turning on real fields. As shown in the case of the free field theory in Sec. 4, such deformations

require for example regularization of the deformation parameter, and are much more difficult to calculate. Because of this, we work only with the case of one cc and one ac field. We will show that at least one cc deformation, whose real version corresponds to the quintics

\[ x^5 + y^5 + z^5 + t^5 + u^5 + \lambda x^3 y^2 = 0 \tag{2} \]

for small (but not infinitesimal) $\lambda$, is algebraically obstructed. (One suspects that similar algebraic obstructions also occur for other fields, but the computation is too difficult at the moment; for the cc field corresponding to $xyztu$, there is some evidence suggesting that the deformation may exponentiate.) It is an interesting question whether nonlinear σ-models of Calabi–Yau 3-folds must also contain non-perturbative terms. If so, this phenomenon is likely generic, which could be a reason why mathematicians have so far discovered so few of these conformal field theories, despite ample physical evidence of their existence [3, 55, 23, 60, 44, 66, 67]. Originally prompted by a question of Igor Frenkel, we also consider the case of the Fermat quartic K3 surface $x^4 + y^4 + z^4 + t^4 = 0$ in $\mathbb{CP}^3$. This is done in Sec. 7. It is interesting that the problems of the Fermat quintic do not arise in this case, and all the infinitesimally critical fields exponentiate in the purely perturbative sense. This dovetails with the result of Alvarez-Gaumé and Ginsparg [5] that the β-function vanishes to all orders for critical perturbative models with N = (4, 4) supersymmetry, and hence from the renormalization point of view, the nonlinear σ-model is conformal for the Ricci flat metric on K3-surfaces. There are also certain differences between the ways mathematical considerations of moduli space and mirror symmetry vary in the K3 and Calabi–Yau 3-fold cases, which could be related to the behavior of the non-perturbative effects. This will be discussed in Sec. 8. To relate more precisely in what setup these results occur, we need to describe what kind of deformations we are considering. It is well known that one can obtain infinitesimal deformations from primary fields.
In the bosonic case, the weight of these ﬁelds must be (1, 1), in the N = 1-supersymmetric case in the NS-NS sector the critical weight is (1/2, 1/2) and in the N = 2-supersymmetric case the inﬁnitesimal deformations we consider are along so called ac or cc ﬁelds of weight (1/2, 1/2). For more speciﬁc discussion, see Sec. 2 below. There may exist inﬁnitesimal deformations which are not related to primary ﬁelds (see the remarks at the end of Sec. 3). However, they are excluded under a certain continuity assumption which we also state in Sec. 2. Therefore, the approach we follow is exponentiating inﬁnitesimal deformations along primary ﬁelds of appropriate weights. In the “algebraic” approach, we assume that both the primary ﬁeld and amplitudes can be updated at all points of the deformation parameter. Additionally, we assume one can obtain a perturbative power

series expansion in the deformation parameter, and we do not allow counterterms of generalized weight or non-perturbative corrections. We describe a cohomological obstruction theory similar to Gerstenhaber’s theory [26–29] for associative algebras, which in principle controls the coeﬃcients at individual powers of the deformation parameter. Obstructions can be written down explicitly under certain conditions. This is done in Sec. 3. The primary obstruction in fact is the one which occurs for the deformations of the free ﬁeld theory at gravitational ﬁelds of non-zero momentum (“gravitational waves”). In the case of the Gepner model of the Fermat quintic, the primary obstruction vanishes but in the case (2), one can show there is an algebraic obstruction of order 5 (i.e. given by a 7 point function in the Gepner model). It should be pointed out that even in the “algebraic” case, there are substantial complications we must deal with. The moduli space of CFT’s is not yet well deﬁned. There are diﬀerent deﬁnitions of conformal ﬁeld theory, for example the Segal approach [59, 36, 37] is quite substantially diﬀerent from the vertex operator approach (see [41] and references therein). Since these deﬁnitions are not known to be equivalent, and their realizations are supposed to be points of the moduli space, the space itself therefore cannot be deﬁned until a particular deﬁnition is selected. Next, it remains to be speciﬁed what structure there should be on the moduli space. Presumably, there should at least be a topology, so that we need to ask what is a nearby conformal ﬁeld theory. That, too, has not been answered. These foundational questions are enormously diﬃcult, mostly from the philosophical point of view: it is very easy to deﬁne ad hoc notions which immediately turn out insuﬃciently general to be desirable. Because of that, we only make minimal deﬁnitions needed to examine the existing paradigm in the context outlined. 
Let us, then, conﬁne ourselves to observing that even in the perturbative case, the situation is not purely algebraic, and rather involves inﬁnite sums which need to be discussed in terms of analysis. For example, the obstructions may in fact be undeﬁned, because they may involve inﬁnite sums which do not converge. Such phenomenon must be treated carefully, since it does not mean automatically that perturbative exponentiation fails. In fact, because the deformed primary ﬁelds are only determined up to a scalar factor, there is a possibility of regularization along the deformation parameter. We brieﬂy discuss this theoretically in Sec. 3, and then give an example in the case of the free ﬁeld theory in Sec. 4. We also brieﬂy discuss suﬃcient conditions for exponentiation. The main method we use is the case when Theorem 1 gives a truly local formula for the inﬁnitesimal amplitude changes, which could be interpreted as an “inﬁnitesimal isomorphism” in a special case. We then give in Sec. 3 conditions under which such inﬁnitesimal isomorphisms can be exponentiated. This includes the case of a coset theory, which does not require regularization, and a more general case when regularization may occur. In the ﬁnal Secs. 5 and 6, namely the case of the Gepner model, the main problem is ﬁnding a setup for the vertex operators which would be explicit enough to allow evaluating the obstructions in question; the positive result is obtained using

a generalization of the coset construction. The formulas required are obtained from the Coulomb gas approach (= Feigin–Fuchs realization), which is taken from [34].

The present paper is organized as follows: In Sec. 2, we give the general setup in which we work, show under which condition we can restrict ourselves to deformations along a primary field, and derive the formula for infinitesimally deformed amplitudes, given in Theorem 1. In Sec. 3, we discuss exponentiation theoretically, in terms of obstruction theory, explicit formulas for the primary and higher obstructions, and regularization. We also discuss supersymmetry, and in the end show a mechanism by which non-perturbative deformations may still be possible when algebraic obstructions occur. In Sec. 4, we give the example of the free field theories, the trivial deformations which come from 0 momentum gravitational deforming fields, and the primary obstruction to deforming along primary fields of non-zero momentum. In Sec. 5, we will discuss the Gepner model of the Fermat quintic, and in Sec. 6, we will discuss examples of non-zero algebraic obstructions to perturbative deformations in this case, as well as speculations about unobstructed deformations. In Sec. 7, we will discuss the (unobstructed) deformations for the Fermat quartic K3 surface, and in Sec. 8, we attempt to summarize and discuss our possible conclusions.

2. Infinitesimal Deformations of Conformal Field Theories

We shall work in the framework of [59] (see also [36–38]). In the bosonic case (without considering supersymmetry), a conformal field theory in this framework is characterized by a Hilbert space of states $H$, and for a worldsheet, by which one means a Riemann surface $\Sigma$ (a 1-dimensional complex manifold) with analytically parametrized boundary components, a trace class element

\[ U_\Sigma \in \hat{\bigotimes} H^* \mathbin{\hat{\otimes}} \hat{\bigotimes} H \tag{3} \]

defined up to scalar multiple. One assumes that these elements depend on $\Sigma$ analytically (i.e. are real-analytic functions on the moduli space of worldsheets). Here $H^*$ denotes the Hilbert space dual of $H$, and $\hat{\otimes}$ denotes the Hilbert tensor product. In (3), the tensor product of copies of $H^*$ (respectively, $H$) is over the inbound (respectively, outbound) boundary components of $\Sigma$. Inbound and outbound boundary components are distinguished by orientation. For an annulus in $\mathbb{C}$ enclosed by two concentric circles oriented counterclockwise, the inside circle is inbound. The elements (3) are subject to gluing identities (gluing of Riemann surfaces corresponds to trace). These elements can also be viewed (perhaps even more conventionally, but less symmetrically) as operators

\[ U_\Sigma : \hat{\bigotimes} H \to \hat{\bigotimes} H \tag{4} \]

where the tensor product in the source (respectively, target) is over inbound (respectively, outbound) boundary components. In this paper, we shall almost exclusively consider the case when $\Sigma$ is a Riemann surface of genus 0, since this is the key

case for deformation theory. It should be noted that a physical CFT has still more structure. Namely, we want to consider the operators (4) where $\Sigma$ is an annulus approaching the degenerate annulus which is the unit circle with both inbound and outbound parametrizations equal to the identity. In such limit, the operator (4) should approach the identity $H \to H$. Also, one requires reflection-positivity (which is the Wick rotation of unitarity). This means that if we denote by $\bar{\Sigma}$ the Riemann surface complex conjugate to $\Sigma$, then $U_{\bar{\Sigma}}$ is adjoint to $U_\Sigma$. One also requires for a physical theory that $H$ actually be the complexification of a real vector space, and that the quadratic form one obtains by taking limits to the degenerate annulus $S^1$ with boundary parametrizations by $z$ (the identity) and $1/z$ be related to the Hermitian form on $H$ by complex conjugation. Treating the supersymmetric case mathematically is more technical, but analogous. Essentially, one must work on the super-moduli space of superconformal surfaces (for a very quick review mostly sufficient for the purposes of this paper, see [49]). The structure just described originates in conformally invariant 2-dimensional quantum field theory. From the point of view of 2-dimensional quantum field theory, the element (3) can be viewed as a generalization of the vacuum expectation value in the sense that no field is inserted inside the worldsheet. From the point of view of conformal field theory, this element is a CFT amplitude. Now in a bosonic (= non-supersymmetric) CFT $H$, if we have a primary field $u$ of weight (1, 1), then, as observed in [59], we can make an infinitesimal deformation of $H$ as follows: For a worldsheet $\Sigma$ with associated element $U_\Sigma$ (see (3)), the infinitesimal deformation of the vacuum is

\[ V_\Sigma = \int_{x\in\Sigma} U_{\Sigma^x_u}. \tag{5} \]
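As a rough consistency check on why an integral of the form (5) can be compatible with gluing to first order, the following sketch expands the glued product of two first-order deformed elements. It assumes a formal parameter $\epsilon$ with $\epsilon^2 = 0$, a worldsheet $\Sigma$ obtained by gluing two worldsheets $\Sigma_1$ and $\Sigma_2$ (gluing written as $\circ$), and that the gluing seam has measure zero. The labels $\Sigma_1$, $\Sigma_2$ are introduced here only for illustration, and all identities are meant up to the overall scalar ambiguity of the elements $U_\Sigma$.

```latex
% First-order expansion of a glued product; the \epsilon^2 term is discarded.
\begin{align*}
(U_{\Sigma_1} + V_{\Sigma_1}\epsilon)\circ(U_{\Sigma_2} + V_{\Sigma_2}\epsilon)
  &= U_{\Sigma_1}\circ U_{\Sigma_2}
   + \bigl(V_{\Sigma_1}\circ U_{\Sigma_2} + U_{\Sigma_1}\circ V_{\Sigma_2}\bigr)\epsilon
   \pmod{\epsilon^2},\\
% Splitting the domain of integration (assumption: the seam has measure zero).
V_{\Sigma} = \int_{x\in\Sigma} U_{\Sigma^x_u}
  &= \int_{x\in\Sigma_1} U_{\Sigma^x_u} + \int_{x\in\Sigma_2} U_{\Sigma^x_u}
   = V_{\Sigma_1}\circ U_{\Sigma_2} + U_{\Sigma_1}\circ V_{\Sigma_2}.
\end{align*}
```

Under these assumptions the coefficient of $\epsilon$ in the glued product agrees with $V_\Sigma$, which is a Leibniz-type rule for the deformation.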

Here $U_{\Sigma^x_u}$ is obtained by choosing a holomorphic embedding $f : D \to \Sigma$, $f(0) = x$, where $D$ is the standard disk. Let $\Sigma'$ be the worldsheet obtained by cutting $f(D)$ out of $\Sigma$, and let $U_{\Sigma^x_u}$ be obtained by gluing the vacuum $U_{\Sigma'}$ with the field $u$ inserted at $f(\partial D)$. The element $U_{\Sigma^x_u}$ is proportional to $|f'(0)|^2$, since $u$ is (1, 1)-primary, so it transforms the same way as a measure and we can define the integral (5) without coupling with a measure. The integral (5) is an infinitesimal deformation of the original CFT structure in the sense that $U_\Sigma + V_\Sigma\epsilon$ satisfies CFT gluing identities in the ring $\mathbb{C}[\epsilon]/\epsilon^2$. The main topic of this paper is studying (in this and analogous supersymmetric cases) the question as to when the infinitesimal deformation (5) can be exponentiated at least to perturbative level, i.e. when there exist for each $n \in \mathbb{N}$

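elements

\[ u^0, \ldots, u^{n-1} \in H, \qquad u^0 = u \]

and for every worldsheet $\Sigma$

\[ U_\Sigma^0, \ldots, U_\Sigma^n \in \hat{\bigotimes} H^* \mathbin{\hat{\otimes}} \hat{\bigotimes} H \]

such that

\[ U_\Sigma(m) = \sum_{i=0}^{m} U_\Sigma^i \epsilon^i, \qquad U_\Sigma^0 = U_\Sigma \tag{6} \]

satisfy gluing axioms in $\mathbb{C}[\epsilon]/\epsilon^{m+1}$, $0 \le m \le n$,

\[ u(m) = \sum_{i=0}^{m} u^i \epsilon^i \tag{7} \]

is primary of weight (1, 1) with respect to (6), $0 < m \le n$, and

\[ \frac{dU_\Sigma(m)}{d\epsilon} = \int_{x\in\Sigma} U_{\Sigma^x_{u(m-1)}}(m-1) \tag{8} \]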

in the same sense as in (5). We should remark that a priori, it is not known that all deformations of CFT come from primary fields: One could, in principle, simply ask for the existence of vacua (6) such that (6) satisfy gluing axioms over $\mathbb{C}[\epsilon]/\epsilon^{m+1}$. As remarked in [59], it is not known whether all perturbative deformations of CFT's are obtained from primary fields $u$ as described above. However, one can indeed prove that the primary fields $u$ exist given suitable continuity assumptions. Suppose the vacua $U_\Sigma(m)$ exist for $0 \le m \le n$. We notice that the integral on the right-hand side of (8) is, by definition, the limit of integrals over regions $R$ which are proper subsets of $\Sigma$ such that the measure of $\Sigma - R$ goes to 0 (fix an analytic metric on $\Sigma$ compatible with the complex structure). Let, thus, $\Sigma_{D_1,\ldots,D_k}$ be a worldsheet obtained from $\Sigma$ by cutting out disjoint holomorphically embedded copies $D_1, \ldots, D_k$ of the unit disk $D$. Then we calculate

\begin{align*}
\frac{dU_\Sigma(m)}{d\epsilon}
 &= \int_{x\in\Sigma} U_{\Sigma^x_{u(m-1)}}(m-1) \\
 &= \lim_{\mu(\Sigma_{D_1,\ldots,D_k})\to 0} U_{\Sigma_{D_1,\ldots,D_k}}(m-1) \int_{x\in\bigcup D_i} U_{(\bigcup D_i)^x_{u(m-1)}}(m-1) \\
 &= \lim_{\mu(\Sigma_{D_1,\ldots,D_k})\to 0} \sum_i U_{\Sigma_{D_i}}(m-1) \int_{x\in D_i} U_{(D_i)^x_{u(m-1)}}(m-1) \\
 &= \lim_{\mu(\Sigma_{D_1,\ldots,D_k})\to 0} \sum_i U_{\Sigma_{D_i}}(m-1)\, \frac{dU_{D_i}(m)}{d\epsilon}
\end{align*}

assuming (8) for $\Sigma = D$, so the assumption we need is

\[ \frac{dU_\Sigma(m)}{d\epsilon} = \lim_{\mu(\Sigma_{D_1,\ldots,D_k})\to 0} \sum_i U_{\Sigma_{D_i}}(m-1) \circ \frac{dU_{D_i}(m)}{d\epsilon}. \tag{9} \]

The composition notation on the right-hand side means gluing. Granted (9), we can recover $dU_\Sigma(m)/d\epsilon$ from $dU_D(m)/d\epsilon$ for the unit disk $D$. Here $\mu$ denotes the Lebesgue measure (this is well defined on a worldsheet at least up to absolute continuity, which is sufficient for taking the limit in the above computation). Now in the case of the unit disk, we get a candidate for $u(m-1)$ in the following way: Assume that $H$ is topologically spanned by subspaces $H_{(m_1,m_2)}$ of $\epsilon$-weight $(m_1, m_2)$ where $m_1, m_2 \ge 0$, $H_{(0,0)} = \langle U_D \rangle$. Then $U_D(m)$ is invariant under rigid rotation, so

\[ U_D(m) \in \hat{\bigoplus_{k\ge 0}} H_{(k,k)}[\epsilon]/\epsilon^{m+1}. \tag{10} \]
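The restriction to weights $(k, k)$ in (10) can be sketched as follows, assuming (as is standard, though not spelled out above) that rigid rotation by an angle $\theta$ acts on a vector of weight $(m_1, m_2)$ through the phase $e^{i\theta(m_1 - m_2)}$. The symbols $R_\theta$ and $h_{(m_1,m_2)}$ are introduced here only for illustration.

```latex
% Decompose a rotation-invariant element h into its weight components.
\[
h = \sum_{m_1, m_2 \ge 0} h_{(m_1,m_2)}, \qquad
R_\theta h = \sum_{m_1, m_2 \ge 0} e^{i\theta(m_1 - m_2)}\, h_{(m_1,m_2)} .
\]
% Invariance R_\theta h = h for every \theta forces
% h_{(m_1,m_2)} = 0 unless m_1 = m_2, leaving only the (k,k) summands.
```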

We see that if $A_q$ is the standard annulus with boundary components $S^1$, $qS^1$ with standard parametrizations, then

\[ u(m-1) = \lim_{q\to 0} \frac{1}{|q|^2}\, U_{A_q} \frac{dU_D(m)}{d\epsilon} \tag{11} \]

exists and is equal to the weight (1, 1) summand of (10). In fact, by (9) and the definition of integral, we already see that (8) holds. We do not know however yet that $u(m-1)$ is primary. To see that, however, we note that for any annulus $A = D - D'$ where $f : D \to D'$ is a holomorphic embedding with derivative $r$, (9) also implies (for the same reason, namely the exhaustion principle) that (8) is valid with $u(m-1)$ replaced by

\[ \frac{U_A u(m-1)}{|r|^2}. \tag{12} \]

Since this is true for any $\Sigma$, in particular where $\Sigma$ is any disk, the integrands must be equal, so (12) and $u(m-1)$ have the same vertex operators, so at least in the absence of null elements,

\[ \frac{U_A u(m-1)}{|r|^2} = u(m-1) \tag{13} \]

which means that $u(m-1)$ is primary of weight (1, 1), which says precisely that the expression on the left-hand side of (13) is independent of $A$. We have presented an argument by which, making certain assumptions, deformations of CFT's occur along primary fields of critical weights. This is a question raised in [59]. We shall see however that there are problems with this formulation even in the simplest possible case: Consider the free (bosonic) CFT of dimension 1, and the primary field $x_{-1}\tilde{x}_{-1}$. (We disregard here the issue that $H$ itself lacks a satisfactory Hilbert space structure, see [37]; we could eliminate this problem by

compactifying the theory on a torus or by considering the state spaces of given momentum.) Let us calculate

\[ U_D^1 = \int_D \exp(zL_{-1}) \exp(\bar{z}\tilde{L}_{-1})\, x_{-1}\tilde{x}_{-1} = \frac{1}{2} \sum_{k\ge 1} \frac{x_{-k}\tilde{x}_{-k}}{k}. \tag{14} \]

We see that the element (14) is not an element of $H$, since its norm is $\sum_{k\ge 1} 1 = \infty$. This occurs despite the fact that the norm on $H$ is preserved by the deformation, i.e. the deformation is unitary. (This is because the inner product is conjugate by reality to the quadratic form which is the operator associated with the degenerate worldsheet with two outbound boundary components $S^1 = \{z \in \mathbb{C} \mid |z| = 1\}$ parametrized by $z$ and $1/z$; in the class of measures equivalent to the Lebesgue measure by absolute continuity, this worldsheet has measure 0 and hence the deformation acts trivially on it.) The explanation is simply that the infinitesimally deformed vacuum is

\[ 1 + \frac{1}{2}\epsilon \sum_{n>0} \frac{x_{-n}\tilde{x}_{-n}}{n}. \tag{15} \]

When computing the square norm of (15), the second summand is orthogonal to the first, hence its square norm occurs with coefficient $\epsilon^2$, which disappears when calculating up to linear order in $\epsilon$ (which is what we are doing in an infinitesimal deformation); such phenomena routinely occur when one attempts to differentiate unitary processes on Hilbert spaces. In our case, as we shall see, the situation is further complicated by the fact that the process actually has to be regularized. There are other problems as well. For one thing, we wish to consider theories which really do not have Hilbert axiomatizations in the proper sense, including Minkowski signature theories, where the Hilbert approach is impossible for physical reasons. Therefore, we prefer a “vertex operator algebra” approach where we discard the Hilbert completion and restrict ourselves to examining tree level amplitudes. One axiomatization of such theories was given in [41] under the term “full field algebra”. In the present paper, however, we prefer to work from scratch, listing the properties we will use explicitly, and referring to our objects as conformal field theories in the vertex operator formulation.
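The divergence of the norm can be made explicit under a free-boson normalization that is an assumption here rather than something stated in the text: vectors $x_{-k}\tilde{x}_{-k}$ with different $k$ are taken to be orthogonal, with $\|x_{-k}\tilde{x}_{-k}\|^2 = k^2$. The constant $c$ stands for whatever overall factor multiplies the sum, and $N$ is a truncation level introduced for illustration.

```latex
% Square norm of the truncated sum, using orthogonality of the summands
% and the assumed normalization \|x_{-k}\tilde{x}_{-k}\|^2 = k^2.
\[
\Bigl\| c \sum_{k=1}^{N} \frac{x_{-k}\tilde{x}_{-k}}{k} \Bigr\|^2
 = |c|^2 \sum_{k=1}^{N} \frac{\|x_{-k}\tilde{x}_{-k}\|^2}{k^2}
 = |c|^2 \sum_{k=1}^{N} 1
 = |c|^2 N \longrightarrow \infty \quad (N \to \infty).
\]
```

Each summand contributes the same amount, so no choice of the nonzero constant $c$ makes the sum converge in the Hilbert norm.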
As mentioned in the introduction, our approach in this paper is essentially to build the minimal possible machinery in which we can phrase the concept of perturbative deformation of a CFT along a primary ﬁeld of critical weight to arbitrary degree, and identifying obstructions to obtaining such deformation. Actually identifying the deformed conformal ﬁeld theory upon plugging in a value of the deformation parameter (provided the obstructions vanish) by means of a general abstract machinery (i.e. not assuming we can recognize the theory by other means) is a diﬃcult problem which remains untreated in the present paper. Therefore, speaking purely mathematically, we are actually deﬁning the concept of perturbative

deformation along with finding our obstructions. It would be far superior to define rigorously the moduli space of conformal field theories upfront, with enough geometry to allow us to define paths. Such technology, however, is not mathematically available at the present time. Regarding the approach to constructing and treating fields, the vertex operator approach is largely superior from the computational point of view. One can convert to the more symmetric and foundationally more powerful Hilbert space approach when we have appropriate convergence of the operators constructed. We shall proceed by using either language according to what is more convenient at each particular time. For now, let us consider untopologized vector spaces

\[ V = \bigoplus V_{(w_L,w_R)}. \tag{16} \]

Here $(w_L, w_R)$ are weights (we refer to $w_L$, respectively $w_R$, as the left, respectively right, component of the weight), so we assume $w_L - w_R \in \mathbb{Z}$ and usually

\[ w_L, w_R \ge 0, \tag{17} \]

\[ V_{(0,0)} = \langle U_D \rangle. \tag{18} \]

The “no ghost” assumptions (17), (18) will sometimes be dropped. If there is a Hilbert space $H$, then $V$ is interpreted as the “subspace of states of finite weights”. We assume that for $u \in V_{w_L,w_R}$, we have vertex operators of the form

\[ Y(u, z, \bar{z}) = \sum_{(v_L,v_R)} u_{-v_L-w_L,\,-v_R-w_R}\, z^{v_L} \bar{z}^{v_R}. \tag{19} \]

Here $u_{a,b}$ are operators which raise the left (respectively, right) component of weight by $a$ (respectively, $b$). We additionally assume $v_L - v_R \in \mathbb{Z}$ and that for a given $w$, the weights of operators which act on $w$ are discrete. Even more strongly, we assume that

\[ Y(u, z, \bar{z}) = \sum_i Y_i(u, z)\, \tilde{Y}_i(u, \bar{z}) \tag{20} \]

where

\[ Y_i(u, z) = \sum u_{i;-v_L-w_L}\, z^{v_L}, \qquad \tilde{Y}_i(u, \bar{z}) = \sum \tilde{u}_{i;-v_R-w_R}\, \bar{z}^{v_R}, \tag{21} \]

where all the operators $Y_i(u, z)$ commute with all $\tilde{Y}_j(v, \bar{z})$. The main axiom which fields (19) must satisfy is “commutativity” and “associativity” analogous to the case of vertex operator algebras, i.e. there must exist for fields $u, v, w \in V$ and $w' \in V^\vee$ of finite weight, a “4-point function”

\[ w' Z(u, v, z, \bar{z}, t, \bar{t})\, w \tag{22} \]

which is real-analytic and unbranched outside the loci of $z = 0$, $t = 0$, $z = \infty$, $t = \infty$ and $z = t$, and whose expansion in $t$ first and $z$ second (respectively, $z$ first and $t$ second, respectively, $z - t$ first and $t$ second) is

\[ w' Y(u, z, \bar{z}) Y(v, t, \bar{t}) w, \qquad w' Y(v, t, \bar{t}) Y(u, z, \bar{z}) w, \qquad w' Y(Y(u, z - t, \bar{z} - \bar{t}) v, t, \bar{t}) w, \]

respectively. Here, for example, by an expansion in $t$ first and $z$ second we mean a series in the variable $z$ whose coefficients are series in the variable $t$, and the other cases are analogous. Comment: the existence of the 4-point function is the appropriate generalization of “locality”. We also assume that Virasoro algebras $L_n$, $\tilde{L}_n$ with equal central charges $c_L = c_R$ act and that

\[ Y(L_{-1}u, z, \bar{z}) = \frac{\partial}{\partial z} Y(u, z, \bar{z}), \qquad Y(\tilde{L}_{-1}u, z, \bar{z}) = \frac{\partial}{\partial \bar{z}} Y(u, z, \bar{z}) \tag{23} \]

and

\[ V_{w_L,w_R} \text{ is the weight } (w_L, w_R) \text{ subspace of } V \text{ with respect to } (L_0, \tilde{L}_0). \tag{24} \]

Remark. Even though the axioms outlined here are meant for theories which are initial points of the proposed perturbative deformations, they are too restrictive for the theories obtained as a result of the deformations themselves. To capture those deformations, it is best to revert to Segal's approach, restricting attention to genus 0 worldsheets with a unique outbound boundary component (tree level amplitudes). Operators will then be expanded both in the weight grading and in the perturbative parameter (i.e. the coefficient at each power of the deformation parameter will be an element of the product-completed state space of the original theory). To avoid discussion of topology, we simply require that perturbative coefficients of all compositions of such operators converge in the product topology with respect to the weight grading, and the analytic topology in each graded summand.

In this section, we discuss infinitesimal perturbations, i.e. the deformed theory is defined over $\mathbb{C}[\epsilon]/(\epsilon^2)$ where $\epsilon$ is the deformation parameter. One case where such infinitesimal deformations can be described explicitly is the following.

Theorem 1. Consider fields $u, v, w \in V$ where $u$ is primary of weight (1, 1). Next, assume that

\[ Z(u, v, z, \bar{z}, t, \bar{t}) = \sum_{\alpha,\beta} Z_{\alpha,\beta}(u, v, z, \bar{z}, t, \bar{t}) \]

where

\[ Z_{\alpha,\beta}(u, v, z, \bar{z}, t, \bar{t}) = \sum_i Z_{\alpha,\beta,i}(u, v, z, t)\, \tilde{Z}_{\alpha,\beta,i}(u, v, \bar{z}, \bar{t}) \]

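and for $w' \in V^\vee$ of finite weight, $w' Z_{\alpha,\beta,i}(u, v, z, t)(z - t)^\alpha z^\beta$ (respectively, $w' \tilde{Z}_{\alpha,\beta,i}(u, v, \bar{z}, \bar{t})(\bar{z} - \bar{t})^\alpha \bar{z}^\beta$) is a meromorphic (respectively, antimeromorphic) function of $z$ on $\mathbb{CP}^1$, with poles (if any) only at $0, t, \infty$. Now write

\[ Y_{u,\alpha,\beta}(v, t, \bar{t}) = (i/2) \int_\Sigma Z_{\alpha,\beta}(u, v, z, \bar{z}, t, \bar{t})\, dz\, d\bar{z}, \tag{25} \]

so

\[ Y_u(v, t, \bar{t}) = Y(v, t, \bar{t}) + \epsilon \sum_{\alpha,\beta} Y_{u,\alpha,\beta}(v, t, \bar{t}) \]

is the infinitesimally deformed vertex operator, where $\Sigma$ is the degenerate worldsheet with unit disks cut out around $0, t, \infty$. Assume now further that we can expand

\[ Z_{\alpha,\beta,i}(u, v, z, t) = Y_{\alpha,\beta,i}(v, t)\, Y_{\alpha,\beta,i}(u, z) \quad \text{when } z \text{ is near } 0, \tag{26} \]

\[ Z_{\alpha,\beta,i}(u, v, z, t) = Y'_{\alpha,\beta,i}(u, z)\, Y_{\alpha,\beta,i}(v, t) \quad \text{when } z \text{ is near } \infty, \tag{27} \]

\[ Z_{\alpha,\beta,i}(u, v, z, t) = Y_{\alpha,\beta,i}(Y''_{\alpha,\beta,i}(u, z - t)v, t) \quad \text{when } z \text{ is near } t. \tag{28} \]

Write

\begin{align*}
Y_{\alpha,\beta,i}(u, z) &= \sum u_{\alpha,\beta,i,-n-\beta}\, z^{n+\beta-1}, \\
Y'_{\alpha,\beta,i}(u, z) &= \sum u'_{\alpha,\beta,i,-n-\alpha-\beta}\, z^{n+\alpha+\beta-1}, \\
Y''_{\alpha,\beta,i}(u, z) &= \sum u''_{\alpha,\beta,i,-n-\alpha}\, z^{n+\alpha-1}.
\end{align*}

(Analogously with the $\tilde{\ }$'s.) Assume now

\[ u_{\alpha,\beta,i,0}\, w = 0, \qquad u''_{\alpha,\beta,i,0}\, v = 0, \qquad u'_{\alpha,\beta,i,0}\, Y_{\alpha,\beta,i}(v, t) w = 0 \tag{29} \]

and analogously for the $\tilde{\ }$'s (note that these conditions are only nontrivial when $\beta = 0$, respectively, $\alpha = 0$, respectively, $\alpha = -\beta$). Denote now by $\omega_{\alpha,\beta,i,0}$, $\omega_{\alpha,\beta,i,\infty}$, $\omega_{\alpha,\beta,i,t}$ the indefinite integrals of (26)–(28) in the variable $z$, obtained using the formula

\[ \int z^k\, dz = \frac{z^{k+1}}{k+1} \quad \text{for } k \ne -1 \]

(thus fixing the integration constant), and analogously with the $\tilde{\ }$'s. Let then

\begin{align*}
C_{\alpha,\beta,i} &= \omega_{\alpha,\beta,i,\infty} - \omega_{\alpha,\beta,i,t}, &
D_{\alpha,\beta,i} &= \omega_{\alpha,\beta,i,\infty} - \omega_{\alpha,\beta,i,0}, \\
\tilde{C}_{\alpha,\beta,i} &= \tilde{\omega}_{\alpha,\beta,i,\infty} - \tilde{\omega}_{\alpha,\beta,i,t}, &
\tilde{D}_{\alpha,\beta,i} &= \tilde{\omega}_{\alpha,\beta,i,\infty} - \tilde{\omega}_{\alpha,\beta,i,0}
\end{align*}

(see the comment in the proof on branching). Let

\[ \phi_{\alpha,\beta,i} = \pi \sum_n \frac{u_{\alpha,\beta,i,-n}\, \tilde{u}_{\alpha,\beta,i,-n}}{n} \tag{30} \]

and similarly for the $\tilde{\ }$'s, the $'$'s and the $''$'s. (The definition makes sense when applied to fields on which the term with denominator 0 vanishes.) Then

\begin{align*}
Y_{u,\alpha,\beta}(v, t, \bar{t})w = \sum_i \Bigl( & \phi'_{\alpha,\beta,i} Y(v, t, \bar{t})w - Y(\phi''_{\alpha,\beta,i} v, t, \bar{t})w - Y(v, t, \bar{t})\phi_{\alpha,\beta,i} w \\
& + C_{\alpha,\beta,i}\tilde{C}_{\alpha,\beta,i}(-1 + e^{-2\pi i\alpha}) + D_{\alpha,\beta,i}\tilde{D}_{\alpha,\beta,i}(1 - e^{2\pi i\beta}) \Bigr). \tag{31}
\end{align*}

Additionally, when $\alpha = 0$, then $D_{\alpha,\beta,i} = \tilde{D}_{\alpha,\beta,i} = 0$, and when $\beta = 0$ then $C_{\alpha,\beta,i} = \tilde{C}_{\alpha,\beta,i} = 0$, and

\[ Y_{u,\alpha,\beta}(v, t, \bar{t})w = \sum_i \Bigl( \phi'_{\alpha,\beta,i} Y(v, t, \bar{t})w - Y(\phi''_{\alpha,\beta,i} v, t, \bar{t})w - Y(v, t, \bar{t})\phi_{\alpha,\beta,i} w \Bigr). \tag{32} \]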

Equation (32) is also valid when $\alpha = -\beta$.

Remark 1. Note that technically, the integral (25) is not defined on the degenerate worldsheet described. This can be treated in the standard way, namely by considering an actual worldsheet $\Sigma'$ obtained by gluing on standard annuli on the boundary components. It is easily checked that if we denote by $A^u_q$ the infinitesimal deformation of $A_q$ by $u$, then

\[ A^u_q(w) = \phi A_q(w) - A_q(\phi w). \]

Therefore, the theorem can be stated equivalently for the worldsheet $\Sigma'$. The only change needs to be made in formula (31), where $\phi'$ needs to be multiplied by $s^{-2n}$ and $\phi''$ needs to be multiplied by $r^{-2n}$, where $r$ and $s$ are radii of the corresponding boundary components. Because this is equivalent, however, we can pretend to work on the degenerate worldsheet $\Sigma$ directly, in particular avoiding inconvenient scaling factors in the statement.

Remark 2. The validity of this theorem is rather restricted by its assumptions. Most significantly, its assumption states that the chiral 4-point function can be rendered meromorphic in one of the variables by multiplying by a factor of the form $z^\alpha (z - t)^\beta$. This is essentially equivalent to the fusion rules being “abelian”, i.e. 1-dimensional for each pair of labels, and each pair of labels has exactly one product. As we will see (and as is well known), the N = 2 minimal model is an example of a “non-abelian” theory. Speaking more generally in terms of function theory, branched analytic functions on $\mathbb{CP}^1$ (at a risk of great confusion, we recall that those were called “Abelsche Funktionen” by Riemann) are finite-dimensional vector spaces which are locally spaces of holomorphic functions, outside of finitely many points $z_1, \ldots, z_n$ on $\mathbb{CP}^1$. One also assumes that the singularities at $z_i$ are of bounded polynomial growth. Such a function then defines a finite-dimensional representation of the fundamental group $\pi_1(\mathbb{CP}^1 - \{z_1, \ldots, z_n\})$, called the holonomy representation.
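As a minimal illustration of abelian holonomy, consider the model factor $f(z) = (z - t)^\alpha z^\beta$ of the kind appearing in the hypothesis of Theorem 1. The sketch below records how $f$ changes under analytic continuation along small counterclockwise loops around each puncture. The loop names $\gamma_0$, $\gamma_t$, $\gamma_\infty$ are introduced here for illustration, and the sign in the exponent for $\gamma_\infty$ depends on the orientation convention chosen at infinity.

```latex
% Monodromy of f(z) = (z-t)^\alpha z^\beta around the three punctures.
\[
\gamma_0 : f \mapsto e^{2\pi i\beta} f, \qquad
\gamma_t : f \mapsto e^{2\pi i\alpha} f, \qquad
\gamma_\infty : f \mapsto e^{-2\pi i(\alpha+\beta)} f .
\]
% All three monodromies are scalars, so they commute and the holonomy
% representation is one-dimensional. The function is single-valued near
% 0, t or \infty exactly when \beta, \alpha or \alpha+\beta is an integer.
```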
In particular, chiral correlation functions of a full ﬁeld algebra are branched functions in this sense. The key issue is whether the holonomy representation is a sum of one-dimensional

March 10, J070-S0129055X10003916

132

2010 10:13 WSPC/S0129-055X

148-RMP

I. Kriz

representations (in which case it factors through the abelianization of the fundamental group, that is, the first homology group). Then and only then is the function a sum of contributions which can be rendered holomorphic by multiplication with appropriate products of $(z_i - z_j)^{\alpha_{ij}}$. A most basic example of a branched function with non-abelian holonomy is the hypergeometric function, which occurs as the 4-point function of parafermions and N = 2-minimal models.

Even for an abelian theory, the theorem only calculates the deformation in the “0 charge sector” because of the assumption (29). Because of this, even for a free field theory, we will need to discuss an extension of the argument. Since in that case, however, stating precise assumptions is even more complicated, we prefer to treat the special case only, and to postpone the discussion to Sec. 4 below.

Proof. Let us work on the scaled real worldsheet $\Sigma'$. Let
$$\eta_{\alpha,\beta,i} = Z_{\alpha,\beta,i}(u,v,z,t)\,dz, \qquad \tilde\eta_{\alpha,\beta,i} = \tilde Z_{\alpha,\beta,i}(u,v,\bar z,\bar t)\,d\bar z.$$
Denote by $\partial_0, \partial_\infty, \partial_t$ the boundary components of $\Sigma'$ near $0, \infty, t$. Then the form $\omega_{\alpha,\beta,i,\infty}\tilde\eta_{\alpha,\beta,i}$ is unbranched on a domain obtained by making a cut $c$ connecting $\partial_0$ and $\partial_t$. We have
$$\int_{\partial_t}\omega_{\alpha,\beta,i,t}\,\tilde\eta = -Y(\phi_{\alpha,\beta,i}v,t,\bar t), \tag{33}$$
$$\int_{\partial_0}\omega_{\alpha,\beta,i,0}\,\tilde\eta = -Y(v,t,\bar t)\phi_{\alpha,\beta,i}. \tag{34}$$

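But we want to integrate $\omega_{\alpha,\beta,i,\infty}\tilde\eta_{\alpha,\beta,i}$ over the boundary $\partial\Sigma'$:
$$\int_{\partial\Sigma'}\omega_{\alpha,\beta,i}\tilde\eta_{\alpha,\beta,i} = \int_{\partial_t}\omega_{\alpha,\beta,i}\tilde\eta_{\alpha,\beta,i} + \int_{\partial_0}\omega_{\alpha,\beta,i}\tilde\eta_{\alpha,\beta,i} + \int_{\partial_\infty}\omega_{\alpha,\beta,i}\tilde\eta_{\alpha,\beta,i} + \int_{c^+}\omega_{\alpha,\beta,i}\tilde\eta_{\alpha,\beta,i} + \int_{c^-}\omega_{\alpha,\beta,i}\tilde\eta_{\alpha,\beta,i} \tag{35}$$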

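where $c^+, c^-$ are the two parts of $\partial\Sigma'$ along the cut $c$, oriented from $\partial_t$ to $\partial_0$ and back respectively. Before going further, let us look at two points $x^+ \in c^+$, $x^- \in c^-$ which project to the same point on $c$. We have
$$\begin{aligned}
C(e^{-2\pi i\alpha} - 1)\tilde\eta(x^-) &= C\tilde\eta(x^+) - C\tilde\eta(x^-)\\
&= (\omega_t + C)\tilde\eta(x^+) - (\omega_t + C)\tilde\eta(x^-)\\
&= \omega_\infty\tilde\eta(x^+) - \omega_\infty\tilde\eta(x^-)\\
&= (\omega_0 + D)\tilde\eta(x^+) - (\omega_0 + D)\tilde\eta(x^-)\\
&= D\tilde\eta(x^+) - D\tilde\eta(x^-)\\
&= D(e^{2\pi i\beta} - 1)\tilde\eta(x^-)
\end{aligned}$$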

March 10, J070-S0129055X10003916

2010 10:13 WSPC/S0129-055X

148-RMP

Perturbative Deformations of Conformal Field Theories Revisited

133

(the subscripts $\alpha, \beta, i$ were omitted throughout to simplify the notation). This implies the relation
$$C_{\alpha,\beta,i}(e^{-2\pi i\alpha} - 1) = D_{\alpha,\beta,i}(e^{2\pi i\beta} - 1). \tag{36}$$

Comment. This is valid when the constants $C_{\alpha,\beta,i}$, $D_{\alpha,\beta,i}$ are both taken at the point $x^-$; note that since the chiral forms are branched, we would have to adjust the statement if we measured the constants elsewhere. This, however, will not be of much interest to us, as in the present paper we are most interested in the case when the constants vanish. In any case, note that (36) implies $C_{\alpha,\beta,i} = 0$ when $\beta = 0 \bmod \mathbb Z$ and $\alpha \neq 0 \bmod \mathbb Z$, and $D_{\alpha,\beta,i} = 0$ when $\alpha = 0 \bmod \mathbb Z$ and $\beta \neq 0 \bmod \mathbb Z$. There is an analogous relation to (36) between $\tilde C_{\alpha,\beta,i}$, $\tilde D_{\alpha,\beta,i}$.

Note that when $\alpha = 0 = \beta$, all the forms in sight are unbranched, and (32) follows directly. To treat the case $\alpha = -\beta$, proceed analogously, but replacing $\omega_{\alpha,\beta,i,\infty}$ by $\omega_{\alpha,\beta,i,0}$ or $\omega_{\alpha,\beta,i,t}$. Thus, we have finished proving (32) under its hypotheses.

Returning to the general case, let us study the right-hand side of (35). Subtracting the first two terms from (33), (34), we get
$$\int_{\partial_t} C_{\alpha,\beta,i}\,\tilde\eta_{\alpha,\beta,i}, \qquad \int_{\partial_0} D_{\alpha,\beta,i}\,\tilde\eta_{\alpha,\beta,i}, \tag{37}$$
respectively. On the other hand, the sum of the last two terms, looking at points $x^+, x^-$ for each $x \in c$, can be rewritten as
$$\int_{c^+} C_{\alpha,\beta,i}(-e^{-2\pi i\alpha} + 1)\,\tilde\eta_{\alpha,\beta,i} = \int_{c^-} D_{\alpha,\beta,i}(-e^{2\pi i\beta} + 1)\,\tilde\eta_{\alpha,\beta,i}. \tag{38}$$
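The vanishing statements drawn from relation (36) can be checked with a few lines of arithmetic. This is a minimal sketch under stated assumptions: the helper name `c_from_d` and the sample values are illustrative, and the constants are treated as plain complex numbers.

```python
import cmath
import math


def c_from_d(d, alpha, beta):
    """Solve C (e^{-2 pi i alpha} - 1) = D (e^{2 pi i beta} - 1) for C.

    Requires alpha not to be an integer; otherwise the relation
    places no constraint on C.
    """
    lhs_factor = cmath.exp(-2j * math.pi * alpha) - 1
    rhs_factor = cmath.exp(2j * math.pi * beta) - 1
    if abs(lhs_factor) < 1e-12:
        raise ValueError("alpha is an integer: (36) does not determine C")
    return d * rhs_factor / lhs_factor


# beta an integer, alpha not an integer: C must vanish whatever D is.
c_integer_beta = c_from_d(1.7, alpha=0.25, beta=3)
# Generic exponents: C is a nonzero multiple of D.
c_generic = c_from_d(1.7, alpha=0.25, beta=0.5)
```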

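Now recall (30). Choosing $\tilde\omega_{\alpha,\beta,i,\infty}$ as the primitive function of $\tilde\eta_{\alpha,\beta,i}$, we see that for the end point $x$ of $c^-$,
$$\begin{aligned}
\tilde\omega_{\alpha,\beta,i,\infty}(x^+) - \tilde\omega_{\alpha,\beta,i,\infty}(x^-) &= \tilde\omega_{\alpha,\beta,i,t}(x^+) - \tilde\omega_{\alpha,\beta,i,t}(x^-)\\
&= (e^{-2\pi i\alpha} - 1)\tilde\omega_{\alpha,\beta,i,t}(x^-)\\
&= (e^{-2\pi i\alpha} - 1)\tilde\omega_{\alpha,\beta,i,\infty}(x^-) + (e^{-2\pi i\alpha} - 1)\tilde C_{\alpha,\beta,i}.
\end{aligned} \tag{39}$$
Similarly, for the beginning point $y$ of $c^-$,
$$\begin{aligned}
-\tilde\omega_{\alpha,\beta,i,\infty}(y^+) + \tilde\omega_{\alpha,\beta,i,\infty}(y^-) &= -\tilde\omega_{\alpha,\beta,i,0}(y^+) + \tilde\omega_{\alpha,\beta,i,0}(y^-)\\
&= -(e^{2\pi i\beta} - 1)\tilde\omega_{\alpha,\beta,i,0}(y^-)\\
&= -(e^{2\pi i\beta} - 1)\tilde\omega_{\alpha,\beta,i,\infty}(y^-) - (e^{2\pi i\beta} - 1)\tilde D_{\alpha,\beta,i}.
\end{aligned} \tag{40}$$
Then (39), (40) multiplied by $C_{\alpha,\beta,i}$ are the integrals (37), while the integral (38) is
$$-D_{\alpha,\beta,i}(1 - e^{2\pi i\beta})\tilde\omega_{\alpha,\beta,i,0}(y^-) + C_{\alpha,\beta,i}(1 - e^{-2\pi i\alpha})\tilde\omega_{\alpha,\beta,i,0}(x^-). \tag{41}$$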

Adding this, we get
$$C_{\alpha,\beta,i}\tilde C_{\alpha,\beta,i}(-1 + e^{-2\pi i\alpha}) + D_{\alpha,\beta,i}\tilde D_{\alpha,\beta,i}(1 - e^{2\pi i\beta}),$$
as claimed.

3. Exponentiation of Infinitesimal Deformations

Let us now look at primary weight (1, 1) fields $u$. We would like to investigate whether the infinitesimal deformation of vertex operators (more precisely worldsheet vacua or string amplitudes) along $u$ indeed continues to a finite deformation, or at least to perturbative level, as discussed in the previous section. Looking again at Eq. (8), we see that we have in principle a series of obstructions similar to those of Gerstenhaber [26–29], namely if we denote by
$$L_n(m) = \sum_{i=0}^{m} L_n^i\,\epsilon^i, \qquad L_n^0 = L_n \tag{42}$$
a deformation of the operator $L_n$ in $\mathrm{Hom}(V,V)[\epsilon]/\epsilon^m$, we must have
$$L_n(m)u(m) = 0 \in V[\epsilon]/\epsilon^{m+1} \quad \text{for } n > 0, \tag{43}$$
$$L_0(m)u(m) = u(m) \in V[\epsilon]/\epsilon^{m+1}. \tag{44}$$
This can be rewritten as
$$L_n u^m = -\sum_{i\ge 1} L_n^i u^{m-i}, \qquad (L_0 - 1)u^m = -\sum_{i\ge 1} L_0^i u^{m-i}. \tag{45}$$

(Analogously for the ˜’s. In the following, we will work on the obstruction for the chiral part; the antichiral part is analogous.) At first, these equations seem very overdetermined. As in the case of Gerstenhaber’s obstruction theory, however, the obstructions are of cohomological nature. If we denote by $A$ the Lie algebra $\langle L_0 - 1, L_1, L_2, \ldots\rangle$, then the system
$$L_n(m)u(m-1), \qquad (L_0(m) - 1)u(m-1) \tag{46}$$
is divisible by $\epsilon^m$ in $V[\epsilon]/\epsilon^{m+1}$, and is obviously a coboundary, hence a cocycle with respect to $L_0(m) - 1, L_1(m), \ldots$. Hence, dividing by $\epsilon^m$, we get a 1-cocycle in $H^1(A, \mathbb C)$. Solving (45) means expressing this $A$-cocycle as a coboundary.

In the absence of ghosts (= elements of negative weights), there is another simplification we may take advantage of. Suppose we have a 1-cocycle $c = (x_0, x_1, \ldots)$ of $A$, representing an element of $H^1(A, \mathbb C)$. (In our applications, we will be interested in the case when the $x_i$’s are given by (46).) Writing out the cocycle condition explicitly, we obtain the equations
$$L'_k x_j - L'_j x_k = (k - j)x_{j+k},$$
where $L'_k = L_k$ for $k > 0$, $L'_0 = L_0 - 1$. In particular, $L_k x_0 - L'_0 x_k = k x_k$, or
$$L_k x_0 = (L_0 + k - 1)x_k \quad \text{for } k > 0. \tag{47}$$
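The cocycle condition and its consequence (47) can be tested on a toy model. The sketch below makes several assumptions that are not in the text: it realizes the generators as $L_k = -z^{k+1}\,d/dz$ acting on Laurent polynomials (so that $[L_j, L_k] = (j-k)L_{j+k}$), stores polynomials as dictionaries, and takes the cocycle to be a coboundary $x_k = L'_k y$ of an arbitrary sample element `y`. It only checks the algebraic identities, not anything specific to a conformal field theory.

```python
from fractions import Fraction


def add(p, q, scale=1):
    """Return p + scale*q for Laurent polynomials stored as {exponent: coeff}."""
    out = dict(p)
    for n, c in q.items():
        out[n] = out.get(n, 0) + scale * c
    return {n: c for n, c in out.items() if c != 0}


def witt(k, p):
    """Toy generator L_k = -z^(k+1) d/dz (an assumption made for illustration)."""
    return {n + k: -n * c for n, c in p.items() if n != 0}


def shifted(k, p):
    """L'_k as in the text: L'_k = L_k for k > 0 and L'_0 = L_0 - 1."""
    return add(witt(k, p), p, -1) if k == 0 else witt(k, p)


def cocycle_defect(x, j, k):
    """L'_k x_j - L'_j x_k - (k - j) x_(j+k); this vanishes for a 1-cocycle."""
    lhs = add(shifted(k, x[j]), shifted(j, x[k]), -1)
    return add(lhs, x[j + k], -(k - j))


# A coboundary x_k = L'_k y for a sample element y.
y = {-3: Fraction(2), -1: Fraction(5, 7), 2: Fraction(-1)}
x = {k: shifted(k, y) for k in range(9)}

defects = [cocycle_defect(x, j, k) for j in range(5) for k in range(5)]

# Equation (47): L_k x_0 = (L_0 + k - 1) x_k for k > 0.
eq47_holds = all(
    witt(k, x[0]) == add(witt(0, x[k]), x[k], k - 1) for k in range(1, 5)
)
```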

In the absence of ghosts, (47) means that for $k \ge 1$, $x_k$ is determined by $x_0$ with the exception of the weight 0 summand $(x_1)_0$ of $x_1$. Additionally, if we denote the weight $k$ summand of $y$ in general by $y_k$, then
$$c = dy \tag{48}$$
means
$$(x_0)_k = (k - 1)y_k, \tag{49}$$
$$(x_0)_1 = 0. \tag{50}$$
The rest of Eq. (48) then follows from (47), with the exception of the weight 0 summand of $x_1$. We must, then, have
$$(x_1)_0 \in \operatorname{Im} L_1. \tag{51}$$
Conditions (50), (51), for
$$x_k = -\sum_{i\ge 1} L_k^i u^{m-i},$$
are the conditions for solving (45), i.e. the actual obstruction. For $m = 1$, we get what we call the primary obstruction. Calculating the integral (5) over an annulus and passing to the appropriate limits (the infinitesimal annuli expressing the operators $L_n$), we obtain
$$L_k^1 = \tilde L_{-k}^1 = \sum_{m,i} u_{i,m+k}\,\tilde u_{i,m}, \tag{52}$$
so (50) becomes
$$\sum_i u_{i,0}\,\tilde u_{i,0}\,u = 0. \tag{53}$$
The condition (51) becomes
$$\sum_i u_{i,1}\,\tilde u_{i,0}\,u \in \operatorname{Im} L_1, \qquad \sum_i u_{i,0}\,\tilde u_{i,1}\,u \in \operatorname{Im} \tilde L_1. \tag{54}$$

This investigation is also interesting in the supersymmetric context. In the case of N = 1 worldsheet supersymmetry, we have additional operators $G_r^i$, and in the N = 2 SUSY case, we have operators $G_r^{+,i}$, $G_r^{-,i}$, $J_n^i$ (cf. [31, 49]), defined as the $\epsilon^i$-coefficient of the deformation of $G_r$, resp. $G_r^+$, $G_r^-$, $J_n$ analogously to Eq. (42). In the N = 1-supersymmetric case, the critical deforming fields have weight (1/2, 1/2) (as do a and c fields in the N = 2 case), so in both cases the first Eq. (42) remains the same as in the N = 0 case, the second becomes
$$(L_0 - 1/2)u^m = -\sum_{i\ge 1} L_0^i u^{m-i}. \tag{55}$$
Additionally, for N = 1, we get
$$G_r u^m = -\sum_{i\ge 1} G_r^i u^{m-i}, \qquad r \ge 1/2 \tag{56}$$
(similarly when ˜’s are present). In the N = 1-supersymmetric case, we therefore deal with the Lie algebra $A$, which is the free $\mathbb C$-vector space on $L_n$, $G_r$, $n \ge 0$, $r \ge 1/2$. For a cocycle which has value $x_k$ on $L_k$ and $z_r$ on $G_r$, Eq. (47) becomes
$$L_k x_0 = (L_0 + k - 1/2)x_k \quad \text{for } k > 0, \tag{57}$$

so in the absence of ghosts, xk is always determined by x0 . If the 1-cocycle (xk , zr ) is the coboundary of y

(58)

we additionally get $(x_0)_k = (k - 1/2)y_k$, so $(x_0)_{1/2} = 0$. On the other hand, on the $z$’s, we get
$$G_r x_0 = (L_0 + r - 1/2)z_r, \qquad r \ge 1/2, \tag{59}$$
so we see that in the absence of ghosts, all $z_r$’s are determined, with the exception of $(z_{1/2})_0$. Therefore our obstruction is
$$(x_0)_{1/2} = 0, \qquad (z_{1/2})_0 \in \operatorname{Im}(G_{1/2}). \tag{60}$$
For the primary obstruction, we have
$$L_k^1 = \tilde L_{-k}^1 = \sum_m (G_{-1/2}\tilde G_{-1/2}u)_{m+k,m}, \tag{61}$$
$$G_r^1 = 2\sum_m (\tilde G_{-1/2}u)_{m+r,m}, \qquad \tilde G_r^1 = 2\sum_m (G_{-1/2}u)_{m,m+r}, \tag{62}$$
so the obstruction becomes
$$\sum_m (G_{-1/2}\tilde G_{-1/2}u)_{m,m} = 0, \qquad \sum_m (\tilde G_{-1/2}u)_{m+1/2,m} \in \operatorname{Im}(G_{1/2}), \qquad \sum_m (G_{-1/2}u)_{m,m+1/2} \in \operatorname{Im}(\tilde G_{1/2}). \tag{63}$$

In the case of N = 2 supersymmetry, there is an additional complication, namely chirality. This means that in addition to the conditions
$$(L_0 - 1/2)u = 0, \qquad L_n u = G_r^\pm u = J_{n-1}u = 0 \quad \text{for } n \ge 1,\ r \ge 1/2, \tag{64}$$
we require that $u$ be chiral primary, which means
$$G^+_{-1/2}u = 0. \tag{65}$$
(There is also the possibility of antichiral primary, which has
$$G^-_{-1/2}u = 0 \tag{66}$$
instead, and similarly for the ˜’s.) Let us now write down the obstruction equations for the chiral primary case. We get the first Eqs. (45), (55), and an analogue of (56) with $G_r^i$ replaced by $G_r^{+,i}$ and $G_r^{-,i}$. Additionally, we have the equation
$$G^+_{-1/2}u^m = -\sum_{i\ge 1} G^{+,i}_{-1/2}u^{m-i}$$
and analogously for the ˜’s. In this situation, we consider the super-Lie algebra $A_2$ which is the free $\mathbb C$-vector space on $L_n$, $J_n$, $n \ge 0$, $G_r^-$, $r \ge 1/2$ and $G_s^+$, $s \ge -1/2$. One easily verifies that this is a super-Lie algebra on which the central extension vanishes canonically ([31], Sec. 3.1). Looking at a 1-cocycle whose value is $x_k$, $z_r^\pm$, $t_k$ on $L_k$, $G_r^\pm$, $J_k$ respectively, we get Eq. (57), and additionally
$$G_r^\pm x_0 = (L_0 + r - 1/2)z_r^\pm, \qquad r \ge 1/2 \text{ for } -,\quad r \ge -1/2 \text{ for } + \tag{67}$$
and
$$J_n x_0 = (n - 1/2)t_n, \qquad n \ge 0. \tag{68}$$
We see that the cocycle is determined by $x_0$, with the exception of $(z^\pm_{1/2})_0$, $(z^+_{-1/2})_1$. Therefore, we get the condition
$$(x_0)_{1/2} = 0, \qquad (z^\pm_{1/2})_0 \in \operatorname{Im}(G^\pm_{1/2}), \qquad (z^+_{-1/2})_1 = G^+_{-1/2}u \ \text{ where } \ G^+_{1/2}u = 0 \tag{69}$$
and similarly for the ˜’s.

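In the case of deformation along a cc field $u$, we have
$$L_k^1 = \tilde L_{-k}^1 = \sum_m (G^-_{-1/2}\tilde G^-_{-1/2}u)_{m+k,m}, \tag{70}$$
$$\begin{aligned}
G_r^{+,1} &= \sum_m 2(\tilde G^-_{-1/2}u)_{m+r+1/2,m},\\
\tilde G_r^{+,1} &= \sum_m 2(G^-_{-1/2}u)_{m,m+r+1/2},\\
G_r^{-,1} &= \tilde G_r^{-,1} = 0,\\
J_n^1 &= 0 = \tilde J_n^1,
\end{aligned} \tag{71}$$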

so the obstructions are, in a sense, analogous to (63) with $G_r$ replaced by $G_r^-$.

Remark. The relevant computation in verifying that (70), (71) (and the analogous cases before) form a cocycle uses formulas of the following type ([72]):
$$\operatorname{Res}_z(a(z)v(w)z^n) - \operatorname{Res}_z(v(w)a(z)z^n) = \operatorname{Res}_{z-w}((a(z-w)v)(w)z^n). \tag{72}$$
For example, when $v$ is primary of weight 1, $a = L_{-2}$, the right-hand side of (72) is
$$\begin{aligned}
\operatorname{Res}_{z-w}\bigl(L_0 v\,(z-w)^{-2}\,n(z-w)w^{n-1} + L_{-1}v\,(z-w)^{-1}w^n\bigr) &= n\,v(w)w^{n-1} + L_{-1}v(w)w^n\\
&= n\,v_k w^{n-k-2} + (-k-1)v_k w^{n-k-2}\\
&= (n-k-1)v_k w^{n-k-2}.
\end{aligned}$$
The left-hand side is $[L_{n-1}, v_{k-n+1}]w^{n-k-2}$, so we get $[L_{n-1}, v_{k-n+1}] = (n-k-1)v_k$, as needed. Other required identities follow in a similar way. Let us verify one interesting case when $a = G^-_{-3/2}$, $u$ chiral primary. Then the right-hand side of (72) is
$$\operatorname{Res}_{z-w}\bigl((G^-_{-1/2}v)(w)(z-w)^{-1}w^n\bigr) = (G^-_{-1/2}v)(w)\,w^n = (G^-_{-1/2}v)_k\,w^{n-k-1}.$$
This implies
$$[G_r^-, u_s] = (G^-_{-1/2}u)_{r+s}, \tag{73}$$

as needed. We have now analyzed the primary obstructions for exponentiation of inﬁnitesimal CFT deformations. However, in order for a perturbative exponentiation to exist, there are also higher obstructions which must vanish. The basic principle for obtaining these obstructions was formulated above. However, in practice, it may often happen that those obstructions will not converge. This may happen for two diﬀerent basic reasons. One possibility is that the deformation of the deforming ﬁeld

itself does not converge. This is essentially a violation of perturbativity, but may in some cases be resolved by regularizing the CFT anomaly along the deformation parameter. We will discuss this at the end of this section, and will give an example in Sec. 4 below. Even if all goes well with the parameter, however, there may be another problem, namely the expressions for $L_n^i$ etc. may not converge due to the fact that our deformation formulas concern vacua of actual worldsheets, while $L_n^i$ etc. correspond to degenerate worldsheets. Similarly, vertex operators may not converge in the deformed theories. We will show here how to deal with this problem. The main strategy is to rephrase the conditions from the above part of this section in terms of “finite annuli”. We start with the N = 0 (non-supersymmetric) case. Similarly as in (42), we can expand
$$U_{A_r}(m) = \sum_{h=0}^{m} U^h_{A_r}\,\epsilon^h. \tag{74}$$

In the non-supersymmetric case, the basic fact we have is the following:

Theorem 2. Assuming $u^k$ (considered as fields in the original undeformed CFT) have weight > (1, 1) for $k < h$, $r \in (0, 1)$ we have
$$U^h_{A_r} = \sum_{(m_k,\,k\le h)} u_{m_h,m_h}\cdots u_{m_1,m_1}U_{A_r} \cdot \int_{s_h=r}^{1} s_h^{2m_h-1}\int_{s_{h-1}=r}^{s_h} s_{h-1}^{2m_{h-1}-1}\cdots\int_{s_1=r}^{s_2} s_1^{2m_1-1}\,ds_1\cdots ds_h. \tag{75}$$
(Note that the integral on the right-hand side is over a simplex.)
$$u^h = \sum_{(m_k,\,k\le h)} \frac{1}{2^h(m_h+\cdots+m_1)(m_{h-1}+\cdots+m_1)\cdots m_1}\,u_{m_h,m_h}\cdots u_{m_1,m_1}u. \tag{76}$$
In particular, the obstruction is the vanishing of the sum (with the term $m_h+\cdots+m_1$ omitted from the denominator) of the terms in (76) with $m_h+\cdots+m_1 = 0$.

Proof. The identity (75) is essentially by definition. The key point is that in the higher deformed vacua, there are terms in the integrand obtained by inserting $u^k$, $k > 1$, to boundaries of disjoint disks $D_i$ cut out of $A_r$. Then there are corrective terms to be integrated on the worldsheets obtained by cutting out those disks. But the point is that under our weight assumption, all the disks $D_i$ can be shrunk to a single point, at which point the term disappears, and we are left with integrals of several copies of $u$ inserted at different points. If we are using vertex operators to express the integral, the operators must additionally be applied in time order (i.e. fields at points of lower modulus are inserted first). There is an $h!$ permutation factor which cancels with the Taylor denominator. This gives (75).

Now (76) is proved by induction. For $h = 1$, the calculation is done above ((52)). Assume now the induction hypothesis, and evaluate the integral in the standard way of taking primitive functions successively from the inside out. The primitive function of $m^s$ is taken to be $m^{s+1}/(s+1)$ (by the induction hypothesis, and the assumptions that lower order obstructions vanish, the case $s = -1$ never occurs). Then the contributing term of the integral where the $k-1$ innermost integrals have the upper bound and the $k$th innermost integral has the lower bound is equal to
$$U^{h-k}_{A_r}u^k, \qquad h > k \ge 1$$
(and this term occurs with a minus sign because of the lower bound involved). The summand which has all upper bounds except in the last integral is equal to
$$\frac{1 - r^{2(m_1+\cdots+m_h)}}{2^h(m_h+\cdots+m_1)(m_{h-1}+\cdots+m_1)\cdots m_1}\,u_{m_h,m_h}\cdots u_{m_1,m_1}u\,r^2, \tag{77}$$
which is supposed to be equal to $-U_{A_r}u^h + r^2u^h$. This gives the desired solution.

Remark. The formula (77) of course does not apply to the case $m_1+\cdots+m_h = 0$. In that case, the correct formula is
$$\frac{-\ln(r)}{(m_{h-1}+\cdots+m_1)\cdots m_1}\,u_{m_h,m_h}\cdots u_{m_1,m_1}u\,r^2. \tag{78}$$
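The scalar simplex integral appearing in (75), and the way a logarithm replaces the power of $r$ when the total exponent vanishes, can be checked by exact arithmetic. The sketch below is illustrative only: the function names are hypothetical, the exponents $m_k$ are restricted to positive integers so that no logarithm appears in the iterated integration, and $r$ is a rational number.

```python
from fractions import Fraction
import math


def integrate_step(poly, m, r):
    """Multiply poly(s) by s^(2m-1) and integrate from r to s.

    poly is a dict {exponent: coefficient} in the variable s.
    """
    primitive = {}
    for e, c in poly.items():
        new_e = e + 2 * m  # exponent after taking the primitive
        if new_e == 0:
            raise ValueError("logarithmic case: exponents sum to zero")
        primitive[new_e] = primitive.get(new_e, 0) + c / new_e
    value_at_r = sum(c * r ** e for e, c in primitive.items())
    primitive[0] = -value_at_r  # subtract the value at the lower bound
    return primitive


def simplex_integral(ms, r):
    """Iterated integral of (75) for ms = [m_1, ..., m_h], m_1 innermost."""
    poly = {0: Fraction(1)}
    for m in ms:
        poly = integrate_step(poly, m, r)
    return sum(poly.values())  # evaluate the outermost primitive at s = 1


def closed_form(ms):
    """1 / (2^h (m_h + ... + m_1) ... m_1): the r = 0 value, as in (76)."""
    denom = 2 ** len(ms)
    partial = 0
    for m in ms:
        partial += m
        denom *= partial
    return Fraction(1, denom)


def power_factor(total, r):
    """(1 - r^(2*total)) / (2*total), computed stably for small total."""
    return -math.expm1(2 * total * math.log(r)) / (2 * total)


at_zero = simplex_integral([2, 1, 3], Fraction(0))
at_half = simplex_integral([1, 1], Fraction(1, 2))
log_limit = power_factor(1e-9, 0.5)
```

With $r = 0$ only the "all upper bounds" term survives and reproduces the coefficient in (76); as the total exponent tends to zero, the factor $(1 - r^{2M})/(2M)$ tends to $-\ln r$, which is the source of the logarithm in (78).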

So the question becomes whether there could exist a field $u^h$ such that $U_{A_r}u^h - r^2u^h$ is equal to the quantity (78). One sees immediately that such a field does not exist in the product-completed space of the original theory. What this approach does not settle, however, is whether it may be possible to add such non-perturbative fields to the theory and preserve CFT axioms, which could facilitate existence of deformations in some generalized sense, despite the algebraic obstruction. It would have to be, however, a field of generalized weight in the sense of [39–42]. In effect, written in infinitesimal terms, the expression (78) becomes
$$L_0u^h - u^h = -\frac{1}{(m_{h-1}+\cdots+m_1)\cdots m_1}\,u_{m_h,m_h}\cdots u_{m_1,m_1}u.$$
The right-hand side $wu$ is a field of holomorphic weight 1, so we see that we have a matrix relation
$$L_0\begin{pmatrix} u^h \\ u \end{pmatrix} = \begin{pmatrix} 1 & w \\ 0 & 1 \end{pmatrix}\begin{pmatrix} u^h \\ u \end{pmatrix}.$$
This is an example of what one means by a field of generalized weight. One should note, however, that fields of generalized weight are excluded in unitary conformal field theories. By Wick rotation, the unitarity axiom of a conformal field theory becomes the axiom of reflection positivity [59]: the operator $U_\Sigma$ associated with a worldsheet Σ is defined up to a 1-dimensional complex line $L_\Sigma$ (which is often more strongly assumed to have a positive real structure). If we denote by $\bar\Sigma$ the complex-conjugate worldsheet (note that this reverses orientation of boundary components), then reflection positivity requires that we have an isomorphism $L_{\bar\Sigma} \cong L_\Sigma^*$ (the dual line), and using this isomorphism, an identification $U_{\bar\Sigma} = U_\Sigma^*$ (here the asterisk denotes the adjoint operator). Specializing to annuli $A_r$, $r \le 1$, we see that the annulus for $r$ real is self-conjugate, so the corresponding operators are selfadjoint, and hence diagonalizable. On the other hand, for $r = 1$, we obtain unitary operators, and unitary representations of $S^1$ on Hilbert space split into eigenspaces of integral weights. The central extension given by $L$ is then trivial and hence the operators corresponding to all $A_r$ commute, and hence are simultaneously diagonalizable, thus excluding the possibility of generalized weight. The possibility, of course, remains that the correlation function of the deformed theory can be modified by a non-perturbative correction. Let us note that if left uncorrected, the term (78) can be interpreted infinitesimally as
$$L_0u(\epsilon) - u(\epsilon) = C\epsilon^m v \mod \epsilon^{m+1}, \tag{79}$$

where $v$ is another field of weight 1. Note that in case that $u = v$, (79) can be interpreted as saying that $u$ changes weight at order $m$ of the perturbation parameter. In the general case, we obtain a matrix involving all the (holomorphic) weight 1 fields in the unperturbed theory. Excluding fields of generalized weight in the unperturbed theory (which would translate to fields of generalized weight in the perturbed theory), the matrix must have other eigenvalues than 1, thus showing that some critical fields will change weight.

In the N = 1-supersymmetric case, an analogous statement holds, except the assumption is that the weight of $u^k$ is greater than (1/2, 1/2) for $k < h$, and the integral (75) must be replaced by
$$U^h_{A_r} = \sum_{m_k} (G_{-1/2}\tilde G_{-1/2}u)_{m_h,m_h}\cdots(G_{-1/2}\tilde G_{-1/2}u)_{m_1,m_1}U_{A_r} \cdot \int_{s_h=r}^{1} s_h^{m_h-1}\int_{s_{h-1}=r}^{s_h} s_{h-1}^{m_{h-1}-1}\cdots\int_{s_1=r}^{s_2} s_1^{m_1-1}\,ds_1\cdots ds_h, \tag{80}$$
and accordingly
$$u^h = \sum_{m_k} \frac{1}{2^h(m_h+\cdots+m_1)(m_{h-1}+\cdots+m_1)\cdots m_1} \cdot (G_{-1/2}\tilde G_{-1/2}u)_{m_h,m_h}\cdots(G_{-1/2}\tilde G_{-1/2}u)_{m_1,m_1}u, \tag{81}$$

so the obstruction again states that the term with mh + · · · + m1 = 0 must vanish. In the N = 2 case, when u is a cc ﬁeld, we simply replace G by G− in (80) and (81). But in the supersymmetric case, to preserve supersymmetry along the deformation, we must also investigate the “ﬁnite” analogs of the obstructions associated

with $G_{1/2}$ in the N = 1 case, and $G^\pm_{1/2}$, $G^+_{-1/2}$ in the N = 2 c case (and similarly for the a case, and the ˜’s). In fact, to tell the whole story, we should seriously investigate integration of the deforming fields over super-Riemann surfaces (= super-worldsheets). This can be done; one approach is to treat the case of the superdisk first, using Stokes theorem twice with the differentials $\partial$, $\bar\partial$ replaced by $D$, $\bar D$ respectively in the N = 1 case (and the same at one chirality for the N = 2 case). A general super-Riemann surface is then partitioned into superdisks.

For the purpose of obstruction theory, the following special case is sufficient. We treat the N = 2 case, since it is of main interest for us. Let us consider the case of cc fields (the other cases are analogous). First we note (see (71)) that $G^-$ is unaffected by deformation via a cc field, so the obstructions derived from $G^-_{-1/2}$ and $G^-_{1/2}$ are trivial (and similarly at the ˜’s).

To understand the obstruction associated with $G^+_{1/2}$, we will study “finite” (as opposed to infinitesimal) annuli obtained by exponentiating $G^+_{1/2}$. Now the element $G^+_{1/2}$ is odd. Thinking of the super-semigroup of superannuli as a supermanifold, it makes no sense to speak of “odd points” of the supermanifold. It makes sense, however, to speak of a family of odd elements parametrized by an odd parameter $s$: this is simply the same thing as a map from the (0|1)-dimensional superaffine line into the supermanifold. In this sense, we can speak of the “finite” odd annulus
$$\exp(sG^+_{1/2}). \tag{82}$$

Now we wish to study the deformations of the operator associated with (82) along a cc field $u$ as a perturbative expansion in $\epsilon$. Thinking of $G^+_{1/2}$ as an N = 2-supervector field, we have
$$G^+_{1/2} = (z + \theta^+\theta^-)\frac{\partial}{\partial\theta^+} - z\theta^-\frac{\partial}{\partial z}. \tag{83}$$
We see that (83) deforms infinitesimally only the variables $\theta^+$ and $z$, not $\theta^-$. Thus, more specifically, (82) results in the transformation
$$z \mapsto \exp(s\theta^-)z, \qquad \theta^- \mapsto \theta^-. \tag{84}$$
This gives rise to the formula, valid when $u^k$ have weight > (1/2, 1/2) for $1 \le k < h$,
$$U^h_{\exp(sG^+_{1/2})} = \sum_{m_k} \int_{t_h=\exp(s\theta^-)}^{1} t_h^{m_h-1}\int_{t_{h-1}=\exp(s\theta^-)}^{t_h} t_{h-1}^{m_{h-1}-1}\cdots\int_{t_1=\exp(s\theta^-)}^{t_2} t_1^{m_1-1}\,dt_1\cdots dt_h \cdot v_{m_h,m_h}\cdots v_{m_1,m_1}U_{\exp(sG^+_{1/2})}, \tag{85}$$
where $v_{m_k,m_k}$ is equal to
$$(\tilde G^-_{-1/2}u)_{m_k+1/2,m_k} \tag{86}$$
in summands of (85) where the factor resulting from integrating the $t_k$ variable has a $\theta^-$ factor, and
$$(G^-_{-1/2}\tilde G^-_{-1/2}u)_{m_k,m_k} \tag{87}$$
in other summands. (We see that each summand can be considered as a product of factors resulting from integrating the individual variables $t_k$; in at most one factor, (86) can occur, otherwise the product vanishes.) Realizing that $\exp(ms\theta^-) = 1 + ms\theta^-$, this gives that the obstruction (under the weight assumption for $u^k$) is that the summand for $m_1+\cdots+m_h = 0$ (with the denominator $m_1+\cdots+m_h$ omitted) in the following expression vanish:
$$\sum_{m_k}\sum_{k=1}^{h} \frac{1}{m_1+\cdots+m_h}\cdots\frac{1}{m_1}\,(G^-_{-1/2}\tilde G^-_{-1/2}u)_{m_h,m_h}\cdots m_k(\tilde G^-_{-1/2}u)_{m_k+1/2,m_k}\cdots(G^-_{-1/2}\tilde G^-_{-1/2}u)_{m_1,m_1}u. \tag{88}$$
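The identity $\exp(ms\theta^-) = 1 + ms\theta^-$ used above rests only on the fact that the even product $s\theta^-$ of two odd quantities squares to zero. A small sketch with dual numbers makes this concrete; the class `Dual`, in which the nilpotent element $n$ stands in for $s\theta^-$, and the helper names are assumptions made for illustration, not notation from the text.

```python
from fractions import Fraction


class Dual:
    """a + b*n with n*n = 0; n stands in for the nilpotent product s*theta^-."""

    def __init__(self, a, b=0):
        self.a = Fraction(a)
        self.b = Fraction(b)

    def __add__(self, other):
        return Dual(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b n)(c + d n) = ac + (ad + bc) n, since n*n = 0
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

    def scale(self, c):
        return Dual(self.a * c, self.b * c)

    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)


def exp_series(x, terms=8):
    """Truncated power series for exp(x); exact here once x*x = 0."""
    total = Dual(0)
    term = Dual(1)  # holds x^k / k!
    for k in range(terms):
        total = total + term
        term = (term * x).scale(Fraction(1, k + 1))
    return total


def lower_bound_integral(m):
    """Integral of t^(m-1) dt from t = exp(n) to t = 1, for m != 0."""
    upper = Dual(1)
    lower = exp_series(Dual(0, m))  # exp(n)^m = exp(m n)
    return (upper + lower.scale(-1)).scale(Fraction(1, m))


exp_3n = exp_series(Dual(0, 3))
single_integral = lower_bound_integral(3)
```

In this toy model a single integral with lower bound $\exp(s\theta^-)$ is proportional to the nilpotent element, which is consistent with the remark that a $\theta^-$ factor can occur in at most one factor of a nonvanishing product.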

To investigate the higher obstructions further, we need the language of correlation functions. Specifically, the CFT’s whose deformations we will consider are “RCFT’s”. The simplest way of building an RCFT is from “chiral sectors” $H_\lambda$ where λ runs through a set of labels, by the recipe
$$H = \bigoplus_\lambda H_\lambda \otimes H_{\lambda^*}$$
where $\lambda^*$ denotes the contragredient label (cf. [38]). (In the case of the Gepner model, we will need a slightly more general scenario, but our methods still apply to that case analogously.) Further, we will have a symmetric bilinear form
$$B : H_\lambda \otimes H_{\lambda^*} \to \mathbb C$$
with respect to which the adjoint to $Y(v, z)$ is $(-z^{-2})^n\,Y(e^{zL_1}v, 1/z)$ where $v$ is of weight $n$. There is also a real structure
$$H_\lambda \cong \bar H_{\lambda^*},$$
thus specifying a real structure on $H$, $\overline{u \otimes v} = \bar u \otimes \bar v$, and inner product
$$\langle u_1 \otimes v_1, u_2 \otimes v_2\rangle = B(u_1, \bar u_2)B(v_1, \bar v_2).$$
We also have an inner product $H_\lambda \otimes_{\mathbb R} H_{\lambda^*} \to \mathbb C$ given by $\langle u, v\rangle = B(u, \bar v)$. Then we have the $\mathbb P^1$-chiral correlation function
$$\langle u(z_\infty)^* \mid v_m(z_m)v_{m-1}(z_{m-1})\cdots v_1(z_1)v_0(z_0)\rangle \tag{89}$$

which can be defined by taking the vacuum operator associated with the degenerate worldsheet Σ obtained by “cutting out” unit disks with centers $z_0, \ldots, z_m$ from the unit disk with center $z_\infty$, applying this operator to $v_0 \otimes \cdots \otimes v_m$, and taking inner product with $u$. Thus, the correlation function (89) is in fact the same thing as applying the field on either side of (89) to the identity, and taking the inner product. This object (89) is however not simply a function of $z_0, \ldots, z_\infty$. Instead, there is a finite-dimensional vector space $M_\Sigma$ depending holomorphically on Σ (called the modular functor) such that (89) is a linear function $M_\Sigma \to \mathbb C$. However, now one assumes that $M$ is a “unitary modular functor” in the sense of Segal [59]. This means that $M_\Sigma$ has the structure of a positive-definite inner product space for not just the Σ as above, but an arbitrary worldsheet. The inner product is not valued in $\mathbb C$, but in $\det(\Sigma)^{2c}$ where $c$ is the central charge. Since the determinant of Σ as above is the same as $\det(\mathbb P^1)$ (hence in particular constant), we can make the inner product $\mathbb C$-valued in our case. If the deforming field is of the form
$$u \otimes \tilde u, \tag{90}$$
the “higher $L_0$ obstruction” (under the weight assumptions given above) can be further written as
$$\int_{0\le\|z_1\|\le\cdots\le\|z_m\|\le 1} \langle v(0)^* \mid u(z_m)\cdots u(z_1)u(0)\rangle \cdot \langle\tilde v^* \mid \tilde u(\bar z_m)\cdots\tilde u(\bar z_1)\tilde u(0)\rangle\,dz_1\,d\bar z_1\cdots dz_m\,d\bar z_m \quad \text{for } w(v) \le 1 \tag{91}$$
($w$ is weight) in the N = 0 case and
$$\int_{0\le\|z_1\|\le\cdots\le\|z_m\|\le 1} \langle v(0)^* \mid (G^-_{-1/2}u)(z_m)\cdots(G^-_{-1/2}u)(z_1)u(0)\rangle \cdot \langle\tilde v^* \mid (\tilde G^-_{-1/2}\tilde u)(\bar z_m)\cdots(\tilde G^-_{-1/2}\tilde u)(\bar z_1)\tilde u(0)\rangle\,dz_1\,d\bar z_1\cdots dz_m\,d\bar z_m \quad \text{for } w(v) \le 1/2 \tag{92}$$
in the N = 2 cc case. The $G^+_{1/2}$-obstruction in the N = 2 case can be written as
$$\int_{0\le\|z_1\|\le\cdots\le\|z_m\|\le 1} \sum_{k=1}^{m} \langle v(0)^* \mid (G^-_{-1/2}u)(z_m)\cdots u(z_k)\cdots(G^-_{-1/2}u)(z_1)u(0)\rangle \cdot \langle\tilde v^* \mid (\tilde G^-_{-1/2}\tilde u)(\bar z_m)\cdots(\tilde G^-_{-1/2}\tilde u)(\bar z_1)\tilde u(0)\rangle\,dz_1\,d\bar z_1\cdots dz_m\,d\bar z_m \quad \text{for } w(v) \le 0,\ w(\tilde v) \le 1/2 \tag{93}$$
and similarly for the ˜. We see that these obstructions vanish when we have
$$\langle v(z_\infty)^* \mid u(z_m)\cdots u(z_0)\rangle = 0 \quad \text{for } w(v) \le 1 \tag{94}$$
in the N = 0 case (and similarly for the ˜’s), and
$$\langle v(z_\infty)^* \mid G^-_{-1/2}u(z_m)\cdots G^-_{-1/2}u(z_1)u(z_0)\rangle = 0 \quad \text{for } w(v) \le 1/2, \tag{95}$$

and similarly for the ˜’s. Observe further that when $\tilde u = \bar u$, the condition for the ˜’s is equivalent to the condition for $u$, and further (94), (95) are also necessary in this case, as in (91), (92) we may also choose $\tilde v = \bar v$, which makes the integrand non-negative (and only 0 if it is 0 at each chirality).

In the N = 2 case, it turns out that the condition (95) simplifies further:

Theorem 3. Let $u$ be a chiral primary field of weight 1/2. Then the necessary and sufficient condition (95) for existence of perturbative CFT deformations along the field $u \otimes \bar u$ is equivalent to the same vanishing condition applied to only chiral primary fields $v$ of weight 1/2.

Proof. In order for the fields (95) to correlate, they would have to have the same $J$-charge $Q_J$. We have
$$Q_Ju = 1, \qquad Q_J(G^-_{-1/2}u) = 0.$$
Hence $Q_J$ of the right-hand side of (95) is 1. Thus, for the function (95) to be possibly non-zero, we must have
$$Q_Jv = 1. \tag{96}$$
But then we have
$$w(v) \ge \frac12 Q_Jv = \frac12 \tag{97}$$
with equality arising if and only if $v$ is chiral primary of weight 1/2 (see [31, Sec. 3.3]).

Remark 1. We see therefore that in the N = 2 SUSY case, there is in fact no need to assume that the weight of $u^k$ is > (1/2, 1/2) for $k < h$. If the obstruction vanishes for $k < h$, then we have
$$u^k = \frac{1}{k!}\int_D (G^-_{-1/2}\tilde G^-_{-1/2}u)(z_k)\cdots(G^-_{-1/2}\tilde G^-_{-1/2}u)(z_1)u\,dz_1\cdots dz_k\,d\bar z_1\cdots d\bar z_k \tag{98}$$
where the integrand is understood as a $(k+1)$-point function (and not its power series expansion in any particular range), over the unit disk. Additionally, for any worldsheet Σ,
$$U^h_\Sigma = \frac{1}{h!}\int_\Sigma (G^-_{-1/2}\tilde G^-_{-1/2}u)(z_h)\cdots(G^-_{-1/2}\tilde G^-_{-1/2}u)(z_1)\,dz_1\cdots dz_h\,d\bar z_1\cdots d\bar z_h \tag{99}$$

(it is to be understood that in both (98), (99), the fields are inserted into holomorphic images of disks where the origin maps to the point of insertion with derivative of modulus 1 with respect to the measure of integration). When the obstruction occurs at step $k$, the integral (98) has a divergence of logarithmic type. In the N = 0 case, there is a third possibility, namely that the obstruction vanishes, but the field $u^h$ in Theorem 2 has summands of weight < (1, 1) (< (1/2, 1/2) for N = 1). In this case, the integral (98) will have a divergence of power type, and the integral of terms of weight < (1, 1) (respectively, < (1/2, 1/2)) has to be taken in range from ∞ to 1 rather than from 0 to 1 to get a convergent integral. The formula (99) is not correct in that case.

Remark 2. In [19], a different correlation function is considered as a measure of marginality of $u$ to higher perturbative order. The situation there is actually more general, allowing combinations of both chiral and antichiral primaries. In the present setting of chiral primaries only, the correlation function considered in [19] amounts to
$$\langle 1 \mid (G^-_{-1/2}u)(z_n)\cdots(G^-_{-1/2}u)(z_1)\rangle. \tag{100}$$

It is easy to see using the standard contour deformation argument that (100) indeed vanishes, which is also observed in [18]. In [19], this type of vanishing is taken as evidence that the N = 2 CFT deformations exist. It appears, however, that even though the vanishing of (100) follows from the vanishing of (95), the opposite implication does not hold. (In fact, we will see examples in Sec. 6 below.) The explanation seems to be that [19] writes down an integral expressing the change of central charge when deforming by a combination of cc ﬁelds and ac ﬁelds, and proves its vanishing. While this is correct formally, we see from Remark 1 above that in fact a singularity can occur in the integral when our obstruction is non-zero: the integral can marginally diverge for k points while it is convergent for < k points. It would be nice if the obstruction theory a la Gerstenhaber we described here settled in general the question of deformations of conformal ﬁeld theory, at least in the vertex operator formulation. It is, however, not that simple. The trouble is that we are not in a purely algebraic situation. Rather, compositions of operators which are inﬁnite series may not converge, and even if they do, the convergence cannot be understood in the sense of being eventually constant, but in the sense of analysis, i.e. convergence of sequences of real numbers. Speciﬁcally, in our situation, there is the possibility of divergence of the terms on the right-hand side of (45). Above we dealt with one problem, that in general, we do not expect inﬁnitesimal deformations to converge on the degenerate worldsheets of vertex operators, so we may have to replace (45) by equations involving ﬁnite annuli instead. However, that is not the only problem. We may encounter

regularization along the flow parameter. This stems from the fact that Eqs. (43), (44) only determine $u(\epsilon)$ up to scalar multiple, where the scalar may be of the form
$$1 + \sum_{i\ge 1} K_i\epsilon^i = f(\epsilon). \tag{101}$$
But the point is (as we shall see in an example in the next section) that we may only be able to get a well defined value of
$$f^{-1}(\epsilon)u(\epsilon) = v(\epsilon) \tag{102}$$
when the constants $K_i$ are infinite. The obstruction then is
$$L_n(m)f(m)v(m) = 0 \in V[\epsilon]/\epsilon^{m+1} \quad \text{for } n > 0, \qquad (1 - L_0(m))f(m)v(m) = 0 \in V[\epsilon]/\epsilon^{m+1}. \tag{103}$$

At first, it may seem that it is difficult to make this rigorous mathematically with the infinite constants present. However, we may use the following trick. Suppose we want to solve

$$\begin{aligned} c_1 a_{11} + \cdots + c_n a_{1n} &= b_1 \\ &\ \,\vdots \\ c_1 a_{m1} + \cdots + c_n a_{mn} &= b_m \end{aligned} \tag{104}$$

in a, say, finite-dimensional vector space $V$. Then we may rewrite (104) as

$$(b_1, \dots, b_m) = 0 \in \bigoplus_m V \Big/ \big\langle (a_{11}, \dots, a_{m1}), \dots, (a_{1n}, \dots, a_{mn}) \big\rangle. \tag{105}$$

This of course does not give anything new in the algebraic situation, i.e. when the $a_{ij}$'s are simply elements of the vector space $V$. When, however, the vectors $(a_{11}, \dots, a_{m1}), \dots, (a_{1n}, \dots, a_{mn})$ are (possibly divergent) infinite sums

$$(a_{1j}, \dots, a_{mj}) = \sum_k (a_{1jk}, \dots, a_{mjk}),$$

then the right-hand side of (105) can be interpreted as

$$\bigoplus_m V \Big/ \big\langle (a_{11k}, \dots, a_{m1k}), \dots, (a_{1nk}, \dots, a_{mnk}) \big\rangle.$$

In that sense, (105) always makes sense, while (104) may not when interpreted directly. We interpret (103) in this way. Let us now turn to the question of sufficient conditions for exponentiation of infinitesimal deformations. Suppose there exists a subspace $W \subset V$ closed under vertex operators which contains $u$ and such that for all elements $v \in W$, we have that

$$\sum_i Y_i(u, z)\tilde Y_i(u, \bar z)v$$


involve only $z^n \bar z^m$ with $m, n \in \mathbb{Z}$, $m, n \neq -1$. Then, by Theorem 1,

$$1 - \epsilon\phi : W \to \hat W[\epsilon]/\epsilon^2$$

is an infinitesimal isomorphism between $W$ and the infinitesimally $u$-deformed $W$. It follows, in the non-regularized case, that then

$$\exp(-\epsilon\phi)u \tag{106}$$

is a globally deformed primary field of weight (1, 1), and

$$\exp(-\epsilon\phi) : W \to \hat W[[\epsilon]] \tag{107}$$

is an isomorphism between $W$ and the exponentiated deformation of $W$. However, since we now know the primary fields along the deformation, vacua can be recovered from Eq. (8) of the last section. Such non-regularized exponentiation occurs in the case of the coset construction. Set

$$W = \Big\langle v \ \Big|\ \sum_i Y_i(u, z)\tilde Y_i(u, \bar z)v \text{ involve only } z^n \bar z^m \text{ with } m, n \geq 0,\ m, n \in \mathbb{Z} \Big\rangle.$$

Then $W$ is called the coset of $V$ by $u$. Then $W$ is closed under vertex operators, and if $u \in W$, the formulas (106), (107) apply without regularization. The case with regularization occurs when there exists some constant

$$K(\epsilon) = 1 + \sum_{n\geq 1} K_n \epsilon^n$$

where $K_n$ are possibly infinite constants such that

$$K(\epsilon)\exp(-\epsilon\phi)u \tag{108}$$

is finite in the sense described above (see (105)). We will see an example of this in the next section. All these constructions are easily adapted to supersymmetry. The formulas (106), (107) hold without change, but the deformation is with respect to $G_{-1/2}\tilde G_{-1/2}u$, respectively $G^-_{-1/2}\tilde G^-_{-1/2}u$, $G^+_{-1/2}\tilde G^-_{-1/2}u$, depending on the situation applicable.

4. The Deformations of Free Field Theories

As our first application, let us consider the 1-dimensional bosonic free field conformal field theory, where the deformation field is

$$u = x_{-1}\tilde x_{-1}. \tag{109}$$

In this case, the infinitesimal isomorphism of Theorem 1 satisfies

$$\phi = \pi \sum_{n\in\mathbb{Z}} \frac{x_{-n}\tilde x_{-n}}{n} \tag{110}$$

and the sufficient condition of exponentiability from the last section is met when we take $W$ the subspace consisting of states of momentum 0. Then $W$ is closed


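under vertex operators, $u \in W$ and the $n = 0$ term of (110) drops out in this case. However, this is an example where regularization is needed. It can be realized as follows: Write

$$\phi = \sum_{n>0} \phi_n \quad \text{where} \quad \phi_n = \pi\left(\frac{x_{-n}\tilde x_{-n}}{n} - \frac{x_n \tilde x_n}{n}\right).$$

We have

$$\exp \epsilon\phi = \prod_{n>0} \exp \epsilon\phi_n. \tag{111}$$

To calculate $\exp \epsilon\phi_n$ explicitly, we observe that

$$\left[\frac{x_{-n}\tilde x_{-n}}{n}, \frac{x_n \tilde x_n}{n}\right] = -\frac{x_{-n}x_n}{n} - \frac{\tilde x_{-n}\tilde x_n}{n} - 1,$$

and setting

$$e = \frac{x_{-n}\tilde x_{-n}}{n}, \quad f = \frac{x_n \tilde x_n}{n}, \quad h = -\frac{x_{-n}x_n}{n} - \frac{\tilde x_{-n}\tilde x_n}{n} - 1, \tag{112}$$

we obtain the $sl_2$ Lie algebra

$$[e, f] = h, \quad [e, h] = 2e, \quad [f, h] = -2f. \tag{113}$$

Note that conventions regarding the normalization of $e, f, h$ vary, but the relations (113) are satisfied for example for

$$e = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad f = \begin{pmatrix} 0 & 0 \\ -1 & 0 \end{pmatrix}, \quad h = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{114}$$

In $SL_2$, we compute

$$\begin{aligned} \exp(\pi\epsilon(-f + e)) &= \exp\left(\pi\epsilon \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\right) = \begin{pmatrix} \cosh \pi\epsilon & \sinh \pi\epsilon \\ \sinh \pi\epsilon & \cosh \pi\epsilon \end{pmatrix} \\ &= \begin{pmatrix} 1 & \tanh \pi\epsilon \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \dfrac{1}{\cosh \pi\epsilon} & 0 \\ 0 & \cosh \pi\epsilon \end{pmatrix} \begin{pmatrix} 1 & 0 \\ \tanh \pi\epsilon & 1 \end{pmatrix}. \end{aligned} \tag{115}$$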
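The relations (113) for the matrices (114) and the factorization (115) are easy to check numerically. The following is only a sketch in plain Python: the helper names (`matmul`, `comm`, `mat_exp`, `gauss_product`) are hypothetical, the exponential is a truncated power series, and the parameter `t` plays the role of the argument of cosh, sinh and tanh in (115).

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    """Scalar multiple of a 2x2 matrix."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def mat_exp(a, terms=60):
    """Truncated power series for exp(a); adequate for small 2x2 inputs."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

def close(a, b, tol=1e-9):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))

# The matrices of (114).
E = [[0, 1], [0, 0]]
F = [[0, 0], [-1, 0]]
H = [[-1, 0], [0, 1]]

def direct_exponential(t):
    """exp(t (e - f)) computed from the power series."""
    e_minus_f = [[E[i][j] - F[i][j] for j in range(2)] for i in range(2)]
    return mat_exp(scale(t, e_minus_f))

def gauss_product(t):
    """Upper-triangular times diagonal times lower-triangular, as in (115)."""
    th, ch = math.tanh(t), math.cosh(t)
    upper = [[1.0, th], [0.0, 1.0]]
    diag = [[1.0 / ch, 0.0], [0.0, ch]]
    lower = [[1.0, 0.0], [th, 1.0]]
    return matmul(matmul(upper, diag), lower)
```

For any real `t`, `direct_exponential(t)` and `gauss_product(t)` should agree up to rounding, which is the content of (115).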


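In the translation (112), this is

$$\exp\left(\tanh(\pi\epsilon)\frac{x_{-n}\tilde x_{-n}}{n}\right) \exp\left((-\ln\cosh \pi\epsilon)\left(\frac{x_{-n}x_n}{n} + \frac{\tilde x_{-n}\tilde x_n}{n} + 1\right)\right) \cdot \exp\left(-\tanh(\pi\epsilon)\frac{x_n \tilde x_n}{n}\right). \tag{116}$$

To exponentiate the middle term, we claim

$$\exp\left(z\,\frac{x_{-n}x_n}{n}\right) = {:}\exp\left((e^z - 1)\frac{x_{-n}x_n}{n}\right){:} \tag{117}$$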
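The claim (117) can be sanity-checked on occupation-number states. The sketch below is not part of the original argument and rests on one standard assumption: on the state obtained by applying $x_{-n}$ to the vacuum $m$ times, the operator $x_{-n}x_n/n$ has eigenvalue $m$, and its normal ordered $k$-th power has eigenvalue $m(m-1)\cdots(m-k+1)$. The function names are hypothetical.

```python
import math

def falling_factorial(m, k):
    """m (m-1) ... (m-k+1), the eigenvalue of the normal ordered k-th power."""
    out = 1
    for j in range(k):
        out *= (m - j)
    return out

def normal_ordered_exp_eigenvalue(c, m):
    """Eigenvalue of :exp(c N): on the state with N = m.

    The series terminates because the falling factorial vanishes for k > m.
    """
    return sum(c ** k / math.factorial(k) * falling_factorial(m, k)
               for k in range(m + 1))

def plain_exp_eigenvalue(z, m):
    """Eigenvalue of exp(z N) on the state with N = m."""
    return math.exp(z * m)
```

With `c = exp(z) - 1` the finite sum is the binomial expansion of `(1 + c) ** m`, so both functions return `exp(z * m)`; the choice `z = -ln cosh(t)` is the one needed for the middle factor of (116).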

To prove (117), differentiate both sides by $z$. On the left-hand side, we get

$$\frac{x_{-n}x_n}{n} \exp\left(z\,\frac{x_{-n}x_n}{n}\right).$$

Thus, if the derivative by $z$ of the right-hand side $y$ of (117) is

$$\frac{x_{-n}x_n}{n}\, {:}\exp\left((e^z - 1)\frac{x_{-n}x_n}{n}\right){:}, \tag{118}$$

then we have the differential equation $y' = \frac{x_{-n}x_n}{n}\,y$, which proves (117) (looking also at the initial condition at $z = 0$). Now we can calculate (118) by moving the $x_n$ occurring before the normal order symbol to the right. If we do this simply by changing (118) to normal order, we get

$${:}\frac{x_{-n}x_n}{n} \exp\left((e^z - 1)\frac{x_{-n}x_n}{n}\right){:}, \tag{119}$$

but if we want equality with (118), we must add the terms coming from the commutator relations $[x_n, x_{-n}] = n$, which gives the additional term

$$(e^z - 1)\, {:}\frac{x_{-n}x_n}{n} \exp\left((e^z - 1)\frac{x_{-n}x_n}{n}\right){:}. \tag{120}$$

Adding together (119) and (120) gives

$$e^z\, {:}\frac{x_{-n}x_n}{n} \exp\left((e^z - 1)\frac{x_{-n}x_n}{n}\right){:}, \tag{121}$$

which is the derivative by $z$ of the right-hand side of (117), as claimed. Using (117), (116) becomes

$$\Phi_n = \frac{1}{\cosh \pi\epsilon} \exp\left(\tanh(\pi\epsilon)\frac{x_{-n}\tilde x_{-n}}{n}\right) {:}\exp\left(\left(\frac{1}{\cosh \pi\epsilon} - 1\right)\left(\frac{x_{-n}x_n}{n} + \frac{\tilde x_{-n}\tilde x_n}{n}\right)\right){:}\, \exp\left(-\tanh(\pi\epsilon)\frac{x_n \tilde x_n}{n}\right) \tag{122}$$


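which is in normal order. Let us write

$$\Phi_n = \frac{1}{\cosh \pi\epsilon}\,\Phi'_n.$$

Then the product

$$\Phi' = \prod_{n\geq 1} \Phi'_n \tag{123}$$

is in normal order, and is the regularized isomorphism from the exponentiated deformation $W'$ of the conformal field theory in vertex operator formulation on to the original $W$. The inverse, which goes from $W$ to $W'$, is best calculated by regularizing the exponential of $-\epsilon\phi$. We get

$$\begin{aligned} \exp(\pi\epsilon(f - e)) &= \exp\left(\pi\epsilon \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}\right) = \begin{pmatrix} \cosh \pi\epsilon & -\sinh \pi\epsilon \\ -\sinh \pi\epsilon & \cosh \pi\epsilon \end{pmatrix} \\ &= \begin{pmatrix} 1 & -\tanh \pi\epsilon \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \dfrac{1}{\cosh \pi\epsilon} & 0 \\ 0 & \cosh \pi\epsilon \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -\tanh \pi\epsilon & 1 \end{pmatrix} \\ &= \frac{1}{\cosh \pi\epsilon} \exp\left(-\tanh(\pi\epsilon)\frac{x_{-n}\tilde x_{-n}}{n}\right) {:}\exp\left(\left(\frac{1}{\cosh \pi\epsilon} - 1\right)\left(\frac{x_{-n}x_n}{n} + \frac{\tilde x_{-n}\tilde x_n}{n}\right)\right){:}\, \exp\left(\tanh(\pi\epsilon)\frac{x_n \tilde x_n}{n}\right). \end{aligned}$$

So expressing this as

$$\Psi_n = \frac{1}{\cosh \pi\epsilon}\,\Psi'_n,$$

the product

$$\Psi' = \prod_{n\geq 1} \Psi'_n \tag{124}$$

is the regularized isomorphism from $W$ to $W'$. Even though $\Psi'$ and $\Phi'$ are only elements of $\hat W$, the element $u(\epsilon) = \Psi' u$ is the regularized chiral primary field in $W'$, and can be used in a regularized version of Eq. (8) to calculate the vacua on $V$, which will converge on non-degenerate Segal worldsheets. In this approach, however, the resulting CFT structure on $V$ remains opaque, while as it turns out, in the present case it can be identified by another method. In fact, to answer the question, we must treat precisely the case missing in Theorem 1, namely when the weight 0 part of the vertex operator of the deforming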


field, which in this case is determined by the momentum, does not vanish. The answer is actually known in string theory to correspond to constant deformation of the metric on spacetime, which ends up isomorphic to the original free field theory. From the point of view of string theory, what we shall give is a “purely worldsheet argument” establishing this fact. Let us look first at the infinitesimal deformation of the operator $Y(v, t, \bar t)$ for some field $v \in V$ which is an eigenstate of momentum. We have three forms which coincide where defined:

$$Y(x_{-1}\tilde x_{-1}, z, \bar z)\,Y(v, t, \bar t)\,dz\,d\bar z \tag{125}$$

$$Y(v, t, \bar t)\,Y(x_{-1}\tilde x_{-1}, z, \bar z)\,dz\,d\bar z \tag{126}$$

$$Y(Y(x_{-1}\tilde x_{-1}, z - t, \overline{z - t})v, t, \bar t)\,dz\,d\bar z. \tag{127}$$

By chiral splitting, if we assume $v$ is a monomial in the modes, we can denote (125)–(127) by $\eta\tilde\eta$ (without forming a sum of terms). Again, integrating (125)–(127) term by term $dz$, we get forms $\omega_\infty$, $\omega_0$, $\omega_t$, respectively. Here we set

$$\int \frac{1}{z}\,dz = \ln z.$$

Again, these are branched forms. Selecting points $p_0$, $p_\infty$, $p_t$ on the corresponding boundary components, we can, say, make cuts $c_{0,t}$ and $c_{0,\infty}$ connecting the points $p_0$, $p_t$ and $p_0$, $p_\infty$. Cutting the worldsheet in this way, we obtain well defined branches $\omega_\infty$, $\omega_0$, $\omega_t$. To complicate things further, we have constant discrepancies

$$C_{0t} = \omega_0 - \omega_t, \qquad C_{0\infty} = \omega_0 - \omega_\infty. \tag{128}$$

These can be calculated for example by comparing with the 4 point function

$$Y_+(x_{-1}, z)Y(v, t) + Y(v, t)Y_-(x_{-1}, z) + Y(Y_-(x_{-1}, z - t)v, t) \tag{129}$$

where $Y_-(v, z)$ denotes the sum of the terms in $Y(v, z)$ involving negative powers of $z$, and $Y_+(v, z)$ is the sum of the other terms. Another way to approach this is as follows: one notices that

$$\int Y(x_{-1}, z)\,dz = \partial_\epsilon Y(1_\epsilon, z)S_{-\epsilon}\big|_{\epsilon=0} \tag{130}$$

where $S_m$ denotes the operator which adds $m$ to momentum. It follows that

$$\begin{aligned} C_{0t} &= \partial_\epsilon \big(Z(x_{-1}, v, z, t)S_{-\epsilon} - Z(x_{-1}, S_\epsilon v, z, t)\big)\big|_{\epsilon=0}, \\ C_{0\infty} &= \partial_\epsilon \big(Z(x_{-1}, v, z, t)S_{-\epsilon} - S_\epsilon Z(x_{-1}, v, z, t)\big)\big|_{\epsilon=0}. \end{aligned} \tag{131}$$

Now the deformation is obtained by integrating the forms

$$\omega_0\tilde\eta, \tag{132}$$

$$(\omega_t + C_{0t})\tilde\eta, \tag{133}$$

$$(\omega_\infty + C_{0\infty})\tilde\eta, \tag{134}$$


on the boundary components around 0, $t$ and $\infty$, and along both sides of the cuts $c_{0,t}$, $c_{0,\infty}$. To get the integrals of the terms in (132)–(134) which do not involve the discrepancy constants, we need to integrate

$$\left(\sum_{n\neq 0} \frac{x_{-n}}{n} z^n + x_0 \ln z\right) \sum_m \tilde x_{-m} \bar z^{m-1}. \tag{135}$$

To do this, observe that (pretending we work on the degenerate worldsheet, and hence omitting scaling factors, taking curved integrals over $|z| = 1$),

$$\oint \frac{1}{\bar z} \ln z\, d\bar z = -\oint \frac{\ln \bar z}{\bar z}\, d\bar z = -2\pi i \ln \bar z - \frac{1}{2}(2\pi i)^2 \tag{136}$$

$$\oint \ln z \cdot \bar z^{m-1}\, d\bar z = -2\pi i\, \frac{1}{m}\, \bar z^m. \tag{137}$$

(Actually, the first term on the right-hand side of (136) depends on the branch of the logarithm taken and hence cannot figure in the final result; the reader may note that this is indeed the case.) Integrating (135), we obtain terms

$$-2\pi i\, x_0 \left(\sum_{m\neq 0} \tilde x_{-m} \frac{\bar z^m}{m} + \tilde x_0 \ln \bar z\right) \tag{138}$$

which will cancel with the integral along the cuts (to calculate the integral over the cuts, pair points on both sides of the cut which project to the same point in the original worldsheet), and “local” terms

$$2\pi i \sum_{n\neq 0} \frac{x_{-n}\tilde x_{-n}}{n} - \frac{1}{2}(2\pi i)^2 x_0 \tilde x_0. \tag{139}$$

The discrepancies play no role on the cuts (as the forms $C_{0t}\tilde\eta$, $C_{0\infty}\tilde\eta$ are unbranched), but using the formula (131), we can compensate for the discrepancies to linear order in $\epsilon$ by applying on each boundary component

$$S_{-2\pi i \epsilon \tilde x_0}. \tag{140}$$

In (138), however, when integrating $\tilde\eta$, we obtain also discrepancy terms conjugate to (140), so the correct expression is

$$S_{-2\pi i \epsilon \tilde x_0}\tilde S_{-2\pi i \epsilon x_0}. \tag{141}$$

The term (141) is also “local” on the boundary components, so the sum of (139) and (141) is the formula for the infinitesimal isomorphism between the free CFT and the infinitesimally deformed theory. To exponentiate, suppose now we are working in a $D$-dimensional free CFT, and the deformation field is

$$M x_{-1}\tilde x_{-1}. \tag{142}$$

Then the formula for the exponentiated isomorphism multiplies left momentum by

$$\exp M \tag{143}$$


and right momentum by

$$\exp M^T. \tag{144}$$
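The role of the symmetric and antisymmetric parts of $M$ in (143), (144) can be illustrated numerically. The sketch below is only an illustration with hypothetical helper names; it uses a truncated power series for the matrix exponential and checks two standard facts: $\exp(M^T) = (\exp M)^T$ for every $M$, and $\exp M$ is orthogonal when $M$ is antisymmetric. For symmetric $M$ the two factors (143) and (144) therefore coincide, while for antisymmetric $M$ they are inverse to each other.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    n = len(a)
    return [[a[j][i] for j in range(n)] for i in range(n)]

def mat_exp(a, terms=60):
    """Truncated power series for exp(a); fine for small matrices."""
    n = len(a)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms):
        prod = matmul(term, a)
        term = [[prod[i][j] / k for j in range(n)] for i in range(n)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def close(a, b, tol=1e-9):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(n) for j in range(n))

# Illustrative matrices, chosen arbitrarily for this sketch.
SYMMETRIC = [[0.2, 0.5, 0.0], [0.5, -0.1, 0.3], [0.0, 0.3, 0.4]]
ANTISYMMETRIC = [[0.0, 0.7, -0.2], [-0.7, 0.0, 0.5], [0.2, -0.5, 0.0]]

def is_orthogonal(a):
    """True when a times its transpose is the identity, up to rounding."""
    return close(matmul(a, transpose(a)), identity(len(a)))
```

For example, `is_orthogonal(mat_exp(ANTISYMMETRIC))` holds, whereas `mat_exp(SYMMETRIC)` is symmetric and in general not orthogonal.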

But of course, in the free theory, the left momentum must equal the right momentum, so this formula works only when $M$ is a symmetric matrix. Thus, to cover the general case, we must discuss the case when $M$ is antisymmetric. In this case, it may seem that we obtain indeed a different CFT which is defined in the same way as the free CFT with the exception that the left momentum $m_L$ and right momentum $m_R$ are related by the formula $m_L = A m_R$ for some fixed orthogonal matrix $A$. As it turns out, however, this theory is still isomorphic to the free CFT. The isomorphism replaces the left moving oscillators $x_{i,-n}$ by their transform via the matrix $A$ (which acts on this Heisenberg representation by transport of structure). Next, let us discuss the case of deforming gravitational field of non-zero momentum, i.e. when

$$u = M x_{-1}\tilde x_{-1} 1_\lambda \tag{145}$$

with $\lambda \neq 0$. Of course, in order for (145) to be of weight (1, 1), we must have

$$\|\lambda\| = 0. \tag{146}$$

Clearly, then, the metric cannot be Euclidean, hence there will be ghosts and a part of our theory does not apply. Note that in order for (145) to be primary, we also must have

$$M = \sum_i \mu_i \otimes \tilde\mu_i \tag{147}$$

where

$$\langle \mu_i, \lambda \rangle = \langle \tilde\mu_i, \lambda \rangle = 0. \tag{148}$$

Despite the indefinite signature, we still have the primary obstruction, which is

$$\mathrm{coeff}_{z^{-1}\bar z^{-1}}\ {:}\left(\sum_{m,n} M x_{-m}\tilde x_{-n} z^{m-1}\bar z^{n-1}\right) \exp\left(\lambda \sum_{k\neq 0}\left(\frac{x_{-k}}{k} z^k + \frac{\tilde x_{-k}}{k} \bar z^k\right)\right){:}\ M x_{-1}\tilde x_{-1} 1_\lambda \tag{149}$$

(we omit the $z^{\langle \lambda, x_0 \rangle}$ term, since the power is 0 by (146)). In the notation (147), this is

$$\sum_{i,j} (\mu_i x_0 - \mu_i x_0\, \lambda x_{-1}\, \lambda x_1 + \mu_i x_{-1}\, \lambda x_1 + \lambda x_{-1}\, \mu_i x_1) \otimes (\tilde\mu_j \tilde x_0 - \tilde\mu_j \tilde x_0\, \lambda \tilde x_{-1}\, \lambda \tilde x_1 + \tilde\mu_j \tilde x_{-1}\, \lambda \tilde x_1 + \lambda \tilde x_{-1}\, \tilde\mu_j \tilde x_1)\, M x_{-1}\tilde x_{-1} 1_\lambda$$


which in the presence of (148) reduces to the condition

$$\|M\|^2\, \lambda \otimes \lambda\, x_{-1}\tilde x_{-1} = 0. \tag{150}$$

This is false unless

$$\|M\|^2 = 0 \tag{151}$$

which means that (145) is a null state, along which the deformation is not interesting in the sense of string theory. More generally, the distributional form of (150) is

$$\int_{\|\lambda\|^2 = 0} \lambda \otimes \lambda\, \|M(\lambda)\|^2 = 0. \tag{152}$$

If we set

$$f(\lambda) = \delta_{\|\lambda\|^2 = 0}\, \|M(\lambda)\|^2$$

then the Fourier transform of $f$ will be a function $g$ satisfying

$$\sum_i \pm \frac{\partial^2 g}{\partial \lambda_i^2} = 0$$

where the signs correspond to the metric, which we assume is diagonal with entries $\pm 1$. The Fourier transform of the condition (152) is then

$$\frac{\partial^2 g}{\partial \lambda_i \partial \lambda_j} = 0. \tag{153}$$

Assuming a decay condition under which the Fourier transform makes sense, (153) implies $g = 0$, hence (151), so in this case also the obstruction is nonzero unless (145) is a null state. In this discussion, we restricted our attention to deforming fields of gravitational origin. It is important to note that other choices are possible. As a very basic example, let us look at the 1-dimensional Euclidean model. Then there is a possibility of critical fields of the form

$$a\,1_{\sqrt 2} + b\,1_{-\sqrt 2}. \tag{154}$$

This includes the sine-Gordon interaction [69] when $a = b$. (We see hyperbolic rather than trigonometric functions because we are working in Euclidean spacetime rather than in the time coordinate, which is the case usually discussed.) The primary obstruction in this case states that the weight (0, 0) descendant of (154) applied to (154) is 0. Since the descendant is $(4ab)x_{-1}$, we obtain the condition $a = 0$ or $b = 0$. It is interesting to note that in the case of the compactification on a circle, these cases were investigated very successfully by Ginsparg [30], who used the obstruction to completely characterize the component of the moduli space of $c = 1$ CFT's originating from the free Euclidean compactified free theory. The result is that only free theories compactified at different radii, and their $\mathbb{Z}/2$-orbifolds, occur.


There are many other possible choices of non-gravitational deformation fields, one for each field in the physical spectrum of the theory. We do not discuss these cases in the present paper. Let us now look at the N = 1-supersymmetric free field theory. In this case, as pointed out above, in the NS-NS sector, critical gravitational fields for deformations have weight (1/2, 1/2). We could also consider the NS-R and R-R sectors, where the critical weights are (1/2, 0) and (0, 0), respectively. These deforming fields parametrize soul directions in the space of infinitesimal deformations. The soul parameters $\theta$, $\tilde\theta$ have weights (1/2, 0), (0, 1/2), which explains the difference of critical weights in these sectors. Let us, however, focus on the body of the space of gravitational deformations, i.e. the NS-NS sector. Let us first look at the weight (1/2, 1/2) primary field

$$M \psi_{-1/2}\tilde\psi_{-1/2}. \tag{155}$$

The point is that the infinitesimal deformation is obtained by integrating the insertion operators of

$$G_{-1/2}\tilde G_{-1/2} M \psi_{-1/2}\tilde\psi_{-1/2} = M x_{-1}\tilde x_{-1}.$$

Therefore, (155) behaves exactly the same as a deformation along the field (142) in the bosonic case. Again, if $M$ is a symmetric matrix, exponentiating the deformation leads to a theory isomorphic via scaling the momenta, while if $M$ is antisymmetric, the isomorphism involves transforming the left moving modes by the orthogonal matrix $\exp(M)$. In the case of momentum $\lambda \neq 0$, we again have indefinite signature, and the field

$$u = M \psi_{-1/2}\tilde\psi_{-1/2} 1_\lambda. \tag{156}$$

Once again, for (156) to be primary, we must have (147), (148). Moreover, again the actual infinitesimal deformation is obtained by applying the insertion operators of $G_{-1/2}\tilde G_{-1/2}u$, so the treatment is exactly the same as for the deformation along the field (145) in the bosonic case. Again, we discover that under a suitable decay condition, the obstruction is always nonzero for gravitational deformations of nonzero momentum. It is worth noting that in both the bosonic and supersymmetric cases, one can apply the same analysis to free field theories compactified on a torus. In this case, however, scaling momenta changes the geometry of the torus, so using deformation fields of 0 momentum, we find exponential deformations which change (constantly) the metric on the torus. This seems to confirm, in the restricted sense investigated here, a conjecture stated in [59]. Remark. Since one can consider Calabi–Yau manifolds which are tori, one sees that there should also exist an N = 2-supersymmetric version of the free field theory compactified on a torus. (It is in fact not difficult to construct such a model directly,


it is a standard construction.) Now since we are in the Calabi–Yau case, marginal cc fields should correspond to deformations of complex structure, and marginal ac fields should correspond to deformations of Kähler metric in this case. But on the other hand, we already identified gravitational fields which should be the sources of such deformations. Additionally, deformations in those directions require regularization of the deformation parameter, and hence cannot satisfy the conclusion of Theorem 3. This is explained by observing that we must be careful with reality. The gravitational fields we considered are in fact real, but neither chiral nor antichiral primary in either the left or the right moving sector. By contrast, chiral primary (or antiprimary) fields are not real. This is due to the fact that $G^+_{-3/2}$ and $G^-_{-3/2}$ are not real in the N = 2 superconformal algebra, but are in fact complex conjugate to each other. Therefore, to get to the real gravitational fields, we must take real parts, or in other words linear combinations of chiral and antichiral primaries, resulting in the need for regularization. It is in fact a fun exercise to calculate explicitly how our higher N = 2 obstruction theory operates in this case. Let us consider the N = 2-supersymmetric free field theory, since the compactification behaves analogously. The minimum number of dimensions for N = 2 supersymmetry is 2. Let us denote the bosonic fields by $x, y$ and their fermionic superpartners by $\xi, \psi$. Then the 0-momentum summand of the state space (NS sector) is (a Hilbert completion of)

$$\mathrm{Sym}(x_n, y_n \mid n < 0) \otimes \Lambda\left(\xi_r, \psi_r \mid r < 0,\ r \in \mathbb{Z} + \tfrac{1}{2}\right).$$

The “body” parts of the bosonic and fermionic vertex operators are given by the usual formulas

$$Y(x_{-1}, z) = \sum x_{-n} z^{n-1}, \qquad Y(y_{-1}, z) = \sum y_{-n} z^{n-1},$$

$$Y(\xi_{-1/2}, z) = \sum \xi_{-s} z^{s-1/2}, \qquad Y(\psi_{-1/2}, z) = \sum \psi_{-s} z^{s-1/2},$$

with

$$[\xi_r, \xi_{-r}] = [\psi_r, \psi_{-r}] = 1, \qquad [x_n, x_{-n}] = [y_n, y_{-n}] = n.$$

We have, say,

$$G^1_{-3/2} = \xi_{-1/2}x_{-1} + \psi_{-1/2}y_{-1}, \qquad G^2_{-3/2} = \xi_{-1/2}y_{-1} - \psi_{-1/2}x_{-1}.$$

As usual,

$$G^\pm_{-3/2} = \frac{1}{\sqrt 2}\left(G^1_{-3/2} \pm iG^2_{-3/2}\right).$$


With these conventions, we have a critical chiral primary

$$u = \xi_{-1/2} - i\psi_{-1/2} \tag{157}$$

(and its complex conjugate critical antichiral primary). We then see that for a non-zero coefficient $C$,

$$C\,G^-_{-1/2}u = x_{-1} - iy_{-1}. \tag{158}$$

We now notice that formulas analogous to (112) etc. apply to (158), but the $-1$ summands of $h$ will appear with opposite signs for the real and imaginary summands, so they cancel out, and the regularizations (123), (124) are not needed, as expected. Next, let us study the formula (81). The key observation here is that we have the combinatorial identity

$$\frac{1}{n_1 \cdots n_k} = \sum_\sigma \frac{1}{(n_{\sigma(1)} + \cdots + n_{\sigma(k)})(n_{\sigma(1)} + \cdots + n_{\sigma(k-1)}) \cdots n_{\sigma(1)}} \tag{159}$$

where the sum on the right is over all permutations on the set $\{1, \dots, k\}$. Now in the present case, we have the infinitesimal isomorphism on the 0-momentum part, up to non-zero coefficient,

$$\phi = \sum_n \frac{(x_n - iy_n)(\tilde x_n + i\tilde y_n)}{n} \tag{160}$$

and in the absence of regularization, the expansion of the exponentiated isomorphism on the 0-momentum parts is simply $\exp(\epsilon\phi)$. (The + sign in the ˜'s is caused by the fact that we are in the complex conjugate Hilbert space.) Applying this to (157), we see that we have formulas analogous to (116)–(122), and applying the exponentiated isomorphism to (157), all the terms in normal order involving $x_{>0}$, $y_{>0}$ will vanish, so we end up with

$$\exp\left(D\epsilon \sum_{n<0} \frac{(x_n - iy_n)(\tilde x_n + i\tilde y_n)}{n}\right)u$$

for some non-zero coefficient $D$. Applying (159), we get (81). Finally, the obstruction in chiral form

$$\left\langle u^*(0), (G^-_{-1/2}u)(z_k), \dots, (G^-_{-1/2}u)(z_1), u(0) \right\rangle$$

must vanish identically. To see this, we simply observe from (157), (158) that in the present case, $u$ is in the coset model with respect to $G^-_{-1/2}u$ (see the discussion below formula (250) below). Thus, in the N = 2-free field theory, the obstruction theory works as expected, and in the case discussed, the obstructions vanish. It is worth noting that in $2n$-dimensional N = 2-free field theory, we thus have an $n^2$-dimensional space of cc + aa real fields, and an $n^2$-dimensional space of real ca + ac fields, and although regularization occurs, there is no obstruction to exponentiating the deformation by turning on any linear combination of those fields. For a free N =


2-theory compactified on an $n$-dimensional abelian variety, this precisely recovers the deformations in the corresponding component of the moduli space of Calabi–Yau varieties. However, other deformations exist. For an interesting calculation of deformations of the N = 2-free field theory in “sine-Gordon” directions, see [13].

5. The Gepner Model of the Fermat Quintic

The finite weight states of one chirality (say, left moving) of the Gepner model of the Fermat quintic are embedded in the 5-fold tensor product of the N = 2-supersymmetric minimal model of central charge 9/5 [24, 25, 32]. More precisely, the Gepner model is an orbifold construction. This construction has two versions. In [24, 25, 32], one is interested in actual string theories, so the 5-fold tensor product of central charge 9 of N = 2 minimal models is tensored with a free supersymmetric CFT on 4 Minkowski coordinates. This is then viewed in lightcone gauge, so in effect, one tensors with a 2-dimensional supersymmetric Euclidean free CFT, resulting in N = 2-supersymmetric CFT of central charge 12. Finally, one performs an orbifolding/GSO projection to give a candidate for a theory for which both modularity and spacetime SUSY can be verified. (Actually, this is still not quite precise, as in Gepner’s original work, the true point of interest is the construction of heterotic string theories; from our point of view, however, the difference does not matter.) What we care about is that it is also possible to create an orbifold theory of central charge 9 which is the candidate of the nonlinear σ-model itself, without the spacetime coordinates. (The spacetime coordinates can be added to this construction and usual GSO projection performed if one is interested in the corresponding string theory.) The essence of this construction not involving the spacetime coordinates is formula (2.10) of [33].

In the case of the level 3 N = 2-minimal model (more precisely, the unitary N = 2 Virasoro minimal model of A-type), the orbifold construction is with respect to the diagonal $\mathbb{Z}/5$-action which acts on the eigenstates of $J_0$ eigenvalue (= “U(1)-charge”) $j/5$ by $e^{2\pi i j/5}$. As we shall review, the NS part of the level 3 N = 2 minimal model has two sectors of U(1)-charge $j/5$, which we will for the moment ad hoc denote $H_{j/5}$ and $H'_{j/5}$ for $j \in \mathbb{Z}/5\mathbb{Z}$. In the FF realization (see below), these sectors correspond to $\ell = 0$, $\ell = 1$, respectively. Then the NS-NS sector of the 5-fold tensor product of minimal model has the form

$$\bigoplus_{(i_k)} \hat{\bigotimes_{k=1}^{5}} \left(H_{i_k/5} \hat\otimes H^*_{i_k/5} \oplus H'_{i_k/5} \hat\otimes H'^*_{i_k/5}\right). \tag{161}$$

The corresponding sector of the orbifold construction (formula (2.10) of [33]) has the form

$$\bigoplus_{(i_k):\ \sum i_k \in 5\mathbb{Z}}\ \bigoplus_{j \in \mathbb{Z}/5\mathbb{Z}}\ \hat{\bigotimes_{k=1}^{5}} \left(H_{i_k/5} \hat\otimes H^*_{(j+i_k)/5} \oplus H'_{i_k/5} \hat\otimes H'^*_{(j+i_k)/5}\right). \tag{162}$$


Mathematically speaking, this orbifold can be constructed by noting that, ignoring for the moment supersymmetry, the N = 2-minimal model is a tensor product of the parafermion theory of the same level and a lattice theory (see [31] and also below). The orbifold construction does not affect the parafermionic factor, and on the lattice coordinate, which in this case does not possess a non-zero $\mathbb{Z}/2$-valued form, and hence physically models a free theory compactified on a torus, the orbifold simply means replacing the torus by its factor by the free action of the diagonal $\mathbb{Z}/5$ translation group, which is represented by another lattice theory. On this construction, N = 2 supersymmetry is then easily restored using the same formulas as in (161), since the U(1)-charge of the G's is integral. The calculations in this and the next section proceed entirely in the orbifold (162), and hence can be derived from the structure of the level 3 N = 2-minimal model. It should be pointed out that a mathematical approach to the fusion rules of the N = 2 minimal models was given in [42]. We shall use the Coulomb gas realization of the N = 2-minimal model, cf. [34, 53]. Let us restrict attention to the NS sector. Then, essentially, the left moving sector of the minimal model is a subquotient of the lattice theory where the lattice is 3-dimensional, and spanned by

$$\left(\frac{3}{\sqrt{15}}, 0, 0\right), \quad \left(\frac{1}{\sqrt{15}}, \frac{i}{2}\sqrt{\frac{2}{5}}, \frac{i}{2}\sqrt{\frac{2}{3}}\right), \quad \left(\frac{2}{\sqrt{15}}, 0, i\sqrt{\frac{2}{3}}\right).$$

We will adopt the convention that we shall abbreviate $(k, \ell, m)_{MM} = (k, \ell, m)$ for the lattice label

$$\left(\frac{k}{\sqrt{15}}, \frac{i\ell}{2}\sqrt{\frac{2}{5}}, \frac{mi}{2}\sqrt{\frac{2}{3}}\right).$$

We shall also write $(\ell, m)_{MM} = (m, \ell, m)_{MM}$. Call the oscillator corresponding to the $j$th coordinate $x_{j,m}$, $j = 0, 1, 2$. Then the conformal vector is

$$\frac{1}{2}x_{0,-1}^2 - \frac{1}{2}x_{1,-1}^2 + \frac{i}{2}\sqrt{\frac{2}{5}}\,x_{1,-2} + \frac{1}{2}x_{2,-1}^2. \tag{163}$$

The superconformal algebra is generated by

$$\begin{aligned} G^+_{-3/2} &= \frac{i}{2}\left(x_{2,-1} - \sqrt{\frac{5}{3}}\,x_{1,-1}\right) 1_{\left(\frac{5}{\sqrt{15}},\,0,\,i\sqrt{\frac{2}{3}}\right)}, \\ G^-_{-3/2} &= -\frac{i}{2}\left(x_{2,-1} + \sqrt{\frac{5}{3}}\,x_{1,-1}\right) 1_{\left(-\frac{5}{\sqrt{15}},\,0,\,-i\sqrt{\frac{2}{3}}\right)}. \end{aligned} \tag{164}$$


For future reference, we will sometimes use the notation (a, b, c)xn = ax0,n + bx1,n + cx2,n and also sometimes abbreviate (b, c)xn = (0, b, c)xn . The module labels are realized by labels (, m) = 1( √m

15

0 ≤ ℓ ≤ 3,

,− i 2

√2

im 3, 2

√2 , ) 3

m = −ℓ, −ℓ + 2, . . . , ℓ − 2, ℓ.

(165) (166)

It is obvious that to stay within the range (166), we must understand the fusion rules and how they are applied. The basic principle is that labels are identified as follows: No identifications are imposed on the 0th lattice coordinate. This means that upon any identification, the 0th coordinate must be the same for the labels identified. Therefore, the identification is governed by the 1st and 2nd coordinates, which give the Coulomb gas (= Feigin–Fuchs) realization of the corresponding parafermionic theory (the $\mathbb{Z}/3$ parafermion model). The key point here are the parafermionic currents

$$\begin{aligned} \psi_{1,-2/3} &= \frac{i}{2}\left(x_{2,-1} - \sqrt{\frac{5}{3}}\,x_{1,-1}\right) 1_{\left(0,\,i\sqrt{\frac{2}{3}}\right)_{PF}}, \\ \psi^+_{1,-2/3} &= -\frac{i}{2}\left(x_{2,-1} + \sqrt{\frac{5}{3}}\,x_{1,-1}\right) 1_{\left(0,\,-i\sqrt{\frac{2}{3}}\right)_{PF}} \end{aligned} \tag{167}$$

(the 0th coordinate is omitted). Clearly, the parafermionic currents act on the labels by

$$\psi_{1,-2/3} : (\ell, m)_{PF} \to (\ell, m + 2)_{PF}, \qquad \psi^+_{1,-2/3} : (\ell, m)_{PF} \to (\ell, m - 2)_{PF}. \tag{168}$$

The lattice labels $(\ell, m)_{PF}$ allowed are those which have non-negative weight. This condition coincides with (166). Now we impose the identification for parafermionic labels:

$$(\ell, m)_{PF} = (3 - \ell, m - 3)_{PF}.$$

This implies

$$(1, -1)_{PF} \sim (2, 2)_{PF}, \quad (1, 1)_{PF} \sim (2, -2)_{PF}, \quad (0, 0)_{PF} \sim (3, -3)_{PF} \sim (3, 3)_{PF}. \tag{169}$$


Now in the Gepner model corresponding to the quintic, the cc fields allowed are

$$((3, 2, 0, 0, 0)_L, (3, 2, 0, 0, 0)_R), \tag{170}$$

$$((3, 1, 1, 0, 0)_L, (3, 1, 1, 0, 0)_R), \tag{171}$$

$$((2, 2, 1, 0, 0)_L, (2, 2, 1, 0, 0)_R), \tag{172}$$

$$((2, 1, 1, 1, 0)_L, (2, 1, 1, 1, 0)_R), \tag{173}$$

$$((1, 1, 1, 1, 1)_L, (1, 1, 1, 1, 1)_R), \tag{174}$$

and the ac field allowed is

$$((-1, -1, -1, -1, -1)_L, (1, 1, 1, 1, 1)_R). \tag{175}$$

Here we wrote $\ell$ for $1_{(\ell,\ell)_{MM}}$ ($\ell = 0, \dots, 3$), which is a chiral primary in the N = 2 minimal model of weight $\ell/10$, and $-\ell$ for $1_{(\ell,-\ell)_{MM}}$, which is antichiral primary of weight $\ell/10$. The tuple notation in (170)–(175) really means tensor product. We omit permutations of the fields (170)–(173), so counting all permutations, there are 101 fields (170)–(174). We will need an understanding of the fusion rules in the $\mathbb{Z}/3$ parafermion model and N = 2-supersymmetric minimal model of central charge 9/5. In the $\mathbb{Z}/3$ parafermion model, we have 6 labels

$$(0, 0)_{PF},\ (3, 1)_{PF},\ (3, -1)_{PF}, \tag{176}$$

$$(1, 1)_{PF},\ (1, -1)_{PF},\ (2, 0)_{PF}. \tag{177}$$
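Both counts in this passage (101 cc fields, 6 parafermionic labels) can be reproduced by a short enumeration. The sketch below is only an illustration with hypothetical function names. It assumes that the second index $m$ of a parafermionic label is read modulo 6 when the identification $(\ell, m)_{PF} = (3 - \ell, m - 3)_{PF}$ is applied, which is what $(3, -3)_{PF} \sim (3, 3)_{PF}$ in (169) suggests, and it only considers labels in the range (166).

```python
from itertools import permutations

# The patterns (170)-(174); each tuple lists the five tensor factors.
CC_PATTERNS = [
    (3, 2, 0, 0, 0),
    (3, 1, 1, 0, 0),
    (2, 2, 1, 0, 0),
    (2, 1, 1, 1, 0),
    (1, 1, 1, 1, 1),
]

def count_cc_fields():
    """Number of distinct orderings of the patterns (170)-(174)."""
    return sum(len(set(permutations(p))) for p in CC_PATTERNS)

def pf_labels():
    """Labels (l, m) with 0 <= l <= 3 and m = -l, -l+2, ..., l, as in (166)."""
    return [(l, m) for l in range(4) for m in range(-l, l + 1, 2)]

def pf_classes():
    """Classes of labels under (l, m) ~ (3 - l, m - 3), with m taken mod 6."""
    labels = pf_labels()
    parent = {x: x for x in labels}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for (l, m) in labels:
        for (l2, m2) in labels:
            # Assumption of this sketch: m is only defined modulo 6.
            if l2 == 3 - l and (m2 - (m - 3)) % 6 == 0:
                parent[find((l, m))] = find((l2, m2))

    classes = {}
    for x in labels:
        classes.setdefault(find(x), set()).add(x)
    return {frozenset(c) for c in classes.values()}
```

Under these assumptions `count_cc_fields()` reproduces the count 101 and `pf_classes()` has six elements, with representatives matching (176), (177).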

This can be described as follows: the labels (176) have the same fusion rules as the lattice $L = \langle i\sqrt{6} \rangle \subset \mathbb{C}$, i.e.

$$L'/L \tag{178}$$

where $L'$ is the dual lattice (into which $L$ is embedded using the standard quadratic form on $\mathbb{C}$). This dual lattice is $\left\langle i\sqrt{\frac{2}{3}} \right\rangle$, and the fusion rule is “abelian”, which means that the product of labels has only one possible label as outcome, and is described by the product in $L'/L$. The label $\pm i\sqrt{\frac{2}{3}}$ corresponds to $(0, \pm 2)_{PF} \sim (3, \mp 1)_{PF}$. Next, the product of $(2, 0)_{PF}$ with $(3, \mp 1)_{PF}$ has only one possible outcome, $(2, \pm 2)_{PF} = (1, \mp 1)_{PF}$. The product of $(2, 0)_{PF}$ with itself has two possible outcomes, $(2, 0)_{PF}$ and $(0, 0)_{PF}$. All other products are determined by commutativity, associativity and unitality of fusion rules. The result can be summarized as follows: We call (176) level 0, 3 labels and (177) level 1, 2 labels. Every level 1, 2 label has a corresponding label of level 0, 3.


The correspondence is

$$\begin{aligned} (0, 0)_{PF} &\leftrightarrow (2, 0)_{PF} \\ (3, 1)_{PF} &\leftrightarrow (1, 1)_{PF} \\ (3, -1)_{PF} &\leftrightarrow (1, -1)_{PF}. \end{aligned} \tag{179}$$

As described above, the fusion rules on level 0, 3 are determined by the lattice theory of $L$. Additionally, multiplication preserves the correspondence (179), while the level of the product is restricted only by requiring that any level added to level 0, 3 is the original level. To put it in another way still, the Verlinde algebra is

$$\mathbb{Z}[\zeta]/(\zeta^3 - 1) \otimes \mathbb{Z}[\epsilon]/(\epsilon^2 - \epsilon - 1) \tag{180}$$

where $\zeta = (3, 1)_{PF}$ and $\epsilon = (2, 0)_{PF}$. In the N = 2 supersymmetric minimal model (MM) case, we allow labels

$$(3k + m, \ell, m)_{MM} \tag{181}$$

where $(\ell, m)$ is a PF label, $k \in \mathbb{Z}$. Two labels (181) are identified subject to identifications of PF labels, and also

$$(j, \ell, m)_{MM} \sim (j + 15, \ell, m)_{MM}, \tag{182}$$

and, as a result of SUSY,

$$(j, \ell, m)_{MM} \sim (j - 5, \ell, m - 2)_{MM}. \tag{183}$$

(By $\sim$ we mean that the labels (i.e. VA modules) are identified, but we do not imply that the states involved actually coincide; in the case (183), they have different weights.) Recalling again that we abbreviate $(m, \ell, m)_{MM}$ as $(\ell, m)_{MM}$, we get the following labels for the $c = 9/5$ N = 2 SUSY MM:

$$\begin{aligned} (0, 0)_{MM} &\leftrightarrow (2, 0)_{MM} \\ (3, 3)_{MM} &\leftrightarrow (2, -2)_{MM} \\ (3, 1)_{MM} &\leftrightarrow (1, 1)_{MM} \\ (3, -1)_{MM} &\leftrightarrow (1, -1)_{MM} \\ (3, -3)_{MM} &\leftrightarrow (2, 2)_{MM}. \end{aligned} \tag{184}$$

Again, the left column of (184) represents 0, 3 level labels, the right column represents level 1, 2 labels. The left column labels multiply as the labels of the lattice super-CFT corresponding to the lattice $\Lambda$ in $\mathbb{C} \oplus \mathbb{C}$ spanned by

$$(\sqrt{15}, 0), \quad \left(\frac{5}{\sqrt{15}}, i\sqrt{\frac{2}{3}}\right) \tag{185}$$


(recall that a super-CFT can be assigned to a lattice with integral quadratic form; the quadratic form on C ⊕ C is the standard one, the complexiﬁcation of the Euclidean inner product). The dual lattice of (185) is spanned by

3 5 2 √ ,0 , √ ,i , (186) 3 15 15 which correspond to the labels $(3, 0, 0)_{MM}$, $(5, 3, -1)_{MM}$, respectively. We see that
$$\Lambda'/\Lambda \cong \mathbb{Z}/5. \tag{187}$$

In (184), the rows (counted from top to bottom as 0, . . . , 4) match the corresponding residue class (187). The fusion rules for $(2, 0)_{MM}$, $(0, 0)_{MM}$ are the same as in the PF case. Hence, again, multiplication of labels preserves the rows of (184), and the Verlinde algebra is isomorphic to
$$\mathbb{Z}[\eta]/(\eta^5 - 1) \otimes \mathbb{Z}[\epsilon]/(\epsilon^2 - \epsilon - 1) \tag{188}$$
where $\eta$ is $(3, 3)_{MM}$.

Remark. As remarked in Sec. 3, the positive definiteness of the modular functor, which is crucial for our theory to work, is a requirement for a physical CFT. It is interesting to note, however, that if we do not include this requirement, other choices of real structure are possible on the modular functor: The Verlinde algebra of a lattice modular functor with another modular functor $M$ with two labels 1 and $\epsilon$, and Verlinde algebras (180), (188) are tensor products of lattice Verlinde algebras and the algebra $\mathbb{Z}[\epsilon]/(\epsilon^2 - \epsilon - 1)$. The real structure of this last modular functor can be changed by multiplying by $-1$ the complex conjugation in $M_\Sigma$ for a worldsheet $\Sigma$ precisely when $\Sigma$ has an odd number of boundary components labelled on level 1, 2. The modular functor resulting from this operation is not positive-definite.

Let us now discuss the question of vertex operators in the PF realization of the minimal model. Clearly, since the 0th coordinate acts as a lattice coordinate and is not involved in renaming, it suffices to address the question for the parafermions. Now in the Feigin–Fuchs realization of the level 3 PF model, any state can be written as
$$u 1_\lambda \tag{189}$$
where $\lambda$ is one of the labels (166) and $u$ is a state of the Heisenberg representation of the Heisenberg algebra generated by $x_{i,m}$, $i = 1, 2$, $m \neq 0$. The situation is however further complicated by the fact that not all Heisenberg states $u$ are allowed for a given label $\lambda$. We shall call the states which are in the image of the embedding


admissible. For example, since the $\lambda = 0$ part of the PF model is isomorphic to the coset model $SU(2)/S^1$ of the same level, states
$$(a, b)x_{-1}(0, 0)_{PF} \tag{190}$$
are not admissible for $(a, b) \neq (0, 0)$. One can show that admissible states are exactly those which are generated from the ground states (166) by vertex operators and PF currents. Because not all states are admissible, however, there are also states whose vertex operators are 0 on admissible states. Let us call them null states. For example, since (190) is not admissible, it follows that
$$(a, b)x_{-1}(3, 3)_{PF}, \tag{191}$$
which is easily seen to be admissible for any choice of $(a, b)$, is null.

Determining explicitly which states are admissible and which are null is extremely tricky (cf. [34]). Fortunately, we do not need to address the question for our purposes. This is because we will only deal with states which are explicitly generated by the primary fields, and hence automatically admissible; because of this, we can ignore null states, which do not affect correlation functions of admissible states.

On the other hand, we do need an explicit formula for vertex operators. One method for obtaining vertex operators is as follows. We may rename fields using the identifications (169) and also PF currents: a PF current applied to a renamed field must be equal to the same current applied to the original field. Note that this way we may get Heisenberg states above labels which fail to satisfy (166). Such states are also admissible, even though the corresponding “ground states” (which have the same name as the label) are not. Now if we have two admissible states
$$u_i 1_{(\ell_i, m_i)}, \quad i = 1, 2$$
where $0 \leq \ell_i \leq 3$ and $\ell_1 + \ell_2 \leq 3$, then the lattice vertex operator
$$(u_1 1_{(\ell_1, m_1)})(z)\, u_2 1_{(\ell_2, m_2)} \tag{192}$$

always satisﬁes our fusion rules, and (up to scalar multiple constant on each module) is a correct vertex operator of the PF theory. This is easily seen simply by the fact that (192) intertwines correctly with module vertex operators (which are also lattice operators). While in our examples, it will suﬃce to always consider operators obtained in the form (192), it is important to realize that they do not describe the PF vertex operators completely. The problem is that when we want to iterate vertex operators, we would have to keep renaming states. But when two ground states 1λ , 1µ are identiﬁed via the formula (169), it does not follow that we would have u1λ = u1µ

(193)


for every Heisenberg state $u$. On the contrary, we saw for example that (190) is inadmissible, while (191) is null. One also notes that one has for example the identification
$$\lambda x_{-1} 1_\lambda = L_{-1} 1_\lambda = L_{-1} 1_\mu = \mu x_{-1} 1_\mu, \tag{194}$$
which is not of the form (193). Because of this, to describe completely the full force of the PF theory, one needs another device for obtaining vertex operators (although we will not need this in the present paper). Briefly, it is shown in [34] that up to scalar multiple, any vertex operator $u(z)v = Y(u, z)(v)$ where $u$, $v$ are admissible states can be written as
$$\int \cdots \int (a_k x_{-1} 1_{(0,-2)})(t_k) \cdots (a_1 x_{-1} 1_{(0,-2)})(t_1)\, u(z) v \; dt_1 \cdots dt_k \tag{195}$$
where the operators in the argument (195) are lattice vertex operators and the number $k$ is selected to conform with the given fusion rule. While it is easy to show that operators of the form (195) are correct vertex operators on admissible states (again up to scalar multiple constant on each irreducible module), as the “screening operators” $a x_{-1} 1_{(0,-2)}$ commute with PF currents, selecting the bounds of integration (“contours”) is much more tricky. Despite the notation, it is not correct to imagine these as integrals over closed curves, at least not in general. One approach which works is to bring the argument of (195) to normal order, which expands it as an infinite sum of terms of the form
$$\prod (t_i - t_j)^{\alpha_{ij}} \prod t_k^{\beta_k} \tag{196}$$
(where we put $t_0 = z$) with coefficients which are lattice vertex operators. Then to integrate (196), for $\alpha_{ij}, \beta_k > 0$, we may simply integrate $t_i$ from 0 to $t_{i-1}$, and define the integral by analytic continuation in the variables $\alpha_{ij}$, $\beta_k$ otherwise. The functions obtained in this way are generalized hypergeometric functions, and fail for example the assumptions of Theorem 1 (see Remark 2 after the theorem). The explanation is in the fact that, as we already saw, the fusion rules are not “abelian” in this case.

6. The Gepner Model: The Obstruction

We will now show that for the Gepner model of the Fermat quintic, the function (95) may not vanish for the deforming field (170). This means that not all perturbative deformations corresponding to marginal fields exist in this case. We emphasize that our result applies to deformations of the CFT itself (of central charge 9). A different


approach is possible by embedding the model to string theory, and investigating the deformations in that setting (cf. [16]). Our results do not automatically apply to deformations in that setting.

We will consider $v = u = (3, 3, 3) \otimes (2, 2, 2)$. (In the remaining three coordinates, we will always put the vacuum, so we will omit them from our notation.) First note that by Theorem 3, this is actually the only relevant case of (95), since the only other chiral primary field of weight 1/2 with only two non-vacuum coordinates is $(2, 2, 2) \otimes (3, 3, 3)$, which cannot correlate with the right-hand side of (95), whose first coordinate is on level 0, 3. In any case, we will show therefore that the Gepner model has an obstruction against continuous perturbative deformation along the field (170) in the moduli space of exact conformal field theories.

Now the chiral correlation function (95) is a complicated multivalued function because of the integrals (196), which are generalized hypergeometric functions. As remarked above, the modular functor has a canonical flat connection on the space of degenerate worldsheets whose boundary components are shifts of the unit circle with the identity parametrization. The flat connection comes from the fact that these degenerate worldsheets are related to each other by applying $\exp(zL_{-1})$ to their boundary components. This is why we can speak of analytic continuation of a branch of the correlation function corresponding to a particular fusion rule. It can further be shown (although we do not need to use that result here) that the continuations of the correlation function corresponding to any one particular fusion rule generate the whole correlation function (i.e. the whole modular functor is generated by any one non-zero section).

Let us now investigate which number $m$ we need in (95). In our case, we have
$$G^-_{-1/2}(u) = G^-_{-1/2}(3, 3, 3) \otimes (2, 2, 2) - (3, 3, 3) \otimes G^-_{-1/2}(2, 2, 2). \tag{197}$$

(The sign will be justified later; it is not needed at this point.) The first summand of (197) has $x_{0,0}$-charge $(-2/\sqrt{15}, 2/\sqrt{15})$, the second has $x_{0,0}$-charge $(3/\sqrt{15}, -3/\sqrt{15})$. Thus, the charges can add up to 0 only if $m$ is a multiple of 5. The smallest possible obstruction is therefore for $m = 5$, in which case (95) is a 7 point function. Let us focus on this case. This function however is too big to calculate completely. Because of this, we use the following trick. First, it is equivalent to consider the question of vanishing of the function
$$\langle 1 | (G^-_{-1/2}u)(z_5) \cdots (G^-_{-1/2}u)(z_1)\, u(z_0)\, \bar{u}(t) \rangle. \tag{198}$$
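The charge count that forces $m$ to be a multiple of 5 can be reproduced by brute force. The sketch below is only an illustration under one stated assumption: the $x_{0,0}$-charges of $u$ and $\bar{u}$ cancel each other, so only the $m$ copies of $G^-_{-1/2}u$ contribute. Charges are measured in units of $1/\sqrt{15}$, and the function name `zero_charge_splits` is hypothetical.

```python
def zero_charge_splits(m):
    """Return every k for which k first-type and (m - k) second-type summands
    of (197) have total x_{0,0}-charge zero.

    Charges in units of 1/sqrt(15):
      first summand of (197):  (-2,  2)
      second summand of (197): ( 3, -3)
    Assumption: the charges of u and u-bar cancel and are therefore omitted.
    """
    return [k for k in range(m + 1)
            if -2 * k + 3 * (m - k) == 0 and 2 * k - 3 * (m - k) == 0]

# Values of m up to 15 for which some choice of summands has total charge zero.
solvable = [m for m in range(1, 16) if zero_charge_splits(m)]
```

For $m = 5$ the only admissible split uses three summands of the first type and two of the second type.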

Now by the OPE, it is possible to transform any correlation function of the form
$$\langle \cdots | \cdots v(z) w(t) \cdots \rangle \tag{199}$$


to the correlation function
$$\langle \cdots | \cdots (v_n w)(t) \cdots \rangle \tag{200}$$
(all other entries are the same). More precisely, (199) is expanded, in a certain range and choice of branch, into a series in $z - t$ with coefficients (200) for values of $n$ belonging to a coset $\mathbb{Q}/\mathbb{Z}$. By the above argument, therefore, the function (199) vanishes if and only if the function (200) vanishes for all possible choices of $n$ associated with one fixed choice of fusion rule.

In the case of (198), we shall divide the fields on the right-hand side into two sets $G_x$, $G_y$ containing two copies of $G^-_{-1/2}u$ each, and a set $G_z$ containing the remaining three fields $u$, $\bar{u}$ and $G^-_{-1/2}u$. Each set $G_x$, $G_y$, $G_z$ will be reduced to a single field using the transition from (199) to (200) (twice in the case of $G_z$). To simplify notation (eliminating the subscripts), we will denote the fields resulting from $G_x$, $G_y$, $G_z$ by $a(x)$, $b(y)$, $c(z)$, respectively. Thus, $x$, $y$, $z$ are appropriate choices among the variables $z_i$, $t$, depending on how the transition from (199) to (200) is applied. This reduces the correlation function (198) to
$$\langle 1 | a(x) b(y) c(z) \rangle. \tag{201}$$

Most crucially, however, we make the following simplification: We shall choose the fusion rules in such a way that the fields $a$, $b$, $c$ are level 0, 3 in the Feigin–Fuchs realization, and at most one of the charges will be 3 (in each coordinate). Then, (201) is just a lattice correlation function, for the computation of which we have an algorithm.

To make the calculation correctly, we must keep careful track of signs. When taking a tensor product of super-CFT’s, one must add appropriate signs analogous to the Koszul–Milnor signs in algebraic topology. Now a modular functor of a super-CFT decomposes into an even part and an odd part. Additionally, more than one choice of this decomposition may be possible for the same theory, depending on which bottom states of irreducible modules are chosen as even or odd. The sign of a fusion rule is then determined by whether composition along the pair of pants with given labels preserves parity of states or not. Mathematically, this phenomenon was noticed by Deligne in the case of the determinant line (cf. [50]). (Deligne also noticed that in some cases no consistent choice of signs is possible and a more refined formalism is needed; a single fermion of central charge 1/2 is an example; this is also discussed in [50]. However, this will not be needed here.)

In the case of the $N = 2$ minimal model, there is a choice of parities of ground states of irreducible modules which makes the whole modular functor (all the fusion rules) even: simply choose the parity of $(k, \ell, m)$ to be $k \bmod 2$. We easily see that this is compatible with supersymmetry. Now in this case of completely even modular functor, the signs simplify, and we put
$$Y(u \otimes v, z)(r \otimes s) = (-1)^{\pi(r)\pi(v)}\, Y(u, z)r \otimes Y(v, z)s \tag{202}$$


(where $\pi(u)$ means the parity of $u$). Regarding supersymmetry (if present), an element $H$ of the superconformal algebra also acts on a tensor product by
$$H(u \otimes v) = Hu \otimes v + (-1)^{\pi(H)\pi(u)}\, u \otimes Hv, \tag{203}$$
in particular
$$G^-_{-1/2}(u \otimes v) = (G^-_{-1/2}u) \otimes v + (-1)^{\pi(u)}\, u \otimes (G^-_{-1/2}v). \tag{204}$$
We see that because of (204), the fields $a$, $b$, $c$ may have the form of sums of several terms.

Example 1. Recall that the inner product (more precisely symmetric bilinear form) of labels considered as lattice points is
$$\langle (r_1, s_1, t_1), (r_2, s_2, t_2) \rangle = \frac{r_1 r_2}{15} + \frac{s_1 s_2}{10} - \frac{t_1 t_2}{6}. \tag{205}$$
Recall also (from the definition of energy-momentum tensor) that the weight of the label ground states is calculated by
$$w(r, s, t)_{MM} = \frac{r^2}{30} + w(s, t)_{PF} = \frac{r^2}{30} + \frac{s(s + 2)}{20} - \frac{t^2}{12}. \tag{206}$$
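Formulas (205) and (206) are easy to evaluate exactly with rational arithmetic. The following minimal sketch (the helper names `pairing`, `weight` and `tensor_weight` are assumptions introduced for illustration) implements both formulas and assumes that the weight of a tensor product of ground states is the sum of the weights of its factors.

```python
from fractions import Fraction as F

def pairing(a, b):
    """Symmetric bilinear form (205) on labels (r, s, t)."""
    return F(a[0] * b[0], 15) + F(a[1] * b[1], 10) - F(a[2] * b[2], 6)

def weight(label):
    """Ground-state weight (206) of the label (r, s, t)_MM."""
    r, s, t = label
    return F(r * r, 30) + F(s * (s + 2), 20) - F(t * t, 12)

def tensor_weight(*labels):
    """Weight of a tensor product of ground states (assumed additive)."""
    return sum((weight(label) for label in labels), F(0))
```

For instance, `tensor_weight((3, 3, 3), (2, 2, 2))` evaluates to 1/2, in agreement with $(3, 3, 3) \otimes (2, 2, 2)$ being a chiral primary field of weight 1/2.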

Now we have
$$u = (3, 3, 3) \otimes (2, 2, 2) = (3, 0, 0) \otimes (2, 1, -1). \tag{207}$$
We begin by choosing the field $c$. Compose first $u$ and
$$\bar{u} = (-3, 3, -3) \otimes (-2, 2, -2) = (-3, 0, 0) \otimes (-2, 1, 1). \tag{208}$$
We choose the non-zero $u_n \bar{u}$ of the bottom weight for the fusion rule which adds the lattice charges on the right-hand side of (207), (208). The result is
$$u_{-1/10}\, \bar{u} = (0, 0, 0) \otimes (0, 2, 0). \tag{209}$$
Next, apply $G^-_{-1/2}u$ to (209). Again, we will choose the bottom descendant. Now $G^-_{-1/2}u$ has two summands,
$$(-2, 3, 1) \otimes (2, 1, -1) \tag{210}$$
and
$$(3, 0, 0) \otimes (0, 5, 3)x_{-1}(-3, 1, 3) \tag{211}$$
(the term (211) involves renaming to stay within no-ghost PF labels after composition). Applying (210) to (209) gives bottom descendant
$$(-2, 3, 1) \otimes (2, 3, -1) \text{ of weight } 8/5, \tag{212}$$
applying (211) to (209) gives bottom descendant
$$(3, 0, 0) \otimes (-3, 0, 0) \text{ of weight } 3/5. \tag{213}$$


Since (213) has lower weight than (212), (212) may be ignored, and we can choose
$$c = (3, 0, 0) \otimes (-3, 0, 0). \tag{214}$$
Now again, using the formula (204), we see that in the sets of fields $G_x$, $G_y$ we need one summand (211) and three summands (210) to get to $x_{0,0}$-charge 0. Thus, one of the groups $G_x$, $G_y$ will contain two summands of (210) and the other will contain one. We employ the following convention:

We choose $G_y$ to contain two summands (210) and $G_x$ to contain one summand (210) and one summand (211). (215)

This leads to the following:

We must choose the fields $a$ and $b$ of the same weights and symmetrize the resulting correlation function with respect to $x$ and $y$. (216)

We will choose $b$ first. Again, we will choose the bottom weight (nonzero) descendant of (210) applied to itself renamed as
$$(0, 5, -3)x_{-1}(-2, 0, -2) \otimes (2, 2, 2), \tag{217}$$
which is
$$(-4, 3, -1) \otimes (4, 3, 1). \tag{218}$$
We rename to level 0, which gives
$$b = (0, 5, 3)x_{-1}(-4, 0, 2) \otimes (0, 5, -3)x_{-1}(4, 0, -2), \quad w(b) = 12/5. \tag{219}$$
Then $a$ must have weight 12/5 to satisfy (216). When calculating $a$, however, there is an additional subtlety. This time, we actually have to take into account two summands, from applying (210) to (211) and vice versa, i.e. (211) to (210). In both cases, we must rename to get the desired fusion rule. To this end, we may replace (211) by
$$(3, 0, 0) \otimes (-3, 2, 0). \tag{220}$$
However, when applying (210) and (220) to each other in opposite order, the renamings then do not correspond, resulting in the possibility of a wrong coefficient/sign (since renamings are correct only up to constants which we have not calculated). To reconcile this, we must use exactly the same renamings step by step, related only by applying PF currents. To this end, we may compare the renaming of applying
$$(0, 5, -3)x_{-1}(-2, 0, -2) \otimes (2, 2, 2) \tag{221}$$
to
$$\tfrac{1}{2}\, (3, 0, 0) \otimes (0, 5, -3)x_{-1}(-3, 1, -3) \tag{222}$$

(the $\tfrac{1}{2}$ comes from the PF current $(5, -3)x_{-1}(0, -2)$ which takes $(2, 2)$ to $2(2, 0)$) and
$$(3, 0, 0) \otimes (-3, 2, 0) \tag{223}$$
to
$$(0, 5, -3)x_{-1}(-2, 0, -2) \otimes (2, 1, -1). \tag{224}$$
We see that the bottom descendant of applying (221) to (222) is
$$(0, 5, -3)x_{-1}(1, 0, -2) \otimes (-1)(-1, 3, -1) \tag{225}$$
while the bottom descendant of applying (223) to (224) is
$$(0, 5, -3)x_{-1}(1, 0, -2) \otimes (-1, 3, -1). \tag{226}$$
The expression (225) is the negative of (226). On the other hand, we see that the bottom descendants of applying (210) to (220) and vice versa are the same. This means that we are allowed to apply the names (210) and (220) to each other in either order, but we must take the results with opposite signs. Now (226) has weight 7/5, so to get weight 12/5, we must take the descendant of applying (210) to (220) and vice versa which is of weight 1 higher than the bottom. This gives
$$((-2, 3, 1) - (3, 0, 0))x_{-1}(1, 3, 1) \otimes (-1, 3, -1) + (1, 3, 1) \otimes ((2, 1, -1) - (-3, 2, 0))x_{-1}(-1, 3, -1),$$
which is
$$a = (-5, 3, 1)x_{-1}(1, 3, 1) \otimes (-1, 3, -1) + (1, 3, 1) \otimes (5, -1, -1)x_{-1}(-1, 3, -1). \tag{227}$$
Now the correlation function of $a(x)$, $b(y)$, $c(z)$ given in (227), (219), (214) is an ordinary lattice correlation function. The algorithm for calculating the lattice correlation function of fields $u_i(x_i)$ which are of the form $1_{\lambda_i}(x_i)$ or $\mu_i x_{-1} 1_{\lambda_i}(x_i)$ with the label $1_{\sum \lambda_i}$ is as follows: The correlation function is a multiple of
$$\prod_{i<j} (x_i - x_j)^{\langle \lambda_i, \lambda_j \rangle}$$


by a certain factor, which is a sum over all the ways we may “absorb” any $\mu_i x_{-1}$ factors. Each such factor may either be absorbed by another $\mu_j x_{-1}$, which results in a factor
$$\langle \mu_i, \mu_j \rangle (x_i - x_j)^{-2}, \quad i \neq j \tag{228}$$
or by another lattice label $1_{\lambda_j}$, which results in a factor
$$\langle \mu_i, \lambda_j \rangle (x_i - x_j)^{-1}, \quad i \neq j. \tag{229}$$
Each $\mu_i x_{-1}$ must be absorbed exactly once (and the mechanism (228) is considered as absorbing both $\mu_i$ and $\mu_j$), but one lattice label $1_{\lambda_j}$ may absorb several different $\mu_i x_{-1}$’s via (229). Evaluating the correlation function of $a(x)$, $b(y)$, $c(z)$ with the vacuum using this algorithm, we get
$$\frac{2(y - z)}{(x - z)(x - y)^3}.$$
Symmetrizing with respect to $x$, $y$, we get
$$\frac{2(x - 2z + y)}{(y - z)(x - z)(x - y)^2},$$
(our total correlation function factor), which is non-zero.

In more detail, we can calculate separately the contributions to the correlation function of the two summands (227). For the first summand, the factor before the $\otimes$ sign contributes
$$-\frac{1}{(x - z)(y - x)}, \tag{230}$$
the factor after the $\otimes$ sign contributes
$$\frac{1}{y - x}. \tag{231}$$
Multiplying (230) and (231), we get
$$-\frac{1}{(x - z)(x - y)^2},$$
and symmetrizing with respect to $x$ and $y$,
$$-\frac{x - 2z + y}{(x - z)(x - y)^2 (y - z)}, \tag{232}$$
which is the total contribution of the first summand (227).
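The absorption rules (228) and (229) translate directly into a short program. The sketch below is an illustration under stated assumptions, not the author's implementation: it evaluates only the absorption factor (not the overall product of powers of $x_i - x_j$), uses exact rational sample points instead of symbolic variables, and uses the bilinear form (205). The names `matchings` and `absorption_factor` are hypothetical.

```python
from fractions import Fraction as F

def pairing(a, b):
    """Symmetric bilinear form (205) on labels (r, s, t)."""
    return F(a[0] * b[0], 15) + F(a[1] * b[1], 10) - F(a[2] * b[2], 6)

def matchings(items):
    """Yield (pairs, singles): every way of choosing disjoint pairs from items."""
    if not items:
        yield [], []
        return
    first, rest = items[0], items[1:]
    for pairs, singles in matchings(rest):           # first stays unpaired
        yield pairs, [first] + singles
    for k, partner in enumerate(rest):               # first pairs with a later item
        for pairs, singles in matchings(rest[:k] + rest[k + 1:]):
            yield [(first, partner)] + pairs, singles

def absorption_factor(fields):
    """fields: list of (mu, lam, x), with mu = None for a pure label 1_lam.
    Sums over all ways of absorbing each mu exactly once."""
    with_mu = [i for i, (mu, _, _) in enumerate(fields) if mu is not None]
    total = F(0)
    for pairs, singles in matchings(with_mu):
        term = F(1)
        for i, j in pairs:                           # rule (228)
            term *= pairing(fields[i][0], fields[j][0]) / (fields[i][2] - fields[j][2]) ** 2
        for i in singles:                            # rule (229)
            mu_i, _, x_i = fields[i]
            term *= sum((pairing(mu_i, lam) / (x_i - x_j)
                         for j, (_, lam, x_j) in enumerate(fields) if j != i), F(0))
        total += term
    return total

# Sample points (arbitrary distinct rationals) and the first summand of (227).
x, y, z = F(2), F(5), F(11)
before = absorption_factor([((-5, 3, 1), (1, 3, 1), x),
                            ((0, 5, 3), (-4, 0, 2), y),
                            (None, (3, 0, 0), z)])
after = absorption_factor([(None, (-1, 3, -1), x),
                           ((0, 5, -3), (4, 0, -2), y),
                           (None, (-3, 0, 0), z)])
```

At the sample points, `before` agrees with (230) and `after` agrees with (231), which gives a numerical consistency check on the first summand.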


For the second summand (227), the factor after $\otimes$ contributes
$$-\frac{2}{(x - y)^2} - \frac{1}{(x - z)(y - x)}, \tag{233}$$
and the factor before the $\otimes$ sign contributes
$$\frac{1}{y - x}. \tag{234}$$
Multiplying, we get
$$\frac{x - 2z + y}{(x - z)(x - y)^3}.$$
After symmetrizing with respect to $x$, $y$, we get also (232), so both summands of (227) contribute equally to the correlation function.

Example 2. In this example, we keep the same $a(x)$ and $b(y)$ as in the previous example, but change $c(z)$. To select $c(z)$, this time we start with $G^-_{-1/2}u$ represented as
$$(3, 0, 0) \otimes (0, 5, -3)x_{-1}(-3, 1, -3) + C(-2, 3, 1) \otimes (2, 1, -1) \tag{235}$$
($C$ is a non-zero normalization constant which we do not need to evaluate explicitly), which we apply to $\bar{u}$ represented as
$$(-3, 0, 0) \otimes (-2, 1, 1). \tag{236}$$
From the two summands (235), we get bottom descendants
$$(0, 0, 0) \otimes (-5, 2, -2) \text{ of weight } 9/10 \tag{237}$$
and
$$(-5, 3, 1) \otimes (0, 2, 0) \text{ of weight } 19/10. \tag{238}$$
Therefore, we may ignore (238) and select (237) only. Now applying (237) to $u$ written as
$$(3, 0, 0) \otimes (2, 1, -1), \tag{239}$$
we select a descendant of weight 1 above the label $(3, 0, 0) \otimes (-3, 3, -3)$. Recalling from the conjugate of (191) that weight 1 states above the label $(3, -3)_{PF} = (0, 0)_{PF}$ must vanish, we get
$$c = (3, 0, 0) \otimes (1, 0, 0)x_{-1}(-3, 0, 0) \tag{240}$$


(up to a non-zero multiplicative constant). This gives the correlation function
$$\frac{(x - 2z + y)^2}{5(y - z)^2 (x - z)^2 (x - y)^2}. \tag{241}$$
Let us write again in more detail the contributions of the two summands (227). For the first summand, the contribution of the factor before $\otimes$ is again (230) (it remains unchanged), and the contribution of the factor after $\otimes$ is
$$\frac{-y - 3z + 4x}{15(y - z)(x - z)(x - y)}. \tag{242}$$
Multiplying, we get
$$\frac{-y - 3z + 4x}{15(y - z)(x - z)^2 (x - y)^2},$$
and symmetrizing with respect to $x$, $y$,
$$-\frac{y^2 + 6yz - 6z^2 - 8xy + 6zx + x^2}{15(y - z)^2 (x - z)^2 (x - y)^2}. \tag{243}$$
This is the total contribution of the first summand (227). For the second summand (227), the coordinate before $\otimes$ contributes again (234), and the coordinate after $\otimes$ contributes
$$\frac{2(-yx - 3yz - 3xz + 3z^2 + 2x^2 + 2y^2)}{15(y - z)(x - z)^2 (x - y)^2}. \tag{244}$$
Multiplying, we get
$$-\frac{2(-yx - 3yz - 3xz + 3z^2 + 2x^2 + 2y^2)}{15(y - z)(x - z)^2 (x - y)^3}.$$
Symmetrizing with respect to $x$, $y$, we get
$$\frac{2(-yx - 3yz - 3xz + 3z^2 + 2x^2 + 2y^2)}{15(y - z)^2 (x - z)^2 (x - y)^2} \tag{245}$$
which is the total contribution of the second summand (227). Adding the contributions (243) and (245) (which are not equal in this case) gives (241).

Remark. When $u$ is, say, a cc field of weight (1/2, 1/2) in an $N = (2, 2)$ CFT, then we have a CPT-conjugate aa field $v$. Physical CFT’s require a real structure, and the fields $u$, $v$ are not real. As already noted in the Remark at the end of Sec. 4 for the case of the free field theory, deforming along the field $u$ (or $v$), which is the case considered in this section, breaks real structure of the CFT. Truly physical infinitesimal deformations therefore occur not along the fields $u$, $v$ but the field
$$u + v. \tag{246}$$


(This, of course, explains why the dimension of the space of, say, infinitesimal cc-deformations is the dimension of the space of deformations of the complex structure, and not the double of that number.) In the literature, the contribution of the CPT-conjugate is often ignored (cf. [31, formulas (4.5) and (4.7)]).

Nevertheless, from the point of view of obstruction theory, considering (246) and the original cc field $u$ should be equivalent. An argument can be sketched as follows: Let $a$ be a non-zero complex number. Then replacing $u$ by $au$, (246) becomes
$$au + \bar{a}v. \tag{247}$$
Since the obstructions are homogeneous, instead of (247), we may consider
$$u + bv, \quad b = \frac{\bar{a}}{a}. \tag{248}$$
Then $b$ is an arbitrary element of the unit circle $S^1$. Thus, even when we restrict to real deformations, the obstruction should vanish for every field (248) with $b \in S^1$. But the chiral part of the obstruction is holomorphic in $b$, so vanishing for all $b \in S^1$ implies vanishing for $b = 0$ and hence if all of the real deformations along (247) are unobstructed, so is the deformation along $u$.

While this argument is compelling, we learned in Sec. 4 that when deforming along fields of the form (246), regularization along the deformation parameter is required. Therefore, to make the argument precise in the present setting, we would either need to develop a general regularization scheme to the same order to which obstructions vanish, or compute the regularization parameters explicitly in the present case. Working this out would be a substantial improvement of the present result.

The remainder of this section is dedicated to comments on possible perturbative deformations along the fields $(1^5, 1^5)$, $(-1^5, 1^5)$ (the exponent here denotes repetition of the field in a tensor product, and 1 again stands for $(1, 1, 1)_{MM}$, etc.). We will present some evidence (although not proof) that the obstruction might vanish in this case. The results we do obtain will prove useful in the next section.

Such a conjecture would have a geometric interpretation. In Gepner’s conjectured interpretation of the model we are investigating as the $\sigma$-model of the Fermat quintic, the field (175) corresponds to the dilaton. It seems reasonable to conjecture that the dilaton deformation should exist, since the theory should not choose a particular global size of the quintic. Similarly, the field (174) can be explained as the dilaton on the mirror manifold of the quintic, which should correspond to deformations of complex structure of the form
$$x^5 + y^5 + z^5 + t^5 + u^5 + \lambda xyztu = 0. \tag{249}$$
Therefore, our analysis predicts that the (body of) the moduli space of $N = 2$-supersymmetric CFT’s containing the Gepner model is 2-dimensional, and contains $\sigma$-models of the quintics (249), where the metric is any multiple of the metric for which the $\sigma$-model exists (which is unique up to a scalar multiple).


To discuss possible deformations along the fields $(1^5, 1^5)$, $(-1^5, 1^5)$, let us first review a simpler case, namely the coset construction: In a VOA $V$, we set, for $u \in V$ homogeneous,
$$Y(u, z) = \sum_{n \in \mathbb{Z}} u_{-n+w(u)} z^n, \qquad Y_-(u, z) = \sum_{n<0} u_{-n+w(u)} z^n, \qquad Y_+(u, z) = Y(u, z) - Y_-(u, z). \tag{250}$$
The coset model of $u$ is
$$V_u = \langle v \in V \mid Y_-(u, z)v = 0 \text{ and } Y_+(u, z)v \text{ involves only integral powers of } z \rangle. \tag{251}$$
Then $V_u$ is a sub-VOA of $V$. To see this, recall that
$$Y(u, z)Y(v, t)w = Y_+(u, z)Y_-(v, t)w + Y_+(v, t)Y_-(u, z)w + Y(Y_-(u, z - t)v)w. \tag{252}$$
When $v, w \in V_u$, the last two terms of the right-hand side of (252) vanish, which proves that $Y(v, t)w \in V_u[[t]][t^{-1}]$.

Now in the case of $N = 2$-super-VOA’s, let us stick to the NS sector. Then (250) still correctly describes the “body” of a vertex operator. The complete vertex operator takes the form
$$Y(u, z, \theta^+, \theta^-) = \sum_{n \in \mathbb{Z}} \left( u_{-n+w(u)} z^n + u^+_{-n-1/2+w(u)} z^n \theta^+ + u^-_{-n-1/2+w(u)} z^n \theta^- + u^\pm_{-n-1+w(u)} z^n \theta^+ \theta^- \right). \tag{253}$$
We still define $Y_-(u, z, \theta^+, \theta^-)$ to be the sum of terms involving $n < 0$, and $Y_+(u, z, \theta^+, \theta^-)$ the sum of the remaining terms. The compatibility relations for an $N = 2$-super-VOA are
$$D^+ Y(u, z, \theta^+, \theta^-) = Y(G^+_{-1/2}u, z, \theta^+, \theta^-), \qquad D^- Y(u, z, \theta^+, \theta^-) = Y(G^-_{-1/2}u, z, \theta^+, \theta^-), \tag{254}$$
where
$$D^+ = \frac{\partial}{\partial \theta^+} + \theta^+ \frac{\partial}{\partial z}, \qquad D^- = \frac{\partial}{\partial \theta^-} + \theta^- \frac{\partial}{\partial z}. \tag{255}$$
Now using (252) again, for $u \in V$ homogeneous, we will have a sub-$N = 2$-VOA $V_u$ defined by (251), which is further endowed with the operators $G^-_{-1/2}$, $G^+_{-1/2}$.


In the case of lack of locality, only a weaker conclusion holds. Assume first we have “abelian” fusion rules in the same sense as in Remark 2 after Theorem 1.

Lemma 4. Suppose we have fields $u_i$, $i = 0, \ldots, n$ such that for $i > j$,
$$Y(u_i, z)u_j = \sum_{n \geq 0} (u_i)_{-n-\alpha_{ij}-w(u_i)}\, z^{n+\alpha_{ij}} \tag{256}$$
with $0 \leq \alpha_{ij} < 1$. Consider further points $z_0 = 0, z_1, \ldots, z_n$. Then
$$\prod_{n \geq i > j \geq 0} (z_i - z_j)^{-\alpha_{ij}}\; Y(u_n, z_n) \cdots Y(u_1, z_1)u_0 \tag{257}$$
where each $(z_i - z_j)^{-\alpha_{ij}}$ is expanded in $z_j$ is a power series whose coefficients involve nonnegative integral powers of $z_0, \ldots, z_n$ only.

Proof. Induction on $n$. Assuming the statement is true for $n - 1$, note that by assumption, (257), when coupled to $w \in V^\vee$ of finite weight, is a meromorphic function in $z_n$ with possible singularities at $z_0 = 0, z_1, \ldots, z_{n-1}$. Thus, (257) can be expanded at its singularities, and is equal to
$$\begin{aligned}
\prod_{n-1 \geq i > j \geq 0} (z_i - z_j)^{-\alpha_{ij}} \cdot \Bigg[\, & z_n^{-\alpha_{n0}} \bigg( \operatorname{expand}_{z_n} \Big( \prod_{j \neq 0} (z_n - z_j)^{-\alpha_{nj}} \Big)\, Y(u_{n-1}, z_{n-1}) \cdots Y(u_1, z_1) Y(u_n, z_n)u_0 \bigg)_{z_n < 0} \\
& + \sum_{i=1}^{n-1} (z_n - z_i)^{-\alpha_{ni}} \bigg( \operatorname{expand}_{(z_n - z_i)} \Big( \prod_{n-1 \geq j \neq i} (z_n - z_j)^{-\alpha_{nj}} \Big) \cdot Y(u_{n-1}, z_{n-1}) \cdots Y(u_{i+1}, z_{i+1})\, Y(Y(u_n, z_n - z_i)u_i, z_i) \\
& \qquad\qquad \cdot\, Y(u_{i-1}, z_{i-1}) \cdots Y(u_1, z_1)u_0 \bigg)_{(z_n - z_i) < 0} \\
& + z_n^{-\alpha_{n0} - \cdots - \alpha_{n,n-1}} \bigg( \operatorname{expand}_{1/z_n} \Big( \prod_{j=1}^{n-1} \Big(1 - \frac{z_j}{z_n}\Big)^{-\alpha_{nj}} \Big) \cdot Y(u_n, z_n) Y(u_{n-1}, z_{n-1}) \cdots Y(u_1, z_1)u_0 \bigg)_{z_n \geq 0} \Bigg].
\end{aligned} \tag{258}$$


In (258), $\operatorname{expand}_?(?)$ means that the argument is expanded in the variable given as the subscript. The symbol $(?)_{?<0}$ (respectively, $(?)_{?\geq 0}$) means that we take only terms in the argument (which is a power series in the subscript) which involve negative (respectively, non-negative) powers of the subscript. In any case, by the assumption of the lemma, all summands of (258) vanish with the exception of the last, which is the induction step.

In the case of non-abelian fusion rules, an analogous result unfortunately fails. Assume for simplicity that $u_0 = \cdots = u_n$ holds in (258) with
$$0 \leq \alpha^F_{ij} < 1 \text{ true for any fusion rule } F. \tag{259}$$
We would like to conclude that the correlation function
$$\langle v, u(z_n) \cdots u(z_1)u \rangle \tag{260}$$
involves only non-negative powers of $z_i$ when expanded in $z_1, \ldots, z_n$ (in this order). Unfortunately, this is not necessarily the case. Note that we know that (260) converges to 0 when two of the arguments $z_i$ approach each other while the others remain separate. However, this does not imply that the function (260) converges to 0 when three or more of the arguments approach each other simultaneously. To give an example, let us consider the solution of the Fuchsian differential equation on $\mathbb{P}^1 - \{0, t, \infty\}$
$$y' = \left( \frac{A}{z} + \frac{B}{z - t} \right) y \tag{261}$$
for square matrices $A$, $B$ (with $t \neq 0$ constant). Since the solution $y$ has bounded singularities, multiplying $y$ by $z^m (z - t)^n$ for large enough integers $m$, $n$ makes the resulting function $Y$ converge to 0 when $z$ approaches 0 or $t$. If, however, the expansion of $Y$ at $\infty$ involved only non-negative powers of $z$, it would have only finitely many terms, and hence abelian monodromy. It is well known, however, that this is not necessarily the case. In fact, any irreducible monodromy occurs for a solution of Eq. (261) for suitable matrices $A$, $B$ (cf. [7]).

Therefore, the following result may be used as evidence, but not proof, of the exponentiability of deformations along $(1^5, 1^5)$ and $(1^5, -1^5)$.

Lemma 5. The assumption (259) is satisfied for the field $u = G^-_{-1/2}((1, 1, 1), \ldots, (1, 1, 1))$ in the 5-fold tensor product of the $N = 2$ minimal model of central charge 9/5.

Before proving this, let us state the following consequence: Indeed, assuming Lemma 5 and setting $w = ((1, 1, 1), \ldots, (1, 1, 1))$, the obstruction is
$$\langle w' | (G^-_{-1/2}w)(z_n) \cdots (G^-_{-1/2}w)(z_1)w \rangle. \tag{262}$$


(The antichiral primary case is analogous.) But using the fact that

$$G^-_{-1/2}\big((G^-_{-1/2}w)(z_n) \cdots (G^-_{-1/2}w)(z_1)\, w\big) = (G^-_{-1/2}w)(z_n) \cdots (G^-_{-1/2}w)(z_1)\, G^-_{-1/2}w$$

along with injectivity of $G^-_{-1/2}$ on chiral primaries of weight 1/2, we see that the non-vanishing of (262) implies the non-vanishing of (260) with $u = G^-_{-1/2}w$ for some $v$ of weight 1, which would contradict Lemma 5.

Proof of Lemma 5. We have

$$G^-_{-1/2}(1,1,1) = (-4,1,-1). \qquad (263)$$

We have in our lattice

$$(1,1,1)\cdot(1,1,1) = 1/15 + 1/10 - 1/6 = 0,$$
$$(-4,1,-1)\cdot(-4,1,-1) = 16/15 + 1/10 - 1/6 = 1, \qquad (264)$$
$$(1,1,1)\cdot(-4,1,-1) = -4/15 + 1/10 + 1/6 = 0,$$

so we see that with the fusion rules which stay on levels 1, 2, the vertex operators $u(z)u$ have only non-singular terms. However, this is not sufficient to verify (259). In effect, when we use the fusion rule which goes to levels 0, 3, $(1,1,1)(z)(-4,1,-1)$ and $(-4,1,-1)(z)(1,1,1)$ will have most singular term $z^{-2/5}$, so when we write again 1 instead of $(1,1,1)$ and $G$ instead of $G^-_{-1/2}(1,1,1)$, with the least favorable choice of fusion rules, it seems $u(z)u$ can have singular term $z^{-4/5}$, coming from the expressions

$$(G1111)(z)(1G111) \qquad (265)$$

and

$$(1G111)(z)(G1111) \qquad (266)$$

(and expressions obtained by permuting coordinates). Note that with other combinations of fusion rules, various other singular terms can arise with $z^{>-4/5}$. Now the point is, however, that we will show that with any choice of fusion rule, the most singular terms of (265) and (266) come with opposite signs and hence cancel out. Since the $z$ exponents of other terms are higher by an integer, this is all we need.
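The three lattice products in (264) can be reproduced with exact rational arithmetic. The sketch below assumes that the bilinear form on labels $(k, \ell, m)$ is $kk'/15 + \ell\ell'/10 - mm'/6$; this form is read off from the three printed evaluations and is not quoted from the text, and the function name `pair` is purely illustrative.

```python
from fractions import Fraction


def pair(u, v):
    """Assumed bilinear form kk'/15 + ll'/10 - mm'/6 on labels (k, l, m)."""
    (k1, l1, m1), (k2, l2, m2) = u, v
    return Fraction(k1 * k2, 15) + Fraction(l1 * l2, 10) - Fraction(m1 * m2, 6)


one = (1, 1, 1)       # the label written "1" in the text
g_one = (-4, 1, -1)   # G^-_{-1/2}(1, 1, 1) as given in (263)

# The three products of (264), in the order they are printed.
products = {
    "1 . 1": pair(one, one),
    "G . G": pair(g_one, g_one),
    "1 . G": pair(one, g_one),
}

for name, value in products.items():
    print(name, "=", value)
```

Exact fractions are used instead of floats so that the vanishing of the first and third products is an identity rather than a rounding coincidence.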


I. Kriz

By the Koszul–Milnor sign rules for the minimal model, 1 is odd and $G$ is even, so

$$(G \otimes 1)(z)(1 \otimes G) = -G(z)1 \otimes 1(z)G, \qquad (267)$$
$$(1 \otimes G)(z)(G \otimes 1) = 1(z)G \otimes G(z)1. \qquad (268)$$

We have

$$1(z)G = (1,1,1)(z)(-4,2,2) = M(-3,0,0)z^{-2/5} + \text{HOT},$$
$$G(z)1 = (-4,2,2)(z)(1,1,1) = N(-3,0,0)z^{-2/5} + \text{HOT},$$

with some non-zero coefficients $M$, $N$, so the bottom descendants of (267) and (268) are $-MN(-3,0,0) \otimes (-3,0,0)$ and $MN(-3,0,0) \otimes (-3,0,0)$, respectively, so they cancel out, as required.

7. The Case of the Fermat Quartic K3-Surface

The Gepner model of the K3 Fermat quartic is an orbifold, analogous to (162) with 5 replaced by 4, of the 4-fold tensor product of the level 2 N = 2 minimal model, although one must be careful about certain subtleties arising from the fact that the level is even. The model has central charge 6. The level 2 PF model is the 1-dimensional fermion (of central charge 1/2), viewed as a bosonic CFT. As such, that model has 3 labels: the NS label with integral weights (denoted by $NS$), the NS label with weights in $\mathbb{Z} + \frac{1}{2}$ (denoted by $NS'$), and the R label (denoted by $R$). The fusion rules are given by the fact that $NS$ is the unit label,

$$NS' * NS' = NS, \quad NS' * R = R, \quad R * R = NS + NS'. \qquad (269)$$

We shall again find it useful to use the free field realization of the N = 2 minimal model, which we used in the last two sections. In the present case, the theory is a subquotient of a lattice theory spanned by

$$\left(\frac{1}{\sqrt{2}}, 0, 0\right), \quad \left(\frac{1}{\sqrt{8}}, \frac{i}{\sqrt{8}}, \frac{i}{2}\right), \quad \left(\frac{1}{\sqrt{2}}, 0, i\right).$$


Analogously as before, we write $(k, \ell, m)$ for

$$\left(\frac{k}{\sqrt{8}}, \frac{\ell i}{\sqrt{8}}, \frac{m i}{2}\right).$$

The conformal vector is

$$\frac{1}{2}x_{0,-1}^2 - \frac{1}{2}x_{1,-1}^2 + \frac{i}{2\sqrt{2}}x_{1,-2} + \frac{1}{2}x_{2,-1}^2.$$

The superconformal vectors are

$$G^-_{-3/2} = (0,4,2)x_{-1}(4,0,2), \qquad G^+_{-3/2} = (0,4,-2)x_{-1}(-4,0,-2).$$

The fermionic labels will again be denoted by omitting the first coordinate: $(\ell, m)_F$. The fermionic identifications are:

$$(2,2)_F \sim (2,-2)_F \sim (0,0)_F, \qquad (1,1)_F \sim (1,-1)_F. \qquad (270)$$

A priori the lattice $\langle\sqrt{8}\rangle$ has 8 labels $k/\sqrt{8}$, $0 \le k \le 7$, but the $G^-$ definition together with (270) forces the MM identification of labels

$$(1,1,1) \sim (-3,1,-1) \sim (-3,1,1).$$

The labels of the level 2 MM are therefore

$$(2k, 0, 0), \quad 0 \le k \le 3,$$
$$(2k+1, 1, 1), \quad 0 \le k \le 1.$$

The fusion rules are

$$(k,0,0) * (\ell,0,0) = (k+\ell,0,0),$$
$$(k,0,0) * (\ell,1,1) = (k+\ell,1,1),$$
$$(k,1,1) * (\ell,1,1) = (k+\ell,0,0) + (k+\ell+4,0,0),$$

so the Verlinde algebra is simply

$$\mathbb{Z}[a,b]/(a^4 = 1,\ b^2 = a + a^3,\ a^2 b = b)$$

where $a = (2,0,0)$, $b = (1,1,1)$.

One subtlety of the even level MM in comparison with odd level concerns signs. Since the $k$-coordinates of $G^-$ and $G^+$ are even, we can no longer use the $k$-coordinate of an element as an indication of parity ($u$ and $G^\pm u$ cannot have the same parity). Because of this, we must introduce odd fusion rules. There are various ways of doing this. For example, let the bottom states of $(2k,0,0)$, $(1,1,1)$ and $(-1,1,1)$ be even. Then the fusion rules on level $\ell = 0$ are even, as are the fusion rules combining levels 0 and 1. The fusion rules

$$(1,1,1) * (1,1,1) \to (2,0,0), \qquad (1,1,1) * (-1,1,1) \to (2,0,0)$$

are even; the remaining fusion rules (adding 4 to the $k$-coordinate on the right-hand side) are odd.

Now the c fields of the MM are $(0,0,0)$, $(1,1,1)$, $(2,2,2)$ and the a fields are $(0,0,0)$, $(-1,1,-1)$, $(-2,2,-2)$. If we denote by $H_{1,2k+1}$ the state space of label $(2k+1,1,1)$, $0 \le k \le 1$, and by $H_{0,2k}$ the state space of label $(2k,0,0)$, $0 \le k \le 3$, then the state space of the 4-fold tensor product of the level 2 minimal model is

$$\widehat{\bigotimes_{i=0}^{3}} \left( \bigoplus_{k_i=0}^{3} H_{0,2k_i} \hat{\otimes} H^*_{0,2k_i} \ \oplus\ \bigoplus_{k_i=0}^{1} H_{1,2k_i+1} \hat{\otimes} H^*_{1,2k_i+1} \right). \qquad (271)$$
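Returning to the Verlinde algebra stated above, the three relations $a^4 = 1$, $b^2 = a + a^3$, $a^2 b = b$ can be checked mechanically against the fusion rules. The following is a minimal sketch, not the author's method: it assumes that labels $(k,0,0)$ are taken with $k$ modulo 8 and labels $(k,1,1)$ with $k$ modulo 4 (our reading of the identifications (270)), and the names `fuse` and `mul` are hypothetical helpers.

```python
from collections import Counter
from itertools import product

# ("e", k) stands for the label (k, 0, 0), k even, taken mod 8 (assumption).
# ("o", k) stands for the label (k, 1, 1), k odd, taken mod 4 (assumption).
EVEN = [("e", k) for k in (0, 2, 4, 6)]
ODD = [("o", k) for k in (1, 3)]
LABELS = EVEN + ODD


def fuse(x, y):
    """Fusion of two labels, following the three rules quoted in the text."""
    (tx, kx), (ty, ky) = x, y
    s = kx + ky
    if tx == "e" and ty == "e":
        return Counter({("e", s % 8): 1})
    if tx == "o" and ty == "o":
        return Counter({("e", s % 8): 1, ("e", (s + 4) % 8): 1})
    return Counter({("o", s % 4): 1})


def mul(p, q):
    """Bilinear extension of fuse to formal integer combinations of labels."""
    out = Counter()
    for (x, cx), (y, cy) in product(p.items(), q.items()):
        for z, cz in fuse(x, y).items():
            out[z] += cx * cy * cz
    return out


unit = Counter({("e", 0): 1})
a = Counter({("e", 2): 1})   # a = (2, 0, 0)
b = Counter({("o", 1): 1})   # b = (1, 1, 1)

a2 = mul(a, a)
a3 = mul(a2, a)
a4 = mul(a3, a)

relations_hold = (a4 == unit) and (mul(b, b) == a + a3) and (mul(a2, b) == b)

# Associativity on all triples of basis labels.
associative = all(
    mul(mul(Counter({x: 1}), Counter({y: 1})), Counter({z: 1}))
    == mul(Counter({x: 1}), mul(Counter({y: 1}), Counter({z: 1})))
    for x, y, z in product(LABELS, repeat=3)
)

print(relations_hold, associative)
```

The six basis labels match the rank of the quotient ring, which has basis $1, a, a^2, a^3, b, ab$.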

The Gepner model is an orbifold with respect to the $\mathbb{Z}/4$-group which acts by $i^{\ell}$ on products in (271) where the sum of the subscripts $2k_i$ or $2k_i + 1$ is congruent to $\ell$ modulo 4. Therefore, the state space of the Gepner model is the sum over $\beta \in \mathbb{Z}/4$ and $\alpha_i \in \mathbb{Z}/4$,

$$\sum_{i=0}^{3} \alpha_i = 0 \in \mathbb{Z}/4,$$

of

$$\widehat{\bigotimes_{i=0}^{3}} \left( \bigoplus_{2k_i \equiv \alpha_i \bmod 4} H_{0,2k_i} \hat{\otimes} H^*_{0,2k_i+2\beta} \ \oplus \bigoplus_{2k_i+1 \equiv \alpha_i} H_{1,2k_i+1} \hat{\otimes} H^*_{1,2k_i+1+2\beta} \right). \qquad (272)$$

It is important to note that each summand (272) in which all the factors have the "odd" subscripts $1, 2k_i + 1$ occurs twice in the orbifold state space.

If we write again $\ell$ for $(\ell, \ell, \ell)$ and $-\ell$ for $(-\ell, \ell, -\ell)$, then the critical cc fields are chirally symmetric permutations of

$$(2,2,0,0), (2,2,0,0)$$
$$(2,1,1,0), (2,1,1,0) \qquad (273)$$
$$(1,1,1,1), (1,1,1,1).$$

Note that applying all the possible permutations to the fields (273), we obtain only 19 fields, while there should be 20, which is the rank of $H^{1,1}(X)$ for a K3-surface $X$. However, this is where the preceding comment comes into play: the last field (273) corresponds to a term (272) where all the factors have odd subscripts, and hence there are two copies of that summand in the model, so the last field (273) occurs "twice". By the fact that the Fermat quartic Gepner model has N = (4, 4) worldsheet supersymmetry (see e.g. [9, 54] and references therein), the spectral flow guarantees


that the number of critical ac fields is the same as the number of critical cc fields. Concretely, the critical ac fields are the permutations of

$$(0,0,-2,-2), (2,2,0,0)$$
$$(0,-1,-1,-2), (2,1,1,0) \qquad (274)$$
$$(-1,-1,-1,-1), (1,1,1,1).$$

As above, the last field (274) occurs in 2 copies, thus the rank of the space of critical ac fields is also 20.

We wish to investigate whether infinitesimal deformations along the fields (273), (274) exponentiate perturbatively. To this end, let us first see when the "coset-type scenario" occurs. This is sufficient to prove convergence in the present case. This is due to the fact that in the present theory, there is an even number of fermions, in which case it is well known by the boson-fermion correspondence that the correlation functions follow abelian fusion rules, and therefore Lemma 4 applies. To prove that the coset scenario occurs, let us look at the chiral c fields $u = (2,2,0,0), (2,1,1,0), (1,1,1,1)$ and study the singularities of

$$(G^-_{-1/2}u)(z)(G^-_{-1/2}u). \qquad (275)$$
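The count of critical fields quoted above (19 distinct permutations, raised to 20 by the doubled all-odd summand) is a small combinatorial exercise. The sketch below simply enumerates the distinct permutations of the three chiral patterns in (273); the extra `+ 1` encodes the statement in the text that the field built on (1, 1, 1, 1) occurs twice, and is taken from the text rather than derived by the code.

```python
from itertools import permutations

# Chiral parts of the critical cc fields listed in (273).
patterns = [(2, 2, 0, 0), (2, 1, 1, 0), (1, 1, 1, 1)]

# Distinct rearrangements of each pattern over the four tensor factors.
counts = {p: len(set(permutations(p))) for p in patterns}

naive_total = sum(counts.values())

# Per the text, the all-odd field (1, 1, 1, 1) occurs in two copies.
total = naive_total + 1

print(counts, naive_total, total)
```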

By Lemma 4, if (275) are non-singular, the obstructions vanish. The inner product is

$$\langle (k,\ell,m), (k',\ell',m') \rangle = \frac{kk'}{8} + \frac{\ell\ell'}{8} - \frac{mm'}{4},$$

$$w(k,\ell,m) = \frac{k^2}{16} + \frac{\ell(\ell+2)}{16} - \frac{m^2}{8}.$$

Next,

$$(2,2,2) = (2,0,0),$$
$$G^-_{-1/2}(2,0,0) = (0,4,-2)x_{-1}(-2,0,-2),$$
$$G^-_{-1/2}(1,1,1) = (-3,1,-1).$$

If we again replace, to simplify notation, the symbol $G^-_{-1/2}$ by $G$, then we have:

The most singular $z$-power of $2(z)2$ is $\frac{2\cdot 2}{8} = \frac{1}{2}$. (276)

The most singular $z$-power of $G2(z)2$ is $\frac{-2\cdot 2}{8} = -\frac{1}{2}$. (277)

For $G2(z)G2$, rename the rightmost $G2$ as $(-2,2,0)$. We get:

The most singular $z$-power of $G2(z)G2$ is $-1 + \frac{(-2)\cdot(-2)}{8} = -\frac{1}{2}$. (278)

The most singular $z$-power of $1(z)1$ is 0 for the even fusion rule and 1/2 for the odd fusion rule. (279)

The most singular $z$-power of $G1(z)1$ is $\frac{-3}{8} + \frac{1}{8} + \frac{1}{4} = 0$ for the even fusion rule and $-1/2$ for the odd fusion rule. (280)

The most singular $z$-power of $G1(z)G1$ is $\frac{9}{8} + \frac{1}{8} - \frac{1}{4} = 1$ for the even fusion rule and 3/2 for the odd fusion rule. (281)
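The even-fusion-rule exponents in (276), (277) and (279) to (281) are plain evaluations of the inner product, so they can be re-derived with exact fractions. This is a sketch under the assumption that the inner product and weight are $kk'/8 + \ell\ell'/8 - mm'/4$ and $k^2/16 + \ell(\ell+2)/16 - m^2/8$ as reconstructed from the printed evaluations; the names `pair` and `weight` are illustrative only. The odd-rule values and the extra $-1$ in (278) are not modelled.

```python
from fractions import Fraction


def pair(u, v):
    """Assumed inner product kk'/8 + ll'/8 - mm'/4 on labels (k, l, m)."""
    (k1, l1, m1), (k2, l2, m2) = u, v
    return Fraction(k1 * k2, 8) + Fraction(l1 * l2, 8) - Fraction(m1 * m2, 4)


def weight(u):
    """Assumed conformal weight k^2/16 + l(l+2)/16 - m^2/8 of a label."""
    k, l, m = u
    return Fraction(k * k, 16) + Fraction(l * (l + 2), 16) - Fraction(m * m, 8)


two = (2, 0, 0)        # the field "2", identified with (2, 2, 2)
g_two = (-2, 2, 0)     # G2 after the renaming used in the text
one = (1, 1, 1)        # the field "1"
g_one = (-3, 1, -1)    # G1

exponents = {
    "(276) 2(z)2": pair(two, two),
    "(277) G2(z)2": pair(g_two, two),
    "(279) 1(z)1, even": pair(one, one),
    "(280) G1(z)1, even": pair(g_one, one),
    "(281) G1(z)G1, even": pair(g_one, g_one),
}

for name, value in exponents.items():
    print(name, value)
```

As a consistency check on the identification $(2,2,2) = (2,0,0)$, both labels receive the same weight under the assumed formula.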

One therefore sees that for the field $u = (1,1,1,1)$, (275) is non-singular: in the case of the least favorable (odd) fusion rules, the most singular $z$-power appears to be $-1$, coming from

$$(G1, 1, 1, 1) \otimes (1, G1, 1, 1). \qquad (282)$$

However, this term cancels with

$$(1, G1, 1, 1) \otimes (G1, 1, 1, 1). \qquad (283)$$

To see this, note that the last two coordinates do not enter the picture. We have an odd (respectively, even) pair of pants $P_-$ (respectively, $P_+$) in the MM with input 1, 1. They add up to a pair of pants in MM $\otimes$ MM. On (282), we have pairs of pants $P_i \in \{P_-, P_+\}$,

$$P(G1 \otimes 1) \otimes (1 \otimes G1) = (P_1 \otimes P_2)(G1 \otimes 1) \otimes (1 \otimes G1) = sP_1(G1 \otimes 1) \otimes P_2(1 \otimes G1) \qquad (284)$$

where $s$ is the sign of permuting $P_2$ past $G1 \otimes 1$. Here we use the fact that 1 is even. On the other hand,

$$P(1 \otimes G1) \otimes (G1 \otimes 1) = (P_1 \otimes P_2)(1 \otimes G1) \otimes (G1 \otimes 1) = -sP_1(1 \otimes G1) \otimes P_2(G1 \otimes 1) \qquad (285)$$

(as $G1$ is odd, there is a minus sign from permuting it with itself). From (73), the lowest terms of $P_i(1 \otimes G1)$ and $P_i(G1 \otimes 1)$ have opposite signs, so (284) and (285) cancel out.

The situation is simpler for $u = (2,2,0,0)$, in which case all the fusion rules are even, and $(G2 \otimes 2)(z)(2 \otimes G2)$ appears to have most singular term $z^{-1}$. However, again note that 2 is even and $G2$ is odd, so

$$(G2 \otimes 2)(z)(2 \otimes G2) = G2(z)2 \otimes 2(z)G2, \qquad (286)$$

while

$$(2 \otimes G2)(z)(G2 \otimes 2) = -2(z)G2 \otimes G2(z)2. \qquad (287)$$

Renaming $G2$ as $(-2,2,0)$, the bottom descendant of both $G2(z)2$ and $2(z)G2$ is $(0,2,0)$ with some coefficient, so (286) and (287) cancel out. Thus, the deformations along the first and last fields of (273) and (274) exponentiate.


The field $u = (2,1,1,0)$ is difficult to analyse, since in this case (275) has singular channels and the coset-type scenario does not occur. We do not know how to calculate the obstruction directly in this case. It is, however, possible to present an indirect argument why these deformations exist. In one precise formulation, the boson-fermion correspondence asserts that a tensor product of two copies of the 1-dimensional chiral fermion theory considered bosonically (= the level 2 parafermion) is an orbifold of the lattice theory $\langle 2 \rangle$ by the $\mathbb{Z}/2$-group whose generator acts on the lattice by sign. This has an N = 2-supersymmetric version. We tensor with two copies of the lattice theory associated with $\langle\sqrt{8}\rangle$, picking out the sector

$$\left(\frac{m}{\sqrt{8}}, \frac{n}{\sqrt{8}}, \frac{p}{4}\right) \quad \text{where } m \equiv n \equiv p \bmod 2. \qquad (288)$$

The fermionic currents of the individual coordinates are

$$\psi_{-1/2,1} = (1) + (-1), \qquad \psi_{-1/2,2} = i((1) - (-1)), \qquad (289)$$

so the SUSY generators are

$$G^{\pm}_{-3/2,1} = \left(\pm\frac{4}{\sqrt{8}}, 0\right) \otimes ((1) + (-1)),$$
$$G^{\pm}_{-3/2,2} = \left(0, \pm\frac{4}{\sqrt{8}}\right) \otimes i((1) + (-1)), \qquad (290)$$
$$G^{\pm}_{-3/2} = G^{\pm}_{-3/2,1} + G^{\pm}_{-3/2,2}.$$

The $\mathbb{Z}/2$ group acts trivially on the new lattice coordinate. A note is due on the signs: to each state, we can assign a pair of parities, which will correspond to the parities of the 2 coordinates in the orbifold. This then also determines the sign of fusion rules.

Now consider our field as a tensor product of $(2, 0)$ and $(1, 1)$, each in a tensor product of two copies of the minimal models. Considering each of these factors as an orbifold of the N = 2-supersymmetric lattice theory, let us lift to the lattice theory:

$$(2, 0) \to \left(\frac{2}{\sqrt{8}}, 0\right) \otimes (0), \qquad (291)$$
$$(1, 1) \to \left(\frac{1}{\sqrt{8}}, \frac{1}{\sqrt{8}}\right) \otimes ((1/2) + (-1/2)). \qquad (292)$$

Then the fields (291), (292) are $\mathbb{Z}/2$-invariant. In the case of (291), we can proceed in the lift instead of the orbifold, because the fusion rules in the orbifold are abelian anyway. In the case of (292), the choice amounts to choosing a particular fusion rule. But now the point is that

$$G^-_{-1/2}(2, 0) \to \left(-\frac{2}{\sqrt{8}}, 0\right) \otimes ((1) + (-1)), \qquad (293)$$
$$G^-_{-1/2}(1, 1) \to \left(-\frac{3}{\sqrt{8}}, \frac{1}{\sqrt{8}}\right) \otimes ((1/2) + (-1/2)), \qquad (294)$$
$$G^-_{-1/2}(1, 2) \to \left(\frac{1}{\sqrt{8}}, -\frac{3}{\sqrt{8}}\right) \otimes ((1/2) + (-1/2)). \qquad (295)$$

Thus, the lift of $G^-_{-1/2}u$ is a sum of lattice labels! Now the critical summands of the operator

$$G^-_{-1/2}(u)(z_k) \cdots G^-_{-1/2}(u)(z_1)\,(u)(0) \qquad (296)$$

have $k = 4m$, and we have $2m$ summands (293), and $m$ summands (294), (295), respectively. All

$$\binom{4m}{2m,\ m,\ m}$$

possibilities occur. It is the bottom (= label) term which we must compute in order to evaluate our obstruction. But by our sign discussion, when we swap a (294) term with a (295) term, the label summands cancel out. Now adding all such possible

$$\binom{4m}{2m,\ m-1,\ m-1,\ 2}$$

pairs, all critical summands of (296) will occur with equal coefficients by symmetry, and hence also the bottom coefficient of (296) is 0, thus showing that the vanishing of our obstruction for this field lifts to the lattice theory. Since the field (292) is invariant under the $\mathbb{Z}/2$-orbifolding (and although (291) is not, the same conclusion holds when replacing it with its orbifold image), the entire perturbative deformation can also be orbifolded, yielding the desired deformation.

We thus conclude that for the Gepner model of the K3 Fermat quartic, all the critical fields exponentiate to perturbative deformations.

8. Conclusions and Discussion

In this paper, we have investigated perturbative deformations of CFT's by turning on a marginal cc field, by the method of recursively updating the field along the deformation path. A certain algebraic obstruction arises. We work out some examples, including free field theories, and some N = (2, 2) supersymmetric Gepner models. In the N = (2, 2) case, in the case of a single cc field, the obstruction we find can be made very explicit, and perhaps surprisingly, does not automatically vanish. By explicit computation, we found that the obstruction does not vanish for a particular critical cc field in the Gepner model of the Fermat quintic 3-fold


(we saw some indication, although not proof, that it may vanish for the field corresponding to adding the symmetric term $xyztu$ to the superpotential, and for the unique critical ac field). By comparison, the obstruction vanishes for the critical cc fields and ac fields in the Gepner model of the Fermat quartic K3-surface. Our calculations are not completely physical in the sense that cc fields are not real: real fields are obtained by adding in each case the complex-conjugate aa field, in which case the calculation is more complicated and is not done here. Assuming (as seems likely) that the real field case exhibits behavior similar to what we found, why are the K3 and 3-fold cases different, and what does the obstruction in the 3-fold case indicate? In the K3 case, our perturbative analysis conforms with the Aspinwall–Morrison construction [9] of the big moduli space of K3's, and corresponding (2, 2)- (in fact, (4, 4)-) CFT's, and also with the findings of Nahm and Wendland [54, 62]. In the 3-fold case, however, the straightforward perturbative construction of the deformed nonlinear σ-model fails. This corresponds to the discussion of Nemeschansky–Sen [55] of the renormalization of the nonlinear σ-model. They expand around the 0 curvature tensor, but it seems natural to assume that similar phenomena would occur if we could expand around the Fermat quintic vacuum. The authors of [55] then find that non-Ricci flat deformations must be added to the Lagrangian at higher orders of the deformation parameter in order to cancel the β function. Therefore, if we want to do this perturbatively, fields must be present in the original (unperturbed) model which would correspond to non-Ricci flat deformations. No such fields are present in the Gepner model. (Even if we do not a priori assume that the marginal fields of the Gepner model correspond to Ricci flat deformations, we see that different fields are needed at higher order of the perturbation parameter, so there are not enough fields in the model.)
More generally, ignoring for the moment the worldsheet SUSY, the bosonic superpartners are fields which are of weight 1 classically (as the classical nonlinear σ-model Lagrangian is conformally invariant even in the non-Ricci flat case). A 1-loop correction arises in the quantum picture [4], indicating that the corresponding deformation fields must be of generalized weight (cf. [39–42]). However, such fields are excluded in unitary CFT's, which is the reason why these deformations must be non-perturbative. One does not see this phenomenon on the level of the corresponding topological models, since these are invariant under varying the metric within the same cohomological class, and hence do not see the correction term [68]. Also, it is worth noting that in the K3 case, the β function vanishes directly for the Ricci-flat metric by the N = (4, 4) supersymmetry ([5]), and hence the correction terms of [55] are not needed. Accordingly, we have found that the corresponding perturbative deformations exist. From the point of view of mirror symmetry, mirror-symmetric families of hypersurfaces in toric varieties were proposed by Batyrev [10]. In the case of the Fermat quintic, the exact mirror is a singular orbifold and the nonlinear σ-model deformations corresponding to the Batyrev dual family exist perturbatively by our analysis.


To obtain mirror candidates for the additional deformations, one uses crepant resolutions of the mirror orbifold (see [57] for a survey). In the K3 case, this approach seems validated by the fact that the mirror orbifolds can indeed be viewed as a limit of non-singular K3-surfaces [6]. In the 3-fold case, however, this is not so clear. The moduli spaces of Calabi–Yau 3-folds are not locally symmetric spaces. The crepant resolution is not unique even in the more restrictive category of algebraic varieties; different resolutions are merely related by flops. It is therefore not clear what the exact mirrors are of those deformations of the Fermat quintic where the deformation does not naturally occur in the Batyrev family, and resolution of singularities is needed. In other words, the McKay correspondence sees only "topological" invariants, and not the finer geometrical information present in the whole nonlinear σ-model. In [21], Fan, Jarvis and Ruan gave a mathematically exact construction of the A-models corresponding to Landau–Ginzburg orbifolds via Gromov–Witten theory applied to the Witten equation. Using mirror symmetry conjectures, this may be used to construct mathematically candidates of topological gravity-coupled A-models as well as B-models of Calabi–Yau varieties. Gromov–Witten theory, however, is a rich source of examples where such gravity-coupled topological models exist, while a full conformally invariant (2, 2)-σ-model does not. For example, Gromov–Witten theory can produce highly non-trivial topological models for 0-dimensional orbifolds (cf. [56, 43]). Why does our analysis not contradict the calculation of Dixon [19] that the central charge does not change for deformation of any N = 2 CFT along any linear combination of ac and cc fields? Zamolodchikov [70, 71] defined an invariant $c$ which is a non-decreasing function in a renormalization group flow in a 2-dimensional QFT, and is equal to the central charge in a conformal field theory.
It may therefore appear that by [19], all infinitesimal deformations along critical ac and cc fields in an N = 2-CFT exponentiate. However, we saw that when our obstruction occurs, additional counterterms corresponding to those of Nemeschansky–Sen are needed. This corresponds to non-perturbative corrections of the correlation function needed to fix $c$, and the functions of [19] cannot be used directly in our case. Finally, let us briefly discuss the significance of our result to the relationship between classical and quantum geometry. One of the well known effects (and also great puzzles) of string duality (as reviewed, say, in [31]) is that a smooth path in the moduli space of conformal field theories corresponding to Calabi–Yau varieties can correspond to a discontinuous path in the classical moduli space of the Calabi–Yau varieties themselves, and more specifically that the topology of the underlying Calabi–Yau variety can change along such a path. In view of our result, it is possible that this picture needs to be refined. Namely, what we perceive as a smooth path in quantum geometry may actually consist of discrete steps "tunneling" across the changes of topology. An explanation of such a phenomenon could be that the moduli space of quantum geometries should itself be quantized, and can have a discrete rather than continuous spectrum.


Acknowledgments

The author thanks D. Burns, I. Dolgachev, I. Frenkel, Doron Gepner, Y. Z. Huang, I. Melnikov, K. Wendland and E. Witten for explanations and discussions. Special thanks to H. Xing, who contributed many useful ideas to this project before changing his field of interest. The author is supported by grants from the NSA and the MCTP.

References

[1] I. Affleck and A. W. Ludwig, Universal noninteger ground state degeneracy in critical quantum systems, Phys. Rev. Lett. 67 (1991) 161–164.
[2] I. Affleck and A. W. Ludwig, Exact conformal field theory results on the multichannel Kondo effect: Single Fermion Green's function, selfenergy and resistivity, J. High Energy Phys. 11 (2000) 21.
[3] L. Alvarez-Gaume, S. Coleman and P. Ginsparg, Finiteness of Ricci flat N = 2 supersymmetric σ-models, Comm. Math. Phys. 103(3) (1986) 423–430.
[4] L. Alvarez-Gaume and D. Z. Freedman, Kähler geometry and renormalization of supersymmetric σ-models, Phys. Rev. D 22 (1980) 846–853.
[5] L. Alvarez-Gaume and P. Ginsparg, Finiteness of Ricci flat supersymmetric σ models, Comm. Math. Phys. 102 (1985) 311–326.
[6] M. T. Anderson, The L^2 structure of moduli spaces of Einstein metrics on 4-manifolds, Geom. Funct. Anal. 2 (1992) 29–89.
[7] D. V. Anosov and A. A. Bolibruch, The Riemann–Hilbert Problem, Aspects of Mathematics, E22 (Friedr. Vieweg and Sohn, Braunschweig, 1994).
[8] J. Ashkin and G. Teller, Statistics of two-dimensional lattices with four components, Phys. Rev. 64 (1943) 178–184.
[9] P. S. Aspinwall and D. R. Morrison, String theory on K3 surfaces, in Mirror Symmetry, eds. B. R. Greene and S. T. Yau, Vol. II, AMS/IP Stud. Adv. Math. (Amer. Math. Soc., 1994), pp. 703–716.
[10] V. V. Batyrev, Dual polyhedra and mirror symmetry for Calabi–Yau hypersurfaces in toric varieties, J. Alg. Geom. 3 (1994) 493–535.
[11] R. J. Baxter, Eight-vertex model in lattice statistics, Phys. Rev. Lett. 26 (1971) 832–833.
[12] P. Bouwknegt and D. Ridout, A note on the equality of algebraic and geometric D-brane charges in WZW, J. High Energy Phys. 0405 (2004) 029.
[13] R. Cohen and D. Gepner, Interacting bosonic models and their solution, Mod. Phys. Lett. A 6 (1991) 2249.
[14] M. Dine, N. Seiberg, X. G. Wen and E. Witten, Non-perturbative effects on the string world sheet, Nucl. Phys. B 278 (1986) 769–969.
[15] M. Dine, N. Seiberg, X. G. Wen and E. Witten, Non-perturbative effects on the string world sheet, Nucl. Phys. B 289 (1987) 319–363.
[16] L. J. Dixon, V. S. Kaplunovsky and J. Louis, On effective field theories describing (2, 2) vacua of the heterotic string, Nucl. Phys. B 329 (1990) 27–82.
[17] J. Ellis, C. Gomez, D. V. Nanopoulos and M. Quiros, World sheet instanton effects on no-scale structure, Phys. Lett. B 173 (1986) 59–64.
[18] J. Distler and B. Greene, Some exact results on the superpotential from Calabi–Yau compactifications, Nucl. Phys. B 309 (1988) 295–316.
[19] L. Dixon, Some worldsheet properties of superstring compactifications, on orbifolds and otherwise, in Superstrings, Unified Theories and Cosmology, Proc. ICTP Summer school, 1987, ed. G. Furlan (World Scientific, 1988), pp. 67–126.


[20] M. R. Douglas and W. Taylor, The landscape of intersecting brane models, J. High Energy Phys. 0701 (2007) 031.
[21] H. Fan, T. J. Jarvis and Y. Ruan, The Witten equation, mirror symmetry and quantum singularity theory, arXiv:0712.4021.
[22] S. Fredenhagen and V. Schomerus, Branes on group manifolds, Gluon condensates, and twisted K-theory, J. High Energy Phys. 104 (2001) 7.
[23] D. Friedan, Nonlinear models in 2 + ε dimensions, Ann. Phys. 163(2) (1985) 318–419.
[24] D. Gepner, Space-time supersymmetry in compactified string theory and superconformal models, Nucl. Phys. B 296 (1988) 757–778.
[25] D. Gepner, Exactly solvable string compactifications on manifolds of SU(N) holonomy, Phys. Lett. B 199 (1987) 380.
[26] M. Gerstenhaber, On the deformation of rings and algebras I, Ann. of Math. 79 (1964) 59–103.
[27] M. Gerstenhaber, On the deformation of rings and algebras II, Ann. of Math. 84 (1966) 1–19.
[28] M. Gerstenhaber, On the deformation of rings and algebras III, Ann. of Math. 88 (1968) 1–34.
[29] M. Gerstenhaber, On the deformation of rings and algebras IV, Ann. of Math. 99 (1974) 257–276.
[30] P. Ginsparg, Curiosities at c = 1, Nucl. Phys. B 295 (1988) 153–170.
[31] B. R. Greene, String theory on Calabi–Yau manifolds, hep-th/9702155.
[32] B. R. Greene and M. R. Plesser, Duality in Calabi–Yau moduli space, Nucl. Phys. B 338 (1990) 15–37.
[33] B. R. Greene, C. Vafa and N. P. Warner, Calabi–Yau manifolds and renormalization group flows, Nucl. Phys. B 324 (1989) 371–390.
[34] P. A. Griffin and O. F. Hernandez, Structure of irreducible SU(2) parafermion modules derived via the Feigin–Fuchs construction, Int. J. Modern Phys. A 7 (1992) 1233–1265.
[35] M. T. Grisaru, A. E. M. van de Ven and D. Zanon, Four-loop β-function for the N = 1 and N = 2 supersymmetric non-linear sigma model in two dimensions, Phys. Lett. B 173 (1986) 423.
[36] P. Hu and I. Kriz, Conformal field theory and elliptic cohomology, Adv. Math. 189 (2004) 325–412.
[37] P. Hu and I. Kriz, Closed and open conformal field theories and their anomalies, Comm. Math. Phys. 254 (2005) 221–253.
[38] P. Hu and I. Kriz, A mathematical formalism for the Kondo effect in WZW branes, J. Math. Phys. 48 (2007) 072301, 31 pp.
[39] Y. Z. Huang, J. Lepowsky and L. Zhang, Logarithmic tensor product theory for generalized modules for a conformal vertex algebra, Part I, math/0609833.
[40] Y. Z. Huang, J. Lepowsky and L. Zhang, A logarithmic generalization of tensor product theory for modules for a vertex operator algebra, Internat. J. Math. 17 (2006) 975–1012; math/0311235.
[41] Y. Z. Huang and L. Kong, Full field algebras, QA/0511328.
[42] Y. Z. Huang and A. Milas, Intertwining operator superalgebras and vertex tensor categories for superconformal algebras, II, Trans. Amer. Math. Soc. 354 (2002) 363–385.
[43] P. Johnson, Equivariant Gromov–Witten theory of one dimensional stacks, Ph.D. thesis, Univ. of Michigan (2009).
[44] S. Kachru and E. Witten, Computing the complete massless spectrum of a Landau–Ginzburg orbifold, Nucl. Phys. B 407 (1993) 637–666.


[45] L. P. Kadanoff, Multicritical behavior of the Kosterlitz–Thouless critical point, Ann. Phys. 120 (1979) 39–71.
[46] L. P. Kadanoff and A. C. Brown, Correlation functions on the critical lines of the Baxter and Ashkin–Teller models, Ann. Phys. 121 (1979) 318–342.
[47] L. P. Kadanoff and F. J. Wegner, Some critical properties of the eight-vertex model, Phys. Rev. D 4 (1971) 3989–3993.
[48] M. Kohmoto and L. P. Kadanoff, Lower bound RSRG approximation for a large system, J. Phys. A 13 (1980) 3339–3343.
[49] I. Kriz, Some notes on the N-superconformal algebra, http://www.math.lsa.umich.edu/~ikriz.
[50] I. Kriz, On spin and modularity in conformal field theory, Ann. Sci. ENS 36 (2003) 57–112.
[51] J. Maldacena, G. Moore and N. Seiberg, D-brane instantons and K-theory charges, J. High Energy Phys. 111 (2004) 62.
[52] G. Moore, K-theory from a physical perspective, Topology, Geometry and Quantum Field Theory, London Math. Soc. Lecture Ser., Vol. 308 (Cambridge Univ. Press, 2004), pp. 194–234.
[53] G. Mussardo, G. Sotkov and M. Stanishkov, N = 2 superconformal minimal models, Int. J. Mod. Phys. A 4(5) (1989) 1135–1206.
[54] W. Nahm and K. Wendland, A hiker's guide to K3, Comm. Math. Phys. 216 (2001) 85–103.
[55] D. Nemeschansky and A. Sen, Conformal invariance of supersymmetric σ-models on Calabi–Yau manifolds, Phys. Lett. B 178(4) (1986) 365–369.
[56] A. Okounkov and R. Pandharipande, The equivariant Gromov–Witten theory of P^1, Ann. of Math. (2) 163 (2006) 561–605.
[57] M. Reid, La Correspondence de McKay, 52eme annee, session de Novembre 1999, no. 897, Asterisque 276 (2002) 53–72.
[58] V. Schomerus, Lectures on branes in curved backgrounds, Class. Quant. Grav. 19 (2002) 5781–5847.
[59] G. Segal, The definition of conformal field theory, in Topology, Geometry and Quantum Field Theory, London Math. Soc. Lecture Note Ser., Vol. 308 (Cambridge University Press, 2004), pp. 421–577.
[60] C. Vafa and N. Warner, Catastrophes and the classification of conformal theories, Phys. Lett. B 218 (1989) 51–58.
[61] F. J. Wegner, Corrections to scaling laws, Phys. Rev. B 5 (1972) 4529–4536.
[62] K. Wendland, A family of SCFT's hosting all very attractive relatives to the (2)^4 Gepner model, J. High Energy Phys. 0603 (2006) 102.
[63] K. G. Wilson, The renormalization group: Critical phenomena and the Kondo problem, Rev. Mod. Phys. 47 (1975) 773–840.
[64] K. G. Wilson, Non-Lagrangian models of current algebra, Phys. Rev. 179 (1969) 1499–1512.
[65] K. G. Wilson, Operator-product expansions and anomalous dimensions in the Thirring model, Phys. Rev. D 2 (1970) 1473–1493.
[66] E. Witten, Phases of N = 2 theories in two dimensions, Nucl. Phys. B 403 (1993) 159–222.
[67] E. Witten, On the Landau–Ginzburg description of N = 2 minimal models, Int. J. Mod. Phys. A 9 (1994) 4783–4800.
[68] E. Witten, Topological sigma models, Comm. Math. Phys. 118 (1988) 411–449.
[69] A. B. Zamolodchikov, Integrable field theory from conformal field theory, Adv. Stud. Pure Math. 19 (1989) 641–674.


[70] A. B. Zamolodchikov, "Irreversibility" of the flux of the renormalization group in a 2D field theory, JETP Lett. 43 (1986) 730–732.
[71] A. B. Zamolodchikov, Renormalization group and perturbation theory about fixed points in two-dimensional field theory, Sov. J. Nucl. Phys. 46 (1987) 1090–1096.
[72] Y. Zhu, Modular invariance of characters of vertex operator algebras, J. Amer. Math. Soc. 9 (1996) 237–302.

March 10, J070-S0129055X10003928

2010 10:14 WSPC/S0129-055X

148-RMP

Reviews in Mathematical Physics Vol. 22, No. 2 (2010) 193–206 © World Scientific Publishing Company DOI: 10.1142/S0129055X10003928

SPATIAL GROWTH OF FUNDAMENTAL SOLUTIONS FOR CERTAIN PERTURBATIONS OF THE HARMONIC OSCILLATOR

ARNE JENSEN∗ and KENJI YAJIMA†

∗Department of Mathematical Sciences, Aalborg University, Fr. Bajers Vej 7G, DK-9220 Aalborg Ø, Denmark

†Department of Mathematics, Gakushuin University, 1-5-1 Mejiro, Toshima-ku, Tokyo 171-8588, Japan

Received 5 June 2009
Revised 24 November 2009

We consider the fundamental solution for the Cauchy problem for perturbations of the harmonic oscillator by time dependent potentials which grow at spatial infinity slower than quadratic but faster than linear functions and whose Hessian matrices have a fixed sign. We prove that the fundamental solution at resonant times grows indefinitely at spatial infinity with an algebraic growth rate, which increases indefinitely when the growth rate of perturbations at infinity decreases from the near quadratic to the near linear ones.

Keywords: Fundamental solution; Schrödinger equation; harmonic oscillator.

Mathematics Subject Classification 2010: 35A08, 35B10, 35J10, 81Q20

1. Introduction

We consider $d$-dimensional time dependent Schrödinger equations
$$ i\frac{\partial u}{\partial t} = \left(-\frac{1}{2}\Delta + V(t,x)\right)u(t,x), \qquad (t,x)\in\mathbb{R}^1\times\mathbb{R}^d. \tag{1} $$

We assume throughout this paper that $V(t,x)$ is smooth with respect to the $x$ variables, and that $V(t,x)$ and its derivatives $\partial_x^\alpha V(t,x)$ are jointly continuous with respect to $(t,x)$. Under the conditions to be imposed on $V(t,x)$ in what follows, Eq. (1) generates a unique unitary propagator $\{U(t,s) : t,s\in\mathbb{R}\}$ in the Hilbert space $\mathcal{H}=L^2(\mathbb{R}^d)$, so that the solution in $\mathcal{H}$ of (1) with the initial condition $u(s,x)=\varphi(x)\in\mathcal{H}$


is uniquely given by $u(t)=U(t,s)\varphi$. The distribution kernel $E(t,s,x,y)$ of $U(t,s)$ is called the fundamental solution (FDS for short) of Eq. (1): $U(t,s)\varphi(x)=\int E(t,s,x,y)\varphi(y)\,dy$. We write $E(t,x,y)=E(t,0,x,y)$. It is well known that the FDS of the free Schrödinger equation, viz. Eq. (1) with $V=0$, is given by
$$ E_0(t,x,y)=\frac{e^{\mp i\pi d/4}}{|2\pi t|^{d/2}}\,e^{i(x-y)^2/2t}, \qquad t\gtrless 0, \tag{2} $$
and that of the harmonic oscillator, viz. Eq. (1) with $V(t,x)=x^2/2$, is given for non-resonant times $m\pi<t<(m+1)\pi$, $m\in\mathbb{Z}$, via Mehler's formula:
$$ E_h(t,x,y)=\frac{e^{-id(1+2m)\pi/4}}{|2\pi\sin t|^{d/2}}\,e^{(i/\sin t)\left((x^2+y^2)\cos t/2-x\cdot y\right)}, \tag{3} $$
and, for resonant times $t-s=m\pi$, by
$$ E_h(m\pi,x,y)=e^{-imd\pi/2}\,\delta(x-(-1)^m y). \tag{4} $$
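As a sanity check on (3) and (4), the following minimal Python sketch treats the case $d=1$, $m=0$. It evaluates Mehler's formula and measures, by central finite differences, how well it satisfies Eq. (1) with $V=x^2/2$; it also evaluates the classical flow $x(t)=y\cos t+k\sin t$, which returns to $(-1)^m y$ at $t=m\pi$ for every initial momentum $k$, the classical counterpart of the recurrence of singularities in (4). The function names, the sample point and the step size `h` are illustrative assumptions, not part of the paper.

```python
import cmath
import math


def mehler_kernel(t, x, y):
    """Mehler's formula (3) for d = 1 and 0 < t < pi (that is, m = 0)."""
    s, c = math.sin(t), math.cos(t)
    prefactor = cmath.exp(-1j * math.pi / 4) / math.sqrt(abs(2 * math.pi * s))
    phase = ((x * x + y * y) * c / 2 - x * y) / s
    return prefactor * cmath.exp(1j * phase)


def schrodinger_residual(t, x, y, h=1e-4):
    """|i dE/dt - (-(1/2) d^2E/dx^2 + (x^2/2) E)| using central differences."""
    dE_dt = (mehler_kernel(t + h, x, y) - mehler_kernel(t - h, x, y)) / (2 * h)
    d2E_dx2 = (mehler_kernel(t, x + h, y) - 2 * mehler_kernel(t, x, y)
               + mehler_kernel(t, x - h, y)) / (h * h)
    rhs = -0.5 * d2E_dx2 + 0.5 * x * x * mehler_kernel(t, x, y)
    return abs(1j * dE_dt - rhs)


def classical_position(t, y, k):
    """Harmonic-oscillator flow started at x(0) = y, p(0) = k."""
    return y * math.cos(t) + k * math.sin(t)


if __name__ == "__main__":
    # Residual of the Schroedinger equation at a non-resonant time.
    print(schrodinger_residual(1.0, 0.7, 0.3))
    # At t = 3*pi every trajectory sits at -y, whatever the momentum k.
    print([classical_position(3 * math.pi, 0.3, k) for k in (1.0, 10.0, 50.0)])
```

The residual is limited only by the finite-difference error, and the second printout shows all trajectories refocusing at $-y$.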

Note that the FDS for the free Schrödinger equation is smooth and spatially bounded for any $t\neq 0$; for the harmonic oscillator the FDS has this property only at non-resonant times. At resonant times $t=m\pi$ the singularities of the initial function $E_h(0,x,y)=\delta(x-y)$ recur at $x=(-1)^m y$; however, it is smooth and decays rapidly at spatial infinity. Actually, it vanishes outside the singular point $x=(-1)^m y$ when $t=m\pi$.

We begin with a brief review of the properties of the FDS for (1) with general potentials $V(t,x)$, laying emphasis on its smoothness and boundedness with respect to the spatial variables $(x,y)$. We denote the classical Hamiltonian and Lagrangian corresponding to (1), respectively, by $H(t,x,p)=p^2/2+V(t,x)$ and $L(t,q,v)=v^2/2-V(t,q)$, and $(x(t,s,y,k),p(t,s,y,k))$ is the solution of the initial value problem for Hamilton's equations
$$ \dot x(t)=\partial_p H(t,x,p),\quad \dot p(t)=-\partial_x H(t,x,p);\qquad x(s)=y,\quad p(s)=k. \tag{5} $$

We write $(x(t,0,y,k),p(t,0,y,k))=(x(t,y,k),p(t,y,k))$. Suppose first that $V(t,x)$ increases at most quadratically at spatial infinity in the sense that
$$ \sup_{t}|\partial_x^\alpha V(t,x)|\le C_\alpha \quad\text{for all } |\alpha|\ge 2. \tag{6} $$
Then, in the seminal work [4], Fujiwara has shown that there exists a $T$ depending only on $V$ such that the following results hold for the time interval $0\le\pm(t-s)<T$: The map $\mathbb{R}^d\ni k\mapsto x(t,s,y,k)\in\mathbb{R}^d$ is a diffeomorphism for every fixed $y\in\mathbb{R}^d$


and, therefore, there exists a unique path of (5) such that $x(s)=y$ and $x(t)=x$; if we write
$$ S(t,s,x,y)=\int_s^t\left(\dot x(r)^2/2-V(r,x(r))\right)dr $$
for the action integral of the path, the FDS $E(t,s,x,y)$ has the form
$$ E(t,s,x,y)=\frac{e^{\mp i\pi d/4}}{(2\pi|t-s|)^{d/2}}\,e^{iS(t,s,x,y)}\,a(t,s,x,y),\qquad t\gtrless s, \tag{7} $$
where $a(t,s,x,y)$ is a smooth function of $(x,y)$ such that, for any $\alpha$ and $\beta$, $\partial_x^\alpha\partial_y^\beta a(t,s,x,y)$ are $C^1$ with respect to $(t,s,x,y)$ and
$$ |\partial_x^\alpha\partial_y^\beta(a(t,s,x,y)-1)|\le C_{\alpha\beta}(t-s)^2. \tag{8} $$

Moreover, the semi-classical approximation for the amplitude function is valid in the sense that, as $|t-s|\to 0$,
$$ \frac{a(t,s,x,y)}{(2\pi|t-s|)^{d/2}}=(2\pi)^{-d/2}\left|\det\frac{\partial x}{\partial k}(t,s,y,k)\right|^{-1/2}+O(|t-s|^{-(d-2)/2}), \tag{9} $$
where $k$ is the (unique) point such that $x=x(t,s,y,k)$. In particular, $E(t,s,x,y)$ is smooth and bounded with respect to the spatial variables $(x,y)\in\mathbb{R}^d\times\mathbb{R}^d$ for every $0<|t-s|<T$ (see [9] for a generalization to the case when magnetic fields are present). For the free Schrödinger equation or for the harmonic oscillator the relation (9) holds without the error term $O(|t-s|^{-(d-2)/2})$.

Under the condition (6) the structure (7) of the FDS in general breaks down at later times because singularities of the initial data $\delta(x)$ may recur in finite time, as the FDS of the harmonic oscillator (4) explicitly demonstrates. If $V(t,x)$ is subquadratic at spatial infinity in the sense that
$$ \lim_{|x|\to\infty}\sup_{t}|\partial_x^\alpha V(t,x)|=0,\quad |\alpha|=2;\qquad |\partial_x^\alpha V(t,x)|\le C_\alpha\quad\text{for all } |\alpha|\ge 3, \tag{10} $$
then this recurrence of singularities does not take place, however, and the FDS is of the form (7) for any finite time ([12]). More precisely, if $V$ satisfies (10), then for any $T>0$, there exists $R>0$ such that, for any $t$ and $s$ with $0<\pm(t-s)\le T$ and for any pair $(x,y)\in\mathbb{R}^d\times\mathbb{R}^d$ with $x^2+y^2\ge R^2$, there exists a unique path of (5) such that $x(s)=y$ and $x(t)=x$ and the FDS for $0<\pm(t-s)\le T$ may be written in the form (7), where, for $(x,y)$ with $x^2+y^2\ge R^2$, $S(t,s,x,y)$ is the action integral of this path. Moreover, we have $a(t,s,x,y)\to 1$ as $x^2+y^2\to\infty$. In particular, $E(t,s,x,y)$ is smooth and bounded with respect to $(x,y)$ for any $t\neq s$.

On the other hand, if $d=1$ and $V(t,x)$ does not depend on $t$, and if $V$ is convex and $V(x)\ge C|x|^{2+\varepsilon}$ for large $|x|$ for some $\varepsilon>0$ and $C>0$, then, under certain additional technical assumptions on the derivatives, $E(t,x,y)$ is nowhere $C^1$ with respect to $(t,x,y)$ ([10]). It is also known that, if $V$ satisfies $C_1|x|^\delta\le V(x)\le C_2|x|^\delta$ near infinity with constants $\delta>10$ and $0<C_1\le C_2<\infty$, then $E(t,0,x,y)$ is


unbounded with respect to $(x,y)$ for any $t\in\mathbb{R}$ ([13]). These results have been proven only in one dimension so far; however, it is believed that similar results hold in all dimensions.

In this way, properties of the FDS experience a sharp transition when the growth rate at spatial infinity of the potential $V(t,x)$ changes from subquadratic to superquadratic. Thus, the FDS for the borderline case, viz. perturbations of the harmonic oscillator
$$ i\frac{\partial u}{\partial t}=\left(-\frac12\Delta+\frac12x^2+W(t,x)\right)u(t,x),\qquad (t,x)\in\mathbb{R}^1\times\mathbb{R}^d, \tag{11} $$
where $W(t,x)$ is subquadratic in the sense that it satisfies (10) with $W$ in place of $V$, has attracted the particular interest of many authors, and the following properties of $E(t,s,x,y)$ have been established (see, e.g., [14, 5, 12, 2, 3]). We may set $s=0$, which we will do, and we will write $E(t,x,y)$ for $E(t,0,x,y)$; $\langle x\rangle=(1+|x|^2)^{1/2}$.

(a) The structure of the FDS $E_h(t,x,y)$ at non-resonant times as stated in (3) is stable under perturbations and $E(t,x,y)$ is smooth and spatially bounded for $m\pi<t<(m+1)\pi$.

However, $E(t,x,y)$ at resonant times is more sensitive to perturbations:

(b) If $W$ is sublinear, viz. $|\partial_x^\alpha W(t,x)|=o(1)$, $|\alpha|=1$, as $|x|\to\infty$ uniformly with respect to $t$, then the recurrence of singularities at resonant times $m\pi$, $m\in\mathbb{Z}$, persists ($\mathrm{WF}_x$ denotes the wavefront set):
$$ \mathrm{WF}_x E(m\pi,x,y)=\{(-1)^m(y,\xi):\xi\in\mathbb{R}^d\setminus\{0\}\}, $$
and it decays rapidly at spatial infinity, viz. for any $N$,
$$ |E(m\pi,x,y)|\le C_N\langle x-y\rangle^{-N},\qquad |x-y|\ge 1. \tag{12} $$

(c) If $W$ is of linear type, viz. $|\partial_x^\alpha W(t,x)|\le C$ for $|\alpha|=1$, singularities of $E(0,x,y)$ can propagate at resonant times. For example, if $W=a\langle x\rangle$, then with $\hat\xi=\xi/|\xi|$,
$$ \mathrm{WF}_x E(m\pi,x,y)=\{(-1)^m(y+2am\hat\xi,\xi):\xi\in\mathbb{R}^d\setminus\{0\}\}, $$
but it still decays rapidly at spatial infinity:
$$ |E(m\pi,x,y)|\le C_N\langle x-y\rangle^{-N},\qquad |x-y|\ge 1. \tag{13} $$

(d) If $W$ is superlinear and satisfies the following sign condition on the Hessian matrix $\partial_x^2W=(\partial^2W/\partial x_j\partial x_k)$:
$$ C_1\langle x\rangle^{-\delta}\le\partial_x^2W(t,x)\le C_2\langle x\rangle^{-\delta},\qquad (t,x)\in\mathbb{R}^1\times\mathbb{R}^d, \tag{14} $$
for some constants $0<\delta<1$ and $0<C_1<C_2<\infty$ or $-\infty<C_1<C_2<0$, then $E(m\pi,x,y)$, $m\in\mathbb{Z}$, is $C^\infty$ with respect to $(x,y)$, viz. singularities at resonant times $t=m\pi$ are swept away.


This paper is concerned with the properties of the FDS $E(t,x,y)$ at resonant times $t\in\pi\mathbb{Z}$. We show that, in the last case (d) above, $E(m\pi,x,y)$ increases indefinitely as $|x|\to\infty$ at the algebraic rate $C|x|^{d\delta/(2-2\delta)}$, exhibiting a sharp contrast to the decay result (12) or (13) for the case when $W$ is at most linearly increasing at spatial infinity. More precisely, we prove the following theorem:

Theorem 1.1. Suppose that $W(t,x)$ is subquadratic and satisfies the sign condition (14) for some $0<\delta<1$ and $0<C_1<C_2<\infty$ or $-\infty<C_1<C_2<0$. Let $m\in\mathbb{Z}$ and $y\in\mathbb{R}^d$ be fixed. Let $\chi\in C_0^\infty(\mathbb{R}^d\setminus\{0\})$ be such that $\chi(x)=1$ for $a\le|x|\le b$, $0<a<b<\infty$ being constants. Then there exist constants $0<M_1<M_2$, independent of $R\ge 1$, such that
$$ M_1R^{d\delta/(2-2\delta)}\le\left(\int_{\mathbb{R}^d}|E(m\pi,x,y)|^2\,\chi\!\left(\frac{x}{R}\right)^2\frac{dx}{R^d}\right)^{1/2}\le M_2R^{d\delta/(2-2\delta)}. \tag{15} $$

It is interesting to note that, when $\delta$ increases from 0 to 1, the growth rate of $W(t,x)$ as $|x|\to\infty$ decreases (hence $W(t,x)$ becomes weaker), whereas that of $E(m\pi,x,y)$ as $|x-y|\to\infty$, $r(\delta)=d\delta/(2-2\delta)$, increases from 0 indefinitely to infinity. This seemingly contradictory behavior may be understood via the semiclassical picture as follows. For functions $a(x)$ and $b(x)$ on $\Omega$, $a\sim b$ means that $A_1a(x)\le b(x)\le A_2a(x)$, $x\in\Omega$, for constants $0<A_1<A_2$. At time 0 consider the ensemble $\Gamma$ of classical particles in the phase space $\mathbb{R}^d\times\mathbb{R}^d$ sitting on the linear Lagrangian manifold $\{(x,p)\in\mathbb{R}^d\times\mathbb{R}^d : x=y,\ p\in\mathbb{R}^d\}$ with uniform momentum distribution $(2\pi)^{-d/2}dp$. Semiclassically, this is described by the wave function $\delta(x-y)=E(0,x,y)$. After time $m\pi$, $\Gamma$ will be transported by the Hamilton flow (5) to the Lagrangian manifold $\{(x(m\pi,y,k),p(m\pi,y,k)) : k\in\mathbb{R}^d\}$. As we shall see below, we have $|p(m\pi,y,k)|\sim|k|$ and $|x(m\pi,y,k)|\sim|k|^{1-\delta}$ as $|k|\to\infty$. It follows at least semiclassically (see (9)) that
$$ |E(m\pi,x,y)|\sim\left|\det\frac{\partial x}{\partial k}\right|^{-1/2}\sim|k|^{d\delta/2}\sim|x|^{d\delta/(2-2\delta)},\qquad |x|\to\infty, $$
which is consistent with (15).

Here is another remark, which clarifies that Theorem 1.1 is more or less consistent with the known results. We should note that if $\delta=0$, then $W=c\langle x\rangle^2$, and $m\pi$ is no longer a resonant time for $V=x^2/2+W$, and the corresponding $E(m\pi,x,y)$ is bounded as $|x-y|\to\infty$; on the other hand, if $\delta=1$, then $W=c\langle x\rangle$ and, as in (c) above, a large portion of $E(m\pi,x,y)$ is concentrated in a bounded domain $|x-y|\le 2cm$, which may be regarded as the extreme case of $C\langle x\rangle^{d\delta/(2-2\delta)}$ as $\delta\to 1$.

We mention here that the result of the theorem was conjectured by Martinez and the second author in [7], where a similar problem is studied in the semi-classical


setting. More precisely, they consider the FDS of the semi-classical Schrödinger equation
$$ ih\frac{\partial u}{\partial t}=\left(-\frac{h^2}{2}\Delta+\frac12x^2+h^\mu W(x)\right)u, $$
where $W(x)$ is $t$ independent and satisfies the same conditions as in this paper, (10) and (14); and they prove that the FDS at the resonant times may be written in the form
$$ E(m\pi,x,y)=h^{-d(1+\nu)/2}\,a(x,y,h)\,e^{iS(x,y)/h},\qquad \nu=\mu/(1-\delta), \tag{16} $$
where $S(x,y)$ is the action integral of the path of (5) connecting $x(0)=y$ and $x(m\pi)=x$ and $a(x,y,h)$ satisfies $C^{-1}\le|a(x,y,h)|\le C$ uniformly with respect to $h$ on every compact subset $K$ of $\mathbb{R}^{2d}\setminus\{(x,(-1)^mx) : x\in\mathbb{R}^d\}$. Thus, $E(m\pi,x,y)$ has the extra growing factor $h^{-d\nu/2}$ as $h\to 0$ compared to $E(t,x,y)$ at non-resonant times $t\neq m\pi$, and they remark that, if their arguments applied to non-smooth potentials, (16) would imply the estimate (15) of Theorem 1.1 for the homogeneous potential $W(x)=C|x|^{2-\delta}$.

It is well known that the boundedness of $E(t,s,x,y)$ with respect to $(x,y)$ implies the so-called $L^p$-$L^q$ estimates of the propagator $U(t,s)$ (hence, also finite time Strichartz estimates). There are examples of Schrödinger equations with smooth coefficients which exhibit breakdown of the estimates, e.g., the harmonic oscillator at resonant times. However, to the best knowledge of the authors, in all known examples they are broken because of local singularities, and Theorem 1.1 is the first example in which they are broken because of the growth at spatial infinity of the FDS (see [8] for $L^p$-$L^q$ estimates for potentials which are singular but decay at infinity). For the micro-local smoothing estimate which may be applied for proving the smoothness of the FDS, see for example [1] or [6].

The rest of the paper is devoted to the proof of this theorem. We prove it only in the $m=1$ case. The proof for the other cases is similar. In Sec. 2, we recall several known facts, which will be used in Sec. 3, where the theorem is proved. We often omit some of the variables of functions, if no confusion is to be feared. For functions $f$ of several variables, we write $f\in C^k(x)$ or $f\in C^k(t,x)$ etc., if $f$ is of class $C^k$ with respect to $x$ or $(t,x)$, etc.

2. Preliminaries

We first recall some results on the Hamiltonian flow generated by (5) when $V(t,x)=x^2/2+W(t,x)$ and $W$ is subquadratic. We set the initial time $s=0$ and omit the variable $s$.
The solutions $(x(t),p(t))=(x(t,y,k),p(t,y,k))$ of (5) satisfy the integral equations
$$ x(t)=y\cos t+k\sin t-\int_0^t\sin(t-s)\,\partial_xW(s,x(s))\,ds, \tag{17} $$
$$ p(t)=-y\sin t+k\cos t-\int_0^t\cos(t-s)\,\partial_xW(s,x(s))\,ds. \tag{18} $$
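The integral equations (17) and (18) can be checked numerically in a toy case. The sketch below is only an illustration under stated assumptions: it takes $d=1$ and the time-independent model perturbation $W(x)=c\langle x\rangle^{2-\delta}$ with the arbitrarily chosen values $\delta=0.5$, $c=0.5$ (this $W$ is subquadratic and has a positive second derivative of size $\langle x\rangle^{-\delta}$), integrates Hamilton's equations with a classical Runge-Kutta scheme, and compares the endpoint with the right-hand sides of (17) and (18) evaluated by the trapezoidal rule. All function names are hypothetical.

```python
import math

DELTA = 0.5  # assumed exponent with 0 < DELTA < 1
C = 0.5      # assumed coupling constant


def dW(x):
    """Derivative of the model perturbation W(x) = C * (1 + x^2)^((2 - DELTA)/2)."""
    return C * (2 - DELTA) * x * (1 + x * x) ** (-DELTA / 2)


def flow(y, k, t_end, steps=4000):
    """RK4 for x' = p, p' = -x - W'(x); returns the time grid and x, p on it."""
    h = t_end / steps
    x, p = y, k
    ts, xs, ps = [0.0], [x], [p]
    for n in range(steps):
        k1x, k1p = p, -x - dW(x)
        k2x, k2p = p + h / 2 * k1p, -(x + h / 2 * k1x) - dW(x + h / 2 * k1x)
        k3x, k3p = p + h / 2 * k2p, -(x + h / 2 * k2x) - dW(x + h / 2 * k2x)
        k4x, k4p = p + h * k3p, -(x + h * k3x) - dW(x + h * k3x)
        x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        p += h / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        ts.append((n + 1) * h)
        xs.append(x)
        ps.append(p)
    return ts, xs, ps


def duhamel(y, k, ts, xs):
    """Right-hand sides of (17) and (18) at the final time (trapezoidal rule)."""
    t = ts[-1]
    h = ts[1] - ts[0]

    def trapezoid(values):
        return h * (sum(values) - 0.5 * (values[0] + values[-1]))

    forces = [dW(x) for x in xs]
    int_sin = trapezoid([math.sin(t - s) * f for s, f in zip(ts, forces)])
    int_cos = trapezoid([math.cos(t - s) * f for s, f in zip(ts, forces)])
    x_rhs = y * math.cos(t) + k * math.sin(t) - int_sin
    p_rhs = -y * math.sin(t) + k * math.cos(t) - int_cos
    return x_rhs, p_rhs


if __name__ == "__main__":
    ts, xs, ps = flow(0.3, 5.0, math.pi)
    x_rhs, p_rhs = duhamel(0.3, 5.0, ts, xs)
    print(xs[-1] - x_rhs, ps[-1] - p_rhs)  # both differences should be tiny
```

The two printed differences are limited by the quadrature error of the trapezoidal rule, not by the identities themselves.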


Since the subquadratic condition implies $|\dot x(t)|+|\dot p(t)|\le C(1+|x(t)|+|p(t)|)$ for a constant $C>0$ and, hence,
$$ e^{-C|t|}(1+|y|+|k|)\le(1+|x(t)|+|p(t)|)\le e^{C|t|}(1+|y|+|k|), \tag{19} $$
it follows, as $y^2+k^2\to\infty$, uniformly with respect to $t$ in compact intervals, that
$$ |x(t)-(y\cos t+k\sin t)|=o(|y|+|k|), \tag{20} $$
$$ |p(t)-(-y\sin t+k\cos t)|=o(|y|+|k|). \tag{21} $$

We fix $m\in\mathbb{Z}$, $m\neq 0$, and $0<\varepsilon<\pi/2$, and consider $t$ in the interval $I=[m\pi-\varepsilon,m\pi+\varepsilon]$. Then, the following results have been proved in Lemmas 2.3, 2.5 and 3.5, respectively, of [12] by using the integral equations (17) and (18).

(i) For any $\alpha$ and $\beta$, as $R^2=y^2+k^2\to\infty$,
$$ \partial_y^\alpha\partial_k^\beta(\partial_yx(t)-(\cos t)\mathbf{1})\to 0,\qquad \partial_y^\alpha\partial_k^\beta(\partial_kx(t)-(\sin t)\mathbf{1})\to 0, \tag{22} $$
$$ \partial_y^\alpha\partial_k^\beta(\partial_yp(t)+(\sin t)\mathbf{1})\to 0,\qquad \partial_y^\alpha\partial_k^\beta(\partial_kp(t)-(\cos t)\mathbf{1})\to 0, \tag{23} $$
uniformly with respect to $t\in I$. Here $\mathbf{1}$ is the $d\times d$ identity matrix.

(ii) Let $R>0$ be sufficiently large. Then, for any $t\in I$ and $(\xi,y)\in\mathbb{R}^{2d}$ with $\xi^2+y^2\ge R^2$, there exists a unique $k\in\mathbb{R}^d$ such that the solution $(x(s,y,k),p(s,y,k))$ of (5) satisfies
$$ p(t,y,k)=\xi. \tag{24} $$

(iii) Let $R$ be as in (ii) and define $\varphi(t,\xi,y)$ for $t\in I$ and $\xi^2+y^2>R^2$ by
$$ \varphi(t,\xi,y)=x(t,y,k)\cdot\xi-\int_0^tL(s,x(s,y,k),\dot x(s,y,k))\,ds, $$
where $k$ is determined by (24). Then $\varphi\in C^\infty(\xi,y)$ and $\partial_\xi^\alpha\partial_y^\beta\varphi\in C^1(t,\xi,y)$ for any $\alpha,\beta$; $\varphi$ is a generating function of the canonical map $(p(t,y,k),y)\mapsto(x(t,y,k),k)$:
$$ (\partial_\xi\varphi)(t,p(t,y,k),y)=x(t,y,k),\qquad (\partial_y\varphi)(t,p(t,y,k),y)=k, \tag{25} $$
and $\varphi$ satisfies the Hamilton–Jacobi equation $\partial_t\varphi=\xi^2/2+V(t,\partial_\xi\varphi)$. Moreover, as $\xi^2+y^2\to\infty$, $\partial_\xi^\alpha\partial_y^\beta\varphi$ approaches the corresponding function of the harmonic oscillator whenever $|\alpha+\beta|\ge 2$:
$$ \sup_{t\in I}\left|\partial_\xi^\alpha\partial_y^\beta\left(\varphi(t,\xi,y)-\frac{(\xi^2+y^2)\sin t+2\xi\cdot y}{2\cos t}\right)\right|\to 0. $$
Furthermore, we have the following representation formula of the FDS [12, Theorem 1.3(2)].


Theorem 2.1. Let $W$ be subquadratic. Then, for $t\in I=[m\pi-\varepsilon,m\pi+\varepsilon]$, the FDS $E(t,x,y)$ of (11) may be written in the following form
$$ E(t,x,y)=\lim_{\varepsilon\downarrow 0}\frac{i^{-(m+1)d}}{(2\pi)^d|\cos t|^{d/2}}\int_{\mathbb{R}^d}e^{ix\cdot\xi-i\tilde\varphi(t,\xi,y)-\varepsilon\xi^2/2}\,a(t,\xi,y)\,d\xi, \tag{26} $$
where the integral converges in the $C^\infty$ topology with respect to $(x,y)$ and the functions $\tilde\varphi$ and $a$ satisfy the following properties:

(a) $\tilde\varphi\in C^\infty(\xi,y)$, $\partial_\xi^\alpha\partial_y^\beta\tilde\varphi\in C^1(t,\xi,y)$ for any $\alpha,\beta$ and
$$ \tilde\varphi(t,\xi,y)=\varphi(t,\xi,y)\quad\text{for } t\in I,\quad \xi^2+y^2\ge R^2. $$

(b) $a\in C^\infty(\xi,y)$, $\partial_\xi^\alpha\partial_y^\beta a\in C^1(t,\xi,y)$ for any $\alpha,\beta$ and
$$ \lim_{\xi^2+y^2\to\infty}\sup_{t\in I}|\partial_\xi^\alpha\partial_y^\beta(a(t,\xi,y)-1)|=0 $$
for any $\alpha$ and $\beta$.

We call integrals of the form (26) oscillatory integrals and often write them simply as
$$ \frac{i^{-(m+1)d}}{(2\pi)^d|\cos t|^{d/2}}\int_{\mathbb{R}^d}e^{ix\cdot\xi-i\tilde\varphi(t,\xi,y)}\,a(t,\xi,y)\,d\xi. $$

When $W$ satisfies the sign condition (14), the phase function $\varphi(\pi,\xi,y)$ satisfies the following properties which are essential for the proof of the theorem. From now on we let $m=1$.

Proposition 2.2. Let $W$ be subquadratic and satisfy (14). Let $L>0$. Then, there exist constants $C>0$ and $R>0$ depending only on $L$ such that for every $|\xi|\ge R$ and $|y|\le L$:
$$ C_1|\xi|^{1-\delta}\le|\partial_\xi\varphi(\pi,\xi,y)|\le C_2|\xi|^{1-\delta}, \tag{27} $$
$$ |\partial_\xi^\alpha\varphi(\pi,\xi,y)|\le C|\xi|^{-\delta},\qquad |\alpha|\ge 2. \tag{28} $$
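By (25), $\partial_\xi\varphi(\pi,\xi,y)=x(\pi,y,k)$ with $p(\pi,y,k)=\xi$, so (27) says that $|x(\pi,y,k)|$ grows like $|k|^{1-\delta}$ while $|p(\pi,y,k)|$ stays comparable to $|k|$. The following Python sketch illustrates this scaling numerically in a toy case; it is not part of the proof. The assumptions are: $d=1$, the time-independent model perturbation $W(x)=c\langle x\rangle^{2-\delta}$ with the arbitrary choices $\delta=0.5$, $c=0.5$, and a classical Runge-Kutta integrator. The function names are hypothetical.

```python
import math

DELTA = 0.5  # assumed exponent with 0 < DELTA < 1
C = 0.5      # assumed coupling constant


def force(x):
    """-dV/dx for V(x) = x^2/2 + C * (1 + x^2)^((2 - DELTA)/2)."""
    return -x - C * (2 - DELTA) * x * (1 + x * x) ** (-DELTA / 2)


def flow_at_pi(y, k, steps=4000):
    """Integrate x' = p, p' = force(x) from t = 0 to t = pi with RK4."""
    h = math.pi / steps
    x, p = y, k
    for _ in range(steps):
        k1x, k1p = p, force(x)
        k2x, k2p = p + 0.5 * h * k1p, force(x + 0.5 * h * k1x)
        k3x, k3p = p + 0.5 * h * k2p, force(x + 0.5 * h * k2x)
        k4x, k4p = p + h * k3p, force(x + h * k3x)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        p += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return x, p


def scaling_exponent(y, k_small, k_large):
    """Log-log slope of |x(pi, y, k)| between two large momenta."""
    x_small, _ = flow_at_pi(y, k_small)
    x_large, _ = flow_at_pi(y, k_large)
    return math.log(abs(x_large) / abs(x_small)) / math.log(k_large / k_small)


if __name__ == "__main__":
    # The slope should be close to 1 - DELTA, in line with (27).
    print(scaling_exponent(0.3, 1e3, 1e4))
    # The momentum at time pi stays comparable to the initial momentum.
    print(abs(flow_at_pi(0.3, 1e3)[1]) / 1e3)
```

Heuristically, the anharmonic term shortens the half-period by an amount of order $|k|^{-\delta}$, so at time $\pi$ the particle has overshot the refocusing point $-y$ by a distance of order $|k|\cdot|k|^{-\delta}$.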

Proof. The upper bound in estimate (27) is obvious from (25), (17) and (20); the lower bound is proved in [11, pp. 61–63] for time independent perturbations $W(t,x)=W(x)$, and the proof applies to the time dependent case as well, if we use [12, Lemmas 2.1 and 2.2] instead of [11, Lemmas 4.2 and 4.3]. From [11, pp. 61–63], we also have for $|\xi|\ge R$ and $k$ such that $p(\pi,y,k)=\xi$
$$ \partial_kx(\pi,y,k)\sim|\xi|^{-\delta}. \tag{29} $$
Differentiating $(\partial_\xi\varphi)(\pi,p(\pi,y,k),y)=x(\pi,y,k)$ with respect to $k$, we have
$$ (\partial_\xi^2\varphi)(\pi,\xi,y)\,\partial_kp(\pi,y,k)=\partial_kx(\pi,y,k) \tag{30} $$


and, applying the second result of (23) and (29), we obtain (28) for the case $|\alpha|=2$. For higher derivatives, we further differentiate (30) and apply (22) and (23) in addition to (29). Estimate (28) follows inductively.

Lemma 2.3. Let $L>0$ and $0<a<b<\infty$ be fixed arbitrarily and let $\chi\in C_0^\infty(\mathbb{R}^d)$ be supported by $\{x\in\mathbb{R}^d : a\le|x|\le b\}$. Then, there exist $R_0>0$ and $C_0>0$, such that for all $R>R_0$ and $|y|\le L$
$$ \frac{1}{R^d}\int_{\mathbb{R}^d}|\chi(\partial_\xi\varphi(\pi,\xi,y)/R)|^2\,d\xi\le C_0R^{d\delta/(1-\delta)}. \tag{31} $$
If $\chi(x)>\delta>0$ for $a_1<|x|<b_1$, $a<a_1<b_1<b$, then we also have the lower bound:
$$ C_1R^{d\delta/(1-\delta)}\le\frac{1}{R^d}\int_{\mathbb{R}^d}|\chi(\partial_\xi\varphi(\pi,\xi,y)/R)|^2\,d\xi. \tag{32} $$

Proof. For sufficiently large $R>0$, we have by virtue of (21) that $1/2\le|p(\pi,y,k)|/|k|\le 2$ for $|y|\le L$ and $|k|\ge R$, and (27) implies $C_1|k|^{1-\delta}\le|x(\pi,y,k)|\le C_2|k|^{1-\delta}$. It follows that, if $\chi(x(\pi,y,k)/R)\neq 0$, then $aR/C_2\le|k|^{1-\delta}\le bR/C_1$. Hence, whenever $\chi(\partial_\xi\varphi(\pi,\xi,y)/R)\neq 0$, we have $D_1R^{1/(1-\delta)}\le|\xi|\le D_2R^{1/(1-\delta)}$ and
$$ \frac{1}{R^d}\int|\chi(\partial_\xi\varphi(\pi,\xi,y)/R)|^2\,d\xi\le CR^{d\delta/(1-\delta)}. $$
A similar argument yields the lower bound in the second case. We omit the obvious details.

3. Proof of Theorem 1.1

Before starting the proof we remark the following: If we were able to prove the faster decay as $|\xi|\to\infty$ for the higher derivatives $\partial_\xi^\alpha\varphi$, say,
$$ |\partial_\xi^\alpha\varphi(\pi,\xi,y)|\le C|\xi|^{-\delta-|\alpha|}, \tag{33} $$
then the standard stationary phase method combined with a change of scale would yield the pointwise estimate
$$ |E(m\pi,x,y)|\sim C|x|^{d\delta/(2-2\delta)}\quad\text{as } |x|\to\infty. \tag{34} $$
However, (33) does not seem to hold in general, and this requires a weaker formulation of the theorem and the somewhat more complicated proof given below.


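We need to estimate
$$ I(R)\equiv\frac{1}{R^d}\int_{\mathbb{R}^d}\left|\chi\!\left(\frac{x}{R}\right)E(\pi,x,y)\right|^2dx. \tag{35} $$
In what follows, we omit the variable $\pi$ and the domain of integration $\mathbb{R}^d$ from integral signs, and write $\tilde\varphi$ as $\varphi$. Since $y$ is fixed in the following computation, we sometimes omit the variable $y$ as well. This should not cause any confusion. Then, by virtue of (26), (35) may be written as an oscillatory integral
$$ \begin{aligned} I(R)&=\lim_{\varepsilon\downarrow 0}\frac{1}{(2\pi)^{2d}R^d}\int\chi\!\left(\frac{x}{R}\right)^2\left|\int e^{i(x\cdot\xi-\varphi(\xi))-\varepsilon\xi^2/2}\,a(\xi)\,d\xi\right|^2dx\\ &=\lim_{\varepsilon\downarrow 0}\frac{1}{(2\pi)^{2d}R^d}\iiint\chi\!\left(\frac{x}{R}\right)^2e^{ix\cdot(\xi-\eta)+i(\varphi(\eta)-\varphi(\xi))-\varepsilon(\xi^2+\eta^2)/2}\,a(\xi)\overline{a(\eta)}\,d\xi\,d\eta\,dx\\ &=\lim_{\varepsilon\downarrow 0}\frac{1}{(2\pi)^d}\iint\hat\chi_2(R(\eta-\xi))\,e^{i(\varphi(\eta)-\varphi(\xi))-\varepsilon(\xi^2+\eta^2)/2}\,a(\xi)\overline{a(\eta)}\,d\xi\,d\eta, \end{aligned} \tag{36} $$
where we wrote $\chi_2(x)=\chi^2(x)$ and we defined the Fourier transform by
$$ \hat f(\xi)=(\mathcal{F}f)(\xi)=\frac{1}{(2\pi)^d}\int e^{-ix\cdot\xi}f(x)\,dx. $$
In what follows we omit the limit sign $\lim_{\varepsilon\downarrow 0}$ and the damping factors which arise from $\exp(-\varepsilon(\xi^2+\eta^2)/2)$. In the right-hand side of (36), we change variables $\eta$ to $\zeta=\eta-\xi$ and expand by Taylor's formula as
$$ \overline{a}(\xi+\zeta)=\sum_{|\alpha|\le N}\frac{1}{\alpha!}\zeta^\alpha\,\overline{a}^{(\alpha)}(\xi)+\sum_{|\alpha|=N+1}\frac{1}{\alpha!}\zeta^\alpha\,b_\alpha(\xi,\zeta) $$
in the resulting formula, where $\overline{a}^{(\alpha)}=\partial_\xi^\alpha\overline{a}$ and where we wrote
$$ b_\alpha(\xi,\zeta)=(N+1)\int_0^1(1-\theta)^N\,\overline{a}^{(\alpha)}(\xi+\theta\zeta)\,d\theta. $$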
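Taylor's formula with integral remainder, as used here, can be checked numerically in one variable, where it reads $a(\xi+\zeta)=\sum_{n\le N}\zeta^n a^{(n)}(\xi)/n!+\zeta^{N+1}b/(N+1)!$ with $b=(N+1)\int_0^1(1-\theta)^N a^{(N+1)}(\xi+\theta\zeta)\,d\theta$. The Python sketch below is an illustration only: the test function $a=\sin$, the Simpson quadrature and all names are assumptions made for the example.

```python
import math


def taylor_parts(derivs, xi, zeta, N, quad_steps=2000):
    """Return (polynomial part, integral remainder) of Taylor's formula.

    derivs[n] must be a callable evaluating the n-th derivative of a,
    for n = 0, ..., N + 1. quad_steps must be even (composite Simpson rule).
    """
    poly = sum(zeta ** n * derivs[n](xi) / math.factorial(n)
               for n in range(N + 1))

    def integrand(theta):
        return (1 - theta) ** N * derivs[N + 1](xi + theta * zeta)

    h = 1.0 / quad_steps
    total = integrand(0.0) + integrand(1.0)
    for j in range(1, quad_steps):
        total += (4 if j % 2 else 2) * integrand(j * h)
    b = (N + 1) * total * h / 3
    remainder = zeta ** (N + 1) / math.factorial(N + 1) * b
    return poly, remainder


if __name__ == "__main__":
    N = 3
    # The n-th derivative of sin is sin(x + n*pi/2).
    derivs = [lambda x, n=n: math.sin(x + n * math.pi / 2) for n in range(N + 2)]
    poly, rem = taylor_parts(derivs, 0.4, 0.9, N)
    print(poly + rem - math.sin(0.4 + 0.9))  # should vanish up to quadrature error
```

The point of the integral form is that the remainder is controlled by bounds on the derivatives of order $N+1$ alone, which is how the remainder $b_\alpha$ enters the estimates that follow.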

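This expresses $I(R)$ as
$$ \sum_{|\alpha|\le N}\frac{1}{(2\pi)^d\alpha!}\iint\hat\chi_2(R\zeta)\,\zeta^\alpha\,e^{i\varphi(\xi+\zeta)-i\varphi(\xi)}\,a(\xi)\,\overline{a}^{(\alpha)}(\xi)\,d\xi\,d\zeta+B_N(R), $$
where $B_N(R)$ is the sum over $\alpha$ with $|\alpha|=N+1$ of constants times
$$ \iint\hat\chi_2(R\zeta)\,\zeta^\alpha\,e^{i(\varphi(\xi+\zeta)-\varphi(\xi))}\,a(\xi)\,b_\alpha(\xi,\zeta)\,d\xi\,d\zeta=\int e^{-i\varphi(\xi)}a(\xi)\left(\int e^{i\varphi(\xi+\zeta)}\,\hat\chi_2(R\zeta)\,\zeta^\alpha\,b_\alpha(\xi,\zeta)\,d\zeta\right)d\xi. \tag{37} $$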


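We take $\ell\in\mathbb{N}$ such that $\ell(1-\delta)>d$ and apply integration by parts $\ell$ times to the inner integral, which we denote by $I(\xi,R)$, by using the identity
$$ \left(\frac{1-i\partial_\zeta\varphi(\xi+\zeta)\cdot\partial_\zeta}{1+(\partial_\zeta\varphi(\xi+\zeta))^2}\right)e^{i\varphi(\xi+\zeta)}=e^{i\varphi(\xi+\zeta)}. $$
Thus, if we write $M$ for the transpose of the differential operator on the left, we have
$$ I(\xi,R)=\int e^{i\varphi(\xi+\zeta)}\,M^\ell\!\left(\hat\chi_2(R\zeta)\,\zeta^\alpha\,b_\alpha(\xi,\zeta)\right)d\zeta. \tag{38} $$
Since $M$ has the form
$$ M=\frac{1}{1+(\partial_\zeta\varphi)^2}+i\,\mathrm{div}_\zeta\!\left(\frac{\partial_\zeta\varphi}{1+(\partial_\zeta\varphi)^2}\right)+\frac{i\partial_\zeta\varphi\cdot\partial_\zeta}{1+(\partial_\zeta\varphi)^2}, $$
$\partial_\zeta^\alpha\varphi$ are bounded for $|\alpha|\ge 2$ and since
$$ C^{-1}\langle\xi+\zeta\rangle^{2(1-\delta)}\le 1+(\partial_\xi\varphi(\xi+\zeta))^2\le C\langle\xi+\zeta\rangle^{2(1-\delta)} $$
by virtue of (27), $M^\ell$ is an $\ell$th order differential operator with respect to $\partial_\zeta$ whose coefficients are bounded by $C\langle\xi+\zeta\rangle^{-\ell(1-\delta)}$. Hence
$$ |I(\xi,R)|\le C\sum_{|\beta+\gamma+\delta|\le\ell}\int\langle\xi+\zeta\rangle^{-\ell(1-\delta)}R^{|\beta|}\,|(\partial_\zeta^\beta\hat\chi_2)(R\zeta)|\,|\zeta^{\alpha-\gamma}|\,|\partial_\zeta^\delta b_\alpha(\xi,\zeta)|\,d\zeta. $$
Since $\hat\chi_2(\zeta)$ is rapidly decreasing and $\partial_\zeta^\delta b_\alpha(\xi,\zeta)$ are bounded, the integrand is bounded for any $L>0$ by a constant times
$$ \langle\xi\rangle^{-\ell(1-\delta)}\langle\zeta\rangle^{\ell(1-\delta)}\langle R\zeta\rangle^{-L}R^{|\beta|}|\zeta|^{N+1-|\gamma|}. $$
It follows, by changing variables $\zeta$ to $\zeta/R$, and by taking $L$ large enough, that for $R>1$
$$ |I(\xi,R)|\le CR^{-N-1-d+\ell}\langle\xi\rangle^{-\ell(1-\delta)}\int\langle\zeta/R\rangle^{\ell(1-\delta)}\langle\zeta\rangle^{N+1-L}\,d\zeta\le C'R^{-N-1-d+\ell}\langle\xi\rangle^{-\ell(1-\delta)}. $$
Thus, for $\ell$ such that $\ell(1-\delta)>d$ we may estimate the remainder $B_N(R)$ in (37) by
$$ |B_N(R)|\le\frac{C}{R^{N+d+1-\ell}}\int|a(\xi)|\langle\xi\rangle^{-\ell(1-\delta)}\,d\xi\le\frac{C}{R^{N+d+1-\ell}}, $$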


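and we may ignore $B_N(R)$ by taking $N$ large enough.

We have next to deal with the first terms in (37), which are the sum over $|\alpha|\le N$ of
$$ A_\alpha=\frac{1}{(2\pi)^d\alpha!}\int\left(\int\hat\chi_2(R\zeta)\,\zeta^\alpha\,e^{i(\varphi(\xi+\zeta)-\varphi(\xi))}\,d\zeta\right)a(\xi)\,\overline{a}^{(\alpha)}(\xi)\,d\xi. \tag{39} $$
By using Taylor's formula, we write
$$ e^{i(\varphi(\xi+\zeta)-\varphi(\xi))}=e^{i\zeta\cdot\partial_\xi\varphi(\xi)}\,e^{i\Psi(\xi,\zeta)},\qquad \Psi(\xi,\zeta)=\zeta\cdot\left(\int_0^1(1-\theta)\frac{\partial^2\varphi}{\partial\xi^2}(\xi+\theta\zeta)\,d\theta\right)\zeta, $$
and expand $e^{i\Psi}$ via Taylor's formula:
$$ e^{i(\varphi(\xi+\zeta)-\varphi(\xi))}=e^{i\zeta\cdot\partial_\xi\varphi(\xi)}\left(\sum_{m=0}^N\frac{(i\Psi)^m}{m!}+\frac{(i\Psi)^{N+1}}{N!}\int_0^1(1-\theta)^Ne^{i\theta\Psi}\,d\theta\right), $$
where we take $N$ large enough so that $(N+1)\delta>d$. We then insert this into the right-hand side of (39). Note that $|\Psi(\xi,\zeta)|\le C\langle\xi\rangle^{-\delta}\langle\zeta\rangle^{\delta}|\zeta|^2$ by virtue of (28). It follows that the contribution to $A_\alpha$ of the term containing $(i\Psi)^{N+1}$ is bounded, by taking $L$ such that $L>(2+\delta)(N+1)+|\alpha|+d$, by
$$ \begin{aligned} C_{LN}\iint\langle R\zeta\rangle^{-L}|\zeta|^{2(N+1)+|\alpha|}\langle\xi\rangle^{-(N+1)\delta}\langle\zeta\rangle^{(N+1)\delta}\,d\xi\,d\zeta&\le C_{LN}R^{-2(N+1)-|\alpha|-d}\int\langle\zeta\rangle^{-L+(N+1)\delta}|\zeta|^{2(N+1)+|\alpha|}\,d\zeta\cdot\int\langle\xi\rangle^{-(N+1)\delta}\,d\xi\\ &\le CR^{-(d+|\alpha|+2N+2)}. \end{aligned} $$
Thus, we may again ignore this term and we are left for $A_\alpha$ with
$$ \frac{1}{(2\pi)^d\alpha!}\sum_{m=0}^N\frac{1}{m!}\int\left(\int e^{i\zeta\cdot\partial_\xi\varphi(\xi)}\,\hat\chi_2(R\zeta)\,\zeta^\alpha\,(i\Psi(\xi,\zeta))^m\,d\zeta\right)a(\xi)\,\overline{a}^{(\alpha)}(\xi)\,d\xi. $$
Here we apply the same argument as in the first step to the inner integral. We expand $\Psi(\xi,\zeta)$ further by Taylor's formula:
$$ \Psi(\xi,\zeta)=\sum_{2\le|\alpha|\le N}\frac{\zeta^\alpha}{\alpha!}\varphi^{(\alpha)}(\xi)+L_N(\xi,\zeta),\qquad L_N(\xi,\zeta)=\sum_{|\alpha|=N+1}C_\alpha\zeta^\alpha\int_0^1(1-\theta)^N\varphi^{(\alpha)}(\xi+\theta\zeta)\,d\theta, $$
and expand the product $\Psi(\xi,\zeta)^m$. We estimate the contribution to $A_\alpha$ of the terms which contain $L_N$, by performing integration by parts $\ell$ times, $\ell(1-\delta)>d$, by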

using the identity
$$ \left(\frac{1-i\partial_\xi\varphi(\xi)\cdot\partial_\zeta}{1+|\partial_\xi\varphi(\xi)|^2}\right)e^{i\zeta\cdot\partial_\xi\varphi(\xi)}=e^{i\zeta\cdot\partial_\xi\varphi(\xi)} $$
and the estimate (27). This yields the bound $CR^{-2(N+1)-d+\ell}$ for these contributions and we may ignore them. The rest is a sum of terms of the form
$$ C_{\beta_1\cdots\beta_m}\,\zeta^\beta\,\varphi^{(\beta_1)}(\xi)\cdots\varphi^{(\beta_m)}(\xi),\qquad \beta=\beta_1+\cdots+\beta_m, $$
and their contributions to $A_\alpha$ are given by constants times
$$ \begin{aligned} &\iint e^{i\zeta\cdot\partial_\xi\varphi(\xi)}\,\hat\chi_2(R\zeta)\,\zeta^{\alpha+\beta}\,\varphi^{(\beta_1)}(\xi)\cdots\varphi^{(\beta_m)}(\xi)\,a(\xi)\,\overline{a}^{(\alpha)}(\xi)\,d\xi\,d\zeta\\ &\qquad=\frac{1}{(iR)^{|\alpha|+|\beta|}R^d}\int(\partial_\zeta^{\alpha+\beta}\chi_2)(\partial_\xi\varphi(\xi)/R)\,\varphi^{(\beta_1)}(\xi)\cdots\varphi^{(\beta_m)}(\xi)\,a(\xi)\,\overline{a}^{(\alpha)}(\xi)\,d\xi. \end{aligned} $$
Here $|\beta_1|,\dots,|\beta_m|\ge 2$ and $|\varphi^{(\beta_1)}(\xi)\cdots\varphi^{(\beta_m)}(\xi)|\le C\langle\xi\rangle^{-m\delta}$ by (28), and this integral is bounded in modulus by
$$ \frac{C}{R^{|\alpha|+|\beta|}R^d}\int|(\partial_\zeta^{\alpha+\beta}\chi_2)(\partial_\xi\varphi(\xi)/R)|\,\langle\xi\rangle^{-m\delta}\,d\xi\le CR^{d\delta/(1-\delta)}R^{-|\alpha+\beta|}R^{-m\delta/(1-\delta)}, $$
by virtue of Lemma 2.3. Thus the main contribution to $I(R)$ is given by the term with $m=0$ and $\alpha=0$:
$$ \frac{1}{(2\pi)^d}\frac{1}{R^d}\int\chi(\partial_\xi\varphi(\xi)/R)^2\,|a(\xi)|^2\,d\xi. $$
Since $a(\xi)\to 1$ as $|\xi|\to\infty$, this is comparable with $CR^{d\delta/(1-\delta)}$ for large $R$ by virtue of Lemma 2.3. The theorem follows.

Acknowledgements

The first author was partially supported by the Danish Natural Science Research Council grant "Mathematical Physics". The second author was supported by JSPS grant in aid for scientific research No. 18340041. This work was done while the second author was visiting the Department of Mathematical Sciences of Aalborg University. He acknowledges the hospitality of the department.

References

[1] W. Craig, T. Kappeler and W. Strauss, Microlocal dispersive smoothing for the Schrödinger equation, Comm. Pure Appl. Math. 48 (1995) 769–860.
[2] S. Doi, Dispersion of singularities of solutions for Schrödinger equations, Comm. Math. Phys. 250 (2004) 473–505.
[3] S. Doi, Smoothness of solutions for Schrödinger equations with unbounded potentials, Publ. RIMS Kyoto Univ. 41 (2005) 175–221.


[4] D. Fujiwara, Remarks on the convergence of the Feynman path integrals, Duke Math. J. 47 (1980) 41–96.
[5] L. Kapitanski, I. Rodnianski and K. Yajima, On the fundamental solution of a perturbed harmonic oscillator, Topol. Methods Nonlinear Anal. 9 (1997) 77–106.
[6] A. Martinez, S. Nakamura and V. Sordoni, Analytic smoothing effect for the Schrödinger equation with long-range perturbation, Comm. Pure Appl. Math. 59(9) (2006) 1330–1351.
[7] A. Martinez and K. Yajima, On the fundamental solution of semiclassical Schrödinger equations at resonant times, Comm. Math. Phys. 216 (2001) 357–373.
[8] W. Schlag, Dispersive estimates for Schrödinger operators: A survey, in Mathematical Aspects of Nonlinear Dispersive Equations, Ann. of Math. Stud., Vol. 163 (Princeton Univ. Press, Princeton, NJ, 2007), pp. 255–285.
[9] K. Yajima, Schrödinger evolution equations with magnetic fields, J. d'Analyse Math. 56 (1991) 29–76.
[10] K. Yajima, Smoothness and non-smoothness of the fundamental solution of time dependent Schrödinger equations, Comm. Math. Phys. 181 (1996) 605–629.
[11] K. Yajima, On fundamental solution of time dependent Schrödinger equations, Contemp. Math. 217 (1998) 49–68.
[12] K. Yajima, On the behavior at infinity of the fundamental solution of time dependent Schrödinger equation, Rev. Math. Phys. 13 (2001) 891–920.
[13] G. P. Zhang and K. Yajima, Smoothing property for Schrödinger equations with potential super-quadratic at infinity, Comm. Math. Phys. 221 (2001) 573–590.
[14] S. Zelditch, Reconstruction of singularities for solutions of Schrödinger equation, Comm. Math. Phys. 90 (1983) 1–26.

March 10, J070-S0129055X1000393X

2010 10:13 WSPC/S0129-055X

148-RMP

Reviews in Mathematical Physics Vol. 22, No. 2 (2010) 207–231 © 2010 by the authors DOI: 10.1142/S0129055X1000393X

ON THE EXISTENCE OF THE DYNAMICS FOR ANHARMONIC QUANTUM OSCILLATOR SYSTEMS∗

BRUNO NACHTERGAELE†, BENJAMIN SCHLEIN‡, ROBERT SIMS§, SHANNON STARR¶ and VALENTIN ZAGREBNOV

†Department of Mathematics, University of California, Davis, CA 95616, USA
[email protected]

‡Centre for Mathematical Sciences, University of Cambridge, Cambridge, CB3 0WB, UK
[email protected]

§Department of Mathematics, University of Arizona, Tucson, AZ 85721, USA
[email protected]

¶Department of Mathematics, University of Rochester, Rochester, NY 14627, USA
[email protected]

Université de la Méditerranée (Aix-Marseille II), Centre de Physique Théorique-UMR 6207 CNRS, Luminy - Case 907, 13288 Marseille, Cedex 09, France
[email protected]

Received 18 September 2009

We construct a W∗-dynamical system describing the dynamics of a class of anharmonic quantum oscillator lattice systems in the thermodynamic limit. Our approach is based on recently proved Lieb–Robinson bounds for such systems on finite lattices [19].

Keywords: Thermodynamic limit; infinite-system dynamics; anharmonic lattice.

Mathematics Subject Classification 2010: 82C10, 82C20, 81Q15, 37K60, 46L55

∗ © 2010 by the authors. This paper may be reproduced, in its entirety, for non-commercial purposes.

1. Introduction

The dynamics of a finite quantum system, i.e. one with a finite number of degrees of freedom described by a Hilbert space $\mathcal{H}$, is given by the Schrödinger equation.


The Hamiltonian $H$ is a densely defined self-adjoint operator on $\mathcal{H}$, and for a vector $\psi(t)$ in the domain of $H$ the state at time $t$ satisfies
$$ i\partial_t\psi(t)=H\psi(t). \tag{1.1} $$
For all initial conditions $\psi(0)\in\mathcal{H}$, the unique solution is given by
$$ \psi(t)=e^{-itH}\psi(0),\quad\text{for all } t\in\mathbb{R}. $$

Due to Stone's Theorem, $e^{-itH}$ is a strongly continuous one-parameter group of unitary operators on $\mathcal{H}$, and the self-adjointness of $H$ is the necessary and sufficient condition for the existence of a unique continuous solution for all times. An alternative description of this dynamics is the so-called Heisenberg picture in which the time evolution is defined on the algebra of observables instead of the Hilbert space of states. The corresponding Heisenberg equation is
$$ \partial_tA(t)=i[H,A(t)], \tag{1.2} $$
where, for each $t\in\mathbb{R}$, $A(t)\in\mathcal{B}(\mathcal{H})$ is a bounded linear operator on $\mathcal{H}$. Its solutions are given by a one-parameter group of ∗-automorphisms, $\tau_t$, of $\mathcal{B}(\mathcal{H})$: $A(t)=\tau_t(A(0))$.

For the description of physical systems we expect the Hamiltonian, $H$, to have some additional properties. For example, for finite systems such as atoms or molecules, stability of the system requires that $H$ is bounded from below. In this case, the infimum of the spectrum is expected to be an eigenvalue and is called the ground state energy. When the model Hamiltonian, $H$, is describing bulk matter rather than finite systems, we expect some additional properties. For example, the stability of matter requires that the ground state energy has a lower bound proportional to $N$, where $N$ is the number of degrees of freedom. Much progress on this stability property has been made in the last several decades [24, 12]. We also expect that the dynamics of local observables of bulk matter, or large systems in general, depends only on the local environment. Mathematically this is best expressed by the existence of the dynamics in the thermodynamic limit, i.e. in infinite volume. This is the question we address in this paper.

There are two settings that allow one to prove a rich set of important physical properties of quantum dynamical systems, including infinite ones: the $C^*$-dynamical systems and the $W^*$-dynamical systems [3]. In both cases, the algebra of observables can be thought of as a norm-closed ∗-subalgebra $\mathcal{A}$ of some algebra of the form $\mathcal{B}(\mathcal{H})$, but in the case of the $W^*$-dynamical systems, we additionally require that the algebra is closed for the weak operator topology, which makes it a von Neumann algebra. For a $C^*$-dynamical system, the group of automorphisms $\tau_t$ is assumed to be strongly continuous, i.e. for all $A\in\mathcal{A}$, the map $t\mapsto\tau_t(A)$ is continuous in $t$ for the operator norm ($C^*$-norm) on $\mathcal{A}$. In a $W^*$-dynamical system the continuity is with respect to the weak topology.
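The Heisenberg equation (1.2) and the automorphism property of $\tau_t$ can be illustrated in the simplest finite-dimensional situation. The Python sketch below is a toy example only, under assumptions that are not in the paper: a two-level system with a diagonal Hamiltonian $H=\mathrm{diag}(E_1,E_2)$ with arbitrarily chosen eigenvalues, for which $\tau_t(A)=e^{itH}Ae^{-itH}$ has matrix elements $e^{it(E_j-E_k)}A_{jk}$. All names are hypothetical.

```python
import cmath

E = (0.3, 1.7)  # assumed eigenvalues of the diagonal toy Hamiltonian H


def tau(t, A):
    """Heisenberg evolution tau_t(A) = e^{itH} A e^{-itH} for H = diag(E)."""
    return [[cmath.exp(1j * t * (E[j] - E[k])) * A[j][k] for k in range(2)]
            for j in range(2)]


def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[j][l] * B[l][k] for l in range(2)) for k in range(2)]
            for j in range(2)]


def commutator_with_H(A):
    """[H, A] for H = diag(E): entries (E_j - E_k) * A_jk."""
    return [[(E[j] - E[k]) * A[j][k] for k in range(2)] for j in range(2)]


def max_abs_diff(A, B):
    """Largest entrywise distance between two 2x2 matrices."""
    return max(abs(A[j][k] - B[j][k]) for j in range(2) for k in range(2))


if __name__ == "__main__":
    A = [[1.0, 2.0 - 1.0j], [0.5j, -1.0]]
    B = [[0.0, 1.0], [1.0, 0.0]]
    t, h = 0.8, 1e-6
    # Central difference of t -> tau_t(A) against i[H, tau_t(A)], cf. (1.2).
    lhs = [[(tau(t + h, A)[j][k] - tau(t - h, A)[j][k]) / (2 * h)
            for k in range(2)] for j in range(2)]
    rhs = [[1j * c for c in row] for row in commutator_with_H(tau(t, A))]
    print(max_abs_diff(lhs, rhs))
    # tau_t respects products, as an automorphism must.
    print(max_abs_diff(tau(t, matmul(A, B)), matmul(tau(t, A), tau(t, B))))
```

In finite dimensions norm continuity of $t\mapsto\tau_t(A)$ is automatic; the distinction between the $C^*$ and $W^*$ settings only becomes relevant for unbounded Hamiltonians on infinite-dimensional spaces.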


In the case of lattice systems with a ﬁnite-dimensional Hilbert space of states associated with each lattice site, such as quantum spin-lattice systems and lattice fermions, it has been known for a long time that under rather general conditions the dynamics can be described by a C ∗ dynamical system, including in the thermodynamic limit [4]. When the Hilbert space at each site is inﬁnite-dimensonal and the ﬁnite-system Hamiltonians are unbounded, this is no longer possible and the weak continuity becomes a natural assumption. The class of systems we will primarily focus on here are lattices of quantum oscillators but the underlying lattice structure is not essential for our method. Systems deﬁned on suitable graphs, such as the systems considered in [6, 7] can also be analyzed with the same methods. In a recent preprint [1], it was shown that convergence of the dynamics in the thermodynamic limit can be obtained for a modiﬁed topology. Here, we follow a somewhat diﬀerent approach. The main diﬀerence is that we study the thermodynamic limit of anharmonic perturbations of an infinite harmonic lattice system described by an explicit W ∗ -dynamical system. The more traditional way is to ﬁrst deﬁne the dynamics of anharmonic systems in ﬁnite volume (which can be done by standard means [21]), and then to study the limit in which the volume tends to inﬁnity. This is what is done in [1], but it appears that controlling the continuity of the limiting dynamics is more straightforward in our approach. In fact, we are able to show that the resulting dynamics for the class of anharmonic lattices we study is indeed weakly continuous, and we obtain a W ∗ dynamical system for the inﬁnite system. The W ∗ -dynamical setting is obtained by considering the GNS representation of a ground state or thermal equilibrium state of the harmonic system. The ground states and thermal states are quasi-free states in the sense of [22], or convex mixtures of quasi-free states. 
In the ground state case the GNS representations are the well-known Fock representations. For the thermal states the GNS representations have been constructed by Araki and Woods [2]. Common to both approaches, ours and the one of [1], is the crucial role played by an estimate of the speed of propagation of perturbations in the system, commonly referred to as Lieb–Robinson bounds [8, 11, 16–18]. Briefly, if $A$ and $B$ are two observables of a spatially extended system, localized in regions $X$ and $Y$ of our graph, respectively, and $\tau_t$ denotes the time evolution of the system, a Lieb–Robinson bound is an estimate of the form $\|[\tau_t(A), B]\| \le C e^{-a(d(X,Y) - v|t|)}$, where $C$, $a$, and $v$ are positive constants and $d(X,Y)$ denotes the distance between $X$ and $Y$. Lieb–Robinson bounds for anharmonic lattice systems were recently proved in [19], and this work builds on the results obtained there. Our results are mainly limited to short-range interactions that are either bounded or unbounded perturbations of the harmonic interaction (linear springs). To conclude the introduction, let us mention that the same question, the existence of the dynamics for infinite oscillator lattices, can be and has been asked for

March 10, J070-S0129055X1000393X

210

2010 10:13 WSPC/S0129-055X

148-RMP

B. Nachtergaele et al.

classical systems. Two classic papers are [10, 15]. Many properties of this classical infinite volume harmonic dynamics have been studied in detail, e.g., [9, 23], and some recent progress on locality estimates for anharmonic systems is reported in [5, 20].

The paper is organized as follows. We begin with a section discussing bounded interactions. In this case, the existence of the dynamics follows by mimicking the proof valid in the context of quantum spin systems. Section 3 describes the infinite volume harmonic dynamics on general graphs. It is motivated by an explicit example on $\mathbb{Z}^d$. Next, in Sec. 4, we discuss finite volume perturbations of the infinite volume harmonic dynamics and prove that such systems satisfy a Lieb–Robinson bound. In Sec. 5, we demonstrate that the existence of the dynamics and its continuity follow from the Lieb–Robinson estimates established in the previous section.

2. Bounded Interactions

The goal of this section is to prove the existence of the dynamics for oscillator systems with bounded interactions. Since oscillator systems with bounded interactions can be treated as a special case of more general models with bounded interactions, we will use a slightly more general setup in this section, which we now introduce. We will denote by $\Gamma$ the underlying structure on which our models will be defined. Here $\Gamma$ will be an arbitrary set of sites equipped with a metric $d$. For $\Gamma$ with countably infinite cardinality, we will need to assume that there exists a non-increasing function $F : [0, \infty) \to (0, \infty)$ for which:

(i) $F$ is uniformly integrable over $\Gamma$, i.e.
\[ \|F\| := \sup_{x \in \Gamma} \sum_{y \in \Gamma} F(d(x,y)) < \infty, \tag{2.1} \]
and

(ii) $F$ satisfies
\[ C := \sup_{x,y \in \Gamma} \sum_{z \in \Gamma} \frac{F(d(x,z))\, F(d(z,y))}{F(d(x,y))} < \infty. \tag{2.2} \]

Given such a set $\Gamma$ and a function $F$, by the triangle inequality, for any $a \ge 0$ the function $F_a(x) = e^{-ax} F(x)$ also satisfies (i) and (ii) above with $\|F_a\| \le \|F\|$ and $C_a \le C$. In typical examples, one has that $\Gamma \subset \mathbb{Z}^d$ for some integer $d \ge 1$, and the metric is just given by $d(x,y) = |x - y| = \sum_{j=1}^d |x_j - y_j|$. In this case, the function $F$ can be chosen as $F(|x|) = (1 + |x|)^{-d-\epsilon}$ for any $\epsilon > 0$.

To each $x \in \Gamma$, we will associate a Hilbert space $\mathcal{H}_x$. In many relevant systems, one considers $\mathcal{H}_x = L^2(\mathbb{R}, dq_x)$, but this is not essential. With any finite subset $\Lambda \subset \Gamma$, the Hilbert space of states over $\Lambda$ is given by
\[ \mathcal{H}_\Lambda = \bigotimes_{x \in \Lambda} \mathcal{H}_x, \]
and the local algebra of observables over $\Lambda$ is then defined to be
\[ \mathcal{A}_\Lambda = \bigotimes_{x \in \Lambda} \mathcal{B}(\mathcal{H}_x), \]
where $\mathcal{B}(\mathcal{H}_x)$ denotes the algebra of bounded linear operators on $\mathcal{H}_x$. If $\Lambda_1 \subset \Lambda_2$, then there is a natural way of identifying $\mathcal{A}_{\Lambda_1} \subset \mathcal{A}_{\Lambda_2}$, and we may thereby define the algebra of quasi-local observables by the inductive limit
\[ \mathcal{A}_\Gamma = \bigcup_{\Lambda \subset \Gamma} \mathcal{A}_\Lambda, \]

where the union is over all finite subsets $\Lambda \subset \Gamma$; see [3, 4] for a discussion of these issues in general.

The result discussed in this section corresponds to bounded perturbations of local self-adjoint Hamiltonians. We fix a collection of on-site local operators $H^{\mathrm{loc}} = \{H_x\}_{x \in \Gamma}$ where each $H_x$ is a self-adjoint operator over $\mathcal{H}_x$. In addition, we will consider a general class of bounded perturbations. These are defined in terms of an interaction $\Phi$, which is a map from the set of subsets of $\Gamma$ to $\mathcal{A}_\Gamma$ with the property that for each finite set $X \subset \Gamma$, $\Phi(X) \in \mathcal{A}_X$ and $\Phi(X)^* = \Phi(X)$. As with the Lieb–Robinson bound proven in [19], we will need a growth condition on the set of interactions $\Phi$ for which we can prove the existence of the dynamics in the thermodynamic limit. This condition is expressed in terms of the following norm. For any $a \ge 0$, denote by $\mathcal{B}_a(\Gamma)$ the set of interactions for which
\[ \|\Phi\|_a := \sup_{x,y \in \Gamma} \frac{1}{F_a(d(x,y))} \sum_{X \ni x,y} \|\Phi(X)\| < \infty. \tag{2.3} \]
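As a rough numerical illustration of conditions (2.1) and (2.2), the following sketch evaluates finite-window analogues of $\|F\|$ and $C$ for the polynomial weight $F(r) = (1+r)^{-d-\epsilon}$ on a segment of $\mathbb{Z}$. The window size, the choice $d = 1$, $\epsilon = 0.5$, the value $a = 0.3$, and all function names are assumptions made for this example only; a finite window can suggest, but of course not prove, that the suprema over an infinite $\Gamma$ are finite.

```python
import math

def F(r, a=0.0, d=1, eps=0.5):
    """Weight F_a(r) = exp(-a r) * (1 + r)^(-(d + eps)); a = 0 gives F itself."""
    return math.exp(-a * r) * (1.0 + r) ** (-(d + eps))

def window_norm(sites, a=0.0):
    """Finite-window analogue of (2.1): max over x of sum_y F_a(|x - y|)."""
    return max(sum(F(abs(x - y), a) for y in sites) for x in sites)

def window_convolution_constant(sites, a=0.0):
    """Finite-window analogue of (2.2): max over x, y of the ratio in (2.2)."""
    best = 0.0
    for x in sites:
        for y in sites:
            total = sum(F(abs(x - z), a) * F(abs(z - y), a) for z in sites)
            best = max(best, total / F(abs(x - y), a))
    return best

# Hypothetical window: 61 sites of the one-dimensional lattice.
sites = list(range(-30, 31))
norm_F = window_norm(sites)
C = window_convolution_constant(sites)
C_a = window_convolution_constant(sites, a=0.3)

print("window ||F||  :", norm_F)
print("window C      :", C)
print("window C_a    :", C_a)
```

In this window, the exponentially damped weight gives a convolution constant no larger than the undamped one, which is the termwise consequence of the triangle inequality mentioned in the text.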

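Now, for a fixed sequence of local Hamiltonians $H^{\mathrm{loc}} = \{H_x\}_{x \in \Gamma}$, as described above, an interaction $\Phi \in \mathcal{B}_a(\Gamma)$, and a finite subset $\Lambda \subset \Gamma$, we will consider self-adjoint Hamiltonians of the form
\[ H_\Lambda = H_\Lambda^{\mathrm{loc}} + H_\Lambda^{\Phi} = \sum_{x \in \Lambda} H_x + \sum_{X \subset \Lambda} \Phi(X), \tag{2.4} \]
acting on $\mathcal{H}_\Lambda$ (with domain given by $\bigotimes_{x \in \Lambda} D(H_x)$ where $D(H_x) \subset \mathcal{H}_x$ denotes the domain of $H_x$). As these operators are self-adjoint, they generate a dynamics, or time evolution, $\{\tau_t^\Lambda\}$, which is the one-parameter group of automorphisms defined by
\[ \tau_t^\Lambda(A) = e^{itH_\Lambda} A e^{-itH_\Lambda} \quad \text{for any } A \in \mathcal{A}_\Lambda. \]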


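Theorem 2.1. Under the conditions stated above, for all $t \in \mathbb{R}$, $A \in \mathcal{A}_\Gamma$, the norm limit
\[ \lim_{\Lambda \to \Gamma} \tau_t^\Lambda(A) = \tau_t(A) \tag{2.5} \]
exists in the sense of non-decreasing exhaustive sequences of finite volumes $\Lambda$ and defines a group of ∗-automorphisms $\tau_t$ on the completion of $\mathcal{A}_\Gamma$. The convergence is uniform for $t$ in a compact set.

Proof. Let $\Lambda \subset \Gamma$ be a finite set. Consider the unitary propagator
\[ U_\Lambda(t,s) = e^{itH_\Lambda^{\mathrm{loc}}}\, e^{-i(t-s)H_\Lambda}\, e^{-isH_\Lambda^{\mathrm{loc}}} \tag{2.6} \]
and its associated interaction-picture evolution defined by
\[ \tau_{t,\mathrm{int}}^\Lambda(A) = U_\Lambda(0,t)\, A\, U_\Lambda(t,0) \quad \text{for all } A \in \mathcal{A}_\Gamma. \tag{2.7} \]
Clearly, $U_\Lambda(t,t) = \mathbb{1}$ for all $t \in \mathbb{R}$, and it is also easy to check that
\[ i \frac{d}{dt} U_\Lambda(t,s) = H_\Lambda^{\mathrm{int}}(t)\, U_\Lambda(t,s) \quad \text{and} \quad -i \frac{d}{ds} U_\Lambda(t,s) = U_\Lambda(t,s)\, H_\Lambda^{\mathrm{int}}(s) \]
with the time-dependent generator
\[ H_\Lambda^{\mathrm{int}}(t) = e^{iH_\Lambda^{\mathrm{loc}} t}\, H_\Lambda^{\Phi}\, e^{-iH_\Lambda^{\mathrm{loc}} t} = \sum_{Z \subset \Lambda} e^{iH_\Lambda^{\mathrm{loc}} t}\, \Phi(Z)\, e^{-iH_\Lambda^{\mathrm{loc}} t}. \tag{2.8} \]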

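Fix $T > 0$ and $X \subset \Gamma$ finite. For any $A \in \mathcal{A}_X$, we will show that for any non-decreasing, exhausting sequence $\{\Lambda_n\}$ of $\Gamma$, the sequence $\{\tau_{t,\mathrm{int}}^{\Lambda_n}(A)\}$ is Cauchy in norm, uniformly for $t \in [-T, T]$. Moreover, the bounds establishing the Cauchy property depend on $A$ only through $X$ and $\|A\|$. Since
\[ \tau_t^\Lambda(A) = \tau_{t,\mathrm{int}}^\Lambda\big(e^{itH_\Lambda^{\mathrm{loc}}} A e^{-itH_\Lambda^{\mathrm{loc}}}\big) = \tau_{t,\mathrm{int}}^\Lambda\big(e^{it\sum_{x \in X} H_x} A e^{-it\sum_{x \in X} H_x}\big), \]
an analogous statement then immediately follows for $\{\tau_t^{\Lambda_n}(A)\}$, since they are all also localized in $X$ and have the same norm as $A$. Take $n \le m$ with $X \subset \Lambda_n \subset \Lambda_m$ and calculate
\[ \tau_{t,\mathrm{int}}^{\Lambda_m}(A) - \tau_{t,\mathrm{int}}^{\Lambda_n}(A) = \int_0^t \frac{d}{ds}\big\{ U_{\Lambda_m}(0,s)\, U_{\Lambda_n}(s,t)\, A\, U_{\Lambda_n}(t,s)\, U_{\Lambda_m}(s,0) \big\}\, ds. \tag{2.9} \]
A short calculation shows that
\begin{align}
\frac{d}{ds}\, & U_{\Lambda_m}(0,s)\, U_{\Lambda_n}(s,t)\, A\, U_{\Lambda_n}(t,s)\, U_{\Lambda_m}(s,0) \nonumber \\
&= i\, U_{\Lambda_m}(0,s)\, \big[ (H_{\Lambda_m}^{\mathrm{int}}(s) - H_{\Lambda_n}^{\mathrm{int}}(s)),\, U_{\Lambda_n}(s,t)\, A\, U_{\Lambda_n}(t,s) \big]\, U_{\Lambda_m}(s,0) \nonumber \\
&= i\, U_{\Lambda_m}(0,s)\, e^{isH_{\Lambda_n}^{\mathrm{loc}}}\, \big[ \tilde{B}(s),\, \tau_{s-t}^{\Lambda_n}(\tilde{A}(t)) \big]\, e^{-isH_{\Lambda_n}^{\mathrm{loc}}}\, U_{\Lambda_m}(s,0), \tag{2.10}
\end{align}
where
\[ \tilde{A}(t) = e^{-itH_{\Lambda_n}^{\mathrm{loc}}} A\, e^{itH_{\Lambda_n}^{\mathrm{loc}}} = e^{-itH_X^{\mathrm{loc}}} A\, e^{itH_X^{\mathrm{loc}}} \tag{2.11} \]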


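and
\begin{align}
\tilde{B}(s) &= e^{-isH_{\Lambda_n}^{\mathrm{loc}}} \big( H_{\Lambda_m}^{\mathrm{int}}(s) - H_{\Lambda_n}^{\mathrm{int}}(s) \big) e^{isH_{\Lambda_n}^{\mathrm{loc}}} \nonumber \\
&= \sum_{Z \subset \Lambda_m} e^{isH_{\Lambda_m \setminus \Lambda_n}^{\mathrm{loc}}}\, \Phi(Z)\, e^{-isH_{\Lambda_m \setminus \Lambda_n}^{\mathrm{loc}}} - \sum_{Z \subset \Lambda_n} \Phi(Z) \nonumber \\
&= \sum_{\substack{Z \subset \Lambda_m : \\ Z \cap \Lambda_m \setminus \Lambda_n \neq \emptyset}} e^{isH_{\Lambda_m \setminus \Lambda_n}^{\mathrm{loc}}}\, \Phi(Z)\, e^{-isH_{\Lambda_m \setminus \Lambda_n}^{\mathrm{loc}}}. \tag{2.12}
\end{align}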

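Combining the results of (2.9)–(2.12), and using unitarity, we find that
\[ \big\| \tau_{t,\mathrm{int}}^{\Lambda_m}(A) - \tau_{t,\mathrm{int}}^{\Lambda_n}(A) \big\| \le \int_0^t \big\| [\tau_{s-t}^{\Lambda_n}(\tilde{A}(t)), \tilde{B}(s)] \big\|\, ds \tag{2.13} \]
and by the Lieb–Robinson bound proven in [19], it is clear that
\begin{align}
\big\| [\tau_{s-t}^{\Lambda_n}(\tilde{A}(t)), \tilde{B}(s)] \big\|
&\le \sum_{\substack{Z \subset \Lambda_m : \\ Z \cap \Lambda_m \setminus \Lambda_n \neq \emptyset}} \big\| [\tau_{s-t}^{\Lambda_n}(\tilde{A}(t)),\, e^{isH_{\Lambda_m \setminus \Lambda_n}^{\mathrm{loc}}}\, \Phi(Z)\, e^{-isH_{\Lambda_m \setminus \Lambda_n}^{\mathrm{loc}}}] \big\| \nonumber \\
&\le \frac{2\|A\|}{C_a} \big( e^{2\|\Phi\|_a C_a |t-s|} - 1 \big) \sum_{y \in \Lambda_m \setminus \Lambda_n} \sum_{\substack{Z \subset \Lambda_m : \\ y \in Z}} \|\Phi(Z)\| \sum_{x \in X} \sum_{z \in Z} F_a(d(x,z)) \nonumber \\
&\le \frac{2\|A\|}{C_a} \big( e^{2\|\Phi\|_a C_a |t-s|} - 1 \big) \sum_{y \in \Lambda_m \setminus \Lambda_n} \sum_{z \in \Lambda_m} \sum_{\substack{Z \subset \Lambda_m : \\ y, z \in Z}} \|\Phi(Z)\| \sum_{x \in X} F_a(d(x,z)) \nonumber \\
&\le \frac{2\|A\| \|\Phi\|_a}{C_a} \big( e^{2\|\Phi\|_a C_a |t-s|} - 1 \big) \sum_{y \in \Lambda_m \setminus \Lambda_n} \sum_{x \in X} \sum_{z \in \Lambda_m} F_a(d(x,z))\, F_a(d(z,y)) \nonumber \\
&\le 2\|A\| \|\Phi\|_a \big( e^{2\|\Phi\|_a C_a |t-s|} - 1 \big) \sum_{y \in \Lambda_m \setminus \Lambda_n} \sum_{x \in X} F_a(d(x,y)). \tag{2.14}
\end{align}
With the estimate above and the properties of the function $F_a$, it is clear that
\[ \sup_{t \in [-T,T]} \big\| \tau_{t,\mathrm{int}}^{\Lambda_m}(A) - \tau_{t,\mathrm{int}}^{\Lambda_n}(A) \big\| \to 0 \quad \text{as } n, m \to \infty, \tag{2.15} \]
and the rate of convergence only depends on the norm $\|A\|$ and the set $X$ where $A$ is supported. This proves the claim.

If all local Hamiltonians $H_x$ are bounded, $\{\tau_t\}$ is strongly continuous. If the $H_x$ are allowed to be densely defined unbounded self-adjoint operators, we only have weak continuity and the dynamics is more naturally defined on a von Neumann algebra. This can be done when we have a sufficiently nice invariant state for the model with only the on-site Hamiltonians. For example, suppose that for each $x \in \Gamma$,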


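we have a normalized eigenvector $\phi_x$ of $H_x$. Then, for all $A \in \mathcal{A}_\Lambda$, for any finite $\Lambda \subset \Gamma$, define
\[ \rho(A) = \Big\langle \bigotimes_{x \in \Lambda} \phi_x,\; A \bigotimes_{x \in \Lambda} \phi_x \Big\rangle. \tag{2.16} \]
$\rho$ can be regarded as a state of the infinite system defined on the norm completion of $\mathcal{A}_\Gamma$. The GNS Hilbert space $\mathcal{H}_\rho$ of $\rho$ can be constructed as the closure of $\mathcal{A}_\Gamma \bigotimes_{x \in \Gamma} \phi_x$. Let $\psi \in \mathcal{A}_\Gamma \bigotimes_{x \in \Gamma} \phi_x$. Then
\begin{align}
\| (\tau_t(A) - \tau_{t_0}(A))\psi \| \le{} & \| (\tau_t(A) - \tau_t^{(\Lambda_n)}(A))\psi \| + \| (\tau_t^{(\Lambda_n)}(A) - \tau_{t_0}^{(\Lambda_n)}(A))\psi \| \nonumber \\
& + \| (\tau_{t_0}^{(\Lambda_n)}(A) - \tau_{t_0}(A))\psi \|. \tag{2.17}
\end{align}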

For sufficiently large $\Lambda_n$, the $\lim_{t \to t_0}$ of the middle term vanishes by Stone's theorem. The two other terms are handled by (2.5). It is clear how to extend the continuity to $\psi \in \mathcal{H}_\rho$. We will discuss this type of situation in more detail in the next three sections where we consider models that include quadratic (unbounded) interactions as well.

3. The Harmonic Lattice

As noted in the introduction, we will consider anharmonic perturbations of infinite harmonic lattices. In this section, we discuss the properties of the harmonic systems that we need to assume in general in order to study the perturbations in the thermodynamic limit. We will also show in detail that a standard harmonic lattice model possesses all the required properties.

3.1. The CCR algebra of observables

We begin by introducing the CCR algebra on which the harmonic dynamics will be defined. Following [14], one can define the CCR algebra over any real linear space $D$ equipped with a non-degenerate, symplectic bilinear form $\sigma$, i.e. $\sigma : D \times D \to \mathbb{R}$ with the property that if $\sigma(f,g) = 0$ for all $f \in D$, then $g = 0$, and
\[ \sigma(f,g) = -\sigma(g,f) \quad \text{for all } f, g \in D. \tag{3.1} \]
In typical examples, $D$ will be a complex inner product space associated with $\Gamma$, e.g., $D = \ell^2(\Gamma)$ or a subspace thereof such as $D = \ell^1(\Gamma)$, or $\ell^2(\Gamma_0)$, with $\Gamma_0 \subset \Gamma$, and
\[ \sigma(f,g) = \mathrm{Im}[\langle f, g \rangle]. \tag{3.2} \]
The Weyl operators over $D$ are defined by associating non-zero elements $W(f)$ to each $f \in D$ which satisfy
\[ W(f)^* = W(-f) \quad \text{for each } f \in D, \tag{3.3} \]


and
\[ W(f) W(g) = e^{-i\sigma(f,g)/2}\, W(f+g) \quad \text{for all } f, g \in D. \tag{3.4} \]
It is well known that there is a unique, up to ∗-isomorphism, C*-algebra generated by these Weyl operators with the property that $W(0) = \mathbb{1}$, $W(f)$ is unitary for all $f \in D$, and $\|W(f) - \mathbb{1}\| = 2$ for all $f \in D \setminus \{0\}$, see, e.g., [4, Theorem 5.2.8]. This algebra, commonly known as the CCR algebra, or Weyl algebra, over $D$, we will denote by $\mathcal{W} = \mathcal{W}(D)$.

3.2. Quasi-free dynamics

The anharmonic dynamics we study in this paper will be defined as perturbations of harmonic, technically quasi-free, dynamics. A quasi-free dynamics on $\mathcal{W}(D)$ is a one-parameter group of ∗-automorphisms $\tau_t$ of the form
\[ \tau_t(W(f)) = W(T_t f), \quad f \in D, \tag{3.5} \]
where $T_t : D \to D$ is a group of real-linear, symplectic transformations, i.e.
\[ \sigma(T_t f, T_t g) = \sigma(f, g). \tag{3.6} \]
As $\|W(f) - W(g)\| = 2$ for all $f \neq g \in D$, one should not expect $\tau_t$ to be strongly continuous; only a weaker form of continuity is present. This means that $\tau_t$ does not define a C*-dynamical system on $\mathcal{W}$, and thus we look for a W*-dynamical setting in which the weaker form of continuity is naturally expressed.

In the present context, it suffices to regard a W*-dynamical system as a pair $\{\mathcal{M}, \alpha_t\}$ where $\mathcal{M}$ is a von Neumann algebra and $\alpha_t$ is a weakly continuous, one-parameter group of ∗-automorphisms of $\mathcal{M}$. For the harmonic systems we are considering, a specific W*-dynamical system arises as follows. Let $\rho$ be a state on $\mathcal{W}$ and denote by $(\mathcal{H}_\rho, \pi_\rho, \Omega_\rho)$ the corresponding GNS representation. We will assume that $\rho$ is both regular and $\tau_t$-invariant. Recall that $\rho$ is regular if and only if $t \mapsto \rho(W(tf))$ is continuous for all $f \in D$, and $\tau_t$-invariance means
\[ \rho(\tau_t(A)) = \rho(A) \quad \text{for all } A \in \mathcal{W}. \tag{3.7} \]
For the von Neumann algebra $\mathcal{M}$, take the weak closure of $\pi_\rho(\mathcal{W})$ in $\mathcal{L}(\mathcal{H}_\rho)$ and let $\alpha_t$ be the weakly continuous, one-parameter group of ∗-automorphisms of $\mathcal{M}$ obtained by lifting $\tau_t$ to $\mathcal{M}$. The latter step is possible since $\rho$ is $\tau_t$-invariant; see, e.g., [3, Corollary 2.3.17].

3.3. Lieb–Robinson bounds for harmonic lattices

To prove the existence of the dynamics for anharmonic models, we use that the unperturbed harmonic system satisfies a Lieb–Robinson bound. Such an estimate


depends directly on properties of $\sigma$ and $T_t$. In fact, it is easy to calculate that
\begin{align}
[\tau_t(W(f)), W(g)] &= \{ W(T_t f) - W(g) W(T_t f) W(-g) \}\, W(g) \nonumber \\
&= \{ 1 - e^{i\sigma(T_t f, g)} \}\, W(T_t f) W(g), \tag{3.8}
\end{align}
using the Weyl relations (3.4). For the examples we consider below, one can prove that for every $a > 0$, there exist positive numbers $c_a$ and $v_a$ for which
\[ |\sigma(T_t f, g)| \le c_a e^{v_a |t|} \sum_{x,y \in \mathbb{Z}^d} |f(x)|\, |g(y)|\, \frac{e^{-a|x-y|}}{(1 + |x-y|)^{d+1}} \tag{3.9} \]
holds for all $t \in \mathbb{R}$ and all $f, g \in \ell^2(\mathbb{Z}^d)$. In general, we will assume that the harmonic dynamics satisfies an estimate of this type. Namely, we suppose that there exists a number $a_0 > 0$ for which given $0 < a \le a_0$, there are numbers $c_a$ and $v_a$ for which
\[ |1 - e^{i\sigma(T_t f, g)}| \le c_a e^{v_a |t|} \sum_{x,y \in \Gamma} |f(x)|\, |g(y)|\, F_a(d(x,y)) \tag{3.10} \]
holds for all $t \in \mathbb{R}$ and all $f, g \in \ell^2(\Gamma)$. Here we describe the spatial decay in $\Gamma$ through the functions $F_a$ as introduced in Sec. 2. Since the Weyl operators are unitary, the norm estimate
\[ \| [\tau_t(W(f)), W(g)] \| \le c_a e^{v_a |t|} \sum_{x,y} |f(x)|\, |g(y)|\, F_a(d(x,y)), \tag{3.11} \]

readily follows.

3.4. An important example

Using the example given below, we illustrate the general discussion above in terms of a standard harmonic model defined over $\Gamma = \mathbb{Z}^d$. We begin with a description of some well-known calculations that are valid for these models when restricted to a finite volume. This analysis motivates the definition of the harmonic dynamics in the infinite volume. We then demonstrate that this infinite volume dynamics satisfies a Lieb–Robinson bound. By representing this dynamics in a suitable state, the relevant weak continuity is readily verified. Interestingly, our analysis also applies to the massless case of $\omega = 0$, see below, and we discuss this briefly. We end this subsection with some final comments.

3.4.1. Finite volume analysis

We consider a system of coupled harmonic oscillators restricted to a finite volume. Specifically, on cubic subsets $\Lambda_L = (-L, L]^d \subset \mathbb{Z}^d$, we analyze Hamiltonians of the form
\[ H_L^h = \sum_{x \in \Lambda_L} p_x^2 + \omega^2 q_x^2 + \sum_{j=1}^d \lambda_j (q_x - q_{x+e_j})^2 \tag{3.12} \]


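acting in the Hilbert space
\[ \mathcal{H}_{\Lambda_L} = \bigotimes_{x \in \Lambda_L} L^2(\mathbb{R}, dq_x). \tag{3.13} \]
Here the quantities $p_x$ and $q_x$, which appear in (3.12) above, are the single site momentum and position operators regarded as operators on the full Hilbert space $\mathcal{H}_{\Lambda_L}$ by setting
\[ p_x = \mathbb{1} \otimes \cdots \otimes \mathbb{1} \otimes \Big( -i \frac{d}{dq} \Big) \otimes \mathbb{1} \cdots \otimes \mathbb{1} \quad \text{and} \quad q_x = \mathbb{1} \otimes \cdots \otimes \mathbb{1} \otimes q \otimes \mathbb{1} \cdots \otimes \mathbb{1}, \tag{3.14} \]
i.e. these operators act non-trivially only in the $x$th factor of $\mathcal{H}_{\Lambda_L}$. These operators satisfy the canonical commutation relations
\[ [p_x, p_y] = [q_x, q_y] = 0 \quad \text{and} \quad [q_x, p_y] = i\delta_{x,y}, \tag{3.15} \]
valid for all $x, y \in \Lambda_L$. In addition, $\{e_j\}_{j=1}^d$ are the canonical basis vectors in $\mathbb{Z}^d$, the numbers $\lambda_j \ge 0$ and $\omega \ge 0$ are the parameters of the system, and the Hamiltonian is assumed to have periodic boundary conditions, in the sense that $q_{x+e_j} = q_{x-(2L-1)e_j}$ if $x \in \Lambda_L$ but $x + e_j \notin \Lambda_L$. It is well known that Hamiltonians of this form can be diagonalized in Fourier space. We review this quickly to establish some notation and refer the interested reader to [19] for more details. Introducing the operators
\[ Q_k = \frac{1}{\sqrt{|\Lambda_L|}} \sum_{x \in \Lambda_L} e^{-ik \cdot x} q_x \quad \text{and} \quad P_k = \frac{1}{\sqrt{|\Lambda_L|}} \sum_{x \in \Lambda_L} e^{-ik \cdot x} p_x, \tag{3.16} \]
defined for each $k \in \Lambda_L^* = \{ \frac{x\pi}{L} : x \in \Lambda_L \}$, and setting
\[ \gamma(k) = \sqrt{\omega^2 + 4 \sum_{j=1}^d \lambda_j \sin^2(k_j/2)}, \tag{3.17} \]
one finds that
\[ H_L^h = \sum_{k \in \Lambda_L^*} \gamma(k)\, (2 b_k^* b_k + 1) \tag{3.18} \]
where the operators $b_k$ and $b_k^*$ satisfy
\[ b_k = \frac{1}{\sqrt{2\gamma(k)}} P_k - i \sqrt{\frac{\gamma(k)}{2}}\, Q_k \quad \text{and} \quad b_k^* = \frac{1}{\sqrt{2\gamma(k)}} P_{-k} + i \sqrt{\frac{\gamma(k)}{2}}\, Q_{-k}. \tag{3.19} \]
In this sense, we regard the Hamiltonian $H_L^h$ as diagonalizable.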
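The dispersion relation (3.17) can be checked numerically in a small case. The sketch below assumes $d = 1$ and hypothetical parameter values $L = 4$, $\omega = 0.7$, $\lambda = 1.3$; the helper names are illustrative and not taken from the paper. It applies the matrix $h$ of the quadratic form $\sum_x \omega^2 q_x^2 + \lambda (q_x - q_{x+1})^2$, with periodic boundary conditions, to the plane waves $e^{ikx}$ with $k \in \Lambda_L^*$ and compares the result with $\gamma(k)^2 e^{ikx}$.

```python
import cmath
import math

def gamma(k, omega, lam):
    """Dispersion relation (3.17) for d = 1."""
    return math.sqrt(omega ** 2 + 4.0 * lam * math.sin(k / 2.0) ** 2)

def apply_h(v, omega, lam):
    """Apply the matrix of sum_x omega^2 q_x^2 + lam (q_x - q_{x+1})^2 on a periodic chain."""
    n = len(v)
    return [
        (omega ** 2 + 2.0 * lam) * v[i] - lam * (v[(i + 1) % n] + v[(i - 1) % n])
        for i in range(n)
    ]

# Hypothetical parameters chosen only for this illustration.
L, omega, lam = 4, 0.7, 1.3
sites = list(range(-L + 1, L + 1))          # Lambda_L = (-L, L], i.e. 2L sites
momenta = [m * math.pi / L for m in sites]  # Lambda_L^* = {x pi / L : x in Lambda_L}

max_residual = 0.0
for k in momenta:
    wave = [cmath.exp(1j * k * x) for x in sites]
    hv = apply_h(wave, omega, lam)
    g2 = gamma(k, omega, lam) ** 2
    residual = max(abs(a - g2 * b) for a, b in zip(hv, wave))
    max_residual = max(max_residual, residual)

print("largest deviation from h e^{ikx} = gamma(k)^2 e^{ikx}:", max_residual)
```

Each plane wave is an eigenvector of $h$ with eigenvalue $\gamma(k)^2$, which is the classical counterpart of the statement that the mode $k$ in (3.18) is an oscillator of the form $P^2 + \gamma(k)^2 Q^2$.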


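Using the above diagonalization, one can determine the action of the dynamics corresponding to $H_L^h$ on the Weyl algebra $\mathcal{W}(\ell^2(\Lambda_L))$. In fact, by setting
\[ W(f) = \exp\Big[ i \sum_{x \in \Lambda_L} \big( \mathrm{Re}[f(x)]\, q_x + \mathrm{Im}[f(x)]\, p_x \big) \Big], \tag{3.20} \]
for each $f \in \ell^2(\Lambda_L)$, it is easy to verify that (3.3) and (3.4) hold with $\sigma(f,g) = \mathrm{Im}[\langle f, g \rangle]$. It is convenient to express these Weyl operators in terms of annihilation and creation operators, i.e.
\[ a_x = \frac{1}{\sqrt{2}} (q_x + i p_x) \quad \text{and} \quad a_x^* = \frac{1}{\sqrt{2}} (q_x - i p_x), \tag{3.21} \]
which satisfy
\[ [a_x, a_y] = [a_x^*, a_y^*] = 0 \quad \text{and} \quad [a_x, a_y^*] = \delta_{x,y} \quad \text{for all } x, y \in \Lambda_L. \tag{3.22} \]
One finds that
\[ W(f) = \exp\Big[ \frac{i}{\sqrt{2}} \big( a(f) + a^*(f) \big) \Big], \tag{3.23} \]
where, for each $f \in \ell^2(\Lambda_L)$, we have set
\[ a(f) = \sum_{x \in \Lambda_L} \overline{f(x)}\, a_x, \quad a^*(f) = \sum_{x \in \Lambda_L} f(x)\, a_x^*. \tag{3.24} \]
Now, the dynamics corresponding to $H_L^h$, which we denote by $\tau_t^L$, is trivial with respect to the diagonalizing variables, i.e.
\[ \tau_t^L(b_k) = e^{-2i\gamma(k)t}\, b_k \quad \text{and} \quad \tau_t^L(b_k^*) = e^{2i\gamma(k)t}\, b_k^*, \tag{3.25} \]
where $b_k$ and $b_k^*$ are as defined in (3.19). Hence, if we further introduce
\[ b_x = \frac{1}{\sqrt{|\Lambda_L|}} \sum_{k \in \Lambda_L^*} e^{ikx}\, b_k \quad \text{and} \quad b_x^* = \frac{1}{\sqrt{|\Lambda_L|}} \sum_{k \in \Lambda_L^*} e^{ikx}\, b_k^*, \tag{3.26} \]
for each $x \in \Lambda_L$ and, analogously to (3.24), define
\[ b(f) = \sum_{x \in \Lambda_L} \overline{f(x)}\, b_x, \quad b^*(f) = \sum_{x \in \Lambda_L} f(x)\, b_x^*, \tag{3.27} \]
for each $f \in \ell^2(\Lambda_L)$, then one has that
\[ \tau_t^L(b(f)) = b([\mathcal{F}^{-1} M_t \mathcal{F}] f), \tag{3.28} \]
where $\mathcal{F}$ is the unitary Fourier transform on $\ell^2(\Lambda_L)$ and $M_t$ is the operator of multiplication by $e^{2i\gamma(k)t}$ in Fourier space with $\gamma(k)$ as in (3.17). We need only determine the relation between the $a$'s and the $b$'s.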


A short calculation shows that there exists a linear mapping $U : \ell^2(\Lambda_L) \to \ell^2(\Lambda_L)$ and an anti-linear mapping $V : \ell^2(\Lambda_L) \to \ell^2(\Lambda_L)$ for which
\[ b(f) = a(Uf) + a^*(Vf), \tag{3.29} \]
a relation known in the literature as a Bogoliubov transformation [13]. In fact, one has that
\[ U = \frac{i}{2} \mathcal{F}^{-1} M_{\Gamma_+} \mathcal{F} \quad \text{and} \quad V = \frac{i}{2} \mathcal{F}^{-1} M_{\Gamma_-} \mathcal{F} J \tag{3.30} \]
where $J$ is complex conjugation and $M_{\Gamma_\pm}$ is the operator of multiplication by
\[ \Gamma_\pm(k) = \frac{1}{\sqrt{\gamma(k)}} \pm \sqrt{\gamma(k)}, \tag{3.31} \]
with $\gamma(k)$ as in (3.17). Using the fact that $\Gamma_\pm$ is real valued and even, it is easy to check that
\[ U^* U - V^* V = \mathbb{1} = U U^* - V V^* \tag{3.32} \]
and
\[ V^* U - U^* V = 0 = V U^* - U V^* \tag{3.33} \]
where we stress that $V^*$ is the adjoint of the anti-linear mapping $V$. The relation (3.29) is invertible, in fact,
\[ a(f) = b(U^* f) - b^*(V^* f), \tag{3.34} \]
and therefore
\[ W(f) = \exp\Big[ \frac{i}{\sqrt{2}} \big( b((U^* - V^*)f) + b^*((U^* - V^*)f) \big) \Big]. \tag{3.35} \]
Clearly then,
\[ \tau_t(W(f)) = W(T_t f), \tag{3.36} \]
where the mapping $T_t$ is given by
\[ T_t = (U + V)\, \mathcal{F}^{-1} M_t \mathcal{F}\, (U^* - V^*), \tag{3.37} \]
and we have used (3.28).

3.4.2. Infinite volume dynamics

It is now clear how to define the infinite volume harmonic dynamics. Consider a subspace $D \subset \ell^2(\mathbb{Z}^d)$ and define $\mathcal{W}(D)$ as above with $\sigma(f,g) = \mathrm{Im}[\langle f, g \rangle]$. First, assume $\omega > 0$, take $\gamma : [-\pi, \pi)^d \to \mathbb{R}$ as in (3.17), and set $U$ and $V$ as in (3.30) with


(3.31). If $\omega > 0$, both $U$ and $V$ are bounded transformations on $\ell^2(\mathbb{Z}^d)$. We will treat the case $\omega = 0$ by a limiting argument. The mapping $T_t$ defined by setting
\[ T_t = (U + V)\, \mathcal{F}^{-1} M_t \mathcal{F}\, (U^* - V^*), \tag{3.38} \]
is well-defined on $\ell^2(\mathbb{Z}^d)$. To define the dynamics on $\mathcal{W}(D)$ we will need to choose subspaces $D$ that are $T_t$-invariant. On such $D$, $T_t$ is clearly real-linear. With (3.32) and (3.33), one can easily verify the group properties $T_0 = \mathbb{1}$, $T_{s+t} = T_s \circ T_t$, and
\[ \mathrm{Im}[\langle T_t f, T_t g \rangle] = \mathrm{Im}[\langle f, g \rangle], \tag{3.39} \]
i.e. $T_t$ is symplectic in the sense of (3.6). Using [4, Theorem 5.2.8], there is a unique one-parameter group of ∗-automorphisms on $\mathcal{W}(D)$, which we will denote by $\tau_t$, that satisfies
\[ \tau_t(W(f)) = W(T_t f) \quad \text{for all } f \in D. \tag{3.40} \]
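The group and symplectic properties can be made concrete in the simplest special case. The sketch below is an illustration under stated assumptions, not the construction used in the paper: it takes a single site with all $\lambda_j = 0$, so that $\gamma \equiv \omega$, and uses the map $T_t$ obtained by solving the Heisenberg equations for $H = p^2 + \omega^2 q^2$ with $W(f) = \exp[i(\mathrm{Re}[f]\, q + \mathrm{Im}[f]\, p)]$. It also checks the scalar identity $(\Gamma_+^2 - \Gamma_-^2)/4 = 1$, which is what the first equality in (3.32) reduces to if $U^*U$ and $V^*V$ act in Fourier space as multiplication by $\Gamma_+^2/4$ and $\Gamma_-^2/4$ (an assumption of this sketch). All function names and numerical values are hypothetical.

```python
import math

def sigma(f, g):
    """Symplectic form Im <f, g> for single-site test functions (complex numbers)."""
    return (f.conjugate() * g).imag

def T(t, f, omega):
    """Single-site flow induced by q(t) = q cos(2wt) + (p/w) sin(2wt),
    p(t) = p cos(2wt) - w q sin(2wt), acting on f = Re f + i Im f."""
    c, s = math.cos(2.0 * omega * t), math.sin(2.0 * omega * t)
    a, b = f.real, f.imag
    return complex(a * c - omega * b * s, b * c + a * s / omega)

def bogoliubov_defect(gam):
    """(Gamma_+^2 - Gamma_-^2)/4 - 1 with Gamma_pm = 1/sqrt(gam) +- sqrt(gam), cf. (3.31)."""
    plus = 1.0 / math.sqrt(gam) + math.sqrt(gam)
    minus = 1.0 / math.sqrt(gam) - math.sqrt(gam)
    return (plus ** 2 - minus ** 2) / 4.0 - 1.0

# Hypothetical test data.
omega = 1.5
f, g = complex(0.3, -1.2), complex(-0.8, 0.5)

symplectic_error = abs(sigma(T(0.37, f, omega), T(0.37, g, omega)) - sigma(f, g))
group_error = abs(T(0.2, T(0.5, f, omega), omega) - T(0.7, f, omega))
identity_error = abs(T(0.0, f, omega) - f)

print("symplectic error:", symplectic_error)
print("group error     :", group_error)
print("Bogoliubov defect at gamma = 0.1, 1, 7.5:",
      [bogoliubov_defect(x) for x in (0.1, 1.0, 7.5)])
```

In this special case $T_t$ is a real $2 \times 2$ matrix of determinant one acting on $(\mathrm{Re}[f], \mathrm{Im}[f])$, which is why it preserves $\sigma$.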

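This defines the harmonic dynamics on $\mathcal{W}(D)$. Here it is important that $T_t : D \to D$. As was demonstrated in [19], the mapping $T_t$ can be expressed as a convolution. In fact,
\[ T_t f = f * \Big( H_t^{(0)} + \frac{i}{2} \big( H_t^{(-1)} + H_t^{(1)} \big) \Big) + \overline{f} * \Big( \frac{i}{2} \big( H_t^{(1)} - H_t^{(-1)} \big) \Big) \tag{3.41} \]
where
\begin{align}
H_t^{(-1)}(x) &= \mathrm{Im}\Big[ \frac{1}{(2\pi)^d} \int \frac{1}{\gamma(k)}\, e^{i(k \cdot x - 2\gamma(k)t)}\, dk \Big], \nonumber \\
H_t^{(0)}(x) &= \mathrm{Re}\Big[ \frac{1}{(2\pi)^d} \int e^{i(k \cdot x - 2\gamma(k)t)}\, dk \Big], \tag{3.42} \\
H_t^{(1)}(x) &= \mathrm{Im}\Big[ \frac{1}{(2\pi)^d} \int \gamma(k)\, e^{i(k \cdot x - 2\gamma(k)t)}\, dk \Big]. \nonumber
\end{align}
Using analysis similar to what is proven in [19], the following result holds.

Lemma 3.1. Consider the functions defined in (3.42). For $\omega \ge 0$, $\lambda_1, \ldots, \lambda_d \ge 0$, but such that $c_{\omega,\lambda} = (\omega^2 + 4 \sum_{j=1}^d \lambda_j)^{1/2} > 0$, and any $\mu > 0$, the bounds
\begin{align}
|H_t^{(0)}(x)| &\le e^{-\mu \left( |x| - c_{\omega,\lambda} \max\left( \frac{2}{\mu},\, e^{(\mu/2)+1} \right) |t| \right)}, \nonumber \\
|H_t^{(-1)}(x)| &\le c_{\omega,\lambda}^{-1}\, e^{-\mu \left( |x| - c_{\omega,\lambda} \max\left( \frac{2}{\mu},\, e^{(\mu/2)+1} \right) |t| \right)}, \tag{3.43} \\
|H_t^{(1)}(x)| &\le c_{\omega,\lambda}\, e^{\mu/2}\, e^{-\mu \left( |x| - c_{\omega,\lambda} \max\left( \frac{2}{\mu},\, e^{(\mu/2)+1} \right) |t| \right)} \nonumber
\end{align}
hold for all $t \in \mathbb{R}$ and $x \in \mathbb{Z}^d$. Here $|x| = \sum_{j=1}^d |x_j|$.

Given the estimates in Lemma 3.1, Eq. (3.41) and Young's inequality imply that $T_t$ can be defined as a transformation of $\ell^p(\mathbb{Z}^d)$, for $p \ge 1$. However, the symplectic form limits us to consider $D = \ell^p(\mathbb{Z}^d)$ with $1 \le p \le 2$.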


The following bound now readily follows:
\[ |\mathrm{Im}[\langle T_t f, g \rangle]| \le \big( 1 + 2 e^{\mu/2} c_{\omega,\lambda} + 2 c_{\omega,\lambda}^{-1} \big) \sum_{x,y} |f(x)|\, |g(y)|\, e^{-\mu \left( |x-y| - c_{\omega,\lambda} \max\left( \frac{2}{\mu},\, e^{(\mu/2)+1} \right) |t| \right)}. \tag{3.44} \]
This implies an estimate of the form (3.9), and hence a Lieb–Robinson bound as in (3.11). A simple corollary of Lemma 3.1 follows.

Corollary 3.2. Consider the functions defined in (3.42). For $\omega \ge 0$, $\lambda_1, \ldots, \lambda_d \ge 0$, but with $c_{\omega,\lambda} = (\omega^2 + 4 \sum_{j=1}^d \lambda_j)^{1/2} > 0$, take $\| \cdot \|_1$ to be the $\ell^1$-norm. One has that
\[ \| H_t^{(0)} - \delta_0 \|_1 \to 0 \quad \text{as } t \to 0, \tag{3.45} \]
and
\[ \| H_t^{(m)} \|_1 \to 0 \quad \text{as } t \to 0, \quad \text{for } m \in \{-1, 1\}. \tag{3.46} \]

Proof. The estimates in Lemma 3.1 imply that the functions $H_t^{(m)}$ are bounded by exponentially decaying functions (in $|x|$). These estimates are uniform for $t$ in compact sets, e.g., $t \in [-1, 1]$, and therefore dominated convergence applies. It is clear that $H_0^{(0)}(x) = \delta_0(x)$ while $H_0^{(m)}(x) = 0$ for $m \in \{-1, 1\}$. This proves the corollary.

3.4.3. Representing the dynamics

The infinite-volume ground state of the model (3.12) is the vacuum state for the $b$-operators, as can be seen from (3.18). This state is defined on $\mathcal{W}(D)$ by
\[ \rho(W(f)) = e^{-\frac{1}{4} \| (U^* - V^*) f \|^2}. \tag{3.47} \]
By standard arguments this defines a state on $\mathcal{W}(D)$ [4]. Using (3.38), (3.32) and (3.33), one readily verifies that $\rho$ is $\tau_t$-invariant. $\rho$ is regular by observation. The weak continuity of the dynamics in the GNS representation of $\rho$ will follow from the continuity of the functions of the form
\[ t \mapsto \rho(W(g_1) W(T_t f) W(g_2)), \quad \text{for } g_1, g_2, f \in D. \tag{3.48} \]
When $\omega > 0$, this continuity can be easily observed from the following expression:
\[ \rho(W(g_1) W(T_t f) W(g_2)) = e^{i\sigma(g_1, g_2)/2}\, e^{i\sigma(T_t f, g_2 - g_1)/2}\, e^{-\| (U^* - V^*)(g_1 + g_2 + T_t f) \|^2 / 4}. \tag{3.49} \]
Note that $T_t$ is differentiable with bounded derivative and that both $U$ and $V$ are bounded. This establishes the continuity in the case that $\omega > 0$. As discussed in the introduction of the section, the W*-dynamical system is now defined by considering the GNS representation $\pi_\rho$ of $\rho$. This yields a von


Neumann algebra $\mathcal{M} = \overline{\pi_\rho(\mathcal{W}(D))}$, the weak closure of $\pi_\rho(\mathcal{W}(D))$. The invariance of $\rho$ implies that the dynamics is implementable by unitaries $U_t$, i.e.
\[ \pi_\rho(\tau_t(W(f))) = U_t^*\, \pi_\rho(W(f))\, U_t. \tag{3.50} \]
Using $U_t$, the dynamics can be extended to $\mathcal{M}$. As a consequence of (3.48), this extended dynamics is weakly continuous.

3.4.4. The case of ω = 0

We now discuss the case $\omega = 0$. Here, the maps $T_t$ are defined using the convolution formula (3.41). By Lemma 3.1, $T_t$ is well-defined as a transformation of $\ell^p(\mathbb{Z}^d)$, for $1 \le p \le 2$. Both the group property of $T_t$ and the invariance of the symplectic form $\sigma$ follow in the limit $\omega \to 0$ by dominated convergence, which is justified by Lemma 3.1. This demonstrates that the dynamics is well defined.

We represent the dynamics in a state $\rho$ defined by (3.47), but with the understanding that $\|(U^* - V^*) f\|$ may take on the value $+\infty$, in which case $\rho(W(f)) = 0$. $\rho$ is still clearly regular. It remains to show that the dynamics is weakly continuous. Observe that
\[ T_t f - f = f * \big( H_t^{(0)} - \delta_0 \big) - f * \Big( \frac{i}{2} \big( H_t^{(-1)} + H_t^{(1)} \big) \Big) + \overline{f} * \Big( \frac{i}{2} \big( H_t^{(1)} - H_t^{(-1)} \big) \Big) \tag{3.51} \]
follows from (3.41). Using Young's inequality and Corollary 3.2, it is clear that $\| T_t f - f \| \to 0$ as $t \to 0$ for any $f \in \ell^p(\mathbb{Z}^d)$ with $1 \le p \le 2$. A calculation shows that
\[ (U^* - V^*)(T_t f - f) = F_1 * \big( H_t^{(0)} - \delta_0 \big) - F_2 * H_t^{(-1)} - i F_3 * H_t^{(1)}, \tag{3.52} \]
where
\[ F_1 = \mathcal{F}^{-1} M_{\sqrt{\gamma}} \mathcal{F}\, \mathrm{Im}[f] - i \mathcal{F}^{-1} M_{\gamma^{-1/2}} \mathcal{F}\, \mathrm{Re}[f], \quad F_2 = \mathcal{F}^{-1} M_{\sqrt{\gamma}} \mathcal{F}\, \mathrm{Re}[f] \quad \text{and} \quad F_3 = \mathcal{F}^{-1} M_{\gamma^{-1/2}} \mathcal{F}\, \mathrm{Im}[f]. \tag{3.53} \]
A similar argument to what is given above now implies that $\| (U^* - V^*)(T_t f - f) \| \to 0$ as $t \to 0$, for any $f \in D_0$, where
\[ D_0 = \{ f \in \ell^2(\mathbb{Z}^d) : \mathcal{F}^{-1} M_{\gamma^{-1/2}} \mathcal{F}\, \mathrm{Re}[f] \in \ell^2(\mathbb{Z}^d) \}. \tag{3.54} \]
No additional assumption on $\mathrm{Im}[f]$ is necessary since $F_3$ is convolved with $H_t^{(1)}$. Given the form of (3.49), this suffices to prove weak continuity. In fact, one can check that $T_t$ leaves $D_0$ invariant and that if $f \in D_0$, then $(U^* - V^*) T_t f \in \ell^2(\mathbb{Z}^d)$ for all $t \in \mathbb{R}$. This establishes weak continuity of the dynamics, defined on $\mathcal{W}(D_0)$.

Remark 3.3. We observe that, when $\omega = 0$, the finite volume Hamiltonian $H_L^h$ (3.12) is translation invariant and commutes with the total momentum operator $P_0$


(see (3.16)). In fact, $H_L^h$ can be written as
\[ H_L^h = P_0^2 + \sum_{k \in \Lambda_L^* \setminus \{0\}} \big( P_k^* P_k + \gamma^2(k)\, Q_k^* Q_k \big) = P_0^2 + \sum_{k \in \Lambda_L^* \setminus \{0\}} \gamma(k)\, (2 b_k^* b_k + 1) \]
where we used the notation (3.16) and, for $k \neq 0$, we introduced the operators $b_k$, $b_k^*$ as in (3.19). In this case, the operator $H_L^h$ does not have eigenvectors: its spectrum is purely continuous. By a unitary transformation, the Hilbert space $\mathcal{H}_{\Lambda_L}$ (see (3.13)) can be mapped into the space $L^2(\mathbb{R}, dP_0; \mathcal{H}_b)$ of square integrable functions of $P_0 \in \mathbb{R}$, with values in $\mathcal{H}_b$. Here, $\mathcal{H}_b$ denotes the Fock space generated by all creation and annihilation operators $b_k^*$, $b_k$ with $k \neq 0$. It is then easy to construct vectors which minimize the energy for a given distribution of the total momentum: for an arbitrary (complex valued) $f \in L^2(\mathbb{R})$ with $\|f\| = 1$, we define $\psi_f \in L^2(\mathbb{R}, dP_0; \mathcal{H}_b)$ by setting $\psi_f(P_0) = f(P_0)\Omega$ (where $\Omega$ is the Fock vacuum in $\mathcal{H}_b$). These vectors are not invariant with respect to the time evolution. It is simple to check that the Schrödinger evolution of $\psi_f$ is given by $e^{-iH_L^h t} \psi_f = \psi_{f_t}$, where $f_t(P_0) = e^{-itP_0^2} f(P_0)$ is the free evolution of $f$. In particular, for $\omega = 0$, $H_L^h$ does not have a ground state in the traditional sense of an eigenvector. For this reason, when $\omega = 0$, it is not a priori clear what the natural choice of state should be. As is discussed above, one possibility is to consider first $\omega \neq 0$ and then take the limit $\omega \to 0$. This yields a ground state for the infinite system with vanishing center of mass momentum of the oscillators. By considering non-zero values for the center of mass momentum, one can also define other states with similar properties.

3.4.5. Some final comments

The analysis in the following sections and our main result are not limited to the class of examples we discussed above. For example, harmonic systems defined on more general graphs, such as the ones considered in [6, 7], can also be treated. Also note that our choice of time-invariant state, while natural, is by no means the only possible state. Instead of the vacuum state defined in (3.47), equilibrium states at positive temperatures could be used in exactly the same way. It would also make sense to study the convergence of the equilibrium or ground states for the perturbed dynamics and to consider the dynamics in the representation of the limiting infinite-system state, but we have not studied this situation and will not discuss it in this paper.

4. Perturbing the Harmonic Dynamics

In this section, we will discuss finite volume perturbations of the infinite volume harmonic dynamics which we defined in Sec. 3. To begin, we recall a fundamental result about perturbations of quantum dynamics defined by adding a bounded term


to the generator. This is a version of what is usually known as the Dyson or Duhamel expansion. The following statement summarizes [4, Proposition 5.4.1].

Proposition 4.1. Let $\{\mathcal{M}, \alpha_t\}$ be a W*-dynamical system and let $\delta$ denote the infinitesimal generator of $\alpha_t$. Given any $P = P^* \in \mathcal{M}$, set $\delta_P$ to be the bounded derivation with domain $D(\delta_P) = \mathcal{M}$ satisfying $\delta_P(A) = i[P, A]$ for all $A \in \mathcal{M}$. It follows that $\delta + \delta_P$ generates a one-parameter group of ∗-automorphisms $\alpha^P$ of $\mathcal{M}$ which is the unique solution of the integral equation
\[ \alpha_t^P(A) = \alpha_t(A) + i \int_0^t \alpha_s^P([P, \alpha_{t-s}(A)])\, ds. \tag{4.1} \]
In addition, the estimate
\[ \| \alpha_t^P(A) - \alpha_t(A) \| \le \big( e^{|t| \|P\|} - 1 \big) \|A\| \tag{4.2} \]
holds for all $t \in \mathbb{R}$ and $A \in \mathcal{M}$.

Since the initial dynamics $\alpha_t$ is assumed weakly continuous, the norm estimate (4.2) can be used to show that the perturbed dynamics is also weakly continuous. Hence, for each $P = P^* \in \mathcal{M}$ the pair $\{\mathcal{M}, \alpha_t^P\}$ is also a W*-dynamical system. Thus, if $P_i = P_i^* \in \mathcal{M}$ for $i = 1, 2$, then one can define $\alpha_t^{P_1 + P_2}$ iteratively.

4.1. A Lieb–Robinson bound for on-site perturbations

In this section, we will consider perturbations of the harmonic dynamics defined in Sec. 3. Recall that our general assumptions for the harmonic dynamics on $\Gamma$ are as follows. We assume that the harmonic dynamics, $\tau_t^0$, is defined on a Weyl algebra $\mathcal{W}(D)$ where $D$ is a subspace of $\ell^2(\Gamma)$. In fact, we assume there exists a group $T_t$ of real-linear transformations which leave $D$ invariant and satisfy
\[ \tau_t^0(W(f)) = W(T_t f) \quad \text{for all } f \in D. \tag{4.3} \]
In addition, we assume that this harmonic dynamics satisfies a Lieb–Robinson bound. Specifically, we suppose that there exists a number $a_0 > 0$ for which given any $0 < a \le a_0$, there are positive numbers $c_a$ and $v_a$ for which
\[ |1 - e^{i\sigma(T_t f, g)}| \le c_a e^{v_a |t|} \sum_{x,y \in \Gamma} |f(x)|\, |g(y)|\, F_a(d(x,y)); \tag{4.4} \]
here the spatial decay in $\Gamma$ is described by the function $F_a$ as introduced in Sec. 2. As we discussed in Sec. 3, the estimate (4.4) immediately implies the Lieb–Robinson bound
\[ \| [\tau_t^0(W(f)), W(g)] \| \le c_a e^{v_a |t|} \sum_{x,y \in \Gamma} |f(x)|\, |g(y)|\, F_a(d(x,y)). \tag{4.5} \]
Finally, we assume that we have represented this harmonic dynamics in a regular and $\tau_t^0$-invariant state $\rho$ for which the pair $\{\mathcal{M}, \tau_t^0\}$, with $\mathcal{M} = \overline{\pi_\rho(\mathcal{W}(D))}$, is a W*-dynamical system.
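To make the step from (4.4) to (4.5) concrete, the following sketch evaluates both sides in the simplest situation. By (3.8) and unitarity of the Weyl operators, the norm of the commutator equals $|1 - e^{i\theta}|$ with $\theta = \sigma(T_t f, g)$, which never exceeds $\min(2, |\theta|)$. The second helper evaluates the right side of (4.5) for $f = \delta_x$, $g = \delta_y$ on $\mathbb{Z}$, capped at the trivial bound 2. The constants $a$, $c_a$, $v_a$ and the weight parameters are hypothetical values chosen only for illustration; they are not derived from any model in the text.

```python
import cmath
import math

def weyl_commutator_norm(theta):
    """|1 - e^{i theta}|: norm of [W(f'), W(g)] when sigma(f', g) = theta, cf. (3.8)."""
    return abs(1.0 - cmath.exp(1j * theta))

def F_a(r, a, d=1, eps=0.5):
    """Decay function F_a(r) = exp(-a r) (1 + r)^(-(d + eps)) from Sec. 2."""
    return math.exp(-a * r) * (1.0 + r) ** (-(d + eps))

def lr_bound(t, dist, a=1.0, c_a=1.0, v_a=2.0):
    """Right side of (4.5) for f = delta_x, g = delta_y with d(x, y) = dist,
    capped at 2 (the trivial bound for a commutator of unitaries)."""
    return min(2.0, c_a * math.exp(v_a * abs(t)) * F_a(dist, a))

thetas = [-7.0, -1.0, -0.01, 0.0, 0.3, 2.5, 10.0]
norms = [weyl_commutator_norm(th) for th in thetas]

for dist in (0, 5, 10, 20):
    print("distance", dist, "bound at t = 1:", lr_bound(1.0, dist))
```

With these sample constants the bound is informative only once the distance exceeds roughly $v_a |t| / a$, which is the usual picture of an effective light cone.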


Our first estimate involves perturbations defined as finite sums of on-site terms. More specifically, the perturbations we consider are defined as follows. To each site $x \in \Gamma$, we will associate a finite measure $\mu_x$ on $\mathbb{C}$, and an element $P_x \in \mathcal{W}(D)$ which has the form

\[
P_x = \int_{\mathbb{C}} W(z\delta_x)\,\mu_x(dz). \tag{4.6}
\]

We require that each $\mu_x$ is even, i.e. invariant under $z \to -z$, to ensure self-adjointness, i.e. $P_x^* = P_x$. Our Lieb–Robinson bounds hold under the additional assumption that the second moment is uniformly bounded, i.e.

\[
\sup_{x \in \Gamma} \int_{\mathbb{C}} |z|^2\,|\mu_x|(dz) < \infty. \tag{4.7}
\]

We use Proposition 4.1 to define the perturbed dynamics. Fix a finite set $\Lambda \subset \Gamma$. Set

\[
P^\Lambda = \sum_{x \in \Lambda} P_x, \tag{4.8}
\]

and note that $(P^\Lambda)^* = P^\Lambda \in \mathcal{W}(D)$. We will denote by $\tau_t^{(\Lambda)}$ the dynamics that results from applying Proposition 4.1 to the $W^*$-dynamical system $\{\mathcal{M}, \tau_t^0\}$ and $P^\Lambda$. Before we begin the proof of our estimate, we discuss two examples.

Example 1. Let $\mu_x$ be supported on $[-\pi, \pi)$ and absolutely continuous with respect to Lebesgue measure, i.e. $\mu_x(dz) = v_x(z)\,dz$. If $v_x$ is in $L^2([-\pi, \pi))$, then $P_x$ is proportional to an operator of multiplication by the inverse Fourier transform of $v_x$. Moreover, since the support of $\mu_x$ is real, $P_x$ corresponds to multiplication by a function depending only on $q_x$.

Example 2. Let $\mu_x$ have finite support, e.g., take $\mathrm{supp}(\mu_x) = \{z, -z\}$ for some number $z = \alpha + i\beta \in \mathbb{C}$. Then

\[
P_x = W(z\delta_x) + W(-z\delta_x) = 2\cos(\alpha q_x + \beta p_x). \tag{4.9}
\]
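Both examples can be checked by hand. The sketch below assumes the convention $W(f) = \exp\bigl(i \sum_x (\mathrm{Re} f(x)\, q_x + \mathrm{Im} f(x)\, p_x)\bigr)$ for Weyl operators, which is not restated in this section, and, for Example 2, that $\mu_x$ gives unit mass to each of the two points $z$ and $-z$.

```latex
% Assumed convention (from outside this section):
%   W(f) = \exp( i \sum_x ( Re f(x) q_x + Im f(x) p_x ) ).
% Example 2, with z = \alpha + i\beta and unit masses at z and -z:
\begin{align*}
W(\pm z\delta_x) &= e^{\pm i(\alpha q_x + \beta p_x)}, \\
P_x &= e^{i(\alpha q_x + \beta p_x)} + e^{-i(\alpha q_x + \beta p_x)}
     = 2\cos(\alpha q_x + \beta p_x).
\end{align*}
% Example 1 under the same convention: for real z, W(z\delta_x) = e^{i z q_x}, so
\[
P_x = \int_{-\pi}^{\pi} e^{i z q_x}\, v_x(z)\, dz,
\]
% which is a function of q_x alone.
```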

We now state our first result.

Theorem 4.2. Let $\tau_t^0$ be a harmonic dynamics defined on $\Gamma$ as described above. Suppose that

\[
\kappa = \sup_{x \in \Gamma} \int_{\mathbb{C}} |z|^2\,|\mu_x|(dz) < \infty, \tag{4.10}
\]

and define the perturbed dynamics $\tau_t^{(\Lambda)}$ as indicated above. For every $0 < a \le a_0$, there exist positive numbers $c_a$ and $v_a$ for which the estimate

\[
\|[\tau_t^{(\Lambda)}(W(f)), W(g)]\| \le c_a e^{(v_a + c_a \kappa C_a)|t|} \sum_{x,y} |f(x)|\,|g(y)|\,F_a(d(x,y)) \tag{4.11}
\]

holds for all $t \in \mathbb{R}$ and for any functions $f, g \in D$.


Here the numbers $c_a$ and $v_a$ are as in (4.4), whereas $C_a$ is the convolution constant as defined in (2.2) with respect to the function $F_a$.

Proof. Fix $t > 0$ and define the function $\Psi_t : [0, t] \to \mathcal{W}(D)$ by setting

\[
\Psi_t(s) = [\tau_s^{(\Lambda)}(\tau_{t-s}^0(W(f))), W(g)]. \tag{4.12}
\]

It is clear that $\Psi_t$ interpolates between the commutator associated with the original harmonic dynamics, $\tau_t^0$, at $s = 0$, and that of the perturbed dynamics, $\tau_t^{(\Lambda)}$, at $s = t$. A calculation shows that

\[
\frac{d}{ds}\Psi_t(s) = i \sum_{x \in \Lambda} [\tau_s^{(\Lambda)}([P_x, W(T_{t-s} f)]), W(g)], \tag{4.13}
\]

where differentiability is guaranteed by the results of Proposition 4.1. The inner commutator can be expressed as

\[
[P_x, W(T_{t-s} f)] = \int_{\mathbb{C}} [W(z\delta_x), W(T_{t-s} f)]\,\mu_x(dz) = W(T_{t-s} f)\, L_{t-s;x}(f), \tag{4.14}
\]

where

\[
L_{t-s;x}^*(f) = L_{t-s;x}(f) = \int_{\mathbb{C}} W(z\delta_x)\{e^{i\sigma(T_{t-s} f, z\delta_x)} - 1\}\,\mu_x(dz) \in \mathcal{W}(D). \tag{4.15}
\]
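The two claims in (4.14) and (4.15) follow from the Weyl relations and the evenness of $\mu_x$. The sketch below assumes the relations $W(f)W(g) = e^{-i\sigma(f,g)} W(g)W(f)$ and $W(f)^* = W(-f)$, with $\sigma$ real-bilinear and antisymmetric, and that $\mu_x$ is real-valued; these conventions come from outside this section and are assumptions here. The shorthand $F = T_{t-s} f$ is introduced only for this sketch.

```latex
% Assumptions: W(f)W(g) = e^{-i\sigma(f,g)} W(g)W(f), W(f)^* = W(-f),
% \sigma real-bilinear and antisymmetric, \mu_x real-valued and even.
% Shorthand: F = T_{t-s} f.
\begin{align*}
[W(z\delta_x), W(F)]
  &= W(z\delta_x) W(F) - W(F) W(z\delta_x) \\
  &= \bigl(e^{i\sigma(F, z\delta_x)} - 1\bigr)\, W(F)\, W(z\delta_x).
\end{align*}
% Integrating against \mu_x(dz) gives (4.14) with L_{t-s;x}(f) as in (4.15).
% Self-adjointness: take adjoints, then substitute z -> -z using evenness of \mu_x.
\begin{align*}
L_{t-s;x}(f)^*
  &= \int_{\mathbb{C}} W(-z\delta_x)\bigl\{e^{-i\sigma(F, z\delta_x)} - 1\bigr\}\,\mu_x(dz) \\
  &= \int_{\mathbb{C}} W(z\delta_x)\bigl\{e^{i\sigma(F, z\delta_x)} - 1\bigr\}\,\mu_x(dz)
   = L_{t-s;x}(f).
\end{align*}
```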

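Thus $\Psi_t$ satisfies

\begin{align*}
\frac{d}{ds}\Psi_t(s) ={}& i \sum_{x \in \Lambda} \Psi_t(s)\, \tau_s^{(\Lambda)}(L_{t-s;x}(f)) \\
& + i \sum_{x \in \Lambda} \tau_s^{(\Lambda)}(W(T_{t-s} f))\, [\tau_s^{(\Lambda)}(L_{t-s;x}(f)), W(g)]. \tag{4.16}
\end{align*}

The first term above is norm preserving. In fact, define a unitary evolution $U_t(\cdot)$ by setting

\[
\frac{d}{ds} U_t(s) = -i \sum_{x \in \Lambda} \tau_s^{(\Lambda)}(L_{t-s;x}(f))\, U_t(s) \quad \text{with } U_t(0) = \mathbb{1}. \tag{4.17}
\]

It is easy to see that

\[
\frac{d}{ds}\bigl(\Psi_t(s) U_t(s)\bigr) = i \sum_{x \in \Lambda} \tau_s^{(\Lambda)}(W(T_{t-s} f))\, [\tau_s^{(\Lambda)}(L_{t-s;x}(f)), W(g)]\, U_t(s), \tag{4.18}
\]

and therefore,

\[
\Psi_t(t) U_t(t) = \Psi_t(0) + i \sum_{x \in \Lambda} \int_0^t \tau_s^{(\Lambda)}(W(T_{t-s} f))\, [\tau_s^{(\Lambda)}(L_{t-s;x}(f)), W(g)]\, U_t(s)\, ds. \tag{4.19}
\]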


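Estimating in norm, we find that

\[
\|[\tau_t^{(\Lambda)}(W(f)), W(g)]\| \le \|[\tau_t^0(W(f)), W(g)]\| + \sum_{x \in \Lambda} \int_0^t \|[\tau_s^{(\Lambda)}(L_{t-s;x}(f)), W(g)]\|\, ds. \tag{4.20}
\]

Moreover, using (4.15) and the bound (4.4), it is clear that

\begin{align*}
\|[\tau_s^{(\Lambda)}(L_{t-s;x}(f)), W(g)]\| \le{}& c_a e^{v_a (t-s)} \sum_{x' \in \Gamma} |f(x')|\, F_a(d(x, x')) \\
& \times \int_{\mathbb{C}} |z|\, \|[\tau_s^{(\Lambda)}(W(z\delta_x)), W(g)]\|\, |\mu_x|(dz) \tag{4.21}
\end{align*}

holds. Combining (4.21), (4.20), and (4.5), we have proven that

\begin{align*}
\|[\tau_t^{(\Lambda)}(W(f)), W(g)]\| \le{}& c_a e^{v_a t} \sum_{x,y} |f(x)|\,|g(y)|\,F_a(d(x,y)) \\
& + c_a \sum_{x' \in \Gamma} |f(x')| \sum_{x \in \Lambda} F_a(d(x, x')) \int_0^t e^{v_a (t-s)} \\
& \times \int_{\mathbb{C}} |z|\, \|[\tau_s^{(\Lambda)}(W(z\delta_x)), W(g)]\|\, |\mu_x|(dz)\, ds. \tag{4.22}
\end{align*}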

Following the iteration scheme applied in [19], one arrives at (4.11) as claimed.

4.2. Multiple site anharmonicities

In this section, we will prove that Lieb–Robinson bounds, similar to those in Theorem 4.2, also hold for perturbations involving short range interactions. We introduce these as follows. To each finite subset $X \subset \Gamma$, we associate a finite measure $\mu_X$ on $\mathbb{C}^X$ and an element $P_X \in \mathcal{W}(D)$ of the form

\[
P_X = \int_{\mathbb{C}^X} W(z \cdot \delta_X)\,\mu_X(dz), \tag{4.23}
\]

where, for each $z \in \mathbb{C}^X$, the function $z \cdot \delta_X : \Gamma \to \mathbb{C}$ is given by

\[
(z \cdot \delta_X)(x) = \sum_{x' \in X} z_{x'}\, \delta_{x'}(x) =
\begin{cases}
z_x & \text{if } x \in X, \\
0 & \text{otherwise.}
\end{cases} \tag{4.24}
\]

We will again require that $\mu_X$ is invariant with respect to $z \to -z$, and hence, $P_X$ is self-adjoint. In analogy to (4.8), for any finite subset $\Lambda \subset \Gamma$, we will set

\[
P^\Lambda = \sum_{X \subset \Lambda} P_X, \tag{4.25}
\]

where the sum is over all subsets of $\Lambda$. Here we will again let $\tau_t^{(\Lambda)}$ denote the dynamics resulting from Proposition 4.1 applied to the $W^*$-dynamical system $\{\mathcal{M}, \tau_t^0\}$ and the perturbation $P^\Lambda$ defined by (4.25).


The main assumption on these multi-site perturbations follows. There exists a number $a_1 > 0$ such that for all $0 < a \le a_1$, there is a number $\kappa_a > 0$ for which, given any pair $x_1, x_2 \in \Gamma$,

\[
\sum_{\substack{X \subset \Gamma: \\ x_1, x_2 \in X}} \int_{\mathbb{C}^X} |z_{x_1}|\,|z_{x_2}|\,|\mu_X|(dz) \le \kappa_a F_a(d(x_1, x_2)). \tag{4.26}
\]

Theorem 4.3. Let $\tau_t^0$ be a harmonic dynamics defined on $\Gamma$. Assume that (4.26) holds, and that $\tau_t^{(\Lambda)}$ denotes the corresponding perturbed dynamics. For every $0 < a \le \min(a_0, a_1)$, there exist positive numbers $c_a$ and $v_a$ for which the estimate

\[
\|[\tau_t^{(\Lambda)}(W(f)), W(g)]\| \le c_a e^{(v_a + c_a \kappa_a C_a^2)|t|} \sum_{x,y} |f(x)|\,|g(y)|\,F_a(d(x,y)) \tag{4.27}
\]

holds for all $t \in \mathbb{R}$ and for any functions $f, g \in D$.

The proof of this result closely follows that of Theorem 4.2, and so we only comment on the differences.

Proof. For $f, g \in D$ and $t > 0$, define $\Psi_t : [0, t] \to \mathcal{W}(D)$ as in (4.12). The derivative calculation beginning with (4.13) proceeds as before. Here

\[
L_{t-s;X}(f) = \int_{\mathbb{C}^X} W(z \cdot \delta_X)\{e^{i\sigma(T_{t-s} f, z \cdot \delta_X)} - 1\}\,\mu_X(dz), \tag{4.28}
\]

is also self-adjoint. The norm estimate

\[
\|[\tau_t^{(\Lambda)}(W(f)), W(g)]\| \le \|[\tau_t^0(W(f)), W(g)]\| + \sum_{X \subset \Lambda} \int_0^t \|[\tau_s^{(\Lambda)}(L_{t-s;X}(f)), W(g)]\|\, ds, \tag{4.29}
\]

holds similarly. With (4.28), it is easy to see that the integrand in (4.29) is bounded by

\[
c_a e^{v_a (t-s)} \sum_{x \in \Gamma} |f(x)| \sum_{x' \in X} F_a(d(x, x')) \int_{\mathbb{C}^X} |z_{x'}|\, \|[\tau_s^{(\Lambda)}(W(z \cdot \delta_X)), W(g)]\|\, |\mu_X|(dz), \tag{4.30}
\]

the analogue of (4.21), for $0 < a \le a_0$. Moreover, if $0 < a \le \min(a_0, a_1)$, then

\begin{align*}
\|[\tau_t^{(\Lambda)}(W(f)), W(g)]\| \le{}& c_a e^{v_a t} \sum_{x,y \in \Gamma} |f(x)|\,|g(y)|\,F_a(d(x,y)) \\
& + c_a \sum_{x \in \Gamma} |f(x)| \sum_{X \subset \Lambda} \sum_{x' \in X} F_a(d(x, x')) \int_0^t e^{v_a (t-s)} \\
& \times \int_{\mathbb{C}^X} |z_{x'}|\, \|[\tau_s^{(\Lambda)}(W(z \cdot \delta_X)), W(g)]\|\, |\mu_X|(dz)\, ds. \tag{4.31}
\end{align*}


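The estimate claimed in (4.27) follows by iteration. In fact, the first term in the iteration is bounded by

\begin{align*}
& c_a \sum_{x} |f(x)| \sum_{X \subset \Lambda} \sum_{x_1 \in X} F_a(d(x, x_1)) \int_0^t e^{v_a (t-s)} \\
& \qquad \times \int_{\mathbb{C}^X} |z_{x_1}|\, c_a e^{v_a s} \sum_{x_2 \in X} |z_{x_2}| \sum_{y} |g(y)|\, F_a(d(x_2, y))\, |\mu_X|(dz)\, ds \\
& \quad \le c_a t \cdot c_a e^{v_a t} \sum_{x,y} |f(x)|\,|g(y)| \sum_{x_1, x_2 \in \Gamma} F_a(d(x, x_1))\, F_a(d(x_2, y)) \\
& \qquad \times \sum_{\substack{X \subset \Gamma: \\ x_1, x_2 \in X}} \int_{\mathbb{C}^X} |z_{x_1}|\,|z_{x_2}|\,|\mu_X|(dz) \\
& \quad \le \kappa_a c_a t \cdot c_a e^{v_a t} \sum_{x,y} |f(x)|\,|g(y)| \sum_{x_1, x_2 \in \Gamma} F_a(d(x, x_1))\, F_a(d(x_1, x_2))\, F_a(d(x_2, y)) \\
& \quad \le \kappa_a C_a^2 c_a t \cdot c_a e^{v_a t} \sum_{x,y} |f(x)|\,|g(y)|\, F_a(d(x, y)). \tag{4.32}
\end{align*}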
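How the iterates combine into the exponential rate in (4.27) can be sketched as follows. This is only an illustration, under the assumption (not spelled out in the text) that the $n$-th iterate has the same structure as (4.32), with one factor $\kappa_a C_a^2 c_a$ and one nested time integral per iteration, and that the remainder term vanishes in the limit. The shorthands $K$ and $S$ are introduced here and do not appear in the paper.

```latex
% Shorthand introduced for this sketch only:
%   K = \kappa_a C_a^2 c_a,   S = \sum_{x,y} |f(x)||g(y)| F_a(d(x,y)).
% Time dependence: the exponentials recombine, e^{v_a(t-s)} e^{v_a s} = e^{v_a t},
% and the nested integrals give
\[
\int_0^t \int_0^{s_1} \cdots \int_0^{s_{n-1}} ds_n \cdots ds_1 = \frac{t^n}{n!}.
\]
% Assumed bound on the n-th iterate (n = 1 is (4.32)):
\[
\frac{(K t)^n}{n!}\, c_a e^{v_a t}\, S.
\]
% Summing over n >= 0, where n = 0 is the harmonic bound (4.5):
\[
\|[\tau_t^{(\Lambda)}(W(f)), W(g)]\|
  \le c_a e^{v_a t}\, S \sum_{n \ge 0} \frac{(K t)^n}{n!}
  = c_a e^{(v_a + c_a \kappa_a C_a^2)\, t}\, S,
\]
% which is (4.27) for t > 0.
```

The same bookkeeping with one convolution per step instead of two would give the rate $v_a + c_a \kappa C_a$ of (4.11).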

The higher order iterates are treated similarly.

5. Existence of the Dynamics

In this section, we demonstrate that the finite volume dynamics analyzed in the previous section converge to a limiting dynamics as the volume $\Lambda$ on which the perturbation is defined tends to $\Gamma$. We state this as Theorem 5.1 below.

Theorem 5.1. Let $\tau_t^0$ be a harmonic dynamics defined on $\mathcal{W}(\ell^1(\Gamma))$ as described in Sec. 4.1. Let $\{\Lambda_n\}$ denote a non-decreasing, exhaustive sequence of finite subsets of $\Gamma$. Consider a family of perturbations $P^{\Lambda_n}$ as defined in (4.25) and (4.23) which satisfy (4.26). Suppose in addition that

\[
M = \sup_{x \in \Gamma} \sum_{\substack{X \subset \Gamma: \\ x \in X}} \int_{\mathbb{C}^X} |z_x|\,|\mu_X|(dz) < \infty. \tag{5.1}
\]

Then, for each $f \in \ell^1(\Gamma)$ and $t \in \mathbb{R}$ fixed, the limit

\[
\lim_{n \to \infty} \tau_t^{(\Lambda_n)}(W(f)) \tag{5.2}
\]

exists in norm. The limiting dynamics, which we denote by $\tau_t$, is weakly continuous.

It is important to note that since the estimates in Theorem 4.3 are independent of $\Lambda$, the limiting dynamics also satisfies a Lieb–Robinson bound as in (4.27). We now prove Theorem 5.1.


Proof. Fix a Weyl operator $W(f)$ with $f \in \ell^1(\Gamma)$. Let $T > 0$ and take $m \le n$. Iteratively applying Proposition 4.1, we have that

\[
\tau_t^{(\Lambda_n)}(W(f)) = \tau_t^{(\Lambda_m)}(W(f)) + i \int_0^t \tau_s^{(\Lambda_n)}\bigl([P^{\Lambda_n \setminus \Lambda_m}, \tau_{t-s}^{(\Lambda_m)}(W(f))]\bigr)\, ds, \tag{5.3}
\]

for all $-T \le t \le T$. The bound

\begin{align*}
\|[P^{\Lambda_n \setminus \Lambda_m}, \tau_{t-s}^{(\Lambda_m)}(W(f))]\|
&\le \sum_{\substack{X \subset \Lambda_n: \\ X \cap \Lambda_n \setminus \Lambda_m \ne \emptyset}} \int_{\mathbb{C}^X} \|[W(z \cdot \delta_X), \tau_{t-s}^{(\Lambda_m)}(W(f))]\|\, |\mu_X|(dz) \\
&\le c_a e^{(v_a + c_a \kappa_a C_a^2)(t-s)} \sum_{x \in \Gamma} |f(x)| \sum_{\substack{X \subset \Lambda_n: \\ X \cap \Lambda_n \setminus \Lambda_m \ne \emptyset}} \sum_{y \in X} F_a(d(x, y)) \int_{\mathbb{C}^X} |z_y|\,|\mu_X|(dz) \\
&\le c_a e^{(v_a + c_a \kappa_a C_a^2)(t-s)} \sum_{x \in \Gamma} |f(x)| \sum_{y \in \Lambda_n \setminus \Lambda_m} F_a(d(x, y)) \sum_{\substack{X \subset \Gamma: \\ y \in X}} \int_{\mathbb{C}^X} |z_y|\,|\mu_X|(dz) \\
&\le M c_a e^{(v_a + c_a \kappa_a C_a^2)(t-s)} \sum_{x \in \Gamma} |f(x)| \sum_{y \in \Lambda_n \setminus \Lambda_m} F_a(d(x, y)) \tag{5.4}
\end{align*}

follows readily from Theorem 4.3 and assumption (5.1). For $f \in \ell^1(\Gamma)$ and fixed $t$, the upper estimate above goes to zero as $n, m \to \infty$. In fact, the convergence is uniform for $t \in [-T, T]$. This proves (5.2). By an $\varepsilon/3$ argument, similar to what is done at the end of Sec. 2, weak continuity follows since we know it holds for the finite volume dynamics. This completes the proof of Theorem 5.1.

Acknowledgments

The work reported in this paper was supported by the National Science Foundation: B.N. under Grants #DMS-0605342 and #DMS-0757581, R.S. under Grant #DMS-0757424, and S.S. under Grants #DMS-0757327 and #DMS-0706927. The authors would also like to acknowledge the hospitality of the Department of Mathematics at U.C. Davis where a part of this work was completed.

References

[1] L. Amour, P. Levy-Bruhl and J. Nourrigat, Dynamics and Lieb–Robinson estimates for lattices of interacting anharmonic oscillators, to appear in Colloq. Math., Special volume dedicated to A. Hulanicki; arXiv:0904.2717.
[2] H. Araki and E. J. Woods, Representations of the canonical commutation relations describing a non-relativistic infinite free Bose gas, J. Math. Phys. 4 (1963) 637–662.
[3] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics. Volume 1, 2nd edn. (Springer-Verlag, 1987).


[4] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics. Volume 2, 2nd edn. (Springer-Verlag, 1997).
[5] P. Buttà, E. Caglioti, S. Di Ruzza and C. Marchioro, On the propagation of a perturbation in an anharmonic system, J. Stat. Phys. 127 (2007) 313–325.
[6] M. Cramer and J. Eisert, Correlations, spectral gap, and entanglement in harmonic quantum systems on generic lattices, New J. Phys. 8 (2006) 71.
[7] M. Cramer, A. Serafini and J. Eisert, Locality of dynamics in general harmonic quantum systems, in Quantum Information and Many Body Quantum Systems, eds. M. Ericsson and S. Montangero (Edizioni della Normale, 2008).
[8] M. Hastings and T. Koma, Spectral gap and exponential decay of correlations, Comm. Math. Phys. 265(3) (2006) 781–804.
[9] J. L. van Hemmen, Dynamics and ergodicity of the infinite harmonic crystal, Phys. Rept. 65 (1980) 45–149.
[10] O. E. Lanford, J. Lebowitz and E. H. Lieb, Time evolution of infinite anharmonic systems, J. Statist. Phys. 16(6) (1977) 453–461.
[11] E. H. Lieb and D. W. Robinson, The finite group velocity of quantum spin systems, Comm. Math. Phys. 28 (1972) 251–257.
[12] E. H. Lieb and R. Seiringer, The Stability of Matter in Quantum Mechanics (Cambridge University Press, 2009).
[13] J. Manuceau and A. Verbeure, Quasi-free states of the CCR algebra and Bogoliubov transformations, Comm. Math. Phys. 9 (1968) 293–302.
[14] J. Manuceau, M. Sirugue, D. Testard and A. Verbeure, The smallest C*-algebra for canonical commutation relations, Comm. Math. Phys. 32 (1973) 231–243.
[15] C. Marchioro, A. Pellegrinotti, M. Pulvirenti and L. Triolo, Velocity of a perturbation in infinite lattice systems, J. Statist. Phys. 19(5) (1978) 499–510.
[16] B. Nachtergaele and R. Sims, Lieb–Robinson bounds and the exponential clustering theorem, Comm. Math. Phys. 265(1) (2006) 119–130.
[17] B. Nachtergaele, Y. Ogata and R. Sims, Propagation of correlations in quantum lattice systems, J. Statist. Phys. 124(1) (2006) 1–13.
[18] B. Nachtergaele and R. Sims, Locality estimates for quantum spin systems, in New Trends in Mathematical Physics, Selected Contributions of the XVth International Congress on Mathematical Physics (Springer-Verlag, 2009), pp. 591–614.
[19] B. Nachtergaele, H. Raz, B. Schlein and R. Sims, Lieb–Robinson bounds for harmonic and anharmonic lattice systems, Comm. Math. Phys. 286 (2009) 1073–1098.
[20] H. Raz and R. Sims, Estimating the Lieb–Robinson velocity for classical anharmonic lattice systems, J. Statist. Phys. 137 (2009) 79–108.
[21] M. Reed and B. Simon, Methods of Modern Mathematical Physics, II, Fourier Analysis, Self-Adjointness (Academic Press, 1975).
[22] D. W. Robinson, The ground state of the Bose gas, Comm. Math. Phys. 1 (1965) 159–174.
[23] H. Spohn and J. L. Lebowitz, Stationary non-equilibrium states of infinite harmonic systems, Comm. Math. Phys. 54 (1977) 97–120.
[24] W. Thirring and F. Dyson (eds), The Stability of Matter: From Atoms to Stars: Selecta of Elliott H. Lieb, 4th edn. (Springer-Verlag, 2005).

April

20,

2010 14:17 WSPC/S0129-055X

148-RMP

J070-00395

Reviews in Mathematical Physics Vol. 22, No. 3 (2010) 233–303 c World Scientiﬁc Publishing Company DOI: 10.1142/S0129055X10003953

EFFECT OF A LOCALLY REPULSIVE INTERACTION ON s-WAVE SUPERCONDUCTORS

J.-B. BRU∗ and W. DE SIQUEIRA PEDRA†

∗Departamento de Matemáticas, Facultad de Ciencia y Tecnología, Universidad del País Vasco, Apartado 644, 48080 Bilbao, Spain and IKERBASQUE, Basque Foundation for Science, 48011, Bilbao, Spain
jeanbernard [email protected] [email protected]

†Institut für Mathematik, Universität Mainz, Staudingerweg 9, 55099 Mainz, Germany
[email protected]

Received 23 September 2009
Revised 22 February 2010

The thermodynamic impact of the Coulomb repulsion on s-wave superconductors is analyzed via a rigorous study of equilibrium and ground states of the strong coupling BCS-Hubbard Hamiltonian. We show that the one-site electron repulsion can favor superconductivity at ﬁxed chemical potential by increasing the critical temperature and/or the Cooper pair condensate density. If the one-site repulsion is not too large, a ﬁrst or a second order superconducting phase transition can appear at low temperatures. The Meißner eﬀect is shown to be rather generic but coexistence of superconducting and ferromagnetic phases is also shown to be feasible, for instance, near half-ﬁlling and at strong repulsion. Our proof of a superconductor-Mott insulator phase transition implies a rigorous explanation of the necessity of doping insulators to create superconductors. These mathematical results are consequences of “quantum large deviation” arguments combined with an adaptation of the proof of Størmer’s theorem [1] to even states on the CAR algebra. Keywords: Superconductivity; s-wave; Coulomb interaction; Hubbard model; Meißner eﬀect; Mott insulators; equilibrium states; Størmer’s theorem. Mathematics Subject Classiﬁcation 2010: 82B20, 82D55

Contents

1. Introduction
2. Grand-Canonical Pressure and Gap Equation
3. Phase Diagram at Fixed Chemical Potential
3.1. Existence of a s-wave superconducting phase transition
3.2. Electron density per site and electron-hole symmetry
3.3. Superconductivity versus magnetization: Meißner effect
3.4. Coulomb correlation density
3.5. Superconductor-Mott insulator phase transition
3.6. Mean-energy per site and the specific heat
4. Phase Diagram at Fixed Electron Density per Site
4.1. Thermodynamics away from any critical point
4.2. Coexistence of ferromagnetic and superconducting phases
5. Concluding Remarks
6. Mathematical Foundations of the Thermodynamic Results
6.1. Thermodynamic limit of the pressure: Proof of Theorem 2.1
6.2. Equilibrium and ground states of the strong coupling BCS-Hubbard model
7. Analysis of the Variational Problem
Appendix. Griffiths Arguments

1. Introduction

Since the discovery of mercury superconductivity in 1911 by the Dutch physicist Onnes, the study of superconductors has continued to intensify, see, e.g., [2]. Since that discovery, a significant number of superconducting materials have been found. These include usual metals, like lead, aluminum, zinc or platinum, magnetic materials, heavy-fermion systems, organic compounds and ceramics. A complete description of their thermodynamic properties is an entire subject by itself, see [2–4] and references therein. In addition to zero-resistivity and many other complex phenomena, superconductors manifest the celebrated Meißner or Meißner–Ochsenfeld effect, i.e. they can become perfectly diamagnetic. The highest^a critical temperature for superconductivity obtained nowadays is between 100 and 200 Kelvin via doped copper oxides, which are originally insulators. In contrast to most superconductors, note that superconduction in magnetic superconductors only exists on a finite range of non-zero temperatures.

a In January 2008, a critical temperature over 180 Kelvin was reported in a Pb-doped copper oxide.

Theoretical foundations of superconductivity go back to the celebrated BCS theory, which appeared in the late fifties (1957) and explains conventional type I superconductors. This theory is based on the so-called (reduced) BCS Hamiltonian

\[
H_\Lambda^{BCS} := \sum_{k \in \Lambda^*} (\varepsilon_k - \mu)\bigl(\tilde{a}_{k,\uparrow}^* \tilde{a}_{k,\uparrow} + \tilde{a}_{k,\downarrow}^* \tilde{a}_{k,\downarrow}\bigr) + \frac{1}{|\Lambda|} \sum_{k,k' \in \Lambda^*} \gamma_{k,k'}\, \tilde{a}_{k,\uparrow}^* \tilde{a}_{-k,\downarrow}^* \tilde{a}_{k',\downarrow} \tilde{a}_{-k',\uparrow} \tag{1.1}
\]

defined in a cubic box $\Lambda \subset \mathbb{R}^3$ of volume $|\Lambda|$. Here $\Lambda^*$ is the dual group of $\Lambda$ seen as a torus (periodic boundary condition) and the operator $\tilde{a}_{k,s}^*$ (respectively $\tilde{a}_{k,s}$) creates (respectively annihilates) a fermion with spin $s \in \{\uparrow, \downarrow\}$ and momentum $k \in \Lambda^*$. The function $\varepsilon_k$ represents the kinetic energy, the real number $\mu$ is the chemical potential and $\gamma_{k,k'}$ is the BCS coupling function. The choice $\gamma_{k,k'} = -\gamma < 0$ is often used in the Physics literature and the case $\varepsilon_k = 0$ is known as the strong coupling limit of the BCS model. The lattice approximation of the BCS Hamiltonian amounts to replacing the box $\Lambda \subset \mathbb{R}^3$ by $\Lambda \subset \mathbb{Z}^3$ (or, more generally, by $\Lambda \subset \mathbb{Z}^{d \ge 1}$) and the strong coupling limit of the reduced BCS model is in this case known as the strong coupling (with $\gamma_{k,k'} = -\gamma$) BCS model.^b The assumptions $\varepsilon_k = 0$ and $\gamma_{k,k'} = -\gamma$ are of interest, because in this case the BCS Hamiltonian can be explicitly diagonalized. The exact solution of the strong coupling BCS model has been well known since the sixties [6, 7]. This model is in a sense unrealistic: among other things, its representation of the kinetic energy of electrons is rather poor. Nevertheless, it became popular because it displays most of the basic properties of real conventional type I superconductors. See, e.g., [8, Chap. VII, Sec. 4]. Even though the analysis of the thermodynamics of the BCS Hamiltonian was rigorously performed in the eighties [9, 10] (see also the innovative work of Bernadskii and Minlos in 1972 [11]), generalizations of the strong coupling approximation of the BCS model are still a subject of research. For instance, strong coupling-BCS-type models with superconducting phases at arbitrarily high temperatures are treated in [12]. In fact, a general theory of superconductivity is still a subject of debate, especially for high-Tc superconductors.
An important phenomenon ignored in the BCS theory is the Coulomb interaction between electrons or holes, which can imply strong correlations, for instance in high-Tc superconductors. To study these correlations, most theoretical methods, inspired by Beliaev [5], use perturbation theory or renormalization group derived from the diagram approach of Quantum Field Theory. However, even if these approaches have been successful in explaining many physical properties of superconductors [3, 4], only few rigorous results exist on superconductivity. For instance, the effect of the Coulomb interaction on superconductivity is not rigorously known. This problem was of course addressed in theoretical Physics right after the emergence of the Fröhlich model and the BCS theory, see, e.g., [13].

b See also (1.2) with λ = 0 and h = 0.


In particular, the authors explain in [13, Chap. VI], by means of diagrammatic perturbation theory, that the effect of the Coulomb interaction on the Fröhlich model should be to lower the critical temperature of the superconducting phase by lowering the electron density. We rigorously show that this phenomenology is only true, for our model, in a specific region of parameters.

Indeed, the aim of the present paper is to understand the possible thermodynamic impact of the Coulomb repulsion in the strong coupling approximation. More precisely, we study the thermodynamic properties of the strong coupling BCS-Hubbard model defined in the box^c $\Lambda_N := \{\mathbb{Z} \cap [-L, L]\}^{d \ge 1}$ of volume $|\Lambda_N| = N \ge 2$ by the Hamiltonian

\[
H_N := -\mu \sum_{x \in \Lambda_N} (n_{x,\uparrow} + n_{x,\downarrow}) - h \sum_{x \in \Lambda_N} (n_{x,\uparrow} - n_{x,\downarrow}) + 2\lambda \sum_{x \in \Lambda_N} n_{x,\uparrow} n_{x,\downarrow} - \frac{\gamma}{N} \sum_{x,y \in \Lambda_N} a_{x,\uparrow}^* a_{x,\downarrow}^* a_{y,\downarrow} a_{y,\uparrow} \tag{1.2}
\]

for real parameters $\mu$, $h$, $\lambda$, and $\gamma \ge 0$. The operator $a_{x,s}^*$ (respectively $a_{x,s}$) creates (respectively annihilates) a fermion with spin $s \in \{\uparrow, \downarrow\}$ at lattice position $x \in \mathbb{Z}^d$, whereas $n_{x,s} := a_{x,s}^* a_{x,s}$ is the particle number operator at position $x$ and spin $s$.

c Without loss of generality, we choose $N$ such that $L := (N^{1/d} - 1)/2 \in \mathbb{N}$.

The first term of the right-hand side of (1.2) represents the strong coupling limit of the kinetic energy, with $\mu$ being the chemical potential of the system. Note that this "strong coupling limit" (explained above for the BCS Hamiltonian) is also called "atomic limit" in the context of the Hubbard model, see, e.g., [14, 15]. The second term in the right-hand side of (1.2) corresponds to the interaction between spins and the magnetic field $h$. The one-site interaction with coupling constant $\lambda$ represents the (screened) Coulomb repulsion as in the celebrated Hubbard model. So, the parameter $\lambda$ should be taken as a positive number but our results are also valid for any real $\lambda$. The last term is the BCS interaction written in the x-space since

\[
\frac{\gamma}{N} \sum_{x,y \in \Lambda_N} a_{x,\uparrow}^* a_{x,\downarrow}^* a_{y,\downarrow} a_{y,\uparrow} = \frac{\gamma}{N} \sum_{k,q \in \Lambda_N^*} \tilde{a}_{k,\uparrow}^* \tilde{a}_{-k,\downarrow}^* \tilde{a}_{q,\downarrow} \tilde{a}_{-q,\uparrow}, \tag{1.3}
\]

with $\Lambda_N^*$ being the reciprocal lattice of quasi-momenta and where $\tilde{a}_{q,s}$ is the corresponding annihilation operator for $s \in \{\uparrow, \downarrow\}$. Observe that the thermodynamics of the model for $\gamma = 0$ can easily be computed. Therefore, we restrict the analysis to the case $\gamma > 0$. Note also that the homogeneous BCS interaction (1.3) can imply a superconducting phase and the mediator implying this effective interaction does not matter here, i.e. it could be due to phonons, as in conventional type I superconductors, or anything else.

We show that the one-site repulsion suppresses superconductivity for large $\lambda \ge 0$. In particular, the repulsive term in (1.2) cannot imply any superconducting state if $\gamma = 0$. However, the first elementary but nonetheless important property


of this model is that the presence of an electron repulsion is not incompatible with superconductivity if $|\lambda - \mu|$ and $(\lambda + |h|)$ are not too big as compared to the coupling constant $\gamma$ of the BCS interaction. In this case, the superconducting phase appears at low temperatures as either a first order or a second order phase transition. More surprisingly, the one-site repulsion can even favor superconductivity at fixed chemical potential $\mu$ by increasing the critical temperature and/or the Cooper pair condensate density. This contradicts the naive guess that any one-site repulsion between electron pairs should at least reduce the formation of Cooper pairs. It is however important to mention that the physical behavior described by the model depends on which parameter, $\mu$ or $\rho$, is fixed. (It does not mean that the canonical and grand-canonical ensembles are not equivalent for this model.) Indeed, we also analyze the thermodynamic properties at fixed electron density $\rho$ per site in the grand-canonical ensemble, as it is done for the perfect Bose gas in the proof of Bose–Einstein condensation.

The analysis of the thermodynamics of the strong coupling BCS-Hubbard model is performed in detail. In particular, we prove that the Meißner effect is rather generic but also that the coexistence of superconducting and ferromagnetic phases is possible (as in the Vonsovskii–Zener model [16, 17]), for instance at large $\lambda > 0$ and densities near half-filling. The latter situation is related to a superconductor-Mott insulator phase transition. This transition gives furthermore a rigorous explanation of the need of doping insulators to obtain superconductors. Indeed, at large enough coupling constant $\lambda$, the superconductor-Mott insulator phase transition corresponds to the breakdown of superconductivity together with the appearance of a gap in the chemical potential as soon as the electron density per site becomes an integer, i.e. 0, 1 or 2. If the system has an electron density per site equal to 1 without being a superconductor, then any non-zero magnetic field $h \ne 0$ implies a ferromagnetic phase.

Note that the present setting is still too simplified with respect to real superconductors. For instance, the anti-ferromagnetic phase or the presence of vortices, which can appear in (type II) high-Tc superconductors [3, 4], are not modeled. However, the BCS-Hubbard Hamiltonian (1.2) may be a good model for certain kinds of superconductors or ultra-cold Fermi gases in optical lattices, where the strong coupling approximation is experimentally justified. Actually, even if the strong coupling assumption is a severe simplification, it may be used in order to analyze the thermodynamic impact of the Coulomb repulsion, as all parameters of the model have a phenomenological interpretation and can be directly related to experiments. See discussions in Sec. 5. Moreover, the range of parameters in which we are interested turns out to be related to a first order phase transition. This kind of phase transition is known to be stable under small perturbations of the Hamiltonian. In particular, by including a small kinetic part it can be shown by high-low temperature expansions that the model

\[
H_{N,\varepsilon} := H_N + \sum_{x,y \in \Lambda_N} \varepsilon(x - y)\bigl(a_{y,\downarrow}^* a_{x,\downarrow} + a_{y,\uparrow}^* a_{x,\uparrow}\bigr)
\]


has essentially the same correlation functions as $H_N$, up to corrections of order $\|\varepsilon\|_1$ (the $\ell^1$-norm of $\varepsilon$). This analysis will be the subject of a separate paper. For any $\varepsilon \ne 0$ notice that the model $H_{N,\varepsilon}$ is no longer permutation invariant but only translation invariant. Such translation invariant models are studied in a systematic way in [18]. Their detailed analysis is, however, generally much more difficult to perform. Considering first models having more symmetries, as for instance permutation invariance, is in this case technically easier.

Coming back to the strong coupling BCS-Hubbard model $H_N$, it turns out that the thermodynamic limit of its (grand-canonical) pressure^d

\[
p_N(\beta, \mu, \lambda, \gamma, h) := \frac{1}{\beta N} \ln \mathrm{Trace}(e^{-\beta H_N}) \tag{1.4}
\]

exists at any fixed inverse temperature $\beta > 0$. It corresponds to a variational problem which has minimizers^e in the set $E_U^{S,+}$ of (even^f) permutation invariant states on the CAR $C^*$-algebra $U$ generated by annihilation and creation operators:

\[
p(\beta, \mu, \lambda, \gamma, h) := \lim_{N \to \infty} \{p_N(\beta, \mu, \lambda, \gamma, h)\} = -\inf_{\omega \in E_U^{S,+}} F(\omega). \tag{1.5}
\]

Here the map $\omega \to F(\omega) := e(\omega) - \beta^{-1} \tilde{S}(\omega)$ is the affine (lower weak$^*$-semicontinuous) free-energy density functional defined on $E_U^{S,+}$ from the mean energy per volume

\[
e(\omega) := \lim_{N \to \infty} \{N^{-1} \omega(H_N)\} < \infty
\]

and the entropy density

\[
\tilde{S}(\omega) := -\lim_{N \to \infty} \frac{1}{N} \mathrm{Trace}\bigl(D_{\omega|_{U_N}} \log D_{\omega|_{U_N}}\bigr) < \infty.
\]

Note that $D_{\omega|_{U_N}}$ is the density matrix associated to the state $\omega$ restricted to the local CAR $C^*$-algebra $U_N \simeq B(\bigwedge \mathbb{C}^{\Lambda_N \times \{\uparrow, \downarrow\}})$ (isomorphism). Such derivations of the pressure as a minimization problem over states on a $C^*$-algebra are also performed for various quantum spin systems, see, e.g., [19–23].

d Our notation for the "Trace" does not include the Hilbert space where it is evaluated but it should be deduced from operators involved in each statement.
e Because $\omega \to F(\omega)$ is lower semicontinuous and $E_U^{S,+}$ is compact with respect to the weak$^*$ topology.
f See Remark 6.1 in Sec. 6.1.

The minimum of the variational problem (1.5) is attained for any weak$^*$-limit point of local Gibbs states

\[
\omega_N(\cdot) := \frac{\mathrm{Trace}(\,\cdot\; e^{-\beta H_N})}{\mathrm{Trace}(e^{-\beta H_N})} \tag{1.6}
\]

associated with $H_N$. Similarly to what is done for general translation invariant models (see [24, 25]), the set of equilibrium states of the strong coupling BCS-Hubbard model is naturally defined to be the set $\Omega_\beta = \Omega_\beta(\mu, \lambda, \gamma, h)$ of minimizers


of (1.5). Note that $\Omega_\beta$ is a non-empty convex subset^g of $E_U^{S,+}$ and the extreme decomposition in $\Omega_\beta$ coincides with the one in $E_U^{S,+}$, i.e. $\Omega_\beta$ is a face^h in $E_U^{S,+}$. So, pure equilibrium states are extreme states of $\Omega_\beta$. Meanwhile, any weak$^*$ limit point as $n \to \infty$ of an equilibrium state sequence $\{\omega^{(n)}\}_{n \in \mathbb{N}}$ with diverging inverse temperature $\beta_n \to \infty$ is, by definition, a ground state $\omega \in E_U^{S,+}$.

Here we have left the Fock space representation of the model to go to a representation-free formulation of thermodynamic phases. This means that $H_N$ is no longer seen as a Hamiltonian acting on the Fock space but as a (self-adjoint) element of the CAR $C^*$-algebra $U$ with thermodynamic phases described by states on $U$. Doing so we take advantage of the non-uniqueness of the representation of the CAR $C^*$-algebra $U$. This property is indeed necessary to get non-unique equilibrium and ground states which imply phase transitions. This fact was first observed by Haag in 1962 [26], who established that the non-uniqueness of the ground state of the BCS model in infinite volume is related to the existence of several inequivalent^i irreducible representations^j of the Hamiltonian, see also [6, 27].

g The map $\omega \to F(\omega)$ on the convex set $E_U^{S,+}$ is affine and lower semicontinuous, thus $\Omega_\beta$ is a non-empty face of $E_U^{S,+}$.
h A face $F$ of a compact convex set $K$ is a subset of $K$ with the property that if $\omega = \sum_{n=1}^m \lambda_n \omega_n \in F$ with $\sum_{n=1}^m \lambda_n = 1$ and $\{\omega_n\}_{n=1}^m \subset K$, then $\{\omega_n\}_{n=1}^m \subset F$.
i This means that there is no isomorphism between $h_{j_1}$ and $h_{j_2}$ whenever $h_{j_1}$ and $h_{j_2}$ are the Hilbert spaces corresponding to two different irreducible representations.
j This means that the Hamiltonian can be seen as an operator acting on several Hilbert spaces $\{h_j\}_{j \in J}$ with no (non-trivial) invariant subspace.

Equilibrium states define tangents to the convex map $(\beta, \mu, \lambda, \gamma, h) \to p(\beta, \mu, \lambda, \gamma, h)$. The analysis of the set of tangents of this map hence gives information about the expectations of many important observables with respect to equilibrium states. The main technical point in the present work is therefore to find an explicit representation of the pressure by using the permutation invariance of the model in a crucial way. Indeed, we adapt to our case of fermions on a lattice the methods of [19] used to find the pressure of spin systems of mean-field type. Then, it is proven that it suffices to minimize the variational problem (1.5) with respect to the set of extreme states of $E_U^{S,+}$. By adapting the proof of Størmer's theorem [1] to even states on the CAR algebra, we show next that extreme, permutation invariant and even states are product states

\[
\omega_\zeta := \bigotimes_{x \in \mathbb{Z}^d} \zeta_x
\]

obtained by "copying" some one-site even state $\zeta$ to all other sites. This result is a non-commutative version of the celebrated de Finetti Theorem from (classical) probability theory [28]. Using this, the variational problem (1.5) can be drastically simplified to a minimization problem on a finite dimensional manifold. At the end, it leads to another explicit, rather simple, variational problem on $\mathbb{R}_0^+$, which can

April 20, 2010 14:17 WSPC/S0129-055X

240

148-RMP

J070-00395

J.-B. Bru & W. de Siqueira Pedra

be rigorously analyzed by analytic or numerical methods to obtain the complete thermodynamic behavior of the model.

Observe, however, that not all correlation functions can be drawn from an explicit formula for the pressure by taking derivatives combined with Griffiths arguments [29–31] on the convergence of derivatives of convex functions, unless the (infinite volume) pressure is shown to be differentiable with respect to any perturbation. Showing differentiability of the pressure as well as the explicit computation of its corresponding derivative can be a very hard task, for instance for correlation functions involving many lattice points. By contrast, the method presented in this paper gives access to all correlation functions at once. This is one basic (mathematical) message of this method, which is generalized in [18] to all translation invariant Fermi systems without requiring any quantum spin representation. In fact, we precisely characterize the sets Ωβ for all β ∈ (0, ∞], where Ω∞ is the set of ground states with parameters µ, γ, λ, and h. This detailed study yields our main rigorous results on the strong coupling BCS-Hubbard model HN, which can be summarized as follows:

• There is a set of parameters S, defining the superconducting phase, with equilibrium and ground states breaking the U(1)-gauge symmetry and showing off-diagonal long range order (ODLRO).
• Depending on the parameters, the superconducting phase transition is either a first order or a second order phase transition.
• The superconducting phase S is characterized by the formation of Cooper pairs (shown by proving bounds for the density-density correlations) and a depleted Cooper pair condensate, the density rβ ∈ [0, 1/4] of which is defined by the gap equation.
• From our proof of Størmer's theorem [1] for even states on the CAR algebra, we observe that the superconducting phase S corresponds to an s-wave superconductor, i.e. a superconductor with two-point correlation function, for $x, y \in \mathbb{Z}^d$, $s_1, s_2 \in \{\uparrow,\downarrow\}$ and within S, equal to $\omega(a_{x,s_1}a_{y,s_2}) = r_\beta^{1/2}e^{i\phi} \neq 0$ if $x = y$ and $s_1 \neq s_2$, and $\omega(a_{x,s_1}a_{y,s_2}) = 0$ else. (Here ω is any pure state of Ωβ; φ ∈ [0, 2π) is determined by ω.)
• We observe the Meißner effect^k by analyzing the relation between superconductivity and magnetization.
• We establish the existence of a superconductor-Mott insulator phase transition for integer electron density per site.
• The coexistence of ferromagnetic and superconducting phases is shown to be feasible at (critical) points of the boundary ∂S of S, by applying the decomposition theory for states [32] on the weak∗-compact and convex set Ωβ.

^k It is mathematically defined here by the absence of magnetization in presence of superconductivity. Steady surface currents around the bulk of the superconductor are not analyzed as it is a finite volume effect.


Eﬀect of a Locally Repulsive Interaction on s-Wave Superconductors


• The critical temperature θc for the superconducting phase transition with respect to λ, γ or h is analyzed in the case of fixed chemical potential µ and also in the case of constant electron density ρ. It shows that θc can be an increasing function of the positive coupling constant λ > 0 at fixed µ ∈ R but not at fixed ρ > 0.
• For λ ∼ γ the critical temperature θc shows, as a function of the electron density ρ, the typical behavior observed (only) in high-Tc superconductors: θc is zero or very small for ρ ∼ 1 and is much larger for ρ away from 1. Thus, our model provides a simple rigorous microscopic explanation for such experimentally well-known behavior of high-Tc superconductors.
• Together with our study of the heat capacity, all these results can be used to fix experimentally all parameters of HN.

Note that our study of equilibrium states is reminiscent of the work of Fannes, Spohn and Verbeure [33], performed however within a different framework. In contrast to our setting, their analysis [33] concerns symmetric states on an infinite tensor product of one C∗-algebra and their definition of equilibrium states uses the so-called correlation inequalities for KMS-states, see [29, Appendix E].

To conclude, this paper is organized as follows. In Sec. 2, we give the thermodynamic limit of the pressure pN (1.4) as well as the gap equation. Then, our main results concerning the thermodynamic properties of the model are formulated in Sec. 3 at fixed chemical potential µ and in Sec. 4 at fixed electron density ρ per site. Section 5 briefly explains our result on the level of equilibrium states and gives additional remarks. In order to keep the main issues and the physical implications as transparent as possible, we reduce the technical and formal aspects to a minimum in Secs. 2–5. In particular, in Secs. 2–4 we stay on the level of the pressure and the thermodynamic limit of local Gibbs states. The generalization of the results on the level of equilibrium and ground states is postponed to Sec. 6.2. Indeed, the rather long Sec. 6 gives the detailed mathematical foundations of our phase diagrams. In particular, in Sec. 6.1 we introduce the C∗-algebraic machinery needed in our analysis and prove various technical facts, to conclude in Sec. 6.2 with the rigorous study of equilibrium and ground states. In Sec. 7, we collect some useful properties on the qualitative behavior of the Cooper pair condensate density, whereas the Appendix reviews Griffiths arguments [29–31].

2. Grand-Canonical Pressure and Gap Equation

In order to obtain the thermodynamic behavior of the strong coupling BCS-Hubbard model HN, it is essential to get first the thermodynamic limit N → ∞ of its grand-canonical pressure pN (1.4). The rigorous derivation of this limit is performed in Sec. 6.1. We explain here the final result with the heuristics behind it. The first important remark is that one can guess the correct variational problem by the so-called approximating Hamiltonian method [34–36] originally proposed by Bogoliubov Jr. [37]. In our case, the correct approximation of the Hamiltonian HN


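is the c-dependent Hamiltonian

$$H_N(c) := -\mu\sum_{x\in\Lambda_N}(n_{x,\uparrow}+n_{x,\downarrow}) - h\sum_{x\in\Lambda_N}(n_{x,\uparrow}-n_{x,\downarrow}) + 2\lambda\sum_{x\in\Lambda_N} n_{x,\uparrow}n_{x,\downarrow} - \frac{\gamma}{N}\sum_{x\in\Lambda_N}\big((Nc)\,a^*_{x,\uparrow}a^*_{x,\downarrow} + (N\bar c)\,a_{x,\downarrow}a_{x,\uparrow}\big), \tag{2.1}$$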

with c ∈ C, see also [6, 7]. The main advantage of this Hamiltonian in comparison with HN is the fact that it is a sum of shifts of the same local operator. For an appropriate order parameter c ∈ C, it leads to a good approximation of the pressure pN as N → ∞. This can be partially seen from the inequality

$$\gamma N|c|^2 + H_N(c) - H_N = \frac{\gamma}{N}\Big(\sum_{x\in\Lambda_N} a^*_{x,\uparrow}a^*_{x,\downarrow} - N\bar c\Big)\Big(\sum_{x\in\Lambda_N} a_{x,\downarrow}a_{x,\uparrow} - Nc\Big) \ge 0,$$

which is valid as soon as γ ≥ 0. Observe that the constant term $\gamma N|c|^2$ is not included in the definition of HN(c). Hence, by using the consequence $\mathrm{Trace}(e^{A-B^*B}) \le \mathrm{Trace}(e^{A})$ of the Golden–Thompson inequality, the thermodynamic limit p(β, µ, λ, γ, h) of the pressure pN (1.4) is bounded from below by

$$p(\beta,\mu,\lambda,\gamma,h) \ge \sup_{c\in\mathbb{C}}\{-\gamma|c|^2 + p(c)\}. \tag{2.2}$$

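The function p(c) = p(β, µ, λ, γ, h; c) is the pressure associated with HN(c) for any N ≥ 1. It can easily be computed since HN(c) is a sum of local operators which commute with each other. Indeed, for any N ≥ 1, this pressure equals^l

$$p(c) := \frac{1}{\beta N}\ln\mathrm{Trace}(e^{-\beta H_N(c)}) = \frac{1}{\beta}\ln\mathrm{Trace}(e^{-\beta H_1(c)}) = \frac{1}{\beta}\ln\mathrm{Trace}\big(e^{\beta\{(\mu+h)n_\uparrow + (\mu-h)n_\downarrow + \gamma(c\,a^*_\uparrow a^*_\downarrow + \bar c\,a_\downarrow a_\uparrow) - 2\lambda n_\uparrow n_\downarrow\}}\big). \tag{2.3}$$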
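The trace in (2.3) lives on the four-dimensional one-site Fock space and can be evaluated by hand. The following worked step is a sketch, assuming the basis $|0\rangle$, $|{\uparrow}\rangle$, $|{\downarrow}\rangle$, $|{\uparrow\downarrow}\rangle := a^*_\uparrow a^*_\downarrow|0\rangle$ (a convention chosen here for illustration); it shows where the hyperbolic cosines of the explicit pressure formula come from.

```latex
% H_1(c) preserves the parity of the particle number, so it is block diagonal.
\begin{align*}
-H_1(c)\big|_{\mathrm{span}\{|\uparrow\rangle,|\downarrow\rangle\}}
  &= \begin{pmatrix} \mu+h & 0 \\ 0 & \mu-h \end{pmatrix}, \\
-H_1(c)\big|_{\mathrm{span}\{|0\rangle,|\uparrow\downarrow\rangle\}}
  &= \begin{pmatrix} 0 & \gamma\bar c \\ \gamma c & 2(\mu-\lambda) \end{pmatrix},
  \quad\text{with eigenvalues } (\mu-\lambda)\pm g_{|c|^2},
  \quad g_r := \{(\mu-\lambda)^2+\gamma^2 r\}^{1/2}. \\
\mathrm{Trace}\big(e^{-\beta H_1(c)}\big)
  &= e^{\beta(\mu+h)} + e^{\beta(\mu-h)}
     + e^{\beta(\mu-\lambda+g_{|c|^2})} + e^{\beta(\mu-\lambda-g_{|c|^2})} \\
  &= 2e^{\beta\mu}\big\{\cosh(\beta h) + e^{-\lambda\beta}\cosh(\beta g_{|c|^2})\big\}. \\
-\gamma|c|^2 + p(c)
  &= \beta^{-1}\ln 2 + \mu - \gamma|c|^2
     + \frac{1}{\beta}\ln\big\{\cosh(\beta h) + e^{-\lambda\beta}\cosh(\beta g_{|c|^2})\big\}.
\end{align*}
```

The right-hand side depends on c only through $r = |c|^2$, which is why the supremum over c ∈ C reduces to a one-dimensional problem over r ≥ 0.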

^l Here $a_{0,\uparrow}$, $a_{0,\downarrow}$ and $n_{0,\uparrow}$, $n_{0,\downarrow}$ are replaced, respectively, by $a_\uparrow$, $a_\downarrow$ and $n_\uparrow$, $n_\downarrow$.

To be useful, the variational problem in (2.2) should also be an upper bound of p(β, µ, λ, γ, h). By adapting the proof of Størmer's theorem [1] to even states on the CAR algebra and by using the Petz–Raggio–Verbeure proof for spin systems [19] as a guideline, we prove this in Sec. 6.1. Thus, the thermodynamic limit of the pressure of the model HN exists and can explicitly be computed by using the approximating Hamiltonian HN(c):

Theorem 2.1 (Grand-Canonical Pressure). For any β, γ > 0 and µ, λ, h ∈ R, the thermodynamic limit p(β, µ, λ, γ, h) of the grand-canonical pressure pN (1.4) equals

$$p(\beta,\mu,\lambda,\gamma,h) = \sup_{c\in\mathbb{C}}\{-\gamma|c|^2 + p(c)\} = \beta^{-1}\ln 2 + \mu + \sup_{r\ge 0} f(r) < \infty,$$


where the real function f(r) = f(β, µ, λ, γ, h; r) is defined by

$$f(r) := -\gamma r + \frac{1}{\beta}\ln\{\cosh(\beta h) + e^{-\lambda\beta}\cosh(\beta g_r)\},$$

with $g_r := \{(\mu-\lambda)^2 + \gamma^2 r\}^{1/2}$.

Remark 2.1. The fact that the pressure pN coincides as N → ∞ with the variational problem given by the so-called approximating Hamiltonian (here HN(c)) was previously proven via completely different methods in [34] for a large class of Hamiltonians (including HN) with BCS-type interaction. However, as explained in the introduction, our proof gives deeper results, not expressed in Theorem 2.1, on the level of states, cf. (1.5) and (6.33). In contrast to the approximating Hamiltonian method [34–37], it leads to a natural notion of equilibrium and ground states and allows the direct analysis of correlation functions. For more details, we recommend Sec. 6, particularly Sec. 6.2.

From the gauge invariance of the map c ↦ p(c), observe that any maximizer $c_\beta \in \mathbb{C}$ of the first variational problem given in Theorem 2.1 has the form $r_\beta^{1/2}e^{i\phi}$ with rβ ≥ 0 being a solution of

$$\sup_{r\ge 0} f(r) = f(r_\beta) \tag{2.4}$$

and φ ∈ [0, 2π). For any β, γ > 0 and real numbers µ, λ, h, it is also clear that the order parameter rβ is always bounded since f(r) diverges to −∞ when r → ∞. Up to (special) points (β, µ, λ, γ, h) corresponding to a phase transition of first order, it is always unique and continuous with respect to each parameter (see Sec. 7). For low inverse temperatures β (high temperature regime), rβ = 0. Indeed, straightforward computations at low enough β show that the function f(r) is concave as a function of r ≥ 0 whereas $\partial_r f(0) < 0$, see Sec. 7. On the other hand, any non-zero solution rβ of the variational problem (2.4) has to be a solution of the gap equation (or Euler–Lagrange equation)

$$\tanh(\beta g_{r_\beta}) = \frac{2g_{r_\beta}}{\gamma}\left(1 + \frac{e^{\lambda\beta}\cosh(\beta h)}{\cosh(\beta g_{r_\beta})}\right). \tag{2.5}$$

If $g_r = 0$, observe that one uses in (2.5) the asymptotics $x^{-1}\tanh x \sim 1$ as x → 0, see also (7.2). Because tanh(x) ≤ 1 for x ≥ 0, we then conclude that

$$0 \le r_\beta \le \max\{0, r_{\max}\}, \quad\text{with}\quad r_{\max} := \frac{1}{4} - \gamma^{-2}(\mu-\lambda)^2. \tag{2.6}$$

In particular, if γ ≤ 2|µ − λ|, then rβ = 0 for any β > 0. However, at large enough β > 0 (low temperature regime) and at fixed λ, h, µ ∈ R, there is a unique γc > 2|λ − µ| such that rβ > 0 for any γ ≥ γc. In other words, the domain of parameters (β, µ, λ, γ, h) where rβ is strictly positive is non-empty, see Figs. 1 and 2 and Sec. 7. Observe in Fig. 2 that a positive λ, i.e. a one-site repulsion, can significantly increase (right figure) the critical temperature θc = θc(µ, λ, γ, h), which is defined such that rβ > 0 if and only if $\beta > \theta_c^{-1}$.
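The variational problem (2.4) and the gap equation (2.5) are easy to explore numerically. The following is a minimal sketch, not the authors' method: it assumes that a grid search on [0, 1/4] (justified by the bound (2.6)) followed by a bisection on ∂r f is accurate enough, and all function names (`g_r`, `f`, `df`, `r_beta`, `gap_residual`, `pressure`) are hypothetical helpers introduced only for illustration.

```python
import math


def g_r(r, mu, lam, gamma):
    """g_r = {(mu - lam)^2 + gamma^2 r}^(1/2), as defined after Theorem 2.1."""
    return math.sqrt((mu - lam) ** 2 + gamma ** 2 * r)


def f(r, beta, mu, lam, gamma, h):
    """The function f(r) of Theorem 2.1."""
    g = g_r(r, mu, lam, gamma)
    return -gamma * r + math.log(
        math.cosh(beta * h) + math.exp(-lam * beta) * math.cosh(beta * g)
    ) / beta


def df(r, beta, mu, lam, gamma, h):
    """Derivative of f in r; it vanishes exactly when the gap equation (2.5) holds."""
    g = g_r(r, mu, lam, gamma)
    sinh_over_g = math.sinh(beta * g) / g if g > 0 else beta  # limit g -> 0
    denom = math.exp(lam * beta) * math.cosh(beta * h) + math.cosh(beta * g)
    return -gamma + 0.5 * gamma ** 2 * sinh_over_g / denom


def r_beta(beta, mu, lam, gamma, h, n_grid=5001):
    """Assumed numerical scheme: global grid search on [0, 1/4], then bisection on df."""
    args = (beta, mu, lam, gamma, h)
    grid = [0.25 * k / (n_grid - 1) for k in range(n_grid)]
    i = max(range(n_grid), key=lambda k: f(grid[k], *args))
    lo, hi = grid[max(i - 1, 0)], grid[min(i + 1, n_grid - 1)]
    if i == 0 and df(lo, *args) <= 0:
        return 0.0  # f decreases from r = 0: no condensate
    if not (df(lo, *args) > 0 > df(hi, *args)):
        return grid[i]  # no sign change to bisect on; keep the grid point
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if df(mid, *args) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gap_residual(r, beta, mu, lam, gamma, h):
    """Left-hand side minus right-hand side of the gap equation (2.5)."""
    g = g_r(r, mu, lam, gamma)
    rhs = (2 * g / gamma) * (
        1 + math.exp(lam * beta) * math.cosh(beta * h) / math.cosh(beta * g)
    )
    return math.tanh(beta * g) - rhs


def pressure(beta, mu, lam, gamma, h):
    """Infinite volume pressure as given by Theorem 2.1."""
    r = r_beta(beta, mu, lam, gamma, h)
    return math.log(2) / beta + mu + f(r, beta, mu, lam, gamma, h)
```

For instance, with the parameters of the figures (µ = 1, λ = 0.575, γ = 2.6, h = 0), `r_beta(7.0, 1.0, 0.575, 2.6, 0.0)` returns a strictly positive value below r_max, whereas `r_beta(0.5, 1.0, 0.575, 2.6, 0.0)` returns zero, in line with the high temperature discussion above.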


Fig. 1. Illustration, as a function of µ, of the critical temperature θc = θc(µ, λ, γ, h) such that rβ > 0 if and only if $\beta > \theta_c^{-1}$ (blue area) for γ = 2.6, h = 0 and with λ = −0.575 (left figure), 0 (center figure) and 0.575 (right figure). The blue line corresponds to a second order phase transition, whereas the red dashed line represents the domain of µ with a first order phase transition. The black dashed line is the chemical potential µ = λ corresponding to an electron density per site equal to 1, see Sec. 3. (Color online.)


Fig. 2. Illustration, as a function of λ, of the critical temperature θc = θc (µ, λ, γ, h) for γ = 2.6, h = 0 and with µ = −0.5 (left ﬁgure), µ = 1 (ﬁgure at the center) and µ = 1.25 (right ﬁgure). The blue line corresponds to a second order phase transition, whereas the red dashed line represents the domain of λ with ﬁrst order phase transition. The black dashed line is the coupling constant λ = µ corresponding to an electron density per site equal to 1, see Sec. 3. (Color online.)
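The critical temperature shown in Figs. 1 and 2 can be located numerically from the function f of Theorem 2.1. The sketch below is only an illustration under stated assumptions: it presumes that rβ is increasing in β (observed numerically in the paper, but not proven there in full generality), so that a bisection in β is meaningful, and it uses hypothetical helper names (`f`, `df`, `r_beta`, `critical_beta`) together with an assumed search window for β.

```python
import math


def _g(r, mu, lam, gamma):
    return math.sqrt((mu - lam) ** 2 + gamma ** 2 * r)


def f(r, beta, mu, lam, gamma, h):
    """The function f(r) of Theorem 2.1."""
    g = _g(r, mu, lam, gamma)
    return -gamma * r + math.log(
        math.cosh(beta * h) + math.exp(-lam * beta) * math.cosh(beta * g)
    ) / beta


def df(r, beta, mu, lam, gamma, h):
    """Derivative of f with respect to r."""
    g = _g(r, mu, lam, gamma)
    sinh_over_g = math.sinh(beta * g) / g if g > 0 else beta
    denom = math.exp(lam * beta) * math.cosh(beta * h) + math.cosh(beta * g)
    return -gamma + 0.5 * gamma ** 2 * sinh_over_g / denom


def r_beta(beta, mu, lam, gamma, h, n_grid=2001):
    """Grid search on [0, 1/4] followed by bisection on df (assumed scheme)."""
    args = (beta, mu, lam, gamma, h)
    grid = [0.25 * k / (n_grid - 1) for k in range(n_grid)]
    i = max(range(n_grid), key=lambda k: f(grid[k], *args))
    lo, hi = grid[max(i - 1, 0)], grid[min(i + 1, n_grid - 1)]
    if i == 0 and df(lo, *args) <= 0:
        return 0.0
    if not (df(lo, *args) > 0 > df(hi, *args)):
        return grid[i]
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if df(mid, *args) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def critical_beta(mu, lam, gamma, h, beta_lo=0.05, beta_hi=60.0, tol=1e-6):
    """Upper end of a bisection bracket for the critical inverse temperature.

    Returns None if no condensate is found at beta_hi (assumed search window).
    The critical temperature is then approximately 1 / critical_beta(...).
    """
    if r_beta(beta_hi, mu, lam, gamma, h) == 0.0:
        return None
    while beta_hi - beta_lo > tol:
        mid = 0.5 * (beta_lo + beta_hi)
        if r_beta(mid, mu, lam, gamma, h) > 0.0:
            beta_hi = mid
        else:
            beta_lo = mid
    return beta_hi
```

Scanning `1 / critical_beta(mu, lam, 2.6, 0.0)` over λ at fixed µ gives a rough numerical counterpart of the curves in Fig. 2; the bracket `[beta_lo, beta_hi]` may need widening for other parameters.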

From Lemma 7.1, the set of maximizers of the variational problem (2.4) has at most two elements in [0, 1/4]. It follows by continuity of (β, µ, λ, γ, h, r) ↦ f(β, µ, λ, γ, h; r), and from the fact that the interval [0, 1/4] is compact, that the set

$$S := \{(\beta,\mu,\lambda,\gamma,h) : \beta,\gamma > 0 \text{ and } r_\beta > 0 \text{ is the unique maximizer of (2.4)}\} \tag{2.7}$$

is open. In Sec. 3.1, we prove that the set S corresponds to the superconducting phase since the order parameter solution of (2.4) can be interpreted as the Cooper pair condensate density. The boundary ∂S of the set S is called the set of critical points of our model. By definition, if (2.4) has more than one maximizer, then (β, µ, λ, γ, h) ∈ ∂S, whereas if (β, µ, λ, γ, h) ∉ S ∪ ∂S, then r = 0 is the unique maximizer of (2.4). For more details on the study of the variational problem (2.4), we recommend Sec. 7.

3. Phase Diagram at Fixed Chemical Potential

By using our main theorem, i.e. Theorem 2.1, we can now explain the thermodynamic behavior of the strong coupling BCS-Hubbard model HN. The rigorous


proofs are however given in Sec. 6.2. Actually, we concentrate here on the physics of the model extracted from the (finite volume) grand-canonical Gibbs state ωN (1.6) associated with HN. We start by showing the existence of a superconducting phase transition in the thermodynamic limit.

3.1. Existence of a s-wave superconducting phase transition

The solution rβ of (2.4) can be interpreted as an order parameter related to the Cooper pair condensate density $\omega_N(c_0^* c_0)/N$, where

$$c_0 := \frac{1}{\sqrt{N}}\sum_{x\in\Lambda_N} a_{x,\downarrow}a_{x,\uparrow} = \frac{1}{\sqrt{N}}\sum_{k\in\Lambda_N^*} \tilde a_{k,\downarrow}\tilde a_{-k,\uparrow}$$

(respectively $c_0^*$) annihilates (respectively creates) one Cooper pair within the condensate, i.e. in the zero-mode for electron pairs. Indeed, in Sec. 6.2 (see Theorem 6.3) we prove, by using a notion of equilibrium states, the following.

Theorem 3.1 (Cooper Pair Condensate Density). For any β, γ > 0 and real numbers µ, λ, h away from any critical point, the (infinite volume) Cooper pair condensate density equals

$$\lim_{N\to\infty}\frac{1}{N}\,\omega_N(c_0^* c_0) = \lim_{N\to\infty}\Big\{\frac{1}{N^2}\sum_{x,y\in\Lambda_N}\omega_N(a^*_{y,\downarrow}a^*_{y,\uparrow}a_{x,\uparrow}a_{x,\downarrow})\Big\} = r_\beta \le \max\{0, r_{\max}\},$$

with r_max ≤ 1/4 defined in (2.6). The (uniquely defined) order parameter rβ = rβ(µ, λ, γ, h) is an increasing function of γ > 0.

Remark 3.1. In fact, Theorem 3.1 fails only if the order parameter rβ is discontinuous with respect to γ > 0 at fixed (β, µ, λ, h). In this case, the thermodynamic limit of the Cooper pair condensate density is bounded by the left and right limits of the corresponding (infinite volume) density, see the Appendix, in particular (A.1). Similar remarks can be made for Theorems 3.4–3.7.

At least for large enough β and γ, we have explained that rβ > 0, see Figs. 1 and 2. Illustrations of the Cooper pair condensate density rβ as a function of β and λ are given in Fig. 3. In other words, a superconducting phase transition can appear in our model. Its order depends on parameters: it can be a first order or a second order superconducting phase transition, cf. Fig. 3 and Sec. 7 for more details. From numerical investigations, note that rβ was always found to be an increasing function of β > 0. Unfortunately, we are able to prove only a part of this fact in Sec. 7. Therefore, a superconducting phase appearing only in a range of non-zero temperatures, as for magnetic superconductors, cannot rigorously be excluded. But we conjecture that our model can never show this phenomenon, i.e. rβ should always be an increasing function of β > 0.


Fig. 3. In the ﬁgure on the left, we have three illustrations of the Cooper pair condensate density rβ as a function of the inverse temperature β for λ = 0 (blue line), λ = 0.45 (red line) and λ = 0.575 (green line). The ﬁgure on the right represents a 3D illustration of rβ as a function of λ and β. The color from red to blue reﬂects the decrease of the temperature. In all ﬁgures, µ = 1, γ = 2.6 and h = 0. (Color online.)

Observe that a non-trivial solution rβ ≠ 0 is a manifestation of the breakdown of the U(1)-gauge symmetry. To see this phenomenon, we need to perturb the Hamiltonian HN with the external field $\alpha\sqrt{N}(e^{-i\phi}c_0 + e^{i\phi}c_0^*)$ for any α ≥ 0 and φ ∈ [0, 2π). This leads to the perturbed Gibbs state $\omega_{N,\alpha,\phi}(\cdot)$ defined by (1.6) with HN replaced by

$$H_{N,\alpha,\phi} := H_N - \alpha\sum_{x\in\Lambda_N}\big(e^{-i\phi}a_{x,\downarrow}a_{x,\uparrow} + e^{i\phi}a^*_{x,\uparrow}a^*_{x,\downarrow}\big), \tag{3.1}$$

see (6.42). We then obtain the following result for the so-called Bogoliubov quasi-averages (cf. Theorem 6.2).

Theorem 3.2 (Breakdown of the U(1)-Gauge Symmetry). For any β, γ > 0 and real numbers µ, λ, h away from any critical point, and for any φ ∈ [0, 2π), one gets for the Bogoliubov quasi-averages:

$$\lim_{\alpha\downarrow 0}\lim_{N\to\infty}\omega_{N,\alpha,\phi}(c_0/\sqrt{N}) = \lim_{\alpha\downarrow 0}\lim_{N\to\infty}\frac{1}{N}\sum_{x\in\Lambda_N}\omega_{N,\alpha,\phi}(a_{x,\downarrow}a_{x,\uparrow}) = r_\beta^{1/2}e^{i\phi},$$

with rβ ≥ 0 being the unique solution of (2.4), see Theorem 2.1.

Note that the breakdown of the U(1)-gauge symmetry should be "seen" in experiments via the so-called off-diagonal long range order (ODLRO) property of the correlation functions [38], see Sec. 6.2. In fact, because of the permutation invariance, Theorem 3.1 still holds if we remove the space average, i.e. for any


lattice sites x and y ≠ x,

$$\lim_{N\to\infty}\omega_N(a^*_{y,\downarrow}a^*_{y,\uparrow}a_{x,\uparrow}a_{x,\downarrow}) = r_\beta,$$

see Theorem 6.3. Similar remarks can be made for Theorems 3.4–3.7. Observe also that the type of superconductivity described here is the s-wave superconductivity, which is defined via the two-point correlation function.

Theorem 3.3 (s-Wave Superconductivity). For any β, γ > 0 and real numbers µ, λ, h away from any critical point, and for any φ ∈ [0, 2π), $x, y \in \mathbb{Z}^d$ and $s_1, s_2 \in \{\uparrow,\downarrow\}$, the two-point correlation function defined from the Bogoliubov quasi-averages equals

$$\lim_{\alpha\downarrow 0}\lim_{N\to\infty}\omega_{N,\alpha,\phi}(a_{x,s_1}a_{y,s_2}) = r_\beta^{1/2}e^{i\phi}\,\delta_{x,y}(1-\delta_{s_1,s_2}),$$

with rβ ≥ 0 being the unique solution of (2.4), see Theorem 2.1. Here $\delta_{x,y} = 1$ if and only if x = y.

In other words, for $x, y \in \mathbb{Z}^d$ and $s_1, s_2 \in \{\uparrow,\downarrow\}$ the two-point correlation function inside the superconducting phase is non-zero if and only if x = y and $s_1 \neq s_2$. More generally, for any infinite volume equilibrium state ω, we have $\omega(a_{x,s_1}a_{y,s_2}) = \omega(a_{0,s_1}a_{0,s_2})\delta_{x,y}$, see Sec. 6. We conclude now this analysis by giving the zero-temperature limit β → ∞ of the Cooper pair condensate density rβ proven in Sec. 7.

Corollary 3.1 (Cooper Pair Condensate Density at Zero-Temperature). The Cooper pair condensate density r∞ = r∞(µ, λ, γ, h) is equal at zero-temperature to

$$r_\infty := \lim_{\beta\to\infty} r_\beta = \begin{cases} r_{\max} & \text{for any } \gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}, \\ 0 & \text{for any } \gamma < \Gamma_{|\mu-\lambda|,\lambda+|h|}, \end{cases}$$

with r_max ≤ 1/4 (cf. (2.6) and Fig. 4) and

$$\Gamma_{x,y} := 2\big(y + \{y^2-x^2\}^{1/2}\big)\chi_{[0,y)}(x)\chi_{(0,\infty)}(y) + 2x\,\chi_{[y,\infty)}(x) \ge 0$$

being defined for any $x \in \mathbb{R}_+$ and $y \in \mathbb{R}$. Here $\chi_K$ is the characteristic function of the set K ⊂ R.

Remark 3.2. If $\gamma = \Gamma_{|\mu-\lambda|,\lambda+|h|}$, straightforward estimations show that the order parameter rβ converges to r∞ = 0, see Sec. 7. This special case is a critical point at sufficiently large β. We exclude it in our discussion since all thermodynamic limits of densities in Sec. 3 are performed away from any critical point, see, for instance, Theorem 3.1.

The result of Corollary 3.1 is in accordance with Theorem 3.1 in the sense that the order parameter r∞ is an increasing function of γ ≥ 0. Observe also that

$$\sup_{\lambda\in\mathbb{R}}\{r_\infty(\mu,\lambda,\gamma,h)\} = r_\infty(\mu,\mu,\gamma,h) = \frac{1}{4}$$


Fig. 4. In the ﬁgure on the left, the blue area represents the domain of (λ, γ) with 1 ≤ γ ≤ 6, where the (zero-temperature) Cooper pair condensate density r∞ is non-zero at µ = 1 and h = 0. The ﬁgure on the right represents a 3D illustration of r∞ when 1 ≤ γ ≤ 6 and −2.5 ≤ λ ≤ 2.5 with again µ = 1, h = 0. (Color online.)

for any fixed $\gamma > \Gamma_{0,\mu+|h|}$, whereas for any real numbers µ, λ, h,

$$\lim_{\gamma\to\infty} r_\infty(\mu,\lambda,\gamma,h) = \frac{1}{4}.$$

In other words, the superconducting phase for µ = λ is as perfect as for γ = ∞. In particular, in order to optimize the Cooper pair condensate density, if µ > 0, then it is necessary to increase the one-site repulsion by tuning λ to µ. Consequently, the direct repulsion between electrons can favor the superconductivity at fixed µ. This phenomenon is confirmed by the following analysis. First observe that Eq. (2.5) has no solution if γ ≤ 2|µ| and λ = 0. In other words, the strong coupling BCS theory has no phase transition as soon as γ ≤ 2|µ| and µ ≠ 0. However, even if γ ≤ 2|µ|, there is a range of λ where a superconducting phase takes place. For instance, take µ > 0 and note that $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}$ when

$$0 \le \mu - \frac{\gamma}{2} < \lambda < \mu + \frac{\gamma}{2} - \sqrt{\gamma(\mu+|h|)}. \tag{3.2}$$

This last inequality can always be satisfied for some λ > 0 if µ + |h| < γ ≤ 2µ. Therefore, although there is no superconductivity for γ ≤ 2|µ| and λ = 0, there is a range of λ ≥ 0 defined by (3.2) for µ + |h| < γ ≤ 2µ, where the superconductivity appears at low enough temperature, see Corollary 3.1 and Fig. 4. In the region γ ≥ 2µ > 0 where the superconducting phase can occur for λ = 0, observe also that the critical temperature θc for λ > 0 can sometimes be larger than the one for λ = 0, cf. Fig. 2.
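The zero-temperature statements above reduce to elementary formulas, so they can be checked directly. The following sketch transcribes the threshold Γx,y of Corollary 3.1, the limit r∞, and the window (3.2); the function names (`gamma_threshold`, `r_infinity`, `lambda_window`) are hypothetical and chosen only for this illustration.

```python
import math


def gamma_threshold(x, y):
    """Gamma_{x,y} of Corollary 3.1, for x >= 0 and real y."""
    first = 2 * (y + math.sqrt(y * y - x * x)) if (0 <= x < y and y > 0) else 0.0
    second = 2 * x if x >= y else 0.0
    return first + second


def r_infinity(mu, lam, gamma, h):
    """Zero-temperature Cooper pair condensate density of Corollary 3.1."""
    big_gamma = gamma_threshold(abs(mu - lam), lam + abs(h))
    if gamma > big_gamma:
        return 0.25 - ((mu - lam) / gamma) ** 2  # r_max of (2.6)
    return 0.0  # gamma < Gamma (Corollary 3.1) or gamma == Gamma (Remark 3.2)


def lambda_window(mu, gamma, h):
    """End points of the open interval for lambda in (3.2); intended for mu > 0."""
    lower = mu - gamma / 2
    upper = mu + gamma / 2 - math.sqrt(gamma * (mu + abs(h)))
    return lower, upper
```

As an example, for µ = 1, h = 0 and γ = 1.5 (so that µ + |h| < γ ≤ 2µ), `r_infinity(1.0, 0.0, 1.5, 0.0)` vanishes, whereas `r_infinity(1.0, 0.4, 1.5, 0.0)` is strictly positive, with λ = 0.4 lying inside `lambda_window(1.0, 1.5, 0.0)`.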

Remark 3.3. The eﬀect of a one-site repulsion on the superconducting phase transition may be surprising since one would naively guess that any repulsion between pairs of electrons should destroy the formation of Cooper pairs. In fact, the one-site and BCS interactions in (1.2) are not diagonal in the same basis, i.e. they do not commute. In particular, the Hubbard interaction cannot be directly interpreted as


a repulsion between Cooper pairs. This interpretation is only valid for large λ ≥ 0. Indeed, at fixed µ and γ > 0, if λ is large enough, there is no superconducting phase.

3.2. Electron density per site and electron-hole symmetry

We give next the grand-canonical density of electrons per site in the system (cf. Theorem 6.4).

Theorem 3.4 (Electron Density per Site). For any β, γ > 0 and real numbers µ, λ, h away from any critical point, the (infinite volume) electron density equals

$$\lim_{N\to\infty}\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\uparrow}+n_{x,\downarrow}) = d_\beta := 1 + \frac{(\mu-\lambda)\sinh(\beta g_{r_\beta})}{g_{r_\beta}\big(e^{\beta\lambda}\cosh(\beta h) + \cosh(\beta g_{r_\beta})\big)},$$

with dβ = dβ(µ, λ, γ, h) ∈ [0, 2], rβ ≥ 0 being the unique solution of (2.4) and $g_r := \{(\mu-\lambda)^2+\gamma^2 r\}^{1/2}$, see Theorem 2.1 and Fig. 5.

At low enough temperature and for $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}$, Corollary 3.1 tells us that a superconducting phase appears, i.e. rβ > 0. In this case, it is important to note that the electron density becomes independent of the temperature. Indeed, by combining Theorem 3.4 with (2.5) one gets that

$$d_\beta = 1 + 2\gamma^{-1}(\mu-\lambda) \tag{3.3}$$

is linear as a function of µ in the domain of (β, µ, λ, γ, h) where rβ > 0, i.e. in the presence of superconductivity, see Fig. 5. We give next the electron density per site in the zero-temperature limit β → ∞, which straightforwardly follows from Theorem 3.4 combined with Corollary 3.1.

Corollary 3.2 (Electron Density per Site at Zero-Temperature). The (infinite volume) electron density d∞ = d∞(µ, λ, γ, h) ∈ [0, 2] at zero-temperature


Fig. 5. In the ﬁgure on the left, we give illustrations of the electron density dβ as a function of the chemical potential µ for β < βc (red line) and β > βc (blue line) at coupling constant λ = 0 (ﬁgure on the left, β = 1.4, 2.45) and λ = 0.575 (ﬁgure on the center, β = 4, 6.45). In the ﬁgure on the right, dβ is given as a function of β at µ = 0.3 with λ > µ equal to 0.35 (orange line, second order phase transition), 0.575 (blue line, ﬁrst order phase transition) and 1.575 (green line, no phase transition). In all ﬁgures, γ = 2.6, h = 0 and βc = θc−1 is the critical inverse temperature. (Color online.)


is equal to

$$d_\infty := \lim_{\beta\to\infty} d_\beta = 1 + \frac{\mathrm{sgn}(\mu-\lambda)}{1 + \delta_{|\mu-\lambda|,\lambda+|h|}(1+\delta_{h,0})}\,\chi_{[\lambda+|h|,\infty)}(|\mu-\lambda|)$$

for $\gamma < \Gamma_{|\mu-\lambda|,\lambda+|h|}$, whereas within the superconducting phase, i.e. for $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}$ (Corollary 3.1), $d_\infty = 1 + 2\gamma^{-1}(\mu-\lambda)$. Recall that sgn(0) := 0.

To conclude, observe that (2 − dβ) is the density of holes in the system. So, if µ > λ, then dβ ∈ (1, 2], i.e. there are more electrons than holes in the system, whereas dβ ∈ [0, 1) for µ < λ, i.e. there are more holes than electrons. This phenomenon can directly be seen in the Hamiltonian HN, where there is a symmetry between electrons and holes as in the Hubbard model. Indeed, by replacing the creation operators $a^*_{x,\downarrow}$ and $a^*_{x,\uparrow}$ of electrons by the annihilation operators $-b_{x,\downarrow}$ and $-b_{x,\uparrow}$ of holes, we can map the Hamiltonian HN (1.2) for electrons to another strong coupling BCS-Hubbard model for holes defined via the Hamiltonian

$$\hat H_N := -\mu_{\mathrm{hole}}\sum_{x\in\Lambda_N}(\hat n_{x,\uparrow}+\hat n_{x,\downarrow}) - h_{\mathrm{hole}}\sum_{x\in\Lambda_N}(\hat n_{x,\uparrow}-\hat n_{x,\downarrow}) + 2\lambda\sum_{x\in\Lambda_N}\hat n_{x,\uparrow}\hat n_{x,\downarrow} - \frac{\gamma}{N}\sum_{x,y\in\Lambda_N} b^*_{y,\uparrow}b^*_{y,\downarrow}b_{x,\downarrow}b_{x,\uparrow} + 2(\lambda-\mu)N - \gamma,$$

with $\hat n_{x,\downarrow} := b^*_{x,\downarrow}b_{x,\downarrow}$, $\hat n_{x,\uparrow} := b^*_{x,\uparrow}b_{x,\uparrow}$, $h_{\mathrm{hole}} := -h$ and $\mu_{\mathrm{hole}} := 2\lambda - \mu - \gamma N^{-1}$.

Therefore, if one knows the thermodynamic behavior of HN for any h ∈ R and µ ≥ λ (regime with more electrons than holes), we directly get the thermodynamic properties for µ < λ (regime with more holes than electrons), which correspond to the ones given by $\hat H_N$ with $h_{\mathrm{hole}} = -h$ and a chemical potential for holes $\mu_{\mathrm{hole}} > \lambda$ at large enough N. Note that the last constant term in $\hat H_N$ shifts the grand-canonical pressure by a constant, but also the (infinite volume) mean-energy per site (Sec. 3.6).

3.3. Superconductivity versus magnetization: Meißner effect

It is well known that for magnetic fields h with |h| below some critical value $h_\beta^{(c)}$, type I superconductors become perfectly diamagnetic in the sense that the magnetic induction in the bulk is zero. Magnetic fields with strength above $h_\beta^{(c)}$ destroy the superconducting phase completely. This property is the celebrated Meißner or Meißner–Ochsenfeld effect. For small fields h (i.e. $|h| < h_\beta^{(c)}$) the magnetic field in the bulk of the superconductor is (almost) cancelled by the presence of steady surface currents. As we do not analyze transport here, we only give the magnetization density explicitly as a function of the external magnetic field h for the strong coupling BCS-Hubbard model. Note that type II superconductors cannot be covered


in the strong coupling regime since the vortices appearing in presence of magnetic fields come from the magnetic kinetic energy.

Theorem 3.5 (Magnetization Density). For any β, γ > 0 and real numbers µ, λ, h away from any critical point, the (infinite volume) magnetization density equals

$$\lim_{N\to\infty}\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\uparrow}-n_{x,\downarrow}) = m_\beta := \frac{\sinh(\beta h)\,e^{\lambda\beta}}{e^{\lambda\beta}\cosh(\beta h) + \cosh(\beta g_{r_\beta})},$$

with mβ = mβ(µ, λ, γ, h) ∈ [−1, 1], rβ ≥ 0 being the unique solution of (2.4) and $g_r := \{(\mu-\lambda)^2+\gamma^2 r\}^{1/2}$, see Theorem 2.1 and Fig. 6.

This theorem, deduced from Theorem 6.4, does not seem to show any Meißner effect since mβ ≠ 0 as soon as h ≠ 0. However, when the Cooper pair condensate density rβ is strictly positive, from Theorem 3.5 combined with (2.5) note that

$$m_\beta = \frac{2g_{r_\beta}\,e^{\lambda\beta}\sinh(\beta h)}{\gamma\sinh(\beta g_{r_\beta})}. \tag{3.4}$$

In particular, it decays exponentially as β → ∞ when rβ → r∞ > 0, see Fig. 6. We give therefore the zero-temperature limit β → ∞ of mβ in the next corollary.

Corollary 3.3 (Magnetization Density at Zero-Temperature). The (infinite volume) magnetization density m∞ = m∞(µ, λ, γ, h) ∈ [−1, 1] at zero-temperature is equal to

$$m_\infty := \lim_{\beta\to\infty} m_\beta = \frac{\mathrm{sgn}(h)}{1+\delta_{|\mu-\lambda|,\lambda+|h|}}\,\chi_{[0,\lambda+|h|]}(|\mu-\lambda|),$$

Fig. 6. In the ﬁgure on the left, we have an illustration of the electron density dβ (blue line), the Cooper pair condensate density rβ (red line) and the magnetization density mβ (green line) as functions of the magnetic ﬁeld h at β = 7, µ = 1, λ = 0.575 and γ = 2.6. The ﬁgure on the right represents a 3D illustration of mβ = mβ (1, 0.575, 2.6, h) as a function of h and β. The color from red to blue reﬂects the decrease of the temperature. In both ﬁgures, we can see the Meißner eﬀect (in the 3D illustration, the area with no magnetization corresponds to rβ > 0). (Color online.)


for $\gamma < \Gamma_{|\mu-\lambda|,\lambda+|h|}$ (see Corollary 3.1), whereas for $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}$ there is no magnetization at zero-temperature since mβ decays exponentially^m as β → ∞ to m∞ = 0.

Consequently, there is no superconductivity, i.e. r∞ = 0, when $\gamma < \Gamma_{|\mu-\lambda|,\lambda+|h|}$ and, as soon as h ≠ 0 with |µ − λ| < λ + |h|, there is a perfect magnetization at zero-temperature, i.e. m∞ = sgn(h). Observe that the condition |µ − λ| > λ + |h| implies from Corollary 3.2 that either d∞ = 0 or d∞ = 2, which implies that m∞ must be zero. On the other hand, if $\gamma > \Gamma_{|\mu-\lambda|,\lambda}$, we can define the critical magnetic field at zero-temperature by the unique positive solution

$$h_\infty^{(c)} := \gamma\Big(\frac{1}{4} + \gamma^{-2}(\mu-\lambda)^2\Big) - \lambda > 0 \tag{3.5}$$

of the equation $\Gamma_{|\mu-\lambda|,\lambda+y} = \gamma$ for y ≥ 0. Then, by increasing |h| up to $h_\infty^{(c)}$, the (zero-temperature) Cooper pair condensate density r∞ stays constant, whereas the (zero-temperature) magnetization density m∞ is zero, i.e. r∞ = r_max and m∞ = 0 for $|h| < h_\infty^{(c)}$, see Corollary 3.1. However, as soon as $|h| > h_\infty^{(c)}$, r∞ = 0 and m∞ = sgn(h), i.e. there is no Cooper pair and a pure magnetization takes place. In other words, the model manifests a pure Meißner effect at zero-temperature corresponding to a superconductor of type I, cf. Fig. 6.

Finally, note that we give an energetic interpretation of the critical magnetic field $h_\infty^{(c)}$ after Corollary 3.5. Observe also that a measurement of $h_\infty^{(c)}$ (3.5) implies, for instance, a measurement of the chemical potential µ if one would know γ and λ, which could be found via the asymptotic (3.15) of the specific heat, see discussions in Sec. 5.

3.4. Coulomb correlation density

The space distribution of electrons is still unknown and for such a consideration, we need the (infinite volume) Coulomb correlation density

$$\lim_{N\to\infty}\Big\{\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\uparrow}n_{x,\downarrow})\Big\}. \tag{3.6}$$

Together with the electron and magnetization densities dβ and mβ, the knowledge of (3.6) allows us in particular to explain in detail the difference between superconducting and non-superconducting phases in terms of space distributions of electrons. Actually, by the Cauchy–Schwarz inequality for the states one gets that

$$\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\uparrow}n_{x,\downarrow}) \le \sqrt{\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\uparrow})}\;\sqrt{\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\downarrow})}. \tag{3.7}$$

^m Actually, $m_\beta = O(e^{-(\gamma-2(\lambda+|h|))\beta/2})$ for $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|} \ge 2(\lambda+|h|)$.
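The closed formulas of Theorems 3.4 and 3.5, the identities (3.3) and (3.4), and the critical field (3.5) can be cross-checked numerically. The sketch below is only an illustration under stated assumptions: rβ is obtained by a grid search on [0, 1/4] followed by a bisection on ∂r f (an assumed scheme, not the paper's), and the names `densities` and `critical_field_zero_temperature` are hypothetical.

```python
import math


def g_r(r, mu, lam, gamma):
    """g_r = {(mu - lam)^2 + gamma^2 r}^(1/2)."""
    return math.sqrt((mu - lam) ** 2 + gamma ** 2 * r)


def f(r, beta, mu, lam, gamma, h):
    """The function f(r) of Theorem 2.1."""
    g = g_r(r, mu, lam, gamma)
    return -gamma * r + math.log(
        math.cosh(beta * h) + math.exp(-lam * beta) * math.cosh(beta * g)
    ) / beta


def df(r, beta, mu, lam, gamma, h):
    """Derivative of f with respect to r."""
    g = g_r(r, mu, lam, gamma)
    sinh_over_g = math.sinh(beta * g) / g if g > 0 else beta
    denom = math.exp(lam * beta) * math.cosh(beta * h) + math.cosh(beta * g)
    return -gamma + 0.5 * gamma ** 2 * sinh_over_g / denom


def r_beta(beta, mu, lam, gamma, h, n_grid=5001):
    """Grid search on [0, 1/4] followed by bisection on df (assumed scheme)."""
    args = (beta, mu, lam, gamma, h)
    grid = [0.25 * k / (n_grid - 1) for k in range(n_grid)]
    i = max(range(n_grid), key=lambda k: f(grid[k], *args))
    lo, hi = grid[max(i - 1, 0)], grid[min(i + 1, n_grid - 1)]
    if i == 0 and df(lo, *args) <= 0:
        return 0.0
    if not (df(lo, *args) > 0 > df(hi, *args)):
        return grid[i]
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if df(mid, *args) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def densities(beta, mu, lam, gamma, h):
    """Return (r_beta, d_beta, m_beta) from Theorems 3.4 and 3.5."""
    r = r_beta(beta, mu, lam, gamma, h)
    g = g_r(r, mu, lam, gamma)
    sinh_over_g = math.sinh(beta * g) / g if g > 0 else beta
    denom = math.exp(lam * beta) * math.cosh(beta * h) + math.cosh(beta * g)
    d = 1 + (mu - lam) * sinh_over_g / denom
    m = math.sinh(beta * h) * math.exp(lam * beta) / denom
    return r, d, m


def critical_field_zero_temperature(mu, lam, gamma):
    """Critical magnetic field (3.5); meaningful when gamma > Gamma_{|mu-lam|,lam}."""
    return gamma * (0.25 + ((mu - lam) / gamma) ** 2) - lam
```

With β = 7, µ = 1, λ = 0.575, γ = 2.6 and h = 0.05, the returned rβ is positive, the density dβ agrees with 1 + 2(µ − λ)/γ as in (3.3), and mβ agrees with the right-hand side of (3.4) up to rounding.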


From Theorems 3.4 and 3.5, the densities of electrons with spin up ↑ and down ↓ equal, respectively,

$$\lim_{N\to\infty}\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\uparrow}) = \frac{d_\beta+m_\beta}{2} \in [0,1]$$

and

$$\lim_{N\to\infty}\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\downarrow}) = \frac{d_\beta-m_\beta}{2} \in [0,1]$$

for any β, γ > 0 and µ, λ, h away from any critical point. Consequently, by using (3.7) in the thermodynamic limit, the (infinite volume) Coulomb correlation density is always bounded by

$$0 \le \lim_{N\to\infty}\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\uparrow}n_{x,\downarrow}) \le w_{\max} := \frac{1}{2}\sqrt{d_\beta^2 - m_\beta^2}. \tag{3.8}$$
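The upper bound in (3.8) follows from (3.7) by a one-line computation, spelled out here as a sketch using only the two spin densities given above:

```latex
\begin{align*}
w_{\max}
  = \sqrt{\frac{d_\beta+m_\beta}{2}}\,\sqrt{\frac{d_\beta-m_\beta}{2}}
  = \frac{1}{2}\sqrt{(d_\beta+m_\beta)(d_\beta-m_\beta)}
  = \frac{1}{2}\sqrt{d_\beta^2-m_\beta^2}.
\end{align*}
```

In particular, without magnetization (mβ = 0) the bound reduces to $w_{\max} = d_\beta/2$.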

If, for instance, (3.6) equals zero, then as soon as an electron is on a definite site, the probability to have a second electron with opposite spin at the same place goes to zero as $N \to \infty$. In this case, there would be no formation of pairs of electrons on a single site. This phenomenon does not appear exactly at finite temperature due to thermal fluctuations. Indeed, we can explicitly compute the Coulomb correlation in the thermodynamic limit (cf. Theorem 6.4):

Theorem 3.6 (Coulomb Correlation Density). For any $\beta, \gamma > 0$ and real numbers $\mu, \lambda, h$ away from any critical point, the (infinite volume) Coulomb correlation density equals^n
\[ \lim_{N\to\infty}\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\uparrow}n_{x,\downarrow}) = w_\beta := \frac{1}{2}\left(d_\beta - m_\beta\coth(\beta h)\right), \]
with $w_\beta = w_\beta(\mu,\lambda,\gamma,h) \in (0, w_{\max})$, see Fig. 7. Here $d_\beta$ and $m_\beta$ are, respectively, defined in Theorems 3.4 and 3.5.

Consequently, because $g_{r_\beta} \geq |\lambda - \mu|$, for any inverse temperature $\beta > 0$ the Coulomb correlation density is never zero, i.e. $w_\beta > 0$, even if the electron density $d_\beta$ is exactly 1, i.e. if $\lambda = \mu$. Moreover, the upper bound in (3.8) is also never attained. However, for low temperatures, $w_\beta$ goes exponentially fast with respect to $\beta$ to one of the bounds in (3.8), cf. Fig. 7. Indeed, one has the following zero-temperature limit:

Corollary 3.4 (Coulomb Correlation Density at Zero-Temperature). The (infinite volume) Coulomb correlation density $w_\infty = w_\infty(\mu,\lambda,\gamma,h) \in [0,1]$ at

^n If $h = 0$, then $w_\beta(\mu,\lambda,\gamma,0) := \lim_{h\to 0} w_\beta(\mu,\lambda,\gamma,h)$.

April 20, 2010 14:17 WSPC/S0129-055X

254

148-RMP

J070-00395

J.-B. Bru & W. de Siqueira Pedra

[Figure 7: three panels, each plotting $w_\beta$ and $w_{\max}$ against $\beta$.]

Fig. 7. Illustration of the Coulomb correlation density $w_\beta$ (red lines) and its corresponding upper bound $w_{\max}$ (blue lines) as a function of $\beta > 0$ at $\mu = 0.2$, $\gamma = 2.6$, for $\lambda = 1.305 > \mu$ (left figure, $d_\beta < 1$), $\lambda = 0.2 = \mu$ (two right figures, $d_\beta = 1$), and from the left to the right, with $h = 0$ ($m_\beta = 0$), and $h = 0.3, 0.35$ (where $m_\beta > 0$). The dashed green lines indicate that $d_\infty/2 = 0.5$ in the three cases. In the figure on the left there is no superconducting phase, in contrast to the right figures where we see a phase transition for $\beta > 2.3$ (second order) or $2.6$ (first order). (Color online.)

zero-temperature is equal to
\[ w_\infty := \lim_{\beta\to\infty} w_\beta = \frac{1 + \mathrm{sgn}(\mu-\lambda)}{2\left(1 + \delta_{|\mu-\lambda|,\lambda+|h|}(1+\delta_{h,0})\right)}\,\chi_{[\lambda+|h|,\infty)}(|\mu-\lambda|) \]
for $\gamma < \Gamma_{|\mu-\lambda|,\lambda+|h|}$ whereas $w_\infty = d_\infty/2$ for $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}$, see Corollaries 3.1 and 3.2.

If $|\mu-\lambda| > \lambda+|h|$, the interpretation of this asymptotics is clear since either $d_\infty = 0$ for $\mu < \lambda$ or $d_\infty = 2$ for $\mu > \lambda$. The interesting phenomena occur when $|\mu-\lambda| < \lambda+|h|$. In this case, if there is no superconducting phase, i.e. $\gamma < \Gamma_{|\mu-\lambda|,\lambda+|h|}$, then $w_\beta$ converges towards $w_\infty = 0$ as $\beta \to \infty$. In particular, as explained above, if an electron is on a definite site, the probability to have a second electron with opposite spin at the same place goes to zero as $N \to \infty$ and $\beta \to \infty$. However, in the superconducting phase, i.e. for $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}$, the upper bound $w_{\max}$ (3.8) is asymptotically attained. Since $w_{\max} = d_\infty/2$ as $\beta \to \infty$, it means that 100% of electrons form Cooper pairs in the limit of zero temperature, which is in accordance with the fact that the magnetization density must disappear, i.e. $m_\infty = 0$, cf. Corollary 3.3.

As explained in Sec. 3.1, the highest Cooper pair condensate density is 1/4, which corresponds to an electron density $d_\infty = 1$. Actually, although all electrons form Cooper pairs at small temperatures, there are never 100% of electron pairs in the condensate, see Fig. 8. In the special case where $d_\infty = 1$, only 50% of Cooper pairs are in the condensate. The same analysis can be done for hole pairs by replacing $a_x$ by $-b_x^*$ in the definition of extensive quantities. Define the electron and hole pair condensate fractions respectively by $v_\beta := 2r_\beta/d_\beta$ and $\hat{v}_\beta := 2\hat{r}_\beta/\hat{d}_\beta$, where $\hat{r}_\beta$ and $\hat{d}_\beta$ are the hole condensate density and the hole density respectively. Because of the electron-hole symmetry, $\hat{r}_\beta = r_\beta$ and $\hat{d}_\beta = 2 - d_\beta$. In particular, when $r_\beta > 0$, we asymptotically get that $\hat{v}_\beta + v_\beta \to 1$ as $\beta \to \infty$. Hence, in the superconducting phase, an electron pair condensate fraction below 50% means in fact that there are more than 50% of hole pair condensate and conversely at low temperatures.

For more details concerning ground states in relation with this phenomenon, see discussions around (6.60) in Sec. 6.2.
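The zero-temperature statements of this subsection can be evaluated with a short sketch. The first function implements the formula of Corollary 3.4 in the non-superconducting case; the second computes the electron and hole pair condensate fractions in the superconducting phase. Two points are assumptions of this sketch: the Kronecker deltas are tested with a small tolerance `tol`, and the condensate density is taken as $r_\infty = d_\infty(2-d_\infty)/4$, the value stated in Corollary 4.1 for a fixed density. All function names are hypothetical.

```python
import math


def sgn(t):
    """Sign function with sgn(0) = 0."""
    return (t > 0) - (t < 0)


def w_infinity_normal(mu, lam, h, tol=1e-12):
    """Corollary 3.4 for gamma below the threshold (no superconductivity)."""
    x = abs(mu - lam)
    y = lam + abs(h)
    on_boundary = math.isclose(x, y, rel_tol=0.0, abs_tol=tol)
    if x < y and not on_boundary:
        return 0.0  # characteristic function of [y, infinity) vanishes
    delta_xy = 1 if on_boundary else 0
    delta_h0 = 1 if h == 0 else 0
    return (1 + sgn(mu - lam)) / (2.0 * (1 + delta_xy * (1 + delta_h0)))


def pair_fractions(d):
    """Zero-temperature electron and hole pair condensate fractions.

    Assumes r = d(2 - d)/4 (Corollary 4.1) and the electron-hole symmetry
    r_hole = r, d_hole = 2 - d.
    """
    if not 0.0 < d < 2.0:
        raise ValueError("the electron density must lie in (0, 2)")
    r = d * (2.0 - d) / 4.0
    v_electron = 2.0 * r / d
    v_hole = 2.0 * r / (2.0 - d)
    return v_electron, v_hole
```

Under these assumptions the two fractions reduce to $1 - d_\infty/2$ and $d_\infty/2$, so they always sum to one, and both equal 50% at $d_\infty = 1$.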

[Figure 8: three panels against $\mu$. Left and right: percentage of Cooper pair condensate; center: $d_\beta$.]

Fig. 8. The fraction of electron pairs in the condensate is given in the right and left figures as a function of $\mu$. In the figure on the left, $\lambda = h = 0$, with inverse temperatures $\beta = 2.45$ (orange line), $3.45$ (red line) and $30$ (blue line). In the figure on the right, $\lambda = 0.575$ and $h = 0.1$ with $\beta = 5$ (orange line), $7$ (red line) and $30$ (blue line). The figure in the center illustrates the electron density $d_\beta$ also as a function of $\mu$ at $\beta = 30$ (low temperature regime) for $\lambda = h = 0$ (red line) and for $\lambda = 0.575$ and $h = 0.1$ (green line). In all figures, $\gamma = 2.6$. (Color online.)

3.5. Superconductor-Mott insulator phase transition

By Corollary 3.2, if $\lambda > 0$ and the system is not in the superconducting phase (i.e. if $r_\beta = 0$), then the electron density converges to either 0, 1 or 2 as $\beta \to \infty$ since
\[ d_\infty = 1 + \mathrm{sgn}(\mu-\lambda). \tag{3.9} \]

We define the phase where the system does not form a pair condensate and the electron density is around 1 as a Mott insulator phase. More precisely, we say that the system forms a Mott insulator if, for some $\epsilon < 1$, some $0 < \beta_0 < \infty$, some $\mu_0 \in \mathbb{R}$ and some $\delta\mu > 0$, the electron density
\[ d_\beta \in (1-\epsilon, 1+\epsilon) \quad\text{and}\quad r_\beta = 0 \quad\text{for all } (\beta,\mu) \in (\beta_0,\infty)\times(\mu_0-\delta\mu,\mu_0+\delta\mu). \]

As discussed in Sec. 3.4, observe that we have, in this phase, exactly one electron (or hole) localized in each site at the low temperature limit since $d_\beta \to 1$ and $w_\beta \to 0$ as $\beta \to \infty$.

To extract the whole region of parameters where such a thermodynamic phase takes place, a preliminary analysis of the function $\Gamma_{x,y}$ defined in Corollary 3.1 is first required. Observe that $\Gamma_{0,y} > 0$ if and only if $y > 0$. Consequently, for any real numbers $\lambda$ and $h$ such that $\lambda+|h| \leq 0$ we have $\Gamma_{0,\lambda+|h|} = 0$. However, if $\lambda+|h| > 0$ then $\Gamma_{0,\lambda+|h|} > 0$. Meanwhile, at fixed $y > 0$, the continuous function $\Gamma_{x,y}$ of $x \geq 0$ is convex with minimum for $x = y$, i.e.
\[ \inf_{x\geq 0}\{\Gamma_{x,y}\} = \Gamma_{y,y} = 2y > 0. \tag{3.10} \]
In particular, $\Gamma_{x,y}$ is strictly decreasing as a function of $x \in [0,y]$ and strictly increasing for $x \geq y$. Now, by combining Corollaries 3.1–3.4, we are in position to extract the set of parameters corresponding to insulating or superconducting phases:

(1) For any $\gamma > 0$ and $\mu, \lambda \in \mathbb{R}$ such that
\[ |\mu-\lambda| > \max\{\gamma/2, \lambda+|h|\}, \]


observe first that there is no superconductivity ($r_\infty = 0$), either no electrons or no holes (see (3.9)) and, in any case, no magnetization since $m_\infty = 0$. It is a standard (non-ferromagnetic) insulator. The next step is now to analyze the thermodynamic behavior for
\[ |\mu-\lambda| < \max\{\gamma/2, \lambda+|h|\}, \tag{3.11} \]
which depends on the strength of $\gamma > 0$. From (2) to (4), we assume that (3.11) is satisfied.

(2) If the BCS coupling constant $\gamma$ satisfies $0 < \gamma \leq \Gamma_{\lambda+|h|,\lambda+|h|} = 2(\lambda+|h|)$, then from (3.10) combined with Corollary 3.1 there is no Cooper pair for any $\mu$ and any $\lambda$. In particular, under the condition (3.11) there is a perfect magnetization, i.e. $m_\infty = \mathrm{sgn}(h)$, and exactly one electron or one hole per site since $d_\infty = 1$ and $w_\infty = 0$. In other words, we obtain a ferromagnetic Mott insulator phase.

(3) Now, if $\gamma > 0$ becomes too strong, i.e. $\gamma > \Gamma_{0,\lambda+|h|} = 4(\lambda+|h|)$, then for any $\mu \in \mathbb{R}$ such that $|\mu-\lambda| < \gamma/2$ there are Cooper pairs because $r_\infty = r_{\max} > 0$, an electron density $d_\infty$ equal to (3.3) and no magnetization ($m_\infty = 0$). In this case, observe that all quantities are continuous at $|\mu-\lambda| = \gamma/2$. This is a superconducting phase.

(4) The superconducting-Mott insulator phase transition only appears in the intermediary regime where
\[ \Gamma_{\lambda+|h|,\lambda+|h|} = 2(\lambda+|h|) < \gamma < \Gamma_{0,\lambda+|h|} = 4(\lambda+|h|), \tag{3.12} \]
cf. Fig. 9. Indeed, the equation $\Gamma_{x,\lambda+|h|} = \gamma$ has two solutions
\[ x_1 := \frac{\gamma^{1/2}}{2}\{4(\lambda+|h|)-\gamma\}^{1/2} \quad\text{and}\quad x_2 := \frac{\gamma}{2} > x_1. \]
In particular, for any $\mu \in \mathbb{R}$ such that $|\mu-\lambda| \in (x_1, \gamma/2)$, the BCS coupling constant $\gamma$ is strong enough to imply the superconductivity ($r_\infty = r_{\max} > 0$), with an electron density $d_\infty$ equal to (3.3) and no magnetization ($m_\infty = 0$). We are in the superconducting phase. However, for any $\mu \in \mathbb{R}$ such that $|\mu-\lambda| < x_1$, the BCS coupling constant $\gamma$ becomes too weak and there is no superconductivity ($r_\infty = 0$), exactly one electron per site, i.e. $d_\infty = 1$ and $w_\infty = 0$, and a pure magnetization if $h \neq 0$, i.e. $m_\infty = \mathrm{sgn}(h)$. In this regime, one gets a ferromagnetic Mott insulator phase. All quantities are continuous at $|\mu-\lambda| = \gamma/2$ but not for $|\mu-\lambda| = x_1$. In other words, we get a superconductor-Mott insulator phase transition by tuning the chemical potential $\mu$. An illustration of this phase transition is given in Fig. 10, see also Fig. 8.
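The case distinction (1)–(4) can be condensed into a small classifier of the zero-temperature phase. This is only a sketch under a stated assumption: $\Gamma_{x,y}$ is defined in Corollary 3.1, which is not reproduced in this section, so the closed form used in `gamma_threshold` is reconstructed here from the values $\Gamma_{0,y} = 4y$, $\Gamma_{y,y} = 2y$ and from the two roots $x_1$, $x_2$ given in item (4). The function names and the returned labels are hypothetical.

```python
import math


def gamma_threshold(x, y):
    """Reconstructed Gamma_{x,y} (assumption, see the lead-in).

    For 0 <= x < y it inverts x1, giving 2(y + sqrt(y^2 - x^2));
    otherwise it inverts x2 = gamma/2, giving 2x.
    """
    if x < 0:
        raise ValueError("x = |mu - lam| must be non-negative")
    if y > 0 and x < y:
        return 2.0 * (y + math.sqrt(y * y - x * x))
    return 2.0 * x


def lower_critical_offset(lam, gamma, h):
    """x1 from item (4); only defined in the intermediate regime (3.12)."""
    y = lam + abs(h)
    if not 2.0 * y < gamma < 4.0 * y:
        raise ValueError("requires 2(lam + |h|) < gamma < 4(lam + |h|)")
    return 0.5 * math.sqrt(gamma * (4.0 * y - gamma))


def zero_temperature_phase(mu, lam, gamma, h, tol=1e-12):
    """Label the zero-temperature phase following items (1)-(4)."""
    x = abs(mu - lam)
    y = lam + abs(h)
    threshold = gamma_threshold(x, y)
    if math.isclose(gamma, threshold, rel_tol=0.0, abs_tol=tol):
        return "critical"
    if gamma > threshold:
        return "superconducting"
    if math.isclose(x, y, rel_tol=0.0, abs_tol=tol):
        return "critical"
    if x > y:
        return "standard insulator"
    return "ferromagnetic Mott insulator" if h != 0 else "Mott insulator"
```

With the parameters of Fig. 10 ($\lambda = 0.575$, $\gamma = 2.6$, $h = 0.1$), the sketch places the transition at $|\mu-\lambda| = x_1 \approx 0.255$: chemical potentials closer to $\lambda$ are labeled as a ferromagnetic Mott insulator, those with $x_1 < |\mu-\lambda| < 1.3$ as superconducting.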

[Figure 9: two panels in the $(\gamma, \lambda)$ plane, with $\gamma$ on the horizontal axis and $\lambda$ on the vertical axis.]

Fig. 9. In both figures, the blue area represents the domain of $(\lambda, \gamma)$, where there is a superconducting phase at zero temperature for $\mu = 1$ and $h = 0$. The two increasing straight lines (green and brown) are $\gamma = 4\lambda$ and $\gamma = 2\lambda$ for $\gamma \geq 1$. In particular, between these two lines ($2\lambda < \gamma < 4\lambda$), there is a superconducting-Mott insulator phase transition by tuning $\mu$. (Color online.)

[Figure 10: three panels against $\mu$. Left and center: $d_\beta$, $r_\beta$, $m_\beta$; right: $\theta_c$.]

Fig. 10. Here $\lambda = 0.575$, $\gamma = 2.6$, and $h = 0.1$. In the two figures on the left, we plot the electron density $d_\beta$ (blue line), the Cooper pair condensate density $r_\beta$ (red line) and the magnetization density $m_\beta$ (green line) as functions of $\mu$ for $\beta = 7$ (left figure) or $30$ (low temperature regime, figure in the center). Observe the superconducting-Mott insulator phase transition which appears in both cases. In the right figure, we illustrate as a function of $\mu$ the corresponding critical temperature $\theta_c$. The blue line corresponds to a second order phase transition, whereas the red dashed line represents the domain of $\mu$ with first order phase transition. The black dashed line is the chemical potential $\mu = \lambda$ corresponding to an electron density per site equal to 1. (Color online.)

3.6. Mean-energy per site and the specific heat

To conclude, low-$T_c$ superconductors and high-$T_c$ superconductors differ by the behavior of their specific heat. The former show a discontinuity of the specific heat at the critical point whereas the specific heat of high-$T_c$ superconductors is continuous. It is therefore interesting to give now the mean energy per site in the thermodynamic limit in order to compute next the specific heat.

Theorem 3.7 (Mean-Energy per Site). For any $\beta, \gamma > 0$ and real numbers $\mu, \lambda, h$ away from any critical point, the (infinite volume) mean energy per site is equal to
\[ \lim_{N\to\infty}\{N^{-1}\omega_N(H_N)\} = \epsilon_\beta := -\mu d_\beta - h m_\beta + 2\lambda w_\beta - \gamma r_\beta, \]
see Theorems 3.1–3.6 and Fig. 11.

[Figure 11: three panels. Left and center: $\epsilon_\beta$ against $\beta$; right: $\epsilon_\beta$ as a surface over $(\beta, h)$.]

Fig. 11. In the two figures on the left, we give the mean energy per site $\epsilon_\beta$ as a function of $\beta$ at $h = 0$ for $\lambda = 0$ (figure on the left, second order BCS phase transition) or $\lambda = 0.575$ (figure in the center, first order phase transition). The dashed line in both figures is the mean energy per site with zero Cooper pair condensate density. In the right figure, $\epsilon_\beta$ is given as a function of $\beta$ and $h$ at $\lambda = 0.575$. The color from red to blue reflects the decrease of the temperature and the plateau corresponds to the superconducting phase. In all figures, $\mu = 1$ and $\gamma = 2.6$. (Color online.)

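At zero temperature, Corollaries 3.1–3.4 imply an explicit computation of the mean energy per site:

Corollary 3.5 (Mean-Energy per Site at Zero-Temperature). The (infinite volume) mean energy per site $\epsilon_\infty = \epsilon_\infty(\mu,\lambda,\gamma,h)$ at zero-temperature is equal to
\[ \epsilon_\infty := \lim_{\beta\to\infty}\epsilon_\beta = -\mu + \frac{\lambda - |\lambda-\mu|}{1+\delta_{|\mu-\lambda|,\lambda+|h|}(1+\delta_{h,0})}\,\chi_{[\lambda+|h|,\infty)}(|\mu-\lambda|) - \frac{|h|}{1+\delta_{|\mu-\lambda|,\lambda+|h|}}\,\chi_{[0,\lambda+|h|]}(|\mu-\lambda|), \]
for $\gamma < \Gamma_{|\mu-\lambda|,\lambda+|h|}$ whereas for $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}$
\[ \epsilon_\infty := \lim_{\beta\to\infty}\epsilon_\beta = -\frac{\gamma}{4} + (\lambda-\mu)\left(1+\gamma^{-1}(\mu-\lambda)\right), \]
cf. Corollary 3.1.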
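The superconducting branch of Corollary 3.5 can be cross-checked numerically against Theorem 3.7. The sketch below inserts the zero-temperature densities of the superconducting phase into $-\mu d - hm + 2\lambda w - \gamma r$ and compares the result with the closed form. The densities are assumptions collected from other parts of the text: $d_\infty = 1 + 2(\mu-\lambda)/\gamma$ (obtained by inverting (4.3)), $r_\infty = d_\infty(2-d_\infty)/4$ (Corollary 4.1), $w_\infty = d_\infty/2$ and $m_\infty = 0$. The function `critical_field` is derived here by equating the closed form with $-\mu-|h|$; it is assumed, not checked, to coincide with (3.5), which lies outside this section, and it is only meaningful for $|\mu-\lambda| < \lambda+|h|$.

```python
def mean_energy(mu, lam, gamma, h, d, m, w, r):
    """Theorem 3.7: -mu*d - h*m + 2*lam*w - gamma*r."""
    return -mu * d - h * m + 2.0 * lam * w - gamma * r


def superconducting_densities(mu, lam, gamma):
    """Zero-temperature (d, m, w, r) in the superconducting phase (assumptions above)."""
    d = 1.0 + 2.0 * (mu - lam) / gamma
    if not 0.0 < d < 2.0:
        raise ValueError("requires |mu - lam| < gamma / 2")
    return d, 0.0, d / 2.0, d * (2.0 - d) / 4.0


def energy_sc_closed_form(mu, lam, gamma):
    """Corollary 3.5 for gamma above the threshold."""
    return -gamma / 4.0 + (lam - mu) * (1.0 + (mu - lam) / gamma)


def critical_field(mu, lam, gamma):
    """|h| at which the superconducting energy equals -mu - |h| (derived here)."""
    return gamma / 4.0 - lam + (mu - lam) ** 2 / gamma
```

For $\mu = 1$, $\lambda = 0.575$ and $\gamma = 2.6$ both energy expressions agree, and the derived field is about $0.144$, so a field $h = 0.1$ leaves these parameters on the superconducting side.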

Note that the critical magnetic field $h_\infty^{(c)}$ (3.5) has a direct interpretation in terms of the zero-temperature mean energy per site $\epsilon_\infty$. Indeed, if $|\mu-\lambda| < \lambda+|h|$, i.e. $d_\infty \notin \{0,2\}$, by equating $\epsilon_\infty$ in the superconducting phase with the mean energy $\epsilon_\infty = -\mu-|h|$ in the non-superconducting (ferromagnetic) state, we directly get that the magnetic field should be equal to $|h| = h_\infty^{(c)}$ (3.5). In other words, the critical magnetic field $h_\infty^{(c)}$ corresponds to the point where the mean energies at zero temperature in both cases are equal to each other, as it should be. Note that this phenomenon is not true at non-zero temperature since the mean energy per site can be discontinuous as a function of $h$ (even if $\lambda = 0$), see Fig. 11.

Now, the specific heat at finite volume equals
\[ c_{N,\beta} := -\beta^2\partial_\beta\{N^{-1}\omega_N(H_N)\} = N^{-1}\beta^2\omega_N([H_N-\omega_N(H_N)]^2). \tag{3.13} \]
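The second equality in (3.13) is the usual identity between the temperature derivative of the mean energy and the energy variance in a Gibbs state. A minimal sketch can confirm it numerically on a finite spectrum, comparing a central finite difference with the variance. The four-level spectrum suggested in the usage note is only an illustrative assumption (a single site without the BCS term), not the Hamiltonian $H_N$ of the paper.

```python
import math


def gibbs_mean_and_variance(energies, beta):
    """Mean and variance of the energy for Gibbs weights exp(-beta * E)."""
    e_min = min(energies)  # shift the spectrum for numerical stability
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    z = sum(weights)
    mean = sum(w * e for w, e in zip(weights, energies)) / z
    second = sum(w * e * e for w, e in zip(weights, energies)) / z
    return mean, second - mean * mean


def specific_heat_from_variance(energies, beta):
    """Right-hand side of (3.13) for a single system: beta^2 * variance."""
    _, variance = gibbs_mean_and_variance(energies, beta)
    return beta * beta * variance


def specific_heat_from_derivative(energies, beta, step=1e-5):
    """Left-hand side of (3.13): -beta^2 d<E>/dbeta by central difference."""
    mean_plus, _ = gibbs_mean_and_variance(energies, beta + step)
    mean_minus, _ = gibbs_mean_and_variance(energies, beta - step)
    return -beta * beta * (mean_plus - mean_minus) / (2.0 * step)
```

For example, with the hypothetical levels `[0.0, -mu - h, -mu + h, -2*mu + 2*lam]` both functions return the same value up to the finite-difference error.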

However, its thermodynamic limit
\[ c_\beta := \lim_{N\to\infty} c_{N,\beta} = -\beta^2\partial_\beta\epsilon_\beta + C_\beta \tag{3.14} \]


cannot be easily computed because one cannot exchange the limit $N \to \infty$ and the derivative $\partial_\beta$, i.e. $C_\beta = C_\beta(\mu,\lambda,\gamma,h)$ may be non-zero. For instance, Griffiths arguments [29–31] (Appendix) would allow one to exchange any derivative of the pressure $p_N$ and the limit $N \to \infty$ by using the convexity of $p_N$. To compute (3.14) in this way, we would need to prove the (piece-wise) convexity of $\epsilon_{N,\beta} := N^{-1}\omega_N(H_N)$ as a function of $\beta > 0$. As suggested by Fig. 11, this convexity property might hold but it is not proven here. Notice however that if experimental measurements of the specific heat come from a discrete derivative of the mean energy per site $\epsilon_\beta$, then they amount to neglecting the term $C_\beta$. In this case, i.e. assuming $C_\beta = 0$, we find again the well-known BCS-type behavior of the specific heat in presence of a second order phase transition, see Fig. 12. In addition, if $C_\beta = 0$, then for any $\mu, \lambda, h$ and $\gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}$ (Corollary 3.1), we explicitly obtain via direct computations the well-known exponential decay of the specific heat at zero temperature for s-wave superconductors:
\[ c_\beta = \frac{1}{4}(2\lambda\gamma+\gamma^2-4\lambda^2)\beta^2 e^{-\beta\gamma} + o(\beta^2 e^{-\beta\gamma}) \quad\text{as } \beta\to\infty. \tag{3.15} \]
(Note that this asymptotics could give access to $\gamma$ and also $\lambda$, see discussions in Sec. 5.)

However, if a first order phase transition appears, then the (infinite volume) mean energy per site $\epsilon_\beta$ is discontinuous at the critical temperature $\theta_c$ (cf. Fig. 11) and the specific heat $c_{\theta_c^{-1}}$ is infinite. In Fig. 12, we give an illustration of the ratio $\Delta c/c_{\max}$ between the jump $\Delta c$ at $\theta = \theta_c$ and the maximum value $c_{\max}$ of $c_{\theta_c^{-1}}$. For most standard superconductors^o note that the measured values are between 0.6 and 0.7. Numerical computations suggest that this ratio $\Delta c/c_{\max}$ may always be bounded in our model by one as soon as a second order phase transition appears.

[Figure 12: three panels. Left and center: $c_\beta = c_{\theta^{-1}}$ against $\theta/\theta_c$; right: $\Delta c/c_{\max}$ against $\lambda$.]

Fig. 12. Here $\mu = 1$, $\gamma = 2.6$ and $h = 0$. Assuming $C_\beta = 0$, we give 3 plots of the specific heat $c_\beta$ as a function of the ratio $\theta/\theta_c$ between $\theta := \beta^{-1}$ and the critical temperature $\theta_c$ for $\lambda = 0$, $0.5$ (both left figure, respectively blue and red lines, second order phase transition), and $\lambda = 0.575$ (figure in the center, blue line, first order phase transition). The dashed red line in the figure in the center indicates what the specific heat at finite volume might be since $c_{\theta_c^{-1}} = +\infty$. The right figure is a plot as a function of $\lambda$ of the relative specific heat jump, i.e. the ratio $\Delta c/c_{\max}$ between the jump $\Delta c$ at $\theta = \theta_c$ and the maximum value $c_{\max}$ of $c_{\theta_c^{-1}}$ at the same point. The yellow colored area indicates that this ratio numerically computed is formally infinite due to a first order phase transition. (Color online.)

^o At least for the following elements: Hg, In, Nb, Pb, Sn, Ta, Tl, V.
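To illustrate how the asymptotics (3.15) could give access to $\gamma$ and $\lambda$, the sketch below keeps only the leading term and ignores the $o(\beta^2 e^{-\beta\gamma})$ correction, which is an assumption of the example. The slope of $\ln(c_\beta/\beta^2)$ in $\beta$ yields $\gamma$, and the prefactor $A = (2\lambda\gamma+\gamma^2-4\lambda^2)/4$ is then a quadratic equation for $\lambda$ with the two candidate roots $(\gamma \pm \sqrt{5\gamma^2-16A})/4$. All function names are hypothetical, and the input data are synthetic rather than measured.

```python
import math


def specific_heat_leading_term(beta, lam, gamma):
    """Leading term of (3.15), without the o(.) correction."""
    prefactor = (2.0 * lam * gamma + gamma ** 2 - 4.0 * lam ** 2) / 4.0
    return prefactor * beta ** 2 * math.exp(-beta * gamma)


def estimate_gamma(beta1, c1, beta2, c2):
    """gamma as minus the slope of ln(c / beta^2) between two points."""
    y1 = math.log(c1 / beta1 ** 2)
    y2 = math.log(c2 / beta2 ** 2)
    return -(y2 - y1) / (beta2 - beta1)


def estimate_prefactor(beta, c, gamma):
    """Prefactor A = c * exp(beta * gamma) / beta^2."""
    return c * math.exp(beta * gamma) / beta ** 2


def lambda_candidates(prefactor, gamma):
    """Both roots of 4*lam^2 - 2*gamma*lam + (4*A - gamma^2) = 0."""
    disc = 5.0 * gamma ** 2 - 16.0 * prefactor
    if disc < 0:
        raise ValueError("no real solution for lambda")
    root = math.sqrt(disc)
    return (gamma - root) / 4.0, (gamma + root) / 4.0
```

The quadratic generally has two roots, so the prefactor alone does not fix $\lambda$ uniquely; some additional information is needed to select the physical one.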


4. Phase Diagram at Fixed Electron Density per Site

In any finite volume, the electron density per site is strictly increasing as a function of the chemical potential $\mu$ by strict convexity of the pressure. Therefore, for any fixed electron density $\rho \in (0,2)$ there exists a unique $\mu_{N,\beta} = \mu_{N,\beta}(\rho,\lambda,\gamma,h)$ such that
\[ \rho = \frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\uparrow}+n_{x,\downarrow}), \tag{4.1} \]
where $\omega_N$ represents the (finite volume) grand-canonical Gibbs state (1.6) associated with $H_N$ and taken at inverse temperature $\beta$ and chemical potential $\mu = \mu_{N,\beta}$. The aim of this section is now to analyze the thermodynamic properties of the model for a fixed $\rho$ instead of a fixed chemical potential $\mu$. We start by investigating it away from any critical point.

4.1. Thermodynamics away from any critical point

In the thermodynamic limit and away from any critical point, the chemical potential $\mu_{N,\beta}$ converges to a solution $\mu_\beta = \mu_\beta(\rho,\lambda,\gamma,h)$ of the equation
\[ \rho = d_\beta(\mu,\lambda,\gamma,h), \tag{4.2} \]
see Theorem 3.4. For instance, if $\rho = 1$, the chemical potential $\mu_\beta$ is simply given by $\lambda$, i.e. $\mu_\beta(1,\lambda,\gamma,h) = \lambda$. At least away from any critical point, this chemical potential $\mu_\beta$ is always uniquely defined. Indeed, outside the superconducting phase (see Sec. 3.1), the electron density $d_\beta$ given by Theorem 3.4 is a strictly increasing continuous function of the chemical potential $\mu$ at fixed $\beta > 0$. In other words, for any fixed electron density $\rho \in (0,2)$, Eq. (4.2) has a unique solution $\mu_\beta$, i.e. the chemical potential $\mu_\beta$ is the inverse of the electron density $d_\beta$ taken as a function of $\mu \in \mathbb{R}$. On the other hand, inside the superconducting phase, from (3.3) the chemical potential $\mu_\beta$ is also unique and equals
\[ \mu_\beta = \frac{\gamma}{2}(\rho-1) + \lambda, \tag{4.3} \]
see Figs. 5 and 10. In particular, $\mu_\beta$ does not depend on $h$ or $\beta$ as soon as $r_\beta > 0$. The gap equation (2.5) then equals
\[ \tanh(\beta\gamma g_r) = 2g_r\left(1 + \frac{e^{\lambda\beta}\cosh(\beta h)}{\cosh(\beta\gamma g_r)}\right), \quad\text{with } g_r := \frac{1}{2}\{(\rho-1)^2+4r\}^{1/2}, \]
and $0 \leq r_\beta \leq \max\{0, \rho(2-\rho)/4\}$, for any fixed electron density $\rho \in (0,2)$. Hence, the thermodynamic behavior of the strong coupling BCS-Hubbard model $H_N$ is simply given for any $\rho \in (0,2)$, away from any critical point, by setting $\mu = \mu_\beta$

[Figure 13: two panels plotting $r_\beta$ against $\beta$.]

Fig. 13. Illustrations of the Cooper pair condensate density $r_\beta$ as a function of the inverse temperature $\beta$ for $\gamma = 2.6$, $h = 0$, and densities $\rho = 1$, $1.7$ (respectively left and right figures), with $\lambda = 0$ (blue line), $0.5$ (red line), $0.75$ (green line), and $1$ (orange line). The dashed line indicates the value of $r_\infty$. (Color online.)

in Sec. 3. In particular, the superconducting phase can appear by tuning each parameter: the BCS coupling constant $\gamma$ (see (2.6)), the inverse temperature $\beta > 0$ (see Corollary 3.1), the coupling constant $\lambda$, the magnetic field $h$ (see Sec. 3.3), the chemical potential $\mu$ or the electron density $\rho$ (see Sec. 3.5). Therefore, to explain the phase diagram at fixed electron density, it is sufficient to give the behavior of the Cooper pair condensate density $r_\beta$ as a function of $\rho \in (0,2)$. Everything can be easily performed via numerical methods, see Fig. 13. We restrict our rigorous analysis to the zero-temperature limit of $r_\beta$, which is a straightforward consequence of Corollary 3.1 and (4.3).

Corollary 4.1 (Zero-Temperature Cooper Pair Condensate Density). At zero-temperature, fixed electron density $\rho \in (0,2)$ and $\lambda, h \in \mathbb{R}$, the Cooper pair condensate density $r_\beta$ converges as $\beta \to \infty$ towards $r_\infty = \rho(2-\rho)/4$ when $\gamma > \max\{\tilde{\Gamma}_{\rho,\lambda+|h|}, 0\}$. Here
\[ \tilde{\Gamma}_{x,y} := \frac{4y}{x(x-2)+2}\,\chi_{[0,\infty)}(y) \]
is a function defined for any $x, y \in \mathbb{R}$.

Remark 4.1. The case $0 < \gamma < \tilde{\Gamma}_{\rho,\lambda+|h|}$ is more subtle than its analogue with a fixed chemical potential $\mu$, because phase mixtures can take place. See Sec. 4.2.

As explained above, as soon as $\gamma > \tilde{\Gamma}_{\rho,\lambda+|h|}$ we can extract from this corollary all the zero-temperature thermodynamics of the strong coupling BCS-Hubbard model by using Corollaries 3.1–3.4. If $\lambda+|h| > 0$ and $\gamma$ satisfy the inequalities
\[ \gamma > \min_{\rho\in(0,2)}\{\tilde{\Gamma}_{\rho,\lambda+|h|}\} = \tilde{\Gamma}_{0,\lambda+|h|} = \tilde{\Gamma}_{2,\lambda+|h|} = 2(\lambda+|h|) \]
and
\[ \gamma < \max_{\rho\in(0,2)}\{\tilde{\Gamma}_{\rho,\lambda+|h|}\} = \tilde{\Gamma}_{1,\lambda+|h|} = 4(\lambda+|h|), \]


it is also clear that the superconductor-Mott insulator phase transition appears by tuning the electron density $\rho$ in the same way as described in Sec. 3.5 for $\mu$. See Fig. 10. In this case however, we recommend Sec. 4.2 for more details because of the subtlety mentioned in Remark 4.1. See Figs. 15 and 16 below.

From (4.3) combined with Corollary 4.1, note that the asymptotics (3.15) of the specific heat at zero temperature is still valid at fixed electron density $\rho$ as soon as $\gamma > \max\{\tilde{\Gamma}_{\rho,\lambda+|h|}, 0\}$. Meanwhile, from Corollary 4.1 the zero-temperature Cooper pair condensate density $r_\infty$ does not depend on $\lambda$, $\gamma$, or $h$, as soon as $\gamma > \tilde{\Gamma}_{\rho,\lambda+|h|}$ is satisfied. Indeed, the chemical potential $\mu_\beta$ in the case where $r_\beta > 0$ is renormalized, cf. (4.3). In other words, at zero temperature, the thermodynamic behavior of the strong coupling BCS-Hubbard model for $\gamma > \tilde{\Gamma}_{\rho,\lambda+|h|}$ is equal to the well-known behavior of the BCS theory in the strong coupling approximation ($\lambda = h = 0$). This phenomenon is also seen by using renormalization methods where it is believed that the Coulomb interaction simply modifies the mass of electrons by creating quasi-particles (which however do not exist in our model).

4.2. Coexistence of ferromagnetic and superconducting phases

Observe that the electron density $d_\beta$ given by Theorem 3.4 can have discontinuities as a function of the chemical potential $\mu$. This phenomenon appears at the superconductor-Mott insulator phase transition, see Sec. 3.5 and Fig. 10. Because of electron-hole symmetry (Sec. 3.2), without loss of generality we can restrict our study to the case where $d_\beta \in [0,1]$, i.e. $\rho \in [0,1]$ and $\mu_\beta \leq \lambda$. In this regime, the electron density $d_\beta$ has, at most, one discontinuity point at the so-called critical chemical potential $\mu_\beta^{(c)} \leq \lambda$. In particular, there are two critical electron densities
\[ d_\beta^{\pm} := d_\beta(\mu_\beta^{(c)} \pm 0, \lambda, \gamma, h) \quad\text{with } d_\beta^{+} > d_\beta^{-}. \]

Similarly, we can also define two critical Cooper pair condensate densities $r_\beta^{\pm}$, two critical magnetization densities^p $m_\beta^{\pm}$ and two critical Coulomb correlation densities $w_\beta^{\pm}$. Of course, since $r_\beta^{+} > r_\beta^{-} = 0$, we are here on a critical point, i.e.
\[ (\beta, \mu_\beta^{(c)}, \lambda, \gamma, h) \in \partial S \]
(see (2.7)), with $\beta, \gamma > 0$ and $\lambda, h \in \mathbb{R}$ such that this critical chemical potential $\mu_\beta^{(c)} = \mu_\beta^{(c)}(\lambda,\gamma,h)$ exists.

The thermodynamics of the model for $\rho \notin [d_\beta^{-}, d_\beta^{+}]$ is already explained in Sec. 4.1 because the solution $r_\beta$ of (2.4) is unique at $\mu = \mu_\beta$. The chemical potential $\mu_{N,\beta}$ converges to $\mu_\beta = \mu_\beta^{(c)}$, if $\rho \in [d_\beta^{-}, d_\beta^{+}]$. In this case the variational problem (2.4) has exactly two maximizers $r_\beta^{\pm}$. The thermodynamic behavior of the system

^p If $h = 0$, then $m_\beta^{\pm} = 0$.


in this regime is not, a priori, clear except from the obvious fact that
\[ \lim_{N\to\infty}\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\uparrow}+n_{x,\downarrow}) = \rho \]
per definition. In particular, it cannot be deduced from the above results. We handle this situation within a much more general framework in Theorem 6.5. As a consequence of this study (see discussions after Theorem 6.5), all the extensive quantities can be obtained in the thermodynamic limit:

Theorem 4.1 (Densities in Coexistent Phases). Take $\beta, \gamma > 0$ and real numbers $\lambda, h$ in the domain of definition of the critical chemical potential $\mu_\beta^{(c)}$. For any $\rho \in [d_\beta^{-}, d_\beta^{+}]$, all densities are uniquely defined:

(i) The Cooper pair condensate density equals
\[ \lim_{N\to\infty}\left\{\frac{1}{N^2}\sum_{x,y\in\Lambda_N}\omega_N(a^*_{x,\uparrow}a^*_{x,\downarrow}a_{y,\downarrow}a_{y,\uparrow})\right\} = \tau_\rho\, r_\beta^{+}, \quad\text{with } \tau_\rho := \frac{\rho - d_\beta^{-}}{d_\beta^{+} - d_\beta^{-}} \in [0,1]. \]

(ii) The magnetization density equals
\[ \lim_{N\to\infty}\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\uparrow}-n_{x,\downarrow}) = (1-\tau_\rho)m_\beta^{-} + \tau_\rho m_\beta^{+}. \]

(iii) The Coulomb correlation density equals
\[ \lim_{N\to\infty}\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\uparrow}n_{x,\downarrow}) = (1-\tau_\rho)w_\beta^{-} + \tau_\rho w_\beta^{+}. \]

(iv) The mean energy per site equals
\[ \lim_{N\to\infty}\{N^{-1}\omega_N(H_N)\} = (1-\tau_\rho)\epsilon_\beta^{-} + \tau_\rho\epsilon_\beta^{+}, \]
with $\epsilon_\beta^{\pm} := -\mu_\beta^{(c)}\rho - h m_\beta^{\pm} + 2\lambda w_\beta^{\pm} - \gamma r_\beta^{\pm}$.
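The convex interpolation of Theorem 4.1 is straightforward to evaluate once the critical densities are known. The following sketch takes them as inputs and returns the mixing fraction $\tau_\rho$ together with the interpolated densities of items (i)–(iii). The function names and the dictionary keys are hypothetical, and the critical densities themselves are not computed here: they would have to come from a separate solution of the variational problem (2.4).

```python
def mixing_fraction(rho, d_minus, d_plus):
    """tau_rho = (rho - d_minus) / (d_plus - d_minus), required to lie in [0, 1]."""
    if not d_minus < d_plus:
        raise ValueError("requires d_minus < d_plus")
    if not d_minus <= rho <= d_plus:
        raise ValueError("rho must lie in [d_minus, d_plus]")
    return (rho - d_minus) / (d_plus - d_minus)


def coexistence_densities(rho, d_minus, d_plus, r_plus,
                          m_minus, m_plus, w_minus, w_plus):
    """Densities of Theorem 4.1 (i)-(iii); r_minus = 0 is built in."""
    tau = mixing_fraction(rho, d_minus, d_plus)
    return {
        "tau": tau,
        "r": tau * r_plus,
        "m": (1.0 - tau) * m_minus + tau * m_plus,
        "w": (1.0 - tau) * w_minus + tau * w_plus,
    }
```

At the two ends of the interval the interpolation reproduces the pure phases, and in between both $r$ and $m$ can be non-zero at the same time when $h \neq 0$, which is the coexistence discussed in this subsection.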

As a consequence of this theorem, as soon as the magnetic field $h \neq 0$, there is a coexistence of ferromagnetic and superconducting phases at low temperatures for $\rho \in (d_\beta^{-}, d_\beta^{+})$. In other words, the Meißner effect is not valid in this interval of electron densities. An illustration of this is given in Fig. 14. Such a phenomenon was also observed in experiments and from our results, it should occur rather near half-filling (but not exactly at half-filling) and at strong repulsion $\lambda > 0$. Additionally, observe that this coexistence of thermodynamic phases can also appear at the critical magnetic field $h_\beta^{(c)}$ (see Sec. 3.3).

Remark 4.2. Coexistence of ferromagnetic and superconducting phases has already been rigorously investigated, see, e.g., [16, 17]. For instance, in [16] such

[Figure 14: three panels. Left: $r_\beta$ against $\beta$; center: $m_\beta$ against $\beta$; right: $r_\beta$, $\mu_\beta$, $m_\beta$ against $\rho$.]

Fig. 14. In the two figures on the left, we give illustrations of the Cooper pair condensate density $r_\beta$ and the magnetization density $m_\beta$ as functions of the inverse temperature $\beta$ for densities $\rho = 0.6$ (orange line), $0.7$ (magenta line), $0.8$ (red line), $0.9$ (cyan line). In the figure on the right, we illustrate the coexistence of ferromagnetic and superconducting phases via graphs of $r_\beta$, $m_\beta$ and the chemical potential $\mu_\beta$ as functions of $\rho$ for $\beta = 30$ (low temperature regime). In all figures, $\lambda = 0.575$, $\gamma = 2.6$, and $h = 0.1$. (The small discontinuities around $\rho = 1$ in the right figure are numerical anomalies.) (Color online.)

phenomenon is shown to be impossible in the ground state of the Vonsovskii–Zener model applied to s-wave superconductors,^q whereas at finite temperature, numerical computations [17] suggest the contrary. This last analysis [17] is however not performed in detail.

The second interesting physical aspect related to densities $\rho$ between the critical densities $d_\beta^{-}$ and $d_\beta^{+}$ is a smoothing effect of the extensive quantities (magnetization density, Cooper pair condensate density, etc.) as functions of the inverse temperature $\beta$. Indeed, since the critical chemical potential $\mu_\beta^{(c)}$ only exists when a first order phase transition occurs, one could expect that the extensive quantities are not continuous as functions of $\beta > 0$. In fact, for $\rho \in (d_\beta^{-}, d_\beta^{+})$, there is a convex interpolation between quantities related to the solutions $r_\beta^{-} = 0$ and $r_\beta^{+} > 0$ of (2.4), see Theorem 4.1. The continuity of the extensive quantities then follows, see Fig. 14. It does not imply however, that all densities always become continuous at fixed $\rho$ as a function of the inverse temperature $\beta$. For instance, in Fig. 13, the green and orange graphs give two illustrations of a discontinuity of the order parameter $r_\beta$ at fixed electron density $\rho = 1$ where $\mu_\beta = \lambda$. To understand this first order phase transition, another extensive quantity should be additionally fixed, see discussions in Sec. 5 and Fig. 17.

Following these last results, we give now in Fig. 15 other plots of the critical temperature $\theta_c = \theta_c(\rho,\lambda,\gamma,h)$, which is defined as usual such that $r_\beta > 0$ if and only if $\beta > \theta_c^{-1}$. In this figure, observe that a positive $\lambda$, i.e. a one-site repulsion, can never increase the critical temperature if the electron density $\rho$ is fixed instead of the chemical potential $\mu$, compare with Fig. 2. We also show in Fig. 15 (right figure) that if the density of holes equals the density of electrons, i.e. $\rho = 1$, then we have a Mott insulator, whereas a small doping of electrons or holes implies either a superconducting phase (blue area) or a superconductor-Mott insulator

^q It is a combination of the BCS interaction (1.3) with the Zener s–d exchange interaction.

[Figure 15: three panels plotting $\theta_c$; left and center against $\lambda$, right against $\rho$.]

Fig. 15. Illustration, as a function of $\lambda$ (the two figures on the left) or $\rho$ (figure on the right), of the critical temperature $\theta_c = \theta_c(\rho,\lambda,\gamma,h)$ for $\gamma = 2.6$, $h = 0.1$ and with $\rho = 1$ (left figure), $\rho = 0.7$ (figure in the center) and $\lambda = 0.575$ (right figure). The blue and yellow areas correspond respectively to the superconducting and ferromagnetic-superconducting phases, whereas the red dashed line indicates the domain of $\lambda$ with a first order phase transition as a function of $\beta$ or the temperature $\theta := \beta^{-1}$ (it only exists in the left figure). The dashed green line (left figure) is the asymptote when $\lambda \to -\infty$. In the right figure, observe that there is no phase transition for $\rho = 1$. (Color online.)

[Figure 16: three panels. Left and center: $\epsilon_\beta$ against $\beta$; right: $c_\beta = c_{\theta^{-1}}$ against $\theta/\theta_c$.]

Fig. 16. In the two figures on the left, we give illustrations of the mean energy per site $\epsilon_\beta$ as a function of the inverse temperature $\beta$ for densities $\rho = 0.7$ (magenta line), $0.9$ (cyan line), $1$ (green line), $1.1$ (blue line) and $1.3$ (red line). For $\rho = 1$, there is no phase transition and for $\rho = 0.9$ or $1.1$ only a ferromagnetic-superconducting phase appears, whereas for $\rho = 0.7$ or $1.3$ this last phase is followed for larger $\beta$ by a superconducting phase. In the figure on the right, assuming $C_\beta = 0$, we give two plots of the specific heat $c_\beta$ as a function of the ratio $\theta/\theta_c$ between $\theta := \beta^{-1}$ and the critical temperature $\theta_c$ for densities $\rho = 0.7$ (magenta line) and $0.9$ (cyan line). In all figures, $\lambda = 0.575$, $\gamma = 2.6$, and $h = 0.1$. (Color online.)

(ferromagnetic) phase (yellow area) related to the superconductor-Mott insulator phase transition described in Sec. 3.5 and Fig. 10.

To conclude, Fig. 16 illustrates various thermodynamic features of the system at fixed $\rho$. First, as a function of $\beta > 0$, $\epsilon_\beta$ is continuously differentiable only for $\rho = 1$. In other words, there is no phase transition, in contrast to the cases with $\rho = 0.7, 0.9$ or $\rho = 1.1, 1.3$. This is the Mott insulator phase transition illustrated in Fig. 10. As in Fig. 10, we also observe the electron-hole symmetry implying that $\rho = 0.7$ and $\rho = 1.3$, or $\rho = 0.9$ and $\rho = 1.1$, have the same phase transitions at exactly the same critical points. As explained in Sec. 3.1, the mean energy per site $\epsilon_\beta$ for $\rho = 0.7, 1.3$, or $\rho = 0.9, 1.1$, differs by a constant, i.e. in absolute value by $|2\lambda - \mu_\beta|$. At high temperatures, i.e. when $\beta \to 0$, the function $\epsilon_\beta$ diverges to $\pm\infty$ if $\rho = 1 \mp \varepsilon$ with $\varepsilon \in (0,1)$ whereas it stays finite at $\rho = 1$. Indeed, when $\beta \to 0$ the electron density $d_\beta$ converges to 1 at fixed $\mu, \lambda, \gamma, h$, see Theorem 3.4 and Fig. 5. If $\rho = 1 \mp \varepsilon$, it follows that the chemical potential $\mu_\beta$ diverges to $\mp\infty$ as $\beta \to 0$,


implying that $\epsilon_\beta \to \pm\infty$. In other words, it is energetically unfavorable to fix an electron density ρ ≠ 1 at high temperatures. Finally, the specific heat $c_\beta$ has only one jump in the case of one phase transition and two jumps when there are two phase transitions, namely when both the superconductor-Mott insulator (ferromagnetic) phase and the purely superconducting phase appear.

5. Concluding Remarks

(1) First, it is important to note that two different physical behaviors can be extracted from the strong coupling BCS-Hubbard model $H_N$: a first one at fixed chemical potential µ and a second one at fixed electron density ρ ∈ (0, 2). This does not mean that the canonical and grand-canonical ensembles are not equivalent for this model. However, the influence of the direct interaction with coupling constant λ changes drastically from the case of fixed µ to the case of fixed ρ. For instance, by Corollary 4.1 (see also Fig. 15), any one-site repulsion between pairs of electrons is always unfavorable to the formation of Cooper pairs as soon as the electron density ρ is fixed. This property is, however, false at fixed chemical potential µ, see Fig. 2. In other words, fixing the electron density ρ is not equivalent$^{r}$ to fixing the chemical potential µ in the model. Physically, a fixed electron density can be modified by doping the superconductor. Changing the chemical potential may be more difficult. One naive proposition would be to impose an electric potential on a superconductor which is coupled to an additional conductor serving as a reservoir of electrons or holes at fixed chemical potential.

$^{r}$ “Equivalent” is not taken here in the sense of the equivalence of ensembles.

(2) A measurement of the asymptotics as β → ∞ of the specific heat $c_\beta$ (see (3.14) with $C_\beta = 0$) in a superconducting phase would determine, by using (3.15), first the parameter γ > 0 via the exponential decay and then the coupling constant λ. Next, the measurement of the critical magnetic field at very low temperature would allow one to obtain, by (3.5), the chemical potential µ and hence the electron density at zero temperature. Since the inverse temperature β as well as the magnetic field h can be measured directly, all parameters of the strong coupling BCS-Hubbard model $H_N$ (1.2) would then be determined experimentally. In particular, its thermodynamic behavior, explained in Secs. 2–4, could finally be compared with the real system. One could for instance check whether the critical temperature $\theta_c$ given by $H_N$ in the appropriate dimension corresponds to the one measured in the real superconductor. Such studies would highlight the thermodynamic impact of the kinetic energy.

(3) In Sec. 4, the electron density is fixed, but one could have fixed any of the other extensive quantities instead: the Cooper pair condensate density, the magnetization density, the Coulomb correlation density or the mean energy per site. For instance, if the magnetization density m ∈ R is fixed, by strict convexity of the pressure there is a


unique magnetic field $h_{N,\beta} = h_{N,\beta}(\mu, \lambda, \gamma, m)$ such that

$$ m = \frac{1}{N} \sum_{x \in \Lambda_N} \omega_N(n_{x,\uparrow} - n_{x,\downarrow}). $$

In the thermodynamic limit, $h_{N,\beta}$ then converges to $h_\beta$, the solution of the equation $m_\beta = m$ at fixed β, γ > 0 and µ, λ ∈ R. By using Theorem 6.5, we would obtain the thermodynamics of the system for any β, γ > 0 and µ, λ, m ∈ R. More generally, when one of the extensive quantities $r_\beta$, $d_\beta$, $m_\beta$, $w_\beta$, or $\epsilon_\beta$ is discontinuous at a critical point, the thermodynamic limit of the local Gibbs states $\omega_N$ can be uniquely determined by fixing the corresponding extensive quantity between its critical values. The other extensive quantities are determined in this case by an obvious transcription of Theorem 4.1 for the considered discontinuous quantity at the critical point. Observe, however, that $r_\beta$, $d_\beta$, $m_\beta$, $w_\beta$, and $\epsilon_\beta$ should be related, respectively, to the parameters γ, µ, h, λ and β. For instance, the existence of a magnetic field $h_{N,\beta}$ solving (4.1) at fixed ρ ∈ (0, 2) is not clear at finite volume. Figure 17 gives an example of an electron density always equal to 1 for µ = λ together with discontinuities of all other extensive quantities. In this example, for parameters allowing a first order phase transition, it is not sufficient to fix the electron density in order to get well-defined quantities in the thermodynamic limit. At the critical point we could for instance fix the magnetization density m ∈ R in the ferromagnetic case (h = 0.1) or, in any case, the Coulomb correlation density w ≥ 0, which determines a coupling constant $\lambda_{N,\beta}$ converging to $\lambda_\beta$, see the right illustration of Fig. 17 with the existence of a critical magnetic field and a critical coupling constant.

(4) To conclude, as explained in the introduction, for a suitable space of states it is possible to define a free energy density functional F (1.5) associated with the Hamiltonians $H_N$. The states minimizing this functional are equilibrium states and imply all the thermodynamics of the strong coupling BCS-Hubbard model discussed in Secs. 3 and 4. Indeed, the weak∗-limit $\omega_\infty$ of the local Gibbs state $\omega_N$ as N → ∞ exists and belongs to our set of equilibrium states for any β, γ > 0

[Figure 17: three panels. Left and center panels: $r_\beta$, $m_\beta$, $w_\beta$, $\epsilon_\beta$ versus β. Right panel: $m_{\beta_c}(h)$ and $w_{\beta_c}(\lambda)$ versus h, λ.]

Fig. 17. In the two figures on the left, we give illustrations of the Cooper pair condensate density $r_\beta$ (blue line), the magnetization density $m_\beta$ (green line), the Coulomb correlation density $w_\beta$ (red line), and the mean energy per site $\epsilon_\beta$ (orange line) as functions of the inverse temperature β for h = 0 (figure on the left) and h = 0.1 (figure in the center), with $\mu_\beta = \lambda = 0.375$, i.e. ρ = 1. In the figure on the right, we illustrate $m_{\beta_c}$ (green line) and $w_{\beta_c}$ (red line) respectively as functions of h with µ = λ = 0.375 and of λ with (µ, h) = (0.375, 0.1) at the critical inverse temperature $\beta_c := \theta_c^{-1} \approx 3.04$. (Color online.)


and µ, λ, h ∈ R, cf. Theorem 6.5. In Sec. 6.2, we prove in particular the following properties of equilibrium states:

(i) Any pure equilibrium state ω satisfies $\omega(a_{x,\downarrow} a_{x,\uparrow}) = r_\beta^{1/2} e^{i\phi}$ for some φ ∈ [0, 2π). In particular, if $r_\beta \neq 0$, such states are not U(1)-gauge invariant and show off-diagonal long-range order (ODLRO) [38], cf. Theorem 6.1, Theorem 6.3 and Corollary 6.1.

(ii) All densities are uniquely defined: the electron density of any equilibrium state ω is given by $\omega(n_{x,\uparrow} + n_{x,\downarrow}) = d_\beta$, its magnetization density by $\omega(n_{x,\uparrow} - n_{x,\downarrow}) = m_\beta$, and its Coulomb correlation density equals $\omega(n_{x,\uparrow} n_{x,\downarrow}) = w_\beta$, cf. Theorem 6.4.

(iii) The Cooper fields $\Phi_x := a^*_{x,\downarrow} a^*_{x,\uparrow} + a_{x,\uparrow} a_{x,\downarrow}$ and $\Psi_x := i(a^*_{x,\downarrow} a^*_{x,\uparrow} - a_{x,\uparrow} a_{x,\downarrow})$ for pure states become classical in the limit γβ → ∞, i.e. their fluctuations go to zero in this limit, cf. Theorem 6.6.

Any weak∗ limit point of equilibrium states with diverging inverse temperature is (by definition) a ground state. For γ > 0 and µ, λ, h ∈ R, most ground states inherit the properties (i)–(iii) of equilibrium states. In particular, within the GNS representation [32] of pure ground states, Cooper fields are exactly c-numbers, see Corollary 6.2. In this case, correlation functions can be computed explicitly at any order in the Cooper fields. Furthermore, notice that even in the case h = 0, where the Hamiltonian $H_N$ is spin invariant, there exist ground states breaking the spin SU(2)-symmetry. For more details, including a precise formulation of these results, we refer to Sec. 6, in particular Sec. 6.2.

6. Mathematical Foundations of the Thermodynamic Results

The aim of this section is to give detailed proofs of the thermodynamics of the strong coupling BCS-Hubbard model $H_N$ (1.2). The central result of this section is the thermodynamic limit of the pressure, i.e. the proof of Theorem 2.1. The main ingredient in this analysis is the celebrated Størmer Theorem [1], which we adapt here to the CAR algebra (see Lemma 6.8). Our approach follows the Petz–Raggio–Verbeure results in [19], but we would like to mention that the analysis of permutation invariant quantum systems in the thermodynamic limit (with Størmer's theorem as the background) has also been carried out for different classes of systems by other authors. See, e.g., [33, 39]. Finally, we introduce in Sec. 6.2 a notion of equilibrium and ground states via the usual variational principle for the free energy density. The thermodynamics of the strong coupling BCS-Hubbard model described in Secs. 3 and 4 is encoded in this notion, and the thermodynamic limits of local Gibbs states used above for simplicity are special cases of the equilibrium and ground states defined in Sec. 6.2.

Before we proceed, we first define some basic mathematical objects needed in our analysis. Let $I$ be the set of finite subsets of $\mathbb{Z}^{d \geq 1}$. For any Λ ∈ $I$ we then define $\mathcal{U}_\Lambda$ as the C∗-algebra generated by $\{a_{x,\uparrow}, a_{x,\downarrow}\}_{x \in \Lambda}$ and the identity. Choosing some


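fixed bijective map $\kappa : \mathbb{N} \to \mathbb{Z}^d$, $\mathbb{N} := \{1, 2, \ldots\}$, $\mathcal{U}_N$ denotes the local C∗-algebra $\mathcal{U}_{\{\kappa(1), \ldots, \kappa(N)\}}$ at fixed N ∈ N, whereas $\mathcal{U}$ is the full C∗-algebra, i.e. the closure of the union of all $\mathcal{U}_N$ for any integer N ≥ 1. Note that

$$ n_{\kappa(l),\uparrow} := a^*_{\kappa(l),\uparrow} a_{\kappa(l),\uparrow} \quad \text{and} \quad n_{\kappa(l),\downarrow} := a^*_{\kappa(l),\downarrow} a_{\kappa(l),\downarrow} $$

are the electron number operators on the site κ(l), respectively, with spin up ↑ and down ↓. To simplify the notation, as soon as a statement clearly concerns the one-site algebra $\mathcal{U}_1 = \mathcal{U}_{\{\kappa(1)\}}$, we replace $a_{\kappa(1),\uparrow}$, $a_{\kappa(1),\downarrow}$ and $n_{\kappa(1),\uparrow}$, $n_{\kappa(1),\downarrow}$, respectively, by $a_\uparrow$, $a_\downarrow$ and $n_\uparrow$, $n_\downarrow$, whereas any state on $\mathcal{U}_1$ is denoted by ζ and not by ω, which is by definition a state on more than one site (on $\mathcal{U}_\Lambda$, $\mathcal{U}_N$ or $\mathcal{U}$). Important one-site Gibbs states in our analysis are the states $\zeta_c$ associated for any c ∈ C with the Hamiltonian $H_1(c)$ (2.1) and defined by

$$ \zeta_c(A) := \frac{\mathrm{Trace}\big(A\, e^{\beta\{(\mu - h) n_\uparrow + (\mu + h) n_\downarrow + \gamma (c\, a^*_\downarrow a^*_\uparrow + \bar{c}\, a_\uparrow a_\downarrow) - 2\lambda n_\uparrow n_\downarrow\}}\big)}{\mathrm{Trace}\big(e^{\beta\{(\mu - h) n_\uparrow + (\mu + h) n_\downarrow + \gamma (c\, a^*_\downarrow a^*_\uparrow + \bar{c}\, a_\uparrow a_\downarrow) - 2\lambda n_\uparrow n_\downarrow\}}\big)}, \tag{6.1} $$

for any $A \in \mathcal{U}_1$. Finally, note that our notation for the “Trace” does not include the Hilbert space where it is evaluated. Using the isomorphisms $\mathcal{U}_\Lambda \simeq \mathcal{B}(\bigwedge \mathbb{C}^{\Lambda \times \{\uparrow,\downarrow\}})$ of C∗-algebras, the corresponding Hilbert space is deduced from the local algebra in which the operators involved in each statement live. We are now in a position to start the proof of Theorem 2.1. It is followed by a rigorous analysis of the corresponding equilibrium and ground states.

6.1. Thermodynamic limit of the pressure: Proof of Theorem 2.1

Since we have already shown the lower bound (2.2) in Sec. 2, to finish the proof of Theorem 2.1 it remains to obtain

$$ \limsup_{N \to \infty} \{p_N(\beta, \mu, \lambda, \gamma, h)\} \leq \sup_{c \in \mathbb{C}} \{-\gamma |c|^2 + p(c)\}. \tag{6.2} $$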
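To make the right-hand side of (6.2) concrete, the following minimal sketch evaluates the variational expression numerically. It assumes that p(c) denotes $\beta^{-1}$ times the logarithm of the trace in the denominator of (6.1), which is consistent with the c = 0 expression quoted later in (6.19). The eigenvalues used below come from diagonalizing the exponent of (6.1) in the occupation basis of one site; this closed form, the function names and the parameter values are illustrative assumptions and are not taken from the paper.

```python
import math

def one_site_pressure(c, beta, mu, lam, gamma, h):
    """Sketch of p(c) = (1/beta) * ln Trace exp(beta * M_c), M_c the exponent in (6.1).

    Assumption: in the basis {|0>, |up>, |down>, |up,down>} the exponent is block
    diagonal. The singly occupied states have eigenvalues mu - h and mu + h; the
    block on {|0>, |up,down>} has eigenvalues (mu - lam) +/- g.
    """
    g = math.sqrt((mu - lam) ** 2 + (gamma * abs(c)) ** 2)
    eigenvalues = [mu - h, mu + h, (mu - lam) + g, (mu - lam) - g]
    # log-sum-exp for numerical stability at large beta
    shift = max(beta * e for e in eigenvalues)
    return (shift + math.log(sum(math.exp(beta * e - shift) for e in eigenvalues))) / beta

def variational_pressure(beta, mu, lam, gamma, h, r_max=2.0, steps=4000):
    """Grid approximation of sup_c { -gamma |c|^2 + p(c) }.

    The expression depends on c only through r = |c|, so a one-dimensional
    grid over r in [0, r_max] is enough for a rough estimate.
    """
    best_value, best_r = -math.inf, 0.0
    for k in range(steps + 1):
        r = r_max * k / steps
        value = -gamma * r * r + one_site_pressure(r, beta, mu, lam, gamma, h)
        if value > best_value:
            best_value, best_r = value, r
    return best_value, best_r

if __name__ == "__main__":
    # Illustrative parameters only (gamma and mu = lam as in the figure captions).
    print(variational_pressure(beta=20.0, mu=0.375, lam=0.375, gamma=2.6, h=0.0))
```

A maximizer with |c| > 0 in this sketch corresponds to a nonzero pairing amplitude, whereas a maximizer at c = 0 reproduces the γ = 0 pressure.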

We split this proof into several lemmata. But first, we need some additional definitions, in particular that of the set of all S-invariant even states. Let S be the set of bijective maps from N to N which leave all but finitely many elements invariant. It is a group with respect to composition. The condition

$$ \eta_s : a_{\kappa(l),\#} \mapsto a_{\kappa(s(l)),\#}, \qquad s \in S, \quad l \in \mathbb{N}, \tag{6.3} $$

uniquely defines a group homomorphism $\eta : S \to \mathrm{Aut}(\mathcal{U})$, $s \mapsto \eta_s$. Here, # stands for a spin up ↑ or down ↓. Then, let

$$ E_{\mathcal{U}}^{S,+} := \{\omega \in E_{\mathcal{U}} : \omega \circ \eta_s = \omega \text{ for any } s \in S, \text{ and } \omega(a^*_{\kappa(l_1),\#} \cdots a^*_{\kappa(l_t),\#}\, a_{\kappa(m_1),\#} \cdots a_{\kappa(m_\tau),\#}) = 0 \text{ if } t + \tau \text{ is odd}\} $$

be the set of all S-invariant even states, where $E_{\mathcal{U}}$ is the set of all states on $\mathcal{U}$. The set $E_{\mathcal{U}}^{S,+}$ is weak∗-compact and convex. In particular, the set of extreme points of $E_{\mathcal{U}}^{S,+}$, denoted by $\mathcal{E}_{\mathcal{U}}^{S,+}$, is not empty.


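Remark 6.1. Any permutation invariant (p.i.) state on $\mathcal{U}$ is in fact automatically even, see, e.g., [25, Example 5.2.21]. We explicitly write the evenness of states in the definition of $E_{\mathcal{U}}^{S,+}$ because this property is essential in our arguments below.

Now, to fix the notation and for the reader's convenience, we collect well-known results about the so-called relative entropy, cf. [25, 40]. Let $\omega^{(1)}$ and $\omega^{(2)}$ be two states on the local algebra $\mathcal{U}_\Lambda$, with $\omega^{(1)}$ being faithful. Define the relative entropy$^{s}$

$$ S(\omega^{(1)} | \omega^{(2)}) := \mathrm{Trace}(D_{\omega^{(2)}} \ln D_{\omega^{(2)}}) - \mathrm{Trace}(D_{\omega^{(2)}} \ln D_{\omega^{(1)}}), $$

where $D_{\omega^{(j)}}$ is the density matrix associated with the state $\omega^{(j)}$, j = 1, 2. The relative entropy is super-additive: for any $\Lambda_1, \Lambda_2 \in I$, $\Lambda_1 \cap \Lambda_2 = \emptyset$, and for any even states $\omega^{(1)}$, $\omega^{(2)}$, $\omega^{(1,2)}$, respectively, on $\mathcal{U}_{\Lambda_1}$, $\mathcal{U}_{\Lambda_2}$ and $\mathcal{U}_{\Lambda_1 \cup \Lambda_2}$, with $\omega^{(1)}$ and $\omega^{(2)}$ faithful, we have

$$ S(\omega^{(1)} \otimes \omega^{(2)} \,|\, \omega^{(1,2)}) \geq S(\omega^{(1)} \,|\, \omega^{(1,2)}|_{\mathcal{U}_{\Lambda_1}}) + S(\omega^{(2)} \,|\, \omega^{(1,2)}|_{\mathcal{U}_{\Lambda_2}}). \tag{6.4} $$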
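The sign convention and the super-additivity (6.4) can be checked numerically in the simplest commuting situation. The sketch below is only an illustration under a strong assumption: all density matrices are diagonal in a common basis, so that states reduce to probability vectors and the relative entropy above becomes a classical Kullback–Leibler divergence of the second argument with respect to the first. The function names are hypothetical and the fermionic (even-state) structure of the paper is not modeled.

```python
import math
from itertools import product

def relative_entropy(ref, state):
    """S(ref | state) = sum_i state_i * (ln state_i - ln ref_i) for diagonal states.

    `ref` plays the role of the faithful state (all entries > 0).
    """
    total = 0.0
    for p_ref, p in zip(ref, state):
        if p > 0.0:  # convention 0 * ln 0 = 0
            total += p * (math.log(p) - math.log(p_ref))
    return total

def tensor(p1, p2):
    """Product state of two diagonal states, index (i, j) stored at i * len(p2) + j."""
    return [a * b for a, b in product(p1, p2)]

def marginals(joint, n1, n2):
    """Restrictions of a joint diagonal state on n1 x n2 outcomes to each factor."""
    m1 = [sum(joint[i * n2 + j] for j in range(n2)) for i in range(n1)]
    m2 = [sum(joint[i * n2 + j] for i in range(n1)) for j in range(n2)]
    return m1, m2

if __name__ == "__main__":
    ref1, ref2 = [0.5, 0.5], [0.3, 0.7]
    joint = [0.4, 0.1, 0.1, 0.4]  # a correlated two-site state
    m1, m2 = marginals(joint, 2, 2)
    lhs = relative_entropy(tensor(ref1, ref2), joint)
    rhs = relative_entropy(ref1, m1) + relative_entropy(ref2, m2)
    print(lhs, rhs, lhs >= rhs)
```

In this classical setting the difference between the two sides is the mutual information of the joint state, which is why equality holds for product states.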

For even states $\omega^{(1)}$ and $\omega^{(2)}$, respectively on $\mathcal{U}_{\Lambda_1}$ and $\mathcal{U}_{\Lambda_2}$ with $\Lambda_1 \cap \Lambda_2 = \emptyset$, the even state $\omega^{(1)} \otimes \omega^{(2)}$ is the unique extension of $\omega^{(1)}$ and $\omega^{(2)}$ on $\mathcal{U}_{\Lambda_1 \cup \Lambda_2}$ satisfying, for all $A \in \mathcal{U}_{\Lambda_1}$ and all $B \in \mathcal{U}_{\Lambda_2}$,

$$ \omega^{(1)} \otimes \omega^{(2)}(AB) = \omega^{(1)}(A)\, \omega^{(2)}(B). $$

The state $\omega^{(1)} \otimes \omega^{(2)}$ is called the product of $\omega^{(1)}$ and $\omega^{(2)}$. The product of even states is an associative operation. In particular, products of even states can be defined with respect to any countable set $\{\mathcal{U}_{\Lambda_n}\}_{n \in \mathbb{N}}$ of subalgebras of $\mathcal{U}$ with $\Lambda_m \cap \Lambda_n = \emptyset$ for m ≠ n. Observe that the relative entropy becomes additive with respect to product states: if $\omega^{(1,2)} = \hat{\omega}^{(1)} \otimes \hat{\omega}^{(2)}$, where $\hat{\omega}^{(1)}$ and $\hat{\omega}^{(2)}$ are two even states respectively on $\mathcal{U}_{\Lambda_1}$ and $\mathcal{U}_{\Lambda_2}$, then (6.4) is satisfied with equality. The relative entropy is also convex: for any states $\omega^{(1)}$, $\omega^{(2)}$, and $\omega^{(3)}$ on $\mathcal{U}_\Lambda$, $\omega^{(1)}$ faithful, and for any τ ∈ (0, 1)

$$ S(\omega^{(1)} \,|\, \tau \omega^{(2)} + (1 - \tau) \omega^{(3)}) \leq \tau S(\omega^{(1)} | \omega^{(2)}) + (1 - \tau) S(\omega^{(1)} | \omega^{(3)}). \tag{6.5} $$

Meanwhile

$$ S(\omega^{(1)} \,|\, \tau \omega^{(2)} + (1 - \tau) \omega^{(3)}) \geq \tau \log \tau + (1 - \tau) \log(1 - \tau) + \tau S(\omega^{(1)} | \omega^{(2)}) + (1 - \tau) S(\omega^{(1)} | \omega^{(3)}), \tag{6.6} $$

for any τ ∈ (0, 1). Note that the relative entropy makes sense in a class of states on $\mathcal{U}$ much larger than that of even states on $\mathcal{U}_\Lambda$ (cf. [40]), but this is not needed here.

The condition $\sigma : a_{\kappa(l),\#} \mapsto a_{\kappa(l+1),\#}$ uniquely defines a homomorphism σ on $\mathcal{U}$ called the right-shift homomorphism. Any state ω on $\mathcal{U}$ such that ω = ω ∘ σ is called shift-invariant, and we denote by $E_{\mathcal{U}}^{\sigma}$ the set of shift-invariant states on $\mathcal{U}$. An important class of shift-invariant states consists of the product states $\omega_\zeta$ obtained by “copying” some even state ζ of the one-site algebra $\mathcal{U}_1$ on all other sites, i.e.

$$ \omega_\zeta := \bigotimes_{k=0}^{\infty} \zeta \circ \sigma^{k}. \tag{6.7} $$

$^{s}$ As in [40], we use the Araki–Kosaki definition, which has the opposite sign to the one given in [25].

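Such product states are important and used below as reference states. More generally, a state ω is L-periodic with L ∈ N if $\omega = \omega \circ \sigma^L$. For each L ∈ N, the set of all L-periodic states from $E_{\mathcal{U}}$ is denoted by $E_{\mathcal{U}}^{\sigma^L}$. Let ζ be any faithful even state on $\mathcal{U}_1$ and let ω be any L-periodic state on $\mathcal{U}$. It immediately follows from super-additivity (6.4) that for any N, M ∈ N

$$ S(\omega_\zeta|_{\mathcal{U}_{(M+N)L}} \,|\, \omega|_{\mathcal{U}_{(M+N)L}}) \geq S(\omega_\zeta|_{\mathcal{U}_{ML}} \,|\, \omega|_{\mathcal{U}_{ML}}) + S(\omega_\zeta|_{\mathcal{U}_{NL}} \,|\, \omega|_{\mathcal{U}_{NL}}). $$

In particular, the following limit exists

$$ \tilde{S}(\zeta, \omega) := \lim_{N \to \infty} \frac{S(\omega_\zeta|_{\mathcal{U}_{NL}} \,|\, \omega|_{\mathcal{U}_{NL}})}{NL} = \sup_{N \in \mathbb{N}} \frac{S(\omega_\zeta|_{\mathcal{U}_{NL}} \,|\, \omega|_{\mathcal{U}_{NL}})}{NL} \tag{6.8} $$

and is the relative entropy density of ω with respect to the reference state ζ. This functional has the following important properties:

Lemma 6.1 (Properties of the Relative Entropy Density). At any fixed L ∈ N, the relative entropy density functional $\omega \mapsto \tilde{S}(\zeta, \omega)$ is lower weak∗-semicontinuous, i.e. for any faithful even state $\zeta \in E_{\mathcal{U}_1}$ and any r ∈ R, the set

$$ M_r := \{\omega \in E_{\mathcal{U}}^{\sigma^L} : \tilde{S}(\zeta, \omega) > r\} $$

is open with respect to the weak∗-topology. It is also affine, i.e. for any faithful state $\zeta \in E_{\mathcal{U}_1}$ and states $\omega, \omega' \in E_{\mathcal{U}}^{\sigma^L}$

$$ \tilde{S}(\zeta, \tau \omega + (1 - \tau) \omega') = \tau \tilde{S}(\zeta, \omega) + (1 - \tau) \tilde{S}(\zeta, \omega'), $$

with τ ∈ (0, 1).

Proof. Without loss of generality, let L = 1. From the second equality of (6.8),

$$ M_r = \bigcup_{N \in \mathbb{N}} \{\omega \in E_{\mathcal{U}}^{\sigma} : S(\omega_\zeta|_{\mathcal{U}_N} \,|\, \omega|_{\mathcal{U}_N}) > rN\}. $$

As the maps $\omega \mapsto S(\omega_\zeta|_{\mathcal{U}_N} \,|\, \omega|_{\mathcal{U}_N})$ are weak∗-continuous for each N, it follows that $M_r$ is a union of open sets, which implies the lower weak∗-semicontinuity of the relative entropy density functional. Moreover, from (6.5) and (6.6) we directly obtain that $\tilde{S}(\zeta, \omega)$ is affine, since the N-independent term τ log τ + (1 − τ) log(1 − τ) in (6.6) vanishes after division by NL in the limit N → ∞.

Notice that any p.i. state is automatically shift-invariant. Thus, the mean relative entropy density is a well-defined functional on $E_{\mathcal{U}}^{S,+}$. Now, we need to define on $E_{\mathcal{U}}^{S,+}$ the functional ∆(ω) related to the mean BCS interaction energy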


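per site:

Lemma 6.2 (BCS Energy per Site for p.i. States). For any $\omega \in E_{\mathcal{U}}^{S,+}$, the mean BCS interaction energy per site in the thermodynamic limit

$$ \Delta(\omega) := \lim_{N \to \infty} \frac{\gamma}{N^2} \sum_{l,m=1}^{N} \omega(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow}) = \gamma\, \omega(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow} a_{\kappa(2),\downarrow} a_{\kappa(2),\uparrow}) $$

is well-defined and the affine map $\Delta : E_{\mathcal{U}}^{S,+} \to \mathbb{C}$, $\omega \mapsto \Delta(\omega)$ is weak∗-continuous.

Proof. First,

$$ \sum_{l,m=1}^{N} \omega(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow}) = \sum_{l=1}^{N} \omega(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(l),\downarrow} a_{\kappa(l),\uparrow}) + \sum_{\substack{l,m=1 \\ l \neq m}}^{N} \omega(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow}). \tag{6.9} $$

Since $\omega \in E_{\mathcal{U}}^{S,+}$, for any l ≠ m observe that

$$ \omega(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow}) = \omega(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow} a_{\kappa(2),\downarrow} a_{\kappa(2),\uparrow}), \tag{6.10} $$

whereas

$$ \omega(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(l),\downarrow} a_{\kappa(l),\uparrow}) = \omega(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow} a_{\kappa(1),\downarrow} a_{\kappa(1),\uparrow}). \tag{6.11} $$

Therefore, by combining (6.9) with (6.10) and (6.11), the lemma follows. Indeed, after division by $N^2$, the N diagonal terms contribute $O(N^{-1})$, whereas the N(N − 1) off-diagonal terms are all equal by (6.10).

Now, we define by

$$ \omega^{H}(A) := \frac{\mathrm{Trace}(A\, e^{-\beta H})}{\mathrm{Trace}(e^{-\beta H})}, \qquad A \in \mathcal{U}_\Lambda, \tag{6.12} $$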

the Gibbs state associated with any self-adjoint element H of $\mathcal{U}_\Lambda$ at inverse temperature β > 0. This definition is of course in accordance with the Gibbs state $\omega_N$ (1.6) associated with the Hamiltonian$^{t}$ $H_N$ (1.2), since $\omega_N = \omega^{H_N}$ for any N ∈ N. Note, however, that the state $\omega_N$ is seen either as defined on the local algebra $\mathcal{U}_N$ or as defined on the whole algebra $\mathcal{U}$ by periodically extending it (with period N). Next we give an important property of Gibbs states (6.12):

Lemma 6.3 (Passivity of Gibbs States). Let $H_0$, $H_1$ be self-adjoint elements from $\mathcal{U}_\Lambda$ and define for any state ω on $\mathcal{U}_\Lambda$

$$ F_\Lambda(\omega) := -\omega(H_1) - \beta^{-1} S(\omega^{H_0} | \omega) + P^{H_0}, $$

where $P^{H} := \beta^{-1} \ln \mathrm{Trace}(e^{-\beta H})$ for any self-adjoint $H \in \mathcal{U}_\Lambda$. Then $P^{H_1 + H_0} \geq F_\Lambda(\omega)$ for any state ω on $\mathcal{U}_\Lambda$, with equality if $\omega = \omega^{H_0 + H_1}$. Note that $-F_\Lambda(\omega)$ is the free energy associated with the state ω.

$^{t}$ With the appropriate numbering of sites defined by the bijective map κ.

Proof. For any self-adjoint $H \in \mathcal{U}_\Lambda$ and any state ω on $\mathcal{U}_\Lambda$ observe that

$$ \mathrm{Trace}(D_\omega \ln D_{\omega^H}) = \mathrm{Trace}(D_\omega \ln(\exp(-\beta P^H - \beta H))) = -\beta \omega(H) - \beta P^H, \tag{6.13} $$

which implies that

$$ P^{H_1 + H_0} = -\beta^{-1} \big(\mathrm{Trace}(D_{\omega^{H_0 + H_1}} \ln D_{\omega^{H_0 + H_1}}) - \mathrm{Trace}(D_{\omega^{H_0 + H_1}} \ln D_{\omega^{H_0}})\big) - \omega^{H_0 + H_1}(H_1) + P^{H_0}, \tag{6.14} $$

i.e. $P^{H_1 + H_0} = F_\Lambda(\omega^{H_0 + H_1})$. Without loss of generality take any faithful state ω on $\mathcal{U}_\Lambda$. In this case, there are positive numbers $\lambda_j$ with $\sum_j \lambda_j = 1$ and vectors $|j\rangle$ from the Hilbert space $\mathcal{H}_\Lambda$ such that $\omega(\cdot) = \sum_j \lambda_j \langle j| \cdot |j\rangle$. In particular, from (6.13) we have

$$ -\beta \omega(H_1) - S(\omega^{H_0} | \omega) + \beta P^{H_0} = \sum_j \lambda_j \big(-\ln \lambda_j - \beta \langle j| H_0 + H_1 |j\rangle\big). $$

Consequently, by convexity of the exponential function combined with Jensen's inequality we obtain that

$$ \exp\big(-\beta \omega(H_1) - S(\omega^{H_0} | \omega) + \beta P^{H_0}\big) \leq \sum_j \lambda_j \exp\big(-\ln \lambda_j - \beta \langle j| H_0 + H_1 |j\rangle\big) \leq \mathrm{Trace}(\exp(-\beta (H_0 + H_1))) = \exp(\beta P^{H_1 + H_0}). $$

Note that the last inequality uses the so-called Peierls–Bogoliubov inequality, which is again a consequence of Jensen's inequality.

This proof is standard (see, e.g., [25]). It is only given in detail here because we also need Eqs. (6.13) and (6.14) later. Observe that Lemma 6.3 applied to $\omega = \omega^{H_0}$ gives the Bogoliubov (convexity) inequality [29]. We can also deduce from this lemma that the pressure $p_N(\beta, \mu, \lambda, \gamma, h)$ (1.4) associated with $H_N$ equals

$$ p_N(\beta, \mu, \lambda, \gamma, h) = \frac{\gamma}{N^2} \sum_{l,m=1}^{N} \omega_N(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow}) - \frac{1}{\beta N} S(\omega_{\zeta_0}|_{\mathcal{U}_N} \,|\, \omega_N) + p_N(\beta, \mu, \lambda, 0, h), \tag{6.15} $$


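for any β, γ > 0 and real numbers µ, λ, h. Recall that $\omega_{\zeta_0}$ is the shift-invariant state obtained by “copying” the state $\zeta_0$ (6.1) of the one-site algebra $\mathcal{U}_1$, see (6.7).

Lemma 6.4 (From S to the Relative Entropy Density $\tilde{S}$ at Finite N). Let $\tilde{\omega}_N$ be the shift-invariant state defined by

$$ \tilde{\omega}_N := \frac{1}{N} (\omega_N + \omega_N \circ \sigma + \cdots + \omega_N \circ \sigma^{N-1}), $$

where σ is the right-shift homomorphism. Then $S(\omega_{\zeta_0}|_{\mathcal{U}_N} \,|\, \omega_N) = N \tilde{S}(\zeta_0, \tilde{\omega}_N)$, cf. (6.8).

Proof. By Lemma 6.1 combined with (6.8), the relative entropy density $\tilde{S}(\zeta_0, \tilde{\omega}_N)$ equals

$$ \tilde{S}(\zeta_0, \tilde{\omega}_N) = \lim_{M \to \infty} \frac{1}{MN} \Big\{ \frac{1}{N} \sum_{k=0}^{N-1} S(\omega_{\zeta_0}|_{\mathcal{U}_{MN}} \,|\, \omega_N \circ \sigma^k|_{\mathcal{U}_{MN}}) \Big\}, \tag{6.16} $$

for any fixed N ∈ N. By using now the additivity of the relative entropy for product states observe that

$$ S(\omega_{\zeta_0}|_{\mathcal{U}_{MN}} \,|\, \omega_N \circ \sigma^k|_{\mathcal{U}_{MN}}) = (M - 1)\, S(\omega_{\zeta_0}|_{\mathcal{U}_N} \,|\, \omega_N|_{\mathcal{U}_N}) + S(\omega_{\zeta_0}|_{\mathcal{U}_k} \,|\, \omega_N|_{\mathcal{U}_k}) + S(\omega_{\zeta_0}|_{\mathcal{U}_{N-k}} \,|\, \omega_N|_{\mathcal{U}_{N-k}}), \tag{6.17} $$

for any k ∈ {0, . . . , N − 1}, with $S(\omega_{\zeta_0}|_{\mathcal{U}_0} \,|\, \omega_N|_{\mathcal{U}_0}) := 0$ by definition. Therefore the equality $S(\omega_{\zeta_0}|_{\mathcal{U}_N} \,|\, \omega_N) = N \tilde{S}(\zeta_0, \tilde{\omega}_N)$ directly follows from (6.16) combined with (6.17).

We are now in a position to give a first general upper bound for the pressure $p_N(\beta, \mu, \lambda, \gamma, h)$ by using the equality (6.15) together with Lemmas 6.2 and 6.4.

Lemma 6.5 (General Upper Bound of the Pressure $p_N$). For any β, γ > 0 and µ, λ, h ∈ R, one gets that

$$ \limsup_{N \to \infty} \{p_N(\beta, \mu, \lambda, \gamma, h)\} \leq p(\beta, \mu, \lambda, 0, h) + \sup_{\omega \in \mathcal{E}_{\mathcal{U}}^{S,+}} \{\Delta(\omega) - \beta^{-1} \tilde{S}(\zeta_0, \omega)\}, $$

where we recall that $\mathcal{E}_{\mathcal{U}}^{S,+}$ is the non-empty set of extreme points of $E_{\mathcal{U}}^{S,+}$.

Proof. By (6.15) combined with Lemma 6.4 one gets

$$ p_N(\beta, \mu, \lambda, \gamma, h) = \frac{\gamma}{N^2} \sum_{l,m=1}^{N} \omega_N(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow}) - \beta^{-1} \tilde{S}(\zeta_0, \tilde{\omega}_N) + p_N(\beta, \mu, \lambda, 0, h). \tag{6.18} $$

The last term of this equality is independent of N ∈ N since

$$ p_N(\beta, \mu, \lambda, 0, h) = \frac{1}{\beta} \ln \mathrm{Trace}\big(e^{\beta[(\mu - h) n_\uparrow + (\mu + h) n_\downarrow - 2\lambda n_\uparrow n_\downarrow]}\big) =: p(\beta, \mu, \lambda, 0, h), \tag{6.19} $$

cf. (2.3).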
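Because the exponent in (6.19) is diagonal in the occupation numbers, the trace is a sum over the four one-site configurations. The following sketch, with hypothetical function names and illustrative parameters, evaluates this sum and the electron density of the corresponding γ = 0 one-site Gibbs state; it is an illustration of (6.19) only and not code from the paper.

```python
import math

def gamma0_weights(beta, mu, lam, h):
    """Boltzmann weights of the four occupation states (n_up, n_down) in (6.19)."""
    weights = {}
    for n_up in (0, 1):
        for n_down in (0, 1):
            exponent = (mu - h) * n_up + (mu + h) * n_down - 2.0 * lam * n_up * n_down
            weights[(n_up, n_down)] = math.exp(beta * exponent)
    return weights

def one_site_pressure_gamma0(beta, mu, lam, h):
    """p(beta, mu, lam, 0, h) = (1/beta) * ln of the sum of the four weights."""
    return math.log(sum(gamma0_weights(beta, mu, lam, h).values())) / beta

def electron_density_gamma0(beta, mu, lam, h):
    """Expectation of n_up + n_down in the gamma = 0 one-site Gibbs state."""
    weights = gamma0_weights(beta, mu, lam, h)
    partition = sum(weights.values())
    return sum((n_up + n_down) * w for (n_up, n_down), w in weights.items()) / partition

if __name__ == "__main__":
    # Illustrative parameters only.
    print(one_site_pressure_gamma0(beta=7.0, mu=0.375, lam=0.575, h=0.1))
    print(electron_density_gamma0(beta=7.0, mu=0.375, lam=0.375, h=0.1))
```

For µ = λ the empty and doubly occupied configurations carry the same weight in this sketch, so the density of the γ = 0 reference state equals 1 for every β and h.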


However, the other terms require the knowledge of the states $\omega_N$ and $\tilde{\omega}_N$ in the limit N → ∞. Actually, because the unit ball of the dual of $\mathcal{U}$ is compact and metrizable with respect to the weak∗-topology, the sequence $\{\tilde{\omega}_N\}$ converges in the weak∗-topology along a subsequence towards some state $\omega_\infty$. Meanwhile, it is easy to see that for all $A \in \mathcal{U}_\Lambda$, Λ ∈ $I$,

$$ \lim_{N \to \infty} \{\omega_N(A) - \tilde{\omega}_N(A)\} = 0. $$

Thus, the sequences of states $\omega_N$ and $\tilde{\omega}_N$ have the same limit points. Since $\omega_N$ is even and permutation invariant with respect to the first N sites, the state $\omega_\infty$ belongs to $E_{\mathcal{U}}^{S,+}$. We now estimate the first term of (6.18) as in Lemma 6.2 to get

$$ \limsup_{N \to \infty} \{p_N(\beta, \mu, \lambda, \gamma, h)\} \leq p(\beta, \mu, \lambda, 0, h) + \gamma\, \omega_\infty(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow} a_{\kappa(2),\downarrow} a_{\kappa(2),\uparrow}) + \beta^{-1} \limsup_{N \to \infty} \{-\tilde{S}(\zeta_0, \tilde{\omega}_N)\}. \tag{6.20} $$

From Lemma 6.1 the relative entropy density is lower semicontinuous in the weak∗-topology, which implies that

$$ \limsup_{N \to \infty} \{-\tilde{S}(\zeta_0, \tilde{\omega}_N)\} \leq -\tilde{S}(\zeta_0, \omega_\infty). $$

By combining this last inequality with (6.20) we then find that

$$ \limsup_{N \to \infty} \{p_N(\beta, \mu, \lambda, \gamma, h)\} \leq p(\beta, \mu, \lambda, 0, h) + \Delta(\omega_\infty) - \beta^{-1} \tilde{S}(\zeta_0, \omega_\infty), \tag{6.21} $$

with $\omega_\infty \in E_{\mathcal{U}}^{S,+}$.

Now, from Lemma 6.2 the functional $\omega \mapsto \Delta(\omega)$ is affine and weak∗-continuous, whereas by Lemma 6.1 the map $\omega \mapsto \tilde{S}(\zeta_0, \omega)$ is affine and lower weak∗-semicontinuous. The free energy functional $\omega \mapsto \Delta(\omega) - \beta^{-1} \tilde{S}(\zeta_0, \omega)$ is, in particular, convex and upper weak∗-semicontinuous. Meanwhile recall that $E_{\mathcal{U}}^{S,+}$ is a weak∗-compact and convex set. Therefore, from the Bauer maximum principle [32, Lemma 4.1.12] it follows that

$$ \sup_{\omega \in E_{\mathcal{U}}^{S,+}} \{\Delta(\omega) - \beta^{-1} \tilde{S}(\zeta_0, \omega)\} = \sup_{\omega \in \mathcal{E}_{\mathcal{U}}^{S,+}} \{\Delta(\omega) - \beta^{-1} \tilde{S}(\zeta_0, \omega)\}. \tag{6.22} $$

Together with (6.21), this last equality implies the upper bound stated in the lemma.

Since even states on $\mathcal{U}$ are entirely determined by their action on even elements of $\mathcal{U}$, observe that we can identify the set of even p.i. states on $\mathcal{U}$ with the set of p.i. states on the even sub-algebra $\mathcal{U}^+$. We want to show next that the set of extreme points $\mathcal{E}_{\mathcal{U}}^{S,+}$ is contained in the set of strongly clustering states on the even sub-algebra $\mathcal{U}^+$ of $\mathcal{U}$. By strongly clustering states ω with respect to $\mathcal{U}^+$, we mean that for any B in $\mathcal{U}^+$, there exists a net $\{B_j\} \subseteq \mathrm{Co}\{\eta_s(B) : s \in S\}$ such that for any $A \in \mathcal{U}^+$,

$$ \lim_{j} |\omega(A\, \eta_s(B_j)) - \omega(A)\, \omega(B)| = 0 $$

uniformly in s ∈ S. Here, Co M denotes the convex hull of the set M.


Lemma 6.6 (Characterization of the Set of Extreme States of $E_{\mathcal{U}}^{S,+}$). Any extreme state $\omega \in \mathcal{E}_{\mathcal{U}}^{S,+}$ is strongly clustering with respect to the even sub-algebra $\mathcal{U}^+$ and conversely.

Proof. We use some standard facts about extreme decompositions of states which can be found in [32, Theorems 4.3.17 and 4.3.22]. To satisfy the requirements of these theorems, we need to prove that the C∗-algebra $\mathcal{U}^+$ of even elements of $\mathcal{U}$ is asymptotically abelian with respect to the action of the group S. This is proven as follows. For each l ∈ N define the map $\pi^{(l)} : \mathbb{N} \to \mathbb{N}$ by

$$ \pi^{(l)}(k) := \begin{cases} k + 2^{l-1}, & \text{if } 1 \leq k \leq 2^{l-1}, \\ k - 2^{l-1}, & \text{if } 2^{l-1} + 1 \leq k \leq 2^{l}, \\ k, & \text{if } k > 2^{l}. \end{cases} \tag{6.23} $$

In other words, the map $\pi^{(l)}$ exchanges the block $\{1, \ldots, 2^{l-1}\}$ with $\{2^{l-1} + 1, \ldots, 2^{l}\}$, and leaves the rest invariant. For any $A, B \in \mathcal{U}_\Lambda \cap \mathcal{U}^+$ with Λ ∈ $I$, it is then not difficult to see that

$$ \lim_{l \to \infty} [A, \eta_{\pi^{(l)}}(B)] = 0 $$

in the norm sense. Recall that the map $\eta_{\pi^{(l)}}$ is defined via (6.3). By density of local elements of $\mathcal{U}^+$ the limit above equals zero for all $A, B \in \mathcal{U}^+$. Therefore, by using now [32, Theorems 4.3.17 and 4.3.22], all states $\omega \in \mathcal{E}_{\mathcal{U}}^{S,+}$ are strongly clustering with respect to $\mathcal{U}^+$ and conversely.

We show next that p.i. states which are strongly clustering with respect to the even sub-algebra $\mathcal{U}^+$ have clustering properties with respect to the whole algebra $\mathcal{U}$.

Lemma 6.7 (Extension of the Strongly Clustering Property). Let $\omega \in E_{\mathcal{U}}^{S,+}$ be any strongly clustering state with respect to $\mathcal{U}^+$. Then, for any $A, B \in \mathcal{U}$ and ε > 0, there are $B_\varepsilon \in \mathrm{Co}\{\eta_s(B) : s \in S\}$ and $l_\varepsilon$ such that for any $l \geq l_\varepsilon$,

$$ |\omega(A\, \eta_{\pi^{(l)}}(B_\varepsilon)) - \omega(A)\, \omega(B)| < \varepsilon. $$

Proof. By density of local elements it suffices to prove the lemma for any $A, B \in \mathcal{U}_N$ and N ∈ N. The operators A and B can always be written as sums $A = A^+ + A^-$ and $B = B^+ + B^-$, where $A^+$ and $B^+$ are in the even sub-algebra $\mathcal{U}^+$ whereas $A^-$ and $B^-$ are odd elements, i.e. they are sums of monomials of odd degree in annihilation and creation operators. Since ω is assumed to be strongly clustering with respect to $\mathcal{U}^+$, for any ε > 0 there are positive numbers $\lambda_1, \ldots, \lambda_k$ with $\lambda_1 + \cdots + \lambda_k = 1$, and maps $s_1, \ldots, s_k \in S$ such that for any l ∈ N,

$$ \Big| \omega\Big(A^+\, \eta_{\pi^{(l)}}\Big(\sum_{j=1}^{k} \lambda_j\, \eta_{s_j}(B^+)\Big)\Big) - \omega(A^+)\, \omega(B^+) \Big| < \varepsilon. \tag{6.24} $$


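By parity and linearity of ω observe that $\omega(A^+)\, \omega(B^+) = \omega(A)\, \omega(B)$, whereas

$$ \omega(A\, \eta_{\pi^{(l)}}(B_\varepsilon)) = \omega\Big(A^+\, \eta_{\pi^{(l)}}\Big(\sum_{j=1}^{k} \lambda_j\, \eta_{s_j}(B^+)\Big)\Big) \tag{6.25} $$

for large enough l with the operator $B_\varepsilon \in \mathrm{Co}\{\eta_s(B) : s \in S\}$ defined by

$$ B_\varepsilon := \sum_{j=1}^{k} \lambda_j\, \eta_{s_j}(B). \tag{6.26} $$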
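The role of the block exchange $\pi^{(l)}$ from (6.23) in these arguments is that, for l large enough, it moves any finite set of sites away from itself. A minimal sketch on site labels only (the function names are hypothetical, and the operators themselves are not modeled) reads:

```python
def block_swap(l, k):
    """pi^(l)(k) from (6.23): exchange {1..2^(l-1)} with {2^(l-1)+1..2^l}, fix the rest."""
    half = 2 ** (l - 1)
    if 1 <= k <= half:
        return k + half
    if half + 1 <= k <= 2 * half:
        return k - half
    return k

def moved_support(l, support):
    """Image under pi^(l) of a finite set of site labels."""
    return {block_swap(l, k) for k in support}

if __name__ == "__main__":
    sites = range(1, 6)  # labels 1..N with N = 5, chosen for illustration
    for l in (3, 4):
        image = moved_support(l, sites)
        # True exactly when the image is disjoint from the original labels
        print(l, sorted(image), image.isdisjoint(sites))
```

In this sketch the image of {1, . . . , N} is disjoint from {1, . . . , N} as soon as $2^{l-1} \geq N$, which is the kind of support separation used in the remainder of the proof.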

The equality (6.25) follows from parity and the statement

$$ \omega(A\, \eta_{\pi^{(l)}}(\tilde{B}^-)) = 0 $$

for any $\omega \in E_{\mathcal{U}}^{S,+}$, $A, \tilde{B}^- \in \mathcal{U}_N$, $\tilde{B}^-$ odd, and sufficiently large l. This can be seen as follows. Since any element of $\mathcal{U}_N$ with defined parity can be written as a linear combination of two self-adjoint elements with the same parity, we assume without loss of generality that $(\tilde{B}^-)^* = \tilde{B}^-$. Choose $l' \in \mathbb{N}$ large enough such that the support of $\tilde{B}^-_l := \eta_{\pi^{(l)}}(\tilde{B}^-)$ does not intersect {κ(1), . . . , κ(N)} for all $l \geq l'$. The map $\pi^{(l)} : \mathbb{N} \to \mathbb{N}$ is defined by (6.23). Define $\tilde{B}^-_{l,m} := \sigma^{m 2^{l+1}}(\tilde{B}^-_l)$, $m \in \mathbb{N}_0 := \{0, 1, 2, \ldots\}$, where σ is the right-shift homomorphism. For any J ∈ N

$$ \omega\Big(A \sum_{m=0}^{J} \tilde{B}^-_{l,m}\Big) = (J + 1)\, \omega(A \tilde{B}^-_{l,0}) $$

by symmetry of ω. Use now the Cauchy–Schwarz inequality for states to get

$$ (J + 1)\, |\omega(A \tilde{B}^-_{l,0})| \leq \sqrt{\omega(A^* A)}\, \sqrt{\sum_{m,m'=0}^{J} \omega(\tilde{B}^-_{l,m} \tilde{B}^-_{l,m'})}. $$

Since, by construction, $\tilde{B}^-_{l,m}$ and $\tilde{B}^-_{l,m'}$ anti-commute if m ≠ m′ and ω is permutation invariant, the off-diagonal terms vanish and

$$ \sum_{m,m'=0}^{J} \omega(\tilde{B}^-_{l,m} \tilde{B}^-_{l,m'}) = \sum_{m=0}^{J} \omega(\tilde{B}^-_{l,m} \tilde{B}^-_{l,m}). $$

By symmetry of ω, the right-hand side of the equation above equals $(J + 1)\, \omega((\tilde{B}^-_{l,0})^2)$. Hence, we conclude that

$$ |\omega(A \tilde{B}^-_{l,0})| \leq (J + 1)^{-1/2} \sqrt{\omega(|A|^2)\, \omega((\tilde{B}^-_{l,0})^2)}, $$

for any J ∈ N, i.e. $\omega(A \tilde{B}^-_{l,0}) = 0$ for all $l \geq l'$.

Therefore, the lemma follows from (6.24) and (6.25) with $B_\varepsilon \in \mathrm{Co}\{\eta_s(B) : s \in S\}$ defined by (6.26) for any ε > 0.

We now identify the set of clustering states on $\mathcal{U}$ with the set of product states by the following lemma, which is a non-commutative version of the de Finetti theorem


of probability theory [28]. Størmer [1] was the first to show the corresponding result for infinite tensor products of C∗-algebras.

Lemma 6.8 (Strongly Clustering p.i. States are Product States). Any p.i. and strongly clustering (in the sense of Lemma 6.7) state ω is a product state (6.7) with the one-site state $\zeta = \zeta_\omega := \omega|_{\mathcal{U}_1}$ being the restriction of ω to the local (one-site) algebra $\mathcal{U}_1$.

Proof. Let $l_1, \ldots, l_k \in \mathbb{N}$ with $l_i \neq l_j$ whenever i ≠ j, and for any j ∈ {1, . . . , k} take $A_j \in \mathcal{U}_1$. To prove the lemma we need to show that

$$ \omega(\sigma^{l_1}(A_1) \cdots \sigma^{l_k}(A_k)) = \zeta_\omega(A_1) \cdots \zeta_\omega(A_k). \tag{6.27} $$

The proof of this last equality for any k ≥ 1 is performed by induction. First, for k = 1 the equality (6.27) immediately follows by symmetry of the state ω. Now, assume that the equality (6.27) holds at fixed k ≥ 1. The state ω is strongly clustering in the sense of Lemma 6.7. Therefore for each ε > 0 there are q ∈ N, positive numbers $\lambda_1, \ldots, \lambda_q$ with $\lambda_1 + \cdots + \lambda_q = 1$, and maps $s_1, \ldots, s_q \in S$ such that

$$ \Big| \sum_{j=1}^{q} \lambda_j\, \omega\big(\sigma^{l_1}(A_1) \cdots \sigma^{l_k}(A_k)\, \eta_{\pi^{(l)} \circ s_j}(\sigma^{l_{k+1}}(A_{k+1}))\big) - \omega(\sigma^{l_1}(A_1) \cdots \sigma^{l_k}(A_k))\, \omega(\sigma^{l_{k+1}}(A_{k+1})) \Big| < \varepsilon, \tag{6.28} $$

for any l ∈ N. Fix N sufficiently large such that the operators $\sigma^{l_m}(A_m)$ and $\eta_{s_j}(\sigma^{l_{k+1}}(A_{k+1}))$ belong to $\mathcal{U}_N$ for any m ∈ {1, . . . , k + 1} and j ∈ {1, . . . , q}. Choose $l' \in \mathbb{N}$ large enough such that the support of $\eta_{\pi^{(l)} \circ s_j}(\sigma^{l_{k+1}}(A_{k+1}))$ does not intersect {κ(1), . . . , κ(N)} for all $l \geq l'$ and j ∈ {1, . . . , q}, which by symmetry of ω implies that

$$ \omega\big(\sigma^{l_1}(A_1) \cdots \sigma^{l_k}(A_k)\, \eta_{\pi^{(l)} \circ s_j}(\sigma^{l_{k+1}}(A_{k+1}))\big) = \omega(\sigma^{l_1}(A_1) \cdots \sigma^{l_k}(A_k)\, \sigma^{l_{k+1}}(A_{k+1})). $$

Combined with (6.28) and $\lambda_1 + \cdots + \lambda_q = 1$, it yields

$$ |\omega(\sigma^{l_1}(A_1) \cdots \sigma^{l_k}(A_k)\, \sigma^{l_{k+1}}(A_{k+1})) - \omega(\sigma^{l_1}(A_1) \cdots \sigma^{l_k}(A_k))\, \zeta_\omega(A_{k+1})| < \varepsilon. $$

Since the equality (6.27) is assumed to hold at fixed k ≥ 1, it follows that

$$ |\omega(\sigma^{l_1}(A_1) \cdots \sigma^{l_{k+1}}(A_{k+1})) - \zeta_\omega(A_1) \cdots \zeta_\omega(A_{k+1})| < \varepsilon, $$

for any ε > 0. In other words, by induction the equality (6.27) is proven for any k ≥ 1.

As far as the upper bound is concerned, we combine Lemma 6.5 with Lemmas 6.6–6.8 to obtain that

$$ \limsup_{N \to \infty} \{p_N(\beta, \mu, \lambda, \gamma, h)\} \leq p(\beta, \mu, \lambda, 0, h) + \sup_{\zeta \in E^{+}_{\mathcal{U}_1}} \{\gamma |\zeta(a^*_\uparrow a^*_\downarrow)|^2 - \beta^{-1} S(\zeta_0 | \zeta)\}. \tag{6.29} $$

Here $E^{+}_{\mathcal{U}_1}$ denotes the set of even states on the (one-site) algebra $\mathcal{U}_1$. Now the proof of the upper bound (6.2) easily follows from the passivity of Gibbs states on


U1 . Indeed, we apply Lemma 6.3 to the one-site Hamiltonians H0 = H1 (0) (see (2.1)) and c c¯ H1 = − a∗↑ a∗↓ − a↑ a↓ 2 2 in order to bound the relative entropy S(ζ0 |ζ). More precisely, it follows that p(β, µ, λ, 0, h) − β −1 S(ζ0 |ζ) ≤ p(c/(2γ)) − x Re{ζ(a↑ a↓ )} − y Im{ζ(a↑ a↓ )}, (6.30) and any c ∈ C with x := Re{c} and y := Im{c}. Consequently, for any state ζ ∈ from (6.29) we deduce that EU+1

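$$\begin{aligned}
\limsup_{N \to \infty} \{ p_N(\beta, \mu, \lambda, \gamma, h) \}
&\leq \sup_{\zeta \in E^{+}_{U_1}} \inf_{x, y \in \mathbb{R}} \Big\{ \gamma \big( \mathrm{Re}\{\zeta(a_\uparrow a_\downarrow)\}^2 + \mathrm{Im}\{\zeta(a_\uparrow a_\downarrow)\}^2 \big) \\
&\qquad\qquad - x\, \mathrm{Re}\{\zeta(a_\uparrow a_\downarrow)\} - y\, \mathrm{Im}\{\zeta(a_\uparrow a_\downarrow)\} + p((x + iy)/(2\gamma)) \Big\} \\
&\leq \sup_{t, s \in \mathbb{R}} \inf_{x, y \in \mathbb{R}} \big\{ \gamma (t^2 + s^2) - tx - sy + p((x + iy)/(2\gamma)) \big\}.
\end{aligned}$$

In particular, by fixing $x = 2t\gamma$ and $y = 2s\gamma$ in the infimum we finally obtain

$$\limsup_{N \to \infty} \{ p_N(\beta, \mu, \lambda, \gamma, h) \} \leq \sup_{t, s \in \mathbb{R}} \{ -\gamma (t^2 + s^2) + p(t + is) \},$$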

i.e. the upper bound (6.2) for any $\beta, \gamma > 0$ and $\mu, \lambda, h \in \mathbb{R}$.

6.2. Equilibrium and ground states of the strong coupling BCS-Hubbard model

It follows immediately from the passivity of Gibbs states that

$$p(\beta, \mu, \lambda, \gamma, h) \geq \Delta(\omega) - \beta^{-1} \tilde{S}(\zeta_0, \omega) + p(\beta, \mu, \lambda, 0, h), \tag{6.31}$$

for any $\omega \in E^{S,+}_{U}$, cf. (6.1) and Lemmas 6.2 and 6.3. Therefore, by using Lemma 6.5 with (6.22) the (infinite volume) pressure can be written as

$$p(\beta, \mu, \lambda, \gamma, h) = \sup_{\omega \in E^{S,+}_{U}} \{ \Delta(\omega) - \beta^{-1} \tilde{S}(\zeta_0, \omega) \} + p(\beta, \mu, \lambda, 0, h).$$

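Moreover, as shown above (see the upper bound in the proof of Lemma 6.5), any weak$^*$ limit point $\omega_\infty$ of local Gibbs states $\omega_N$ (1.6) as $N \to \infty$ satisfies (6.31) with equality. Indeed, by using (6.13) one obtains for any state $\omega$ that

$$\frac{1}{N} \big( -\omega(H_N) - \beta^{-1} S(\mathrm{tr}_N \,|\, \omega|_{U_N}) \big) = \frac{\gamma}{N^2} \sum_{l,m=1}^{N} \omega\big( a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow} \big) - \frac{1}{\beta N} S(\omega_{\zeta_0}|_{U_N} \,|\, \omega|_{U_N}) + p_N(\beta, \mu, \lambda, 0, h), \tag{6.32}$$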

April 20, 2010 14:17 WSPC/S0129-055X

280

148-RMP

J070-00395

J.-B. Bru & W. de Siqueira Pedra

with $p_N$ being the (finite volume) pressure (1.4) associated with the Hamiltonian $H_N$ (1.2), $\omega_{\zeta_0}$ being the product state obtained by "copying" the state $\zeta_0$ (6.1) on the one-site algebra $U_1$ (see (6.7)), and with the trace state $\mathrm{tr}_N$ defined on the local algebra $U_N$ for $N \in \mathbb{N}$ by

$$\mathrm{tr}_N(\cdot) := \frac{\mathrm{Trace}(\cdot)}{\mathrm{Trace}(I_{U_N})}.$$

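For any permutation invariant state $\omega$ it is straightforward to check that the limits

$$\lim_{N \to \infty} \{ N^{-1} S(\omega_{\zeta_0}|_{U_N} \,|\, \omega|_{U_N}) \} \quad \text{and} \quad e(\omega) := \lim_{N \to \infty} \{ N^{-1} \omega(H_N) \} = \omega(H_1(0)) - \Delta(\omega)$$

exist for any fixed parameters $\beta, \gamma > 0$ and $\mu, \lambda, h \in \mathbb{R}$, see respectively (2.1) and Lemma 6.2 for the definitions of $H_1(0)$ and $\Delta(\omega)$. Combined with (6.19) and (6.32) it then follows that the usual entropy density

$$\tilde{S}(\omega) := -\lim_{N \to \infty} \{ N^{-1} S(\mathrm{tr}_N \,|\, \omega|_{U_N}) \} = -\lim_{N \to \infty} \frac{1}{N} \mathrm{Trace}\big( D_{\omega|_{U_N}} \log D_{\omega|_{U_N}} \big) < \infty$$

of the permutation invariant state $\omega$ also exists and

$$\lim_{N \to \infty} \frac{1}{\beta N} S(\omega_{\zeta_0}|_{U_N} \,|\, \omega|_{U_N}) = e(\omega) + \Delta(\omega) - \beta^{-1} \tilde{S}(\omega) + p(\beta, \mu, \lambda, 0, h).$$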

The set $\Omega_\beta = \Omega_\beta(\mu, \lambda, \gamma, h)$ of equilibrium states of the strong coupling BCS-Hubbard model is defined by

$$\Omega_\beta := \big\{ \omega \in E^{S,+}_{U} : -e(\omega) + \beta^{-1} \tilde{S}(\omega) = p(\beta, \mu, \lambda, \gamma, h) = \Delta(\omega) - \beta^{-1} \tilde{S}(\zeta_0, \omega) + p(\beta, \mu, \lambda, 0, h) \big\}.$$

Note that $\Omega_\beta$ contains by construction all weak$^*$ limit points of local Gibbs states $\omega_N$ as $N \to \infty$. Consequently, the equilibrium states are, as usual, the minimizers of the free energy functional

$$\omega \mapsto F(\omega) := e(\omega) - \beta^{-1} \tilde{S}(\omega) \tag{6.33}$$

on the convex and weak$^*$-compact set $E^{S,+}_{U}$, cf. (1.5). They also maximize the upper semicontinuous affine functional $\omega \mapsto \Delta(\omega) - \beta^{-1} \tilde{S}(\zeta_0, \omega)$. It follows that $\Omega_\beta$ is a closed face of $E^{S,+}_{U}$ and we have in this set a notion of pure and mixed thermodynamic phases (equilibrium states) by identifying purity with extremity. In particular, it is convex and weak$^*$-compact.

Each weak$^*$-limit $\omega$ of equilibrium states $\omega^{(n)} \in \Omega_{\beta_n}(\mu_n, \lambda_n, \gamma_n, h_n)$ such that $(\mu_n, \lambda_n, \gamma_n, h_n) \to (\mu, \lambda, \gamma, h)$ and $\beta_n \to \infty$ is called a ground state of the strong coupling BCS-Hubbard model. The set of all ground states with parameters $\gamma > 0$ and $\mu, \lambda, h \in \mathbb{R}$ is denoted


by $\Omega_\infty = \Omega_\infty(\mu, \lambda, \gamma, h)$. Extreme states of the weak$^*$-compact convex set $\Omega_\infty$ are called pure ground states.

We analyze now the set of pure equilibrium states, i.e. the equilibrium states $\omega \in \Omega_\beta$ belonging to the set of extreme points of $E^{S,+}_{U}$, cf. (6.22). First, from Lemmas 6.6–6.8 recall that any extreme state is a product state $\omega_\zeta$ (6.7), i.e. it is obtained by "copying" a state $\zeta$ on the one-site algebra $U_1$ to the other sites. In particular, by combining (6.22) with (6.31) observe that

$$p(\beta, \mu, \lambda, \gamma, h) = \sup_{\zeta \in E^{+}_{U_1}} \{ \gamma |\zeta(a^*_\uparrow a^*_\downarrow)|^2 - \beta^{-1} S(\zeta_0 | \zeta) \} + p(\beta, \mu, \lambda, 0, h). \tag{6.34}$$

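Therefore, a product state $\omega_\zeta$ is a pure equilibrium state if and only if $\zeta$ belongs to the set $G_\beta = G_\beta(\mu, \lambda, \gamma, h)$ of one-site equilibrium states defined by

$$G_\beta := \big\{ \zeta \in E^{+}_{U_1} : \gamma |\zeta(a^*_\uparrow a^*_\downarrow)|^2 - \beta^{-1} S(\zeta_0 | \zeta) = p(\beta, \mu, \lambda, \gamma, h) - p(\beta, \mu, \lambda, 0, h) \big\}. \tag{6.35}$$

In other words, the study of pure states of $\Omega_\beta$ can be reduced, without loss of generality, to the analysis of $G_\beta$. The first important statement concerns the characterization of the set $G_\beta$ in relation to the variational problems (2.4) and (6.34).

Theorem 6.1 (Explicit Description of One-Site Equilibrium States). For any $\beta, \gamma > 0$ and $\mu, \lambda, h \in \mathbb{R}$, the set $G_\beta$ of one-site equilibrium states is given by the states $\zeta_{c_\beta}$ (6.1) with $c_\beta := r_\beta^{1/2} e^{i\phi}$ for any order parameter $r_\beta$ solution of (2.4) and any phase $\phi \in [0, 2\pi)$.

Proof. Take any solution $r_\beta$ of (2.4) and any $\phi \in [0, 2\pi)$. Then, from (6.14) observe that

$$-\beta^{-1} S(\zeta_0 | \zeta_{c_\beta}) + p(\beta, \mu, \lambda, 0, h) = -\gamma\, \zeta_{c_\beta}\big( c_\beta a^*_\uparrow a^*_\downarrow + \bar{c}_\beta a_\downarrow a_\uparrow \big) + p(c_\beta). \tag{6.36}$$

Since $\zeta_{c_\beta}(a_\downarrow a_\uparrow) = c_\beta$ and $\zeta_{c_\beta}(a^*_\uparrow a^*_\downarrow) = \bar{c}_\beta$, the last equality combined with Theorem 2.1 implies that

$$\gamma |\zeta_{c_\beta}(a_\downarrow a_\uparrow)|^2 - \beta^{-1} S(\zeta_0 | \zeta_{c_\beta}) = p(\beta, \mu, \lambda, \gamma, h) - p(\beta, \mu, \lambda, 0, h). \tag{6.37}$$

In other words, $\zeta_{c_\beta}$ is a maximizer of the variational problem defined in (6.34) and hence, $\zeta_{c_\beta} \in G_\beta$. On the other hand, any state $\zeta \in G_\beta$ satisfies (6.37) and by combining Theorem 2.1 with the inequality (6.30) for $c = 2\gamma \zeta(a_\downarrow a_\uparrow)$ it follows that

$$-\gamma |\zeta(a_\downarrow a_\uparrow)|^2 + p(\zeta(a_\downarrow a_\uparrow)) \geq \sup_{c \in \mathbb{C}} \{ -\gamma |c|^2 + p(c) \}.$$

Hence, $\zeta(a_\downarrow a_\uparrow) = r_\beta^{1/2} e^{i\phi} = c_\beta$ for some $\phi \in [0, 2\pi)$. It remains to prove that the equality $\zeta(a_\downarrow a_\uparrow) = c_\beta$ uniquely defines the one-site equilibrium state $\zeta \in G_\beta$. It follows from $\zeta(a_\downarrow a_\uparrow) = \zeta_{c_\beta}(a_\downarrow a_\uparrow) = c_\beta$ with $\zeta, \zeta_{c_\beta} \in G_\beta$ that $S(\zeta_0 | \zeta_{c_\beta}) = S(\zeta_0 | \zeta)$ and

$$\gamma\, \zeta\big( c_\beta a^*_\uparrow a^*_\downarrow + \bar{c}_\beta a_\downarrow a_\uparrow \big) - \beta^{-1} S(\zeta_0 | \zeta) = P_{H_1(c_\beta)} - P_{H_1(0)} \tag{6.38}$$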


because of (6.36), see (2.1) for the definition of $H_1(c)$. By Lemma 6.3, one obtains for any self-adjoint $A \in U_1$ that

$$-\zeta(A) + \gamma\, \zeta\big( c_\beta a^*_\uparrow a^*_\downarrow + \bar{c}_\beta a_\downarrow a_\uparrow \big) - \beta^{-1} S(\zeta_0 | \zeta) \leq P_{H_1(c_\beta) + A} - P_{H_1(0)}. \tag{6.39}$$

Consequently, we obtain by combining (6.38) and (6.39) that

$$P_{H_1(c_\beta) + A} - P_{H_1(c_\beta)} \geq -\zeta(A),$$

for any self-adjoint $A \in U_1$ and $\zeta \in G_\beta$ such that $\zeta(a_\downarrow a_\uparrow) = c_\beta$. In other words, the functional $\{-\zeta\}$ is tangent to the pressure at $H_1(c_\beta)$. Since the convex map $A \mapsto P_{H_1(c_\beta) + A}$ is continuously differentiable and self-adjoint elements separate states, the tangent functional is unique and $\zeta = \zeta_{c_\beta}$.

It follows immediately from the theorem above that pure states of $\Omega_\beta$ solve the gap equation:

Corollary 6.1 (Gap Equation for Pure Equilibrium States). For any $\beta, \gamma > 0$ and $\mu, \lambda, h \in \mathbb{R}$, pure states from $\Omega_\beta$ are precisely the product states $\omega_{\zeta_{c_\beta}}$ satisfying the gap equation $\omega_{\zeta_{c_\beta}}(a_{\kappa(l),\uparrow}, a_{\kappa(l),\downarrow}) = c_\beta$ for any $l \in \mathbb{N}$ and with $c_\beta := r_\beta^{1/2} e^{i\phi}$ being any maximizer of the first variational problem given in Theorem 2.1.

If $c_\beta \neq 0$, observe that the gap equation $\omega_{\zeta_{c_\beta}}(a_{\kappa(l),\uparrow}, a_{\kappa(l),\downarrow}) = c_\beta$ with $\zeta_c$ defined in (6.1) corresponds to the Euler–Lagrange equation satisfied by the solutions $c_\beta := r_\beta^{1/2} e^{i\phi}$ of the first variational problem given in Theorem 2.1. The phase $\phi \in [0, 2\pi)$ is arbitrary because of the gauge invariance of the map $c \mapsto p(c)$, and the gap equation $\omega_{\zeta_{c_\beta}}(a_{\kappa(l),\uparrow}, a_{\kappa(l),\downarrow}) = c_\beta$ can be reduced to (2.5). In other words, if $c_\beta \neq 0$, the gap equation can be written in two different ways: either $\omega_{\zeta_{c_\beta}}(a_{\kappa(l),\uparrow}, a_{\kappa(l),\downarrow}) = c_\beta$ from the viewpoint of extreme equilibrium states, or (2.5) from the viewpoint of the order parameter $r_\beta$.

From this last corollary observe also that the existence of non-zero maximizers $c_\beta \neq 0$ implies the existence of equilibrium states breaking the $U(1)$-gauge symmetry satisfied by $H_N$ (1.2). This breakdown of the $U(1)$-gauge symmetry for $c_\beta \neq 0$ is already explained by Theorem 3.2, which can be proven by our notion of equilibrium states as follows. Consider the upper semicontinuous convex map on $E^{S,+}_{U}$ defined for any $\alpha \geq 0$ and $\phi \in [0, 2\pi)$ by

$$\omega \mapsto -e(\omega) + \beta^{-1} \tilde{S}(\omega) + 2\alpha\, \mathrm{Re}\{ e^{i\phi} \omega(a^*_\downarrow a^*_\uparrow) \}. \tag{6.40}$$

From Sec. 6.1 it is straightforward to check that

$$p_{\alpha,\phi}(\beta, \mu, \lambda, \gamma, h) := \lim_{N \to \infty} \frac{1}{\beta N} \ln \mathrm{Trace}\big( e^{-\beta H_{N,\alpha,\phi}} \big) = \sup_{\omega \in E^{S,+}_{U}} \big\{ -e(\omega) + \beta^{-1} \tilde{S}(\omega) + 2\alpha\, \mathrm{Re}\{ e^{i\phi} \omega(a^*_\downarrow a^*_\uparrow) \} \big\}, \tag{6.41}$$


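with the Hamiltonian $H_{N,\alpha,\phi}$ defined in (3.1). Moreover, any weak$^*$-limit $\omega_{\infty,\alpha,\phi}$ of local Gibbs states

$$\omega_{N,\alpha,\phi}(\cdot) := \frac{\mathrm{Trace}(\,\cdot\; e^{-\beta H_{N,\alpha,\phi}})}{\mathrm{Trace}(e^{-\beta H_{N,\alpha,\phi}})} \tag{6.42}$$

is an equilibrium state (see the proof of Lemma 6.5 applied to $H_{N,\alpha,\phi}$), i.e. the state $\omega_{\infty,\alpha,\phi}$ belongs to the (non-empty) convex set $\Omega_{\beta,\alpha,\phi} = \Omega_{\beta,\alpha,\phi}(\mu, \lambda, \gamma, h)$ of maximizers of (6.40) at fixed $\alpha \geq 0$ and $\phi \in [0, 2\pi)$. In fact, one gets the following statement, which implies Theorem 3.2.

Theorem 6.2 (Breakdown of the $U(1)$-Gauge Symmetry). Take $\beta, \gamma > 0$ and real numbers $\mu, \lambda, h$ away from any critical point. Then at fixed phase $\phi \in [0, 2\pi)$,

$$\lim_{\alpha \downarrow 0} \lim_{N \to \infty} \frac{1}{N} \sum_{l=1}^{N} \omega_{N,\alpha,\phi}(a_{\kappa(l),\downarrow} a_{\kappa(l),\uparrow}) = \lim_{\alpha \downarrow 0} \omega_{\infty,\alpha,\phi}(a_{\kappa(1),\downarrow} a_{\kappa(1),\uparrow}) = r_\beta^{1/2} e^{i\phi},$$

with $\omega_{\infty,\alpha,\phi} \in \Omega_{\beta,\alpha,\phi}$ being the unique maximizer of (6.40) for sufficiently small $\alpha \geq 0$.

Proof. First we need to characterize pure states of $\Omega_{\beta,\alpha,\phi}$ as it is done in Corollary 6.1 for $\alpha = 0$. By convexity and upper semicontinuity, note that maximizers of (6.40) are taken on the set of extreme states whereas the set of maximizers is a face. Since extreme states are product states (cf. Lemmas 6.6–6.8), we get that

$$\sup_{\omega \in E^{S,+}_{U}} \big\{ -e(\omega) + \beta^{-1} \tilde{S}(\omega) + 2\alpha\, \mathrm{Re}\{ e^{i\phi} \omega(a^*_\downarrow a^*_\uparrow) \} \big\} = \sup_{c \in \mathbb{C}} \{ -\gamma |c|^2 + p(c + \alpha \gamma^{-1} e^{i\phi}) \}, \tag{6.43}$$

as in the case $\alpha = 0$ (see (2.3) for the definition of $p(c)$). If $c_{\beta,\alpha,\phi} = c_{\beta,\alpha,\phi}(\mu, \lambda, \gamma, h) \in \mathbb{C}$ is a maximizer of

$$-\gamma |c|^2 + p(c + \alpha \gamma^{-1} e^{i\phi}), \tag{6.44}$$

then observe that $z_{\beta,\alpha,\phi} := c_{\beta,\alpha,\phi} + \alpha \gamma^{-1} e^{i\phi}$ maximizes the function $-\gamma |z - \alpha \gamma^{-1} e^{i\phi}|^2 + p(z)$ of the complex variable $z \in \mathbb{C}$. By gauge invariance of the map $z \mapsto p(\beta, \mu, \lambda, h; z)$, it follows that $z_{\beta,\alpha,\phi} \in e^{i\phi}\mathbb{R}$ and thus $c_{\beta,\alpha,\phi} \in e^{i\phi}\mathbb{R}$. Using this, we extend Corollary 6.1 to $\alpha \geq 0$ and $\phi \in [0, 2\pi)$. In other words, for any $\beta, \gamma > 0$, $\alpha \geq 0$, $\phi \in [0, 2\pi)$ and $\mu, \lambda, h \in \mathbb{R}$, pure states of $\Omega_{\beta,\alpha,\phi}$ are product states $\omega_{\zeta_{c_{\beta,\alpha,\phi}}}$ satisfying the gap equation

$$\omega_{\zeta_{c_{\beta,\alpha,\phi}}}(a_{\kappa(l),\uparrow}, a_{\kappa(l),\downarrow}) = c_{\beta,\alpha,\phi}, \tag{6.45}$$

for any $l \in \mathbb{N}$ and with $c_{\beta,\alpha,\phi} \in e^{i\phi}\mathbb{R}$ being any maximizer of (6.44).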
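The passage from gauge invariance to $z_{\beta,\alpha,\phi} \in e^{i\phi}\mathbb{R}$ can be spelled out. The following is only a sketch, assuming that "gauge invariance" means that $p(z)$ depends on $z$ through $|z|$ alone; the shorthand $a := \alpha\gamma^{-1}$ is introduced here for illustration and does not appear in the original text.

```latex
% Sketch: expand the quadratic term, with the illustrative shorthand a := \alpha \gamma^{-1}.
\begin{align*}
-\gamma\, |z - a e^{i\phi}|^{2} + p(z)
  &= -\gamma |z|^{2} - \gamma a^{2} + 2\alpha\, \mathrm{Re}\{ z e^{-i\phi} \} + p(z) \\
  &\leq -\gamma |z|^{2} - \gamma a^{2} + 2\alpha |z| + p(|z| e^{i\phi}).
\end{align*}
% For \alpha > 0, equality holds if and only if z = |z| e^{i\phi}.
```

Under this assumption, at fixed modulus $|z|$ and $\alpha > 0$ the function is largest on the ray $e^{i\phi}[0, \infty) \subset e^{i\phi}\mathbb{R}$, which is consistent with the claim $z_{\beta,\alpha,\phi} \in e^{i\phi}\mathbb{R}$.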


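As $|c| \to \infty$, notice that $p(c) = O(|c|)$. So, by gauge invariance we obtain

$$\sup_{c \in \mathbb{C}} \{ -\gamma |c|^2 + p(c + \alpha \gamma^{-1} e^{i\phi}) \} = \max_{s \in [-M, M]} \{ -\gamma |s e^{i\phi}|^2 + p([s + \alpha \gamma^{-1}] e^{i\phi}) \} = \max_{s \in [-M, M]} \{ -\gamma s^2 + p(s + \alpha \gamma^{-1}) \},$$

for any $\alpha \in (0, 1)$ and $M < \infty$ sufficiently large. Consequently, if the parameters $\beta, \mu, \lambda, \gamma$, and $h$ are such that the maximizer $r_\beta$ (2.4) is unique, then the maximizer $c_{\beta,\alpha,\phi} \in e^{i\phi}\mathbb{R}$ of (6.44) is also unique as soon as $\alpha > 0$ is sufficiently small. Indeed the map $s \mapsto p(s)$ is continuous on the compact interval $[-M, M]$. In particular, from (6.45) there is a unique maximizer of (6.40), i.e.

$$\Omega_{\beta,\alpha,\phi} = \{ \omega_{\zeta_{c_{\beta,\alpha,\phi}}} \}. \tag{6.46}$$

Moreover, $c_{\beta,\alpha,\phi}$ converges to $r_\beta^{1/2} e^{i\phi}$ as $\alpha \to 0$. Therefore, it follows from (6.45) that

$$\lim_{\alpha \downarrow 0} \omega_{\zeta_{c_{\beta,\alpha,\phi}}}(a_{\kappa(l),\downarrow} a_{\kappa(l),\uparrow}) = r_\beta^{1/2} e^{i\phi} \tag{6.47}$$

for any $l \in \mathbb{N}$. By permutation invariance,

$$\frac{1}{N} \sum_{l=1}^{N} \omega_{N,\alpha,\phi}(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow}) = \omega_{N,\alpha,\phi}(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow}).$$

Now, let $\{N_j^{(1)}\}$ and $\{N_j^{(2)}\}$ be two subsequences in $\mathbb{N}$ such that

$$\lim_{j \to \infty} \omega_{N_j^{(1)},\alpha,\phi}(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow}) = \limsup_{N \to \infty} \omega_{N,\alpha,\phi}(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow}),$$

$$\lim_{j \to \infty} \omega_{N_j^{(2)},\alpha,\phi}(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow}) = \liminf_{N \to \infty} \omega_{N,\alpha,\phi}(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow}).$$

We can assume without loss of generality that $\omega_{N_j^{(2)}}$ and $\omega_{N_j^{(1)}}$ both converge with respect to the weak$^*$-topology as $j \to \infty$. Since any weak$^*$-limit $\omega_{\infty,\alpha,\phi}$ of local Gibbs states $\omega_{N,\alpha,\phi}$ (6.42) is an equilibrium state (see again the proof of Lemma 6.5), i.e. $\omega_{\infty,\alpha,\phi} \in \Omega_{\beta,\alpha,\phi}$, the theorem then follows from (6.46) and (6.47). Indeed, for any $\beta, \gamma > 0$ and $\mu, \lambda, h \in \mathbb{R}$ away from any critical point, the sequence $\omega_{N,\alpha,\phi}$ of local Gibbs states converges towards $\omega_{\infty,\alpha,\phi} = \omega_{\zeta_{c_{\beta,\alpha,\phi}}}$ in the weak$^*$-topology as soon as $\alpha \geq 0$ is sufficiently small.

From Corollary 6.1 note that the expectation values of Cooper fields

$$\Phi_{\kappa(l)} := a^*_{\kappa(l),\downarrow} a^*_{\kappa(l),\uparrow} + a_{\kappa(l),\uparrow} a_{\kappa(l),\downarrow}, \qquad \Psi_{\kappa(l)} := i\big( a^*_{\kappa(l),\downarrow} a^*_{\kappa(l),\uparrow} - a_{\kappa(l),\uparrow} a_{\kappa(l),\downarrow} \big) \tag{6.48}$$

are

$$\omega_{\zeta_{c_\beta}}(\Phi_{\kappa(l)}) = 2\, \mathrm{Re}\{c_\beta\} \quad \text{and} \quad \omega_{\zeta_{c_\beta}}(\Psi_{\kappa(l)}) = 2\, \mathrm{Im}\{c_\beta\} \tag{6.49}$$

for any pure state $\omega_{\zeta_{c_\beta}}$ of $\Omega_\beta$ and $l \in \mathbb{N}$, where we recall that $c_\beta := r_\beta^{1/2} e^{i\phi}$ is some maximizer of the first variational problem given in Theorem 2.1. In particular,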


$\omega(\Phi_{\kappa(l)}) \neq 0$ or $\omega(\Psi_{\kappa(l)}) \neq 0$ for any pure state $\omega \in \Omega_\beta$ is a manifestation of the breakdown of the $U(1)$-gauge symmetry. Unfortunately, the operators $\Phi_{\kappa(l)}$ and $\Psi_{\kappa(l)}$ do not correspond to any experiment, as they are not gauge invariant. More generally, experiments only "see" the restriction of states $\omega_{\zeta_{c_\beta}}$ to the subalgebra of gauge invariant elements. Consequently, the next step is to prove the so-called off diagonal long range order (ODLRO) property proposed by Yang [38] to define the superconducting phase. Indeed, one detects the presence of $U(1)$-gauge symmetry breaking by considering the asymptotics, as $|l - m| \to \infty$, of the ($U(1)$-gauge symmetric) Cooper pair correlation function

$$G_\omega(l, m) := \omega\big( a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow} \big) \tag{6.50}$$

associated with some state $\omega$. In particular, if $G_\omega(l, m)$ converges to some fixed non-zero value whenever $|l - m| \to \infty$, the state $\omega$ shows off diagonal long range order (ODLRO). This property can directly be analyzed for equilibrium states from our next statement.

Theorem 6.3 (Cooper Pair Correlation Function). For any $\beta, \gamma > 0$ and $\mu, \lambda, h \in \mathbb{R}$ away from any critical point, the Cooper pair correlation function $G_{\omega_N}(l, m)$ associated with the local Gibbs state $\omega_N$ converges for fixed $l \neq m$ towards

$$\lim_{N \to \infty} G_{\omega_N}(l, m) = G_\omega(l, m) = r_\beta,$$

for any equilibrium state $\omega \in \Omega_\beta$, and with $r_\beta$ being the solution of (2.4).

Proof. By similar arguments as in the proof of Theorem 6.2, if $G_\omega(l, m) = r_\beta$ for all equilibrium states $\omega$, then

$$\lim_{N \to \infty} G_{\omega_N}(l, m) = r_\beta.$$

By permutation invariance of $\omega \in \Omega_\beta$, note that

$$G_\omega(l, m) = G_\omega(1, 2) \tag{6.51}$$

for any $l \neq m$. If $\omega = \omega_{\zeta_{c_\beta}}$ is an extreme equilibrium state, then one clearly has

$$G_{\omega_{\zeta_{c_\beta}}}(1, 2) = \zeta_{c_\beta}(a^*_\uparrow a^*_\downarrow)\, \zeta_{c_\beta}(a_\downarrow a_\uparrow) = |c_\beta|^2 = r_\beta.$$

On the other hand, the set $\Omega_\beta$ of equilibrium states for fixed parameters $\beta, \gamma > 0$, and $\mu, \lambda, h \in \mathbb{R}$ is weak$^*$-compact. In particular, if $\omega \in \Omega_\beta$ is not extreme, the function $G_\omega(1, 2)$ is given, up to arbitrarily small errors, by convex sums of the form

$$\sum_{j=1}^{k} \lambda_j\, G_{\omega^{(j)}}(1, 2), \qquad \lambda_1, \ldots, \lambda_k \geq 0, \qquad \lambda_1 + \cdots + \lambda_k = 1, \tag{6.52}$$

where $\{\omega^{(j)}\}_{j=1,\ldots,k}$ are extreme equilibrium states. Since any weak$^*$-limit $\omega_\infty$ of local Gibbs states $\omega_N$ (1.6) is an equilibrium state (see proof of Lemma 6.5), the theorem is then a consequence of (6.51) and (6.52).
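The convexity step that closes this proof can be written out in one line. This is a sketch under the hypotheses of the theorem: away from critical points $r_\beta$ is unique, so every extreme equilibrium state $\omega^{(j)}$ appearing in (6.52) has $G_{\omega^{(j)}}(1,2) = r_\beta$.

```latex
\[
\sum_{j=1}^{k} \lambda_j\, G_{\omega^{(j)}}(1,2)
  \;=\; \sum_{j=1}^{k} \lambda_j\, r_\beta
  \;=\; r_\beta ,
\qquad \text{so that} \qquad
G_\omega(l,m) \;=\; G_\omega(1,2) \;=\; r_\beta \quad (l \neq m).
\]
```

The value $r_\beta$ is therefore insensitive to the weights $\lambda_j$, which is why the arbitrarily small approximation errors mentioned before (6.52) do not affect the limit.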


Since

$$\frac{1}{N^2} \sum_{l,m=1}^{N} \omega_N\big( a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow} \big) = \frac{N(N-1)}{N^2}\, \omega_N\big( a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow} a_{\kappa(2),\downarrow} a_{\kappa(2),\uparrow} \big) + O(N^{-1}),$$

note that this theorem implies Theorem 3.1. Therefore, away from any critical point, if an equilibrium state shows ODLRO then all pure equilibrium states break the $U(1)$-gauge symmetry. Conversely, if all pure equilibrium states break the $U(1)$-gauge symmetry, then all equilibrium states show ODLRO. This is due to the fact that the order parameter $r_\beta$ is unique away from any critical point. In particular, from Sec. 7, at sufficiently small inverse temperature $\beta$ there is no ODLRO and $\Omega_\beta = \{\omega_{\zeta_0}\}$, whereas for sufficiently large $\beta$ and $\gamma$ all equilibrium states show ODLRO.

For any $\beta, \gamma > 0$ and real numbers $\mu, \lambda, h$ at some critical point, this property is not satisfied in general. There are indeed cases where the phase transition is of first order, cf. Fig. 3. In this situation, $0$ and some $r_\beta > 0$ are maximizers at the same time, and hence there are some equilibrium states breaking the $U(1)$-gauge symmetry and other equilibrium states which do not show ODLRO.

Observe now that the superconducting phase is not only characterized by ODLRO and the breakdown of the $U(1)$-gauge symmetry. Indeed, the two-point correlation function determines its type: s-wave, d-wave, p-wave, etc. In fact, for any extreme equilibrium state $\omega = \omega_{\zeta_{c_\beta}}$, $x, y \in \mathbb{Z}^d$ and $s_1, s_2 \in \{\uparrow, \downarrow\}$, one clearly has

$$\omega_{\zeta_{c_\beta}}(a_{x,s_1} a_{y,s_2}) = \begin{cases} \zeta_{c_\beta}(a_{x,s_1})\, \zeta_{c_\beta}(a_{y,s_2}) & \text{if } x \neq y \\ \zeta_{c_\beta}(a_{x,s_1} a_{x,s_2}) & \text{if } x = y \end{cases} \;=\; \begin{cases} 0 & \text{if } x \neq y, \\ 0 & \text{if } x = y,\ s_1 = s_2, \\ c_\beta & \text{if } x = y,\ s_1 \neq s_2. \end{cases}$$

As a consequence, for any equilibrium state $\omega \in \Omega_\beta$, we have $\omega(a_{x,s_1} a_{y,s_2}) = \omega(a_{0,s_1} a_{0,s_2})\, \delta_{x,y}$ and we obtain an s-wave superconducting phase. In particular, Theorem 3.3 is a simple consequence of these last equalities combined with (6.46), (6.47) and the fact that any weak$^*$-limit $\omega_{\infty,\alpha,\phi} \in \Omega_{\beta,\alpha,\phi}$ of local Gibbs states $\omega_{N,\alpha,\phi}$ (6.42) is an equilibrium state (see again the proof of Lemma 6.5).

Now we would like to pursue this analysis of equilibrium states by showing that their definition is in accordance with results of Theorems 3.4–3.6. This statement is given in the next theorem.

Theorem 6.4 (Uniqueness of Densities for Equilibrium States). Take $\beta, \gamma > 0$ and real numbers $\mu, \lambda, h$ away from any critical point. Then, for any


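equilibrium state $\omega \in \Omega_\beta$ and $l \in \mathbb{N}$, all densities are uniquely defined:

(i) The electron density is equal to
$$\lim_{N \to \infty} \frac{1}{N} \sum_{l'=1}^{N} \omega_N(n_{\kappa(l'),\uparrow} + n_{\kappa(l'),\downarrow}) = \omega(n_{\kappa(l),\uparrow} + n_{\kappa(l),\downarrow}) = d_\beta,$$
cf. Theorem 3.4.

(ii) The magnetization density is equal to
$$\lim_{N \to \infty} \frac{1}{N} \sum_{l'=1}^{N} \omega_N(n_{\kappa(l'),\uparrow} - n_{\kappa(l'),\downarrow}) = \omega(n_{\kappa(l),\uparrow} - n_{\kappa(l),\downarrow}) = m_\beta,$$
cf. Theorem 3.5.

(iii) The Coulomb correlation density is equal to
$$\lim_{N \to \infty} \frac{1}{N} \sum_{l'=1}^{N} \omega_N(n_{\kappa(l'),\uparrow}\, n_{\kappa(l'),\downarrow}) = \omega(n_{\kappa(l),\uparrow}\, n_{\kappa(l),\downarrow}) = w_\beta,$$
cf. Theorem 3.6.

Proof. Suppose first that $\omega \in \Omega_\beta$ is pure. Then, from Corollary 6.1 it follows that

$$\omega(n_{\kappa(l),\uparrow} + n_{\kappa(l),\downarrow}) = \omega_{\zeta_{c_\beta}}(n_{\kappa(l),\uparrow} + n_{\kappa(l),\downarrow}),$$

with $c_\beta = r_\beta^{1/2} e^{i\phi}$ for some $\phi \in [0, 2\pi)$. Thus, by using the gauge invariance of the map $c \mapsto p(c)$ we directly get

$$\omega(n_{\kappa(l),\uparrow} + n_{\kappa(l),\downarrow}) = \partial_\mu p(\beta, \mu, \lambda, \gamma, h; c_\beta) = \partial_\mu p(\beta, \mu, \lambda, \gamma, h; r_\beta^{1/2}) = d_\beta. \tag{6.53}$$

At fixed parameters $\beta, \gamma > 0$, $\mu, \lambda, h \in \mathbb{R}$, recall that the set $\Omega_\beta$ of equilibrium states is weak$^*$-compact. In particular, if $\omega \in \Omega_\beta$ is not pure, it is the weak$^*$-limit of convex combinations of pure states. Therefore, we obtain (6.53) for any $\omega \in \Omega_\beta$. Similarly one gets

$$\omega(n_{\kappa(l),\uparrow} - n_{\kappa(l),\downarrow}) = m_\beta \quad \text{and} \quad \omega(n_{\kappa(l),\uparrow}\, n_{\kappa(l),\downarrow}) = w_\beta, \tag{6.54}$$

for any equilibrium state $\omega \in \Omega_\beta$ and $l \in \mathbb{N}$. Moreover, since any weak$^*$-limit $\omega_\infty$ of local Gibbs states $\omega_N$ (1.6) is an equilibrium state, i.e. $\omega_\infty \in \Omega_\beta$, we therefore deduce from (6.53) and (6.54), exactly as in the proof of Theorem 6.2, the existence of the limits in the statements (i)–(iii).

Observe that the weak$^*$-limit $\omega_\infty \in \Omega_\beta$ of local Gibbs states $\omega_N$ (1.6) can easily be performed, even at critical points, by using the decomposition theory for states [32]:

Theorem 6.5 (Asymptotics of the Local Gibbs State $\omega_N$ as $N \to \infty$). Recall that for any $\phi \in [0, 2\pi)$, $c_\beta := r_\beta^{1/2} e^{i\phi}$ is a maximizer of the first variational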


problem given in Theorem 2.1, whereas the states $\zeta_c$ and $\omega_\zeta$ are respectively defined by (6.1) and (6.7). Take any $\beta, \gamma > 0$, $\mu, \lambda, h \in \mathbb{R}$, and let $N \to \infty$.

(i) Away from any critical point, the local Gibbs state $\omega_N$ converges in the weak$^*$-topology towards the equilibrium state
$$\omega_\infty(\cdot) = \frac{1}{2\pi} \int_0^{2\pi} \omega_{\zeta_{c_\beta}}(\cdot)\, d\phi. \tag{6.55}$$

(ii) For each weak$^*$ limit point $\omega_\infty$ of local Gibbs states $\omega_N$ with parameters $(\beta_N, \gamma_N, \mu_N, \lambda_N, h_N)$ converging to any critical point $(\beta, \gamma, \mu, \lambda, h) \in \partial S$ (2.7), there is $\tau \in [0, 1]$ such that
$$\omega_\infty(\cdot) = (1 - \tau)\, \omega_{\zeta_0}(\cdot) + \frac{\tau}{2\pi} \int_0^{2\pi} \omega_{\zeta_{c_\beta}}(\cdot)\, d\phi.$$

Proof. By $U(1)$-gauge symmetry of the Hamiltonians $H_N$ (1.2) recall that any weak$^*$-limit $\omega_\infty$ of local Gibbs states $\omega_N$ (1.6) is a $U(1)$-invariant equilibrium state. So, in order to prove the first part of the theorem it suffices to show that the equilibrium state given in (i) is the unique $U(1)$-invariant state in $\Omega_\beta$. If the solution $r_\beta$ of (2.4) is zero, then this follows immediately from Corollary 6.1.

Let $r_\beta > 0$ be the unique maximizer of (2.4), i.e. $c_\beta := r_\beta^{1/2} e^{i\phi} \neq 0$ for any $\phi \in [0, 2\pi)$. Let $\partial\Omega_\beta = \{\omega_\zeta : \zeta \in G_\beta\}$ be the set of all extreme states of $\Omega_\beta$, see (6.35) for the definition of the set $G_\beta$ of one-site equilibrium states. Observe that the closed convex hull of $\partial\Omega_\beta$ is precisely $\Omega_\beta$ and that $\partial\Omega_\beta$ is the image of the torus $[0, 2\pi)$ under the continuous map $\phi \mapsto \omega_{\zeta_{c_\beta}}$, with $c_\beta := r_\beta^{1/2} e^{i\phi}$. This last map defines a homeomorphism between the torus and $\partial\Omega_\beta$. In particular, the set $\partial\Omega_\beta$ is compact and for each equilibrium state $\omega \in \Omega_\beta$ there is a uniquely defined probability measure $d\hat{m}_\omega$ on the torus such that

$$\omega(A) = \int_0^{2\pi} \omega_{\zeta_{c_\beta}}(A)\, d\hat{m}_\omega(\phi), \quad \text{for all } A \in U. \tag{6.56}$$

See, e.g., [41, Proposition 1.2]. By $U(1)$-invariance of $\omega_\infty$, for any $n \in \mathbb{N}$ one has from (6.56) that

$$\omega_\infty\Big( \prod_{l=1}^{n} a_{\kappa(l),\uparrow} a_{\kappa(l),\downarrow} \Big) = r_\beta^{n/2} \int_0^{2\pi} e^{in\phi}\, d\hat{m}_{\omega_\infty}(\phi) = 0.$$

Therefore, if $r_\beta > 0$, there is a unique probability measure allowing the $U(1)$-gauge symmetry of $\omega_\infty$: $d\hat{m}_{\omega_\infty}(\phi)$ must be the uniform probability measure on $[0, 2\pi)$.

From Lemma 7.1 the cardinality of the set of maximizers of (2.4) is at most 2. Indeed, away from any critical point, it is 1 whereas at a critical point it can be either 1 (second order phase transition) or 2 (first order phase transition). For


more details, see Sec. 7. In both cases, we can use the same arguments as above. By similar estimates as in the proof of Lemma 6.5 it immediately follows that all limit points of the Gibbs states $\omega_N$ with parameters $(\beta_N, \gamma_N, \mu_N, \lambda_N, h_N)$ converging to $(\beta, \gamma, \mu, \lambda, h) \in \partial S$ as $N \to \infty$ belong to $\Omega_\beta = \Omega_\beta(\mu, \lambda, \gamma, h)$. Since the set of all $U(1)$-invariant equilibrium states from $\Omega_\beta$ is $\{\omega^{(\tau)} : \tau \in [0, 1]\}$ with

$$\omega^{(\tau)}(\cdot) := (1 - \tau)\, \omega_{\zeta_0}(\cdot) + \frac{\tau}{2\pi} \int_0^{2\pi} \omega_{\zeta_{c_\beta}}(\cdot)\, d\phi, \tag{6.57}$$

we obtain the second statement (ii).

This theorem is a generalization of results obtained for the strong coupling BCS model$^{u}$ [7]. Note however, that Thirring's analysis [7] of the asymptotics of local Gibbs states comes from explicit computations, whereas we use the structure of sets of states, as explained for instance in [33].

Observe that Theorem 4.1 is a simple consequence of Theorem 6.5. Indeed, assume for instance that the order parameter $r_\beta = r_\beta(\mu, \lambda, \gamma, h)$ and the electron density per site $d_\beta = d_\beta(\mu, \lambda, \gamma, h)$ jump respectively from $r_\beta^{-} = 0$ to $r_\beta^{+}$ and from $d_\beta^{-}$ to $d_\beta^{+}$ by crossing a critical chemical potential $\mu_\beta^{(c)}$ at fixed parameters $(\beta, \lambda, \gamma, h)$. An example of such behavior is given in Fig. 10 for an electron density smaller than one. If $\rho \in [d_\beta^{-}, d_\beta^{+}]$, then the unique solution $\mu_{N,\beta} = \mu_{N,\beta}(\rho, \lambda, \gamma, h)$ of (4.1) must converge towards $\mu_\beta^{(c)}$ as $N \to \infty$. Meanwhile, at fixed $(\beta, \mu_\beta^{(c)}, \lambda, \gamma, h)$,

$$\omega_{\zeta_0}(n_\uparrow + n_\downarrow) = d_\beta^{-} \quad \text{and} \quad \omega_{\zeta_{c_\beta^{+}}}(n_\uparrow + n_\downarrow) = d_\beta^{+},$$

with $c_\beta^{+} := (r_\beta^{+})^{1/2} e^{i\phi}$ and $\phi \in [0, 2\pi)$. Any weak$^*$-limit $\omega_\infty$ of local Gibbs states $\omega_N$ satisfies by construction $\omega_\infty(n_\uparrow + n_\downarrow) = \rho$ and has the form $\omega^{(\tau)}(\cdot)$ (6.57), by Theorem 6.5. Hence, the Gibbs state $\omega_N$ converges in the weak$^*$-topology towards $\omega^{(\tau_\rho)}(\cdot)$ with $\tau_\rho$ defined in Theorem 4.1. Indeed, the existence of the limits (i)–(iii) in Theorem 4.1 follows from the uniqueness of the limiting equilibrium state with fixed electron density $\rho \in [d_\beta^{-}, d_\beta^{+}]$.

We give now various important properties of densities in ground states, i.e. for $\beta = \infty$, which immediately follow from Theorem 6.4. Recall that the set $\Omega_\infty$ of ground states is the set of all weak$^*$ limit points as $n \to \infty$ of all equilibrium state sequences $\{\omega^{(n)}\}_{n \in \mathbb{N}}$ with diverging inverse temperature $\beta_n \to \infty$.

Take $\gamma > 0$ and parameters $\mu, \lambda, h$ such that $|\mu - \lambda| \neq \lambda + |h|$. Then the electron and Coulomb correlation densities equal, respectively,

$$d := \omega(n_{\kappa(l),\uparrow} + n_{\kappa(l),\downarrow}) = d_\infty \quad \text{and} \quad w := \omega(n_{\kappa(l),\uparrow}\, n_{\kappa(l),\downarrow}) = w_\infty, \tag{6.58}$$

for any ground state $\omega \in \Omega_\infty$ and $l \in \mathbb{N}$, cf. Corollaries 3.2 and 3.4.

$^{u}$ See (1.2) with $\lambda = 0$ and $h = 0$.


If additionally $\gamma > \Gamma_{|\mu - \lambda|, \lambda + |h|}$, we are in the superconducting phase for ground states, cf. Corollary 3.1. Indeed, for any $\varphi \in [0, 2\pi)$, there is a ground state $\omega \in \Omega_\infty$ such that for any $l \in \mathbb{N}$,

$$\omega(a_{\kappa(l),\downarrow} a_{\kappa(l),\uparrow}) = r_{\max}^{1/2} e^{i\varphi}.$$

In the superconducting phase, from Corollary 3.4 we observe that $d_\infty = 2 w_\infty$, whereas the magnetization density equals

$$m := \omega(n_{\kappa(l),\uparrow} - n_{\kappa(l),\downarrow}) = m_\infty = 0, \tag{6.59}$$

for any superconducting state $\omega \in \Omega_\infty$ and $l \in \mathbb{N}$. This is the Meißner effect, see Corollary 3.3. On the other hand, the Cauchy–Schwarz inequality for the states implies the inequalities

$$0 \leq \omega(n_{\kappa(l),\uparrow}\, n_{\kappa(l),\downarrow}) \leq \sqrt{\omega(n_{\kappa(l),\uparrow})}\, \sqrt{\omega(n_{\kappa(l),\downarrow})} \tag{6.60}$$

for any $l \in \mathbb{N}$ and $\omega \in E^{+}_{U}$. In fact, in the superconducting phase the second inequality of (6.60) is an equality for any $\omega \in \Omega_\infty$. Indeed, (6.59) and Corollary 3.4 yield

$$\omega(n_{\kappa(l),\uparrow}\, n_{\kappa(l),\downarrow}) = \omega(n_{\kappa(l),\uparrow}) = \omega(n_{\kappa(l),\downarrow}), \tag{6.61}$$

for any $\omega \in \Omega_\infty$ and $l \in \mathbb{N}$. It shows that 100% of electrons form Cooper pairs in superconducting ground states.

In the case where $h \neq 0$ with $\gamma > \Gamma_{|\mu - \lambda|, \lambda + |h|}$ and $|\mu - \lambda| \neq \lambda + |h|$, the density vector $(d, m, w)$ defined by (6.58) and (6.59) is also unique as in the superconducting phase. It equals $(d_\infty, m_\infty, w_\infty)$, see Corollaries 3.2–3.4. However, if $h = 0$ with $\gamma < \Gamma_{|\mu - \lambda|, \lambda}$, or $\gamma = \Gamma_{|\mu - \lambda|, \lambda + |h|}$, or $|\mu - \lambda| = \lambda + |h|$, then the density vector $(d, m, w)$ belongs, in general, to a non-trivial convex set. In other words, there are phase transitions involving these densities. In particular, even in the case $h = 0$ where the Hamiltonian $H_N$ (1.2) is spin invariant, there are ground states breaking the spin $SU(2)$-symmetry. For instance, take $\beta, \gamma > 0$ and parameters $\mu, \lambda$ such that $|\mu - \lambda| < \lambda$ and $\gamma < \Gamma_{|\mu - \lambda|, \lambda}$. Then for any $\omega \in \Omega_\infty$ and $l \in \mathbb{N}$, the electron density equals $d = d_\infty = 1$, whereas the Coulomb correlation density is $w = w_\infty = 0$. In particular, the first inequality of (6.60) is an equality, showing that 0% of electrons form Cooper pairs. But, even if the magnetic field vanishes, i.e. $h = 0$, for any $x \in (-1, 1)$ there exists a ground state $\omega^{(x)} \in \Omega_\infty$ with magnetization density $m = x$ (see (6.59) for the definition of $m$).

Therefore, all the thermodynamics of the strong coupling BCS-Hubbard model discussed in Secs. 3.1–3.5 is encoded in the notion of equilibrium and ground states $\omega \in \Omega_\beta$ with $\beta \in (0, \infty]$. However, there is still an important open question related to the thermodynamics of this model. It concerns the problem of fluctuations of the Cooper pair condensate density (Theorem 3.1) or Cooper fields $\Phi_{\kappa(l)}$ and $\Psi_{\kappa(l)}$ (6.48) as a function of the temperature. Unfortunately, no results in that direction are known as far as the thermodynamic limit is concerned. We prove however a


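simple statement about fluctuations of Cooper fields for pure states from $\Omega_\beta$ in the limit $\gamma\beta \to \infty$.

Theorem 6.6 (Fluctuations of Cooper Fields). Take $\beta, \gamma > 0$ and real numbers $\mu, \lambda, h$ away from any critical point. Then, for any pure state $\omega_{\zeta_{c_\beta}} \in \Omega_\beta$ and $l \in \mathbb{N}$, the fluctuations of Cooper fields $\Phi_{\kappa(l)}$ and $\Psi_{\kappa(l)}$ (6.48) are bounded by

$$0 \leq \omega_{\zeta_{c_\beta}}\big( \{ \Phi_{\kappa(l)} - \omega_{\zeta_{c_\beta}}(\Phi_{\kappa(l)}) \}^2 \big) \leq 2 \gamma^{-1} \beta^{-1},$$

$$0 \leq \omega_{\zeta_{c_\beta}}\big( \{ \Psi_{\kappa(l)} - \omega_{\zeta_{c_\beta}}(\Psi_{\kappa(l)}) \}^2 \big) \leq 2 \gamma^{-1} \beta^{-1},$$

i.e. they vanish in the limit $\gamma\beta \to \infty$.

Proof. Recall that properties of pure states are characterized in Corollary 6.1, i.e. they are product states $\omega_{\zeta_{c_\beta}}$ with the one-site state $\zeta_{c_\beta}$ being defined in (6.1). In particular, they satisfy (6.49). Now, to avoid triviality, assume that $c_\beta := r_\beta^{1/2} e^{i\phi} \neq 0$ and let $f(\tau)$ be the function defined for any $\tau \in \mathbb{R}$ by

$$f(\tau) := -\gamma |c_\beta + \tau|^2 + p(c_\beta + \tau).$$

Since $c_\beta \neq 0$ is a maximizer of the function $-\gamma |c|^2 + p(c)$ of $c \in \mathbb{C}$, one has $\partial_\tau^2 f(0) \leq 0$, i.e. $\partial_\tau^2 p(c_\beta + \tau)|_{\tau = 0} \leq 2\gamma$. From straightforward computations, observe that $p(c_\beta + \tau)$ is a convex function of $\tau \in \mathbb{R}$ with

$$\beta^{-1} \gamma^{-2} \{ \partial_\tau^2 p(c_\beta + \tau) \}|_{\tau = 0} = \omega_{\zeta_{c_\beta}}\big( \{ \Phi_{\kappa(l)} - \omega_{\zeta_{c_\beta}}(\Phi_{\kappa(l)}) \}^2 \big) \geq 0.$$

From this last equality combined with $\{ \partial_\tau^2 p(c_\beta + \tau) \}|_{\tau = 0} \leq 2\gamma$, we deduce the theorem for $\Phi_{\kappa(l)}$. Moreover, from similar arguments using the function $\hat{f}(\tau) := f(i\tau)$ instead of $f$, the fluctuations of the Cooper field $\Psi_{\kappa(l)}$ are also bounded by $2 \gamma^{-1} \beta^{-1}$.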

From Theorem 6.6, note that Cooper fields are c-numbers in the corresponding GNS-representation [32] of pure ground states defined as weak$^*$-limits of pure equilibrium states:

Corollary 6.2 (Cooper Fields for Pure Ground States). Let $\omega \in \Omega_\infty$ be any weak$^*$-limit of pure equilibrium states and let $(\psi, \pi, H)$ be the corresponding GNS-representation of $\omega$ on bounded operators on the Hilbert space $H$ with cyclic vacuum $\psi$. Then $\omega$ is pure and for any $l \in \mathbb{N}$,

$$\pi(\Phi_{\kappa(l)}) = \omega(\Phi_{\kappa(l)})\, I_H \quad \text{and} \quad \pi(\Psi_{\kappa(l)}) = \omega(\Psi_{\kappa(l)})\, I_H.$$

Proof. A pure equilibrium state is a product state (6.7) and any weak$^*$-limit of product states in $E^{S,+}_{U}$ is also a product state. Thus, by Lemma 6.6, any ground state $\omega \in \Omega_\infty$ defined as the weak$^*$-limit of pure equilibrium states is extreme in $E^{S,+}_{U}$ and hence extreme in $\Omega_\infty$. Clearly, for such a ground state, $\pi(\omega(\Phi_{\kappa(l)})) = \omega(\Phi_{\kappa(l)})\, I_H$ for any $l \in \mathbb{N}$. Let $\tilde{\Phi} := \Phi_{\kappa(l)} - \omega(\Phi_{\kappa(l)})$. From Theorem 6.6 combined with the


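Cauchy–Schwarz inequality we obtain for any $A \in U$ that

$$\| \pi(\tilde{\Phi})\, \pi(A)\, \psi \|_H^2 = \omega(A^* \tilde{\Phi} \tilde{\Phi} A) \leq \|A\| \sqrt{ \omega\big( \tilde{\Phi} ( \tilde{\Phi} A A^* \tilde{\Phi}^2 ) \big) } \leq \|A\|^2\, \|\tilde{\Phi}\|^{3/2}\, [\omega(\tilde{\Phi}^2)]^{1/4} = 0.$$

From the cyclicity of $\psi$, it follows that $\pi(\Phi_{\kappa(l)}) = \omega(\Phi_{\kappa(l)})\, I_H$. The proof of $\pi(\Psi_{\kappa(l)}) = \omega(\Psi_{\kappa(l)})\, I_H$ is performed in the same way. We omit the details.

In particular, for such pure ground states $\omega$ in $\Omega_\infty$, correlation functions can explicitly be computed at any order in Cooper fields. For instance, for all $N \in \mathbb{N}$, all $k_j, l_j \in \mathbb{N}$, $m_j, n_j \in \mathbb{N}_0$, $j = 1, \ldots, N$, and any $A_n \in U$, $n = 1, \ldots, N + 1$, one has

$$\omega\big( A_1 \Phi_{\kappa(k_1)}^{m_1} \Psi_{\kappa(l_1)}^{n_1} A_2 \cdots A_N \Phi_{\kappa(k_N)}^{m_N} \Psi_{\kappa(l_N)}^{n_N} A_{N+1} \big) = \omega(\Phi_{\kappa(k_1)}^{m_1})\, \omega(\Psi_{\kappa(l_1)}^{n_1}) \cdots \omega(\Phi_{\kappa(k_N)}^{m_N})\, \omega(\Psi_{\kappa(l_N)}^{n_N})\, \omega(A_1 \cdots A_{N+1}).$$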
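The factorization of correlation functions can be traced back to Corollary 6.2 in the simplest case of a single Cooper field. The lines below are a sketch for one factor $\Phi_{\kappa(k)}$ between two elements $A_1, A_2 \in U$ (the index $k$ is chosen here only for illustration); the general formula would follow by repeating the same step.

```latex
\begin{align*}
\omega(A_1 \Phi_{\kappa(k)} A_2)
  &= \langle \psi,\; \pi(A_1)\, \pi(\Phi_{\kappa(k)})\, \pi(A_2)\, \psi \rangle_{H} \\
  &= \omega(\Phi_{\kappa(k)})\, \langle \psi,\; \pi(A_1 A_2)\, \psi \rangle_{H}
   \;=\; \omega(\Phi_{\kappa(k)})\, \omega(A_1 A_2).
\end{align*}
```

The second equality uses only $\pi(\Phi_{\kappa(k)}) = \omega(\Phi_{\kappa(k)})\, I_H$; the same identity gives $\omega(\Phi_{\kappa(k)}^{m}) = \omega(\Phi_{\kappa(k)})^{m}$ for powers.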

7. Analysis of the Variational Problem

The variational problem (2.4) is quite explicit but for the reader's convenience, we collect here some properties of its solution $r_\beta$ with respect to $\beta, \gamma > 0$ and $\mu, \lambda, h \in \mathbb{R}$. We show in particular that $r_\beta > 0$ exists in a non-empty domain of $(\beta, \gamma, \mu, \lambda, h)$ with some monotonicity properties, as well as the existence of both first and second order phase transitions. We conclude this section by giving the asymptotics of $r_\beta$ as $\beta \to \infty$, i.e. by proving Corollary 3.1.

(1) We start by showing that $r_\beta = 0$ for sufficiently small inverse temperatures $\beta$ at fixed $\gamma$, $\mu$, $\lambda$ and $h$. Indeed, for any $r \geq 0$ one computes that

$$\partial_r f(r) = \gamma \left( \frac{\gamma \sinh(\beta g_r)}{2 g_r \big( e^{\lambda\beta} \cosh(\beta h) + \cosh(\beta g_r) \big)} - 1 \right), \tag{7.1}$$

cf. Theorem 2.1. Direct estimations show that if $0 < \beta < 2\gamma^{-1}$, then $\partial_r f(r) < 0$ for any $r \geq 0$, i.e. $r_\beta = 0$.

(2) Fix now $\beta > 0$ and $\mu, \lambda, h \in \mathbb{R}$, then $r_\beta > 0$ for sufficiently large coupling constants $\gamma$. Indeed, for large enough $\gamma > 0$ there is, at least, one strictly positive solution $\tilde{r}_\beta > 0$ of (2.5). Since direct computations using again (2.5) imply that

$$\frac{d}{d\gamma} \big\{ f(\beta, \mu, \lambda, \gamma, h; \tilde{r}_\beta(\gamma)) - f(\beta, \mu, \lambda, \gamma, h; 0) \big\} = \tilde{r}_\beta(\gamma) > 0,$$

and

$$f(\beta, \mu, \lambda, \gamma, h; \tilde{r}_\beta) - f(\beta, \mu, \lambda, \gamma, h; 0) = O(\gamma) \quad \text{as } \gamma \to \infty,$$

for any fixed $\beta > 0$ and $\mu, \lambda, h \in \mathbb{R}$, there is a unique $\gamma_c > 2|\lambda - \mu|$ such that $f(\tilde{r}_\beta) > f(0)$, i.e. $r_\beta > 0$ for $\gamma > \gamma_c$. The domain of parameters $(\beta, \mu, \lambda, \gamma, h)$ where $r_\beta$ is strictly positive is therefore non-empty, cf. Figs. 3 and 4.
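Items (1) and (2) can be probed numerically from (7.1) alone. The sketch below is illustrative rather than part of the original argument: the function names `g`, `dfdr` and `is_decreasing` are hypothetical, and the formula $g_r = \{(\mu - \lambda)^2 + \gamma^2 r\}^{1/2}$ is an assumption, since $g_r$ is defined elsewhere in the paper (Theorem 2.1) and not in this section.

```python
import math


def g(r, mu, lam, gamma):
    # ASSUMPTION: g_r = sqrt((mu - lam)^2 + gamma^2 * r); g_r is not defined in this section.
    return math.sqrt((mu - lam) ** 2 + gamma ** 2 * r)


def dfdr(r, beta, gamma, mu, lam, h):
    """Right-hand side of (7.1), evaluated without overflowing sinh/cosh."""
    x = g(r, mu, lam, gamma)
    a = math.exp(lam * beta) * math.cosh(beta * h)
    if x < 1e-12:
        # Limit x -> 0 of sinh(beta x) / (x (a + cosh(beta x))).
        ratio = beta / (a + 1.0)
    else:
        e1 = math.exp(-beta * x)
        # sinh(bx) / (a + cosh(bx)) with numerator and denominator scaled by 2 exp(-bx).
        ratio = (1.0 - e1 * e1) / (2.0 * a * e1 + 1.0 + e1 * e1) / x
    return gamma * (gamma * ratio / 2.0 - 1.0)


def is_decreasing(beta, gamma, mu, lam, h, r_max=10.0, n=2001):
    """True if (7.1) is negative at every point of a uniform grid on [0, r_max]."""
    return all(
        dfdr(r_max * i / (n - 1), beta, gamma, mu, lam, h) < 0 for i in range(n)
    )


if __name__ == "__main__":
    # Item (1): beta < 2 / gamma, so f should be decreasing on the grid.
    print(is_decreasing(beta=0.5, gamma=1.0, mu=0.3, lam=0.1, h=0.2))
    # Item (2): large gamma, so f should increase near r = 0.
    print(dfdr(1e-6, beta=1.0, gamma=10.0, mu=0.0, lam=0.0, h=0.0) > 0)
```

A grid check of this kind only illustrates the statements for particular parameter values; it does not replace the estimate $\sinh(\beta g_r)/(g_r \cosh(\beta g_r)) \leq \beta$ that underlies item (1).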


(3) To get an intuitive idea of the behavior of the function $f(r)$ (cf. Theorem 2.1), we analyze the cardinality of the set $S$ of strictly positive solutions of the gap equation (2.5):

Lemma 7.1 (Cardinality of the Set S). If $\beta\gamma \leq 6$, the gap equation (2.5) has at most one strictly positive solution, whereas it has at most two strictly positive solutions when $\beta\gamma > 6$.

Proof. From (7.1), any strictly positive maximizer $r_\beta > 0$ of (2.4) is a solution of the equation

$$h_1(g_r) = 0, \quad \text{with} \quad h_1(x) := \frac{\gamma \sinh(\beta x)}{2x} - e^{\lambda\beta} \cosh(\beta h) - \cosh(\beta x). \qquad (7.2)$$

This last equation is equivalent to the gap equation (2.5). For any $x > 0$, observe that

$$\partial_x h_1(x) = \frac{\beta\gamma}{2x} \cosh(x\beta) - \left( \frac{\gamma}{2x^2} + \beta \right) \sinh(x\beta) = 0 \qquad (7.3)$$

if and only if

$$(2\beta^{-1}\gamma^{-1})^{1/2}\, y = \sqrt{\frac{y}{\tanh(y)} - 1} =: C(y), \qquad y = \beta x > 0. \qquad (7.4)$$
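As a numerical companion to (7.4), the sketch below counts the solutions of $(2\beta^{-1}\gamma^{-1})^{1/2} y = C(y)$ on a grid for one value of $\beta\gamma$ above 6 and one below, and estimates the slope of $C$ at the origin. The helper names, the grid and the two sample values of $\beta\gamma$ are arbitrary choices made for illustration.

```python
import math

def C(y):
    """C(y) = sqrt(y / tanh(y) - 1) from (7.4), for y > 0."""
    return math.sqrt(y / math.tanh(y) - 1.0)

def count_solutions(beta_gamma, y_max=50.0, steps=50000):
    """Count sign changes of C(y) - sqrt(2 / (beta*gamma)) * y on a grid in (0, y_max]."""
    slope = math.sqrt(2.0 / beta_gamma)
    count = 0
    previous = None
    for k in range(1, steps + 1):
        y = y_max * k / steps
        value = C(y) - slope * y
        if previous is not None and previous * value < 0:
            count += 1
        previous = value
    return count

# Slope of C near the origin, to be compared with (2/6)^(1/2).
print(C(1e-3) / 1e-3, (2.0 / 6.0) ** 0.5)

# One sample value on each side of the threshold beta * gamma = 6.
print(count_solutions(12.0), count_solutions(3.0))
```

A grid search can miss roots beyond `y_max` or between grid points, so this only illustrates the concavity argument that follows; it does not replace it.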

The map $y \mapsto C(y)$ is strictly concave for $y > 0$, $C(0) = 0$, and $\partial_y C(0) = (2/6)^{1/2}$. Therefore, if $\beta\gamma > 6$ there is a unique strictly positive solution $\tilde y = \beta \tilde x > 0$ of (7.4), and there is no strictly positive solution of (7.4) when $\beta\gamma \leq 6$. Since $h_1(0)$ could be negative in some cases and $h_1(x)$ diverges exponentially to $-\infty$ as $x \to \infty$, the cardinality of the set of strictly positive solutions of the gap equation (2.5) is at most two if $\beta\gamma > 6$, or at most one if $\beta\gamma \leq 6$.

Consequently, if the gap equation (2.5) has no solution, then $f(r)$ is strictly decreasing for any $r \geq 0$. If the gap equation (2.5) has a unique solution $r_\beta > 0$, the function $f(r)$ is increasing up to its (strictly positive) maximizer $r_\beta > 0$ and decreasing for $r \geq r_\beta$. Finally, when there are two strictly positive solutions of (2.5), the smaller one must be a local minimum whereas the larger one must be a local maximum. In this case the function $f(r)$ decreases for $r \geq 0$ until its local minimum, then increases until its local maximum, and finally decreases again to diverge towards $-\infty$. Note that none of these cases can be excluded, i.e. they all appear depending on $\beta, \gamma > 0$ and $\mu, \lambda, h \in \mathbb{R}$. See Figs. 3 and 18.

(4) We now study the dependence of $r_\beta > 0$ on each parameter. So, let us fix the parameters $\{\beta, \mu, \lambda, \gamma, h\} \setminus \{\nu\}$ with $\nu = \beta, \mu, \lambda, \gamma$, or $h$ and consider the function $\xi(r, \nu) := \partial_r f(r, \nu)$ for $r \geq 0$ and $\nu$ in the open set of definition of $f(r, \nu) = f(\beta, \mu, \lambda, \gamma, h; r)$, see (7.1). Recall that $r_\beta > 0$ is a solution at $\nu = \nu_0$ of the gap equation (2.5), i.e. $\xi(r_\beta, \nu_0) = 0$. Straightforward computations imply that

$$\partial_r^2 f(r) = \frac{\gamma^4 \beta}{4 g_r^2 (e^{\lambda\beta} \cosh(\beta h) + \cosh(\beta g_r))}\, h_2(g_r), \qquad (7.5)$$


[Figure: three panels, each plotting $f(r)$ against $r$.]

Fig. 18. Illustrations of the function $f(r)$ for $r \in [0, 1/4]$ at $(\mu, \gamma, h) = (1, 2.6, 0)$ with inverse temperatures $\beta = \beta_c - 0.3$ (orange line), $\beta = \beta_c$ (red line), $\beta = \beta_c + 0.5$ (blue line), and with coupling constants $\lambda = 0$ (left figure), $\lambda = 0.45$ (center figure) and $\lambda = 0.575$ (right figure). Here $\beta_c = \theta_c^{-1}$ is the critical inverse temperature which, from left to right, equals 2.04, 3.46 and 6.35, respectively. (Color online.)

for any $r > 0$ with

$$h_2(x) := \frac{e^{\lambda\beta} \cosh(\beta h) \cosh(\beta x) + 1}{e^{\lambda\beta} \cosh(\beta h) + \cosh(\beta x)} - \frac{\sinh(\beta x)}{\beta x}. \qquad (7.6)$$
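Formulas (7.5) and (7.6) can be cross-checked against (7.1) by differentiating the latter numerically. The sketch below assumes $g_r = \{(\mu - \lambda)^2 + \gamma^2 r\}^{1/2}$ (inferred from (7.12), not stated here); the function names and parameter values are illustrative.

```python
import math

def g(r, mu, lam, gamma):
    """Assumed form g_r = sqrt((mu - lam)^2 + gamma^2 r), inferred from (7.12)."""
    return math.sqrt((mu - lam) ** 2 + gamma ** 2 * r)

def df_dr(r, beta, mu, lam, gamma, h):
    """Right-hand side of (7.1)."""
    gr = g(r, mu, lam, gamma)
    a = math.exp(lam * beta) * math.cosh(beta * h)
    frac = gamma * math.sinh(beta * gr) / (2.0 * gr * (a + math.cosh(beta * gr)))
    return gamma * (frac - 1.0)

def h2(x, beta, lam, h):
    """The function h_2 of (7.6)."""
    a = math.exp(lam * beta) * math.cosh(beta * h)
    c = math.cosh(beta * x)
    return (a * c + 1.0) / (a + c) - math.sinh(beta * x) / (beta * x)

def d2f_dr2(r, beta, mu, lam, gamma, h):
    """Right-hand side of (7.5)."""
    gr = g(r, mu, lam, gamma)
    a = math.exp(lam * beta) * math.cosh(beta * h)
    return gamma ** 4 * beta * h2(gr, beta, lam, h) / (
        4.0 * gr ** 2 * (a + math.cosh(beta * gr)))

# Illustrative parameter values (not taken from the paper).
params = dict(beta=3.0, mu=1.0, lam=0.45, gamma=2.6, h=0.1)
r0, step = 0.1, 1e-6
numeric = (df_dr(r0 + step, **params) - df_dr(r0 - step, **params)) / (2.0 * step)
exact = d2f_dr2(r0, **params)
print(numeric, exact)

# With e^{lam*beta} cosh(beta*h) = 1 (here lam = h = 0), h_2 is negative on a sample grid.
concave_case = all(h2(0.1 * k, 2.0, 0.0, 0.0) < 0 for k in range(1, 50))
print(concave_case)
```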

It follows that there is at most one strictly positive solution $\tilde r \geq 0$ of $\partial_r \xi(r, \nu_0) = 0$ for each fixed set of parameters. For instance, if $e^{\lambda\beta} \cosh(\beta h) \leq 1$, then it is straightforward to check that $\partial_r \xi(r, \nu_0) < 0$ for any $r > 0$. In the situation where the gap equation (2.5) has two strictly positive solutions, $r_\beta > 0$ cannot solve $\partial_r \xi(r, \nu_0) = 0$, since in this case the equation $h_2(x) = 0$ would have at least two strictly positive solutions, as $r_\beta$ is a maximizer. Consequently, to simplify our study we restrict ourselves to the very large set of parameters where $\partial_r \xi(r_\beta, \nu_0) \neq 0$. In this case, the differential $d\xi$ has maximal rank at $(r_\beta, \nu_0)$ and, from the implicit function theorem, there are $\varepsilon > 0$ and a smooth and strictly positive function^v $r_\beta(\nu) > 0$ defined on the ball $B_\varepsilon(\nu_0)$ centered on the point $\nu_0$ and with radius $\varepsilon$ such that $\xi(r_\beta(\nu), \nu) = 0$ for any $\nu \in B_\varepsilon(\nu_0)$. By continuity of the function $\partial_r \xi$ we can choose $\varepsilon > 0$ such that $\partial_r \xi(r_\beta(\nu), \nu)$ does not change its sign for $\nu \in B_\varepsilon(\nu_0)$. Thus $r_\beta(\nu)$ describes the evolution of the solution of (2.4) for $\nu \in B_\varepsilon(\nu_0)$. If $r_\beta = r_\beta(\nu_0) > 0$ is the unique maximizer of (2.4) with $\partial_r \xi(r_\beta, \nu_0) \neq 0$, then the function $r_\beta(\nu)$ describes the smooth evolution of the Cooper pair condensate density with respect to small perturbations of $\nu_0$. Observe that

$$\partial_\nu \xi(r_\beta(\nu), \nu) = \{\partial_\nu r_\beta(\nu)\} \{\partial_r \xi(r, \nu)\}|_{r = r_\beta(\nu)} + \{\partial_\nu \xi(r, \nu)\}|_{r = r_\beta(\nu)} = 0$$

and $\{\partial_r \xi(r, \nu_0)\}|_{r = r_\beta(\nu_0)} < 0$ because $r_\beta$ is a maximizer. Consequently, one obtains

$$\mathrm{sgn}\{\partial_\nu r_\beta(\nu_0)\} = \mathrm{sgn}\{\{\partial_\nu \partial_r f(r, \nu_0)\}|_{r = r_\beta(\nu_0)}\}.$$

In other words, the function $r_\beta(\nu)$ of $\nu \in B_\varepsilon(\nu_0)$ is either increasing if $\{\partial_\nu \partial_r f(r, \nu_0)\}|_{r = r_\beta(\nu_0)} > 0$, or decreasing if $\{\partial_\nu \partial_r f(r, \nu_0)\}|_{r = r_\beta(\nu_0)} < 0$, as soon as $r_\beta > 0$ is the unique maximizer of (2.4) with $\partial_r \xi(r_\beta, \nu_0) \neq 0$.

^v If $\nu = \beta$, then of course $r_\beta(\nu) := r_\nu$.

(5) By applying this last result respectively to $\nu_0 = \gamma > \Gamma_{|\mu - \lambda|, \lambda + |h|}$ (Corollary 3.1) and $\nu_0 = h \in \mathbb{R}$, we obtain that $r_\beta > 0$ is an increasing function of $\gamma > 0$ and a decreasing function of $|h|$ because via (2.5) one has

$$\{\partial_\gamma \partial_r f(r, \gamma)\}|_{r = r_\beta} > 4\gamma^{-2} (\mu - \lambda)^2 \geq 0$$

at fixed parameters $(\beta, \mu, \lambda, h)$ and

$$\{\partial_h \partial_r f(r, h)\}|_{r = r_\beta} = - \frac{2 g_{r_\beta} \beta e^{\lambda\beta} \sinh(\beta h)}{\sinh(\beta g_{r_\beta})}$$

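at fixed $(\beta, \mu, \lambda, \gamma)$.

(6) If $\gamma > \Gamma_{|\mu - \lambda|, \lambda + |h|}$, for any fixed $(\beta, \gamma, \lambda, h)$ the order parameter $r_\beta > 0$ is a decreasing function of $|\mu - \lambda|$ under the condition that $e^{\lambda\beta} \cosh(\beta h) \leq 1$, as

$$\{\partial_\mu \partial_r f(r, \mu)\}|_{r = r_\beta} = \frac{\gamma^2 \beta (\mu - \lambda)}{2 g_{r_\beta}^2 (e^{\lambda\beta} \cosh(\beta h) + \cosh(\beta g_{r_\beta}))}\, h_2(g_{r_\beta}),$$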

cf. (7.6). If $e^{\lambda\beta} \cosh(\beta h) > 1$, the behavior of $r_\beta > 0$ is no longer monotone as a function of $|\mu - \lambda|$ ($\lambda$ being fixed), cf. Fig. 10. The behavior of $r_\beta$ as a function of $\lambda$ or $\beta$ is also not clear in general. But, at least as a function of the inverse temperature $\beta > 0$, we can give simple sufficient conditions for its monotonicity. Indeed, direct computations show that

$$\{\partial_\beta \partial_r f(r, \beta)\}|_{r = r_\beta} = (\gamma + 2\lambda) g_{r_\beta} \frac{\cosh(\beta g_{r_\beta})}{\sinh(\beta g_{r_\beta})} - (\lambda\gamma + 2 g_{r_\beta}^2) - 2 h g_{r_\beta} \frac{e^{\lambda\beta} \sinh(\beta h)}{\sinh(\beta g_{r_\beta})}.$$

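By combining this last equality with (2.5), we then get that

$$\{\partial_\beta \partial_r f(r, \beta)\}|_{r = r_\beta} \geq 0 \qquad (7.7)$$

with $r_\beta > 0$ if and only if

$$g_{r_\beta}^2 \leq \frac{\gamma (\gamma \cosh(\beta g_{r_\beta}) - 2 e^{\lambda\beta} \cosh(\beta h) (\lambda + h \tanh(\beta h)))}{4 (\cosh(\beta g_{r_\beta}) + e^{\lambda\beta} \cosh(\beta h))}. \qquad (7.8)$$

From (2.5) combined with $\tanh(x) < 1$, we also have

$$g_{r_\beta}^2 < \frac{\gamma^2 \cosh^2(\beta g_{r_\beta})}{4 (\cosh(\beta g_{r_\beta}) + e^{\lambda\beta} \cosh(\beta h))^2}. \qquad (7.9)$$

Therefore, a sufficient condition to satisfy the inequality (7.8) is obtained by bounding the right-hand side of (7.9) with the right-hand side of (7.8). From (2.5) this implies the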


condition $g_{r_\beta} \geq (\lambda + h \tanh(\beta h)) \tanh(\beta g_{r_\beta})$, under which $r_\beta$ is an increasing function of $\beta > 0$. This inequality is also equivalent to

$$g_{r_\beta} \leq \tanh(\beta g_{r_\beta}) \left( \frac{\gamma}{2} - (\lambda + h \tanh(\beta h)) \frac{e^{\lambda\beta} \cosh(\beta h)}{\cosh(\beta g_{r_\beta})} \right).$$

In particular, by using again the gap equation (2.5), if

$$\gamma > 2 (\lambda + h \tanh(\beta h)) \left( 1 + \frac{e^{\lambda\beta} \cosh(\beta h)}{\cosh(\beta g_{r_\beta})} \right),$$

then $r_\beta > 0$ is an increasing function of $\beta > 0$. Since $\tanh x \leq 1$, another sufficient condition to get (7.7) is $\lambda + |h| \leq g_{r_\beta}$. In particular, if $\lambda < |\mu - \lambda|$ and $\gamma > \Gamma_{|\mu - \lambda|, \lambda + |h|}$ with $h$ sufficiently small, then $r_\beta > 0$ is again an increasing function of $\beta > 0$. Therefore, the domain of $(\mu, \lambda, \gamma, h)$ where $r_\beta > 0$ is proven to be an increasing function of $\beta > 0$ is rather large. Actually, from a huge number of numerical computations, we conjecture that $r_\beta > 0$ is always an increasing function of $\beta > 0$. In other words, this conjecture implies that the condition expressed in Corollary 3.1 on $(\mu, \lambda, \gamma, h)$ should be necessary to obtain a superconductor at a fixed temperature.

(7) Observe that the order of the phase transition depends on the parameters. For instance, assume $\lambda \leq 0$, $h = 0$ and $\gamma > \Gamma_{|\mu - \lambda|, \lambda}$. Then, at any inverse temperature $\beta > 0$ it follows from (7.5) that $f(r)$ is a strictly concave function of $r > 0$. This property justifies the existence and uniqueness of the inverse temperature $\beta_c$ solving the equation

$$\frac{\tanh(\beta |\mu - \lambda|)}{|\mu - \lambda|} = \frac{2}{\gamma} \left( 1 + \frac{e^{\lambda\beta}}{\cosh(\beta |\mu - \lambda|)} \right),$$

i.e. (2.5) for $\lambda \leq 0$, $h = 0$ and $r = 0$. In particular, $\beta_c$ is such that the Cooper pair condensate density continuously goes from $r_\beta = 0$ for $\beta \leq \beta_c$ to $r_\beta > 0$ for $\beta > \beta_c$. In this case the superconducting phase transition is of second order, cf. Fig. 3.

The appearance of a first order phase transition at some fixed $(\mu, \lambda, \gamma, h)$ is also not surprising. Indeed, recall that the function $f(r)$ may have a local minimum and a local maximum, see the discussion below Lemma 7.1. For instance, assume now $\lambda = \mu > 0$, $h = 0$ and $4\lambda = \Gamma_{0, \lambda} < \gamma \leq 6\lambda$. Then, from (7.1) for $r = 0$,

$$\partial_r f(0) = \frac{\gamma\beta}{e^{\lambda\beta} + 1} \left( \frac{\gamma}{2} - \frac{e^{\lambda\beta} + 1}{\beta} \right).$$

Since by explicit computations

$$\min_{x > 0} \left\{ \frac{e^x + 1}{x} \right\} > 3,$$

it follows that $\partial_r f(0) < 0$ for any $\beta > 0$ whenever $\lambda = \mu > 0$, $h = 0$ and $0 < \gamma \leq 6\lambda$. Therefore, as soon as there is a superconducting phase transition, for instance if


$4\lambda < \gamma \leq 6\lambda$ (cf. Corollary 3.1), the function $r_\beta$ of $\beta > 0$ must be discontinuous at the critical point. This case is an example of a first order superconducting phase transition. Numerical illustrations of a similar first order phase transition are also given in Fig. 3.

(8) We conclude this section by a computation of the asymptotics of the order parameter $r_\beta$ as $\beta \to \infty$. We prove in particular Corollary 3.1. From (2.6), we already know that $r_\beta = 0$ for any $\gamma \leq 2|\tilde\mu_\lambda|$ with $\tilde\mu_\lambda := \mu - \lambda$. Therefore, we consider here that $\gamma > 2|\tilde\mu_\lambda|$ and we look for the domain where the parameter $r_\beta$ is strictly positive in the limit $\beta \to \infty$. Recall that $r_\beta$ is a solution of the variational problem (2.4), i.e.

$$\frac{1}{\beta} \ln 2 + \sup_{r \geq 0} f(r) = -\gamma r_\beta + \frac{1}{\beta} \ln\{ e^{\beta h} + e^{-\beta h} + e^{\beta (g_{r_\beta} - \lambda)} + e^{-\beta (g_{r_\beta} + \lambda)} \}. \qquad (7.10)$$

When $\beta \to \infty$ the last exponential term can always be neglected for our analysis since $g_{r_\beta} \geq 0$.

Now, assume first that $g_0 = |\tilde\mu_\lambda| > \lambda + |h|$. Then $g_r > \lambda + |h|$ for any $r \geq 0$ and when $\beta \to \infty$ the function $f(r)$ converges to $w(r) := -\gamma r + g_r - \lambda$. In particular, the order parameter $r_\beta$ converges towards the unique maximizer $r_{\max}$ (2.6) of the function $w(r)$ for $r \geq 0$, i.e.

$$r_\infty := \lim_{\beta \to \infty} r_\beta = r_{\max}, \qquad (7.11)$$

for any $\gamma > 2|\tilde\mu_\lambda|$ and real numbers $\mu, \lambda, h$ satisfying $|\tilde\mu_\lambda| > \lambda + |h|$.

Assume now that $|\tilde\mu_\lambda| \leq \lambda + |h|$ and let $r_{\min}$ be the solution of $g_r = \lambda + |h|$, i.e.

$$r_{\min} := \gamma^{-2} ((\lambda + |h|)^2 - \tilde\mu_\lambda^2) \geq 0. \qquad (7.12)$$
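The limit (7.11) and the definition (7.12) lend themselves to a quick numerical illustration. The sketch below assumes, as elsewhere in this section, that $g_r = \{\tilde\mu_\lambda^2 + \gamma^2 r\}^{1/2}$ and that $f$ has the form appearing in (7.10) up to a constant. Since (2.6) is not reproduced here, $r_{\max}$ is computed from the stationarity condition $g_r = \gamma/2$ of $w$, which is an assumption of this sketch; all names and parameter values are illustrative.

```python
import math

def g(r, mu, lam, gamma):
    """Assumed form g_r = sqrt((mu - lam)^2 + gamma^2 r)."""
    return math.sqrt((mu - lam) ** 2 + gamma ** 2 * r)

def f(r, beta, mu, lam, gamma, h):
    """Assumed f(r) from (7.10), up to a constant, via a stable log-sum-exp."""
    gr = g(r, mu, lam, gamma)
    exponents = [beta * h, -beta * h, beta * (gr - lam), -beta * (gr + lam)]
    top = max(exponents)
    log_sum = top + math.log(sum(math.exp(e - top) for e in exponents))
    return -gamma * r + log_sum / beta

def r_max(mu, lam, gamma):
    """Maximizer of w(r) = -gamma r + g_r - lam on r >= 0 (derived here, not quoted from (2.6))."""
    return max(0.0, 0.25 - (mu - lam) ** 2 / gamma ** 2)

def r_min(mu, lam, gamma, h):
    """Right-hand side of (7.12)."""
    return ((lam + abs(h)) ** 2 - (mu - lam) ** 2) / gamma ** 2

def argmax_on_grid(fun, upper, points=5001):
    """Grid point in [0, upper] where fun is largest."""
    grid = [upper * k / (points - 1) for k in range(points)]
    return max(grid, key=fun)

# Regime |mu - lam| > lam + |h| with gamma > 2 |mu - lam| (illustrative values).
mu, lam, gamma, h, beta = 1.0, 0.0, 2.6, 0.0, 200.0
r_beta = argmax_on_grid(lambda r: f(r, beta, mu, lam, gamma, h), 0.5)
print(r_beta, r_max(mu, lam, gamma))

# Regime |mu - lam| <= lam + |h|: r_min solves g_r = lam + |h|.
mu2, lam2, gamma2, h2_field = 1.0, 1.5, 3.0, 0.5
gap_at_r_min = g(r_min(mu2, lam2, gamma2, h2_field), mu2, lam2, gamma2)
print(gap_at_r_min, lam2 + abs(h2_field))
```

The grid maximizer is only accurate to the grid spacing, so the comparison with $r_{\max}$ should be read with that tolerance in mind.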

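Then, for any $r \in [0, r_{\min}]$,

$$f(r) = -\gamma r + |h| + o(1) \quad \text{as } \beta \to \infty.$$

In particular, since $\gamma > 0$,

$$\sup_{0 \leq r \leq r_{\min}} f(r) = f(\delta) = |h| + o(1), \quad \text{with } \delta = o(1) \text{ as } \beta \to \infty. \qquad (7.13)$$

The solution $r_\beta$ of the variational problem (7.10) converges either to 0, or to some strictly positive value $r_\infty > r_{\min}$. In the case where $r_\infty > r_{\min}$, we would have

$$f(r_\infty) = w(r_\infty) + o(1) \quad \text{as } \beta \to \infty. \qquad (7.14)$$

Now, if $|\tilde\mu_\lambda| \leq \lambda + |h|$ and $\gamma \leq 2(\lambda + |h|)$, then $r_{\min} \geq r_{\max}$, cf. (2.6) and (7.12). In this regime, straightforward computations show that

$$|h| - \sup_{r \geq r_{\min}} w(r) = |h| - w(r_{\min}) = \gamma^{-1} ((|h| + \lambda)^2 - \tilde\mu_\lambda^2) \geq 0. \qquad (7.15)$$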


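In other words, the order parameter $r_\beta$ converges towards

$$r_\infty := \lim_{\beta \to \infty} r_\beta = 0, \qquad (7.16)$$

for any $\gamma \leq 2(\lambda + |h|)$ and real numbers $\mu, \lambda, h$ satisfying $|\tilde\mu_\lambda| \leq \lambda + |h|$.

However, if $|\tilde\mu_\lambda| \leq \lambda + |h|$ and $\gamma > 2(\lambda + |h|)$, then $r_{\min} < r_{\max}$. In particular one gets

$$|h| - \sup_{r \geq r_{\min}} w(r) = |h| - w(r_{\max}) = -\frac{1}{4\gamma} (\gamma - \tilde\Gamma_{|\tilde\mu_\lambda|, \lambda + |h|}) (\gamma - \Gamma_{|\tilde\mu_\lambda|, \lambda + |h|}), \qquad (7.17)$$

with $\Gamma_{x,y} \geq 2y$ defined for any $x \in \mathbb{R}_+$ and $y \in \mathbb{R}$ in Corollary 3.1 and

$$\tilde\Gamma_{|\tilde\mu_\lambda|, \lambda + |h|} := 2 \left( \lambda + |h| - \sqrt{(\lambda + |h|)^2 - \tilde\mu_\lambda^2} \right) \leq 2|\tilde\mu_\lambda|.$$

In particular,

$$\sup_{r \geq r_{\min}} w(r) = w(r_{\max}) > |h|, \qquad (7.18)$$

for any $\gamma > \Gamma_{|\tilde\mu_\lambda|, \lambda + |h|} \geq 2|\tilde\mu_\lambda|$. Therefore, by combining (7.13) with (7.14) and (7.18), we obtain

$$r_\infty := \lim_{\beta \to \infty} r_\beta = r_{\max}, \qquad (7.19)$$

for any $\gamma > \Gamma_{|\tilde\mu_\lambda|, \lambda + |h|}$ and real numbers $\mu, \lambda, h$ satisfying $|\tilde\mu_\lambda| \leq \lambda + |h|$.

Finally, if $\gamma = \Gamma_{|\tilde\mu_\lambda|, \lambda + |h|}$ and $|\tilde\mu_\lambda| < \lambda + |h|$, observe that (7.17) is zero. So, we analyze the next order term to know which number, 0 or $r_{\max}$, maximizes the function $f(r)$ when $\beta \to \infty$. On the one hand, straightforward estimations imply that

$$f(0) - |h| = \beta^{-1} \left( e^{-\beta (\lambda + |h| - |\tilde\mu_\lambda|)} + e^{-2\beta |h|} \right) (1 + o(1)) \quad \text{as } \beta \to \infty. \qquad (7.20)$$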

On the other hand, if $\gamma = \Gamma_{|\tilde\mu_\lambda|, \lambda + |h|}$ with $|\tilde\mu_\lambda| < \lambda + |h|$, then by using (2.6) one obtains

$$f(r_{\max}) - |h| = \beta^{-1} e^{-\beta \sqrt{(\lambda + |h|)^2 - \tilde\mu_\lambda^2}} (1 + o(1)) \quad \text{as } \beta \to \infty. \qquad (7.21)$$

Therefore, if $\gamma = \Gamma_{|\tilde\mu_\lambda|, \lambda + |h|}$ and $|\tilde\mu_\lambda| < \lambda + |h|$, it is trivial to check from (7.20)–(7.21) that $f(0) > f(r_{\max})$ when $\beta \to \infty$. Consequently, the limits (7.11), (7.16) and (7.19) together with (2.6) imply Corollary 3.1 for any $\gamma \neq \Gamma_{|\mu - \lambda|, \lambda + |h|}$, whereas if $\gamma = \Gamma_{|\mu - \lambda|, \lambda + |h|}$, the order parameter $r_\beta$ converges to $r_\infty = 0$.

Appendix. Griffiths Arguments

As we have an explicit representation of the pressure, it can be verified in some cases that $r_\beta$ is a $C^1$-function^w of parameters, implying that $p(\beta, \mu, \lambda, \gamma, h)$ is differentiable with respect to parameters. In this particular situation, the proofs of Theorems 3.1, 3.2, 3.4–3.7 done in Sec. 6.2 could also be performed without our notion of equilibrium states by using Griffiths arguments [29–31], which are based on convexity properties of the pressure. We explain it shortly and we conclude by a discussion of an alternative proof of Theorem 3.2.

^w For instance, for special choices of parameters one could check that $\partial_r \xi(r_\beta, \nu_0) \neq 0$, see Sec. 7.

Remark A.1. Our method gives access to all correlation functions at once (cf. Theorem 6.5). It is generalized in [18] to all translation invariant Fermi systems. However, computing all correlation functions with Griffiths arguments [29–31] requires the differentiability of the pressure with respect to any perturbation as well as the computation of its corresponding derivative. This is generally a very hard task, for instance for correlation functions involving many lattice points.

(1) Take self-adjoint operators $P_N$ acting on the fermionic Fock space and assume the existence of the (infinite volume) grand-canonical pressure

$$p_\varepsilon(\beta, \mu, \lambda, \gamma, h) := \lim_{N \to \infty} p_{N,\varepsilon}(\beta, \mu, \lambda, \gamma, h)$$

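for any fixed $\varepsilon$ in a neighborhood $\mathcal{V}$ of 0. In this case, observe that the finite volume pressure

$$p_{N,\varepsilon}(\beta, \mu, \lambda, \gamma, h) := \frac{1}{\beta N} \ln \mathrm{Trace}(e^{-\beta (H_N - \varepsilon P_N)})$$

is convex as a function of $\varepsilon \in \mathcal{V}$ and $\partial_\varepsilon p_{N,0} = N^{-1} \omega_N(P_N)$. Consequently, the point-wise convergence of the function $p_{N,\varepsilon}$ towards $p_\varepsilon$ implies that

$$\liminf_{N \to \infty} \lim_{\varepsilon \to 0^-} \partial_\varepsilon p_{N,\varepsilon} \geq \lim_{\varepsilon \to 0^-} \partial_\varepsilon p_\varepsilon \quad \text{and} \quad \limsup_{N \to \infty} \lim_{\varepsilon \to 0^+} \partial_\varepsilon p_{N,\varepsilon} \leq \lim_{\varepsilon \to 0^+} \partial_\varepsilon p_\varepsilon, \qquad (A.1)$$

see Griffiths lemma [30, 31] or [29, Appendix C]. In particular, one gets

$$\lim_{N \to \infty} \{\partial_\varepsilon p_{N,0}\} = \lim_{N \to \infty} \{N^{-1} \omega_N(P_N)\} = \partial_\varepsilon p_{\varepsilon = 0}, \qquad (A.2)$$

under the assumption that $p_\varepsilon$ is differentiable at $\varepsilon = 0$.

(2) Therefore, by taking

$$P_N = \sum_{x, y \in \Lambda_N} a^*_{x,\uparrow} a^*_{x,\downarrow} a_{y,\downarrow} a_{y,\uparrow},$$

we obtain from (A.2) that

$$\lim_{N \to \infty} \omega_N \left( \frac{1}{N^2} \sum_{x, y \in \Lambda_N} a^*_{x,\uparrow} a^*_{x,\downarrow} a_{y,\downarrow} a_{y,\uparrow} \right) = \partial_\gamma p(\beta, \mu, \lambda, \gamma, h),$$

as soon as the (infinite volume) pressure $p(\beta, \mu, \lambda, \gamma, h)$ has continuous derivative with respect to $\gamma > 0$. Combined with Theorem 2.1 and (2.5) we would obtain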


Theorem 3.1. Meanwhile, Theorems 3.4–3.7 could have been deduced in the same way from (A.2) combined with explicit computations using (2.5).

(3) A direct proof of Theorem 3.2 using Griffiths arguments is more delicate. One uses similar arguments as in [29, 42]. We give them for the interested reader. For any $\varphi \in [0, 2\pi)$, first recall that the pressure $p_{\alpha,\varphi}$ associated with $H_{N,\alpha,\varphi}$ (3.1) in the thermodynamic limit is given by (6.41), which equals (6.43). Additionally, if the parameters $\beta$, $\mu$, $\lambda$, $\gamma$, and $h$ are such that (2.4) has a unique maximizer $r_\beta$, then the variational problem (6.43) has a unique maximizer $c_{\beta,\alpha,\varphi} \in e^{i\varphi} \mathbb{R}$ for $\alpha > 0$ sufficiently small, and $c_{\beta,\alpha,\varphi}$ converges to $r_\beta^{1/2} e^{i\varphi}$ as $\alpha \to 0$, see proof of Theorem 6.2. Now, let us denote by

$$N_N := \sum_{x \in \Lambda_N} (n_{x,\uparrow} + n_{x,\downarrow})$$

the full particle number operator. By straightforward computations, observe that

$$[a_{x,\uparrow}, N_N] = a_{x,\uparrow} \quad \text{and} \quad [a_{x,\downarrow}, N_N] = a_{x,\downarrow}, \qquad (A.3)$$

for any lattice site labelled by $x \in \Lambda_N$, where $[A, B] := AB - BA$. Therefore the unitary operator $U_\varphi := e^{-\frac{i\varphi}{2} N_N}$ realizes a global gauge transformation because one deduces from (A.3) that

$$U_\varphi a_{x,\uparrow} U_\varphi^* = e^{\frac{i\varphi}{2}} a_{x,\uparrow} \quad \text{and} \quad U_\varphi a_{x,\downarrow} U_\varphi^* = e^{\frac{i\varphi}{2}} a_{x,\downarrow}. \qquad (A.4)$$
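The relations (A.3) and (A.4) can be verified explicitly on the smallest possible example: a single lattice site with the two modes $\uparrow$ and $\downarrow$, represented by $4 \times 4$ matrices. The sketch below uses a Jordan–Wigner representation, which is a choice made here for illustration and is not taken from the paper; the helper functions are ad hoc.

```python
import cmath

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def kron(a, b):
    """Kronecker product of two square matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def combine(a, b, sign=1):
    """Entrywise a + sign * b."""
    n = len(a)
    return [[a[i][j] + sign * b[i][j] for j in range(n)] for i in range(n)]

def max_abs_diff(a, b):
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))

# Single-mode annihilation operator in the basis (|0>, |1>), identity and parity.
lower = [[0, 1], [0, 0]]
identity = [[1, 0], [0, 1]]
parity = [[1, 0], [0, -1]]

# Jordan-Wigner representation of the two fermionic modes at one site.
a_up = kron(lower, identity)
a_down = kron(parity, lower)

# Particle number operator N = n_up + n_down (diagonal in this basis).
number = combine(matmul(dagger(a_up), a_up), matmul(dagger(a_down), a_down))

# (A.3): [a, N] = a for both modes.
comm_error = max(
    max_abs_diff(combine(matmul(a, number), matmul(number, a), -1), a)
    for a in (a_up, a_down))

# (A.4): U a U* = exp(i phi / 2) a with U = exp(-i phi N / 2).
phi = 0.73  # arbitrary angle
unitary = [[cmath.exp(-0.5j * phi * number[i][i]) if i == j else 0.0
            for j in range(4)] for i in range(4)]
phase = cmath.exp(0.5j * phi)
gauge_error = max(
    max_abs_diff(matmul(matmul(unitary, a), dagger(unitary)),
                 [[phase * entry for entry in row] for row in a])
    for a in (a_up, a_down))

print(comm_error, gauge_error)
```

Since the number operator is a sum over sites and operators at different sites commute with the other sites' number operators, the one-site check captures the mechanism behind (A.4).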

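In particular, the unitary transformation of the Hamiltonian $H_{N,\alpha,\varphi}$ (3.1) equals $U_\varphi H_{N,\alpha,\varphi} U_\varphi^* = H_{N,\alpha,0}$. It implies on the corresponding Gibbs states (6.42) that

$$\omega_{N,\alpha,\varphi}(B_N) = e^{i\varphi} \omega_{N,\alpha,0}(B_N), \qquad (A.5)$$

with the operator $B_N$ defined by

$$B_N := \sum_{x \in \Lambda_N} a_{x,\downarrow} a_{x,\uparrow}.$$

In other words, it suffices to prove Theorem 3.2 for $\varphi = 0$. Take $\varphi = 0$. Observe that

$$0 = \omega_{N,\alpha,0}([H_{N,\alpha,0}, N_N]) = \alpha\, \omega_{N,\alpha,0}(B_N - B_N^*). \qquad (A.6)$$

Additionally, by using the positive semidefinite Bogoliubov–Duhamel scalar product

$$(X, Y)_{H_{N,\alpha,0}} := \beta^{-1} e^{-\beta N p_{N,\alpha,0}(\beta, \mu, \lambda, \gamma, h)} \int_0^\beta \mathrm{Trace}(e^{-(\beta - \tau) H_{N,\alpha,0}} X^* e^{-\tau H_{N,\alpha,0}} Y)\, d\tau$$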


with respect to the Hamiltonian $H_{N,\alpha,0}$ (see, e.g., [25, 29, 42]), one gets that

$$0 \leq \beta ([N_N, H_{N,\alpha,0}], [N_N, H_{N,\alpha,0}])_{H_{N,\alpha,0}} = \omega_{N,\alpha,0}([N_N, [H_{N,\alpha,0}, N_N]]) = \alpha\, \omega_{N,\alpha,0}(B_N + B_N^*). \qquad (A.7)$$

So, by combining (A.6) with (A.7) it follows that $\omega_{N,\alpha,0}(B_N) = \omega_{N,\alpha,0}(B_N^*) \geq 0$ for any $\alpha \geq 0$. In particular $\omega_{N,\alpha,0}(B_N) = \omega_{N,\alpha,0}(B_N^*)$ is a real number. The function $p_{N,\alpha,0}$ is a convex function of $\alpha \geq 0$ because

$$\beta (\{(B_N + B_N^*) - \omega_{N,\alpha,0}(B_N + B_N^*)\}, \{(B_N + B_N^*) - \omega_{N,\alpha,0}(B_N + B_N^*)\})_{H_{N,\alpha,0}} = \partial_\alpha^2 p_{N,\alpha,0}(\beta, \mu, \lambda, \gamma, h).$$

Then, under the assumption that $p_{\alpha,0}$ is differentiable at $\alpha = 0$ away from any critical point, the equations (A.2), with $P_N = B_N + B_N^*$, and (6.43) imply that

$$\lim_{N \to \infty} \frac{1}{N} \omega_{N,\alpha,0}(B_N + B_N^*) = \lim_{N \to \infty} \partial_\alpha \frac{1}{\beta N} \ln \mathrm{Trace}(e^{-\beta H_{N,\alpha,0}}) = \partial_\alpha p_{\alpha,0}(\beta, \mu, \lambda, \gamma, h) = \zeta_{c_{\beta,\alpha,0}}(a^*_\downarrow a^*_\uparrow + a_\uparrow a_\downarrow),$$

for any $\alpha > 0$ sufficiently small and with $\zeta_c(\cdot)$ defined for any $c \in \mathbb{C}$ by (6.1). Returning to the original Hamiltonian $H_{N,\alpha,\varphi}$ (3.1) for any $\varphi \in [0, 2\pi)$, we conclude from (A.5) combined with the last equalities that

$$\lim_{N \to \infty} \frac{1}{N} \sum_{x \in \Lambda_N} \omega_{N,\alpha,\varphi}(a_{x,\uparrow} a_{x,\downarrow}) = \frac{e^{i\varphi}}{2} \zeta_{c_{\beta,\alpha,0}}(a^*_\downarrow a^*_\uparrow + a_\uparrow a_\downarrow).$$

Therefore, by taking the limit $\alpha \to 0$, Theorem 3.2 would follow if one additionally checks that $p_{\alpha,0}$ is differentiable at $\alpha = 0$ away from any critical point.

Acknowledgments

We are very grateful to Volker Bach and Jakob Yngvason for their hospitality at the Erwin Schrödinger International Institute for Mathematical Physics, at the Physics University of Vienna, and at the Institute of Mathematics of the Johannes Gutenberg–University that allowed us to work on different aspects of the present paper. We also thank N. S. Tonchev and V. A. Zagrebnov for giving us relevant references, as well as the referee for having helped us to improve the paper. Additionally, J.-B. B. especially thanks the mathematical physics group of the Department of Physics of the University of Vienna for the very nice working environment.


References

[1] E. Størmer, Symmetric states of infinite tensor product C*-algebras, J. Funct. Anal. 3 (1969) 48–68.
[2] J. R. Schrieffer and M. Tinkham, Superconductivity, Rev. Mod. Phys. 71 (1999) S313–S317.
[3] Y. Yanase, T. Jujo, T. Nomura, H. Ikeda, T. Hotta and K. Yamada, Theory of superconductivity in strongly correlated electron systems, Phys. Rep. 387 (2003) 1–149.
[4] P. A. Lee, N. Nagaosa and X.-G. Wen, Doping a Mott insulator: Physics of high-temperature superconductivity, Rev. Mod. Phys. 78 (2006) 17–85.
[5] S. T. Beliaev, Application of the methods of quantum field theory to a system of bosons, Sov. Phys. JETP 7 (1958) 289–299.
[6] W. Thirring and A. Wehrl, On the mathematical structure of the B.C.S.-model, Comm. Math. Phys. 4 (1967) 303–314.
[7] W. Thirring, On the mathematical structure of the B.C.S.-model. II, Comm. Math. Phys. 7 (1968) 181–189.
[8] D. J. Thouless, The Quantum Mechanics of Many-Body Systems, 2nd edn. (Academic Press, New York, 1972).
[9] N. G. Duffield and J. V. Pulé, A new method for the thermodynamics of the BCS model, Comm. Math. Phys. 118 (1988) 475–494.
[10] G. A. Raggio and R. F. Werner, The Gibbs variational principle for general BCS-type models, Europhys. Lett. 9 (1989) 633–638.
[11] I. A. Bernadskii and R. A. Minlos, Exact solution of the BCS model, Theor. Math. Phys. 12(2) (1972) 779–787.
[12] N. Ilieva and W. Thirring, High-Tc superconductivity by phase cloning, arXiv:hep-th/0701245v3 (2007).
[13] N. N. Bogoliubov, V. V. Tolmachev and D. V. Shirkov, A New Method in the Theory of Superconductivity (Academy of Sciences Press, Moscow, 1958) and (Consult. Bureau, Inc., N.Y., Chapman Hall Ltd., London, 1959).
[14] R. J. Bursill and C. J. Thompson, Variational bounds for lattice fermion models II: Extended Hubbard model in the atomic limit, J. Phys. A Math. Gen. 26 (1993) 4497–4511.
[15] F. P. Mancini, F. Mancini and A. Naddeo, Exact solution of the extended Hubbard model in the atomic limit on the Bethe lattice, arXiv:0711.0318v1 (2007).
[16] I. G. Brankov and N. S. Tonchev, On the SD model for coexistence of ferromagnetism and superconductivity, Phys. Stat. Sol. (B) 102 (1980) 179–187.
[17] N. N. Bogoliubov Jr., A. N. Ermilov and A. M. Kurbatov, On coexistence of superconductivity and ferromagnetism, Phys. A 101 (1980) 613–628.
[18] J.-B. Bru and W. de Siqueira Pedra, Non-cooperative equilibria of Fermi systems with long range interactions, in preparation.
[19] D. Petz, G. A. Raggio and A. Verbeure, Asymptotics of Varadhan-type and the Gibbs variational principle, Comm. Math. Phys. 121 (1989) 271–282.
[20] G. A. Raggio and R. F. Werner, Quantum statistical mechanics of general mean field systems, Helv. Phys. Acta 62 (1989) 980–1003.
[21] G. A. Raggio and R. F. Werner, The Gibbs variational principle for inhomogeneous mean field systems, Helv. Phys. Acta 64 (1991) 633–667.
[22] F. Hiai, M. Mosonyi, H. Ohno and D. Petz, Free energy density for mean field perturbation of states of a one-dimensional spin chain, Rev. Math. Phys. 20(3) (2008) 335–365.
[23] W. De Roeck, C. Maes, K. Netocny and L. Rey-Bellet, A note on the non-commutative Laplace-Varadhan integral Lemma, arXiv:0808.0293v2 [math-ph] (2009).
[24] G. L. Sewell, Quantum Theory of Collective Phenomena (Clarendon Press, Oxford, 1986).
[25] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Vol. II, 2nd edn. (Springer-Verlag, New York, 1996).
[26] R. Haag, The mathematical structure of the Bardeen–Cooper–Schrieffer model, Il Nuovo Cimento 25(2) (1962) 287–299.
[27] G. Emch, Algebraic Methods in Statistical Mechanics and Quantum Field Theory (Wiley-Interscience, New York, 1972).
[28] L. Accardi, De Finetti theorem, in Encyclopaedia of Mathematics, ed. M. Hazewinkel (Kluwer Academic Publishers, 2001).
[29] V. A. Zagrebnov and J.-B. Bru, The Bogoliubov model of weakly imperfect Bose gas, Phys. Rep. 350 (2001) 291–434.
[30] R. Griffiths, A proof that the free energy of a spin system is extensive, J. Math. Phys. 5 (1964) 1215–1222.
[31] K. Hepp and E. H. Lieb, Equilibrium statistical mechanics of matter interacting with the quantized radiation field, Phys. Rev. A 8 (1973) 2517–2525.
[32] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Vol. I, 2nd edn. (Springer-Verlag, New York, 1996).
[33] M. Fannes, H. Spohn and A. Verbeure, Equilibrium states for mean field models, J. Math. Phys. 21(2) (1980) 355–358.
[34] N. N. Bogoliubov Jr., J. G. Brankov, V. A. Zagrebnov, A. M. Kurbatov and N. S. Tonchev, Metod approksimiruyushchego gamil'toniana v statisticheskoi fizike [The Approximating Hamiltonian Method in Statistical Physics] (Izdat. Bulgar. Akad. Nauk [Publ. House Bulg. Acad. Sci.], Sofia, 1981).
[35] N. N. Bogoliubov Jr., J. G. Brankov, V. A. Zagrebnov, A. M. Kurbatov and N. S. Tonchev, Some classes of exactly soluble models of problems in Quantum Statistical Mechanics: The method of the approximating Hamiltonian, Russ. Math. Surv. 39 (1984) 1–50.
[36] J. G. Brankov, D. M. Danchev and N. S. Tonchev, Theory of Critical Phenomena in Finite-Size Systems: Scaling and Quantum Effects (World Scientific, 2000).
[37] N. N. Bogoliubov Jr., On model dynamical systems in statistical mechanics, Physica 32 (1966) 933–944.
[38] C. N. Yang, Concept of off-diagonal long range order and the quantum phases of liquid He and of superconductors, Rev. Mod. Phys. 34 (1962) 694–704.
[39] S. Adams and T. Dorlas, C*-Algebraic approach to the Bose–Hubbard model, J. Math. Phys. 48 (2007) 103304, 14 pp.
[40] H. Araki and H. Moriya, Equilibrium statistical mechanics of fermion lattice systems, Rev. Math. Phys. 15 (2003) 93–198.
[41] R. R. Phelps, Lectures on Choquet's Theorem, Lecture Notes in Mathematics, Vol. 1757, 2nd edn. (Springer-Verlag, 2001).
[42] J. Ginibre, On the asymptotic exactness of the Bogoliubov approximation for many Bosons systems, Comm. Math. Phys. 8 (1968) 26–51.

April 20, 2010 14:17 WSPC/S0129-055X

148-RMP

J070-00396

Reviews in Mathematical Physics Vol. 22, No. 3 (2010) 305–329 © World Scientific Publishing Company DOI: 10.1142/S0129055X10003965

ON SEMICLASSICAL AND UNIVERSAL INEQUALITIES FOR EIGENVALUES OF QUANTUM GRAPHS

SEMRA DEMIREL∗ and EVANS M. HARRELL, II†

∗Department of Mathematics, University of Stuttgart, Pfaffenwaldring 57, D-70569 Stuttgart, Germany
[email protected]

†School of Mathematics, Georgia Institute of Technology, Atlanta GA 30332-0160, USA
[email protected]

Received 12 November 2009

We study the spectra of quantum graphs with the method of trace identities (sum rules), which are used to derive inequalities of Lieb–Thirring, Payne–Pólya–Weinberger, and Yang types, among others. We show that the sharp constants of these inequalities and even their forms depend on the topology of the graph. Conditions are identified under which the sharp constants are the same as for the classical inequalities. In particular, this is true in the case of trees. We also provide some counterexamples where the classical form of the inequalities is false.

Keywords: Quantum graph; semiclassical; Lieb–Thirring inequality; sum rule; universal spectral bounds.

Mathematics Subject Classification 2010: 81Q35, 34L15, 34L40, 81Q20, 47E05, 47A75

1. Introduction

This article is focused on inequalities for the means, moments, and ratios of eigenvalues of quantum graphs. A quantum graph is a metric graph with one-dimensional Schrödinger operators acting on the edges and appropriate boundary conditions imposed at the vertices and at the finite external ends, if any. Here we shall define the Hamiltonian $H$ on a quantum graph as the minimal (Friedrichs) self-adjoint extension of the quadratic form

$$\varphi \in C_c^\infty \mapsto E(\varphi) := \int_\Gamma |\varphi'|^2\, ds, \qquad (1.1)$$

which leads to vanishing Dirichlet boundary conditions at the ends of exterior edges and to the conditions at each vertex $v_k$ that $\varphi$ is continuous and moreover

$$\sum_j \frac{\partial \varphi}{\partial x_{kj}}(0^+) = 0, \qquad (1.2)$$


where the sum runs over all edges emanating from $v_k$, and $x_{kj}$ designates the distance from $v_k$ along the $j$th edge. (Edges connecting $v_k$ to itself are accounted twice.) In the literature, these vertex conditions are usually known as Kirchhoff or Neumann conditions. Other vertex conditions are possible, and are amenable to our methods with some complications, but they will not be considered in this article. For details about the definition of $H$, we refer to [15].

Quantum mechanics on graphs has a long history in physics and physical chemistry [21, 24], but recent progress in experimental solid state physics has renewed attention on them as idealized models for thin domains. While the problem of quantum systems in high dimensions has to be solved numerically, since quantum graphs are locally one-dimensional their spectra can often be determined explicitly. A large literature on the subject has arisen, for which we refer to the bibliography given in [3, 7].

The subject of inequalities for means, moments, and ratios of eigenvalues is rather well developed for Laplacians on domains and for Schrödinger operators, and it is our aim to determine the extent to which analogous theorems apply to quantum graphs. For example, when there is a potential energy $V(x)$ in appropriate function spaces, Lieb–Thirring inequalities provide an upper bound for the moments of the negative eigenvalues $E_j(\alpha)$ of the Schrödinger operator $H(\alpha) = -\alpha \nabla^2 + V(x)$ in $L^2(\mathbb{R}^d)$, $\alpha > 0$, of the form

$$\alpha^{d/2} \sum_{E_j(\alpha) < 0} (-E_j(\alpha))^\gamma \leq L_{\gamma,d} \int_{\mathbb{R}^d} (V_-(x))^{\gamma + d/2}\, dx \qquad (1.3)$$

for some constant $L_{\gamma,d} \geq L^{cl}_{\gamma,d}$, where $L^{cl}_{\gamma,d}$, known as the classical constant, is given by

$$L^{cl}_{\gamma,d} = \frac{1}{(4\pi)^{d/2}} \frac{\Gamma(\gamma + 1)}{\Gamma(\gamma + d/2 + 1)}.$$
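To make the classical constant concrete, the following short sketch evaluates $L^{cl}_{\gamma,d}$ with the gamma function from the Python standard library. The function name is arbitrary, and the sample arguments are chosen only because they correspond to thresholds mentioned in the surrounding discussion.

```python
import math

def lieb_thirring_classical(gamma_exp, d):
    """Classical constant Gamma(g + 1) / ((4 pi)^(d/2) Gamma(g + d/2 + 1))."""
    return math.gamma(gamma_exp + 1.0) / (
        (4.0 * math.pi) ** (d / 2.0) * math.gamma(gamma_exp + d / 2.0 + 1.0))

# One-dimensional values at gamma = 3/2 and gamma = 2.
print(lieb_thirring_classical(1.5, 1))  # equals 3/16 in closed form
print(lieb_thirring_classical(2.0, 1))
```

For $d = 1$ and $\gamma = 3/2$ the closed form is $\Gamma(5/2) / (2\sqrt{\pi}\, \Gamma(3)) = 3/16$, which the first line reproduces up to rounding.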

It is known that (1.3) holds true for various ranges of $\gamma \geq 0$ depending on the dimension $d$; see [5, 13, 19, 20, 23, 27]. In particular, in [18] Laptev and Weidl proved that $L_{\gamma,d} = L^{cl}_{\gamma,d}$ for all $\gamma \geq 3/2$ and $d \geq 1$, and Stubbe [25] has recently given a new proof of sharp Lieb–Thirring inequalities for $\gamma \geq 2$ and $d \geq 1$ by showing monotonicity with respect to coupling constants. His proof is based on general trace identities for operators [11, 12] known as sum rules, which will again be used as the foundation of the present article.

When there is no potential energy but instead the Laplacian is given Dirichlet conditions on the boundary of a bounded domain, then the means of the first $n$ eigenvalues are bounded from below by the Berezin–Li–Yau inequality in terms of the volume of the domain, and in addition there is a large family of universal bounds on the spectrum, dating from the work of Payne, Pólya, and Weinberger [22], which constrain the spectrum without any reference to properties of the domain. (For a review of the subject, see [2].) It turns out that there are far-reaching analogies between these “universal” inequalities for Dirichlet Laplacians and Lieb–Thirring


inequalities, which have led to common proofs based on sum rules [8–12, 25]. More precisely, some sharp Lieb–Thirring inequalities and some universal inequalities of the PPW family can be viewed as corollaries of a “Yang-type” inequality like (2.5) below, which in turn follows from a sum-rule identity.

In one dimension, a domain is merely an interval and the spectrum of the Dirichlet Laplacian is a familiar elementary calculation, for which the question of universal bounds is trivial. A quantum graph, however, has a spectrum that responds in complex ways to its connectedness; if the total length is finite and appropriate boundary conditions are imposed at exterior vertices, then the spectrum is discrete, and questions about counting functions, moments, etc. and their relation to the topology of the graph become interesting, even in the absence of a potential energy. Below we shall prove several inequalities for the spectra of finite quantum graphs, with the aid of the same trace identities we use to derive Lieb–Thirring inequalities.

For Lieb–Thirring inequalities on quantum graphs, the essential question is whether a form of (1.3) holds with the sharp constant for $d = 1$, or whether the connectedness of the graph can change the state of affairs. In [6], Ekholm, Frank and Kovařík proved Lieb–Thirring inequalities for Schrödinger operators on regular metric trees for any $\gamma \geq 1/2$, but without sharp constants. We shall show below that trees enjoy a Lieb–Thirring inequality with the sharp constant when $\gamma \geq 2$, but that this circumstance depends on the topology of the graph.

We begin with some simple explicit examples showing that neither the expected Lieb–Thirring inequality nor the analogous universal inequalities for finite quantum graphs without potential hold in complete generality. As it will be convenient to have a uniform way of describing examples, we shall let $x_{ij}$ denote the distance from vertex $v_i$ along the $j$th edge $\Gamma_j$ emanating from $v_i$. We note that every edge corresponds to two distinct coordinates $x_{ij} = L - x_{i'j'}$ where $L$ is the length of the edge, and that a homoclinic loop from a vertex $v_i$ to itself is accounted as two edges.

For the operator $-\frac{d^2}{dx^2}$ on an interval, with vanishing Dirichlet boundary conditions, the universal inequality of Payne–Pólya–Weinberger reduces to $E_2/E_1 \leq 5$, and the Ashbaugh–Benguria theorem becomes $E_2/E_1 \leq 4$, both of which are trivial in one dimension. But for which quantum graphs do these classic inequalities continue to be valid? We shall show below that the classic PPW and related inequalities can be proved for the case of trees, with Dirichlet boundary conditions imposed at all external ends of edges, using the method of sum rules. The sum-rule proof does not work for every graph, however, so the question naturally arises whether the topology makes a real difference, or whether a better method of proof is required. The following examples show that the failure of the sum-rule proof in the case of multiply connected graphs is not an artifact of the method but due to a true topological effect.

We refer to graphs consisting of a circle attached to a single external edge as “simple balloon graphs.” The external edge may either be infinite or of finite length with a vanishing boundary condition at its exterior end. Consider first the graph

April 20, 2010 14:17 WSPC/S0129-055X

308

148-RMP

J070-00396

S. Demirel & E. M. Harrell, II

Fig. 1. The “balloon graph” (loop $\Gamma_1$ attached at the vertex $v_1$ to the external edge $\Gamma_2$).

$\Gamma := \Gamma_1 \cup \Gamma_2$, which consists of a loop $\Gamma_1$ to which a finite external interval $\Gamma_2$ is attached at a vertex $v_1$. Without loss of generality, we may fix the length of the loop as $2\pi$, while the “string” will be of length $L$.

Example 1.1 (Violation of the Analogue of PPW). Let us begin with the case of a balloon graph with $L < \infty$, and assume that there is no potential. We set $\alpha = 1$. Thus $H$ locally has the form $-\frac{d^2}{dx^2}$, with a Dirichlet condition at the end of the string $\Gamma_2$ and the vertex condition (1.2) at $v_1$ connecting it to the loop. For convenience, we slightly simplify the coordinate system, letting $x_s := x_{12}$ be the distance on $\Gamma_s := \Gamma_2$ from the node, and $x_\ell := x_{11} - \pi$ on $\Gamma_1$. Thus $x_\ell$ increases from $-\pi$ at $v_1$ to $x_\ell = +\pi$ when it joins it again.

It is possible to analyze the eigenvalues of the balloon graph quite explicitly: With a Dirichlet condition at $x_s = L$, any eigenfunction must be of the form $a \sin(k(L - x_s))$ on $\Gamma_s$. On $\Gamma_1$ symmetry dictates that the eigenfunction must be proportional to either $\sin k x_\ell$ or $\cos k x_\ell$. There are thus two categories of eigenfunctions and eigenvalues.

Eigenfunctions of the form $\sin k x_\ell$ contribute nothing to the vertex condition (1.2) (because the outward derivatives at the node are equal in magnitude with opposite signs), and therefore the derivative of $a \sin(k(L - x_s))$ must vanish at $x_s = 0$. If $k$ is a positive integer, then $k^2$ is an eigenvalue corresponding to an eigenfunction that vanishes on $\Gamma_s$. Otherwise, the conditions on $\Gamma_s$ cannot be achieved without violating the condition of continuity with the eigenfunction on $\Gamma_1$. To summarize: the eigenvalues of the first category are the squares of positive integers.

The second category of eigenfunctions match $\cos k x_\ell$ on the loop to $a \sin(k(L - x_s))$ on the interval. The boundary conditions and continuity lead after a standard calculation to the transcendental equation
\[ \cot kL = 2 \tan k\pi. \tag{1.4} \]
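The transcendental equation (1.4) is easy to explore numerically. The following is a minimal Python sketch, not part of the original analysis: it clears the poles of (1.4) by multiplying through by $\sin(kL)\cos(k\pi)$, locates the roots by a sign-change scan followed by bisection, and merges them with the first-category eigenvalues $j^2$. The function names, the scan range `kmax` and the grid resolution `steps` are illustrative assumptions.

```python
import math

def f(k, L):
    """Equation (1.4), cot(kL) = 2 tan(k*pi), multiplied by sin(kL)*cos(k*pi)."""
    return (math.cos(k * L) * math.cos(k * math.pi)
            - 2.0 * math.sin(k * L) * math.sin(k * math.pi))

def bisect(g, a, b, tol=1e-13):
    """Plain bisection on [a, b], assuming g(a) and g(b) have opposite signs."""
    ga = g(a)
    while b - a > tol:
        m = 0.5 * (a + b)
        gm = g(m)
        if ga * gm <= 0.0:
            b = m
        else:
            a, ga = m, gm
    return 0.5 * (a + b)

def second_category_k(L, kmax=3.0, steps=3000):
    """Positive roots k <= kmax of (1.4); the scan grid is assumed fine enough."""
    grid = [kmax * (i + 1) / steps for i in range(steps)]
    return [bisect(lambda k: f(k, L), a, b)
            for a, b in zip(grid, grid[1:]) if f(a, L) * f(b, L) < 0.0]

def fundamental_ratio(L, kmax=3.0):
    """E2/E1 for the balloon graph with loop length 2*pi and string length L."""
    first = [float(j * j) for j in range(1, int(kmax) + 1)]   # sine-type modes on the loop
    second = [k * k for k in second_category_k(L, kmax)]      # cosine-type modes, from (1.4)
    E = sorted(first + second)
    return E[1] / E[0]

if __name__ == "__main__":
    print(fundamental_ratio(math.pi))
```

For $L = \pi$ the equation reduces to $\tan^2 k\pi = 1/2$, so the smallest root found by the scan can be compared directly with $\frac{1}{\pi}\arctan\frac{1}{\sqrt{2}}$.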

There are three interesting situations to consider. In the limit $L \to 0$, an asymptotic analysis of (1.4) shows that the eigenvalues tend to $\{(\frac{n}{2})^2\}$. In the limit $L \to \infty$, the lower eigenvalues tend to $\{(n + \frac{1}{2})^2 \frac{\pi^2}{L^2}\}$, which are the eigenvalues of an interval of length $L$ with Dirichlet conditions at $L$ and Neumann conditions at 0. The ratio of the first two eigenvalues in this limit is approximately 9, which is already greater than the classically anticipated value of 5 or 4. The highest value of the ratio is, somewhat surprisingly, attained for an intermediate value of $L$, viz., $L = \pi$, for which (1.4) can be easily solved, yielding
\[ k = \pm\frac{1}{\pi}\arctan\frac{1}{\sqrt{2}} + j \]


Inequalities for Eigenvalues of Quantum Graphs


for a positive integer $j$. The corresponding fundamental ratio of the lowest two eigenvalues becomes
\[ \frac{E_2}{E_1} = \left(\frac{\pi - \arctan\frac{1}{\sqrt{2}}}{\arctan\frac{1}{\sqrt{2}}}\right)^2 \doteq 16.8453. \]
(We spare the reader the direct calculation showing that the critical value of the ratio occurs precisely at $L = \pi$, establishing this value as the maximum among all simple balloons.)

Example 1.2 (Showing that $E_2/E_1$ can be Arbitrarily Large). A modification of Example 1.1 with more complex topology shows that no upper bound on the ratio of the first two eigenvalues is possible for the graph analogue of the Dirichlet problem. We again set $\alpha = 1$ and assume $V = 0$, and consider a “fancy balloon” graph consisting of an external edge, $\Gamma_s$, the “string,” of length $\pi$ joined at $v_1$ to $N$ edges $\Gamma_m$, $m = 1, \dots, N$, of length $\pi$, all of which meet at a second vertex $v_2$. We observe that the eigenfunctions may be chosen either even or odd under pairwise permutation of the edges $\Gamma_m$. This is because if $Pf$ represents the linear transformation of a function $f$ defined on the graph by permuting two of the variables $\{x_{21}, \dots, x_{2N}\}$, and $\phi_j$ is an eigenfunction of the quantum graph with eigenvalue $E_j$, then so are $\phi_j \pm P\phi_j$. (In particular, continuity and (1.2) are preserved by these superpositions.) Moreover, the fundamental eigenfunction is even under any permutation, because it is unique and does not change sign. By continuity and the conditions (1.2) at the vertices, as in Example 1.1, a straightforward exercise shows that $E_1 = (\frac{1}{\pi}\arctan(\frac{1}{\sqrt{N}}))^2$, and that there are other even-parity eigenvalues
\[ \left(j \pm \frac{1}{\pi}\arctan\frac{1}{\sqrt{N}}\right)^2 \]
for all positive integers $j$. Odd parity, when combined with continuity, forces the eigenfunctions to vanish at the nodes, and thus leads to eigenvalues of the form $j^2$, for positive integers $j$. The fundamental ratio $E_2/E_1$ for this example can be seen to be
\[ \left(\frac{\pi - \arctan\frac{1}{\sqrt{N}}}{\arctan\frac{1}{\sqrt{N}}}\right)^2, \]
which is roughly $\pi^2 N$ for large $N$.

Remarks. (1) With no external edges, the lowest eigenvalue of a quantum graph is $E_1 = 0$, so one might intuitively argue that for a graph with a large and


complex interior part the effect of an exterior edge with a boundary condition is small. The theorems and examples given below, however, point towards a more nuanced intuition. (2) Another instructive example is the “bunch-of-balloons” graph, with many nonintersecting loops attached to the string at $v_1$. We leave the details to the interested reader.

Example 1.3 (Violation of Classical Lieb–Thirring). Next consider a balloon graph with $L = \infty$ and the Schrödinger operator $H(1) := -\frac{d^2}{dx^2} + V(x)$ on $L^2(\Gamma)$ with vertex conditions (1.2). Let the potential $V$ be given by

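\[ V(x) := \begin{cases} V_1(x) := \dfrac{-2a^2}{\cosh^2(a x_\ell)}, & x_\ell \in \Gamma_1 = [-\pi, \pi], \\ V_2(x) := 0, & x_s \in \Gamma_2 = [0, \infty). \end{cases} \]
Then the eigenfunction corresponding to the eigenvalue $-a^2$ is given by $C \cosh^{-1}(a x_\ell)$ on $\Gamma_1$ and by $e^{-a x_s}$ on $\Gamma_2$. The continuity condition gives $C = \cosh(a\pi)$ and the condition (1.2) at $v_1$ leads to the equation
\[ \tanh(a\pi) = \frac{1}{2}. \tag{1.5} \]
Denoting the ratio
\[ Q(\gamma, V) := \frac{|E_1|^\gamma}{\int_\Gamma |V(x)|^{\gamma + 1/2}\, dx}, \]
we compute
\[ Q(3/2, V) = a^3 \left(2\int_0^{\pi} \frac{4a^4}{\cosh^4(a x_\ell)}\, dx_\ell\right)^{-1} = \left(8\int_0^{a\pi} \frac{dy}{\cosh^4(y)}\right)^{-1} = \left(\frac{8}{3}\tanh(a\pi)\,(2 + \mathrm{sech}^2(a\pi))\right)^{-1}. \]
Because of (1.5), $\mathrm{sech}^2(a\pi) = 1 - \tanh^2(a\pi) = \frac{3}{4}$, and therefore
\[ Q(3/2, V) = \frac{3}{11} > \frac{3}{16} = L^{cl}_{3/2,1}. \tag{1.6} \]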
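The value in (1.6) can be cross-checked by direct quadrature. The sketch below is an illustration only: it assumes the potential and eigenvalue of Example 1.3, uses a composite Simpson rule (the helper names `simpson`, `classical_constant` and `Q` are hypothetical), and relies on the standard one-dimensional classical constant $L^{cl}_{\gamma,1} = \Gamma(\gamma+1)/((4\pi)^{1/2}\Gamma(\gamma+3/2))$.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def classical_constant(gamma):
    """L^cl_{gamma,1} = Gamma(gamma + 1) / (sqrt(4 pi) * Gamma(gamma + 3/2))."""
    return math.gamma(gamma + 1.0) / (math.sqrt(4.0 * math.pi) * math.gamma(gamma + 1.5))

def Q(gamma, a):
    """|E_1|^gamma / int |V|^(gamma + 1/2), with V = -2a^2/cosh^2(a x) on the loop
    [-pi, pi], V = 0 on the string, and E_1 = -a^2."""
    integrand = lambda x: (2.0 * a * a / math.cosh(a * x) ** 2) ** (gamma + 0.5)
    return (a * a) ** gamma / simpson(integrand, -math.pi, math.pi)

a = math.atanh(0.5) / math.pi   # solves tanh(a*pi) = 1/2, i.e. condition (1.5)

if __name__ == "__main__":
    for gamma in (1.5, 2.0):
        print(gamma, Q(gamma, a), classical_constant(gamma))
```

The same routine evaluates the ratio for other values of $\gamma$, so it can also be used to compare $Q(2, V)$ with $L^{cl}_{2,1} = 8/(15\pi)$.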

Note that the ratio Q(3/2, V ) is independent of the length of the loop, as expected because any length L can be achieved by a change of scale.


The ratio $Q(\gamma, V)$ can also be calculated explicitly for the case $\gamma = 2$. In this case
\[ Q(2, V) = \left[2^{7/2}\left(\frac{3}{4}\arctan(\tanh(a\pi/2)) + \frac{3}{16}\,\mathrm{sech}(a\pi) + \frac{1}{8}\,\mathrm{sech}^3(a\pi)\right)\right]^{-1} \doteq 0.2009 > L^{cl}_{2,1} = \frac{8}{15\pi} \doteq 0.1697. \]

2. Lieb–Thirring Inequalities for Quantum Graphs

2.1. Classical Lieb–Thirring inequality for metric trees

Our point of departure is the family of sum-rule identities from [11, 12]. Let $H$ and $G$ be abstract self-adjoint operators satisfying certain mapping conditions. We suppose that $H$ has nonempty discrete spectrum lying below the continuum, $\{E_j : H\phi_j = E_j\phi_j\}$. In the situations of interest in this article the spectrum will either be entirely discrete, in which case we focus on spectral subsets of the form $J := \{E_j, j = 1, \dots, k\}$, or else, when there is a continuum, it will lie on the positive real axis and we shall take $J$ as the negative part of the spectrum. Let $P_A$ denote the spectral projector associated with $H$ and a Borel set $A$. Then, given a pair of self-adjoint operators $H$ and $G$ with domains $D(H)$ and $D(G)$, such that $G(\mathcal{J}) \subset D(H) \subset D(G)$, where $\mathcal{J}$ is the subspace spanned by the eigenfunctions $\phi_j$ corresponding to the eigenvalues $E_j$, it is shown in [11, 12] that:
\[ \sum_{E_j \in J} (z - E_j)^2 \langle [G, [H, G]]\phi_j, \phi_j\rangle - 2(z - E_j)\langle [H, G]\phi_j, [H, G]\phi_j\rangle = 2\sum_{E_j \in J}\int_{\kappa \in J^c} (z - E_j)(z - \kappa)(\kappa - E_j)\, dG^2_{j\kappa}, \tag{2.1} \]
where $dG^2_{j\kappa} := |\langle G\phi_j, dP_\kappa G\phi_j\rangle|$ corresponds to the matrix elements of the operator $G$ with respect to the spectral projections onto $J$ and $J^c$. Because of our choice of $J$,
\[ \sum_{E_j \in J} (z - E_j)^2 \langle [G, [H, G]]\phi_j, \phi_j\rangle - 2(z - E_j)\langle [H, G]\phi_j, [H, G]\phi_j\rangle \le 0. \tag{2.2} \]

In this section $H$ is the Schrödinger operator on the graph $\Gamma$, namely
\[ H(\alpha) = -\alpha\frac{d^2}{dx^2} + V(x) \quad \text{in } L^2(\Gamma), \quad \alpha > 0, \]
with the usual conditions (1.2) at each vertex $v_i$. In particular, if any leaves (i.e. edges with one free end) are of finite length, vanishing Dirichlet boundary conditions are imposed at their ends. Without loss of generality we may assume that $V \in C_0^\infty$


for the operator $H(\alpha)$. Under this assumption, for any $\alpha > 0$, $H(\alpha)$ has at most a finite number of negative eigenvalues. We denote negative eigenvalues of $H(\alpha)$ by $E_j(\alpha)$ corresponding to the normalized eigenfunctions $\phi_j$. We shall be able to derive inequalities of the standard one-dimensional type when it is possible to choose $G$ to be multiplication by the arclength along some distinguished subsets of the graph. This depends on the following:

Lemma 2.1. Suppose that there exists a continuous, piecewise-linear function $G$ on the graph $\Gamma$, such that at each vertex $v_k$
\[ \sum_j \frac{\partial G}{\partial x_{kj}}(0^+) = 0. \tag{2.3} \]
Suppose that $\Gamma = \bigcup_m \Gamma_m$ with $(G')^2 = a_m$ on $\Gamma_m$. If $H$ has nonempty essential spectrum, assume that $z \le \inf\sigma_{ess}(H)$. Then
\[ \sum_{j,m} (z - E_j)_+^2\, a_m \|\chi_{\Gamma_m}\phi_j\|^2 - 4\alpha (z - E_j)_+\, a_m \|\chi_{\Gamma_m}\phi_j'\|^2 \le 0. \tag{2.4} \]

We observe that $\chi_{\Gamma_m} = 1 \Leftrightarrow a_m \ne 0$.

Proof. The formula (2.4) is a direct application of (2.2), when we note that, locally, $[H, G] = -2G'\frac{d}{dx_{kj}} - G''$ and $[G, [H, G]] = 2(G')^2$. (A factor of $2\alpha$ has been divided out.) The reason for the condition (2.3) is that $G\phi_j$ must be in the domain of definition of $H$, which requires that at each vertex,
\[ 0 = \sum_j \frac{\partial (G\phi_j)}{\partial x_{kj}}(0^+) = G\sum_j \frac{\partial \phi_j}{\partial x_{kj}}(0^+) + \phi_j\sum_j \frac{\partial G}{\partial x_{kj}}(0^+) = \phi_j\sum_j \frac{\partial G}{\partial x_{kj}}(0^+). \]

If we are so fortunate that $(G')^2$ is the same constant on every edge, then (2.4) reduces to the quadratic inequality
\[ \sum_j (z - E_j)_+^2 - 4\alpha (z - E_j)_+ \|\phi_j'\|^2 \le 0, \tag{2.5} \]
familiar from [8, 9, 11, 12, 25], where it was shown that it implies universal spectral bounds for Laplacians and Lieb–Thirring inequalities for Schrödinger operators in routine ways. Equation (2.5) can be considered as a Yang-type inequality, after [30].
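For the reader's convenience, the commutators used in the proof of Lemma 2.1 can be checked directly. The short computation below is a sketch under the assumption that $\phi$ is smooth on the edge in question and that $H = -\alpha\, d^2/dx^2 + V$ there; unlike in the proof, the factors of $\alpha$ are kept explicit rather than divided out.

```latex
\begin{align*}
[H,G]\phi &= -\alpha\big((G\phi)'' - G\phi''\big) = -\alpha\,(2G'\phi' + G''\phi),\\
[G,[H,G]]\phi &= -\alpha\,G\,(2G'\phi' + G''\phi) + \alpha\big(2G'(G\phi)' + G''G\phi\big)
               = 2\alpha\,(G')^2\phi .
\end{align*}
```

On an edge where $G$ is affine, $G'' = 0$ and $(G')^2 = a_m$, so $\langle [G,[H,G]]\phi, \phi\rangle = 2\alpha a_m\|\phi\|^2$ and $\|[H,G]\phi\|^2 = 4\alpha^2 a_m\|\phi'\|^2$ on that edge. Inserting these into (2.2) and dividing by $2\alpha$ gives the summand $(z - E_j)^2 a_m\|\chi_{\Gamma_m}\phi_j\|^2 - 4\alpha(z - E_j)a_m\|\chi_{\Gamma_m}\phi_j'\|^2$ appearing in (2.4).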


Stubbe’s monotonicity argument

In [25], Stubbe showed that some of the classical sharp Lieb–Thirring inequalities follow from the quadratic inequality (2.5). Here we apply the same argument to quantum graphs: For any $\alpha > 0$, the functions $E_j(\alpha)$ are non-positive, continuous and increasing. $E_j(\alpha)$ is continuously differentiable except at countably many values where $E_j(\alpha)$ fails to be isolated or enters the continuum. By the Feynman–Hellmann theorem,
\[ \frac{d}{d\alpha}E_j(\alpha) = \langle \phi_j, -\phi_j''\rangle = \|\phi_j'\|^2. \]
Setting $z = 0$, (2.5) reads
\[ \alpha\sum_{E_j(\alpha)<0} (-E_j(\alpha))^2 + 2\alpha^2\frac{d}{d\alpha}\sum_{E_j(\alpha)<0} (-E_j(\alpha))^2 \le 0. \]
We denote by $\infty \ge \alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_k \ge \cdots > 0$ the values at which $E_j(\alpha)$ appears. For any $\alpha \in\, ]\alpha_{N+1}, \alpha_N[$ the number of eigenvalues is constant, and therefore
\[ \frac{d}{d\alpha}\Big(\alpha^{1/2}\sum_{E_j(\alpha)<0} (-E_j(\alpha))^2\Big) \le 0. \]
This means that $\alpha^{1/2}\sum_{E_j(\alpha)<0} (-E_j(\alpha))^2$ is monotone decreasing in $\alpha$. Hence, by Weyl’s asymptotics (see [4, 28]),
\[ \alpha^{1/2}\sum_{E_j(\alpha)<0} (-E_j(\alpha))^2 \le \lim_{\alpha \to 0+}\alpha^{1/2}\sum_{E_j(\alpha)<0} (-E_j(\alpha))^2 = L^{cl}_{2,1}\int_\Gamma (V_-(x))^{2+1/2}\, dx. \]
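As a concrete illustration of the monotonicity argument (an example chosen here, not taken from the text), consider the real line, viewed as the simplest tree with two infinite leaves joined at one vertex, with the Pöschl–Teller potential $V(x) = -V_0/\cosh^2 x$. This choice is an assumption made because the negative eigenvalues of $-\alpha\, d^2/dx^2 + V$ are known in closed form: $E_n = -\alpha(s - n)^2$ for integers $0 \le n < s$, where $s(s+1) = V_0/\alpha$. The sketch below evaluates $\alpha^{1/2}\sum_j E_j(\alpha)^2$ and compares it with $L^{cl}_{2,1}\int V_-^{5/2}\,dx = \frac{8}{15\pi}\cdot\frac{3\pi}{8}V_0^{5/2} = V_0^{5/2}/5$. The function names are hypothetical.

```python
import math

def negative_eigenvalues(alpha, V0=1.0):
    """Closed-form negative eigenvalues of -alpha*d^2/dx^2 - V0/cosh(x)^2 on the real line."""
    s = 0.5 * (-1.0 + math.sqrt(1.0 + 4.0 * V0 / alpha))   # s(s+1) = V0/alpha
    return [-alpha * (s - n) ** 2 for n in range(math.ceil(s))]   # integers 0 <= n < s

def weighted_moment(alpha, V0=1.0):
    """alpha^(1/2) * sum_j E_j(alpha)^2, the quantity shown to be nonincreasing in alpha."""
    return math.sqrt(alpha) * sum(E * E for E in negative_eigenvalues(alpha, V0))

def semiclassical_limit(V0=1.0):
    """L^cl_{2,1} * int V_-^(5/2) dx, using int sech^5 = 3*pi/8 and L^cl_{2,1} = 8/(15*pi)."""
    return (8.0 / (15.0 * math.pi)) * (3.0 * math.pi / 8.0) * V0 ** 2.5

if __name__ == "__main__":
    for alpha in (0.01, 0.1, 0.5, 1.0, 2.0):
        print(alpha, weighted_moment(alpha), semiclassical_limit())
```

As $\alpha$ decreases, the printed moments increase towards the semiclassical value without exceeding it, which is the behavior the argument predicts.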

Remark 2.2. Strictly speaking the Feynman–Hellmann theorem only holds for nondegenerate eigenvalues. In the case of degenerate eigenvalues, one has to take the right basis in the corresponding degeneracy space and to change the numbering if necessary, see, e.g., [26].

The balloon counterexamples given above might lead one to think that the existence of cycles poses a barrier for a quantum graph to have an inequality of the form (2.5). Consider, however, the following example.

Example 2.3 (Hash Graphs). Let $\Gamma$ be a planar graph consisting of (or metrically isomorphic to) the union of a closed family of vertical lines and line segments $F_v$ and a closed family of horizontal lines and line segments $F_h$. We assume that for some $\delta > 0$ the distance between any two lines or line segments in $F_v$ is at least $\delta$, and that the same is true of $F_h$. (The assumption on the spacing of the lines allows an unproblematic definition of the vertex conditions (1.2).) We impose Dirichlet boundary conditions at any ends of finite line segments. We also suppose a


“crossing condition”, that there are no vertices touching exactly three edges. (That is, no line segment from $F_v$ has an end point in $F_h$ and vice versa.) Regarding the graph as a subset of the $xy$-plane, we let $G(x, y) = x + y$. It is immediate from the crossing condition that $G$ satisfies (2.3). Furthermore, the derivative of $G$ along every edge is 1, and therefore the quadratic inequality (2.5) holds.

A quadratic inequality (2.5) can arise in a different way, if there is a family of piecewise affine functions $G_\ell$, each with a range of values $a_{\ell m}$, but such that $\sum_\ell a_{\ell m} = 1$ (or any other fixed positive constant). This occurs in our next example. Even when this is not possible, if we can arrange that $0 < a_{min} \le \sum_\ell a_{\ell m} \le a_{max}$, then the resulting weaker quadratic inequality
\[ \sum_j (z - E_j)_+^2 - 4\alpha\frac{a_{max}}{a_{min}}(z - E_j)_+\|\phi_j'\|^2 \le 0 \tag{2.6} \]
will still lead to universal spectral bounds that may be useful. We speculate about this circumstance below.

Example 2.4 (Y-Graph). As the next example we consider a simple graph, namely the Y-graph, which is a star-shaped graph with three positive half-axes $\Gamma_i$, $i = 1, 2, 3$, joined at a single vertex $v_1$. If we set
\[ G_1(x) := \begin{cases} g_1 := 0, & x_{11} \in \Gamma_1, \\ g_2 := -x_{12}, & x_{12} \in \Gamma_2, \\ g_3 := x_{13}, & x_{13} \in \Gamma_3, \end{cases} \]
then obviously $G(\mathcal{J}) \subset D(H_\Gamma(\alpha))$ holds, and with Lemma 2.1 we get
\[ \sum_j (z - E_j)_+^2\big(\|\chi_{\Gamma_2}\phi_j\|^2 + \|\chi_{\Gamma_3}\phi_j\|^2\big) - 4\alpha(z - E_j)_+\big(\|\chi_{\Gamma_2}\phi_j'\|^2 + \|\chi_{\Gamma_3}\phi_j'\|^2\big) \le 0. \tag{2.7} \]

As $\Gamma_1$ does not contribute to this inequality, we cyclically permute the zero part of $G$, i.e. we next choose $G_2(x)$, such that $g_2 = 0$, $g_1 = x_{11}$ and $g_3 = -x_{13}$, and finally $G_3(x)$, such that $g_3 = 0$, $g_1 = x_{11}$ and $g_2 = -x_{12}$. These give us two further inequalities analogous to (2.7). Summing all three inequalities, and noting that on every edge, $\sum_{\ell=1}^{3} a_{\ell m} = 2$, we finally obtain
\[ \sum_j 2(z - E_j)_+^2 - 8\alpha(z - E_j)_+\|\phi_j'\|^2 \le 0, \tag{2.8} \]
which when divided by 2 yields the quadratic inequality (2.5).

We next extend the averaging argument to prove (2.5) for arbitrary metric trees. A metric tree $\Gamma$ consists of a set of vertices, a set of leaves and a set of edges, i.e. segments of the real axis, which connect the vertices, such that there is exactly


one path connecting any two vertices. It is common in graph theory to distinguish between edges and leaves; a leaf is joined to a vertex at only one of its endpoints, i.e. there is a free end, at which we shall set Dirichlet boundary conditions. (When the distinction is not material we shall refer to both edges and leaves as edges. It is also common to regard one free end as the distinguished “root” $r$ of the tree, but for our purposes all free ends of the graph have the same status.) We denote the vertices by $v_i$, $i = 1, \dots, n$. The edges including leaves will be denoted by $e$. We shall explicitly write $l_j$ for leaves when the distinction matters.

Theorem 2.5. For any tree graph with a finite number of vertices and edges, the mapping
\[ \alpha \mapsto \alpha^{1/2}\sum_{E_j(\alpha)<0} (-E_j(\alpha))^2 \]
is nonincreasing for all $\alpha > 0$. Consequently
\[ \alpha^{1/2}\sum_{E_j(\alpha)<0} (-E_j(\alpha))^2 \le L^{cl}_{2,1}\int_\Gamma (V_-(x))^{2+1/2}\, dx \]
for all $\alpha > 0$.

Remark 2.6. By the monotonicity principle of Aizenman and Lieb (see [1]), Theorem 2.5 is also true with the sharp constant for higher moments of eigenvalues. Alternatively, the extension to higher values of $\gamma$ can be obtained directly from the trace inequality of [10] for power functions with $\gamma > 2$. Furthermore, Theorem 2.5 can be extended by a density argument to potentials $V \in L^{\gamma+1/2}(\Gamma)$.

To prepare the proof of Theorem 2.5, we first formulate some auxiliary results.

Lemma 2.7. For all $n \in \mathbb{N}$,
\[ \sum_{k=0}^{[\frac{n-1}{2}]}\binom{n-1}{2k} = \sum_{k=0}^{[\frac{n}{2}]-1}\binom{n-1}{2k+1}. \tag{2.9} \]

Proof. This is a simple computation: for $n \ge 2$, the difference of the two sides is the binomial expansion of $(1 - 1)^{n-1} = 0$.

Definition 2.8. Let $E$ be the set of all edges $e \subset \Gamma$. We call the mapping $C : E \to \{0, 1\}$ a coloring and say that $C$ is an admissible coloring if at each vertex $v \in \Gamma$ the number
\[ \#\{e : e \text{ emanates from } v : C(e) = 1\} \]
is even. We let $A(\Gamma)$ denote the set of all admissible colorings on $\Gamma$.


Theorem 2.9. Let $\Gamma_n$ be a metric tree with $n$ vertices. For an edge $e \subset \Gamma_n$, we denote by $a(e, n) := \#\{C \in A(\Gamma_n) : C(e) = 1\}$ the number of all admissible colorings $C \in A(\Gamma_n)$ such that $C(e) = 1$. Then
\[ a(e, n) \text{ is independent of } e \subset \Gamma_n. \tag{2.10} \]
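The statement (2.10) can be checked by brute force on small trees. The sketch below is illustrative only: the data model (edges as pairs of endpoint labels, with the parity condition imposed only at the labels listed in `vertices`, so that free ends of leaves are unconstrained) and the function names are assumptions made for this example.

```python
from itertools import product

def admissible_colorings(edges, vertices):
    """All colorings C: edges -> {0, 1} with an even number of colored edges at each vertex.

    edges    -- list of (endpoint, endpoint) labels; a leaf has one endpoint outside `vertices`
    vertices -- set of labels of the vertices at which the parity condition is imposed
    """
    admissible = []
    for coloring in product((0, 1), repeat=len(edges)):
        if all(sum(c for c, e in zip(coloring, edges) if v in e) % 2 == 0
               for v in vertices):
            admissible.append(coloring)
    return admissible

def counts_per_edge(edges, vertices):
    """a(e, n): the number of admissible colorings with C(e) = 1, listed edge by edge."""
    colorings = admissible_colorings(edges, vertices)
    return [sum(c[i] for c in colorings) for i in range(len(edges))]

# Y-graph: three leaves joined at one vertex.
y_graph = [("v1", "a"), ("v1", "b"), ("v1", "c")]
# Two vertices joined by an edge, carrying two and three leaves, respectively.
tree = [("v1", "a"), ("v1", "b"), ("v1", "v2"), ("v2", "c"), ("v2", "d"), ("v2", "e")]

if __name__ == "__main__":
    print(counts_per_edge(y_graph, {"v1"}))
    print(counts_per_edge(tree, {"v1", "v2"}))
```

For the Y-graph every edge is colored in exactly two admissible colorings, which is the count $\sum_{\ell} a_{\ell m} = 2$ used in Example 2.4; for the six-edge tree every edge is colored in 8 of the 16 admissible colorings.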

Proof. We shall prove (2.10) by induction over the number of vertices of $\Gamma$. The case with one vertex $v_1$ is trivial because of the symmetry of the graph. Given a metric tree $\Gamma_n$ with $n$ vertices, we can decompose it as follows. $\Gamma_n$ consists of a metric tree $\Gamma_{n-1}$ with $n - 1$ vertices to which $m - 1$ leaves $l_j$, $j = 2, \dots, m$, are attached to the free end of a leaf $l_1 \subset \Gamma_{n-1}$. We call the vertex at which the leaves $l_j$, $j = 1, \dots, m$, are joined $v_n$. Hence,
\[ \Gamma_n := \Gamma_{n-1} \cup v_n \cup \bigcup_{j=2}^{m} l_j. \]
By the induction hypothesis,
\[ a(e, n-1) := \#\{C \in A(\Gamma_{n-1}) : C(e) = 1\} \text{ is independent of } e \subset \Gamma_{n-1}. \tag{2.11} \]
Obviously for every edge or leaf $e \ne l_1$ in $\Gamma_{n-1}$, we have
\[ a(e, n-1) = \#\{C \in A(\Gamma_{n-1}) : C(e) = 1 \wedge C(l_1) = 1\} + \#\{C \in A(\Gamma_{n-1}) : C(e) = 1 \wedge C(l_1) = 0\}. \tag{2.12} \]
Now, we have to show that $a(e, n)$ is independent of $e \subset \Gamma_n$. Note first that for each fixed leaf $l_j$ of the subgraph $\Gamma^* = v_n \cup \bigcup_{j=1}^{m} l_j$, we have
\[ \mu_1 := \#\{C \in A(\Gamma^*) : C(l_j) = 1,\ l_j \in \Gamma^*\} = \sum_{k=0}^{[\frac{m}{2}]-1}\binom{m-1}{2k+1} \tag{2.13} \]
and
\[ \mu_0 := \#\{C \in A(\Gamma^*) : C(l_j) = 0,\ l_j \in \Gamma^*\} = \sum_{k=0}^{[\frac{m-1}{2}]}\binom{m-1}{2k}. \tag{2.14} \]
Hence, for arbitrary neighboring edges $e', e'' \subset \Gamma_{n-1}$ the following equality holds,
\[ a(e', n) = \mu_1\,\#\{C \in A(\Gamma_{n-1}) : C(e') = 1 \wedge C(l_1) = 1\} + \mu_0\,\#\{C \in A(\Gamma_{n-1}) : C(e') = 1 \wedge C(l_1) = 0\}, \tag{2.15} \]


and, respectively,
\[ a(e'', n) = \mu_1\,\#\{C \in A(\Gamma_{n-1}) : C(e'') = 1 \wedge C(l_1) = 1\} + \mu_0\,\#\{C \in A(\Gamma_{n-1}) : C(e'') = 1 \wedge C(l_1) = 0\}. \tag{2.16} \]
By Lemma 2.7, $\mu := \mu_0 = \mu_1$. Therefore, with (2.12) the equalities (2.15) and (2.16) read
\[ a(e', n) = \mu\, a(e', n-1), \qquad a(e'', n) = \mu\, a(e'', n-1). \]
Furthermore, by the induction hypothesis, $a(e', n-1) = a(e'', n-1)$, from which it immediately follows that
\[ a(e', n) = \mu\, a(e', n-1) = \mu\, a(e'', n-1) = a(e'', n). \]
This proves Theorem 2.9.

Proof of Theorem 2.5. In order to apply Stubbe’s monotonicity argument [25], we need to establish inequality (2.5) for metric trees. To do this, we proceed as for the example of the Y-graph. Let $\mathcal{J}$ denote the subspace spanned by the eigenfunctions $\phi_j$ on $L^2(\Gamma)$ corresponding to the eigenvalues $E_j$. Note first that there exist self-adjoint operators $G$, which are given by piecewise affine functions $g_i$ on the edges (or leaves) of $\Gamma$, such that $G(\mathcal{J}) \subset D(H(\alpha)) \subset D(G)$. Edges (or leaves) on which constant functions $g_i$ are given do not contribute to the sum rule. Therefore we average over a family of operators $G$, such that every edge $e$ (or leaf) of the tree appears equally often in association with an affine function having $G' = \pm 1$ on $e$.

We let $\mathcal{G}$ denote the set of continuous operators $G(x) = \{g_i(x) \text{ affine},\ x \in e_i \text{ (or } l_i)\}$, which satisfy (1.2) at the vertices $v$ of $\Gamma$. Indeed it is not necessary to average over all the operators $G \in \mathcal{G}$, because it makes no difference in Lemma 2.1, for instance, whether $g_i' = 1$ or $g_i' = -1$. Therefore we define an equivalence relation $\sim$ on $\mathcal{G}$ as follows: Let $\tilde{G} = \{\tilde{g}_i(x) \text{ affine},\ x \in e_i \text{ (or } l_i)\}$ be another operator in $\mathcal{G}$. We say that $G \sim \tilde{G} \Leftrightarrow \forall i \in \{1, \dots, n\} : |g_i'(x)| = |\tilde{g}_i'(x)|$. We define $\mathcal{G}^* := \mathcal{G}/\!\sim$. Then we can consider the isomorphism
\[ I : A(\Gamma) \to \mathcal{G}^*, \tag{2.17} \]
where for each $C \in A(\Gamma)$ we choose an affine function $G_C \in \mathcal{G}^*$ on $\Gamma$, such that $|G_C'(e)| = C(e)$ for every $e \subset \Gamma$. By Theorem 2.9, we know that $\#\{C \in A(\Gamma) : C(e) = 1\}$ is independent of $e \subset \Gamma$. This means that summing up all inequalities corresponding to (2.4), which we get from each $G_C \in \mathcal{G}^*$, leads to
\[ \sum_j (z - E_j)_+^2\, p - 4\alpha(z - E_j)_+\, p\, \|\phi_j'\|^2 \le 0, \tag{2.18} \]


where $p := \sum a_m = \#\{C \in A(\Gamma) : C(e) = 1\}$ and we have used the normalization $\|\phi_j\| = 1$. Having the analogue of inequality (2.5) for metric trees, we can reformulate the monotonicity argument for our case. This proves Theorem 2.5.

Remark 2.10. The proof applies equally to metric trees with leaves of infinite lengths.

2.2. Modified Lieb–Thirring inequalities for one-loop graphs

In this section we consider the graph $\Gamma$ consisting of a circle to which two leaves are attached. It is not hard to see that the construction leading to Lieb–Thirring inequalities with the sharp classical constant fails for one-loop graphs, because no family of auxiliary functions $G_\ell$ exists with the side condition that $\sum_\ell a_{\ell m} = 1$ throughout $\Gamma$. Unlike the case of the balloon graph, it is possible to replace the classical inequality with a weakened version (2.6) as mentioned above. There is, however, another option, based on commutators with exponential functions, following an idea of [10]: As usual, we define the one-parameter family of Schrödinger operators
\[ H(\alpha) = -\alpha\frac{d^2}{dx^2} + V(x), \quad \alpha > 0, \]
in $L^2(\Gamma)$ with the usual conditions (1.2) at each vertex $v_i$ of $\Gamma$. The leaves are denoted by $\Gamma_1 := [0, \infty)$ and $\Gamma_2 := [0, \infty)$, while we write $\Gamma_3$ and $\Gamma_4$ for the semicircles with lengths $L$. Let $\phi_j$ be the eigenfunctions of $H(\alpha)$ corresponding to the eigenvalues $E_j(\alpha)$.

Theorem 2.11. Let $q := 2\pi/L$. For all $\alpha > 0$ the mapping
\[ \alpha \mapsto \alpha^{1/2}\sum_{E_j(\alpha)<0}\Big(z - \frac{3}{16}\alpha q^2 - E_j\Big)_+^2 \tag{2.19} \]
is nonincreasing. Furthermore, for all $z \in \mathbb{R}$ and all $\alpha > 0$ the following sharp Lieb–Thirring inequality holds:
\[ R_2(z, \alpha) \le \alpha^{-1/2} L^{cl}_{2,1}\int_\Gamma \Big(V(x) - \Big(z + \frac{3}{16}q^2\alpha\Big)\Big)_-^{2+1/2} dx, \tag{2.20} \]
where
\[ R_2(z, \alpha) := \sum_{E_j(\alpha)<z} (z - E_j(\alpha))_+^2. \]

Remark 2.12. Once again, Theorem 2.11 can be extended to potentials $V \in L^{\gamma+1/2}(\Gamma)$ and is true for all $\gamma \ge 2$, either by the monotonicity principle of Aizenman and Lieb [1] or by the trace formula of [10] for $\gamma \ge 2$.


For the proof of Theorem 2.11, we make use of a theorem of Harrell and Stubbe:

Theorem 2.13 ([10, Theorem 2.1]). Let $H$ be a self-adjoint operator on $\mathcal{H}$, with a nonempty set $J$ of finitely degenerate eigenvalues lying below the rest of the spectrum $J^c$ and $\{\phi_j\}$ an orthonormal set of eigenfunctions of $H$. Let $G$ be a linear operator with domain $D_G$ and adjoint $G^*$ defined on $D_{G^*}$ such that $G(D_H) \subseteq D_H \subseteq D_G$ and $G^*(D_H) \subseteq D_H \subseteq D_{G^*}$, respectively. Then
\[ \frac{1}{2}\sum_{E_j \in J} (z - E_j)^2\big(\langle [G^*, [H, G]]\phi_j, \phi_j\rangle + \langle [G, [H, G^*]]\phi_j, \phi_j\rangle\big) \le \sum_{E_j \in J} (z - E_j)\big(\|[H, G]\phi_j\|^2 + \|[H, G^*]\phi_j\|^2\big). \tag{2.21} \]

Remark 2.14. Strictly speaking, in [10] it was assumed that the spectrum was purely discrete. However, the extension to the case where continuous spectrum is allowed in $J^c$ follows exactly as in [11, Theorem 2.1].

Proof of Theorem 2.11. In this case, it is not possible to get a quadratic inequality from Lemma 2.1 without worsening the constants. This follows from the fact that the conditions $\phi_3(0) = \phi_4(0)$ and $\phi_3(L) = \phi_4(L)$ imply that the piecewise linear function $G$ has to be defined equally on $\Gamma_3$ and $\Gamma_4$. Consequently, the condition (1.2) can be satisfied only with different values of $a_m$ as in (2.6), namely $a_1 = a_2 = 4a_3 = 4a_4$. Our proof of Theorem 2.11 consists of three steps. First, we apply Lemma 2.1, after which we apply Theorem 2.13. Finally we combine both results and apply the line of argument given in [10].

First step. Using Lemma 2.1 with the choice
\[ G(x) := \begin{cases} g_1 := -2x_{11}, & x_{11} \in \Gamma_1, \\ g_2 := 2x_{22} + L, & x_{22} \in \Gamma_2, \\ g_3 := x_{13}, & x_{13} \in \Gamma_3, \\ g_4 := x_{14}, & x_{14} \in \Gamma_4, \end{cases} \]
we obtain
\[ 4\Big(\sum_{E_j(\alpha)<0} (z - E_j(\alpha))_+^2\, p_{12}(j) - 4\alpha\sum_{E_j(\alpha)<0} (z - E_j(\alpha))_+\, p'_{12}(j)\Big) + \sum_{E_j(\alpha)<0} (z - E_j(\alpha))_+^2\, p_{34}(j) - 4\alpha\sum_{E_j(\alpha)<0} (z - E_j(\alpha))_+\, p'_{34}(j) \le 0, \tag{2.22} \]
where $p_{ik}(j) := \|\chi_{\Gamma_i}\phi_j\|^2 + \|\chi_{\Gamma_k}\phi_j\|^2$ and $p'_{ik}(j) := \|\chi_{\Gamma_i}\phi_j'\|^2 + \|\chi_{\Gamma_k}\phi_j'\|^2$.


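Second step. Next, in Theorem 2.13 we set
\[ G(x) := \begin{cases} g_1 := 1, & x_{11} \in \Gamma_1, \\ g_2 := 1, & x_{22} \in \Gamma_2, \\ g_3 := e^{-i2\pi x_{13}/L}, & x_{13} \in \Gamma_3, \\ g_4 := e^{i2\pi x_{14}/L}, & x_{14} \in \Gamma_4. \end{cases} \]
It is easy to see that $G\phi_j \in D(H(\alpha))$. With $q := 2\pi/L$, the first commutators work out to be
\[ [H_j, g_j] = 0, \quad j = 1, 2, \]
\[ [H_3, g_3] = e^{-iqx_{13}}\alpha\,(q^2 + 2iq\, d/dx), \qquad [H_4, g_4] = e^{iqx_{14}}\alpha\,(q^2 - 2iq\, d/dx); \]
whereas for the second commutators,
\[ [g_j^*, [H_j, g_j]] = [g_j, [H_j, g_j^*]] = 0, \quad j = 1, 2, \qquad [g_j^*, [H_j, g_j]] = [g_j, [H_j, g_j^*]] = 2\alpha q^2, \quad j = 3, 4. \tag{2.23} \]
From inequality (2.21), we get
\[ \sum_{E_j(\alpha) \in J} (z - E_j(\alpha))^2\, p_{34}(j) \le \alpha\sum_{E_j(\alpha) \in J} (z - E_j(\alpha))\,\big(q^2 p_{34}(j) + 4p'_{34}(j)\big). \tag{2.24} \]

Third step. Adding (2.22) and three times (2.24), dividing by 4, and using $p_{12}(j) + p_{34}(j) = \|\phi_j\|^2 = 1$ together with the Feynman–Hellmann theorem, we finally obtain
\[ R_2(z, \alpha) + 2\alpha\frac{d}{d\alpha}R_2(z, \alpha) \le \frac{3}{4}\alpha q^2\sum_{E_j \in J} (z - E_j)\, p_{34}(j), \tag{2.25} \]
or, since $p_{34}(j) \le 1$,
\[ 2R_2(z, \alpha) + 4\alpha\frac{d}{d\alpha}R_2(z, \alpha) - \frac{3}{2}\alpha q^2 R_1 \le 0, \tag{2.26} \]
which is equivalent to
\[ \frac{\partial}{\partial\alpha}\big(\alpha^{1/2}R_2(z, \alpha)\big) \le \frac{3q^2}{8}\alpha^{1/2}R_1(z, \alpha). \tag{2.27} \]
Letting $U(z, \alpha) := \alpha^{1/2}R_2(z, \alpha)$, the inequality has the form
\[ \frac{\partial U}{\partial\alpha} \le \frac{3}{16}q^2\frac{\partial U}{\partial z}. \tag{2.28} \]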
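The passage from (2.26) to (2.27) and (2.28) is a one-line computation in each case. The following worked step is a sketch that assumes only the definitions $U = \alpha^{1/2}R_2$ and $R_\rho(z, \alpha) = \sum_j (z - E_j(\alpha))_+^\rho$, so that $\partial R_2/\partial z = 2R_1$.

```latex
\begin{align*}
\frac{\partial}{\partial\alpha}\big(\alpha^{1/2}R_2\big)
  &= \frac{1}{4\alpha^{1/2}}\Big(2R_2 + 4\alpha\frac{\partial R_2}{\partial\alpha}\Big)
   \le \frac{1}{4\alpha^{1/2}}\cdot\frac{3}{2}\,\alpha q^2 R_1
   = \frac{3q^2}{8}\,\alpha^{1/2}R_1,\\
\frac{\partial U}{\partial z}
  &= \alpha^{1/2}\frac{\partial R_2}{\partial z} = 2\alpha^{1/2}R_1 .
\end{align*}
```

Substituting $\alpha^{1/2}R_1 = \frac{1}{2}\,\partial U/\partial z$ into the first line gives $\partial U/\partial\alpha \le \frac{3}{16}q^2\,\partial U/\partial z$, which is (2.28).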


Since the expression in (2.19) can be written as $U(z - \frac{3}{16}q^2\alpha, \alpha)$, an application of the chain rule shows that the monotonicity claimed in (2.19) follows from (2.28). (We note that (2.28) can be solved by changing to characteristic variables $\xi := \alpha - \frac{16z}{3q^2}$, $\eta := \alpha + \frac{16z}{3q^2}$, in terms of which
\[ \frac{\partial U}{\partial\xi} \le 0. \tag{2.29} \]
That is, $U$ decreases as $\xi$ increases while $\eta$ is fixed.) By shifting the variable in (2.29), we also obtain
\[ U(z, \alpha) \le U\Big(z + \frac{3}{16}q^2(\alpha - \alpha_s), \alpha_s\Big) \tag{2.30} \]
for $\alpha \ge \alpha_s$. By Weyl’s asymptotics, for all $\gamma \ge 0$,
\[ \lim_{\alpha \to 0+}\alpha^{d/2}\sum_{E_j(\alpha)<z} (z - E_j(\alpha))^\gamma = L^{cl}_{\gamma,d}\int_\Gamma (V(x) - z)_-^{\gamma+d/2}\, dx, \tag{2.31} \]
see [4, 28]. Hence, as $\alpha_s \to 0$, the right-hand side of (2.30) tends to
\[ L^{cl}_{2,1}\int_\Gamma \Big(V(x) - \Big(z + \frac{3}{16}q^2\alpha\Big)\Big)_-^{2+1/2} dx, \]
so the conclusion of Theorem 2.11 follows.

Remark 2.15. Theorem 2.11 can be generalized to one-loop graphs to which $2n$, $n \in \mathbb{N}$, equidistant semiaxes are attached.

To summarize, in this section we have seen that for some classes of quantum graphs a quadratic inequality (2.5) can be proved with the classical constants, and that for some other classes of graphs similar statements can be proved at the price of worse constants as in (2.6), or of a shift in the zero-point energy as in (2.20). It is reasonable to ask whether one can look at the connectedness of a graph and say whether a weak Yang-type inequality (2.6) can be proved. As we have seen, this is the case if there exists a family of continuous functions $G_\ell$ on the graph such that
• On each edge, all the derivatives $\{G_\ell'\}$ are constant.
• At each vertex $v_k$, each function $G_\ell$ satisfies $\sum_j \frac{dG_\ell}{dx_{kj}}(0^+) = 0$.
• For each edge $e$ there exists at least one function $G_\ell$ with $G_\ell' \ne 0$.


Interestingly, the question of the existence of such a family of functions can be rephrased in terms of the theory of electrical resistive circuits, a subject dating from the mid-19th century [14]. We first note that for a suitable family of functions to exist, there must be at least two leaves, which can be regarded as external leads of an electric circuit, bearing some resistance. (In the finite case let the resistance be equivalent to the length of the leaf, and in the infinite case let it be some fixed finite value, at least as large as the length of any finite leaf.) Each internal edge is regarded as a wire bearing a resistance equal to the length of the edge. If we regard the value of $G_\ell'$ as a current, then Kirchhoff’s condition at the vertex of an electric circuit is exactly the condition (1.2) that $\sum_j \frac{dG_\ell}{dx_{kj}}(0^+) = 0$, and the condition that the electric potential $G_\ell$ must be uniquely defined at all vertices is equivalent to global continuity of $G_\ell$. It has been known since Weyl [29] that the currents and potentials in an electric circuit are uniquely determined by the voltages applied at the leads. There are, however, circuits such that no matter what voltages are applied to the external leads, there will be an internal wire where no current flows; the most well known of these is the Wheatstone bridge. (See, for instance, the Wikipedia article on the Wheatstone bridge.) Let us call a metric graph a generalized Wheatstone bridge when the corresponding circuit has exactly two external leads and a configuration for which no current will flow in at least one of its wires.

Then we conjecture that there are only two impediments to the existence of a suitable family of functions $G_\ell$, and therefore to a weakened quadratic inequality (2.6), namely: Unless a quantum graph contains either
• a subgraph that can be disconnected from all leaves by the removal of one point (such as a balloon graph or a graph shaped like the letter α); or
• a subgraph that when disconnected from the graph by cutting two edges is a generalized Wheatstone bridge,
then an inequality of the form (2.6) holds. Otherwise the best that can be obtained may be a modified quadratic inequality with a variable shift, as in Theorem 2.11.

Fig. 2. The “Wheatstone bridge”.


3. Universal Bounds for Finite Quantum Graphs

In this section, we derive differential inequalities for Riesz means of eigenvalues of the Dirichlet Laplacian on bounded metric trees $\Gamma$ with at least one leaf (free edge). From these inequalities we derive Weyl-type bounds on the averages of the eigenvalues of the Dirichlet Laplacian
\[ H_D := -\frac{d^2}{dx_D^2} \quad \text{in } L^2(\Gamma), \]
with the conditions (1.2) at each vertex $v_i$. At the ends of the leaves, vanishing Dirichlet boundary conditions are imposed. We recall that with the methods of [9, 12] these are consequences of the same quadratic inequality (2.5) as was used above to prove Lieb–Thirring inequalities. When the total length of the graph is finite, the operator $H_D$ on $D(H_D)$ has a positive discrete spectrum $\{E_j\}_{j=1}^\infty$, allowing us to define the Riesz mean of order $\rho$,
\[ R_\rho(z) := \sum_j (z - E_j)_+^\rho \tag{3.1} \]
for $\rho > 0$ and real $z$.

Theorem 3.1. Let $\Gamma$ be a metric tree of finite length and with finitely many edges and vertices, and let $H_D$ be the Dirichlet Laplacian in $L^2(\Gamma)$ with domain $D(H_D)$. Then for $z > 0$,
\[ R_1(z) \ge \frac{5}{4z}R_2(z); \tag{3.2} \]
\[ R_2'(z) \ge \frac{5}{2z}R_2(z); \tag{3.3} \]
and consequently $R_2(z)/z^{5/2}$ is a nondecreasing function of $z$.

Proof. The claims are vacuous for $z \le E_1$, so we henceforth assume $z > E_1$. The line of reasoning of the proof of Theorem 2.5 applies just as well to the operator $H_D$ on $D(H_D)$, yielding
\[ \sum_j (z - E_j)_+^2 - 4(z - E_j)_+\|\phi_j'\|^2 \le 0. \tag{3.4} \]
Since $V \equiv 0$, $\|\phi_j'\|^2 = E_j$. Observing that
\[ \sum_j (z - E_j)_+ E_j = zR_1(z) - R_2(z), \]


we get from (3.4) $5R_2(z) - 4zR_1(z) \le 0$. This proves (3.2). Inequality (3.3) follows from (3.2), as $R_2'(z) = 2R_1(z)$.

Since by Theorem 3.1, $R_2(z) z^{-5/2}$ is a nondecreasing function, we obtain a lower bound of the form $R_2(z) \ge C z^{5/2}$ for all $z \ge z_0$ in terms of $R_2(z_0)$. Upper bounds can be obtained from the limiting behavior of $R_2(z)$ as $z \to \infty$, as given by the Weyl law. In the following, we follow [9] to derive Weyl-type bounds on the averages of the eigenvalues of $H_D$ in $L^2(\Gamma)$.

Corollary 3.2. For $z \ge 5E_1$,
$$16 E_1^{-1/2} \left(\frac{z}{5}\right)^{5/2} \le R_2(z) \le L^{cl}_{2,1}\, |\Gamma|\, z^{5/2},$$
where
$$L^{cl}_{2,1} := \frac{\Gamma(3)}{(4\pi)^{1/2}\,\Gamma(7/2)},$$
and $|\Gamma|$ is the total length of the tree.

Proof. By Theorem 3.1, for all $z \ge z_0$,
$$\frac{R_2(z)}{z^{5/2}} \ge \frac{R_2(z_0)}{z_0^{5/2}}. \tag{3.5}$$
As $R_2(z_0) \ge (z_0 - E_1)_+^2$ for any $z_0 > E_1$, it follows from (3.5) that
$$R_2(z) \ge (z_0 - E_1)_+^2 \left(\frac{z}{z_0}\right)^{5/2}.$$
The coefficient $(z_0 - E_1)_+^2 / z_0^{5/2}$ is maximized when $z_0 = 5E_1$. Thus we get
$$16 E_1^{-1/2} \left(\frac{z}{5}\right)^{5/2} \le R_2(z).$$
For metric trees with total length $|\Gamma|$, the Weyl law states that
$$\lim_{n\to\infty} \frac{\sqrt{E_n}}{n} = \frac{\pi}{|\Gamma|}$$
(see [16]). It follows that
$$\frac{R_2(z)}{z^{5/2}} \to L^{cl}_{2,1}\, |\Gamma|$$
as $z \to \infty$. Since $R_2(z)/z^{5/2}$ is nondecreasing, we get
$$\frac{R_2(z)}{z^{5/2}} \le L^{cl}_{2,1}\, |\Gamma|, \qquad \forall\, z < \infty. \tag{3.6}$$
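As a sanity check, the inequalities of Theorem 3.1 and Corollary 3.2 can be tested numerically. The sketch below is illustrative only and is not part of the original argument: it assumes the simplest example, a single edge of length $|\Gamma|$ with Dirichlet conditions at both ends (a tree with two leaves), for which $E_j = (\pi j/|\Gamma|)^2$. All function names are hypothetical.

```python
import math

def interval_spectrum(length, z_max):
    """Dirichlet eigenvalues (pi*j/length)**2 below z_max for a single edge (assumed example)."""
    eigs = []
    j = 1
    while (math.pi * j / length) ** 2 < z_max:
        eigs.append((math.pi * j / length) ** 2)
        j += 1
    return eigs

def riesz_mean(eigs, z, rho):
    """R_rho(z) = sum_j (z - E_j)_+^rho, cf. (3.1)."""
    return sum((z - e) ** rho for e in eigs if e < z)

# Classical constant L^cl_{2,1} = Gamma(3) / ((4 pi)^(1/2) Gamma(7/2)) = 8 / (15 pi)
L_CL = math.gamma(3) / (math.sqrt(4 * math.pi) * math.gamma(3.5))

def check_bounds(length=math.pi, z_values=(5.0, 7.5, 20.0, 100.0, 1000.0)):
    """Return True if (3.2), the monotonicity of R_2(z)/z^(5/2) and Corollary 3.2
    hold at the sample points; every sample point must satisfy z >= 5 E_1."""
    eigs = interval_spectrum(length, max(z_values) + 1.0)
    e1 = eigs[0]
    prev_ratio = 0.0
    for z in sorted(z_values):
        r1 = riesz_mean(eigs, z, 1)
        r2 = riesz_mean(eigs, z, 2)
        ratio = r2 / z ** 2.5
        ok = (
            r1 >= 5 * r2 / (4 * z)                        # inequality (3.2)
            and ratio >= prev_ratio                       # R_2(z)/z^(5/2) nondecreasing
            and 16 * e1 ** -0.5 * (z / 5) ** 2.5 <= r2    # lower bound of Corollary 3.2
            and ratio <= L_CL * length                    # upper bound (3.6)
        )
        if not ok:
            return False
        prev_ratio = ratio
    return True
```

For the default length $|\Gamma| = \pi$ the eigenvalues are $1, 4, 9, \ldots$, so the sample points start exactly at the threshold $z = 5E_1$ of Corollary 3.2.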


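In summary, we get from Theorem 3.1 and Corollary 3.2 the following two-sided estimate:
$$4 E_1^{-1/2} \left(\frac{z}{5}\right)^{3/2} \le \frac{5}{4z}\, R_2(z) \le R_1(z). \tag{3.7}$$
In order to obtain similar estimates, related to higher eigenvalues, we introduce the notation
$$\overline{E_j} := \frac{1}{j} \sum_{\ell \le j} E_\ell$$
for the means of eigenvalues $E_\ell$; similarly, the means of the squared eigenvalues are denoted
$$\overline{E_j^2} := \frac{1}{j} \sum_{\ell \le j} E_\ell^2.$$
For a given $z$, we let $\mathrm{ind}(z)$ be the greatest integer $i$ such that $E_i \le z$. Then obviously,
$$R_2(z) = \mathrm{ind}(z)\left(z^2 - 2z\,\overline{E_{\mathrm{ind}(z)}} + \overline{E^2_{\mathrm{ind}(z)}}\right).$$
As for any integer $j$ and all $z \ge E_j$, $\mathrm{ind}(z) \ge j$, we get
$$R_2(z) \ge D(z, j) := j\left(z^2 - 2z\,\overline{E_j} + \overline{E_j^2}\right).$$
Using Theorem 3.1 for $z \ge z_j \ge E_j$, it follows that
$$R_2(z) \ge D(z_j, j) \left(\frac{z}{z_j}\right)^{5/2}. \tag{3.8}$$
Furthermore, $\overline{E_j}^{\,2} \le \overline{E_j^2}$ by the Cauchy–Schwarz inequality, and hence
$$D(z, j) = j\left((z - \overline{E_j})^2 + \overline{E_j^2} - \overline{E_j}^{\,2}\right) \ge j (z - \overline{E_j})^2. \tag{3.9}$$
This establishes the following

Corollary 3.3. Suppose that $z \ge 5\overline{E_j}$. Then
$$R_2(z) \ge \frac{16\, j\, z^{5/2}}{25\, (5\overline{E_j})^{1/2}} \tag{3.10}$$
and, therefore,
$$R_1(z) \ge \frac{4\, j\, z^{3/2}}{5\, (5\overline{E_j})^{1/2}}. \tag{3.11}$$

Proof. Combining Eqs. (3.8) and (3.9), we get
$$R_2(z) \ge j (z_j - \overline{E_j})^2 \left(\frac{z}{z_j}\right)^{5/2}.$$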


Inserting $z_j = 5\overline{E_j}$ the first statement follows. (This choice of $z_j$ maximizes the constant appearing in (3.10).) The second statement results from substituting the first statement into (3.7).

The Legendre transform is an effective tool for converting bounds on $R_\rho(z)$ into bounds on the spectrum, as has been realized previously, e.g., in [17]. Recall that if $f(z)$ is a convex function on $\mathbb{R}_+$ that is superlinear in $z$ as $z \to +\infty$, its Legendre transform
$$\mathcal{L}[f](w) := \sup_z \{wz - f(z)\}$$
is likewise a superlinear convex function. Moreover, for each $w$, the supremum in this formula is attained at some finite value of $z$. We also note that the transform reverses order: if $f(z) \ge g(z)$ for all $z$, then $\mathcal{L}[f](w) \le \mathcal{L}[g](w)$ for all $w$. The Legendre transform of the two sides of inequality (3.11) is a straightforward calculation (e.g., see [9]). The result is
$$(w - [w])\, E_{[w]+1} + [w]\, \overline{E_{[w]}} \le \frac{w^3}{j^2}\, \frac{125}{108}\, \overline{E_j}, \tag{3.12}$$
for certain values of $w$ and $j$. In Corollary 3.3 it is supposed that $z \ge 5\overline{E_j}$. Let $z_{\max}$ be the value for which $\mathcal{L}[f](w) = w z_{\max} - f(z_{\max})$, where $f$ is the right-hand side of (3.11). Then by an elementary calculation,
$$w = \frac{6j}{5} \left(\frac{z_{\max}}{5\overline{E_j}}\right)^{1/2}.$$
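The two Legendre transforms entering (3.12) can be checked numerically. The following is a minimal sketch under stated assumptions, not the authors' computation: the helper names are hypothetical, the spectrum is passed as a sorted list, and the power-law transform is approximated on a crude grid. It uses that $z \mapsto wz - R_1(z)$ is concave and piecewise linear, and that for $f(z) = c z^{3/2}$ the supremum is attained at $z = (2w/3c)^2$, giving $\mathcal{L}[f](w) = 4w^3/(27c^2)$.

```python
def legendre_riesz1(eigs, w):
    """sup_z {w z - R_1(z)} for R_1(z) = sum_j (z - E_j)_+ and sorted eigenvalues eigs.

    The function z -> w z - R_1(z) is concave and piecewise linear, so for
    0 <= w < len(eigs) the supremum is attained at one of the breakpoints z = E_k.
    """
    def r1(z):
        return sum(z - e for e in eigs if e < z)
    return max(w * z - r1(z) for z in eigs)

def legendre_riesz1_closed(eigs, w):
    """Left-hand side of (3.12): (w - [w]) E_{[w]+1} + sum of the first [w] eigenvalues."""
    k = int(w)  # integer part [w]; eigs[k] is E_{[w]+1} in 1-based numbering
    return (w - k) * eigs[k] + sum(eigs[:k])

def legendre_power(c, w, steps=200000):
    """Grid approximation of sup_z {w z - c z^(3/2)} over 0 <= z <= 2 z_star."""
    z_star = (2 * w / (3 * c)) ** 2  # stationary point of w z - c z^(3/2)
    z_top = 2 * z_star
    return max(w * z - c * z ** 1.5
               for z in (z_top * i / steps for i in range(steps + 1)))

def legendre_power_closed(c, w):
    """Closed form 4 w^3 / (27 c^2) of the transform of c z^(3/2)."""
    return 4 * w ** 3 / (27 * c ** 2)
```

With $c = 4j/(5(5\overline{E_j})^{1/2})$, the closed form reduces to $\frac{w^3}{j^2}\frac{125}{108}\overline{E_j}$, the right-hand side of (3.12).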

It follows that inequality (3.12) is valid for $w \ge 6j/5$. Meanwhile, for any $w$ we can always find an integer $k$ such that on the left-hand side of (3.12), $k - 1 \le w < k$. If $k > 6j/5$ and if we let $w$ approach $k$ from below, we obtain from (3.12)
$$E_k + (k - 1)\, \overline{E_{k-1}} \le \frac{k^3}{j^2}\, \frac{125}{108}\, \overline{E_j}.$$
The left-hand side of this inequality is the sum of the eigenvalues $E_1$ through $E_k$, so we get the following:

Corollary 3.4. For $k \ge \frac{6}{5} j$, the means of the eigenvalues of the Dirichlet Laplacian on an arbitrary metric tree with finitely many edges and vertices satisfy a universal Weyl-type bound,
$$\frac{\overline{E_k}}{\overline{E_j}} \le \frac{125}{108} \left(\frac{k}{j}\right)^2. \tag{3.13}$$

In [10] it was shown that a similar inequality with a different constant can be proved for all $k \ge j$ in the context of the Dirichlet Laplacian on Euclidean domains. The very same argument applies to quantum graphs with $V = 0$. With


this assumption $\|\phi_j'\|^2 = E_j$, so with $\alpha = 1$ (2.5) can be rewritten as a quadratic inequality,
$$P_j(z) := \sum_{\ell=1}^{j} (z - E_\ell)(z - 5E_\ell) \le 0 \tag{3.14}$$
for $z \in [E_j, E_{j+1}]$ (cf. [10, Eq. (4.6)]). From (3.2) and (3.5) for $z \ge z_0 \ge E_j$,
$$R_1(z) \ge \frac{5}{4z}\, R_2(z) \ge \frac{5}{4}\, z^{3/2} z_0^{-5/2} \sum_{\ell=1}^{j} (z_0 - E_\ell)^2. \tag{3.15}$$
The derivative of the right-hand side of (3.15) with respect to $z_0$, by a calculation, is a negative quantity times $P_j(z_0)$, and therefore an optimal choice for the value of (3.15) is the root
$$z_0 = 3\overline{E_j} + \sqrt{D_j} \le 5\overline{E_j}, \tag{3.16}$$
where $D_j$ is the discriminant of $P_j$. The inequality in (3.16) results from the Cauchy–Schwarz inequality as in [10, 12]. Because $P_j(z_0) = 0$,
$$0 = \sum_{\ell=1}^{j} (z_0 - E_\ell)(z_0 - 5E_\ell) = 5 \sum_{\ell=1}^{j} (z_0 - E_\ell)^2 - 4 z_0 \sum_{\ell=1}^{j} (z_0 - E_\ell),$$
so (3.15) reads
$$R_1(z) \ge \left(\frac{z}{z_0}\right)^{3/2} \sum_{\ell=1}^{j} (z_0 - E_\ell) = \left(\frac{z}{z_0}\right)^{3/2} j\, (z_0 - \overline{E_j}).$$
From the left-hand side of (3.16), $z_0 - \overline{E_j} \ge \frac{2}{3} z_0$, so
$$R_1(z) \ge \frac{2}{3}\, j\, z_0^{-1/2} z^{3/2}. \tag{3.17}$$
The Legendre transform of (3.17) is
$$k\, \overline{E_k} \le \frac{z_0}{3 j^2}\, k^3, \tag{3.18}$$
and a calculation of the maximizing $z$ in the Legendre transform of the right-hand side of (3.17) shows that (3.18) is valid for all $k > j$. In particular, with the inequality on the right-hand side of (3.16), we have established the following:


Corollary 3.5. For $k \ge j$, the means of the eigenvalues of $H_D$ in $L^2(\Gamma)$ satisfy
$$\frac{\overline{E_k}}{\overline{E_j}} \le \frac{5}{3} \left(\frac{k}{j}\right)^2. \tag{3.19}$$

Remark 3.6. Relaxing the assumption to $k \ge j$ comes at the price of making the constant on the right-hand side larger. It would be possible to interpolate between (3.19) and (3.13) for $k \in [j, 6j/5]$ with a slightly better inequality.

Acknowledgment

The authors are grateful to several people for useful comments, including Rupert L. Frank, Lotfi Hermi, Thomas Morley, Joachim Stubbe, and Timo Weidl, and to Michael Music for calculations and insights generated by them. We also wish to express our appreciation to the Mathematisches Forschungsinstitut Oberwolfach for hosting a workshop in February 2009, where this collaboration began, and to the Erwin Schrödinger Institut for hospitality.

References

[1] M. Aizenman and E. H. Lieb, On semiclassical bounds for eigenvalues of Schrödinger operators, Phys. Lett. A 66(6) (1978) 427–429.
[2] M. S. Ashbaugh, The universal eigenvalue bounds of Payne–Pólya–Weinberger, Hile–Protter, and H. C. Yang, Spectral and Inverse Spectral Theory (Goa, 2000), Proc. Indian Acad. Sci. Math. Sci. 112 (2002) 3–30.
[3] G. Berkolaiko, R. Carlson, S. A. Fulling and P. Kuchment (eds.), Quantum Graphs and Their Applications, Contemporary Mathematics, Vol. 415 (American Mathematical Society, 2006).
[4] M. Sh. Birman, The spectrum of singular boundary problems, Amer. Math. Soc. Trans. (2) 53 (1966) 23–80.
[5] M. Cwikel, Weak type estimates for singular values and the number of bound states of Schrödinger operators, Ann. Math. (2) 106(1) (1977) 93–100.
[6] T. Ekholm, R. L. Frank and H. Kovařík, Eigenvalue estimates for Schrödinger operators on metric trees, arXiv:0710.5500.
[7] P. Exner, J. P. Keating, P. Kuchment, T. Sunada and A. Teplyaev (eds.), Analysis on Graphs and Its Applications, Proceedings of Symposia in Pure Mathematics, Vol. 77 (American Mathematical Society, Providence, RI, 2008); Papers from the program held in Cambridge, January 8–June 29 (2007).
[8] E. M. Harrell, II and L. Hermi, On Riesz means of eigenvalues, arXiv:0712.4088.
[9] E. M. Harrell, II and L. Hermi, Differential inequalities for Riesz means and Weyl-type bounds for eigenvalues, J. Funct. Anal. 254(12) (2008) 3173–3191.
[10] E. M. Harrell, II and J. Stubbe, Trace identities for commutators with applications to the distribution of eigenvalues, arXiv:0903.0563v1.
[11] E. M. Harrell, II and J. Stubbe, Universal bounds and semiclassical estimates for eigenvalues of abstract Schrödinger operators, arXiv:0808.1133.
[12] E. M. Harrell, II and J. Stubbe, On trace identities and universal eigenvalue estimates for some partial differential operators, Trans. Amer. Math. Soc. 349(5) (1997) 1797–1809.


[13] D. Hundertmark, Bound state problems in quantum mechanics, in Spectral Theory and Mathematical Physics: A Festschrift in Honor of Barry Simon’s 60th Birthday, Proc. Sympos. Pure Math., Vol. 76, Part 1 (Amer. Math. Soc., Providence, RI, 1980), pp. 463–496.
[14] G. R. Kirchhoff, Über die Auflösung der Gleichungen, auf welche man bei der Untersuchung der linearen Vertheilung galvanischer Ströme geführt wird, Poggendorf’s Ann. Phys. Chem. 72 (1847) 497–508.
[15] P. Kuchment, Quantum graphs: An introduction and a brief survey, in Analysis on Graphs and Its Applications, Proc. Symp. Pure. Math. (Amer. Math. Soc., Providence, RI, 2008), pp. 291–314.
[16] P. Kurasov, Schrödinger operators on graphs and geometry. I. Essentially bounded potentials, J. Funct. Anal. 254(4) (2008) 934–953.
[17] A. Laptev and T. Weidl, Recent results on Lieb–Thirring inequalities, in Journées “Équations aux Dérivées Partielles” (La Chapelle sur Erdre, 2000), Exp. No. XX (Univ. Nantes, Nantes, 2000), 14pp.
[18] A. Laptev and T. Weidl, Sharp Lieb–Thirring inequalities in high dimensions, Acta Math. 184(1) (2000) 87–111.
[19] E. H. Lieb, The number of bound states of one-body Schrödinger operators and the Weyl problem, in Geometry of the Laplace Operator (Proc. Sympos. Pure Math., Univ. Hawaii, Honolulu, Hawaii, 1979), Proc. Sympos. Pure Math., Vol. 36 (Amer. Math. Soc., Providence, RI, 1980), pp. 241–252.
[20] E. H. Lieb and W. Thirring, Inequalities for the moments of the eigenvalues of the Schrödinger Hamiltonian and their relation to Sobolev inequalities, Studies in Mathematical Physics: Essays in Honor of Valentine Bargmann (Princeton Univ. Press, 1976), pp. 269–303.
[21] L. Pauling, The diamagnetic anisotropy of aromatic molecules, J. Chem. Phys. 4 (1936) 673–677.
[22] L. H. Payne, G. Pólya and H. F. Weinberger, On the ratio of consecutive eigenvalues, J. Math. Phys. 35 (1956) 289–298.
[23] G. V. Rozenblum, Distribution of the discrete spectrum of singular differential operators, Izv. Vysš. Učebn. Zaved. Matematika 1(164) (1976) 75–86.
[24] K. Ruedenberg and C. W. Scherr, Free-electron network model for conjugated systems, I, Theory, J. Chem. Phys. 21 (1953) 1565–1581.
[25] J. Stubbe, Universal monotonicity of eigenvalue moments and sharp Lieb–Thirring inequalities, preprint (2008).
[26] W. Thirring, A Course in Mathematical Physics: Quantum Mechanics of Atoms and Molecules, Vol. 3 (Springer-Verlag, 1991), pp. 149–150.
[27] T. Weidl, On the Lieb–Thirring constants $L_{\gamma,1}$ for $\gamma \ge 1/2$, Comm. Math. Phys. 178(1) (1996) 135–146.
[28] H. Weyl, Das asymptotische Verteilungsgesetz der Eigenwerte linearer partieller Differentialgleichungen, Math. Ann. 71 (1912) 441–479.
[29] H. Weyl, Repartición de corriente en una red conductora, Rev. Mat. Hisp. Amer. 5(1) (1923) 153–164.
[30] H. C. Yang, Estimates of the difference between consecutive eigenvalues, preprint (1995); revision of International Centre for Theoretical Physics, preprint IC/91/60, Trieste (April 1991).

14:17 WSPC/S0129-055X

148-RMP

J070-00397

Reviews in Mathematical Physics Vol. 22, No. 3 (2010) 331–354 c World Scientiﬁc Publishing Company DOI: 10.1142/S0129055X10003977

GEOMETRIC MODULAR ACTION FOR DISJOINT INTERVALS AND BOUNDARY CONFORMAL FIELD THEORY

ROBERTO LONGO∗,§, PIERRE MARTINETTI∗,†,‡,¶ and KARL-HENNING REHREN†,‡,

∗Dipartimento di Matematica, Università di Roma 2 “Tor Vergata”, 00133 Roma, Italy
†Institut für Theoretische Physik, Universität Göttingen, Friedrich-Hund-Platz 1, 37077 Göttingen, Germany
‡Courant Centre “Higher Order Structures in Mathematics”, Universität Göttingen, Bunsenstr. 3-5, 37073 Göttingen, Germany
§[email protected] ¶[email protected] [email protected]

Received 7 December 2009
Revised 14 January 2010

Dedicated to John E. Roberts on the occasion of his 70th birthday

In suitable states, the modular group of local algebras associated with unions of disjoint intervals in chiral conformal quantum field theory acts geometrically. We translate this result into the setting of boundary conformal QFT and interpret it as a relation between temperature and acceleration. We also discuss novel aspects (“mixing” and “charge splitting”) of geometric modular action for unions of disjoint intervals in the vacuum state.

Keywords: Quantum field theory; modular theory.

Mathematics Subject Classifications 2010: 81T40

1. Introduction

Geometric modular action is a most remarkable feature of quantum field theory [2], emerging from the combination of the basic principles: unitarity, locality, covariance and positive energy [1]. It associates thermal properties with localization [17, 30], and is intimately related to the Unruh effect [34] and Hawking radiation [31]. It allows for a reconstruction of space and time along with their symmetries [7], and for a construction of full-fledged quantum field theories [23, 16] out of purely algebraic data together with a Hilbert space vector (the vacuum).


The modular group [32, Chap. VI, Theorem 1.19] is an intrinsic group of automorphisms of a von Neumann algebra $M$, associated with a cyclic and separating vector $\Phi$, provided by the theory of Tomita and Takesaki [17, 32]. In quantum field theory, $M$ may be the algebra of observables localized in a wedge region $\{x \in \mathbb{R}^4 : x_1 > |x_0|\}$ and $\Phi = \Omega$ the vacuum state. In this situation it follows [1] that the associated modular group is the 1-parameter group of Lorentz boosts in the 1-direction, which preserves the wedge, i.e. it has a geometric action on the subalgebras of observables localized in subregions of the wedge.

Geometric modular action was also established for the algebras of observables localized in lightcones or double cones in the vacuum state in conformally invariant QFT [5, 16], and for interval algebras in chiral conformal QFT [4]. It is known, however, that the modular group of the vacuum state is not geometric (“fuzzy”) for double cone algebras in massive QFT (see, e.g., [2, 29]), and the same is true for the modular group of wedge algebras or conformal double cone algebras in thermal states [3].

In this contribution, we shall be interested in modular groups for algebras associated with disconnected regions (such as unions of disjoint intervals in chiral conformal QFT). Our starting point is the observation [21] that in chiral conformal QFT (the precise assumptions will be specified below), for any finite number $n$ of disjoint intervals $I_i$ on the circle one can find product states (not the vacuum if $n > 1$) on the algebras $A(\bigcup_i I_i) = \bigvee_i A(I_i)$ whose modular groups act geometrically inside the intervals.

For $n = 2$, let $E = I_1 \cup I_2$ and $E' = S^1 \setminus \overline{E}$ the complement of the closure of $E$. By locality, $A(E) \subset A(E')'$, where the inclusion is in general proper. The larger algebra $A(E')'$ admits the physical re-interpretation as a double cone algebra $B_+(O)$ in boundary conformal QFT [25] as will be explained in Sec. 2.2. The above state on $A(E)$ can be extended to a state on $B_+(O) = A(E')'$ such that the geometric modular action is preserved. We shall compute the geometric flow in the double cone $O$ in Sec. 2. Adopting the interpretation of $\frac{d\tau}{ds}$ as inverse temperature $\beta$ (where $\tau$ is the proper time along an orbit and $s$ the modular group parameter) [11, 28], we compute the relation between temperature and acceleration. There is not a simple proportionality as in the case of the Hawking temperature.

In Sec. 3, we shall connect our results with a recent work by Casini and Huerta [9]. In a first quantization approach as in [14], these authors have succeeded in computing the operator resolvent in the formula of [14] for the modular operator. From this, they obtained the modular flow for disjoint intervals and double cones in 2 dimensions in the theory of free Fermi fields. Unlike [21], they consider the vacuum state. They find a geometric modular action in the massless case (including the chiral case), but this action involves a “mixing” (“modular teleportation” [9]) between the different intervals or double cones, respectively. We shall discuss how, upon descent to gauge-invariant subtheories, the mixing leads to the new phenomenon of “charge splitting” (Sec. 3.3).


Ignoring the mixing, the geometric part of the vacuum modular flow for two intervals in the chiral free Fermi model is the same as the purely geometric modular flow in the previous non-vacuum product state, provided a “canonical” choice for the latter is made, in the model-independent approach. We shall make the result of Casini and Huerta (which was obtained by formal manipulations of operator kernels) rigorous by establishing the KMS property of the vacuum state with respect to the modular action they found. We shall also present a preliminary discussion of the question to what extent the result may be expected to hold in other than free Fermi theories.

2. Geometric Modular Flow for n-Intervals

Let $I \mapsto A(I)$ be a diffeomorphism covariant local net on the circle $S^1$: the orientation-preserving diffeomorphisms $\gamma$ of $S^1$ are unitarily implemented by $U(\gamma)$ such that $\mathrm{Ad}\,U(\gamma)$ maps $A(I)$ onto $A(\gamma(I))$ and $\mathrm{Ad}\,U(\gamma)|_{A(I)} = \mathrm{id}|_{A(I)}$ if $\gamma|_I = \mathrm{id}|_I$; in particular, for localized diffeomorphisms $U(\gamma)$ are local observables, associated with the stress-energy tensor; see, e.g., [27, Sec. 3].

An n-interval is the union $E := \bigcup_{k=1}^{n} I_k$ of $n$ open intervals $I_k \subset S^1$ ($k = 1, \ldots, n$) with mutually disjoint closure. The complement $E' = S^1 \setminus \overline{E}$ is another n-interval. If there is an interval $I \subset S^1$ such that $E = \{z \in S^1 : z^n \in I\}$, we write $E = \sqrt[n]{I}$, and call $E$ symmetric. In this case, $E' = \sqrt[n]{I'}$. Note that every 2-interval is a Möbius transform of a symmetric 2-interval, while the same is not true for $n > 2$.

We are interested in the algebras
$$A(E) := \bigvee_{i=1}^{n} A(I_i) \quad\text{and}\quad \hat{A}(E) := A(E')', \tag{2.1}$$
and their states with geometric modular action. By $\Omega$ we denote the vacuum vector, and by $U$ the projective unitary representation of the diffeomorphism group in the vacuum representation, with generators $L_n$ ($n \in \mathbb{Z}$) and central charge $c$.

2.1. Product states with geometric modular action

For $n = 1$, $E$ is just an interval and $\hat{A}(I) = A(I)$ (Haag duality).

Proposition 1 (Bisognano–Wichmann Property) ([4, Theorem 2.3]). The modular group of unitaries for the pair $(A(I), \Omega)$ is given by the 1-parameter group of Möbius transformations that fixes the interval $I$,
$$\Delta^{it}_{A(I),\Omega} = U(\Lambda_I(-2\pi t)).$$

For $I = S^1_+$ the upper half circle, the generator of the subgroup $U(\Lambda_{S^1_+}(t))$ is the dilation operator $D = i(L_1 - L_{-1})$. It follows that $D$ as well as its Möbius conjugates $D_I$ (the generators of the subgroups $U(\Lambda_I(t))$) are “of modular origin”:
$$-2\pi \cdot D_I = \log \Delta_{A(I),\Omega}. \tag{2.2}$$


Fig. 1. Flow $f_t$ in the 3-intervals $E = \sqrt[3]{S^1_+} = I_1 \cup I_2 \cup I_3$ and $E' = \sqrt[3]{S^1_-}$.

Let now
$$L_0^{(n)} = \frac{1}{n} L_0 + \frac{c}{24}\, \frac{n^2 - 1}{n}, \qquad L_{\pm 1}^{(n)} = \frac{1}{n} L_{\pm n}, \tag{2.3}$$
and $U^{(n)}$ the covering representation of the Möbius group with generators $L_k^{(n)}$ ($k = 0, \pm 1$). The unitary one-parameter groups $V(t) = U^{(n)}(\Lambda_I(-2\pi t))$ act on the diffeomorphism covariant net by
$$V(t) A(J) V(t)^* = A(f_t(J)) \qquad (J \subset \sqrt[n]{I}) \tag{2.4}$$
where the geometric flow $f_t$ is given by (cf. Fig. 1)
$$f_t(z) = \sqrt[n]{\Lambda_I(-2\pi t)(z^n)}, \tag{2.5}$$
with the branch of $\sqrt[n]{\cdot}$ chosen in the same connected component of $E$ as $z$, i.e. $f_t$ is a diffeomorphism of $S^1$ which preserves each component of $E$ separately. The same formulae hold also for $J \subset \sqrt[n]{I'}$.

The question arises whether for $n > 1$ the generators $D_I^{(n)}$ of $V(t)$ also have “modular origin” as in (2.2). However, unlike with $n = 1$, we have the following lemma and corollary:

Lemma. In a unitary positive-energy representation of $sl(2, \mathbb{R})$ of weight $h > 0$, there is no vector $\Phi$ such that $D\Phi = 0$, where $D = i(L_1 - L_{-1})$.

Proof. An orthonormal basis of the representation is given by the vectors $|n\rangle = (n!\,(2h)_n)^{-1/2} L_{-1}^n |h\rangle$, where $|h\rangle$ is the lowest weight vector. Solving the eigenvalue equation $L_1 \Phi = L_{-1} \Phi$ by the ansatz $\Phi = \sum_n c_n |n\rangle$ produces a recursion for the coefficients $c_n$ whose solution is not square-summable.

Corollary. For $n > 1$, no cyclic and separating vector $\Phi$ exists in a positive-energy representation of the net $A$ such that the modular Hamiltonian $\log \Delta_{A(E),\Phi}$ would equal $-2\pi D_I^{(n)}$.

Proof. By modular theory, $\log \Delta_{A(E),\Phi}\, \Phi = 0$. But because $L_0^{(n)} \ge \frac{c}{24} \frac{n^2 - 1}{n} > 0$, the lemma states that no vector $\Phi$ can be annihilated by $D_I^{(n)}$ which is a Möbius conjugate of $D^{(n)}$.
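The recursion mentioned in the proof of the Lemma can be made explicit. The sketch below is an illustration under a stated assumption, not the authors' computation: it uses the standard matrix elements $L_{-1}|n\rangle = \sqrt{(n+1)(2h+n)}\,|n+1\rangle$ and $L_1|n\rangle = \sqrt{n(2h+n-1)}\,|n-1\rangle$ that go with the orthonormal basis above. Comparing coefficients in $L_1\Phi = L_{-1}\Phi$ then gives $c_1 = 0$ and $c_{m+1}\sqrt{(m+1)(2h+m)} = c_{m-1}\sqrt{m(2h+m-1)}$, so only even coefficients survive and $|c_{2k}|^2$ decays like $1/k$. The function names are hypothetical.

```python
def even_coefficients_squared(h, count):
    """Return [|c_0|^2, |c_2|^2, ..., |c_{2(count-1)}|^2] with the normalization c_0 = 1.

    Uses |c_{m+1}|^2 = |c_{m-1}|^2 * m (2h + m - 1) / ((m + 1)(2h + m)) for odd m.
    """
    values = [1.0]
    for k in range(count - 1):
        m = 2 * k + 1
        values.append(values[-1] * m * (2 * h + m - 1) / ((m + 1) * (2 * h + m)))
    return values

def partial_norms(h, count):
    """Partial sums of sum_k |c_{2k}|^2, which grow without bound (logarithmically)."""
    total = 0.0
    sums = []
    for value in even_coefficients_squared(h, count):
        total += value
        sums.append(total)
    return sums
```

For $h = 1$ the recursion telescopes to $|c_{2k}|^2 = 1/(2k+1)$, whose sum diverges like a harmonic series; this is the failure of square-summability used in the proof.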


Instead, the appropriate generalization of (2.2) for the modular origin of the generators $D_I^{(n)}$ was given in [21], assuming that the net $A$ is completely rational. This means that the split property holds and the µ-index $\mu_A = [\hat{A}(E) : A(E)]$ is finite, and implies that $A(E) \subset \hat{A}(E)$ is irreducible and there is a unique conditional expectation $\varepsilon_E : \hat{A}(E) \to A(E)$ [22, Proposition 5 and Sec. 3]. In the sequel, $\frac{d\psi}{d\psi'}$ is the Connes spatial derivative for a pair of faithful normal states $\psi$ and $\psi'$ on a von Neumann algebra $M$ and its commutant $M'$, which is the canonical positive operator such that $\left(\frac{d\psi}{d\psi'}\right)^{it}$ implements $\sigma_t^{\psi}$ on $M$ and $\left(\frac{d\psi}{d\psi'}\right)^{-it}$ implements $\sigma_t^{\psi'}$ on $M'$ [10, Theorem 9].

Proposition 2 ([21, Corollary 16]). There is a faithful normal state $\varphi_E$ on $A(E)$ ($E = \sqrt[n]{I}$) and a second faithful normal state $\varphi_{E'}$ on $A(E')$, such that the following hold: The modular automorphism group $\sigma_t^{\varphi_E}$ is implemented by $V(t)$, $\sigma_t^{\varphi_{E'}}$ is implemented by $V(-t)$, and
$$-2\pi D_I^{(n)} = \log \frac{d\hat{\varphi}_E}{d\varphi_{E'}} + \frac{n - 1}{2} \log \mu_A. \tag{2.6}$$
Here, $\hat{\varphi}_E = \varphi_E \circ \varepsilon_E$ extends the state on $A(E)$ to a state on $\hat{A}(E)$. Moreover,
$$\frac{d\hat{\varphi}_E}{d\varphi_{E'}} = \frac{d\varphi_E}{d\hat{\varphi}_{E'}}.$$

The state $\varphi_E$ on $A(E)$ is given by $\varphi_E := \left(\bigotimes_{k=1}^{n} \varphi_k\right) \circ \chi_E$ where $\chi_E : A(E) \equiv \bigvee_{k=1}^{n} A(I_k) \to \bigotimes_{k=1}^{n} A(I_k)$ is the natural isomorphism given by the split property ($I_k$ are the components of $E$), and the states $\varphi_k$ on $A(I_k)$ are given by $\varphi_k = \omega \circ \mathrm{Ad}\,U(\gamma_k)$, where $\omega$ is the vacuum state, and $U(\gamma_k)$ implement diffeomorphisms $\gamma_k$ that equal $z \mapsto z^n$ on $I_k$. (By locality, $\varphi_k$ do not depend on the behavior of $\gamma_k$ outside $I_k$.)

Corollary. Let $\varphi_E$ and $\hat{\varphi}_E$ be the states on $A(E)$ and on $\hat{A}(E)$, respectively, as in Proposition 2. For intervals $J_k \subset I_k$ (= the components of $E$) and $F = \bigcup_{k=1}^{n} J_k$, we have the geometric modular actions
$$\sigma_t^{\varphi_E}(A(J_k)) = A(f_t(J_k)), \quad\text{hence}\quad \sigma_t^{\varphi_E}(A(F)) = A(f_t(F)), \tag{2.7}$$
$$\sigma_t^{\hat{\varphi}_E}(A(J_k)) = A(f_t(J_k)), \quad\text{and}\quad \sigma_t^{\hat{\varphi}_E}(\hat{A}(F)) = \hat{A}(f_t(F)). \tag{2.8}$$

Proof. (2.7) is obvious from (2.4). By the defining implementation properties of the Connes spatial derivative, we conclude from (2.6) that $\sigma^{\hat{\varphi}_E}$ is implemented by $V(t)$. This implies (2.8), by the $U^{(n)}$-covariance of the algebras under consideration.

(We include the obvious statement (2.7) for later comparison with the geometric modular flow in [9], for which only the second equality in (2.7) holds while the first is violated.)

For $n = 1$, one may just choose $\gamma = \mathrm{id}$, so that both $\varphi_I$ and $\varphi_{I'}$ are given by the restrictions of the vacuum state, and (2.6) reduces to (2.2).

For $n > 1$, the state $\varphi_E$ is different from the vacuum state, but it is rotation invariant on $A(E)$ in the sense that $\varphi_E \circ \mathrm{Ad}\,U(\mathrm{rot}_t) = \varphi_E$ on $A(J_k)$ for $\overline{J_k} \subset I_k$ and $t$ small enough that $\mathrm{rot}_t(J_k) \subset I_k$. ($\mathrm{rot}_t$ stands for the rotations $z \mapsto e^{it} z$.)


Namely, if $J \subset I$ such that $gJ \subset I$ for $g$ in a neighborhood $N$ of the identity of the Möbius group, then by construction, $\varphi_E \circ \mathrm{Ad}\,U^{(n)}(g) = \varphi_E$ on $A(\sqrt[n]{J})$ for $g \in N$. In particular, the same is true for the rotations $\mathrm{rot}_t$ with $t$ in a neighborhood of 0. Since $U^{(n)}(\mathrm{rot}_t) = U(\mathrm{rot}_{t/n}) \cdot (\text{complex phase})$, the rotation invariance on $A(E)$ follows.

One could actually have chosen any other family of diffeomorphisms $\gamma_k$ that map $I_k$ onto $I$, resulting in product states $\varphi_E^{(\gamma_k)}$ with a different geometric flow on $E$. In that case, the unitary 1-parameter group $V(t)$ satisfying the properties of Proposition 2 is a diffeomorphism conjugate of $U^{(n)}(\Lambda_I(-2\pi t))$. One might expect that our choice of $\varphi_E$ is the only one in this class which enjoys the rotation invariance on $A(E)$. Surprisingly, this is not the case:

Let $\varphi_E^{(\gamma_k)}$ be a product state on $A(E)$ that is given on $A(I_k)$ by $\omega \circ \mathrm{Ad}\,U(\gamma_k)$, where $\gamma_k$ are diffeomorphisms of $S^1$ that map $I_k$ onto $I$. Then this state is rotation invariant on $A(E)$, by construction, if and only if $\omega \circ \mathrm{Ad}\,U(h_k)$ are rotation invariant on $A(I)$, where $h_k$ are diffeomorphisms of $S^1$, defined on $I$ by $h_k(z^n) = \gamma_k(z)$ for $z \in I_k$. In particular, $h_k$ map $I$ onto $I$.

The condition that $\omega \circ \mathrm{Ad}\,U(h)$ is rotation invariant on $A(I)$ can be evaluated for the 2-point function of the stress-energy tensor in that state. Using the inhomogeneous transformation law under diffeomorphisms $h$, involving the Schwarz derivative $D_z h = \frac{h'''}{h'} - \frac{3}{2} \left(\frac{h''}{h'}\right)^2$, the quantity
$$2c \cdot \left(\frac{\frac{dh_t(z)}{dz}\, \frac{dh_t(w)}{dw}}{(h_t(z) - h_t(w))^2}\right)^{2} + \frac{c^2}{36} \cdot D_z h_t(z) \cdot D_w h_t(w), \tag{2.9}$$
where $h_t = h \circ \mathrm{rot}_t$, must be independent of $t$ for $z, w \in I$ and $t$ in a neighborhood of zero. Working out the singular parts of the expansion in $w$ around $z$, one finds that $D_z h_t(z)$ must be independent of $t$ for $z \in I$. This already implies that the second (regular) term is separately invariant, so that, in particular, the invariance condition does not depend on the central charge $c$. Solving
$$\partial_t D_z h_t(z) = 0 \quad\Leftrightarrow\quad z^2 \cdot D_z h(z) = \text{const.}, \tag{2.10}$$
when the constant is parametrized as $\frac{1}{2}(1 - \nu^2)$, yields
$$h(z) = \mu(z^\nu) = \frac{A z^\nu + B}{C z^\nu + D} \qquad \text{for } z \in I, \tag{2.11}$$
where $\mu$ is a Möbius transformation.ᵃ The state $\omega \circ \mathrm{Ad}\,U(h)$ is indeed rotation invariant on $A(I)$ by $h \circ \mathrm{rot}_t(z) = \mu \circ \mathrm{rot}_{\nu t}(z^\nu)$ and Möbius invariance of $\omega$.

ᵃ The sign of the exponent $\nu$ can be reversed by exchanging $A \leftrightarrow B$ and $C \leftrightarrow D$. In order that $h$ takes values in $S^1$, $\nu$ must be either real or imaginary, with corresponding reality conditions on the matrix $\begin{pmatrix} A & B \\ C & D \end{pmatrix}$. Requiring $h$ also to preserve the orientation, we find: If $\nu > 0$, then $\begin{pmatrix} A & B \\ C & D \end{pmatrix} \in SU(1, 1)$. If $i\nu > 0$, then $\begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \begin{pmatrix} i & 1 \\ -i & 1 \end{pmatrix} \cdot SL(2, \mathbb{R})$, where $\begin{pmatrix} i & 1 \\ -i & 1 \end{pmatrix}$ is the Cayley transformation $x \mapsto \frac{1 + ix}{1 - ix}$.
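The passage from the invariance condition (2.10) to the solution (2.11) can be verified by a short computation. The following is only a sketch in the notation of the text, not taken from the original; it uses the chain rule for the Schwarz derivative, $D_z(f \circ g) = (Df)(g(z))\, g'(z)^2 + D_z g$, and the fact that Möbius transformations (in particular rotations) have vanishing Schwarz derivative.

```latex
% Rotation covariance: with h_t = h \circ rot_t and rot_t(z) = e^{it} z,
\begin{align*}
  D_z h_t(z) &= e^{2it}\, (Dh)(e^{it} z), \\
  \text{so}\quad \partial_t D_z h_t(z) = 0
  &\iff (e^{it} z)^2\, (Dh)(e^{it} z) = z^2\, (Dh)(z)
  \iff z^2\, D_z h(z) = \text{const.}
\end{align*}
% Schwarz derivative of the power map g(z) = z^\nu:
\begin{align*}
  \frac{g'''}{g'} &= \frac{(\nu-1)(\nu-2)}{z^2}, \qquad
  \frac{g''}{g'} = \frac{\nu-1}{z}, \\
  D_z g &= \frac{(\nu-1)(\nu-2) - \tfrac{3}{2}(\nu-1)^2}{z^2}
         = \frac{1-\nu^2}{2 z^2}.
\end{align*}
% Composition with a Moebius transformation \mu does not change the result:
\begin{equation*}
  z^2\, D_z\big(\mu(z^\nu)\big) = z^2\, D_z(z^\nu) = \tfrac{1}{2}(1-\nu^2).
\end{equation*}
```

This confirms that every $h$ of the form (2.11) satisfies (2.10) with the constant $\frac{1}{2}(1 - \nu^2)$; that these are all solutions is the statement of the text and is not re-derived here.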


For each value of $\nu$, requiring $h$ to preserve the endpoints of the interval $I$ fixes the Möbius transformation up to left composition with the 1-parameter subgroup $\Lambda_I(t)$. Because $\omega$ is invariant under $\Lambda_I(t)$, the state $\omega \circ \mathrm{Ad}\,U(h)$ is uniquely determined by the exponent $\nu$ in (2.11). One has therefore a 1-parameter family of product states, all rotation-invariant on $A(I)$, but with different modular flows on $I$. Going back to the product states on $A(E)$ by composition with $z \mapsto z^n$, there is one parameter $\nu_k$ for each interval, i.e. for the choice of the states $\omega \circ \mathrm{Ad}\,U(\gamma_k)$ on $A(I_k)$. The state is invariant also under “large” rotations by $2\pi/n$, if and only if these parameters are the same for all $k$.

2.2. Geometric modular action in boundary CFT

The case $n = 2$ is of particular interest in boundary conformal quantum field theory (BCFT) [25]. With every 2-interval $E$ such that $-1 \notin \overline{E}$, one associates a double cone $O_E$ in the halfspace $M_+ = \{(t, x) \in \mathbb{R}^2 : x > 0\}$ as follows. The boundary $x = 0$, $t \in \mathbb{R}$ is the pre-image of $\dot{S}^1 := S^1 \setminus \{-1\}$ under the Cayley transform $C : \mathbb{R} \ni t \mapsto z = (1 + it)/(1 - it) \in S^1$. Let $E = I_- \cup I_+ \subset \dot{S}^1$ with $I_- < I_+$ in the counter-clockwise order, and $I_\pm^{\mathbb{R}} = C^{-1}(I_\pm) \subset \mathbb{R}$. Then
$$O_E := I_+^{\mathbb{R}} \times I_-^{\mathbb{R}} \equiv \{(t, x) : t \pm x \in I_\pm^{\mathbb{R}}\}. \tag{2.12}$$
(When there can be no confusion, we shall drop the subscript $E$.) Now, the algebras
$$B_+(O) := \hat{A}(E) \tag{2.13}$$
have the re-interpretation as local algebras of BCFT, which extend the subalgebras of chiral observables
$$A_+(O) := A(E) \equiv A(I_-) \vee A(I_+). \tag{2.14}$$
Under this re-interpretation, the second statement in (2.8) asserts that the modular group $\sigma_t^{\hat{\varphi}_E}$ acts geometrically inside the associated diamond $O$:
$$\sigma_s^{\hat{\varphi}_E}(B_+(Q)) = B_+(f_s^O(Q)), \tag{2.15}$$
where the double cone $Q = O_F \subset O$ corresponds to a sub-2-interval $F \subset E$, and the flow $f_s^O$ on $O$ arises from the pair of flows $f_s$ (2.5) on $I_+$ and $I_-$, by the said transformations, i.e.
$$f_s^O(t + x, t - x) \equiv (u_s, v_s) = \left(C^{-1} \circ f_s \circ C(t + x),\; C^{-1} \circ f_s \circ C(t - x)\right). \tag{2.16}$$
For $I_+^{\mathbb{R}} = (a, b) \subset \mathbb{R}_+$ and $I_-^{\mathbb{R}} = (-1/a, -1/b)$ (corresponding to a symmetric 2-interval $E$), we have computed the velocity field
$$\partial_s u_s = 2\pi\, \frac{(u_s - a)(a u_s + 1)(u_s - b)(b u_s + 1)}{(b - a)(1 + ab) \cdot (1 + u_s^2)} =: -2\pi V^O(u_s) \tag{2.17}$$
for $u_s \in I_+^{\mathbb{R}}$, and the same equation for $v_s \in I_-^{\mathbb{R}}$.


For $I_+^{\mathbb{R}} = (a_1, b_1)$ and $I_-^{\mathbb{R}} = (a_2, b_2)$ corresponding to a non-symmetric 2-interval $\tilde{E}$, there is a Möbius transformation $m$ that maps $\tilde{E}$ onto a symmetric interval $E$. Choosing the state $\varphi_{\tilde{E}} := \varphi_E \circ \mathrm{Ad}\,U(m)$ on $A(\tilde{E})$, the resulting geometric modular flow is given by $\tilde{f}_s = m^{-1} \circ f_s \circ m$. Going through the same steps, we find
$$\partial_s u_s = -2\pi V^O(u_s) = 2\pi\, \frac{(u_s - a_1)(u_s - b_1)(u_s - a_2)(u_s - b_2)}{L u_s^2 - 2M u_s + N} \tag{2.18}$$
with
$$L = b_1 - a_1 + b_2 - a_2, \qquad M = b_1 b_2 - a_1 a_2, \qquad N = b_2 a_2 (b_1 - a_1) + b_1 a_1 (b_2 - a_2). \tag{2.19}$$
This differential equation is solved by
$$\log\left(-\frac{(u_s - a_1)(u_s - a_2)}{(u_s - b_1)(u_s - b_2)}\right) = -2\pi s + \text{const.} \tag{2.20}$$
The modular orbits for $u = t + x$, $v = t - x$ are obtained by eliminating $s$:
$$\frac{(u - a_1)(u - a_2)}{(u - b_1)(u - b_2)} \cdot \frac{(v - b_1)(v - b_2)}{(v - a_1)(v - a_2)} = \text{const.} \tag{2.21}$$
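The claim that (2.20) solves (2.18), and that (2.18) reduces to (2.17) in the symmetric case $a_2 = -1/a$, $b_2 = -1/b$, can be checked numerically. The sketch below is illustrative only: the function names, the Runge–Kutta integrator and the sample endpoints are assumptions of this example and do not come from the original text.

```python
import math

def coefficients(a1, b1, a2, b2):
    """The constants L, M, N of (2.19)."""
    L = b1 - a1 + b2 - a2
    M = b1 * b2 - a1 * a2
    N = b2 * a2 * (b1 - a1) + b1 * a1 * (b2 - a2)
    return L, M, N

def velocity(u, a1, b1, a2, b2):
    """Right-hand side of (2.18)."""
    L, M, N = coefficients(a1, b1, a2, b2)
    numerator = (u - a1) * (u - b1) * (u - a2) * (u - b2)
    return 2 * math.pi * numerator / (L * u * u - 2 * M * u + N)

def symmetric_velocity(u, a, b):
    """Right-hand side of (2.17) for I_+ = (a, b), I_- = (-1/a, -1/b)."""
    numerator = (u - a) * (a * u + 1) * (u - b) * (b * u + 1)
    return 2 * math.pi * numerator / ((b - a) * (1 + a * b) * (1 + u * u))

def first_integral(u, a1, b1, a2, b2):
    """Left-hand side of (2.20), defined for u in (a1, b1) or (a2, b2)."""
    return math.log(-(u - a1) * (u - a2) / ((u - b1) * (u - b2)))

def flow(u0, s, a1, b1, a2, b2, steps=2000):
    """Integrate (2.18) from modular parameter 0 to s with the classical RK4 scheme."""
    h = s / steps
    u = u0
    for _ in range(steps):
        k1 = velocity(u, a1, b1, a2, b2)
        k2 = velocity(u + h * k1 / 2, a1, b1, a2, b2)
        k3 = velocity(u + h * k2 / 2, a1, b1, a2, b2)
        k4 = velocity(u + h * k3, a1, b1, a2, b2)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return u
```

Along a numerically integrated orbit, `first_integral` should decrease by $2\pi s$, in agreement with (2.20); the velocity is negative inside $(a_1, b_1)$, so $u_s$ moves towards $a_1$ for increasing $s$.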

2.3. General boundary CFT

Up to this point, we have taken the boundary CFT to be given by $B_+(O) := \hat{A}(E)$, which equals the relative commutant $B_+(O) = A(K)' \cap A(L)$ by virtue of Haag duality of the local chiral net $A$. Here, $K$ and $L \subset \dot{S}^1$ are the open intervals between $I_+$ and $I_-$, and spanned by $I_+$ and $I_-$, respectively, i.e. $L = I_+ \cup \overline{K} \cup I_-$.

The general case of a boundary CFT was studied in [25]. If $A$ is completely rational, every irreducible local boundary CFT net containing $A(E)$ is intermediate between $A(E)$ and a maximal (Haag dual) BCFT net:
$$A(I_+) \vee A(I_-) \equiv A_+(O) \subset B_+(O) \subset B_+^{\mathrm{dual}}(O) \equiv B(K)' \cap B(L), \tag{2.22}$$
where $I \mapsto B(I)$ is a conformally covariant, possibly nonlocal net on $\dot{S}^1$, which extends $A$ and is relatively local with respect to $A$ [25, Proposition 2.9(ii)]. (Its extension to the circle in general requires a covering.)

If $A$ is completely rational, the local subfactors $A(I) \subset B(I)$ automatically have finite index (not depending on $I \subset \dot{S}^1$) by the same argument as in [20, p. 39], and there are only finitely many such extensions [19, Theorem 2.4]. There is then a unique global conditional expectation $\varepsilon$ that maps each $B(I)$ onto $A(I)$. $\varepsilon$ commutes with Möbius transformations and preserves the vacuum state. By relative locality, $\varepsilon$ maps $B(K)' \cap B(L)$ into (in general, not onto) $A(K)' \cap A(L)$, hence
$$A(E) \equiv A_+(O) \subset \varepsilon(B_+(O)) \subset \hat{A}(E). \tag{2.23}$$
The product state $\hat{\varphi}_E$ on $\hat{A}(E)$ induces a faithful normal state $\hat{\varphi}_E \circ \varepsilon$ on $B_+(O)$.

Proposition 3. In a completely rational, diffeomorphism invariant BCFT, the modular group of the state $\hat{\varphi}_E \circ \varepsilon$ acts geometrically on $B_+(Q)$, $Q \subset O$, i.e.
$$\sigma_s^{\hat{\varphi}_E \circ \varepsilon}(B_+(Q)) = B_+(f_s^O(Q)),$$
where $f_s^O$ is the flow (2.16).


Proof. B+ (O) is generated by A+ (O) and an isometry v [24] such that every element b ∈ B+ (O) has a unique representation as b = av with a ∈ A+ (O), and va = θ(a)v where θ is a dual canonical endomorphism of B+ (O) into A+ (O). For a double cone Q ⊂ O, the isometry v may be chosen to belong to B+ (Q), in which case θ is localized in Q. We know that the modular group restricts to the modular group of A+ (O), which acts geometrically, in particular, it takes A+ (Q) to A+ (fsO (Q)). It then follows by the properties of the conditional expectation that σsϕbE ◦ε (v) ≡ vs = us v where us ∈ A(E) is a unitary cocycle of intertwiners us : θ → θs ≡ σsϕbE ◦ θ ◦ σsϕbE −1 . Since σsϕbE acts geometrically in A+ (O), θs is localized in fsO (Q), and A+ (fsO (Q)) · vs = B+ (fsO (Q)). This proves the claim. Thus, in every BCFT, the modular group of the state ϕ E ◦ ε on B+ (OE ) acts geometrically inside the double cone OE by the same ﬂow (2.20), (2.21). 2.4. Local temperature in boundary conformal QFT We shall show that the states ϕ E ◦ ε, whose geometric modular action we have just discussed, are manufactured far from thermal equilibrium. We adopt the notion of “local temperature” introduced in [8], where one compares the expectation values of suitable “thermometer observables” Φ(x) in a given state ϕ with their expectation values in global KMS reference states ωβ of inverse temperature β. If one can represent the expectation values as weighted averages (2.24) ϕ(Φ(x)) = dρx (β)ωβ (Φ(x)) (where the thermal functions β → ωβ (Φ(x)) do not depend on x because KMS states are translation invariant), then one may regard the state ϕ at each point x as a statistical average of thermal equilibrium states. In BCFT, this analysis can be carried out very easily for the product states ϕE with the energy density 2T00 (t, x) = T (t + x) + T (t − x) as thermometer observable. 
One has $\omega_\beta(T(\cdot)) = \frac{\pi^2}{24}\,c\,\beta^{-2}$ in the KMS states, while the inhomogeneous transformation law of $T$ under diffeomorphisms gives $\varphi_E(T(y)) = -\frac{c}{24\pi}\,D_y\gamma_\pm(y) = -\frac{c}{4\pi}(1 + y^2)^{-2}$ if $y \in I_\pm^{\mathbb R}$, where $D_y$ is the Schwarzian derivative and $\gamma_\pm(y) = C^{-1} \circ (z \mapsto z^2) \circ C(y) = \frac{2y}{1 - y^2}$, i.e. negative energy density inside the double cone $O = I_+^{\mathbb R} \times I_-^{\mathbb R}$. The product states $\varphi_E$ can therefore not be interpreted as local thermal equilibrium states in the sense of [8]. The possibility of locally negative energy density in quantum field theory is well known, and its relation to the Schwarzian derivative in two-dimensional conformal QFT was first discussed in [15].

2.5. Modular temperature in boundary conformal QFT

The “thermal time hypothesis” [11] provides a very different thermal interpretation of states with geometric modular action. According to this hypothesis, one interprets the norm of the vector $\partial_s$ tangent to the modular orbit $x^\mu(s)$ as the inverse temperature $\beta_s$ of the state as seen by a physical observer with accelerated trajectory $x^\mu(s)$. In the vacuum state on the Rindler wedge algebra, this gives precisely the Unruh temperature $\beta_s = \frac{d\tau}{ds} = \frac{2\pi}{\kappa}$ ($\tau$ is the proper time, and $\kappa$ the acceleration). One may also give a local interpretation, by viewing $\beta_s$ as the inverse temperature of the state for an observer at each point whose trajectory is tangent to the unique modular orbit through that point.

For these interpretations to make sense it is important that $\partial_s$ is a timelike vector. Indeed, it is easily seen that the flow (2.17), (2.18) gives negative sign for both $\partial_s u_s$ and $\partial_s v_s$, because the velocity field $V^O$ is positive inside the interval. Hence the tangent vector is past-directed timelike. This conforms with a general result, proven in more than 2 spacetime dimensions:

Proposition 4 ([33, Satz 6.5]). Let $A(O)$ be a local algebra and $U_t$ a unitary 1-parameter group such that $U_t A(Q) U_t^* = A(f_t Q)$, where $f_t$ is an automorphism of $O$ taking double cones in $O$ to double cones. If there is a vector $\Phi$, cyclic and separating for $A(O)$, such that $U_t A\Phi$ has an analytic continuation into a strip $-\beta < \mathrm{Im}\, t < 0$, then $-\partial_t(f_t x)|_{t=0} \in \overline{V_+}$.

In particular, the flow of a geometric modular action is always past-directed null or timelike. From (2.18), we get the proper time $(d\tau)^2 = du\,dv$ and hence the inverse temperature $\beta = \frac{d\tau}{ds}$ as a function of the position $x^\mu = (t, x)$:

$$\beta(t, x)^2 = \frac{du}{ds}\frac{dv}{ds} = 4\pi^2 \cdot V^O(t + x)\,V^O(t - x). \tag{2.25}$$

The temperature diverges on the boundaries of the double cone ($V^O(a_i) = V^O(b_i) = 0$), and is positive everywhere in its interior. For comparison with the ordinary Unruh effect, we also compute the acceleration in the momentarily comoving frame

$$\kappa = \left(-\frac{\partial^2 x^\mu}{\partial\tau^2}\frac{\partial^2 x_\mu}{\partial\tau^2}\right)^{1/2} = \frac{d^2x/dt^2}{(1 - (dx/dt)^2)^{3/2}} = \frac{u''v' - u'v''}{2(u'v')^{3/2}}, \tag{2.26}$$

where the prime stands for $\partial_s$, and we have used $\frac{dx}{dt} = \frac{x'}{t'} = \frac{u' - v'}{u' + v'}$ and $\frac{d^2x}{dt^2} = \frac{(dx/dt)'}{t'} = 4\,\frac{u''v' - u'v''}{(u' + v')^3}$. Thus

$$\kappa(t, x) = \left.\frac{{V^O}'(u) - {V^O}'(v)}{2\sqrt{V^O(u)V^O(v)}}\right|_{u=t+x,\ v=t-x} = \pi\,\bigl({V^O}'(t + x) - {V^O}'(t - x)\bigr)\,\beta(t, x)^{-1} \tag{2.27}$$

as a function of the position $(t, x)$. The product

$$\beta(t, x) \cdot \kappa(t, x) = \pi\,\partial_x\bigl(V^O(t + x) + V^O(t - x)\bigr) = \pi\,\partial_t\bigl(V^O(t + x) - V^O(t - x)\bigr) \tag{2.28}$$

Fig. 2. Influence of the boundary. Left: modular orbit of an arbitrary point in the symmetric double cone $O = \{(t, x) : A \le t + x \le B,\ -\frac{1}{A} \le t - x \le -\frac{1}{B}\}$. Right: a zoom on the modular orbit $(u_s, v_s)$ going through the center of the double cone. The plot represents the curve $(\tilde u_s, v_s)$, where $\tilde u_s = f \cdot (u_s - u_s^{\rm diag}) + u_s^{\rm diag}$, with $(u_s^{\rm diag}, v_s)$ the straight line joining the two tips of the double cone (a special vacuum modular orbit in the absence of the boundary), and $f = 100$ a zoom factor. (Axes in both panels: $u$ and $v$, with ticks at $A$, $B$, $0$, $-\frac{1}{A}$, $-\frac{1}{B}$.)

has the maximal value $2\pi$ (Unruh temperature) near the left and right edges of the double cone, and equals 0 along a timelike curve connecting the past and future tips. This curve is in general not itself a modular orbit.

In general, the modular orbits are not boost trajectories. However, the quantitative departure is very small. As an illustration, we display a true modular orbit, as well as a plot with one coordinate exaggerated by a zoom factor of 100 (Fig. 2).

There exists however one distinguished modular orbit with a simple dynamics, namely the boost

$$u_s v_s = -1 \quad \forall s \in \mathbb R \tag{2.29}$$

(in the symmetric case, for simplicity) which is a solution of (2.21) for const. = 1. It is the Lorentz boost of a wedge in $M_+$, whose edge lies on the boundary $x = 0$. The same is true also for non-symmetric intervals, although the formula (2.29) is more involved. Along this distinguished orbit the inverse temperature (2.25) simply reads

$$\beta = 2\pi\,\frac{\partial_s u_s}{u_s} = 2\pi\,\frac{d}{ds}\ln u_s. \tag{2.30}$$

One can express the proper time $\tau$ of the observer following the boost as a function of the modular parameter,

$$\tau(s) = \ln u_s - \ln u_0, \tag{2.31}$$

hence $\beta(\tau) = 2\pi\,\frac{V^O(u_0 e^\tau)}{u_0 e^\tau}$. Choosing $u_0 = 1$, one can write the inverse temperature as a function of the proper time in the form

$$\beta(\tau) = 2\pi\,\frac{(\sinh\tau_{\max} - \sinh\tau) \cdot (\sinh\tau - \sinh\tau_{\min})}{(\sinh\tau_{\max} - \sinh\tau_{\min}) \cdot \cosh\tau}, \tag{2.32}$$

where $\tau_{\min}$ and $\tau_{\max}$ are functions of the coordinates of the double cone. As for double cones in Minkowski space [28], the temperature is infinite at the tips of the double cone ($\tau = \tau_{\min}$ or $\tau = \tau_{\max}$) and reaches its minimum in the middle of the observer’s “lifetime”. Unfortunately, for generic orbits we have no closed formula for the temperature as a function of the proper time, so as to compare with the “plateau behavior” (constant temperature for most of the “lifetime”) as in [28], that occurs in CFT without boundary for vacuum modular orbits close to the edges of the double cone.

3. The Vacuum Modular Flow

Casini and Huerta [9] recently found that the vacuum modular group for the algebra of a free Fermi field in the union of $n$ disjoint intervals $(a_k, b_k) \subset \mathbb R$ is given by the formula

$$\sqrt{\frac{dx_j}{d\zeta}} \cdot \sigma_t(\psi(x_j)) = \sum_k O_{jk}(t)\,\sqrt{\frac{dx_k(t)}{d\zeta}} \cdot \psi(x_k(t)). \tag{3.1}$$

Here,

$$e^{\zeta(x)} = -\prod_k \frac{x - a_k}{x - b_k} \tag{3.2}$$

defines a uniformization function $\zeta$ that maps each interval $(a_k, b_k)$ onto $\mathbb R$, and $e^\zeta \in \mathbb R_+$ has $n$ pre-images $x_k = x_k(\zeta)$, one in each interval, i.e. $-\prod_l \frac{x_k(\zeta) - a_l}{x_k(\zeta) - b_l} = e^\zeta$. The geometric modular flow is given by^b

$$\zeta(t) = \zeta_0 - 2\pi t, \tag{3.3}$$

i.e. a separate flow $x_k(t) = x_k(\zeta - 2\pi t)$ in each interval. The orthogonal matrix $O$ yields a “mixing” of the fields on the different trajectories $x_i(t)$, and is determined by the differential equation

$$\dot O(t) = K(t)\,O(t) \tag{3.4}$$

where $K_{jj}(t) = 0$ and

$$K_{jk}(t) = 2\pi\,\frac{\sqrt{\frac{dx_j(t)}{d\zeta}}\sqrt{\frac{dx_k(t)}{d\zeta}}}{x_j(t) - x_k(t)} \quad (j \neq k). \tag{3.5}$$

^b In [9], the notation is different: the authors “counter” the flow so that the position of $\sigma_t(\psi(x_j(\zeta + 2\pi t)))$ remains constant, except for the mixing.
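The uniformization (3.2) and the geometric flow (3.3) can be made concrete with a small numerical sketch. It assumes a hypothetical 2-interval on the real line, inverts $\zeta$ by bisection (which relies on $\zeta$ increasing across each interval), and evaluates the matrix $K$ of (3.5) under the assumption that the factors $dx/d\zeta$ enter through square roots. The function names and the sample intervals are illustrative only and do not come from [9].

```python
import math

def zeta(x, intervals):
    """Uniformization function of (3.2): exp(zeta) = -prod_k (x - a_k)/(x - b_k)."""
    value = -1.0
    for a, b in intervals:
        value *= (x - a) / (x - b)
    return math.log(value)  # value > 0 for x strictly inside one of the intervals

def dzeta_dx(x, intervals):
    """Derivative of zeta, from differentiating the logarithm of (3.2)."""
    return sum(1.0 / (x - a) - 1.0 / (x - b) for a, b in intervals)

def preimage(target, interval, intervals, steps=200):
    """Bisection for the unique x in `interval` with zeta(x) = target."""
    lo, hi = interval
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if zeta(mid, intervals) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def flow(x0, t, intervals):
    """Points x_k(t), one per interval, with zeta = zeta(x0) - 2*pi*t as in (3.3)."""
    target = zeta(x0, intervals) - 2.0 * math.pi * t
    return [preimage(target, iv, intervals) for iv in intervals]

def mixing_generator(points, intervals):
    """Matrix K of (3.5); the square roots of dx/dzeta are an assumed reading."""
    weights = [1.0 / dzeta_dx(x, intervals) for x in points]  # dx_k/dzeta > 0
    n = len(points)
    return [[0.0 if j == k else
             2.0 * math.pi * math.sqrt(weights[j] * weights[k]) / (points[j] - points[k])
             for k in range(n)] for j in range(n)]

intervals = [(-3.0, -1.0), (0.5, 2.0)]   # hypothetical 2-interval
print(flow(1.0, 0.0, intervals))          # the two pre-images sharing zeta(1.0)
print(flow(1.0, 0.1, intervals))          # the same pair after modular time t = 0.1
print(mixing_generator(flow(1.0, 0.0, intervals), intervals))
```

In this sketch each point moves towards the left end $a_k$ of its interval for $t > 0$, and the matrix returned by `mixing_generator` is antisymmetric, as required for $O(t)$ to be orthogonal.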

Remark. The mixing is a “minimal” way to evade an absurd conclusion from Takesaki’s Theorem ([32, Chap. IX, Theorem 4.2]): Without mixing the modular group would globally preserve the component interval subalgebras. Then, the Reeh–Schlieder property of the vacuum vector would imply that the $n$-interval algebra coincides with each of its component interval subalgebras.

Proposition 5. For $\bigcup_k (a_k, b_k) \subset \mathbb R$ the Cayley transform of a symmetric $n$-interval $E = \sqrt[n]{I} \subset S^1 \setminus \{-1\}$, the geometric part (3.3) of the flow (without mixing) is the same as (2.5).

Proof. We use variables $u_k = \frac{1 + ia_k}{1 - ia_k}$, $v_k = \frac{1 + ib_k}{1 - ib_k}$, $z = \frac{1 + ix}{1 - ix}$, and the identity $2i(x - a) = (1 - ix)(1 - ia)(z - u)$. Then

$$e^\zeta = -\prod_k \frac{x - a_k}{x - b_k} = \text{const.} \cdot \prod_k \frac{z - u_k}{z - v_k} = \text{const.} \cdot \frac{z^n - U}{z^n - V} \tag{3.6}$$

where $U = u_k^n$, $V = v_k^n$ such that $I = (U, V) \subset S^1$. Therefore, the flow (3.3) is equivalent to

$$\frac{z(t)^n - U}{z(t)^n - V} = e^{-2\pi t} \cdot \frac{z^n - U}{z^n - V}, \tag{3.7}$$

which in turn is easily seen to be equivalent to (2.5).

Keep in mind, however, that the modular group of the product state in Sec. 2.1 does not “mix” the intervals $(a_k, b_k)$. Since every 2-interval is a Möbius transform of a symmetric 2-interval, the statement of Proposition 5 is also true for general 2-intervals, with the flow (2.20).
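The two algebraic ingredients of the proof of Proposition 5 lend themselves to a quick numerical sanity check. The sketch below tests the Cayley identity $2i(x - a) = (1 - ix)(1 - ia)(z - u)$ and the factorization $\prod_k (z - u_k) = z^n - U$ over the $n$-th roots $u_k$ of $U$; the sample values of $x$, $a$ and $U$ are arbitrary choices made for illustration.

```python
import cmath
import math

def cayley(x):
    """Cayley transform x -> z = (1 + ix)/(1 - ix), mapping the real line into the circle."""
    return (1 + 1j * x) / (1 - 1j * x)

def cayley_identity_defect(x, a):
    """|2i(x - a) - (1 - ix)(1 - ia)(z - u)| for z = C(x), u = C(a); should vanish."""
    z, u = cayley(x), cayley(a)
    return abs(2j * (x - a) - (1 - 1j * x) * (1 - 1j * a) * (z - u))

def nth_roots(U, n):
    """The n distinct n-th roots of a point U on the unit circle."""
    phase = cmath.phase(U)
    return [cmath.exp(1j * (phase + 2 * math.pi * k) / n) for k in range(n)]

def root_product_defect(z, U, n):
    """|prod_k (z - u_k) - (z^n - U)| with u_k the n-th roots of U; should vanish."""
    product = 1.0 + 0j
    for u in nth_roots(U, n):
        product *= z - u
    return abs(product - (z ** n - U))

U = cmath.exp(0.3j)  # arbitrary endpoint of the interval I = (U, V)
print(cayley_identity_defect(0.7, -1.2))
print(root_product_defect(cayley(0.7), U, 3))
```

Both defects are zero up to rounding, which is all that (3.6) needs: applying the first identity to numerator and denominator turns the product over $x$ into a product over $z$, and the second collapses that product to $z^n - U$ and $z^n - V$.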

3.1. Verification of the KMS condition

The authors of [9] have obtained the flow (3.1) using formal manipulations. We shall establish the KMS property of the vacuum state for this flow. Because this property distinguishes the modular group [32, Chap. VIII, Theorem 1.2], we obtain an independent proof of the claim. We take $\bigcup_k (a_k, b_k) \subset \mathbb R$ the Cayley transform of a symmetric $n$-interval $E = \sqrt[n]{I} \subset \dot S^1$. We first solve the differential equation (3.4) for the mixing.

With angular variables $x = \tan\frac{\xi}{2}$, and $\pi > \xi_0 > \xi_1 > \cdots > \xi_{n-1} > -\pi$, the non-diagonal elements of the matrix $K$ can be written as

$$K_{kl}(t) = 2\pi \cdot \frac{\sqrt{\frac{dx_k(t)}{d\xi_k(t)}}\sqrt{\frac{dx_l(t)}{d\xi_l(t)}}}{x_k(t) - x_l(t)}\,\sqrt{\frac{d\xi_k(t)}{d\zeta}}\sqrt{\frac{d\xi_l(t)}{d\zeta}} = 2\pi \cdot \frac{\sqrt{\frac{d\xi_k(t)}{d\zeta}}\sqrt{\frac{d\xi_l(t)}{d\zeta}}}{2\sin\frac{\xi_k(t) - \xi_l(t)}{2}} \tag{3.8}$$

for $k \neq l$. For symmetric intervals, $\xi_k = \xi_0 - k \cdot \frac{2\pi}{n}$ and $\frac{d\xi_k}{d\zeta} = \frac{d\xi_0}{d\zeta} > 0$, hence

$$K_{kl}(t) = -2\pi \cdot \frac{\frac{d\xi_0(t)}{d\zeta}}{2\sin\frac{(k - l)\pi}{n}} = \Omega_{kl} \cdot \dot\xi_0(t), \qquad \Omega_{kl} = \frac{1}{2\sin\frac{(k - l)\pi}{n}}. \tag{3.9}$$

With the constant anti-symmetric matrix $\Omega = (\Omega_{kl})_{k,l=0}^{n-1}$, we obtain the orthogonal mixing matrix:

Corollary. The mixing matrix is given by

$$O(t) = e^{(\xi_0(t) - \xi_0(0)) \cdot \Omega}. \tag{3.10}$$

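Remark. The mixing matrix $O(t)$ always belongs to the same one-parameter subgroup of $SO(n)$, with generator $\Omega$. For $n = 2$, this is just

$$O(t) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \quad \text{with} \quad \theta(t) = \frac{1}{2}(\xi_0(t) - \xi_0). \tag{3.11}$$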
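As a minimal sketch of how (3.9) and (3.10) fit together, the following code builds $\Omega$ for a given $n$ and exponentiates it with a plain Taylor series (an illustrative stand-in for a library matrix exponential, adequate for small matrices of moderate norm). For $n = 2$ the result is a plane rotation by half the angle $\xi_0(t) - \xi_0(0)$; the orientation of the rotation printed here is the one that follows from the sign convention of $\Omega_{kl}$ as read in (3.9).

```python
import math

def omega(n):
    """Antisymmetric matrix of (3.9): Omega_kl = 1/(2 sin((k - l) pi / n)), zero diagonal."""
    return [[0.0 if k == l else 1.0 / (2.0 * math.sin((k - l) * math.pi / n))
             for l in range(n)] for k in range(n)]

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def expm(a, terms=40):
    """Matrix exponential by a truncated Taylor series."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[x / m for x in row] for row in matmul(term, a)]  # a^m / m!
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def mixing_matrix(n, delta_xi):
    """O(t) = exp(delta_xi * Omega) with delta_xi = xi_0(t) - xi_0(0), as in (3.10)."""
    return expm([[delta_xi * x for x in row] for row in omega(n)])

theta = 0.4                       # arbitrary half-angle
for row in mixing_matrix(2, 2 * theta):
    print(row)
print(math.cos(theta), math.sin(theta))
```

Because $\Omega$ is antisymmetric, `mixing_matrix` returns an orthogonal matrix for every $n$, which can be confirmed by multiplying it with its transpose.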

If $E$ is not symmetric, the general formula is

$$\theta(t) = \arctan\frac{Lx_0(t) - M}{\sqrt{LN - M^2}} - \arctan\frac{Lx_0(0) - M}{\sqrt{LN - M^2}} \tag{3.12}$$

with notations as in (2.18).^c

^c The authors of [9] also compute this angle, but misrepresent it as the arctan of the difference, rather than the difference of the arctan’s.

Next, we compute the vacuum expectation values $\langle\sigma_t(\psi(x_i))\,\sigma_s(\psi(y_j))\rangle$ for $x_i \in I_i$, $y_j \in I_j$, using (3.1) and $\langle\psi(x)\psi(y)\rangle = \frac{-i}{x - y - i\varepsilon}$. Passing to angular variables $x \to \xi$, $y \to \eta$ by

$$\frac{\sqrt{dx}\sqrt{dy}}{x - y - i\varepsilon} = \frac{\sqrt{d\xi}\sqrt{d\eta}}{2\sin\frac{\xi - \eta - i\varepsilon}{2}}, \tag{3.13}$$

this gives

$$\langle\sigma_t(\psi(x_i))\,\sigma_s(\psi(y_j))\rangle = \sum_{kl} (e^{(\xi_0(t) - \xi_0) \cdot \Omega})_{ik}\,(e^{(\eta_0(s) - \eta_0) \cdot \Omega})_{jl} \cdot \frac{-i\,\sqrt{\frac{d\xi_k(t)}{dx_i}}\sqrt{\frac{d\eta_l(s)}{dy_j}}}{2\sin\frac{\xi_k(t) - \eta_l(s) - i\varepsilon}{2}}. \tag{3.14}$$

Notice that again $d\xi_k$, $d\eta_l$ in the square roots do not depend on $k$ and $l$. To perform the sums over $k$ and $l$, we need a couple of trigonometric identities:

Lemma. For $n \in \mathbb N$ and $k = 0, 1, \ldots, n - 1$, let $\sin_k(\alpha) := \sin(\alpha - k\frac{\pi}{n})$. Then (sums and products always extending from $0$ to $n - 1$):

(i) $\prod_k \sin_k(\alpha) = (-2)^{1-n}\sin(n\alpha)$.

(ii) For $j = 0, \ldots, n - 1$ one has $\sum_{k:\,k \neq j} \cot((j - k)\frac{\pi}{n}) = 0$.

(iii) For $j = 0, \ldots, n - 1$ one has

$$\sum_k (e^{2(\alpha - \beta)\Omega})_{jk} \cdot \frac{1}{\sin_k(\alpha)} = \frac{\sin(n\beta)}{\sin(n\alpha)} \cdot \frac{1}{\sin_j(\beta)}. \tag{3.15}$$

Proof. (i) is just another way of writing $\prod_k (z - \omega_k) = z^n - 1$, where $\omega_k = e^{ik\frac{2\pi}{n}}$ are the $n$th roots of unity, and $z = e^{2i\alpha}$. Dividing (i) by $\sin_j(\alpha)$, taking the logarithm, and taking the derivative at $\alpha = j\frac{\pi}{n}$, yields (ii). For (iii), we have to show that the expression

$$(-2)^{1-n}\sin(n\alpha)\sum_k (e^{2\alpha\Omega})_{jk} \cdot \frac{1}{\sin_k(\alpha)} = \sum_k (e^{2\alpha\Omega})_{jk}\prod_{l:\,l \neq k}\sin_l(\alpha) \tag{3.16}$$

is independent of $\alpha$. Taking the derivative with respect to $\alpha$ and inserting (3.9), we have to show that

$$\sum_{k:\,k \neq j} \frac{1}{\sin((j - k)\frac{\pi}{n})} \cdot \prod_{l:\,l \neq k}\sin_l(\alpha) + \sum_{k:\,k \neq j} \cos\Bigl(\alpha - k\frac{\pi}{n}\Bigr) \cdot \prod_{l:\,l \neq j,k}\sin_l(\alpha) = 0. \tag{3.17}$$

Writing $\cos(\alpha - k\frac{\pi}{n}) = (\sin_k(\alpha)\cos((j - k)\frac{\pi}{n}) - \sin_j(\alpha))/\sin((j - k)\frac{\pi}{n})$, this sufficient condition reduces to the identity (ii).
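The identities (i) and (ii) of the Lemma, and the differential condition (3.17) to which the proof reduces (iii), can be checked numerically for small $n$. The sketch below is only such a check, not part of the proof; it reads (3.17) with both sums restricted to $k \neq j$, and the sample values of $\alpha$ and $n$ are arbitrary.

```python
import math

def sin_k(alpha, k, n):
    """sin_k(alpha) := sin(alpha - k*pi/n), as in the Lemma."""
    return math.sin(alpha - k * math.pi / n)

def defect_i(alpha, n):
    """|prod_k sin_k(alpha) - (-2)^(1-n) sin(n*alpha)|, identity (i)."""
    product = 1.0
    for k in range(n):
        product *= sin_k(alpha, k, n)
    return abs(product - (-2.0) ** (1 - n) * math.sin(n * alpha))

def defect_ii(j, n):
    """|sum over k != j of cot((j - k)*pi/n)|, identity (ii)."""
    return abs(sum(1.0 / math.tan((j - k) * math.pi / n)
                   for k in range(n) if k != j))

def defect_317(alpha, j, n):
    """Absolute value of the left-hand side of (3.17)."""
    def partial_product(excluded):
        p = 1.0
        for l in range(n):
            if l not in excluded:
                p *= sin_k(alpha, l, n)
        return p
    first = sum(partial_product({k}) / math.sin((j - k) * math.pi / n)
                for k in range(n) if k != j)
    second = sum(math.cos(alpha - k * math.pi / n) * partial_product({j, k})
                 for k in range(n) if k != j)
    return abs(first + second)

for n in range(2, 6):
    print(n, defect_i(0.37, n), defect_ii(0, n), defect_317(0.37, 1, n))
```

All three defects vanish up to rounding. Since $e^{2\alpha\Omega}$ is invertible, (3.17) holding for every $j$ is exactly the statement that the derivative of (3.16) is zero.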

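Using (3.15) with $2\alpha = \xi_0(t) - \eta_l(s)$ and $2\beta = \xi_0 - \eta_l(s)$ in the expression (3.14), and once again with $2\alpha = \eta_0(s) - \xi_0$ and $2\beta = \eta_0 - \xi_0$, we get

$$\langle\sigma_t(\psi(x_i))\,\sigma_s(\psi(y_j))\rangle = \frac{-i\,\sqrt{\frac{d\xi_0(t)}{dx_i}}\sqrt{\frac{d\eta_0(s)}{dy_j}}}{2\sin\frac{\xi_i - \eta_j - i\varepsilon}{2}} \cdot \frac{\sin n\frac{\xi_0 - \eta_0 - i\varepsilon}{2}}{\sin n\frac{\xi_0(t) - \eta_0(s) - i\varepsilon}{2}}. \tag{3.18}$$

We exhibit the $t$- and $s$-dependent terms:

$$\frac{\sqrt{d\xi_0(t)}\sqrt{d\eta_0(s)}}{2\sin\frac{n\xi_0(t) - n\eta_0(s) - i\varepsilon}{2}} = \frac{\sqrt{d\Xi_0(t)}\sqrt{dH_0(s)}}{2n\sin\frac{\Xi_0(t) - H_0(s) - i\varepsilon}{2}} = \frac{1}{n}\,\frac{\sqrt{dX(t)}\sqrt{dY(s)}}{X(t) - Y(s) - i\varepsilon}. \tag{3.19}$$

The first equality is the invariance of the 2-point function under a Möbius transformation $\mu$ mapping $I$ to $S^1_+$, such that for $z = e^{i\xi} \in E$ and $w = e^{i\eta} \in E$ we get $\mu(z^n) = e^{i\Xi} = \frac{1 + iX}{1 - iX} \in S^1_+$ and $\mu(w^n) = e^{iH} = \frac{1 + iY}{1 - iY} \in S^1_+$ with $X, Y \in \mathbb R_+$; the second equality is again (3.13) for the inverse transformation $\Xi \to X$, $H \to Y$. By Proposition 5, the flow on $\mathbb R_+$ is just $X(t) = e^{-2\pi t} \cdot X$, giving

$$\langle\sigma_t(\psi(x_i))\,\sigma_s(\psi(y_j))\rangle = \frac{e^{-\pi(t+s)}}{e^{-2\pi t}X - e^{-2\pi s}Y - i\varepsilon} \cdot f(x_i, y_j). \tag{3.20}$$

This expression manifestly satisfies the KMS condition in the form

$$\langle\psi(x)\,\sigma_{-i/2}(\psi(y))\rangle = \langle\psi(y)\,\sigma_{-i/2}(\psi(x))\rangle. \tag{3.21}$$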
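The symmetry behind (3.21) can be seen by evaluating the explicit $(t, s)$-dependent factor of (3.20) at $t = 0$, $s = -i/2$. The sketch below does only that: it drops the $i\varepsilon$ regularization and the prefactor $f(x_i, y_j)$, and the sample points $X$, $Y$ are arbitrary positive numbers.

```python
import cmath

def modular_factor(t, s, X, Y):
    """Explicit (t, s)-dependence of (3.20); the i*epsilon term is omitted."""
    return cmath.exp(-cmath.pi * (t + s)) / (
        cmath.exp(-2 * cmath.pi * t) * X - cmath.exp(-2 * cmath.pi * s) * Y)

X, Y = 0.8, 2.5   # arbitrary points on the positive half-line

# Analytic continuation to s = -i/2 turns the denominator X - Y into X + Y.
lhs = modular_factor(0.0, -0.5j, X, Y)
rhs = modular_factor(0.0, -0.5j, Y, X)
print(lhs, rhs)

# The factor is also unchanged under a common shift of t and s.
print(modular_factor(0.3, -0.2, X, Y), modular_factor(1.0, 0.5, X, Y))
```

At $s = -i/2$ the factor equals $i/(X + Y)$, which is symmetric under exchanging $X$ and $Y$, so the check reduces (3.21) to the symmetry of the remaining prefactor $f$.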

We conclude that the KMS condition holds for the Casini–Huerta flow for symmetric $n$-intervals:

Corollary. For symmetric $n$-intervals $E = \sqrt[n]{I}$, (3.1) is the modular automorphism group of the algebra $A(E)$ with respect to the vacuum state.

Proof. Smearing with test functions of appropriate support, the KMS property holds for bounded generators of the CAR algebra $A(E)$. Because $\psi$ is a free field, the KMS property of the 2-point function in the vacuum extends to the KMS property of the corresponding quasifree (i.e. Fock) state of the CAR algebra.

Remark. It is quite remarkable that by virtue of the mixing, through the identity (ii) of the lemma, the ratio of the modular vacuum correlation functions

$$\frac{\langle\sigma_t^{(n)}(\psi(x_i))\,\sigma_s^{(n)}(\psi(y_j))\rangle}{\langle\sigma_t^{(1)}(\psi(X))\,\sigma_s^{(1)}(\psi(Y))\rangle} \tag{3.22}$$

is independent of the modular parameters $t$, $s$. Here, in the numerator $\sigma^{(n)}$ is the modular group for a symmetric $n$-interval $\subset \mathbb R$, and in the denominator $\sigma^{(1)}$ is the modular group for the 1-interval $\mathbb R_+$.

3.2. Product states for general n-intervals

With hindsight from [9], we can generalize to non-symmetric $n$-intervals the model-independent construction of a product state, as in Sec. 2.1, by replacing the function $z \mapsto z^n$ as follows. If $C$ stands for the Cayley transformation $x \mapsto z = \frac{1 + ix}{1 - ix}$ and $\bigcup_k (a_k, b_k) \subset \mathbb R$ the pre-image of a symmetric $n$-interval $E = \sqrt[n]{I}$, then $U = C(a_k)^n \in S^1$ and $V = C(b_k)^n \in S^1$ do not depend on $k$. One computes the uniformization function (3.2) in this case to be given by

$$e^\zeta = C^{-1} \circ \mu \circ (z \mapsto z^n) \circ C(x) \tag{3.23}$$

where $\mu : S^1 \to S^1$ is the Möbius transformation $Z \mapsto C\Bigl(\frac{(-1)^n - V}{(-1)^n - U} \cdot \frac{Z - U}{V - Z}\Bigr)$, that takes $I$ to $S^1_+$. For a general $n$-interval $E = \bigcup I_k \subset \dot S^1$, one may choose $\mu$ an arbitrary Möbius transformation, and replace $z \mapsto z^n$ by the function

$$g(z) := \mu^{-1} \circ C \circ e^\zeta \circ C^{-1}, \tag{3.24}$$

where $\zeta$ is the uniformization function (3.2). Thus, $g$ maps each component $I_k$ onto the same interval $I = \mu^{-1}(S^1_+)$, i.e. we have $E = g^{-1}(I)$. Repeating the construction of Proposition 2 with factor states $\varphi_k = \omega \circ \mathrm{Ad}\,U(\gamma_k)$, where the diffeomorphisms $\gamma_k$ coincide with $g$ on $I_k$, one obtains a product state with the geometric modular flow

$$f_t(z) = g^{-1}\bigl(\Lambda_I(-2\pi t)\,g(z)\bigr), \tag{3.25}$$

instead of (2.5). By construction, this flow corresponds to $\zeta(t) = \zeta(0) - 2\pi t$ as before, which in turn coincides with the geometric part of the vacuum modular flow (3.1).
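For the symmetric case, the normalization in (3.23) can be compared numerically with the product formula (3.2). The sketch below assumes the reading $\mu(Z) = C\bigl(\frac{(-1)^n - V}{(-1)^n - U} \cdot \frac{Z - U}{V - Z}\bigr)$, so that $C^{-1} \circ \mu$ is just the Möbius expression in the argument of $C$; the arc $I$ (angles `A`, `B`) and $n = 3$ are hypothetical choices for which none of the component arcs contains $-1$.

```python
import cmath
import math

def cayley(x):
    """Cayley transform x -> z = (1 + ix)/(1 - ix)."""
    return (1 + 1j * x) / (1 - 1j * x)

def wrap(angle):
    """Representative of `angle` in the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi

def symmetric_intervals(A, B, n):
    """Real-line pre-images (a_k, b_k) of the n-th roots of the arc (e^{iA}, e^{iB})."""
    return [(math.tan(wrap((A + 2 * math.pi * k) / n) / 2),
             math.tan(wrap((B + 2 * math.pi * k) / n) / 2)) for k in range(n)]

def exp_zeta_direct(x, intervals):
    """exp(zeta) from the product formula (3.2)."""
    value = -1.0
    for a, b in intervals:
        value *= (x - a) / (x - b)
    return value

def exp_zeta_via_mu(x, A, B, n):
    """exp(zeta) from (3.23), using the assumed form of the Moebius map mu."""
    U, V = cmath.exp(1j * A), cmath.exp(1j * B)
    Z = cayley(x) ** n
    sign = (-1) ** n
    return (sign - V) / (sign - U) * (Z - U) / (V - Z)

A, B, n = 0.3, 2.0, 3
intervals = symmetric_intervals(A, B, n)
for x in (0.2, 1.0, -4.0):
    print(x, exp_zeta_direct(x, intervals), exp_zeta_via_mu(x, A, B, n))
```

The two columns agree up to rounding, with the imaginary part of the second at rounding level. The prefactor is fixed by the limit $x \to \infty$, where $z \to -1$ and $e^\zeta \to -1$.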

3.3. Lessons from the free Fermi model

Charge splitting

It is tempting to ask whether, and in which precise sense, the free Fermi field result extends also to the free Bose case. (The authors of [9] are positive about this, but did not present a proof.) In the chiral situation, the free Bose net $A(I)$ (the current algebra with central charge $c = 1$) is given by the neutral subalgebras of the complex free Fermi net $F(I)$. Because the vacuum state is invariant under the charge transformation, there is a vacuum-preserving conditional expectation $\varepsilon : F(I) \to A(I)$, implying that the vacuum modular group of $F(E)$ restricts to the vacuum modular group of $C(E) := \varepsilon(F(E))$. We have

$$\begin{array}{ccccc} & & F(E) & & \\ & & \varepsilon\downarrow & & \\ A(E) & \subset & C(E) & \subset & \hat A(E), \end{array} \tag{3.26}$$

where both inclusions are strict: $C(E)$ contains neutral products of integer charged elements of $F(I_k)$ in different component intervals, which do not belong to $A(E)$, while $\hat A(E)$ contains “charge transporters” [6, 22] for the continuum of superselection sectors of the current algebra with central charge $c = 1$, which do not belong to $C(E)$.

Being the restriction of the vacuum modular group of $F(E)$, the action of the vacuum modular group of $C(E)$ can be directly read off. It acts geometrically, i.e. takes $C(F)$ to $C(f_t(F))$,^d but it does not take $A(F)$ to $A(f_t(F))$, because the mixing takes a neutral product of two Fermi fields in one component $J_k$ of $F$ to a linear combination of neutral products of Fermi fields in different components $f_t(J_j)$, belonging to $C(f_t(F))$ but not to $A(f_t(F))$. Let us call this feature “charge splitting” (stronger than “mixing”).

The inclusion situation (3.26) does not permit to determine the vacuum modular flow of $A(E)$ from that of $C(E)$, because there is no vacuum-preserving conditional expectation $C(E) \to A(E)$ that would imply that the modular group restricts. (Of course, this would be a contradiction, because we have already seen that the modular group of $F(E)$, and hence that of $C(E)$, does not preserve $A(E)$.) Similarly, we cannot conclude that the vacuum modular flow of $\hat A(E)$ should extend that of $C(E)$, or that of $A(E)$. Proposition 6 below actually shows that this scenario must be excluded.

Application to BCFT

It is instructive to discuss the consequence of the free Fermi field mixing and the ensuing charge splitting for $C(E)$ under the geometric re-interpretation of boundary CFT, as in Sec. 2.2. For definiteness and simplicity, we consider the case when $A$ is the even subnet of the real free Fermi net, i.e. $A$ is the Virasoro net with $c = \frac{1}{2}$. Unlike the $c = 1$ free Bose net, this model is completely rational.

The same considerations as in the previous argument apply also in this case: Again, the inclusions $A(E) \subset C(E) := \varepsilon(F(E)) \equiv F(E)^{\mathbb Z_2} \subset \hat A(E)$ are strict, the latter because charge transporters for the Ramond sector (weight $h = \frac{1}{16}$) do not belong to $C(E)$. The vacuum modular flow for $C(E)$ is induced by that for $F(E)$, but it does not pass to $A(E)$ or $\hat A(E)$.

^d Here and below, $F \subset E$ always stands for an $n$-interval $F = \bigcup_k J_k$ where $J_k$ are the components of the pre-image of some interval under the function $\zeta$ (3.2), i.e. in the symmetric case, $F = \sqrt[n]{J}$ with $J \subset I$.

Let therefore $E \subset \dot S^1$ be 2-intervals and $O = I_+^{\mathbb R} \times I_-^{\mathbb R} \subset M_+$ the associated double cones. The net

$$O \mapsto C(O) = F(E)^{\mathbb Z_2} \tag{3.27}$$

is a BCFT net intermediate between the “minimal” net $A_+(O) = A(E)$ and the “maximal” (Haag dual) net $B_+(O) = \hat A(E)$, see [25]. It is generated by fields $\prod_{i=1}^n \psi(u_i)\prod_{j=1}^m \psi(v_j)$ with $n + m$ even, and $u_i$ smeared in $I_+^{\mathbb R}$, $v_j$ smeared in $I_-^{\mathbb R}$. The vacuum modular flow of $C(O)$ mixes $f_t u_i$ with $f_t u_i'$ and $f_t v_j$ with $f_t v_j'$, where $u \mapsto u'$ and $v \mapsto v'$ are the bijections of the two intervals onto each other connecting the two pre-images of the uniformization function $\zeta$. Hence, if $\psi(u)^n\psi(v)^m$ (in schematical notation) belongs to $C(Q)$ for a double cone $Q \subset O$, the vacuum modular flow takes it to linear combinations of

$$\psi(f_t u)^{n_1}\,\psi(f_t u')^{n_2}\,\psi(f_t v)^{m_1}\,\psi(f_t v')^{m_2} \tag{3.28}$$

with $n_1 + n_2 = n$, $m_1 + m_2 = m$. Grouping the charged factors to neutral (even) “bi-localized” products, these generators belong to the local algebra of 6 double cones $\bigvee_{\alpha=1}^6 C(f_t Q_\alpha) \subset C(f_t \hat Q)$ around 6 points as indicated in Fig. 3.

In spite of the fact that two of the 6 double cones $Q_\alpha$ lie outside $Q$, the corresponding algebras $C(Q_\alpha)$ are contained in $C(\hat Q)$. But their bi-localized generators, such as $\psi(u)\psi(v')$, cannot be associated with points in $\hat Q$, because on the boundary they are localized in the entire interval $J_+$ spanned by $u$ and $v'$ [26, Sec. 2], hence belong to $\bigvee_{J_-} C(J_+ \times J_-) \subset C(\hat Q)$. Therefore, in the geometric re-interpretation of boundary CFT, the discrete mixing (charge splitting) on top of the geometric modular action induces a truly “fuzzy” action on BCFT algebras associated with double cones $Q \subset O$! The fuzziness seems, however, not to be described by a pseudo-differential operator, as suggested in [30, 29], but rather reflects the nonlocality of an operator product expansion for bi-localized fields.

Fig. 3. The 6 regions mixed by the vacuum modular flow in boundary CFT. $(u, v)$ is a point in $Q \subset O$. The boost is the distinguished orbit in $O$ as in Sec. 2.5, and defines $u' = -\frac{1}{u}$ and $v' = -\frac{1}{v}$. If $(u, v)$ lies on the boost, then the points $(v, u')$ and $(v', u)$ lie on the boundary. Consequently, if a double cone $Q \subset O$ around $(u, v)$ intersects the distinguished orbit, then four of the 6 associated double cones $Q_\alpha$ merge with each other, while the other two touch the boundary and degenerate to left wedges. (The flow $f_t$ itself, as in Fig. 2, is suppressed.) (Labels in the figure: $O$, $Q$, $\hat Q$, $J_+$, $u$, $u'$, $v$, $v'$.)

3.4. Preliminaries for a general theory

Also in the general case of a local chiral net $A$, there is a notion of “charge splitting”: Superselection sectors are described by DHR endomorphisms of the local net, which are localized in some interval [12, 13]. Intertwiners that change the interval of localization (charge transporters) are observables, i.e. they do not carry a charge themselves, but they may be regarded as operators that annihilate a charge in one interval and create the same charge in another interval. These charge transporters do not belong to $A(E)$ (where the 2-interval $E$ is the union of the two intervals), but together with $A(E)$ generate $\hat A(E)$, see the discussion in [22, Sec. 5].

Therefore, one may speculate whether the combination of geometric action with charge splitting could be a general feature for the vacuum modular group of suitable $n$-interval algebras intermediate between $A(E)$ and $\hat A(E)$, i.e. the modular group does not preserve the subalgebras $A(F)$, let alone the algebras of the component intervals $A(J_k)$. The discussion of the algebras $A(E) \subset C(E) \subset \hat A(E)$ in the preceding subsection shows that there cannot be a simple general answer. Nevertheless, we can derive a few first general results.

Proposition 6. Let $\Phi \in H$ be a joint cyclic and separating vector for $A(E)$ and $A(E')$, e.g., the vacuum.

(i) If the modular automorphism group of $(\hat A(E), \Phi)$ globally preserves the subalgebra $A(E)$, then $\hat A(E) = A(E)$.

(ii) If the adjoint action of the modular unitaries $\Delta^{it}$ for $(A(E), \Phi)$ globally preserves $\hat A(E)$, or, equivalently, $A(E')$, then $\hat A(E) = A(E)$.

Proof. By assumption, $\Phi$ is also cyclic and separating for $\hat A(E) = A(E')'$ and $\hat A(E') = A(E)'$. Then (i) follows directly by Takesaki’s Theorem [32, Chap. IX, Theorem 4.2]. For (ii), note that $\Delta^{it}$ preserves $A(E')$ if and only if it preserves $A(E')' = \hat A(E)$; and $\Delta^{-it}$ implements the modular automorphism group for $(A(E)' = \hat A(E'), \Phi)$. Thus, the statement is equivalent to (i), with $E$ replaced by $E'$.

The obvious relevance of Proposition 6(ii) is that in the generic case when $\hat A(E)$ is strictly larger than $A(E)$, there can be no vector state satisfying the Reeh–Schlieder property such that $A(E)$ has geometric modular action on $A(E)$ and on $A(E')$. In particular, the modular unitaries will not belong to the diffeomorphism group, but we may expect that Connes spatial derivatives as in Proposition 2 do.

Recall that we have already seen (in the Remark after (3.4)) that mixing necessarily occurs. By Proposition 6(i), it is not possible that $\hat A(E)$ has geometric modular action without charge splitting.

4. Loose Ends

We have put into relation and contrasted the two facts that

(i) in diffeomorphism covariant conformal quantum field theory there is a construction of states on the von Neumann algebras of local observables associated with disconnected unions of $n$ intervals ($n$-intervals), such that the modular group acts by diffeomorphisms of the intervals [21], and

(ii) in the theory of free chiral Fermi fields, the modular action of the vacuum state on $n$-interval algebras is given by a combination of a geometric flow with a “mixing” among the intervals [9].

The absence of the mixing in (i) can be ascribed to the choice of “product” states in which quantum correlations across different intervals are suppressed. (In the re-interpretation of 2-interval algebras as double cone algebras in boundary conformal field theory [25], the influence of the boundary was shown to weaken, as expected on physical grounds, in the limit when the double cone is far away from the boundary [26]. Indeed, it can be seen from the formula (3.12) for the mixing angle that in this limit the mixing in (ii) also disappears.)

On the other hand, there is some freedom in the choice of product states, which allows to deform the geometric modular flow within each of the intervals. It comes therefore as a certain surprise that the geometric part of the vacuum modular flow in (ii) coincides with the purely geometric flow in the product states in (i), precisely when the latter are chosen in a “canonical” way (involving the simple function $z \mapsto z^n$ on the circle, corresponding to $\nu = 1$ in (2.11), in the case of symmetric $n$-intervals, and the function $g$ (3.24) in the general case). This means that the relative Connes cocycle between the vacuum state and the “canonical” product state is just the mixing, while for all other product states, it will also involve a geometric component.

Two circles of questions arise: First, is the geometric part of the vacuum flow specific for the free Fermi model, or is it universal? And if it is universal, what takes the place of the mixing in the general case? Putting aside some technical complications of the proof, the authors of [9] claim a universal behavior for free fields, while in this paper, we have given first indications how the geometric behavior should “propagate” to subtheories and to field extensions, also strongly supporting the idea of a universal behavior. Insight from the theory of superselection sectors suggests that the mixing in the general case should be replaced by a “charge splitting”. On the other hand, Takesaki’s Theorem poses obstructions against the idea that charge splitting on top of a geometric modular flow could be the general answer (Proposition 6).

Second, the notion of “canonical” ($\nu = 1$) in the above should be given a physical meaning, related to the absence of a geometric component in the Connes cocycle. In the free Fermi case, the geometric part of the modular Hamiltonian contains the stress-energy tensor $\sim \psi(x)\partial_x\psi(x)$, while the mixing part can be expressed in terms of $\psi(x_k)\psi(x_l)$ with $x_k$ and $x_l$ belonging to different intervals. The absence of derivatives suggests that the Connes cocycle is “more regular in the UV” in the case when the geometric parts coincide, than in the general case. The same should be true for the generalized product state constructed in Sec. 3.2. A precise formulation of this UV regularity is wanted.

Acknowledgments

We thank Jakob Yngvason for bringing to our attention the article of Casini and Huerta [9], and Horacio Casini for discussions about their work. We also thank the Erwin Schrödinger Institute (Vienna) for the hospitality at the “Operator Algebras and Conformal Field Theory” program, August–December 2008, where this work has been initiated.

This work was supported in part by ERC Advanced Grant 227458 OACFT “Operator Algebras and Conformal Field Theory”, and by the EU network “Noncommutative Geometry” MRTN-CT-2006-0031962. R.L. is partially supported by PRIN-MIUR and GNAMPA-INDAM. P.M. and K.H.R. are supported in part by the German Research Foundation (Deutsche Forschungsgemeinschaft (DFG)) through the Institutional Strategy of the University of Göttingen.

References

[1] J. Bisognano and E. H. Wichmann, On the duality condition for quantum fields, J. Math. Phys. 17 (1976) 303–321.
[2] H.-J. Borchers, On revolutionizing QFT with modular theory, J. Math. Phys. 41 (2000) 3604–3673. [3] H.-J. Borchers and J. Yngvason, Modular groups of quantum ﬁelds in thermal states, J. Math. Phys. 40 (1999) 601–624. [4] R. Brunetti, D. Guido and R. Longo, The conformal spin and statistics theorem, Comm. Math. Phys. 156 (1993) 201–219. [5] D. Buchholz, On the structure of local quantum ﬁelds with non-trivial interaction, in Proc. Intern. Conf. Operator Algebras, Ideals, and Their Applications in Physics, ed. H. Baumg¨ artel (Teubner, 1977), pp. 146–153. [6] D. Buchholz, G. Mack and I. T. Todorov, The current algebra on the circle as a germ of local ﬁeld theories, Nucl. Phys. B 5B (Proc. Suppl.) (1988) 20–56. [7] D. Buchholz, O. Dreyer, M. Florig and S. J. Summers, Geometric modular action and spacetime symmetry groups, Rev. Math. Phys. 12 (2000) 475–560.

14:17 WSPC/S0129-055X

148-RMP

J070-00397

Geometric Modular Action for Disjoint Intervals

353

[8] D. Buchholz, I. Ojima and H. Roos, Thermodynamic properties of non-equilibrium states in quantum ﬁeld theory, Ann. Phys. 297 (2002) 219–242. [9] H. Casini and M. Huerta, Reduced density matrix and internal dynamics for multicomponent regions, Class. Quant. Grav. 26 (2009) 185005, 15 pp. [10] A. Connes, On the spatial theory of von Neumann algebras, J. Funct. Anal. 35 (1980) 153–164. [11] A. Connes and C. Rovelli, Von Neumann algebra automorphisms and time thermodynamics relation in general covariant quantum theories, Class. Quant. Grav. 11 (1994) 2899–2918. [12] S. Doplicher, R. Haag and J. E. Roberts, Local observables and particle statistics, I, Comm. Math. Phys. 23 (1971) 199–230. [13] ———, Local observables and particle statics, II, Comm. Math. Phys. 35 (1974) 49–85. [14] F. Figliolini and D. Guido, The Tomita operator for the free scalar ﬁeld, Ann. Inst. Henri Poinc´ are Phys. Theor. 51 (1989) 419–435. [15] E. E. Flanagan, Quantum inequalities in two-dimensional Minkowski spacetime, Phys. Rev. D 56 (1997) 4922–4926. [16] D. Guido, R. Longo and H.-W. Wiesbrock, Extensions of conformal nets and superselection structures, Comm. Math. Phys. 192 (1998) 217–244. [17] R. Haag, N. Hugenholtz and M. Winnink, On the equilibrium states in quantum statistical mechanics, Comm. Math. Phys. 5 (1967) 215–236. [18] P. Hislop and R. Longo, Modular structure of the local algebras associated with the free massless scalar ﬁeld theory, Comm. Math. Phys. 84 (1982) 71–85. [19] M. Izumi and H. Kosaki, On a subfactor analogue of the second cohomology, Rev. Math. Phys. 14 (2002) 733–757. [20] M. Izumi, R. Longo and S. Popa, A Galois correspondence for compact groups of automorphisms of von Neumann algebras with a generalization to Kac algebras, J. Funct. Anal. 155 (1998) 25–63. [21] Y. Kawahigashi and R. Longo, Noncommutative spectral invariants and black hole entropy, Comm. Math. Phys. 257 (2005) 193–225. [22] Y. Kawahigashi, R. Longo and M. 
M¨ uger, Multi-interval subfactors and modularity of representations in conformal ﬁeld theory, Comm. Math. Phys. 219 (2001) 631–669. [23] R. K¨ ahler and H.-W. Wiesbrock, Modular theory and the reconstruction of fourdimensional quantum ﬁeld theories, J. Math. Phys. 42 (2001) 74–86. [24] R. Longo and K.-H. Rehren, Nets of subfactors, Rev. Math. Phys. 7 (1995) 567–597. [25] R. Longo and K.-H. Rehren, Local ﬁelds in boundary conformal QFT, Rev. Math. Phys. 16 (2004) 909–960. [26] R. Longo and K.-H. Rehren, How to remove the boundary: An operator algebraic procedure, Comm. Math. Phys. 285 (2009) 1165–1182. [27] R. Longo and F. Xu, Topological sectors and a dichotomy in conformal ﬁeld theory, Comm. Math. Phys. 251 (2004) 321–364. [28] P. Martinetti and C. Rovelli, Diamond’s temperature: Unruh eﬀect for bounded trajectories and thermal time hypothesis, Class. Quant. Grav. 20 (2003) 4919–4932. [29] T. Saﬀary, On the generator of massive modular groups, Lett. Math. Phys. 77 (2006) 235–248. [30] B. Schroer and H.-W. Wiesbrock, Modular theory and geometry, Rev. Math. Phys. 12 (2000) 139–158. [31] G. Sewell, Relativity of temperature and the Hawking eﬀect, Phys. Lett. A 79 (1980) 23–24.


[32] M. Takesaki, Theory of Operator Algebras, II, Springer Encyclopedia of Mathematical Sciences, Vol. 125 (Springer-Verlag, 2003).
[33] S. Trebels, Über die geometrische Wirkung modularer Automorphismen, PhD thesis, Göttingen (1997); (in German, see also [2, Chap. III.4]).
[34] W. G. Unruh, Notes on black-hole evaporation, Phys. Rev. D 14 (1976) 870–892.

May 11, 2010 10:6 WSPC/S0129-055X 148-RMP J070-S0129055X10003941

Reviews in Mathematical Physics Vol. 22, No. 4 (2010) 355–380 © World Scientific Publishing Company DOI: 10.1142/S0129055X10003941

SPECTRAL SHIFT FUNCTION FOR OPERATORS WITH CROSSED MAGNETIC AND ELECTRIC FIELDS

MOUEZ DIMASSI∗ and VESSELIN PETKOV†

∗Département de Mathématiques, Université Paris 13, 99, Avenue J.-B. Clément, 93430 Villetaneuse, France [email protected]

†Université Bordeaux I, Institut de Mathématiques de Bordeaux, 351, Cours de la Libération, 33405 Talence, France [email protected]

Received 19 August 2009 Revised 8 January 2010

We obtain a representation formula for the derivative of the spectral shift function $\xi(\lambda; B, \varepsilon)$ related to the operators $H_0(B,\varepsilon) = (D_x - By)^2 + D_y^2 + \varepsilon x$ and $H(B,\varepsilon) = H_0(B,\varepsilon) + V(x,y)$, $B > 0$, $\varepsilon > 0$. We establish a limiting absorption principle for $H(B,\varepsilon)$ and an estimate $O(\varepsilon^{n-2})$ for $\xi'(\lambda; B, \varepsilon)$, provided $\lambda \notin \sigma(Q)$, where $Q = (D_x - By)^2 + D_y^2 + V(x,y)$.

Keywords: Magnetic potential; Stark operator; spectral shift function.

Mathematics Subject Classification 2010: 35P25, 35Q40

1. Introduction

Consider the two-dimensional Schrödinger operator with homogeneous magnetic and electric fields
\[ H = H(B,\varepsilon) = H_0(B,\varepsilon) + V(x,y), \qquad D_x = -i\partial_x, \quad D_y = -i\partial_y, \]
where
\[ H_0 = H_0(B,\varepsilon) = (D_x - By)^2 + D_y^2 + \varepsilon x. \]
Here $B > 0$ and $\varepsilon > 0$ are proportional to the strength of the homogeneous magnetic and electric fields. We assume that $V, \partial_x V \in C^0(\mathbb{R}^2;\mathbb{R}) \cap L^\infty(\mathbb{R}^2;\mathbb{R})$ and that $V(x,y)$ satisfies the estimate
\[ |V(x,y)| \le C(1+|x|)^{-2-\delta}(1+|y|)^{-1-\delta}, \qquad \delta > 0. \tag{1.1} \]
For $\varepsilon \neq 0$ we have $\sigma_{\mathrm{ess}}(H_0(B,\varepsilon)) = \sigma_{\mathrm{ess}}(H(B,\varepsilon)) = \mathbb{R}$. On the other hand, for decreasing potentials $V$ we may have embedded eigenvalues $\lambda \in \mathbb{R}$, and this situation is completely different from that with $\varepsilon = 0$, when the spectrum of $H(B,0)$ is formed


by eigenvalues with finite multiplicities which may accumulate only to Landau levels $\lambda_n = (2n+1)B$, $n \in \mathbb{N}$ (see [9, 13, 15] and the references cited there). The spectral properties of $H$ and the existence of resonances have been studied in [5, 7, 8] under the assumption that $V(x,y)$ admits a holomorphic extension in the $x$-variable into a domain $\Gamma_{\delta_0} = \{z \in \mathbb{C} : 0 \le |\operatorname{Im} z| \le \delta_0\}$. Moreover, without any assumption on the analyticity of $V(x,y)$, we show in Proposition 2 below that the operator $(H - z)^{-1} - (H_0 - z)^{-1}$ for $z \in \mathbb{C}$, $\operatorname{Im} z \neq 0$, is trace class and, following the general setup [11, 20], we define the spectral shift function $\xi(\lambda) = \xi(\lambda; B, \varepsilon)$ related to $H_0(B,\varepsilon)$ and $H(B,\varepsilon)$ by
\[ \langle \xi', f \rangle = \operatorname{tr}(f(H) - f(H_0)), \qquad f \in C_0^\infty(\mathbb{R}). \]
By this formula $\xi(\lambda)$ is defined modulo a constant, but for the analysis of the derivative $\xi'(\lambda)$ this is not important. Moreover, the above property of the resolvents and the Birman–Kuroda theorem imply $\sigma_{\mathrm{ac}}(H_0(B,\varepsilon)) = \sigma_{\mathrm{ac}}(H(B,\varepsilon)) = \mathbb{R}$.

A representation of the derivative $\xi'(\lambda; B, \varepsilon)$ has been obtained in [5] for strong magnetic fields $B \to +\infty$ under the assumption that $V(x,y)$ admits an analytic continuation in the $x$-direction. Moreover, the distribution of the resonances $z_j$ of the perturbed operator $H(B,\varepsilon)$ has been examined in [5], and a Breit–Wigner representation of $\xi'(\lambda; B, \varepsilon)$ involving the resonances $z_j$ was established.

In the literature there are many works concerning Schrödinger operators with magnetic fields ($\varepsilon = 0$), but only a few dealing with magnetic and Stark potentials ($\varepsilon \neq 0$) (see [5, 7, 8] and the references given there). It should be mentioned that the tools in [5, 7, 8] are related to the resonances of the perturbed problem, and to define the resonances one supposes that the potential $V(x,y)$ has an analytic continuation in the $x$ variable. In this paper we consider the operator $H$ without any assumption on the analytic continuation of $V(x,y)$ and without the restriction $B \to +\infty$. Our purpose is to study $\xi'(\lambda; B, \varepsilon)$ and the existence of embedded eigenvalues of $H$. To examine the behavior of the spectral shift function we need a representation of the derivative $\xi'(\lambda; B, \varepsilon)$. The key point in this direction is the following

Theorem 1. Let $V, \partial_x V \in C^0(\mathbb{R}^2;\mathbb{R}) \cap L^\infty(\mathbb{R}^2;\mathbb{R})$ and let (1.1) hold for $V$ and $\partial_x V$. Then for every $f \in C_0^\infty(\mathbb{R})$ and $\varepsilon \neq 0$ we have
\[ \operatorname{tr}(f(H) - f(H_0)) = -\frac{1}{\varepsilon}\operatorname{tr}(\partial_x V f(H)). \tag{1.2} \]

The formula (1.2) has been proved by Robert and Wang [18] for Stark Hamiltonians in the absence of a magnetic field ($B = 0$). In fact, the result in [18] says that
\[ \xi'(\lambda; 0, \varepsilon) = -\frac{1}{\varepsilon}\int_{\mathbb{R}^2} \partial_x V\, \frac{\partial e}{\partial \lambda}(x,y,x,y;\lambda,0,\varepsilon)\,dx\,dy, \tag{1.3} \]
where $e(\cdot,\cdot;\lambda,0,\varepsilon)$ is the spectral function of $H(0,\varepsilon)$. The presence of a magnetic field $B \neq 0$ and a Stark potential leads to some serious difficulties.
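The structure of (1.2) can be anticipated from a purely formal commutator computation, sketched below. It is only a heuristic: the individual traces are not known to exist, and Sec. 2 replaces $\partial_x$ by the cut-off operator $\chi_R \partial_x$ precisely to make the steps meaningful.

```latex
% Formal computation only; the traces on the second line need the
% cutoff \chi_R of Sec. 2 to be well defined.
\begin{align*}
[\partial_x, H_0] &= \varepsilon, \qquad [\partial_x, H] = \varepsilon + \partial_x V, \\
0 &\overset{\text{formally}}{=}
   \operatorname{tr}\bigl([\partial_x, H] f(H)\bigr)
   - \operatorname{tr}\bigl([\partial_x, H_0] f(H_0)\bigr) \\
  &= \varepsilon \operatorname{tr}\bigl(f(H) - f(H_0)\bigr)
   + \operatorname{tr}\bigl(\partial_x V\, f(H)\bigr).
\end{align*}
% Dividing by \varepsilon \neq 0 gives the right-hand side of (1.2).
```

The vanishing of the two commutator traces is what cyclicity of the trace would give, since $H$ commutes with $f(H)$ and $H_0$ with $f(H_0)$.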
The operator H is not


elliptic for $|x| + |y| \to \infty$ and we have double characteristics. On the other hand, the commutator $[H, x]$ involves the term $(D_x - By)$ and it creates additional difficulties. The proof of Theorem 1 is long and technical. We are going to study the trace class properties of the operators $\psi(H \pm i)^{-N}$, $\partial_x \circ \psi(H \pm i)^{-N-1}$, $(H \pm i)\partial_x \circ \psi(H \pm i)^{-N-2}$, etc., for $N \ge 2$ and $\psi \in C_0^\infty(\mathbb{R}^2)$ (see Lemmas 1 and 2). Moreover, by an argument similar to that in [5, Proposition 2.1], we obtain estimates for the trace norms of the operators
\[ (z - H)^{-1} V (z' - H)^{-1}, \qquad V (z - H)^{-1}(z' - H)^{-1}, \qquad z \notin \mathbb{R}, \quad z' \notin \mathbb{R}, \]
and we apply an approximation argument. Notice that in [18] the spectral shift function is related to the trace of the time delay operator $T(\lambda)$ defined via the corresponding scattering matrix $S(\lambda)$ (see [17]). In contrast to [18], our proof is direct, and neither $T(\lambda)$ nor $S(\lambda)$ corresponding to the operator $H(B,\varepsilon)$ is used.

The second question examined in this work is the existence of embedded real eigenvalues and the limiting absorption principle for $H$. In the physical literature one conjectures that for $\varepsilon \neq 0$ there are no embedded eigenvalues. We establish in Sec. 3 a weaker result saying that in any interval $[a, b]$ we may have at most a finite number of embedded eigenvalues with finite multiplicities. Under the assumption of analytic continuation of $V$ it was proved in [7] that for some finite interval $[\alpha(B,\varepsilon), \beta(B,\varepsilon)]$ there are no resonances $z$ of $H(B,\varepsilon)$ with $\operatorname{Re} z \notin [\alpha(B,\varepsilon), \beta(B,\varepsilon)]$. Since the real resonances $z$ coincide with the eigenvalues of $H(B,\varepsilon)$, we obtain some information on the embedded eigenvalues. On the other hand, exploiting the analytic continuation and the resonances, we proved in [5] that for $B \to +\infty$ the real parts $\operatorname{Re} z_j$ of the resonances $z_j$ lie outside some neighborhoods of the Landau levels. Thus the Landau levels play a role in the distribution of the resonances.

It is known that the spectrum of the operator $Q = (D_x - By)^2 + D_y^2 + V(x,y)$ with decreasing potential $V$ is formed by eigenvalues (see [9, 13, 15]). In this paper, we establish a limiting absorption principle for $\lambda \notin \sigma(Q)$. In particular, we show that there are no embedded eigenvalues outside $\sigma(Q)$. This agrees with the result in [5] obtained under the restrictions on the behavior of $V$ and $B \to +\infty$. On the other hand, the result of Proposition 3 and the estimates (4.3) have been established by Wang [19] for Stark operators with $B = 0$.

Following the results in Sec. 4 and the representation of $\xi'(\lambda; B, \varepsilon)$ given in [5], it is natural to expect that for $\lambda \notin \sigma(Q)$ the derivative of the spectral shift function $\xi'(\lambda; B, \varepsilon)$ must be bounded. In fact, we prove the following stronger result.

Theorem 2. Let the potential $V \in C^\infty(\mathbb{R}^2;\mathbb{R})$ satisfy, with some $\delta > 0$ and $n \in \mathbb{N}$, $n \ge 2$, the estimates
\[ |\partial_x^\alpha \partial_y^\beta V(x,y)| \le C_{\alpha,\beta}(1+|x|)^{-n-\delta-|\alpha|}(1+|y|)^{-2-\delta-|\beta|}, \qquad \forall \alpha, \ \forall \beta. \tag{1.4} \]
Then for $\lambda_0 \notin \sigma(Q)$ we have
\[ \xi'(\lambda; B, \varepsilon) = O(\varepsilon^{n-2}) \tag{1.5} \]
uniformly for $\lambda$ in a small neighborhood $\Xi \subset \mathbb{R}$ of $\lambda_0$.


The estimate (1.5) has been obtained in [18] in the absence of a magnetic field, $B = 0$ (for a Breit–Wigner formula see [10], [4] for Stark Hamiltonians and [5] for the operator $H(B,\varepsilon)$). Our approach is quite different from that in [18]. Our proof does not use a representation similar to (1.3), which leads to complications connected with the behavior of the spectral function $e(\cdot,\cdot;\lambda,B,\varepsilon)$ corresponding to $H(B,\varepsilon)$. The formula (1.2) plays a crucial role, and our analysis is based on a complex analysis argument combined with a representation of $f(H)$ involving the almost analytic continuation of $f \in C_0^\infty(\mathbb{R})$. In this direction, our argument is similar to that developed in [4, 5].

The plan of this paper is as follows. In Sec. 2, we establish Theorem 1. The embedded eigenvalues and Mourre estimates are examined in Sec. 3. In Sec. 4, we prove Proposition 3 concerning the limiting absorption principle for $H(B,\varepsilon)$. Finally, in Sec. 5, we establish Theorem 2.

2. Representation of the Spectral Shift Function

Throughout this work we will use the notations of [3] for symbols and pseudodifferential operators. In particular, if $m : \mathbb{R}^4 \to [0, +\infty[$ is an order function (see [3, Definition 7.4]), we say that $a(z,\zeta) \in S^0(m)$ if for every $\alpha \in \mathbb{N}^4$ there exists $C_\alpha > 0$ such that
\[ |\partial^\alpha_{z,\zeta} a(z,\zeta)| \le C_\alpha m(z,\zeta). \]
In the special case when $m = 1$, we will write $S^0$ instead of $S^0(1)$. We will use the standard Weyl quantization of symbols. More precisely, if $p(z,\zeta)$, $(z,\zeta) \in \mathbb{R}^4$, is a symbol in $S^0(m)$, then $P^w(z, D_z)$ is the operator defined by
\[ P^w(z, D_z)u(z) = (2\pi)^{-2} \iint e^{i(z-z')\cdot\zeta}\, p\Bigl(\frac{z+z'}{2}, \zeta\Bigr) u(z')\,dz'\,d\zeta, \qquad u \in \mathcal{S}(\mathbb{R}^2). \]
We denote by $P^w(z, hD_z)$ the semiclassical quantization obtained as above by quantizing $p(z, h\zeta)$.

Our goal in this section is to prove Theorem 1. For this purpose we need some lemmas. We set
\[ Q_0 = H_0 - \varepsilon x = (D_x - By)^2 + D_y^2, \qquad Q = Q_0 + V, \]
and in Lemma 1 we will use the notation $H_1 = H$. For simplicity we assume that $\varepsilon = B = 1$. The general case can be covered by the same argument.

Lemma 1. Assume that $V, \partial_x V \in C^0(\mathbb{R}^2;\mathbb{R}) \cap L^\infty(\mathbb{R}^2;\mathbb{R})$ and let $\psi \in C_0^\infty(\mathbb{R}^2)$. Then for $N \ge 2$, $j = 0, 1$ and for $\operatorname{Im} z \neq 0$, the following operators are trace class:
(i) $\psi(H_j \pm i)^{-N}$, $\partial_x \circ \psi(H_j \pm i)^{-N-1}$, $(H_j \pm i)\partial_x \circ \psi(H_j \pm i)^{-N-2}$.
(ii) $(H_j \pm i)^{-N}\psi$, $(H_j \pm i)^{-N-1}\psi \cdot \partial_x$.
(iii) $\psi \circ \partial_x (H_j \pm i)^{-N-1}$, $(H_j \pm i)\psi \circ \partial_x (H_j \pm i)^{-N-2}$.
(iv) $(H_j \pm i)\partial_x (H_j \pm i)^{-N-2}\psi$.
(v) $(H_1 + i)\partial_x (H_1 + i)^{-N-1}(H_1 - z)^{-1}\psi$.


Moreover,
\[ \bigl\|(H_1 + i)\partial_x (H_1 + i)^{-N-1}(H_1 - z)^{-1}\psi\bigr\|_{\mathrm{tr}} = O\Bigl(\frac{|z| + 1}{|\operatorname{Im} z|^2}\Bigr). \tag{2.1} \]
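The shape of the bound in (2.1) can be traced to an elementary resolvent estimate for $(H_1 + i)(H_1 - z)^{-1}$, which the proof uses in its last step. The following short derivation is a sketch assuming only that $H_1$ is self-adjoint, so that $\|(H_1 - z)^{-1}\| \le |\operatorname{Im} z|^{-1}$.

```latex
% Assumption: H_1 self-adjoint, hence \|(H_1 - z)^{-1}\| \le |\operatorname{Im} z|^{-1}.
\begin{align*}
(H_1 + i)(H_1 - z)^{-1}
  &= \bigl((H_1 - z) + (z + i)\bigr)(H_1 - z)^{-1}
   = 1 + (z + i)(H_1 - z)^{-1}, \\
\bigl\|(H_1 + i)(H_1 - z)^{-1}\bigr\|
  &\le 1 + \frac{|z| + 1}{|\operatorname{Im} z|}
   \le \frac{2(|z| + 1)}{|\operatorname{Im} z|},
\end{align*}
% the last step uses |\operatorname{Im} z| \le |z| + 1.
```

One further resolvent $(H_1 - z)^{-1}$ contributes another factor $|\operatorname{Im} z|^{-1}$, which is consistent with the power $|\operatorname{Im} z|^{-2}$ in (2.1).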

Proof. We will prove the lemma only for $(H_1 + i)$; the case concerning $(H_1 - i)$ is similar. On the other hand, the statements for $(H_0 + i)$ follow from those for $(H_1 + i)$ when $V = 0$. From the first resolvent equation, we obtain
\[
\begin{aligned}
(H_1 + z)^{-1} &= (Q_0 + z)^{-1} - (Q_0 + z)^{-1}(x + V)(H_1 + z)^{-1} \\
&= (Q_0 + z)^{-1} + \sum_{j=1}^{N+2} (-1)^j (Q_0 + z)^{-1}\bigl((x + V)(Q_0 + z)^{-1}\bigr)^j \\
&\quad + (-1)^{N+3}\bigl((Q_0 + z)^{-1}(x + V)\bigr)^{N+3}(H_1 + z)^{-1}.
\end{aligned}
\tag{2.2}
\]
Taking $(N - 1)$ derivatives with respect to $z$ in the above identity and setting $z = i$, we see that $(H_1 + i)^{-N}$ is a linear combination of terms
\[ K_N := (Q_0 + i)^{-j_1} W (Q_0 + i)^{-j_2} W \cdots (Q_0 + i)^{-j_r} W (H_1 + i)^{-p}, \]
with $j_1 + \cdots + j_r \ge N$, $j_1 \ge 1$, $p \ge 0$ and $W(x) = x + V(x)$. Recall that if $P \in S^0(m)$ with $m \in L^1(\mathbb{R}^4)$ (respectively, $m \in L^2(\mathbb{R}^4)$), then the corresponding operator is trace class (respectively, Hilbert–Schmidt). By using this and the fact that the symbol of $(Q_0 + i)^{-1}$ is in $S^0(\langle \xi - y, \eta \rangle^{-2})$, we deduce that the operator
\[ K^j_{l,p,l',p'} := \langle x \rangle^{-l} \langle y \rangle^{-p} (Q_0 + i)^{-j} \langle x \rangle^{l'} \langle y \rangle^{p'} \]
is trace class for $l - l', p - p' > 1$, $j \ge 2$, and Hilbert–Schmidt for $l - l', p - p' > 1/2$, $j \ge 1$. Next, we write $\psi K_N$ as follows:
\[
\begin{aligned}
\psi K_N = {} & \psi \langle x \rangle^{3r} \langle y \rangle^{2r} K^{j_1}_{3r,2r,3r-2,2r-2}\, W \langle x \rangle^{-1} K^{j_2}_{3r-3,2r-2,3r-5,2r-4}\, W \langle x \rangle^{-1} \\
& \cdots K^{j_r}_{3,2,1,0}\, W \langle x \rangle^{-1} (H_1 + i)^{-p}.
\end{aligned}
\tag{2.3}
\]
Since $j_1 + j_2 + \cdots + j_r \ge N \ge 2$, in the above decomposition there are at least two Hilbert–Schmidt operators or one of trace class. Combining this with the fact that $\psi \langle x \rangle^{3r} \langle y \rangle^{2r}$, $W \langle x \rangle^{-1}$ and $(H_1 + i)^{-p}$ are bounded from $L^2(\mathbb{R}^2)$ into $L^2(\mathbb{R}^2)$, we conclude that $\psi K_N$ is a trace class operator. Thus $\psi(H_1 + i)^{-N}$ is also a trace class operator. Repeating the same arguments, we obtain the proof for $\partial_x \circ \psi(H_j \pm i)^{-N-1}$. As above, to treat $(H_j \pm i)\partial_x \circ \psi(H_j \pm i)^{-N-2}$, it suffices to show that $(H_j \pm i)\partial_x \circ \psi K_N$ is trace class. If we have $j_1 \ge 2$, the proof is completely similar to that of $\psi(H_1 + i)^{-N}$. In the case where $j_1 = 1$, since $(H_1 + i)\partial_x (Q_0 + i)^{-1}$ is not bounded,


we have to exploit the following representation
\[ (H_1 + i)\partial_x \circ \psi K_N = (H_1 + i)(\partial_x \psi)K_N + (H_1 + i)\psi(Q_0 + i)^{-1}\partial_x \circ W (Q_0 + i)^{-j_2} W \cdots (Q_0 + i)^{-j_r} W (H_1 + i)^{-p}. \]
Next, use the fact that $\partial_x W \in L^\infty$ and repeat the argument of the proof above.

Recall that $A$ is trace class if and only if the adjoint operator $A^*$ is trace class. Consequently, (i) implies (ii). Since $\psi \cdot \partial_x = \partial_x \cdot \psi - (\partial_x \psi)$, the assertion (iii) follows from (i). To deal with (iv), we apply the following obvious identity with $z = -i$,
\[ \partial_x (H - z)^{-1} = (H - z)^{-1}\partial_x + (H - z)^{-1}(1 + \partial_x V)(H - z)^{-1}, \tag{2.4} \]
and obtain
\[ (H_1 + i)\partial_x (H_1 + i)^{-N}\psi = (H_1 + i)^{-N}\partial_x \psi + \sum_{j=0}^{N-1} (H_1 + i)^{-j}(1 + \partial_x V)(H_1 + i)^{-N+j}\psi. \tag{2.5} \]
Applying (i) and (ii) to each term on the right-hand side of (2.5), we get (iv). Now we pass to the proof of (v). Applying (2.4), we obtain
\[
\begin{aligned}
(H_1 + i)\partial_x (H_1 + i)^{-N-1}(H_1 - z)^{-1}\psi = {} & (H_1 + i)(H_1 - z)^{-1}\partial_x (H_1 + i)^{-N-1}\psi \\
& + (H_1 + i)(H_1 - z)^{-1}(1 + \partial_x V)(H_1 - z)^{-1}(H_1 + i)^{-N}\psi.
\end{aligned}
\]
Combining the above equation with (i), (ii), (iv) and using the estimate
\[ \bigl\|(H_1 + i)(H_1 - z)^{-1}\bigr\| = O\Bigl(\frac{|z| + 1}{|\operatorname{Im} z|}\Bigr), \]
we get (2.1).

Lemma 2. Assume that $V(x,y) = \varphi(x,y)W(x,y)$, where $\varphi \in C_0^\infty(\mathbb{R}^2;\mathbb{R})$ and $W, \partial_x W \in C^0(\mathbb{R}^2;\mathbb{R}) \cap L^\infty(\mathbb{R}^2;\mathbb{R})$. Then for $N \ge 4$ the operator
\[ (H + i)\partial_x \bigl[(H + i)^{-N} - (H_0 + i)^{-N}\bigr] \]
is trace class.

Proof. Taking $(N - 1)$ derivatives with respect to $z$ in the resolvent identity
\[ (H + z)^{-1} - (H_0 + z)^{-1} = -(H + z)^{-1} V (H_0 + z)^{-1} \]
and setting $z = i$, we see that $(H + i)^{-N} - (H_0 + i)^{-N}$ is a linear combination of terms
\[ (H + i)^{-j} V (H_0 + i)^{-(N+1-j)} \]


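with $1 \le j \le N$. Composing the above terms by $(H + i)\partial_x$ and applying Lemma 1, we complete the proof.

Lemma 3. Assume that $V$ satisfies the assumptions of Lemma 1. Let $f \in C_0^\infty(\mathbb{R})$ and $\psi \in C_0^\infty(\mathbb{R}^2)$. Then the operators
\[ \psi f(H_i), \qquad H_i \psi \partial_x f(H_i), \qquad \psi \partial_x H_i f(H_i) \]
are trace class and we have
\[ \operatorname{tr}(H_i \psi \partial_x f(H_i)) = \operatorname{tr}(\psi \partial_x H_i f(H_i)). \]

Proof. Set $g(x) = (x + i)^4 f(x)$. Since $g(H_i)$ is bounded, it follows from Lemma 1 that the operators
\[ \psi (H_i + i)^{-4} g(H_i), \qquad H_i \psi \partial_x (H_i + i)^{-4} g(H_i), \qquad \psi \partial_x (H_i + i)^{-4} H_i g(H_i) \]
are trace class, and the cyclicity of the trace yields
\[
\begin{aligned}
\operatorname{tr}(H_i \psi \partial_x f(H_i)) &= \operatorname{tr}(H_i \psi \partial_x (H_i + i)^{-4} g(H_i)) = \operatorname{tr}(H_i g(H_i) \psi \partial_x (H_i + i)^{-4}) \\
&= \operatorname{tr}(\psi \partial_x (H_i + i)^{-4} g(H_i) H_i) = \operatorname{tr}(\psi \partial_x H_i f(H_i)).
\end{aligned}
\]
Notice that in the above equalities we have used the fact that the operators $g(H_i)$, $H_i$ and $(H_i + i)^{-4}$ commute.

Lemma 4. Let $V$ be as in Lemma 2. Then for every $f \in C_0^\infty(\mathbb{R})$ the operators
\[ f(H) - f(H_0), \qquad \partial_x (f(H) - f(H_0)) \qquad \text{and} \qquad (H \pm i)\partial_x (f(H) - f(H_0)) \]
are trace class.

Proof. Let $g(x) = (x + i)^4 f(x)$ be as above. We decompose
\[
\begin{aligned}
(H + i)\partial_x (f(H) - f(H_0)) &= (H + i)\partial_x \bigl((H + i)^{-4} - (H_0 + i)^{-4}\bigr) g(H_0) + (H + i)\partial_x (H + i)^{-4}(g(H) - g(H_0)) \\
&= \mathrm{I} + \mathrm{II}.
\end{aligned}
\]
According to Lemma 2, the operator I is trace class. To treat II, we use the Helffer–Sjöstrand formula
\[
\begin{aligned}
\mathrm{II} &= -\frac{1}{\pi} \int \bar{\partial}\tilde{g}(z)\,(H + i)\partial_x (H + i)^{-4}\bigl((z - H)^{-1} - (z - H_0)^{-1}\bigr)\,L(dz) \\
&= -\frac{1}{\pi} \int \bar{\partial}\tilde{g}(z)\,(H + i)\partial_x (H + i)^{-4}(z - H)^{-1} V (z - H_0)^{-1}\,L(dz),
\end{aligned}
\]
where $\tilde{g}(z) \in C_0^\infty(\mathbb{C})$ is an almost analytic continuation of $g$ such that $\bar{\partial}\tilde{g}(z) = O(|\operatorname{Im} z|^\infty)$, while $L(dz)$ is the Lebesgue measure on $\mathbb{C}$. Now applying Lemma 1(v), we see that the operator
\[ (H + i)\partial_x (H + i)^{-4}(z - H)^{-1} V \]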


is trace class. Since $|z|$ is bounded on $\operatorname{supp} \tilde{g}$, we can apply (2.1) to the right-hand part of the above equation, and combining this with $\bar{\partial}\tilde{g}(z) = O(|\operatorname{Im} z|^\infty)$, we deduce that II is trace class. Summing up, we conclude that $(H + i)\partial_x (f(H) - f(H_0))$ is trace class. The same argument works for $(H - i)\partial_x (f(H) - f(H_0))$. The proofs concerning $f(H) - f(H_0)$ and $\partial_x (f(H) - f(H_0))$ are similar and simpler.

To establish Theorem 1, we also need the following abstract result. For the reader's convenience, we present a proof.

Proposition 1. Let $A$ be an operator of trace class on some Hilbert space $\mathcal{H}$ and let $\{K_n\}$ be a sequence of bounded linear operators which converges strongly to $K \in \mathcal{L}(\mathcal{H})$. Then
\[ \lim_{n \to \infty} \|K_n A - K A\|_{\mathrm{tr}} = 0. \]

Proof. First assume that $A$ is a finite rank operator having the form $A = \sum_{k=1}^m \langle \cdot, \psi_k \rangle \varphi_k$, where $\psi_k, \varphi_k \in \mathcal{H}$. Since
\[ \|A\|_{\mathrm{tr}} \le \sum_{k=1}^m \|\varphi_k\| \, \|\psi_k\|, \]
we have
\[ \|(K_n - K)A\|_{\mathrm{tr}} \le \sum_{k=1}^m \|(K_n - K)\varphi_k\| \, \|\psi_k\| \to 0, \qquad n \to \infty. \tag{2.6} \]
The general case can be covered by an approximation. Since $K_n$ converges strongly, it follows from the Banach–Steinhaus theorem that $\mu = \sup_n \|K_n\| < \infty$. Let $\eta$ be an arbitrary positive constant and let $A_\eta$ be a finite rank operator such that $\|A - A_\eta\|_{\mathrm{tr}} \le \frac{\eta}{2\mu}$. We have
\[ \|(K_n - K)A\|_{\mathrm{tr}} \le \|(K_n - K)(A - A_\eta)\|_{\mathrm{tr}} + \|(K_n - K)A_\eta\|_{\mathrm{tr}} \le \eta + \|(K_n - K)A_\eta\|_{\mathrm{tr}}. \]
Next we apply (2.6) for the finite rank operator $A_\eta$ and obtain
\[ \lim_{n \to \infty} \|(K_n - K)A\|_{\mathrm{tr}} \le \eta, \]
which implies Proposition 1, since $\eta$ is arbitrary.

Proof of Theorem 1. Assume first that $V = \varphi W$, where $\varphi \in C_0^\infty(\mathbb{R}^2;\mathbb{R})$ and $W, \partial_x W \in C^0(\mathbb{R}^2;\mathbb{R}) \cap L^\infty(\mathbb{R}^2;\mathbb{R})$. Choose a function $\chi \in C_0^\infty(\mathbb{R}^2)$ such that $\chi = 1$ for $|(x,y)| \le 1$. For $R > 0$ set
\[ \chi_R(x,y) = \chi\Bigl(\frac{x}{R}, \frac{y}{R}\Bigr), \]


and introduce
\[ B_R := [\chi_R \partial_x, H] f(H) - [\chi_R \partial_x, H_0] f(H_0). \]
Here $[A, B] = AB - BA$ denotes the commutator of $A$ and $B$. According to Lemma 3, we have
\[ \operatorname{tr}([\chi_R \partial_x, H] f(H)) = \operatorname{tr}([\chi_R \partial_x, H_0] f(H_0)) = 0. \]
Thus
\[ \operatorname{tr}(B_R) = 0. \tag{2.7} \]
On the other hand, a simple calculation shows that
\[ B_R = \chi_R \bigl([\partial_x, H] f(H) - [\partial_x, H_0] f(H_0)\bigr) + [\chi_R, H_0]\partial_x (f(H) - f(H_0)) =: B_R^1 + B_R^2, \tag{2.8} \]
where we have used that $[\chi_R, H] = [\chi_R, H_0]$. Since $[\partial_x, H] = 1 + \partial_x V$ and $[\partial_x, H_0] = 1$, it follows from Lemma 3, Lemma 4 and Proposition 1 that
\[ \lim_{R \to \infty} \operatorname{tr}(B_R^1) = \operatorname{tr}(f(H) - f(H_0)) + \operatorname{tr}(\partial_x V f(H)). \tag{2.9} \]
Next, we claim that
\[ \lim_{R \to \infty} B_R^2 = 0. \tag{2.10} \]
Using that
\[ [\chi_R, H_0] = -\frac{2}{R}(D_x \chi_R)(D_x - y) - \frac{2}{R}(D_y \chi_R) D_y + \frac{1}{R^2}(\Delta \chi_R), \]
we decompose $B_R^2$ as a sum of three terms $B_R^2 = I_R^1 + I_R^2 + I_R^3$, where
\[
\begin{aligned}
I_R^1 &= -\frac{2}{R}(D_x \chi_R)(D_x - y)\partial_x (f(H) - f(H_0)), \\
I_R^2 &= -\frac{2}{R}(D_y \chi_R) D_y \partial_x (f(H) - f(H_0)), \\
I_R^3 &= \frac{1}{R^2}(\Delta \chi_R)\partial_x (f(H) - f(H_0)).
\end{aligned}
\]
To treat $I_R^1$, we set $Q = H - x$ and write
\[
\begin{aligned}
I_R^1 = {} & -\frac{2}{R}(D_x \chi_R)(D_x - y)(Q - i)^{-1}(H - i)\partial_x (f(H) - f(H_0)) \\
& + \frac{2}{R}(D_x \chi_R)\bigl[(D_x - y)(Q - i)^{-1}, x\bigr]\partial_x (f(H) - f(H_0)) \\
& + \frac{2}{R} x (D_x \chi_R)(D_x - y)(Q - i)^{-1}\partial_x (f(H) - f(H_0)).
\end{aligned}
\]
The operators $[(D_x - y)(Q - i)^{-1}, x]$ and $(D_x - y)(Q - i)^{-1}$ are bounded, while $\partial_x (f(H) - f(H_0))$ and $(H - i)\partial_x (f(H) - f(H_0))$ are trace class operators


(see Lemma 4). On the other hand, $\frac{2}{R}(D_x \chi_R)$ and $\frac{2}{R} x (D_x \chi_R)$ converge strongly to zero. Indeed, since $\chi(x,y) = 1$ for $|(x,y)| \le 1$, we get
\[ \int \Bigl| \frac{x}{R}(D_x \chi_R) u \Bigr|^2 dx\,dy \le \sup_{(x,y) \in \mathbb{R}^2} |x D_x \chi(x,y)|^2 \int_{\{|(x,y)| \ge R\}} |u|^2\,dx\,dy \to 0, \qquad R \to \infty, \]
for all $u \in L^2(\mathbb{R}^2)$. Applying Proposition 1, we conclude that
\[ \lim_{R \to \infty} I_R^1 = 0. \tag{2.11} \]
To deal with $I_R^2$, $I_R^3$, notice that the operators $D_y (Q - i)^{-1}$ and $[D_y (Q - i)^{-1}, x]$ are bounded, and we repeat the above argument. Thus we deduce
\[ \lim_{R \to \infty} I_R^j = 0, \qquad j = 2, 3. \tag{2.12} \]
Consequently, (2.11) and (2.12) imply (2.10) and the claim is proved. Now, combining (2.7)–(2.10), we obtain Theorem 1 in the case where $V$ satisfies the assumption of Lemma 2 and $\varepsilon = 1$.

Proposition 2. Assume that $V \in L^\infty(\mathbb{R}^2;\mathbb{R})$ satisfies (1.1). Then for $z \notin \mathbb{R}$, $z' \notin \mathbb{R}$ the operators $(z - H)^{-1} V (z' - H)^{-1}$, $V (z - H)^{-1}(z' - H)^{-1}$, $(H - z)^{-1} - (H_0 - z)^{-1}$ are trace class and
\[
\begin{aligned}
\|(z - H)^{-1} V (z' - H)^{-1}\|_{\mathrm{tr}} &\le C_1 |\operatorname{Im} z|^{-1} |\operatorname{Im} z'|^{-1}, \\
\|V (z - H)^{-1}(z' - H)^{-1}\|_{\mathrm{tr}} &\le C_1 |\operatorname{Im} z|^{-1} |\operatorname{Im} z'|^{-1}.
\end{aligned}
\tag{2.13}
\]
Moreover, if $g \in C_0^\infty(\mathbb{R})$, then the operator $V g(H)$ is trace class.

Proof. Set $g_\delta(x,y) = \langle x \rangle^{-1-\frac{\delta}{2}} \langle y \rangle^{-\frac{1+\delta}{2}}$ and $f_\delta(x,y) = \langle x \rangle^{-2-\delta} \langle y \rangle^{-1-\delta}$, where $\delta$ is the constant in (1.1). According to Lemma 8 in the Appendix, $g_\delta (H_0 + i)^{-1}$ and $(H_0 + i)^{-1} g_\delta$ are Hilbert–Schmidt operators and $f_\delta (H_0 + i)^{-2}$ is a trace class one. Since $g_\delta^{-1} V g_\delta^{-1}, V f_\delta^{-1} \in L^\infty$, it follows that
\[ (H_0 + i)^{-1} V (H_0 + i)^{-1} = (H_0 + i)^{-1} g_\delta [g_\delta^{-1} V g_\delta^{-1}] g_\delta (H_0 + i)^{-1} \]
and $V (H_0 + i)^{-2}$ are trace class operators. Next we write
\[ (H + i)^{-1} - (H_0 + i)^{-1} = -(H_0 + i)^{-1} V (H_0 + i)^{-1} + (H + i)^{-1} V (H_0 + i)^{-1} V (H_0 + i)^{-1} \]
and conclude that $(H + i)^{-1} - (H_0 + i)^{-1} = -(H + i)^{-1} V (H_0 + i)^{-1}$ is trace class. Now consider the following equalities
\[
\begin{aligned}
(i + H)^{-1} V (i + H)^{-1} = {} & (i + H_0)^{-1} V (i + H_0)^{-1} + (i + H)^{-1} V (i + H_0)^{-1} V (i + H_0)^{-1} \\
& + (i + H_0)^{-1} V (i + H_0)^{-1} V (i + H)^{-1} \\
& + (i + H)^{-1} V (i + H_0)^{-1} V (i + H_0)^{-1} V (i + H)^{-1}
\end{aligned}
\]


and
\[ V (H + i)^{-2} = V (H_0 + i)^{-2} - V (H_0 + i)^{-1}(H + i)^{-1} V (H_0 + i)^{-1} - V (H + i)^{-1} V (H_0 + i)^{-1}(H + i)^{-1}. \]
By using the trace class properties established above, we get (2.13) for $z = z' = -i$. By applying the first resolvent equation
\[ (H - z)^{-1} = (H + i)^{-1} + (z + i)(H + i)^{-1}(H - z)^{-1}, \]
we obtain the general case. To examine $V g(H)$, consider the function $h(x) = (x + i)^2 g(x)$. Then $V g(H) = V (H + i)^{-2} h(H)$, and since $V (H + i)^{-2}$ is trace class, we obtain the result.

For $R > 0$ introduce
\[ H_R := H_0 + \chi_R(x,y) V(x,y), \]
where $\chi_R(x,y) = \chi(\frac{x}{R}, \frac{y}{R})$ with $\chi \in C_0^\infty(\mathbb{R}^2)$ such that $\chi = 1$ in a neighborhood of $|(x,y)| \le 1$.

Remark 1. The result of Proposition 2 concerning the trace class property of $(H - z)^{-1} - (H_0 - z)^{-1}$, $\operatorname{Im} z \neq 0$, improves considerably [5, Proposition 2], where much more regular potentials have been examined. On the other hand, if the potential $V$ satisfies (1.1) and $V, \partial_x V \in C^0(\mathbb{R}^2;\mathbb{R}) \cap L^\infty(\mathbb{R}^2;\mathbb{R})$, then the statements of Proposition 2 hold for the operators $(z - H_R)^{-1} V (z' - H)^{-1}$, $z \notin \mathbb{R}$, $z' \notin \mathbb{R}$.

The proof of Theorem 1 in the general case will be a simple consequence of the following

Lemma 5. Let $V(x,y)$ be as in Theorem 1. Then for $f \in C_0^\infty(\mathbb{R})$ we have
\[ \lim_{R \to \infty} \operatorname{tr}(f(H_R) - f(H)) = 0, \tag{2.14} \]
\[ \lim_{R \to \infty} \operatorname{tr}(\partial_x (\chi_R V) f(H_R)) = \operatorname{tr}(\partial_x V f(H)). \tag{2.15} \]

Proof. Let $g(x) = (x + i) f(x)$. We decompose
\[ f(H_R) - f(H) = \bigl((H_R + i)^{-1} - (H + i)^{-1}\bigr) g(H) + (H_R + i)^{-1}(g(H_R) - g(H)) = J_R + K_R. \]
From the first resolvent identity, we obtain
\[ J_R = (H_R + i)^{-1}(1 - \chi_R) V (H + i)^{-1} g(H) = (H_R + i)^{-1}(1 - \chi_R) V f(H). \]
According to Proposition 2, the operator $V f(H)$ is trace class, and $(H_R + i)^{-1}(1 - \chi_R)$ converges strongly to zero. Then from Proposition 1 it follows that
\[ \lim_{R \to \infty} \operatorname{tr} J_R = 0. \tag{2.16} \]
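For completeness, the formula for $J_R$ can be checked in two lines. The sketch below uses only $H - H_R = (1 - \chi_R)V$ and the functional calculus relation $(H + i)^{-1} g(H) = f(H)$ for $g(x) = (x + i) f(x)$.

```latex
\begin{align*}
(H_R + i)^{-1} - (H + i)^{-1}
  &= (H_R + i)^{-1}\bigl[(H + i) - (H_R + i)\bigr](H + i)^{-1} \\
  &= (H_R + i)^{-1}(1 - \chi_R) V (H + i)^{-1}, \\
J_R &= (H_R + i)^{-1}(1 - \chi_R) V \,(H + i)^{-1} g(H)
     = (H_R + i)^{-1}(1 - \chi_R) V f(H).
\end{align*}
% Here H - H_R = V - \chi_R V = (1 - \chi_R) V.
```

Since $\|(H_R + i)^{-1}\| \le 1$ and $(1 - \chi_R)u \to 0$ in $L^2(\mathbb{R}^2)$ for every fixed $u$, the factor in front of the trace class operator $V f(H)$ tends strongly to zero, which is exactly the situation of Proposition 1.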


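To treat $\operatorname{tr} K_R$, as in the proof of Lemma 4, we use the Helffer–Sjöstrand formula and write
\[
\begin{aligned}
\operatorname{tr} K_R &= -\frac{1}{\pi} \int \bar{\partial}\tilde{g}(z) \operatorname{tr}\bigl((H_R + i)^{-1}((z - H_R)^{-1} - (z - H)^{-1})\bigr)\,L(dz) \\
&= \frac{1}{\pi} \int \bar{\partial}\tilde{g}(z) \operatorname{tr}\bigl((H_R + i)^{-1}(z - H_R)^{-1}(1 - \chi_R) V (z - H)^{-1}\bigr)\,L(dz).
\end{aligned}
\]
By cyclicity of the trace we obtain
\[
\begin{aligned}
\operatorname{tr}\bigl((H_R + i)^{-1}(z - H_R)^{-1}(1 - \chi_R) V (z - H)^{-1}\bigr) &= \operatorname{tr}\bigl((z - H_R)^{-1}(1 - \chi_R) V (z - H)^{-1}(H_R + i)^{-1}\bigr) \\
&= \operatorname{tr}\bigl((z - H_R)^{-1}(1 - \chi_R) V (z - H)^{-1}(H + i)^{-1}\bigr) \\
&\quad + \operatorname{tr}\bigl((1 - \chi_R) V (H_R + i)^{-1}(z - H_R)^{-1}(1 - \chi_R) V (z - H)^{-1}(H + i)^{-1}\bigr).
\end{aligned}
\]
Now notice that for $z \notin \mathbb{R}$ the operators $(1 - \chi_R) V (H_R + i)^{-1}(z - H_R)^{-1}(1 - \chi_R)$ and $(z - H_R)^{-1}(1 - \chi_R)$ converge strongly to zero. On the other hand, from Proposition 2 we deduce that the operator $V (z - H)^{-1}(i + H)^{-1}$ is trace class. Thus for $z \notin \mathbb{R}$, we conclude that the integrand converges to 0 as $R \to \infty$. An application of the Lebesgue dominated convergence theorem combined with the estimates (2.13) yields
\[ \lim_{R \to \infty} \operatorname{tr} K_R = 0. \tag{2.17} \]
Putting together (2.16) and (2.17), we obtain (2.14). Next, we pass to the proof of (2.15). A simple calculation shows that
\[ \partial_x (\chi_R V) f(H_R) = \partial_x (\chi_R V)(f(H_R) - f(H)) + \frac{1}{R}(\partial_x \chi)_R V f(H) + \chi_R \partial_x V f(H). \tag{2.18} \]
Repeating the same arguments as in the proof of (2.14), we show that
\[ \lim_{R \to \infty} \operatorname{tr}\bigl(\partial_x (\chi_R V)(f(H_R) - f(H))\bigr) = 0. \tag{2.19} \]
On the other hand, since $\frac{1}{R}(\partial_x \chi)_R$ (respectively $\chi_R$) converges strongly to zero (respectively 1), it follows from Proposition 1 that
\[ \lim_{R \to \infty} \operatorname{tr}\Bigl(\frac{1}{R}(\partial_x \chi)_R V f(H)\Bigr) = 0, \qquad \lim_{R \to \infty} \operatorname{tr}(\chi_R \partial_x V f(H)) = \operatorname{tr}(\partial_x V f(H)), \]
which together with (2.18) and (2.19) yield (2.15).

End of the proof of Theorem 1. Applying Theorem 1 to $H_R$, we obtain
\[ \operatorname{tr}[f(H_R) - f(H)] + \operatorname{tr}[f(H) - f(H_0)] = \operatorname{tr}[f(H_R) - f(H_0)] = -\operatorname{tr}(\partial_x (\chi_R V) f(H_R)), \]
and an application of Lemma 5 implies Theorem 1.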
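The passage to the limit in this last step can be spelled out as a single chain. This is a sketch for $\varepsilon = 1$, as in the rest of the section; the first part of the proof applies to $H_R$ because $\chi_R V$ is of the form $\varphi W$ with $\varphi = \chi_R$ and $W = V$.

```latex
\begin{align*}
\operatorname{tr}\bigl[f(H) - f(H_0)\bigr]
  &= \operatorname{tr}\bigl[f(H_R) - f(H_0)\bigr]
   - \operatorname{tr}\bigl[f(H_R) - f(H)\bigr] \\
  &= -\operatorname{tr}\bigl(\partial_x(\chi_R V)\, f(H_R)\bigr)
   - \operatorname{tr}\bigl[f(H_R) - f(H)\bigr] \\
  &\xrightarrow[R \to \infty]{}
   -\operatorname{tr}\bigl(\partial_x V\, f(H)\bigr) - 0 ,
\end{align*}
% by (2.15) for the first term and (2.14) for the second.
```

The left-hand side does not depend on $R$, so it equals the limit, which is (1.2) with $\varepsilon = 1$.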


3. Mourre Estimate and Embedded Eigenvalues

Consider the operator $Q = (D_x - By)^2 + D_y^2 + V(x,y)$, and set $\langle x \rangle = (1 + |x|^2)^{1/2}$, $\langle D_x \rangle = (1 + D_x^2)^{1/2}$.

Lemma 6. Assume that $V, \partial_x V \in C^0(\mathbb{R}^2;\mathbb{R}) \cap L^\infty(\mathbb{R}^2;\mathbb{R})$ and let
\[ \bigl\|\mathbf{1}_{\{|x|+|y|>R\}}(x,y)\,\partial_x V\bigr\|_{L^\infty} \to 0 \quad \text{for } R \to +\infty. \]
Then for all $f \in C_0^\infty(\mathbb{R})$, the operator $f(H)\partial_x V f(H)$ is compact.

Proof. Let $\varphi(x,y) \in C_0^\infty(\mathbb{R}^2)$ be equal to one near zero. Set $\varphi_n(x,y) = \varphi(\frac{x}{n}, \frac{y}{n})$. According to Lemma 3, the operator $f(H)\varphi_n \partial_x V f(H)$ is trace class. The set of compact operators is closed with respect to the norm $\|\cdot\|_{\mathcal{L}(L^2)}$, and the lemma follows from the obvious estimate
\[ \|f(H)(1 - \varphi_n)\partial_x V f(H)\|_{\mathcal{L}(L^2)} \le \|f(H)\|^2_{\mathcal{L}(L^2)} \, \|(1 - \varphi_n)\partial_x V\|_\infty. \]

Theorem 3. Let $[a, b] \subset \mathbb{R}$. Under the assumptions of Lemma 6, there exists a compact operator $K$ such that
\[ \mathbf{1}_{[a,b]}(H)[\partial_x, H]\mathbf{1}_{[a,b]}(H) \ge \varepsilon \mathbf{1}_{[a,b]}(H) + \mathbf{1}_{[a,b]}(H) K \mathbf{1}_{[a,b]}(H). \tag{3.1} \]

Proof. Since the operator $\partial_x$ commutes with $(D_x - By)$ and $D_y^2$, we have $[\partial_x, H] = \varepsilon + \partial_x V$. Consequently,
\[
\begin{aligned}
\mathbf{1}_{[a,b]}(H)[\partial_x, H]\mathbf{1}_{[a,b]}(H) &= \varepsilon \mathbf{1}_{[a,b]}(H) + \mathbf{1}_{[a,b]}(H)\partial_x V \mathbf{1}_{[a,b]}(H) \\
&= \varepsilon \mathbf{1}_{[a,b]}(H) + \mathbf{1}_{[a,b]}(H) f(H)\partial_x V f(H)\mathbf{1}_{[a,b]}(H),
\end{aligned}
\tag{3.2}
\]
where $f \in C_0^\infty(\mathbb{R})$ is a cut-off function such that $f = 1$ on $[a, b]$. Thus, Theorem 3 follows from Lemma 6.

The use of commutators with the operator $\partial_x$ is well known for the analysis of the operator without magnetic field ($B = 0$) (see the pioneering work [2] and [1] for a more complete list of references). On the other hand, to treat crossed magnetic and electric fields we need Lemma 1 and Lemma 3.

Corollary 1. In addition to the assumptions of Theorem 3, assume that $\partial_x^2 V \in C^0(\mathbb{R}^2) \cap L^\infty(\mathbb{R}^2)$. Then the point spectrum of $H$ in $[a, b]$ is finite and with finite multiplicity. Moreover, the singular continuous spectrum of $H$ is empty.

Proof. Set $A = D_x$ and let $\alpha \in \mathbb{R}$. The explicit formula
\[ e^{i\alpha A}(H + i)^{-1} = (e^{i\alpha A} H e^{-i\alpha A} + i)^{-1} e^{i\alpha A} = (H + \varepsilon\alpha + V(x + \alpha, y) - V(x,y) + i)^{-1} e^{i\alpha A} \]


shows that $e^{i\alpha A}$ leaves $D(H)$ invariant. On the other hand, since
\[ \|H e^{i\alpha A}(H + i)^{-1}\psi\| = \|e^{-i\alpha A} H e^{i\alpha A}(H + i)^{-1}\psi\| = \|(H - \varepsilon\alpha + V(x - \alpha, y) - V(x,y))(H + i)^{-1}\psi\|, \]
we deduce that for each $\varphi \in D(H)$
\[ \sup_{|\alpha| < 1} \|H e^{i\alpha A}\varphi\| < \infty. \]
Combining this with the fact that $i[A, H] = \varepsilon + \partial_x V$, $[A, [A, H]] = -\partial_x^2 V$ and using (3.1), we conclude that the self-adjoint operator $A$ is a conjugate operator for $H$ at every $E \in \mathbb{R}$ in the sense of [14]. Consequently, Corollary 1 follows from the main result in [14] (see also [1, 6]).

Remark 2. For any sign-definite and bounded potential $V(x,y)$ such that $|V(x,y)| \to 0$ as $|x| + |y| \to \infty$ sufficiently fast, it was established in [13, 15] that for $\varepsilon = 0$ the potential $V$ creates an infinite number of eigenvalues of $Q$ which accumulate to Landau levels. The above corollary shows that only a finite number of these eigenvalues may survive in the presence of a non-vanishing constant electric field. In general, the problem of absence of embedded eigenvalues when $\varepsilon \neq 0$ remains open, and this is an interesting conjecture.

For a fixed value of $\varepsilon \neq 0$, the following result shows that there are potentials for which $H$ has absolutely continuous spectrum without embedded eigenvalues.

Corollary 2. Fix $\varepsilon > 0$. Assume that $\partial_x^\alpha V \in C^0(\mathbb{R}^2;\mathbb{R}) \cap L^\infty(\mathbb{R}^2;\mathbb{R})$, $\alpha = 0, 1, 2$, and
\[ \varepsilon + \partial_x V(x,y) > c > 0, \tag{3.3} \]
uniformly on $(x,y) \in \mathbb{R}^2$. Then $H$ has no eigenvalues. Moreover, for $s > 1/2$, the following estimate holds uniformly on $\lambda$ in a compact interval:
\[ \|\langle D_x \rangle^{-s}(H - \lambda \pm i0)^{-1}\langle D_x \rangle^{-s}\| = O(1). \tag{3.4} \]

Proof. Let $[a, b]$ be a compact interval in $\mathbb{R}$. From (3.1) and (3.3), we have
\[ \mathbf{1}_{[a,b]}(H)[\partial_x, H]\mathbf{1}_{[a,b]}(H) \ge c\,\mathbf{1}_{[a,b]}(H). \tag{3.5} \]
According to the proof of Corollary 1, $A = D_x$ is a conjugate operator in the sense of [14]. Combining this with (3.5), we deduce from [14] that $H$ has no eigenvalue in $\mathbb{R}$. Applying once more the Mourre theorem (see [1, 6, 14]), we obtain the estimate (3.4).
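To see that the hypotheses of Corollary 2 are not void, one can check a concrete potential by hand. The choice below is a hypothetical illustration, not taken from the paper; it does not decay in $x$ and therefore does not satisfy (1.1), which Corollary 2 does not require.

```latex
% Hypothetical example potential (an assumption made for illustration only):
\begin{align*}
V(x,y) &= -\frac{\varepsilon}{2}\,\arctan(x)\,e^{-y^2},
  & |V| &\le \frac{\pi\varepsilon}{4}, \\
\partial_x V(x,y) &= -\frac{\varepsilon}{2}\,\frac{e^{-y^2}}{1+x^2},
  & -\frac{\varepsilon}{2} &\le \partial_x V < 0, \\
\partial_x^2 V(x,y) &= \varepsilon\,\frac{x\,e^{-y^2}}{(1+x^2)^2},
  & |\partial_x^2 V| &\le \varepsilon .
\end{align*}
% Hence \varepsilon + \partial_x V \ge \varepsilon/2 > c for any 0 < c < \varepsilon/2,
% so (3.3) holds, and \partial_x^\alpha V is continuous and bounded for \alpha = 0, 1, 2.
```

More generally, any real potential with $\partial_x^\alpha V$ continuous and bounded for $\alpha = 0, 1, 2$ and $\sup |\partial_x V| < \varepsilon$ satisfies (3.3).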


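4. Limiting Absorption Principle

In this section, we treat the case when $\varepsilon$ is small enough. Notice that when $\varepsilon$ tends to zero, in general the assumption $\varepsilon + \partial_x V > c > 0$ is not satisfied and we cannot apply Corollary 2. Our goal is to study the behavior of the resolvent $(H - \lambda \pm i\delta)^{-1}$ as $\delta \to 0$ for $\lambda \notin \sigma(Q)$. For such $\lambda$ we could have eigenvalues of $H$, and a direct application of the Mourre argument is not possible. We will obtain the result assuming that $\varepsilon$ is small, and for this purpose we need the following

Lemma 7. Assume that $V \in L^\infty(\mathbb{R}^2;\mathbb{R})$ and let $\lambda \notin \sigma(Q)$. Let $\chi \in C_0^\infty(\mathbb{R};\mathbb{R})$ be equal to 1 near $\lambda$ and let $\operatorname{supp} \chi \cap \sigma(Q) = \emptyset$. Then
\[ \|\chi(H)\langle x \rangle^{-2}\| \le C\varepsilon^2. \tag{4.1} \]

Proof. Since $\operatorname{supp} \chi \cap \sigma(Q) = \emptyset$, the operators $(z - Q)^{-1}$ and $(z - Q)^{-1} x (z - Q)^{-1}$ are analytic operator valued functions for $z$ in a complex neighborhood of $\operatorname{supp} \chi$. Let $\tilde{\chi}(z) \in C_0^\infty(\mathbb{C})$ be an almost analytic continuation of $\chi(x)$ such that $\bar{\partial}\tilde{\chi}(z) = O(|\operatorname{Im} z|^\infty)$ and $\operatorname{supp} \tilde{\chi}(z) \cap \sigma(Q) = \emptyset$. We have the representation
\[ \chi(H) = -\frac{1}{\pi} \int \bar{\partial}\tilde{\chi}(z)(z - H)^{-1}\,L(dz), \]
where $L(dz)$ is the Lebesgue measure in $\mathbb{C}$. By using the resolvent identity, we get
\[ (z - H)^{-1} = (z - Q)^{-1} + \varepsilon (z - Q)^{-1} x (z - Q)^{-1} + \varepsilon^2 (z - H)^{-1} x (z - Q)^{-1} x (z - Q)^{-1}, \]
and we obtain
\[ \chi(H) = \chi(Q) - \frac{\varepsilon}{\pi} \int \bar{\partial}\tilde{\chi}(z)(z - Q)^{-1} x (z - Q)^{-1}\,L(dz) - \frac{\varepsilon^2}{\pi} \int \bar{\partial}\tilde{\chi}(z)(z - H)^{-1} x (z - Q)^{-1} x (z - Q)^{-1}\,L(dz). \]
Since $\operatorname{supp} \tilde{\chi}(z) \cap \sigma(Q) = \emptyset$, the first two terms on the right-hand side vanish. Consequently,
\[ \chi(H) = -\frac{\varepsilon^2}{\pi} \int \bar{\partial}\tilde{\chi}(z)(z - H)^{-1} x (z - Q)^{-1} x (z - Q)^{-1}\,L(dz). \tag{4.2} \]
Next, we observe that
\[ x (z - Q)^{-1} = (z - Q)^{-1} x + (z - Q)^{-1}[x, Q](z - Q)^{-1} = (z - Q)^{-1} x + L_1. \]
We have $[x, Q] = 2(D_x - By)$. Thus it is easy to see that for $z \notin \sigma(Q)$, $L_1 = (z - Q)^{-1}[x, Q](z - Q)^{-1}$ is a bounded operator, since $(D_x - By)(i - Q)^{-1}$ is bounded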


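and $(z - Q)^{-1} = (i - Q)^{-1} + (i - Q)^{-1}(i - z)(z - Q)^{-1}$. We write
\[ x (z - Q)^{-1} x (z - Q)^{-1} = (z - Q)^{-1} x (z - Q)^{-1} x + (z - Q)^{-1} x L_1 + L_1 (z - Q)^{-1} x + L_1^2 = \sum_{j=1}^{4} I_j. \]
The operators $I_4 = L_1^2$ and $I_3 \langle x \rangle^{-2} = L_1 (z - Q)^{-1} x \langle x \rangle^{-2}$ are bounded. To see that $I_1 \langle x \rangle^{-2}$ is bounded, note that
\[ I_1 \langle x \rangle^{-2} = (z - Q)^{-2} x^2 \langle x \rangle^{-2} + (z - Q)^{-1} L_1 x \langle x \rangle^{-2}. \]
Finally,
\[ I_2 \langle x \rangle^{-2} = (z - Q)^{-2} x [x, Q](z - Q)^{-1}\langle x \rangle^{-2} + (z - Q)^{-1} L_1 [x, Q](z - Q)^{-1}\langle x \rangle^{-2}, \]
and since the second term on the right-hand side is bounded, it remains to examine the operator
\[ x [x, Q](z - Q)^{-1}\langle x \rangle^{-2} = [x, Q] x (z - Q)^{-1}\langle x \rangle^{-2} + 2(z - Q)^{-1}\langle x \rangle^{-2}. \]
Applying the above argument, we see that the last operator is bounded. Consequently, the operator under integration in (4.2) is bounded by $O(|\operatorname{Im} z|^{-1})$ and this proves the statement.

Proposition 3. Assume that $\partial_x^\alpha V \in C^0(\mathbb{R}^2;\mathbb{R}) \cap L^\infty(\mathbb{R}^2;\mathbb{R})$ for $\alpha = 0, 1, 2$ and let $\langle x \rangle^2 \partial_x V \in L^\infty(\mathbb{R}^2)$. Let $[a, b]$ be a compact interval such that $[a, b] \cap \sigma(Q) = \emptyset$. Then for $s > 1/2$ and sufficiently small $\varepsilon_0 > 0$ we have the following estimate uniformly with respect to $\lambda \in [a, b]$ and $\varepsilon \in\, ]0, \varepsilon_0]$:
\[ \|\langle D_x \rangle^{-s}(H - \lambda \pm i0)^{-1}\langle D_x \rangle^{-s}\| \le C\varepsilon^{-1}. \tag{4.3} \]
Moreover, $H$ has no embedded eigenvalues and no singular continuous spectrum in $[a, b]$.

Proof. Let $[a - \delta, b + \delta] \cap \sigma(Q) = \emptyset$ for $0 < \delta \ll 1$. Choose a function $\chi(t) \in C_0^\infty(\mathbb{R};\mathbb{R})$ such that $\operatorname{supp} \chi \subset [a - \delta, b + \delta]$ and $\chi(t) = 1$ for $a_1 = a - \delta/2 \le t \le b + \delta/2 = b_1$. Then
\[
\begin{aligned}
\mathbf{1}_{[a_1,b_1]}(H)[\partial_x, H]\mathbf{1}_{[a_1,b_1]}(H) &= \varepsilon \mathbf{1}_{[a_1,b_1]}(H) + \mathbf{1}_{[a_1,b_1]}(H)\partial_x V \mathbf{1}_{[a_1,b_1]}(H) \\
&= \varepsilon \mathbf{1}_{[a_1,b_1]}(H) + \mathbf{1}_{[a_1,b_1]}(H)(\chi(H)\langle x \rangle^{-2})(\langle x \rangle^2 \partial_x V)\mathbf{1}_{[a_1,b_1]}(H).
\end{aligned}
\]
Our assumption implies that the multiplication operator $\langle x \rangle^2 \partial_x V$ is in $L^\infty$, while Lemma 7 says that $\|\chi(H)\langle x \rangle^{-2}\| \le C\varepsilon^2$. Thus
\[ \mathbf{1}_{[a_1,b_1]}(H)(\chi(H)\langle x \rangle^{-2})(\langle x \rangle^2 \partial_x V)\mathbf{1}_{[a_1,b_1]}(H) \le C_1 \varepsilon^2\, \mathbf{1}_{[a_1,b_1]}(H) \]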


and with a constant $c_0 > 0$ we deduce

$$I_{[a_1,b_1]}(H)[\partial_x,H]I_{[a_1,b_1]}(H) \ge c_0\,\epsilon\, I_{[a_1,b_1]}(H).$$

Then it is well known (see, for instance, [1, 6, 14]) that for $\lambda \in [a,b]$ we get (4.3) and $H$ has no eigenvalues and no singular continuous spectrum in $[a,b]$.

Remark 3. As we mentioned in Remark 2, for sign-definite rapidly decreasing potentials the spectrum of the operator $Q$ is formed by an infinite number of eigenvalues having as points of accumulation the Landau levels $\mu_n = (2n+1)B$, $n \in \mathbb{N}$. For such potentials Proposition 3 shows that the embedded eigenvalues of $H$ can appear only in small neighborhoods of the eigenvalues of $Q$. Since in every interval we may have only a finite number of eigenvalues of $H$, it is clear that for some eigenvalues $\nu$ of $Q$ there are no eigenvalues of $H$ in their neighborhoods. Moreover, it was proved in [12] that for potentials $V \in C_0^\infty(\mathbb{R}^2)$ we have

$$\sigma(Q) \cap\, ]\mu_n - B, \mu_n + B[\ \subset (\mu_n - Cn^{-1/2}, \mu_n + Cn^{-1/2}), \quad n \ge N,$$

with $C > 0$ and $N$ depending only on $\sup|V|$ and the diameter of the support of $V$. Thus for $M$ large the embedded eigenvalues $\lambda \ge M$ of $H$ are sufficiently close to Landau levels $\Lambda_n$.

5. Estimates for the Derivative of the Spectral Shift Function

First we notice that the assumption (1.4) makes it possible to define the spectral shift function $\xi(\lambda,\epsilon)$ related to the operators $H_0(\epsilon) = H_0(B,\epsilon)$ and $H(\epsilon) = H_0(B,\epsilon) + V(x,y)$ by the equality

$$\langle \xi', f\rangle = \operatorname{tr}\big(f(H(\epsilon)) - f(H_0(\epsilon))\big), \quad f \in C_0^\infty(\mathbb{R}).$$

Here and below we omit the dependence on $B$ in the notations. Our purpose in this section is to establish Theorem 2. For the proof we need the following

Proposition 4. Under the assumptions of Theorem 2, for $\lambda_0 \notin \sigma(Q)$ and $1/2 < s < \min(1/2 + \delta/4, 1)$ the operator $\langle D_x\rangle^{s}\partial_xV[(Q-z)^{-1}x]^n\langle D_x\rangle^{s}$ is trace class for $z$ in a small complex neighborhood $\Xi \subset \mathbb{C}$ of $\lambda_0$.

Proof. Before starting the proof, notice that it is easy to establish the statement for $z \ll 0$, since in this case the operator $(Q-z)^{-1}$ is a pseudodifferential one and we can apply the calculus of pseudodifferential operators and the criterion which guarantees that a pseudodifferential operator is trace class (see, for instance, [3, Theorem 9.4]). For $z \in \mathbb{R}^+ \setminus \sigma(Q)$ this is not the case: $(Q-z)^{-1}$ is a bounded operator but not a pseudodifferential one. We may replace $(Q-z)^{-1}$ by the pseudodifferential operator $(Q-i)^{-1}$ modulo bounded operators, but then it is difficult to examine the product involving many bounded operators and factors $x^k$. To overcome this difficulty, we are going to apply a convenient decomposition by product of operators


having in mind that the leftmost operator in such a product must be trace class. First, we treat the case $n = 2$; the general case will be covered by a recurrence. We start with the analysis of the operator

$$\langle D_x\rangle^{2s}\partial_xV[(Q-z)^{-1}x]^2. \tag{5.1}$$

Our goal is to show that (5.1) is a trace class operator. Write

$$\begin{aligned}
\langle D_x\rangle^{2s}\partial_xV\langle x\rangle^{2}\langle x\rangle^{-2}(Q-z)^{-1}x(Q-z)^{-1}x
&= \langle D_x\rangle^{2s}(\partial_xV)\langle x\rangle^{2}(Q-z)^{-1}\langle x\rangle^{-2}x(Q-z)^{-1}x \\
&\quad + \langle D_x\rangle^{2s}\partial_xV\langle x\rangle^{2}(Q-z)^{-1}[Q,\langle x\rangle^{-2}](Q-z)^{-1}x(Q-z)^{-1}x \\
&= \langle D_x\rangle^{2s}\partial_xV\langle x\rangle^{2}(Q-z)^{-2}\big[\langle x\rangle^{-2}x^2 + [Q,\langle x\rangle^{-2}x](Q-z)^{-1}x\big] \\
&\quad + \langle D_x\rangle^{2s}\partial_xV\langle x\rangle^{2}(Q-z)^{-1}[Q,\langle x\rangle^{-2}](Q-z)^{-1}x(Q-z)^{-1}x \\
&= T_1 + T_2.
\end{aligned}$$

To deal with $T_1$, we use the representation $T_1 = \langle D_x\rangle^{2s}\partial_xV\langle x\rangle^{2}(Q-z)^{-2}W_1$ and we will show that the operator

$$W_1 = \langle x\rangle^{-2}x^2 + [Q,\langle x\rangle^{-2}x](Q-z)^{-1}x = \langle x\rangle^{-2}x^2 - i\left[(D_x - By)\frac{1-x^2}{(1+x^2)^2} + \frac{1-x^2}{(1+x^2)^2}(D_x - By)\right](Q-z)^{-1}x$$

is bounded. Consider the operator

$$(D_x - By)\frac{1-x^2}{(1+x^2)^2}(Q-z)^{-1}x = (D_x - By)\frac{(1-x^2)x}{(1+x^2)^2}(Q-i)^{-1}\big[1 + (z-i)(Q-z)^{-1}\big] + (D_x - By)\frac{1-x^2}{(1+x^2)^2}(Q-z)^{-1}[Q,x](Q-z)^{-1}.$$

The pseudodifferential operator

$$(D_x - By)\frac{(1-x^2)x}{(1+x^2)^2}(Q-i)^{-1}$$

is bounded and the product of this operator with $[1 + (z-i)(Q-z)^{-1}]$ is bounded, too. As in the proof of Lemma 7, we see that $[Q,x](Q-z)^{-1}$ is bounded and with the same argument we treat the other terms. Thus we conclude that $W_1$ is a bounded


operator. Next we write $T_2 = \langle D_x\rangle^{2s}\partial_xV\langle x\rangle^{2}(Q-z)^{-2}W_2$, where

$$W_2 = [Q,\langle x\rangle^{-2}]x(Q-z)^{-1}x + [Q,[Q,\langle x\rangle^{-2}]](Q-z)^{-1}x(Q-z)^{-1}x = W_{21} + W_{22}.$$

We have

$$W_{21} = 2i\left[(D_x - By)\frac{x^2}{(1+x^2)^2}(Q-z)^{-1}x + \frac{x}{(1+x^2)^2}(D_x - By)x(Q-z)^{-1}x\right]$$

and as above we deduce that $W_{21}$ is a bounded operator. For the analysis of $W_{22}$, we write

$$W_{22} = \left\{\frac{1-3x^2}{(1+x^2)^3}\,4(D_x - By)^2 + R_1(x)(D_x - By) + R_2(x) + (4\partial_xV + 8BD_y)\frac{x}{(1+x^2)^2}\right\}(Q-z)^{-1}x(Q-z)^{-1}x.$$

A simple calculation gives

$$\begin{aligned}
(Q-z)^{-1}x(Q-z)^{-1}x &= (Q-z)^{-1}x^2(Q-z)^{-1} + (Q-z)^{-1}xM_1 \\
&= x^2(Q-z)^{-2} + 4(Q-z)^{-1}x(D_x - By)(Q-z)^{-2} + x(Q-z)^{-1}M_1 + (Q-z)^{-1}M_2 \\
&= x^2(Q-z)^{-2} + 4x(Q-z)^{-1}M_3 + (Q-z)^{-1}M_4 \\
&= x^2(Q-i)^{-2}M_5 + 4x(Q-i)^{-1}M_6 + (Q-i)^{-1}M_7,
\end{aligned}$$

where $M_k$, $k = 1, 2, \ldots$, denote bounded operators. The pseudodifferential calculus implies that the product of the term in the brackets $\{\cdots\}$ with $x^j(Q-i)^{-j}$, $j = 1, 2$, is a bounded operator. Combining this with the above equality, we conclude that $W_{22}$ is bounded.

Now it remains to see that the operator $T = \langle D_x\rangle^{2s}\partial_xV\langle x\rangle^{2}(Q-z)^{-2}$ is trace class. For this purpose we replace $(Q-z)^{-2}$ by $(Q-i)^{-2}[I + (z-i)(Q-z)^{-1}]^2$


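and consider the pseudodifferential operator

$$\langle D_x\rangle^{2s}\partial_xV\langle x\rangle^{2}(Q-i)^{-2} \tag{5.2}$$

with principal symbol

$$g_s(x,y,\xi,\eta) = \frac{\langle\xi\rangle^{2s}(\partial_xV)(x,y)(1+x^2)}{\big((\xi - By)^2 + \eta^2 + V(x,y) - i\big)^2}.$$

We use the estimate $\langle\xi\rangle^{2s} \le C\langle\xi - By\rangle^{2s}\langle y\rangle^{2s}$ and we apply Theorem 9.4 in [3] to deduce that (5.2) is a trace class operator. In fact we have

$$\sum_{|\alpha|\le 5}\big\|\partial^{\alpha}_{x,y,\xi,\eta}g_s\big\|_{L^1(\mathbb{R}^4)} < \infty$$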

since 2s < 2 guarantees that the integral with respect to ξ is convergent, while 2s < 1 + δ/2 and the estimate (1.4) imply that integral with respect to y is convergent. Consequently, T is a trace class operator and this completes the analysis of (5.1). Notice also that the same argument implies that the operator Dx s ∂x V [(Q − z)−1 x]2 is trace class. To prove that the operator Dx s ∂x V [(Q − z)−1 x]2 Dx s is trace class, we commute the operator Dx s with (Q − z)−1 x and ∂x V in order to reduce the proof to that of (5.1). The commutators [x, Dx s ] and [V, Dx s ]x are bounded since s < 1. Next [(Q − z)−1 , Dx s ]x = (Q − z)−1 [V, Dx s ](Q − z)−1 x = (Q − z)−1 [V, Dx s ](x(Q − z)−1 + (Q − z)−1 M1 ) = (Q − z)−1 M2 and we obtain operators which can be handled by the above argument. Thus the assertion is proved for n = 2. Passing to the general case n > 2, assume that the assertion holds for n = 2, . . . , k − 1, and suppose that V satisfy the estimate (1.4) with n = k. The idea is to replace the operator Dx s ∂x V [(Q − z)−1 x]k Dx s by the trace class operator Dx s (∂x V )xk (Q − z)−2 Dx s plus a sum of several operators which are trace class according to the recurrence assumption. Notice that if Mj is bounded operator obtained as a product of (Dx − By) and (Q − z)−j , j ≥ 1, the operator Dx −s Mj Dx s becomes a bounded operators and this makes possible to exploit the representation Dx s ∂x V (Q − z)−1 x · · · Mj Dx s = [Dx s ∂x V (Q − z)−1 x · · · Dx s ] (Dx −s Mj Dx s ). Thus we reduce the analysis to the trace class property of Dx s ∂x V (Q − z)−1 x · · · Dx s . For simplicity of the notations we will write A ∼t B if the difference A − B is a trace class operator.
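The commutation steps above rest on two purely algebraic resolvent identities: the first resolvent identity $(z-Q)^{-1} = (i-Q)^{-1} + (i-Q)^{-1}(i-z)(z-Q)^{-1}$ and the commutator formula $[(Q-z)^{-1}, A] = (Q-z)^{-1}[A,Q](Q-z)^{-1}$ (sign conventions for the commutator aside). The following minimal sketch checks both on $2\times 2$ matrices. The matrices `Q`, `A` and the point `Z` are hypothetical finite-dimensional stand-ins chosen only for illustration; they are not the operators of the paper, and the check says nothing about boundedness or trace class properties.

```python
# Finite-dimensional sanity check of two resolvent identities (illustrative only).

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def inv2(m):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def max_abs(m):
    return max(abs(x) for row in m for x in row)

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
Q = [[1.0, 0.5], [0.5, -2.0]]   # hypothetical symmetric stand-in for Q
A = [[0.3, 1.0], [-0.7, 2.0]]   # hypothetical stand-in for the operator commuted through
Z = 0.4 + 0.9j                  # spectral parameter away from the spectrum of Q

def resolvent_commutator_gap(q=Q, a=A, z=Z):
    """Max entry of [(Q-z)^{-1}, A] - (Q-z)^{-1} [A, Q] (Q-z)^{-1}."""
    r = inv2(sub(q, scale(z, IDENTITY)))
    lhs = sub(matmul(r, a), matmul(a, r))
    comm = sub(matmul(a, q), matmul(q, a))
    rhs = matmul(matmul(r, comm), r)
    return max_abs(sub(lhs, rhs))

def first_resolvent_gap(q=Q, z=Z):
    """Max entry of (z-Q)^{-1} - [(i-Q)^{-1} + (i-Q)^{-1}(i-z)(z-Q)^{-1}]."""
    rz = inv2(sub(scale(z, IDENTITY), q))
    ri = inv2(sub(scale(1j, IDENTITY), q))
    rhs = add(ri, scale(1j - z, matmul(ri, rz)))
    return max_abs(sub(rz, rhs))

if __name__ == "__main__":
    print(resolvent_commutator_gap(), first_resolvent_gap())
```

Both gaps should vanish up to rounding error; in the infinite-dimensional setting the same algebra holds on suitable domains, which is the part the proof has to justify.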


We start with the observation that Dx s ∂x V [(Q − z)−1 x]k Dx s ∼t Dx s ∂x V [(Q − z)−1 x]k−2 (Q − z)−1 x2 (Q − z)−1 Dx s . We can establish this by a recurrence. For k − 1 we apply the equality Dx s ∂x V [(Q − z)−1 x]k−1 Dx s = Dx s ∂x V [(Q − z)−1 x]k−3 (Q − z)−1 x2 (Q − z)−1 Dx s × Dx s ∂x V [(Q − z)−1 x]k−2 (Q − z)−1 [Q, x](Q − z)−1 Dx s ∼t Dx s ∂x V [(Q − z)−1 x]k−3 (Q − z)−1 x2 (Q − z)−1 Dx s . Commuting (Q − z)−1 and x2 , we obtain the result for k − 1 and in the same way we continue for p ≤ k − 1. Next we commute (Q − z)−1 and x2 and get Dx s ∂x V [(Q − z)−1 x]k−2 (Q − z)−1 x2 (Q − z)−1 Dx s ∼t Dx s ∂x V [(Q − z)−1 x]k−3 (Q − z)−1 x3 (Q − z)−2 Dx s . Indeed, [Q, x2 ] = 4(Dx − By)x = −4ix(Dx − By) − 2 yields (Q − z)−1 x2 (Q − z)−1 = x2 (Q − z)−2 − 4i(Q − z)−1 x(Dx − By) (Q − z)−1 − 2(Q − z)−2 and for the term Dx s ∂x V [(Q − z)−1 x]k−1 (Dx − By)(Q − z)−1 Dx s we use the recurrence assumption and the fact that M2 = (Dx − By)(Q − z)−1 is a bounded operator. In the same way for 1 ≤ j ≤ k − 1 we show that Dx s ∂x V [(Q − z)−1 x]k−j (Q − z)−1 xj (Q − z)−2 Dx s ∼t Dx s ∂x V [(Q − z)−1 x]k−j−1 (Q − z)−1 xj+1 (Q − z)−2 Dx s , taking into account the equality [Q, xj ] = 2j(Dx − By)xj−1 = 2jxj−1 (Dx − By) − 2ij(j − 1)xj−1 and the recurrence assumption. Finally, we prove that Dx s ∂x V [(Q − z)−1 x]k Dx s ∼t Dx s (∂x V )xk (Q − z)−2 Dx s and, as in the proof in the case n = 2, we conclude that the operator on the righthand side is trace class one. After this preparation we pass to the proof of Theorem 2. Proof of Theorem 2. Let Ξ ⊂ R be a small neighborhood of λ0 such that Ξ ∩ σ(Q) = ∅. For the simplicity of the notations we will write H(), ξ(λ, ) instead of H(B, ), ξ(λ; B, ). Given f ∈ C0∞ (Ξ), introduce an almost analytic continuation f˜ ∈ C0∞ (C) of f so that ∂¯f˜(z) = O(|Im z|∞ ) and supp f˜(z) ∩ σ(Q) = ∅. Since


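$(z-Q)^{-1}$ is analytic over the support of $\tilde f(z)$, applying the resolvent equality, we get

$$\partial_xV f(H(\epsilon)) = -\frac{1}{\pi}\int \bar\partial\tilde f(z)\,\partial_xV(z-H(\epsilon))^{-1}L(dz) = (-1)^{n+1}\frac{\epsilon^n}{\pi}\int \bar\partial\tilde f(z)\,\partial_xV[(z-Q)^{-1}x]^n(z-H(\epsilon))^{-1}L(dz). \tag{5.3}$$

Taking into account Proposition 4 and the cyclicity of the trace, we get

$$\operatorname{tr}\int \bar\partial\tilde f(z)\langle D_x\rangle^{-s}\big[\langle D_x\rangle^{s}\partial_xV[(z-Q)^{-1}x]^n\langle D_x\rangle^{s}\big]\langle D_x\rangle^{-s}(z-H(\epsilon))^{-1}L(dz) = \operatorname{tr}\int \bar\partial\tilde f(z)\big[\langle D_x\rangle^{s}\partial_xV[(z-Q)^{-1}x]^n\langle D_x\rangle^{s}\big]\langle D_x\rangle^{-s}(z-H(\epsilon))^{-1}\langle D_x\rangle^{-s}L(dz).$$

Set $W(z) = \langle D_x\rangle^{s}\partial_xV[(z-Q)^{-1}x]^n\langle D_x\rangle^{s}$ and note that for $z \in \operatorname{supp}\tilde f$ this operator is trace class and $W(z)$ is analytic. We write

$$\begin{aligned}
-\frac{1}{\pi}\int \bar\partial\tilde f(z)\operatorname{tr}\big(\partial_xV[(z-Q)^{-1}x]^n(z-H(\epsilon))^{-1}\big)L(dz)
= \frac{1}{\pi}\lim_{\eta\searrow 0}\bigg[&\int_{\operatorname{Im}z>0}\bar\partial\tilde f(z+i\eta)\operatorname{tr}\big(W(z+i\eta)\langle D_x\rangle^{-s}(H(\epsilon)-(z+i\eta))^{-1}\langle D_x\rangle^{-s}\big)L(dz) \\
&+ \int_{\operatorname{Im}z<0}\bar\partial\tilde f(z-i\eta)\operatorname{tr}\big(W(z-i\eta)\langle D_x\rangle^{-s}(H(\epsilon)-(z-i\eta))^{-1}\langle D_x\rangle^{-s}\big)L(dz)\bigg].
\end{aligned}$$

Notice that the functions $\operatorname{tr}\big(W(z\pm i\eta)\langle D_x\rangle^{-s}(H(\epsilon)-(z\pm i\eta))^{-1}\langle D_x\rangle^{-s}\big)$ are analytic in $\pm\operatorname{Im}z > 0$. Applying the Green formula, as in [4, Lemma 1], we deduce

$$\begin{aligned}
\langle \xi'(\lambda,\epsilon), f\rangle &= \operatorname{tr}\big(f(H(\epsilon)) - f(H_0(\epsilon))\big) = -\frac{1}{\epsilon}\operatorname{tr}\big(\partial_xV f(H(\epsilon))\big) \\
&= \lim_{\eta\searrow 0}\frac{(-1)^n\epsilon^{n-1}}{2\pi i}\int f(\lambda)\operatorname{tr}\Big(W(\lambda)\big[\langle D_x\rangle^{-s}\big((H(\epsilon)-(\lambda+i\eta))^{-1} - (H(\epsilon)-(\lambda-i\eta))^{-1}\big)\langle D_x\rangle^{-s}\big]\Big)d\lambda,
\end{aligned}$$

where the integral is taken in the sense of distributions. On the other hand, Proposition 4 combined with (4.3) shows that the right-hand side of the above representation is finite and has order $O(\epsilon^{n-2})$. Thus for every $f \in C_0^\infty(\Xi)$ we obtain

$$\langle \xi'(\lambda,\epsilon), f\rangle = \int f(\lambda)T(\lambda)\,d\lambda$$

with $T(\lambda) = O(\epsilon^{n-2})$ and this completes the proof.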


Acknowledgments The authors are grateful to the referees for their thorough and careful reading of the paper. Their remarks and suggestions lead to an improvement of the ﬁrst version of this paper. The second author was partially supported by the ANR project NONAa. Appendix The proof of the following lemma is similar to the proof of [5, Proposition 2.1] and for the reader convenience we give it. 1

Lemma 8. Let δ > 0 and let kj (x, y) = x−j(1+δ) y−j( 2 +δ) , j = 1, 2. The operators G2 := k2 (H0 + i)−2 , G∗2 , (respectively, G1 := k1 (H0 + i)−1 , G∗1 ), are trace class (respectively, Hilbert–Schmidt). Proof. Without loss of the generality, we may assume that B = = 1. Introduce the unitary operator U : L2 (R2 ) → L2 (R2 ) by 2 eiϕ(x,y,x ,y ) u(x , y )dx dy , (U u)(x, y) = π 2 R where ϕ(x, y, x , y ) = xy − xy − x y + x y − 12 y . A simple calculus shows that ˜ 0 = U −1 H0 U = (Dy2 + y 2 ) + x − 1 , H 4 1 ω −1 ω ˜ kj = U kj U = kj x − Dy − , y + Dx . 2 ˜ j := U Gj U −1 = Since U is unitary, it suﬃces to prove the lemma for G ω ˜ −j ˜ kj (H0 + i) . Let χ(t) ∈ C0∞ (R; [0, 1]) be a cut-oﬀ function such that χ(t) = 1 for |t| ≤ 1 and 2 } < k < 2, and introduce the χ(t) = 0 for |t| ≥ 2. Fix a number k, max{1, 1+2δ symbol y, ηk q(x, y, η) = χ , |η 2 + y 2 + (x + i)| where y, η = (1 + y 2 + η 2 )1/2 . It clear that q(x, y, η) ∈ S 0 (R4(x,ξ,y,η) ) and we set A = q ω (x, y, Dy ). We decompose ˜ 0 + i)−j = Ak˜ω (H ˜ 0 + i)−j + (I − A)k˜ω (H ˜ 0 + i)−j = Lj + Mj . k˜jω (H j j To treat Lj , notice that on the support of q(x, y, η) we have (η 2 + y 2 + x + i)−1 ∈ S 0 (R4 ; y, η−k ). In fact, on the support of q we obtain y, ηk ≤ 2|η 2 + y 2 + x + i|,

(A.1)


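and it is easy to estimate the derivatives of $(\eta^2 + y^2 + x + i)^{-1}$. According to the calculus of pseudodifferential operators, $L_j$ becomes a pseudodifferential operator with symbol in

$$S^0\big(\mathbb{R}^4;\ \langle y,\eta\rangle^{-k}\langle x-\eta\rangle^{-j(1+\delta)}\langle y+\xi\rangle^{-j(\frac12+\delta)}\big),$$

and the trace norm (respectively, Hilbert–Schmidt norm) of $L_2$ (respectively $L_1$) can be estimated (see, for instance, [3, Proposition 9.2 and Theorem 9.4]) by

$$\|L_1\|_{HS}^2 + \|L_2\|_{tr} \le C_0\int \langle y,\eta\rangle^{-2k}\langle x-\eta\rangle^{-2-2\delta}\langle y+\xi\rangle^{-1-2\delta}\,dx\,d\xi\,dy\,d\eta \le C_0\int \langle y,\eta\rangle^{-2k}\,dy\,d\eta \le C_0. \tag{A.2}$$

To deal with $M_j$, $j = 1, 2$, we will show that $(I-A)\tilde k_2^{\omega}$ is a trace class operator and $(I-A)\tilde k_1^{\omega}$ is a Hilbert–Schmidt one. Notice that on the support of the symbol of $(I-A)$ we have

$$\langle y,\eta\rangle^{k} \ge |\eta^2 + y^2 + x + i|.$$

Taking into account the estimate $\partial_x^l\partial_y^m k_j(x,y) = O_{l,m}\big(\langle x\rangle^{-j(1+\delta)}\langle y\rangle^{-j(\frac12+\delta)}\big)$, we get

$$\begin{aligned}
\|(I-A)\tilde k_1^{\omega}\|_{HS}^2 + \|(I-A)\tilde k_2^{\omega}\|_{tr}
&\le C_1\int_{\langle y,\eta\rangle^{k}\ge|\eta^2+y^2+x+i|}\langle x-\eta\rangle^{-2-2\delta}\langle y+\xi\rangle^{-1-2\delta}\,dx\,d\xi\,dy\,d\eta \\
&\le C_2\int_{\langle y,\eta\rangle^{k}\ge|\eta^2+y^2+x+i|}\langle x-\eta\rangle^{-2-2\delta}\,dx\,dy\,d\eta \\
&\le C_2\int_{\langle y,\eta\rangle^{k}\ge|\eta^2+y^2+\eta+u+i|}\langle u\rangle^{-2-2\delta}\,du\,dy\,d\eta \\
&\le C_2\int_{\substack{\langle y,\eta\rangle^{k}\ge|\eta^2+y^2+\eta+u| \\ |u|\le\frac12\langle y,\eta\rangle^{k}}}\langle u\rangle^{-2-2\delta}\,du\,dy\,d\eta
 + C_2\int_{\substack{\langle y,\eta\rangle^{k}\ge|\eta^2+y^2+\eta+u| \\ |u|\ge\frac12\langle y,\eta\rangle^{k}}}\langle u\rangle^{-2-2\delta}\,du\,dy\,d\eta \\
&\le C_2\left(\int_{|u|\le C_3,\,|y|\le C_3,\,|\eta|\le C_3}\langle u\rangle^{-2-2\delta}\,du\,dy\,d\eta
 + \int_{|u|\ge\frac12\langle y,\eta\rangle^{k}}\langle u\rangle^{-2-2\delta}\,du\,dy\,d\eta\right) \\
&\le C_4 + C_5\int\langle u\rangle^{-2-2\delta}\left(\int_0^{(2|u|)^{1/k}} r\,dr\right)du \\
&\le C_4 + C_6\int\langle u\rangle^{-2-2\delta+2/k}\,du \le C_7,
\end{aligned} \tag{A.3}$$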

since −2 − 2δ + 2/k < −1. Using (A.1)–(A.3) and the fact that A is trace class (respectively Hilbert– Schmidt) operator if and only if A∗ is trace class (respectively Hilbert–Schmidt) operator, we complete the proof of the lemma.
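The choice $\max\{1, \frac{2}{1+2\delta}\} < k < 2$ is what makes both integrals converge: $k > 1$ gives integrability of $\langle y,\eta\rangle^{-2k}$ over $\mathbb{R}^2$ in (A.2), and $k > 2/(1+2\delta)$ gives the exponent $-2-2\delta+2/k < -1$ in (A.3). The short sketch below simply evaluates these two conditions for sample values; the function names and the sample grid are illustrative assumptions, not part of the original argument.

```python
# Arithmetic check of the exponent conditions behind (A.2) and (A.3).

def admissible_k_interval(delta):
    """Open interval (lower, upper) of admissible k for a given delta > 0."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    return max(1.0, 2.0 / (1.0 + 2.0 * delta)), 2.0

def tail_exponent(delta, k):
    """Exponent of <u> in the last integral of (A.3)."""
    return -2.0 - 2.0 * delta + 2.0 / k

def conditions_hold(delta, k):
    """True if k is admissible and both convergence conditions are met."""
    lower, upper = admissible_k_interval(delta)
    if not (lower < k < upper):
        return False
    plane_integral_converges = 2.0 * k > 2.0           # <y,eta>^{-2k} on R^2
    tail_integral_converges = tail_exponent(delta, k) < -1.0
    return plane_integral_converges and tail_integral_converges

def sample_check(deltas=(0.01, 0.1, 0.5, 1.0, 3.0), fractions=(0.1, 0.5, 0.9)):
    """Evaluate the conditions at a few interior points of each interval."""
    results = []
    for delta in deltas:
        lower, upper = admissible_k_interval(delta)
        for t in fractions:
            k = lower + t * (upper - lower)
            results.append(conditions_hold(delta, k))
    return all(results)

if __name__ == "__main__":
    print(sample_check())
```

Note that for $\delta \ge 1/2$ the lower bound is simply $k > 1$, while for small $\delta$ the admissible interval shrinks towards $k = 2$.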

References

[1] W. O. Amrein, A. M. Boutet de Monvel and V. Georgescu, C0-Groups, Commutator Methods and Spectral Theory of N-Body Hamiltonians, Progress in Mathematics, Vol. 135 (Birkhäuser-Verlag, Basel, 1996).
[2] F. Bentosela, R. Carmona, P. Duclos, B. Simon, B. Souillard and R. Weder, Schrödinger operators with an electric field and random or deterministic potentials, Comm. Math. Phys. 88 (1983) 387–397.
[3] M. Dimassi and J. Sjöstrand, Spectral Asymptotics in Semiclassical Limit, London Mathematical Society, Lecture Notes Series, Vol. 268 (Cambridge University Press, 1999).
[4] M. Dimassi and V. Petkov, Spectral shift function and resonances for nonsemibounded and Stark Hamiltonians, J. Math. Pures Appl. 82 (2003) 1303–1342.
[5] M. Dimassi and V. Petkov, Resonances for magnetic Stark hamiltonians in two dimensional case, Int. Math. Res. Not. 77 (2004) 4147–4179.
[6] C. Gerard, A proof of the abstract limiting absorption principle by energy estimates, J. Funct. Anal. 254 (2008) 2707–2724.
[7] C. Ferrari and H. Kovarik, Resonances width in crossed electric and magnetic fields, J. Phys. A Math. Gen. 37 (2004) 7671–7697.
[8] C. Ferrari and H. Kovarik, On the exponential decay of magnetic Stark resonances, Rep. Math. Phys. 56 (2005) 197–207.
[9] V. Ivrii, Analysis and Precise Spectral Asymptotics, Springer Monographs in Mathematics (Springer, Berlin, 1998).
[10] M. Klein, D. Robert and X. P. Wang, Breit–Wigner formula for the scattering phase in the Stark effect, Comm. Math. Phys. 131(1) (1990) 109–124.
[11] M. G. Krein, On the trace formula in perturbation theory, Mat. Sb. 33 (1953) 597–626 (in Russian).
[12] E. Korotyaev and A. Pushnitski, A trace formula and high energy spectral asymptotics for the perturbed Landau Hamiltonian, J. Funct. Anal. 217 (2004) 221–248.
[13] M. Melgaard and G. Rosenblum, Eigenvalue asymptotics for weakly perturbed Dirac and Schrödinger operators with constant magnetic fields of full rank, Comm. Partial Differential Equations 28 (2003) 697–736.
[14] E. Mourre, Absence of singular continuous spectrum for certain self-adjoint operators, Comm. Math. Phys. 78(3) (1981) 391–408.
[15] G. Raikov and S. Warzel, Quasi-classical versus non-classical spectral asymptotics for magnetic Schrödinger operators with decreasing electric potentials, Rev. Math. Phys. 14 (2002) 1051–1072.


[16] M. Reed and B. Simon, Methods of Modern Mathematical Physics, IV, Analysis of Operators (Academic Press, New York, 1978).
[17] D. Robert and X. P. Wang, Existence of time-delay operators for Stark Hamiltonians, Comm. Partial Differential Equations 14 (1989) 63–98.
[18] D. Robert and X. P. Wang, Time-delay and spectral density for Stark Hamiltonians. II. Asymptotics of trace formulae, Chinese Ann. Math. Ser. B 12(3) (1991) 358–383.
[19] X. P. Wang, Weak coupling asymptotics of Schrödinger operators with Stark effect, in Harmonic Analysis, Lecture Notes in Math., Vol. 1494 (Springer, Berlin, 1991), pp. 185–195.
[20] D. Yafaev, Mathematical Scattering Theory (Amer. Math. Society, Providence, RI, 1992).

May 11, J070-S0129055X10003990

2010 10:6 WSPC/S0129-055X

148-RMP

Reviews in Mathematical Physics Vol. 22, No. 4 (2010) 381–430
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10003990

THE LOCALLY COVARIANT DIRAC FIELD

KO SANDERS
Institute of Theoretical Physics, University of Göttingen, Friedrich-Hund-Platz 1, D-37077 Göttingen, Germany and Courant Research Center, “Higher Order Structures in Mathematics”, University of Göttingen, Germany
[email protected]

Received 25 November 2009
Revised 1 March 2010

We describe the free Dirac field in a four-dimensional spacetime as a locally covariant quantum field theory in the sense of Brunetti, Fredenhagen and Verch, using a representation independent construction. The freedom in the geometric constructions involved can be encoded in terms of the cohomology of the category of spin spacetimes. If we restrict ourselves to the observable algebra, the cohomological obstructions vanish and the theory is unique. We establish some basic properties of the theory and discuss the class of Hadamard states, filling some technical gaps in the literature. Finally, we show that the relative Cauchy evolution yields commutators with the stress-energy-momentum tensor, as in the scalar field case.

Keywords: Quantum field theory; curved spacetime; Dirac field.
Mathematics Subject Classifications 2010: 81T20

1. Introduction

Quantum field theory in curved spacetime is relevant for several purposes, such as the construction of cosmological models and obtaining a better understanding of quantum field theory in Minkowski spacetime. In order to achieve these goals in a more realistic setting, it is important to go beyond the well-studied free scalar field. In this paper, we will present a proof, already contained in [1], of the fact that the free Dirac field in a four-dimensional globally hyperbolic spacetime can be described as a locally covariant quantum field theory in the sense of [2]. Our presentation of the Dirac field is representation independent and we emphasize categorical methods throughout in order to point out an interesting problem concerning the uniqueness of the theory. The obstruction for the definition of a unique theory can be formulated in terms of the cohomology of the category of spacetimes with a spin structure, in particular its first Stiefel–Whitney class. It seems difficult to compute this class for a category, but we will show that a unique theory


can always be obtained by restriction to the observable algebras generated by even polynomials in the ﬁeld, in which case the cohomological obstructions vanish. Hadamard states can be deﬁned in terms of a series expansion of their two-point distribution, detailing their local singularity structure. Alternatively, they can be characterized by a microlocal condition. The equivalence of these two deﬁnitions has been investigated by several authors using diﬀerent techniques of proof, but in our opinion none of these arguments has been fully convincing. In our discussion, we hope to close any remaining gaps in the diﬀerent proofs and establish the equivalence on ﬁrm ground. We also compute the relative Cauchy evolution of this ﬁeld and obtain commutators with the stress-energy-momentum tensor, in complete analogy with the scalar ﬁeld case ([2]). For this, we use a point-splitting procedure to renormalize the stress-energy-momentum tensor. Because we only need commutators with this tensor we do not need to treat the so-called trace anomaly, a ﬁnite multiple of the identity operator, in detail. We refer the interested reader to [3], who also construct the extended algebra of Wick powers, relevant for perturbation theory. A Spin-Statistics Theorem in a generally covariant framework may be found in [4]. The contents of this paper are organized as follows. In Sec. 2, we review some of the mathematical background material that we need in order to describe the Dirac ﬁeld. This includes ﬁrst of all the Dirac algebra and the Spin group, followed by a categorical formulation of some of the diﬀerential geometry that we will need. In Sec. 3, we describe the classical free Dirac ﬁeld, starting with the geometric and algebraic aspects in Secs. 3.1 and 3.2 and the equations of motion and their fundamental solutions in Sec. 3.3. We discuss the uniqueness of the functorial constructions and their cohomological obstructions in Sec. 3.4. We then proceed to the quantum Dirac ﬁeld in Sec. 
4. In Sec. 4.1, we quantize the classical Dirac ﬁeld in a local and covariant way and collect some of its basic properties. Section 4.2 deals with Hadamard states and includes a discussion of the existing results concerning the equivalence of the microlocal and the series expansion deﬁnitions. For this purpose we also refer to Appendix A, which contains several relevant and useful (but expected) results in microlocal analysis. Section 4.3 contains our discussion of the relative Cauchy evolution of the free Dirac ﬁeld, obtaining commutators with the stress-energy-momentum tensor, but the proof of our main result there is deferred to Appendix B, because it consists of rather involved computations. Finally we end with some conclusions. Our presentation of locally covariant quantum ﬁeld theory is based on the original [2] and on [5]. For the Dirac ﬁeld in curved spacetime, we largely follow [6, 7], as well as our earlier [1]. For results on Cliﬀord algebras, we refer to [8] (see also [9] for a short review). 2. Mathematical Preliminaries To prepare for our discussion of the locally covariant Dirac ﬁeld, we present in the current section some mathematical preliminaries concerning the Dirac algebra, the


Spin group and a categorical formulation of relevant aspects of differential geometry. These merely serve to fix our notation and set the scene for the subsequent sections. We also point out the relations with some other definitions and conventions in the literature.

2.1. The Dirac algebra and the Spin group

The Spin group can be embedded in the Clifford algebra of Minkowski spacetime, which we call the Dirac algebra. Therefore, we will first briefly recall some results on Clifford algebras, for which we refer to [8] (note the difference in sign convention in the Clifford multiplication).

Let $\mathbb{R}^{r,s}$ be a finite dimensional real vector space with dimension $n = r + s$ and with a non-degenerate bilinear form $g_{ab}$ which has $r$ positive and $s$ negative eigenvalues. The Clifford algebra $Cl_{r,s}$ is defined as the $\mathbb{R}$-linear associative algebra generated by a unit element $I$ and an orthonormal basis $e_a$ of $\mathbb{R}^{r,n-r}$ subject to the relations

$$e_ae_b + e_be_a = 2g_{ab}I.$$

This definition is independent of the choice of basis. We may identify $\mathbb{R}^{r,s} \subset Cl_{r,s}$ as the subspace of monomials in the basis $e_a$ of degree one. The even, respectively odd, subspace of this Clifford algebra is the one spanned by monomials of even, respectively odd, degree in the basis vectors and is denoted by $Cl^0_{r,s}$, respectively $Cl^1_{r,s}$. Note that the even subspace is also a subalgebra.

In the following we will be especially interested in Minkowski spacetime, $M_0 := \mathbb{R}^{1,3}$, where the bilinear form is $\eta = \mathrm{diag}(1,-1,-1,-1)$ and where we choose an orthonormal basis $g_a$, $a = 0, 1, 2, 3$, with $\|g_0\|^2 = 1$, $\|\cdot\|^2$ denoting the Minkowski pseudo-norm squared. The associated Clifford algebra is called the Dirac algebra $D := Cl_{1,3}$ and it is characterized by

$$g_ag_b + g_bg_a = 2\eta_{ab}I. \tag{1}$$

As a vector space, the Clifford algebra is naturally isomorphic to the exterior algebra. This motivates the term volume form for the element $g_5 := g_0g_1g_2g_3$ (or in general $e := e_1\cdots e_{r+s}$). Note the following properties:

Lemma 2.1. We have $g_5^2 = -I$ and $g_5vg_5^{-1} = -v$ for all $v \in M_0$. More generally, if $u \in M_0$ has $u^2 = \|u\|^2I \neq 0$, then $u^{-1} = \|u\|^{-2}u$ and $v \mapsto -uvu^{-1}$ defines a reflection of $M_0$ in the hyperplane perpendicular to $u$.

Proof. These equalities follow directly from Eq. (1). For the last claim, e.g., we compute:

$$-uvu^{-1} = v - (uv + vu)u^{-1} = v - \frac{2\langle u,v\rangle}{\|u\|^2}\,u, \quad v \in M_0.$$

Standard arguments with Clifford algebras [8] give:

$$D = Cl_{1,3} \simeq Cl^0_{1,4} \simeq Cl^0_{4,1}, \qquad Cl_{4,1} \simeq M(4,\mathbb{C}),$$


where $M(4,\mathbb{C})$ denotes the algebra of complex $(4\times4)$-matrices. In fact, $Cl_{4,1}$ is generated by the generators $g_a$ of $D$ together with a central element $\omega$, corresponding to $iI \in M(4,\mathbb{C})$. Hence:

$$M(4,\mathbb{C}) \simeq \mathbb{C}\otimes_{\mathbb{R}} D. \tag{2}$$

This also implies that the center of $D$ is spanned by $I$ (over $\mathbb{R}$). The following Fundamental Theorem provides all the essential information we need on the Dirac algebra (for an elementary algebraic proof, we refer to Pauli [10]):

Theorem 2.2 (Fundamental Theorem). The Dirac algebra $D$ is simple and has a unique irreducible complex representation (i.e. an $\mathbb{R}$-linear representation $\pi : D \to M(n,\mathbb{C})$), up to equivalence. This is the representation $\pi_0 : D \to M(4,\mathbb{C})$ determined by $\pi_0(g_a) = \gamma_a$ with the Dirac matrices

$$\gamma_0 := \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}, \qquad \gamma_i := \begin{pmatrix} 0 & -\sigma_i \\ \sigma_i & 0 \end{pmatrix},$$

where $\sigma_i$ are the Pauli matrices

$$\sigma_1 := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 := \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \qquad \text{and} \qquad \sigma_3 := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

The equivalence with another irreducible complex representation $\pi$ of $D$ is implemented by $\pi(S) = L\pi_0(S)L^{-1}$ for all $S \in D$, where $L \in GL(4,\mathbb{C})$ is unique up to a non-zero complex factor. Consequently, for every set of matrices $\gamma'_a \in M(4,\mathbb{C})$ satisfying Eq. (1) there is an $L \in GL(4,\mathbb{C})$, unique up to a non-zero complex constant, such that $\gamma'_a = L\gamma_aL^{-1}$.

Proof. One can show [8] that $D \simeq M(2,\mathbb{H})$, where $\mathbb{H}$ is the skew field of quaternions. This algebra is simple, because it is a full matrix algebra. The given matrices $\gamma_a$ satisfy the Clifford relations (1) and therefore extend to a representation of $D$ in $M(4,\mathbb{C})$. Any complex representation $\pi : D \to M(n,\mathbb{C})$ extends to a complex representation $\tilde\pi$ of $M(4,\mathbb{C})$, using Eq. (2) and the trivial center of $D$, which is irreducible if $\pi$ is irreducible. As $M(4,\mathbb{C})$ has only one irreducible representation up to equivalence (see [11]), namely the defining one on $\mathbb{C}^4$, this determines $\pi$ up to equivalence, as stated. If $K, L \in GL(4,\mathbb{C})$ are two matrices which implement the same equivalence, then $KL^{-1}$ commutes with $D$ and hence $K = cL$, where $c \in \mathbb{C}$ is non-zero because $K$ is invertible. Note that $\pi'(g_a) := \gamma'_a$ extends to a complex representation of $D$ in $M(4,\mathbb{C})$ which is faithful (as $D$ is simple). The last statement then follows from the previous one.

For notational convenience, we define $\gamma_5 := \pi_0(g_5)$.
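As a quick sanity check of the Dirac matrices in Theorem 2.2, the following minimal sketch verifies numerically that they satisfy the Clifford relations (1) with $\eta = \mathrm{diag}(1,-1,-1,-1)$, and that $\gamma_5 = \gamma_0\gamma_1\gamma_2\gamma_3$ squares to $-I$ and anticommutes with every $\gamma_a$, as in Lemma 2.1. The helper names are illustrative assumptions, and only the Python standard library is used.

```python
# Pure-Python check of the Clifford relations for the Dirac matrices (illustrative).

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]  # Pauli matrices
ETA = [1, -1, -1, -1]                                               # diagonal of eta
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

# gamma_0 = [[0, I], [I, 0]],  gamma_i = [[0, -sigma_i], [sigma_i, 0]]
GAMMA = [block(Z2, I2, I2, Z2)] + [block(Z2, scale(-1, s), s, Z2) for s in SIGMA]

def clifford_relations_hold():
    """Check gamma_a gamma_b + gamma_b gamma_a = 2 eta_ab I for all a, b."""
    for a in range(4):
        for b in range(4):
            anti = add(matmul(GAMMA[a], GAMMA[b]), matmul(GAMMA[b], GAMMA[a]))
            target = scale(2 * ETA[a] if a == b else 0, I4)
            if not close(anti, target):
                return False
    return True

GAMMA5 = matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3]))

def gamma5_anticommutes():
    """Check gamma_5 gamma_a = -gamma_a gamma_5 for all a."""
    return all(close(matmul(GAMMA5, g), scale(-1, matmul(g, GAMMA5))) for g in GAMMA)

if __name__ == "__main__":
    print(clifford_relations_hold(), gamma5_anticommutes(),
          close(matmul(GAMMA5, GAMMA5), scale(-1, I4)))
```

By the Fundamental Theorem, any other set of $4\times4$ matrices passing the first check is related to this one by a similarity transformation, so the same function can be used to test alternative conventions by replacing `GAMMA`.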
We can define a determinant and trace function on $D$ by $\det S = \det\pi(S)$ and $\mathrm{Tr}(S) = \mathrm{Tr}(\pi(S))$ for all $S \in D$, where $\pi$ is any irreducible complex representation of $D$. This is well-defined by the Fundamental Theorem. The following lemma is often useful in computations:

Lemma 2.3. We have $\mathrm{Tr}(g_ag_b) = 4\eta_{ab}$ and $\mathrm{Tr}([g_b,g_c]g_dg_a) = 8(\eta_{cd}\eta_{ba} - \eta_{bd}\eta_{ca})$.


Proof. Using the cyclicity of the trace and Eq. (1) we find:

$$\mathrm{Tr}(g_ag_b) = \tfrac12\mathrm{Tr}(g_ag_b + g_bg_a) = \mathrm{Tr}(\eta_{ab}I) = 4\eta_{ab}$$

and

$$\mathrm{Tr}([g_b,g_c]g_dg_a) = \mathrm{Tr}(g_b[g_c,g_dg_a]) = \mathrm{Tr}(g_b\{g_c,g_d\}g_a - g_bg_d\{g_c,g_a\}) = 2\,\mathrm{Tr}(\eta_{cd}g_bg_a - g_bg_d\eta_{ca}) = 8(\eta_{cd}\eta_{ba} - \eta_{bd}\eta_{ca}).$$

We now turn to the Spin group, which is the universal covering group of the special Lorentz group, a double covering which can be constructed in an elegant way inside the Dirac algebra.

Definition 2.4. The Pin and Spin groups of $Cl_{r,s}$ are defined as

$$Pin_{r,s} := \{S \in Cl_{r,s} \mid S = u_1\cdots u_k,\ u_i \in \mathbb{R}^{r,s},\ u_i^2 = \pm I\}, \qquad Spin_{r,s} := Pin_{r,s} \cap Cl^0_{r,s}.$$

We let $Spin^0_{1,3}$ denote the connected component of $Spin_{1,3}$ which contains the identity. We also define the Lorentz group $\mathcal{L} := O_{1,3}$, the special Lorentz group $\mathcal{L}_+ := SO_{1,3}$ and the special orthochronous Lorentz group $\mathcal{L}^\uparrow_+ := SO^0_{1,3}$, which is the connected component of $\mathcal{L}_+$ containing the identity.

The special orthochronous Lorentz group preserves the orientation and time-orientation. For $S \in Pin_{1,3}$ the map $v \mapsto SvS^{-1}$ on $M_0$ is a product of reflections (up to a sign) by Lemma 2.1. Together with the fact that $\det u = \|u\|^4$ for all $u \in M_0$ this gives rise to another useful characterisation of the group $Pin_{1,3}$, which we shall not prove:^a

Proposition 2.5. $Pin_{1,3} = \{S \in D \mid \det S = 1,\ \forall v \in M_0:\ SvS^{-1} \in M_0\}$.

It can be seen from Proposition 2.5 that $Pin_{1,3}$ and $Spin_{1,3}$ are indeed Lie groups. For the universal covering homomorphism $\Lambda$ between $Pin_{1,3}$ and the Lorentz group, we have the following formulae:^{b,c}

Proposition 2.6. The map $\Lambda : Pin_{1,3} \to \mathcal{L}$ defined by $S \mapsto \Lambda^a{}_b(S) \in M(4,\mathbb{R})$ such that $Sg_bS^{-1} = g_a\Lambda^a{}_b(S)$ is the universal covering homomorphism of Lie groups, which restricts to the universal covering homomorphism $Spin^0_{1,3} \to \mathcal{L}^\uparrow_+$. We have $\Lambda^a{}_b(S) = \tfrac14\mathrm{Tr}(g^aSg_bS^{-1})$ and the inverse of the derivative $d\Lambda : spin^0_{1,3} \to l^\uparrow_+$ at $S = I$ is given by:

$$(d\Lambda)^{-1}(\lambda^b{}_a) = \tfrac14\lambda^b{}_a\,g_bg^a.$$

Proof. For the first sentence we refer to [8, Theorem 2.10] and subsequent remarks. Using the Clifford relations (1), we see that

$$\Lambda^a{}_b(S) = \tfrac14\eta^{ac}\,\mathrm{Tr}(\eta_{cd}\Lambda^d{}_b(S)I) = \tfrac18\eta^{ac}\,\mathrm{Tr}((g_cg_d + g_dg_c)\Lambda^d{}_b(S)) = \tfrac14\eta^{ac}\,\mathrm{Tr}(g_cg_d\Lambda^d{}_b(S)) = \tfrac14\mathrm{Tr}(g^aSg_bS^{-1}).$$

Expanding $\Lambda(S + \epsilon s + O(\epsilon^2))$ at $S = I$ up to second order in $\epsilon$ we find $d\Lambda(s)^a{}_b = \tfrac14\mathrm{Tr}([g_b,g^a]s)$. We check that $L(\lambda^b{}_a) := \tfrac14\lambda^b{}_a\,g_bg^a$ is an inverse of $d\Lambda$:

$$d\Lambda(L(\lambda^d{}_e))^a{}_b = \tfrac1{16}\eta^{ac}\eta^{ef}\lambda^d{}_e\,\mathrm{Tr}([g_b,g_c]g_dg_f) = \tfrac12\eta^{ac}\eta^{ef}\lambda^d{}_e(\eta_{cd}\eta_{bf} - \eta_{bd}\eta_{cf}) = \tfrac12(\lambda^a{}_b - \eta^{ae}\eta_{bd}\lambda^d{}_e) = \lambda^a{}_b,$$

where we used Lemma 2.3 and the symmetry properties of $\lambda^d{}_e \in l^\uparrow_+$ in the last line.

^a The definition of the Spin group in [12] corresponds to our group $Pin_{1,3}$. In [6, 7] one uses the term Spin group for the group $\mathcal{S} := \{S \in M(4,\mathbb{C}) \mid \det S = 1,\ SvS^{-1} \in M_0 \text{ for all } v \in M_0\}$. Note that this group cannot give a double covering of the Lorentz group, as claimed in [6] (but not in [7]), because for any $S \in \mathcal{S}$ the matrices $iS$, $-S$, $-iS$ are in $\mathcal{S}$ too. Its usefulness is based on its simple definition and the fact that $\mathcal{S}^0 = Spin^0_{1,3}$.

^b These results are well known, but we record them for definiteness to correct a sign error in the spin connection (5) that has occurred in [6, 7, 13].

^c Lower case Latin indices are raised and lowered with $\eta^{ab}$, respectively $\eta_{ab}$, throughout.
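The trace identities of Lemma 2.3 and the formulae of Proposition 2.6 can be tested numerically in the representation $\pi_0$. The sketch below is illustrative only: the helper names and the sample generator $\lambda_{ab}$ are assumptions introduced here. It checks both trace identities, evaluates $\Lambda(S)$ for $S = g_1g_2$ (which should be a rotation by $\pi$ in the 1–2 plane), and confirms that $d\Lambda(\tfrac14\lambda^d{}_e g_dg^e) = \lambda$ for an antisymmetric $\lambda_{ab}$.

```python
# Numerical check of Lemma 2.3 and Proposition 2.6 in the representation pi_0.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ETA = [1, -1, -1, -1]
GAMMA = [block(Z2, I2, I2, Z2)] + [block(Z2, scale(-1, s), s, Z2) for s in SIGMA]
GAMMA_UP = [scale(ETA[a], GAMMA[a]) for a in range(4)]   # g^a = eta^{ab} g_b

def eta(a, b):
    return ETA[a] if a == b else 0

def lemma_2_3_holds(tol=1e-12):
    """Check both trace identities of Lemma 2.3 for all index values."""
    idx = range(4)
    for a in idx:
        for b in idx:
            if abs(trace(matmul(GAMMA[a], GAMMA[b])) - 4 * eta(a, b)) > tol:
                return False
            for c in idx:
                for d in idx:
                    comm = sub(matmul(GAMMA[b], GAMMA[c]), matmul(GAMMA[c], GAMMA[b]))
                    lhs = trace(matmul(comm, matmul(GAMMA[d], GAMMA[a])))
                    rhs = 8 * (eta(c, d) * eta(b, a) - eta(b, d) * eta(c, a))
                    if abs(lhs - rhs) > tol:
                        return False
    return True

def lorentz(s, s_inv):
    """Lambda^a_b(S) = (1/4) Tr(g^a S g_b S^{-1})."""
    return [[trace(matmul(matmul(GAMMA_UP[a], s), matmul(GAMMA[b], s_inv))).real / 4
             for b in range(4)] for a in range(4)]

def lift(lam):
    """L(lam) = (1/4) lam^d_e g_d g^e, with lam[d][e] = lam^d_e."""
    s = [[0] * 4 for _ in range(4)]
    for d in range(4):
        for e in range(4):
            s = add(s, scale(lam[d][e] / 4, matmul(GAMMA[d], GAMMA_UP[e])))
    return s

def d_lambda(s):
    """dLambda(s)^a_b = (1/4) Tr([g_b, g^a] s)."""
    return [[trace(matmul(sub(matmul(GAMMA[b], GAMMA_UP[a]),
                              matmul(GAMMA_UP[a], GAMMA[b])), s)).real / 4
             for b in range(4)] for a in range(4)]

# Hypothetical antisymmetric lam_{ab}: a boost in the 0-1 plane plus a rotation in the 2-3 plane.
LAM_LOWER = [[0, 0.3, 0, 0], [-0.3, 0, 0, 0], [0, 0, 0, 0.7], [0, 0, -0.7, 0]]
LAM_MIXED = [[ETA[a] * LAM_LOWER[a][b] for b in range(4)] for a in range(4)]  # lam^a_b

ROTATION = lorentz(matmul(GAMMA[1], GAMMA[2]), matmul(GAMMA[2], GAMMA[1]))
ROUND_TRIP = d_lambda(lift(LAM_MIXED))
```

Here `matmul(GAMMA[2], GAMMA[1])` is used as the inverse of $S = g_1g_2$, which follows from $g_1^2 = g_2^2 = -I$. The sketch only probes the derivative at $S = I$; it does not exponentiate the Lie algebra element.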

2.2. Some category theory and differential geometry

The language of locally covariant quantum field theory uses category theory to express the physical ideas of locality and covariance. Any object or construction that is extended from a single spacetime (usually Minkowski spacetime) to the categorical framework gets the adjective "locally covariant". The essence of local covariance seems to have a geometric origin and, because the Dirac field in curved spacetimes involves a substantial amount of geometric constructions, it will be convenient to present the relevant differential geometry in a categorical setting here. We refrain from the urge to call this "locally covariant differential geometry", which appears to be a pleonasm.

A category $C$ consists of a set of objects $c$ and a set of morphisms or arrows^d $\gamma: c_1 \to c_2$ between objects of $C$, such that the composition of morphisms, when defined, is associative and each object admits an identity morphism (we refer to [14] for more details). A (covariant) functor $F: C \to B$ is a map between categories, which maps objects $c$ to objects $F(c)$ and morphisms $\gamma: c_1 \to c_2$ to morphisms $F(\gamma): F(c_1) \to F(c_2)$ such that an identity morphism maps to an identity morphism and the composition of morphisms is preserved. A contravariant functor $F: C \to B$ is defined similarly, but reverses the direction of the morphisms: $F(\gamma): F(c_2) \to F(c_1)$. A natural transformation $t: F \Rightarrow G$ between covariant functors $F: C \to B$ and $G: C \to B$ is a map which assigns to each object $c$ a morphism $t(c)$ of $B$, called the component of $t$ at $c$, such that for every morphism $\gamma: c_1 \to c_2$ of $C$ we have $t(c_2) \circ F(\gamma) = G(\gamma) \circ t(c_1)$, which can be depicted as a commutative diagram. When a natural transformation $t$ admits another natural transformation $s$ such that $t(c) \circ s(c) = \mathrm{id}_c = s(c) \circ t(c)$ for all objects $c$, then $t$ is called a natural equivalence. In this case, we write $t: F \Leftrightarrow G$. A natural transformation between contravariant functors or between a covariant and a contravariant functor is defined similarly, except that some arrows in the commutative diagram are reversed. A subcategory $B$ of $C$ consists of a subset of the objects of $C$ and a subset of its morphisms in such a way that $B$ still satisfies the axioms of a category.

d It is very often convenient to depict the morphisms in a diagram as arrows between objects.

In our case, all categories will be concrete, i.e. the objects will be sets with a certain structure and the morphisms will be maps between sets. The identity morphism will always be the identity map and the composition of maps, when defined, is automatically associative. In short, our categories will be subcategories of the category Set, whose objects are sets^e and whose morphisms are maps. For our discussion of differential geometry we start with the following

Definition 2.7. The category Man$^n$ of smooth manifolds is the category whose objects are $C^\infty$ manifolds $\mathcal{M}$ of (finite) dimension $n$ and whose morphisms are $C^\infty$ embeddings $\mu: \mathcal{M}_1 \to \mathcal{M}_2$. The category Bund$'$ of fiber bundles is the category whose objects are smooth fiber bundles $p: B \to \mathcal{M}$ over objects $\mathcal{M}$ of Man$^n$ with bundle projection map $p$, and whose morphisms are $C^\infty$ maps $\beta: B_1 \to B_2$ covering a morphism $\mu: \mathcal{M}_1 \to \mathcal{M}_2$ of Man$^n$, i.e. such that $p_2 \circ \beta = \mu \circ p_1$. We denote by Bund the subcategory whose morphisms restrict to isomorphisms of the fibers. The categories VBund$'_R$, respectively VBund$'_C$, of real (complex) vector bundles are the subcategories of Bund$'$ whose objects $V$ are real (complex) vector bundles and whose morphisms $\nu: V_1 \to V_2$ are real (complex) linear maps of the fibers. Again we denote by VBund$_R$ and VBund$_C$ the subcategories whose morphisms restrict to isomorphisms of the fibers.
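The naturality condition $t(c_2) \circ F(\gamma) = G(\gamma) \circ t(c_1)$ from the definition of a natural transformation can be illustrated in the category Set. The following sketch is an illustration only and is not taken from the text: the "list" functor $F$, the "optional value" functor $G$ and the component `safe_head` are hypothetical choices made here, and the check runs over finitely many sample inputs rather than proving the identity.

```python
from typing import Callable, List, Optional, Sequence


def list_map(gamma: Callable[[int], int]) -> Callable[[List[int]], List[int]]:
    """F(gamma) for the list functor F: apply gamma to every entry."""
    return lambda xs: [gamma(x) for x in xs]


def option_map(gamma: Callable[[int], int]) -> Callable[[Optional[int]], Optional[int]]:
    """G(gamma) for the optional-value functor G: apply gamma unless empty."""
    return lambda x: None if x is None else gamma(x)


def safe_head(xs: List[int]) -> Optional[int]:
    """Component t(c): F(c) -> G(c), the first entry of a list, if there is one."""
    return xs[0] if xs else None


def naturality_square_commutes(gamma: Callable[[int], int],
                               samples: Sequence[List[int]]) -> bool:
    """Check t(c2) . F(gamma) == G(gamma) . t(c1) on the given sample lists."""
    return all(
        safe_head(list_map(gamma)(xs)) == option_map(gamma)(safe_head(xs))
        for xs in samples
    )


samples = [[], [3], [1, 2, 3], [-4, 0]]
square_ok = naturality_square_commutes(lambda n: n * n + 1, samples)
```

Here both paths around the commutative square are evaluated on the same element of $F(c_1)$; `square_ok` records whether they agree on every sample.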
We could have taken all smooth maps between manifolds as morphisms of Mann or allowed all dimensions. However, local diﬀeomorphisms allow us to transport more structure, which enables us to describe more of the canonical diﬀerential geometric constructions as functors. We describe the most important examples below. For ﬁber bundles, on the other hand, it will be useful to allow maps which are not isomorphisms on the ﬁbers.f ,g

e See [14] for some relevant remarks concerning the foundations of set theory and the use of small sets.

f The unprimed categories, whose morphisms are isomorphisms of the fibers, can be described as fibered categories over Man$^n$, cf. [15, p. 44].

g The functors $B$: Man$^n$ → Bund below are all of a special type, namely, they associate to a manifold $\mathcal{M}$ a fiber bundle whose base space is again $\mathcal{M}$. Although we will only use functors of this type when describing the Dirac field, the restriction is not technically necessary in our definitions.


Two of the most basic functors in differential geometry are:

The tangent bundle functor $T$: Man$^n$ → VBund$_R$ assigns to every manifold $\mathcal{M}$ the tangent bundle $T\mathcal{M}$ and to every morphism $\mu: \mathcal{M}_1 \to \mathcal{M}_2$ the differential $d\mu: T\mathcal{M}_1 \to T\mathcal{M}_2$.

The cotangent bundle functor^h $T^*$: Man$^n$ → VBund$_R$ assigns to every manifold $\mathcal{M}$ the cotangent bundle $T^*\mathcal{M}$ and to every morphism $\mu: \mathcal{M}_1 \to \mathcal{M}_2$ the push-forward $\mu_*: T^*\mathcal{M}_1 \to T^*\mathcal{M}_2$, which is defined as $\mu_*\omega := \omega \circ d\mu^{-1}$.

In a similar way, one can define the functor $\Lambda^k$: Man$^n$ → VBund$_R$ of exterior $k$-forms and the exterior algebra functor $\Lambda$: Man$^n$ → VBund$_R$, both with push-forwards. Another example is:

The density bundle functor $|\Lambda^n|$: Man$^n$ → VBund$_R$ assigns to every manifold $\mathcal{M}$ the one-dimensional trivial vector bundle of densities $|\Lambda^n\mathcal{M}|$, where $n$ is the dimension of $\mathcal{M}$. This is the vector bundle whose fiber at $x \in \mathcal{M}$ consists of functions $d: \Lambda^n_x\mathcal{M} \to \mathbb{R}$ such that $d(r\omega) = |r|\,d(\omega)$ for all $r \in \mathbb{R}$ and $\omega \in \Lambda^n_x\mathcal{M}$ (cf. [16, Appendix A.3]). A morphism $\mu$ is mapped to the push-forward defined by $\mu_* d := d \circ \mu^*$, where $\mu^*\omega := \omega \circ d\mu$ is the pull-back.

By standard constructions, one can take (finite) direct sums and tensor products of functors from Man$^n$ into VBund$_R$ which map $\mathcal{M}$ into a vector bundle over $\mathcal{M}$. One obtains another such functor in the obvious way. For functors $V$ into VBund$_R$ one can also define the dual, denoted by $V^*$, where the morphism between dual vector bundles is the push-forward of the original morphism. This generalizes the example of $T^*$ above. As another standard construction one can define the complexification $V_C$ of any functor $V$ into VBund$_R$ (respectively, VBund$'_R$), which is a functor into VBund$_C$ (respectively, VBund$'_C$).

Now we turn to some examples of natural transformations:

The canonical pairing between a functor $V$: Man$^n$ → VBund$_R$ which maps $\mathcal{M}$ to a vector bundle $V\mathcal{M}$ over $\mathcal{M}$ and its dual $V^*$ is a natural transformation $\langle\,,\rangle: V^* \otimes V \Rightarrow \Lambda^0$ whose components cover the identity morphism.

Complex conjugation is a natural equivalence $\bar{\ }: V_C \Leftrightarrow V_C$ in VBund$_R$ (or VBund$'_R$) between complexified vector bundles, which sends each section to its complex conjugate.

A further example of a natural equivalence is the fiber-wise multiplication by a real number $r \neq 0$. (For $r = 0$, this only yields a natural transformation.)

Furthermore, the constructions mentioned above (dual, direct sum, tensor product) and the natural transformations (pairing, fiber-wise multiplication) can also be applied directly to complex vector bundles in a canonical (Hermitean) way.

h It is tempting to think of a contravariant functor that maps manifolds to their cotangent bundles and morphisms $\mu$ to the pull-back, $\mu^*\omega := \omega \circ d\mu$, which indeed reverses the directions of arrows and changes the order of compositions. However, the pull-back is only defined on the image of $\mu$, so in general this does not define a morphism in VBund$_R$.


It will be convenient to consider distributions and integration in a categorical setting too:

Definition 2.8. TVec is the category of topological vector spaces with injective continuous linear maps as morphisms.

The functor $\mathbb{C}$: Man$^n$ → TVec is the constant functor $\mathbb{C}$, i.e. it assigns to each object the one-dimensional space $\mathbb{C}$ and to each morphism the identity morphism.

The functor of test-sections is the functor $C_0^\infty$: VBund$'_C$ → TVec which maps each complex vector bundle $V$ to the space $C_0^\infty(V)$ of compactly supported smooth sections of $V$ in the test-section topology.^i A morphism $\nu$, covering a morphism $\mu$, is mapped to the push-forward $\nu_*$ defined by $\nu_*(f) = \nu \circ f \circ \mu^{-1}$ on $\mu(\mathcal{M}_1)$, extended by 0 to all of $\mathcal{M}_2$.

The functor of smooth sections is the contravariant functor $C^\infty$: VBund$_C$ → TVec which maps each complex vector bundle $V$ to the space $C^\infty(V)$ of smooth sections of $V$ in the usual topology. A morphism $\nu$, covering a morphism $\mu$, is mapped to the pull-back $\nu^*$ defined by $\nu^*(f) = \nu^{-1} \circ f \circ \mu$.

The functor of distributions is the contravariant functor Distr: VBund$'_C$ → TVec which maps each complex vector bundle $V$ to the space $(C_0^\infty(V))'$ of distributions on $V$ with the weak topology induced by $C_0^\infty(V)$. A morphism $\nu$, covering a morphism $\mu$, is mapped to the pull-back $\nu^*$ defined by $\nu^* u := u \circ \nu_*$.

We will not need compactly supported distributions, but they can be defined as the functor dual to $C^\infty$. Notice that objects which are not compactly supported, such as smooth sections or distributions, behave contravariantly, whereas compactly supported ones behave covariantly. Also note that the pull-back of a smooth section can only be defined for morphisms that restrict to isomorphisms of the fibers. The following constructions will be of importance in Sec. 4:

Integration is a natural transformation $\int: C_0^\infty \circ |\Lambda^n| \Rightarrow \mathbb{C}$ which assigns to each $\omega \in C_0^\infty(|\Lambda^n\mathcal{M}|)$ the integral $\int_{\mathcal{M}} \omega$.

Canonical injections. Let $f$: VBund$_C$ → VBund$'_C$ be the forgetful functor. For any functor $V$: Man$^n$ → VBund$_C$ there is a canonical natural transformation $\kappa: C_0^\infty \circ f \circ V \Rightarrow C^\infty \circ V$, whose components are the canonical injections $C_0^\infty(V\mathcal{M}) \subset C^\infty(V\mathcal{M})$. Similarly, there is a canonical natural transformation $\iota: C^\infty \circ (V \otimes |\Lambda^n|) \Rightarrow \mathrm{Distr} \circ f \circ V^*$ given by $\iota_{\mathcal{M}}(f \otimes \omega) := \int_{\mathcal{M}} \langle\,\cdot\,, f\rangle\,\omega$ for any smooth section $f$ of $V\mathcal{M}$ and any density $\omega$ on $\mathcal{M}$. Each component of $\iota$ is injective.

Where convenient we will identify a functor $V$: Man$^n$ → VBund$_C$ with the functor $f \circ V$, omitting the forgetful functor, as this rarely leads to confusion. Furthermore, any natural transformation $t: V_1 \Rightarrow V_2$ between a pair of functors $V_i$: Man$^n$ → VBund$'_C$, $i = 1, 2$, lifts to a corresponding natural transformation $T: C_0^\infty \circ V_1 \Rightarrow C_0^\infty \circ V_2$ defined pointwise by $T_{\mathcal{M}} f := t_{\mathcal{M}} \circ f$. The same statement holds for $T: C^\infty \circ V_1 \Rightarrow C^\infty \circ V_2$, if the $V_i$ are functors into the category VBund$_C$.

i For a precise definition of the well-known topologies on test-sections and smooth sections we refer to [17, Chap. 17].

Next we add the structure of a semi-Riemannian metric:

Definition 2.9. The category SRMan$^n$ of semi-Riemannian manifolds is the subcategory of Man$^n$ whose objects $M = (\mathcal{M}, g)$ are $C^\infty$ manifolds $\mathcal{M}$ of dimension $n$ with a semi-Riemannian metric $g$ and whose morphisms $m: M_1 \to M_2$ are given by the isometric morphisms in Man$^n$, i.e. morphisms $\mu: \mathcal{M}_1 \to \mathcal{M}_2$ such that $\mu_* g_1 = g_2|_{\mu(\mathcal{M}_1)}$.

Again there is a canonical forgetful functor $f$: SRMan$^n$ → Man$^n$, which is often left implicit, so we will write e.g. $T$ for the functor $T \circ f$. The extra structure of a semi-Riemannian metric gives rise to extra functors and natural equivalences that are of interest to us:

The metric identification is a natural equivalence $G: T \Leftrightarrow T^*$ whose component at $M = (\mathcal{M}, g)$ is given by the map $G_M: T\mathcal{M} \to T^*\mathcal{M}$ such that $v \mapsto g(v, \cdot)$.

The frame bundle functor $F$: SRMan$^n$ → VBund$_R$ assigns to each object $M$ the frame bundle $FM$, i.e. the bundle whose fiber at a point $x \in \mathcal{M}$ consists of all orthonormal bases of $T_x\mathcal{M}$ in the metric $g$. This fiber is a subset of $T^{\otimes n}\mathcal{M}$. A morphism $m$ is mapped to the push-forward $\mu_*$ acting on $FM \subset T^{\otimes n}\mathcal{M}$.

The volume form functor vol: SRMan$^n$ → VBund$_R$ is defined as $\mathrm{vol} := |\Lambda^n| \circ f$. When $m: M_1 \to M_2$ is a morphism and $d\mathrm{vol}_i := \sqrt{|\det g_i|}$ the metric induced volume form on $M_i$, then vol maps $d\mathrm{vol}_1$ to the restriction of $d\mathrm{vol}_2$ to $m(M_1)$.

There is a canonical natural equivalence from $\Lambda^0$ to vol, which consists of multiplication with the metric induced volume form. Similarly there are natural equivalences between any functor $V$: SRMan$^n$ → VBund$_C$ and $V \otimes |\Lambda^n|$. Therefore we obtain a canonical natural transformation $\iota: C^\infty \circ V \Rightarrow \mathrm{Distr} \circ V^*$ whose components are injective.

Finally we should mention the Clifford bundle functor Cl: SRMan$^n$ → VBund$_R$, which assigns to each object $M = (\mathcal{M}, g)$ the Clifford bundle $\mathrm{Cl}M$, which is the vector bundle whose fiber at $x \in \mathcal{M}$ is the Clifford algebra of $(T_x\mathcal{M}, g)$ viewed as a linear space. Ignoring the algebraic structure, this functor is naturally equivalent to $\Lambda \circ f$.
Although we will not do so, it is possible to use this functor as a basic object for the description of fermions (cf. [18]).

3. The Classical Dirac Field

After these mathematical preliminaries we are now ready to start constructing the classical free Dirac field (as a locally covariant classical field). We will first describe the geometric and algebraic constructions, before we discuss the Dirac equation and its fundamental solutions. We close by investigating to what extent the relations between the Dirac operator, charge conjugation and adjoint map fix the structure of the theory and find that the non-uniqueness can be characterised in terms of the cohomology of the category of spin spacetimes.

3.1. Geometric aspects

In order to describe the Dirac field we need to introduce the notion of a spin structure on a spacetime, combining the geometric and the algebraic results of Sec. 2. This is the purpose of the current subsection. The systems that we will consider are intended to model Dirac quantum fields living in a (region of) spacetime which is endowed with a fixed Lorentzian metric (a background gravitational field). Mathematically these regions are modelled as follows:

Definition 3.1. By the term globally hyperbolic spacetime we will mean a connected, Hausdorff, $C^\infty$ Lorentzian manifold $M = (\mathcal{M}, g)$ of dimension $d = 4$, which is oriented, time-oriented and admits a Cauchy surface. A subset $O \subset \mathcal{M}$ of a globally hyperbolic spacetime $M$ is called causally convex iff for all $x, y \in O$ all causal curves in $\mathcal{M}$ from $x$ to $y$ lie entirely in $O$. The category Spac is the subcategory of SRMan$^n$ whose objects are all globally hyperbolic spacetimes $M = (\mathcal{M}, g)$ and whose morphisms are isometric embeddings $\psi$ that preserve the orientation and time-orientation and such that $\psi(M_1)$ is causally convex.

By a theorem of Geroch any globally hyperbolic spacetime is paracompact ([19, Appendix]). Most notations we use concerning the causal structure of spacetimes are standard, cf. [20]. The importance of causally convex sets is that for any morphism $\psi$ the causal structure of $M_1$ coincides with that of $\psi(M_1)$ inside $M_2$:

$$\psi(J^\pm_{M_1}(x)) = J^\pm_{M_2}(\psi(x)) \cap \psi(M_1), \qquad x \in M_1.$$

If $O \subset M$ is a connected open causally convex set, then $(O, g|_O)$ defines a globally hyperbolic spacetime in its own right. In this case there is a canonical morphism $I_{M,O}: O \to M$ given by the canonical embedding $\iota: O \to M$. We will often drop $I_{M,O}$ and $\iota$ from the notation and simply write $O \subset M$.

Notice that there is a forgetful functor $f$: Spac → SRMan$^n$ and that we can define the functor $F_+^\uparrow$: Spac → Bund of oriented, time-oriented orthonormal frames $F_+^\uparrow M$ for the tangent bundle, in analogy to Sec. 2.2. This is a principal $L_+^\uparrow$-bundle over $M$, where the special orthochronous Lorentz group $L_+^\uparrow$ acts from the right, i.e., given $e = (x, e_0, \ldots, e_3) \in F_+^\uparrow M$, where $x \in M$ and $e_a \in T_xM$ such that $g_x(e_a, e_b) = \eta_{ab}$ and $e_0$ is future pointing, the action of $\Lambda$ is defined by $R_\Lambda e = e' = (x, e'_0, \ldots, e'_3)$ where $e'_a = e_b\Lambda^b{}_a$.

Definition 3.2. A spin structure on $M$ is a pair $(SM, \pi)$, where $SM$ is a principal $\mathrm{Spin}^0_{1,3}$-bundle over $M$, the spin frame bundle, with a right action $R_S$, $S \in \mathrm{Spin}^0_{1,3}$, and $\pi: SM \to FM$, the spin frame projection, is a base-point preserving bundle homomorphism such that $\pi \circ R_S = R_{\Lambda(S)} \circ \pi$, where $S \mapsto \Lambda(S)$ is the universal covering map (cf. Proposition 2.6). A globally hyperbolic spin spacetime $SM = (M, g, SM, \pi)$ is an object $M = (M, g)$ of Spac which is endowed with the spin structure $(SM, \pi)$. The category SSpac is the subcategory of Bund whose objects are all globally hyperbolic spin spacetimes $SM = (M, g, SM, \pi)$ and whose morphisms $\chi: SM_1 \to SM_2$ cover a morphism $\psi: M_1 \to M_2$ in Spac and satisfy $\chi \circ (R_1)_S = (R_2)_S \circ \chi$ and $\pi_2 \circ \chi = \psi_* \circ \pi_1$, where $p_i$ are the bundle projections, $\pi_i$ the spin frame projections and $\psi_*$ the push-forward.

Note that a morphism acts as a diffeomorphism of the fibers, because it intertwines the group action. Every globally hyperbolic spacetime admits a spin structure, which need not be unique [6, 8, 19, 21]. We will regard distinct spin structures on the same underlying spacetime as distinct spin spacetimes.^j

Spinor and cospinor fields are sections of vector bundles associated to the spin frame bundle. We will require that the assignment of these vector bundles is functorial:

Definition 3.3. A locally covariant spinor bundle is a functor $V$: SSpac → VBund$_C$, written as $SM \mapsto V_{SM}$, $\chi \mapsto \nu$, such that $\chi$ and $\nu$ cover the same morphism $\psi$ in Spac and such that each $V_{SM}$ is a vector bundle associated to the spin frame bundle $SM$ through some representation. The dual functor $V^*$ is called a locally covariant cospinor bundle. Smooth sections of $V_{SM}$, respectively $V^*_{SM}$, are called (Dirac) spinors (or spinor fields), respectively cospinors (cospinor fields).

The condition in the definition of a locally covariant spinor bundle ensures that the vector bundle $V_{SM}$ and the spin frame bundle $SM$ are both bundles over the same spacetime $M$. For definiteness we pick out the following standard choice of locally covariant spinor and cospinor bundles:

Definition 3.4.
The standard locally covariant Dirac spinor bundle $D_0$: SSpac → VBund$_C$ is the locally covariant spinor bundle which associates to each object $SM$ of SSpac the associated vector bundle $D_0M = SM \times_{\mathrm{Spin}^0_{1,3}} \mathbb{C}^4$ of $SM$ with the representation $\pi_0$, and which maps each morphism $\chi: SM_1 \to SM_2$ to the morphism $\xi: D_0M_1 \to D_0M_2$ given by $\xi([E, z]) := [\chi(E), z]$. The standard locally covariant Dirac cospinor bundle $D_0^*$ is the dual functor of $D_0$.

j There exists another approach to spinors, which considers on each spacetime the Clifford bundle. This Clifford bundle is functorial in its dependence on the spacetime, but it does not generally define a spin structure. Indeed, at each point one can identify the Spin group inside the fiber of the Clifford bundle, but there may not be any projection from these Spin groups onto the frame bundle that intertwines the actions of the structure groups, the obstruction being a topological twist. (Conversely, every spin structure can be seen as a topologically twisted copy of the Spin groups in the Clifford bundle.) Nevertheless, it appears to provide sufficient structure to describe all the relevant physics in a functorial way. We refer to [18] for more information on this approach.

Recall that a point in $D_0M$ consists of an equivalence class of pairs $(E, z) \in SM \times \mathbb{C}^4$, where the equivalence is given by $[R_SE, z] = [E, \pi_0(S)z]$. The dual functor $D_0^*$ then assigns to each $SM$ the dual vector bundle $D_0^*M$ whose points are equivalence classes of pairs $(E, w^*) \in SM \times (\mathbb{C}^4)^*$, where the equivalence is given by $[R_SE, w^*] = [E, w^*\pi_0(S^{-1})]$. (Here we consider $w^* \in (\mathbb{C}^4)^*$ as a row vector, whereas $z \in \mathbb{C}^4$ is treated as a column vector.)

For any object $SM$ the unique connection $\nabla_{SM}$ on $TM$ which is compatible with the metric, $\nabla_{SM}\,g = 0$, can be described by an $l_+^\uparrow$-valued one-form $(\Omega_{SM})^b{}_a$ on the orthonormal frame bundle $F_+^\uparrow M$ (cf. [22, Chap. 2, Proposition 1.1]), where $l_+^\uparrow$ is the Lie algebra of $L_+^\uparrow$, which can be identified with the tangent space of the fiber of $F_+^\uparrow M$ at any point. For every local section $e$ of $F_+^\uparrow M$ the pull-back $\omega^b{}_a := e^*(\Omega^b{}_a)$ consists exactly of the connection one-forms of $\nabla_{SM}$ expressed in the orthonormal frame $e_a$. The one-form $(\Omega_{SM})^b{}_a$ can be pulled back by the spin frame projection $\pi$ and lifted to a $\mathrm{spin}^0_{1,3}$-valued one-form $\Sigma_{SM}$ on $SM$:

$$\Sigma_{SM} := (d\Lambda)^{-1}\pi^*((\Omega_{SM})^b{}_a) = \frac{1}{4}\pi^*((\Omega_{SM})^b{}_a)\, g_b g^a,$$

where the last equality uses Proposition 2.6. The one-form $\Sigma_{SM}$ determines a connection on the spin frame bundle $SM$. For any associated vector bundle $DM$ we then find a connection, also denoted by $\nabla_{SM}$, determined by the connection one-forms $\sigma := E^*(\Sigma_{SM})$ in a local section $E$ of $SM$, as represented on $DM$ (we will give an explicit expression for $\sigma$ in Eq. (5)). The connection can be viewed as a map $\nabla_{SM}: C_0^\infty(D_0M) \to C_0^\infty(T^*M \otimes D_0M)$, which is a component of a natural transformation^k $\nabla: C_0^\infty \circ D_0 \Rightarrow C_0^\infty \circ (T^* \otimes D_0)$. The Leibniz rule allows us to extend it to mixed spinor-tensors, using, e.g., $\nabla_a\langle v, u\rangle = \langle\nabla_a v, u\rangle + \langle v, \nabla_a u\rangle$.

3.2. Adjoints, charge conjugation and the Dirac operator

We now define the adjoint and charge conjugation maps on spinors and cospinors. These are special cases of the Fundamental Theorem 2.2, using the complex conjugate and adjoint matrices^l (cf. [23]).

k Alternatively we could have written the connection as a natural transformation from the 1-jet bundle extension of $D_0$ to $T^* \otimes D_0$.

l On a general representation space of complex dimension four, one can define many complex conjugations and Hermitean inner products. In order to obtain the desired equalities involving adjoint and charge conjugate spinors later on, we need these two operations to be compatible, i.e. $\langle\bar{v}, \bar{w}\rangle = \overline{\langle v, w\rangle}$. Without loss of generality we can then use the standard complex conjugation and Hermitean inner product on $\mathbb{C}^4$.


Theorem 3.5. For any irreducible complex representation $\pi$ of the Dirac algebra $D$ there are matrices $A, C \in GL(4,\mathbb{C})$ such that

$$A = A^*, \qquad \pi(g_a)^* = A\pi(g_a)A^{-1}, \qquad A\pi(n) > 0,$$
$$C\bar{C} = I, \qquad -\overline{\pi(g_a)} = C\pi(g_a)C^{-1} \qquad (3)$$

for all future pointing time-like vectors $n \in M_0 \subset D$. We have for all $S \in \mathrm{Spin}^0_{1,3}$:

$$A = -C^*A^TC, \qquad \pi(S)^*A\pi(S) = A, \qquad \pi(S^{-1})C^{-1}\overline{\pi(S)} = C^{-1}.$$

Moreover, if $A', C' \in M(4,\mathbb{C})$ have the properties stated above for the irreducible complex representation $\pi'$ of $D$, then there is an $L \in GL(4,\mathbb{C})$, unique up to a sign, such that $L^*A'L = A$, $(\bar{L})^{-1}C'L = C$ and $\pi = L^{-1}\pi'L$ on $D$.

Proof. To prove the existence of $A$ and $C$ in the representation $\pi_0$ we may take $A = A_0 := \gamma_0$, $C = C_0 := \gamma_2$ and check the required properties straightforwardly. Note for example that

$$\gamma_0 n^a\gamma_a = \begin{pmatrix} n^0 I + n^i\sigma_i & 0 \\ 0 & n^0 I - n^i\sigma_i \end{pmatrix} > 0,$$

because $\det(n^0 I \pm n^i\sigma_i) = n^2 > 0$ and $\mathrm{Tr}(n^0 I \pm n^i\sigma_i) = 2n^0 > 0$. To prove the existence of $A$ and $C$ in a general irreducible complex representation $\pi$ one writes $\gamma_a = K\pi(g_a)K^{-1}$ by Theorem 2.2 and verifies that $A = K^*A_0K$ and $C = \bar{K}^{-1}C_0K$ will do.

Given $A', C'$ satisfying Eq. (3) for $\pi'$ we can fix $K \in GL(4,\mathbb{C})$ such that $\pi' = K\pi K^{-1}$ on $D$ and the desired matrix $L$ must be $L = zK$ for some $z \neq 0$ by the Fundamental Theorem 2.2. Now set $\tilde{A} := K^*A'K$ and $\tilde{C} := (\bar{K})^{-1}C'K$ and note that $\tilde{A}$ and $\tilde{C}$ satisfy (3) for $\pi$. Because the sets of matrices $\pi(g_a)^*$ and $-\overline{\pi(g_a)}$ both satisfy the relations (1) we must have $aA = \tilde{A}$ and $cC = \tilde{C}$ for some non-zero complex factors $a$ and $c$, again by the Fundamental Theorem. Also, $|c| = 1$ because $C\bar{C} = I$ and $a > 0$ because $A = A^*$ and $A\pi(n) > 0$ for future pointing time-like vectors. Hence, $|z|^{-2} = a$ and $z = \bar{c}\bar{z}$, which fixes $z$ (and $L$) up to a sign. This proves the last statement.

The equation $A = -C^*A^TC$ holds for $A_0, C_0$ and therefore also in general. For a unit vector $u = u^ag_a$ we have $u^2 = \pm I$ and hence $\pi(u)^*A\pi(u) = u^au^b\pi(g_a)^*A\pi(g_b) = u^au^bA\pi(g_ag_b) = A\pi(u^2) = \pm A$. For $S \in \mathrm{Spin}_{1,3}$, we must therefore have that $\pi(S)^*A\pi(S) = \pm A$, by definition of the Spin group. For $S = I$, the sign is a plus, so by continuity and connectedness we conclude that $\pi(S)^*A\pi(S) = A$ for all $S \in \mathrm{Spin}^0_{1,3}$. For $C$, we use the fact that $\pi(u^{-1})C^{-1}\overline{\pi(u)} = -\pi(u)^{-1}\pi(u)C^{-1} = -C^{-1}$ and hence $\pi(S^{-1})C^{-1}\overline{\pi(S)} = C^{-1}$ for all $S \in \mathrm{Spin}_{1,3}$.
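The properties listed in Theorem 3.5 for $A_0 = \gamma_0$ and $C_0 = \gamma_2$ can be verified numerically in a concrete representation. The sketch below makes several assumptions that are not stated in the text: a chiral representation of the $\gamma_a$, chosen so that $\gamma_0 n^a\gamma_a$ has the block-diagonal form used in the proof, the signature $(+,-,-,-)$, and the helper names, which are hypothetical. The complex conjugate is placed on the last factor of the charge conjugation identity, as needed for it to hold.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
I4 = np.eye(4, dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
ETA = np.diag([1.0, -1.0, -1.0, -1.0])  # assumed signature (+,-,-,-)

# Assumed chiral representation gamma_a (lower index); not fixed by the text.
GAMMA = [np.block([[Z2, I2], [I2, Z2]])] + [
    np.block([[Z2, -s], [s, Z2]]) for s in SIGMA
]
A0, C0 = GAMMA[0], GAMMA[2]


def dagger(M):
    """Hermitean adjoint of a matrix."""
    return M.conj().T


def expm_series(X, terms=40):
    """Matrix exponential by a truncated power series (adequate for small X)."""
    result, term = I4.copy(), I4.copy()
    for k in range(1, terms):
        term = term @ X / k
        result = result + term
    return result


def algebraic_checks():
    """Relations (3) and A = -C* A^T C for A0, C0 in the assumed representation."""
    inv = np.linalg.inv
    return {
        "clifford": all(
            np.allclose(GAMMA[a] @ GAMMA[b] + GAMMA[b] @ GAMMA[a], 2 * ETA[a, b] * I4)
            for a in range(4) for b in range(4)),
        "A_hermitean": np.allclose(A0, dagger(A0)),
        "adjoint": all(np.allclose(dagger(g), A0 @ g @ inv(A0)) for g in GAMMA),
        "C_Cbar": np.allclose(C0 @ C0.conj(), I4),
        "conjugate": all(np.allclose(-g.conj(), C0 @ g @ inv(C0)) for g in GAMMA),
        "A_from_C": np.allclose(A0, -dagger(C0) @ A0.T @ C0),
    }


def positivity_check(n):
    """True if A0 n^a gamma_a is positive definite for the components n^a."""
    n_slash = sum(n[a] * GAMMA[a] for a in range(4))
    return bool(np.all(np.linalg.eigvalsh(A0 @ n_slash) > 0))


def spin_invariance_checks(lam):
    """Invariance of A0 and C0^{-1} under S = exp(1/4 lam^b_a gamma_b gamma^a)."""
    gamma_up = [ETA[a, a] * GAMMA[a] for a in range(4)]
    s = 0.25 * sum(lam[b, a] * GAMMA[b] @ gamma_up[a]
                   for a in range(4) for b in range(4))
    S = expm_series(s)
    C0_inv = np.linalg.inv(C0)
    return (np.allclose(dagger(S) @ A0 @ S, A0),
            np.allclose(np.linalg.inv(S) @ C0_inv @ S.conj(), C0_inv))


# Sample Lorentz generator (eta @ lam antisymmetric) and time-like vector.
antisym = np.zeros((4, 4))
antisym[0, 1], antisym[1, 0] = 0.3, -0.3
antisym[1, 2], antisym[2, 1] = 0.5, -0.5
lam = ETA @ antisym
n_future = [2.0, 0.5, -1.0, 0.3]
```

Calling `algebraic_checks()`, `positivity_check(n_future)` and `spin_invariance_checks(lam)` evaluates each property separately, so a failure of any one relation in a different representation or sign convention is easy to localise.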


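Note that $g_5 \in \mathrm{Spin}_{1,3} \setminus \mathrm{Spin}^0_{1,3}$. Indeed, using $\pi_0$ and $A_0 = \gamma_0$ in Theorem 3.5 we see that $\gamma_5^*A_0\gamma_5 = -A_0$, so $g_5 \in \mathrm{Spin}_{1,3}$ by definition, but not in $\mathrm{Spin}^0_{1,3}$. In the following theorem we use the fact that for any pair of natural transformations $t, t'$: SSpac ⇒ VBund$_C$ we can define the sum $t + t'$ and the tensor product $t \otimes t'$ componentwise.

Theorem 3.6. The standard locally covariant Dirac spinor and cospinor bundles admit natural ($\mathbb{C}$-antilinear) equivalences ${}^+: D_0 \Leftrightarrow D_0^*$, ${}^c: D_0 \Leftrightarrow D_0$, ${}^c: D_0^* \Leftrightarrow D_0^*$ in VBund$_R$ and a natural transformation $\gamma: D_0 \Rightarrow T^* \otimes D_0$ in VBund$_C$ such that all components cover the identity morphism and the following equations hold both on spinors and cospinors (i.e. we denote the inverses of ${}^+$ and ${}^c$ by the same symbol):

$${}^+ \circ {}^+ = 1 = {}^c \circ {}^c, \qquad {}^+ \circ {}^c = -1 \circ {}^c \circ {}^+,$$
$$\langle\,,\rangle \circ S \circ ({}^+ \otimes {}^+) = {}^- \circ \langle\,,\rangle = \langle\,,\rangle \circ ({}^c \otimes {}^c),$$
$$(1 \otimes {}^+) \circ \gamma = \gamma^* \circ {}^+, \qquad (1 \otimes {}^c) \circ \gamma = -1 \circ \gamma \circ {}^c, \qquad (4)$$
$$(1 + S \otimes 1) \circ (1 \otimes \gamma) \circ \gamma = (2 \circ g) \otimes 1, \qquad \nabla \circ \gamma = \gamma \circ \nabla,$$

where $S: D_0 \otimes D_0^* \Leftrightarrow D_0^* \otimes D_0$ and $S: T^* \otimes T^* \Leftrightarrow T^* \otimes T^*$ swap the factors in the tensor product, ${}^-$ denotes complex conjugation, $g: \Lambda^0 \Rightarrow T^* \otimes T^*$ maps the function 1 to the metric $g$ and $\gamma^*: D_0^* \Rightarrow T^* \otimes D_0^*$ is the adjoint map of $\gamma$ under the canonical pairing $\langle\,,\rangle$. Furthermore, for every object $SM$, every time-like future pointing tangent vector $n \in TM$ and every $v \in D_0M$ we have $\langle n \otimes v^+, \gamma(v)\rangle \geq 0$.

The natural transformation $\gamma$ can also be seen as a natural transformation $T \Rightarrow \mathrm{End}(D_0)$ or $T \Rightarrow \mathrm{End}(D_0^*)$. Equations (4) simply give the usual computational rules for spinors and cospinors in a functorial setting. Thus, for every $SM$ and every $p \in D_0M$, $q \in D_0^*M$ we have:

$$p^{++} = p = p^{cc}, \qquad p^{c+} = -p^{+c},$$
$$\langle p^+, q^+\rangle = \overline{\langle q, p\rangle} = \langle q^c, p^c\rangle,$$
$$(\gamma_\mu p)^+ = p^+\gamma_\mu, \qquad (\gamma_\mu p)^c = -\gamma_\mu p^c,$$
$$\gamma_\mu\gamma_\nu + \gamma_\nu\gamma_\mu = 2g_{\mu\nu}I, \qquad \nabla_a\gamma_b \equiv 0,$$

where we have dropped the subscript $SM$ to lighten the notation.

Proof. The canonical pairing $\langle\,,\rangle: D_0^* \otimes D_0 \Rightarrow \Lambda^0_C$ on $SM$ is given by $\langle[E, w^*], [E, z]\rangle = \langle w, z\rangle$, where the right-hand side is the standard Hermitean inner product on $\mathbb{C}^4$. Note that this is well-defined, because we can always get the same $E \in SM$ on the left-hand side by a suitable action of $\mathrm{Spin}^0_{1,3}$. The components of the natural equivalences ${}^+$ and ${}^c$ on each $SM$ are defined using the matrices $A_0$ and $C_0$ of Theorem 3.5 and their properties:

$$[E, z]^c := [E, C_0^{-1}\bar{z}], \qquad [E, w^*]^c := [E, \bar{w}^*C_0],$$
$$[E, z]^+ := [E, z^*A_0], \qquad [E, w^*]^+ := [E, A_0^{-1}w].$$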


These are well-defined isomorphisms in VBund$_R$ and they give rise to natural equivalences satisfying the first two lines of Eq. (4).

Now fix $E \in SM$, let $e_a$ be the orthonormal basis $(e_0, \ldots, e_3) = \pi(E)$ of $T_{p(E)}M$, where $\pi: SM \to FM$ is the spin frame projection, and let $e^a$ be the dual basis of $T^*_{p(E)}M$. We define the component of the natural transformation $\gamma$ on $SM$ to be $\gamma([E, z]) := e^a \otimes [E, \gamma_az]$. This is well-defined, because a different section $E' := R_SE$ gives rise to the frame $e'_a = e_b\Lambda^b{}_a(S)$ and the dual frame $(e')^a = \Lambda^a{}_b(S^{-1})e^b$ and on the other hand $\pi_0(S^{-1})\gamma_a\pi_0(S) = \gamma_b\Lambda^b{}_a(S^{-1})$ by definition of $\Lambda$ (Proposition 2.6). $\gamma$ is indeed a morphism in VBund$_C$ and gives rise to a natural transformation. The third line of Eq. (4) follows again from the properties of $A$ and $C$ (see Theorem 3.5):

$$\gamma([E, z]^c) = e^a \otimes [E, \gamma_aC_0^{-1}\bar{z}] = -e^a \otimes [E, C_0^{-1}\overline{\gamma_az}] = -(\gamma([E, z]))^c,$$
$$\gamma^*([E, z]^+) = e^a \otimes [E, z^*A_0\gamma_a] = e^a \otimes [E, z^*\gamma_a^*A_0] = (\gamma([E, z]))^+$$

and similarly on cospinors. Also,

$$\nabla_b\gamma_a = \sigma_b\gamma_a - \gamma_a\sigma_b - \Gamma^c{}_{ba}\gamma_c = \frac{1}{4}\Gamma^c{}_{bd}(\gamma_c\gamma^d\gamma_a - \gamma_a\gamma_c\gamma^d) - \Gamma^c{}_{ba}\gamma_c$$
$$= \frac{1}{4}\Gamma^c{}_{bd}(\gamma_c\{\gamma^d, \gamma_a\} - \{\gamma_a, \gamma_c\}\gamma^d - 4\delta^d_a\gamma_c) = -\frac{1}{2}\Gamma^c{}_{bd}(\delta^d_a\gamma_c + \eta_{ac}\gamma^d) = 0.$$

Finally, for every object $SM$, every future pointing tangent vector $n \in TM$ and every $v \in D_0M$ we have $\langle n \otimes v^+, \gamma(v)\rangle = \langle v^+, An^a\gamma_av\rangle \geq 0$ again by Theorem 3.5.

In terms of the Christoffel symbols $\Gamma^\rho{}_{\mu\nu}$, the frame $e^a_\rho$ and representing $g_a$ on $D_0M$ using the $\mathrm{End}(D_0M)$-valued one-forms $\gamma$, the connection one-forms of the spin connection can be expressed as^m

$$\sigma_b := \frac{1}{4}\Gamma^a{}_{bc}\gamma_a\gamma^c, \qquad \Gamma^a{}_{bc} = -e^\rho_c(e^\sigma_b\partial_\sigma e^a_\rho) + e^a_\rho e^\mu_b e^\nu_c\Gamma^\rho{}_{\mu\nu}. \qquad (5)$$

The Dirac operator is defined on spinors and cospinors by $\not\nabla_{SM} := \gamma^a\nabla_a$. This defines natural transformations $\not\nabla: C_0^\infty \circ D_0 \Rightarrow C_0^\infty \circ D_0$, respectively $\not\nabla: C_0^\infty \circ D_0^* \Rightarrow C_0^\infty \circ D_0^*$. The intertwining relations of the adjoint and charge conjugation with the Dirac operator follow from their intertwining with $\gamma$ in Theorem 3.6:

Proposition 3.7. $\not\nabla \circ {}^+ = {}^+ \circ \not\nabla$, $\quad \not\nabla \circ {}^c = -1 \circ {}^c \circ \not\nabla$.

m Note the sign error in [6, 7].


Proof. Recall that ${}^+$ and ${}^c$ can be defined pointwise on test-sections. Hence, on any object $SM$:

$$(\not\nabla v)^c = ((\partial_av - v\sigma_a)\gamma^a)^c = (\partial_a\bar{v} - \bar{v}\bar{\sigma}_a)\bar{\gamma}^aC = -(\partial_a(\bar{v}C) - \bar{v}C\sigma_a)\gamma^a = -\not\nabla(\bar{v}C) = -\not\nabla v^c,$$
$$(\not\nabla u)^+ = (\gamma^a(\partial_au + \sigma_au))^+ = (\partial_au^* + u^*\sigma_a^*)(\gamma^a)^*A = (\partial_a(u^*A) - u^*A\sigma_a)\gamma^a = \not\nabla(u^*A) = \not\nabla u^+,$$

where the minus sign in the last line appears because the order of the two factors of $\gamma$ in the expression for $\sigma_a$ needs to be changed. It follows that $(\not\nabla v)^+ = (\not\nabla v^{++})^+ = (\not\nabla v^+)^{++} = \not\nabla v^+$ and $(\not\nabla u)^c = (\not\nabla u^+)^{+c} = -(\not\nabla u^+)^{c+} = (\not\nabla u^{+c})^+ = -(\not\nabla u^{c+})^+ = -\not\nabla u^c$.

Remark 3.8. A change in the sign convention, $\tilde{\eta} := -\eta$, has no physical consequences. In fact, this simply gives rise to $D \simeq Cl_{3,1}$ as the Dirac algebra, but since $Cl^0_{3,1} = Cl^0_{1,3}$ nothing changes in the representation^n of the group $\mathrm{Spin}^0_{1,3} = \mathrm{Spin}^0_{3,1}$. To accommodate this change one can set $\tilde{\gamma}_a := i\gamma_a$ in Eq. (1), which yields the same Dirac algebra and other constructions (although we do get signs for all covectors when raising or lowering indices with $\tilde{\eta}$). This also implies that one should drop the factor $i$ in front of the Dirac operator in the Dirac equation (6) below, which ensures that $P_cP = PP_c$ will still be a wave operator. We can also keep the same matrices $A$, $C$, which now must satisfy the relations:

$$-\tilde{\gamma}_a^* = A\tilde{\gamma}_aA^{-1}, \qquad \bar{\tilde{\gamma}}_a = C\tilde{\gamma}_aC^{-1}.$$

The spinor and cospinor bundle and the adjoint and charge conjugation maps then remain the same and all the relations between these operations and the Dirac operator remain valid.

n Notice that a complex irreducible representation of $Cl_{1,3}$ extends to an irreducible representation of $M(4,\mathbb{C})$ and therefore also gives a complex irreducible representation of $Cl_{3,1}$ and vice versa. The standard Clifford algebra isomorphism $Cl_{3,1} \simeq M(4,\mathbb{R})$ appears if and only if the representation of $Cl_{1,3}$ is a Majorana representation, i.e. if $\bar{\gamma}_a = -\gamma_a$. In that case we also find (see, e.g., [12, p. 332]) $\mathrm{Pin}_{3,1} \simeq \{S \in M(4,\mathbb{R}) \mid \det S = 1,\ \forall v \in M_0:\ SvS^{-1} \in M_0\} \neq \mathrm{Pin}_{1,3}$.

3.3. The Dirac equation and its fundamental solutions

The Dirac equation on spinor and cospinor fields, respectively, on a spin spacetime $SM$ is

$$(-i\not\nabla + m)u = 0, \qquad (i\not\nabla + m)v = 0, \qquad (6)$$

where the constant $m \geq 0$ is to be interpreted as the mass of the field. These equations can be derived as the Euler–Lagrange equations from the action $S_D := \int L_D$


with the Lagrangian densityo LD := u+ , (−i∇ / + m)udvolg

(7)

by varying with respect to u and u+ , viewed as independent ﬁelds. The canonical momentum of the ﬁeld u on a Cauchy surface C with future pointing normal vector ﬁeld n is deﬁned as δSD 1 = −iψ + (x)n / (x). π(x) := µ ∇ ψ(x)) δ(n −det g(x) µ

(8)

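We will write $P := -i\slashed{\nabla} + m$ for the operator on spinors and $P_c := i\slashed{\nabla} + m$ for the operator on cospinors. These are components of natural transformations $P: C_0^\infty \circ D_0 \Rightarrow C_0^\infty \circ D_0$, $P: C^\infty \circ D_0 \Rightarrow C^\infty \circ D_0$ and $P_c: C_0^\infty \circ D_0^* \Rightarrow C_0^\infty \circ D_0^*$, $P_c: C^\infty \circ D_0^* \Rightarrow C^\infty \circ D_0^*$, which we denote by the same symbol. We then have by Proposition 3.7:
\[ P \circ {}^c = {}^c \circ P, \qquad P_c \circ {}^+ = {}^+ \circ P, \qquad P_c \circ {}^c = {}^c \circ P_c, \qquad P \circ {}^+ = {}^+ \circ P_c, \tag{9} \]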

i.e. if a spinor field $u$ is a solution to the Dirac equation, then so are $u^+$ and $u^c$. (The adjoint and charge conjugation of $u$ are defined pointwise.) For a distribution $v$ on $D_0 M$ we define the transpose $P^*$ by $\langle P^* v, u\rangle := \langle v, Pu\rangle$ and similarly for $P_c$. In this way the transposes give rise to natural transformations $P^*: \mathrm{Distr} \circ D_0 \Rightarrow \mathrm{Distr} \circ D_0$ and $P_c^*: \mathrm{Distr} \circ D_0^* \Rightarrow \mathrm{Distr} \circ D_0^*$.

Lemma 3.9. Let $\iota: C^\infty \circ D_0^* \Rightarrow \mathrm{Distr} \circ D_0$ and $\iota: C^\infty \circ D_0 \Rightarrow \mathrm{Distr} \circ D_0^*$ be the canonical natural transformations (see the end of Sec. 2.2). Then $P^* \circ \iota = \iota \circ P_c$ and $P_c^* \circ \iota = \iota \circ P$.

Proof. This follows from the fact that for each object $SM$
\[ \int_M \langle u, \slashed{\nabla} v\rangle\, d\mathrm{vol}_g = -\int_M \langle \slashed{\nabla} u, v\rangle\, d\mathrm{vol}_g \]
if at least one of $u \in C^\infty(D_0 M)$ and $v \in C^\infty(D_0^* M)$ is compactly supported. This in turn follows from $\langle \slashed{\nabla} v, u\rangle + \langle v, \slashed{\nabla} u\rangle = \nabla_a \langle v, \gamma^a u\rangle$ and Gauss' law.

One can find unique advanced and retarded fundamental solutions for the Dirac equation, both for spinors and cospinors [6, 24]:

Theorem 3.10. There are unique natural transformations $S^\pm: C_0^\infty \circ D_0 \Rightarrow C^\infty \circ D_0$ and $S_c^\pm: C_0^\infty \circ D_0^* \Rightarrow C^\infty \circ D_0^*$ such that $S^\pm \circ P = P \circ S^\pm = \kappa$, $S_c^\pm \circ P_c = P_c \circ S_c^\pm = \kappa$ and such that for each $u \in C_0^\infty(D_0 M)$, $v \in C_0^\infty(D_0^* M)$ we have
\[ \mathrm{supp}(S^\pm u) \subset J^\pm(\mathrm{supp}(u)), \qquad \mathrm{supp}(S_c^\pm v) \subset J^\pm(\mathrm{supp}(v)). \]
Moreover,
\[ S^\pm \circ {}^c = {}^c \circ S^\pm, \qquad S_c^\pm \circ {}^c = {}^c \circ S_c^\pm, \qquad S_c^\pm \circ {}^+ = {}^+ \circ S^\pm, \qquad S^\pm \circ {}^+ = {}^+ \circ S_c^\pm, \]
\[ \int_M \circ\, \langle\,,\rangle \circ (1 \otimes S^\pm) = \int_M \circ\, \langle\,,\rangle \circ (S_c^\mp \otimes 1). \]

Footnote o: The Lagrangian is a natural transformation from the functor $J_1 D_0$, which assigns to each spin spacetime $SM$ the first-order jet bundle $J_1 D_0 M$ of the spinor bundle $D_0 M$, to the functor $|\Lambda^n|$ of densities. A component of this natural transformation covers the identity morphism of $SM$ and is only a morphism in Bund, not in $\mathrm{VBund}_{\mathbb{R}}$, because it is not linear.

Proof. The components of $S^\pm$ and $S_c^\pm$ are the advanced ($-$) and retarded ($+$) fundamental solutions for $P$ and $P_c$, which are given by $S^\pm := (i\slashed{\nabla} + m)E^\pm$ and $S_c^\pm := (-i\slashed{\nabla} + m)E^\pm$ respectively, where $E^\pm$ are the unique advanced and retarded fundamental solutions for the normally hyperbolic operator $(i\slashed{\nabla} + m)(-i\slashed{\nabla} + m) = (-i\slashed{\nabla} + m)(i\slashed{\nabla} + m) = \slashed{\nabla}^2 + m^2$. We refer to [6, Theorem 2.1] for a detailed proof of the existence and uniqueness of these operators (see also [16] for the existence and uniqueness of $E^\pm$).

The naturality of $S^\pm$ and $S_c^\pm$ follows from their uniqueness and the naturality of $P$ and $P_c$. In detail: for every morphism $\chi: SM_1 \to SM_2$ and every $f \in C_0^\infty(D_0 M_1)$ the unique smooth solution to $Pu = \chi_* f$ on $M_2$ with $\mathrm{supp}(u) \subset J^\pm(\mathrm{supp}(\chi_* f))$ pulls back to a solution $v := \chi^* u$ of $Pv = f$ on $M_1$ with $\mathrm{supp}(v) \subset J^\pm(\mathrm{supp}(f))$. By uniqueness we must then have $u = S^\pm \chi_* f$ and $\chi^* u = S^\pm f$, i.e. $\chi^* \circ S^\pm \circ \chi_* = S^\pm$. The same holds for cospinors. The commutation of $S^\pm$ and $S_c^\pm$ with charge conjugation and adjoints follows from Eq. (9).

For arbitrary $u \in C_0^\infty(D_0 M)$ and $v \in C_0^\infty(D_0^* M)$ we can find a $\phi \in C_0^\infty(M)$ which is identically one on the compact set $\mathrm{supp}(S^\pm u) \cap \mathrm{supp}(S_c^\mp v)$. We then compute:
\[ \int_M \langle v, S^\pm u\rangle = \int_M \langle P_c S_c^\mp v, \phi S^\pm u\rangle = \int_M \langle S_c^\mp v, P\phi S^\pm u\rangle = \int_M \langle S_c^\mp v, \phi P S^\pm u\rangle = \int_M \langle S_c^\mp v, u\rangle, \]
which proves the last claim.

We define the advanced-minus-retarded fundamental solutions $S := S^- - S^+$ and $S_c := S_c^- - S_c^+$, which are natural transformations $S: C_0^\infty \circ D_0 \Rightarrow C^\infty \circ D_0$ and $S_c: C_0^\infty \circ D_0^* \Rightarrow C^\infty \circ D_0^*$ respectively.

3.4. The non-uniqueness of the functorial Dirac structure

We have seen that the (standard) structure of Dirac spinors and cospinors, adjoints, charge conjugation and the Dirac operator is entirely determined by the functor $D_0$ and the natural equivalences ${}^+$, ${}^c$ and $\gamma$. We formalise this with a definition:

Definition 3.11. By a Dirac structure $D := (D, {}^+, {}^c, \gamma)$ we mean a locally covariant spinor bundle $D$ with a dual bundle $D^*$, natural equivalences ${}^+: D \Leftrightarrow D^*$, ${}^c: D \Leftrightarrow D$,


and ${}^c: D^* \Leftrightarrow D^*$ in $\mathrm{VBund}_{\mathbb{R}}$ and a natural transformation $\gamma: D \Rightarrow T^* \otimes D$ in $\mathrm{VBund}_{\mathbb{C}}$, all of whose components cover the identity morphism and satisfying the relations (4) and $\langle \gamma_{SM}(v^+, v), n\rangle \ge 0$ for every time-like future pointing vector $n \in TM$. We call $D_0 := (D_0, {}^+, {}^c, \gamma)$ of Theorem 3.6 the standard Dirac structure. The category DStruc has all Dirac structures as objects and its morphisms $t: D_1 \to D_2$ are all natural transformations $t: D_1 \Rightarrow D_2$ whose components are injective morphisms covering the identity morphism and intertwining the adjoints, charge conjugation and $\gamma$ as follows:
\[ {}^{+_2} \circ t = t \circ {}^{+_1}, \qquad {}^{c_2} \circ t = t \circ {}^{c_1}, \qquad \gamma_2 \circ (t \otimes t) = \gamma_1. \]

For each Dirac structure, one can perform the constructions of Sec. 3.3. Because the Dirac algebra $D$ has a unique irreducible complex representation one might expect that the category DStruc admits a corresponding unique initial object, perhaps up to isomorphism. This is an object from which there exists a morphism into any other object. However, as we will explain in this section there is a certain cohomological obstruction of the category SSpac involved. We will first consider the standard Dirac structure, which would be a good candidate for an initial object, and prove the following weaker property:

Proposition 3.12. Any morphism $t$ from a Dirac structure $D$ to the standard Dirac structure $D_0$ is an isomorphism.

Proof. Let $t: D \to D_0$ be a morphism. By the injectivity of the components of $t: D \Rightarrow D_0$ we see that the complex dimension of the fiber of $DM$ is at most four. On the other hand, the vector bundles $DM$ are modules for the Dirac algebra represented by $\gamma$. Because this algebra is simple, and because Eqs. (4) exclude the trivial representation, we find that $DM$ must have complex dimension at least four. Therefore, $t: D \Rightarrow D_0$ must be a natural equivalence and it follows that $t: D \to D_0$ is an isomorphism.

Corollary 3.13. If we construct a Dirac structure $D_\pi$ analogous to $D_0$, but using a different representation $\pi$ and matrices $A$, $C$, then $D_\pi$ is isomorphic to $D_0$.

Proof. Because we use the same representation on all spacetimes we can construct a natural equivalence $t: D_\pi \Leftrightarrow D_0$ whose components are of the form $t_{SM}([E, z]) := [E, Lz]$ for some $L \in GL(4, \mathbb{C})$ which is independent of $SM$ (cf. Theorem 3.5).

Corollary 3.14. If $D := (D_0, {}^{+_1}, {}^{c_1}, \gamma')$ is any Dirac structure with the standard locally covariant Dirac spinor bundle $D_0$, then $D$ is isomorphic to the standard Dirac structure $D_0$.

Proof. At each point $x$ in each object $SM$ we can view $\gamma'_a$ as matrices that represent the Dirac algebra in a representation $\pi$. Using the Fundamental Theorem 2.2, we write $\gamma'_a = L\gamma_a L^{-1}$ for some $L(x) \in GL(4, \mathbb{C})$. As $\gamma'_a$ is well-defined on $D_0$ we must have $\pi_0(S)\gamma'_a\pi_0(S^{-1}) = \gamma'_b \Lambda^b{}_a(S)$ for all $S \in Spin^0_{1,3}$. This also holds for the matrices $\gamma$, so we conclude from the Fundamental Theorem that $\pi_0(S)L(x) = c(x)L(x)\pi_0(S)$, where $c \equiv 1$ by taking $S = I$. We can now define a natural equivalence $t: D_0 \Leftrightarrow D_0$ by $[E, z] \mapsto [E, L(p(E))z]$ such that $\gamma' \circ t = t \circ \gamma$. If we also define ${}^{+_2} := t \circ {}^{+_1} \circ t^{-1}$ and ${}^{c_2} := t \circ {}^{c_1} \circ t^{-1}$, then $D \Leftrightarrow (D_0, {}^{+_2}, {}^{c_2}, \gamma) \Leftrightarrow D_0$, where the last equivalence follows from the previous corollary.

In fact, the proof of Corollary 3.13 shows that for any $SM$ the quadruple $(DM, {}^+, {}^c, \gamma)$ is unique up to an isomorphism $t_{SM}$, if $DM$ has four-dimensional complex fibers. The isomorphism $t_{SM}$ itself, however, is only unique up to a sign. In other words, on each spin spacetime we find a discrete $\mathbb{Z}_2$-symmetry that preserves all physical relations (see footnote p).

Consider two Dirac structures $D$ and $D'$ whose locally covariant spinor bundles $D$ and $D'$ have four-dimensional complex fibers. Comparing the action of these functors on morphisms of SSpac one finds a diagram that commutes up to a sign. The existence of an initial object in the category DStruc then boils down to the question whether one can choose signs for all spin spacetimes $SM$ in such a way that all the diagrams commute. The answer is not at all obvious, but can be neatly formulated in terms of the first Stiefel–Whitney class of the category SSpac.

To explain this we will briefly recall the definition of cohomology groups for categories (cf. [26]). If $C$ is any category, we can first build a simplicial set from it called the nerve of the category (cf. [27]). A 0-simplex is simply an object of $C$, a 1-simplex is a morphism between two objects, a 2-simplex is a commutative triangle, etc. We will write $\Sigma_n$ for the set of all $n$-simplices. For $n \ge 1$ every $n$-simplex has $n + 1$ faces, which are described by maps $\partial_j: \Sigma_n \to \Sigma_{n-1}$, $0 \le j \le n$, which remove the $j$th vertex from the diagram.
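The nerve and its face maps can be made concrete on a toy example. The following is a minimal sketch, assuming a small poset category (objects 0, 1, 2, 3 with a unique morphism a -> b whenever a <= b) rather than SSpac itself; the function names are illustrative only. It checks the simplicial identity, which holds for the face maps of any nerve.

```python
from itertools import combinations_with_replacement


def simplices(objects, n):
    """All n-simplices of the nerve of the poset category on `objects`.

    In a poset there is exactly one morphism a -> b when a <= b, so a chain
    of n composable morphisms is the same as a non-decreasing (n+1)-tuple.
    """
    return list(combinations_with_replacement(sorted(objects), n + 1))


def face(j, simplex):
    """The face map d_j: remove the j-th vertex from the simplex."""
    if not 0 <= j < len(simplex):
        raise ValueError("face index out of range")
    return simplex[:j] + simplex[j + 1:]


def simplicial_identity_holds(simplex):
    """Check d_i d_j = d_{j-1} d_i for all i < j on one simplex."""
    n = len(simplex) - 1
    return all(
        face(i, face(j, simplex)) == face(j - 1, face(i, simplex))
        for j in range(n + 1)
        for i in range(j)
    )


objects = [0, 1, 2, 3]
# Every 2-simplex (a commutative triangle) has 3 faces, each a 1-simplex.
assert all(len(face(j, s)) == 2 for s in simplices(objects, 2) for j in range(3))
assert all(simplicial_identity_holds(s) for s in simplices(objects, 3))
```

Identity morphisms are included here, so degenerate simplices such as (1, 1, 2) also appear; this does not affect the identity being checked.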
To find the cohomology of $C$ with values in an Abelian group $G$ (see footnote q), we define an $n$-cochain with values in $G$ to be a map $v: \Sigma_n \to G$. We denote the set of $n$-cochains with values in $G$ by $C^n(G)$ and we define the coboundary map $d: C^n(G) \to C^{n+1}(G)$ by
\[ dv(s) := \sum_{j=0}^{n+1} (-1)^j v(\partial_j s), \qquad s \in \Sigma_{n+1}, \]
where we have written the group operation of $G$ additively. One checks that $d^2 = 0$ and defines $v$ to be closed iff $dv = 0$ and exact iff $v = dt$ for some $(n-1)$-cochain $t$. The sets of closed and exact $n$-cochains are denoted by $B^n(G)$ and $Z^n(G)$, respectively. They inherit an Abelian group structure from $G$ and because $Z^n(G) \subset B^n(G)$ is necessarily normal one can define the $n$th cohomology group as the quotient $H^n(G) := B^n(G)/Z^n(G)$.

Footnote p: This may be compared to [25], who use complex spinor structures and then find a local (gauge) symmetry instead of our more restricted global symmetries.

Footnote q: [26] also considers the non-Abelian case, which is much more involved.

Now let us return to the study of Dirac structures. Suppose that $D$ and $D'$ both have four-dimensional complex fibers. Without loss of generality, we may assume that both Dirac structures coincide on each spin spacetime, but the action of their locally covariant spinor bundles on a morphism $\chi$ agrees only up to a sign $v(\chi) \in \{\pm 1\}$. We can view $v: \chi \mapsto v(\chi)$ as a 1-cochain on the category SSpac with values in $\mathbb{Z}_2 = \{0, 1\}$ (where 0 corresponds to $+1$ and 1 to $-1$). Notice that for a composition of morphisms $\chi = \chi_1 \circ \chi_2$ we find $v(\chi) = v(\chi_1) + v(\chi_2)$ in $\mathbb{Z}_2$, because the Dirac structures are both functorial. In cohomological terms this means precisely that $dv = 0$. If there is a natural equivalence $t: D \Leftrightarrow D'$, then the components $t_{SM}$ are automorphisms of the Dirac structure at each $SM$, i.e. $t_{SM} = \pm 1$, that compensate for all the minus signs in $v$. If we view $t$ as a 0-cochain with values in $\mathbb{Z}_2$, this means exactly that $v = dt$. So we have proved:

Theorem 3.15. The number of inequivalent Dirac structures whose locally covariant spinor bundles have four-dimensional complex fibers equals the number of first Stiefel–Whitney classes of the category SSpac, i.e. the number of elements in $H^1(\mathbb{Z}_2)$.

Remark 3.16. For scalar and vector fields the problem above can be avoided in a natural way. Taking $L_+^\uparrow$ in the defining (four-vector) representation, the vector bundle associated to $F_+^\uparrow M$ is just the tangent bundle $TM$. A morphism in Spac determines a unique morphism on the tangent bundle, so no topological obstructions occur. Similarly for the scalar field, where one uses the trivial one-dimensional representation of $L_+^\uparrow$, whose associated vector bundle is $\Lambda^0(M) = M \times \mathbb{R}$. Again a morphism in $\mathrm{Man}^n$ automatically determines a unique morphism on these associated vector bundles, now by the requirement that the volume element is preserved.
In general one is dealing with representations of $Spin^0_{1,3}$ and associates to each morphism in SSpac an intertwining operator between such representations. For the associated vector bundles of $SM$, the physical requirements that we imposed on the bundle morphisms, concerning the adjoint and charge conjugation maps and $\gamma$, reduce the intertwiners exactly to a choice of lifting $L_+^\uparrow$ to its double cover. In this way it leads to the same first Stiefel–Whitney class that characterizes the number of spin structures on a manifold. For the general case it is expected that one needs a non-Abelian cohomology theory to quantify the obstruction for finding initial objects.

4. The Locally Covariant Quantum Dirac Field

After our discussion of the classical Dirac field in Sec. 3 we now turn to the quantum Dirac field, its construction, its Hadamard states and its relative Cauchy evolution.


4.1. Quantization of the free Dirac field

First, we will quantize the free Dirac field in a generally covariant way and establish some of its properties. For this purpose we also present the main ideas of locally covariant quantum field theory as introduced in [2] (see also [5]).

In the following, any quantum physical system will be described by a topological $*$-algebra $A$ with a unit $I$, whose self-adjoint elements are the observables of the system. An injective and continuous $*$-homomorphism expresses the notion of a subsystem, whereas a state is described by a normalized and positive continuous linear functional $\omega$, i.e. $\omega(A^*A) \ge 0$ for all $A \in A$ and $\omega(I) = 1$. The state space of $A$ is the set of all states and is denoted by $A^{*+}_1$. Every state gives rise to a GNS-representation $\pi_\omega$ (see [28, Theorem 8.6.2]), which is characterized uniquely, up to unitary equivalence, by the GNS-quadruple $(\pi_\omega, H_\omega, \Omega_\omega, D_\omega)$. Here $H_\omega$ is the Hilbert space on which $\pi_\omega(A)$ acts as (possibly unbounded) operators with the dense, invariant domain $D_\omega := \pi_\omega(A)\Omega_\omega$. The vector $\Omega_\omega$ is cyclic and satisfies $\omega(A) = \langle \Omega_\omega, \pi_\omega(A)\Omega_\omega\rangle$ for all $A \in A$. The collection of all systems forms a category TAlg:

Definition 4.1. The category TAlg has as its objects all unital topological $*$-algebras $A$ and as its morphisms all continuous and injective $*$-homomorphisms $\alpha$ such that $\alpha(I) = I$. A locally covariant quantum field theory is a (covariant) functor $A: \mathrm{SSpac} \to \mathrm{TAlg}$, written as $SM \mapsto A_{SM}$, $\chi \mapsto \alpha_\chi$. A locally covariant quantum field theory $A$ is called causal if and only if any pair of morphisms $\psi_i: SM_i \to SM$, $i = 1, 2$, such that $\psi_1(M_1) \subset (\psi_2(M_2))^\perp$ in $M$ yields $[\alpha_{\psi_1}(A_{SM_1}), \alpha_{\psi_2}(A_{SM_2})] = \{0\}$ in $A_{SM}$. A locally covariant quantum field theory $A$ satisfies the time-slice axiom iff for all morphisms $\psi: SM_1 \to SM_2$ such that $\psi(M_1)$ contains a Cauchy surface for $M_2$ we have $\alpha_\psi(A_{SM_1}) = A_{SM_2}$.

Notice that the condition $\psi_1(M_1) \subset (\psi_2(M_2))^\perp$ is symmetric in $i = 1, 2$.
The causality condition formulates how the quantum physical system interplays with the classical gravitational background field, whereas the time-slice axiom expresses the existence of a causal dynamical law.

We now fix a choice of Dirac structure $D := (D, {}^+, {}^c, \gamma)$, in order to turn the free Dirac field into a locally covariant field theory. Because we want to impose the canonical anti-commutation relations it will also be convenient to quantize spinor and cospinor fields simultaneously by introducing the following terminology:

Definition 4.2. The locally covariant double spinor bundle is the covariant functor $D \oplus D^*$. We define the following natural equivalences and natural transformations on this bundle, indicated by their components at $SM$:
\[ (p \oplus q)^c := p^c \oplus q^c, \qquad \gamma_\mu(p \oplus q) := (\gamma_\mu p) \oplus (\gamma_\mu q), \qquad (p \oplus q)^+ := q^+ \oplus p^+, \]
\[ \langle p \oplus q, p' \oplus q'\rangle := \langle p^+, p'\rangle + \langle q', q^+\rangle, \qquad \tau(p \oplus q) := p \oplus (-q). \]
A double spinor (field) is an element of $C^\infty(DM \oplus D^*M)$. A double test-spinor (field) is an element of $C_0^\infty(DM \oplus D^*M)$. The adjoint, charge conjugation and other operations are defined pointwise. We also define the operator $P := P \oplus P_c$, its advanced ($-$) and retarded ($+$) fundamental solutions $S^\pm(u \oplus v) := (S^\pm u) \oplus (S_c^\pm v)$ and $S := S^- - S^+$.

The exterior tensor product $\mathcal{V}_1 \boxtimes \mathcal{V}_2$ of two vector bundles $\mathcal{V}_i$ with fiber $V_i$ over manifolds $M_i$, $i = 1, 2$, is the vector bundle over $M_1 \times M_2$ whose fiber is $V_1 \otimes V_2$ and whose local trivializations are determined by $(O_1 \times O_2) \times (V_1 \otimes V_2)$, where $O_i \times V_i$ are local trivializations of $\mathcal{V}_i$.

The Dirac Borchers–Uhlmann algebra $F^0_{SM}$ on a spin spacetime $SM$ is the topological $*$-algebra
\[ F^0_{SM} := \bigoplus_{n=0}^{\infty} C_0^\infty((DM \oplus D^*M)^{\boxtimes n}), \]
where the direct sum is algebraic (i.e. only finitely many non-zero summands are allowed) and
(1) the product is given by continuous linear extension of $f_1 \cdot f_2 := f_1 \boxtimes f_2$,
(2) the $*$-operation is given by continuous antilinear extension of $(f_1 \boxtimes \cdots \boxtimes f_n)^* := f_n^+ \boxtimes \cdots \boxtimes f_1^+$,
(3) as a topological vector space $F^0_{SM}$ is the strict inductive limit $F^0_{SM} = \bigcup_{N=0}^{\infty} \bigoplus_{n=0}^{N} C_0^\infty((DM \oplus D^*M)^{\boxtimes n}|_{K_N^{\times n}})$, where $K_N$ is an exhausting and increasing sequence of compact subsets of $M$ and the test-section space of the restricted vector bundle $(DM \oplus D^*M)^{\boxtimes n}|_{K_N^{\times n}}$ is given the test-section topology.

The topology of $F^0_{SM}$ is such that a state is given by a sequence of $n$-point distributional sections $\omega_n$ of $(DM \oplus D^*M)^{\boxtimes n}$. A morphism $\chi: SM_1 \to SM_2$ in SSpac determines a unique morphism $\alpha_\chi: F^0_{SM_1} \to F^0_{SM_2}$ that is given by the algebraic and continuous extension of the morphism $DM_1 \oplus D^*M_1 \to DM_2 \oplus D^*M_2$ that is supplied by the functor $D$. Together with this map on morphisms the map $SM \mapsto F^0_{SM}$ becomes a locally covariant quantum field theory $F^0: \mathrm{SSpac} \to \mathrm{TAlg}$. Our next task will be to divide out the ideals that generate the dynamics and the canonical anti-commutation relations.

We define the natural transformation $(\,,): (C_0^\infty \circ (D \oplus D^*)) \otimes_{\mathbb{R}} (C_0^\infty \circ (D \oplus D^*)) \Rightarrow \mathbb{C}$ whose components are the sesquilinear forms:
\[ (f_1, f_2) := i\int_M \langle f_1, \tau S f_2\rangle. \]


Note that this is indeed a natural transformation, because it can be written as a composition of natural transformations including $\langle\,,\rangle$, $\int$, ${}^+$ and $\kappa$.

Lemma 4.3. On each object $SM$ the sesquilinear form $(\,,)$ is Hermitean, $(f_1, f_2) = \overline{(f_1^c, f_2^c)} = \overline{(f_2, f_1)}$, and there holds $(f_1^+, f_2^+) = (f_2, f_1)$. For any spacelike Cauchy surface $C \subset M$ with future pointing unit normal vector field $n^a$ we have
\[ (u_1 \oplus v_1, u_2 \oplus v_2) = \int_C \langle (Su_1)^+, \slashed{n}(Su_2)\rangle + \langle S_c v_2, \slashed{n}(S_c v_1)^+\rangle. \tag{10} \]

Proof. The symmetry properties follow straightforwardly from the computational rules of Theorems 3.6 and 3.10. For the last statement we also need a partial integration (see, e.g., [20, Eq. (B.2.26)] for Gauss' law) and we use the Dirac equation:
\begin{align*}
(u_1 \oplus v_1, u_2 \oplus v_2)
&= i\int_{J^+(C)} \left[\langle P_c S_c^- u_1^+, Su_2\rangle + \langle P_c S_c^- v_2, Sv_1^+\rangle\right]
 + i\int_{J^-(C)} \left[\langle P_c S_c^+ u_1^+, Su_2\rangle + \langle P_c S_c^+ v_2, Sv_1^+\rangle\right] \\
&= -\int_{J^+(C)} \left[\nabla_a\langle S_c^- u_1^+, \gamma^a Su_2\rangle + \nabla_a\langle S_c^- v_2, \gamma^a Sv_1^+\rangle\right]
 - \int_{J^-(C)} \left[\nabla_a\langle S_c^+ u_1^+, \gamma^a Su_2\rangle + \nabla_a\langle S_c^+ v_2, \gamma^a Sv_1^+\rangle\right] \\
&= \int_C \left[n_a\langle S_c^- u_1^+, \gamma^a Su_2\rangle + n_a\langle S_c^- v_2, \gamma^a Sv_1^+\rangle\right]
 - \int_C \left[n_a\langle S_c^+ u_1^+, \gamma^a Su_2\rangle + n_a\langle S_c^+ v_2, \gamma^a Sv_1^+\rangle\right] \\
&= \int_C \langle (Su_1)^+, \slashed{n}(Su_2)\rangle + \langle S_c v_2, \slashed{n}(S_c v_1)^+\rangle.
\end{align*}

From Eq. (10) we notice that $(\,,)$ is positive semi-definite and hence defines a (degenerate) inner product. We proceed by dividing $F^0_{SM}$ by the closed ideal $J_{SM}$ of $F^0_{SM}$ generated by all elements of the form $Pf$ or $f_1^+ \cdot f_2 + f_2 \cdot f_1^+ - (f_1, f_2)I$.

Theorem 4.4. The ideal $J_{SM}$ is a $*$-ideal and for any morphism $\chi: SM_1 \to SM_2$ we have $\alpha_\chi(J_{SM_1}) \subset J_{SM_2}$. We can define the locally covariant quantum field theory $F: \mathrm{SSpac} \to \mathrm{TAlg}$ which assigns to every spin spacetime $SM$ the $C^*$-algebra $F_{SM} := \overline{F^0_{SM}/J_{SM}}$, the completion of the quotient in its $C^*$-norm.

Proof. The elements that generate $J_{SM}$ are invariant under adjoints and under a morphism they are mapped to elements of the same form. This proves the first statement. It follows that the quotients $F^0_{SM}/J_{SM}$ are topological $*$-algebras and that a morphism $\alpha_\chi: F^0_{SM_1} \to F^0_{SM_2}$ descends to the quotients as a well-defined morphism. That each algebra $F^0_{SM}/J_{SM}$ has a $C^*$-norm follows from the fact that they are the inductive limits of finite-dimensional Clifford algebras ([29]). The morphisms on the quotients are necessarily continuous in the norm and therefore extend to morphisms on the $C^*$-algebras $F_{SM}$.
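The smallest instance of such a finite-dimensional Clifford algebra can be checked by hand. The sketch below assumes a single generator whose inner product with itself is a positive number c and represents it by a hypothetical 2x2 real matrix; the matrix and all function names are illustrative choices, not taken from [29]. It verifies nilpotency, the anti-commutation relation with the adjoint, and that the operator norm squared equals c.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    """Transpose; for real matrices this is the adjoint."""
    return [[a[j][i] for j in range(2)] for i in range(2)]


def generator(c):
    """Hypothetical 2x2 matrix standing in for one generator with (f, f) = c."""
    return [[0.0, math.sqrt(c)], [0.0, 0.0]]


def car_relations_hold(c):
    """Check psi^2 = 0 and psi* psi + psi psi* = c * I for the matrix above."""
    psi = generator(c)
    psi_star = transpose(psi)
    square = matmul(psi, psi)
    a, b = matmul(psi_star, psi), matmul(psi, psi_star)
    anti = [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]
    nilpotent = all(square[i][j] == 0.0 for i in range(2) for j in range(2))
    clifford = all(
        math.isclose(anti[i][j], c if i == j else 0.0, abs_tol=1e-12)
        for i in range(2) for j in range(2)
    )
    return nilpotent and clifford


def operator_norm(c):
    """Norm of the generator: psi* psi is diagonal, so take its largest entry."""
    psi = generator(c)
    gram = matmul(transpose(psi), psi)
    return math.sqrt(max(gram[0][0], gram[1][1]))
```

Because the norm is fixed by the algebraic relations alone, any representation of the same relations gives the same value, which is the mechanism behind the uniqueness of the $C^*$-norm in this one-generator case.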

Definition 4.5. A locally covariant quantum field in the locally covariant vector bundle $V$ for the locally covariant quantum field theory $A$ is a natural transformation $\Phi: C_0^\infty \circ V^* \Rightarrow f \circ A$, where we let $f: \mathrm{TAlg} \to \mathrm{TVec}$ be the forgetful functor.

We define the locally covariant quantum fields $B: D \oplus D^* \Rightarrow F$, $\psi: D^* \Rightarrow F$ and $\psi^+: D \Rightarrow F$ by $B_{SM}(f) := 0 \oplus f \oplus 0 \oplus \cdots + J_{SM}$, $\psi_{SM}(v) := B_{SM}(0 \oplus v)$ and $\psi^+_{SM}(u) := B_{SM}(u \oplus 0)$. That the latter really are locally covariant quantum fields is a consequence of

Proposition 4.6. The operator-valued maps $B_{SM}$, $\psi_{SM}$, $\psi^+_{SM}$ are $C^*$-algebra-valued distributions and:
(1) $P \circ \psi = 0$ and $P_c \circ \psi^+ = 0$,
(2) $\psi^+_{SM}(u) = \psi_{SM}(u^+)^*$,
(3) $\{\psi^+_{SM}(u), \psi_{SM}(v)\} = (v^+ \oplus 0, u \oplus 0)I = -i\int_M \langle v, Su\rangle I$ and the other anti-commutators vanish.

Proof. The first item is $P B_{SM}(f) = B_{SM}(P^* f) = B_{SM}(Pf) = 0$, where $P^*$ is the formal adjoint of $P$. The last two items follow from the definitions of $\psi_{SM}$ and $\psi^+_{SM}$ and the properties of $B_{SM}$ after a straightforward computation.

It remains to show that $\psi_{SM}$, $\psi^+_{SM}$ are $C^*$-algebra-valued distributions, because the result for $B_{SM}$ then follows. The $C^*$-subalgebra of $F_{SM}$ generated by $I$, $\psi_{SM}(v)$, $\psi_{SM}(v)^*$ is a Clifford algebra which is isomorphic to $M(2, \mathbb{C})$ and an explicit isomorphism is given by
\[ \psi_{SM}(v) \mapsto \begin{pmatrix} 0 & \sqrt{c} \\ 0 & 0 \end{pmatrix}, \qquad c = (0 \oplus v, 0 \oplus v) = -i\int_M \langle v, Sv^+\rangle > 0. \]
It follows that $\|\psi_{SM}(v)\| = \sqrt{c}$ is the operator norm of the corresponding matrix, i.e. (see footnote r)
\[ \|\psi_{SM}(v)\|^2 = -i\int_M \langle v, Sv^+\rangle\, d\mathrm{vol}_g. \]
In the test-spinor topology we then have continuous maps $v \mapsto v \oplus v^+ \mapsto -i\int_M \langle v, Sv^+\rangle$, from which it follows that $v \mapsto \psi_{SM}(v)$ is norm continuous, i.e. it is a $C^*$-algebra-valued distribution. The proof for $\psi^+_{SM}$ is analogous.

Footnote r: The factor 2 in [7, Remark 2, p. 340] seems to be erroneous.

Note that the last two conditions of Proposition 4.6 can also be formulated in terms of natural transformations, because the algebraic operations in $F_{SM}$ can be expressed as such. The theory $F$ is the quantized free Dirac field and $\psi$ ($\psi^+$) is the locally covariant Dirac (co)spinor field. Alternatively we could have used the algebras $F^0_{SM}/J_{SM}$ themselves instead of completing them to $C^*$-algebras.

To see that the anti-commutator is the canonical one (cf. [24]) we apply [6, Proposition 2.4(c)] which says that $S|_{C \times C} = -i\delta\slashed{n}$ for a Cauchy surface $C$ with future pointing normal vector field $n$. Comparing with Eq. (8) and using $\slashed{n}^2 = I$ we then find
\[ \{-i\psi^+_{SM}(\slashed{n}(x)), \psi_{SM}(y)\} = -\int_M \langle y, S\slashed{n}x\rangle I = i\delta(y, x)I \]
as expected.

So far our construction depends on the choice of a Dirac structure, although naturally equivalent Dirac structures yield naturally equivalent theories and quantum fields. The following theorem restricts attention to the observable algebra, dividing out the freedom of choice completely and yielding a unique theory, but for many purposes it is not convenient to use it directly because it lacks locally covariant Dirac (co)spinor fields.

Theorem 4.7. Let $B: \mathrm{SSpac} \to \mathrm{TAlg}$ be the locally covariant quantum field theory that assigns to each spin spacetime $SM$ the $C^*$-subalgebra of $F_{SM}$ generated by all even polynomials in elements $B(f)$, with the induced action on morphisms. For all Dirac structures with four-dimensional complex fibers the resulting theories $B$ are isomorphic.

Proof. The algebras $B_{SM}$ generated by the even polynomials are $C^*$-algebras. Morphisms respect evenness and so restrict to morphisms on $B$, making $B$ a well-defined locally covariant quantum field theory. Now consider two Dirac structures $D$ and $D_0$ with associated functors $F$, $B$ and $F_0$, $B_0$. If both Dirac structures have four-dimensional complex fibers, then we infer from the comment below Corollary 3.13 that there are $*$-isomorphisms $\alpha_{SM}: F_{SM} \to (F_0)_{SM}$ such that for any morphism $\chi: SM_1 \to SM_2$ we have $\alpha_{SM_2} \circ \alpha_\chi = \epsilon_\chi \cdot (\alpha_0)_\chi \circ \alpha_{SM_1}$, where the sign $\epsilon_\chi = \pm 1$ depends only on $\chi$. It follows from the evenness that the $\alpha_{SM}$ descend to $*$-isomorphisms $\alpha_{SM}: B_{SM} \to (B_0)_{SM}$ that intertwine with the morphisms. Hence, $B$ and $B_0$ are naturally equivalent.

Proposition 4.8. The locally covariant quantum field theory $B: \mathrm{SSpac} \to \mathrm{TAlg}$ of Theorem 4.7 is causal and satisfies the time-slice axiom.

Proof. Causality follows from the anti-commutation relations,
\begin{align*}
[B_{SM}(f_1)B_{SM}(f_2), B_{SM}(f_3)] &= B_{SM}(f_1)\{B_{SM}(f_2), B_{SM}(f_3)\} - \{B_{SM}(f_1), B_{SM}(f_3)\}B_{SM}(f_2) \\
&= (f_2, f_3)B_{SM}(f_1) - (f_1, f_3)B_{SM}(f_2),
\end{align*}
together with the support properties of $S$.
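The causality argument above rests on the purely algebraic identity [AB, C] = A{B, C} - {A, C}B, which holds in any associative algebra. A minimal numerical sketch, using random integer matrices as stand-ins for the field operators (an assumption made only for illustration; the helper names are not from the text):

```python
import random


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def commutator(a, b):
    return add(matmul(a, b), matmul(b, a), sign=-1)


def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))


def random_matrix(n, rng):
    """Integer entries keep all arithmetic exact."""
    return [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]


def identity_holds(a, b, c):
    """Check [AB, C] == A{B, C} - {A, C}B."""
    lhs = commutator(matmul(a, b), c)
    rhs = add(matmul(a, anticommutator(b, c)),
              matmul(anticommutator(a, c), b), sign=-1)
    return lhs == rhs


rng = random.Random(0)
A, B, C = (random_matrix(3, rng) for _ in range(3))
assert identity_holds(A, B, C)
```

When both anti-commutators on the right are multiples of the identity that vanish for spacelike separated supports, the commutator of the even element with the third field vanishes as well.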
For the time-slice axiom, we let $\chi: SM \to SM'$ be a morphism in SSpac, covering a morphism $\psi: M \to M'$ in Spac, such that $N := \psi(M) \subset M'$ contains a Cauchy surface $C \subset M'$. Then we can choose Cauchy surfaces $C^\pm \subset N$ such that $C^\pm \subset I^\pm(C)$ and a smooth partition of unity $\phi^+, \phi^-$ with $\mathrm{supp}\,\phi^\pm \subset J^\pm(C^\mp)$. Let $f \in C_0^\infty(DM' \oplus D^*M')$ and write
\[ f = P(S^+ f - \phi^+ Sf) + \tilde{f}, \tag{11} \]
where $\tilde{f} := P(\phi^+ Sf) = -P(\phi^- Sf)$ is supported in $J^-(C^+) \cap J^+(C^-) \subset N$ and $\phi^+ Sf - S^+ f$ has compact support. Hence, $B_{SM'}(f) = B_{SM'}(\tilde{f}) = \alpha_\chi(B_{SM}(\chi^*(\tilde{f})))$. Because the algebra $F_{SM'}$ is generated by such elements this shows that $\alpha_\chi$ is a $*$-isomorphism.

Remark 4.9. A Majorana spinor is a spinor $u$ such that $u = u^c$. In this case the adjoint is anti-Majorana: $u^{+c} = -u^{c+} = -u^+$. We call a double spinor $f = u \oplus v$ Majorana iff $u$ and $v^+$ are Majorana, which means that $f^c = \tau f$. Such spinors are sections of a subbundle of the Dirac spinor bundle, which can be described by a Majorana representation. Notice that every spinor is a unique complex linear combination of Majorana spinors.

To quantize Majorana spinors we note that $\langle h^c, f\rangle = \langle h^+, f^{c+}\rangle$. This leads us to define the charge conjugation on the quantized fields (see footnote s) by $\psi^c(v) := \psi^+(v^{c+})$ and $\psi^{+c}(u) := \psi(u^{c+})$, or equivalently $B^c(f) := B(f^{c+}) = B(f^c)^*$. We impose the Majorana condition $B^c(f) = B(\tau f)$ by dividing out the ideal generated by all elements of the form $B(f - \tau f^{c+})$. More precisely, if $H$ is the Hilbert space obtained from $C_0^\infty(DM \oplus D^*M)$ by dividing out the ideal of double spinors $f$ for which $(f, f) = 0$, then there is an orthogonal decomposition $H = H_+ \oplus H_-$, where the elements in $H_\pm$ satisfy $\tau f^{c+} = \pm f$. Indeed, every double spinor can be written as $f = f_+ + if_-$, where $f_\pm := \frac{1}{2}(f \pm \tau f^{c+})$ are in $H_\pm$ and the orthogonality follows from Lemma 4.3. For the $C^*$-algebraic quantization we then have $F = F_+ \otimes F_-$, where $F_-$ is the $C^*$-algebra of quantized Majorana spinors and $F_+$ the $C^*$-algebra of quantized anti-Majorana spinors (see [30, Sec. 5.2]). The generators $\psi(v)$ and $\psi^+(u)$ of $F_-$ satisfy the additional relation $\psi^c = \psi$ and $\psi^{+c} = -\psi^+$.

Footnote s: Our definition differs slightly from that of [13].

4.2. Hadamard states

After Radzikowski's result [31] that a state for a scalar field is of Hadamard form if and only if its wave front set has a certain form, several people set out to extend this result to the Dirac field, or more general quantum fields [32–34]. All three papers have provided an original contribution in their method of proof, but upon careful analysis they all have minor gaps. We feel that it is justified to comment on this here and to provide the necessary results to fill any remaining gaps.

The most general results are the most recent ones, due to Sahlmann and Verch [34], who set out to prove the equivalence of the Hadamard form of a state, defined in terms of the Hadamard parametrix, with a wave front set condition analogous to the scalar field case. One of the techniques used is the scaling limit, but the proof of their Proposition 2.8, which relates the wave front set of a distribution to that of its scaling limit, is in our opinion insufficient (see footnote w). In the Appendix, we prove a similar statement as Proposition A.2, thereby filling any gap in [34] and establishing the desired equivalence on a firm ground. For the Dirac field, Hollands has proved that this wave front set condition implies a specific form of the polarization set ([35, Theorem 4.1]).

The scaling limit result can also be used to find the wave front sets of the advanced and retarded fundamental solutions $E^\pm$ of normally hyperbolic operators on a globally hyperbolic spacetime, a result that we prove as Theorem A.5. Our proof is largely analogous to the work of Radzikowski and the outcome is in direct analogy to the results of Duistermaat and Hörmander [36] for the scalar case. To find the wave front sets of the fundamental solutions $S^\pm$ for the Dirac equation we use (and correct) an idea of [35].

Finally, we comment on the results by Kratzert [32], which use a spacetime deformation argument to compute the wave front set and polarization set of Hadamard states. This result has a gap, already identified in [34], concerning the case of points $(x, \xi; y, \xi')$ where either $\xi = 0$ or $\xi' = 0$, which prevents the propagation of the singularity from the original to the deformed spacetime. This gap can be avoided using either a propagation of Hadamard form result as in [34], or using the commutation or anti-commutation relations and the explicit form of $WF(E)$, respectively $WF(S)$. The latter argument, which appears to be implicit in Radzikowski's paper [31], works as follows: when $(x, \xi; y, 0) \in WF(\omega_2)$ then also $(y, 0; x, \xi) \in WF(\omega_2)$ by the (anti-)commutation relations and the fact that $WF(E)$ (or $WF(S)$) has no points with either entry equal to 0. Using the calculus of Hilbert-space-valued distributions, Theorem A.4, we then find that both $(x, \xi; x, -\xi) \in WF(\omega_2)$ and $(x, -\xi; x, \xi) \in WF(\omega_2)$. Because $\xi \neq 0$ (by definition the wave front set does not contain the zero covector) these points can both be propagated into a deformed spacetime, where $WF(\omega)$ is known to satisfy the required microlocal condition. This, however, leads to a contradiction, because $WF(\omega_2) \cap -WF(\omega_2) = \emptyset$ and hence $\xi = 0$. Therefore, $WF(\omega_2)$ cannot contain points with one of the covectors equal to 0.

After these historical notes we feel free to define the notion of Hadamard states directly in terms of a wave front set condition, rather than using the Hadamard parametrix. If $\omega$ is a state on $F_{SM}$ then we may consider the GNS-representation $(H_\omega, \pi_\omega, \Omega_\omega)$ associated to $\omega$ and the $H_\omega$-valued distribution on $DM \oplus D^*M$ defined by:
\[ v_\omega(f) := \pi_\omega(B_{SM}(f))\Omega_\omega. \]

Definition 4.10. A state $\omega$ on $F_{SM}$ is called Hadamard if and only if
\[ WF(v_\omega) = N^+ := \{(x, \xi) \in T^*M \mid \xi^2 = 0,\ \xi^\mu \text{ is future pointing or } 0\}. \]
A state $\omega$ on $B_{SM}$ is called Hadamard if and only if it can be extended to a Hadamard state on $F_{SM}$. The set of all Hadamard states on $B_{SM}$ will be denoted by $S_{SM}$.
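The fiber of the set N^+ over a point is a half light cone of covectors. As a minimal sketch, assuming flat Minkowski space with signature (+, -, -, -) and covectors with integer components so that the null condition can be tested exactly (both assumptions are for illustration only), membership in one fiber can be checked as follows:

```python
def minkowski_square(xi):
    """xi = (xi_0, xi_1, xi_2, xi_3); assumed signature (+, -, -, -)."""
    t, x, y, z = xi
    return t * t - x * x - y * y - z * z


def in_n_plus(xi):
    """True if xi is null and its time component is non-negative.

    A null covector with vanishing time component is the zero covector,
    so this covers 'future pointing or 0' as in the definition.
    """
    return minkowski_square(xi) == 0 and xi[0] >= 0


assert in_n_plus((5, 3, 4, 0))        # null and future pointing
assert not in_n_plus((-5, 3, 4, 0))   # null but past pointing
assert not in_n_plus((2, 1, 0, 0))    # time-like, hence not null
```

On a curved spacetime the same test would use the metric at the base point and the time orientation of the spacetime instead of the fixed quadratic form used here.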


Note that every state on $B_{SM}$ can be extended to $F_{SM}$, by the Hahn–Banach Theorem and Proposition 4.6. The Hadamard condition is independent of the choice of extension, because it depends solely on the two-point distribution as the following proposition shows (cf. [34], we give a short proof using the more advanced microlocal techniques developed in the Appendix).

Proposition 4.11. For a state $\omega$ on $F_{SM}$ the following conditions are equivalent:
(1) $\omega$ is Hadamard,
(2) $WF(v_\omega) \subset N^+$,
(3) the two-point distribution $\omega_2(f_1, f_2) := \omega(B_{SM}(f_1)B_{SM}(f_2))$ has
\[ WF(\omega_2) = \mathcal{C} := \{(x, -\xi; y, \xi') \in T^*M^{\times 2} \setminus Z \mid (x, \xi) \sim (y, \xi'),\ (x, \xi) \in N^+\}, \]
where $(x, \xi) \sim (y, \xi')$ if and only if there is an affinely parameterized light-like geodesic from $x$ to $y$ to which $\xi$, $\xi'$ are cotangent,
(4) there is a two-point distribution $w$ such that $\omega_2(f_1, f_2) = iw(Pf_1, f_2)$ and $WF(w) = \mathcal{C}$.

Proof. First, note that $\omega_2$ is a bidistribution on $DM \oplus D^*M$, because $B_{SM}$ is an $F_{SM}$-valued distribution and multiplication in $F_{SM}$ and $\omega$ are continuous. By Theorem A.4 the third statement implies the first, which trivially implies the second. To show that the second statement implies the third, we use the argument of [37, Proposition 6.1]. By Theorem A.4 we see that $WF(\omega_2) \subset N^- \times N^+$, where $N^- := -N^+$. Defining $\tilde{\omega}_2(f_1, f_2) := \omega_2(f_2, f_1)$ we find $WF(\tilde{\omega}_2) \cap WF(\omega_2) = \emptyset$. Now, $(\omega_2 + \tilde{\omega}_2)(f_1, f_2) = i\int_M \langle f_1, \tau Sf_2\rangle$, so $WF(\omega_2) \cup WF(\tilde{\omega}_2) = WF(S) = WF(E)$ by Proposition A.7 and hence $WF(\omega_2) = WF(E) \cap N^- \times N^+ = \mathcal{C}$ by Corollary A.6.

Now, assume that $\omega_2(f_1, f_2) = iw(Pf_1, f_2)$, where $WF(w) = \mathcal{C}$. Then $WF(\omega_2) = WF((P^* \otimes I)w) \subset WF(w) = \mathcal{C}$. It follows that $WF(v_\omega) \subset N^+$. For the converse we suppose that $\omega$ is Hadamard and we choose a smooth real-valued function $\phi^+$ on $M$ such that $\phi^+ \equiv 0$ to the past of some Cauchy surface $C_-$ and such that $\phi^- := 1 - \phi^+ \equiv 0$ to the future of another Cauchy surface $C_+$. We then define
\[ w(f_1, f_2) := -i\omega_2(\phi^+ S^- f_1 + \phi^- S^+ f_1, f_2). \]
Note that $w$ is a bidistribution which is well-defined, because $\phi^+ S^- f_1$ and $\phi^- S^+ f_1$ are compactly supported. By construction $iw(Pf_1, f_2) = \omega_2(f_1, f_2)$. We now estimate the wave front set of $w$ as follows. The wave front sets of $S^\pm$ are determined in Proposition A.7. Then we may apply [38, Theorems 8.2.9 and 8.2.13] (in combination with Eq. (17)) to estimate the wave front sets of the tensor products $\phi^\pm(x)S^\mp(x, y)\delta(x', y')$ and the compositions in $iw(x, x') = \sum_\pm \int \omega_2(y, y')\,(\phi^\pm(x)S^\mp(x, y)\delta(x', y'))$ respectively and, using $WF(\omega_2) = \mathcal{C}$, we find:
\[ WF(iw) \subset \bigcup_\pm WF(S^\mp \otimes \delta) \circ WF(\omega_2) \subset WF(\omega_2) = WF((P^* \otimes I)w) \subset WF(w), \]
i.e. $WF(w) = WF(\omega_2) = \mathcal{C}$.


The second characterization in Proposition 4.11 is especially useful, because it shows we do not need to compute the entire wave front set, as long as we can estimate it. Employing techniques similar to those above, one can use the anticommutation relations and the wave front set of $\omega_2$ to estimate the wave front sets of all higher n-point distributions [39], showing that a Hadamard state necessarily satisfies the microlocal spectrum condition (µSC) of [40], and it follows that the set of such states is closed under operations from the algebra. We formulate this and other properties of Hadamard states in the following

Proposition 4.12. The set $S_{SM}$ of all Hadamard states on $B_{SM}$ satisfies:
(1) $\alpha_\chi^*(S_{SM_1}) \subset S_{SM_2}$ for every morphism $\chi : SM_1 \to SM_2$,
(2) $S_{SM}$ is closed under operations from $B_{SM}$,
(3) $\alpha_\chi^*(S_{SM_1}) = S_{SM_2}$ for every morphism $\chi : SM_1 \to SM_2$ such that $\psi(M_1)$ contains a Cauchy surface of $M_2$.

Proof. The first property follows from Proposition 4.11 and the fact that wave front sets are local and geometric objects (cf. [38, Chap. 8]). The second property relies on the anti-commutation relations, which imply that the truncated n-point distributions are totally anti-symmetric (cf. [1, 39]). The final property follows from the second characterisation in Proposition 4.11, Eq. (17) in the Appendix, the equation of motion and the Propagation of Singularities Theorem for the wave front set, which in this case follows from the propagation of the polarization set [41].

One can also prove that the state spaces are locally physically equivalent [5] and that all quasi-free Hadamard states are locally quasi-equivalent [42]. Whether the latter remains true for all Hadamard states appears to be unknown. We conclude this section with the remark that the functor $S : SSpac \to TVec$ defined by $SM \mapsto S_{SM}$ and $\chi \mapsto \alpha_\chi^*$ (restricted to the relevant state space) is a locally covariant state space for the theory $B$ [2].

4.3. The relative Cauchy evolution of the Dirac field and the stress-energy-momentum-tensor

Now that we have a locally covariant free Dirac field at our disposal, we will investigate the idea of relative Cauchy evolution for this field and prove that it yields commutators with the stress-energy-momentum tensor. This result is completely analogous to the result for the free scalar field of [2]. Suppose that we have two objects $M_0 = (M, g_0, SM_0, p_0)$ and $M_g = (M, g, SM_g, p_g)$ in SSpac, where $M$ is the same in both cases and such that outside a compact set $K \subset M$ we have $g = g_0$, $SM_g = SM_0$ and $p_g = p_0$. Now let $N^\pm \subset M_0$ be causally convex open regions, each containing a Cauchy surface for $M_0$, such that $K$ lies to the future of $N^-$ (i.e. $K \subset J^+(N^-)\setminus N^-$ in $M_0$ and hence also in $M_g$) and to the past of $N^+$. We view $N^\pm$ as objects in SSpac and


consider the canonical morphisms $\iota_0^\pm : N^\pm \to M_0$ and $\iota_g^\pm : N^\pm \to M_g$. By the time-slice axiom, Proposition 4.8, these give rise to ∗-isomorphisms $\beta_0^\pm : B_{N^\pm} \to B_{M_0}$ and $\beta_g^\pm : B_{N^\pm} \to B_{M_g}$. We then define
$$\beta_g := \beta_0^+ \circ (\beta_g^+)^{-1} \circ \beta_g^- \circ (\beta_0^-)^{-1}.$$
The ∗-isomorphism $\beta_g : B_{M_0} \to B_{M_0}$ measures the change in an operator $A \in B_{N^-}$ as it evolves to $N^+$ in the metric $g$ instead of $g_0$.^t $\beta_g$ can be extended to a ∗-isomorphism of the algebra $F_{M_0}$, where we fix the signs for the isomorphisms between the spinor bundles involved by identifying the double spinor bundles over $N^\pm \subset M_0$ and $N^\pm \subset M_g$. It represents the relative Cauchy evolution of the free Dirac field.

We will want to compute the variation of the ∗-isomorphism $\beta_g$ as well as that of the action for the free Dirac field with respect to the metric $g$. For this purpose, we will suppose that the compact set $K \subset M$ has a contractible neighborhood $O$ which does not intersect either $N^\pm$. Let $\epsilon \mapsto g_\epsilon$ be a smooth curve from $[0, 1]$ into the space of Lorentzian metrics on $M$ starting at $g_0$ and such that $g_\epsilon = g_0$ outside $K$ for every $\epsilon$. The spin bundle $SM_\epsilon$ must be trivial over the contractible region $O$. If we assume it to be diffeomorphic to $SM_0$ outside $K$ we can simply take $SM_\epsilon = SM_0$ as a manifold and, choosing a fixed representation and matrices $A$, $C$, we obtain $D_\epsilon M = DM$. The deformation of the spin structure is contained entirely in the spin frame projection $\pi_\epsilon : SM_0 \to FM_\epsilon$. Let $E$ be a section of $SM_0$ over $O$ and set $(e_\epsilon)_a := \pi_\epsilon(E)$. We require that $e_\epsilon$ varies smoothly with $\epsilon$ and that $(e_\epsilon)_a = (e_0)_a$ outside $K$. To show that projections $\pi_\epsilon$ with these properties exist we can apply the Gram–Schmidt orthonormalisation procedure to $(e_0)_a$ for all $\epsilon$ simultaneously. The assignment $E \mapsto e_\epsilon$ determines $\pi_\epsilon$ completely, using the intertwining properties. The family of frames $e_\epsilon$ determines principal fiber bundle isomorphisms $FM_\epsilon \to FM_0$ between the frame bundles by $\lambda_\epsilon : \{(e_\epsilon)_a\} \mapsto \{(e_0)_a\}$ on $K$ and extending it by the identity on the rest of $M$. By definition $f$ intertwines the action of $L_+^\uparrow$ on the orthonormal frame bundles.

^t In [2], it seems the authors have the scattering of a state in mind as it passes through the perturbed metric, which leads them to consider the ∗-isomorphisms $\beta_g^{-1}$ rather than $\beta_g$. When we take the variation with respect to $g$ this gives rise to a sign.

Remark 4.13. There may be many deformations of the spin structure, i.e. many families of projections $\pi_\epsilon$ which satisfy our requirements. However, the variation of terms like $\langle v, P u \rangle$ will not depend on this choice. Indeed, if $\pi'_\epsilon$ is a different deformation of the spin structure, then $e'_\epsilon := \pi'_\epsilon(E) = R_{\Lambda_\epsilon} e_\epsilon = \pi_\epsilon(R_{S_\epsilon} E)$ for some smooth curve $S_\epsilon$ in $Spin^0_{1,3}$. However, using the invariance of $\langle\,,\rangle$ under the action of the gauge group $Spin^0_{1,3}$, the variation will be equal in both cases. (Also $\delta u = 0$ for


every spinor $u$, because $D_\epsilon M = DM$.) In this sense, the variation will only depend on the variation of the metric.

4.3.1. The stress-energy-momentum tensor

The classical stress-energy-momentum tensor for the Dirac field is defined as a variation of the action $S = \int_M L_D$, with the Lagrangian density (7), with respect to $g^{\mu\nu}(x)$:
$$T_{\mu\nu}(x) := \frac{2}{\sqrt{-\det g(x)}}\,\frac{\delta S}{\delta g^{\mu\nu}(x)}, \tag{12}$$
where $\psi$ is a free classical Dirac spinor, $\psi^+$ its adjoint. An explicit computation yields^u
$$T_{\mu\nu} = \frac{i}{2}\left(\langle \psi^+, \gamma_{(\mu}\nabla_{\nu)}\psi \rangle - \langle \nabla_{(\mu}\psi^+, \gamma_{\nu)}\psi \rangle\right).$$
Here the brackets around indices denote symmetrization as an idempotent operation and in the following indices between $|\cdots|$ are to be excluded from the symmetrization. Following [7] we quantize the stress-energy-momentum tensor via a point-split procedure, i.e. we want to find a bi-distribution of scalar test-functions which reduces to $T_{\mu\nu}$ on the diagonal and which can be quantized in a straight-forward way. For this purpose we use a local spin frame $E_A$ and recall that the components $\gamma_a{}^A{}_B$ of $\gamma_a$ are constant. We define:
$$T^s_{ab}(x, y) := \frac{i}{2}\left(\langle \psi^+, E_A \rangle(x)\,\gamma_{(a}{}^A{}_{|B|}\,\langle E^B, e^\mu_{b)}\nabla_\mu\psi \rangle(y) - \langle e^\mu_{(a}\nabla_{|\mu}\psi^+, E_{A|} \rangle(x)\,\gamma_{b)}{}^A{}_B\,\langle E^B, \psi \rangle(y)\right),$$
which reduces to $T_{ab} := e^\mu_a e^\nu_b T_{\mu\nu}$ in the limit $y \to x$. Performing a partial integration, $\int_M \nabla_\mu(e^\mu_a \langle v, u \rangle) = 0$, we can write $T^s_{ab}$ as a bidistribution of scalar test-functions $h_1, h_2$,
$$T^s_{ab}(h_1, h_2) = \frac{i}{2}\left(-\psi^+(E_A h_1)\,\gamma_{(a}{}^A{}_{|B}\,\psi(\nabla_{\mu|}(E^B e^\mu_{b)} h_2)) + \psi^+(\nabla_\mu(e^\mu_{(a} E_{|A|} h_1))\,\gamma_{b)}{}^A{}_B\,\psi(E^B h_2)\right). \tag{13}$$
Equation (13) can be promoted to the quantized case by replacing $\psi$ and $\psi^+$ by the components $\psi_{SM}$ and $\psi^+_{SM}$ of the corresponding locally covariant quantum field. The expression (13) can be viewed as a formal expression for the same distribution with quantized field operators.

^u For explicit computations, we refer to [43, Sec. 4], which uses a Lagrangian that differs from ours by a total derivative. Varying with respect to $g_{\mu\nu}$ would yield the opposite sign.
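The sign flip mentioned in footnote u can be traced to the standard identity $\delta g^{\mu\nu} = -g^{\mu\alpha} g^{\nu\beta}\,\delta g_{\alpha\beta}$ for the variation of an inverse matrix. The following Python sketch checks this identity by a finite difference on a two-dimensional toy metric. The helper names (`inv2`, `mul2`, `inverse_metric_variation`) and the sample matrices `g` and `h` are illustrative assumptions and do not appear in the text.

```python
from fractions import Fraction


def inv2(m):
    """Inverse of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def mul2(x, y):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def inverse_metric_variation(g, h, eps):
    """Finite-difference derivative of the inverse of g + eps*h at eps = 0."""
    g_eps = tuple(
        tuple(g[i][j] + eps * h[i][j] for j in range(2)) for i in range(2)
    )
    inv_eps, inv_0 = inv2(g_eps), inv2(g)
    return tuple(
        tuple((inv_eps[i][j] - inv_0[i][j]) / eps for j in range(2))
        for i in range(2)
    )


# Toy Lorentzian metric g_{ab} and a symmetric perturbation h_{ab} (assumed values).
g = ((Fraction(-1), Fraction(0)), (Fraction(0), Fraction(1)))
h = ((Fraction(2), Fraction(1)), (Fraction(1), Fraction(3)))
eps = Fraction(1, 10**6)

numeric = inverse_metric_variation(g, h, eps)

# Expected first-order variation of the inverse metric: -g^{-1} h g^{-1}.
g_inv = inv2(g)
exact = tuple(tuple(-x for x in row) for row in mul2(mul2(g_inv, h), g_inv))

max_error = max(
    abs(numeric[i][j] - exact[i][j]) for i in range(2) for j in range(2)
)
```

The remaining discrepancy `max_error` is of order `eps`, as expected for a first-order finite difference. The minus sign in `exact` is the one that exchanges the sign of a variation taken with respect to $g_{\mu\nu}$ instead of $g^{\mu\nu}$.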


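Proposition 4.14. For all $f \in C_0^\infty(DM \oplus D^*M)$ and $h \in C_0^\infty(M)$ we have:
$$\int_M [T^s_{ab}(x, x), B_{SM}(f)]\,h(x)\,d\mathrm{vol}_g(x) = \frac{1}{2}\left\{(\nabla_{(a} B_{SM})(\gamma_{b)}(S\tau f)h) - B_{SM}(\gamma_{(b}\nabla_{a)}(S\tau f)h)\right\},$$
where $\nabla_a := e^\mu_a \nabla_\mu$.

Proof. For $f = u \oplus v$ we use Proposition 4.6 to obtain:
$$\{B_{SM}(f), \psi^+_{SM}(E_A h)\} = -i \int_M \langle v, S E_A h \rangle I = i \int_M \langle S_c v, E_A h \rangle I,$$
$$\{B_{SM}(f), \psi_{SM}(\nabla_\mu E^B e^\mu_b h)\} = -i \int_M \langle \nabla_\mu E^B e^\mu_b h, S u \rangle I = i \int_M \langle E^B, e^\mu_b \nabla_\mu S u \rangle h\, I,$$
$$\{B_{SM}(f), \psi^+_{SM}(\nabla_\mu e^\mu_a E_A h)\} = -i \int_M \langle v, S \nabla_\mu e^\mu_a E_A h \rangle I = -i \int_M \langle e^\mu_a \nabla_\mu S_c v, E_A h \rangle I,$$
$$\{B_{SM}(f), \psi(E^B h)\} = -i \int_M \langle E^B, S u \rangle h\, I.$$
With Eq. (13), the commutation relations and $[AB, C] = A\{B, C\} - \{A, C\}B$ this implies
$$[T^s_{ab}(x, y), B_{SM}(f)] = \frac{1}{2}\Big\{\psi^+_{SM}(E_A(x))\,\gamma_{(a}{}^A{}_{|B|}\,\langle E^B, \nabla_{b)} S u \rangle(y) + \langle S_c v, E_A \rangle(x)\,\gamma_{(a}{}^A{}_{|B|}\,(\nabla_{b)}\psi_{SM})(E^B(y))$$
$$\qquad - (\nabla_{(a}\psi^+_{SM})(E_{|A|}(x))\,\gamma_{b)}{}^A{}_B\,\langle E^B, S u \rangle(y) - \langle \nabla_{(a} S_c v, E_{|A|} \rangle(x)\,\gamma_{b)}{}^A{}_B\,\psi_{SM}(E^B(y))\Big\}.$$
In this expression, we are multiplying distributions with smooth functions, so we may take the coincidence limit yielding:
$$[T^s_{ab}(x, x), B_{SM}(f)] = \frac{1}{2}\Big\{\psi^+_{SM}(\gamma_{(a}\nabla_{b)}(S u)(x)) + \nabla_{(b}\psi_{SM}(S_c v\,\gamma_{a)}(x)) - \nabla_{(a}\psi^+_{SM}(\gamma_{b)} S u(x)) - \psi_{SM}(\nabla_{(a}(S_c v)\gamma_{b)}(x))\Big\}$$
$$\qquad = \frac{-1}{2}\left\{\nabla_{(a} B_{SM}(\gamma_{b)} S\tau f(x)) - B_{SM}(\gamma_{(b}\nabla_{a)}(S\tau f)(x))\right\},$$
from which the result follows.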
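The algebraic identity $[AB, C] = A\{B, C\} - \{A, C\}B$ used in the proof follows by expanding both sides, since the two $ACB$ terms cancel. As a quick sanity check, the following Python sketch verifies it exactly on random integer matrices. The helper names and the choice of $3 \times 3$ matrices are illustrative assumptions; the identity itself holds in any associative algebra.

```python
import random


def matmul(x, y):
    """Product of two square matrices stored as lists of rows."""
    n = len(x)
    return [
        [sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def add(x, y):
    """Entrywise sum of two matrices."""
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def sub(x, y):
    """Entrywise difference of two matrices."""
    return [[p - q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def commutator(x, y):
    """[x, y] = xy - yx."""
    return sub(matmul(x, y), matmul(y, x))


def anticommutator(x, y):
    """{x, y} = xy + yx."""
    return add(matmul(x, y), matmul(y, x))


def random_matrix(n, rng):
    """Random n x n integer matrix, so that all arithmetic is exact."""
    return [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]


rng = random.Random(0)
a, b, c = (random_matrix(3, rng) for _ in range(3))

lhs = commutator(matmul(a, b), c)
rhs = sub(matmul(a, anticommutator(b, c)), matmul(anticommutator(a, c), b))
identity_holds = lhs == rhs
```

In the proof the identity is applied with $A$ and $B$ the two field operators in each term of (13) and $C = B_{SM}(f)$, so that only the anticommutators computed from Proposition 4.6 are needed.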


This result can be written for spinors and cospinors separately as:
$$\int_M [T^s_{ab}(x, x), \psi_{SM}(v)]\,h(x)\,d\mathrm{vol}_g(x) = \frac{1}{2}\left\{\nabla_{(a}\psi_{SM}((S_c v)\gamma_{b)} h) - \psi_{SM}(\nabla_{(a}(S_c v)\gamma_{b)} h)\right\},$$
$$\int_M [T^s_{ab}(x, x), \psi^+_{SM}(u)]\,h(x)\,d\mathrm{vol}_g(x) = \frac{-1}{2}\left\{\nabla_{(a}\psi^+_{SM}(\gamma_{b)} S u\, h) - \psi^+_{SM}(\gamma_{(a}\nabla_{b)}(S u) h)\right\}.$$

4.3.2. Relative Cauchy evolution

To compute the relative Cauchy evolution explicitly, we first note that the isomorphism $\beta_g$ can be characterized in terms of its action on the generators $B_{M_0}(f)$ of $F_{M_0}$ as follows:

Proposition 4.15. For $f \in C_0^\infty(DN^+ \oplus D^*N^+)$, we have $\beta_g B_0(f) = B_0(T_g f)$, where
$$T_g f = P_g \phi_+ S_g P_0 \phi_- S_0 f.$$
Here the subscripts on $B$, $P$ and $S$ indicate whether they are the objects defined on $M_0$ or $M_g$ and the smooth functions $\phi_\pm$ are such that $\phi_\pm \equiv 1$ to the past of some Cauchy surface in $N^\pm$ and $\phi_\pm \equiv 0$ to the future of some other Cauchy surface in $N^\pm$.

Proof. Note that $\beta_g^- \circ (\beta_0^-)^{-1} B_0(\tilde f) = B_g(\tilde f)$ for any $\tilde f \in C_0^\infty(DN^- \oplus D^*N^-)$. Similarly, for $f' \in C_0^\infty(DN^+ \oplus D^*N^+)$ we have $\beta_0^+ \circ (\beta_g^+)^{-1} B_g(f') = B_0(f')$. The functions $\phi_\pm$, $1 - \phi_\pm$ have been chosen appropriately in order to apply Eq. (11) in Proposition 4.8. We then have $B_0(\tilde f) = B_0(f)$, where $\tilde f := -P_0 \phi_- S_0 f$. Notice that $\tilde f$ indeed has a compact support in $N^-$. Similarly, $B_g(\tilde f) = B_g(f')$, where $f' := -P_g \phi_+ S_g \tilde f$ has support in $N^+$. Hence, for $f' = T_g f$:
$$\beta_g B_0(f) = \beta_g B_0(\tilde f) = \beta_0^+ \circ (\beta_g^+)^{-1} B_g(\tilde f) = \beta_0^+ \circ (\beta_g^+)^{-1} B_g(f') = B_0(f').$$

On each spin spacetime $M_\epsilon = (M, g_\epsilon, SM_0, \pi_\epsilon)$ we can now quantize the Dirac field and obtain relative Cauchy evolutions $\beta_\epsilon := \beta_{g_\epsilon}$ on $F_{N^+}$ as before.

Proposition 4.16. Writing $\delta := \partial_\epsilon|_{\epsilon=0}$ we have for all $f \in C_0^\infty(DN^+ \oplus D^*N^+)$:
$$\delta(\beta_\epsilon B_0(f)) = B_0(\tau(\delta\,\not\nabla_\epsilon) S_0 f).$$

Proof. Using the fact that $B_0$ is a $C^*$-algebra-valued distribution and Proposition 4.15 we find:
$$\delta(\beta_\epsilon B_0(f)) = \delta(B_0(P_\epsilon \phi_+ S_\epsilon P_0 \phi_- S_0 f)) = B_0(\delta(P_\epsilon \phi_+ S_\epsilon) P_0 \phi_- S_0 f)$$
$$\qquad = B_0(\delta(P_\epsilon)\phi_+ S_0 P_0 \phi_- S_0 f) + B_0(P_0 \phi_+ \delta(S_\epsilon) P_0 \phi_- S_0 f).$$


Now, because $P_0 \phi_- S_0 f \in C_0^\infty(DN^- \oplus D^*N^-)$ we see that $\delta(S_\epsilon) P_0 \phi_- S_0 f$ vanishes on $J^-(N^-)$ and that $\phi_+ \delta(S_\epsilon) P_0 \phi_- S_0 f$ has compact support. Because $B_0$ solves the Dirac equation we conclude that the second term vanishes. The first term can be rewritten using Eq. (11), which yields $S_0 f = -S_0 P_0(\phi_- S_0 f)$ and hence:
$$\delta(\beta_\epsilon B_0(f)) = -B_0(\delta(P_\epsilon)\phi_+ S_0 f) = -B_0(\delta(P_\epsilon) S_0 f).$$
For the last equality, we used the fact that $\delta(P_\epsilon)$ is supported in $K$, where $\phi_+ \equiv 1$. Recall that $P = (-i\not\nabla + m) \oplus (i\not\nabla + m)$ to get the final result.

To compute the variation of the Dirac operator we may work in a local frame on $O$, where it is supported. Because the Dirac adjoint map is independent of $\epsilon$ we only need to compute this variation either for spinors or for cospinors:

Lemma 4.17. For $v \in C_0^\infty(D^*M)$ we have $\delta(\not\nabla_\epsilon)v = (\delta(\not\nabla_\epsilon)v^+)^+$.

Proof. Because the adjoint operation is continuous we have:
$$\delta(\not\nabla_\epsilon)v = \partial_\epsilon \not\nabla_\epsilon v|_{\epsilon=0} = \partial_\epsilon(\not\nabla_\epsilon v^+)^+|_{\epsilon=0} = (\partial_\epsilon \not\nabla_\epsilon v^+|_{\epsilon=0})^+ = (\delta(\not\nabla_\epsilon)v^+)^+.$$

It is interesting to note that only the variation of the Dirac operator is of importance for the variation of the relative Cauchy evolution, just like for the stress-energy-momentum tensor (cf. [43]). It will also turn out that the variation only depends on the variation of the metric and not on the other freedom in the variation of the orthonormal frame, even though we are now acting on it with the $C^*$-algebra-valued field (cf. Remark 4.13). This will follow from the proof of the following theorem, for which we refer to Appendix B.

Theorem 4.18. For a double test-spinor $f \in C_0^\infty(DM_0 \oplus D^*M_0)$ and $x \in K$:
$$\frac{\delta}{\delta g^{\alpha\beta}(x)}(\beta_g B_0(f)) = -B_0\left(\frac{\delta}{\delta g^{\alpha\beta}(x)} P_g S_0 f\right) = \frac{-i}{2}\,e^a_\alpha e^b_\beta\,[T^s_{ab}(x, x), B_0(f)]. \tag{14}$$

This result compares well with the scalar field case, [2, Theorem 4.3].^v As particular cases we obtain for $\psi$ and $\psi^+$:
$$\frac{\delta}{\delta g^{\alpha\beta}(x)}(\beta_g \psi(v)) = \frac{-i}{2}\,e^a_\alpha e^b_\beta\,[T^s_{ab}(x, x), \psi(v)],$$
$$\frac{\delta}{\delta g^{\alpha\beta}(x)}(\beta_g \psi^+(u)) = \frac{-i}{2}\,e^a_\alpha e^b_\beta\,[T^s_{ab}(x, x), \psi^+(u)].$$

^v The sign explained in footnote t cancels the sign due to the variation with respect to $g^{\alpha\beta}$ instead of $g_{\alpha\beta}$.


It follows that the same result also holds for products and sums of smeared ﬁeld operators.

5. Conclusions

A rigorous formulation of quantum field theories in curved spacetime, going beyond the well-known scalar field, is a prerequisite for constructing more realistic cosmological models as well as for improving our understanding of quantum field theory in Minkowski spacetime. The main purpose of this paper was to present the free Dirac field in a four-dimensional globally hyperbolic spacetime as a locally covariant quantum field theory in the sense of [2] and to compute the relative Cauchy evolution of this field, obtaining commutators with the stress-energy-momentum tensor in analogy with the free real scalar field. We achieved this in a representation independent way and in a functorial, and therefore manifestly covariant, framework. We established some basic properties of the locally covariant free Dirac field and remarked on the quantization of Majorana spinors. We also provided a detailed discussion of Hadamard states, closing any gaps in the existing proofs of the equivalence of the definitions in terms of the series expansion of their two-point distribution and a microlocal condition, respectively. Furthermore, we argued that the observable part of the theory is uniquely determined by the relations between adjoints, charge conjugation and the Dirac operator, although the geometric constructions themselves may not be unique due to the cohomological properties of the category of spin spacetimes.

On a mathematical level we have consistently replaced a single spin spacetime SM by the category SSpac of such spacetimes, and the differential geometry on SM by the corresponding functorial descriptions. On a physical level, however, we should not conclude from this that SSpac is now the physical arena in which our system lives, instead of a collection of systems. (See [1, Chap. 1] for more detailed philosophical remarks on the interpretation of the locally covariant approach.)

Acknowledgments

I would like to thank Chris Fewster for suggesting to use the cohomological language in Sec. 3.4 and for bringing the problem of computing the relative Cauchy evolution for the Dirac field to my attention. I would also like to thank Romeo Brunetti for correcting some of my misconceptions in the early stages of this computation. An anonymous referee made several important corrections and helpful suggestions, for which I am grateful. Much of this work was performed as part of my PhD-thesis at the University of York and I would also like to thank the University of Trento for their kind hospitality during my visit in October 2007. Furthermore, this research was supported by the German Research Foundation (Deutsche Forschungsgemeinschaft (DFG)) through the Institutional Strategy of the University of Göttingen


and the Graduiertenkolleg 1493 “Mathematische Strukturen in der modernen Quantenphysik”.

Appendix A. Results in Microlocal Analysis

In this appendix, we will list some results concerning the microlocal analysis of distributions. For a detailed treatment of scalar distributions we refer to [38], whereas Hilbert and Banach-space-valued distributions are treated in [1, 37]. More details concerning distributional sections of vector bundles can be found in, e.g., [1, 16, 34, 41]. Before we discuss distributional sections of vector bundles, we first consider the scaling limit of a distribution in an open set of $\mathbb{R}^n$:

Definition A.1. Let $O \subset \mathbb{R}^n$ be a convex open region containing 0. For all $\lambda > 0$ we define the scaling map $\delta_\lambda : O \to O$ by $\delta_\lambda(x) := \lambda x$. Let $u$ be a distribution on a convex open region $O \subset \mathbb{R}^n$ containing 0. The scaling degree $d$ of $u$ at 0 is defined as
$$d := \inf\{\beta \in [-\infty, \infty) \mid \lim_{\lambda \to 0} \lambda^\beta \delta_\lambda^* u = 0\},$$
where $(\delta_\lambda^* u)(f) := \lambda^{-n} u(f \circ \delta_\lambda^{-1})$. If $u_0 := \lim_{\lambda \to 0} \lambda^d \delta_\lambda^* u$ exists we call it the scaling limit of $u$ at 0.

Note that the scaling limit may fail to exist (e.g., $u(x) = \log|x|$) or it may vanish (e.g., if $0 \notin \mathrm{supp}(u)$). On a manifold, we will only consider scaling limits in a certain choice of local coordinates. How this limit depends on this choice of coordinates will not be relevant for us. We now prove the following result:^w

^w A similar result was also claimed as [34, Proposition 2.8], but we find their proof unconvincing. In particular, when localizing the scaling limit $u_0$ with a test-function $\chi_0$ and estimating (cf. [34, Eq. (2.11)])
$$\widehat{\chi_0 u_0}(\xi) = \lim_{\lambda \to 0} \lambda^{d-n}\, u\left(\chi_0\left(\tfrac{\cdot}{\lambda}\right) e^{-i\frac{\xi}{\lambda}\cdot}\right)$$
the test-function $\chi_0(\cdot/\lambda)$ becomes singular in the limit $\lambda \to 0$. The quoted reference pays insufficient attention to this issue in the last sentence of their proof, because their last estimate does not involve any $\chi_0$.

Proposition A.2. Let $u$ be a distribution on a convex open region $O \subset \mathbb{R}^n$ containing 0 with scaling limit $u_0$ at 0. Then $\{0\} \times \pi_2(WF(u_0)) \subset WF(u)$, where $\pi_2$ denotes the projection on the second coordinate.

Proof. Suppose that $(0, \xi_0) \notin WF(u)$ with $\xi_0 \neq 0$. We will prove that $(x, \xi_0) \notin WF(u_0)$ for all $x$. By assumption, we can choose $\chi \in C_0^\infty(O)$ and an open conic neighborhood $\Gamma \subset \mathbb{R}^n$ of $\xi_0$ such that $\chi \equiv 1$ on a neighborhood of 0 and $(\mathrm{supp}(\chi) \times \Gamma) \cap WF(u) = \emptyset$. We set $v := \chi u$ and $v^\lambda := \lambda^d \delta_\lambda^* v$, where $d$ is the scaling degree of $u$ at 0. Notice that $WF(v) \cap T_0^*O = WF(u) \cap T_0^*O$ and $u_0 = \lim_{\lambda \to 0} v^\lambda$, so without


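loss of generality, we may prove the result with $v$ replacing $u$ and we can view the $v^\lambda$ as compactly supported distributions on all of $\mathbb{R}^n$. Notice that for $\lambda > 0$ we have $\delta_\lambda^* u_0 = \lambda^{-d} u_0$, i.e. $u_0$ is a homogeneous distribution and therefore it is tempered ([38, Theorem 7.1.18]).

We now prove that $v^\lambda$ converges to $u_0$ in the sense of tempered distributions on $\mathbb{R}^n$. For this we first write $v = \sum_{|\alpha| \le r} (-1)^{|\alpha|} \partial^\alpha v_\alpha$, where $r$ is the order of $v$ and the $v_\alpha$ are compactly supported distributions of order 0 (see [38, Sec. 2.1]). Note that $|\alpha|$ [...]
$$|w^\lambda(\phi)| \le C \sum_{|\alpha| \le r} \sup|\partial^\alpha \phi|, \qquad \mathrm{supp}(\phi) \subset B_1, \tag{15}$$
for some $C, r > 0$, where $B_1$ is the (Euclidean) unit ball and $0 < \lambda \le 1$. In fact, for $\lambda \ge 1$ we also have
$$|w^\lambda(\phi)| = \lambda^{d-n} |w(\phi \circ \delta_\lambda^{-1})| \le C \sum_{d-n \le |\alpha| \le r} \lambda^{d-n-|\alpha|} \sup|\partial^\alpha \phi| \le C \sum_{d-n \le |\alpha| \le r} \sup|\partial^\alpha \phi|,$$
so the estimate (15) holds for all $\lambda > 0$.

Now, let $\phi \in S(\mathbb{R}^n)$ be a function of rapid decrease and choose a partition of unity on $\mathbb{R}^n$ as follows. We let $\chi_0 \in C_0^\infty(\mathbb{R}^n)$ be positive such that $\chi_0 \equiv 1$ on $B_1$ and $\chi_0(x) = 0$ when $\|x\| \ge 2$. We then set $\chi_m(x) := \chi_0(2^{-m}x) - \chi_0(2^{1-m}x)$ and note that:
$$\mathrm{supp}(\chi_{m \ge 1}) \subset \{x \mid 2^{m-1} \le \|x\| \le 2^{m+1}\}, \qquad \sum_{m=0}^\infty \chi_m = 1,$$
where the sum is finite near every point. We define $\phi_m := \chi_m \phi$ and $\mu_m := 2^{-m-1}$ and rescale $\phi_m$ in order to apply the estimate (15):
$$|w^\lambda(\phi_m)| = \mu_m^{d-n} \left|w^{\lambda/\mu_m}\left(\phi_m\left(\tfrac{\cdot}{\mu_m}\right)\right)\right| \le C \sum_{|\alpha| \le r} \mu_m^{d-n-|\alpha|} \sup\left|(\partial^\alpha \phi_m)\left(\tfrac{\cdot}{\mu_m}\right)\right| \le C_1 \sum_{|\alpha| \le r,\ |\beta| \le r+n-d}\ \sup_{\mathbb{R}^n} |x^\beta \partial^\alpha \phi_m|, \qquad m \ge 0, \tag{16}$$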


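where the last line uses $\mu_m^{d-n-|\alpha|} \le (4\|x\|)^{|\alpha|+n-d}$ for $m \ge 1$, which follows from $d - n \le |\alpha|$ and the support properties of $\chi_m$. (For $m = 0$ we simply estimate $\mu_0^{d-n-|\alpha|}$ by a constant to arrive at the last line of (16).) We now note that $\max_\alpha \sup_x |\partial^\alpha \chi_m| \le c$ for some $c$ independent of $m$, as the derivatives only bring out extra factors of $2^{-m} \le 1$. Moreover, for $m \ge 0$ we notice that $\chi_{m+1} + \chi_m + \chi_{m-1} \equiv 1$ on $\mathrm{supp}(\chi_m)$, where we define $\chi_{-1} := 0$. Therefore (16) leads to
$$|w^\lambda(\phi_m)| \le C_2 \sum_{|\alpha| \le r,\ |\beta| \le r+n-d}\ \sup_{\mathbb{R}^n} |x^\beta \partial^\alpha \phi|\,(\chi_{m+1} + \chi_m + \chi_{m-1})$$
and summing over $m \ge 0$ then gives:
$$|w^\lambda(\phi)| \le 3 C_2 \sum_{|\alpha| \le r,\ |\beta| \le r+n-d}\ \sup_{\mathbb{R}^n} |x^\beta \partial^\alpha \phi|.$$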
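The two properties of the dyadic partition of unity used above, namely that the $\chi_m$ sum to 1 and that $\chi_{m+1} + \chi_m + \chi_{m-1} \equiv 1$ on $\mathrm{supp}(\chi_m)$ (which is the origin of the factor 3), can be checked numerically. The following Python sketch does so in one dimension. The particular smooth cutoff `chi0`, built from the standard $e^{-1/t}$ ramp, and the sample grid are assumptions made for illustration; the proof only requires some $\chi_0$ with the stated support properties.

```python
import math


def smooth_step(t):
    """Smooth ramp equal to 0 for t <= 0 and to 1 for t >= 1."""
    def g(s):
        return math.exp(-1.0 / s) if s > 0 else 0.0
    return g(t) / (g(t) + g(1.0 - t))


def chi0(x):
    """Cutoff with chi0 = 1 for |x| <= 1 and chi0 = 0 for |x| >= 2."""
    return 1.0 - smooth_step(abs(x) - 1.0)


def chi(m, x):
    """Dyadic pieces chi_m, with chi_0 the cutoff itself and chi_{-1} := 0."""
    if m < 0:
        return 0.0
    if m == 0:
        return chi0(x)
    return chi0(2.0 ** (-m) * x) - chi0(2.0 ** (1 - m) * x)


# Sample points with |x| <= 60 < 2**6, so chi_0, ..., chi_6 cover all of them.
points = [0.05 * k for k in range(-1200, 1201)]

# The partial sum telescopes to chi0(2**-6 * x), which is 1 on the sample.
partial_sum_error = max(
    abs(sum(chi(m, x) for m in range(7)) - 1.0) for x in points
)

# On the support of chi_m only the neighbours m - 1, m, m + 1 contribute.
neighbour_error = max(
    abs(chi(m - 1, x) + chi(m, x) + chi(m + 1, x) - 1.0)
    for m in range(6)
    for x in points
    if chi(m, x) != 0.0
)
```

Both errors are at the level of floating-point rounding, reflecting that at most three consecutive pieces overlap at any point.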

This shows that $w^\lambda(\phi)$ can be estimated by a seminorm on $S(\mathbb{R}^n)$ uniformly in $\lambda$. It then follows that $w^\lambda \to u_0$ and hence $v^\lambda \to u_0$ as tempered distributions. Indeed, for any $\phi \in S(\mathbb{R}^n)$ and $\epsilon > 0$ we can choose $\phi' \in C_0^\infty(\mathbb{R}^n)$ and $\lambda_0 > 0$ such that $|w^\lambda(\phi - \phi')| < \frac{\epsilon}{2}$ for all $\lambda > 0$ and $|w^\lambda(\phi')| < \frac{\epsilon}{2}$ for all $\lambda < \lambda_0$. Fourier transformation is a continuous operation on tempered distributions, so we can compute:
$$|\widehat{u_0}(\xi)| = \lim_{\lambda \to 0} \lambda^{d-n} \left|\hat v\left(\tfrac{\xi}{\lambda}\right)\right| \le C_N \lim_{\lambda \to 0} \lambda^{d-n} \left\|\tfrac{\xi}{\lambda}\right\|^{-N} = C_N \|\xi\|^{-N} \lim_{\lambda \to 0} \lambda^{N+d-n}$$
for all $\xi$ in $\Gamma$, all $N \in \mathbb{N}$ and suitable $C_N > 0$. For $N > n - d$ the limit yields $\widehat{u_0}(\xi) = 0$ near $\xi_0$. We then apply [38, Theorem 8.1.8], which says that for a homogeneous distribution we have for all $x \neq 0$ that $(x, \xi_0) \in WF(u_0)$ if and only if $(\xi_0, -x) \in WF(\widehat{u_0})$ and also $(0, \xi_0) \in WF(u_0)$ if and only if $\xi_0 \in \mathrm{supp}(\widehat{u_0})$.

For a distribution $u$ with values in a Banach space $B$ one can define the wave front set by using estimates of the norm $\|u(\chi e^{i\xi\cdot})\|$, which replace the corresponding estimates of the absolute value $|u(\chi e^{i\xi\cdot})|$ for scalar distributions [37]. Alternatively, one can use the following equivalent characterization ([1, Theorem A.1.4]):
$$WF(u) = \overline{\bigcup_{l \in B'} WF(l \circ u)} \setminus Z. \tag{17}$$
A similar idea works for a distributional section $u$ of a vector bundle $V = O \times \mathbb{R}^m$ over a contractible region $O$ of $\mathbb{R}^n$. Indeed, using a basis $e_i$ for $\mathbb{R}^m$ with dual basis $e^i$ we can identify $u$ with a distribution $\tilde u$ on $O$ with values in $B \otimes (\mathbb{R}^m)'$, where the correspondence is given by
$$\tilde u(h) := \sum_{i=1}^m u(h e_i) \otimes e^i, \qquad u\left(\sum_{i=1}^m f^i e_i\right) = \sum_{i=1}^m \langle \tilde u(f^i), e_i \rangle,$$


where $\langle\,,\rangle$ denotes the canonical pairing of $\mathbb{R}^m$ with the second factor of $B \otimes (\mathbb{R}^m)'$. We set by definition $WF(u) := WF(\tilde u)$.

Equation (17) allows a straightforward generalization of many results for scalar distributions on open sets of $\mathbb{R}^n$ to Banach-space-valued distributional sections of a vector bundle over regions of $\mathbb{R}^n$. Moreover, by showing how these results transform under changes of coordinates they can be formulated for vector bundles on a manifold. We list a number of these results in the following Theorem (cf. [1, 38]):

Theorem A.3. If $u, v$ are distributional sections of a complex vector bundle $V$ over the spacetime $M$ with values in the Banach space $B$, then:
(1) $\mathrm{sing\,supp}(u)$ is the projection of $WF(u)$ on the first variable,
(2) $u \in C^\infty(V, B)$ if and only if $WF(u) = \emptyset$,
(3) $WF(u + v) \subset WF(u) \cup WF(v)$,
(4) if $P$ is a linear partial differential operator on $V$ with smooth coefficients and (matrix-valued) principal symbol^x $p(x; \xi)$, then $WF(Pu) \subset WF(u) \subset WF(Pu) \cup \Omega_P$, where $\Omega_P := \{(x; \xi) \in T^*M \mid \xi \neq 0,\ \det p(x; \xi) = 0\}$,
(5) if $x \in M$, $\phi : U \to \mathbb{R}^n$ is a local trivialization on a convex neighborhood $U$ with $\phi(x) = 0$ and $(\phi^{-1})^* u$ has a scaling limit $u_0$ at 0, then $\phi^*(\{0\} \times \pi_2(WF(u_0))) \subset WF(u) \cap T_x^*M$.

^x See [16] for the definition of the principal symbol.

In the last item, the scaling limit depends not just on the choice of coordinates, but also on the choice of a frame $e_i$ of $V$ over $U$ and we let the scaling maps $\delta_\lambda$ act on sections of $V$ componentwise: $(\sum_i f^i e_i) \circ \delta_\lambda^{-1} = \sum_i (f^i \circ \delta_\lambda^{-1}) e_i$. In the particular case where $B$ is a Hilbert space, we also have (see [1, 37]):

Theorem A.4. Let $H$ be a Hilbert space and $V_i$, $i = 1, 2$, two finite-dimensional (complex) vector bundles over smooth $n_i$-dimensional spacetimes $M_i$ with complex conjugations $J_i$, i.e. the $J_i$ are antilinear, base-point preserving bundle isomorphisms $J_i : V_i \to V_i$ such that $J_i^2 = -\mathrm{id}$. Let $u_i$, $i = 1, 2$, be two $H$-valued distributional sections of $V_i$ and let $w_{ij}$ be the distributional sections of the vector bundle $V_i \boxtimes V_j$ over $M_i \times M_j$ determined by $w_{ij}(f_1 \boxtimes f_2) := \langle u_i(J_i f_1), u_j(f_2) \rangle$. Then
$$(x, \xi) \in WF(u_1) \Leftrightarrow (x, -\xi; x, \xi) \in WF(w_{11})$$
and
$$WF(w_{ij}) \subset -(WF(u_i) \cup Z) \times (WF(u_j) \cup Z),$$
where $Z$ denotes the zero-section.

Finally, we establish some results on the wave front sets of advanced and retarded fundamental solutions $E^\pm$ (for their existence and uniqueness we refer to [16]) and $S^\pm$, $S_c^\pm$. These results are analogous to [36, Theorem 6.5.3], but now


for operators in a vector bundle. Note that for distributional sections of vector bundles there is a Propagation of Singularities Theorem, which follows from the propagation of the polarization set [41].

Theorem A.5. Let $E^\pm$ be the advanced (−) and retarded (+) fundamental solutions for a normally hyperbolic operator $P$ acting on the sections of a vector bundle $DM$ over a globally hyperbolic spacetime $M = (M, g)$ of dimension $n \ge 2$. Then
$$WF(E^\pm) = \{(x, \xi; y, \eta) \in T^*M^{\times 2}\setminus Z \mid x \in J^\pm(y),\ x \neq y,\ (x, -\xi) \sim (y, \eta)\} \cup \{(x, -\xi; x, \xi) \in T^*M^{\times 2}\setminus Z \mid (x, \xi) \in T^*M\setminus Z\} =: A^\pm \cup B \tag{18}$$
where $Z$ is the zero-section and $(x, \xi) \sim (y, \eta)$ if and only if there is a light-like geodesic $\gamma$ from $x$ to $y$ to which $\xi$ and $\eta$ are cotangent such that they are each other's parallel transport along $\gamma$.

Proof. The first part of this proof follows closely the proof of [31]. We start by reducing the problem to a local one as follows. The principal symbol of $P$ is $p(x, \xi) = g^{\mu\nu}(x)\xi_\mu \xi_\nu I$, where $I$ is the identity operator on $DM$, so by the Propagation of Singularities Theorem, the singularities of $E^\pm$ propagate along light-like geodesics by parallel transport. By definition the points in the set $A^\pm$ are invariant under the same parallel transport. Now consider a point $p := (x, \xi; y, \eta)$ with $x \neq y$. If $\xi = \eta = 0$ then $p$ is not contained in any set on either side of the equality, so we may assume $\xi \neq 0$ (the case $\eta \neq 0$ is analogous). Let $S$ be a spacelike Cauchy surface through $y$ and propagate $(x, \xi)$ along the light-like geodesic $\gamma$ towards $S$. If $\gamma$ ends at $S$ in $x' \neq y$ then $p$ is not contained in $A^\pm$ or $B$, nor is it contained in $WF(E^\pm)$, because $E(x', y) = 0$ when $x'$ and $y$ are spacelike, so it cannot have any singularities there. If $\gamma$ ends at $y$, on the other hand, we can find a point $p' := (x', \xi'; y, \eta)$, where $x'$ on $\gamma$ is in any given causally convex neighborhood of $y$ and $\xi'$ is the parallel transport of $\xi$ along $\gamma$ to $x'$. Then $p \in WF(E^\pm)$ if and only if $p' \in WF(E^\pm)$ and $p \in A^\pm$ if and only if $p' \in A^\pm$. Hence, it suffices to prove the claim locally.

On a sufficiently small causally convex domain $O \subset M$ we can find for every $k \in \mathbb{N}$ a $C^k$-section $W^k$ of $DM \boxtimes D^*M$ on $O^{\times 2}$ such that ([16, Proposition 2.5.1]):
$$E^\pm(x, y) = \sum_{j=0}^{k+1} V_j(x, y)\, f^*(1 \otimes R^\pm(2 + 2j, \cdot))(x, y) + W^k(x, y). \tag{19}$$
Here, the Hadamard coefficients $V_j$ are uniquely defined smooth sections of $DM \boxtimes D^*M$ on $O^{\times 2}$, $R^\pm(\alpha, y)$ are the retarded (+) and advanced (−) Riesz distributions (or rather distribution densities) on Minkowski spacetime and they are pulled back by the smooth diffeomorphism $f : O^{\times 2} \to TO$ defined by $(x, y) \mapsto (x, \exp_x^{-1}(y))$. This means we use Riemannian normal coordinates for $y$ centered on $x$, which is


well-defined because $O$ is causally convex. The Riesz distributions have many useful properties, of which we will only use for all $j \ge 0$:
$$WF(R^\pm(2j + 2, \cdot)) = \{(x, \xi) \in T^*M_0\setminus Z \mid x = 0 \text{ or } x^2 = 0,\ x \in J^\pm(0),\ \xi \parallel x\},$$
$$R^\pm(2 + 2j, \lambda x) = \lambda^{2+2j-n} R^\pm(2 + 2j, x), \qquad \lambda > 0. \tag{20}$$
(These can be proved using [16, Proposition 1.2.4 items 4 and 5], $\Box^{j+1} R^\pm(2 + 2j, \cdot) = \delta$ and the wave front sets of the distinguished parametrices as determined in [36].) Hence, for all $j \in \mathbb{N}$:
$$WF(f^*(1 \otimes R^\pm(2 + 2j, \cdot))) = f^*(WF(1 \otimes R^\pm(2 + 2j, \cdot))) = f^*(Z|_O \times WF(R^\pm(2 + 2j, \cdot)))$$
$$\qquad = \{(x, \xi; y, \eta) \mid (\xi, \eta) = df^T(0, \eta') \text{ for some } (\exp_x^{-1}(y), \eta') \in WF(R^\pm(2 + 2j, \cdot))\} = (A^\pm \cup B) \cap T^*O^{\times 2}, \tag{21}$$
where $df^T$ is the transpose of the derivative $df$ at $(x, y)$. The last equality uses the wave front set of the Riesz distributions in Eq. (20) and the properties of Riemannian normal coordinates (cf. [31]). It follows that $WF(E^\pm|_{O^{\times 2}}) \subset (A^\pm \cup B) \cap T^*O^{\times 2}$, because for each order of differentiation $N$ we can choose a sufficiently high order $k$ in Eq. (19) to make the required estimate in the definition of the wave front set.

We can prove the opposite inclusion, if we can show that the wave front set of the finite sum in (19) also contains $(A^\pm \cup B) \cap T^*O^{\times 2}$, which we will do using scaling limits (cf. [34]). First, we may employ the Riemannian normal coordinates $f : O^{\times 2} \to TO$ as above. Next, we may assume that $O$ is also a contractible coordinate neighbourhood, so we can consider local coordinates $\phi : O \to \mathbb{R}^n$ on $O$ and the associated coordinate map $d\phi$ on $TO$. Moreover, we can choose $\phi$ in such a way that $\phi(x_0) = 0$ for an arbitrarily given $x_0 \in O$. The composition $d\phi \circ f$ then defines coordinates on $O^{\times 2}$ such that $(x_0, x_0) \mapsto 0 \in \mathbb{R}^{2n}$. Using a frame $E_A$ for $DM|_O$ and the dual frame $E^B$ we can express the terms in the sum of Eq. (19) in the local coordinates $d\phi \circ f$ as $V_j{}^A{}_B(x, y) R^\pm(2 + 2j, y)$. From Eq. (20), we then find the scaling behavior
$$\delta_\lambda^*(V_j{}^A{}_B(x, y) R^\pm(2 + 2j, y)) = \lambda^{2+2j-n}(V_j{}^A{}_B(\lambda x, \lambda y) R^\pm(2 + 2j, y))$$
for all $\lambda > 0$. In the scaling limit only the lowest order term survives:
$$\lim_{\lambda \to 0} \lambda^{n-2}(\delta_\lambda \circ f^{-1} \circ d\phi^{-1})^* E(x, y) = V_0{}^A{}_B(0, 0) R(2, y) E^B(x) E_A(y) = R(2, y) E^A(x) E_A(y),$$


where we wrote $R(2, y) := R^-(2, y) - R^+(2, y)$ and we used the explicit expression $V_0{}^A{}_B(x, x) = \delta^A_B$ ([16, Lemmas 2.2.2 and 1.3.17]). Now, the last item of Theorem A.3 (which follows from Proposition A.2) implies that $WF(E) \supset (d\phi \circ f)^*(\{(0, 0)\} \times \pi_2(WF(1 \otimes R(2, \cdot))))$, because $E^A(x) E_A(y)$ is smooth and not identically vanishing. From Eq. (20) and the support properties of $R^\pm(2, \cdot)$ we easily compute $\pi_2(WF(1 \otimes R(2, \cdot))) = \{(0, \xi) \mid \xi^2 = 0\}$. Pulling this back to $O^{\times 2}$ and using the properties of Riemannian normal coordinates yields $WF(E) \supset \{(x_0, -\xi; x_0, \xi) \mid \xi^2 = 0\}$. Because $E$ is a bi-solution to the wave equation we can apply the Propagation of Singularities Theorem to find that $WF(E) \supset A^+ \cup A^-$ on $O^{\times 2}$ and from the support properties of $E^+$ and $E^-$ we then conclude that $WF(E^\pm) \supset A^\pm$. Finally, $WF(E^\pm) \supset WF(P E^\pm) = WF(\delta) = B$. This completes the proof.

Corollary A.6. In the notation of Theorem A.5, $WF(E) = \overline{A^+ \cup A^-}\setminus Z$.

Proof. By Theorem A.5 and the support properties of $E^\pm$, we have $WF(E) = A^+ \cup A^-$ away from the diagonal. The inclusion $\supset$ then follows from the closedness of the wave front set. For the opposite inclusion we consider a point on the diagonal and use the Propagation of Singularities Theorem to find an approximating sequence of points off the diagonal.

Proposition A.7. For the fundamental solutions of the Dirac equation we have, in the notation of Theorem A.5:
$$WF(S^\pm) = WF(S_c^\pm) = A^\pm \cup B \quad\text{and}\quad WF(S) = WF(S_c) = \overline{A^+ \cup A^-}\setminus Z.$$
In other words, $WF(S^\pm) = WF(S_c^\pm) = WF(E^\pm)$ and $WF(S) = WF(S_c) = WF(E)$.

Proof. Because $S^\pm = (i\not\nabla + m)E^\pm$ and $S_c^\pm = (-i\not\nabla + m)E^\pm$ (see [6]) we immediately find $WF(S^\pm) \subset WF(E^\pm)$ and $WF(S_c^\pm) \subset WF(E^\pm)$. Similarly $WF(S) \subset WF(E)$ and $WF(S_c) \subset WF(E)$. Now suppose that $WF(S) = WF(S_c) = WF(E) = A^+ \cup A^-$, which we will prove below. By the support properties of the fundamental solutions we then find that away from the diagonal $WF(S^\pm) = WF(S_c^\pm) = A^\pm$, whereas on the diagonal $WF(E^\pm) = B \supset WF(S^\pm) \supset WF(P S^\pm) = WF(\delta) = B$ and similarly for cospinors.

To complete the proof we need to show that $WF(S) \supset WF(E)$ and $WF(S_c) \supset WF(E)$, for which we adapt (and correct) an idea of [33]. We prove the case of $S$, because the other case follows by taking adjoints (cf. Theorem 3.10). Further note that it is sufficient to prove the claim on the diagonal, because the Propagation of Singularities Theorem applies both to $E$ and to $S$. Now suppose that $(x, -\xi; x, \xi) \in WF(E)\setminus WF(S)$. We will derive a contradiction as follows. For every time-like, future pointing normalized vector $n_0 \in T_x M$ we can find a smooth spacelike Cauchy surface $C$ through $x$ such that $n_0$ is normal to $C$. We let $n$ denote

May 11, J070-S0129055X10003990

2010 10:6 WSPC/S0129-055X

148-RMP

The Locally Covariant Dirac Field

425

the future pointing normal vector ﬁeld on C and ι : C → M the canonical injec/ tion. By [6, Proposition 2.4(c)] we can restrict S to C ×2 to ﬁnd S|C ×2 = −iδn and in particular (x, −dιTx (ξ); x, dιTx (ξ)) ∈ WF (S|C ×2 ). By (a component version of) [38, Theorem 8.2.4], on the other hand: WF (S|C ×2 ) ⊂ (ι × ι)∗ (WF (S)) = {(x, dιTx (ξ); y, dιTy (ξ )) | (x, ξ; y, ξ ) ∈ WF (S)}. Therefore, there must be a point (x, −η; x, η) ∈ WF (S) such that (x, −dιTx (η); x, dιTx (η)) = (x, −dιTx (ξ); x, dιTx (ξ)) Notice, however, that the transpose of dι is nothing else than restricting the dual vector ξ to the tangent space of C. Because WF (S) ⊂ WF (E), there are only two possibilities: η = ξ or η = ξ − 2(ξa na0 )n0 . The ﬁrst contradicts our assumption, so we have η = ξ − 2(ξa na0 )n0 . Now (x, −η; x, η) ∈ WF (S) must hold for every normalised, time-like, future pointing vector n0 ∈ Tx M . Choosing a sequence of vectors n0 such that η → ξ and using the closedness of the wave front set we ﬁnd again (x, −ξ; x, ξ) ∈ WF (S). Hence, WF (E) = WF (S). Appendix B. Proof of Theorem 4.18 The computations involved in the proof of Theorem 4.18 are somewhat similar to the computation of the stress-energy-momentum tensor. We will work in components and in local coordinates on O, using Greek indices to indicate the coordinate frame and coordinate derivatives. To ease the notation we will drop the subscript on the local frame eµa . As γ a is independent of we may use Eqs. (5) to vary 1 c 1 β b a α c c γ b ∇ / v = ∂a v − Γ ab vγc γ γ = ea ∂α v + eb {∂α eβ − eγ Γ αβ }vγc γ γ a , 4 4

(22)

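which yields:
\[
\begin{aligned}
\delta\slashed{\nabla} v ={}& \delta e^\alpha_a e^d_\alpha \nabla_d v\,\gamma^a - \tfrac{1}{4}\delta e^\beta_b e^d_\beta \Gamma^c_{ad}\, v\gamma_c\gamma^b\gamma^a + \tfrac{1}{4}\partial_a \delta e^c_\beta\, e^\beta_b\, v\gamma_c\gamma^b\gamma^a \\
& - \tfrac{1}{4}\delta e^c_\gamma e^\alpha_a e^\beta_b \Gamma^\gamma_{\alpha\beta}\, v\gamma_c\gamma^b\gamma^a - \tfrac{1}{4}\delta\Gamma^\gamma_{\alpha\beta}\, e^\alpha_a e^\beta_b e^c_\gamma\, v\gamma_c\gamma^b\gamma^a.
\end{aligned} \tag{23}
\]
We can perform an integration by parts as follows:
\[
\begin{aligned}
\tfrac{1}{4}\partial_a \delta e^c_\beta\, e^\beta_b\, v\gamma_c\gamma^b\gamma^a ={}& \tfrac{-i}{4} P_c(\delta e^c_\beta e^\beta_b\, v\gamma_c\gamma^b) + \tfrac{i}{4}\delta e^c_\beta e^\beta_b\, P_c(v\gamma_c\gamma^b) - \tfrac{1}{4}\delta e^c_\beta\, \partial_a e^\beta_b\, v\gamma_c\gamma^b\gamma^a \\
& - \tfrac{1}{4}\delta e^d_\beta e^\beta_b \Gamma^c_{ad}\, v\gamma_c\gamma^b\gamma^a + \tfrac{1}{4}\delta e^c_\beta e^\beta_d \Gamma^d_{ab}\, v\gamma_c\gamma^b\gamma^a \\
={}& \tfrac{-i}{4} P_c(\delta e^c_\beta e^\beta_b\, v\gamma_c\gamma^b) + \tfrac{i}{4}\delta e^c_\beta e^\beta_b\, (P_c v)\gamma_c\gamma^b - \tfrac{1}{4}\delta e^c_\beta e^\beta_b\, \nabla_a v\,[\gamma_c\gamma^b, \gamma^a] \\
& - \tfrac{1}{4}\delta e^c_\beta\, \partial_a e^\beta_b\, v\gamma_c\gamma^b\gamma^a + \tfrac{1}{4}\delta e^\beta_b e^d_\beta \Gamma^c_{ad}\, v\gamma_c\gamma^b\gamma^a + \tfrac{1}{4}\delta e^c_\beta e^\beta_d \Gamma^d_{ab}\, v\gamma_c\gamma^b\gamma^a.
\end{aligned} \tag{24}
\]
Because $[\gamma_c\gamma^b, \gamma^a] = \gamma_c\{\gamma^b, \gamma^a\} - \{\gamma_c, \gamma^a\}\gamma^b = 2\eta^{ab}\gamma_c - 2\delta^a_c\gamma^b$ and $e^c_\beta = g_{\mu\beta}\eta^{cd}e^\mu_d$ we can write:
\[
\begin{aligned}
-\tfrac{1}{4}\delta e^c_\beta e^\beta_b\, \nabla_a v\,[\gamma_c\gamma^b, \gamma^a] &= -\tfrac{1}{2}\delta(g_{\mu\beta}\eta^{cd}e^\mu_d)\, e^\beta_b \eta^{ab}\, \nabla_a v\,\gamma_c + \tfrac{1}{2}\delta e^c_\beta e^\beta_b\, \nabla_c v\,\gamma^b \\
&= -\tfrac{1}{2}\delta g_{\mu\beta}\, \eta^{cd}e^\mu_d e^\beta_b \eta^{ab}\, \nabla_a v\,\gamma_c - \delta e^\mu_d e^a_\mu\, \nabla_a v\,\gamma^d \\
&= \tfrac{1}{2}\delta g^{\alpha\beta} e^a_\alpha e^b_\beta\, \nabla_a v\,\gamma_b - \delta e^\alpha_a e^d_\alpha\, \nabla_d v\,\gamma^a.
\end{aligned} \tag{25}
\]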
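The commutator identity used just before Eq. (25) relies only on the Clifford relation $\{\gamma^a, \gamma^b\} = 2\eta^{ab}$, so it can be checked numerically in any concrete representation. The following is a minimal sketch, not part of the original derivation: it assumes the Dirac representation and the signature (+,−,−,−), and the helper names (`matmul`, `block`, `max_violation`) are hypothetical.

```python
# Numerical check of [g_c g^b, g^a] = 2 eta^{ab} g_c - 2 delta^a_c g^b.
# Assumptions (not from the paper): Dirac representation, signature (+,-,-,-).

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(A, B, s):
    """Return A + s*B entrywise."""
    return [[A[i][j] + s * B[i][j] for j in range(len(A))] for i in range(len(A))]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def block(A, B, C, D):
    """Assemble a 4x4 matrix from four 2x2 blocks [[A, B], [C, D]]."""
    return [A[0] + B[0], A[1] + B[1], C[0] + D[0], C[1] + D[1]]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])
ETA = [1, -1, -1, -1]  # diagonal entries of eta^{ab}

# gamma^a with upper index, and gamma_c = eta_{cd} gamma^d with lower index.
gamma_up = [block(I2, Z2, Z2, scale(-1, I2))]
gamma_up += [block(Z2, s, scale(-1, s), Z2) for s in PAULI]
gamma_dn = [scale(ETA[c], gamma_up[c]) for c in range(4)]

def max_violation():
    """Largest entry of |lhs - rhs| over all index choices a, b, c."""
    worst = 0.0
    for a in range(4):
        for b in range(4):
            for c in range(4):
                prod = matmul(gamma_dn[c], gamma_up[b])
                lhs = combine(matmul(prod, gamma_up[a]),
                              matmul(gamma_up[a], prod), -1)
                eta_ab = ETA[a] if a == b else 0
                delta_ac = 1 if a == c else 0
                rhs = combine(scale(2 * eta_ab, gamma_dn[c]),
                              scale(2 * delta_ac, gamma_up[b]), -1)
                diff = combine(lhs, rhs, -1)
                worst = max(worst, max(abs(x) for row in diff for x in row))
    return worst
```

Calling `max_violation()` loops over all 64 index combinations; a result at the level of rounding error indicates that the identity holds in this representation.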

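When substituting Eqs. (24) and (25) into (23), we can recombine the terms
\[
-\tfrac{1}{4}\delta e^c_\beta\, \partial_a e^\beta_b\, v\gamma_c\gamma^b\gamma^a - \tfrac{1}{4}\delta e^c_\gamma e^\alpha_a e^\beta_b \Gamma^\gamma_{\alpha\beta}\, v\gamma_c\gamma^b\gamma^a = -\tfrac{1}{4}\delta e^c_\gamma e^\gamma_d \Gamma^d_{ab}\, v\gamma_c\gamma^b\gamma^a
\]
to obtain
\[
\begin{aligned}
\delta\slashed{\nabla} v ={}& \tfrac{-i}{4} P_c(\delta e^c_\beta e^\beta_b\, v\gamma_c\gamma^b) + \tfrac{i}{4}\delta e^c_\beta e^\beta_b\, (P_c v)\gamma_c\gamma^b + \tfrac{1}{2}\delta g^{\alpha\beta} e^a_\alpha e^b_\beta\, \nabla_a v\,\gamma_b \\
& - \tfrac{1}{4}\delta\Gamma^\gamma_{\alpha\beta}\, e^\alpha_a e^\beta_b e^c_\gamma\, v\gamma_c\gamma^b\gamma^a.
\end{aligned} \tag{26}
\]
Note that the variations of the frame $\delta e^\alpha_a$ cancel out, except in the terms with $P_c$. These are harmless when we compute $B_0(\delta\slashed{\nabla} S_0 f)$, because both $B_0$ and v solve the Dirac equation. Therefore, the final answer will not depend on variations of the frame, as desired. In the last term of Eq. (26), we can use the symmetry of the Christoffel symbol:
\[
\begin{aligned}
-\tfrac{1}{4}\delta\Gamma^\gamma_{(\alpha\beta)}\, e^\alpha_a e^\beta_b e^c_\gamma\, v\gamma_c\gamma^b\gamma^a &= -\tfrac{1}{4}\delta\Gamma^\gamma_{\alpha\beta}\, e^\alpha_a e^\beta_b e^c_\gamma\, v\gamma_c\, \eta^{ab} = -\tfrac{1}{4}\delta\Gamma^\gamma_{\alpha\beta}\, g^{\alpha\beta} e^c_\gamma\, v\gamma_c \\
&= -\tfrac{1}{4}\delta g^{\gamma\mu} g_{\mu\nu} \Gamma^\nu_{\alpha\beta}\, g^{\alpha\beta} e^c_\gamma\, v\gamma_c - \tfrac{1}{4}\partial_\alpha \delta g_{\beta\mu}\, e^\mu_a g^{\alpha\beta}\, v\gamma^a \\
&\quad + \tfrac{1}{8}\partial_\mu \delta g_{\alpha\beta}\, e^\mu_a g^{\alpha\beta}\, v\gamma^a.
\end{aligned} \tag{27}
\]
We handle the last term using an integration by parts as before:
\[
\begin{aligned}
\tfrac{1}{8}\partial_a \delta g_{\alpha\beta}\, g^{\alpha\beta}\, v\gamma^a &= \tfrac{-i}{8} P_c(\delta g_{\alpha\beta} g^{\alpha\beta} v) + \tfrac{i}{8}\delta g_{\alpha\beta} g^{\alpha\beta}\, P_c v - \tfrac{1}{8}\delta g_{\alpha\beta}\, \partial_a g^{\alpha\beta}\, v\gamma^a \\
&= \tfrac{-i}{8} P_c(\delta g_{\alpha\beta} g^{\alpha\beta} v) + \tfrac{i}{8}\delta g_{\alpha\beta} g^{\alpha\beta}\, P_c v - \tfrac{1}{8}\delta g^{\alpha\beta}\, \partial_a g_{\alpha\beta}\, v\gamma^a,
\end{aligned} \tag{28}
\]
where we used $\delta g_{\alpha\beta}\, \partial_a g^{\alpha\beta} = -\delta g^{\alpha\beta} g_{\alpha\mu} g_{\beta\nu}\, \partial_a g^{\mu\nu} = \delta g^{\alpha\beta}\, \partial_a g_{\alpha\beta}$. The penultimate term in (27) is:
\[
\begin{aligned}
-\tfrac{1}{4}\partial_\alpha \delta g_{\beta\mu}\, e^\mu_a g^{\alpha\beta}\, v\gamma^a &= \tfrac{1}{4}\partial_b(\delta g^{\alpha\beta} g_{\alpha\mu} g_{\beta\nu})\, e^\mu_a e^b_\rho g^{\rho\nu}\, v\gamma^a \\
&= \tfrac{1}{4}\partial_b(\delta g^{\alpha\beta} e^a_\alpha e^b_\beta)\, v\gamma_a - \tfrac{1}{4}\delta g^{\alpha\beta} g_{\alpha\mu} g_{\beta\nu}\, \partial_b(e^\mu_a e^b_\rho g^{\nu\rho})\, v\gamma^a \\
&= \tfrac{1}{4}\nabla_b(\delta g^{\alpha\beta} e^a_\alpha e^b_\beta)\, v\gamma_a - \tfrac{1}{4}\delta g^{\alpha\beta}(\Gamma^a_{bc} e^c_\alpha e^b_\beta + \Gamma^b_{bc} e^a_\alpha e^c_\beta)\, v\gamma_a \\
&\quad - \tfrac{1}{4}\delta g^{\alpha\beta} g_{\alpha\mu} g_{\beta\nu}\, \partial_b(e^\mu_a e^b_\rho g^{\nu\rho})\, v\gamma^a.
\end{aligned} \tag{29}
\]
The first term on the right-hand side of Eq. (29) is
\[
\tfrac{1}{4}\nabla_b(\delta g^{\alpha\beta} e^a_\alpha e^b_\beta)\, v\gamma_a = \tfrac{1}{4}\nabla_b(\delta g^{\alpha\beta} e^a_\alpha e^b_\beta\, v\gamma_a) - \tfrac{1}{4}\delta g^{\alpha\beta} e^a_\alpha e^b_\beta\, \nabla_b v\,\gamma_a. \tag{30}
\]
The other terms can be simplified with some computation:
\[
\begin{aligned}
& -\tfrac{1}{4}\delta g^{\alpha\beta}\bigl(\Gamma^a_{bc} e^c_\alpha e^b_\beta + \Gamma^b_{bc} e^a_\alpha e^c_\beta + g_{\alpha\mu} g_{\beta\nu} \eta^{ac}\, \partial_b(e^\mu_c e^b_\rho g^{\rho\nu})\bigr)\, v\gamma_a \\
&\quad = -\tfrac{1}{4}\delta g^{\alpha\beta}\bigl(-\partial_\beta e^a_\alpha + e^a_\gamma \Gamma^\gamma_{\beta\alpha} - e^a_\alpha \partial_c e^c_\beta + e^a_\alpha \Gamma^\mu_{\mu\beta} \\
&\qquad\qquad + e^a_\alpha g_{\beta\nu}\, \partial_\rho g^{\rho\nu} + e^a_\alpha \partial_b e^b_\beta + g_{\alpha\mu} \eta^{ac}\, \partial_\beta e^\mu_c\bigr)\, v\gamma_a \\
&\quad = -\tfrac{1}{4}\delta g^{\alpha\beta}\bigl(-\eta^{ac} e^\mu_c\, \partial_\beta g_{\alpha\mu} + e^a_\gamma \Gamma^\gamma_{\beta\alpha} + e^a_\alpha \Gamma^\mu_{\mu\beta} - e^a_\alpha g^{\rho\nu}\, \partial_\rho g_{\beta\nu}\bigr)\, v\gamma_a \\
&\quad = -\tfrac{1}{8}\delta g^{\alpha\beta}\bigl(-2 e^a_\gamma g^{\gamma\mu}\, \partial_\beta g_{\alpha\mu} + e^a_\gamma g^{\gamma\mu}(2\partial_\beta g_{\alpha\mu} - \partial_\mu g_{\alpha\beta}) \\
&\qquad\qquad + e^a_\alpha g^{\mu\gamma}\, \partial_\beta g_{\mu\gamma} - 2 e^a_\alpha g^{\rho\nu}\, \partial_\rho g_{\beta\nu}\bigr)\, v\gamma_a \\
&\quad = \tfrac{1}{8}\delta g^{\alpha\beta}\bigl(e^a_\gamma g^{\gamma\mu}\, \partial_\mu g_{\alpha\beta} + 2 e^a_\alpha g_{\beta\mu} g^{\rho\nu} \Gamma^\mu_{\rho\nu}\bigr)\, v\gamma_a.
\end{aligned} \tag{31}
\]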
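The final equality in Eq. (31) can be verified with the standard formula for the Christoffel symbols of the Levi-Civita connection. The following worked step is a reader's check and is not taken from the original text; it assumes only $\Gamma^\mu_{\rho\nu} = \tfrac{1}{2} g^{\mu\sigma}(\partial_\rho g_{\sigma\nu} + \partial_\nu g_{\sigma\rho} - \partial_\sigma g_{\rho\nu})$ and the symmetry of $g^{\rho\nu}$.

```latex
% Contracted Christoffel symbol with one index lowered:
\[
2\, g_{\beta\mu}\, g^{\rho\nu}\, \Gamma^\mu_{\rho\nu}
  = g^{\rho\nu}\bigl(\partial_\rho g_{\beta\nu} + \partial_\nu g_{\beta\rho} - \partial_\beta g_{\rho\nu}\bigr)
  = 2\, g^{\rho\nu}\, \partial_\rho g_{\beta\nu} - g^{\mu\gamma}\, \partial_\beta g_{\mu\gamma}.
\]
% The first two terms of the penultimate bracket combine to
\[
-2\, e^a_\gamma g^{\gamma\mu}\, \partial_\beta g_{\alpha\mu}
  + e^a_\gamma g^{\gamma\mu}\bigl(2\,\partial_\beta g_{\alpha\mu} - \partial_\mu g_{\alpha\beta}\bigr)
  = -\, e^a_\gamma g^{\gamma\mu}\, \partial_\mu g_{\alpha\beta},
\]
% so the whole bracket equals
\[
-\, e^a_\gamma g^{\gamma\mu}\, \partial_\mu g_{\alpha\beta}
  - 2\, e^a_\alpha\, g_{\beta\mu}\, g^{\rho\nu}\, \Gamma^\mu_{\rho\nu},
\]
% and multiplying by the prefactor -1/8 gives the last line of (31).
```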

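Substituting Eqs. (27)–(31) into (26) yields:
\[
\begin{aligned}
\delta\slashed{\nabla} v ={}& \tfrac{-i}{4} P_c(\delta e^c_\beta e^\beta_b\, v\gamma_c\gamma^b) + \tfrac{i}{4}\delta e^c_\beta e^\beta_b\, (P_c v)\gamma_c\gamma^b - \tfrac{i}{8} P_c(\delta g_{\alpha\beta} g^{\alpha\beta} v) + \tfrac{i}{8}\delta g_{\alpha\beta} g^{\alpha\beta}\, P_c v \\
& + \tfrac{1}{4}\delta g^{\alpha\beta} e^a_\alpha e^b_\beta\, \nabla_a v\,\gamma_b + \tfrac{1}{4}\nabla_b(\delta g^{\alpha\beta} e^a_\alpha e^b_\beta\, v\gamma_a).
\end{aligned} \tag{32}
\]
Using Lemma 4.17, we find for a spinor $u \in C^\infty(DM)$:
\[
\begin{aligned}
\delta\slashed{\nabla} u ={}& \tfrac{i}{4} P(\delta e^c_\beta e^\beta_b\, \gamma^b\gamma_c u) - \tfrac{i}{4}\delta e^c_\beta e^\beta_b\, \gamma^b\gamma_c (Pu) + \tfrac{i}{8} P(\delta g_{\alpha\beta} g^{\alpha\beta} u) - \tfrac{i}{8}\delta g_{\alpha\beta} g^{\alpha\beta}\, Pu \\
& + \tfrac{1}{4}\delta g^{\alpha\beta} e^a_\alpha e^b_\beta\, \gamma_b \nabla_a u + \tfrac{1}{4}\nabla_b(\delta g^{\alpha\beta} e^a_\alpha e^b_\beta\, \gamma_a u).
\end{aligned} \tag{33}
\]
Using Proposition 4.16 and Eqs. (32) and (33) we notice that the terms with $P_c$ and P cancel out in the following equality, because $B_0$ and $S_0 f$ both satisfy the Dirac equation:
\[
\begin{aligned}
\delta(\beta B_0(f)) &= -B_0(\delta P\, S_0 f) \\
&= \tfrac{i}{4} B_0(\delta g^{\alpha\beta} e^a_\alpha e^b_\beta\, \gamma_b \nabla_a S_0 \tau f) + \tfrac{i}{4} B_0(\nabla_b(\delta g^{\alpha\beta} e^a_\alpha e^b_\beta\, \gamma_a S_0 \tau f)) \\
&= \tfrac{i}{4}\delta g^{\alpha\beta} e^a_\alpha e^b_\beta \bigl(B_0(\gamma_{(b} \nabla_{a)} S_0 \tau f) - \nabla_{(b} B_0(\gamma_{a)} S_0 \tau f)\bigr).
\end{aligned} \tag{34}
\]
We now compare with Proposition 4.14 to get the final result.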

References

[1] K. Sanders, Aspects of locally covariant quantum field theory, PhD thesis, University of York (2008); also available online, arXiv:0809.4828v1 [math-ph].
[2] R. Brunetti, K. Fredenhagen and R. Verch, The generally covariant locality principle - a new paradigm for local quantum field theory, Comm. Math. Phys. 237 (2003) 31–68.
[3] C. Dappiaggi, T.-P. Hack and N. Pinamonti, The extended algebra of observables for Dirac fields and the trace anomaly of their stress-energy tensor, Rev. Math. Phys. 21 (2009) 1241–1312.
[4] R. Verch, A spin-statistics theorem for quantum fields on curved spacetime manifolds in a generally covariant framework, Comm. Math. Phys. 223 (2001) 261–288.
[5] C. J. Fewster, Quantum energy inequalities and local covariance II: Categorical formulation, Gen. Relativ. Gravit. 39 (2007) 1855–1890.
[6] J. Dimock, Dirac quantum fields on a manifold, Trans. Amer. Math. Soc. 269 (1982) 133–147.
[7] C. J. Fewster and R. Verch, A quantum weak energy inequality for Dirac fields in curved spacetime, Comm. Math. Phys. 225 (2002) 331–359.
[8] H. B. Lawson and M.-L. Michelsohn, Spin Geometry (Princeton University Press, Princeton, 1989).
[9] R. Coquereaux, Clifford algebras, spinors and fundamental interactions: Twenty years after, arXiv:math-ph/0509040v1.
[10] W. Pauli, Contributions mathématiques à la théorie des matrices de Dirac, Ann. Inst. H. Poincaré 6 (1936) 109–136.
[11] B. L. van der Waerden, Group Theory and Quantum Mechanics (Springer, Berlin, 1974).
[12] Y. Choquet-Bruhat, C. DeWitt-Morette and M. Dillard-Bleick, Analysis, Manifolds and Physics (North Holland, Amsterdam, 1977).


[13] S. P. Dawson and C. J. Fewster, An explicit quantum weak energy inequality for Dirac fields in curved spacetimes, Class. Quantum Grav. 23 (2006) 6659–6681.
[14] S. Mac Lane, Categories for the Working Mathematician (Springer, New York, 1971).
[15] S. Mac Lane and I. Moerdijk, Sheaves in Geometry and Logic: A First Introduction to Topos Theory (Springer, New York, 1992).
[16] C. Bär, N. Ginoux and F. Pfäffle, Wave Equations on Lorentzian Manifolds and Quantization (EMS, Zürich, 2007).
[17] J. Dieudonné, Treatise on Analysis, Vol. III (Academic Press, New York-London, 1972).
[18] J. Tolksdorf, Clifford modules and generalized Dirac operators, Internat. J. Theoret. Phys. 40 (2001) 191–209.
[19] R. Geroch, Spinor structures of space-times in general relativity. I, J. Math. Phys. 9 (1968) 1739–1744.
[20] R. M. Wald, General Relativity (University of Chicago Press, Chicago-London, 1984).
[21] R. Geroch, Spinor structures of space-times in general relativity. II, J. Math. Phys. 11 (1970) 343–348.
[22] S. Kobayashi and K. Nomizu, Foundations of Differential Geometry, Vol. I (Interscience, New York, 1963).
[23] R. H. Good Jr., Properties of the Dirac matrices, Rev. Mod. Phys. 27 (1955) 187–211.
[24] A. Lichnerowicz, Champs spinoriels et propagateurs en relativité générale, Bull. Soc. Math. France 92 (1964) 11–100.
[25] D. Canarutto and A. Jadczyk, Fundamental geometric structures for the Dirac equation in general relativity, Acta Appl. Math. 51 (1998) 59–92.
[26] J. E. Roberts and G. Ruzzi, A cohomological description of connections and curvature tensors over posets, Theory Appl. Categ. 16 (2006) 855–895.
[27] G. Segal, Classifying spaces and spectral sequences, Inst. Hautes Études Sci. Publ. Math. 34 (1968) 105–112.
[28] K. Schmüdgen, Unbounded Operator Algebras and Representation Theory (Birkhäuser, Basel, 1990).
[29] H. Araki, On the diagonalization of a bilinear Hamiltonian by a Bogoliubov transformation, Publ. Res. Inst. Math. Sci. Ser. A 4 (1968/1969) 387–412.
[30] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Vol. 2 (Springer, Berlin, 1996).
[31] M. J. Radzikowski, Micro-local approach to the Hadamard condition in quantum field theory on curved space-time, Comm. Math. Phys. 179 (1996) 529–553.
[32] K. Kratzert, Singularity structure of the two point function of the free Dirac field on a globally hyperbolic spacetime, Ann. Phys. (8) 9 (2000) 475–498.
[33] S. Hollands, The Hadamard condition for Dirac fields and adiabatic states on Robertson–Walker spacetimes, Comm. Math. Phys. 216 (2001) 635–661.
[34] H. Sahlmann and R. Verch, Microlocal spectrum condition and Hadamard form for vector-valued quantum fields in curved spacetime, Rev. Math. Phys. 13 (2001) 1203–1246.
[35] S. Hollands, The operator product expansion for perturbative quantum field theory in curved spacetime, Comm. Math. Phys. 273 (2007) 1–36.
[36] J. J. Duistermaat and L. Hörmander, Fourier integral operators. II, Acta Math. 128 (1972) 183–269.
[37] A. Strohmaier, R. Verch and M. Wollenberg, Microlocal analysis of quantum fields on curved space-times: Analytic wave front sets and Reeh–Schlieder theorems, J. Math. Phys. 43 (2002) 5514–5530.


[38] L. Hörmander, The Analysis of Linear Partial Differential Operators, Vol. I (Springer, Berlin, 2003).
[39] K. Sanders, Equivalence of the (generalized) Hadamard and microlocal spectrum condition for (generalized) free fields in curved spacetime, Comm. Math. Phys. 295 (2010) 485–501.
[40] R. Brunetti, K. Fredenhagen and M. Köhler, The microlocal spectrum condition and Wick polynomials of free fields on curved spacetimes, Comm. Math. Phys. 180 (1996) 633–652.
[41] N. Dencker, On the propagation of polarization sets for systems of real principal type, J. Funct. Anal. 46 (1982) 351–372.
[42] C. D’Antoni and S. Hollands, Nuclearity, local quasiequivalence and split property for Dirac quantum fields in curved spacetime, Comm. Math. Phys. 261 (2006) 133–159.
[43] M. Forger and H. Römer, Currents and the energy-momentum tensor in classical field theory: A fresh look at an old problem, Ann. Phys. 309 (2004) 306–389.

May 11, 2010 10:7 WSPC/S0129-055X 148-RMP J070-S0129055X10004004

Reviews in Mathematical Physics Vol. 22, No. 4 (2010) 431–484 © World Scientific Publishing Company DOI: 10.1142/S0129055X10004004

INVERSE SCATTERING IN DE SITTER–REISSNER–NORDSTRÖM BLACK HOLE SPACETIMES

THIERRY DAUDÉ
Department of Mathematics and Statistics, McGill University, 805 Sherbrooke South West, Montréal QC, H3A 2K6, Canada
[email protected]

FRANÇOIS NICOLEAU
Département de Mathématiques, Laboratoire Jean Leray – UMR 6629, Université de Nantes, 2, rue de la Houssinière, BP 92208, 44322 Nantes Cedex 03, France
[email protected]

Received 4 October 2009
Revised 15 March 2010

In this paper, we study the inverse scattering of massive charged Dirac fields in the exterior region of (de Sitter)–Reissner–Nordström black holes. Firstly, we obtain a precise high-energy asymptotic expansion of the diagonal elements of the scattering matrix (i.e. of the transmission coefficients) and we show that the leading terms of this expansion allow one to recover uniquely the mass, the charge and the cosmological constant of the black hole. Secondly, in the case of nonzero cosmological constant, we show that the knowledge of the reflection coefficients of the scattering matrix on any interval of energy also permits one to recover these parameters uniquely.

Keywords: Inverse scattering; black holes; Dirac equation.
Mathematics Subject Classification 2010: 81U40, 35P25

1. Introduction

This paper deals with inverse scattering problems in black hole spacetimes and is a continuation of our previous work [4]. Here we shall study the inverse scattering of massive charged Dirac fields that propagate in the outer region of (de Sitter)–Reissner–Nordström black holes, an important family of spherically symmetric, charged exact solutions of the Einstein equations that will be thoroughly described in Sec. 2. These spacetimes are completely characterized by three parameters: the mass M > 0 and the electric charge Q ∈ R of the black hole as well as the cosmological constant Λ ≥ 0 of the universe. In what follows, these parameters will be considered as the “unknowns” of our inverse problem. In fact, the inverse scattering problem we have in mind is the following: we assume that we are static observers living in the exterior region of a (dS)-RN black hole, that is the region between the exterior event horizon of the black hole and the cosmological horizon when Λ > 0, or the region lying beyond the exterior event horizon of the black hole when Λ = 0. The geometry of the spacetime in which these observers live is thus fixed in some sense. What we do not assume, however, is that these observers know the exact values of the parameters M, Q and Λ a priori. Hence the natural question we address is: do such observers have any means to measure or characterize these parameters uniquely by an inverse scattering experiment?

Let us explain more precisely the exact inverse scattering problem studied in this paper. First of all, we shall use the direct scattering theory for massive charged Dirac fields established in [3] for RN black holes and more generally in [18] for dS-RN black holes. The point of view adopted in these papers to describe the geometry of the black hole is that of static observers located far from the horizons (think typically of a telescope on earth aimed at the black hole). We shall keep this point of view here, which means in practice that all the relevant objects (such as the wave and scattering operators) used in this work will be expressed by means of the Regge–Wheeler coordinate system. This choice of coordinates has an important consequence for the understanding of the boundaries of the outer region of (dS)-RN black holes, namely, either the exterior event horizon of the black hole and the cosmological horizon when Λ > 0, or the event horizon of the black hole and spacelike infinity when Λ = 0. These boundaries are indeed perceived by such observers as asymptotic regions of the spacetime which, moreover, may have very different geometrical structures.

This entails the following nice and peculiar picture concerning the propagation properties of the Dirac fields ([3, 18]). First, it can be proved that the energy of the fields contained in any compact set between the two asymptotic regions vanishes at late times. Therefore, the fields scatter toward these asymptotic regions. Second, from the point of view of our particular observers, Dirac fields are shown to obey simple but different equations there, which reflect the different geometries of the asymptotic regions. Therefore, two distinct wave operators must be introduced according to the asymptotic region we consider. Let us denote for the moment the wave operators corresponding to the part of Dirac fields which scatters toward the event horizon of the black hole by $W^\pm_{(-\infty)}$ and the wave operators corresponding to the part of Dirac fields which scatters toward the cosmological horizon or spatial infinity by $W^\pm_{(+\infty)}$. These wave operators will be precisely defined in Sec. 2. The main result obtained in [3, 18] asserts that the global wave operators defined by
\[
W^\pm = W^\pm_{(-\infty)} + W^\pm_{(+\infty)}, \tag{1.1}
\]
exist and are asymptotically complete. This allows us to define a global scattering operator S by the usual formula $S = (W^+)^* W^-$.


The scattering operator S will be the main object of study of this paper. It encodes the scattering data as viewed by observers living far from the horizons of a (dS)-RN black hole. We thus rephrase our initial problem more precisely as follows. We assume that our observers have access experimentally to the scattering operator S. More precisely, we assume that they can measure the expectation values of S, i.e. they can measure any quantities of the form $\langle S\psi, \phi\rangle$ where $\langle\cdot,\cdot\rangle$ denotes the scalar product of the energy Hilbert space H on which S acts and ψ, φ are any elements of H. The question we address is now: is the knowledge of S and any of its related quantities sufficient information to uniquely characterize the parameters M, Q and Λ of (dS)-RN black holes? We can in fact be a bit more precise in the statement of the problem if we remark that the scattering operator S can be decomposed using (1.1) as $S = T_L + T_R + L + R$, where
\[
T_L = (W^+_{(+\infty)})^* W^-_{(-\infty)}, \qquad T_R = (W^+_{(-\infty)})^* W^-_{(+\infty)},
\]
and
\[
R = (W^+_{(+\infty)})^* W^-_{(+\infty)}, \qquad L = (W^+_{(-\infty)})^* W^-_{(-\infty)}.
\]
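The four terms can be read off by inserting (1.1) into the formula $S = (W^+)^* W^-$. The short expansion below is a worked step added for illustration; it uses only the definitions above and the linearity of the adjoint.

```latex
\[
\begin{aligned}
S &= \bigl(W^+_{(-\infty)} + W^+_{(+\infty)}\bigr)^* \bigl(W^-_{(-\infty)} + W^-_{(+\infty)}\bigr) \\
  &= \underbrace{(W^+_{(-\infty)})^* W^-_{(-\infty)}}_{L}
   + \underbrace{(W^+_{(-\infty)})^* W^-_{(+\infty)}}_{T_R}
   + \underbrace{(W^+_{(+\infty)})^* W^-_{(-\infty)}}_{T_L}
   + \underbrace{(W^+_{(+\infty)})^* W^-_{(+\infty)}}_{R}.
\end{aligned}
\]
```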

Each of the terms in S corresponds to a different inverse scattering experiment. For instance, the first two terms $T_R$ and $T_L$ (in fact the diagonal elements of S) are understood as transmission operators. These terms measure the part of a signal which is transmitted from one asymptotic region to the other in a scattering process. Conversely, the last two terms L and R (the anti-diagonal elements of S) are understood as reflection operators and correspond to the opposite experiment. These terms measure the part of the signal which is reflected from an asymptotic region to itself. The quantities of interest (the inverse scattering data) will thus be either the expectation values $\langle T_R\psi, \phi\rangle$, $\langle T_L\psi, \phi\rangle$ of the transmission operators, or the expectation values $\langle L\psi, \phi\rangle$, $\langle R\psi, \phi\rangle$ of the reflection operators.

In this paper, we shall study two types of inverse problems. Firstly, in the two cases of RN black holes (Λ = 0) and dS-RN black holes (Λ > 0), we shall prove that the parameters M, Q, Λ are uniquely determined if we assume that the high energies of the transmission operators $T_R$ or $T_L$ are known. Note here that the same analysis would not be possible working with the reflection operators R or L. The high energies of the reflection operators are indeed non-measurable and thus cannot be used to determine the parameters uniquely. This was mentioned in [4] (see also [12] where a similar problem was studied). Secondly, in the case of dS-RN black holes only (Λ > 0), we shall prove the same uniqueness result under the assumption that the reflection operators L or R are known on any (possibly small) interval of energy. The reason why we do not treat this second type of inverse problem in the case of RN black holes is the following. The structure of the scattering operator (at any energy) turns out to be more complicated in the case of RN black holes than in dS-RN black holes. This is again a consequence of the very different geometries of the asymptotic regions in the RN case (see below for a brief explanation). Obtaining the same uniqueness result in this last case would thus require a better understanding of the scattering matrix. We are currently investigating this problem.

Let us now recall the results of [4] where the first kind of inverse problem was addressed in the case of Reissner–Nordström black holes (i.e. with only the two parameters M, Q unknown and the cosmological constant Λ equal to 0). Using the direct scattering theory for massless Dirac fields obtained in [3, 20] and a high energy asymptotic expansion of the expectation values $\langle T_R\psi, \phi\rangle$ or $\langle T_L\psi, \phi\rangle$ (as defined above), a partial answer was then given: the mass M and the modulus of the charge |Q| are uniquely determined from the leading terms of this high energy asymptotic expansion. Note that the indeterminacy of the sign of the charge is not surprising in that case since the propagation of massless Dirac fields is only influenced by the geometry of the black hole which in turn only depends on |Q| (see the expression of the metric (2.2) in Sec. 2).

In this paper, we continue our investigation and improve our results in several directions. In Sec. 3, we reconsider the case Λ = 0 corresponding to RN black holes but study the inverse scattering of massive charged Dirac fields instead of massless Dirac fields. Using the same approach as in [4], we show that the mass M as well as the charge Q are uniquely determined by the leading terms of the high energy asymptotic expansion of the transmission operators $T_R$ or $T_L$. In fact, the advantage of considering massive charged Dirac fields is that an explicit term associated to the interaction between the electric charge of the fields and that of the black hole appears in the equation and allows one to recover Q and not only |Q|.
From the mathematical side, the analysis turns out to be much more involved than in [4]. The reasons are twofold. First, from the point of view of our observers, massive Dirac fields have completely distinct behaviors when approaching the different asymptotic regions. At the event horizon of the black hole for instance, the attraction exerted by the black hole is so strong that massive Dirac fields seem to behave as massless Dirac fields. The asymptotic dynamics there turn out to be very simple and are shown to obey a system of transport equations along the null radial geodesics of the black hole.^a This is a consequence of the particular geometry (of hyperbolic type) near the event horizon (and more generally near any horizon). Conversely, RN black holes are asymptotically flat at spacelike infinity. There, the fields simply behave like massive Dirac fields in Minkowski spacetime and the mass of the fields, slowing down the propagation, plays an important role. In consequence, the dynamics near the two asymptotic regions are quite different and must be treated separately. The second and related difficulty comes from the appearance of long-range terms in the equation but only at a single asymptotic region: spacelike infinity. This entails new technical difficulties such as a modification of the standard wave operators at infinity and we need to work harder to obtain the high energy asymptotic expansion of the transmission operators. An interesting feature we would like to emphasize is that, even though we are considering high energies, the rest mass and the charge of Dirac fields do contribute to the asymptotic expansion of the scattering matrix. This can be clearly seen from the reconstruction formulae obtained in Theorem 3.2. Finally, we also mention that the model studied in this section can be viewed as a good intermediate model before studying the same inverse problem in the more complicated geometrical setting of Kerr black holes. As shown in [13] indeed, the appearance of long-range terms in the equation (even for massless Dirac fields) is unavoidable in that case as a side effect of the rotation of the spacetime.

^a We emphasize again here that this simple expression for the asymptotic dynamics at the event horizon (in fact at any horizon) is only true from the point of view of observers living far from the horizons. Adopting another point of view such as the one of local observers living near a horizon would lead to very different asymptotic dynamics.

In Sec. 4, we consider the case of nonzero cosmological constant Λ > 0, that is de Sitter–Reissner–Nordström black holes, and the three parameters M, Q, Λ are supposed to be a priori unknown. The two asymptotic regions are the event horizon of the black hole and the cosmological horizon. From the point of view of our observers, massive Dirac fields seem to behave as massless Dirac fields when approaching the horizons and as before, their propagation there obeys essentially a system of transport equations along the null radial geodesics of the black hole. However, different oscillations appear in the dynamics near these two horizons, once again due to the interaction between the charge of the field and that of the black hole. In consequence, Dirac fields evolve asymptotically according to slightly different dynamics in that case too. In Sec. 4.1, using the results of the previous part, we shall obtain a high energy asymptotic expansion of the transmission operators $T_R$ and $T_L$ and we shall prove that the parameters M, Q and Λ are uniquely characterized by the leading terms of this asymptotic expansion.

In Sec. 4.2, we consider an inverse scattering problem based on the knowledge of the reflection operators R or L on a (possibly small) interval of energy. As already mentioned, a high energy asymptotic expansion of these reflection operators does not give any information and cannot be used to solve the inverse problem. To study this case, we follow instead the usual stationary approach of inverse scattering theory on the line. We refer for instance to the review by Faddeev [8] and to the important paper by Deift and Trubowitz [6] for a presentation of the method for Schrödinger operators and to the paper [1] for a recent application to Dirac operators (see also [12, 15]). We shall first obtain a stationary representation of the scattering operator S in terms of the usual transmission and reflection “coefficients” (note that these turn out to be matrices in our case). This is done after a series of simplifications of our model, which finally reduces to a particular case of the model studied in [1]. Then we use the analysis of [1], namely, a classical Marchenko method based on a detailed analysis of the stationary solutions of the corresponding Dirac equation, to prove the following result: the knowledge of one of the reflection operators L or R at all energies is enough to uniquely characterize the parameters M, Q and Λ. Finally, we improve this result by observing that, in the dS-RN model, the reflection operators R or L are in fact analytic in the energy variable on a small strip containing the real axis. Hence it is enough to know R or L on any interval of energy in order to know them uniquely for all energies. Applying the result of [1], this leads to the uniqueness of the parameters in that case too. Note finally that a stationary representation of the scattering operator in the case of RN black holes would differ drastically from the one obtained in Sec. 4.2 for dS-RN black holes. This is due to the presence of long-range terms at spacelike infinity that change the asymptotic behaviors of stationary solutions and thus the structure of the scattering matrix. In particular, the stationary representation obtained in [1] could not be used in this case.

We finish this introduction by saying a few words on the main technical tool used in Secs. 3 and 4 to prove our uniqueness results from the high energies of the transmission operators $T_R$ or $T_L$. These are based on a high-energy expansion of the scattering operator S following an approach introduced by Enss and Weder in [7] in the case of multidimensional Schrödinger operators. (Note that the case of multidimensional Dirac operators in flat spacetime was treated later by Jung in [17].) Their result can be summarized as follows. Using purely time-dependent methods, they showed, roughly speaking, that the first term of the high-energy expansion of S is exactly the Radon transform of the potential they are looking for. Since they work in dimension greater than two, this Radon transform can be inverted and the potential thus uniquely recovered. In our problem however, due to the spherical symmetry of the black hole, we are led to study a family of one-dimensional Dirac equations and the above Radon transform simply becomes an integral of a one-dimensional function, hence a number, and cannot be inverted.
Fortunately, in our models this integral can be computed explicitly and in general already gives physically relevant information. Nevertheless, it is not enough to uniquely characterize all the parameters of the black hole. In fact, we need to calculate several terms of the asymptotic expansion (and thus obtain several integrals) to prove our result. To do this, we follow the stationary technique introduced by one of us [21], which is close in spirit to the Isozaki–Kitada method used in long-range scattering theory [16]. The basic idea is to replace the wave operators (and thus the scattering operator) by explicit Fourier Integral Operators, called modifiers, from which we are able to compute the high-energy expansion readily. The construction of these modifiers and the precise determination of their phases and amplitudes will be given in a self-contained manner in Sec. 3. Note also that the similar results proved in our previous paper [4] could not be applied directly to our new model because of the presence of long-range terms in the equation. Finally, we mention that, while this method was well known for Schrödinger operators and applied successfully to various situations (see [2, 21–23]), it has required some substantial modifications when applied to Dirac operators, essentially because of the matrix-valued nature of the equation. To deal with these difficulties, we made extensive use of the paper by Gâtel and Yafaev [9] where a direct scattering theory of massive Dirac fields in flat spacetime was studied and modifiers were constructed.
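The remark that a single line integral cannot determine a one-dimensional function can be made concrete with a toy computation. The sketch below is purely illustrative and is not taken from the paper: the two profiles `v_narrow` and `v_wide` are hypothetical stand-ins for a potential, chosen so that both integrate to √π over the real line, and the quadrature helper `trapezoid` is a plain trapezoidal rule on a truncated interval.

```python
import math

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule for f on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

# Two different hypothetical profiles with the same integral over the line:
# the integral of exp(-x^2) is sqrt(pi), and so is that of 0.5*exp(-x^2/4).
def v_narrow(x):
    return math.exp(-x * x)

def v_wide(x):
    return 0.5 * math.exp(-x * x / 4.0)

def line_integral(v):
    """Approximate the integral of v over the real line (tails are negligible)."""
    return trapezoid(v, -40.0, 40.0, 8000)
```

Here `line_integral(v_narrow)` and `line_integral(v_wide)` agree to within quadrature error although the two profiles differ pointwise, which is why further terms of the expansion are needed to separate the parameters.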


2. (De Sitter)–Reissner–Nordström Black Holes and Dirac Equation

In this section, we describe the geometry of the exterior regions of (de Sitter)–Reissner–Nordström black holes. In particular, we emphasize the point of view adopted for the observers as well as the different properties of the asymptotic regions mentioned in the introduction, clearly distinguishing between the cases of zero and nonzero cosmological constant $\Lambda$. We then express in a synthetic manner the equations that govern the evolution of massive charged Dirac fields in these spacetimes. We end this section by recalling the known direct scattering results of [3, 18] and introducing the scattering operator $S$.

2.1. (De Sitter)–Reissner–Nordström black holes

In Schwarzschild coordinates a (de Sitter)–Reissner–Nordström black hole is described by a four-dimensional smooth manifold
\[ \mathcal{M} = \mathbb{R}_t \times \mathbb{R}^+_r \times S^2_\omega, \]
equipped with the Lorentzian metric
\[ g = F(r)\,dt^2 - F(r)^{-1}\,dr^2 - r^2\,d\omega^2, \tag{2.1} \]
where
\[ F(r) = 1 - \frac{2M}{r} + \frac{Q^2}{r^2} - \frac{\Lambda r^2}{3}, \tag{2.2} \]

and $d\omega^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$ is the Euclidean metric on the sphere $S^2$. The constants $M > 0$, $Q \in \mathbb{R}$ appearing in (2.2) are interpreted as the mass and the electric charge of the black hole, and $\Lambda \ge 0$ is the cosmological constant of the universe. Observe that the function (2.2), and thus the metric (2.1), do not depend on the angular variables $(\theta, \varphi) \in S^2$, reflecting the fact that dS-RN black holes are spherically symmetric spacetimes. The members of the family $(\mathcal{M}, g)$ are in fact exact solutions of the Einstein–Maxwell equations
\[ G_{\mu\nu} = 8\pi T_{\mu\nu}, \qquad G_{\mu\nu} = R_{\mu\nu} + \frac{1}{2} R\, g_{\mu\nu} + \Lambda g_{\mu\nu}. \tag{2.3} \]
Here $G_{\mu\nu}$, $R_{\mu\nu}$ and $R$ denote respectively the Einstein tensor, the Ricci tensor and the scalar curvature of $(\mathcal{M}, g)$, while $T_{\mu\nu}$ is the energy-momentum tensor
\[ T_{\mu\nu} = \frac{1}{4\pi}\left( F_{\mu\rho} F_{\nu}{}^{\rho} - \frac{1}{4}\, g_{\mu\nu} F_{\rho\sigma} F^{\rho\sigma} \right), \tag{2.4} \]

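where $F_{\mu\nu}$ is the electromagnetic two-form, solution of the Maxwell equations $\nabla^{\mu} F_{\mu\nu} = 0$, $\nabla_{[\mu} F_{\nu\rho]} = 0$, and given here in terms of a global electromagnetic vector potential
\[ F_{\mu\nu} = \nabla_{[\mu} A_{\nu]}, \qquad A_\nu\, dx^\nu = -\frac{Q}{r}\, dt. \tag{2.5} \]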


T. Daud´ e & F. Nicoleau

The metric $g$ has two types of singularities. Firstly, the point $\{r = 0\}$, at which the function $F$ is singular. This is a true singularity or curvature singularity.$^{b}$ Secondly, the spheres whose radii are the roots of $F$ (note that the coefficient of the metric $g$ involving $F^{-1}$ blows up in this case). We must distinguish here two cases. When the cosmological constant is positive, $\Lambda > 0$, and small enough, there are three positive roots $0 \le r_- < r_0 < r_+ < +\infty$. The spheres of radius $r_-$, $r_0$ and $r_+$ are called, respectively, the Cauchy, event and cosmological horizons of the dS-RN black hole. When $\Lambda = 0$, the number of these roots depends on the respective values of the constants $M$ and $Q$. In this paper, we only consider the case $M > |Q|$, for which the function $F$ has two zeros at the values $r_- = M - \sqrt{M^2 - Q^2}$ and $r_0 = M + \sqrt{M^2 - Q^2}$. The spheres of radius $r_-$ and $r_0$ are called, respectively, the Cauchy and event horizons of the RN black hole. In both situations, the horizons are not true singularities in the sense given for $\{r = 0\}$, but in fact coordinate singularities. It turns out that, using appropriate coordinate systems, these horizons can be understood as regular null hypersurfaces that can be crossed one way but would require speeds greater than that of light to be crossed the other way. We refer to [14, 28] for an introduction to black hole spacetimes and their general properties.

As mentioned in the introduction, we shall consider in this paper inverse scattering problems from the point of view of static observers living in the exterior region of a (dS)-RN black hole, that is the region $\{r_0 < r < r_+\}$ when $\Lambda > 0$ or the region $\{r_0 < r < +\infty\}$ when $\Lambda = 0$, and located far from the horizons. Such observers are well described by the variable $t$ of the Schwarzschild coordinates, meaning that $t$ corresponds to their proper time. Since the metric is singular there, it is important to understand the roles of the singularities, namely the horizons, as the natural boundaries of the exterior region.
It turns out that they are perceived by such observers as asymptotic regions of spacetime. Precisely, this means that they are never reached in a finite time $t$ by incoming and outgoing null radial geodesics, i.e. the trajectories followed by classical light-rays aimed radially at the black hole and either at the cosmological horizon if $\Lambda > 0$ or at infinity if $\Lambda = 0$. To see this point more easily, we introduce a new radial coordinate $x$, called the Regge–Wheeler coordinate, which has the property of straightening the null radial geodesics and will, at the same time, greatly simplify the later analysis. Observing that for all $\Lambda \ge 0$ the function $F(r)$ in the metric (2.2) remains positive in the exterior region, it can be defined implicitly by the relation
\[ \frac{dr}{dx} = F(r) > 0, \tag{2.6} \]
or explicitly, by
\[ x = \frac{1}{2\kappa_0}\left[ \log(r - r_0) - \int_{r_0}^{r} \left( \frac{1}{y - r_0} - \frac{2\kappa_0}{F(y)} \right) dy \right] + C, \tag{2.7} \]

$^{b}$ It means that certain scalars obtained by contracting the Riemann tensor blow up when $r \to 0$.


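where the quantity
\[ \kappa_0 = \frac{1}{2} F'(r_0) > 0 \]
is called the surface gravity of the event horizon and $C$ is any constant of integration. Note that, when $\Lambda > 0$, the Regge–Wheeler variable could also be defined explicitly by
\[ x = \frac{1}{2\kappa_+}\left[ \log(r_+ - r) - \int_{r}^{r_+} \left( \frac{1}{r_+ - y} + \frac{2\kappa_+}{F(y)} \right) dy \right] + C, \tag{2.8} \]
where the quantity
\[ \kappa_+ = \frac{1}{2} F'(r_+) < 0 \]
is called the surface gravity of the cosmological horizon. Moreover, in the case $\Lambda = 0$, where $F(r) = (r - r_-)(r - r_0)/r^2$, the expression (2.7) simplifies by partial fractions of $1/F$ to
\[ x = r + \frac{1}{2\kappa_0}\log(r - r_0) - \frac{r_-^2}{r_0 - r_-}\log(r - r_-) + C. \tag{2.9} \]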
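As a sanity check on the Regge–Wheeler coordinate in the case $\Lambda = 0$, the following sketch evaluates $x(r) = r + \frac{1}{2\kappa_0}\log(r - r_0) - \frac{r_-^2}{r_0 - r_-}\log(r - r_-) + C$, which is what a partial-fraction expansion of $1/F$ gives, and compares a finite-difference derivative with $1/F(r)$ as required by (2.6). The function names and the sample values of $M$ and $Q$ are illustrative assumptions, not taken from the paper.

```python
import math

def rn_horizons(M, Q):
    """Return (r_minus, r_0), the Cauchy and event horizon radii, assuming M > |Q|."""
    if M <= abs(Q):
        raise ValueError("this sketch assumes M > |Q|")
    root = math.sqrt(M * M - Q * Q)
    return M - root, M + root

def F(r, M, Q):
    """Metric function (2.2) with Lambda = 0."""
    return 1.0 - 2.0 * M / r + Q * Q / (r * r)

def surface_gravity(M, Q):
    """kappa_0 = F'(r_0) / 2, using F'(r) = 2M/r^2 - 2Q^2/r^3."""
    _, r0 = rn_horizons(M, Q)
    return M / r0**2 - Q * Q / r0**3

def regge_wheeler(r, M, Q, C=0.0):
    """Regge-Wheeler coordinate x(r) for r > r_0, up to the additive constant C."""
    r_minus, r0 = rn_horizons(M, Q)
    kappa0 = surface_gravity(M, Q)
    return (r + math.log(r - r0) / (2.0 * kappa0)
            - r_minus**2 / (r0 - r_minus) * math.log(r - r_minus) + C)

def dx_dr(r, M, Q, h=1e-5):
    """Central finite-difference approximation of dx/dr."""
    return (regge_wheeler(r + h, M, Q) - regge_wheeler(r - h, M, Q)) / (2.0 * h)
```

Evaluating `regge_wheeler` at radii approaching $r_0$ from above gives increasingly negative values, which illustrates how the event horizon is sent to $x = -\infty$.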

In the coordinate system $(t, x, \omega)$, it is easy to see from the logarithm in (2.7) and (2.9) and the positive sign of $\kappa_0$ that the event horizon $\{r = r_0\}$ is pushed away to $\{x = -\infty\}$ for all $\Lambda \ge 0$. Similarly, it follows from (2.8) and the negative sign of $\kappa_+$ that the cosmological horizon $\{r = r_+\}$ is pushed away to $\{x = +\infty\}$ when $\Lambda > 0$. Hence in any case the Regge–Wheeler variable $x$ runs over the full real line $\mathbb{R}$. Moreover, by (2.6), the metric now takes the form
\[ g = F(r)\,(dt^2 - dx^2) - r^2\, d\omega^2, \tag{2.10} \]
from which it is immediate to see that the incoming and outgoing null radial geodesics are generated by the vector fields $\frac{\partial}{\partial t} \pm \frac{\partial}{\partial x}$ and take the simple form
\[ \gamma^{\pm}(t) = (t,\ x_0 \pm t,\ \omega_0), \qquad t \in \mathbb{R}, \tag{2.11} \]

where $(x_0, \omega_0) \in \mathbb{R} \times S^2$ are fixed. These are simply straight lines with velocity $\pm 1$ mimicking, at least in the $t$–$x$ plane, the situation of a one-dimensional Minkowski spacetime. Finally, using (2.11), we can check directly that the event horizon and the cosmological horizon (when $\Lambda > 0$) are asymptotic regions of spacetime in the sense given above.

From now on, we shall only consider the exterior region of dS-RN black holes and we shall work on the manifold $\mathcal{B} = \mathbb{R}_t \times \Sigma$ with $\Sigma = \mathbb{R}_x \times S^2_\omega$, equipped with the metric (2.10). Such a manifold $\mathcal{B}$ is globally hyperbolic, meaning that the foliation $\Sigma_t = \{t\} \times \Sigma$ by the level hypersurfaces of the function $t$ is a foliation of $\mathcal{B}$ by Cauchy hypersurfaces (see [28] for a definition of global hyperbolicity and Cauchy hypersurfaces). Consequently, we can view the propagation of massive charged Dirac fields as an evolution equation in $t$ on the spacelike hypersurface $\Sigma$, that is a cylindrical manifold having two distinct ends: $\{x = -\infty\}$ corresponding to the event


horizon of the black hole and $\{x = +\infty\}$ corresponding to the cosmological horizon when $\Lambda > 0$ and to spacelike infinity when $\Lambda = 0$. Note that the geometries of these ends are distinct in general. The event and cosmological horizons are indeed exponentially large ends of $\Sigma$ whereas spacelike infinity is an asymptotically flat end of $\Sigma$ (in the latter, observe that the metric (2.2) tends to the Minkowski metric expressed in spherical coordinates when $r \to +\infty$). The difference between these geometries will be easily seen from the distinct asymptotic behaviors of Dirac fields near these regions given in the next subsection.

2.2. Dirac equation and direct scattering results

Scattering theory for massive charged Dirac fields on the spacetime $\mathcal{B}$ has been the object of the papers [3, 18]. We briefly recall here the main results of these papers. In particular, we use the form of the Dirac equation obtained therein. First, the evolution equation satisfied by massive charged Dirac fields in $\mathcal{B}$ can be written in the Hamiltonian form
\[ i\partial_t \psi = H\psi, \tag{2.12} \]
where $\psi$ is a 4-component spinor belonging to the Hilbert space $\mathcal{H} = L^2(\mathbb{R} \times S^2; \mathbb{C}^4)$, and the Hamiltonian $H$ is given by
\[ H = \Gamma^1 D_x + a(x) D_{S^2} + b(x)\Gamma^0 + c(x). \tag{2.13} \]

Here we use the following notation. The symbol $D_x$ stands for $-i\partial_x$ whereas $D_{S^2}$ denotes the Dirac operator on $S^2$ which, in spherical coordinates, takes the form
\[ D_{S^2} = -i\Gamma^2\left( \partial_\theta + \frac{\cot\theta}{2} \right) - \frac{i}{\sin\theta}\,\Gamma^3 \partial_\varphi. \tag{2.14} \]
The potentials $a, b, c$ are smooth scalar functions given in terms of the metric (2.1) by
\[ a(x) = \frac{\sqrt{F(r)}}{r}, \qquad b(x) = m\sqrt{F(r)}, \qquad c(x) = \frac{qQ}{r}, \tag{2.15} \]
where $m$ and $q$ denote the mass and the electric charge of the fields respectively. Finally, the matrices $\Gamma^1, \Gamma^2, \Gamma^3, \Gamma^0$ appearing in (2.13) and (2.14) are usual $4 \times 4$ Dirac matrices that satisfy the anticommutation relations
\[ \Gamma^i \Gamma^j + \Gamma^j \Gamma^i = 2\delta_{ij}\,\mathrm{Id}, \qquad \forall\, i, j = 0, \dots, 3. \tag{2.16} \]

Second, we use the spherical symmetry of the equation to simplify further the expression of the Hamiltonian $H$. Since the Dirac operator $D_{S^2}$ has compact resolvent, it can be diagonalized into an infinite sum of matrix-valued multiplication operators. The eigenfunctions associated with $D_{S^2}$ are a generalization of the usual spherical harmonics called spin-weighted spherical harmonics. We refer to


Gel'fand and Sapiro [10] for a detailed presentation of these generalized spherical harmonics and to [3, 18] for an application to our model. There thus exists a family $F^l_n$ of eigenfunctions of $D_{S^2}$, with the indices $(l, n)$ running in the set $I = \{(l, n),\ l - |\tfrac{1}{2}| \in \mathbb{N},\ l - |n| \in \mathbb{N}\}$, which forms a Hilbert basis of $L^2(S^2; \mathbb{C}^4)$ with the following property. The Hilbert space $\mathcal{H}$ can be decomposed into the infinite direct sum
\[ \mathcal{H} = \bigoplus_{(l,n)\in I} \left[ L^2(\mathbb{R}_x; \mathbb{C}^4) \otimes F^l_n \right] := \bigoplus_{(l,n)\in I} \mathcal{H}_{ln}, \]
where $\mathcal{H}_{ln} = L^2(\mathbb{R}_x; \mathbb{C}^4) \otimes F^l_n$ is identified with $L^2(\mathbb{R}; \mathbb{C}^4)$ and, more importantly, we obtain the orthogonal decomposition for the Hamiltonian $H$
\[ H = \bigoplus_{(l,n)\in I} H^{ln}, \]
with
\[ H^{ln} := H|_{\mathcal{H}_{ln}} = \Gamma^1 D_x + a_l(x)\Gamma^2 + b(x)\Gamma^0 + c(x), \tag{2.17} \]
and $a_l(x) = -a(x)(l + \tfrac{1}{2})$. Note that the Dirac operator $D_{S^2}$ has been replaced in the expression of $H^{ln}$ by $-(l + \tfrac{1}{2})\Gamma^2$ thanks to the good properties of the spin-weighted spherical harmonics $F^l_n$. The operator $H^{ln}$ is a selfadjoint operator on $\mathcal{H}_{ln}$ with domain $D(H^{ln}) = H^1(\mathbb{R}; \mathbb{C}^4)$. Finally we use the following representation for the Dirac matrices $\Gamma^1$, $\Gamma^2$ and $\Gamma^0$ appearing in (2.17):
\[
\Gamma^1 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \quad
\Gamma^2 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}, \quad
\Gamma^0 = \begin{pmatrix} 0 & 0 & -i & 0 \\ 0 & 0 & 0 & i \\ i & 0 & 0 & 0 \\ 0 & -i & 0 & 0 \end{pmatrix}.
\tag{2.18}
\]
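The anticommutation relations (2.16) can be checked mechanically for a concrete representation. The sketch below does so for $\Gamma^1 = \mathrm{diag}(1, 1, -1, -1)$ together with an antidiagonal-type $\Gamma^2$ and a purely imaginary $\Gamma^0$; these entries are an assumed reading of the representation (2.18), and the helper names are illustrative only.

```python
# Assumed reading of the representation (2.18); entries are not guaranteed
# to match the published matrices.
GAMMA1 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
GAMMA2 = [[0, 0, 0, 1], [0, 0, -1, 0], [0, -1, 0, 0], [1, 0, 0, 0]]
GAMMA0 = [[0, 0, -1j, 0], [0, 0, 0, 1j], [1j, 0, 0, 0], [0, -1j, 0, 0]]

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def anticommutator(A, B):
    """Return AB + BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] + BA[i][j] for j in range(n)] for i in range(n)]

def is_hermitian(A, tol=1e-12):
    """Check that A equals its conjugate transpose."""
    n = len(A)
    return all(abs(A[i][j] - complex(A[j][i]).conjugate()) <= tol
               for i in range(n) for j in range(n))

def satisfies_clifford_relations(mats, tol=1e-12):
    """Check G_i G_j + G_j G_i = 2 delta_ij Id for every pair in mats."""
    n = len(mats[0])
    for i, A in enumerate(mats):
        for j, B in enumerate(mats):
            AC = anticommutator(A, B)
            for r in range(n):
                for c in range(n):
                    target = 2 if (i == j and r == c) else 0
                    if abs(AC[r][c] - target) > tol:
                        return False
    return True
```

Any other triple of Hermitian matrices can be passed to `satisfies_clifford_relations` in the same way, which makes it easy to test an alternative representation.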

In this paper, it will often be enough to restrict our analysis to a fixed harmonic. To simplify notation we shall thus simply write $\mathcal{H}$, $H$ and $a(x)$ instead of $\mathcal{H}_{ln}$, $H^{ln}$ and $a_l(x)$ respectively, and we shall indicate in the course of the text whether we work on the global problem or on a fixed harmonic.

Let us now summarize the direct scattering results obtained in [3, 18]. It is well known that the main information of interest in scattering theory concerns the nature of the spectrum of the Hamiltonian $H$. The first result goes in this direction. Using essentially a Mourre theory (see [19]), it was shown in [3, 18] that, for all $\Lambda \ge 0$,
\[ \sigma_{pp}(H) = \emptyset, \qquad \sigma_{sing}(H) = \emptyset. \]
In other words, the spectrum of $H$ is purely absolutely continuous. Consequently, massive charged Dirac fields scatter toward the two asymptotic regions at late times and they are expected to obey simpler equations there. This is one of the main pieces of information encoded in the notion of wave operators that we introduce now.


We first treat the case $\Lambda = 0$ corresponding to RN black holes. From (2.2) and (2.9), the potentials $a, b, c$ have very different asymptotics as $x \to \pm\infty$ (according to our discussion above this reflects the fact that the geometries near the two asymptotic regions are very different). At the event horizon, there exists $\alpha > 0$ such that
\[ |a(x)|,\ |b(x)|,\ |c(x) - c_0| = O(e^{\alpha x}), \qquad x \to -\infty, \tag{2.19} \]
where the constant $c_0$ is given by (see (2.15))
\[ c_0 = \frac{qQ}{r_0}. \]
Hence we can write the Hamiltonian $H$ as
\[ H = H_0 + V_0, \qquad H_0 = \Gamma^1 D_x + c_0, \qquad V_0(x) = a(x)\Gamma^2 + b(x)\Gamma^0 + (c(x) - c_0), \]

where the potential $V_0$ is then short-range when $x \to -\infty$. Consequently, we can choose the asymptotic dynamics generated by the Hamiltonian $H_0 = \Gamma^1 D_x + c_0$ as the comparison dynamics in this region. The Hamiltonian $H_0$ is a selfadjoint operator on $\mathcal{H}$ with its spectrum covering the full real line, i.e. $\sigma(H_0) = \mathbb{R}$. Note finally that, due to the simple diagonal form of the matrix $\Gamma^1$, the comparison dynamics $e^{-itH_0}$ is essentially a system of transport equations along the curves $x \pm t$, that is the null radial geodesics of the black hole.

In contrast, at infinity, the potentials $a, b, c$ have the asymptotics
\[ |a(x)|,\ |b(x) - m|,\ |c(x)| = O\!\left(\frac{1}{x}\right), \qquad x \to +\infty. \tag{2.20} \]
Hence we can write the Hamiltonian $H$ as
\[ H = H_0^m + V_0^m, \qquad H_0^m = \Gamma^1 D_x + m\Gamma^0, \qquad V_0^m(x) = a(x)\Gamma^2 + (b(x) - m)\Gamma^0 + c(x), \]
where the potential $V_0^m$ is now a long-range potential having Coulomb decay when $x \to +\infty$. The asymptotic dynamics is generated by the Hamiltonian $H_0^m = \Gamma^1 D_x + m\Gamma^0$, a classical one-dimensional Dirac Hamiltonian in Minkowski spacetime. The Hamiltonian $H_0^m$ is a selfadjoint operator on $\mathcal{H}$ and its spectrum has a gap, i.e. $\sigma(H_0^m) = (-\infty, -m] \cup [+m, +\infty)$. However, contrary to the preceding case, the asymptotic dynamics $e^{-itH_0^m}$ cannot be used alone as a comparison dynamics because of the long-range potential $V_0^m$, but must be (Dollard-)modified. In order to define this modification and for other uses, we need to introduce the classical velocity operators
\[ V_0 = \Gamma^1, \qquad V_m = D_x (H_0^m)^{-1}, \]
associated with the Hamiltonians $H_0$ and $H_0^m$, respectively. The classical velocity operators are selfadjoint operators on $\mathcal{H}$ and their spectra are simply $\sigma(\Gamma^1) = \{-1, +1\}$ and $\sigma(V_m) = [-1, +1]$. Let us also denote by $P_\pm$ and $P_\pm^m$ the projections onto the positive and negative spectrum of $\Gamma^1$ and $V_m$, i.e.
\[ P_\pm = \mathbf{1}_{\mathbb{R}^\pm}(\Gamma^1), \qquad P_\pm^m = \mathbf{1}_{\mathbb{R}^\pm}(V_m). \]


As shown in [3], the main interest of these projections is that they make it easy to separate the part of the fields that propagates toward the event horizon from the part that propagates toward infinity. They will be used in the definition of the wave operators below. Moreover, the classical velocity operator $V_m$ enters in the expression of the Dollard modified comparison dynamics at infinity proposed in [3] and given by
\[ U(t) = e^{-itH_0^m}\, e^{-i\int_0^t \left[ (b(sV_m) - m)\, m\, (H_0^m)^{-1} + c(sV_m) \right] ds}. \tag{2.21} \]
Let us make two comments here. First, the potential $a(x)\Gamma^2$ turns out to be a "false" long-range term. This is clear from (2.21), where the asymptotic dynamics $e^{-itH_0^m}$ has been modified by an extra phase which only involves the long-range potentials $b$ and $c$. We refer to [3] for an explanation of this particular point. Second, we shall propose in Sec. 3 a new time-independent modification of the comparison dynamics $e^{-itH_0^m}$ which will be a direct byproduct of our construction of modifiers in the spirit of Isozaki–Kitada's work [16]. This new modification will be shown to be equivalent to the Dollard modification (2.21) in Theorem 3.3.

We are now in a position to introduce the wave operators associated with $H$. At the event horizon, we define
\[ W^\pm_{(-\infty)} = \operatorname*{s-lim}_{t \to \pm\infty} e^{itH} e^{-itH_0} P_\mp, \tag{2.22} \]
whereas at infinity, we define
\[ W^\pm_{(+\infty)} = \operatorname*{s-lim}_{t \to \pm\infty} e^{itH}\, U(t)\, P^m_\pm. \tag{2.23} \]
Finally, the global wave operators are given by
\[ W^\pm = W^\pm_{(-\infty)} + W^\pm_{(+\infty)}. \tag{2.24} \]

Note here our use of the projections $P_\pm$ and $P^m_\pm$ to separate the part of the field propagating toward the event horizon from the part of the field propagating toward infinity. In fact, without these projections, the wave operators (2.22) and (2.23) would not exist at all. More precisely, the main result of [3] is

Theorem 2.1. The wave operators $W^\pm_{(-\infty)}$, $W^\pm_{(+\infty)}$ and $W^\pm$ exist on $\mathcal{H}$. Moreover, the global wave operators $W^\pm$ are partial isometries with initial spaces $\mathcal{H}^\pm_{scat} = P_\mp(\mathcal{H}) + P^m_\pm(\mathcal{H})$ and final space $\mathcal{H}$. In particular, $W^\pm$ are asymptotically complete, i.e. $\operatorname{Ran} W^\pm = \mathcal{H}$.

As a direct consequence of Theorem 2.1, we can define the scattering operator $S$ by the usual formula
\[ S = (W^+)^* W^-. \tag{2.25} \]
It is clear that $S$ is a well-defined operator on $\mathcal{H}$ and a partial isometry from $\mathcal{H}^-_{scat}$ into $\mathcal{H}^+_{scat}$.

We now treat the case $\Lambda > 0$ corresponding to dS-RN black holes, which turns out to be a little more symmetric at the two (event and cosmological) horizons.


According to (2.2), (2.7) and (2.8), the potentials $a, b, c$ have the following asymptotics as $x \to \pm\infty$. There exists $\alpha > 0$ such that
\[ |a(x)|,\ |b(x)| = O(e^{-\alpha |x|}), \qquad |x| \to \infty, \tag{2.26} \]
and
\[ |c(x) - c_0| = O(e^{\alpha x}), \qquad x \to -\infty, \tag{2.27} \]
\[ |c(x) - c_+| = O(e^{-\alpha x}), \qquad x \to +\infty, \tag{2.28} \]
where the constants $c_0$ and $c_+$ are given by (see (2.15))
\[ c_0 = \frac{qQ}{r_0}, \qquad c_+ = \frac{qQ}{r_+}. \tag{2.29} \]
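For concrete parameter values, the horizons and the limiting constants $c_0$, $c_+$ can be computed numerically. The sketch below locates the positive roots of $r^2 F(r)$ by a sign scan followed by bisection, then evaluates $\kappa = F'/2$ and $qQ/r$ at the event and cosmological horizons. The scan range `r_max`, the step size and the sample parameters are assumptions made for illustration; the scan can miss roots that are closer together than one step or that lie beyond `r_max`.

```python
def F(r, M, Q, Lam):
    """Metric function (2.2)."""
    return 1.0 - 2.0 * M / r + Q * Q / (r * r) - Lam * r * r / 3.0

def dF(r, M, Q, Lam):
    """Derivative F'(r) of the metric function."""
    return 2.0 * M / r**2 - 2.0 * Q * Q / r**3 - 2.0 * Lam * r / 3.0

def positive_roots(M, Q, Lam, r_max=100.0, step=0.01):
    """Positive roots of r^2 F(r) in (0, r_max), found by sign scan and bisection."""
    def g(r):
        return r * r - 2.0 * M * r + Q * Q - Lam * r**4 / 3.0
    roots = []
    for i in range(1, int(r_max / step)):
        lo, hi = i * step, (i + 1) * step
        g_lo, g_hi = g(lo), g(hi)
        if g_lo == 0.0:
            roots.append(lo)
        elif g_lo * g_hi < 0.0:
            for _ in range(100):  # bisection on the bracketing interval
                mid = 0.5 * (lo + hi)
                if g(lo) * g(mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    return roots

def horizon_data(M, Q, Lam, q):
    """Horizon radii, surface gravities and the constants c_0, c_+ of (2.29)."""
    roots = positive_roots(M, Q, Lam)
    if len(roots) != 3:
        raise ValueError("expected three positive roots r_- < r_0 < r_+")
    r_minus, r0, r_plus = roots
    return {
        "r_minus": r_minus, "r0": r0, "r_plus": r_plus,
        "kappa0": 0.5 * dF(r0, M, Q, Lam),
        "kappa_plus": 0.5 * dF(r_plus, M, Q, Lam),
        "c0": q * Q / r0, "c_plus": q * Q / r_plus,
    }
```

With $qQ \neq 0$ the two constants differ because $r_0 < r_+$, which is the source of the distinct oscillations $e^{-itc_0}$ and $e^{-itc_+}$ in the two comparison dynamics.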

Hence, the potentials $a, b$ are short-range when $x \to \pm\infty$, and $c - c_0$ and $c - c_+$ are short-range when $x \to -\infty$ and $x \to +\infty$, respectively. At the event horizon, we choose as before the asymptotic dynamics generated by the Hamiltonian $H_0 = \Gamma^1 D_x + c_0$ as the comparison dynamics while, at the cosmological horizon, we choose the asymptotic dynamics generated by the Hamiltonian $H_+ = \Gamma^1 D_x + c_+$ as the comparison dynamics. The Hamiltonians $H_0$ and $H_+$ are clearly selfadjoint operators on $\mathcal{H}$ and their spectra are exactly the real line, i.e. $\sigma(H_0) = \sigma(H_+) = \mathbb{R}$. Finally, we observe that the dynamics $e^{-itH_0}$ and $e^{-itH_+}$ are essentially a system of transport equations along the null radial geodesics of the black hole, but they differ by the distinct oscillations $e^{-itc_0}$ and $e^{-itc_+}$.

We need the classical velocity operators associated with $H_0$ and $H_+$ in order to separate the part of the fields that propagates toward the event horizon from the part that propagates toward the cosmological horizon. It turns out that they are equal to $V_0 = \Gamma^1$ in both cases and the associated projections onto the positive and negative spectrum are still $P_\pm$. Thus we can introduce the wave operators as before. At the event horizon, we define
\[ W^\pm_{(-\infty)} = \operatorname*{s-lim}_{t \to \pm\infty} e^{itH} e^{-itH_0} P_\mp, \tag{2.30} \]
and at the cosmological horizon, we define
\[ W^\pm_{(+\infty)} = \operatorname*{s-lim}_{t \to \pm\infty} e^{itH} e^{-itH_+} P_\pm. \tag{2.31} \]
Finally, the global wave operators are given by
\[ W^\pm = W^\pm_{(-\infty)} + W^\pm_{(+\infty)}. \tag{2.32} \]

The main result of [18] is

Theorem 2.2. The wave operators $W^\pm_{(-\infty)}$, $W^\pm_{(+\infty)}$ and $W^\pm$ exist on $\mathcal{H}$. Moreover, the global wave operators $W^\pm$ are isometries on $\mathcal{H}$. In particular, $W^\pm$ are asymptotically complete, i.e. $\operatorname{Ran} W^\pm = \mathcal{H}$.

Thanks to Theorem 2.2, we can define the scattering operator $S$ as in (2.25) by $S = (W^+)^* W^-$, which is a well-defined isometry on $\mathcal{H}$.


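We deduce from the previous discussion that, for all $\Lambda \ge 0$, the scattering operator $S$ is a well-defined operator on $\mathcal{H}$. For all $\psi, \phi \in \mathcal{H}$, we shall consider in the following the expectation values of $S$, given by $\langle S\psi, \phi \rangle$, as the known data of our inverse problem. Moreover, using (2.24) and (2.32), we observe that these expectation values can be decomposed into 4 natural components
\[ \langle S\psi, \phi \rangle = \langle W^-\psi, W^+\phi \rangle = \langle T_R\psi, \phi \rangle + \langle T_L\psi, \phi \rangle + \langle L\psi, \phi \rangle + \langle R\psi, \phi \rangle, \]
where
\[ \langle T_R\psi, \phi \rangle = \langle W^-_{(+\infty)}\psi, W^+_{(-\infty)}\phi \rangle, \qquad \langle T_L\psi, \phi \rangle = \langle W^-_{(-\infty)}\psi, W^+_{(+\infty)}\phi \rangle, \tag{2.33} \]
\[ \langle L\psi, \phi \rangle = \langle W^-_{(-\infty)}\psi, W^+_{(-\infty)}\phi \rangle, \qquad \langle R\psi, \phi \rangle = \langle W^-_{(+\infty)}\psi, W^+_{(+\infty)}\phi \rangle. \tag{2.34} \]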

It follows from our definitions of the wave operators (2.22), (2.30) and (2.23), (2.31) that the previous quantities can be interpreted in terms of transmission and reflection between the different asymptotic regions, i.e. $\{x = -\infty\}$ for the event horizon of the black hole and $\{x = +\infty\}$ for either spacelike infinity if $\Lambda = 0$, or the cosmological horizon if $\Lambda > 0$. For instance, $\langle T_R\psi, \phi \rangle$ corresponds to the part of a signal transmitted from $\{x = +\infty\}$ to $\{x = -\infty\}$ in a scattering process, whereas the term $\langle T_L\psi, \phi \rangle$ corresponds to the part of a signal transmitted from $\{x = -\infty\}$ to $\{x = +\infty\}$. Hence $T_R$ stands for "transmitted from the right" and $T_L$ for "transmitted from the left". Conversely, $\langle L\psi, \phi \rangle$ corresponds to the part of a signal reflected from $\{x = -\infty\}$ to $\{x = -\infty\}$ in a scattering process, whereas the term $\langle R\psi, \phi \rangle$ corresponds to the part of a signal reflected from $\{x = +\infty\}$ to $\{x = +\infty\}$.

3. The Inverse Problem when Λ = 0

In this section, we study the inverse problem at high energy in the case $\Lambda = 0$ that corresponds to RN black holes. Let us recall here that all the results and formulae given hereafter are always obtained on a fixed spin-weighted spherical harmonic. Therefore the notations $\mathcal{H}$, $H$, $a(x)$ are shorthand for $\mathcal{H}_{ln}$, $H^{ln}$, $a_l(x)$ defined in the preceding section. In order to state our main result, we make two assumptions.

Assumption 1. We assume that our observers may measure the high energies of the transmission operators $T_R$ or $T_L$. Precisely, we assume that one of the following functions of $\lambda \in \mathbb{R}$,
\[ F_l(\lambda) = \langle T_R e^{i\lambda x}\psi, e^{i\lambda x}\phi \rangle, \qquad G_l(\lambda) = \langle T_L e^{i\lambda x}\psi, e^{i\lambda x}\phi \rangle, \]
is known for all large values of $\lambda$, for all $l \in \mathbb{N}$ where $l$ indexes the spin-weighted spherical harmonics, and for all $\psi, \phi \in \mathcal{H}$ with $\psi, \phi \in C_0^\infty(\mathbb{R}; \mathbb{C}^4)$.

Assumption 2. We also assume that the mass $m$ and the charge $q$ of the Dirac fields considered in these inverse scattering experiments are known and fixed. Moreover we assume that $q \neq 0$, since the case $q = 0$ is similar to the one treated in [4].


The main result of this section is summarized in the following theorem.

Theorem 3.1. Under Assumptions 1 and 2, the parameters $M$ and $Q$ of the RN black hole are uniquely determined.

Following our previous paper [4], the proof of Theorem 3.1 will be based on a high-energy asymptotic expansion of the functions $F_l(\lambda)$ and $G_l(\lambda)$ when $\lambda \to +\infty$. Precisely, we shall prove the following formulae:

Theorem 3.2 (Reconstruction Formulae). Let $\psi, \phi \in C_0^\infty(\mathbb{R}; \mathbb{C}^4)$. Then for $\lambda$ large, we obtain
\[ F_l(\lambda) = \langle \Theta(x) P_-\psi, P_-\phi \rangle + \frac{i}{2\lambda} \langle A(x) P_-\psi, P_-\phi \rangle + O(\lambda^{-2}), \tag{3.1} \]
\[ G_l(\lambda) = \langle \Theta(x) P_+\psi, P_+\phi \rangle - \frac{i}{2\lambda} \langle A(x) P_+\psi, P_+\phi \rangle + O(\lambda^{-2}), \tag{3.2} \]
where $\Theta(x)$ and $A(x)$ are multiplication operators given by
\[ \Theta(x) = e^{-i\int_{-\infty}^{0} [c(s) - c_0]\, ds + i c_0 x}, \tag{3.3} \]
\[ A(x) = \Theta(x) \left( \int_{-\infty}^{+\infty} a_l^2(s)\, ds + \int_{-\infty}^{0} b^2(s)\, ds + \int_{0}^{+\infty} (b(s) - m)^2\, ds + m^2 x \right). \tag{3.4} \]

Remark 3.1. In Theorem 3.2, we have emphasized the dependence of the functions $F_l(\lambda)$ and $G_l(\lambda)$ on the parameter $l$, since the reconstruction formulae (3.1) and (3.2) can be derived if we work on a fixed spin-weighted spherical harmonic only. Nevertheless, as indicated in Assumption 1, we shall need to know these formulae on all spin-weighted spherical harmonics, hence for all $l \in \mathbb{N}$, in order to prove the uniqueness result stated in Theorem 3.1.

Remark 3.2. In the reconstruction formulae of Theorem 3.2, the physical contributions are the phase $e^{i c_0 x}$ appearing in (3.3) and the function $\int_{-\infty}^{+\infty} a_l^2(s)\, ds + m^2 x$ appearing in (3.4). The presence of these terms clearly shows that the charge $q$ (through $c_0$) and the mass $m$ of the Dirac fields contribute to the high-energy asymptotics of the transmission operators. On the other hand, the constant terms $\int_{-\infty}^{0} [c(s) - c_0]\, ds$ in (3.3) and $\int_{-\infty}^{0} b^2(s)\, ds + \int_{0}^{+\infty} (b(s) - m)^2\, ds$ in (3.4) may appear unnatural at first sight since they depend explicitly on the particular value $0$ of the Regge–Wheeler variable $x$. They are in fact due to our particular choice of Dollard modification in the definition of the modified wave operators $W^\pm_{(+\infty)}$. Recall indeed that there is no canonical choice for the (necessary) modifications entailed by the presence of long-range potentials at infinity. This point can easily be seen, for instance, from the Isozaki–Kitada modifications constructed in the next subsection, whose phases are defined only up to a constant of integration (see (3.26) and Remark 3.4 after it). The above constant terms can thus be understood as constants of integration depending on our particular choice of modification. Finally, we emphasize that these


constants of integration do not play any role in our proof of the uniqueness of the parameters.

Remark 3.3. In this paper, we use the high-energy asymptotics of the quantum wave operators for Dirac fields in order to reconstruct the mass and the charge of the black hole. Another interesting question would be to study the same inverse problem, but from the semiclassical dynamics, or even from the classical ones. To the authors' knowledge, these problems are still open. However, for semiclassical Schrödinger operators with energies localized in an arbitrarily small interval, an inverse scattering problem was studied in [24] for potentials that are regular at infinity; for the Newton equations at high energies, this problem was treated by Novikov in [25].

We now explain our strategy to prove Theorem 3.2. Using (2.22), (2.23), (2.33) and the fact that $e^{i\lambda x}$ corresponds to a translation by $\lambda$ in momentum space, we first rewrite $F_l(\lambda)$ and $G_l(\lambda)$ as follows
\[ F_l(\lambda) = \langle W^-_{(+\infty)}(\lambda)\psi, W^+_{(-\infty)}(\lambda)\phi \rangle, \tag{3.5} \]
\[ G_l(\lambda) = \langle W^-_{(-\infty)}(\lambda)\psi, W^+_{(+\infty)}(\lambda)\phi \rangle, \tag{3.6} \]
with
\[ W^\pm_{(-\infty)}(\lambda) = e^{-i\lambda x}\, W^\pm_{(-\infty)}\, e^{i\lambda x} = \operatorname*{s-lim}_{t \to \pm\infty} e^{itH(\lambda)} e^{-itH_0(\lambda)} P_\mp, \]
\[ W^\pm_{(+\infty)}(\lambda) = e^{-i\lambda x}\, W^\pm_{(+\infty)}\, e^{i\lambda x} = \operatorname*{s-lim}_{t \to \pm\infty} e^{itH(\lambda)} e^{-iX(t,\lambda)} e^{-itH_0^m(\lambda)} P^{m,\lambda}_\pm, \]
where we use the notations
\[ H(\lambda) = \Gamma^1 (D_x + \lambda) + a(x)\Gamma^2 + b(x)\Gamma^0 + c(x), \qquad H_0(\lambda) = \Gamma^1 (D_x + \lambda) + c_0, \]
\[ H_0^m(\lambda) = \Gamma^1 (D_x + \lambda) + m\Gamma^0, \qquad V_m(\lambda) = (D_x + \lambda)\, H_0^m(\lambda)^{-1}, \qquad P^{m,\lambda}_\pm = \mathbf{1}_{\mathbb{R}^\pm}(V_m(\lambda)), \]
\[ X(t, \lambda) = \int_0^t \left[ (b(sV_m(\lambda)) - m)\, m\, (H_0^m(\lambda))^{-1} + c(sV_m(\lambda)) \right] ds. \]

In order to obtain an asymptotic expansion of the functions $F_l(\lambda)$ and $G_l(\lambda)$, it is thus enough to obtain an asymptotic expansion of the $\lambda$-shifted wave operators $W^\pm_{(\pm\infty)}(\lambda)$. To do this, we follow the procedure described in [21, 22], which is inspired by the well-known Isozaki–Kitada method [16] developed in the setting of long-range stationary scattering theory. It consists simply in replacing the wave operators $W^\pm_{(\pm\infty)}(\lambda)$ by "well-chosen" energy modifiers $J^\pm_{(\pm\infty)}(\lambda)$, defined as Fourier Integral Operators (FIO) with explicit phases and amplitudes. Well-chosen here means practically that we look for $J^\pm_{(\pm\infty)}(\lambda)$ satisfying, for $\lambda$ large enough,
\[ W^\pm_{(-\infty)}(\lambda)\psi = \lim_{t \to \pm\infty} e^{itH(\lambda)} J^\pm_{(-\infty)}(\lambda)\, e^{-itH_0(\lambda)} P_\mp \psi, \tag{3.7} \]
\[ W^\pm_{(+\infty)}(\lambda)\psi = \lim_{t \to \pm\infty} e^{itH(\lambda)} J^\pm_{(+\infty)}(\lambda)\, e^{-itH_0^m(\lambda)} P^{m,\lambda}_\pm \psi, \tag{3.8} \]


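and
\[ \left\| \left( W^\pm_{(\pm\infty)}(\lambda) - J^\pm_{(\pm\infty)}(\lambda) \right) \psi \right\| = O(\lambda^{-2}), \tag{3.9} \]
for any fixed $\psi \in \mathcal{H}$ such that $\psi \in C_0^\infty(\mathbb{R}; \mathbb{C}^4)$. Note that the decay $O(\lambda^{-2})$ in (3.9) could be improved to any inverse power decay but turns out to be enough for our purpose here. In particular, if we manage to construct such $J^\pm_{(\pm\infty)}(\lambda)$ satisfying (3.9), then we obtain by (3.5) and (3.6)
\[ F_l(\lambda) = \langle J^-_{(+\infty)}(\lambda)\psi, J^+_{(-\infty)}(\lambda)\phi \rangle + O(\lambda^{-2}), \qquad G_l(\lambda) = \langle J^-_{(-\infty)}(\lambda)\psi, J^+_{(+\infty)}(\lambda)\phi \rangle + O(\lambda^{-2}), \tag{3.10} \]
from which we can easily calculate the first terms of the asymptotics. Let us give here a simple but useful result which allows us to simplify slightly the expressions (3.7) and (3.8).

Lemma 3.1. For all $\xi \in \mathbb{R}^*$, set
\[ \nu^\pm(\xi) = \pm\, \mathrm{sgn}(\xi) \sqrt{\xi^2 + m^2}. \tag{3.11} \]
Then, for all $\psi$ with $\operatorname{supp} \hat\psi \subset \mathbb{R}^*$,
\[ e^{-itH_0^m} P^m_\pm \psi = e^{-it\nu^\pm(D_x)} P^m_\pm \psi. \tag{3.12} \]
Moreover,
\[ e^{-itH_0} P_\pm = e^{\mp itD_x - itc_0} P_\pm. \tag{3.13} \]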

Proof. The Fourier representation of the operator $H_0^m$ is $\Gamma^1\xi + m\Gamma^0$, whose spectrum consists of precisely one positive eigenvalue $\sqrt{\xi^2 + m^2}$ and one negative eigenvalue $-\sqrt{\xi^2 + m^2}$. Similarly, the Fourier representation of the classical velocity operator $V_m$ is $\frac{\xi}{\xi^2 + m^2}(\Gamma^1\xi + m\Gamma^0)$. Hence, for $\xi > 0$, $P^m_+$ is the projection onto the positive spectrum of $\Gamma^1\xi + m\Gamma^0$ and $P^m_-$ is the projection onto the negative spectrum of $\Gamma^1\xi + m\Gamma^0$. For $\xi < 0$, it is the opposite. This immediately implies (3.12). Finally, the equality (3.13) is a direct consequence of the definitions of $H_0$ and $P_\pm$.

According to Lemma 3.1, the projections $P_\pm$ and $P^m_\pm$ allow us to "scalarize" the Hamiltonians $H_0$ and $H_0^m$ in the expressions (3.7) and (3.8) of $W^\pm_{(\pm\infty)}(\lambda)$. Precisely, since $\Gamma^1 P_\mp = \mp P_\mp$, these expressions now read
\[ W^\pm_{(-\infty)}(\lambda)\psi = \lim_{t \to \pm\infty} e^{itH(\lambda)} J^\pm_{(-\infty)}(\lambda)\, e^{\pm it(D_x + \lambda) - itc_0} P_\mp \psi, \tag{3.14} \]
\[ W^\pm_{(+\infty)}(\lambda)\psi = \lim_{t \to \pm\infty} e^{itH(\lambda)} J^\pm_{(+\infty)}(\lambda)\, e^{-it\nu^\pm(D_x + \lambda)} P^{m,\lambda}_\pm \psi. \tag{3.15} \]
This minor simplification will be important in the forthcoming construction of the modifiers $J^\pm_{(\pm\infty)}(\lambda)$.
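The "scalarization" in Lemma 3.1 can be illustrated at the level of Fourier symbols. Since the symbol $h(\xi) = \Gamma^1\xi + m\Gamma^0$ squares to $(\xi^2 + m^2)\,\mathrm{Id}$, the symbol of $P^m_\pm$ can be written as $\frac{1}{2}\big(\mathrm{Id} \pm \mathrm{sgn}(\xi)\, h(\xi)/\sqrt{\xi^2 + m^2}\big)$, and the sketch below checks numerically that it is a projection on which $h(\xi)$ acts as the scalar $\nu^\pm(\xi)$, which is the content of (3.12). The matrix entries are an assumed representation (any anticommuting Hermitian pair squaring to the identity would do) and the function names are illustrative.

```python
import math

# Assumed 4x4 representation of Gamma^1 and Gamma^0.
GAMMA1 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
GAMMA0 = [[0, 0, -1j, 0], [0, 0, 0, 1j], [1j, 0, 0, 0], [0, -1j, 0, 0]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def matmul(A, B):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def combine(terms):
    """Linear combination: sum of coeff * matrix over (coeff, matrix) pairs."""
    return [[sum(c * M[i][j] for c, M in terms) for j in range(4)]
            for i in range(4)]

def max_abs_diff(A, B):
    """Largest entrywise modulus of A - B."""
    return max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4))

def free_symbol(xi, m):
    """Fourier symbol Gamma^1 xi + m Gamma^0 of H_0^m."""
    return combine([(xi, GAMMA1), (m, GAMMA0)])

def nu(xi, m, sign):
    """nu^{+/-}(xi) of (3.11); sign is +1 or -1, xi must be nonzero."""
    return sign * math.copysign(1.0, xi) * math.sqrt(xi * xi + m * m)

def velocity_projection(xi, m, sign):
    """Symbol of P^m_{+/-}: spectral projection of the velocity symbol."""
    omega = math.sqrt(xi * xi + m * m)
    s = sign * math.copysign(1.0, xi)
    return combine([(0.5, IDENTITY), (0.5 * s / omega, free_symbol(xi, m))])
```

Because $h(\xi)P = \nu^\pm(\xi)P$ on the range of the projection, exponentiating gives $e^{-ith(\xi)}P = e^{-it\nu^\pm(\xi)}P$ for each nonzero $\xi$.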


Before entering into the details, let us give a hint on how to construct the modifiers $J^\pm_{(\pm\infty)}(\lambda)$, a priori defined as FIOs with "scalar" phases $\varphi^\pm_{(\pm\infty)}(x, \xi, \lambda)$ and "matrix-valued" amplitudes $p^\pm_{(\pm\infty)}(x, \xi, \lambda)$, i.e. defined for all $\psi \in \mathcal{H}$ by
\[ J^\pm_{(\pm\infty)}(\lambda)\psi = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{i\varphi^\pm_{(\pm\infty)}(x, \xi, \lambda)}\, p^\pm_{(\pm\infty)}(x, \xi, \lambda)\, \hat\psi(\xi)\, d\xi. \]
If we assume for instance that (3.15) is true, then we easily get
\[ \left( W^\pm_{(+\infty)}(\lambda) - J^\pm_{(+\infty)}(\lambda) \right)\psi = i \int_0^{\pm\infty} e^{itH(\lambda)}\, C^\pm_{(+\infty)}(\lambda)\, e^{-it\nu^\pm(D_x + \lambda)} P^{m,\lambda}_\pm \psi\, dt, \tag{3.16} \]
where
\[ C^\pm_{(+\infty)}(\lambda) := H(\lambda) J^\pm_{(+\infty)}(\lambda) - J^\pm_{(+\infty)}(\lambda)\, \nu^\pm(D_x + \lambda), \tag{3.17} \]
are also FIOs with phases $\varphi^\pm_{(+\infty)}(x, \xi, \lambda)$ and amplitudes $c^\pm_{(+\infty)}(x, \xi, \lambda)$. From (3.16), we get the simple estimate
\[ \left\| \left( W^\pm_{(+\infty)}(\lambda) - J^\pm_{(+\infty)}(\lambda) \right)\psi \right\| \le \int_0^{\pm\infty} \left\| C^\pm_{(+\infty)}(\lambda)\, e^{-it\nu^\pm(D_x + \lambda)} P^{m,\lambda}_\pm \psi \right\| dt. \tag{3.18} \]
In order that (3.9) be true, it is then clear from (3.18) that the FIOs $C^\pm_{(+\infty)}(\lambda)$ have to be "small" in some sense. Precisely, we shall need the amplitudes $c^\pm_{(+\infty)}(x, \xi, \lambda)$ to be short-range in the variable $x$ at infinity (i.e. when $x \to +\infty$) and of order $O(\lambda^{-2})$ when $\lambda \to +\infty$. Note here the role played by the projections $P^{m,\lambda}_\pm$, which allow us to consider the part of the Dirac fields that propagates toward infinity. This explains why the amplitudes $c^\pm_{(+\infty)}(x, \xi, \lambda)$ must be short-range in the variable $x$ only at infinity. Similarly, for the construction of the modifiers $J^\pm_{(-\infty)}(\lambda)$, we shall require that the amplitudes $c^\pm_{(-\infty)}(x, \xi, \lambda)$ of the corresponding operators $C^\pm_{(-\infty)}(\lambda)$ be short-range in the variable $x$ only at the event horizon (i.e. when $x \to -\infty$) and of order $O(\lambda^{-2})$ when $\lambda \to +\infty$.

3.1. Asymptotics of $W^\pm_{(+\infty)}(\lambda)$

In this subsection, we construct the modifiers $J^\pm_{(+\infty)}(\lambda)$ and give the asymptotics of $W^\pm_{(+\infty)}(\lambda)$ when $\lambda \to +\infty$. For simplicity, we shall omit the lower index $(+\infty)$ in all the objects defined hereafter. We first look at the problem at fixed energy (i.e. we take $\lambda = 0$ in the previous formulae). Hence we aim to construct modifiers $J^\pm$ with scalar phases $\varphi^\pm(x, \xi)$ and matrix-valued amplitudes $p^\pm(x, \xi)$ such that the amplitudes $c^\pm(x, \xi)$ of the operators $C^\pm = HJ^\pm - J^\pm \nu^\pm(D_x)$ are short-range in $x$ when $x \to +\infty$. We adapt here to our case the treatment given by Gâtel and Yafaev in [9], where a similar problem was considered in Minkowski spacetime (see also our recent paper [4]).


The operators C ± are clearly FIOs with phases ϕ± (x, ξ) and amplitudes c± (x, ξ) = B ± (x, ξ)p± (x, ξ) − iΓ1 ∂x p± (x, ξ),

(3.19)

B ± (x, ξ) = Γ1 ∂x ϕ± (x, ξ) + a(x)Γ2 + b(x)Γ0 + c(x) − ν ± (ξ).

(3.20)

where

As usual, we look for phases $\varphi^\pm$ close to $x\xi$ and amplitudes $p^\pm$ close to 1. So the term $\partial_x p^\pm$ in (3.19) should be short-range and can be neglected in a first approximation. With $p^\pm = 1$, we are thus led to solve $B^\pm = 0$. However, a direct calculation then leads to matrix-valued phases $\varphi^\pm$, whereas we look for scalar ones. We follow [9] and solve in fact $(B^\pm)^2 = 0$. Using crucially the anticommutation properties of the Dirac matrices (2.16), we get the new equation

\[ (B^\pm)^2 = (\partial_x\varphi^\pm)^2 + a^2 + b^2 + (c-\nu^\pm)^2 + 2(c-\nu^\pm)(B^\pm - c + \nu^\pm) = 0. \tag{3.21} \]

If we put $B^\pm = 0$ in (3.21), we obtain the scalar equation

\[ r^\pm(x,\xi) := (\partial_x\varphi^\pm)^2 + a^2 + b^2 - (c-\nu^\pm)^2 = 0. \tag{3.22} \]
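The algebra behind (3.21) uses only the anticommutation relations of the Dirac matrices, so it can be checked numerically. The following is a minimal sketch in pure Python, not part of the original argument. It assumes one concrete representation: $\Gamma^1 = \mathrm{diag}(1,1,-1,-1)$ as in (3.73), while the choices for $\Gamma^2$ and $\Gamma^0$ are hypothetical (any Hermitian, mutually anticommuting matrices squaring to the identity would do). The names `p` (standing for $\partial_x\varphi^\pm$), `s` (standing for $c-\nu^\pm$) and the helper functions are illustrative only.

```python
# Sanity check of identity (3.21) with explicit 4x4 matrices (pure Python).

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(*terms):
    """Sum of coeff * matrix over the given (coeff, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(c * M[i][j] for c, M in terms) for j in range(n)]
            for i in range(n)]

def kron(A, B):
    """Kronecker product of two matrices."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def max_abs(M):
    """Largest modulus among the entries of M."""
    return max(abs(x) for row in M for x in row)

I2 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
I4 = kron(I2, I2)
G1 = kron(S3, I2)  # diag(1, 1, -1, -1), consistent with (3.73)
G2 = kron(S1, I2)  # assumed representation of Gamma^2
G0 = kron(S2, I2)  # assumed representation of Gamma^0

def residual_3_21(p, a, b, s):
    """Max entry of B^2 - [p^2 + a^2 + b^2 + s^2 + 2 s (B - s)], s = c - nu."""
    B = lincomb((p, G1), (a, G2), (b, G0), (s, I4))
    lhs = matmul(B, B)
    rhs = lincomb((p * p + a * a + b * b + s * s - 2 * s * s, I4), (2 * s, B))
    return max_abs(lincomb((1, lhs), (-1, rhs)))

# Arbitrary sample values; the residual should vanish up to rounding.
print(residual_3_21(0.7, -1.3, 2.1, 0.4))
```

Setting $B^\pm = 0$ on the right-hand side of the verified identity leaves exactly the scalar quantity $r^\pm$ of (3.22).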

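We look for an approximate solution of (3.22) of the form $\varphi^\pm(x,\xi) = x\xi + \phi^\pm(x,\xi)$, where $\phi^\pm(x,\xi)$ should be a priori relatively small in the variable $x$. Recalling that $(\nu^\pm)^2 = \xi^2 + m^2$ by (3.11), we must then solve

\[ 2\xi\,\partial_x\phi^\pm + (\partial_x\phi^\pm)^2 + a^2 + (b^2 - m^2) - c^2 + 2c\nu^\pm = 0. \tag{3.23} \]

If we neglect $(\partial_x\phi^\pm)^2$ in (3.23), we finally get

\[ 2\xi\,\partial_x\phi^\pm = -\big[a^2 + (b^2 - m^2) - c^2 + d^\pm\big], \tag{3.24} \]

where we have introduced the notation $d^\pm(x,\xi) = 2c(x)\nu^\pm(\xi)$. Note that by (2.20) and (3.11), the following estimate holds:

\[ \forall\,\alpha,\beta\in\mathbb{N},\quad |\partial_x^\alpha\partial_\xi^\beta d^\pm(x,\xi)| \le C_{\alpha\beta}\,\langle x\rangle^{-1-\alpha}\langle\xi\rangle^{1-\beta},\quad \forall\,x\in\mathbb{R}^+,\ \forall\,\xi\in\mathbb{R}^*. \tag{3.25} \]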

Therefore, using (2.20) again and the previous estimate (3.25), we see that $a^2 - c^2$ is short-range when $x \to +\infty$, whereas $b^2 - m^2$ and $d^\pm$ are long-range (of Coulomb type) when $x \to +\infty$. Hence we can define two solutions of (3.24) for all $\xi \neq 0$ as follows:

\[ \phi^\pm(x,\xi) = \frac{1}{2\xi}\int_x^{+\infty}[a^2(s) - c^2(s)]\,ds - \frac{1}{2\xi}\int_0^x\big[(b^2(s) - m^2) + d^\pm(s,\xi)\big]\,ds + \frac{1}{2\xi}\int_0^{+\infty}(b(s)-m)^2\,ds. \tag{3.26} \]

Remark 3.4. Let us emphasize that we only add the quantity $\frac{1}{2\xi}\int_0^{+\infty}(b(s)-m)^2\,ds$ in (3.26) in order to prove that the Isozaki–Kitada and the Dollard modifications

Inverse Scattering in de Sitter–Reissner–Nordström Black Hole Spacetimes

coincide (see Theorem 3.3). In the general case however, the phases $\tilde\phi^\pm(x,\xi)$, solutions of (3.24), would clearly take the form, for all $\xi \neq 0$,

\[ \tilde\phi^\pm(x,\xi) = \frac{1}{2\xi}\int_x^{+\infty}[a^2(s) - c^2(s)]\,ds - \frac{1}{2\xi}\int_0^x (b^2(s) - m^2)\,ds - \frac{\nu^\pm(\xi)}{\xi}\int_0^x c(s)\,ds + C(\xi), \tag{3.27} \]

where $C(\xi)$ is a constant of integration. With this choice, we obtain for $\xi \neq 0$ (see (3.22)),

\[ r^\pm(x,\xi) = (\partial_x\phi^\pm)^2 = \frac{1}{4\xi^2}\big[a^2(x) + (b^2(x) - m^2) - c^2(x) + d^\pm(x,\xi)\big]^2. \tag{3.28} \]

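Moreover, it is easy to see that the remainders $r^\pm$ satisfy the estimates

\[ \forall\,\alpha,\beta\in\mathbb{N},\quad |\partial_x^\alpha\partial_\xi^\beta r^\pm(x,\xi)| \le C_{\alpha\beta}\,\langle x\rangle^{-2-\alpha}\langle\xi\rangle^{-\beta},\quad \forall\,x\in\mathbb{R}^+,\ \forall\,\xi\in\mathbb{R}^*. \tag{3.29} \]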

In our derivation of the phases (3.26), it is important to keep in mind that we did not ﬁnd an approximate solution of B ± = 0 but instead of (B ± )2 = 0. Therefore we cannot expect to take p± = 1 as a ﬁrst approximation and we have to work a bit more. So we look for p± such that B ± p± be as small as possible. According to (3.21) and (3.22), we ﬁrst note that (B ± )2 = r± + 2(c − ν ± )B ± .

(3.30)

We now find a relation between $B^\pm$ and $(B^\pm)^2$. Using (3.20) and (3.24), we can reexpress $B^\pm$ as

\[ B^\pm = B_0^\pm + 2\nu^\pm K^\pm, \tag{3.31} \]

where

\[ B_0^\pm = \Gamma^1\xi + m\Gamma^0 - \nu^\pm, \tag{3.32} \]

\[ K^\pm = \frac{1}{2\nu^\pm}\Big[-\frac{1}{2\xi}\big(a^2 + (b^2 - m^2) - c^2 + d^\pm\big)\Gamma^1 + a\Gamma^2 + (b-m)\Gamma^0 + c\Big]. \tag{3.33} \]

If we take the square of (3.31) we get

\[ (B^\pm)^2 = (B_0^\pm)^2 + 2\nu^\pm B_0^\pm K^\pm + 2\nu^\pm K^\pm B^\pm. \tag{3.34} \]

However, from (3.32) and (3.11) we see that $(B_0^\pm)^2 = -2\nu^\pm B_0^\pm$. Whence (3.34) becomes

\[ (B^\pm)^2 = -2\nu^\pm B_0^\pm(1 - K^\pm) + 2\nu^\pm K^\pm B^\pm. \tag{3.35} \]

Now we substitute the expression (3.35) for $(B^\pm)^2$ into (3.30) and obtain

\[ r^\pm = -2\nu^\pm B_0^\pm(1 - K^\pm) + 2\nu^\pm\Big(1 + K^\pm - \frac{c}{\nu^\pm}\Big)B^\pm. \tag{3.36} \]


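We would like to isolate $B^\pm$ in (3.36). We thus need to invert the functions $\big(1 + K^\pm - \frac{c}{\nu^\pm}\big)$. Using (2.19), (2.20) and (3.25), we get the following global asymptotics for $K^\pm$:

\[ \forall\,\alpha,\beta\in\mathbb{N},\quad |\partial_x^\alpha\partial_\xi^\beta K^\pm(x,\xi)| \le \begin{cases} C_{\alpha\beta}\,\langle x\rangle^{-1-\alpha}\langle\xi\rangle^{-1-\beta}, & \forall\,x\in\mathbb{R}^+,\ \forall\,\xi\in\mathbb{R}^*,\\ C_{\alpha\beta}\,\langle x\rangle^{-\alpha}\langle\xi\rangle^{-1-\beta}, & \forall\,x\in\mathbb{R}^-,\ \forall\,\xi\in\mathbb{R}^*. \end{cases} \tag{3.37} \]

Let us consider the set $X = \{\xi\in\mathbb{R},\ |\xi|\ge R\}$ where $R \gg 1$ is a constant. It follows immediately from the asymptotics (3.37) and those of $\frac{c(x)}{\nu^\pm(\xi)}$ that $\big(1 + K^\pm - \frac{c}{\nu^\pm}\big)$ and $(1 - K^\pm)$ are invertible for all $(x,\xi)\in\mathbb{R}\times X$ if the constant $R$ is assumed to be large enough. In consequence, we can write (3.36) as

\[ B^\pm(1-K^\pm)^{-1} = \frac{1}{2\nu^\pm}\Big(1 + K^\pm - \frac{c}{\nu^\pm}\Big)^{-1} r^\pm(1-K^\pm)^{-1} + \Big(1 + K^\pm - \frac{c}{\nu^\pm}\Big)^{-1}B_0^\pm, \tag{3.38} \]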

for all $(x,\xi)\in\mathbb{R}\times X$. The first term on the right-hand side of (3.38) is small thanks to (3.29), but the second one is not. We choose $p^\pm$ in such a way that they cancel this term. To do this, we observe that the Fourier representations of the projections $P_\pm^m$, i.e. the operators

\[ P_\pm^m(\xi) = \mathbf{1}_{\mathbb{R}^\pm}\Big(\frac{\xi}{\sqrt{\xi^2+m^2}}\,(\Gamma^1\xi + m\Gamma^0)\Big) = \frac{1}{2}\Big(I_4 \pm \frac{\operatorname{sgn}(\xi)}{\sqrt{\xi^2+m^2}}\,(\Gamma^1\xi + m\Gamma^0)\Big),\quad \forall\,\xi\neq 0, \tag{3.39} \]

satisfy the following equations

\[ B_0^\pm(\xi)\,P_\pm^m(\xi) = 0, \tag{3.40} \]

by Lemma 3.1 and (3.32). According to (3.38), a natural choice for $p^\pm$ is thus

\[ p^\pm = (1 - K^\pm)^{-1}P_\pm^m(\xi), \tag{3.41} \]

for which we have

\[ q^\pm := B^\pm p^\pm = \frac{1}{2\nu^\pm}\Big(1 + K^\pm - \frac{c}{\nu^\pm}\Big)^{-1} r^\pm(1-K^\pm)^{-1}P_\pm^m(\xi). \tag{3.42} \]
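The two algebraic facts that drive the choice of $p^\pm$, namely $(B_0^\pm)^2 = -2\nu^\pm B_0^\pm$ and $B_0^\pm P_\pm^m = 0$, can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes $\nu^\pm(\xi) = \pm\operatorname{sgn}(\xi)\sqrt{\xi^2+m^2}$, the explicit form of $P_\pm^m(\xi)$ with $\operatorname{sgn}(\xi)$, $\Gamma^1 = \mathrm{diag}(1,1,-1,-1)$ as in (3.73), and a hypothetical representation for $\Gamma^0$. All function names are illustrative.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(*terms):
    """Sum of coeff * matrix over the given (coeff, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(c * M[i][j] for c, M in terms) for j in range(n)]
            for i in range(n)]

def kron(A, B):
    """Kronecker product of two matrices."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def max_abs(M):
    """Largest modulus among the entries of M."""
    return max(abs(x) for row in M for x in row)

I2 = [[1, 0], [0, 1]]
I4 = kron(I2, I2)
G1 = kron([[1, 0], [0, -1]], I2)     # diag(1, 1, -1, -1), as in (3.73)
G0 = kron([[0, -1j], [1j, 0]], I2)   # assumed representation of Gamma^0

def nu(sign, xi, m):
    """nu^{+-}(xi) = +- sgn(xi) sqrt(xi^2 + m^2); sign is +1 or -1."""
    return sign * math.copysign(1.0, xi) * math.sqrt(xi * xi + m * m)

def B0(sign, xi, m):
    """B_0^{+-} of (3.32)."""
    return lincomb((xi, G1), (m, G0), (-nu(sign, xi, m), I4))

def proj(sign, xi, m):
    """P_{+-}^m(xi) in the explicit form of (3.39)."""
    s = sign * math.copysign(1.0, xi) / math.sqrt(xi * xi + m * m)
    return lincomb((0.5, I4), (0.5 * s * xi, G1), (0.5 * s * m, G0))

def check(sign, xi, m):
    """Residuals of B0^2 = -2 nu B0, P^2 = P and B0 P = 0."""
    b0, P = B0(sign, xi, m), proj(sign, xi, m)
    r_square = max_abs(lincomb((1, matmul(b0, b0)), (2 * nu(sign, xi, m), b0)))
    r_idem = max_abs(lincomb((1, matmul(P, P)), (-1, P)))
    r_kernel = max_abs(matmul(b0, P))
    return r_square, r_idem, r_kernel

for sign in (+1, -1):
    for xi in (2.5, -0.8):
        print(sign, xi, check(sign, xi, m=1.3))
```

With (3.40) in hand, multiplying (3.38) on the right by $P_\pm^m$ removes the large term, which is why $q^\pm$ in (3.42) retains only the contribution proportional to $r^\pm$.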

Let us summarize the situation at this stage. For $\xi \neq 0$, we have defined the phases $\varphi^\pm(x,\xi) = x\xi + \phi^\pm(x,\xi)$ by (3.26), and for $\xi\in X$, the amplitudes $p^\pm$ are given by (3.41). Directly from the definitions and from the asymptotics (2.19) and (2.20) of the potentials $a, b, c$, the following estimates hold.


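Lemma 3.2 (Estimates on the Phases, the Amplitudes and Related Quantities). For all $x\in\mathbb{R}^+$ and $\xi\in X$ with $R$ large enough, we have

\[ \forall\,\beta\in\mathbb{N},\quad |\partial_\xi^\beta\phi^\pm(x,\xi)| \le C_\beta\,\log\langle x\rangle\,\langle\xi\rangle^{-\beta}. \tag{3.43} \]

\[ \forall\,|\alpha|\ge 1,\ \forall\,\beta\in\mathbb{N},\quad |\partial_x^\alpha\partial_\xi^\beta\phi^\pm(x,\xi)| \le C_{\alpha\beta}\,\langle x\rangle^{-\alpha}\langle\xi\rangle^{-\beta}. \tag{3.44} \]

\[ |\partial^2_{x,\xi}(\varphi^\pm(x,\xi) - x\xi)| \le \frac{C}{R^2}. \tag{3.45} \]

\[ \forall\,\alpha,\beta\in\mathbb{N},\quad |\partial_x^\alpha\partial_\xi^\beta K^\pm(x,\xi)| \le C_{\alpha\beta}\,\langle x\rangle^{-1-\alpha}\langle\xi\rangle^{-1-\beta}. \tag{3.46} \]

\[ \forall\,\alpha,\beta\in\mathbb{N},\quad |\partial_x^\alpha\partial_\xi^\beta\big(p^\pm(x,\xi) - P_\pm^m(\xi)\big)| \le C_{\alpha\beta}\,\langle x\rangle^{-1-\alpha}\langle\xi\rangle^{-1-\beta}. \tag{3.47} \]

\[ \forall\,\alpha,\beta\in\mathbb{N},\quad |\partial_x^\alpha\partial_\xi^\beta r^\pm(x,\xi)| \le C_{\alpha\beta}\,\langle x\rangle^{-2-\alpha}\langle\xi\rangle^{-\beta}. \tag{3.48} \]

\[ \forall\,\alpha,\beta\in\mathbb{N},\quad |\partial_x^\alpha\partial_\xi^\beta q^\pm(x,\xi)| \le C_{\alpha\beta}\,\langle x\rangle^{-2-\alpha}\langle\xi\rangle^{-1-\beta}. \tag{3.49} \]

\[ \forall\,\alpha,\beta\in\mathbb{N},\quad |\partial_x^\alpha\partial_\xi^\beta c^\pm(x,\xi)| \le C_{\alpha\beta}\,\langle x\rangle^{-2-\alpha}\langle\xi\rangle^{-1-\beta}. \tag{3.50} \]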

Thanks to (3.43)–(3.45) and (3.47), for $R$ large enough, we can define precisely our modifiers $J^\pm$ as bounded operators on $\mathcal{H}$ (see [27], for instance). Let $\chi^+\in C^\infty(\mathbb{R})$ be a cutoff function in space variables such that $\chi^+(x) = 0$ if $x\le\frac12$ and $\chi^+(x) = 1$ if $x\ge 1$. Let also $\theta\in C^\infty(\mathbb{R})$ be a cutoff function in energy variables such that $\theta(\xi) = 0$ if $|\xi|\le\frac12$ and $\theta(\xi) = 1$ if $|\xi|\ge 1$. For $R$ large enough, $J^\pm$ are the Fourier Integral Operators with phases $\varphi^\pm(x,\xi)$ and amplitudes

\[ P^\pm(x,\xi) = \chi^+(x)\,p^\pm(x,\xi)\,\theta\Big(\frac{\xi}{R}\Big). \tag{3.51} \]

We finish this part with a first application of the previous construction. In the next theorem, the modifiers $J^\pm_{(+\infty)}$ are shown to be time-independent modifications of Isozaki–Kitada type equivalent to the Dollard modification (2.21). Precisely, we have

Theorem 3.3. For any $\psi\in\mathcal{H}$ such that $\operatorname{supp}\hat\psi\subset X$, we have

\[ W^\pm_{(+\infty)}\psi = \lim_{t\to\pm\infty} e^{itH}J^\pm_{(+\infty)}e^{-it\nu^\pm(D_x)}P_\pm^m\psi. \tag{3.52} \]

Proof. We only sketch the proof for the case (+). By definition of $P_+^m$, we have

\[ U(t)P_+^m\psi = e^{-it\nu^+(D_x)}\,e^{-i\int_0^t\big[\big(b\big(s\frac{|D_x|}{\sqrt{D_x^2+m^2}}\big)-m\big)\frac{m}{\nu^+(D_x)} + c\big(s\frac{|D_x|}{\sqrt{D_x^2+m^2}}\big)\big]ds}\,P_+^m\psi := V(t)P_+^m\psi. \tag{3.53} \]

Then, we write:

\[ e^{itH}J^+_{(+\infty)}e^{-it\nu^+(D_x)}P_+^m\psi = e^{itH}V(t)\,\big[V^*(t)e^{-it\nu^+(D_x)}\big]\times\big[e^{it\nu^+(D_x)}J^+_{(+\infty)}e^{-it\nu^+(D_x)}\big]P_+^m\psi \tag{3.54} \]

\[ = e^{itH}V(t)\,e^{i\int_0^t[\cdots]ds}\,e^{it\nu^+(D_x)}J^+_{(+\infty)}e^{-it\nu^+(D_x)}P_+^m\psi. \tag{3.55} \]


The classical flow associated with the Hamiltonian $\nu^+(\xi) = \operatorname{sgn}(\xi)\sqrt{\xi^2+m^2}$ is given by

\[ \Phi^t(x,\xi) = \Big(x + t\,\frac{|\xi|}{\sqrt{\xi^2+m^2}},\ \xi\Big). \tag{3.56} \]

Then, using Egorov's theorem, we see that $e^{it\nu^+(D_x)}J^+_{(+\infty)}e^{-it\nu^+(D_x)}$ is a FIO with phase $\varphi^+(t,x,\xi) = x\xi + \phi^+(x+t\eta,\xi)$ and with principal symbol$^c$ $P^+(x+t\eta,\xi)$, where $\eta = \frac{|\xi|}{\sqrt{\xi^2+m^2}}$.

Thus, $e^{i\int_0^t[\cdots]ds}\,e^{it\nu^+(D_x)}J^+_{(+\infty)}e^{-it\nu^+(D_x)}$ is a FIO with the same principal symbol and with phase $\varphi_1^+(t,x,\xi) = x\xi + \phi_1^+(t,x,\xi)$, where

\[ \phi_1^+(t,x,\xi) = \frac{1}{2\xi}\int_{x+t\eta}^{+\infty}[a^2(s)-c^2(s)]\,ds - \frac{1}{2\xi}\int_0^{x+t\eta}\big[(b^2(s)-m^2) + 2c(s)\nu^+(\xi)\big]\,ds + \frac{1}{2\xi}\int_0^{+\infty}(b(s)-m)^2\,ds + \int_0^t\Big[(b(s\eta)-m)\frac{m}{\nu^+(\xi)} + c(s\eta)\Big]ds. \tag{3.57} \]

Since $\frac{1}{2\xi}\int_{x+t\eta}^{+\infty}[a^2(s)-c^2(s)]\,ds = o(1)$ when $t\to+\infty$, and by making a change of variables in the last integral, we obtain

\[ \phi_1^+(t,x,\xi) = -\frac{1}{2\xi}\int_0^{x+t\eta}\big[(b^2(s)-m^2) + 2c(s)\nu^+(\xi)\big]\,ds + \frac{1}{2\xi}\int_0^{+\infty}(b(s)-m)^2\,ds + \frac{1}{2\xi}\int_0^{t\eta}\big[2(b(s)-m)m + 2c(s)\nu^+(\xi)\big]\,ds + o(1). \tag{3.58} \]

Using again that $\frac{1}{2\xi}\int_{t\eta}^{x+t\eta}\big[(b^2(s)-m^2) + 2c(s)\nu^+(\xi)\big]\,ds = o(1)$, we see that

\[ \phi_1^+(t,x,\xi) = -\frac{1}{2\xi}\int_0^{t\eta}\big[(b^2(s)-m^2) + 2c(s)\nu^+(\xi)\big]\,ds + \frac{1}{2\xi}\int_0^{+\infty}(b(s)-m)^2\,ds + \frac{1}{2\xi}\int_0^{t\eta}\big[2(b(s)-m)m + 2c(s)\nu^+(\xi)\big]\,ds + o(1). \tag{3.59} \]

Then,

\[ \phi_1^+(t,x,\xi) = -\frac{1}{2\xi}\int_0^{t\eta}(b(s)-m)^2\,ds + \frac{1}{2\xi}\int_0^{+\infty}(b(s)-m)^2\,ds + o(1) = o(1). \tag{3.60} \]

$^c$ It means that the other terms of the symbol are $o(1)$ when $t\to+\infty$.
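For readers who wish to verify the passage from (3.57) to (3.60) by hand, the two elementary steps can be written out as follows. This is only a sketch: it uses $\nu^+(\xi) = \operatorname{sgn}(\xi)\sqrt{\xi^2+m^2}$ and $\eta = |\xi|/\sqrt{\xi^2+m^2}$ as above, so that $\eta\,\nu^+(\xi) = \xi$, and it assumes that $(b-m)^2$ is integrable at $+\infty$, which is what makes the last limit in (3.60) an $o(1)$.

```latex
% Step 1: change of variables s -> s*eta in the last integral of (3.57),
% using eta * nu^+(xi) = xi.
\begin{align*}
\int_0^t \Big[(b(s\eta)-m)\frac{m}{\nu^+(\xi)} + c(s\eta)\Big]\,ds
  &= \frac{1}{\eta}\int_0^{t\eta}\Big[(b(s)-m)\frac{m}{\nu^+(\xi)} + c(s)\Big]\,ds \\
  &= \frac{1}{2\xi}\int_0^{t\eta}\big[2(b(s)-m)m + 2c(s)\nu^+(\xi)\big]\,ds .
\end{align*}

% Step 2: pointwise cancellation of the integrands in (3.59).
\[
-\big[(b^2-m^2) + 2c\,\nu^+\big] + \big[2(b-m)m + 2c\,\nu^+\big]
  = -b^2 + 2bm - m^2 = -(b-m)^2 .
\]

% Step 3: the two remaining terms combine into a tail integral.
\[
-\frac{1}{2\xi}\int_0^{t\eta}(b(s)-m)^2\,ds + \frac{1}{2\xi}\int_0^{+\infty}(b(s)-m)^2\,ds
  = \frac{1}{2\xi}\int_{t\eta}^{+\infty}(b(s)-m)^2\,ds \xrightarrow[t\to+\infty]{} 0 .
\]
```

Step 3 also makes visible why the constant $\frac{1}{2\xi}\int_0^{+\infty}(b(s)-m)^2\,ds$ was added to the phase in (3.26), as noted in Remark 3.4.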


Using (3.43), (3.44), (3.47) and the continuity of FIOs, we see that

\[ e^{i\int_0^t[\cdots]ds}\,e^{it\nu^+(D_x)}J^+_{(+\infty)}e^{-it\nu^+(D_x)}P_+^m\psi = P_+^m\psi + o(1), \tag{3.61} \]

and Theorem 3.3 follows from (3.55) and (3.61).

We now construct the modifiers at high energy $J^\pm_{(+\infty)}(\lambda)$ so that they satisfy (3.9) and (3.15). We still omit the lower index $(+\infty)$ in the next notations. Comparing (3.15) and (3.52) suggests constructing $J^\pm(\lambda)$ close to $e^{-i\lambda x}J^\pm e^{i\lambda x}$, which are clearly FIOs with phases $\varphi^\pm(x,\xi,\lambda) = x\xi + \phi^\pm(x,\xi+\lambda)$ and amplitudes $P^\pm(x,\xi+\lambda)$. With $J^\pm(\lambda) = e^{-i\lambda x}J^\pm e^{i\lambda x}$, we see from (3.50) that the amplitudes

\[ c^\pm(x,\xi,\lambda) = B^\pm(x,\xi+\lambda)P^\pm(x,\xi+\lambda) - i\Gamma^1\partial_x P^\pm(x,\xi+\lambda) \]

of the operators $C^\pm(\lambda) = H(\lambda)J^\pm(\lambda) - J^\pm(\lambda)\nu^\pm(D_x+\lambda)$ would satisfy the estimate

\[ c^\pm(x,\xi,\lambda) = O(\langle x\rangle^{-2}\lambda^{-1}), \tag{3.62} \]

for $\xi$ in a compact set. Here and in the following, the notation $f(x,\lambda) = O(\langle x\rangle^{-2}\lambda^{-1})$ means that $f(x,\lambda)$ decays as $\langle x\rangle^{-2}$ when $x\to+\infty$ and as $\lambda^{-1}$ when $\lambda\to+\infty$. We want however the amplitudes $c^\pm(x,\xi,\lambda)$ to be of order $O(\langle x\rangle^{-2}\lambda^{-2})$, and the decay in (3.62) is not sufficient for our purpose. In consequence, we need to refine our construction. Following the procedure given in [4], we look for modifiers $J^\pm(\lambda)$ defined as FIOs with phases $\varphi^\pm(x,\xi,\lambda)$ and with new amplitudes $P^\pm(x,\xi,\lambda)$ that take the form

\[ P^\pm(x,\xi,\lambda) = \Big(p^\pm(x,\xi+\lambda) + \frac{1}{\lambda}\,p^\pm(x,\xi+\lambda)\,l^\pm(x) + \frac{1}{\lambda^2}\,P_\mp k^\pm(x)\Big), \tag{3.63} \]

(up to suitable cutoff functions defined later), where $P_\pm$ denote the projections onto the positive and negative spectrum of $\Gamma^1$. Here the correctors $l^\pm, k^\pm$ (which can be matrix-valued) will be functions of $x$ only and should satisfy some decay in $x$ (see below). It will be clear in the next calculations why we add such correctors to the amplitudes $p^\pm(x,\xi+\lambda)$. We now choose $l^\pm$ and $k^\pm$ in (3.63) so that the amplitudes

\[ c^\pm(x,\xi,\lambda) = B^\pm(x,\xi+\lambda)\Big(p^\pm(x,\xi+\lambda) + \frac{1}{\lambda}\,p^\pm(x,\xi+\lambda)\,l^\pm(x) + \frac{1}{\lambda^2}\,P_\mp k^\pm(x)\Big) - i\Gamma^1\Big(\partial_x p^\pm(x,\xi+\lambda) + \frac{1}{\lambda}\,\partial_x p^\pm(x,\xi+\lambda)\,l^\pm(x) + \frac{1}{\lambda}\,p^\pm(x,\xi+\lambda)\,\partial_x l^\pm(x) + \frac{1}{\lambda^2}\,P_\mp\partial_x k^\pm(x)\Big), \tag{3.64} \]

of the operators $C^\pm(\lambda)$ are of order $O(\langle x\rangle^{-2}\lambda^{-2})$.


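To prove this, we need the asymptotics of the different functions appearing in (3.64). For $x$ in $\mathbb{R}^+$ and for $\lambda$ large enough, we obtain (after long and tedious calculations)

\[ \nu^\pm(\xi+\lambda) = \pm\Big(\lambda + \xi + \frac{m^2}{2\lambda}\Big) + O(\lambda^{-2}). \tag{3.65} \]

\[ d^\pm(x,\xi+\lambda) = \pm 2c(x)\Big(\lambda + \xi + \frac{m^2}{2\lambda}\Big) + O(\langle x\rangle^{-1}\lambda^{-2}). \tag{3.66} \]

\[ K^\pm(x,\xi+\lambda) = \pm\frac{1}{2\lambda}\big[2P_\mp c(x) + a(x)\Gamma^2 + (b(x)-m)\Gamma^0\big] + O(\langle x\rangle^{-1}\lambda^{-2}). \tag{3.67} \]

\[ P_\pm^m(\xi+\lambda) = P_\pm + O(\lambda^{-1}). \tag{3.68} \]

\[ p^\pm(x,\xi+\lambda) = P_\pm + O(\lambda^{-1}). \tag{3.69} \]

\[ \partial_x p^\pm(x,\xi+\lambda) = \pm\frac{1}{2\lambda}\,P_\mp\big(a'(x)\Gamma^2 + b'(x)\Gamma^0\big) + O(\langle x\rangle^{-2}\lambda^{-2}). \tag{3.70} \]

\[ B^\pm(x,\xi+\lambda) = \mp 2(\xi+\lambda)P_\mp + 2c(x)P_\mp + a(x)\Gamma^2 + b(x)\Gamma^0 + O(\lambda^{-1}). \tag{3.71} \]

\[ q^\pm(x,\xi+\lambda) = B^\pm(x,\xi+\lambda)\,p^\pm(x,\xi+\lambda) = \pm\frac{1}{2\lambda}\,c^2(x)P_\pm + O(\langle x\rangle^{-2}\lambda^{-2}). \tag{3.72} \]

We mention that the following simple equalities have been used several times to prove the preceding asymptotics:

\[ 1 + \Gamma^1 = 2\begin{pmatrix} I_2 & 0\\ 0 & 0\end{pmatrix} = 2P_+,\qquad 1 - \Gamma^1 = 2\begin{pmatrix} 0 & 0\\ 0 & I_2\end{pmatrix} = 2P_-. \tag{3.73} \]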
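The first expansion, (3.65), is the source of the $O(\lambda^{-2})$ remainders in the list above, and its order can be checked numerically. The snippet below is a minimal sketch for the case (+), with arbitrary sample values of $m$ and $\xi$. It assumes $\nu^+(\xi) = \operatorname{sgn}(\xi)\sqrt{\xi^2+m^2}$, and the function names are illustrative. If the remainder is genuinely $O(\lambda^{-2})$, the rescaled error $\lambda^2\cdot\text{error}$ should stay bounded as $\lambda$ grows.

```python
import math

def nu_plus(xi, m):
    """nu^+(xi) = sgn(xi) * sqrt(xi^2 + m^2)."""
    return math.copysign(1.0, xi) * math.sqrt(xi * xi + m * m)

def nu_plus_expansion(xi, lam, m):
    """Leading terms of (3.65): lambda + xi + m^2 / (2 lambda)."""
    return lam + xi + m * m / (2.0 * lam)

def expansion_error(xi, lam, m):
    """Absolute error of the expansion for nu^+(xi + lambda)."""
    return abs(nu_plus(xi + lam, m) - nu_plus_expansion(xi, lam, m))

m, xi = 1.0, 0.5  # arbitrary sample values
for lam in (10.0, 100.0, 1000.0):
    err = expansion_error(xi, lam, m)
    # The last column stays bounded when the remainder is O(lambda^-2).
    print(lam, err, err * lam ** 2)
```

A short Taylor expansion suggests that the rescaled error tends to $m^2|\xi|/2$, which is consistent with the remainder being uniform for $\xi$ in a compact set.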

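By (3.69)–(3.72), the amplitudes $c^\pm(x,\xi,\lambda)$ take the form

\[ c^\pm(x,\xi,\lambda) = \pm\frac{1}{2\lambda}c^2P_\pm \pm \frac{1}{2\lambda^2}c^2P_\pm l^\pm + \frac{1}{\lambda^2}\Big(\mp 2(\xi+\lambda)P_\mp + 2cP_\mp + a\Gamma^2 + b\Gamma^0 + O\Big(\frac{1}{\lambda}\Big)\Big)P_\mp k^\pm - i\Gamma^1\Big(\pm\frac{1}{2\lambda}P_\mp(a'\Gamma^2 + b'\Gamma^0) \pm \frac{1}{2\lambda^2}P_\mp(a'\Gamma^2 + b'\Gamma^0)\,l^\pm + \frac{1}{\lambda}\Big(P_\pm + O\Big(\frac{1}{\lambda}\Big)\Big)\partial_x l^\pm + \frac{1}{\lambda^2}P_\mp\partial_x k^\pm\Big) + O\Big(\frac{1}{\langle x\rangle^2\lambda^2}\Big). \]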


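From the asymptotics (2.20) of the potentials $a, b, c$, we rewrite this last expression as

\[ c^\pm(x,\xi,\lambda) = \pm\frac{1}{2\lambda}c^2P_\pm \mp \frac{2}{\lambda}P_\mp k^\pm \mp \frac{i}{2\lambda}\Gamma^1P_\mp(a'\Gamma^2 + b'\Gamma^0) - \frac{i}{\lambda}\Gamma^1P_\pm\partial_x l^\pm + R(x,\lambda), \tag{3.74} \]

where the remainder $R(x,\lambda)$ satisfies

\[ R(x,\lambda) = O\Big(\frac{1 + |l^\pm(x)|}{\langle x\rangle^2\lambda^2} + \frac{|\partial_x l^\pm(x)|}{\lambda^2} + \frac{|k^\pm(x)|}{\lambda^2} + \frac{|k^\pm(x)|}{\langle x\rangle\lambda^2} + \frac{|\partial_x k^\pm(x)|}{\lambda^2}\Big). \tag{3.75} \]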

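Now we choose the correctors $l^\pm, k^\pm$ in such a way that the terms of order $O(\lambda^{-1})$ in (3.74) cancel. Once this is done, we shall have to check that the remainder (3.75) is of order $O(\langle x\rangle^{-2}\lambda^{-2})$. There are clearly two different types of terms in the expression (3.74): on the one hand, the terms

\[ \pm\frac{1}{2\lambda}c^2P_\pm - \frac{i}{\lambda}\Gamma^1P_\pm\partial_x l^\pm = \frac{1}{\lambda}P_\pm\Big(\pm\frac{1}{2}c^2 \mp i\partial_x l^\pm\Big) \]

"live" in $\mathcal{H}_\pm = P_\pm(\mathcal{H})$; on the other hand, the terms

\[ \mp\frac{2}{\lambda}P_\mp k^\pm \mp \frac{i}{2\lambda}\Gamma^1P_\mp(a'\Gamma^2 + b'\Gamma^0) = \frac{1}{\lambda}P_\mp\Big(\mp 2k^\pm + \frac{i}{2}(a'\Gamma^2 + b'\Gamma^0)\Big) \]

"live" in $\mathcal{H}_\mp = P_\mp(\mathcal{H})$. Since the Hilbert spaces $\mathcal{H}_-$ and $\mathcal{H}_+$ form a direct sum decomposition of $\mathcal{H}$, i.e. $\mathcal{H} = \mathcal{H}_-\oplus\mathcal{H}_+$, we can consider separately the equations

\[ \pm\frac{1}{2}c^2 \mp i\partial_x l^\pm = 0, \tag{3.76} \]

\[ \mp 2k^\pm + \frac{i}{2}(a'\Gamma^2 + b'\Gamma^0) = 0, \tag{3.77} \]

in order to cancel the terms of order $O(\lambda^{-1})$ in (3.74). We first solve (3.76) and obtain

\[ l^\pm(x) = l(x) = \frac{i}{2}\int_x^{+\infty}c^2(s)\,ds. \tag{3.78} \]

Then we solve (3.77) and get

\[ k^\pm(x) = \pm\frac{i}{4}\big(a'(x)\Gamma^2 + b'(x)\Gamma^0\big). \tag{3.79} \]

The functions $l$ and $k^\pm$ clearly satisfy, when $x\to+\infty$,

\[ l(x) = O(\langle x\rangle^{-1}),\qquad \partial_x l(x) = O(\langle x\rangle^{-2}),\qquad k^\pm(x) = O(\langle x\rangle^{-2}). \tag{3.80} \]

Finally, with this choice of correcting terms $l$ and $k^\pm$, we conclude from (3.74) and (3.75) that $c^\pm(x,\xi,\lambda) = R(x,\lambda) = O(\langle x\rangle^{-2}\lambda^{-2})$.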
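As an illustration of how the corrector $l$ of (3.78) cancels the $O(\lambda^{-1})$ term in $\mathcal{H}_\pm$, one can test equation (3.76) on a toy potential. The sketch below assumes a hypothetical Coulomb-like profile $c(x) = \gamma/(1+x)$ for $x\ge 0$ (it is not the potential of the paper), for which the integral in (3.78) has the closed form $l(x) = \frac{i}{2}\,\gamma^2/(1+x)$. The derivative is approximated by a central finite difference, and all names are illustrative.

```python
def c(x, gamma=0.8):
    """Hypothetical Coulomb-like potential c(x) = gamma / (1 + x), x >= 0."""
    return gamma / (1.0 + x)

def l(x, gamma=0.8):
    """Corrector (3.78): (i/2) * integral of c(s)^2 over [x, +infinity),
    evaluated in closed form for the toy potential above."""
    return 0.5j * gamma ** 2 / (1.0 + x)

def dl(x, h=1e-5, gamma=0.8):
    """Central finite-difference approximation of the derivative of l."""
    return (l(x + h, gamma) - l(x - h, gamma)) / (2.0 * h)

def residual_3_76(x, sign=+1, gamma=0.8):
    """Left-hand side of (3.76): +-(c^2 / 2) -+ i * d_x l."""
    return sign * (0.5 * c(x, gamma) ** 2 - 1j * dl(x, gamma=gamma))

for x in (0.5, 3.0, 20.0):
    # Residual should vanish up to discretization error; (1 + x) * |l(x)|
    # stays constant, in line with the decay l(x) = O(<x>^-1) of (3.80).
    print(x, abs(residual_3_76(x)), (1.0 + x) * abs(l(x)))
```

The same kind of check for (3.77) is immediate, since (3.79) solves it algebraically without any integration.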


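In fact, we can prove that for all $x\in\mathbb{R}^+$, $\xi$ in a compact set and $\lambda$ large enough,

\[ \forall\,\alpha,\beta\in\mathbb{N},\quad |\partial_x^\alpha\partial_\xi^\beta c^\pm(x,\xi,\lambda)| \le C_{\alpha\beta}\,\langle x\rangle^{-2-\alpha}\lambda^{-2}. \tag{3.81} \]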

Let us summarize the previous results. The modifiers $J^\pm(\lambda)$ are (formally) constructed as FIOs with phases $\varphi^\pm(x,\xi,\lambda) = x\xi + \phi^\pm(x,\xi+\lambda)$, where

\[ \phi^\pm(x,\xi+\lambda) = \frac{1}{2(\xi+\lambda)}\Big(\int_x^{+\infty}[a^2(s)-c^2(s)]\,ds - \int_0^x\big[(b^2(s)-m^2) + d^\pm(s,\xi+\lambda)\big]\,ds + \int_0^{+\infty}(b(s)-m)^2\,ds\Big), \tag{3.82} \]

and amplitudes

\[ P^\pm(x,\xi,\lambda) = \Big(p^\pm(x,\xi+\lambda) + \frac{1}{\lambda}\,p^\pm(x,\xi+\lambda)\,l(x) + \frac{1}{\lambda^2}\,P_\mp k^\pm(x)\Big), \tag{3.83} \]

where $l$ and $k^\pm$ are given by (3.78) and (3.79) respectively. Unfortunately, since $\phi^\pm(x,\xi+\lambda) = O(\langle x\rangle)$ when $x\to-\infty$, this phase does not belong to a good class of oscillating symbols. So we have to introduce some technical cutoff functions in the amplitude in order to localize $x$ far away from $-\infty$. Moreover, these cutoff functions must be negligible in the asymptotics in the previous calculus. We follow the strategy exposed in [22], which we briefly recall here.

We consider a fixed test function $\psi\in C_0^\infty(\mathbb{R})$ and we want to calculate the asymptotics of $W^\pm_{(+\infty)}(\lambda)\psi$. Since $\hat\psi\notin C_0^\infty(\mathbb{R})$, at high energies, translation of wave packets does not dominate over spreading. So we introduce a cutoff function (depending on $\lambda$) in order to control the spreading. Let $\chi_0\in C_0^\infty(\mathbb{R})$ be a cutoff function such that $\chi_0(\xi) = 1$ if $|\xi|\le 1$ and $\chi_0(\xi) = 0$ if $|\xi|\ge 2$. Using the Fourier representation, we easily have:

\[ \forall\,\epsilon>0,\ \forall\,N\ge 1,\quad \Big\|\Big(\chi_0\Big(\frac{D_x}{\lambda^\epsilon}\Big) - 1\Big)\psi\Big\|_{L^2(\mathbb{R})} = O(\lambda^{-N}). \tag{3.84} \]

Now, let us define the classical propagation zone:

\[ \Omega = \{x + t;\ x\in\operatorname{supp}\psi,\ t\in\mathbb{R}^+\}, \tag{3.85} \]

and let $\eta^+\in C^\infty(\mathbb{R})$ be a cutoff function such that $\eta^+ = 1$ in a neighborhood of $\Omega$ and $\eta^+ = 0$ in a neighborhood of $-\infty$. We consider

\[ K^\pm(\lambda) = (\eta^+ - 1)\,e^{-it\nu^\pm(D_x+\lambda)}\,P_\pm^{m,\lambda}\,\chi_0\Big(\frac{D_x}{\lambda^\epsilon}\Big)\psi. \tag{3.86} \]

Lemma 3.3. For $\lambda\gg 1$, $\epsilon\in\,]0,1[$, $t\in\mathbb{R}^\pm$, and $N\ge 1$, we have:

\[ \|K^\pm(\lambda)\|_{L^2(\mathbb{R})} = O(\langle t\rangle^{-N}\lambda^{-N}). \tag{3.87} \]


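Proof. We only sketch the proof for the case (+). Using the Fourier transform and (3.39), we easily see that

\[ K^+(\lambda) = \frac{1}{4\pi}\iint e^{i\varphi(\xi)}\,(\eta^+(x)-1)\,\lambda^\epsilon\Big(I_4 + \frac{\Gamma^1(\lambda^\epsilon\xi+\lambda) + m\Gamma^0}{\sqrt{(\lambda^\epsilon\xi+\lambda)^2+m^2}}\Big)\chi_0(\xi)\,d\xi\ \psi(y)\,dy, \tag{3.88} \]

where $\varphi(\xi) = \lambda^\epsilon(x-y)\xi - t\sqrt{(\lambda^\epsilon\xi+\lambda)^2+m^2}$. So,

\[ \partial_\xi\varphi(\xi) = \lambda^\epsilon\Big(x - y - t\,\frac{1+\lambda^{\epsilon-1}\xi}{\sqrt{(1+\lambda^{\epsilon-1}\xi)^2 + m^2\lambda^{-2}}}\Big). \tag{3.89} \]

Since $\xi$ is in a compact set, $\epsilon<1$ and $y\in\operatorname{supp}\psi$, we easily obtain for $x\in\operatorname{supp}(\eta^+-1)$ and $\lambda\gg 1$,

\[ |\partial_\xi\varphi(\xi)| \ge c\,\lambda^\epsilon(1+t), \tag{3.90} \]

for a suitable constant $c>0$. We conclude by a standard non-stationary phase argument.

Now, we can define precisely our modifiers $J^\pm(\lambda)$ in order to calculate the asymptotics of $W^\pm_{(+\infty)}(\lambda)\psi$. According to (3.84), it suffices to calculate the asymptotics of $W^\pm_{(+\infty)}(\lambda)\chi_0\big(\frac{D_x}{\lambda^\epsilon}\big)\psi$. We first remark that for $\lambda\gg 1$ and $\epsilon<1$, we have $\xi+\lambda\in X$ if $\frac{\xi}{\lambda^\epsilon}\in\operatorname{supp}\chi_0$. So we can define the modifiers $J^\pm(\lambda)$ as FIOs with phases $\varphi^\pm(x,\xi,\lambda) = x\xi + \phi^\pm(x,\xi+\lambda)$, where $\phi^\pm(x,\xi+\lambda)$ are given by (3.82), and with amplitudes

\[ P^\pm(x,\xi,\lambda) = \eta^+(x)\Big(p^\pm(x,\xi+\lambda) + \frac{1}{\lambda}\,p^\pm(x,\xi+\lambda)\,l(x) + \frac{1}{\lambda^2}\,P_\mp k^\pm(x)\Big)\chi_0\Big(\frac{\xi}{\lambda^\epsilon}\Big), \tag{3.91} \]

where $l$ and $k^\pm$ are given by (3.78) and (3.79), respectively. With this definition, we can mimic the proof of Theorem 3.3 to get

Lemma 3.4. For $\psi\in C_0^\infty(\mathbb{R})$ and for $\lambda$ large, we have

\[ W^\pm_{(+\infty)}(\lambda)\,\chi_0\Big(\frac{D_x}{\lambda^\epsilon}\Big)\psi = \lim_{t\to\pm\infty} e^{itH(\lambda)}J^\pm_{(+\infty)}(\lambda)\,e^{-it\nu^\pm(D_x+\lambda)}P_\pm^{m,\lambda}\psi. \tag{3.92} \]

Moreover, it is easy to see that the estimates (3.81) are still satisfied, so we can prove our main estimate (3.9). Precisely we get

Lemma 3.5. For $\psi\in C_0^\infty(\mathbb{R})$ and when $\lambda$ tends to infinity, the following estimate holds:

\[ \big\|\big(W^\pm_{(+\infty)}(\lambda) - J^\pm_{(+\infty)}(\lambda)\big)\psi\big\| = O(\lambda^{-2}). \]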


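Proof. Everything done in [4], Lemma 3.3 works here in the same way. All the contributions coming from the cutoff function $\eta^+$ are negligible, using the same arguments as in Lemma 3.3, since the support of the derivatives of $\eta^+$ is far away from $\Omega$.

We end this section by giving the asymptotics of $W^\pm_{(+\infty)}(\lambda)$ when $\lambda$ is large. According to Lemma 3.5, we have for any $\psi\in C_0^\infty(\mathbb{R};\mathbb{C}^4)$, $W^\pm_{(+\infty)}(\lambda)\psi = J^\pm_{(+\infty)}(\lambda)\psi + O(\lambda^{-2})$. Thus we only need to compute the asymptotics of the modifiers $J^\pm_{(+\infty)}(\lambda)$, which we shall consider as pseudodifferential operators with symbols

\[ j^\pm(x,\xi,\lambda) = e^{i\phi^\pm(x,\xi+\lambda)}\,P^\pm(x,\xi,\lambda). \]

Using the explicit expressions (3.82) and (3.91), we first get the asymptotics

\[ \phi^\pm(x,\xi+\lambda) = \mp\int_0^x c(s)\,ds + \frac{1}{2\lambda}\Big(\int_x^{+\infty}(a^2-c^2)(s)\,ds - \int_0^x(b^2(s)-m^2)\,ds + \int_0^{+\infty}(b(s)-m)^2\,ds\Big) + O\Big(\frac{\log\langle x\rangle}{\lambda^2}\Big), \tag{3.93} \]

\[ P^\pm(x,\xi,\lambda) = \eta^+(x)\Big(P_\pm \pm \frac{1}{2\lambda}P_\mp(a\Gamma^2 + b\Gamma^0) + \frac{l(x)}{\lambda}P_\pm + O\Big(\frac{1}{\lambda^2}\Big)\Big). \tag{3.94} \]

Moreover, using a Taylor expansion of $e^t$ at $t = 0$, we get from (3.93)

\[ e^{i\phi^\pm(x,\xi+\lambda)} = e^{\mp iC^+(x)}\Big(1 + \frac{i}{2\lambda}\tilde C^+(x) + O\Big(\frac{\log\langle x\rangle}{\lambda^2}\Big)\Big), \tag{3.95} \]

with

\[ C^+(x) = \int_0^x c(s)\,ds,\qquad \tilde C^+(x) = \int_x^{+\infty}(a^2-c^2)(s)\,ds - \int_0^x(b^2(s)-m^2)\,ds + \int_0^{+\infty}(b(s)-m)^2\,ds. \tag{3.96} \]

Combining now (3.94) and (3.95), we obtain

\[ j^\pm(x,\xi,\lambda) = e^{\mp iC^+(x)}\,\eta^+(x)\Big(P_\pm + \frac{i}{2\lambda}\tilde C^+(x)P_\pm \pm \frac{1}{2\lambda}P_\mp(a\Gamma^2 + b\Gamma^0) + \frac{l(x)}{\lambda}P_\pm\Big) + O\Big(\frac{1}{\lambda^2}\Big). \tag{3.97} \]

However, notice from (3.78) that

\[ \frac{i}{2\lambda}\tilde C^+(x) + \frac{l(x)}{\lambda} = \frac{i}{2\lambda}\Big(\int_x^{+\infty}a^2(s)\,ds - \int_0^x(b^2(s)-m^2)\,ds + \int_0^{+\infty}(b(s)-m)^2\,ds\Big), \]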


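and from the anticommutation properties (2.16) of the Dirac matrices that $P_\mp(a\Gamma^2 + b\Gamma^0) = (a\Gamma^2 + b\Gamma^0)P_\pm$. Hence (3.97) becomes

\[ j^\pm(x,\xi,\lambda) = e^{\mp iC^+(x)}\,\eta^+(x)\Big(1 + \frac{i}{2\lambda}\Big[\int_x^{+\infty}a^2(s)\,ds - \int_0^x(b^2(s)-m^2)\,ds + \int_0^{+\infty}(b(s)-m)^2\,ds\Big] \pm \frac{1}{2\lambda}(a\Gamma^2 + b\Gamma^0)\Big)P_\pm + O\Big(\frac{1}{\lambda^2}\Big). \tag{3.98} \]

Eventually, if we introduce the notations

\[ R^\pm(x) = \frac{i}{2}\Big(\int_x^{+\infty}a^2(s)\,ds - \int_0^x(b^2(s)-m^2)\,ds + \int_0^{+\infty}(b(s)-m)^2\,ds\Big) \pm \frac{1}{2}(a\Gamma^2 + b\Gamma^0), \tag{3.99} \]

we deduce from (3.98) and the fact that $\eta^+(x) = 1$ on $\operatorname{supp}\psi$ the following proposition.

Proposition 3.1. For any $\psi\in C_0^\infty(\mathbb{R};\mathbb{C}^4)$,

\[ W^\pm_{(+\infty)}(\lambda)\psi = e^{\mp iC^+(x)}\Big(1 + \frac{1}{\lambda}R^\pm(x)\Big)P_\pm\psi + O\Big(\frac{1}{\lambda^2}\Big), \tag{3.100} \]

where $C^+(x)$ and $R^\pm(x)$ are given by (3.96) and (3.99), respectively.

3.2. Asymptotics of $W^\pm_{(-\infty)}(\lambda)$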

In this subsection, we focus on what happens at the event horizon and give the asymptotics of $W^\pm_{(-\infty)}(\lambda)$ when $\lambda\to+\infty$. In fact, we shall derive them from the results obtained in the preceding Sec. 3.1 after some simplifications of our model. As usual, we shall omit the lower index $(-\infty)$ in the objects defined or used hereafter. Recall that the expressions of the wave operators at the event horizon are given by (see (2.22))

\[ W^\pm = \operatorname*{s-lim}_{t\to\pm\infty} e^{itH}e^{-itH_0}P_\mp, \]

where $H_0 = \Gamma^1D_x + c_0$, $H = \Gamma^1D_x + a\Gamma^2 + b\Gamma^0 + c$ and the potentials $a$, $b$, $c - c_0$ satisfy (2.19) when $x\to-\infty$. We first simplify this expression in a convenient way. Let us introduce the unitary transform $U$ on $\mathcal{H}$

\[ U = e^{-i\Gamma^1C^-(x)},\qquad C^-(x) = \int_{-\infty}^x[c(s)-c_0]\,ds + c_0x, \tag{3.101} \]

and define the selfadjoint operators on $\mathcal{H}$

\[ A_0 = \Gamma^1D_x,\qquad A = U^*HU. \tag{3.102} \]


Using (3.101), a short calculation shows that the operator $A$ can be rewritten as

\[ A = \Gamma^1D_x + W(x), \tag{3.103} \]

where

\[ W(x) = e^{i\Gamma^1C^-(x)}\big[a(x)\Gamma^2 + b(x)\Gamma^0\big]e^{-i\Gamma^1C^-(x)}. \tag{3.104} \]

Note that according to the anticommutation properties (2.16) of the Dirac matrices, the potential $W$ satisfies $W\Gamma^1 + \Gamma^1W = 0$ and $W^2(x) = a^2(x) + b^2(x)$. Moreover, from (2.19), we get the following estimates for $W$:

\[ \exists\,\alpha>0,\qquad W(x) = O(e^{\alpha x}),\qquad x\to-\infty. \tag{3.105} \]

Using the unitarity of $U$ and (3.102), we rewrite $W^\pm$ as

\[ W^\pm = U\operatorname*{s-lim}_{t\to\pm\infty} e^{itA}U^*e^{-itH_0}P_\mp = U\operatorname*{s-lim}_{t\to\pm\infty} e^{itA}e^{-itA_0}\,e^{itA_0}U^*e^{-itH_0}P_\mp. \tag{3.106} \]
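The algebraic properties of $W$ stated above, together with the relation $WP_- = e^{2iC^-}(a\Gamma^2 + b\Gamma^0)P_-$ used later in the proof of Theorem 3.2, follow from the anticommutation relations alone and can be checked numerically at a fixed point $x$. The sketch below is an illustration under assumptions: $\Gamma^1 = \mathrm{diag}(1,1,-1,-1)$ as in (3.73), hypothetical representations for $\Gamma^2$ and $\Gamma^0$, and arbitrary real sample values for $a(x)$, $b(x)$ and $C^-(x)$. It uses $e^{\pm i\Gamma^1C} = \cos C \pm i\sin C\,\Gamma^1$, which holds because $(\Gamma^1)^2 = 1$.

```python
import cmath
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(*terms):
    """Sum of coeff * matrix over the given (coeff, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(c * M[i][j] for c, M in terms) for j in range(n)]
            for i in range(n)]

def kron(A, B):
    """Kronecker product of two matrices."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def max_abs(M):
    """Largest modulus among the entries of M."""
    return max(abs(x) for row in M for x in row)

I2 = [[1, 0], [0, 1]]
I4 = kron(I2, I2)
G1 = kron([[1, 0], [0, -1]], I2)     # diag(1, 1, -1, -1), as in (3.73)
G2 = kron([[0, 1], [1, 0]], I2)      # assumed representation of Gamma^2
G0 = kron([[0, -1j], [1j, 0]], I2)   # assumed representation of Gamma^0
P_MINUS = lincomb((0.5, I4), (-0.5, G1))

def W(a, b, C):
    """W = e^{i G1 C} (a G2 + b G0) e^{-i G1 C}, as in (3.104)."""
    e_plus = lincomb((math.cos(C), I4), (1j * math.sin(C), G1))
    e_minus = lincomb((math.cos(C), I4), (-1j * math.sin(C), G1))
    M = lincomb((a, G2), (b, G0))
    return matmul(matmul(e_plus, M), e_minus)

def residuals(a, b, C):
    """Residuals of W G1 + G1 W = 0, W^2 = a^2 + b^2,
    and W P_- = e^{2iC} (a G2 + b G0) P_-."""
    w = W(a, b, C)
    M = lincomb((a, G2), (b, G0))
    anti = max_abs(lincomb((1, matmul(w, G1)), (1, matmul(G1, w))))
    square = max_abs(lincomb((1, matmul(w, w)), (-(a * a + b * b), I4)))
    proj = max_abs(lincomb((1, matmul(w, P_MINUS)),
                           (-cmath.exp(2j * C), matmul(M, P_MINUS))))
    return anti, square, proj

print(residuals(a=0.6, b=-1.1, C=0.37))
```

In particular $W^2$ is scalar, which is why the phases of the modifiers at the event horizon involve only $\int W^2(s)\,ds$.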

Now we can simplify the strong limit appearing in (3.106) in two steps. First we claim that

\[ \operatorname*{s-lim}_{t\to\pm\infty} e^{itA_0}U^*e^{-itH_0}P_\mp = e^{i\Gamma^1c_0x}P_\mp. \tag{3.107} \]

Indeed, using the particular diagonal form of $\Gamma^1$ given in (2.18) and since $e^{-itH_0} = e^{-itA_0}e^{-itc_0}$, we have

\[ e^{itA_0}U^*e^{-itH_0}P_\mp = e^{itA_0}e^{i\Gamma^1C^-(x)}e^{-itA_0}e^{-itc_0}P_\mp = e^{i\Gamma^1C^-(x\mp t)}e^{-itc_0}P_\mp. \tag{3.108} \]

When $t\to+\infty$, the right-hand side of (3.108) can be written using (3.101) as

\[ e^{-iC^-(x-t)}e^{-itc_0}P_- = e^{-i\big[\int_{-\infty}^{x-t}(c(s)-c_0)\,ds + c_0x\big]}P_-, \]

from which (3.107) follows when $t\to+\infty$. The case $t\to-\infty$ is obtained similarly. Second, since the potential $W$ decays exponentially when $x\to-\infty$ by (3.105), it follows from the methods used in [3, 18] that the wave operators

\[ W^\pm(A,A_0) = \operatorname*{s-lim}_{t\to\pm\infty} e^{itA}e^{-itA_0}P_\mp \tag{3.109} \]

exist on $\mathcal{H}$. Hence by (3.106), (3.107), (3.109) and the chain rule, we obtain the following nice expressions for $W^\pm$:

\[ W^\pm = U\,W^\pm(A,A_0)\,e^{i\Gamma^1c_0x}P_\mp. \tag{3.110} \]

At last, since $U$ and $e^{i\Gamma^1c_0x}$ commute with $e^{i\lambda x}$, it is clear from (3.110) that it is enough to know the asymptotics of $W^\pm(A,A_0,\lambda) = e^{-i\lambda x}W^\pm(A,A_0)e^{i\lambda x}$ when $\lambda\to+\infty$ in order to get the asymptotics of $W^\pm(\lambda)$. Note here that the $\lambda$-shifted wave operator $W^\pm(A,A_0,\lambda)$ is exactly the kind of wave operator studied in our previous paper [4] in which the asymptotics of


$W^\pm(A,A_0,\lambda)$ were calculated. Nevertheless, we can also easily derive these asymptotics from the results of the preceding section. For completeness this is what we choose to do here. We thus follow our usual strategy and construct modifiers $J_0^\pm(\lambda)$ corresponding to $W^\pm(A,A_0,\lambda)$. This problem is in fact similar to the one in Sec. 3.1. It suffices to replace $H_0^m$ by $A_0$ and $H$ by $A$ in our calculations. From the explicit form (3.102) and (3.103) of the operators $A_0$ and $A$, we deduce that we can use the results obtained in Sec. 3.1 with the following changes:

(1) Since the mass $m$ does not appear in $A_0$, we take $m = 0$.
(2) The long-range matrix-valued potential $b$ and scalar potential $c$ do not appear in $A$ (see (3.103) and (3.105)), hence we put $b(x) = c(x) = 0$.
(3) The short-range matrix-valued potential $a(x)\Gamma^2$ is replaced by $W(x)$.
(4) The projections $P_\pm^m$ are replaced by $P_\mp$ since we work at the event horizon.

Noting that these changes also entail that $\nu^\pm(\xi) = \mp\xi$ and $d^\pm(x,\xi) = 0$, we obtain the following results. At fixed energy $\lambda = 0$, the modifiers $J_0^\pm$ are defined as FIOs with phases

\[ \varphi^\pm(x,\xi) = x\xi + \frac{1}{2\xi}\int_x^{-\infty}W^2(s)\,ds, \]

and amplitudes$^d$

\[ p^\pm(x,\xi) = (1 - K^\pm(x,\xi))^{-1}P_\mp,\qquad K^\pm(x,\xi) = \mp\frac{1}{2\xi}\Big(-\frac{W^2(x)}{2\xi}\Gamma^1 + W(x)\Big). \tag{3.111} \]

At high energy, the modifiers $J_0^\pm(\lambda)$ are defined as FIOs with phases

\[ \varphi^\pm(x,\xi,\lambda) = x\xi + \frac{1}{2(\xi+\lambda)}\int_x^{-\infty}W^2(s)\,ds, \tag{3.112} \]

and amplitudes

\[ P^\pm(x,\xi,\lambda) = p^\pm(x,\xi+\lambda) + \frac{1}{\lambda^2}P_\pm k^\pm(x), \tag{3.113} \]

where $k^\pm(x) = \mp\frac{i}{4}W'(x)$. Using these definitions and (3.105), we can prove that the symbols $c^\pm(x,\xi,\lambda)$ of the operators $C^\pm(\lambda) = A(\lambda)J_0^\pm(\lambda) - J_0^\pm(\lambda)A_0(\lambda)$ satisfy the estimates

\[ \forall\,\mu,\beta\in\mathbb{N},\quad |\partial_x^\mu\partial_\xi^\beta c^\pm(x,\xi,\lambda)| \le C_{\mu\beta}\,\frac{e^{\alpha x}}{\lambda^2}, \tag{3.114} \]

for all $x\in\mathbb{R}^-$ and $\lambda$ large enough. Finally, as in the proof of Lemma 3.5, the estimates (3.114) are the main ingredients to prove the properties equivalent to (3.14) and (3.9). Precisely we have

Lemma 3.6. For any $\psi\in C_0^\infty(\mathbb{R};\mathbb{C}^4)$ and for $\lambda$ large, the following estimate holds:

\[ \big\|\big(W^\pm(A,A_0,\lambda) - J_0^\pm(\lambda)\big)\psi\big\| = O(\lambda^{-2}). \]

$^d$ In the same way as in the preceding section, we should add some technical cutoff functions which are negligible in the asymptotics.


We now use Lemma 3.6 to compute the asymptotics of $W^\pm(A,A_0,\lambda)\psi$ up to the order $O(\lambda^{-2})$. For any $\psi\in C_0^\infty(\mathbb{R};\mathbb{C}^4)$ and for $\lambda$ large, we have

\[ W^\pm(A,A_0,\lambda)\psi = J_0^\pm(\lambda)\psi + O\Big(\frac{1}{\lambda^2}\Big). \]

Hence, it is enough to compute the asymptotics of $J_0^\pm(\lambda)$ for $\lambda$ large. Using (3.111)–(3.113) and after some calculations, we obtain

\[ J_0^\pm(\lambda)\psi = \Big(1 + \frac{1}{2\lambda}\Big(i\int_x^{-\infty}W^2(s)\,ds \mp W(x)\Big)\Big)P_\mp\psi + O\Big(\frac{1}{\lambda^2}\Big). \tag{3.115} \]

Note that we naturally retrieve the same formulae as in [4]. Eventually, combining (3.110) and (3.115), we obtain the asymptotics of $W^\pm(\lambda)$ for $\lambda$ large.

Proposition 3.2. For any $\psi\in C_0^\infty(\mathbb{R})$,

\[ W^\pm_{(-\infty)}(\lambda)\psi = U\Big(1 + \frac{1}{\lambda}Q^\pm(x)\Big)e^{i\Gamma^1c_0x}P_\mp\psi + O\Big(\frac{1}{\lambda^2}\Big), \tag{3.116} \]

where $U$ is given by (3.101), $Q^\pm(x) = \frac{1}{2}\big(i\int_x^{-\infty}W^2(s)\,ds \mp W(x)\big)$ and $W(x)$ is given by (3.104).

3.3. Proofs of Theorems 3.1 and 3.2

In this last subsection, we use the asymptotics of $W^\pm_{(\pm\infty)}(\lambda)$ obtained in Propositions 3.1 and 3.2 to prove the reconstruction formulae given in Theorem 3.2 and finally prove Theorem 3.1.

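Proof of Theorem 3.2. We only treat the case of the transmission operator $T_R$ and give the proof of (3.1), since the proof of (3.2) corresponding to the transmission operator $T_L$ is similar. Recall that we want to compute the asymptotic expansion when $\lambda\to+\infty$ of

\[ F_l(\lambda) = \langle T_Re^{i\lambda x}\psi,\ e^{i\lambda x}\phi\rangle = \langle W^-_{(+\infty)}(\lambda)\psi,\ W^+_{(-\infty)}(\lambda)\phi\rangle, \]

for $\psi,\phi\in C_0^\infty(\mathbb{R};\mathbb{C}^4)$. Using Propositions 3.1 and 3.2 and the notations therein, we have

\[ F_l(\lambda) = \Big\langle e^{iC^+(x)}\Big(1 + \frac{1}{\lambda}R^-(x)\Big)P_-\psi,\ U\Big(1 + \frac{1}{\lambda}Q^+(x)\Big)e^{i\Gamma^1c_0x}P_-\phi\Big\rangle + O\Big(\frac{1}{\lambda^2}\Big) \]

\[ = \big\langle e^{iC^+(x)}P_-\psi,\ Ue^{i\Gamma^1c_0x}P_-\phi\big\rangle + \frac{1}{\lambda}\Big[\big\langle e^{iC^+(x)}P_-\psi,\ UQ^+e^{i\Gamma^1c_0x}P_-\phi\big\rangle + \big\langle e^{iC^+(x)}R^-P_-\psi,\ Ue^{i\Gamma^1c_0x}P_-\phi\big\rangle\Big] + O\Big(\frac{1}{\lambda^2}\Big). \tag{3.117} \]

We now compute separately the terms of different orders in (3.117).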


Order 0. Since $\Gamma^1P_- = -P_-$, the term of order 0 reads

\[ \big\langle e^{i[C^+(x) - C^-(x) + c_0x]}P_-\psi,\ P_-\phi\big\rangle. \tag{3.118} \]

Moreover, from (3.96) and (3.101), the phase $C^+(x) - C^-(x) + c_0x$ takes the simple form

\[ C^+(x) - C^-(x) + c_0x = -\int_{-\infty}^0[c(s)-c_0]\,ds + c_0x. \tag{3.119} \]
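The simplification of the phase is a one-line computation from the definitions $C^+(x) = \int_0^x c(s)\,ds$ in (3.96) and $C^-(x) = \int_{-\infty}^x[c(s)-c_0]\,ds + c_0x$ in (3.101). It is written out below as a sketch for the reader's convenience; no additional assumption is used beyond the convergence of $\int_{-\infty}^0[c(s)-c_0]\,ds$.

```latex
\begin{align*}
C^+(x) - C^-(x) + c_0 x
  &= \int_0^x c(s)\,ds - \int_{-\infty}^x [c(s)-c_0]\,ds - c_0 x + c_0 x \\
  &= \int_0^x c(s)\,ds - \int_{-\infty}^0 [c(s)-c_0]\,ds - \int_0^x [c(s)-c_0]\,ds \\
  &= -\int_{-\infty}^0 [c(s)-c_0]\,ds + c_0 x .
\end{align*}
```

The $x$-dependence of the order 0 phase is therefore linear with slope $c_0$, which is what allows $c_0$ to be read off by differentiation in the proof of Theorem 3.1.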

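Order 1. Using $\Gamma^1P_- = -P_-$ again, the term of order 1 can be written as

\[ \big\langle e^{i[C^+(x) - C^-(x) + c_0x]}\big(R^- + (Q^+)^*\big)P_-\psi,\ P_-\phi\big\rangle. \]

Since $W^2 = a^2 + b^2$ and $WP_- = e^{2iC^-}(a\Gamma^2 + b\Gamma^0)P_-$ by (2.16), the term $(Q^+)^*P_-$ takes the form

\[ (Q^+)^*P_- = \Big(-\frac{i}{2}\int_x^{-\infty}(a^2+b^2)(s)\,ds - \frac{1}{2}e^{2iC^-}(a\Gamma^2 + b\Gamma^0)\Big)P_-. \tag{3.120} \]

Moreover, from (3.99) the term $R^-$ is

\[ R^- = \frac{i}{2}\int_x^{+\infty}a^2(s)\,ds - \frac{i}{2}\int_0^x(b^2(s)-m^2)\,ds + \frac{i}{2}\int_0^{+\infty}(b(s)-m)^2\,ds - \frac{1}{2}(a\Gamma^2 + b\Gamma^0). \tag{3.121} \]

Hence, adding (3.120) and (3.121), the term of order 1 reads

\[ \Big\langle e^{i[C^+(x) - C^-(x) + c_0x]}\Big(\frac{i}{2}\int_{-\infty}^{+\infty}a^2(s)\,ds + \frac{i}{2}\int_{-\infty}^0b^2(s)\,ds + \frac{i}{2}\int_0^{+\infty}(b(s)-m)^2\,ds + \frac{i}{2}m^2x\Big)P_-\psi,\ P_-\phi\Big\rangle - \Big\langle e^{i[C^+(x) - C^-(x) + c_0x]}\Big(\frac{1}{2}e^{2iC^-}(a\Gamma^2 + b\Gamma^0) + \frac{1}{2}(a\Gamma^2 + b\Gamma^0)\Big)P_-\psi,\ P_-\phi\Big\rangle. \tag{3.122} \]

Finally, using that $e^{i[C^+(x) - C^-(x) + c_0x]}$ is scalar, that $(a\Gamma^2 + b\Gamma^0)P_\pm = P_\mp(a\Gamma^2 + b\Gamma^0)$ by (2.16) and the fact that $\langle P_+\psi, P_-\phi\rangle = 0$, we see that the last term in (3.122) cancels, i.e.

\[ \Big\langle e^{i[C^+(x) - C^-(x) + c_0x]}\Big(\frac{1}{2}e^{2iC^-}(a\Gamma^2 + b\Gamma^0) + \frac{1}{2}(a\Gamma^2 + b\Gamma^0)\Big)P_-\psi,\ P_-\phi\Big\rangle = 0. \]

Hence the term of order 1 is

\[ \Big\langle e^{i[C^+(x) - C^-(x) + c_0x]}\Big(\frac{i}{2}\int_{-\infty}^{+\infty}a^2(s)\,ds + \frac{i}{2}\int_{-\infty}^0b^2(s)\,ds + \frac{i}{2}\int_0^{+\infty}(b(s)-m)^2\,ds + \frac{i}{2}m^2x\Big)P_-\psi,\ P_-\phi\Big\rangle. \tag{3.123} \]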


If we introduce the following functions

\[ \Theta(x) = e^{-i\int_{-\infty}^0[c(s)-c_0]\,ds + ic_0x}, \]

\[ A(x) = \Theta(x)\Big(\int_{-\infty}^{+\infty}a^2(s)\,ds + \int_{-\infty}^0b^2(s)\,ds + \int_0^{+\infty}(b(s)-m)^2\,ds + m^2x\Big), \]

we have proved the reconstruction formula (3.1) and thus Theorem 3.2.

Proof of Theorem 3.1. We show here that the reconstruction formula (3.1) entails the uniqueness of the parameters $M$ and $Q$ under the additional assumption that the charge $q$ of Dirac fields is known, fixed and nonzero. The same result can be shown from the reconstruction formula (3.2) in a similar way. We first compute one of the integrals that appear in (3.1), which will be useful in the later analysis. Using the explicit expressions of $F$, $a_l$ given in (2.2) and (2.15) as well as the definition of the Regge–Wheeler variable $x(r)$ given in (2.6), an easy calculation shows that

\[ \int_{\mathbb{R}}a_l^2(s)\,ds = \Big(l + \frac{1}{2}\Big)^2\frac{1}{r_0}, \tag{3.124} \]

where $r_0$ is the radius of the event horizon.

Now let us consider two transmission operators $T_{l,1}$ and $T_{l,2}$ corresponding, respectively, to parameters $M_j, Q_j, m_j$ $(j = 1, 2)$ and $q_1 = q_2 = q$, where $q$ is supposed to be known and nonzero. In what follows, all the objects corresponding to $T_{l,j}$ with $j = 1, 2$ will be denoted by the usual notations with a lower index $j$. We suppose that $T_{l,1} = T_{l,2}$. In consequence we also have $F_{l,1}(\lambda) = F_{l,2}(\lambda)$. Our goal is to prove that $M_1 = M_2$ and $Q_1 = Q_2$. Using Theorem 3.2 and identifying the terms of the same orders in the reconstruction formula (3.1), we thus get

\[ \Theta_1(x) = \Theta_2(x), \tag{3.125} \]

\[ A_1(x) = A_2(x). \tag{3.126} \]

By (3.3) and a standard continuity argument, (3.125) leads to the equality

\[ -i\int_{-\infty}^0[c_1(s)-c_{0,1}]\,ds + ic_{0,1}x = -i\int_{-\infty}^0[c_2(s)-c_{0,2}]\,ds + ic_{0,2}x + 2ik\pi, \tag{3.127} \]

where $k\in\mathbb{Z}$. If we differentiate (3.127) with respect to $x$, we obtain

\[ c_{0,1} = c_{0,2} := c_0. \tag{3.128} \]

Now by (3.124), (3.126) leads to the equality

\[ \frac{i}{2}\Big(l+\frac{1}{2}\Big)^2\frac{1}{r_{0,1}} + \frac{i}{2}\int_{-\infty}^0b_1^2(s)\,ds + \frac{i}{2}\int_0^{+\infty}(b_1(s)-m_1)^2\,ds + \frac{i}{2}m_1^2x = \frac{i}{2}\Big(l+\frac{1}{2}\Big)^2\frac{1}{r_{0,2}} + \frac{i}{2}\int_{-\infty}^0b_2^2(s)\,ds + \frac{i}{2}\int_0^{+\infty}(b_2(s)-m_2)^2\,ds + \frac{i}{2}m_2^2x. \tag{3.129} \]


If we differentiate (3.129) with respect to $x$, we first get

\[ m_1 = m_2 := m. \tag{3.130} \]

Hence the mass $m$ of Dirac fields is uniquely determined. Moreover, using (3.130), (3.124) and the homogeneity in the parameter $l$, we obtain from (3.129)

\[ r_{0,1} = r_{0,2} := r_0. \tag{3.131} \]

Therefore the radius $r_0$ of the event horizon is also uniquely determined. Now if we combine (3.131) and $c_0 = \frac{qQ}{r_0}$ with (3.128), we get (since $q$ is supposed to be nonzero) $Q_1 = Q_2 := Q$. The charge $Q$ of the black hole is thus uniquely determined. Eventually, since $r_0$ is a zero of the function $F$, we get from (2.2) that

\[ M_1 = M_2 := M = \frac{r_0^2 + Q^2}{2r_0}, \]

and the mass $M$ of the black hole is uniquely determined. This finishes the proof of Theorem 3.1.

4. The Inverse Problem for dS-RN Black Holes (Λ > 0)

In this section, we study the inverse problem in the case $\Lambda > 0$ corresponding to dS-RN black holes. In a first part, we prove the same kind of results as in Sec. 3, that is, we prove that the parameters $M$, $Q$ and $\Lambda$ are uniquely determined by the high energies of the transmission operators $T_L$ or $T_R$. In a second part, we prove by means of a purely stationary method that the parameters $M$, $Q$ and $\Lambda$ can also be uniquely determined from the knowledge of the reflection operators $L$ or $R$ on any interval of energy.

4.1. The inverse problem at high energy

As in Sec. 3, we shall assume here that one of the following functions of $\lambda\in\mathbb{R}$,

\[ F_l(\lambda) = \langle T_Re^{i\lambda x}\psi,\ e^{i\lambda x}\phi\rangle,\qquad G_l(\lambda) = \langle T_Le^{i\lambda x}\psi,\ e^{i\lambda x}\phi\rangle, \]

is known for all large values of $\lambda$, for all $l\in\mathbb{N}$ and for all $\psi,\phi\in\mathcal{H}$ with $\hat\psi,\hat\phi\in C_0^\infty(\mathbb{R};\mathbb{C}^4)$. We emphasize that in this case the construction of the modifiers is simpler than in the previous section due to the decay of the potentials at infinity; the phases of the modifiers constructed later will belong to a good class of oscillating symbols. In particular, we do not need a technical cutoff function $\eta^+$ and a cutoff function $\chi_0$ in order to control the spreading of the wave packets as in Sec. 3, and we can consider test functions $\psi,\phi\in\mathcal{H}$ with $\hat\psi,\hat\phi\in C_0^\infty(\mathbb{R};\mathbb{C}^4)$. We also assume that


the mass m and the charge q of the Dirac fields are known and fixed. Furthermore, the charge q is supposed to be nonzero. Then our main result is

Theorem 4.1. Under the previous assumptions, the parameters M, Q and Λ of the dS-RN black hole are uniquely determined.

This theorem will follow from the following reconstruction formulae obtained on each spin-weighted spherical harmonic.

Theorem 4.2 (Reconstruction Formulae). Let $\psi, \phi \in \mathcal{H}$ such that $\hat\psi, \hat\phi \in C_0^\infty(\mathbb{R}; \mathbb{C}^4)$. Then for λ large, we have

$$F_l(\lambda) = \langle \Theta(x) P_-\psi, P_-\phi\rangle + \frac{1}{\lambda}\langle A(x) P_-\psi, P_-\phi\rangle + O(\lambda^{-2}), \tag{4.1}$$

$$G_l(\lambda) = \langle \Theta(x) P_+\psi, P_+\phi\rangle - \frac{1}{\lambda}\langle A(x) P_+\psi, P_+\phi\rangle + O(\lambda^{-2}), \tag{4.2}$$

where Θ(x) and A(x) are multiplication operators given by

$$\Theta(x) = e^{-i\beta - i(c_+ - c_0)x}, \qquad A(x) = \frac{i}{2}\left(\int_{-\infty}^{+\infty} \big(a_l^2(s) + b^2(s)\big)\,ds\right)\Theta(x), \tag{4.3}$$

and a constant β given by

$$\beta = \int_{-\infty}^{0} [c(s) - c_0]\,ds + \int_{0}^{+\infty} [c(s) - c_+]\,ds.$$

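We shall prove Theorem 4.2 using the same global strategy as in the proof of Theorem 3.2. From (2.30), (2.31), (2.33) and the fact that $e^{i\lambda x}$ corresponds to a translation by λ in momentum space, we express $F_l(\lambda)$ and $G_l(\lambda)$ as follows:

$$F_l(\lambda) = \langle W^-_{(+\infty)}(\lambda)\psi, W^+_{(-\infty)}(\lambda)\phi\rangle, \tag{4.4}$$

$$G_l(\lambda) = \langle W^-_{(-\infty)}(\lambda)\psi, W^+_{(+\infty)}(\lambda)\phi\rangle, \tag{4.5}$$

with

$$W^\pm_{(-\infty)}(\lambda) = e^{-i\lambda x} W^\pm_{(-\infty)} e^{i\lambda x} = \operatorname*{s-lim}_{t\to\pm\infty} e^{itH(\lambda)} e^{-itH_0(\lambda)} P_\mp, \tag{4.6}$$

$$W^\pm_{(+\infty)}(\lambda) = e^{-i\lambda x} W^\pm_{(+\infty)} e^{i\lambda x} = \operatorname*{s-lim}_{t\to\pm\infty} e^{itH(\lambda)} e^{-itH_+(\lambda)} P_\pm, \tag{4.7}$$

and

$$H(\lambda) = \Gamma^1(D_x + \lambda) + a(x)\Gamma^2 + b(x)\Gamma^0 + c(x), \qquad H_0(\lambda) = \Gamma^1(D_x + \lambda) + c_0, \qquad H_+(\lambda) = \Gamma^1(D_x + \lambda) + c_+.$$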

In consequence, it is enough to obtain an asymptotic expansion of the λ-shifted wave operators $W^\pm_{(\pm\infty)}(\lambda)$ in order to prove the reconstruction formulae (4.1) and (4.2). Note first that the λ-shifted wave operators $W^\pm_{(-\infty)}(\lambda)$ given by (4.6) are exactly the same as in the case Λ = 0 studied in Sec. 3.2. For completeness we recall here


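the asymptotic expansion of $W^\pm_{(-\infty)}(\lambda)$ obtained in Proposition 3.2. For any $\psi \in \mathcal{H}$ with $\hat\psi \in C_0^\infty(\mathbb{R}; \mathbb{C}^4)$, we have

$$W^\pm_{(-\infty)}(\lambda)\psi = U\Big(1 + \frac{1}{\lambda} Q^\pm(x)\Big) e^{i\Gamma^1 c_0 x} P_\mp \psi + O\Big(\frac{1}{\lambda^2}\Big), \tag{4.8}$$

where

$$U = e^{-i\Gamma^1 C^-(x)}, \qquad C^-(x) = \int_{-\infty}^{x} [c(s) - c_0]\,ds + c_0 x, \tag{4.9}$$

$$Q^\pm(x) = \frac{1}{2}\Big(i\int_{-\infty}^{x} W^2(s)\,ds \mp W(x)\Big),$$

$$W(x) = e^{i\Gamma^1 C^-(x)}\big(a(x)\Gamma^2 + b(x)\Gamma^0\big)e^{-i\Gamma^1 C^-(x)}. \tag{4.10}$$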

Note second that the λ-shifted wave operators $W^\pm_{(+\infty)}(\lambda)$ given by (4.7) are very similar to (4.6), the constant $c_0$ being replaced by $c_+$ and the projections $P_\mp$ being replaced by $P_\pm$ since we now work at the cosmological horizon. Hence they can be studied in exactly the same way as in Sec. 3.2. Since there are slight modifications in some formulae, we recall here the procedure but omit the proofs. Using the unitary transform (4.9), we simplify the wave operators $W^\pm_{(+\infty)}$ as follows:

$$W^\pm_{(+\infty)} = U \operatorname*{s-lim}_{t\to\pm\infty} e^{itA} e^{-itA_0}\, e^{itA_0} U^* e^{-itH_+} P_\pm, \tag{4.11}$$

where we have used again the notations $A_0 = \Gamma^1 D_x$ and $A = U^* H U = \Gamma^1 D_x + W(x)$ from (3.102) and (3.103) with the potential W given by (4.10). We also recall that by (2.16) this new potential W(x) satisfies the properties

$$\Gamma^1 W + W \Gamma^1 = 0, \qquad W^2 = a^2 + b^2, \tag{4.12}$$

as well as the global estimate

$$\exists\, \alpha > 0, \qquad \|W(x)\| = O(e^{-\alpha|x|}), \qquad \forall\, x \in \mathbb{R}. \tag{4.13}$$

The potential W is thus very short-range both at the event horizon and at the cosmological horizon. Now an easy calculation shows that (to be compared with (3.107) and its proof)

$$\operatorname*{s-lim}_{t\to\pm\infty} e^{itA_0} U^* e^{-itH_+} P_\pm = e^{i\Gamma^1 \beta} e^{i\Gamma^1 c_+ x} P_\pm, \tag{4.14}$$

where the constant β is given by

$$\beta = \int_{-\infty}^{0} [c(s) - c_0]\,ds + \int_{0}^{+\infty} [c(s) - c_+]\,ds. \tag{4.15}$$
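One ingredient of this calculation can be checked directly: the phase $C^-(x)$ from (4.9) differs from $c_+ x$ by exactly the constant β in the limit x → +∞. The short derivation below is only a sketch of that single step, assuming that $c(s) - c_+$ is integrable near +∞ (which the exponential decay of the potentials suggests); it does not reproduce the full limit in (4.14).

```latex
\begin{align*}
C^-(x) - c_+ x
  &= \int_{-\infty}^{x} [c(s) - c_0]\,ds + (c_0 - c_+)\,x \\
  &= \int_{-\infty}^{0} [c(s) - c_0]\,ds
     + \int_{0}^{x} [c(s) - c_0]\,ds
     + \int_{0}^{x} (c_0 - c_+)\,ds \\
  &= \int_{-\infty}^{0} [c(s) - c_0]\,ds
     + \int_{0}^{x} [c(s) - c_+]\,ds
  \;\xrightarrow[x\to+\infty]{}\; \beta .
\end{align*}
```

This is why the same constant β appears both in the phase of (4.14) and in the definition of Θ(x) in (4.3).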

Furthermore, it is immediate from (4.13) that the wave operators $W^\pm(A, A_0) = \operatorname*{s-lim}_{t\to\pm\infty} e^{itA} e^{-itA_0}$ exist on $\mathcal{H}$. Hence we conclude by the chain rule that $W^\pm_{(+\infty)}$


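take the nice form (to be compared to the expressions (3.110) obtained for $W^\pm_{(-\infty)}$)

$$W^\pm_{(+\infty)} = U\, W^\pm(A, A_0)\, e^{i\Gamma^1 \beta} e^{i\Gamma^1 c_+ x} P_\pm. \tag{4.16}$$

Since U and $e^{i\Gamma^1 \beta} e^{i\Gamma^1 c_+ x}$ commute with $e^{i\lambda x}$, we finally get the following expression for $W^\pm_{(+\infty)}(\lambda)$:

$$W^\pm_{(+\infty)}(\lambda) = U\, W^\pm(A, A_0, \lambda)\, e^{i\Gamma^1 \beta} e^{i\Gamma^1 c_+ x} P_\pm,$$

where $W^\pm(A, A_0, \lambda) = e^{-i\lambda x} W^\pm(A, A_0) e^{i\lambda x}$. Clearly it is enough to know the asymptotics of $W^\pm(A, A_0, \lambda) P_\pm$ when λ → +∞ in order to get the asymptotics of $W^\pm_{(+\infty)}(\lambda)$. In fact, the calculations are exactly the same as those done in Sec. 3.2 (it suffices to replace $P_\mp$ by $P_\pm$ in these calculations) or in [4]. Hence we only give the final result without more details. For any $\psi \in \mathcal{H}$ with $\hat\psi \in C_0^\infty(\mathbb{R}; \mathbb{C}^4)$, we finally obtain

$$W^\pm_{(+\infty)}(\lambda)\psi = U\Big(1 + \frac{1}{\lambda}\tilde{Q}^\pm(x)\Big) e^{i\Gamma^1 \beta} e^{i\Gamma^1 c_+ x} P_\pm \psi + O\Big(\frac{1}{\lambda^2}\Big), \tag{4.17}$$

where U is given by (4.9), $\tilde{Q}^\pm(x) = \frac{1}{2}\big(i\int_{x}^{+\infty} W^2(s)\,ds \pm W(x)\big)$ and W is given by (4.10).

Proof of Theorem 4.2. We now use the asymptotic expansions (4.8) and (4.17) to prove the reconstruction formulae (4.1) and (4.2). Since the proofs are analogous, we only treat (4.1). Using the previous notations we clearly have

$$F_l(\lambda) = \Big\langle U\Big(1 + \frac{1}{\lambda}\tilde{Q}^-(x)\Big) e^{i\Gamma^1 \beta} e^{i\Gamma^1 c_+ x} P_-\psi,\; U\Big(1 + \frac{1}{\lambda} Q^+(x)\Big) e^{i\Gamma^1 c_0 x} P_-\phi\Big\rangle + O\Big(\frac{1}{\lambda^2}\Big). \tag{4.18}$$

Since U is unitary and since $\Gamma^1 P_- = -P_-$, we reexpress (4.18) as

$$F_l(\lambda) = \langle e^{-i\beta - i(c_+ - c_0)x} P_-\psi, P_-\phi\rangle + \frac{1}{\lambda}\Big\langle e^{-i\beta - i(c_+ - c_0)x}\big(\tilde{Q}^-(x) + (Q^+)^*(x)\big) P_-\psi, P_-\phi\Big\rangle + O\Big(\frac{1}{\lambda^2}\Big). \tag{4.19}$$

From the explicit expressions of $Q^+$ and $\tilde{Q}^-$, (4.19) becomes

$$F_l(\lambda) = \langle e^{-i\beta - i(c_+ - c_0)x} P_-\psi, P_-\phi\rangle + \frac{1}{\lambda}\Big\langle e^{-i\beta - i(c_+ - c_0)x}\Big(\frac{i}{2}\int_{-\infty}^{+\infty} W^2(s)\,ds - W(x)\Big) P_-\psi, P_-\phi\Big\rangle + O\Big(\frac{1}{\lambda^2}\Big). \tag{4.20}$$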


Finally observe that $W(x) P_- = P_+ W(x)$ by (2.16) and that $\langle P_+\psi, P_-\phi\rangle = 0$. Hence we obtain for (4.20)

$$F_l(\lambda) = \langle e^{-i\beta - i(c_+ - c_0)x} P_-\psi, P_-\phi\rangle + \Big\langle \frac{i}{2\lambda}\Big(\int_{-\infty}^{+\infty} W^2(s)\,ds\Big) e^{-i\beta - i(c_+ - c_0)x} P_-\psi, P_-\phi\Big\rangle + O\Big(\frac{1}{\lambda^2}\Big). \tag{4.21}$$

Denoting

$$\Theta(x) = e^{-i\beta - i(c_+ - c_0)x}, \qquad A(x) = \frac{i}{2}\Big(\int_{-\infty}^{+\infty} W^2(s)\,ds\Big)\Theta(x) = \frac{i}{2}\Big(\int_{-\infty}^{+\infty} \big(a_l^2(s) + b^2(s)\big)\,ds\Big)\Theta(x),$$

we have proved the reconstruction formula (4.1). This finishes the proof of Theorem 4.2.

Proof of Theorem 4.1. We prove here that the parameters M, Q and Λ are uniquely determined from the knowledge of the high energies of the transmission operator $T_R$. Note that the proof with the high energies of $T_L$ is the same. Consider two transmission operators $T_{R,1}$ and $T_{R,2}$ corresponding to parameters $M_j, Q_j, \Lambda_j$ with j = 1, 2, where moreover m and q ≠ 0 are supposed to be known and fixed. In what follows, we shall denote all the objects associated to $T_{R,j}$ by the usual notations with a lower index j. We assume that $T_{R,1} = T_{R,2}$. From the definition of $F_l(\lambda)$ it follows then that $F_{l,1}(\lambda) = F_{l,2}(\lambda)$. We now identify the terms of the same order in the asymptotic expansion (4.1). Since the admissible ψ, φ are dense in $\mathcal{H}$, we get

$$\Theta_1(x) = \Theta_2(x), \qquad \forall\, x \in \mathbb{R}, \tag{4.22}$$

$$A_1(x) = A_2(x), \qquad \forall\, x \in \mathbb{R}. \tag{4.23}$$

Let us analyze the term of order 0 first. From (4.22) and (4.3), we have

$$-i\beta_1 - i(c_{+,1} - c_{0,1})x = -i\beta_2 - i(c_{+,2} - c_{0,2})x + 2k\pi, \qquad \forall\, x \in \mathbb{R}, \tag{4.24}$$

where k ∈ Z. If we differentiate (4.24) with respect to x, we thus obtain

$$c_{0,1} - c_{+,1} = c_{0,2} - c_{+,2}. \tag{4.25}$$

Hence using (4.25) and (2.29), we see that the quantity

$$X = c_0 - c_+ = qQ\,\frac{r_+ - r_0}{r_0 r_+} \tag{4.26}$$

is uniquely determined.


We now analyze the term of order $O(\lambda^{-1})$. From (4.23), (4.3) and (4.22) again, we have

$$\int_{-\infty}^{+\infty} W_1^2(s)\,ds = \int_{-\infty}^{+\infty} W_2^2(s)\,ds. \tag{4.27}$$

Using that $W^2(x) = a_l^2(x) + b^2(x)$ and the expressions of the potentials $a_l$ and b given by (2.15) and the definition of the Regge–Wheeler variable (2.6), we can compute explicitly the integrals that appear in (4.27). In fact we have

$$\int_{-\infty}^{+\infty} W^2(s)\,ds = \Big(l + \frac{1}{2}\Big)^2\Big(\frac{1}{r_0} - \frac{1}{r_+}\Big) + m^2 (r_+ - r_0). \tag{4.28}$$

By homogeneity in l and since m is considered as known and fixed, we deduce from (4.27) and (4.28) that

$$\frac{r_{+,1} - r_{0,1}}{r_{0,1}\, r_{+,1}} = \frac{r_{+,2} - r_{0,2}}{r_{0,2}\, r_{+,2}}, \tag{4.29}$$

$$r_{+,1} - r_{0,1} = r_{+,2} - r_{0,2}. \tag{4.30}$$
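The computation behind (4.28) is a change of variables from the Regge–Wheeler coordinate x to the radial coordinate r. The sketch below assumes that (2.6) gives $dx/dr = 1/F(r)$, with x → −∞ at $r_0$ and x → +∞ at $r_+$, and that (2.15) gives $W^2 = \big((l + \tfrac12)^2/r^2 + m^2\big)F(r)$, the form quoted in the proof of Theorem 4.4; neither (2.6) nor (2.15) is reproduced in this section, so both are stated here as assumptions.

```latex
\begin{align*}
\int_{-\infty}^{+\infty} W^2(s)\,ds
  &= \int_{r_0}^{r_+} \Big[\Big(l+\tfrac12\Big)^2 \frac{F(r)}{r^2} + m^2 F(r)\Big]\frac{dr}{F(r)} \\
  &= \Big(l+\tfrac12\Big)^2 \int_{r_0}^{r_+}\frac{dr}{r^2} + m^2\int_{r_0}^{r_+} dr \\
  &= \Big(l+\tfrac12\Big)^2\Big(\frac{1}{r_0}-\frac{1}{r_+}\Big) + m^2\,(r_+ - r_0).
\end{align*}
```

Since the right-hand side is affine in $(l + \tfrac12)^2$, equality for all l separates the two coefficients, which is how (4.29) and (4.30) arise.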

Hence the quantities

$$Y = \frac{r_+ - r_0}{r_0 r_+}, \qquad Z = r_+ - r_0, \tag{4.31}$$

are uniquely determined. We can now show the uniqueness of the parameters M, Q and Λ as follows. We first note the following relation

$$X = qQY. \tag{4.32}$$

Since X, Y are uniquely determined and q is supposed to be known and fixed, we deduce from (4.32) that Q is uniquely determined, i.e. $Q_1 = Q_2 = Q$. Moreover, from (4.31) we deduce that $r_+ - r_0$ and $r_0 r_+$ are uniquely determined. Hence so are $r_0$ and $r_+$, as the unique solutions of the corresponding polynomial of second order. Now recall that $r_0$ and $r_+$ are roots of F(r) = 0. The equations $F(r_0) = 0$ and $F(r_+) = 0$ can be written using (2.2) as the linear system

$$\begin{pmatrix} \dfrac{2}{r_+} & \dfrac{r_+^2}{3} \\[2mm] \dfrac{2}{r_0} & \dfrac{r_0^2}{3} \end{pmatrix} \begin{pmatrix} M \\ \Lambda \end{pmatrix} = \begin{pmatrix} 1 + \dfrac{Q^2}{r_+^2} \\[2mm] 1 + \dfrac{Q^2}{r_0^2} \end{pmatrix}. \tag{4.33}$$

The determinant of (4.33) is $\frac{2}{3}\,\frac{r_0^3 - r_+^3}{r_0 r_+}$ and is clearly nonzero. Hence (M, Λ) are the unique solutions of the system (4.33) whose coefficients depend only on $r_0$, $r_+$, Q, which are uniquely determined by the previous discussion. We thus conclude that M and Λ are also uniquely determined, i.e. $M_1 = M_2$ and $\Lambda_1 = \Lambda_2$, and the proof of Theorem 4.1 is finished.
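The last two steps of this proof are purely algebraic and can be sketched numerically. The Python snippet below is an illustration only: it assumes the standard dS-RN form F(r) = 1 − 2M/r + Q²/r² − Λr²/3 for the function in (2.2) (which is what the system (4.33) encodes), and the function names `horizons_from_YZ`, `mass_and_lambda` and `F` are hypothetical helpers introduced here, not notation from the paper.

```python
import math


def horizons_from_YZ(Y, Z):
    """Recover (r0, r_plus) from Y = (r+ - r0)/(r0 r+) and Z = r+ - r0, cf. (4.31)."""
    P = Z / Y  # product r0 * r+
    # r+ = r0 + Z and r0 * r+ = P, so r0 is the positive root of r0^2 + Z r0 - P = 0.
    r0 = (-Z + math.sqrt(Z * Z + 4.0 * P)) / 2.0
    return r0, r0 + Z


def mass_and_lambda(r0, rp, Q):
    """Solve the 2x2 linear system (4.33) for (M, Lambda) by Cramer's rule."""
    a11, a12, b1 = 2.0 / rp, rp ** 2 / 3.0, 1.0 + Q ** 2 / rp ** 2
    a21, a22, b2 = 2.0 / r0, r0 ** 2 / 3.0, 1.0 + Q ** 2 / r0 ** 2
    det = a11 * a22 - a12 * a21  # equals (2/3)(r0^3 - r+^3)/(r0 r+)
    M = (b1 * a22 - a12 * b2) / det
    Lam = (a11 * b2 - a21 * b1) / det
    return M, Lam


def F(r, M, Q, Lam):
    """Assumed dS-RN metric function; its zeros are the horizons."""
    return 1.0 - 2.0 * M / r + Q ** 2 / r ** 2 - Lam * r ** 2 / 3.0


# Toy data: pretend Y, Z and Q have been read off from the scattering data.
r0, rp = horizons_from_YZ(Y=0.8, Z=4.0)
M, Lam = mass_and_lambda(r0, rp, Q=0.5)
```

With these toy values, `F(r0, M, 0.5, Lam)` and `F(rp, M, 0.5, Lam)` both vanish up to rounding, which is the consistency condition that (4.33) expresses.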


4.2. The inverse problem on an interval of energy

In this last subsection, we solve the inverse problem when the reflection operators L or R are supposed to be known on a (possibly small) interval of energy. We follow the usual stationary approach of inverse scattering on the line and refer to [8, 6] for a presentation of the general method in the case of one-dimensional Schrödinger operators and to [1] for an application to massless Dirac operators (see also [12, 15] for massive Dirac operators). We first determine a stationary representation of the scattering operator S expressed in terms of the usual transmission and reflection coefficients (here matrices). We do this by a series of simplifications of our model, which finally reduces to the exact framework studied in [1]. We then use the exponential decay of the potentials to show that the reflection coefficients R and L can be extended analytically to a small strip around the real axis. In consequence, by analytic continuation, the reflection coefficients R or L are uniquely determined on R if they are known on any interval of energy. Finally, we use the results of [1], which rely on a classical Marchenko method, to prove that the parameters M, Q and Λ are uniquely determined by the knowledge of R(ξ) or L(ξ) for all energies.

Recall that the scattering operator S is defined by $S = (W^+)^* W^-$, where the global wave operators $W^\pm$ are given when Λ > 0 by

$$W^\pm = W^\pm_{(-\infty)} + W^\pm_{(+\infty)}, \tag{4.34}$$

with

$$W^\pm_{(-\infty)} = \operatorname*{s-lim}_{t\to\pm\infty} e^{itH} e^{-itH_0} P_\mp, \qquad W^\pm_{(+\infty)} = \operatorname*{s-lim}_{t\to\pm\infty} e^{itH} e^{-itH_+} P_\pm. \tag{4.35}$$

We now use the unitary transform U introduced in (3.101) and the corresponding simplified expressions of $W^\pm_{(\pm\infty)}$ obtained in (3.110) and (4.16) to express (4.34) as

$$W^\pm = U\, W^\pm(A, A_0)\big(e^{i\Gamma^1 c_0 x} P_\mp + e^{i\Gamma^1 \beta} e^{i\Gamma^1 c_+ x} P_\pm\big). \tag{4.36}$$

Here we have used the notations introduced in Secs. 3.2 and 4.1. Let us denote by $G_\pm$ the operators $e^{i\Gamma^1 c_0 x} P_\mp + e^{i\Gamma^1 \beta} e^{i\Gamma^1 c_+ x} P_\pm$ appearing in (4.36) and by $S(A, A_0)$ the scattering operator associated to the operators A and $A_0$, i.e. $S(A, A_0) = (W^+(A, A_0))^* W^-(A, A_0)$. Using the unitarity of U we thus immediately get the following expression for the scattering operator S:

$$S = G_+^*\, S(A, A_0)\, G_-. \tag{4.37}$$

The pair of operators $(A, A_0)$ acting on $\mathcal{H}$ turns out to fit the framework studied in [1]. Recall that they are given by $A_0 = \Gamma^1 D_x$ and $A = A_0 + W(x)$ where the


potential $W(x) = e^{i\Gamma^1 C^-(x)}\big(a(x)\Gamma^2 + b(x)\Gamma^0\big)e^{-i\Gamma^1 C^-(x)}$ is the 4 × 4 matrix-valued function

$$W(x) = \begin{pmatrix} 0 & k(x) \\ k^*(x) & 0 \end{pmatrix}, \qquad k(x) = e^{2iC^-(x)} \begin{pmatrix} -ib(x) & a(x) \\ -a(x) & ib(x) \end{pmatrix}. \tag{4.38}$$

Here $k^*(x)$ denotes the conjugate transpose of the matrix-valued function k(x). Moreover W satisfies (4.12) and (4.13) and thus its entries belong to $L^1(\mathbb{R})$. This is precisely the kind of operator studied in [1]. Note however that our potential W is better than $L^1(\mathbb{R})$ since it is exponentially decreasing at both ends x → ±∞. This will be used hereafter. As a consequence, we can use the following stationary representation of $S(A, A_0)$ obtained in [1]. Let us introduce the unitary transform $\mathcal{F}$ on $\mathcal{H}$ defined by

$$\mathcal{F}\psi(\xi) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{-i\Gamma^1 x\xi}\, \psi(x)\,dx, \tag{4.39}$$

then we have (see [1, p. 143])

$$S(A, A_0) = \mathcal{F}^* S_0(\xi) \mathcal{F}, \tag{4.40}$$

where the scattering matrix $S_0(\xi)$ takes the form

$$S_0(\xi) = \begin{pmatrix} T_L(\xi) & R(\xi) \\ L(\xi) & T_R(\xi) \end{pmatrix}. \tag{4.41}$$

Here $T_L(\xi)$ and $T_R(\xi)$ are 2 × 2 matrix-valued functions which correspond to the usual transmission coefficients of S whereas L(ξ) and R(ξ) are 2 × 2 matrix-valued functions which correspond to the usual reflection coefficients of S. We refer to [1, Secs. 2 and 3] for the definition and the construction of the scattering matrix $S_0(\xi)$. Hence (4.37) becomes

$$S = (\mathcal{F} G_+)^* S_0(\xi)\, \mathcal{F} G_-. \tag{4.42}$$

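We now finish our factorization of the scattering operator S as follows. Using 2 × 2 block matrix notations, we note that

$$G_+ = \begin{pmatrix} e^{i\beta} & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} e^{ic_+ x} & 0 \\ 0 & e^{-ic_0 x} \end{pmatrix}, \qquad G_- = \begin{pmatrix} 1 & 0 \\ 0 & e^{-i\beta} \end{pmatrix} \begin{pmatrix} e^{ic_0 x} & 0 \\ 0 & e^{-ic_+ x} \end{pmatrix},$$

and we define two unitary transforms $\mathcal{F}_\pm$ on $\mathcal{H}$ by

$$\mathcal{F}_+\psi(\xi) = \mathcal{F}\left[\begin{pmatrix} e^{ic_+ x} & 0 \\ 0 & e^{-ic_0 x} \end{pmatrix}\psi\right](\xi) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \begin{pmatrix} e^{-ix\xi + ic_+ x} & 0 \\ 0 & e^{ix\xi - ic_0 x} \end{pmatrix} \psi(x)\,dx, \tag{4.43}$$

and

$$\mathcal{F}_-\psi(\xi) = \mathcal{F}\left[\begin{pmatrix} e^{ic_0 x} & 0 \\ 0 & e^{-ic_+ x} \end{pmatrix}\psi\right](\xi) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \begin{pmatrix} e^{-ix\xi + ic_0 x} & 0 \\ 0 & e^{ix\xi - ic_+ x} \end{pmatrix} \psi(x)\,dx. \tag{4.44}$$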


Then we have

$$\mathcal{F} G_+ = \begin{pmatrix} e^{i\beta} & 0 \\ 0 & 1 \end{pmatrix} \mathcal{F}_+, \qquad \mathcal{F} G_- = \begin{pmatrix} 1 & 0 \\ 0 & e^{-i\beta} \end{pmatrix} \mathcal{F}_-. \tag{4.45}$$

Hence we conclude from (4.45) that the scattering operator (4.42) factorizes as

$$S = \mathcal{F}_+^* \begin{pmatrix} e^{-i\beta} T_L(\xi) & e^{-2i\beta} R(\xi) \\ L(\xi) & e^{-i\beta} T_R(\xi) \end{pmatrix} \mathcal{F}_-. \tag{4.46}$$

We summarize this result as a proposition.

Proposition 4.1. The scattering operator S has the following stationary representation. If $\mathcal{F}_\pm$ are the unitary transforms defined in (4.43) and (4.44), then

$$S = \mathcal{F}_+^* S(\xi) \mathcal{F}_-, \tag{4.47}$$

where the 4 × 4 scattering matrix S(ξ) is given by

$$S(\xi) = \begin{pmatrix} e^{-i\beta} T_L(\xi) & e^{-2i\beta} R(\xi) \\ L(\xi) & e^{-i\beta} T_R(\xi) \end{pmatrix}, \tag{4.48}$$

and the quantities $T_L$, $T_R$ and L, R are the 2 × 2 matrices that correspond to the transmission and reflection matrices of $S(A, A_0)$ respectively and are obtained in [1, Secs. 2 and 3].

Remark 4.1. As the notations suggest, the diagonal elements of the scattering matrix S(ξ) given in (4.48) are simply the stationary representations of the transmission operators $T_L$ and $T_R$ introduced in Sec. 2, (2.33). The anti-diagonal elements of S(ξ) are in turn the stationary representations of the reflection operators L and R in (2.34).

Remark 4.2. The unitary operators $\mathcal{F}_\pm$ appearing in the stationary representation (4.47) of S are natural in the following sense. Let us define the two selfadjoint operators on $\mathcal{H}$

$$H^+ = (\Gamma^1 D_x + c_+) P_+ + (\Gamma^1 D_x + c_0) P_-, \qquad H^- = (\Gamma^1 D_x + c_0) P_+ + (\Gamma^1 D_x + c_+) P_-.$$

Then it is clear from (4.34) and (4.35) that the global wave operators can be written in a classical form as

$$W^\pm = \operatorname*{s-lim}_{t\to\pm\infty} e^{itH} e^{-itH^\pm}.$$

Now it is an easy calculation to show that the unitary transforms $\mathcal{F}_\pm$ introduced in (4.43) and (4.44) are precisely the unitary transforms which diagonalize the operators $H^\pm$, i.e. $H^\pm = \mathcal{F}_\pm^* M_\xi \mathcal{F}_\pm$, where $M_\xi$ denotes the multiplication operator by ξ. We conclude that (4.47) together with (4.48) is the expected stationary representation of the scattering operator S.
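The way the phases in (4.48) arise from (4.42) and (4.45) is a two-sided multiplication of $S_0(\xi)$ by diagonal phase matrices. The Python sketch below illustrates this bookkeeping in a toy setting where each 2 × 2 block is replaced by a single complex number (an assumption made only to keep the example short; the phases are scalars, so the block structure is unaffected). The helper names `matmul2` and `dressed_scattering_matrix` and the numerical values are hypothetical.

```python
import cmath


def matmul2(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dressed_scattering_matrix(S0, beta):
    """Return diag(e^{-i beta}, 1) * S0 * diag(1, e^{-i beta}), cf. (4.42) and (4.45)."""
    p = cmath.exp(-1j * beta)
    left = [[p, 0], [0, 1]]    # adjoint of diag(e^{i beta}, 1)
    right = [[1, 0], [0, p]]
    return matmul2(matmul2(left, S0), right)


# Toy scalar stand-ins for the blocks T_L, R, L, T_R of S_0(xi).
TL, R, L, TR = 0.6 + 0.1j, 0.3 - 0.2j, -0.25 + 0.4j, 0.5 + 0.3j
beta = 0.7
S = dressed_scattering_matrix([[TL, R], [L, TR]], beta)
```

The entries of `S` are then `exp(-i*beta)*TL`, `exp(-2i*beta)*R`, `L` and `exp(-i*beta)*TR`, matching the pattern of phases in (4.48): only the lower-left reflection entry is left undressed.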


In the sequel, we shall use the explicit link between our scattering matrix S(ξ) and the scattering matrix $S_0(\xi)$ thoroughly studied in [1] in order to solve the inverse problem. Let us first briefly summarize some of the main results obtained in [1]. Under the assumption $W \in L^1(\mathbb{R})$, the scattering matrix $S_0(\xi)$ is continuous for ξ ∈ R and tends to $I_4$ when ξ → ±∞. It is also unitary for each ξ ∈ R (see [1, Theorem 3.1] for a proof of these statements and for other properties of $S_0(\xi)$). Moreover, the following partial characterization result holds:

Theorem 4.3 ([1, Theorem 6.3]). Assume that the reflection operators R(ξ) and L(ξ) are 2 × 2 matrix-valued functions satisfying

$$\sup_{\xi\in\mathbb{R}} \|R(\xi)\| < 1, \qquad \sup_{\xi\in\mathbb{R}} \|L(\xi)\| < 1, \qquad \hat{R}(\alpha) \in L^1(\mathbb{R}), \qquad \hat{L}(\alpha) \in L^1(\mathbb{R}), \tag{4.49}$$

$$\int_{0}^{+\infty} \alpha\, \|\hat{R}(\alpha)\|^2\,d\alpha < \infty, \qquad \int_{-\infty}^{0} \alpha\, \|\hat{L}(\alpha)\|^2\,d\alpha < \infty, \tag{4.50}$$

where $\hat{R}(\alpha)$ and $\hat{L}(\alpha)$ denote the usual Fourier transforms of R(ξ) and L(ξ) and $\|\cdot\|$ is the Euclidean norm of a given matrix. Then the matrix-valued function $k(x) \in L^1(\mathbb{R})$ in (4.38) (and thus the potential W(x)) can be uniquely recovered from the knowledge of R(ξ) and L(ξ) for all ξ ∈ R.

We make several comments on this result and how we can apply it to our model:

• The proof of the above theorem uses a classical Marchenko method. For instance, the matrix-valued function k(x) can be obtained after solving the following Marchenko integral equations for α > 0 (see [1, Eqs. (6.9) and (6.11)])

$$B_1(x, \alpha) = -\hat{R}(\alpha + 2x) + \int_{0}^{+\infty}\!\!\int_{0}^{+\infty} B_1(x, \gamma)\, \hat{R}(\delta + \gamma + 2x)^* \hat{R}(\alpha + \delta + 2x)\,d\gamma\,d\delta, \tag{4.51}$$

$$B_2(x, \alpha) = -\hat{L}(\alpha - 2x)^* + \int_{0}^{+\infty}\!\!\int_{0}^{+\infty} B_2(x, \gamma)\, \hat{L}(\delta + \gamma - 2x)\, \hat{L}(\alpha + \delta - 2x)^*\,d\gamma\,d\delta. \tag{4.52}$$

Under the assumption (4.49), the integral equations (4.51) and (4.52) are uniquely solvable in $L^1(\mathbb{R}^+)$ ([1, Theorem 6.2]). Moreover, under the additional assumption (4.50), the matrix-valued function k(x) defined using the boundary values of $B_1$ and $B_2$ by the formulae (see [1, Eq. (4.19)])

$$k(x) = 2iB_1(x, 0^+), \quad \forall\, x > 0, \qquad k(x) = -2iB_2(x, 0^+), \quad \forall\, x < 0,$$

can be shown to be in $L^1(\mathbb{R})$ and thus corresponds to the potential we are looking for.

• If the potential W belongs to $L^1(\mathbb{R})$, then the condition (4.49) is automatically satisfied (see [1, Theorem 4.2 and Eq. (6.17)]). Although this condition is the natural one under which one could expect to reconstruct the potential k in the class $L^1$, the authors of [1] had to add the extra assumption (4.50) (which must then be checked) in order to prove their result. We refer to [1, p. 154] for more


details on this point. In our case, we shall prove the condition (4.50) as follows. Using the exponential decay of W, we are first able to show that the reflection coefficients R(ξ) and L(ξ) (in fact the whole scattering matrix $S_0(\xi)$) are analytic on a small strip around the real axis. Moreover the functions R(· + iη) and L(· + iη) can be shown to belong to $L^2(\mathbb{R})$ uniformly for |η| small enough. It follows then from standard results on the Fourier transform (see, for instance, [26, Theorem IX.13]) that $\hat{R}(\alpha)$ and $\hat{L}(\alpha)$ satisfy

$$e^{\epsilon|\alpha|} \hat{R}(\alpha) \in L^2(\mathbb{R}), \qquad e^{\epsilon|\alpha|} \hat{L}(\alpha) \in L^2(\mathbb{R}), \qquad \forall\, \epsilon \text{ small enough},$$

from which (4.50) follows immediately.

• From (4.51) and (4.52) and the reconstruction procedure explained above, we see that the knowledge of R(ξ) and L(ξ) for all ξ ∈ R is used to recover the potential k(x) for all x ∈ R. In fact it is enough to know either R(ξ) or L(ξ) for all ξ ∈ R, since then the whole scattering matrix $S_0(\xi)$ can be uniquely recovered. The procedure is explained in [1, p. 147, Eqs. (5.3)–(5.5)] and we reproduce it for completeness. Assume for instance that R(ξ) is known for all ξ ∈ R. Then the transmission coefficients $T_L(\xi)$ and $T_R(\xi)$ can be obtained by performing the factorizations

$$T_L(\xi) T_L(\xi)^* = I_2 - R(\xi) R(\xi)^*, \qquad T_R(\xi)^* T_R(\xi) = I_2 - R(\xi)^* R(\xi), \qquad \xi \in \mathbb{R}. \tag{4.53}$$

Under the assumption $k \in L^1(\mathbb{R})$, it was shown in [1] that the above factorization problems are in fact left or right canonical Wiener–Hopf factorizations in the Wiener algebra $\mathcal{W}^4$ and thus lead to unique $T_L(\xi)$ and $T_R(\xi)$ (see for instance [11, Theorem 9.2, p. 831]). Finally, the reflection coefficient L(ξ) is recovered from R(ξ) by the formula

$$L(\xi) = -T_R(\xi) R(\xi)^* (T_L(\xi)^*)^{-1}. \tag{4.54}$$

• Finally we explain how we can apply this result to our model. From Proposition 4.1, we assume for instance that $e^{-2i\beta} R(\xi)$ is known for all ξ ∈ R. Then it is easy to see from (4.53) and (4.54) that we can uniquely recover $T_L(\xi)$ and $T_R(\xi)$ by performing Wiener–Hopf factorizations and then $e^{2i\beta} L(\xi)$ for all ξ ∈ R. Note that the exponential term $e^{-2i\beta}$ disappears in the factorization (4.53). If we assume that the assumptions (4.49) and (4.50) hold (this will be checked below), then we can apply Theorem 4.3 as follows. Multiplying the integral equations (4.51) and (4.52) by $e^{-2i\beta}$ and solving them, we conclude that we can uniquely recover $e^{2i\beta} k(x)$ (and not k(x)) for all x ∈ R. We shall show below that this implies the uniqueness of the parameters M, Q and Λ of the black hole.

Let us now show the analyticity of R(ξ) and L(ξ) on a small strip around the real axis and prove there the uniform $L^2$ estimates mentioned above. To do this we need to introduce some objects whose existence has been shown in [1, Secs. 1–3].
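The role of unitarity in (4.53) and (4.54), and the way the phase $e^{-2i\beta}$ propagates, can be illustrated in a scalar toy model. The Python sketch below assumes that each 2 × 2 block of $S_0(\xi)$ is a single complex number, so that $S_0$ is a 2 × 2 unitary matrix; it does not perform a Wiener–Hopf factorization (the transmission coefficients are taken as given), and the helper names `scalar_unitary` and `recover_L` are hypothetical.

```python
import cmath


def scalar_unitary(a, b, phi):
    """Entries (TL, R, L, TR) of a 2x2 unitary matrix built from a, b and a phase phi."""
    norm = (abs(a) ** 2 + abs(b) ** 2) ** 0.5
    a, b = a / norm, b / norm              # enforce |a|^2 + |b|^2 = 1
    e = cmath.exp(1j * phi)
    # Rows (a, b) and (-e*conj(b), e*conj(a)) are orthonormal.
    return a, b, -e * b.conjugate(), e * a.conjugate()


def recover_L(TR, R, TL):
    """Scalar version of (4.54): L = -TR * conj(R) / conj(TL)."""
    return -TR * R.conjugate() / TL.conjugate()


TL, R, L, TR = scalar_unitary(0.6 + 0.2j, 0.3 - 0.5j, 0.4)
beta = 0.9
L_from_R = recover_L(TR, R, TL)                                   # reproduces L
L_from_dressed_R = recover_L(TR, cmath.exp(-2j * beta) * R, TL)   # equals e^{2i beta} L
```

In this toy setting $|T_L|^2 = 1 - |R|^2$ is insensitive to a phase on R, while (4.54) turns the factor $e^{-2i\beta}$ on R into $e^{2i\beta}$ on L, which is the behaviour described in the last bullet above.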


The reflection coefficients R(ξ) and L(ξ) can be expressed in terms of solutions of the stationary problem

$$[\Gamma^1 D_x + W(x)] X(x, \xi) = \xi X(x, \xi), \qquad \xi \in \mathbb{R}, \tag{4.55}$$

where X(x, ξ) is understood as a 4 × 4 matrix-valued function. Of special interest are the Jost solutions $F_l(x, \xi)$ and $F_r(x, \xi)$ of (4.55), which are singled out by the specific asymptotics at infinity

$$F_l(x, \xi) = e^{i\Gamma^1 \xi x}(I_4 + o(1)), \qquad x \to +\infty,$$

$$F_r(x, \xi) = e^{i\Gamma^1 \xi x}(I_4 + o(1)), \qquad x \to -\infty.$$

For each ξ ∈ R, these two solutions exist, are fundamental matrices of (4.55) and are related as follows ([1, Proposition 2.2]). There exist two 4 × 4 matrix-valued functions $a_l(\xi)$ and $a_r(\xi)$ such that

$$F_l(x, \xi) = F_r(x, \xi)\, a_l(\xi), \qquad F_r(x, \xi) = F_l(x, \xi)\, a_r(\xi),$$

and satisfying $a_l(\xi) a_r(\xi) = a_r(\xi) a_l(\xi) = I_4$ for all ξ ∈ R. Note that $F_l(x, \xi)$ and $F_r(x, \xi)$ satisfy the asymptotics (at the opposite ends)

$$F_l(x, \xi) = e^{i\Gamma^1 \xi x}(a_l(\xi) + o(1)), \quad x \to -\infty, \qquad F_r(x, \xi) = e^{i\Gamma^1 \xi x}(a_r(\xi) + o(1)), \quad x \to +\infty. \tag{4.56}$$

Let us now express $a_l(\xi)$ and $a_r(\xi)$ using 2 × 2 block matrix notations as

$$a_l(\xi) = \begin{pmatrix} a_{l1}(\xi) & a_{l2}(\xi) \\ a_{l3}(\xi) & a_{l4}(\xi) \end{pmatrix}, \qquad a_r(\xi) = \begin{pmatrix} a_{r1}(\xi) & a_{r2}(\xi) \\ a_{r3}(\xi) & a_{r4}(\xi) \end{pmatrix}.$$

Then the reflection coefficients are defined by ([1, Eqs. (3.6) and (3.7)])

$$R(\xi) = a_{r2}(\xi)\, a_{r4}(\xi)^{-1} = -a_{l1}(\xi)^{-1} a_{l2}(\xi), \qquad L(\xi) = a_{l3}(\xi)\, a_{l1}(\xi)^{-1} = -a_{r4}(\xi)^{-1} a_{r3}(\xi).$$

Since the situations are obviously symmetric, we shall only prove the analyticity and the uniform $L^2$ estimate on a small strip around the real axis for R(ξ) (the proof for L(ξ) being identical). Moreover, we shall only consider the definition $R(\xi) = -a_{l1}(\xi)^{-1} a_{l2}(\xi)$ for simplicity. To go further, we use some integral representations of the coefficients $a_{l1}(\xi)$ and $a_{l2}(\xi)$ obtained in [1]. These are given in terms of the Faddeev matrix $M_l(x, \xi)$ defined by

$$M_l(x, \xi) = F_l(x, \xi)\, e^{-i\Gamma^1 \xi x}.$$

It is easy to see from (4.55) that $M_l(x, \xi)$ must satisfy the integral equation ([1, Eq. (2.12)])

$$M_l(x, \xi) = I_4 - i\Gamma^1 \int_{x}^{+\infty} e^{-i\Gamma^1 \xi(y - x)}\, W(y)\, M_l(y, \xi)\, e^{i\Gamma^1 \xi(y - x)}\,dy, \tag{4.57}$$


and from (4.56) that $M_l(x, \xi)$ must satisfy the asymptotics $M_l(x, \xi) = I_4 + o(1)$ when x → +∞. In fact, using once again 2 × 2 block matrix notations for $M_l(x, \xi)$,

$$M_l(x, \xi) = \begin{pmatrix} M_{l1}(x, \xi) & M_{l2}(x, \xi) \\ M_{l3}(x, \xi) & M_{l4}(x, \xi) \end{pmatrix},$$

and iterating (4.57) once, we get the uncoupled system of integral equations for $M_{l3}(x, \xi)$ and $M_{l4}(x, \xi)$ ([1, Eqs. (2.15) and (2.16)])

$$M_{l3}(x, \xi) = i\int_{x}^{+\infty} e^{2i\xi(y - x)} k(y)^*\,dy + \int_{x}^{+\infty}\!\!\int_{y}^{+\infty} e^{2i\xi(y - x)} k(y)^* k(z)\, M_{l3}(z, \xi)\,dz\,dy, \tag{4.58}$$

$$M_{l4}(x, \xi) = I_2 + \int_{x}^{+\infty}\!\!\int_{y}^{+\infty} e^{-2i\xi(z - y)} k(y)^* k(z)\, M_{l4}(z, \xi)\,dz\,dy, \tag{4.59}$$

and similar equations for $M_{l1}(x, \xi)$ and $M_{l2}(x, \xi)$ that we shall not need. Finally, the following integral representations for the coefficients $a_{l1}(\xi)$ and $a_{l2}(\xi)$ hold ([1, Eqs. (2.25) and (2.26)])

$$a_{l1}(\xi) = I_2 - i\int_{\mathbb{R}} k(y)\, M_{l3}(y, \xi)\,dy, \tag{4.60}$$

$$a_{l2}(\xi) = -i\int_{\mathbb{R}} e^{-2i\xi y} k(y)^* M_{l4}(y, \xi)\,dy. \tag{4.61}$$

We first study the coefficient $a_{l2}(\xi)$ expressed in terms of the Faddeev matrix $M_{l4}(x, \xi)$. Under the assumption $k \in L^1(\mathbb{R})$, a solution $M_{l4}(x, \xi)$ of (4.59) with the right asymptotics is easily shown to exist by iteration. Moreover for each fixed x ∈ R, this solution can be extended to a continuous function in the variable ξ when Im ξ ≤ 0 and analytic when Im ξ < 0 ([1, Proposition 2.3]). We now prove the following result.

Lemma 4.1. Define the function $P(x, \xi) = \int_{x}^{+\infty} e^{2|\mathrm{Im}\,\xi||y|}\, \|k(y)\|\,dy$. Then there exists κ > 0 small enough such that

(i) For all ξ satisfying |Im ξ| ≤ κ and for all x ∈ R, the function P(x, ξ) is uniformly bounded.

(ii) For each fixed x ∈ R, the Faddeev matrix $M_{l4}(x, \xi)$ can be extended analytically to the strip |Im ξ| < κ. Moreover, for each such ξ, it satisfies the estimate

$$\|M_{l4}(x, \xi)\| \le C \cosh(P(x, \xi)). \tag{4.62}$$

(iii) For each fixed x ∈ R, the derivative $M_{l4}'(x, \xi)$ of the Faddeev matrix with respect to the variable x can be extended analytically to the strip |Im ξ| < κ. Moreover, for each such ξ, it satisfies the estimate

$$\|M_{l4}'(x, \xi)\| \le C \sinh(P(x, \xi)). \tag{4.63}$$


Proof. The first assertion is a direct consequence of the definition of P(x, ξ) and (4.13) (take for instance $\kappa = \frac{\alpha}{2}$ where α is the positive number that appears in (4.13)). Solving (4.59) by iteration leads us to set $M_{l4}(x, \xi) = \sum_{n=0}^{\infty} u_n(x, \xi)$ with $u_0(x, \xi) = I_2$ and

$$u_n(x, \xi) = \int_{x}^{+\infty}\!\!\int_{y}^{+\infty} e^{-2i\xi(z - y)} k(y)^* k(z)\, u_{n-1}(z, \xi)\,dz\,dy, \qquad \forall\, n \ge 1. \tag{4.64}$$

By induction we get the estimates

$$\|u_n(x, \xi)\| \le \frac{P(x, \xi)^{2n}}{(2n)!}, \qquad \forall\, n \in \mathbb{N}. \tag{4.65}$$

Together with (i), this entails the second assertion. To prove the third one, we consider the series of derivatives $\sum_{n=1}^{\infty} u_n'(x, \xi)$. From (4.64), note that

$$u_n'(x, \xi) = -\int_{x}^{+\infty} e^{-2i\xi(z - x)} k(x)^* k(z)\, u_{n-1}(z, \xi)\,dz.$$

By induction and using (4.65), we get the estimates $\|u_n'(x, \xi)\| \le C\, \frac{P(x, \xi)^{2n-1}}{(2n-1)!}$ for all n ≥ 1, from which we deduce (iii).
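The induction behind (4.65) can be sketched as follows. Write $p(y) = e^{2|\mathrm{Im}\,\xi||y|}\|k(y)\|$, so that $P(x, \xi) = \int_x^{+\infty} p(y)\,dy$ and $\partial_x P = -p$. The sketch assumes a submultiplicative matrix norm with $\|k^*\| = \|k\|$ and uses the crude bound $|e^{-2i\xi(z-y)}| \le e^{2|\mathrm{Im}\,\xi|(|y| + |z|)}$; these choices are ours and may differ in detail from the authors' argument.

```latex
\begin{align*}
\|u_n(x,\xi)\|
  &\le \int_x^{+\infty} p(y) \int_y^{+\infty} p(z)\,\|u_{n-1}(z,\xi)\|\,dz\,dy \\
  &\le \int_x^{+\infty} p(y) \int_y^{+\infty} p(z)\,\frac{P(z,\xi)^{2n-2}}{(2n-2)!}\,dz\,dy \\
  &= \int_x^{+\infty} p(y)\,\frac{P(y,\xi)^{2n-1}}{(2n-1)!}\,dy
   = \frac{P(x,\xi)^{2n}}{(2n)!}, \\[2mm]
\sum_{n=0}^{\infty} \frac{P^{2n}}{(2n)!} &= \cosh P,
\qquad
\sum_{n=1}^{\infty} \frac{P^{2n-1}}{(2n-1)!} = \sinh P .
\end{align*}
```

The two series identities explain why the bounds (4.62) and (4.63) involve cosh and sinh of P(x, ξ), and why a uniform bound on P yields uniform convergence of the series on the strip.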

Corollary 4.1. Let κ be the positive number defined in Lemma 4.1. The coefficient $a_{l2}(\xi)$ is analytic on the strip |Im ξ| < κ. Moreover, it satisfies there the estimate

$$\|a_{l2}(\xi)\| = O(|\xi|^{-1}), \qquad |\xi| \to \infty. \tag{4.66}$$

Proof. The analyticity on the strip |Im ξ| < κ follows directly from (4.61) and Lemma 4.1. To prove the second assertion, we integrate by parts in (4.61). For all ξ with |Im ξ| < κ, we obtain

$$a_{l2}(\xi) = -\frac{1}{2\xi} \int_{\mathbb{R}} e^{-2i\xi y}\big(k'(y)\, M_{l4}(y, \xi) + k(y)\, M_{l4}'(y, \xi)\big)\,dy. \tag{4.67}$$

Since k' also satisfies the estimate (4.13) and using Lemma 4.1 again, we conclude that $\|a_{l2}(\xi)\| \le \frac{C}{|\xi|}$.

We now study the coefficient $a_{l1}(\xi)$ expressed in terms of the Faddeev matrix $M_{l3}(x, \xi)$. Once again under the assumption $k \in L^1(\mathbb{R})$, a solution $M_{l3}(x, \xi)$ of (4.58) with the right asymptotics is easily shown to exist by iteration. Moreover for each fixed x ∈ R, this solution can be extended to a continuous function in the variable ξ when Im ξ ≥ 0 and analytic when Im ξ > 0 ([1, Proposition 2.3]). Using the same function P(x, ξ) and positive number κ as in Lemma 4.1, let us prove the following result.

Lemma 4.2. For each fixed x ∈ R, the Faddeev matrix $M_{l3}(x, \xi)$ can be extended analytically to the strip |Im ξ| < κ. Moreover, for each such ξ, it satisfies the


estimates

$$\|M_{l3}(x, \xi)\| \le C e^{2|\mathrm{Im}\,\xi||x|} \sinh(P(x, \xi)), \tag{4.68}$$

$$\|M_{l3}(x, \xi)\| \le \frac{C}{|\xi|}\big(1 + e^{2|\mathrm{Im}\,\xi||x|}\big), \qquad |\xi| \ge 1. \tag{4.69}$$

Proof. We solve (4.58) by iteration. Hence we set $M_{l3}(x, \xi) = \sum_{n=0}^{\infty} v_n(x, \xi)$ with

$$v_0(x, \xi) = i\int_{x}^{+\infty} e^{2i\xi(y - x)} k(y)^*\,dy,$$

and

$$v_n(x, \xi) = \int_{x}^{+\infty}\!\!\int_{y}^{+\infty} e^{2i\xi(y - x)} k(y)^* k(z)\, v_{n-1}(z, \xi)\,dz\,dy. \tag{4.70}$$

We can prove the following estimate by induction

$$\|v_n(x, \xi)\| \le e^{2|\mathrm{Im}\,\xi||x|}\, \frac{P(x, \xi)^{2n+1}}{(2n+1)!}, \qquad \forall\, n \in \mathbb{N}, \tag{4.71}$$

which immediately implies (4.68). Moreover, since P(x, ξ) is uniformly bounded on |Im ξ| < κ, we deduce from (4.68) the analyticity of $M_{l3}(x, \xi)$ on the same strip. To prove (4.69), we integrate by parts in (4.58) with respect to the variable y. For all ξ with |Im ξ| < κ, we obtain

$$M_{l3}(x, \xi) = -\frac{k^*(x)}{2\xi} - \frac{e^{-2i\xi x}}{2\xi} \int_{x}^{+\infty} e^{2i\xi y} (k^*)'(y)\,dy - \frac{k^*(x) K(x)}{2i\xi} - \frac{e^{-2i\xi x}}{2i\xi} \int_{x}^{+\infty} e^{2i\xi y}\big((k^*)'(y) K(y) - k^*(y) k(y) M_{l3}(y, \xi)\big)\,dy, \tag{4.72}$$

where we have introduced the function $K(x) = \int_{x}^{+\infty} k(y) M_{l3}(y, \xi)\,dy$. Now using (4.13) for k and k', (4.68) and the uniform estimate $\|K(x)\| \le C$ for all ξ with |Im ξ| < κ, we deduce from (4.72) that (4.69) holds when |ξ| is large.

Corollary 4.2. Let κ be the positive number defined in Lemma 4.1. Then the coefficient $a_{l1}(\xi)$ is analytic on the strip |Im ξ| < κ and tends to $I_2$ when |ξ| → ∞. Furthermore, possibly considering smaller κ, the coefficient $a_{l1}(\xi)$ is invertible on the strip |Im ξ| < κ and $a_{l1}^{-1}(\xi)$ is analytic and uniformly bounded there.

Proof. The first assertion is a direct consequence of (4.60) and Lemma 4.2. Since $a_{l1}(\xi)$ tends to $I_2$ when |ξ| → ∞, $a_{l1}(\xi)$ is clearly invertible for |ξ| large enough. Since $a_{l1}(\xi)$ is also invertible on the real axis ([1, Proposition 2.10]), we conclude that $a_{l1}(\xi)$ is invertible on a strip |Im ξ| < ε with 0 < ε < κ small enough and that $a_{l1}^{-1}(\xi)$ is analytic and uniformly bounded on |Im ξ| < ε. Denoting this ε by κ, we have proved the corollary.


Let us put all these results together. Since $R(\xi) = -a_{l1}^{-1}(\xi)\, a_{l2}(\xi)$, Corollaries 4.1 and 4.2 imply that the reflection coefficient R(ξ) is analytic on a strip |Im ξ| < κ where κ is a small enough positive number. Moreover, using the estimates of the same corollaries, we see that $R(\cdot + i\eta) \in L^2(\mathbb{R})$ for all |η| < κ. In fact, we have

$$\sup_{|\eta| < \kappa} \|R(\cdot + i\eta)\|_{L^2} < \infty.$$

Finally it follows from [26, Theorem IX.13] that the Fourier transform $\hat{R}(\alpha)$ satisfies the estimate

$$e^{\kappa|\alpha|} \hat{R}(\alpha) \in L^2(\mathbb{R}). \tag{4.73}$$

In particular, the assumption (4.50) in Theorem 4.3 is satisfied by R(ξ). We finish this paper by solving the inverse problem.

Theorem 4.4. Assume that one of the reflection matrices L(ξ) or $e^{-2i\beta} R(\xi)$ appearing in (4.48) is known on a (possibly small) interval of R. Assume moreover that the mass m and the charge q ≠ 0 of the Dirac fields are known and fixed. Then the parameters M, Q and Λ of the dS-RN black hole are uniquely determined.

Proof. We only give the proof when the reflection matrix $e^{-2i\beta} R(\xi)$ is supposed to be known on an interval I of R, since the proof with L(ξ) can be treated in the same way. We thus consider two reflection matrices $e^{-2i\beta_1} R_1(\xi)$ and $e^{-2i\beta_2} R_2(\xi)$ corresponding to parameters $M_j$, $Q_j$ and $\Lambda_j$ with j = 1, 2, where moreover the parameters m and q ≠ 0 are supposed to be known and fixed. As usual we shall denote all the objects related to $e^{-2i\beta_j} R_j(\xi)$ by a lower index j in what follows. Assume that $e^{-2i\beta_1} R_1(\xi) = e^{-2i\beta_2} R_2(\xi)$ for all ξ ∈ I. By analyticity, we thus have

$$e^{-2i\beta_1} R_1(\xi) = e^{-2i\beta_2} R_2(\xi), \qquad \forall\, \xi \in \mathbb{R}.$$

Using the procedure explained after Theorem 4.3, this also entails that

$$e^{2i\beta_1} L_1(\xi) = e^{2i\beta_2} L_2(\xi), \qquad \forall\, \xi \in \mathbb{R}.$$

Thanks to (4.73) and the corresponding result for L(ξ), we can apply Theorem 4.3 (and the remarks following this theorem). Hence we obtain the equality $e^{2i\beta_1} k_1(x) = e^{2i\beta_2} k_2(x)$ for all x ∈ R or equivalently

$$e^{2i\Gamma^1 \beta_1} W_1(x) = e^{2i\Gamma^1 \beta_2} W_2(x), \qquad \forall\, x \in \mathbb{R}. \tag{4.74}$$

Now recall that $W^2$ is a positive function since

$$W^2(x) = a_l^2(x) + b^2(x) = \Big(l + \frac{1}{2}\Big)^2 \frac{F(r)}{r^2} + m^2 F(r).$$

Hence taking the square of (4.74) and then the modulus, we have

$$W_1^2(x) = a_{l,1}^2(x) + b_1^2(x) = a_{l,2}^2(x) + b_2^2(x) = W_2^2(x), \qquad \forall\, x \in \mathbb{R}. \tag{4.75}$$


Note in particular that

$$\int_{-\infty}^{+\infty} W_1^2(s)\,ds = \int_{-\infty}^{+\infty} W_2^2(s)\,ds. \tag{4.76}$$

Moreover by homogeneity in l and since $a_l$ and b are positive functions, we deduce from (4.75) that

$$a_{l,1}(x) = a_{l,2}(x), \qquad b_1(x) = b_2(x), \qquad \forall\, x \in \mathbb{R}. \tag{4.77}$$

Now since

$$W(x) = e^{-2i\Gamma^1 C^-(x)}\big(a_l(x)\Gamma^2 + b(x)\Gamma^0\big)$$

by (2.16), it follows from (4.74) and (4.77) that

$$e^{2i\Gamma^1 \beta_1} e^{-2i\Gamma^1 C_1^-(x)} = e^{2i\Gamma^1 \beta_2} e^{-2i\Gamma^1 C_2^-(x)}, \qquad \forall\, x \in \mathbb{R},$$

or equivalently that

$$\beta_1 - C_1^-(x) = \beta_2 - C_2^-(x) + k\pi, \qquad \forall\, x \in \mathbb{R}, \tag{4.78}$$

where k ∈ Z. Differentiating (4.78), we obtain

$$c_1(x) = c_2(x), \qquad \forall\, x \in \mathbb{R}. \tag{4.79}$$

Letting x tend to ±∞, we obtain from (4.79) and (2.15)

$$c_{0,1} = c_{0,2}, \qquad c_{+,1} = c_{+,2}. \tag{4.80}$$

Finally, we note that (4.76) and (4.80) are precisely the conditions under which the parameters $M$, $Q$ and $\Lambda$ were shown to be uniquely determined in the proof of Theorem 4.1 (see the conditions (4.25) and (4.27)). We thus apply the same procedure as before to complete the proof of the theorem.

June 2, 2010 14:55 WSPC/S0129-055X

148-RMP

J070-00398

Reviews in Mathematical Physics Vol. 22, No. 5 (2010) 485–505 © World Scientific Publishing Company DOI: 10.1142/S0129055X10003989

EULER–POINCARÉ FLOWS ON THE LOOP BOTT–VIRASORO GROUP AND SPACE OF TENSOR DENSITIES AND (2 + 1)-DIMENSIONAL INTEGRABLE SYSTEMS

PARTHA GUHA
Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, D-04103 Leipzig, Germany
and
S. N. Bose National Centre for Basic Sciences, JD Block, Sector-3, Salt Lake, Calcutta-700098, India
[email protected]

Received 22 July 2009
Revised 22 January 2010

Dedicated to Professor Tudor Ratiu on his 60th birthday with great respect and admiration

Following the work of Ovsienko and Roger ([54]), we study the loop Virasoro algebra. Using this algebra, we formulate the Euler–Poincaré flows on the coadjoint orbit of the loop Virasoro algebra. We show that the Calogero–Bogoyavlenskii–Schiff equation and various other (2 + 1)-dimensional Korteweg–de Vries (KdV) type systems follow from this construction. Using the right-invariant $H^1$ inner product on the Lie algebra of the loop Bott–Virasoro group, we formulate the Euler–Poincaré framework of the (2 + 1)-dimensional Camassa–Holm equation. This equation appears to be the Camassa–Holm analogue of the Calogero–Bogoyavlenskii–Schiff type (2 + 1)-dimensional KdV equation. We also derive the (2 + 1)-dimensional generalization of the Hunter–Saxton equation. Finally, we give an Euler–Poincaré formulation of a one-parameter family of (1 + 1)-dimensional partial differential equations, known as the b-field equations. Later, we extend our construction to the algebra of loop tensor densities to study the Euler–Poincaré framework of the (2 + 1)-dimensional extension of the b-field equations.

Keywords: Diffeomorphism; loop Virasoro algebra; tensor densities; Calogero–Bogoyavlenskii–Schiff equation; (2 + 1)-dimensional Camassa equation; b-field equation.

Mathematics Subject Classifications 2010: 53A07, 53B50

1. Introduction

The study of higher-dimensional integrable systems is one of the most challenging areas in integrable systems. Early in the study of integrable systems, the main thrust was restricted to (1 + 1)-dimensional systems because of the difficulty of finding physically significant higher-dimensional solutions that are localized in all directions. Recently, much progress has been achieved in understanding the properties and solutions of two-dimensional integrable models such as the Kadomtsev–Petviashvili (KP) and Davey–Stewartson (DS) equations [1]. One of the most striking features of (2 + 1)-dimensional systems is the existence of exponentially localized structures, called dromions, which are driven by two perpendicular line ghost solitons in the case of the DS equation or two non-perpendicular line ghost solitons in the case of the KP equation. One should recall that the name dromions, as well as their spectral meaning, was introduced by Fokas and Santini [27]. Recently, rich dromion structures were also found in (2 + 1)-dimensional KdV equations [49, 50, 59, 63].

After the discovery of dromions, the question arises whether exponentially localized structures exist in (2 + 1)-dimensional breaking soliton equations as well. In such systems, the spectral parameter becomes a multivalued function; in other words, the spectral parameter possesses so-called breaking behavior. The solutions of these equations may become multivalued. An equation exhibiting breaking solitons was formulated by Bogoyavlenskii in [5, 6] as one of the (2 + 1)-dimensional reductions of the self-dual Yang–Mills equations. In a series of papers, Bogoyavlenskii studied such breaking soliton equations. He extended the well-known Lax representation to the generalized form
\[ L_t = P(L) + \sum_{k=1}^{n}R_k(L, L_{y_k}) + [L, A]. \]

Here $P(L)$ and $R_k(L, L_{y_k})$ are certain meromorphic functions of the operator $L$. In [5, 6], Bogoyavlenskii constructed several hydrodynamic-type systems which are connected to the Toda lattice and the Volterra model. It has been shown that these systems possess breaking behavior, Hamiltonian forms and conservation laws. The continuous limits of these systems include the equation
\[ v_t = 4vv_y + 2v_x\partial_x^{-1}v_y - v_{xxy} + \beta_0(6vv_x - v_{xxx}), \tag{1} \]
which, after the substitution $v = u_x$, is reduced to the potential form
\[ u_{tx} = 4u_xu_{xy} + 2u_yu_{xx} - u_{xxxy}, \tag{2} \]

where we set $\beta_0 = 0$. Schiff [64] obtained the above equation by a different route: he derived Eq. (1) from the reduction of the self-dual Yang–Mills (SDYM) equations from four to three dimensions. There has been considerable interest in showing that the self-dual Yang–Mills equations serve as a master integrable equation, from which many integrable systems can be obtained by suitable reductions, and this was the original motivation of Schiff. It has been shown in [66] that the generalized SDYM equations contain (as dimensional reductions) various (2 + 1)-dimensional integrable soliton hierarchies which generalize the nonlinear Schrödinger and KdV hierarchies.

One can also derive (2 + 1)-dimensional KdV type systems by another method. Using classical differential geometry, Konopelchenko [44] derived a (2 + 1)-dimensional KdV equation. In geodesic coordinates, the Gauss equation is reduced to the Schrödinger equation, where the Gaussian curvature plays the role of a potential. It can be shown that a special case is governed by the KdV equation for the Gaussian curvature. In this framework, Konopelchenko [44] studied the integrable dynamics of curvature via the KdV equation, higher KdV equations and other (2 + 1)-dimensional integrable equations with breaking solitons. The bihamiltonian operators for (2 + 1)-dimensional integrable systems were introduced in [28–30].

In an interesting paper, Fokas, Olver and Rosenau [26] proposed an algorithmic construction of the (2 + 1)-dimensional integrable system
\[ q_{xt} - \nu q_{xxxt} + aq_{xy} + bq_{xxxy} + c(q_{xx}q_y + 2q_xq_{xy}) - c\nu(q_{xxxx}q_y + 2q_{xxx}q_{xy}) = 0, \tag{3} \]
which yields peakon/dromion type solutions. This equation can be identified with the potential form of the Camassa–Holm (integrable) analogue of the Calogero–Bogoyavlenskii–Schiff equation.

Recently, the one-parameter family of shallow water equations of the form
\[ u_t - u_{xxt} + (b + 1)uu_x = bu_xu_{xx} + uu_{xxx}, \tag{4} \]

where $b$ is a real parameter, has drawn some attention. This equation is known as the b-field equation. It was introduced by Degasperis, Holm and Hone [18, 19], who showed the existence of multi-peakon solutions for any value of $b$, although only the special cases $b = 2, 3$ are integrable, having bihamiltonian formulations. The $b = 2$ case is the well-known Camassa–Holm (CH) equation [8], and $b = 3$ is the integrable system discovered by Degasperis and Procesi [20]. One must note that only for $b = 2, 3$ is Eq. (4) also hydrodynamically relevant [14, 41]. It is worth remembering that the $b = 2$ case was later recognized as being included in a class of integrable equations derived from hereditary symmetries in Fokas and Fuchssteiner [25].

Using the Helmholtz field $m := u - u_{xx}$, the b-field equation, or the DHH equation (4), allows reformulation in the compact form
\[ m_t + um_x + bu_xm = 0, \tag{5} \]

where the three terms correspond respectively to evolution, convection and stretching of the one-dimensional flow. In this paper we also study an Euler–Poincaré formulation of the (2 + 1)-dimensional b-field equation. It is worth mentioning that the well-posedness and blow-up of the b-field equation were proved in [23], and its invariance properties were used by Henry [38] to investigate the equation qualitatively.

In a recent paper [35], the author formulated the Euler–Poincaré (EP) framework of the Degasperis–Procesi (DP) equation. It turns out that the DP equation is the Euler–Poincaré flow on the combined space of Hill's (second-order) operators and first-order differential operators on the circle. In that paper, the author also gave the EP formulation of the two-component generalization of the DP equation. It has also been shown [35] that the Hamiltonian structure obtained from the EP framework exactly coincides with the Hamiltonian structures of the DP equation obtained by Degasperis et al. In this paper, we give a much shorter derivation of the DP and the b-field equations using the deformation of the vector field structure on $S^1$. Following the work of Ovsienko and Roger [54], we study the loop Virasoro algebra. Using this algebra, we are able to derive the (2 + 1)-dimensional b-field equation.

The aim of this paper is to contribute towards a theory of integrable-type geodesic flows on infinite-dimensional Lie groups, a subject which has attracted tremendous attention since Arnold's seminal paper [2] on the Euler equation in hydrodynamics. Later, Ebin and Marsden [22] established a proper geometric setting for this problem. They showed that the geodesic spray is smooth. This led to very nice existence proofs; the limit of zero viscosity for manifolds with no boundary was shown to exist for the first time. It is worth mentioning that in recent years, for equations like the Camassa–Holm model for shallow water waves or the Hunter–Saxton [40] model for nematic liquid crystals, the geometric structures have been used to study qualitative properties of the solutions. The geometric structure of the Hunter–Saxton equation and its relevance have been studied by Lenells [45]. The geometric approach leads to the construction of global weak solutions in the periodic case [46], the case of global weak solutions without periodicity being investigated in Bressan and Constantin [3]. One must note that in (1 + 1) dimensions both the Camassa–Holm and the Degasperis–Procesi equations admit traveling waves that are peaked and orbitally stable, so that these patterns are physically detectable [17, 48]. These peakons capture the main feature of the exact traveling wave solutions of greatest height of the governing equations for water waves [16].

The KdV equation is an Euler–Poincaré equation on the Virasoro–Bott group (see [42, 57, 65]).
This group is defined as the unique (up to isomorphism) nontrivial central extension of the group $\mathrm{Diff}(S^1)$ of all diffeomorphisms of $S^1$. The inertia operator is given by the standard $L^2$-metric on $S^1$. It is known that the two-component KdV and Camassa–Holm equations are also geodesic flows on the extended Virasoro–Bott group [32–34]. It is worth pointing out that for the Camassa–Holm equation the geometric approach leads to a proof that the equation satisfies the Least Action Principle [10, 11].

Infinite-dimensional groups also play an important role in the construction of (2 + 1)-dimensional integrable systems. One form of the (2 + 1)-dimensional KdV and nonlinear Schrödinger equations can be derived from the toroidal Lie algebra. Here the variable $x$ is associated to the action of the usual affine part of the toroidal Lie algebra [4, 61], while evolutions in $y$ and $t$ are induced by the action of the genuine toroidal part. The weight of $v$ and the relative weight of $y$ and $t$ are balanced with that of $x$, which leaves two degrees of freedom in determining the weights of all the variables.

In this paper we study the (formal) Euler–Poincaré framework [52] of various (2 + 1)-dimensional KdV type systems. Until now there has been no systematic method for constructing (2 + 1)-dimensional integrable systems from the viewpoint of geodesic flows or the Euler–Poincaré framework. In particular, we show that the Calogero–Bogoyavlenskii–Schiff equation arises as a geodesic flow on the loop Bott–Virasoro group. This equation is an eminent member of the (2 + 1)-dimensional family of KdV equations [39]. We also study the Euler–Poincaré framework of the Bogoyavlenskii–Konopelchenko equation. In fact, only a short list of (2 + 1)-dimensional equations is known to admit an EP formulation. Recently, Ovsienko [58] studied the bihamiltonian properties of the Martinez Alonso–Shabat type system
\[ u_{xt} = u_yu_{xy} - u_{yy}u_x. \]
Another nonlinear differential equation,
\[ u_{tx} = u_{xx}u_y - u_{xy}u_x + cu_{yy}, \]
has been mentioned in [54].

In the second half of the paper, we construct a higher-dimensional Camassa–Holm equation. We show that the (2 + 1)-dimensional Camassa–Holm equation arises as a geodesic flow with respect to the right-invariant $H^1$ metric on the cotangent loop Virasoro group. We also compute the (2 + 1)-dimensional Hunter–Saxton equation.

The result of this paper was announced [36] at the Oberwolfach meeting on geometrical mechanics. This is a long version of [36].

The paper is organized as follows. In Sec. 2, we present the Euler–Poincaré formalism and frozen Poisson structures. The loop Virasoro algebra is introduced in Sec. 3. In Sec. 4, we give the Euler–Poincaré formulation of the (2 + 1)-dimensional KdV equation. Section 5 is devoted to the derivation of the (2 + 1)-dimensional Camassa–Holm equation and the Hunter–Saxton equation. In Sec. 6, we present the Euler–Poincaré framework of the b-field equation. The formulation of the (2 + 1)-dimensional b-field equation is given in Sec. 7.

2. The Euler–Poincaré Formalism

The Euler–Poincaré equations were born in 1901 (see [52] for details), when Poincaré made an extensive generalization of the classical Euler equations for the rigid body and ideal fluids. He did this by formulating the equations on a general Lie algebra, with the rigid body being associated with the rotation Lie algebra and fluids with the Lie algebra of divergence-free vector fields.
We give a rapid introduction to the Euler–Poincaré framework. Let $G$ be a Lie group, $\mathfrak{g}$ its Lie algebra and $\mathfrak{g}^*$ the dual of $\mathfrak{g}$. $G$ can be thought of as the configuration space of some physical system: for example, the group $SO(3)$ for a rigid body and the group $\mathrm{sDiff}(M)$ of volume-preserving diffeomorphisms for an ideal fluid filling a domain $M$. The dual space $\mathfrak{g}^*$ of any Lie algebra $\mathfrak{g}$ carries a natural Lie–Poisson structure:
\[ \{f, g\}_{LP}(\mu) := \langle [df, dg], \mu \rangle \]
for any $\mu \in \mathfrak{g}^*$ and $f, g \in C^\infty(\mathfrak{g}^*)$.


The Hamiltonian vector field on $\mathfrak{g}^*$ corresponding to a Hamiltonian function $f$, computed with respect to the Lie–Poisson structure, is given by
\[ \frac{d\mu}{dt} = \mathrm{ad}^*_{df}\mu, \quad \mu \in \mathfrak{g}^*, \tag{6} \]

which implies that the Hamiltonian vector field is $X_f(\mu) = \mathrm{ad}^*_{df}\mu$.

Let us fix some quadratic form, the energy function, on $\mathfrak{g}$. Consider the right translations of this quadratic form to the tangent space at any point of the group. In this way, we define a right-invariant Riemannian metric on the group using the energy function. The geodesic flow on $G$ with respect to this quadratic form represents the extremals of the principle of least action, which trace out the actual motion of the physical system.

We identify the Lie algebra and its dual by means of this quadratic form. This identification is done via the inertia operator $I : \mathfrak{g} \to \mathfrak{g}^*$. This allows us to rewrite the Euler–Poincaré equation on the dual space $\mathfrak{g}^*$. It has been proved that the EP equation on $\mathfrak{g}^*$ is Hamiltonian with respect to the natural Lie–Poisson structure on the dual space.

Definition 2.1. The Euler–Poincaré equation on $\mathfrak{g}^*$ corresponding to the Hamiltonian $H(\mu) = \frac{1}{2}\langle I^{-1}\mu, \mu\rangle$ is given by
\[ \frac{d\mu}{dt} = \mathrm{ad}^*_{I^{-1}\mu}\mu, \quad I^{-1}\mu \in \mathfrak{g}, \tag{7} \]

where $\mathrm{ad}^*_{(\cdot)}\mu$ is the coadjoint operator, dual to the operator $[\mu, \cdot]$, defining the structure of the Lie algebra $\mathfrak{g}$. Equation (7) characterizes the evolution of a point $\mu \in \mathfrak{g}^*$.

2.1. Frozen Lie–Poisson structure

Consider the dual $\mathfrak{g}^*$ of the Lie algebra $\mathfrak{g}$ with a Poisson structure given by the “frozen” Lie–Poisson structure. In other words, we fix some point $\mu_0 \in \mathfrak{g}^*$ and define a Poisson structure by
\[ \{f, g\}_0 := \langle [df(\mu), dg(\mu)], \mu_0 \rangle, \]
which satisfies the Jacobi identity. It plays an important role in integrable systems, particularly in the construction of the first Hamiltonian structure of the underlying integrable system.

We can give another interpretation [12, 13] of the frozen structure in terms of cocycles. Given an inertia operator $I : \mathfrak{g} \to \mathfrak{g}^*$, one can define a constant Poisson structure
\[ \{f, g\}_0(\mu) = \langle df, I\,dg \rangle, \]
where $\mu \in \mathfrak{g}^*$.


A two-cocycle $\omega$ is called a coboundary if there is a point $\mu_0 \in \mathfrak{g}^*$ such that $\omega(p, q) = \langle [p, q], \mu_0 \rangle$, where $p, q \in \mathfrak{g}$. Since the Poisson structure is generated by the coboundary $\omega$, we obtain the frozen Lie–Poisson structure. This behaves like a Lie–Poisson structure frozen at the point $\mu_0 \in \mathfrak{g}^*$, and it coincides with the previous definition of the frozen structure. It is easy to check that the above Poisson structures are compatible, i.e. their linear combination, or pencil of Poisson structures,
\[ \{\cdot, \cdot\}_\lambda = \{\cdot, \cdot\}_0 + \lambda\{\cdot, \cdot\}_{LP} \tag{8} \]

is again a Poisson structure for all $\lambda \in \mathbb{R}$. The following was shown by Khesin and Misiolek [43].

Proposition 2.2. The brackets $\{\cdot, \cdot\}_{LP}$ and $\{\cdot, \cdot\}_0$ are compatible for every “freezing” point $\mu_0 \in \mathfrak{g}^*$.

At this point, we can introduce the bihamiltonian structure. The notion of integrability can be understood from this structure. The standard way to understand bihamiltonian vector fields on the dual of a Lie algebra is via the Lie–Poisson structures.

Definition 2.3. A vector field $X$ on $\mathfrak{g}^*$ is called bi-Hamiltonian if there are two functions $H_1$ and $H_2$ such that $X$ is the Hamiltonian vector field of $H_1$ with respect to the Poisson structure $\{\cdot, \cdot\}_{LP}$ and the Hamiltonian vector field of $H_2$ with respect to $\{\cdot, \cdot\}_0$.

3. Loop Virasoro Algebra and (2 + 1)-Dimensional KdV Flows

We wish to extend the Virasoro algebra to the case of two space variables. A natural way to do this is to consider loops on it. One defines the loop group on $\mathrm{Diff}(S^1)$ as follows:
\[ L(\mathrm{Diff}(S^1)) = \{\phi : S^1 \to \mathrm{Diff}(S^1) \mid \phi \text{ is differentiable}\}, \]
the group law being given by
\[ (\phi \circ \psi)(y) = \phi(y) \circ \psi(y), \quad y \in S^1. \]

In a similar manner, we construct the Lie algebra $L(\mathrm{Vect}(S^1))$ consisting of vector fields on $S^1$ depending on one more independent variable $y \in S^1$. The loop variable is thus denoted by $y$ and the variable on the “target” copy of $S^1$ by $x$. The elements of $L(\mathrm{Vect}(S^1))$ are of the form $f(x, y)\frac{\partial}{\partial x}$, where $f \in C^\infty(S^1 \times S^1)$, and the Lie bracket reads as follows [54]:
\[ \left[f(x, y)\frac{\partial}{\partial x}, g(x, y)\frac{\partial}{\partial x}\right] = \bigl(f(x, y)g_x(x, y) - f_x(x, y)g(x, y)\bigr)\frac{\partial}{\partial x}. \]
It is easy to convince oneself that $L(\mathrm{Vect}(S^1))$ is the Lie algebra of $L(\mathrm{Diff}(S^1))$ in the usual weak sense for the infinite-dimensional case; a one-parameter group argument gives an identification between the tangent space to $L(\mathrm{Diff}(S^1))$ at the identity and $L(\mathrm{Vect}(S^1))$, equipped with its Lie bracket. In what follows, we will denote $L(\mathrm{Vect}(S^1))$ by $\tilde{\mathfrak{g}}$. The natural pairing between the loop Virasoro algebra and its dual is given by
\[ \left\langle f(x, y)\frac{\partial}{\partial x}, v(x, y)\,dx^2\right\rangle = \int_{S^1 \times S^1}fv\,dx\,dy. \tag{9} \]

3.1. Cocycle and extension of loop Virasoro algebra

Consider the following “modified” Gelfand–Fuchs cocycle on $\mathrm{Vect}(S^1)$:
\[ \omega_{mGF}\left(f(x)\frac{d}{dx}, g(x)\frac{d}{dx}\right) = \int_{S^1}(af'g'' + bf'g)\,dx, \tag{10} \]

where the first term is the original Gelfand–Fuchs cocycle. This cocycle is cohomologous to the Gelfand–Fuchs cocycle; hence, the corresponding central extension is isomorphic to the Virasoro algebra. The additional term is a coboundary term. It is easy to check that the functional
\[ \int_{S^1}f'g\,dx = \frac{1}{2}\int_{S^1}(f'g - fg')\,dx \]
depends on the commutator of $f\frac{d}{dx}$ and $g\frac{d}{dx}$.

Let us give the explicit formulae for the non-trivial 2-cocycles [31] on $\tilde{\mathfrak{g}}$. A distribution $\lambda \in C^\infty(S^1)$ corresponds to a 2-cocycle of the first class given by [67]
\[ \mu_\lambda(f, g) = \lambda\left(\int_{S^1}fg_{xxx}\,dx\right); \]
these are the Virasoro-type extensions. For the particular case where $\lambda(a(y)) = \int_{S^1}a(y)\,dy$, such a 2-cocycle will be denoted by $\omega_1$, so that one has
\[ \omega_1(f, g) = \int_{S^1 \times S^1}fg_{xxx}\,dx\,dy. \tag{11} \]

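We define the Lie algebra $\hat{\mathfrak{g}}$ as the one-dimensional central extension of $\tilde{\mathfrak{g}}$ given by the cocycle $\omega_1$. As a vector space,
\[ \hat{\mathfrak{g}} = \tilde{\mathfrak{g}} \oplus \mathbb{R}, \]
where the summand $\mathbb{R}$ is the center of $\hat{\mathfrak{g}}$. The commutator in $\hat{\mathfrak{g}}$ is given by the following explicit expression, which readily follows from the above formulae:
\[ \left[\left(f\frac{\partial}{\partial x}, a\right), \left(g\frac{\partial}{\partial x}, b\right)\right] = \left((fg_x - f_xg)\frac{\partial}{\partial x}, \int_{S^1 \times S^1}fg_{xxx}\,dx\,dy\right), \tag{12} \]
where the last term is an element of the center of $\hat{\mathfrak{g}}$.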
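As a quick consistency check on (11) and (12), the following sketch verifies that $\omega_1$ is skew-symmetric, which is needed for the central term in (12) to be compatible with the antisymmetry of the commutator. It assumes only that $f$ and $g$ are smooth and periodic in $x$, so that all boundary terms vanish; it does not address the cocycle identity itself.

```latex
\begin{align*}
\omega_1(f,g) &= \int_{S^1\times S^1} f\,g_{xxx}\,dx\,dy \\
  % integrate by parts in x three times; boundary terms vanish by periodicity
  &= -\int_{S^1\times S^1} f_x\,g_{xx}\,dx\,dy
   = \int_{S^1\times S^1} f_{xx}\,g_x\,dx\,dy
   = -\int_{S^1\times S^1} f_{xxx}\,g\,dx\,dy \\
  &= -\,\omega_1(g,f).
\end{align*}
```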


4. EP Formulation of Calogero–Bogoyavlenskii–Schiff Type (2 + 1)-Dimensional KdV Equation

In this section, we give an Euler–Poincaré derivation of the Calogero–Bogoyavlenskii–Schiff equation (1) or (2). It is one of the most important members of the (2 + 1)-dimensional KdV family. We recall the Kirillov–Segal result and generalize it to the case of the Lie algebra $\hat{\mathfrak{g}}$.

Proposition 4.1. The coadjoint action of the Lie algebra $\hat{\mathfrak{g}}$ is given by
\[ \mathrm{ad}^*_{f(x,y)\frac{\partial}{\partial x}}\left(v(x, y)\,dx^2\right) = (fv_x + 2f_xv + c_1f_{xxx} + c_2f_x)\,dx^2, \tag{13} \]
while the center acts trivially.

Corollary 4.2. The Hamiltonian operator $O_{LV}$ corresponding to the coadjoint action of the loop Virasoro algebra is given by
\[ O_{LV} = \partial_xv + v\partial_x + c_1\partial_x^3 + c_2\partial_x. \tag{14} \]
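The passage from (13) to (14) can be made explicit by applying $O_{LV}$ to a function $f$. The sketch below assumes the usual operator convention that $\partial_xv$ means "multiply by $v$, then differentiate"; this convention is not stated in the text but is the one under which (14) reproduces (13).

```latex
\begin{align*}
O_{LV}f &= \partial_x(vf) + v\,\partial_x f + c_1\,\partial_x^3 f + c_2\,\partial_x f \\
        &= (v_x f + v f_x) + v f_x + c_1 f_{xxx} + c_2 f_x \\
        &= f v_x + 2 f_x v + c_1 f_{xxx} + c_2 f_x ,
\end{align*}
% which is the coefficient of dx^2 on the right-hand side of (13).
```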

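Consider a functional $H$ on $\hat{\mathfrak{g}}^*$ which is a (pseudo)differential polynomial:
\[ H(g, v) = \int_{S^1 \times S^1}h\left(g, v, g_x, v_x, g_y, v_y, \partial_x^{-1}g, \partial_x^{-1}v, \partial_y^{-1}g, \partial_y^{-1}v, g_{xy}, v_{xy}, \ldots\right)dx\,dy, \]
where $h$ is a polynomial in an infinite set of variables. For instance,
\[ \frac{\delta H}{\delta v} = h_v - \frac{\partial}{\partial x}(h_{v_x}) - \frac{\partial}{\partial y}(h_{v_y}) - \partial_x^{-1}(h_{\partial_x^{-1}v}) - \partial_y^{-1}(h_{\partial_y^{-1}v}) + \frac{\partial^2}{\partial x^2}(h_{v_{xx}}) + \frac{\partial^2}{\partial x\,\partial y}(h_{v_{xy}}) + \frac{\partial^2}{\partial y^2}(h_{v_{yy}}) \pm \cdots, \]
where, as usual, $h_v$ means the partial derivative $\frac{\partial h}{\partial v}$, and similarly $h_{v_x} = \frac{\partial h}{\partial v_x}$.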

Proposition 4.3. The Euler–Poincaré flow restricted to the hyperplane $c_1 = 1$, $c_2 = 0$ at $(0, v\,dx^2)$ yields the Calogero–Bogoyavlenskii–Schiff equation (or (2 + 1)-dimensional KdV equation)
\[ v_t = v_{xxy} + 2vv_y + v_x\partial_x^{-1}v_y \tag{15} \]
for the Hamiltonian $H = \frac{1}{2}\int_{S^1 \times S^1}v\,\partial_x^{-1}v_y\,dx\,dy$.

We use the expression [58]
\[ (\partial_x^{-1}v)(x, y) = \int_0^x v(\xi, y)\,d\xi - \int_0^{2\pi}v(x, y)\,dx. \]
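A sketch of how (15) can be recovered from (14) is given below. It assumes that the flow has the form $v_t = O_{LV}\,\delta H/\delta v$, and that $\partial_x^{-1}$ is skew-adjoint and commutes with $\partial_y$ on the functions considered, so that $\partial_x^{-1}\partial_y$ is symmetric; these assumptions are not spelled out in the text.

```latex
\begin{align*}
\frac{\delta H}{\delta v} &= \partial_x^{-1}v_y
   && \text{($\partial_x^{-1}\partial_y$ symmetric, so the factor $\tfrac12$ cancels)} \\
v_t &= \left(\partial_x v + v\,\partial_x + \partial_x^3\right)\partial_x^{-1}v_y
   && (c_1 = 1,\ c_2 = 0) \\
    &= \partial_x\!\left(v\,\partial_x^{-1}v_y\right) + v\,v_y + v_{xxy} \\
    &= v_{xxy} + 2v\,v_y + v_x\,\partial_x^{-1}v_y .
\end{align*}
```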

4.1. The Bogoyavlenskii–Konopelchenko equation

In this section, we derive several other (2 + 1)-dimensional KdV type equations. The Euler–Poincaré formalism of the Bogoyavlenskii–Konopelchenko equation
\[ v_t + \beta v_{xxy} + 3\alpha + v_{xxx} + 3vv_x + 2\beta vv_y + \beta v_x\partial_x^{-1}v_y = 0, \tag{16} \]
with $\partial_x^{-1}v(x, y) = \int_0^x v(\xi, y)\,d\xi - \int_0^{2\pi}v(x, y)\,dx$, is closely related to that of the Calogero–Bogoyavlenskii–Schiff equation. In fact, this equation is a combination of the KdV and Calogero–Bogoyavlenskii–Schiff flows. Equation (16) models the (2 + 1)-dimensional interaction of a Riemann wave propagating along the $y$-axis with a long wave along the $x$-axis. Using (14), we obtain the following result.

Proposition 4.4. The Euler–Poincaré flow associated to the loop Virasoro algebra $\hat{\mathfrak{g}}$ yields the Bogoyavlenskii–Konopelchenko equation (for $\alpha = 0$) for the Hamiltonian
\[ H = \frac{1}{2}\int_{S^1 \times S^1}(v^2 + \beta v\,\partial_x^{-1}v_y)\,dx\,dy, \]
when restricted to the hyperplane $c_1 = 1$, $c_2 = 0$.

Outline of Proof. We use the EP equation
\[ v_t = -O_{LV}\frac{\delta H}{\delta v} \]
to obtain our result.
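The outline above can be expanded as follows. This is a sketch under the same assumptions as for the Calogero–Bogoyavlenskii–Schiff flow, namely that $\partial_x^{-1}\partial_y$ is symmetric with respect to the pairing, with $O_{LV}$ taken from (14) at $c_1 = 1$, $c_2 = 0$.

```latex
\begin{align*}
\frac{\delta H}{\delta v} &= v + \beta\,\partial_x^{-1}v_y, \\
O_{LV}\,v &= \partial_x(v^2) + v\,v_x + v_{xxx} = 3v\,v_x + v_{xxx}, \\
O_{LV}\,\partial_x^{-1}v_y &= v_x\,\partial_x^{-1}v_y + 2v\,v_y + v_{xxy}, \\
v_t = -O_{LV}\frac{\delta H}{\delta v}
  &= -\left(v_{xxx} + 3v\,v_x\right)
     - \beta\left(v_{xxy} + 2v\,v_y + v_x\,\partial_x^{-1}v_y\right).
\end{align*}
% Moving all terms to the left-hand side gives (16) with alpha = 0.
```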

Another class of (2 + 1)-dimensional KdV equations was proposed by Lou and his collaborators ([47, 49, 50]) to study the rich dromion structures. It is defined as
\[ v_t + v_{xxx} = 3(v\,\partial_y^{-1}v_x)_x, \tag{17} \]

where $\partial_y^{-1}$ is defined similarly to $\partial_x^{-1}$ [58]. In the (1 + 1)-dimensional case $y = x$, this equation reduces to the usual KdV equation.

We use the “frozen” Lie–Poisson structure to compute the Hamiltonian operator at $(v(x)\,dx^2, c_1, c_2) = (0, 0, 1)$. We also assume that the only cocycle term is $\int_{S^1}f'g$, induced by the coboundary term. Freezing at the point $(0, 0, 1)$ yields the truncated Hamiltonian operator
\[ \tilde{O}_1 = \partial_x. \]
We also compute the Hamiltonian operator at $(v(x)\,dx^2, c_1, c_2) = (0, 1, 0)$, given by
\[ \tilde{O}_2 = \partial_x^3. \]

Proposition 4.5. The second class of (2 + 1)-dimensional KdV equations follows from the following combination of flows on $\hat{\mathfrak{g}}^*$:
\[ v_t = \mu\tilde{O}_1\frac{\delta H_1}{\delta v} + \lambda\tilde{O}_2\frac{\delta H_2}{\delta v}, \]
with $\frac{\delta H_1}{\delta v} = v\,\partial_y^{-1}v_x$ and $\frac{\delta H_2}{\delta v} = v$, respectively. Here $\mu = 3$ and $\lambda = -1$.
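The combination of flows in Proposition 4.5 can be checked by substituting the two frozen operators and the stated variational derivatives. The sketch below uses only the data given in the proposition.

```latex
\begin{align*}
v_t &= 3\,\partial_x\!\left(v\,\partial_y^{-1}v_x\right) + (-1)\,\partial_x^3 v
     && (\mu = 3,\ \lambda = -1) \\
    &= 3\left(v\,\partial_y^{-1}v_x\right)_x - v_{xxx},
\end{align*}
% that is, v_t + v_xxx = 3 (v \partial_y^{-1} v_x)_x, which is Eq. (17).
```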

Proof. By direct computation.

5. $H^1$ Metric and (2 + 1)-Dimensional Camassa–Holm and Hunter–Saxton Systems

In this section, we study the Camassa–Holm analogue of the (2 + 1)-dimensional KdV equations from the integrable point of view. The hydrodynamical analogue was derived in [41]. We start with the explicit expression for the coadjoint action of $\mathfrak{g}$ with respect to the right-invariant $H^1$-metric.

Let us introduce the $H^1$ norm on the algebra $\tilde{\mathfrak{g}}$.

Definition 5.1. The $H^1$-Sobolev norm on the loop Virasoro algebra is defined as
\[ \left\langle f(x, y)\frac{\partial}{\partial x}, u(x, y)\,dx^2\right\rangle_{H^1} = \int_{S^1}fu\,dx + \nu\int_{S^1}\partial_xf\,\partial_xu\,dx. \tag{18} \]

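Now we compute the coadjoint action.

Proposition 5.2. The coadjoint action of the loop Virasoro algebra $\hat{\mathfrak{g}}$ with respect to the $H^1$ metric is given by
\[ \mathrm{ad}^*_{f(x,y)\frac{\partial}{\partial x}}\left(v\,dx^2\right) = (f\tilde{v}_x + 2f_x\tilde{v} + c_1f_{xxx} + c_2f_x)\,dx^2, \]
where $\tilde{v} = (1 - \nu\partial_x^2)v$.

Proof. We know that
\[ \left\langle \mathrm{ad}^*_{f\frac{\partial}{\partial x}}v\,dx^2,\ h\frac{\partial}{\partial x}\right\rangle_{H^1} = \left\langle v\,dx^2,\ \left[f\frac{\partial}{\partial x}, h\frac{\partial}{\partial x}\right]\right\rangle_{H^1} = \left\langle v\,dx^2,\ (fh_x - f_xh)\frac{\partial}{\partial x} + \left(\int_{S^1 \times S^1}fh_{xxx}\,dx\,dy,\ \int_{S^1 \times S^1}fh_x\,dx\,dy\right)\right\rangle_{H^1}. \]
Thus from the right-hand side we obtain the matrix expression. We now compute the left-hand side of the above equation. Let us denote
\[ \hat{f} = \left(f\frac{\partial}{\partial x}, c\right), \quad \hat{g} = \left(g\frac{\partial}{\partial x}, d\right), \quad \hat{h} = \left(h\frac{\partial}{\partial x}, e\right), \]
where $c = (c_1, c_2)$, $d = (d_1, d_2)$ and $e = (e_1, e_2)$. Now we compute the left-hand side:
\[ \mathrm{LHS} = \int_{S^1 \times S^1}(\mathrm{ad}^*_{\hat{f}}\hat{g})\,\hat{h}\,dx\,dy + \nu\int_{S^1 \times S^1}\partial_x(\mathrm{ad}^*_{\hat{f}}\hat{g})\,\partial_x\hat{h}\,dx\,dy = \int_{S^1 \times S^1}\left[(1 - \nu\partial_x^2)\,\mathrm{ad}^*_{\hat{f}}\hat{g}\right]\hat{h}\,dx\,dy. \]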

Thus, by equating the right-hand side and the left-hand side, we obtain the above formula.

Lemma 5.3. The Hamiltonian operator corresponding to the coadjoint action of the loop Virasoro algebra with respect to the $H^1$ metric is given by
\[ O_{H^1} = (1 - \nu\partial_x^2)^{-1}\left(\partial_x\tilde{v} + \tilde{v}\partial_x + c_1\partial_x^3 + c_2\partial_x\right), \tag{19} \]
where $\tilde{v} = (1 - \nu\partial_x^2)v$.

Let us study the Euler–Poincaré flow associated to the $H^1$ metric on the coadjoint orbit of the cotangent loop Virasoro algebra $\hat{\mathfrak{g}}$.


Proposition 5.4. The Euler–Poincaré flow with respect to the $H^1$-norm on the dual space of the loop Virasoro algebra becomes
\[ v_t = O_{H^1}\frac{\delta H}{\delta v}, \tag{20} \]
where $O_{H^1}$ is defined by (19). Suppose the quadratic Hamiltonian on $\tilde{\mathfrak{g}}^*$ is defined as
\[ H = \frac{1}{2}\int_{S^1 \times S^1}v\,\partial_x^{-1}v_y\,dx\,dy; \]
then the Euler–Poincaré flow yields
\[ \tilde{v}_t = \tilde{v}_x\partial_x^{-1}v_y + 2\tilde{v}v_y + c_1v_{xxy} + c_2v_y. \tag{21} \]
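The step from (20) to (21) can be sketched as follows, assuming that $\nu$ is constant, so that $\tilde{v}_t = (1 - \nu\partial_x^2)v_t$, and that $\delta H/\delta v = \partial_x^{-1}v_y$ as in the KdV case (which again relies on $\partial_x^{-1}\partial_y$ being symmetric).

```latex
\begin{align*}
(1-\nu\partial_x^2)\,v_t
  &= \left(\partial_x\tilde v + \tilde v\,\partial_x + c_1\partial_x^3 + c_2\partial_x\right)
     \partial_x^{-1}v_y
     && \text{apply } (1-\nu\partial_x^2) \text{ to (20)} \\
\tilde v_t
  &= \partial_x\!\left(\tilde v\,\partial_x^{-1}v_y\right) + \tilde v\,v_y
     + c_1 v_{xxy} + c_2 v_y \\
  &= \tilde v_x\,\partial_x^{-1}v_y + 2\tilde v\,v_y + c_1 v_{xxy} + c_2 v_y .
\end{align*}
```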

Corollary 5.5. The Euler–Poincaré flow restricted to the hyperplane $c_2 = 0$ yields the Camassa–Holm analogue of the Calogero–Bogoyavlenskii–Schiff equation
\[ v_t - \nu v_{xxt} + c_1v_{xxy} + (v_x - \nu v_{xxx})\partial_x^{-1}v_y + 2(v - \nu v_{xx})v_y = 0 \tag{22} \]
for the Hamiltonian $H = \int_{S^1 \times S^1}g\,\partial_x^{-1}v_y\,dx\,dy$.

Corollary 5.6. The potential form of the (2 + 1)-dimensional Camassa–Holm equation takes the form
\[ u_{xt} - \nu u_{xxxt} + c_1u_{xxxy} + u_{xx}u_y + 2u_xu_{xy} - \nu(u_{xxxx}u_y + 2u_{xxx}u_{xy}) = 0 \tag{23} \]
for $v = u_x$.

Remark. In the special (1 + 1)-dimensional case ($y = x$), Eq. (23) reduces to the potential Camassa–Holm equation. If we further assume $\nu = 0$, then Eq. (23) reduces to the potential KdV equation.

Corollary 5.7. The Euler–Poincaré flow restricted to the hyperplane $c_1 = 1$ and $c_2 = 1$ yields the modified Calogero–Bogoyavlenskii–Schiff equation
\[ v_t - \nu v_{xxt} + v_y + v_{xxy} + (v_x - \nu v_{xxx})\partial_x^{-1}v_y + 2(v - \nu v_{xx})v_y = 0, \]
and its potential form is
\[ u_{xt} - \nu u_{xxxt} + u_{xy} + u_{xxxy} + (u_{xx}u_y + 2u_xu_{xy}) - \nu(u_{xxxx}u_y + 2u_{xxx}u_{xy}) = 0. \]

Corollary 5.8. If we assume $\frac{1}{\nu} \to 0$, then Eq. (22) takes the form
\[ v_{xxt} + v_{xxx}\partial_x^{-1}v_y + 2v_{xx}v_y = 0, \tag{24} \]
which is known as the (2 + 1)-dimensional Hunter–Saxton equation. In the (1 + 1)-dimensional case ($y = x$), this reduces to the Hunter–Saxton equation
\[ v_{xxt} + v_{xxx}v + 2v_{xx}v_x = 0. \tag{25} \]


The potential form of Eq. (24) takes the form
\[
u_{xxxt} + u_{xxxx}u_y + 2u_{xxx}u_{xy} = 0. \tag{26}
\]

6. Euler–Poincaré Framework of (1 + 1)-Dimensional b-Field Equation

Denote by $\mathcal{F}_\mu(S^1)$ the space of tensor densities of degree $\mu$ on $S^1$,
\[
\mathcal{F}_\mu = \{a(x)\,dx^\mu \mid a(x) \in C^\infty(S^1)\},
\]
where $\mu$ is the degree and $x$ is a local coordinate on $S^1$. As a vector space, $\mathcal{F}_\mu(S^1)$ is isomorphic to $C^\infty(S^1)$ [53]. Geometrically we say
\[
\mathcal{F}_\lambda \in \Gamma(\Omega^{\otimes\lambda}), \quad\text{where } \Omega^{\otimes\lambda} = (T^*S^1)^{\otimes\lambda},
\]
and $\Omega = T^*S^1$ is the cotangent bundle of $S^1$. Here $\mathcal{F}_0(M) = C^\infty(M)$, and the spaces $\mathcal{F}_1(M)$ and $\mathcal{F}_{-1}(M)$ coincide with the spaces of differential forms and vector fields, respectively.

Definition 6.1. The b-bracket between $v(x)\frac{d}{dx}$ and $w(x)\frac{d}{dx}$ is defined as

\[
[v, w]_b = v w_x - (b-1)\,v_x w. \tag{27}
\]
This b-bracket can also be expressed as
\[
[v, w]_b = \frac{b}{2}\,[v, w] - \frac{b-2}{2}\,[v, w]_{\mathrm{sym}}, \tag{28}
\]
where $[v, w] = v w_x - v_x w$ and $[v, w]_{\mathrm{sym}} = v w_x + v_x w$.

Remark. The b-bracket can be interpreted as an action of $\mathrm{Vect}(S^1)$ on $\mathcal{F}_{-(b-1)}(S^1)$, the tensor densities on $S^1$ of degree $-(b-1)$. For $b = 2$ this is just the vector field action corresponding to a Lie algebra. Moreover, because of the $[v, w]_{\mathrm{sym}}$ term, the b-bracket is not skew-symmetric; it is a deformation of the bracket of vector fields.

There is a pairing $\langle\,,\rangle : \mathcal{F}_\mu \otimes \mathcal{F}_{1-\mu} \to \mathbb{R}$ given by
\[
\langle a(x)(dx)^{\mu}, b(x)(dx)^{1-\mu}\rangle = \int_{S^1} a(x)b(x)\,dx, \tag{29}
\]
which is $\mathrm{Diff}(S^1)$-invariant. A vector field $f(x)\frac{d}{dx}$ acts on the space of tensor densities $\mathcal{F}_\mu$ by the Lie derivative
\[
L^{\mu}_{f(x)\frac{d}{dx}}(a(x)) = \big(f(x)a'(x) + \mu f'(x)a(x)\big)(dx)^{\mu}. \tag{30}
\]
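The decomposition (28) and the reading of the b-bracket as the Lie derivative (30) with $\mu = -(b-1)$ are pointwise algebraic identities, so they can be checked on polynomial test functions. The following is a minimal sketch in Python (standard library only). The coefficient-list representation and all helper names are assumptions made for illustration, and polynomials are of course not periodic functions on $S^1$; only the algebra of the bracket is being tested.

```python
from fractions import Fraction

# A polynomial c0 + c1*x + c2*x^2 + ... is stored as the list [c0, c1, c2, ...].

def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def scale(c, p):
    return trim([c * a for a in p])

def mul(p, q):
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def diff(p):
    """Derivative with respect to x."""
    return trim([i * a for i, a in enumerate(p)][1:])

def b_bracket(v, w, b):
    # Eq. (27): [v, w]_b = v w_x - (b - 1) v_x w
    return add(mul(v, diff(w)), scale(-(b - 1), mul(diff(v), w)))

def decomposed(v, w, b):
    # Eq. (28): (b/2) [v, w] - ((b - 2)/2) [v, w]_sym, for integer b
    lie = add(mul(v, diff(w)), scale(-1, mul(diff(v), w)))
    sym = add(mul(v, diff(w)), mul(diff(v), w))
    return add(scale(Fraction(b, 2), lie), scale(-Fraction(b - 2, 2), sym))

def lie_derivative(f, a, mu):
    # Eq. (30): coefficient of (dx)^mu, namely f a' + mu f' a
    return add(mul(f, diff(a)), scale(mu, mul(diff(f), a)))

if __name__ == "__main__":
    v = [1, 2, 0, 3]   # 1 + 2x + 3x^3
    w = [0, 1, 5]      # x + 5x^2
    for b in range(-2, 6):
        assert b_bracket(v, w, b) == decomposed(v, w, b)
        assert b_bracket(v, w, b) == lie_derivative(v, w, -(b - 1))
    print("identities (27), (28), (30) agree on the test polynomials")
```

For $b = 2$ the bracket computed this way is skew-symmetric, while for $b = 3$ the sum $[v, w]_b + [w, v]_b$ equals $-(vw)_x$, which illustrates the remark that the symmetric term breaks skew-symmetry.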


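We denote the b-algebra by $\mathcal{F}_{-(b-1)}$ and its dual by $\mathcal{F}_b$. Thus we can define a pairing according to (29):
\[
\langle a(x)(dx)^{-(b-1)}, b(x)(dx)^{b}\rangle = \int_{S^1} a(x)b(x)\,dx.
\]
It is clear that the b-algebra is not a Lie algebra, and under these circumstances we cannot define a proper "coadjoint action". We generalize the concept of the coadjoint action to the b-algebra and define it with respect to the pairing (29).

Lemma 6.2.
\[
(\mathrm{ad}^{H^1})^*_f(u) = (1-\nu\partial_x^2)^{-1}\big[f\,(1-\nu\partial_x^2)u_x + b f_x\,(1-\nu\partial_x^2)u\big]. \tag{31}
\]

Proof. We know
\[
\langle \mathrm{ad}^*_f(u), g\rangle_{H^1} = -\langle u, [f, g]_b\rangle_{H^1} \equiv -\langle u\,dx^b, (f g' - (b-1) f' g)(dx)^{1-b}\rangle_{H^1},
\]
hence the pairing is well-defined. Let us compute
\[
\mathrm{RHS} = -\int_{S^1} \big(u f g' - (b-1)\,u f' g\big)\,dx - \nu\int_{S^1} u'\,\big(f g' - (b-1) f' g\big)'\,dx
= \int_{S^1} \big[f\,(1-\nu\partial_x^2)u' + b f'\,(1-\nu\partial_x^2)u\big]\,g\,dx,
\]
where the second equality follows by integration by parts, and
\[
\mathrm{LHS} = \int_{S^1} \big((\mathrm{ad}^{H^1})^*_f u\big)\,g\,dx + \nu\int_{S^1} \big((\mathrm{ad}^{H^1})^*_f u\big)'\,g'\,dx
= \int_{S^1} \big[(1-\nu\partial_x^2)(\mathrm{ad}^{H^1})^*_f u\big]\,g\,dx.
\]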

Thus by equating the right-hand side and left-hand side we obtain the above formula.

Using the Helmholtz operator we express $m = (1-\nu\partial_x^2)u$. Thus, we express the Hamiltonian operator corresponding to (31) as
\[
O_1 = -(1-\nu\partial^2)^{-1}(m_x + b m\partial).
\]
The Euler–Poincaré equation
\[
u_t = O_1\,\frac{\delta H}{\delta u} \quad\text{for } H = \frac12\int_{S^1} u^2\,dx \tag{32}
\]
can be rewritten as
\[
m_t = O\,\frac{\delta H}{\delta u}, \tag{33}
\]
where $O = (m_x + b m\partial)$. Using the EP formula (33), we construct the b-field equation.


Proposition 6.3. The Euler–Poincaré flow on the dual space of the b-algebra yields the b-field equation
\[
m_t + m_x u + b m u_x = 0.
\]
This is a new derivation of the b-field equation.

6.1. Hamiltonian structure of the Degasperis–Procesi equation and EP framework

Degasperis et al. studied Hamiltonian structures for the $b = 3$ case of the b-field equation, or the DHH equation; in other words, they exhibited bihamiltonian features of the Degasperis–Procesi system. They expressed the Degasperis–Procesi equation as
\[
m_t = B_i\,\frac{\delta H_i}{\delta m}, \qquad i = 0, 1, \tag{34}
\]
where $m = u - u_{xx}$ (we assume $\nu = 1$). Thus they studied the flow of the Helmholtz function. They showed that there is only one local Hamiltonian structure
\[
B_0 = \partial_x(1-\partial_x^2)(4-\partial_x^2), \tag{35}
\]
and the second Hamiltonian structure is given by
\[
B_1 = m^{2/3}\,\partial_x\,m^{1/3}\,(\partial_x - \partial_x^3)^{-1}\,m^{1/3}\,\partial_x\,m^{2/3}, \tag{36}
\]
which can be simplified to
\[
B_1 \equiv \hat B = \frac29\,(3m\partial + m_x)(\partial - \partial^3)^{-1}(3m\partial + 2m_x).
\]

Proposition 6.4. The Degasperis–Procesi equation
\[
m_t = \hat B\,\frac{\delta H_1}{\delta m}, \qquad \hat B = \frac29\,(3m\partial + m_x)(\partial - \partial^3)^{-1}(3m\partial + 2m_x), \qquad H_1 = \frac94\int_{S^1} m\,dx \tag{37}
\]
is equivalent to
\[
m_t = O\,\frac{\delta H}{\delta u} \quad\text{for } H = \frac12\int_{S^1} u^2\,dx,
\]
where $O = (m_x + b m\partial)$.

Proof. Our goal is to show
\[
\frac29\,(\partial - \partial^3)^{-1}(3m\partial + 2m_x)\,\frac{\delta H_1}{\delta m} = \frac{\delta H}{\delta u},
\]
where $H_1 = \frac94\int_{S^1} m\,dx$. If we insert $\frac{\delta H_1}{\delta m} = \frac94$ into the left-hand side of the above equation, we obtain
\[
(\partial - \partial^3)^{-1} m_x = u,
\]
where we use $u = (1-\partial^2)^{-1}m$. Thus we obtain
\[
m_t = (3m\partial + m_x)\,\frac{\delta H}{\delta u},
\]
where $H = \frac12\int_{S^1} u^2\,dx$. Therefore the Degasperis–Holm–Hone form of the Hamiltonian structure coincides with our Hamiltonian structure.

6.1.1. First Hamiltonian structure of b-field equation

Let us compute the Hamiltonian operator at a frozen point $m(x) = m_0$. Since $m_0$ is constant, the Hamiltonian operator at the frozen point becomes
\[
O_0 = 3m_0\,\partial. \tag{38}
\]

Actually, freezing at the point $m_0$ yields a Poisson structure induced by a coboundary, which is always a trivial Poisson structure. For all practical purposes we can normalize the operator $O_0$ by taking $m_0 = \frac13$. We show that this leads us to the first Hamiltonian operator of the Degasperis–Procesi equation.

Proposition 6.5. The Degasperis–Procesi equation with respect to the first Hamiltonian structure of Degasperis–Holm–Hone exactly coincides with $\hat O_0 = \partial$, where the corresponding Hamiltonian $\hat H$ satisfies
\[
\frac{\delta\hat H}{\delta u} = (2u^2 - u_x^2 - u u_{xx}). \tag{39}
\]

Proof. It is easy to check that
\[
\frac{\delta\hat H}{\delta u} = (4-\partial^2)\,\frac{\delta H_0}{\delta u},
\]
where the first DHH Hamiltonian is given by $H_0 = \frac16\int_{S^1} u^3\,dx$. Thus we obtain
\[
m_t = \partial(4-\partial^2)\,\frac{\delta H_0}{\delta u}.
\]
Using the chain rule formula for variational derivatives,
\[
\frac{\delta H_0}{\delta u} = (1-\partial^2)\,\frac{\delta H_0}{\delta m},
\]
we obtain
\[
m_t = \partial(4-\partial^2)(1-\partial^2)\,\frac{\delta H_0}{\delta m}.
\]
Hence we obtain the first Hamiltonian structure $B_0$ of Degasperis, Holm and Hone from our method.
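The step called "easy to check" in the proof, namely $(4-\partial^2)\,\delta H_0/\delta u = 2u^2 - u_x^2 - u u_{xx}$ with $\delta H_0/\delta u = u^2/2$ for $H_0 = \frac16\int u^3\,dx$, is a pointwise identity and can be confirmed on a polynomial test function. The sketch below uses only the Python standard library; the list-of-coefficients representation and the helper names are illustrative assumptions, not part of the paper.

```python
from fractions import Fraction

# A polynomial c0 + c1*x + c2*x^2 + ... is stored as the list [c0, c1, c2, ...].

def trim(p):
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def scale(c, p):
    return trim([c * a for a in p])

def mul(p, q):
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def diff(p):
    return trim([i * a for i, a in enumerate(p)][1:])

def dH0_du(u):
    # H0 = (1/6) * integral of u^3, so delta H0 / delta u = u^2 / 2.
    return scale(Fraction(1, 2), mul(u, u))

def lhs(u):
    # (4 - d_x^2) applied to delta H0 / delta u
    g = dH0_du(u)
    return add(scale(4, g), scale(-1, diff(diff(g))))

def rhs(u):
    # Right-hand side of Eq. (39): 2 u^2 - u_x^2 - u u_xx
    ux = diff(u)
    uxx = diff(ux)
    return add(add(scale(2, mul(u, u)), scale(-1, mul(ux, ux))),
               scale(-1, mul(u, uxx)))

if __name__ == "__main__":
    u = [1, -2, 0, 3, 5]   # an arbitrary test polynomial
    print(lhs(u) == rhs(u))
```

The check works because $\partial_x^2(u^2/2) = (u u_x)_x = u_x^2 + u u_{xx}$, so the comparison holds for any choice of test polynomial.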


7. EP Formalism for (2 + 1)-Dimensional b-Field Equation

Let $\tilde G_1 = LG_1$ be the loop group associated to $G_1$, whose algebra is given by
\[
\tilde{\mathfrak g}_1 = L(\mathcal{F}_{-(b-1)}).
\]
Consider an action of $L(\mathrm{Vect}(S^1))$ on $L(\mathcal{F}_{-(b-1)})$,
\[
L_{f\frac{\partial}{\partial x}}\big(g\,(dx)^{-(b-1)}\big) = \big(f g_x - (b-1) f_x g\big)(dx)^{-(b-1)}, \tag{40}
\]
which yields a new bracket. Let us introduce an $H^1$ norm on the algebra $\tilde{\mathfrak g}_1$.

Definition 7.1. The $H^1$-Sobolev norm on the loop tensor density algebra is defined as
\[
\langle f(x, y)(dx)^{-(b-1)}, u(x, y)(dx)^{b}\rangle_{H^1} = \int_{S^1} f u\,dx + \nu\int_{S^1} \partial_x f\,\partial_x u\,dx. \tag{41}
\]

Proposition 7.2. The action of $\mathrm{Vect}(S^1)$ with respect to the $H^1$ metric on the tensor product algebra $\mathcal{F}_b$ is given by
\[
\mathrm{ad}^*_{(f\frac{d}{dx})}\big(v(x, y)(dx)^b\big) = +(f\tilde v_x + b f_x\tilde v)\,dx^b,
\]
where $\tilde v = (1-\nu\partial^2)v$.

Corollary 7.3. The Hamiltonian operator corresponding to the action of $\mathrm{Vect}(S^1)$ on $\mathcal{F}_b$ with respect to the $H^1$ metric yields
\[
\hat O = -(1-\nu\partial_x^2)^{-1}\big(\partial_x\tilde v + (b-1)\,\tilde v\,\partial_x\big). \tag{42}
\]

Proposition 7.4. The Euler–Poincaré flow on the $\tilde{\mathfrak g}_1$ orbit yields the (2 + 1)-dimensional b-field equation
\[
v_t - \nu v_{xxt} + \lambda\,\partial_x^{-1}v_{yy} + (v_x - \nu v_{xxx})\,\partial_x^{-1}v_y + b\,(v - \nu v_{xx})v_y = 0, \tag{43}
\]
where the Hamiltonian is given by
\[
H = \frac12\int_{S^1\times S^1} v\,\partial_x^{-1}v_y\,dx\,dy.
\]
The potential form of (43) yields
\[
u_{xt} - \nu u_{xxxt} + \lambda u_{yy} + (u_{xx}u_y + b\,u_x u_{xy}) - \nu\,(u_{xxxx}u_y + b\,u_{xxx}u_{xy}) = 0.
\]
We introduce the quantity
\[
m = v - \nu v_{xx}, \tag{44}
\]
which is just the Helmholtz operator acting on $v$. Therefore the one-parameter family of (2 + 1)-dimensional peakon-type PDEs (or (2 + 1)-dimensional b-field equations) may be written in the following form
\[
m_t + m_x\,\partial_x^{-1}v_y + b\,v_y m + \lambda\,\partial_x^{-1}v_{yy} = 0, \tag{45}
\]

which reduces to the (1 + 1)-dimensional b-field equation for $y = x$ and $\lambda = 0$. It is clear that Eq. (45) becomes the (2 + 1)-dimensional Camassa–Holm and the (2 + 1)-dimensional Degasperis–Procesi equation for $b = 2$ and $b = 3$, respectively.

8. Conclusion and Outlook

We have examined various extensions of (2 + 1)-dimensional KdV equations and (2 + 1)-dimensional generalized Camassa–Holm type systems. In particular, we have shown that all these equations constitute geodesic flows on the loop Bott–Virasoro group. In fact, three famous (2 + 1)-dimensional partial differential equations, namely the Calogero–Bogoyavlenskii–Schiff (CBS), the (2 + 1)-dimensional Camassa–Holm (CH$_2$) and the (2 + 1)-dimensional Hunter–Saxton (HS$_2$) equations, can be described as Euler–Poincaré flows on the dual space of the loop Virasoro orbit.

After that we have given the Euler–Poincaré formalism of the new (1 + 1)-dimensional b-field equation, proposed by Degasperis et al., on the space of tensor algebra. We have also extended the EP framework to the (2 + 1)-dimensional b-field equation, which includes the (2 + 1)-dimensional Degasperis–Procesi equation. Therefore, this paper has further strengthened the programme of Euler–Poincaré and integrable geodesic flows on extended groups of diffeomorphisms. We hope to consider the singular solutions of the (2 + 1)-dimensional equations in our forthcoming work.

Acknowledgment

The author is profoundly grateful to Professors Jerry Marsden, Tudor Ratiu, Valentin Ovsienko and Chand Devchand for stimulating discussions and various constructive suggestions. He is also grateful to Professor Thanasis Fokas for his interest and encouragement. In particular, he is immensely grateful to Chand Devchand for the b-bracket discussion. Finally, the author wants to thank the anonymous referee for many helpful comments and suggestions. He expresses grateful thanks to Professor Jürgen Jost for gracious hospitality at the Max Planck Institute for Mathematics in the Sciences.

References

[1] M. J. Ablowitz and P. A. Clarkson, Solitons, Nonlinear Evolution Equations and Inverse Scattering, London Mathematical Society Lecture Note Series, Vol. 149 (Cambridge University Press, 1991).
[2] V. I. Arnold, Sur la géométrie différentielle des groupes de Lie de dimension infinie et ses applications à l'hydrodynamique des fluides parfaits, Ann. Inst. Fourier Grenoble 16 (1966) 319–361.


[3] A. Bressan and A. Constantin, Global solutions of the Hunter–Saxton equation, SIAM J. Math. Anal. 37 (2005) 996–1002.
[4] Yu. Billig, An extension of the KdV hierarchy arising from a representation of a toroidal Lie algebra, J. Algebra 217 (1999) 40–64.
[5] O. I. Bogoyavlensky, Breaking solitons in (2 + 1)-dimensional integrable equations, Russian Math. Surveys 45(4) (1990) 1–86.
[6] O. I. Bogoyavlensky, Breaking solitons III, Izv. Akad. Nauk SSSR Ser. Matem. 54 (1990) 123–131; Math. USSR Izv. 36 (1991) 129–137 (English translation).
[7] F. Calogero, A method to generate solvable nonlinear evolution equations, Lett. Nuovo Cimento 14 (1975) 443–447.
[8] R. Camassa and D. Holm, An integrable shallow water equation with peaked solitons, Phys. Rev. Lett. 71(11) (1993) 1661–1664.
[9] R. Camassa, D. Holm and J. M. Hyman, A new integrable shallow water equation, Adv. Appl. Mech. 31 (1994) 1–33.
[10] A. Constantin and B. Kolev, Geodesic flow on the diffeomorphism group of the circle, Comment. Math. Helv. 78 (2003) 787–804.
[11] A. Constantin, T. Kappeler, B. Kolev and P. Topalov, On geodesic exponential maps of the Virasoro group, Ann. Global Anal. Geom. 31 (2007) 155–180.
[12] A. Constantin and B. Kolev, Integrability of invariant metrics on the Virasoro group, Phys. Lett. A 350(1–2) (2006) 75–80.
[13] A. Constantin and B. Kolev, Integrability of invariant metrics on the diffeomorphism group of the circle, J. Nonlinear Sci. 16(2) (2006) 109–122.
[14] A. Constantin and D. Lannes, The hydrodynamical relevance of the Camassa–Holm and Degasperis–Procesi equations, Arch. Ration. Mech. Anal. 192 (2009) 165–186.
[15] M. Chen, S.-Q. Liu and Y. Zhang, A 2-component generalization of the Camassa–Holm equation and its solutions, nlin.SI/0501028.
[16] A. Constantin, The trajectories of particles in Stokes waves, Invent. Math. 166 (2006) 523–535.
[17] A. Constantin and W. Strauss, Stability of peakons, Comm. Pure Appl. Math. 53 (2000) 603–610.
[18] A. Degasperis, D. D. Holm and A. N. W. Hone, A new integrable equation with peakon solutions, NEEDS 2001 Proceedings, Theoret. and Math. Phys. 133 (2002) 170–183.
[19] A. Degasperis, D. D. Holm and A. N. W. Hone, Integrable and non-integrable equations with peakons, in Nonlinear Physics: Theory and Experiment, II (Gallipoli, 2002) (World Sci. Publishing, River Edge, NJ, 2003), pp. 37–43.
[20] A. Degasperis and M. Procesi, Asymptotic integrability, in Symmetry and Perturbation Theory (Rome, 1998) (World Sci. Publishing, River Edge, NJ, 1999), pp. 23–37.
[21] C. Devchand and J. Schiff, The supersymmetric Camassa–Holm equation and geodesic flow on the superconformal group, J. Math. Phys. 42(1) (2001) 260–273.
[22] D. Ebin and J. Marsden, Groups of diffeomorphisms and the motion of an incompressible fluid, Ann. Math. 92 (1970) 102–163.
[23] J. Escher and Z. Yin, Well-posedness, blow-up phenomena, and global solutions for the b-equation, J. Reine Angew. Math. 624 (2008) 51–80.
[24] G. Falqui, On a Camassa–Holm type equation with two dependent variables, nlin.SI/0505059.
[25] A. S. Fokas and B. Fuchssteiner, Bäcklund transformations for hereditary symmetries, Nonlinear Anal. 5(4) (1981) 423–432.
[26] A. S. Fokas, P. J. Olver and P. Rosenau, A plethora of integrable bi-Hamiltonian equations, in Algebraic Aspects of Integrable Systems, Progr. Nonlinear Differential Equations Appl., Vol. 26 (Birkhäuser Boston, Boston, MA, 1997), pp. 93–101.


[27] A. S. Fokas and P. M. Santini, Dromions and a boundary value problem for the Davey–Stewartson I equation, Physica D 44 (1990) 99–130.
[28] A. S. Fokas and P. M. Santini, The recursion operator of the Kadomtsev–Petviashvili equation and the squared eigenfunction of the Schrödinger operators, Stud. Appl. Math. 75 (1986) 179–186.
[29] A. S. Fokas and P. M. Santini, Recursion operators and bi-Hamiltonian structures in multidimensions I, Comm. Math. Phys. 115 (1988) 375–419.
[30] A. S. Fokas and P. M. Santini, Recursion operators and bi-Hamiltonian structures in multidimensions II, Comm. Math. Phys. 116 (1988) 449–474.
[31] I. M. Gelfand and D. B. Fuks, Cohomologies of the Lie algebra of vector fields on the circle, Funct. Anal. Appl. 2(4) (1968) 92–93.
[32] P. Guha, Integrable geodesic flows on the (super)extension of the Bott–Virasoro group, Lett. Math. Phys. 52(4) (2000) 311–328.
[33] P. Guha, Geodesic flows, bi-Hamiltonian structure and coupled KdV type systems, J. Math. Anal. Appl. 310 (2005) 45–56.
[34] P. Guha and P. Olver, Geodesic flow and two (super) component analog of the Camassa–Holm equation, SIGMA 2 (2006) 054, 9 pp.
[35] P. Guha, Euler–Poincaré formalism of (two component) Degasperis–Procesi and Holm–Staley type systems, J. Nonlinear Math. Phys. 14(3) (2007) 390–421.
[36] P. Guha, Euler–Poincaré flows on the space of tensor densities and integrable systems, Oberwolfach Report 5(3) (2008) 1875–1880.
[37] I. M. Gelfand, I. M. Graev and A. M. Vershik, Models of representations of current groups, Representations of Lie Groups and Lie Algebras (Budapest, 1971) (Akad. Kiadó, Budapest, 1985), pp. 121–179.
[38] D. Henry, Persistence properties for a family of nonlinear partial differential equations, Nonlinear Anal. 70 (2009) 2049–2064.
[39] A. N. W. Hone, Reciprocal link for (2 + 1)-dimensional extensions of shallow water equations, Appl. Math. Lett. 13(3) (2000) 37–42.
[40] J. K. Hunter and R. Saxton, Dynamics of director fields, SIAM J. Appl. Math. 51 (1991) 1498–1521.
[41] R. S. Johnson, Camassa–Holm, Korteweg–de Vries and related models for water waves, J. Fluid Mech. 455 (2002) 63–82.
[42] A. Kirillov, Infinite-dimensional Lie groups: Their orbits, invariants and representations. The geometry of moments, in Twistor Geometry and Nonlinear Systems, Lecture Notes in Math., Vol. 970 (Springer, Berlin, 1982), pp. 101–123.
[43] B. Khesin and G. Misiolek, Euler equations on homogeneous spaces and Virasoro orbits, Adv. Math. 176(1) (2003) 116–144.
[44] B. G. Konopelchenko, Solitons in Multidimensions (World Scientific, 1993).
[45] J. Lenells, The Hunter–Saxton equation describes the geodesic flow on a sphere, J. Geom. Phys. 57 (2007) 2049–2064.
[46] J. Lenells, Weak geodesic flow and global solutions of the Hunter–Saxton equation, Discrete Contin. Dyn. Syst. 18 (2007) 643–656.
[47] S.-Y. Lou, Searching for higher dimensional integrable models from lower ones via Painlevé analysis, Phys. Rev. Lett. 80 (1998) 5027–5031.
[48] Z. Lin and Y. Liu, Stability of peakons for the Degasperis–Procesi equation, Comm. Pure Appl. Math. 62 (2009) 125–146.
[49] J. Lin, S.-Y. Lou and K. Wang, High-dimensional Virasoro integrable models and exact solutions, Phys. Lett. A 287(3–4) (2001) 257–267.
[50] Y.-S. Li and Y.-J. Zhang, Symmetries of a (2 + 1)-dimensional breaking soliton equation, J. Phys. A 26(24) (1993) 7487–7494.


[51] G. Misiolek, A shallow water equation as a geodesic flow on the Bott–Virasoro group, J. Geom. Phys. 24 (1998) 203–208.
[52] J. E. Marsden and T. Ratiu, Introduction to Mechanics and Symmetry (Springer-Verlag, New York, 1994).
[53] V. Ovsienko and C. Roger, Generalizations of Virasoro group and Virasoro algebra through extensions by modules of tensor-densities on S^1, Indag. Math. (N.S.) 9(2) (1998) 277–288.
[54] V. Ovsienko and C. Roger, Looped cotangent Virasoro algebra and nonlinear integrable systems in dimension 2 + 1, Comm. Math. Phys. 273 (2007) 357–378; math-ph/0602043.
[55] V. Ovsienko, Coadjoint representation of Virasoro-type Lie algebras and differential operators on tensor-densities, in Infinite Dimensional Kähler Manifolds (Oberwolfach, 1995), DMV Sem., Vol. 31 (Birkhäuser, Basel, 2001), pp. 231–255.
[56] P. Olver and P. Rosenau, Tri-Hamiltonian duality between solitons and solitary-wave solutions having compact support, Phys. Rev. E (3) 53(2) (1996) 1900–1906.
[57] V. Yu. Ovsienko and B. A. Khesin, KdV super equation as an Euler equation, Funct. Anal. Appl. 21 (1987) 329–331.
[58] V. Yu. Ovsienko, Bi-Hamiltonian nature of the equation u_{tx} = u_{xy}u_y − u_{yy}u_x, arXiv:0802.1818v1 [math-ph].
[59] R. Radha and M. Lakshmanan, Dromion-like structures in the (2 + 1)-dimensional breaking soliton equation, Phys. Lett. A 197(1) (1995) 7–12.
[60] P. Rosenau, Nonlinear dispersion and compact structures, Phys. Rev. Lett. 73(13) (1994) 1737–1741.
[61] E. Ramos, C.-H. Sah and R. Shrock, Algebras of diffeomorphisms of the N-torus, J. Math. Phys. 31(8) (1990) 1805–1816.
[62] A. Reiman and M. Semenov-Tyan-Shanskii, Hamiltonian structure of equations of Kadomtsev–Petviashvili type, in Differential Geometry, Lie Groups and Mechanics, VI. Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI) 133 (1984) 212–227.
[63] H.-Y. Ruan and Y.-X. Chen, Dromion interactions of (2 + 1)-dimensional KdV-type equations, J. Phys. Soc. Japan 72(3) (2003) 491–495.
[64] J. Schiff, Integrability of Chern–Simons–Higgs vortex equations and a reduction of the self-dual Yang–Mills equations to three dimensions, in Painlevé Transcendents, eds. D. Levi and P. Winternitz, NATO ASI Series B, Vol. 278 (Plenum Press, New York, 1992).
[65] G. Segal, Unitary representations of some infinite-dimensional groups, Comm. Math. Phys. 80(3) (1981) 301–342.
[66] I. A. B. Strachan, Some integrable hierarchies in (2 + 1)-dimensions and their twistor description, J. Math. Phys. 34(1) (1993) 243–259.
[67] P. Zusmanovich, The second homology group of current Lie algebras, in K-Theory (Strasbourg, 1992), Astérisque 226(11) (1994) 435–452.

June 2, 2010 14:55 WSPC/S0129-055X

148-RMP

J070-00402

Reviews in Mathematical Physics Vol. 22, No. 5 (2010) 507–531 c World Scientiﬁc Publishing Company DOI: 10.1142/S0129055X10004028

PROJECTIVE MODULE DESCRIPTION OF EMBEDDED NONCOMMUTATIVE SPACES

R. B. ZHANG School of Mathematics and Statistics, University of Sydney, Sydney, Australia [email protected] XIAO ZHANG Institute of Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, P. R. China [email protected] Received 20 May 2009 Revised 5 February 2010

An algebraic formulation is given for the embedded noncommutative spaces over the Moyal algebra developed in a geometric framework in [8]. We explicitly construct the projective modules corresponding to the tangent bundles of the embedded noncommutative spaces, and recover from this algebraic formulation the metric, Levi–Civita connection and related curvatures, which were introduced geometrically in [8]. Transformation rules for connections and curvatures under general coordinate changes are given. A bar involution on the Moyal algebra is discovered, and its consequences on the noncommutative diﬀerential geometry are described. Keywords: Noncommutative space; projective module; isometric embedding. Mathematics Subject Classiﬁcation 2010: 51P05, 81R60, 83C65

1. Introduction

It is a long held belief in physics that the notion of spacetime as a pseudo-Riemannian manifold requires modification at the Planck scale [34, 38]. Theoretical investigations in recent times strongly supported this view. In particular, the seminal paper [16] by Doplicher, Fredenhagen and Roberts demonstrated mathematically that coordinates of spacetime became noncommutative at the Planck scale, thus some form of noncommutative geometry [13] appeared to be necessary in order to describe the structure of spacetime. This prompted intensive activities in mathematical physics studying various noncommutative generalisations of Einstein's theory of general relativity [1, 3, 5–11, 29, 30]. For reviews on earlier works, we refer


to [31, 35] and references therein. For more recent developments, particularly on the study of noncommutative black holes, see [2, 4, 7, 9, 15, 26, 27, 33, 37]. In joint work with Chaichian and Tureanu [8], we investigated the noncommutative geometry [13, 22] of noncommutative spaces embedded in higher dimensions. We ﬁrst quantized a space by deforming [21, 28] the algebra of functions to a noncommutative associative algebra known as the Moyal algebra. Such an algebra naturally incorporates the generalized spacetime uncertainty relations of [16], capturing key features expected of spacetime at the Planck scale. We then systematically investigated the noncommutative geometry of embedded noncommutative spaces. This was partially motivated by Nash’s isometric embedding theorem [32] and its generalization to pseudo-Riemannian manifolds [12, 19, 23], which state that any (pseudo-) Riemannian manifold can be isometrically embedded in Euclidean or Minkowski spaces. Therefore, in order to study the geometry of spacetime, it suﬃces to investigate (pseudo-) Riemannian manifolds embedded in higher dimensions. Embedded noncommutative spaces also play a role in the study of branes embedded in RD in the context of Yang–Mills matrix models [36]. The theory of [8] was developed within a geometric framework analogous to the classical theory of embedded surfaces (see, e.g., [14]). The present paper further develops the diﬀerential geometry of embedded noncommutative spaces by constructing an algebraic formulation in terms of projective modules, a language commonly adopted in noncommutative geometry [13, 22]. We shall ﬁrst describe the ﬁnitely generated projective modules over a Moyal algebra, which will be regarded as noncommutative vector bundles on a quantized spacetime. We then construct a diﬀerential geometry of the noncommutative vector bundles, developing a theory of connections and curvatures on such bundles. 
In doing this, we make crucial use of a unique property of the Moyal algebra, namely, it has a set of mutually commutative derivations related to the usual partial derivations of functions. Then we apply the noncommutative differential geometry developed to study the embedded noncommutative spaces introduced in [8]. We explicitly construct the projective modules corresponding to the tangent bundles of the noncommutative spaces, and recover from this algebraic formulation the geometric Levi–Civita connections and related curvatures introduced in [8]. This way, the embedded noncommutative spaces of [8] acquire a natural interpretation in the algebraic formalism presented here. Morally, one may regard the very definition of a projective module (a direct summand of a free module) as the geometric equivalent of embedding a low-dimensional manifold isometrically in a higher dimensional one. In the commutative setting of classical (pseudo-) Riemannian geometry, we make this connection more precise and explicit by showing that the projective module description of tangent bundles studied here is a natural consequence of the isometric embedding theorems [12, 19, 23, 32]. This is briefly discussed in Theorem 7.1.


As a concrete example of noncommutative diﬀerential geometries over the Moyal algebra, we study in detail a quantum deformation of a time slice of the Schwarzschild spacetime. The projection operator yielding the tangent bundle is given explicitly, and the corresponding metric is also worked out. As is well known, one of the fundamental principles of general relativity is general covariance. It is important to ﬁnd a noncommutative version of this principle. By analyzing the structure of the Moyal algebra, we show that the noncommutative geometry developed here (initiated in [8]) retains some notion of “general covariance”. Properties of the connection and curvature under general coordinate transformations are described explicitly (see Theorem 5.1). The Moyal algebra (over the real numbers) admits an involution similar to the bar involution in the context of quantum groups. We introduce a particularly nice class of noncommutative vector bundles over the Moyal algebra, which are associated to bar invariant idempotents and endowed with bar hermitian connections (see Sec. 6). In this case, the bar involution takes the left tangent bundles to right tangent bundles. We show that the tangent bundles of embedded noncommutative spaces under a middle condition belong to this class. The organization of the paper is as follows. In Sec. 2, we describe the Moyal algebras and ﬁnitely generated projective modules over them. In Sec. 3, we discuss the diﬀerential geometry of noncommutative vector bundles on quantum spaces corresponding to Moyal algebras. In Sec. 4, we develop the diﬀerential geometry of embedded noncommutative spaces using the language of projective modules. As an explicit example, we study in detail the quantum deformation of a time slice of the Schwarzschild spacetime in Sec. 4.2. In Sec. 5, we study the eﬀect of general coordinate transformations. In Sec. 6, we investigate properties of noncommutative vector bundles under the bar involution of the Moyal algebra. 
Finally, Sec. 7 concludes the paper with some general comments and a discussion of the natural relationship between projective modules and isometric embeddings in classical (pseudo-) Riemannian geometry. Before closing this section, we mention that the theory of [8] has the advantage of being explicit and easy to use for computations. Using this theory, we constructed noncommutative Schwarzschild and Schwarzschild–de Sitter spacetimes in joint work with Wang [37]. Our long term aim is to develop a theoretical framework for studying noncommutative general relativity. A variety of physically motivated methods and techniques were used in the literature to study corrections to general relativity arising from the noncommutativity of the Moyal algebra. In particular, references [1, 3] studied deformations of the diffeomorphism algebra as a means for incorporating noncommutative effects of spacetime, while in [6, 7, 9] a gauge theoretical approach was taken. These approaches differ considerably from the theory of [8, 37] at the mathematical level.


2. Moyal Algebra and Projective Modules

We describe the Moyal algebra of smooth functions on an open region of $\mathbb{R}^n$, and the finitely generated projective modules over the Moyal algebra. This provides the background material needed in later sections, and also serves to fix notations.

We take an open region $U$ in $\mathbb{R}^n$ for a fixed $n$, and write the coordinate of a point $t \in U$ as $(t_1, t_2, \ldots, t_n)$. Let $\bar h$ be a real indeterminate, and denote by $\mathbb{R}[[\bar h]]$ the ring of formal power series in $\bar h$. Let $\mathcal{A}$ be the set of formal power series in $\bar h$ with coefficients being real smooth functions on $U$. Namely, every element of $\mathcal{A}$ is of the form $\sum_{i\ge 0} f_i\,\bar h^i$ where $f_i$ are smooth functions on $U$. Then $\mathcal{A}$ is an $\mathbb{R}[[\bar h]]$-module in the obvious way.

Fix a constant skew symmetric $n \times n$ matrix $\theta = (\theta_{ij})$. The Moyal product on $\mathcal{A}$ corresponding to $\theta$ is a map
\[
\mu : \mathcal{A} \otimes_{\mathbb{R}[[\bar h]]} \mathcal{A} \to \mathcal{A}, \qquad f \otimes g \mapsto \mu(f, g),
\]
defined by
\[
\mu(f, g)(t) = \lim_{t' \to t} \exp\Big(\bar h \sum_{i,j} \theta_{ij}\,\frac{\partial}{\partial t_i}\,\frac{\partial}{\partial t'_j}\Big) f(t)\,g(t'). \tag{2.1}
\]
On the right-hand side, $f(t)g(t')$ means the usual product of the numerical values of the functions $f$ and $g$ at $t$ and $t'$, respectively.

It has been known since the early days of quantum mechanics that the Moyal product is associative (see, e.g., [28] for reference). Thus the $\mathbb{R}[[\bar h]]$-module $\mathcal{A}$ equipped with the Moyal product forms an associative algebra over $\mathbb{R}[[\bar h]]$, which is a deformation of the algebra of smooth functions on $U$ in the sense of [21]. We shall usually denote this associative algebra by $\mathcal{A}$, but when it is necessary to make explicit the multiplication, we shall write it as $(\mathcal{A}, \mu)$.

The partial derivations $\partial_i := \frac{\partial}{\partial t_i}$ with respect to the coordinates $t_i$ for $U$ are $\mathbb{R}[[\bar h]]$-linear maps on $\mathcal{A}$. Since $\theta$ is a constant matrix, the Leibniz rule is valid. Namely, for any elements $f$ and $g$ of $\mathcal{A}$, we have
\[
\partial_i\,\mu(f, g) = \mu(\partial_i f, g) + \mu(f, \partial_i g). \tag{2.2}
\]

Therefore, the ∂i (i = 1, 2, . . . , n) are mutually commutative derivations of the Moyal algebra (A, µ) on U . Remark 2.1. The usual notation in the literature for µ(f, g) is f ∗g. This is referred to as the star-product of f and g. Hereafter, we shall replace µ by ∗ and simply write µ(f, g) as f ∗ g. Following the general philosophy of noncommutative geometry [13], we regard the associative algebra (A, µ) as deﬁning some quantum deformation of the region U , and ﬁnitely generated projective modules over A as (spaces of sections of) noncommutative vector bundles on the quantum deformation of U deﬁned by the noncommutative algebra A. Let us now brieﬂy describe ﬁnitely generated projective modules over A.
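On polynomials the exponential in (2.1) terminates, so the Moyal product, its associativity and the Leibniz rule (2.2) can be checked exactly. The sketch below does this for $n = 2$ in Python (standard library only). The dictionary representation, the function names and the choice $\theta_{12} = -\theta_{21} = 1$ are assumptions made for illustration; they are not taken from the paper.

```python
from fractions import Fraction
from math import factorial

# An element of the algebra (polynomial case, n = 2) is a dict mapping
# (k, a, b) -> coefficient of hbar^k * t1^a * t2^b.
THETA = Fraction(1)  # assumed value of theta_12 = -theta_21

def clean(p):
    return {key: c for key, c in p.items() if c != 0}

def add(p, q):
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0) + c
    return clean(out)

def sub(p, q):
    return add(p, {key: -c for key, c in q.items()})

def falling(n, k):
    """n (n-1) ... (n-k+1): the factor produced by differentiating t^n k times."""
    out = 1
    for i in range(k):
        out *= n - i
    return out

def star(f, g, theta=THETA):
    """Moyal product (2.1) for n = 2, expanded termwise on monomials."""
    out = {}
    for (k1, a1, b1), c1 in f.items():
        for (k2, a2, b2), c2 in g.items():
            # s counts factors d/dt1 (on f) * d/dt2' (on g), weight +theta;
            # q counts factors d/dt2 (on f) * d/dt1' (on g), weight -theta.
            for s in range(min(a1, b2) + 1):
                for q in range(min(b1, a2) + 1):
                    r = s + q
                    coeff = (c1 * c2 * theta**r * (-1)**q
                             * falling(a1, s) * falling(b1, q)
                             * falling(a2, q) * falling(b2, s)
                             / Fraction(factorial(s) * factorial(q)))
                    key = (k1 + k2 + r, a1 + a2 - r, b1 + b2 - r)
                    out[key] = out.get(key, 0) + coeff
    return clean(out)

def d1(p):
    """Partial derivative with respect to t1."""
    return clean({(k, a - 1, b): a * c for (k, a, b), c in p.items() if a > 0})

if __name__ == "__main__":
    t1 = {(0, 1, 0): 1}
    t2 = {(0, 0, 1): 1}
    f = {(0, 2, 1): 1, (0, 0, 1): 3}      # t1^2 t2 + 3 t2
    g = {(0, 1, 2): 1, (1, 1, 0): 1}      # t1 t2^2 + hbar t1
    h = {(0, 1, 0): 1, (0, 0, 3): 1}      # t1 + t2^3
    assert sub(star(t1, t2), star(t2, t1)) == {(1, 0, 0): 2 * THETA}
    assert star(star(f, g), h) == star(f, star(g, h))           # associativity
    assert d1(star(f, g)) == add(star(d1(f), g), star(f, d1(g)))  # Leibniz (2.2)
    print("commutator, associativity and Leibniz rule verified")
```

The first assertion reproduces the familiar relation $t_1 * t_2 - t_2 * t_1 = 2\bar h\,\theta_{12}$ that follows directly from (2.1); the other two are exact checks on the chosen test polynomials, not a proof.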


Given an integer $m > n$, we let ${}_l\mathcal{A}^m$ (respectively, $\mathcal{A}^m_r$) be the set of $m$-tuples with entries in $\mathcal{A}$ written as rows (respectively, columns). We shall regard ${}_l\mathcal{A}^m$ (respectively, $\mathcal{A}^m_r$) as a left (respectively, right) $\mathcal{A}$-module with the action defined by multiplication from the left (respectively, right). More explicitly, for $v = (a_1\ a_2\ \cdots\ a_m) \in {}_l\mathcal{A}^m$ and $b \in \mathcal{A}$, we have $b * v = (b * a_1\ \ b * a_2\ \cdots\ b * a_m)$. Similarly, for
\[
w = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_m \end{pmatrix} \in \mathcal{A}^m_r, \quad\text{we have}\quad
w * b = \begin{pmatrix} a_1 * b \\ a_2 * b \\ \vdots \\ a_m * b \end{pmatrix}.
\]
Let $M_m(\mathcal{A})$ be the set of $(m \times m)$-matrices with entries in $\mathcal{A}$. We define matrix multiplication in the usual way but by using the Moyal product for products of matrix entries, and still denote the corresponding matrix multiplication by $*$. Now for $A = (a_{ij})$ and $B = (b_{ij})$, we have $(A * B) = (c_{ij})$ with $c_{ij} = \sum_k a_{ik} * b_{kj}$. Then $M_m(\mathcal{A})$ is an $\mathbb{R}[[\bar h]]$-algebra, which has a natural left (respectively, right) action on $\mathcal{A}^m_r$ (respectively, ${}_l\mathcal{A}^m$).

A finitely generated projective left (respectively, right) $\mathcal{A}$-module is isomorphic to some direct summand of ${}_l\mathcal{A}^m$ (respectively, $\mathcal{A}^m_r$) for some $m < \infty$. If $e \in M_m(\mathcal{A})$ satisfies the condition $e * e = e$, that is, it is an idempotent, then
\[
\mathcal{M} = {}_l\mathcal{A}^m * e := \{v * e \mid v \in {}_l\mathcal{A}^m\}, \qquad
\tilde{\mathcal{M}} = e * \mathcal{A}^m_r := \{e * w \mid w \in \mathcal{A}^m_r\}
\]
are, respectively, projective left and right $\mathcal{A}$-modules. Furthermore, every projective left (right) $\mathcal{A}$-module is isomorphic to an $\mathcal{M}$ (respectively, $\tilde{\mathcal{M}}$) constructed this way by using some idempotent $e$.

In Sec. 4, we shall give a systematic method for constructing idempotents (see (4.1)). The corresponding noncommutative vector bundles include the tangent bundles of embedded noncommutative spaces introduced in [8], which we shall investigate in depth. An explicit example of embedded noncommutative spaces will be analyzed in detail in Sec. 4.2. To do this, we need to develop some generalities of the differential geometry of noncommutative vector bundles using the language of projective modules over the Moyal algebra.

3. Differential Geometry of Noncommutative Vector Bundles

In this section, we investigate general aspects of the noncommutative differential geometry over the Moyal algebra. We shall focus on the abstract theory here. A large class of examples will be given in Sec. 4, including one which will be worked out in detail. As we shall see, the set of mutually commutative derivations $\partial_i$ ($i = 1, 2, \ldots, n$) of the Moyal algebra $\mathcal{A}$ will play a crucial role in developing the noncommutative differential geometry.

3.1. Connections and curvatures

We start by considering the action of the partial derivations $\partial_i$ on $\mathcal{M}$ and $\tilde{\mathcal{M}}$. We only treat the left module in detail, and present the pertinent results for the right module at the end, since the two cases are similar.


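Let us first specify that $\partial_i$ acts on rectangular matrices with entries in $\mathcal{A}$ by componentwise differentiation. More explicitly,
\[
\partial_i B = \begin{pmatrix}
\partial_i b_{11} & \partial_i b_{12} & \cdots & \partial_i b_{1l} \\
\partial_i b_{21} & \partial_i b_{22} & \cdots & \partial_i b_{2l} \\
\cdots & \cdots & \cdots & \cdots \\
\partial_i b_{k1} & \partial_i b_{k2} & \cdots & \partial_i b_{kl}
\end{pmatrix}
\quad\text{for}\quad
B = \begin{pmatrix}
b_{11} & b_{12} & \cdots & b_{1l} \\
b_{21} & b_{22} & \cdots & b_{2l} \\
\cdots & \cdots & \cdots & \cdots \\
b_{k1} & b_{k2} & \cdots & b_{kl}
\end{pmatrix}.
\]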

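In particular, given any $\zeta = v * e \in \mathcal{M}$, where $v \in {}_l\mathcal{A}^m$ is regarded as a row matrix, we have $\partial_i \zeta = (\partial_i v) * e + v * \partial_i(e)$ by the Leibniz rule. While the first term belongs to $\mathcal{M}$, the second term does not in general. Therefore, the $\partial_i$ ($i = 1, 2, \ldots, n$) send $\mathcal{M}$ to some subspace of ${}_l\mathcal{A}^m$ different from $\mathcal{M}$.

Let $\omega_i \in M_m(\mathcal{A})$ ($i = 1, 2, \ldots, n$) be $(m \times m)$-matrices with entries in $\mathcal{A}$ satisfying the following condition:
\[ e * \omega_i * (1 - e) = -e * \partial_i e, \quad \forall i. \tag{3.1} \]
Define the $\mathbb{R}[[\bar h]]$-linear maps $\nabla_i$ ($i = 1, 2, \ldots, n$) from $\mathcal{M}$ to ${}_l\mathcal{A}^m$ by
\[ \nabla_i \zeta = \partial_i \zeta + \zeta * \omega_i, \quad \forall \zeta \in \mathcal{M}. \]
Then each $\nabla_i$ is a covariant derivative on the noncommutative bundle $\mathcal{M}$ in the sense of Theorem 3.1 below. They together define a connection on $\mathcal{M}$.

Theorem 3.1. The maps $\nabla_i$ ($i = 1, 2, \ldots, n$) have the following properties. For all $\zeta \in \mathcal{M}$ and $a \in \mathcal{A}$,
\[ \nabla_i \zeta \in \mathcal{M} \quad\text{and}\quad \nabla_i(a * \zeta) = \partial_i(a) * \zeta + a * \nabla_i \zeta. \]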

Proof. For any $\zeta \in \mathcal{M}$, we have
\[ \nabla_i(\zeta) * e = \partial_i(\zeta) * e + \zeta * \omega_i * e = \partial_i \zeta + \zeta * (\omega_i * e - \partial_i e), \]
where we have used the Leibniz rule and also the fact that $\zeta * e = \zeta$. Using this latter fact again, we have $\zeta * (\omega_i * e - \partial_i e) = \zeta * (e * \omega_i * e - e * \partial_i e)$, and by the defining property (3.1) of $\omega_i$, we obtain $\zeta * (e * \omega_i * e - e * \partial_i e) = \zeta * \omega_i$. Hence $\nabla_i(\zeta) * e = \partial_i \zeta + \zeta * \omega_i = \nabla_i \zeta$, proving that $\nabla_i \zeta \in \mathcal{M}$. The second part of the theorem immediately follows from the Leibniz rule.

We shall also say that the set of $\omega_i$ ($i = 1, 2, \ldots, n$) is a connection on $\mathcal{M}$. Since $e * \partial_i e = \partial_i(e) * (1 - e)$, one obvious choice for $\omega_i$ is $\omega_i = -\partial_i e$, which we shall refer to as the canonical connection on $\mathcal{M}$.

By inspecting the defining property (3.1) for a connection, we easily see the following result.

Lemma 3.2. If $\omega_i$ ($i = 1, 2, \ldots, n$) define a connection on $\mathcal{M}$, then so do $\omega_i + \phi_i * e$ ($i = 1, 2, \ldots, n$) for any $(m \times m)$-matrices $\phi_i$ with entries in $\mathcal{A}$.
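As a check on the canonical connection, the following short derivation sketches why $\omega_i = -\partial_i e$ satisfies condition (3.1). It uses only $e * e = e$ and the Leibniz rule for $\partial_i$; the layout of the steps is ours and is not taken from the original text.

```latex
\begin{align*}
\partial_i e &= \partial_i(e * e) = \partial_i(e) * e + e * \partial_i e
  \quad\Longrightarrow\quad e * \partial_i e = \partial_i(e) * (1 - e), \\
e * (-\partial_i e) * (1 - e)
  &= -\partial_i(e) * (1 - e) * (1 - e) \\
  &= -\partial_i(e) * (1 - e)
   = -e * \partial_i e .
\end{align*}
```

The second line uses $(1 - e) * (1 - e) = 1 - e$, which holds because $e$ is an idempotent. The same computation shows that adding $\phi_i * e$ to $\omega_i$ does not affect (3.1), since $e * (1 - e) = 0$.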


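For a given connection $\omega_i$ ($i = 1, 2, \ldots, n$), we consider $[\nabla_i, \nabla_j] = \nabla_i \nabla_j - \nabla_j \nabla_i$, with the right-hand side understood as composition of maps on $\mathcal{M}$. By simple calculations we can show that for all $\zeta \in \mathcal{M}$,
\[ [\nabla_i, \nabla_j]\zeta = \zeta * R_{ij} \quad\text{with}\quad R_{ij} := \partial_i \omega_j - \partial_j \omega_i - [\omega_i, \omega_j]_*, \]
where $[\omega_i, \omega_j]_* = \omega_i * \omega_j - \omega_j * \omega_i$ is the commutator. We call $R_{ij}$ the curvature of $\mathcal{M}$ associated with the connection $\omega_i$.

For all $\zeta \in \mathcal{M}$,
\begin{align*}
[\nabla_i, \nabla_j]\nabla_k \zeta &= \partial_k(\zeta) * R_{ij} + \zeta * \omega_k * R_{ij}, \\
\nabla_k [\nabla_i, \nabla_j]\zeta &= \partial_k(\zeta) * R_{ij} + \zeta * (\partial_k R_{ij} + R_{ij} * \omega_k).
\end{align*}
Defining the covariant derivatives of the curvature by
\[ \nabla_k R_{ij} := \partial_k R_{ij} + R_{ij} * \omega_k - \omega_k * R_{ij}, \tag{3.2} \]
we have
\[ [\nabla_k, [\nabla_i, \nabla_j]]\zeta = \zeta * \nabla_k R_{ij}, \quad \forall \zeta \in \mathcal{M}. \]
The Jacobi identity $[\nabla_k, [\nabla_i, \nabla_j]] + [\nabla_j, [\nabla_k, \nabla_i]] + [\nabla_i, [\nabla_j, \nabla_k]] = 0$ leads to
\[ \zeta * (\nabla_k R_{ij} + \nabla_j R_{ki} + \nabla_i R_{jk}) = 0, \quad \forall \zeta \in \mathcal{M}. \]
From this, we immediately see that $e * (\nabla_k R_{ij} + \nabla_j R_{ki} + \nabla_i R_{jk}) = 0$. In fact, the following stronger result holds.

Theorem 3.3. The curvature satisfies the following Bianchi identity:
\[ \nabla_k R_{ij} + \nabla_j R_{ki} + \nabla_i R_{jk} = 0. \]

Proof. The proof is entirely combinatorial. Let
\[ A_{ijk} = \partial_k \partial_i \omega_j - \partial_k \partial_j \omega_i, \qquad B_{ijk} = [\partial_i \omega_j, \omega_k]_* - [\partial_j \omega_i, \omega_k]_*. \]
Then we can express $\nabla_k R_{ij}$ as
\[ \nabla_k R_{ij} = A_{ijk} + B_{ijk} - \partial_k [\omega_i, \omega_j]_* - [[\omega_i, \omega_j]_*, \omega_k]_*. \]
Note that
\begin{align*}
A_{ijk} + A_{jki} + A_{kij} &= 0, \\
B_{ijk} + B_{jki} + B_{kij} &= \partial_k [\omega_i, \omega_j]_* + \partial_i [\omega_j, \omega_k]_* + \partial_j [\omega_k, \omega_i]_*.
\end{align*}
Using these relations together with the Jacobi identity
\[ [[\omega_i, \omega_j]_*, \omega_k]_* + [[\omega_j, \omega_k]_*, \omega_i]_* + [[\omega_k, \omega_i]_*, \omega_j]_* = 0, \]
we easily prove the Bianchi identity.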
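The "simple calculations" behind the curvature formula $[\nabla_i, \nabla_j]\zeta = \zeta * R_{ij}$ can be sketched as follows. This is our own expansion, assuming only the definition $\nabla_i \zeta = \partial_i \zeta + \zeta * \omega_i$, the Leibniz rule, and the commutativity of the derivations $\partial_i$.

```latex
\begin{align*}
\nabla_i \nabla_j \zeta
  &= \partial_i(\partial_j \zeta + \zeta * \omega_j)
     + (\partial_j \zeta + \zeta * \omega_j) * \omega_i \\
  &= \partial_i \partial_j \zeta
     + \partial_i(\zeta) * \omega_j + \partial_j(\zeta) * \omega_i
     + \zeta * \partial_i \omega_j + \zeta * \omega_j * \omega_i, \\
[\nabla_i, \nabla_j]\zeta
  &= \zeta * \bigl(\partial_i \omega_j - \partial_j \omega_i
     + \omega_j * \omega_i - \omega_i * \omega_j\bigr)
   = \zeta * R_{ij}.
\end{align*}
```

The terms $\partial_i \partial_j \zeta$ and $\partial_i(\zeta) * \omega_j + \partial_j(\zeta) * \omega_i$ are symmetric in $i$ and $j$, so they cancel in the commutator.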


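3.2. Gauge transformations

Let $GL_m(\mathcal{A})$ be the group of invertible $(m \times m)$-matrices with entries in $\mathcal{A}$. Let $G$ be the subgroup defined by
\[ G = \{ g \in GL_m(\mathcal{A}) \mid e * g = g * e \}, \tag{3.3} \]
which will be referred to as the gauge group. There is a right action of $G$ on $\mathcal{M}$ defined, for any $\zeta \in \mathcal{M}$ and $g \in G$, by $\zeta \times g \mapsto \zeta \cdot g := \zeta * g$, where the right side is defined by matrix multiplication. Clearly, $\zeta * g * e = \zeta * g$. Hence $\zeta * g \in \mathcal{M}$, and we indeed have a $G$-action on $\mathcal{M}$.

For a given $g \in G$, let
\[ \omega_i^g = g^{-1} * \omega_i * g - g^{-1} * \partial_i g. \tag{3.4} \]
Then
\[ e * \omega_i^g * (1 - e) = g^{-1} * e * \omega_i * (1 - e) * g - g^{-1} * e * \partial_i(g) * (1 - e). \]
By (3.1),
\begin{align*}
g^{-1} * e * \omega_i * (1 - e) * g
  &= -g^{-1} * e * \partial_i(e) * g \\
  &= -g^{-1} * e * \partial_i(e * g) + g^{-1} * e * \partial_i g \\
  &= -g^{-1} * e * \partial_i(g) * e - e * \partial_i e + g^{-1} * e * \partial_i g \\
  &= -e * \partial_i e + g^{-1} * e * \partial_i(g) * (1 - e).
\end{align*}
Therefore, $e * \omega_i^g * (1 - e) = -e * \partial_i e$. This shows that the $\omega_i^g$ satisfy the condition (3.1), and thus form a connection on $\mathcal{M}$.

Now for any given $g \in G$, define the maps $\nabla_i^g$ on $\mathcal{M}$ by
\[ \nabla_i^g \zeta = \partial_i \zeta + \zeta * \omega_i^g, \quad \forall \zeta. \]
Also, let $R_{ij}^g = \partial_i \omega_j^g - \partial_j \omega_i^g - [\omega_i^g, \omega_j^g]_*$ be the curvature corresponding to the connection $\omega_i^g$. Then we have the following result.

Lemma 3.4. Under a gauge transformation induced by $g \in G$,
\[ \nabla_i^g(\zeta * g) = \nabla_i(\zeta) * g, \quad \forall \zeta \in \mathcal{M}; \qquad R_{ij}^g = g^{-1} * R_{ij} * g. \]

Proof. Note that
\[ \nabla_i^g(\zeta * g) = \partial_i(\zeta) * g + \zeta * \partial_i g + \zeta * g * \omega_i^g = (\partial_i \zeta + \zeta * \omega_i) * g. \]
This proves the first formula.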


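To prove the second claim, we use the following formulae:
\begin{align*}
\partial_i \omega_j^g - \partial_j \omega_i^g
  &= g^{-1} * (\partial_i \omega_j - \partial_j \omega_i) * g
     - \partial_i(g^{-1}) * \partial_j g + \partial_j(g^{-1}) * \partial_i g \\
  &\quad + [\partial_i(g^{-1}) * g,\ g^{-1} * \omega_j * g]_*
     - [\partial_j(g^{-1}) * g,\ g^{-1} * \omega_i * g]_*; \\
[\omega_i^g, \omega_j^g]_*
  &= g^{-1} * [\omega_i, \omega_j]_* * g
     - \partial_i(g^{-1}) * \partial_j g + \partial_j(g^{-1}) * \partial_i g \\
  &\quad + [\partial_i(g^{-1}) * g,\ g^{-1} * \omega_j * g]_*
     - [\partial_j(g^{-1}) * g,\ g^{-1} * \omega_i * g]_*.
\end{align*}
Combining these formulae, we obtain $R_{ij}^g = g^{-1} * R_{ij} * g$. This completes the proof of the lemma.

3.3. Vector bundles associated to right projective modules

Connections and curvatures can be introduced for the right bundle $\tilde{\mathcal{M}} = e * \mathcal{A}^m_r$ in much the same way. Let $\tilde\omega_i \in M_m(\mathcal{A})$ ($i = 1, 2, \ldots, n$) be matrices satisfying the condition that
\[ (1 - e) * \tilde\omega_i * e = \partial_i(e) * e. \tag{3.5} \]
Then we can introduce a connection consisting of the right covariant derivatives $\tilde\nabla_i$ ($i = 1, 2, \ldots, n$) on $\tilde{\mathcal{M}}$ defined by
\[ \tilde\nabla_i : \tilde{\mathcal{M}} \to \tilde{\mathcal{M}}, \qquad \xi \mapsto \tilde\nabla_i \xi = \partial_i \xi - \tilde\omega_i * \xi. \]
It is easy to show that $\tilde\nabla_i(\xi * a) = \tilde\nabla_i(\xi) * a + \xi * \partial_i a$ for all $a \in \mathcal{A}$.

Note that if $\tilde\omega_i$ is equal to $\partial_i e$ for each $i$, the condition (3.5) is satisfied. We call them the canonical connection on $\tilde{\mathcal{M}}$.

Returning to a general connection $\tilde\omega_i$, we define the associated curvature by
\[ \tilde R_{ij} = \partial_i \tilde\omega_j - \partial_j \tilde\omega_i - [\tilde\omega_i, \tilde\omega_j]_*. \]
Then for all $\xi \in \tilde{\mathcal{M}}$, we have
\[ [\tilde\nabla_i, \tilde\nabla_j]\xi = -\tilde R_{ij} * \xi. \]
We further define the covariant derivatives of $\tilde R_{ij}$ by
\[ \tilde\nabla_k \tilde R_{ij} = \partial_k \tilde R_{ij} + \tilde\omega_k * \tilde R_{ij} - \tilde R_{ij} * \tilde\omega_k. \]
Then we have the following result.

Lemma 3.5. The curvature on the right bundle $\tilde{\mathcal{M}}$ satisfies the Bianchi identity
\[ \tilde\nabla_i \tilde R_{jk} + \tilde\nabla_j \tilde R_{ki} + \tilde\nabla_k \tilde R_{ij} = 0. \]

By direct calculations we can also prove the following result:
\[ [\tilde\nabla_k, [\tilde\nabla_i, \tilde\nabla_j]]\xi = -\tilde\nabla_k(\tilde R_{ij}) * \xi, \quad \forall \xi \in \tilde{\mathcal{M}}. \]

Consider the gauge group $G$ defined by (3.3), which has a right action on $\tilde{\mathcal{M}}$:
\[ \tilde{\mathcal{M}} \times G \to \tilde{\mathcal{M}}, \qquad \xi \times g \mapsto \xi \cdot g := g^{-1} * \xi. \]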


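Under a gauge transformation induced by $g \in G$,
\[ \tilde\omega_i \mapsto \tilde\omega_i^g := g^{-1} * \tilde\omega_i * g + \partial_i(g^{-1}) * g. \]
The connection $\tilde\nabla_i^g$ on $\tilde{\mathcal{M}}$ defined by
\[ \tilde\nabla_i^g \xi = \partial_i \xi - \tilde\omega_i^g * \xi \]
satisfies the following relation for all $\xi \in \tilde{\mathcal{M}}$:
\[ \tilde\nabla_i^g(g^{-1} * \xi) = g^{-1} * \tilde\nabla_i \xi. \]
Furthermore, the gauge transformed curvature
\[ \tilde R_{ij}^g := \partial_i \tilde\omega_j^g - \partial_j \tilde\omega_i^g - [\tilde\omega_i^g, \tilde\omega_j^g]_* \]
is related to $\tilde R_{ij}$ by
\[ \tilde R_{ij}^g = g^{-1} * \tilde R_{ij} * g. \]

Given any $\Lambda \in M_m(\mathcal{A})$, we can define the $\mathcal{A}$-bimodule map
\[ \langle\ ,\ \rangle : \mathcal{M} \otimes_{\mathbb{R}[[\bar h]]} \tilde{\mathcal{M}} \to \mathcal{A}, \qquad \zeta \otimes \xi \mapsto \langle \zeta, \xi \rangle = \zeta * \Lambda * \xi, \tag{3.6} \]
where $\zeta * \Lambda * \xi$ is defined by matrix multiplication. We shall say that the bimodule homomorphism is gauge invariant if for any element $g$ of the gauge group $G$,
\[ \langle \zeta \cdot g, \xi \cdot g \rangle = \langle \zeta, \xi \rangle, \quad \forall \zeta \in \mathcal{M},\ \xi \in \tilde{\mathcal{M}}. \]
Also, the bimodule homomorphism is said to be compatible with the connections $\omega_i$ on $\mathcal{M}$ and $\tilde\omega_i$ on $\tilde{\mathcal{M}}$ if for all $i = 1, 2, \ldots, n$,
\[ \partial_i \langle \zeta, \xi \rangle = \langle \nabla_i \zeta, \xi \rangle + \langle \zeta, \tilde\nabla_i \xi \rangle, \quad \forall \zeta \in \mathcal{M},\ \xi \in \tilde{\mathcal{M}}. \]

Lemma 3.6. Let $\langle\ ,\ \rangle : \mathcal{M} \otimes_{\mathbb{R}[[\bar h]]} \tilde{\mathcal{M}} \to \mathcal{A}$ be an $\mathcal{A}$-bimodule homomorphism defined by (3.6) with a given $(m \times m)$-matrix $\Lambda$ with entries in $\mathcal{A}$. Then
(1) $\langle\ ,\ \rangle$ is gauge invariant if $g * \Lambda * g^{-1} = \Lambda$ for all $g \in G$;
(2) $\langle\ ,\ \rangle$ is compatible with the connections $\omega_i$ on $\mathcal{M}$ and $\tilde\omega_i$ on $\tilde{\mathcal{M}}$ if for all $i$,
\[ e * (\partial_i \Lambda - \omega_i * \Lambda + \Lambda * \tilde\omega_i) * e = 0. \]

Proof. Note that $\langle \zeta \cdot g, \xi \cdot g \rangle = \zeta * g * \Lambda * g^{-1} * \xi$ for any $g \in G$, $\zeta \in \mathcal{M}$ and $\xi \in \tilde{\mathcal{M}}$. Therefore $\langle \zeta \cdot g, \xi \cdot g \rangle = \langle \zeta, \xi \rangle$ if $g * \Lambda * g^{-1} = \Lambda$. This proves part (1).

Now $\partial_i \langle \zeta, \xi \rangle = \langle \nabla_i \zeta, \xi \rangle + \langle \zeta, \tilde\nabla_i \xi \rangle + \zeta * (\partial_i \Lambda - \omega_i * \Lambda + \Lambda * \tilde\omega_i) * \xi$. Since $\zeta = \zeta * e$ and $\xi = e * \xi$, the last term vanishes if $\Lambda$ satisfies the condition of part (2), and then $\langle\ ,\ \rangle$ is compatible with the connections.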
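The identity used in the proof of part (2) of Lemma 3.6 follows from the Leibniz rule. The expansion below is a sketch of that step in our own layout, assuming the definitions $\nabla_i \zeta = \partial_i \zeta + \zeta * \omega_i$ and $\tilde\nabla_i \xi = \partial_i \xi - \tilde\omega_i * \xi$ and the pairing $\zeta * \Lambda * \xi$ of (3.6).

```latex
\begin{align*}
\partial_i(\zeta * \Lambda * \xi)
  &= \partial_i(\zeta) * \Lambda * \xi
     + \zeta * \partial_i(\Lambda) * \xi
     + \zeta * \Lambda * \partial_i \xi, \\
(\nabla_i \zeta) * \Lambda * \xi + \zeta * \Lambda * (\tilde\nabla_i \xi)
  &= \partial_i(\zeta) * \Lambda * \xi + \zeta * \Lambda * \partial_i \xi
     + \zeta * (\omega_i * \Lambda - \Lambda * \tilde\omega_i) * \xi, \\
\text{difference}
  &= \zeta * (\partial_i \Lambda - \omega_i * \Lambda + \Lambda * \tilde\omega_i) * \xi \\
  &= \zeta * e * (\partial_i \Lambda - \omega_i * \Lambda + \Lambda * \tilde\omega_i) * e * \xi .
\end{align*}
```

The last equality uses $\zeta * e = \zeta$ for elements of the left module and $e * \xi = \xi$ for elements of the right module, which is why the compatibility condition only needs to hold after multiplying by $e$ on both sides.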


3.4. Canonical connections and fiber metric

Let us consider in detail the canonical connections on $\mathcal{M}$ and $\tilde{\mathcal{M}}$ given by
\[ \omega_i = -\partial_i e, \qquad \tilde\omega_i = \partial_i e. \]
A particularly nice feature in this case is that the corresponding curvatures on the left and right bundles coincide. We have the following formula:
\[ R_{ij} = \tilde R_{ij} = -[\partial_i e, \partial_j e]_*. \]
Now we consider a special case of the $\mathcal{A}$-bimodule map defined by Eq. (3.6).

Definition 3.7. Denote by $g : \mathcal{M} \otimes_{\mathbb{R}[[\bar h]]} \tilde{\mathcal{M}} \to \mathcal{A}$ the map defined by (3.6) with $\Lambda$ being the identity matrix. We shall call $g$ the fiber metric on $\mathcal{M}$.

Lemma 3.8. The fiber metric $g$ is gauge invariant and is compatible with the canonical connections.

Proof. Since $\Lambda$ is the identity matrix in the present case, it immediately follows from Lemma 3.6(1) that $g$ is gauge invariant. Note that $e * \partial_i(e) * e = 0$ for all $i$. Using this fact in Lemma 3.6(2), we easily see that $g$ is compatible with the canonical connections.

4. Embedded Noncommutative Spaces

In this section, we study explicit examples of idempotents and related projective modules. They correspond to the noncommutative spaces introduced in [8]. The main result here is a reformulation of the theory of embedded noncommutative spaces [8] in the framework of Sec. 3 in terms of projective modules.

4.1. Embedded noncommutative spaces

We shall consider only embedded spaces with Euclidean signature. The Minkowski case is similar, and we shall briefly allude to it in Remark 4.6 at the end of this section. Given $X = (X^1\ X^2\ \cdots\ X^m)$ in ${}_l\mathcal{A}^m$, we define an $(n \times n)$-matrix $(g_{ij})_{i,j=1,2,\ldots,n}$ with entries given by
\[ g_{ij} = \sum_{\alpha=1}^m \partial_i X^\alpha * \partial_j X^\alpha. \]
Following [8], we shall call $X$ a noncommutative space embedded in $\mathcal{A}^m$ if the matrix $(g_{ij})$ is invertible. For a given noncommutative space $X$, we denote by $(g^{ij})$ the inverse matrix of $(g_{ij})$, with $g_{ij} * g^{jk} = g^{kj} * g_{ji} = \delta_i^k$ for all $i$ and $k$. Here Einstein's summation convention is used, and we shall continue to use this convention throughout the paper. Let
\[ E_i = \partial_i X, \qquad \tilde E^i = (E_j)^t * g^{ji}, \qquad E^i = g^{ij} * E_j, \]
for $i = 1, 2, \ldots, n$, where
\[ (E_i)^t = \begin{pmatrix} \partial_i X^1 \\ \partial_i X^2 \\ \vdots \\ \partial_i X^m \end{pmatrix} \]
denotes the transpose of $E_i$. Define $e \in M_m(\mathcal{A})$ by
\[
e := \tilde E^j * E_j =
\begin{pmatrix}
\partial_i X^1 * g^{ij} * \partial_j X^1 & \partial_i X^1 * g^{ij} * \partial_j X^2 & \cdots & \partial_i X^1 * g^{ij} * \partial_j X^m \\
\partial_i X^2 * g^{ij} * \partial_j X^1 & \partial_i X^2 * g^{ij} * \partial_j X^2 & \cdots & \partial_i X^2 * g^{ij} * \partial_j X^m \\
\cdots & \cdots & \cdots & \cdots \\
\partial_i X^m * g^{ij} * \partial_j X^1 & \partial_i X^m * g^{ij} * \partial_j X^2 & \cdots & \partial_i X^m * g^{ij} * \partial_j X^m
\end{pmatrix}.
\tag{4.1}
\]

We have the following results.

Proposition 4.1. (1) Under matrix multiplication, $E_i * \tilde E^j = \delta_i^j$ for all $i$ and $j$.
(2) The $(m \times m)$-matrix $e$ satisfies $e * e = e$, that is, it is an idempotent in $M_m(\mathcal{A})$.
(3) The left and right projective $\mathcal{A}$-modules $\mathcal{M} = {}_l\mathcal{A}^m * e$ and $\tilde{\mathcal{M}} = e * \mathcal{A}^m_r$ are respectively spanned by $E_i$ and $\tilde E^i$. More precisely, we have
\[ \mathcal{M} = \{ a^i * E_i \mid a^i \in \mathcal{A} \}, \qquad \tilde{\mathcal{M}} = \{ \tilde E^i * b_i \mid b_i \in \mathcal{A} \}. \]

Proof. Note that $g_{ij} = E_i * (E_j)^t$. Thus $E_i * \tilde E^j = E_i * (E_k)^t * g^{kj} = \delta_i^j$. It then immediately follows that
\[ e * e = \tilde E^i * (E_i * \tilde E^j) * E_j = \tilde E^i * \delta_i^j * E_j = e. \]
Obviously, $\mathcal{M} \subset \{ a^i * E_i \mid a^i \in \mathcal{A} \}$ and $\tilde{\mathcal{M}} \subset \{ \tilde E^i * b_i \mid b_i \in \mathcal{A} \}$. By the first part of the proposition, we have
\begin{align*}
a^i * E_i * e &= a^i * (E_i * \tilde E^j) * E_j = a^j * E_j, \\
e * \tilde E^j * b_j &= \tilde E^i * (E_i * \tilde E^j) * b_j = \tilde E^i * b_i.
\end{align*}
This proves the last claim of the proposition.

It is also useful to observe that $\tilde{\mathcal{M}} = \{ (E_i)^t * b^i \mid b^i \in \mathcal{A} \}$ since $(g_{ij})$ is invertible.

We shall denote $\mathcal{M}$ and $\tilde{\mathcal{M}}$ respectively by $TX$ and $\tilde TX$, and refer to them as the left and right tangent bundles of the noncommutative space $X$. Note that the definition of the tangent bundles coincides with that in [8].

Definition 4.2. Call the fiber metric $g : TX \otimes_{\mathbb{R}[[\bar h]]} \tilde TX \to \mathcal{A}$ defined in Definition 3.7 the metric of the noncommutative space $X$.

The proposition below in particular shows that $g$ agrees with the metric of the embedded noncommutative space defined in [8] in a geometric setting.


Proposition 4.3. For any $\zeta = a^i * E_i \in TX$ and $\xi = (E_j)^t * b^j \in \tilde TX$ with $a^i, b^j \in \mathcal{A}$,
\[ g : \zeta \otimes \xi \mapsto g(\zeta, \xi) = a^i * g_{ij} * b^j. \]
In particular, $g(E_i, (E_j)^t) = g_{ij}$.

Proof. Recall from Definition 3.7 that $g$ is defined by (3.6) with $\Lambda$ being the identity matrix. Thus for any $\zeta = a^i * E_i \in TX$ and $\xi = (E_j)^t * b^j \in \tilde TX$ with $a^i, b^j \in \mathcal{A}$,
\[ g(\zeta, \xi) = a^i * E_i * (E_j)^t * b^j = a^i * g_{ij} * b^j. \]
This completes the proof.

Let us now equip the left and right tangent bundles with the canonical connections given by $\omega_i = -\tilde\omega_i = -\partial_i e$, and denote the corresponding covariant derivatives by
\[ \nabla_i : TX \to TX, \qquad \tilde\nabla_i : \tilde TX \to \tilde TX. \]
In principle, one can take arbitrary connections for the tangent bundles, but we shall not allow this option in this paper.

The following elements of $\mathcal{A}$ are defined in [8]:
\begin{align*}
{}_c\Gamma_{ijl} &= \tfrac{1}{2}\bigl(\partial_i g_{jl} + \partial_j g_{li} - \partial_l g_{ji}\bigr), &
\Upsilon_{ijl} &= \tfrac{1}{2}\bigl(\partial_i(E_j) * (E_l)^t - E_l * \partial_i(E_j)^t\bigr), \\
\Gamma_{ijl} &= {}_c\Gamma_{ijl} + \Upsilon_{ijl}, &
\tilde\Gamma_{ijl} &= {}_c\Gamma_{ijl} - \Upsilon_{ijl},
\end{align*}
where $\Upsilon_{ijk}$ was referred to as the noncommutative torsion. Set [8]
\[ \Gamma^k_{ij} = \Gamma_{ijl} * g^{lk}, \qquad \tilde\Gamma^k_{ij} = g^{kl} * \tilde\Gamma_{ijl}. \tag{4.2} \]
Then we have the following result.

Lemma 4.4.
\[ \nabla_i E_j = \Gamma^k_{ij} * E_k, \qquad \tilde\nabla_i \tilde E^j = -\tilde E^k * \Gamma^j_{ki}. \tag{4.3} \]

Proof. Consider the first formula. Write $\partial_i e = \partial_i(\tilde E^k) * E_k + \tilde E^k * \partial_i E_k$. We have
\[ \nabla_i E_j = \partial_i E_j - E_j * \partial_i e = \partial_i E_j - \bigl(\partial_i(E_j * e) - \partial_i(E_j) * e\bigr) = \partial_i(E_j) * \tilde E^k * E_k. \]
It was shown in [8] that $\Gamma^k_{ij} = \partial_i(E_j) * \tilde E^k$. This immediately leads to the first formula. The proof of the second formula is essentially the same.

Note that Lemma 4.4 can be restated as
\[ \nabla_i E^j = -\tilde\Gamma^j_{ik} * E^k, \qquad \tilde\nabla_i (E_j)^t = (E_k)^t * \tilde\Gamma^k_{ij}. \]
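The proof of the second formula in Lemma 4.4 is only indicated in the text. One way it can be carried out is sketched below; this is our own derivation, assuming the canonical right connection $\tilde\omega_i = \partial_i e$, the relation $E_k * \tilde E^j = \delta_k^j$ from Proposition 4.1, and the expression $\Gamma^j_{ik} = \partial_i(E_k) * \tilde E^j$ quoted from [8].

```latex
\begin{align*}
\tilde\nabla_i \tilde E^j
  &= \partial_i \tilde E^j - \partial_i(e) * \tilde E^j
   = \partial_i \tilde E^j
     - \bigl(\partial_i(e * \tilde E^j) - e * \partial_i \tilde E^j\bigr)
   = e * \partial_i \tilde E^j
   = \tilde E^k * E_k * \partial_i \tilde E^j, \\
0 &= \partial_i(E_k * \tilde E^j)
   = \partial_i(E_k) * \tilde E^j + E_k * \partial_i \tilde E^j
  \quad\Longrightarrow\quad
  E_k * \partial_i \tilde E^j = -\Gamma^j_{ik}.
\end{align*}
```

The first line uses $e * \tilde E^j = \tilde E^j$. Since $\partial_i E_k = \partial_i \partial_k X = \partial_k E_i$, the symbol $\Gamma^j_{ik}$ is symmetric in its lower indices, so the result can be written as $\tilde\nabla_i \tilde E^j = -\tilde E^k * \Gamma^j_{ki}$, in agreement with (4.3).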


By using Lemmas 3.8 and 4.4, we can easily prove the following result, which is equivalent to [8, Proposition 2.7].

Proposition 4.5. The connections are metric compatible in the sense that
\[ \partial_i g(\zeta, \xi) = g(\nabla_i \zeta, \xi) + g(\zeta, \tilde\nabla_i \xi), \quad \forall \zeta \in TX,\ \xi \in \tilde TX. \tag{4.4} \]

For $\zeta = E_j$ and $\xi = (E_k)^t$, we obtain from (4.4) the following result for all $i, j, k$:
\[ \partial_i g_{jk} - \Gamma_{ijk} - \tilde\Gamma_{ikj} = 0. \tag{4.5} \]
This formula is in fact equivalent to Proposition 4.5.

Define
\[ R^l_{kij} = E_k * R_{ij} * \tilde E^l, \qquad \tilde R^l_{kij} = -g^{lq} * E_q * R_{ij} * \tilde E^p * g_{pk}. \tag{4.6} \]
Using $R_{ij} = \tilde R_{ij} = -[\partial_i e, \partial_j e]_*$, we can show by some lengthy calculations that
\begin{align*}
R^l_{kij} &= -\partial_j \Gamma^l_{ik} - \Gamma^p_{ik} * \Gamma^l_{jp} + \partial_i \Gamma^l_{jk} + \Gamma^p_{jk} * \Gamma^l_{ip}, \\
\tilde R^l_{kij} &= -\partial_j \tilde\Gamma^l_{ik} - \tilde\Gamma^l_{jp} * \tilde\Gamma^p_{ik} + \partial_i \tilde\Gamma^l_{jk} + \tilde\Gamma^l_{ip} * \tilde\Gamma^p_{jk},
\end{align*}
\[ \tag{4.7} \]
which are the Riemannian curvatures of the left and right tangent bundles of the noncommutative space $X$ given in [8, Lemma 2.12 and §4]. Therefore,
\[ [\nabla_i, \nabla_j] E_k = R^l_{kij} * E_l, \qquad [\tilde\nabla_i, \tilde\nabla_j](E_k)^t = (E_l)^t * \tilde R^l_{kij}, \tag{4.8} \]
recovering the relations [8, (2.13)] and their generalizations [8, §4] to arbitrary $m \ge n$.

Remark 4.6. We comment briefly on noncommutative spaces with Minkowski signatures embedded in higher dimensions [8]. Let $\eta = \mathrm{diag}(-1, \ldots, -1, 1, \ldots, 1)$ be a diagonal $(m \times m)$-matrix with $p$ of the diagonal entries being $-1$, and $q = m - p$ of them being $1$. Given $X = (X^1\ X^2\ \cdots\ X^m)$ in ${}_l\mathcal{A}^m$, we define an $(n \times n)$-matrix $(g_{ij})_{i,j=1,2,\ldots,n}$ with entries
\[ g_{ij} = \sum_{\alpha,\beta=1}^m \partial_i X^\alpha * \eta_{\alpha\beta} * \partial_j X^\beta. \]
We call $X$ a noncommutative space embedded in $\mathcal{A}^m$ if the matrix $(g_{ij})$ is invertible. Denote its inverse matrix by $(g^{ij})$. Now the idempotent which gives rise to the left and right tangent bundles of $X$ is given by
\[ e = \eta (E_i)^t * g^{ij} * E_j, \]
which obviously satisfies $E_i * e = E_i$ for all $i$. The fiber metric of Definition 3.7 yields a metric on the embedded noncommutative surface $X$.
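For the left tangent bundle in the Euclidean setting, the first formula in (4.7) and the first relation in (4.8) can be connected by a short computation. The sketch below is ours; it assumes Lemma 4.4, the Leibniz property of Theorem 3.1, and $E_l * \tilde E^k = \delta_l^k$ from Proposition 4.1.

```latex
\begin{align*}
[\nabla_i, \nabla_j] E_k
  &= \nabla_i(\Gamma^p_{jk} * E_p) - \nabla_j(\Gamma^p_{ik} * E_p) \\
  &= \bigl(\partial_i \Gamma^l_{jk} + \Gamma^p_{jk} * \Gamma^l_{ip}
     - \partial_j \Gamma^l_{ik} - \Gamma^p_{ik} * \Gamma^l_{jp}\bigr) * E_l, \\
[\nabla_i, \nabla_j] E_k
  &= E_k * R_{ij} = E_k * R_{ij} * e
   = (E_k * R_{ij} * \tilde E^l) * E_l = R^l_{kij} * E_l .
\end{align*}
```

In the second display, $E_k * R_{ij} * e = E_k * R_{ij}$ because $[\nabla_i, \nabla_j]$ maps $TX$ to itself. The coefficients of $E_l$ can be compared because $a^l * E_l = 0$ implies $a^k = a^l * E_l * \tilde E^k = 0$.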


4.2. Example

We analyze an embedded noncommutative surface of Euclidean signature arising from the quantisation of a time slice of the Schwarzschild spacetime. While the main purpose here is to illustrate how the general theory developed in previous sections works, the example is interesting in its own right.

Let us first specify the notation to be used in this section. Let $t^1 = r$, $t^2 = \theta$ and $t^3 = \phi$, with $r > 2m$, $\theta \in (0, \pi)$, and $\phi \in (0, 2\pi)$. We deform the algebra of functions in these variables by imposing the Moyal product defined by (2.1) with the following anti-symmetric matrix:
\[ (\theta_{ij})_{i,j=1}^3 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}. \]
Note that the functions depending only on the variable $r$ are central in the Moyal algebra $\mathcal{A}$. We shall write the usual pointwise product of two functions $f$ and $g$ as $fg$, but write their Moyal product as $f * g$.

Consider $X = (X^1\ X^2\ X^3\ X^4)$ given by
\begin{align*}
X^1 &= f(r) \quad\text{with}\quad (f')^2 + 1 = \Bigl(1 - \frac{2m}{r}\Bigr)^{-1}, \\
X^2 &= r \sin\theta \cos\phi, \qquad X^3 = r \sin\theta \sin\phi, \qquad X^4 = r \cos\theta.
\end{align*}
\[ \tag{4.9} \]
Simple calculations yield
\begin{align*}
E_1 &= \partial_r X = (\, f' \quad \sin\theta\cos\phi \quad \sin\theta\sin\phi \quad \cos\theta \,), \\
E_2 &= \partial_\theta X = (\, 0 \quad r\cos\theta\cos\phi \quad r\cos\theta\sin\phi \quad -r\sin\theta \,), \\
E_3 &= \partial_\phi X = (\, 0 \quad -r\sin\theta\sin\phi \quad r\sin\theta\cos\phi \quad 0 \,).
\end{align*}
Using these formulae, we obtain the following expressions for the components of the metric of the noncommutative surface $X$:
\begin{align*}
g_{11} &= \Bigl(1 - \frac{2m}{r}\Bigr)^{-1} \Bigl[ 1 - \Bigl(1 - \frac{2m}{r}\Bigr) \cos(2\theta) \sinh^2 \bar h \Bigr], \\
g_{12} &= g_{21} = r \sin(2\theta) \sinh^2 \bar h, \\
g_{22} &= r^2 \bigl[ 1 + \cos(2\theta) \sinh^2 \bar h \bigr], \\
g_{23} &= -g_{32} = -r^2 \cos(2\theta) \sinh \bar h \cosh \bar h, \\
g_{13} &= -g_{31} = -r \sin(2\theta) \sinh \bar h \cosh \bar h, \\
g_{33} &= r^2 \bigl[ \sin^2\theta - \cos(2\theta) \sinh^2 \bar h \bigr].
\end{align*}
\[ \tag{4.10} \]
In the limit $\bar h \to 0$, we recover the spatial components of the Schwarzschild metric. Observe that the noncommutative surface still reflects the characteristics of the Schwarzschild spacetime in that there is a time slice of the Schwarzschild black hole with the event horizon at $r = 2m$.
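The component $g_{11}$ in (4.10) can be checked by hand. The sketch below is our own computation and rests on an assumption about the form of the Moyal product (2.1), which is defined outside this section: we assume that on plane waves it acts as stated in the first line, as it would for a product of the exponential form $\exp\bigl(\bar h\,\theta_{ij}\,\partial_i \otimes \partial_j\bigr)$. Replacing $\bar h$ by $-\bar h$ does not change the result for $g_{11}$.

```latex
\begin{align*}
e^{i(p\theta + q\phi)} * e^{i(p'\theta + q'\phi)}
  &= e^{-\bar h (p q' - q p')}\, e^{i((p+p')\theta + (q+q')\phi)}
  \qquad\text{(assumed form of (2.1))}, \\
\sin\theta\cos\phi &= \frac{1}{4i} \sum_{p,q=\pm 1} p\, e^{i(p\theta + q\phi)}, \qquad
\sin\theta\sin\phi = -\frac{1}{4} \sum_{p,q=\pm 1} p q\, e^{i(p\theta + q\phi)}, \\
(\sin\theta\cos\phi) * (\sin\theta\cos\phi) &+ (\sin\theta\sin\phi) * (\sin\theta\sin\phi) \\
  &= \frac{1}{16} \sum_{p,q,p',q'=\pm 1} p p' (q q' - 1)\,
     e^{-\bar h (p q' - q p')}\, e^{i((p+p')\theta + (q+q')\phi)} \\
  &= -\frac{1}{8} \sum_{p,p',q=\pm 1} p p'\, e^{\bar h q (p + p')}\, e^{i(p+p')\theta}
   = \frac{1}{2} - \frac{1}{2} \cosh(2\bar h) \cos(2\theta), \\
g_{11} &= (f')^2 + \frac{1}{2} - \frac{1}{2}\cosh(2\bar h)\cos(2\theta) + \cos^2\theta
   = (f')^2 + 1 - \cos(2\theta) \sinh^2 \bar h .
\end{align*}
```

Only the terms with $q' = -q$ survive in the sum, and $\cos\theta * \cos\theta = \cos^2\theta$ because both factors depend on $\theta$ alone. Inserting $(f')^2 + 1 = (1 - 2m/r)^{-1}$ from (4.9) reproduces the expression for $g_{11}$ in (4.10).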


Since the metric $(g_{ij})$ depends on $\theta$ and $r$ only, and the two variables commute, the inverse $(g^{ij})$ of the metric can be calculated in the usual way as in the commutative case. Now the components of the idempotent $e = (e_{ij}) = (E_i)^t * g^{ij} * E_j$ are given by the following formulae:
\begin{align*}
e_{11} &= \frac{2m}{r} + \frac{2m(2m - r)(2 + \cos 2\theta)}{r^2}\,\bar h^2 + O(\bar h^3), \\
e_{12} &= \frac{m \cos\phi \sin\theta}{r\sqrt{\frac{m}{-4m+2r}}} - \frac{2m \cos\theta \sin\phi}{r\sqrt{\frac{m}{-4m+2r}}}\,\bar h + \frac{m(4m + r + 2m\cos 2\theta)\cos\phi \sin\theta}{r^2\sqrt{\frac{m}{-4m+2r}}}\,\bar h^2 + O(\bar h^3), \\
e_{13} &= \frac{m \sin\theta \sin\phi}{r\sqrt{\frac{m}{-4m+2r}}} + \frac{2m \cos\theta \cos\phi}{r\sqrt{\frac{m}{-4m+2r}}}\,\bar h + \frac{m(4m + r + 2m\cos 2\theta)\sin\theta \sin\phi}{r^2\sqrt{\frac{m}{-4m+2r}}}\,\bar h^2 + O(\bar h^3), \\
e_{14} &= \frac{m \cos\theta}{r\sqrt{\frac{m}{-4m+2r}}} + \frac{m \cos\theta\,(4m - r + 2m\cos 2\theta)}{r^2\sqrt{\frac{m}{-4m+2r}}}\,\bar h^2 + O(\bar h^3), \\
e_{21} &= \frac{m \cos\phi \sin\theta}{r\sqrt{\frac{m}{-4m+2r}}} + \frac{2m \cos\theta \sin\phi}{r\sqrt{\frac{m}{-4m+2r}}}\,\bar h + \frac{m(4m + r + 2m\cos 2\theta)\cos\phi \sin\theta}{r^2\sqrt{\frac{m}{-4m+2r}}}\,\bar h^2 + O(\bar h^3), \\
e_{22} &= 1 - \frac{2m \sin^2\theta \cos^2\phi}{r} + \frac{m}{2r^2}\bigl[2r + 2m\cos 4\theta \cos^2\phi - 6m\cos^2\phi \\
&\qquad + 2\cos 2\theta\,(m + 8r + (m - r)\cos 2\phi)\bigr]\,\bar h^2 + O(\bar h^3), \\
e_{23} &= -\frac{m \sin^2\theta \sin 2\phi}{r} - \frac{3m \sin 2\theta}{r}\,\bar h + \frac{m\bigl(2(m - r)\cos 2\theta + m(-3 + \cos 4\theta)\bigr)\sin 2\phi}{2r^2}\,\bar h^2 + O(\bar h^3), \\
e_{24} &= \frac{-2m \cos\theta \cos\phi \sin\theta}{r} - \frac{m(1 + 3\cos 2\theta)\sin\phi}{r}\,\bar h - \frac{m(8m + 5r + 4m\cos 2\theta)\cos\phi \sin 2\theta}{2r^2}\,\bar h^2 + O(\bar h^3), \\
e_{31} &= \frac{m \sin\theta \sin\phi}{r\sqrt{\frac{m}{-4m+2r}}} - \frac{2m \cos\theta \cos\phi}{r\sqrt{\frac{m}{-4m+2r}}}\,\bar h + \frac{m(4m + r + 2m\cos 2\theta)\sin\theta \sin\phi}{r^2\sqrt{\frac{m}{-4m+2r}}}\,\bar h^2 + O(\bar h^3), \\
e_{32} &= -\frac{m \sin^2\theta \sin 2\phi}{r} + \frac{3m \sin 2\theta}{r}\,\bar h + \frac{m\bigl(2(m - r)\cos 2\theta + m(-3 + \cos 4\theta)\bigr)\sin 2\phi}{2r^2}\,\bar h^2 + O(\bar h^3), \\
e_{33} &= 1 - \frac{2m \sin^2\theta \sin^2\phi}{r} + \frac{m}{2r^2}\bigl[2r + 2m\cos 4\theta \sin^2\phi - 6m\sin^2\phi \\
&\qquad + 2\cos 2\theta\,(m + 8r - (m - r)\cos 2\phi)\bigr]\,\bar h^2 + O(\bar h^3), \\
e_{34} &= \frac{-2m \cos\theta \sin\theta \sin\phi}{r} + \frac{m(1 + 3\cos 2\theta)\cos\phi}{r}\,\bar h - \frac{m(8m + 5r + 4m\cos 2\theta)\sin 2\theta \sin\phi}{2r^2}\,\bar h^2 + O(\bar h^3), \\
e_{41} &= \frac{m \cos\theta}{r\sqrt{\frac{m}{-4m+2r}}} + \frac{m \cos\theta\,(4m - r + 2m\cos 2\theta)}{r^2\sqrt{\frac{m}{-4m+2r}}}\,\bar h^2 + O(\bar h^3), \\
e_{42} &= \frac{-2m \cos\theta \cos\phi \sin\theta}{r} + \frac{m(1 + 3\cos 2\theta)\sin\phi}{r}\,\bar h - \frac{m(8m + 5r + 4m\cos 2\theta)\cos\phi \sin 2\theta}{2r^2}\,\bar h^2 + O(\bar h^3), \\
e_{43} &= \frac{-2m \cos\theta \sin\theta \sin\phi}{r} - \frac{m(1 + 3\cos 2\theta)\cos\phi}{r}\,\bar h - \frac{m(8m + 5r + 4m\cos 2\theta)\sin 2\theta \sin\phi}{2r^2}\,\bar h^2 + O(\bar h^3), \\
e_{44} &= 1 - \frac{2m \cos^2\theta}{r} + \frac{4m \cos^2\theta\,(-2m + r - m\cos 2\theta)}{r^2}\,\bar h^2 + O(\bar h^3).
\end{align*}

Let us write $e = e_0 + \bar h e_1 + \bar h^2 e_2 + \cdots$. Then inspecting the formulae we see that the matrices $e_0$ and $e_2$ are symmetric, while $e_1$ is skew symmetric. This is no coincidence; rather it is a consequence of properties of $X$ under the bar involution, which will be discussed in Sec. 6.

Here we refrain from presenting the result of the Mathematica computation for the curvature $R_{ij} = -[\partial_i e, \partial_j e]_*$, which is very complicated and not terribly illuminating. However, we mention that in [37] a quantisation of the Schwarzschild spacetime was carried out (for a particular choice of $\Theta$), and the resulting noncommutative differential geometry was studied in detail. In particular, the metric, Christoffel symbols, Riemannian and Ricci curvatures were explicitly worked out. We refer to that paper for details.


5. General Coordinate Transformations

We now return to the general setting of Sec. 3 to investigate "general coordinate transformations". Our treatment follows closely [8, §V] and makes use of general ideas of [17, 21, 28]. We should point out that the material presented is part of an attempt of ours to develop a notion of "general covariance" in the noncommutative setting. This is an important matter which deserves a thorough investigation. We hope that the work presented here will prompt further studies.

Let $(\mathcal{A}, \mu)$ be a Moyal algebra of smooth functions on the open region $U$ of $\mathbb{R}^n$ with coordinate $t$. This algebra is defined with respect to a constant skew symmetric matrix $\theta = (\theta_{ij})$. Let $\Phi : U \to U$ be a diffeomorphism of $U$ in the classical sense. We denote $u^i = \Phi^i(t)$, and refer to this as a general coordinate transformation of $U$. Denote by $\mathcal{A}_u$ the set of smooth functions of $u = (u^1, u^2, \ldots, u^n)$. The map $\Phi$ induces an $\mathbb{R}[[\bar h]]$-module isomorphism $\phi = \Phi^* : \mathcal{A}_u \to \mathcal{A}$ defined for any function $f \in \mathcal{A}_u$ by
\[ \phi(f)(t) = f(\Phi(t)). \]
We define the $\mathbb{R}[[\bar h]]$-bilinear map
\[ \mu_u : \mathcal{A}_u \otimes \mathcal{A}_u \to \mathcal{A}_u, \qquad \mu_u(f, g) = \phi^{-1}\mu_t(\phi(f), \phi(g)). \]
Then it is well known [21] that $\mu_u$ is associative. Therefore, we have the associative algebra isomorphism
\[ \phi : (\mathcal{A}_u, \mu_u) \xrightarrow{\ \sim\ } (\mathcal{A}_t, \mu_t). \]
We say that the two associative algebras are gauge equivalent by adopting the terminology of [17].

Following [8], we define $\mathbb{R}[[\bar h]]$-linear operators
\[ \partial_i^\phi := \phi^{-1} \circ \partial_i \circ \phi : \mathcal{A}_u \to \mathcal{A}_u, \tag{5.1} \]
which have the following properties [8, Lemma 5.5]:
\begin{align*}
\partial_i^\phi \circ \partial_j^\phi - \partial_j^\phi \circ \partial_i^\phi &= 0, \\
\partial_i^\phi \mu_u(f, g) &= \mu_u(\partial_i^\phi(f), g) + \mu_u(f, \partial_i^\phi(g)), \quad \forall f, g \in \mathcal{A}_u,
\end{align*}
where the second relation is the Leibniz rule for $\partial_i^\phi$. Recall that this Leibniz rule played a crucial role in the construction of noncommutative spaces over $(\mathcal{A}_u, \mu_u)$ in [8].

We shall denote by $M_m(\mathcal{A}_u)$ the set of $(m \times m)$-matrices with entries in $\mathcal{A}_u$. The product of two such matrices will be defined with respect to the multiplication $\mu_u$ of the algebra $(\mathcal{A}_u, \mu_u)$. Then $\phi^{-1}$ acting componentwise gives rise to an algebra isomorphism from $M_m(\mathcal{A})$ to $M_m(\mathcal{A}_u)$, where matrix multiplication in $M_m(\mathcal{A})$ is defined with respect to $\mu$.
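Both properties of the operators $\partial_i^\phi$ follow directly from the definitions. The sketch below is our own and assumes only (5.1), the definition of $\mu_u$, and the fact that the $\partial_i$ are commuting derivations of $\mu_t$.

```latex
\begin{align*}
\partial_i^\phi \circ \partial_j^\phi
  &= \phi^{-1} \circ \partial_i \circ \phi \circ \phi^{-1} \circ \partial_j \circ \phi
   = \phi^{-1} \circ \partial_i \partial_j \circ \phi
   = \partial_j^\phi \circ \partial_i^\phi, \\
\partial_i^\phi \mu_u(f, g)
  &= \phi^{-1} \partial_i\, \mu_t(\phi(f), \phi(g)) \\
  &= \phi^{-1} \mu_t(\partial_i \phi(f), \phi(g))
     + \phi^{-1} \mu_t(\phi(f), \partial_i \phi(g)) \\
  &= \phi^{-1} \mu_t(\phi(\partial_i^\phi f), \phi(g))
     + \phi^{-1} \mu_t(\phi(f), \phi(\partial_i^\phi g)) \\
  &= \mu_u(\partial_i^\phi f, g) + \mu_u(f, \partial_i^\phi g).
\end{align*}
```

The third line of the second computation uses $\partial_i \phi(f) = \phi(\partial_i^\phi f)$, which is just (5.1) rewritten.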


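Since we need to deal with two different algebras $(\mathcal{A}, \mu)$ and $(\mathcal{A}_u, \mu_u)$ simultaneously in this section, we write $\mu$ and the matrix multiplication defined with respect to it by $*$ as before, and use $*_u$ to denote $\mu_u$ and the matrix multiplication defined with respect to it.

Let $e \in M_m(\mathcal{A})$ be an idempotent. There exists the corresponding finitely generated projective left (respectively, right) $\mathcal{A}$-module $\mathcal{M}$ (respectively, $\tilde{\mathcal{M}}$). Now $e_u := \phi^{-1}(e)$ is an idempotent in $M_m(\mathcal{A}_u)$, that is, $\phi^{-1}(e) *_u \phi^{-1}(e) = \phi^{-1}(e)$. Write $e_u = (E^\alpha_\beta)_{\alpha,\beta=1,\ldots,m}$. This idempotent gives rise to the left projective $\mathcal{A}_u$-module $\mathcal{M}_u$ and right projective $\mathcal{A}_u$-module $\tilde{\mathcal{M}}_u$, respectively defined by
\begin{align*}
\mathcal{M}_u &= \Bigl\{ \bigl( a_\alpha *_u E^\alpha_1 \quad a_\alpha *_u E^\alpha_2 \quad \cdots \quad a_\alpha *_u E^\alpha_m \bigr) \Bigm| a_\alpha \in \mathcal{A}_u \Bigr\}, \\
\tilde{\mathcal{M}}_u &= \left\{ \begin{pmatrix} E^1_\beta *_u b^\beta \\ E^2_\beta *_u b^\beta \\ \vdots \\ E^m_\beta *_u b^\beta \end{pmatrix} \;\middle|\; b^\beta \in \mathcal{A}_u \right\},
\end{align*}
where $a_\alpha *_u E^\alpha_\beta = \sum_\alpha \mu_u(a_\alpha, E^\alpha_\beta)$ and $E^\alpha_\beta *_u b^\beta = \sum_\beta \mu_u(E^\alpha_\beta, b^\beta)$. Below we consider the left projective module only, as the right projective module may be treated similarly.

Assume that we have the left connection
\[ \nabla_i : \mathcal{M} \to \mathcal{M}, \qquad \nabla_i \zeta = \frac{\partial \zeta}{\partial t^i} + \zeta * \omega_i. \]
Let $\omega_i^u := \phi^{-1}(\omega_i)$. We have the following result.

Theorem 5.1. (1) The matrices $\omega_i^u$ satisfy the following relations in $M_m(\mathcal{A}_u)$:
\[ e_u *_u \omega_i^u *_u (1 - e_u) = -e_u *_u \partial_i^\phi e_u. \]
(2) The operators $\nabla_i^\phi$ ($i = 1, 2, \ldots, n$) defined for all $\eta \in \mathcal{M}_u$ by
\[ \nabla_i^\phi \eta = \partial_i^\phi \eta + \eta *_u \omega_i^u \]
give rise to a connection on $\mathcal{M}_u$.
(3) The curvature of the connection $\nabla_i^\phi$ is given by
\[ R^u_{ij} = \partial_i^\phi \omega_j^u - \partial_j^\phi \omega_i^u - \omega_i^u *_u \omega_j^u + \omega_j^u *_u \omega_i^u, \]
which is related to the curvature $R_{ij}$ of $\mathcal{M}$ by $R^u_{ij} = \phi^{-1}(R_{ij})$.

Proof. Note that $e_u *_u \omega_i^u *_u (1 - e_u) = \phi^{-1}(e * \omega_i * (1 - e))$. We also have $\partial_i^\phi e_u = \phi^{-1}\bigl(\frac{\partial e}{\partial t^i}\bigr)$, which leads to $e_u *_u \partial_i^\phi e_u = \phi^{-1}(e * \phi(\partial_i^\phi e_u)) = \phi^{-1}(e * \partial_i e)$. This proves part (1). Part (2) follows from part (1) and the Leibniz rule for $\partial_i^\phi$. Straightforward calculations show that the curvature of the connection $\nabla_i^\phi$ is given by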


\[ R^u_{ij} = \partial_i^\phi \omega_j^u - \partial_j^\phi \omega_i^u - \omega_i^u *_u \omega_j^u + \omega_j^u *_u \omega_i^u. \]
Now $\partial_i^\phi \omega_j^u = \phi^{-1}\bigl(\frac{\partial \omega_j}{\partial t^i}\bigr)$, and $\omega_i^u *_u \omega_j^u - \omega_j^u *_u \omega_i^u = \phi^{-1}(\omega_i * \omega_j) - \phi^{-1}(\omega_j * \omega_i)$. Hence $R^u_{ij} = \phi^{-1}(R_{ij})$.

Remark 5.2. One can recover the usual transformation rules of tensors under the diffeomorphism group from the commutative limit of Theorem 5.1 in a way similar to that in [8, §5.C].

6. Bar Involution and Generalized Hermitian Structure

In this section, we study a Moyal algebra analogue of the bar map of quantum groups, and investigate its implications on noncommutative geometry. Note that the ring $\mathbb{R}[[\bar h]]$ admits an involution that maps an arbitrary power series $a = \sum_i a_i \bar h^i$ in $\mathbb{R}[[\bar h]]$ to $\bar a = \sum_i (-1)^i a_i \bar h^i$. We shall call $\bar a$ the conjugate of $a$. Note that $a\bar a$ contains only even powers of $\bar h$. We can extend this map to a conjugate linear anti-involution on the Moyal algebra $\mathcal{A}$.

Lemma 6.1. Let $\bar{\ } : \mathcal{A} \to \mathcal{A}$ be the map defined for any $f = \sum_i f_i \bar h^i \in \mathcal{A}$, where the $f_i$ are real functions on $U$, by $\bar f = \sum_i (-1)^i f_i \bar h^i$. Then for all $f, g \in \mathcal{A}$,
\[ \overline{f * g} = \bar g * \bar f. \]

We refer to the map as the bar involution of the Moyal algebra. It is an analogue of the well known bar map, sending $q = \exp(\bar h)$ to $q^{-1}$, in the theory of quantum groups, which plays an important role in the study of canonical (crystal) bases. The lemma can be easily proven by inspecting (2.1).

Given any rectangular matrix $A = (a_{rs})$ with entries in $\mathcal{A}$, we let $A^\dagger$ be the matrix obtained from $A$ by first taking its transpose and then sending every matrix element to its conjugate. For example,
\[ \begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \end{pmatrix}^\dagger = \begin{pmatrix} \bar a_1 & \bar a_2 \\ \bar b_1 & \bar b_2 \\ \bar c_1 & \bar c_2 \end{pmatrix}. \]
It is clear that if the product $A * B$ of two matrices is defined, then
\[ (A * B)^\dagger = B^\dagger * A^\dagger. \]
Let $\mathcal{A}^m = {}_l\mathcal{A}^m$ be the $\mathbb{R}[[\bar h]]$-module consisting of row matrices of length $m$ with entries in $\mathcal{A}$. We define the form
\[ (\ ,\ ) : \mathcal{A}^m \times \mathcal{A}^m \to \mathcal{A}, \qquad \zeta \times \xi \mapsto (\zeta, \xi) := \zeta * \xi^\dagger. \tag{6.1} \]

Lemma 6.2. (1) For all $\zeta, \xi \in \mathcal{M}$ and $a, b \in \mathcal{A}$,
\[ \overline{(\zeta, \xi)} = (\xi, \zeta), \qquad (a * \zeta, b * \xi) = a * (\zeta, \xi) * \bar b. \]
Thus in this sense the form (6.1) is sesquilinear.
(2) $(\zeta, \zeta) = 0$ if and only if $\zeta = 0$.


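(3) For all $\zeta, \xi \in \mathcal{M}$ and $A \in M_m(\mathcal{A})$, we have
\[ (\zeta * A, \xi) = (\zeta, \xi * A^\dagger). \]
(4) Let the bar-unitary group $U_m(\mathcal{A})$ over $\mathcal{A}$ be the subgroup of $GL_m(\mathcal{A})$ defined by
\[ U_m(\mathcal{A}) = \{ g \in GL_m(\mathcal{A}) \mid g^\dagger = g^{-1} \}. \]
Then the form (6.1) is invariant under $U_m(\mathcal{A})$ in the sense that for all $g \in U_m(\mathcal{A})$ and $\zeta, \xi \in \mathcal{M}$,
\[ (\zeta * g, \xi * g) = (\zeta, \xi). \]

It is straightforward to prove the lemma. Note that part (2) of the lemma makes the form (6.1) as nice as a positive definite hermitian form in the commutative case.

We shall call an idempotent $e \in M_m(\mathcal{A})$ self-adjoint (with respect to the sesquilinear form (6.1)) if $e = e^\dagger$. In this case, the corresponding left and right projective modules $\mathcal{M} = {}_l\mathcal{A}^m * e$ and $\tilde{\mathcal{M}} = e * \mathcal{A}^m_r$ are related by
\[ \tilde{\mathcal{M}} = \{ \zeta^\dagger \mid \zeta \in \mathcal{M} \}. \]
Furthermore, the form (6.1) restricts to a sesquilinear form on $\mathcal{M}$, which is invariant under $G \cap U_m(\mathcal{A})$.

Lemma 6.3. Let $\mathcal{M} = {}_l\mathcal{A}^m * e$ and $\tilde{\mathcal{M}} = e * \mathcal{A}^m_r$ be the left and right bundles associated with a self-adjoint idempotent $e$. Assume that the left connection $\omega_i$ on $\mathcal{M}$ and the right connection $\tilde\omega_i$ on $\tilde{\mathcal{M}}$ satisfy the condition
\[ \tilde\omega_i = -\omega_i^\dagger, \quad \forall i. \tag{6.2} \]
Then for any $\zeta$ in $\mathcal{M}$,
\[ (\nabla_i \zeta)^\dagger = \tilde\nabla_i(\zeta^\dagger). \]
Furthermore, the curvatures on the left and right bundles are related by
\[ \tilde R_{ij} = -R_{ij}^\dagger. \]

Proof. Let $\xi = \zeta^\dagger$. We have
\[ (\nabla_i \zeta)^\dagger = (\partial_i \zeta + \zeta * \omega_i)^\dagger = \partial_i \xi + \omega_i^\dagger * \xi = \tilde\nabla_i \xi. \]
This proves the first part of the lemma. Now
\[ R_{ij}^\dagger = (\partial_i \omega_j - \partial_j \omega_i - [\omega_i, \omega_j]_*)^\dagger = \partial_i \omega_j^\dagger - \partial_j \omega_i^\dagger + [\omega_i^\dagger, \omega_j^\dagger]_* = -\tilde R_{ij}. \]
This proves the second part.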
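Part (2) of Lemma 6.2 is stated without proof. One possible argument is sketched below; it is our own and assumes that the Moyal product agrees with the pointwise product at lowest order in $\bar h$, and that the coefficient functions $\zeta_{\alpha,j}$ of the entries of $\zeta$ are real, as in Lemma 6.1.

```latex
\begin{align*}
\zeta_\alpha &= \sum_{j \ge k} \zeta_{\alpha,j}\, \bar h^j, \qquad
\bar\zeta_\alpha = \sum_{j \ge k} (-1)^j \zeta_{\alpha,j}\, \bar h^j
  \qquad (\alpha = 1, \ldots, m), \\
(\zeta, \zeta) &= \sum_{\alpha=1}^m \zeta_\alpha * \bar\zeta_\alpha
  = (-1)^k\, \bar h^{2k} \sum_{\alpha=1}^m (\zeta_{\alpha,k})^2 + O(\bar h^{2k+1}).
\end{align*}
```

Here $k$ is the lowest power of $\bar h$ occurring in any entry of $\zeta \neq 0$. The leading coefficient is a sum of squares of real functions, at least one of which is nonzero, so $(\zeta, \zeta) \neq 0$.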


Hereafter, we shall assume that condition (6.2) is satisfied by the left and right connections. Let $M$ be the left bundle corresponding to a self-adjoint idempotent $e$. We shall say that a connection $\omega_i$ on $M$ is hermitian with respect to the bar map (or bar-hermitian) if $\omega_i^\dagger = \omega_i$ for all $i$. In this case, we shall also say that the bundle $M$ is bar-hermitian.

Note that the canonical connections $\omega_i = -\partial_i e$ on $M$ and $\tilde{\omega}_i = \partial_i e$ on $\tilde{M}$ satisfy $\tilde{\omega}_i = -\omega_i^\dagger$ and $\omega_i^\dagger = \omega_i$ provided that $e$ is self-adjoint. Therefore, in this case the canonical connection is bar-hermitian. Since the left and right curvatures associated to the canonical connections are equal, it follows from Lemma 6.3 that $R_{ij}^\dagger = -R_{ij}$. We have the following result.

Theorem 6.4. Let $X = (X^1\ X^2\ \cdots\ X^m)$ in ${}_l\mathcal{A}^m$ be an embedded noncommutative surface satisfying the condition $\bar{X} := (\bar{X}^1\ \bar{X}^2\ \cdots\ \bar{X}^m) = X$. Then $X$ has the following properties:
(1) The metric has the property $g_{ij} = g_{ji}$ for all $i, j$.
(2) The idempotent $e = (E_i)^t * g^{ij} * E_j$ is self-adjoint.
(3) Equipped with the canonical connection $\omega_i = -\partial_i e$, the tangent bundle of $X$ is bar-hermitian.
(4) The curvature satisfies $R_{ij}^\dagger = -R_{ij}$.

Proof. The given condition on $X$ implies that all the $E_i$ satisfy $E_i^\dagger = (E_i)^t$. Thus
$$g_{ij} = E_i * (E_j)^t = E_i * (E_j)^\dagger, \qquad e = (E_i)^t * g^{ij} * E_j = (E_i)^\dagger * g^{ij} * E_j.$$
Hence we have $g_{ij} = (E_i * (E_j)^\dagger)^\dagger = E_j * (E_i)^\dagger = g_{ji}$. It then follows that $g^{ij} = g^{ji}$. Now the idempotent $e$ satisfies
$$e^\dagger = ((E_i)^\dagger * g^{ij} * E_j)^\dagger = (E_j)^\dagger * g^{ij} * E_i = (E_j)^t * g^{ji} * E_i = e.$$
Parts (3) and (4) follow from part (2) and the discussion preceding the theorem.
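In the commutative limit, where the star product reduces to pointwise multiplication and the bar map acts trivially on real-valued functions, parts (1) and (2) of Theorem 6.4 become statements of linear algebra that can be checked at a single point. The following Python sketch does this for a hypothetical frame $E_1, E_2$ in $\mathbb{R}^3$. The numerical values and helper names are illustrative assumptions, not taken from the paper, and exact rational arithmetic is used so that the equalities are tested without rounding.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inverse_2x2(g):
    (a, b), (c, d) = g
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def tangent_projector(E):
    """Return (e, g) with g_ij = E_i (E_j)^t and e = (E_i)^t g^{ij} E_j."""
    g = matmul(E, transpose(E))
    e = matmul(matmul(transpose(E), inverse_2x2(g)), E)
    return e, g

# Hypothetical frame: tangent vectors of a graph X = (t1, t2, f(t1, t2))
# at a point where the partial derivatives of f are assumed to be 2 and 3.
E = [[Fraction(1), Fraction(0), Fraction(2)],
     [Fraction(0), Fraction(1), Fraction(3)]]
e, g = tangent_projector(E)

print(g == transpose(g))   # the metric is symmetric
print(e == transpose(e))   # the idempotent is self-adjoint
print(matmul(e, e) == e)   # e is indeed idempotent
print(matmul(E, e) == E)   # each E_i lies in the image of e
```

This only probes the classical limit; the noncommutative statements involve the star product and the bar map, which the sketch does not model.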

Note that the quantum spacetimes studied in [37] and the example in Sec. 4.2 all satisfy the conditions of Theorem 6.4. 7. Concluding Remarks We wish to point out that in the classical commutative setting, we can recover (pseudo-) Riemannian geometry from the theory developed here by using the isometric embedding theorems of [12, 19, 23, 32]. The simpliﬁcation in this case is that there is no need to distinguish the left and the right tangent bundles. To describe the situation, we let (N, g) be a smooth n-dimensional (pseudo-) Riemannian manifold with metric g. Denote by C∞ (N ) the set of smooth functions on N endowed with the usual pointwise multiplication. Let C∞ (N )m be the space consisting of row vectors of length m with entries in C∞ (N ). By results of [12, 19, 23, 32],


there exist positive integers $p, q$ (with $p + q = m$) and a set of smooth functions $X^1, \dots, X^p, X^{p+1}, \dots, X^m$ on $N$ such that $g = \sum_{\alpha,\beta=1}^{m} dX^\alpha\, \eta_{\alpha\beta}\, dX^\beta$, where $\eta = \mathrm{diag}(\underbrace{-1, \dots, -1}_{p}, \underbrace{1, \dots, 1}_{q})$ with $p = 0$ if $N$ is Riemannian. Let $U$ be a coordinate chart of $N$ with local coordinate $(t_1, \dots, t_n)$. We set $E_i = \left(\frac{\partial X^1}{\partial t_i}\ \frac{\partial X^2}{\partial t_i}\ \cdots\ \frac{\partial X^m}{\partial t_i}\right)$ and define $e = \eta (E_i)^t g^{ij} E_j$ on each coordinate chart $U$. Then we have the following result.

Theorem 7.1. (1) The idempotent e is globally deﬁned on N . (2) The space Γ(T N ) of sections of the tangent bundle of N is given by C∞ (N )m e. (3) For all ζ, ξ ∈ Γ(T N ), we have g(ζ, ξ) = ζη(ξ)t . (4) The standard connection (with ωi = −∂i e) on C∞ (N )m e is the usual Levi– Civita connection on T X with the Christoﬀel symbol Γkij deﬁned by (4.2) and Υijk = 0. (5) The Riemannian curvature tensor is given by (4.6). Returning to the noncommutative case, we recall that one can quantise any Poisson manifold following the prescription of [28]. Then one obtains a collection of noncommutative associative algebras (analogous to the Moyal algebra), one on each coordinate patch. The algebras relative to diﬀerent local coordinates are gauge equivalent [28, Theorem 2.3] as discussed in Sec. 5. This way, one obtains a sheaf of noncommutative algebras over the Poisson manifold. The algebraic geometry of such a quantized Poisson manifold has been extensively developed by Kashiwara and Schapira [24, 25]. In principle one may extend the local theory developed in this paper to a “global” diﬀerential geometry over the quantized Poisson manifold. Work in this direction is currently under way. Results in this paper should be directly applicable to the development of a theory of noncommutative general relativity, which is of considerable current interest in theoretical physics. We hope that the theory presented here will provide a consistent mathematical basis for this purpose. We should also mention that one may use this theory to clarify, conceptually, aspects of the many noncommutative geometries introduced in physics in recent years based on physical intuitions. For example, general features of the noncommutative geometries in [3, 10, 11] have considerable similarity with that of [8]. These works also have the advantage of being explicit and amenable to calculations, thus have the chance to be physically tested. 
Therefore, it will be useful to further develop the mathematical bases of these theories by casting them into the framework of this paper. Finally, we note that a noncommutative analogue of spin geometry over the Moyal algebra within the C ∗ -algebraic framework in terms of noncompact spectral triples was studied in [20]. Our treatment is complementary to that of [20]. Acknowledgments We wish to thank Masud Chaichian and Anca Tureanu for discussions at various stages of this work. X. Zhang thanks the School of Mathematics and Statistics,


the University of Sydney for the hospitality extended to him during a visit when this work was completed. Partial financial support from the Australian Research Council, National Science Foundation of China (grants 10421001, 10725105 and 10731080), NKBRPC (2006CB805905) and the Chinese Academy of Sciences is gratefully acknowledged.

References

[1] L. Álvarez-Gaumé, F. Meyer and M. A. Vazquez-Mozo, Comments on noncommutative gravity, Nucl. Phys. B 753 (2006) 92–117.
[2] S. Ansoldi, P. Nicolini, A. Smailagic and E. Spallucci, Non-commutative geometry inspired charged black holes, Phys. Lett. B 645 (2007) 261–266.
[3] P. Aschieri, M. Dimitrijevic, F. Meyer and J. Wess, Noncommutative geometry and gravity, Class. Quant. Grav. 23 (2006) 1883–1911.
[4] R. Banerjee, B. R. Majhi and S. K. Modak, Noncommutative Schwarzschild black hole and area law, Class. Quant. Grav. 26 (2009) 085010, 11 pp.
[5] M. Buric, T. Grammatikopoulos, J. Madore and G. Zoupanos, Gravity and the structure of noncommutative algebras, JHEP 0604 (2006) 054.
[6] M. Chaichian, M. Oksanen, A. Tureanu and G. Zet, Gauging the twisted Poincare symmetry as noncommutative theory of gravitation, Phys. Rev. D 79 (2009) 044016, 8 pp.
[7] M. Chaichian, M. R. Setare, A. Tureanu and G. Zet, On black holes and cosmological constant in noncommutative gauge theory of gravity, JHEP 0804 (2008) 064.
[8] M. Chaichian, A. Tureanu, R. B. Zhang and Xiao Zhang, Riemannian geometry of noncommutative surfaces, J. Math. Phys. 49 (2008) 073511, 26 pp.
[9] M. Chaichian, A. Tureanu and G. Zet, Corrections to Schwarzschild solution in noncommutative gauge theory of gravity, Phys. Lett. B 660 (2008) 573–578.
[10] A. H. Chamseddine, Complexified gravity in noncommutative spaces, Comm. Math. Phys. 218 (2001) 283–292.
[11] A. H. Chamseddine, SL(2, C) gravity with a complex vierbein and its noncommutative extension, Phys. Rev. D 69 (2004) 024015, 8 pp.
[12] C. J. S. Clarke, On the global isometric embedding of pseudo-Riemannian manifolds, Proc. Roy. Soc. Lond. A 314 (1970) 417–428.
[13] A. Connes, Noncommutative Geometry (Academic Press, 1994).
[14] M. P. do Carmo, Differential Geometry of Curves and Surfaces (Prentice-Hall, Englewood Cliffs, NJ, 1976).
[15] B. P. Dolan, K. S. Gupta and A. Stern, Noncommutative BTZ black hole and discrete time, Class. Quant. Grav. 24 (2007) 1647–1656.
[16] S. Doplicher, K. Fredenhagen and J. E. Roberts, The quantum structure of spacetime at the Planck scale and quantum fields, Comm. Math. Phys. 172 (1995) 187–220.
[17] V. Drinfeld, Quasi-Hopf algebras, Leningrad Math. J. 1 (1990) 1419–1457.
[18] S. Estrada-Jimenez, H. Garcia-Compean, O. Obregon and C. Ramirez, Twisted covariant noncommutative self-dual gravity, Phys. Rev. D 78 (2008) 124008, 11 pp.
[19] A. Friedman, Local isometric embedding of Riemannian manifolds with indefinite metric, J. Math. Mech. 10 (1961) 625–650.
[20] V. Gayral, J. M. Gracia-Bondía, B. Iochum, T. Schücker and J. C. Várilly, Moyal planes are spectral triples, Comm. Math. Phys. 246 (2004) 569–623.
[21] M. Gerstenhaber, On the deformation of rings and algebras, Ann. Math. 79 (1964) 59–103.


[22] J. M. Gracia-Bondía, J. C. Várilly and H. Figueroa, Elements of Noncommutative Geometry, Birkhäuser Advanced Texts: Basler Lehrbücher (Birkhäuser Boston, Inc., Boston, MA, 2001).
[23] R. E. Greene, Isometric Embedding of Riemannian and Pseudo Riemannian Manifolds, Mem. Amer. Math. Soc., No. 97 (Amer. Math. Soc., 1970).
[24] M. Kashiwara and P. Schapira, Deformation quantization modules I: Finiteness and duality, arXiv:0802.1245 [math.QA].
[25] M. Kashiwara and P. Schapira, Deformation quantization modules II. Hochschild class, arXiv:0809.4309 [math.AG].
[26] H. C. Kim, M. I. Park, C. Rim and J. H. Yee, Smeared BTZ black hole from space noncommutativity, JHEP 10 (2008) 060.
[27] A. Kobakhidze, Noncommutative corrections to classical black holes, Phys. Rev. D 79 (2009) 047701, 3 pp.
[28] M. Kontsevich, Deformation quantization of Poisson manifolds, Lett. Math. Phys. 66 (2003) 157–216.
[29] J. Madore and J. Mourad, Quantum space-time and classical gravity, J. Math. Phys. 39 (1998) 423–442.
[30] S. Majid, Noncommutative Riemannian and spin geometry of the standard q-sphere, Comm. Math. Phys. 256 (2005) 255–285.
[31] F. Muller-Hoissen, Noncommutative geometries and gravity, in Recent Developments in Gravitation and Cosmology, AIP Conf. Proc., Vol. 977 (Amer. Inst. Phys., Melville, NY, 2008), pp. 12–29.
[32] J. Nash, The imbedding problem for Riemannian manifolds, Ann. Math. 63 (1956) 20–63.
[33] P. Nicolini, A. Smailagic and E. Spallucci, Noncommutative geometry inspired Schwarzschild black hole, Phys. Lett. B 632 (2006) 547–551.
[34] H. S. Snyder, Quantized space-time, Phys. Rev. 71 (1947) 38–41.
[35] R. J. Szabo, Symmetry, gravity and noncommutativity, Class. Quant. Grav. 23 (2006) R199–R242.
[36] H. Steinacker, Emergent gravity and noncommutative branes from Yang–Mills matrix models, Nucl. Phys. B 810 (2009) 1–39.
[37] D. Wang, R. B. Zhang and X. Zhang, Quantum deformations of Schwarzschild and Schwarzschild-de Sitter spacetimes, Class. Quant. Grav. 26 (2009) 085014, 14 pp.
[38] C. N. Yang, On quantized space-time, Phys. Rev. 72 (1947) 874.

June 2, 2010 14:55 WSPC/S0129-055X

148-RMP

J070-00401

Reviews in Mathematical Physics Vol. 22, No. 5 (2010) 533–548
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004016

CONSTRUCTION OF CERTAIN FUZZY FLAG MANIFOLDS

MAJDI BEN HALIMA
Faculté des Sciences de Sfax, Département de Mathématiques, Route de Soukra, 3038 Sfax, Tunisia
[email protected]

Received 14 May 2009
Revised 16 February 2010

Approximating the algebra of complex-valued smooth functions on a space-time manifold by a sequence of matrix algebras $A_N \cong \mathrm{Mat}(d_N, \mathbb{C})$, with $d_N \to \infty$, is the basic idea of fuzzy manifolds. In this paper, we explicitly construct fuzzy versions of the homogeneous spaces $SO(2n+1)/U(n)$ and $Sp(n)/U(1) \times Sp(n-1)$ for $n \ge 2$. This allows us to extend a result of Zhang giving a construction of fuzzy irreducible compact Hermitian symmetric spaces to a class of flag manifolds.

Keywords: Fuzzy flag manifolds; Berezin–Toeplitz quantization; representations of compact Lie groups.

Mathematics Subject Classification: 81T08, 81S10, 22E47

1. Introduction

Let $(M, \omega)$ be a quantizable compact Kähler manifold. Let $(L, h, \nabla)$ be an associated quantum line bundle. Here $L$ is a holomorphic line bundle, $h$ a Hermitian metric and $\nabla$ the unique connection in $L$ which is compatible with the complex structure and the metric such that the curvature form $R$ of the line bundle and the Kähler form $\omega$ of the manifold are related as
$$R(X, Y) := \nabla_X \nabla_Y - \nabla_Y \nabla_X - \nabla_{[X,Y]} = -i\omega(X, Y),$$
where $X, Y$ are smooth vector fields on $M$. Let us fix a positive integer $N$ and set $L^N := L^{\otimes N}$, the $N$th tensor power of $L$. On the space $\Gamma_\infty(M, L^N)$ of smooth sections of $L^N$, we have the scalar product
$$\langle \varphi, \psi \rangle = \int_M h^N(\varphi(x), \psi(x))\, d\Omega(x),$$
where $h^N := h^{\otimes N}$ is the induced metric on $L^N$ and $d\Omega(x)$ is the normalized Liouville measure on $M$. Let $L^2(M, L^N)$ be the $L^2$-completion of the space $\Gamma_\infty(M, L^N)$ and $\Gamma_{\mathrm{hol}}(M, L^N)$ be its closed subspace of holomorphic sections. By compactness of $M$, the Hilbert space $H_N := \Gamma_{\mathrm{hol}}(M, L^N)$ is finite-dimensional. The algebra


$A_N := \mathrm{End}_{\mathbb{C}}(H_N)$ can evidently be identified with the matrix algebra $\mathrm{Mat}(\dim_{\mathbb{C}} H_N, \mathbb{C})$. Letting $C^\infty(M)$ be the algebra of complex-valued smooth functions on $M$, the Berezin–Toeplitz quantization map $T_N : C^\infty(M) \to A_N$ is defined by associating to a function $f$ multiplication of holomorphic sections of $L^N$ by $f$ followed by projection on the space of holomorphic sections. In this way, one obtains a sequence of matrix algebras $(A_N)_{N \ge 1}$ and a sequence of linear maps $(T_N)_{N \ge 1}$. Referring to a work of Bordemann, Meinrenken and Schlichenmaier [7], we know that the sequence $(A_N)_{N \ge 1}$ should, in some sense, "approximate" the commutative algebra $C^\infty(M)$.

Such an approximation scheme is reminiscent of fuzzy manifolds, where finite-dimensional matrix algebras are used to approximate the algebra of complex-valued smooth functions on a space-time manifold. More precisely, a fuzzy version of a compact manifold $D$ is given by a sequence of linear subspaces $(E_N)_{N \ge 1}$ in the function algebra $C^\infty(D)$ such that $E_N \subset E_{N+1}$ and $\bigcup_{N \ge 1} E_N$ is dense in $C^\infty(D)$, and such that $E_N$ is isomorphic to a matrix algebra $A_N \cong \mathrm{Mat}(d_N, \mathbb{C})$ with $d_N \to \infty$. Furthermore, it is required that this truncation retains all symmetries of the manifold $D$.

The prototypical example of a fuzzy compact manifold is the fuzzy two-sphere $S^2$. Identify $S^2$ with the homogeneous space $SU(2)/S(U(1) \times U(1))$ and recall that
$$L^2(S^2) \cong \bigoplus_{k \in \mathbb{N}_0} V_{2k},$$
where $\mathbb{N}_0 := \mathbb{Z}_{\ge 0}$ and $V_l$ is the space of homogeneous complex polynomials of degree $l$ in two variables. Then, since
$$V_N^* \otimes V_N \cong \bigoplus_{k=0}^{N} V_{2k}$$
by self-duality of the $V_l$ and the usual Clebsch–Gordan rule, the algebra $A_N := \mathrm{End}_{\mathbb{C}}(V_N) \cong \mathrm{Mat}(N+1, \mathbb{C})$ appears not only as a natural $SU(2)$-equivariant truncation of $L^2(S^2)$ (or $C^\infty(S^2)$) but carries a non-commutative multiplication as well (see, e.g., [24, 25] for details).

A number of fuzzy compact manifolds have been constructed by now. For reviews on some of these constructions, we refer to [5, 6, 11, 16]. As suggested by Madore in [24], fuzzy compact manifolds have found several applications in physics. In quantum field theory, they can provide a finite mode approximation to commutative continuum field theories, giving an alternative to lattice gauge theories. Compared to a lattice regularization procedure, the fuzzy approach has the advantage of preserving the space-time symmetries. It also has further advantages in situations where fermions are included. Due to these and other potential advantages, the fuzzy approach appears as a promising new tool in quantum field theory (see, e.g., [4, 12, 15] for more details). There are other reasons to investigate fuzzy compact manifolds in theoretical physics. They lead to matrix models which receive a lot of interest in string theory, especially in the theory of D-branes (see, e.g., [2, 17]).
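The dimension count behind the fuzzy two-sphere described above can be checked directly: since $\dim V_l = l + 1$, the Clebsch–Gordan decomposition of $V_N^* \otimes V_N$ requires $(N+1)^2 = \sum_{k=0}^{N} (2k+1)$. The short Python sketch below verifies this for small $N$; the function name is a hypothetical helper chosen for illustration.

```python
def fuzzy_sphere_dims(N):
    """Return (dim End(V_N), total dimension of V_0, V_2, ..., V_{2N})."""
    def dim_V(l):
        # homogeneous polynomials of degree l in two variables
        return l + 1

    matrix_algebra_dim = dim_V(N) ** 2
    truncated_function_dim = sum(dim_V(2 * k) for k in range(N + 1))
    return matrix_algebra_dim, truncated_function_dim

for N in range(5):
    print(N, fuzzy_sphere_dims(N))
```

Matching dimensions are of course only a necessary condition; the isomorphism itself comes from the representation theory cited in the text.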


From a rather mathematical point of view, fuzzy compact manifolds have an interesting connection with noncommutative geometry. Following an idea of Fröhlich and Gawędzki [13], a fuzzy version of a compact manifold $D$ can be specified by a sequence of triples
$$(\mathrm{Mat}(d_N, \mathbb{C}), H_N, \Delta_N),$$
where the Hilbert space $H_N = \mathbb{C}^{d_N^2}$ is equipped with the inner product
$$\langle A, B \rangle = \frac{1}{d_N} \mathrm{Tr}(AB^*),$$
and $\Delta_N$ is a matrix analog of the Laplace–Beltrami operator. The "fuzzy Laplacian" $\Delta_N$ comes with a cutoff and encodes mathematical information about the manifold $D$ (see, e.g., [10] for details). This important fact motivates the study of fuzzy compact manifolds within the framework of noncommutative geometry.

The main goal of the present work is to construct explicit fuzzy versions of the homogeneous spaces $SO(2n+1)/U(n)$ and $Sp(n)/U(1) \times Sp(n-1)$ ($n \ge 2$) by means of elementary representation-theoretic methods. This allows us to establish the following result, which describes a class of fuzzy flag manifolds.

Theorem. Let $G$ be a compact, connected, simply connected Lie group with Lie algebra $\mathfrak{g}$, and let $\mathfrak{p}$ be a standard maximal parabolic subalgebra of the complexified Lie algebra $\mathfrak{g}^{\mathbb{C}}$. Let $K \subset G$ be the connected Lie group with Lie algebra $\mathfrak{k} := \mathfrak{p} \cap \mathfrak{g}$. Assume that $(G, K)$ is a Gelfand pair. Then there exists a sequence $(E_N)_{N \ge 1}$ of $G$-invariant subspaces of $L^2(G/K)$ such that $E_N \subset E_{N+1}$ and $\bigcup_{N \ge 1} E_N$ is dense in $C^\infty(G/K)$, and such that $E_N$ is $G$-equivariantly isomorphic to a matrix algebra $A_N \cong \mathrm{Mat}(d_N, \mathbb{C})$ with $d_N \to \infty$.

This theorem extends a result of Zhang (see [31, Proposition 3.1 and Theorem 4.2]) which gives a construction of fuzzy irreducible compact Hermitian symmetric spaces. In the proof of the above theorem, we shall make direct use of the standard Berezin–Toeplitz quantization procedure for compact Kähler manifolds.

In connection with our work, let us mention that Lazaroiu, McNamee and Sämann (see [22]) have recently proved that a particular version of generalized Berezin quantization, which they call "Berezin–Bergmann quantization", provides a general framework for approaching the construction of fuzzy compact Kähler manifolds. Using this framework, the authors have proposed a general definition of fuzzy scalar field theory on compact Kähler manifolds.

The present paper is organized as follows. In Sec. 2, we first fix our notations and terminology. Then we recall some useful facts about a special class of Gelfand pairs. In Sec. 3, we provide explicit formulas concerning the decomposition into irreducibles of some tensor product representations of the groups $SO(2n+1)$ and $Sp(n)$ for $n \ge 2$ (see Corollary 1 and Proposition 2 below). These formulas play an important role in Sec. 4, which is essentially devoted to the proof of our main result.


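2. Preliminaries

2.1. Basic notions

Let $G$ be a compact connected semisimple Lie group with Lie algebra $\mathfrak{g}$. We denote by $\mathfrak{g}^{\mathbb{C}}$ the complexification of $\mathfrak{g}$ and by $G^{\mathbb{C}}$ the simply connected Lie group with Lie algebra $\mathfrak{g}^{\mathbb{C}}$. Let $T$ be a maximal torus in $G$ with Lie algebra $\mathfrak{h}$. The complexification $\mathfrak{h}^{\mathbb{C}}$ of $\mathfrak{h}$ is a Cartan subalgebra of $\mathfrak{g}^{\mathbb{C}}$. We denote by $\Delta$ the root system of $\mathfrak{g}^{\mathbb{C}}$ with respect to $\mathfrak{h}^{\mathbb{C}}$. We fix a lexicographic ordering on the dual $\mathfrak{h}_{\mathbb{R}}^* := (i\mathfrak{h})^*$ and we write $\Delta^+$ for the corresponding system of positive roots.

The Killing form $B$ on $\mathfrak{g}$ extends complex bilinearly to $\mathfrak{g}^{\mathbb{C}}$. It is easy to see that $B$ is positive definite on $\mathfrak{h}_{\mathbb{R}}$. For $\lambda \in \mathfrak{h}_{\mathbb{R}}^*$, let $H_\lambda$ be the element of $\mathfrak{h}_{\mathbb{R}}$ such that $\lambda(H) = B(H, H_\lambda)$ for all $H \in \mathfrak{h}_{\mathbb{R}}$. Thus we obtain a scalar product on $\mathfrak{h}_{\mathbb{R}}^*$ given by $\langle \lambda, \mu \rangle = B(H_\lambda, H_\mu)$.

Let $\Pi = \{\alpha_1, \dots, \alpha_l\}$ be the system of simple roots corresponding to $\Delta^+$. The elements $\varpi_{\alpha_j}$, $1 \le j \le l$, defined by
$$\frac{2\langle \varpi_{\alpha_j}, \alpha_k \rangle}{\langle \alpha_k, \alpha_k \rangle} = \delta_{j,k} \quad \text{for } 1 \le k \le l,$$
are called the fundamental weights attached to $\Pi$. To simplify notation, we set $\varpi_j := \varpi_{\alpha_j}$. The weight lattice is then given by
$$\Lambda = \Big\{ \lambda \in (\mathfrak{h}^{\mathbb{C}})^*;\ \lambda = \sum_{j=1}^{l} n_j \varpi_j,\ n_j \in \mathbb{Z} \Big\}.$$
The set of dominant weights is the cone
$$\Lambda_+ = \Big\{ \lambda \in (\mathfrak{h}^{\mathbb{C}})^*;\ \lambda = \sum_{j=1}^{l} n_j \varpi_j,\ n_j \in \mathbb{N}_0 \Big\} \subset \Lambda.$$

For each $\lambda \in \Lambda_+$, we denote by $\rho_\lambda$ the unique (up to equivalence) irreducible representation of $G$ with highest weight $\lambda$, acting in $V(\lambda)$. Let $\alpha_j \in \Pi$ be a simple root. The irreducible representation $\rho_{\varpi_j}$ is called the fundamental representation attached to $\alpha_j$.

Let now $K$ be a closed connected subgroup of $G$ with Lie algebra $\mathfrak{k}$. A dominant weight $\lambda \in \Lambda_+$ is called $K$-spherical if the subspace of $K$-fixed vectors in $V(\lambda)$ is one-dimensional. The corresponding representation $\rho_\lambda$ is then called $K$-spherical. We write $\Lambda_+^K$ for the subset of $K$-spherical dominant weights. If for every $\lambda \in \Lambda_+$ the subspace of $K$-fixed vectors in $V(\lambda)$ is at most one-dimensional, then the pair $(G, K)$ is called a Gelfand pair. In this case, the harmonic analysis of the square integrable functions on the homogeneous space $M = G/K$, endowed with the Haar measure, is given by
$$L^2(M) \cong \bigoplus_{\lambda \in \Lambda_+^K} V(\lambda).$$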


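2.2. A special class of Gelfand pairs

Let us keep the notations of the previous subsection. Let
$$\mathfrak{g}^{\mathbb{C}} = \mathfrak{h}^{\mathbb{C}} \oplus \bigoplus_{\alpha \in \Delta} \mathfrak{g}^{\mathbb{C}}_\alpha = \mathfrak{h}^{\mathbb{C}} \oplus \bigoplus_{\alpha \in \Delta} \mathbb{C} E_\alpha$$
be the standard root decomposition of $\mathfrak{g}^{\mathbb{C}}$. For a given subset $S \subset \Pi$, define the parabolic subalgebra
$$\mathfrak{p}_S := \mathfrak{h}^{\mathbb{C}} \oplus \bigoplus_{\alpha \in \Gamma_S} \mathfrak{g}^{\mathbb{C}}_\alpha,$$
where $\Gamma_S := \Delta^+ \cup \{\alpha \in \Delta;\ \alpha \in \mathrm{span}(S)\}$, and denote by $P_S$ the corresponding parabolic subgroup of $G^{\mathbb{C}}$. Let $\mathfrak{l}_S$ be the Levi factor of $\mathfrak{p}_S$,
$$\mathfrak{l}_S = \mathfrak{h}^{\mathbb{C}} \oplus \bigoplus_{\alpha \in \Gamma_S \cap (-\Gamma_S)} \mathfrak{g}^{\mathbb{C}}_\alpha,$$
and set $\mathfrak{k}_S := \mathfrak{p}_S \cap \mathfrak{g} = \mathfrak{l}_S \cap \mathfrak{g}$. Then $\mathfrak{k}_S$ is a compact real form of $\mathfrak{l}_S$. Setting $K_S := G \cap P_S$, we see that $K_S$ is a Lie subgroup of $G$ with Lie algebra $\mathfrak{k}_S$.

Assume furthermore that $(G, K)$ is a Gelfand pair and that there exists a subset $S \subset \Pi$ such that $\# S^c := \#(\Pi \setminus S) = 1$ and $\mathfrak{k} = \mathfrak{k}_S$. Note that the corresponding $P_S \subset G^{\mathbb{C}}$ is maximal parabolic and that the Dynkin diagram of $K$ can be obtained from the Dynkin diagram of $G$ by deleting one node. The simple root $\beta \in \Pi$ with $\Pi \setminus S = \{\beta\}$ is called the Gelfand node associated to the pair $(G, K)$.

The following important proposition characterizes a special class of compact Gelfand pairs.

Proposition 1 ([30, Proposition 4.7]). Let $G$ be a compact, connected, simply connected Lie group with Lie algebra $\mathfrak{g}$, and let $\mathfrak{p}$ be a standard maximal parabolic subalgebra of the complexified Lie algebra $\mathfrak{g}^{\mathbb{C}}$. Let $K \subset G$ be the connected Lie group with Lie algebra $\mathfrak{k} := \mathfrak{p} \cap \mathfrak{g}$. Then $(G, K)$ is a Gelfand pair if and only if one of the following three conditions is satisfied:
(i) $(G, K)$ is an irreducible compact Hermitian symmetric pair;
(ii) $(G, K) \cong (SO(2n+1), U(n))$ ($n \ge 2$);
(iii) $(G, K) \cong (Sp(n), U(1) \times Sp(n-1))$ ($n \ge 2$).

Let $(G, K)$ be a pair from the list (i)–(iii) above, and let $(\mathfrak{g}, \mathfrak{k})$ be the associated pair of Lie algebras. Then $\mathfrak{k} = \mathfrak{k}_S$ for some subset $S \subset \Pi$ with $\# S^c = 1$. Let $\beta \in \Pi$ be the associated Gelfand node with corresponding fundamental weight $\varpi := \varpi_\beta$. One can extend $\varpi$ complex linearly to $\mathfrak{g}^{\mathbb{C}}$ by setting $\varpi(E_\alpha) = 0$ for all $\alpha \in \Delta$.

The following fact is worth mentioning. Denote by $L$ the isotropy group of $\varpi$ under the coadjoint action of $G$, i.e.
$$L = \{g \in G;\ \mathrm{Ad}^*(g)\varpi = \varpi\}.$$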


Using the Killing form of $\mathfrak{g}$, we identify $\varpi$ with an element $Z \in \mathfrak{h}_{\mathbb{R}} = i\mathfrak{h}$. Thus we get $L = \{g \in G;\ \mathrm{Ad}(g)Z = Z\}$, and then
$$\mathfrak{l} := \mathrm{Lie}(L) = \{X \in \mathfrak{g};\ [X, Z] = 0\}.$$
In the standard root decomposition of $\mathfrak{g}^{\mathbb{C}}$, $E_\alpha$ commutes with $Z$ if and only if the root $\alpha$ is orthogonal to $\varpi$. Observe now that $\alpha$ is orthogonal to $\varpi$ if and only if $\alpha$ belongs to the set $\Gamma_S \cap (-\Gamma_S)$. This means that $\mathfrak{l}^{\mathbb{C}}$ is spanned by $\mathfrak{h}^{\mathbb{C}}$ and the $E_\alpha$'s with $\alpha \in \Gamma_S \cap (-\Gamma_S)$, and hence we get $\mathfrak{l}^{\mathbb{C}} = \mathfrak{k}^{\mathbb{C}}$. We conclude that $K = L$, which proves that the flag manifold $M = G/K$ can be identified with the $G$-orbit through $\varpi$ under the coadjoint representation.

3. Decomposition of Tensor Product Representations of the Groups SO(2n + 1) and Sp(n)

The goal of this section is to describe the decomposition into irreducibles of some particular tensor product representations of the special orthogonal Lie group $SO(2n+1)$ and the symplectic Lie group $Sp(n)$ for $n \ge 2$. We provide here explicit formulas that will be used in the proof of our main result in the next section. For a detailed exposition of the representation theory of $SO(2n+1)$ and $Sp(n)$, we refer to [19].

3.1. The case of the group SO(2n + 1)

The material of this subsection is not new but is worth summarizing in preparation of our main result. Let $E_{i,j} \in \mathrm{Mat}(2n+1, \mathbb{C})$ be the elementary matrix having 1 at the $(i, j)$-entry and 0 elsewhere. We take the standard Cartan subalgebra $\mathfrak{h}$ of the Lie algebra $\mathfrak{so}(2n+1)$ spanned by the matrices $(E_{2j-1,2j} - E_{2j,2j-1})$ for $1 \le j \le n$. Let $e_k$ be the linear form on the complexified Lie algebra $\mathfrak{h}^{\mathbb{C}}$ given by
$$e_k \begin{pmatrix} 0 & ih_1 & & & & \\ -ih_1 & 0 & & & & \\ & & \ddots & & & \\ & & & 0 & ih_n & \\ & & & -ih_n & 0 & \\ & & & & & 0 \end{pmatrix} = h_k$$
for $1 \le k \le n$. In the usual ordering on $\mathfrak{h}_{\mathbb{R}}^* := (i\mathfrak{h})^*$ and for $n \ge 2$, we have the following system of positive roots of the pair $(\mathfrak{so}(2n+1, \mathbb{C}), \mathfrak{h}^{\mathbb{C}})$:
$$\Delta^+ = \Delta^+(\mathfrak{so}(2n+1, \mathbb{C}), \mathfrak{h}^{\mathbb{C}}) = \{e_k \pm e_l,\ 1 \le k < l \le n\} \cup \{e_k,\ 1 \le k \le n\}.$$


The associated system of simple roots is then
$$\Pi = \{\alpha_k = e_k - e_{k+1},\ 1 \le k < n\} \cup \{\alpha_n = e_n\}.$$
Let us recall that:
(a) the fundamental weights are $\varpi_j = \sum_{k=1}^{j} e_k$ for $1 \le j \le n-1$ and $\varpi_n = \frac{1}{2} \sum_{k=1}^{n} e_k$;
(b) the weight lattice is $\Lambda = \{\sum_{k=1}^{n} \lambda_k e_k;\ \lambda_k \in \mathbb{Z}\ \forall k \text{ or } \lambda_k \in \mathbb{Z} + \frac{1}{2}\ \forall k\}$;
(c) a weight $\lambda = \sum_{k=1}^{n} \lambda_k e_k \in \Lambda$ is dominant if and only if $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0$;
(d) the fundamental representation attached to the simple root $\alpha_n = e_n$ is the so-called spin representation.

Given a dominant weight $\lambda = \sum_{k=1}^{n} \lambda_k e_k$ (or simply $\lambda = (\lambda_1, \dots, \lambda_n)$), we denote, as before, by $V(\lambda)$ the associated $SO(2n+1)$-irreducible module with highest weight $\lambda$. Let now $\lambda = s(e_1 + \cdots + e_n)$ and $\mu = t(e_1 + \cdots + e_n)$ be two "constant" dominant weights of $SO(2n+1)$ with $s, t \in \frac{1}{2}\mathbb{N}_0$ and $s \le t$. In [26, Theorem 2.5], Okada has proven the following multiplicity free decomposition formula
$$V(\lambda) \otimes V(\mu) \cong \bigoplus_{\nu \in P_{s,t}} V(\nu),$$
where
$$P_{s,t} = \{\nu = (\nu_1 + t - s, \dots, \nu_n + t - s);\ (\nu_1, \dots, \nu_n) \in \mathbb{N}_0^n,\ 2s \ge \nu_1 \ge \cdots \ge \nu_n \ge 0\}.$$
Since all representations of the group $SO(2n+1)$ are self-dual, we can deduce

Corollary 1. Let $\lambda = s(e_1 + \cdots + e_n)$ with $s \in \frac{1}{2}\mathbb{N}_0$. As $SO(2n+1)$-modules, we have
$$V(\lambda)^* \otimes V(\lambda) \cong \bigoplus_{\nu \in P_s} V(\nu),$$
where $P_s = \{\nu = (\nu_1, \dots, \nu_n) \in \mathbb{N}_0^n;\ 2s \ge \nu_1 \ge \cdots \ge \nu_n \ge 0\}$.

3.2. The case of the group Sp(n)

We begin this subsection by recalling some well-known facts about the representations of the compact Lie group $Sp(n)$. Let
$$\mathfrak{h} = \left\{ H = \begin{pmatrix} ih_1 & & & & & \\ & \ddots & & & & \\ & & ih_n & & & \\ & & & -ih_1 & & \\ & & & & \ddots & \\ & & & & & -ih_n \end{pmatrix};\ h_j \in \mathbb{R}\ \ \forall\, 1 \le j \le n \right\}$$


be the standard Cartan subalgebra of the Lie algebra $\mathfrak{sp}(n)$. Given an element $H \in \mathfrak{h}$ as above, we can simply write $H = \mathrm{diag}(ih_1, \dots, ih_n, -ih_1, \dots, -ih_n)$. Let $e_k$ be the linear form on $\mathfrak{h}^{\mathbb{C}}$ defined by
$$e_k(\mathrm{diag}(h_1, \dots, h_n, -h_1, \dots, -h_n)) = h_k,$$
where $1 \le k \le n$. For $n \ge 2$, we fix the following system of positive roots of the pair $(\mathfrak{sp}(n, \mathbb{C}), \mathfrak{h}^{\mathbb{C}})$:
$$\Delta^+ = \Delta^+(\mathfrak{sp}(n, \mathbb{C}), \mathfrak{h}^{\mathbb{C}}) = \{e_k \pm e_l,\ 1 \le k < l \le n\} \cup \{2e_k,\ 1 \le k \le n\}.$$
The associated system of simple roots is
$$\Pi = \{\alpha_k = e_k - e_{k+1},\ 1 \le k < n\} \cup \{\alpha_n = 2e_n\}.$$
Recall that:
(a) the fundamental weights are $\varpi_j = \sum_{k=1}^{j} e_k$ for $1 \le j \le n$;
(b) the weight lattice is $\Lambda = \{\sum_{k=1}^{n} \lambda_k e_k;\ \lambda_k \in \mathbb{Z}\ \forall k\}$;
(c) a weight $\lambda = \sum_{k=1}^{n} \lambda_k e_k \in \Lambda$ is dominant if and only if $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0$.

Next we are going to state Littelmann's rule, which describes the decomposition into irreducibles of the tensor product of two general $Sp(n)$-irreducible modules. To this end, we first briefly recall some basic terminology.

As usual, a partition is a non-increasing sequence $\lambda = (\lambda_1, \lambda_2, \dots)$ of non-negative integers. The depth $d(\lambda)$ of a partition $\lambda$ is the number of non-zero terms of $\lambda$. A partition $\lambda$ with depth $\le n$ is regarded as an element of $\mathbb{N}_0^n$.

Let $\lambda = (\lambda_1, \lambda_2, \dots, \lambda_d)$ be a partition of depth $d$. The Young diagram of $\lambda$ is a collection of left-justified rows of boxes with $\lambda_i$ boxes in the $i$th row for $1 \le i \le d$. A filling of the Young diagram of $\lambda$ with elements of the set $\{1, 2, \dots, n\}$ which is nondecreasing in rows and strictly increasing in the columns is called an $n$-semistandard (Young) tableau (or tableau for short) of shape $\lambda$. Given a tableau $T$, the filling of the box $(i, j)$ is denoted by $T_{i,j}$.

Let again $\lambda = (\lambda_1, \lambda_2, \dots, \lambda_d)$ be a partition of depth $d \le n$. A tableau $T$ of shape $\lambda$ is called a $(2n)$-symplectic tableau if its entries are elements of $\{1, \dots, 2n\}$ and if it obeys the additional constraint $T_{i,j} \ge 2i - 1$. These tableaux were introduced by King and El-Sharkaway [18]. Consider a $(2n)$-symplectic tableau $T$. The vector
$$\mathrm{con}(T) := (\#\{1\text{'s in } T\} - \#\{2\text{'s in } T\}, \dots, \#\{(2n-1)\text{'s in } T\} - \#\{(2n)\text{'s in } T\})$$
is called the content of $T$. We denote by $T(l)$ the tableau that consists of the last $l$ columns of $T$. Given a weight $\nu = \sum_{j=1}^{n} \nu_j e_j \in \Lambda$, we shall identify $\nu$ with the


element $(\nu_1, \dots, \nu_n) \in \mathbb{Z}^n$. Now we arrive at

Theorem 1 (Littelmann [23, Theorem (a), p. 346]). Let $\Lambda_+$ be the set of dominant weights of $Sp(n)$ with $n \ge 1$. For $\lambda, \mu \in \Lambda_+$, we have
$$V(\lambda) \otimes V(\mu) \cong \bigoplus_{T} V(\lambda + \mathrm{con}(T)),$$
where the sum is over all $(2n)$-symplectic tableaux of shape $\mu$ such that the weight $\lambda + \mathrm{con}(T(l))$ is dominant for all $l$.

Remark. In the formulation of Littelmann's rule stated above, we basically reproduced Krattenthaler's description (see [21, Appendix A6]) with a slight modification in the description of $(2n)$-symplectic tableaux, where we followed [14]. This formulation is more elementary and is convenient for clarifying our calculation.

Applying the above theorem in the case where $\lambda = \mu = (N, 0, \dots, 0)$, we obtain

Proposition 2. For $N \in \mathbb{N}_0$ and $n \ge 2$, we have
$$V((N, 0, \dots, 0)) \otimes V((N, 0, \dots, 0)) \cong \bigoplus_{\substack{k, l \in \mathbb{N}_0 \\ 0 \le k + l \le N}} V((2k + l, l, 0, \dots, 0)).$$
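A quick consistency check on Proposition 2 is to compare dimensions on both sides. The Python sketch below does this with the Weyl dimension formula for type $C_n$, using $\rho = (n, n-1, \dots, 1)$ and the positive roots $e_k \pm e_l$, $2e_k$ listed above. The function names are hypothetical helpers introduced for illustration, and a dimension match is only a necessary condition, not a proof of the decomposition.

```python
from fractions import Fraction
from itertools import combinations

def weyl_dim_sp(lam):
    """Dimension of the irreducible Sp(n)-module V(lam), lam = (lam_1, ..., lam_n)."""
    n = len(lam)
    rho = [n - i for i in range(n)]              # rho = (n, n-1, ..., 1)
    shifted = [a + r for a, r in zip(lam, rho)]  # lam + rho
    dim = Fraction(1)
    # roots e_i - e_j and e_i + e_j with i < j
    for i, j in combinations(range(n), 2):
        dim *= Fraction((shifted[i] - shifted[j]) * (shifted[i] + shifted[j]),
                        (rho[i] - rho[j]) * (rho[i] + rho[j]))
    # long roots 2 e_i
    for i in range(n):
        dim *= Fraction(shifted[i], rho[i])
    assert dim.denominator == 1
    return int(dim)

def proposition2_dims(n, N):
    """Return (dim of V(N,0,...,0) tensor V(N,0,...,0), sum of dims of the summands)."""
    padding = [0] * (n - 2)
    lhs = weyl_dim_sp([N] + [0] * (n - 1)) ** 2
    rhs = sum(weyl_dim_sp([2 * k + l, l] + padding)
              for k in range(N + 1) for l in range(N + 1 - k))
    return lhs, rhs

for n in (2, 3):
    for N in range(4):
        print(n, N, proposition2_dims(n, N))
```

For instance, with $n = 2$ and $N = 1$ the summands $V((0,0))$, $V((2,0))$ and $V((1,1))$ have dimensions 1, 10 and 5, which add up to $4^2 = 16$.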

Proof. If $N = 0$, then the proposition is obvious. Let us consider a $(2n)$-symplectic tableau $T$ of shape $\lambda = (N, 0, \dots, 0)$ with $N \in \mathbb{N}$. For $1 \le i \le 2n$, we set $k_i := \#\{i\text{'s in } T\}$. By definition of the $k_i$'s, we have $k_1 + k_2 + \cdots + k_{2n} = N$. Note that the content of $T$ is given by
$$\mathrm{con}(T) = (k_1 - k_2, k_3 - k_4, \dots, k_{2n-1} - k_{2n}).$$
Assume that $T$ satisfies the following property: $\lambda + \mathrm{con}(T(l)) \in \Lambda_+$ for all $l$. For $l = k_{2n}$, the content of the tableau $T(l)$ is
$$\mathrm{con}(T(l)) = (\underbrace{0, \dots, 0}_{n-1}, -k_{2n})$$
and so
$$\lambda + \mathrm{con}(T(l)) = (N, \underbrace{0, \dots, 0}_{n-2}, -k_{2n}).$$
Since $\lambda + \mathrm{con}(T(l)) \in \Lambda_+$, it follows that $k_{2n} = 0$. Next, we are going to prove that $k_i = 0$ for all $4 \le i \le 2n$. The case $n = 2$ is already proven. Assume $n \ge 3$, fix $4 \le i \le 2n$ and suppose that $k_j = 0$ for all $i + 1 \le j \le 2n$. We will prove that $k_i = 0$. For this we consider the following cases:

Case 1. If $i$ is even, then we have for $l = k_i$
$$\mathrm{con}(T(l)) = (\underbrace{0, \dots, 0}_{\frac{i-2}{2}}, -k_i, 0, \dots, 0).$$
The fact that $\lambda + \mathrm{con}(T(l)) \in \Lambda_+$ clearly forces $k_i = 0$.


Case 2. If $i$ is odd, then we have for $l = k_i$
$$\mathrm{con}(T(l)) = (\underbrace{0, \dots, 0}_{\frac{i-1}{2}}, k_i, 0, \dots, 0).$$
Since $\lambda + \mathrm{con}(T(l)) \in \Lambda_+$, we easily get $k_i = 0$.

We conclude that $k_i = 0$ for the fixed integer $i$. An induction on $i$ allows us to derive the equality $k_i = 0$ for all $4 \le i \le 2n$ with $n \ge 3$. Hence the claim is proven for $n \ge 2$. Consequently, we can write $\mathrm{con}(T) = (k_1 - k_2, k_3, 0, \dots, 0)$, where, of course, $k_1 + k_2 + k_3 = N$.

Conversely, if $T$ is a $(2n)$-symplectic tableau of shape $\lambda$ such that $\mathrm{con}(T) = (k_1 - k_2, k_3, 0, \dots, 0)$ with the $k_i$'s being defined as above, then one easily verifies that $\lambda + \mathrm{con}(T(l))$ is a dominant weight for all $l$. We deduce that
$$V((N, 0, \dots, 0)) \otimes V((N, 0, \dots, 0)) \cong \bigoplus_{\substack{k_1, k_2, k_3 \in \mathbb{N}_0 \\ k_1 + k_2 + k_3 = N}} V((N + k_1 - k_2, k_3, 0, \dots, 0)) \cong \bigoplus_{\substack{k, l \in \mathbb{N}_0 \\ 0 \le k + l \le N}} V((2k + l, l, 0, \dots, 0)).$$
This completes the proof of the proposition.

4. Fuzzy Versions of Certain Flag Manifolds

We shall freely use the notations introduced earlier. Let $(G, K)$ be a pair from the list (i)–(iii) in Proposition 1. The aim of this section is to construct a fuzzy version of the flag manifold $M = G/K$. As we mentioned before, our construction is based on the Berezin–Toeplitz quantization of such a manifold.

4.1. Quantum line bundle over M

Fix again a maximal torus $T$ in $G$ and let $\Delta$, $\Delta^+$ and $\Pi$ be as in Sec. 2. Let $\beta$ be the Gelfand node associated with $(G, K)$. If $\mathfrak{k}$ is the Lie algebra of $K$, then $\mathfrak{k} = \mathfrak{k}_S$ with $S = \Pi \setminus \{\beta\}$. We denote by $\Delta_1^+$ the set of positive roots corresponding to $S$. Then
$$\mathfrak{k}^{\mathbb{C}} = \mathfrak{h}^{\mathbb{C}} \oplus \bigoplus_{\alpha \in \Delta_1^+} (\mathfrak{g}^{\mathbb{C}}_\alpha \oplus \mathfrak{g}^{\mathbb{C}}_{-\alpha}).$$

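Setting
$$\mathfrak{n}^+ = \bigoplus_{\alpha \in \Delta^+ \setminus \Delta_1^+} \mathfrak{g}^{\mathbb{C}}_\alpha \quad \text{and} \quad \mathfrak{n}^- = \bigoplus_{\alpha \in \Delta^+ \setminus \Delta_1^+} \mathfrak{g}^{\mathbb{C}}_{-\alpha},$$
we get
$$\mathfrak{g}^{\mathbb{C}} = \mathfrak{h}^{\mathbb{C}} \oplus \bigoplus_{\alpha \in \Delta^+} (\mathfrak{g}^{\mathbb{C}}_\alpha \oplus \mathfrak{g}^{\mathbb{C}}_{-\alpha}) = \mathfrak{k}^{\mathbb{C}} \oplus \mathfrak{n}^+ \oplus \mathfrak{n}^-.$$
Define $N^+$ (respectively $N^-$) to be the connected subgroup of $G^{\mathbb{C}}$ with Lie algebra $\mathfrak{n}^+$ (respectively $\mathfrak{n}^-$). Note that
$$G/K \cong G^{\mathbb{C}}/K^{\mathbb{C}} N^+ \cong G^{\mathbb{C}}/K^{\mathbb{C}} N^-.$$
This shows that $M = G/K$ can be regarded as a complex manifold.

Let $\psi_\varpi \in V(\varpi)$ be a normalized highest weight vector, with weight $\varpi = \varpi_\beta$. Denote by $\chi_\varpi$ the unique holomorphic extension to $K^{\mathbb{C}} N^+$ of the character $e^{-\varpi}$. With these notations, we have for all $k \in K$
$$\rho_\varpi(k)\psi_\varpi = \chi_\varpi(k)^{-1} \psi_\varpi.$$
The line bundle $L = G \times_{e^{-\varpi}} \mathbb{C}$ over $M = G/K = G^{\mathbb{C}}/K^{\mathbb{C}} N^+$ is identified with $G^{\mathbb{C}} \times_{\chi_\varpi} \mathbb{C}$, and then it is seen as a holomorphic line bundle. Note that every holomorphic line bundle over $M$ is of the form $L^m$ for some $m \in \mathbb{Z}$.

Let $H_N$ be the space of holomorphic sections of the line bundle $L^N := L^{\otimes N}$, $N \in \mathbb{N}$. By the Borel–Weil theorem (see, e.g., [1]), $H_N$ is an irreducible $G$-module with highest weight $N\varpi$. It follows that $H_N$ is isomorphic, as $G$-module, to the space $V(N\varpi)$. The algebra $A_N := \mathrm{End}_{\mathbb{C}}(H_N)$ admits a natural $G$-action and can be identified with the matrix algebra $\mathrm{Mat}(d_N, \mathbb{C})$, where $d_N := \dim_{\mathbb{C}} V(N\varpi)$.

Let $h$ be the Hermitian structure of the bundle $L \to M$ defined by
$$h([g, z], [g, z']) = z \bar{z}' \quad \text{for all } g \in G.$$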

We know that there exists a unique connection $\nabla$ on $L$ leaving $h$ invariant and satisfying $\nabla_X \psi = 0$ for each vector field $X$ of type $(0, 1)$ and for each local holomorphic section $\psi$. The curvature of $(L, \nabla)$ is the complex 2-form on $M$ given by

$$R(X, Y) := \nabla_X \nabla_Y - \nabla_Y \nabla_X - \nabla_{[X,Y]} = -i\omega(X, Y),$$

where $X, Y$ are smooth vector fields and $\omega$ is the $G$-invariant Kähler metric on $M$ (see, e.g., [3]). This shows that $(L, h, \nabla)$ is a quantum line bundle over $M$.

4.2. Berezin–Toeplitz quantization of M

Fix $N \in \mathbb{N}$. On the space $\Gamma^\infty(M, L^N)$ of smooth sections of $L^N$, we have the scalar product

$$\langle \varphi, \psi \rangle = \int_M h^N(\varphi(x), \psi(x))\, d\Omega(x),$$

where $d\Omega(x)$ is the normalized $G$-invariant measure associated to the metric $\omega$ on $M$. Let $L^2(M, L^N)$ be the $L^2$-completion of the space $\Gamma^\infty(M, L^N)$. We denote by $\Pi_N$


the orthogonal projection onto the subspace $H_N \subset L^2(M, L^N)$. Given a function $f$ in $C^\infty(M)$, one can define an operator on the space $H_N$ by

$$T_N(f) := \Pi_N \circ M_f,$$

where $M_f$ is the multiplication operator associated to $f$. The corresponding map $T_N : C^\infty(M) \to \mathrm{End}_{\mathbb C}(H_N) = A_N$ is called the Berezin–Toeplitz quantization map.

Let $P_N$ be the orthogonal projector onto the highest weight subspace of $V(N\varpi)$. One easily verifies that $\rho_N(g) P_N \rho_N(g)^{-1}$ is the projector onto the "coherent state" associated to $x = gK \in M$ (see [9]). Thus the coherent state map used in the Berezin–Toeplitz quantization of Kähler manifolds (see [8]) is here equal to

$$P_N : M = G/K \to \mathrm{End}_{\mathbb C}(H_N), \qquad gK \mapsto \rho_N(g) P_N \rho_N(g)^{-1},$$

and we get (see [29, Proposition 3.1]) the following expression for the Berezin–Toeplitz quantization map:

$$T_N(f) = (\dim_{\mathbb C} H_N) \int_M f(x) P_N(x)\, d\Omega(x).$$

From this expression, it is obvious that $T_N$ is $G$-equivariant. Using the fact that the map $T_N : C^\infty(M) \to A_N$ is surjective (see [7, Proposition 4.2]), one can deduce that the algebra $A_N$ is $G$-equivariantly isomorphic to a submodule of $L^2(M)$. As shown by Bordemann, Meinrenken and Schlichenmaier (see [7]), the maps $T_N$ have the correct semi-classical behavior for $N \to \infty$. In particular, the following results hold.

Theorem 2. For $f, h \in C^\infty(M)$, we have
(1) $\|T_N(f)\|_{op} \to \|f\|_\infty$ as $N \to \infty$;
(2) $\|T_N(fh) - T_N(f)T_N(h)\|_{op} \to 0$ as $N \to \infty$.
Here $\|\cdot\|_{op}$ is the operator norm on $A_N$ and $\|\cdot\|_\infty$ is the sup-norm on $C^\infty(M)$.

Remark. Let $l$ be a continuous length function on $G$ satisfying the condition $l(xyx^{-1}) = l(y)$ for all $x, y \in G$. Let $\delta$ be the action of $G$ on $A_N$ by conjugation by $\rho_N$. Then $l$ and $\delta$ determine a Lipschitz seminorm $L_N$ on $A_N$ by

$$L_N(A) = \sup\left\{ \frac{\|\delta_x(A) - A\|_{op}}{l(x)} \; ; \; x \neq e \right\},$$

where $e$ is the identity element of $G$. Let $C(G/K)$ be the $C^*$-algebra of continuous complex-valued functions on $G/K$. We denote by $\xi$ the action of $G$ on $G/K$ and on $C(G/K)$ by left translation. We can define a Lipschitz seminorm on $C(G/K)$ by

$$L_\infty(f) = \sup\left\{ \frac{\|\xi_x(f) - f\|_\infty}{l(x)} \; ; \; x \neq e \right\}.$$

Let us underline that the pairs $(A_N, L_N)$ and $(C(G/K), L_\infty)$ are "compact quantum metric spaces" in the sense defined by Rieffel in [27].


Motivated by the notion of Gromov–Hausdorff convergence of classical compact metric spaces, Rieffel gave in [27] a definition of a "quantum Gromov–Hausdorff distance" between two compact quantum metric spaces. Furthermore, he proved in [28] that the sequence $\{(A_N, L_N)\}_{N \geq 1}$ converges to $(C(G/K), L_\infty)$ for this distance as $N \to \infty$.

4.3. Fuzzy version of M

Now we are in position to prove our main result.

Theorem 3. Let $(G, K)$, $M$ and $A_N$ be as above. Then there exists a sequence $(E_N)_{N \geq 1}$ of $G$-invariant subspaces of $L^2(M)$ such that $E_N \subset E_{N+1}$ and $\bigcup_{N \geq 1} E_N$ is dense in $C^\infty(M)$, and such that $E_N$ is $G$-equivariantly isomorphic to the matrix algebra $A_N$.

Proof. If $(G, K)$ is an irreducible compact Hermitian symmetric pair, then the result of the theorem follows immediately by comparing Proposition 3.1 and Theorem 4.2 in the paper of Zhang mentioned in the introduction ([31]). Thus it suffices to prove the theorem in the following two cases.

Case 1. Assume that $(G, K) \simeq (SO(2n + 1), U(n))$ with $n \geq 2$. Let the notations of roots and weights be as in Sec. 3.1. The Gelfand node associated to the pair $(SO(2n + 1), U(n))$ is $\beta = \alpha_n = e_n$ and the fundamental weight attached to this simple root is $\varpi = \frac{1}{2}(e_1 + \cdots + e_n)$. Consider the holomorphic line bundle $L = SO(2n + 1) \times_{e^{-\varpi}} \mathbb{C}$ over the homogeneous space $SO(2n + 1)/U(n)$. As $SO(2n + 1)$-modules, $H_N = \Gamma_{hol}(L^N) \cong V(N\varpi)$ and $A_N \cong V(N\varpi)^* \otimes V(N\varpi)$ for $N \in \mathbb{N}$. Using the result of Corollary 1, one immediately has

$$A_N \cong \bigoplus_{\substack{\lambda = (\lambda_1, \ldots, \lambda_n) \in \mathbb{N}_0^n \\ N \geq \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n \geq 0}} V(\lambda).$$

On the other hand, an important result of Krämer (see [20, Table 1]) says that the $SO(2n + 1)$-module $L^2(SO(2n + 1)/U(n))$ decomposes into irreducibles as

$$L^2(SO(2n + 1)/U(n)) \cong \bigoplus_{\substack{\lambda = (\lambda_1, \ldots, \lambda_n) \in \mathbb{N}_0^n \\ \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n \geq 0}} V(\lambda).$$

Denote by $E_N$ the unique submodule of $L^2(SO(2n + 1)/U(n))$ such that

$$E_N \cong \bigoplus_{\substack{\lambda = (\lambda_1, \ldots, \lambda_n) \in \mathbb{N}_0^n \\ N \geq \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n \geq 0}} V(\lambda)$$

as $SO(2n+1)$-module. The sequence $(E_N)_{N \geq 1}$ satisfies the assertions of the theorem.
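As a sanity check on Case 1, the following minimal Python sketch enumerates the labels $\lambda$ with $N \geq \lambda_1 \geq \cdots \geq \lambda_n \geq 0$ and compares $\dim A_N = (\dim V(N\varpi))^2$ with the sum of the dimensions $\dim V(\lambda)$, using the Weyl dimension formula for $\mathfrak{so}(2n+1)$ in the $e_i$-basis. The function names `labels`, `weyl_dim_so_odd` and `check` are illustrative and not part of the paper; the sketch only tests the dimension count for small $n$ and $N$ and does not replace the proof.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def labels(n, N):
    """Highest weights (l1, ..., ln) with N >= l1 >= ... >= ln >= 0."""
    return [tuple(sorted(c, reverse=True))
            for c in combinations_with_replacement(range(N + 1), n)]


def weyl_dim_so_odd(lam):
    """Weyl dimension formula for so(2n+1) in the e_i-basis.

    Positive roots are e_i - e_j, e_i + e_j (i < j) and e_i, and
    rho = (n - 1/2, n - 3/2, ..., 1/2).
    """
    n = len(lam)
    rho = [Fraction(2 * (n - i) - 1, 2) for i in range(n)]

    def product(v):
        # Product over positive roots of the pairing with v.
        p = Fraction(1)
        for i in range(n):
            p *= v[i]
            for j in range(i + 1, n):
                p *= v[i] ** 2 - v[j] ** 2
        return p

    shifted = [Fraction(l) + r for l, r in zip(lam, rho)]
    return product(shifted) / product(rho)


def check(n, N):
    """Return (dim A_N, sum of dim V(lambda) over the labels of E_N)."""
    top = [Fraction(N, 2)] * n  # N * varpi = (N/2, ..., N/2)
    dim_A = weyl_dim_so_odd(top) ** 2
    dim_E = sum(weyl_dim_so_odd(lam) for lam in labels(n, N))
    return dim_A, dim_E
```

For instance, `check(2, 1)` compares the square of the dimension of the spin representation of $SO(5)$ with the dimensions of the three modules labelled by $(0,0)$, $(1,0)$ and $(1,1)$. The inclusion $E_N \subset E_{N+1}$ corresponds to `set(labels(n, N)) <= set(labels(n, N + 1))`.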


Case 2. Assume that $(G, K) \simeq (Sp(n), U(1) \times Sp(n - 1))$ with $n \geq 2$. In the notations of Sec. 3.2, the Gelfand node associated to the pair $(Sp(n), U(1) \times Sp(n - 1))$ is $\beta = \alpha_1 = e_1 - e_2$ and the fundamental weight attached to this simple root is $\varpi = e_1$. Consider the holomorphic line bundle $L = Sp(n) \times_{e^{-\varpi}} \mathbb{C}$ over the homogeneous space $Sp(n)/(U(1) \times Sp(n - 1))$ and take $H_N = \Gamma_{hol}(L^N)$ for $N \in \mathbb{N}$. As $Sp(n)$-modules, $H_N \cong V(N\varpi)$ and $A_N \cong V(N\varpi)^* \otimes V(N\varpi)$. Since the module $V(N\varpi)$ is self-dual, the result of Proposition 2 shows that

$$A_N \cong \bigoplus_{\substack{k, l \in \mathbb{N}_0 \\ 0 \leq k + l \leq N}} V((2k + l, l, 0, \ldots, 0)).$$

As in the previous case, the decomposition into irreducibles of the $Sp(n)$-module $L^2(Sp(n)/(U(1) \times Sp(n - 1)))$ is given by Krämer in [20, Table 1]. One has

$$L^2(Sp(n)/(U(1) \times Sp(n - 1))) \cong \bigoplus_{k, l \in \mathbb{N}_0} V((2k + l, l, 0, \ldots, 0)).$$

Denote by $E_N$ the unique submodule of $L^2(Sp(n)/(U(1) \times Sp(n - 1)))$ such that

$$E_N \cong \bigoplus_{\substack{k, l \in \mathbb{N}_0 \\ 0 \leq k + l \leq N}} V((2k + l, l, 0, \ldots, 0))$$

as $Sp(n)$-module. The sequence $(E_N)_{N \geq 1}$ satisfies the assertions of the theorem.

Finally, we observe that the analysis used in the proof of Theorem 3 directly implies the following result.

Proposition 3 (Compare [30, Proposition 4.8]). Let $(G, K)$ be a pair from the list (i)–(iii) in Proposition 1, and let $\beta \in \Pi$ be the associated Gelfand node with corresponding fundamental weight $\varpi := \varpi_\beta$. Then we have a multiplicity free decomposition of $G$-modules of the form

$$V(\varpi)^* \otimes V(\varpi) \cong \bigoplus_{i=0}^{r} V(\mu_i)$$

for certain $r \in \mathbb{N}$, where $\mu_0 := 0 \in \Lambda_+^K$ and $\{\mu_i\}_{1 \leq i \leq r}$ is a subset of the $K$-spherical dominant weights $\Lambda_+^K$. Furthermore, every $\lambda \in \Lambda_+^K$ can uniquely be written as an $\mathbb{N}_0$-linear combination of the $\mu_i$'s ($1 \leq i \leq r$).

Acknowledgments

I would like to express my gratitude to Tilmann Wurzbacher for suggesting the problem and for helpful discussions. I would also like to thank the anonymous referee for pointing out to me references [22, 28], and for remarks improving the article.


References

[1] D. N. Akhiezer, Lie Group Actions in Complex Analysis (Vieweg, Braunschweig, 1995).
[2] A. Y. Alekseev, A. Recknagel and V. Schomerus, Non-commutative world-volume geometries: Branes on SU(2) and fuzzy spheres, JHEP 09 (1999) 023.
[3] D. Arnal, M. Cahen and S. Gutt, Representations of compact Lie groups and quantization by deformation, Acad. Roy. Belg. Bull. Cl. Sci. (5) 74 (1988) 123–141.
[4] A. P. Balachandran, T. R. Govindarajan and B. Ydri, The Fermion doubling problem and noncommutative geometry, Mod. Phys. Lett. A 15 (2000) 1279–1286.
[5] A. P. Balachandran, B. P. Dolan, J. Lee, X. Martin and D. O'Connor, Fuzzy complex projective spaces and their star-products, J. Geom. Phys. 43 (2002) 184–204.
[6] M. Ben Halima and T. Wurzbacher, Fuzzy complex Grassmannians and quantization of line bundles, to appear in Abh. Math. Semin. Hamb. Univ.
[7] M. Bordemann, E. Meinrenken and M. Schlichenmaier, Toeplitz quantization of Kähler manifolds and gl(N), N → ∞ limits, Comm. Math. Phys. 165 (1994) 269–281.
[8] M. Cahen, S. Gutt and J. Rawnsley, Quantization of Kähler manifolds. I. Geometric interpretation of Berezin's quantization, J. Geom. Phys. 7 (1990) 45–62.
[9] M. Cahen, S. Gutt and J. Rawnsley, Quantization of Kähler manifolds. II, Trans. Amer. Math. Soc. 337 (1993) 73–98.
[10] B. P. Dolan and D. O'Connor, A fuzzy three sphere and fuzzy tori, JHEP 10 (2003) 060.
[11] B. P. Dolan and J. Olivier, Fuzzy complex Grassmannian spaces and their star products, Internat. J. Modern Phys. A 18 (2003) 1935–1958.
[12] M. R. Douglas and N. A. Nekrasov, Noncommutative field theory, Rev. Mod. Phys. 73 (2001) 977–1029.
[13] J. Fröhlich and K. Gawędzki, Conformal field theory and geometry of strings, in Mathematical Quantum Theory (Vancouver, 1993), Proceedings of the Conference on Mathematical Quantum Theory, Vancouver, Canada (Amer. Math. Soc., 1993), pp. 57–97.
[14] M. Fulmek and C. Krattenthaler, Lattice path proofs for determinantal formulas for symplectic and orthogonal characters, J. Combin. Theory Ser. A 77 (1997) 3–50.
[15] H. Grosse, C. Klimcik and P. Presnajder, Simple field theoretical models on noncommutative manifolds, in Lie Theory and Its Applications in Physics (Clausthal, 1995) (World Sci. Publishing, River Edge, NJ, 1996), pp. 117–131.
[16] H. Grosse and A. Strohmaier, Noncommutative geometry and the regularization problem of 4D quantum field theory, Lett. Math. Phys. 48 (1999) 163–179.
[17] Y. Hikida, M. Nozaki and Y. Sugawara, Formation of spherical 2D-brane from multiple D0-branes, Nucl. Phys. B 617 (2001) 117–150.
[18] R. C. King and N. G. I. El-Sharkaway, Standard Young tableaux and weight multiplicities of the classical Lie groups, J. Phys. A 16 (1983) 3153–3178.
[19] A. W. Knapp, Lie Groups Beyond an Introduction, 2nd edn. (Birkhäuser, Boston, 2002).
[20] M. Krämer, Sphärische Untergruppen in kompakten zusammenhängenden Liegruppen, Compositio Math. 38 (1979) 129–153.
[21] C. Krattenthaler, Identities for classical group characters of nearly rectangular shape, J. Algebra 209 (1998) 1–64.
[22] C. L. Lazaroiu, D. McNamee and C. Sämann, Generalized Berezin quantization, Bergmann metrics and fuzzy Laplacians, JHEP 09 (2008) 059.
[23] P. Littelmann, A generalization of the Littlewood–Richardson rule, J. Algebra 130 (1990) 328–368.
[24] J. Madore, The fuzzy sphere, Class. Quantum Grav. 9 (1992) 69–87.
[25] J. Madore, An Introduction to Noncommutative Differential Geometry and Its Physical Applications, 2nd edn. (Cambridge University Press, Cambridge, 1999).
[26] S. Okada, Applications of minor summation formulas to rectangular-shaped representations of classical groups, J. Algebra 205 (1998) 337–367.
[27] M. A. Rieffel, Gromov–Hausdorff distance for quantum metric spaces, Mem. Amer. Math. Soc. 168 (2004) 1–65.
[28] M. A. Rieffel, Matrix algebras converge to the sphere for quantum Gromov–Hausdorff distance, Mem. Amer. Math. Soc. 168 (2004) 67–91.
[29] M. Schlichenmaier, Berezin–Toeplitz quantization and Berezin symbols for arbitrary compact Kähler manifolds, in Coherent States, Quantization and Gravity (Bialowieza, 1998), Proc. XVII Workshop on Geometric Methods in Physics (Warsaw Univ. Press, 2001), pp. 45–56.
[30] J. V. Stokman and M. S. Dijkhuizen, Quantized flag manifolds and irreducible ∗-representations, Comm. Math. Phys. 203 (1999) 297–324.
[31] G. Zhang, Berezin transform on compact Hermitian symmetric spaces, Manuscripta Math. 97 (1998) 371–388.

June 2, 2010 14:55 WSPC/S0129-055X

148-RMP

J070-00403

Reviews in Mathematical Physics Vol. 22, No. 5 (2010) 549–596 c World Scientiﬁc Publishing Company DOI: 10.1142/S0129055X1000403X

ON THE FEYNMAN PATH INTEGRAL FOR NONRELATIVISTIC QUANTUM ELECTRODYNAMICS

WATARU ICHINOSE Department of Mathematical Science, Shinshu University, Matsumoto 390-8621, Japan [email protected] Received 17 March 2008 Revised 26 March 2010 The Feynman path integral for regularized nonrelativistic quantum electrodynamics is studied rigorously. We begin with the Lagrangian function of the corresponding classical mechanics and construct the Feynman path integral. In the present paper, the electromagnetic potentials are assumed to be periodic with respect to a large box and quantized through their Fourier coeﬃcients with large wave numbers cut oﬀ. Firstly, the Feynman path integral with respect to paths on the space of particles and vector potentials is deﬁned rigorously by means of broken line paths under the constraints. Secondly, the Feynman path integral with respect to paths on the space of particles and electromagnetic potentials is also deﬁned rigorously by means of broken line paths and piecewise constant paths without the constraints. This Feynman path integral is stated heuristically in Feynman and Hibbs’ book. Thirdly, the vacuum and the state of photons of given momenta and polarizations are expressed concretely as functions of variables consisting of the Fourier coeﬃcients of vector potentials. It is also proved rigorously in terms of distribution theory that the Coulomb potentials between charged particles naturally appear in the above Feynman path integral approach. This shows that the photons give rise to the Coulomb force. Keywords: Feynman path integral; quantum electrodynamics. Mathematics Subject Classiﬁcation 2010: 81S40, 58D30

1. Introduction A number of mathematical results on the Feynman path integrals for quantum mechanics have been obtained. On the other hand, the author does not know any mathematical results on the Feynman path integrals for quantum electrodynamics (cf. [2, 23]), written as QED from now on. The Feynman path integral for the free relativistic scalar boson ﬁeld was deﬁned rigorously in terms of the inﬁnite dimensional Fresnel integral in [2]. The Chern– Simons functional integral was also deﬁned rigorously, associated with a principal


fiber bundle over $\mathbb{R}^3$ with structure group a compact connected Lie group, as an infinite dimensional distribution in terms of white noise analysis, and the applications of its functional integral to topological quantum field theory were given in [1].

In [27], the interaction of nonrelativistic particles with a scalar boson field was studied. There, the functional integral with respect to paths on the space of particles and the boson field was defined in terms of Markoff processes under the assumption that the mass divided by the imaginary unit and a coupling constant divided by the imaginary unit are positive. As will be seen in the present paper, particles interact with the boson field through the quantized vector potential in QED. On the other hand, in [27], particles interact with the boson field through the quantized scalar potential, where the vector potential disappears. This is the main difference between our result and Nelson's.

The spectra of Hamiltonian operators for nonrelativistic QED models have also been studied (cf. [12, 14, 32]). The Hamiltonian operators in these QED models are defined by means of the Coulomb potentials, and creation operators and annihilation operators acting on the bosonic Fock space $\bigoplus_{n=0}^{\infty} \otimes_s^n (L^2(\mathbb{R}^3) \oplus L^2(\mathbb{R}^3))$, defined dependently on an infrared and ultraviolet cut-off function in momentum space $\mathbb{R}^3$, where $L^2(\mathbb{R}^3)$ is the space of all square integrable functions in $\mathbb{R}^3$ and $L^2(\mathbb{R}^3) \oplus L^2(\mathbb{R}^3)$ expresses the space of all amplitudes of momentum of a single photon with polarizations. These QED models are simplified versions of those which are primarily intended in physics (cf. [10, 11, 29, 33]). A functional integral representation for the above nonrelativistic QED model with imaginary time was also obtained by Hiroshima [16] by means of the probabilistic method.

We can see from Theorem 3.1 in the present paper that the Hamiltonian operators in [12, 14, 16, 32] are formally like (3.10) in the present paper. Our presentation (3.10), however, is exhibited as a partial differential operator. In addition, as will be seen in Sec. 5, creation operators and annihilation operators with given momenta and polarizations acting on $\mathcal{S}'(\mathbb{R}^{4N})$ are defined and the Hamiltonian operator (3.10) can be written by means of these creation operators and annihilation operators, where $N$ is a positive integer determined from the regularization of QED, $\mathcal{S}(\mathbb{R}^{4N})$ denotes the Schwartz space of all rapidly decreasing functions in $\mathbb{R}^{4N}$ and $\mathcal{S}'(\mathbb{R}^{4N})$ is the dual space of $\mathcal{S}(\mathbb{R}^{4N})$. This description of the Hamiltonian operator is the one familiar in the heuristic presentations in physics (cf. [10, 11, 29, 33]).

It is well known that the only translation invariant σ-additive regular measure on a separable infinite dimensional Banach space is the identically zero measure (cf. [13, Chap. 4, Sec. 5, Theorem 4]). The measure defining heuristically the Feynman path integral is meant to be translation invariant (cf. [11, (7-29)]), so it cannot be realized as a σ-additive regular nontrivial measure. As is known (see, e.g., [2, 15, 23]), the Feynman path integral itself can be realized as a linear functional satisfying certain suitable continuity conditions.

Our aim in the present paper is to define rigorously the Feynman path integral for a regularized nonrelativistic QED (for a physical discussion of QED and its nonrelativistic version, see, e.g., [7, 8, 10, 11, 29]). We begin with the Lagrangian


function of the corresponding classical mechanics, unlike the models in [12, 14, 16, 32], and construct rigorously the Feynman path integral. Usually in physics, the Feynman path integral for nonrelativistic QED is only heuristically defined. In the present paper, electromagnetic potentials are assumed to be periodic with respect to a large box in $\mathbb{R}^3$ and quantized through their Fourier coefficients. We note that in the present paper, regrettably, the Fourier coefficients with large wave numbers need to be arbitrarily cut off (ultraviolet cut-off) and we do not take the limit of a box to $\mathbb{R}^3$. In this double sense, our model is regularized.

First, the mathematical definition of the Feynman path integral with respect to paths on the space of particles and vector potentials is given by means of broken line paths under the constraints, i.e. (2.20) in the present paper. These constraints are necessarily introduced in physics (cf., e.g., [11, (9-17)], [29, (A-7)], [32, (13.10)] and [33, (7.38)]) when electrodynamics is quantized from the classical mechanics. One reason for introducing the constraints is that a momentum canonically conjugate to the scalar potential is absent. See (2.3) in the present paper.

Secondly, without the constraints we give the mathematical definition of the Feynman path integral with respect to paths on the space of particles and electromagnetic potentials by means of broken line paths and piecewise constant paths. This Feynman path integral has been given heuristically by [11, (9-98)]. Our method of defining the Feynman path integral without the constraints is like the one we used before in [20] for defining the phase space Feynman path integral. That is, paths considered on the space of all scalar potentials are determined so that the derivatives of the Lagrangian function with respect to the variables of the scalar potential are piecewise constant (Remark 3.4). The author again emphasizes that no rigorous definition of [11, (9-98)] has been given before. So our result may be completely new. We note that our Feynman path integral with respect to paths on the space of particles and electromagnetic potentials can be proved to be equal to the Feynman path integral with respect to paths on the space of particles and vector potentials.

Thirdly, the vacuum and the states of photons with given momenta and polarizations are expressed concretely as functions of variables consisting of the Fourier coefficients of vector potentials. In [11], only the vacuum and the state of a photon with a momentum and a polarization are expressed concretely as functions. Generally, in physics the vacuum and the states of photons with given momenta and polarizations are not considered concretely but rather abstractly (cf. [29, 33]). To write down the state of photons concretely, we introduce creation operators and annihilation operators, which can be written concretely as first order partial differential operators, similarly as it is done in white noise analysis in [15]. The results stated above should have many applications, as heuristically suggested in [11, Chap. 9].

Fourthly, we show in terms of distribution theory that the Coulomb potentials between charged particles appear when the periods of the Fourier series tend to


infinity and the cut-off of the Fourier coefficients is removed. This result, which shows that photons yield the Coulomb force, is well known in physics (cf. [8, 11]). In the present paper, we give a rigorous proof of this fact in the frame of our model of regularized nonrelativistic QED.

The mathematical definition of the Feynman path integral for nonrelativistic QED with regularization is obtained by means of a somewhat delicate study of oscillatory integral operators, the abstract Ascoli–Arzelà theorem on the weighted Sobolev spaces and the uniqueness for the initial problem for the Schrödinger type equations as in [18–21].

The proof of expressing the vacuum and the states of photons with given momenta and polarizations concretely is as follows. We first define annihilation operators of photons with given momenta and polarizations by first order differential operators having the Fourier coefficients of vector potentials as variables. Creation operators of photons are defined as the adjoint operators of the annihilation operators. The vacuum is determined from the annihilation operators and the states of photons with given momenta and polarizations are determined from the vacuum by means of the creation operators. For the mathematics related to this see, e.g., [6]. This relies on formal considerations going back to [7].

The proof of the appearance of the Coulomb potentials between charged particles is given by proving the convergence theorem for the Riemann sum of an unbounded function as the discretization parameter in space tends to zero, which will be stated in Proposition 4.3 in the present paper.

Our plan in the present paper is as follows. Section 2 is devoted to preliminaries. In Sec. 3, the main results on the Feynman path integral for regularized nonrelativistic QED are stated. In Sec. 4, the appearance of the Coulomb potentials between charged particles is proved rigorously in our model. In Sec. 5, the vacuum and the states of photons with given momenta and polarizations are given concretely. Sections 6–9 are devoted to the proofs of the main results stated in Sec. 3.

2. Preliminaries

For a multi-index $\alpha = (\alpha_1, \ldots, \alpha_d)$ and $z = (z_1, \ldots, z_d) \in \mathbb{R}^d$, we write $|\alpha| = \sum_{j=1}^{d} \alpha_j$, $z^\alpha = z_1^{\alpha_1} \cdots z_d^{\alpha_d}$, $\partial_z^\alpha = (\partial/\partial z_1)^{\alpha_1} \cdots (\partial/\partial z_d)^{\alpha_d}$ and $\langle z \rangle = \sqrt{1 + |z|^2}$. Let $L^2 = L^2(\mathbb{R}^d)$ be the space of all square integrable functions in $\mathbb{R}^d$ with inner product $(\cdot, \cdot)$ and norm $\|\cdot\|$.

Let $T > 0$ be an arbitrary constant, $t \in [0, T]$ and $x \in \mathbb{R}^3$. We consider $n$ charged nonrelativistic particles $x^{(j)}(t) \in \mathbb{R}^3$ ($j = 1, 2, \ldots, n$) with mass $m_j > 0$ and charge $e_j \in \mathbb{R}$. Let $E(t, x) = (E_1(t, x), E_2(t, x), E_3(t, x)) \in \mathbb{R}^3$ be the electric strength and $B(t, x) = (B_1(t, x), B_2(t, x), B_3(t, x)) \in \mathbb{R}^3$ the magnetic strength. Then the classical equations of motion of $x^{(j)}(t)$ are given by

$$m_j \frac{d}{dt} \dot{x}^{(j)}(t) = e_j E(t, x^{(j)}(t)) + e_j \dot{x}^{(j)}(t) \times B(t, x^{(j)}(t)), \qquad \dot{x}^{(j)}(t) = \frac{d}{dt} x^{(j)}(t).$$


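Let $\phi(t, x) \in \mathbb{R}$ be a scalar potential and $A(t, x) \in \mathbb{R}^3$ a vector potential. We set $\vec{x}(t) := (x^{(1)}(t), \ldots, x^{(n)}(t)) \in \mathbb{R}^{3n}$, $\dot{\vec{x}}(t) := (\dot{x}^{(1)}(t), \ldots, \dot{x}^{(n)}(t)) \in \mathbb{R}^{3n}$. Then the Lagrangian function for particles and the electromagnetic field with the distributional charge density

$$\rho(t, x) = \sum_{j=1}^{n} e_j \delta(x - x^{(j)}(t)) \tag{2.1}$$

and the distributional current density

$$j(t, x) = \sum_{j=1}^{n} e_j \dot{x}^{(j)}(t) \delta(x - x^{(j)}(t)) \in \mathbb{R}^3 \tag{2.2}$$

is given in distributional sense by

$$\begin{aligned}
L\Big(t, \vec{x}, \dot{\vec{x}}, A, \dot{A}, \frac{\partial A}{\partial x}, \phi, \frac{\partial \phi}{\partial x}\Big)
&= \sum_{j=1}^{n} \frac{m_j}{2} |\dot{x}^{(j)}|^2 - \int \rho(t, x)\phi(t, x)\,dx + \frac{1}{c} \int j(t, x) \cdot A(t, x)\,dx \\
&\quad + \frac{1}{8\pi} \int_{\mathbb{R}^3} (|E(t, x)|^2 - |B(t, x)|^2)\,dx + C \\
&= \sum_{j=1}^{n} \Big\{ \frac{m_j}{2} |\dot{x}^{(j)}|^2 - e_j \phi(t, x^{(j)}) + \frac{1}{c} e_j \dot{x}^{(j)} \cdot A(t, x^{(j)}) \Big\} \\
&\quad + \frac{1}{8\pi} \int_{\mathbb{R}^3} (|E(t, x)|^2 - |B(t, x)|^2)\,dx + C
\end{aligned} \tag{2.3}$$

(cf. [11, 32]), where

$$E = -\frac{1}{c} \frac{\partial A}{\partial t} - \frac{\partial \phi}{\partial x}, \qquad B = \nabla \times A, \tag{2.4}$$

$\partial\phi/\partial x = (\partial\phi/\partial x_1, \partial\phi/\partial x_2, \partial\phi/\partial x_3)$ and $C$ is an indefinite constant. It seems that a nontrivial indefinite constant in (2.3) has not been explicitly discussed by anyone before (cf. [11, 29, 32]). As in [8, 10, 29] we consider a sufficiently large box

$$V = \Big[-\frac{L_1}{2}, \frac{L_1}{2}\Big] \times \Big[-\frac{L_2}{2}, \frac{L_2}{2}\Big] \times \Big[-\frac{L_3}{2}, \frac{L_3}{2}\Big] \subset \mathbb{R}^3.$$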


In the present paper, as variables we consider all periodic potentials $\phi(t, x)$ and $A(t, x)$ in $x \in \mathbb{R}^3$ with periods $L_1$, $L_2$ and $L_3$ satisfying

$$\nabla \cdot A(t, x) = 0 \quad \text{in } [0, T] \times \mathbb{R}^3 \quad \text{(the Coulomb gauge)} \tag{2.5}$$

and also

$$\int_V \phi(t, x)\,dx = 0, \qquad \int_V A(t, x)\,dx = 0. \tag{2.6}$$

Let $|V| = L_1 L_2 L_3$. We set

$$k := \Big(\frac{2\pi}{L_1} s_1, \frac{2\pi}{L_2} s_2, \frac{2\pi}{L_3} s_3\Big) \quad (s_1, s_2, s_3 = 0, \pm 1, \pm 2, \ldots). \tag{2.7}$$

Then, using the Gram and Schmidt method, we can easily determine $e_j(k) \in \mathbb{R}^3$ ($j = 1, 2$) such that $(e_1(k), e_2(k), k/|k|)$ for all $k \neq 0$ form a set of mutually orthogonal unit vectors in $\mathbb{R}^3$ and

$$e_j(-k) = -e_j(k) \quad (j = 1, 2) \tag{2.8}$$

(cf. [3, p. 448]). We fix these $e_j(k)$ hereafter. Noting (2.5) and (2.6), we can expand $\phi(t, x)$ and $A(t, x)$ formally into the Fourier series

$$A(x, \{a_{lk}(t)\}) = \frac{\sqrt{4\pi}}{|V|}\, c \sum_{k \neq 0} \{a_{1k}(t) e^{ik \cdot x} e_1(k) + a_{2k}(t) e^{ik \cdot x} e_2(k)\}, \tag{2.9}$$

$$\phi(x, \{\phi_k(t)\}) = \frac{1}{|V|} \sum_{k \neq 0} \phi_k(t) e^{ik \cdot x}. \tag{2.10}$$

Remark 2.1. Usually in the physical literature (cf. [11, 29]) the condition (2.6) is not stated clearly.

We write

$$a_{lk} =: \frac{a_{lk}^{(1)} - i a_{lk}^{(2)}}{\sqrt{2}} \quad (l = 1, 2), \tag{2.11}$$

$$\phi_k =: \phi_k^{(1)} - i \phi_k^{(2)}, \tag{2.12}$$

where $a_{lk}^{(i)} \in \mathbb{R}$ and $\phi_k^{(i)} \in \mathbb{R}$, and also the complex conjugate of $a_{lk}$ as $a_{lk}^*$. Since $A$ and $\phi$ are real valued, the relations

$$a_{l,-k}^{(1)} = -a_{lk}^{(1)}, \qquad a_{l,-k}^{(2)} = a_{lk}^{(2)}, \qquad \phi_{-k}^{(1)} = \phi_k^{(1)}, \qquad \phi_{-k}^{(2)} = -\phi_k^{(2)} \tag{2.13}$$


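hold from (2.8). So, from (2.9) and (2.10), we have

$$A(x, \{a_{lk}\}) = \frac{\sqrt{4\pi}}{|V|}\, c \sum_{k \neq 0} \sum_{l=1}^{2} \frac{1}{\sqrt{2}} \big(a_{lk}^{(1)} \cos k \cdot x + a_{lk}^{(2)} \sin k \cdot x\big)\, e_l(k), \tag{2.14}$$

$$\phi(x, \{\phi_k\}) = \frac{1}{|V|} \sum_{k \neq 0} \big(\phi_k^{(1)} \cos k \cdot x + \phi_k^{(2)} \sin k \cdot x\big). \tag{2.15}$$

We also write

$$\rho_k^{(1)}(\vec{x}) := \sum_{j=1}^{n} e_j \cos k \cdot x^{(j)}, \tag{2.16}$$

$$\rho_k^{(2)}(\vec{x}) := \sum_{j=1}^{n} e_j \sin k \cdot x^{(j)}. \tag{2.17}$$

Determining the constant $C$ in the Lagrangian function (2.3) formally as the infinite constant

$$\frac{2\pi}{|V|} \sum_{j=1}^{n} e_j^2 \sum_{k \neq 0} \frac{1}{|k|^2} + 2 \sum_{k \neq 0} \frac{\hbar c|k|}{2}, \tag{2.18}$$

we can write $L$ from (2.3) by means of (2.4), (2.9), (2.10) and (2.15) as

$$\begin{aligned}
L(\vec{x}, \dot{\vec{x}}, \{a_{lk}\}, \{\dot{a}_{lk}\}, \{\phi_k\})
&= \sum_{j=1}^{n} \frac{m_j}{2} |\dot{x}^{(j)}|^2 + \frac{1}{8\pi|V|} \sum_{k \neq 0} \Big\{ \sum_{i=1}^{2} \big(|k|^2 (\phi_k^{(i)})^2 - 8\pi \rho_k^{(i)}(\vec{x}) \phi_k^{(i)}\big) + 16\pi^2 \frac{\sum_{j=1}^{n} e_j^2}{|k|^2} \Big\} \\
&\quad + \frac{1}{c} \sum_{j=1}^{n} e_j \dot{x}^{(j)} \cdot A(x^{(j)}, \{a_{lk}\}) \\
&\quad + \frac{1}{2} \sum_{k \neq 0, i, l} \Big\{ \frac{(\dot{a}_{lk}^{(i)})^2}{2|V|} - \frac{(c|k|)^2 (a_{lk}^{(i)})^2}{2|V|} + \frac{\hbar c|k|}{2} \Big\}.
\end{aligned} \tag{2.19}$$

The reason why we have chosen the indefinite constant $C$ in (2.3) in the way given by (2.18) will be explained in Remark 5.1.

Remark 2.2. If we do not assume (2.6), we must add $(-1/|V|)\big(\sum_{j=1}^{n} e_j\big)\phi_0^{(1)}$ and $\sum_{i,l=1,2} (\dot{a}_{l0}^{(i)})^2/(4|V|)$ to (2.19).

If we take into account the constraints $\nabla \cdot E = 4\pi\rho$ as in [11, (9-17)] and [33, (7.38)], we have

$$|k|^2 \phi_k^{(i)} = 4\pi \rho_k^{(i)}(\vec{x}) \quad (i = 1, 2,\; k \neq 0) \tag{2.20}$$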


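and $\sum_{j=1}^{n} e_j = 0$ formally from (2.1), (2.4) and (2.5). But, in the present paper, we adopt only (2.20) as constraints. Then from (2.16) and (2.17), we have

$$\begin{aligned}
\sum_{i=1}^{2} \big(|k|^2 (\phi_k^{(i)})^2 - 8\pi \rho_k^{(i)}(\vec{x}) \phi_k^{(i)}\big) + 16\pi^2 \frac{\sum_{j=1}^{n} e_j^2}{|k|^2}
&= -\frac{16\pi^2}{|k|^2} \Big\{ (\rho_k^{(1)})^2 + (\rho_k^{(2)})^2 - \sum_{j=1}^{n} e_j^2 \Big\} \\
&= -\frac{16\pi^2}{|k|^2} \sum_{j,l=1,\, j \neq l}^{n} e_j e_l\, e^{ik \cdot x^{(j)}} e^{-ik \cdot x^{(l)}} \\
&= -\frac{16\pi^2}{|k|^2} \sum_{j,l=1,\, j \neq l}^{n} e_j e_l \cos k \cdot (x^{(j)} - x^{(l)}).
\end{aligned} \tag{2.21}$$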
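The identity (2.21) is elementary but easy to get wrong by a sign or a factor, so a numerical check may be helpful. The following Python sketch (function names `rho`, `lhs` and `rhs` are illustrative, not from the paper) inserts the constraint (2.20) into the left-hand side and compares it with the pairwise cosine sum on the right-hand side for given charges, positions and a wave vector $k \neq 0$.

```python
import math


def rho(k, xs, charges):
    """(rho_k^(1), rho_k^(2)) from (2.16) and (2.17)."""
    phases = [sum(ki * xi for ki, xi in zip(k, x)) for x in xs]
    r1 = sum(e * math.cos(p) for e, p in zip(charges, phases))
    r2 = sum(e * math.sin(p) for e, p in zip(charges, phases))
    return r1, r2


def lhs(k, xs, charges):
    """Left side of (2.21) after inserting phi_k^(i) = 4*pi*rho_k^(i)/|k|^2."""
    k2 = sum(ki * ki for ki in k)
    total = 0.0
    for r in rho(k, xs, charges):
        phi = 4.0 * math.pi * r / k2  # constraint (2.20)
        total += k2 * phi ** 2 - 8.0 * math.pi * r * phi
    return total + 16.0 * math.pi ** 2 * sum(e * e for e in charges) / k2


def rhs(k, xs, charges):
    """Right side of (2.21): the pairwise cosine sum over j != l."""
    k2 = sum(ki * ki for ki in k)
    pair_sum = 0.0
    for j, (ej, xj) in enumerate(zip(charges, xs)):
        for l, (el, xl) in enumerate(zip(charges, xs)):
            if j != l:
                phase = sum(ki * (a - b) for ki, a, b in zip(k, xj, xl))
                pair_sum += ej * el * math.cos(phase)
    return -16.0 * math.pi ** 2 * pair_sum / k2
```

The self-interaction terms $j = l$ are exactly what the constant $16\pi^2 \sum_j e_j^2/|k|^2$ removes, which is why the double sum in `rhs` skips them.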

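So we get

$$\begin{aligned}
L_c(\vec{x}, \dot{\vec{x}}, \{a_{lk}\}, \{\dot{a}_{lk}\})
&= \sum_{j=1}^{n} \frac{m_j}{2} |\dot{x}^{(j)}|^2 - \frac{2\pi}{|V|} \sum_{k \neq 0} \sum_{j,l=1,\, j \neq l}^{n} \frac{e_j e_l \cos k \cdot (x^{(j)} - x^{(l)})}{|k|^2} \\
&\quad + \frac{1}{c} \sum_{j=1}^{n} e_j \dot{x}^{(j)} \cdot A(x^{(j)}, \{a_{lk}\}) \\
&\quad + \frac{1}{2} \sum_{k \neq 0, i, l} \Big\{ \frac{(\dot{a}_{lk}^{(i)})^2}{2|V|} - \frac{(c|k|)^2 (a_{lk}^{(i)})^2}{2|V|} + \frac{\hbar c|k|}{2} \Big\}.
\end{aligned} \tag{2.22}$$

We introduce the weighted Sobolev spaces

$$B^a(\mathbb{R}^d) := \Big\{ f \in L^2 ;\; \|f\|_{B^a} := \|f\| + \sum_{|\alpha| = a} \big(\|z^\alpha f\| + \|\partial_z^\alpha f\|\big) < \infty \Big\} \quad (a = 1, 2, \ldots).$$

Let $B^{-a}(\mathbb{R}^d)$ denote their dual spaces. We set $B^0 := L^2$. Let $\chi \in C^\infty(\mathbb{R}^d)$ with compact support such that $\chi(0) = 1$. We define the oscillatory integral $\text{Os-}\!\int g(\cdot, z')\,dz'$ by $\lim_{\epsilon \to 0} \int \chi(\epsilon z') g(\cdot, z')\,dz'$ independently of the choice of $\chi$ pointwise, in the topology of $B^a(\mathbb{R}^d)$ or in the topology in $\mathcal{S}(\mathbb{R}^d)$ (cf. [24]) for a function $g(z, z')$ in $\mathbb{R}^d \times \mathbb{R}^d$, provided the integral involving $\chi$ exists in Lebesgue sense for any $\epsilon > 0$.

3. Main Results

We arbitrarily cut off the terms of large wave numbers $k$ in (2.22). That is, let $M_j$ ($j = 1, 2, 3$) be arbitrary positive integers such that $M_2 \leq M_3$. We consider

$$\Lambda_j := \Big\{ k = \Big(\frac{2\pi}{L_1} s_1, \frac{2\pi}{L_2} s_2, \frac{2\pi}{L_3} s_3\Big) ;\; s_1^2 + s_2^2 + s_3^2 \neq 0,\; |s_1|, |s_2|, |s_3| \leq M_j \Big\}. \tag{3.1}$$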


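Then we can determine $\Lambda'_j$ ($j = 1, 2, 3$) such that

$$\Lambda_j = \Lambda'_j \cup (-\Lambda'_j), \qquad \Lambda'_j \cap (-\Lambda'_j) = \text{empty set}, \qquad \Lambda'_2 \subseteq \Lambda'_3 \tag{3.2}$$

and fix $\Lambda'_j$ hereafter. Let $N_j$ denote the number of elements of the set $\Lambda'_j$. It follows from (2.13) that $a_{\Lambda'_j} := \{a_{lk}^{(i)}\}_{k \in \Lambda'_j, i, l} \in \mathbb{R}^{4N_j}$ are independent variables (cf. [32, p. 154]). We also introduce cut-off functions $g(x) \in C^\infty(\mathbb{R}^3)$ and $\psi(\theta) \in C^\infty(\mathbb{R})$. We consider

$$\begin{aligned}
\tilde{L}_c(\vec{x}, \dot{\vec{x}}, \{a_{lk}\}, \{\dot{a}_{lk}\})
&:= \sum_{j=1}^{n} \frac{m_j}{2} |\dot{x}^{(j)}|^2 - \frac{2\pi}{|V|} \sum_{k \in \Lambda_1} \sum_{j,l=1,\, j \neq l}^{n} \frac{e_j e_l \cos k \cdot (x^{(j)} - x^{(l)})}{|k|^2} \\
&\quad + \frac{1}{c} \sum_{j=1}^{n} e_j \dot{x}^{(j)} \cdot \tilde{A}(x^{(j)}, a_{\Lambda'_2}) \\
&\quad + \frac{1}{2} \sum_{k \in \Lambda_3, i, l} \Big\{ \frac{(\dot{a}_{lk}^{(i)})^2}{2|V|} - \frac{(c|k|)^2 (a_{lk}^{(i)})^2}{2|V|} + \frac{\hbar c|k|}{2} \Big\}
\end{aligned} \tag{3.3}$$

in place of $L_c$ given by (2.22), where $A$ given by (2.14) is replaced with

$$\tilde{A}(x, a_{\Lambda'_2}) = \frac{\sqrt{4\pi}}{|V|}\, c\, g(x) \sum_{k \in \Lambda_2} \sum_{l=1}^{2} \frac{1}{\sqrt{2}} \big(\psi(a_{lk}^{(1)}) \cos k \cdot x + \psi(a_{lk}^{(2)}) \sin k \cdot x\big)\, e_l(k). \tag{3.4}$$

We assume $\psi(-\theta) = -\psi(\theta)$ ($\theta \in \mathbb{R}$). For the sake of simplicity we write $\Lambda' := \Lambda'_3$ and $N := N_3$. We consider a subdivision

$$\Delta : 0 = \tau_0 < \tau_1 < \cdots < \tau_\nu = T, \qquad |\Delta| := \max_{1 \leq l \leq \nu} (\tau_l - \tau_{l-1})$$

of $[0, T]$. Let $\vec{x} \in \mathbb{R}^{3n}$ and $a_{\Lambda'} \in \mathbb{R}^{4N}$ be fixed. We take arbitrarily

$$\vec{x}^{(0)}, \ldots, \vec{x}^{(\nu-1)} \in \mathbb{R}^{3n} \quad \text{and} \quad a_{\Lambda'}^{(0)}, \ldots, a_{\Lambda'}^{(\nu-1)} \in \mathbb{R}^{4N}.$$

Then, we write the oriented broken line path on $[0, T]$ connecting $\vec{x}^{(l)}$ at $\theta = \tau_l$ ($l = 0, 1, \ldots, \nu$, $\vec{x}^{(\nu)} = \vec{x}$) by $q_\Delta(\theta) \in \mathbb{R}^{3n}$. Of course, $dq_\Delta(\theta)/d\theta =: \dot{q}_\Delta(\theta)$ in distributional sense is in $L^2([0, T])$. In the same way we define the broken line path $a_{\Lambda'\Delta}(\theta) \in \mathbb{R}^{4N}$ on $[0, T]$ for $a_{\Lambda'}^{(0)}, \ldots, a_{\Lambda'}^{(\nu-1)}$ and $a_{\Lambda'}$. We define $a_{\Lambda\Delta}(\theta) \in \mathbb{R}^{8N}$ by means of (2.13). We write the classical action

$$S_c(T, 0; q_\Delta, a_{\Lambda\Delta}) = \int_0^T \tilde{L}_c(q_\Delta(\theta), \dot{q}_\Delta(\theta), a_{\Lambda\Delta}(\theta), \dot{a}_{\Lambda\Delta}(\theta))\,d\theta. \tag{3.5}$$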
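To make the notion of a broken line path concrete, here is a minimal Python sketch of the piecewise linear interpolation through prescribed points at the subdivision times, together with the mesh $|\Delta|$. It is an illustration only: the names `mesh`, `broken_line` and `kinetic_action` are hypothetical, and `kinetic_action` evaluates just the free kinetic term $\int_0^T \frac{m}{2}|\dot{q}_\Delta|^2\,d\theta$ of a single particle, not the full action (3.5).

```python
from bisect import bisect_right


def mesh(taus):
    """|Delta| = max_l (tau_l - tau_{l-1}) for 0 = tau_0 < ... < tau_nu = T."""
    return max(b - a for a, b in zip(taus, taus[1:]))


def broken_line(taus, points):
    """Return q(theta), the piecewise linear path with q(tau_l) = points[l].

    points[l] is a tuple of coordinates; points[-1] plays the role of the
    fixed end point x = x^(nu).
    """
    if len(taus) != len(points):
        raise ValueError("need one point per subdivision time")

    def q(theta):
        # Index l of the interval [tau_{l-1}, tau_l] containing theta.
        l = min(max(bisect_right(taus, theta), 1), len(taus) - 1)
        t0, t1 = taus[l - 1], taus[l]
        w = (theta - t0) / (t1 - t0)
        return tuple((1.0 - w) * a + w * b
                     for a, b in zip(points[l - 1], points[l]))

    return q


def kinetic_action(taus, points, mass=1.0):
    """Free part of the action along the broken line: sum m|dx|^2 / (2 dt)."""
    total = 0.0
    for l in range(1, len(taus)):
        dt = taus[l] - taus[l - 1]
        dx2 = sum((b - a) ** 2 for a, b in zip(points[l - 1], points[l]))
        total += mass * dx2 / (2.0 * dt)
    return total
```

Because the velocity is constant on each subinterval, the kinetic term reduces to a finite sum, which is the reason broken line paths turn the path integral into a finite dimensional (oscillatory) integral over the intermediate points.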


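Let $\rho^* > 0$ be the constant, which will be defined from $\Lambda_1$, $\Lambda_2$ and $\Lambda_3$ in Proposition 7.2 of the present paper. See also Remark 7.1. Then we have

Theorem 3.1. We assume for cut-off functions $g(x)$ and $\psi(\theta)$ in (3.4) that for any $l = 1, 2, \ldots$ and any multi-index $\alpha$ there exist constants $\delta_l > 0$ and $\delta_\alpha > 0$ satisfying

$$|\partial_\theta^l \psi(\theta)| \leq C_l \langle\theta\rangle^{-(1+\delta_l)}, \quad \theta \in \mathbb{R} \tag{3.6}$$

and

$$|\partial_x^\alpha g(x)| \leq C_\alpha \langle x \rangle^{-(1+\delta_\alpha)}, \quad x \in \mathbb{R}^3. \tag{3.7}$$

Let $|\Delta| \leq \rho^*$ and $f(\vec{x}, a_{\Lambda'}) \in B^a(\mathbb{R}^{3n+4N})$ ($a = 0, 1, 2, \ldots$). Then,

$$\begin{aligned}
&\prod_{l=1}^{\nu} \Bigg[ \prod_{j=1}^{n} \sqrt{\frac{m_j}{2\pi i \hbar (\tau_l - \tau_{l-1})}}^{\,3} \Bigg] \sqrt{\frac{1}{2\pi i \hbar |V| (\tau_l - \tau_{l-1})}}^{\,4N} \\
&\quad \times \text{Os-}\!\int \cdots \int \big(\exp i\hbar^{-1} S_c(T, 0; q_\Delta, a_{\Lambda\Delta})\big) f(q_\Delta(0), a_{\Lambda'\Delta}(0))\, d\vec{x}^{(0)} \cdots d\vec{x}^{(\nu-1)}\, da_{\Lambda'}^{(0)} \cdots da_{\Lambda'}^{(\nu-1)}
\end{aligned} \tag{3.8}$$

is well defined in $B^a(\mathbb{R}^{3n+4N})$, which we write as $(C_\Delta(T, 0)f)(\vec{x}, a_{\Lambda'})$ or

$$\iint \big(\exp i\hbar^{-1} S_c(T, 0; q_\Delta, a_{\Lambda\Delta})\big) f(q_\Delta(0), a_{\Lambda'\Delta}(0))\, \mathcal{D}q_\Delta\, \mathcal{D}a_{\Lambda'\Delta}.$$

In addition, as $|\Delta|$ tends to 0, the function $(C_\Delta(T, 0)f)(\vec{x}, a_{\Lambda'})$ converges to a limit which we call the Feynman path integral

$$\iint \big(\exp i\hbar^{-1} S_c(T, 0; q, a_\Lambda)\big) f(q(0), a_{\Lambda'}(0))\, \mathcal{D}q\, \mathcal{D}a_{\Lambda'}$$

in $B^a(\mathbb{R}^{3n+4N})$. We can also see that this limit is $B^a$-valued continuous and $B^{a-2}$-valued continuously differentiable in $T \in (0, \infty)$, and satisfies the Schrödinger type equation

$$i\hbar \frac{\partial}{\partial t} u(t) = H(t)u(t) \tag{3.9}$$

with $u(0) = f$, where

$$\begin{aligned}
H(t) &= \sum_{j=1}^{n} \frac{1}{2m_j} \Big( \frac{\hbar}{i} \frac{\partial}{\partial x^{(j)}} - \frac{e_j}{c} \tilde{A}(x^{(j)}, a_{\Lambda'_2}) \Big)^2 + \frac{2\pi}{|V|} \sum_{k \in \Lambda_1} \sum_{j,l=1,\, j \neq l}^{n} \frac{e_j e_l \cos k \cdot (x^{(j)} - x^{(l)})}{|k|^2} \\
&\quad + \sum_{k \in \Lambda', i, l} \Big\{ \frac{|V|}{2} \Big( \frac{\hbar}{i} \frac{\partial}{\partial a_{lk}^{(i)}} \Big)^2 + \frac{(c|k|)^2}{2|V|} (a_{lk}^{(i)})^2 - \frac{\hbar c|k|}{2} \Big\}.
\end{aligned} \tag{3.10}$$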

June 2, 2010 14:55 WSPC/S0129-055X

148-RMP

J070-00403

Feynman Path Integral for Nonrelativistic Quantum Electrodynamics

559

Remark 3.1. Let us determine the indeﬁnite constant C in (2.3) by n c|k| 2π 2 1 ej + 2 |V | j=1 |k|2 2 k∈Λ1

k∈Λ3

and cut off the terms of large wave numbers $k$ in (2.19) by introducing $\Lambda_j$ ($j = 1, 2, 3$). Then we get (3.3) again, taking into account the constraints (2.20).

Remark 3.2. Let $0 < \epsilon \le 1$ and $g_\epsilon(x) \in C^\infty(\mathbb{R}^3)$ satisfy (3.7) for all $\alpha$. Let $U_\epsilon(t, 0)f$ ($0 \le t \le T$) denote the Feynman path integral defined in Theorem 3.1 for $f \in B^a(\mathbb{R}^{3n+4N})$. Suppose that $\partial_x^\alpha g_\epsilon(x)$ are uniformly bounded with respect to $0 < \epsilon \le 1$ in $\mathbb{R}^{3n}$ for all $\alpha$ and that $\partial_x^\alpha g_\epsilon(x)$ converges to $\partial_x^\alpha 1$ pointwise in $\mathbb{R}^{3n}$ for all $\alpha$ as $\epsilon$ tends to zero. Then we can prove that as $\epsilon$ tends to zero, $U_\epsilon(t, 0)f$ converges to the solution of (3.9) with $u(0) = f$, where $g(x)$ in (3.4) is replaced with 1, in $B^a$ uniformly in $t \in [0, T]$. In this way we can remove the cut-off function $g(x)$ in (3.4). This result will be published in [22].

Remark 3.3. Let $0 \le t_0 \le t \le T$. For $f \in B^a(\mathbb{R}^{3n+4N})$ ($a = 0, 1, 2, \ldots$) we define $C_\Delta(t, t_0)f$ with $C_\Delta(t_0, t_0)f = f$ as in (3.8). See (9.3) in the present paper for the precise definition. As will be seen from the proof of Theorem 3.1 of the present paper, under the assumptions of Theorem 3.1 $(C_\Delta(t, t_0)f)(\vec x, a_{\Lambda'})$ is well defined in $B^a$ and $\lim_{|\Delta|\to 0}C_\Delta(t, t_0)f$ exists in $B^a$ uniformly in $0 \le t_0 \le t \le T$, which satisfies the Schrödinger type equation (3.9) with $u(t_0) = f$.

In place of $L$ expressed by (2.19) we consider $\tilde L(\vec x, \dot{\vec x}, \{a_{lk}\}, \{\dot a_{lk}\}, \{\phi_k\})$

n mj j=1

2

|x˙ (j) |2 +

n

+

e2j

1 8π|V |

  

  2   k∈Λ1  i=1

(i)

j=1

(i)

1 ˜ (j) , aΛ ) ej x˙ (j) · A(x 2 c j=1 n

+

  |k|2 (i) (i) (c|k|)2 (alk )2 (a˙ lk )2 c|k| 1 − + + 2 2|V | 2|V | 2 16π 2

(i)

(|k|2 (φk )2 − 8πρk (x)φk )

(3.11)

k∈Λ3 ,i,l

by means of (3.4) as in L˜c . Let q∆ (θ) ∈ R3n , aΛ ∆ (θ) ∈ R4N and aΛ∆ (θ) ∈ R8N be the broken line paths (1) (2) (0) (1) (ν−1) deﬁned before. Let ξk := (ξk , ξk ) ∈ R2 for k ∈ Λ1 . Take ξ k , ξ k , . . . and ξ k (1) (2) in R2 arbitrarily. Set ρk (x) := (ρk (x), ρk (x)) by means of (2.16) and (2.17). Then, we deﬁne the path 4πρk (q∆ (θ)) (l) ∈ R2 , φk∆ (θ) := ξ k + |k|2

τl−1 < θ ≤ τl

(3.12)


(l = 1, 2, . . . , ν), where φk∆ (0) := limθ→0+0 φk∆ (θ). We set φΛ1 ∆ (θ) := {φk∆ (θ)}k∈Λ1 ∈ R2N1 . We deﬁne φΛ1 ∆ (θ) ∈ R4N1 by means of (2.13). Let ˜ x, x˙ , {alk }, {a˙ lk }, {φk }) given S(T, 0; q∆ , aΛ∆ , φΛ1 ∆ ) be the classical action for L( by (3.11). Theorem 3.2. Let |∆| ≤ ρ∗ and f (x, aΛ ) ∈ B a (R3n+4N ) (a = 0, 1, 2, . . .). Then, under the assumptions of Theorem 3.1 the function    4N ν  n 3 m 1 j     2πi(τl − τl−1 ) 2πi|V |(τl − τl−1 ) j=1 l=1

 |k|2 (τl − τl−1 )   Os- · · · (exp i−1 S(T, 0; q∆, aΛ∆ , φΛ1 ∆ )) × 4iπ 2 |V |  k∈Λ1

× f ( q∆ (0), aΛ ∆ (0))dx (0) · · · dx (ν−1) (0) (ν−1) (0) (1) (ν−1) · da · · · da dξ dξ · · · dξ Λ

Λ

k∈Λ1

k

k

k

(3.13)

is well deﬁned in B a (R3n+4N ) and is equal to (exp i−1 Sc (T, 0; q∆ , aΛ∆ ))f (q∆ (0), aΛ ∆ (0))Dq∆ DaΛ ∆ deﬁned by (3.8) in Theorem 3.1. So it follows from Theorem 3.1 that as |∆| → 0, then (3.13) converges to the Feynman path integral (3.14) (exp i−1 S(T, 0; q, aΛ , φΛ1 ))f (q(0), aΛ (0))Dq DaΛ DφΛ1 in B a (R3n+4N ), which satisﬁes the Schr¨ odinger type equation (3.9) with u(0) = f . The Feynman path integral (3.14) is given heuristically in [11, §9-8]. Remark 3.4. As was noted in the introduction, the constraints (2.20) are not needed in Theorem 3.2 above. The path φk∆ (θ) deﬁned by (3.12) is determined so (i) ˜ q∆ (θ), q˙ ∆ (θ), aΛ∆ (θ), a˙ Λ∆ (θ), φΛ1 ∆ (θ))/∂φk (i = 1, 2) are piecewise conthat ∂ L( stant. Remark 3.5. We take f ∈ S(R3n+4N ) and set M0 = [(3n + 4N )/2] + 1, where [·] denotes Gauss’ symbol. Let ζ = (x, X), and α and β multi-indices. Then, the Sobolev inequality shows ∂ζκ (ζ α ∂ζβ f ). sup |ζ α ∂ζβ f (ζ)| ≤ ζ α ∂ζβ f + ζ∈R3n+4N

|κ|=M0

It follows from Lemma 2.4 with a = b = 1 in [17] or as in the proof of (7.14) in the present paper that the right-hand side of the above is bounded by Cα,β f B |α+β|+M0


with a constant $C_{\alpha,\beta}$. Hence, for $|\Delta| \le \rho^*$ the functions (3.8), (3.13), the limit of (3.8) as $|\Delta| \to 0$ and the limit of (3.13) as $|\Delta| \to 0$ are well defined in $\mathcal{S}$, so pointwise.

Remark 3.6. We write (3.13) as $G_\Delta(T, 0)f$. Let $0 \le t_0 \le t \le T$. For $f \in B^a(\mathbb{R}^{3n+4N})$ ($a = 0, 1, 2, \ldots$) we can define $G_\Delta(t, t_0)f$ as in (3.13) in the same way that $C_\Delta(t, t_0)f$ is defined in Remark 3.3. See also (9.20) in the present paper for the precise definition. As will be seen in the proof of Theorem 3.2, under the assumptions of Theorem 3.1, $G_\Delta(t, t_0)f$ is well defined in $B^a$ and is equal to $C_\Delta(t, t_0)f$.

We consider an external electromagnetic field $E_{ex}(t, x) = (E_{ex1}(t, x), E_{ex2}(t, x), E_{ex3}(t, x)) \in \mathbb{R}^3$ and $B_{ex}(t, x) = (B_{ex1}(t, x), B_{ex2}(t, x), B_{ex3}(t, x)) \in \mathbb{R}^3$ such that $\partial_x^\alpha E_{exj}(t, x)$, $\partial_x^\alpha B_{exj}(t, x)$ and $\partial_t B_{exj}(t, x)$ ($j = 1, 2, 3$) are continuous in $[0, T] \times \mathbb{R}^n$ for all $\alpha$. Let $\phi_{ex}(t, x) \in \mathbb{R}$ and $A_{ex}(t, x) \in \mathbb{R}^3$ be the electromagnetic potential to $E_{ex}$ and $B_{ex}$. Then we obtain Theorem 3.3 below. Though Theorem 3.3 gives the generalization of Theorems 3.1 and 3.2, the results are stated separately from Theorems 3.1 and 3.2 to avoid confusion. We replace $\tilde A(x^{(j)}, a_{\Lambda'_2})$ in (3.3), (3.10) and (3.11) by $\tilde A(x^{(j)}, a_{\Lambda'_2}) + A_{ex}(t, x^{(j)})$. Moreover we add $-\sum_{j=1}^{n} e_j\phi_{ex}(t, x^{(j)})$ to (3.3) and (3.11), and $\sum_{j=1}^{n} e_j\phi_{ex}(t, x^{(j)})$ to (3.10), respectively. Then we have

Theorem 3.3. Besides the assumptions of Theorem 3.1 we suppose as in [19–21] that for any $\alpha \neq 0$ there exist constants $C_\alpha$ and $\delta_\alpha > 0$ satisfying

\[
|\partial_x^\alpha E_{exj}(t, x)| \le C_\alpha, \qquad |\partial_x^\alpha B_{exj}(t, x)| \le C_\alpha\langle x\rangle^{-(1+\delta_\alpha)}
\tag{3.15}
\]

and

\[
|\partial_x^\alpha A_{exj}(t, x)| \le C_\alpha, \qquad |\partial_x^\alpha\phi_{ex}(t, x)| \le C_\alpha\langle x\rangle
\tag{3.16}
\]

for $j = 1, 2$ and 3 in $[0, T] \times \mathbb{R}^n$. Then, the same assertions as in Theorems 3.1 and 3.2 hold.

Remark 3.7. It follows from [19, Lemma 6.1] that under the assumptions (3.15) there exist $A_{ex}$ and $\phi_{ex}$ satisfying (3.16).

4. The Appearance of the Coulomb Potentials

We will show rigorously that the Coulomb potentials appear as the limit of the second term on the right-hand side of (3.3) and the limit of the second term on the right-hand side of (3.10). This result is well known as a heuristic result in physics (cf. [8, 11]). We will give a rigorous proof in our model. In the Hamiltonian operators of QED models in [12, 14, 16, 32], the Coulomb potentials are assumed from the beginning. Our proof is somewhat delicate.


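Theorem 4.1. Let $L_j$ ($j = 1, 2, 3$) tend to $\infty$ under the condition

\[
\frac{1}{m_0} \le \frac{L_i}{L_j} \le m_0, \qquad i, j = 1, 2, 3
\tag{4.1}
\]

for a constant $m_0 \ge 1$. Then we have

\[
\begin{aligned}
&\lim_{L_1,L_2,L_3\to\infty}\,\lim_{M_1\to\infty}\frac{2\pi}{|V|}\sum_{k\in\Lambda_1}\sum_{\substack{j,l=1\\ j\neq l}}^{n}\frac{e_j e_l\cos k\cdot(x^{(j)}-x^{(l)})}{|k|^2}\\
&\quad= \lim_{M_1\to\infty}\,\lim_{L_1,L_2,L_3\to\infty}\frac{2\pi}{|V|}\sum_{k\in\Lambda_1}\sum_{\substack{j,l=1\\ j\neq l}}^{n}\frac{e_j e_l\cos k\cdot(x^{(j)}-x^{(l)})}{|k|^2}\\
&\quad= \frac{1}{2}\sum_{\substack{j,l=1\\ j\neq l}}^{n}\frac{e_j e_l}{|x^{(j)}-x^{(l)}|} \qquad\text{in } \mathcal{S}'(\mathbb{R}^{3n}).
\end{aligned}
\tag{4.2}
\]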
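The passage $L_1, L_2, L_3 \to \infty$ in (4.2) replaces a lattice sum over $k \in (2\pi/L)\mathbb{Z}^3 \setminus \{0\}$ by an integral, which is the mechanism made precise in Proposition 4.3 below. A minimal numerical sketch follows; it is an illustration only, under the assumptions $L_1 = L_2 = L_3 = L$ and the sample integrand $\Phi(k) = e^{-|k|^2}/|k|^2$ (not one from the paper), whose integral over $\mathbb{R}^3$ is $2\pi^{3/2}$. The function name `lattice_sum` and the truncation parameter `cutoff` are hypothetical.

```python
import math

def lattice_sum(L, cutoff=5.0):
    """(2*pi)^3/|V| times the sum of Phi(k) over lattice points k != 0, with |V| = L^3.

    Lattice points with a component beyond `cutoff` are dropped; the Gaussian
    factor makes their contribution negligible.
    """
    h = 2.0 * math.pi / L              # lattice spacing, so (2*pi)^3/|V| = h^3
    S = int(math.ceil(cutoff / h))
    total = 0.0
    for s1 in range(-S, S + 1):
        for s2 in range(-S, S + 1):
            for s3 in range(-S, S + 1):
                if s1 == 0 and s2 == 0 and s3 == 0:
                    continue           # the point k = 0 is excluded from the sum
                k2 = h * h * (s1 * s1 + s2 * s2 + s3 * s3)
                total += math.exp(-k2) / k2
    return h ** 3 * total

EXACT = 2.0 * math.pi ** 1.5           # integral of exp(-|k|^2)/|k|^2 over R^3

err_20 = abs(lattice_sum(20.0) - EXACT)
err_40 = abs(lattice_sum(40.0) - EXACT)
```

The error shrinks as $L$ grows, but only slowly, because of the integrable singularity $|k|^{-2}$ at the excluded origin.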

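Let $\chi_0(k)$ be the function in $\mathbb{R}^3$ defined by

\[
\chi_0(k) := \begin{cases} 1, & |k| \le 1,\\ 0, & |k| > 1. \end{cases}
\tag{4.3}
\]

We first prove

Lemma 4.2. Let $\epsilon > 0$. Then we have

\[
\lim_{\epsilon\to 0}\frac{1}{(2\pi)^2}\sum_{\substack{j,l=1\\ j\neq l}}^{n} e_j e_l\int\frac{\cos k\cdot(x^{(j)}-x^{(l)})}{|k|^2}\chi_0(\epsilon k)\,dk = \frac{1}{2}\sum_{\substack{j,l=1\\ j\neq l}}^{n}\frac{e_j e_l}{|x^{(j)}-x^{(l)}|} \qquad\text{in } \mathcal{S}'(\mathbb{R}^{3n}).
\tag{4.4}
\]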
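For a single pair and a fixed $x \neq 0$ the cut-off integral in (4.4) can be evaluated in polar coordinates: $(2\pi)^{-2}\int_{|k|\le 1/\epsilon} e^{ik\cdot x}|k|^{-2}\,dk = \mathrm{Si}(|x|/\epsilon)/(\pi|x|)$, where $\mathrm{Si}$ is the sine integral, and $\mathrm{Si}(R) \to \pi/2$ gives the value $1/(2|x|)$. The following sketch checks this pointwise version numerically (the lemma itself is a statement in $\mathcal{S}'$); the helper names are hypothetical and Simpson's rule is used only for convenience.

```python
import math

def sine_integral(R, steps_per_unit=50):
    """Si(R) = integral of sin(u)/u over [0, R], by the composite Simpson rule."""
    n = max(2, 2 * int(math.ceil(R * steps_per_unit / 2)))  # even number of panels
    h = R / n

    def sinc(u):
        return 1.0 if u == 0.0 else math.sin(u) / u

    acc = sinc(0.0) + sinc(R)
    for i in range(1, n):
        acc += (4 if i % 2 else 2) * sinc(i * h)
    return acc * h / 3.0

def cutoff_coulomb(r, eps):
    """(2*pi)^-2 times the integral of exp(i k.x)/|k|^2 over |k| <= 1/eps, for |x| = r > 0."""
    return sine_integral(r / eps) / (math.pi * r)

value = cutoff_coulomb(1.5, 1e-3)   # to be compared with 1/(2 * 1.5)
```

Since $\mathrm{Si}(R) - \pi/2$ decays like $1/R$, the deviation from $1/(2|x|)$ is of order $\epsilon/|x|^2$.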

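Proof. Let $x$ and $k$ be in $\mathbb{R}^3$. Then, it is well known that

\[
\frac{1}{(2\pi)^2}\int\frac{e^{ik\cdot x}}{|k|^2}\,dk = \frac{1}{2|x|} \qquad\text{in } \mathcal{S}'(\mathbb{R}^3)
\tag{4.5}
\]

(cf. [25, §5.9]). For the sake of simplicity, we consider the case $n = 2$. Let $x = x^{(1)}$ and $y = x^{(2)}$. We will prove

\[
\lim_{\epsilon\to 0}\frac{1}{(2\pi)^2}\int\frac{e^{ik\cdot(x-y)}}{|k|^2}\chi_0(\epsilon k)\,dk = \frac{1}{2|x-y|} \qquad\text{in } \mathcal{S}'(\mathbb{R}^6).
\tag{4.6}
\]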


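Let $\varphi(x, y) \in \mathcal{S}(\mathbb{R}^6)$. Then, with $\langle\cdot,\cdot\rangle$ understood as distributional pairing, from (4.6) we have

\[
\begin{aligned}
&\lim_{\epsilon\to 0}\left\langle\frac{1}{(2\pi)^2}\int\frac{e^{ik\cdot(x-y)}}{|k|^2}\chi_0(\epsilon k)\,dk,\ \varphi(x, y)\right\rangle\\
&\quad= \lim_{\epsilon\to 0}\left\{\left\langle\frac{1}{(2\pi)^2}\int\frac{\cos k\cdot(x-y)}{|k|^2}\chi_0(\epsilon k)\,dk,\ \varphi(x, y)\right\rangle + i\left\langle\frac{1}{(2\pi)^2}\int\frac{\sin k\cdot(x-y)}{|k|^2}\chi_0(\epsilon k)\,dk,\ \varphi(x, y)\right\rangle\right\}\\
&\quad= \lim_{\epsilon\to 0}\left\langle\frac{1}{(2\pi)^2}\int\frac{\cos k\cdot(x-y)}{|k|^2}\chi_0(\epsilon k)\,dk,\ \varphi(x, y)\right\rangle
= \left\langle\frac{1}{2|x-y|},\ \varphi(x, y)\right\rangle.
\end{aligned}
\]

Consequently we obtain (4.4). Equation (4.6) is equivalent to

\[
\lim_{\epsilon\to 0}\frac{1}{2\pi^2}\left\langle\int\frac{e^{ik\cdot(x-y)}}{|k|^2}\chi_0(\epsilon k)\,dk,\ \varphi(x, y)\right\rangle = \iint\frac{\varphi(x, y)}{|x-y|}\,dx\,dy
\tag{4.7}
\]

for all $\varphi(x, y) \in \mathcal{S}(\mathbb{R}^6)$. We set $x' = (x - y)/\sqrt 2$ and $y' = (x + y)/\sqrt 2$. Let $\psi_1(x')$ and $\psi_2(y')$ be in $\mathcal{S}(\mathbb{R}^3)$. We take $\varphi(x, y) = \tilde\varphi(x', y') := \psi_1(x')\psi_2(y')$ in the left-hand side of (4.7). Then the left-hand side of (4.7) is equal to

\[
\begin{aligned}
&\lim_{\epsilon\to 0}\frac{1}{2\pi^2}\iiint\frac{e^{ik\cdot(x-y)}}{|k|^2}\chi_0(\epsilon k)\psi_1(x')\psi_2(y')\,dk\,dx'\,dy'\\
&\quad= \lim_{\epsilon\to 0}\frac{1}{2\pi^2}\int\psi_2(y')\,dy'\left\langle\int\frac{e^{ik\cdot\sqrt 2 x'}}{|k|^2}\chi_0(\epsilon k)\,dk,\ \psi_1(x')\right\rangle,
\end{aligned}
\]

which is also equal to

\[
\int\psi_2(y')\,dy'\int\frac{\psi_1(x')}{\sqrt 2|x'|}\,dx' = \iint\frac{\tilde\varphi(x', y')}{\sqrt 2|x'|}\,dx'\,dy' = \iint\frac{\varphi(x, y)}{|x-y|}\,dx\,dy
\]

from (4.5). So, (4.7) holds for $\varphi(x, y) = \psi_1(x')\psi_2(y')$. Since the set of all linear combinations of $\psi_1(x')\psi_2(y')$ for all $\psi_1$ and $\psi_2$ in $\mathcal{S}(\mathbb{R}^3)$ is dense in $\mathcal{S}(\mathbb{R}^6_{x',y'})$, (4.7) holds for all $\varphi(x, y) \in \mathcal{S}(\mathbb{R}^6)$. Hence we get (4.6).

Proposition 4.3. Let $c \ge 0$ be a constant. Let $\Phi(k)$ be continuous in $\mathbb{R}^3\setminus(\{0\} \cup \{|k| = c\})$. We suppose $|\Phi(k)| \le \phi(|k|)$ ($k \in \mathbb{R}^3$). We assume that $\phi(r)$ is non-increasing in $(0, \infty)$ and that $r^2\phi(r)$ is in $L^1([0, \infty))$ and is bounded in $(0, \infty)$. Then, $((2\pi)^3/|V|)\sum_{k\neq 0}\Phi(k)$ is absolutely convergent, where the sum of $k$ is taken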


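over $(2\pi s_1/L_1, 2\pi s_2/L_2, 2\pi s_3/L_3)$ ($s_1, s_2, s_3 = 0, \pm 1, \pm 2, \ldots$). We also get

\[
\lim_{L_1,L_2,L_3\to\infty}\frac{(2\pi)^3}{|V|}\sum_{k\neq 0}\Phi(k) = \int\Phi(k)\,dk
\tag{4.8}
\]

under the condition (4.1).

Proof. We write $L = (L_1, L_2, L_3)$. Let us define the step function $\Phi_L(k)$ by

\[
\begin{aligned}
\Phi_L(k) &= \Phi\!\left(\frac{2\pi s_1}{L_1}, \frac{2\pi s_2}{L_2}, \frac{2\pi s_3}{L_3}\right), &
k &\in \left(\frac{2\pi(s_1-1)}{L_1}, \frac{2\pi s_1}{L_1}\right]\times\left(\frac{2\pi(s_2-1)}{L_2}, \frac{2\pi s_2}{L_2}\right]\times\left(\frac{2\pi(s_3-1)}{L_3}, \frac{2\pi s_3}{L_3}\right],\\
\Phi_L(k) &= \Phi\!\left(\frac{2\pi s_1}{L_1}, -\frac{2\pi s_2}{L_2}, \frac{2\pi s_3}{L_3}\right), &
k &\in \left(\frac{2\pi(s_1-1)}{L_1}, \frac{2\pi s_1}{L_1}\right]\times\left[-\frac{2\pi s_2}{L_2}, -\frac{2\pi(s_2-1)}{L_2}\right)\times\left(\frac{2\pi(s_3-1)}{L_3}, \frac{2\pi s_3}{L_3}\right]
\end{aligned}
\]

for $s_1, s_2, s_3 = 1, 2, \ldots$. Then, for $k \in (2\pi(s_1-1)/L_1, 2\pi s_1/L_1] \times (2\pi(s_2-1)/L_2, 2\pi s_2/L_2] \times (2\pi(s_3-1)/L_3, 2\pi s_3/L_3]$ we have

\[
|\Phi_L(k)| = \left|\Phi\!\left(\frac{2\pi s_1}{L_1}, \frac{2\pi s_2}{L_2}, \frac{2\pi s_3}{L_3}\right)\right| \le \phi\!\left(\left|\left(\frac{2\pi s_1}{L_1}, \frac{2\pi s_2}{L_2}, \frac{2\pi s_3}{L_3}\right)\right|\right) \le \phi(|k|),
\]

since $\phi(r)$ is non-increasing. In the same way, for $k \in (2\pi(s_1-1)/L_1, 2\pi s_1/L_1] \times [-2\pi s_2/L_2, -2\pi(s_2-1)/L_2) \times (2\pi(s_3-1)/L_3, 2\pi s_3/L_3]$ we get

\[
|\Phi_L(k)| \le \phi(|k|).
\tag{4.9}
\]

In the same way as the above, we can define the step function $\Phi_L(k)$ for all $k \in \mathbb{R}^3\setminus\{0\}$ such that (4.9) holds and

\[
\frac{(2\pi)^3}{|V|}\sum_{k\neq 0}\Phi(k) = \int_{\mathbb{R}^3}\Phi_L(k)\,dk + \frac{(2\pi)^3}{|V|}\sum_{k\neq 0,\ s_1 s_2 s_3 = 0}\Phi(k).
\tag{4.10}
\]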

For a short while we suppose L1 ≤ L2 ≤ L3 . Since φ(r) is non-increasing, it holds that for s1 ≥ 2 we have 2πs1 2π(s1 − 1) 2π 2π ≤ φ(|k|), , 0, 0 ≤ φ , , φ L1 L1 L2 L3

2π(s1 − 2) 2π(s1 − 1) 2π 2π , × 0, × 0, k∈ L1 L1 L2 L3


and also for s1 ≥ 2 and s2 ≥ 1 2πs1 2πs2 2π(s1 − 1) 2πs2 2π ≤ φ(|k|), , , 0 ≤ φ , , φ L1 L2 L1 L2 L3

2π(s2 − 1) 2πs2 2π 2π(s1 − 2) 2π(s1 − 1) , , × × 0, . k∈ L1 L1 L2 L2 L3 For s2 ≥ 2, we also have 2π 2πs2 2π 2π(s2 − 1) 2π ≤ φ(|k|), φ , , 0 ≤ φ , , L1 L2 L1 L2 L3

2π(s2 − 2) 2π(s2 − 1) 2π 2π , × × 0, . k ∈ 0, L1 L2 L2 L3 Thus we get (2π)3 |V |

|Φ(k)| ≤

k=0,s3 =0

≤

(2π)3 |V |

(2π)3 |V | +

φ(|k|)

k=0,s3 =0

φ(|k|)

k=0,s3 =0,s1 ,s2 =0,±1

(2π)3 |V |

φ(|k|)

k=0,s3 =s1 =0,|s2 |≥2

φ(|k|)dk.

+ 10

(4.11)

0≤k3 ≤(2π)/L3

We can take a constant 1 ≤ m ≤ m0 from (4.1) such that L2 ≤ mL1 ≤ L3 . We add the reﬁnement {((2π)/(mL1 ), (2πs2 )/L2 , (2πs3 )/L3 ); s2 , s3 = 0, ±1, ±2, . . .} to {((2πs1 )/L1 , (2πs2 )/L2 , (2πs3 )/L3 ); s1 , s2 , s3 = 0, ±1, ±2, . . .}. Then, for s2 ≥ 2 noting 2π 2π(s2 − 1) 2π 2πs2 ≤ φ(|k|), ,0 ≤ φ , , φ 0, L2 mL1 L2 L3

2π(s2 − 2) 2π(s2 − 1) 2π 2π , × × 0, , k ∈ 0, mL1 L2 L2 L3 we have (2π)3 m|V |

φ(|k|) ≤ 2

k=0,s3 =s1 =0,|s2 |≥2

φ(|k|)dk 0≤k1 ≤(2π)/(mL1 ),0≤k3 ≤(2π)/L3

≤2

φ(|k|)dk. 0≤k3 ≤(2π)/L3


(2π)3 |V |


Consequently, from (4.11), we get (2π)3 |V |

|Φ(k)| ≤

k=0,s3 =0

φ(|k|)

k=0,s3 =0,s1 ,s2 =0,±1

+ 2(5 + m0 )

φ(|k|)dk.

(4.12)

0≤k3 ≤(2π)/L3

Let us consider the case of general L1 , L2 and L3 . We may suppose L1 ≤ L2 . Noting L2 ≤ m0 L3 from (4.1), we add the reﬁnement {((2πs1 )/L1 , (2πs2 )/L2 , (2π)/(m0 L3 )); s1 , s2 = 0, ±1, ±2, . . .} to {((2πs1 )/L1 , (2πs2 )/L2 , (2πs3 )/L3 ); s1 , s2 , s3 = 0, ±1, ±2, . . .}. Then, as in the proof to (4.11), for s1 ≥ 2 we have 2π(s1 − 1) 2π 2π 2πs1 ≤ φ(|k|), , 0, 0 ≤ φ , , φ L1 L1 L 2 m0 L 3

2π(s1 − 2) 2π(s1 − 1) 2π 2π k∈ , × 0, × 0, L1 L1 L2 m0 L 3 and also for s1 ≥ 2 and s2 ≥ 1, 2π(s1 − 1) 2πs2 2π 2πs1 2πs2 ≤ φ(|k|), , , 0 ≤ φ , , φ L1 L2 L1 L 2 m0 L 3

2π(s1 − 2) 2π(s1 − 1) 2π(s2 − 1) 2πs2 2π k∈ , , × × 0, . L1 L1 L2 L2 m0 L 3 For s2 ≥ 2 we also have 2π 2π(s2 − 1) 2π 2π 2πs2 ≤ φ(|k|), φ , , 0 ≤ φ , , L1 L2 L1 L2 m0 L 3

2π 2π(s2 − 2) 2π(s2 − 1) 2π k ∈ 0, , × × 0, . L1 L2 L2 m0 L 3 Hence we can prove (2π)3 |V |

|Φ(k)| ≤ m0

k=0,s3 =0

≤

(2π)3 m0 |V |

(2π)3 |V | +

φ(|k|)

k=0,s3 =0

φ(|k|)

k=0,s3 =0,s1 ,s2 =0,±1

(2π)3 |V |

φ(|k|)

k=0,s3 =s1 =0,|s2 |≥2

+ 10m0

φ(|k|)dk. 0≤k3 ≤(2π)/(m0 L3 )


as in the proof of (4.11) and so (2π)3 (2π)3 |Φ(k)| ≤ |V | |V | k=0,s3 =0


φ(|k|)

k=0,s3 =0,s1 ,s2 =0,±1

+ 2m0 (5 + m0 )

φ(|k|)dk 0≤k3 ≤(2π)/(m0 L3 )

as in the proof of (4.12). Thus, for general L1 , L2 and L3 we obtain (2π)3 (2π)3 |Φ(k)| ≤ φ(|k|) |V | |V | k=0,sj =0

k=0,s1 ,s2 ,s3 =0,±1

+ 2m0 (5 + m0 )

φ(|k|)dk

(j = 1, 2, 3).

0≤kj ≤(2π)/(m0 Lj )

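(4.13)

We assumed that $r^2\phi(r)$ is in $L^1([0, \infty))$. So, from (4.9), (4.10) and (4.13) we can prove that $\sum_{k\neq 0}|\Phi(k)|$ is convergent. In addition, since $r^2\phi(r)$ is assumed to be bounded in $(0, \infty)$,

\[
0 \le \phi(|k|) \le \mathrm{Const.}\,\frac{1}{|k|^2}, \qquad k \neq 0
\]

holds. So we see that $((2\pi)^3/|V|)\sum_{k\neq 0,\ s_1,s_2,s_3 = 0,\pm 1}\phi(|k|)$ tends to zero as $L_1$, $L_2$ and $L_3$ tend to infinity under the condition (4.1). Consequently, from (4.13), we have

\[
\lim_{L_1,L_2,L_3\to\infty}\frac{(2\pi)^3}{|V|}\sum_{k\neq 0,\ s_j = 0}\Phi(k) = 0, \qquad j = 1, 2, 3
\]

under (4.1). Hence, noting (4.9), from (4.10) we obtain (4.8) by means of the Lebesgue dominated convergence theorem.

Now we will prove Theorem 4.1. For the sake of simplicity, let $n = 2$. Let $\chi_0(k)$ be the function defined by (4.3). We write $x = x^{(1)}$ and $y = x^{(2)}$. We take $\varphi(x, y) \in \mathcal{S}(\mathbb{R}^6)$. Then, we have

\[
\begin{aligned}
&\left\langle\frac{(2\pi)^3}{|V|}\sum_{k\neq 0}\frac{\cos k\cdot(x-y)}{|k|^2}\chi_0(\epsilon k),\ \varphi(x, y)\right\rangle\\
&\quad= \frac{(2\pi)^3}{|V|}\sum_{k\neq 0}\iint\frac{\cos k\cdot(x-y)}{|k|^2}\chi_0(\epsilon k)\varphi(x, y)\,dx\,dy\\
&\quad= \frac{(2\pi)^3}{|V|}\sum_{k\neq 0}\iint\frac{\cos k\cdot(x-y)}{|k|^2\langle k\rangle^2}\chi_0(\epsilon k)\langle D_x\rangle^2\varphi(x, y)\,dx\,dy,
\end{aligned}
\tag{4.14}
\]

where we define $\langle D_x\rangle^2 := 1 - \sum_{j=1}^{3}\partial_{x_j}^2$. Let

\[
\Phi(k) = |k|^{-2}\langle k\rangle^{-2}\iint\cos k\cdot(x-y)\,\langle D_x\rangle^2\varphi(x, y)\,dx\,dy
\]

and

\[
\phi(|k|) := \frac{1}{|k|^2\langle k\rangle^2}\iint|\langle D_x\rangle^2\varphi(x, y)|\,dx\,dy.
\]

Then from (4.14), Proposition 4.3 shows

\[
\begin{aligned}
&\lim_{L_1,L_2,L_3\to\infty}\,\lim_{\epsilon\to 0}\left\langle\frac{(2\pi)^3}{|V|}\sum_{k\neq 0}\frac{\cos k\cdot(x-y)}{|k|^2}\chi_0(\epsilon k),\ \varphi(x, y)\right\rangle\\
&\quad= \lim_{L_1,L_2,L_3\to\infty}\frac{(2\pi)^3}{|V|}\sum_{k\neq 0}\frac{1}{|k|^2\langle k\rangle^2}\iint\cos k\cdot(x-y)\,\langle D_x\rangle^2\varphi(x, y)\,dx\,dy\\
&\quad= \int\frac{dk}{|k|^2\langle k\rangle^2}\iint(\cos k\cdot(x-y))\langle D_x\rangle^2\varphi(x, y)\,dx\,dy.
\end{aligned}
\tag{4.15}
\]

In the same way from (4.14), we also have

\[
\lim_{\epsilon\to 0}\,\lim_{L_1,L_2,L_3\to\infty}\left\langle\frac{(2\pi)^3}{|V|}\sum_{k\neq 0}\frac{\cos k\cdot(x-y)}{|k|^2}\chi_0(\epsilon k),\ \varphi(x, y)\right\rangle
= \int\frac{dk}{|k|^2\langle k\rangle^2}\iint(\cos k\cdot(x-y))\langle D_x\rangle^2\varphi(x, y)\,dx\,dy.
\tag{4.16}
\]

On the other hand, Lemma 4.2 and Proposition 4.3 indicate

\[
\begin{aligned}
&\lim_{\epsilon\to 0}\,\lim_{L_1,L_2,L_3\to\infty}\left\langle\frac{(2\pi)^3}{|V|}\sum_{k\neq 0}\frac{\cos k\cdot(x-y)}{|k|^2}\chi_0(\epsilon k),\ \varphi(x, y)\right\rangle\\
&\quad= \lim_{\epsilon\to 0}\iiint\frac{\cos k\cdot(x-y)}{|k|^2}\chi_0(\epsilon k)\varphi(x, y)\,dx\,dy\,dk
= 2\pi^2\iint\frac{\varphi(x, y)}{|x-y|}\,dx\,dy.
\end{aligned}
\tag{4.17}
\]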

Hence we obtain (4.2) together with (4.15) and (4.16).

Remark 4.1. Let $\chi(k) \in \mathcal{S}(\mathbb{R}^3)$ such that $\chi(0) = 1$ and $\chi(-k) = \chi(k)$. We take the limit of $L_j$ ($j = 1, 2, 3$) under the condition (4.1). Then it holds that

\[
\lim_{\epsilon\to 0}\,\lim_{L_1,L_2,L_3\to\infty}\frac{2\pi}{|V|}\sum_{k\neq 0}\sum_{\substack{j,l=1\\ j\neq l}}^{n}\chi(\epsilon k)\frac{e_j e_l\cos k\cdot(x^{(j)}-x^{(l)})}{|k|^2}
= \frac{1}{2}\sum_{\substack{j,l=1\\ j\neq l}}^{n}\frac{e_j e_l}{|x^{(j)}-x^{(l)}|}
\tag{4.18}
\]

pointwise for $\vec x \in \mathbb{R}^{3n}$ such that $x^{(j)} - x^{(l)} \neq 0$ ($j, l = 1, 2, \ldots, n$, $j \neq l$). The proof is easy. Consider the case $n = 2$ and $e_1 = e_2 = 1$. Let us write $x = x^{(1)}$ and $y = x^{(2)}$.


We take $\chi_1(k) \in C^\infty(\mathbb{R}^3)$ such that $\chi_1(k) = 1$ ($|k| \le 1$) and $\chi_1(k) = 0$ ($|k| \ge 2$). Then, Proposition 4.3 says for $x \neq y$ that the left-hand side of (4.18) is equal to

\[
\begin{aligned}
&\frac{1}{(2\pi)^2}\lim_{\epsilon\to 0}\int\chi(\epsilon k)\frac{\cos k\cdot(x-y)}{|k|^2}\,dk\\
&\quad= \frac{1}{(2\pi)^2}\lim_{\epsilon\to 0}\left\{\int\chi_1(k)\chi(\epsilon k)\frac{\cos k\cdot(x-y)}{|k|^2}\,dk - \frac{1}{|x-y|^2}\int(\cos k\cdot(x-y))\Delta_k\{(1-\chi_1(k))\chi(\epsilon k)|k|^{-2}\}\,dk\right\}\\
&\quad= \frac{1}{(2\pi)^2}\left\{\int\chi_1(k)\frac{\cos k\cdot(x-y)}{|k|^2}\,dk - \frac{1}{|x-y|^2}\int(\cos k\cdot(x-y))\Delta_k\{(1-\chi_1(k))|k|^{-2}\}\,dk\right\}
\end{aligned}
\tag{4.19}
\]

pointwise, where $\Delta_k$ denotes the Laplacian operator with respect to $k \in \mathbb{R}^3$ and we used $|\epsilon\chi'(\epsilon k)| = \epsilon^{1/3}|k|^{-2/3}(\epsilon|k|)^{2/3}|\chi'(\epsilon k)| \le \mathrm{Const.}\,\epsilon^{1/3}|k|^{-2/3}$. Since we have $|\Delta_k\{(1-\chi_1(k))\chi(\epsilon k)|k|^{-2}\}| \le C\langle k\rangle^{-3-1/3}$ with a constant $C$ independent of $\epsilon$, we can prove that Eq. (4.19) is also true in the distribution sense $\mathcal{S}'(\mathbb{R}^6)$. On the other hand, we see as in the proof of Lemma 4.2 that the left-hand side of (4.19) is equal to $1/(2|x-y|)$ in $\mathcal{S}'(\mathbb{R}^6)$. Consequently we can prove that (4.19) is equal to $1/(2|x-y|)$. Hence (4.18) holds pointwise.

5. The Expression for the Vacuum and the States of Photons

In this section, we express the vacuum and the states of photons with given momenta and polarizations concretely as functions of variables $a_\Lambda$ consisting of the Fourier coefficients of vector potentials. In [11, Problem 9-8] only the vacuum and the state of a photon of momentum $k$ and polarization state $l$ are expressed concretely. In this section, we generalize this result in [11] to general states of photons. In physics, the vacuum and the states of photons are usually treated abstractly rather than concretely (cf. [29, 33]). We also note that the states of photons of given momenta and polarizations are not discussed in the study of QED models defined by means of the functional method (cf. [12, 14, 16, 32]), because in the functional method each photon with polarizations is expressed by an amplitude of momentum in $L^2(\mathbb{R}^3) \oplus L^2(\mathbb{R}^3)$ as stated in the introduction.
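To write down the vacuum and the state of photons concretely, we will introduce creation operators and annihilation operators acting on the space $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$. Let us define

\[
\hat a^{(i)}_{lk} := i\sqrt{\frac{|V|}{2\hbar c|k|}}\left(\frac{\hbar}{i}\frac{\partial}{\partial a^{(i)}_{lk}} - i\frac{c|k|}{|V|}a^{(i)}_{lk}\right)
= \sqrt{\frac{|V|}{2\hbar c|k|}}\left(\hbar\frac{\partial}{\partial a^{(i)}_{lk}} + \frac{c|k|}{|V|}a^{(i)}_{lk}\right)
\tag{5.1}
\]

acting on the space $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ for $k \in \Lambda$ and $i, l = 1, 2$. From (2.13) we have

\[
\hat a^{(1)}_{l\,-k} = -\hat a^{(1)}_{lk}, \qquad \hat a^{(2)}_{l\,-k} = \hat a^{(2)}_{lk}.
\]

Let $\hat a^{(i)\dagger}_{lk}$ denote the formal adjoint operator $\sqrt{|V|/(2\hbar c|k|)}\,(-\hbar\,\partial/\partial a^{(i)}_{lk} + c|k|a^{(i)}_{lk}/|V|)$ of $\hat a^{(i)}_{lk}$ acting on the space $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$. For $f \in \mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ and $g \in \mathcal{S}(\mathbb{R}^{4N}_{a_{\Lambda'}})$ we have

\[
(\hat a^{(i)}_{lk}f, g) = (f, \hat a^{(i)\dagger}_{lk}g)
\]

from the definition of the distribution. So, $\hat a^{(i)}_{lk}$ is continuous from $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ into $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ in weak topology. In the same way $\hat a^{(i)\dagger}_{lk}$ is continuous from $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ into $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ in weak topology. We can easily see from (5.1) that the commutator relations

\[
[\hat a^{(i)}_{lk}, \hat a^{(i')\dagger}_{l'k'}] = \delta_{ii'}\delta_{ll'}\delta_{kk'}, \qquad [\hat a^{(i)}_{lk}, \hat a^{(i')}_{l'k'}] = 0
\]

on $\mathcal{S}(\mathbb{R}^{4N}_{a_{\Lambda'}})$ and so on $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ hold for $k$ and $k'$ in the bounded domain $\Lambda'$ (cf. [7, §34] and [6, 30]). For $\mathcal{S}(\mathbb{R}^{4N}_{a_{\Lambda'}})$ is dense in $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ in weak topology (cf. [26]) and the operators of both sides above are continuous in $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$. We define the operator $\hat a_{lk}$ acting on $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ for $k \in \Lambda$ and $l = 1, 2$ by

\[
\hat a_{lk} := \frac{\hat a^{(1)}_{lk} - i\hat a^{(2)}_{lk}}{\sqrt 2}
\tag{5.2}
\]

(cf. (2.11)). We call $\hat a_{lk}$ the annihilation operator and $\hat a^\dagger_{lk}$ the creation operator. We can easily see from the commutator relations for $\hat a^{(i)}_{lk}$ that the operators $\hat a_{lk}$ and $\hat a^\dagger_{lk}$ also satisfy the commutator relations

\[
[\hat a_{lk}, \hat a^\dagger_{l'k'}] = \delta_{ll'}\delta_{kk'}, \qquad [\hat a_{lk}, \hat a_{l'k'}] = 0
\tag{5.3}
\]

on $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ for $k$ and $k'$ in $\Lambda$ (cf. [29, (2.26)]). It follows from the commutator relations (5.3) that we have

\[
\hat a_{lk}(\hat a^\dagger_{lk})^{n'} - (\hat a^\dagger_{lk})^{n'}\hat a_{lk} = n'(\hat a^\dagger_{lk})^{n'-1}
\tag{5.4}
\]

on $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ (cf. [7, §34]). Then we get the following expression as in physics (cf., e.g., [29, (2.60) and (2.64)], and [33, (6.165) and (6.172)]).

Proposition 5.1. We can write the last term of $H(t)$ defined by (3.10) as

\[
H_{rad} := \sum_{k\in\Lambda',\,l}\sum_{i=1}^{2}\left\{\frac{|V|}{2}\left(\frac{\hbar}{i}\frac{\partial}{\partial a^{(i)}_{lk}}\right)^2 + \frac{(c|k|)^2}{2|V|}(a^{(i)}_{lk})^2 - \frac{\hbar c|k|}{2}\right\}
= \sum_{k\in\Lambda,\,l}\hbar c|k|\,\hat a^\dagger_{lk}\hat a_{lk}
\tag{5.5}
\]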


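on $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$. The vector potential $A(x, a_{\Lambda_2})$ defined by (2.14), where the sum of $k$ is taken over $\Lambda_2$, is given for each $x \in \mathbb{R}^3$ by the expression

\[
A(x, a_{\Lambda_2}) = \sqrt{\frac{4\pi\hbar}{|V|}}\,c\sum_{k\in\Lambda_2}\sum_{l=1}^{2}\frac{1}{\sqrt{2c|k|}}\bigl(\hat a_{lk}e^{ik\cdot x} + \hat a^\dagger_{lk}e^{-ik\cdot x}\bigr)e_l(k)
\tag{5.6}
\]

acting on $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$.

Proof. Since from (5.1) and (5.2) we have

\[
\begin{aligned}
\hbar c|k|\bigl(\hat a^\dagger_{lk}\hat a_{lk} + \hat a^\dagger_{l\,-k}\hat a_{l\,-k}\bigr)
&= \frac{\hbar c|k|}{2}\Bigl\{(\hat a^{(1)\dagger}_{lk} + i\hat a^{(2)\dagger}_{lk})(\hat a^{(1)}_{lk} - i\hat a^{(2)}_{lk}) + (-\hat a^{(1)\dagger}_{lk} + i\hat a^{(2)\dagger}_{lk})(-\hat a^{(1)}_{lk} - i\hat a^{(2)}_{lk})\Bigr\}\\
&= \hbar c|k|\bigl(\hat a^{(1)\dagger}_{lk}\hat a^{(1)}_{lk} + \hat a^{(2)\dagger}_{lk}\hat a^{(2)}_{lk}\bigr)\\
&= \sum_{i=1}^{2}\left\{\frac{|V|}{2}\left(\frac{\hbar}{i}\frac{\partial}{\partial a^{(i)}_{lk}}\right)^2 + \frac{(c|k|)^2}{2|V|}(a^{(i)}_{lk})^2 - \frac{\hbar c|k|}{2}\right\}
\end{aligned}
\]

on $\mathcal{S}(\mathbb{R}^{4N}_{a_{\Lambda'}})$ for $k \in \Lambda'$, we get (5.5) on $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$ in the same way as before. From (5.1) and (5.2), we have

\[
\begin{aligned}
\hat a_{lk}e^{ik\cdot x} + \hat a^\dagger_{lk}e^{-ik\cdot x}
&= \frac{1}{\sqrt 2}\Bigl\{(\hat a^{(1)}_{lk} + \hat a^{(1)\dagger}_{lk})\cos k\cdot x - i(\hat a^{(2)}_{lk} - \hat a^{(2)\dagger}_{lk})\cos k\cdot x\\
&\qquad\quad + i(\hat a^{(1)}_{lk} - \hat a^{(1)\dagger}_{lk})\sin k\cdot x + (\hat a^{(2)}_{lk} + \hat a^{(2)\dagger}_{lk})\sin k\cdot x\Bigr\}\\
&= \sqrt{\frac{|V|}{\hbar c|k|}}\left\{\frac{c|k|}{|V|}a^{(1)}_{lk}\cos k\cdot x + \frac{c|k|}{|V|}a^{(2)}_{lk}\sin k\cdot x - i\hbar(\cos k\cdot x)\frac{\partial}{\partial a^{(2)}_{lk}} + i\hbar(\sin k\cdot x)\frac{\partial}{\partial a^{(1)}_{lk}}\right\}
\end{aligned}
\]

on $\mathcal{S}(\mathbb{R}^{4N}_{a_{\Lambda'}})$. So, it is shown from (2.8) and (2.13) that

\[
\sum_{k\in\Lambda_2}\frac{1}{\sqrt{2c|k|}}\bigl(\hat a_{lk}e^{ik\cdot x} + \hat a^\dagger_{lk}e^{-ik\cdot x}\bigr)e_l(k)
= \sum_{k\in\Lambda_2}\frac{1}{\sqrt{2\hbar|V|}}\bigl(a^{(1)}_{lk}\cos k\cdot x + a^{(2)}_{lk}\sin k\cdot x\bigr)e_l(k)
\]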


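on $\mathcal{S}(\mathbb{R}^{4N}_{a_{\Lambda'}})$. Hence, we see that the right-hand side of (5.6) is equal to

\[
\frac{\sqrt{4\pi}}{|V|}\,c\sum_{k\in\Lambda_2}\sum_{l=1}^{2}\frac{1}{\sqrt 2}\bigl(a^{(1)}_{lk}\cos k\cdot x + a^{(2)}_{lk}\sin k\cdot x\bigr)e_l(k)
\]

on $\mathcal{S}'(\mathbb{R}^{4N}_{a_{\Lambda'}})$, which is equal to the left-hand side of (5.6) from (2.14).

We know

\[
\int_{-\infty}^{\infty}e^{-a\theta^2}\,d\theta = \sqrt{\frac{\pi}{a}}
\]

for a constant $a > 0$. So, we can easily see from (5.2) and (5.5) that

\[
\Psi_0(a_{\Lambda'}) := \prod_{k\in\Lambda',\,l}\sqrt{\frac{c|k|}{\pi\hbar|V|}}\exp\left\{-\frac{c|k|}{2\hbar|V|}\bigl(a^{(1)2}_{lk} + a^{(2)2}_{lk}\bigr)\right\}
\tag{5.7}
\]

is the normalized ground state of $H_{rad}$, called vacuum, whose energy is 0, i.e.

\[
H_{rad}\Psi_0 = 0
\tag{5.8}
\]

and that we have

\[
\hat a^\dagger_{lk}\Psi_0 = \sqrt{\frac{2c|k|}{\hbar|V|}}\,a^*_{lk}\Psi_0, \qquad \hat a_{lk}\Psi_0 = 0 \qquad (k \in \Lambda)
\tag{5.9}
\]

(cf. [11, §8-1, (9-43) and Problem 9-8]). We know that the eigenvalue 0 of (5.8) is simple in $L^2(\mathbb{R}^{4N})$ (cf. [4, Chap. 3, Theorem 3.4]).

The function $\Psi_{n'lk}(a_{\Lambda'}) := (\hat a^\dagger_{lk})^{n'}\Psi_0(a_{\Lambda'})$ ($k \in \Lambda$, $n' = 0, 1, 2, \ldots$), which can be written concretely from (5.1), (5.2) and (5.7), expresses the state of $n'$ photons of momentum $k$ and polarization state $l$ (cf. [11, §9-2] and [29, §2-2]) and satisfies

\[
\Bigl(\sum_{k\in\Lambda,\,l}\hat a^\dagger_{lk}\hat a_{lk}\Bigr)\Psi_{n'l'k'} = n'\Psi_{n'l'k'}, \qquad
\Bigl(\sum_{k\in\Lambda,\,l}\hbar k\,\hat a^\dagger_{lk}\hat a_{lk}\Bigr)\Psi_{n'l'k'} = n'(\hbar k')\Psi_{n'l'k'}
\]

and $H_{rad}\Psi_{n'l'k'} = n'(\hbar c|k'|)\Psi_{n'l'k'}$ from (5.4), (5.5) and (5.9). The operators $\sum_{k\in\Lambda,\,l}\hat a^\dagger_{lk}\hat a_{lk}$ and $\sum_{k\in\Lambda,\,l}\hbar k\,\hat a^\dagger_{lk}\hat a_{lk}$ are called the total number operator and the momentum operator, respectively (cf. [6], and [29, (2.68) and (2.80)]). Let $n'(l, k) \ge 0$ be integers. Then $\prod_{k\in\Lambda,\,l}(\hat a^\dagger_{lk})^{n'(l,k)}\Psi_0(a_{\Lambda'})$ denotes the state of $n'(l, k)$ photons of momentum $k$ and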


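polarization state $l$ in the same way. Setting $\Psi(a_{\Lambda'}) = \prod_{k\in\Lambda,\,l}(\hat a^\dagger_{lk})^{n'(l,k)}\Psi_0(a_{\Lambda'})$, we get

\[
\Bigl(\sum_{k\in\Lambda,\,l}\hat a^\dagger_{lk}\hat a_{lk}\Bigr)\Psi = \Bigl(\sum_{k\in\Lambda,\,l}n'(l, k)\Bigr)\Psi,
\tag{5.10}
\]

\[
\Bigl(\sum_{k\in\Lambda,\,l}\hbar k\,\hat a^\dagger_{lk}\hat a_{lk}\Bigr)\Psi = \Bigl(\sum_{k\in\Lambda,\,l}n'(l, k)\hbar k\Bigr)\Psi
\tag{5.11}
\]

and

\[
H_{rad}\Psi = \Bigl(\sum_{k\in\Lambda,\,l}n'(l, k)\hbar c|k|\Bigr)\Psi.
\tag{5.12}
\]

The family

\[
\Bigl\{\prod_{k\in\Lambda',\,l,\,i}(\hat a^{(i)\dagger}_{lk})^{n'(l,k,i)}\Psi_0\Bigr\}_{n'(l,k,i)=0}^{\infty}
\]

makes a complete orthogonal system in $L^2(\mathbb{R}^{4N})$ (cf. [4, Chap. 3, Theorem 3.1] and [7, §34]). We have

\[
\hat a^{(1)}_{lk} = \frac{\hat a_{lk} - \hat a_{l\,-k}}{\sqrt 2}, \qquad \hat a^{(2)}_{lk} = \frac{i(\hat a_{lk} + \hat a_{l\,-k})}{\sqrt 2}
\]

from (2.13) and (5.2). So we see together with (5.4) and the second equation in (5.9) that the family

\[
\Bigl\{\prod_{k\in\Lambda,\,l}\frac{1}{\sqrt{n'(l, k)!}}(\hat a^\dagger_{lk})^{n'(l,k)}\Psi_0\Bigr\}_{n'(l,k)=0}^{\infty}
\tag{5.13}
\]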

also makes a complete orthonormal system in $L^2(\mathbb{R}^{4N})$ (cf. [7, §34] and [29, (2.46)]). For example, we have

\[
(\hat a^\dagger_{lk}\Psi_0, (\hat a^\dagger_{lk})^2\Psi_0) = (\Psi_0, \hat a_{lk}(\hat a^\dagger_{lk})^2\Psi_0) = (\Psi_0, (\hat a^\dagger_{lk})^2\hat a_{lk}\Psi_0) + 2(\Psi_0, \hat a^\dagger_{lk}\Psi_0) = 2(\hat a_{lk}\Psi_0, \Psi_0) = 0.
\]

Remark 5.1. We considered the Lagrangian function (3.3) and the Hamiltonian operator (3.10), determining the indefinite constant in (2.3) by (2.18) or in Remark 3.1. On the other hand, in many references (cf. [11, 29, 32]) the indefinite constant is chosen to be 0. Consequently, the term $\infty = (1/2)\sum_{j=1}^{n}e_j^2/|x^{(j)} - x^{(j)}|$ appears in (4.2) from (2.21) and the ground state energy of $H_{rad}$ is $\sum_{k\in\Lambda}\hbar c|k|/2$, which tends to infinity as $M_3$ tends to infinity. Arguments are made about these


infinities in [11, §9-3 and §9-5]. In the present paper, we could see that the infinity arising from the term $(1/2)\sum_{j=1}^{n}e_j^2/|x^{(j)} - x^{(j)}|$ disappears in (4.2) and that the ground state energy of $H_{rad}$ is 0.
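The commutator relations and (5.4) can be checked mechanically for a single real mode. The sketch below is an illustration under simplifying assumptions, not part of the paper: it treats one variable $a$, writes $\beta$ for $c|k|/|V|$, and represents a function $p(a)G(a)$ with $G(a) = e^{-\beta a^2/(2\hbar)}$ by the coefficient list of the polynomial $p$. With $D = \hbar\,\partial_a + \beta a$ and $D^\dagger = -\hbar\,\partial_a + \beta a$ one has $D(pG) = \hbar p'G$ and $D^\dagger(pG) = (2\beta a p - \hbar p')G$, and the operators of (5.1) are $D/\sqrt{2\hbar\beta}$ and $D^\dagger/\sqrt{2\hbar\beta}$. All function names are hypothetical.

```python
from fractions import Fraction

def deriv(p):
    """Derivative of a polynomial given by its coefficient list [p0, p1, ...]."""
    return [i * c for i, c in enumerate(p)][1:] or [Fraction(0)]

def times_a(p):
    """Multiply a polynomial by the variable a."""
    return [Fraction(0)] + list(p)

def scale(p, s):
    return [s * c for c in p]

def combine(p, q, sign=1):
    """Return p + sign * q, padding the shorter coefficient list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return [x + sign * y for x, y in zip(p, q)]

def lower(p, hbar, beta):
    """Polynomial factor of D(p G), where D = hbar d/da + beta a."""
    return scale(deriv(p), hbar)

def raise_(p, hbar, beta):
    """Polynomial factor of D^dagger(p G), where D^dagger = -hbar d/da + beta a."""
    return combine(scale(times_a(p), 2 * beta), scale(deriv(p), hbar), sign=-1)

hbar, beta = Fraction(1, 2), Fraction(3)       # sample values, exact arithmetic
p = [Fraction(1), Fraction(2), Fraction(0), Fraction(3)]   # p(a) = 1 + 2a + 3a^3

# [D, D^dagger] applied to p G; it should equal 2 * hbar * beta * p G
commutator = combine(lower(raise_(p, hbar, beta), hbar, beta),
                     raise_(lower(p, hbar, beta), hbar, beta), sign=-1)
```

Dividing by $2\hbar\beta$ turns this into $[\hat a, \hat a^\dagger] = 1$; applying `lower` to the list `[1]` (the vacuum) returns the zero polynomial, which is the second relation in (5.9).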

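6. Preliminaries for the Proofs of Main Results

In Secs. 6–9 we often write $\vec x$ and $\vec y$ in $\mathbb{R}^{3n}$ as $x$ and $y$, respectively, for the sake of simplicity when no confusion arises. Let $0 \le s < t \le T$. For $x$ and $y$ in $\mathbb{R}^{3n}$, we define

\[
\vec q^{\,t,s}_{x,y}(\theta) := x - \frac{t-\theta}{t-s}(x - y), \qquad s \le \theta \le t.
\tag{6.1}
\]

For $X$ and $Y$ in $\mathbb{R}^{4N}$, we also define

\[
a^{t,s}_{\Lambda' X,Y}(\theta) := X - \frac{t-\theta}{t-s}(X - Y), \qquad s \le \theta \le t.
\tag{6.2}
\]

Then $a^{t,s}_{\Lambda X,Y}(\theta) \in \mathbb{R}^{8N}$ is defined by means of (2.13). We set

\[
V_1(x) := \frac{2\pi}{|V|}\sum_{k\in\Lambda_1}\sum_{\substack{j,l=1\\ j\neq l}}^{n}\frac{e_j e_l\cos k\cdot(x^{(j)}-x^{(l)})}{|k|^2}
\tag{6.3}
\]

and

\[
V_2(a_{\Lambda'}) := \sum_{k\in\Lambda',\,i,l}\left\{\frac{(c|k|)^2}{2|V|}(a^{(i)}_{lk})^2 - \frac{\hbar c|k|}{2}\right\}.
\tag{6.4}
\]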

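For the sake of simplicity we suppose $\Lambda'_2 = \Lambda'_3$ $(= \Lambda')$ in Secs. 6–9. We write $\mathbf{x} = (x, X) \in \mathbb{R}^{3n+4N}$ and

\[
\mathbf{q}^{t,s}_{\mathbf{x},\mathbf{y}}(\theta) = \bigl(\theta, \vec q^{\,t,s}_{x,y}(\theta), a^{t,s}_{\Lambda' X,Y}(\theta)\bigr) \in \mathbb{R}^{1+3n+4N}, \qquad s \le \theta \le t
\tag{6.5}
\]

for $s < t$. Then, from (3.3) and (3.5), we have

\[
\begin{aligned}
&S_c(t, s; \vec q^{\,t,s}_{x,y}, a^{t,s}_{\Lambda X,Y})\\
&= \frac{1}{2(t-s)}\sum_{j=1}^{n}m_j|x^{(j)}-y^{(j)}|^2 + \frac{|X-Y|^2}{2|V|(t-s)}
+ \int_{\mathbf{q}^{t,s}_{\mathbf{x},\mathbf{y}}}\Bigl(-V_1(x)\,dt + \frac{1}{c}\sum_{j=1}^{n}e_j\tilde A(x^{(j)}, a_{\Lambda'})\cdot dx^{(j)} - V_2(a_{\Lambda'})\,dt\Bigr)\\
&= \frac{1}{2(t-s)}\sum_{j=1}^{n}m_j|x^{(j)}-y^{(j)}|^2 - \int_s^t V_1\Bigl(x - \frac{t-\theta}{t-s}(x-y)\Bigr)d\theta\\
&\quad+ \frac{1}{c}\sum_{j=1}^{n}e_j(x^{(j)}-y^{(j)})\cdot\int_0^1\tilde A\bigl(x^{(j)} - \theta(x^{(j)}-y^{(j)}), X - \theta(X-Y)\bigr)d\theta\\
&\quad+ \frac{|X-Y|^2}{2|V|(t-s)} - \int_s^t V_2\Bigl(X - \frac{t-\theta}{t-s}(X-Y)\Bigr)d\theta\\
&= \frac{1}{2(t-s)}\sum_{j=1}^{n}m_j|x^{(j)}-y^{(j)}|^2 - (t-s)\int_0^1 V_1(x - \theta(x-y))\,d\theta\\
&\quad+ \frac{1}{c}\sum_{j=1}^{n}e_j(x^{(j)}-y^{(j)})\cdot\int_0^1\tilde A\bigl(x^{(j)} - \theta(x^{(j)}-y^{(j)}), X - \theta(X-Y)\bigr)d\theta\\
&\quad+ \frac{|X-Y|^2}{2|V|(t-s)} - (t-s)\int_0^1 V_2(X - \theta(X-Y))\,d\theta.
\end{aligned}
\tag{6.6}
\]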
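The last equality in (6.6) is the change of variables $u = (t-\theta)/(t-s)$ along the straight line path (6.1). A small numerical sketch of this step for a scalar path follows; it is an illustration only, with a sample potential $V = \cos$ standing in for $V_1$, and the helper names are hypothetical.

```python
import math

def midpoint(f, a, b, n=2000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def line_path(x, y, t, s):
    """q(theta) = x - (t - theta)/(t - s) * (x - y), as in (6.1); q(s) = y, q(t) = x."""
    return lambda theta: x - (t - theta) / (t - s) * (x - y)

def potential_term_time(V, x, y, t, s):
    """Integral of V(q(theta)) over s <= theta <= t along the straight line path."""
    q = line_path(x, y, t, s)
    return midpoint(lambda theta: V(q(theta)), s, t)

def potential_term_unit(V, x, y, t, s):
    """(t - s) times the integral of V(x - u (x - y)) over 0 <= u <= 1."""
    return (t - s) * midpoint(lambda u: V(x - u * (x - y)), 0.0, 1.0)

lhs = potential_term_time(math.cos, 0.7, -1.2, 2.0, 0.5)
rhs = potential_term_unit(math.cos, 0.7, -1.2, 2.0, 0.5)
```

The two values agree up to rounding because the quadrature nodes are mapped onto each other by the substitution.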

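Let $M \ge 0$ and $p(x, w, X, W)$ a $C^\infty$ function in $\mathbb{R}^{6n} \times \mathbb{R}^{8N}$ such that

\[
|\partial_w^\alpha\partial_x^\beta\partial_W^{\alpha'}\partial_X^{\beta'}p(x, w, X, W)| \le C_{\alpha,\beta,\alpha',\beta'}\bigl(\langle x; w\rangle\langle X; W\rangle\bigr)^M
\tag{6.7}
\]

for all multi-indices $\alpha$, $\beta$, $\alpha'$ and $\beta'$ with constants $C_{\alpha,\beta,\alpha',\beta'}$, where $\langle x; w\rangle := \sqrt{1 + |x|^2 + |w|^2}$. For $f(x, X) \in \mathcal{S}(\mathbb{R}^{3n+4N})$ we define the operator $P(t, s)$ by

\[
P(t, s)f =
\begin{cases}
\displaystyle\prod_{j=1}^{n}\sqrt{\frac{m_j}{2\pi i\hbar(t-s)}}^{\,3}\sqrt{\frac{1}{2\pi i\hbar|V|(t-s)}}^{\,4N}\iint\bigl(\exp i\hbar^{-1}S_c(t, s; \vec q^{\,t,s}_{x,y}, a^{t,s}_{\Lambda X,Y})\bigr)\\
\displaystyle\qquad\times\, p\Bigl(x, \frac{x-y}{\sqrt{t-s}}, X, \frac{X-Y}{\sqrt{t-s}}\Bigr)f(y, Y)\,dy\,dY, & s < t,\\[2ex]
\displaystyle\prod_{j=1}^{n}\sqrt{\frac{m_j}{2\pi i\hbar}}^{\,3}\sqrt{\frac{1}{2\pi i\hbar|V|}}^{\,4N}\text{Os-}\!\iint\exp\Bigl\{i\hbar^{-1}\Bigl(\sum_{j=1}^{n}\frac{m_j|w^{(j)}|^2}{2} + \frac{|W|^2}{2|V|}\Bigr)\Bigr\}\\
\displaystyle\qquad\times\, p(x, w, X, W)\,dw\,dW\,f(x, X), & s = t.
\end{cases}
\tag{6.8}
\]

When $p(x, w, X, W) = 1$, $P(t, s)$ is called the fundamental operator and denoted by $C(t, s)$.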


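Lemma 6.1. Let $M_1$ and $M_2$ be non-negative constants. Suppose that $g(x)$ ($x \in \mathbb{R}^3$) and $\psi(\theta)$ ($\theta \in \mathbb{R}$) in (3.4) satisfy

\[
|\partial_x^\alpha g(x)| \le C_\alpha\langle x\rangle^{M_1}, \qquad x \in \mathbb{R}^3
\]

for all $\alpha$ and

\[
\left|\frac{d^k}{d\theta^k}\psi(\theta)\right| \le C_k\langle\theta\rangle^{M_2}, \qquad \theta \in \mathbb{R}
\]

for all $k = 0, 1, \ldots$. Let $f \in \mathcal{S}(\mathbb{R}^{3n+4N})$. Then, $\partial_x^\alpha\partial_X^{\alpha'}(P(t, s)f)(x, X)$ are continuous in $0 \le s \le t \le T$ and $(x, X) \in \mathbb{R}^{3n+4N}$ for all $\alpha$ and $\alpha'$.

Proof. Let $s < t$ and make the change of variables $y \to w = (x - y)/\sqrt{t-s}$ and $Y \to W = (X - Y)/\sqrt{t-s}$ in (6.8). Then from (6.6) we have

\[
\begin{aligned}
P(t, s)f ={}& \prod_{j=1}^{n}\sqrt{\frac{m_j}{2\pi i\hbar}}^{\,3}\sqrt{\frac{1}{2\pi i\hbar|V|}}^{\,4N}\text{Os-}\!\iint\bigl(\exp i\hbar^{-1}\phi(t, s; x, w, X, W)\bigr)\\
&\times p(x, w, X, W)\,f(x - \sqrt\rho\,w, X - \sqrt\rho\,W)\,dw\,dW, \qquad \rho = t - s,
\end{aligned}
\tag{6.9}
\]

where

\[
\begin{aligned}
\phi(t, s; x, w, X, W) :={}& \sum_{j=1}^{n}\frac{m_j}{2}|w^{(j)}|^2 + \frac{1}{2|V|}|W|^2 + \psi(t, s; x, \sqrt\rho\,w, X, \sqrt\rho\,W)\\
:={}& \sum_{j=1}^{n}\frac{m_j}{2}|w^{(j)}|^2 + \frac{1}{2|V|}|W|^2 - \rho\int_0^1 V_1(x - \theta\sqrt\rho\,w)\,d\theta\\
&+ \frac{1}{c}\sum_{j=1}^{n}e_j\sqrt\rho\,w^{(j)}\cdot\int_0^1\tilde A\bigl(x^{(j)} - \theta\sqrt\rho\,w^{(j)}, X - \theta\sqrt\rho\,W\bigr)d\theta - \rho\int_0^1 V_2(X - \theta\sqrt\rho\,W)\,d\theta.
\end{aligned}
\tag{6.10}
\]

We note from (6.8) that (6.9) is also true for $t = s$.

Let $L^{(j)} := \langle w^{(j)}\rangle^{-2}\bigl(1 - i\hbar m_j^{-1}\sum_{k=1}^{3}w_k^{(j)}\,\partial/\partial w_k^{(j)}\bigr)$ ($j = 1, 2, \ldots, n$) and $L_1 := \langle W\rangle^{-2}\bigl(1 - i\hbar|V|\sum_{k=1}^{4N}W_k\partial_{W_k}\bigr)$. Then, integrating by parts with respect to $w$ and $W$ in (6.9) by means of $L^{(j)}$ and $L_1$, we see that the integrand is bounded by $\mathrm{Const.}\,\langle x; X\rangle^{l}\langle w\rangle^{-(3n+1)}\langle W\rangle^{-(4N+1)}$ for some real constant $l$. See the proof of [19, Lemma 2.1] for further details. Consequently, we see that $(P(t, s)f)(x, X)$ is continuous in $0 \le s \le t \le T$ and $(x, X) \in \mathbb{R}^{3n+4N}$. We note (6.9) and (6.10). Then, in the same way as in the above we can prove that $\partial_x^\alpha\partial_X^{\alpha'}(P(t, s)f)(x, X)$ are continuous in $0 \le s \le t \le T$ and $(x, X) \in \mathbb{R}^{3n+4N}$ for all $\alpha$ and $\alpha'$.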
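The integration by parts in the proof rests on the fact that the operator $L^{(j)}$ reproduces the Gaussian phase: since $\nabla_w e^{i m|w|^2/(2\hbar)} = i(m/\hbar)w\,e^{i m|w|^2/(2\hbar)}$, one gets $L^{(j)}e^{i m|w|^2/(2\hbar)} = \langle w\rangle^{-2}(1 + |w|^2)e^{i m|w|^2/(2\hbar)} = e^{i m|w|^2/(2\hbar)}$. The sketch below checks this identity numerically at one sample point, replacing the derivatives by central differences; the function names and the sample values of $m$ and $\hbar$ are assumptions made for illustration.

```python
import cmath

def phase(w, m, hbar):
    """exp(i m |w|^2 / (2 hbar)) for a point w in R^3."""
    return cmath.exp(1j * m * sum(c * c for c in w) / (2.0 * hbar))

def apply_L(func, w, m, hbar, h=1e-5):
    """(1 + |w|^2)^(-1) (1 - i (hbar/m) w . grad) func, using central differences."""
    w_dot_grad = 0.0
    for k in range(len(w)):
        up = list(w)
        up[k] += h
        down = list(w)
        down[k] -= h
        w_dot_grad += w[k] * (func(up) - func(down)) / (2.0 * h)
    bracket_sq = 1.0 + sum(c * c for c in w)   # <w>^2 = 1 + |w|^2
    return (func(w) - 1j * hbar / m * w_dot_grad) / bracket_sq

m, hbar = 2.0, 0.5
w = [0.3, -0.7, 0.5]
f = lambda v: phase(v, m, hbar)
residual = abs(apply_L(f, w, m, hbar) - f(w))
```

Each application of $L^{(j)}$ can therefore be moved onto the amplitude by integration by parts, where it gains a factor $\langle w^{(j)}\rangle^{-1}$ of decay.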


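For $0 \le \sigma_1, \sigma_2 \le 1$ we set $\sigma := (\sigma_1, \sigma_2)$ and

\[
\begin{aligned}
\tau(\sigma) &:= t - \sigma_1(t - s) \in \mathbb{R},\\
\zeta^{(j)}(\sigma) &:= z^{(j)} + \sigma_1(x^{(j)} - z^{(j)}) + \sigma_1\sigma_2(y^{(j)} - x^{(j)}) \in \mathbb{R}^3, \qquad j = 1, 2, \ldots, n,\\
\zeta(\sigma) &:= z + \sigma_1(x - z) + \sigma_1\sigma_2(y - x) \in \mathbb{R}^{3n},\\
\tilde\zeta(\sigma) &:= Z + \sigma_1(X - Z) + \sigma_1\sigma_2(Y - X) \in \mathbb{R}^{4N}.
\end{aligned}
\tag{6.11}
\]

We also set

\[
B_{ml}(x^{(j)}, a_{\Lambda'}) = \frac{\partial\tilde A_l}{\partial x_m}(x^{(j)}, a_{\Lambda'}) - \frac{\partial\tilde A_m}{\partial x_l}(x^{(j)}, a_{\Lambda'})
\tag{6.12}
\]

for $l, m = 1, 2, 3$ and $j = 1, 2, \ldots, n$. Then, from (6.6), we have

Lemma 6.2. We can write for $s < t$

\[
\begin{aligned}
&S_c(t, s; \vec q^{\,t,s}_{z,y}, a^{t,s}_{\Lambda Z,Y}) - S_c(t, s; \vec q^{\,t,s}_{z,x}, a^{t,s}_{\Lambda Z,X})\\
&= \frac{1}{t-s}\sum_{j=1}^{n}m_j(x^{(j)}-y^{(j)})\cdot\Bigl(z^{(j)} - \frac{x^{(j)}+y^{(j)}}{2}\Bigr)
+ (t-s)(x-y)\cdot\int_0^1\!\!\int_0^1\sigma_1\frac{\partial V_1}{\partial x}(\zeta(\sigma))\,d\sigma_1 d\sigma_2\\
&\quad+ \frac{1}{c}\sum_{j=1}^{n}e_j(x^{(j)}-y^{(j)})\cdot\int_0^1\tilde A\bigl(x^{(j)} - \theta(x^{(j)}-y^{(j)}), X - \theta(X-Y)\bigr)d\theta\\
&\quad+ \frac{1}{c}\sum_{j=1}^{n}e_j\sum_{l,m=1}^{3}(x_m^{(j)}-y_m^{(j)})(x_l^{(j)}-z_l^{(j)})\int_0^1\!\!\int_0^1\sigma_1 B_{ml}(\zeta^{(j)}(\sigma), \tilde\zeta(\sigma))\,d\sigma_1 d\sigma_2\\
&\quad+ \frac{1}{c}\sum_{j=1}^{n}e_j(x^{(j)}-y^{(j)})\cdot\Bigl\{(Z - X)\cdot\int_0^1\!\!\int_0^1\sigma_1\frac{\partial\tilde A}{\partial a_{\Lambda'}}(\zeta^{(j)}(\sigma), \tilde\zeta(\sigma))\,d\sigma_1 d\sigma_2\Bigr\}\\
&\quad+ (X - Y)\cdot\frac{1}{c}\sum_{j=1}^{n}e_j\sum_{m=1}^{3}(x_m^{(j)}-z_m^{(j)})\int_0^1\!\!\int_0^1\sigma_1\frac{\partial\tilde A_m}{\partial a_{\Lambda'}}(\zeta^{(j)}(\sigma), \tilde\zeta(\sigma))\,d\sigma_1 d\sigma_2\\
&\quad+ \frac{1}{(t-s)|V|}(X - Y)\cdot\Bigl(Z - \frac{X+Y}{2}\Bigr)
+ (t-s)(X - Y)\cdot\int_0^1\!\!\int_0^1\sigma_1\frac{\partial V_2}{\partial a_{\Lambda'}}(\tilde\zeta(\sigma))\,d\sigma_1 d\sigma_2.
\end{aligned}
\tag{6.13}
\]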


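Proof. We use (6.6). From (6.5) and (6.11), we see

\[
\begin{aligned}
&\int_{\mathbf{q}^{t,s}_{\mathbf{z},\mathbf{y}}}(-V_1(x))\,dt - \int_{\mathbf{q}^{t,s}_{\mathbf{z},\mathbf{x}}}(-V_1(x))\,dt
= \sum_{j=1}^{n}\sum_{l=1}^{3}\iint_\Delta\frac{\partial V_1}{\partial x_l^{(j)}}\,dt\wedge dx_l^{(j)}\\
&\quad= \sum_{j=1}^{n}\sum_{l=1}^{3}\int_0^1\!\!\int_0^1\frac{\partial V_1}{\partial x_l^{(j)}}(\zeta(\sigma))\,\det\frac{\partial(\tau(\sigma), \zeta_l^{(j)}(\sigma))}{\partial(\sigma_1, \sigma_2)}\,d\sigma_1 d\sigma_2\\
&\quad= \sum_{j=1}^{n}\sum_{l=1}^{3}(t-s)(x_l^{(j)}-y_l^{(j)})\int_0^1\!\!\int_0^1\sigma_1\frac{\partial V_1}{\partial x_l^{(j)}}(\zeta(\sigma))\,d\sigma_1 d\sigma_2\\
&\quad= (t-s)(x-y)\cdot\int_0^1\!\!\int_0^1\sigma_1\frac{\partial V_1}{\partial x}(\zeta(\sigma))\,d\sigma_1 d\sigma_2,
\end{aligned}
\tag{6.14}
\]

where $\Delta = \Delta(t, s, x, y, z)$ is the 2-dimensional plane with oriented boundary consisting of $(\theta, \vec q^{\,t,s}_{z,y}(\theta))$, $-(\theta, \vec q^{\,t,s}_{z,x}(\theta))$ and $(\theta, \vec q^{\,s,s}_{y,x}(\theta))$ ($s \le \theta \le t$), and $\sigma$ in (6.11) gives the positive orientation of $\Delta$. So the second term on the right-hand side of (6.13) appears. In the same way the last term appears. It is easy to show that the first and the 7th terms appear. As in the proof of (6.14), we have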

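\[
\begin{aligned}
&\int_{q^{t,s}_{z,y}} \tilde A(x^{(j)}, a_\Lambda)\cdot dx^{(j)} - \int_{q^{t,s}_{z,x}} \tilde A(x^{(j)}, a_\Lambda)\cdot dx^{(j)}\\
&\quad= \int_{q^{s,s}_{x,y}} \tilde A(x^{(j)}, a_\Lambda)\cdot dx^{(j)} + \iint_\Delta d\bigl(\tilde A(x^{(j)}, a_\Lambda)\cdot dx^{(j)}\bigr)\\
&\quad= (x^{(j)} - y^{(j)})\cdot\int_0^1 \tilde A(x^{(j)} - \theta(x^{(j)} - y^{(j)}), X - \theta(X - Y))\,d\theta\\
&\qquad+ \iint_\Delta \sum_{1\le m<l\le 3} B_{ml}\,dx_m^{(j)}\wedge dx_l^{(j)}
- \iint_\Delta \sum_{k\in\Lambda, i, l}\sum_{m=1}^3 \frac{\partial \tilde A_m}{\partial a_{lk}^{(i)}}\,dx_m^{(j)}\wedge da_{lk}^{(i)}\\
&\quad= (x^{(j)} - y^{(j)})\cdot\int_0^1 \tilde A(x^{(j)} - \theta(x^{(j)} - y^{(j)}), X - \theta(X - Y))\,d\theta\\
&\qquad+ \sum_{1\le m<l\le 3}\bigl\{(x_m^{(j)} - y_m^{(j)})(x_l^{(j)} - z_l^{(j)}) - (x_l^{(j)} - y_l^{(j)})(x_m^{(j)} - z_m^{(j)})\bigr\}
\int_0^1\!\!\int_0^1 \sigma_1 B_{ml}(\zeta^{(j)}(\sigma), \tilde\zeta(\sigma))\,d\sigma_1 d\sigma_2\\
&\qquad- \sum_{m=1}^3\bigl\{(x_m^{(j)} - y_m^{(j)})(X - Z) - (X - Y)(x_m^{(j)} - z_m^{(j)})\bigr\}
\cdot\int_0^1\!\!\int_0^1 \sigma_1 \frac{\partial \tilde A_m}{\partial a_\Lambda}(\zeta^{(j)}(\sigma), \tilde\zeta(\sigma))\,d\sigma_1 d\sigma_2.
\end{aligned}
\tag{6.15}
\]

Feynman Path Integral for Nonrelativistic Quantum Electrodynamics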

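So we can complete the proof of (6.13) from (6.6).

Let us define $\Phi_m^{(j)}(t, s; x^{(j)}, y^{(j)}, z^{(j)}, X, Y, Z) \in \mathbb{R}$ $(m = 1, 2, 3,\ j = 1, 2, \dots, n)$ and $\Phi_1(t, s; x, y, z, X, Y, Z) \in \mathbb{R}^{4N}$ by
\[
\begin{aligned}
\Phi_m^{(j)} ={}& z_m^{(j)} - \frac{x_m^{(j)} + y_m^{(j)}}{2}
+ \frac{e_j(t-s)}{m_j c}\sum_{l=1}^3 (x_l^{(j)} - z_l^{(j)})\int_0^1\!\!\int_0^1 \sigma_1 B_{ml}(\zeta^{(j)}(\sigma), \tilde\zeta(\sigma))\,d\sigma_1 d\sigma_2\\
&- \frac{e_j(t-s)}{m_j c}(X - Z)\cdot\int_0^1\!\!\int_0^1 \sigma_1 \frac{\partial \tilde A_m}{\partial a_\Lambda}(\zeta^{(j)}(\sigma), \tilde\zeta(\sigma))\,d\sigma_1 d\sigma_2\\
&+ \frac{e_j(t-s)}{m_j c}\int_0^1 \tilde A_m(x^{(j)} - \theta(x^{(j)} - y^{(j)}), X - \theta(X - Y))\,d\theta\\
&+ \frac{(t-s)^2}{m_j}\int_0^1\!\!\int_0^1 \sigma_1 \frac{\partial V_1}{\partial x_m^{(j)}}(\zeta(\sigma))\,d\sigma_1 d\sigma_2
\end{aligned}
\tag{6.16}
\]
and
\[
\begin{aligned}
\Phi_1 ={}& Z - \frac{X + Y}{2}
+ \frac{(t-s)|V|}{c}\sum_{j=1}^n\sum_{m=1}^3 e_j (x_m^{(j)} - z_m^{(j)})\int_0^1\!\!\int_0^1 \sigma_1 \frac{\partial \tilde A_m}{\partial a_\Lambda}(\zeta^{(j)}(\sigma), \tilde\zeta(\sigma))\,d\sigma_1 d\sigma_2\\
&+ (t-s)^2|V|\int_0^1\!\!\int_0^1 \sigma_1 \frac{\partial V_2}{\partial a_\Lambda}(\tilde\zeta(\sigma))\,d\sigma_1 d\sigma_2,
\end{aligned}
\tag{6.17}
\]
respectively. Let $\Phi^{(j)} := (\Phi_1^{(j)}, \Phi_2^{(j)}, \Phi_3^{(j)}) \in \mathbb{R}^3$. Then it follows from (6.13), (6.16) and (6.17) that
\[
\begin{aligned}
&S_c(t, s; q^{t,s}_{z,y}, a^{t,s}_{\Lambda Z,Y}) - S_c(t, s; q^{t,s}_{z,x}, a^{t,s}_{\Lambda Z,X})\\
&\quad= \frac{1}{t-s}\sum_{j=1}^n m_j (x^{(j)} - y^{(j)})\cdot\Phi^{(j)}(t, s; x^{(j)}, y^{(j)}, z^{(j)}, X, Y, Z)
+ \frac{1}{(t-s)|V|}(X - Y)\cdot\Phi_1(t, s; x, y, z, X, Y, Z).
\end{aligned}
\tag{6.18}
\]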
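As a sanity check on the structure of (6.18), consider the simplest case with all potentials switched off. There $\Phi^{(j)}$ reduces to $z^{(j)} - (x^{(j)} + y^{(j)})/2$, and the action along a straight-line path is the free one. The sketch below assumes the standard free action $m|{\rm end} - {\rm start}|^2 / (2(t-s))$ for a single particle; this form is an assumption here, since (6.6) is not reproduced in this section, and the function names are illustrative.

```python
def free_action(m, t, s, end, start):
    """Free classical action along the straight line from `start` (time s) to `end` (time t)."""
    return m * sum((e - b) ** 2 for e, b in zip(end, start)) / (2.0 * (t - s))


def phi_free(x, y, z):
    """Phi with all potentials switched off: z - (x + y) / 2, componentwise."""
    return [zi - (xi + yi) / 2.0 for xi, yi, zi in zip(x, y, z)]


def action_difference_via_phi(m, t, s, x, y, z):
    """Right-hand side of (6.18) for one free particle: m (x - y) . Phi / (t - s)."""
    phi = phi_free(x, y, z)
    return m * sum((xi - yi) * p for xi, yi, p in zip(x, y, phi)) / (t - s)
```

Comparing `free_action(m, t, s, z, y) - free_action(m, t, s, z, x)` with `action_difference_via_phi(m, t, s, x, y, z)` reproduces the kinetic part of (6.18); the remaining terms of (6.16) are corrections of order $t - s$.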


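7. The Stability of the Fundamental Operator

Lemma 7.1. Let $f \in C^1(\mathbb{R}^d)$ and $|\partial_x^\alpha f| \le C_\alpha \langle x\rangle^{-(1+\delta_\alpha)}$ for all $x \in \mathbb{R}^d$ and all $|\alpha| = 1$, where $\delta_\alpha > 0$ are constants. Then we have:
(1) $f$ is a bounded function in $\mathbb{R}^d$.
(2) We have
\[
|x - z|\,\Bigl|\partial_x^\alpha\partial_y^\beta\partial_z^\gamma \int_0^1\!\!\int_0^1 \sigma_1 f(z + \sigma_1(x - z) + \sigma_1\sigma_2(y - x))\,d\sigma_1 d\sigma_2\Bigr| \le C_{\alpha,\beta,\gamma},
\qquad |\alpha + \beta + \gamma| = 1,\quad x, y, z \in \mathbb{R}^d.
\]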

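The proof is easy, following the proof of [18, Lemma 3.5]. We note (3.4) and (6.11). Then, it follows from Lemma 7.1 that under the assumptions of Theorem 3.1 we have
\[
\Bigl|\partial_{x^{(j)}}^{\alpha}\partial_{y^{(j)}}^{\beta}\partial_{z^{(j)}}^{\gamma}\partial_X^{\alpha'}\partial_Y^{\beta'}\partial_Z^{\gamma'}\Bigl\{(Z - X)\cdot\int_0^1\!\!\int_0^1 \sigma_1 \frac{\partial \tilde A_m}{\partial a_\Lambda}(\zeta^{(j)}(\sigma), \tilde\zeta(\sigma))\,d\sigma_1 d\sigma_2\Bigr\}\Bigr| \le C_{\alpha,\beta,\gamma,\alpha',\beta',\gamma'},
\qquad |\alpha + \beta + \gamma + \alpha' + \beta' + \gamma'| \ge 0
\tag{7.1}
\]
for $x^{(j)}, y^{(j)}, z^{(j)} \in \mathbb{R}^3$ and $X, Y, Z \in \mathbb{R}^{4N}$. In the same way we have the same estimates as the above for $(x_l^{(j)} - z_l^{(j)})\int_0^1\!\int_0^1 \sigma_1 B_{ml}(\zeta^{(j)}(\sigma), \tilde\zeta(\sigma))\,d\sigma_1 d\sigma_2$ and $(x_m^{(j)} - z_m^{(j)})\int_0^1\!\int_0^1 \sigma_1 \frac{\partial \tilde A_m}{\partial a_\Lambda}(\zeta^{(j)}(\sigma), \tilde\zeta(\sigma))\,d\sigma_1 d\sigma_2$. To obtain these estimates we assumed (3.6) and (3.7). Consequently, letting $\Theta$ be a component of $\Phi^{(j)}$ and $\Phi_1$, and $|\alpha + \beta + \gamma + \alpha' + \beta' + \gamma'| \ge 1$, then from (6.16) and (6.17) we obtain
\[
|\partial_x^{\alpha}\partial_y^{\beta}\partial_z^{\gamma}\partial_X^{\alpha'}\partial_Y^{\beta'}\partial_Z^{\gamma'}\Theta| \le C_{\alpha,\beta,\gamma,\alpha',\beta',\gamma'}
\tag{7.2}
\]
together with (6.3) and (6.4) for $0 \le s \le t \le T$, $x, y, z \in \mathbb{R}^{3n}$ and $X, Y, Z \in \mathbb{R}^{4N}$.

Proposition 7.2. Under the assumptions of Theorem 3.1 we have:
(1) There exists a constant $\rho^* > 0$ such that the mapping $\mathbb{R}^{3n+4N} \ni (z, Z) \to (\xi, \Xi) = (\Phi, \Phi_1) := (\Phi^{(1)}, \Phi^{(2)}, \dots, \Phi^{(n)}, \Phi_1) \in \mathbb{R}^{3n+4N}$ is homeomorphic and $\det \partial(\xi, \Xi)/\partial(z, Z) \ge 1/2$ for each fixed $0 \le t - s \le \rho^*$, $x$, $y$, $X$ and $Y$. We write its inverse mapping as $\mathbb{R}^{3n+4N} \ni (\xi, \Xi) \to (z, Z) = (z(t, s; x, \xi, y, X, \Xi, Y), Z(t, s; x, \xi, y, X, \Xi, Y)) \in \mathbb{R}^{3n+4N}$.
(2) Let $\eta(t, s; x, \xi, y, X, \Xi, Y)$ be a component of $z$ and $Z$. Then, letting $|\alpha + \beta + \gamma + \alpha' + \beta' + \gamma'| \ge 1$, we have
\[
|\partial_\xi^{\alpha}\partial_x^{\beta}\partial_y^{\gamma}\partial_\Xi^{\alpha'}\partial_X^{\beta'}\partial_Y^{\gamma'}\eta(t, s; x, \xi, y, X, \Xi, Y)| \le C_{\alpha,\beta,\gamma,\alpha',\beta',\gamma'}
\tag{7.3}
\]
for $0 \le t - s \le \rho^*$, $x, \xi, y \in \mathbb{R}^{3n}$ and $X, \Xi, Y \in \mathbb{R}^{4N}$.

Proof. (1) From (6.16) and (6.17), we can write
\[
\partial(\Phi, \Phi_1)/\partial(z, Z) = I + (t - s)\,d(t, s; x, y, z, X, Y, Z),
\tag{7.4}
\]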


where $I$ is the identity matrix of degree $3n + 4N$. We can see as in the proof of (7.2) that each component of $d(t, s; x, y, z, X, Y, Z)$ satisfies (7.2) for all $\alpha, \beta, \gamma, \alpha', \beta'$ and $\gamma'$. Hence, applying [31, Theorem 1.22] to the mapping $(z, Z) \to (\Phi, \Phi_1)$, we prove (1). (2) We see $(\xi, \Xi) = (\Phi(t, s; x, y, z, X, Y, Z), \Phi_1(t, s; x, y, z, X, Y, Z))$ with $z = z(t, s; x, \xi, y, X, \Xi, Y)$ and $Z = Z(t, s; x, \xi, y, X, \Xi, Y)$. So, (7.3) follows from (7.2) and $\det \partial(\xi, \Xi)/\partial(z, Z) \ge 1/2$.

Remark 7.1. Let us consider the general case $\Lambda_2 \subseteq \Lambda_3$ in Proposition 7.2. Then from (3.4) and (6.12), we consider $\tilde A(x, a_{\Lambda_2})$ and $B_{ml}(x, a_{\Lambda_2})$ in (6.16) and (6.17). Let $\Lambda_1$ and $\Lambda_2$ be fixed. When $\Lambda_3 = \Lambda_2$, we could determine $\rho^* > 0$ from (7.4) such that we get $\det \partial(\Phi, \Phi_1)/\partial(z, Z) \ge 1/2$ for $0 \le t - s \le \rho^*$, $x, y, z \in \mathbb{R}^{3n}$ and $X, Y, Z \in \mathbb{R}^{4N_3}$. Let $\Lambda_3 \supseteq \Lambda_2$. Then, direct calculations show $\det \partial(\Phi, \Phi_1)/\partial(z, Z) \ge 1/2$ for $0 \le t - s \le \rho^*$, $x, y, z \in \mathbb{R}^{3n}$ and $X, Y, Z \in \mathbb{R}^{4N_3}$ from (6.16) and (6.17), since $(t - s)^2|V|\,\partial^2 V_2(a_\Lambda)/\partial(a_{lk}^{(i)})^2 = (t - s)^2(c|k|)^2$ are non-negative. Consequently, we can see that when $\Lambda_1$ and $\Lambda_2$ are fixed, the constant $\rho^* > 0$ is taken independently of $\Lambda_3$ $(\supseteq \Lambda_2)$.

Theorem 7.3. Let $\rho^* > 0$ be the constant determined in Proposition 7.2. Then under the assumptions of Theorem 3.1 we can find constants $K_a \ge 0$ $(a = 0, 1, 2, \dots)$ such that
\[
\|C(t, s)f\|_{B^a} \le e^{K_a(t-s)}\|f\|_{B^a}, \qquad 0 \le t - s \le \rho^*
\tag{7.5}
\]
for all $f(x, a_\Lambda) \in B^a(\mathbb{R}^{3n+4N})$.

Proof. The definition (6.8) says
\[
C(s, s) = \text{Identity}.
\tag{7.6}
\]

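So (7.5) holds for $t = s$. Let $0 < t - s \le \rho^*$. We take $\chi \in C^\infty(\mathbb{R}^{3n+4N})$ with compact support such that $\chi(0) = 1$. Let $\epsilon > 0$ and $f \in \mathcal{S}(\mathbb{R}^{3n+4N})$. Then from (6.8) and (6.18), we can write
\[
\begin{aligned}
&C(t, s)^*\chi(\epsilon\,\cdot)^2 C(t, s)f\\
&\quad= \prod_{j=1}^n\Bigl(\frac{m_j}{2\pi\hbar(t-s)}\Bigr)^{3}\Bigl(\frac{1}{2\pi\hbar|V|(t-s)}\Bigr)^{4N}\int f(y, Y)\,dy\,dY\\
&\qquad\times\int \chi(\epsilon z, \epsilon Z)^2\exp\bigl\{i\hbar^{-1}S_c(t, s; q^{t,s}_{z,y}, a^{t,s}_{\Lambda Z,Y}) - i\hbar^{-1}S_c(t, s; q^{t,s}_{z,x}, a^{t,s}_{\Lambda Z,X})\bigr\}\,dz\,dZ\\
&\quad= \prod_{j=1}^n\Bigl(\frac{m_j}{2\pi\hbar(t-s)}\Bigr)^{3}\Bigl(\frac{1}{2\pi\hbar|V|(t-s)}\Bigr)^{4N}\int f(y, Y)\,dy\,dY\\
&\qquad\times\int \chi(\epsilon z, \epsilon Z)^2\exp\Bigl\{i\sum_{j=1}^n (x^{(j)} - y^{(j)})\cdot\frac{m_j\Phi^{(j)}}{\hbar(t-s)} + i(X - Y)\cdot\frac{\Phi_1}{\hbar|V|(t-s)}\Bigr\}\,dz\,dZ.
\end{aligned}
\tag{7.7}
\]
We can make the change of variables $(z, Z) \to (\xi, \Xi) = (\Phi, \Phi_1)$ in (7.7) from Proposition 7.2. Then
\[
\begin{aligned}
&C(t, s)^*\chi(\epsilon\,\cdot)^2 C(t, s)f\\
&\quad= \prod_{j=1}^n\Bigl(\frac{m_j}{2\pi\hbar(t-s)}\Bigr)^{3}\Bigl(\frac{1}{2\pi\hbar|V|(t-s)}\Bigr)^{4N}\int f(y, Y)\,dy\,dY\\
&\qquad\times\int \chi(\epsilon z, \epsilon Z)^2\exp\Bigl\{i\sum_{j=1}^n (x^{(j)} - y^{(j)})\cdot\frac{m_j\xi^{(j)}}{\hbar(t-s)} + i(X - Y)\cdot\frac{\Xi}{\hbar|V|(t-s)}\Bigr\}\det\frac{\partial(z, Z)}{\partial(\xi, \Xi)}\,d\xi\,d\Xi.
\end{aligned}
\]
Equation (7.4) and Proposition 7.2(2) show
\[
\det\frac{\partial(z, Z)}{\partial(\xi, \Xi)} = 1 + (t - s)h(t, s; x, \xi, y, X, \Xi, Y),
\tag{7.8}
\]
where $h(t, s; x, \xi, y, X, \Xi, Y)$ satisfies (7.3) for all $\alpha, \beta, \gamma, \alpha', \beta'$ and $\gamma'$. Consequently, from Proposition 7.2(2), we have
\[
\begin{aligned}
&\lim_{\epsilon\to 0} C(t, s)^*\chi(\epsilon\,\cdot)^2 C(t, s)f\\
&\quad= \Bigl(\frac{1}{2\pi}\Bigr)^{3n+4N}\lim_{\epsilon\to 0}\int f(y, Y)\,dy\,dY\int \chi(\epsilon z, \epsilon Z)^2\{\exp(i(x - y)\cdot\gamma + i(X - Y)\cdot\Gamma)\}\det\frac{\partial(z, Z)}{\partial(\xi, \Xi)}\,d\gamma\,d\Gamma\\
&\quad= f(x, X) + (t - s)\Bigl(\frac{1}{2\pi}\Bigr)^{3n+4N}\text{Os-}\!\iint\{\exp(i(x - y)\cdot\gamma + i(X - Y)\cdot\Gamma)\}\,h(t, s; x, \xi, y, X, \Xi, Y)f(y, Y)\,dy\,dY\,d\gamma\,d\Gamma,
\end{aligned}
\tag{7.9}
\]
where $\xi^{(j)} = \hbar(t - s)\gamma^{(j)}/m_j$ $(j = 1, 2, \dots, n)$ and $\Xi = \hbar|V|(t - s)\Gamma$. We note that the second term on the right-hand side of (7.9) is a pseudo-differential operator.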
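The step from (7.4) to (7.8) uses that a Jacobian of the form $I + \rho d$ with bounded $d$ has determinant at least $1/2$ for small $\rho$, so that the inverse determinant is $1 + \rho h$ with $h$ bounded. The following toy sketch illustrates this for a constant $2 \times 2$ matrix $d$ with entries bounded by 1; the dimension, the bound and the names are assumptions made for illustration only and do not reproduce the paper's $(3n + 4N)$-dimensional setting.

```python
def det2(a):
    """Determinant of a 2x2 matrix given as nested lists."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def det_identity_plus(rho, d):
    """det(I + rho * d) for a 2x2 matrix d."""
    m = [[1.0 + rho * d[0][0], rho * d[0][1]],
         [rho * d[1][0], 1.0 + rho * d[1][1]]]
    return det2(m)


def h_term(rho, d):
    """h defined by 1 / det(I + rho * d) = 1 + rho * h (rho > 0)."""
    return (1.0 / det_identity_plus(rho, d) - 1.0) / rho
```

For entries of `d` in $[-1, 1]$ and $0 < \rho \le 0.2$ one has $\det(I + \rho d) \ge 1 - 2\rho - 2\rho^2 \ge 0.52$, and hence $|h| \le 2.4 / 0.52 < 5$, which is the toy analogue of $\det \ge 1/2$ and of the boundedness of $h$.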


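So, applying the Calderón–Vaillancourt theorem ([5]), we obtain
\[
\lim_{\epsilon\to 0}\|\chi(\epsilon\,\cdot)C(t, s)f\|^2 = \lim_{\epsilon\to 0}\bigl(C(t, s)^*\chi(\epsilon\,\cdot)^2 C(t, s)f, f\bigr)
= \Bigl(\lim_{\epsilon\to 0} C(t, s)^*\chi(\epsilon\,\cdot)^2 C(t, s)f, f\Bigr)
\le (1 + 2K_0(t - s))\|f\|^2 \le e^{2K_0(t-s)}\|f\|^2
\]
with a constant $K_0 \ge 0$. Hence we get (7.5) with $a = 0$ by Fatou's lemma. Let $p(x, w, X, W)$ be a $C^\infty$ function satisfying (6.7) with an integer $M \ge 0$. Then we obtain
\[
\|P(t, s)f\| \le \text{Const.}\,\|f\|_{B^M}
\tag{7.10}
\]
as in the proof of (7.5) with $a = 0$. See the proof of [19, Proposition 4.3] for further details.

Let us recall the expression (6.9) of $C(t, s)f$. Set $\zeta := (x, X)$ and let $\kappa = (\kappa_1, \kappa_2, \dots, \kappa_{3n+4N})$ be an arbitrary multi-index. Then we can see that $\partial_\zeta^\kappa(C(t, s)f) - C(t, s)(\partial_\zeta^\kappa f)$ and $\zeta^\kappa(C(t, s)f) - C(t, s)(\zeta^\kappa f)$ are written in the form
\[
\begin{aligned}
(t - s)\sum_{|\gamma|\le|\kappa|}\tilde P_\gamma(t, s)(\partial_\zeta^\gamma f)
:={}& (t - s)\sum_{|\gamma|\le|\kappa|}\prod_{j=1}^n\sqrt{\frac{m_j}{2\pi i\hbar}}^{\,3}\sqrt{\frac{1}{2\pi i\hbar|V|}}^{\,4N}\\
&\times\text{Os-}\!\iint(\exp i\hbar^{-1}\phi(t, s; x, w, X, W))\,p_\gamma(t, s; x, \sqrt{\rho}\,w, X, \sqrt{\rho}\,W)\\
&\times(\partial_\zeta^\gamma f)(x - \sqrt{\rho}\,w, X - \sqrt{\rho}\,W)\,dw\,dW
\end{aligned}
\tag{7.11}
\]
respectively, where $p_\gamma(t, s; x, w, X, W)$ satisfies (6.7) with $M = |\kappa| - |\gamma|$ for all $\alpha, \beta, \alpha'$ and $\beta'$. We can prove these results by induction with respect to $|\kappa|$, using $\partial_{w^{(j)}}e^{i\hbar^{-1}m_j|w^{(j)}|^2/2} = i\hbar^{-1}m_j w^{(j)}e^{i\hbar^{-1}m_j|w^{(j)}|^2/2}$, $\partial_W e^{i\hbar^{-1}|W|^2/(2|V|)} = (i\hbar^{-1}W/|V|)e^{i\hbar^{-1}|W|^2/(2|V|)}$ and the integration by parts in (6.9). See the proof of [21, Lemma 3.2] for further details.

Let $|\kappa| = a$ $(a = 0, 1, 2, \dots)$. Then we have
\[
\|\partial_\zeta^\kappa(C(t, s)f)\| \le \|C(t, s)(\partial_\zeta^\kappa f)\| + (t - s)\sum_{|\gamma|\le a}\|\tilde P_\gamma(t, s)(\partial_\zeta^\gamma f)\|.
\]
Applying (7.5) with $a = 0$ and (7.10) to the right-hand side above, we get
\[
\|\partial_\zeta^\kappa(C(t, s)f)\| \le e^{K_0(t-s)}\|\partial_\zeta^\kappa f\| + \text{Const.}\,(t - s)\sum_{|\gamma|\le a}\|\partial_\zeta^\gamma f\|_{B^{a-|\gamma|}}.
\]
We know from Lemma 2.3 with $s = 1$ and $a = b$ in [17] that there exist a constant $\mu_a \ge 0$ and $\lambda_a(\zeta, \eta)$ satisfying
\[
|\partial_\eta^\alpha\partial_\zeta^\beta\lambda_a(\zeta, \eta)| \le C_{\alpha,\beta}\langle\zeta; \eta\rangle^{-a}
\tag{7.12}
\]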


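for all $\alpha$ and $\beta$, and
\[
\Lambda_a(\zeta, D_\zeta) = (\mu_a + \langle\zeta\rangle^a + \langle D_\zeta\rangle^a)^{-1}
\tag{7.13}
\]
on $\mathcal{S}$, where $\Lambda_a(\zeta, D_\zeta)$ is the pseudo-differential operator with symbol $\lambda_a(\zeta, \eta)$. So, using [17, Lemma 2.4] and the Calderón–Vaillancourt theorem, we have
\[
\begin{aligned}
\|\partial_\zeta^\gamma f\|_{B^{a-|\gamma|}}
&\le \text{Const.}\,\|(\mu_{a-|\gamma|} + \langle\zeta\rangle^{a-|\gamma|} + \langle D_\zeta\rangle^{a-|\gamma|})\partial_\zeta^\gamma f\|\\
&= \text{Const.}\,\|\{(\mu_{a-|\gamma|} + \langle\zeta\rangle^{a-|\gamma|} + \langle D_\zeta\rangle^{a-|\gamma|})\partial_\zeta^\gamma\Lambda_a\}(\mu_a + \langle\zeta\rangle^a + \langle D_\zeta\rangle^a)f\|
\le \text{Const.}\,\|f\|_{B^a}.
\end{aligned}
\tag{7.14}
\]
Hence we get
\[
\|\partial_\zeta^\kappa(C(t, s)f)\| \le e^{K_0(t-s)}\|\partial_\zeta^\kappa f\| + \text{Const.}\,(t - s)\|f\|_{B^a}.
\tag{7.15}
\]
In the same way, we get
\[
\|\zeta^\kappa(C(t, s)f)\| \le e^{K_0(t-s)}\|\zeta^\kappa f\| + \text{Const.}\,(t - s)\|f\|_{B^a}.
\tag{7.16}
\]
Thus we obtain
\[
\|C(t, s)f\|_{B^a} \le e^{K_0(t-s)}\|f\|_{B^a} + \text{Const.}\,(t - s)\|f\|_{B^a} \le e^{K_a(t-s)}\|f\|_{B^a}.
\]
This completes the proof of Theorem 7.3.

Proposition 7.4. Let $0 \le t - s \le \rho^*$ and $p(x, w, X, W)$ satisfy (6.7) with an integer $M \ge 0$. Then $P(t, s)$ is a continuous operator from $B^{a+M}$ $(a = 0, 1, 2, \dots)$ into $B^a$.

Proof. Let $\zeta = (x, X)$ and $f \in \mathcal{S}(\mathbb{R}^{3n+4N})$. We also use (6.9) as in the proof of Theorem 7.3. Then we have
\[
\partial_\zeta^\kappa P(t, s)f = \sum_{\gamma\le\kappa} P_\gamma(t, s)\partial_\zeta^\gamma f,
\]
where $\gamma \le \kappa$ denotes $\gamma_j \le \kappa_j$ for all $j$ and $p_\gamma(t, s; x, w, X, W)$ satisfy (6.7) with $M + |\kappa| - |\gamma|$ as $M$. Using $\zeta = (x, X) = (x - \sqrt{\rho}\,w, X - \sqrt{\rho}\,W) + \sqrt{\rho}\,(w, W)$, we also have
\[
\zeta^\kappa P(t, s)f = \sum_{\gamma\le\kappa} Q_\gamma(t, s)\zeta^\gamma f,
\]
where $q_\gamma(t, s; x, w, X, W)$ satisfy (6.7) with $M + |\kappa| - |\gamma|$ as $M$. Hence from (7.10) and (7.14) we see
\[
\|P(t, s)f\|_{B^a} = \|P(t, s)f\| + \sum_{|\kappa|=a}\bigl(\|\zeta^\kappa P(t, s)f\| + \|\partial_\zeta^\kappa P(t, s)f\|\bigr) \le \text{Const.}\,\|f\|_{B^{a+M}}.
\tag{7.17}
\]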


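8. The Consistency of the Fundamental Operator

Let $C(t, s)$ and $H(t)$ be the fundamental operator defined in Sec. 6 and the operator defined by (3.10) with variables $a_\Lambda = a_{\Lambda_2} = X$, respectively.

Theorem 8.1. Under the assumptions of Theorem 3.1 there exist integers $M \ge 0$, $M' \ge 0$, and $C^\infty$ functions $r(t, s; x, w, X, W)$ and $r'(t, s; x, w, X, W)$ in $0 \le s \le t \le T$, $(x, w) \in \mathbb{R}^{6n}$ and $(X, W) \in \mathbb{R}^{8N}$ satisfying (6.7) for all $\alpha, \beta, \alpha'$ and $\beta'$, respectively, such that
\[
\Bigl(i\hbar\frac{\partial}{\partial t} - H(t)\Bigr)C(t, s)f = \sqrt{t - s}\,R(t, s)f
\tag{8.1}
\]
and
\[
i\hbar\frac{\partial}{\partial s}C(t, s)f + C(t, s)H(s)f = \sqrt{t - s}\,R'(t, s)f
\tag{8.2}
\]
for $f \in \mathcal{S}(\mathbb{R}^{3n}_x \times \mathbb{R}^{4N}_{a_\Lambda})$, where $R(t, s)$ and $R'(t, s)$ are operators defined by (6.8).

Proof. In this proof, we write $x$ and $y$ as $\vec{x}$ and $\vec{y}$, respectively. Let $x$ denote variables in $\mathbb{R}^3$. We may assume $s < t$ from Lemma 6.1. It follows from (3.10), (6.6) and (6.8) that direct calculations show
\[
\begin{aligned}
\Bigl(i\hbar\frac{\partial}{\partial t} - H(t)\Bigr)C(t, s)f
={}& -\prod_{j=1}^n\sqrt{\frac{m_j}{2\pi i\hbar(t-s)}}^{\,3}\sqrt{\frac{1}{2\pi i\hbar|V|(t-s)}}^{\,4N}
\int(\exp i\hbar^{-1}S_c(t, s; q^{t,s}_{\vec{x},\vec{y}}, a^{t,s}_{\Lambda X,Y}))\\
&\times\Bigl\{r_1(t, s; \vec{x}, \vec{y}, X, Y) + \frac{i\hbar}{2}r_2(t, s; \vec{x}, \vec{y}, X, Y)\Bigr\}f(\vec{y}, Y)\,d\vec{y}\,dY
\end{aligned}
\tag{8.3}
\]
by means of (6.3) and (6.4), where
\[
r_1(t, s; \vec{x}, \vec{y}, X, Y) = \partial_t S_c(t, s; q^{t,s}_{\vec{x},\vec{y}}, a^{t,s}_{\Lambda X,Y})
+ \sum_{j=1}^n\frac{1}{2m_j}\Bigl|\partial_{x^{(j)}}S_c - \frac{e_j}{c}\tilde A(x^{(j)}, X)\Bigr|^2 + V_1(\vec{x})
+ \frac{|V|}{2}|\partial_X S_c|^2 + V_2(X)
\tag{8.4}
\]
and
\[
r_2 = \frac{3n + 4N}{t - s} - \sum_{j=1}^n\frac{1}{m_j}\Delta_{x^{(j)}}S_c
+ \frac{1}{c}\sum_{j=1}^n\frac{e_j}{m_j}(\nabla_x\cdot\tilde A)(x^{(j)}, X) - |V|\Delta_X S_c,
\qquad x \in \mathbb{R}^3
\tag{8.5}
\]
(cf. the proof of [18, Proposition 2.3]).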


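Set $\rho = t - s$. From (6.6), we can write
\[
\begin{aligned}
&\partial_{x^{(j)}}S_c - \frac{e_j}{c}\tilde A(x^{(j)}, X)\\
&\quad= \frac{m_j(x^{(j)} - y^{(j)})}{\rho}
+ \frac{e_j}{c}\int_0^1\bigl\{\tilde A(x^{(j)} - \theta(x^{(j)} - y^{(j)}), X - \theta(X - Y)) - \tilde A(x^{(j)}, X)\bigr\}\,d\theta\\
&\qquad+ \frac{e_j}{c}\sum_{l=1}^3 (x_l^{(j)} - y_l^{(j)})\int_0^1 (1 - \theta)\frac{\partial \tilde A_l}{\partial x^{(j)}}(x^{(j)} - \theta(x^{(j)} - y^{(j)}), X - \theta(X - Y))\,d\theta\\
&\qquad- \rho\int_0^1 (1 - \theta)\frac{\partial V_1}{\partial x^{(j)}}(\vec{x} - \theta(\vec{x} - \vec{y}))\,d\theta\\
&\quad= \frac{m_j(x^{(j)} - y^{(j)})}{\rho}
- \frac{e_j}{2c}\sum_{m=1}^3 (x_m^{(j)} - y_m^{(j)})\frac{\partial \tilde A}{\partial x_m}(x^{(j)}, X)
- \frac{e_j}{2c}\sum_{m=1}^{4N}(X_m - Y_m)\frac{\partial \tilde A}{\partial X_m}(x^{(j)}, X)\\
&\qquad+ \frac{e_j}{2c}\sum_{l=1}^3 (x_l^{(j)} - y_l^{(j)})\frac{\partial \tilde A_l}{\partial x^{(j)}}(x^{(j)}, X)
+ \rho\,q_1\Bigl(t, s; \vec{x}, \frac{\vec{x} - \vec{y}}{\sqrt{\rho}}, X, \frac{X - Y}{\sqrt{\rho}}\Bigr)
\end{aligned}
\tag{8.6}
\]
and
\[
\begin{aligned}
\partial_X S_c
={}& \frac{X - Y}{|V|\rho} - \rho\int_0^1 (1 - \theta)\frac{\partial V_2}{\partial X}(X - \theta(X - Y))\,d\theta\\
&+ \frac{1}{c}\sum_{j=1}^n e_j\sum_{l=1}^3 (x_l^{(j)} - y_l^{(j)})\int_0^1 (1 - \theta)\frac{\partial \tilde A_l}{\partial X}(x^{(j)} - \theta(x^{(j)} - y^{(j)}), X - \theta(X - Y))\,d\theta\\
={}& \frac{X - Y}{|V|\rho} + \frac{1}{2c}\sum_{j=1}^n e_j\sum_{l=1}^3 (x_l^{(j)} - y_l^{(j)})\frac{\partial \tilde A_l}{\partial X}(x^{(j)}, X)
+ \rho\,q_2\Bigl(t, s; \vec{x}, \frac{\vec{x} - \vec{y}}{\sqrt{\rho}}, X, \frac{X - Y}{\sqrt{\rho}}\Bigr).
\end{aligned}
\tag{8.7}
\]
We can easily see
\[
-\sum_{k,m=1}^3 (x_k^{(j)} - y_k^{(j)})(x_m^{(j)} - y_m^{(j)})\frac{\partial \tilde A_k}{\partial x_m}(x^{(j)}, X)
+ \sum_{k,l=1}^3 (x_k^{(j)} - y_k^{(j)})(x_l^{(j)} - y_l^{(j)})\frac{\partial \tilde A_l}{\partial x_k}(x^{(j)}, X) = 0.
\tag{8.8}
\]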
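Identity (8.8) is a pure relabeling of summation indices: both double sums equal $v^{\mathsf T} J v$ with $v = x^{(j)} - y^{(j)}$ and $J_{km} = \partial\tilde A_k/\partial x_m$. A minimal sketch follows, with an arbitrary numerical matrix standing in for the Jacobian of $\tilde A$ (the values are illustrative and carry no physical meaning).

```python
def first_sum(v, jac):
    """Sum over k, m of v_k v_m dA_k/dx_m, with jac[k][m] = dA_k/dx_m."""
    return sum(v[k] * v[m] * jac[k][m] for k in range(3) for m in range(3))


def second_sum(v, jac):
    """Sum over k, l of v_k v_l dA_l/dx_k, with jac[l][k] = dA_l/dx_k."""
    return sum(v[k] * v[l] * jac[l][k] for k in range(3) for l in range(3))


def identity_8_8(v, jac):
    """Left-hand side of (8.8); vanishes for every v and every Jacobian."""
    return -first_sum(v, jac) + second_sum(v, jac)
```

Because the cancellation holds for any matrix `jac`, no symmetry of the vector potential is needed for (8.8).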


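Equations (8.6)–(8.8) show
\[
\begin{aligned}
&\sum_{j=1}^n\frac{1}{2m_j}\Bigl|\partial_{x^{(j)}}S_c - \frac{e_j}{c}\tilde A(x^{(j)}, X)\Bigr|^2 + \frac{|V|}{2}|\partial_X S_c|^2\\
&\quad= \frac{1}{2\rho^2}\sum_{j=1}^n m_j|x^{(j)} - y^{(j)}|^2 + \frac{|X - Y|^2}{2|V|\rho^2}
+ \sqrt{\rho}\,q_3\Bigl(t, s; \vec{x}, \frac{\vec{x} - \vec{y}}{\sqrt{\rho}}, X, \frac{X - Y}{\sqrt{\rho}}\Bigr).
\end{aligned}
\tag{8.9}
\]
From (6.6), we also have
\[
\begin{aligned}
\partial_t S_c(t, s; q^{t,s}_{\vec{x},\vec{y}}, a^{t,s}_{\Lambda X,Y})
={}& -\frac{1}{2\rho^2}\sum_{j=1}^n m_j|x^{(j)} - y^{(j)}|^2 - \frac{|X - Y|^2}{2|V|\rho^2} - V_1(\vec{x}) - V_2(X)\\
&+ \sqrt{\rho}\,q_4\Bigl(t, s; \vec{x}, \frac{\vec{x} - \vec{y}}{\sqrt{\rho}}, X, \frac{X - Y}{\sqrt{\rho}}\Bigr).
\end{aligned}
\tag{8.10}
\]
Hence together with (8.4), we obtain
\[
r_1(t, s; \vec{x}, \vec{y}, X, Y) = \sqrt{\rho}\,q_5\Bigl(t, s; \vec{x}, \frac{\vec{x} - \vec{y}}{\sqrt{\rho}}, X, \frac{X - Y}{\sqrt{\rho}}\Bigr).
\tag{8.11}
\]
From (6.6) or (8.6) and (8.7), the same arguments as for (8.11) show
\[
\begin{aligned}
&\sum_{j=1}^n\frac{1}{m_j}\Delta_{x^{(j)}}S_c + |V|\Delta_X S_c\\
&\quad= \frac{3n + 4N}{\rho} + \frac{2}{c}\sum_{j=1}^n\frac{e_j}{m_j}\int_0^1 (1 - \theta)(\nabla_x\cdot\tilde A)(x^{(j)} - \theta(x^{(j)} - y^{(j)}), X - \theta(X - Y))\,d\theta
+ \sqrt{\rho}\,q_6\Bigl(t, s; \vec{x}, \frac{\vec{x} - \vec{y}}{\sqrt{\rho}}, X, \frac{X - Y}{\sqrt{\rho}}\Bigr)\\
&\quad= \frac{3n + 4N}{\rho} + \frac{1}{c}\sum_{j=1}^n\frac{e_j}{m_j}(\nabla_x\cdot\tilde A)(x^{(j)}, X)
+ \sqrt{\rho}\,q_7\Bigl(t, s; \vec{x}, \frac{\vec{x} - \vec{y}}{\sqrt{\rho}}, X, \frac{X - Y}{\sqrt{\rho}}\Bigr).
\end{aligned}
\tag{8.12}
\]
Hence together with (8.5), we get
\[
r_2(t, s; \vec{x}, \vec{y}, X, Y) = -\sqrt{\rho}\,q_7\Bigl(t, s; \vec{x}, \frac{\vec{x} - \vec{y}}{\sqrt{\rho}}, X, \frac{X - Y}{\sqrt{\rho}}\Bigr).
\tag{8.13}
\]
Thus we have completed the proof of (8.1) from (8.3), (8.11) and (8.13).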


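Let us consider (8.2). By direct calculations we see that the left-hand side of (8.2) is equal to
\[
\begin{aligned}
&-\prod_{j=1}^n\sqrt{\frac{m_j}{2\pi i\hbar(t-s)}}^{\,3}\sqrt{\frac{1}{2\pi i\hbar|V|(t-s)}}^{\,4N}
\int(\exp i\hbar^{-1}S_c(t, s; q^{t,s}_{\vec{x},\vec{y}}, a^{t,s}_{\Lambda X,Y}))\\
&\qquad\times\Bigl\{r_1'(t, s; \vec{x}, \vec{y}, X, Y) + \frac{i\hbar}{2}r_2'(t, s; \vec{x}, \vec{y}, X, Y)\Bigr\}f(\vec{y}, Y)\,d\vec{y}\,dY,
\end{aligned}
\tag{8.14}
\]
where
\[
r_1'(t, s; \vec{x}, \vec{y}, X, Y) = \partial_s S_c(t, s; q^{t,s}_{\vec{x},\vec{y}}, a^{t,s}_{\Lambda X,Y})
- \sum_{j=1}^n\frac{1}{2m_j}\Bigl|\partial_{y^{(j)}}S_c + \frac{e_j}{c}\tilde A(y^{(j)}, Y)\Bigr|^2 - V_1(\vec{y})
- \frac{|V|}{2}|\partial_Y S_c|^2 - V_2(Y)
\tag{8.15}
\]
and
\[
r_2' = -\frac{3n + 4N}{t - s} + \sum_{j=1}^n\frac{1}{m_j}\Delta_{y^{(j)}}S_c
+ \frac{1}{c}\sum_{j=1}^n\frac{e_j}{m_j}(\nabla_x\cdot\tilde A)(y^{(j)}, Y) + |V|\Delta_Y S_c.
\tag{8.16}
\]
Consequently, we can prove (8.2) as in the proof of (8.1).

9. The Proofs of the Main Results

We first prove Theorem 3.1. Let $\rho^* > 0$ be the constant determined in Proposition 7.2 and $\chi \in C^\infty(\mathbb{R}^{3n+4N})$ with compact support such that $\chi(0) = 1$. We consider bounded operators $K_j$ and $K_j'$ $(j = 1, 2, \dots, \nu)$ on $B^a(\mathbb{R}^{3n+4N})$. Then, it holds for $f \in B^a(\mathbb{R}^{3n+4N})$ that
\[
\begin{aligned}
&K_\nu\chi(\epsilon\,\cdot)K_{\nu-1}\chi(\epsilon\,\cdot)\cdots\chi(\epsilon\,\cdot)K_1\chi(\epsilon\,\cdot)f - K_\nu' K_{\nu-1}'\cdots K_1' f\\
&\quad= \sum_{j=1}^{\nu}K_\nu\chi(\epsilon\,\cdot)\cdots\chi(\epsilon\,\cdot)K_{j+1}\chi(\epsilon\,\cdot)(K_j - K_j')K_{j-1}'\cdots K_1' f\\
&\qquad+ \sum_{j=0}^{\nu-1}K_\nu\chi(\epsilon\,\cdot)\cdots\chi(\epsilon\,\cdot)K_{j+1}(\chi(\epsilon\,\cdot) - 1)K_j'\cdots K_1' f.
\end{aligned}
\tag{9.1}
\]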
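The telescoping identity (9.1) is purely algebraic, so it can be checked on small matrices. The sketch below reads the convention as follows, which is an interpretation of the extracted formula and not something the text states explicitly: the prefix $K_\nu\chi\cdots\chi K_{j+1}\chi$ is empty for $j = \nu$, and the operators to the right of the difference are the primed ones. Here $2 \times 2$ integer matrices stand in for $K_j$, $K_j'$ and the cutoff $\chi(\epsilon\,\cdot)$.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matadd(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def matsub(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def prefix(K, chi, j):
    """K_nu chi K_{nu-1} chi ... chi K_{j+1}; the identity when j = nu."""
    out = identity(len(chi))
    for i in range(len(K), j, -1):      # i = nu, nu-1, ..., j+1
        out = matmul(out, K[i - 1])     # K[i-1] stores K_i
        if i > j + 1:
            out = matmul(out, chi)
    return out


def primed_tail(Kp, j, n):
    """K'_j ... K'_1; the identity when j = 0."""
    out = identity(n)
    for i in range(1, j + 1):
        out = matmul(Kp[i - 1], out)
    return out


def left_side(K, Kp, chi):
    n = len(chi)
    cut = identity(n)
    for Kj in K:                        # builds K_nu chi ... K_1 chi
        cut = matmul(Kj, matmul(chi, cut))
    return matsub(cut, primed_tail(Kp, len(Kp), n))


def right_side(K, Kp, chi):
    n, nu = len(chi), len(K)
    one = identity(n)
    total = [[0] * n for _ in range(n)]
    for j in range(1, nu + 1):          # first sum in (9.1)
        mid = chi if j < nu else one    # no cutoff in front of the j = nu term
        term = matmul(matmul(prefix(K, chi, j), mid),
                      matmul(matsub(K[j - 1], Kp[j - 1]), primed_tail(Kp, j - 1, n)))
        total = matadd(total, term)
    for j in range(0, nu):              # second sum in (9.1)
        term = matmul(matmul(prefix(K, chi, j), matsub(chi, one)),
                      primed_tail(Kp, j, n))
        total = matadd(total, term)
    return total
```

In the proof, the first sum vanishes when $K_j' = K_j = C(\tau_j, \tau_{j-1})$, so only the terms carrying $\chi(\epsilon\,\cdot) - 1$ remain, and these tend to zero as $\epsilon \to 0$.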


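Noting (6.1) and (6.2), from (3.5) we have
\[
S_c(T, 0; q_\Delta, a_{\Lambda\Delta}) = \sum_{l=1}^{\nu}S_c\bigl(\tau_l, \tau_{l-1}; q^{\tau_l,\tau_{l-1}}_{x^{(l)},x^{(l-1)}}, a^{\tau_l,\tau_{l-1}}_{\Lambda X^{(l)},X^{(l-1)}}\bigr),
\]
where $X^{(l)} = a_\Lambda^{(l)}$ $(l = 1, 2, \dots, \nu - 1)$ and $X^{(\nu)} = a_\Lambda$. So, (3.8) is written as
\[
\lim_{\epsilon\to 0}C(T, \tau_{\nu-1})\chi(\epsilon\,\cdot)C(\tau_{\nu-1}, \tau_{\nu-2})\chi(\epsilon\,\cdot)\cdots C(\tau_2, \tau_1)\chi(\epsilon\,\cdot)C(\tau_1, 0)\chi(\epsilon\,\cdot)f
\]
for $f \in B^a(\mathbb{R}^{3n+4N})$. Let $f \in B^a(\mathbb{R}^{3n+4N})$ and $|\Delta| \le \rho^*$. We can easily see
\[
\sup_{0<\epsilon\le 1}\|\chi(\epsilon\,\cdot)f\|_{B^a} \le \text{Const.}\,\|f\|_{B^a}
\]
and
\[
\lim_{\epsilon\to 0}\|(\chi(\epsilon\,\cdot) - 1)f\|_{B^a} = 0.
\]
Consequently, using Theorem 7.3 and (9.1), we can see that (3.8) is well defined in $B^a$, which is written as
\[
C(T, \tau_{\nu-1})C(\tau_{\nu-1}, \tau_{\nu-2})\cdots C(\tau_2, \tau_1)C(\tau_1, 0)f \quad (= C_\Delta(T, 0)f).
\tag{9.2}
\]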
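The product form (9.2) is what makes the short-time bound (7.5) useful: the exponential step bounds multiply to $e^{K_a T}$ regardless of how fine the subdivision is. A small sketch of this bookkeeping follows; the constant `K` and the subdivisions are arbitrary illustrative values.

```python
import math


def step_bounds(K, subdivision):
    """Bounds exp(K * (tau_l - tau_{l-1})) for each step of the subdivision."""
    return [math.exp(K * (b - a)) for a, b in zip(subdivision, subdivision[1:])]


def composed_bound(K, subdivision):
    """Product of the step bounds, i.e. the bound on the composed operator."""
    total = 1.0
    for bound in step_bounds(K, subdivision):
        total *= bound
    return total


def linear_to_exponential(K0, rho):
    """The pair (1 + 2 K0 rho, exp(2 K0 rho)) used at the end of the a = 0 case."""
    return 1.0 + 2.0 * K0 * rho, math.exp(2.0 * K0 * rho)
```

For any subdivision $0 = \tau_0 < \tau_1 < \dots < \tau_\nu = T$ the composed bound equals $e^{KT}$ up to rounding, which is why the stability constant does not deteriorate as $|\Delta| \to 0$.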

We also see from Remark 3.5 that there exists (3.8) in $\mathcal{S}$ for $f \in \mathcal{S}$.

Let $0 \le t_0 \le t \le T$. For a subdivision $\Delta$ of $[0, T]$ we can find $j$ and $l$ such that $j \le l$, $\tau_{j-1} < t_0 \le \tau_j$ and $\tau_{l-1} < t \le \tau_l$, where we take $j = 1$ for $t_0 = 0$. Then we define
\[
C_\Delta(t, t_0)f := \lim_{\epsilon\to 0}C(t, \tau_{l-1})\chi(\epsilon\,\cdot)C(\tau_{l-1}, \tau_{l-2})\chi(\epsilon\,\cdot)\cdots\chi(\epsilon\,\cdot)C(\tau_{j+1}, \tau_j)\chi(\epsilon\,\cdot)C(\tau_j, t_0)\chi(\epsilon\,\cdot)f
\tag{9.3}
\]
for $f \in B^a$ as was stated in Remark 3.3. Let $|\Delta| \le \rho^*$. Then we have
\[
C_\Delta(t, t_0)f = C(t, \tau_{l-1})C(\tau_{l-1}, \tau_{l-2})\cdots C(\tau_{j+1}, \tau_j)C(\tau_j, t_0)f
\]
as in the proof of (9.2). Consequently, from (7.5) we have
\[
\|C_\Delta(t, t_0)f\|_{B^a} \le e^{K_a(t - t_0)}\|f\|_{B^a} \qquad (a = 0, 1, 2, \dots)
\tag{9.4}
\]
under the assumptions of Theorem 3.1.

Proposition 9.1. Let $|\Delta| \le \rho^*$. Then, under the assumptions of Theorem 3.1 we can find an integer $M \ge 2$ such that
\[
\|C_\Delta(t, t_0)f - C_\Delta(t', t_0')f\|_{B^a} \le C_a(|t - t'| + |t_0 - t_0'|)\|f\|_{B^{a+M}}
\tag{9.5}
\]
for $0 \le t_0 \le t \le T$, $0 \le t_0' \le t' \le T$ and $a = 0, 1, 2, \dots$.


Proof. Let $R(t, s)$ and $R'(t, s)$ be the operators defined by (8.1) and (8.2), respectively. We determine $M$ in Proposition 9.1 by $\max(M, M', 2)$ for $M$ and $M'$ in Theorem 8.1. We can easily see
\[
i\hbar(C(t, s)f - C(t', s)f) = \int_{t'}^{t}\bigl(H(\theta)C(\theta, s)f + \sqrt{\theta - s}\,R(\theta, s)f\bigr)\,d\theta
\tag{9.6}
\]
from (8.1) for $s \le t' \le t \le T$. Let $\tau_j < t \le \tau_{j+1}$ and $\tau_k < t' \le \tau_{k+1}$. So $j \ge k$ holds. Using the equation just after (9.3) and (9.6), we get
\[
\begin{aligned}
i\hbar(C_\Delta(t, t_0)f - C_\Delta(t', t_0)f)
={}& \int_{t'}^{t}H(\theta)C_\Delta(\theta, t_0)f\,d\theta + \int_{\tau_j}^{t}\sqrt{\theta - \tau_j}\,R(\theta, \tau_j)\,d\theta\,C_\Delta(\tau_j, t_0)f\\
&+ \sum_{l=1}^{j-k-1}\int_{\tau_{j-l}}^{\tau_{j-l+1}}\sqrt{\theta - \tau_{j-l}}\,R(\theta, \tau_{j-l})\,d\theta\,C_\Delta(\tau_{j-l}, t_0)f\\
&+ \int_{t'}^{\tau_{k+1}}\sqrt{\theta - \tau_k}\,R(\theta, \tau_k)\,d\theta\,C_\Delta(\tau_k, t_0)f.
\end{aligned}
\tag{9.7}
\]
See the proof of [21, Theorem 4.2] for further details. As in the proof of (7.14), we see
\[
\|H(t)f\|_{B^a} \le \text{Const.}\,\|f\|_{B^{a+M}}
\tag{9.8}
\]
from (3.10) because of $M \ge 2$. We also see
\[
\|R(t, s)f\|_{B^a} \le \text{Const.}\,\|f\|_{B^{a+M}}
\tag{9.9}
\]
from Proposition 7.4 for $0 \le t - s \le \rho^*$. Consequently, (9.4) and (9.7) show
\[
\|C_\Delta(t, t_0)f - C_\Delta(t', t_0)f\|_{B^a} \le \text{Const.}\,e^{K_{a+M}T}(1 + \sqrt{\rho^*})|t - t'|\,\|f\|_{B^{a+M}}
\]
for $0 \le t_0 \le t' \le t \le T$. The inequality above holds for $0 \le t_0 \le t', t \le T$. In the same way we get
\[
\|C_\Delta(t, t_0)f - C_\Delta(t, t_0')f\|_{B^a} \le \text{Const.}\,e^{K_{a+M}T}(1 + \sqrt{\rho^*})|t_0 - t_0'|\,\|f\|_{B^{a+M}}
\]
for $0 \le t_0, t_0' \le t \le T$. Hence, we can complete the proof of Proposition 9.1.

Let $M \ge 2$ be the integer determined in Proposition 9.1. We consider a solution $u(t)$, which is $B^M$-valued continuous and $L^2$-valued continuously differentiable in $[t_0, T]$, to (3.9) with $u(t_0) = 0$ for a $t_0 \in [0, T)$. Then, noting $M \ge 2$, from (3.9) and (3.10) we can easily see
\[
\frac{d}{dt}(u(t), u(t)) = 2\,\Re\Bigl(\frac{du}{dt}(t), u(t)\Bigr) = -2\hbar^{-1}\,\Re\, i(H(t)u(t), u(t)) = 0
\]


and so $u(t) = 0$ in $[t_0, T]$, where $\Re a$ for a complex number $a$ denotes the real part of $a$. Consequently, we can see for a given $f \in B^{a+M}$ that the solution to (3.9) with $u(t_0) = f$ is determined uniquely in the space of all $B^M$-valued continuous and $L^2$-valued continuously differentiable functions in $[t_0, T]$.

Let $\{\Delta_j\}_{j=1}^\infty$ be a family of subdivisions of $[0, T]$ such that $|\Delta_j| \le \rho^*$ and $\lim_{j\to\infty}|\Delta_j| = 0$. Take an arbitrary $f \in B^{a+2M}$ $(a = 0, 1, 2, \dots)$. Then we see from (9.4) and (9.5) that $\{C_{\Delta_j}(t, t_0)f\}_{j=1}^\infty$ is uniformly bounded as a family of $B^{a+2M}$-valued continuous functions and equicontinuous as a family of $B^{a+M}$-valued functions in $0 \le t_0 \le t \le T$, respectively. It follows from the Rellich criterion (cf. [28, Theorem XIII.65]) that the embedding map from $B^M$ into $L^2$ is compact. So is the embedding map from $B^{a+2M}$ into $B^{a+M}$ from (7.12), (7.13) and [17, Lemma 2.5] with $a = b = 1$. Consequently, from the Ascoli–Arzelà theorem we can find a subsequence $\{\Delta_{j_k}\}_{k=1}^\infty$, which may depend on $f$, such that $C_{\Delta_{j_k}}(t, t_0)f$ converges in $B^{a+M}$ uniformly in $0 \le t_0 \le t \le T$ as $k \to \infty$. Since $C_{\Delta_j}(t_0, t_0)f = f$ follows from Lemma 6.1, (9.7)–(9.9) show that $\lim_{k\to\infty}C_{\Delta_{j_k}}(t, t_0)f =: U(t, t_0)f$, where $U(t, t_0)f$ is a $B^{a+M}$-valued continuous and $B^a$-valued continuously differentiable function in $0 \le t_0 \le t \le T$ satisfying (3.9) with $u(t_0) = f$. Hence, it follows from the uniqueness of solutions to (3.9) proved above that $C_\Delta(t, t_0)f$ converges to $U(t, t_0)f$ in $B^{a+M}$ uniformly in $0 \le t_0 \le t \le T$ as $|\Delta| \to 0$.

Take an arbitrary $f \in B^a$. Let $\Delta$ and $\Delta'$ be subdivisions such that $|\Delta| \le \rho^*$ and $|\Delta'| \le \rho^*$. For any $\epsilon > 0$ we can take a $g \in B^{a+2M}$ such that $\|g - f\|_{B^a} < \epsilon$. Then from (9.4) we have
\[
\begin{aligned}
\|C_\Delta(t, t_0)f - C_{\Delta'}(t, t_0)f\|_{B^a}
&\le \|C_\Delta(t, t_0)g - C_{\Delta'}(t, t_0)g\|_{B^a} + \|C_\Delta(t, t_0)(f - g)\|_{B^a} + \|C_{\Delta'}(t, t_0)(f - g)\|_{B^a}\\
&\le \|C_\Delta(t, t_0)g - C_{\Delta'}(t, t_0)g\|_{B^{a+M}} + 2e^{K_a T}\epsilon.
\end{aligned}
\]
So,
\[
\lim_{|\Delta|,|\Delta'|\to 0}\ \max_{0\le t_0\le t\le T}\|C_\Delta(t, t_0)f - C_{\Delta'}(t, t_0)f\|_{B^a} \le 2e^{K_a T}\epsilon.
\tag{9.10}
\]
Hence, we can see that $C_\Delta(t, t_0)f$ converges in $B^a$ uniformly in $0 \le t_0 \le t \le T$ as $|\Delta| \to 0$. We write this limit as $W(t, t_0)f$.

Let $f \in B^a$. Take $f_j \in B^{a+M}$ such that $\lim_{j\to\infty}f_j = f$ in $B^a$. From (9.7) we have
\[
i\hbar(W(t, t_0)f_j - f_j) = \int_{t_0}^{t}H(\theta)W(\theta, t_0)f_j\,d\theta
\]
in $B^a$. The inequality $\|W(t, t_0)f\|_{B^a} \le e^{K_a(t - t_0)}\|f\|_{B^a}$ holds from (9.4). So, from [17, Lemma 2.5] with $a = b = 1$ we can see
\[
i\hbar(W(t, t_0)f - f) = \int_{t_0}^{t}H(\theta)W(\theta, t_0)f\,d\theta
\]


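in $B^{a-2}$ and that $W(t, t_0)f$ is $B^a$-valued continuous and $B^{a-2}$-valued continuously differentiable in $0 \le t_0 \le t \le T$. Hence $\lim_{|\Delta|\to 0}C_\Delta(t, t_0)f$ $(= W(t, t_0)f)$ satisfies (3.9) with $u(t_0) = f$. Thus, we have completed the proof of Theorem 3.1.

We shall consider the proof of Theorem 3.2. Let $q^{t,s}_{x,y}(\theta)$ and $a^{t,s}_{\Lambda X,Y}(\theta)$ be the paths defined by (6.1) and (6.2) for $s < t$, respectively. For $\xi_k \in \mathbb{R}^2$ $(k \in \Lambda_1')$ we define the path by
\[
\phi^{t,s}_{\xi_k}(\theta) := \xi_k + \frac{4\pi\rho_k(q^{t,s}_{x,y}(\theta))}{|k|^2} \in \mathbb{R}^2, \qquad s \le \theta \le t
\tag{9.11}
\]
as in (3.12). The path $\phi^{t,s}_{\xi_k}(\theta) \in \mathbb{R}^2$ $(k \in \Lambda_1)$ is defined by (2.13). So from (2.16) and (2.17) we have
\[
\xi_{-k}^{(1)} = \xi_k^{(1)}, \qquad \xi_{-k}^{(2)} = -\xi_k^{(2)}.
\]
For $k \in \Lambda_1$, we can easily see
\[
\begin{aligned}
|k|^2\bigl|\phi^{t,s}_{\xi_k}(\theta)\bigr|^2 - 8\pi\rho_k(q^{t,s}_{x,y}(\theta))\cdot\phi^{t,s}_{\xi_k}(\theta)
&= |k|^2\Bigl|\phi^{t,s}_{\xi_k} - \frac{4\pi\rho_k}{|k|^2}\Bigr|^2 - \frac{16\pi^2}{|k|^2}|\rho_k|^2\\
&= |k|^2|\xi_k|^2 - \frac{16\pi^2}{|k|^2}\bigl|\rho_k(q^{t,s}_{x,y}(\theta))\bigr|^2.
\end{aligned}
\tag{9.12}
\]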
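Equation (9.12) is a completion of the square in $\mathbb{R}^2$ for the shifted path (9.11). The sketch below checks it numerically, treating $\rho_k$ and $\xi_k$ as real 2-vectors and writing `k2` for $|k|^2$; the sample values and function names are arbitrary illustrations.

```python
import math


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def shifted_path_value(xi, rho, k2):
    """phi = xi + 4 pi rho / |k|^2, as in (9.11) at a fixed time theta."""
    c = 4.0 * math.pi / k2
    return (xi[0] + c * rho[0], xi[1] + c * rho[1])


def square_lhs(xi, rho, k2):
    """|k|^2 |phi|^2 - 8 pi rho . phi."""
    phi = shifted_path_value(xi, rho, k2)
    return k2 * dot(phi, phi) - 8.0 * math.pi * dot(rho, phi)


def square_rhs(xi, rho, k2):
    """|k|^2 |xi|^2 - 16 pi^2 |rho|^2 / |k|^2."""
    return k2 * dot(xi, xi) - 16.0 * math.pi ** 2 * dot(rho, rho) / k2
```

The cross terms in $\xi_k \cdot \rho_k$ cancel exactly, which is why the $\xi_k$-dependence of the action in (9.13) is a pure quadratic form.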

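So, the classical action for $\tilde L$ defined by (3.11) is written as
\[
S(t, s; q^{t,s}_{x,y}, a^{t,s}_{\Lambda X,Y}, \{\phi^{t,s}_{\xi_k}\}_{k\in\Lambda_1})
= S_c(t, s; q^{t,s}_{x,y}, a^{t,s}_{\Lambda X,Y}) + \frac{t - s}{4\pi|V|}\sum_{k\in\Lambda_1}|k|^2|\xi_k|^2
\tag{9.13}
\]
from (2.21) and (3.3). Let $\chi_1 \in C^\infty(\mathbb{R}^{2N_1})$ with compact support such that $\chi_1(0) = 1$. Let $\epsilon > 0$ and $\xi := \{\xi_k\}_{k\in\Lambda_1} \in \mathbb{R}^{2N_1}$. For $f \in \mathcal{S}(\mathbb{R}^{3n+4N})$ we define $G_\epsilon(t, s)f$ $(0 \le s \le t \le T)$ by
\[
G_\epsilon(t, s)f :=
\begin{cases}
\displaystyle\prod_{j=1}^n\sqrt{\frac{m_j}{2\pi i\hbar(t-s)}}^{\,3}\sqrt{\frac{1}{2\pi i\hbar|V|(t-s)}}^{\,4N}\prod_{k\in\Lambda_1}\frac{|k|^2(t-s)}{4i\pi^2\hbar|V|}
\int\!\cdots\!\int e^{i\hbar^{-1}S}\chi_1(\epsilon\xi)f(y, Y)\,dy\,dY\prod_{k\in\Lambda_1}d\xi_k, & s < t,\\[2ex]
f, & s = t,
\end{cases}
\tag{9.14}
\]
where $S = S(t, s; q^{t,s}_{x,y}, a^{t,s}_{\Lambda X,Y}, \{\phi^{t,s}_{\xi_k}\}_{k\in\Lambda_1})$.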


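Proposition 9.2. Let $f \in B^a(\mathbb{R}^{3n+4N})$ $(a = 0, 1, 2, \dots)$. Then, under the assumptions of Theorem 3.1 we have
\[
\lim_{\epsilon\to 0}G_\epsilon(t, s)f = C(t, s)f
\tag{9.15}
\]
in $B^a$ for $0 \le t - s \le \rho^*$.

Proof. In the case $t = s$, Eq. (9.15) is clear from (7.6). Let $0 < t - s \le \rho^*$ and $f \in \mathcal{S}(\mathbb{R}^{3n+4N})$. From (9.13) we have
\[
\begin{aligned}
G_\epsilon(t, s)f ={}& \prod_{j=1}^n\sqrt{\frac{m_j}{2\pi i\hbar(t-s)}}^{\,3}\sqrt{\frac{1}{2\pi i\hbar|V|(t-s)}}^{\,4N}
\int(\exp i\hbar^{-1}S_c(t, s; q^{t,s}_{x,y}, a^{t,s}_{\Lambda X,Y}))f(y, Y)\,dy\,dY\\
&\times\prod_{k\in\Lambda_1}\frac{|k|^2(t-s)}{4i\pi^2\hbar|V|}\int\!\cdots\!\int\exp\Bigl\{\frac{i(t-s)}{4\pi\hbar|V|}\sum_{k\in\Lambda_1}|k|^2|\xi_k|^2\Bigr\}\chi_1(\epsilon\xi)\prod_{k\in\Lambda_1}d\xi_k.
\end{aligned}
\]
Let $\eta_k := (\eta_k^{(1)}, \eta_k^{(2)}) \in \mathbb{R}^2$ and $\eta := \{\eta_k\}_{k\in\Lambda_1}$. We know
\[
\int_{-\infty}^{\infty}e^{ia\theta^2}\,d\theta = \sqrt{\frac{i\pi}{a}}
\tag{9.16}
\]
for a constant $a > 0$. So we can write
\[
G_\epsilon(t, s)f = P_\epsilon(t, s)f,
\tag{9.17}
\]
where
\[
p_\epsilon(t, s) = \prod_{k\in\Lambda_1}\frac{|k|^2}{i\pi}\int\!\cdots\!\int\exp\Bigl\{i\sum_{k\in\Lambda_1}|k|^2|\eta_k|^2\Bigr\}\chi_1\bigl(\epsilon\sqrt{4\pi\hbar|V|/(t-s)}\,\eta\bigr)\prod_{k\in\Lambda_1}d\eta_k.
\tag{9.18}
\]
We see that
\[
\lim_{\epsilon\to 0}p_\epsilon(t, s) = 1
\]
pointwise. Letting $q_\epsilon(t, s) = p_\epsilon(t, s) - 1$, we have
\[
P_\epsilon(t, s)f - C(t, s)f = Q_\epsilon(t, s)f.
\tag{9.19}
\]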
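The limit $p_\epsilon \to 1$ can be made concrete in one dimension if the compactly supported cutoff is replaced by a Gaussian, $\chi(\eta) = e^{-\eta^2}$. This replacement is an assumption made here only to obtain a closed form; the paper's $\chi_1$ has compact support. Then $\sqrt{a/(i\pi)}\int e^{ia\eta^2}e^{-\epsilon^2\eta^2}\,d\eta = \sqrt{a/(i\pi)}\,\sqrt{\pi/(\epsilon^2 - ia)}$, which tends to 1 as $\epsilon \to 0$ and recovers (9.16).

```python
import cmath
import math


def p_closed(a, eps):
    """Closed form of the normalized, Gaussian-regularized Fresnel integral."""
    norm = cmath.sqrt(a / (1j * math.pi))
    gauss = cmath.sqrt(math.pi / (eps ** 2 - 1j * a))
    return norm * gauss


def p_numeric(a, eps, half_width=12.0, steps=48000):
    """Trapezoidal approximation of the same integral on [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0 + 0.0j
    for i in range(steps + 1):
        eta = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * cmath.exp((1j * a - eps ** 2) * eta ** 2)
    return cmath.sqrt(a / (1j * math.pi)) * total * h
```

For small `eps` the value `p_closed(a, eps)` behaves like $1 - i\epsilon^2/(2a)$, so the deviation from 1 is of order $\epsilon^2$ in this Gaussian model.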


We consider
\[
\|G_\epsilon(t, s)f - C(t, s)f\|^2 = \|P_\epsilon(t, s)f - C(t, s)f\|^2
= \bigl((P_\epsilon(t, s) - C(t, s))^\dagger(P_\epsilon(t, s) - C(t, s))f, f\bigr) = \bigl(Q_\epsilon(t, s)^\dagger Q_\epsilon(t, s)f, f\bigr).
\]
Hence, we obtain (9.15) as in the proof of Theorem 7.3 in the present paper together with [17, Lemma 2.2]. See the proof of [20, Lemma 4.1] for further details.

We can write (3.13) as
\[
\lim_{\epsilon\to 0}G_\epsilon(T, \tau_{\nu-1})\chi(\epsilon\,\cdot)G_\epsilon(\tau_{\nu-1}, \tau_{\nu-2})\chi(\epsilon\,\cdot)\cdots G_\epsilon(\tau_2, \tau_1)\chi(\epsilon\,\cdot)G_\epsilon(\tau_1, 0)\chi(\epsilon\,\cdot)f
\tag{9.20}
\]
in the same way that (3.8) is written just above (9.2). Integrating by parts in (9.18), we see that $\sup_{0<\epsilon\le 1}|p_\epsilon(t, s)|$ is finite. So the same proof as for (7.5) shows
\[
\sup_{0<\epsilon\le 1}\|G_\epsilon(t, s)f\|_{B^a} \le C_a\|f\|_{B^a}, \qquad a = 0, 1, 2, \dots
\]
with constants $C_a$ from (9.17). Hence, using (9.1), we can prove Theorem 3.2 as in the proof of the convergence of (3.8) to (9.2) together with (9.15).

Finally, we will prove Theorem 3.3. As in the proof of (6.15) we get
\[
\begin{aligned}
&\int_{q^{t,s}_{z,y}}\Bigl(\frac{1}{c}A_{ex}(t, x^{(j)})\cdot dx^{(j)} - \phi_{ex}(t, x^{(j)})\,dt\Bigr)
- \int_{q^{t,s}_{z,x}}\Bigl(\frac{1}{c}A_{ex}(t, x^{(j)})\cdot dx^{(j)} - \phi_{ex}(t, x^{(j)})\,dt\Bigr)\\
&\quad= \frac{1}{c}(x^{(j)} - y^{(j)})\cdot\int_0^1 A_{ex}(s, x^{(j)} - \theta(x^{(j)} - y^{(j)}))\,d\theta
- (t - s)(x^{(j)} - y^{(j)})\cdot\int_0^1\!\!\int_0^1 \sigma_1 E_{ex}(\tau(\sigma), \zeta^{(j)}(\sigma))\,d\sigma_1 d\sigma_2\\
&\qquad- \frac{1}{c}\sum_{m=1}^3 (x_m^{(j)} - y_m^{(j)})\sum_{l=1}^3 (z_l^{(j)} - x_l^{(j)})\int_0^1\!\!\int_0^1 \sigma_1 B_{ml}(\tau(\sigma), \zeta^{(j)}(\sigma))\,d\sigma_1 d\sigma_2
\end{aligned}
\tag{9.21}
\]
for $s < t$, where $(B_{23}(t, x), B_{31}(t, x), B_{12}(t, x)) = B_{ex}(t, x)$, $B_{lm} = -B_{ml}$, and $\tau(\sigma)$ and $\zeta^{(j)}(\sigma)$ were defined by (6.11). See the proof of [18, Proposition 3.3] for further details. So, we get Eq. (6.18) with the sum over $j = 1, 2, \dots, n$ of (9.21), multiplied by $m_j e_j/(t - s)$, added to it. Hence, under the assumptions of Theorem 3.3 we obtain the same assertion as in Theorem 3.1 in the same way that Theorem 3.1 is proved. In the same way as in the proof of Theorem 3.2 we also get the same assertion as in Theorem 3.2 under the assumptions of Theorem 3.3. Thus, we have completed the proof of the main results.


Acknowledgements The author thanks the referee for many useful suggestions. This research was partially supported by Grant-in-Aid for Scientiﬁc Research No. 16540145 and No. 19540175, Ministry of Education, Culture, Sports, Science and Technology, Japanese Government.

References
[1] S. Albeverio, A. Hahn and A. N. Sengupta, Chern–Simons theory, Hida distribution, and state models, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 6 (2003) 65–81.
[2] S. Albeverio, R. J. Høegh-Krohn and S. Mazzucchi, Mathematical Theory of Feynman Path Integrals, Lecture Notes in Math. Vol. 523, 2nd edn. (Springer-Verlag, Berlin and Heidelberg, 2008).
[3] A. Arai, Fock Space and Quantum Field (Nihon Hyoron Co., Tokyo, 2000) (in Japanese).
[4] F. A. Berezin and M. A. Shubin, The Schrödinger Equation (Kluwer Academic Publishers, Dordrecht, 1983).
[5] A. P. Calderón and R. Vaillancourt, On the boundedness of pseudo-differential operators, J. Math. Soc. Japan 23 (1971) 374–378.
[6] J. M. Cook, The mathematics of second quantization, Trans. Amer. Math. Soc. 74 (1953) 222–245.
[7] P. A. M. Dirac, The Principles of Quantum Mechanics, 4th edn. (Oxford Univ. Press, London, 1958).
[8] E. Fermi, Quantum theory of radiation, Rev. Mod. Phys. 4 (1932) 87–132.
[9] R. P. Feynman, Space-time approach to nonrelativistic quantum mechanics, Rev. Mod. Phys. 20 (1948) 367–387.
[10] R. P. Feynman, Mathematical formulation of the quantum theory of electrodynamic interaction, Phys. Rev. 80 (1950) 440–457.
[11] R. P. Feynman and A. R. Hibbs, Quantum Mechanics and Path Integrals (McGraw-Hill, New York, 1965).
[12] J. Fröhlich, M. Griesemer and I. M. Sigal, Spectral theory for the standard model of non-relativistic QED, Comm. Math. Phys. 283 (2008) 613–646.
[13] I. M. Gel'fand and N. Y. Vilenkin, Generalized Functions. Vol. IV, Applications of Harmonic Analysis (Academic Press, New York-London, 1964).
[14] S. J. Gustafson and I. M. Sigal, Mathematical Concepts of Quantum Mechanics (Springer, Berlin, 2003).
[15] T. Hida, H.-H. Kuo, J. Potthoff and L. Streit, White Noise (Kluwer Academic Publishers, Dordrecht, 1993).
[16] F. Hiroshima, Functional integral representation of a model in quantum electrodynamics, Rev. Math. Phys. 9 (1997) 489–530.
[17] W. Ichinose, A note on the existence and ℏ-dependency of the solution of equations in quantum mechanics, Osaka J. Math. 32 (1995) 327–345.
[18] W. Ichinose, On the formulation of the Feynman path integral through broken line paths, Comm. Math. Phys. 189 (1997) 17–33.
[19] W. Ichinose, On convergence of the Feynman path integral formulated through broken line paths, Rev. Math. Phys. 11 (1999) 1001–1025.
[20] W. Ichinose, The phase space Feynman path integral with gauge invariance and its convergence, Rev. Math. Phys. 12 (2000) 1451–1463.
[21] W. Ichinose, Convergence of the Feynman path integral in the weighted Sobolev spaces and the representation of correlation functions, J. Math. Soc. Japan 55 (2003) 957–983.
[22] W. Ichinose, The continuity of the solution with respect to an electromagnetic potential to the Schrödinger equation and the Dirac equation, preprint (2009).
[23] G. W. Johnson and M. L. Lapidus, The Feynman Integral and Feynman's Operational Calculus (Oxford Univ. Press, Oxford, 2000).
[24] H. Kumano-go, Pseudo-Differential Operators (MIT Press, Cambridge, 1981).
[25] E. H. Lieb and M. Loss, Analysis (Amer. Math. Soc., Providence, 1997).
[26] S. Mizohata, The Theory of Partial Differential Equations (Cambridge Univ. Press, New York, 1973).
[27] E. Nelson, Schrödinger particles interacting with a quantized scalar field, in Proceedings of a Conference on the Theory and Applications of Analysis in Function Space (M.I.T. Press, Cambridge, 1964), pp. 88–120.
[28] M. Reed and B. Simon, Methods of Modern Mathematical Physics IV: Analysis of Operators (Academic Press, New York, 1978).
[29] J. J. Sakurai, Advanced Quantum Mechanics (Addison-Wesley, Massachusetts, 1967).
[30] F. E. Schroeck, Jr., Generalization of the Cook formalism for Fock space, J. Math. Phys. 12 (1971) 1849–1857.
[31] J. T. Schwartz, Nonlinear Functional Analysis (Gordon and Breach Science Publishers, New York, 1969).
[32] H. Spohn, Dynamics of Charged Particles and Their Radiation Field (Cambridge University Press, Cambridge, 2004).
[33] M. S. Swanson, Path Integrals and Quantum Processes (Academic Press, San Diego, 1992).

July 12, 2010 12:0 WSPC/S0129-055X J070-S0129055X10004053

148-RMP

Reviews in Mathematical Physics Vol. 22, No. 6 (2010) 597–667 c World Scientiﬁc Publishing Company DOI: 10.1142/S0129055X10004053

GRADIENT FLOWS FOR OPTIMIZATION IN QUANTUM INFORMATION AND QUANTUM DYNAMICS: FOUNDATIONS AND APPLICATIONS

THOMAS SCHULTE-HERBRÜGGEN∗ and STEFFEN J. GLASER
Department of Chemistry, Technical University of Munich (TUM), Lichtenbergstrasse 4, D-85747 Garching, Germany
∗[email protected]

GUNTHER DIRR† and UWE HELMKE
Institute of Mathematics, University of Würzburg, Am Hubland, D-97074 Würzburg, Germany
†[email protected]

Received 14 December 2008
Revised 26 February 2010

Many challenges in quantum information and quantum control are rooted in constrained optimization problems on finite-dimensional quantum systems. The constraints often arise from two facts: (i) quantum dynamic state spaces are naturally smooth manifolds (orbits of the respective initial states) rather than being Hilbert spaces; (ii) the dynamics of the respective quantum system may be restricted to a proper subset of the entire state space. Mathematically, either case can be treated by constrained optimization over the reachable set of an underlying control system. Thus, whenever the reachable set takes the form of a smooth manifold, Riemannian optimization methods apply. Here, we give a comprehensive account of the foundations of gradient flows on Riemannian manifolds including new applications in quantum information and quantum dynamics. Yet, we do not pursue the problem of designing explicit controls for the underlying control systems. The framework is sufficiently general for setting up gradient flows on (sub)manifolds, Lie (sub)groups, and (reductive) homogeneous spaces. Relevant convergence conditions are discussed, in particular for gradient flows on compact and analytic manifolds. This is meant to serve as a foundation for new achievements and further research. Illustrative examples and new applications are given: we extend former results on unitary groups to closed subgroups with tensor-product structure, where the finest product partitioning relates to $SU_{loc}(2^n) := SU(2) \otimes \cdots \otimes SU(2)$, known as (qubit-wise) local unitary operations. Such applications include, e.g., optimizing figures of merit on $SU_{loc}(2^n)$ relating to distance measures of pure-state entanglement as well as to best rank-1 approximations of higher-order tensors. In quantum information, our gradient flows provide a numerically favorable alternative to standard tensor-SVD techniques.

Keywords: Constrained optimization in quantum systems; Riemannian optimization; Riemannian gradient flows and algorithms; double-bracket flows; quantum control; low-rank approximation of tensors; tensor SVD.

Mathematics Subject Classification 2010: 49-02, 49R50, 53-02, 53Cxx, 65Kxx, 81V70, 90C30, 15A18, 15A69

July 12, 2010 12:0 WSPC/S0129-055X 148-RMP J070-S0129055X10004053

Contents

1. Introduction
2. Overview
   2.1. Flows and dynamical systems
   2.2. Gradient flows for optimization
   2.3. Discretized gradient flows
   2.4. Reachability and controllability
   2.5. Settings of interest
3. Theory: Gradient Flows
   3.1. Gradient flows on Riemannian manifolds
   3.2. Gradient flows on Lie groups
   3.3. Gradient flows on homogeneous spaces
   3.4. Examples
4. Applications to Quantum Information and Quantum Control
   4.1. A geometric measure of pure-state entanglement
   4.2. Generalized local subgroups
   4.3. Locally reversible interaction Hamiltonians
   4.4. Intrinsic versus penalty approach: An example
5. Conclusions

1. Introduction

Controlling quantum systems offers great potential for performing computational tasks or for simulating the behavior of other quantum systems (which are difficult to handle experimentally) or classical systems [1,2], when the complexity of a problem reduces upon going from a classical to a quantum setting [3]. Important examples are known in quantum computation, quantum search and quantum simulation. Most prominently, there is the exponential speed-up by Shor's quantum algorithm of prime factorisation [4,5], which on a general level relates to the class of quantum algorithms [6,7] solving hidden subgroup problems in an efficient way [8]. In Grover's quantum search algorithm [9,10] one still finds a quadratic acceleration compared to classical approaches [11]. Recently, the simulation of quantum phase-transitions [12] has shifted into focus [13,14].

Among the generic tools needed for advances in quantum simulation and quantum technology, quantum control plays a major role. For a survey see, e.g., [15,16]. Its key concern is to develop (optimal) control strategies and constructive ways for implementing them under realistic experimental settings such that a certain performance index is maximized. In a wider sense, such figures of merit depend on terminal conditions as well as on running costs like time or energy. Yet in quantum control important classes of performance indices are completely determined by

their value at the ﬁnal state, typical examples being quantum gate ﬁdelities, eﬃciencies of state transfer or coherence transfer, as well as distance functions related to Euclidean entanglement measures. Since realistic quantum systems are mostly beyond analytical tractability, numerical methods are often indispensable. A good strategy is to proceed in two steps: (a) ﬁrstly, by exploring the possible gains on an abstract and computationally cheap level, i.e. by maximizing the quality function either over the entire state space or over the set of possible states — the so-called reachable set; (b) secondly, by going into optimizing the experimental controls (“pulse shapes”) in a concrete setting. However, (b) is often computationally expensive and highly sensitive, as it actually approximates the solution of a constrained variational problem in an inﬁnite dimensional function space. Here we almost entirely focus on task (a) and no longer pursue issues of optimal control (b). In particular, we do not address the problem of designing controls that steer concrete experimental setups to achieve optimal ﬁgures of merit. By merely depending on the geometry of the underlying state-space manifold the ﬁrst instance (a) allows for analyzing in advance and on an abstract level the limits of what one can achieve in step (b). We therefore refer to (a) as the abstract optimization task. The second instance, in contrast, hinges on introducing the speciﬁc time scales and control parameters of an experimental setting for ﬁnding steerings of the quantum dynamical system such that the optima determined in (a) are actually assumed. This is why we term (b) the dynamic optimal control task. Certainly, one can approach the entire problem only in terms of (b) and sometimes one is even forced to do so, e.g., if nothing is known about the geometric structure of the reachable set. 
Yet, the above two-fold strategy serves to yield (strict) upper bounds (independent of the concrete experimental setting) in (a) which provide benchmarks for the reliability of the numerical methods applied in (b).

In a pioneering paper [17], Brockett introduced the idea of exploiting gradient flows on the orthogonal group for diagonalizing real symmetric matrices and for sorting lists. In a series of subsequent papers he extended the concept to intrinsic gradient methods for (constrained) optimization [18,19]. Soon these techniques were generalized to Riemannian manifolds, their mathematical and numerical details were worked out [20–22] and thus they turned out to be applicable to a broad range of optimization tasks including eigenvalue and singular-value problems, principal component analysis, matrix least-squares matching problems, balanced matrix factorisations, and combinatorial optimization; for an overview see, e.g., [22, 23]. Implementing a gradient method for optimization on a smooth (constrained) manifold, such as a unitary orbit, via the Riemannian exponential map inherently ensures that the (discretized) flow remains within the manifold. Alternatively, formulating the optimization problem on some embedding Euclidean space comes at the expense of additional constraints (e.g., enforcing unitarity) to be taken care of by penalty-type or augmented Lagrange-type techniques. In this sense, gradient flows on manifolds are intrinsic optimization methods [24], whereas extrinsic

optimizations on an embedding space require in general nonlinear projective techniques in order to stay on the (constrained) manifold. In particular, using the diﬀerential geometry of matrix manifolds has thus become a ﬁeld of active research. For new developments (however without exploiting the Lie structure to the full extent) see, e.g., [25]. Even beyond manifolds, gradient ﬂows have recently been described for metric spaces with applications of probability theory [26]. For optimization in quantum dynamics, gradient ﬂows and their discrete numerical integration schemes have also proven powerful tools. This holds in both types of tasks: (a) for exploring the maxima of pertinent quality functions on the reachable set of a quantum system, e.g., on the unitary group and its orbits [27–29] and (b) beyond the current focus also for arriving at concrete experimental steerings (i.e.“pulse shapes”). These steerings actually achieve the quality limits established in (a) under given experimental conditions for closed systems [30–32], whereas they give (best) approximations in open systems [33–35]. Note that in task (b) gradient ﬂows on the set of admissible control amplitudes can be viewed as another instance of ﬂows on Riemannian manifolds. However, this instance will not be pursued here. Moreover, in view of unifying variational approaches to ground-state calculations [36, 37], a common framework of gradient ﬂows on Riemannian manifolds as well as projective techniques on their tangent spaces will be useful. The latter allow for restricting the ﬂows, e.g., from Lie groups G to any closed subgroup H, in particular to any closed subgroup of SU (N ). Consecutive partitionings into diﬀerent subgroups of SU (N ) are exploited in unitary networks addressing largescale quantum systems by neglecting long-range entangling correlations [38–40]. 
Related techniques for truncating the Hilbert space to pertinent parametrized subsets include matrix-product states (MPS) [41, 42] of density-matrix renormalization groups (DMRG) [43, 44], quantum cellular automata with Margolus partitionings [45], projected entangled pair states (PEPS) [46], weighted graph states (WGS) [47], multi-scale entanglement renormalization approaches (MERA) [48], string-bond states (SBS) [49] as well as combined methods [36,50]. It is noteworthy that in many-particle physics gradient flows for diagonalizing Hamiltonians were reintroduced by Wegner [51] independently of Brockett's work [17], and were further elaborated on in the tract by Kehrein [52], again independently of the monograph by Helmke and Moore [22] or the one by Bloch [23]. This may suffice to illustrate the need for making the mathematical methods available to the physics community in a comprehensive way.

Another field of applications of restricting flows to closed subgroups of SU(N) is entanglement of multi-partite quantum systems [53, 54]: we present a connection from gradient flows on the subgroup of local unitary operations to best rank-1 approximations of higher-order tensors as well as a relation to tensor-SVDs. These methods are of importance, e.g., in view of optimization of entanglement witnesses [55]. Applying the same approach to other subgroups of SU(N) with tensor product structure is anticipated to be of use also for classifying multi-partite

systems by partial separability, an example being three-tangles of GHZ-type and W-type states [56–58].

Here the goal is to give a comprehensive account of the foundations of gradient flows (and thus the justification for some recent developments) as well as to present new applications to quantum information. Terms are kept general enough to trigger future developments, since we elucidate the necessary requirements for implementing gradient-based optimization methods in different geometric settings: Riemannian manifolds and submanifolds, Lie groups and homogeneous spaces. We will also show how they can be carried over to homogeneous spaces that no longer form Lie groups themselves. Standard examples are coset spaces G/H, i.e. the quotient of a Lie group G by a closed (yet not necessarily normal) subgroup H. Here naturally reductive homogeneous spaces are of particular interest, and the well-known double-bracket flows will be demonstrated to form a special case precisely of this kind.

A separate paper on open quantum systems [59] sets up a formal approach within the framework of Lie semigroups accounting for Markovian quantum evolutions (or Markovian channels). There we also show the current limits of abstract optimization over reachable sets specifically arising in open systems. The differential geometry of the set of all completely positive, trace-preserving invertible maps is analyzed in the framework of Lie semigroups. In particular, the set of all Kossakowski–Lindblad generators is retrieved as its tangent cone (Lie wedge). Moreover, it shows how the Lie-semigroup structure corresponds to the Markov properties recently studied in terms of divisibility [60]. It illustrates why abstract optimization tasks for open systems are much more intricate than in the case of closed systems, while dynamic optimal control tasks for open systems can be handled completely analogously. It specifies algebraic conditions for time-optimal controls to be the method of choice in open systems. Finally it draws perspectives to a new algorithmic approach for optimization on semigroup orbits combining (a priori) knowledge of the respective Lie wedge with well-known techniques from optimal control.

Outline

To begin with, we recall some basic terminology on dynamical systems and Riemannian geometry. Then the aim is to provide the differential geometric tools for setting up Riemannian optimization methods, primarily focussing on gradient flows, in different scenarios ranging from optimization over the entire unitary group to closed subgroups or homogeneous spaces. Finally we give a number of applications including worked examples. More precisely, the paper is organized as follows: Section 2 draws a general sketch of dynamical systems and flows on manifolds including issues of reachability and controllability. It provides the manifold setting for gradient-flow-based algorithms like steepest ascent, conjugate gradients, Jacobi-type, and Newton methods. A detailed analysis is then given in Sec. 3, where (1) we summarize the general preconditions for gradient flows on smooth manifolds. In particular, we recall the role

of a Riemannian metric that allows for identifying the cotangent bundle $T^*M$ with the tangent bundle $TM$. Though major parts of the foundations can be found scattered in [22,25,61], here we add a comprehensive overview of the interplay between Riemannian geometry, Lie groups, and (reductive) homogeneous spaces. (2) We give examples of gradient flows on compact Lie groups as well as their closed subgroups. (3) In view of further developments, we address gradient flows on reductive homogeneous spaces including specializations to Cartan-like cases as well as naturally reductive homogeneous spaces. In particular, double-bracket flows turn out to be gradient flows on naturally reductive homogeneous spaces. (4) Examples interspersed in the main text illustrate the relevance in a plethora of different settings.

Section 4 is dedicated to specific applications in quantum information and quantum control. (1) We show how gradient flows on the subgroup of local unitaries $SU_{\rm loc}(2^n)$ in $n$ qubits not only provide a valuable tool in witness optimization, but also relate to generalized singular-value decompositions (SVD), namely the tensor-SVD. Here, our gradient flows yield an alternative to common algorithms for best rank-1 approximations of higher-order tensors, e.g., higher-order power methods (HOPM) or higher-order orthogonal iteration (HOOI). (2) Flows on $SU_{\rm loc}(2^n)$ also serve as a convenient tool to decide whether Hamiltonian interactions can be time-reversed solely by local unitary manipulations, thus complementing the algebraic assessment given in [29]. (3) Optimization tasks with additional extrinsic constraints are addressed by tailored gradient flows on the respective subgroups or by auxiliary penalty methods.

By including practical applications and worked examples we illustrate the ample range of problems to which gradient flows on manifolds provide valuable solutions. To this end, we start out with an extended overview of Riemannian optimization techniques on manifolds with particular emphasis on gradient techniques.

2. Overview

2.1. Flows and dynamical systems

In this paper, we treat various optimization tasks for quantum dynamical systems in a common framework, namely by gradient flows on smooth manifolds. Let $M$ denote a smooth manifold, e.g., the unitary orbit of all quantum states relating to an initial state $X_0$. A continuous-time dynamical system or flow is a smooth map

$$\Phi : \mathbb{R} \times M \to M \tag{2.1}$$

such that for all states $X \in M$ and times $t, \tau \in \mathbb{R}$ one has

$$\Phi(0, X) = X, \qquad \Phi(\tau, \Phi(t, X)) = \Phi(t + \tau, X). \tag{2.2}$$

Since these equations hold for any $X \in M$, one gets the operator identity

$$\Phi_\tau \circ \Phi_t = \Phi_{t+\tau} \tag{2.3}$$

for all $t, \tau \in \mathbb{R}$, thus showing the flow acts as a one-parameter group, and for positive times $t, \tau \ge 0$ as a one-parameter semigroup of diffeomorphisms on $M$.

2.2. Gradient flows for optimization

Now, the general idea for optimizing a scalar quality function on a smooth manifold $M$ (which might either arise naturally or from including smooth equality constraints, vide infra) by dynamical systems is as follows: Let $f : M \to \mathbb{R}$ be a smooth quality function on $M$. The differential of $f : M \to \mathbb{R}$ is a mapping $Df : M \to T^*M$ of the manifold to its cotangent bundle $T^*M$, while the gradient vector field is a mapping $\operatorname{grad} f : M \to TM$ to its tangent bundle $TM$. So the gradient of $f$ at $X \in M$, denoted $\operatorname{grad} f(X)$, is the vector in $T_X M$ uniquely determined by

$$Df(X) \cdot \xi = \langle \operatorname{grad} f(X) \,|\, \xi \rangle_X \quad \text{for all } \xi \in T_X M. \tag{2.4}$$

Here, the scalar product $\langle \cdot \,|\, \cdot \rangle_X$ plays a central role: it allows for identifying $T_X^* M$ with $T_X M$. The pair $(M, \langle \cdot \,|\, \cdot \rangle)$ is called a Riemannian manifold with Riemannian metric $\langle \cdot \,|\, \cdot \rangle$. In view of gradient flows, the convenience of Riemannian manifolds lies in the fact that by duality in particular the differential $Df(X)$ of $f$ at $X$ can be identified with a tangent vector of $T_X M$. Then, the flow $\Phi : \mathbb{R} \times M \to M$ determined by the ordinary differential equation

$$\dot{X} = \operatorname{grad} f(X) \tag{2.5}$$

is termed gradient flow. Formally, it is obtained by integrating Eq. (2.5), i.e.

$$\Phi(t, X) = \Phi(t, \Phi(0, X)) = X(t), \tag{2.6}$$

where $X(t)$ denotes the unique solution of Eq. (2.5) with initial value $X(0) = X$. Observe this ensures that $f$ does increase along trajectories $\Phi$ of the system by virtue of following the gradient direction of $f$. In generic problems, gradient flows typically run into some local extremum as sketched in Fig. 2. Therefore a sufficiently large set of independent initial conditions may be needed to provide confidence into numerical results. However, in some pertinent applications, local extrema can be ruled out; prominent examples of this type will be discussed in detail in the context of Brockett's double bracket flow [17, 22].

2.3. Discretized gradient flows

Gradient flows may be envisaged as natural continuous versions of the steepest ascent method for optimizing a real-valued function $f : \mathbb{R}^m \to \mathbb{R}$ by moving along its gradient $\operatorname{grad} f \in \mathbb{R}^m$, i.e.

Steepest ascent method

$$X_{k+1} = X_k + \alpha_k \operatorname{grad} f(X_k), \tag{2.7}$$

where $\alpha_k \ge 0$ is an appropriate step size.
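As a minimal sketch of the steepest ascent iteration (2.7) in the flat case $M = \mathbb{R}^m$, the following Python snippet maximizes a concave quadratic. The function name `steepest_ascent`, the test function and the constant step size are illustrative assumptions and are not taken from the text.

```python
import numpy as np

def steepest_ascent(grad_f, x0, step, n_steps):
    """Iterate X_{k+1} = X_k + alpha * grad f(X_k) with a constant step size alpha."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        x = x + step * grad_f(x)
    return x

# Hypothetical test problem: f(x) = b.x - 0.5 x.Q x with Q positive definite,
# so grad f(x) = b - Q x and the unique maximizer solves Q x = b.
Q = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -1.0])

def grad_f(x):
    return b - Q @ x

# A constant step below 2 / lambda_max(Q) makes the iteration a contraction.
alpha = 1.0 / np.linalg.eigvalsh(Q).max()
x_star = steepest_ascent(grad_f, x0=np.zeros(2), step=alpha, n_steps=500)
```

In this flat setting the update is a plain vector addition, which is the operation that has to be replaced once the iterate must stay on a curved manifold.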

Here, the right-hand side of Eq. (2.7) does make sense, as the manifold $M = \mathbb{R}^m$ coincides with its tangent space $T_X M = \mathbb{R}^m$ containing $\operatorname{grad} f(X)$. Clearly, a generalization is required as soon as $M$ and $T_X M$ are no longer identifiable. This gap is filled by the Riemannian exponential map defined by

$$\exp_X : T_X M \to M, \qquad \xi \mapsto \exp_X(\xi) \tag{2.8}$$

so that $t \mapsto \exp_X(t\xi)$ describes the unique geodesic with initial value $X \in M$ and "initial velocity" $\xi \in T_X M$ as illustrated in Fig. 1.

If the manifold $M$ carries the structure of a matrix Lie group $G$, we may identify the tangent space element $\xi \in T_X G$ with $\Omega X$, where $\Omega$ is itself an element of the Lie algebra $\mathfrak{g}$, i.e. the tangent space at the unity element, $\mathfrak{g} = T_1 G$. Moreover, if the Lie-group structure matches with the Riemannian framework in the sense that the metric is bi-invariant (as will be explained in more detail later), then the Riemannian exponential of $\xi = \Omega X$ can readily be calculated explicitly. This is done in three steps by (i) right translation with the inverse group element $X^{-1}$, (ii) taking the conventional exponential map of the Lie algebra element $\Omega$, (iii) right translation with the group element $X$, as summarized in the following diagram

$$\begin{array}{ccc}
\xi = \Omega X \in T_X G & \xrightarrow{\;\exp_X\;} & e^{\Omega} X \in G \\
\big\downarrow {\scriptstyle R_{X^{-1}}} & & \big\uparrow {\scriptstyle R_X} \\
\Omega \in \mathfrak{g} & \xrightarrow{\;\exp\;} & e^{\Omega} \in G.
\end{array} \tag{2.9}$$
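The three steps of diagram (2.9) can be sketched in Python for the unitary group, assuming a bi-invariant metric as stated above. The helper names `riemannian_exp` and `unitarity_defect` and the particular matrices are hypothetical choices made for this illustration; the matrix exponential is SciPy's `expm`.

```python
import numpy as np
from scipy.linalg import expm

def riemannian_exp(X, xi):
    """exp_X(xi) = exp(xi X^{-1}) X for a tangent vector xi = Omega X at X."""
    omega = xi @ np.linalg.inv(X)      # (i) right translation by X^{-1}
    return expm(omega) @ X             # (ii) matrix exponential, (iii) right translation by X

def unitarity_defect(U):
    """Frobenius norm of U^dagger U - 1, zero exactly on the unitary group."""
    return np.linalg.norm(U.conj().T @ U - np.eye(U.shape[0]))

# Hypothetical example on U(2): X is unitary, Omega is skew-Hermitian.
X = expm(1j * np.array([[0.3, 0.1], [0.1, -0.2]]))
Omega = 1j * np.array([[0.0, 1.0], [1.0, 0.0]])
xi = Omega @ X

t = 0.5
on_group = riemannian_exp(X, t * xi)   # geodesic step, stays on the group
euclidean = X + t * xi                 # straight-line step in the embedding space
```

The geodesic step keeps the iterate unitary up to rounding, whereas the straight-line step leaves the group and would need a subsequent projection or penalty term.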

Fig. 1. The Riemannian exponential $\exp_X$ is a smooth map taking the tangent vector $t\xi \in T_X M$ at $X \in M$ to $\exp_X(t\xi) \in M$. By varying $t \in \mathbb{R}$, it yields the unique geodesic with initial value $X \in M$ and "initial velocity" $\xi \in T_X M$. (Color online)

Next, the gradient system (2.5) will be integrated (to sufficient approximation) by a discrete scheme that can be seen as an intrinsic Euler step method. This can be performed by way of the Riemannian exponential map, which is to say straight line segments used in the standard method are replaced by geodesics on $M$. This leads to the following integration scheme which is well-defined on any Riemannian manifold $M$.

(1) Riemannian gradient method

$$X_{k+1} := \exp_{X_k}\big(\alpha_k \operatorname{grad} f(X_k)\big) \tag{2.10}$$

where $\alpha_k \ge 0$ denotes a step size appropriately selected to guarantee convergence, cf. Sec. 3. For matrix Lie groups $G$ with bi-invariant metric, Eq. (2.10) simplifies to

(1′) Gradient method on a Lie group

$$X_{k+1} := \exp\big(\alpha_k \operatorname{grad} f(X_k)\, X_k^{-1}\big)\, X_k, \tag{2.11}$$

where $\exp : \mathfrak{g} \to G$ denotes the conventional exponential map. In either case, the iterative procedure can be pictured as follows: at each point $X_k \in M$ one evaluates $\operatorname{grad} f(X_k)$ in the tangent space $T_{X_k} M$. Then one moves via the Riemannian exponential map in direction $\operatorname{grad} f(X_k)$ to the next point $X_{k+1}$ on the manifold so that the quality function $f$ improves, $f(X_{k+1}) \ge f(X_k)$, as shown in Fig. 2.

Higher-order Riemannian optimization methods

The steepest ascent approach just outlined is most basic for addressing abstract optimization tasks intrinsically. Other intrinsic iterative schemes exploiting the underlying Riemannian geometry like conjugate gradients, Jacobi-type methods or Newton's method can be obtained similarly. For an introduction to these more advanced topics beyond the subsequent sketch see, e.g., [20, 62, 63].
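The Lie-group iteration (2.11) can be illustrated by a short Python sketch on $U(3)$. Everything specific here is an assumption made for the example: the quality function $f(U) = \operatorname{tr}(C\,U A U^\dagger)$ with Hermitian $A$ and $C$, the metric $\langle \Omega U \,|\, \Xi U \rangle = \operatorname{tr}(\Omega^\dagger \Xi)$, the resulting gradient $\operatorname{grad} f(U) = [C, U A U^\dagger]\,U$ (derived for this sketch, not quoted from the text), and the constant step size.

```python
import numpy as np
from scipy.linalg import expm

def lie_group_gradient_ascent(A, C, U0, n_steps):
    """Maximize f(U) = tr(C U A U^dagger) over U(N) via U_{k+1} = exp(alpha G_k) U_k."""
    # Constant step size, chosen small enough that f does not decrease in this example.
    alpha = 1.0 / (4.0 * np.linalg.norm(A) * np.linalg.norm(C))
    U = U0
    values = []
    for _ in range(n_steps):
        X = U @ A @ U.conj().T
        values.append(np.trace(C @ X).real)
        G = C @ X - X @ C              # grad f(U) U^{-1} = [C, X], skew-Hermitian
        U = expm(alpha * G) @ U        # Eq. (2.11): step along the geodesic
    return U, np.array(values)

A = np.diag([1.0, 2.0, 3.0])
C = np.diag([1.0, 2.0, 3.0])
rng = np.random.default_rng(0)
Z = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
U0, _ = np.linalg.qr(Z)               # random unitary starting point

U, values = lie_group_gradient_ascent(A, C, U0, n_steps=3000)
```

The recorded values of $f$ should be non-decreasing, and for generic starting points they approach the largest attainable value, here the sum of products of equally ordered eigenvalues of $A$ and $C$. The iterate remains unitary without any projection step, which is the intrinsic feature described above.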

Fig. 2. Abstract optimization task: The quality function $f : M \to \mathbb{R}$, $X \mapsto f(X)$ (top trace) is driven into a (local) maximum by following the gradient flow $\dot{X} = \operatorname{grad} f(X)$ on the manifold $M$ (lower trace). (Color online)

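(2) Conjugate gradient method

$$\begin{aligned}
X_k^{l+1} &:= \operatorname*{argmax}_{t \ge 0} f\big(\exp_{X_k^l}(t\,\Omega_k^l)\big), \\
X_{k+1}^0 &:= X_k^n, \\
\Omega_k^l &:= \begin{cases}
\operatorname{grad} f(X_k^l) & \text{for } l = 0, \\
\operatorname{grad} f(X_k^l) + \alpha_k^l\, \Pi_{X_k^{l-1}, X_k^l}\big(\Omega_k^{l-1}\big) & \text{for } l = 1, \dots, n-1,
\end{cases}
\end{aligned} \tag{2.12}$$

where $\alpha_k^l$ is a real parameter and $\Pi_{X,Y}(\Omega)$ denotes the parallel transport of $\Omega$ along the geodesic from $X$ to $Y$.

(3) Jacobi-type method

$$\begin{aligned}
X_k^{l+1} &:= \operatorname*{argmax}_{t \in \mathbb{R}} f\big(\exp_{X_k^l}(t\,\Omega_l(X_k^l))\big), \\
X_{k+1}^0 &:= X_k^s,
\end{aligned} \tag{2.13}$$

where $\Omega_0, \Omega_1, \dots, \Omega_{s-1}$ are vector fields such that $\Omega_0(X), \Omega_1(X), \dots, \Omega_{s-1}(X)$ span $T_X M$ for all $X \in M$. The integer $s$ is called sweep length.

(4) Newton's method

$$X_{k+1} := \exp_{X_k}\big(-(\operatorname{Hess} f(X_k))^{-1} \operatorname{grad} f(X_k)\big), \tag{2.14}$$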

where $\operatorname{Hess} f(X)$ denotes the Hessian of $f$ at $X$. Since inverting the Hessian or, more precisely, solving the equation $\operatorname{Hess} f(X_k) Z = \operatorname{grad} f(X_k)$ is numerically costly, in higher-dimensional problems it is customary to use approximate methods with partial updates, e.g., the limited-memory variant of the Broyden–Fletcher–Goldfarb–Shanno algorithm (LBFGS) [64–66].

2.4. Reachability and controllability

Up to now we have addressed optimization tasks over abstract state spaces forming Riemannian manifolds $M$, hence the term abstract optimization task (AOT). However, often there may be restrictions of the original manifold $M$ to a (proper) submanifold $N \subsetneq M$. In this paragraph, we sketch how restrictions arising from an underlying control system can be accounted for. To this end, some general remarks on reachable sets and controllability are in order. The abstract optimization task (AOT) then amounts to optimizing over reachable sets, which is the topic we focus on here. In contrast, the dynamic optimal control task (OCT) would give explicit steerings, which will not be discussed here. Instead, we refer the reader to [67, 68], or, for the quantum case, to [69,70] and for numerical methods and applications to [30,71–74].

For simplicity, let $(\Sigma)$ denote a smooth control system on the state manifold $M$, i.e. a family of (ordinary) differential equations

$$(\Sigma) \qquad \dot{X} = F(X, u), \quad u \in U \subset \mathbb{R}^m \tag{2.15}$$

with control parameters $u \in U$ and smooth vector fields $F_u := F(\cdot, u)$ on $M$. While the vector fields $F_u$ are assumed to be time-independent, the controls are allowed

to vary in time. For convenience, the resulting control function $t \mapsto u(t) \in U$ is denoted again by $u$. Moreover, the set of all admissible control functions is supposed to contain at least all piecewise constant ones. For an admissible control function $u$, we refer to $X(t, X_0, u)$ as the unique solution of (2.15) with initial value $X_0$. Thereby the reachable set of $X_0$ is defined as

$$\operatorname{Reach}(X_0) := \bigcup_{0 \le T} \operatorname{Reach}(X_0, T). \tag{2.16}$$

Here $\operatorname{Reach}(X_0, T)$ denotes the set of all states which can be reached in time $T$, i.e.

$$\operatorname{Reach}(X_0, T) := \{ X(T, X_0, u) \in M \mid u \in U \}. \tag{2.17}$$
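To make the definition (2.17) concrete, the following Python sketch samples points of $\operatorname{Reach}(1, T)$ for piecewise constant controls. It assumes, purely for illustration, a single-qubit right-invariant system $\dot{X} = (A_0 + u A_1) X$ on $SU(2)$ with one control; the function name `propagate`, the generators, the control bounds and the number of time slices are hypothetical choices.

```python
import numpy as np
from scipy.linalg import expm

def propagate(A0, A1, controls, dt, X0):
    """Solve X' = (A0 + u(t) A1) X for a piecewise constant control, one value per slice."""
    X = X0
    for u in controls:
        X = expm(dt * (A0 + u * A1)) @ X
    return X

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
A0, A1 = -0.5j * sz, -0.5j * sx        # drift and control direction in su(2)

T, n = 1.0, 10
rng = np.random.default_rng(1)
samples = [propagate(A0, A1, rng.uniform(-5, 5, size=n), T / n, np.eye(2))
           for _ in range(100)]        # 100 sampled points of Reach(1, T)
free = propagate(A0, A1, np.zeros(n), T / n, np.eye(2))   # uncontrolled evolution
```

Each sample is one endpoint $X(T, 1, u)$. Random sampling of this kind only probes the reachable set and gives no guarantee of covering it.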

The system $(\Sigma)$ is said to be controllable if $\operatorname{Reach}(X_0) = M$ for all $X_0 \in M$, i.e. if for each pair $(X_0, Y_0)$ of states there exists an admissible control $u$ and a time $T \ge 0$ such that $X(T, X_0, u) = Y_0$. In general, it is hard to decide whether a given system $(\Sigma)$ is controllable or not. However, for dynamics evolving on some Lie group $G$, the situation is much easier. Let $(\Sigma_G)$ be a bilinear or, equivalently, a right invariant, control-affine system on a matrix Lie group $G$ with Lie algebra $\mathfrak{g}$, i.e.

$$(\Sigma_G) \qquad \dot{X} = \Big(A_0 + \sum_{j=1}^{m} u_j A_j\Big) X, \quad u \in U \subset \mathbb{R}^m \tag{2.18}$$

with drift $A_0 \in \mathfrak{g}$ and control directions $A_j \in \mathfrak{g}$. For compact Lie groups $G$, a simple algebraic test for controllability is known: If the system Lie algebra

$$\mathfrak{s} := \langle A_0, \dots, A_m \rangle_{\rm Lie} \tag{2.19}$$

generated by $A_0, \dots, A_m$ via nested commutators coincides with $\mathfrak{g}$, then the corresponding system $(\Sigma_G)$ is controllable, cf. [75, 76]. In particular, there exists a (minimal) finite time $T_* > 0$ such that the entire group $G$ can be reached from any initial point $X_0 \in G$ within this time, i.e.

$$G = \operatorname{Reach}(X_0, \le T_*) := \bigcup_{0 \le T \le T_*} \operatorname{Reach}(X_0, T) \tag{2.20}$$

for all $X_0 \in G$. Estimates on $T_*$, which lead to upper and lower bounds for the optimal time of state-to-state transfer in controlled quantum systems, can be found in [77]. If $\mathfrak{s} \subsetneq \mathfrak{g}$, but $\mathfrak{s}$ generates a closed subgroup of $G$, one can still conclude what the reachable set of (2.18) looks like:

$$\operatorname{Reach}(X_0) = S \cdot X_0, \tag{2.21}$$

where $S$ denotes the closed subgroup generated by $\mathfrak{s}$. In contrast, for non-compact groups $G$, which are indispensable for describing open quantum systems, the situation gets more involved. Here, $\mathfrak{s} = \mathfrak{g}$ implies only accessibility of $(\Sigma_G)$, i.e. that all reachable sets $\operatorname{Reach}(X_0)$ have non-empty interior.
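The algebraic test based on the system Lie algebra (2.19) can be sketched numerically: one closes the span of the generators under commutators and counts real dimensions. The function `lie_closure_dim`, its tolerance and the two-qubit example below are assumptions made for this illustration and are not part of the original text.

```python
import itertools
import numpy as np

def lie_closure_dim(generators, tol=1e-10):
    """Real dimension of the Lie algebra generated by the given matrices."""
    ortho = []                                   # orthonormal basis, stored as matrices

    def inner(a, b):
        return np.real(np.vdot(a, b))            # real inner product Re tr(a^dagger b)

    def try_add(m):
        r = m.astype(complex)
        for q in ortho:                          # modified Gram-Schmidt
            r = r - inner(q, r) * q
        n = np.sqrt(inner(r, r))
        if n < tol:
            return False
        ortho.append(r / n)
        return True

    for g in generators:
        try_add(g)
    grew = True
    while grew:                                  # repeat until no commutator adds a direction
        grew = False
        for a, b in itertools.combinations(list(ortho), 2):
            if try_add(a @ b - b @ a):
                grew = True
    return len(ortho)

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2)

local = [1j * np.kron(s, I2) for s in (sx, sy)] + [1j * np.kron(I2, s) for s in (sx, sy)]
ising = 1j * np.kron(sz, sz)

dim_local = lie_closure_dim(local)               # local controls only
dim_full = lie_closure_dim(local + [ising])      # local controls plus an Ising coupling
```

With local controls alone the closure has dimension 6, that of $\mathfrak{su}(2) \oplus \mathfrak{su}(2)$, so only local unitaries can be reached. Adding the Ising coupling raises it to 15, the dimension of $\mathfrak{su}(4)$, which by the criterion above gives controllability on the compact group $SU(4)$.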

This follows from a more general result on smooth non-linear control systems, which says that the so-called Lie algebra rank condition (LARC)

$$\{ F(X) \mid F \in \langle F_u \mid u \in U \rangle_{\rm Lie} \} = T_X M \tag{2.22}$$

for all $X \in M$ implies accessibility. Here, $\langle F_u \mid u \in U \rangle_{\rm Lie}$ denotes the Lie subalgebra of vector fields generated by $F_u$, $u \in U$ via Lie bracket operation, cf. [67]. Note that for right invariant vector fields on $G$, the Lie bracket coincides (up to sign) with the commutator such that (2.22) boils down to $\mathfrak{s} = \mathfrak{g}$. Moreover, by exploiting the identity

$$\operatorname{Reach}(1, T_1) \cdot \operatorname{Reach}(1, T_2) = \operatorname{Reach}(1, T_1 + T_2), \tag{2.23}$$

one can show that $\operatorname{Reach}(1)$ is always a Lie subsemigroup of $G$. A subsemigroup is a subset $S \subset G$ which contains the unity and is closed under multiplication, i.e. $1 \in S$ and $S \cdot S \subseteq S$. However, the geometry of subsemigroups is rather subtle compared to Lie subgroups and therefore at present not amenable to intrinsic optimization methods, as shown in more detail in a paper dwelling on open quantum systems in terms of Lie semigroups [59].

2.5. Settings of interest

In terms of reachability, there are different scenarios that structure the subsequent line of thought: we start out with fully controllable or operator controllable quantum systems [28, 75, 76, 78–81] represented as spin- or pseudo-spin systems. Then, neglecting decoherence, from any initial state represented by its density operator $A$, the entire unitary orbit $O(A) := \{ U A U^{-1} \mid U \text{ unitary} \}$ can be reached [81]. In systems of $n$ qubits (e.g., spin-$\tfrac{1}{2}$ particles), this is the case under the following mild conditions [82]: (1) the qubits have to be inequivalent, i.e. distinguishable and selectively addressable, and (2) they have to be pairwise coupled (e.g., by Ising or Heisenberg-type interactions), where the coupling topology may take the form of any graph as long as it is connected, (3) the Hamiltonians must not show any symmetry (so the system algebra has to be given in irreducible representation), and finally (4) the Hamiltonians must not (simultaneously) allow for an orthogonal or a symplectic representation.

In other instances not the entire (unitary) group, but just a subgroup $K$ can be reached. This is the case if the system Lie algebra is a proper subalgebra of the full unitary algebra, so $\mathfrak{k} \subsetneq \mathfrak{su}(N)$ or equivalently $\exp \mathfrak{k} = K \subsetneq SU(N)$. Such restrictions may arise from symmetry constraints, which can be conveniently characterized by the centralizer of $\mathfrak{k}$ in $\mathfrak{su}(N)$, i.e. by $\mathfrak{k}' := \{ s \in \mathfrak{su}(N) \mid [s, k] = 0 \ \forall\, k \in \mathfrak{k} \}$, see [82,83]. Otherwise, the system itself can be fully controllable, but the focus of interest may be reduced: e.g., the subgroup $K = SU_{\rm loc}(2^n) := SU(2) \otimes SU(2) \otimes \cdots \otimes SU(2)$ of (possibly fast) local actions on each qubit is of interest to study local reachability, or whether an effective multi-qubit interaction Hamiltonian is locally reversible in

the sense of Hahn's spin echo [29]. Or, one may ask what the Euclidean distance is from some pure state to the nearest point on the local unitary orbit of a pure product state. This may be useful when optimizing entanglement witnesses [55]. Likewise, one may address partitionings of the entire unitary group other than the finest one, e.g., $K = SU(2^{n_1}) \otimes \cdots \otimes SU(2^{n_j}) \otimes \cdots \otimes SU(2^{n_r}) \subset SU(2^n)$, where $\sum_{j=1}^{r} n_j = n$.

Another type of reduction arises not by restriction to a subgroup $H$, but by the fact that the quality function of interest $f$ is equivariant, i.e. constant on cosets $HG$. Consider, for instance, a fully controllable system where $f$ is equivariant with respect to the closed subgroup $H \subset G$. Then, it may be favorable to transfer the optimization problem from the original Lie group $G$ to the homogeneous space $G/H$.

3. Theory: Gradient Flows

Gradient systems are a standard tool of Riemannian optimization for maximizing smooth quality functions on a manifold $M$. Here the manifold structure of $M$ arises either naturally from the problem itself or from smooth equality constraints imposed on a previously unconstrained problem. Note that in general inequality constraints would entail manifolds with a boundary, and thus are a much more subtle issue not to be developed any further here.

The case $M = \mathbb{R}^m$, sometimes referred to as the unconstrained case, is well-known and can be found in many texts on ordinary differential equations or nonlinear programming, cf. [84–87]. However, gradient systems on abstract Riemannian manifolds provide a rather new approach to constrained optimization problems. Although the resulting numerical algorithms are in general only linearly convergent, their global behavior is often much better than the global behavior of locally quadratic methods.

Textbooks combining the different areas of Riemannian geometry, gradient systems and constrained optimization are quite rare. The best choices to our knowledge are [22,61]. For further reading we also suggest the papers [19–21,62]. Nevertheless, most of the material which is necessary to understand the intrinsic optimization approach applied in Sec. 4 is scattered in many different references. For the reader's convenience, we therefore review the basic ideas on these topics. First, we discuss the general setting on Riemannian manifolds, then we proceed with Lie groups and finally summarize some more advanced results on homogeneous spaces. For standard definitions and terminology from Riemannian geometry we refer to any modern text on this subject such as [88–91].

3.1. Gradient flows on Riemannian manifolds

In the following, let $M$ denote a finite dimensional smooth manifold with tangent and cotangent bundles $TM$ and $T^*M$, respectively. Moreover, let $M$ be equipped with a Riemannian metric $\langle \cdot \,|\, \cdot \rangle$, i.e. with a scalar product $\langle \cdot \,|\, \cdot \rangle_X$ on each tangent space $T_X M$ varying smoothly with $X \in M$. More precisely, $\langle \cdot \,|\, \cdot \rangle$ has to be a


smooth, positive definite section in the bundle of all symmetric bilinear forms over $M$. Such sections always exist for finite dimensional smooth manifolds, cf. [89, 92]. The pair $(M, \langle\cdot\mid\cdot\rangle)$ is called a Riemannian manifold.

Let $f : M \to \mathbb{R}$ be a smooth quality function on $M$ with differential $Df : M \to T^*M$. Then the gradient of $f$ at $X \in M$, denoted by $\operatorname{grad} f(X)$, is the vector in $T_X M$ uniquely determined by the equation

$$Df(X) \cdot \xi = \langle \operatorname{grad} f(X) \mid \xi \rangle_X \tag{3.1}$$

for all $\xi \in T_X M$. Equation (3.1) naturally defines a vector field on $M$ via

$$\operatorname{grad} f : M \to TM, \qquad X \mapsto \operatorname{grad} f(X) \tag{3.2}$$

called the gradient vector field of $f$. The corresponding ordinary differential equation

$$\dot{X} = \operatorname{grad} f(X) \tag{3.3}$$

and its flow are referred to as the gradient system and the gradient flow of $f$, respectively. Obviously, the critical points of $f : M \to \mathbb{R}$ coincide with the equilibria of the gradient flow. Moreover, the quality function $f$ is monotonically increasing along solutions $X(t)$ of (3.3), i.e. the real-valued function $t \mapsto f(X(t))$ is monotonically increasing in $t$, as

$$\frac{d}{dt} f(X(t)) = \langle \operatorname{grad} f(X(t)) \mid \dot{X}(t) \rangle_{X(t)} = \| \operatorname{grad} f(X(t)) \|^2_{X(t)} \ge 0.$$

Here $\|\cdot\|_X$ denotes the norm on $T_X M$ induced by $\langle\cdot\mid\cdot\rangle_X$, i.e. $\|\xi\|_X := \sqrt{\langle \xi \mid \xi \rangle_X}$ for all $\xi \in T_X M$.

3.1.1. Convergence of gradient flows

Recall that the asymptotic behavior (for $t \to +\infty$) of a solution of (3.3) is characterized by its ω-limit set

$$\omega(X_0) := \bigcap_{0 \le t < t^+(X_0)} \overline{\{ X(\tau, X_0) \mid t \le \tau < t^+(X_0) \}},$$

where $\overline{\{\cdots\}}$ denotes the closure of the set $\{\cdots\}$ and $X(t, X_0)$ the unique solution of (3.3) with initial value $X(0) = X_0$ and positive escape time $t^+(X_0) > 0$. The following result gives a sufficient condition for solutions of Eq. (3.3) to converge to the set of critical points of $f$.

Proposition 3.1. If $f$ has compact superlevel sets, i.e. if the sets $\{X \in M \mid f(X) \ge C\}$ are compact for all $C \in \mathbb{R}$, then any solution of Eq. (3.3) exists for all $t \ge 0$ and its ω-limit set is a non-empty, compact and connected subset of the set of critical points of $f$.

Proof. Since $f$ is monotonically increasing along solutions of Eq. (3.3), the compact sets $\{X \in M \mid f(X) \ge C\}$ are positively invariant, i.e. invariant for $t \ge 0$ under


the gradient flow of Eq. (3.3). Thus the assertion follows from standard results on ω-limit sets and Lyapunov theory, cf. [22, 85].

Although Proposition 3.1 guarantees that $\omega(X_0)$ is contained in the set of critical points of $f$, this does not imply convergence to a critical point. Indeed, there are smooth gradient systems which exhibit solutions converging only to the set of critical points, cf. [93]. The next two results provide sufficient conditions for convergence to a single critical point under different settings. In particular, Theorem 3.1 yields a powerful tool for analyzing real analytic gradient systems.

Corollary 3.1. If $f$ has compact superlevel sets and if all critical points are isolated, then any solution of (3.3) converges to a critical point of $f$ for $t \to +\infty$.

Proof. This is an immediate consequence of Proposition 3.1, since a connected set consisting of isolated points is a single point.

Theorem 3.1 ([94]). If $(M, \langle\cdot\mid\cdot\rangle)$ and $f$ are real analytic, then all non-empty ω-limit sets $\omega(X_0)$ of Eq. (3.3) are singletons, i.e. $\omega(X_0) \ne \emptyset$ implies that $X(t, X_0)$ converges to a single critical point $X^*$ of $f$ for $t \to +\infty$.

Proof. The main argument is based on Łojasiewicz’s inequality, which says that in a neighborhood of $X^*$ an estimate of the type $|f(X)|^p \le C \, \|\operatorname{grad} f(X)\|$ holds for some $p < 1$ and $C > 0$ (with $f$ normalized such that $f(X^*) = 0$). A complete proof can be found in [94, 95].

3.1.2. Restriction to submanifolds

Now, consider the restriction of $f$ to a smooth submanifold $N \subset M$. Obviously, the Riemannian metric $\langle\cdot\mid\cdot\rangle$ on $M$ restricts to a Riemannian structure on $N$. Thus $(N, \langle\cdot\mid\cdot\rangle|_{TN})$ constitutes a Riemannian manifold in a canonical way. Moreover, the equality $D(f|_N)(X) = Df(X)|_{T_X N}$ immediately implies that the gradient of the restriction $f|_N$ at $X \in N$ is given by the orthogonal projection of $\operatorname{grad} f(X)$ onto $T_X N$, i.e.

$$\operatorname{grad} f|_N(X) = P_X(\operatorname{grad} f(X)), \tag{3.4}$$

where $P_X$ denotes the orthogonal projector onto $T_X N$. Hence the gradient system of $f|_N$ on an arbitrary submanifold $N$ is well-defined and reads

$$\dot{X} = P_X(\operatorname{grad} f(X)). \tag{3.5}$$
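As an illustration of Eqs. (3.4) and (3.5), the following minimal sketch integrates the projected gradient system for the quadratic form $f(x) = x^\top A x$ restricted to the unit sphere $N = S^2 \subset \mathbb{R}^3$ with the Euclidean metric. The matrix `A`, the step size and the explicit Euler step followed by renormalization are assumptions chosen for this example only; they are a crude stand-in for the exact flow and not a scheme taken from the text.

```python
import math

# Hypothetical symmetric test matrix; its eigenvalues are 3 - sqrt(3), 3, 3 + sqrt(3).
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]

def matvec(m, x):
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def f(x):
    """Quality function f(x) = x^T A x on R^3."""
    return dot(x, matvec(A, x))

def projected_gradient(x):
    """P_x(grad f(x)) with P_x(v) = v - <x, v> x, cf. Eq. (3.4)."""
    g = [2.0 * c for c in matvec(A, x)]      # Euclidean gradient of f
    s = dot(x, g)
    return [gi - s * xi for gi, xi in zip(g, x)]

def integrate(x0, step=0.05, n_steps=500):
    """Explicit Euler steps for Eq. (3.5), renormalized to stay on the sphere."""
    x = list(x0)
    values = [f(x)]
    for _ in range(n_steps):
        v = projected_gradient(x)
        y = [xi + step * vi for xi, vi in zip(x, v)]
        norm = math.sqrt(dot(y, y))
        x = [yi / norm for yi in y]
        values.append(f(x))
    return x, values

x_final, values = integrate([1.0, 0.0, 0.0])
print(values[0], values[-1])
```

In this sketch the recorded values of $f$ increase along the iteration, in line with the monotonicity of $f$ along solutions of (3.3), and approach the largest eigenvalue $3 + \sqrt{3}$ of `A`, which is the maximum of $f$ on the sphere.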

3.1.3. Analyzing critical points by the Hessian

Next, we address the problem of how to define and compute the Hessian of $f$, as its knowledge is essential for a deeper insight into (3.3). For instance, the stability of critical points is determined by its eigenvalues, and the computation of explicit


discretization schemes, preserving the convergence behavior of (3.3), can be based on it, cf. [17, 22]. At critical points $X^* \in M$ of $f$, the Hessian is given by the symmetric bilinear form $\operatorname{Hess} f(X^*) : T_{X^*}M \times T_{X^*}M \to \mathbb{R}$,

$$\operatorname{Hess} f(X^*)(\xi, \eta) := \big(D\varphi(X^*)\xi\big)^{\top} \operatorname{Hess}(f \circ \varphi^{-1})(\varphi(X^*)) \, D\varphi(X^*)\eta, \tag{3.6}$$

where $\varphi$ is any chart around $X^*$ and $\operatorname{Hess}(f \circ \varphi^{-1})$ denotes the ordinary Hesse matrix of $f \circ \varphi^{-1}$. It is straightforward to show that (3.6) is independent of $\varphi$. Equivalently, $\operatorname{Hess} f(X^*)$ is uniquely determined by

$$\operatorname{Hess} f(X^*)(\xi, \xi) := \left. \frac{d^2 (f \circ \alpha)}{dt^2} \right|_{t=0}, \tag{3.7}$$

where $\alpha$ is any smooth curve with $X^* = \alpha(0)$ and $\dot{\alpha}(0) = \xi$. The remaining values of $\operatorname{Hess} f(X^*)$ can be obtained by a standard polarization argument^b, i.e. via the formula

$$2 \operatorname{Hess} f(X^*)(\xi, \eta) = \operatorname{Hess} f(X^*)(\xi + \eta, \xi + \eta) - \operatorname{Hess} f(X^*)(\xi, \xi) - \operatorname{Hess} f(X^*)(\eta, \eta). \tag{3.8}$$
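The two steps above, second derivatives along curves as in (3.7) followed by polarization as in (3.8), can be sketched numerically. The example below is an assumption-laden illustration: it takes $f(x) = 3x_1^2 + 2x_2^2 + x_3^2$ on the unit sphere, the critical point $X^* = (1,0,0)$, great-circle arcs as the curves $\alpha$, and a central finite difference with a hypothetical step `h` in place of the exact derivative.

```python
import math

A_DIAG = (3.0, 2.0, 1.0)          # hypothetical quadratic form f(x) = sum a_i x_i^2
X_STAR = (1.0, 0.0, 0.0)          # critical point of f restricted to the unit sphere

def f(x):
    return sum(a * xi * xi for a, xi in zip(A_DIAG, x))

def curve(xi, t):
    """A smooth curve on the sphere with alpha(0) = X_STAR and alpha'(0) = xi.

    xi is assumed tangent at X_STAR, i.e. orthogonal to X_STAR.
    """
    norm = math.sqrt(sum(c * c for c in xi))
    if norm == 0.0:
        return X_STAR
    return tuple(math.cos(t * norm) * p + math.sin(t * norm) * c / norm
                 for p, c in zip(X_STAR, xi))

def hess_diag(xi, h=1e-3):
    """Hess f(X*)(xi, xi) via Eq. (3.7), using a central second difference."""
    def g(t):
        return f(curve(xi, t))
    return (g(h) - 2.0 * g(0.0) + g(-h)) / (h * h)

def hess(xi, eta):
    """Hess f(X*)(xi, eta) via the polarization formula (3.8)."""
    s = tuple(a + b for a, b in zip(xi, eta))
    return 0.5 * (hess_diag(s) - hess_diag(xi) - hess_diag(eta))

e2, e3 = (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
print(hess(e2, e2), hess(e3, e3), hess(e2, e3))
```

For this particular $f$ a direct calculation gives $\operatorname{Hess} f(X^*)(\xi,\eta) = 2(\xi^\top A \eta - 3\,\xi^\top\eta)$ on the tangent space, so the printed values should be close to $-2$, $-4$ and $0$, up to the finite-difference error.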

However, the previous definition does not apply to regular points of $f$. In general, one has to establish the concept of geodesics, cf. Remark 3.1. More precisely, the Hessian of $f$ at an arbitrary point $X \in M$ is given by

$$\operatorname{Hess} f(X)(\xi, \xi) := \left. \frac{d^2 (f \circ \gamma)}{dt^2} \right|_{t=0}, \tag{3.9}$$

where $\gamma$ is the unique geodesic with $X = \gamma(0)$ and $\dot{\gamma}(0) = \xi$. Again, the remaining values can be computed by (3.8). As usual, we associate to $\operatorname{Hess} f(X)$ a unique selfadjoint linear operator $\mathbf{Hess}\, f(X) : T_X M \to T_X M$ such that

$$\langle \xi \mid \mathbf{Hess}\, f(X)\, \eta \rangle_X = \operatorname{Hess} f(X)(\xi, \eta) \tag{3.10}$$

holds for all $\xi, \eta \in T_X M$. It is called the Hessian operator of $f$ at $X \in M$.

Remark 3.1. In modern textbooks on differential geometry, the concept of geodesics as well as the notion of (higher) covariant derivatives are defined via linear connections, cf. [90, 96]. Therefore, Eq. (3.9) is usually derived as a consequence and not introduced as a definition of the Hessian. For Riemannian manifolds, however, it is also possible to establish (Riemannian) geodesics as curves of minimal arc length. Both approaches coincide if one picks the so-called Riemannian or Levi–Civita connection as linear connection on $M$.

^b More precisely, the polarization procedure is defined as follows: Let $\mathcal{H}$ be a real Hilbert space and $\beta : \mathcal{H} \to \mathbb{R}$ a bounded quadratic form, i.e. there exists a bounded symmetric bilinear form $B : \mathcal{H} \times \mathcal{H} \to \mathbb{R}$ such that $\beta(v) = B(v, v)$ for all $v \in \mathcal{H}$. By the symmetry and bilinearity of $B$ we have $B(v + w, v + w) = B(v, v) + B(w, w) + 2B(v, w)$ and hence $B(v, w) = \tfrac{1}{2}\big(B(v + w, v + w) - B(v, v) - B(w, w)\big) = \tfrac{1}{2}\big(\beta(v + w) - \beta(v) - \beta(w)\big)$ for all $v, w \in \mathcal{H}$. Therefore, $B$ is uniquely determined by the quadratic form $\beta$, and the latter identity is known as the law of polarization.


Unfortunately, the computation of geodesics is in general a non-trivial problem, as one has to solve (in local charts) a second order differential equation. However, on compact Lie groups their calculation is rather simple, as we will see at the end of Sec. 3.2. The above concepts yield the following generalization of a familiar result from elementary calculus for characterizing local extreme points.

Theorem 3.2. Let $M$ be a Riemannian manifold and let $X^*$ be a critical point of the quality function $f : M \to \mathbb{R}$. If the Hessian form $\operatorname{Hess} f(X^*)$ or, equivalently, the Hessian operator $\mathbf{Hess}\, f(X^*)$ is negative definite, then $X^*$ is a strict local maximum of $f$.

Proof. In local coordinates the result follows straightforwardly from Eq. (3.6).

In general, (asymptotic) stability of an equilibrium $X^* \in M$ of (3.3) may depend on the Riemannian metric $\langle\cdot\mid\cdot\rangle$. However, the property of being a strict local maximum or an isolated critical point of a smooth function $f$ obviously does not depend on the choice of a Riemannian metric. Therefore, the following result shows that in fact certain (asymptotically) stable equilibria $X^* \in M$ of (3.3) are independent of the Riemannian metric.

Theorem 3.3. (a) If $X^* \in M$ is a strict local maximum of $f$, then $X^*$ is a stable equilibrium of (3.3). In particular, for any neighborhood $U$ of $X^*$ there exists a neighborhood $V$ of $X^*$ such that the ω-limit sets $\omega(X_0)$ are non-empty and contained in $U$ for all $X_0 \in V$.
(b) If $X^* \in M$ is a strict local maximum and an isolated critical point of $f$, then $X^*$ is an asymptotically stable equilibrium of (3.3). In particular, there is a neighborhood $V$ of $X^*$ such that $\omega(X_0) = \{X^*\}$ for all $X_0 \in V$, i.e. all solutions $X(t, X_0)$ with initial value $X_0 \in V$ converge to $X^*$ for $t \to +\infty$.

Proof. Both assertions follow immediately from classical stability theory by taking $f$ as Lyapunov function, cf. [22, 85].

Note that the convergence analysis near arbitrary equilibria, i.e. near arbitrary critical points of $f$, is quite subtle and may depend on the Riemannian metric, cf. [97].

3.1.4. Discretized gradient flows

Finally, we approach the problem of finding discretizations of (3.3) which lead to convergent gradient ascent methods. The ideas presented below can be traced back to Brockett, cf. [17]. Let

$$\exp_X : T_X M \to M \tag{3.11}$$

be the Riemannian exponential map at $X \in M$, i.e. $t \mapsto \exp_X(t\xi)$ denotes the unique geodesic with initial value $X \in M$ and initial velocity $\xi \in T_X M$. Moreover,


we assume that $M$ is (geodesically) complete, i.e. any geodesic is defined for all $t \in \mathbb{R}$. Hence, (3.11) is well-defined for the entire tangent bundle $TM$. The simplest discretization approach, a scheme that can be seen as an intrinsic Euler step method, leads to the Riemannian gradient method

$$X_{k+1} := \exp_{X_k}\big(\alpha_k \operatorname{grad} f(X_k)\big), \tag{3.12}$$

where $\alpha_k$ denotes an appropriate step size. In order to guarantee convergence of (3.12) to the set of critical points, it is sufficient to apply the Armijo rule [87]. An alternative to Armijo’s rule is the step size selection suggested by Brockett in [19], see also [22]. Convergence to a single critical point is a more subtle issue. If $(M, \langle\cdot\mid\cdot\rangle)$ and $f$ are analytic, and the step sizes are chosen according to a version of the first Wolfe–Powell condition for Riemannian manifolds, then pointwise convergence holds. A detailed proof can be found in [98].

3.2. Gradient flows on Lie groups

In the following, we apply the previous results to Lie groups and Lie subgroups. However, to fully exploit Lie-theoretic tools, the Riemannian structure and the group structure have to match, i.e. the metric $\langle\cdot\mid\cdot\rangle$ has to be invariant under the group action. For basic concepts and results on Lie groups and their Riemannian geometry we refer to [88, 89, 99–101]. In particular, we recommend the AMS-booklet of Arvanitoyeorgos [102] for a rather comprehensive, but condensed overview including many references for further reading. Sometimes we refer to [102] although it does not contain a full proof of the corresponding statement. Nevertheless, the details given therein will help the reader to get a better understanding of the subject. In any case, we always add a second reference containing a complete proof.

Let $\mathbf{G}$ denote a finite dimensional Lie group, i.e. a group which carries a smooth manifold structure such that the group operations are smooth mappings^c. For notational convenience we will assume that $\mathbf{G}$ can be represented as a (closed) matrix Lie group, i.e. as an (embedded) Lie subgroup of some general linear group $GL(N, \mathbb{K})$ of invertible $N \times N$-matrices over $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$.

Remark 3.2. According to a well-known result by Cartan, a subgroup $\mathbf{G}$ of $GL(N, \mathbb{K})$ is an (embedded) Lie subgroup, i.e. a smooth submanifold of $GL(N, \mathbb{K})$, if and only if it is closed in $GL(N, \mathbb{K})$, cf. [103]. Note, however, that there is a subtle difference between embedded and immersed Lie subgroups. Moreover, not every abstract Lie group admits a faithful representation as a matrix Lie group. Nevertheless, the class of matrix Lie groups is rich enough for all of our subsequent applications. For more details on these topics we also refer to [100, 103].

^c Actually, any Lie group $\mathbf{G}$ exhibits a real analytic substructure (induced by the exponential map), i.e. $\mathbf{G}$ can also be regarded as a real analytic manifold [101, 103].


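3.2.1. Invariant metrics

A Lie group $\mathbf{G}$ can be endowed in a canonical way with a Riemannian metric $\langle\cdot\mid\cdot\rangle$. Let $\mathfrak{g} := T_1\mathbf{G}$ be the Lie algebra of $\mathbf{G}$, i.e. the tangent space of $\mathbf{G}$ at the unity $1$. From the fact that the right multiplication $r_H : \mathbf{G} \to \mathbf{G}$ and the left multiplication $l_H : \mathbf{G} \to \mathbf{G}$ are diffeomorphisms of $\mathbf{G}$ for all $H \in \mathbf{G}$, it follows that

$$T_H\mathbf{G} = \mathfrak{g}H = H\mathfrak{g} \tag{3.13}$$

for all $H \in \mathbf{G}$. Now, let $(\cdot\mid\cdot)$ be any scalar product on $\mathfrak{g}$. Then

$$\langle g \mid h \rangle_G := (gG^{-1} \mid hG^{-1}) \tag{3.14}$$

for all $G \in \mathbf{G}$ and $g, h \in T_G\mathbf{G}$ yields a right invariant metric on $\mathbf{G}$, where right invariance stands for

$$\langle g \mid h \rangle_G = \langle gH \mid hH \rangle_{GH} \tag{3.15}$$

for all $G, H \in \mathbf{G}$ and $g, h \in T_G\mathbf{G}$. Thus right multiplication $r_H$ represents an isometry of $\mathbf{G}$. In the same way, one could obtain left invariant metrics on $\mathbf{G}$.

Remark 3.3. In an abstract setting, one has to replace (3.13) by

$$T_H\mathbf{G} = Dr_H(1)\mathfrak{g} = Dl_H(1)\mathfrak{g} \tag{3.16}$$

for all $H \in \mathbf{G}$, where $Dr_H$ and $Dl_H$ denote the tangent maps of $r_H$ and $l_H$, respectively. For a matrix Lie group, however, the respective tangent maps are given by $Dr_H(G)\xi = \xi H$ and $Dl_H(G)\xi = H\xi$ for all $G \in \mathbf{G}$ and $\xi \in T_G\mathbf{G}$. Hence (3.16) reduces to (3.13).

The construction of bi-invariant, i.e. right and left invariant, metrics is much more subtle and in general even impossible. To summarize the basic results on this topic we need some further terminology. The adjoint maps $\operatorname{Ad} : \mathbf{G} \to GL(\mathfrak{g})$ and $\operatorname{ad} : \mathfrak{g} \to \mathfrak{gl}(\mathfrak{g})$ are defined by

$$\operatorname{Ad}_G h := GhG^{-1} \qquad \text{and} \qquad \operatorname{ad}_g h := [g, h] := gh - hg$$

for all $G \in \mathbf{G}$ and all $g, h \in \mathfrak{g}$, where $GL(\mathfrak{g})$ and $\mathfrak{gl}(\mathfrak{g})$ denote the set of all automorphisms and, respectively, endomorphisms of $\mathfrak{g}$. Note that both notations $\operatorname{ad}_g h$ and $[g, h]$ are used interchangeably in the literature. A bilinear form $(\cdot\mid\cdot)$ on $\mathfrak{g}$ is called

(a) $\operatorname{Ad}_{\mathbf{G}}$-invariant if the identity

$$(g \mid h) = (\operatorname{Ad}_G g \mid \operatorname{Ad}_G h) \tag{3.17}$$

is satisfied for all $g, h \in \mathfrak{g}$ and $G \in \mathbf{G}$.

(b) $\operatorname{ad}_{\mathfrak{g}}$-invariant if the identity

$$(\operatorname{ad}_g h \mid k) = -(h \mid \operatorname{ad}_g k) \tag{3.18}$$

is satisfied for all $g, h, k \in \mathfrak{g}$.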
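The two invariance conditions (3.17) and (3.18) can be checked numerically in a small case. The sketch below assumes the group $SU(2)$ with Lie algebra $\mathfrak{su}(2)$ and the scalar product $(g \mid h) := \operatorname{Re}\operatorname{tr}(g^\dagger h)$; these choices, the parametrizations and all helper names are illustrative assumptions, not notation from the text.

```python
import cmath
import math

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def inner(g, h):
    """Scalar product (g | h) := Re tr(g^dagger h) on su(2) (an assumed choice)."""
    m = mul(dagger(g), h)
    return (m[0][0] + m[1][1]).real

def su2_algebra(x, y, z):
    """Skew-Hermitian traceless matrix i (x sigma_x + y sigma_y + z sigma_z)."""
    return [[1j * z, 1j * x + y], [1j * x - y, -1j * z]]

def su2_group(theta, phi, psi):
    """An element of SU(2) parametrized by three angles."""
    a = math.cos(theta) * cmath.exp(1j * phi)
    b = math.sin(theta) * cmath.exp(1j * psi)
    return [[a, -b.conjugate()], [b, a.conjugate()]]

def Ad(G, h):
    """Ad_G h = G h G^{-1}; for unitary G the inverse is the adjoint."""
    return mul(mul(G, h), dagger(G))

def ad(g, h):
    """ad_g h = [g, h] = g h - h g."""
    return sub(mul(g, h), mul(h, g))

G = su2_group(0.3, 0.7, -0.4)
g = su2_algebra(1.0, -2.0, 0.5)
h = su2_algebra(0.2, 0.4, -1.5)
k = su2_algebra(-0.7, 1.1, 0.9)

Ad_defect = inner(Ad(G, g), Ad(G, h)) - inner(g, h)    # Eq. (3.17)
ad_defect = inner(ad(g, h), k) + inner(h, ad(g, k))    # Eq. (3.18)
print(Ad_defect, ad_defect)
```

Both defects vanish up to rounding errors, as expected for a compact group, where such a scalar product induces a bi-invariant metric.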


Proposition 3.2. The following statements are equivalent:
(a) There exists a bi-invariant Riemannian metric $\langle\cdot\mid\cdot\rangle$ on $\mathbf{G}$.
(b) There exists an $\operatorname{Ad}_{\mathbf{G}}$-invariant scalar product $(\cdot\mid\cdot)$ on $\mathfrak{g}$.
Moreover, each of the statements (a) and (b) implies
(c) There exists an $\operatorname{ad}_{\mathfrak{g}}$-invariant scalar product $(\cdot\mid\cdot)$ on $\mathfrak{g}$.
If $\mathbf{G}$ is also connected, then (c) is equivalent to (a) and (b), respectively.

Proof. The equivalence (a) ⇔ (b) follows easily by exploiting Eq. (3.15) at $G = 1$. Moreover, applying (b) to a one-parameter subgroup $t \mapsto \exp(tg)$ and taking the derivative at $t = 0$ yields (c). The implication (c) ⇒ (b) is obtained in the same way, i.e. by differentiating $t \mapsto (\operatorname{Ad}_{e^{tg}} h \mid \operatorname{Ad}_{e^{tg}} k)$ with respect to $t$, cf. [102]. Note, however, that this implies $\operatorname{Ad}_{\mathbf{G}}$-invariance only on the connected component of the unity. Therefore, connectedness is necessary for the implication (c) ⇒ (b), as counter-examples show.

Now, the main result on the existence of bi-invariant metrics reads as follows.

Theorem 3.4. A connected Lie group $\mathbf{G}$ admits a bi-invariant Riemannian metric if and only if $\mathbf{G}$ is the direct product of a compact Lie group $\mathbf{G}_0$ and an abelian one, which is isomorphic to some $(\mathbb{R}^m, +)$, i.e. $\mathbf{G} \cong \mathbf{G}_0 \times \mathbb{R}^m$.

Proof. Cf. [89, 104].

Finally, we focus on a special class of Lie groups. A connected Lie group $\mathbf{G}$ is called semisimple if the Killing form, i.e. the bilinear form $(g, h) \mapsto \kappa(g, h) := \operatorname{tr}(\operatorname{ad}_g \operatorname{ad}_h)$, is non-degenerate on $\mathfrak{g}$. The most prominent representatives of this class are $SL(N, \mathbb{R})$, $SL(N, \mathbb{C})$, $SO(N, \mathbb{R})$ and $SU(N)$. More on semisimple Lie groups and their algebras can be found in [99, 103].

Theorem 3.5. (a) If $\mathbf{G}$ is semisimple then the Killing form $\kappa$ defines an $\operatorname{ad}_{\mathfrak{g}}$-invariant bilinear form on $\mathfrak{g}$.
(b) If $\mathbf{G}$ is semisimple and compact then $-\kappa$ defines an $\operatorname{ad}_{\mathfrak{g}}$-invariant scalar product on $\mathfrak{g}$. Thus $-\kappa$ induces a bi-invariant Riemannian metric on $\mathbf{G}$.

Proof. Cf. [102, 103].

3.2.2. Gradient flows with respect to an invariant metric

Next, we study gradient flows on $\mathbf{G}$ or on a closed subgroup $\mathbf{H} \subset \mathbf{G}$ with respect to an invariant metric $\langle\cdot\mid\cdot\rangle$. To this end, let $f : \mathbf{G} \to \mathbb{R}$ be a smooth quality function


and let $\varphi : \mathbf{G} \to \mathbf{G}$ be any diffeomorphism. Using the identity

$$\operatorname{grad}(f \circ \varphi)(G) = \big(D\varphi(G)\big)^{*} \operatorname{grad} f(\varphi(G)) \tag{3.19}$$

for all $G \in \mathbf{G}$, where $(\cdot)^*$ denotes the adjoint operator, we obtain by the right invariance of the metric

$$\operatorname{grad} f(G) = \operatorname{grad}(f \circ r_G)(1)\, G \tag{3.20}$$

for all $G \in \mathbf{G}$. Hence

$$\dot{G} = \operatorname{grad} f(G) \tag{3.21}$$

can be rewritten as

$$\dot{G} = \operatorname{grad}(f \circ r_G)(1)\, G. \tag{3.22}$$

Thus the gradient flow of $f$ is determined by the map $G \mapsto \operatorname{grad}(f \circ r_G)(1) \in \mathfrak{g}$. To study the asymptotic behavior of Eq. (3.21) we can apply the results of the previous section. For instance, for compact Lie groups we have the following.

Corollary 3.2. Let $\mathbf{G}$ be a compact Lie group with a right invariant Riemannian metric $\langle\cdot\mid\cdot\rangle$ and let $f : \mathbf{G} \to \mathbb{R}$ be a real analytic quality function. Then any solution of Eq. (3.21) converges to a critical point of $f$ for $t \to +\infty$.

Proof. This follows immediately from Proposition 3.1 and Theorem 3.1, as the pair $(\mathbf{G}, \langle\cdot\mid\cdot\rangle)$ constitutes a real analytic Riemannian manifold whenever the metric $\langle\cdot\mid\cdot\rangle$ is invariant, cf. footnote [146].

Now, let $\mathbf{H}$ be a closed subgroup of $\mathbf{G}$. By Remark 3.2, we know that $\mathbf{H}$ is actually an (embedded) submanifold of $\mathbf{G}$. Therefore, the gradient flow of $f|_{\mathbf{H}}$ with respect to $\langle\cdot\mid\cdot\rangle|_{\mathbf{H}}$ is well-defined and can be given explicitly via the orthogonal projectors $P_H$, cf. (3.5). However, for an invariant metric the computation of $P_H$ simplifies considerably, as all calculations can be carried out on the Lie algebra $\mathfrak{g}$ of $\mathbf{G}$.

Lemma 3.1. Let $\mathbf{G}$ be a Lie group with a right invariant Riemannian metric $\langle\cdot\mid\cdot\rangle$ and let $\mathbf{H}$ be a closed subgroup of $\mathbf{G}$. Furthermore, let $\mathfrak{g}$ and $\mathfrak{h}$ be their corresponding Lie algebras and denote by $P_{\mathfrak{h}}$ the orthogonal projection of $\mathfrak{g}$ onto $\mathfrak{h}$. Then the orthogonal projection $P_H$ in (3.4) is given by

$$P_H(gH) := P_{\mathfrak{h}}(g)\, H \tag{3.23}$$

for all $gH \in T_H\mathbf{G}$.

Proof. This is a straightforward consequence of the identity $T_H\mathbf{H} = \mathfrak{h}H$ and the right invariance of $\langle\cdot\mid\cdot\rangle$.

According to (3.5), (3.20) and (3.23), the gradient flow of $f|_{\mathbf{H}}$ finally reads

$$\dot{H} = P_{\mathfrak{h}}\big(\operatorname{grad}(f \circ r_H)(1)\big)\, H. \tag{3.24}$$
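To make Eq. (3.24) concrete, the following sketch treats a deliberately small case under stated assumptions: $\mathbf{G} = GL(2, \mathbb{R})$ with the right invariant metric induced by $(g \mid h) = \operatorname{tr}(g^\top h)$, the subgroup $\mathbf{H} = SO(2)$ with $\mathfrak{h}$ the skew-symmetric matrices, and the hypothetical quality function $f(G) = \operatorname{tr}(C^\top G)$. A short calculation then gives $\operatorname{grad}(f \circ r_H)(1) = C H^\top$ and $P_{\mathfrak{h}}(g) = \tfrac{1}{2}(g - g^\top)$. The flow is discretized by steps $H_{k+1} = \exp(\alpha\,\Omega_k) H_k$ in the spirit of (3.12), with a fixed step size instead of the Armijo rule.

```python
import math

C = [[1.0, -2.0], [3.0, 0.5]]     # hypothetical data matrix defining f

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def f(H):
    """Quality function f(H) = tr(C^T H)."""
    m = matmul(transpose(C), H)
    return m[0][0] + m[1][1]

def lie_algebra_gradient(H):
    """Coefficient w of P_h(grad(f o r_H)(1)) = [[0, -w], [w, 0]], cf. Eq. (3.24)."""
    x = matmul(C, transpose(H))           # grad(f o r_H)(1) in gl(2, R)
    return 0.5 * (x[1][0] - x[0][1])      # skew-symmetric part

def exp_so2(w):
    """Matrix exponential of [[0, -w], [w, 0]], i.e. a rotation by the angle w."""
    c, s = math.cos(w), math.sin(w)
    return [[c, -s], [s, c]]

def gradient_ascent(H0, alpha=0.1, n_steps=200):
    """Fixed-step discretization H_{k+1} = exp(alpha * Omega_k) H_k."""
    H = [row[:] for row in H0]
    for _ in range(n_steps):
        w = lie_algebra_gradient(H)
        H = matmul(exp_so2(alpha * w), H)
    return H

H_opt = gradient_ascent([[1.0, 0.0], [0.0, 1.0]])
phi_star = math.atan2(C[1][0] - C[0][1], C[0][0] + C[1][1])
print(f(H_opt), math.hypot(C[0][0] + C[1][1], C[1][0] - C[0][1]))
```

Since $f$ restricted to rotations by an angle $\varphi$ equals $(c_{11}+c_{22})\cos\varphi + (c_{21}-c_{12})\sin\varphi$, the iteration should approach the rotation by `phi_star`, where $f$ attains the value printed second. All computations take place on the Lie algebra, which is the simplification Lemma 3.1 provides.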


3.2.3. Geodesics with respect to an invariant metric

The remainder of this subsection is devoted to the question of how to compute geodesics and the Hessian of a smooth quality function with respect to an invariant metric. The main results for the forthcoming applications are summarized in Theorem 3.6(b) and Proposition 3.3. For readers with a basic differential geometric background we provide some details of the proof, which however can be skipped without losing the thread.

First, we need some further notation. Let $X_g^r : G \mapsto gG$ and $X_g^l : G \mapsto Gg$ be the right and left invariant vector fields on $\mathbf{G}$ which are uniquely determined by $X_g^r(1) = g$ and $X_g^l(1) = g$, respectively. Moreover, let $L_{\mathcal{X}}(\cdot)$ denote the Lie derivative with respect to the vector field $\mathcal{X}$, i.e. for a smooth function $f : \mathbf{G} \to \mathbb{R}$ one has $L_{\mathcal{X}}(f)(G) := Df(G) \cdot \mathcal{X}(G)$. On vector fields $\mathcal{Y}$, the action of $L_{\mathcal{X}}(\cdot)$ is given by

$$L_{\mathcal{X}}(\mathcal{Y})(G) := \lim_{t \to 0} \frac{\big(D\Phi_{\mathcal{X}}(t, G)\big)^{-1} \cdot \mathcal{Y}(\Phi_{\mathcal{X}}(t, G)) - \mathcal{Y}(G)}{t},$$

where $\Phi_{\mathcal{X}}(t, \cdot)$ denotes the corresponding flow of $\mathcal{X}$. Next, we recall two basic facts from differential geometry which play a key role in the proof of Theorem 3.6. The first one shows that the set of right/left invariant vector fields is invariant under Lie derivation, cf. [99, 102]. The second one relates a Riemannian metric of a manifold $M$ with a particular linear connection on $M$. For more details see e.g., [89].

Fact 1. The Lie derivative of a right/left invariant vector field is again right/left invariant and satisfies

$$L_{X_g^r} X_h^r = -X_{[g,h]}^r \qquad \text{and} \qquad L_{X_g^l} X_h^l = X_{[g,h]}^l. \tag{3.25}$$

Fact 2. On any Riemannian manifold $M$ there exists a unique Riemannian connection $\nabla$ determined by the properties

$$L_{\mathcal{X}} \mathcal{Y} = \nabla_{\mathcal{X}} \mathcal{Y} - \nabla_{\mathcal{Y}} \mathcal{X} \tag{3.26}$$

and

$$\nabla_{\mathcal{X}} \langle \mathcal{Y} \mid \mathcal{Z} \rangle = \langle \nabla_{\mathcal{X}} \mathcal{Y} \mid \mathcal{Z} \rangle + \langle \mathcal{Y} \mid \nabla_{\mathcal{X}} \mathcal{Z} \rangle, \tag{3.27}$$

where $\nabla_{\mathcal{X}}$ applied to the function $\langle \mathcal{Y} \mid \mathcal{Z} \rangle$ denotes its derivative along $\mathcal{X}$, i.e. $L_{\mathcal{X}} \langle \mathcal{Y} \mid \mathcal{Z} \rangle$.

Now, combining both facts yields the main result about geodesics on Lie groups.

Theorem 3.6. Let $\mathbf{G}$ be a Lie group with a bi-invariant metric $\langle\cdot\mid\cdot\rangle$ and let $\nabla$ denote the unique Riemannian connection on $\mathbf{G}$ induced by $\langle\cdot\mid\cdot\rangle$.
(a) For right/left invariant vector fields the Riemannian connection $\nabla$ is given by

$$\nabla_{X_g^r} X_h^r = -\tfrac{1}{2} X_{[g,h]}^r \qquad \text{and} \qquad \nabla_{X_g^l} X_h^l = \tfrac{1}{2} X_{[g,h]}^l. \tag{3.28}$$


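(b) The geodesics through any $G \in \mathbf{G}$ are of the form $t \mapsto G \exp(tg)$ or $t \mapsto \exp(tg) G$ with $g \in \mathfrak{g}$. In particular, the geodesics through the unity $1$ are precisely the one-parameter subgroups of $\mathbf{G}$.

Proof. (a) Applying Koszul’s identity, cf. [89, 91],

$$2 \langle \nabla_{\mathcal{X}} \mathcal{Y} \mid \mathcal{Z} \rangle = L_{\mathcal{X}} \langle \mathcal{Y} \mid \mathcal{Z} \rangle + L_{\mathcal{Y}} \langle \mathcal{Z} \mid \mathcal{X} \rangle - L_{\mathcal{Z}} \langle \mathcal{X} \mid \mathcal{Y} \rangle - \langle \mathcal{X} \mid L_{\mathcal{Y}} \mathcal{Z} \rangle + \langle \mathcal{Y} \mid L_{\mathcal{Z}} \mathcal{X} \rangle + \langle \mathcal{Z} \mid L_{\mathcal{X}} \mathcal{Y} \rangle,$$

to $X_g^r$, $X_h^r$ and $X_k^r$, and using that the scalar products of right invariant vector fields are constant functions, we obtain

$$2 \langle \nabla_{X_g^r} X_h^r \mid X_k^r \rangle = + \langle X_g^r \mid X_{[h,k]}^r \rangle - \langle X_h^r \mid X_{[k,g]}^r \rangle - \langle X_k^r \mid X_{[g,h]}^r \rangle.$$

Now Proposition 3.2 and Fact 1 imply

$$2 \langle \nabla_{X_g^r} X_h^r \mid X_k^r \rangle = - \langle X_k^r \mid X_{[g,h]}^r \rangle \qquad \text{and hence} \qquad \nabla_{X_g^r} X_h^r = -\tfrac{1}{2} X_{[g,h]}^r.$$

Obviously, for left invariant vector fields the same arguments apply.
(b) Let $\gamma(t) := \exp(tg) G$. Part (a) implies that the covariant derivative $\nabla_{\dot{\gamma}(t)} \dot{\gamma}(t) = \nabla_{X_g^r} X_g^r (\gamma(t))$ of $\dot{\gamma}$ vanishes and thus $\gamma$ represents the unique geodesic through $G$ with “initial velocity” $\xi = gG$. The same holds for $\tilde{\gamma}(t) := G \exp(tg)$, cf. [99, 102].

Observe that the bi-invariance of the metric and the invariance of the vector fields are essential for the above result. For example, Eq. (3.28) fails if the Riemannian metric is just right invariant. More details on this topic can be found in [99, 105].

Finally, by Theorem 3.6, the Hessian of the restriction $f|_{\mathbf{H}}$ can easily be obtained by restricting the Hessian of $f$ to $T\mathbf{H}$. More precisely, we have the following.

Proposition 3.3. Let $f : \mathbf{G} \to \mathbb{R}$ be a smooth quality function on a Lie group with bi-invariant metric $\langle\cdot\mid\cdot\rangle$ and let $\mathbf{H}$ be a closed subgroup. Then the Hessian of $f|_{\mathbf{H}}$ at $H \in \mathbf{H}$ is given by

$$\operatorname{Hess} f|_{\mathbf{H}}(H) = \operatorname{Hess} f(H)\big|_{T_H\mathbf{H} \times T_H\mathbf{H}}. \tag{3.29}$$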

Note that Eq. (3.29) is in general false for an arbitrary submanifold; it relies on $\mathbf{H}$ being a Lie subgroup. Counterexamples can be obtained easily for $\mathbf{G} = \mathbb{R}^m$.

3.3. Gradient flows on homogeneous spaces

The subsequent section on homogeneous spaces is motivated by the following observation, cf. Sec. 3.4. As before, let $f : \mathbf{G} \to \mathbb{R}$ be a smooth quality function. In many applications $f$ can be decomposed into a function $F$ defined on a smooth manifold $M$ and a (right) group action $\alpha : (X, G) \mapsto X \cdot G$ on $M$ such that

$$f(G) := F(X \cdot G) \tag{3.30}$$

for some fixed $X \in M$. Then we can think of $f$ as defined on the orbit of $X$. More precisely, let $\tilde{f} := F|_{O(X)}$, where $O(X) := \{X \cdot G \mid G \in \mathbf{G}\}$ denotes the orbit of $X$.


Thus

$$\tilde{f}(Y) = f(G) \tag{3.31}$$

for $Y = X \cdot G$ with $G \in \mathbf{G}$. Such quality functions $f$ are called induced by $F$, cf. Sec. 3.4. By construction, we have

$$\max_{G \in \mathbf{G}} f(G) = \max_{Y \in O(X)} \tilde{f}(Y). \tag{3.32}$$

Moreover, let $\mathbf{H}_X := \{G \in \mathbf{G} \mid X \cdot G = X\}$ denote the stabilizer or, equivalently, isotropy subgroup of $X$. Then $f$ can also be viewed as a function on the right coset space^d

$$\mathbf{G}/\mathbf{H}_X := \{\mathbf{H}_X G \mid G \in \mathbf{G}\}, \tag{3.33}$$

which is equivalent to saying that $f$ is equivariant with respect to $\mathbf{H}_X$, i.e.

$$f(G) = f(HG) \tag{3.34}$$

for all $H \in \mathbf{H}_X$. Therefore, coset spaces show up quite naturally in optimizing equivariant quality functions. Note that passing from $\mathbf{G}$ to $\mathbf{G}/\mathbf{H}_X$ can be rather useful in order to avoid certain degeneracies such as continua of critical points.

^d Note that the coset terminology in the group literature is not consistent, i.e. right cosets are sometimes called left cosets and vice versa. Here, we stick to the term right coset if the group element is on the right side, i.e. $[G] = \mathbf{H}G$.

3.3.1. Coset spaces

We first collect the fundamental facts on the differential structure of $\mathbf{G}/\mathbf{H}$, where $\mathbf{H}$ is any closed subgroup of $\mathbf{G}$. Detailed expositions can be found in [91, 99, 101, 102, 106].

Theorem 3.7. Let $\mathbf{G}$ be a Lie group with Lie algebra $\mathfrak{g}$ and let $\mathbf{H} \subset \mathbf{G}$ be a closed subgroup with Lie algebra $\mathfrak{h}$. Moreover, let $\mathfrak{p}$ be any complementary subspace to $\mathfrak{h}$, i.e. $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$. Then the following holds:
(a) The quotient topology turns the set of right cosets $\mathbf{G}/\mathbf{H} := \{[G] := \mathbf{H}G \mid G \in \mathbf{G}\}$ into a locally compact Hausdorff space.
(b) There exists a unique manifold structure on $\mathbf{G}/\mathbf{H}$ such that the canonical projection $\Pi : \mathbf{G} \to \mathbf{G}/\mathbf{H}$, $G \mapsto [G]$ is a submersion. In particular, the tangent space of $\mathbf{G}/\mathbf{H}$ at $[1]$ is isomorphic to $\mathfrak{p}$ via the canonical identification $p \mapsto \frac{d}{dt}[\exp tp]\big|_{t=0}$ and thus $\dim \mathbf{G}/\mathbf{H} = \dim \mathbf{G} - \dim \mathbf{H}$.
The following statements refer to the unique manifold structure on $\mathbf{G}/\mathbf{H}$ given in part (b).
(c) The Lie group $\mathbf{G}$ acts smoothly from the right on $\mathbf{G}/\mathbf{H}$ via

$$([G'], G) \mapsto [G'G] \tag{3.35}$$


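such that

$$\tilde{r}_G : \mathbf{G}/\mathbf{H} \to \mathbf{G}/\mathbf{H}, \qquad [G'] \mapsto [G'G] \tag{3.36}$$

are diffeomorphisms for all $G \in \mathbf{G}$. Moreover,

$$\Pi \circ l_G : \mathbf{G} \to \mathbf{G}/\mathbf{H}, \qquad G' \mapsto [GG'] \tag{3.37}$$

are submersions for all $G \in \mathbf{G}$. Thus the tangent space $T_{[G]}\mathbf{G}/\mathbf{H}$ is given by

$$T_{[G]}\mathbf{G}/\mathbf{H} = D\tilde{r}_G([1])\, T_{[1]}\mathbf{G}/\mathbf{H} = D(\Pi \circ l_G)(1)\, \mathfrak{g} = D(\Pi \circ l_G)(1)\big(\operatorname{Ad}_{G^{-1}} \mathfrak{p}\big). \tag{3.38}$$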

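(d) Moreover, if $\mathbf{H}$ is a normal subgroup, i.e. $G\mathbf{H}G^{-1} = \mathbf{H}$ for all $G \in \mathbf{G}$, then the multiplication $[G] \cdot [G'] := [GG']$ is well-defined and yields a Lie group structure on $\mathbf{G}/\mathbf{H}$.

Proof. Cf. [99, 101, 106].

The Lie group $\mathbf{G}/\mathbf{H}$ given by Theorem 3.7(d) is called the quotient Lie group of $\mathbf{G}$ by $\mathbf{H}$. Moreover, the result provides the possibility to extend the well-known First Isomorphism Law to the category of Lie groups.

Theorem 3.8. Let $\Phi : \mathbf{G} \to \mathbf{G}'$ be a smooth surjective Lie group homomorphism. Then there exists a well-defined Lie group isomorphism $\widehat{\Phi} : \mathbf{G}/\mathbf{H} \to \mathbf{G}'$ with $\mathbf{H} := \ker \Phi$ such that the diagram

$$\begin{array}{ccc} \mathbf{G} & \xrightarrow{\ \Phi\ } & \mathbf{G}' \\ {\scriptstyle \Pi} \downarrow & \nearrow {\scriptstyle \widehat{\Phi}} & \\ \mathbf{G}/\mathbf{H} & & \end{array} \tag{3.39}$$

commutes. Moreover, let $\mathfrak{g}$, $\mathfrak{g}'$ and $\mathfrak{h}$ denote the corresponding Lie algebras and let $\mathfrak{p}$ be any complementary space to $\mathfrak{h}$. Then $D\Phi(1)$ is a surjective Lie algebra homomorphism with $\ker D\Phi(1) = \mathfrak{h}$ and commutative diagram

$$\begin{array}{ccc} \mathfrak{g} & \xrightarrow{\ D\Phi(1)\ } & \mathfrak{g}' \\ {\scriptstyle D\Pi(1)} \downarrow & \nearrow {\scriptstyle D\widehat{\Phi}(1)} & \\ \mathfrak{p} \cong \mathfrak{g}/\mathfrak{h}. & & \end{array} \tag{3.40}$$

Proof. Note that $\mathbf{H} = \ker \Phi$ is a closed normal subgroup of $\mathbf{G}$. Thus by the First Isomorphism Law $\widehat{\Phi}([G]) := \Phi(G)$ for $[G] \in \mathbf{G}/\mathbf{H}$ is a well-defined group isomorphism. Moreover, $\widehat{\Phi}$ is smooth, since $\Pi$ is a smooth submersion by Theorem 3.7. The assertion that $D\Phi(1)$ is a surjective Lie algebra homomorphism follows easily from the properties of the exponential map. Finally, a straightforward application of the chain rule yields Eq. (3.40).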
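A familiar instance of Theorem 3.8, added here as an illustration and not taken from the text, is the homomorphism $\Phi : SU(2) \to SO(3)$ given by $\Phi(U)_{ij} = \tfrac{1}{2}\operatorname{tr}(\sigma_i U \sigma_j U^\dagger)$ with kernel $\{\pm 1\}$, so that $\widehat{\Phi}$ identifies $SU(2)/\{\pm 1\}$ with $SO(3)$. The sketch below checks numerically that $\Phi$ is constant on the cosets $\{U, -U\}$, multiplicative, and takes values in the orthogonal matrices; surjectivity is not verified, and the parametrization and helper names are assumptions.

```python
import cmath
import math

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def neg(a):
    return [[-x for x in row] for row in a]

def su2_group(theta, phi, psi):
    """An element of SU(2) parametrized by three angles."""
    a = math.cos(theta) * cmath.exp(1j * phi)
    b = math.sin(theta) * cmath.exp(1j * psi)
    return [[a, -b.conjugate()], [b, a.conjugate()]]

def rotation(U):
    """Phi(U)_{ij} = (1/2) tr(sigma_i U sigma_j U^dagger), a real 3x3 matrix."""
    Ud = dagger(U)
    R = []
    for i in range(3):
        row = []
        for j in range(3):
            m = mul(mul(PAULI[i], U), mul(PAULI[j], Ud))
            row.append(0.5 * (m[0][0] + m[1][1]).real)
        R.append(row)
    return R

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(len(a)) for j in range(len(a[0])))

U = su2_group(0.3, 0.7, -0.4)
V = su2_group(1.1, -0.2, 0.9)
IDENTITY3 = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

coset_defect = max_diff(rotation(U), rotation(neg(U)))                  # Phi(U) = Phi(-U)
hom_defect = max_diff(rotation(mul(U, V)), mul(rotation(U), rotation(V)))
orth_defect = max_diff(mul(rotation(U), dagger(rotation(U))), IDENTITY3)
print(coset_defect, hom_defect, orth_defect)
```

All three defects are zero up to rounding. On the Lie algebra level the kernel $\mathfrak{h}$ of $D\Phi(1)$ is trivial in this example, so $\mathfrak{p} = \mathfrak{su}(2) \cong \mathfrak{so}(3)$ in diagram (3.40).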


3.3.2. Orbit theorems and homogeneous spaces

Next, we analyze the relation between group actions and coset spaces. A smooth right Lie group action is a smooth map $\alpha : M \times \mathbf{G} \to M$, $(X, G) \mapsto X \cdot G$ with

$$(X \cdot G) \cdot H = X \cdot (GH) \qquad \text{and} \qquad X \cdot 1 = X$$

for all $X \in M$ and $G, H \in \mathbf{G}$. The orbit of $X \in M$ under the group action $\alpha$ is defined by $O(X) := \{X \cdot G \mid G \in \mathbf{G}\}$. The action is called transitive if $M = O(X)$ for some and hence for all $X \in M$. Equivalently, one can say that for all $X, Y \in M$ there exists an element $G \in \mathbf{G}$ with $Y = X \cdot G$. Moreover, for $X \in M$ let $\mathbf{H}_X := \{G \in \mathbf{G} \mid X \cdot G = X\}$ denote the stabilizer of $X$ and $\alpha_X : \mathbf{G} \to M$ the map $G \mapsto X \cdot G$. Then the canonical map $\tilde{\alpha}_X : \mathbf{G}/\mathbf{H}_X \to M$ is defined by $[G] \mapsto X \cdot G$.

Theorem 3.9 (Orbit Theorem). Let $\mathbf{G}$ be a Lie group with Lie algebra $\mathfrak{g}$ and let $\alpha : M \times \mathbf{G} \to M$ be a smooth right action of $\mathbf{G}$ on a smooth manifold $M$. Moreover, let $X$ be any point in $M$. Then the following statements are satisfied:
(a) The stabilizer subgroup $\mathbf{H}_X$ is a closed subgroup of $\mathbf{G}$.
(b) Let $\mathfrak{h}_X$ be the Lie algebra of $\mathbf{H}_X$. Then

$$\ker D\alpha_X(1) = \mathfrak{h}_X. \tag{3.41}$$

In particular, the canonical map $\tilde{\alpha}_X : \mathbf{G}/\mathbf{H}_X \to M$ is an injective immersion.
(c) The canonical map $\tilde{\alpha}_X$ is an embedding, i.e. $O(X)$ is a submanifold of $M$ diffeomorphic to $\mathbf{G}/\mathbf{H}_X$, if and only if $\tilde{\alpha}_X$ is proper^e. In this case, the tangent space of $O(X)$ at $Y = X \cdot G$ is given by

$$T_Y O(X) = D\alpha_X(G)\, T_G\mathbf{G} = D\alpha_Y(1)\, \mathfrak{g} = D\alpha_Y(1) \operatorname{Ad}_{G^{-1}} \mathfrak{p}_X, \tag{3.42}$$

where $\mathfrak{p}_X$ is any complementary subspace of $\mathfrak{h}_X$, i.e. $\mathfrak{g} = \mathfrak{h}_X \oplus \mathfrak{p}_X$.

^e A map $\varphi$ is called proper if the pre-image $\varphi^{-1}(K)$ of any compact set $K$ is also compact.

Proof. (a) The continuity of $\alpha_X$ implies that $\mathbf{H}_X = \alpha_X^{-1}(X)$ is closed.
(b) In order to see that $\tilde{\alpha}_X$ is an injective immersion, consider the identity $\alpha_X \circ r_G = \alpha(\alpha_X(\cdot), G)$ and thus $D\alpha_X(G) \cdot gG = D_1\alpha(X, G) \circ D\alpha_X(1)\, g$. Therefore, $D\alpha_X(1)\, g = 0$ implies

$$\frac{d}{dt} \alpha_X(\exp(tg)) = 0$$

for all $t \in \mathbb{R}$ and hence $\ker D\alpha_X(1) \subset \mathfrak{h}_X$. As the inclusion $\mathfrak{h}_X \subset \ker D\alpha_X(1)$ is obvious, we obtain $\ker D\alpha_X(1) = \mathfrak{h}_X$. Moreover, let $\mathfrak{p}_X$ be any complementary subspace of $\mathfrak{h}_X$. Then, identifying $\mathfrak{p}_X$ with $T_{[1]}\mathbf{G}/\mathbf{H}_X$ yields $D\tilde{\alpha}_X([1]) = D\alpha_X(1)|_{\mathfrak{p}_X}$, cf. Theorem 3.7. Thus $D\tilde{\alpha}_X([1])$ is injective and the same holds for any other $[G] \in \mathbf{G}/\mathbf{H}_X$ by right multiplication $\tilde{r}_G$.


(c) The first part follows from a standard embedding criterion on immersed manifolds, cf. [92]. The first equality of Eq. (3.42) is a straightforward consequence of the identity $\alpha_X = \tilde{\alpha}_X \circ \Pi_X$, where $\Pi_X : \mathbf{G} \to \mathbf{G}/\mathbf{H}_X$ denotes the canonical projection. The second one is obtained by $\alpha_Y = \alpha_X \circ l_G = \alpha_X \circ r_G \circ \operatorname{Ad}_G$, while the third one follows from $\mathbf{H}_Y = \operatorname{Ad}_{G^{-1}} \mathbf{H}_X$. For further details see also.

Corollary 3.3. Let $\alpha : M \times \mathbf{G} \to M$ be as in Theorem 3.9 and let $X \in M$ be any point.
(a) If $\mathbf{G}$ is compact then $\mathbf{G}/\mathbf{H}_X$ is diffeomorphic to $O(X)$.
(b) If $\alpha$ is transitive then $\mathbf{G}/\mathbf{H}_X$ is diffeomorphic to $M$.

Proof. (a) This follows readily from Theorem 3.9(c) and the compactness of $\mathbf{G}$.
(b) Observe that transitivity of $\alpha$ implies surjectivity of $D\alpha_X(G)$ and $D\tilde{\alpha}_X([G])$. Thus Theorem 3.9(b) yields the desired result, cf. [106].

This gives rise to the following definition. A manifold $M$ is called a homogeneous $\mathbf{G}$-space or, for short, a homogeneous space, if there exists a transitive smooth Lie group action of $\mathbf{G}$ on $M$. In particular, any coset space $\mathbf{G}/\mathbf{H}$ can be regarded as a homogeneous space via the canonical action $([G'], G) \mapsto [G'G]$ for $[G'] \in \mathbf{G}/\mathbf{H}$ and $G \in \mathbf{G}$. Further results on homogeneous spaces, orbit spaces and principal $\mathbf{G}$-bundles can be found in [96, 101, 106].

Remark 3.4. Note that by Theorem 3.9 the orbit $O(X)$ always carries a manifold structure whose topology is equal to or finer than the topology induced by $M$.

3.3.3. Reductive homogeneous spaces

Let $M$ be a homogeneous space with transitive Lie group action $\alpha : M \times \mathbf{G} \to M$ and let $\mathbf{H} := \mathbf{H}_X$ be the stabilizer subgroup of a fixed element $X \in M$. Next, we are interested in carrying over the Riemannian structure of $\mathbf{G}$ to $M$ or, equivalently, to $\mathbf{G}/\mathbf{H}$. First, we need some further terminology. As most of the following terms are conveniently expressed via algebraic properties of the pair $(\mathbf{G}, \mathbf{H})$, we focus on the case $M = \mathbf{G}/\mathbf{H}$. Yet one could restate all results in terms of an abstract group action $\alpha$ on $M$.
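Before specializing to $M = \mathbf{G}/\mathbf{H}$, the notions of right action, stabilizer and orbit used above can be illustrated in a small case. The sketch below assumes $\mathbf{G} = SU(2)$ acting on Hermitian $2 \times 2$ matrices by $X \cdot G := G^\dagger X G$ and the fixed element $X = \sigma_z$; for this action a direct calculation gives $D\alpha_X(1)\, g = [X, g]$. The example and its helper names are assumptions made for illustration only.

```python
import cmath
import math

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def norm(a):
    return math.sqrt(sum(abs(x) ** 2 for row in a for x in row))

def su2_group(theta, phi, psi):
    a = math.cos(theta) * cmath.exp(1j * phi)
    b = math.sin(theta) * cmath.exp(1j * psi)
    return [[a, -b.conjugate()], [b, a.conjugate()]]

def action(X, G):
    """Right action X . G := G^dagger X G (an assumed example action)."""
    return mul(mul(dagger(G), X), G)

def d_alpha(X, g):
    """Tangent map D alpha_X(1) g = [X, g] for skew-Hermitian g."""
    return sub(mul(X, g), mul(g, X))

X = [[1, 0], [0, -1]]                       # sigma_z
G = su2_group(0.3, 0.7, -0.4)
H = su2_group(1.1, -0.2, 0.9)
S = su2_group(0.0, 0.8, 0.0)                # diagonal element diag(e^{0.8i}, e^{-0.8i})

action_defect = norm(sub(action(action(X, G), H), action(X, mul(G, H))))
stabilizer_defect = norm(sub(action(X, S), X))
kernel_norm = norm(d_alpha(X, [[1j, 0], [0, -1j]]))       # g = i sigma_z
transversal_norm = norm(d_alpha(X, [[0, 1j], [1j, 0]]))   # g = i sigma_x
print(action_defect, stabilizer_defect, kernel_norm, transversal_norm)
```

In this example the diagonal elements of $SU(2)$ stabilize $X$, their Lie algebra direction $i\sigma_z$ lies in $\ker D\alpha_X(1)$ as in (3.41), while $i\sigma_x$ is mapped to a nonzero tangent vector of the orbit. The orbit is then two-dimensional, consistent with $\dim \mathbf{G}/\mathbf{H}_X = 3 - 1$.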
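A homogeneous space $\mathbf{G}/\mathbf{H}$ is reductive if the Lie algebra $\mathfrak{h}$ of $\mathbf{H}$ has a complementary subspace $\mathfrak{p}$ in $\mathfrak{g}$ such that $\mathfrak{p}$ is $\mathrm{Ad}_{\mathbf{H}}$-invariant, i.e. $H\mathfrak{p}H^{-1} \subset \mathfrak{p}$ for all $H \in \mathbf{H}$. A Riemannian metric $\langle\cdot\,|\,\cdot\rangle$ on $\mathbf{G}/\mathbf{H}$ is called $\mathbf{G}$-invariant if the mappings $\hat r_G$ are isometries, i.e. if for all $\xi, \eta \in T_{[G']}\mathbf{G}/\mathbf{H}$ and $G, G' \in \mathbf{G}$ the identity

\[ \langle \xi \,|\, \eta \rangle_{[G']} = \langle D\hat r_G([G'])\xi \,|\, D\hat r_G([G'])\eta \rangle_{[G'G]} \tag{3.43} \]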

July 12, 2010 12:0 WSPC/S0129-055X 148-RMP J070-S0129055X10004053

T. Schulte-Herbrüggen et al.

holds. Moreover, a bilinear form $(\cdot|\cdot)$ on $\mathfrak{p}$ is called

(a) $\mathrm{Ad}_{\mathbf{H}}$-invariant if the identity

\[ (p \,|\, p') = (\mathrm{Ad}_H p \,|\, \mathrm{Ad}_H p') \tag{3.44} \]

is satisfied for all $p, p' \in \mathfrak{p}$ and $H \in \mathbf{H}$.

(b) $\mathrm{ad}_{\mathfrak{h}}$-invariant if the identity

\[ (\mathrm{ad}_h p \,|\, p') = -(p \,|\, \mathrm{ad}_h p') \tag{3.45} \]

is satisfied for all $p, p' \in \mathfrak{p}$ and $h \in \mathfrak{h}$.

Note that $\mathbf{G}/\mathbf{H}$ is reductive if $\mathbf{G}$ has a bi-invariant metric, as one can choose $\mathfrak{p} := \mathfrak{h}^\perp$. Next, we give a generalization of Proposition 3.2 and Theorem 3.4 to homogeneous spaces.

Proposition 3.4. Let $\mathbf{G}/\mathbf{H}$ be a homogeneous space with reductive decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$. The following statements are equivalent: (a) There exists a $\mathbf{G}$-invariant metric $\langle\cdot\,|\,\cdot\rangle$ on $\mathbf{G}/\mathbf{H}$. (b) There exists an $\mathrm{Ad}_{\mathbf{H}}$-invariant scalar product $(\cdot|\cdot)$ on $\mathfrak{p}$. In addition, if $\mathbf{H}$ is connected then (a) and (b) are equivalent to (c) There exists an $\mathrm{ad}_{\mathfrak{h}}$-invariant scalar product $(\cdot|\cdot)$ on $\mathfrak{p}$.

Proof. Cf. [102] and Proposition 3.2.

Theorem 3.10. Let $\mathbf{G}/\mathbf{H}$ be a homogeneous space with reductive decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$. Then $\mathbf{G}/\mathbf{H}$ admits a $\mathbf{G}$-invariant metric if and only if the closure of $\mathrm{Ad}_{\mathbf{H}}|_{\mathfrak{p}} := \{\mathrm{Ad}_H : \mathfrak{p} \to \mathfrak{p} \mid H \in \mathbf{H}\}$ is compact in $GL(\mathfrak{p})$.

Proof. Cf. [89].

Remark 3.5. (a) As a special case, Theorem 3.10 implies the existence of bi-invariant metrics on compact Lie groups, cf. Theorem 3.4 and [89]. (b) Replacing $\mathfrak{p}$ by the quotient space $\mathfrak{g}/\mathfrak{h}$ allows one to state Theorem 3.10 without referring to any reductive decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$ of $\mathfrak{g}$, cf. [89]. Moreover, it can be shown that any homogeneous space $\mathbf{G}/\mathbf{H}$ which admits a $\mathbf{G}$-invariant metric is reductive, cf. [107].

Theorem 3.10 can easily be rephrased for an arbitrary homogeneous $\mathbf{G}$-space $M$ with transitive group action $\alpha : M \times \mathbf{G} \to M$ by choosing $\mathbf{H} := \mathbf{H}_X$ with $X \in M$. Note, however, that for orbits $M := O(X)$ embedded in some larger Riemannian manifold $N$, the invariant metric given by Theorem 3.10 does in general not coincide with the induced metric. This gives rise to the following definition. A manifold $M$ is called a Riemannian homogeneous $\mathbf{G}$-space or, for short, a Riemannian homogeneous space if $M$ is a homogeneous $\mathbf{G}$-space with $\alpha$-invariant


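metric, which is to say that the mappings $\alpha_G : M \to M$, $\alpha_G(X) := X \cdot G$ are isometries of $M$ for all $G \in \mathbf{G}$, i.e. for all $\xi, \eta \in T_X M$ and $G \in \mathbf{G}$ one has

\[ \langle \xi \,|\, \eta \rangle_X = \langle D\alpha_G(X)\xi \,|\, D\alpha_G(X)\eta \rangle_{X\cdot G}. \tag{3.46} \]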

Proposition 3.5. (a) Any homogeneous space of the form $\mathbf{G}/\mathbf{H}$ with a $\mathbf{G}$-invariant metric is a Riemannian homogeneous space. (b) Any Riemannian homogeneous space is isometric to a homogeneous space of the form $\mathbf{G}/\mathbf{H}$ with a $\mathbf{G}$-invariant metric.

Proof. Follows readily from the previous definitions and Corollary 3.3(b).

3.3.4. Naturally reductive homogeneous spaces and geodesics

Characterizing the Riemannian connection of a homogeneous space and its geodesics is in general an advanced issue which we do not want to address here, cf. [102] and the references therein, e.g. [108]. However, there are two cases, (a) and (b) below, which are easy to handle. A homogeneous space $\mathbf{G}/\mathbf{H}$ is called

(a) naturally reductive if it is reductive with complementary space $\mathfrak{p}$ and $\mathrm{Ad}_{\mathbf{H}}$-invariant scalar product $(\cdot|\cdot)$ on $\mathfrak{p}$ such that the identity

\[ (P\,\mathrm{ad}_g h \,|\, k) = -(h \,|\, P\,\mathrm{ad}_g k) \tag{3.47} \]

is satisfied for all $g, h, k \in \mathfrak{p}$, where $P$ denotes the projection onto $\mathfrak{p}$ along $\mathfrak{h}$.

(b) Cartan-like if it is reductive with complementary space $\mathfrak{p}$ and $\mathrm{Ad}_{\mathbf{H}}$-invariant scalar product $(\cdot|\cdot)$ on $\mathfrak{p}$ such that the commutator relations

\[ [\mathfrak{h}, \mathfrak{h}] \subset \mathfrak{h}, \qquad [\mathfrak{h}, \mathfrak{p}] \subset \mathfrak{p} \qquad\text{and}\qquad [\mathfrak{p}, \mathfrak{p}] \subset \mathfrak{h} \tag{3.48} \]

are satisfied.

Remark 3.6. If, in definition (a), the complementary space $\mathfrak{p}$ can be chosen as the orthogonal complement of $\mathfrak{h}$ with respect to some $\mathrm{Ad}_{\mathbf{G}}$-invariant scalar product $(\cdot|\cdot)$ on $\mathfrak{g}$, then condition (3.47) reduces to

\[ (\mathrm{ad}_g h \,|\, k) = -(h \,|\, \mathrm{ad}_g k) \tag{3.49} \]

for all $g, h, k \in \mathfrak{p}$.

Lemma 3.2. (a) Every Cartan-like homogeneous space $\mathbf{G}/\mathbf{H}$ is naturally reductive. (b) Every naturally reductive homogeneous space $\mathbf{G}/\mathbf{H}$ is a Riemannian homogeneous space.

Proof. (a) By the commutator relation $[\mathfrak{p}, \mathfrak{p}] \subset \mathfrak{h}$, we have $P\,\mathrm{ad}_g h = 0$ for all $g, h \in \mathfrak{p}$. Thus Eq. (3.47) is satisfied. (b) The assertion follows immediately from Proposition 3.4.

Theorem 3.11 (Coset Version). Let $\mathbf{G}/\mathbf{H}$ be naturally reductive. Then $\mathbf{G}/\mathbf{H}$ is a Riemannian homogeneous space such that all geodesics through $[G] \in \mathbf{G}/\mathbf{H}$ are of


the form

\[ t \mapsto [G \exp(t\,\mathrm{Ad}_{G^{-1}} p)] = [\exp(tp)\,G] \tag{3.50} \]

with $p \in \mathfrak{p}$.

Proof. By Lemma 3.2(b), the quotient space $\mathbf{G}/\mathbf{H}$ is also a Riemannian homogeneous space. For a proof of Eq. (3.50) we refer to [91, 102, 105].

The above result can be restated for an arbitrary naturally reductive Riemannian homogeneous $\mathbf{G}$-space.

Theorem 3.12 (Orbit Version). Let $M$ be a homogeneous $\mathbf{G}$-space with transitive group action $\alpha : M \times \mathbf{G} \to M$. Assume that $\mathbf{G}/\mathbf{H}_X$ is naturally reductive with decomposition $\mathfrak{g} = \mathfrak{h}_X \oplus \mathfrak{p}_X$. Then $M$ is a Riemannian homogeneous $\mathbf{G}$-space such that all geodesics through $Y = X \cdot G \in M$ are of the form

\[ t \mapsto Y \cdot \exp(t\,\mathrm{Ad}_{G^{-1}} p) \tag{3.51} \]

with $p \in \mathfrak{p}_X$.

Proof. The result is a straightforward consequence of Theorem 3.11.

Thus naturally reductive homogeneous spaces are Riemannian spaces in which the exponential map is particularly simple to express. Taking the basic picture of [91] further to discuss geodesics, Fig. 3 illustrates that only in naturally reductive homogeneous spaces do the geodesics on $\mathbf{G}$ project to geodesics on $\mathbf{G}/\mathbf{H}$. In this sense, projection and exponentiation of tangent vectors commute in naturally reductive homogeneous spaces. However, on reductive homogeneous spaces that are not naturally reductive, the problem is considerably more involved. A necessary and sufficient condition for $t \mapsto [G \exp(tg)]$ being a geodesic in $\mathbf{G}/\mathbf{H}$ can be found in [102, 109]. On the other hand, for numerical purposes it is often enough and even advisable to approximate the Riemannian exponential map by another computationally more efficient local parametrisation. Here, the map

\[ \mathfrak{p} \ni p \mapsto \Pi \circ l_G \circ \exp(\mathrm{Ad}_{G^{-1}} p) \tag{3.52} \]

might be a natural candidate, even if it fails to give the exact Riemannian exponential map. These issues are subject to current research, and recent details can be found in [25, 110]. Figure 3 also shows how, in reductive homogeneous spaces that are no longer naturally reductive, the projected geodesic still provides a first-order approximation to the geodesic generated by the projection of the tangent vector.

3.3.5. Adjoint orbits

A prime example for naturally reductive homogeneous spaces is provided by the adjoint action of a compact Lie group, a scenario which is of major interest in


Fig. 3. Geodesics in reductive homogeneous spaces $\mathbf{G}/\mathbf{H}$. The tangent vector $p \in \mathfrak{p}$ projects to the tangent vector $\xi$ at the coset $[1] = \mathbf{H}$. Note that only in naturally reductive homogeneous spaces does the geodesic in $\mathbf{G}$ generated by $p$ project onto $\mathbf{G}/\mathbf{H}$ such that it coincides with the geodesic of the projected tangent vector in the sense $\Pi(e^{tp}) = \exp_{[1]}(t\xi)$. In reductive homogeneous spaces that are not naturally reductive, the projection yields in general only a first-order approximation at $[1] = \mathbf{H}$, as shown in the lower part, where $\Pi(e^{tp}) \neq \exp_{[1]}(t\xi)$. (Color online)

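the forthcoming applications. Therefore, we summarize the previous results for the particular case of adjoint orbits. Note that the adjoint action given by $(X, G) \mapsto \mathrm{Ad}_G X := GXG^{-1}$ is a left action. However, all previous statements and formulas remain valid mutatis mutandis, e.g. right cosets have to be replaced by left cosets, etc.

Corollary 3.4. Let $\mathbf{G}$ be a Lie group with Lie algebra $\mathfrak{g}$ and let $\mathbf{K} \subset \mathbf{G}$ be a compact subgroup with Lie algebra $\mathfrak{k}$ and bi-invariant metric $\langle\cdot\,|\,\cdot\rangle$. Moreover, let $\alpha : \mathfrak{g} \times \mathbf{K} \to \mathfrak{g}$, $(X, K) \mapsto \mathrm{Ad}_K X := KXK^{-1}$ be the adjoint action of $\mathbf{K}$ on $\mathfrak{g}$ and denote by $\alpha_X : \mathbf{K} \to \mathfrak{g}$ the map $K \mapsto \mathrm{Ad}_K X$. Then the following assertions hold:

(a) The stabilizer group $\mathbf{H} := \mathbf{H}_X$ of $X$ is a closed subgroup of $\mathbf{K}$.

(b) The coset space $\mathbf{K}/\mathbf{H}$ is diffeomorphic to the adjoint orbit $O(X) := \{\mathrm{Ad}_K X \mid K \in \mathbf{K}\}$ of $X$. In particular, the map $\hat\alpha_X : \mathbf{K}/\mathbf{H} \to O(X)$, $[K] \mapsto \mathrm{Ad}_K X$ is a well-defined diffeomorphism satisfying the commutative diagram

\[
\begin{array}{ccc}
\mathbf{K} & \xrightarrow{\;\alpha_X\;} & O(X) \subset \mathfrak{g} \\
{\scriptstyle \Pi}\,\big\downarrow & \nearrow\,{\scriptstyle \hat\alpha_X} & \\
\mathbf{K}/\mathbf{H} & &
\end{array}
\tag{3.53}
\]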


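(c) Let $\mathfrak{h} := \mathfrak{h}_X$ denote the Lie algebra of $\mathbf{H}$ and let $\mathfrak{p}$ be any complementary space to $\mathfrak{h}$ in $\mathfrak{k}$. Then $D\alpha_X(1) = -\mathrm{ad}_X$ is a surjective homomorphism with $\ker \mathrm{ad}_X = \mathfrak{h}$ and commutative diagram

\[
\begin{array}{ccc}
\mathfrak{k} & \xrightarrow{\;D\alpha_X(1)\;} & T_X O(X) \subset \mathfrak{g} \\
{\scriptstyle D\Pi(1)}\,\big\downarrow & \nearrow\,{\scriptstyle D\hat\alpha_X([1])}\ \cong & \\
\mathfrak{k}/\mathfrak{h} & &
\end{array}
\tag{3.54}
\]

Moreover, the tangent space of $O(X)$ at $Y = \mathrm{Ad}_K X$ is given by $T_Y O(X) = \mathrm{ad}_Y \mathfrak{k} = \mathrm{ad}_Y(\mathrm{Ad}_{K^{-1}} \mathfrak{p})$.

(d) $O(X) \cong \mathbf{K}/\mathbf{H}$ is naturally reductive. More precisely, $\mathfrak{p} := \mathfrak{h}^\perp$ yields a naturally reductive decomposition of $\mathfrak{k}$, where the $\mathrm{Ad}_{\mathbf{H}}$-invariant scalar product on $\mathfrak{p}$ is given by the restriction of $\langle\cdot\,|\,\cdot\rangle$.

(e) There is a well-defined $\alpha$-invariant metric on $O(X)$ given by

\[ \langle \xi \,|\, \eta \rangle_{\mathrm{Ad}_K X} := \langle p_\xi \,|\, p_\eta \rangle \tag{3.55} \]

with $\xi = \mathrm{ad}_Y(\mathrm{Ad}_K p_\xi)$, $\eta = \mathrm{ad}_Y(\mathrm{Ad}_K p_\eta)$ and $p_\xi, p_\eta \in \mathfrak{p}$.

(f) All geodesics through $Y = \mathrm{Ad}_K X \in O(X)$ with respect to the metric given in part (e) are of the form

\[ t \mapsto \mathrm{Ad}_{\exp(t\,\mathrm{Ad}_K p)}\, Y \tag{3.56} \]

with $p \in \mathfrak{p}$.

Proof. Parts (a) and (b) follow immediately from Theorem 3.9 and Corollary 3.3.

(c) For $k \in \mathfrak{k}$ we have

\[ \frac{d}{dt}\,\mathrm{Ad}_{\exp(tk)} X \Big|_{t=0} = -\mathrm{ad}_X k \]

and thus $D\alpha_X(1) = -\mathrm{ad}_X$. All other statements are again consequences of Theorem 3.9.

(d) First, observe that the bi-invariance of $\langle\cdot\,|\,\cdot\rangle$ implies that $\mathfrak{k} = \mathfrak{h} \oplus \mathfrak{p}$ with $\mathfrak{p} := \mathfrak{h}^\perp$ is reductive. Now, let $P$ denote the orthogonal projection onto $\mathfrak{p}$. In turn, the bi-invariance of $\langle\cdot\,|\,\cdot\rangle$ yields

\[ \langle P\,\mathrm{ad}_g h \,|\, k \rangle = \langle \mathrm{ad}_g h \,|\, k \rangle = -\langle h \,|\, \mathrm{ad}_g k \rangle = -\langle h \,|\, P\,\mathrm{ad}_g k \rangle \]

for all $g, h, k \in \mathfrak{p}$, cf. Proposition 3.2. Therefore, $O(X) \cong \mathbf{K}/\mathbf{H}$ is naturally reductive.

(e) Let $Y \in O(X)$ and $\tilde K \in \mathbf{K}$. A straightforward calculation using the identities $D\alpha_{\tilde K}(Y)\xi = \mathrm{Ad}_{\tilde K}\,\xi$ for $\xi \in T_Y O(X)$ and $\mathrm{Ad}_{\tilde K}(\mathrm{ad}_Y k) = \mathrm{ad}_{\mathrm{Ad}_{\tilde K} Y}(\mathrm{Ad}_{\tilde K} k)$ for all $k \in \mathfrak{k}$ yields the required invariance.

Part (f) follows immediately from Theorem 3.12 and the identity $\mathfrak{h}_Y = \mathrm{Ad}_K \mathfrak{h}_X$ for $Y = \mathrm{Ad}_K X$, which implies $\mathfrak{h}_Y^\perp = \mathrm{Ad}_K \mathfrak{p}$.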
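The two identities used in parts (c) and (d) of the proof can be checked numerically. The following is a minimal sketch, assuming NumPy is available and taking $\mathbf{K} = SU(3)$ with the Hilbert–Schmidt product $\langle a \,|\, b\rangle = \mathrm{Re}\,\mathrm{tr}(a^\dagger b)$ as bi-invariant metric; all function names are illustrative and not part of the original text.

```python
import numpy as np

def random_su(n, rng):
    """Random traceless skew-Hermitian matrix, i.e. an element of su(n)."""
    m = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    omega = (m - m.conj().T) / 2
    return omega - np.trace(omega) / n * np.eye(n)

def expm_skew(omega):
    """exp(omega) for skew-Hermitian omega, via the eigendecomposition of -i*omega."""
    w, v = np.linalg.eigh(-1j * omega)
    return (v * np.exp(1j * w)) @ v.conj().T

def Ad(u, x):
    """Adjoint action Ad_u x = u x u^dagger of a unitary u."""
    return u @ x @ u.conj().T

def ad(x, y):
    """ad_x y = [x, y]."""
    return x @ y - y @ x

def inner(a, b):
    """Hilbert-Schmidt product Re tr(a^dagger b), bi-invariant on su(n)."""
    return np.trace(a.conj().T @ b).real

rng = np.random.default_rng(0)
x, k, g, h = (random_su(3, rng) for _ in range(4))

# Part (c): d/dt Ad_{exp(tk)} x at t = 0 equals -ad_x k (central finite difference).
t = 1e-6
finite_diff = (Ad(expm_skew(t * k), x) - Ad(expm_skew(-t * k), x)) / (2 * t)
deviation_c = np.abs(finite_diff + ad(x, k)).max()

# Part (d): bi-invariance gives <ad_g h | k> = -<h | ad_g k>.
deviation_d = abs(inner(ad(g, h), k) + inner(h, ad(g, k)))
```

Both `deviation_c` and `deviation_d` should vanish up to discretization and rounding error.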


3.3.6. Gradient flows on Riemannian homogeneous spaces

Applying the previous results on gradient flows to quality functions $\hat f$ on Riemannian homogeneous spaces $\mathbf{G}/\mathbf{H}$, we obtain by the $\mathbf{G}$-invariance of the Riemannian metric, similar to (3.20), the gradient equality

\[ \mathrm{grad}\,\hat f([G]) = D\hat r_G([1])\,\mathrm{grad}(\hat f \circ \hat r_G)([1]) \tag{3.57} \]

for all $G \in \mathbf{G}$, where $\hat r_G$ denotes the mapping $[G'] \mapsto [G'G]$. Similar to Eq. (3.20), the gradient of $\hat f$ is therefore completely determined by

\[ G \mapsto \mathrm{grad}(\hat f \circ \hat r_G)([1]) \in \mathfrak{p}. \tag{3.58} \]

However, Eq. (3.58) does not induce a mapping from $\mathbf{G}/\mathbf{H}$ to $\mathfrak{p}$, as in general $\mathrm{grad}(\hat f \circ \hat r_G)([1]) \neq \mathrm{grad}(\hat f \circ \hat r_{HG})([1])$ for $H \in \mathbf{H} \setminus \{1\}$. Now, for analyzing the asymptotic behavior of

\[ [\dot G] = \mathrm{grad}\,\hat f([G]) \tag{3.59} \]

Sec. 3.1 provides again the appropriate tools. For instance, if $\mathbf{G}/\mathbf{H}$ is compact we have the following.

Corollary 3.5. Let $\mathbf{G}/\mathbf{H}$ be a compact Riemannian homogeneous space and let $\hat f : \mathbf{G}/\mathbf{H} \to \mathbb{R}$ be real analytic. Then any solution of Eq. (3.59) converges to a critical point of $\hat f$ for $t \to +\infty$.

Proof. This follows immediately from Proposition 3.1 and Theorem 3.1, as a Riemannian homogeneous space always constitutes a real analytic Riemannian manifold, cf. [99, 101].

Finally, we return to our starting point and ask for the relation between (3.59) and (3.21) in the case of an $\mathbf{H}$-equivariant quality function $f$. Then $f$ induces a quality function $\hat f$ on $\mathbf{G}/\mathbf{H}$ via

\[ \hat f([G]) := f(G) \tag{3.60} \]

for all $G \in \mathbf{G}$. Moreover, assume $\mathbf{G}$ carries a bi-invariant metric $\langle\cdot\,|\,\cdot\rangle$ and $\mathbf{G}/\mathbf{H}$ is a homogeneous space with reductive decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$ and $\mathfrak{p} := \mathfrak{h}^\perp$. This implies that the restriction of $\langle\cdot\,|\,\cdot\rangle$ to $\mathfrak{p} \times \mathfrak{p}$ is $\mathrm{Ad}_{\mathbf{H}}$-invariant. Now, the identity $\hat f \circ \Pi = f$ yields

\[ D\hat f([G]) \cdot D\Pi(G) = Df(G) \quad\text{for all } G \in \mathbf{G} \tag{3.61} \]

and hence

\[ (D\Pi(G))^{*}\,\mathrm{grad}\,\hat f([G]) = \mathrm{grad}\, f(G) \tag{3.62} \]

for all $G \in \mathbf{G}$, where $\Pi$ denotes the canonical projection and $(\cdot)^{*}$ the adjoint mapping. By identifying $\mathfrak{p}$ with the tangent space of $\mathbf{G}/\mathbf{H}$ at $[1]$, the map $D\Pi(1)$ represents the orthogonal projector $h + p \mapsto p$ for $h \in \mathfrak{h}$ and $p \in \mathfrak{p}$. Thus we obtain $D\Pi(1)(D\Pi(1))^{*} = \mathrm{id}_{\mathfrak{p}}$.


In the same way, using the identity $\Pi \circ r_G = \hat r_G \circ \Pi$, one shows $D\Pi(G)(D\Pi(G))^{*} = \mathrm{id}_{T_{[G]}\mathbf{G}/\mathbf{H}}$ for all $G \in \mathbf{G}$. Consequently, (3.62) yields

\[ \mathrm{grad}\,\hat f([G]) = D\Pi(G)\,\mathrm{grad}\, f(G) \tag{3.63} \]

for all $G \in \mathbf{G}$. Therefore, we have proven the following result:

Theorem 3.13. Suppose $\mathbf{G}/\mathbf{H}$ satisfies the above assumptions and $f : \mathbf{G} \to \mathbb{R}$ is an $\mathbf{H}$-equivariant quality function with induced quality function $\hat f : \mathbf{G}/\mathbf{H} \to \mathbb{R}$. Then the canonical projection of the gradient flow of Eq. (3.21) onto $\mathbf{G}/\mathbf{H}$ yields the gradient flow of Eq. (3.59), i.e. if $G(t)$ is a solution of Eq. (3.21) then $\Pi(G(t))$ is one of Eq. (3.59).

3.3.7. Discretized gradient flows on naturally reductive homogeneous spaces

As before, let $\mathbf{G}$ be a Lie group with bi-invariant metric and let $f$ be an equivariant quality function with respect to the closed subgroup $\mathbf{H}$, i.e. for all $H \in \mathbf{H}$ one has $f(G) = f(HG)$, so $f|_{\mathbf{H}G} = \text{constant}$ for every $G \in \mathbf{G}$. Moreover, assume that $\mathbf{G}/\mathbf{H}$ is a naturally reductive coset space. Implementing a gradient algorithm for the induced quality function $\hat f$ on $\mathbf{G}/\mathbf{H}$ finally yields the following recursion scheme

\[ [G_{k+1}] := [\exp(\alpha_k\,\mathrm{grad}\, f(G_k)\, G_k^{-1})\, G_k], \tag{3.64} \]

where $\alpha_k > 0$ denotes a suitable step size. This, however, is not surprising, as can be seen as follows. With $\mathbf{G}/\mathbf{H}$ being naturally reductive, there is the reductive decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$ with $\mathfrak{p} := \mathfrak{h}^\perp$, such that any $\Omega \in \mathfrak{g}$ decomposes uniquely into $\Omega = \Omega^{\mathfrak{h}} + \Omega^{\mathfrak{p}}$. Then the equivariance of $f$ guarantees that its gradient at $G \in \mathbf{G}$ is orthogonal to the coset $\mathbf{H}G$. Thus one finds

\[ \langle \mathrm{grad}\, f(G) \,|\, \Omega^{\mathfrak{h}} G \rangle_G = Df(G)\,\Omega^{\mathfrak{h}} G = 0 \]

for all $\Omega^{\mathfrak{h}} \in \mathfrak{h}$. Therefore, the "pullback" of the gradient of $f$ to $\mathfrak{g}$ satisfies $\mathrm{grad}\, f(G)\,G^{-1} \in \mathfrak{p}$. Furthermore, combining Eqs. (3.38) and (3.63) with the identity $D(\Pi \circ l_G)(1)\Omega = D\Pi(G)\,G\Omega$ for all $\Omega \in \mathfrak{g}$ (cf. Remark 3.3) yields

\[ \mathrm{grad}\,\hat f([G]) = D(\Pi \circ l_G)(1)\big(G^{-1}\,\mathrm{grad}\, f(G)\big). \]

Thus from Eq. (3.50) we finally obtain

\[ \exp_{[G]}\big(t\,\mathrm{grad}\,\hat f([G])\big) = [\exp(t\,\mathrm{grad}\, f(G)\,G^{-1})\,G] \]

for all $t \in \mathbb{R}$, where $\exp_{[G]}$ denotes the Riemannian exponential map at $[G]$, cf. Eqs. (2.10) and (2.11). This explains precisely why recursion scheme (3.64) resembles the corresponding one on the group level.


3.4. Examples

Often practically relevant quality functions take the form of a linear functional restricted to an adjoint orbit $O(X)$. For instance, in quantum dynamics the unitary orbit

\[ O(A) := \{UAU^\dagger \mid U \in SU(N)\} \tag{3.65} \]

of an initial state $A$ plays a central role, because it defines the largest reachability set under closed Hamiltonian dynamics. The feasible expectation values arise from such a linear functional, since they are projections onto an observable $C$ in the sense of the Hilbert–Schmidt scalar product. These expectation values can be generalized to arbitrary complex square matrices $A, C \in \mathbb{C}^{N\times N}$ such as to coincide with elements of the $C$-numerical range

\[ W(C, A) := \{\mathrm{tr}(C^\dagger UAU^\dagger) \mid U \in SU(N)\}. \tag{3.66} \]

As $C$-numerical ranges are well established in the mathematical literature [111, 112], in the sequel we will adopt the notation. Note that finding the maximum absolute value, i.e. the $C$-numerical radius

\[ r(C, A) := \max_{U \in SU(N)} |\mathrm{tr}\{C^\dagger UAU^\dagger\}| \tag{3.67} \]

is straightforward for Hermitian $A, C$ (it amounts to sorting the respective eigenvalues, cf. Corollary 3.8), while for arbitrary complex $A, C$ there is no general analytical solution. Moreover, when restricting to local unitary operations $K \in SU_{\rm loc}(2^n) := SU(2)^{\otimes n}$, the maximization task becomes non-trivial even for Hermitian $A, C$ [113, 114]. Having set the frame, we now illustrate the previous theory by gradient flows on the entire unitary group $SU(2^n)$, on the local unitary group $SU(2)^{\otimes n}$ as well as on their adjoint orbits.

3.4.1. Geometric optimization by gradient flows on SU(N)

Consider a fully controllable system $(\Sigma)$ on $SU(N)$ in the sense that the entire group $SU(N)$ can be generated by evolutions under the Hamiltonian of the system plus the available controls. If $A$ is an initial density operator or a matrix collecting its signal-relevant terms, then the reachable set to $A$ coincides with the orbit of the canonical (semi)group action of $(\Sigma)$ on $A$, which yields the entire unitary orbit $O(A)$, cf. Eq. (3.65). Recall that its "projection" on some observable $C$ (or its signal-relevant terms) forms the $C$-numerical range of $A$, cf. Eq. (3.66). In this setting, there are two geometric optimization tasks of particular practical relevance as they determine maximal signal intensity in coherent spectroscopy [27].

(a) Find all points on the unitary orbit of $A$ that minimize the Euclidean distance to $C$.


(b) Find all points on the unitary orbit of $A$ that minimize the angle to the 1-dimensional complex subspace spanned by $C$.

Clearly, the distance

\[ \|UAU^\dagger - C\|_2^2 = \|A\|_2^2 + \|C\|_2^2 - 2\,\mathrm{Re}\{\mathrm{tr}(C^\dagger UAU^\dagger)\} \tag{3.68} \]

is minimal if the overlap $\mathrm{Re}\{\mathrm{tr}(C^\dagger UAU^\dagger)\}$ is maximal. Moreover, making use of the definition of the angle between 1-dimensional complex subspaces

\[ \cos^2\big(\angle\{UAU^\dagger, C\}\big) := \frac{|\mathrm{tr}\{C^\dagger UAU^\dagger\}|^2}{\|A\|_2^2 \cdot \|C\|_2^2}, \tag{3.69} \]

problem (b) is equivalent to maximizing the function $|\mathrm{tr}(C^\dagger UAU^\dagger)|$. Its maximal value is the $C$-numerical radius of $A$ (see Eq. (3.67)). Obviously, $r_C(A) \leq \|A\|_2 \cdot \|C\|_2$ with equality if and only if $UAU^\dagger$ and $C$ are complex collinear for some $U \in SU(N)$. Note that the two tasks (a) and (b) are equivalent whenever the $C$-numerical range forms a circular disk in the complex plane (centred at the origin); conditions for circular symmetry have been characterized in [115].

Extending concepts of Brockett [17] from the orthogonal to the special unitary group [27, 28, 116], the above optimization problems (a) and (b) can be treated by the previously presented gradient-flow methods, cf. also [22, 23]. For fixed matrices $A, C \in \mathbb{C}^{N\times N}$ define

\[ f_1 : SU(N) \to \mathbb{R}, \qquad f_1(U) := \mathrm{Re}\,\mathrm{tr}(C^\dagger UAU^\dagger) \tag{3.70} \]

and

\[ f_2 : SU(N) \to \mathbb{R}, \qquad f_2(U) := |\mathrm{tr}(C^\dagger UAU^\dagger)|^2. \tag{3.71} \]

Observe that the distance problem (a) is solved by maximizing $f_1$, while the angle problem (b) is solved for maximal $f_2$. Now, the differential and the gradient of $f_1$ with respect to the bi-invariant Riemannian metric Eq. (3.77) are given by

\[ Df_1(U)(\Omega U) = \mathrm{Re}\,\mathrm{tr}([UAU^\dagger, C^\dagger]\Omega), \qquad \mathrm{grad}\, f_1(U) = [UAU^\dagger, C^\dagger]_S^\dagger\, U, \]

as will be illustrated in the worked example below. The differential and the gradient of $f_2$ can be obtained in the same manner as

\[ Df_2(U)(\Omega U) = \mathrm{tr}(C^\dagger UAU^\dagger)^{*} \cdot \mathrm{tr}([UAU^\dagger, C^\dagger]\Omega) - \mathrm{tr}(C^\dagger UAU^\dagger) \cdot \mathrm{tr}([UAU^\dagger, C^\dagger]^\dagger \Omega), \]

\[ \mathrm{grad}\, f_2(U) = 2\big(f_2(U)^{*} \cdot [UAU^\dagger, C^\dagger]\big)_S^\dagger\, U. \]

This yields the following result.

Theorem 3.14. The gradient systems of $f_\nu$, $\nu = 1, 2$ with respect to the bi-invariant Riemannian metric (3.77) are given by

\[ \dot U = \Omega_\nu(U)\,U \tag{3.72} \]


with

\[ \Omega_1(U) := [UAU^\dagger, C^\dagger]_S^\dagger \qquad\text{and}\qquad \Omega_2(U) := 2\big(f_2(U)^{*} \cdot [UAU^\dagger, C^\dagger]\big)_S^\dagger, \tag{3.73} \]

respectively. Each solution of (3.72) converges to a respective critical point for $t \to +\infty$. Thereby, the critical points of $f_\nu$ are characterized by $\Omega_\nu(U) = 0$, $\nu = 1, 2$.

Proof. The above computations immediately yield Eq. (3.72). As $f_\nu$, $\nu = 1, 2$ are real analytic, the convergence of each solution to a critical point is guaranteed by Proposition 3.1 and Theorem 3.1, cf. [116].

An implementable numerical integration scheme for the above gradient systems making use of the Riemannian exponential, see Eqs. (2.9) and (2.11), is given by

\[ U^{(\nu)}_{k+1} = \exp\big(\alpha^{(\nu)}_k\,\Omega_\nu(U^{(\nu)}_k)\big)\, U^{(\nu)}_k, \qquad U^{(\nu)}_0 = 1_N. \tag{3.74} \]

A suitable choice of step sizes $\alpha^{(\nu)}_k > 0$ ensuring convergence can be found in [27, 28, 116]. Generically, it drives $U^{(\nu)}_k$ into final states attaining the maxima of the quality functions $f_\nu$, $\nu = 1, 2$. However, there is no guarantee that the gradient flows always reach the global maxima. Standard numerical integration procedures such as, e.g., the Euler method are not applicable here as they would not preserve unitarity.

3.4.2. Worked example

We now derive the discretized integration scheme maximizing the quality function $f_1$ in all detail. To this end, recall that $SU(N)$ is a compact connected Lie group of real dimension $N^2 - 1$. Its Lie algebra, i.e. its tangent space at the identity, is given by the set $\mathfrak{su}(N)$ of all skew-Hermitian matrices $\Omega$ with $\mathrm{tr}\,\Omega = 0$, i.e.

\[ \mathfrak{su}(N) := \{\Omega \in \mathbb{C}^{N\times N} \mid \Omega^\dagger = -\Omega,\ \mathrm{tr}\,\Omega = 0\}. \tag{3.75} \]

So elements $\Omega \in \mathfrak{su}(N)$ relate to Hamiltonians $H$ via $\Omega = iH$. The tangent space at an arbitrary element $U \in SU(N)$ is

\[ T_U SU(N) = \mathfrak{su}(N)\,U = \{\Omega U \mid \Omega \in \mathfrak{su}(N)\}, \tag{3.76} \]

cf. Eq. (3.13). Moreover, let $SU(N)$ be endowed with the bi-invariant Riemannian metric

\[ \langle \Omega U \,|\, \Xi U \rangle_U := \mathrm{tr}(\Omega^\dagger \Xi), \tag{3.77} \]

defined on the tangent spaces $T_U SU(N)$, cf. Eq. (3.15). Now set

\[ F : SU(N) \to \mathbb{C}^{N\times N}, \quad F(U) := C^\dagger UAU^\dagger, \qquad f : SU(N) \to \mathbb{R}, \quad f(U) := \mathrm{Re}\,\mathrm{tr}\{C^\dagger UAU^\dagger\}. \]

For computing the tangent map of $F$, we exploit the fact that $SU(N)$ is an embedded submanifold of $\mathbb{C}^{N\times N}$. Therefore, the tangent map is obtained by restricting


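the ordinary Fréchet derivative $DF(U)$ to the tangent space $T_U SU(N)$, cf. Eqs. (3.4) and (3.5). Thus, by applying the product rule, one easily finds

\[ DF(U)(\Omega U) = C^\dagger \Omega UAU^\dagger + C^\dagger UA(\Omega U)^\dagger = C^\dagger \Omega UAU^\dagger - C^\dagger UAU^\dagger \Omega. \]

Now, the chain rule as well as the short-hand notations $\tilde A := UAU^\dagger$ and $[\cdot,\cdot]_S$ to denote the skew-Hermitian part of the commutator $[\cdot,\cdot]$ give

\begin{align*}
Df(U)(\Omega U) &= D(\mathrm{Re}\,\mathrm{tr})(F(U)) \circ DF(U)(\Omega U) \\
&= \mathrm{Re}\,\mathrm{tr}\{C^\dagger \Omega \tilde A - C^\dagger \tilde A \Omega\} = \mathrm{Re}\,\mathrm{tr}\{[\tilde A, C^\dagger]\Omega\} = \mathrm{tr}\{[\tilde A, C^\dagger]_S\,\Omega\} \\
&= \langle [\tilde A, C^\dagger]_S^\dagger \,|\, \Omega \rangle = \langle [\tilde A, C^\dagger]_S^\dagger\, U \,|\, \Omega U \rangle,
\end{align*}

where the last identity explicitly invokes the right-invariance of the Riemannian metric on $SU(N)$, cf. Eq. (3.77). Next, identifying the above expression with

\[ Df(U)(\Omega U) = \langle \mathrm{grad}\, f(U) \,|\, \Omega U \rangle \tag{3.78} \]

one gets the gradient vector field

\[ \mathrm{grad}\, f(U) = [\tilde A, C^\dagger]_S^\dagger\, U \tag{3.79} \]

and thus the gradient system

\[ \dot U = \mathrm{grad}\, f(U) = -[\tilde A, C^\dagger]_S\, U. \tag{3.80} \]

By the Riemannian exponential, see Eqs. (2.9) and (2.11), and with $\alpha_k \geq 0$ as an appropriate step size we finally arrive at the discretization

\[ U_{k+1} = e^{-\alpha_k [U_k A U_k^\dagger,\, C^\dagger]_S}\, U_k. \tag{3.81} \]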
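The recursion (3.81) can be tried out numerically. The following is a minimal sketch, assuming NumPy is available; the function names are illustrative, and the constant step size $\alpha = 1/(4\|A\|\,\|C\|)$ (spectral norms) is an assumption made here for simplicity, not the step-size rule of [27, 28, 116]. It suffices for a monotone increase of $f$ because the second derivative along the search direction $\Omega$ is bounded by $4\|A\|\,\|C\|\,\|\Omega\|_2^2$.

```python
import numpy as np

def skew_part(m):
    """Skew-Hermitian part (m - m^dagger) / 2."""
    return (m - m.conj().T) / 2

def expm_skew(omega):
    """exp(omega) for skew-Hermitian omega, via the eigendecomposition of -i*omega."""
    w, v = np.linalg.eigh(-1j * omega)
    return (v * np.exp(1j * w)) @ v.conj().T

def f1(u, a, c):
    """Quality function f(U) = Re tr(C^dagger U A U^dagger)."""
    return np.trace(c.conj().T @ u @ a @ u.conj().T).real

def gradient_ascent(a, c, steps=200, alpha=None):
    """Iterate U_{k+1} = exp(-alpha [U_k A U_k^dagger, C^dagger]_S) U_k from U_0 = 1."""
    n = a.shape[0]
    u = np.eye(n, dtype=complex)
    if alpha is None:
        # Assumed constant step size (not taken from the source).
        alpha = 1.0 / (4 * np.linalg.norm(a, 2) * np.linalg.norm(c, 2))
    values = [f1(u, a, c)]
    for _ in range(steps):
        a_tilde = u @ a @ u.conj().T
        c_dag = c.conj().T
        omega = -skew_part(a_tilde @ c_dag - c_dag @ a_tilde)
        u = expm_skew(alpha * omega) @ u  # exact exponential keeps U unitary
        values.append(f1(u, a, c))
    return u, values

def random_hermitian(n, rng):
    m = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (m + m.conj().T) / 2

rng = np.random.default_rng(1)
A, C = random_hermitian(4, rng), random_hermitian(4, rng)
U, values = gradient_ascent(A, C)
```

The list `values` records $f(U_k)$ along the iteration; in contrast to an Euler step $U_k + \alpha_k\,\mathrm{grad}\, f(U_k)$, every iterate stays on $SU(N)$ up to rounding error.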

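3.4.3. Gradient flows on the local subgroup $SU_{\rm loc}(2^n)$

The quality functions introduced in the previous subsection may be restricted to the subgroup of local action, i.e. to

\[ SU_{\rm loc}(2^n) := \underbrace{SU(2) \otimes \cdots \otimes SU(2)}_{n\text{-times}} \subset SU(2^n). \tag{3.82} \]

Let the Pauli matrices be defined as

\[ \sigma_x := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y := \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{3.83} \]

Moreover, the $\sigma_{k,\alpha}$, $\alpha \in \{x, y, z\}$ are defined by

\[ \sigma_{k,\alpha} := 1_2 \otimes \cdots \otimes 1_2 \otimes \sigma_\alpha \otimes 1_2 \otimes \cdots \otimes 1_2, \tag{3.84} \]

where the term $\sigma_\alpha$ appears in the $k$th position of the Kronecker product and $1_2$ denotes the $2\times 2$ identity matrix.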
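The definition (3.84) translates directly into repeated Kronecker products. The following is a minimal sketch, assuming NumPy is available; the function name `sigma` and the 1-based position argument `k` are illustrative choices, not notation fixed by the text beyond Eq. (3.84).

```python
import numpy as np
from functools import reduce

PAULI = {
    "x": np.array([[0, 1], [1, 0]], dtype=complex),
    "y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def sigma(k, alpha, n):
    """sigma_{k,alpha}: Pauli matrix alpha at position k (1-based) of n tensor factors."""
    factors = [np.eye(2, dtype=complex)] * n
    factors[k - 1] = PAULI[alpha]
    return reduce(np.kron, factors)

# Example: for n = 3 qubits, sigma_{2,z} is the 8 x 8 matrix 1_2 (x) sigma_z (x) 1_2.
s2z = sigma(2, "z", 3)
```

With respect to the Hilbert–Schmidt product the $3n$ matrices $\sigma_{k,\alpha}$ are mutually orthogonal, each with squared norm $\mathrm{tr}(\sigma_{k,\alpha}^\dagger \sigma_{k,\alpha}) = 2^n$.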


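The Lie subalgebra to $SU_{\rm loc}(2^n) \subset SU(2^n)$ can be specified by

\[ \mathfrak{su}_{\rm loc}(2^n) := \Big\{ \sum_{k=1}^{n} 1_2 \otimes \cdots \otimes 1_2 \otimes \Omega_k \otimes 1_2 \otimes \cdots \otimes 1_2 \;\Big|\; \Omega_k \in \mathfrak{su}(2) \Big\}, \]

with the term $\Omega_k \in \mathfrak{su}(2)$ appearing at the $k$th position, cf. Eq. (3.84). Then the tangent space of $SU_{\rm loc}(2^n)$ at an arbitrary element $U$ is given by

\[ T_U SU_{\rm loc}(2^n) = \{\Omega U \mid \Omega \in \mathfrak{su}_{\rm loc}(2^n)\}. \tag{3.85} \]

Finally, $SU_{\rm loc}(2^n)$ is endowed with the bi-invariant Riemannian metric induced by $SU(2^n)$, i.e.

\[ \langle \Omega U \,|\, \Xi U \rangle_U := \mathrm{tr}(\Omega^\dagger \Xi) \tag{3.86} \]

for $\Omega U, \Xi U \in T_U SU_{\rm loc}(2^n)$.

Lemma 3.3. Let $\mathbf{H} \subset GL(N, \mathbb{C})$ be any closed subgroup with Lie algebra $\mathfrak{h} \subset \mathfrak{gl}(N, \mathbb{C}) := \mathbb{C}^{N\times N}$. Moreover, let $h_1, \dots, h_m$ be a real orthonormal basis of $\mathfrak{h}$ with respect to the real scalar product

\[ (g_1 \,|\, g_2) := \mathrm{Re}\,\mathrm{tr}(g_1^\dagger g_2), \qquad g_1, g_2 \in \mathbb{C}^{N\times N}, \tag{3.87} \]

i.e. $\mathrm{span}_{\mathbb{R}}\{h_1, \dots, h_m\} = \mathfrak{h}$ and $(h_i \,|\, h_j) = \delta_{ij}$.

(a) Then the orthogonal projection $P : \mathbb{C}^{N\times N} \to \mathbb{C}^{N\times N}$ onto $\mathfrak{h}$ is given by

\[ g \mapsto P g := \sum_{j=1}^{m} \mathrm{Re}\,\mathrm{tr}\{h_j^\dagger g\}\, h_j. \tag{3.88} \]

(b) The orthogonal projection $P^\perp : \mathbb{C}^{N\times N} \to \mathbb{C}^{N\times N}$ onto the orthogonal complement $\mathfrak{h}^\perp$ is given by $g \mapsto P^\perp g = g - P g$.

Proof. Both (a) and (b) are basic and well-known facts from linear algebra.

Remark 3.7. For the unitary case, i.e. for $\mathfrak{h} \subset \mathfrak{su}(N)$, the real part in Eq. (3.88) can be neglected and the projector $P$ can be rewritten in the more convenient matrix form

\[ P := \sum_{j=1}^{m} \mathrm{vec}(h_j)\,\mathrm{vec}(h_j)^\dagger, \tag{3.89} \]

where the terms $\mathrm{vec}(h_j)\,\mathrm{vec}(h_j)^\dagger$ represent the rank-1 projectors $P_j = |h_j\rangle\langle h_j|$ in vec-notation.

Corollary 3.6. The orthogonal projection $P : \mathbb{C}^{N\times N} \to \mathbb{C}^{N\times N}$ onto $\mathfrak{su}_{\rm loc}(2^n)$ with respect to (3.87) is given by

\[ P g := \frac{1}{2^n} \sum_{k=1}^{n} \Big( \mathrm{Re}(\mathrm{tr}(g^\dagger X_k))\,X_k + \mathrm{Re}(\mathrm{tr}(g^\dagger Y_k))\,Y_k + \mathrm{Re}(\mathrm{tr}(g^\dagger Z_k))\,Z_k \Big), \]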


where $X_k$, $Y_k$ and $Z_k$ are defined by, cf. Eq. (3.84),

\[ X_k := i\sigma_{k,x}, \qquad Y_k := i\sigma_{k,y}, \qquad Z_k := i\sigma_{k,z}. \]
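The projection of Corollary 3.6 can be written out in a few lines. The following is a minimal sketch, assuming NumPy is available; `local_basis` and `project_local` are illustrative names, and the example matrix is an arbitrary choice made here.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),    # sigma_x
    np.array([[0, -1j], [1j, 0]], dtype=complex), # sigma_y
    np.array([[1, 0], [0, -1]], dtype=complex),   # sigma_z
]

def local_basis(n):
    """The 3n matrices X_k, Y_k, Z_k = i * sigma_{k,alpha} spanning su_loc(2^n)."""
    basis = []
    for k in range(n):
        for s in PAULI:
            m = np.ones((1, 1), dtype=complex)
            for pos in range(n):
                m = np.kron(m, s if pos == k else I2)
            basis.append(1j * m)
    return basis

def project_local(g, n):
    """P g = 2^{-n} * sum_b Re tr(g^dagger b) b over the basis of su_loc(2^n)."""
    return sum(np.trace(g.conj().T @ b).real * b for b in local_basis(n)) / 2 ** n

# Example: i * sigma_x (x) sigma_x lies in su(4) but is orthogonal to su_loc(4),
# so its projection vanishes.
nonlocal_element = 1j * np.kron(PAULI[0], PAULI[0])
projected = project_local(nonlocal_element, 2)
```

The prefactor $2^{-n}$ compensates for the fact that the basis elements are orthogonal but have squared norm $2^n$ rather than 1 with respect to (3.87).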

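Proof. This follows straightforwardly from the orthogonality of the set $\{X_k, Y_k, Z_k \mid k = 1, \dots, n\}$ and Lemma 3.3.

Theorem 3.15. Let $f_{\rm loc}$ be the restriction of (3.70) to $SU_{\rm loc}(2^n)$.

(a) The gradient of $f_{\rm loc}$ with respect to (3.86) and the corresponding gradient system are given by

\[ \mathrm{grad}\, f_{\rm loc}(U) = P([C^\dagger, UAU^\dagger])\,U \tag{3.90} \]

and

\[ \dot U = P([C^\dagger, UAU^\dagger])\,U, \tag{3.91} \]

where $P$ denotes the orthogonal projection $P : \mathfrak{gl}(2^n, \mathbb{C}) \to \mathfrak{gl}(2^n, \mathbb{C})$ onto $\mathfrak{su}_{\rm loc}(2^n)$. More explicitly, (3.91) is equivalent to a system of $n$ coupled equations

\[ \dot U_k = \Omega_k U_k, \qquad k = 1, \dots, n \tag{3.92} \]

on $SU(2)$, where

\[ \Omega_k = \frac{1}{2^n}\Big( \mathrm{Re}(\mathrm{tr}([C^\dagger, UAU^\dagger]^\dagger X_k))\,X + \mathrm{Re}(\mathrm{tr}([C^\dagger, UAU^\dagger]^\dagger Y_k))\,Y + \mathrm{Re}(\mathrm{tr}([C^\dagger, UAU^\dagger]^\dagger Z_k))\,Z \Big). \]

Each solution of (3.91) converges for $t \to \pm\infty$ to a critical point of $f_{\rm loc}$ characterized by

\[ P([C^\dagger, UAU^\dagger]) = 0. \tag{3.93} \]

(b) The Hessian form $\mathrm{Hess}\, f_{\rm loc}(U)$ and the Hessian operator $\mathbf{Hess}\, f_{\rm loc}(U)$ of $f_{\rm loc}$ at $U$ are given by

\[ \mathrm{Hess}\, f_{\rm loc}(U)(\Omega U, \Xi U) = \frac{1}{2}\Big( \mathrm{Re}(\mathrm{tr}(\Omega^\dagger [C^\dagger, [\Xi, UAU^\dagger]])) + \mathrm{Re}(\mathrm{tr}(\Omega^\dagger [UAU^\dagger, [\Xi, C^\dagger]])) \Big) \tag{3.94} \]

and

\[ \mathbf{Hess}\, f_{\rm loc}(U)\,\Omega U = (S(U)\Omega)\,U, \tag{3.95} \]

respectively, with $\Omega \in \mathfrak{su}_{\rm loc}(2^n)$ and

\[ S(U)\Omega := \frac{1}{2}\, P\big([C^\dagger, [\Omega, UAU^\dagger]] + [UAU^\dagger, [\Omega, C^\dagger]]\big). \]

(c) For all initial points $U_0 \in SU_{\rm loc}(2^n)$ the discretization scheme

\[ U_{k+1} := \exp\big(\alpha_k\, P([C^\dagger, U_k A U_k^\dagger])\big)\, U_k \tag{3.96} \]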


with step size

\[ \alpha_k = \frac{\|P([C^\dagger, U_k A U_k^\dagger])\|^2}{\|[C^\dagger, P([C^\dagger, U_k A U_k^\dagger])]\| \cdot \|[P([C^\dagger, U_k A U_k^\dagger]), U_k A U_k^\dagger]\|} \tag{3.97} \]

converges to the set of critical points of $f_{\rm loc}$.

Proof. The subsequent arguments follow our conference report [117], which also contains a complete proof for the flow on the entire groups such as $SU(2^n)$.

(a) Since $SU_{\rm loc}(2^n)$ is a closed subgroup of $SU(2^n)$, it is also an embedded Lie subgroup and thus a submanifold of $SU(2^n)$, cf. Remark 3.2. Therefore, the gradient of $f_{\rm loc}$ is well-defined by (3.4). Furthermore, by (3.23) and (3.73) we obtain

\[ \mathrm{grad}\, f_{\rm loc}(U) = P(\mathrm{grad}\, f_1(U)) = P([UAU^\dagger, C^\dagger]^\dagger)\,U = P([C^\dagger, UAU^\dagger])\,U, \]

where the last equality follows from $P([UAU^\dagger, C^\dagger]^\dagger) = -P([UAU^\dagger, C^\dagger])$ and the skew-symmetry of the commutator. Moreover, Eq. (3.92) is derived by Corollary 3.6 and the identity

\[ \frac{d}{dt}\big(U_1(t) \otimes \cdots \otimes U_n(t)\big) = \Big( \sum_{k=1}^{n} 1_2 \otimes \cdots \otimes \dot U_k(t) U_k^{-1}(t) \otimes \cdots \otimes 1_2 \Big) \times \big(U_1(t) \otimes \cdots \otimes U_n(t)\big). \]

Compactness of $SU_{\rm loc}(2^n)$ and real analyticity of $f_{\rm loc}$ imply that each solution converges to a critical point for $t \to +\infty$, cf. Proposition 3.1 and Theorem 3.1.

(b) By (3.9), the Hessian of $f_{\rm loc}$ at $U$ is determined by evaluating the second derivative of $\varphi := f \circ \gamma$ at $t = 0$, where $\gamma$ is any geodesic. This yields

\[ \mathrm{Hess}\, f_{\rm loc}(U)(\Omega U, \Omega U) := \varphi''(0) = \mathrm{Re}(\mathrm{tr}(C^\dagger [\Omega, [\Omega, UAU^\dagger]])), \tag{3.98} \]

for $\Omega \in \mathfrak{su}_{\rm loc}(2^n)$. The Hessian is then obtained from the quadratic form (3.98) by a standard polarisation argument, Eq. (3.8), i.e.

\[ \mathrm{Hess}\, f_{\rm loc}(U)(\Omega U, \Xi U) = \frac{1}{2}\Big( \mathrm{Re}(\mathrm{tr}(C^\dagger [\Omega, [\Xi, UAU^\dagger]])) + \mathrm{Re}(\mathrm{tr}(C^\dagger [\Xi, [\Omega, UAU^\dagger]])) \Big). \]

Finally, by the identity $\mathrm{tr}([X, Y]Z) = -\mathrm{tr}(Y[X, Z])$ we conclude

\[ \mathrm{Hess}\, f_{\rm loc}(U)(\Omega U, \Xi U) = \frac{1}{2}\Big( \mathrm{Re}(\mathrm{tr}(\Omega^\dagger [C^\dagger, [\Xi, UAU^\dagger]])) + \mathrm{Re}(\mathrm{tr}(\Omega^\dagger [UAU^\dagger, [\Xi, C^\dagger]])) \Big). \]

Therefore, the Hessian operator of $f_{\rm loc}$ at $U$ is given by $\mathbf{Hess}\, f_{\rm loc}(U)\,\Omega U = (S(U)\Omega)\,U$ with $\Omega \in \mathfrak{su}_{\rm loc}(2^n)$ and

\[ S(U)\Omega := \frac{1}{2}\, P\big([C^\dagger, [\Omega, UAU^\dagger]] + [UAU^\dagger, [\Omega, C^\dagger]]\big). \]


(c) Estimating the second derivative

\[ \varphi''(t) = \mathrm{Re}(\mathrm{tr}([C^\dagger, \Omega][\Omega, e^{t\Omega} UAU^\dagger e^{-t\Omega}])) \]

for $\Omega := \mathrm{grad}\, f_{\rm loc}(U) = P([C^\dagger, UAU^\dagger])$ and $U \in SU_{\rm loc}(2^n)$ yields

\[ |\varphi''(t)| \leq \|[C^\dagger, \Omega]\| \cdot \|[\Omega, e^{t\Omega} UAU^\dagger e^{-t\Omega}]\| = \|[C^\dagger, \Omega]\| \cdot \|[\Omega, UAU^\dagger]\|. \]

Therefore, we get the estimate

\[ \max_{t \geq 0} \Big| \frac{d^2}{dt^2} f_{\rm loc}(\exp_U(\Omega t)) \Big| \leq \|[C^\dagger, \Omega]\| \cdot \|[\Omega, UAU^\dagger]\| \]

for $\Omega := \mathrm{grad}\, f_{\rm loc}(U)$. Now, a standard Lyapunov-type argument, similar to the proof of Theorem 3.3 in [22], yields the desired result.

For similar discretization schemes in different contexts or other intrinsic Riemannian methods see also [19, 22, 27, 118].

3.4.4. Double-bracket flows as gradient flows on naturally reductive homogeneous spaces

The well-known double-bracket flows have established themselves as useful tools for diagonalizing matrices (usually real symmetric ones) as well as for sorting lists [17, 19, 22, 23]. Moreover, they relate to Hamiltonian integrable systems [119, 120]. (Note again that in many-particle physics gradient flows were later introduced independently for diagonalizing Hamiltonians [51, 52].) In summarizing the most important results we show that double-bracket flows can be viewed as special cases of gradient flows on naturally reductive homogeneous spaces $\mathbf{G}/\mathbf{H}$ in terms of Sec. 3.3, where $\mathbf{H}$ is a stabilizer group, which is typically not normal. Then the homogeneous space $\mathbf{G}/\mathbf{H}$ does not constitute a group itself.

Let $O(A)$ as in Eq. (3.65) denote the unitary orbit of some $A \in \mathbb{C}^{N\times N}$. Note that the adjoint action $(U, A) \mapsto \mathrm{Ad}_U A := UAU^\dagger$ of $SU(N)$ constitutes a left action on the Lie algebra $\mathfrak{g} := \mathbb{C}^{N\times N}$. However, this should not cause any confusion for the reader, since the key result we refer to, Corollary 3.4, was presented for left actions. Let $C \in \mathbb{C}^{N\times N}$ be another complex matrix. For minimizing the (squared) Euclidean distance $\|X - C\|_2^2$ between $C$ and the unitary orbit of $A$ we derive a gradient flow maximizing the target function

\[ \hat f(X) := \mathrm{Re}\,\mathrm{tr}\{C^\dagger X\} \tag{3.99} \]

over $X \in O(A)$. Clearly, this is but an alternative to tackling the problem by a gradient flow on the unitary group, since as in Sec. 3.3 we have the equivalence

\[ \max_{X \in O(A)} \hat f(X) = \max_{U \in SU(N)} f(U) \tag{3.100} \]

for $f(U) := \mathrm{Re}\,\mathrm{tr}\{C^\dagger UAU^\dagger\}$.


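Building upon Corollary 3.4, we have the following facts: $O(A)$ constitutes a compact and connected naturally reductive homogeneous space isomorphic to $SU(N)/\mathbf{H}$. Here,

\[ \mathbf{H} := \{U \in SU(N) \mid \mathrm{Ad}_U A = A\} \tag{3.101} \]

denotes the stabilizer group of $A$. Recalling that the Lie algebra of $SU(N)$ is $\mathfrak{su}(N)$, we further obtain for the tangent space of $O(A)$ at $X = \mathrm{Ad}_U A$ the form

\[ T_X O(A) = \{\mathrm{ad}_X \Omega \mid \Omega \in \mathfrak{su}(N)\} \tag{3.102} \]

with $\mathrm{ad}_X \Omega := [X, \Omega]$. Moreover, the kernel of $\mathrm{ad}_A : \mathfrak{su}(N) \to \mathfrak{g}$ reads

\[ \mathfrak{h} = \{\Omega \in \mathfrak{su}(N) \mid [A, \Omega] = 0\} \tag{3.103} \]

and forms the Lie subalgebra to $\mathbf{H}$. Now, by the standard Hilbert–Schmidt scalar product $(\Omega_1, \Omega_2) \mapsto \mathrm{tr}\{\Omega_1^\dagger \Omega_2\}$ on $\mathfrak{su}(N)$ one can define the ortho-complement to the above kernel as

\[ \mathfrak{p} := \mathfrak{h}^\perp. \tag{3.104} \]

This induces a unique decomposition of any skew-Hermitian matrix $\Omega = \Omega^{\mathfrak{h}} + \Omega^{\mathfrak{p}}$ with $\Omega^{\mathfrak{h}} \in \mathfrak{h}$ and $\Omega^{\mathfrak{p}} \in \mathfrak{p}$. Finally, we obtain an $\mathrm{Ad}_{SU(N)}$-invariant Riemannian metric on $O(A)$ via

\[ \langle \mathrm{ad}_X(\mathrm{Ad}_U \Omega_1) \,|\, \mathrm{ad}_X(\mathrm{Ad}_U \Omega_2) \rangle_X := \mathrm{tr}\{\Omega_1^{\mathfrak{p}\,\dagger}\, \Omega_2^{\mathfrak{p}}\} \tag{3.105} \]

for $X := \mathrm{Ad}_U A$, which is equivalent to saying

\[ \langle \mathrm{ad}_X(\Omega_1) \,|\, \mathrm{ad}_X(\Omega_2) \rangle_X := \mathrm{tr}\{\Omega_1^{\mathfrak{p}_X\,\dagger}\, \Omega_2^{\mathfrak{p}_X}\} \tag{3.106} \]

with $\mathfrak{p}_X := \mathrm{Ad}_U \mathfrak{p}$. Now, the main results on double-bracket flows read as follows:

Theorem 3.16. Set $\hat f : O(A) \to \mathbb{R}$, $\hat f(X) := \mathrm{Re}\,\mathrm{tr}\{C^\dagger X\}$. Then one finds

(a) The gradient of $\hat f$ with respect to the Riemannian metric defined by Eq. (3.105) is given by

\[ \mathrm{grad}\,\hat f(X) = [X, [X, C^\dagger]_S], \tag{3.107} \]

where $[X, C^\dagger]_S$ denotes the skew-Hermitian part of $[X, C^\dagger]$.

(b) The gradient flow

\[ \dot X = \mathrm{grad}\,\hat f(X) = [X, [X, C^\dagger]_S] \tag{3.108} \]

defines an isospectral flow on $O(A) \subset \mathfrak{g}$. The solutions exist for all $t \geq 0$ and converge to a critical point $X_\infty$ of $\hat f(X)$ characterized by $[X_\infty, C^\dagger]_S = 0$.

Proof. (A detailed proof for the real case can be found in [22]; for an abstract Lie algebraic version see also [19].)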


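(a) For $X = \mathrm{Ad}_U A$ and $\xi = \mathrm{ad}_X \Omega \in T_X O(A)$ we obtain

$$Df(X)\,\mathrm{ad}_X \Omega = \frac{d}{dt}\operatorname{Re}\operatorname{tr}\{C^\dagger e^{-t\Omega} X e^{t\Omega}\}\Big|_{t=0} = \operatorname{Re}\operatorname{tr}\{C^\dagger\, \mathrm{ad}_X \Omega\}.$$

Therefore, the gradient of $f$ has to satisfy

$$\operatorname{Re}\operatorname{tr}\{C^\dagger\, \mathrm{ad}_X \Omega^{\mathfrak{p}_X}\} = \langle \operatorname{grad} f(X) \mid \mathrm{ad}_X \Omega^{\mathfrak{p}_X}\rangle_X$$

for all $\Omega^{\mathfrak{p}_X} \in \mathfrak{p}_X$. Applying Eq. (3.105) to $X = A$ gives

$$\operatorname{Re}\operatorname{tr}\{C^\dagger\, \mathrm{ad}_A \Omega^{\mathfrak{p}}\} = \operatorname{tr}\{(\Gamma^{\mathfrak{p}})^\dagger\, \Omega^{\mathfrak{p}}\}$$

for all $\Omega^{\mathfrak{p}} \in \mathfrak{p}$, where $\Gamma^{\mathfrak{p}}$ is defined by $\operatorname{grad} f(A) = \mathrm{ad}_A \Gamma^{\mathfrak{p}}$ with $\Gamma^{\mathfrak{p}} \in \mathfrak{p}$. Thus we finally arrive at

$$\operatorname{tr}\{(\mathrm{ad}_{A^\dagger} C)_S^\dagger\, \Omega^{\mathfrak{p}}\} = \operatorname{tr}\{(\Gamma^{\mathfrak{p}})^\dagger\, \Omega^{\mathfrak{p}}\}$$

for all $\Omega^{\mathfrak{p}} \in \mathfrak{p}$, where $(\mathrm{ad}_{A^\dagger} C)_S$ denotes the skew-Hermitian part of $\mathrm{ad}_{A^\dagger} C$. Hence, $\Gamma^{\mathfrak{p}} = (\mathrm{ad}_{A^\dagger} C)_S^{\mathfrak{p}}$. Moreover, for $\Omega^{\mathfrak{h}} \in \mathfrak{h}$, we have

$$\operatorname{tr}\{(\mathrm{ad}_{A^\dagger} C)^\dagger\, \Omega^{\mathfrak{h}}\} = -\operatorname{tr}\{(\mathrm{ad}_A C^\dagger)\, \Omega^{\mathfrak{h}}\} = \operatorname{tr}\{C^\dagger\, \mathrm{ad}_A \Omega^{\mathfrak{h}}\} = 0.$$

Hence, $(\mathrm{ad}_{A^\dagger} C)_S \in \mathfrak{p}$ and therefore

$$\operatorname{grad} f(A) = \mathrm{ad}_A (\mathrm{ad}_{A^\dagger} C)_S = [A, [A, C^\dagger]_S].$$

The same arguments apply to $X = \mathrm{Ad}_U A$ and thus $\operatorname{grad} f(X) = [X, [X, C^\dagger]_S]$.

(b) Since Eq. (3.107) evolves on the unitary orbit of $A$, the associated flow is isospectral by construction. The compactness of $O(A)$ then implies that each solution $X(t)$ of Eq. (3.107) exists for all $t \ge 0$ and converges to the set of critical points, cf. Proposition 3.1. Moreover, from Theorem 3.1 we derive that $X(t)$ actually converges to a single critical point $X_\infty$ of $f$, i.e. to a point $X_\infty$ which satisfies

$$[X_\infty, [X_\infty, C^\dagger]_S] = 0. \tag{3.109}$$

Since $[X_\infty, C^\dagger]_S \in \mathfrak{p}_{X_\infty}$, Eq. (3.109) is equivalent to $[X_\infty, C^\dagger]_S = 0$.

In order to obtain a numerical algorithm for maximizing $f$ one can discretize the continuous-time gradient flow (3.107) as in the previous examples via

$$X_{k+1} = e^{-\alpha_k [X_k, C^\dagger]_S}\, X_k\, e^{\alpha_k [X_k, C^\dagger]_S} \tag{3.110}$$

with appropriate step sizes $\alpha_k > 0$. Note that Eq. (3.110) heavily exploits the fact that the adjoint orbit $O(A)$ constitutes a naturally reductive homogeneous space and thus the knowledge on its geodesics, cf. Corollary 3.4.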
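The recursion (3.110) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used for the results in this paper: the helper names (`skew_part`, `expm_skew`, `double_bracket_flow`, `target`), the 3 × 3 test matrices, and the conservative constant step size $\alpha = 1/(4\|A\|\,\|C\|)$ are all choices made here for the example.

```python
import numpy as np

def skew_part(M):
    """Skew-Hermitian part M_S = (M - M^dagger) / 2."""
    return 0.5 * (M - M.conj().T)

def expm_skew(S):
    """exp(S) for skew-Hermitian S, using the Hermitian matrix iS = V diag(w) V^dagger."""
    w, V = np.linalg.eigh(1j * S)
    return (V * np.exp(-1j * w)) @ V.conj().T

def double_bracket_flow(X0, C, steps=4000, alpha=None):
    """Iterate X_{k+1} = e^{-a Omega_k} X_k e^{a Omega_k} with Omega_k = [X_k, C^dagger]_S."""
    Cd = C.conj().T
    if alpha is None:
        # assumed constant step size (our choice, not prescribed by the text)
        alpha = 1.0 / (4.0 * np.linalg.norm(X0) * np.linalg.norm(C))
    X = X0.astype(complex)
    for _ in range(steps):
        Omega = skew_part(X @ Cd - Cd @ X)
        E = expm_skew(alpha * Omega)
        X = E.conj().T @ X @ E          # E^dagger equals e^{-alpha Omega}
    return X

def target(X, C):
    """Target function f(X) = Re tr{C^dagger X}."""
    return float(np.real(np.trace(C.conj().T @ X)))

# Illustrative data: A has spectrum {1, 2, 3}, rotated by a fixed random unitary.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
A = Q @ np.diag([1.0, 2.0, 3.0]) @ Q.conj().T
C = np.diag([3.0, 2.0, 1.0])
X_final = double_bracket_flow(A, C)
```

Because every step is a unitary conjugation, the spectrum of `X_final` agrees with that of `A` up to rounding, which is the isospectral property the exponential update is designed to keep.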


Remark 3.8. As an alternative to Eq. (3.110), taking the standard Euler-type iteration

$$X_{k+1} = X_k + \alpha_k [X_k, [X_k, C^\dagger]_S] \tag{3.111}$$

does not retain the isospectral nature of the flow. Therefore, it should only be used as a computationally inexpensive, rough scheme in the neighborhood of equilibrium points, if at all.

For $A, C$ complex Hermitian (real symmetric) and the full unitary (or orthogonal) group or its respective orbit the gradient flow (3.107) is well understood, cf. Corollary 3.8. However, for non-Hermitian $A$ and $C$, the nature of the flow and in particular the critical points have not been analyzed in depth, because the Hessian at critical points is difficult to come by. Even for $A, C$ Hermitian, a full critical point analysis becomes non-trivial as soon as the flow is restricted to a closed and connected subgroup $K \subset SU(N)$. Nevertheless, the techniques from Theorem 3.16 can be taken over to establish a gradient flow and a respective gradient algorithm on the orbit $O_K$ in a straightforward manner.

Corollary 3.7. The gradient flow of Eq. (3.107) restricts to the subgroup orbit $O_K(A) := \{KAK^\dagger \mid K \in K \subset SU(N)\}$ by taking the respective orthogonal projection $P_{\mathfrak{k}}$ onto the subalgebra $\mathfrak{k} \subset \mathfrak{su}(N)$ of $K$ instead of projecting onto the skew-Hermitian part, i.e. $\dot X = [X, P_{\mathfrak{k}}[X, C^\dagger]]$. With step sizes $\alpha_k > 0$ the corresponding discrete integration scheme reads

$$X_{k+1} = e^{-\alpha_k P_{\mathfrak{k}}[X_k, C^\dagger]}\, X_k\, e^{\alpha_k P_{\mathfrak{k}}[X_k, C^\dagger]}. \tag{3.112}$$

In view of unifying the interpretation of unitary networks, e.g., for the task of computing ground states of quantum mechanical Hamiltonians $H \equiv A$, the double-bracket flows for complex Hermitian $A, C$ on the full unitary orbit $O_u(A)$ as well as on the subgroup orbits $O_K(A)$ for different partitionings brought about by $K := \{K \in SU(N_1) \otimes SU(N_2) \otimes \cdots \otimes SU(N_r) \mid \prod_{j=1}^{r} N_j = 2^n\}$ have shifted into focus [36]. Therefore, we have given the foundations for the recursive schemes of Eqs. (3.110) and (3.112), which are listed in Table 2 as U1P and U1KP.

Finally, we summarize what is known about the nature of critical points for the real symmetric or complex Hermitian case. For a detailed discussion of the real symmetric case and the orthogonal group see e.g., [22].

Corollary 3.8. Let $C$ and $A$ be real symmetric or complex Hermitian and assume for simplicity that they show distinct eigenvalues in either case. Then one finds:

(a) For $A, C$ real symmetric, define with respect to the special orthogonal group $SO(N)$ and $Y \in O_o(A) := \{OAO^\top \mid O \in SO(N)\}$ a pair of target functions on the group and on the respective orbit by

$$g(O) := \operatorname{tr}\{C^\top O A O^\top\} \tag{3.113}$$

$$g(Y) := \operatorname{tr}\{C^\top Y\}. \tag{3.114}$$


Then the gradient flow

$$\dot O := \operatorname{grad} g(O) = [OAO^\top, C]\,O \tag{3.115}$$

shows $2^{N-1} N!$ critical points, while the double-bracket flow

$$\dot Y := \operatorname{grad} g(Y) = [Y, [Y, C]] \tag{3.116}$$

only shows $N!$ equilibrium points.

(b) For $A, C$ complex Hermitian, and $X \in O_u(A) := \{UAU^\dagger \mid U \in SU(N)\}$ with

$$f(U) := \operatorname{tr}\{C^\dagger UAU^\dagger\} \tag{3.117}$$

$$f(X) := \operatorname{tr}\{C^\dagger X\} \tag{3.118}$$

the gradient flow on the special unitary group $SU(N)$

$$\dot U := \operatorname{grad} f(U) = [UAU^\dagger, C]\,U \tag{3.119}$$

shows a continuum of critical points, while the double-bracket flow on the unitary orbit

$$\dot X := \operatorname{grad} f(X) = [X, [X, C]] \tag{3.120}$$

again shows only $N!$ equilibrium points.

(c) On the orbit, the respective target function has a unique global maximum which is given by the diagonalization $\operatorname{diag}(\lambda_1, \ldots, \lambda_N)$, $\lambda_1 > \cdots > \lambda_N$ of $A$, if $C$ is assumed to be diagonal of the form $C = \operatorname{diag}(\mu_1, \ldots, \mu_N)$, $\mu_1 > \cdots > \mu_N$. Moreover, the respective gradient flow converges to the unique global maximum for almost all initial values with an exponential bound on the rate.

Proof. (a) and (b) The counting arguments follow immediately from the fact that in either case for $C$ diagonal with distinct eigenvalues, the set of critical points $C_\infty := \{X_\infty \in O(A) \mid [X_\infty, C] = 0\}$ on the orthogonal or unitary orbit is given by $N!$ different diagonalizations of $A$ and remains therefore invariant under conjugation by any permutation matrix. Moreover, on the orthogonal group $O(N)$, the stabilizer group of $A$ is given by $\{\operatorname{diag}(\pm 1, \pm 1, \ldots, \pm 1)\}$, which adds $2^N$ independent further degrees of freedom. Finally, restricting to $SO(N)$ we obtain $2^{N-1} N!$ critical points on the group level. In contrast, for the unitary case $SU(N)$, the stabilizer group of $A$ reads

$$\Big\{\operatorname{diag}(e^{i\phi_1}, \ldots, e^{i\phi_\nu}, \ldots, e^{i\phi_N}) \;\Big|\; \sum_{\nu=1}^{N} \phi_\nu \in 2\pi\mathbb{Z},\ \phi_\nu \in \mathbb{R}\Big\},$$

which is always continuous.


(c) Since $C$ is symmetric or Hermitian, we can assume without loss of generality that $C$ is diagonal. Then, the critical point condition $[X_\infty, C] = 0$ yields that the critical points of $g$ and, respectively, $f$ are given by the diagonalizations of $A$. Moreover, analyzing the Hessian at critical points shows that there is only one global maximum in both cases and no local ones [22]. The exponential convergence of the gradient flows Eqs. (3.116) and (3.120) to the respective unique global maximum for almost all initial values is also established via the Hessian, i.e. by linearizing the respective gradient flows at critical points [22].

3.4.5. Some final remarks on the naturally reductive case

Let $f : SU(2^n) \to \mathbb{R}$ be an arbitrary smooth function that is equivariant under local unitary operations of the $n$-fold tensor product $SU_{\rm loc}(2^n) := SU(2) \otimes \cdots \otimes SU(2)$. This includes, e.g., any measure of entanglement $\mu_E(U)$ that varies smoothly with $U$. By construction $\operatorname{grad} f|_{SU_{\rm loc}(2^n)} = 0$, so we may then consider the induced flow $[\dot U] = \operatorname{grad} f([U])$ on the homogeneous space $G/K = SU(2^n)/SU_{\rm loc}(2^n)$, which is naturally reductive for all $n$ and even Cartan-like for $n = 2$. This can be seen, because (i) $SU(2^n)$ carries a bi-invariant metric induced by the Killing form, which allows one to define $\mathfrak{p} := \mathfrak{k}^\perp$ and gives the reductive decomposition $\mathfrak{g} = \mathfrak{k} \oplus \mathfrak{p}$, yet only for $n = 2$ one recovers the commutator inclusions $[\mathfrak{k}, \mathfrak{k}] \subseteq \mathfrak{k}$, $[\mathfrak{p}, \mathfrak{p}] \subseteq \mathfrak{k}$, and $[\mathfrak{k}, \mathfrak{p}] \subseteq \mathfrak{p}$; (ii) in any case, by Proposition 3.4 there is an $\mathrm{Ad}_K$-invariant scalar product on $\mathfrak{p}$; and (iii) Eq. (3.47) is fulfilled for all $\{a, b, c\} \subseteq \mathfrak{p}$, as $\operatorname{tr}\{[a, b]^\dagger c\} = -\operatorname{tr}\{b^\dagger [a, c]\}$, cf. Remark 3.6. Therefore, one finally arrives at a discretized gradient algorithm of the form

$$[U_{k+1}] := [\exp(\alpha_k \operatorname{grad} f(U_k)\, U_k^{-1})\, U_k], \tag{3.121}$$

cf. Eq. (3.64). Clearly, this example extends analogously to functions that are equivariant under the action of generalized local subgroups $SU(N_1) \otimes \cdots \otimes SU(N_r)$ with $\prod_{j=1}^{r} N_j = N$, cf. (4.8), giving flows on the corresponding reductive homogeneous spaces $G/K = SU(N)/(SU(N_1) \otimes SU(N_2) \otimes \cdots \otimes SU(N_r))$.

Comparing Eq. (3.121) with the results of the previous subsection on double-bracket flows shows the following: having a "model" of the coset space $G/K$, i.e. having a smooth group action of $G$ (e.g. on some vector space) such that one of its orbits is diffeomorphic to $G/K$, facilitates the implementation of Eq. (3.121) compared with implementing it on the abstract coset level.


4. Applications to Quantum Information and Quantum Control

4.1. A geometric measure of pure-state entanglement

The Euclidean distance of a pure state to the set $S_{pp}$ of all pure product states may be seen as a geometric measure of entanglement [55, 121, 122]. Since $S_{pp}$ coincides with the local unitary orbit

$$O_{\rm loc}(yy^\dagger) := \{U yy^\dagger U^\dagger \mid U \in SU_{\rm loc}(2^n)\} \tag{4.1}$$

of any pure product state $y \in S_{pp}$, it relates to the following optimization task

$$\Delta(x) := \min_{U \in SU_{\rm loc}(2^n)} \|xx^\dagger - U yy^\dagger U^\dagger\|_2, \tag{4.2}$$

where $x \in \mathbb{C}^{2^n}$ denotes a normalized pure state and $y \in \mathbb{C}^{2^n}$ a pure product state, e.g., $y = (1, 0, \ldots, 0) = (e_1 \otimes \cdots \otimes e_1)$. This notation replaces $|x\rangle$ by $x$ and $|x\rangle\langle x|$ by $xx^\dagger$ for the sake of convenient generalization to higher-order tensor products. Obviously, minimizing (4.2) is equivalent to maximizing the so-called local transfer

$$\max_{U \in SU_{\rm loc}(2^n)} \operatorname{Re}(\operatorname{tr}(xx^\dagger U yy^\dagger U^\dagger)) \tag{4.3}$$

between $xx^\dagger$ and $yy^\dagger$. Further, since $\operatorname{tr}(xx^\dagger U yy^\dagger U^\dagger) = |\operatorname{tr}(x^\dagger U y)|^2$, taking the real part in (4.3) is redundant. Now, the techniques developed in Sec. 3.4.3 match perfectly to tackle problem (4.3). Let $C := xx^\dagger$, $A := \operatorname{diag}(1, 0, \ldots, 0)$ and define the so-called local unitary transfer between $C$ and $A$ by the real-valued function

$$f_{\rm loc}(U) := \operatorname{tr}(C U A U^\dagger). \tag{4.4}$$

Then the gradient flow (3.91) or more precisely its discretization (3.96) will generically solve (4.3). For explicit numerical results see Sec. 4.2.3 and [117, 123]. In general, neither an algebraic characterization of the maximal value of $f_{\rm loc}$ nor the structure of its critical points is known, the major difficulty arising from the fact that $U$ is restricted to $SU_{\rm loc}(2^n)$. As soon as $U$ may be taken from the entire special unitary group, the solution is well-known: it is simply obtained by arranging the (real) eigenvalues of both $A$ and $C$ magnitude-wise in the same order [17, 22, 124, 125].

4.2. Generalized local subgroups

4.2.1. Bipartite systems and relations to singular-value decompositions

An exceptional case in which the restricted problem (4.3) can be solved is that of bipartite pure systems. These systems are particularly simple inasmuch as the maxima of $f_{\rm loc}$ can be linked to the singular-value decomposition (SVD) of the matrices $X$ and $Y$ associated to $x$ and $y$ by $x := \operatorname{vec} X$ and $y := \operatorname{vec} Y$. Since these ideas


readily extend to arbitrary finite dimensional bipartite systems, we generalize the formulation of problem (4.3) thus leading to Eq. (4.5), before going into multipartite systems.

Proposition 4.1. Let $X = V_X \Sigma_X W_X^\dagger$, $Y = V_Y \Sigma_Y W_Y^\dagger$ be singular value decompositions with $V_X, V_Y \in U(N_1)$, $W_X, W_Y \in U(N_2)$ and $\Sigma_X, \Sigma_Y$ sorted by magnitude. Moreover, let $x := \operatorname{vec} X$ and $y := \operatorname{vec} Y$. Then the maximum value of the local transfer between $xx^\dagger$ and $yy^\dagger$ is bounded by

$$\max_{U \in SU(N_2) \otimes SU(N_1)} \operatorname{Re}(\operatorname{tr}(xx^\dagger U yy^\dagger U^\dagger)) \le (\operatorname{tr} \Sigma_X^\dagger \Sigma_Y)^2. \tag{4.5}$$

Equality is actually achieved for $V_X, V_Y \in SU(N_1)$, while $W_X, W_Y \in SU(N_2)$ and $U_* := (W_X^* \otimes V_X) \cdot (W_Y^\top \otimes V_Y^\dagger)$.

Proof. For $U := W \otimes V \in SU(N_2) \otimes SU(N_1)$ we obtain

$$\operatorname{tr}(xx^\dagger U yy^\dagger U^\dagger) = \operatorname{tr}(xx^\dagger (W \otimes V) yy^\dagger (W^\dagger \otimes V^\dagger)) = \operatorname{tr}(xx^\dagger \operatorname{vec}(V Y W^\top) \operatorname{vec}(V Y W^\top)^\dagger) = |x^\dagger \operatorname{vec}(V Y W^\top)|^2 = |\operatorname{tr}(X^\dagger V Y W^\top)|^2. \tag{4.6}$$

Here, we have used the identities $\operatorname{vec}(V Y W^\top) = (W \otimes V) \operatorname{vec} Y$ and $(\operatorname{vec} X)^\dagger \operatorname{vec} Y = \operatorname{tr} X^\dagger Y$ for all $X, Y \in \mathbb{C}^{N_1 \times N_2}$. Now, (4.6) implies

$$\max_{U \in SU(N_2) \otimes SU(N_1)} \operatorname{Re}\operatorname{tr}(xx^\dagger U yy^\dagger U^\dagger) = \max_{V \in SU(N_1),\, W \in SU(N_2)} |\operatorname{tr}(X^\dagger V Y W^\top)|^2 \le (\operatorname{tr} \Sigma_X^\dagger \Sigma_Y)^2, \tag{4.7}$$

where the last inequality is due to von Neumann, cf. [111, 124]. If $V_X, V_Y \in SU(N_1)$ and $W_X, W_Y \in SU(N_2)$, equality is assumed in Eq. (4.7) for

$$U_* := (W_Y W_X^\dagger)^\top \otimes V_X V_Y^\dagger = (W_X^* \otimes V_X) \cdot (W_Y^\top \otimes V_Y^\dagger).$$

Corollary 4.1. Set $x := \operatorname{vec} A$ and $y := \operatorname{vec} C$. Then the maximum local transfer between $xx^\dagger$ and $yy^\dagger$ in the sense of Proposition 4.1 is bounded by

$$\|A\|_C^2 := \max_{V \in U(N_1),\, W \in U(N_2)} |\operatorname{tr}(C^\dagger V A W^\dagger)|^2,$$

which is known as the C-spectral norm of $A$, cf. [112]. Note that in the context of finding maximal distances between global unitary orbits for the purpose of geometric discrimination of generic non-pure quantum states [126], results similar to [125, 127] show up, while here we treat local unitary orbits of pure bipartite states as made explicit in Eq. (4.5).
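The bound of Proposition 4.1 and the maximizing local unitary can be checked numerically. The following Python sketch assumes the column-stacking convention for vec and works with factors in $U(N_1)$ and $U(N_2)$, ignoring the determinant conditions; the function names (`vec`, `local_transfer`, `von_neumann_bound`, `optimal_local_unitary`) are hypothetical and introduced only for this illustration.

```python
import numpy as np

def vec(M):
    """Column-stacking vec, so that vec(V Y W^T) = (W kron V) vec(Y)."""
    return M.reshape(-1, order="F")

def local_transfer(X, Y, V, W):
    """tr(x x^dagger U y y^dagger U^dagger) = |x^dagger U y|^2 for U = W kron V."""
    x, y = vec(X), vec(Y)
    U = np.kron(W, V)
    return abs(np.vdot(x, U @ y)) ** 2

def von_neumann_bound(X, Y):
    """(tr Sigma_X^dagger Sigma_Y)^2 with singular values sorted by magnitude."""
    sx = np.linalg.svd(X, compute_uv=False)
    sy = np.linalg.svd(Y, compute_uv=False)
    return float(np.sum(sx * sy)) ** 2

def optimal_local_unitary(X, Y):
    """V = V_X V_Y^dagger and W with W^T = W_Y W_X^dagger (unitary, det not fixed)."""
    Vx, _, Wxh = np.linalg.svd(X)   # X = Vx Sigma_X Wxh, i.e. Wxh = W_X^dagger
    Vy, _, Wyh = np.linalg.svd(Y)
    V = Vx @ Vy.conj().T
    W = (Wyh.conj().T @ Wxh).T
    return V, W
```

For generic complex $N_1 \times N_2$ matrices the value of `local_transfer` at the pair returned by `optimal_local_unitary` should agree with `von_neumann_bound` up to rounding, while any other pair of unitaries stays below it.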


4.2.2. Multipartite systems and relations to best rank-1 approximations of higher-order tensors

Proposition 4.1 has a straightforward generalization to multipartite systems, which relates to best rank-1 approximations of higher-order tensors. To outline this relation, we define the concept of a generalized local subgroup

$$SU_{\rm loc}(N_1, \ldots, N_r) := SU(N_1) \otimes \cdots \otimes SU(N_r) \tag{4.8}$$

of type $(N_1, \ldots, N_r)$ with $N_k \in \mathbb{N}$, $k = 1, \ldots, r$. Thus the associated generalized local subgroup optimization problem can be stated as follows.

Generalized Local Subgroup Problem (GLSP). For $C, A \in \mathbb{C}^{N \times N}$ with $N := N_1 \cdot N_2 \cdots N_r$ find

$$\max_{U \in SU_{\rm loc}(N_1, \ldots, N_r)} \operatorname{Re}(\operatorname{tr}(C U A U^\dagger)). \tag{4.9}$$

To our knowledge, the GLSP seems to be unsolved so far. To introduce higher-order tensors, we have to fix some further notation. For simplicity, we regard a tensor of order $r \in \mathbb{N}$ as an array $X = (X_{i_1 \cdots i_r})_{1 \le i_1 \le N_1, \ldots, 1 \le i_r \le N_r}$ of size $N_1 \times \cdots \times N_r$. The space of all $N_1 \times \cdots \times N_r$-tensors is denoted by $\mathbb{C}^{N_1 \times \cdots \times N_r}$. A natural scalar product for tensors of the same size is given by

$$\langle Y \mid X \rangle := \sum_{i_1 \cdots i_r} Y^*_{i_1 \cdots i_r} X_{i_1 \cdots i_r}. \tag{4.10}$$

Moreover, a tensor $X$ is called a rank-1 tensor if there exist $x^k \in \mathbb{C}^{N_k}$, $k = 1, \ldots, r$ such that

$$X = x^1 \circ x^2 \circ \cdots \circ x^r, \tag{4.11}$$

where the $(i_1 \cdots i_r)$-entry of the outer product is defined by $(x^1 \circ x^2 \circ \cdots \circ x^r)_{i_1 \cdots i_r} := x^1_{i_1} \cdot x^2_{i_2} \cdots x^r_{i_r}$. Thus the question of decomposing a given tensor by tensors of lower rank leads to the following fundamental approximation problem:

Best Rank-1 Approximation Problem (BRAP). Let $\|\cdot\|$ denote the norm induced by scalar product (4.10). For $X \in \mathbb{C}^{N_1 \times \cdots \times N_r}$ solve

$$\min_{C \in \mathbb{C},\ \|x^k\| = 1,\ k = 1, \ldots, r} \|X - C \cdot x^1 \circ \cdots \circ x^r\|^2. \tag{4.12}$$

Note that the above notation is necessary to distinguish between two different types of outer products: the Kronecker product $\otimes$ (of column-vectors), which maps $r$-tuples of column-vectors to a column-vector of larger size, and the "abstract" outer product $\circ$, which maps $r$-tuples of column-vectors to arrays (= tensors) of


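order $r$. The relation between both is given by the canonical isomorphism $\operatorname{vec} : \mathbb{C}^{N_1 \times \cdots \times N_r} \to \mathbb{C}^N$ with $N := N_1 \cdot N_2 \cdots N_r$, which is uniquely determined by

$$x^1 \circ x^2 \circ \cdots \circ x^r \mapsto x^1 \otimes x^2 \otimes \cdots \otimes x^r, \tag{4.13}$$

i.e. vec assigns to each array $X \in \mathbb{C}^{N_1 \times \cdots \times N_r}$ a column-vector in $\mathbb{C}^N$ by arranging the entries of $X$ in a lexicographical order. With these notations at hand, the relation between GLSP and BRAP can be stated as follows.

Theorem 4.1. Let $X \in \mathbb{C}^{N_1 \times \cdots \times N_r}$ be a tensor of order $r$ and let $x := \operatorname{vec}(X) \in \mathbb{C}^N$ with $N := N_1 \cdot N_2 \cdots N_r$. Then the BRAP is equivalent to the GLSP

$$\max_{U \in SU_{\rm loc}(N_1, \ldots, N_r)} \operatorname{Re}(\operatorname{tr}(xx^\dagger U yy^\dagger U^\dagger)), \tag{4.14}$$

where $y \in \mathbb{C}^N$ can be any pure product state, e.g., $y = (1, 0, \ldots, 0) = e_1 \otimes \cdots \otimes e_1$. More precisely,

(a) If $U_1 \otimes \cdots \otimes U_r$ is a solution of (4.14) then $x^k := U_k e_1$, $k = 1, \ldots, r$ and $C := \langle X \mid x^1 \circ \cdots \circ x^r \rangle$ solve (4.12).

(b) If $C \in \mathbb{C}$ and $x^k$, $k = 1, \ldots, r$ solve (4.12) then any $U_1 \otimes \cdots \otimes U_r$ with $x^k = U_k e_1$, $k = 1, \ldots, r$ yields a solution of (4.14).

For proving Theorem 4.1 we need the following technical lemma.

Lemma 4.1. The pair $(x^1 \circ \cdots \circ x^r, C)$ solves (4.12) if and only if $x^1 \circ \cdots \circ x^r$ is a maximum of

$$\max_{\|z^k\| = 1,\ k = 1, \ldots, r} |\langle X \mid z^1 \circ \cdots \circ z^r \rangle| \tag{4.15}$$

and $C = \langle X \mid x^1 \circ \cdots \circ x^r \rangle$.

Proof. Consider the following identity

$$\|X - C \cdot z^1 \circ \cdots \circ z^r\|^2 = \|X\|^2 + |C|^2 - 2 \operatorname{Re}(C^* \langle X \mid z^1 \circ \cdots \circ z^r \rangle) = \|X\|^2 + |C - \langle X \mid z^1 \circ \cdots \circ z^r \rangle|^2 - |\langle X \mid z^1 \circ \cdots \circ z^r \rangle|^2.$$

Thus we obtain

$$\min_{C \in \mathbb{C},\ \|z^k\| = 1,\ k = 1, \ldots, r} \|X - C \cdot z^1 \circ \cdots \circ z^r\|^2 = \|X\|^2 - \max_{\|z^k\| = 1,\ k = 1, \ldots, r} |\langle X \mid z^1 \circ \cdots \circ z^r \rangle|^2.$$

This yields the desired result.

Proof of Theorem 4.1. Let $y = e_1 \otimes \cdots \otimes e_1$. Then $(U_1 \otimes \cdots \otimes U_r) y = (U_1 e_1) \otimes \cdots \otimes (U_r e_1)$ and thus

$$\operatorname{tr}(xx^\dagger U yy^\dagger U^\dagger) = \operatorname{tr}(x^\dagger U yy^\dagger U^\dagger x) = |x^\dagger U y|^2 = |\langle X \mid (U_1 e_1) \circ \cdots \circ (U_r e_1) \rangle|^2.$$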
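The identity just derived reduces the local transfer to an overlap with a rank-1 tensor, which suggests a simple numerical experiment. The Python sketch below is an illustration only: it uses a plain alternating maximization over the unit vectors $z^k$ (in the spirit of a higher-order power method, but not the MATLAB package cited in Sec. 4.2.3), a deterministic start vector chosen here for reproducibility, and hypothetical helper names (`outer`, `inner`, `vec`, `rank1_power_iteration`). The test tensor is the 3-qubit state $|W\rangle$ of Sec. 4.2.3.

```python
import numpy as np

def outer(vectors):
    """Abstract outer product x^1 o ... o x^r as an r-way array."""
    T = np.array(1.0 + 0j)
    for v in vectors:
        T = np.multiply.outer(T, v)
    return T

def inner(Y, X):
    """Scalar product (4.10): sum of conj(Y) * X over all entries."""
    return np.vdot(Y, X)

def vec(X):
    """Lexicographic (row-major) vec, so that vec(outer(...)) is a Kronecker product."""
    return X.reshape(-1)

def rank1_power_iteration(X, start, sweeps=100):
    """Alternately maximize |<X | z^1 o ... o z^r>| over one unit vector at a time."""
    z = [np.asarray(v, dtype=complex) / np.linalg.norm(v) for v in start]
    r = X.ndim
    for _ in range(sweeps):
        for k in range(r):
            T = X
            # contract every axis except k, highest axis first so indices stay valid
            for j in reversed(range(r)):
                if j != k:
                    T = np.tensordot(T, z[j].conj(), axes=([j], [0]))
            z[k] = T / np.linalg.norm(T)
    return z

# 2 x 2 x 2 tensor of |W> = (|001> + |010> + |100>) / sqrt(3)
W = np.zeros((2, 2, 2), dtype=complex)
W[0, 0, 1] = W[0, 1, 0] = W[1, 0, 0] = 1.0 / np.sqrt(3.0)

z = rank1_power_iteration(W, [np.ones(2), np.ones(2), np.ones(2)])
max_transfer = abs(inner(W, outer(z))) ** 2
```

With this symmetric start the iteration approaches the value 4/9 for the maximal local transfer of $|W\rangle$, so that $1 - 4/9 = 5/9$ matches the analytical value quoted in Sec. 4.2.3. Other starting points may need more sweeps or stall near lower critical values.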


Therefore, we obtain

$$\max_{U \in SU_{\rm loc}(N_1, \ldots, N_r)} \operatorname{Re}(\operatorname{tr}(xx^\dagger U yy^\dagger U^\dagger)) = \max_{U \in SU_{\rm loc}(N_1, \ldots, N_r)} |\langle X \mid (U_1 e_1) \circ \cdots \circ (U_r e_1) \rangle|^2 = \max_{\|z^k\| = 1,\ k = 1, \ldots, r} |\langle X \mid z^1 \circ \cdots \circ z^r \rangle|^2$$

and hence Lemma 4.1 implies (a) and (b).

Remark 4.1. (1) The isomorphism vec coincides "almost" with the standard vec-operation on matrices for $r = 2$; more precisely, $\operatorname{vec}(X)$ equals the standard vec-operation applied to $X^\top$.

(2) Since any phase factor can readily be absorbed into $x^1 \circ \cdots \circ x^r$, it is easy to show that

$$\max_{\|x^k\| = 1,\ k = 1, \ldots, r} |\langle X \mid x^1 \circ \cdots \circ x^r \rangle| = \max_{\|x^k\| = 1,\ k = 1, \ldots, r} \operatorname{Re}(\langle X \mid x^1 \circ \cdots \circ x^r \rangle).$$

Therefore, maxima of the "real-part-expression" on the right-hand side are always maxima of the "absolute-value-term" on the left.

(3) By replacing $yy^\dagger$ in (4.14) with an appropriate sum $\sum_{i=1}^{l} y_i y_i^\dagger$, the above ideas can be extended to best approximations of higher rank, i.e. to best approximations of the form

$$\min_{C_i \in \mathbb{C},\ \|x^{i,k}\| = 1} \Big\|X - \sum_{i=1}^{l} C_i \cdot x^{i,1} \circ \cdots \circ x^{i,r}\Big\|^2,$$

with $l \le \min\{N_1, \ldots, N_r\}$ and all $x^{i,1} \circ \cdots \circ x^{i,r}$ mutually orthogonal, cf. [128, 129].

(4) Unfortunately, an analogue of Proposition 4.1 involving the tensor SVD as defined in [130] does not hold for higher-order tensors. Even the classical Eckart–Young Theorem, which asserts that the best rank-$k$ approximation of a matrix is given by its truncated SVD, is false for higher-order tensors, cf. [131].

(5) Higher-order methods, like Newton, BFGS or conjugate gradient methods for computing best approximations of higher-order tensors can be found in [132–135]. Near local maxima these methods are in general faster than gradient algorithms: although a single iteration is more time-consuming than a gradient step, the number of iterations needed to guarantee a certain error threshold is considerably lower due to the local higher-order convergence rate. However, their global convergence behavior is a rather delicate issue. In practice, therefore, one often applies a combined strategy: (i) first, run a gradient algorithm to reach the region of attraction of a higher-order method; (ii) then switch to a higher-order method.

4.2.3. Numerical results

For comparing our gradient-flow approach to tensor-SVD techniques, here we focus on two examples that are well-established in the literature, since analytical solutions


[136] as well as numerical results from semidefinite programming are known [55]. First, consider a pure 3-qubit state depending on a real parameter $s \in [0, 1]$

$$|X(s)\rangle := \sqrt{s}\,|W\rangle + \sqrt{1-s}\,|V\rangle, \tag{4.16}$$

where one defines

$$|W\rangle := \tfrac{1}{\sqrt{3}}(|001\rangle + |010\rangle + |100\rangle) \quad\text{and}\quad |V\rangle := \tfrac{1}{\sqrt{3}}(|110\rangle + |101\rangle + |011\rangle)$$

with the usual shorthand notation of quantum information $|0\rangle := \binom{1}{0}$, $|1\rangle := \binom{0}{1}$ and $|001\rangle := \binom{1}{0} \otimes \binom{1}{0} \otimes \binom{0}{1}$, etc. With these stipulations one finds the corresponding $2 \times 2 \times 2$ tensor representations for $|W\rangle$ and $|V\rangle$ to take the form

$$W_{(1,:,:)} = \frac{1}{\sqrt{3}} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad W_{(2,:,:)} = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \tag{4.17}$$

and

$$V_{(1,:,:)} = \frac{1}{\sqrt{3}} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad V_{(2,:,:)} = \frac{1}{\sqrt{3}} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{4.18}$$

Likewise, observe the pure 4-qubit state

$$|X(s)\rangle := \sqrt{s}\,|GHZ\rangle - \sqrt{1-s}\,|X^+\rangle \otimes |X^+\rangle, \tag{4.19}$$

with the definitions

$$|GHZ\rangle := \tfrac{1}{\sqrt{2}}(|0011\rangle + |1100\rangle) \quad\text{and}\quad |X^+\rangle := \tfrac{1}{\sqrt{2}}(|10\rangle + |01\rangle).$$

Consider the target function $f(K) = \operatorname{tr}\{C^\dagger K A K^\dagger\}$ with $C = \operatorname{diag}(1, 0, 0, \ldots, 0)$ and $A := |X(s)\rangle\langle X(s)|$. As shown in Fig. 4 with the gradient flow restricted to the local unitaries $K \in SU_{\rm loc}(2^n)$ one obtains results perfectly matching the analytical solutions of [136] as well as the numerical ones from semidefinite programming ensuring global optimality, yet in drastically less CPU time as compared to [55], see Table 1. Gradient flows are some 30 to 150 times faster in CPU time than semidefinite programming methods for the 3-qubit and 4-qubit example, respectively. In the tensor-SVD algorithms [131] such as the higher-order power method (HOPM) or the higher-order orthogonal iteration (HOOI) as implemented in the MATLAB package [137], $N = 50$ to $N = 60$ iterations are required for quantitative agreement with the algebraically established results. In the 3-qubit example, all minimal distances are also reproduced correctly with $N = 5$ iterations, except for the limiting values $s$ near 0 and near 1, for which the minimal distances of $\Delta(|X(0)\rangle) = \Delta(|X(1)\rangle) = 2/3$ are obtained by either tensor method instead of the correct analytical value of $5/9$, which requires $N = 60$ iterations as shown in Fig. 4(c). In the 4-qubit example, however, for $N = 5$ iterations, both tensor methods suffer from apparently random numerical instabilities, which only vanish when allowing for $N = 50$ iterations in either method. It is the considerably high

[Figure 4: three panels. Panels (a) and (b) plot "1 − max. local transfer" against $s \in [0, 1]$; panel (c) plots the same quantity for $s$ between 0.998 and 1, once for $N = 5$ and once for $N = 60$ iterations.]

Fig. 4. Numerical results by gradient flows on the local unitary group $K = SU_{\rm loc}(2^n)$ determining (a) the Euclidean distance of the 3-qubit state $|X(s)\rangle = \sqrt{s}\,|W\rangle + \sqrt{1-s}\,|V\rangle$ (see Eq. (4.16)) to the nearest product state as a function of $s$; (b) the distance of the 4-qubit state $|X(s)\rangle = \sqrt{s}\,|GHZ\rangle - \sqrt{1-s}\,|X^+\rangle \otimes |X^+\rangle$ (see Eq. (4.19)) to the nearest product state. (c) Tensor-SVD results for the Euclidean distance of the 3-qubit state $|X(s)\rangle$ to the nearest product state as in part (a). With the standard of $N = 5$ iterations, both methods (here shown for HOPM) give systematic errors as indicated by the arrow. $N = 60$ iterations are needed for quantitatively matching the well-established distance values. The high number of iterations required slows down the method as indicated in Table 1. (Color online)

number of iterations that makes the tensor methods substantially slower than our gradient-flow algorithm as shown in Table 1. Therefore, at least for lower order tensors, gradient flows provide an appealing alternative to standard tensor-SVD methods for best rank-1 approximations. Moreover, one should take into account that the above gradient methods are developed to solve the GLSP and thus a considerable speed-up can be expected by adjusting them to the local orbit $O_{\rm loc}(yy^\dagger)$ of a pure product state. For similar results obtained by an intrinsic Newton and conjugate gradient method see also [118, 123]. Generalizations of such higher-order methods to Grassmann manifolds, which perfectly


Table 1. CPU times for determining Euclidean distance to orbit of separable pure states in Fig. 4.

Qubits   Semidefinite programming CPU time [sec] (a)   By gradient flow CPU time [sec] (b)   Speed-up
3        10.92                                          0.30                                  36.4
4        103.97                                         0.71                                  147.0

Qubits   Higher-order tensor-SVD (HOPM) CPU time [sec] (b)   Higher-order tensor-SVD (HOOI) CPU time [sec] (b)   Speed-ups
3        2.39                                                 5.37                                                 4.6 (2.0)
4        3.93                                                 7.03                                                 26.5 (14.8)

(a) Eisert et al. (processor with 2.2 GHz, 1 GB RAM) [55].
(b) Average of 50 runs, Athlon XP1800+ (1.1 GHz, 512 MB RAM).

fit in the previous theory of Riemannian homogeneous spaces [110], are provided in [132–135]. As also discussed therein, the applications to tensor approximation in signal processing and data compression or subspace reconstruction in image processing are numerous. Moreover we anticipate that these numerical approaches will also prove useful tools in tensor and rank aspects of entanglement and kinematics of qubit pairs as addressed, e.g., in [138, 139].

4.3. Locally reversible interaction Hamiltonians

4.3.1. Joint local reversibility

In a recent study [29], we have addressed the decision problem whether a time-independent (self-adjoint) Hamiltonian $H$ normalized to $\|H\|_2 = 1$ generates a one-parameter unitary group $U(t) = \{e^{-itH} \mid t \in \mathbb{R}\}$ that is jointly invertible for all $t$ by local unitary operations $K \in SU_{\rm loc}(2^n) = SU(2)^{\otimes n}$ in the sense

$$KHK^\dagger = -H. \tag{4.20}$$

Apart from complete algebraic classification, in [29] we used that the question obviously finds an affirmative answer, if there is an element $K \in SU_{\rm loc}(2^n)$ such that

$$\|KHK^\dagger + H\|_2 = 0, \tag{4.21}$$

which amounts to minimizing the transfer function

$$f(K) = \operatorname{Re}\operatorname{tr}\{HKHK^\dagger\}. \tag{4.22}$$

With $P$ denoting the projector onto $\mathfrak{k}$, i.e. the Lie algebra of $K = SU_{\rm loc}(2^n)$, we therefore used the gradient flow

$$\dot K = -\operatorname{grad} f(K) = -P([KHK^\dagger, H])K \tag{4.23}$$

as another application of Theorem 3.15. If (due to normalization) $\operatorname{Re}\operatorname{tr}\{HKHK^\dagger\} = -1$ can be reached, the interaction Hamiltonian is locally reversible.
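The criterion $\operatorname{Re}\operatorname{tr}\{HKHK^\dagger\} = -1$ can be evaluated directly for a hand-picked local operation. The Python sketch below does this for Ising-ZZ couplings on a ring; it does not implement the gradient flow (4.23), and both the helper names (`kron_all`, `ising_zz_ring`, `transfer`) and the particular choices of $K$ are assumptions made for this illustration rather than taken from [29]. The Pauli matrix $\sigma_x$ has determinant $-1$, but $i\sigma_x \in SU(2)$ induces the same conjugation.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])   # Pauli sigma_x
Z = np.diag([1.0, -1.0])                 # Pauli sigma_z

def kron_all(ops):
    """Kronecker product of a list of single-qubit operators."""
    out = np.array([[1.0]])
    for op in ops:
        out = np.kron(out, op)
    return out

def ising_zz_ring(n):
    """Sum of sigma_z sigma_z couplings on a cycle of n qubits, Frobenius-normalized."""
    H = np.zeros((2 ** n, 2 ** n))
    for j in range(n):
        ops = [I2] * n
        ops[j] = Z
        ops[(j + 1) % n] = Z
        H += kron_all(ops)
    return H / np.linalg.norm(H)

def transfer(H, K):
    """Transfer function f(K) = Re tr{H K H K^dagger} of Eq. (4.22)."""
    return float(np.real(np.trace(H @ K @ H @ K.conj().T)))

# Four-qubit ring: flipping qubits 1 and 3 changes the sign of every coupling.
H4 = ising_zz_ring(4)
K4 = kron_all([X, I2, X, I2])
f4 = transfer(H4, K4)

# Three-qubit ring: a single flip reverses only two of the three couplings.
H3 = ising_zz_ring(3)
K3 = kron_all([X, I2, I2])
f3 = transfer(H3, K3)
```

Here `f4` reaches $-1$, in line with the local reversibility of the cyclic four-qubit Ising-ZZ interaction, whereas `f3` equals $-1/3$ for this particular $K$. The latter is only the value at one trial point and by itself proves nothing about the minimum over all of $SU_{\rm loc}(2^3)$.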


Remark 4.2. There is an interesting relation to local C-numerical ranges as described in detail in [113, 114]: if the local C-numerical range $W_{\rm loc}(H, H) := \{\operatorname{tr}(HKHK^{-1}) \mid K \in K\} = [-1; +1]$ then the interaction Hamiltonian $H$ is locally reversible. The references also establish the interconnection to local C-numerical ranges of circular symmetry and multi-quantum interaction components transforming like irreducible spherical spin tensors.

In Fig. 5, we give some examples: e.g., the Ising-ZZ interaction in a cyclic four-qubit coupling topology is locally reversible, while in the cyclic three-qubit topology it is not, and also for two qubits coupled by an isotropic Heisenberg-XXX interaction it is not. Thus numerical tests provide convenient answers particularly in problems where an algebraic assessment becomes more tedious than in the examples presented here, which are fully understood on algebraic grounds [29].

4.3.2. Pointwise local reversibility

In [29] we also generalized the above problem to the question, whether for a fixed $\tau \in \mathbb{R}$ there is a pair $K_1, K_2 \in K = SU_{\rm loc}(2^n)$ so that

$$K_1 e^{-i\tau H} K_2 = e^{+i\tau H} \tag{4.24}$$

which upon setting $A := e^{-i\tau H}$ and $C := e^{+i\tau H}$ is equivalent to

$$\|K_1 A K_2 - C\|_2 = 0. \tag{4.25}$$

[Figure 5: plot of $\operatorname{tr}\{KHK^{-1}H\}$ [normalised] (vertical axis, from −1 to 1) against the iteration number (horizontal axis, from 0 to 150), with traces labelled (a), (b) and (c).]

Fig. 5. Gradient-flow driven local reversion of different Heisenberg interaction Hamiltonians: (a) the Ising-ZZ interaction on a cyclic four-qubit topology $C_4$ can in fact be locally reversed, whereas (b) neither the ZZ interaction on a cyclic three-qubit topology $C_3$ can be reversed locally, (c) nor the Heisenberg-XXX interaction between two qubits. (Color online)

[Figure 6: plot of $\operatorname{Re}\operatorname{tr}\{K_1 e^{-itH} K_2 (-e^{-itH})\}$ [normalised] (vertical axis, from −1 to 0) against the iteration number (horizontal axis, from 0 to 50), with traces labelled (a) and (b).]

Fig. 6. Gradient-flow driven local inversion of the exponential of the Hamiltonian $H = \tfrac{1}{2}(\sigma_z \otimes 1 + 1 \otimes \sigma_z + \sigma_z \otimes \sigma_z)$ and $U(\tau) := e^{-i\frac{\pi}{4}H}$ (a) by a gradient flow with independent $K_1$ and $K_2$, (b) by a gradient flow with $K_1 = K_2^\dagger =: K$. (Color online)

Thus one may choose a gradient flow to minimize

$$f(K_1, K_2) := -\frac{1}{2^n} \operatorname{Re}\operatorname{tr}\{C^\dagger K_1 A K_2\} \tag{4.26}$$

by the coupled system

$$\dot K_1 = \operatorname{grad} f(K_1) = P(K_1 A K_2 C^\dagger) K_1, \qquad \dot K_2 = \operatorname{grad} f(K_2) = P(K_2 C^\dagger K_1 A) K_2. \tag{4.27}$$

So if $f(K_1, K_2) = -1$ can be reached, then $U(t) = e^{-itH}$ is locally reversible at time $t = \tau$. See Fig. 6 for examples comparing pointwise and universal local reversibility.

4.4. Intrinsic versus penalty approach: An example

So far, we have demonstrated that in quantum information and control constrained optimization tasks arise that lend themselves to Riemannian, i.e. intrinsic optimization methods. This is because the differential geometry of their constraint sets is well understood; in particular, many of their Riemannian quantities, like the exponential map, are given explicitly by well-known formulas. In other cases, however, the use of sophisticated tools from differential geometry may be too time-consuming. Therefore, it is sometimes advisable to combine intrinsic techniques with extrinsic methods, like a penalty term or an augmented Lagrange multiplier approach. Here, we only sketch how to incorporate a basic penalty term.

For instance, one may face the problem to maximize a quality function $f$ on the reachable set of a quantum system under additional state space constraints. An example amounts to finding the maximal unitary transfer from matrix (state) $A$ to $C$ subject to leaving another state $E$ invariant (provided $A$ and $E$ do not share


the same stabilizer group). Another variant amounts to optimizing the contrast between the transfer from A to C and the transfer from A to D; so the task is to maximize the transfer from A to C subject to suppressing the transfer from A to D. For tackling those types of problems, we address two basically different approaches: a purely intrinsic one and a combined method joining intrinsic and penalty-type techniques. Both methods will be briefly illustrated for the problem of maximizing the transfer from A to C while leaving E invariant, i.e.

$$\max_{U \in U(N)} \lvert \operatorname{tr}\{U A U^\dagger C^\dagger\} \rvert \quad \text{subject to} \quad U E U^\dagger = E. \tag{4.28}$$

It is straightforward to see that the stabilizer group

$$K_E := \{K \in U(N) \mid K E K^\dagger = E\} \tag{4.29}$$

of E forms a compact connected Lie subgroup of $U(N)$. Differentiating the identity $e^{tk} E e^{-tk} = E$ at $t = 0$ yields its Lie algebra

$$\mathfrak{k}_E := \{k \in \mathfrak{u}(N) \mid \operatorname{ad}_k(E) \equiv [k, E] = 0\}. \tag{4.30}$$
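As a simple illustration of Eq. (4.30), one may work out the case of a diagonal E by hand. The choice of a diagonal E, the symbols $\varepsilon_i$ for its diagonal entries and the multiplicities $n_1, \dots, n_r$ of its distinct entries are assumptions made only for this sketch.

```latex
% Illustrative assumption: E = diag(eps_1, ..., eps_N) with r distinct values
% occurring with multiplicities n_1, ..., n_r (n_1 + ... + n_r = N).
\begin{align*}
[k, E]_{ij} &= k_{ij}\,(\varepsilon_j - \varepsilon_i) = 0
  \quad\Longrightarrow\quad k_{ij} = 0 \ \text{ whenever } \ \varepsilon_i \neq \varepsilon_j , \\
\mathfrak{k}_E &\cong \mathfrak{u}(n_1) \oplus \cdots \oplus \mathfrak{u}(n_r), \qquad
K_E \cong U(n_1) \times \cdots \times U(n_r), \qquad
\dim \mathfrak{k}_E = \sum_{i=1}^{r} n_i^{2} .
\end{align*}
% Smallest case N = 2 with eps_1 != eps_2: k_E consists of the diagonal
% skew-Hermitian matrices, and K_E of the diagonal unitaries, U(1) x U(1).
```

In this special case the stabilizer group is a product of unitary groups, so the projection onto $\mathfrak{k}_E$ simply keeps the skew-Hermitian part of the diagonal blocks belonging to equal entries of E.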

By the Jacobi identity $[[\Omega_1, \Omega_2], E] + [[\Omega_2, E], \Omega_1] + [[E, \Omega_1], \Omega_2] = 0$ one can easily verify that $\mathfrak{k}_E$ is indeed a Lie subalgebra of $\mathfrak{u}(N)$. Moreover, from the compactness and connectedness of $K_E$ we conclude that the exponential map $\exp : \mathfrak{k}_E \to K_E$ is not only locally, but globally onto. Note, however, that this fact is not exploited in what follows. A set of generators of $\mathfrak{k}_E$ may constructively be found by solving a system of homogeneous linear equations, i.e.

$$\mathfrak{k}_E = \ker \operatorname{ad}_E \cap\, \mathfrak{u}(N) = \{k \in \mathfrak{u}(N) \mid (\mathbb{1} \otimes E - E^{\top} \otimes \mathbb{1})\operatorname{vec}(k) = 0\},$$

where $\operatorname{vec}(k)$ stacks the columns of k. In particular, if E is of the form $E = \mu \mathbb{1} + \Omega$ with $\mu \in \mathbb{C}$ and $\Omega \in \mathfrak{u}(N)$, then $\mathfrak{k}_E$ is identical to the centralizer of $\Omega$ in $\mathfrak{u}(N)$. By orthonormalizing the elements $k_j \in \mathfrak{k}_E$ of the generating set with $j = 1, 2, \dots, n_E$, one obtains the projectors $P_j := |k_j\rangle\langle k_j|$ (see also Eq. (3.88)), which add up to the total projection operator $P := \sum_j P_j$. With this definition, the gradient flow U2K of the summarizing Table 2 applies and solves Eq. (4.28). Therefore, the constraint of leaving a neutral state E invariant during the transfer from A to C can be approached intrinsically by restricting the flow from the full unitary group to a compact connected Lie subgroup, the stabilizer group $K_E$ of E.

However, it may be tedious to determine the stabilizer group $K_E$ in each and every practical instance and then project the gradients onto the corresponding subalgebra $\mathfrak{k}_E$. In [28], we therefore presented a combined approach based on the penalty function

$$L(U) = f_2(U) - \lambda\big(\operatorname{tr}\{E^\dagger U E U^\dagger\} - \lVert E \rVert_2^2\big) \tag{4.31}$$

with $f_2(U) := \lvert\operatorname{tr}\{C^\dagger U A U^\dagger\}\rvert^2$ and penalty term $\lambda(\operatorname{tr}\{E^\dagger U E U^\dagger\} - \lVert E \rVert_2^2)$. Here, the constraint $U E U^\dagger - E = 0$ was rewritten in the more convenient form $\operatorname{tr}\{E^\dagger U E U^\dagger\} - \lVert E \rVert_2^2 = 0$. The algorithm given in Table 2 as U2C implements a discretized gradient flow of L obtained from the identity

$$DL(U)(\Omega U) = \operatorname{tr}\big\{\big(2\,(f_C^{*}(U)\,[U A U^\dagger, C^\dagger])_S - \lambda\,[U E U^\dagger, E^\dagger]\big)\,\Omega\big\},$$

where $f_C(U) := \operatorname{tr}\{C^\dagger U A U^\dagger\}$ as in Table 2, so that $f_2 = \lvert f_C \rvert^2$. Note that the penalty parameter $\lambda$ is increased within the recursion to guarantee that the constraint is (at least approximately) satisfied in the limit.

Thus, for the constrained optimization task of maximizing the transfer from A to C subject to leaving the state E invariant, one has the choice of taking either the intrinsic approach U2K or the combined approach U2C. Note, however, that the intrinsic approach restricts the flow to the stabilizer group $K_E$ at all times, whereas the combined method is designed to start arbitrarily on $U(N)$ but finally to give an equilibrium point on $K_E$. Therefore, the intrinsic approach has the advantage that the constraint is (at least in principle) properly satisfied for the entire iteration. However, there are situations where an intrinsic method is impractical because its computational cost is too high. The combined method, in contrast, does not suffer from this shortcoming and thus has a wider range of applications. On the other hand, it is well known that simple penalty methods as presented above become ill-conditioned for large values of $\lambda$. Therefore, an augmented Lagrange multiplier approach may be a good alternative if numerical difficulties arise, cf. [86, 87].

Note that the intrinsic approach paves the way to perform (or approximate) a transfer from A to C robustly by taking $K_E$ as the stabilizer group resistant against a certain error class in the sense familiar from stabilizer codes [142–144]. The extrinsic approach, on the other hand, could be taken to transfer one protected state A to another one C via intermediate states that are no longer necessarily protected against errors as in the intrinsic case.

Finally, in [28, 113], we devised a penalty-type gradient flow algorithm for solving the constrained optimization

$$\max_{U} \lvert\operatorname{tr}\{C^\dagger U A U^\dagger\}\rvert \quad \text{subject to} \quad \operatorname{tr}\{D^\dagger U A U^\dagger\} = \min. \tag{4.32}$$

To this end, we introduced the penalty function

$$L(U) := \lvert\operatorname{tr}\{C^\dagger U A U^\dagger\}\rvert^2 - \lambda\, \lvert\operatorname{tr}\{D^\dagger U A U^\dagger\}\rvert^2, \tag{4.33}$$

to maximize the transfer from A to C while suppressing the transfer from A to D. This leads to the recursive scheme U3C in Table 2. For the relation of unconstrained and constrained gradient flows to the topic of C-numerical ranges and relative C-numerical ranges, see [113, 114, 145], where the latter explicitly compares gradient results with those of quadratic programming with quadratic constraints.

5. Conclusions

The ability to calculate optima of quality functions for quantum dynamical processes and to determine steerings in concrete experimental settings that actually

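Table 2. Examples of optimization tasks and related gradient flows.

I. Unconstrained optimization

Maximization over the orthogonal group: $O \in SO(N, \mathbb{R})$ and $A, \Delta \in \mathbb{R}^{N \times N}$ with $\Delta$ diagonal, $\alpha_k > 0$ stepsize

| No. | Target function | Discretized gradient flow | Ref. |
|---|---|---|---|
| O1 | $f(O) = \operatorname{tr}\{\Delta^{\top} O A O^{\top}\}$ | $O_{k+1} = \exp\{-\alpha_k [O_k A O_k^{\top}, \Delta^{\top}]\}\, O_k$ | [17, 22, 23] |

Maximization over the unitary group: $U, V \in SU(N)$ and $A, C \in \mathbb{C}^{N \times N}$; $[\cdot,\cdot]_S$ and $(\cdot)_S$ denote skew-Hermitian parts

| No. | Target function | Discretized gradient flow | Ref. |
|---|---|---|---|
| U1 | $f(U) = \operatorname{Re}\operatorname{tr}\{C^\dagger U A U^\dagger\}$ | $U_{k+1} = \exp\{-\alpha_k [U_k A U_k^\dagger, C^\dagger]_S\}\, U_k$ | [27, 28] |
| U2 | $f(U) = \lvert\operatorname{tr}\{C^\dagger U A U^\dagger\}\rvert^2$ | $U_{k+1} = \exp\{-\alpha_k ([A_k, C^\dagger] f^{*}(U_k) - [A_k, C^\dagger]^\dagger f(U_k))\}\, U_k$ where $A_k := U_k A U_k^\dagger$ | [27, 28] |
| U3 | $f(U, V) = \operatorname{Re}\operatorname{tr}\{C^\dagger U A V\}$ | $U_{k+1} = \exp\{-\alpha_k (U_k A V_k C^\dagger)_S\}\, U_k$; $V_{k+1} = \exp\{-\beta_k (V_k C^\dagger U_k A)_S\}\, V_k$ | [23, 29] |

Maximization restricted to subgroups $\mathbf{K} \subset U(N)$ of the unitary group with $K \in \mathbf{K}$ and $P_{\mathfrak{k}}$ as projection from $\mathfrak{gl}(N, \mathbb{C})$ onto $\mathfrak{k}$, i.e. the Lie algebra to $\mathbf{K}$

| No. | Target function | Discretized gradient flow | Ref. |
|---|---|---|---|
| U1K | $f(K) = \operatorname{Re}\operatorname{tr}\{C^\dagger K A K^\dagger\}$ | $K_{k+1} = \exp\{-\alpha_k P_{\mathfrak{k}} [K_k A K_k^\dagger, C^\dagger]\}\, K_k$ | [here$^{a}$] |
| U2K | $f(K) = \lvert\operatorname{tr}\{C^\dagger K A K^\dagger\}\rvert^2$ | $K_{k+1} = \exp\{-\alpha_k (P_{\mathfrak{k}}[A_k, C^\dagger] f^{*}(K_k) - P_{\mathfrak{k}}[A_k, C^\dagger]^\dagger f(K_k))\}\, K_k$ where $A_k := K_k A K_k^\dagger$ | [here$^{a}$] |
| U3K | $f(K_1, K_2) = \operatorname{Re}\operatorname{tr}\{C^\dagger K_1 A K_2\}$ | $K^{(1)}_{k+1} = \exp\{-\alpha_k P_{\mathfrak{k}} (K^{(1)}_k A K^{(2)}_k C^\dagger)\}\, K^{(1)}_k$; $K^{(2)}_{k+1} = \exp\{-\beta_k P_{\mathfrak{k}} (K^{(2)}_k C^\dagger K^{(1)}_k A)\}\, K^{(2)}_k$ | [29] |
| U4K | $\min_{U \in SU(n)} \big\lVert \sum_{j=1}^{N} U_j A_j U_j^{*} - A_0 \big\rVert$ | $U^{(j)}_{k+1} = \exp\{-\alpha_k [A^{(j)}_k, A^{*}_{0jk}]_s\}\, U^{(j)}_k$ where $A^{(j)}_k := U^{(j)}_k A_j U^{(j)*}_k$ and $A_{0jk} := A_0 - \sum_{\nu = 1,\, \nu \neq j}^{N} A^{(\nu)}_k$ | [141] |
| U5K | $\min_{U, V \in SU(n)} \big\lVert \sum_{j=1}^{N} U_j A_j V_j - A_0 \big\rVert$ | $U^{(j)}_{k+1} = \exp\{-\alpha_k (U^{(j)}_k A_j V^{(j)}_k A^{*}_{0jk})_s\}\, U^{(j)}_k$; $V^{(j)}_{k+1} = \exp\{-\beta_k (V^{(j)}_k A^{*}_{0jk} U^{(j)}_k A_j)_s\}\, V^{(j)}_k$ where $A_{0jk} := A_0 - \sum_{\nu = 1,\, \nu \neq j}^{N} U^{(\nu)}_k A_\nu V^{(\nu)}_k$ | [141] |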

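Table 2. (Continued)

Maximization restricted to homogeneous spaces G/H of the orthogonal group with $X \in G/H$ and A, C real symmetric

| No. | Target function | Discretized gradient flow | Ref. |
|---|---|---|---|
| O1P | $f(X) = \operatorname{tr}\{C X\}$ with $X_k := \operatorname{Ad}_{O_k}(A)$ | $X_{k+1} = e^{-\alpha_k [X_k, C]}\, X_k\, e^{+\alpha_k [X_k, C]}$ | [22, 119] |

Maximization restricted to homogeneous spaces G/H of the unitary group with $X \in G/H$ and A, C arbitrary complex square and $P_{\mathfrak{k}}$ as projection from $\mathfrak{gl}(N, \mathbb{C})$ onto $\mathfrak{k}$

| No. | Target function | Discretized gradient flow | Ref. |
|---|---|---|---|
| U1P | $f(X) = \operatorname{Re}\operatorname{tr}\{C^\dagger X\}$ with $X_k := \operatorname{Ad}_{U_k}(A)$ | $X_{k+1} = e^{-\alpha_k [X_k, C^\dagger]_S}\, X_k\, e^{+\alpha_k [X_k, C^\dagger]_S}$ | [here] |
| U1KP | $f(X) = \operatorname{Re}\operatorname{tr}\{C^\dagger X\}$ with $X_k := \operatorname{Ad}_{K_k}(A)$ | $X_{k+1} = e^{-\alpha_k P_{\mathfrak{k}} [X_k, C^\dagger]}\, X_k\, e^{+\alpha_k P_{\mathfrak{k}} [X_k, C^\dagger]}$ | [here] |

II. Constrained optimization

Maximizing $L(U)$ with penalty parameter $\lambda \in \mathbb{R}$ over the unitary group: $U \in SU(N)$; $A, C, D, E \in \mathbb{C}^{N \times N}$

| No. | Target function | Discretized gradient flow | Ref. |
|---|---|---|---|
| U1C | $L(U) = \operatorname{Re} f_C(U) - \lambda \operatorname{Im}^2 f_C(U)$ with $f_C(U) := \operatorname{tr}\{C^\dagger U A U^\dagger\}$ | $U_{k+1} = \exp\{-\alpha_k ([A_k, C^\dagger]_S + 2 i \lambda \operatorname{Im} f_C(U_k)\, [A_k, C^\dagger]_H)\}\, U_k$ where $A_k := U_k A U_k^\dagger$ and $X_{H,S} := \tfrac{1}{2}(X \pm X^\dagger)$ | [28] |
| U2C | $L(U) = \lvert f_C(U) \rvert^2 - \lambda (f_E(U) - \lVert E \rVert_2^2)$ with $f_C(U)$ (s.a.) and $f_E(U) := \operatorname{tr}\{E^\dagger U E U^\dagger\}$ | $U_{k+1} = \exp\{-\alpha_k ((2 f_C^{*}(U_k) [A_k, C^\dagger])_S - \lambda [E_k, E^\dagger])\}\, U_k$ where $A_k := U_k A U_k^\dagger$ and $E_k := U_k E U_k^\dagger$ | [28] |
| U3C | $L(U) = \lvert f_C(U) \rvert^2 - \lambda \lvert f_D(U) \rvert^2$ with $f_C(U)$ (s.a.) and $f_D(U) := \operatorname{tr}\{D^\dagger U A U^\dagger\}$ | $U_{k+1} = \exp\{-2 \alpha_k ((f_C^{*}(U_k) [A_k, C^\dagger])_S - \lambda (f_D^{*}(U_k) [A_k, D^\dagger])_S)\}\, U_k$ where $A_k := U_k A U_k^\dagger$ | [28] |

$^{a}$ Work presented in part at the MTNS 2006 [117].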
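The penalty-type recursion U3C of Table 2 can be tried out on a single qubit. The following is a minimal sketch in plain Python, not the authors' implementation: the 2×2 matrices `A`, `C`, `D`, the values of `lam` and `alpha`, the rule of accepting a step only if L increases (halving the step size otherwise), and all function names are assumptions made for illustration. The closed-form exponential used here holds only for traceless skew-Hermitian 2×2 matrices.

```python
import math

I2 = [[1 + 0j, 0j], [0j, 1 + 0j]]

def mm(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][m] * Y[m][j] for m in range(2)) for j in range(2)] for i in range(2)]

def dag(X):
    """Hermitian conjugate."""
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]

def tr(X):
    return X[0][0] + X[1][1]

def comm(X, Y):
    XY, YX = mm(X, Y), mm(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(2)] for i in range(2)]

def skew(X):
    """Skew-Hermitian part X_S = (X - X^dagger)/2."""
    Xd = dag(X)
    return [[(X[i][j] - Xd[i][j]) / 2 for j in range(2)] for i in range(2)]

def exp_su2(S):
    """exp(S) for traceless skew-Hermitian 2x2 S, using S^2 = -det(S) * 1."""
    det = (S[0][0] * S[1][1] - S[0][1] * S[1][0]).real
    theta = math.sqrt(max(det, 0.0))
    c = math.cos(theta)
    s = math.sin(theta) / theta if theta > 1e-12 else 1.0
    return [[c * I2[i][j] + s * S[i][j] for j in range(2)] for i in range(2)]

def transfer(M, U, A):
    """f_M(U) = tr{M^dagger U A U^dagger}."""
    return tr(mm(dag(M), mm(U, mm(A, dag(U)))))

def penalty_L(U, A, C, D, lam):
    """L(U) = |f_C(U)|^2 - lam |f_D(U)|^2, cf. Eq. (4.33)."""
    return abs(transfer(C, U, A)) ** 2 - lam * abs(transfer(D, U, A)) ** 2

def u3c_generator(U, A, C, D, lam):
    """(f_C^* [A_k, C^dagger])_S - lam (f_D^* [A_k, D^dagger])_S with A_k = U A U^dagger."""
    Ak = mm(U, mm(A, dag(U)))
    fC, fD = transfer(C, U, A), transfer(D, U, A)
    GC = skew([[fC.conjugate() * x for x in row] for row in comm(Ak, dag(C))])
    GD = skew([[fD.conjugate() * x for x in row] for row in comm(Ak, dag(D))])
    return [[GC[i][j] - lam * GD[i][j] for j in range(2)] for i in range(2)]

def u3c_flow(U, A, C, D, lam=1.0, alpha=0.2, steps=300):
    """Iterate U_{k+1} = exp{-2 alpha G_k} U_k; a step is kept only if L increases."""
    values = [penalty_L(U, A, C, D, lam)]
    for _ in range(steps):
        G = u3c_generator(U, A, C, D, lam)
        trial = mm(exp_su2([[-2 * alpha * g for g in row] for row in G]), U)
        L_trial = penalty_L(trial, A, C, D, lam)
        if L_trial > values[-1]:
            U = trial
            values.append(L_trial)
        else:
            alpha *= 0.5  # assumption: simple step halving instead of the rule of Sec. 3
    return U, values

# Hypothetical single-qubit instance: A = |0><0|, C = |+><+|, D = |1><1|.
A = [[1.0, 0.0], [0.0, 0.0]]
C = [[0.5, 0.5], [0.5, 0.5]]
D = [[0.0, 0.0], [0.0, 1.0]]
U_final, L_values = u3c_flow(I2, A, C, D)
print(abs(transfer(C, U_final, A)), abs(transfer(D, U_final, A)))
```

In this toy instance the unconstrained maximum of the transfer to C would put half of the weight on D; with the penalty switched on, the flow settles at a compromise with a smaller transfer to D. For larger N the closed-form exponential would have to be replaced by a general matrix exponential.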


Summary: General Gradient Algorithm for Steepest Ascent on Riemannian Manifolds

Requirements: Riemannian manifold M, e.g., Lie group G with (bi-invariant) metric $\langle \cdot \mid \cdot \rangle$ or its group orbits; smooth target function $f : M \to \mathbb{R}$; associated gradient system $\dot{X} = \operatorname{grad} f(X)$.
Input: initial state $X(0) \in M$, parameters for target function.
Output: sequence of iterative pairs $\{(X_k, f(X_k))\}$ approximating critical points $X_*$ and their critical values $f(X_*)$.
Initialization: If possible, generate generic initial state $X_0$, e.g., for compact Lie groups pick random $G_0 \in G$ according to Haar measure (for $SU(N)$ see [140]) and set $X_0 := G_0 \cdot X(0)$, otherwise identify $X_0 := X(0)$; calculate $f(X_0)$, $\operatorname{grad} f(X_0)$, and step size $\alpha_0$ according to Sec. 3.
Recursion: while $k = 0, 1, 2, \dots, k_{\rm limit}$ and $\alpha_k > \alpha_{\rm threshold} > 0$ do
  1: iterate $X_{k+1} = \exp_{X_k}(\alpha_k \operatorname{grad} f(X_k))$ according to examples in Table 2.
  2: calculate $f(X_{k+1})$.
  3: update step size $\alpha_{k+1}$ according to Sec. 3.
  4: go to step 1.
end

Fig. 7. Summarizing scheme for steepest-ascent gradient flows on Riemannian manifolds. For related methods, like conjugate gradients, Jacobi- or Newton-type schemes, step (1) has to be modified in a straightforward way according to Sec. 2; for details see [20, 62, 63]. If the dynamic stepsize selection of Sec. 3 is too costly in terms of CPU time, one may start out with constant stepsizes, and halve them whenever $f(X_{k+1}) - f(X_k) \leq 0$, cf. Armijo's rule. In cases where local extrema exist (see Sec. 3), make sure to run with a sufficient number of generic initial conditions.
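The recursion of Fig. 7 can be sketched in a few lines of plain Python, here instantiated with the flow O1 of Table 2 on SO(2). This is an illustrative sketch under stated assumptions rather than a reference implementation: the matrices `A` and `DELTA`, the default parameters, the function names, and the choice of discarding a trial point whenever the step size is halved are all assumptions. The step-size rule is the simple halving rule mentioned in the caption, not the dynamic selection of Sec. 3, and the initial state is the identity rather than a Haar-random one.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][m] * Y[m][j] for m in range(2)) for j in range(2)] for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def trace(X):
    return X[0][0] + X[1][1]

def steepest_ascent(x0, f, step, alpha0=0.2, k_limit=500, alpha_threshold=1e-12):
    """Loop of Fig. 7; step(x, alpha) must return exp_x(alpha * grad f(x)).

    Returns the list of accepted pairs (x_k, f(x_k)).
    """
    x, fx, alpha = x0, f(x0), alpha0
    history = [(x, fx)]
    k = 0
    while k < k_limit and alpha > alpha_threshold:
        x_new = step(x, alpha)
        f_new = f(x_new)
        if f_new - fx <= 0:
            alpha *= 0.5              # no ascent: halve the step size, keep the old point
        else:
            x, fx = x_new, f_new      # ascent: accept the new point
            history.append((x, fx))
        k += 1
    return history

# Hypothetical instance of O1: A real symmetric, DELTA real diagonal.
A = [[2.0, 1.0], [1.0, 3.0]]
DELTA = [[2.0, 0.0], [0.0, 1.0]]

def f_O1(O):
    """f(O) = tr{DELTA^T O A O^T}."""
    return trace(matmul(transpose(DELTA), matmul(O, matmul(A, transpose(O)))))

def step_O1(O, alpha):
    """O_{k+1} = exp{-alpha [O A O^T, DELTA^T]} O, using the 2x2 rotation formula."""
    X = matmul(O, matmul(A, transpose(O)))
    Dt = transpose(DELTA)
    XD, DX = matmul(X, Dt), matmul(Dt, X)
    # For symmetric X and diagonal DELTA the commutator equals c * [[0, -1], [1, 0]].
    c = XD[1][0] - DX[1][0]
    theta = -alpha * c
    R = [[math.cos(theta), -math.sin(theta)], [math.sin(theta), math.cos(theta)]]
    return matmul(R, O)

history = steepest_ascent([[1.0, 0.0], [0.0, 1.0]], f_O1, step_O1)
print(len(history), history[-1][1])
```

For this instance the iterates should approach the value obtained by pairing the ordered eigenvalues of `A` with the ordered diagonal entries of `DELTA`, which gives a simple check of the sketch. Other rows of Table 2 fit the same loop by supplying a different `f` and `step`.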

achieve these optima is tantamount to exploiting and manipulating quantum eﬀects in future technology. To this end, we have presented a comprehensive account of gradient ﬂows on Riemannian manifolds (see general scheme of Fig. 7) allowing for generically convergent quantum optimization algorithms — an ample array of explicit examples being given in Table 2. Since the state spaces of quantum dynamical systems can often be represented by smooth manifolds, the uniﬁed foundations given here are also illustrated by many applications for numerically addressing optimization tasks in quantum information and quantum control. In the present work, a variety of applications are addressed by relating the dynamics to Lie group actions of the unitary group and its closed subgroups, which also includes recent least-squares approximations by a sum of several elements on independent matrix orbits [141] given as instances U4K and U5K in Table 2. Since symmetries give rise to stabilizer groups, particular attention has been paid to gradient ﬂows on homogeneous spaces.


Theory and algorithms have been structured and tailored for the following scenarios:
(i) for Lie groups with bi-invariant metric,
(ii) for closed subgroups,
(iii) for compact Riemannian symmetric spaces, or, more generally,
(iv) for naturally reductive homogeneous spaces.

As soon as the homogeneous spaces are no longer naturally reductive, the “standard” way of representing geodesics on quotient spaces (by projecting geodesics from the group level to the quotient) fails. Alternatives of local approximations have been sketched in these cases in order to structure future developments.

Techniques based on the Riemannian exponential are easy to implement on Lie groups (with bi-invariant metric) and their closed subgroups. In particular, gradient flows on subgroups of the unitary group with tensor product structure allow one to address different partitionings of m-party quantum systems, the finest one being the group of purely local operations SU(2) ⊗ SU(2) ⊗ · · · ⊗ SU(2). The corresponding gradient flows have several applications in quantum dynamics: for instance, they prove useful tools to decide whether effective multi-qubit interaction Hamiltonians generate time evolutions that can be reversed in the sense of Hahn’s spin echo solely by local operations. As a new application, gradient flows on SU(N1) ⊗ SU(N2) ⊗ · · · ⊗ SU(Nm) turned out to be a valuable and reliable alternative to conventional tensor-SVD methods for determining best rank-1 tensor approximations to higher-order tensors. In the case of m-party multipartite pure quantum states, they can readily be applied to optimizing entanglement witnesses.

Double-bracket flows have been characterized as a special case of gradient flows on naturally reductive homogeneous orbit spaces. Here, in view of using gradient techniques for ground-state calculations [36], it is important to note that double-bracket flows can also be established for any closed subgroup of SU(N): by allowing for different partitionings SU(N1) ⊗ SU(N2) ⊗ · · · ⊗ SU(Nm), one may set up a common frame to compare different types of unitary networks [36, 50] for calculating and simulating large-scale quantum systems.
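The link between the discretized scheme O1P of Table 2 and a continuous double-bracket flow can be made explicit by expanding the recursion to first order in the step size. The following is a standard expansion, given here only as a sketch in the notation of Table 2.

```latex
% First-order expansion of O1P in the step size alpha_k,
% using e^{-S} X e^{+S} = X - [S, X] + O(S^2) with S = alpha_k [X_k, C].
\begin{align*}
X_{k+1} &= e^{-\alpha_k [X_k, C]}\, X_k\, e^{+\alpha_k [X_k, C]} \\
        &= X_k - \alpha_k \big[[X_k, C], X_k\big] + O(\alpha_k^2) \\
        &= X_k + \alpha_k \big[X_k, [X_k, C]\big] + O(\alpha_k^2),
\end{align*}
% so that in the limit alpha_k -> 0 the recursion follows
\[
\dot{X} = \big[X, [X, C]\big].
\]
```

Restricting the generator by a projection onto a subalgebra, as in U1KP, changes the inner bracket accordingly while the conjugation structure of the recursion stays the same.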
Moreover, we have shown how techniques of restricting a gradient flow to subgroups also prove a useful tool for addressing constrained optimization tasks by ensuring the constraints are fulfilled intrinsically. As an alternative, we have sketched gradient flows that respect constraints extrinsically, e.g., by way of penalty-type parameters. These methods await application, e.g., in error correction and robust state transfer.

Finally, in a follow-up study, we discussed the dynamics of open quantum systems in terms of Lie semigroups [59], including relations between the theory of Lie semigroups and completely positive semigroups. In particular in open systems, an easy characterization of reachable sets arises only in very simple cases. It thus poses a current limit to an abstract optimization approach on reachable sets. However, in these cases, gradient-assisted optimal control methods again prove valuable.


Therefore, not only does the current work give the justification to some recent developments, it also provides new techniques to the field of quantum dynamics. It shows how to exploit the differential geometry in Lie-theoretical terms for optimization on quantum-state manifolds. Thus the comprehensive theoretical treatment, illustrated by known examples and new practical applications, has been given to fill a gap. We anticipate that the ample array of methods and their exemplifications will find broad application, in particular since tensor approximations begin to play a key role in tensor-network approaches. They are used to approximate ground-state energies (Rayleigh coefficients) of large-system Hamiltonians exceeding the memory capacity of any (classical) computer hardware [36, 41–50]. The account of theoretical foundations is also meant to structure and trigger further basic research, thus widening the set of useful tools.

Acknowledgments

Fruitful discussion with Jens Eisert on [36] is gratefully acknowledged. We wish to thank Otfried Gühne for drawing our attention to witness optimization and Ref. [55]. This work was supported in part by the integrated EU programmes QAP, Q-ESSENCE and the exchange with COQUIT, as well as by Deutsche Forschungsgemeinschaft, DFG, in the incentives SPP 1078 and SFB 631. Support and exchange enabled by the two Bavarian PhD programmes of excellence Quantum Computing, Control, and Communication (QCCC) as well as Identification, Optimization and Control with Applications in Modern Technologies is gratefully acknowledged.

References

[1] R. P. Feynman, Simulating physics with computers, Int. J. Theoret. Phys. 21 (1982) 467–488.
[2] R. P. Feynman, Feynman Lectures on Computation (Perseus Books, Reading, MA, 1996).
[3] A. Y. Kitaev, A. H. Shen and M. N. Vyalyi, Classical and Quantum Computation (American Mathematical Society, Providence, 2002).
[4] P. W. Shor, Algorithms for quantum computation: Discrete logarithms and factoring, in Proceedings of the Symposium on the Foundations of Computer Science (1994), Los Alamitos, California, USA (IEEE Computer Society Press, New York, 1994), pp. 124–134.
[5] P. W. Shor, Polynomial-time algorithms for prime factorisation and discrete logarithm on a quantum computer, SIAM J. Comput. 26 (1997) 1484–1509.
[6] R. Jozsa, Quantum algorithms and the Fourier transform, Proc. R. Soc. A 454 (1998) 323–337.
[7] R. Cleve, A. Ekert, C. Macchiavello and M. Mosca, Quantum algorithms revisited, R. Soc. Lond. Proc. Ser. A Math. Phys. Eng. 454 (1998) 339–354.
[8] M. Ettinger, P. Høyer and E. Knill, The quantum query complexity of the hidden subgroup problem is polynomial, Inf. Process. Lett. 91 (2004) 43–48.
[9] L. K. Grover, A fast quantum mechanical algorithm for database search, in Proceedings of the 28th Annual Symposium on the Theory of Computing (1996), Philadelphia, Pennsylvania, USA (ACM Press, New York, 1996), pp. 212–219.


[10] L. K. Grover, Quantum mechanics helps in searching for a needle in a haystack, Phys. Rev. Lett. 79 (1997) 325–328.
[11] C. H. Papadimitriou, Computational Complexity (Addison Wesley, Reading MA, 1995).
[12] S. Sachdev, Quantum Phase Transitions (Cambridge University Press, Cambridge, 1999).
[13] E. Jané, G. Vidal, W. Dür, P. Zoller and J. Cirac, Simulation of quantum dynamics with quantum optical systems, Quant. Inf. Computation 3 (2003) 15–37.
[14] D. Porras and J. Cirac, Effective quantum spin systems with trapped ions, Phys. Rev. Lett. 92 (2004) 207901.
[15] J. Dowling and G. Milburn, Quantum technology: The second quantum revolution, Phil. Trans. R. Soc. Lond. A 361 (2003) 1655–1674.
[16] H. M. Wiseman and G. J. Milburn, Quantum Measurement and Control (Cambridge University Press, Cambridge, 2009).
[17] R. W. Brockett, Dynamical systems that sort lists, diagonalise matrices, and solve linear programming problems, in Proc. IEEE Decision Control (1988), Austin, Texas, USA (1988), pp. 779–803; reproduced in: Lin. Alg. Appl. 146 (1991) 79–91.
[18] R. W. Brockett, Least-squares matching problems, Lin. Alg. Appl. 122(4) (1989) 761–777.
[19] R. W. Brockett, Differential geometry and the design of gradient algorithms, Proc. Symp. Pure Math. 54 (1993) 69–91.
[20] S. T. Smith, Geometric optimization methods for adaptive filtering, PhD Thesis, Harvard University, Cambridge MA (1993).
[21] S. T. Smith, Hamiltonian and Gradient Flows, Algorithms and Control, Fields Institute Communications (American Mathematical Society, Providence, 1994), pp. 113–136, chap. Optimization techniques on Riemannian manifolds.
[22] U. Helmke and J. B. Moore, Optimization and Dynamical Systems (Springer, Berlin, 1994).
[23] A. Bloch (ed.), Hamiltonian and Gradient Flows, Algorithms and Control, Fields Institute Communications (American Mathematical Society, Providence, 1994).
[24] M. T. Chu and K. R. Driessel, The projected gradient method for least-squares matrix approximations with spectral constraints, SIAM J. Numer. Anal. 27 (1990) 1050–1060.
[25] P. A. Absil, R. Mahony and R. Sepulchre, Optimization Algorithms on Matrix Manifolds (Princeton University Press, Princeton, 2008).
[26] L. Ambrosio, N. Gigli and G. Savaré, Gradient Flows in Metric Spaces and in the Space of Probability Measures, Lectures in Mathematics, 2nd edn. (ETH-Zürich, Birkhäuser, Basel, 2008).
[27] S. J. Glaser, T. Schulte-Herbrüggen, M. Sieveking, O. Schedletzky, N. C. Nielsen, O. W. Sørensen and C. Griesinger, Unitary control in quantum ensembles: Maximising signal intensity in coherent spectroscopy, Science 280 (1998) 421–424.
[28] T. Schulte-Herbrüggen, Aspects and prospects of high-resolution NMR, PhD Thesis, Diss-ETH 12752, Zürich (1998).
[29] T. Schulte-Herbrüggen and A. Spörl, Which quantum evolutions can be reversed by local unitary operations? Algebraic classification and gradient-flow based numerical checks (2006); http://arXiv.org/pdf/quant-ph/0610061.
[30] N. Khaneja, T. Reiss, C. Kehlet, T. Schulte-Herbrüggen and S. J. Glaser, Optimal control of coupled spin dynamics: Design of NMR pulse sequences by gradient ascent algorithms, J. Magn. Reson. 172 (2005) 296–305.


[31] T. Schulte-Herbrüggen, A. K. Spörl, N. Khaneja and S. J. Glaser, Optimal control-based efficient synthesis of building blocks of quantum algorithms: A perspective from network complexity towards time complexity, Phys. Rev. A 72 (2005) 042331.
[32] A. K. Spörl, T. Schulte-Herbrüggen, S. J. Glaser, V. Bergholm, M. J. Storcz, J. Ferber and F. K. Wilhelm, Optimal control of coupled Josephson qubits, Phys. Rev. A 75 (2007) 012302.
[33] T. Schulte-Herbrüggen, A. Spörl, N. Khaneja and S. Glaser, Optimal control for generating quantum gates in open dissipative systems (2006); http://arXiv.org/pdf/quant-ph/0609037.
[34] P. Rebentrost, I. Serban, T. Schulte-Herbrüggen and F. Wilhelm, Optimal control of a qubit coupled to a non-Markovian environment, Phys. Rev. Lett. 102 (2009) 090401.
[35] M. Grace, C. Brif, H. Rabitz, I. Walmsley, R. Kosut and D. Lidar, Optimal control of quantum gates and suppression of decoherence in a system of interacting two-level particles, J. Phys. B: At. Mol. Opt. Phys. 40 (2007) S103–S125.
[36] C. Dawson, J. Eisert and T. J. Osborne, Unifying variational methods for simulating quantum many-body systems, Phys. Rev. Lett. 100 (2008) 130501.
[37] T. Huckle, K. Waldherr and T. Schulte-Herbrüggen, Unifying large-scale tensor approximations – Concepts and algorithms (2010); to be submitted.
[38] M. Plenio, J. Eisert, J. Dreissig and M. Cramer, Entropy, entanglement, and area: Analytical results for harmonic lattice systems, Phys. Rev. Lett. 94 (2003) 060503.
[39] M. Cramer, J. Eisert, M. Plenio and J. Dreissig, Entanglement-area law for general bosonic harmonic lattice systems, Phys. Rev. A 73 (2006) 012309.
[40] M. Wolf, F. Verstraete, M. B. Hastings and I. Cirac, Area laws in quantum systems: Mutual information and correlations, Phys. Rev. Lett. 100 (2008) 070502.
[41] M. Fannes, B. Nachtergaele and R. Werner, Abundance of translation invariant pure states on quantum spin chains, Lett. Math. Phys. 25 (1992) 249–258.
[42] M. Fannes, B. Nachtergaele and R. F. Werner, Finitely correlated states on quantum spin chains, Comm. Math. Phys. 144 (1992) 443–490.
[43] I. Peschel, X. Wang, M. Kaulke and K. Hallberg (eds), Density-Matrix Renormalization: A New Numerical Method in Physics, Lecture Notes in Physics, Vol. 528 (Springer, Berlin, 1999).
[44] U. Schollwöck, The density-matrix renormalization group, Rev. Mod. Phys. 77 (2005) 259–315.
[45] B. Schumacher and R. Werner, Reversible quantum cellular automata (2004); http://arXiv.org/pdf/quant-ph/0405174.
[46] F. Verstraete, D. Porras and I. Cirac, DMRG and periodic boundary conditions: A quantum information perspective, Phys. Rev. Lett. 93 (2004) 227205.
[47] S. Anders, M. B. Plenio, W. Dür, F. Verstraete and H. J. Briegel, Ground-state approximation for strongly interacting spin systems in arbitrary spatial dimension, Phys. Rev. Lett. 97 (2006) 107206.
[48] G. Vidal, Entanglement renormalization, Phys. Rev. Lett. 99 (2007) 220405.
[49] N. Schuch, M. Wolf, F. Verstraete and I. Cirac, Strings, projected entangled pair states, and variational Monte Carlo methods, Phys. Rev. Lett. 100 (2008) 040501.
[50] R. Hübner, C. Kruszynska, L. Hartmann, W. Dür, F. Verstraete, J. Eisert and M. Plenio, Renormalization algorithm with graph enhancement, Phys. Rev. A 79 (2009) 022317.
[51] F. Wegner, Flow-equations for Hamiltonians, Ann. Phys. (Leipzig) 3 (1994) 77–91.
[52] S. Kehrein, The Flow-Equation Approach to Many-Particle Systems, Springer Tracts in Physics, Vol. 217 (Springer, Berlin, 2006).


[53] M. B. Plenio and S. Virmani, An introduction to entanglement measures, Quant. Comp. Inf. 7 (2007) 1–51.
[54] R. Horodecki, P. Horodecki, M. Horodecki and K. Horodecki, Quantum entanglement, Rev. Mod. Phys. 81 (2009) 865–942.
[55] J. Eisert, P. Hyllus, O. Gühne and M. Curty, Complete hierarchies of efficient approximations to problems in entanglement theory, Phys. Rev. A 70 (2004) 062317.
[56] R. Lohmayer, A. Osterloh, J. Siewert and A. Uhlmann, Entangled three-qubit states without concurrence and three-tangle, Phys. Rev. Lett. 97 (2006) 260502.
[57] A. Osterloh, J. Siewert and A. Uhlmann, Tangles of superpositions and the convex-roof extension, Phys. Rev. A 77 (2008) 032210.
[58] C. Eltschka, A. Osterloh, J. Siewert and A. Uhlmann, Three-tangle for mixtures of generalized GHZ and generalized W states, New J. Phys. 10 (2008) 043014.
[59] G. Dirr, U. Helmke, I. Kurniawan and T. Schulte-Herbrüggen, Lie semigroup structures for reachability and control of open quantum systems, Rep. Math. Phys. 64 (2009) 93–121; http://arXiv.org/pdf/0811.3906.
[60] M. M. Wolf and J. I. Cirac, Dividing quantum channels, Comm. Math. Phys. 279 (2008) 147–168.
[61] C. Udrişte, Convex Functions and Optimization Methods on Riemannian Manifolds (Kluwer, Dordrecht, 1994).
[62] D. Gabay, Minimizing a differential function over a differential manifold, J. Optim. Theory Appl. 37 (1982) 177–219.
[63] M. Kleinsteuber, Jacobi-type methods on semisimple Lie algebras – A Lie algebraic approach to the symmetric eigenvalue problem, PhD Thesis, Universität Würzburg (2006).
[64] J. Nocedal, Updating quasi-Newton matrices with limited storage, Math. Comp. 35 (1980) 773–782.
[65] R. H. Byrd, P. Lu and R. B. Schnabel, Representation of quasi-Newton matrices and their use in limited memory methods, Math. Program. 63 (1994) 129–156.
[66] J. Nocedal and S. J. Wright, Numerical Optimization, 2nd edn. (Springer, New York, 2006).
[67] V. Jurdjevic, Geometric Control Theory (Cambridge University Press, Cambridge, 1997).
[68] Y. L. Sachkov, Controllability of invariant systems on Lie groups and homogeneous spaces, J. Math. Sci. 100 (2000) 2355–2427.
[69] G. Dirr and U. Helmke, Lie theory for quantum control, GAMM-Mitteilungen 31 (2008) 59–93.
[70] D. D’Alessandro, Introduction to Quantum Control and Dynamics (Chapman & Hall/CRC, Boca Raton, 2008).
[71] V. F. Krotov, Global Methods in Optimal Control (Marcel Dekker, New York, 1996).
[72] A. Peirce, M. Dahleh and H. Rabitz, Optimal control of quantum mechanical systems: Existence, numerical approximations and applications, Phys. Rev. A 37 (1987) 4950–4962.
[73] K. L. Teo, C. J. Goh and K. H. Wong, A Unified Computational Approach to Optimal Control Problems (Chapman & Hall/CRC, Boca Raton, 1991).
[74] Y. Maday and G. Turinici, New formulation of monotonically convergent quantum control algorithms, J. Chem. Phys. 118 (2003) 8191–8196.
[75] H. Sussmann and V. Jurdjevic, Controllability of nonlinear systems, J. Differential Equations 12 (1972) 95–116.
[76] V. Jurdjevic and H. Sussmann, Control systems on Lie groups, J. Differential Equations 12 (1972) 313–329.


[77] A. Agrachev and T. Chambrion, An estimation of the controllability time for single-input systems on compact Lie groups, ESAIM Control Optim. Calc. Var. 12 (2006) 409–441.
[78] R. W. Brockett, System theory on group manifolds and coset spaces, SIAM J. Control 10 (1972) 265–284.
[79] R. W. Brockett, Lie theory and control systems defined on spheres, SIAM J. Appl. Math. 25 (1973) 213–225.
[80] W. M. Boothby and E. N. Wilson, Determination of the transitivity of bilinear systems, SIAM J. Control Optim. 17 (1979) 212–221.
[81] F. Albertini and D. D’Alessandro, Notions of controllability for bilinear multilevel quantum systems, IEEE Trans. Automat. Control 48 (2003) 1399–1403.
[82] R. Zeier, U. Sander and T. Schulte-Herbrüggen, Symmetry in quantum system theory of multi-qubit systems, in Proc. 19th MTNS, Budapest, Hungary (2010), in press.
[83] U. Sander and T. Schulte-Herbrüggen, Symmetry in quantum system theory of multi-qubit systems: Rules for quantum architecture design (2009); http://arXiv.org/pdf/0904.4654.
[84] M. W. Hirsch and S. Smale, Differential Equations, Dynamical Systems, and Linear Algebra (Academic Press, San Diego, 1974).
[85] M. C. Irwin, Smooth Dynamical Systems (Academic Press, New York, 1980).
[86] R. Fletcher, Practical Methods of Optimization, 2nd edn. (Wiley & Sons, Chichester, 1987).
[87] D. G. Luenberger and Y. Ye, Linear and Nonlinear Programming, 3rd edn. (Springer, Berlin, 2008).
[88] W. Boothby, An Introduction to Differential Manifolds and Riemannian Geometry (Academic Press, New York, 1975).
[89] S. Gallot, D. Hulin and J. Lafontaine, Riemannian Geometry, 3rd edn. (Universitext, Springer, Berlin, 2004).
[90] M. Spivak, A Comprehensive Introduction to Differential Geometry, Vols. I–II, 3rd edn. (Publish or Perish, Houston, 1999).
[91] B. O’Neill, Semi-Riemannian Geometry (Academic Press, San Diego, 1983).
[92] R. Abraham, J. E. Marsden and T. Ratiu, Manifolds, Tensor Analysis and Applications, 2nd edn. (Springer, New York, 1988).
[93] J. Palis and W. de Melo, Geometric Theory of Dynamical Systems (Springer, New York, 1982).
[94] S. Łojasiewicz, Sur les Trajectoires du Gradient d’une Fonction Analytique. Seminari di Geometria 1982–1983, Università di Bologna, Istituto di Geometria, Dipartimento di Matematica (1984).
[95] K. Kurdyka, On gradients of functions definable in O-minimal structures, Ann. Inst. Fourier 48 (1998) 769–783.
[96] S. Kobayashi and K. Nomizu, Foundations of Differential Geometry, Vols. I–II (Wiley Interscience, New York, 1996).
[97] F. Takens, A solution, in Manifolds – Amsterdam 1970, ed. N. Kuiper, Lecture Notes in Math., Vol. 197 (Springer, New York, 1971), p. 231.
[98] C. Lageman, Convergence of gradient-like dynamical systems and optimization algorithms, PhD Thesis, Universität Würzburg (2007).
[99] S. Helgason, Differential Geometry, Lie Groups, and Symmetric Spaces (Academic Press, New York, 1978).
[100] B. C. Hall, Lie Groups, Lie Algebras, and Representations (Springer, New York, 2003).


[101] J. J. Duistermaat and J. A. C. Kolk, Lie Groups (Springer, New York, 2000).
[102] A. Arvanitoyeorgos, An Introduction to Lie Groups and the Geometry of Homogeneous Spaces (American Mathematical Society, Providence, 2003).
[103] A. W. Knapp, Lie Groups beyond an Introduction, 2nd edn. (Birkhäuser, Boston, 2002).
[104] J. Milnor, Curvatures of left invariant metrics on Lie groups, Adv. Math. 21 (1976) 293–329.
[105] J. Cheeger and D. G. Ebin, Comparison Theorems in Riemannian Geometry (North-Holland, Amsterdam, 1975).
[106] T. Bröcker and T. tom Dieck, Representation of Compact Lie Groups (Springer, New York, 1985).
[107] O. Kowalski and J. Szenthe, On the existence of homogeneous geodesics in homogeneous Riemannian manifolds, Geom. Dedicata 81 (2000) 209–214; Erratum, ibid. 84 (2001) 331.
[108] A. Besse, Einstein Manifolds (Springer, Berlin, 1986).
[109] B. Kostant, Holonomy and Lie algebra of motions in Riemannian manifolds, Trans. Amer. Math. Soc. 80 (1955) 520–542.
[110] U. Helmke, K. Hüper and J. Trumpf, Newton’s method on Grassmann manifolds (2007); http://arXiv.org/pdf/0709.2205.
[111] M. Goldberg and E. Straus, Elementary inclusion relations for generalized numerical ranges, Linear Algebra Appl. 18 (1977) 1–24.
[112] C.-K. Li, C-numerical ranges and C-numerical radii, Lin. Multilin. Alg. 37 (1994) 51–82.
[113] T. Schulte-Herbrüggen, G. Dirr, U. Helmke, M. Kleinsteuber and S. Glaser, The significance of the C-numerical range and the local C-numerical range in quantum control and quantum information, Lin. Multilin. Alg. 56 (2008) 3–26.
[114] G. Dirr, U. Helmke, M. Kleinsteuber and T. Schulte-Herbrüggen, Relative C-numerical ranges for applications in quantum control and quantum information, Lin. Multilin. Alg. 56 (2008) 27–51.
[115] C.-K. Li and N. K. Tsing, Matrices with circular symmetry on their unitary orbits and C-numerical ranges, Proc. Amer. Math. Soc. 111 (1991) 19–28.
[116] U. Helmke, K. Hüper, J. B. Moore and T. Schulte-Herbrüggen, Gradient flows computing the C-numerical range with applications in NMR spectroscopy, J. Global Optim. 23 (2002) 283–308.
[117] G. Dirr, U. Helmke, M. Kleinsteuber, S. Glaser and T. Schulte-Herbrüggen, The local C-numerical range: Examples, conjectures and numerical algorithms, in Proc. MTNS (2006), Kyoto, Japan (2006), pp. 1419–1426.
[118] G. Dirr, U. Helmke, M. Kleinsteuber and T. Schulte-Herbrüggen, A new type of C-numerical range arising in quantum computing, PAMM 6 (2006) 711–712; Special issue on 80th Annual Meeting GAMM.
[119] A. Bloch, R. W. Brockett and T. Ratiu, A new formulation of the generalized Toda lattice equations and their fix-point analysis via the moment map, Bull. Am. Math. Soc. 56 (1990) 447–451.
[120] A. Bloch, R. W. Brockett and T. Ratiu, Completely integrable gradient flows, Comm. Math. Phys. 147 (1992) 57–74.
[121] R. Bertlmann, H. Narnhofer and W. Thirring, A geometric picture of entanglement and Bell inequalities, Phys. Rev. A 66 (2002) 032319.
[122] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information (Cambridge University Press, Cambridge, UK, 2000).

July 12, J070-S0129055X10004053

666

2010 12:0 WSPC/S0129-055X

148-RMP

T. Schulte-Herbr¨ uggen et al.

[123] O. Curtef, G. Dirr and U. Helmke, Conjugate gradient algorithms for best rank-1 approximation of tensors, PAMM 7 (2008) 1062201–1062202; Proceedings of the ICIAM (2007), Z¨ urich, Switzerland. [124] J. von Neumann, Some matrix-inequalities and metrization of matrix-space, Tomsk Univ. Rev. 1 (1937) 286–300; reproduced in John von Neumann: Collected Works, Vol. IV: Continuous geometry and other topics, ed. A. H. Taub (Pergamon Press, Oxford, 1962), pp. 205–219. [125] O. Sørensen, Polarization transfer experiments in high-resolution NMR spectroscopy, Prog. NMR Spectrosc. 21 (1989) 503–569. ˙ [126] D. Markham, J. A. Miszczak, Z. Puchala and K. Zyczkowski, Quantum state discrimination: A geometric approach, Phys. Rev. A 77 (2008) 042111. [127] J. Stoustrup, O. Schedletzky, S. J. Glaser, C. Griesinger, N. C. Nielsen and O. W. Sørensen, Generalized bound on quantum dynamics: Eﬃciency of unitary transformations between non-Hermitian states, Phys. Rev. Lett. 74 (1995) 2921–2924. [128] T. Kolda, Orthogonal tensor decompositions, SIAM J. Matrix Anal. Appl. 23 (2001) 243–255. [129] T. Zhang and G. H. Golub, Rank-one approximation to higher-order tensors, SIAM. J. Matrix Anal. Appl. 23 (2001) 534–550. [130] L. de Lathauwer, B. de Moor and J. Vandewalle, A multilinear singular value decomposition, SIAM J. Matrix Anal. Appl. 21 (2000) 1253–1278. [131] L. de Lathauwer, B. de Moor and J. Vandewalle, On the best rank-1 and rank(R1 , R2 , . . . , Rn ) approximation of higher-order tensors, SIAM J. Matrix Anal. Appl. 21 (2000) 1324–1342. [132] L. Eld´en and B. Savas, A Newton–Grassmann method for computing the best multilinear rank-(R1 , R2 , R3 ) approximation of a tensor, SIAM J. Matrix Anal. Appl. 31 (2009) 248–271. [133] B. Savas and L. H. Lim, Quasi-Newton methods on Grassmannians and multilinear approximations of tensors, Optimization Online 2009 (2009) 2362; arXiv:0907.2214. [134] O. Curtef, G. Dirr and U. 
Helmke, Riemannian optimization on tensor manifolds: Applications to generalized Rayleigh quotients (2010); arXiv:1005.4854. [135] M. Ishteva, L. D. Lathauwer, P. A. Absil and S. V. Huﬀel, Diﬀerential-geometric Newton method for the best rank-(R1 , R2 , R3 ) approximation of tensors, Numer. Algorithms 51 (2009) 179–194; Tributes to Gene H. Golub, Part II. [136] T. Wei and P. Goldbart, Geometric measure of entanglement and applications to bipartite and multipartite quantum states, Phys. Rev. A 68 (2003) 022307. [137] T. G. Kolda and B. W. Bader, Tutorial on MATLAB for tensors and the Tucker decomposition, Talk at workshop on tensor decomposition and applications, CIRM, Marseille (2005). [138] J. L. Brylinski, Mathematics of Quantum Computation, Computational Mathematics Series (Chapman & Hall/CRC, Boca Raton, 2002), pp. 3–23, chap. on Algebraic Measures of Entanglement. [139] B. G. Englert and N. Metwally, Mathematics of Quantum Computation, Computational Mathematics Series (Chapman & Hall/CRC, Boca Raton, 2002), pp. 24–75, chap. on Kinematics of Qubit Pairs. [140] F. Mezzadri, How to generate random matrices from the classical compact groups, Notices Amer. Math. Soc. 54 (2007) 592–604. [141] C. K. Li, Y. T. Poon and T. Schulte-Herbr¨ uggen, Least-squares approximation by elements from matrix orbits achieved by gradient ﬂows on compact Lie groups, Math. Comp., in press (2010); arXiv:0812.1817.

July 12, J070-S0129055X10004053

2010 12:0 WSPC/S0129-055X

148-RMP

Gradient Flows for Optimization in Quantum Information and Quantum Dynamics

667

[142] M. Grassl, Lectures on Quantum Information (Wiley-VCH, Weinheim, 2007), pp. 105–120, chap. on Quantum Error Correction. [143] A. R. Calderbank, E. M. Rains, P. W. Shor and N. J. A. Sloane, Quantum error correction via codes over GF (4), IEEE Trans. Inf. Theory 44 (1998) 1369–1387. [144] A. R. Calderbank and P. W. Shor, Good quantum error-correcting codes exist, Phys. Rev. A 54 (1998) 1089–1105. [145] B. Tibken, Y. Fan, S. J. Glaser and T. Schulte-Herbr¨ uggen, Semideﬁnite programming relaxations applied to determining upper bounds of C-numerical ranges, in Proc. IEEE Intl. Conference on Control Applications (CCA) (2004 ), Munich, Germany (2004); published as CD-ROM Proceedings (2006).

July 12, J070-S0129055X10004041

2010 11:50 WSPC/S0129-055X

148-RMP

Reviews in Mathematical Physics Vol. 22, No. 6 (2010) 669–697 c World Scientific Publishing Company DOI: 10.1142/S0129055X10004041

DENSITY DEPENDENT STOCHASTIC NAVIER–STOKES EQUATIONS WITH NON-LIPSCHITZ RANDOM FORCING

MAMADOU SANGO
School of Mathematics, Institute for Advanced Study, 1, Einstein Drive, Princeton, NJ 08540, USA
and
Department of Mathematics and Applied Mathematics, University of Pretoria, Pretoria 0002, South Africa
[email protected] [email protected]

Received 24 September 2009
Revised 17 March 2010

In this work, we investigate the question of existence of weak solutions to the density dependent stochastic Navier–Stokes equations. The noise considered contains functions which depend nonlinearly on the velocity and which do not satisfy the Lipschitz condition. Furthermore, the initial density is allowed to vanish. We introduce a suitable notion of probabilistic weak solution for the problem and prove its existence.

Keywords: Density dependent stochastic Navier–Stokes equations; weak solutions; Galerkin scheme; tightness of probability measures; Prokhorov and Skorokhod's theorems.

Mathematics Subject Classification 2010: 35R60, 35D05, 35Q35

1. Introduction

The mathematical study of incompressible Navier–Stokes equations goes back to the pioneering work of Leray [29–31]. Since then a considerable wealth of work and ground-breaking results have been obtained by some of the brightest minds in Mathematics and Applied Mathematics. For an in-depth historical overview of the body of work done in this direction, we refer to the monographs [1, 27, 32, 35, 57]. One of the greatest challenges in the field of fluid dynamics is understanding the complex phenomenon of turbulence. With the development of stochastic processes, models of Navier–Stokes equations perturbed by white noise were proposed and investigated in the quest for a better understanding of turbulence in fluids (see [3–5, 10, 11, 15–17, 19, 40, 41, 43, 46, 47, 60–62], just to cite a few). The main feature of these equations is the decomposition of the force acting on the fluid into a regular (deterministic) part and a very irregular (turbulent) part driven by white noise. The mathematical theory of stochastic (mainly incompressible) Navier–Stokes equations is a very rich and broad area covering deep results on existence of solutions, dynamical systems features, ergodicity, and many more; see [11, 14, 17, 42], for instance. However, while research in the density independent case has known a relatively sustained growth over the years, very little is known about the density dependent case, which even in the deterministic case has a relatively recent history. On the deterministic front, results on global existence and some uniqueness results have been obtained by Antontsev and Kazhikhov [1, 24] in the case of non-vanishing initial density; see also [28]. These results were subsequently extended to the vanishing initial density case in [20, 22, 23, 33, 35, 52–54] (the magnetohydrodynamic version). The most difficult case of compressible fluids has been a very active area since the work of Lions [37–39], where the notion of renormalized solution introduced earlier by him and DiPerna led to a breakthrough in the field; we refer to his monograph [36] and those of Feireisl [18] and Novotny [44] for a greater wealth of information.

In the present work, we provide a detailed investigation of a large class of stochastic density dependent Navier–Stokes equations. We consider a sufficiently general forcing consisting of a regular part and a stochastic part, both depending nonlinearly on the velocity of the fluid. We do not require the functions involved in the forcing to satisfy the Lipschitz condition, and we allow the initial density of the fluid to be nonnegative. The main result is the construction of a probabilistic weak solution for the problem considered. The result is achieved thanks to a delicate blending of the semi-Galerkin approximation and deep compactness theorems of both probabilistic and analytic nature, an approach which has proved very efficient in establishing existence of solutions in other problems; we refer to [3–5, 15, 16, 19, 46–50, 62, 64]. Securing the strong convergence of several sequences of the approximating solutions through the tightness of the corresponding probability distributions and fine measure-theoretic results presented far more challenges than in the deterministic and the density independent stochastic Navier–Stokes cases. Our results extend most of the known deterministic existence results referred to above to the stochastic case.

Yashima was the first to study stochastic density dependent equations in his thesis [64]. He considered additive noise and the case of positive initial density. One of his main contributions is the extension of some results of Bensoussan and Temam [5] to the density dependent case and the extension of some results of Antontsev and Kazhikhov [1] to the stochastic case. The next work known to us in this direction is that of Cutland and Enright [12], who treat the case of positive initial density with nonlinear noise depending on the velocity. Their approach is based on nonstandard analysis and Loeb space techniques. It is worth noting that some existence results in the one-dimensional and two-dimensional compressible cases were obtained in the work of Tornatore and Yashima [58, 59, 63]. Since the forces are not assumed to be Lipschitz, uniqueness is out of reach for the problem we study. The genuine uniqueness question is similar to the still unsolved deterministic case.


Let $D$ be a bounded domain in $\mathbb{R}^3$ with a sufficiently smooth boundary $\partial D$ (at least Lipschitz). We fix a final time $T > 0$ and denote by $Q_T$ the cylindrical domain $(0, T) \times D$. We consider the initial boundary value problem for the density dependent stochastic Navier–Stokes equations

\[ \rho\,du + (\rho(u \cdot \nabla)u - \mu\Delta u + \nabla P)\,dt = \rho f(t, u)\,dt + \rho g(t, u)\,dW \quad \text{in } Q_T, \tag{1} \]
\[ \frac{\partial \rho}{\partial t} + (u \cdot \nabla)\rho = 0 \quad \text{in } Q_T, \tag{2} \]
\[ \operatorname{div} u = 0 \quad \text{in } Q_T, \tag{3} \]
\[ u = 0 \quad \text{on } (0, T) \times \partial D, \tag{4} \]
\[ u(0) = u_0, \quad \rho(0) = \rho_0 \quad \text{in } D; \tag{5} \]

here $u$ is the velocity of the particles of fluid, $P$ the pressure, $\rho$ the density and $W$ an $l$-dimensional Wiener process. The right-hand side of (1) represents the force acting on the fluid; it consists of a regular part involving the function $f$ and a chaotic part involving the function $g$ and $W$.

As a closing remark in this introduction, we note that the framework elaborated in the present paper opens some opportunities for attacking ergodic problems related to density dependent turbulent Navier–Stokes fluids. The density independent case was considered in [7–9, 14, 17, 25, 26], just to cite a few; see also the references therein. The Galerkin approximation plays an important role in these works.

The plan of the paper is as follows. In Sec. 2, we gather some preliminary results that will be needed in the work, introduce the definition of the probabilistic weak solution for the problem (1)–(5), and formulate our main result. In Sec. 3, we introduce a semi-Galerkin approximation scheme for the problem (1)–(5) and obtain a priori estimates for the approximating solutions needed for the application of several compactness results. In Sec. 4, we prove the crucial result of tightness of Galerkin's solutions and apply Prokhorov's and Skorokhod's compactness results. In the last section, Sec. 5, we prove our main result.

2. Preliminaries and Main Result

We introduce some function spaces. Let $\mathcal{D}(D)$ be the space of $C^\infty$ functions compactly supported in $D$ and let $\mathcal{D}'(D)$ be the space of distributions on $D$. For $1 \le r \le \infty$ and $l$ a nonnegative integer we define the Sobolev spaces

\[ W^l_r(D) = \{v \in L^r(D) : D^\alpha v \in (L^r(D))^3 \text{ for } |\alpha| \le l\}, \]

where $D^\alpha = D_1^{\alpha_1} \cdots D_3^{\alpha_3}$, $\alpha = (\alpha_1, \alpha_2, \alpha_3)$, $|\alpha| = \alpha_1 + \alpha_2 + \alpha_3$, $D_i = \partial/\partial x_i$. $W^l_{r,0}(D)$ is the closure of $\mathcal{D}(D)$ in $W^l_r(D)$,

\[ W^{-1}_r(D) = \Big\{ v = v_0 + \sum_{i=1}^{3} \frac{\partial v_i}{\partial x_i} : v_i \in L^r(D),\ i = 0, 1, 2, 3 \Big\}, \]
\[ H^l(D) = W^l_2(D), \quad H^l_0(D) = W^l_{2,0}(D), \quad H^{-1}(D) = W^{-1}_2(D); \]

these spaces are endowed with their respective usual norms. Next let

\[ \mathcal{V} = \{v \in \mathcal{D}(D) : \operatorname{div} v = 0\}. \]

Denote by $V$ the closure of $\mathcal{V}$ in $(H^1(D))^3$ and by $H$ the closure of $\mathcal{V}$ in $(L^2(D))^3$. $V$ and $H$ are Hilbert spaces with norms $\|\cdot\|_V$ and $\|\cdot\|_H$, respectively. We denote the Euclidean norm by $|\cdot|$. Since the boundary of $D$ is Lipschitz, the following characterizations of $V$ and $H$ hold:

\[ V = \{v \in (H^1(D))^3 : \operatorname{div} v = 0 \text{ in } D \text{ and } v|_{\partial D} = 0\}, \]
\[ H = \{v \in (L^2(D))^3 : \operatorname{div} v = 0 \text{ in } D \text{ and } v|_{\partial D} \cdot n = 0\}, \]

where $v|_{\partial D}$ denotes the trace of $v$ on $\partial D$ and $n$ is a vector normal to $\partial D$. The inner product in $H$ is induced by the inner product $(\cdot, \cdot)$ in $L^2(D)$. We denote by $\langle \cdot, \cdot \rangle$ the duality pairing between $V$ and $V'$, the dual of $V$. We denote by $(\cdot, \cdot)_D$ the duality product in all function spaces on $D$. In particular,

\[ (v, w)_D = \int_D v(x) w(x)\,dx, \]

if $v \in L^p(D)$ and $w \in L^{p'}(D)$, $p^{-1} + (p')^{-1} = 1$.

We recall some properties of products in Sobolev spaces $W^1_p(D)$, $p \ge 1$; the Sobolev conjugate $p_*$ is given by $p_*^{-1} = p^{-1} - 3^{-1}$ if $p < 3$; $p_*$ is any finite nonnegative real if $p = 3$, $p_* = \infty$ if $p > 3$.

Lemma 1. (i) For $1 \le p \le q \le \infty$, the product $W^1_p(D) \times W^1_q(D) \to W^1_r(D)$ is continuous if $r \ge 1$ and $r^{-1} = p^{-1} + q_*^{-1}$.
(ii) For $1 \le p \le \infty$, $1 \le q \le \infty$, the product
\[ W^1_p(D) \times W^{-1}_q(D) \to W^{-1}_r(D) \tag{6} \]
is continuous if $p^{-1} + q^{-1} \le 1$ and $r^{-1} = p_*^{-1} + q^{-1}$.
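The exponent bookkeeping in Lemma 1 (ii) is easy to get wrong by hand, so a small check may help. The following is only an illustrative sketch: the helper names `sobolev_conjugate_inv` and `product_exponent_ii` are hypothetical and do not come from the paper. It uses exact rational arithmetic to evaluate $r^{-1} = p_*^{-1} + q^{-1}$ in dimension three.

```python
from fractions import Fraction

N = 3  # space dimension of the domain D (as in the paper)


def sobolev_conjugate_inv(p):
    """Return 1/p_* for W^1_p(D) with D in R^3.

    A return value of 0 encodes p_* = infinity (case p > 3).
    For p = 3 any finite p_* is admissible, so None is returned.
    """
    p = Fraction(p)
    if p < N:
        return 1 / p - Fraction(1, N)
    if p == N:
        return None
    return Fraction(0)


def product_exponent_ii(p, q):
    """Exponent r with 1/r = 1/p_* + 1/q, as in Lemma 1 (ii)."""
    p, q = Fraction(p), Fraction(q)
    if 1 / p + 1 / q > 1:
        raise ValueError("Lemma 1 (ii) requires 1/p + 1/q <= 1")
    inv_p_star = sobolev_conjugate_inv(p)
    if inv_p_star is None:
        raise ValueError("p = 3: p_* may be any finite number, r is not unique")
    return 1 / (inv_p_star + 1 / q)


# Case p = q = 2: 1/p_* = 1/2 - 1/3 = 1/6, so 1/r = 1/6 + 1/2 = 2/3.
print(product_exponent_ii(2, 2))  # 3/2
```

With $p = q = 2$ this gives $r = 3/2$, i.e. the product maps $W^1_2(D) \times W^{-1}_2(D)$ into $W^{-1}_{3/2}(D)$.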


For a probability space $(\Omega, \mathcal{F}, P)$ and a Banach space $X$ we introduce the space $L^p(\Omega, \mathcal{F}, P, L^q(0, T, X))$ ($1 \le p, q \le \infty$) of random functions defined on $\Omega$ with values in $L^q(0, T, X)$. We endow $L^p(\Omega, \mathcal{F}, P, L^q(0, T, X))$ with the norm

\[ \|\varphi\|_{L^p(\Omega, \mathcal{F}, P, L^q(0,T,X))} = \big(E\|\varphi(\omega, \cdot, \cdot)\|^p_{L^q(0,T,X)}\big)^{1/p}. \]

We shall need in the sequel some important compactness results that we formulate now. The proofs of these results can be found in the given references. We have [32, Chap. 1, Lemma 1.3].

Lemma 2. Let $(g_\kappa)_{\kappa=1,2,\ldots}$ and $g$ be some functions in $L^q(0, T, L^q(D))$ with $q \in (1, \infty)$ such that
\[ \|g_\kappa\|_{L^q(0,T,L^q(D))} \le C, \quad \forall \kappa, \]
and as $\kappa \to \infty$
\[ g_\kappa \to g \quad \text{for almost all } (x, t) \in Q_T. \]
Then $g_\kappa$ weakly converges to $g$ in $L^q(0, T, L^q(D))$.

Remark 3. The results of the lemma hold for the space $L^q(\Omega, \mathcal{F}, P, L^q(0, T, D))$ in $\Omega \times Q_T$.

The next result is a sharper version of a theorem of Aubin (cf. [32, Chap. 1, Par. 5]) due to Simon [51, Sec. 8, Theorem 5].

Lemma 4. Let $X$, $B$ and $Y$ be some Banach spaces such that $X$ is compactly embedded into $B$ and let $B$ be a subset of $Y$. For any $1 \le p, q \le \infty$, and $0 < s \le 1$ let $E$ be a set bounded in $L^q(0, T, X) \cap N^{s,p}(0, T, Y)$, where
\[ N^{s,p}(0, T, Y) = \Big\{ v \in L^p(0, T, Y) : \sup_{\theta > 0} \theta^{-s} \|v(t + \theta) - v(t)\|_{L^p(0, T-\theta, Y)} < \infty \Big\}. \]
Then $E$ is relatively compact in $L^p(0, T, B)$.

We shall need in the sequel two deep results due to Prokhorov and Skorokhod. We begin by introducing the concept of tightness of probability measures. Let $E$ be a separable Banach space and let $\mathcal{B}(E)$ be its Borel $\sigma$-field.

Definition 5. A family of probability measures $\mathcal{P}$ on $(E, \mathcal{B}(E))$ is tight if for any $\varepsilon > 0$, there exists a compact set $K_\varepsilon \subset E$ such that
\[ \mu(K_\varepsilon) \ge 1 - \varepsilon, \quad \text{for all } \mu \in \mathcal{P}. \]
A sequence of measures $\{\mu_n\}$ on $(E, \mathcal{B}(E))$ is weakly convergent to a measure $\mu$ if for all continuous and bounded functions $\varphi$ on $E$
\[ \lim_{n \to \infty} \int_E \varphi(x)\,\mu_n(dx) = \int_E \varphi(x)\,\mu(dx). \]

The following result due to Prokhorov [45] shows that the tightness property is a compactness criterion.


Lemma 6. A sequence of measures $\{\mu_n\}$ on $(E, \mathcal{B}(E))$ is tight if and only if it is relatively compact, that is, there exists a subsequence $\{\mu_{n_k}\}$ which weakly converges to a probability measure $\mu$.

Skorokhod proves in [55] the next result, which relates the weak convergence of probability measures to the almost everywhere convergence of random variables.

Lemma 7. For an arbitrary sequence of probability measures $\{\mu_n\}$ on $(E, \mathcal{B}(E))$ weakly convergent to a probability measure $\mu$, there exists a probability space $(\Omega, \mathcal{F}, P)$ and random variables $\xi, \xi_1, \ldots, \xi_n, \ldots$ with values in $E$ such that the probability law of $\xi_n$,
\[ \mathcal{L}(\xi_n)(A) = P\{\omega \in \Omega : \xi_n(\omega) \in A\}, \quad \text{for all } A \in \mathcal{F}, \]
is $\mu_n$, the probability law of $\xi$ is $\mu$, and
\[ \lim_{n \to \infty} \xi_n = \xi, \quad P\text{-a.s.} \]
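In the simplest setting $E = \mathbb{R}$, the conclusion of Lemma 7 can be made concrete with the classical quantile coupling. The sketch below is an illustration under assumptions of our own choosing, not the construction used in the paper: the measures $\mu_n = N(1/n, (1 + 1/n)^2)$, the limit $\mu = N(0, 1)$ and the function names are all hypothetical. All variables are built on the single probability space $((0,1), \text{Lebesgue})$ by applying inverse distribution functions to one uniform sample $\omega$.

```python
from statistics import NormalDist


def xi_n(omega, n):
    """Random variable with law mu_n = N(1/n, (1 + 1/n)^2), built from omega in (0, 1)."""
    return NormalDist(mu=1.0 / n, sigma=1.0 + 1.0 / n).inv_cdf(omega)


def xi(omega):
    """Random variable with the limit law mu = N(0, 1), built from the same omega."""
    return NormalDist(mu=0.0, sigma=1.0).inv_cdf(omega)


# For a fixed omega the coupled variables converge pointwise, which is the
# almost sure convergence asserted by the lemma in this toy case.
omega = 0.9
for n in (10, 100, 1000):
    print(n, abs(xi_n(omega, n) - xi(omega)))
```

Here the gap equals $(1 + z)/n$ with $z$ the standard normal quantile of $\omega$, so it tends to zero for every $\omega \in (0, 1)$.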

We borrowed the presentation of these lemmas from [13].

We now formulate the conditions on $f$ and $g$. We assume that $f : (0, T) \times H \to V$ is a nonlinear mapping:

(i) continuous in both its variables,
(ii) there exists a positive constant $C$ such that
\[ \|f(t, v)\|_V \le C(1 + \|v\|_H). \tag{7} \]

We assume that $g : (0, T) \times H \to H^{\times l}$ is a nonlinear mapping:

(i) continuous in both its variables,
(ii) there exists a positive constant $C$ such that
\[ \|g(t, v)\|_{H^{\times l}} \le C(1 + \|v\|_H); \tag{8} \]

$H^{\times l}$ is the product of $l$ copies of the space $H$.

We state the following:

Definition 8. A weak solution of (1)–(5) is a probabilistic system $(\Omega, \mathcal{F}, \mathcal{F}^t, P, W, u, \rho)$ where

(i) $(\Omega, \mathcal{F}, P)$ is a probability space, $\mathcal{F}^t$ is a filtration on $(\Omega, \mathcal{F}, P)$,
(ii) $W(t)$ is an $l$-dimensional $\mathcal{F}^t$ standard Wiener process,
(iii) for almost every $t$, $u(t)$ and $\rho(t)$ are $\mathcal{F}^t$-measurable,
(iv) $u \in L^4(\Omega, \mathcal{F}, P, L^\infty(0, T, H)) \cap L^2(\Omega, \mathcal{F}, P, L^2(0, T, V))$, $\rho \in L^\infty(\Omega, \mathcal{F}, P, L^\infty(0, T, L^\infty(D)))$,
(v) for any $\varphi \in V$, $\psi \in H^1(D)$
\[ \int_D (\rho u)(t)\varphi\,dx - \int_0^t\!\!\int_D \rho u u \nabla\varphi\,dx\,dt + \mu \int_0^t\!\!\int_D \nabla u \cdot \nabla\varphi\,dx\,dt = \int_D \rho_0 u_0 \varphi\,dx + \int_0^t\!\!\int_D \rho f(t, u)\varphi\,dx\,dt + \int_0^t\!\!\int_D \rho g(t, u)\varphi\,dx\,dW, \tag{9} \]
\[ \int_D \rho(t)\psi\,dx - \int_D \rho_0 \psi\,dx - \int_0^t\!\!\int_D \rho u \nabla\psi\,dx\,dt = 0, \tag{10} \]
\[ \rho(0) = \rho_0, \]
and
\[ \int_D \rho(0) u(0) \varphi\,dx = \int_D \rho_0 u_0 \varphi\,dx \tag{11} \]

almost surely and for all $t \in [0, T]$.

Our main result is

Theorem 9. Let the above conditions on $f$ and $g$ be satisfied and assume that $u_0 \in H$, $\rho_0 \in L^\infty(D)$, $\rho_0 \ge 0$. Then there exists a solution of problem (1)–(5) in the sense of Definition 8.

Remark 10. We hereby emphasize the fact that the initial conditions (5) are understood in the sense of (11). Under the above estimates satisfied by $u$ and $\rho$, and the integral identities (9) and (10), it can be shown as in the deterministic case ([35, 54, Chap. 2]) that (11) holds almost surely. [54, Proposition 13, p. 1110] shows that conditions (11) are equivalent to
\[ \Pi(\rho_0(u(0) - u_0)) = \nabla q, \]
where $\Pi$ is the Leray projector and $q \in H^1(D)$. Therefore, unless $\rho(0)$ is constant, this condition is weaker than the one usually required, $\rho_0(u(0) - u_0) = 0$. This means the velocity fields $u(0)$ and $u_0$ are equal outside the vacuum.

3. Semi-Galerkin Approximation and A Priori Estimates

3.1. The semi-Galerkin scheme

In this section, we introduce a semi-Galerkin approximation following [1, 24, 33, 54]. We obtain key a priori estimates for the approximating sequences of the presumed solutions of our problem. Let $A$ be the Stokes operator with domain $D(A) = H^2 \cap V$. We consider an orthonormal basis of $D(A)$ consisting of the eigenvectors $w^1, \ldots, w^m, \ldots$ of $A$. We denote the span of $w^1, \ldots, w^m$ by $V^m$. On the probability space $(\bar\Omega, \bar{\mathcal{F}}, \bar{P})$ with a

given $l$-dimensional Wiener process $\bar{W}$, we look for the pair of sequences $(\rho^m, u^m)$ ($u^m$ is sought as a linear combination of $w^1, \ldots, w^m$, as will be made precise below) satisfying the integral equation

\[ \int_D (\rho^m\,du^m)(t) v\,dx + \int_D \rho^m u^m \nabla u^m v\,dx\,dt + \mu \int_D \nabla u^m \cdot \nabla v\,dx\,dt = \int_D \rho^m f(t, u^m) v\,dx\,dt + \int_D \rho^m g(t, u^m) v\,dx\,d\bar{W}, \tag{12} \]

for all $v \in V^m$ and

\[ \frac{\partial \rho^m}{\partial t} + (u^m \cdot \nabla)\rho^m = 0 \quad \text{in } Q_T, \tag{13} \]
\[ u^m(0) = u^m_0, \quad \rho^m(0) = \rho^m_0 \quad \text{in } D. \tag{14} \]

We assume that

\[ u^m_0 \in V^m, \quad u^m_0 \to u_0 \quad \text{in } (L^2(D))^3, \tag{15} \]
\[ \rho^m_0 \in C^1(\bar{D}), \quad \rho^m_0 \to \rho_0 \quad \text{in } L^\infty(D) \text{ weakly-star}, \tag{16} \]
\[ \alpha_m = \frac{1}{m} + \inf_D \rho_0 \le \rho^m_0 \le \frac{1}{m} + \sup_D \rho_0 = \beta_m. \tag{17} \]

In solving (13) with the second initial condition in (14), we assume that $u^m$ exists and let $y^m(\tau, t, x)$ be the flow of $u^m(\cdot, \cdot)$; that is, $y^m$ is the solution of the Cauchy problem

\[ \frac{dy(\tau, t, x)}{d\tau} = u^m(\tau, y(\tau, t, x)), \quad y(\tau, t, x)|_{\tau = t} = x. \tag{18} \]

By the method of characteristics, we have the representation

\[ \rho^m(t, x) = \rho^m_0(y^m(0, t, x)) \tag{19} \]

for the requested solution. This implies that

\[ 0 < \alpha_m \le \rho^m(t, x) \le \beta_m. \tag{20} \]
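The representation (19) and the bounds (20) can be illustrated numerically in a toy setting. The sketch below is not the paper's construction: it assumes a deterministic, two-dimensional, divergence-free velocity field (a rigid rotation) and an initial density chosen by us with values in $[0.5, 1.5]$; the names `velocity`, `flow_back`, `rho0`, `ALPHA` and `BETA` are hypothetical. It integrates the Cauchy problem (18) backwards from time $t$ to time $0$ and evaluates the initial density at the foot of the characteristic.

```python
import math

ALPHA, BETA = 0.5, 1.5  # assumed lower and upper bounds of the initial density


def velocity(tau, y):
    """Illustrative divergence-free field u(y) = (-y2, y1) (rigid rotation)."""
    return (-y[1], y[0])


def flow_back(t, x, steps=2000):
    """Integrate dy/dtau = u(tau, y) with y(t) = x from tau = t back to tau = 0 (RK4)."""
    h = -t / steps
    tau, y = t, (x[0], x[1])
    for _ in range(steps):
        k1 = velocity(tau, y)
        k2 = velocity(tau + h / 2, (y[0] + h / 2 * k1[0], y[1] + h / 2 * k1[1]))
        k3 = velocity(tau + h / 2, (y[0] + h / 2 * k2[0], y[1] + h / 2 * k2[1]))
        k4 = velocity(tau + h, (y[0] + h * k3[0], y[1] + h * k3[1]))
        y = (y[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             y[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
        tau += h
    return y


def rho0(x):
    """Assumed initial density with values in [ALPHA, BETA]."""
    return 1.0 + 0.5 * math.sin(x[0]) * math.cos(x[1])


def rho(t, x):
    """Transported density rho(t, x) = rho0(y(0, t, x)), cf. (19)."""
    return rho0(flow_back(t, x))


value = rho(1.0, (0.3, -0.7))
print(ALPHA <= value <= BETA)
```

Because the density is only evaluated along characteristics, its range never leaves the range of the initial datum, which is the mechanism behind (20).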

We note that $\rho^m$ is a random function through the relations (18) and (19) which is bounded above and below by deterministic values in (20). For the existence of a solution $u^m$ to (12), we substitute the function $\rho^m$ from (19) in (12) and look for $u^m$ in the form of the expansion

\[ u^m = \sum_{k=1}^{m} \varphi^m_k(t) w^k(x). \tag{21} \]


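Substituting $v = w^1, \ldots, w^m$ successively into (12) we obtain a system of ordinary stochastic differential equations for the coefficients $\varphi^m_k(t)$:

\[ \sum_{k=1}^{m} \int_D \rho^m w^k w^l\,dx\,d\varphi^m_k(t) + \sum_{j,k=1}^{m} \int_D \rho^m (w^j \varphi^m_j \cdot \nabla) w^k \varphi^m_k w^l\,dx\,dt - \int_D \rho^m f\Big(t, \sum_{j=1}^{m} w^j \varphi^m_j\Big) w^l\,dx\,dt + \mu \int_D \sum_{j=1}^{m} \varphi^m_j(t) \nabla w^j \nabla w^l\,dx\,dt \]
\[ = \int_D \rho^m g\Big(t, \sum_{j=1}^{m} w^j \varphi^m_j(t)\Big) w^l\,dx\,d\bar{W}, \quad l = 1, \ldots, m. \tag{22} \]

The matrix
\[ \Big( \int_D \rho^m w^k w^l\,dx \Big)_{k,l=1}^{m} \]
is non-degenerate since the family $\{\sqrt{\rho^m}\,w^k\}_{k=1,\ldots,m}$ is free; in view of (20), $\rho^m > 0$. Thus (22) can be reduced to the canonical form

\[ d\varphi^m_l(t) + F^m_l(t, \varphi^m_1, \ldots, \varphi^m_m)\,dt = G^m_l(t, \varphi^m_1, \ldots, \varphi^m_m)\,d\bar{W}, \tag{23} \]

with the initial conditions

\[ \varphi^m_l(0) = \varphi^m_{l0}, \tag{24} \]

where $\varphi^m_{l0}$ are the coefficients in the expansion

\[ u^m_0 = \sum_{k=1}^{m} \varphi^m_{k0} w^k. \]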
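The two steps just described, inverting the density-weighted mass matrix and solving the resulting system (23), can be sketched in a one-dimensional toy analogue. Everything below is an assumption made for illustration and is not the paper's scheme: the basis $w^k(x) = \sqrt{2}\sin(k\pi x)$ on $(0, 1)$, the density $\rho(x) = 1 + 0.5\cos(2\pi x)$, a scalar Wiener process, and the helper names `gram_matrix`, `cholesky` and `euler_maruyama`. A successful Cholesky factorization certifies that the Gram matrix is positive definite, and the Euler–Maruyama loop is a standard discretization swapped in for the abstract existence result.

```python
import math
import random


def gram_matrix(rho, m, n_quad=2000):
    """M[k][l] ~ int_0^1 rho(x) w^k(x) w^l(x) dx (midpoint rule),
    with the illustrative basis w^k(x) = sqrt(2) sin(k pi x)."""
    h = 1.0 / n_quad
    M = [[0.0] * m for _ in range(m)]
    for i in range(n_quad):
        x = (i + 0.5) * h
        w = [math.sqrt(2.0) * math.sin((k + 1) * math.pi * x) for k in range(m)]
        r = rho(x)
        for k in range(m):
            for l in range(m):
                M[k][l] += r * w[k] * w[l] * h
    return M


def cholesky(M):
    """Return lower-triangular L with L L^T = M; raise ValueError if M is not positive definite."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


def euler_maruyama(phi0, F, G, T, n_steps, rng):
    """Discretize d phi + F(t, phi) dt = G(t, phi) dW for a scalar Wiener process W."""
    dt = T / n_steps
    phi, t = list(phi0), 0.0
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment over one step
        drift, diffusion = F(t, phi), G(t, phi)
        phi = [p - f * dt + g * dW for p, f, g in zip(phi, drift, diffusion)]
        t += dt
    return phi


rho = lambda x: 1.0 + 0.5 * math.cos(2.0 * math.pi * x)  # values in [0.5, 1.5]
M = gram_matrix(rho, 4)
L = cholesky(M)  # succeeds, so the weighted mass matrix is invertible
```

The positivity of the density is what makes the factorization succeed, mirroring the role of (20) in the reduction of (22) to (23).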

In view of the conditions on $f$ and $g$, the functions $F^m_l$ and $G^m_l$ are continuous in their variables $t, \varphi^m_1, \ldots, \varphi^m_m$. Thus, thanks to an existence result for systems of stochastic ordinary differential equations due to Skorokhod [56, Theorem 2, Chap. 5], a local solution of (23) exists on an interval $[0, T_m]$. Therefore, for any $t \in [0, T_m]$, the representation (21) holds. The existence over the whole interval $[0, T]$ will follow from uniform a priori estimates in the next subsection.

3.2. The a priori estimates

We now proceed to the task of deriving the needed a priori estimates. Substituting $v = w^k$ into (12), multiplying the resulting relation by $\varphi^m_k(t)$ and summing over $k = 1, \ldots, m$, we get

\[ \int_D (u^m \rho^m\,du^m)(t)\,dx + \int_D \rho^m u^m u^m \nabla u^m\,dx\,dt + \mu \int_D \nabla u^m \cdot \nabla u^m\,dx\,dt = \int_D \rho^m f(t, u^m) u^m\,dx\,dt + \int_D \rho^m g(t, u^m) u^m\,dx\,d\bar{W}. \tag{25} \]


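We introduce the stopping times

\[ \tau_N = \begin{cases} \inf\{t > 0 : \|\sqrt{\rho^m(t)}\,u^m(t)\|_{L^2(D)} \ge N\}, & \text{if } \{\bar\omega \in \bar\Omega : \|\sqrt{\rho^m(t)}\,u^m(t)\|_{L^2(D)} \ge N\} \ne \emptyset, \\ \infty, & \text{if } \{\bar\omega \in \bar\Omega : \|\sqrt{\rho^m(t)}\,u^m(t)\|_{L^2(D)} \ge N\} = \emptyset. \end{cases} \]

Applying Ito's formula to
\[ \int_D \rho^m u^m u^m\,dx, \]
we deduce from (25) that

\[ d \int_D |\sqrt{\rho^m}\,u^m|^2\,dx = \int_D u^m u^m \frac{\partial \rho^m(s)}{\partial s}\,dx\,ds - 2\mu \int_D u^m A u^m\,dx\,ds - 2 \int_D \rho^m u^m (u^m \cdot \nabla) u^m\,dx\,ds \]
\[ \qquad + 2 \int_D \rho^m u^m [f(s, u^m)\,ds + g(s, u^m)\,d\bar{W}]\,dx + \int_D |\sqrt{\rho^m}\,g(s, u^m)|^2\,dx\,ds, \tag{26} \]

where $s \in [0, t \wedge \tau_N]$, $t \in [0, T_m]$, $t \wedge \tau_N = \min\{t, \tau_N\}$. We have

\[ \int_D \operatorname{div}(u^m u^m \rho^m u^m)\,dx = \int_D [u^m u^m \operatorname{div}(\rho^m u^m) + \rho^m u^m \nabla(u^m u^m)]\,dx = \int_D [u^m u^m (u^m \cdot \nabla)\rho^m + 2\rho^m u^m (u^m \cdot \nabla) u^m]\,dx, \]

where in the last step we made use of the divergence freeness of $u^m$. The left-hand side is equal to zero in view of the vanishing of $u^m$ on $(0, T) \times \partial D$. Hence from (13), we have

\[ \int_D u^m u^m \frac{\partial \rho^m(s)}{\partial s}\,dx = - \int_D u^m u^m u^m \nabla \rho^m\,dx = 2 \int_D u^m \rho^m (u^m \cdot \nabla) u^m\,dx. \tag{27} \]

Thus substituting the right-hand side of (27) into (26), we get for all $s \in [0, t \wedge \tau_N]$

\[ \|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)} + 2\mu \int_0^s \|u^m(r)\|^2_V\,dr \le \|\sqrt{\rho^m_0}\,u^m_0\|^2_{L^2(D)} + \int_0^s 2|\langle u^m, \rho^m f(r, u^m)\rangle|\,dr \]
\[ \qquad + \int_0^s \|\sqrt{\rho^m(r)}\,g(r, u^m)\|^2_{L^2(D)}\,dr + 2 \int_0^s (u^m, \rho^m g(r, u^m))\,d\bar{W}. \tag{28} \]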


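Taking supremum in both sides of (28) over the interval $[0, t \wedge \tau_N]$, followed by the expectation, we have

\[ \bar{E} \sup_{0 \le s \le t \wedge \tau_N} \|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)} + 2\mu \bar{E} \int_0^{t \wedge \tau_N} \|u^m(s)\|^2_V\,ds \le \bar{E}\|\sqrt{\rho^m_0}\,u^m_0\|^2_{L^2(D)} + \bar{E} \int_0^{t \wedge \tau_N} 2|\langle u^m, \rho^m f(s, u^m)\rangle|\,ds \]
\[ \qquad + \bar{E} \int_0^{t \wedge \tau_N} \|\sqrt{\rho^m(s)}\,g(s, u^m)\|^2_{L^2(D)}\,ds + 2\bar{E} \int_0^{t \wedge \tau_N} (u^m, \rho^m g(s, u^m))\,d\bar{W}. \tag{29} \]

We estimate the terms in the right-hand side of this inequality. By Young's inequality and the conditions on $f$, we have for any $\varepsilon > 0$

\[ \int_0^{t \wedge \tau_N} 2\langle u^m, \rho^m f(s, u^m)\rangle\,ds \le \varepsilon \int_0^{t \wedge \tau_N} \|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)}\,ds + C_\varepsilon \int_0^{t \wedge \tau_N} \|\sqrt{\rho^m(s)}\,f(s, u^m)\|^2_{L^2(D)}\,ds \]
\[ \qquad \le C \int_0^{t \wedge \tau_N} \|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)}\,ds + C. \tag{30} \]

Similarly, in view of the conditions on $g$ we have

\[ \int_0^{t \wedge \tau_N} \|\sqrt{\rho^m(s)}\,g(s, u^m)\|^2_{L^2(D)}\,ds \le C \int_0^{t \wedge \tau_N} \|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)}\,ds + C. \tag{31} \]

We now estimate the stochastic integral in (28). We have for any $\varepsilon > 0$,

\[ \bar{E} \sup_{0 \le s \le t \wedge \tau_N} \Big| \int_0^s 2(\rho^m(s) g(s, u^m(s)), u^m(s))\,d\bar{W} \Big| \le C \bar{E} \Big( \int_0^{t \wedge \tau_N} (\rho^m(s) g(s, u^m(s)), u^m(s))^2\,ds \Big)^{1/2} \]
\[ \qquad \le C \bar{E} \Big( \int_0^{t \wedge \tau_N} \|\rho^m(s)\|_{L^\infty(D)} (1 + \|u^m(s)\|_{L^2(D)})^2 \|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)}\,ds \Big)^{1/2} \]
\[ \qquad \le C \bar{E} \Big[ \sup_{0 \le s \le t \wedge \tau_N} \|\sqrt{\rho^m(s)}\,u^m(s)\|_{L^2(D)} \times \Big( \int_0^{t \wedge \tau_N} \|\rho^m(s)\|_{L^\infty(D)} (1 + \|u^m(s)\|_{L^2(D)})^2\,ds \Big)^{1/2} \Big] \]
\[ \qquad \le \varepsilon \bar{E} \sup_{0 \le s \le t \wedge \tau_N} \|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)} + C \bar{E} \int_0^{t \wedge \tau_N} (1 + \|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)})\,ds. \tag{32} \]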


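Substituting the inequalities (30)–(32) into (29) we get for sufficiently small $\varepsilon > 0$

\[ \bar{E} \sup_{0 \le s \le t \wedge \tau_N} \|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)} + 2\mu \bar{E} \int_0^{t \wedge \tau_N} \|u^m(s)\|^2_V\,ds \le \bar{E}\|\sqrt{\rho^m_0}\,u^m_0\|^2_{L^2(D)} + C \bar{E} \int_0^{t \wedge \tau_N} (1 + \|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)})\,ds. \]

In view of Gronwall's inequality, it follows that

\[ \bar{E} \sup_{0 \le s \le t \wedge \tau_N} \|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)} + 2\mu \bar{E} \int_0^{t \wedge \tau_N} \|u^m(s)\|^2_V\,ds \le C. \]

As $N \to \infty$, $t \wedge \tau_N \to t$. Thus passing to the limit in this inequality, we find that

\[ \bar{E} \sup_{0 \le s \le t} \|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)} + 2\mu \bar{E} \int_0^t \|u^m(s)\|^2_V\,ds \le C, \quad \forall t \in [0, T_m]. \tag{33} \]

Since the constant $C$ is independent of $m$, we have $T_m = T$. Applying Ito's formula to Eq. (26) with $p \ge 1$, we get

\[ d\|\sqrt{\rho^m(t)}\,u^m(t)\|^p_{L^2(D)} + p\mu \|\sqrt{\rho^m(t)}\,u^m(t)\|^{p-2}_{L^2(D)} \|u^m(t)\|^2_V\,dt \]
\[ \quad = \frac{p}{2} \|\sqrt{\rho^m(t)}\,u^m(t)\|^{p-2}_{L^2(D)} \{2\langle u^m, \rho^m f(t, u^m)\rangle + \|\sqrt{\rho^m(t)}\,g(t, u^m)\|^2_{L^2(D)}\}\,dt + p \|\sqrt{\rho^m(t)}\,u^m(t)\|^{p-2}_{L^2(D)} (u^m, \rho^m g(t, u^m))\,d\bar{W} \]
\[ \qquad + \frac{p}{2}\Big(\frac{p}{2} - 1\Big) \|\sqrt{\rho^m(t)}\,u^m(t)\|^{p-4}_{L^2(D)} (u^m, \rho^m g(t, u^m))^2\,dt, \quad t \in [0, T]. \]

Integrating this equation over $[0, t]$ and squaring the resulting equation we get

\[ \|\sqrt{\rho^m(t)}\,u^m(t)\|^{2p}_{L^2(D)} + (p\mu)^2 \Big( \int_0^t \|\sqrt{\rho^m(s)}\,u^m(s)\|^{p-2}_{L^2(D)} \|u^m(s)\|^2_V\,ds \Big)^2 \le C\{\|\sqrt{\rho^m_0}\,u^m_0\|^{2p}_{L^2(D)} + I_1 + I_2 + I_3 + I_4\}, \tag{34} \]

where

\[ I_1 = \Big( \int_0^t \|\sqrt{\rho^m(s)}\,u^m(s)\|^{p-2}_{L^2(D)} \langle u^m(s), \rho^m(s) f(s, u^m(s))\rangle\,ds \Big)^2, \]
\[ I_2 = \Big( \int_0^t \|\sqrt{\rho^m(s)}\,u^m(s)\|^{p-2}_{L^2(D)} \|\sqrt{\rho^m(s)}\,g(s, u^m(s))\|^2_{L^2(D)}\,ds \Big)^2, \]
\[ I_3 = \Big( \int_0^t \|\sqrt{\rho^m(s)}\,u^m(s)\|^{p-2}_{L^2(D)} (u^m(s), \rho^m(s) g(s, u^m(s)))\,d\bar{W} \Big)^2, \]
\[ I_4 = \Big( \int_0^t \|\sqrt{\rho^m(s)}\,u^m(s)\|^{p-4}_{L^2(D)} (u^m(s), \rho^m(s) g(s, u^m(s)))^2\,ds \Big)^2. \]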


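The following inequalities readily follow:

\[ I_1 + I_2 \le \Big( \int_0^t \|\sqrt{\rho^m(s)}\,u^m(s)\|^{p-2}_{L^2(D)} (1 + \|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)})\,ds \Big)^2 \le C \int_0^t (1 + \|\sqrt{\rho^m(s)}\,u^m(s)\|^{2p}_{L^2(D)})\,ds, \]
\[ I_4 \le \Big( \int_0^t \|\sqrt{\rho^m(s)}\,u^m(s)\|^{p-4}_{L^2(D)} (1 + \|\sqrt{\rho^m(s)}\,u^m(s)\|^4_{L^2(D)})\,ds \Big)^2 \le C \int_0^t (1 + \|\sqrt{\rho^m(s)}\,u^m(s)\|^{2p}_{L^2(D)})\,ds. \]

For the estimation of the stochastic term $I_3$, we use the martingale inequality

\[ \bar{E} \sup_{0 \le t \le T} \Big| \int_0^t \|\sqrt{\rho^m(s)}\,u^m(s)\|^{p-2}_{L^2(D)} (u^m(s), \rho^m(s) g(s, u^m(s)))\,d\bar{W} \Big|^2 \le \bar{E} \int_0^T \|\sqrt{\rho^m(s)}\,u^m(s)\|^{2p-4}_{L^2(D)} (u^m(s), \rho^m(s) g(s, u^m(s)))^2\,ds \]
\[ \qquad \le \bar{E} \int_0^T \|\sqrt{\rho^m(s)}\,u^m(s)\|^{2p-4}_{L^2(D)} (1 + \|\sqrt{\rho^m(s)}\,u^m(s)\|^4_{L^2(D)})\,ds \le \bar{E} \int_0^T (1 + \|\sqrt{\rho^m(s)}\,u^m(s)\|^{2p}_{L^2(D)})\,ds. \]

In view of these estimates and (34), making use of Gronwall's inequality, we obtain

\[ \bar{E} \sup_{0 \le t \le T} \|\sqrt{\rho^m(t)}\,u^m(t)\|^{2p}_{L^2(D)} \le C, \quad \forall p \ge 1. \tag{35} \]

Raising both sides of (28) to the power $p \ge 1$, and using the above inequality (35), we also get along the previous lines

\[ \bar{E} \Big( \int_0^T \|u^m(s)\|^2_V\,ds \Big)^p \le C. \tag{36} \]

Our next task is to estimate some increments in time of $u^m$ and $\rho^m$ in the space $V'$. But before that let us make a few remarks. In view of estimate (35), for any $p \ge 1$, and the fact that $\rho^m \in L^\infty(0, T, L^\infty(D))$, we have

\[ \rho^m u^m \in L^{2p}(\bar\Omega, \bar{\mathcal{F}}, \bar{P}, L^\infty(0, T, (L^2(D))^3)). \tag{37} \]

Thus
\[ \nabla(\rho^m u^m) \in L^{2p}(\bar\Omega, \bar{\mathcal{F}}, \bar{P}, L^\infty(0, T, H^{-1}(D))) \]
and by (13), it follows that

\[ \frac{\partial \rho^m}{\partial t} \in L^{2p}(\bar\Omega, \bar{\mathcal{F}}, \bar{P}, L^\infty(0, T, H^{-1}(D))). \tag{38} \]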


Also by (36), for all $p \ge 1$,
\[ u^m \in L^p(\bar\Omega, \bar{\mathcal{F}}, \bar{P}, L^2(0, T, V)). \]
Thus in view of the Sobolev embedding $V \to (L^6(D))^3$ we have

\[ u^m \in L^p(\bar\Omega, \bar{\mathcal{F}}, \bar{P}, L^2(0, T, (L^6(D))^3)), \tag{39} \]

and

\[ \rho^m u^m \in L^p(\bar\Omega, \bar{\mathcal{F}}, \bar{P}, L^2(0, T, (L^6(D))^3)). \tag{40} \]

Recall the following result due to Riesz and Thorin (cf. [6, Theorem 1.1.1]).

Lemma 11. Let $\mathcal{T}$ be a linear operator from $L^{p_1}(0, T)$ into $L^{p_2}(D)$ and from $L^{q_1}(0, T)$ into $L^{q_2}(D)$ with $q_1 \ge p_1$ and $q_2 \le p_2$. Then for any $s \in (0, 1)$, $\mathcal{T}$ maps $L^{r_1}(0, T)$ into $L^{r_2}(D)$ with
\[ r_1 = \frac{1}{s/p_1 + (1 - s)/q_1}, \quad r_2 = \frac{1}{s/p_2 + (1 - s)/q_2}. \]

Applying this lemma with $p_1 = 2$, $p_2 = 6$, $q_1 = \infty$, $q_2 = 2$ and $s = 3/4$, we get from (37) and (40) that

\[ \rho^m u^m,\ u^m \in L^p(\bar\Omega, \bar{\mathcal{F}}, \bar{P}, L^{8/3}(0, T, (L^4(D))^3)); \tag{41} \]

where we have also used the lemma with respect to $L^{2p}(\bar\Omega, \bar{\mathcal{F}}, \bar{P}) \to X$ and $L^p(\bar\Omega, \bar{\mathcal{F}}, \bar{P}) \to X$. Next we have

\[ \rho^m u^m u^m \in L^p(\bar\Omega, \bar{\mathcal{F}}, \bar{P}, L^{4/3}(0, T, (L^2(D))^9)), \tag{42} \]

and thus

\[ \nabla(\rho^m u^m u^m) \in L^p(\bar\Omega, \bar{\mathcal{F}}, \bar{P}, L^{4/3}(0, T, (H^{-1}(D))^3)). \tag{43} \]
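As a side check, the exponents $8/3$ and $4$ in (41) and the exponents $4/3$ and $2$ in (42) can be recomputed with exact arithmetic. The sketch below is illustrative only; the helper names `inv`, `interpolate` and `holder_product` are hypothetical. It evaluates the interpolation formulas of Lemma 11 and the Hölder rule $1/r = 1/a + 1/b$ for a product of two factors.

```python
from fractions import Fraction

INF = float("inf")


def inv(p):
    """Return 1/p as an exact fraction, with 1/infinity = 0."""
    return Fraction(0) if p == INF else 1 / Fraction(p)


def interpolate(p1, p2, q1, q2, s):
    """Interpolated exponents (r1, r2) of Lemma 11 for the parameter s in (0, 1)."""
    s = Fraction(s)
    inv_r1 = s * inv(p1) + (1 - s) * inv(q1)
    inv_r2 = s * inv(p2) + (1 - s) * inv(q2)
    return 1 / inv_r1, 1 / inv_r2


def holder_product(a, b):
    """Exponent r of a product of an L^a and an L^b function: 1/r = 1/a + 1/b."""
    return 1 / (inv(a) + inv(b))


# Interpolating L^2(0,T; L^6) with L^infinity(0,T; L^2) at s = 3/4.
r_time, r_space = interpolate(2, 6, INF, 2, Fraction(3, 4))
print(r_time, r_space)  # 8/3 4

# Product of two factors in L^{8/3}(0,T; L^4): time exponent 4/3, space exponent 2.
print(holder_product(r_time, r_time), holder_product(r_space, r_space))  # 4/3 2
```

The first line of output matches the space $L^{8/3}(0, T, (L^4(D))^3)$ in (41), and the second matches $L^{4/3}(0, T, (L^2(D))^9)$ in (42).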

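Indeed, applying Hölder's inequality we have for $k = 1, 2, 3$

\[ \int_0^T \Big( \int_D (\rho^m u^m_k u^m_k)^2\,dx \Big)^{4/6} dt \le \int_0^T \Big( \int_D (\rho^m u^m_k)^4\,dx \Big)^{4/12} \Big( \int_D (u^m_k)^4\,dx \Big)^{4/12} dt \]
\[ \qquad \le \Big( \int_0^T \Big( \int_D (\rho^m u^m_k)^4\,dx \Big)^{8/12} dt \Big)^{1/2} \Big( \int_0^T \Big( \int_D (u^m_k)^4\,dx \Big)^{8/12} dt \Big)^{1/2} \]
\[ \qquad \le C \Big[ \int_0^T \Big( \int_D (\rho^m u^m_k)^4\,dx \Big)^{8/12} dt + \int_0^T \Big( \int_D (u^m_k)^4\,dx \Big)^{8/12} dt \Big]. \]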


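The integrals in the right-hand side are bounded for a.e. $\omega$ in view of the estimates (41). The sought estimates thus follow.

Recalling the definition of the norm of $V'$, we have

\[ \|(\rho^m u^m)(t + \theta) - (\rho^m u^m)(t)\|_{V'} = \sup_{v \in V : \|v\|_V = 1} \int_D [(\rho^m u^m)(t + \theta) - (\rho^m u^m)(t)] v\,dx. \]

Thus owing to the integral identity (12), we have

\[ \bar{E} \int_0^{T - \theta} \|(\rho^m u^m)(t + \theta) - (\rho^m u^m)(t)\|^2_{V'}\,dt = \bar{E} \int_0^{T - \theta} \Big\| \int_t^{t + \theta} d(\rho^m u^m)(s) \Big\|^2_{V'} dt \le \bar{E} \int_0^{T - \theta} [R_1(t) + R_2(t) + R_3(t) + R_4(t)]\,dt, \tag{44} \]

where

\[ R_1(t) = \Big\| \int_t^{t + \theta} \nabla(\rho^m u^m u^m)\,ds \Big\|^2_{V'}, \quad R_2(t) = \Big\| \int_t^{t + \theta} \mu \Delta u^m\,ds \Big\|^2_{V'}, \]
\[ R_3(t) = \Big\| \int_t^{t + \theta} \rho^m f(s, u^m)\,ds \Big\|^2_{V'}, \quad R_4(t) = \Big\| \int_t^{t + \theta} \rho^m g(s, u^m)\,d\bar{W} \Big\|^2_{V'}. \]

We have

\[ R_1^{1/2} = \sup \Big\{ \int_D \int_t^{t + \theta} \nabla(\rho^m u^m u^m)\,ds\,\varphi(x)\,dx : \varphi \in V,\ \|\varphi\|_V = 1 \Big\} \le C \int_t^{t + \theta} \|\rho^m u^m u^m\|_{L^2(D)}\,ds. \]

Then in view of (42)

\[ \bar{E} \int_0^{T - \theta} R_1(t)\,dt \le C \theta^{1/2} \bar{E} \int_0^{T - \theta} \Big( \int_t^{t + \theta} \|\rho^m u^m u^m\|^{4/3}_{L^2(D)}\,ds \Big)^{3/4} dt \le C \theta^{1/2}. \]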


Next using (33), we get

T −θ

¯ E 0

R2 (t)dt ≤ E¯

T −θ

0

∇u L2 (D) ds m

dt

t

≤ θE¯

2

t+θ

T −θ

0

t+θ

t

∇um 2L2 (D) dsdt

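$\le C\theta$.

Using the conditions on $f$ and estimate (35), we have

$$\begin{aligned}
\bar E \int_0^{T-\theta} R_3(t) \, dt
&\le C \bar E \int_0^{T-\theta} \Big( \int_t^{t+\theta} \|\rho^m(s)\|_{L^\infty(D)} \big(1 + \|u^m(s)\|_{L^2(D)}\big) \, ds \Big)^2 dt \\
&\le C \theta \, \bar E \, \|\rho^m\|_{L^\infty(0,T,L^\infty(D))}^2 \int_0^{T-\theta} \int_t^{t+\theta} \big(1 + \|u^m(s)\|_{L^2(D)}^2\big) \, ds \, dt
\end{aligned}$$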

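$\le C\theta$.

For the stochastic integral we use the martingale inequality. We have

$$\begin{aligned}
\bar E \int_0^{T-\theta} \Big\| \int_t^{t+\theta} \rho^m g(s, u^m) \, d\bar W \Big\|_{V'}^2 dt
&\le \int_0^{T-\theta} \bar E \sup_{\varphi \in V:\, \|\varphi\|_V = 1} \Big( \int_t^{t+\theta} \int_D \rho^m g(s, u^m) \varphi(x) \, dx \, d\bar W \Big)^2 dt \\
&\le \int_0^{T-\theta} \bar E \sup_{\varphi \in V:\, \|\varphi\|_V = 1} \int_t^{t+\theta} \Big( \int_D \rho^m g(s, u^m) \varphi(x) \, dx \Big)^2 ds \, dt \\
&\le \int_0^{T-\theta} \bar E \int_t^{t+\theta} \int_D [\rho^m g(s, u^m)]^2 \, dx \, ds \, dt \\
&\le \int_0^{T-\theta} \bar E \int_t^{t+\theta} \|\rho^m\|_{L^\infty(D)}^2 \|g(s, u^m)\|_{(L^2(D))^{\times l}}^2 \, ds \, dt \\
&\le C \bar E \int_0^{T-\theta} \theta \sup_{0 \le t \le T} \|\rho^m\|_{L^\infty(D)}^2 \big(1 + \|u^m\|_H^2\big) \, dt \\
&\le C\theta;
\end{aligned}$$

at some steps we made use of Fubini's theorem and the estimate (35). Combining the estimates just derived with (44), we get the crucial estimate

$$\bar E \int_0^{T-\theta} \|(\rho^m u^m)(t+\theta) - (\rho^m u^m)(t)\|_{V'}^2 \, dt \le C \theta^{1/2}. \tag{45}$$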


We also need to show that

$$\bar E \int_0^{T-\theta} \|u^m(t+\theta) - u^m(t)\|_{W^{-1}_{3/2}(D)}^2 \, dt \le C \theta^{1/2}. \tag{46}$$

We note that

$$\rho^m(t+\theta)\big(u^m(t+\theta) - u^m(t)\big) = (\rho^m u^m)(t+\theta) - (\rho^m u^m)(t) - u^m(t)\big(\rho^m(t+\theta) - \rho^m(t)\big). \tag{47}$$

Let us estimate

$$\bar E \int_0^{T-\theta} \|u^m(t)(\rho^m(t+\theta) - \rho^m(t))\|_{W^{-1}_{3/2}(D)}^2 \, dt.$$
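The decomposition (47) is a pointwise algebraic identity, and the sign of its last term matters for the argument. A minimal check with exact integer samples is sketched below; the function names are hypothetical and the values are arbitrary stand-ins for $\rho^m$ and $u^m$ at the times $t$ and $t+\theta$.

```python
def lhs_47(rho_new, u_new, u_old):
    """rho(t + theta) * (u(t + theta) - u(t))."""
    return rho_new * (u_new - u_old)

def rhs_47(rho_new, rho_old, u_new, u_old):
    """(rho u)(t + theta) - (rho u)(t) - u(t) * (rho(t + theta) - rho(t))."""
    return (rho_new * u_new - rho_old * u_old) - u_old * (rho_new - rho_old)

# Arbitrary sample values at times t (old) and t + theta (new).
rho_old, rho_new, u_old, u_new = 2, 3, 5, 7
left = lhs_47(rho_new, u_new, u_old)
right = rhs_47(rho_new, rho_old, u_new, u_old)
```

Both sides evaluate to the same number for any choice of the four values, which is what allows (45) and (48) to be combined into a bound on the left-hand side of (47).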

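We have, by (38), (33) and (6) (the product $W_2^1(D) \times W_2^{-1}(D) \to W^{-1}_{3/2}(D)$), that

$$\begin{aligned}
\int_0^{T-\theta} \|u^m(t)(\rho^m(t+\theta) - \rho^m(t))\|_{W^{-1}_{3/2}(D)}^2 \, dt
&\le \int_0^{T-\theta} \|u^m(t)\|_V^2 \, \Big\| \int_t^{t+\theta} \frac{\partial \rho^m(s)}{\partial s} \, ds \Big\|_{V'}^2 dt \\
&\le C \theta^2 \Big\| \frac{\partial \rho^m(s)}{\partial s} \Big\|_{L^\infty(0,T,V')}^2 \int_0^T \|u^m(t)\|_V^2 \, dt.
\end{aligned}$$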

Taking the mathematical expectation in this inequality and using (36) and (38), we get

$$\bar E \int_0^{T-\theta} \|u^m(t)(\rho^m(t+\theta) - \rho^m(t))\|_{W^{-1}_{3/2}(D)}^2 \, dt \le C \theta^2. \tag{48}$$

Combining (45), (48) and (47), we get

$$\bar E \int_0^{T-\theta} \|\rho^m(t+\theta)(u^m(t+\theta) - u^m(t))\|_{W^{-1}_{3/2}(D)}^2 \, dt \le C \theta^{1/2}.$$

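This implies (46). We are left with another key estimate on the function

$$\Psi^m(t) = \int_D \rho^m(t) u^m(t) v \, dx$$

for $v \in V$. We claim that

$$\bar E \|\Psi^m(t+h) - \Psi^m(t)\|_{C([0,T])} \le c h^{1/4}. \tag{49}$$

We have from (12),

$$\begin{aligned}
\bar E |\Psi^m(t+h) - \Psi^m(t)|
&\le \bar E \Big| \int_t^{t+h} \int_D \rho^m u^m u^m \nabla v \, dx \, ds \Big| + \mu \bar E \Big| \int_t^{t+h} \int_D \nabla u^m \nabla v \, dx \, ds \Big| \\
&\quad + \bar E \Big| \int_t^{t+h} \int_D \rho^m f(s, u^m) v \, dx \, ds \Big| + \bar E \Big| \int_t^{t+h} \int_D \rho^m g(s, u^m) v \, dx \, d\bar W \Big|.
\end{aligned}$$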


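In view of (42), we have

$$\sup_{t \in [0, T-h]} \bar E \Big| \int_t^{t+h} \int_D \rho^m u^m u^m \nabla v \, dx \, ds \Big| \le C h^{1/4} \|\nabla v\|_H \, \bar E \|\rho^m u^m u^m\|_{L^{4/3}(0,T,(L^2(D))^9)} \le C h^{1/4}.$$

Next, using (36), we have

$$\sup_{t \in [0, T-h]} \bar E \Big| \int_t^{t+h} \int_D \nabla u^m \nabla v \, dx \, ds \Big| \le C h^{1/2} \|v\|_V \, \bar E \Big( \int_t^{t+h} \|u^m\|_V^2 \, ds \Big)^{1/2} \le C h^{1/2}.$$

By similar arguments, we have by (35)

$$\bar E \Big| \int_t^{t+h} \int_D \rho^m f(s, u^m) v \, dx \, ds \Big| \le C h^{1/2} \|\rho^m\|_{L^\infty(Q)} \|v\|_H \, \bar E \sup_{s \in [t, t+h]} \big(1 + \|u^m(s)\|_H\big) \le C h^{1/2}.$$

Finally, using the martingale inequality we have

$$\begin{aligned}
\sup_{t \in [0, T-h]} \bar E \Big| \int_t^{t+h} \int_D \rho^m g(s, u^m) v \, dx \, d\bar W \Big|
&\le \bar E \Big( \int_t^{t+h} \Big( \int_D \rho^m g(s, u^m) v \, dx \Big)^2 dt \Big)^{1/2} \\
&\le \bar E \, h \, \|v\|_H \|\rho^m\|_{L^\infty(Q)} \sup_{s \in [t, t+h]} \big(1 + \|u^m(s)\|_H\big) \\
&\le C h.
\end{aligned}$$

Hence, summarizing these estimates, we arrive at (49). Furthermore, in view of (35), we have for any $p \ge 1$

$$\bar E \sup_{t \in [0,T]} |\Psi^m(t)|^p \le C \|v\|_H^p, \qquad \bar E \sup_{t \in [0,T]} \|\rho^m u^m\|_{L^2(D)}^p \le C. \tag{50}$$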

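We now summarize our key estimates in this section. For that we introduce the spaces $X^k_{p,\mu_n,\nu_n}$ ($1 \le p < \infty$) ($k = 1, 2$) of random variables $y$ such that

(i) For $X^1_{p,\mu_n,\nu_n}$:

$$\bar E \sup_{0 \le t \le T} \|y(t)\|_{L^2(D)}^{2p} \le C, \qquad \bar E \Big( \int_0^T \|y(s)\|_V^2 \, ds \Big)^p \le C,$$

$$\bar E \sup_n \frac{1}{\nu_n} \sup_{|\theta| \le \mu_n} \int_0^{T-\theta} \|y(t+\theta) - y(t)\|_{W^{-1}_{3/2}(D)}^2 \, dt \le C;$$

endowed with the norm

$$\begin{aligned}
\|y\|_{X^1_{p,\mu_n,\nu_n}} &= \Big( \bar E \sup_{0 \le t \le T} \|y(t)\|_{L^2(D)}^{2p} \Big)^{1/2p} + \Big( \bar E \Big( \int_0^T \|y(s)\|_V^2 \, ds \Big)^{p/2} \Big)^{2/p} \\
&\quad + \bar E \sup_n \frac{1}{\nu_n} \Big( \sup_{|\theta| \le \mu_n} \int_0^{T-\theta} \|y(t+\theta) - y(t)\|_{W^{-1}_{3/2}(D)}^2 \, dt \Big)^{1/2},
\end{aligned}$$

$X^1_{p,\mu_n,\nu_n}$ is a Banach space.

(ii) For $X^2_{p,\mu_n,\nu_n}$:

$$\bar E \|y\|_{L^{8/3}(0,T,(L^4(D))^3)}^p \le C, \qquad \bar E \sup_n \frac{1}{\nu_n} \sup_{|\theta| \le \mu_n} \int_0^{T-\theta} \|y(t+\theta) - y(t)\|_{V'}^2 \, dt \le C;$$

endowed with the norm

$$\|y\|_{X^2_{p,\mu_n,\nu_n}} = \Big( \bar E \|y\|_{L^{8/3}(0,T,(L^4(D))^3)}^p \Big)^{1/p} + \bar E \sup_n \frac{1}{\nu_n} \Big( \sup_{|\theta| \le \mu_n} \int_0^{T-\theta} \|y(t+\theta) - y(t)\|_{V'}^2 \, dt \Big)^{1/2},$$

$X^2_{p,\mu_n,\nu_n}$ is a Banach space.

We define $X^3_q$ ($q$ is any positive number) as the space of random variables $y$ such that

$$\bar E \|y\|_{L^\infty(0,T,L^\infty(D))}^q \le C, \qquad \bar E \Big\| \frac{\partial y}{\partial t} \Big\|_{L^\infty(0,T,H^{-1}(D))}^q \le C;$$

endowed with the norm

$$\|y\|_{X^3_q} = \Big( \bar E \|y\|_{L^\infty(0,T,L^\infty(D))}^q \Big)^{1/q} + \Big( \bar E \Big\| \frac{\partial y}{\partial t} \Big\|_{L^\infty(0,T,H^{-1}(D))}^q \Big)^{1/q},$$

$X^3_q$ is a Banach space.

Finally we have the space $X^4_{p,\mu_n,\nu_n}$ of random variables $y$ such that

$$\bar E \|y\|_{L^\infty(0,T)}^p \le C, \qquad \sup_n \frac{1}{\nu_n} \sup_{|\theta| \le \mu_n} \bar E \|y(t+\theta) - y(t)\|_{C[0,T]} \le C,$$

which endowed with the norm

$$\|y\|_{X^4_{p,\mu_n,\nu_n}} = \Big( \bar E \|y\|_{L^\infty(0,T)}^p \Big)^{1/p} + \sup_n \frac{1}{\nu_n} \sup_{|\theta| \le \mu_n} \bar E \|y(t+\theta) - y(t)\|_{C[0,T]}$$

is a Banach space.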


Combining the estimates (20), (35), (36), (38), (41), (45), (46), (49) and (50), we have

Theorem 12. For any $p \ge 1$ and for $\mu_n$, $\nu_n$ such that the series $\sum_{n=1}^\infty \mu_n^{1/4} / \nu_n$ converges, the sequences $u^m$, $\rho^m u^m$, $\rho^m$ and $\Psi^m$ are bounded in $X^1_{p,\mu_n,\nu_n}$, $X^2_{p,\mu_n,\nu_n}$, $X^3_q$ and $X^4_{p,\mu_n,\nu_n}$ for any $n$, respectively.
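To make the summability condition on $\mu_n$ and $\nu_n$ concrete, here is one admissible pair: $\mu_n = n^{-12}$ and $\nu_n = n^{-1}$. This choice is our own illustration, not taken from the paper. Both sequences tend to zero and $\mu_n^{1/4}/\nu_n = n^{-2}$, so the series converges to $\pi^2/6$.

```python
import math

def mu(n):
    """Hypothetical choice mu_n = n^(-12); tends to zero as n grows."""
    return n ** -12.0

def nu(n):
    """Hypothetical choice nu_n = n^(-1); tends to zero as n grows."""
    return n ** -1.0

def partial_sum(N):
    """Partial sum of the series  sum_n mu_n^(1/4) / nu_n  up to n = N."""
    return sum(mu(n) ** 0.25 / nu(n) for n in range(1, N + 1))

limit = math.pi ** 2 / 6        # value of sum_n 1/n^2
approx = partial_sum(10000)     # differs from the limit by roughly 1/10000
```

Any pair with $\mu_n^{1/4}/\nu_n$ summable and both sequences vanishing would serve equally well.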

4. Tightness Property of Probability Measures Induced by Galerkin Solutions

We may rewrite Lemma 4 in the following more convenient form adapted to our situation, as in [4]. For any sequences $\mu_n$, $\nu_n$ which converge to zero as $n \to \infty$, and any $1 \le p_k, q_k \le \infty$ ($k = 1, 2, 3, 4$), the set $Y^k_{\mu_n,\nu_n}$ of functions $y \in L^{q_k}(0, T, X_k) \cap N^{p_k}_{\mu_n,\nu_n}(0, T, Y_k)$, where $N^{p_k}_{\mu_n,\nu_n}(0, T, Y_k)$ is the set

$$\Big\{ v \in L^{p_k}(0, T, Y_k) : \sup_n \frac{1}{\nu_n} \sup_{|\theta| \le \mu_n} \|v(t+\theta) - v(t)\|_{L^{p_k}(0, T-\theta, Y_k)} < \infty \Big\},$$

is relatively compact in $L^{p_k}(0, T, B_k)$; here $X_k$, $B_k$ and $Y_k$ play respectively the role of $X$, $B$ and $Y$ in Lemma 4.

Let $Y^1_{\mu_n,\nu_n}$ be the space with $X_1 = V$, $Y_1 = W^{-1}_{3/2}(D)$, $q_1 = 2$, $p_1 = 2$ and let $B_1 = L^2(D)$. Let $Y^2_{\mu_n,\nu_n}$ be the space with $X_2 = L^4(D)$, $Y_2 = V'$, $q_2 = 8/3$, $p_2 = 8/3$ and $B_2 = W_2^{-\theta}(D)$ ($0 < \theta < 1$), $W_2^{-\theta}(D)$ being the interpolation space $[L^2(D) = W_2^0(D), H^{-1}(D)]_\theta$; we refer to [34] for the needed information. Also by [34, Theorem 16.1, Chap. 1], we have that $W_2^{-\theta}(D)$ is compactly embedded into $H^{-1}(D)$. Let $Y^3_{\mu_n,\nu_n}$ be the space with $X_3 = L^\infty(D)$, $Y_3 = H^{-1}(D)$, $q_3 = \infty$, $p_3 = \infty$ and let $B_3 = W^{-1}_\infty(D)$. Let $Y^4_{\mu_n,\nu_n}$ be the space with $X_4 = B_4 = Y_4 = \mathbb{R}$, $p_4 = q_4 = \infty$.

Now we consider the set

$$S = C(0, T, \mathbb{R}^l) \times \prod_{k=1}^4 L^{p_k}(0, T, B_k)$$

and $\mathcal B(S)$, the σ-algebra of the Borel sets of $S$. For each $m$, let $\Phi$ be the map

$$\Phi : \bar\Omega \to S : \bar\omega \mapsto (\bar W(\bar\omega, \cdot), u^m(\bar\omega, \cdot), \rho^m u^m(\bar\omega, \cdot), \rho^m(\bar\omega, \cdot), \Psi^m(\bar\omega, \cdot)).$$

Since the solution is not unique in general, this map is multivalued. However, a selection can be made to suit our needs. Precise arguments can be found in [5]. So we make use of the map modulo a selection. For each $m$, we introduce a probability measure $\pi_m$ on $(S, \mathcal B(S))$ by

$$\pi_m(A) = \bar P(\Phi^{-1}(A)) \quad \text{for all } A \in \mathcal B(S).$$


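The main result of this section is

Theorem 13. The family of probability measures $\{\pi_m : m \in \mathbb{N}\}$ is tight.

Proof. For $\varepsilon > 0$ we should find the compact subsets

$$\Sigma_\varepsilon \subset C(0, T, \mathbb{R}^l), \qquad Y_\varepsilon \subset \prod_{k=1}^4 L^{p_k}(0, T, B_k)$$

such that

$$\bar P\{\bar\omega : \bar W(\bar\omega, \cdot) \notin \Sigma_\varepsilon\} \le \varepsilon/2, \tag{51}$$

$$\bar P\{\bar\omega : (u^m(\bar\omega, \cdot), \rho^m u^m(\bar\omega, \cdot), \rho^m(\bar\omega, \cdot), \Psi^m(\bar\omega, \cdot)) \notin Y_\varepsilon\} \le \varepsilon/2. \tag{52}$$

The quest for $\Sigma_\varepsilon$ is made by taking account of some facts about the Wiener process such as the formula

$$E|B(t_2) - B(t_1)|^{2j} = (2j-1)!! \, (t_2 - t_1)^j, \qquad j = 1, 2, \ldots. \tag{53}$$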
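Formula (53) can be sanity-checked against the closed form of Gaussian absolute moments, $E|Z|^{2j} = 2^j \Gamma(j + 1/2) / \sqrt{\pi}$ for a standard normal $Z$. The sketch below treats a scalar Brownian increment and reads the combinatorial factor in (53) as the odd double factorial $1 \cdot 3 \cdots (2j-1)$, which is the standard value; the function names are hypothetical.

```python
import math

def double_factorial_odd(j):
    """(2j - 1)!! = 1 * 3 * 5 * ... * (2j - 1)."""
    out = 1
    for k in range(1, 2 * j, 2):
        out *= k
    return out

def gaussian_even_moment(j, dt):
    """E|B(t2) - B(t1)|^(2j) for a scalar Brownian increment with variance dt = t2 - t1."""
    return (2 * dt) ** j * math.gamma(j + 0.5) / math.sqrt(math.pi)

dt = 0.7
moments = [gaussian_even_moment(j, dt) for j in range(1, 6)]
formula = [double_factorial_odd(j) * dt ** j for j in range(1, 6)]
```

For $j = 2$ the factor is $3$, which is the fourth-moment constant used in the tightness estimate for the Wiener process.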

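For a constant $L_\varepsilon$ depending on $\varepsilon$ to be chosen later and $n \in \mathbb{N}$, we consider the set

$$\Sigma_\varepsilon = \Big\{ w(\cdot) \in C(0, T, \mathbb{R}^l) : \sup\{ n |w(t_2) - w(t_1)| : t_1, t_2 \in [0, T],\ |t_2 - t_1| < n^{-6} \} \le L_\varepsilon \ \text{for all } n \Big\}.$$

The set $\Sigma_\varepsilon$ is relatively compact in $C(0, T, \mathbb{R}^l)$ by the Arzelà–Ascoli theorem. Making use of Markov's inequality

$$P\{\bar\omega : \xi(\bar\omega) \ge \alpha\} \le \frac{1}{\alpha^k} E[|\xi(\bar\omega)|^k]$$

for a random variable $\xi$ on $(\bar\Omega, \bar{\mathcal F}, \bar P)$ and positive numbers $\alpha$ and $k$, we get

$$\begin{aligned}
\bar P\{\bar\omega : \bar W(\bar\omega, \cdot) \notin \Sigma_\varepsilon\}
&\le \bar P \Big[ \bigcup_n \Big\{ \bar\omega : \sup_{t_1, t_2 \in [0,T],\, |t_2 - t_1| < n^{-6}} |\bar W(t_2) - \bar W(t_1)| > L_\varepsilon / n \Big\} \Big] \\
&\le \sum_{n=1}^\infty \sum_{i=0}^{n^6 - 1} \Big( \frac{n}{L_\varepsilon} \Big)^4 E \sup_{i T n^{-6} \le t \le (i+1) T n^{-6}} |\bar W(t) - \bar W(i T n^{-6})|^4 \\
&\le C \sum_{n=1}^\infty \Big( \frac{n}{L_\varepsilon} \Big)^4 (T n^{-6})^2 n^6 = \frac{C}{L_\varepsilon^4} \sum_{n=1}^\infty \frac{1}{n^2}.
\end{aligned}$$

We choose

$$L_\varepsilon^{-4} = \frac{\varepsilon}{2C} \Big( \sum_{n=1}^\infty \frac{1}{n^2} \Big)^{-1}$$

to get (51).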
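The arithmetic of the last two steps can be sketched numerically: each summand $n^6 (n/L)^4 \, C (T n^{-6})^2$ collapses to $C T^2 / (L^4 n^2)$, so the bound equals $(C T^2 / L^4) \sum_n n^{-2}$. In the code below the constant `C`, the horizon `T` and the helper names are illustrative assumptions; the paper absorbs $T^2$ into $C$, whereas we keep it explicit.

```python
import math

def wiener_tail_bound(C, T, L, N=20000):
    """Partial sum over n <= N of  n^6 * (n / L)^4 * C * (T * n^-6)^2."""
    return sum(n ** 6 * (n / L) ** 4 * C * (T * n ** -6.0) ** 2 for n in range(1, N + 1))

def choose_L(C, T, eps):
    """Value of L for which  C * T^2 / L^4 * (pi^2 / 6)  equals  eps / 2."""
    return (2 * C * T ** 2 * math.pi ** 2 / (6 * eps)) ** 0.25

C, T, eps = 2.0, 1.0, 0.1        # illustrative constants
L = choose_L(C, T, eps)
bound = wiener_tail_bound(C, T, L)
```

With this choice the partial sums stay below $\varepsilon/2$, which is the content of (51).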


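Next we choose $Y_\varepsilon$ as a ball of radius $M_\varepsilon$ in $Y^1_{\mu_n,\nu_n} \times Y^2_{\mu_n,\nu_n} \times Y^3_{\mu_n,\nu_n} \times Y^4_{\mu_n,\nu_n}$ centered at zero and with $\mu_n$, $\nu_n$ independent of $\varepsilon$, converging to zero and such that $\sum_n \nu_n^{-1} \mu_n^{1/4}$ converges. As remarked above, $Y_\varepsilon$ is a compact subset of

$$\prod_{k=1}^4 L^{p_k}(0, T, B_k).$$

We have further

$$\begin{aligned}
&\bar P\{\bar\omega : (u^m(\bar\omega, \cdot), \rho^m u^m(\bar\omega, \cdot), \rho^m(\bar\omega, \cdot), \Psi^m(\bar\omega, \cdot)) \notin Y_\varepsilon\} \\
&\quad \le \bar P\{\bar\omega : \|u^m\|_{Y^1_{\mu_n,\nu_n}} > M_\varepsilon\} + \bar P\{\bar\omega : \|\rho^m u^m\|_{Y^2_{\mu_n,\nu_n}} > M_\varepsilon\} \\
&\qquad + \bar P\{\bar\omega : \|\rho^m\|_{Y^3_{\mu_n,\nu_n}} > M_\varepsilon\} + \bar P\{\bar\omega : \|\Psi^m\|_{Y^4_{\mu_n,\nu_n}} > M_\varepsilon\} \\
&\quad \le \frac{1}{M_\varepsilon} \Big( \bar E \|u^m\|_{Y^1_{\mu_n,\nu_n}} + \bar E \|\rho^m u^m\|_{Y^2_{\mu_n,\nu_n}} + \bar E \|\rho^m\|_{Y^3_{\mu_n,\nu_n}} + \bar E \|\Psi^m\|_{Y^4_{\mu_n,\nu_n}} \Big) \\
&\quad \le \frac{C}{M_\varepsilon}.
\end{aligned}$$

Choosing $M_\varepsilon = 2C\varepsilon^{-1}$ we get (52). From (51) and (52), we have

$$\bar P\{\bar\omega : \bar W(\bar\omega, \cdot) \in \Sigma_\varepsilon;\ (u^m(\bar\omega, \cdot), \rho^m u^m(\bar\omega, \cdot), \rho^m(\bar\omega, \cdot), \Psi^m(\bar\omega, \cdot)) \in Y_\varepsilon\} \ge 1 - \varepsilon.$$

This proves that

$$\pi_m(\Sigma_\varepsilon \times Y_\varepsilon) \ge 1 - \varepsilon, \qquad \forall \varepsilon > 0,$$

and hence the theorem.

In view of the just proven tightness of $\{\pi_m\}$, we have from Lemma 6 that there exist a subsequence $\{\pi_{m_j}\}$ and a measure $\pi$ such that $\pi_{m_j} \to \pi$ weakly. By Skorokhod's Lemma 7, there exist a probability space $(\Omega, \mathcal F, P)$ and random variables $(W^{m_j}, u^{m_j}, \rho^{m_j} u^{m_j}, \rho^{m_j}, \Psi^{m_j})$, $(W, u, g, \rho, \Psi)$ on $(\Omega, \mathcal F, P)$ with values in $S$ such that the probability law of $(W^{m_j}, u^{m_j}, \rho^{m_j} u^{m_j}, \rho^{m_j}, \Psi^{m_j})$ is $\pi_{m_j}$; hence $\{W^{m_j}\}$ is a sequence of $l$-dimensional Wiener processes. Furthermore

$$(W^{m_j}, u^{m_j}, \rho^{m_j} u^{m_j}, \rho^{m_j}, \Psi^{m_j}) \to (W, u, g, \rho, \Psi) \ \text{in } S, \quad P\text{-a.s.} \tag{54}$$

and the probability law of $(W, u, g, \rho, \Psi)$ is $\pi$. Set

$$\mathcal F^t = \sigma\{W(s), u(s), \rho(s)\}_{s \in [0,t]}.$$

We show that $W(t)$ is an $\mathcal F^t$-standard Wiener process. For this we use the following characterization of Wiener processes through their characteristic functions (see [21])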


which stipulates that for any $m \in \mathbb{N}$, $0 = t_0 < t_1 < \cdots < t_m$ and $v_0, v_1, \ldots, v_m$

$$E \exp\Big\{ \sum_{k=1}^m i v_k [W(t_k) - W(t_{k-1})] - i v_0 W(t_0) \Big\} = \exp\Big\{ -\frac{1}{2} \sum_{k=1}^m v_k^2 (t_k - t_{k-1}) \Big\}. \tag{55}$$

(55) will follow if we can prove that for the conditional characteristic function we have

$$E[\exp\{i v [W(t+h) - W(t)]\} / \mathcal F^t] = \exp\Big\{ -\frac{v^2 h}{2} \Big\} \tag{56}$$

for all $h > 0$ and any $v$. Note that for any given σ-algebra $\mathcal F$ and random variables $X$ and $Y$ on a probability space $(\tilde\Omega, \tilde{\mathcal F}, \tilde P)$ on which the mathematical expectation is denoted by $E$, if $X$ is $\mathcal F$-measurable and $E|Y|, E|XY| < \infty$, then

$$E(XY / \mathcal F) = X E(Y / \mathcal F), \qquad E E(Y / \mathcal F) = E(Y),$$

that is, $E(XY) = E(X E(Y / \mathcal F))$. Using this fact we see that (56) will be proved if for any continuous bounded functional $\Lambda_t(W(\cdot), u(\cdot), \rho(\cdot))$ on $S$ depending only on the values of $W$, $u$ and $\rho$ on the interval $(0, t)$, we have

$$E[\exp\{i z [W(t+h) - W(t)]\} \Lambda_t(W(\cdot), u(\cdot), \rho(\cdot))] = \exp\Big\{ -\frac{z^2 h}{2} \Big\} E \Lambda_t(W(\cdot), u(\cdot), \rho(\cdot)). \tag{57}$$

Since $[W^{m_j}(t+h) - W^{m_j}(t)]$ are independent of $\Lambda_t(W^{m_j}, u^{m_j}, \rho^{m_j})$ and $W^{m_j}$ is a Wiener process,

$$\begin{aligned}
E[\exp\{i z [W^{m_j}(t+h) - W^{m_j}(t)]\} \Lambda_t(W^{m_j}, u^{m_j}, \rho^{m_j})]
&= E \exp\{i z [W^{m_j}(t+h) - W^{m_j}(t)]\} \, E \Lambda_t(W^{m_j}, u^{m_j}, \rho^{m_j}) \\
&= \exp\Big\{ -\frac{z^2 h}{2} \Big\} E \Lambda_t(W^{m_j}, u^{m_j}, \rho^{m_j}).
\end{aligned}$$

In view of (54) and the continuity of $\Lambda_t$, we can pass to the limit in this equality and get (57).

It can be shown that $W^{m_j}$, $u^{m_j}$, $\rho^{m_j}$ satisfy the approximating equations (12) and (25) with $m$ replaced by $m_j$. In particular

$$\operatorname{div} u^{m_j} = 0, \tag{58}$$

$$\frac{\partial \rho^{m_j}}{\partial t} + (u^{m_j} \cdot \nabla) \rho^{m_j} = 0, \tag{59}$$

$$u^{m_j}(0) = u_0^{m_j}, \qquad \rho^{m_j}(0) = \rho_0^{m_j}, \tag{60}$$


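$$\begin{aligned}
&\int_D (\rho^{m_j} u^{m_j})(t) v \, dx - \int_0^t \int_D \rho^{m_j} u^{m_j} u^{m_j} \nabla v \, dx \, dt + \mu \int_0^t \int_D \nabla u^{m_j} \cdot \nabla v \, dx \, dt \\
&\quad = \int_D \rho_0^{m_j} u_0^{m_j} v \, dx + \int_0^t \int_D \rho^{m_j} f(t, u^{m_j}) v \, dx \, dt + \int_0^t \int_D \rho^{m_j} g(t, u^{m_j}) v \, dx \, dW^{m_j}.
\end{aligned} \tag{61}$$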
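The characterization used above rests on the fact that a scalar Wiener increment $W(t+h) - W(t)$ is Gaussian with variance $h$, so its characteristic function is $\exp(-v^2 h / 2)$, as in (56). A minimal quadrature sketch of this identity follows; the function name, grid size and truncation width are our own illustrative choices.

```python
import math

def char_fn_increment(v, h, n=20001, width=12.0):
    """Trapezoid-rule value of E[cos(v X)] for X ~ N(0, h).

    The imaginary part E[sin(v X)] vanishes by symmetry, so this is the full
    characteristic function E[exp(i v X)]. The Gaussian density is truncated at
    `width` standard deviations, where it is negligible.
    """
    sd = math.sqrt(h)
    a, b = -width * sd, width * sd
    dx = (b - a) / (n - 1)
    total = 0.0
    for i in range(n):
        x = a + i * dx
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * math.cos(v * x) * math.exp(-x * x / (2 * h))
    return total * dx / math.sqrt(2 * math.pi * h)

v, h = 1.3, 0.5
numeric = char_fn_increment(v, h)
exact = math.exp(-v * v * h / 2)
```

The two values agree to many digits, since the trapezoid rule converges very quickly for smooth, rapidly decaying integrands.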

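5. Passage to the Limit

In Theorem 12 let us take $p = 2$. We have that

$$u^{m_j} \to u \quad \text{weakly-star in } L^4(\Omega, \mathcal F, P, L^\infty(0, T, H)), \tag{62}$$

$$u^{m_j} \to u \quad \text{weakly in } L^2(\Omega, \mathcal F, P, L^2(0, T, V)). \tag{63}$$

By (35), (54) and Vitali's theorem, we have

$$u^{m_j} \to u \quad \text{strongly in } L^2(\Omega, \mathcal F, P, L^2(0, T, H)). \tag{64}$$

Thus for fixed $x$,

$$u^{m_j} \to u \quad \text{a.e. } (t, \omega) \text{ with respect to the measure } dt \otimes dP. \tag{65}$$

Next we have

$$\rho^{m_j} u^{m_j} \to g \quad \text{weakly-star in } L^2(\Omega, \mathcal F, P, L^\infty(0, T, L^2(D))). \tag{66}$$

By (35) and the uniform boundedness of $\rho^{m_j}$,

$$E \|\rho^{m_j} u^{m_j}\|_{L^\infty(0,T,L^2(D))}^4 \le C.$$

This implies that

$$E \|\rho^{m_j} u^{m_j}\|_{L^\infty(0,T,H^{-1/2}(D))}^4 \le C.$$

This together with (54) and Vitali's theorem gives

$$\rho^{m_j} u^{m_j} \to g \quad \text{strongly in } L^2(\Omega, \mathcal F, P, L^\infty(0, T, W_2^{-\theta}(D))). \tag{67}$$

We have that $\rho^{m_j}$ is bounded in $X^3_q$ for any $q > 0$. Taking $q = 4$ we get

$$\rho^{m_j} \to \rho \quad \text{weakly-star in } L^4(\Omega, \mathcal F, P, L^\infty(0, T, L^\infty(D))) \tag{68}$$

and

$$E \|\rho^{m_j}\|_{L^\infty(0,T,W^{-1}_\infty(D))}^4 \le C.$$

This estimate combined with (54) and Vitali's theorem implies that

$$\rho^{m_j} \to \rho \quad \text{strongly in } L^2(\Omega, \mathcal F, P, C(0, T, W^{-1}_\infty(D))). \tag{69}$$

(42) gives

$$\rho^{m_j} u^{m_j} u^{m_j} \to h \quad \text{weakly in } L^2(\bar\Omega, \bar{\mathcal F}, \bar P, L^{4/3}(0, T, (L^2(D))^9)).$$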


The product $W^{-1}_\infty(D) \times H^1(D) \to W_6^{-1}(D)$ is continuous. Thus (69) and (63) give

$$\rho^{m_j} u^{m_j} \to \rho u \quad \text{weakly in } L^2(\Omega, \mathcal F, P, L^2(0, T, W_6^{-1}(D))). \tag{70}$$

And taking account of (67) we get

$$g = \rho u. \tag{71}$$

Similarly, since the product $W_6^{-1}(D) \times H^1(D) \to W^{-1}_{3/2}(D)$ is continuous, we have from (67), (70) and (63) that

$$\rho^{m_j} u^{m_j} u^{m_j} \to g u = \rho u u \quad \text{weakly in } L^2(\Omega, \mathcal F, P, L^2(0, T, W^{-1}_{3/2}(D))). \tag{72}$$

Next, in view of (67), (35), the conditions on $f$ and Vitali's theorem we have

$$f(\cdot, u^{m_j}(\cdot)) \to f(\cdot, u(\cdot)) \quad \text{in } L^2(\Omega, \mathcal F, P, L^2(0, T, H)). \tag{73}$$

Similarly, owing to the conditions on $g$,

$$g(\cdot, u^{m_j}(\cdot)) \to g(\cdot, u(\cdot)) \quad \text{in } L^2(\Omega, \mathcal F, P, L^2(0, T, H^d)). \tag{74}$$

Using this convergence with (69) and (54), we can show that

$$\int_0^t \rho^{m_j} g(s, u^{m_j}(s)) \, dW^{m_j}(s) \to \int_0^t \rho g(s, u(s)) \, dW(s) \tag{75}$$

weakly in $L^2(\Omega, \mathcal F, P, L^2(D))$. We skip the details and instead refer to [50], where a similar situation is dealt with thoroughly. A key role is played by Lemma 2.

Next, in view of (54), $\Psi^{m_j}(\omega, \cdot) \to \Psi(\omega, \cdot)$ uniformly in $C([0, T])$, $P$-a.s. Hence, owing to (50) and Vitali's theorem, we get

$$\Psi^{m_j} \to \Psi \quad \text{strongly in } L^1(\Omega, \mathcal F, P, C(0, T, \mathbb{R})).$$

Hence

$$\Psi^{m_j}(0) = \int_D \rho^{m_j}(0) u^{m_j}(0) v \, dx \to \int_D \rho(0) u(0) v \, dx.$$

But

$$\int_D \rho^{m_j}(0) u^{m_j}(0) v \, dx = \int_D \rho_0 u_0 v \, dx.$$

Thus

$$\int_D \rho(0) u(0) v \, dx = \int_D \rho_0 u_0 v \, dx. \tag{76}$$

Also, passing to the limit in (17), we get

$$\inf_D \rho_0 \le \rho \le \sup_D \rho_0. \tag{77}$$

Combining all these convergences, we can pass to the limit in the weak formulation of problem (58)–(61) and obtain the claim of our main result.


Acknowledgments

This work is supported by the National Science Foundation under the agreement No. DMS-0635607 and by the National Research Foundation of South Africa. The results were obtained during my stay at the Institute for Advanced Study in fall of 2009. I thank the institute for providing excellent conditions of work. I thank Professor Ya. G. Sinai for stimulating discussions on the results of the paper and encouragement. Until the paper was completed, I was not aware of the work of Professor F. H. Yashima who informed me during the Panafrican Congress of Mathematicians in Yamoussoukro (Côte d'Ivoire) in August 2009. My sincere gratitude is due to him for sending his thesis [64]. I thank one of the reviewers for interesting comments that improved the paper.

References

[1] S. N. Antontsev, A. V. Kazhikhov and V. N. Monakhov, Boundary Value Problems in Mechanics of Nonhomogeneous Fluids, Studies in Mathematics and Its Applications, Vol. 22 (North-Holland Publishing Co., Amsterdam, 1990).
[2] A. Bensoussan, Some existence results for stochastic partial differential equations, in Stochastic Partial Differential Equations and Applications (Trento, 1990), Pitman Res. Notes Math. Ser., Vol. 268 (Longman Scientific and Technical, Harlow, UK, 1992), pp. 37–53.
[3] A. Bensoussan, Results on stochastic Navier–Stokes equations, in Control of Partial Differential Equations (Trento, 1993), Lecture Notes in Pure and Appl. Math., Vol. 165 (Dekker, New York, 1994), pp. 11–21.
[4] A. Bensoussan, Stochastic Navier–Stokes equations, Acta Appl. Math. 38 (1995) 267–304.
[5] A. Bensoussan and R. Temam, Équations stochastiques du type Navier–Stokes, J. Funct. Anal. 13 (1973) 195–222.
[6] J. Bergh and J. Löfström, Interpolation Spaces. An Introduction, Grundlehren der Mathematischen Wissenschaften, No. 223 (Springer-Verlag, Berlin-New York, 1976).
[7] J. Bricmont, A. Kupiainen and R. Lefevere, Exponential mixing of the 2D stochastic Navier–Stokes dynamics, Comm. Math. Phys. 230(1) (2002) 87–132.
[8] J. Bricmont, A. Kupiainen and R. Lefevere, Ergodicity of the 2D Navier–Stokes equations with random forcing. Dedicated to Joel L. Lebowitz, Comm. Math. Phys. 224(1) (2001) 65–81.
[9] J. Bricmont, A. Kupiainen and R. Lefevere, Probabilistic estimates for the two-dimensional stochastic Navier–Stokes equations, J. Statist. Phys. 100(3–4) (2000) 743–756.
[10] Z. Brzezniak, M. Capinski and F. Flandoli, Stochastic Navier–Stokes equations with multiplicative noise, Stochastic Anal. Appl. 10(5) (1992) 523–532.
[11] Z. Brzeźniak and Y. Li, Asymptotic compactness and absorbing sets for 2D stochastic Navier–Stokes equations on some unbounded domains, Trans. Amer. Math. Soc. 358(12) (2006) 5587–5629.
[12] N. J. Cutland and B. Enright, Stochastic nonhomogeneous incompressible Navier–Stokes equations, J. Differential Equations 228(1) (2006) 140–170.
[13] G. Da Prato and J. Zabczyk, Stochastic Equations in Infinite Dimensions, Encyclopedia of Mathematics and Its Applications, Vol. 44 (Cambridge University Press, Cambridge, 1992).


[14] G. Da Prato and J. Zabczyk, Ergodicity for Infinite-Dimensional Systems, London Mathematical Society Lecture Note Series, Vol. 229 (Cambridge University Press, Cambridge, 1996).
[15] G. Deugoue and M. Sango, On the stochastic 3D Navier–Stokes-alpha model of fluids turbulence, Abstr. Appl. Anal. 2009 (2009), Article ID 723236, 27 pp.
[16] G. Deugoue and M. Sango, On the strong solution for the 3D stochastic Leray-Alpha Model, Boundary Value Problems 2010 (2010), Article ID 723018, 31 pp.
[17] E. Weinan and Ya. G. Sinai, New results in mathematical and statistical hydrodynamics, Russian Math. Surveys 55(4) (2000) 635–666.
[18] E. Feireisl, Dynamics of Viscous Compressible Fluids, Oxford Lecture Series in Mathematics and Its Applications, Vol. 26 (Oxford University Press, Oxford, 2004).
[19] F. Flandoli and D. Gatarek, Martingale solutions and stationary solutions for stochastic Navier–Stokes equations, Probab. Theory Relat. Fields 102 (1995) 367–391.
[20] J.-F. Gerbeau and C. Le Bris, Existence of solution for a density-dependent magnetohydrodynamic equation, Adv. Differential Equations 2(3) (1997) 427–452.
[21] I. I. Gikhman and A. V. Skorohod, Stochastic Differential Equations, Ergebnisse der Mathematik und ihrer Grenzgebiete, Band 72 (Springer-Verlag, New York-Heidelberg, 1972).
[22] Y. Cho and H. Kim, Unique solvability for the density-dependent Navier–Stokes equations, Nonlinear Anal. 59(4) (2004) 465–489.
[23] H. J. Choe and H. Kim, Strong solutions of the Navier–Stokes equations for nonhomogeneous incompressible fluids, Comm. Partial Differential Equations 28(5–6) (2003) 1183–1201.
[24] A. V. Kazhikhov, Solvability of the initial-boundary value problem for the equations of the motion of an inhomogeneous viscous incompressible fluid, Dokl. Akad. Nauk SSSR 216 (1974) 1008–1010 (in Russian).
[25] S. B. Kuksin, Randomly Forced Nonlinear PDEs and Statistical Hydrodynamics in 2 Space Dimensions, Zurich Lectures in Advanced Mathematics (European Mathematical Society (EMS), Zürich, 2006), x+93 pp.
[26] S. Kuksin and A. Shirikyan, Ergodicity for the randomly forced 2D Navier–Stokes equations, Math. Phys. Anal. Geom. 4(2) (2001) 147–195.
[27] O. A. Ladyzhenskaya, The Mathematical Theory of Viscous Incompressible Flow, 2nd edn., revised and enlarged (Gordon and Breach, Science Publishers, New York-London-Paris, 1969).
[28] O. A. Ladyzhenskaja and V. A. Solonnikov, The unique solvability of an initial-boundary value problem for viscous incompressible inhomogeneous fluids. Boundary value problems of mathematical physics, and related questions of the theory of functions, 8, Zap. Nauch. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI) 52 (1975) 52–109, 218–219 (in Russian).
[29] J. Leray, Sur le système d'équations aux dérivées partielles qui régit l'écoulement permanent des fluides visqueux, C. R. Acad. Sci. Paris 192 (1931) 1180–1182.
[30] J. Leray, Sur le mouvement d'un liquide visqueux emplissant l'espace, Acta Math. 63(1) (1934) 193–248.
[31] J. Leray, Etude de diverses équations intégrales non lineaires et de quelques problèmes que pose l'hydrodynamique, J. Math. Pure Appl. (9) 12 (1933) 1–82.
[32] J. L. Lions, Quelques méthodes de résolution des problèmes aux limites non linéaires (Dunod, Gauthier-Villars, Paris, 1969; Russian translation by Mir).
[33] J.-L. Lions, On some problems connected with Navier–Stokes equations, in Nonlinear Evolution Equations (Proc. Sympos., Univ. Wisconsin, Madison, Wis., 1977), Publ. Math. Res. Center Univ. Wisconsin, Vol. 40 (Academic Press, New York-London, 1978), pp. 59–84.
[34] J.-L. Lions and E. Magenes, Non-Homogeneous Boundary Value Problems and Applications, Vol. I, Die Grundlehren der Mathematischen Wissenschaften, Band 181 (Springer-Verlag, New York-Heidelberg, 1972).
[35] P.-L. Lions, Mathematical Topics in Fluid Mechanics, Vol. 1, Incompressible Models (The Clarendon Press, Oxford University Press, New York, 1996).
[36] P.-L. Lions, Mathematical Topics in Fluid Mechanics, Vol. 2, Compressible Models (The Clarendon Press, Oxford University Press, New York, 1998).
[37] P.-L. Lions, Limites incompressible et acoustique pour des fluides visqueux, compressibles et isentropiques, C. R. Acad. Sci. Paris Sér. I Math. 317(12) (1993) 1197–1202.
[38] P.-L. Lions, Compacité des solutions des équations de Navier–Stokes compressibles isentropiques, C. R. Acad. Sci. Paris Sér. I Math. 317(1) (1993) 115–120.
[39] P.-L. Lions, Existence globale de solutions pour les équations de Navier–Stokes compressibles isentropiques, C. R. Acad. Sci. Paris Sér. I Math. 316(12) (1993) 1335–1340.
[40] R. Mikulevicius and B. L. Rozovskii, Global L2-solutions of stochastic Navier–Stokes equations, Ann. Probab. 33(1) (2005) 137–176.
[41] R. Mikulevicius and B. L. Rozovskii, Stochastic Navier–Stokes equations for turbulent flows, SIAM J. Math. Anal. 35(5) (2004) 1250–1310.
[42] S.-E. Mohammed and T. S. Zhang, Dynamics of Stochastic 2D Navier–Stokes, to appear in J. Funct. Anal.
[43] A. S. Monin and A. M. Yaglom, Statistical Fluid Mechanics: Mechanics of Turbulence, Vols. I, II (Dover Publications, Dover Ed Edition, 2007).
[44] A. Novotny and I. Stravskraba, Introduction to the Mathematical Theory of Compressible Flow, Oxford Lecture Series in Mathematics and Its Applications, Vol. 27 (Oxford University Press, Oxford, 2004), xx+506 pp.
[45] Yu. V. Prohorov, Convergence of random processes and limit theorems in probability theory, Teor. Veroyatnost. i Primenen. 1 (1956) 177–238 (in Russian).
[46] P. A. Razafimandimby and M. Sango, Weak solutions of a stochastic model for two-dimensional second grade fluids, Boundary Value Problems 2010 (2010), Article ID 636140, 47 pp.
[47] P. A. Razafimandimby and M. Sango, Asymptotic behavior of solutions of stochastic evolution equations for second grade fluids, to appear in C. R. Math. Acad. Sci. Paris.
[48] M. Sango, Existence result for a doubly degenerate quasilinear stochastic parabolic equation, Proc. Japan Acad. Ser. A Math. Sci. 81(5) (2005) 89–94.
[49] M. Sango, Weak solutions for a doubly degenerate quasilinear parabolic equation with random forcing, Discrete Contin. Dyn. Syst. Ser. B 7(4) (2007) 885–905.
[50] M. Sango, Magnetohydrodynamic turbulent flows: Existence results, Phys. D 239 (2010) 912–923.
[51] J. Simon, Compact sets in the space Lp(0, T; B), Ann. Mat. Pura Appl. 146(4) (1987) 65–96.
[52] J. Simon, Sur les fluides visqueux incompressibles et non homogènes, C. R. Acad. Sci. Paris Sér. I Math. 309(7) (1989) 447–452.
[53] J. Simon, Écoulement d'un fluide non homogène avec une densité initiale s'annulant, C. R. Acad. Sci. Paris Sér. A-B 287(15) (1978) A1009–A1012.
[54] J. Simon, Nonhomogeneous viscous incompressible fluids: Existence of velocity, density, and pressure, SIAM J. Math. Anal. 21(5) (1990) 1093–1117.


[55] A. V. Skorokhod, Limit theorems for stochastic processes, Teor. Veroyatnost. i Primenen. 1 (1956) 289–319.
[56] A. V. Skorokhod, Studies in the Theory of Random Processes (Scripta Technica, Inc. Addison-Wesley Publishing Co., Inc., Reading, Mass., 1965); Translated from the Russian.
[57] R. Temam, Navier–Stokes Equations. Theory and Numerical Analysis, Studies in Mathematics and Its Applications, Vol. 2 (North-Holland Publishing Co., Amsterdam-New York-Oxford, 1977).
[58] E. Tornatore and H. Fujita Yashima, One-dimensional stochastic equations for a viscous barotropic gas, Ricerche Mat. 46(2) (1997) 255–283 (in Italian).
[59] E. Tornatore, Global solution of bi-dimensional stochastic equation for a viscous gas, NoDEA Nonlinear Differential Equations Appl. 7(4) (2000) 343–360.
[60] M. I. Vishik, A. I. Komech and A. V. Fursikov, Some mathematical problems of statistical hydromechanics, Uspekhi Mat. Nauk 34(5)(209) (1979) 135–210 (in Russian).
[61] M. I. Vishik and A. V. Fursikov, Mathematical Problems of Statistical Hydromechanics, Mathematics and Its Applications (Kluwer, Dordrecht, 1988).
[62] M. Viot, Solutions faibles d'equations aux derivees partielles stochastiques non lineaires, Doctor of Sciences thesis, Paris 6 (1973).
[63] H. F. Yashima, Equations stochastiques d'un gaz visqueux isotherme dans un domaine monodimensionnel infini, Acta Math. Vietnam. 26(2) (2001) 147–168.
[64] H. F. Yashima, Equations de Navier–Stokes stochastiques non homogènes et applications, Tesi di Perfezionamento, Scuola Normale Superiore, Pisa (1992), 169 pp.

July 12, J070-S0129055X10004065

2010 12:1 WSPC/S0129-055X

148-RMP

Reviews in Mathematical Physics Vol. 22, No. 6 (2010) 699–732
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004065

DERIVATIONS OF THE TRIGONOMETRIC $BC_n$ SUTHERLAND MODEL BY QUANTUM HAMILTONIAN REDUCTION

L. FEHÉR∗,‡ and B. G. PUSZTAI†,§

∗Department of Theoretical Physics, MTA KFKI RMKI, H-1525 Budapest, P.O.B. 49, Hungary and Department of Theoretical Physics, University of Szeged, Tisza Lajos krt 84-86, H-6720 Szeged, Hungary
†Bolyai Institute, University of Szeged, Aradi vértanúk tere 1, H-6720 Szeged, Hungary
‡[email protected]
§[email protected]

Received 7 October 2009

The $BC_n$ Sutherland Hamiltonian with coupling constants parametrized by three arbitrary integers is derived by reductions of the Laplace operator of the group $U(N)$. The reductions are obtained by applying the Laplace operator on spaces of certain vector valued functions equivariant under suitable symmetric subgroups of $U(N) \times U(N)$. Three different reduction schemes are considered, the simplest one being the compact real form of the reduction of the Laplacian of $GL(2n, \mathbb{C})$ to the complex $BC_n$ Sutherland Hamiltonian previously studied by Oblomkov.

Keywords: Integrable many-body systems; quantum Hamiltonian reduction; polar action.

Mathematics Subject Classification 2010: 22E70, 53C80, 81R12

1. Introduction

The family of Calogero–Sutherland type many-body models is very important both in physics and mathematics, as is amply demonstrated in the reviews [1–6]. In this paper, we focus on the group theoretic derivation of the trigonometric Sutherland models introduced by Olshanetsky and Perelomov [7] in correspondence with the crystallographic root systems. The Hamiltonian of the model associated with the root system $R$ is given by

$$H_R = -\frac{1}{2} \Delta + \frac{1}{4} \sum_{\alpha \in R} \frac{|\alpha|^2 \mu_\alpha (\mu_\alpha + 2\mu_{2\alpha} - 1)}{\sin^2(\alpha \cdot q)}, \tag{1.1}$$

where $\Delta$ is the Laplacian on the Euclidean space of the roots and the $\mu_\alpha$ are arbitrary real constants depending only on the lengths of the roots, with $\mu_{2\alpha} := 0$ if $2\alpha \notin R$. In the original $A_{n-1}$ case, the model was solved by Sutherland [8]. An interesting general observation [9] is that the radial part of the Laplace operator of any compact Riemannian symmetric space is always conjugate to a Sutherland operator (1.1) built on the root system of the symmetric space, with coupling constants determined by the multiplicities of the roots. This observation showed the algebraic integrability of the resulting Hamiltonians $H_R$ at (small) finite sets of coupling constants and inspired later developments. The integrability, and exact solvability in terms of a triangular structure, was first established for the models (1.1) in full generality by Heckman and Opdam [10, 11]. Their technique is based on differential-reflection operators belonging to the Hecke algebraic generalization of harmonic analysis [2, 12].

The Hecke algebraic approach is very powerful, but it is still desirable to treat as many cases of the models (1.1) in group theoretic terms as possible. Important progress in this direction was achieved by Etingof, Frenkel and Kirillov [13], who worked out the quantum mechanical version of the classical Hamiltonian reduction due to Kazhdan, Kostant and Sternberg [14] and thereby showed that the $A_{n-1}$ Sutherland Hamiltonian arises as the restriction of the Laplace operator of $SU(n)$ to certain vector valued spherical functions. A spherical function $F$ on $SU(n)$ with values in the $SU(n)$ module $V$ satisfies the equivariance condition $F(gxg^{-1}) = g \cdot F(x)$ and thus it is uniquely determined by its restriction to the maximal torus $T < SU(n)$. It is easily seen that the restricted function $f = F|_T$ must vary in the zero-weight subspace $V^T$, and the action of the Laplace operator of $SU(n)$ on $F$ can be expressed by the action of a scalar differential operator on $f$ whenever $\dim(V^T) = 1$.
This latter condition singles out the symmetric tensorial powers $V = S^{kn}(\mathbb{C}^n)$ ($k \in \mathbb{Z}_{\ge 0}$) and their duals among the irreducible highest weight representations of $SU(n)$, and the resulting scalar differential operator turns out to be the Sutherland operator $H_{A_{n-1}}$ with coupling parameter $\mu_\alpha = k + 1$.

The above arguments cannot be extended to the simple Lie groups beyond $SU(n)$, since in general they do not admit non-trivial highest weight representations with multiplicity one for the zero weight.$^a$ However, taking any compact connected Lie group $Y$, there exist other nice actions of certain subgroups of $Y \times Y$ on $Y$ for which one can try to generalize the above arguments. Indeed [17], if $G$ is the fixed point set of an involution of $Y \times Y$, then every orbit of the natural action of $G$ on $Y$ can be intersected by a toral subgroup $A < Y$. Therefore the $G$-equivariant functions on $Y$ with values in a representation $V$ of $G$ give rise to $V^K$-valued functions on $A$, where $K$ is the isotropy group of the generic elements of $A$. Moreover, if $\dim(V^K) = 1$, then the application of the Laplace operator of $Y$ on $C^\infty(Y, V)^G$ may induce a scalar Sutherland operator. The group actions just alluded to are called Hermann actions. They received a lot of attention in differential geometry (see, e.g., [17, 18] and references therein), but their use for the construction of integrable systems still has not been explored systematically.

$^a$ The only exceptions [15, 16] are the defining representation of $SO(2n+1)$ and the 7-dimensional representation of $G_2$. In the former case, we have checked that the reduced Laplacian gives a decoupled system.

The goal of this paper is to explain that certain Hermann actions on $Y = U(N)$ permit derivations of the $BC_n$ Sutherland Hamiltonian from the Laplacian of $U(N)$. The derivations that we present are partly motivated by an earlier derivation found in the complex holomorphic setting in [19], and by our previous paper [20] where we discussed how the classical mechanical version of the trigonometric $BC_n$ model with three arbitrary coupling constants can be obtained by reducing the free particle moving on the group $U(N)$. Taking for $R$ the root system

$$BC_n = \{ \pm e_i \pm e_j,\ \pm e_k,\ \pm 2e_k \mid i, j, k \in \{1, \ldots, n\},\ i \ne j \}, \tag{1.2}$$

with orthonormal vectors {i }, and introducing new coupling parameters a, b, c by the deﬁnition 1 µi ±j := a + 1, µk := b − c, µ2k := c + , (1.3) 2 the Hamiltonian (1.1) reads 1 ∂2 + 2 j=1 ∂qj2 n

HBCn = −

1≤k

a(a + 1) a(a + 1) + sin2 (qk − ql ) sin2 (qk + ql )

1 1 2 2 n n 1 c −4 1 b −4 . + + 2 j=1 sin2 (qj ) 2 j=1 cos2 (qj )

(1.4)
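As an illustration only, the potential part of (1.4) can be evaluated numerically. The sketch below assumes the form of (1.4) with $b^2 - 1/4$ multiplying $1/\sin^2(q_j)$ and $c^2 - 1/4$ multiplying $1/\cos^2(q_j)$; the function name `bcn_potential` is hypothetical and not taken from the paper.

```python
import math

def bcn_potential(q, a, b, c):
    """Potential part of H_{BC_n} in (1.4) at q = (q_1, ..., q_n).

    Assumed form: pair terms a(a+1)/sin^2(q_k -+ q_l) plus one-particle
    terms (b^2 - 1/4)/(2 sin^2 q_j) and (c^2 - 1/4)/(2 cos^2 q_j).
    """
    n = len(q)
    pair = 0.0
    for k in range(n):
        for l in range(k + 1, n):
            pair += a * (a + 1) / math.sin(q[k] - q[l]) ** 2
            pair += a * (a + 1) / math.sin(q[k] + q[l]) ** 2
    single = 0.0
    for x in q:
        single += 0.5 * (b * b - 0.25) / math.sin(x) ** 2
        single += 0.5 * (c * c - 0.25) / math.cos(x) ** 2
    return pair + single
```

For $n = 1$ the pair sum is empty, and the one-particle terms vanish at $b = c = 1/2$, which gives a quick sanity check of the normalization.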

In fact, we shall obtain this Hamiltonian with arbitrary non-negative integers a, b and c as a reduction of the Laplace operator of U (N ). More precisely, we shall present 3 diﬀerent derivations, for which N = 2n, N = 2n + 1 or N = 2n + 2. There is considerable conceptual overlap between this paper and the abovementioned work [19] of Oblomkov, who related the eigenfunctions of the holomorphic BCn Sutherland operator to vector valued spherical functions on the group GL(N, C). If we replace GL(N, C) by U (N ), then Oblomkov’s construction leads to our construction in the most important N = 2n case. However, there are also diﬀerent cases considered in [19] and in this paper even after such replacement, and the language and the techniques used are rather diﬀerent. In fact, we shall obtain the results by applying a recently developed general framework of quantum Hamiltonian reduction under polar group actions [21]. We shall raise interesting open questions, too, and to facilitate their future investigation we describe our analysis in a self-contained manner. The organization of the article is as follows. In the next section, we recall the necessary notions and results concerning quantum Hamiltonian reductions of the Laplace operator on a Riemannian manifold that admits generalized polar coordinates adapted to the symmetry group in the sense of [22]. In Sec. 3, we specialize to Hermann actions on a compact Lie group Y , and describe those Hermann actions on Y = U (N ) that are expected to lead to BCn Sutherland models if the representation

L. Fehér & B. G. Pusztai

of the symmetry group $G < Y \times Y$ is chosen appropriately. The key part of the paper is Sec. 4, where we confirm the above expectation for three infinite families of cases. In Sec. 5, we summarize the results, further discuss the comparison with [19] and formulate open questions. There is also an appendix containing background material.

2. Quantum Hamiltonian Reduction Under Polar Actions

We here collect general definitions and results that will be used subsequently. Our main purpose is to explain that formula (2.14) characterizes the reductions of the Laplace operator of a Riemannian manifold under so-called polar actions [22] of compact symmetry groups. The exposition is restricted to the necessary minimum; for more details see [21] and references therein.

Let $Y$ be a smooth, connected, complete Riemannian manifold with metric $\eta$. Consider the Laplace operator $\Delta_Y$ corresponding to $\eta$. For a smooth function $F$, in local coordinates $\{y^\mu\}$ on $Y$ one has $\Delta_Y F = |\eta|^{-1/2} \partial_\mu (|\eta|^{1/2} \partial^\mu F)$ with $|\eta| := \det(\eta_{\mu,\nu})$. The restriction of $\Delta_Y$ onto the space of the complex-valued compactly supported smooth functions,

\[ \Delta_Y^0 := \Delta_Y|_{C_c^\infty(Y)} : C_c^\infty(Y) \to C_c^\infty(Y), \tag{2.1} \]

is an essentially self-adjoint linear operator of the Hilbert space $L^2(Y, d\mu_Y)$, where $\mu_Y$ denotes the measure generated by the Riemannian volume form, locally defined by $|\eta|^{1/2} \prod_\mu dy^\mu$. Suppose that a compact Lie group $G$ acts on $(Y, \eta)$ by isometries. The action is given by a smooth map

\[ \phi : G \times Y \to Y, \qquad (g, y) \mapsto \phi(g, y) = \phi_g(y) = g.y \tag{2.2} \]

such that $\phi_g^* \eta = \eta$ for every $g \in G$. The measure $\mu_Y$ inherits the $G$-invariance and therefore the Hilbert space $L^2(Y, d\mu_Y)$ naturally carries a continuous unitary representation of $G$. This in turn is unitarily equivalent to an orthogonal direct sum, $L^2(Y, d\mu_Y) \cong \oplus_\rho M_\rho \otimes V_{\bar\rho}$, where $(\rho, V_\rho)$ runs over a complete set of pairwise inequivalent irreducible unitary representations of $G$, $\bar\rho$ denotes the contragredient of the representation $\rho$, and $M_\rho$ is a “multiplicity space” on which $G$ acts trivially. Correspondingly, the self-adjoint scalar Laplace operator, $\bar\Delta_Y^0$, which by definition is the closure of $\Delta_Y^0$ (2.1), can be decomposed as $\bar\Delta_Y^0 \cong \oplus_\rho \hat\Delta_\rho \otimes \mathrm{id}_{V_{\bar\rho}}$, where $\hat\Delta_\rho$ is a self-adjoint operator on the Hilbert space $M_\rho$. The system $(M_\rho, \hat\Delta_\rho)$ is called the reduction of the system $(L^2(Y, d\mu_Y), \bar\Delta_Y^0)$ having the symmetry type $\bar\rho$.

In order to present a convenient model of $(M_\rho, \hat\Delta_\rho)$, consider now an irreducible unitary representation $(\rho, V)$ of $G$, where $V$ is a finite dimensional complex vector space with inner product $(\ ,\ )_V$. By simply acting componentwise, the differential operator $\Delta_Y^0$ extends onto the complex vector space of the $V$-valued compactly supported smooth functions, $C_c^\infty(Y, V)$. This gives the essentially self-adjoint operator

\[ \Delta_Y^0 : C_c^\infty(Y, V) \to C_c^\infty(Y, V) \tag{2.3} \]

of the Hilbert space $L^2(Y, V, d\mu_Y)$. Because of the $G$-symmetry of the metric $\eta$, the set

\[ C_c^\infty(Y, V)^G := \{F \mid F \in C_c^\infty(Y, V),\ F \circ \phi_g = \rho(g) \circ F\ (\forall g \in G)\} \tag{2.4} \]

of the $V$-valued, compactly supported $G$-equivariant smooth functions is an invariant linear subspace of $\Delta_Y^0$. Moreover, the restriction of $\Delta_Y^0$ (2.3) onto $C_c^\infty(Y, V)^G$,

\[ \Delta_\rho := \Delta_Y^0|_{C_c^\infty(Y, V)^G} : C_c^\infty(Y, V)^G \to C_c^\infty(Y, V)^G, \tag{2.5} \]

is a densely defined, symmetric, essentially self-adjoint linear operator on the Hilbert space $L^2(Y, V, d\mu_Y)^G$ of the square-integrable $G$-equivariant functions. It is not difficult to demonstrate the unitary equivalence

\[ (M_\rho, \hat\Delta_\rho) \cong (L^2(Y, V, d\mu_Y)^G, \bar\Delta_\rho) \quad \text{with} \quad V := V_\rho, \tag{2.6} \]

where $\bar\Delta_\rho$ denotes the closure of $\Delta_\rho$ in (2.5). It is convenient for many purposes to use the realization of the reduced quantum system furnished by $L^2(Y, V, d\mu_Y)^G$.

Particularly simple cases of the reduction arise if the reduced configuration space $Y_{red} := Y/G$ is a smooth manifold, although this happens very rarely. However, restricting to the principal orbit type, $\check Y \subset Y$, one always obtains a smooth fiber bundle $\pi : \check Y \to \check Y/G$. Note that $\check Y$ consists of the points of $Y$ having the smallest isotropy subgroups for the $G$-action [23]. The “big cell” of the reduced configuration space, given by $\check Y_{red} := \check Y/G$, is naturally endowed with a Riemannian metric, $\eta_{red}$, making $\pi$ a Riemannian submersion. From a quantum mechanical point of view, neglecting the non-principal orbits is harmless, in some sense, since $\check Y$ is not only open and dense in $Y$, but it is also of full measure.

In many applications polar group actions are important, whose characteristic property is that the $G$-orbits possess representatives that form sections in the sense of Palais and Terng [22]. By definition, a section $\Sigma \subset Y$ is a connected, closed, regularly embedded smooth submanifold of $Y$ that meets every $G$-orbit and it does so orthogonally at every intersection point of $\Sigma$ with an orbit. If a section exists, then any two sections are $G$-related. The induced metric on $\Sigma$ is denoted by $\eta_\Sigma$, and for the measure generated by $\eta_\Sigma$, we introduce the notation $\mu_\Sigma$. For a section $\Sigma$, denote by $\check\Sigma$ a connected component of the manifold $\hat\Sigma := \check Y \cap \Sigma$. The isotropy subgroups of all elements of $\hat\Sigma$ are the same and for a fixed section we define $K := G_y$ for $y \in \hat\Sigma$. The group $K$ is called the centralizer of the section $\Sigma$. By restricting $\pi : \check Y \to \check Y/G$ onto $\check\Sigma$, $(\check Y_{red}, \eta_{red})$ becomes identified with $(\check\Sigma, \eta_{\check\Sigma})$, where $\eta_{\check\Sigma}$ is the induced metric on $\check\Sigma$. We let $\Delta_{\check\Sigma}$ stand for the Laplace operator of the Riemannian manifold $(\check\Sigma, \eta_{\check\Sigma})$. The $G$-equivariant diffeomorphism

\[ \check\Sigma \times (G/K) \ni (Q, gK) \mapsto \phi_g(Q) \in \check Y \tag{2.7} \]

provides a trivialization of the fiber bundle $\pi : \check Y \to \check Y/G$. Generalized polar coordinates on $\check Y$ consist of “radial” coordinates on $\check\Sigma$ and “angular” coordinates on $G/K$.

To concretize the reduced system (2.6) for polar actions, we introduce the space

\[ \mathrm{Fun}(\check\Sigma, V^K) := \{f \mid f \in C_c^\infty(\check\Sigma, V^K),\ f = F|_{\check\Sigma} \text{ for some } F \in C_c^\infty(Y, V)^G\}, \tag{2.8} \]

where V K is spanned by the K-invariant vectors in the representation space V . We assume that the representation (ρ, V ) of the symmetry group G is admissible in the sense that dim(V K ) > 0.

(2.9)

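The restriction of functions appearing in the definition (2.8) gives rise to a linear isomorphism $\mathrm{Fun}(\check\Sigma, V^K) \cong C_c^\infty(Y, V)^G \to L^2(Y, V, d\mu_Y)^G$. This induces a scalar product on $\mathrm{Fun}(\check\Sigma, V^K)$ making it a pre-Hilbert space whose closure satisfies the Hilbert space isomorphism $\overline{\mathrm{Fun}}(\check\Sigma, V^K) \cong L^2(Y, V, d\mu_Y)^G$. Next, consider the Lie algebra $\mathcal{G} := \mathrm{Lie}(G)$ and its subalgebra $\mathcal{K} := \mathrm{Lie}(K)$. Fix a $G$-invariant positive definite scalar product, $B_{\mathcal{G}}$, on $\mathcal{G}$ and thereby determine the orthogonal complement $\mathcal{K}^\perp$ of $\mathcal{K}$ in $\mathcal{G}$. For any $\xi \in \mathcal{G}$ denote by $\xi^\sharp$ the associated vector field on $Y$. Then at each point $Q \in \check\Sigma$ the linear map $\mathcal{K}^\perp \ni \xi \mapsto \xi^\sharp_Q \in T_Q Y$ is injective, and the inertia operator $J(Q) \in \mathrm{End}(\mathcal{K}^\perp)$ can be defined by the requirement

\[ \eta_Q(\xi^\sharp_Q, \zeta^\sharp_Q) = B_{\mathcal{G}}(\xi, J(Q)\zeta), \qquad \forall \xi, \zeta \in \mathcal{K}^\perp. \tag{2.10} \]

Note that $J(Q)$ is symmetric and positive definite with respect to $B_{\mathcal{G}}|_{\mathcal{K}^\perp \times \mathcal{K}^\perp}$. By choosing dual bases $\{T_\alpha\}, \{T^\alpha\} \subset \mathcal{K}^\perp$, that is, $B_{\mathcal{G}}(T^\alpha, T_\beta) = \delta^\alpha_\beta$, we let

\[ b_{\alpha,\beta}(Q) := B_{\mathcal{G}}(T_\alpha, J(Q) T_\beta), \qquad b^{\alpha,\beta}(Q) := B_{\mathcal{G}}(T^\alpha, J(Q)^{-1} T^\beta). \tag{2.11} \]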

The $G$-orbit $G.Q \subset Y$ through any point $Q \in \check\Sigma$ is an embedded submanifold of $Y$ and by its embedding it inherits a Riemannian metric, $\eta_{G.Q}$. Thus we can define the smooth density function $\delta : \check\Sigma \to (0, \infty)$ by

\[ \delta(Q) := \text{volume of the Riemannian manifold } (G.Q, \eta_{G.Q}), \tag{2.12} \]

where the volume is understood with respect to the measure, $\mu_{G.Q}$, belonging to $\eta_{G.Q}$. It is easy to see that

\[ \delta(Q) = C |\det(J(Q))|^{1/2} = C |\det(b_{\alpha,\beta}(Q))|^{1/2} \tag{2.13} \]

with some constant $C > 0$. In the following proposition, quoted from [21], $\rho'$ denotes the representation of $\mathcal{G}$ corresponding to the representation $\rho$ of $G$.

Proposition 2.1. Let us consider a polar $G$-action using the above notations. Then the reduced system (2.6) associated with an admissible irreducible unitary representation $(\rho, V)$ of $G$ can be identified with the pair $(L^2(\check\Sigma, V^K, d\mu_{\check\Sigma}), \Delta_{red})$, where

\[ \Delta_{red} = \Delta_{\check\Sigma} - \delta^{-1/2} \Delta_{\check\Sigma}(\delta^{1/2}) + b^{\alpha,\beta} \rho'(T_\alpha) \rho'(T_\beta) \tag{2.14} \]

with domain $D(\Delta_{red}) = \delta^{1/2} \mathrm{Fun}(\check\Sigma, V^K)$ is a densely defined, symmetric, essentially self-adjoint operator on the Hilbert space $L^2(\check\Sigma, V^K, d\mu_{\check\Sigma})$.

The above statement results by calculating the action of $\Delta_Y$ on the $V$-valued equivariant functions in (2.8) with the aid of polar coordinates, using also the Hilbert space identifications

\[ \overline{\mathrm{Fun}}(\check\Sigma, V^K) \cong L^2(Y, V, d\mu_Y)^G \cong L^2(\check\Sigma, V^K, \delta d\mu_{\check\Sigma}). \tag{2.15} \]

The last equality follows by integrating out the “angular” coordinates in the scalar product of equivariant functions. One also uses the unitary map $U : L^2(\check\Sigma, V^K, \delta d\mu_{\check\Sigma}) \to L^2(\check\Sigma, V^K, d\mu_{\check\Sigma})$ defined by $U : f \mapsto \delta^{1/2} f$.

The first term in (2.14) corresponds to the kinetic energy of a particle moving on $(\check Y_{red}, \eta_{red}) \cong (\check\Sigma, \eta_{\check\Sigma})$ and the rest represents potential energy if $\dim(V^K) = 1$. The second term of (2.14) is always potential energy, which is constant in some cases. We refer to this term as the “measure factor”. It represents a significant difference between the outcomes of the corresponding classical and quantum Hamiltonian reductions [21]. If $\dim(V^K) > 1$, then one says that the reduced system contains internal “spin” degrees of freedom and then the third term of (2.14) encodes “spin-dependent potential energy”.

3. Examples of Polar Actions on Compact Lie Groups

From now we take the “unreduced configuration space” $Y$ to be a compact, connected, real Lie group endowed with a bi-invariant metric $\eta$, induced by a positive definite, $Y$-invariant bilinear form $B_{\mathcal{Y}}$ of the Lie algebra $\mathcal{Y} := \mathrm{Lie}(Y)$. For the reduction group $G$ one may choose any symmetric subgroup of the direct product group $Y \times Y$, that is,

\[ (Y \times Y)^\sigma_0 \leq G \leq (Y \times Y)^\sigma, \tag{3.1} \]

where $(Y \times Y)^\sigma$ stands for the fixed-point set of some involutive automorphism $\sigma \in \mathrm{Inv}(Y \times Y)$, and $(Y \times Y)^\sigma_0$ is the connected component of the identity in $(Y \times Y)^\sigma$. The group $G$ acts on $Y$ by the map

\[ \phi : G \times Y \to Y, \qquad ((g_L, g_R), y) \mapsto \phi_{(g_L, g_R)}(y) := g_L y g_R^{-1}. \tag{3.2} \]

The group actions of this form are often called Hermann actions. Under mild conditions, which hold in the examples below, these are polar actions in the sense of [22]. In fact, the sections are provided by certain toral subgroups$^{b}$ $A < Y$. Thus the sections are flat in the induced metric, which is the characteristic property of the so-called hyperpolar actions [17]. In the simplest special case $\sigma(y_1, y_2) = (y_2, y_1)$, $G = Y_{diag} = \{(y, y) \mid y \in Y\} \cong Y$ and (3.2) is just the adjoint action of $Y$ on itself, for which the sections are the maximal tori of $Y$.

$^{b}$ A toral subgroup $A < Y$ is a connected and closed Abelian subgroup. It is the closedness of the relevant subgroups that requires some conditions. If $Y$ is semi-simple, then a sufficient condition is to take $B_{\mathcal{Y}}$ as a multiple of the Killing form [17].

3.1. Hermann actions associated with pairs of involutions

The reductions that we study later arise from the following construction. Let $\sigma_L, \sigma_R \in \mathrm{Inv}(Y)$ be two involutions of $Y$, and let $Y_L, Y_R \leq Y$ be corresponding symmetric subgroups of $Y$,

\[ (Y^{\sigma_I})_0 \leq Y_I \leq Y^{\sigma_I} \qquad (I \in \{L, R\}). \tag{3.3} \]

We suppose that the scalar product $B_{\mathcal{Y}}$ is invariant under both $\sigma_L$ and $\sigma_R$ and introduce $\sigma \in \mathrm{Inv}(Y \times Y)$ by $\sigma(y_1, y_2) := (\sigma_L(y_1), \sigma_R(y_2))$. Then

\[ G := Y_L \times Y_R \tag{3.4} \]

is a symmetric subgroup of $Y \times Y$ and Eq. (3.2) defines a hyperpolar Hermann action of $G$ on $Y$. The classification of the inequivalent pairs of involutions $(\sigma_L, \sigma_R)$ has been worked out by Matsuki [24]. We assume for simplicity that the two involutions $\sigma_L$ and $\sigma_R$ commute with each other, which holds for the large majority of cases in the classification. Subsequently, the induced Lie algebra involutions are denoted by the same letters $\sigma_L$ and $\sigma_R$. Now, with the aid of the subspaces

\[ \mathcal{Y}^{\sigma_I, \pm} := \ker(\sigma_I \mp \mathrm{Id}_{\mathcal{Y}}) \subset \mathcal{Y} \quad (I \in \{L, R\}) \qquad \text{and} \qquad \mathcal{Y}^{\pm\pm} := \mathcal{Y}^{\sigma_L, \pm} \cap \mathcal{Y}^{\sigma_R, \pm} \subset \mathcal{Y} \tag{3.5} \]

we obtain the orthogonal decomposition

\[ \mathcal{Y} = \mathcal{Y}^{++} \oplus \mathcal{Y}^{+-} \oplus \mathcal{Y}^{-+} \oplus \mathcal{Y}^{--}, \tag{3.6} \]

which gives also a $\mathbb{Z}_2 \times \mathbb{Z}_2$-gradation of $\mathcal{Y}$. The Lie algebra of the symmetric subgroup $Y_I \leq Y$ is $\mathrm{Lie}(Y_I) \cong \mathcal{Y}^{\sigma_I, +}$ ($I \in \{L, R\}$). Then, we choose a maximal Abelian subalgebra $\mathcal{A}$ in $\mathcal{Y}^{--}$ and also define $A := \exp(\mathcal{A})$, which is a toral subgroup of $Y$. According to an important theorem proved in [25, 26], the Lie group $Y$ admits the generalized Cartan decomposition

\[ Y = Y_L A Y_R. \tag{3.7} \]

This means that every element of $Y$ can be written as a product of the elements of the subgroups in (3.7). Recalling the definition of the Hermann action (3.2) for $G = Y_L \times Y_R$, Eq. (3.7) says that the subgroup $A$ intersects every $G$-orbit. Moreover, it does so orthogonally at every intersection point, and thus $A$ provides a section for the $G$-action in the sense of [22]. Below $\check A$ denotes a connected component of the regular part of the section $A$. Let us introduce the subgroups

\[ Y_{LR} := Y_L \cap Y_R \leq Y \qquad \text{and} \qquad M := \{g \mid g \in Y_{LR},\ g a g^{-1} = a\ (\forall a \in A)\} \leq Y_{LR}. \tag{3.8} \]

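Their Lie algebras are

\[ \mathrm{Lie}(Y_{LR}) \cong \mathrm{Lie}(Y_L) \cap \mathrm{Lie}(Y_R) \cong \mathcal{Y}^{\sigma_L, +} \cap \mathcal{Y}^{\sigma_R, +} = \mathcal{Y}^{++}, \tag{3.9} \]

\[ \mathcal{M} := \mathrm{Lie}(M) = \{X \mid X \in \mathcal{Y}^{++},\ \mathrm{ad}_X(q) = 0\ (\forall q \in \mathcal{A})\}, \tag{3.10} \]

where $\mathrm{ad}_X$ is defined by the Lie bracket on $\mathcal{Y}$. It can be shown that the centralizer of the section $A = \exp(\mathcal{A})$ (the isotropy subgroup of the elements of $\check A$) is now furnished by

\[ K = M_{diag} = \{(g, g) \mid g \in M\} \leq G. \tag{3.11} \]

To specialize the inertia operator $J$ defined in (2.10), we introduce a $G$-invariant scalar product on the Lie algebra

\[ \mathcal{G} = \mathrm{Lie}(G) = \mathrm{Lie}(Y_L \times Y_R) \cong \mathrm{Lie}(Y_L) \oplus \mathrm{Lie}(Y_R) \cong \mathcal{Y}^{\sigma_L, +} \oplus \mathcal{Y}^{\sigma_R, +} \tag{3.12} \]

by the formula

\[ B_{\mathcal{G}}((\xi_L, \xi_R), (\zeta_L, \zeta_R)) := B_{\mathcal{Y}}(\xi_L, \zeta_L) + B_{\mathcal{Y}}(\xi_R, \zeta_R), \qquad \forall (\xi_L, \xi_R), (\zeta_L, \zeta_R) \in \mathcal{G}. \tag{3.13} \]

This induces the decomposition $\mathcal{G} = \mathcal{K} \oplus \mathcal{K}^\perp$, where $\mathcal{K} = \mathrm{Lie}(K)$. By using the decomposition $\mathcal{Y} = \mathcal{M} \oplus \mathcal{M}^\perp$ defined by $B_{\mathcal{Y}}$, we also introduce the subspaces

\[ \mathcal{K}_a^\perp := \{(X, -X) \mid X \in \mathcal{M}\} \subset \mathcal{K}^\perp, \tag{3.14} \]

\[ \mathcal{K}_e^\perp := \{(\xi_L, \xi_R) \mid \xi_L, \xi_R \in \mathcal{M}^\perp \cap \mathcal{Y}^{++}\} \subset \mathcal{K}^\perp, \tag{3.15} \]

\[ \mathcal{K}_o^\perp := \{(\zeta_L, \zeta_R) \mid \zeta_L \in \mathcal{Y}^{+-},\ \zeta_R \in \mathcal{Y}^{-+}\} \subset \mathcal{K}^\perp, \tag{3.16} \]

which yield the orthogonal decomposition

\[ \mathcal{K}^\perp = \mathcal{K}_a^\perp \oplus \mathcal{K}_e^\perp \oplus \mathcal{K}_o^\perp. \tag{3.17} \]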

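Now consider the vector field $\xi^\sharp = (\xi_L, \xi_R)^\sharp$ on $Y$ associated with $\xi = (\xi_L, \xi_R) \in \mathcal{G}$ by means of the $G$-action. At an arbitrary point $e^q \in A$ ($q \in \mathcal{A}$) of the section $A$ we find

\[ \xi^\sharp_{e^q} = (\xi_L, \xi_R)^\sharp_{e^q} = (dL_{e^q})_e \big( \xi_R - e^{-\mathrm{ad}_q}(\xi_L) \big) \in T_{e^q} Y, \tag{3.18} \]

where $L_y$ denotes the left-translation on $Y$ by group element $y \in Y$. Simply by plugging (3.18) into the definition (2.10), routine algebraic manipulations lead to the following result:

Lemma 3.1. Equation (3.17) is a decomposition of $\mathcal{K}^\perp$ into invariant subspaces of the inertia operator $J(e^q)$ at any point $e^q \in \check A$. One has $J(e^q)|_{\mathcal{K}_a^\perp} = 2\, \mathrm{Id}_{\mathcal{K}_a^\perp}$ and, writing $\xi = (\xi_L, \xi_R) \in \mathcal{G}$ as a 2-component column vector with components $\xi_L$ and $\xi_R$, the action of $J(e^q)$ on $\mathcal{K}_e^\perp$ and $\mathcal{K}_o^\perp$ is encoded by the matrices

\[ J(e^q)|_{\mathcal{K}_e^\perp} = \begin{pmatrix} 1 & -\cosh(\mathrm{ad}_q) \\ -\cosh(\mathrm{ad}_q) & 1 \end{pmatrix}\bigg|_{\mathcal{K}_e^\perp}, \qquad J(e^q)|_{\mathcal{K}_o^\perp} = \begin{pmatrix} 1 & -\sinh(\mathrm{ad}_q) \\ \sinh(\mathrm{ad}_q) & 1 \end{pmatrix}\bigg|_{\mathcal{K}_o^\perp}. \tag{3.19} \]

For the inverse of $J(e^q)$ one has $J(e^q)^{-1}|_{\mathcal{K}_a^\perp} = \frac{1}{2} \mathrm{Id}_{\mathcal{K}_a^\perp}$ together with

\[ J(e^q)^{-1}|_{\mathcal{K}_e^\perp} = -\begin{pmatrix} \sinh^{-2}(\mathrm{ad}_q) & \cosh(\mathrm{ad}_q) \sinh^{-2}(\mathrm{ad}_q) \\ \cosh(\mathrm{ad}_q) \sinh^{-2}(\mathrm{ad}_q) & \sinh^{-2}(\mathrm{ad}_q) \end{pmatrix}\bigg|_{\mathcal{K}_e^\perp}, \tag{3.20} \]

\[ J(e^q)^{-1}|_{\mathcal{K}_o^\perp} = \begin{pmatrix} \cosh^{-2}(\mathrm{ad}_q) & \sinh(\mathrm{ad}_q) \cosh^{-2}(\mathrm{ad}_q) \\ -\sinh(\mathrm{ad}_q) \cosh^{-2}(\mathrm{ad}_q) & \cosh^{-2}(\mathrm{ad}_q) \end{pmatrix}\bigg|_{\mathcal{K}_o^\perp}. \tag{3.21} \]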
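The inverse formulas of Lemma 3.1 can be spot-checked by replacing $\mathrm{ad}_q$ with one of its eigenvalues. The sketch below is an illustration under that assumption: $x$ stands for a complex number (purely imaginary in the compact case), and the matrices are read as $[[1, -\cosh], [-\cosh, 1]]$ and $[[1, -\sinh], [\sinh, 1]]$ with the inverses quoted in the lemma. The function names are hypothetical.

```python
import cmath

def j_even(x):
    """J on K_e^perp with ad_q replaced by the scalar x."""
    c = cmath.cosh(x)
    return [[1, -c], [-c, 1]]

def j_even_inv(x):
    """Candidate inverse on K_e^perp: -sinh^{-2} * [[1, cosh], [cosh, 1]]."""
    c, s = cmath.cosh(x), cmath.sinh(x)
    return [[-1 / s ** 2, -c / s ** 2], [-c / s ** 2, -1 / s ** 2]]

def j_odd(x):
    """J on K_o^perp with ad_q replaced by the scalar x."""
    s = cmath.sinh(x)
    return [[1, -s], [s, 1]]

def j_odd_inv(x):
    """Candidate inverse on K_o^perp: cosh^{-2} * [[1, sinh], [-sinh, 1]]."""
    c, s = cmath.cosh(x), cmath.sinh(x)
    return [[1 / c ** 2, s / c ** 2], [-s / c ** 2, 1 / c ** 2]]

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def is_identity(m, tol=1e-12):
    """True if m is the 2x2 identity up to the tolerance."""
    return all(abs(m[i][j] - (1 if i == j else 0)) < tol
               for i in range(2) for j in range(2))
```

With $x = i\theta$ one has $\cosh(x) = \cos\theta$, so the eigenvalues $1 \mp \cos\theta$ of the first matrix are $2\sin^2(\theta/2)$ and $2\cos^2(\theta/2)$.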

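3.2. A family of two involutions on U(N)

For our later purpose, we now focus on the unitary group

\[ Y := U(N) = \{y \mid y \in GL(N, \mathbb{C}),\ y^\dagger y = 1_N\}. \tag{3.22} \]

We equip the Lie algebra

\[ \mathcal{Y} := u(N) = \{X \mid X \in gl(N, \mathbb{C}),\ X^\dagger + X = 0\} \tag{3.23} \]

with the scalar product

\[ B_{\mathcal{Y}}(X, Z) := -\mathrm{tr}(XZ), \qquad \forall X, Z \in u(N). \tag{3.24} \]

To any pair $(m, n) \in \mathbb{Z}^2_{\geq 0}$ with $m \geq n$ and $m + n = N$ we associate the block-matrix

\[ I_{m,n} := \mathrm{diag}(1_m, -1_n) = \begin{pmatrix} 1_m & 0 \\ 0 & -1_n \end{pmatrix} \in U(N), \tag{3.25} \]

and the involutive inner automorphism

\[ \theta_{m,n} : U(N) \to U(N), \qquad y \mapsto \theta_{m,n}(y) := I_{m,n}\, y\, I_{m,n}^{-1}. \tag{3.26} \]

The fixed-point set of $\theta_{m,n}$ is

\[ U(N)^{\theta_{m,n}} = \left\{ \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} \,\middle|\, a \in U(m),\ b \in U(n) \right\} \cong U(m) \times U(n). \tag{3.27} \]

Note that $U(N)^{\theta_{m,n}}$ is connected. The induced Lie algebra involution operates as

\[ \theta_{m,n}(X) = I_{m,n}\, X\, I_{m,n}^{-1}, \qquad \forall X \in u(N). \tag{3.28} \]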

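Using the block-matrix realization

\[ u(N) = \left\{ \begin{pmatrix} A & C \\ -C^\dagger & B \end{pmatrix} \,\middle|\, A \in u(m),\ B \in u(n),\ C \in \mathbb{C}^{m \times n} \right\}, \tag{3.29} \]

the eigenspaces $u(N)^{\theta_{m,n}, \pm}$ are

\[ u(N)^{\theta_{m,n}, +} = \left\{ \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} \,\middle|\, A \in u(m),\ B \in u(n) \right\}, \qquad u(N)^{\theta_{m,n}, -} = \left\{ \begin{pmatrix} 0 & C \\ -C^\dagger & 0 \end{pmatrix} \,\middle|\, C \in \mathbb{C}^{m \times n} \right\}. \tag{3.30} \]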

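Now we take two pairs $(m, n), (r, s) \in \mathbb{Z}^2_{\geq 0}$ with the additional requirements $m \geq r \geq s \geq n$ and $m + n = r + s = N$, and consider the commuting involutions

\[ \sigma_L := \theta_{r,s} \qquad \text{and} \qquad \sigma_R := \theta_{m,n}. \tag{3.31} \]

The corresponding symmetric subgroups $Y_L, Y_R \leq Y$ are

\[ U(N)_L := U(N)^{\sigma_L} \cong U(r) \times U(s) \qquad \text{and} \qquad U(N)_R := U(N)^{\sigma_R} \cong U(m) \times U(n). \tag{3.32} \]

The partition $N = n + (r - n) + (s - n) + n$ leads to a $4 \times 4$ block-matrix decomposition of any $N \times N$ matrix in general. (Of course, if $r = n$ or $s = n$, then the block-matrix decomposition contains fewer blocks.) That is, any matrix $X \in \mathbb{C}^{N \times N}$ can be written as

\[ X = \begin{pmatrix} X_{1,1} & X_{1,2} & X_{1,3} & X_{1,4} \\ X_{2,1} & X_{2,2} & X_{2,3} & X_{2,4} \\ X_{3,1} & X_{3,2} & X_{3,3} & X_{3,4} \\ X_{4,1} & X_{4,2} & X_{4,3} & X_{4,4} \end{pmatrix}, \tag{3.33} \]

where the entries $X_{i,j}$ are themselves matrices, $X_{1,1} \in \mathbb{C}^{n \times n}$, $X_{1,2} \in \mathbb{C}^{n \times (r-n)}$, $X_{1,3} \in \mathbb{C}^{n \times (s-n)}$, $X_{1,4} \in \mathbb{C}^{n \times n}$, etc. Then for the Lie group $Y_{LR} = Y_L \cap Y_R$ we have

\[ U(N)_{LR} = \left\{ \begin{pmatrix} a_{1,1} & a_{1,2} & 0 & 0 \\ a_{2,1} & a_{2,2} & 0 & 0 \\ 0 & 0 & a_{3,3} & 0 \\ 0 & 0 & 0 & a_{4,4} \end{pmatrix} \,\middle|\, \begin{pmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{pmatrix} \in U(r),\ a_{3,3} \in U(s - n),\ a_{4,4} \in U(n) \right\}. \tag{3.34} \]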

Therefore $U(N)_{LR} \cong U(r) \times U(s - n) \times U(n)$ and the Lie algebra $\mathrm{Lie}(U(N)_{LR}) = u(N)^{++}$ is isomorphic to $u(r) \oplus u(s - n) \oplus u(n)$. In our case the subspace $\mathcal{Y}^{--}$ (3.5) reads

\[ u(N)^{--} = \left\{ \begin{pmatrix} 0 & 0 & 0 & A_{1,4} \\ 0 & 0 & 0 & A_{2,4} \\ 0 & 0 & 0 & 0 \\ -A_{1,4}^\dagger & -A_{2,4}^\dagger & 0 & 0 \end{pmatrix} \,\middle|\, A_{1,4} \in \mathbb{C}^{n \times n},\ A_{2,4} \in \mathbb{C}^{(r-n) \times n} \right\}. \tag{3.35} \]

To proceed, we define the diagonal $n \times n$ matrix

\[ \mathbf{q} := \mathrm{diag}(q_1, q_2, \ldots, q_n) \in \mathbb{R}^{n \times n} \tag{3.36} \]

for any real $n$-tuple $(q_1, q_2, \ldots, q_n) \in \mathbb{R}^n$, and we also set

\[ q := \begin{pmatrix} 0 & 0 & 0 & \mathbf{q} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ -\mathbf{q} & 0 & 0 & 0 \end{pmatrix} \in u(N)^{--}. \tag{3.37} \]

Then the set of matrices

\[ \mathcal{A} := \{q \mid (q_1, q_2, \ldots, q_n) \in \mathbb{R}^n\} \subset u(N)^{--} \tag{3.38} \]

is a maximal Abelian subalgebra in $u(N)^{--}$. A basis of the dual space $\mathcal{A}^*$ is given by the functionals

\[ \epsilon_k : \mathcal{A} \to \mathbb{R}, \qquad q \mapsto \epsilon_k(q) := q_k. \tag{3.39} \]

The corresponding subgroup $A = \exp(\mathcal{A})$ has the form

\[ A = \left\{ e^q = \begin{pmatrix} \cos(\mathbf{q}) & 0 & 0 & \sin(\mathbf{q}) \\ 0 & 1_{r-n} & 0 & 0 \\ 0 & 0 & 1_{s-n} & 0 \\ -\sin(\mathbf{q}) & 0 & 0 & \cos(\mathbf{q}) \end{pmatrix} \,\middle|\, (q_1, q_2, \ldots, q_n) \in \mathbb{R}^n \right\}. \tag{3.40} \]

If $\mathbb{T}(n)$ denotes the diagonally embedded standard torus in $U(n)$, then it is straightforward to show that the subgroup $M$ (3.8) is now furnished by

\[ M = \left\{ \begin{pmatrix} a & 0 & 0 & 0 \\ 0 & b & 0 & 0 \\ 0 & 0 & c & 0 \\ 0 & 0 & 0 & a \end{pmatrix} \,\middle|\, a \in \mathbb{T}(n),\ b \in U(r - n),\ c \in U(s - n) \right\}. \tag{3.41} \]

Note that $M$ is connected, and therefore so is the centralizer $K = M_{diag}$ of the section $A$. Moreover, we have the identifications

\[ K \cong M_{diag} \cong M \cong \mathbb{T}(n) \times U(r - n) \times U(s - n) \cong U(1)^{\times n} \times U(r - n) \times U(s - n). \tag{3.42} \]
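The closed form (3.40) of $e^q$ can be compared with a direct matrix exponential. The sketch below treats only the special case $r = s = n$ (so the two identity blocks are absent and $N = 2n$), and uses a truncated power series; the helper names and the number of series terms are arbitrary choices made for this illustration.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=40):
    """Matrix exponential by a truncated power series (adequate for small norms)."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = mat_mul(term, a)
        term = [[x / k for x in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def section_element(q_values):
    """exp of the matrix (3.37) in the case r = s = n, where N = 2n."""
    n = len(q_values)
    a = [[0.0] * (2 * n) for _ in range(2 * n)]
    for j, qj in enumerate(q_values):
        a[j][n + j] = qj       # upper-right block: diag(q_1, ..., q_n)
        a[n + j][j] = -qj      # lower-left block: -diag(q_1, ..., q_n)
    return mat_exp(a)
```

For example, `section_element([0.3, 1.1])` should reproduce the blocks $\cos(\mathbf{q})$, $\sin(\mathbf{q})$ and $-\sin(\mathbf{q})$ of (3.40) entry by entry.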

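It is shown in [26, p. 63], that the closed, connected subset

\[ A^+ := \left\{ e^q \,\middle|\, 0 \leq q_1 \leq q_2 \leq \cdots \leq q_n \leq \frac{\pi}{2} \right\} \subset A \tag{3.43} \]

intersects each orbit of $G = U(N)^{\sigma_L} \times U(N)^{\sigma_R}$ under the action (3.2) precisely once. Note also that matrix exponentiation provides a bijection from

\[ \mathcal{A}^+ := \left\{ q \,\middle|\, 0 \leq q_1 \leq q_2 \leq \cdots \leq q_n \leq \frac{\pi}{2} \right\} \subset \mathcal{A} \tag{3.44} \]

onto $A^+$. By inspecting the isotropy subgroup $G_{e^q} \leq G$ for $e^q \in A^+$, we find that $G_{e^q} = K$ if and only if $q \in \check{\mathcal{A}}^+$, where $\check{\mathcal{A}}^+$ denotes the connected open subset

\[ \check{\mathcal{A}}^+ := \left\{ q \,\middle|\, 0 < q_1 < q_2 < \cdots < q_n < \frac{\pi}{2} \right\} \subset \mathcal{A}^+. \tag{3.45} \]

We can conclude from the above that the subset $\check A := \exp(\check{\mathcal{A}}^+)$ provides a connected component for the regular part of the section $A$. Regarding the components $q_k$ in (3.45) as global coordinates on $\check A$, for the Laplace operator $\Delta_{\check A}$ defined by the induced metric we obtain

\[ \Delta_{\check A} = \frac{1}{2} \sum_{k=1}^{n} \frac{\partial^2}{\partial q_k^2}. \tag{3.46} \]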

3.3. Diagonalization of the inertia operator

We continue the study of the examples (3.31) by presenting a basis of $\mathcal{K}^\perp$ that diagonalizes $J(e^q)$ (3.19) for any $q \in \check{\mathcal{A}}^+$ in (3.45). We then use this basis to compute the density $\delta^{1/2}$ that enters the second term of the reduced Laplacian (2.14). Note that $\delta^{1/2}$ could be found also by the specialization of general formulae available for two commuting involutions [25, 2], but we need to fix a basis for the evaluation of the third term of (2.14), which will be performed later.

We start by defining an orthonormal basis (ONB) in the space $\mathcal{M}^\perp \cap u(N)^{++}$, which (due to (3.34) and (3.41)) has the form

\[ \mathcal{M}^\perp \cap u(N)^{++} = \left\{ \begin{pmatrix} X_{1,1} & X_{1,2} & 0 & 0 \\ -X_{1,2}^\dagger & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & X_{4,4} \end{pmatrix} \,\middle|\, X_{1,1}, X_{4,4} \in u(n),\ (X_{1,1} + X_{4,4})_{diag} = 0,\ X_{1,2} \in \mathbb{C}^{n \times (r-n)} \right\}. \tag{3.47} \]

If $r = n$, then there are no off-diagonal blocks, and in general $\dim(\mathcal{M}^\perp \cap u(N)^{++}) = n(2r - 1)$. For all $1 \leq j \leq n$ we let

\[ E^i_{2\epsilon_j} := \frac{i}{\sqrt{2}} \begin{pmatrix} E_{jj} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -E_{jj} \end{pmatrix}, \tag{3.48} \]

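and for all $1 \leq k < l \leq n$ we define the block-diagonal matrices (with respect to the decomposition (3.33))

\[ \begin{aligned} E^r_{\epsilon_k + \epsilon_l} &:= \tfrac{1}{2}\, \mathrm{diag}(E_{kl} - E_{lk},\ 0,\ 0,\ E_{lk} - E_{kl}), & E^i_{\epsilon_k + \epsilon_l} &:= \tfrac{i}{2}\, \mathrm{diag}(E_{kl} + E_{lk},\ 0,\ 0,\ -E_{kl} - E_{lk}), \\ E^r_{\epsilon_k - \epsilon_l} &:= \tfrac{1}{2}\, \mathrm{diag}(E_{kl} - E_{lk},\ 0,\ 0,\ E_{kl} - E_{lk}), & E^i_{\epsilon_k - \epsilon_l} &:= \tfrac{i}{2}\, \mathrm{diag}(E_{kl} + E_{lk},\ 0,\ 0,\ E_{kl} + E_{lk}). \end{aligned} \tag{3.49} \]

For all $1 \leq j \leq n$ and $1 \leq d \leq r - n$ we set

\[ E^{r,d}_{\epsilon_j} := \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & E_{jd} & 0 & 0 \\ -E_{dj} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad E^{i,d}_{\epsilon_j} := \frac{i}{\sqrt{2}} \begin{pmatrix} 0 & E_{jd} & 0 & 0 \\ E_{dj} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \tag{3.50} \]

The superscripts $i$ and $r$ refer to purely imaginary and to real matrices, respectively, and the elementary matrices $E_{ab}$ are always understood to be of the correct size as dictated by (3.33). The set of matrices

\[ \{E^D_\alpha\}_{\alpha, D} := \{E^r_{\epsilon_k \pm \epsilon_l},\ E^i_{\epsilon_k \pm \epsilon_l}\}_{1 \leq k < l \leq n} \cup \{E^i_{2\epsilon_j}\}_{1 \leq j \leq n} \cup \{E^{r,d}_{\epsilon_j},\ E^{i,d}_{\epsilon_j}\}_{1 \leq j \leq n,\ 1 \leq d \leq r-n} \tag{3.51} \]

forms an ONB in $\mathcal{M}^\perp \cap u(N)^{++}$. Here $D$ is an “index of degeneration” and $\alpha$ runs over the positive roots $R_+$ for the root system $C_n$ or $BC_n$. More precisely,

\[ R_+ = \begin{cases} R_+(C_n) & \text{if } r = n, \\ R_+(BC_n) & \text{if } r > n. \end{cases} \tag{3.52} \]

One can easily verify the relations

\[ (\mathrm{ad}_q)^2 E^D_\alpha = -\alpha(q)^2 E^D_\alpha. \tag{3.53} \]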
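The dimension $n(2r - 1)$ quoted before (3.48) and the size of the basis (3.51) can be cross-checked by counting real parameters block by block. The sketch below is only a bookkeeping aid: the dimensions of $u(N)^{++}$, $u(N)^{--}$ and $M$ are read off from (3.34), (3.35) and (3.41), while the two mixed pieces are assumed to occupy the remaining off-diagonal blocks, of shapes $(s-n) \times n$ and $r \times (s-n)$. Function names are hypothetical.

```python
def graded_dimensions(n, r, s):
    """Real dimensions of the four pieces of u(N) in (3.6), N = r + s."""
    m = r + s - n
    assert m >= r >= s >= n >= 0, "requirements stated below (3.30)"
    return {
        "++": r * r + (s - n) ** 2 + n * n,   # u(r) + u(s-n) + u(n), cf. (3.34)
        "+-": 2 * n * (s - n),                # one complex (s-n) x n block
        "-+": 2 * r * (s - n),                # complex blocks of total shape r x (s-n)
        "--": 2 * r * n,                      # complex blocks of total shape r x n, cf. (3.35)
    }

def dim_m(n, r, s):
    """Dimension of M = T(n) x U(r-n) x U(s-n), cf. (3.41)."""
    return n + (r - n) ** 2 + (s - n) ** 2

def basis_size(n, r):
    """Number of matrices listed in (3.51)."""
    return 4 * (n * (n - 1) // 2) + n + 2 * n * (r - n)
```

Summing the four pieces gives $N^2 = \dim u(N)$, and subtracting $\dim M$ from the "++" entry reproduces $n(2r-1)$.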

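Next, we deal with the subspaces $u(N)^{+-}$ and $u(N)^{-+}$ given by

\[ u(N)^{+-} = \left\{ \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & X_{3,4} \\ 0 & 0 & -X_{3,4}^\dagger & 0 \end{pmatrix} \,\middle|\, X_{3,4} \in \mathbb{C}^{(s-n) \times n} \right\}, \tag{3.54} \]

\[ u(N)^{-+} = \left\{ \begin{pmatrix} 0 & 0 & X_{1,3} & 0 \\ 0 & 0 & X_{2,3} & 0 \\ -X_{1,3}^\dagger & -X_{2,3}^\dagger & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \,\middle|\, X_{1,3} \in \mathbb{C}^{n \times (s-n)},\ X_{2,3} \in \mathbb{C}^{(r-n) \times (s-n)} \right\}. \tag{3.55} \]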

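Note that both $u(N)^{+-}$ and $u(N)^{-+}$ are trivial if $s = n$. In general, $\dim(u(N)^{+-}) = 2n(s - n)$ and $\dim(u(N)^{-+}) = 2r(s - n)$. For all $1 \leq j \leq n$ and $1 \leq d \leq s - n$ we define

\[ \tilde E^{r,d}_j := \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & E_{dj} \\ 0 & 0 & -E_{jd} & 0 \end{pmatrix}, \qquad \tilde E^{i,d}_j := \frac{i}{\sqrt{2}} \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & E_{dj} \\ 0 & 0 & E_{jd} & 0 \end{pmatrix}, \tag{3.56} \]

\[ \tilde F^{r,d}_j := \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 0 & -E_{jd} & 0 \\ 0 & 0 & 0 & 0 \\ E_{dj} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad \tilde F^{i,d}_j := \frac{i}{\sqrt{2}} \begin{pmatrix} 0 & 0 & E_{jd} & 0 \\ 0 & 0 & 0 & 0 \\ E_{dj} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \tag{3.57} \]

For all $1 \leq c \leq r - n$ and $1 \leq d \leq s - n$ we introduce

\[ \tilde F^{r,c,d}_0 := \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & E_{cd} & 0 \\ 0 & -E_{dc} & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad \tilde F^{i,c,d}_0 := \frac{i}{\sqrt{2}} \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & E_{cd} & 0 \\ 0 & E_{dc} & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \tag{3.58} \]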

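The set of matrices

\[ \{\tilde E^D_j\}_{j, D} := \{\tilde E^{r,d}_j,\ \tilde E^{i,d}_j\}_{1 \leq j \leq n,\ 1 \leq d \leq s-n} \tag{3.59} \]

forms an ONB in $u(N)^{+-}$. The set of matrices

\[ \{\tilde F^D_j\}_{j, D} := \{\tilde F^{r,d}_j,\ \tilde F^{i,d}_j\}_{1 \leq j \leq n,\ 1 \leq d \leq s-n} \tag{3.60} \]

together with the set

\[ \{\tilde F^D_0\}_D := \{\tilde F^{r,c,d}_0,\ \tilde F^{i,c,d}_0\}_{1 \leq c \leq r-n,\ 1 \leq d \leq s-n} \tag{3.61} \]

form an ONB in $u(N)^{-+}$. They verify the relations

\[ \mathrm{ad}_q(\tilde E^D_j) = q_j \tilde F^D_j, \qquad \mathrm{ad}_q(\tilde F^D_j) = -q_j \tilde E^D_j, \qquad \mathrm{ad}_q(\tilde F^D_0) = 0. \tag{3.62} \]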
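The commutation relations (3.53) and (3.62) can be tested on the smallest example in which all four blocks of (3.33) are present, namely $n = 1$, $r = s = 2$, $m = 3$, $N = 4$, where every block is $1 \times 1$. The sketch below assumes the block placements of the real basis matrices as reconstructed above (for instance $\tilde E^{r,1}_1$ in the $(3,4)$ and $(4,3)$ positions); the variable names are illustrative.

```python
import math

N = 4
S = 1 / math.sqrt(2)

def zeros():
    return [[0j] * N for _ in range(N)]

def from_entries(entries):
    """Build a 4x4 matrix from {(row, col): value} with 1-based indices."""
    m = zeros()
    for (i, j), v in entries.items():
        m[i - 1][j - 1] = v
    return m

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = mul(a, b), mul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(N)] for i in range(N)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(N) for j in range(N))

def q_matrix(q):
    """The matrix (3.37) for n = 1."""
    return from_entries({(1, 4): q, (4, 1): -q})

E_r = from_entries({(1, 2): S, (2, 1): -S})             # E^{r,1}_{eps_1}
E_i2 = from_entries({(1, 1): 1j * S, (4, 4): -1j * S})  # E^{i}_{2 eps_1}
Et_r = from_entries({(3, 4): S, (4, 3): -S})            # tilde E^{r,1}_1
Ft_r = from_entries({(1, 3): -S, (3, 1): S})            # tilde F^{r,1}_1
F0_r = from_entries({(2, 3): S, (3, 2): -S})            # tilde F_0^{r,1,1}
```

Here $\epsilon_1(q) = q_1$, so (3.53) predicts the eigenvalue $-q_1^2$ on `E_r` and $-(2q_1)^2$ on `E_i2` for the double commutator with `q_matrix(q1)`.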

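Now we compute the matrix of $J$ and of $J^{-1}$ on the invariant subspaces in (3.17). First, choose an arbitrary ONB $\{L_j\}_{j=1}^{\dim(\mathcal{M})}$ in $\mathcal{M}$. Then the vectors

\[ \hat L_j := \frac{1}{\sqrt{2}} (L_j, -L_j) \equiv \frac{1}{\sqrt{2}} \begin{pmatrix} L_j \\ -L_j \end{pmatrix} \tag{3.63} \]

yield an ONB in $\mathcal{K}_a^\perp$. The matrix entries of $J(e^q)|_{\mathcal{K}_a^\perp}$ and $J(e^q)^{-1}|_{\mathcal{K}_a^\perp}$ read

\[ B_{\mathcal{G}}(\hat L_k, J(e^q) \hat L_l) = 2\delta_{k,l}, \qquad B_{\mathcal{G}}(\hat L_k, J(e^q)^{-1} \hat L_l) = \frac{1}{2}\delta_{k,l}. \tag{3.64} \]

Second, upon introducing the vectors

\[ V^D_\alpha := \frac{1}{\sqrt{2}} \begin{pmatrix} E^D_\alpha \\ E^D_\alpha \end{pmatrix}, \qquad W^D_\alpha := \frac{1}{\sqrt{2}} \begin{pmatrix} E^D_\alpha \\ -E^D_\alpha \end{pmatrix}, \tag{3.65} \]

we obtain an ONB in $\mathcal{K}_e^\perp$, and by applying (3.19) on these vectors we get

\[ J(e^q) V^D_\alpha = \frac{1}{\sqrt{2}} \begin{pmatrix} (1 - \cosh(\mathrm{ad}_q)) E^D_\alpha \\ (1 - \cosh(\mathrm{ad}_q)) E^D_\alpha \end{pmatrix}, \qquad J(e^q) W^D_\alpha = \frac{1}{\sqrt{2}} \begin{pmatrix} (1 + \cosh(\mathrm{ad}_q)) E^D_\alpha \\ -(1 + \cosh(\mathrm{ad}_q)) E^D_\alpha \end{pmatrix}. \tag{3.66} \]

We find from the relations (3.53) that $\cosh(\mathrm{ad}_q) E^D_\alpha = \cos(\alpha(q)) E^D_\alpha$, and then elementary trigonometric identities yield

\[ J(e^q) V^D_\alpha = 2 \sin^2\!\left(\frac{\alpha(q)}{2}\right) V^D_\alpha, \qquad J(e^q) W^D_\alpha = 2 \cos^2\!\left(\frac{\alpha(q)}{2}\right) W^D_\alpha. \tag{3.67} \]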

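Therefore the only non-trivial matrix entries of $J(e^q)|_{\mathcal{K}_e^\perp}$ and $J(e^q)^{-1}|_{\mathcal{K}_e^\perp}$ are the following ones:

\[ \begin{aligned} B_{\mathcal{G}}(V^D_\alpha, J(e^q) V^D_\alpha) &= 2 \sin^2\!\left(\frac{\alpha(q)}{2}\right), & B_{\mathcal{G}}(V^D_\alpha, J(e^q)^{-1} V^D_\alpha) &= \frac{1}{2 \sin^2\!\left(\frac{\alpha(q)}{2}\right)}, \\ B_{\mathcal{G}}(W^D_\alpha, J(e^q) W^D_\alpha) &= 2 \cos^2\!\left(\frac{\alpha(q)}{2}\right), & B_{\mathcal{G}}(W^D_\alpha, J(e^q)^{-1} W^D_\alpha) &= \frac{1}{2 \cos^2\!\left(\frac{\alpha(q)}{2}\right)}. \end{aligned} \tag{3.68} \]

Third, by introducing

\[ \tilde V^D_j := \frac{1}{\sqrt{2}} \begin{pmatrix} \tilde E^D_j \\ \tilde F^D_j \end{pmatrix}, \qquad \tilde W^D_j := \frac{1}{\sqrt{2}} \begin{pmatrix} \tilde E^D_j \\ -\tilde F^D_j \end{pmatrix}, \qquad \tilde Z^D_0 := \begin{pmatrix} 0 \\ \tilde F^D_0 \end{pmatrix}, \tag{3.69} \]

we obtain an ONB in $\mathcal{K}_o^\perp$, and the application of (3.19) on these basis vectors gives

\[ J(e^q) \tilde V^D_j = \frac{1}{\sqrt{2}} \begin{pmatrix} \tilde E^D_j - \sinh(\mathrm{ad}_q) \tilde F^D_j \\ \sinh(\mathrm{ad}_q) \tilde E^D_j + \tilde F^D_j \end{pmatrix}, \qquad J(e^q) \tilde W^D_j = \frac{1}{\sqrt{2}} \begin{pmatrix} \tilde E^D_j + \sinh(\mathrm{ad}_q) \tilde F^D_j \\ \sinh(\mathrm{ad}_q) \tilde E^D_j - \tilde F^D_j \end{pmatrix}. \tag{3.70} \]

By using the relations (3.62) we see that

\[ J(e^q) \tilde V^D_j = (1 + \sin(q_j)) \tilde V^D_j, \qquad J(e^q) \tilde W^D_j = (1 - \sin(q_j)) \tilde W^D_j. \tag{3.71} \]

Since $J(e^q) \tilde Z^D_0 = \tilde Z^D_0$, we conclude that the only non-trivial matrix entries of $J(e^q)|_{\mathcal{K}_o^\perp}$ and its inverse $J(e^q)^{-1}|_{\mathcal{K}_o^\perp}$ are the following ones:

\[ \begin{aligned} B_{\mathcal{G}}(\tilde V^D_j, J(e^q) \tilde V^D_j) &= 1 + \sin(q_j), & B_{\mathcal{G}}(\tilde V^D_j, J(e^q)^{-1} \tilde V^D_j) &= \frac{1}{1 + \sin(q_j)}, \\ B_{\mathcal{G}}(\tilde W^D_j, J(e^q) \tilde W^D_j) &= 1 - \sin(q_j), & B_{\mathcal{G}}(\tilde W^D_j, J(e^q)^{-1} \tilde W^D_j) &= \frac{1}{1 - \sin(q_j)}, \\ B_{\mathcal{G}}(\tilde Z^D_0, J(e^q) \tilde Z^D_0) &= 1, & B_{\mathcal{G}}(\tilde Z^D_0, J(e^q)^{-1} \tilde Z^D_0) &= 1. \end{aligned} \tag{3.72} \]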
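Multiplying the eigenvalues collected in (3.64), (3.67) and (3.71) gives $\det J(e^q)$, and hence the density $\delta$ via (2.13). The sketch below forms this product and compares it with the fourth power of a product of sines with exponents $1$, $r - s$ and $s - n + \frac{1}{2}$; the ratio should be a constant independent of $q$. The multiplicities used (two basis vectors for each $\epsilon_k \pm \epsilon_l$, one for $2\epsilon_j$, $2(r-n)$ for $\epsilon_j$, and $2(s-n)$ pairs $\tilde V, \tilde W$ per $j$) are read off from the basis lists above; function names are hypothetical.

```python
import math

def det_inertia(q, r, s):
    """Product of the eigenvalues of J(e^q) on K^perp for q = (q_1, ..., q_n)."""
    n = len(q)
    dim_m = n + (r - n) ** 2 + (s - n) ** 2
    det = 2.0 ** dim_m                       # K_a^perp: eigenvalue 2 on each L-hat_j

    def vw_pair(alpha, mult):
        # One V and one W direction: 2 sin^2(alpha/2) * 2 cos^2(alpha/2) = sin^2(alpha)
        return (math.sin(alpha) ** 2) ** mult

    for k in range(n):
        for l in range(k + 1, n):
            det *= vw_pair(q[k] - q[l], 2) * vw_pair(q[k] + q[l], 2)
    for x in q:
        det *= vw_pair(2 * x, 1)             # root 2 eps_j
        det *= vw_pair(x, 2 * (r - n))       # root eps_j
        det *= ((1 + math.sin(x)) * (1 - math.sin(x))) ** (2 * (s - n))  # K_o^perp
    return det                               # tilde Z_0 directions contribute factors of 1

def sine_product(q, r, s):
    """Product of sines with exponents nu = 1, nu1 = r - s, nu2 = s - n + 1/2."""
    n = len(q)
    nu1, nu2 = r - s, s - n + 0.5
    val = 1.0
    for k in range(n):
        for l in range(k + 1, n):
            val *= abs(math.sin(q[k] - q[l]) * math.sin(q[k] + q[l]))
    for x in q:
        val *= math.sin(x) ** nu1 * math.sin(2 * x) ** nu2
    return val
```

Since $\delta \propto (\det J)^{1/2}$, a constant ratio `det_inertia / sine_product**4` is equivalent to $\delta^{1/2}$ being proportional to the sine product.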

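Lemma 3.2. By using the identification $\check\Sigma := \check A = \exp(\check{\mathcal{A}}^+)$ with $\check{\mathcal{A}}^+$ in (3.45), the second term of the reduced Laplacian (2.14) is given by

\[ \delta^{-1/2} \Delta_{\check A}(\delta^{1/2}) = \frac{1}{2} \sum_{j=1}^{n} \frac{(m - n)(r - s)}{\sin^2(q_j)} + \frac{1}{2} \sum_{j=1}^{n} \frac{4(s - n)^2 - 1}{\sin^2(2q_j)} - \frac{n(3m^2 + n^2 - 1)}{6}. \tag{3.73} \]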

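Proof. Consider the function

\[ \mathcal{J} := \prod_{1 \leq k < l \leq n} [\sin(q_k - q_l) \sin(q_k + q_l)]^{\nu} \prod_{j=1}^{n} [\sin(q_j)]^{\nu_1} \prod_{j=1}^{n} [\sin(2q_j)]^{\nu_2}, \tag{3.74} \]

where the domain of the variables $q_1, q_2, \ldots, q_n$ is such that all sin functions are positive and $\nu, \nu_1, \nu_2 \in \mathbb{R}$ are arbitrary parameters. Recall from [9] the identity

\[ \begin{aligned} \mathcal{J}^{-1} \sum_{a=1}^{n} \frac{\partial^2 \mathcal{J}}{\partial q_a^2} ={}& 2\nu(\nu - 1) \sum_{1 \leq k < l \leq n} \left( \frac{1}{\sin^2(q_k - q_l)} + \frac{1}{\sin^2(q_k + q_l)} \right) \\ &+ \nu_1(\nu_1 + 2\nu_2 - 1) \sum_{j=1}^{n} \frac{1}{\sin^2(q_j)} + 4\nu_2(\nu_2 - 1) \sum_{j=1}^{n} \frac{1}{\sin^2(2q_j)} \\ &- n \left( (\nu_1 + 2\nu_2)^2 + 2\nu(\nu_1 + 2\nu_2)(n - 1) + \frac{2}{3} \nu^2 (n - 1)(2n - 1) \right). \end{aligned} \tag{3.75} \]

By calculating $\det(J(e^q))$ using the above basis of $\mathcal{K}^\perp$, it is easily obtained from (2.13) that $\delta^{1/2}(e^q) \propto \mathcal{J}(q_1, q_2, \ldots, q_n)$ with

\[ \nu = 1, \qquad \nu_1 = r - s, \qquad \nu_2 = s - n + \frac{1}{2}. \tag{3.76} \]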
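The identity (3.75) and its specialization (3.76) can be spot-checked numerically. The sketch below compares a finite-difference evaluation of $\mathcal{J}^{-1} \sum_a \partial^2 \mathcal{J} / \partial q_a^2$ with the right-hand side, assuming the pair-term coefficient $2\nu(\nu - 1)$ (which drops out at $\nu = 1$), and then compares one half of it with (3.73). The step size, tolerance and function names are choices made for this illustration only.

```python
import math

def weight(q, nu, nu1, nu2):
    """The function of (3.74); abs() keeps pair factors positive for any ordering of q."""
    n = len(q)
    val = 1.0
    for k in range(n):
        for l in range(k + 1, n):
            val *= abs(math.sin(q[k] - q[l]) * math.sin(q[k] + q[l])) ** nu
    for x in q:
        val *= math.sin(x) ** nu1 * math.sin(2 * x) ** nu2
    return val

def lhs(q, nu, nu1, nu2, h=1e-4):
    """Finite-difference value of (sum_a d^2/dq_a^2 weight) / weight."""
    base = weight(q, nu, nu1, nu2)
    total = 0.0
    for a in range(len(q)):
        up, dn = list(q), list(q)
        up[a] += h
        dn[a] -= h
        total += (weight(up, nu, nu1, nu2) - 2 * base + weight(dn, nu, nu1, nu2)) / h ** 2
    return total / base

def rhs(q, nu, nu1, nu2):
    """Right-hand side of (3.75) as reconstructed above."""
    n = len(q)
    pair = sum(1 / math.sin(q[k] - q[l]) ** 2 + 1 / math.sin(q[k] + q[l]) ** 2
               for k in range(n) for l in range(k + 1, n))
    single = sum(nu1 * (nu1 + 2 * nu2 - 1) / math.sin(x) ** 2
                 + 4 * nu2 * (nu2 - 1) / math.sin(2 * x) ** 2 for x in q)
    t = nu1 + 2 * nu2
    const = n * (t ** 2 + 2 * nu * t * (n - 1) + (2 / 3) * nu ** 2 * (n - 1) * (2 * n - 1))
    return 2 * nu * (nu - 1) * pair + single - const

def lemma_3_2(q, m, r, s):
    """Right-hand side of (3.73)."""
    n = len(q)
    return (0.5 * sum((m - n) * (r - s) / math.sin(x) ** 2 for x in q)
            + 0.5 * sum((4 * (s - n) ** 2 - 1) / math.sin(2 * x) ** 2 for x in q)
            - n * (3 * m * m + n * n - 1) / 6)
```

The factor one half in the comparison with (3.73) comes from the normalization of $\Delta_{\check A}$ in (3.46).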

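Taking into account (3.46), the required statement follows immediately.

The subsequent formula is obtained by direct substitution since we have determined the matrix elements of $J(e^q)^{-1}$ (cf. (2.11)). It will be used in Sec. 4, when we shall further inspect the reduced Laplace operator (2.14) in interesting cases.

Lemma 3.3. In terms of the above notations, the third term of the reduced Laplacian (2.14) takes the following form:

\[ \begin{aligned} b^{\alpha,\beta} \rho'(T_\alpha) \rho'(T_\beta) ={}& \frac{1}{2} \sum_{1 \leq j \leq \dim(\mathcal{M})} \rho'(\hat L_j)^2 + \frac{1}{2} \sum_{j=1}^{n} \left( \frac{\rho'(V^i_{2\epsilon_j})^2}{\sin^2(q_j)} + \frac{\rho'(W^i_{2\epsilon_j})^2}{\cos^2(q_j)} \right) \\ &+ \frac{1}{2} \sum_{1 \leq k < l \leq n} \left( \frac{\rho'(V^r_{\epsilon_k - \epsilon_l})^2 + \rho'(V^i_{\epsilon_k - \epsilon_l})^2}{\sin^2\!\left(\frac{q_k - q_l}{2}\right)} + \frac{\rho'(W^r_{\epsilon_k - \epsilon_l})^2 + \rho'(W^i_{\epsilon_k - \epsilon_l})^2}{\cos^2\!\left(\frac{q_k - q_l}{2}\right)} \right) \\ &+ \frac{1}{2} \sum_{1 \leq k < l \leq n} \left( \frac{\rho'(V^r_{\epsilon_k + \epsilon_l})^2 + \rho'(V^i_{\epsilon_k + \epsilon_l})^2}{\sin^2\!\left(\frac{q_k + q_l}{2}\right)} + \frac{\rho'(W^r_{\epsilon_k + \epsilon_l})^2 + \rho'(W^i_{\epsilon_k + \epsilon_l})^2}{\cos^2\!\left(\frac{q_k + q_l}{2}\right)} \right) \\ &+ \frac{1}{2} \sum_{j=1}^{n} \sum_{d=1}^{r-n} \left( \frac{\rho'(V^{r,d}_{\epsilon_j})^2 + \rho'(V^{i,d}_{\epsilon_j})^2}{\sin^2\!\left(\frac{q_j}{2}\right)} + \frac{\rho'(W^{r,d}_{\epsilon_j})^2 + \rho'(W^{i,d}_{\epsilon_j})^2}{\cos^2\!\left(\frac{q_j}{2}\right)} \right) \\ &+ \sum_{j=1}^{n} \sum_{d=1}^{s-n} \left( \frac{\rho'(\tilde V^{r,d}_j)^2 + \rho'(\tilde V^{i,d}_j)^2}{1 + \sin(q_j)} + \frac{\rho'(\tilde W^{r,d}_j)^2 + \rho'(\tilde W^{i,d}_j)^2}{1 - \sin(q_j)} \right) \\ &+ \sum_{c=1}^{r-n} \sum_{d=1}^{s-n} \left( \rho'(\tilde Z^{r,c,d}_0)^2 + \rho'(\tilde Z^{i,c,d}_0)^2 \right). \end{aligned} \tag{3.77} \]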

4. BCn Sutherland Models from the KKS Ansatz In this section, we study interesting examples of the quantum Hamiltonian reduction based on the Hermann action (3.2) on Y = U (N ) associated with the involutions (3.31). The reductions correspond to certain UIRREPS ρ of the symmetry group G = U (N )L × U (N )R = (U (r) × U (s)) × (U (m) × U (n)).

(4.1)

To describe them, we now briefly summarize our notations for the UIRREPS of $U(n)$, for arbitrary $n$. (See also the Appendix.) First, we have the UIRREP $(\Pi_\lambda, V_\lambda)$ of $SU(n)$ in correspondence to any highest weight $\lambda \in P_+(SU(n))$, which can be written as $\lambda = \sum_{i=1}^{n-1} a_i\varpi_i$ using the fundamental weights $\varpi_i$ and integers $a_i \in \mathbb{Z}_{\ge 0}$. A label $\mu_n(\lambda) \in \{0, 1, \dots, n-1\}$ is attached to the highest weight $\lambda$ by the congruence relation
\[
\mu_n(\lambda) \equiv \sum_{k=1}^{n-1} k a_k \pmod{n} \quad\text{for}\quad \lambda = \sum_{i=1}^{n-1} a_i\varpi_i. \tag{4.2}
\]
It enters the equality $\Pi_\lambda(e^{i\frac{2\pi}{n}} 1_n) = e^{i\frac{2\pi}{n}\mu_n(\lambda)}\,\mathrm{Id}_{V_\lambda}$. Then, for any $k \in \mathbb{Z}$, the representation $\Pi_\lambda$ of $SU(n)$ extends to the representation $\rho_{(k,\lambda)}$ of $U(n)$ defined by
\[
\rho_{(k,\lambda)}(\xi g) = \xi^{nk+\mu_n(\lambda)}\,\Pi_\lambda(g), \qquad \forall\, \xi \in U(1),\quad \forall\, g \in SU(n). \tag{4.3}
\]
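As a small illustration of (4.2) and (4.3), the following Python sketch computes the label $\mu_n(\lambda)$ from the coefficients $(a_1, \dots, a_{n-1})$ and checks numerically that the exponent $nk + \mu_n(\lambda)$ agrees with $\mu_n(\lambda)$ on the $n$-th roots of unity, which is what makes the extension (4.3) well defined on $U(1) 1_n \cap SU(n)$. The function names are hypothetical and chosen only for this sketch; they do not come from the paper.

```python
import cmath

def mu_n(dynkin_labels):
    """Label mu_n(lambda) of (4.2) for lambda = sum_i a_i * varpi_i.

    dynkin_labels = (a_1, ..., a_{n-1}); n is inferred as len + 1.
    """
    n = len(dynkin_labels) + 1
    return sum(k * a for k, a in enumerate(dynkin_labels, start=1)) % n

def central_exponent(k, dynkin_labels):
    """Exponent n*k + mu_n(lambda) with which xi in U(1) acts in (4.3)."""
    n = len(dynkin_labels) + 1
    return n * k + mu_n(dynkin_labels)

def extension_is_consistent(k, dynkin_labels, tol=1e-12):
    """Check xi**(n*k + mu) == xi**mu for every n-th root of unity xi.

    The matrices xi * 1_n with xi**n == 1 lie in SU(n) as well as in
    U(1) * 1_n, so both prescriptions in (4.3) must agree on them.
    """
    n = len(dynkin_labels) + 1
    mu = mu_n(dynkin_labels)
    exponent = central_exponent(k, dynkin_labels)
    for r in range(n):
        xi = cmath.exp(2j * cmath.pi * r / n)
        if abs(xi ** exponent - xi ** mu) > tol:
            return False
    return True
```

For the symmetric powers used below, $\lambda = a_1\varpi_1$, the label reduces to $\mu_n(a_1\varpi_1) = a_1 \bmod n$.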

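Up to equivalence, all UIRREPS of $U(n)$ are obtained in this way. The notation makes sense even for $n = 1$, by putting $P_+(SU(1)) := \{0\}$, and we have $\rho_{(k,0)}(g) = (\det g)^k$ ($\forall g \in U(n)$). By letting $\rho'_{(k,\lambda)}$ and $\pi_\lambda$ stand for the infinitesimal versions of the representations $\rho_{(k,\lambda)}$ and $\Pi_\lambda$, respectively, we have
\[
\rho'_{(k,\lambda)}(Z) = \pi_\lambda\left(Z - \frac{\mathrm{tr}(Z)}{n} 1_n\right) + (\mu_n(\lambda) + nk)\,\frac{\mathrm{tr}(Z)}{n}\,\mathrm{Id}_{V_\lambda}, \qquad \forall\, Z \in u(n). \tag{4.4}
\]
We use the notations $\pi_\lambda^{(n)}$, $V_\lambda^{(n)}$, $\rho^{(n)}_{(k,\lambda)}$ etc. when considering various values of $n$ simultaneously. The UIRREPS of the direct product group $G$ (4.1) have the form
\[
\rho = \left(\rho^{(r)}_{(k_L^1,\lambda_L^1)} \boxtimes \rho^{(s)}_{(k_L^2,\lambda_L^2)}\right) \boxtimes \left(\rho^{(m)}_{(k_R^1,\lambda_R^1)} \boxtimes \rho^{(n)}_{(k_R^2,\lambda_R^2)}\right), \tag{4.5}
\]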


where $\lambda_L^1, \lambda_L^2, \lambda_R^1, \lambda_R^2$ are the highest weights and $k_L^1, k_L^2, k_R^1, k_R^2 \in \mathbb{Z}$ according to (4.3). The main problem is to find the UIRREPS $(\rho, V)$ for which
\[
\dim(V^K) = 1, \tag{4.6}
\]
where $K = M_{\mathrm{diag}} < G$ is given by (3.42). We investigate this problem by adopting the ansatz that one of the four constituent representations in (4.5) has the form $\rho^{(l)}_{(k, a_1\varpi_1)}$ ($l \in \{r, s, m, n\}$) and the other three constituent representations are 1-dimensional. More exactly, $\rho^{(l)}_{(k, a_1\varpi_1)}$ will be used for a factor of the maximal size, $l = \max\{r, s, m, n\}$. We call this assumption the KKS ansatz, since it eventually originates from the seminal paper by Kazhdan, Kostant and Sternberg [14]. The usefulness of this assumption is also supported by results in [13, 19, 20]. The key property is that all weight-multiplicities of $\rho^{(l)}_{(k, a_1\varpi_1)}$ are equal to one.

The analysis of the condition (4.6) is the easiest if the group $K$ (3.42) is Abelian, which happens in the following cases:
• Case I: $m = r = s = n$, $N = 2n$,
• Case II: $m = r = n + 1$, $s = n$, $N = 2n + 1$,
• Case III: $m = n + 2$, $r = s = n + 1$, $N = 2n + 2$.

Next we describe the simplest Case I in detail, then present the essential points for the other two cases. The complex holomorphic analogue of Case I was studied in [19], and the results are consistent. The other two cases of our KKS ansatz have not been investigated before.

Remark. The reader may wonder why we take $l = \max\{r, s, m, n\}$ in our KKS ansatz in Cases II and III. In fact, we previously studied ([20] and unpublished work) the classical Hamiltonian reductions of the free particle on $U(N)$ based on the symmetry group (4.1) by using a minimal coadjoint orbit of positive dimension for any one of the four factors and one-point orbits for the other three factors. We found that this leads to the classical $BC_n$ Sutherland model with three independent coupling constants only in the three cases mentioned above, and only if the minimal coadjoint orbit of positive dimension, $2(l - 1)$ for $U(l)$, is associated with a factor of maximal size. The connection to quantum Hamiltonian reduction is clear from the relation between the coadjoint orbits of $U(l)$ of dimension $2(l - 1)$ and the representations $\rho^{(l)}_{(k, a_1\varpi_1)}$ (and their contragredients), which follows for example from geometric quantization.

4.1. Case I: m = r = s = n, N = 2n

Now $\sigma_L = \sigma_R = \theta_{n,n}$ and $U(N)_L = U(N)_R \cong U(n) \times U(n)$.
The decomposition (3.33) of any matrix in $\mathbb{C}^{N\times N}$ simplifies to a two by two block form with all four blocks having size $n \times n$. We look for admissible UIRREPS $\rho$ of $G$ (4.1) by adopting the KKS ansatz
\[
\rho := \left(\rho^{(n)}_{(k_L^1, a_1\varpi_1)} \boxtimes \rho^{(n)}_{(k_L^2, 0)}\right) \boxtimes \left(\rho^{(n)}_{(k_R^1, 0)} \boxtimes \rho^{(n)}_{(k_R^2, 0)}\right), \tag{4.7}
\]


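where $a_1 \in \mathbb{Z}_{\ge 0}$, $k_L^1, k_L^2, k_R^1, k_R^2 \in \mathbb{Z}$ and the representation space is identified as
\[
V \equiv V^{(n)}_{a_1\varpi_1}. \tag{4.8}
\]
Note that any element $X \in \mathcal{G} \cong u(N)^{\sigma_L,+} \oplus u(N)^{\sigma_R,+}$ of the symmetry algebra $\mathcal{G}$ can be realized as a pair $X = (X_L, X_R)$ with $X_L, X_R \in u(N)^{\sigma_L,+} = u(N)^{\sigma_R,+} \cong u(n) \oplus u(n)$. So, for any $X \in \mathcal{G}$ we have the refined decomposition
\[
X = (X_L, X_R) = ((X_L^1, X_L^2), (X_R^1, X_R^2)), \tag{4.9}
\]
where $X_L^1, X_L^2, X_R^1, X_R^2 \in u(n)$ and as block-matrices
\[
(X_L^1, X_L^2) := \begin{pmatrix} X_L^1 & 0 \\ 0 & X_L^2 \end{pmatrix}, \qquad
(X_R^1, X_R^2) := \begin{pmatrix} X_R^1 & 0 \\ 0 & X_R^2 \end{pmatrix}. \tag{4.10}
\]
With these notations, the formula of the Lie algebra representation corresponding to (4.7) reads
\[
\rho'(X) = \pi^{(n)}_{a_1\varpi_1}\left(X_L^1 - \frac{\mathrm{tr}(X_L^1)}{n} 1_n\right) + \left[\left(k_L^1 + \frac{\mu_n(a_1\varpi_1)}{n}\right)\mathrm{tr}(X_L^1) + \mathrm{tr}\left(k_L^2 X_L^2 + k_R^1 X_R^1 + k_R^2 X_R^2\right)\right]\mathrm{Id}_V. \tag{4.11}
\]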

Lemma 4.1. The KKS ansatz (4.7) defines admissible UIRREPS of $G$ satisfying $\dim(V^K) \neq 0$ if and only if $k_L^1 + k_L^2 + k_R^1 + k_R^2 = 0$ and $a_1 = \gamma n$ with some $\gamma \in \mathbb{Z}_{\ge 0}$. In these cases $\dim(V^K) = 1$. Using the bosonic oscillator realization of $V$ (4.8) described in the Appendix, $V^K$ has the form
\[
V^K \cong V^{(n)}_{\gamma n\varpi_1}[0] = \mathrm{span}_{\mathbb{C}}\{|\gamma, \gamma, \dots, \gamma\rangle\}. \tag{4.12}
\]

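Proof. The isotropy subalgebra is $\mathcal{K} = \mathcal{M}_{\mathrm{diag}} = \{X = (X_0, X_0) \mid X_0 \in \mathcal{M}\}$, where $\mathcal{M}$ can be parametrized as
\[
\mathcal{M} = \left\{ X_0 = \begin{pmatrix} H + ix 1_n & 0 \\ 0 & H + ix 1_n \end{pmatrix} \,\middle|\, H \in i\mathcal{H}_{\mathbb{R}}^{(n)},\ x \in \mathbb{R} \right\}. \tag{4.13}
\]
That is, for the components of any $X \in \mathcal{K}$ we have the parametrization
\[
X_L^1 = X_L^2 = X_R^1 = X_R^2 = H + ix 1_n. \tag{4.14}
\]
Thus, using Eq. (4.11), for any $v \in V^{(n)}_{a_1\varpi_1}$ and $X \in \mathcal{K}$ we can write
\[
\rho'(X)v = \pi^{(n)}_{a_1\varpi_1}(H)v + ix\left(\mu_n(a_1\varpi_1) + n(k_L^1 + k_L^2 + k_R^1 + k_R^2)\right)v. \tag{4.15}
\]
Clearly $\rho'(X)v = 0$ ($\forall X \in \mathcal{K}$) if and only if
\[
\pi^{(n)}_{a_1\varpi_1}(H)v = 0 \quad (\forall H \in i\mathcal{H}_{\mathbb{R}}^{(n)}) \quad\text{and}\quad \mu_n(a_1\varpi_1) + n(k_L^1 + k_L^2 + k_R^1 + k_R^2) = 0. \tag{4.16}
\]
Therefore $V^K = V^{\mathcal{K}} \cong V^{(n)}_{a_1\varpi_1}[0]$, provided that $\mu_n(a_1\varpi_1) + n(k_L^1 + k_L^2 + k_R^1 + k_R^2) = 0$. It is easy to see that $V^{(n)}_{a_1\varpi_1}[0] \neq \{0\}$ if and only if $a_1 = \gamma n$ for some $\gamma \in \mathbb{Z}_{\ge 0}$.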


Since $\mu_n(\gamma n\varpi_1) = 0$ by (4.2), the requirement $k_L^1 + k_L^2 + k_R^1 + k_R^2 = 0$ then also follows from (4.16). Finally, note that by using the oscillator realization of $V^{(n)}_{\gamma n\varpi_1}$ one has the second equality in (4.12).
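The statement that the zero weight space is non-trivial exactly for $a_1 = \gamma n$ can be illustrated in the oscillator picture of the Appendix, where the basis state $|l_1, \dots, l_n\rangle$ has weight $\sum_j l_j e_j$. The sketch below is only an illustration under that assumption: it enumerates the occupation numbers with $l_1 + \dots + l_n = a_1$ and keeps those on which every traceless diagonal matrix acts by zero, which happens precisely when all $l_j$ coincide. The helper names are hypothetical.

```python
def occupation_states(n, total):
    """All (l_1, ..., l_n) in Z_{>=0}^n with l_1 + ... + l_n = total."""
    if n == 1:
        return [(total,)]
    states = []
    for first in range(total + 1):
        for rest in occupation_states(n - 1, total - first):
            states.append((first,) + rest)
    return states

def zero_weight_states(n, a1):
    """Basis states of F_{a1} annihilated by all traceless diagonal H.

    H = diag(h_1, ..., h_n) acts on |l_1, ..., l_n> by sum_j l_j * h_j,
    and this vanishes for every traceless H iff all l_j are equal.
    """
    return [l for l in occupation_states(n, a1) if len(set(l)) == 1]
```

For example, `zero_weight_states(3, 6)` returns the single state `(2, 2, 2)`, i.e. $\gamma = 2$, while `zero_weight_states(3, 5)` is empty.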

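In what follows we make use of the basis of $\mathcal{K}^{\perp}$ constructed in Sec. 3.3. In the present case this is given by the basis $\{V_\alpha^a, W_\alpha^a\}_{a\in\{r,i\},\,\alpha\in R_+(C_n)}$ of $\mathcal{K}_e^{\perp}$ together with the basis $\{\hat L_j\}$ of $\mathcal{K}_a^{\perp}$ defined according to (3.63) by using the following orthonormal basis $\{L_j\}_{j=1}^{n}$ of $\mathcal{M}$:
\[
L_j := \frac{i}{\sqrt{2}}\begin{pmatrix} E_{jj} & 0 \\ 0 & E_{jj} \end{pmatrix} \in \mathcal{M} \qquad (1 \le j \le n). \tag{4.17}
\]

Lemma 4.2. In the case of the KKS ansatz (4.7) subject to the conditions of Lemma 4.1 the third term in the reduced Laplacian (2.14) gives
\[
\begin{aligned}
b^{\alpha,\beta}\rho'(T_\alpha)\rho'(T_\beta) ={}& -\frac{1}{2} n (k_L^1 + k_L^2)^2 - \gamma(\gamma+1)\sum_{1\le k<l\le n}\left(\frac{1}{\sin^2(q_k - q_l)} + \frac{1}{\sin^2(q_k + q_l)}\right)\\
& - \frac{1}{2}\sum_{j=1}^{n}\frac{(k_L^1 + k_R^1)^2 - (k_L^2 + k_R^1)^2}{\sin^2(q_j)} - 2\sum_{j=1}^{n}\frac{(k_L^2 + k_R^1)^2}{\sin^2(2q_j)}.
\end{aligned}
\tag{4.18}
\]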

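Proof. Note that in the present case only the first four sums occur in the formula (3.77). Recalling that $\mu_n(\gamma n\varpi_1) = 0$ and utilizing formula (4.11) for $\rho'$, we can calculate the action of the various terms. For example, since
\[
\hat L_j = \frac{1}{\sqrt{2}}(L_j, -L_j) = \frac{i}{2}\left((E_{jj}, E_{jj}), (-E_{jj}, -E_{jj})\right), \tag{4.19}
\]
we get
\[
\rho'(\hat L_j) = \frac{i}{2}\left[\pi^{(n)}_{\gamma n\varpi_1}\left(E_{jj} - \frac{1}{n} 1_n\right) + (k_L^1 + k_L^2 - k_R^1 - k_R^2)\,\mathrm{Id}_V\right]. \tag{4.20}
\]
The action of $\rho'(\hat L_j)$ on $V^K$ can be easily calculated in the bosonic oscillator picture. Since $\pi^{(n)}_{\gamma n\varpi_1}\left(E_{jj} - \frac{1}{n} 1_n\right)|\gamma, \gamma, \dots, \gamma\rangle = 0$, and since $k_R^2 = -k_L^1 - k_L^2 - k_R^1$, it follows that on the subspace $V^K \cong \mathrm{span}_{\mathbb{C}}\{|\gamma, \gamma, \dots, \gamma\rangle\}$ the operator $\rho'(\hat L_j)$ acts as the scalar $\rho'(\hat L_j) = i(k_L^1 + k_L^2)$. In the same manner, the equalities $\rho'(V^{i}_{2\varepsilon_j}) = i(k_L^1 + k_R^1)$ and $\rho'(W^{i}_{2\varepsilon_j}) = -i(k_L^2 + k_R^1)$ hold on $V^K$. Furthermore, we have on $V$
\[
\rho'(V^{r}_{\varepsilon_k-\varepsilon_l}) = \rho'(W^{r}_{\varepsilon_k-\varepsilon_l}) = \rho'(V^{r}_{\varepsilon_k+\varepsilon_l}) = \rho'(W^{r}_{\varepsilon_k+\varepsilon_l}) = \frac{1}{2\sqrt{2}}\left(\pi^{(n)}_{\gamma n\varpi_1}(E_{kl}) - \pi^{(n)}_{\gamma n\varpi_1}(E_{lk})\right), \tag{4.21}
\]
\[
\rho'(V^{i}_{\varepsilon_k-\varepsilon_l}) = \rho'(W^{i}_{\varepsilon_k-\varepsilon_l}) = \rho'(V^{i}_{\varepsilon_k+\varepsilon_l}) = \rho'(W^{i}_{\varepsilon_k+\varepsilon_l}) = \frac{i}{2\sqrt{2}}\left(\pi^{(n)}_{\gamma n\varpi_1}(E_{kl}) + \pi^{(n)}_{\gamma n\varpi_1}(E_{lk})\right). \tag{4.22}
\]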


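Next, $\forall k, l \in \{1, 2, \dots, n\}$, $k \neq l$, we obtain
\[
\pi^{(n)}_{\gamma n\varpi_1}(E_{kl})\,\pi^{(n)}_{\gamma n\varpi_1}(E_{lk})\,|\gamma, \gamma, \dots, \gamma\rangle = b_k^{\dagger} b_l b_l^{\dagger} b_k\,|\gamma, \gamma, \dots, \gamma\rangle = \gamma(\gamma + 1)\,|\gamma, \gamma, \dots, \gamma\rangle. \tag{4.23}
\]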
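The eigenvalue in (4.23) can be checked by applying the ladder operators of the Appendix (Eqs. (A.8), (A.9) and (A.12)) step by step. The following sketch is an illustration only: a state is stored as a pair of a numerical coefficient and a tuple of occupation numbers, indices are 0-based, and the function names are hypothetical.

```python
import math

def apply_bdag_b(i, j, state):
    """Apply b_i^dagger b_j to state = (coefficient, occupations).

    Returns None for the zero vector. Indices i, j are 0-based.
    """
    if state is None:
        return None
    coeff, occ = state
    if occ[j] == 0:
        return None                      # b_j annihilates the state
    occ = list(occ)
    coeff *= math.sqrt(occ[j])           # b_j lowers l_j, factor sqrt(l_j)
    occ[j] -= 1
    coeff *= math.sqrt(occ[i] + 1)       # b_i^dagger raises l_i, factor sqrt(l_i + 1)
    occ[i] += 1
    return (coeff, tuple(occ))

def eigenvalue_4_23(n, gamma, k, l):
    """Coefficient of |gamma,...,gamma> in E_kl E_lk |gamma,...,gamma>."""
    start = (1.0, (gamma,) * n)
    out = apply_bdag_b(k, l, apply_bdag_b(l, k, start))
    if out is None:
        return 0.0
    coeff, occ = out
    assert occ == start[1]               # the state is mapped back to itself
    return coeff
```

With $n = 3$ and $\gamma = 2$ the function `eigenvalue_4_23(3, 2, 0, 1)` evaluates to $\gamma(\gamma + 1) = 6$ up to floating-point rounding.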

The above equations imply that on $V^K$
\[
\rho'(\hat L_j)^2 = -(k_L^1 + k_L^2)^2, \qquad \rho'(V^{i}_{2\varepsilon_j})^2 = -(k_L^1 + k_R^1)^2, \qquad \rho'(W^{i}_{2\varepsilon_j})^2 = -(k_L^2 + k_R^1)^2, \tag{4.24}
\]
\[
\rho'(V^{r}_{\alpha})^2 + \rho'(V^{i}_{\alpha})^2 = \rho'(W^{r}_{\alpha})^2 + \rho'(W^{i}_{\alpha})^2 = -\frac{1}{2}\gamma(\gamma + 1) \quad\text{for}\quad \alpha = \varepsilon_k \pm \varepsilon_l,\ k \neq l. \tag{4.25}
\]
Now (4.18) results by substitution into (3.77), using obvious trigonometric identities.

The following proposition is obtained by putting together the statements of Eq. (3.46), Lemmas 3.2 and 4.2.

Proposition 4.3. Under the KKS ansatz (4.7) the general formula (2.14) gives the following result for the reduction of the Laplace operator of $U(N)$:
\[
-\Delta_{\mathrm{red}} = H_{BC_n} + \frac{1}{2} n (k_L^1 + k_L^2)^2 - \frac{1}{6} n (2n - 1)(2n + 1), \tag{4.26}
\]
where $H_{BC_n}$ is the Sutherland Hamiltonian (1.4) with the coupling parameters defined by
\[
a \equiv \gamma, \qquad b \equiv |k_L^1 + k_R^1|, \qquad c \equiv |k_L^2 + k_R^1| \tag{4.27}
\]
in terms of the free parameters $k_L^1, k_L^2, k_R^1 \in \mathbb{Z}$ and $\gamma \in \mathbb{Z}_{\ge 0}$ determined by Lemma 4.1.

Remark. By varying $\gamma, k_L^1, k_L^2, k_R^1$, the coupling parameters $a, b, c$ in (1.4) can take arbitrary non-negative integer values. As further discussed in Sec. 5, Proposition 4.3 follows also from the results of Oblomkov [19].
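The substitution step in the proof of Lemma 4.2 rests on the identities $\frac{1}{\sin^2(x/2)} + \frac{1}{\cos^2(x/2)} = \frac{4}{\sin^2 x}$ and $\frac{1}{\cos^2 q} = \frac{4}{\sin^2(2q)} - \frac{1}{\sin^2 q}$. As a numerical cross-check, the sketch below inserts the scalars (4.24) and (4.25) into the first four sums of (3.77), taking $\dim(\mathcal{M}) = n$ as in Case I, and compares the result with the right-hand side of (4.18) at a sample point $q$. It assumes the reading of (3.77) and (4.18) used in this section; the function names are hypothetical.

```python
import math

def third_term_from_3_77(q, kL1, kL2, kR1, gamma):
    """First four sums of (3.77) with the scalars (4.24), (4.25) inserted."""
    n = len(q)
    L_sq = -(kL1 + kL2) ** 2              # rho'(L_hat_j)^2 on V^K
    V_sq = -(kL1 + kR1) ** 2              # rho'(V^i_{2 eps_j})^2
    W_sq = -(kL2 + kR1) ** 2              # rho'(W^i_{2 eps_j})^2
    pair = -0.5 * gamma * (gamma + 1)     # (4.25), same for V and W
    total = 0.5 * n * L_sq
    for qj in q:
        total += 0.5 * (V_sq / math.sin(qj) ** 2 + W_sq / math.cos(qj) ** 2)
    for k in range(n):
        for l in range(k + 1, n):
            for x in (q[k] - q[l], q[k] + q[l]):
                total += 0.5 * pair * (1 / math.sin(x / 2) ** 2
                                       + 1 / math.cos(x / 2) ** 2)
    return total

def third_term_4_18(q, kL1, kL2, kR1, gamma):
    """Right-hand side of (4.18)."""
    n = len(q)
    total = -0.5 * n * (kL1 + kL2) ** 2
    for k in range(n):
        for l in range(k + 1, n):
            total -= gamma * (gamma + 1) * (1 / math.sin(q[k] - q[l]) ** 2
                                            + 1 / math.sin(q[k] + q[l]) ** 2)
    for qj in q:
        total -= 0.5 * ((kL1 + kR1) ** 2 - (kL2 + kR1) ** 2) / math.sin(qj) ** 2
        total -= 2 * (kL2 + kR1) ** 2 / math.sin(2 * qj) ** 2
    return total
```

The two functions agree up to rounding for generic $q$ with $\pi/2 > q_1 > \dots > q_n > 0$, which is the numerical counterpart of the "obvious trigonometric identities" invoked in the proof.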

4.2. Case II: m = r = n + 1, s = n, N = 2n + 1

In this case $\sigma_L = \sigma_R = \theta_{n+1,n}$ and correspondingly $U(N)_L = U(N)_R \cong U(n + 1) \times U(n)$. We consider the following ansatz for the UIRREP $(\rho, V)$ of the symmetry group $G$ (4.1),
\[
\rho := \left(\rho^{(n+1)}_{(k_L^1, a_1\varpi_1)} \boxtimes \rho^{(n)}_{(k_L^2, 0)}\right) \boxtimes \left(\rho^{(n+1)}_{(k_R^1, 0)} \boxtimes \rho^{(n)}_{(k_R^2, 0)}\right), \tag{4.28}
\]
where $a_1 \in \mathbb{Z}_{\ge 0}$, $k_L^1, k_L^2, k_R^1, k_R^2 \in \mathbb{Z}$ and the carrier space is identified as $V \equiv V^{(n+1)}_{a_1\varpi_1}$. Similarly to (4.9), any $X \in \mathcal{G} \cong u(N)^{\sigma_L,+} \oplus u(N)^{\sigma_R,+}$ can be realized as a pair $X = (X_L, X_R)$ with $X_L, X_R \in u(N)^{\sigma_L,+} = u(N)^{\sigma_R,+} \cong u(n + 1) \oplus u(n)$. So, we


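write $X \in \mathcal{G}$ as $X = (X_L, X_R) = ((X_L^1, X_L^2), (X_R^1, X_R^2))$ with $X_L^1, X_R^1 \in u(n + 1)$, $X_L^2, X_R^2 \in u(n)$. Then (4.28) implies the formula
\[
\rho'(X) = \pi^{(n+1)}_{a_1\varpi_1}\left(X_L^1 - \frac{\mathrm{tr}(X_L^1)}{n+1} 1_{n+1}\right) + \left[\left(k_L^1 + \frac{\mu_{n+1}(a_1\varpi_1)}{n+1}\right)\mathrm{tr}(X_L^1) + k_L^2\,\mathrm{tr}(X_L^2) + k_R^1\,\mathrm{tr}(X_R^1) + k_R^2\,\mathrm{tr}(X_R^2)\right]\mathrm{Id}_V. \tag{4.29}
\]

Lemma 4.4. The KKS ansatz (4.28) yields admissible UIRREPS of $G$ if and only if $\exists\, \gamma, \tilde\gamma \in \mathbb{Z}_{\ge 0}$ such that the parameters $k_L^1, k_L^2, k_R^1, k_R^2 \in \mathbb{Z}$ and $a_1 \in \mathbb{Z}_{\ge 0}$ satisfy the conditions
\[
a_1 = \gamma n + \tilde\gamma, \qquad k_L^2 + k_R^2 = \tilde\gamma - \gamma, \qquad k_L^1 + k_R^1 = R - (\tilde\gamma - \gamma), \tag{4.30}
\]
where $\tilde\gamma - \gamma = Q + (n + 1)R$ with uniquely determined $Q = Q(\gamma, \tilde\gamma) \in \{0, 1, \dots, n\}$ and $R = R(\gamma, \tilde\gamma) \in \mathbb{Z}$. If these conditions hold, then $\dim(V^K) = 1$ and $V^K$ is given by
\[
V^K \cong V^{(n+1)}_{a_1\varpi_1}[\gamma e_1 + \gamma e_2 + \dots + \gamma e_n + \tilde\gamma e_{n+1}] = \mathrm{span}_{\mathbb{C}}\{|\gamma, \gamma, \dots, \gamma, \tilde\gamma\rangle\}, \tag{4.31}
\]
where the last equality refers to the bosonic oscillator realization of $V^{(n+1)}_{a_1\varpi_1}$.

Proof. For the isotropy subalgebra we have $\mathcal{K} = \mathcal{M}_{\mathrm{diag}} = \{X = (X_0, X_0) \mid X_0 \in \mathcal{M}\}$, where
\[
\mathcal{M} = \left\{ X_0 = i\begin{pmatrix} D & 0 & 0 \\ 0 & \omega & 0 \\ 0 & 0 & D \end{pmatrix} \,\middle|\, D = \mathrm{diag}(d_1, d_2, \dots, d_n) \in \mathbb{R}^{n\times n},\ \omega \in \mathbb{R} \right\}. \tag{4.32}
\]
So, for any $X \in \mathcal{K}$ we have $X_L = X_R = X_0$, and
\[
X_L^1 = X_R^1 = i\begin{pmatrix} D & 0 \\ 0 & \omega \end{pmatrix}, \qquad X_L^2 = X_R^2 = iD. \tag{4.33}
\]
Now, for each $\varphi = (\varphi_1, \varphi_2, \dots, \varphi_n) \in \mathbb{R}^n$ we let $\bar\varphi := \sum_{j=1}^{n}\varphi_j$, and consider the traceless Cartan elements
\[
H_\varphi := \mathrm{diag}(\varphi_1, \varphi_2, \dots, \varphi_n, -\bar\varphi) \in \mathcal{H}_{\mathbb{R}}^{(n+1)}, \qquad
\tilde H_\varphi := \mathrm{diag}(\varphi_1, \varphi_2, \dots, \varphi_n) - \frac{1}{n}\bar\varphi\, 1_n \in \mathcal{H}_{\mathbb{R}}^{(n)}. \tag{4.34}
\]
Then the components of $X \in \mathcal{K}$ can be parametrized as
\[
X_L^1 = X_R^1 = iH_\varphi + ix 1_{n+1}, \qquad X_L^2 = X_R^2 = i\tilde H_\varphi + i\left(x + \frac{1}{n}\bar\varphi\right) 1_n, \tag{4.35}
\]
where $\varphi \in \mathbb{R}^n$ and $x \in \mathbb{R}$. From (4.29), it follows that $\forall v \in V^{(n+1)}_{a_1\varpi_1}$ we have
\[
\rho'(X)v = \pi^{(n+1)}_{a_1\varpi_1}(iH_\varphi)v + i(k_L^2 + k_R^2)\bar\varphi\, v + ix\left(\mu_{n+1}(a_1\varpi_1) + (n + 1)(k_L^1 + k_R^1) + n(k_L^2 + k_R^2)\right)v. \tag{4.36}
\]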


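Clearly $\rho'(X)v = 0$ ($\forall X \in \mathcal{K}$) if and only if
\[
\pi^{(n+1)}_{a_1\varpi_1}(H_\varphi)v = -(k_L^2 + k_R^2)\bar\varphi\, v \qquad (\forall\, \varphi \in \mathbb{R}^n), \tag{4.37}
\]
and $\mu_{n+1}(a_1\varpi_1) + (n + 1)(k_L^1 + k_R^1) + n(k_L^2 + k_R^2) = 0$. Note that $\bar\varphi = \sum_{j=1}^{n}\varphi_j = \sum_{j=1}^{n} e_j(H_\varphi)$, so after introducing the shorthand notations
\[
\kappa_1 := k_L^1 + k_R^1 \in \mathbb{Z} \quad\text{and}\quad \kappa_2 := k_L^2 + k_R^2 \in \mathbb{Z}, \tag{4.38}
\]
we conclude that
\[
V^K = V^{\mathcal{K}} \cong V^{(n+1)}_{a_1\varpi_1}\left[-\kappa_2\sum_{j=1}^{n} e_j\right], \tag{4.39}
\]
provided that $\mu_{n+1}(a_1\varpi_1) + (n + 1)\kappa_1 + n\kappa_2 = 0$. Our next goal is to identify the weight space $V^{(n+1)}_{a_1\varpi_1}[-\kappa_2(e_1 + e_2 + \dots + e_n)]$. Recall that $-\kappa_2(e_1 + e_2 + \dots + e_n) \in W^{(n+1)}_{a_1\varpi_1}$ if and only if $\exists\,(l_1, l_2, \dots, l_{n+1}) \in \mathbb{Z}^{n+1}_{\ge 0}$ with $l_1 + l_2 + \dots + l_{n+1} = a_1$, such that
\[
-\kappa_2(e_1 + e_2 + \dots + e_n) = \sum_{j=1}^{n+1} l_j e_j = \sum_{j=1}^{n}(l_j - l_{n+1}) e_j. \tag{4.40}
\]
Since the functionals $e_1, e_2, \dots, e_n$ are linearly independent, we end up with the requirement $l_1 = l_2 = \dots = l_n = l_{n+1} - \kappa_2$. For the free parameters we choose $\gamma := l_1$ and $\tilde\gamma := l_{n+1}$, then the parameters $\kappa_2 = k_L^2 + k_R^2$ and $a_1$ have to obey the equations $\kappa_2 = \tilde\gamma - \gamma$ and $a_1 = \gamma n + \tilde\gamma$. Note that under these assumptions we have
\[
V^{(n+1)}_{a_1\varpi_1}[-\kappa_2(e_1 + e_2 + \dots + e_n)] = V^{(n+1)}_{a_1\varpi_1}[\gamma e_1 + \gamma e_2 + \dots + \gamma e_n + \tilde\gamma e_{n+1}] = \mathrm{span}_{\mathbb{C}}\{|\gamma, \gamma, \dots, \gamma, \tilde\gamma\rangle\}. \tag{4.41}
\]
Now let us express the value of the label $\mu_{n+1}(a_1\varpi_1) \in \{0, 1, \dots, n\}$ in terms of $\gamma$ and $\tilde\gamma$. Recalling (4.2), we can write
\[
\mu_{n+1}(a_1\varpi_1) = \mu_{n+1}((\gamma n + \tilde\gamma)\varpi_1) \equiv \gamma n + \tilde\gamma \equiv \tilde\gamma - \gamma \pmod{n + 1}. \tag{4.42}
\]
Notice that $\exists!\, Q = Q(\gamma, \tilde\gamma) \in \{0, 1, \dots, n\}$ and $\exists!\, R = R(\gamma, \tilde\gamma) \in \mathbb{Z}$ such that $\tilde\gamma - \gamma = Q + (n + 1)R$, whereby the previous congruence relation translates into the equation $\mu_{n+1}(a_1\varpi_1) = Q$. Plugging this equation into the requirement $\mu_{n+1}(a_1\varpi_1) + (n + 1)\kappa_1 + n\kappa_2 = 0$, we get
\[
0 = Q + (n + 1)\kappa_1 + n(Q + (n + 1)R) = (n + 1)(\tilde\gamma - \gamma - R + \kappa_1), \tag{4.43}
\]
therefore we end up with the additional constraint $k_L^1 + k_R^1 = \kappa_1 = R - (\tilde\gamma - \gamma)$.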
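The bookkeeping in this proof is easy to replay by machine. The sketch below takes the free parameters $\gamma, \tilde\gamma$ and $n$, computes $a_1, Q, R, \kappa_1, \kappa_2$ according to (4.30), and verifies both $\mu_{n+1}(a_1\varpi_1) = Q$ and the condition $\mu_{n+1}(a_1\varpi_1) + (n + 1)\kappa_1 + n\kappa_2 = 0$. It uses that Python's `divmod` returns a remainder in $\{0, \dots, n\}$ also for negative arguments; the function names are hypothetical.

```python
def case2_parameters(n, gamma, gamma_tilde):
    """Return (a1, Q, R, kappa1, kappa2) as prescribed by Lemma 4.4."""
    a1 = gamma * n + gamma_tilde
    # gamma_tilde - gamma = Q + (n + 1) * R with 0 <= Q <= n
    R, Q = divmod(gamma_tilde - gamma, n + 1)
    kappa2 = gamma_tilde - gamma          # k_L^2 + k_R^2
    kappa1 = R - (gamma_tilde - gamma)    # k_L^1 + k_R^1
    return a1, Q, R, kappa1, kappa2

def case2_is_admissible(n, gamma, gamma_tilde):
    """Check mu_{n+1}(a1*varpi_1) == Q and the vanishing U(1) charge."""
    a1, Q, R, kappa1, kappa2 = case2_parameters(n, gamma, gamma_tilde)
    mu = a1 % (n + 1)                     # (4.2) for lambda = a1 * varpi_1
    return mu == Q and mu + (n + 1) * kappa1 + n * kappa2 == 0
```

For instance, $n = 2$, $\gamma = 3$, $\tilde\gamma = 1$ gives $a_1 = 7$, $Q = 1$, $R = -1$, $\kappa_1 = 1$ and $\kappa_2 = -2$.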


Observe from Lemma 4.4 that $k_R^1, k_R^2 \in \mathbb{Z}$ and $\gamma, \tilde\gamma \in \mathbb{Z}_{\ge 0}$ can be taken as free parameters that label the admissible cases of the KKS ansatz (4.28). Proceeding as in Sec. 4.1, it is a matter of straightforward substitutions to specialize the reduced Laplacian (2.14) to our case. In this way we find the following result.

Proposition 4.5. Under the KKS ansatz (4.28) with parameters satisfying (4.30) the Laplace operator of $U(N)$ reduces to
\[
-\Delta_{\mathrm{red}} = H_{BC_n} + \frac{1}{2} n (k_R^1 + k_R^2)^2 + (k_R^1)^2 - \frac{1}{3} n (n + 1)(2n + 1), \tag{4.44}
\]
where $H_{BC_n}$ is given by (1.4) with the coupling parameters determined in terms of the arbitrary parameters $k_R^1, k_R^2 \in \mathbb{Z}$ and $\gamma, \tilde\gamma \in \mathbb{Z}_{\ge 0}$ according to
\[
a \equiv \gamma, \qquad b \equiv \gamma + \tilde\gamma + 1, \qquad c \equiv |\tilde\gamma - \gamma + k_R^1 - k_R^2|. \tag{4.45}
\]

Remark. The non-negative integer coupling parameters $a, b, c$ that arise in this case satisfy the condition $b \ge a + 1$.

4.3. Case III: m = n + 2, r = s = n + 1, N = 2n + 2

Now the fixpoint subgroups of the two different involutions $\sigma_L = \theta_{n+1,n+1}$ and $\sigma_R = \theta_{n+2,n}$ are $U(N)_L \cong U(n + 1) \times U(n + 1)$ and $U(N)_R \cong U(n + 2) \times U(n)$. We consider the reductions associated with UIRREPS $(\rho, V)$ of $G$ (4.1) having the form
\[
\rho := \left(\rho^{(n+1)}_{(k_L^1, 0)} \boxtimes \rho^{(n+1)}_{(k_L^2, 0)}\right) \boxtimes \left(\rho^{(n+2)}_{(k_R^1, a_1\varpi_1)} \boxtimes \rho^{(n)}_{(k_R^2, 0)}\right), \tag{4.46}
\]

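where $a_1 \in \mathbb{Z}_{\ge 0}$ and $k_L^1, k_L^2, k_R^1, k_R^2 \in \mathbb{Z}$, and the representation space is identified as $V \equiv V^{(n+2)}_{a_1\varpi_1}$. Any $X \in \mathcal{G}$ is a pair $X = (X_L, X_R)$ with $X_L \in u(n + 1) \oplus u(n + 1)$ and $X_R \in u(n + 2) \oplus u(n)$, and we may further write $X_L = (X_L^1, X_L^2)$ and $X_R = (X_R^1, X_R^2)$, where now $X_L^1, X_L^2 \in u(n + 1)$, $X_R^1 \in u(n + 2)$ and $X_R^2 \in u(n)$. Then the $\mathcal{G}$-representation can be written as
\[
\rho'(X) = \pi^{(n+2)}_{a_1\varpi_1}\left(X_R^1 - \frac{\mathrm{tr}(X_R^1)}{n+2} 1_{n+2}\right) + \left[k_L^1\,\mathrm{tr}(X_L^1) + k_L^2\,\mathrm{tr}(X_L^2) + \left(k_R^1 + \frac{\mu_{n+2}(a_1\varpi_1)}{n+2}\right)\mathrm{tr}(X_R^1) + k_R^2\,\mathrm{tr}(X_R^2)\right]\mathrm{Id}_V. \tag{4.47}
\]

Lemma 4.6. The KKS ansatz (4.46) yields admissible UIRREPS if and only if $\exists\, \gamma, \tilde\gamma, \hat\gamma \in \mathbb{Z}_{\ge 0}$ and $k \in \mathbb{Z}$ such that the parameters $k_L^1, k_L^2, k_R^1, k_R^2 \in \mathbb{Z}$ and $a_1 \in \mathbb{Z}_{\ge 0}$ satisfy the conditions
\[
a_1 = \gamma n + \tilde\gamma + \hat\gamma, \qquad k_L^1 = k, \qquad k_L^2 = \tilde\gamma - \hat\gamma + k, \qquad k_R^1 = R - \tilde\gamma - k, \qquad k_R^2 = \hat\gamma - \gamma - k, \tag{4.48}
\]
where $a_1 = Q + (n + 2)R$ with uniquely determined $Q = Q(\gamma, \tilde\gamma, \hat\gamma) \in \{0, 1, \dots, n + 1\}$ and $R = R(\gamma, \tilde\gamma, \hat\gamma) \in \mathbb{Z}$. If the above conditions are met, then $\dim(V^K) = 1$ and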


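concretely
\[
V^K = V^{(n+2)}_{a_1\varpi_1}[\gamma e_1 + \gamma e_2 + \dots + \gamma e_n + \tilde\gamma e_{n+1} + \hat\gamma e_{n+2}] = \mathrm{span}_{\mathbb{C}}\{|\gamma, \gamma, \dots, \gamma, \tilde\gamma, \hat\gamma\rangle\}, \tag{4.49}
\]
where the last equality refers to the bosonic oscillator realization of $V^{(n+2)}_{a_1\varpi_1}$.

Proof. For the isotropy subalgebra we have $\mathcal{K} = \mathcal{M}_{\mathrm{diag}} = \{X = (X_0, X_0) \mid X_0 \in \mathcal{M}\}$, where
\[
\mathcal{M} = \left\{ X_0 = i\begin{pmatrix} D & 0 & 0 & 0 \\ 0 & \omega & 0 & 0 \\ 0 & 0 & \tilde\omega & 0 \\ 0 & 0 & 0 & D \end{pmatrix} \,\middle|\, D = \mathrm{diag}(d_1, d_2, \dots, d_n) \in \mathbb{R}^{n\times n},\ \omega, \tilde\omega \in \mathbb{R} \right\}. \tag{4.50}
\]
Any $X = (X_L, X_R) \in \mathcal{K}$ satisfies $X_L = X_R = X_0$, and therefore it has the components
\[
X_L^1 = i\begin{pmatrix} D & 0 \\ 0 & \omega \end{pmatrix}, \qquad
X_L^2 = i\begin{pmatrix} \tilde\omega & 0 \\ 0 & D \end{pmatrix}, \qquad
X_R^1 = i\begin{pmatrix} D & 0 & 0 \\ 0 & \omega & 0 \\ 0 & 0 & \tilde\omega \end{pmatrix}, \qquad
X_R^2 = iD. \tag{4.51}
\]
For any real $(n + 1)$-tuple $\varphi = (\varphi_1, \varphi_2, \dots, \varphi_{n+1}) \in \mathbb{R}^{n+1}$ we let $\bar\varphi := \sum_{j=1}^{n+1}\varphi_j$, $\tilde\varphi := \sum_{j=1}^{n}\varphi_j$, and introduce the traceless matrices
\[
H_\varphi := \mathrm{diag}(\varphi_1, \varphi_2, \dots, \varphi_{n+1}, -\bar\varphi), \qquad
H_R^2 := \mathrm{diag}(\varphi_1, \varphi_2, \dots, \varphi_n) - \frac{\tilde\varphi}{n} 1_n, \tag{4.52}
\]
\[
H_L^1 := \mathrm{diag}(\varphi_1, \varphi_2, \dots, \varphi_{n+1}) - \frac{\bar\varphi}{n+1} 1_{n+1}, \qquad
H_L^2 := \mathrm{diag}(-\bar\varphi, \varphi_1, \dots, \varphi_n) + \frac{\varphi_{n+1}}{n+1} 1_{n+1}. \tag{4.53}
\]
We then write the components of $X \in \mathcal{K}$ in the form
\[
X_R^1 = iH_\varphi + ix 1_{n+2}, \qquad X_R^2 = iH_R^2 + i\left(x + \frac{\tilde\varphi}{n}\right) 1_n, \tag{4.54}
\]
\[
X_L^1 = iH_L^1 + i\left(x + \frac{\bar\varphi}{n+1}\right) 1_{n+1}, \qquad X_L^2 = iH_L^2 + i\left(x - \frac{\varphi_{n+1}}{n+1}\right) 1_{n+1}. \tag{4.55}
\]
From (4.47), it follows that for any $v \in V^{(n+2)}_{a_1\varpi_1}$ and $X \in \mathcal{K}$ we have
\[
\rho'(X)v = \pi^{(n+2)}_{a_1\varpi_1}(iH_\varphi)v + i(k_L^1\bar\varphi - k_L^2\varphi_{n+1} + k_R^2\tilde\varphi)v + ix\left(\mu_{n+2}(a_1\varpi_1) + (n + 2)k_R^1 + (n + 1)(k_L^1 + k_L^2) + n k_R^2\right)v. \tag{4.56}
\]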


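Clearly $\rho'(X)v = 0$ ($\forall X \in \mathcal{K}$) if and only if
\[
\pi^{(n+2)}_{a_1\varpi_1}(H_\varphi)v = (k_L^2\varphi_{n+1} - k_L^1\bar\varphi - k_R^2\tilde\varphi)v \qquad (\forall\, \varphi \in \mathbb{R}^{n+1}), \tag{4.57}
\]
and
\[
\mu_{n+2}(a_1\varpi_1) + (n + 2)k_R^1 + (n + 1)(k_L^1 + k_L^2) + n k_R^2 = 0. \tag{4.58}
\]
Since
\[
k_L^2\varphi_{n+1} - k_L^1\bar\varphi - k_R^2\tilde\varphi = -(k_L^1 + k_R^2)(e_1 + e_2 + \dots + e_n)(H_\varphi) + (k_L^2 - k_L^1)\,e_{n+1}(H_\varphi), \tag{4.59}
\]
we obtain from (4.57) that we must have
\[
V^K = V^{(n+2)}_{a_1\varpi_1}[-(k_L^1 + k_R^2)(e_1 + e_2 + \dots + e_n) + (k_L^2 - k_L^1)e_{n+1}]. \tag{4.60}
\]
It is easy to see (cf. the Appendix) that the weight space in (4.60) is non-trivial if and only if $\exists\,(l_1, l_2, \dots, l_{n+2}) \in \mathbb{Z}^{n+2}_{\ge 0}$ with $l_1 + l_2 + \dots + l_{n+2} = a_1$, such that
\[
-(k_L^1 + k_R^2)(e_1 + e_2 + \dots + e_n) + (k_L^2 - k_L^1)e_{n+1} = \sum_{j=1}^{n+1}(l_j - l_{n+2})e_j. \tag{4.61}
\]
We set
\[
\gamma := l_1, \qquad \tilde\gamma := l_{n+1}, \qquad \hat\gamma := l_{n+2}, \qquad k := k_L^1. \tag{4.62}
\]
Then (4.61) requires $l_1 = l_2 = \dots = l_n = \gamma$ and $\hat\gamma - \gamma = k + k_R^2$ with $\tilde\gamma - \hat\gamma = k_L^2 - k$. So, regarding $\gamma, \tilde\gamma, \hat\gamma \in \mathbb{Z}_{\ge 0}$ and $k \in \mathbb{Z}$ as free parameters, we see that the other parameters have to obey the relations
\[
k_L^2 = \tilde\gamma - \hat\gamma + k, \qquad k_R^2 = \hat\gamma - \gamma - k, \qquad a_1 = \gamma n + \tilde\gamma + \hat\gamma. \tag{4.63}
\]
To satisfy the remaining condition (4.58), we now define $Q = Q(\gamma, \tilde\gamma, \hat\gamma) \in \{0, 1, \dots, n + 1\}$ and $R = R(\gamma, \tilde\gamma, \hat\gamma) \in \mathbb{Z}$ by the equality
\[
a_1 = \gamma n + \tilde\gamma + \hat\gamma = Q + (n + 2)R. \tag{4.64}
\]
Since $\mu_{n+2}(a_1\varpi_1) = Q$ by (4.2), the condition (4.58) then translates into $k_R^1 = R - \tilde\gamma - k$, which completes the proof.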
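As in Case II, the parameter relations of Lemma 4.6 can be verified mechanically. The following sketch builds $a_1, Q, R$ and the four integers $k_L^1, k_L^2, k_R^1, k_R^2$ from the free parameters $(\gamma, \tilde\gamma, \hat\gamma, k)$ via (4.48), and then checks the two weight conditions read off from (4.61) together with the charge condition (4.58). The dictionary keys and function names are hypothetical labels introduced for this illustration.

```python
def case3_parameters(n, gamma, gamma_tilde, gamma_hat, k):
    """Parameters of Lemma 4.6 in terms of the free data (4.62)."""
    a1 = gamma * n + gamma_tilde + gamma_hat
    R, Q = divmod(a1, n + 2)              # a1 = Q + (n + 2) * R, 0 <= Q <= n + 1
    return {
        "a1": a1, "Q": Q, "R": R,
        "kL1": k,
        "kL2": gamma_tilde - gamma_hat + k,
        "kR1": R - gamma_tilde - k,
        "kR2": gamma_hat - gamma - k,
    }

def case3_conditions_hold(n, gamma, gamma_tilde, gamma_hat, k):
    """Check the weight conditions from (4.61) and the condition (4.58)."""
    p = case3_parameters(n, gamma, gamma_tilde, gamma_hat, k)
    mu = p["a1"] % (n + 2)                # mu_{n+2}(a1 * varpi_1), equals Q
    weight_ok = (gamma_hat - gamma == p["kL1"] + p["kR2"]
                 and gamma_tilde - gamma_hat == p["kL2"] - p["kL1"])
    charge_ok = (mu + (n + 2) * p["kR1"]
                 + (n + 1) * (p["kL1"] + p["kL2"])
                 + n * p["kR2"]) == 0
    return mu == p["Q"] and weight_ok and charge_ok
```

Looping over small values of $n, \gamma, \tilde\gamma, \hat\gamma$ and $k$ confirms that (4.58) holds identically once $k_R^1 = R - \tilde\gamma - k$ is imposed.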

Further direct calculations yield the explicit form of the reduced Laplacian (2.14).

Proposition 4.7. Under the KKS ansatz (4.46) parametrized by arbitrary $\gamma, \tilde\gamma, \hat\gamma \in \mathbb{Z}_{\ge 0}$ and $k \in \mathbb{Z}$ according to Lemma 4.6, the reduced Laplacian of $U(N)$ satisfies


$-\Delta_{\mathrm{red}} = H_{BC_n} + C$ with the constant
\[
C = -\frac{1}{6} n (4n^2 + 12n + 11) + \frac{1}{2} n (2k + \tilde\gamma - \hat\gamma)^2 + (\tilde\gamma + k)(\tilde\gamma + k + 1) + (\hat\gamma - k)(\hat\gamma - k + 1) \tag{4.65}
\]
and coupling parameters given in the notation (1.4) by
\[
a \equiv \gamma, \qquad b \equiv \gamma + \tilde\gamma + 1, \qquad c \equiv \gamma + \hat\gamma + 1. \tag{4.66}
\]

Remark. The integer coupling parameters $a, b, c$ arising in this case satisfy $b, c \ge a + 1$.

5. Discussion

We here summarize the results, discuss the related work [19] and point out open problems.

In this paper, we applied the formalism of quantum Hamiltonian reduction under polar group actions to study the reductions of the Laplace operator of $U(N)$ by means of the Hermann action (3.2) of the symmetry group $G = (U(r) \times U(s)) \times (U(m) \times U(n))$ with $N = m + n = r + s$. We concentrated on the three series of cases for which the centralizer of the corresponding section, the group $K = M_{\mathrm{diag}}$ (3.42), is Abelian. We built the representation $(\rho, V)$ of the symmetry group that enters the definition of the reduction by using as building blocks in (4.5) 1-dimensional representations and a symmetric power of the defining representation of the "largest" factor of $G$. In the framework of this "KKS ansatz" we determined all cases for which the reduction is consistent (that is, $\dim(V^K) \neq 0$), and saw also that in these admissible cases $\dim(V^K) = 1$. We then calculated the explicit formula of the reduced Laplacian by specializing Eq. (2.14), and found that up to an additive constant it yields the $BC_n$ Sutherland Hamiltonian (1.4) with coupling parameters given as follows:
• Case I: $a, b, c \in \mathbb{Z}_{\ge 0}$,
• Case II: $a, b, c \in \mathbb{Z}_{\ge 0}$ with $b \ge a + 1$,
• Case III: $a, b, c \in \mathbb{Z}_{\ge 0}$ with $b, c \ge a + 1$.
The dependence of the additive constant and of the coupling parameters $a, b, c$ on the parameters of the respective representation $(\rho, V)$ is given by the three propositions formulated in Sec. 4.

The above results show that Case I, which is the simplest case, covers all integral values of the coupling parameters $a, b, c$, and the other two cases allow for alternative group theoretic descriptions of the $BC_n$ model at proper subsets of the integral coupling parameters. This state of affairs could not be foreseen before performing the analysis of the different reduction schemes.

Observe also that if $b = c$, then the Hamiltonian (1.4) becomes of type $C_n$, but the $B_n$ and $D_n$ type Sutherland models do not arise from (1.4) at any values of the integers $a, b, c$. This is in contrast with


the corresponding classical Hamiltonian reduction [20], which covers all coupling constants of the classical $BC_n$ model, and is due to the never vanishing second term of the "measure factor" given by (3.73). The measure factor represents a kind of quantum anomaly since it gives the difference between the naive quantization of the reduced classical Hamiltonian and the outcome of the corresponding quantum Hamiltonian reduction [21].

In Case I, our analysis is consistent with the results of Oblomkov [19], who studied reductions of the Laplace operator of $GL(m + n, \mathbb{C})$ using the symmetry group
\[
G^{\mathbb{C}} := (GL(m, \mathbb{C}) \times GL(n, \mathbb{C})) \times (GL(m, \mathbb{C}) \times GL(n, \mathbb{C})), \qquad m \ge n. \tag{5.1}
\]
In fact, in Case I, our reduction is nothing but the compact real form of the reduction studied in [19] for $m = n$. For the $m > n$ cases of the symmetry group (5.1) a generalization of the KKS ansatz was employed in [19], which was found to yield the complex version of the $BC_n$ Sutherland Hamiltonian (1.4) with integer coupling parameters subject to the restriction $c \ge b - (m - n) \ge 0$. Thus the coupling parameters obtained for $m > n$ form a proper subset of those obtained for $m = n$, and this proper subset is different from those that we derived in our Cases II and III. For clarity we note that the KKS ansatz (4.28) that we adopted in Case II was motivated by the corresponding classical reduction [20], and it does not correspond to the ansatz used in [19] for $m - n = 1$. It is not clear to us how the classical analogues of the $m > n$ reductions of [19] work.

Of course, the reductions can be applied also to the differential operators associated with the higher Casimirs. This can be used to explain the complete integrability of the $BC_n$ Sutherland model and to derive the spectra as well as the form of the joint eigenfunctions of the corresponding commuting Hamiltonians at the pertinent values of the coupling constants from representation theory [19].

We stress that the general method that we applied in our analysis can be used also to study other problems in the future. For example, one may try to determine all possible values of the coupling constants of the Sutherland models (1.1) that may result as reductions of the Laplacian of a compact Lie group in general. This is closely related to the open problem concerning the classification of the Hermann actions and representations $(\rho, V)$ of symmetric subgroups $G$ (3.1) such that the condition $\dim(V^K) = 1$ holds for the centralizer $K < G$ of the section. In all such cases the reduced Laplace operator (2.14) is expected to provide a many-body model that can be solved by the group theoretic method because of its very origin.

Besides the trigonometric real form that we considered, the complex $BC_n$ Sutherland model admits the well-known hyperbolic real form and other physically very different real forms associated with two types of particles [27, 28]. The derivation of the hyperbolic model by quantum Hamiltonian reduction can be done similarly to the present work, but starting from $U(n, n)$ instead of $U(2n)$ (in Case I) and taking the Cartan involution both for $\sigma_L$ and for $\sigma_R$ (see also [20]). The models with two types of particles pose a more difficult problem. At the classical level, it


can be seen from [28] that to derive them one needs to take the Cartan involution of $U(n, n)$ for $\sigma_L$ and a different involution for $\sigma_R$ that has a non-compact fixpoint subgroup. Therefore the corresponding quantum Hamiltonian reduction would require some modifications of the method used in this paper, which need further investigation.

Acknowledgments

We thank J. Balog for useful comments on the manuscript. This work was supported by the Hungarian Scientific Research Fund (OTKA) under the grant K 77400.

Appendix. Some Representation Theoretic Facts

In this Appendix, we gather some basic facts in order to fix the notations used in Sec. 4.

A.1. On the UIRREPS of SU(n) and U(n)

Since the Lie group $SU(n)$ is compact, connected and simply-connected, there is a one-to-one correspondence between the UIRREPS $(\Pi, V)$ of $SU(n)$ and the finite dimensional complex IRREPS $(\pi, V)$ of $sl(n, \mathbb{C}) = su(n)^{\mathbb{C}}$. In the complex simple Lie algebra $sl(n, \mathbb{C})$ we have the Cartan subalgebra $\mathcal{H}$ consisting of diagonal matrices, and use also the real Cartan subalgebra
\[
\mathcal{H}_{\mathbb{R}} := \{H \mid H \in sl(n, \mathbb{C}),\ H \text{ is diagonal with real entries}\} \subset \mathcal{H}. \tag{A.1}
\]
The functionals $\{e_i\}_{i=1}^{n} \subset \mathcal{H}^*$ are defined by the formula $e_i(H) := H_{ii}$ ($H \in \mathcal{H}$). The roots with respect to $\mathcal{H}$ form the set $R := \{e_i - e_j \mid 1 \le i, j \le n,\ i \neq j\} \subset \mathcal{H}^*$ and we fix the root vectors $E_{e_i - e_j} := E_{ij}$. The set of positive roots is $R_+ := \{e_i - e_j \mid 1 \le i < j \le n\}$ and the simple roots are $\alpha_i := e_i - e_{i+1}$ ($1 \le i \le n - 1$). Let $\varpi_i = \sum_{k=1}^{i} e_k \in \mathcal{H}^*$ ($1 \le i \le n - 1$) denote the fundamental weights. The equivalence classes of the IRREPS of $sl(n, \mathbb{C})$ can be uniquely labeled by the highest (dominant integral) weights, which are the elements of
\[
P_+(SU(n)) = \{a_1\varpi_1 + a_2\varpi_2 + \dots + a_{n-1}\varpi_{n-1} \mid a_1, a_2, \dots, a_{n-1} \in \mathbb{Z}_{\ge 0}\} \cong \mathbb{Z}^{n-1}_{\ge 0}. \tag{A.2}
\]
Now take an $sl(n, \mathbb{C})$ IRREP $(\pi_\lambda, V_\lambda)$ of highest weight $\lambda \in P_+(SU(n))$. To any linear functional $\nu \in \mathcal{H}^*$ we associate the weight space
\[
V_\lambda[\nu] := \bigcap_{H \in \mathcal{H}} \ker\left(\pi_\lambda(H) - \nu(H)\,\mathrm{Id}_{V_\lambda}\right) \subset V_\lambda, \tag{A.3}
\]
and we also define the set of weights $W_\lambda := \{\nu \mid \nu \in \mathcal{H}^*,\ V_\lambda[\nu] \neq \{0\}\}$. Then we have the weight space decomposition $V_\lambda = \bigoplus_{\nu \in W_\lambda} V_\lambda[\nu]$. Note that $\lambda \in W_\lambda$ and $\dim(V_\lambda[\lambda]) = 1$, so we can write $V_\lambda[\lambda] = \mathbb{C}v_\lambda$ with some highest weight vector $v_\lambda$. The characteristic property of the non-zero vector $v_\lambda$ is that $\pi_\lambda(E_\alpha)v_\lambda = 0$ holds


for all $\alpha \in R_+$. The IRREP $(\pi_\lambda, V_\lambda)$ of $sl(n, \mathbb{C})$ induces the UIRREP $(\Pi_\lambda, V_\lambda)$ of $SU(n)$ by the requirement $\Pi_\lambda(e^{X}) = e^{\pi_\lambda(X)}$ for all $X \in su(n)$. The corresponding scalar product on $V_\lambda$ can be defined by fixing the norm of $v_\lambda$ and requiring the anti-hermiticity of $\pi_\lambda(X)$ for all $X \in su(n)$.

The UIRREPS of $U(n)$ are usually parametrized by the set
\[
P_+(U(n)) = \{m = (m_1, m_2, \dots, m_n) \in \mathbb{Z}^n \mid m_1 \ge m_2 \ge \dots \ge m_n\}. \tag{A.4}
\]
The representation $\rho_m$ of $U(n)$ may be defined as the extension of the representation $\Pi_\lambda$ of $SU(n) < U(n)$ characterized by the properties
\[
\lambda = \sum_{i=1}^{n-1}(m_i - m_{i+1})\varpi_i \quad\text{and}\quad \rho_m(\xi 1_n) = \xi^{m_1 + \dots + m_n}\,\mathrm{Id}_{V_\lambda} \quad \forall\, \xi \in U(1). \tag{A.5}
\]

In the main text we use a slightly different parametrization by pairs $(k, \lambda) \in \mathbb{Z} \times P_+(SU(n))$. The correspondence is given by the relation $m_1 + \dots + m_n = \mu_n(\lambda) + kn$, as is seen from the comparison between (A.5) and (4.2) and (4.3).

A.2. On the bosonic oscillator realization of $(\pi_{m\varpi_1}, V_{m\varpi_1})$

Fix an integer $n \ge 2$ and to each $n$-tuple $(l_1, l_2, \dots, l_n) \in \mathbb{Z}^n_{\ge 0}$ associate a "symbol" $|l_1, l_2, \dots, l_n\rangle$. Let $F$ denote the complex vector space generated by these symbols,
\[
F := \bigoplus_{(l_1, l_2, \dots, l_n) \in \mathbb{Z}^n_{\ge 0}} \mathbb{C}\,|l_1, l_2, \dots, l_n\rangle. \tag{A.6}
\]
Endow $F$ with the scalar product $(\,\cdot\,, \cdot\,)$ for which the vectors $\{|l_1, l_2, \dots, l_n\rangle\}_{(l_1, l_2, \dots, l_n) \in \mathbb{Z}^n_{\ge 0}}$ satisfy
\[
\left(|l_1, l_2, \dots, l_n\rangle, |l'_1, l'_2, \dots, l'_n\rangle\right) = \delta_{l_1, l'_1}\delta_{l_2, l'_2}\cdots\delta_{l_n, l'_n}, \tag{A.7}
\]
and introduce the annihilation and creation operators $b_i$ and $b_i^{\dagger}$ ($1 \le i \le n$) on $F$ by
\[
b_i\,|l_1, l_2, \dots, l_n\rangle := \begin{cases} \sqrt{l_i}\,|l_1, l_2, \dots, l_i - 1, \dots, l_n\rangle & \text{if } l_i \ge 1, \\ 0 & \text{if } l_i = 0, \end{cases} \tag{A.8}
\]
\[
b_i^{\dagger}\,|l_1, l_2, \dots, l_n\rangle := \sqrt{l_i + 1}\,|l_1, l_2, \dots, l_i + 1, \dots, l_n\rangle. \tag{A.9}
\]
Then $b_i^{\dagger}$ is the adjoint of $b_i$, and one has the commutation relations
\[
[b_i, b_j] = 0, \qquad [b_i^{\dagger}, b_j^{\dagger}] = 0, \qquad [b_i, b_j^{\dagger}] = \delta_{i,j}\,\mathrm{Id}_F. \tag{A.10}
\]
The "bosonic Fock space" $F$ decomposes as the orthogonal direct sum $F = \bigoplus_{m \in \mathbb{Z}_{\ge 0}} F_m$ with
\[
F_m := \mathrm{span}_{\mathbb{C}}\{|l_1, l_2, \dots, l_n\rangle \mid (l_1, l_2, \dots, l_n) \in \mathbb{Z}^n_{\ge 0},\ l_1 + l_2 + \dots + l_n = m\}. \tag{A.11}
\]
Now consider the linear map $\psi\colon gl(n, \mathbb{C}) \to \mathrm{End}(F)$ defined on the standard basis $\{E_{ij}\}_{1 \le i, j \le n}$ of $gl(n, \mathbb{C})$ by
\[
\psi(E_{ij}) := b_i^{\dagger} b_j. \tag{A.12}
\]


Then $(\psi, F)$ is a representation of $gl(n, \mathbb{C})$ and the subspace $F_m$ is invariant under $\psi$. The map
\[
\psi_m\colon gl(n, \mathbb{C}) \to \mathrm{End}(F_m), \qquad X \mapsto \psi_m(X) := \psi(X)|_{F_m} \tag{A.13}
\]
provides a finite dimensional representation of the Lie algebra $gl(n, \mathbb{C})$. By restricting $\psi_m$ to the subalgebra $sl(n, \mathbb{C}) < gl(n, \mathbb{C})$, we end up with a finite dimensional representation $(\psi_m, F_m)$ of $sl(n, \mathbb{C})$. The set of weights of the representation $(\psi_m, F_m)$ is
\[
W_m := \left\{ \sum_{i=1}^{n} l_i e_i \,\middle|\, (l_1, l_2, \dots, l_n) \in \mathbb{Z}^n_{\ge 0},\ l_1 + l_2 + \dots + l_n = m \right\}, \tag{A.14}
\]
and the weight space $F_m[\nu] \subset F_m$ corresponding to the weight $\nu = \sum_{i=1}^{n} l_i e_i \in W_m$ takes the form
\[
F_m[l_1 e_1 + l_2 e_2 + \dots + l_n e_n] = \mathbb{C}\,|l_1, l_2, \dots, l_n\rangle. \tag{A.15}
\]
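The representation property of $\psi$ in (A.12), that is $[\psi(E_{ij}), \psi(E_{kl})] = \delta_{jk}\psi(E_{il}) - \delta_{li}\psi(E_{kj})$, can be tested numerically on the finite dimensional subspace $F_m$. In the sketch below a vector of $F_m$ is stored as a dictionary from occupation tuples to coefficients; this data layout, the 0-based indices and all function names are assumptions made for the illustration.

```python
import math

def compositions(n, m):
    """Yield all (l_1, ..., l_n) in Z_{>=0}^n with l_1 + ... + l_n = m."""
    if n == 1:
        yield (m,)
        return
    for first in range(m + 1):
        for rest in compositions(n - 1, m - first):
            yield (first,) + rest

def apply_E(i, j, vec):
    """Apply psi(E_ij) = b_i^dagger b_j to vec = {occupations: coefficient}."""
    out = {}
    for occ, c in vec.items():
        if occ[j] == 0:
            continue                       # b_j annihilates this basis state
        new = list(occ)
        amp = math.sqrt(new[j])            # action of b_j, cf. (A.8)
        new[j] -= 1
        amp *= math.sqrt(new[i] + 1)       # action of b_i^dagger, cf. (A.9)
        new[i] += 1
        key = tuple(new)
        out[key] = out.get(key, 0.0) + c * amp
    return out

def add_scaled(acc, vec, scale):
    """In place: acc += scale * vec."""
    for key, c in vec.items():
        acc[key] = acc.get(key, 0.0) + scale * c

def max_commutator_defect(n, m):
    """Largest coefficient of [E_ij, E_kl] - delta_jk E_il + delta_li E_kj on F_m."""
    worst = 0.0
    for occ in compositions(n, m):
        v = {occ: 1.0}
        for i in range(n):
            for j in range(n):
                for k in range(n):
                    for l in range(n):
                        diff = {}
                        add_scaled(diff, apply_E(i, j, apply_E(k, l, v)), 1.0)
                        add_scaled(diff, apply_E(k, l, apply_E(i, j, v)), -1.0)
                        if j == k:
                            add_scaled(diff, apply_E(i, l, v), -1.0)
                        if l == i:
                            add_scaled(diff, apply_E(k, j, v), 1.0)
                        for c in diff.values():
                            worst = max(worst, abs(c))
    return worst
```

The diagonal generators act as $\psi(E_{ii})|l_1, \dots, l_n\rangle = l_i|l_1, \dots, l_n\rangle$, so each basis state is a weight vector of weight $\sum_i l_i e_i$, in line with (A.15).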

Note that each weight space is 1-dimensional. The representation $(\psi_m, F_m)$ contains the (up to rescaling) unique highest weight vector $v_m := |m, 0, \dots, 0\rangle$, with weight $m\varpi_1 = m e_1 \in W_m$. This shows that $(\psi_m, F_m)$ is equivalent to the IRREP $(\pi_{m\varpi_1}, V_{m\varpi_1})$. We identify these $sl(n, \mathbb{C})$ (and the naturally corresponding $su(n)$) representations in the proofs presented in Sec. 4.

References

[1] M. A. Olshanetsky and A. M. Perelomov, Quantum integrable systems related to Lie algebras, Phys. Rept. 94 (1983) 313–404.
[2] G. Heckman, Hypergeometric and spherical functions, Harmonic Analysis and Special Functions on Symmetric Spaces, eds. G. Heckman and H. Schlichtkrull, Perspectives in Mathematics, Vol. 16 (Academic Press, 1994), pp. 1–89.
[3] B. Sutherland, Beautiful Models (World Scientific, 2004).
[4] R. Sasaki, Quantum Calogero–Moser systems, in Encyclopaedia of Mathematical Physics (Academic Press, 2006), pp. 123–129.
[5] P. Etingof, Calogero–Moser Systems and Representation Theory (European Mathematical Society, 2007).
[6] A. P. Polychronakos, Physics and mathematics of Calogero particles, J. Phys. A Math. Gen. 39 (2006) 12793–12845; arXiv:hep-th/0607033.
[7] M. A. Olshanetsky and A. M. Perelomov, Completely integrable Hamiltonian systems connected with semisimple Lie algebras, Invent. Math. 37 (1976) 93–108.
[8] B. Sutherland, Exact results for a quantum many-body problem in one dimension II, Phys. Rev. A 5 (1972) 1372–1376.
[9] M. A. Olshanetsky and A. M. Perelomov, Quantum systems related to root systems, and radial parts of Laplace operators, Funct. Anal. Appl. 12 (1978) 121–128; arXiv:math-ph/0203031.
[10] G. J. Heckman and E. M. Opdam, Root systems and hypergeometric functions I, Compositio Math. 64 (1987) 329–352.
[11] E. M. Opdam, Root systems and hypergeometric functions IV, Compositio Math. 67 (1988) 191–209.


[12] I. Cherednik, Double Affine Hecke Algebras, London Mathematical Society Lecture Notes Series, Vol. 319 (Cambridge University Press, 2005).
[13] P. I. Etingof, I. B. Frenkel and A. A. Kirillov, Jr., Spherical functions on affine Lie groups, Duke Math. J. 80 (1995) 59–90; arXiv:hep-th/9407047.
[14] D. Kazhdan, B. Kostant and S. Sternberg, Hamiltonian group actions and dynamical systems of Calogero type, Commun. Pure Appl. Math. 31 (1978) 481–507.
[15] A. D. Berenstein and A. V. Zelevinsky, When is the multiplicity of a weight equal to 1?, Funct. Anal. Appl. 24 (1990) 259–269.
[16] R. Howe, Perspectives on invariant theory: Schur duality, multiplicity-free actions and beyond, in The Schur Lectures (1992), Israel Math. Conf. Proc., Vol. 8 (Bar-Ilan Univ., 1995), pp. 1–182.
[17] E. Heintze, R. Palais, C.-L. Terng and G. Thorbergsson, Hyperpolar actions on symmetric spaces, in Geometry, Topology, and Physics for Raoul Bott, ed. S.-T. Yau (International Press, 1995), pp. 214–245.
[18] A. Kollross, Polar actions on symmetric spaces, J. Differential Geom. 77 (2007) 425–482; arXiv:math/0506312 [math.DG].
[19] A. Oblomkov, Heckman–Opdam’s Jacobi polynomials for the BCn root system and generalized spherical functions, Adv. Math. 186 (2004) 153–180; arXiv:math/0202076 [math.RT].
[20] L. Fehér and B. G. Pusztai, A class of Calogero type reductions of free motion on a simple Lie group, Lett. Math. Phys. 79 (2007) 263–277; arXiv:math-ph/0609085.
[21] L. Fehér and B. G. Pusztai, Hamiltonian reductions of free particles under polar actions of compact Lie groups, Theoret. Math. Phys. 155 (2008) 646–658; arXiv:0705.1998 [math-ph].
[22] R. Palais and C.-L. Terng, A general theory of canonical forms, Trans. Amer. Math. Soc. 300 (1987) 771–789.
[23] V. V. Gorbatsevich, A. L. Onishchik and E. B. Vinberg, Foundations of Lie Theory and Lie Transformation Groups (Springer, 1997).
[24] T. Matsuki, Classification of two involutions on compact semisimple Lie groups and root systems, J. Lie Theory 12 (2002) 41–68.
[25] B. Hoogenboom, Intertwining Functions on Compact Lie Groups, CWI Tract, Vol. 5 (Centrum Wisk. Inform., Amsterdam, 1984).
[26] T. Matsuki, Double coset decomposition of reductive Lie groups arising from two involutions, J. Algebra 197 (1997) 49–91.
[27] F. Calogero, Exactly solvable one-dimensional many-body problems, Lett. Nuovo Cim. 13 (1975) 411–416.
[28] M. Hashizume, Geometric approach to the completely integrable Hamiltonian systems attached to the root systems with signature, Adv. Stud. Pure Math. 4 (1984) 291–330.

August 10, J070-S0129055X10004077

2010 15:0 WSPC/S0129-055X

148-RMP

Reviews in Mathematical Physics Vol. 22, No. 7 (2010) 733–838
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004077

A CLASSICAL MECHANICAL MODEL OF BROWNIAN MOTION WITH PLURAL PARTICLES

SHIGEO KUSUOKA∗ and SONG LIANG† ∗Graduate School of Mathematical Sciences, The University of Tokyo, Komaba 3-8-1, Meguro-ku Tokyo 153-8914, Japan [email protected] †Institute of Mathematics, Tsukuba University, Tennoudai 1-1-1 Tsukuba 305-8571, Japan [email protected]

Received 15 December 2008 Revised 26 May 2010

We give a connection between diffusion processes and classical mechanical systems in this paper. Precisely, we consider a system of plural massive particles interacting with an ideal gas, evolved according to classical mechanical principles, via interaction potentials. We prove the almost sure existence and uniqueness of the solution of the considered dynamics, prove the convergence of the solution under a certain scaling limit, and give the precise expression of the limiting process, a diffusion process.

Keywords: Infinite particle systems; classical mechanics; Markov processes; diffusions; convergence; Brownian motion.

Mathematics Subject Classification 2010: 70F45, 34F05, 60B10, 60J60

1. Introduction

Brownian motion is a well-known physical phenomenon concerning the dynamics of a small particle put into a fluid in equilibrium, e.g., a grain of pollen in a glass of water [10]. It is an interesting problem in mathematical physics to describe the Brownian motion phenomenology by classical mechanical models. Brownian motion was first observed accidentally by Brown in 1827. The first physical explanation of it was given by Einstein: the motion comes about as a result of the repeated collisions of the particle with the numerous, much smaller fluid atoms. In more mathematical terms, the explanation is often presented in the following rough way: since a large number of water atoms collide with the massive particle randomly, and each atom is light enough, if we assume that the interactions from each atom at each time are independent, the motion


of the massive particle could be considered as a sum of many i.i.d. random variables, and by the central limit theorem this gives the Brownian motion in a suitable limit.

However, we have to notice that, even in a model where only interactions through collisions are considered, there exists the possibility of re-collisions, so the states (i.e. positions and velocities) of the light particles at each time are not independent of each other, and one is not really in the presence of a sum of i.i.d. variables. This becomes a more evident and significant drawback when considering a model with interactions caused by potentials, or a model with more than one massive particle. Indeed, the actual motion of the massive particles is not the result of a sum of i.i.d. random variables; it is not even a Markov process. So to study this phenomenon more precisely, we need to construct a new model that takes the mentioned re-interactions into account.

In such a model, a finite number of massive particles (molecules) interact with a gas of infinitely many light atoms, which have a random initial state. The dynamics is fully deterministic (Newtonian) once the initial condition is given, and the only source of randomness is the initial configuration of the light atoms. The problem is to prove that in an appropriate limit, as the mass m of the gas atoms goes to 0 while their velocities and their density increase in an appropriate manner such that the variance of the momentum transfer stays of order 1, the motion of the molecules converges to a Markov process, in particular a diffusion process.

This type of model, called a mechanical model of Brownian motion, was first introduced and studied by Holley [6], for the case of only one molecule, with the whole system in dimension d = 1, and the interactions given only by collisions. This model was later extended by, e.g., Dürr–Goldstein–Lebowitz [3–5], Calderoni–Dürr–Kusuoka [2], and others, to the case of higher dimensional spaces. But in all these papers, only collisional interactions of one molecule with light atoms are considered.

In the present paper, we consider the case of plural molecules interacting via smooth, compactly supported potentials with an ideal gas of atoms. This increases the difficulties in many respects, for example,
(1) the strongly non-Markovian character of the dynamics (for every positive mass m of the atoms), due to possibly multiple, or even extended in time, interactions between a particular atom and the molecules;
(2) the appearance of an interaction (mediated by the gas atoms) between the molecules in the limiting process; and
(3) the irregular behaviour of the above interaction when the interaction ranges of the molecules overlap, i.e. when an atom can interact with more than one molecule at a time.
Let us note that one expects the non-Markovian character of the dynamics mentioned above to disappear when the gas atoms become infinitely fast, in the limit m → 0.


We show the existence and the uniqueness of the solution of an infinite system of ordinary differential equations (ODEs) describing the model, for almost every initial condition of the ideal gas, and study the motion of the molecules in the Brownian limit, where m → 0 and the intensity function of the initial configuration of the atoms is given by

$$\tilde\lambda_m(dx, dv) = m^{\frac{d-1}{2}}\, \rho\Big(\frac{m}{2}|v|^2 + \sum_{i=1}^{N} U_i(x - X_{i,0})\Big)\,dx\,dv,$$

where $\rho$ is a function giving the initial distribution, $x$ and $v$ denote the position and the velocity of an atom, respectively, $N$ is the number of molecules, $X_{i,0}$ are the initial positions of the molecules and $U_i$ are the potentials. (See Sec. 2 for more details and the notations.)

A heuristic central limit theorem type argument for this scaling, which assumes all of the necessary independence in the limit, may be given as follows: when m → 0, the average energy $mv^2$ of the atoms remains of order 1 due to the velocity scaling. For the momentum transfer, notice that since the potentials are compactly supported by our assumption, (as long as the molecules stay in a bounded region) interactions occur only if the atoms are in a certain area $A$ which does not depend on time. So roughly speaking, for a fixed time $t$, $\frac{d}{dt}V_i(t)$ is the result of the sum of the effects from those atoms whose positions $x$ and velocities $v$ satisfy $x + tv \in A$. Also, the effect from each such atom is of order 1. Since an atom has velocity of order $m^{-1/2}$, the length of time that it stays in the area $A$ is of order $m^{1/2}$. In summary, in a fixed time interval $[0, T]$, $T > 0$, the momentum transfer for a molecule from one atom is of order $m^{1/2}$, hence the momentum will remain constant if the total number of interacting atoms has mean $\int_0^T ds \int_{x+sv\in A} \tilde\lambda_m(dx, dv)$ behaving as $m^{-1/2}$ for m → 0.

Let us explain further main ideas and provide a sketch of the content of this paper before closing this section. First, we introduce a system $\tilde\varphi(t, x, v; \vec X)$ (or $\psi(t, x, v; \vec X)$) (see (2.2) and (3.4)), describing the motion of the light atoms when the molecules are “frozen”, and consider the classical scattering theory (including a ray representation) for it. As a result of our cut-off in the potentials and the initial distribution of the atoms, as long as the velocities of the molecules are not too fast, each light atom interacts with the molecules for a time length of order $m^{1/2}$ only, so the interaction given by $\psi(t, x, v; \vec X)$ (instead of $x(t, x, v)$) gives the 0-order approximation of the momentum variance of the molecules (see Proposition 3.6.3). We use this fact to get the tightness of the states of the molecules as long as their velocities are of order O(1) (Lemma 3.5.1 and Sec. 4). Next, with the help of this tightness, by adding the effect given by the error caused by the described “freezing approximation” as a 1-order term (see Proposition 3.6.4), we prove the desired convergence for m → 0. This is done by characterizing the possible accumulation points by martingale problem theory (Sec. 5). Finally, we show that when there is only one molecule, or when there are two molecules but the potential functions satisfy certain conditions (see Theorem 2.0.1(4)), the velocity(s) of the molecule(s) do(es) not go beyond order O(1), so the stopping time $\tau_n$ (to keep the velocity(s) O(1))


converges to ∞, which means that our convergence is valid for all time intervals (Sec. 6).
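The scaling heuristic of Sec. 1 can be checked numerically in a simplified setting. The following is only a sketch under stated assumptions: it takes d = 1 (so the prefactor $m^{(d-1)/2}$ equals 1), evaluates the velocity integral at a point outside all potential ranges (so the $U_i$ term vanishes), and uses a hypothetical profile `rho(s) = exp(-s)` that is not taken from the paper. It illustrates that the velocity mass of the intensity grows like $m^{-1/2}$.

```python
import math

def rho(s):
    # Hypothetical rapidly decaying profile (an assumption; the paper only
    # requires conditions (A1) and (A2) on rho).
    return math.exp(-s)

def velocity_mass(m, cutoff=12.0, steps=20000):
    """Midpoint approximation of the integral of rho(m*v**2/2) over v (d = 1).

    The grid is stretched by m**-0.5 so that it always covers the same
    range of kinetic energies m*v**2/2.
    """
    half_width = cutoff / math.sqrt(m)
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        v = -half_width + (k + 0.5) * h
        total += rho(0.5 * m * v * v)
    return total * h

if __name__ == "__main__":
    for m in (1.0, 0.1, 0.01):
        # The rescaled value velocity_mass(m) * sqrt(m) should not depend on m.
        print(m, velocity_mass(m) * math.sqrt(m))
```

With this choice of `rho` the rescaled value is the Gaussian integral $\sqrt{2\pi}$ for every m, which matches the $m^{-1/2}$ growth used in the heuristic.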

2. Description of the Model and Statement of the Result

Let us describe our model and results precisely in this section. Let $N \ge 1$ and $d \ge 1$ be integers, and let $M_1, \ldots, M_N, m > 0$. Here $N$ stands for the number of molecules, and $d$ for the dimension of the space $\mathbb{R}^d$ in which the whole system is considered. $M_1, \ldots, M_N$ are the masses of the molecules, and $m$ stands for the mass of the light particles (the environmental ideal gas atoms); later on the limit m → 0 will be taken. We use $U_i \in C_0^\infty(\mathbb{R}^d)$, $i = 1, \ldots, N$, to denote the (cut-off) potential functions, which, as (2.1) shows, are assumed to provide potentials that only depend on the relative positions of the molecules and the atoms. Also, let $X_{i,0}, V_{i,0} \in \mathbb{R}^d$, $i = 1, \ldots, N$, be given, which stand for the initial positions and the initial velocities of the molecules.

Assume that the initial condition of the environment, i.e. the positions and the velocities of the ideal gas atoms at time 0, is given by $\tilde\omega \in \tilde\Omega = \mathrm{Conf}(\mathbb{R}^d \times \mathbb{R}^d)$. The distribution of $\tilde\omega$ will be specified later. (We ask for the reader’s tolerance for using “∼” for a while. We do so because we will soon convert the problem to some new probability space (see Sec. 3.1) by using the ray representation, and we believe that it is better to keep the notations without “∼” until then.) Here $\mathrm{Conf}(\mathbb{R}^d \times \mathbb{R}^d)$ stands for the set of all non-empty closed subsets of $\mathbb{R}^d \times \mathbb{R}^d$ which have no cluster point. $\mathrm{Conf}(\mathbb{R}^d \times \mathbb{R}^d)$ is equipped with the σ-algebra $\mathcal{E}_0$, the σ-algebra generated by $\{\{C \subset \mathbb{R}^d \times \mathbb{R}^d;\ C \neq \emptyset,\ \text{closed},\ C \cap G = \emptyset\};\ G \text{ is open in } \mathbb{R}^d \times \mathbb{R}^d\}$. Each $\tilde\omega$ is a subset of $\mathbb{R}^d \times \mathbb{R}^d$, and $(x, v) \in \tilde\omega$ means that there exists an atom at position $x$ with velocity $v$ at time 0.

As claimed before, we assume that as long as the initial conditions $\tilde\omega \in \mathrm{Conf}(\mathbb{R}^d \times \mathbb{R}^d)$ and $X_{i,0}, V_{i,0} \in \mathbb{R}^d$, $i = 1, \ldots, N$, are given, the whole system evolves according to Newton's laws of mechanics via interaction potentials depending only on the relative positions. We use $X_i^{(m)}(t) = X_i^{(m)}(t, \tilde\omega)$ and $V_i^{(m)}(t) = V_i^{(m)}(t, \tilde\omega) \in \mathbb{R}^d$ to denote the position and the velocity of the $i$th molecule at time $t$ with initial environmental condition $\tilde\omega$, and for each $(x, v) \in \tilde\omega$, we use $x^{(m)}(t, x, v, \tilde\omega), v^{(m)}(t, x, v, \tilde\omega) \in \mathbb{R}^d$ to denote the position and the velocity at time $t$ of the atom which had state $(x, v)$ at time 0.

Also, for the sake of simplicity, we assume that there is no direct interaction between molecules or between atoms. Actually, adding the effect of interactions between molecules causes no mathematical difficulty at all, while making the formulas more complicated. We would rather say that one of the most interesting points of our results in this paper is that, even for the case with no direct interactions between molecules, after taking the limit m → 0, we get a diffusion in which interactions between molecules appear. (See Theorem 2.0.1, especially the definition of the generator L below.)


In conclusion, for each initial environmental condition $\tilde\omega$, we assume that the motion of the system is described by the following infinite system of ODEs:

$$
\begin{cases}
\dfrac{d}{dt} X_i^{(m)}(t, \tilde\omega) = V_i^{(m)}(t, \tilde\omega),\\[4pt]
M_i \dfrac{d}{dt} V_i^{(m)}(t, \tilde\omega) = -\displaystyle\int_{\mathbb{R}^d \times \mathbb{R}^d} \nabla U_i\big(X_i^{(m)}(t, \tilde\omega) - x^{(m)}(t, x, v, \tilde\omega)\big)\,\mu_{\tilde\omega}(dx, dv),\\[4pt]
(X_i^{(m)}(0, \tilde\omega), V_i^{(m)}(0, \tilde\omega)) = (X_{i,0}, V_{i,0}), \quad i = 1, \ldots, N,\\[4pt]
\dfrac{d}{dt} x^{(m)}(t, x, v, \tilde\omega) = v^{(m)}(t, x, v, \tilde\omega),\\[4pt]
m \dfrac{d}{dt} v^{(m)}(t, x, v, \tilde\omega) = -\displaystyle\sum_{i=1}^{N} \nabla U_i\big(x^{(m)}(t, x, v, \tilde\omega) - X_i^{(m)}(t, \tilde\omega)\big),\\[4pt]
(x^{(m)}(0, x, v, \tilde\omega), v^{(m)}(0, x, v, \tilde\omega)) = (x, v), \quad (x, v) \in \tilde\omega.
\end{cases}
\tag{2.1}
$$

Here $\mu_{\tilde\omega}(\cdot)$ is the counting measure determined by $\tilde\omega$: $\mu_{\tilde\omega}(A) = \sharp(\tilde\omega \cap A)$ for any $A \in \mathcal{B}(\mathbb{R}^d \times \mathbb{R}^d)$, with $\sharp(\cdot)$ denoting the number of points in the argument. Since we are only interested in the motion of the molecules, from now on, when talking about the solution of (2.1), we always mean the value of $(\vec X^{(m)}(t, \tilde\omega), \vec V^{(m)}(t, \tilde\omega)) = ((X_1^{(m)}(t, \tilde\omega), \ldots, X_N^{(m)}(t, \tilde\omega)), (V_1^{(m)}(t, \tilde\omega), \ldots, V_N^{(m)}(t, \tilde\omega)))$.

Finally, let us give the distribution of the environmental initial condition $\tilde\omega$. Let $\rho: \mathbb{R} \to [0, \infty)$ be a continuous function such that $\rho(s) \to 0$ rapidly as $s \to \infty$ (see conditions (A1) and (A2) below for details). Let $\tilde\lambda_m$ be the non-atomic Radon measure on $\mathbb{R}^d \times \mathbb{R}^d$ given by

$$\tilde\lambda_m(dx, dv) = m^{\frac{d-1}{2}}\, \rho\Big(\frac{m}{2}|v|^2 + \sum_{i=1}^{N} U_i(x - X_{i,0})\Big)\,dx\,dv,$$

and let $\tilde P_m(d\tilde\omega)$ be the Poisson point process with the intensity measure $\tilde\lambda_m$. So $\tilde P_m$ is a probability measure on $\tilde\Omega\ (= \mathrm{Conf}(\mathbb{R}^d \times \mathbb{R}^d))$. We assume that the distribution of $\tilde\omega$ is given by $\tilde P_m$. (See, e.g., [7] for more details about Poisson point processes.)

In this paper, we consider the following questions:
(Q1) Does the dynamics have a unique solution for $\tilde P_m$-almost every initial condition?
(Q2) What is the limit behavior of the solution $(\vec X^{(m)}(t, \tilde\omega), \vec V^{(m)}(t, \tilde\omega))$ as m → 0?

Throughout this paper, we assume that $U_i \in C_0^\infty(\mathbb{R}^d)$ satisfy $U_i(-x) = U_i(x)$, $x \in \mathbb{R}^d$, $i = 1, \ldots, N$. Let $R_i$ be constants such that $U_i(x) = 0$ if $|x| \ge R_i$. Define the constants

$$C_0 = \Big(2 \sum_{i=1}^{N} R_i \|\nabla U_i\|_\infty\Big)^{1/2}, \qquad e_0 = \frac{1}{2}(2C_0 + 1)^2 + \sum_{i=1}^{N} \|U_i\|_\infty.$$

Assume that $\rho: \mathbb{R} \to [0, \infty)$ is a measurable function satisfying the


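following:
(A1) $\rho(s) = 0$ if $s \le e_0$,
(A2) for any $c > 0$, there exists a $\rho_c: \mathbb{R} \to [0, \infty)$ such that

$$\sup_{|a| \le c} \rho(s + a) \le \rho_c(s) \quad \text{for any } s \in \mathbb{R},$$

and

$$\int_{\mathbb{R}^d} (1 + |v|^3)\, \rho_c\Big(\frac{1}{2}|v|^2\Big)\,dv < \infty.$$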
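As a small worked illustration of the cut-off threshold in (A1), the constants $C_0$ and $e_0$ can be computed directly from the radii $R_i$ and the sup-norms of $\nabla U_i$ and $U_i$. The sketch below assumes these norms are already known; the function name `cutoff_constants` and the sample numbers are hypothetical and serve only to show the arithmetic.

```python
import math

def cutoff_constants(radii, grad_sup, pot_sup):
    """Return (C0, e0) from R_i, ||grad U_i||_inf and ||U_i||_inf.

    C0 = (2 * sum_i R_i * ||grad U_i||_inf) ** (1/2)
    e0 = (1/2) * (2*C0 + 1)**2 + sum_i ||U_i||_inf
    """
    c0 = math.sqrt(2.0 * sum(r * g for r, g in zip(radii, grad_sup)))
    e0 = 0.5 * (2.0 * c0 + 1.0) ** 2 + sum(pot_sup)
    return c0, e0

# Hypothetical data for N = 2 molecules.
c0, e0 = cutoff_constants(radii=[1.0, 2.0], grad_sup=[1.0, 0.5], pot_sup=[0.5, 0.25])
print(c0, e0)
```

For these sample values, $2\sum_i R_i\|\nabla U_i\|_\infty = 4$, so $C_0 = 2$ and $e_0 = \frac12 \cdot 25 + 0.75 = 13.25$; under (A1), atoms whose initial energy is at most this value carry no weight.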

The meaning of the assumption (A1) is that those atoms with their initial momenta less than a certain value are ignored. The point is that, under this condition (as in the case with the molecules “frozen”, which we call the “classical case”), since the initial velocities of the atoms are fast enough, the interactions are not strong enough to “stop” the atoms, so they keep their velocities at a certain level for all time, hence they leave the valid region for interaction very quickly (see Proposition 3.2.2 and Corollary 3.2.3 for the classical case, and Propositions 3.6.1 and 3.6.5 for our case). This helps us to avoid the problem of “too many collisions in a short period of time”. (A2) is an assumption on how rapidly $\rho$ decreases.

Also, assume that the initial position $(X_{1,0}, \ldots, X_{N,0})$ satisfies $|X_{i,0} - X_{j,0}| > R_i + R_j$ for any $i \neq j$, i.e. the molecules are originally separated enough such that their potential ranges do not overlap.

We answer in this paper the two questions (Q1), (Q2) described above under our present assumptions. For (Q1), we will show that there exists a unique solution of (2.1) for $\tilde P_m$-almost every initial condition for every $m > 0$ (Theorem 2.0.1(1) below).

In order to answer (Q2), let us first define some notations to describe the limit process. For any $\vec X = (X_1, \ldots, X_N) \in \mathbb{R}^{dN}$, let

$$\tilde\varphi(t, x_0, v_0; \vec X) = (\tilde\varphi^0(t, x_0, v_0), \tilde\varphi^1(t, x_0, v_0)) = (\tilde x(t, x, v; \vec X), \tilde v(t, x, v; \vec X))$$

denote the solution of Newton’s equation

$$
\begin{cases}
\dfrac{d}{dt} \tilde x(t, x, v; \vec X) = \tilde v(t, x, v; \vec X),\\[4pt]
\dfrac{d}{dt} \tilde v(t, x, v; \vec X) = -\displaystyle\sum_{i=1}^{N} \nabla U_i\big(\tilde x(t, x, v; \vec X) - X_i\big),\\[4pt]
(\tilde x(0, x, v; \vec X), \tilde v(0, x, v; \vec X)) = (x, v).
\end{cases}
\tag{2.2}
$$
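To make the “frozen molecule” flow (2.2) concrete, here is a minimal numerical sketch, not the authors' construction. It assumes d = 1, a single molecule fixed at `X`, and a hypothetical smooth bump potential `U` supported in $|x| < R$; the integrator is the standard velocity-Verlet scheme, chosen here only because it conserves energy well, and the names `frozen_flow` and `energy` are illustrative.

```python
import math

R = 1.0  # hypothetical interaction radius

def U(x):
    """Smooth, even bump potential supported in |x| < R (an illustrative choice)."""
    q = (x / R) ** 2
    return math.exp(-1.0 / (1.0 - q)) if q < 1.0 else 0.0

def dU(x):
    """Derivative of U; it vanishes identically for |x| >= R."""
    q = (x / R) ** 2
    if q >= 1.0:
        return 0.0
    return -U(x) * 2.0 * x / (R * R * (1.0 - q) ** 2)

def frozen_flow(t_end, x, v, X=0.0, dt=1e-3):
    """Velocity-Verlet approximation of (2.2) in d = 1 with one molecule frozen at X."""
    steps = int(round(t_end / dt))
    a = -dU(x - X)
    for _ in range(steps):
        x += v * dt + 0.5 * a * dt * dt
        a_new = -dU(x - X)
        v += 0.5 * (a + a_new) * dt
        a = a_new
    return x, v

def energy(x, v, X=0.0):
    """Total energy |v|^2 / 2 + U(x - X), which the exact flow conserves."""
    return 0.5 * v * v + U(x - X)

x0, v0 = -2.0, 3.0           # a fast atom starting outside the potential range
x1, v1 = frozen_flow(2.0, x0, v0)
print(x1, v1, energy(x1, v1) - energy(x0, v0))
```

In this example the atom is fast enough to cross the potential range and leave it again, so its final velocity is close to its initial one and the energy difference is close to zero; this is the kind of behaviour that assumption (A1) is meant to guarantee.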

Comparing (2.2) with the second half of (2.1) with $m = 1$, one finds that the only difference is that in (2.2) the molecules are fixed, whereas in (2.1) the molecules are also moving. We will use this $\tilde\varphi(t, x_0, v_0; \vec X)$ (with proper $\vec X$) as an


approximation of $(x(t, x_0, v_0), v(t, x_0, v_0))$. As mentioned in Sec. 1, this is actually one of the main ideas of the present paper.

Also, we will use the so-called ray representation $\Psi$: let

$$E = \{(x, v) \in \mathbb{R}^d \times (\mathbb{R}^d \setminus \{0\});\ x \cdot v = 0\}, \qquad E_v = \{x \in \mathbb{R}^d;\ x \cdot v = 0\}, \quad v \in \mathbb{R}^d \setminus \{0\},$$

and let $\nu(dx, dv)$ be the measure on $E$ given by $\nu(dx, dv) = |v|\,\nu(dx; v)\,dv$, where $\nu(dx; v)$ is the Lebesgue measure on $E_v$. Define

$$\Psi: \mathbb{R} \times E \to \mathbb{R}^d \times (\mathbb{R}^d \setminus \{0\}), \qquad (s, (x, v)) \mapsto \Psi(s, (x, v)) = (\Psi^0(s, (x, v)), \Psi^1(s, (x, v))) = (x - sv, v),$$

in other words, we decompose the position of each atom into two parts: one parallel to its velocity and the other orthogonal to its velocity.

Then by Lemma 3.2.1, we have that $\lim_{s \to \infty} \tilde\varphi^0(t + s, \Psi(s, x, v); \vec X)$ is well defined for any $t \in \mathbb{R}$ and $(x, v) \in E$. Denote it by $\psi^0(t, x, v; \vec X)$, i.e. let

$$\psi^0(t, x, v; \vec X) = \lim_{s \to \infty} \tilde\varphi^0(t + s, \Psi(s, x, v); \vec X).$$
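The ray representation above amounts to an orthogonal decomposition of the position along the velocity. The following sketch shows the map $\Psi$ and one way to invert it; the helper name `ray_coordinates` is hypothetical, and the inverse formula $s = -(y \cdot v)/|v|^2$, $x = y + sv$ is simply what solving $y = x - sv$ with $x \cdot v = 0$ gives.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def Psi(s, x, v):
    """Ray map Psi(s, (x, v)) = (x - s v, v) for (x, v) in E."""
    return tuple(xi - s * vi for xi, vi in zip(x, v)), tuple(v)

def ray_coordinates(y, v):
    """Inverse of Psi: return (s, x) with x . v = 0 and y = x - s v (v != 0)."""
    s = -dot(y, v) / dot(v, v)
    x = tuple(yi + s * vi for yi, vi in zip(y, v))
    return s, x

# An atom at position y with velocity v in d = 2.
y, v = (3.0, 1.0), (2.0, 0.0)
s, x = ray_coordinates(y, v)
print(s, x, Psi(s, x, v))
```

Here `x` is the component of the position orthogonal to `v`, and `s` records how far along the ray the atom sits, measured in units of time; applying `Psi` to `(s, x, v)` recovers the original state `(y, v)`.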

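Now we are ready to give the quadratic term of the diffusion generator of the limit process: Let

$$a_{ik;jl}(\vec X) = \frac{1}{M_i M_j} \int_E \Big(\int_{-\infty}^{\infty} \nabla_k U_i\big(\psi^0(t, x, v; \vec X) - X_i\big)\,dt\Big) \times \Big(\int_{-\infty}^{\infty} \nabla_l U_j\big(\psi^0(t, x, v; \vec X) - X_j\big)\,dt\Big)\, \rho\Big(\frac{1}{2}|v|^2\Big)\,\nu(dx, dv).$$

Notice that the integral above, although it might look infinite at first glance, is actually finite by Corollary 3.2.3 and assumptions (A1) and (A2).

We next give the definition of the drift term of the limit process. For any $(x, v) \in E$, $\vec X, \vec V \in \mathbb{R}^{dN}$ and $a \in \mathbb{R}$, let $z(t; x, v, \vec X, \vec V, a) \in \mathbb{R}^d$ denote the solution of

$$
\begin{cases}
\dfrac{d^2}{dt^2} z(t) = -\displaystyle\sum_{i=1}^{N} \nabla^2 U_i\big(\psi^0(t, x, v, \vec X) - X_i\big)\big(z(t) - (t + a)V_i\big),\\[4pt]
\displaystyle\lim_{t \to -\infty} z(t) = \lim_{t \to -\infty} \dfrac{d}{dt} z(t) = 0.
\end{cases}
\tag{2.3}
$$

Then $z(t; x, v, \vec X, \vec V, a)$ is a linear function of $\vec V$. Let $b_{ik;jl}: \mathbb{R}^{dN} \to \mathbb{R}$ be the $C^\infty$-functions determined by the following:

$$-\int_E \Big(\int_{-\infty}^{\infty} \nabla^2 U_i\big(\psi^0(t, x, v, \vec X) - X_i\big)\, z(t, x, v, \vec X, \vec V, -t)\,dt\Big)\, \rho\Big(\frac{1}{2}|v|^2\Big)\,\nu(dx, dv) = \sum_{l=1}^{d} \sum_{j=1}^{N} b_{i\cdot;jl}(\vec X)\, V_{jl},$$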


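or equivalently,

$$-\int_E \Big(\int_{-\infty}^{\infty} \sum_{p=1}^{d} \nabla_k \nabla_p U_i\big(\psi^0(t, x, v, \vec X) - X_i\big)\, z_p(t, x, v, \vec X, \vec V, -t)\,dt\Big)\, \rho\Big(\frac{1}{2}|v|^2\Big)\,\nu(dx, dv) = \sum_{l=1}^{d} \sum_{j=1}^{N} b_{ik;jl}(\vec X)\, V_{jl}, \quad k = 1, \ldots, d,$$

where $z_p$ means the $p$th element of the vector $z$ for $p = 1, \ldots, d$. For the same reason as for the quadratic term, the integral on the left-hand side above is finite.

Now we are in a position to give the definition of the limit diffusion generator $L$ on $\mathbb{R}^{2dN}$:

$$L = \frac{1}{2} \sum_{i,j=1}^{N} \sum_{k,l=1}^{d} a_{ik,jl}(\vec X)\, \frac{\partial^2}{\partial V_{ik}\, \partial V_{jl}} + \sum_{i,j=1}^{N} \sum_{k,l=1}^{d} b_{ik,jl}(\vec X)\, V_{jl}\, \frac{\partial}{\partial V_{ik}} + \sum_{i=1}^{N} \sum_{k=1}^{d} V_{ik}\, \frac{\partial}{\partial X_{ik}}.$$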

The coefficients $a_\cdot$ and $b_\cdot$ correspond to the 0-order and the 1-order approximations, respectively, given by the “freezing approximation” of the molecules (see also Sec. 1).

Our main results in the present paper are formulated in Theorem 2.0.1 below. Assertion (1) ensures the existence of a unique dynamics for $\tilde P_m$-almost every initial condition. Assertions (2)–(4) of Theorem 2.0.1 are to be understood with respect to the convergence of the distribution of $\{(\vec X^{(m)}(t, \tilde\omega), \vec V^{(m)}(t, \tilde\omega)), t \ge 0\}$ under $\tilde P_m(d\tilde\omega)$ as m → 0: for the case of only one molecule, we have the convergence with no further assumption (assertion (2)); when there is more than one molecule, in the general case, the convergence is valid until the stopping time given as the first time at which the potential ranges of some pair of molecules overlap (assertion (3)); finally, for the special case of exactly two molecules with spherically symmetric potentials, we strengthen the result by allowing the process to run until an arbitrary time (assertion (4)). The precise description is as follows.

Theorem 2.0.1. Under our present setting, we have the following.
(1) For any $m > 0$, there exists a unique solution to (2.1) for $\tilde P_m$-almost every $\tilde\omega$.
(2) Assume $N = 1$. Then as m → 0, the distribution of $\{(X_1^{(m)}(t), V_1^{(m)}(t)), t \ge 0\}$ under $\tilde P_m$ converges weakly to the diffusion process with generator $L$ in $C([0, \infty); \mathbb{R}^{2d})$ equipped with the Skorohod metric.
(3) Assume $N \ge 2$. Let

$$\sigma_0(\tilde\omega) = \inf\Big\{t > 0;\ \min_{i \neq j}\big\{|X_i(t; \tilde\omega) - X_j(t; \tilde\omega)| - (R_i + R_j)\big\} \le 0\Big\},$$

be the first time at which the distance between the molecules in some pair is less than or equal to the sum of the radii of their potentials. Then as m → 0, the distribution of $\{(\vec X^{(m)}(t \wedge \sigma_0), \vec V^{(m)}(t \wedge \sigma_0)), t \ge 0\}$ under $\tilde P_m$ converges weakly to the diffusion with generator $L$ stopped at $\sigma_0$ in $C([0, \infty); \mathbb{R}^{2dN})$ equipped with the Skorohod metric.
(4) Let $N = 2$ and $d \ge 3$. Assume that there exist functions $h_1, h_2$ such that $U_i(x) = h_i(|x|)$,

i = 1, 2,

and there exists a constant ε0 > 0 such that (−1)i−1 hi (s) > 0,

(−1)i−1 hi (s) > 0,

s ∈ (Ri − ε0 , Ri ),

i = 1, 2.


Then we have that when m → 0, $\{(\vec X^{(m)}(t), \vec V^{(m)}(t)), t \ge 0\}$ under $\tilde P_m$ converges weakly to the Markov process given by the following: it acts as the diffusion with generator $L$ as long as the potential ranges of the two molecules do not overlap, and the two molecules collide whenever their potential ranges touch each other. (See Theorem 6.3.2 for the precise definition of the limiting process.)

Let us comment a little more on the conditions in Theorem 2.0.1. We do so for (4) first. The first half of the conditions requires that the potential functions for the two molecules depend only on the distances from the atoms. The condition $d \ge 3$ is used in the proof of (4), and we would say that it is not strange to have it here since, as remarked at the end of Sec. 3, our cut-off fits reality (in the sense described there) only if $d \ge 3$. Finally, the second half of the assumptions above implies that, at least near the edges of the potential ranges, one molecule experiences repulsive forces from the atoms, and the other molecule experiences attractive forces from the atoms. We use this condition to keep the velocities of the two molecules bounded (for m → 0). This is also the reason why we need to stop the process at $\sigma_0$ in (3): our decomposition of $V_i(t)$ (see (3.30)) is valid only when the velocities of the molecules are O(1), which holds until $\sigma_0$ without further assumption (see (3.31)), while this is not always true after $\sigma_0$ (to see this, notice that the “resulting direct interactions” $-m^{-1/2} \int_0^{t \wedge \sigma} \nabla_i U(\vec X(s))\,ds$ between molecules in (3.30) become ∞ when m → 0 if $\nabla_i U(\vec X(s)) \neq 0$). We succeeded in extending the result to arbitrary times for the special case described in (4), by showing that in that case the “resulting direct interactions” turn out to be “colliding forces”, which do not change the total momenta of the molecules; while in the general case, these might accelerate the molecules to ∞ immediately (to see this, just consider the case of two molecules of the same type), making the decomposition itself no longer valid. (See also Lemma 3.5.1 and the paragraphs following it.)

Remark 1. We can also get the unique existence of the solution to (2.1) for $\tilde P_m$-almost every $\tilde\omega$ under some simpler-looking assumptions (see Proposition 3.3.9).

Remark 2. We emphasize again that, as explained in Sec. 1, in our present problem the forces at any fixed time are not independent of the history. Therefore, since both the molecules and the light “environmental” atoms are moving, the system is very complicated and difficult to handle. Our basic idea for the proof is that, although all of the particles are moving all the time, since the molecules are very heavy compared with the atoms, when considering the scattering of the atoms we can use the approximation that the molecules are frozen (see (2.2)), which gives us the 0-order approximation of the momentum variance of the molecules. The 1-order error appears in our result as $z(t, x, v; \vec X, \vec V)$. (See Secs. 3–5 for more details.)


Remark 3. For any fixed $m > 0$, although $V_i(t)$ is continuous with respect to $t$ (since it is described by the ODEs (2.1)), our martingale part $M_i(t)$ in the decomposition of $V_i(t)$ (see Lemma 3.5.1) need not be continuous. The only thing we can say is that its jumps are dominated by some constant multiple of $m^{1/2}$ (see Lemma 3.5.1). This is also one of our ideas, namely to use martingale theory only for the terms to which it is applicable. For the remaining terms, instead of trying to deal with them in detail, we show that they are negligible as m → 0.

The rest of this paper is organized as follows: In Sec. 3, we prove the unique existence of the solution, and present some preparation for the proof of convergence. In particular, we formulate the decomposition of $V_i(t)$ (see Lemma 3.5.1) and deduce some properties from it. The “freezing approximation” is also discussed in this section. Section 4 gives the proof of the lemmas formulated in Sec. 3. In Sec. 5, we use these lemmas to prove the first two convergence results (Theorems 2.0.1(2) and 2.0.1(3)), with the help of “martingale theory”. The proof of the last part of Theorem 2.0.1 is given in Sec. 6.

3. Preparations

In this section, we formulate the ray representation, prove the unique existence of the solution of the dynamics for each fixed $m > 0$, and give some preparations for the proof of our convergence results. For the sake of simplicity, from now on, we will omit the superscript $(m)$ when there is no risk of confusion.

We present related results of classical mechanics, especially Newton’s equation and the ray representation, in Sec. 3.1; some results with respect to $\psi(t, x, v; \vec X)$ are prepared in Sec. 3.2; Sec. 3.3 is devoted to the almost surely unique existence of the solution of (2.1) with the help of the ray representation; in Sec. 3.4, we recall some basic facts about the Skorohod spaces $(D([0, T]; \mathbb{R}^d), d^0)$ and $(D([0, \infty); \mathbb{R}^d), \mathrm{dis})$, which will be used later (as described in Remark 3, although both $(\vec X(t, \omega), \vec V(t, \omega))$ and the limit processes are continuous with respect to $t$, this new space is necessary in our proof); in Sec. 3.5, we state several basic lemmas, especially the decomposition of $V_i(t)$, the proof of which will be given in Sec. 4; finally, in Sec. 3.6, we prepare some basic calculations for later use.

Since we are considering the Skorohod metric, it suffices for us to prove our assertions for $t \in [0, T]$ for any $T > 0$, instead of $t \in [0, \infty)$. (See [1].) So from now on, we choose an arbitrary $T > 0$ and fix it. Also, as mentioned in Sec. 1, we use as a stopping time the first time at which the velocity of some molecule becomes larger than or equal to $n$: choose any $n \ge 1$ and fix it for a while (we will take $n \to \infty$ at the end). Now, we are ready to define the following notations: Let

$$\sigma(\omega) = \sigma_n(\omega) = \inf\Big\{t \ge 0;\ \max_{i=1,\ldots,N} |V_i(t, \omega)| \ge n\Big\},$$

$$R_0 = R_0(n, T) = \max_{i=1,\ldots,N} (R_i + |X_{i,0}| + nT) + 1,$$


$$C_0 = \Big(2 \sum_{i=1}^{N} R_i \|\nabla U_i\|_\infty\Big)^{1/2},$$

$$\tau = \tau(n, T) = C_0^{-1} R_0.$$

Also, for the reader’s convenience, we give a brief list of the other main notations used in this paper and their meanings:
$(\vec X(t), \vec V(t), x(t, x, v), v(t, x, v))$: Solution of the dynamics,
$\tilde\varphi(t, x, v; \vec X)$: Solution of (2.2), the motion of the atoms with the molecules “frozen”, which is shown to be the 0-order term in our approximation of $(x(t, x, v), v(t, x, v))$,
$\psi(t, x, v; \vec X)$: The corresponding scattering,
$\Psi(t, x, v) := (x - tv, v)$, used in the ray representation,
$z(t, x, v; \vec X, \vec V)$: Solution of (2.3), the 1-order term in our approximation of $(x(t, x, v), v(t, x, v))$.

3.1. Classical mechanics

In this and the next subsection, we prepare some results with respect to the solution of Newton’s equation (2.2). As in Sec. 2, for any $\vec X \in (\mathbb{R}^d)^N$, let $\tilde\varphi(t, x, v; \vec X)$ be the solution of (2.2). First, let us recall the following well-known result about Newton’s equation.

Proposition 3.1.1. For any $f: \mathbb{R}^{2d} \to [0, \infty)$, we have

$$\int_{\mathbb{R}^{2d}} f(\tilde\varphi(t, x, v; \vec X))\, \rho\Big(\frac{1}{2}|v|^2 + \sum_{i=1}^{N} U_i(x - X_i)\Big)\,dx\,dv = \int_{\mathbb{R}^{2d}} f(x, v)\, \rho\Big(\frac{1}{2}|v|^2 + \sum_{i=1}^{N} U_i(x - X_i)\Big)\,dx\,dv. \tag{3.1}$$

Proof. As the proof is fundamental and well-known, we give a sketch only. First, since the total energy is constant, we have that $\frac{1}{2}|v|^2 + \sum_{i=1}^{N} U_i(x - X_i) = \frac{1}{2}|\tilde\varphi^1(t, x, v; \vec X)|^2 + \sum_{i=1}^{N} U_i(\tilde\varphi^0(t, x, v; \vec X) - X_i)$, so the left-hand side of (3.1) is equal to $\int_{\mathbb{R}^{2d}} f(\tilde\varphi(t, x, v; \vec X))\, \rho\big(\frac{1}{2}|\tilde\varphi^1(t, x, v; \vec X)|^2 + \sum_{i=1}^{N} U_i(\tilde\varphi^0(t, x, v; \vec X) - X_i)\big)\,dx\,dv$. Therefore, in order to show the assertion, it is sufficient to show that $\big|\frac{\partial(\tilde\varphi^0, \tilde\varphi^1)}{\partial(x, v)}\big| = 1$ for any $t > 0$. On the other hand, by a straightforward calculation, we get from the definition of $\tilde\varphi(t, x, v; \vec X)$ that $\frac{d}{dt}\big(\big|\frac{\partial(\tilde\varphi^0, \tilde\varphi^1)}{\partial(x, v)}\big|\big) = 0$; also, we have by definition $(\tilde\varphi^0, \tilde\varphi^1)(0, x, v; \vec X) = (x, v)$. This completes the proof of our assertion.

The rest of this subsection is dedicated to a discussion of the ray representation. Let $E$, $E_v$, $\Psi$, etc., be as given in Sec. 2. Note that for any measurable


$f: \mathbb{R}^{2d} \to [0, \infty)$, we have by definition and a simple change of variables that

$$\int_{\mathbb{R}^{2d}} f(x, v)\,dx\,dv = \int_{\mathbb{R} \times E} f(\Psi(t, x, v))\,dt\,\nu(dx, dv). \tag{3.2}$$

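In order to derive our new intensity function $\lambda_m$, for the sake of simplicity, we introduce the following notation. Let

$$\Psi_m : \mathbb{R}\times E \to \mathbb{R}^d \times (\mathbb{R}^d\setminus\{0\}), \qquad (s,x,v) \mapsto \Psi_m(s,x,v) = \Psi(s,x,m^{-1/2}v),$$

and let

$$f_m(x,v) = f(x, m^{-1/2}v), \qquad \rho_0(x,v) = \rho\Big(\frac12|v|^2 + \sum_{i=1}^N U_i(x - X_{i,0})\Big).$$

Then we have

$$\begin{aligned}
\int_{\mathbb{R}^{2d}} f(x,v)\,\tilde\lambda_m(dx,dv)
&= m^{\frac{d-1}{2}} \int_{\mathbb{R}^{2d}} f(x,v)\,\rho\Big(\frac m2|v|^2 + \sum_{i=1}^N U_i(x - X_{i,0})\Big)\,dx\,dv \\
&= m^{-\frac12} \int_{\mathbb{R}^{2d}} f(x, m^{-\frac12}v)\,\rho\Big(\frac12|v|^2 + \sum_{i=1}^N U_i(x - X_{i,0})\Big)\,dx\,dv \\
&= m^{-\frac12} \int_{\mathbb{R}\times E} f_m(\Psi(s,x,v))\,\rho_0(\Psi(s,x,v))\,ds\,\nu(dx,dv) \\
&= m^{-1} \int_{\mathbb{R}\times E} f_m(\Psi(m^{-\frac12}s,x,v))\,\rho_0(\Psi(m^{-\frac12}s,x,v))\,ds\,\nu(dx,dv),
\end{aligned}$$

where we used (3.2) when passing to the fourth line. On the other hand,

$$f_m(\Psi(m^{-\frac12}s,x,v)) = f(x - m^{-\frac12}sv,\ m^{-\frac12}v) = f(\Psi_m(s,x,v)).$$

Therefore,

$$\int_{\mathbb{R}^{2d}} f(x,v)\,\tilde\lambda_m(dx,dv) = \int_{\mathbb{R}\times E} f(\Psi_m(s,x,v))\,\lambda_m(ds,dx,dv),$$

where $\lambda_m(ds,dx,dv)$ is the measure on Conf$(\mathbb{R}\times E)$ defined by

$$\lambda_m(ds,dx,dv) = m^{-1}\rho_0(\Psi(m^{-\frac12}s,x,v))\,ds\,\nu(dx,dv) = m^{-1}\rho\Big(\frac12|v|^2 + \sum_{i=1}^N U_i(x - m^{-1/2}sv - X_{i,0})\Big)\,ds\,\nu(dx,dv).$$

Also, with a little abuse of notation, we use $\Psi_m$ to denote the natural map from $\Omega = \mathrm{Conf}(\mathbb{R}\times E)$ to $\mathrm{Conf}(\mathbb{R}^d\times(\mathbb{R}^d\setminus\{0\}))$, i.e. $\Psi_m(A) = \{\Psi_m(a) \mid a \in A\}$.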
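As a quick sanity check of the pointwise identity $f_m(\Psi(m^{-1/2}s,x,v)) = f(\Psi_m(s,x,v))$ used above, the following minimal Python sketch evaluates both sides for one sample point. The test function `f` and the numerical values of `m`, `s`, `x`, `v` are arbitrary illustrative choices, not taken from the paper.

```python
import math

def Psi(s, x, v):
    """Ray map Psi(s, x, v) = (x - s v, v)."""
    return tuple(xi - s * vi for xi, vi in zip(x, v)), tuple(v)

def Psi_m(s, x, v, m):
    """Rescaled ray map Psi_m(s, x, v) = Psi(s, x, m^{-1/2} v)."""
    return Psi(s, x, tuple(vi / math.sqrt(m) for vi in v))

def f(x, v):
    """Arbitrary non-negative test function (hypothetical choice)."""
    return math.exp(-sum(xi * xi for xi in x)) * (1.0 + sum(vi * vi for vi in v))

def f_m(x, v, m):
    """f_m(x, v) = f(x, m^{-1/2} v)."""
    return f(x, tuple(vi / math.sqrt(m) for vi in v))

m, s = 4.0, 0.3
x, v = (0.0, 0.5), (2.0, 0.0)          # x . v = 0, as required for (x, v) in E

lhs = f_m(*Psi(s / math.sqrt(m), x, v), m)   # f_m(Psi(m^{-1/2} s, x, v))
rhs = f(*Psi_m(s, x, v, m))                  # f(Psi_m(s, x, v))
print(lhs, rhs)
```

Both values agree up to floating-point rounding, since each side evaluates `f` at the point $(x - m^{-1/2}sv,\ m^{-1/2}v)$.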


Classical Mechanical Model of Brownian Motion with Plural Particles


Let $P_m(d\omega) = P_{\lambda_m}(d\omega)$ be the Poisson point process on Conf$(\mathbb{R}\times E)$ with intensity function $\lambda_m(ds,dx,dv)$. Then since $\lambda_m(B) = \tilde\lambda_m(\Psi_m(B))$ for any $B \in \mathcal{B}(\mathbb{R}\times E)$, we have that

$$P_m(A) = \tilde P_m(\Psi_m(A)), \qquad \text{for all } A \in \mathcal{E}_0.$$

Therefore, we can convert our problem with respect to Conf$(\mathbb{R}^d\times\mathbb{R}^d)$ to a problem with respect to Conf$(\mathbb{R}\times E)$. In summary, we let $\Omega = \mathrm{Conf}(\mathbb{R}\times E)$,

$$\lambda(ds,dx,dv) = \lambda_m(ds,dx,dv) = m^{-1}\rho\Big(\frac12|v|^2 + \sum_{i=1}^N U_i(x - m^{-1/2}sv - X_{i,0})\Big)\,ds\,\nu(dx,dv),$$

and let $P_m = P_{\lambda_m}$ be the Poisson point process on $\Omega$ with intensity $\lambda_m(ds,dx,dv)$. $\omega \in \Omega$ has distribution $P_m$, and for each initial condition $\omega$, we consider the following infinite system of ODEs (we omit the superscript $(m)$ for the sake of simplicity):

$$\begin{cases}
\dfrac{d}{dt}X_i(t,\omega) = V_i(t,\omega), \\[4pt]
M_i\dfrac{d}{dt}V_i(t,\omega) = -\displaystyle\int_{\mathbb{R}\times E} \nabla U_i\big(X_i(t,\omega) - x(t,\Psi(s,x,m^{-\frac12}v))\big)\,\mu_\omega(ds,dx,dv), \\[4pt]
(X_i(0,\omega), V_i(0,\omega)) = (X_{i,0}, V_{i,0}), \qquad i = 1,\dots,N, \\[4pt]
\dfrac{d}{dt}x(t,x,v,\omega) = v(t,x,v,\omega), \\[4pt]
m\dfrac{d}{dt}v(t,x,v,\omega) = -\displaystyle\sum_{i=1}^N \nabla U_i\big(x(t,x,v,\omega) - X_i(t,\omega)\big), \\[4pt]
(x(0,x,v,\omega), v(0,x,v,\omega)) = (x,v), \qquad (x,v) \in \Psi(\omega).
\end{cases} \tag{3.3}$$

3.2. Classical scattering

Continuing as in Sec. 3.1, let $\vec X = (X_1,\dots,X_N) \in (\mathbb{R}^d)^N$, and let $\tilde\varphi(t,x,v;\vec X)$ be the solution of (2.2). In this subsection, we prove some results with respect to $\psi(t,x,v;\vec X)$ (see (3.4) below for its definition). We call it “classical” scattering since, as opposed to $(x(t,x,v,\omega), v(t,x,v,\omega))$, the massive particles are not moving when considering $\tilde\varphi(t,x,v;\vec X)$.

Lemma 3.2.1. Let $R(\vec X) = \max\{R_i + |X_i|;\ i = 1,\dots,N\}$, and let $s_0 = \frac{R(\vec X)}{|v|}$. Then for any $(x,v) \in E$ and $t \in \mathbb{R}$, we have that $\tilde\varphi(t+s, \Psi(s,x,v); \vec X)$ is independent of $s$ as long as $s \ge s_0$.


Proof. For any $(x,v) \in E$, notice that $x\cdot v = 0$ by definition of $E$, so

$$\inf_{0\le r\le u} |x - (s_0+u)v + rv| = \inf_{0\le r\le u} |x - s_0v - (u-r)v| \ge s_0|v| = R(\vec X), \qquad u \ge 0,$$

which implies that the derivative of $\tilde\varphi$ with respect to time $t$ (the right-hand side of (2.2)) is 0, hence

$$\tilde\varphi(u, \Psi(s_0+u,x,v); \vec X) = (x - s_0v, v) = \Psi(s_0,x,v), \qquad u \ge 0.$$

Therefore, by the Markovian property of $\tilde\varphi$, we have

$$\tilde\varphi(t+s_0+u, \Psi(s_0+u,x,v); \vec X) = \tilde\varphi\big(t+s_0, \tilde\varphi(u,\Psi(s_0+u,x,v);\vec X); \vec X\big) = \tilde\varphi(t+s_0, \Psi(s_0,x,v); \vec X), \qquad t \in \mathbb{R},\ u \ge 0.$$

That is,

$$\tilde\varphi(t+s, \Psi(s,x,v); \vec X) = \tilde\varphi(t+s_0, \Psi(s_0,x,v); \vec X), \qquad \text{for any } s \ge s_0,$$

or equivalently, $\tilde\varphi(t+s,\Psi(s,x,v);\vec X)$ is independent of $s$ as long as $s \ge s_0$.

By Lemma 3.2.1, we get that $\lim_{s\to\infty}\tilde\varphi(t+s,\Psi(s,x,v);\vec X)$ is well-defined, and is equal to $\tilde\varphi(t+s_0,\Psi(s_0,x,v);\vec X)$. Write it as $\psi(t,x,v;\vec X) = (\psi^0(t,x,v;\vec X), \psi^1(t,x,v;\vec X))$, i.e.

$$\psi(t,x,v;\vec X) = (\psi^0(t,x,v;\vec X), \psi^1(t,x,v;\vec X)) = \lim_{s\to\infty}\tilde\varphi(t+s,\Psi(s,x,v);\vec X) = \tilde\varphi(t+s_0,\Psi(s_0,x,v);\vec X). \tag{3.4}$$
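The independence of $s$ in Lemma 3.2.1 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: (2.2) is taken to be the unit-mass Newton equation $\ddot x = -\nabla U(x - X_1)$ with a single frozen scatterer in $d = 2$; the potential `U`, the integrator (velocity Verlet) and all numerical values are hypothetical choices, not the paper's.

```python
import math

def grad_U(y, R=1.0, h=0.5):
    """Gradient of the assumed potential U(y) = h (1 - |y|^2/R^2)^2 for |y| < R, 0 otherwise."""
    r2 = y[0] * y[0] + y[1] * y[1]
    if r2 >= R * R:
        return (0.0, 0.0)
    c = -4.0 * h * (1.0 - r2 / (R * R)) / (R * R)
    return (c * y[0], c * y[1])

def flow(n_steps, x, v, X, dt):
    """Velocity-Verlet approximation of the frozen-scatterer flow over n_steps steps."""
    def acc(p):
        g = grad_U((p[0] - X[0], p[1] - X[1]))
        return (-g[0], -g[1])
    a = acc(x)
    for _ in range(n_steps):
        v = (v[0] + 0.5 * dt * a[0], v[1] + 0.5 * dt * a[1])
        x = (x[0] + dt * v[0], x[1] + dt * v[1])
        a = acc(x)
        v = (v[0] + 0.5 * dt * a[0], v[1] + 0.5 * dt * a[1])
    return x, v

def Psi(s, x, v):
    """Ray map Psi(s, x, v) = (x - s v, v)."""
    return (x[0] - s * v[0], x[1] - s * v[1]), v

X, R1 = (0.3, -0.2), 1.0                 # scatterer position and interaction radius
x, v = (0.0, 0.5), (2.0, 0.0)            # (x, v) in E: x . v = 0
dt = 1.0 / 1024
R_X = R1 + math.hypot(*X)                # R(X) = max_i (R_i + |X_i|)
s0 = R_X / math.hypot(*v)
s1 = math.ceil(s0 / dt) * dt             # smallest grid time >= s0
s2 = s1 + 1.0                            # a later starting time
t = 1.0

def psi_via(s):
    """Approximate phi(t + s, Psi(s, x, v)); by Lemma 3.2.1 this should not depend on s >= s0."""
    return flow(round((t + s) / dt), *Psi(s, x, v), X, dt)

a, b = psi_via(s1), psi_via(s2)
gap = max(abs(a[0][0] - b[0][0]), abs(a[0][1] - b[0][1]),
          abs(a[1][0] - b[1][0]), abs(a[1][1] - b[1][1]))
print(gap, a)
```

Starting further back along the ray only adds a stretch of free flight outside the support of the potential, so the two approximations of $\psi(t,x,v;\vec X)$ coincide, while the outgoing velocity differs from $v$ because the ray does cross the interaction region.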

With the same notations as in Sec. 2, we shall present one more result concerning $\tilde\varphi(t,x,v;\vec X)$.

Proposition 3.2.2. Suppose that $|v| > 2C_0$. Then

$$\tilde\varphi^1(t,x,v;\vec X)\cdot(|v|^{-1}v) > C_0, \qquad \text{for any } t \in \mathbb{R},\ x \in E_v.$$

Proof. Notice that $\tilde\varphi^1(0,x,v;\vec X) = v$. Write $\eta = |v|^{-1}v$. Then by assumption, $v\cdot\eta = |v| > 2C_0$. Let

$$\tau_1 = \inf\{t \ge 0;\ \tilde\varphi^1(t,x,v;\vec X)\cdot\eta \le C_0\}.$$

We show that $\tau_1 = +\infty$. Suppose $\tau_1 < +\infty$. Then $\tilde\varphi^1(\tau_1,x,v;\vec X)\cdot\eta = C_0$. By definition, we have

$$\big(\tilde\varphi^0(t,x,v;\vec X) - \tilde\varphi^0(s,x,v;\vec X)\big)\cdot\eta = \int_s^t \tilde\varphi^1(u,x,v;\vec X)\cdot\eta\,du > C_0|t-s|, \qquad \text{for any } 0 \le s < t \le \tau_1,$$

which implies that

$$\frac{d}{dt}\big(\tilde\varphi^0(t,x,v;\vec X)\cdot\eta\big) \ge C_0, \qquad 0 \le t \le \tau_1. \tag{3.5}$$

In particular, $\frac{d}{dt}(\tilde\varphi^0(t,x,v;\vec X)\cdot\eta) > 0$ for $0 \le t \le \tau_1$. Also, since

$$\tilde\varphi^1(\tau_1,x,v;\vec X) - v = -\sum_{i=1}^N \int_0^{\tau_1} \nabla U_i(\tilde\varphi^0(t,x,v;\vec X) - X_i)\,dt,$$

we have by definition that

$$-\sum_{i=1}^N \int_0^{\tau_1} \nabla U_i(\tilde\varphi^0(t,x,v;\vec X) - X_i)\cdot\eta\,dt = \tilde\varphi^1(\tau_1,x,v;\vec X)\cdot\eta - v\cdot\eta < C_0 - 2C_0 = -C_0.$$

Therefore, with the help of (3.5), we have

$$\begin{aligned}
C_0 &< \sum_{i=1}^N \int_0^{\tau_1} \nabla U_i(\tilde\varphi^0(t,x,v;\vec X) - X_i)\cdot\eta\,dt \\
&\le \frac{1}{C_0}\sum_{i=1}^N \int_0^{\tau_1} |\nabla U_i(\tilde\varphi^0(t,x,v;\vec X) - X_i)\cdot\eta| \cdot \frac{d}{dt}\big(\tilde\varphi^0(t,x,v;\vec X)\cdot\eta\big)\,dt \\
&= \frac{1}{C_0}\sum_{i=1}^N \int_{\{t\in[0,\tau_1],\ |\tilde\varphi^0(t,x,v;\vec X)\cdot\eta - X_i\cdot\eta| < R_i\}} |\nabla U_i(\tilde\varphi^0(t,x,v;\vec X) - X_i)\cdot\eta|\,\frac{d}{dt}\big(\tilde\varphi^0(t,x,v;\vec X)\cdot\eta\big)\,dt \\
&\le \frac{1}{C_0}\sum_{i=1}^N \|\nabla U_i\|_\infty \int_{\{t\in[0,\tau_1],\ |\tilde\varphi^0(t,x,v;\vec X)\cdot\eta - X_i\cdot\eta| < R_i\}} \frac{d}{dt}\big(\tilde\varphi^0(t,x,v;\vec X)\cdot\eta - X_i\cdot\eta\big)\,dt \\
&\le \frac{1}{C_0}\sum_{i=1}^N \|\nabla U_i\|_\infty\, 2R_i \\
&= C_0,
\end{aligned}$$

which is a contradiction. Therefore, $\tau_1 = +\infty$, i.e. $\tilde\varphi^1(t,x,v;\vec X)\cdot(|v|^{-1}v) > C_0$ for any $t \ge 0$. The assertion for $t < 0$ can be shown in the same way by considering $\tau_2 = \sup\{t < 0;\ \tilde\varphi^1(t,x,v;\vec X)\cdot\eta \le C_0\}$.

By Proposition 3.2.2, we get the following important result with respect to $\psi^0(t,x,v;\vec X)$.

Corollary 3.2.3. For any $(x,v) \in E$ with $|v| > 2C_0$, we have that

$$|\psi^0(t,x,v;\vec X) - X_i| > R_i, \qquad i = 1,\dots,N,$$

if $t \ge 2C_0^{-1}R(\vec X)$ or $t \le -C_0^{-1}R(\vec X)$.


Proof. Choose and fix any $(x,v) \in E$ with $|v| > 2C_0$, and let $\eta = |v|^{-1}v$. Then since $x\cdot v = 0$, we have that

$$\Psi^0(s,x,v)\cdot\eta = (x - sv)\cdot\eta = -sv\cdot\eta = -s|v|, \qquad \text{for any } s > 0.$$

Let $s_0 = \frac{R(\vec X)}{|v|}$ as before. Then $s_0 < C_0^{-1}R(\vec X)$, and

$$\tilde\varphi^0(0,\Psi(s_0,x,v);\vec X)\cdot\eta = \Psi^0(s_0,x,v)\cdot\eta = -s_0|v| = -R(\vec X). \tag{3.6}$$

Also, $|\Psi^1(s_0,x,v)| = |v| > 2C_0$ by assumption. Combining (3.4), Proposition 3.2.2 and (3.6), we get

$$\begin{aligned}
\psi^0(t,x,v;\vec X)\cdot\eta &= \tilde\varphi^0(t+s_0,\Psi(s_0,x,v);\vec X)\cdot\eta \\
&= \int_0^{t+s_0} \tilde\varphi^1(u,\Psi(s_0,x,v);\vec X)\cdot\eta\,du + \tilde\varphi^0(0,\Psi(s_0,x,v);\vec X)\cdot\eta \\
&> (t+s_0)C_0 - R(\vec X), \qquad \text{for any } t > -s_0.
\end{aligned}$$

In particular, if $t > 2C_0^{-1}R(\vec X)$, then

$$\psi^0(t,x,v;\vec X)\cdot\eta > (t+s_0)C_0 - R(\vec X) \ge R(\vec X).$$

In the same way, if $t < -C_0^{-1}R(\vec X)$, then $t + s_0 < 0$, so

$$\begin{aligned}
\psi^0(t,x,v;\vec X)\cdot\eta &= \tilde\varphi^0(t+s_0,\Psi(s_0,x,v);\vec X)\cdot\eta \\
&= -\int_{t+s_0}^0 \tilde\varphi^1(u,\Psi(s_0,x,v);\vec X)\cdot\eta\,du + \tilde\varphi^0(0,\Psi(s_0,x,v);\vec X)\cdot\eta \\
&< -C_0\cdot(-(t+s_0)) - R(\vec X) < -R(\vec X).
\end{aligned}$$

This completes the proof of our assertion.

Proposition 3.2.4. For any measurable $f : \mathbb{R}^{2d} \to [0,\infty)$ such that the integrand below is integrable, we have

$$\int_{\mathbb{R}^{2d}} f(x,v)\,\rho\Big(\frac12|v|^2 + \sum_{i=1}^N U_i(x - X_i)\Big)\,dx\,dv = \int_E \Big(\int_{-\infty}^\infty f(\psi(t,x,v;\vec X))\,dt\Big)\,\rho\Big(\frac12|v|^2\Big)\,\nu(dx,dv). \tag{3.7}$$

Remark 4. The integral $\int_{-\infty}^\infty f(\psi(t,x,v;\vec X))\,dt$ on the right-hand side of (3.7), although it might look like an infinite integral, is actually a finite one by Corollary 3.2.3.

Proof. By using approximation and taking the limit with the help of the convergence theorem, we may and do assume, without loss of generality, that there exists a


constant $\tilde R > 0$ such that $\mathrm{supp}(f) \subset \{(x,v);\ |x| + |v| \le \tilde R\}$. Let

$$T = 2C_0^{-1}(\tilde R + R(\vec X)),$$

and let

$$\tilde E(x,v) = \frac12|v|^2 + \sum_{i=1}^N U_i(x - X_i).$$

Then by Proposition 3.1.1 and a simple change of variables, we have

$$\begin{aligned}
\int_{\mathbb{R}^{2d}} f(x,v)\,\rho(\tilde E(x,v))\,dx\,dv
&= \int_{\mathbb{R}^{2d}} f(\tilde\varphi(T,x,v;\vec X))\,\rho(\tilde E(x,v))\,dx\,dv \\
&= \int_{\mathbb{R}\times E} f(\tilde\varphi(T,\Psi(t,x,v);\vec X))\,\rho(\tilde E(\Psi(t,x,v)))\,dt\,\nu(dx,dv).
\end{aligned} \tag{3.8}$$

Therefore, it suffices for us to show that the right-hand side of (3.8) is equal to

$$\int_{\mathbb{R}\times E} f(\psi(T-t,x,v;\vec X))\,\rho\Big(\frac12|v|^2\Big)\,dt\,\nu(dx,dv).$$

We only need to show that the integrands are equal, i.e. it suffices to show that

$$f(\tilde\varphi(T,\Psi(t,x,v);\vec X))\,\rho(\tilde E(\Psi(t,x,v))) = f(\psi(T-t,x,v;\vec X))\,\rho\Big(\frac12|v|^2\Big). \tag{3.9}$$

Let us prove this in what follows. We first show that if the left-hand side of (3.9) is not 0, then it is equal to the right-hand side. Assume that $f(\tilde\varphi(T,\Psi(t,x,v);\vec X))\,\rho(\tilde E(\Psi(t,x,v))) \ne 0$. Then $\rho(\tilde E(\Psi(t,x,v))) > 0$ implies by our assumption that $\tilde E(\Psi(t,x,v)) > e_0$, so $|v| > 2C_0$, hence by Proposition 3.2.2,

$$\tilde\varphi^1(s,\Psi(t,x,v);\vec X)\cdot\eta > C_0$$

for any $s \in \mathbb{R}$, where $\eta = |v|^{-1}v$. Therefore, since $(\tilde\varphi^0,\tilde\varphi^1)$ is the solution of (2.2), we have by definition that

$$\begin{aligned}
\big(\tilde\varphi^0(T,\Psi(t,x,v);\vec X) - \Psi^0(t,x,v)\big)\cdot\eta
&= \int_0^T \frac{d}{ds}\tilde\varphi^0(s,\Psi(t,x,v);\vec X)\,ds\cdot\eta \\
&= \int_0^T \tilde\varphi^1(s,\Psi(t,x,v);\vec X)\cdot\eta\,ds \\
&> T\cdot C_0 = 2(\tilde R + R(\vec X)),
\end{aligned} \tag{3.10}$$

where in the last step we used the definition of $T$.


We also have $f(\tilde\varphi(T,\Psi(t,x,v);\vec X)) > 0$ in addition, which gives us

$$|\tilde\varphi^0(T,\Psi(t,x,v);\vec X)\cdot\eta| \le |\tilde\varphi^0(T,\Psi(t,x,v);\vec X)| + |\tilde\varphi^1(T,\Psi(t,x,v);\vec X)| \le \tilde R. \tag{3.11}$$

Combining (3.10) with (3.11), and noticing that $x\cdot v = 0$ since $(x,v) \in E$, we get by the definition of $\eta$ that

$$\tilde R + 2R(\vec X) < -\Psi^0(t,x,v)\cdot\eta = -(x - tv)\cdot\eta = t|v|, \tag{3.12}$$

hence $t \ge \frac{\tilde R + 2R(\vec X)}{|v|} \ge s_0$. So by the definition of $\psi$, we get

$$\psi(T-t,x,v;\vec X) = \tilde\varphi(T-t+t,\Psi(t,x,v);\vec X) = \tilde\varphi(T,\Psi(t,x,v);\vec X).$$

Also, (3.12) gives us that $|\Psi^0(t,x,v)| = |x - tv| \ge t|v| \ge R(\vec X)$, so by the definition of $\tilde E$, we also get

$$\tilde E(\Psi(t,x,v)) = \frac12|v|^2.$$

This completes the proof of the fact that if the left-hand side of (3.9) is not 0, then it is equal to the right-hand side.

We next show the converse, i.e. we assume that the right-hand side of (3.9), $f(\psi(T-t,x,v;\vec X))\,\rho(\frac12|v|^2)$, is not 0, and show that it is equal to the left-hand side, $f(\tilde\varphi(T,\Psi(t,x,v);\vec X))\,\rho(\tilde E(\Psi(t,x,v)))$. It is sufficient to show that $t \ge s_0$ $(= \frac{R(\vec X)}{|v|})$. (Indeed, if $t \ge s_0$, then by using $x\cdot v = 0$, we get $|x - tv| \ge t|v| \ge R(\vec X)$, hence $\tilde E(\Psi(t,x,v)) = \frac12|v|^2$ by definition. Also, since $t \ge s_0$, we have by the definition of $\psi$ that $\psi(T-t,x,v;\vec X) = \tilde\varphi(T-t+t,\Psi(t,x,v);\vec X) = \tilde\varphi(T,\Psi(t,x,v);\vec X)$, which will complete our proof.) Since $\rho(\frac12|v|^2) > 0$, we have $\frac12|v|^2 > 2C_0^2$, hence $|v| > 2C_0$, which in turn by Proposition 3.2.2 gives us that

$$\tilde\varphi^1(u,\Psi(s,x,v);\vec X)\cdot\eta > C_0 \tag{3.13}$$

for any $u, s \in \mathbb{R}$ and $x \in E_v$. If $t \ge T$, then by the definition of $T$, since $|v| > 2C_0$, we have

$$t \ge T = \frac{2}{C_0}(\tilde R + R(\vec X)) > \frac{4}{|v|}(\tilde R + R(\vec X)) > \frac{R(\vec X)}{|v|} = s_0.$$

If $t < T$, then we have by (3.13) and the definition of $T$ that for any $r > 0$

$$\begin{aligned}
\big(\tilde\varphi^0(T-t+r,\Psi(r,x,v);\vec X) - \Psi^0(r,x,v)\big)\cdot\eta
&= \int_0^{T-t+r} \tilde\varphi^1(u,\Psi(r,x,v);\vec X)\cdot\eta\,du \\
&> (T-t+r)\cdot C_0 = 2\tilde R + 2R(\vec X) + (r-t)C_0.
\end{aligned} \tag{3.14}$$

Also, since $f(\psi(T-t,x,v;\vec X)) > 0$, we have

$$|\psi^0(T-t,x,v;\vec X)| + |\psi^1(T-t,x,v;\vec X)| \le \tilde R.$$


Therefore, we have for any $r \ge s_0$

$$|\tilde\varphi^0(T-t+r,\Psi(r,x,v);\vec X)| = |\psi^0(T-t,x,v;\vec X)| \le \tilde R. \tag{3.15}$$

Combining (3.14) and (3.15), we get

$$(r|v| =)\ -\Psi^0(r,x,v)\cdot\eta > \tilde R + 2R(\vec X) + (r-t)C_0, \qquad \text{for any } r \ge s_0.$$

Applying the above to $r = s_0 = \frac{R(\vec X)}{|v|}$, we get

$$R(\vec X) > \tilde R + 2R(\vec X) + (s_0 - t)C_0.$$

Therefore, $t > s_0$. This completes our proof.

3.3. Existence and uniqueness of the solution

In this subsection, we prove the first assertion of Theorem 2.0.1, the almost sure unique existence of the solution of the considered infinite system of ODEs for any fixed $m > 0$. Recall that by Sec. 3.1, we have already converted the problem into (3.3), which uses the ray representation. In the following we shall use $C$, $\tilde C$, $C_1$, etc., to denote constants which may be different in different places.

For any open subset $G \subset \mathbb{R}\times E$, let $\theta_G : \mathrm{Conf}(\mathbb{R}\times E) \to \mathrm{Conf}(\mathbb{R}\times E)$, $\omega \mapsto \theta_G(\omega) = \omega\cap G$. Then $\theta_G$ is $\mathcal{E}_0/\mathcal{E}_0$-measurable. Here $\mathcal{E}_0$ is the σ-algebra on $\mathcal{O}(\mathbb{R}\times E) = \{A \subset \mathbb{R}\times E \mid A \ne \emptyset,\ A \text{ is closed}\}$, generated by $\{\{C \in \mathcal{O}(\mathbb{R}\times E);\ C\cap A = \emptyset\};\ A \text{ is open in } \mathbb{R}\times E\}$. Also, let $\mathcal{F}_G = \sigma\{X_K;\ K \subset G,\ K \text{ is compact}\}\vee\aleph$. Here $X_K$ is the random variable defined by $X_K(\omega) = \mu_\omega(K)\ (= \sharp(\omega\cap K))$, $\omega \in \Omega$, and $\aleph$ stands for the set of null sets. Then it is trivial that $\{\mathcal{F}_G \mid G \text{ is open}\}$ is an increasing family of σ-algebras.

Let Fin$(\mathbb{R}\times E)$ denote the set of non-empty finite subsets of $\mathbb{R}\times E$. It is easy to see that if $\omega \in \mathrm{Fin}(\mathbb{R}\times E)$, then (3.3) has a unique solution. In the following, we extend this unique existence of a solution for (3.3) to $P_m$-almost every $\omega$.

Fix any $T > 0$ as before. Let $R_0$ and $\tau$ be as given at the end of Sec. 2, set

$$G_n = \{(t,x,v) \in \mathbb{R}\times E;\ |x| < R_0,\ |t| < T + m^{1/2}\tau\},$$

and let $\theta_n = \theta_{G_n}$.

Lemma 3.3.1. $\theta_n\omega \in \mathrm{Fin}(\mathbb{R}\times E)$ for $P_m$-a.e. $\omega$.

Proof. Let $c = \sum_{i=1}^N \|U_i\|_\infty$. Then by definition and assumption,

$$\begin{aligned}
\lambda_m(G_n) &= \int_{\mathbb{R}\times E} 1_{\{|x| < R_0,\ |t| < T + m^{1/2}\tau\}}\, m^{-1}\rho\Big(\frac12|v|^2 + \sum_{i=1}^N U_i(x - m^{-1/2}tv - X_{i,0})\Big)\,dt\,\nu(dx,dv) \\
&\le (2R_0)^{d-1}\, 2(T + m^{1/2}\tau)\, m^{-1} \int_{\mathbb{R}^d} |v|\,\rho_c\Big(\frac12|v|^2\Big)\,dv \\
&< \infty.
\end{aligned} \tag{3.16}$$

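So $E^{P_m}[\sharp(\theta_n\omega)] = E^{P_m}[\mu_\omega(G_n)] = \lambda_m(G_n) < \infty$.

By Lemma 3.3.1, we have that $(\vec X(t,\theta_n\omega), \vec V(t,\theta_n\omega))$ is well-defined for $P_m$-almost every $\omega$. Next, for any $t \in [0,T]$, we define $S_t : \mathrm{Fin}(\mathbb{R}\times E) \to \mathcal{O}(\mathbb{R}\times E)$ as

$$S_t(\omega) = \Big\{(u,x,v) \in \mathbb{R}\times E;\ \min_{i=1,\dots,N}\Big(\min_{0\le s\le t} |X_i(s,\omega) - (x - (u-s)m^{-1/2}v)| - \Big(R_i + \frac12\Big)\Big) \le 0\Big\}$$

for any $\omega \in \mathrm{Fin}(\mathbb{R}\times E)$. Then we have the following:

Lemma 3.3.2. For any open set $G$ and $\omega \in \mathrm{Fin}(\mathbb{R}\times E)$, we have that

$$\{S_t(\omega) \subset G\} = \{S_t(\theta_G\omega) \subset G\}.$$

Proof. Choose and fix any $\omega \in \mathrm{Fin}(\mathbb{R}\times E)$. We give the proof of the part $\{S_t(\omega) \subset G\} \subset \{S_t(\theta_G\omega) \subset G\}$; the opposite inclusion can be proven in exactly the same way. Notice that by definition,

$$\begin{aligned}
(u,x,v) \notin S_t(\omega) &\Rightarrow |X_i(s,\omega) - (x - um^{-1/2}v + sm^{-1/2}v)| \ge R_i + \frac12, \qquad \forall s \in [0,t],\ i = 1,\dots,N \\
&\Rightarrow \begin{cases} x(s, x - um^{-1/2}v, m^{-1/2}v; \omega) = x - um^{-1/2}v + sm^{-1/2}v, \\ v(s, x - um^{-1/2}v, m^{-1/2}v; \omega) = m^{-1/2}v, \end{cases} \qquad \text{for any } s \in [0,t].
\end{aligned}$$

So

$$\begin{aligned}
(u,x,v) \notin S_t(\omega) &\Rightarrow |X_i(s;\omega) - x(s, x - um^{-1/2}v, m^{-1/2}v; \omega)| \ge R_i + \frac12, \qquad \text{for any } s \in [0,t], \\
&\Rightarrow \nabla U_i\big(X_i(s;\omega) - x(s, x - um^{-1/2}v, m^{-1/2}v; \omega)\big) = 0, \qquad \text{for any } s \in [0,t],
\end{aligned} \tag{3.17}$$

for $i = 1,\dots,N$. Moreover, it is trivial to see that

$$(u,x,v) \in G \Rightarrow \mu_\omega(du,dx,dv) = \mu_{\theta_G\omega}(du,dx,dv). \tag{3.18}$$

(3.17) and (3.18) combined with the definition (3.3) imply

$$S_t(\omega) \subset G \Rightarrow (\vec X(s,\omega), \vec V(s,\omega)) = (\vec X(s,\theta_G\omega), \vec V(s,\theta_G\omega)), \qquad \text{for any } s \in [0,t], \tag{3.19}$$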


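(as long as $\omega \in \mathrm{Fin}(\mathbb{R}\times E)$). Therefore, $S_t(\omega) \subset G \Rightarrow S_t(\theta_G\omega) \subset G$.

We next deal with general $\omega \in \mathrm{Conf}(\mathbb{R}\times E)$. As a special case of Lemma 3.3.2, we have the following.

Corollary 3.3.3.

$$\{S_t(\theta_k\omega) \subset G_n\} = \{S_t(\theta_n\omega) \subset G_n\}, \qquad P_m\text{-a.e. } \omega, \quad \text{for any } k > n.$$

By Lemmas 3.3.1 and 3.3.2, we have that

$$\{S_t(\theta_n\omega) \subset G\} = \{S_t(\theta_{G_n\cap G}\,\omega) \subset G\}, \qquad P_m\text{-a.e. } \omega,$$

so by the definition of $\mathcal{F}_\cdot$, we have $\{S_t(\theta_n\omega) \subset G\} \in \mathcal{F}_{G_n\cap G} \subset \mathcal{F}_G$ for any open $G \subset \mathbb{R}\times E$, i.e. $S_t(\theta_n\,\cdot)$ is a $\{\mathcal{F}_G\}$-stopping time. Here, a map $T : \Omega \to \mathcal{O}(\mathbb{R}\times E)$ is, by definition, called an $\mathcal{F}$-stopping time if $T$ is $\mathcal{B}(\Omega)/\mathcal{E}_0$-measurable and $\{\omega \in \Omega;\ T(\omega) \subset G\} \in \mathcal{F}_G$ for any $\mathcal{F}$-regular open set $G$. For any $\omega \in \Omega$, let

$$\tau_n(\omega) = \inf\Big\{t \ge 0;\ \max_{i=1,\dots,N} |V_i(t,\theta_n\omega)| > n\Big\} \wedge T.$$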

Lemma 3.3.4. For any $n \in \mathbb{N}$, there exists a unique solution to (3.3) for $P_m$-a.e. $\omega$ satisfying $\tau_n(\omega) = T$.

Proof. We first notice that

$$\tau_n(\omega) = T \Rightarrow S_T(\theta_n\omega) \subset G_n. \tag{3.20}$$

Indeed, if $\tau_n(\omega) = T$, then $|V_i(t,\theta_n\omega)| \le n$ for any $t \in [0,T]$ and $i = 1,\dots,N$, hence $|X_i(t,\theta_n\omega)| \le nT + |X_{i,0}|$ for any $t \in [0,T]$ and $i = 1,\dots,N$. Assume $(u,x,v) \notin G_n$. Then either $|x| \ge R_0 + nT$ or $|u| \ge m^{1/2}C_0^{-1}(R_0 + nT) + T$. If $|x| \ge R_0 + nT$, then $|x + rv| \ge |x| \ge R_0 + nT$ for any $r \in \mathbb{R}$, so $|X_i(s,\theta_n\omega) - (x - um^{-1/2}v + sm^{-1/2}v)| \ge R_i + \frac12$ for any $s \in [0,T]$, which implies that $(u,x,v) \notin S_T(\theta_n\omega)$. If $|u| \ge m^{1/2}C_0^{-1}(R_0 + nT) + T$, then since $|v| > C_0$ $P_m$-almost surely, for any $s \in [0,T]$, we have $|x - um^{-1/2}v + sm^{-1/2}v| \ge C_0^{-1}(R_0 + nT)|v| \ge R_0 + nT$, so in this case we also have $|X_i(s,\theta_n\omega) - (x - um^{-1/2}v + sm^{-1/2}v)| \ge R_i + \frac12$ for any $s \in [0,T]$, which implies that $(u,x,v) \notin S_T(\theta_n\omega)$. In conclusion, we have in either case that $(u,x,v) \notin S_T(\theta_n\omega)$. This completes the proof of (3.20).

Now, we are ready to show that the desired solution is well-defined almost surely on the set $\tau_n(\omega) = T$ for any $n \in \mathbb{N}$. Indeed, if $\tau_n(\omega) = T$, then we have by (3.20), Corollary 3.3.3 and (3.19) that

$$(\vec X(t,\theta_k\omega), \vec V(t,\theta_k\omega)) = (\vec X(t,\theta_n\omega), \vec V(t,\theta_n\omega)), \qquad \text{for any } t \in [0,T] \text{ and } k \ge n,$$


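so we can define

$$\begin{aligned}
(\vec X(t,\omega), \vec V(t,\omega)) &= (\vec X(t,\theta_n\omega), \vec V(t,\theta_n\omega)), \\
(x(t,x,v,\omega), v(t,x,v,\omega)) &= (x(t,x,v,\theta_n\omega), v(t,x,v,\theta_n\omega)),
\end{aligned}$$

which exists for $P_m$-almost every $\omega$ satisfying our condition by Lemma 3.3.1. Then $(\vec X(t,\omega), \vec V(t,\omega), x(t,x,v,\omega), v(t,x,v,\omega))$ satisfies (3.3).

Notice that $\tau_n(\omega) = T \Rightarrow \tau_{n+1}(\omega) = T$. Therefore, to complete the proof of Theorem 2.0.1(1), it suffices to prove the following:

Lemma 3.3.5.

$$P\Big(\bigcup_{n=1}^\infty \{\tau_n = T\}\Big) = 1.$$

We divide the proof of Lemma 3.3.5 into several steps.

Lemma 3.3.6. There exist constants $C_1, C_2 > 0$ such that

$$\sum_{i=1}^N \frac12 M_i|V_i(t,\theta_n\omega)|^2 \le C_1 + C_2 \int_{S_t(\theta_n\omega)} 1_{G_n}(u,x,v)(1 + |v|^2)\,\mu_\omega(du,dx,dv),$$

for any $\theta_n\omega \in \mathrm{Fin}(\mathbb{R}\times E)$.

Proof. For any $\theta_n\omega \in \mathrm{Fin}(\mathbb{R}\times E)$, we have by the invariance of the energy

$$\begin{aligned}
&\sum_{i=1}^N \frac12 M_i|V_i(t,\theta_n\omega)|^2 + \frac m2 \int_{\mathbb{R}\times E} |v(t, x - um^{-1/2}v, m^{-1/2}v; \theta_n\omega)|^2\,\mu_{\theta_n\omega}(du,dx,dv) \\
&\qquad + \sum_{i=1}^N \int_{\mathbb{R}\times E} U_i\big(X_i(t,\theta_n\omega) - x(t, x - um^{-1/2}v, m^{-1/2}v; \theta_n\omega)\big)\,\mu_{\theta_n\omega}(du,dx,dv) \\
&= \sum_{i=1}^N \frac12 M_i|V_{i,0}|^2 + \frac m2 \int_{\mathbb{R}\times E} |m^{-1/2}v|^2\,\mu_{\theta_n\omega}(du,dx,dv) \\
&\qquad + \sum_{i=1}^N \int_{\mathbb{R}\times E} U_i\big(X_{i,0} - (x - um^{-1/2}v)\big)\,\mu_{\theta_n\omega}(du,dx,dv).
\end{aligned} \tag{3.21}$$

If $(u,x,v) \notin S_t(\theta_n\omega)$, then $|X_i(s,\theta_n\omega) - (x - (u-s)m^{-1/2}v)| > R_i + \frac12$ for any $s \in [0,t]$ and $i = 1,\dots,N$, so by (3.3), $v(t, x - um^{-1/2}v, m^{-1/2}v; \theta_n\omega) = m^{-1/2}v$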


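and $U_i\big(X_i(t,\theta_n\omega) - x(t, x - um^{-1/2}v, m^{-1/2}v; \theta_n\omega)\big) = 0$. Therefore, (3.21) implies

$$\begin{aligned}
&\sum_{i=1}^N \frac12 M_i|V_i(t,\theta_n\omega)|^2 + \frac m2 \int_{S_t(\theta_n\omega)} |v(t, x - um^{-1/2}v, m^{-1/2}v; \theta_n\omega)|^2\,\mu_{\theta_n\omega}(du,dx,dv) \\
&\qquad + \sum_{i=1}^N \int_{S_t(\theta_n\omega)} U_i\big(X_i(t,\theta_n\omega) - x(t, x - um^{-1/2}v, m^{-1/2}v; \theta_n\omega)\big)\,\mu_{\theta_n\omega}(du,dx,dv) \\
&= \sum_{i=1}^N \frac12 M_i|V_{i,0}|^2 + \frac m2 \int_{S_t(\theta_n\omega)} |m^{-1/2}v|^2\,\mu_{\theta_n\omega}(du,dx,dv) \\
&\qquad + \sum_{i=1}^N \int_{S_t(\theta_n\omega)} U_i\big(X_{i,0} - (x - um^{-1/2}v)\big)\,\mu_{\theta_n\omega}(du,dx,dv).
\end{aligned}$$

So with $C_1 := \sum_{i=1}^N \frac12 M_i|V_{i,0}|^2$ and $C_2 := 2\sum_{i=1}^N \|U_i\|_\infty + \frac m2$, we get

$$\begin{aligned}
\sum_{i=1}^N \frac12 M_i|V_i(t,\theta_n\omega)|^2 &\le C_1 + C_2 \int_{S_t(\theta_n\omega)} (1 + |v|^2)\,\mu_{\theta_n\omega}(du,dx,dv) \\
&= C_1 + C_2 \int_{S_t(\theta_n\omega)} 1_{G_n}(u,x,v)(1 + |v|^2)\,\mu_\omega(du,dx,dv).
\end{aligned} \tag{3.22}$$

Let us prepare for later use the following general result with respect to stopping times and Poisson point processes.

Lemma 3.3.7. (1) Let $f : \mathbb{R}\times E \to [0,\infty)$ be measurable and let $S$ be a stopping time. Then

$$E^{P_m}\Big[\int_{S(\omega)} f\,d\mu_\omega\Big] = E^{P_m}\Big[\int_{S(\omega)} f\,d\lambda_m\Big].$$

(2) Let $f : \mathbb{R}\times E \to [0,\infty)$ be measurable and $S$, $T$ be two stopping times satisfying (i) $T(\omega) \subset S(\omega)$ for any $\omega \in \Omega$, (ii) $E^{P_m}[\int_{S(\omega)} |f|\,d\lambda_m] < \infty$. Then

$$E\Big[\int_{S(\omega)} f\,(d\mu_\omega - d\lambda_m)\ \Big|\ \mathcal{F}_T\Big] = \int_{T(\omega)} f\,(d\mu_\omega - d\lambda_m).$$

Proof. As the result is already known, we give a sketch only. (See, e.g., [8, 12] for related results.)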


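We first have

$$E^{P_m}[\mu_\omega(A\setminus S) \mid \mathcal{F}_S] = \lambda_m(A\setminus S), \qquad \forall A \in \mathcal{B}(\mathbb{R}\times E).$$

This is heuristically based on the definition of the Poisson point process and the independence of $\mathcal{F}_S$ and $\mathcal{F}_{A\setminus S}$, and can be proved rigorously, for example, first for non-random $S$, and then extended to stopping times in a routine way. So for positive simple functions $f$ we have

$$E^{P_m}\Big[\int_{(\mathbb{R}\times E)\setminus S} f\,d\mu_\omega\ \Big|\ \mathcal{F}_S\Big] = \int_{(\mathbb{R}\times E)\setminus S} f\,d\lambda_m. \tag{3.23}$$

With the help of the monotone convergence theorem, this can be extended to any positive measurable function $f$ in a routine way. Therefore,

$$\begin{aligned}
E^{P_m}\Big[\int_{S(\omega)} f\,d\mu_\omega\Big] &= E^{P_m}\Big[\int_{\mathbb{R}\times E} f\,d\mu_\omega\Big] - E^{P_m}\Big[\int_{(\mathbb{R}\times E)\setminus S(\omega)} f\,d\mu_\omega\Big] \\
&= \int_{\mathbb{R}\times E} f\,d\lambda_m - E^{P_m}\Big[\int_{(\mathbb{R}\times E)\setminus S(\omega)} f\,d\lambda_m\Big] \\
&= E^{P_m}\Big[\int_{S(\omega)} f\,d\lambda_m\Big].
\end{aligned}$$

For the second assertion, (3.23) implies that $E[\int_{\mathbb{R}\times E} f\,(d\mu_\omega - d\lambda_m) \mid \mathcal{F}_S] = \int_{S(\omega)} f\,(d\mu_\omega - d\lambda_m)$, hence

$$\begin{aligned}
E\Big[\int_{S(\omega)} f\,(d\mu_\omega - d\lambda_m)\ \Big|\ \mathcal{F}_T\Big] &= E\Big[E\Big[\int_{\mathbb{R}\times E} f\,(d\mu_\omega - d\lambda_m)\ \Big|\ \mathcal{F}_S\Big]\ \Big|\ \mathcal{F}_T\Big] \\
&= E\Big[\int_{\mathbb{R}\times E} f\,(d\mu_\omega - d\lambda_m)\ \Big|\ \mathcal{F}_T\Big] \\
&= \int_{T(\omega)} f\,(d\mu_\omega - d\lambda_m).
\end{aligned}$$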
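The simplest instance of Lemma 3.3.7(1), with a non-random set $S$, is the classical identity $E[\int f\,d\mu_\omega] = \int f\,d\lambda$ for a Poisson point process. The following Monte Carlo sketch illustrates it for a homogeneous Poisson process on $[0,1]$ standing in for $P_m$; the rate, the function `f`, the seed and the sample size are illustrative assumptions only.

```python
import random

def sample_poisson_points(rate, T, rng):
    """Sample a homogeneous Poisson point process on [0, T] via i.i.d. exponential gaps."""
    pts = []
    t = rng.expovariate(rate)
    while t <= T:
        pts.append(t)
        t += rng.expovariate(rate)
    return pts

def f(t):
    """Non-negative test function (hypothetical choice)."""
    return t * t

rng = random.Random(0)
rate, T, n_samples = 5.0, 1.0, 20000

# Monte Carlo estimate of E[ integral of f d(mu_omega) ] = E[ sum of f over the points ].
mc = sum(sum(f(p) for p in sample_poisson_points(rate, T, rng))
         for _ in range(n_samples)) / n_samples

# Integral of f against the intensity measure: rate * int_0^T t^2 dt.
exact = rate * T ** 3 / 3.0
print(mc, exact)
```

The two numbers agree up to Monte Carlo error (the standard error here is of order $10^{-2}$); the content of the lemma is that the same identity persists when the fixed set is replaced by a set-valued stopping time.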

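Since $S_t(\theta_n\,\cdot)$ is a $\{\mathcal{F}_G\}$-stopping time, $\mathcal{F}_{S_{t+\varepsilon}(\theta_n\cdot)}$ is well-defined for any $\varepsilon > 0$ small enough. Let $\mathcal{F}_t^{(n)} = \bigcap_{\varepsilon>0} \mathcal{F}_{S_{t+\varepsilon}(\theta_n\cdot)}$, $0 \le t < T$. Then $\tau_n$ is a stopping time with respect to the filtration $\{\mathcal{F}_t^{(n)}\}_{t\in[0,T)}$. Let

$$M_t^{(n)} = \int_{S_t(\theta_n\omega)} 1_{G_n}(u,x,v)(1 + |v|^2)\big(\mu_\omega(du,dx,dv) - \lambda(du,dx,dv)\big).$$

Lemma 3.3.8. $\{M_t^{(n)}\}_{t\in[0,T]}$ is a $\{\mathcal{F}_t^{(n)}\}_{t\in[0,T]}$-martingale with mean 0.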


Proof. Notice that $S_t(\theta_n\omega)$ is monotone non-decreasing with respect to $t$, and by assumption, with $c = \sum_{i=1}^N \|U_i\|_\infty$, we have

$$\int_{\mathbb{R}\times E} 1_{G_n}(u,x,v)(1 + |v|^2)\,\lambda(du,dx,dv) \le 2\big(T + m^{1/2}C_0^{-1}(R_0 + nT)\big)(R_0 + nT)^{d-1}m^{-1} \int_{\mathbb{R}^d} (1 + |v|^2)\,\rho_c\Big(\frac12|v|^2\Big)|v|\,dv,$$

which is finite (but may depend on $n$) by assumption. This combined with Lemma 3.3.7 gives us our assertion.

Proof of Lemma 3.3.5. We have by Lemma 3.3.8 that $E[M^{(n)}_{\tau_n}] = 0$. So by Lemma 3.3.6,

$$\begin{aligned}
\sum_{i=1}^N \frac12 M_i E[|V_i(\tau_n,\theta_n\omega)|^2] &\le C_1 + C_2 E\Big[\int_{S_{\tau_n}(\theta_n\omega)} 1_{G_n}(u,x,v)(1 + |v|^2)\,\lambda(du,dx,dv)\Big] \\
&\le C_1 + C_2 E\Big[\int_{S_{\tau_n}(\theta_n\omega)} (1 + |v|^2)\,\lambda(du,dx,dv)\Big].
\end{aligned}$$

So with $C_3 := (\min_i \frac{M_i}{2})^{-1}$, we have

$$\begin{aligned}
P[\tau_n < T] &= P\Big[\max_{i=1,\dots,N} |V_i(\tau_n,\theta_n\omega)| \ge n\Big] \\
&\le \frac{C_3}{n^2} E\Big[\sum_{i=1}^N \frac12 M_i|V_i(\tau_n,\theta_n\omega)|^2\Big] \\
&\le \frac{1}{n^2}C_1C_3 + \frac{1}{n^2}C_2C_3 E\Big[\int_{S_{\tau_n}(\theta_n\omega)} (1 + |v|^2)\,\lambda(du,dx,dv)\Big].
\end{aligned} \tag{3.24}$$

Let us estimate the expectation on the right-hand side of (3.24). Let $S_d(r)$ denote the volume of the ball in $\mathbb{R}^d$ with radius $r$, and let $C_1' = \sum_{i=1}^N m^{1/2}S_d(R_i + \frac12)$, $C_2' = \sum_{i=1}^N m^{1/2}\,T\,S_{d-1}(R_i + \frac12)$. Then

$$\begin{aligned}
&|\{(u,x) \in \mathbb{R}\times E_v;\ (u,x,v) \in S_t(\theta_n\omega)\}| \\
&= \Big|\Big\{(u,x) \in \mathbb{R}\times E_v;\ \exists i = 1,\dots,N \text{ s.t. } \min_{0\le s\le t}\Big|(x - um^{-1/2}v) + \int_0^s (m^{-1/2}v - V_i(r,\theta_n\omega))\,dr\Big| \le R_i + \frac12\Big\}\Big| \\
&\le \sum_{i=1}^N m^{1/2}|v|^{-1}\Big|\Big\{y \in \mathbb{R}^d;\ \min_{0\le s\le t}\Big|y + \int_0^s (m^{-1/2}v - V_i(r,\theta_n\omega))\,dr\Big| \le R_i + \frac12\Big\}\Big| \\
&\le \sum_{i=1}^N m^{1/2}|v|^{-1}\Big(T\big(m^{-1/2}|v| + \max_{0\le s\le t}|V_i(s,\theta_n\omega)|\big)S_{d-1}\Big(R_i + \frac12\Big) + S_d\Big(R_i + \frac12\Big)\Big) \\
&= |v|^{-1}\Big(C_1' + C_2'\big(m^{-1/2}|v| + \max_{0\le s\le t}|V_i(s,\theta_n\omega)|\big)\Big).
\end{aligned}$$

Also, notice that $|V_i(s,\theta_n\omega)| \le n$ for any $s \in [0,\tau_n]$. Therefore, with $C_1'' = m^{-1}\int_{\mathbb{R}^d} (1 + |v|^2)(C_1' + C_2'm^{-1/2}|v|)\,\rho_c(\frac12|v|^2)\,dv$ and $C_2'' = m^{-1}C_2'\int_{\mathbb{R}^d} (1 + |v|^2)\,\rho_c(\frac12|v|^2)\,dv$, which are finite by assumption, we have

$$\begin{aligned}
\int_{S_{\tau_n}(\theta_n\omega)} (1 + |v|^2)\,\lambda(du,dx,dv) &\le \int_{\mathbb{R}^d} (1 + |v|^2)\,m^{-1}\rho_c\Big(\frac12|v|^2\Big)|v|\,\big|\{(u,x) \in \mathbb{R}\times E_v;\ (u,x,v) \in S_t(\theta_n\omega)\}\big|\,dv \\
&\le \int_{\mathbb{R}^d} (1 + |v|^2)\,m^{-1}\rho_c\Big(\frac12|v|^2\Big)\big(C_1' + C_2'(m^{-1/2}|v| + n)\big)\,dv \\
&= C_1'' + C_2''n.
\end{aligned}$$

This combined with (3.24) implies

$$P(\tau_n < T) \le \frac{1}{n^2}C_1C_3 + \frac{1}{n^2}C_2C_3(C_1'' + C_2''n) \to 0, \qquad \text{as } n \to \infty,$$

which completes the proof.

As mentioned in Sec. 2, we can also get the unique existence of the solution of (2.1) under the following condition (and without any further assumption such as (A1) or (A2)): $d \ge 2$ and $\int_{-\infty}^\infty (1 + |s|)^d\rho(s)\,ds < \infty$. (See Proposition 3.3.9.) This result is not necessary for the rest of this paper, but we include it here since the condition is very simple: the intensity function $\rho$ decreases rapidly enough at infinity.

Proposition 3.3.9. Assume that $d \ge 2$ and

$$\int_{-\infty}^\infty (1 + |s|)^d\rho(s)\,ds < \infty; \tag{3.25}$$

then there exists a unique solution to (2.1) for $\tilde P_m$-almost every $\tilde\omega$.

Notice that neither does Theorem 2.0.1(1) include Proposition 3.3.9 nor vice versa.


Proof. The proof is almost the same as the one we just used for Theorem 2.0.1(1), although we do not use the ray representation this time. The point is that in the proof of Theorem 2.0.1(1), the assumption (A2) was only used to estimate several integrals with respect to $\rho(\frac12|v|^2 + \sum_{i=1}^N U_i(x - m^{-1/2}tv - X_{i,0}))$ (e.g., (3.16)), while if we do not use the ray representation, then the corresponding term $\rho(\frac12|v|^2 + \sum_{i=1}^N U_i(x - X_{i,0}))$ does not depend on $v$, so by the variable change $r = |v|$ and a suitable shift we can get similar estimates without the help of (A2).

We give a brief sketch of the proof in the following. Unless otherwise specified, the notations have the same meanings as in the proof of Theorem 2.0.1(1). First notice that for any $\alpha \ge 0$, we have $\alpha + \frac d2 - 1 \ge 0$ since $d \ge 2$, so for any $c \in \mathbb{R}$, we have by assumption and a simple calculation that

$$\int_{\mathbb{R}^d} |v|^{2\alpha}\rho\Big(\frac m2|v|^2 + c\Big)\,dv \le C_{d,m}\int_{-\infty}^\infty (|c| + |s|)^{\alpha + \frac d2 - 1}\rho(s)\,ds \tag{3.26}$$

for some constant $C_{d,m} > 0$ independent of $c$. So

$$\int_{\mathbb{R}^d} |v|^{2\alpha}\rho\Big(\frac m2|v|^2 + c\Big)\,dv < \infty, \qquad \text{if } 0 \le \alpha \le \frac d2 + 1. \tag{3.27}$$

Let $G_n = \{(x,v) \in \mathbb{R}^{2d};\ |x| < R_0 + nT + |v|T\}$, and let $\theta_n = \theta_{G_n}$. Then since $|\{x;\ (x,v) \in G_n\}| = 2^d(R_0 + nT + T|v|)^d \le 4^d(R_0 + nT)^d + 4^dT^d|v|^d$, with $C = \sum_{i=1}^N \|U_i\|_\infty$, we have the following by (3.26) and our assumption:

$$\begin{aligned}
m^{-\frac{d-1}{2}}\tilde\lambda_m(G_n) &= \int_{G_n} \rho\Big(\frac m2|v|^2 + \sum_{i=1}^N U_i(x - X_{i,0})\Big)\,dx\,dv \\
&\le \int_{|x|\le R_0} \rho\Big(\frac m2|v|^2 + \sum_{i=1}^N U_i(x - X_{i,0})\Big)\,dx\,dv + \int_{G_n\cap\{|x|>R_0\}} \rho\Big(\frac m2|v|^2\Big)\,dx\,dv \\
&\le (2R_0)^dC_{d,m}\int_{-\infty}^\infty (C + |s|)^{\frac d2 - 1}\rho(s)\,ds + 4^d(R_0 + nT)^d\int_{\mathbb{R}^d} \rho\Big(\frac m2|v|^2\Big)\,dv + 4^dT^d\int_{\mathbb{R}^d} |v|^d\rho\Big(\frac m2|v|^2\Big)\,dv \\
&\le (2R_0)^dC_{d,m}\int_{-\infty}^\infty (C + |s|)^{\frac d2 - 1}\rho(s)\,ds + 4^d(R_0 + nT)^dC_{d,m}\int_{-\infty}^\infty |s|^{\frac d2 - 1}\rho(s)\,ds + 4^dT^dC_{d,m}\int_{-\infty}^\infty |s|^{d-1}\rho(s)\,ds \\
&< \infty.
\end{aligned}$$

So the conclusion of Lemma 3.3.1 still holds in our case. The proof of Theorem 2.0.1(1) until Lemma 3.3.8 is valid in the present case, just with trivial modifications such as $\mathbb{R}\times E$ replaced by $\mathbb{R}^{2d}$, and with the definition of $S_t$ modified as $S_t : \mathrm{Fin}(\mathbb{R}^{2d}) \to \mathcal{O}(\mathbb{R}^{2d})$, given by

$$S_t(\tilde\omega) = \Big\{(x,v) \in \mathbb{R}^{2d};\ \min_{i=1,\dots,N}\Big(\min_{0\le s\le t} |X_i(s,\tilde\omega) - (x + sv)| - \Big(R_i + \frac12\Big)\Big) \le 0\Big\}$$

for any $\tilde\omega \in \mathrm{Fin}(\mathbb{R}^{2d})$. The fact that $\int_{\mathbb{R}^{2d}} 1_{G_n}(x,v)(1 + |v|^2)\,\tilde\lambda_m(dx,dv) < \infty$ in the proof of Lemma 3.3.8 is now proven as follows: since $|\{x;\ (x,v) \in G_n\}| = 2^d(R_0 + nT + T|v|)^d$ and there exists a constant $C_2 > 0$ (depending on $R_0, n, T, d$) such that $2^d(R_0 + nT + T|v|)^d(1 + |v|^2) \le C_2(1 + |v|^{d+2})$, we get by (3.26) and our assumption that

$$\begin{aligned}
&m^{\frac{1-d}{2}}\int_{\mathbb{R}^{2d}} 1_{G_n}(x,v)(1 + |v|^2)\,\tilde\lambda_m(dx,dv) \\
&\le \int_{|x|\le R_0} (1 + |v|^2)\,\rho\Big(\frac m2|v|^2 + \sum_{i=1}^N U_i(x - X_{i,0})\Big)\,dx\,dv + \int_{G_n\cap\{|x|>R_0\}} (1 + |v|^2)\,\rho\Big(\frac m2|v|^2\Big)\,dx\,dv \\
&\le \int_{|x|\le R_0} dx\int_{\mathbb{R}^d} (1 + |v|^2)\,\rho\Big(\frac m2|v|^2 + \sum_{i=1}^N U_i(x - X_{i,0})\Big)\,dv + \int_{\mathbb{R}^d} C_2(1 + |v|^{d+2})\,\rho\Big(\frac m2|v|^2\Big)\,dv \\
&\le (2R_0)^dC_{d,m}\int_{-\infty}^\infty \big[(C + |s|)^{\frac d2 - 1} + (C + |s|)^{\frac d2}\big]\rho(s)\,ds + C_2C_{d,m}\int_{-\infty}^\infty \big[|s|^{\frac d2 - 1} + |s|^d\big]\rho(s)\,ds \\
&< \infty,
\end{aligned}$$

where $C = \sum_{i=1}^N \|U_i\|_\infty$.


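The part under the title “Proof of Lemma 3.3.5” is the most changed, and we give it as follows:

$$\sum_{i=1}^N \frac12 M_iE[|V_i(\tau_n,\theta_n\tilde\omega)|^2] \le C_1 + C_2E\Big[\int_{S_{\tau_n}(\theta_n\tilde\omega)} 1_{G_n}(x,v)(1 + |v|^2)\,\tilde\lambda_m(dx,dv)\Big]. \tag{3.28}$$

Therefore, with $C_3 := (\min_i \frac{M_i}{2})^{-1}$, we have

$$\begin{aligned}
P[\tau_n < T] &= P\Big[\max_{i=1,\dots,N} |V_i(\tau_n,\theta_n\tilde\omega)| \ge n\Big] \\
&\le \frac{C_3}{n^2}E\Big[\sum_{i=1}^N \frac12 M_i|V_i(\tau_n,\theta_n\tilde\omega)|^2\Big] \\
&\le \frac{1}{n^2}C_3C_1 + \frac{1}{n^2}C_3C_2E\Big[\int_{S_{\tau_n}(\theta_n\tilde\omega)} 1_{G_n}(x,v)(1 + |v|^2)\,\tilde\lambda_m(dx,dv)\Big].
\end{aligned} \tag{3.29}$$

Notice that by definition, $\tilde\lambda_m(dx,dv) = m^{\frac{d-1}{2}}\rho(\frac m2|v|^2)\,dx\,dv$ if $|x| > R_0$. Also, there exist constants $C_0', C_1' > 0$ (depending on $T$, $N$, $d$ and $R_i$) such that

$$\begin{aligned}
|\{x \in \mathbb{R}^d;\ (x,v) \in S_t(\theta_n\tilde\omega)\}| &= \Big|\Big\{x \in \mathbb{R}^d;\ \exists i \in \{1,\dots,N\} \text{ s.t. } \min_{0\le s\le t} |x + sv - X_i(s,\theta_n\tilde\omega)| \le R_i + \frac12\Big\}\Big| \\
&= \Big|\Big\{x \in \mathbb{R}^d;\ \exists i \in \{1,\dots,N\} \text{ s.t. } \min_{0\le s\le t}\Big|x + \int_0^s (v - V_i(r,\theta_n\tilde\omega))\,dr\Big| \le R_i + \frac12\Big\}\Big| \\
&\le C_0' + C_1'\Big(|v| + \sum_{i=1}^N \max_{0\le s\le t} |V_i(s,\theta_n\tilde\omega)|\Big).
\end{aligned}$$

Moreover, $|V_i(t,\theta_n\tilde\omega)| \le n$ if $t \in [0,\tau_n]$. Therefore, by assumption and (3.27), there exist constants $C_0'', C_1'' > 0$ such that

$$\begin{aligned}
&\int_{S_{\tau_n}(\theta_n\tilde\omega)} 1_{G_n}(x,v)(1 + |v|^2)\,\tilde\lambda_m(dx,dv) \\
&\le \int_{|x|\le R_0} (1 + |v|^2)\,\tilde\lambda_m(dx,dv) + m^{\frac{d-1}{2}}\int_{\mathbb{R}^d} \rho\Big(\frac m2|v|^2\Big)(1 + |v|^2)\,|\{x \in \mathbb{R}^d;\ (x,v) \in S_{\tau_n}(\theta_n\tilde\omega)\}|\,dv \\
&\le m^{\frac{d-1}{2}}\int_{|x|\le R_0} dx\int_{\mathbb{R}^d} (1 + |v|^2)\,\rho\Big(\frac m2|v|^2 + \sum_{i=1}^N U_i(x - X_{i,0})\Big)\,dv \\
&\qquad + m^{\frac{d-1}{2}}\int_{\mathbb{R}^d} \big(C_0' + C_1'(|v| + Nn)\big)(1 + |v|^2)\,\rho\Big(\frac m2|v|^2\Big)\,dv \\
&\le (2R_0)^dm^{\frac{d-1}{2}}C_{d,m}\int_{-\infty}^\infty \big[(C + |s|)^{\frac d2 - 1} + (C + |s|)^{\frac d2}\big]\rho(s)\,ds \\
&\qquad + m^{\frac{d-1}{2}}(C_0' + C_1'nN)C_{d,m}\int_{-\infty}^\infty \big[|s|^{\frac d2 - 1} + |s|^{\frac d2}\big]\rho(s)\,ds \\
&\qquad + m^{\frac{d-1}{2}}C_1'C_{d,m}\int_{-\infty}^\infty \big[|s|^{\frac{d-1}{2}} + |s|^{\frac{d+1}{2}}\big]\rho(s)\,ds \\
&\le C_0'' + C_1''n.
\end{aligned}$$

This combined with (3.29) implies $P(\tau_n < T) \to 0$, as $n \to \infty$.

3.4. Some basic facts about Skorohod spaces

In this subsection, we recall some basic facts about the Skorohod spaces $(D([0,T];\mathbb{R}^d), d^0)$ and $(D([0,\infty);\mathbb{R}^d), \mathrm{dis})$, and the tightness of the probability measures on them. As mentioned in Remark 3 of Sec. 2, these spaces will be needed in order to carry out our proof. (See [1] for more details.)

For any $T > 0$, let $D([0,T];\mathbb{R}^d)$ be the Skorohod space:

$$D([0,T];\mathbb{R}^d) = \Big\{w : [0,T] \to \mathbb{R}^d;\ w(t) = w(t+) := \lim_{s\downarrow t} w(s),\ t \in [0,T), \text{ and } w(t-) := \lim_{s\uparrow t} w(s) \text{ exists},\ t \in (0,T]\Big\},$$

with the metric $d^0 = d^0_T$ given by

$$d^0(w,\tilde w) = \inf_{\lambda\in\Lambda}\{\|\lambda\|^0 \vee \|w - \tilde w\circ\lambda\|_\infty\}$$

for any $w, \tilde w \in D([0,T];\mathbb{R}^d)$, where $\Lambda = \{\lambda : [0,T] \to [0,T];\ \text{continuous, non-decreasing},\ \lambda(0) = 0,\ \lambda(T) = T\}$, $\|w\|_\infty = \sup_{0\le t\le T} |w(t)|$, and

$$\|\lambda\|^0 = \sup_{0\le s<t\le T}\Big|\log\frac{\lambda(t) - \lambda(s)}{t - s}\Big|$$

for any $\lambda \in \Lambda$. It is well known that $(D([0,T];\mathbb{R}^d), d^0)$ is a complete metric space. Also, $C([0,T];\mathbb{R}^d) = \{w : [0,T] \to \mathbb{R}^d;\ \text{continuous}\}$ is closed in $(D([0,T];\mathbb{R}^d), d^0)$, and the Skorohod topology relativized to $C([0,T];\mathbb{R}^d)$ coincides with the uniform topology there. (See, e.g., [1].)

We have the following result about the tightness in $\wp(D([0,T];\mathbb{R}^d))$, the space of all probabilities on $D([0,T];\mathbb{R}^d)$: let $(\Omega_n, \mathcal{F}_n, P_n)$, $n = 1, 2, \dots$, be probability spaces, and let $X_n : \Omega_n \to D([0,T];\mathbb{R}^d)$, $n \in \mathbb{N}$, be measurable. Let $\mu_{X_n} = P_n\circ X_n^{-1}$.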
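To make the role of the time change $\lambda$ concrete, the following minimal Python sketch bounds $d^0(w,\tilde w)$ for two step functions on $[0,1]$ whose jumps sit at different times. The paths `w`, `w_tilde` and the piecewise-linear time change `lam` are hypothetical examples chosen for illustration; for a piecewise-linear $\lambda$ every chord slope is an average of segment slopes, so $\|\lambda\|^0$ is the largest $|\log(\text{slope})|$ over the segments.

```python
import math

def lam(t):
    """Piecewise-linear time change with lam(0) = 0, lam(0.5) = 0.25, lam(1) = 1."""
    return 0.5 * t if t <= 0.5 else 0.25 + 1.5 * (t - 0.5)

def w(t):
    """Step path jumping at time 0.5."""
    return 1.0 if t >= 0.5 else 0.0

def w_tilde(t):
    """Step path jumping at time 0.25."""
    return 1.0 if t >= 0.25 else 0.0

grid = [k / 1000 for k in range(1001)]

# ||lam||^0 from the two segment slopes 0.5 and 1.5.
norm_lam = max(abs(math.log(0.5)), abs(math.log(1.5)))

# Brute-force check of the chord formula on a coarser grid.
coarse = [k / 100 for k in range(101)]
brute = max(abs(math.log((lam(t) - lam(s)) / (t - s)))
            for i, s in enumerate(coarse) for t in coarse[i + 1:])

# sup_t |w(t) - w_tilde(lam(t))| on the grid, and the plain uniform distance.
sup_dist = max(abs(w(t) - w_tilde(lam(t))) for t in grid)
uniform = max(abs(w(t) - w_tilde(t)) for t in grid)

# This particular lam gives an upper bound for the infimum defining d^0.
bound = max(norm_lam, sup_dist)
print(norm_lam, brute, sup_dist, uniform, bound)
```

The time change aligns the two jumps exactly, so this choice of $\lambda$ yields the upper bound $d^0(w,\tilde w) \le \log 2$, whereas the uniform distance between the two paths is 1.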


Then we have the following:

Theorem 3.4.1. Suppose that there exist constants $\varepsilon, \beta, \gamma, C > 0$ such that

(1) $E^{P_n}[\|X_n(\cdot)\|_\infty^\varepsilon] \le C$,
(2) $E^{P_n}[|X_n(r) - X_n(s)|^\beta|X_n(s) - X_n(t)|^\beta] \le C|t - r|^{1+\varepsilon}$ for any $0 \le r \le s \le t \le 1$,
(3) $E^{P_n}[|X_n(s) - X_n(t)|^\varepsilon] \le C|t - s|^\gamma$ for any $0 \le s \le t \le 1$,

for any $n \in \mathbb{N}$. Then $\{\mu_{X_n}\}_{n=1}^\infty$ is tight in $\wp(D([0,T];\mathbb{R}^d))$.

Proof. This is a corollary of results of [1]. Indeed, by [1, Theorem 13.2] and the paragraph between pp. 140–141 there, we have that $\{\mu_{X_n}\}_{n=1}^\infty$ is tight if the following four conditions are satisfied (see [1] for the notations).

(1) $\lim_{a\to\infty}\limsup_{n\to\infty} P_n(\|X_n\|_\infty \ge a) = 0$,
(2) $\lim_{\delta\to0}\sup_{n\in\mathbb{N}} P_n(|w''_{X_n}(\delta)| \ge a) = 0$ for any $a > 0$,
(3) $\lim_{\delta\to0}\sup_{n\in\mathbb{N}} P_n(|X_n(\delta) - X_n(0)| \ge a) = 0$ for any $a > 0$,
(4) $\lim_{\delta\to0}\sup_{n\in\mathbb{N}} P_n(|X_n(1-) - X_n(1-\delta)| \ge a) = 0$ for any $a > 0$.

The fact that our conditions (1) and (3) imply (1) and (3) here, respectively, is trivial by Chebyshev's inequality. Condition (4) here is also obtained in the same way, with the help of our (1) and the dominated convergence theorem. So the only thing left is to confirm that (2) here is also satisfied. We do it in the following.

We use [1, Theorem 10.4] (the quantities $\gamma$, $\mu((s,t])$, $\beta$ and $P$ there are $X_n$, $C^{\frac{1}{1+\varepsilon}}(t - s)$, $\beta/2$ and $P_n$ in our case, respectively, and the quantity $L(\gamma,\delta)$ there is now replaced by $w''_{X_n}(\delta)$). Our condition (2) implies that

$$\begin{aligned}
P_n(|X_n(s) - X_n(r)| \wedge |X_n(t) - X_n(s)| \ge \lambda) &\le \frac{1}{\lambda^{2\beta}}E^{P_n}[|X_n(r) - X_n(s)|^\beta|X_n(s) - X_n(t)|^\beta] \\
&\le \frac{1}{\lambda^{2\beta}}C|t - r|^{1+\varepsilon} \\
&= \frac{1}{\lambda^{2\beta}}\mu((r,t])^{1+\varepsilon},
\end{aligned}$$

i.e. [1, (10.20)] is satisfied. So by [1, Theorem 10.4], [1, (10.21)] holds, i.e.

$$P_n(|w''_{X_n}(\delta)| \ge a) \le \frac{2K}{a^{2\beta}}\big(C^{\frac{1}{1+\varepsilon}}T\big)\big(C^{\frac{1}{1+\varepsilon}}2\delta\big)^\varepsilon.$$

The right-hand side above certainly converges to 0 as $\delta \to 0$ for any $a > 0$.

Finally, let $D([0,\infty);\mathbb{R}^d)$ be the set of functions on $[0,\infty)$ that are right continuous and have left limits at every point, and let

$$\mathrm{dis}(w_1,w_2) = \sum_{n=1}^\infty 2^{-n}\big(1 \wedge d^0_n(g_nw_1, g_nw_2)\big),$$


where $d_n^0$ is the Skorohod metric on $D([0,n];\mathbb{R}^d)$ as just defined, and $g_n$ is the function given by $g_n(t) = 1_{\{t\in[0,n-1]\}} + (n-t)1_{\{t\in(n-1,n]\}}$. Then the convergence to a continuous process in $(D([0,\infty);\mathbb{R}^d), \mathrm{dis})$ is equivalent to the convergence to it in $(C([0,T];\mathbb{R}^d), \|\cdot\|_{[0,T]})$ for all $T > 0$. By [1, Theorem 16.7], in order to prove the weak convergence of the distribution of a process with $t \in [0,\infty)$ in $(D([0,\infty);\mathbb{R}^d), \mathrm{dis})$, it is sufficient to show it for $t \in [0,T]$, for all $T > 0$.

3.5. Basic lemmas

In this subsection, we state several key lemmas which are used for the proof of our results. The proofs of these lemmas will be given in Secs. 4 and 5. Let
\[
\mathcal{F}_t = \mathcal{F}_t^{(T,n)} = \mathcal{F}_{(-\infty,\,t+2m^{1/2}\tau)\times E} \vee \aleph
= \sigma\{\omega \cap ((-\infty,\,t+2m^{1/2}\tau)\times E)\} \vee \aleph.
\]
Proposition 3.6.5 below ensures that $(X_i(t\wedge\sigma), V_i(t\wedge\sigma))$, $i = 1,\dots,N$, are $\mathcal{F}_t$-measurable. Also, we define a new potential $\tilde U$ in the following way. Let
\[
\tilde\rho(t) = -\int_t^{\infty}\rho(s)\,ds, \quad t\in\mathbb{R}, \qquad
p(s) = \int_{\mathbb{R}^d}\tilde\rho\Bigl(\frac12|v|^2 + s\Bigr)dv,
\]
and let
\[
\tilde U(\vec X) = \int_{\mathbb{R}^d}\Bigl(p\Bigl(\sum_{i=1}^{N}U_i(X_i - x)\Bigr) - p(0)\Bigr)dx.
\]
Some more discussion concerning $\tilde U$ will be given after Lemma 3.5.1.

Our key decomposition is given in Lemma 3.5.1. Its result suffices for the proof of the tightness, but in order to find the limit, concrete expressions for $M_i(t)$ and $P_i^{*1}(t)$ are necessary, and will be given later (see (4.22)). In order to keep the line of our proof sharp we shall first avoid presenting such concrete expressions.

Lemma 3.5.1. For any $i = 1,\dots,N$, there exist an $\mathbb{R}^d$-valued $(\mathcal{F}_t)_t$-martingale $M_i(t)$, an $\mathbb{R}^d$-valued $(\mathcal{F}_t)_t$-adapted process $\eta_i(t)$ and an $\mathbb{R}^d$-valued $(\mathcal{F}_t)_t$-adapted $C^1$-class (in $t$) process $P_i^{*1}(t)$, such that

(1)
\[
M_i(V_i(t\wedge\sigma) - V_i(0)) = M_i(t) + \eta_i(t) + P_i^{*1}(t) - m^{-1/2}\int_0^{t\wedge\sigma}\nabla_i\tilde U(\vec X(s))\,ds,
\tag{3.30}
\]


(2)
\[
\sup_{m\in(0,1]} E^{P_m}\Bigl[\sup_{t\in[0,T]}\Bigl|\frac{d}{dt}P_i^{*1}(t)\Bigr|^2\Bigr] < \infty,
\]

(3) there exists a constant $C$ independent of $m$ such that for any $i = 1,\dots,N$, $0 \le s \le t \le T$ and $m \in (0,1]$, we have $E^{P_m}[|M_i(t) - M_i(s)|^2 \mid \mathcal{F}_s] \le C|t-s|$, and the jumps of $M_i(\cdot)$ satisfy $|\Delta M_i(t)| \le Cm^{1/2}$,

(4)
\[
E^{P_m}\Bigl[\sup_{t\in[0,T]}|\eta_i(t)|^2\Bigr] \to 0 \quad \text{as } m \to 0
\]

for any $i = 1,\dots,N$. In particular, the distributions of $\{M_i(t) + \eta_i(t);\ t\in[0,T]\}$ and $\{P_i^{*1}(t);\ t\in[0,T]\}$ under $P_m$ are tight in $\wp(D([0,T];\mathbb{R}^d))$ as $m \to 0$, and any of their cluster points have continuous canonical processes.

Let us explain a little before going further. As claimed in Sec. 1, in our model the molecules feel each other through the mediation of the gas atoms, and the molecules do not interact with each other directly. In Lemma 3.5.1, we re-express the interactions in such a way that the light atoms do not appear explicitly. In this new expression, the function $\tilde U(\vec X)$ appears as a new potential. As will be shown later (Lemma 4.3.3), it is approximately the expected total force given by the "freezing approximations" $\psi(t,x,v,\vec X)$.

By the definition of $\tilde U$, it is easy to see that if $|X_i - X_j| > R_i + R_j$ for any $i \ne j$, then $\tilde U(\vec X) = \sum_{i=1}^{N}\int_{\mathbb{R}^d}(p(U_i(x)) - p(0))\,dx$; therefore,
\[
\nabla\tilde U(\vec X) = 0, \quad \text{if } |X_i - X_j| > R_i + R_j \text{ for any } i \ne j.
\tag{3.31}
\]
So in this case, the value of $\tilde U$ at $\vec X$ is a constant. Write this constant as $\tilde U_0$. So the gradient of our "new potential" $\tilde U(\vec X(t))$ stays 0 until some pair of molecules comes so close that their (original) potentials overlap. This is heuristically natural: when the molecules are far enough from each other, as a result of our cut-off, they feel the influence of different atoms, so by the symmetry of the potentials and the initial distribution $\lambda_m$, we get our assertion. Also notice that as soon as this term becomes non-zero, since $m^{-1/2} \to \infty$, it gives us an "infinitely strong force". This is why we needed to stop the process in Theorem 2.0.1(2) (see also the paragraphs following it).

Also, we will use the following lemmas to prove Theorem 2.0.1(4):

Lemma 3.5.2. Let $D$ be any open subset of $\mathbb{R}^{dN}$, and assume that for any $i = 1,\dots,N$, there exists a $C_b^1$-class function $g_i : \bar D \to \mathbb{R}^d$ satisfying
\[
g_i(\vec X)\cdot\nabla_i\tilde U(\vec X) = |\nabla_i\tilde U(\vec X)|, \quad \text{for any } \vec X \in \bar D,\ i = 1,\dots,N.
\]


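Let $\tilde\sigma_D = \inf\{t \ge 0;\ \vec X(t) \in D^C\}$. Then

(1)
\[
\sup_{m\in(0,1]} E^{P_m}\Bigl[\int_0^{T\wedge\sigma\wedge\tilde\sigma_D} m^{-1/2}\,|\nabla_i\tilde U(\vec X(t))|\,dt\Bigr] < \infty
\]
for any $i = 1,\dots,N$,

(2) the distribution of
\[
m^{-1/2}\bigl(\tilde U(\vec X(t\wedge\sigma\wedge\tilde\sigma_D)) - \tilde U_0\bigr) + \sum_{i=1}^{N}\frac{M_i}{2}\,|V_i(t\wedge\sigma\wedge\tilde\sigma_D)|^2
\]
under $P_m$ is tight in $\wp(C([0,T];\mathbb{R}))$ as $m \to 0$.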

Let $L$ be the operator defined in Sec. 2. By looking into the concrete expressions of the decomposition (3.30), we can get the following lemma. In particular, this implies Theorems 2.0.1(2) and 2.0.1(3). The proof will be given in Sec. 5.

Lemma 3.5.3. Let $D_0 = (\mathrm{supp}(\tilde U - \tilde U_0))^C \subset \mathbb{R}^{dN}$, and assume that $f \in C_0^\infty(D_0 \times \mathbb{R}^{dN})$. Then the distribution of $f(\vec X(t\wedge\sigma), \vec V(t\wedge\sigma))$ under $P_m$ is tight in $\wp(C([0,T];\mathbb{R}))$ as $m \to 0$, and its limit is the solution of the $L$-martingale problem stopped at $\sigma$.

3.6. Some basic calculation

In this subsection, we prepare some estimates, especially some properties of $x(t,x,v,\omega)$ (see Propositions 3.6.3–3.6.5), for later use. First notice that it is trivial by definition that
\[
|X_i(t,\omega)| \le |X_{i,0}| + nT, \quad \text{for any } t \in [0, \sigma(\omega)\wedge T].
\tag{3.32}
\]

Proposition 3.6.1. Suppose that $(x,v) \in E$, $|v| > (2C_0+1)m^{-1/2}$ and $n \le m^{-1/2}$. Then
\[
(|v|^{-1}v)\cdot v(t,x,v;\omega) \ge m^{-1/2}(C_0+1), \quad \text{for any } t \in [0,\sigma(\omega)].
\]
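As an informal numerical illustration of Proposition 3.6.1 (not part of the proof), the following sketch integrates the motion of one light atom past a single molecule held fixed at the origin in one space dimension. The bump potential, the values of `R`, `h`, `m`, and the normalisation $C_0^2 = 2R\,\|U'\|_\infty$ are assumptions chosen only for this example; the definition of $C_0$ used in the paper is given elsewhere and may differ.

```python
import math

R, h, m = 1.0, 1.0, 0.01          # hypothetical radius, height, atom mass

def grad_U(x):
    """Derivative of the bump U(x) = h*(1-(x/R)^2)^2 on |x| < R, zero outside."""
    if abs(x) >= R:
        return 0.0
    return -4.0 * h * x / R**2 * (1.0 - (x / R) ** 2)

sup_grad = 8.0 * h / (3.0 * math.sqrt(3.0) * R)   # sup |U'|, attained at |x| = R/sqrt(3)
C0 = math.sqrt(2.0 * R * sup_grad)                # assumed normalisation C0^2 = 2 R sup|U'|

def min_forward_velocity(v0, x0=-2.0, dt=1e-4):
    """Velocity-Verlet integration of m x'' = -U'(x); returns the smallest velocity seen."""
    x, v = x0, v0
    a = -grad_U(x) / m
    v_min = v
    while x < 2.0:
        x += v * dt + 0.5 * a * dt * dt
        a_new = -grad_U(x) / m
        v += 0.5 * (a + a_new) * dt
        a = a_new
        v_min = min(v_min, v)
    return v_min

v0 = 1.01 * (2.0 * C0 + 1.0) / math.sqrt(m)   # slightly above the threshold of the proposition
threshold = (C0 + 1.0) / math.sqrt(m)
v_min = min_forward_velocity(v0)
print(v_min >= threshold)
```

In this toy setting the atom is slowed while climbing the bump but its forward velocity never drops below $m^{-1/2}(C_0+1)$, in line with the statement above.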

Proof. Let $\eta = |v|^{-1}v$ and let $\xi = \inf\{t > 0;\ v(t,x,v,\omega)\cdot\eta < m^{-1/2}(C_0+1)\}$. We only need to show that $\xi \ge \sigma(\omega)$. Suppose that the contrary holds. Notice that by definition,
\[
(v(\xi,x,v,\omega) - v)\cdot\eta = -m^{-1}\sum_{i=1}^{N}\int_0^{\xi}\bigl(\nabla U_i(x(t,x,v,\omega) - X_i(t,\omega))\cdot\eta\bigr)\,dt.
\]
Also, for any $t \in [0, \xi\wedge\sigma(\omega)]$, we have by assumption
\[
\frac{d}{dt}\bigl(x(t,x,v,\omega) - X_i(t,\omega)\bigr)\cdot\eta = v(t,x,v,\omega)\cdot\eta - V_i(t,\omega)\cdot\eta \ge m^{-1/2}(C_0+1) - n
\]


\[
\ge m^{-1/2}(C_0+1) - m^{-1/2} = m^{-1/2}C_0,
\]
in particular, $(x(t,x,v,\omega) - X_i(t,\omega))\cdot\eta$ is monotone increasing with respect to $t$. So since $v\cdot\eta = |v| > (2C_0+1)m^{-1/2}$ by assumption, we have
\begin{align*}
m^{-1/2}C_0 &< -(v(\xi,x,v,\omega) - v)\cdot\eta \\
&= m^{-1}\sum_{i=1}^{N}\int_0^{\xi}\bigl(\nabla U_i(x(t,x,v,\omega) - X_i(t,\omega))\cdot\eta\bigr)\,dt \\
&\le m^{-1}\sum_{i=1}^{N}\int_0^{\xi}\bigl|\nabla U_i(x(t,x,v,\omega) - X_i(t,\omega))\cdot\eta\bigr|\,(m^{-1/2}C_0)^{-1}\,d[(x(t,x,v,\omega) - X_i(t,\omega))\cdot\eta] \\
&\le m^{-1}\sum_{i=1}^{N}(m^{-1/2}C_0)^{-1}\|\nabla U_i\|_\infty\int_{|(x(t,x,v,\omega)-X_i(t,\omega))\cdot\eta|\le R_i} d[(x(t,x,v,\omega) - X_i(t,\omega))\cdot\eta] \\
&\le m^{-1}\sum_{i=1}^{N}(m^{-1/2}C_0)^{-1}\|\nabla U_i\|_\infty\,2R_i \\
&= m^{-1/2}C_0,
\end{align*}
which yields a contradiction. Therefore, $\xi \ge \sigma(\omega)$.

Since we are considering the limit behavior as $m \to 0$, without loss of generality, we assume $n < m^{-1/2}$ from now on. Also, for the sake of simplicity, from now on, we omit the notation $\omega$ when there is no risk of confusion.

Note that in our setting, since
\[
\begin{cases}
\dfrac{d}{dt}x(t,\Psi(s,x,m^{-1/2}v)) = v(t,\Psi(s,x,m^{-1/2}v)), \\[2mm]
m\,\dfrac{d}{dt}v(t,\Psi(s,x,m^{-1/2}v)) = -\displaystyle\sum_{i=1}^{N}\nabla U_i\bigl(x(t,\Psi(s,x,m^{-1/2}v)) - X_i(t)\bigr),
\end{cases}
\]
we have
\[
\frac{d^2}{dt^2}x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v)) = -\sum_{i=1}^{N}\nabla U_i\bigl(x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v)) - X_i(m^{1/2}t+s,\omega)\bigr).
\]
Also, for any $s > 0$ and $t \in [0, T\wedge\sigma(\omega)]$, we have by definition and (3.32) that
\[
\bigl(x(t,\Psi(s,x,m^{-1/2}v)),\ v(t,\Psi(s,x,m^{-1/2}v))\bigr) = \Psi(s-t,x,m^{-1/2}v)
\tag{3.33}
\]


if $t < s - (m^{-1/2}C_0)^{-1}R_0$, $(x,v) \in E$ and $|v| \ge 2C_0+1$. Indeed, since $0 \le t < s - (m^{-1/2}C_0)^{-1}R_0$, for any $u \in [0,t]$, we have that $|s-u| > (m^{-1/2}C_0)^{-1}R_0$. This combined with $(x,v) \in E$ and $|v| \ge 2C_0+1$ gives us that $|x - (s-u)m^{-1/2}v| \ge |(s-u)m^{-1/2}v| > R_0 \ge R_i + |X_{i,0}| + nT$, which in turn combined with $|X_i(u,\omega)| \le |X_{i,0}| + nT$ implies that $|x - (s-u)m^{-1/2}v - X_i(u,\omega)| \ge R_i$ for any $u \in [0,t]$ and $i = 1,\dots,N$. Therefore, until $t$, the velocity of this atom remains unchanged, hence its position at time $t$ is equal to $x - (s-t)m^{-1/2}v$.

Therefore,
\begin{align*}
\Bigl(x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v)),\ \frac{d}{dt}x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))\Bigr)
&= \bigl(\Psi_0(-m^{1/2}t,x,m^{-1/2}v),\ m^{1/2}\Psi_1(-m^{1/2}t,x,m^{-1/2}v)\bigr) \\
&= (x+tv,\ v) = \Psi(-t,x,v)
\tag{3.34}
\end{align*}
if $t < -C_0^{-1}R_0$, $(x,v) \in E$, $|v| \ge 2C_0+1$, and $0 \le m^{1/2}t+s \le T\wedge\sigma(\omega)$.

We recall the following well-known Gronwall's Lemma, for later use.

Lemma 3.6.2 (Gronwall's Lemma). Suppose that a continuous function $g(\cdot)$ satisfies
\[
0 \le g(t) \le \alpha(t) + \beta\int_0^t g(s)\,ds, \quad 0 \le t \le T,
\]
with $\beta \ge 0$ and $\alpha : [0,T] \to \mathbb{R}$ integrable. Then
\[
g(t) \le \alpha(t) + \beta\int_0^t \alpha(s)e^{\beta(t-s)}\,ds, \quad 0 \le t \le T.
\]
In particular, if $\alpha(t) = \alpha$ is a constant, then
\[
g(t) \le \alpha e^{\beta t}, \quad 0 \le t \le T.
\]
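The constant-$\alpha$ case of Gronwall's Lemma can be checked numerically on a concrete function. The sketch below is only an illustration: the test function $g(t) = \alpha e^{\beta t/2}$ and the values of `alpha`, `beta`, `T` are hypothetical choices, and the integral is approximated by the trapezoidal rule.

```python
import math

def integral(f, t, steps=2000):
    """Trapezoidal approximation of the integral of f over [0, t]."""
    if t == 0.0:
        return 0.0
    h = t / steps
    total = 0.5 * (f(0.0) + f(t))
    for k in range(1, steps):
        total += f(k * h)
    return total * h

def satisfies_hypothesis(g, alpha, beta, t):
    """Check 0 <= g(t) <= alpha + beta * int_0^t g(s) ds for constant alpha."""
    return 0.0 <= g(t) <= alpha + beta * integral(g, t) + 1e-9

def gronwall_bound(alpha, beta, t):
    """Right-hand side alpha * exp(beta * t) of the constant-alpha case."""
    return alpha * math.exp(beta * t)

alpha, beta, T = 0.5, 2.0, 3.0
g = lambda t: alpha * math.exp(0.5 * beta * t)   # hypothetical test function

grid = [T * k / 50 for k in range(51)]
hypothesis_ok = all(satisfies_hypothesis(g, alpha, beta, t) for t in grid)
conclusion_ok = all(g(t) <= gronwall_bound(alpha, beta, t) for t in grid)
print(hypothesis_ok, conclusion_ok)
```

Here $g$ satisfies the integral inequality (with equality only at $t = 0$) and stays below $\alpha e^{\beta t}$ on the whole grid, which is the form in which the lemma is applied in the propositions that follow.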

As claimed in Sec. 2, we will use $\psi^0(t,x,v,\vec X(s-am^{1/2},\omega))$ as an approximation of $x(m^{1/2}t+s;\Psi(s,x,m^{-1/2}v))$. In the following two propositions, with the help of Gronwall's Lemma, we show that this is a good approximation by giving an estimate for the error (see Proposition 3.6.3(3)), which is necessary when showing the tightness, and by giving the coefficient of the next term in its expansion (see Proposition 3.6.4), which is necessary when showing the convergence to the limit.

Proposition 3.6.3. Fix any $a \in \mathbb{R}$. Suppose that $0 \le s - am^{1/2} \le T\wedge\sigma(\omega)$ and $0 \le s - m^{1/2}\tau \le T\wedge\sigma(\omega)$. Let
\[
y(t) = x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v)) - \psi^0(t,x,v;\vec X(s-am^{1/2},\omega)).
\]
Also, suppose that $(x,v) \in E$ and $|v| > 2C_0+1$. Then

(1) $y(t) = 0$ if $0 \le m^{1/2}t+s \le T\wedge\sigma(\omega)$ and $t \le -\tau$,


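(2)
\begin{align*}
\frac{d^2}{dt^2}y(t) = -\sum_{i=1}^{N}\bigl\{&\nabla U_i\bigl(y(t) + \psi^0(t,x,v;\vec X(s-am^{1/2},\omega)) - X_i(m^{1/2}t+s,\omega)\bigr) \\
&- \nabla U_i\bigl(\psi^0(t,x,v;\vec X(s-am^{1/2},\omega)) - X_i(s-am^{1/2},\omega)\bigr)\bigr\},
\end{align*}

(3) there exists a constant $\tilde C$, depending only on $n$, $\tau$ and $\sum_{i=1}^{N}\|\nabla^2U_i\|_\infty$, such that
\[
|y(t)| + \Bigl|\frac{d}{dt}y(t)\Bigr| \le m^{1/2}\tilde C(2\tau + |a|),
\tag{3.35}
\]
if $0 \le m^{1/2}t+s \le T\wedge\sigma(\omega)$ and $|t| \le 2\tau$.

Proof. We first show the first assertion. We have by (3.34) that $x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v)) = x+tv$ in our setting. We next look at the term $\psi^0(t,x,v;\vec X(s-am^{1/2},\omega))$. It is trivial that $|X_i(s-am^{1/2},\omega)| \le |X_{i,0}| + nT$ under our assumption. Also, since $t \le -\tau$ and $|v| \ge 2C_0+1$, we have for any $\tilde s$ big enough that $u \in [0,t+\tilde s] \Rightarrow u-\tilde s \in [-\tilde s,t] \subset [-\tilde s,-\tau]$, hence
\[
\inf_{u\in[0,t+\tilde s]}|x - \tilde sv + uv| \ge |t||v| \ge C_0^{-1}R_0(2C_0+1) \ge R_0
\]
(this might look incorrect if one forgets the fact that $t$ is now taken to be negative). Therefore,
\[
\psi^0(t,x,v;\vec X(s-am^{1/2},\omega)) = \lim_{\tilde s\to\infty}\varphi^0(t+\tilde s,\ x-\tilde sv,\ v;\ \vec X(s-am^{1/2},\omega)) = x+tv.
\]
This proves our first assertion.

The second assertion is trivial by definition.

Let us prove the third assertion. Notice that for any $|t| \le 2\tau$ satisfying $0 \le m^{1/2}t+s \le T\wedge\sigma(\omega)$, we have
\[
|X_i(m^{1/2}t+s,\omega) - X_i(s-am^{1/2},\omega)| \le n|(m^{1/2}t+s) - (s-am^{1/2})| \le nm^{1/2}(2\tau+|a|),
\]
so by (2),
\begin{align*}
\Bigl|\frac{d^2}{dt^2}y(t)\Bigr| &\le \sum_{i=1}^{N}\|\nabla^2U_i\|_\infty\,\bigl|y(t) - [X_i(m^{1/2}t+s,\omega) - X_i(s-am^{1/2},\omega)]\bigr| \\
&\le \sum_{i=1}^{N}\|\nabla^2U_i\|_\infty\,m^{1/2}n(2\tau+|a|) + \sum_{i=1}^{N}\|\nabla^2U_i\|_\infty\,|y(t)|.
\end{align*}
Therefore,
\begin{align*}
\Bigl|\frac{d}{dt}\Bigl(y(t),\frac{d}{dt}y(t)\Bigr)\Bigr| &\le \Bigl|\frac{d}{dt}y(t)\Bigr| + \Bigl|\frac{d^2}{dt^2}y(t)\Bigr| \\
&\le m^{1/2}\sum_{i=1}^{N}\|\nabla^2U_i\|_\infty\,n(2\tau+|a|) + \Bigl(1 + \sum_{i=1}^{N}\|\nabla^2U_i\|_\infty\Bigr)\Bigl|\Bigl(y(t),\frac{d}{dt}y(t)\Bigr)\Bigr|,
\end{align*}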


if $|t| \le 2\tau$ and $0 \le m^{1/2}t+s \le T\wedge\sigma(\omega)$. Also, by (1), $y(-\tau) = \frac{d}{dt}y(-\tau) = 0$. Let $g(t) = |(y(t-\tau), \frac{d}{dt}y(t-\tau))|$; then we have $g(0) = 0$ and
\begin{align*}
\frac{d}{dt}g(t) &\le \Bigl|\frac{d}{dt}\Bigl(y(t-\tau),\frac{d}{dt}y(t-\tau)\Bigr)\Bigr| \\
&\le m^{1/2}\sum_{i=1}^{N}\|\nabla^2U_i\|_\infty\,n(2\tau+|a|) + \Bigl(1 + \sum_{i=1}^{N}\|\nabla^2U_i\|_\infty\Bigr)g(t),
\end{align*}
if $-\tau \le t \le 3\tau$ and $0 \le m^{1/2}(t-\tau)+s \le T\wedge\sigma(\omega)$. (Notice that $t = 0$ satisfies these conditions since $0 \le s - m^{1/2}\tau \le T\wedge\sigma(\omega)$ under our assumption.) Therefore, if $0 \le t \le 3\tau$ and $0 \le m^{1/2}(t-\tau)+s \le T\wedge\sigma(\omega)$, then
\[
g(t) \le m^{1/2}\sum_{i=1}^{N}\|\nabla^2U_i\|_\infty\,n(2\tau+|a|)\,3\tau + \Bigl(1 + \sum_{i=1}^{N}\|\nabla^2U_i\|_\infty\Bigr)\int_0^t g(s)\,ds,
\]
so by Gronwall's inequality, we get
\[
g(t) \le m^{1/2}\sum_{i=1}^{N}\|\nabla^2U_i\|_\infty\,n(2\tau+|a|)\,3\tau\,e^{(1+\sum_{i=1}^{N}\|\nabla^2U_i\|_\infty)t}.
\]
The assertion for $t \in [-\tau,0]$ satisfying $0 \le m^{1/2}(t-\tau)+s \le T\wedge\sigma(\omega)$ is proved in the same way, and we omit the proof here. This completes the proof.

Let $z(t;x,v,\vec X,\vec V,a)$ be the solution of (2.3). In the following, we show that this $z(t)$ gives the next term in the approximation of $x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))$.

Proposition 3.6.4. Let $a \in \mathbb{R}$. Suppose that $t \ge -a$, $0 \le s - m^{1/2}\tau \le T\wedge\sigma(\omega)$, $-\tau \le t \le 2\tau$ and $0 \le s-am^{1/2} \le s+m^{1/2}t \le T\wedge\sigma(\omega)$. Also, let $(x,v) \in E$ and $|v| > 2C_0+1$. Then
\begin{align*}
&\bigl|x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v)) - \bigl(\psi^0(t,x,v,\vec X(s-am^{1/2})) + m^{1/2}z(t;x,v,\vec X(s-am^{1/2}),\vec V(s-am^{1/2}),a)\bigr)\bigr| \\
&\qquad \le Cm^{1/2}\Bigl((1+|a|)^2m^{1/2} + m^{-1/2}\int_{s-am^{1/2}}^{s+m^{1/2}t}|\vec V(r) - \vec V(s-am^{1/2})|\,dr\Bigr).
\end{align*}
Here $C$ is a constant depending only on $\tau$, $n$, $\sum_{i=1}^{N}\|\nabla^3U_i\|_\infty$ and $\sum_{i=1}^{N}\|\nabla^2U_i\|_\infty$.

Proof. The main tool is again Gronwall's Lemma. Let
\[
y(t) = x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v)) - \psi^0(t,x,v,\vec X(s-am^{1/2},\omega))
\]
as in Proposition 3.6.3, and let
\[
\xi(t) = y(t) - m^{1/2}z(t;x,v,\vec X(s-am^{1/2}),\vec V(s-am^{1/2}),a).
\]


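We need to estimate $|\xi(t)|$. By a simple calculation,
\begin{align*}
\frac{d^2}{dt^2}y(t) &= -\sum_{i=1}^{N}\bigl\{\nabla U_i\bigl(y(t) + \psi^0(t,x,v;\vec X(s-am^{1/2})) - X_i(m^{1/2}t+s)\bigr) - \nabla U_i\bigl(\psi^0(t,x,v;\vec X(s-am^{1/2})) - X_i(s-am^{1/2})\bigr)\bigr\} \\
&= -\sum_{i=1}^{N}\int_0^1\nabla^2U_i\bigl(\eta[y(t) - \{X_i(m^{1/2}t+s) - X_i(s-am^{1/2})\}] + \psi^0(t,x,v,\vec X(s-am^{1/2})) - X_i(s-am^{1/2})\bigr) \\
&\qquad\qquad \times [y(t) - \{X_i(m^{1/2}t+s) - X_i(s-m^{1/2}a)\}]\,d\eta,
\end{align*}
so
\begin{align*}
\frac{d^2}{dt^2}\xi(t) = &-\sum_{i=1}^{N}\int_0^1 d\eta\,\bigl\{\nabla^2U_i\bigl(\eta[y(t) - \{X_i(m^{1/2}t+s) - X_i(s-m^{1/2}a)\}] + \psi^0(t,x,v;\vec X(s-am^{1/2})) - X_i(s-am^{1/2})\bigr) \\
&\qquad\qquad - \nabla^2U_i\bigl(\psi^0(t,x,v;\vec X(s-am^{1/2})) - X_i(s-am^{1/2})\bigr)\bigr\}\cdot\bigl(y(t) - \{X_i(m^{1/2}t+s) - X_i(s-m^{1/2}a)\}\bigr) \\
&-\sum_{i=1}^{N}\nabla^2U_i\bigl(\psi^0(t,x,v,\vec X(s-am^{1/2})) - X_i(s-am^{1/2})\bigr) \\
&\qquad\qquad \times\bigl(\xi(t) - \{X_i(m^{1/2}t+s) - X_i(s-m^{1/2}a) - m^{1/2}(t+a)V_i(s-m^{1/2}a)\}\bigr).
\end{align*}
Therefore, since $|X_i(m^{1/2}t+s) - X_i(s-m^{1/2}a)| \le n(t+|a|)m^{1/2}$ in our domain, and $X_i(m^{1/2}t+s) - X_i(s-m^{1/2}a) = \int_{s-am^{1/2}}^{s+m^{1/2}t}V_i(r)\,dr$, we get
\begin{align*}
\Bigl|\frac{d^2}{dt^2}\xi(t)\Bigr| \le{}& \sum_{i=1}^{N}\|\nabla^3U_i\|_\infty\bigl(|y(t)| + n(t+|a|)m^{1/2}\bigr)^2 + \sum_{i=1}^{N}\|\nabla^2U_i\|_\infty|\xi(t)| \\
&+ \sum_{i=1}^{N}\|\nabla^2U_i\|_\infty\int_{s-am^{1/2}}^{s+m^{1/2}t}|V_i(r) - V_i(s-m^{1/2}a)|\,dr.
\tag{3.36}
\end{align*}
Let $\tilde C$ be the constant in Proposition 3.6.3(3), and let
\[
C_1 = \sum_{i=1}^{N}\|\nabla^3U_i\|_\infty(\tilde C+n)^2(2\tau+1)^2, \qquad C_2 = \sum_{i=1}^{N}\|\nabla^2U_i\|_\infty.
\]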


Then (3.36) combined with Proposition 3.6.3(3) gives us
\[
\Bigl|\frac{d^2}{dt^2}\xi(t)\Bigr| \le C_1m(1+|a|)^2 + C_2\int_{s-am^{1/2}}^{s+m^{1/2}t}|V_i(r) - V_i(s-m^{1/2}a)|\,dr + C_2|\xi(t)|,
\]
if $0 \le m^{1/2}t+s \le T\wedge\sigma(\omega)$, $|t| \le 2\tau$ and $t \ge -a$. Let
\[
g(t) = \Bigl|\Bigl(\xi(t-\tau),\ \frac{d}{dt}\xi(t-\tau)\Bigr)\Bigr|.
\]
Then the estimate above gives us
\begin{align*}
\frac{d}{dt}g(t) &\le \Bigl|\frac{d}{dt}\xi(t-\tau)\Bigr| + \Bigl|\frac{d^2}{dt^2}\xi(t-\tau)\Bigr| \\
&\le C_1m(1+|a|)^2 + C_2\int_{s-am^{1/2}}^{s+m^{1/2}(t-\tau)}|V_i(r) - V_i(s-m^{1/2}a)|\,dr + (C_2+1)g(t),
\end{align*}
if $t-\tau \ge -a$, $|t-\tau| \le 2\tau$ and $0 \le m^{1/2}(t-\tau)+s \le T\wedge\sigma(\omega)$. Since $\xi(-\tau) = \frac{d}{dt}\xi(-\tau) = 0$, we have $g(0) = 0$. Also, $\int_{s-am^{1/2}}^{s+m^{1/2}(t-\tau)}|V_i(r) - V_i(s-m^{1/2}a)|\,dr$ is monotone non-decreasing with respect to $t$. So if $t-\tau \ge -a$ and $0 \le t \le 3\tau$, then
\[
g(t) \le 3\tau\Bigl(C_1m(1+|a|)^2 + C_2\int_{s-am^{1/2}}^{s+m^{1/2}(t-\tau)}|V_i(r) - V_i(s-m^{1/2}a)|\,dr\Bigr) + (C_2+1)\int_0^t g(u)\,du.
\]
Therefore, by Gronwall's inequality and the monotonicity of $\int_{s-am^{1/2}}^{s+m^{1/2}t}|V_i(r) - V_i(s-m^{1/2}a)|\,dr$ again, the above implies
\[
g(t) \le 3\tau e^{(C_2+1)3\tau}\Bigl(C_1m(1+|a|)^2 + C_2\int_{s-am^{1/2}}^{s+m^{1/2}(t-\tau)}|V_i(r) - V_i(s-m^{1/2}a)|\,dr\Bigr),
\]
if $t-\tau \ge -a$, $-\tau \le t-\tau \le 2\tau$ and $0 \le m^{1/2}(t-\tau)+s \le T\wedge\sigma(\omega)$. This completes the proof of our assertion.

In the following proposition, we show that, similarly to the solution of Newton's equation (see Corollary 3.2.3), $x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))$ does not interact with $X_i(m^{1/2}t+s,\omega)$ if $|t|$ is big.

Proposition 3.6.5. Let $(x,v) \in E$ and $|v| > 2C_0+1$. Suppose that $0 \le m^{1/2}t+s \le T\wedge\sigma(\omega)$ and that either $t < -\tau$ or $t > 2\tau$. Then
\[
\nabla U_i\bigl(x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v)) - X_i(m^{1/2}t+s,\omega)\bigr) = 0.
\]


Proof. Let $\eta = |v|^{-1}v$. Notice that $|X_i(m^{1/2}t+s,\omega)| \le |X_{i,0}| + nT$ if $0 \le m^{1/2}t+s \le T\wedge\sigma(\omega)$. So it suffices to show that $|x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))| \ge R_0$ for $t$ satisfying our condition. We show it in the following.

First notice that by (3.34), if $t < -\tau = -C_0^{-1}R_0$, then $|x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))| = |x+tv| \ge |t||v| \ge C_0^{-1}R_0(2C_0+1) > R_0$.

We next prove the assertion for $t > 2\tau$. Let us divide it into two cases, according to whether $s < 0$ or not. We first deal with the case $s < 0$. Notice that by Proposition 3.6.1, we have that $\eta\cdot v(u,\Psi(s,x,m^{-1/2}v)) \ge m^{-1/2}(C_0+1)$ for any $u \in (0,T\wedge\sigma)$. Also, $x(0,\Psi(s,x,m^{-1/2}v)) = x - sm^{-1/2}v$, $x\cdot\eta = 0$ and $v\cdot\eta = |v|$. Therefore,
\begin{align*}
\eta\cdot x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v)) &= \int_0^{m^{1/2}t+s}\eta\cdot v(u,\Psi(s,x,m^{-1/2}v))\,du + \eta\cdot(x - sm^{-1/2}v) \\
&\ge m^{-1/2}(C_0+1)(m^{1/2}t+s) - sm^{-1/2}|v| \\
&= t(C_0+1) + m^{-1/2}s(C_0+1-|v|) \\
&\ge t(C_0+1) > 2C_0^{-1}R_0(C_0+1) > R_0,
\end{align*}
where when passing to the last line, we used the fact that $s < 0$ and $C_0+1-|v| < 0$.

Let us now prove the assertion for $t > 2\tau$ and $s > 0$. Notice that $s < T\wedge\sigma$ in this case since we have by assumption $0 \le m^{1/2}t+s \le T\wedge\sigma(\omega)$. We first show that
\[
\eta\cdot x(s,\Psi(s,x,m^{-1/2}v)) \ge -R_0, \quad \text{for all } s \in [0,T\wedge\sigma).
\tag{3.37}
\]
In the following, again, we use the fact that $\eta\cdot v(u,\Psi(s,x,m^{-1/2}v)) \ge m^{-1/2}(C_0+1) > 0$ for any $u \in (0,T\wedge\sigma)$, which is guaranteed by Proposition 3.6.1. We also use the fact that $x(0,\Psi(s,x,m^{-1/2}v)) = x - sm^{-1/2}v$, $x\cdot\eta = 0$ and $v\cdot\eta = |v|$. Let $s_0 = \frac{R_0}{|v|}m^{1/2}$. If $s \in [0,s_0]$, then we have that
\begin{align*}
\eta\cdot x(s,\Psi(s,x,m^{-1/2}v)) &= \int_0^s\eta\cdot v(u,\Psi(s,x,m^{-1/2}v))\,du + \eta\cdot(x - sm^{-1/2}v) \\
&\ge 0 - m^{-1/2}|v|s \ge -m^{-1/2}|v|\cdot\frac{R_0}{|v|}m^{1/2} = -R_0.
\end{align*}
If $s \in [s_0,T\wedge\sigma]$, then by using a similar argument as in the proof of (3.33), it is easy to see by definition that
\[
x(s-s_0,\Psi(s,x,m^{-1/2}v)) = x - s_0m^{-1/2}v, \qquad v(s-s_0,\Psi(s,x,m^{-1/2}v)) = m^{-1/2}v,
\]


therefore,
\begin{align*}
\eta\cdot x(s,\Psi(s,x,m^{-1/2}v)) &= \int_{s-s_0}^s\eta\cdot v(u,\Psi(s,x,m^{-1/2}v))\,du + \eta\cdot x(s-s_0,\Psi(s,x,m^{-1/2}v)) \\
&\ge 0 + \eta\cdot(x - s_0m^{-1/2}v) = -s_0m^{-1/2}|v| = -\frac{R_0}{|v|}m^{1/2}\cdot m^{-1/2}|v| = -R_0.
\end{align*}
This completes the proof of (3.37).

Since
\[
\frac{d}{dt}x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v)) = m^{1/2}v(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v)),
\]
and $0 \le m^{1/2}t+s \le \sigma(\omega)$ by assumption, we have by Proposition 3.6.1 that
\[
\frac{d}{dt}\bigl(\eta\cdot x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))\bigr) > C_0.
\tag{3.38}
\]
This combined with (3.37) implies that
\begin{align*}
\eta\cdot x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v)) &= \int_0^t\frac{d}{du}\bigl(\eta\cdot x(m^{1/2}u+s,\Psi(s,x,m^{-1/2}v))\bigr)\,du + \eta\cdot x(s,\Psi(s,x,m^{-1/2}v)) \\
&\ge C_0t - R_0 \ge C_0\cdot 2C_0^{-1}R_0 - R_0 = R_0.
\end{align*}
This completes the proof of our assertion, hence the proposition is proven.

Before closing this section, let us discuss a little more the new potential $\tilde U$ and the function $p$ defined in Sec. 3.5. The following equation will be used later:
\[
\nabla_i\tilde U(\vec X) = \int_{\mathbb{R}^{2d}}\nabla U_i(X_i - x)\,\rho\Bigl(\frac12|v|^2 + \sum_{j=1}^{N}U_j(x - X_j)\Bigr)dx\,dv.
\tag{3.39}
\]
Also, by a simple calculation, there exists a global constant $C_d$ such that
\[
p(s) = C_d\int_0^\infty\tilde\rho(r+s)\,r^{\frac d2-1}\,dr,
\]
hence
\[
p'(s) = C_d\int_0^\infty\rho(r+s)\,r^{\frac d2-1}\,dr = C_d\int_s^\infty\rho(t)(t-s)^{\frac d2-1}\,dt.
\]


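So if $s < e_0$, then
\begin{align*}
p'(s) &= C_d\int_0^\infty\rho(t)(t-s)^{\frac d2-1}\,dt, \\
p''(s) &= C_d\Bigl(1-\frac d2\Bigr)\int_0^\infty\rho(t)(t-s)^{\frac d2-2}\,dt, \\
p'''(s) &= C_d\Bigl(1-\frac d2\Bigr)\Bigl(2-\frac d2\Bigr)\int_0^\infty\rho(t)(t-s)^{\frac d2-3}\,dt.
\end{align*}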

(3.40) (3.41)

Also notice that under the condition $s < e_0$, if $0 \le t < s$, then $t < e_0$, hence $\rho(t) = 0$. Therefore, we get that
\[
p''(s)\ \begin{cases} < 0, & \text{if } d \ge 3, \\ = 0, & \text{if } d = 2, \\ > 0, & \text{if } d = 1. \end{cases}
\tag{3.42}
\]
We remark that in reality, we have $\rho(t) = e^{-t}$, so $\tilde\rho(t) = -e^{-t}$ and $p(s) = -Ce^{-s}$ for some constant $C > 0$, so $p''(s) < 0$.

4. Proof of Basic Lemmas

We give the proofs of Lemmas 3.5.1 and 3.5.2 in this section. The proof of Lemma 3.5.3 will be given in Sec. 5.

4.1. First decomposition

Let $\sigma(\omega) = \sigma_n(\omega) = \inf\{t \ge 0;\ \max_{i=1,\dots,N}|V_i(t,\omega)| \ge n\}$, $R_0 = \max_{i=1,\dots,N}\{R_i + |X_{i,0}|\} + nT + 1$, and $\tau = C_0^{-1}R_0$ as before. Also, we always assume that $(x,v) \in E$, i.e. $x\cdot v = 0$.

First, for any $t \le T$, we have by (3.3) that
\[
M_i(V_i(t) - V_i(0)) = -\int_0^t ds\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(s,\omega) - x(s,\Psi(r,x,m^{-1/2}v))\bigr)\,\mu_\omega(dr,dx,dv),
\]
so we have the following decomposition:
\[
-M_i(V_i(t\wedge\sigma_n) - V_i(0)) = V_i^0(t) + V_i^1(t),
\]
with
\[
V_i^0(t) = \int_0^{t\wedge\sigma_n}1_{[4m^{1/2}\tau,\infty)}(s)\,ds\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(s,\omega) - x(s,\Psi(r,x,m^{-1/2}v))\bigr)\,\mu_\omega(dr,dx,dv),
\]


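\[
V_i^1(t) = \int_0^{t\wedge\sigma_n}1_{[0,4m^{1/2}\tau)}(s)\,ds\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(s,\omega) - x(s,\Psi(r,x,m^{-1/2}v))\bigr)\,\mu_\omega(dr,dx,dv).
\]

4.2. The term $V_i^1(t)$

Let us deal with $V_i^1(t)$ in this subsection. We will show that it is negligible as $m \to 0$. Let us decompose $V_i^1(t)$ as follows: $V_i^1(t) = V_i^{10}(t) + V_i^{11}(t)$, with
\begin{align*}
V_i^{10}(t) = \int_0^{t\wedge\sigma_n}1_{[0,4m^{1/2}\tau)}(s)\,ds\int_{\mathbb{R}\times E}\bigl\{&\nabla U_i\bigl(X_i(s,\omega) - x(s,\Psi(r,x,m^{-1/2}v))\bigr) \\
&- \nabla U_i\bigl(X_i(0) - \varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);\vec X(0))\bigr)\bigr\}\,\mu_\omega(dr,dx,dv),
\end{align*}
\[
V_i^{11}(t) = \int_0^{t\wedge\sigma_n}1_{[0,4m^{1/2}\tau)}(s)\,ds\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(0) - \varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);\vec X(0))\bigr)\,\mu_\omega(dr,dx,dv).
\]
Before discussing the behavior of $V_i^{10}(t)$, let us prepare the following result. Fix any $t_0 > 0$. Then we have the following:

Lemma 4.2.1. For any $s \in [0,t_0]$ satisfying $0 \le m^{1/2}s \le T\wedge\sigma_n(\omega)$, we have that
\[
\bigl|x(m^{1/2}s,\Psi(r,x,m^{-1/2}v)) - \varphi^0(s,\Psi(m^{-1/2}r,x,v);\vec X(0))\bigr| \le nm^{1/2}s\sum_{i=1}^{N}\|\nabla^2U_i\|_\infty\,t_0\,e^{(\sum_{i=1}^{N}\|\nabla^2U_i\|_\infty+1)t_0}.
\]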
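The estimate of Lemma 4.2.1 can be illustrated numerically in a toy one-dimensional setting. In the sketch below, the potential $U(x) = e^{-x^2}$ (for which $\sup|U''| = 2$), the single molecule moving at the maximal speed $n$, and all parameter values are hypothetical choices made only for this illustration; in the rescaled time $s$ the atom feels the molecule at $X(m^{1/2}s) = nm^{1/2}s$, while the frozen approximation keeps the molecule at $X(0) = 0$.

```python
import math

def grad_U(x):
    """Derivative of the hypothetical smooth potential U(x) = exp(-x^2)."""
    return -2.0 * x * math.exp(-x * x)

K = 2.0  # sup |U''| for U(x) = exp(-x^2), attained at x = 0

def compare(m, n=1.0, t0=3.0, x0=-3.0, v0=2.0, steps=3000):
    """Integrate the true and the frozen dynamics with velocity Verlet.

    Returns (sup over the grid of |x - phi|, True if the bound of the
    lemma holds at every grid point)."""
    dt = t0 / steps
    shift = n * math.sqrt(m)                   # molecule speed in rescaled time
    x, vx = x0, v0                             # atom seeing the moving molecule
    p, vp = x0, v0                             # atom seeing the frozen molecule
    ax, ap = -grad_U(x), -grad_U(p)
    sup_err, bound_ok = 0.0, True
    for k in range(1, steps + 1):
        s = k * dt
        x += vx * dt + 0.5 * ax * dt * dt
        p += vp * dt + 0.5 * ap * dt * dt
        ax_new = -grad_U(x - shift * s)
        ap_new = -grad_U(p)
        vx += 0.5 * (ax + ax_new) * dt
        vp += 0.5 * (ap + ap_new) * dt
        ax, ap = ax_new, ap_new
        err = abs(x - p)
        bound = shift * s * K * t0 * math.exp((K + 1.0) * t0)
        sup_err = max(sup_err, err)
        bound_ok = bound_ok and err <= bound + 1e-12
    return sup_err, bound_ok

err_large, ok_large = compare(1e-2)
err_small, ok_small = compare(1e-4)
print(ok_large, ok_small, err_large, err_small)
```

The Gronwall-type bound is far from sharp, so the check is easily satisfied; the more informative output is that the observed error shrinks as $m$ decreases, consistent with the factor $m^{1/2}$ in the lemma.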

Proof. The main tool is again Gronwall's lemma. First notice that under our condition, $|X_i(m^{1/2}s) - X_i(0)| \le nm^{1/2}s$. Let
\[
\xi(s) = x(m^{1/2}s,\Psi(r,x,m^{-1/2}v)) - \varphi^0(s,\Psi(m^{-1/2}r,x,v);\vec X(0)).
\]
Then we have
\[
\frac{d^2}{ds^2}\xi(s) = \sum_{i=1}^{N}\bigl\{-\nabla U_i\bigl(x(m^{1/2}s,\Psi(r,x,m^{-1/2}v)) - X_i(m^{1/2}s)\bigr) + \nabla U_i\bigl(\varphi^0(s,\Psi(m^{-1/2}r,x,v);\vec X(0)) - X_i(0)\bigr)\bigr\}.
\]


Therefore, since $\nabla^2U_i$, $i = 1,\dots,N$, are bounded, we have that
\[
\Bigl|\frac{d^2}{ds^2}\xi(s)\Bigr| \le \sum_{i=1}^{N}\|\nabla^2U_i\|_\infty\bigl(|\xi(s)| + |X_i(m^{1/2}s) - X_i(0)|\bigr) \le \sum_{i=1}^{N}\|\nabla^2U_i\|_\infty\bigl(|\xi(s)| + nm^{1/2}s\bigr).
\]
Let $g(s) = |(\xi(s), \frac{d}{ds}\xi(s))|$. Then the above implies that
\[
\frac{d}{ds}g(s) \le \Bigl|\frac{d^2}{ds^2}\xi(s)\Bigr| + \Bigl|\frac{d}{ds}\xi(s)\Bigr| \le nm^{1/2}s\sum_{i=1}^{N}\|\nabla^2U_i\|_\infty + \Bigl(\sum_{i=1}^{N}\|\nabla^2U_i\|_\infty + 1\Bigr)g(s).
\]
Also, $g(0) = 0$. So for any $0 \le s \le t_0$, we get that
\[
g(s) \le nm^{1/2}s\sum_{i=1}^{N}\|\nabla^2U_i\|_\infty\,t_0 + \Bigl(\sum_{i=1}^{N}\|\nabla^2U_i\|_\infty + 1\Bigr)\int_0^s g(u)\,du.
\]
Therefore, by Gronwall's Lemma, we have
\[
g(s) \le nm^{1/2}s\sum_{i=1}^{N}\|\nabla^2U_i\|_\infty\,t_0\,e^{(\sum_{i=1}^{N}\|\nabla^2U_i\|_\infty+1)s}.
\]
This gives us our assertion.

In particular, applying Lemma 4.2.1 with $t_0 = 4\tau$, we get that
\[
\bigl|x(s,\Psi(r,x,m^{-1/2}v)) - \varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);\vec X(0))\bigr| \le ns\sum_{i=1}^{N}\|\nabla^2U_i\|_\infty\,4\tau\,e^{(\sum_{i=1}^{N}\|\nabla^2U_i\|_\infty+1)4\tau},
\tag{4.1}
\]
\[
|X_i(s) - X_i(0)| \le ns, \quad \text{for any } s \in [0, 4m^{1/2}\tau\wedge T\wedge\sigma(\omega)).
\]
We use this to prove the following. The key point here is that the domain of $s$ now is close to 0 and narrow enough.

Lemma 4.2.2. $E^{P_m}[\sup_{0\le t\le T}|V_i^{10}(t)|^2] \to 0$ as $m \to 0$.

Proof. First notice that in the definition of $V_i^{10}$, we are taking an integral for $s \in [0, 4m^{1/2}\tau\wedge T\wedge\sigma(\omega))$, so if $r > 6m^{1/2}\tau$ or $r < -2m^{1/2}\tau$, then we have $|u| > 2m^{1/2}\tau$ for any $u \in [r-s,r]$, so since $x\cdot v = 0$, we get by definition
\[
\bigl|\varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);\vec X(0))\bigr| = |x - m^{-1/2}(r-s)v| \ge m^{-1/2}|r-s||v| \ge 2\tau|v| \ge R_0.
\]


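Therefore, for any $s \in [0, 4m^{1/2}\tau\wedge T\wedge\sigma(\omega))$, we have
\[
\nabla U_i\bigl(X_i(0) - \varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);\vec X(0))\bigr) = 0
\tag{4.2}
\]
if $r > 6m^{1/2}\tau$ or $r < -2m^{1/2}\tau$. Also, (4.2) holds if $|x| \ge R_0+1$. Similarly, the same holds with $\vec X(0)$ substituted by $\vec X(s)$ (since $0 \le s \le \sigma$). Let
\[
C_1 = \|\nabla^2U_i\|_\infty\Bigl(\sum_{j=1}^{N}\|\nabla^2U_j\|_\infty\,4\tau\,e^{(\sum_{j=1}^{N}\|\nabla^2U_j\|_\infty+1)4\tau} + 1\Bigr).
\]
Then by combining these facts with (4.1), we get that for any $s \in [0, 4m^{1/2}\tau\wedge T\wedge\sigma(\omega))$,
\begin{align*}
&\bigl|\nabla U_i\bigl(X_i(s,\omega) - x(s,\Psi(r,x,m^{-1/2}v))\bigr) - \nabla U_i\bigl(X_i(0) - \varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);\vec X(0))\bigr)\bigr| \\
&\qquad \le 1_{[0,R_0+1)}(|x|)\,1_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,nsC_1.
\end{align*}
Therefore, by the definition of $V_i^{10}(t)$, we get that
\begin{align*}
|V_i^{10}(t)| &\le \int_0^{t\wedge\sigma_n}1_{[0,4m^{1/2}\tau)}(s)\,ds\int_{\mathbb{R}\times E}C_1ns\,1_{[0,R_0+1)}(|x|)\,1_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,\mu_\omega(dr,dx,dv) \\
&\le \frac{C_1}{2}n(4m^{1/2}\tau)^2\int_{\mathbb{R}\times E}1_{[0,R_0+1)}(|x|)\,1_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,\mu_\omega(dr,dx,dv).
\tag{4.3}
\end{align*}
We need to discuss the $L^2(P_m)$-norm of the integral on the right-hand side above. Notice that in general, it is easy to see by the definition of a Poisson point process that $E^{P_m}[(\int g\,d\mu_\omega)^2] = \int g^2\,d\lambda_m + (\int g\,d\lambda_m)^2$ for any $g \in L^2(\lambda_m)$.

Let $c = \sum_{j=1}^{N}\|U_j\|_\infty$, and set $C_2 = 8\tau(2(R_0+1))^{d-1}\int_{\mathbb{R}^d}\rho_c(\frac12|v|^2)|v|\,dv$, which is finite by our assumption. Then we have by definition that
\begin{align*}
&\int_{\mathbb{R}\times E}1_{[0,R_0+1)}(|x|)\,1_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,\lambda(dr,dx,dv) \\
&\qquad = \int_{\mathbb{R}\times E}1_{[0,R_0+1)}(|x|)\,1_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,m^{-1}\rho_0(x - m^{-1/2}rv, v)\,dr\,\nu(dx,dv) \\
&\qquad \le \int_{\mathbb{R}\times E}1_{[0,R_0+1)}(|x|)\,1_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,m^{-1}\rho_c\Bigl(\frac12|v|^2\Bigr)\,dr\,\nu(dx,dv) \\
&\qquad \le 8m^{1/2}\tau\,m^{-1}(2(R_0+1))^{d-1}\int_{\mathbb{R}^d}\rho_c\Bigl(\frac12|v|^2\Bigr)|v|\,dv \\
&\qquad = C_2m^{-1/2}.
\end{align*}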


Therefore,
\begin{align*}
&E^{P_m}\Bigl[\Bigl(\int_{\mathbb{R}\times E}1_{[0,R_0+1)}(|x|)\,1_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,\mu_\omega(dr,dx,dv)\Bigr)^2\Bigr] \\
&\qquad = \int_{\mathbb{R}\times E}1_{[0,R_0+1)}(|x|)\,1_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,\lambda(dr,dx,dv) + \Bigl(\int_{\mathbb{R}\times E}1_{[0,R_0+1)}(|x|)\,1_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,\lambda(dr,dx,dv)\Bigr)^2 \\
&\qquad \le C_2m^{-1/2} + C_2^2m^{-1}.
\tag{4.4}
\end{align*}
This combined with (4.3) gives us that
\[
E^{P_m}\Bigl[\sup_{0\le t\le T}|V_i^{10}(t)|^2\Bigr] \le \Bigl(\frac12C_1n(4m^{1/2}\tau)^2\Bigr)^2\bigl(C_2m^{-1/2} + C_2^2m^{-1}\bigr).
\]
The right-hand side above converges to 0 as $m \to 0$. This completes the proof of our assertion.

For the term $V_i^{11}(t)$, we show in the following that it is also negligible when $m \to 0$. The main idea is to use the fact that the expectation (of the integral with respect to the counting measure) is 0 (see (4.5) below), which means that we only need to calculate its variance.

Lemma 4.2.3.
\[
E^{P_m}\Bigl[\sup_{0\le t\le T}|V_i^{11}(t)|^2\Bigr] \to 0 \quad \text{as } m \to 0.
\]

Proof. We first notice that
\[
\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(0) - \varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);\vec X(0))\bigr)\,\lambda(dr,dx,dv) = 0
\tag{4.5}
\]
for any $s \in [0, 4m^{1/2}\tau\wedge T\wedge\sigma)$ and $|v| \ge C_0$. Indeed, since $|X_i(0) - X_j(0)| > R_i + R_j$ for any $i \ne j$, we have by (3.31) that $\nabla_i\tilde U(\vec X(0)) = 0$. Combining this with (3.39), we get that
\[
\int_{\mathbb{R}^{2d}}\nabla U_i(X_i(0) - x)\,\rho\Bigl(\frac12|v|^2 + \sum_{j=1}^{N}U_j(x - X_j(0))\Bigr)dx\,dv = 0.
\]
Applying Proposition 3.1.1 to this with $t = m^{-1/2}s$ and $f(x,v) = \nabla U_i(X_i(0) - x)$, we get
\[
\int_{\mathbb{R}^{2d}}\nabla U_i\bigl(X_i(0) - \varphi^0(m^{-1/2}s,x,v;\vec X(0))\bigr)\,\rho\Bigl(\frac12|v|^2 + \sum_{j=1}^{N}U_j(x - X_j(0))\Bigr)dx\,dv = 0.
\]


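Reformulating this using the ray representation yields
\[
\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(0) - \varphi^0(m^{-1/2}s,\Psi(r,x,v);\vec X(0))\bigr)\,\rho\Bigl(\frac12|v|^2 + \sum_{j=1}^{N}U_j(\Psi_0(r,x,v) - X_j(0))\Bigr)dr\,\nu(dx,dv) = 0.
\]
By the change of variable $r = m^{-1/2}\tilde r$, we obtain (4.5).

By (4.5), we get that
\begin{align*}
V_i^{11}(t) = \int_0^{t\wedge\sigma_n}1_{[0,4m^{1/2}\tau)}(s)\,ds\int_{\mathbb{R}\times E}&\nabla U_i\bigl(X_i(0) - \varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);\vec X(0))\bigr) \\
&\times(\mu_\omega(dr,dx,dv) - \lambda(dr,dx,dv)).
\tag{4.6}
\end{align*}
As in the proof of Lemma 4.2.2, (4.2) holds if $r > 6m^{1/2}\tau$ or $r < -2m^{1/2}\tau$, or if $|x| \ge R_0+1$. Let
\[
C_3 = 8\tau(2(R_0+1))^{d-1}\|\nabla U_i\|_\infty^2\int_{\mathbb{R}^d}\rho_c\Bigl(\frac12|v|^2\Bigr)|v|\,dv,
\]
which is finite by our assumption. Then we have that
\begin{align*}
&E^{P_m}\Bigl[\Bigl|\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(0) - \varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);\vec X(0))\bigr)(\mu_\omega(dr,dx,dv) - \lambda(dr,dx,dv))\Bigr|^2\Bigr] \\
&\qquad = \int_{\mathbb{R}\times E}\bigl|\nabla U_i\bigl(X_i(0) - \varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);\vec X(0))\bigr)\bigr|^2\,\lambda(dr,dx,dv) \\
&\qquad \le \int_{\mathbb{R}\times E}1_{[0,R_0+1)}(|x|)\,1_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,\|\nabla U_i\|_\infty^2\,\lambda(dr,dx,dv) \\
&\qquad = \|\nabla U_i\|_\infty^2\int_{\mathbb{R}\times E}1_{[0,R_0+1)}(|x|)\,1_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,m^{-1}\rho\Bigl(\frac12|v|^2 + \sum_{j=1}^{N}U_j\bigl(X_{j,0} - (x - m^{-1/2}rv)\bigr)\Bigr)dr\,\nu(dx,dv) \\
&\qquad \le m^{-1}\,8m^{1/2}\tau\,(2(R_0+1))^{d-1}\|\nabla U_i\|_\infty^2\int_{\mathbb{R}^d}\rho_c\Bigl(\frac12|v|^2\Bigr)|v|\,dv \\
&\qquad = C_3m^{-1/2}.
\tag{4.7}
\end{align*}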


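Therefore,
\begin{align*}
&E^{P_m}\Bigl[\sup_{t\in[0,T]}|V_i^{11}(t)|^2\Bigr] \\
&\quad \le E^{P_m}\Bigl[\Bigl(\int_0^{T\wedge\sigma}1_{[0,4m^{1/2}\tau)}(s)\Bigl|\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(0) - \varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);\vec X(0))\bigr)(\mu_\omega(dr,dx,dv) - \lambda(dr,dx,dv))\Bigr|\,ds\Bigr)^2\Bigr] \\
&\quad \le E^{P_m}\Bigl[4m^{1/2}\tau\int_0^{T\wedge\sigma}1_{[0,4m^{1/2}\tau)}(s)\Bigl|\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(0) - \varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);\vec X(0))\bigr)(\mu_\omega(dr,dx,dv) - \lambda(dr,dx,dv))\Bigr|^2ds\Bigr] \\
&\quad \le (4m^{1/2}\tau)\int_0^{4m^{1/2}\tau}ds\,E^{P_m}\Bigl[\Bigl|\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(0) - \varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);\vec X(0))\bigr)(\mu_\omega(dr,dx,dv) - \lambda(dr,dx,dv))\Bigr|^2\Bigr] \\
&\quad \le (4m^{1/2}\tau)^2C_3m^{-1/2},
\end{align*}
which converges to 0 as $m \to 0$. This completes the proof of our assertion.

Combining Lemmas 4.2.2 and 4.2.3, we get the following main result of this subsection.

Lemma 4.2.4.
\[
E^{P_m}\Bigl[\sup_{0\le t\le T}|V_i^1(t)|^2\Bigr] \to 0 \quad \text{as } m \to 0.
\]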
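Both Lemma 4.2.2 and Lemma 4.2.3 rest on the second-moment identity $E[(\int g\,d\mu_\omega)^2] = \int g^2\,d\lambda + (\int g\,d\lambda)^2$ for a Poisson point process with intensity $\lambda$. The following Monte Carlo sketch checks this identity in the simplest possible setting, a homogeneous Poisson process on $[0,1]$; the intensity `rate`, the test function `g` and the number of trials are hypothetical choices for illustration only and are unrelated to the measure $\lambda_m$ of the paper.

```python
import random

def poisson_points(rate, rng):
    """Points of a homogeneous Poisson process of intensity `rate` on [0, 1]."""
    points, t = [], rng.expovariate(rate)
    while t <= 1.0:
        points.append(t)
        t += rng.expovariate(rate)
    return points

def second_moment_mc(g, rate, trials, rng):
    """Monte Carlo estimate of E[(sum of g over the points)^2]."""
    total = 0.0
    for _ in range(trials):
        s = sum(g(t) for t in poisson_points(rate, rng))
        total += s * s
    return total / trials

rate = 5.0                      # hypothetical intensity: lambda(dt) = rate * dt on [0, 1]
g = lambda t: t                 # hypothetical test function
rng = random.Random(0)

estimate = second_moment_mc(g, rate, 20000, rng)
exact = rate / 3.0 + (rate / 2.0) ** 2   # int g^2 dlambda + (int g dlambda)^2
print(estimate, exact)
```

Subtracting $(\int g\,d\lambda)^2$ gives the variance $\int g^2\,d\lambda$, which is the form used in (4.7) for the compensated integral.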

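4.3. The term $V_i^0(t)$

Let us discuss the term $V_i^0(t)$ in this subsection. For any $r \in \mathbb{R}$, let $\tilde r = \tilde r(\omega) = ((r - 2m^{1/2}\tau)\vee 0)\wedge T\wedge\sigma(\omega)$. Notice that by Corollary 3.2.3,
\[
\nabla U_i\bigl(X_i(\tilde r) - \psi^0(m^{-1/2}(s-r),x,v;\vec X(\tilde r))\bigr) \ne 0 \ \Rightarrow\ |m^{-1/2}(s-r)| \le 2\tau.
\]
So for $s \in [4m^{1/2}\tau,\infty)$,
\[
r < 2m^{1/2}\tau \ \Rightarrow\ \nabla U_i\bigl(X_i(\tilde r) - \psi^0(m^{-1/2}(s-r),x,v;\vec X(\tilde r))\bigr) = 0.
\]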


Therefore, we have the following decomposition:
\[
V_i^0(t) = V_i^{01}(t) + V_i^{02}(t) + V_i^{03}(t) - V_i^{04}(t) + V_i^{05}(t),
\]
with
\begin{align*}
V_i^{01}(t) &= \int_0^{t\wedge\sigma_n}1_{[4m^{1/2}\tau,\infty)}(s)\,ds\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(s) - \psi^0(m^{-1/2}(s-r),x,v;\vec X(s))\bigr)\,\lambda(dr,dx,dv), \\
V_i^{02}(t) &= \int_0^{t\wedge\sigma_n}1_{[4m^{1/2}\tau,\infty)}(s)\,ds\int_{\mathbb{R}\times E}f_i(s,r,x,v)\,\mu_\omega(dr,dx,dv), \\
V_i^{03}(t) &= \int_0^{t\wedge\sigma_n}ds\int_{(2m^{1/2}\tau,\infty)\times E}\nabla U_i\bigl(X_i(\tilde r) - \psi^0(m^{-1/2}(s-r),x,v;\vec X(\tilde r))\bigr)\,(\mu_\omega(dr,dx,dv) - \lambda(dr,dx,dv)), \\
V_i^{04}(t) &= \int_0^{t\wedge\sigma_n}1_{[0,4m^{1/2}\tau)}(s)\,ds\int_{[2m^{1/2}\tau,\infty)\times E}\nabla U_i\bigl(X_i(\tilde r) - \psi^0(m^{-1/2}(s-r),x,v;\vec X(\tilde r))\bigr)\,(\mu_\omega(dr,dx,dv) - \lambda(dr,dx,dv)), \\
V_i^{05}(t) &= \int_0^{t\wedge\sigma_n}1_{[4m^{1/2}\tau,\infty)}(s)\,ds\int_{\mathbb{R}\times E}\widetilde F_i^{05}(s,r,x,v)\,\lambda(dr,dx,dv),
\end{align*}
where
\begin{align*}
f_i(s,r,x,v) &= \nabla U_i\bigl(X_i(s) - x(s,\Psi(r,x,m^{-1/2}v))\bigr) - \nabla U_i\bigl(X_i(\tilde r) - \psi^0(m^{-1/2}(s-r),x,v;\vec X(\tilde r))\bigr), \\
\widetilde F_i^{05}(s,r,x,v) &= -\bigl\{\nabla U_i\bigl(X_i(s) - \psi^0(m^{-1/2}(s-r),x,v;\vec X(s))\bigr) - \nabla U_i\bigl(X_i(\tilde r) - \psi^0(m^{-1/2}(s-r),x,v;\vec X(\tilde r))\bigr)\bigr\}.
\end{align*}
We discuss each term in the above decomposition in the following. We will show that $V_i^{02}(t)$ and $V_i^{05}(t)$ give us the "smooth" term in (3.30), and the martingale part of $V_i^{03}(t)$ gives us the "martingale" term there (see the end of Sec. 4).

For the term $V_i^{02}$, we have by definition
\[
\frac{d}{dt}V_i^{02}(t) = 1_{(4m^{1/2}\tau,\sigma)}(t)\int_{\mathbb{R}\times E}f_i(t,r,x,v;\omega)\,\mu_\omega(dr,dx,dv).
\]
By definition and assumption, we have that $\lambda_m(dr,dx,dv) = 0$ if $|v| \le 2C_0+1$. Also, by Proposition 3.6.5 and Corollary 3.2.3, $f_i(t,r,x,v) = 0$ if $|r-t| \ge 2m^{1/2}\tau$. So we only need to consider the case where $t \in [4m^{1/2}\tau, T\wedge\sigma)$, $r \in [2m^{1/2}\tau, T\wedge\sigma(\omega) + 2m^{1/2}\tau]$ and $|v| \ge 2C_0+1$.

Before going further, we first show the following, with the help of Proposition 3.6.5, Corollary 3.2.3 (which claimed that both of the two interactions exist


only for a certain range of $t-r$), and Proposition 3.6.3 (which gave an estimate for the error of our approximation of $x(t,\Psi(r,x,m^{-1/2}v))$).

Lemma 4.3.1. There exists a constant $C > 0$ such that
\[
|f_i(t,r,x,v)| \le 1_{[0,R_0+1)}(|x|)\,1_{[-m^{1/2}\tau,2m^{1/2}\tau)}(t-r)\cdot Cm^{1/2},
\]
if $t \in [4m^{1/2}\tau, T\wedge\sigma]$, $|r - 2m^{1/2}\tau| \le T\wedge\sigma(\omega)$ and $|v| \ge 2C_0+1$.

Proof. First, since $t \in [0,T\wedge\sigma_n)$, we have by Proposition 3.6.5 that $\nabla U_i(X_i(t) - x(t,\Psi(r,x,m^{-1/2}v))) = 0$ if $t-r > 2m^{1/2}\tau$ or $t-r < -m^{1/2}\tau$. Also, since $\tilde r \in [0,T\wedge\sigma_n)$ by definition, we have $|X_i(\tilde r)| \le |X_{i,0}| + nT$, so by Corollary 3.2.3, $\nabla U_i(X_i(\tilde r) - \psi^0(m^{-1/2}(t-r),x,v;\vec X(\tilde r))) = 0$ if $t-r \ge 2m^{1/2}\tau$ or $t-r \le -m^{1/2}\tau$. Combining the above, we get that $f_i(t,r,x,v) = 0$ if $r \notin [t-2m^{1/2}\tau, t+m^{1/2}\tau]$.

Next, for $r \in [t-2m^{1/2}\tau, t+m^{1/2}\tau]$, if $|x| \ge R_0+1$, since $x\cdot v = 0$, we get easily that $|x(t,\Psi(r,x,m^{-1/2}v))| = |x - (r-t)m^{-1/2}v| \ge |x| \ge R_0+1$, hence both of the terms of $f_i(t,r,x,v)$ are equal to 0.

Finally, we show, for $|x| < R_0+1$ and $r \in [t-2m^{1/2}\tau, t+m^{1/2}\tau]$, that $|f_i(t,r,x,v)| \le Cm^{1/2}$. For this kind of $x$ and $r$, since $t \in [4m^{1/2}\tau, T\wedge\sigma(\omega)]$, we have by definition $2m^{1/2}\tau \le r \le T\wedge\sigma + m^{1/2}\tau$, so $\tilde r = r - 2m^{1/2}\tau$. We have
\[
|f_i(t,r,x,v)| \le \|\nabla^2U_i\|_\infty\bigl(|X_i(t) - X_i(\tilde r)| + |x(t,\Psi(r,x,m^{-1/2}v)) - \psi^0(m^{-1/2}(t-r),x,v;\vec X(\tilde r))|\bigr).
\]
The term involving $X$ is easy. Indeed, since $t, \tilde r \in [0,T\wedge\sigma(\omega)]$, we have by definition
\[
|X_i(t) - X_i(\tilde r)| \le n|t-\tilde r| = n|t - (r-2m^{1/2}\tau)| \le n(|t-r| + 2m^{1/2}\tau) \le n\,4m^{1/2}\tau.
\]
We next deal with the second absolute value above. Notice that by assumption, $0 \le r - 2m^{1/2}\tau \le T\wedge\sigma(\omega)$, $0 \le r - m^{1/2}\tau \le T\wedge\sigma(\omega)$ and $0 \le t \le T\wedge\sigma(\omega)$. Therefore, by Proposition 3.6.3(3) (with $(t,s,a)$ there given by $(m^{-1/2}(t-r), r, 2\tau)$), there exists a constant $\tilde C$ such that
\[
\bigl|x(t,\Psi(r,x,m^{-1/2}v)) - \psi^0(m^{-1/2}(t-r),x,v;\vec X(r-2m^{1/2}\tau))\bigr| \le m^{1/2}\tilde C(2\tau+2\tau).
\]
Combining the above, we get our assertion.
Now we are ready to prove the following result concerning the term $V_i^{02}(t)$.

Lemma 4.3.2. We have that
\[ \sup_{m \in (0,1]} \sup_{0 \le t \le T} E^{P_m}\Bigl[ \Bigl| \frac{d}{dt} V_i^{02}(t) \Bigr|^2 \Bigr] < \infty. \]
In particular, {the distribution of $\{V_i^{02}(t)\}_{t \in [0,T]}$ under $P_m$}$_{m \in (0,1]}$ is tight in $\wp(D([0, T]; \mathbb{R}^d))$.

August 10, J070-S0129055X10004077

784

2010 15:0 WSPC/S0129-055X

148-RMP

S. Kusuoka & S. Liang

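Before giving the proof, we remark that this result is natural since by Lemma 4.3.1, $f_i(s, r, x, v)$ is not 0 only if $r$ is very near to $s$, which implies by Proposition 3.6.3(3) and 3.6.4 that $\psi^0(m^{-1/2}(s - r), x, v; \vec X(r - 2m^{1/2}\tau))$ is a good approximation of $x(s, \Psi(r, x, m^{-1/2}v))$.

Proof. By Lemma 4.3.1, we have
\[ \Bigl| \frac{d}{dt} V_i^{02}(t) \Bigr| \le C m^{1/2} \int_{\mathbb{R} \times E} 1_{[0,R_0+1)}(|x|)\, 1_{[-m^{1/2}\tau,\, 2m^{1/2}\tau)}(t - r)\, \mu_\omega(dr, dx, dv). \]
Therefore,
\[ E^{P_m}\Bigl[ \Bigl| \frac{d}{dt} V_i^{02}(t) \Bigr|^2 \Bigr] \le E^{P_m}\Bigl[ \Bigl( C m^{1/2} \int_{\mathbb{R} \times E} 1_{[0,R_0+1)}(|x|)\, 1_{[-m^{1/2}\tau,\, 2m^{1/2}\tau)}(t - r)\, \mu_\omega(dr, dx, dv) \Bigr)^2 \Bigr] \]
\[ \le C^2 m \int_{\mathbb{R} \times E} 1_{[0,R_0+1)}(|x|)\, 1_{[-m^{1/2}\tau,\, 2m^{1/2}\tau)}(t - r)\, \lambda_m(dr, dx, dv) + \Bigl( C m^{1/2} \int_{\mathbb{R} \times E} 1_{[0,R_0+1)}(|x|)\, 1_{[-m^{1/2}\tau,\, 2m^{1/2}\tau)}(t - r)\, \lambda_m(dr, dx, dv) \Bigr)^2. \tag{4.8} \]
Let $c = \sum_{i=1}^N \|U_i\|_\infty$, and $\widetilde C = 3\tau [2(R_0 + 1)]^{d-1} \int_{\mathbb{R}^d} \rho_c(\tfrac12 |v|^2) |v|\, dv$, which is finite by assumption. Then
\[ \int_{\mathbb{R} \times E} 1_{[0,R_0+1)}(|x|)\, 1_{[-m^{1/2}\tau,\, 2m^{1/2}\tau)}(t - r)\, \lambda_m(dr, dx, dv) \]
\[ = \int_{\mathbb{R} \times E} 1_{[0,R_0+1)}(|x|)\, 1_{[-m^{1/2}\tau,\, 2m^{1/2}\tau)}(t - r) \times m^{-1} \rho\Bigl( \tfrac12 |v|^2 + \sum_{i=1}^N U_i(x - m^{-1/2} r v - X_{i,0}) \Bigr)\, dr\, |v|\, \nu(dx; v)\, dv \]
\[ \le m^{-1}\, 3 m^{1/2} \tau \int_E 1_{[0,R_0+1)}(|x|)\, \rho_c(\tfrac12 |v|^2)\, |v|\, \nu(dx; v)\, dv \]
\[ \le 3 m^{-1/2} \tau [2(R_0 + 1)]^{d-1} \int_{\mathbb{R}^d} \rho_c(\tfrac12 |v|^2)\, |v|\, dv = \widetilde C m^{-1/2}. \tag{4.9} \]
Combining (4.8) and (4.9), we get that
\[ \widehat C := \sup_{m \in (0,1]} \sup_{0 \le t \le T} E^{P_m}\Bigl[ \Bigl| \frac{d}{dt} V_i^{02}(t) \Bigr|^2 \Bigr] < \infty, \]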


which is exactly the first half of our assertion. Therefore,
\[ E[|V_i^{02}(t) - V_i^{02}(t')|^2] \le \widehat C |t - t'|^2, \]
hence by Theorem 3.4.1 (with $\beta = \varepsilon = \gamma = 1$), $\{\{V_i^{02}(t)\}_{t \in [0,T]}$ under $P_m\}_{m \in (0,1]}$ is tight in $\wp(D([0, T]; \mathbb{R}^d))$.

We next deal with $V_i^{01}(t)$. By using Proposition 3.2.4, we show that it is equal to $m^{-1/2} \int_0^{t \wedge \sigma} \nabla_i \widetilde U(\vec X(s))\, ds$, which gives us the "colliding" term in Theorem 2.0.1(4).

Lemma 4.3.3. There exists an $m_0 > 0$ (depending on $\vec X_0$, $n$, $T$ and $U_i$, $i = 1, \dots, N$) such that for any $m \le m_0$,
\[ V_i^{01}(t) = m^{-1/2} \int_0^{t \wedge \sigma} \nabla_i \widetilde U(\vec X(s))\, ds, \]
where $\widetilde U$ is as defined in Sec. 3.5.

Proof. Suppose that $\nabla U_i(X_i(s) - \psi^0(m^{-1/2}(s - r), x, v; \vec X(s))) \ne 0$. Then $s - r < 2m^{1/2}\tau$ by Proposition 3.6.5, and this combined with $s \ge 4m^{1/2}\tau$ implies that $r > 2m^{1/2}\tau = 2m^{1/2} C_0^{-1} R$. Since $|v| \ge 2C_0 + 1$ and $x \cdot v = 0$ for $\lambda_m$-almost every $(r, x, v)$, this implies $|x - m^{-1/2} r v| \ge m^{-1/2} r |v| \ge R_0$, hence $U_i(X_{i,0} - (x - m^{-1/2} r v)) = 0$. Therefore, by definition, Proposition 3.2.4 and (3.39),
\[ V_i^{01}(t) = \int_0^{t \wedge \sigma_n} 1_{[4m^{1/2}\tau, \infty)}(s)\, ds \int_{\mathbb{R} \times E} \nabla U_i(X_i(s) - \psi^0(m^{-1/2}(s - r), x, v; \vec X(s))) \times m^{-1} \rho\Bigl( \tfrac12 |v|^2 + \sum_{i=1}^N U_i(x - m^{-1/2} r v - X_{i,0}) \Bigr)\, dr\, \nu(dx, dv) \]
\[ = \int_0^{t \wedge \sigma_n} 1_{[4m^{1/2}\tau, \infty)}(s)\, ds \int_{\mathbb{R} \times E} \nabla U_i(X_i(s) - \psi^0(m^{-1/2}(s - r), x, v; \vec X(s))) \times m^{-1} \rho(\tfrac12 |v|^2)\, dr\, \nu(dx, dv) \]
\[ = m^{-1/2} \int_0^{t \wedge \sigma_n} 1_{[4m^{1/2}\tau, \infty)}(s)\, ds \int_{\mathbb{R}^{2d}} \nabla U_i(X_i(s) - x)\, \rho\Bigl( \tfrac12 |v|^2 + \sum_{k=1}^N U_k(x - X_{k,0}) \Bigr)\, dx\, dv \]
\[ = \int_0^{t \wedge \sigma_n} 1_{[4m^{1/2}\tau, \infty)}(s)\, m^{-1/2}\, \nabla_i \widetilde U(\vec X(s))\, ds, \]
where we used Proposition 3.2.4 in passing to the third equality, and used (3.39) in passing to the last equality. So in order to complete the proof of our assertion, it suffices to show that $\nabla_i \widetilde U(\vec X(s)) = 0$ for any $s \in [0, 4m^{1/2}\tau \wedge \sigma]$, if $m$ is small enough. We show it from


now on. Notice that since $|X_{i,0} - X_{j,0}| > R_i + R_j$ for any $i, j = 1, \dots, N$ with $i \ne j$ by assumption, there exists an $m_0 > 0$ (small enough) such that for any $m \le m_0$, we have $|X_{i,0} - X_{j,0}| > R_i + R_j + 8m^{1/2}\tau n$ for any $i \ne j$. Also, by definition, we have $|X_i(s) - X_{i,0}| \le sn \le 4m^{1/2}\tau n$ for any $s \in [0, 4m^{1/2}\tau \wedge \sigma]$ and $i = 1, \dots, N$. Therefore,
\[ |X_i(s) - X_j(s)| \ge |X_{i,0} - X_{j,0}| - |X_i(s) - X_{i,0}| - |X_j(s) - X_{j,0}| > R_i + R_j + 8m^{1/2}\tau n - 4m^{1/2}\tau n - 4m^{1/2}\tau n = R_i + R_j, \]
so by (3.31), $\nabla_i \widetilde U(\vec X(s)) = 0$ for any $s \in [0, 4m^{1/2}\tau \wedge \sigma]$. This completes the proof of our assertion.

Before discussing the term $V_i^{05}(t)$, let us first prepare, by using Gronwall's Lemma, the continuity of $\psi^0(t, x, v; \vec X)$ with respect to $\vec X$:

Lemma 4.3.4. For any $Y > 0$, there exists a constant $\widetilde C$ (depending on $\max_{i=1}^N R_i + Y$, $\tau$, $C_0$ and $\sum_{i=1}^N \|\nabla^2 U_i\|_\infty$) such that
\[ |\psi^0(t, x, v; \vec X^1) - \psi^0(t, x, v; \vec X^2)| \le \widetilde C\, \|\vec X^1 - \vec X^2\|_{\mathbb{R}^d}, \]
for any $(x, v) \in E$, $|v| \ge 2C_0 + 1$, $|t| \le 2\tau$ and $|\vec X^1|, |\vec X^2| \le Y$.

Proof. Choose and fix any $v \in \mathbb{R}^d$ with $|v| \ge 2C_0 + 1$, and let $s_0 = \frac{\max_{i=1}^N R_i + Y}{|v|} \vee 2\tau$. Let $g(t) = \psi^0(t, x, v; \vec X^1) - \psi^0(t, x, v; \vec X^2)$. Then by definition,
\[ g(t) = \varphi^0(t + s_0, x - s_0 v, v; \vec X^1) - \varphi^0(t + s_0, x - s_0 v, v; \vec X^2), \]
so
\[ \frac{d^2}{dt^2} g(t) = -\sum_{i=1}^N \nabla U_i(\varphi^0(t + s_0, x - s_0 v, v; \vec X^1) - X_i^1) + \sum_{i=1}^N \nabla U_i(\varphi^0(t + s_0, x - s_0 v, v; \vec X^2) - X_i^2). \]
Let $C = \sum_{i=1}^N \|\nabla^2 U_i\|_\infty$, then
\[ \Bigl| \frac{d^2}{dt^2} g(t) \Bigr| \le \sum_{i=1}^N \|\nabla^2 U_i\|_\infty (|g(t)| + |X_i^1 - X_i^2|) \le C(|g(t)| + \|\vec X^1 - \vec X^2\|_{\mathbb{R}^d}), \]
therefore,
\[ \Bigl| \frac{d}{dt} \Bigl( g(t), \frac{d}{dt} g(t) \Bigr) \Bigr| \le \Bigl| \frac{d^2}{dt^2} g(t) \Bigr| + \Bigl| \frac{d}{dt} g(t) \Bigr| \le C \|\vec X^1 - \vec X^2\|_{\mathbb{R}^d} + (1 + C) \Bigl| \Bigl( g(t), \frac{d}{dt} g(t) \Bigr) \Bigr|. \]


Also, $g(-s_0) = \frac{d}{dt} g(-s_0) = 0$. Let $h(t) = |(g(t - s_0), \frac{d}{dt} g(t - s_0))|$. Then $h(0) = 0$, and for any $t \in [0, s_0 + 2\tau]$,
\[ h(t) \le C \|\vec X^1 - \vec X^2\|_{\mathbb{R}^d} (s_0 + 2\tau) + (1 + C) \int_0^t h(s)\, ds, \]
so by Gronwall's Lemma,
\[ h(t) \le C \|\vec X^1 - \vec X^2\|_{\mathbb{R}^d} (s_0 + 2\tau)\, e^{(1+C)(s_0 + 2\tau)}, \qquad t \in [0, s_0 + 2\tau]. \]
Notice that since $|v| \ge 2C_0 + 1$, we have $2\tau \le s_0 \le \frac{\max_{i=1}^N R_i + Y}{2C_0 + 1} \vee 2\tau$. Therefore,
\[ |g(t)| \le h(t + s_0) \le C \Bigl( \frac{\max_{i=1}^N R_i + Y}{2C_0 + 1} \vee 2\tau + 2\tau \Bigr) e^{(1+C)\bigl( \frac{\max_{i=1}^N R_i + Y}{2C_0 + 1} \vee 2\tau + 2\tau \bigr)} \|\vec X^1 - \vec X^2\|_{\mathbb{R}^d}, \]
for any $t \in [-2\tau, 2\tau]$. This completes the proof of our assertion.

We use Lemma 4.3.4 to prove the following:

Lemma 4.3.5. There exists a constant $C > 0$ such that
\[ |\widetilde F_i^{05}(s, r, x, v)| \le C m^{1/2}\, 1_{[0, 2m^{1/2}\tau]}(|s - r|)\, 1_{[0,R_0+1)}(|x|) \]
for $s \in [4m^{1/2}\tau, T \wedge \sigma_n]$.

Proof. First, since $s, r \in [0, T \wedge \sigma(\omega)]$ in our domain, it is easy to see that $|\widetilde F_i^{05}(s, r, x, v)| = 0$ if $|x| \ge R_0 + 1$. Also, by Corollary 3.2.3, $|\widetilde F_i^{05}(s, r, x, v)| = 0$ if $|m^{-1/2}(s - r)| \ge 2\tau$. Finally, for $|x| \le R_0 + 1$ and $|s - r| \le 2m^{1/2}\tau$, by definition and Lemma 4.3.4, we only need to show the following:
\[ |X_i(s) - X_i(\tilde r)| \le C m^{1/2}, \qquad s \ge 4m^{1/2}\tau. \tag{4.10} \]
To show (4.10), again, notice that in the present setting, $0 \le r - 2m^{1/2}\tau \le T \wedge \sigma$, so $\tilde r = r - 2m^{1/2}\tau$. So the left-hand side of (4.10) equals $|X_i(s) - X_i(r - 2m^{1/2}\tau)| \le n|s - (r - 2m^{1/2}\tau)| \le n(|s - r| + 2m^{1/2}\tau) \le n\, 4m^{1/2}\tau$. This completes the proof of our assertion.

By Lemma 4.3.5, we get the following lemma in the same way as we derived Lemma 4.3.2 from Lemma 4.3.1.

Lemma 4.3.6. (1) $\sup_{m \in (0,1]} \sup_{0 \le t \le T} E^{P_m}[|\frac{d}{dt} V_i^{05}(t)|^2] < \infty$, (2) {the distribution of $\{V_i^{05}(t)\}_{t \in [0,T]}$ under $P_m$}$_{m \in (0,1]}$ is tight in $\wp(D([0, T]; \mathbb{R}^d))$.
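The Gronwall step in the proof of Lemma 4.3.4 turns an integral inequality $h(t) \le A + b \int_0^t h(s)\,ds$ into the bound $h(t) \le A e^{bt}$, with $A = C\|\vec X^1 - \vec X^2\|(s_0 + 2\tau)$ and $b = 1 + C$. The following is a minimal numerical sketch of that implication, not part of the original argument: the function names and the sample values of `A` and `b` are illustrative assumptions, and the extremal case is discretized with explicit Euler steps.

```python
import math


def gronwall_bound(A, b, t):
    """Gronwall bound A * exp(b * t) for any h with h(t) <= A + b * int_0^t h(s) ds."""
    return A * math.exp(b * t)


def extremal_path(A, b, t_end, steps=1000):
    """Explicit Euler iterates of h' = b * h, h(0) = A.

    This is the discrete analogue of the extremal case h(t) = A + b * int_0^t h(s) ds,
    where the integral inequality holds with equality.
    """
    dt = t_end / steps
    h = [A]
    for _ in range(steps):
        h.append(h[-1] * (1.0 + b * dt))
    return h


if __name__ == "__main__":
    A, b, t_end, steps = 2.0, 1.5, 3.0, 1000  # illustrative values only
    h = extremal_path(A, b, t_end, steps)
    dt = t_end / steps
    worst_ratio = max(h[k] / gronwall_bound(A, b, k * dt) for k in range(steps + 1))
    print("max ratio h / bound:", worst_ratio)
```

Since $1 + x \le e^x$, each Euler iterate stays below the exponential bound, which mirrors why the constant in Lemma 4.3.4 is an exponential in $s_0 + 2\tau$.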

We show that the term $V_i^{04}$ is negligible. Precisely, we show the following:

Lemma 4.3.7.
\[ E^{P_m}\Bigl[ \sup_{0 \le t \le T} |V_i^{04}(t)|^2 \Bigr] \to 0 \qquad \text{as } m \to 0. \]


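Proof. The proof is similar to previous ones. It is easier than the one of Lemma 4.2.3, where we had to show first that the expectation is 0 (see (4.5)), whereas now we are considering only the variance from the very beginning. We have for any $s \in [0, 4m^{1/2}\tau]$ that
\[ |\nabla U_i(X_i(\tilde r) - \psi^0(m^{-1/2}(s - r), x, v; \vec X(\tilde r)))| \le \|\nabla U_i\|_\infty 1_{[0,R_0+1)}(|x|)\, 1_{[0, 2m^{1/2}\tau)}(|s - r|) \le \|\nabla U_i\|_\infty 1_{[0,R_0+1)}(|x|)\, 1_{[-2m^{1/2}\tau,\, 6m^{1/2}\tau]}(r). \]
Let $C_4 = 8 \|\nabla U_i\|_\infty^2 \tau (2(R_0 + 1))^{d-1} \int_{\mathbb{R}^d} \rho_c(\tfrac12 |v|^2) |v|\, dv$, which is finite. Then we have by the definition of $\lambda$ and the assumption A2 that
\[ E^{P_m}\Bigl[ \Bigl| \int_{[2m^{1/2}\tau, \infty) \times E} \nabla U_i(X_i(\tilde r) - \psi^0(m^{-1/2}(s - r), x, v; \vec X(\tilde r))) \times (\mu_\omega(dr, dx, dv) - \lambda(dr, dx, dv)) \Bigr|^2 \Bigr] \]
\[ = E^{P_m}\Bigl[ \int_{[2m^{1/2}\tau, \infty) \times E} |\nabla U_i(X_i(\tilde r) - \psi^0(m^{-1/2}(s - r), x, v; \vec X(\tilde r)))|^2\, \lambda(dr, dx, dv) \Bigr] \]
\[ \le \int_{[2m^{1/2}\tau, \infty) \times E} \|\nabla U_i\|_\infty^2 1_{[0,R_0+1)}(|x|)\, 1_{[-2m^{1/2}\tau,\, 6m^{1/2}\tau]}(r) \times m^{-1} \rho\Bigl( \tfrac12 |v|^2 + \sum_{j=1}^N U_j(x - m^{-1/2} r v - X_{j,0}) \Bigr)\, dr\, \nu(dx, dv) \]
\[ \le \|\nabla U_i\|_\infty^2\, 8 m^{1/2} \tau (2(R_0 + 1))^{d-1} m^{-1} \int_{\mathbb{R}^d} \rho_c(\tfrac12 |v|^2) |v|\, dv = C_4 m^{-1/2}. \tag{4.11} \]
Therefore,
\[ E^{P_m}\Bigl[ \sup_{0 \le t \le T} |V_i^{04}(t)|^2 \Bigr] \le E^{P_m}\Bigl[ 4m^{1/2}\tau \int_0^{4m^{1/2}\tau} \Bigl| \int_{[2m^{1/2}\tau, \infty) \times E} \nabla U_i(X_i(\tilde r) - \psi^0(m^{-1/2}(s - r), x, v; \vec X(\tilde r))) \times (\mu_\omega(dr, dx, dv) - \lambda(dr, dx, dv)) \Bigr|^2 ds \Bigr] \le (4m^{1/2}\tau)^2 C_4 m^{-1/2}, \]
which converges to 0 as $m \to 0$. This completes the proof of our assertion.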


Now, the only term left to be discussed is $V_i^{03}$. We deal with it in the next subsection.

4.4. The term $V_i^{03}$

We deal with the term $V_i^{03}$ in this subsection. More precisely, we show that it is equal to a martingale plus a negligible term.

We first prepare some notations. Let $\mathcal F_t = \mathcal F_t^{(m,n)} = \mathcal F_{(-\infty,\, 2m^{1/2}\tau + t) \times E} \vee \aleph$ as in Sec. 3.5. Then $\mathcal F_t$ is increasing and right continuous. Let
\[ N((0, t] \times A) := \mu_\omega((2m^{1/2}\tau,\, 2m^{1/2}\tau + t] \times A) \]
for any $A \in \mathcal B(E)$. Notice that if $\rho(\tfrac12 |v|^2 + \sum_{j=1}^N U_j(X_{j,0} - (x - m^{-1/2} r v))) > 0$, then $|v| \ge 2C_0 + 1$, hence if $r \ge m^{1/2}\tau$ in addition, then $|x - m^{-1/2} r v| \ge \tau |v| > R_0$, so $\rho(\tfrac12 |v|^2 + \sum_{j=1}^N U_j(X_{j,0} - (x - m^{-1/2} r v))) = \rho(\tfrac12 |v|^2)$. Therefore, if we let
\[ \widetilde\nu(dx, dv) = \rho(\tfrac12 |v|^2)\, \nu(dx, dv), \]
then $N$ is the $\mathcal F_t$-adapted Poisson point process with intensity measure $\lambda(dt, dx, dv) = m^{-1} dt\, \widetilde\nu(dx, dv) = m^{-1} dt\, \rho(\tfrac12 |v|^2)\, \nu(dx, dv)$. Notice that $N((s, t] \times A)$ is independent of $\mathcal F_s$ for any $s < t$ and $A \in \mathcal B(E)$. Let
\[ \bar N(dt, dx, dv) = N(dt, dx, dv) - m^{-1} dt\, \widetilde\nu(dx, dv). \]
Notice that $X_i(t \wedge \sigma)$ and $V_i(t \wedge \sigma)$ are $\mathcal F_t$-measurable. Also, since $\nabla U_i(X_i(\tilde r) - \psi^0(m^{-1/2}(s - r), x, v; \vec X(\tilde r))) \ne 0$ only if $|m^{-1/2}(s - r)| \le 2\tau$, which combined with $r \ge 2m^{1/2}\tau$ and $s \le T \wedge \sigma$ implies $\tilde r = r - 2m^{1/2}\tau$, we get by definition that
\[ V_i^{03}(t) = \int_0^{t \wedge \sigma} ds \int_{[2m^{1/2}\tau,\, 2m^{1/2}\tau + (T \wedge \sigma)) \times E} \nabla U_i(X_i(r - 2m^{1/2}\tau) - \psi^0(m^{-1/2}(s - r), x, v; \vec X(r - 2m^{1/2}\tau))) \times (\mu_\omega(dr, dx, dv) - \lambda(dr, dx, dv)) \]
\[ = \int_0^{t \wedge \sigma} ds \int_{[0, T \wedge \sigma) \times E} \nabla U_i(X_i(r) - \psi^0(m^{-1/2}(s - r) - 2\tau, x, v; \vec X(r)))\, \bar N(dr, dx, dv). \]
In the last expression above, if $r > t \wedge \sigma$, then since $s \le t \wedge \sigma$, we get $m^{-1/2}(s - r) - 2\tau < -\tau$, hence $\nabla U_i(X_i(r) - \psi^0(m^{-1/2}(s - r) - 2\tau, x, v; \vec X(r))) = 0$. Therefore,
\[ V_i^{03}(t) = \int_0^{t \wedge \sigma} ds \int_{[0, t \wedge \sigma) \times E} \nabla U_i(X_i(r) - \psi^0(m^{-1/2}(s - r) - 2\tau, x, v; \vec X(r)))\, \bar N(dr, dx, dv). \]


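Let
\[ \widetilde{\widetilde V}{}_i^{03}(t) = \int_0^t ds \int_{(0,t] \times E} \bar N(dr, dx, dv) \times \nabla U_i(X_i(r \wedge \sigma) - \psi^0(m^{-1/2}(s - r) - 2\tau, x, v; \vec X(r \wedge \sigma))). \]
Then $V_i^{03}(t) = \widetilde{\widetilde V}{}_i^{03}(t \wedge \sigma)$. By Corollary 3.2.3, $\nabla U_i(X_i(r \wedge \sigma) - \psi^0(u, x, v; \vec X(r \wedge \sigma))) = 0$ if $|u| \ge 2\tau$. So the integral domain $s \in [0, t]$ in the definition of $\widetilde{\widetilde V}{}_i^{03}(t)$, which is equivalent to $s - r \in [-r, t - r]$, can be substituted by $s - r \in [0, (t - r) \wedge 4m^{1/2}\tau] = [0, 4m^{1/2}\tau] \setminus [(t - r) \wedge (4m^{1/2}\tau), 4m^{1/2}\tau]$. Therefore, $\widetilde{\widetilde V}{}_i^{03}(t)$ can be decomposed into
\[ \widetilde{\widetilde V}{}_i^{03}(t) = \widetilde M_i(t) + \widetilde\eta_i(t), \]
where
\[ \widetilde M_i(t) = \int_0^{4m^{1/2}\tau} ds \int_{(0,t] \times E} \bar N(dr, dx, dv) \times \nabla U_i(X_i(r \wedge \sigma) - \psi^0(m^{-1/2} s - 2\tau, x, v; \vec X(r \wedge \sigma))), \]
\[ \widetilde\eta_i(t) = -\int_{(0,t] \times E} \bar N(dr, dx, dv) \int_{(t - r) \wedge (4m^{1/2}\tau)}^{4m^{1/2}\tau} ds \times \nabla U_i(X_i(r \wedge \sigma) - \psi^0(m^{-1/2} s - 2\tau, x, v; \vec X(r \wedge \sigma))). \]
By definition (notice that the integral domain $(0, t] \times E$ in the definition of $\widetilde{\widetilde V}{}_i^{03}(t)$ can always be converted into $(0, T] \times E$ whenever necessary, and vice versa),
\[ \frac{d}{dt} \widetilde{\widetilde V}{}_i^{03}(t) = \int_{(0,t] \times E} \bar N(dr, dx, dv) \times \nabla U_i(X_i(r \wedge \sigma) - \psi^0(m^{-1/2}(t - r) - 2\tau, x, v; \vec X(r \wedge \sigma))), \]
so with $C_1 = 4\tau \|\nabla U_i\|_\infty^2 (2(R_0 + 1))^{d-1} \int_{\mathbb{R}^d} \rho(\tfrac12 |v|^2) |v|\, dv$, we have
\[ E^{P_m}\Bigl[ \Bigl| \frac{d}{dt} \widetilde{\widetilde V}{}_i^{03}(t) \Bigr|^2 \Bigr] = E\Bigl[ \int_{(0,t] \times E} |\nabla U_i(X_i(r \wedge \sigma) - \psi^0(m^{-1/2}(t - r) - 2\tau, x, v; \vec X(r \wedge \sigma)))|^2\, \lambda(dr, dx, dv) \Bigr] \]
\[ \le \int_{(0,t] \times E} \|\nabla U_i\|_\infty^2 1_{[0,R_0+1)}(|x|)\, 1_{[0, 2\tau]}(|m^{-1/2}(t - r) - 2\tau|) \times m^{-1} dr\, \rho(\tfrac12 |v|^2)\, \nu(dx, dv) \]
\[ \le 4m^{1/2}\tau \|\nabla U_i\|_\infty^2 (2(R_0 + 1))^{d-1} m^{-1} \int_{\mathbb{R}^d} \rho(\tfrac12 |v|^2) |v|\, dv = C_1 m^{-1/2}. \tag{4.12} \]
This fact will be used later.

Let us study the term $\widetilde M_i(t)$. First, it is easy to see by definition that $\widetilde M_i(t)$ is an $\mathcal F_t$-martingale, with its jumps satisfying $|\Delta \widetilde M_i| \le 4m^{1/2}\tau \|\nabla U_i\|_\infty$. Also, with $C = (4\tau)^2 \|\nabla U_i\|_\infty^2 (2(R_0 + 1))^{d-1} \int_{\mathbb{R}^d} \rho(\tfrac12 |v|^2) |v|\, dv$, we have that, for any $0 \le s \le t \le T$,
\[ E^{P_m}[|\widetilde M_i(t) - \widetilde M_i(s)|^2 \mid \mathcal F_s] = E^{P_m}\Bigl[ \int_{(s,t) \times E} \Bigl| \int_0^{4m^{1/2}\tau} \nabla U_i(X_i(r) - \psi^0(m^{-1/2} u - \tau, x, v; \vec X(r)))\, du \Bigr|^2 \times 1_{[0,R_0+1)}(|x|)\, m^{-1} dr\, \rho(\tfrac12 |v|^2)\, \nu(dx, dv) \Bigm| \mathcal F_s \Bigr] \le C|t - s|, \tag{4.13} \]
hence for any $0 \le r \le s \le t \le T$,
\[ E^{P_m}[|\widetilde M_i(t) - \widetilde M_i(s)|^2\, |\widetilde M_i(s) - \widetilde M_i(r)|^2] \le C^2 |t - s||s - r|. \tag{4.14} \]
Also, by Doob's inequality and (4.13), we get
\[ E^{P_m}\Bigl[ \sup_{t \in [0,T]} |\widetilde M_i(t)| \Bigr] \le E^{P_m}\Bigl[ \Bigl| \sup_{t \in [0,T]} |\widetilde M_i(t)| \Bigr|^2 \Bigr]^{1/2} \le 2 \sup_{t \in [0,T]} E^{P_m}[|\widetilde M_i(t)|^2]^{1/2} \le 2 \sup_{t \in [0,T]} \sqrt{Ct} = 2\sqrt{CT} < \infty. \tag{4.15} \]
By Theorem 3.4.1 (with $\varepsilon = 1$, $\beta = 2$ and $\gamma = 1/2$), (4.13)–(4.15) imply the following:

Lemma 4.4.1. {The distribution of $\{\widetilde M_i(t)\}_{t \in [0,T]}$ under $P_m$}$_{m \in (0,1]}$ is tight in $\wp(D([0, T]; \mathbb{R}^d))$.

We next show that under any of its cluster points as $m \to 0$, the canonical process is continuous with probability 1. We first make the following preparation.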


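Lemma 4.4.2. For any $\varepsilon \in (0, 1]$, let
\[ A = \bigcap_{\delta > 0} \Bigl\{ \omega \in D([0, T]; \mathbb{R}^d) : \sup_{|t - s| \le \delta} |\omega(t) - \omega(s)| > \varepsilon \Bigr\}, \qquad B = \bigcap_{\delta > 0} \Bigl\{ \omega \in D([0, T]; \mathbb{R}^d) : \sup_{|t - s| \le e\delta} |\omega(t) - \omega(s)| > \frac{\varepsilon}{2} \Bigr\}. \]
Then $A \subset \bar A \subset B^o \subset B$. Here $\bar A$ and $B^o$ mean the closure of $A$ and the interior of $B$ in $(D([0, T]; \mathbb{R}^d), d_0)$, respectively.

Proof. For any $\omega_0 \in A$ and $\omega \in D([0, T]; \mathbb{R}^d)$ with $d_0(\omega, \omega_0) < \varepsilon/5$, we have that $\omega \in B$. Indeed, by definition, there exists a continuous non-decreasing function $\lambda : [0, T] \to [0, T]$ such that $\lambda(0) = 0$, $\lambda(T) = T$, and
\[ |\lambda(t) - \lambda(s)| \le e^{\varepsilon/4} |t - s| \le e|t - s| \quad \text{for any } 0 \le s < t \le T, \qquad \sup_{0 \le t \le T} |\omega_0(t) - \omega(\lambda(t))| \le \varepsilon/4. \]
Therefore,
\[ \sup_{|t - s| \le e\delta} |\omega(t) - \omega(s)| = \sup_{|\lambda(t) - \lambda(s)| \le e\delta} |\omega(\lambda(t)) - \omega(\lambda(s))| \ge \sup_{|t - s| \le \delta} |\omega(\lambda(t)) - \omega(\lambda(s))| \]
\[ \ge \sup_{|t - s| \le \delta} |\omega_0(t) - \omega_0(s)| - \sup_{0 \le t \le T} |\omega_0(t) - \omega(\lambda(t))| - \sup_{0 \le s \le T} |\omega_0(s) - \omega(\lambda(s))| > \varepsilon - \frac{\varepsilon}{4} - \frac{\varepsilon}{4} = \varepsilon/2, \]
which means that $\omega \in B$. This completes the proof of our assertion.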
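The sets $A$ and $B$ in Lemma 4.4.2 are defined through the oscillation $\sup_{|t-s| \le \delta} |\omega(t) - \omega(s)|$, which stays above a fixed level for every small $\delta$ exactly when the path has a jump of that size. The following sketch illustrates this on uniformly sampled real-valued paths. It is an illustration only: the function name `modulus`, the restriction to scalar paths, and measuring $\delta$ in grid steps (`max_lag`) are assumptions made for the example.

```python
def modulus(path, max_lag):
    """Discrete oscillation: max |path[i + lag] - path[i]| over 1 <= lag <= max_lag.

    `path` is a list of samples on a uniform grid; `max_lag` plays the role of
    delta measured in grid steps.
    """
    best = 0.0
    n = len(path)
    for lag in range(1, max_lag + 1):
        for i in range(n - lag):
            best = max(best, abs(path[i + lag] - path[i]))
    return best


if __name__ == "__main__":
    jump_path = [0.0] * 5 + [1.0] * 5           # one jump of size 1
    linear_path = [0.1 * i for i in range(10)]  # slope 0.1 per grid step
    for lag in (1, 2, 4):
        print(lag, modulus(jump_path, lag), modulus(linear_path, lag))
```

For the path with a jump, the oscillation does not shrink as the lag decreases, whereas for the linear path it scales with the lag. This is the dichotomy the proof of Lemma 4.4.3 exploits.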

Now, we are ready to prove the continuity of canonical processes of cluster points of $\{\{\widetilde M_i(t)\}_{t \in [0,T]}$ under $P_m\}_{m \to 0}$.

Lemma 4.4.3. Any cluster point of $\{\{\widetilde M_i(t)\}_{t \in [0,T]}$ under $P_m\}_{m \to 0}$ in $\wp(D([0, T]; \mathbb{R}^d))$ must have continuous canonical processes.

Proof. Suppose there exists a sequence $m_k \to 0$ (as $k \to \infty$) such that $P_{m_k} \circ (\widetilde M_i)^{-1}$ (which we write as $Q_k$ for the sake of simplicity) converges to some $Q_\infty \in \wp(D([0, T]; \mathbb{R}^d))$ as $k \to \infty$. We show that the canonical process under $Q_\infty$ is continuous with probability 1. Suppose not. Then there exists a constant $\varepsilon > 0$ such that
\[ Q_\infty\Bigl( \bigcap_{\delta > 0} \Bigl\{ \omega \in D([0, T]; \mathbb{R}^d) : \sup_{|t - s| \le \delta} |\omega(t) - \omega(s)| > \varepsilon \Bigr\} \Bigr) = a > 0. \]
Without loss of generality, we assume that $\varepsilon \le 1$. Let $A$ and $B$ be the sets defined in Lemma 4.4.2. Then $Q_\infty(A) = a > 0$, so by Lemma 4.4.2, $Q_\infty(B^o) \ge a > 0$. Also, $B^o$ is an open set, and $Q_k \to Q_\infty$ weakly in $\wp(D([0, T]; \mathbb{R}^d))$, so we have $\liminf_{k \to \infty} Q_k(B^o) \ge Q_\infty(B^o)$. Therefore, there exists an $N \in \mathbb{N}$ such that for any $k \ge N$, $Q_k(B^o) \ge \frac{a}{2}$, hence $Q_k(B) \ge \frac{a}{2}$, which means that $P_{m_k}(\widetilde M_i$ has a jump greater than $\varepsilon/2) \ge \frac{a}{2}$. Since $m_k \to 0$ as $k \to \infty$, this yields a contradiction with the fact that all of the jumps of $\widetilde M_i$ under $P_{m_k}$ are smaller than $4m_k^{1/2}\tau \|\nabla U_i\|_\infty$. This completes the proof of our assertion.

We next use Lemma 4.4.3 to show the following, which will be used later.

Lemma 4.4.4. For any $\varepsilon > 0$, we have that
\[ \limsup_{\delta \to 0}\, \limsup_{m \to 0} P_m\Bigl( \sup_{0 \le s \le t \le T,\, |s - t| \le \delta} |\widetilde M_i(t) - \widetilde M_i(s)| > \varepsilon \Bigr) = 0. \tag{4.16} \]

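Proof. Let $a(m, \delta) = P_m(\sup_{0 \le s \le t \le T,\, |s - t| \le \delta} |\widetilde M_i(t) - \widetilde M_i(s)| > \varepsilon)$. If
\[ \limsup_{\delta \to 0}\, \limsup_{m \to 0} a(m, \delta) > 0, \]
then there exist a constant $a > 0$ and sequences $\delta_k \to 0$, $m_k \to 0$ (as $k \to \infty$) such that
\[ P_{m_k}\Bigl( \sup_{0 \le s \le t \le T,\, |s - t| \le \delta_k} |\widetilde M_i(t) - \widetilde M_i(s)| > \varepsilon \Bigr) \ge a \tag{4.17} \]
for any $k \in \mathbb{N}$. As before, let $Q_k = P_{m_k} \circ (\widetilde M_i)^{-1}$, $k \in \mathbb{N}$. Also, let
\[ A_k = \Bigl\{ \omega \in D([0, T]; \mathbb{R}^d) : \sup_{0 \le s \le t \le T,\, |t - s| \le \delta_k} |\omega(t) - \omega(s)| > \varepsilon \Bigr\}, \qquad B_k = \Bigl\{ \omega \in D([0, T]; \mathbb{R}^d) : \sup_{0 \le s \le t \le T,\, |t - s| \le e\delta_k} |\omega(t) - \omega(s)| > \frac{\varepsilon}{2} \Bigr\}. \]
Then $Q_k(A_k) > a$ by assumption, and by the same argument as in the proof of Lemma 4.4.2, we get that $A_k \subset \bar A_k \subset B_k^o \subset B_k$ for any $k \in \mathbb{N}$. Also, $A_k$ is monotone decreasing with respect to $k$, hence for any $\ell \ge k$, we have that $Q_\ell(A_k) \ge Q_\ell(A_\ell) > a$. Therefore, since $\bar A_k$ is a closed set, we get that
\[ Q_\infty(B_k) \ge Q_\infty(\bar A_k) \ge \limsup_{\ell \to \infty} Q_\ell(\bar A_k) \ge a. \]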


This is true for any $k \in \mathbb{N}$, so since $B_k$ is monotone decreasing with respect to $k$, we get that
\[ Q_\infty\Bigl( \bigcap_{k=1}^\infty B_k \Bigr) \ge a, \]
which means that $Q_\infty(\{$canonical process has jump $\ge \varepsilon/2\}) \ge a$, which contradicts Lemma 4.4.3. This completes the proof of our assertion.

Before dealing with $\widetilde\eta_i(t)$, we prepare one more result about $\widetilde M_i(t)$, for later use.

Lemma 4.4.5. There exists a constant $C > 0$ (not depending on $m$) such that
\[ \sup_{m \in (0,1]} E^{P_m}\Bigl[ \sup_{t \in [0,T]} |\widetilde M_i(t)|^4 \Bigr] \le C. \]

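Proof. By the general fact of Poisson point process that $E[|\int f\, d\bar N|^4] \le E[3(\int f^2 d\lambda)^2 + \int f^4 d\lambda]$, we get with the help of Doob's inequality that
\[ E^{P_m}\Bigl[ \sup_{t \in [0,T]} |\widetilde M_i(t)|^4 \Bigr] \le (4/3)^4 E^{P_m}[|\widetilde M_i(T)|^4] \]
\[ = (4/3)^4 E\Bigl[ 3 \Bigl( \int_{(0,T] \times E} \lambda(dr, dx, dv) \Bigl| \int_0^{4m^{1/2}\tau} ds\, \nabla U_i(X_i(r \wedge \sigma) - \psi^0(m^{-1/2} s - 2\tau, x, v; \vec X(r \wedge \sigma))) \Bigr|^2 \Bigr)^2 + \int_{(0,T] \times E} \lambda(dr, dx, dv) \Bigl| \int_0^{4m^{1/2}\tau} ds\, \nabla U_i(X_i(r \wedge \sigma) - \psi^0(m^{-1/2} s - 2\tau, x, v; \vec X(r \wedge \sigma))) \Bigr|^4 \Bigr] \]
\[ \le (4/3)^4 \Bigl[ 3 \Bigl( \int_{(0,T] \times E} m^{-1} \rho(\tfrac12 |v|^2)\, dr\, \nu(dx, dv) \times (4m^{1/2}\tau \|\nabla U_i\|_\infty 1_{[0,R_0+1)}(|x|))^2 \Bigr)^2 + \int_{(0,T] \times E} m^{-1} \rho(\tfrac12 |v|^2)\, dr\, \nu(dx, dv)\, (4m^{1/2}\tau \|\nabla U_i\|_\infty 1_{[0,R_0+1)}(|x|))^4 \Bigr] \]
\[ \le (4/3)^4 \Bigl[ 3 (4\tau \|\nabla U_i\|_\infty)^4 \Bigl( T (2(R_0 + 1))^{d-1} \int_{\mathbb{R}^d} \rho(\tfrac12 |v|^2) |v|\, dv \Bigr)^2 + (4\tau \|\nabla U_i\|_\infty)^4\, m T (2(R_0 + 1))^{d-1} \int_{\mathbb{R}^d} \rho(\tfrac12 |v|^2) |v|\, dv \Bigr]. \]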

The right-hand side above is dominated by a finite global constant for $m \in (0, 1]$.

We next deal with $\widetilde\eta_i(t)$. First, we use some basic properties of Poisson point process to show that there exists a constant $C$ such that
\[ E^{P_m}[|\widetilde\eta_i(t)|^6] \le C m^{3/2}, \qquad t \in [0, T],\ m \in (0, 1]. \tag{4.18} \]
In fact, notice that $\widetilde\eta_i(t)$ can be expressed as
\[ \widetilde\eta_i(t) = -\int_{[(t - 4m^{1/2}\tau) \vee 0,\, t] \times E} \bar N(dr, dx, dv) \int_{(t - r) \wedge (4m^{1/2}\tau)}^{4m^{1/2}\tau} ds \times \nabla U_i(X_i(r \wedge \sigma) - \psi^0(m^{-1/2} s - 2\tau, x, v; \vec X(r \wedge \sigma))). \]
Also, in general, if $Z$ is a Poisson random variable with mean $a$, then we have $E[Z - a] = 0$, $E[(Z - a)^2] = E[(Z - a)^3] = a$, $E[(Z - a)^4] = 3a^2 + a$, and $E[(Z - a)^6] = 15a^3 + 25a^2 + a$. Therefore, by definition of Poisson point process and a simple calculation, there exists a global constant $C$ such that
\[ E\Bigl[ \Bigl| \int f\, d\bar N \Bigr|^6 \Bigr] \le C E\Bigl[ \Bigl( \int f^2 d\lambda \Bigr)^3 + \Bigl( \int f^3 d\lambda \Bigr)^2 + \int f^2 d\lambda \int f^4 d\lambda + \int f^6 d\lambda \Bigr], \]
for any measurable function $f$. We use this to prove (4.18).

Let $A = |\int_{(t - r) \wedge (4m^{1/2}\tau)}^{4m^{1/2}\tau} \nabla U_i(X_i(r \wedge \sigma) - \psi^0(m^{-1/2} s - 2\tau, x, v; \vec X(r \wedge \sigma)))\, ds|$. Then since $t - r \ge 0$, we get that $A \le 4m^{1/2}\tau \|\nabla U_i\|_\infty$. Therefore, with every integral below taken over $[(t - 4m^{1/2}\tau) \vee 0,\, t] \times E$,
\[ E^{P_m}[|\widetilde\eta_i(t)|^6] \le C E\Bigl[ \Bigl( \int A^2 m^{-1} \rho(\tfrac12 |v|^2)\, dr\, \nu(dx, dv) \Bigr)^3 + \Bigl( \int A^3 m^{-1} \rho(\tfrac12 |v|^2)\, dr\, \nu(dx, dv) \Bigr)^2 \]
\[ \qquad + \int A^2 m^{-1} \rho(\tfrac12 |v|^2)\, dr\, \nu(dx, dv) \times \int A^4 m^{-1} \rho(\tfrac12 |v|^2)\, dr\, \nu(dx, dv) + \int A^6 m^{-1} \rho(\tfrac12 |v|^2)\, dr\, \nu(dx, dv) \Bigr] \]
\[ \le C \Bigl[ \Bigl( 4m^{1/2}\tau (4m^{1/2}\tau \|\nabla U_i\|_\infty)^2 m^{-1} (2(R_0 + 1))^d \int_{\mathbb{R}^d} \rho(\tfrac12 |v|^2) |v|\, dv \Bigr)^3 + \Bigl( 4m^{1/2}\tau (4m^{1/2}\tau \|\nabla U_i\|_\infty)^3 m^{-1} (2(R_0 + 1))^d \int_{\mathbb{R}^d} \rho(\tfrac12 |v|^2) |v|\, dv \Bigr)^2 \]
\[ \qquad + 4m^{1/2}\tau (4m^{1/2}\tau \|\nabla U_i\|_\infty)^2 m^{-1} (2(R_0 + 1))^d \int_{\mathbb{R}^d} \rho(\tfrac12 |v|^2) |v|\, dv \times 4m^{1/2}\tau (4m^{1/2}\tau \|\nabla U_i\|_\infty)^4 m^{-1} (2(R_0 + 1))^d \int_{\mathbb{R}^d} \rho(\tfrac12 |v|^2) |v|\, dv \]
\[ \qquad + 4m^{1/2}\tau (4m^{1/2}\tau \|\nabla U_i\|_\infty)^6 m^{-1} (2(R_0 + 1))^d \int_{\mathbb{R}^d} \rho(\tfrac12 |v|^2) |v|\, dv \Bigr], \]
which gives us our assertion.

We use (4.18) to show the following, with the help of (4.12) (the estimate for the derivative of $\widetilde{\widetilde V}{}_i^{03}$), Lemma 4.4.4 (the "continuity" of the limit of $\widetilde M_i(t)$), and Lemma 4.4.5 (the estimate with respect to $|\widetilde M_i(t)|^4$).

Lemma 4.4.6.

\[ \lim_{m \to 0} E^{P_m}\Bigl[ \sup_{0 \le t \le T} |\widetilde\eta_i(t)|^2 \Bigr] = 0. \]
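The estimate (4.18) used for this lemma rests on the central moments of a Poisson variable $Z$ with mean $a$ quoted above: $E[(Z-a)^2] = E[(Z-a)^3] = a$, $E[(Z-a)^4] = 3a^2 + a$ and $E[(Z-a)^6] = 15a^3 + 25a^2 + a$. A minimal numerical sanity check of these identities follows. The helper name `poisson_central_moment`, the truncation level `terms`, and the sample mean `a = 2.5` are assumptions chosen for the illustration.

```python
import math


def poisson_central_moment(a, k, terms=200):
    """Approximate E[(Z - a)^k] for Z ~ Poisson(a) by truncating the series.

    The probability mass function is updated recursively,
    p(n + 1) = p(n) * a / (n + 1), starting from p(0) = exp(-a).
    """
    pmf = math.exp(-a)
    total = 0.0
    for n in range(terms):
        total += (n - a) ** k * pmf
        pmf *= a / (n + 1)
    return total


if __name__ == "__main__":
    a = 2.5  # illustrative mean
    print("2nd:", poisson_central_moment(a, 2), "vs", a)
    print("3rd:", poisson_central_moment(a, 3), "vs", a)
    print("4th:", poisson_central_moment(a, 4), "vs", 3 * a**2 + a)
    print("6th:", poisson_central_moment(a, 6), "vs", 15 * a**3 + 25 * a**2 + a)
```

When $a$ is small, the lowest power of $a$ dominates each moment, which is why the sixth-moment bound for $\int f\, d\bar N$ contains a term linear in $\int f^6 d\lambda$.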

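Proof. By (4.18),
\[ E^{P_m}\Bigl[ \sum_{k=0}^{[m^{-4/3} T]} |\widetilde\eta_i(k m^{4/3})|^6 \Bigr] \le C m^{3/2} m^{-4/3} T \to 0, \qquad \text{as } m \to 0. \]
In particular we have
\[ E^{P_m}\Bigl[ \max_{0 \le k \le [m^{-4/3} T]} |\widetilde\eta_i(k m^{4/3})|^6 \Bigr] \to 0, \qquad \text{as } m \to 0. \tag{4.19} \]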
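Several estimates in this section close by multiplying powers of $m$ and observing that the net exponent is positive, for example $m^{3/2} \cdot m^{-4/3} = m^{1/6}$ just above, or $(m^{1/2})^2 \cdot m^{-1/2} = m^{1/2}$ at the end of the proof of Lemma 4.3.7. The short sketch below does this bookkeeping with exact rational arithmetic. The helper name `total_exponent` is an assumption introduced for the illustration.

```python
from fractions import Fraction


def total_exponent(*exponents):
    """Exact exponent of m in a product m^{e_1} * m^{e_2} * ... (sum of the e_j)."""
    return sum((Fraction(e) for e in exponents), Fraction(0))


if __name__ == "__main__":
    # m^{3/2} * m^{-4/3}: grid sum of sixth moments leading to (4.19)
    print(total_exponent("3/2", "-4/3"))
    # (m^{1/2})^2 * m^{-1/2}: final bound in the proof of Lemma 4.3.7
    print(total_exponent(1, "-1/2"))
```

A positive total exponent means the corresponding bound tends to 0 as $m \to 0$.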

Since $\widetilde\eta_i(t)$ is a càdlàg process, there exists a measurable $\xi_m : \Omega \to [0, T]$ such that
\[ |\widetilde\eta_i(\xi_m)| \vee |\widetilde\eta_i(\xi_m -)| = \sup_{0 \le t \le T} |\widetilde\eta_i(t)|. \tag{4.20} \]


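Also, the jumps of $\widetilde\eta_i$ satisfy $|\Delta \widetilde\eta_i| \le 4m^{1/2}\tau \|\nabla U_i\|_\infty$, so $|\widetilde\eta_i(\xi_m -)| \le |\widetilde\eta_i(\xi_m)| + 4m^{1/2}\tau \|\nabla U_i\|_\infty$. Let $\widetilde\xi_m = m^{4/3} [m^{-4/3} \xi_m]$. Then $0 \le \xi_m - \widetilde\xi_m \le m^{4/3}$. Combining the above, we get that
\[ E^{P_m}\Bigl[ \sup_{0 \le t \le T} |\widetilde\eta_i(t)|^2 \Bigr] = E^{P_m}[|\widetilde\eta_i(\xi_m)|^2 \vee |\widetilde\eta_i(\xi_m -)|^2] \le 2(4m^{1/2}\tau \|\nabla U_i\|_\infty)^2 + 2 E^{P_m}[|\widetilde\eta_i(\xi_m)|^2] \]
\[ \le 2(4m^{1/2}\tau \|\nabla U_i\|_\infty)^2 + 4 E^{P_m}\Bigl[ \max_{0 \le k \le [m^{-4/3} T]} |\widetilde\eta_i(k m^{4/3})|^2 \Bigr] + 4 E^{P_m}[|\widetilde\eta_i(\xi_m) - \widetilde\eta_i(\widetilde\xi_m)|^2]. \]
The first term on the right-hand side above evidently converges to 0 as $m \to 0$. By (4.19), the second term also converges to 0 as $m \to 0$. So in order to show that $E^{P_m}[\sup_{0 \le t \le T} |\widetilde\eta_i(t)|^2] \to 0$, it suffices to prove that the third term $E^{P_m}[|\widetilde\eta_i(\xi_m) - \widetilde\eta_i(\widetilde\xi_m)|^2]$ converges to 0. We show it in the following. Notice that
\[ E^{P_m}[|\widetilde\eta_i(\xi_m) - \widetilde\eta_i(\widetilde\xi_m)|^2] \le 2 E^{P_m}[|\widetilde{\widetilde V}{}_i^{03}(\xi_m) - \widetilde{\widetilde V}{}_i^{03}(\widetilde\xi_m)|^2] + 2 E^{P_m}[|\widetilde M_i(\xi_m) - \widetilde M_i(\widetilde\xi_m)|^2]. \]
Since $0 \le \xi_m - \widetilde\xi_m \le m^{4/3}$, we get by (4.12) that
\[ E^{P_m}[|\widetilde{\widetilde V}{}_i^{03}(\xi_m) - \widetilde{\widetilde V}{}_i^{03}(\widetilde\xi_m)|^2] \le E^{P_m}\Bigl[ \Bigl( \int_0^T 1_{[\widetilde\xi_m, \xi_m]}(t) \Bigl| \frac{d}{dt} \widetilde{\widetilde V}{}_i^{03}(t) \Bigr| dt \Bigr)^2 \Bigr] \le E^{P_m}\Bigl[ \int_0^T 1_{[\widetilde\xi_m, \xi_m]}(t)\, dt \cdot \int_0^T \Bigl| \frac{d}{dt} \widetilde{\widetilde V}{}_i^{03}(t) \Bigr|^2 dt \Bigr] \]
\[ \le m^{4/3} \int_0^T E^{P_m}\Bigl[ \Bigl| \frac{d}{dt} \widetilde{\widetilde V}{}_i^{03}(t) \Bigr|^2 \Bigr] dt \le m^{4/3}\, T C_1 m^{-1/2} \to 0, \qquad \text{as } m \to 0. \]
For the term $E^{P_m}[|\widetilde M_i(\xi_m) - \widetilde M_i(\widetilde\xi_m)|^2]$, we first notice that since $0 \le \xi_m - \widetilde\xi_m \le m^{4/3}$ by definition, (4.16) gives us that
\[ \lim_{m \to 0} P_m(|\widetilde M_i(\xi_m) - \widetilde M_i(\widetilde\xi_m)| > \varepsilon) = 0. \tag{4.21} \]
This is true for any $\varepsilon > 0$. Also, we have by Lemma 4.4.5 that for any $\varepsilon > 0$,
\[ E^{P_m}[|\widetilde M_i(\xi_m) - \widetilde M_i(\widetilde\xi_m)|^2] \le E^{P_m}[|\widetilde M_i(\xi_m) - \widetilde M_i(\widetilde\xi_m)|^2,\ |\widetilde M_i(\xi_m) - \widetilde M_i(\widetilde\xi_m)| > \varepsilon] + \varepsilon^2 \]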

\[ \le E^{P_m}[|\widetilde M_i(\xi_m) - \widetilde M_i(\widetilde\xi_m)|^4]^{1/2}\, P(|\widetilde M_i(\xi_m) - \widetilde M_i(\widetilde\xi_m)| > \varepsilon)^{1/2} + \varepsilon^2 \]
\[ \le 4 E^{P_m}\Bigl[ \sup_{t \in [0,T]} |\widetilde M_i(t)|^4 \Bigr]^{1/2} P(|\widetilde M_i(\xi_m) - \widetilde M_i(\widetilde\xi_m)| > \varepsilon)^{1/2} + \varepsilon^2 \le 4 C^{1/2}\, P(|\widetilde M_i(\xi_m) - \widetilde M_i(\widetilde\xi_m)| > \varepsilon)^{1/2} + \varepsilon^2. \]
This combined with (4.21) gives us that
\[ \lim_{m \to 0} E^{P_m}[|\widetilde M_i(\xi_m) - \widetilde M_i(\widetilde\xi_m)|^2] = 0, \]
and completes the proof of the fact that
\[ \lim_{m \to 0} E^{P_m}[|\widetilde\eta_i(\xi_m) - \widetilde\eta_i(\widetilde\xi_m)|^2] = 0, \]
which in turn completes the proof of our assertion.

Combining all of the results in Secs. 4.1–4.3, we get Lemma 3.5.1, with
\[ M_i(t) = -\widetilde M_i(t \wedge \sigma), \qquad P_i^{*1}(t) = -V_i^{02}(t) - V_i^{05}(t), \qquad \eta_i(t) = -V_i^{1}(t) + V_i^{04}(t) - \widetilde\eta_i(t \wedge \sigma). \tag{4.22} \]

Before closing this subsection, we state the following result with respect to the quadratic variation of the martingale $M_i(\cdot)$. The proof is easy and we omit it. For $i = 1, \dots, N$ and $k = 1, \dots, d$, let
\[ A_{ik}(r) = A_{ik}(r, x, v) = \int_{-2\tau}^{2\tau} \nabla_k U_i(X_i(r) - \psi^0(u, x, v; \vec X(r)))\, du. \]
Then we have:

Lemma 4.4.7. For any $l_1, l_2 = 1, \dots, N$ and $k_1, k_2 = 1, \dots, d$, the following equality holds:
\[ [M_{l_1}^{k_1}, M_{l_2}^{k_2}]_s = m \int_{[0, s \wedge \sigma] \times E} A_{l_1 k_1}(r, x, v)\, A_{l_2 k_2}(r, x, v)\, N(dr, dx, dv). \]

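4.5. Proof of Lemma 3.5.2

In this subsection, we present the proof of Lemma 3.5.2.

The first assertion is just an easy consequence of Lemma 3.5.1 and the formula of integration by parts. Indeed, for any $t \ge 0$, we have by assumption and the formula of integration by parts that
\[ \int_0^{t \wedge \widetilde\sigma_D} m^{-1/2} |\nabla_i \widetilde U(\vec X(s))|\, ds = \int_0^{t \wedge \widetilde\sigma_D} g(\vec X(s)) \cdot (m^{-1/2} \nabla_i \widetilde U(\vec X(s)))\, ds \]
\[ = g(\vec X(t \wedge \widetilde\sigma_D)) \int_0^{t \wedge \widetilde\sigma_D} m^{-1/2} \nabla_i \widetilde U(\vec X(s))\, ds - \int_0^{t \wedge \widetilde\sigma_D} ds\, (\nabla g(\vec X(s)) \vec V(s)) \int_0^s m^{-1/2} \nabla_i \widetilde U(\vec X(r))\, dr. \]
Therefore, by Lemma 3.5.1(1), we get
\[ \int_0^{T \wedge \sigma \wedge \widetilde\sigma_D} m^{-1/2} |\nabla_i \widetilde U(\vec X(s))|\, ds \]
\[ = g(\vec X(T \wedge \sigma \wedge \widetilde\sigma_D)) \bigl( -M_i (V_i(T \wedge \sigma \wedge \widetilde\sigma_D) - V_i(0)) + M_i(T \wedge \widetilde\sigma_D) + \eta_i(T \wedge \widetilde\sigma_D) + P_i^{*1}(T \wedge \widetilde\sigma_D) \bigr) \]
\[ \quad - \int_0^{T \wedge \sigma \wedge \widetilde\sigma_D} (\nabla g(\vec X(t)) \vec V(t)) \times \bigl\{ -M_i (V_i(t \wedge \sigma \wedge \widetilde\sigma_D) - V_i(0)) + M_i(T \wedge \widetilde\sigma_D) + \eta_i(T \wedge \widetilde\sigma_D) + P_i^{*1}(t \wedge \widetilde\sigma_D) \bigr\}\, dt \]
\[ \le (\|g\|_\infty + \|\nabla g\|_\infty \cdot N n T) \Bigl( 2 M_i n + \sup_{0 \le t \le T} |M_i(t) + \eta_i(t)| + \sup_{0 \le t \le T} |P_i^{*1}(t)| \Bigr). \]
Therefore, we get our first assertion by Lemmas 3.5.1(2), 3.5.1(4) and (4.15).

Before giving the proof of the second assertion, let us make some preparation. With the help of Lemma 4.4.7, we have the following.

Lemma 4.5.1.
\[ \lim_{m \to 0} E^{P_m}\Bigl[ \sup_{t \in [0,T]} \Bigl| \int_0^t \eta_i(s)\, dM_i(s) \Bigr|^2 \Bigr] = 0. \]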

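Proof. Since $M_i(\cdot)$ is a martingale, Lemma 3.5.1(4) implies that $\int_0^t \eta_i(s)\, dM_i(s)$ is also a martingale. Therefore, with the help of Lemma 4.4.7 and Doob's inequality, we get that
\[ E^{P_m}\Bigl[ \sup_{t \in [0,T]} \Bigl| \int_0^t \eta_i(s)\, dM_i(s) \Bigr|^2 \Bigr] \le 4 E^{P_m}\Bigl[ \Bigl| \int_0^T \eta_i(s)\, dM_i(s) \Bigr|^2 \Bigr] = 2 \sum_{k,\ell=1}^d E^{P_m}\Bigl[ \int_0^T \eta_i^k(s)\, \eta_i^\ell(s)\, d[M_i^k, M_i^\ell]_s \Bigr] \]
\[ = 2m \sum_{k,\ell=1}^d E^{P_m}\Bigl[ \int_{[0, T \wedge \sigma] \times E} \eta_i^k(s)\, \eta_i^\ell(s)\, A_{ik}(s, x, v)\, A_{i\ell}(s, x, v)\, N(ds, dx, dv) \Bigr] \]
\[ \le 2 (4\tau \|\nabla U_i\|_\infty)^2 \int_{[0,T] \times E} E^{P_m}[|\eta_i(s)|^2]\, 1_{[0,R_0+1)}(|x|)\, \rho(\tfrac12 |v|^2)\, \nu(dx, dv)\, ds \]
\[ \le 2 (4\tau \|\nabla U_i\|_\infty)^2\, T (2(R_0 + 1))^{d-1} \int_{\mathbb{R}^d} \rho(\tfrac12 |v|^2) |v|\, dv\, \sup_{s \in [0,T]} E^{P_m}[|\eta_i(s)|^2]. \]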

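This combined with Lemma 3.5.1(4) completes the proof of our assertion.

We next show the second assertion of Lemma 3.5.2. The basic idea is to add an extra term $\sum_{i=1}^N \frac{1}{M_i} \int_0^{t \wedge \sigma} \eta_i(s)\, dM_i(s)$ first, use the decomposition and the estimates of Lemma 3.5.1 to show that the resulting quantity is tight, and finally delete the added term by Lemma 4.5.1.

First, by Lemma 3.5.1, we have
\[ m^{-1/2} (\widetilde U(\vec X(t \wedge \sigma)) - \widetilde U(\vec X(0))) + \sum_{i=1}^N \frac{M_i}{2} |V_i(t \wedge \sigma)|^2 + \sum_{i=1}^N \frac{1}{M_i} \int_0^{t \wedge \sigma} \eta_i(s)\, dM_i(s) \]
\[ = \sum_{i=1}^N \frac{M_i}{2} |V_i(0)|^2 + \sum_{i=1}^N \Bigl\{ \int_0^{t \wedge \sigma} m^{-1/2} \nabla_i \widetilde U(\vec X(s)) \cdot V_i(s)\, ds + M_i \int_0^{t \wedge \sigma} V_i(s) \cdot \frac{d}{ds} V_i(s)\, ds + \frac{1}{M_i} \int_0^{t \wedge \sigma} \eta_i(s)\, dM_i(s) \Bigr\} \]
\[ = \sum_{i=1}^N \frac{M_i}{2} |V_i(0)|^2 + \sum_{i=1}^N \Bigl\{ \int_0^{t \wedge \sigma} V_i(s)\, d\eta_i(s) + \int_0^{t \wedge \sigma} V_i(s) \frac{d}{ds} P_i^{*1}(s)\, ds + \int_0^{t \wedge \sigma} V_i(s)\, dM_i(s) + \frac{1}{M_i} \int_0^{t \wedge \sigma} \eta_i(s)\, dM_i(s) \Bigr\}. \]
Since $|V_i(t \wedge \sigma)| \le n$ by the definition of $\sigma$, we have by Lemma 3.5.1(2) that
\[ \sup_{m \in (0,1]} \sup_{0 \le t \le T} E^{P_m}\Bigl[ \Bigl| V_i(t \wedge \sigma) \frac{d}{dt} P_i^{*1}(t) \Bigr|^2 \Bigr] < \infty. \]
Therefore, by Theorem 3.4.1, we get that $\int_0^{t \wedge \sigma} V_i(s) \frac{d}{ds} P_i^{*1}(s)\, ds$ under $P_m$ is tight for $m \in (0, 1]$.

For the term $\int_0^t 1_{[0,\sigma]}(s) V_i(s)\, dM_i(s)$, we recall that $\sigma = \inf\{t > 0;\ \max_{i=1,\dots,N} |V_i(t)| = n\}$, so $\sigma$ is an $\mathcal F_t$-stopping time. Therefore, since $\{M_i(s)\}_s$ is a martingale,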

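we get that
\[ N_i(t) := \int_0^t 1_{[0,\sigma]}(s)\, V_i(s)\, dM_i(s) \]
is also an $\mathcal F_t$-martingale. Notice that $E^{P_m}[|N_i(t) - N_i(s)|^2 \mid \mathcal F_s] \le n^2 d^2 E^{P_m}[|M_i(t) - M_i(s)|^2 \mid \mathcal F_s]$. So by Lemma 3.5.1(3), we get that
\[ E^{P_m}[|N_i(t) - N_i(s)|^2 \mid \mathcal F_s] \le n^2 d^2 C |t - s|, \qquad |\Delta N(t)| \le d n C m^{1/2}. \]
Therefore, similarly as in the proof of Lemmas 4.4.1 and 4.4.3, we get that $\{N_i(t)\}_t$ under $P_m$ is tight for $m \to 0$, and the canonical process under any of its cluster points is continuous with probability 1.

Finally, we show that $\int_0^{t \wedge \sigma} V_i(s)\, d\eta_i(s) + \frac{1}{M_i} \int_0^{t \wedge \sigma} \eta_i(s)\, dM_i(s)$ is negligible. Notice that by Lemma 3.5.1(3),
\[ \int_0^{t \wedge \sigma} V_i(s)\, d\eta_i(s) + \frac{1}{M_i} \int_0^{t \wedge \sigma} \eta_i(s)\, dM_i(s) = V_i(t \wedge \sigma) \eta_i(t) - \int_0^{t \wedge \sigma} \eta_i(s)\, dV_i(s) + \frac{1}{M_i} \int_0^{t \wedge \sigma} \eta_i(s)\, dM_i(s) \]
\[ = V_i(t \wedge \sigma) \eta_i(t) - \frac{1}{M_i} \int_0^{t \wedge \sigma} \eta_i(s) \frac{d}{ds} P_i^{*1}(s)\, ds - \frac{1}{M_i} \int_0^{t \wedge \sigma} \eta_i(s)\, d\eta_i(s) + \frac{1}{M_i} \int_0^{t \wedge \sigma} \eta_i(s)\, m^{-1/2} \nabla_i \widetilde U(\vec X(s))\, ds \]
\[ = V_i(t \wedge \sigma) \eta_i(t) - \frac{1}{2M_i} \eta_i(t)^2 + \frac{1}{2M_i} [\eta_i, \eta_i]_t + \frac{1}{M_i} \int_0^{t \wedge \sigma} \eta_i(s) \Bigl( m^{-1/2} \nabla_i \widetilde U(\vec X(s)) - \frac{d}{ds} P_i^{*1}(s) \Bigr) ds. \]
Since $|V_i(t \wedge \sigma)| \le n$, Lemma 3.5.1(4) gives us that
\[ \lim_{m \to 0} E^{P_m}\Bigl[ \sup_{t \in [0, T \wedge \sigma]} \Bigl| V_i(t \wedge \sigma) \eta_i(t) - \frac{1}{2M_i} \eta_i(t)^2 + \frac{1}{2M_i} [\eta_i, \eta_i]_t \Bigr| \Bigr] = 0. \]
Also, for any $\varepsilon > 0$, we have for any $A > 0$,
\[ P_m\Bigl( \sup_{t \in [0, T \wedge \sigma]} \Bigl| \int_0^{t \wedge \sigma} \eta_i(s) \Bigl( m^{-1/2} \nabla_i \widetilde U(\vec X(s)) - \frac{d}{ds} P_i^{*1}(s) \Bigr) ds \Bigr| > \varepsilon \Bigr) \]
\[ \le P_m\Bigl( \sup_{s \in [0, T \wedge \sigma]} |\eta_i(s)| > A \Bigr) + P_m\Bigl( \sup_{s \in [0, T \wedge \sigma]} \int_0^{t \wedge \sigma} \Bigl( |m^{-1/2} \nabla_i \widetilde U(\vec X(s))| + \Bigl| \frac{d}{ds} P_i^{*1}(s) \Bigr| \Bigr) ds > \frac{\varepsilon}{A} \Bigr) \]
\[ \le \frac{1}{A} E^{P_m}\Bigl[ \sup_{s \in [0, T \wedge \sigma]} |\eta_i(s)| \Bigr] + \frac{A}{\varepsilon} E^{P_m}\Bigl[ \int_0^{T \wedge \sigma} \Bigl( |m^{-1/2} \nabla_i \widetilde U(\vec X(s))| + \Bigl| \frac{d}{ds} P_i^{*1}(s) \Bigr| \Bigr) ds \Bigr]. \]
Combining this with Lemmas 3.5.1(2), 3.5.1(4) and 3.5.2(1), by taking first $A > 0$ small enough and then $m > 0$ small enough, we get that
\[ \lim_{m \to 0} P_m\Bigl( \sup_{t \in [0, T \wedge \sigma \wedge \widetilde\sigma_D]} \Bigl| \int_0^{t \wedge \sigma} \eta_i(s) \Bigl( m^{-1/2} \nabla_i \widetilde U(\vec X(s)) - \frac{d}{ds} P_i^{*1}(s) \Bigr) ds \Bigr| > \varepsilon \Bigr) = 0 \]
for any $\varepsilon > 0$. This completes the proof of the fact that
\[ m^{-1/2} (\widetilde U(\vec X(t \wedge \sigma \wedge \widetilde\sigma_D)) - \widetilde U(\vec X(0))) + \sum_{i=1}^N \frac{M_i}{2} |V_i(t \wedge \sigma \wedge \widetilde\sigma_D)|^2 + \sum_{i=1}^N \frac{1}{M_i} \int_0^{t \wedge \sigma \wedge \widetilde\sigma_D} \eta_i(s)\, dM_i(s) \]
under $P_m$ is tight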

as $m \to 0$, and the canonical process under any of its cluster points is continuous with probability 1. This combined with Lemma 4.5.1 gives us our second assertion of Lemma 3.5.2.

5. Convergence until "Near"

As mentioned at the end of Sec. 3.4, weak convergence of the distribution of a process with $t \in [0, T]$ for any $T > 0$ implies the weak convergence of the distribution of the process with $t \in [0, \infty)$. So in order to prove Theorems 2.0.1(2)–2.0.1(4), it suffices to prove the assertions for $t \in [0, T]$ for any $T > 0$. Fix a $T > 0$ from now on.

By Lemma 3.5.1, we have that $\{\{M_i V_i(t \wedge \sigma_n) + m^{-1/2} \int_0^{t \wedge \sigma_n} \nabla_i \widetilde U(\vec X_s)\, ds\}_{t \in [0,T]}$ under $P_m\}$ is tight in $\wp(D([0, T]; \mathbb{R}^d))$ as $m \to 0$, and the canonical process under any of its cluster points is continuous with probability 1.

Let $\sigma_0(\omega) = \inf\{t > 0;\ \min_{i \ne j} \{|X_i(t) - X_j(t)| - (R_i + R_j)\} \le 0\}$. Then by (3.31), $\nabla_i \widetilde U(\vec X_s) = 0$ for any $s \le \sigma_0$. Therefore, there exists (at least) one sequence $m_k \to 0$ (as $k \to \infty$) such that {distribution of $\{(\vec X(t \wedge \sigma_n \wedge \sigma_0), \vec V(t \wedge \sigma_n \wedge \sigma_0))\}_{t \in [0,T]}$ under $P_{m_k}$} converges in $\wp(D([0, T]; \mathbb{R}^d))$.

In this section, we give the proof of the fact that any cluster point gotten above is the stopped diffusion process with generator $L$ as given in Sec. 2, by proving that it is the solution of the martingale problem $L$. This certainly implies Theorem 2.0.1(2) and 2.0.1(3).

For the sake of simplicity, in this section, we let $\sigma = \sigma_n \wedge \sigma_0$. We use the same notations as in Sec. 4. Also, we use the notation $D_0 = (\operatorname{supp} \widetilde U)^C \subset \mathbb{R}^{dN}$.

5.1. Decomposition

As claimed, we show from now on that any cluster point of {distribution of $\{(\vec X(t \wedge \sigma), \vec V(t \wedge \sigma))\}_{t \in [0,T]}$ under $P_m$} is a solution of the martingale problem $L$, i.e. for

any $f \in C_0^\infty(D_0 \times \mathbb{R}^{dN})$,
\[ f(\vec X(t \wedge \sigma), \vec V(t \wedge \sigma)) - f(\vec X_0, \vec V_0) - \int_0^{t \wedge \sigma} L f(\vec X_s, \vec V_s)\, ds, \tag{5.1} \]

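after taking the limit $m \to 0$, is a martingale.

First, since we do not have enough information about the term $\eta_i(t)$, we use the following to convert the problem to the one without $\eta_i(t)$. Let
\[ Y_i(t) = V_i(0) + M_i^{-1} \Bigl( M_i(t) + P_i^{*1}(t) - m^{-1/2} \int_0^{t \wedge \sigma_n} \nabla_i \widetilde U(\vec X_s)\, ds \Bigr) = V_i(t) - M_i^{-1} \eta_i(t), \qquad i = 1, \dots, N, \]
and let $\vec Y_t = \vec Y(t) = (Y_1(t), \dots, Y_N(t))$. Then we have the following. (We use the notations $\vec X_t = \vec X(t)$ and $\vec V_t = \vec V(t)$.)

Lemma 5.1.1. For any $f \in C_0^\infty(D_0 \times \mathbb{R}^{dN})$, we have that $\{f(\vec X_{t \wedge \sigma_n}, \vec V_{t \wedge \sigma_n})\}_t$ and $\{f(\vec X_{t \wedge \sigma_n}, \vec Y_{t \wedge \sigma_n})\}_t$ converge or do not converge for $m \to 0$ at the same time, and when they converge, they have the same limit.

Proof. Just notice that if we let $f_V$ denote the partial differential of $f$ with respect to $V$, then $\|f_V\|_\infty < \infty$ and
\[ |f(\vec X_{t \wedge \sigma_n}, \vec V_{t \wedge \sigma_n}) - f(\vec X_{t \wedge \sigma_n}, \vec Y_{t \wedge \sigma_n})| \le \|f_V\|_\infty \max_{i=1,\dots,N} \frac{1}{M_i} \sup_{s \in [0,T]} |\eta_i(s)|, \]
hence
\[ E^{P_m}\Bigl[ \sup_{0 \le t \le T} |f(\vec X_{t \wedge \sigma_n}, \vec V_{t \wedge \sigma_n}) - f(\vec X_{t \wedge \sigma_n}, \vec Y_{t \wedge \sigma_n})| \Bigr] \le \|f_V\|_\infty \max_{i=1,\dots,N} \frac{1}{M_i} E^{P_m}\Bigl[ \sup_{s \in [0,T]} |\eta_i(s)| \Bigr], \]
which, by Lemma 3.5.1(4), converges to 0 as $m \to 0$.

By Lemma 5.1.1, in order to prove that any cluster point of (5.1) is a martingale, it suffices to prove that any cluster point of
\[ f(\vec X_{t \wedge \sigma}, \vec Y_{t \wedge \sigma}) - f(\vec X_0, \vec Y_0) - \int_0^{t \wedge \sigma} L f(\vec X_s, \vec V_s)\, ds \]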

is a martingale. Since f ∈ C0∞ (D0 × RdN ) (notice that all the terms involved except Mi (t) are continuous with respect to t), we have t∧σ (X s )ds = 0, fV (Xs , Ys ) · ∇U 0

August 10, J070-S0129055X10004077

804

2010 15:0 WSPC/S0129-055X

148-RMP

S. Kusuoka & S. Liang

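so we obtain by Ito’s formula and the definition of $Y_i(t)$ that

\[ f(X_{t\wedge\sigma}, Y_{t\wedge\sigma}) - f(X_0, Y_0) = \int_0^{t\wedge\sigma} f_X(X_s, Y_s)\cdot V_s\,ds + \sum_{i=1}^N \frac{1}{M_i}\int_0^{t\wedge\sigma} f_{V_i}(X_s, Y_s)\cdot dM_i(s) + (\mathrm{II}) + (\mathrm{III}) + (\mathrm{IV}), \]

with

\[ (\mathrm{II}) = \sum_{i=1}^N \frac{1}{M_i}\int_0^{t\wedge\sigma} f_V(X_s, Y_s)\cdot dP_i^{*1}(s), \]

\[ (\mathrm{III}) = \sum_{l_1,l_2=1}^N \sum_{k_1,k_2=1}^d \frac{1}{2M_{l_1}M_{l_2}} \int_{0+}^{t\wedge\sigma} f_{V_{l_1}^{k_1}V_{l_2}^{k_2}}(X_s, Y_s)\,d[M_{l_1}^{k_1}, M_{l_2}^{k_2}]_s, \]

\[ (\mathrm{IV}) = \sum_{0<s\le t\wedge\sigma} \Big\{ f(X_s, Y_s) - f(X_s, Y_{s-}) - \sum_{l=1}^N \frac{1}{M_l} f_V(X_s, Y_{s-})\cdot \Delta M_l(s) - \frac{1}{2}\sum_{l_1,l_2=1}^N \frac{1}{M_{l_1}M_{l_2}} f_{V_{l_1}V_{l_2}}(X_s, Y_{s-})(\Delta M_{l_1}(s))(\Delta M_{l_2}(s)) \Big\}. \]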

The term $\sum_{i=1}^N \frac{1}{M_i}\int_0^{t\wedge\sigma} f_{V_i}(X_s, Y_s)\cdot dM_i(s)$ is already a martingale since $\{M_i(t)\}_t$, $i = 1, \dots, N$, are martingales and $f_V$ is bounded, hence it remains a martingale when taking $m \to 0$. So it suffices to show that

\[ \int_0^{t\wedge\sigma} f_X(X_s, Y_s)\cdot V_s\,ds + (\mathrm{II}) + (\mathrm{III}) + (\mathrm{IV}) - \int_0^{t\wedge\sigma} Lf(X_s, V_s)\,ds \to 0. \]

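The fact that the difference between $\int_0^{t\wedge\sigma} f_X(X_s, Y_s)\cdot V_s\,ds$ and $\int_0^{t\wedge\sigma} f_X(X_s, V_s)\cdot V_s\,ds$, its corresponding term in $\int_0^{t\wedge\sigma} Lf(X_s, V_s)\,ds$, converges to 0 is a direct consequence of Lemma 5.1.1. In the following sections, we show the convergences of the other terms. Precisely, we show that when $m \to 0$,

\[ (\mathrm{II}) - \int_0^t \sum_{i,j=1}^N \sum_{k,l=1}^d V_j(s)\, b_{ik,jl}(X_s)\, \frac{\partial}{\partial V_i^k} f(X_s, V_s)\,ds \to 0, \tag{5.2} \]

\[ (\mathrm{III}) - \int_0^t \sum_{i,j=1}^N \sum_{k,l=1}^d a_{ik,jl}(X_s)\, \frac{\partial^2}{\partial V_i^k \partial V_j^l} f(X_s, V_s)\,ds \to 0, \tag{5.3} \]

\[ (\mathrm{IV}) \to 0. \tag{5.4} \]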


5.2. The term (IV)

By using the fact that any jump of $M_i(t)$ is dominated by $Cm^{1/2}$ (Lemma 3.5.1(3)) and the definition of $M_i(t)$, we show with the help of the properties of a Poisson point process that (IV) is negligible. Precisely, we show the following.

Lemma 5.2.1.

\[ \lim_{m\to0} E^{P_m}\Big[ \sup_{0\le t\le T} |(\mathrm{IV})| \Big] = 0. \]

Proof. Since $f \in C_0^\infty(D_0 \times \mathbb{R}^{dN})$, we have that the third partial derivatives $f_{V_{l_1}^{k_1}V_{l_2}^{k_2}V_{l_3}^{k_3}}$, $l_1, l_2, l_3 = 1, \dots, N$, $k_1, k_2, k_3 = 1, \dots, d$, are bounded. Also, by Lemma 3.5.1(3), the jumps of $\{M_i(t)\}$ satisfy $|\Delta M_i(s)| \le Cm^{1/2}$. Therefore, by Taylor’s expansion, with $C_1 = \sum_{l_1,l_2,l_3=1}^N \sum_{k_1,k_2,k_3=1}^d \|f_{V_{l_1}^{k_1}V_{l_2}^{k_2}V_{l_3}^{k_3}}\|_\infty C$, we have

\[ |(\mathrm{IV})| \le \sum_{0<s\le t\wedge\sigma} \sum_{l_1,l_2,l_3=1}^N \sum_{k_1,k_2,k_3=1}^d \|f_{V_{l_1}^{k_1}V_{l_2}^{k_2}V_{l_3}^{k_3}}\|_\infty |\Delta M_{l_1}^{k_1}(s)||\Delta M_{l_2}^{k_2}(s)||\Delta M_{l_3}^{k_3}(s)| \le C_1 m^{1/2} \sum_{0<s\le t\wedge\sigma} \sum_{l=1}^N |\Delta M_l(s)|^2. \]

Therefore, to complete the proof of this lemma, it suffices to show that $E^{P_m}[\sum_{0<s\le T\wedge\sigma} |\Delta M_i(s)|^2]$ is bounded for $m > 0$, which we are now going to show. We have by the definition of $\{M_i(t)\}$ that

\[ M_i(t) = -\int_{(0,t\wedge\sigma]\times E} \bar N(dr, dx, dv) \int_0^{4m^{1/2}\tau} du\, \nabla U_i\big(X_i(r\wedge\sigma) - \psi^0(m^{-1/2}u - 2\tau, x, v; \vec X(r\wedge\sigma))\big), \]

so

\[ \sum_{0<s\le t\wedge\sigma} |\Delta M_i(s)|^2 = \int_{(0,t\wedge\sigma]\times E} N(dr, dx, dv) \Big| \int_0^{4m^{1/2}\tau} du\, \nabla U_i\big(X_i(r\wedge\sigma) - \psi^0(m^{-1/2}u - 2\tau, x, v; \vec X(r\wedge\sigma))\big) \Big|^2. \]

Recall that $N$ is the Poisson point process with intensity $m^{-1}\rho(\tfrac12|v|^2)\,dr\,\nu(dx, dv)$. Therefore, since

\[ \big|\nabla U_i\big(X_i(r\wedge\sigma) - \psi^0(m^{-1/2}u - 2\tau, x, v; \vec X(r\wedge\sigma))\big)\big| \le \|\nabla U_i\|_\infty \mathbf{1}_{[0,R_0+1)}(|x|), \]


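we get that

\[ E^{P_m}\Big[ \sum_{0<s\le T\wedge\sigma} |\Delta M_i(s)|^2 \Big] = E^{P_m}\Big[ \int_{(0,T\wedge\sigma]\times E} N(dr, dx, dv) \Big| \int_0^{4m^{1/2}\tau} du\, \nabla U_i\big(X_i(r\wedge\sigma) - \psi^0(m^{-1/2}u - 2\tau, x, v; \vec X(r\wedge\sigma))\big) \Big|^2 \Big] \]

\[ \le \int_{[0,T]\times E} m^{-1}\rho\big(\tfrac12|v|^2\big) \mathbf{1}_{[0,R_0+1)}(|x|)\,(4m^{1/2}\tau)^2 \|\nabla U_i\|_\infty^2\, dr\,\nu(dx, dv) \le 16\tau^2 \|\nabla U_i\|_\infty^2\, T\,(2(R_0+1))^{d-1} \int_{\mathbb{R}^d} \rho\big(\tfrac12|v|^2\big)|v|\,dv, \]

which is finite by our assumption. This completes the proof of our assertion.

5.3. The term (III)

For the term (III), we show in this subsection that (5.3) holds, i.e. when $m \to 0$, (III) corresponds to the quadratic term of the generator $L$. By Lemma 4.4.7, we have that

\[ \int_{0+}^{t\wedge\sigma} f_{V_{l_1}^{k_1}V_{l_2}^{k_2}}(X_s, Y_s)\,d[M_{l_1}^{k_1}, M_{l_2}^{k_2}]_s = m \int_{0+}^{t\wedge\sigma}\!\int_E f_{V_{l_1}^{k_1}V_{l_2}^{k_2}}(X_s, Y_s)\,A_{l_1k_1}(s, x, v)A_{l_2k_2}(s, x, v)\,N(ds, dx, dv). \]

Let

\[ (\mathrm{III}') = \sum_{l_1,l_2=1}^N \sum_{k_1,k_2=1}^d \frac{1}{2M_{l_1}M_{l_2}} \int_{0+}^{t\wedge\sigma} f_{V_{l_1}^{k_1}V_{l_2}^{k_2}}(X_s, Y_s) \Big( \int_E A_{l_1k_1}(s, x, v)A_{l_2k_2}(s, x, v)\,\rho\big(\tfrac12|v|^2\big)\,\nu(dx, dv) \Big) ds. \]

Then we have the following. The reason is intuitively as follows: when subtracting $(\mathrm{III}')$, we are subtracting the corresponding expectation, so the resulting quantity is its variance, which converges to 0.

Lemma 5.3.1.

\[ \lim_{m\to0} E^{P_m}\Big[ \sup_{0\le t\le T} |(\mathrm{III}) - (\mathrm{III}')| \Big] = 0. \]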


Proof. By definition, $N(ds, dx, dv)$ is the Poisson point process with intensity $\lambda(ds, dx, dv) = m^{-1}\rho(\tfrac12|v|^2)\,ds\,\nu(dx, dv)$. Also, notice that there exists a constant $C > 0$ such that $|A_{lk}(s\wedge\sigma, x, v)| \le C\mathbf{1}_{[0,R_0+1]}(|x|)$. Let $C_1 = 2C^2\|f_{VV}\|_\infty \big(T(2(R_0+1))^{d-1}\int_{\mathbb{R}^d}\rho(\tfrac12|v|^2)|v|\,dv\big)^{1/2}$, which is finite by assumption. Then by Doob’s inequality, for any $l_1, l_2 = 1, \dots, N$ and $k_1, k_2 = 1, \dots, d$, writing $G(s) = m f_{V_{l_1}^{k_1}V_{l_2}^{k_2}}(X_s, Y_s)A_{l_1k_1}(s)A_{l_2k_2}(s)$ for the integrand, we have

\[ E^{P_m}\Big[ \sup_{0\le t\le T} \Big| \int_0^{t\wedge\sigma}\!\int_E G(s)\,(N-\lambda)(ds, dx, dv) \Big| \Big] \le E^{P_m}\Big[ \sup_{0\le t\le T} \Big| \int_0^{t\wedge\sigma}\!\int_E G(s)\,(N-\lambda)(ds, dx, dv) \Big|^2 \Big]^{1/2} \]

\[ \le 2E^{P_m}\Big[ \Big| \int_0^{T\wedge\sigma}\!\int_E G(s)\,(N-\lambda)(ds, dx, dv) \Big|^2 \Big]^{1/2} = 2E^{P_m}\Big[ \int_0^{T\wedge\sigma}\!\int_E G(s)^2\,\lambda(ds, dx, dv) \Big]^{1/2} \]

\[ \le 2C^2\|f_{VV}\|_\infty\, m \Big( \int_0^T\!\int_E \mathbf{1}_{[0,R_0+1]}(|x|)\, m^{-1}\rho\big(\tfrac12|v|^2\big)\,ds\,\nu(dx, dv) \Big)^{1/2} \le C_1 m^{1/2}. \]

This completes the proof of our assertion.

Lemma 5.3.1 combined with Corollary 3.2.3 implies that (5.3) holds, i.e. after taking the limit $m \to 0$, (III) corresponds to the term $\sum_{i,j=1}^N \sum_{k,l=1}^d a_{ik,jl}(\vec X)\,\frac{\partial^2}{\partial V_i^k \partial V_j^l}$.
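The chain of inequalities in the proof of Lemma 5.3.1 (Doob's inequality followed by the $L^2$ isometry for compensated Poisson integrals) reduces to elementary arithmetic once the integrand is replaced by its supremum bound. The following Python sketch is only a sanity check of that arithmetic under stated assumptions, not part of the original argument: the function names are hypothetical, `F` stands in for $\|f_{VV}\|_\infty$, and `G` stands in for $(2(R_0+1))^{d-1}\int_{\mathbb{R}^d}\rho(\tfrac12|v|^2)|v|\,dv$.

```python
import math


def doob_isometry_bound(m, C, F, T, G):
    """Evaluate 2 * sup|integrand| * sqrt(lambda-mass) for the bounded integrand.

    Assumptions (hypothetical shorthand, see lead-in):
      - the integrand is bounded by m * F * C**2 on the set {|x| <= R0 + 1},
      - the intensity lambda gives that set, over [0, T], a mass of at most T * G / m.
    """
    sup_integrand = m * F * C ** 2
    lambda_mass = T * G / m
    return 2.0 * sup_integrand * math.sqrt(lambda_mass)


def c1_constant(C, F, T, G):
    """The constant C_1 = 2 C^2 F (T G)^{1/2} from the proof."""
    return 2.0 * C ** 2 * F * math.sqrt(T * G)


if __name__ == "__main__":
    for m in (1.0, 1e-2, 1e-4):
        bound = doob_isometry_bound(m, C=1.5, F=2.0, T=3.0, G=0.7)
        print(m, bound, c1_constant(1.5, 2.0, 3.0, 0.7) * math.sqrt(m))
```

The two printed columns agree for every `m`, which is the statement that the bound equals $C_1 m^{1/2}$: the factor $m$ from the integrand and the factor $m^{-1/2}$ from the square root of the intensity combine to $m^{1/2}$.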

5.4. The term (II)

In this subsection, we deal with the term (II). The basic idea is the same as before: we exploit the fact that the variance of the corresponding Poisson point process is small (see Lemmas 5.4.1 and 5.4.7). Proposition 3.6.4 is also used to derive the limit, which gives us $z(\cdot; x, v, \vec X, \vec V, a)$.


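Recall that $P_i^{*1}$ is given by $P_i^{*1}(t) = -V_i^{02}(t) - V_i^{05}(t)$. So we have the decomposition

\[ -\int_0^{t\wedge\sigma} f_V(X_s, Y_s)\cdot dP_i^{*1}(s) = \int_0^{t\wedge\sigma} f_V(X_s, Y_s)\mathbf{1}_{[4m^{1/2}\tau,\sigma)}(s) \Big( \int_{\mathbb{R}\times E} f_i(s, r, x, v)\,\mu_\omega(dr, dx, dv) \Big) ds + \int_0^{t\wedge\sigma} f_V(X_s, Y_s)\mathbf{1}_{[4m^{1/2}\tau,\sigma)}(s) \Big( \int_{\mathbb{R}\times E} \widetilde F_i^{05}(s, r, x, v)\,\lambda(dr, dx, dv) \Big) ds \]

\[ = \int_0^{t\wedge\sigma} f_V(X_s, Y_s)\mathbf{1}_{[4m^{1/2}\tau,\sigma)}(s) \Big( \int_{\mathbb{R}\times E} \big(f_i(s, r, x, v) + \widetilde F_i^{05}(s, r, x, v)\big)\,\mu_\omega(dr, dx, dv) \Big) ds \]

\[ \quad + \int_0^{t\wedge\sigma} f_V(X_s, Y_s)\mathbf{1}_{[4m^{1/2}\tau,\sigma)}(s) \Big( \int_{\mathbb{R}\times E} \widetilde F_i^{05}(s, r, x, v)\,\big(\lambda(dr, dx, dv) - \mu_\omega(dr, dx, dv)\big) \Big) ds. \tag{5.5} \]

We first show in the following lemma that the second term on the right-hand side above is negligible.

Lemma 5.4.1.

\[ \lim_{m\to0} E^{P_m}\Big[ \sup_{0\le t\le T} \Big| \int_0^{t\wedge\sigma} f_V(X_s, Y_s)\mathbf{1}_{[4m^{1/2}\tau,\sigma)}(s) \Big( \int_{\mathbb{R}\times E} \widetilde F_i^{05}(s, r, x, v)\,\big(\lambda(dr, dx, dv) - \mu_\omega(dr, dx, dv)\big) \Big) ds \Big| \Big] = 0. \]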

Proof. As mentioned, this result intuitively lies in the fact that only the variance of the Poisson point process is involved. We prove it by first performing a proper decomposition (see (5.6)) and then showing that each of these terms is small enough (see Lemmas 5.4.2–5.4.4). Let

\[ R(s, r, x, v) = -\widetilde F_i^{05}(s, r, x, v) - \nabla^2 U_i\big(X_i(\tilde r) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(\tilde r))\big) \times \big\{ X_i(s) - X_i(\tilde r) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(s)) + \psi^0(m^{-1/2}(s-r), x, v; \vec X(\tilde r)) \big\}. \]


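Then we have the decomposition

\[ \int_0^{t\wedge\sigma} f_V(X_s, Y_s)\mathbf{1}_{[4m^{1/2}\tau,\sigma)}(s) \Big( \int_{\mathbb{R}\times E} \widetilde F_i^{05}(s, r, x, v)\,\big(\lambda(dr, dx, dv) - \mu_\omega(dr, dx, dv)\big) \Big) ds = (5\mathrm{I}) + (5\mathrm{II}) + (5\mathrm{III}), \tag{5.6} \]

where

\[ (5\mathrm{I}) = \int_0^{t\wedge\sigma} f_V(X_s, Y_s)\mathbf{1}_{[4m^{1/2}\tau,\sigma)}(s) \Big( \int_{\mathbb{R}\times E} R(s, r, x, v)\,\big(\mu_\omega(dr, dx, dv) - \lambda(dr, dx, dv)\big) \Big) ds, \]

\[ (5\mathrm{II}) = \int_0^{t\wedge\sigma} f_V(X_s, Y_s)\mathbf{1}_{[4m^{1/2}\tau,\sigma)}(s) \Big( \int_{\mathbb{R}\times E} \nabla^2 U_i\big(X_i(\tilde r) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(\tilde r))\big) \]
\[ \qquad \times \big\{ \big(X_i(s) - X_i(\tilde r) - (s-r)V_i(\tilde r)\big) - \big(\psi^0(m^{-1/2}(s-r), x, v; \vec X(s)) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(\tilde r) + (s-r)\vec V(\tilde r))\big) \big\} \big(\mu_\omega(dr, dx, dv) - \lambda(dr, dx, dv)\big) \Big) ds, \]

\[ (5\mathrm{III}) = \int_0^{t\wedge\sigma} f_V(X_s, Y_s)\mathbf{1}_{[4m^{1/2}\tau,\sigma)}(s) \Big( \int_{\mathbb{R}\times E} g(r, s, x, v)\,\big(\mu_\omega(dr, dx, dv) - \lambda(dr, dx, dv)\big) \Big) ds, \]

with

\[ g(r, s, x, v) = \nabla^2 U_i\big(X_i(\tilde r) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(\tilde r))\big) \times \big\{ (s-r)V_i(\tilde r) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(\tilde r) + (s-r)\vec V(\tilde r)) + \psi^0(m^{-1/2}(s-r), x, v; \vec X(\tilde r)) \big\}. \]

So Lemma 5.4.1 follows from Lemmas 5.4.2–5.4.4 in the following:

Lemma 5.4.2.

\[ \lim_{m\to0} E^{P_m}\Big[ \sup_{0\le t\le T} |(5\mathrm{III})| \Big] = 0. \]

Proof. First notice that $0 \le \tilde r \le \sigma$, hence $|\vec V(\tilde r)| \le Nn$. Let $C_1 = \|\nabla^2 U_i\|_\infty (2\tau n + 4\tilde C\tau Nn)$. Then by Corollary 3.2.3 and Lemma 4.3.4, we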


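have that

\[ |g(r, s, x, v)| \le \|\nabla^2 U_i\|_\infty \mathbf{1}_{[0,2m^{1/2}\tau)}(|s-r|)\big(|s-r||V_i(\tilde r)| + \tilde C|s-r||\vec V(\tilde r)|\big)\mathbf{1}_{[0,R_0+1)}(|x|) \le m^{1/2}C_1 \mathbf{1}_{[0,2m^{1/2}\tau)}(|s-r|)\mathbf{1}_{[0,R_0+1)}(|x|). \]

Also, it is easy to see that $g(r, s, x, v)$ is $\mathcal{F}_r$-measurable. Therefore by assumption, with $c = \sum_{i=1}^N \|U_i\|_\infty$ and $C_2 = \|f_V\|_\infty T C_1 (2(R_0+1))^{(d-1)/2}\big(4\tau\int_{\mathbb{R}^d}\rho_c(\tfrac12|v|^2)|v|\,dv\big)^{1/2}$, we have

\[ E^{P_m}\Big[ \sup_{0\le t\le T} |(5\mathrm{III})| \Big] \le \|f_V\|_\infty E^{P_m}\Big[ \int_0^{T\wedge\sigma} ds \Big| \int_{\mathbb{R}\times E} g(r, s, x, v)\,\big(\mu_\omega(dr, dx, dv) - \lambda(dr, dx, dv)\big) \Big| \Big] \]

\[ \le \|f_V\|_\infty \int_0^T ds\, E^{P_m}\Big[ \Big| \int_{\mathbb{R}\times E} g(r, s, x, v)\,\big(\mu_\omega(dr, dx, dv) - \lambda(dr, dx, dv)\big) \Big|^2 \Big]^{1/2} = \|f_V\|_\infty \int_0^T ds \Big( \int_{\mathbb{R}\times E} E^{P_m}[|g(r, s, x, v)|^2]\,\lambda(dr, dx, dv) \Big)^{1/2} \]

\[ \le \|f_V\|_\infty \int_0^T ds \Big( \int_{\mathbb{R}\times E} \big(m^{1/2}C_1 \mathbf{1}_{[0,2m^{1/2}\tau)}(|s-r|)\mathbf{1}_{[0,R_0+1)}(|x|)\big)^2\, m^{-1}\rho\Big(\tfrac12|v|^2 + \sum_{i=1}^N U_i(x - m^{-1/2}rv - X_{i,0})\Big)\,dr\,\nu(dx, dv) \Big)^{1/2} \le C_2 m^{1/4}, \]

which converges to 0 as $m \to 0$.

Lemma 5.4.3.

\[ \lim_{m\to0} E^{P_m}\Big[ \sup_{0\le t\le T} |(5\mathrm{I})| \Big] = 0. \]

Proof. By the definition of $R(s, r, x, v)$, a Taylor expansion, Corollary 3.2.3 and Lemma 4.3.4, we get that

\[ |R(s, r, x, v)| \le \|\nabla^3 U_i\|_\infty \big| (X_i(s) - X_i(\tilde r)) - \big(\psi^0(m^{-1/2}(s-r), x, v; \vec X(s)) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(\tilde r))\big) \big|^2 \le (1+\tilde C)^2 |X_i(s) - X_i(\tilde r)|^2 \mathbf{1}_{[0,2m^{1/2}\tau)}(|s-r|)\mathbf{1}_{[0,R_0+1)}(|x|). \]

Notice that when $|s-r| \le 2m^{1/2}\tau$, since $s, \tilde r \in [0, \sigma]$, we get that $|X_i(s) - X_i(\tilde r)| \le n|s - \tilde r| \le n4m^{1/2}\tau$. So the above gives us that

\[ |R(s, r, x, v)| \le (4n\tau(1+\tilde C))^2\, m\, \mathbf{1}_{[0,2m^{1/2}\tau)}(|s-r|)\mathbf{1}_{[0,R_0+1)}(|x|). \]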


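Therefore, with $C_1 = 2\|f_V\|_\infty T(4n\tau(1+\tilde C))^2\, 4\tau\,(2(R_0+1))^{d-1}\int_{\mathbb{R}^d}\rho_c(\tfrac12|v|^2)|v|\,dv$, we have

\[ E^{P_m}\Big[ \sup_{0\le t\le T} |(5\mathrm{I})| \Big] \le 2\|f_V\|_\infty \int_0^T ds \int_{\mathbb{R}\times E} E^{P_m}\big[\mathbf{1}_{[0,\sigma]}(s)|R(s, r, x, v)|\big]\,\lambda(dr, dx, dv) \]

\[ \le 2\|f_V\|_\infty \int_0^T ds \int_{\mathbb{R}\times E} (4n\tau(1+\tilde C))^2\, m\, \mathbf{1}_{[0,2m^{1/2}\tau)}(|s-r|)\mathbf{1}_{[0,R_0+1)}(|x|)\, m^{-1}\rho\Big(\tfrac12|v|^2 + \sum_{i=1}^N U_i(x - m^{-1/2}rv - X_{i,0})\Big)\,dr\,\nu(dx, dv) \le C_1 m^{1/2}, \]

which converges to 0 as $m \to 0$.

Lemma 5.4.4.

\[ \lim_{m\to0} E^{P_m}\Big[ \sup_{0\le t\le T} |(5\mathrm{II})| \Big] = 0. \]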

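Proof. First, by Lemma 4.3.4,

\[ \big|\psi^0(m^{-1/2}(s-r), x, v; \vec X(s)) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(\tilde r) + (s-r)\vec V(\tilde r))\big| \le \tilde C\,|\vec X(s) - \vec X(\tilde r) - (s-r)\vec V(\tilde r)|. \]

Notice that if $s \ge 4m^{1/2}\tau$ and $|s-r| \le 2m^{1/2}\tau$, then by the definition of $\tilde r$, we always have that $\tilde r \le s$. Therefore,

\[ \vec X(s) - \vec X(\tilde r) - (s-r)\vec V(\tilde r) = \int_{\tilde r}^s \big(\vec V(u) - \vec V(\tilde r)\big)\,du. \tag{5.7} \]

For any $l \le s$ $(\le \sigma)$, we have that $|X_i(l) - X_j(l)| \ge R_i + R_j$, $i \ne j$, which implies that $\nabla_i \tilde U(\vec X(l)) = 0$, $i = 1, \dots, N$. Therefore we have by Lemma 3.5.1 that

\[ V_i(u) - V_i(\tilde r) = \frac{1}{M_i}\Big( \int_{\tilde r}^u \frac{d}{dl}P_i^{*1}(l)\,dl + \eta_i(u) - \eta_i(\tilde r) + M_i(u) - M_i(\tilde r) - m^{-1/2}\int_{\tilde r}^u \nabla_i \tilde U(\vec X(l))\,dl \Big) = \frac{1}{M_i}\Big( \int_{\tilde r}^u \frac{d}{dl}P_i^{*1}(l)\,dl + \eta_i(u) - \eta_i(\tilde r) + M_i(u) - M_i(\tilde r) \Big). \tag{5.8} \]

Let

\[ a_m = 4m^{1/2}\tau + 2\max_{i=1,\dots,N} E^{P_m}\Big[ \sup_{0\le u\le T} |\eta_i(u)| \Big] + (4m^{1/2}\tau)^{1/2}. \]

Then by Lemma 3.5.1(4), we have $a_m \to 0$ as $m \to 0$. Notice that for $s \in [0, T\wedge\sigma]$, $|s-r| \le 2m^{1/2}\tau$ implies $|s-\tilde r| \le 4m^{1/2}\tau$. Let $C$ be the constant in (4.13),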


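and let $C_1 = 4\sum_{i=1}^N \frac{1}{M_i}\big( \sup_{m\in(0,1]}\sup_{t\in[0,T]} E^{P_m}[|\frac{d}{dl}P_i^{*1}(l)|^2]^{1/2} + 1 + C \big)$, which is finite by Lemma 3.5.1. Then we get by (5.7), (5.8) and (4.13) that

\[ E^{P_m}\big[|\vec X(s) - \vec X(\tilde r) - (s-r)\vec V(\tilde r)|\big] \le \sum_{i=1}^N \frac{1}{M_i} E^{P_m}\Big[ \Big| \int_{\tilde r}^s du \int_{\tilde r}^u \frac{d}{dl}P_i^{*1}(l)\,dl \Big| + \Big| \int_{\tilde r}^s du\,(\eta_i(u) - \eta_i(\tilde r)) \Big| + \Big| \int_{\tilde r}^s du\,(M_i(u) - M_i(\tilde r)) \Big| \Big] \]

\[ \le \sum_{i=1}^N \frac{1}{M_i}\Big\{ (4m^{1/2}\tau)^2 \sup_{m\in(0,1]}\sup_{t\in[0,T]} E^{P_m}\Big[ \Big|\frac{d}{dl}P_i^{*1}(l)\Big|^2 \Big]^{1/2} + 4m^{1/2}\tau \cdot 2\sup_{0\le u\le T} E^{P_m}[|\eta_i(u)|] + \int_{\tilde r}^s du\, E^{P_m}[|M_i(u) - M_i(\tilde r)|^2]^{1/2} \Big\} \le C_1 (4m^{1/2}\tau)\,a_m. \tag{5.9} \]

Also, by properties of Poisson point processes, we have in general $E^{P_m}[\int f\,d\mu_\omega] = E^{P_m}[\int f\,d\lambda_m] = \int E[f]\,d\lambda_m$ for any measurable $f : \mathbb{R}\times E\times \mathrm{Conf}(\mathbb{R}\times E) \to \mathbb{R}$. Let $\tilde C$ be as in Lemma 4.3.4, and let $C_2 = 2(1+\tilde C)\|f_V\|_\infty \|\nabla^2 U_i\|_\infty T C_1\, 4\tau\,(2(R_0+1))^{d-1}\int_{\mathbb{R}^d}\rho_c(\tfrac12|v|^2)|v|\,dv$. Then by Corollary 3.2.3, Lemma 4.3.4 and (5.9), we get that

\[ E^{P_m}\Big[ \sup_{0\le t\le T} |(5\mathrm{II})| \Big] \le \|f_V\|_\infty \int_0^T ds\,(1+\tilde C)\, E^{P_m}\Big[ \mathbf{1}_{[0,\sigma)}(s) \int_{\mathbb{R}\times E} \|\nabla^2 U_i\|_\infty |\vec X(s) - \vec X(\tilde r) - (s-r)\vec V(\tilde r)|\, \mathbf{1}_{[0,2m^{1/2}\tau)}(|s-r|)\mathbf{1}_{[0,R_0+1)}(|x|)\,\big(\mu_\omega(dr, dx, dv) + \lambda_m(dr, dx, dv)\big) \Big] \]

\[ = 2(1+\tilde C)\|f_V\|_\infty \|\nabla^2 U_i\|_\infty \int_0^T ds \int_{\mathbb{R}\times E} E^{P_m}\big[\mathbf{1}_{[0,\sigma)}(s)|\vec X(s) - \vec X(\tilde r) - (s-r)\vec V(\tilde r)|\big]\, \mathbf{1}_{[0,2m^{1/2}\tau)}(|s-r|)\mathbf{1}_{[0,R_0+1)}(|x|)\,\lambda_m(dr, dx, dv) \le C_2 a_m, \]

which converges to 0 as $m \to 0$. This completes the proof of Lemma 5.4.4.

Lemmas 5.4.2–5.4.4 complete the proof of Lemma 5.4.1.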
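The identity $E[\int f\,d\mu_\omega] = E[\int f\,d\lambda_m]$ used in the proof of Lemma 5.4.4 can be illustrated numerically in its simplest form. The sketch below is only an illustration under strong simplifying assumptions that are not in the paper: a homogeneous Poisson process on an interval, a deterministic integrand, and hypothetical function names. It does not cover the configuration-dependent integrands that the proof actually needs.

```python
import random


def sample_poisson_points(rate, horizon, rng):
    """Sample the points of a homogeneous Poisson process on [0, horizon).

    Points are generated through i.i.d. exponential inter-arrival times.
    """
    points = []
    t = rng.expovariate(rate)
    while t < horizon:
        points.append(t)
        t += rng.expovariate(rate)
    return points


def mean_poisson_integral(f, rate, horizon, n_samples, seed=0):
    """Monte Carlo estimate of E[ sum of f over the Poisson points ]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += sum(f(r) for r in sample_poisson_points(rate, horizon, rng))
    return total / n_samples


if __name__ == "__main__":
    rate, horizon = 5.0, 1.0
    estimate = mean_poisson_integral(lambda r: r * r, rate, horizon, 20000)
    exact = rate * horizon ** 3 / 3.0  # integral of f against the intensity rate * dr
    print(estimate, exact)
```

With the intensity `rate * dr` on `[0, 1)` and `f(r) = r**2`, the integral against the intensity is `rate / 3`, and the Monte Carlo average should be close to it up to sampling error.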


We next deal with the first term on the right-hand side of (5.5). We first make the decomposition

\[ f_i(s, r, x, v) + \widetilde F_i^{05}(s, r, x, v) = \nabla U_i\big(X_i(s) - x(s, \Psi(r, x, m^{-1/2}v))\big) - \nabla U_i\big(X_i(s) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(s))\big) = f_i^1(s, r, x, v) + f_i^2(s, r, x, v) + f_i^3(s, r, x, v), \]

with

\[ f_i^1(s, r, x, v) = \nabla U_i\big(X_i(s) - x(s, \Psi(r, x, m^{-1/2}v))\big) - \nabla U_i\big(X_i(s) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(s))\big) - \nabla^2 U_i\big(X_i(s) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(s))\big)\cdot\big(x(s, \Psi(r, x, m^{-1/2}v)) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(s))\big), \]

\[ f_i^2(s, r, x, v) = \nabla^2 U_i\big(X_i(s) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(s))\big)\cdot\big(x(s, \Psi(r, x, m^{-1/2}v)) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(s)) - m^{1/2}z(m^{-1/2}(s-r); x, v, \vec X(s), \vec V(s), -m^{-1/2}(s-r))\big), \]

\[ f_i^3(s, r, x, v) = \nabla^2 U_i\big(X_i(s) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(s))\big)\cdot m^{1/2}z(m^{-1/2}(s-r); x, v, \vec X(s), \vec V(s), -m^{-1/2}(s-r)), \]

where $z$ is as defined in (2.3).

Although we are not subtracting the expectations of the corresponding terms, the terms involving $f_i^1(s, r, x, v)$ and $f_i^2(s, r, x, v)$ are negligible, as $f_i^1(s, r, x, v)$ and $f_i^2(s, r, x, v)$ themselves are small enough. Indeed, it is easy to see by Taylor expansion that $f_i^1(s, r, x, v)$ is of higher order than $x(s, \Psi(r, x, m^{-1/2}v)) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(s))$, so the term corresponding to it is clearly negligible. The fact that the term involving $f_i^2(s, r, x, v)$ is also negligible comes from Proposition 3.6.4. We formulate the result in the following.

Lemma 5.4.5. We have that

\[ \lim_{m\to0} E^{P_m}\Big[ \sup_{0\le t\le T} \Big| \int_0^{t\wedge\sigma} f_V(X_s, Y_s)\mathbf{1}_{[4m^{1/2}\tau,\sigma]}(s) \Big( \int_{\mathbb{R}\times E} f_i^k(s, r, x, v)\,\mu_\omega(dr, dx, dv) \Big) ds \Big| \Big] = 0, \qquad k = 1, 2. \]

Proof. We first show the assertion for $k = 1$. First notice that by Corollary 3.2.3 and Proposition 3.6.5, we have that for $s \in [0, T\wedge\sigma]$, $f_i^1(s, r, x, v) \ne 0$ only if $|x| \le R_0 + 1$ and $s - r \in [-m^{1/2}\tau, 2m^{1/2}\tau]$. Since $s \in [4m^{1/2}\tau, T\wedge\sigma]$, this implies that $r - m^{1/2}\tau \in [0, T\wedge\sigma]$. So in this region, we have by Proposition 3.6.3 that


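there exists a constant $\tilde C > 0$ such that

\[ |x(s, \Psi(r, x, m^{-1/2}v)) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(s))| \le m^{1/2}\tilde C\big(2\tau + |m^{-1/2}(r-s)|\big) \le 4\tilde C\tau m^{1/2}. \]

So with $C_1 = \|\nabla^3 U_i\|_\infty (4\tilde C\tau)^2$, we have

\[ |f_i^1(s, r, x, v)| \le \|\nabla^3 U_i\|_\infty |x(s, \Psi(r, x, m^{-1/2}v)) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(s))|^2 \le C_1 m\,\mathbf{1}_{[0,2m^{1/2}\tau]}(|s-r|)\mathbf{1}_{[0,R_0+1]}(|x|). \]

Let $C_2 = \|f_V\|_\infty T C_1^2 (2(R_0+1))^{d-1}\, 4\tau \int_{\mathbb{R}^d}\rho_c(\tfrac12|v|^2)|v|\,dv$. Then by the definition of $\lambda$, we get

\[ E^{P_m}\Big[ \sup_{0\le t\le T} \Big| \int_0^{t\wedge\sigma} f_V(X_s, Y_s)\mathbf{1}_{[4m^{1/2}\tau,\sigma]}(s) \Big( \int_{\mathbb{R}\times E} f_i^1(s, r, x, v)\,\mu_\omega(dr, dx, dv) \Big) ds \Big| \Big] \]

\[ \le \int_0^T ds\,\|f_V\|_\infty C_1 m \int_{\mathbb{R}\times E} \mathbf{1}_{[0,2m^{1/2}\tau]}(|s-r|)\mathbf{1}_{[0,R_0+1]}(|x|)\, m^{-1}\rho\Big(\tfrac12|v|^2 + \sum_{i=1}^N U_i(x - m^{-1/2}rv - X_{i,0})\Big)\,dr\,\nu(dx, dv) \le C_2 m^{1/2}, \]

which converges to 0 as $m \to 0$.

The assertion for $k = 2$ is similar. Again, for $s \in [0, T\wedge\sigma]$, we have by Corollary 3.2.3 that $f_i^2(s, r, x, v) \ne 0$ only if $|x| \le R_0 + 1$ and $s - r \in [-m^{1/2}\tau, 2m^{1/2}\tau]$. For any $s$ and $r$ satisfying $|s-r| \le 2m^{1/2}\tau$, we have by Proposition 3.6.4 that

\[ |x(s, \Psi(r, x, m^{-1/2}v)) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(s)) - m^{1/2}z(m^{-1/2}(s-r); x, v, \vec X(s), \vec V(s), -m^{-1/2}(s-r))| \le Cm^{1/2}(1+2\tau)^2 m^{1/2}. \]

Let $C_3 = 4\|f_V\|_\infty \|\nabla^2 U_i\|_\infty C(1+2\tau)^2 \tau T (2(R_0+1))^{d-1}\int_{\mathbb{R}^d}\rho_c(\tfrac12|v|^2)|v|\,dv$. Then

\[ E^{P_m}\Big[ \sup_{0\le t\le T} \Big| \int_0^{t\wedge\sigma} f_V(X_s, Y_s)\mathbf{1}_{[4m^{1/2}\tau,\sigma]}(s) \Big( \int_{\mathbb{R}\times E} f_i^2(s, r, x, v)\,\mu_\omega(dr, dx, dv) \Big) ds \Big| \Big] \]

\[ \le \int_0^T ds\,\|f_V\|_\infty \|\nabla^2 U_i\|_\infty C(1+2\tau)^2 m \int_{\mathbb{R}\times E} \mathbf{1}_{[0,2m^{1/2}\tau]}(|s-r|)\mathbf{1}_{[0,R_0+1]}(|x|)\, m^{-1}\rho\Big(\tfrac12|v|^2 + \sum_{i=1}^N U_i(x - m^{-1/2}rv - X_{i,0})\Big)\,dr\,\nu(dx, dv) \le C_3 m^{1/2}, \]

which converges to 0 as $m \to 0$. This completes the proof of our lemma.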
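Gronwall's Lemma, which the paper invokes for bounds such as (5.10) and again in the proof of Lemma 5.4.6, is applied there in the form: if $g(t) \le at + b\int_0^t g(s)\,ds$ on $[0, T]$, then $g(t) \le a\,t\,e^{bt}$. The Python sketch below is a numerical illustration only, with hypothetical function names. It compares the extremal case (where the integral inequality holds with equality) against the stated bound.

```python
import math


def extremal_solution(a, b, t):
    """Solution of g(t) = a*t + b * integral_0^t g(s) ds, i.e. g' = a + b*g, g(0) = 0.

    Closed form: g(t) = (a / b) * (exp(b*t) - 1).
    """
    return a * math.expm1(b * t) / b


def gronwall_bound(a, b, t):
    """The bound a * t * exp(b * t) used in the text."""
    return a * t * math.exp(b * t)


if __name__ == "__main__":
    a, b = 2.0, 1.5
    for t in (0.0, 0.1, 1.0, 3.0):
        print(t, extremal_solution(a, b, t), gronwall_bound(a, b, t))
```

Since $(e^{x} - 1)/x \le e^{x}$ for $x > 0$, the extremal solution never exceeds the bound. In Lemma 5.4.6 the roles of `a`, `b` and `t` are played by $C(\|\vec X^1 - \vec X^2\| + \|\vec V^1 - \vec V^2\|)$, $C + 1$ and $T_1 + \tau$.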


Before dealing with the main term, namely the one involving $f_i^3(s, r, x, v)$, let us prove the following continuity of $z(t; x, v, \vec X, \vec V, a)$ with respect to $\vec X$ and $\vec V$, by again using Gronwall’s Lemma.

Lemma 5.4.6. For any $T_1 > 0$ and $b, A, B > 0$, there exists a constant $C = C(T_1, b, A, B)$ such that

\[ |z(t; x, v, \vec X^1, \vec V^1, a) - z(t; x, v, \vec X^2, \vec V^2, a)| \le C\big(\|\vec X^1 - \vec X^2\| + \|\vec V^1 - \vec V^2\|\big) \]

for any $t \in [-\tau, T_1]$, $|a| \le A$, $\|\vec X^1\|, \|\vec X^2\| \le B$, $\|\vec V^1\|, \|\vec V^2\| \le b$.

Proof. First notice that for any $a, x, v, \vec X, \vec V$, by using the same method as in the proofs of Lemmas 3.6.3, 4.3.4, etc., with the help of Gronwall’s Lemma, we get easily that for any $T_0 > 0$,

\[ |z(t)| \vee |z'(t)| \le (T_0 + |a|)\|\vec V\|\,T_0\, e^{(1+\sum_{i=1}^N \|\nabla^2 U_i\|_\infty)T_0}, \qquad |t| \le T_0. \tag{5.10} \]

For the sake of simplicity, we write $z^k(t) = z(t; x, v, \vec X^k, \vec V^k, a)$, $k = 1, 2$, and let $\xi(t) = z^1(t) - z^2(t)$. Then we have that in our domain, there exists a constant $C_0 = C_0(T, b, A, B) > 0$ such that $|z^1(t)| \le C_0$. Let $\tilde C > 0$ be the constant in Lemma 4.3.4 and let $C = \sum_{i=1}^N \{\|\nabla^3 U_i\|_\infty(\tilde C + 1)(C_0 + (T + A)b) + \|\nabla^2 U_i\|_\infty(1 + T + A)\}$. Then by definition and Lemma 4.3.4,

\[ \Big|\frac{d^2}{dt^2}\xi(t)\Big| = \Big| -\sum_{i=1}^N \nabla^2 U_i\big(\psi^0(t, x, v; \vec X^1) - X_i^1\big)\big(z^1(t) - (t+a)\vec V^1\big) + \sum_{i=1}^N \nabla^2 U_i\big(\psi^0(t, x, v; \vec X^2) - X_i^2\big)\big(z^2(t) - (t+a)\vec V^2\big) \Big| \]

\[ = \Big| -\sum_{i=1}^N \big\{\nabla^2 U_i\big(\psi^0(t, x, v; \vec X^1) - X_i^1\big) - \nabla^2 U_i\big(\psi^0(t, x, v; \vec X^2) - X_i^2\big)\big\}\big(z^1(t) - (t+a)\vec V^1\big) - \sum_{i=1}^N \nabla^2 U_i\big(\psi^0(t, x, v; \vec X^2) - X_i^2\big)\big(z^1(t) - z^2(t) - (t+a)(\vec V^1 - \vec V^2)\big) \Big| \]

\[ \le \sum_{i=1}^N \|\nabla^3 U_i\|_\infty(\tilde C + 1)\|\vec X^1 - \vec X^2\|\big(|z^1(t)| + (T + |a|)\|\vec V^1\|\big) + \sum_{i=1}^N \|\nabla^2 U_i\|_\infty\big(|z^1(t) - z^2(t)| + (T + |a|)\|\vec V^1 - \vec V^2\|\big) \]

\[ \le C\big(\|\vec X^1 - \vec X^2\| + \|\vec V^1 - \vec V^2\|\big) + C|z^1(t) - z^2(t)| = C\big(\|\vec X^1 - \vec X^2\| + \|\vec V^1 - \vec V^2\|\big) + C|\xi(t)|. \]


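Let $g(t) = |(\xi(t), \frac{d}{dt}\xi(t))|$. Then

\[ \Big|\frac{d}{dt}g(t)\Big| \le \Big|\frac{d}{dt}\xi(t)\Big| + \Big|\frac{d^2}{dt^2}\xi(t)\Big| \le C\big(\|\vec X^1 - \vec X^2\| + \|\vec V^1 - \vec V^2\|\big) + (C+1)g(t). \]

Hence if we set $\tilde g(t) = g(t - \tau)$, then $\tilde g(0) = \frac{d}{dt}\tilde g(0) = 0$ by definition, and the discussion above gives us that

\[ \Big|\frac{d}{dt}\tilde g(t)\Big| \le C\big(\|\vec X^1 - \vec X^2\| + \|\vec V^1 - \vec V^2\|\big) + (C+1)\tilde g(t). \]

So we have for any $t \in [0, T_1 + \tau]$ that

\[ \tilde g(t) \le Ct\big(\|\vec X^1 - \vec X^2\| + \|\vec V^1 - \vec V^2\|\big) + (C+1)\int_0^t \tilde g(s)\,ds. \]

This combined with Gronwall’s Lemma implies that

\[ \tilde g(t) \le C(T_1 + \tau)\big(\|\vec X^1 - \vec X^2\| + \|\vec V^1 - \vec V^2\|\big)e^{(C+1)(T_1+\tau)}, \qquad t \in [0, T_1 + \tau], \]

which completes the proof of our assertion.

Now, let us come back to deal with the term corresponding to $f_i^3(s, r, x, v)$. We make once more a decomposition of the form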

\[ \int_0^{t\wedge\sigma} f_V(X_s, Y_s)\mathbf{1}_{[4m^{1/2}\tau,\sigma]}(s) \Big( \int_{\mathbb{R}\times E} f_i^3(s, r, x, v)\,\mu_\omega(dr, dx, dv) \Big) ds = (\mathrm{V}1) + (\mathrm{V}2), \]

with

\[ (\mathrm{V}1) = \int_0^{t\wedge\sigma} f_V(X_s, Y_s)\mathbf{1}_{[4m^{1/2}\tau,\sigma]}(s) \Big( \int_{\mathbb{R}\times E} f_i^3(s, r, x, v)\,\lambda(dr, dx, dv) \Big) ds, \]

\[ (\mathrm{V}2) = \int_0^{t\wedge\sigma} f_V(X_s, Y_s)\mathbf{1}_{[4m^{1/2}\tau,\sigma]}(s) \Big( \int_{\mathbb{R}\times E} f_i^3(s, r, x, v)\,(\mu_\omega - \lambda)(dr, dx, dv) \Big) ds. \]

The term (V1) (after a slight modification to get rid of the restriction that $s \ge 4m^{1/2}\tau$) is actually our goal term. The term (V2), being the variance with respect to the corresponding Poisson point process, is expected to be negligible. We show the second assertion in Lemma 5.4.7.

Notice that up to $\sigma_n$, $\vec V(t)$ and $\vec X(t)$ are bounded. Also, $m^{-1/2}|s-r| \le 2\tau$ and $|x| \le R_0 + 1$ if $\nabla^2 U_i(X_i(s) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(s))) \ne 0$. So by (5.10), in this case, $z(m^{-1/2}(s-r); x, v, \vec X(s), \vec V(s), -m^{-1/2}(s-r))$ is bounded. So by the definition of $f_i^3$ and the boundedness of $\nabla^2 U_i$, we get that there exists a constant $C > 0$ such that

\[ |f_i^3(s, r, x, v)| \le Cm^{1/2}\mathbf{1}_{[0,2m^{1/2}\tau]}(|s-r|)\mathbf{1}_{[0,R_0+1]}(|x|). \]

Lemma 5.4.7.

\[ \lim_{m\to0} E^{P_m}\Big[ \sup_{0\le t\le T} |(\mathrm{V}2)| \Big] = 0. \]


Proof. Let

\[ R_3(s, r, x, v) = f_i^3(s, r, x, v) - \nabla^2 U_i\big(X_i(\tilde r) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(\tilde r))\big)\cdot m^{1/2}z(m^{-1/2}(s-r); x, v, \vec X(\tilde r), \vec V(\tilde r), -m^{-1/2}(s-r)). \]

Then $(\mathrm{V}2) = (\mathrm{V}21) + (\mathrm{V}22)$, with

\[ (\mathrm{V}21) = \int_0^{t\wedge\sigma} f_V(X_s, Y_s)\mathbf{1}_{[4m^{1/2}\tau,\sigma]}(s) \Big( \int_{\mathbb{R}\times E} R_3(s, r, x, v)\,(\mu_\omega - \lambda)(dr, dx, dv) \Big) ds, \]

\[ (\mathrm{V}22) = \int_0^{t\wedge\sigma} f_V(X_s, Y_s)\mathbf{1}_{[4m^{1/2}\tau,\sigma]}(s) \Big( \int_{\mathbb{R}\times E} \nabla^2 U_i\big(X_i(\tilde r) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(\tilde r))\big)\cdot m^{1/2}z(m^{-1/2}(s-r); x, v, \vec X(\tilde r), \vec V(\tilde r), -m^{-1/2}(s-r))\,(\mu_\omega - \lambda)(dr, dx, dv) \Big) ds. \]

We first deal with (V21). We have by Corollary 3.2.3 and Proposition 3.6.5 that $R_3(s, r, x, v) = 0$ if $|x| \ge R_0 + 1$ or if $|s-r| \ge 2m^{1/2}\tau$. For $s \in [0, T\wedge\sigma]$ and $|s-r| \le 2m^{1/2}\tau$, we have by definition $|s - \tilde r| \le 4m^{1/2}\tau$. Let $C_1 = \|\nabla^2 U_i\|_\infty C + \|\nabla^3 U_i\|_\infty(1+\tilde C)(T_0 + 2\tau)nN2\tau e^{(1+\sum_{i=1}^N \|\nabla^2 U_i\|_\infty)2\tau}$, where $C$ is the constant in Lemma 5.4.6, and $\tilde C$ is the one in Lemma 4.3.4. Then by (5.10), Lemmas 5.4.6 and 4.3.4, we have

\[ |R_3(s, r, x, v)| = \Big| \nabla^2 U_i\big(X_i(s) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(s))\big)\cdot m^{1/2}\big\{ z(m^{-1/2}(s-r); x, v, \vec X(s), \vec V(s), -m^{-1/2}(s-r)) - z(m^{-1/2}(s-r); x, v, \vec X(\tilde r), \vec V(\tilde r), -m^{-1/2}(s-r)) \big\} \]
\[ \qquad + \big\{ \nabla^2 U_i\big(X_i(s) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(s))\big) - \nabla^2 U_i\big(X_i(\tilde r) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(\tilde r))\big) \big\}\, m^{1/2}z(m^{-1/2}(s-r); x, v, \vec X(\tilde r), \vec V(\tilde r), -m^{-1/2}(s-r)) \Big| \]

\[ \le \|\nabla^2 U_i\|_\infty m^{1/2}\big| z(m^{-1/2}(s-r); x, v, \vec X(s), \vec V(s), -m^{-1/2}(s-r)) - z(m^{-1/2}(s-r); x, v, \vec X(\tilde r), \vec V(\tilde r), -m^{-1/2}(s-r)) \big| \]
\[ \qquad + \|\nabla^3 U_i\|_\infty\big( |X_i(s) - X_i(\tilde r)| + |\psi^0(m^{-1/2}(s-r), x, v; \vec X(s)) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(\tilde r))| \big)\, m^{1/2}\big| z(m^{-1/2}(s-r); x, v, \vec X(\tilde r), \vec V(\tilde r), -m^{-1/2}(s-r)) \big| \]

\[ \le C_1 m^{1/2}\big( \|\vec X(s) - \vec X(\tilde r)\| + \|\vec V(s) - \vec V(\tilde r)\| \big). \]

Since $|V_i(t)| \le n$ until $\sigma_n$, we have $\|\vec X(s) - \vec X(\tilde r)\| \le Nn|s - \tilde r| \le 4m^{1/2}\tau Nn$. To estimate the term with respect to $\vec V(\cdot)$ in the equation above, let

\[ a_m = 4m^{1/2}\tau + 2\max_{i=1,\dots,N} E^{P_m}\Big[ \sup_{0\le u\le T} |\eta_i(u)| \Big] + (4m^{1/2}\tau)^{1/2} \]

as before. Then by Lemma 3.5.1(4), $a_m \to 0$ as $m \to 0$.

Let $C_2 = \sum_{i=1}^N \frac{1}{M_i}\big( \sup_{m\in(0,1]}\sup_{0\le u\le t} E^{P_m}[|\frac{d}{du}P_i^{*1}(u)|] + 1 + \sqrt{C} \big)$, where $C$ is the constant in (4.13). Then we have that

\[ E^{P_m}\big[\|\vec V(s) - \vec V(\tilde r)\|\big] \le C_2 a_m, \qquad |s-r| \le 2m^{1/2}\tau. \]

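Indeed, since $s, \tilde r \in [0, \sigma_0\wedge\sigma_n]$, we have by Lemma 3.5.1(1) and (3.31) that

\[ V_i(s) - V_i(\tilde r) = \frac{1}{M_i}\Big( \int_{\tilde r}^s \frac{d}{dl}P_i^{*1}(l)\,dl + \eta_i(s) - \eta_i(\tilde r) + M_i(s) - M_i(\tilde r) \Big), \]

hence by Lemma 3.5.1(2) and (4.13),

\[ E^{P_m}\big[\|\vec V(s) - \vec V(\tilde r)\|\big] \le \sum_{i=1}^N \frac{1}{M_i}\Big( |s - \tilde r| \sup_{0\le u\le T} E^{P_m}\Big[ \Big|\frac{d}{du}P_i^{*1}(u)\Big| \Big] + 2E^{P_m}\Big[ \sup_{0\le u\le T} |\eta_i(u)| \Big] + E^{P_m}[|M_i(s) - M_i(\tilde r)|] \Big) \]

\[ \le \sum_{i=1}^N \frac{1}{M_i}\Big( |s - \tilde r| \sup_{0\le u\le t} E^{P_m}\Big[ \Big|\frac{d}{du}P_i^{*1}(u)\Big| \Big] + 2E^{P_m}\Big[ \sup_{0\le u\le T} |\eta_i(u)| \Big] + \sqrt{C}\,|s - \tilde r|^{1/2} \Big) \le C_2 a_m, \]

which gives us our assertion.

Combining the above and the definition of $\lambda$, we get that with $\hat C = 8\tau T\|f_V\|_\infty (2(R_0+1))^{d-1} C_1 (4\tau Nn + C_2)\int_{\mathbb{R}^d}\rho_c(\tfrac12|v|^2)|v|\,dv$,

\[ E^{P_m}\Big[ \sup_{0\le t\le T} |(\mathrm{V}21)| \Big] \le \int_0^T ds\,\|f_V\|_\infty E^{P_m}\Big[ \mathbf{1}_{[0,T\wedge\sigma]}(s) \int_{\mathbb{R}\times E} C_1 m^{1/2}\big(4m^{1/2}\tau Nn + \|\vec V(s) - \vec V(\tilde r)\|\big)\, \mathbf{1}_{[0,2m^{1/2}\tau]}(|s-r|)\mathbf{1}_{[0,R_0+1]}(|x|)\,(\mu_\omega + \lambda)(dr, dx, dv) \Big] \]

\[ \le 2\int_0^T ds\,\|f_V\|_\infty \int_{\mathbb{R}\times E} E^{P_m}\big[\mathbf{1}_{[0,T\wedge\sigma]}(s)\,C_1 m^{1/2}(4m^{1/2}\tau Nn + C_2 a_m)\big]\, \mathbf{1}_{[0,2m^{1/2}\tau]}(|s-r|)\mathbf{1}_{[0,R_0+1]}(|x|)\,\lambda(dr, dx, dv) \le \hat C(m^{1/2} + a_m) \to 0, \qquad \text{as } m \to 0. \]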

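To handle the term (V22) is easier. We have $|\nabla^2 U_i(X_i(\tilde r) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(\tilde r)))| \le \|\nabla^2 U_i\|_\infty \mathbf{1}_{[0,2m^{1/2}\tau]}(|s-r|)\mathbf{1}_{[0,R_0+1]}(|x|)$. Also, for $s \in [0, T]$ and $|s-r| \le 2m^{1/2}\tau$, we have by (5.10) that $z(m^{-1/2}(s-r); x, v, \vec X(\tilde r), \vec V(\tilde r), -m^{-1/2}(s-r))$ is bounded. Let $C_3$ be a bound of it, and let $\hat C = T\|f_V\|_\infty C_3 \big((2(R_0+1))^{d-1}\, 4\tau\,\|\nabla^2 U_i\|_\infty \int_{\mathbb{R}^d}\rho_c(\tfrac12|v|^2)|v|\,dv\big)^{1/2}$. Then since $\vec X(\tilde r)$ is $\mathcal{F}_r$-measurable, by the definition of Poisson point processes and the definition of $\lambda$, we have

\[ E^{P_m}\Big[ \sup_{0\le t\le T} |(\mathrm{V}22)| \Big] \le \int_0^T ds\,\|f_V\|_\infty E^{P_m}\Big[ \Big| \int_{\mathbb{R}\times E} \nabla^2 U_i\big(X_i(\tilde r) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(\tilde r))\big)\, m^{1/2}z(m^{-1/2}(s-r); x, v, \vec X(\tilde r), \vec V(\tilde r), -m^{-1/2}(s-r))\,\big(\mu_\omega(dr, dx, dv) - \lambda(dr, dx, dv)\big) \Big|^2 \Big]^{1/2} \]

\[ = \int_0^T ds\,\|f_V\|_\infty \Big\{ \int_{\mathbb{R}\times E} E^{P_m}\Big[ \big( \nabla^2 U_i\big(X_i(\tilde r) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(\tilde r))\big)\, m^{1/2}z(m^{-1/2}(s-r); x, v, \vec X(\tilde r), \vec V(\tilde r), -m^{-1/2}(s-r)) \big)^2 \Big]\,\lambda(dr, dx, dv) \Big\}^{1/2} \]

\[ \le \int_0^T ds\,\|f_V\|_\infty \Big\{ \int_{\mathbb{R}\times E} (\|\nabla^2 U_i\|_\infty m^{1/2}C_3)^2\, \mathbf{1}_{[0,R_0+1)}(|x|)\mathbf{1}_{[0,2m^{1/2}\tau)}(|s-r|)\,\lambda(dr, dx, dv) \Big\}^{1/2} \le \hat C m^{1/4} \to 0, \qquad \text{as } m \to 0. \]

This completes the proof of Lemma 5.4.7.

Up to now, we have shown that all of the terms of $-(\mathrm{II})$ except $\sum_{i=1}^N \frac{1}{M_i}(\mathrm{V}1)$ are negligible. We are almost done with our discussion with respect to (II), except for getting rid of the term $\mathbf{1}_{[4m^{1/2}\tau,\sigma]}(s)$ in the definition of (V1). We do it now.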


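Notice that in the integral domain of (V1), we have $s \ge 4m^{1/2}\tau$. So if $\nabla^2 U_i(X_i(s) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(s))) \ne 0$, then $r \ge 2m^{1/2}\tau$. If $\rho(\tfrac12|v|^2 + \sum_{i=1}^N U_i(x - m^{-1/2}rv - X_{i,0})) \ne 0$ in addition, then $|v| \ge 2C_0 + 1$. Therefore, in this case, $|m^{-1/2}rv| \ge 2\tau(2C_0+1) \ge R_0$, hence since $x\cdot v = 0$, we get $|x - m^{-1/2}rv| \ge R_0$, which in turn gives us that $\rho(\tfrac12|v|^2 + \sum_{i=1}^N U_i(x - m^{-1/2}rv - X_{i,0})) = \rho(\tfrac12|v|^2)$. Therefore, by definition,

\[ (\mathrm{V}1) = \int_0^{t\wedge\sigma} ds\, f_V(X_s, Y_s)\mathbf{1}_{[4m^{1/2}\tau,\sigma]}(s) \int_{\mathbb{R}\times E} \nabla^2 U_i\big(X_i(s) - \psi^0(m^{-1/2}(s-r), x, v; \vec X(s))\big)\cdot m^{1/2}z(m^{-1/2}(s-r); x, v, \vec X(s), \vec V(s), -m^{-1/2}(s-r))\cdot m^{-1}\rho\big(\tfrac12|v|^2\big)\,dr\,\nu(dx, dv) \]

\[ = \int_0^{t\wedge\sigma} ds\, f_V(X_s, Y_s)\mathbf{1}_{[4m^{1/2}\tau,\sigma]}(s) \int_E \Big( \int_{-\infty}^{+\infty} du\, \nabla^2 U_i\big(X_i(s) - \psi^0(u, x, v; \vec X(s))\big)\, z(u; x, v, \vec X(s), \vec V(s), -u) \Big)\rho\big(\tfrac12|v|^2\big)\,\nu(dx, dv), \]

where when passing to the last equality, we used the change of variable $u = m^{-1/2}(s-r)$ for every $s$ fixed. With the help of this re-expression, we make a decomposition once more, $(\mathrm{V}1) = (\mathrm{V}11) + (\mathrm{V}12)$, with

\[ (\mathrm{V}11) = \int_0^{t\wedge\sigma} ds\, f_V(X_s, Y_s) \int_E \Big( \int_{-\infty}^{+\infty} du\, \nabla^2 U_i\big(X_i(s) - \psi^0(u, x, v; \vec X(s))\big)\, z(u; x, v, \vec X(s), \vec V(s), -u) \Big)\rho\big(\tfrac12|v|^2\big)\,\nu(dx, dv), \]

\[ (\mathrm{V}12) = -\int_0^{t\wedge\sigma} ds\, f_V(X_s, Y_s)\mathbf{1}_{[0,4m^{1/2}\tau]}(s) \int_E \Big( \int_{-\infty}^{+\infty} du\, \nabla^2 U_i\big(X_i(s) - \psi^0(u, x, v; \vec X(s))\big)\, z(u; x, v, \vec X(s), \vec V(s), -u) \Big)\rho\big(\tfrac12|v|^2\big)\,\nu(dx, dv). \]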


Notice that for $s \in [0, T\wedge\sigma]$, $\nabla^2 U_i(X_i(s) - \psi^0(u, x, v; \vec X(s))) \ne 0$ only if $|u| \le 2\tau$ and $|x| \le R_0 + 1$, and $z(u; x, v, \vec X(s), \vec V(s), -u)$ is bounded in this domain. So $\int_E \big(\int_{-\infty}^{+\infty} du\, \nabla^2 U_i(X_i(s) - \psi^0(u, x, v; \vec X(s)))\, z(u; x, v, \vec X(s), \vec V(s), -u)\big)\rho(\tfrac12|v|^2)\,\nu(dx, dv)$ is bounded. Let $C$ be a bound of it. Then $|(\mathrm{V}12)| \le 4C\tau\|f_V\|_\infty m^{1/2}$.

This completes the proof of (5.2), i.e. the fact that the term (II) is converging to $-\sum_{i=1}^N \frac{1}{M_i}(\mathrm{V}11)$ as $m \to 0$.
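The re-expression of (V1) rests on the change of variable $u = m^{-1/2}(s-r)$: the factor $m^{1/2}$ in $f_i^3$ and the factor $m^{-1}$ in the intensity combine to $m^{-1/2}\,dr = du$, so the inner integral no longer depends on $m$. The following Python sketch checks this scaling numerically for a generic integrand. The Gaussian `phi` is a hypothetical stand-in for $u \mapsto \nabla^2 U_i(\cdot)\,z(u; \cdot)$ and the function names are illustrative only.

```python
import math


def phi(u):
    """Hypothetical stand-in for the u-dependent integrand."""
    return math.exp(-u * u)


def integral_in_r(m, s, half_width=8.0, n=20000):
    """Midpoint rule for  m^{-1/2} * integral of phi(m^{-1/2} (s - r)) dr.

    The r-window [s - half_width*sqrt(m), s + half_width*sqrt(m)] corresponds
    to u in [-half_width, half_width].
    """
    scale = math.sqrt(m)
    lo = s - half_width * scale
    h = 2.0 * half_width * scale / n
    total = sum(phi((s - (lo + (k + 0.5) * h)) / scale) for k in range(n))
    return total * h / scale


def integral_in_u(half_width=8.0, n=20000):
    """Midpoint rule for the integral of phi(u) du over [-half_width, half_width]."""
    h = 2.0 * half_width / n
    return sum(phi(-half_width + (k + 0.5) * h) for k in range(n)) * h


if __name__ == "__main__":
    for m in (1.0, 1e-2, 1e-4):
        print(m, integral_in_r(m, s=0.3), integral_in_u())
```

The value computed in the `r` variable is the same for every `m` and agrees with the `u`-integral, which mirrors why (V11) carries no explicit dependence on $m$.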

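5.5. Conclusion

Combining the results of Secs. 5.1–5.4, and finally taking the limit $n \to \infty$ (notice that $\sigma_n \to \infty$ a.s.), we get Theorems 2.0.1(2) and 2.0.1(3).

Notice that this also gives us Lemma 3.5.3, by considering each time interval $[\eta_{n-1}, \xi_n]$, with $\eta_n$, $\xi_n$ given by the following:

\[ \eta_0 = 0, \qquad \xi_n = \inf\Big\{ t \ge \eta_{n-1};\ \vec X(t) \in B\Big(\operatorname{supp}\tilde U, \frac{\varepsilon_f}{2}\Big) \Big\}, \qquad \eta_n = \inf\{ t \ge \xi_n;\ \vec X(t) \notin B(\operatorname{supp}\tilde U, \varepsilon_f) \}, \qquad n \ge 1. \]

Here $\varepsilon_f > 0$ is chosen such that $\operatorname{supp} f \subset (B(\operatorname{supp}\tilde U, 2\varepsilon_f) \times \mathbb{R}^{dN})^C$.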

6. Case of Two Molecules

In this section, we consider the special case of two molecules with d ≥ 3 and spherically-symmetric potential functions U1, U2, as described in Theorem 2.0.1(4). Precisely, in addition to all of the assumptions in Secs. 3–5, we assume from now on that d ≥ 3 and there exist functions h1, h2 : [0, ∞) → R such that Ui(x) = hi(|x|), i = 1, 2, and, moreover, there exists a constant ε0 > 0 such that (−1)i−1 hi (s) > 0, (−1)i−1 hi (s) > 0, s ∈ (Ri − ε0, Ri), i = 1, 2. See Sec. 2 for the explanation of these assumptions. Without loss of generality, we assume that ε0 < R1 ∧ R2.

In the following, we show that in this special case, as announced in Sec. 2, as $m \to 0$, $\{(\vec X(t), \vec V(t))\}_t$ under $P_m$ converges to the reflecting diffusion process which has generator $L$ and acts as “colliding” when the potential ranges of the two molecules overlap. (See Theorem 6.3.2 for the precise definition of the limiting process.)

We first discuss a little bit more about the new potentials $\tilde U$. We then show that in our present setting, the condition of Lemma 3.5.2 is satisfied, and that when $m \to 0$, $\{$the distribution of $\{(\vec X(t\wedge\sigma_n), \vec V(t\wedge\sigma_n))\}_t$ under $P_m\}_m$ is tight in $\wp(\widetilde W^d)$ with the metric function $\widetilde{\mathrm{dis}}$ of $\widetilde W^d = C([0, \infty); \mathbb{R}^d) \times D([0, \infty); \mathbb{R}^d)$ given by

\[ \widetilde{\mathrm{dis}}(\omega_1, \omega_2) = \sum_{n=1}^\infty 2^{-n}\Big( 1 \wedge \max_{t\in[0,n]} |x_1(t) - x_2(t)| \Big) + \mathrm{dis}(v_1, v_2) \]

for $\omega_i = (x_i(\cdot), v_i(\cdot)) \in \widetilde W^d$, $i = 1, 2$. Here $\mathrm{dis}$ is the Skorohod metric on $D([0, \infty), \mathbb{R}^d)$ defined in Sec. 3.4. Finally, we use these to show the desired convergence.

6.1. The new potential $\tilde U$

Let $p$ and $\tilde U$ be as defined in Sec. 3.6, and let $\tilde U_0$ be the constant

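\[ \tilde U_0 = \int_{\mathbb{R}^d} \sum_{i=1}^2 \big( p(U_i(X_i - x)) - p(0) \big)\,dx, \]

which, as claimed in Sec. 3.6, is the value of $\tilde U(X_1, X_2)$ when $X_1$ and $X_2$ are far enough apart, precisely, when $|X_1 - X_2| \ge R_1 + R_2$. Then

\[ \tilde U(X_1, X_2) - \tilde U_0 = \int_{\mathbb{R}^d} \big\{ [p(U_1(X_1 - x) + U_2(X_2 - x)) - p(0)] - [(p(U_1(X_1 - x)) - p(0)) + (p(U_2(X_2 - x)) - p(0))] \big\}\,dx \]

\[ = \int_{\mathbb{R}^d} dx \Big( \int_0^{U_1(X_1-x)+U_2(X_2-x)} p'(s)\,ds - \int_0^{U_1(X_1-x)} p'(s)\,ds - \int_0^{U_2(X_2-x)} p'(s)\,ds \Big) \]

\[ = \int_{\mathbb{R}^d} dx \Big( \int_{U_2(X_2-x)}^{U_1(X_1-x)+U_2(X_2-x)} p'(s)\,ds - \int_0^{U_1(X_1-x)} p'(s)\,ds \Big) = \int_{\mathbb{R}^d} dx \int_0^{U_1(X_1-x)} \big( p'(s + U_2(X_2 - x)) - p'(s) \big)\,ds \]

\[ = \int_{\mathbb{R}^d} dx \int_0^{U_1(X_1-x)} ds \int_0^{U_2(X_2-x)} p''(s + u)\,du. \]

Therefore,

\[ \nabla_1 \tilde U(X_1, X_2) = \int_{\mathbb{R}^d} dx \int_0^{U_2(X_2-x)} p''(U_1(X_1 - x) + u)\,du\,\nabla U_1(X_1 - x). \tag{6.1} \]

Notice that the integrand in (6.1) is 0 outside of $B_2 = B_{X_1,X_2} = \{x \in \mathbb{R}^d;\ |x - X_1| \le R_1,\ |x - X_2| \le R_2\}$. Therefore,

\[ \nabla_1 \tilde U(X_1, X_2) = \int_{B_2} dx \int_0^{U_2(X_2-x)} p''(U_1(X_1 - x) + u)\,du\,\nabla U_1(X_1 - x). \tag{6.2} \]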


We will use this expression in the following calculations.

In this subsection, we show, by using the spherical symmetry of the potentials, that $\nabla_1 \tilde U(X_1, X_2)$ has the same direction as $X_2 - X_1$. Therefore, the term $-m^{-1/2}\int_0^{t\wedge\sigma} \nabla_i \tilde U(\vec X(s))\,ds$ in the decomposition (3.30) of $M_i V_i(t)$ gives us the reflecting force. First, we have the following:

Lemma 6.1.1. Let $\varepsilon \in (0, \varepsilon_0]$. Then there exists a $C_\varepsilon > 0$ such that for any $X_1, X_2 \in \mathbb{R}^d$ satisfying $|X_1 - X_2| \in [R_1 + R_2 - \varepsilon, R_1 + R_2 - \frac{\varepsilon}{2})$, we have that $\nabla_i \tilde U(X_1, X_2)$ is parallel to $X_2 - X_1$ in $\mathbb{R}^d$, and

\[ (X_1 - X_2)\cdot\nabla_1 \tilde U(X_1, X_2) \le -C_\varepsilon, \qquad (X_1 - X_2)\cdot\nabla_2 \tilde U(X_1, X_2) \ge C_\varepsilon. \]

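Proof. First notice that by assumption and (3.39), we have for any $\vec X = (X_1, X_2) \in \mathbb{R}^{2d}$

\[ \nabla_i \tilde U(\vec X) = \int_{\mathbb{R}^{2d}} \nabla U_i(X_i - x)\,\rho\big(\tfrac12|v|^2 + U_1(X_1 - x) + U_2(X_2 - x)\big)\,dx\,dv = \int_{\mathbb{R}^{2d}} \frac{X_i - x}{|X_i - x|}\,h_i'(|X_i - x|)\,\rho\big(\tfrac12|v|^2 + h_1(|X_1 - x|) + h_2(|X_2 - x|)\big)\,dx\,dv. \]

From this, it is easy to see that $\nabla_i \tilde U(\vec X)$ is parallel to $X_1 - X_2$ in $\mathbb{R}^d$.

For the second half of the lemma, since the proofs are similar, we only prove the first assertion. Notice that for any $x \in B_2$, since $|X_1 - X_2| \ge R_1 + R_2 - \varepsilon$, we have that $|X_1 - x| \ge R_1 - \varepsilon$, $|X_2 - x| \ge R_2 - \varepsilon$. By our assumption, $U_1(X_1 - x) = h_1(|X_1 - x|)$, $U_2(X_2 - x) = h_2(|X_2 - x|)$. Therefore, by (6.2),

\[ \nabla_1 \tilde U(X_1, X_2) = \int_{B_2} dx \int_0^{h_2(|X_2-x|)} p''(h_1(|X_1 - x|) + u)\,du\; h_1'(|X_1 - x|)\,\frac{X_1 - x}{|X_1 - x|}. \]

Notice that in this integral domain, since $\varepsilon \le \varepsilon_0 < R_1 \wedge R_2$, we have $(X_1 - X_2)\cdot\frac{X_1 - x}{|X_1 - x|} > 0$. By assumption, h1 (|X1 − x|) > 0,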

1 2 Xi − x hi (|Xi − x|)ρ |v| + h1 (|X1 − x|) + h2 (|X2 − x|) dxdv. = 2 R2d |Xi − x| (X) is parallel to X1 − X2 in Rd . From this, it is easy to see that ∇i U For the second half of the lemma, since the proofs are similar, we only prove the ﬁrst assertion. Notice that for any x ∈ B2 , since |X1 − X2 | ≥ R1 + R2 − ε, we have that |X1 − x| ≥ R1 − ε, |X2 − x| ≥ R2 − ε. By our assumption, U1 (X1 − x) = h1 (|X1 − x|), U2 (X2 − x) = h2 (|X2 − x|). Therefore, by (6.2), h2 (|X2 −x|) X1 − x (X1 , X2 ) = . ∇1 U dx p (h1 (|X1 − x|) + u)duh1 (|X1 − x|) |X 1 − x| 0 B2 Notice that in this integral domain, since ε ≤ ε0 < R1 ∧ R2 , we have (X1 − X2 ) · X1 −x |X1 −x| > 0. By assumption, h1 (|X1 − x|) > 0,

h2 (|X2 − x|) < 0,

h1 (|X1 − x|) < 0,

h2 (|X2 − x|) > 0.

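Also, since $d \ge 3$, we have by (3.42) that $p''(s) < 0$ for any $s < e_0$. Therefore, if we set

\[ \widetilde B_2 = \Big\{ x;\ |X_1 - x| \le R_1 - \frac{\varepsilon}{6},\ |X_2 - x| \le R_2 - \frac{\varepsilon}{6} \Big\} \subset B_2, \]

then

\[ -(X_1 - X_2)\cdot\nabla_1 \tilde U(X_1, X_2) \ge -\int_{\widetilde B_2} dx \int_0^{h_2(|X_2-x|)} p''(h_1(|X_1 - x|) + u)\,du\; h_1'(|X_1 - x|)\,(X_1 - X_2)\cdot\frac{X_1 - x}{|X_1 - x|}. \]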


We have by (3.42) that $-p''(s) > 0$ for any $|s| \le \|U_1\|_\infty + \|U_2\|_\infty$; also, $-p''(\cdot)$ is continuous in this closed interval. Therefore, there exists a constant $C_1 > 0$ such that $\inf\{-p''(s);\ |s| \le \|U_1\|_\infty + \|U_2\|_\infty\} \ge C_1$. Moreover, for any $x \in \widetilde B_\varepsilon$, we have that

\[ |X_1 - x| \ge |X_1 - X_2| - |X_2 - x| \ge (R_1 + R_2 - \varepsilon) - \Big( R_2 - \frac{\varepsilon}{6} \Big) = R_1 - \frac{5}{6}\varepsilon, \]

i.e. $|X_1 - x| \in [R_1 - \frac56\varepsilon, R_1 - \frac{\varepsilon}{6}]$. In the same way, $|X_2 - x| \in [R_2 - \frac56\varepsilon, R_2 - \frac{\varepsilon}{6}]$. So by assumption, there exists a constant $C_\varepsilon^1 > 0$ (which does not depend on $x$) such that h1 (|X1 − x|) ≥ Cε1 ,

h2 (|X2 − x|) ≤ −Cε1 ,

h1 (|X1 − x|) ≤ −Cε1 ,

h2 (|X2 − x|) ≥ Cε1 .

Also, we have that (X1 − X2 ) ·

(R1 + R2 − ε)(R1 − ε) X1 − x ≥ . |X1 − x| R1

Indeed, if we decompose X1 − x into + (x − x ) X1 − x = X1 − x with X1 − x X1 − X2 and x − x ⊥ X1 − X2 , then X2 − x = X2 − x + (x − x ) is 2 2 2 2 | + |x − x | , hence also an orthogonal decomposition. So R2 ≥ |X2 − x| = |X2 − x | ≤ R2 . Also, |X1 − X2 | ≥ R1 + R2 − ε, So |X1 − x | ≥ |X1 − X2 | − |X2 − x | ≥ |X2 − x (R1 + R2 − ε) − R2 = R1 − ε. Therefore, (X1 − X2 ) ·

|X1 − X2 | |X1 − x X1 − x (R1 + R2 − ε)(R1 − ε) | ≥ ≥ . |X1 − x| R1 R1

Combining these, we get that (X1 , X2 ) ≥ − −(X1 − X2 ) · ∇1 U

f2 B

dx 0

h2 (|X2 −x|)

p (h1 (|X1 − x|) + u)du

X1 − x |X1 − x| (R1 + R2 − ε)(R1 − ε) ≥ Cε1 C1 Cε1 dx, R1 fε B × h1 (|X1 − x|)(X1 − X2 ) ·

which gives us our ﬁrst assertion. As a direct corollary of Lemma 6.1.1, we have the following. Lemma 6.1.2. Let ε ∈ (0, ε0 ], and let X1 , X2 ∈ Rd satisfying |X1 − X2 | ∈ [R1 + R2 − ε, R1 + R2 ). Then we have that (X1 , X2 ) < 0, (X1 − X2 ) · ∇1 U

1 , X2 ) > 0. (X1 − X2 ) · ∇2 U(X

August 10, J070-S0129055X10004077

2010 15:0 WSPC/S0129-055X

148-RMP

Classical Mechanical Model of Brownian Motion with Plural Particles

825

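We also have the following as an easy corollary of Lemma 6.1.1.

Corollary 6.1.3. Assume that $t_1, t_2 \in [0, \sigma_n]$ satisfy $|t_1 - t_2| \le \frac{\varepsilon}{4n}$, and $|X_1(t_1) - X_2(t_1)| \in [R_1 + R_2 - \varepsilon, R_1 + R_2 - \frac{\varepsilon}{2})$. Then
\[
-(X_1(t_2) - X_2(t_2))\cdot\nabla_1 U(X_1(t_1), X_2(t_1)) \ge C_\varepsilon\Big(1 - \frac{\varepsilon/2}{R_1 + R_2 - \varepsilon}\Big) .
\]

Proof. By using the general fact that $\frac{(a, b)}{|b|^2} \ge 1 - \frac{|a - b|}{|b|}$ for any $a, b \in \mathbf{R}^d$ with $b \ne 0$, we get by Lemma 6.1.1 that for any $(\widetilde{X}_1, \widetilde{X}_2)$ with $|(\widetilde{X}_1 - \widetilde{X}_2) - (X_1 - X_2)| < |X_1 - X_2|$, we have
\[
\begin{aligned}
-(\widetilde{X}_1 - \widetilde{X}_2)\cdot\nabla_1 U(X_1, X_2) &= -(X_1 - X_2)\cdot\nabla_1 U(X_1, X_2)\,\frac{(\widetilde{X}_1 - \widetilde{X}_2,\ X_1 - X_2)}{|X_1 - X_2|^2} \\
&\ge C_\varepsilon\Big(1 - \frac{|(\widetilde{X}_1 - \widetilde{X}_2) - (X_1 - X_2)|}{|X_1 - X_2|}\Big) .
\end{aligned}
\]
Under our assumption, we have $|X_1(t_1) - X_1(t_2)| \le n|t_1 - t_2| \le \frac{\varepsilon}{4}$, and similarly $|X_2(t_1) - X_2(t_2)| \le \frac{\varepsilon}{4}$. Therefore, by the argument above,
\[
\begin{aligned}
-(X_1(t_2) - X_2(t_2))\cdot\nabla_1 U(X_1(t_1), X_2(t_1)) &\ge C_\varepsilon\Big(1 - \frac{|(X_1(t_2) - X_2(t_2)) - (X_1(t_1) - X_2(t_1))|}{|X_1(t_1) - X_2(t_1)|}\Big) \\
&\ge C_\varepsilon\Big(1 - \frac{\varepsilon/2}{R_1 + R_2 - \varepsilon}\Big) .
\end{aligned}
\]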
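The general fact used in this proof, $(a, b)/|b|^2 \ge 1 - |a - b|/|b|$ for $b \ne 0$, follows from $(a, b) = |b|^2 + (a - b, b)$ together with the Cauchy–Schwarz inequality. The following is only a numerical sanity check, sketched in plain Python; the helper names `inequality_gap` and `worst_gap` are illustrative and not part of the original argument.

```python
import math
import random


def dot(a, b):
    """Euclidean inner product (a, b)."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean norm |a|."""
    return math.sqrt(dot(a, a))


def inequality_gap(a, b):
    """Return (a, b)/|b|^2 - (1 - |a - b|/|b|); the claim is that this is >= 0."""
    diff = [x - y for x, y in zip(a, b)]
    return dot(a, b) / dot(b, b) - (1.0 - norm(diff) / norm(b))


def worst_gap(dim=3, trials=10000, seed=0):
    """Smallest gap found over random pairs (a, b) in R^dim (hypothetical harness)."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        a = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        b = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        if norm(b) < 1e-3:
            # The ratio is undefined at b = 0; skip near-zero b to avoid round-off noise.
            continue
        worst = min(worst, inequality_gap(a, b))
    return worst
```

Equality holds exactly when $a - b$ is a non-positive multiple of $b$, which is why the gap vanishes for $a = b$.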

6.2. Tightness

As before, we only need to argue under the condition $|V_i| \le n$, i.e. use $t \wedge \sigma_n$ instead of $t$, and finally take $n \to \infty$.

We first show that the condition of Lemma 3.5.2 is satisfied. For any $X = (X_1, X_2) \in \mathbf{R}^{2d}$ with $|X_1 - X_2| < R_1 + R_2$ big enough, by Lemma 6.1.1, we have that $\nabla_i U(X)$ is parallel to $X_1 - X_2$ in $\mathbf{R}^d$, and by Lemma 6.1.2, $\nabla_1 U(X)$ has the opposite direction to $X_1 - X_2$, and $\nabla_2 U(X)$ has the same direction as $X_1 - X_2$. Therefore, let $g_1(X) = \frac{X_2 - X_1}{|X_2 - X_1|}$, $g_2(X) = \frac{X_1 - X_2}{|X_1 - X_2|}$, and let $\bar{D} = \{(X_1, X_2)\ |\ |X_i| \le |X_{i,0}| + nT,\ |X_1 - X_2| \ge R_1 + R_2 - \varepsilon_0\}$. Then since $R_1 + R_2 - \varepsilon_0 > 0$, we have that $g_1, g_2 \in C_b^1(\bar{D})$ and $g_i(X)\cdot\nabla_i U(X) = |\nabla_i U(X)|$ for any $X \in \bar{D}$, i.e. the condition of Lemma 3.5.2 is satisfied.

We next give a brief proof of the tightness of {the distribution of $\{(X(t \wedge \sigma_n), V(t \wedge \sigma_n))\}_t$ under $P_m$}$_m$ as $m \to 0$. The only difficulty is the assertion with respect to $V(\cdot)$. We deal with it from now on. Let $A_k = \{Y_t : \int_0^T |dY_t| \le k\}$, $k \in \mathbf{N}$. Then we have by Kusuoka [9, Corollary 8] that $A_k$ is compact in $L^p([0, T]; \mathbf{R}^d)$ with cluster points in $D([0, T]; \mathbf{R}^d)$ for any $k \in \mathbf{N}$. Also, by Lemma 3.5.2(1), there exists a constant $C > 0$ such that
\[
\begin{aligned}
P_m \circ \Big(m^{-1/2}\int_0^{\cdot\wedge\sigma_n} \nabla_i U(X(s))\,ds\Big)^{-1}(A_k)
&= 1 - P_m\Big(\int_0^{T\wedge\sigma} m^{-1/2}|\nabla_i U(X(s))|\,ds > k\Big) \\
&\ge 1 - \frac1k E^{P_m}\Big[\int_0^{T\wedge\sigma} m^{-1/2}|\nabla_i U(X(s))|\,ds\Big] \\
&\ge 1 - \frac{C}{k},
\end{aligned}
\]
which converges to 1 as $k \to \infty$. Therefore, $\{\{m^{-1/2}\int_0^{t\wedge\sigma_n} \nabla_i U(X(s))\,ds\}_t$ under $P_m\}_{m\in(0,1]}$ is tight in $\wp(D([0, T]; \mathbf{R}^d))$. Therefore, since by Lemma 3.5.1,
\[
M_i(V_i(t \wedge \sigma) - V_i(0)) = M_i(t) + \eta_i(t) + P_i^{*1}(t) - m^{-1/2}\int_0^{t\wedge\sigma} \nabla_i U(X(s))\,ds,
\]
and the distributions of $M_i(t) + \eta_i(t)$ and $P_i^{*1}$ under $P_m$ are tight in $\wp(D([0, T]; \mathbf{R}^d))$, we get the conclusion that $\{\{V_i(t \wedge \sigma_n)\}_t$ under $P_m\}_{m\to0}$ is tight in $\wp(D([0, T]; \mathbf{R}^d))$.

6.3. Convergence to a Markov process

The idea is similar to that presented by Kusuoka in [9]. Let us first recall the following existence and uniqueness theorem of Kusuoka [9, Theorem 1].

Let $D$ be a bounded domain in $\mathbf{R}^d$ with a smooth boundary $\partial D$ and let $n(x)$, $x \in \partial D$, be the outer normal vector at $x \in \partial D$. Let
\[
L_0 = \sum_{i=1}^d v^i \frac{\partial}{\partial x^i} + \frac12 \sum_{i,j=1}^d a^{ij}(x)\frac{\partial^2}{\partial v^i \partial v^j} + \sum_{i=1}^d b^i(x, v)\frac{\partial}{\partial v^i},
\]
where $a^{ij}: \mathbf{R}^d \to \mathbf{R}$, $i, j = 1, \ldots, d$, are smooth functions, symmetric with respect to $i, j$ and uniformly elliptic with respect to $x$, and $b^i: \mathbf{R}^{2d} \to \mathbf{R}$, $i = 1, \ldots, d$, are bounded measurable functions. Let $\Phi: \mathbf{R}^d \times \partial D \to \mathbf{R}^d$ be a smooth map satisfying the following:
(1) $\Phi(\cdot, x): \mathbf{R}^d \to \mathbf{R}^d$ is linear for all $x \in \partial D$,
(2) $\Phi(v, x) = v$ for any $x \in \partial D$ and $v \in T_x(\partial D)$, i.e. $\Phi(v, x) = v$ if $x \in \partial D$, $v \in \mathbf{R}^d$ and $v \cdot n(x) = 0$,
(3) $\Phi(\Phi(v, x), x) = v$ for all $v \in \mathbf{R}^d$ and $x \in \partial D$,
(4) $\Phi(n(x), x) \ne n(x)$ for any $x \in \partial D$.


Then Kusuoka [9, Theorem 1] proved the following:

Theorem 6.3.1. Let $(x_0, v_0) \in (\bar{D})^C \times \mathbf{R}^d$. Then there exists a unique probability measure $\mu$ over $\widetilde{W}^d$ satisfying the following:
(1) $\mu(\omega(0) = (x_0, v_0)) = 1$,
(2) $\mu(\omega(t) \in D^C \times \mathbf{R}^d,\ t \in [0, \infty)) = 1$,
(3) For any $f \in C_0^\infty((\bar{D})^C \times \mathbf{R}^d)$, $\{f(\omega(t)) - \int_0^t L_0 f(\omega(s))\,ds;\ t \ge 0\}$ is a martingale under $\mu$,
(4) $\mu(1_{\partial D}(x(t))(v(t) - \Phi(v(t-), x(t))) = 0$ for all $t \in [0, \infty)) = 1$.
Here $\omega(\cdot) = (x(\cdot), v(\cdot)) \in \widetilde{W}^d$.

By using this, we get the following slight variation. Recall that $D_0 = \{(X_1, X_2) \in \mathbf{R}^{2d};\ |X_1 - X_2| > R_1 + R_2\}$ in our present setting.

Theorem 6.3.2. There exists a unique probability measure $P_{\infty,0}$ over $D([0, \infty); \mathbf{R}^{4d})$ satisfying the following.
(1) $P_{\infty,0}(\omega(0) = (x_0, v_0)) = 1$,
(2) $P_{\infty,0}(X(t) \in \bar{D}_0,\ t \in [0, \infty)) = 1$,
(3) For any $f \in C_0^\infty(D_0 \times \mathbf{R}^{2d})$, $\{f(X(t), V(t)) - \int_0^t (Lf)(X(s), V(s))\,ds;\ t \ge 0\}$ is a martingale under $P_{\infty,0}$,
(4) If $f \in C_0^\infty(\mathbf{R}^{4d})$ satisfies
\[
M_1^{-1}(\nabla_{v_1} f)(x, v)\cdot(x_1 - x_2) + M_2^{-1}(\nabla_{v_2} f)(x, v)\cdot(x_2 - x_1) = 0 \tag{6.3}
\]
for any $(x, v) \in \partial D_0 \times \mathbf{R}^{2d}$, then $f(X(t), V(t))$ is continuous in $t$, $P_{\infty,0}$-a.s.,
(5) $M_1|V_1(t)|^2 + M_2|V_2(t)|^2$ is continuous in $t$, $P_{\infty,0}$-a.s.

Proof. We define $\Phi(v, x)$, $(v, x) = (v_1, v_2, x_1, x_2) \in \mathbf{R}^{4d}$, in the following way: for any such $v_1, v_2, x_1, x_2 \in \mathbf{R}^d$, decompose $v_1$ and $v_2$ into $v_i = u_i + w_i$ with $u_i \perp x_1 - x_2$ and $w_i \parallel x_1 - x_2$, $i = 1, 2$, and let $\Phi(v, x) = (\Phi_1(v, x), \Phi_2(v, x))$ with
\[
\Phi_1(v, x) = u_1 + \frac{M_1 - M_2}{M_1 + M_2} w_1 + \frac{2M_2}{M_1 + M_2} w_2, \qquad
\Phi_2(v, x) = u_2 + \frac{2M_1}{M_1 + M_2} w_1 + \frac{M_2 - M_1}{M_1 + M_2} w_2 .
\]
Then $\Phi$ satisfies the conditions before Theorem 6.3.1.

We first check the fact that a probability $\mu$ satisfying the conditions (1)–(4) of Theorem 6.3.1 with $\Phi$ given above also satisfies conditions (1)–(5) of Theorem 6.3.2. All except (4) are trivial. For (4), it suffices to show that $f(x, \Phi(v, x)) = f(x, v)$ for any $x \in \partial D_0$ if $f$ satisfies (6.3). We show it in the following. Since
\[
\Phi_1(v, x) - v_1 = \frac{2M_2}{M_1 + M_2}(w_2 - w_1), \qquad \Phi_2(v, x) - v_2 = \frac{2M_1}{M_1 + M_2}(w_1 - w_2),
\]
we have
\[
\begin{aligned}
f(x, \Phi(v, x)) - f(x, v) &= \int_0^1 \big[\nabla_{v_1} f(x, v + t(\Phi(v, x) - v))\cdot(\Phi_1(v, x) - v_1) \\
&\qquad\quad + \nabla_{v_2} f(x, v + t(\Phi(v, x) - v))\cdot(\Phi_2(v, x) - v_2)\big]\,dt \\
&= -\frac{2M_1 M_2}{M_1 + M_2}\int_0^1 \big[-M_1^{-1}\nabla_{v_1} f + M_2^{-1}\nabla_{v_2} f\big](x, v + t(\Phi(v, x) - v))\cdot(w_2 - w_1)\,dt \\
&= 0,
\end{aligned}
\]
where in the last line we used (6.3) and the fact that $w_2 - w_1 \parallel x_2 - x_1$.

For the opposite direction, i.e. the fact that a probability $\mu$ satisfying the conditions (1)–(5) of Theorem 6.3.2 also satisfies conditions (1)–(4) of Theorem 6.3.1 with $\Phi$ given above, we only need to check that (4) of Theorem 6.3.1 is satisfied, or equivalently, show that $V(\sigma) = \Phi(V(\sigma-), X(\sigma-))$ if $X(\sigma) \in \partial D_0$. Choose any $w \in \mathbf{R}^d$ and fix it for a while. Let $f(x, v) = M_1 v_1\cdot w + M_2 v_2\cdot w$. Then $f$ satisfies (6.3), so by (4) of Theorem 6.3.2, $f(X(t), V(t))$ is continuous in $t$. We write it down together with (5) of Theorem 6.3.2:
\[
M_1 V_1(t) + M_2 V_2(t) \text{ is continuous in } t, \qquad M_1 V_1^2(t) + M_2 V_2^2(t) \text{ is continuous in } t .
\]
Solving these two equations, we get that either
\[
\Phi_1(V(\sigma-), X(\sigma))\cdot w = V_1(\sigma)\cdot w \quad\text{and}\quad \Phi_2(V(\sigma-), X(\sigma))\cdot w = V_2(\sigma)\cdot w \tag{6.4}
\]
or
\[
V_1(\sigma-)\cdot w = V_1(\sigma)\cdot w \quad\text{and}\quad V_2(\sigma-)\cdot w = V_2(\sigma)\cdot w . \tag{6.5}
\]
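The dichotomy between (6.4) and (6.5) reflects an elementary fact: along the line of centres, conservation of momentum and of kinetic energy leaves only two possibilities, namely no change of velocity or the elastic-collision update encoded by $\Phi$. The sketch below, in plain Python, implements $\Phi$ as defined in the proof and exposes the two conserved quantities. The function names, and any masses and vectors one feeds in, are illustrative assumptions and not taken from the paper.

```python
import math


def dot(a, b):
    """Euclidean inner product."""
    return sum(p * q for p, q in zip(a, b))


def split(v, e):
    """Split v into (u, w): u orthogonal to the unit vector e, w parallel to e."""
    c = dot(v, e)
    w = [c * x for x in e]
    u = [p - q for p, q in zip(v, w)]
    return u, w


def phi(v1, v2, x1, x2, m1, m2):
    """Phi(v, x) from the proof of Theorem 6.3.2; requires x1 != x2."""
    d = [p - q for p, q in zip(x1, x2)]
    length = math.sqrt(dot(d, d))
    e = [p / length for p in d]  # unit vector along x1 - x2
    u1, w1 = split(v1, e)
    u2, w2 = split(v2, e)
    s = m1 + m2
    new1 = [a + (m1 - m2) / s * b + 2.0 * m2 / s * c for a, b, c in zip(u1, w1, w2)]
    new2 = [a + 2.0 * m1 / s * b + (m2 - m1) / s * c for a, b, c in zip(u2, w1, w2)]
    return new1, new2


def momentum(v1, v2, m1, m2):
    """Total momentum M1 v1 + M2 v2."""
    return [m1 * a + m2 * b for a, b in zip(v1, v2)]


def energy(v1, v2, m1, m2):
    """M1 |v1|^2 + M2 |v2|^2, the quantity in condition (5)."""
    return m1 * dot(v1, v1) + m2 * dot(v2, v2)
```

Applying `phi` twice returns the original velocities, which is condition (3) before Theorem 6.3.1, and velocities with $w_1 = w_2$ are left unchanged.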

If $w$ is orthogonal to $X_1(\sigma) - X_2(\sigma)$, then these two conditions are equivalent, so both of them hold, which means that there is no jump at time $\sigma$ in any of these directions. Now, the only thing left to be checked is that (6.4) also holds for any $w \parallel X_1(\sigma) - X_2(\sigma)$. If not, then (6.5) holds, so $V_i(\sigma) = V_i(\sigma-)$ for $i = 1, 2$. Since
\[
\frac{d}{dt}(X_1(t) - X_2(t))^2 = 2(X_1(t) - X_2(t))\cdot(V_1(t) - V_2(t)),
\]
this implies that
\[
\frac{d}{dt}(X_1(t) - X_2(t))^2\Big|_{t=\sigma-} = \frac{d}{dt}(X_1(t) - X_2(t))^2\Big|_{t=\sigma} . \tag{6.6}
\]


If (6.6) is equal to 0, then $V_1(\sigma-) - V_2(\sigma-)$ is orthogonal to $X_1(\sigma) - X_2(\sigma)$, so by the definition of $\Phi$ this implies that $\Phi(V(\sigma-), X(\sigma)) = V(\sigma-)$, which combined with our assumption implies that $\Phi(V(\sigma-), X(\sigma)) = V(\sigma)$, the very equation that we need. If (6.6) is not equal to 0, write it as $C \in \mathbf{R}$; then by the continuity of $\frac{d}{dt}(X_1(t) - X_2(t))^2|_{t=\sigma}$, there exists an $\varepsilon > 0$ small enough such that either $(X_1(\sigma - \varepsilon) - X_2(\sigma - \varepsilon))^2$ or $(X_1(\sigma + \varepsilon) - X_2(\sigma + \varepsilon))^2$ is less than $(X_1(\sigma) - X_2(\sigma))^2 - \frac{|C|}{2}\cdot\varepsilon = (R_1 + R_2)^2 - \frac{|C|}{2}\cdot\varepsilon$. This contradicts the condition (2). Therefore, (6.4), i.e. (4) of Theorem 6.3.1, holds.

We have already shown in Sec. 6.2 that $\{\{(X(t \wedge \sigma_n), V(t \wedge \sigma_n))\}_t$ under $P_m\}_{m\to0}$ is tight. We show from now on that any cluster point of it satisfies all of the conditions of Theorem 6.3.2. The fact that any of its cluster points satisfies (1) is trivial. The fact that it satisfies (3) is nothing but Lemma 3.5.3. So we only need to show that (2), (4) and (5) are also satisfied.

We show (2) first. Choose an arbitrary $\varepsilon > 0$ and fix it for a while. Let
\[
\xi = \xi_\varepsilon = \inf\Big\{t > 0;\ |X_1(t) - X_2(t)| \le R_1 + R_2 - \frac34\varepsilon\Big\} \wedge \sigma_n \wedge T .
\]
Then (2) is implied by the following.

Lemma 6.3.3. Let $\varepsilon \in (0, \varepsilon_0]$ and let $\xi$ be as defined above. Then
\[
\lim_{m\to0} P_m(\xi < T \wedge \sigma_n) = 0 .
\]

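This result is easy to imagine: as $m \to 0$, $m^{-1/2} \to \infty$, so by Corollary 6.1.3, the term $-m^{-1/2}\int_0^{t\wedge\sigma} \nabla_i U(X(s))\,ds$ in the decomposition of $M_i V_i(t)$ gives us a very strong force as soon as the distance between the two molecules is less than $R_1 + R_2$.

Proof. Notice that if $\xi < T \wedge \sigma_n$, then $|X_1(\xi) - X_2(\xi)| = R_1 + R_2 - \frac34\varepsilon$, hence
\[
|X_1(t) - X_2(t)| \in \Big[R_1 + R_2 - \varepsilon,\ R_1 + R_2 - \frac{\varepsilon}{2}\Big) \quad\text{for any } t \in \Big[\xi - \frac{\varepsilon}{8n},\ \xi\Big] .
\]
We have by Ito's formula and Lemma 3.5.1 that
\[
\begin{aligned}
|X_1(t) - X_2(t)|^2 = |X_1(0) - X_2(0)|^2 + 2\int_0^t (X_1(s) - X_2(s))\cdot\Big[&M_1(s) - M_2(s) + \eta_1(s) - \eta_2(s) + P_1^{*1}(s) - P_2^{*1}(s) \\
&- m^{-1/2}\int_0^s \big(\nabla_1 U(X_1(u), X_2(u)) - \nabla_2 U(X_1(u), X_2(u))\big)\,du\Big]\,ds,
\end{aligned}
\]
so
\[
\begin{aligned}
&\Big(R_1 + R_2 - \frac34\varepsilon\Big)^2 - (R_1 + R_2 - \varepsilon)^2 \\
&\quad\ge |X_1(\xi) - X_2(\xi)|^2 - \Big|X_1\Big(\xi - \frac{\varepsilon}{8n}\Big) - X_2\Big(\xi - \frac{\varepsilon}{8n}\Big)\Big|^2 \\
&\quad= 2\int_{\xi-\frac{\varepsilon}{8n}}^{\xi} (X_1(s) - X_2(s))\cdot\Big[M_1(s) - M_2(s) + \eta_1(s) - \eta_2(s) + P_1^{*1}(s) - P_2^{*1}(s) \\
&\qquad\quad - m^{-1/2}\int_0^{\xi-\frac{\varepsilon}{8n}} \big(\nabla_1 U(X_1(u), X_2(u)) - \nabla_2 U(X_1(u), X_2(u))\big)\,du \\
&\qquad\quad - m^{-1/2}\int_{\xi-\frac{\varepsilon}{8n}}^{s} \big(\nabla_1 U(X_1(u), X_2(u)) - \nabla_2 U(X_1(u), X_2(u))\big)\,du\Big]\,ds \\
&\quad\ge -2\Big(R_1 + R_2 - \frac{\varepsilon}{2}\Big)\int_{\xi-\frac{\varepsilon}{8n}}^{\xi} \Big[|M_1(s)| + |M_2(s)| + |\eta_1(s)| + |\eta_2(s)| + |P_1^{*1}(s)| + |P_2^{*1}(s)| \\
&\qquad\quad + m^{-1/2}\int_0^{T\wedge\sigma_n} \big(|\nabla_1 U(X_1(u), X_2(u))| + |\nabla_2 U(X_1(u), X_2(u))|\big)\,du\Big]\,ds \\
&\qquad + 2m^{-1/2}\int_{\xi-\frac{\varepsilon}{8n}}^{\xi} ds \int_{\xi-\frac{\varepsilon}{8n}}^{s} \big[-(X_1(s) - X_2(s))\cdot\big(\nabla_1 U(X_1(u), X_2(u)) - \nabla_2 U(X_1(u), X_2(u))\big)\big]\,du .
\end{aligned} \tag{6.7}
\]
Let $C_1 = (R_1 + R_2 - \varepsilon)^2 - (R_1 + R_2 - \frac34\varepsilon)^2$ and $C_2 = (\frac{\varepsilon}{8n})^2 C_\varepsilon\big(1 - \frac{\varepsilon/2}{R_1 + R_2 - \varepsilon}\big) > 0$, where $C_\varepsilon$ is the constant given in Lemma 6.1.1 and Corollary 6.1.3. Notice that $C_1$ and $C_2$ depend only on $R_1 + R_2$, $\varepsilon$ and $n$, and do not depend on $m$. Also, write $Y_s = |M_1(s)| + |M_2(s)| + |\eta_1(s)| + |\eta_2(s)| + |P_1^{*1}(s)| + |P_2^{*1}(s)|$. Then with the help of Corollary 6.1.3, (6.7) implies that, if $\xi < T \wedge \sigma_n$,
\[
\begin{aligned}
&2\Big(R_1 + R_2 - \frac{\varepsilon}{2}\Big)\int_{\xi-\frac{\varepsilon}{8n}}^{\xi} Y_s\,ds + \frac{\varepsilon}{4n}\Big(R_1 + R_2 - \frac{\varepsilon}{2}\Big) \int_0^{T\wedge\sigma_n} m^{-1/2}\big(|\nabla_1 U(X_1(u), X_2(u))| + |\nabla_2 U(X_1(u), X_2(u))|\big)\,du \\
&\quad\ge (R_1 + R_2 - \varepsilon)^2 - \Big(R_1 + R_2 - \frac34\varepsilon\Big)^2 \\
&\qquad + 2m^{-1/2}\int_{\xi-\frac{\varepsilon}{8n}}^{\xi} ds \int_{\xi-\frac{\varepsilon}{8n}}^{s} \big[-(X_1(s) - X_2(s))\cdot\big(\nabla_1 U(X_1(u), X_2(u)) - \nabla_2 U(X_1(u), X_2(u))\big)\big]\,du \\
&\quad\ge C_1 + 2m^{-1/2} C_\varepsilon\Big(1 - \frac{\varepsilon/2}{R_1 + R_2 - \varepsilon}\Big)\int_{\xi-\frac{\varepsilon}{8n}}^{\xi} ds \int_{\xi-\frac{\varepsilon}{8n}}^{s} du \\
&\quad= C_1 + 2m^{-1/2} C_\varepsilon\Big(1 - \frac{\varepsilon/2}{R_1 + R_2 - \varepsilon}\Big)\frac12\Big(\frac{\varepsilon}{8n}\Big)^2 \\
&\quad= C_1 + m^{-1/2} C_2 .
\end{aligned}
\]
Let
\[
C_3 = \sup_{m\in(0,1]}\Big\{2\int_0^T E^{P_m}[Y_s]\,ds + \frac{\varepsilon}{4n}\sum_{i=1}^2 E^{P_m}\Big[\int_0^{T\wedge\sigma_n} m^{-1/2}|\nabla_i U(X_1(u), X_2(u))|\,du\Big]\Big\},
\]
which is finite by Lemmas 3.5.1 and 3.5.2. Then the above implies that
\[
\begin{aligned}
P_m(\xi < T \wedge \sigma_n) &\le P_m\Big(2\Big(R_1 + R_2 - \frac{\varepsilon}{2}\Big)\int_{\xi-\frac{\varepsilon}{8n}}^{\xi} Y_s\,ds + \frac{\varepsilon}{4n}\Big(R_1 + R_2 - \frac{\varepsilon}{2}\Big) \\
&\qquad\quad \times \int_0^{T\wedge\sigma_n} m^{-1/2}\big(|\nabla_1 U(X_1(u), X_2(u))| + |\nabla_2 U(X_1(u), X_2(u))|\big)\,du \ge C_1 + m^{-1/2} C_2\Big) \\
&\le \frac{1}{C_1 + m^{-1/2} C_2} E^{P_m}\Big[2\Big(R_1 + R_2 - \frac{\varepsilon}{2}\Big)\int_{\xi-\frac{\varepsilon}{8n}}^{\xi\wedge\sigma_n} Y_s\,ds \\
&\qquad\quad + \frac{\varepsilon}{4n}\Big(R_1 + R_2 - \frac{\varepsilon}{2}\Big)\int_0^{T\wedge\sigma_n} m^{-1/2}\big(|\nabla_1 U(X_1(u), X_2(u))| + |\nabla_2 U(X_1(u), X_2(u))|\big)\,du\Big] \\
&\le \frac{1}{C_1 + m^{-1/2} C_2}\Big(R_1 + R_2 - \frac{\varepsilon}{2}\Big) C_3,
\end{aligned}
\]
which converges to 0 as $m \to 0$. This completes the proof of our assertion.

We next show that the condition (5) of Theorem 6.3.2 is satisfied, i.e. $M_1|V_1(t)|^2 + M_2|V_2(t)|^2$ is continuous in $t$ almost surely, under any limit probability.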


We first prepare the following:

Lemma 6.3.4. $-\nabla_1 U(Y_1, Y_2)\cdot\frac{Y_1 - Y_2}{|Y_1 - Y_2|}$ is monotone non-increasing with respect to $|Y_1 - Y_2|$ for $|Y_1 - Y_2| \in [R_1 + R_2 - \varepsilon_0, R_1 + R_2]$.

Proof. As in the proof of Lemma 6.1.1, by (6.2), we have that in our present setting,
\[
-\nabla_1 U(Y_1, Y_2)\cdot\frac{Y_1 - Y_2}{|Y_1 - Y_2|} = -\int_{B_{Y_1,Y_2}} dx \int_0^{h_2(|Y_2 - x|)} p\big(h_1(|Y_1 - x|) + u\big)\,du \times h_1'(|Y_1 - x|)\,\frac{Y_1 - x}{|Y_1 - x|}\cdot\frac{Y_1 - Y_2}{|Y_1 - Y_2|} .
\]
Let $\widehat{B}_{Y_1,Y_2} = \{(s, t)\ |\ \exists x \in B_{Y_1,Y_2},\ s = |Y_1 - x|,\ t = |Y_2 - x|\}$, and for any $(s, t) \in \widehat{B}_{Y_1,Y_2}$, let $\alpha, \beta, \theta$ be the angles between $Y_1Y_2$ and $Y_1x$, $Y_2Y_1$ and $Y_2x$, $xY_1$ and $xY_2$, respectively. Write $A = |Y_1 - Y_2|$. Finally, let $l(s, t)$ denote the length of the hypercircle $\{x \in \mathbf{R}^d;\ |Y_1 - x| = s,\ |Y_2 - x| = t\}$ in $\mathbf{R}^{d-2}$. Then by using a change of variables,
\[
-\nabla_1 U(Y_1, Y_2)\cdot\frac{Y_1 - Y_2}{|Y_1 - Y_2|} = \int_{\widehat{B}_{Y_1,Y_2}} ds\,dt \int_{h_2(t)}^{0} \big(-p(h_1(s) + u)\big)\,du\,\big(-h_1'(s)\big)\,l(s, t)\cos\alpha\sin\theta .
\]
Notice that all of the terms above are positive. The integration domain $\widehat{B}_{Y_1,Y_2}$ is decreasing with respect to $|Y_1 - Y_2|$. Also, for any fixed $s$ and $t$, the term $l(s, t)$ is decreasing with respect to $|Y_1 - Y_2|$. Therefore, it is sufficient to show that for any fixed $s, t$, $\cos\alpha\sin\theta$ is decreasing with respect to $A = |Y_1 - Y_2|$. We show this from now on.

By the sine formula, $\cos\alpha\sin\theta = \frac{A}{t}\sin\alpha\cos\alpha$. So it suffices to show that $A\sin\alpha\cos\alpha$ is monotone decreasing with respect to $A$, or equivalently, is monotone increasing with respect to $\alpha$, for $\alpha > 0$ small enough. It is easy to see that $A = s\cos\alpha + \sqrt{t^2 - s^2\sin^2\alpha}$. So
\[
\begin{aligned}
A\sin\alpha\cos\alpha &= s\sin\alpha\cos^2\alpha + \sqrt{t^2 - s^2\sin^2\alpha}\,\sin\alpha\cos\alpha \\
&= s\sin\alpha(1 - \sin^2\alpha) + \sqrt{(t^2 - s^2\sin^2\alpha)(1 - \sin^2\alpha)\sin^2\alpha} .
\end{aligned}
\]
Since $\alpha > 0$ is small enough, $\sin^2\alpha > 0$ is small enough and monotone increasing with respect to $\alpha$. Also, since $s/t$ is near to $\frac{R_1}{R_2}$ ($> 0$), there exists an $\varepsilon_1 > 0$ such that the functions $f_1(x) = sx(1 - x^2)$ and $f_2(x) = (t^2 - s^2 x)(1 - x)x = s^2 x(x - 1)(x - \frac{t^2}{s^2})$ are monotone increasing in $x \in [0, \varepsilon_1]$. Combining these, we get the desired property that $A\sin\alpha\cos\alpha$ is increasing with respect to $\alpha$ for $\alpha > 0$ small enough, or equivalently, decreasing with respect to $A$. This completes the proof of our assertion.
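The triangle identities used in this proof can be checked numerically. The sketch below, in plain Python, takes the triangle with vertices $Y_1$, $Y_2$, $x$ and side lengths $s = |Y_1 - x|$, $t = |Y_2 - x|$, $A = |Y_1 - Y_2|$, and evaluates $A$ and $\cos\alpha\sin\theta$ as functions of $\alpha$. The function names and the sample values $s = 1$, $t = 1.5$ are illustrative assumptions only.

```python
import math


def side_A(s, t, alpha):
    """A = |Y1 - Y2| from the law of cosines, with alpha the angle at Y1."""
    return s * math.cos(alpha) + math.sqrt(t * t - (s * math.sin(alpha)) ** 2)


def cos_alpha_sin_theta(s, t, alpha):
    """cos(alpha) sin(theta), using sin(theta) = (A / t) sin(alpha) (law of sines)."""
    return math.cos(alpha) * (side_A(s, t, alpha) / t) * math.sin(alpha)


def is_increasing(values):
    """True if the sequence is strictly increasing."""
    return all(b > a for a, b in zip(values, values[1:]))


def check(s=1.0, t=1.5, alpha_max=0.3, steps=200):
    """Return (A decreases in alpha, cos(alpha) sin(theta) increases in alpha)."""
    alphas = [alpha_max * k / steps for k in range(1, steps + 1)]
    a_vals = [side_A(s, t, a) for a in alphas]
    g_vals = [cos_alpha_sin_theta(s, t, a) for a in alphas]
    # If A decreases and g increases as alpha grows, then g decreases as a function of A.
    return is_increasing([-a for a in a_vals]), is_increasing(g_vals)
```

For these sample values both monotonicity statements hold on $(0, 0.3]$, in line with the claim that $\cos\alpha\sin\theta$ is decreasing in $A$ for small $\alpha$.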


Let $\xi_0 = \xi_{\varepsilon_0} = \inf\{t > 0;\ |X_1(t) - X_2(t)| \le R_1 + R_2 - \frac34\varepsilon_0\} \wedge \sigma_n \wedge T$. Then by Lemma 6.3.3, $P_m(\xi_0 < T \wedge \sigma_n) \to 0$ as $m \to 0$. We next use Lemma 6.3.4 to prove the following:

Lemma 6.3.5. $\lim_{m\to0} P_m\big(\int_0^{T\wedge\sigma_n\wedge\xi_0} m^{-1/2}(U(X(s)) - U_0)\,ds > \delta\big) = 0$ for any $\delta > 0$.

Proof. By Lemma 6.1.2, we have that $-\nabla_1 U(Y_1, Y_2)\cdot\frac{Y_1 - Y_2}{|Y_1 - Y_2|}$ is positive for $|Y_1 - Y_2| \in [R_1 + R_2 - \varepsilon_0, R_1 + R_2)$. Also, by Lemma 6.3.4, the same quantity is monotone non-increasing with respect to $|Y_1 - Y_2|$. Notice that $U(X_1, X_2) = U(X_1 - X_2, 0)$. So with a slight abuse of notation, we can write $U(X_1, X_2) = U(X_1 - X_2)$. We have that $U(X_1, X_2) - U_0 = 0$ if $|X_1 - X_2| \ge R_1 + R_2$. Also, for any $|X_1 - X_2| < R_1 + R_2$, we have $U\big(\frac{R_1 + R_2}{|X_1 - X_2|}(X_1 - X_2)\big) = U_0$, and $\frac{R_1 + R_2}{|X_1 - X_2|} + t\big(1 - \frac{R_1 + R_2}{|X_1 - X_2|}\big) \ge 1$ for $t \in [0, 1]$, hence
\[
\begin{aligned}
U(X_1, X_2) - U_0 &= U(X_1 - X_2) - U\Big(\frac{R_1 + R_2}{|X_1 - X_2|}(X_1 - X_2)\Big) \\
&= \int_0^1 -\nabla_1 U\Big(\Big[\frac{R_1 + R_2}{|X_1 - X_2|} + t\Big(1 - \frac{R_1 + R_2}{|X_1 - X_2|}\Big)\Big](X_1 - X_2)\Big)\cdot\Big(-1 + \frac{R_1 + R_2}{|X_1 - X_2|}\Big)(X_1 - X_2)\,dt \\
&\le \int_0^1 -\nabla_1 U(X_1 - X_2)\cdot(X_1 - X_2)\Big(-1 + \frac{R_1 + R_2}{|X_1 - X_2|}\Big)\,dt \\
&= -\nabla_1 U(X_1 - X_2)\cdot(X_1 - X_2)\Big(-1 + \frac{R_1 + R_2}{|X_1 - X_2|}\Big) \\
&\le |\nabla_1 U(X_1 - X_2)|\,|X_1 - X_2|\,\frac{R_1 + R_2 - |X_1 - X_2|}{|X_1 - X_2|} \\
&= |\nabla_1 U(X_1 - X_2)|\,(R_1 + R_2 - |X_1 - X_2|) .
\end{aligned}
\]
The first equation in the calculation above also gives us that $U(X_1, X_2) - U_0$ is non-negative. Also, by (3.31), $U(X(s)) - U_0 = 0$ if $|X_1(s) - X_2(s)| \ge R_1 + R_2$. Let $C = \sup_{m\in(0,1]} E^{P_m}\big[\int_0^{T\wedge\sigma_n\wedge\xi_0} m^{-1/2}|\nabla_1 U(X_1(s) - X_2(s))|\,ds\big]$, which is finite by Lemma 3.5.2. Then for any $\varepsilon \in (0, \frac34\varepsilon_0)$, we have
\[
\begin{aligned}
&P_m\Big(\int_0^{T\wedge\sigma_n\wedge\xi_0} m^{-1/2}(U(X(s)) - U_0)\,ds > \delta\Big) \\
&\quad\le P_m\Big(\int_0^{T\wedge\sigma_n\wedge\xi_0} m^{-1/2}|\nabla_1 U(X_1(s) - X_2(s))| \times (R_1 + R_2 - |X_1(s) - X_2(s)|)\,1_{\{|X_1(s) - X_2(s)| < R_1 + R_2\}}\,ds > \delta\Big) \\
&\quad\le P_m\Big(\inf_{s\in[0,T\wedge\sigma_n]} |X_1(s) - X_2(s)| \le R_1 + R_2 - \varepsilon\Big) + P_m\Big(\int_0^{T\wedge\sigma_n\wedge\xi_0} m^{-1/2}|\nabla_1 U(X_1(s) - X_2(s))|\,ds > \frac{\delta}{\varepsilon}\Big) \\
&\quad\le P_m(\xi_{\frac43\varepsilon} < T \wedge \sigma_n) + \frac{\varepsilon}{\delta} E^{P_m}\Big[\int_0^{T\wedge\sigma_n\wedge\xi_0} m^{-1/2}|\nabla_1 U(X_1(s) - X_2(s))|\,ds\Big] \\
&\quad\le P_m(\xi_{\frac43\varepsilon} < T \wedge \sigma_n) + \frac{\varepsilon}{\delta} C .
\end{aligned}
\]
By Lemma 6.3.3, $P_m(\xi_{\frac43\varepsilon} < T \wedge \sigma_n) \to 0$ as $m \to 0$ for any $\varepsilon > 0$. Therefore, by taking first $\varepsilon > 0$ small enough and then $m > 0$ small enough, we get our assertion.

We are now ready to show that the condition (5) of Theorem 6.3.2 is satisfied.

Lemma 6.3.6. $M_1|V_1(t)|^2 + M_2|V_2(t)|^2$ is continuous in $t$ almost surely, under any cluster point of $\{(X(t), V(t))_t$ under $P_m\}$ as $m \to 0$.

Proof. Let $m_k$ be a sequence and $P_\infty$ be a probability such that $\lim_{k\to\infty} m_k = 0$ and $\{(X(t), V(t))_t$ under $P_{m_k}\}$ converges to $P_\infty$ as $k \to \infty$. (This is possible by Sec. 6.2.) Then $(V_i^2(s))_s$ under $P_m \to (V_i^2(s))_s$ under $P_\infty$ in $\wp(D([0, T]; \mathbf{R}^d))$, as $m \to 0$. Also, let
\[
H_s^m = m^{-1/2}(U(X(s)) - U_0) + \frac12\sum_{i=1}^2 M_i|V_i(s)|^2 .
\]
Then we have by Lemma 3.5.2(2) that under our present setting, $\{(H^m_{t\wedge\sigma_n\wedge\xi_0})_t$ under $P_m\}_{m\to0}$ is tight in $\wp(C([0, T]; \mathbf{R}^d))$. That is, there exists an $H_s \in C([0, T]; \mathbf{R}^d)$ such that
\[
(H_s^m)_s \text{ under } P_m \to (H_s)_s \text{ under } P_\infty \text{ in } \wp(C([0, T]; \mathbf{R}^d)), \text{ as } m \to 0 .
\]
Combining the above, we get
\[
\Big(H_s^m - \frac12\sum_{i=1}^2 M_i V_i(s)^2\Big)_{s\in[0,T\wedge\sigma_n\wedge\xi_0)} \text{ under } P_m \to \Big(H_s - \frac12\sum_{i=1}^2 M_i V_i(s)^2\Big)_{s\in[0,T\wedge\sigma_n\wedge\xi_0)} \text{ under } P_\infty
\]
in $\wp(D([0, T]; \mathbf{R}^d))$, as $m \to 0$. However, for any $\delta > 0$, we have by Lemma 6.3.5 that
\[
P_m\Big(\int_0^{T\wedge\sigma_n\wedge\xi_0} \Big(H_s^m - \frac12\sum_{i=1}^2 M_i V_i(s)^2\Big)\,ds > \delta\Big) \to 0, \quad\text{as } m \to 0 .
\]

So
\[
P_\infty\Big(\int_0^{T\wedge\sigma_n\wedge\xi_0} \Big(H_s - \frac12\sum_{i=1}^2 M_i V_i(s)^2\Big)\,ds > \delta\Big) = 0
\]
for any $\delta > 0$. Also, $\xi_0 \to T \wedge \sigma_n$ as $m \to 0$. Therefore,
\[
\int_0^{T\wedge\sigma_n} \Big(H_s - \frac12\sum_{i=1}^2 M_i V_i(s)^2\Big)\,ds = 0, \quad P_\infty\text{-a.s.}
\]
This combined with the continuity of $H_s$ and the fact that $\sigma_n \to \infty$ a.e. gives us that $M_1|V_1(t)|^2 + M_2|V_2(t)|^2$ is continuous in $t$, $P_\infty$-almost surely.

We finally show that the condition (4) of Theorem 6.3.2 is satisfied. The method is similar to that of the proof of (5).

As in Sec. 5.1, let $Y_i(t) = V_i(t) - M_i^{-1}\eta_i(t)$, $i = 1, 2$, where $\eta_i(t)$ is as given in Lemma 3.5.1. Let $Y(t) = (Y_1(t), Y_2(t))$, and let
\[
G_t = m^{-1/2}\int_0^t \big\{M_1^{-1} f_{V_1}(X(s), Y(s))\cdot\nabla_1 U(X(s)) + M_2^{-1} f_{V_2}(X(s), Y(s))\cdot\nabla_2 U(X(s))\big\}\,ds + f(X(t), V(t)) .
\]
We first show the following.

Lemma 6.3.7. $\{(G_{t\wedge\sigma_n})_t$ under $P_m\}_{m\to0}$ is tight in $\wp(C([0, T]; \mathbf{R}^d))$.

Proof. Let $\widetilde{G}_t = G_t - f(X(t), V(t)) + f(X(t), Y(t))$. Then
\[
|G_t - \widetilde{G}_t| \le \|f_{V_1}\|_\infty M_1^{-1}|\eta_1(t)| + \|f_{V_2}\|_\infty M_2^{-1}|\eta_2(t)| .
\]
Therefore, by Lemma 3.5.1(4), we have that the tightness of $\{(G_{t\wedge\sigma_n})_t$ under $P_m\}_{m\to0}$ in $\wp(C([0, T]; \mathbf{R}^d))$ is equivalent to the tightness of $\{(\widetilde{G}_{t\wedge\sigma_n})_t$ under $P_m\}_{m\to0}$ in $\wp(C([0, T]; \mathbf{R}^d))$. On the other hand, we have by Lemma 3.5.1 and Ito's formula that
\[
\begin{aligned}
d\widetilde{G}_t &= f_{X_1}(X(t), Y(t))\cdot V_1(t)\,dt + f_{X_2}(X(t), Y(t))\cdot V_2(t)\,dt \\
&\quad + M_1^{-1} f_{V_1}(X(t), Y(t))\cdot(dM_1(t) + dP_1^{*1}(t)) + M_2^{-1} f_{V_2}(X(t), Y(t))\cdot(dM_2(t) + dP_2^{*1}(t)) .
\end{aligned}
\]
So by Lemma 3.5.1(2), (4.13) and Theorem 3.4.1, $\{(\widetilde{G}_{t\wedge\sigma_n})_t$ under $P_m\}_{m\to0}$ is tight in $\wp(C([0, T]; \mathbf{R}^d))$. This completes the proof of our assertion.


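Lemma 6.3.8. Suppose that $f \in C_0^\infty(\mathbf{R}^{4d})$ satisfies the condition in (4) of Theorem 6.3.2. Then for any $\delta > 0$, we have that
\[
\lim_{m\to0} P_m\Big(\int_0^{T\wedge\sigma_n\wedge\xi_0} \Big|m^{-1/2}\int_0^t \big\{M_1^{-1} f_{V_1}(X(s), Y(s))\cdot\nabla_1 U(X(s)) + M_2^{-1} f_{V_2}(X(s), Y(s))\cdot\nabla_2 U(X(s))\big\}\,ds\Big|\,dt > \delta\Big) = 0 .
\]

Proof. First notice that $\nabla_i U(X_1, X_2) = 0$ if $|X_1 - X_2| > R_1 + R_2$. For any $X_1, X_2 \in \mathbf{R}^d$ with $|X_1 - X_2| \le R_1 + R_2$, let $\widetilde{X}_i = \frac{R_1 + R_2}{|X_1 - X_2|} X_i$, $i = 1, 2$. Then $|\widetilde{X}_1 - \widetilde{X}_2| = R_1 + R_2$. Since $D_0 = \{(X_1, X_2) \in \mathbf{R}^{2d};\ |X_1 - X_2| > R_1 + R_2\}$, this means $\widetilde{X} = (\widetilde{X}_1, \widetilde{X}_2) \in \partial D_0$. Also, as in the proof of Corollary 6.1.3, $-\nabla_1 U(X_1, X_2) = \nabla_2 U(X_1, X_2)$ is parallel to, and has the same direction as, $X_1 - X_2$, so
\[
\begin{aligned}
\nabla_1 U(X_1, X_2) &= -\frac{|\nabla_1 U(X_1, X_2)|}{|X_1 - X_2|}(X_1 - X_2) = -\frac{|\nabla_1 U(X_1, X_2)|}{R_1 + R_2}(\widetilde{X}_1 - \widetilde{X}_2), \\
\nabla_2 U(X_1, X_2) &= +\frac{|\nabla_2 U(X_1, X_2)|}{|X_1 - X_2|}(X_1 - X_2) = +\frac{|\nabla_1 U(X_1, X_2)|}{R_1 + R_2}(\widetilde{X}_1 - \widetilde{X}_2) .
\end{aligned}
\]
So by assumption, for any $Y \in \mathbf{R}^{2d}$,
\[
\begin{aligned}
&M_1^{-1} f_{V_1}(\widetilde{X}, Y)\cdot\nabla_1 U(X_1, X_2) + M_2^{-1} f_{V_2}(\widetilde{X}, Y)\cdot\nabla_2 U(X_1, X_2) \\
&\quad= \frac{|\nabla_1 U(X_1, X_2)|}{R_1 + R_2}\big(-M_1^{-1} f_{V_1}(\widetilde{X}, Y)\cdot(\widetilde{X}_1 - \widetilde{X}_2) + M_2^{-1} f_{V_2}(\widetilde{X}, Y)\cdot(\widetilde{X}_1 - \widetilde{X}_2)\big) = 0,
\end{aligned}
\]
hence if we set $C_1 \equiv M_1^{-1}\|f_{XV_1}\|_\infty \vee M_2^{-1}\|f_{XV_2}\|_\infty$, then
\[
\begin{aligned}
&|M_1^{-1} f_{V_1}(X, Y)\cdot\nabla_1 U(X_1, X_2) + M_2^{-1} f_{V_2}(X, Y)\cdot\nabla_2 U(X_1, X_2)| \\
&\quad= |M_1^{-1}(f_{V_1}(X, Y) - f_{V_1}(\widetilde{X}, Y))\cdot\nabla_1 U(X_1, X_2) + M_2^{-1}(f_{V_2}(X, Y) - f_{V_2}(\widetilde{X}, Y))\cdot\nabla_2 U(X_1, X_2)| \\
&\quad\le M_1^{-1}\|f_{XV_1}\|_\infty |\widetilde{X} - X|\,|\nabla_1 U(X_1, X_2)| + M_2^{-1}\|f_{XV_2}\|_\infty |\widetilde{X} - X|\,|\nabla_2 U(X_1, X_2)| \\
&\quad\le C_1\big(|\nabla_1 U(X_1, X_2)| + |\nabla_2 U(X_1, X_2)|\big)\Big(\frac{R_1 + R_2}{|X_1 - X_2|} - 1\Big)|X| .
\end{aligned}
\]
Let $C_2 = 2(|X_0| + 2nT)(R_1 + R_2)^{-1}$, and let
\[
C_3 = C_1 C_2 \sup_{m\in(0,1]} E^{P_m}\Big[\int_0^{T\wedge\sigma_n} m^{-1/2}\big(|\nabla_1 U(X(s))| + |\nabla_2 U(X(s))|\big)\,ds\Big],
\]
which is finite by Lemma 3.5.2. Then by the calculation above, we have for any $\varepsilon \in [0, \frac34\varepsilon_0 \wedge \frac12(R_1 + R_2))$ (hence $R_1 + R_2 - \varepsilon > \frac12(R_1 + R_2)$),
\[
\begin{aligned}
&P_m\Big(\int_0^{T\wedge\sigma_n\wedge\xi_0} \Big|m^{-1/2}\int_0^t \big\{M_1^{-1} f_{V_1}(X(s), Y(s))\cdot\nabla_1 U(X(s)) + M_2^{-1} f_{V_2}(X(s), Y(s))\cdot\nabla_2 U(X(s))\big\}\,ds\Big|\,dt > \delta\Big) \\
&\quad\le P_m\Big(\int_0^{T\wedge\sigma_n\wedge\xi_0} m^{-1/2} C_1\big(|\nabla_1 U(X(s))| + |\nabla_2 U(X(s))|\big) \\
&\qquad\qquad \times (|X_0| + 2nT)\Big(\frac{R_1 + R_2}{|X_1(s) - X_2(s)|} - 1\Big) 1_{\{|X_1(s) - X_2(s)| < R_1 + R_2\}}\,ds > \delta\Big) \\
&\quad\le P_m\Big(\inf_{s\in[0,T\wedge\sigma_n]} |X_1(s) - X_2(s)| \le R_1 + R_2 - \varepsilon\Big) \\
&\qquad + P_m\Big(\int_0^{T\wedge\sigma_n\wedge\xi_0} m^{-1/2} C_1\big(|\nabla_1 U(X(s))| + |\nabla_2 U(X(s))|\big)\,ds\,(|X_0| + 2nT)\,\varepsilon\Big(\frac{R_1 + R_2}{2}\Big)^{-1} > \delta\Big) \\
&\quad\le P_m(\xi_{\frac43\varepsilon} < T \wedge \sigma_n) + \frac{\varepsilon}{\delta} C_1 C_2\cdot E^{P_m}\Big[\int_0^{T\wedge\sigma_n\wedge\xi_0} m^{-1/2}\sum_{i=1}^2 |\nabla_i U(X(s))|\,ds\Big] \\
&\quad\le P_m(\xi_{\frac43\varepsilon} < T \wedge \sigma_n) + \frac{\varepsilon}{\delta} C_3 .
\end{aligned}
\]
Since $P_m(\xi_{\frac43\varepsilon} < T \wedge \sigma_n) \to 0$ as $m \to 0$ for any $\varepsilon \in (0, \frac34\varepsilon_0]$ by Lemma 6.3.3, we get our assertion by taking first $\varepsilon > 0$ small enough and then $m > 0$ small enough.

By using the same argument as when deriving Lemma 6.3.6 from Lemmas 3.5.2 and 6.3.5, with the help of Lemmas 6.3.7 and 6.3.8, we get the following, which means that the condition (4) of Theorem 6.3.2 is also satisfied.

Lemma 6.3.9. Assume that $f \in C_0^\infty(\mathbf{R}^{4d})$ satisfies
\[
M_1^{-1}(\nabla_{V_1} f)(X, V)\cdot(X_1 - X_2) + M_2^{-1}(\nabla_{V_2} f)(X, V)\cdot(X_2 - X_1) = 0
\]
for any $(X, V) \in \partial D_0 \times \mathbf{R}^{2d}$. Then $f(X(t), V(t))$ is continuous in $t$ almost surely, under any cluster point of $\{(X(t), V(t))_t$ under $P_m\}$, as $m \to 0$.

This completes the proof of the fact that in our setting any cluster point of the distribution of $\{(X_t, V_t)\}_t$ under $P_m$ as $m \to 0$ satisfies all of the conditions of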


Theorem 6.3.2. Therefore, by the uniqueness of Theorem 6.3.2, the distribution of $\{(X_t, V_t)\}_t$ under $P_m$ converges to $P_{\infty,0}$ as $m \to 0$.

Acknowledgment

We would like to thank the referees for their valuable comments, which helped to improve the manuscript in many ways. We would also like to thank Professor Sergio Albeverio for reading the manuscript carefully. This work was financially supported by Grant-in-Aid for the Encouragement of Young Scientists (No. 21740063), Japan Society for the Promotion of Science.

References

[1] P. Billingsley, Convergence of Probability Measures (John Wiley & Sons, Inc., 1968).
[2] P. Calderoni, D. Dürr and S. Kusuoka, A mechanical model of Brownian motion in half-space, J. Statist. Phys. 55(3–4) (1989) 649–693.
[3] D. Dürr, S. Goldstein and J. L. Lebowitz, A mechanical model of Brownian motion, Comm. Math. Phys. 78(4) (1980/81) 507–530.
[4] D. Dürr, S. Goldstein and J. L. Lebowitz, A mechanical model for the Brownian motion of a convex body, Z. Wahrsch. Verw. Gebiete 62(4) (1983) 427–448.
[5] D. Dürr, S. Goldstein and J. L. Lebowitz, Stochastic processes originating in deterministic microscopic dynamics, J. Statist. Phys. 30(2) (1983) 519–526.
[6] R. Holley, The motion of a heavy particle in an infinite one dimensional gas of hard spheres, Z. Wahrsch. Verw. Gebiete 17 (1971) 181–219.
[7] N. Ikeda and S. Watanabe, Stochastic Differential Equations and Diffusion Processes, North-Holland Mathematical Library, Vol. 24 (North-Holland Publishing Co., Kodansha, Ltd., 1981).
[8] O. Kallenberg, Foundations of Modern Probability, Probability and Its Applications, 2nd edn. (Springer-Verlag, New York, 2002).
[9] S. Kusuoka, Stochastic Newton equation with reflecting boundary condition, in Stochastic Analysis and Related Topics in Kyoto, Adv. Stud. Pure Math., Vol. 41 (Math. Soc. Japan, 2004), pp. 233–246.
[10] E. Nelson, Dynamical Theories of Brownian Motion (Princeton University Press, Princeton, 1967).
[11] M. Reed and B. Simon, Methods of Modern Mathematical Physics. III. Scattering Theory (Academic Press, 1979).
[12] J. A. M. van der Weide, Stochastic Processes and Point Processes of Excursions, CWI Tract, Vol. 102 (Stichting Mathematisch Centrum, Centrum voor Wiskunde en Informatica, Amsterdam, 1994).

August 10, J070-S0129055X10004089

2010 15:1 WSPC/S0129-055X

148-RMP

Reviews in Mathematical Physics, Vol. 22, No. 7 (2010) 839–858
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004089

A NOTE ON THE NON-COMMUTATIVE LAPLACE–VARADHAN INTEGRAL LEMMA

W. DE ROECK Institut f¨ ur Theoretische Physik, Universit¨ at Heidelberg, Germany [email protected] CHRISTIAN MAES Instituut voor Theoretische Fysica, K. U. Leuven, Belgium [email protected] ˇ Y ´ KAREL NETOCN Institute of Physics, Academy of Sciences of the Czech Republic Prague, Czech Republic [email protected] LUC REY-BELLET Department of Mathematics and Statistics, University of Massachusetts, Amherst, USA [email protected] Received 10 September 2009 Revised 21 May 2010 We continue the study of the free energy of quantum lattice spin systems where to the local Hamiltonian H an arbitrary mean field term is added, a polynomial function of the arithmetic mean of some local observables X and Y that do not necessarily commute. By slightly extending a recent paper by Hiai, Mosonyi, Ohno and Petz [10], we prove in general that the free energy is given by a variational principle over the range of the operators X and Y . As in [10], the result is a non-commutative extension of the Laplace–Varadhan asymptotic formula. Keywords: Quantum large deviations; quantum lattice systems; Laplace–Varadhan lemma. Mathematics Subject Classification 2010: 82B10

1. Introduction 1.1. Large deviations One of the highlights in the combination of analysis and probability theory is the asymptotic evaluation of certain integrals. We have here in mind integrals of the 839

August 10, J070-S0129055X10004089

840

2010 15:1 WSPC/S0129-055X

148-RMP

W. De Roeck et al.

form, for some real-valued function G, dµn (x) exp{vn G(x)},

vn +∞ as n +∞

(1.1)

for which the measures µn satisfy a law of large numbers. Such integrals can be evaluated depending on the asymptotics of the µn . The latter is the subject of the theory of large deviations, characterizing the rate of convergence in the law of large numbers. In a typical scenario, the µn are the probabilities of some macroscopic variable, such as the average magnetization or the particle density in ever growing volumes vn and as distributed in a given equilibrium Gibbs ensemble. Then, depending on the case, thermodynamic potentials J make the rate function dµn (x) ∼ dx exp{−vn J (x)} in the sense of large deviations for Gibbs measures, see [8, 9, 16, 22, 23]. That theory of large deviations is however broader than the applications in equilibrium statistical mechanics. Essentially, when the rate function for µn is given by J , then the integral (1.1) is computed as 1 log dµn (x) exp{vn G(x)} −−−−− → sup{G(x) − J (x)}. (1.2) n+∞ x vn This is a typical application of Laplace’s asymptotic formula for the evaluation of real-valued integrals. The systematic combination with the theory of large deviations gives the so called Laplace–Varadhan integral lemma. We ﬁrst recall the large deviation principle (LDP). Let (M, d) be some complete separable metric space. Definition 1.1. The sequence of measures µn on M satisﬁes a LDP with rate function J : M → R+ ∪ {+∞} and speed vn ∈ R+ if (1) J is convex and has closed level sets, i.e. {J −1 (x), x ≤ c}

(1.3)

is closed in (M, d) for all c ∈ R+ ; (2) for all Borel sets U ⊂ M with interior int U and closure cl U , one has lim inf

1 log µn (U ) ≥ − inf J (u), u∈int U vn

lim sup

1 log µn (U ) ≤ − inf J (u). u∈cl U vn

n+∞

n+∞

We say that the rate function J is good whenever the level sets (1.3) are compact. For the transfer of LDP, one considers a pair (µn , νn ), n ∞ of sequences of absolutely continuous measures on (M, d) such that dνn (x) = exp{vn G(x)}, dµn

µn -almost everywhere,

August 10, J070-S0129055X10004089

2010 15:1 WSPC/S0129-055X

148-RMP

Note on Non-Commutative Laplace–Varadhan Integral Lemma

841

for some measurable mapping $G : M \to \mathbb{R}$. We now state an instance of the Laplace–Varadhan lemma.

Lemma 1.1 (Laplace–Varadhan Integral Lemma). Assume that $G$ is bounded and continuous and that the sequence $(\mu_n)$ satisfies a large deviation principle with good rate function $J$ and speed $v_n$. Then $(\nu_n)$ satisfies a large deviation principle with good rate function $G - J$ and speed $v_n$.

For more general versions and proofs we refer to the literature, see e.g. [5–7, 22, 23]; it remains an important subject of analytic probability theory to extend the validity of the variational formulation (1.2) and to deal with its applications.

1.2. Mean-field interactions

From the point of view of equilibrium statistical mechanics, one can also think of the formula (1.1) as giving (the exponential of) the pressure or free energy when adding a mean field type term to a Hamiltonian which is a sum of local interactions. The choice of the function $G$ is then typically monomial with a power decided by the number of particles or spins that are in direct interaction. For example, the free energy of an Ising-like model with such an extra mean field interaction would be given by the limit

\[
\lim_{\Lambda \uparrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log \sum_{\eta \in \{+,-\}^{\Lambda}} \exp\Big\{ -\beta H_\Lambda(\eta) + \lambda_p |\Lambda| \Big( \frac{1}{|\Lambda|} \sum_{i \in \Lambda} \eta_i \Big)^{p} \Big\}
\tag{1.4}
\]

for $p = 1, 2, \ldots$, where $H_\Lambda(\eta)$ is the (local) energy of the spin configuration $\eta$ and the limit takes a sequence of regularly expanding boxes $\Lambda$ to cover some given lattice. The case $p = 1$ corresponds to the addition of a magnetic field $\lambda_1$; $p = 2$ is most standard and adds effectively a very small but long range two-spin interaction. Higher $p$-values are also not uncommon in the study of Ising interactions on hypergraphs, and even very large $p$ has been found relevant, e.g., in models of spin glasses and in information theory [4]. The form (1.1) is easily recognized in (1.4), with $v_n = |\Lambda|$,

\[
\mu_n(x) \sim \sum_{\substack{\eta \in \{+,-\}^{\Lambda} \\ \sum_{i \in \Lambda} \eta_i = x|\Lambda|}} \exp\{-\beta H_\Lambda(\eta)\},
\]

and the function $G(x) = \lambda_p x^p$. The Laplace–Varadhan lemma applies to (1.4) since we know that the sequence of Gibbs states with density $\sim \exp\{-\beta H_\Lambda(\,\cdot\,)\}$ satisfies a LDP with a good rate function $J_{\mathrm{cl}}$ and speed $|\Lambda|$. The result reads that (1.4) is given by the variational formula

\[
\sup_{u \in [-1,1]} \{ \lambda_p u^p - J_{\mathrm{cl}}(u) \}.
\tag{1.5}
\]
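The variational formula (1.5) can be checked numerically in the simplest case. The following is a minimal sketch, assuming $H_\Lambda = 0$ (free spins) so that the unnormalized weights in $\mu_n$ are binomial coefficients and $J_{\mathrm{cl}}(u)$ is minus the entropy of the $\pm$ frequencies; the function names and the grid search are illustrative choices, not part of the paper.

```python
import math

def log_partition_per_site(n, lam, p=2):
    """(1/n) * log of the sum over eta in {+1,-1}^n of exp(lam * n * m**p).

    Assumes H_Lambda = 0, so the configurations with k plus-spins are
    counted by the binomial coefficient C(n, k); m is the mean spin.
    """
    log_terms = []
    for k in range(n + 1):
        m = (2 * k - n) / n
        log_binom = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        log_terms.append(log_binom + lam * n * m ** p)
    top = max(log_terms)
    # log-sum-exp for numerical stability
    return (top + math.log(sum(math.exp(t - top) for t in log_terms))) / n

def rate_function(u):
    """J_cl(u) for free spins: minus the entropy of the frequencies (1 +/- u)/2."""
    total = 0.0
    for q in ((1 + u) / 2, (1 - u) / 2):
        if q > 0:
            total += q * math.log(q)
    return total

def variational_value(lam, p=2, grid=2001):
    """sup over u in [-1, 1] of lam * u**p - J_cl(u), on a uniform grid."""
    best = -math.inf
    for i in range(grid):
        u = -1 + 2 * i / (grid - 1)
        best = max(best, lam * u ** p - rate_function(u))
    return best
```

For moderately large `n`, `log_partition_per_site(n, lam)` and `variational_value(lam)` should agree up to a finite-size correction that shrinks as `n` grows.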

In non-commutative versions the local Hamiltonian $H$ and the additional mean field term are allowed not to commute with each other. That is natural within the


W. De Roeck et al.

statistical mechanics of quantum spin systems and this is also the context of the present paper.

1.3. Non-commutative extensions

Although it has proven very useful to think of integrals (1.1) within the framework of probability and large deviation theory, it is fundamentally a problem of analysis. However, without such a probabilistic context, the question of a non-commutative extension of the Laplace–Varadhan Lemma 1.1 becomes ambiguous and it in fact allows for different formulations, each possibly having a physical interpretation on its own. One approach is to ask for the asymptotic evaluation of the expectations

\[
\lim_{\Lambda \uparrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log \omega_\Lambda\big( e^{|\Lambda| G(\bar{X}_\Lambda)} \big)
\tag{1.6}
\]

under a family of quantum states $\omega_\Lambda$, where $\bar{X}_\Lambda$ would now be the arithmetic mean of some quantum observable in volume $\Lambda$. To be specific, one can take $\omega_\Lambda$ a quantum Gibbs state for a Hamiltonian $H_\Lambda$ at inverse temperature $\beta$, with density matrix $\sigma_\Lambda \sim \exp\{-\beta H_\Lambda\}$, and $\bar{X}_\Lambda = (\sum_{i \in \Lambda} X_i)/|\Lambda|$ the mean magnetization in some fixed direction. Arguably, this formulation is closely related to the asymptotic statistics of outcomes in von Neumann measurements of $\bar{X}_\Lambda$. Indeed, let $\nu_\Lambda$ be the measure on $[-\|X\|, \|X\|]$ defined by

\[
\nu_\Lambda(f) := \omega_\Lambda(f(\bar{X}_\Lambda)) \quad \text{for } f \in C([-\|X\|, \|X\|]).
\tag{1.7}
\]

Then, (1.6) can be evaluated with the help of Lemma 1.1 (the commutative Laplace–Varadhan integral lemma) if the family $\nu_\Lambda$ satisfies a LDP with speed $|\Lambda|$. In recent years, this LDP has been established for $\sigma_\Lambda \sim \exp\{-\beta H_\Lambda\}$ in the regime of small $\beta$ (high temperature) or $d = 1$, see [11, 13–15]. A more general class of possible extensions is obtained by considering the limits of

\[
\frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda \big( \sigma_\Lambda^{1/K} \, e^{\frac{|\Lambda|}{K} G(\bar{X}_\Lambda)} \big)^{K}, \quad \Lambda \uparrow \mathbb{Z}^d
\tag{1.8}
\]

for different $K > 0$, where $\sigma_\Lambda$ is the density matrix of a quantum state in box $\Lambda$. For the canonical form $\sigma_\Lambda = \exp(-\beta H_\Lambda)/Z_\Lambda^\beta$ with local Hamiltonian $H_\Lambda$ at inverse temperature $\beta$, (1.8) becomes

\[
\frac{1}{|\Lambda|} \log \frac{1}{Z_\Lambda^\beta} \mathrm{Tr}_\Lambda \big( e^{-\frac{\beta}{K} H_\Lambda} \, e^{\frac{|\Lambda|}{K} G(\bar{X}_\Lambda)} \big)^{K}, \quad \Lambda \uparrow \mathbb{Z}^d.
\tag{1.9}
\]

There is no a priori reason to exclude any particular value of $K$ from consideration. Two standard options are: $K = 1$, which corresponds to the expression (1.6) above, and $K \uparrow +\infty$, which, by the Trotter product formula, boils down to

\[
\frac{1}{|\Lambda|} \log \frac{1}{Z_\Lambda^\beta} \mathrm{Tr}_\Lambda \big( e^{-\beta H_\Lambda + |\Lambda| G(\bar{X}_\Lambda)} \big), \quad \Lambda \uparrow \mathbb{Z}^d
\tag{1.10}
\]


which is the free energy of a corresponding quantum spin model, cf. (1.4). In the present paper, we study the case $K \uparrow +\infty$ (without touching the question of interchangeability of both limits). One of our results, Theorem 3.1 with $Y = \bar{Y}_\Lambda = 0$, is of the form

\[
\lim_{\Lambda \uparrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda \big( e^{-\beta H_\Lambda + |\Lambda| G(\bar{X}_\Lambda)} \big) = \sup_{-\|X\| \le u \le \|X\|} \{ G(u) - J(u) \}.
\tag{1.11}
\]

Note that we omitted the normalization factor $1/Z_\Lambda^\beta$ since it merely adds a constant (independent of $G$) to (1.10). In the usual context of the theory of large deviations, formula (1.11) arises as a change of rate function. However, while our result (1.11) very much looks like Varadhan's formula in Lemma 1.1, there is a big difference in interpretation: The function $J$ is not as such the rate function of large deviations for $\bar{X}_\Lambda$. Instead, it is given as the Legendre transform

\[
J(u) = \sup_{t \in \mathbb{R}} \{ tu - q(t) \}, \quad u \in \mathbb{R}
\tag{1.12}
\]

of a function $q(\,\cdot\,)$ which is the pressure corresponding to a linearized interaction, i.e.

\[
q(t) = \lim_{\Lambda \uparrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda \big( e^{-\beta H_\Lambda + t|\Lambda| \bar{X}_\Lambda} \big).
\tag{1.13}
\]
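The role of $K$ in (1.8)–(1.10) can be illustrated on a single spin. The sketch below is an assumption-laden toy: it takes two non-commuting $2 \times 2$ matrices, $a\sigma_x$ (standing in for $-\beta H_\Lambda$) and $b\sigma_z$ (standing in for $|\Lambda| G(\bar{X}_\Lambda)$), and compares $\mathrm{Tr}\,(e^{a\sigma_x/K} e^{b\sigma_z/K})^K$ with $\mathrm{Tr}\, e^{a\sigma_x + b\sigma_z}$. The function names are hypothetical; the closed forms for the Pauli exponentials are standard.

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 real matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_sigma_x(t):
    """exp(t * sigma_x) = cosh(t) * identity + sinh(t) * sigma_x."""
    c, s = math.cosh(t), math.sinh(t)
    return [[c, s], [s, c]]

def exp_sigma_z(t):
    """exp(t * sigma_z) is diagonal with entries exp(t) and exp(-t)."""
    return [[math.exp(t), 0.0], [0.0, math.exp(-t)]]

def trotter_trace(a, b, k):
    """Tr (exp(a*sigma_x / k) exp(b*sigma_z / k))**k for an integer k >= 1."""
    step = mat_mul(exp_sigma_x(a / k), exp_sigma_z(b / k))
    power = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(k):
        power = mat_mul(power, step)
    return power[0][0] + power[1][1]

def exact_trace(a, b):
    """Tr exp(a*sigma_x + b*sigma_z) = 2 cosh(sqrt(a^2 + b^2))."""
    return 2.0 * math.cosh(math.hypot(a, b))
```

In this toy case the value at $K = 1$ lies above the $K \uparrow \infty$ value (as the Golden–Thompson inequality requires), and `trotter_trace(a, b, k)` approaches `exact_trace(a, b)` as `k` grows.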

1.4. Several non-commuting observables: Towards joint large deviations?

In the previous Sec. 1.3, we made the tacit assumption that there is a single observable $\bar{X}_\Lambda$ corresponding to some Hermitian operator on Hilbert space. However, in formula (1.4), the observable $\frac{1}{|\Lambda|} \sum_{i \in \Lambda} \eta_i$ could equally well represent a vector-valued magnetization which, upon quantization, would correspond to several non-commuting observables $\bar{X}_\Lambda, \bar{Y}_\Lambda$, say, the magnetization along the x-axis and y-axis, respectively. In the commutative theory, this case does not require special attention; the framework of large deviations applies equally regardless of whether the observable takes values in $\mathbb{R}$ or $\mathbb{R}^2$. Obviously, this is not true in the non-commutative setting and in fact, we do not even know a natural analogue of the generating function (1.6), since we do not dispose of a simultaneous von Neumann measurement of $\bar{X}_\Lambda$ and $\bar{Y}_\Lambda$. One can take the point of view that this is inevitable in quantum mechanics, and insisting is pointless. Yet, as $\Lambda \uparrow \mathbb{Z}^d$, the commutator

\[
[\bar{X}_\Lambda, \bar{Y}_\Lambda] = O\Big( \frac{1}{|\Lambda|} \Big)
\tag{1.14}
\]

vanishes and hence the joint measurability of $\bar{X}_\Lambda, \bar{Y}_\Lambda$ is restored on the macroscopic scale. We refer the reader to [19] where this issue is discussed and studied in more depth. The advantage of the approach via the Laplace–Varadhan Lemma is that one can set aside these conceptual questions and study joint large deviations of $\bar{X}_\Lambda$ and $\bar{Y}_\Lambda$ by choosing $G$ to be a joint function of $\bar{X}_\Lambda$ and $\bar{Y}_\Lambda$, for example a symmetrized


monomial

\[
G(\bar{X}_\Lambda, \bar{Y}_\Lambda) = (\bar{X}_\Lambda)^k (\bar{Y}_\Lambda)^l + (\bar{Y}_\Lambda)^l (\bar{X}_\Lambda)^k, \quad \text{for some } k, l \in \mathbb{N},
\tag{1.15}
\]

and check whether the formula (1.11) remains valid with some obvious adjustments. This turns out to be the case and it is our main result: Theorem 3.1.

1.5. Comparison with previous results

The asymptotics of the expression (1.10) was first studied and the result (1.11) was first obtained by Petz et al. [17], in the case where the Hamiltonian $H_\Lambda$ is made solely from a one-body interaction. The corresponding equilibrium state is then a product state. In [10], Hiai et al. generalized this result to the case of locally interacting spins but the lattice dimension was restricted to $d = 1$. However, the authors of [10] argue that the restriction to $d = 1$ can be lifted in the high-temperature regime. The main reason is that their work relies heavily on an asymptotic decoupling condition which is proven in that regime, [1]. One should observe here that this asymptotic decoupling condition in fact implies a large deviation principle for $\bar{X}_\Lambda$, as follows from the work of Pfister [18]. Hence, in the language of Sec. 1.3, [10] evaluates (1.10) (the case $K = \infty$) in those regimes where (1.6) (the case $K = 1$) can be evaluated as well.

The present paper elaborates on the result of [10] in two ways. First, we remark that, in our setup, the decoupling condition is actually not necessary for (1.11) to hold, and therefore one can do away with the restriction to $d = 1$ or high temperature. Hence, again referring to Sec. 1.3, the case $K = \infty$ can be controlled even when we know little about the case $K = 1$. To drop the decoupling condition, it is absolutely essential that we start from finite-volume Gibbs states, and not from finite-volume restrictions of infinite-volume Gibbs states, as is done in [10]. Second, we show that by the same formalism, one can treat the case of several noncommuting observables, as explained in Sec. 1.4. The most serious step in this generalization is actually an extension of the result of [17] to noncommuting observables. This extension is stated in Lemma 6.1 and proven in Sec. 7.

Note. While we were finishing this paper, we learnt of a similar project by J.-B. Bru and W. de Siqueira Pedra. Their result [3] is nothing less than a full-fledged theory of equilibrium states with mean-field terms in the Hamiltonian, describing not only the mean-field free energy (as we do here), but also the states themselves. Also, their results hold for fermions, while ours are restricted to spin systems, and they provide interesting examples. Yet, the focus of our paper differs from theirs and our main result is not contained in their paper.

1.6. Outline

In Sec. 2, we sketch the setup. We introduce spin systems on the lattice, noncommutative polynomials and ergodic states. Section 3 describes the result of the paper. The remaining Secs. 4–7 contain the proofs.


2. Setup

2.1. Hamiltonian and observables

We consider a quantum spin system on the regular lattice $\mathbb{Z}^d$, $d = 1, 2, \ldots$. We briefly introduce the essential setup below, and we refer to [12, 20] for more expanded, standard introductions. The single site Hilbert space $\mathcal{H}$ is finite-dimensional (isomorphic to $\mathbb{C}^n$) and for any finite volume $\Lambda \subset \mathbb{Z}^d$, we set $\mathcal{H}_\Lambda = \otimes_\Lambda \mathcal{H}$. The $C^*$-algebra of bounded operators on $\mathcal{H}_\Lambda$ is denoted by $\mathcal{B}_\Lambda \equiv \mathcal{B}(\mathcal{H}_\Lambda)$. The standard embedding $\mathcal{B}_\Lambda \subset \mathcal{B}_{\Lambda'}$ for $\Lambda \subset \Lambda'$ is assumed throughout. The quasi-local algebra $\mathcal{U}$ is defined as the norm closure of the finite-volume algebras

\[
\mathcal{U} := \overline{\bigcup_{\Lambda \text{ finite}} \mathcal{B}_\Lambda}.
\tag{2.1}
\]

Denote by $\tau_i$, $i \in \mathbb{Z}^d$, the translation which shifts all observables over a lattice vector $i$, i.e. $\tau_i$ is a homomorphism from $\mathcal{B}_\Lambda$ onto $\mathcal{B}_{i+\Lambda}$. We introduce an interaction potential $\Phi$, that is a collection $(\Phi_A)$ of Hermitian elements of $\mathcal{B}_A$, labeled by finite subsets $A \subset \mathbb{Z}^d$. We assume translation invariance (i) and a finite range (ii):

(i) $\tau_i(\Phi_A) = \Phi_{i+A}$ for all finite $A \subset \mathbb{Z}^d$;
(ii) there is a $d_{\max} < \infty$ such that, if $\mathrm{diam}(A) > d_{\max}$, then $\Phi_A = 0$.

In estimates, we will frequently use the number

\[
r(\Phi) := \sum_{A \ni 0} \|\Phi_A\| < \infty.
\tag{2.2}
\]

The local Hamiltonian in a finite volume $\Lambda$ is

\[
H_\Lambda \equiv H_\Lambda^\Phi = \sum_{A \subset \Lambda} \Phi_A
\tag{2.3}
\]

which corresponds to free or open boundary conditions. Boundary conditions will however turn out to be irrelevant for our results. We will drop the superscript $\Phi$ since we will keep the interaction potential fixed. Let $X, Y, \ldots$ denote local observables on the lattice, located at the origin, i.e. $\mathrm{Supp}\, X$ (which is defined as the smallest set $A$ such that $X \in \mathcal{B}_A$) is a finite set which includes $0 \in \mathbb{Z}^d$. We write

\[
X_\Lambda := \sum_{j \in \mathbb{Z}^d,\ \mathrm{Supp}\, \tau_j X \subset \Lambda} \tau_j X
\tag{2.4}
\]

and

\[
\bar{X}_\Lambda := \frac{1}{|\Lambda|} X_\Lambda
\tag{2.5}
\]

for the corresponding intensive observable (the "empirical average" of $X$).


All of these operators are naturally embedded into the quasi-local algebra $\mathcal{U}$. At some point, we will also require the intensive infinite volume observable $\bar{X} \sim \bar{X}_{\Lambda \uparrow \infty}$. Some care is required in dealing with $\bar{X}$ since it does not belong to the quasi-local algebra $\mathcal{U}$. We will further comment on this in Sec. 2.3.

2.2. Non-commutative polynomials

We will perturb the Hamiltonian $H_\Lambda^\Phi$ by a mean field term of the form $|\Lambda| G(\bar{X}_\Lambda, \bar{Y}_\Lambda)$ where $G$ is a "non-commutative polynomial" of the operators $\bar{X}_\Lambda, \bar{Y}_\Lambda$, e.g., as in (1.15). In this section, we introduce these non-commutative polynomials $G$ as quantizations of polynomial functions $g$. First, we define

\[
\mathrm{Ran}(X, Y) := [-\|X\|, \|X\|] \times [-\|Y\|, \|Y\|].
\tag{2.6}
\]

This definition is motivated by the fact that ("sp" stands for spectrum)

\[
\mathrm{sp}\, \bar{X}_\Lambda \times \mathrm{sp}\, \bar{Y}_\Lambda \subset \mathrm{Ran}(X, Y), \quad \text{for all } \Lambda.
\tag{2.7}
\]

Let $g$ be a real polynomial function on the rectangular set $\mathrm{Ran}(X, Y)$. Using the symbol $\mathcal{I}$ for the collection of all finite sequences from the binary set $\{1, 2\}$, any map $\tilde{G} : \mathcal{I} \to \mathbb{C}$ is called a quantization of $g$ whenever

\[
\sum_{n=0}^{N} \ \sum_{\alpha = (\alpha(1), \ldots, \alpha(n)) \in \mathcal{I}} \tilde{G}(\alpha) \, x_{\alpha(1)} \cdots x_{\alpha(n)} = g(x_1, x_2)
\tag{2.8}
\]

for all $(x_1, x_2) \in \mathrm{Ran}(X, Y)$ and for some $N \in \mathbb{N}$. A quantization $\tilde{G}$ is called symmetric whenever

\[
\tilde{G}(\alpha(1), \ldots, \alpha(n)) = \tilde{G}(\alpha(n), \ldots, \alpha(1)).
\tag{2.9}
\]

Any such symmetric quantization $\tilde{G}$ defines a self-adjoint operator

\[
G(X, Y) = \sum_{n=0}^{N} \ \sum_{\alpha = (\alpha(1), \ldots, \alpha(n)) \in \mathcal{I}} \tilde{G}(\alpha) \, X_{\alpha(1)} \cdots X_{\alpha(n)}
\tag{2.10}
\]

taking $X_1 \equiv X$ and $X_2 \equiv Y$. In the thermodynamic limit, one expects different quantizations of $g$ to be equivalent:

Lemma 2.1. Let $\tilde{G}$ and $\tilde{G}'$ be any two quantizations of $g : \mathrm{Ran}(X, Y) \to \mathbb{R}$. Then

\[
\| G(\bar{X}_\Lambda, \bar{Y}_\Lambda) - G'(\bar{X}_\Lambda, \bar{Y}_\Lambda) \| \le \frac{C_g(X, Y)}{|\Lambda|}
\tag{2.11}
\]

for some $C_g(X, Y) < \infty$, and for all finite volumes $\Lambda$.


Proof. This is a simple consequence of the fact that the commutator of macroscopic observables vanishes in the thermodynamic limit, more precisely,

\[
\| [\bar{X}_\Lambda, \bar{Y}_\Lambda] \| \le \frac{1}{|\Lambda|} \|X\| \, |\mathrm{Supp}\, X| \times \|Y\| \, |\mathrm{Supp}\, Y|.
\tag{2.12}
\]
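The notion of quantization in (2.8)–(2.10) can be made concrete with a small data structure. The sketch below is only illustrative: it assumes real coefficients and represents a map $\tilde{G}$ as a Python dictionary from words over $\{1, 2\}$ (tuples) to coefficients; the helper names are hypothetical. It shows two quantizations of $g(x_1, x_2) = x_1 x_2$, the ordered product $XY$ and the symmetrized product $(XY + YX)/2$, whose difference as operators is $\frac{1}{2}[X, Y]$ and is therefore controlled by the commutator bound (2.12) when evaluated on empirical averages.

```python
def evaluate_commutative(quantization, x1, x2):
    """Left-hand side of (2.8): sum of G~(alpha) * x_{alpha(1)} ... x_{alpha(n)} at numbers."""
    values = {1: x1, 2: x2}
    total = 0.0
    for word, coeff in quantization.items():
        term = coeff
        for letter in word:
            term *= values[letter]
        total += term
    return total

def is_symmetric(quantization):
    """Condition (2.9): G~ takes the same value on each word and on its reversal."""
    return all(quantization.get(word[::-1], 0.0) == coeff
               for word, coeff in quantization.items())

# Two quantizations of g(x1, x2) = x1 * x2; words are tuples over {1, 2}.
ordered = {(1, 2): 1.0}                   # G(X, Y) = X Y, not symmetric
symmetrized = {(1, 2): 0.5, (2, 1): 0.5}  # G(X, Y) = (X Y + Y X) / 2
```

Both dictionaries reproduce the same commutative polynomial, but only `symmetrized` satisfies the symmetry condition and hence defines a self-adjoint operator.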

Indeed, our results, Theorems 3.1 and 3.2, do not depend on the choice of quantization. This can also be checked a priori using the above lemma and the log-trace inequality in (3.11).

2.3. Infinite-volume states

A state $\omega_\Lambda$ is a positive linear functional on $\mathcal{B}_\Lambda$, normalized by $\|\omega_\Lambda\| = \omega_\Lambda(1) = 1$. An example is the tracial state, $\omega_\Lambda(\,\cdot\,) \sim \mathrm{Tr}_\Lambda(\,\cdot\,)$. In general we consider states $\omega_\Lambda$ as characterized by their density matrix $\sigma_\Lambda$, $\omega_\Lambda(\,\cdot\,) = \mathrm{Tr}_\Lambda(\sigma_\Lambda\, \cdot\,)$. An infinite volume state $\omega$ is a positive normalized functional on the $C^*$-algebra $\mathcal{U}$ (the quasi-local algebra). It is translation invariant when $\omega(A) = \omega(\tau_j A)$ for all $j \in \mathbb{Z}^d$ and $A \in \mathcal{U}$. A translation-invariant state $\omega$ is ergodic whenever it is an extremal point in the convex set of translation invariant states. A state is called symmetric whenever it is invariant under a permutation of the lattice sites, that is, for any sequence of one-site observables $A_1, \ldots, A_n \in \mathcal{B}_{\{0\}} \subset \mathcal{U}$ and $i_1, \ldots, i_n \in \mathbb{Z}^d$

\[
\omega(\tau_{i_1}(A_1) \tau_{i_2}(A_2) \cdots \tau_{i_n}(A_n)) = \omega(\tau_{i_{\pi(1)}}(A_1) \tau_{i_{\pi(2)}}(A_2) \cdots \tau_{i_{\pi(n)}}(A_n))
\tag{2.13}
\]

for any permutation $\pi$ of the set $\{1, \ldots, n\}$. The set of ergodic/symmetric states on $\mathcal{U}$ is denoted by $S_{\mathrm{erg}}$, $S_{\mathrm{sym}}$, respectively. At some point we will need the theorem by Størmer [21] that states that any $\omega \in S_{\mathrm{sym}}$ can be decomposed as

\[
\omega = \int_{\mathrm{prod.}} d\nu_\omega(\phi) \, \phi
\tag{2.14}
\]

for some regular probability measure $\nu_\omega$ whose support consists of product states. Of course, the set of product states can be identified with the (finite-dimensional) set of states on the one-site algebra $\mathcal{B}_{\{0\}} = \mathcal{B}(\mathcal{H})$. For a finite-volume state $\omega_\Lambda$ on $\mathcal{B}_\Lambda$, we consider the entropy functional

\[
S(\omega_\Lambda) \equiv S_\Lambda(\omega_\Lambda) = -\mathrm{Tr}\, \sigma_\Lambda \log \sigma_\Lambda.
\tag{2.15}
\]

The mean entropy of a translation-invariant infinite-volume state $\omega$ is defined as

\[
s(\omega) := \lim_{\Lambda \uparrow \mathbb{Z}^d} \frac{1}{|\Lambda|} S(\omega_\Lambda), \quad \text{with } \omega_\Lambda := \omega|_{\mathcal{B}_\Lambda} \text{ (restriction to } \Lambda).
\tag{2.16}
\]

In this formula and in the rest of the paper, the limit $\lim_{\Lambda \uparrow \mathbb{Z}^d}$ is meant in the sense of Van Hove, see, e.g., [12, 20]. Standard properties of the functional $s$ are its affinity and upper semicontinuity (with respect to the weak$^*$-topology on states).


¯ and Y¯ , postponing In Sec. 2.1, we mentioned the observables at inﬁnity’ X l ¯k ¯ their deﬁnition to the present section. Expressions like ω(X Y ) (for some positive numbers l, k) can be deﬁned as ¯ l Y¯ k ) := ω(X

lim

Λ,Λ Zd

¯ l Y¯ k ), ω(X Λ Λ

(2.17)

provided that the limit exists. We use the following standard result that can be viewed as a non-commutative law of large numbers Lemma 2.2. For ω ∈ Serg , the limit (2.17) exists and ¯ l Y¯ k ) = [ω(X)]l [ω(Y )]k . ω(X

(2.18)

¯ and ω(Y ) = ω(Y¯ ) by translation invariance. An immeNote that ω(X) = ω(X) diate corollary is that for a non-commutative polynomial G which is a quantization of g (see Sec. 2.2), and ω ∈ Serg : ¯ Y¯ )) = g(ω(X), ω(Y )). ω(G(X,

(2.19)

For the convenience of the reader, we sketch the proof of Lemma 2.2 in the Appendix. Finally, we note that Lemma 2.2 does not require the state ω to be trivial at inﬁnity. Triviality at inﬁnity is a stronger notion which is not used in the present paper. In particular, the state µ ¯ constructed in Sec. 4 is ergodic, but not trivial at inﬁnity, since it fails to be ergodic with respect to a subgroup of lattice translations. 3. Result Choose X, Y to be local operators and let HΛΦ be the Hamiltonian corresponding ˜ be a symto a ﬁnite-range, translation invariant interaction Φ, as in Sec. 2.1. Let G metric quantization of a polynomial g on the rectangle Ran(X, Y ) and G( ·, · ) the corresponding self-adjoint operator, as deﬁned in Sec. 2.2. We deﬁne the “G-mean ﬁeld partition function” ¯

¯

ZΛG (Φ) := TrΛ (e−HΛ +|Λ| G(XΛ ,YΛ ) )

(3.1)

¯ Λ , Y¯Λ empirical averages of X, Y . The following theorem is our main result: with X Theorem 3.1. Define the pressure p(u, v) = lim

ΛZd

Φ 1 log TrΛ e−HΛ +uXΛ +vYΛ |Λ|

(3.2)

and its Legendre transform I(x, y) =

sup (ux + vy − p(u, v)).

(3.3)

(u,v)∈R2

Then lim

ΛZd

1 log ZΛG (Φ) = sup (g(x, y) − I(x, y)) |Λ| (x,y)∈R2

(3.4)


where the limit $\Lambda \uparrow \mathbb{Z}^d$ is in the sense of Van Hove, as in (3.2). In particular, the left-hand side of (3.4) does not depend on the particular form of quantization taken.

As discussed in Sec. 1, our result expresses the pressure of the mean field Hamiltonian through a variational principle. To derive this result, it is helpful to represent this pressure first as a variational problem on a larger space, namely that of ergodic states, as in Theorem 3.2. Theorem 3.1 follows then by parametrizing these states by their values on $X$ and $Y$. We also need the "local energy operator" associated to the interaction $\Phi$ as

\[
E_\Phi := \sum_{A \ni 0} \frac{1}{|A|} \Phi_A.
\tag{3.5}
\]
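To get a feeling for the variational formula (3.4), it can be evaluated numerically in a toy case. The sketch below rests on several assumptions that are not taken from the paper: a one-body Hamiltonian $H_\Lambda = -h \sum_i \sigma^x_i$, the single observable $X = \sigma^z$ (no $Y$), and $g(x) = \lambda x^2$ with $\lambda > 0$. Under these assumptions the pressure (3.2) is $p(u) = \log(2\cosh\sqrt{h^2 + u^2})$, and elementary convex duality suggests that $\sup_x (\lambda x^2 - I(x))$ should coincide with $\sup_m (p(2\lambda m) - \lambda m^2)$. The code compares the two sides on grids; all function names and grid parameters are illustrative.

```python
import math

def pressure(u, h):
    """p(u) = log Tr exp(h*sigma_x + u*sigma_z) = log(2 cosh(sqrt(h^2 + u^2)))."""
    r = math.hypot(h, u)
    return r + math.log1p(math.exp(-2.0 * r))

def rate(x, h, u_max=10.0, steps=2001):
    """Legendre transform I(x) = sup_u (u*x - p(u)), approximated on a grid in u."""
    best = -math.inf
    for i in range(steps):
        u = -u_max + 2.0 * u_max * i / (steps - 1)
        best = max(best, u * x - pressure(u, h))
    return best

def variational_side(lam, h, steps=199):
    """sup_x (lam*x^2 - I(x)) on a grid of x in [-0.99, 0.99]."""
    best = -math.inf
    for i in range(steps):
        x = -0.99 + 1.98 * i / (steps - 1)
        best = max(best, lam * x * x - rate(x, h))
    return best

def dual_side(lam, h, steps=2001):
    """sup_m (p(2*lam*m) - lam*m^2) on a grid of m in [-1, 1]."""
    best = -math.inf
    for i in range(steps):
        m = -1.0 + 2.0 * i / (steps - 1)
        best = max(best, pressure(2.0 * lam * m, h) - lam * m * m)
    return best
```

The two functions `variational_side` and `dual_side` should agree up to grid error; the second form is the familiar self-consistent mean-field expression, which makes the role of the Legendre transform $I$ in (3.3) more tangible.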

Theorem 3.2 (Mean-Field Variational Principle). Let $s(\,\cdot\,)$ be the mean entropy functional, as in Sec. 2.3. Then

\[
\lim_{\Lambda \uparrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log Z_\Lambda^G(\Phi) = \sup_{\omega \in S_{\mathrm{erg}}} (g(\omega(X), \omega(Y)) + s(\omega) - \omega(E_\Phi)).
\tag{3.6}
\]

To understand how the first term on the right-hand side of (3.6) originates from (3.1), we recall the equality (2.19) for ergodic states $\omega$. The proof of Theorem 3.2 is postponed to Secs. 5 and 6. Here we prove that Theorem 3.1 is a rather immediate consequence of Theorem 3.2.

Proof of Theorem 3.1. We write the right-hand side of (3.6) in the form

\[
\sup_{(x, y) \in \mathbb{R}^2} (g(x, y) - \tilde{I}(x, y))
\tag{3.7}
\]

where

\[
\tilde{I}(x, y) = \inf_{\substack{\omega \in S_{\mathrm{erg}} \\ \omega(X) = x,\ \omega(Y) = y}} (-s(\omega) + \omega(E_\Phi))
\tag{3.8}
\]

is a convex function on $\mathbb{R}^2$, infinite on the complement of $\mathrm{Ran}(X, Y)$. To establish that $\tilde{I}(x, y)$ is lower semi-continuous (l.s.c.), we proceed as in the proof of the contraction principle in large deviation theory, see, e.g., [5]: The function $\omega \mapsto (-s(\omega) + \omega(E_\Phi))$ is l.s.c. and the set $\{\omega \in S_{\mathrm{erg}},\ \omega(X) = x,\ \omega(Y) = y\}$ is compact by the continuity of $\omega \mapsto (\omega(X), \omega(Y))$ (compactness and continuity with respect to the weak$^*$-topology). Therefore, the infimum is attained and we can deduce that

\[
\{ (x, y) \mid \tilde{I}(x, y) \le a \} = F(\{ \omega \in S_{\mathrm{erg}} \mid -s(\omega) + \omega(E_\Phi) \le a \})
\tag{3.9}
\]

where $F : \omega \mapsto (\omega(X), \omega(Y))$. The level set on the left-hand side is closed and hence $\tilde{I}$ is l.s.c.


By using the infinite-volume Gibbs variational principle [12, 20], the Legendre–Fenchel transform of $\tilde{I}$ reads

\[
\sup_{(x, y) \in \mathbb{R}^2} (ux + vy - \tilde{I}(x, y)) = \sup_{\omega \in S_{\mathrm{erg}}} (s(\omega) - \omega(E_\Phi) + u\, \omega(X) + v\, \omega(Y)) = p(u, v).
\tag{3.10}
\]

The equality $I = \tilde{I}$ then follows by the involution property of the Legendre–Fenchel transform on the set of convex lower-semicontinuous functions, see, e.g., [20].

Independence of boundary conditions. Observe that both Theorems 3.1 and 3.2 have been formulated for the finite volume Gibbs states with open boundary conditions. It is however easy to check that this choice is not essential and other equivalent formulations can be obtained. Indeed, by the standard log-trace inequality,

\[
\big| \log \mathrm{Tr}_\Lambda \big( e^{-\beta H_\Lambda + W_\Lambda + |\Lambda| G(\bar{X}_\Lambda, \bar{Y}_\Lambda)} \big) - \log \mathrm{Tr}_\Lambda \big( e^{-\beta H_\Lambda + |\Lambda| G(\bar{X}_\Lambda, \bar{Y}_\Lambda)} \big) \big| \le \|W_\Lambda\|
\tag{3.11}
\]

and hence if one chooses $W_\Lambda$ such that $\lim_{\Lambda \uparrow \mathbb{Z}^d} \|W_\Lambda\| / |\Lambda| = 0$, then we can replace $-\beta H_\Lambda$ by $-\beta H_\Lambda + W_\Lambda$ in Theorems 3.1 and 3.2.

Finite-range restrictions. It is obvious that our paper contains some restrictions that are not essential. In particular, by standard estimates (in particular, those used to prove the existence of the pressure, see, e.g., [20]) one can relax the finite-range conditions on the interaction $\Phi$ to the condition that

\[
\sum_{A \ni 0} \frac{\|\Phi_A\|}{|A|} < \infty,
\tag{3.12}
\]

and similarly for the local observables $X, Y$. Moreover, it is not necessary that $G$ is a non-commutative polynomial. Starting from (3.11), one checks that it suffices that $G$ can be approximated in operator norm by non-commutative polynomials.

4. Approximation by Ergodic States

In this section, we describe a construction that is the main ingredient of our proofs, as well as of those in [10, 17]. This construction will be used in Secs. 6 and 7. Let $V$ be a hypercube centered at the origin, i.e. $V = [-L, L]^d$ for some $L > 1$ and let

\[
\partial V := \{ i \in V \mid \exists\, i' \in \mathbb{Z}^d \setminus V \text{ such that } i, i' \text{ are nearest neighbors} \}.
\tag{4.1}
\]

We write

\[
\mathbb{Z}^d / V = ((2L + 1)\mathbb{Z})^d
\tag{4.2}
\]

to denote the "block lattice" whose points can be thought of as translates of $V$. In other words, $\mathbb{Z}^d = \cup_{i \in \mathbb{Z}^d / V} (V + i)$. Consider a state $\mu_V$ on $\mathcal{B}_V$.


We aim to build an infinite-volume ergodic state out of $\mu_V$. First, we define the block product state

\[
\tilde{\mu} := \bigotimes_{\mathbb{Z}^d / V} \mu_V.
\tag{4.3}
\]

We define also the translation-average of $\tilde{\mu}$,

\[
\bar{\mu} := \frac{1}{|V|} \sum_{j \in V} \tilde{\mu} \circ \tau_j.
\tag{4.4}
\]

We can now check the following properties:

• We have the exact equality of entropies

\[
s(\bar{\mu}) = s(\tilde{\mu}) = \frac{1}{|V|} S(\mu_V).
\tag{4.5}
\]

This follows from the affinity of the entropy in infinite-volume. A remark is in order: A priori, the infinite-volume entropy is defined for translation-invariant states, whereas $\tilde{\mu}$ is only periodic. However, one easily sees that the entropy can still be defined, e.g. by viewing $\tilde{\mu}$ as a translation-invariant state on the block lattice $\mathbb{Z}^d / V$, and correcting the definition by dividing by $|V|$.

• The state $\bar{\mu}$ is ergodic. This follows, for example, from an explicit calculation that is presented in [10]. Note however that $\bar{\mu}$ is in general not ergodic with respect to the translations over the sublattice $\mathbb{Z}^d / V = ((2L + 1)\mathbb{Z})^d$. This phenomenon (though in a slightly different setting) is commented upon in [20] (the end of Sec. III.5).

• The state $\bar{\mu}$ is a good approximation of $\mu_V$ for observables which are empirical averages, provided $V$ is large. Consider the local observable $X$ as in Sec. 2.1. A translate $\tau_j X$ can lie inside a translate of $V$, i.e. $\mathrm{Supp}\, \tau_j X \subset V + i$ for some $i \in \mathbb{Z}^d / V$, or it can lie on the boundary between two translates of $V$. The difference between $\bar{\mu}(X) = \bar{\mu}(\bar{X})$ and $\mu_V(\bar{X}_V)$ clearly stems from those translates where $X$ lies on a boundary, and the fraction of such translates is bounded by

\[
|\mathrm{Supp}\, X| \times \frac{|\partial V|}{|V|}.
\tag{4.6}
\]

Hence

\[
|\bar{\mu}(\bar{X}) - \mu_V(\bar{X}_V)| \le \|X\| \, |\mathrm{Supp}\, X| \times \frac{|\partial V|}{|V|}.
\tag{4.7}
\]
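The bounds (4.6) and (4.7) are useful because the boundary-to-volume ratio $|\partial V| / |V|$ of a hypercube vanishes as $L$ grows. The following sketch, with hypothetical function names, computes this ratio for $V = \{-L, \ldots, L\}^d$ with $L \ge 1$, both in closed form and by direct enumeration. It uses the observation that a site of $V$ has a nearest neighbor outside $V$ exactly when one of its coordinates equals $\pm L$.

```python
from itertools import product

def boundary_fraction(L, d):
    """|dV| / |V| for V = {-L, ..., L}^d with L >= 1, in closed form.

    A site lies in dV exactly when at least one coordinate equals -L or L,
    so |dV| = (2L+1)^d - (2L-1)^d.
    """
    side = 2 * L + 1
    return (side ** d - (side - 2) ** d) / side ** d

def boundary_fraction_brute_force(L, d):
    """Same ratio, obtained by enumerating all sites of V."""
    sites = list(product(range(-L, L + 1), repeat=d))
    boundary = 0
    for site in sites:
        # a nearest neighbour along some axis leaves V iff |coordinate| == L
        if any(abs(coordinate) == L for coordinate in site):
            boundary += 1
    return boundary / len(sites)
```

For fixed $d$ the ratio behaves like $2d / (2L + 1)$ for large $L$, which is the rate at which the error terms in (4.7) disappear.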

5. The Lower Bound

In this section, we prove the following lower bound.

Lemma 5.1. Recall $Z_\Lambda^G(\Phi)$ as defined in (3.1). Then

\[
\liminf_{\Lambda \uparrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log Z_\Lambda^G(\Phi) \ge \sup_{\omega \in S_{\mathrm{erg}}} (g(\omega(X), \omega(Y)) + s(\omega) - \omega(E_\Phi))
\tag{5.1}
\]

where all symbols have the same meaning as in Sec. 3.


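Proof. Consider a state $\omega \in S_{\mathrm{erg}}$. We show that

\[
\liminf_{\Lambda \uparrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log Z_\Lambda^G(\Phi) \ge g(\omega(X), \omega(Y)) + s(\omega) - \omega(E_\Phi).
\tag{5.2}
\]

Consider, for each volume $\Lambda$, the restriction $\omega_\Lambda := \omega|_{\mathcal{B}_\Lambda}$. By the finite-volume variational principle (see, e.g., [2, Proposition 6.2.22]),

\[
\frac{1}{|\Lambda|} \log Z_\Lambda^G(\Phi) \ge \omega_\Lambda(G(\bar{X}_\Lambda, \bar{Y}_\Lambda)) + \frac{1}{|\Lambda|} S(\omega_\Lambda) - \frac{1}{|\Lambda|} \omega_\Lambda(H_\Lambda).
\tag{5.3}
\]

The following convergence properties apply with $\Lambda \uparrow \mathbb{Z}^d$ in the sense of Van Hove:

\[
\text{(1)} \quad \omega_\Lambda(G(\bar{X}_\Lambda, \bar{Y}_\Lambda)) = \omega(G(\bar{X}_\Lambda, \bar{Y}_\Lambda)) \to g(\omega(X), \omega(Y)),
\tag{5.4}
\]

\[
\text{(2)} \quad \frac{1}{|\Lambda|} S(\omega_\Lambda) \to s(\omega),
\tag{5.5}
\]

\[
\text{(3)} \quad \frac{1}{|\Lambda|} \omega(H_\Lambda) \to \omega(E_\Phi).
\tag{5.6}
\]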

The relation (5.6) is obvious from the finite range condition on $\Phi$, see Sec. 2.1. The convergence (5.5) is the definition of the mean entropy $s$. Finally, (5.4) follows from the ergodicity of $\omega$ as explained in Sec. 2.3. The relation (5.2) now follows immediately, since one can repeat the above construction for any ergodic state $\omega$.

6. The Upper Bound

6.1. Reduction to product states

In this section, we outline how to approximate

\[
\frac{1}{|\Lambda|} \log Z_\Lambda^G(\Phi)
\tag{6.1}
\]

by a similar expression involving the partition function of a block-product state. Fix a hypercube $V = [-L, L]^d$ and cover the lattice with its translates, as explained in Sec. 4. From now on, $\Lambda$ is chosen such that it is a multiple of $V$. One can easily adapt the arguments so as to cover the case where $\Lambda$ tends to infinity in the sense of Van Hove (as one has to do as well in the proof of the existence of the pressure for local interactions, see [12]). Define the observables

\[
H_\Lambda^V \equiv H_\Lambda^{\Phi, V}, \quad \bar{X}_\Lambda^V, \quad \bar{Y}_\Lambda^V
\tag{6.2}
\]

by cutting all terms that connect any two translates of $V$, i.e.

\[
\bar{X}_\Lambda^V := \frac{1}{|\Lambda|} \sum_{\substack{j \in \Lambda \\ \exists\, i \in \mathbb{Z}^d / V :\ \mathrm{Supp}\, \tau_j X \subset V + i}} \tau_j X,
\tag{6.3}
\]

and analogously for $H_\Lambda^V$ and $\bar{Y}_\Lambda^V$. One can say that these observables with superscript $V$ are one-block observables with the blocks being translates of $V$. One easily


derives that

\[
\| \bar{X}_\Lambda^V - \bar{X}_\Lambda \| \le \|X\| \, |\mathrm{Supp}\, X| \frac{|\partial V|}{|V|}, \qquad \| H_\Lambda^V - H_\Lambda \| \le r(\Phi) |\Lambda| \frac{|\partial V|}{|V|}
\tag{6.4}
\]

with the number $r(\Phi)$ as defined in Sec. 2.1. Using the log-trace inequality, we bound

\[
\Big| \frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda \big( e^{-H_\Lambda + |\Lambda| G(\bar{X}_\Lambda, \bar{Y}_\Lambda)} \big) - \frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda \big( e^{-H_\Lambda^V + |\Lambda| G(\bar{X}_\Lambda^V, \bar{Y}_\Lambda^V)} \big) \Big|
\tag{6.5}
\]

as follows

\[
(6.5) \le \frac{1}{|\Lambda|} \| H_\Lambda - H_\Lambda^V \| + \| G(\bar{X}_\Lambda, \bar{Y}_\Lambda) - G(\bar{X}_\Lambda^V, \bar{Y}_\Lambda^V) \| \le \big( r(\Phi) + C_g ( \|X\| \, |\mathrm{Supp}\, X| + \|Y\| \, |\mathrm{Supp}\, Y| ) \big) \frac{|\partial V|}{|V|}
\]

where $C_g$ is a constant depending on the function $G$. The second term of (6.5) is clearly the pressure of a product state with mean field interaction. We will find an upper bound for this pressure by slightly extending the treatment of Petz et al. in [17]. We prove an "extended PRV"-lemma, Lemma 6.1 in the next section.

6.2. The extended Petz–Raggio–Verbeure upper bound

In this section, we outline the bound from above on the quantity

\[
\frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda \big( e^{-H_\Lambda^V + |\Lambda| G(\bar{X}_\Lambda^V, \bar{Y}_\Lambda^V)} \big)
\tag{6.6}
\]

that appeared in (6.5). To do this, let us make the setting slightly more abstract. Consider the lattice $\mathbb{Z}^d$ with the one-site Hilbert space $\mathcal{G}$ given by

\[
\mathcal{G} := \bigotimes_{V} \mathcal{H}.
\tag{6.7}
\]

In words, $\mathbb{Z}^d$ should be thought of as the block lattice $\mathbb{Z}^d / V$. Let $D, A, B$ be one-site observables on the new lattice, i.e. $D, A, B$ are Hermitian operators on $\mathcal{G}$. The extended PRV (Petz–Raggio–Verbeure) lemma reads as follows.

Lemma 6.1 (Extended PRV). Let all symbols have the same meaning as in Secs. 2.1–2.3, except that the one-site Hilbert space is changed from $\mathcal{H}$ to $\mathcal{G}$. Then

\[
\limsup_{\Lambda \uparrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda \big( e^{-D_\Lambda + |\Lambda| G(\bar{A}_\Lambda, \bar{B}_\Lambda)} \big) \le \sup_{\omega \in S_{\mathrm{sym}}} (\omega(G(\bar{A}, \bar{B})) + s(\omega) - \omega(D)).
\tag{6.8}
\]

In particular, $\omega(G(\bar{A}, \bar{B}))$, defined as in (2.17), exists.

To appreciate the similarity between (6.8) and (3.6), one should realize that $D$ is a local energy operator, as $E_\Phi$ in (3.6). The proof of this lemma in the case that $A = B$ is in the original paper [17]. The proof for the more general case is presented


in Sec. 7. Of course, one can prove that the right-hand side of (6.8) is also a lower bound: it suffices to copy Sec. 5. By the Størmer theorem, see (2.14), each symmetric state $\omega$ on $\mathcal{U}$ can be written as the barycenter of a regular probability measure on the product states, and since all terms on the right-hand side of (6.8) are affine and upper semicontinuous functions of $\omega$, it follows that the sup can be restricted to product states (see [17] for the fine details of this argument). Since, moreover, all product states are ergodic, we can replace $\omega(G(\bar{A}, \bar{B}))$ by $g(\omega(A), \omega(B))$. Hence, Lemma 6.1 implies that

\[
\limsup_{\Lambda \uparrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda \big( e^{-D_\Lambda + |\Lambda| G(\bar{A}_\Lambda, \bar{B}_\Lambda)} \big) \le \sup_{\omega \text{ prod.}} (g(\omega(A), \omega(B)) + s(\omega) - \omega(D)).
\tag{6.9}
\]

6.2.1. From the extended PRV to the upper bound

Next, we use (6.9) to formulate an upper bound on the quantity

\[
\frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda \big( e^{-H_\Lambda^V + |\Lambda| G(\bar{X}_\Lambda^V, \bar{Y}_\Lambda^V)} \big)
\tag{6.10}
\]

for $\Lambda$ a multiple of $V$. This means that we have to recall that the lattice sites in (6.9) are in fact blocks. We write $\Lambda^* := \Lambda / V$ and choose

\[
D := H_V, \qquad A := \bar{X}_V, \qquad B := \bar{Y}_V.
\]

Then, by the extended PRV,

\[
(6.10) \le \sup_{\omega \text{ prod. on } \mathcal{B}(\Lambda^*)} \Big( g(\omega(A), \omega(B)) + \frac{1}{|V|} s^*(\omega) - \frac{1}{|V|} \omega(D) \Big) = \sup_{\omega_V \text{ on } \mathcal{B}_V} \Big( g(\omega_V(\bar{X}_V), \omega_V(\bar{Y}_V)) + \frac{1}{|V|} S(\omega_V) - \frac{1}{|V|} \omega_V(H_V) \Big)
\]

where $s^*$ indicates that this is the entropy density on the block lattice $\Lambda^*$, hence it should be divided by $|V|$ to obtain the density on $\Lambda$. Now, let $\tilde{\omega}$ be the infinite-volume state obtained by taking a block-product over states $\omega_V$ and let $\bar{\omega}$ be its "translation-average", as in Sec. 4. By the conclusions of Sec. 4, it follows that $\bar{\omega}$ is ergodic and $s(\bar{\omega}) = \frac{1}{|V|} S(\omega_V)$, cf. (4.5). Also, we see that

\[
|\omega_V(\bar{X}_V) - \bar{\omega}(X)| \le \|X\| \, |\mathrm{Supp}\, X| \frac{|\partial V|}{|V|}, \qquad \Big| \frac{1}{|V|} \omega_V(H_V) - \bar{\omega}(E_\Phi) \Big| \le r(\Phi) \frac{|\partial V|}{|V|}
\]

and analogously for $\bar{Y}_V$. Consequently, we obtain

\[
(6.10) \le \sup_{\omega \in S_{\mathrm{erg}}} (g(\omega(X), \omega(Y)) + s(\omega) - \omega(E_\Phi)) + O\Big( \frac{|\partial V|}{|V|} \Big), \quad V \uparrow \mathbb{Z}^d
\]


which proves the upper bound for Theorem 3.2, since the $O(|\partial V| / |V|)$-term can be made arbitrarily small by increasing $V$.

7. Proof of Lemma 6.1

Let the state $\mu_\Lambda$ on $\mathcal{B}_\Lambda$ be given by

\[
\mu_\Lambda(\,\cdot\,) = \frac{1}{Z_\Lambda^G(D)} \mathrm{Tr}_\Lambda \big( e^{-D_\Lambda + |\Lambda| G(\bar{A}_\Lambda, \bar{B}_\Lambda)} \, \cdot\, \big)
\]

with

\[
Z_\Lambda^G(D) := \mathrm{Tr}_\Lambda \big( e^{-D_\Lambda + |\Lambda| G(\bar{A}_\Lambda, \bar{B}_\Lambda)} \big).
\]

Naturally, $\mu_\Lambda$ is the finite-volume Gibbs state that saturates the variational principle, i.e.

\[
\frac{1}{|\Lambda|} \log Z_\Lambda^G(D) = \sup_{\omega_\Lambda \text{ on } \mathcal{B}_\Lambda} \Big( \omega_\Lambda(G(\bar{A}_\Lambda, \bar{B}_\Lambda)) + \frac{1}{|\Lambda|} S(\omega_\Lambda) - \omega_\Lambda(D) \Big) = \mu_\Lambda(G(\bar{A}_\Lambda, \bar{B}_\Lambda)) + \frac{1}{|\Lambda|} S(\mu_\Lambda) - \mu_\Lambda(D).
\tag{7.1}
\]

Our strategy is to attain the "entropy" and "energy" of the state $\mu_\Lambda$ via ergodic states. For definiteness, we assume that $G$ is of the form

\[
G(\bar{A}_\Lambda, \bar{B}_\Lambda) := [\bar{A}_\Lambda]^k [\bar{B}_\Lambda]^l \quad \text{for some integers } k, l,
\]

(which, strictly speaking, is not allowed since $G(\bar{A}_\Lambda, \bar{B}_\Lambda)$ has to be a self-adjoint operator, but this does not matter for the argument in this section). The general case follows by the same argument.

We apply the construction in Sec. 4 to $\mu_\Lambda$, thus obtaining infinite-volume states $\tilde{\mu}$ and $\bar{\mu}$. Since we will repeat the construction for different $\Lambda$, we indicate the $\Lambda$-dependence in $\tilde{\mu}^{\{\Lambda\}}$ and $\bar{\mu}^{\{\Lambda\}}$, but remembering that these are states on the infinite lattice. They satisfy

\[
s(\bar{\mu}^{\{\Lambda\}}) = \frac{1}{|\Lambda|} S(\mu_\Lambda).
\tag{7.2}
\]

We have also established in Sec. 4 that $\bar{\mu}^{\{\Lambda\}}$ is ergodic and that the states $\bar{\mu}^{\{\Lambda\}}$ and $\tilde{\mu}^{\{\Lambda\}}$ approximate $\mu_\Lambda$ for observables which are empirical averages. However, we cannot conclude yet that they have comparable values for $G(\bar{A}, \bar{B})$, except in the case where $G$ is linear. Essentially, such a comparison is achieved next by using the fact that $\mu_\Lambda$ is symmetric. Choose a sequence of volumes $\Lambda_n$ such that along that sequence the right-hand side of (7.1) converges. We assume that $\bar{\mu}^{\{\Lambda_n\}}$ has a weak$^*$-limit, as $n \uparrow \infty$, which can always be achieved (by the weak$^*$-compactness) by restricting to a subsequence of $\Lambda_n$. We call this limit $\mu$. By construction, it is a symmetric state.


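Energy estimate. Since $\bar{\mu}^{\{\Lambda_n\}} \to \mu$, in the weak$^*$-topology, and $\bar{\mu}^{\{\Lambda_n\}}(D) = \mu_{\Lambda_n}(D)$, we have

\[
\mu_{\Lambda_n}(D) \to \mu(D).
\tag{7.3}
\]

G-estimates. Using the symmetry of the state $\mu_\Lambda$, we estimate

\[
|\mu_\Lambda(G(\bar{A}_\Lambda, \bar{B}_\Lambda)) - \mu_\Lambda(\otimes^k A \otimes^l B)| \le \frac{(k + l)^2}{|\Lambda|} \max(\|A\|, \|B\|)^{k+l} + O\Big( \frac{c(k, l)}{|\Lambda|^2} \Big), \quad |\Lambda| \uparrow \infty,
\tag{7.4}
\]

where the tensor products

\[
\otimes^k A \otimes^l B := \underbrace{A \otimes \cdots \otimes A}_{k \text{ copies}} \otimes \underbrace{B \otimes \cdots \otimes B}_{l \text{ copies}}
\tag{7.5}
\]

denote that all one-site operators are placed on different sites. Since $\mu_\Lambda$ is symmetric, we need not specify on which sites. The error term of order $1/|\Lambda|$ comes from those terms in the expansion of the monomial containing a product of $k + l$ one-site operators but only involving $k + l - 1$ sites. Since $\mu$ is symmetric, we obtain analogously that

\[
\mu(G(\bar{A}, \bar{B})) = \mu(\otimes^k A \otimes^l B).
\tag{7.6}
\]

In particular, the left-hand side is well-defined. Hence, by combining (7.4) and (7.6), we obtain

\[
\mu_{\Lambda_n}(G(\bar{A}_{\Lambda_n}, \bar{B}_{\Lambda_n})) \to \mu(G(\bar{A}, \bar{B})).
\tag{7.7}
\]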
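The origin of the $1/|\Lambda|$ error in the G-estimate can be illustrated by a counting argument. Expanding $[\bar{A}_\Lambda]^k [\bar{B}_\Lambda]^l$ for one-site operators gives a normalized sum over $(k + l)$-tuples of sites; tuples with pairwise distinct sites contribute $\mu_\Lambda(\otimes^k A \otimes^l B)$ by symmetry, and only tuples with a repeated site produce an error. The sketch below, with hypothetical function names and writing $m = k + l$, computes the fraction of tuples with distinct entries and compares the complementary fraction with the elementary union bound $m(m-1)/(2|\Lambda|)$; this is an illustration of the mechanism, not a derivation of the precise constants in the estimate.

```python
def distinct_fraction(n_sites, m):
    """Fraction of m-tuples of sites (out of n_sites**m) with pairwise distinct entries."""
    frac = 1.0
    for r in range(m):
        frac *= (n_sites - r) / n_sites
    return frac

def coincidence_bound(n_sites, m):
    """Union bound m(m-1)/(2*n_sites) on the fraction of tuples with a repeated site."""
    return m * (m - 1) / (2 * n_sites)
```

For fixed $m$ the fraction of tuples with a repeated site decays like $1/|\Lambda|$, in line with the statement that the leading error comes from products involving only $k + l - 1$ sites.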

For a more general non-commutative polynomial $G$ as defined in Sec. 2.2 (not necessarily a monomial), the convergence (7.7) follows easily since $G(\bar{A}_{\Lambda_n}, \bar{B}_{\Lambda_n})$ can be approximated in operator norm by polynomials.

Entropy estimates. As established in Sec. 4, we have

\[
\frac{1}{|\Lambda|} S(\mu_\Lambda) = s(\bar{\mu}^{\{\Lambda\}}), \quad \text{for all } \Lambda.
\tag{7.8}
\]

By the upper semi-continuity of the infinite-volume entropy and the convergence $\bar{\mu}^{\{\Lambda_n\}} \to \mu$, we get that

\[
\limsup_{n \uparrow \infty} s(\bar{\mu}^{\{\Lambda_n\}}) \le s(\mu).
\tag{7.9}
\]

Hence

\[
\lim_{n \uparrow \infty} \frac{1}{|\Lambda_n|} S(\mu_{\Lambda_n}) \le s(\mu).
\tag{7.10}
\]

By combining the convergence results (7.3), (7.7) and (7.10), we have proven that there is a symmetric state $\mu$ such that the right-hand side of (6.8) with $\omega \equiv \mu$ is larger than a given limit point of the right-hand side of (7.1). Since the construction can be repeated for any limit point, this concludes the proof of Lemma 6.1.


Acknowledgment

The authors thank M. Fannes, M. Mosonyi, Y. Ogata, D. Petz and A. Verbeure for fruitful discussions. K. N. is also grateful to the Instituut voor Theoretische Fysica, K. U. Leuven, and to Budapest University of Technology and Economics for kind hospitality, and acknowledges the support from the Grant Agency of the Czech Republic (Grant no. 202/07/J051). W. D. R. was a postdoctoral fellow of the FWO-Flanders at the time when the paper was written and he acknowledges the financial support. L. R. B. acknowledges the support of the NSF (DMS-0605058).

Appendix. Proof of Lemma 2.2

To prove Lemma 2.2, it is convenient to introduce an extended framework: Let $\pi_\omega$ be the cyclic GNS-representation associated to the state $\omega$, $\mathcal H_\omega$ the associated Hilbert space and $\psi \in \mathcal H_\omega$ the vector representing the state $\omega$, i.e.

$$ \omega(A) = \langle \psi, \pi_\omega(A)\psi \rangle_{\mathcal H_\omega}, \qquad A \in \mathcal U. \tag{A.1} $$

The set $\pi_\omega(\mathcal U)$ is a subalgebra of $B(\mathcal H_\omega)$. Let $U_j$, $j \in \mathbb Z^d$, be the unitary representation of the translation group induced on $\pi_\omega(\mathcal U)$, i.e.

$$ U_j \pi_\omega(A) U_j^* = \pi_\omega(\tau_j A). \tag{A.2} $$

Ergodicity of $\omega$ implies (see, e.g., the proof of [20, Theorem III.1.8]) that

$$ \frac{1}{|\Lambda|} \sum_{j \in \Lambda} U_j \xrightarrow[\Lambda \nearrow \mathbb Z^d]{\text{strongly}} P_\psi \tag{A.3} $$

where $P_\psi$ is the one-dimensional orthogonal projector associated to the vector $\psi$, and $\Lambda \nearrow \mathbb Z^d$ in the sense of Van Hove. Using (A.3) and the translation-invariance $U_j \psi = \psi$, one calculates

$$ \pi(\bar X_\Lambda)\pi(\bar Y_\Lambda)\psi = \frac{1}{|\Lambda|^2} \sum_{j, j' \in \Lambda} U_j \pi(X) U_{j'-j} \pi(Y) U_{-j'} \psi \xrightarrow[\Lambda \nearrow \mathbb Z^d]{} P_\psi \pi(X) P_\psi \pi(Y)\psi = \omega(X)\omega(Y)\psi $$

for local observables $X, Y \in \mathcal U$. Taking the scalar product with $\psi$, we conclude that $\omega(\bar X_\Lambda \bar Y_\Lambda) \to \omega(X)\omega(Y)$. The same argument works for all polynomials in $\bar X_\Lambda, \bar Y_\Lambda$, thus proving Lemma 2.2. Finally, we remark that one can also construct the operators $\bar X, \bar Y$ as weak∗-limits of $\bar X_\Lambda, \bar Y_\Lambda$, as $\Lambda \nearrow \mathbb Z^d$ (these weak∗-limits are simply multiples of identity: $\omega(X)1$, $\omega(Y)1$). This is however not necessary for our results.

References

[1] H. Araki and P. D. F. Ion, On the equivalence of KMS and Gibbs conditions for states of quantum lattice systems, Comm. Math. Phys. 35 (1974) 1–12.
[2] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics: 2, 2nd edn. (Springer-Verlag, Berlin, 1996).


[3] J.-B. Bru and W. de Siqueira Pedra, Equilibrium states of Fermi systems with long range interactions, in preparation.
[4] R. Heylen, D. Bollé and N. S. Skantzos, Thermodynamics of spin systems on small-world hypergraphs, Phys. Rev. E 74 (2006) 056111.
[5] A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications (Springer, Berlin, 1993).
[6] F. den Hollander, Large Deviations, Fields Institute Monographs, Vol. 14 (Amer. Math. Soc., 2000).
[7] J. D. Deuschel and D. W. Stroock, Large Deviations, Pure and Applied Mathematics, Vol. 137 (Academic Press, Boston, 1989).
[8] R. S. Ellis, Entropy, Large Deviations, and Statistical Mechanics (Springer, 2005).
[9] H.-O. Georgii, Gibbs Measures and Phase Transitions, De Gruyter Studies in Mathematics, Vol. 9 (De Gruyter, 1988).
[10] F. Hiai, M. Mosonyi, H. Ohno and D. Petz, Free energy density for mean field perturbation of states of a one-dimensional spin chain, Rev. Math. Phys. 20 (2008) 335–365.
[11] F. Hiai, M. Mosonyi and O. Tomohiro, Large deviations and Chernoff bound for certain correlated states on the spin chain, J. Math. Phys. 48(12) (2007) 123301–123319.
[12] R. B. Israel, Convexity in the Theory of Lattice Gases, Princeton Series in Physics (Princeton University Press, 1979).
[13] M. Lenci and L. Rey-Bellet, Large deviations in quantum lattice systems: One-phase region, J. Stat. Phys. 119 (2005) 715–746.
[14] K. Netočný and F. Redig, Large deviations for quantum spin systems, J. Stat. Phys. 117 (2004) 521–547.
[15] Y. Ogata, Large deviations in quantum spin chain, arXiv:0803.0113.
[16] S. Olla, Large deviations for Gibbs random fields, Probab. Theory Related Fields 77 (1988) 343–357.
[17] D. Petz, G. A. Raggio and A. Verbeure, Asymptotics of Varadhan-type and the Gibbs variational principle, Comm. Math. Phys. 121 (1989) 271–282.
[18] C.-E. Pfister, Thermodynamical aspects of classical lattice systems, in In and Out of Equilibrium, Probability with a Physics Flavor, Vol. 1, ed. V. Sidoravicius (Birkhäuser, 2002).
[19] W. De Roeck, C. Maes and K. Netočný, Quantum macrostates, equivalence of ensembles and an H theorem, J. Math. Phys. 47 (2006) 073303.
[20] B. Simon, The Statistical Mechanics of Lattice Gases (Princeton University Press, Princeton, 1993).
[21] E. Størmer, Symmetric states on infinite tensor products of C∗-algebras, J. Funct. Anal. 3 (1969) 48–68.
[22] S. R. S. Varadhan, Asymptotic probabilities and differential equations, Comm. Pure Appl. Math. 19 (1966) 261–286.
[23] S. R. S. Varadhan, Large Deviations and Applications (Society for Industrial and Applied Mathematics, 1984).

September 14, J070-S0129055X10004090

2010 13:28 WSPC/S0129-055X

148-RMP

Reviews in Mathematical Physics Vol. 22, No. 8 (2010) 859–879
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004090

DYNAMICAL BOUNDS FOR STURMIAN SCHRÖDINGER OPERATORS

L. MARIN
UMR 6628-MAPMO, Université d'Orléans, B.P. 6759, 45067 Orléans cedex, France
[email protected]

Received 3 November 2009

The Fibonacci Hamiltonian, that is, a Schrödinger operator associated to a quasiperiodic Sturmian potential with respect to the golden mean, has been investigated intensively in recent years. Damanik and Tcheremchantsev developed a method in [10] and used it to exhibit a nontrivial dynamical upper bound for this model. In this paper, we use this method to extend dynamical upper bounds to a large family of Sturmian operators, and we show anomalous transport at sufficiently large coupling for operators associated to irrational numbers satisfying a generic Diophantine condition. As a counterexample, we exhibit a pathological irrational number which does not satisfy this condition and show that its associated dynamical exponent only has the ballistic bound. Moreover, we establish a global lower bound for the lower box counting dimension of the spectrum that is used to obtain a dynamical lower bound for bounded density irrational numbers.

Keywords: Sturmian Schrödinger operators; quasiperiodic potential; dynamical bounds.

Mathematics Subject Classification 2010: 81Q10, 47B36

1. Introduction

If $H$ is a self-adjoint operator on a separable Hilbert space $\mathcal H$, the time-dependent Schrödinger equation of quantum mechanics, $i\partial_t \psi = H\psi$, leads to a unitary dynamical evolution in $\mathcal H$,

$$ \psi(t) = e^{-itH}\psi(0). $$

Under the time evolution, $\psi(t)$ will generally spread out with time. Quantifying this spreading in concrete cases can be difficult. One of the most studied cases is where $\mathcal H$ is given by $L^2(\mathbb R^d)$ or $\ell^2(\mathbb Z^d)$, $H$ is a Schrödinger operator of the form $-\Delta + V$, and $\psi(0)$ is a localized wavepacket. The form of the potential $V$ depends on the physical model one studies. One of the most studied is the Sturmian potential and its particular subcase, the Fibonacci Hamiltonian, describing a standard one-dimensional quasicrystal. The first approach to study quantum dynamics is the spectral theorem. Recall that each initial vector $\psi(0) = \psi$ has a spectral measure, defined as the unique


Borel measure satisfying

$$ \langle \psi, f(H)\psi \rangle = \int_{\sigma(H)} f(E)\, d\mu_\psi(E) $$

for every measurable function $f$, where $\langle \cdot, \cdot \rangle$ denotes the scalar product of $\mathcal H$. A major step in the theory, discovered by Guarneri ([14, 15]), was that suitable continuity properties of the spectral measure $d\mu_\psi$ imply lower bounds on the spreading of the wavepacket. It was then extended by many authors in [3, 16, 25, 23]. Continuity properties of the spectral measure follow from upper bounds on the measure of intervals, $\mu_\psi([E - \varepsilon, E + \varepsilon])$, $E \in \sigma(H)$, $\varepsilon \to 0$. Later on, many authors refined Guarneri's method ([2, 17, 30]), allowing one to take into account the whole statistics of $\mu_\psi([E - \varepsilon, E + \varepsilon])$, $E \in \mathbb R$. One can find better lower bounds with information about both the measure of intervals and the growth of the generalized eigenfunctions $u_\psi(n, E)$ ([23, 30]). In the case of Schrödinger operators in one space dimension, the information on the spectral measure and on generalized eigenfunctions is linked to the properties of solutions to the difference (also sometimes called free) equation $Hu = Eu$ ([6, 11, 13, 19, 20, 31]). Explicit lower bounds on the spreading rate for numerous concrete cases come from an analysis of these solutions ([5, 6, 13, 19, 20, 23]).

The second approach to dynamical lower bounds in one dimension is based on the Parseval formula,

$$ 2\pi \int_0^\infty e^{-2t/T}\, |\langle e^{-itH}\delta_1, \delta_n \rangle|^2\, dt = \int_{-\infty}^{\infty} \left| \left\langle \left(H - E - \tfrac{i}{T}\right)^{-1} \delta_1, \delta_n \right\rangle \right|^2 dE. $$

This method, developed in [8, 9, 31], is the basis for the results in [7, 21]. It has the advantage that it gives dynamical bounds directly, without any knowledge of the properties of the spectral measure. What is required is upper bounds for solutions corresponding to some set of energies, which can be very small (nonempty is sufficient). Moreover, additional information allows one to improve the results. A combination of both approaches leads to optimal dynamical bounds for growing sparse potentials (see [31]).

As mentioned before, there is a fairly good understanding of how to prove dynamical lower bounds, especially in one space dimension. Results on dynamical upper bounds are fewer and more recent. Proving upper bounds is hard because one needs to control the entire wavepacket. In fact, the dynamical lower bounds that are typically established only bound some (fast) part of the wavepacket from below, and this is sufficient for the desired growth of the standard dynamical quantities. In the same way, it is of course much easier to prove upper bounds only for a (slow) portion of the wavepacket. Killip, Kiselev and Last developed this idea with success in [24]. Their work provides explicit criteria for upper bounds on the slow part of the wavepacket in terms of lower bounds on solutions. Applied to the Fibonacci operator, their general method gives a result supporting the conjecture that this model exhibits anomalous transport (i.e. neither localized, nor diffusive, nor ballistic).


The conjecture for the Fibonacci model was finally proved at sufficiently large coupling by Damanik and Tcheremchantsev in [10]. They developed a general method establishing a connection between properties of solutions and dynamical upper bounds. Based on the Parseval formula, this method allows one to bound the entire wavepacket from above, provided that suitable lower bounds for solution (or rather transfer matrix) growth at complex energies are available. It is the main purpose of this paper to extend the application of this general method, used for the concrete Fibonacci model, to almost every Sturmian potential. We will show that one has anomalous transport for Sturmian models associated to irrational numbers far enough from rational numbers, in a sense made precise below. On the other hand, we construct an irrational number close enough to rational numbers that leads to ballistic motion.

In this paper, we also use these tools to give a new lower bound for the box counting dimension of the spectrum that is better for almost every irrational number. Since the spectrum is a Cantor set with Lebesgue measure zero, it is natural to investigate its fractal dimension. It is well known that this Cantor set is the limit of band spectra of approximant operators [29, 1]. To find the bound, we use the band spectra at rank $n$ as a sequence of $\varepsilon_n$-covers of the spectrum. Using the information given in [28] about the number of bands in periodic band spectra and in [27] about the length of the bands, we estimate $\varepsilon_n$ and give a bound for the number of bands of this diameter. This leads to a bound from below on the minimal number of balls of diameter $\varepsilon_n$ one needs to cover the spectrum. This bound also has a direct dynamical application and allows us to state a dynamical lower bound using the method in [30]. This lower bound requires the transfer matrix norms to be polynomially bounded. This property is shown to be true for bounded density irrational numbers in [18], and it is not expected to hold more generally. This limits the dynamical implication of this lower bound to a set of irrational numbers of Lebesgue measure 0.

We will give precise statements of the model we study and of our results in the next section. Section 3 will be devoted to the proof of our main result. We give a pathological example in Sec. 4 and a new lower bound for the box counting dimension of the spectrum in Sec. 5.

2. Model and Statements

We limit our study to the one-dimensional discrete Schrödinger operator $H_\beta$,

$$ [H_\beta \psi](n) = \psi(n + 1) + \psi(n - 1) + V(n)\psi(n) \tag{1} $$

acting on $\ell^2(\mathbb Z)$, associated to a Sturmian potential $V(n)$ given by $V(n) = (\lfloor (n + 1)\beta \rfloor - \lfloor n\beta \rfloor)V$ with $\beta$ an irrational number in $[0, 1]$ and $V$ a positive constant. We denote the continued fraction expansion of $\beta$ by

$$ \beta = \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}} = [0, a_1, a_2, \ldots]. $$
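The definition of the potential can be made concrete with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the floor brackets in $V(n)$ are the usual Sturmian convention (assumed here), and the continued fraction routine works in floating point, so only the first few partial quotients are reliable.

```python
import math


def sturmian_potential(n, beta, coupling):
    """V(n) = coupling * (floor((n+1)*beta) - floor(n*beta)); takes the values 0 or coupling."""
    return coupling * (math.floor((n + 1) * beta) - math.floor(n * beta))


def continued_fraction(beta, terms):
    """First partial quotients a_1, a_2, ... of beta in (0, 1), computed in floating point."""
    quotients = []
    x = beta
    for _ in range(terms):
        x = 1.0 / x
        a = math.floor(x)
        quotients.append(a)
        x -= a
    return quotients


# Golden mean: the Fibonacci case, with all partial quotients equal to 1.
golden = (math.sqrt(5) - 1) / 2
values = [sturmian_potential(n, golden, 1) for n in range(1, 14)]
quotients = continued_fraction(golden, 8)
```

The sum of `values` telescopes to $\lfloor 14\beta \rfloor - \lfloor \beta \rfloor$, which is a quick sanity check that the letters 0 and $V$ appear with frequency governed by $\beta$.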


The Fibonacci Hamiltonian, $H_\beta$ with $\beta = \frac{\sqrt 5 - 1}{2} = [0, 1, 1, \ldots]$, is the simplest example of a Sturmian model because of its particular continued fraction expansion. Since we are interested in dynamical bounds, let us recall some quantities we want to bound. We denote the time-averaged outside probabilities by

$$ P(N, T) = \sum_{|n| > N} a(n, T), $$

with

$$ a(n, T) = \frac{2}{T} \int_0^\infty e^{-2t/T}\, |\langle e^{-itH}\delta_1, \delta_n \rangle|^2\, dt. $$

For all $\alpha \in [0, +\infty]$, see [13], set

$$ S^-(\alpha) = -\liminf_{T \to \infty} \frac{\log P(T^\alpha - 2, T)}{\log T} $$

and

$$ S^+(\alpha) = -\limsup_{T \to \infty} \frac{\log P(T^\alpha - 2, T)}{\log T}. $$

The following critical exponents are of particular interest:

$$ \alpha_l^\pm = \sup\{\alpha \ge 0 : S^\pm(\alpha) = 0\}, \qquad \alpha_u^\pm = \sup\{\alpha \ge 0 : S^\pm(\alpha) < \infty\}. $$

They satisfy $0 \le \alpha_l^\pm \le \alpha_u^\pm$. In particular, if $\gamma > \alpha_u^+$ then $P(T^\gamma, T)$ goes to 0 fast. The exponents $\alpha_l^\pm$ can be interpreted as the (lower and upper) rates of propagation of the essential part of the wavepacket, and $\alpha_u^\pm$ as the rates of propagation of the fastest part of the wavepacket. Moreover, for this kind of model we always have $\alpha_u^+ \le 1$. This upper bound, called ballistic, is the fastest rate of spreading of the wavepacket. Sturmian potentials (quasiperiodic structure) are an intermediate situation between random potentials (no structure in the potential), which imply dynamical localization ($\alpha_u^\pm = 0$), and periodic potentials, which imply ballistic spreading, that is, $\alpha_u^\pm = 1$. More precisely, one has a nontrivial, strictly positive bound for almost all irrational numbers. In a sense we will make more precise later, these irrational numbers are far enough from rational numbers. On the other hand, we show that for an irrational number close enough to rational numbers, one has ballistic motion. The first objective of this paper is to give a non-ballistic upper bound for a large set of irrational numbers. Recall the sequences associated to $\beta$:

$$ p_{-1} = 1, \quad p_0 = 0, \quad q_{-1} = 0, \quad q_0 = 1, $$

$$ p_{k+1} = a_{k+1} p_k + p_{k-1}, \qquad q_{k+1} = a_{k+1} q_k + q_{k-1}. \tag{2} $$
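As an illustration of the recursion (2), the following Python sketch computes the convergents $p_k/q_k$ from a list of partial quotients and estimates the growth rate $\frac{1}{k}\log q_k$ for the golden mean. The function name is hypothetical; for $a_k \equiv 1$ the $q_k$ are Fibonacci numbers, so the growth rate should approach $\log\frac{1+\sqrt 5}{2}$.

```python
import math


def convergents(quotients):
    """Return the lists (p_0, ..., p_K) and (q_0, ..., q_K) defined by the recursion (2)."""
    p_prev, p = 1, 0  # p_{-1}, p_0
    q_prev, q = 0, 1  # q_{-1}, q_0
    ps, qs = [p], [q]
    for a in quotients:
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        ps.append(p)
        qs.append(q)
    return ps, qs


# Golden mean beta = [0, 1, 1, ...]: q_k runs through the Fibonacci numbers.
ps, qs = convergents([1] * 200)
growth = math.log(qs[200]) / 200  # finite-k estimate of (log q_k) / k
```

Exact integer arithmetic is used on purpose, since $q_k$ grows exponentially and floating point would lose the identity $p_k q_{k-1} - p_{k-1} q_k = \pm 1$.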

We can now state our main result:

Theorem 1. Let $\beta$ be an irrational number and $H_\beta$ be defined as in (1) with a Sturmian potential associated to $\beta$. Assume that $V > 20$. If $D = \limsup_k \frac{\log q_k}{k}$ is finite, then

$$ \alpha_u^+ \le \frac{2D}{\log\frac{V - 8}{3}}. $$

Moreover, for an irrational number whose continued fraction expansion contains no 1, the dynamical upper bound becomes

$$ \alpha_u^+ \le \frac{D}{\log\frac{V - 8}{3}}. $$

Remark 1. It is clear that, taking $V$ large enough, one obtains a nontrivial bound that is smaller than 1. It is well known that the set of irrational numbers with finite $D$ has full Lebesgue measure. In fact, for any quadratic irrational, that is, any number with an eventually periodic continued fraction expansion, one can easily compute $D$. Moreover, the explicit value of $D$ is known for almost all $\beta$ by the result of Khinchin discussed next.

Lemma 1 ([22]). For almost all $\beta$ with respect to Lebesgue measure,

$$ D = \limsup_k \frac{\log q_k}{k} = D_K = \frac{\pi^2}{12 \log 2}, $$

where $q_k$ is the sequence defined as in (2), and

$$ M = \liminf_k\, (a_1 \cdots a_k)^{1/k} = C_K = 2.685\ldots $$

$C_K$ is called the Khinchin constant.

Corollary 1. For Lebesgue almost every irrational number $\beta$, we have

$$ \alpha_u^+ \le \frac{2 D_K}{\log\frac{V - 8}{3}}. $$

Proof. It follows directly from Theorem 1 and Khinchin's lemma.

Corollary 2. For a precious number, that is, $\omega = [0, a, a, a, a, \ldots]$ with $a \ne 1$, the bound becomes

$$ \alpha_u^+ \le \frac{\log(a + \omega)}{\log\frac{V - 8}{3}}. $$
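To get a feeling for the size of these bounds, one can evaluate them numerically. The sketch below is illustrative only: the function names are hypothetical, and the threshold function simply solves "bound $< 1$" for $V$ in the formula of Theorem 1.

```python
import math

# D_K from Lemma 1.
D_K = math.pi ** 2 / (12 * math.log(2))


def transport_bound(V, D, no_ones=False):
    """Upper bound on alpha_u^+ from Theorem 1; the factor 2 drops if no partial quotient is 1."""
    if V <= 20:
        raise ValueError("Theorem 1 assumes V > 20")
    factor = 1.0 if no_ones else 2.0
    return factor * D / math.log((V - 8) / 3)


def nontrivial_threshold(D, no_ones=False):
    """Coupling beyond which the bound of Theorem 1 is smaller than the ballistic value 1."""
    factor = 1.0 if no_ones else 2.0
    return max(20.0, 8 + 3 * math.exp(factor * D))


# Precious number with a = 2: omega = sqrt(2) - 1 and D = log(a + omega).
a = 2
omega = (math.sqrt(a * a + 4) - a) / 2
precious_bound = transport_bound(29, math.log(a + omega), no_ones=True)
```

With $D = D_K$ the bound of Corollary 1 only becomes nontrivial for couplings well above the standing assumption $V > 20$, which is why Remark 1 insists on taking $V$ large enough.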


Proof. One can compute $q_k$ easily for such numbers.

On the contrary, if $D$ is infinite, one can have ballistic motion at all large couplings:

Theorem 2. There exists an irrational number $\omega$ with $D = +\infty$ such that for any $V > 20$ the dynamics of $H_\omega$ is ballistic.

We also prove a new lower bound for the fractal dimension of the spectrum:

Theorem 3. Set $C_k = \frac{3}{k} \sum_{j=1}^{k} \log(a_j + 2)$. For any irrational number $\beta$ satisfying $C = \limsup C_k < +\infty$ and $V > 20$, we have

$$ \dim_B^+(\sigma) \ge \frac{1}{2}\, \frac{\log 2}{C + \log(V + 5)} \tag{3} $$

where $\sigma$ is the spectrum of $H_\beta$.

3. Proof of Theorem 1

When one wants to bound these dynamical quantities for specific models, it is useful to connect them to the qualitative behavior of the solutions of the difference equation

$$ \psi(n + 1) + \psi(n - 1) + V(n)\psi(n) = z\psi(n) \tag{4} $$

with $z \in \mathbb C$ and $\psi$ a non-zero vector. One can reformulate this equation in terms of transfer matrices:

$$ \begin{pmatrix} \psi(n + 1) \\ \psi(n) \end{pmatrix} = F(n, z) \begin{pmatrix} \psi(1) \\ \psi(0) \end{pmatrix} $$

with

$$ F(n, z) = \begin{cases} T(n, z) \cdots T(1, z) & n \ge 1, \\ \mathrm{Id} & n = 0, \\ [T(n, z)]^{-1} \cdots [T(0, z)]^{-1} & n \le -1, \end{cases} $$

and

$$ T(m, z) = \begin{pmatrix} z - V(m) & -1 \\ 1 & 0 \end{pmatrix}. $$

We set

$$ M_k(z) = F(q_k, z) = \begin{cases} T(q_k, z) \cdots T(1, z) & n \ge 1, \\ \mathrm{Id} & k = 0, \\ [T(q_k, z)]^{-1} \cdots [T(0, z)]^{-1} & n \le -1. \end{cases} $$
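The transfer matrix formalism for $n \ge 1$ can be checked directly against the difference equation (4). The following Python sketch is an illustration under stated assumptions: the helper names are hypothetical, the potential uses the floor convention $V(n) = V(\lfloor (n+1)\beta \rfloor - \lfloor n\beta \rfloor)$, and integer values of $z$ and $V$ are chosen so that all arithmetic is exact.

```python
import math


def potential(n, beta, coupling):
    """Sturmian potential, assuming V(n) = coupling * (floor((n+1)*beta) - floor(n*beta))."""
    return coupling * (math.floor((n + 1) * beta) - math.floor(n * beta))


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def step_matrix(m, z, beta, coupling):
    """One-step transfer matrix T(m, z) = [[z - V(m), -1], [1, 0]]."""
    return ((z - potential(m, beta, coupling), -1), (1, 0))


def transfer_matrix(n, z, beta, coupling):
    """F(n, z) = T(n, z) ... T(1, z) for n >= 1, and the identity for n = 0."""
    F = ((1, 0), (0, 1))
    for m in range(1, n + 1):
        F = matmul(step_matrix(m, z, beta, coupling), F)
    return F


def solve_forward(psi0, psi1, n, z, beta, coupling):
    """Iterate psi(m+1) = (z - V(m)) psi(m) - psi(m-1); returns [psi(0), ..., psi(n+1)]."""
    psi = [psi0, psi1]
    for m in range(1, n + 1):
        psi.append((z - potential(m, beta, coupling)) * psi[m] - psi[m - 1])
    return psi


golden = (math.sqrt(5) - 1) / 2
F = transfer_matrix(10, 3, golden, 2)
psi = solve_forward(1, 2, 10, 3, golden, 2)
```

Each $T(m, z)$ has determinant 1, so $F(n, z)$ does too; this is the property that makes the trace recursions used below possible.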


The following statement allows us to connect transfer matrix norms with dynamical exponents (see [10] for details). Here and in what follows, $f \lesssim g$ means that $f \le Cg$ for some positive constant $C$ that we leave implicit.

Theorem 4. Let $H_\beta$ be defined as in (1) and $K \ge 4$ be such that $\sigma(H_\beta) \subseteq [-K + 1, K - 1]$. Then, the outside probabilities can be bounded from above in terms of transfer matrix norms as follows:

$$ P_r(N, T) \lesssim \exp(-cN) + T^3 \int_{-K}^{K} \left( \max_{1 \le q_k \le N} \left\| M_k\!\left(E + \tfrac{i}{T}\right) \right\|^2 \right)^{-1} dE, $$

$$ P_l(N, T) \lesssim \exp(-cN) + T^3 \int_{-K}^{K} \left( \max_{-N \le q_k \le -1} \left\| M_k\!\left(E + \tfrac{i}{T}\right) \right\|^2 \right)^{-1} dE. $$

The implicit constants depend only on $K$, and $c$ is a universal positive constant.

This theorem connects transfer matrix behavior with a dynamical upper bound in the following way. Choosing $N = N(T) = CT^\alpha$ such that both integrals decay faster than any inverse power of $T$ implies that $P(N(T), T)$ goes to 0 faster than any inverse power of $T$. By definition of $\alpha_u^+$, it follows that $\alpha_u^+ \le \alpha$. To exhibit such a condition, we have to prove that the considered energy is not in the spectrum; the transfer matrix norm is then shown to grow super-exponentially. We now recall a few properties of the transfer matrices and their traces. The transfer matrix sequence satisfies the following evolution in $k$ (see, e.g., [1, 28]):

$$ M_{k+1}(z) = M_{k-1}(z)\, M_k(z)^{a_{k+1}}. \tag{5} $$

In order to bound the sequence of transfer matrix norms from below, it is enough to consider their traces. We now recall the following result, which one can find in [28].

Proposition 1. Let $t_{k,p}$ be the trace of the matrix $M_{k-1} M_k^p$. The evolution along the $p$ index is given by

$$ t_{k,p+1} = t_{k+1,0}\, t_{k,p} - t_{k,p-1}, $$

and consequently,

$$ t_{k,p+1} = S_p(t_{k+1,0})\, t_{k,1} - S_{p-1}(t_{k+1,0})\, t_{k,0} = S_{p+1}(t_{k+1,0})\, t_{k,0} - S_p(t_{k+1,0})\, t_{k,-1}. \tag{6} $$

The evolution along the $k$ index is related to the $p$-evolution by

$$ t_{k+2,0} = t_{k,a_{k+1}}, \qquad t_{k+1,1} = t_{k,a_{k+1}+1}, \qquad t_{k+1,-1} = t_{k,a_{k+1}-1}. \tag{7} $$


If one denotes by $x_k = t_{k+1,0}$ the trace of $M_k$ and by $z_k = t_{k,1}$ the trace of $M_{k-1} M_k$, this reduces, by (6), to the usual trace map relation

$$ x_{k+1} = z_k\, S_{a_{k+1}-1}(x_k) - x_{k-1}\, S_{a_{k+1}-2}(x_k), $$

$$ z_{k+1} = z_k\, S_{a_{k+1}}(x_k) - x_{k-1}\, S_{a_{k+1}-1}(x_k), $$

with initial conditions $x_{-1} = 2$, $x_0 = z$ and $z_0 = z - V$.

Remark 2. These two sequences depend on $z$, but we will omit this dependence in order to simplify notations.

Here, $S_l$ denotes the $l$th Chebyshev polynomial of the second kind:

$$ S_{-1}(x) = 0, \qquad S_0(x) = 1, \qquad S_{l+1}(x) = x S_l(x) - S_{l-1}(x), \quad \forall\, l \ge 0. $$
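The trace map can be cross-checked against the matrix recursion (5). The Python sketch below is illustrative: the function names are hypothetical, and the starting matrices $M_{-1}$ and $M_0$ are an assumption, chosen as unimodular matrices that reproduce the initial conditions $x_{-1} = 2$, $x_0 = z$, $z_0 = z - V$. Integer $z$ and $V$ keep the comparison exact.

```python
def cheb_S(l, x):
    """Chebyshev polynomial of the second kind with S_{-1} = 0, S_0 = 1 (l >= -1)."""
    if l < -1:
        raise ValueError("l must be >= -1")
    prev, cur = 0, 1  # S_{-1}, S_0
    for _ in range(l):
        prev, cur = cur, x * cur - prev
    return prev if l == -1 else cur


def trace_map(quotients, z, V):
    """Iterate the trace map; returns ([x_0, x_1, ...], [z_0, z_1, ...])."""
    x_prev, x, zk = 2, z, z - V
    xs, zs = [x], [zk]
    for a in quotients:  # a plays the role of a_{k+1}
        x_next = zk * cheb_S(a - 1, x) - x_prev * cheb_S(a - 2, x)
        z_next = zk * cheb_S(a, x) - x_prev * cheb_S(a - 1, x)
        x_prev, x, zk = x, x_next, z_next
        xs.append(x)
        zs.append(zk)
    return xs, zs


def matmul(A, B):
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def matpow(A, n):
    result = ((1, 0), (0, 1))
    for _ in range(n):
        result = matmul(result, A)
    return result


def trace(A):
    return A[0][0] + A[1][1]


def traces_from_matrices(quotients, z, V):
    """Same sequences, computed from M_{k+1} = M_{k-1} M_k^{a_{k+1}} (assumed M_{-1}, M_0)."""
    M_prev = ((1, -V), (0, 1))  # assumed M_{-1}: trace 2, determinant 1
    M = ((z, -1), (1, 0))       # assumed M_0: trace z, determinant 1
    xs, zs = [trace(M)], [trace(matmul(M_prev, M))]
    for a in quotients:
        M_prev, M = M, matmul(M_prev, matpow(M, a))
        xs.append(trace(M))
        zs.append(trace(matmul(M_prev, M)))
    return xs, zs
```

The agreement of the two computations relies only on the Cayley-Hamilton identity $M^2 = (\operatorname{tr} M) M - \mathrm{Id}$ for unimodular $2 \times 2$ matrices, which is what produces the Chebyshev polynomials in Proposition 1.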

The sequence $\{x_k(z)\}_k$ can have two different behaviors depending on $z$: it is bounded if and only if $z$ lies in the spectrum of $H_\beta$. A criterion was first stated by Sütő in [29] for the Fibonacci Hamiltonian and extended by Bellissard et al. in [1] to other irrational numbers. The appearance of $\delta$ in the next lemma is purely technical and does not change the proof.

Lemma 2. A necessary and sufficient condition that $\{x_k(z)\}_k$ be unbounded is that

$$ |x_{N-1}(z)| \le 2 + \delta, \qquad |x_N(z)| > 2 + \delta, \qquad |z_N(z)| > 2 + \delta $$

for some $N \ge 0$. This $N$ is unique. Set

$$ G_k = G_{k-1} + a_k G_{k-2}, \qquad G_0 = 1, \qquad G_{-1} = 1. $$

We have

$$ |x_{k+1}| \ge |z_k| \ge e^{c\, G_{k-N}} + 1 \qquad \forall\, k > N, $$

with $c = \log(1 + \delta) > 0$ constant.

Proof. We start by stating the following inequality on Chebyshev polynomials, valid for $|x| \ge 2$:

$$ |S_l(x)| - |S_{l-1}(x)| \ge (|x| - 1)|S_{l-1}(x)| - |S_{l-2}(x)| \ge (|x| - 1)\bigl[|S_{l-1}(x)| - |S_{l-2}(x)|\bigr]. $$

Iterating this, one obtains

$$ |S_l(x)| - |S_{l-1}(x)| \ge (|x| - 1)^l \bigl[|S_0(x)| - |S_{-1}(x)|\bigr] = (|x| - 1)^l. $$

The proof is made by induction. Hypothesis $H_N$ is the following: One has $|x_N| > 2 + \delta$ and $|z_N| > 2 + \delta$. Moreover $|x_{N-1}| \le |z_N|$. It is clear that the hypothesis of the lemma implies $H_N$. We now show the induction property, namely that $H_N$ implies $|z_{N+1}| > |z_N|$, $|x_{N+1}| > |z_N|$,


and $|x_N| \le |z_{N+1}|$. It is easy to see that these three relations together with $H_N$ imply $H_{N+1}$. Suppose $H_N$ to be true; then one has

$$ |z_{N+1}| \ge |z_N S_{a_{N+1}}(x_N)| - |x_{N-1} S_{a_{N+1}-1}(x_N)| \ge |z_N| \bigl[|S_{a_{N+1}}(x_N)| - |S_{a_{N+1}-1}(x_N)|\bigr] \ge |z_N| (|x_N| - 1)^{a_{N+1}}. \tag{8} $$

This shows that $|z_{N+1}| > |z_N|$, since $|x_N| \ge 2 + \delta$. One also has $|z_{N+1}| > |x_N|$. Indeed, one can write

$$ |z_{N+1}| \ge |z_N| (|x_N| - 1) \ge |x_N| + (|z_N| - 1)|x_N| - |z_N| \ge |x_N| + 2(|z_N| - 1) - |z_N| \ge |x_N| + |z_N| - 2 \ge |x_N|. $$

Only the last relation remains to be shown. One shows in the same way as before that

$$ |x_{N+1}| \ge |z_N S_{a_{N+1}-1}(x_N)| - |x_{N-1} S_{a_{N+1}-2}(x_N)| \ge |z_N| \bigl[|S_{a_{N+1}-1}(x_N)| - |S_{a_{N+1}-2}(x_N)|\bigr] \ge |z_N| (|x_N| - 1)^{a_{N+1}-1}, $$

which leads to $|x_{N+1}| > |z_N|$. Taking logarithms in (8), one obtains

$$ \log|z_{k+1}| \ge \log|z_k| + a_{k+1} \log(|x_k| - 1). $$

Using $|z_{k+1}| > |z_k|$ and $|z_{k-1}| < |x_k|$ leads to

$$ \log(|z_{k+1}| - 1) \ge \log(|z_k| - 1) + a_{k+1} \log(|z_{k-1}| - 1). $$

The sequence $\{\log(|z_k| - 1)\}_{k > N}$ grows faster than the exponential sequence $G_k$. This sequence is defined in the following way:

$$ G_k = G_{k-1} + a_{k+N} G_{k-2}, \qquad G_0 = 1, \qquad G_{-1} = 1. $$

One has

$$ |x_{k+1}| \ge |z_k| \ge e^{c\, G_{k-N}} + 1 \qquad \forall\, k > N, $$

with $c = \log(1 + \delta) > 0$ a fixed constant. This constant $c$ comes from the difference in the initial conditions between the sequence $\{G_k\}_k$ and the sequence $\{\log(|z_k| - 1)\}_{k > N}$.

This criterion motivates the following definition: Set

$$ \sigma_{k,p} = \{E \in \mathbb R : |t_{k,p}(E)| \le 2\}. $$
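The Chebyshev inequality at the start of the proof of Lemma 2, $|S_l(x)| - |S_{l-1}(x)| \ge (|x| - 1)^l$, only uses the triangle inequality and $|x| \ge 2$, so it can be probed numerically for complex arguments as well. The sketch below is an illustration with hypothetical function names; it checks sample points and is of course not a proof.

```python
def cheb_S(l, x):
    """Chebyshev polynomial of the second kind with S_{-1} = 0, S_0 = 1 (l >= -1)."""
    if l < -1:
        raise ValueError("l must be >= -1")
    prev, cur = 0, 1  # S_{-1}, S_0
    for _ in range(l):
        prev, cur = cur, x * cur - prev
    return prev if l == -1 else cur


def chebyshev_gap(l, x):
    """Left-hand side |S_l(x)| - |S_{l-1}(x)| of the inequality used in Lemma 2."""
    return abs(cheb_S(l, x)) - abs(cheb_S(l - 1, x))


def inequality_holds(l, x, rel_tol=1e-9):
    """Check gap >= (|x| - 1)^l up to rounding, for |x| >= 2."""
    lower = (abs(x) - 1) ** l
    return chebyshev_gap(l, x) >= lower * (1 - rel_tol) - rel_tol


samples = [2, -2.5, 2 + 1j, 3j, -1.5 - 1.5j]  # all of modulus at least 2
checks = [inequality_holds(l, x) for x in samples for l in range(9)]
```

At $x = 2$ one has $S_l(2) = l + 1$, so the gap equals 1 for every $l$ and the inequality is an equality; this is why the strict margin $\delta > 0$ is needed to obtain growth.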


Denote by $\beta_n = \frac{p_n}{q_n}$ the rational approximation of $\beta$. It is well known that the spectrum of the operator $H_{\beta_n}$, where $\beta_n$ replaces $\beta$ in the definition of $H_\beta$, coincides with the set $\sigma_{k,0}$. The operators $\{H_{\beta_n}\}$ are called the periodic approximants of $H_\beta$ and converge strongly to $H_\beta$. It is well known that the spectrum of $H_\beta$ is a Cantor set that can be approximated by the band spectra of the periodic approximants. The following proposition recalls this statement precisely ([29, 1, 32]):

Proposition 2. The sequence of spectra of periodic approximants of $H_\beta$ satisfies
(i) the set $\sigma_{k,p}$ is made of $p q_k + q_{k-1}$ distinct intervals,
(ii) $\sigma \subset \sigma_{k+1,0} \cup \sigma_{k,0}$ and $\sigma_{k,p+1} \subset \sigma_{k+1,0} \cup (\sigma_{k+1,0}^c \cap \sigma_{k,p})$, $\forall\, k \in \mathbb N$,
(iii) $\sigma_{k+1,0} \cap \sigma_{k,p} \cap \sigma_{k,p-1} = \emptyset$, $\forall\, V > 4$ and $\forall\, k \in \mathbb N$, $p \ge 0$.

We now recall an important result about the structure of the spectra of periodic approximants. It describes the way the intervals of $\sigma_{k,p}$ are included in $\sigma_{k-1,p}$. It requires some definitions:

Definition 1. For a given $k$, we call
Type I gap: a band of $\sigma_{k,1}$ included in a band of $\sigma_{k,0}$ and therefore in a gap of $\sigma_{k+1,0}$,
Type II band: a band of $\sigma_{k+1,0}$ included in a band of $\sigma_{k,-1}$ and in a gap of $\sigma_{k,0}$,
Type III band: a band of $\sigma_{k+1,0}$ included in a band of $\sigma_{k,0}$ and in a gap of $\sigma_{k,1}$.

As proved in [28], these definitions exhaust all the possible configurations, with the following lemma.

Lemma 3 ([28]). At a given level $k$,
(i) a type I gap contains a unique type II band of $\sigma_{k+2,0}$;
(ii) a type II band contains $(a_{k+1} + 1)$ bands of type I of $\sigma_{k+1,1}$. They alternate with $(a_{k+1})$ type III bands of $\sigma_{k+2,0}$;
(iii) a type III band contains $(a_{k+1})$ bands of type I of $\sigma_{k+1,1}$. They alternate with $(a_{k+1} - 1)$ type III bands of $\sigma_{k+2,0}$.

As stated above, the spectrum of $H_{\beta_n}$ is made of a growing number of intervals of decreasing length as $n$ increases. We now recall a result obtained in [27], which allows one to control the length of the bands of $\sigma_{k,p}$ at any level $k$. We again need some notation to summarize it. Let $\mathcal A = \{\mathrm{I}, \mathrm{II}, \mathrm{III}\}$ be an alphabet. To each band $B$ of the spectrum at level $k$ corresponds a unique word $i_0 i_1 \cdots i_k \in \mathcal A^{k+1}$ such that $B$ is a band of type $i_k$ included in a band of type $i_{k-1}$ at level $k - 1$, ..., included in a band of type $i_0$ at level 0. This word will be called the index of $B$. More than one band can have the same index. Let $T_n = (t_{i,j}(n))_{3 \times 3}$ be a sequence of matrices and $\tau = i_0 i_1 \cdots i_k$ an index; we define

$$ L_\tau(T) = t_{i_0,i_1}(1)\, t_{i_1,i_2}(2) \cdots t_{i_{k-1},i_k}(k). $$


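We can now recall the result in [27]:

Theorem 5 ([27]). If $\beta = [a_1, a_2, \ldots]$ is an irrational number in $[0, 1]$ and $H_\beta$ is defined as above with $V > 20$, then any band $B$ of index $\tau$ satisfies

$$ 4 L_\tau(Q) \le |B| \le 4 L_\tau(P), $$

where $P = (P_n)_{n > 0}$ with

$$ P_n = \begin{pmatrix} 0 & c_1^{a_n - 1} & 0 \\ c_1/a_n & 0 & c_1/a_n \\ c_1/a_n & 0 & c_1/a_n \end{pmatrix}, \qquad c_1 = \frac{3}{V - 8}, $$

and $Q = (Q_n)_{n > 0}$ with

$$ Q_n = \begin{pmatrix} 0 & c_2^{a_n - 1} & 0 \\ c_2 (a_n + 2)^{-3} & 0 & c_2 (a_n + 2)^{-3} \\ c_2 (a_n + 2)^{-3} & 0 & c_2 (a_n + 2)^{-3} \end{pmatrix}, \qquad c_2 = \frac{1}{V + 5}. $$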
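The quantities $L_\tau(P)$ and $L_\tau(Q)$ are plain products of matrix entries along the index word, which the following Python sketch makes explicit. It is illustrative only: the function names and the encoding of the types as row and column positions 0, 1, 2 (for I, II, III) are assumptions, and exact rational arithmetic is used so that small examples can be checked by hand.

```python
from fractions import Fraction

# Assumed encoding: types I, II, III index the rows and columns 0, 1, 2.
TYPES = {"I": 0, "II": 1, "III": 2}


def P_matrix(a, V):
    """Matrix P_n of Theorem 5 for a_n = a, with c_1 = 3 / (V - 8)."""
    c1 = Fraction(3, V - 8)
    return [[0, c1 ** (a - 1), 0],
            [c1 / a, 0, c1 / a],
            [c1 / a, 0, c1 / a]]


def Q_matrix(a, V):
    """Matrix Q_n of Theorem 5 for a_n = a, with c_2 = 1 / (V + 5)."""
    c2 = Fraction(1, V + 5)
    low = c2 / (a + 2) ** 3
    return [[0, c2 ** (a - 1), 0],
            [low, 0, low],
            [low, 0, low]]


def L_tau(index, quotients, V, matrix):
    """L_tau(T) = t_{i0,i1}(1) t_{i1,i2}(2) ... t_{i_{k-1},i_k}(k) for the word `index`."""
    product = Fraction(1)
    for n in range(1, len(index)):
        T_n = matrix(quotients[n - 1], V)  # uses the partial quotient a_n
        product *= T_n[TYPES[index[n - 1]]][TYPES[index[n]]]
    return product


# Example: V = 29 gives c_1 = 1/7 and c_2 = 1/34; index I -> II -> III, (a_1, a_2) = (2, 3).
upper = 4 * L_tau(("I", "II", "III"), [2, 3], 29, P_matrix)
lower = 4 * L_tau(("I", "II", "III"), [2, 3], 29, Q_matrix)
```

A vanishing product signals an index word that cannot occur, since the zero entries of $P_n$ and $Q_n$ encode the forbidden transitions between types.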

From now on, we define the spectra of the periodic approximants not only in $\mathbb R$ but in $\mathbb C$:

$$ \sigma_{k,0}^\delta = \{z \in \mathbb C : |x_k(z)| \le 2 + \delta\}. $$

The statements of the preceding propositions remain true if one replaces $\sigma_{k,p}$ by $\sigma_{k,p}^\delta$ for some small enough fixed $\delta$. A condition on $V$ should be added to keep the invariant formula, $V > V_\delta = [16 + 24\delta + 9\delta^2 + 4]^{1/2}$ (see [10]). Since the invariant remains valid, all the structure for the set $\sigma_{k,0}^\delta$ remains the same. The proof is the very same, see [28, 24]. The following proposition states that, due to the classical Koebe distortion theorem, the height of this set is almost the same as its length.

Proposition 3. If $k \ge 3$, $\delta > 0$ and $V > 20$, then there exist constants $c_\delta, d_\delta > 0$ such that

$$ \bigcup_{j=1}^{q_{k-1}} B(x_k^{(j)}, r_k) \subseteq \sigma_{k,0}^\delta \subseteq \bigcup_{j=1}^{q_{k-1}} B(x_k^{(j)}, R_k) $$

where $\{x_k^{(j)}\}_{1 \le j \le q_{k-1}}$ are the zeros of $x_k$, $r_k = c_\delta \inf_{\tau \in \mathcal A^k} L_\tau(Q)$ and $R_k = d_\delta \sup_{\tau \in \mathcal A^k} L_\tau(P)$.

Proof. The proof follows the same steps as in [10]. Let $C_j$ be a connected component of $\sigma_{k,0}^{2\delta}$. With $V > \max\{20, \lambda(2\delta)\}$, $C_j$ contains exactly one of the $q_{k-1}$ zeros of $x_k$ lying in $\sigma_{k,0}^\delta$, namely $x_k^{(j)}$. Moreover, $C_j$ contains one connected component of $\sigma_{k,0}^\delta$, denoted by $\tilde C_j$. It suffices to show that

$$ B(x_k^{(j)}, r_k) \subseteq \tilde C_j \subseteq B(x_k^{(j)}, R_k) \tag{9} $$

to obtain the result.


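As $x_k$ is a proper function (as a polynomial in $z$) and $C_j$ contains a unique zero, its degree is 1. Hence $x_k : \operatorname{int}(C_j) \to B(0, 2 + 2\delta)$ is univalent (as a proper function of degree one), and so $x_k^{-1} : B(0, 2 + 2\delta) \to \operatorname{int}(C_j)$ is well defined and univalent too. Consequently, the function

$$ F : B(0, 1) \to \mathbb C, \qquad F(z) = \frac{x_k^{-1}((2 + 2\delta) z) - x_k^{(j)}}{(2 + 2\delta)\, (x_k^{-1})'(0)} $$

is univalent on $B(0, 1)$. We have $F(0) = 0$ and $F'(0) = 1$. Applying the Koebe distortion theorem, we get

$$ \frac{|z|}{(1 + |z|)^2} \le |F(z)| \le \frac{|z|}{(1 - |z|)^2}, \qquad |z| < 1. $$

Evaluating this for $|z| = \frac{2 + \delta}{2 + 2\delta}$, one has

$$ \frac{(2 + \delta)(2 + 2\delta)}{(4 + 3\delta)^2} \le |F(z)| \le \frac{(2 + \delta)(2 + 2\delta)}{\delta^2}. $$

By definition of $F$, and since multiplying back by the normalizing factor $(2 + 2\delta)\,|(x_k^{-1})'(0)|$ contributes one more factor $(2 + 2\delta)$, this implies

$$ |x_k^{-1}((2 + 2\delta) z) - x_k^{(j)}| \le \frac{(2 + \delta)(2 + 2\delta)^2}{\delta^2}\, |(x_k^{-1})'(0)|, $$

$$ |x_k^{-1}((2 + 2\delta) z) - x_k^{(j)}| \ge \frac{(2 + \delta)(2 + 2\delta)^2}{(4 + 3\delta)^2}\, |(x_k^{-1})'(0)|. $$

Equivalently, for $|z| = 2 + \delta$,

$$ |x_k^{-1}(z) - x_k^{(j)}| \le \frac{(2 + \delta)(2 + 2\delta)^2}{\delta^2}\, |(x_k^{-1})'(0)|, $$

$$ |x_k^{-1}(z) - x_k^{(j)}| \ge \frac{(2 + \delta)(2 + 2\delta)^2}{(4 + 3\delta)^2}\, |(x_k^{-1})'(0)|. $$

To conclude, it suffices, with $|(x_k^{-1})'(0)| = |x_k'(x_k^{(j)})|^{-1}$, to remark that

$$ r_k \le |(x_k^{-1})'(0)| \le R_k $$

and that, for $|z| = 2 + \delta$, $x_k^{-1}(z)$ runs through the entire boundary of $\tilde C_j$.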
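The two constants obtained from the Koebe distortion theorem at $|z| = \frac{2+\delta}{2+2\delta}$ are simple algebra and can be verified exactly. The sketch below uses rational arithmetic; the function names are hypothetical.

```python
from fractions import Fraction


def koebe_bounds(r):
    """Koebe distortion bounds r/(1+r)^2 <= |F(z)| <= r/(1-r)^2 for |z| = r < 1."""
    return r / (1 + r) ** 2, r / (1 - r) ** 2


def bounds_at_delta(delta):
    """Evaluate the Koebe bounds at r = (2 + delta) / (2 + 2 delta)."""
    r = (2 + delta) / (2 + 2 * delta)
    return koebe_bounds(r)


def closed_forms(delta):
    """The closed forms stated in the proof of Proposition 3."""
    numerator = (2 + delta) * (2 + 2 * delta)
    return numerator / (4 + 3 * delta) ** 2, numerator / delta ** 2


delta = Fraction(1, 10)
lower, upper = bounds_at_delta(delta)
```

The upper constant blows up like $\delta^{-2}$ as $\delta \to 0$, which is why $\delta$ is kept fixed and the resulting factors are absorbed into $c_\delta$ and $d_\delta$.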

Proof of Theorem 1. We now have all the tools required to finish the proof of Theorem 1. As the $x_k^{(j)}$ are real, we have

$$ \sigma_{k,0}^\delta \subseteq \{z \in \mathbb C : |\operatorname{Im} z| < R_k\} \subseteq \{z \in \mathbb C : |\operatorname{Im} z| < d\, q_k^{-\gamma(V)}\}, $$

for a suitable $\gamma(V)$. With Proposition 2, this implies

$$ \sigma_{k,0}^\delta \cup \sigma_{k,1}^\delta \subseteq \{z \in \mathbb C : |\operatorname{Im} z| < d\, q_k^{-\gamma(V)}\}. \tag{10} $$

Let us be more precise on how to choose $\gamma(V)$. We need to bound all $R_k$ from above. $R_k$ is the supremum of products of $k$ entries of the matrices $P_n$. All the coefficients in $P_n$ are maximal for $a_n = 1$. The worst possible case happens when the index history of a band has a type I containing a band of type II; in that case the coefficient can be trivial, equal to 1 (if $a_n = 1$). But because of the combinatorial behavior of the bands described by Lemma 3, this situation cannot occur more than half of the time. Consequently,

$$ R_k \le c_1^{k/2}. $$

We should have $R_k < d\, q_k^{-\gamma(V)}$, so a suitable $\gamma$ can be chosen by taking

$$ \gamma(V) \le \limsup_k\, -\frac{k \log c_1}{2 \log q_k}. $$

For $\varepsilon = \operatorname{Im} z > 0$, we get a uniform lower bound for $|x_n(E + i\varepsilon)|$ with $E \in [-K, K] \subset \mathbb R$. For a fixed $\varepsilon > 0$, we choose $k$ such that $d\, q_k^{-\gamma(V)} < \varepsilon$. With (10), this shows $|x_k(E + i\varepsilon)| > 2 + \delta$ and $|z_k(E + i\varepsilon)| > 2 + \delta$. As $|x_{-1}(E + i\varepsilon)| = 2 \le 2 + \delta$, we are in the situation of Lemma 2 and we have the bound

$$ |x_j| \ge e^{\log(1 + \delta)\, G_{j-k}} + 1, \qquad \forall\, j > k. \tag{11} $$

All this motivates the following definitions: For $\delta > 0$, $T > 1$, denote by $k(T)$ the unique integer with

$$ \frac{q_{k(T)-1}^{\gamma(V)}}{d_\delta} \le T \le \frac{q_{k(T)}^{\gamma(V)}}{d_\delta} $$

and let $N(T) = q_{k(T) + \sqrt{k(T)}}$. It is then easy to see that, for $T$ large enough and for every $\nu > 0$, there is a constant $C_\nu > 0$ such that

$$ N(T) \le C_\nu\, T^{\frac{1}{\gamma(V)}}\, T^\nu. $$

Let us give an explicit argument for this statement:

$$ \frac{\log N(T)}{\log T} = \frac{\log q_{k(T) + \sqrt{k(T)}}}{k(T) + \sqrt{k(T)}} \cdot \frac{k(T) + \sqrt{k(T)}}{\log T} \le \frac{\log q_{k(T) + \sqrt{k(T)}}}{k(T) + \sqrt{k(T)}} \cdot \frac{k(T) + \sqrt{k(T)}}{\frac{-k(T) + 1}{2} \log c_1} \le 2D\, \frac{k(T) + \sqrt{k(T)}}{(-k(T) + 1) \log c_1}. \tag{12} $$


For $k(T)$ large enough, the last expression is close to $\frac{1}{\gamma(V)} = \frac{2D}{-\log c_1}$. So for $T$ large enough, one gets

$$ N(T) \le C_\nu\, T^{\frac{2D}{-\log c_1}}\, T^\nu $$

with $\nu$ arbitrarily small. Applying (11) to Theorem 4, we get

$$ P_r(N(T), T) \lesssim \exp(-cN(T)) + T^3 \int_{-K}^{K} \left( \max_{1 \le q_n \le N(T)} \left\| M_n\!\left(E + \tfrac{i}{T}\right) \right\|^2 \right)^{-1} dE \lesssim \exp(-cN(T)) + T^3\, e^{-2 \log(1 + \delta)\, G_{\sqrt{k(T)}}}. $$

From this bound, it is clear that $P_r(N(T), T)$ goes to zero faster than any inverse power of $T$, since the sequence $G$ has exponential growth. One gets the same bound for $P_l(N(T), T)$ because of the symmetry of the potential. Finally, one can conclude with (12) that $\alpha_u^+ \le \alpha$ with

$$ \alpha = \frac{1}{\gamma(V)} + \nu $$

and $\nu$ arbitrarily small. For the second part of the theorem, notice that the constant 2 comes from the choice of $\gamma(V)$, which considers the worst coefficient in the matrix $P_n$. But assuming that there is no 1 in the continued fraction expansion, one gets $R_k \le c_1^k$ and

$$ \gamma(V) \le \limsup_k\, -\frac{k \log c_1}{\log q_k}. $$

4. A Pathological Counterexample

The statements above hold if $D < +\infty$. In the case $D = +\infty$, we exhibit a counterexample in the next statement. It is still an open question whether $D = +\infty$ implies ballistic motion.

Theorem 6. There exists an irrational number $\omega$ with $D = +\infty$ such that for any $V > 20$, $\alpha_u^+ = 1$.

The proof, by induction, follows the lines of the pathological example in [25]. The main idea is that, choosing an irrational number close to rational numbers (with large values for the sequence $\{a_k\}_k$), the potentials of $H_\beta$ and $H_{\beta_n}$ coincide on a large time scale, large enough to say that $H_\beta$ and $H_{\beta_n}$ have the same dynamical behavior. It is well known that the periodic operator $H_{\beta_n}$ has ballistic motion.


We now make these ideas more precise and first prove the following lemma. Let $\beta_n = [a_1, \ldots, a_n]$ be fixed and let $\beta$ be any irrational number of the form $\beta = [a_1, \ldots, a_n, \ldots]$.

Lemma 4. The Sturmian potentials of the operators $H_\beta$ and $H_{\beta_n}$ have the same first $q_{n+1}$ values.

Proof. To prove this, we recall the iterative construction of the Sturmian word that coincides with our potential. For details and proofs, see, e.g., [26]. Set $W_0 = 0$ and $W_1 = 0^{a_1 - 1} V$, and define the sequence of Sturmian words by

$$W_{k+1} = W_k^{a_{k+1}} W_{k-1}, \qquad k \ge 1.$$
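The recursion for the words $W_k$ is easy to run on a small example. The sketch below builds the words as Python strings (the letter `"V"` stands for the potential value, `"0"` for zero), checks that $W_k$ has length $q_k$, and compares the periodic word generated by $W_n$ with $W_{n+1}$ on the first $q_{n+1}$ letters. The function names and the sample partial quotients are our own choices, and the check is done for $n \ge 2$ only.

```python
def denominators(partial_quotients):
    """Return [q_0, ..., q_n] with q_0 = 1, q_1 = a_1,
    q_{k+1} = a_{k+1} q_k + q_{k-1}."""
    q = [1, partial_quotients[0]]
    for a in partial_quotients[1:]:
        q.append(a * q[-1] + q[-2])
    return q


def sturmian_words(partial_quotients, v="V"):
    """Return [W_0, ..., W_n] for partial quotients [a_1, ..., a_n].

    W_0 = "0", W_1 = "0"^(a_1 - 1) + v, W_{k+1} = W_k^(a_{k+1}) W_{k-1}.
    """
    words = ["0", "0" * (partial_quotients[0] - 1) + v]
    for k in range(1, len(partial_quotients)):
        # partial_quotients[k] is a_{k+1} because the list is 0-indexed
        words.append(words[k] * partial_quotients[k] + words[k - 1])
    return words


a = [1, 2, 3, 2]          # sample values a_1, ..., a_4 (illustrative)
n = 3                     # compare beta_n = [a_1, a_2, a_3] with beta
q = denominators(a)
words = sturmian_words(a)

# Periodic potential for beta_n: repeat W_n, keep the first q_{n+1} letters.
w_n = sturmian_words(a[:n])[n]
periodic_prefix = (w_n * (a[n] + 1))[: q[n + 1]]

print(periodic_prefix == words[n + 1])
```

Only finitely many repetitions of $W_n$ are needed here, since $(a_{n+1} + 1) q_n \ge q_{n+1}$.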

Each word $W_k$ has length $q_k$. As $H_\beta$ and $H_{\beta_n}$ have the same first $n$ terms in their continued fraction expansions, the words $W_0, W_1, \ldots, W_n$ are the same for $H_\beta$ and $H_{\beta_n}$. For $H_{\beta_n}$, the limit word $W_\infty$ is periodic with period $q_n$ and repeats the word $W_n$ endlessly. As $W_n = W_{n-1}^{a_n} W_{n-2}$, one has

$$W_n^{\infty} = W_n^{a_{n+1}}\, W_{n-1}^{a_n} W_{n-2}\, W_n^{\infty}.$$

This shows that the potential of $H_{\omega_n}$ begins with the word $W_n^{a_{n+1}} W_{n-1}$, which is the word $W_{n+1}$ for $H_\omega$. As $W_{n+1}$ has length $q_{n+1}$, this ends the proof.

We need another lemma, which can be found in [25]. It states that two operators have close dynamics (on some time scale $T$) if their potentials are close enough.

Lemma 5. Let $H_1 = \Delta + V_1$ and $H_2 = \Delta + V_2$ act on $\ell^2(\mathbb{Z})$, with $|V_1(k)|, |V_2(k)| < C$ for all $k \in \mathbb{Z}$ and some constant $C$. Let $T > 0$ and $\varepsilon > 0$ be fixed constants. Then there exist $L(T, \varepsilon) > 0$ and $\delta > 0$ such that, if $|V_1(k) - V_2(k)| < \delta$ for all $|k| < L$, then

$$\big| \langle |X|^2_{H_1} \rangle_T - \langle |X|^2_{H_2} \rangle_T \big| < \varepsilon.$$

We get back to the construction.

Proof of Theorem 6. As $H_{\omega_n}$ is an operator with periodic potential, one has $\langle |X|^2_{H_{\omega_n}} \rangle_T > C_n T^2$. Choose $T_n$ big enough such that

$$C_n > \frac{1}{\log T_n}.$$

One can then choose $a_{n+1}$ such that $L(T_n, 1) \le q_{n+1}$.


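Inductively, we have a sequence $T_n$ going to infinity and an irrational number $\omega$ with

$$\langle |X|^2 \rangle_{T_n} > \frac{T_n^2}{\log T_n} - 1 > T_n^{2-\varepsilon}, \qquad \forall\, \varepsilon > 0.$$

Now, since $\omega$ is fully constructed, one can compare $H_\omega$ with $H_{\omega_n}$. Then Lemma 5 implies

$$\langle |X|^2_{H_\omega} \rangle_{T_n} > \frac{T_n^2}{\log T_n} - 1, \tag{13}$$

which yields

$$\alpha_u^+ \ge \beta^-_{\delta_1}(2) > 1 - \varepsilon, \qquad \forall\, \varepsilon > 0.$$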

5. Lower Bound for the Box Counting Dimension of the Spectrum

We now give a lower bound for the fractal box counting dimension of the spectrum of the operator $H_\beta$. We first recall the definition. If one denotes by $N(\varepsilon)$ the number of balls of diameter at most $\varepsilon$ one needs to cover $\sigma$, then the upper box counting dimension is defined by

$$\dim_B^+ = \limsup_{\varepsilon \to 0} \frac{\log N(\varepsilon)}{-\log \varepsilon}.$$

The spectrum is approached by the band spectrum of the periodic operators $H_{\beta_n}$. Moreover, [28, 27] give precise information on the number of bands and their lengths. This allows us to give a lower bound on the minimal number of sets of some decreasing scale needed to cover the spectrum, and then a lower bound on the box dimension of the limit set. A first idea to cover the spectrum is to take into account all the bands and take as a scale the smallest length, but this is a bad idea because this minimal length decreases faster than the number of intervals grows. A second idea is to count the number of bands that have the maximal length, in terms of inverse powers of $V$. This yields a better lower bound for the box dimension of the spectrum for almost every irrational number. Fixing the irrational number, one can improve this method by counting precisely the number of bands that have a particular length. This has been done for the Fibonacci number in [12], where the full fractal spectrum has been investigated. The length of a band depends on its history, in that case on the number of I in the index history. Hence, one obtains in this way all the contributions at any scale to the box dimension. It is shown that their result is optimal as $V$ increases, and one has for $\beta = [0, 1, 1, \ldots]$

$$\dim_B(\sigma(H_\beta)) \approx \frac{\log(1 + \sqrt{2})}{\log V}.$$

Another example, simpler than the golden mean, is the silver ratio. Fix $\beta = [0, 2, 2, \ldots]$; then all the bands have the same length up to a constant independent of $V$. Namely,


all bands at level $k$ have length $c_k V^{-k}$, where $c_k$ is a constant depending on the history of the band but not on $V$. At a given rank $k$, the number of bands of length $c_k V^{-k}$ needed to cover the spectrum is bounded from below by $q_k$. This implies that

$$\dim_B(\sigma(H_\beta)) \ge -\liminf_k \frac{\log q_k}{\log (c_k V^{-k})} \approx \frac{\log(1 + \sqrt{2})}{\log V}.$$

It is easy to show the reverse inequality by direct computation, and hence we obtain the same estimate in this case:

$$\dim_B(\sigma(H_\beta)) \approx \frac{\log(1 + \sqrt{2})}{\log V}.$$

It is quite astonishing that both the golden mean and the silver ratio yield the same fractal dimension estimate. Going back to the general case, we apply the same method as for the silver mean, that is, we count the number of bands at level $k$ that have length equal to $c_k V^{-k}$. We obtain:

Theorem 7. Set $C_k = \frac{3}{k} \sum_{j=1}^{k} \log(a_j + 2)$. For any irrational number $\beta$ verifying $C = \limsup C_k < +\infty$ and for $V > 20$, we have

$$\dim_B^+(\sigma) \ge \frac{1}{2}\, \frac{\log 2}{C + \log(V + 5)}, \tag{14}$$

where $\sigma$ is the spectrum of $H_\beta$.

Remark 3. As in Lemma 1, $C$ is finite on a set of full Lebesgue measure.

The following lemma gives a precise statement of the counting idea.

Lemma 6. Denote by $n_{k,I}$, $n_{k,II}$ and $n_{k,III}$ the number of bands of type I, II and III, respectively, in $\sigma_{k,1}$, $\sigma_{k+1,0}$, $\sigma_{k+1,0}$, respectively, with length greater than $\varepsilon_k = 4 \prod_{j=1}^{k} (V + 5)^{-1} (a_j + 2)^{-3}$. For all $k$, we have the following induction relation:

$$n_{k+1,I} = (a_{k+1} + 1)\, n_{k,II} + a_{k+1}\, n_{k,III},$$
$$n_{k+1,II} = \mathbf{1}_{\{a_{k+1} \le 2\}}\, n_{k,I},$$
$$n_{k+1,III} = a_{k+1}\, n_{k,II} + (a_{k+1} - 1)\, n_{k,III}.$$

Here, the initial conditions are $n_{0,I} = 1$, $n_{0,II} = 0$, $n_{0,III} = 1$. Moreover, these three sequences verify the following properties:

$$n_{k,II} \ne 0 \,\lor\, n_{k,III} \ne 0, \qquad n_{k,I} \ne 0, \qquad n_{k,I} > n_{k,III}$$

and

$$n_{k,II} + n_{k,III} > 2^{k/2}.$$
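The induction relation of Lemma 6 can be iterated directly. The following sketch (function name ours) computes the triples $(n_{k,I}, n_{k,II}, n_{k,III})$ from a list of partial quotients, starting from the initial conditions $(1, 0, 1)$, and checks on all sequences of length 6 with entries in $\{1, 2, 3, 4\}$ the doubling step $n_{k,II} + n_{k,III} \ge 2(n_{k-2,II} + n_{k-2,III})$ that the proof of the lemma relies on. The finite range of test sequences is an arbitrary choice made for illustration.

```python
from itertools import product


def band_counts(partial_quotients):
    """Return [(n_{k,I}, n_{k,II}, n_{k,III})] for k = 0, ..., n,
    following the induction relation of Lemma 6."""
    counts = [(1, 0, 1)]  # initial conditions n_{0,I}, n_{0,II}, n_{0,III}
    for a in partial_quotients:
        n1, n2, n3 = counts[-1]
        counts.append((
            (a + 1) * n2 + a * n3,       # n_{k+1,I}
            n1 if a <= 2 else 0,         # n_{k+1,II}, indicator of a_{k+1} <= 2
            a * n2 + (a - 1) * n3,       # n_{k+1,III}
        ))
    return counts


def doubling_holds(partial_quotients):
    """Check n_{k,II} + n_{k,III} >= 2 (n_{k-2,II} + n_{k-2,III}) for k >= 2."""
    sums = [n2 + n3 for _, n2, n3 in band_counts(partial_quotients)]
    return all(sums[k] >= 2 * sums[k - 2] for k in range(2, len(sums)))


# Golden mean (all a_k = 1): the sums n_{k,II} + n_{k,III} are Fibonacci numbers.
print(band_counts([1] * 6))

# Exhaustive check over short sequences of partial quotients.
print(all(doubling_holds(list(seq)) for seq in product([1, 2, 3, 4], repeat=6)))
```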


Proof. The induction relation is obvious with (5). The first two properties are proved by induction. The initial conditions give level 0. Assume they hold at level $k$; then, as $a_{k+1} > 0$, $n_{k,II} \ne 0 \lor n_{k,III} \ne 0$ implies $n_{k+1,I} \ne 0$. For the second part, if $a_{k+1} \le 2$ then $n_{k+1,II} \ne 0$; otherwise $a_{k+1} > 2$ implies $n_{k+1,III} \ne 0$. To prove $n_{k,I} > n_{k,III}$ it suffices to see that $n_{k,I} = n_{k,III} + n_{k-1,II} + n_{k-1,III}$. For the last property, it suffices to show that $n_{k,II} + n_{k,III} \ge 2(n_{k-2,II} + n_{k-2,III})$. Using the induction relation, we get

$$n_{k,II} = \big[(a_{k-1} + 1)\, n_{k-2,II} + a_{k-1}\, n_{k-2,III}\big]\, \mathbf{1}_{\{a_k \le 2\}},$$
$$n_{k,III} = (a_k - 1)\big(a_{k-1}\, n_{k-2,II} + (a_{k-1} - 1)\, n_{k-2,III}\big) + a_k\, n_{k-2,I}\, \mathbf{1}_{\{a_{k-1} \le 2\}}.$$

We distinguish 4 cases depending on the values of $a_k$ and $a_{k-1}$.

• If $a_k > 2$ and $a_{k-1} > 2$, then we simply get
$$n_{k,II} + n_{k,III} = (a_k - 1)\big(a_{k-1}\, n_{k-2,II} + (a_{k-1} - 1)\, n_{k-2,III}\big) \ge (a_k - 1)(a_{k-1} - 1)(n_{k-2,II} + n_{k-2,III}) \ge 4(n_{k-2,II} + n_{k-2,III}).$$

• If $a_k \le 2$ and $a_{k-1} > 2$, then one has
$$n_{k,II} + n_{k,III} = (a_k - 1)\big(a_{k-1}\, n_{k-2,II} + (a_{k-1} - 1)\, n_{k-2,III}\big) + (a_{k-1} + 1)\, n_{k-2,II} + a_{k-1}\, n_{k-2,III} \ge a_k a_{k-1}(n_{k-2,II} + n_{k-2,III}) \ge 3(n_{k-2,II} + n_{k-2,III}).$$

• If $a_k > 2$ and $a_{k-1} \le 2$, then one has
$$n_{k,II} + n_{k,III} = (a_k - 1)\big(a_{k-1}\, n_{k-2,II} + (a_{k-1} - 1)\, n_{k-2,III}\big) + a_k\, n_{k-2,I} \ge (a_k - 1)\big(a_{k-1}\, n_{k-2,II} + (a_{k-1} - 1)\, n_{k-2,III}\big) + a_k\, n_{k-2,III} \ge (a_k - 1)\, a_{k-1}(n_{k-2,II} + n_{k-2,III}) \ge 2(n_{k-2,II} + n_{k-2,III}).$$

• If $a_k \le 2$ and $a_{k-1} \le 2$, then one gets
$$n_{k,II} + n_{k,III} = (a_k - 1)\big(a_{k-1}\, n_{k-2,II} + (a_{k-1} - 1)\, n_{k-2,III}\big) + a_k\, n_{k-2,I} + (a_{k-1} + 1)\, n_{k-2,II} + a_{k-1}\, n_{k-2,III}.$$


And one obtains

$$n_{k,II} + n_{k,III} \ge \big((a_k - 1)\, a_{k-1} + a_{k-1} + 1\big)\, n_{k-2,II} + \big((a_k - 1)(a_{k-1} - 1) + (a_{k-1} + a_k)\big)\, n_{k-2,III} \ge (a_k a_{k-1} + 1)\, n_{k-2,II} + (a_{k-1} + a_k)\, n_{k-2,III} \ge 2(n_{k-2,II} + n_{k-2,III}).$$

Proof of Theorem 7. With the previous lemma, we find a bound for $n_{k,II} + n_{k,III}$, that is, the number of bands of length at least $\varepsilon_k$. To make sure we have a disjoint cover, we consider only half of the bands. Each band is then separated from the next by another band that we do not count. Then, by definition of the box dimension, we have

$$\dim_B^+(\sigma) \ge \liminf_k \frac{\log \frac{1}{2}(n_{k,II} + n_{k,III})}{-\log \varepsilon_k},$$

and the stated result follows.

Remark 4. The former bound for the box dimension provided in [27] was

$$\dim_B^+(\sigma) \ge \dim_H(\sigma) \ge \max\left\{ \frac{\log 2}{10 \log 2 - 3 \log t_2},\ \frac{\log M - \log 3}{\log M - \log (t_2/3)} \right\},$$

where $M = \liminf_{k \to \infty} (a_1 a_2 \cdots a_k)^{1/k}$ and $t_2 = \frac{1}{4(V+8)}$.

For almost all irrational numbers, that is, with $M$ equal to the Khintchin constant (2.685...), our bound is better than the one above for any $V > 20$. On the other hand, for every fixed $V$, there is no improvement for some specific numbers. Fixing $\beta = [0, c, c, \ldots]$, the bound above goes to 1 and (14) goes to 0 as $c$ goes to infinity. A lower bound for the box dimension can be relevant to obtain a bound for the lower dynamical exponent $\alpha_u^-$.

Definition 2. An irrational number is said to be a bounded density irrational number if it fulfills the following condition:

$$\limsup_n \frac{1}{n} \sum_{i=1}^{n} a_i < +\infty.$$

Theorem 8. For any bounded density irrational number, we have

$$\alpha_u^- \ge \frac{1}{2}\, \frac{\log 2}{C + \log(V + 5)}$$

with $C = \limsup_k \frac{3}{k} \sum_{j=1}^{k} \log(a_j + 2)$.

Proof. It is shown in [30, 12] that if the norms of the transfer matrices are polynomially bounded on the spectrum, then we have $\alpha_u^- \ge \dim_B^+(\sigma)$. This property of the norms of the transfer matrices is shown for irrationals with bounded density in [18].


Acknowledgments

It is a pleasure to thank Dominique Vieugué for useful conversations about number theory.

References

[1] J. Bellissard, B. Iochum, E. Scoppola and D. Testard, Spectral properties of one dimensional quasi-crystals, Comm. Math. Phys. 125 (1989) 527–543.
[2] J.-M. Barbaroux, F. Germinet and S. Tcheremchantsev, Fractal dimensions and the phenomenon of intermittency in quantum dynamics, Duke Math. J. 110 (2001) 161–193.
[3] J. M. Combes, Connections between quantum dynamics and spectral properties of time-evolution operators, in Differential Equations with Applications to Mathematical Physics (Academic Press, Boston, 1993), pp. 59–68.
[4] D. Damanik, Dynamical upper bounds for one-dimensional quasicrystals, J. Math. Anal. Appl. 303 (2005) 327–341.
[5] D. Damanik, α-continuity properties of one-dimensional quasicrystals, Comm. Math. Phys. 192 (1998) 169–182.
[6] D. Damanik, R. Killip and D. Lenz, Uniform spectral properties of one-dimensional quasicrystals. III. α-continuity, Comm. Math. Phys. 212 (2000) 191–204.
[7] D. Damanik, D. Lenz and G. Stolz, Lower transport bounds for one-dimensional continuum Schrödinger operators, Math. Ann. 336 (2006) 361–389.
[8] D. Damanik, A. Sütő and S. Tcheremchantsev, Power-law bounds on transfer matrices and quantum dynamics in one dimension II, J. Funct. Anal. 216 (2004) 362–387.
[9] D. Damanik and S. Tcheremchantsev, Power-law bounds on transfer matrices and quantum dynamics in one dimension, Comm. Math. Phys. 236 (2003) 513–534.
[10] D. Damanik and S. Tcheremchantsev, Upper bounds in quantum dynamics, J. Amer. Math. Soc. 20 (2007) 799–827.
[11] D. Damanik and S. Tcheremchantsev, Scaling estimates for solutions and dynamical lower bounds on wavepacket spreading, J. Anal. Math. 97 (2005) 103–131.
[12] D. Damanik, M. Embree, A. Gorodetski and S. Tcheremchantsev, The fractal dimension of the spectrum of the Fibonacci Hamiltonian, Comm. Math. Phys. 280 (2008) 499–516.
[13] F. Germinet, A. Kiselev and S. Tcheremchantsev, Transfer matrices and transport for Schrödinger operators, Ann. Inst. Fourier (Grenoble) 54 (2004) 787–830.
[14] I. Guarneri, Spectral properties of quantum diffusion on discrete lattices, Europhys. Lett. 10 (1989) 95–100.
[15] I. Guarneri, On an estimate concerning quantum diffusion in the presence of a fractal spectrum, Europhys. Lett. 21 (1993) 729–733.
[16] I. Guarneri and H. Schulz-Baldes, Lower bounds on wave packet propagation by packing dimensions of spectral measures, Math. Phys. Electron. J. 5 (1999), Paper 1, 16 pp.
[17] I. Guarneri and H. Schulz-Baldes, Intermittent lower bounds on quantum diffusion, Lett. Math. Phys. 49 (1999) 317–324.
[18] B. Iochum, L. Raymond and D. Testard, Resistance of one-dimensional quasicrystals, Phys. A 187 (1992) 353–368.
[19] S. Jitomirskaya and Y. Last, Power-law subordinacy and singular spectra. I. Half-line operators, Acta Math. 183 (1999) 171–189.
[20] S. Jitomirskaya and Y. Last, Power-law subordinacy and singular spectra. II. Line operators, Comm. Math. Phys. 211 (2000) 643–658.
[21] S. Jitomirskaya, H. Schulz-Baldes and G. Stolz, Delocalization in random polymer models, Comm. Math. Phys. 233 (2003) 27–48.
[22] A. Ya. Khinchin, Continued Fractions (University of Chicago Press, 1964).
[23] A. Kiselev and Y. Last, Solutions, spectrum and dynamics for Schrödinger operators on infinite domains, Duke Math. J. 102 (2000) 125–150.
[24] R. Killip, A. Kiselev and Y. Last, Dynamical upper bounds on wavepacket spreading, Amer. J. Math. 125 (2003) 1165–1198.
[25] Y. Last, Quantum dynamics and decompositions of singular continuous spectra, J. Funct. Anal. 142 (1996) 406–445.
[26] M. Lothaire, Algebraic Combinatorics on Words (Cambridge Univ. Press, 2002), Chap. 2, pp. 40–97.
[27] Q. Liu and Z. Wen, Hausdorff dimension of spectrum of one-dimensional Schrödinger operator with Sturmian potentials, Potential Anal. 20 (2004) 33–59.
[28] L. Raymond, A constructive gap labelling for the discrete Schrödinger operator on a quasiperiodic chain, preprint (1997).
[29] A. Sütő, The spectrum of a quasiperiodic Schrödinger operator, Comm. Math. Phys. 111 (1987) 409–415.
[30] S. Tcheremchantsev, Mixed lower bound in quantum transport, J. Funct. Anal. 197 (2003) 247–282.
[31] S. Tcheremchantsev, Dynamical analysis of Schrödinger operators with growing sparse potentials, Comm. Math. Phys. 253 (2005) 221–252.
[32] G. Teschl, Jacobi Operators and Completely Integrable Nonlinear Lattices, Mathematical Surveys and Monographs, Vol. 72 (Amer. Math. Soc., 2000).

September 14, J070-S0129055X10004107

2010 13:29 WSPC/S0129-055X

148-RMP

Reviews in Mathematical Physics Vol. 22, No. 8 (2010) 881–961 © World Scientific Publishing Company DOI: 10.1142/S0129055X10004107

ASYMPTOTICS FOR FERMI CURVES: SMALL MAGNETIC POTENTIAL

GUSTAVO DE OLIVEIRA
Department of Mathematics, University of British Columbia, Canada
[email protected]

Received 9 March 2010

We consider complex Fermi curves of electric and magnetic periodic fields. These are analytic curves in $\mathbb{C}^2$ that arise from the study of the eigenvalue problem for periodic Schrödinger operators. We characterize a certain class of these curves in the region of $\mathbb{C}^2$ where at least one of the coordinates has “large” imaginary part. The new results in this work extend previous results in the absence of magnetic field to the case of “small” magnetic field. Our theorems can be used to show that generically these Fermi curves belong to a class of Riemann surfaces of infinite genus.

Keywords: Fermi curves; Bloch variety; Fermi surfaces; periodic Schrödinger operators.

Mathematics Subject Classification 2010: 47B99, 81Q99, 14H55

1. Introduction

In [1], the authors introduced a class of Riemann surfaces of infinite genus that are “asymptotic to” a finite number of complex lines joined by infinitely many handles. These surfaces are constructed by pasting together a compact submanifold of finite genus, plane domains, and handles. All these components satisfy a number of geometric/analytic hypotheses stated in [1] that specify the asymptotic holomorphic structure of the surface. The class of surfaces obtained in this way yields an extension of the classical theory of compact Riemann surfaces that has analogues of many theorems of the classical theory. It was proven in [1] that this new class includes quite general hyperelliptic surfaces, heat curves (which are spectral curves associated to a certain “heat-equation”), and Fermi curves with zero magnetic potential. In order to verify the geometric/analytic hypotheses for the latter, the authors proved two “asymptotic” theorems similar to the ones we prove below. This is the main step needed to verify these hypotheses. In this work, we extend their results to Fermi curves with “small” magnetic potential.

There are two immediate applications of our results. First, as we have already mentioned, one can use our theorems for verifying the geometric/analytic hypotheses of [1] for Fermi curves with small magnetic potential. This would show that


these curves belong to the class of Riemann surfaces mentioned above. Secondly, one can prove that a class of these curves are irreducible (in the usual algebraic-geometrical sense). Both these applications were done in [1] for Fermi curves with zero magnetic potential.

Complex Fermi curves (and other similar spectral curves) have been studied, from different perspectives, in the absence of magnetic field [1–5], and in the presence of magnetic field [6]. Some results on the real Fermi curve in the high-energy region were obtained in [7]. There one also finds a short description of the existing results on periodic magnetic Schrödinger operators. An even more general review is presented in [8].

To our knowledge, our work provides new results on complex Fermi curves with magnetic field. At this moment, we are only able to handle the case of “small” magnetic potential. The asymptotic characterization of Fermi curves with arbitrarily large magnetic potential remains an open problem.

In order to prove our theorems, we follow the same strategy as [1]. The presence of magnetic field makes the analysis considerably harder and requires new estimates. As was pointed out in [7, 8], the study of an operator with magnetic potential is essentially more complicated than the study of the operator with just an electric potential. This seems to be the case in this problem as well.

Before we outline our results let us introduce some definitions. Let $\Gamma$ be a lattice in $\mathbb{R}^2$ and let $A_1$, $A_2$ and $V$ be real-valued functions in $L^2(\mathbb{R}^2)$ that are periodic with respect to $\Gamma$. Set $A := (A_1, A_2)$ and define the operator

$$H(A, V) := (i\nabla + A)^2 + V$$

acting on $L^2(\mathbb{R}^2)$, where $\nabla$ is the gradient operator in $\mathbb{R}^2$. For $k \in \mathbb{R}^2$ consider the following eigenvalue-eigenvector problem in $L^2(\mathbb{R}^2)$ with boundary conditions,

$$H(A, V)\varphi = \lambda\varphi, \qquad \varphi(x + \gamma) = e^{ik\cdot\gamma}\varphi(x)$$

for all $x \in \mathbb{R}^2$ and all $\gamma \in \Gamma$. Under suitable hypotheses on the potentials $A$ and $V$ this problem is self-adjoint and its spectrum is discrete. It consists of a sequence of real eigenvalues

$$E_1(k, A, V) \le E_2(k, A, V) \le \cdots \le E_n(k, A, V) \le \cdots.$$

For each integer $n \ge 1$, the eigenvalue $E_n(k, A, V)$ defines a continuous function of $k$. From the above boundary condition, it is easy to see that this function is periodic with respect to the dual lattice

$$\Gamma^\# := \{b \in \mathbb{R}^2 \mid b \cdot \gamma \in 2\pi\mathbb{Z} \text{ for all } \gamma \in \Gamma\},$$

where $b \cdot \gamma$ is the usual scalar product on $\mathbb{R}^2$. It is customary to refer to $k$ as the crystal momentum and to $E_n(k, A, V)$ as the $n$th band function. The corresponding normalized eigenfunctions $\varphi_{n,k}$ are called Bloch eigenfunctions.

The operator $H(A, V)$ (and its three-dimensional counterpart) is important in solid state physics. It is the Hamiltonian of a single electron under the influence of


magnetic field with vector potential $A$, and electric field with scalar potential $V$, in the independent electron model of a two-dimensional solid [9].

The classical framework for studying the spectrum of a differential operator with periodic coefficients is the Floquet (or Bloch) theory [9–11]. Roughly speaking, the main idea of this theory is to “decompose” the original eigenvalue problem, which usually has continuous spectrum, into a family of boundary value problems, each one having discrete spectrum. In our context this leads to decomposing the problem $H(A, V)\varphi = \lambda\varphi$ (without boundary conditions) into the above $k$-family of boundary value problems.

Let $U_k$ be the unitary transformation on $L^2(\mathbb{R}^2)$ that acts as

$$U_k : \varphi(x) \mapsto e^{ik\cdot x}\varphi(x).$$

By applying this transformation, we can rewrite the above problem and put the boundary conditions into the operator. Indeed, if we define

$$H_k(A, V) := U_k^{-1} H(A, V)\, U_k \quad \text{and} \quad \psi := U_k^{-1}\varphi,$$

then the above problem is unitarily equivalent to

$$H_k(A, V)\psi = \lambda\psi \quad \text{for } \psi \in L^2(\mathbb{R}^2/\Gamma).$$

Furthermore, a simple (formal) calculation shows that

$$H_k(A, V) = (i\nabla + A - k)^2 + V.$$

The real “lifted” Fermi curve of $(A, V)$ with energy $\lambda \in \mathbb{R}$ is defined as

$$\hat F_{\lambda,\mathbb{R}}(A, V) := \{k \in \mathbb{R}^2 \mid (H_k(A, V) - \lambda)\varphi = 0 \text{ for some } \varphi \in D_{H_k(A,V)} \setminus \{0\}\},$$

where $D_{H_k(A,V)} \subset L^2(\mathbb{R}^2/\Gamma)$ denotes the (dense) domain of $H_k(A, V)$. The adjective “lifted” indicates that $\hat F_{\lambda,\mathbb{R}}(A, V)$ is a subset of $\mathbb{R}^2$ rather than $\mathbb{R}^2/\Gamma^\#$. As we may replace $V$ by $V - \lambda$, we only discuss the case $\lambda = 0$ and write $\hat F_{\mathbb{R}}(A, V)$ in place of $\hat F_{0,\mathbb{R}}(A, V)$ to simplify the notation. Let

$$|\Gamma| := \int_{\mathbb{R}^2/\Gamma} dx \quad \text{and} \quad \hat A(0) := |\Gamma|^{-1} \int_{\mathbb{R}^2/\Gamma} A(x)\, dx.$$

Since $H_k(A, V)$ is equal to $H_{k - \hat A(0)}(A - \hat A(0), V)$, if we perform the change of coordinates $k \to k + \hat A(0)$ and redefine $A - \hat A(0) \to A$, we may assume, without loss of generality, that $\hat A(0) = 0$.

The dual lattice $\Gamma^\#$ acts on $\mathbb{R}^2$ by translating $k \mapsto k + b$ for $b \in \Gamma^\#$. This action maps $\hat F_{\mathbb{R}}(A, V)$ to itself because for each $n \ge 1$ the function $k \mapsto E_n(k, A, V)$ is periodic with respect to $\Gamma^\#$. In other words, the real lifted Fermi curve “is periodic” with respect to $\Gamma^\#$. Define

$$F_{\mathbb{R}}(A, V) := \hat F_{\mathbb{R}}(A, V)/\Gamma^\#.$$

We call $F_{\mathbb{R}}(A, V)$ the real Fermi curve of $(A, V)$. It is a curve in the torus $\mathbb{R}^2/\Gamma^\#$.

The above definitions and the real Fermi curve have physical meaning. It is useful and interesting, however, to study the “complexification” of these curves. Knowledge about the complexified curves may provide information about the real counterparts.

For complex-valued functions $A_1$, $A_2$ and $V$ in $L^2(\mathbb{R}^2)$ and for $k \in \mathbb{C}^2$ the above problem is no longer self-adjoint. Its spectrum, however, remains discrete. It is a sequence of eigenvalues in the complex plane. From the boundary condition


in the original problem it is easy to see that the family of functions $k \mapsto E_n(k, A, V)$ remains periodic with respect to $\Gamma^\#$. Moreover, the transformation $U_k$ is no longer unitary, but it is still bounded and invertible and it still preserves the spectrum; that is, we can still rewrite the original problem in the form $H_k(A, V)\psi = \lambda\psi$ for $\psi \in L^2(\mathbb{R}^2/\Gamma)$ without modifying the eigenvalues. Thus, it makes sense to define

$$\hat F(A, V) := \{k \in \mathbb{C}^2 \mid H_k(A, V)\varphi = 0 \text{ for some } \varphi \in D_{H_k(A,V)} \setminus \{0\}\},$$
$$F(A, V) := \hat F(A, V)/\Gamma^\#.$$

We call $\hat F(A, V)$ and $F(A, V)$ the complex “lifted” Fermi curve and the complex Fermi curve, respectively. When there is no risk of confusion we refer to either simply as Fermi curve.

We are now ready to outline our results. When $A$ and $V$ are zero the (free) Fermi curve can be found explicitly. It consists of two copies of $\mathbb{C}$ with the points $-b_2 + ib_1$ (in the first copy) and $b_2 + ib_1$ (in the second copy) identified for all $(b_1, b_2) \in \Gamma^\#$ with $b_2 \ne 0$. In this work, we prove that in the region of $\mathbb{C}^2$ where $k \in \mathbb{C}^2$ has “large” imaginary part the Fermi curve (for nonzero $A$ and $V$) is “close to” the free Fermi curve. In a compact form, our main result (that will be stated precisely in Theorems 1 and 2) is essentially the following.

Main result. Suppose that $A$ and $V$ have some regularity and assume that (in a suitable norm) $A$ is smaller than a constant given by the parameters of the problem. Write $k$ in $\mathbb{C}^2$ as $k = u + iv$ with $u$ and $v$ in $\mathbb{R}^2$ and suppose that $|v|$ is larger than a constant given by the parameters of the problem. (Recall that the free Fermi curve is two copies of $\mathbb{C}$ with certain points in one copy identified with points in the other one.) Then, in this region of $\mathbb{C}^2$, the Fermi curve of $A$ and $V$ is very close to the free Fermi curve, except that instead of two planes we may have two deformed planes, and identifications between points can open up to handles that look like $\{(z_1, z_2) \in \mathbb{C}^2 \mid z_1 z_2 = \text{constant}\}$ in suitable local coordinates.

The proof of our results has basically three steps:

• We first derive very detailed information about the free Fermi curve (which is explicitly known). Then, to compute the interacting Fermi curve we have to find the kernel of $H$ in $L^2(\mathbb{R}^2)$ with the above boundary conditions.

• In the second step of the proof, we derive a number of estimates for showing that this kernel has finite dimension for small $A$ and $k \in \mathbb{C}^2$ with large imaginary part. Our strategy here is similar to the Feshbach method in perturbation theory [12]. Indeed, we prove that in the complement of the kernel of $H$ in $L^2(\mathbb{R}^2)$, after a suitable invertible change of variables in $L^2(\mathbb{R}^2)$, the operator $H$ multiplied by the inverse of the operator that implements this change of variables is a compact perturbation of the identity that is invertible for such $A$ and $k$. This reduces the problem of finding the kernel to finite dimension and thus we can write local defining equations for the Fermi curve.


• In the third step of the proof, we use these equations to study the Fermi curve. A few more estimates and the implicit function theorem give us the deformed planes. The handles are obtained using a quantitative Morse lemma from [13] that is available in Appendix A.

Steps two and three contain most of the novelties in this work. The critical part of the proof is the second step. The main difficulty arises due to the presence of the term $A \cdot i\nabla$ in the Hamiltonian $H(A, V)$. When $A$ is large, taking the imaginary part of $k \in \mathbb{C}^2$ arbitrarily large is not enough to control this term: it is not enough to make its contribution small and hence have the interacting Fermi curve as a perturbation of the free Fermi curve. (The term $V$ in $H(A, V)$ is easily controlled by this method.) However, the proof can be implemented by assuming that $A$ is small.

This work is organized as follows. In Sec. 2, we collect some properties of the free Fermi curve and in Sec. 3, we define ε-tubes about it. In Sec. 4, we state our main results and in Sec. 5, we describe the general strategy of analysis used to prove them. Subsequently, we implement this strategy by proving a number of lemmas and propositions in Secs. 6–10, which we put together later in Secs. 11 and 12 to prove our main theorems. The proofs of the estimates of Secs. 9 and 10 are left to Appendices B and C.

2. The Free Fermi Curve

When the potentials $A$ and $V$ are zero the curve $\hat F(A, V)$ can be found explicitly. In this section we collect some properties of this curve. For $\nu \in \{1, 2\}$ and $b \in \Gamma^\#$ set

$$N_{b,\nu}(k) := (k_1 + b_1) + i(-1)^\nu (k_2 + b_2), \qquad N_\nu(b) := \{k \in \mathbb{C}^2 \mid N_{b,\nu}(k) = 0\},$$
$$N_b(k) := N_{b,1}(k)\, N_{b,2}(k), \qquad N_b := N_1(b) \cup N_2(b),$$
$$\theta_\nu(b) := \frac{1}{2}\big((-1)^\nu b_2 + ib_1\big).$$

Observe that $N_\nu(b)$ is a line in $\mathbb{C}^2$. The free lifted Fermi curve is a union of these lines. Here is the precise statement.

Proposition 1 (The Free Fermi Curve). The curve $\hat F(0, 0)$ is the locally finite union

$$\bigcup_{b \in \Gamma^\#}\ \bigcup_{\nu \in \{1,2\}} N_\nu(b).$$

In particular, the curve $F(0, 0)$ is a complex analytic curve in $\mathbb{C}^2/\Gamma^\#$.


The proof of this proposition is straightforward. It can be found in [13]. Here we only give its first part.

Proof of Proposition 1 (First Part). For all $k \in \mathbb{C}^2$ the functions $\{e^{ib\cdot x} \mid b \in \Gamma^\#\}$ form a complete set of eigenfunctions for $H_k(0, 0)$ in $L^2(\mathbb{R}^2/\Gamma)$ satisfying

$$H_k(0, 0)\, e^{ib\cdot x} = (i\nabla - k)^2 e^{ib\cdot x} = (b + k)^2 e^{ib\cdot x} = N_b(k)\, e^{ib\cdot x}.$$

Hence,

$$\hat F(0, 0) = \{k \in \mathbb{C}^2 \mid N_b(k) = 0 \text{ for some } b \in \Gamma^\#\} = \bigcup_{b \in \Gamma^\#} N_b = \bigcup_{b \in \Gamma^\#}\ \bigcup_{\nu \in \{1,2\}} N_\nu(b).$$

This is the desired expression for $\hat F(0, 0)$.

The lines $N_\nu(b)$ have the following properties (see [13] for a proof).

Proposition 2 (Properties of $N_\nu(b)$). Let $\nu \in \{1, 2\}$ and let $b, c, d \in \Gamma^\#$. Then:

(a) $N_\nu(b) \cap N_\nu(c) = \emptyset$ if $b \ne c$;
(b) $\operatorname{dist}(N_\nu(b), N_\nu(c)) = \frac{1}{\sqrt{2}}\, |b - c|$;
(c) $N_1(b) \cap N_2(c) = \{(i\theta_1(c) + i\theta_2(b),\ \theta_1(c) - \theta_2(b))\}$;
(d) the map $k \mapsto k + d$ maps $N_\nu(b)$ to $N_\nu(b - d)$;
(e) the map $k \mapsto k + d$ maps $N_1(b) \cap N_2(c)$ to $N_1(b - d) \cap N_2(c - d)$.
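Items (c) and (e) of Proposition 2 can be checked numerically with ordinary complex arithmetic. The sketch below is only a sanity check under the definitions of $N_{b,\nu}$ and $\theta_\nu$ given above; the helper names `N`, `theta`, `intersection` and the sample vectors are ours, and the vectors need not belong to any particular lattice since the identities are algebraic.

```python
def theta(nu, b):
    """theta_nu(b) = ((-1)^nu * b_2 + i * b_1) / 2 for a real pair b = (b_1, b_2)."""
    return 0.5 * ((-1) ** nu * b[1] + 1j * b[0])


def N(nu, b, k):
    """N_{b,nu}(k) = (k_1 + b_1) + i (-1)^nu (k_2 + b_2) for k in C^2."""
    return (k[0] + b[0]) + 1j * (-1) ** nu * (k[1] + b[1])


def intersection(b, c):
    """The point of N_1(b) and N_2(c) given in Proposition 2(c)."""
    return (1j * theta(1, c) + 1j * theta(2, b), theta(1, c) - theta(2, b))


b, c, d = (1.0, 2.0), (-3.0, 0.5), (4.0, -1.5)   # illustrative real vectors

k = intersection(b, c)
print(abs(N(1, b, k)), abs(N(2, c, k)))          # both should vanish up to rounding

# Item (e): translating by d sends the point to the one for (b - d, c - d).
shifted = (k[0] + d[0], k[1] + d[1])
target = intersection((b[0] - d[0], b[1] - d[1]), (c[0] - d[0], c[1] - d[1]))
print(abs(shifted[0] - target[0]), abs(shifted[1] - target[1]))
```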

Let us briefly describe what the free Fermi curve looks like. In Fig. 1, there is a sketch of the set of $(k_1, k_2) \in \hat F(0, 0)$ for which both $ik_1$ and $k_2$ are real, for the case where the lattice $\Gamma^\#$ has points over the coordinate axes, that is, it has points of the form $(b_1, 0)$ and $(0, b_2)$.

[Fig. 1. Sketch of $\hat F(0, 0)$ and $F(0, 0)$ when both $ik_1$ and $k_2$ are real. Axes: $ik_1$ and $k_2$. Lines shown: $N_1(0)$, $N_1(b)$, $N_1(-b)$, $N_2(0)$, $N_2(b)$, $N_2(-b)$.]

Observe that, in particular, Proposition 2 yields

$$N_1(0) \cap N_2(b) = \{(i\theta_1(b), \theta_1(b))\}, \qquad N_1(-b) \cap N_2(0) = \{(i\theta_2(-b), \theta_2(b))\},$$

and the map $k \mapsto k + b$ maps $N_1(0) \cap N_2(b)$ to $N_1(-b) \cap N_2(0)$.

Recall that points in $\hat F(0, 0)$ that differ by elements of $\Gamma^\#$ correspond to the same point in $F(0, 0)$. Thus, in the sketch on the left, we should identify the lines $k_2 = -b_2/2$ and $k_2 = b_2/2$ for all $b \in \Gamma^\#$ with $b_2 \ne 0$, to get a pair of helices climbing up the outside of a cylinder, as illustrated by the figure on the right. The helices intersect each other twice on each cycle of the cylinder: once on the front half of the cylinder and once on the back half. Hence, viewed as a “manifold” (with singularities), the pair of helices are just two copies of $\mathbb{R}$ with points that correspond to intersections identified. We can use $k_2$ as a coordinate in each copy of $\mathbb{R}$ and then the pairs of identified points are $k_2 = b_2/2$ and $k_2 = -b_2/2$ for all $b \in \Gamma^\#$ with $b_2 \ne 0$.

So far we have only considered $k_2$ real. The full $\hat F(0, 0)$ is just two copies of $\mathbb{C}$ with $k_2$ as a coordinate in each copy, provided we identify the points $\theta_1(b) = \frac{1}{2}(-b_2 + ib_1)$ (in the first copy) and $\theta_2(b) = \frac{1}{2}(b_2 + ib_1)$ (in the second copy) for all $b \in \Gamma^\#$ with $b_2 \ne 0$.

3. The ε-Tubes about the Free Fermi Curve

We now introduce real and imaginary coordinates in $\mathbb{C}^2$ and define ε-tubes about the free Fermi curve. We derive some properties of the ε-tubes as well. For $k \in \mathbb{C}^2$ write

$$k_1 = u_1 + iv_1 \quad \text{and} \quad k_2 = u_2 + iv_2,$$

where $u_1$, $u_2$, $v_1$ and $v_2$ are real numbers. Then,

$$N_{b,\nu}(k) = (k_1 + b_1) + i(-1)^\nu (k_2 + b_2) = i\big(v_1 + (-1)^\nu (u_2 + b_2)\big) - (-1)^\nu \big(v_2 - (-1)^\nu (u_1 + b_1)\big),$$

so that

$$|N_{b,\nu}(k)| = |v + (-1)^\nu (u + b)^\perp|,$$

where $(y_1, y_2)^\perp := (y_2, -y_1)$. Since $N_b(k) = N_{b,1}(k)\, N_{b,2}(k)$, we have $N_b(k) = 0$ if and only if

$$v - (u + b)^\perp = 0 \quad \text{or} \quad v + (u + b)^\perp = 0.$$
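The identity $|N_{b,\nu}(k)| = |v + (-1)^\nu (u + b)^\perp|$ and the factorization $N_b(k) = (k_1 + b_1)^2 + (k_2 + b_2)^2$ are simple enough to confirm numerically. The following sketch does so for a few sample values; the function names and the sample data are ours and serve only as an illustration of the formulas above.

```python
import math


def N(nu, b, k):
    """N_{b,nu}(k) = (k_1 + b_1) + i (-1)^nu (k_2 + b_2)."""
    return (k[0] + b[0]) + 1j * (-1) ** nu * (k[1] + b[1])


def perp(y):
    """(y_1, y_2)^perp = (y_2, -y_1)."""
    return (y[1], -y[0])


def real_norm(nu, b, u, v):
    """|v + (-1)^nu (u + b)^perp| for real vectors b, u, v in R^2."""
    s = (-1) ** nu
    p = perp((u[0] + b[0], u[1] + b[1]))
    return math.hypot(v[0] + s * p[0], v[1] + s * p[1])


samples = [
    ((1.0, -2.0), (0.3, 0.7), (5.0, -4.0)),   # (b, u, v), illustrative values
    ((0.0, 6.5), (-1.2, 2.2), (0.1, 9.0)),
]
for b, u, v in samples:
    k = (complex(u[0], v[0]), complex(u[1], v[1]))   # k = u + i v
    for nu in (1, 2):
        print(abs(N(nu, b, k)), real_norm(nu, b, u, v))
```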

Let 2Λ be the length of the shortest nonzero “vector” in Γ# . Then there is at most one b ∈ Γ# with |v + (u + b)⊥ | < Λ and at most one b ∈ Γ# with |v − (u + b)⊥ | < Λ (see [13] for the proof).
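For a concrete lattice, $\Lambda$ is easy to compute. The sketch below builds a basis of the dual lattice $\Gamma^\#$ from a basis $(\gamma_1, \gamma_2)$ of $\Gamma$ via $b_i \cdot \gamma_j = 2\pi\delta_{ij}$, and then finds the shortest nonzero dual vector by Lagrange (Gauss) reduction of the two-dimensional basis. The choice of this reduction algorithm and all function names are ours; floating point arithmetic is used, so the result is approximate.

```python
import math


def dual_basis(g1, g2):
    """Return (b1, b2) with b_i . g_j = 2 pi delta_ij for a basis (g1, g2) of Gamma."""
    det = g1[0] * g2[1] - g1[1] * g2[0]
    b1 = (2 * math.pi * g2[1] / det, -2 * math.pi * g2[0] / det)
    b2 = (-2 * math.pi * g1[1] / det, 2 * math.pi * g1[0] / det)
    return b1, b2


def shortest_vector_length(b1, b2):
    """Length of a shortest nonzero vector of the lattice spanned by b1 and b2,
    computed by Lagrange reduction."""
    while True:
        if b1[0] ** 2 + b1[1] ** 2 > b2[0] ** 2 + b2[1] ** 2:
            b1, b2 = b2, b1                      # keep b1 the shorter vector
        m = round((b1[0] * b2[0] + b1[1] * b2[1]) / (b1[0] ** 2 + b1[1] ** 2))
        if m == 0:
            return math.hypot(b1[0], b1[1])      # basis is reduced
        b2 = (b2[0] - m * b1[0], b2[1] - m * b1[1])


def big_lambda(g1, g2):
    """Lambda, where 2 * Lambda is the length of the shortest nonzero vector of Gamma#."""
    return shortest_vector_length(*dual_basis(g1, g2)) / 2


# Square lattice Z^2, described by a deliberately skewed basis.
print(big_lambda((1.0, 0.0), (3.0, 1.0)))

# Hexagonal lattice with unit edge length.
print(big_lambda((1.0, 0.0), (0.5, math.sqrt(3) / 2)))
```

With $\Lambda$ in hand, any $0 < \varepsilon < \Lambda/6$ is an admissible tube radius in the definitions that follow.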

September 14, J070-S0129055X10004107

888

2010 13:29 WSPC/S0129-055X

148-RMP

G. de Oliveira

Let ε be a constant satisfying $0 < \varepsilon < \Lambda/6$. For $\nu \in \{1, 2\}$ and $b \in \Gamma^\#$, define the ε-tube about $N_\nu(b)$ as
\[ T_\nu(b) := \{k \in \mathbb{C}^2 \mid |N_{b,\nu}(k)| = |v + (-1)^\nu (u + b)^\perp| < \varepsilon\}, \]
and the ε-tube about $N_b = N_1(b) \cup N_2(b)$ as $T_b := T_1(b) \cup T_2(b)$. Since $(v + (u + b)^\perp) + (v - (u + b)^\perp) = 2v$, at least one of the factors $|v + (u + b)^\perp|$ or $|v - (u + b)^\perp|$ in $|N_b(k)|$ must always be greater than or equal to $|v|$. If $k \notin T_b$, both factors are also greater than or equal to ε. If $k \in T_b$, one factor is bounded by ε and the other must lie within ε of $|2v|$. Thus,
\[ k \notin T_b \;\Rightarrow\; |N_b(k)| \ge \varepsilon |v|, \tag{1} \]
\[ k \in T_b \;\Rightarrow\; |N_b(k)| \le \varepsilon (2|v| + \varepsilon). \tag{2} \]
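The bounds (1) and (2) follow from elementary estimates on the two factors of $|N_b(k)|$, and a quick randomized check can guard against sign slips. The sketch below assumes, purely for illustration, the lattice $\Gamma^\# = \mathbb{Z}^2$ and the value $\varepsilon = 0.05$; neither choice comes from the paper.

```python
import math
import random

def factors(u, v, b):
    # The two factors |v + (u+b)^perp| and |v - (u+b)^perp| of |N_b(k)|, k = u + i v
    px, py = u[1] + b[1], -(u[0] + b[0])      # (u + b)^perp
    return (math.hypot(v[0] + px, v[1] + py),
            math.hypot(v[0] - px, v[1] - py))

eps = 0.05                  # illustrative tube radius
rng = random.Random(0)      # fixed seed for reproducibility
violations = 0
for _ in range(20000):
    u = (rng.uniform(-3, 3), rng.uniform(-3, 3))
    v = (rng.uniform(-3, 3), rng.uniform(-3, 3))
    b = (rng.randint(-2, 2), rng.randint(-2, 2))
    f1, f2 = factors(u, v, b)
    n_abs = f1 * f2
    v_abs = math.hypot(v[0], v[1])
    in_tube = min(f1, f2) < eps               # k lies in T_b
    if in_tube and n_abs > eps * (2 * v_abs + eps) + 1e-12:
        violations += 1                       # would contradict (2)
    if not in_tube and n_abs < eps * v_abs - 1e-12:
        violations += 1                       # would contradict (1)
```

No violations are expected, since both implications hold for every $k$ and $b$.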

Finally, denote by $\bar T_b$ the closure of $T_b$. The intersection $\bar T_b \cap \bar T_{b'}$ is compact whenever $b \neq b'$, and $\bar T_b \cap \bar T_{b'} \cap \bar T_{b''}$ is empty for all distinct elements $b, b', b'' \in \Gamma^\#$ (see [13] for details). If a point $k$ belongs to the free Fermi curve, the function $N_b(k)$ vanishes for some $b \in \Gamma^\#$. We now give a lower bound for this function when $(b, k)$ is not in the zero set.

Proposition 3 (Lower Bound for $|N_b(k)|$).
(a) If $|b + u + v^\perp| \ge \Lambda$ and $|b + u - v^\perp| \ge \Lambda$, then $|N_b(k)| \ge \frac{\Lambda}{2}(|v| + |u + b|)$.
(b) If $|v| > 2\Lambda$ and $k \in T_0$, then $|N_b(k)| \ge \frac{\Lambda}{2}(|v| + |u + b|)$ for all $b \neq 0$ but at most one $b \neq 0$. This exceptional $\tilde b$ obeys $|\tilde b| > |v|$ and $\big|\,|u + \tilde b| - |v|\,\big| < \Lambda$.
(c) If $|v| > 2\Lambda$ and $k \in T_0 \cap T_d$ with $d \neq 0$, then $|N_b(k)| \ge \frac{\Lambda}{2}(|v| + |u + b|)$ for all $b \notin \{0, d\}$. Furthermore, we have $|d| > |v|$ and $\big|\,|u + d| - |v|\,\big| < \Lambda$.

Proof. (a) By hypothesis, both factors in $|N_b(k)| = |v + (u + b)^\perp|\,|v - (u + b)^\perp|$ are greater than or equal to Λ. We now prove that at least one of the factors must also be greater than or equal to $\frac12(|v| + |u + b|)$. Suppose that $|v| \ge |u + b|$. Then, since $(v + (u + b)^\perp) + (v - (u + b)^\perp) = 2v$, at least one of the factors must be greater than or equal to $|v| = \frac12(|v| + |v|) \ge \frac12(|v| + |u + b|)$. Now suppose that $|v| < |u + b|$. Then, since $(v + (u + b)^\perp) - (v - (u + b)^\perp) = 2(u + b)^\perp$, we find similarly that at least one of the factors is greater than or equal to $|u + b| > \frac12(|v| + |u + b|)$. All this together implies that $|N_b(k)| \ge \frac{\Lambda}{2}(|v| + |u + b|)$, which proves part (a).

(b) By hypothesis $\varepsilon < \Lambda/6 < |v|$. Let $k \in T_0$. Then, by (2),
\[ |N_0(k)| \le \varepsilon(2|v| + \varepsilon) < 3\varepsilon|v| < \frac{\Lambda}{2}|v|. \tag{3} \]
Thus we have either $|u + v^\perp| < \Lambda$ or $|u - v^\perp| < \Lambda$ (otherwise part (a) with $b = 0$ would contradict (3)). Suppose that $|u + v^\perp| < \Lambda$. Then there is no $b \in \Gamma^\# \setminus \{0\}$ with $|b + u + v^\perp| < \Lambda$ and there is at most one $\tilde b \in \Gamma^\# \setminus \{0\}$ satisfying

September 14, J070-S0129055X10004107

2010 13:29 WSPC/S0129-055X

148-RMP

Asymptotics for Fermi Curves: Small Magnetic Potential

889

$|\tilde b + u - v^\perp| < \Lambda$. This inequality implies $\big|\,|u + \tilde b| - |v|\,\big| < \Lambda$. Furthermore, for this $\tilde b$,
\[ |\tilde b| = |2v^\perp - (u + v^\perp) + (\tilde b + u - v^\perp)| > 2|v| - 2\Lambda > |v|, \]
since $-2\Lambda > -|v|$. Now suppose that $|u - v^\perp| < \Lambda$. Then similarly we prove that $|\tilde b| > |v|$. Finally observe that, if $b \notin \{0, \tilde b\}$, then $|b + u + v^\perp| \ge \Lambda$ and $|b + u - v^\perp| \ge \Lambda$. Hence, by applying part (a) it follows that $|N_b(k)| \ge \frac{\Lambda}{2}(|v| + |u + b|)$. This proves part (b).

(c) As in the proof of part (b), if $k \in T_0 \cap T_d$ then, in addition to (3), we have $|N_d(k)| < \frac{\Lambda}{2}|v|$. Thus, applying part (b) we conclude that $d$ must be the exceptional $\tilde b$ of part (b). The statement of part (c) then follows from part (b). This completes the proof.

4. Main Results

The Riemann surfaces introduced in [1] can be decomposed into $X^{\rm com} \cup X^{\rm reg} \cup X^{\rm han}$, where $X^{\rm com}$ is a compact submanifold with smooth boundary and finite genus, $X^{\rm reg}$ is a finite union of open "regular pieces", and $X^{\rm han}$ is an infinite union of closed "handles". All these components satisfy a number of geometric/analytic hypotheses stated in [1] that specify the asymptotic holomorphic structure of the surface. Below we state two "asymptotic" theorems that essentially characterize the $X^{\rm reg}$ and $X^{\rm han}$ components of Fermi curves with small magnetic potential. Before we move to the theorems let us introduce some definitions. For any $\varphi \in L^2(\mathbb{R}^2/\Gamma)$ define $\hat\varphi : \Gamma^\# \to \mathbb{C}$ as
\[ \hat\varphi(b) := (\mathcal{F}\varphi)(b) := \frac{1}{|\Gamma|} \int_{\mathbb{R}^2/\Gamma} \varphi(x)\, e^{-ib\cdot x}\, dx, \]
where $|\Gamma| := \int_{\mathbb{R}^2/\Gamma} dx$. Then,
\[ \varphi(x) = (\mathcal{F}^{-1}\hat\varphi)(x) = \sum_{b \in \Gamma^\#} \hat\varphi(b)\, e^{ib\cdot x}, \qquad \|\varphi\|_{L^2(\mathbb{R}^2/\Gamma)} = |\Gamma|^{1/2}\, \|\hat\varphi\|_{l^2(\Gamma^\#)}. \]
Recall that $k = u + iv$ with $u, v \in \mathbb{R}^2$, let ρ be a positive constant, and set
\[ K_\rho := \{k \in \mathbb{C}^2 \mid |v| \le \rho\}. \]
Finally, consider the projection
\[ \mathrm{pr} : \mathbb{C}^2 \to \mathbb{C}, \qquad (k_1, k_2) \mapsto k_2, \]
and define
\[ q := (i\nabla \cdot A) + A^2 + V. \]


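It is easy to construct a holomorphic map $E : \hat F(A, V) \to F(A, V)$ [13]. The precise form of this map is irrelevant here. For our purposes it is enough to think of it simply as a "projection" (or "exponential map"). We are ready to state our results.

Clearly, the set $K_\rho$ is invariant under the action of $\Gamma^\#$ and $K_\rho/\Gamma^\#$ is compact. Hence, the image of $\hat F(A, V) \cap K_\rho$ under the holomorphic map $E$ is compact in $F(A, V)$. This image set will essentially play the role of $X^{\rm com}$ in the decomposition of $F(A, V)$. Our first theorem characterizes the regular piece $X^{\rm reg}$ of $F(A, V)$.

Theorem 1 (The Regular Piece). Let $0 < \varepsilon < \Lambda/6$ and suppose that $A_1$, $A_2$ and $V$ are functions in $L^2(\mathbb{R}^2/\Gamma)$ with $\|b^2 \hat q(b)\|_{l^1(\Gamma^\#)} < \infty$ and $\|(1 + b^2)\hat A(b)\|_{l^1(\Gamma^\# \setminus \{0\})} < 2\varepsilon/63$. Then there is a constant $\rho = \rho_{\Lambda,\varepsilon,q,A}$ such that, for $\nu \in \{1, 2\}$, the projection pr induces a biholomorphic map between
\[ \big(\hat F(A, V) \cap T_\nu(0)\big) \setminus \Big(K_\rho \cup \bigcup_{b \in \Gamma^\# \setminus \{0\}} T_b\Big) \]
and its image in $\mathbb{C}$. This image component contains
\[ \{z \in \mathbb{C} \mid 8|z| > \rho \text{ and } |z + (-1)^\nu \theta_\nu(b)| > \varepsilon \text{ for all } b \in \Gamma^\# \setminus \{0\}\} \]
and is contained in
\[ \Big\{z \in \mathbb{C} \;\Big|\; |z + (-1)^\nu \theta_\nu(b)| > \frac{\varepsilon}{2} - \frac{\varepsilon^2}{40\Lambda} \text{ for all } b \in \Gamma^\# \setminus \{0\}\Big\}, \]
where $\theta_\nu(b) = \frac12((-1)^\nu b_2 + i b_1)$. Furthermore,
\[ \mathrm{pr}^{-1} : \mathrm{Image}(\mathrm{pr}) \to T_\nu(0), \qquad y \mapsto \big(-\beta_2^{(1,0)} - i(-1)^\nu y - r(y),\; y\big), \]
where $\beta_2^{(1,0)}$ is a constant given by (24) that depends only on ρ and $A$,
\[ |\beta_2^{(1,0)}| < \frac{\varepsilon^2}{100\Lambda} \quad\text{and}\quad |r(y)| \le \frac{\varepsilon^3}{50\Lambda^2} + \frac{C}{\rho}, \]
where $C = C_{\Lambda,\varepsilon,q,A}$ is a constant.

Now observe that, since $T_b + c = T_{b+c}$ for all $b, c \in \Gamma^\#$, the complement of $E(\hat F(A, V) \cap K_\rho)$ in $F(A, V)$ is the disjoint union of
\[ E\Big(\big(\hat F(A, V) \cap T_0\big) \setminus \Big(K_\rho \cup \bigcup_{b \in \Gamma^\#,\; b_2 \neq 0} T_b\Big)\Big) \quad\text{and}\quad \bigcup_{b \in \Gamma^\#,\; b_2 \neq 0} E\big(\hat F(A, V) \cap T_0 \cap T_b\big). \]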


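Basically, the first of the two sets will be the regular piece of $F(A, V)$, while the second set will be the handles. The map Φ parametrizing the regular part will be the composition of the map $E$ with the inverse of the map discussed in the above theorem. The detailed information about the handles $X^{\rm han}$ in $F(A, V)$ comes from our second main theorem.

Theorem 2 (The Handles). Let $0 < \varepsilon < \Lambda/6$ and suppose that $A_1$, $A_2$ and $V$ are functions in $L^2(\mathbb{R}^2/\Gamma)$ with $\|b^2 \hat q(b)\|_{l^1(\Gamma^\#)} < \infty$ and $\|(1 + b^2)\hat A(b)\|_{l^1(\Gamma^\# \setminus \{0\})} < 2\varepsilon/63$. Then, for every sufficiently large constant ρ and for every $d \in \Gamma^\# \setminus \{0\}$ with $2|d| > \rho$, there are maps
\[ \phi_{d,1} : \Big\{(z_1, z_2) \in \mathbb{C}^2 \;\Big|\; |z_1| \le \frac{\varepsilon}{2} \text{ and } |z_2| \le \frac{\varepsilon}{2}\Big\} \to T_1(0) \cap T_2(d), \]
\[ \phi_{d,2} : \Big\{(z_1, z_2) \in \mathbb{C}^2 \;\Big|\; |z_1| \le \frac{\varepsilon}{2} \text{ and } |z_2| \le \frac{\varepsilon}{2}\Big\} \to T_1(-d) \cap T_2(0), \]
and a complex number $t_d$ with
\[ |t_d| \le \frac{C}{|d|^4} \]
such that:

(i) For $\nu \in \{1, 2\}$ the domain of the map $\phi_{d,\nu}$ is biholomorphic to its image, and the image contains
\[ \Big\{k \in \mathbb{C}^2 \;\Big|\; |k_1 + i(-1)^\nu k_2| \le \frac{\varepsilon}{8} \text{ and } |k_1 + (-1)^{\nu+1} d_1 - i(-1)^\nu (k_2 + (-1)^{\nu+1} d_2)| \le \frac{\varepsilon}{8}\Big\}. \]
Furthermore,
\[ D\hat\phi_{d,\nu} = \frac12 \begin{pmatrix} 1 & 1 \\ -i(-1)^\nu & i(-1)^\nu \end{pmatrix} \Big(I + O\Big(\frac{1}{|d|^2}\Big)\Big) \]
and
\[ \phi_{d,\nu}(0) = \big(i\theta_\nu(d),\, (-1)^{\nu+1}\theta_\nu(d)\big) + O\Big(\frac{\varepsilon}{900}\Big) + O\Big(\frac{1}{\rho}\Big). \]

(ii)
\[ \phi_{d,1}^{-1}\big(T_1(0) \cap T_2(d) \cap \hat F(A, V)\big) = \Big\{(z_1, z_2) \in \mathbb{C}^2 \;\Big|\; z_1 z_2 = t_d,\ |z_1| \le \frac{\varepsilon}{2} \text{ and } |z_2| \le \frac{\varepsilon}{2}\Big\}, \]
\[ \phi_{d,2}^{-1}\big(T_1(-d) \cap T_2(0) \cap \hat F(A, V)\big) = \Big\{(z_1, z_2) \in \mathbb{C}^2 \;\Big|\; z_1 z_2 = t_d,\ |z_1| \le \frac{\varepsilon}{2} \text{ and } |z_2| \le \frac{\varepsilon}{2}\Big\}. \]

(iii) $\phi_{d,1}(z_1, z_2) = \phi_{d,2}(z_2, z_1) - d$.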


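These are the main results in this paper. In the next section, we outline the strategy for proving them. The proofs are presented in the subsequent sections, divided into many steps.

5. Strategy Outline

Below we briefly describe the general strategy of analysis used to prove our results. We first introduce some notation and definitions. Observe that
\[ H_k(A, V)\varphi = \big((i\nabla + A - k)^2 + V\big)\varphi = \big((i\nabla - k)^2 + 2A \cdot (i\nabla - k) + (i\nabla \cdot A) + A^2 + V\big)\varphi, \]
and write
\[ H_k(A, V) = \Delta_k + h(k, A) + q(A, V) \]
with
\[ \Delta_k := (i\nabla - k)^2, \qquad h(k, A) := 2A \cdot (i\nabla - k) \qquad\text{and}\qquad q(A, V) := (i\nabla \cdot A) + A^2 + V. \]
For each finite subset $G$ of $\Gamma^\#$ set
\[ G' := \Gamma^\# \setminus G \quad\text{and}\quad \mathbb{C}^2_G := \mathbb{C}^2 \setminus \bigcup_{b \in G'} N_b, \]
\[ L^2_G := \mathrm{span}\{e^{ib\cdot x} \mid b \in G\} \quad\text{and}\quad L^2_{G'} := \mathrm{span}\{e^{ib\cdot x} \mid b \in G'\}. \]
To simplify the notation write $L^2$ in place of $L^2(\mathbb{R}^2/\Gamma)$. Let $I$ be the identity operator on $L^2$, and let $\pi_G$ and $\pi_{G'}$ be the orthogonal projections from $L^2$ onto $L^2_G$ and $L^2_{G'}$, respectively. Then,
\[ L^2 = L^2_G \oplus L^2_{G'} \quad\text{and}\quad I = \pi_G + \pi_{G'}. \]
For $k \in \mathbb{C}^2_G$ define the partial inverse $(\Delta_k)^{-1}_G$ on $L^2$ as
\[ (\Delta_k)^{-1}_G := \pi_G + \Delta_k^{-1}\pi_{G'}. \]
Its matrix elements are
\[ \big((\Delta_k)^{-1}_G\big)_{b,c} := \Big\langle \frac{e^{ib\cdot x}}{|\Gamma|^{1/2}},\, (\Delta_k)^{-1}_G \frac{e^{ic\cdot x}}{|\Gamma|^{1/2}} \Big\rangle_{L^2} = \begin{cases} \delta_{b,c} & \text{if } c \in G, \\[4pt] \dfrac{\delta_{b,c}}{N_c(k)} & \text{if } c \in G', \end{cases} \]
where $b, c \in \Gamma^\#$.

Here is the main idea. By definition, a point $k$ is in $\hat F(A, V)$ if $H_k(A, V)$ has a nontrivial kernel in $L^2$. Hence, to study the part of the curve in the intersection of $\bigcap_{d' \in G} T_{d'}$ with $\mathbb{C}^2 \setminus \bigcup_{b \in G'} T_b$ for some finite subset $G$ of $\Gamma^\#$, it is natural to look for a nontrivial solution of
\[ (\Delta_k + h + q)(\psi_G + \psi_{G'}) = 0, \]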


where $\psi_G \in L^2_G$ and $\psi_{G'} \in L^2_{G'}$. Equivalently, if we make the following (invertible) change of variables in $L^2$,
\[ (\psi_G + \psi_{G'}) = (\Delta_k)^{-1}_G (\varphi_G + \varphi_{G'}), \]
where $\varphi_G \in L^2_G$ and $\varphi_{G'} \in L^2_{G'}$, we may consider the equation
\[ (\Delta_k + h + q)\varphi_G + \big(I + (h + q)\Delta_k^{-1}\big)\varphi_{G'} = 0. \tag{4} \]
The projections of this equation onto $L^2_{G'}$ and $L^2_G$ are, respectively,
\[ \pi_{G'}(h + q)\varphi_G + \pi_{G'}\big(I + (h + q)\Delta_k^{-1}\big)\varphi_{G'} = 0, \tag{5} \]
\[ \pi_G(\Delta_k + h + q)\varphi_G + \pi_G(h + q)\Delta_k^{-1}\varphi_{G'} = 0. \tag{6} \]
Now define $R_{G'G'}$ on $L^2$ as
\[ R_{G'G'} := \pi_{G'}\big(I + (h + q)\Delta_k^{-1}\big)\pi_{G'}. \]
Observe that $R_{G'G'}$ is the zero operator on $L^2_G$. Then, if $R_{G'G'}$ has a bounded inverse on $L^2_{G'}$, Eq. (5) is equivalent to
\[ \varphi_{G'} = -R_{G'G'}^{-1}\pi_{G'}(h + q)\varphi_G. \]
Substituting this into (6) yields
\[ \pi_G\big(\Delta_k + h + q - (h + q)\Delta_k^{-1}R_{G'G'}^{-1}\pi_{G'}(h + q)\big)\varphi_G = 0. \]
This equation has a nontrivial solution if and only if the (finite) $|G| \times |G|$ determinant
\[ \det\big[\pi_G\big(\Delta_k + h + q - (h + q)\Delta_k^{-1}R_{G'G'}^{-1}\pi_{G'}(h + q)\big)\pi_G\big] = 0 \]
or, equivalently, expressing all operators as matrices in the basis $\{|\Gamma|^{-1/2}e^{ib\cdot x} \mid b \in \Gamma^\#\}$,
\[ \det\Big[N_{d'}(k)\delta_{d',d''} + w_{d',d''} - \sum_{b,c \in G'} \frac{w_{d',b}}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\, w_{c,d''}\Big]_{d',d'' \in G} = 0, \tag{7} \]
where
\[ w_{b,c} := h_{b,c} + \hat q(b - c) = -2(c + k)\cdot\hat A(b - c) + \hat q(b - c). \]
Therefore, if $R_{G'G'}$ has a bounded inverse on $L^2_{G'}$ (which is in fact the case under suitable conditions), in the region under consideration we can study the Fermi curve in detail using the (local) defining Eq. (7).
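The reduction from the full equation to the finite determinant (7) is, in finite dimensions, the familiar Schur-complement elimination of a block. The toy sketch below illustrates only that mechanism: a $3 \times 3$ matrix plays the role of the operator, one index plays the role of $G$ and two indices the role of $G'$. All names and numerical values are illustrative assumptions and are unrelated to the actual operators of the paper.

```python
def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def reduced(a, B, C, D):
    # a - B D^{-1} C: the 1x1 analogue of the reduced determinant in (7)
    Di = inv2(D)
    return a - sum(B[i] * Di[i][j] * C[j] for i in range(2) for j in range(2))

def full(a, B, C, D):
    # Full block matrix [[a, B], [C, D]]
    return [[a, B[0], B[1]],
            [C[0], D[0][0], D[0][1]],
            [C[1], D[1][0], D[1][1]]]

B = [0.3 + 0.1j, -0.2j]                 # couples the "G" index to the "G'" indices
C = [0.5, 0.1 - 0.4j]                   # couples the "G'" indices back to "G"
D = [[2.0 + 0.5j, 0.3], [-0.1j, 1.5]]   # invertible block on "G'"

a_generic = 0.7 - 0.2j
# Shift a so that the reduced quantity vanishes exactly
a_critical = a_generic - reduced(a_generic, B, C, D)
```

For any `a`, `det3(full(a, B, C, D))` equals `det2(D) * reduced(a, B, C, D)`, so with the invertible block `D` the full matrix is singular exactly when the reduced quantity vanishes, as happens for `a_critical`.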


6. Invertibility of $R_{G'G'}$

The following notation will be used whenever we consider vector-valued quantities. Let $X$ be a Banach space and let $A, B \in X^2$, where $A = (A_1, A_2)$ and $B = (B_1, B_2)$. Then,
\[ \|A\|_X := \big(\|A_1\|_X^2 + \|A_2\|_X^2\big)^{1/2} \quad\text{and}\quad A \cdot B := A_1 B_1 + A_2 B_2. \]
Furthermore, we will denote by $\|\cdot\|$ the operator norm on $L^2(\mathbb{R}^2/\Gamma)$. In general, for any $B, C \subset \Gamma^\#$ ($C$ such that $\Delta_k^{-1}\pi_C$ exists) define the operator $R_{BC}$ as
\[ R_{BC} := \pi_B\big(I + (h + q)\Delta_k^{-1}\big)\pi_C = \pi_B\pi_C + \pi_B\, q\, \Delta_k^{-1}\pi_C + \pi_B(2A \cdot i\nabla)\Delta_k^{-1}\pi_C - \pi_B(2k \cdot A)\Delta_k^{-1}\pi_C. \tag{8} \]
Its matrix elements are
\[ (R_{BC})_{b,c} = \delta_{b,c} + \frac{\hat q(b - c)}{N_c(k)} - \frac{2c \cdot \hat A(b - c)}{N_c(k)} - \frac{2k \cdot \hat A(b - c)}{N_c(k)}, \tag{9} \]
where $b \in B$ and $c \in C$. We first estimate the norm of the last three terms on the right-hand side of (8). We begin with the following proposition.

Proposition 4. Let $k \in \mathbb{C}^2$ and let $B, C \subset \Gamma^\#$ with $C \subset \{b \in \Gamma^\# \mid N_b(k) \neq 0\}$. Then,
\[ \|\pi_B\, q\, \Delta_k^{-1}\pi_C\| \le \|\hat q\|_{l^1} \sup_{c \in C} \frac{1}{|N_c(k)|}, \]
\[ \|\pi_B(A \cdot i\nabla)\Delta_k^{-1}\pi_C\| \le \|\hat A\|_{l^1} \sup_{c \in C} \frac{|c|}{|N_c(k)|}, \]
\[ \|\pi_B(k \cdot A)\Delta_k^{-1}\pi_C\| \le \|\hat A\|_{l^1}\, |k| \sup_{c \in C} \frac{1}{|N_c(k)|}. \]

To prove this proposition we apply the following well-known inequality (see [13]).

Proposition 5. Consider a linear operator $T : L^2_C \to L^2_B$ with matrix elements $T_{b,c}$. Then,
\[ \|T\| \le \max\Big\{\sup_{c \in C} \sum_{b \in B} |T_{b,c}|,\ \sup_{b \in B} \sum_{c \in C} |T_{b,c}|\Big\}. \]

Proof of Proposition 4. We only prove the first inequality. The proof of the other ones is similar. Write $T := \pi_B\, q\, \Delta_k^{-1}\pi_C$. Then, in view of (8) and (9),
\[ \sup_{c \in C} \sum_{b \in B} |T_{b,c}| \le \sup_{c \in C} \sum_{b \in B} \frac{|\hat q(b - c)|}{|N_c(k)|} \le \|\hat q\|_{l^1} \sup_{c \in C} \frac{1}{|N_c(k)|}, \]
\[ \sup_{b \in B} \sum_{c \in C} |T_{b,c}| \le \sup_{b \in B} \sum_{c \in C} \frac{|\hat q(b - c)|}{|N_c(k)|} \le \|\hat q\|_{l^1} \sup_{c \in C} \frac{1}{|N_c(k)|}. \]
By Proposition 5, these estimates yield the desired inequality.
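The inequality of Proposition 5 (a form of the Schur test) is easy to illustrate on a finite matrix. The sketch below, with hypothetical helper names and a randomly generated $4 \times 5$ complex matrix, compares the bound with the ratio $\|Tx\|_2/\|x\|_2$ over many random vectors; it is an illustration only, not part of the proof.

```python
import math
import random

def schur_bound(T):
    # max of the largest column sum and the largest row sum of |T_{b,c}|
    rows, cols = len(T), len(T[0])
    col = max(sum(abs(T[b][c]) for b in range(rows)) for c in range(cols))
    row = max(sum(abs(entry) for entry in r) for r in T)
    return max(col, row)

def apply(T, x):
    return [sum(T[b][c] * x[c] for c in range(len(x))) for b in range(len(T))]

def norm2(x):
    return math.sqrt(sum(abs(t) ** 2 for t in x))

rng = random.Random(1)      # fixed seed for reproducibility
T = [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(5)]
     for _ in range(4)]
bound = schur_bound(T)

worst_ratio = 0.0
for _ in range(2000):
    x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(5)]
    worst_ratio = max(worst_ratio, norm2(apply(T, x)) / norm2(x))
```

Every sampled ratio is a lower bound for the operator norm, so `worst_ratio` can never exceed `bound`.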


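The key estimate for the existence of $R_{G'G'}^{-1}$ is given below.

Proposition 6 (Estimate of $\|R_{SS} - \pi_S\|$). Let $k \in \mathbb{C}^2$ with $|u| \le 2|v|$ and $|v| > 2\Lambda$. Suppose that $S \subset \{b \in \Gamma^\# \mid |N_b(k)| \ge \varepsilon|v|\}$. Then,
\[ \|R_{SS} - \pi_S\| \le \|\hat q\|_{l^1}\frac{1}{\varepsilon|v|} + \frac{14}{\varepsilon}\|\hat A\|_{l^1}. \tag{10} \]

If $A = 0$, the right-hand side of (10) can be made arbitrarily small for any $V$ by taking $|v|$ sufficiently large (recall that $q(0, V) = V$). If $A \neq 0$, however, we need to take $\|\hat A\|_{l^1}$ small to make that quantity less than 1. The term $\frac{14}{\varepsilon}\|\hat A\|_{l^1}$ in (10) comes from the estimate we have for $\|\pi_{G'}\, h\, \Delta_k^{-1}\pi_{G'}\|$.

Proof of Proposition 6. By hypothesis, for all $b \in S$,
\[ \frac{1}{|N_b(k)|} \le \frac{1}{\varepsilon|v|}. \tag{11} \]
We now show that, for all $b \in S$,
\[ \frac{|b|}{|N_b(k)|} \le \frac{4}{\varepsilon}. \tag{12} \]
First suppose that $|b| \le 4|v|$. Then,
\[ \frac{|b|}{|N_b(k)|} \le \frac{4|v|}{\varepsilon|v|} = \frac{4}{\varepsilon}. \]
Now suppose that $|b| \ge 4|v|$. Again, by hypothesis we have $|u| \le 2|v|$ and $|v| > 2\Lambda > \varepsilon$. Hence,
\[ |v \pm (u + b)^\perp| \ge |b| - |u| - |v| \ge |b| - 3|v| \ge |b| - \frac34|b| = \frac{|b|}{4}. \]
Consequently,
\[ \frac{|b|}{|N_b(k)|} = \frac{|b|}{|v + (u + b)^\perp|\,|v - (u + b)^\perp|} \le |b|\,\frac{4}{|b|}\,\frac{4}{|b|} = \frac{16}{|b|} \le \frac{4}{|v|} \le \frac{4}{\varepsilon}. \]
This proves (12). The expression for $R_{SS} - \pi_S$ is given by (8). Observe that $|k| \le |u| + |v| \le 3|v|$. Then, applying Proposition 4 and using (11) and (12) we obtain
\[ \|R_{SS} - \pi_S\| \le \big(6|v|\,\|\hat A\|_{l^1} + \|\hat q\|_{l^1}\big)\sup_{c \in S}\frac{1}{|N_c(k)|} + 2\|\hat A\|_{l^1}\sup_{c \in S}\frac{|c|}{|N_c(k)|} \]
\[ \le \big(6|v|\,\|\hat A\|_{l^1} + \|\hat q\|_{l^1}\big)\frac{1}{\varepsilon|v|} + \frac{8}{\varepsilon}\|\hat A\|_{l^1} = \|\hat q\|_{l^1}\frac{1}{\varepsilon|v|} + \frac{14}{\varepsilon}\|\hat A\|_{l^1}. \]
This is the desired inequality.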
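The estimate (12) can be probed numerically. The sketch below assumes, for illustration only, that $\Gamma^\# = \mathbb{Z}^2$ (so $\Lambda = 1/2$) and $\varepsilon = 0.05$; it samples points $k = u + iv$ with $|u| \le 2|v|$ and $|v| > 2\Lambda$, scans a finite window of lattice points $b$, and records the largest value of $|b|/|N_b(k)|$ over those $b$ with $|N_b(k)| \ge \varepsilon|v|$.

```python
import math
import random

LAMBDA = 0.5    # assumed lattice Z^2: the shortest nonzero vector has length 2*LAMBDA = 1
EPS = 0.05      # illustrative value with 0 < EPS < LAMBDA / 6

def n_abs(u, v, b):
    # |N_b(k)| = |v + (u+b)^perp| * |v - (u+b)^perp| for k = u + i v
    px, py = u[1] + b[1], -(u[0] + b[0])
    return math.hypot(v[0] + px, v[1] + py) * math.hypot(v[0] - px, v[1] - py)

rng = random.Random(2)  # fixed seed for reproducibility
worst = 0.0
for _ in range(100):
    v_len = rng.uniform(2 * LAMBDA + 0.01, 6.0)      # |v| > 2*LAMBDA
    phi = rng.uniform(0, 2 * math.pi)
    v = (v_len * math.cos(phi), v_len * math.sin(phi))
    u_len = rng.uniform(0, 2 * v_len)                # |u| <= 2|v|
    psi = rng.uniform(0, 2 * math.pi)
    u = (u_len * math.cos(psi), u_len * math.sin(psi))
    for b1 in range(-30, 31):
        for b2 in range(-30, 31):
            n = n_abs(u, v, (b1, b2))
            if n >= EPS * v_len:                     # b belongs to the set S
                worst = max(worst, math.hypot(b1, b2) / n)
```

Consistently with (12), `worst` should not exceed `4 / EPS`.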


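From the last proposition it follows easily that $R_{SS}$ has a bounded inverse for large $|v|$ and weak magnetic potential.

Lemma 1 (Invertibility of $R_{SS}$). Let $k \in \mathbb{C}^2$,
\[ |u| \le 2|v|, \qquad |v| > \max\Big\{2\Lambda,\ \frac{2}{\varepsilon}\|\hat q\|_{l^1}\Big\}, \qquad \|\hat q\|_{l^1} < \infty \qquad\text{and}\qquad \|\hat A\|_{l^1} < \frac{2}{63}\varepsilon. \]
Suppose that $S \subset \{b \in \Gamma^\# \mid |N_b(k)| \ge \varepsilon|v|\}$. Then the operator $R_{SS}$ has a bounded inverse with
\[ \|R_{SS} - \pi_S\| < \|\hat q\|_{l^1}\frac{1}{\varepsilon|v|} + \|\hat A\|_{l^1}\frac{14}{\varepsilon} < \frac{17}{18}, \]
\[ \|R_{SS}^{-1} - \pi_S\| < 18\,\|R_{SS} - \pi_S\|. \]

Proof. Write $R_{SS} = \pi_S + T$ with $T = R_{SS} - \pi_S$. Then, by Proposition 6,
\[ \|T\| = \|R_{SS} - \pi_S\| \le \|\hat q\|_{l^1}\frac{1}{\varepsilon|v|} + \|\hat A\|_{l^1}\frac{14}{\varepsilon} < \frac12 + \frac49 = \frac{17}{18} < 1. \]
Hence, the Neumann series for $R_{SS}^{-1} = (\pi_S + T)^{-1}$ converges (and is a bounded operator). Furthermore,
\[ \|R_{SS}^{-1} - \pi_S\| = \|(\pi_S + T)^{-1} - \pi_S\| = \|(\pi_S + T)^{-1} - (\pi_S + T)^{-1}(\pi_S + T)\| = \|(\pi_S + T)^{-1}T\| \le (1 - \|T\|)^{-1}\|T\| < 18\,\|R_{SS} - \pi_S\|, \]
as was to be shown.

Lemma 1 says that if $G$ is such that $G' \subset \{b \in \Gamma^\# \mid |N_b(k)| \ge \varepsilon|v|\}$, the operator $R_{G'G'}$ has a bounded inverse on $L^2_{G'}$ for $|u| \le 2|v|$, large $|v|$, and weak magnetic potential. We are now able to write local defining equations for $\hat F(A, V)$ under such conditions.

7. Local Defining Equations

In this section we derive local defining equations for the Fermi curve. We begin with a simple proposition.

Proposition 7. Suppose either (i) or (ii) or (iii) where:
(i) $G = \{0\}$ and $k \in T_0 \setminus \bigcup_{b \in \Gamma^\# \setminus \{0\}} T_b$;
(ii) $G = \{0, d\}$ and $k \in T_0 \cap T_d$;
(iii) $G = \emptyset$ and $k \in \mathbb{C}^2 \setminus \bigcup_{b \in \Gamma^\#} T_b$.
Then $G' = \Gamma^\# \setminus G = \{b \in \Gamma^\# \mid |N_b(k)| \ge \varepsilon|v|\}$.

Proof. The proposition follows easily if we observe that $G' = \Gamma^\# \setminus G$ and recall from (1) that $k \notin T_b \Rightarrow |N_b(k)| \ge \varepsilon|v|$.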
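The numerical constants in Lemma 1 (namely $\frac12 + \frac49 = \frac{17}{18}$ and the factor 18) come from simple arithmetic with the hypotheses $|v| > \frac{2}{\varepsilon}\|\hat q\|_{l^1}$ and $\|\hat A\|_{l^1} < \frac{2}{63}\varepsilon$. The following exact-arithmetic sketch retraces that computation at the limiting values; the variable names are illustrative, and the specific value chosen for ε is immaterial because it cancels.

```python
from fractions import Fraction

eps = Fraction(1, 100)                       # any positive value works; it cancels below
A_norm_limit = Fraction(2, 63) * eps         # hypothesis: ||A^||_{l1} < (2/63) eps
magnetic_part = Fraction(14) / eps * A_norm_limit   # (14/eps) ||A^||_{l1} at the limit
electric_part = Fraction(1, 2)               # ||q^||_{l1}/(eps |v|) < 1/2 when |v| > (2/eps) ||q^||_{l1}
t_bound = electric_part + magnetic_part      # strict upper bound for ||T||
neumann_factor = 1 / (1 - t_bound)           # value of (1 - ||T||)^{-1} at the limiting bound
```

Here `magnetic_part` is 4/9, `t_bound` is 17/18 and `neumann_factor` is 18, matching the constants in the statement of Lemma 1.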


We now introduce some notation. Let $B$ be a fundamental cell for $\Gamma^\# \subset \mathbb{R}^2$ (see [9, p. 310]). Then any vector $u \in \mathbb{R}^2$ can be written as $u = \xi + \tilde u$ for some $\xi \in \Gamma^\#$ and $\tilde u \in B$. Define
\[ \alpha := \sup\{|u| \mid u \in B\}, \qquad R := \max\Big\{\alpha,\ 2\Lambda,\ \frac{2}{\varepsilon}\|\hat q\|_{l^1}\Big\}, \qquad K_R := \{k \in \mathbb{C}^2 \mid |v| \le R\}. \]
We first show that in $\mathbb{C}^2 \setminus K_R$ the Fermi curve is contained in the union of ε-tubes about the free Fermi curve.

Proposition 8 ($\hat F(A, V) \setminus K_R$ is Contained in the Union of ε-Tubes).
\[ \hat F(A, V) \setminus K_R \subset \bigcup_{b \in \Gamma^\#} T_b. \]

Proof. Without loss of generality, we may consider $k \in \mathbb{C}^2$ with real part in $B$. We now prove that any point outside the region $K_R$ and outside the union of ε-tubes does not belong to $\hat F(A, V)$. Suppose that $k \in \mathbb{C}^2 \setminus (K_R \cup \bigcup_{b \in \Gamma^\#} T_b)$ and recall that $k$ is in $\hat F(A, V)$ if and only if (4) has a nontrivial solution. If we choose $G = \emptyset$ then $G' = \Gamma^\#$ and this equation reads $R_{G'G'}\varphi_{G'} = 0$. By Proposition 7(iii), we have $G' = \Gamma^\# = \{b \in \Gamma^\# \mid |N_b(k)| \ge \varepsilon|v|\}$. Furthermore, since $u \in B$ and $|v| > R \ge \alpha$, it follows that $|u| \le \alpha < |v| < 2|v|$. Consequently, the operator $R_{G'G'}$ has a bounded inverse by Lemma 1. Thus, the only solution of the above equation is $\varphi_{G'} = 0$. That is, there is no nontrivial solution of this equation and therefore $k \notin \hat F(A, V)$.

We are left to study the Fermi curve inside the ε-tubes. There are two types of regions to consider: intersections and non-intersections of tubes. To study non-intersections we choose $G = \{0\}$ and consider the region $(T_0 \setminus \bigcup_{b \in \Gamma^\# \setminus \{0\}} T_b) \setminus K_R$. For intersections we take $G = \{0, d\}$ for some $d \in \Gamma^\# \setminus \{0\}$ and consider $(T_0 \cap T_d) \setminus K_R$. Observe that, since the tubes $T_b$ have the translational property $T_b + c = T_{b+c}$ for all $b, c \in \Gamma^\#$, and the curve $\hat F(A, V)$ is invariant under the action of $\Gamma^\#$, there is no loss of generality in considering only the two regions above. Any other part of the curve can be reached by translation.

Recall that $G' = \Gamma^\# \setminus G$ and for $d', d'' \in G$ and $i, j \in \{1, 2\}$ set

\[ B_{ij}^{d'd''}(k; G) := -4 \sum_{b,c \in G'} \frac{\hat A_i(d' - b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,\hat A_j(c - d''), \]
\[ C_i^{d'd''}(k; G) := -2\hat A_i(d' - d'') + 2\sum_{b,c \in G'} \frac{\hat q(d' - b) - 2b \cdot \hat A(d' - b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,\hat A_i(c - d'') \]
\[ \qquad\qquad + 2\sum_{b,c \in G'} \frac{\hat A_i(d' - b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,\big(\hat q(c - d'') - 2d'' \cdot \hat A(c - d'')\big), \]
\[ C_0^{d'd''}(k; G) := \hat q(d' - d'') - 2d'' \cdot \hat A(d' - d'') - \sum_{b,c \in G'} \frac{\hat q(d' - b) - 2b \cdot \hat A(d' - b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,\big(\hat q(c - d'') - 2d'' \cdot \hat A(c - d'')\big). \tag{13} \]
Then,
\[ D_{d',d''}(k; G) := w_{d',d''} - \sum_{b,c \in G'} \frac{w_{d',b}}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,w_{c,d''} \]
\[ \qquad = B_{11}^{d'd''}k_1^2 + B_{22}^{d'd''}k_2^2 + (B_{12}^{d'd''} + B_{21}^{d'd''})k_1 k_2 + C_1^{d'd''}k_1 + C_2^{d'd''}k_2 + C_0^{d'd''}. \]
These functions have the following property.

Proposition 9. For $d', d'' \in G$ and $i, j \in \{1, 2\}$, the functions $B_{ij}^{d'd''}$, $C_i^{d'd''}$, $C_0^{d'd''}$ (and consequently $D_{d',d''}$) are analytic on $(T_0 \setminus \bigcup_{b \in \Gamma^\# \setminus \{0\}} T_b) \setminus K_R$ and $(T_0 \cap T_d) \setminus K_R$ for $G = \{0\}$ and $G = \{0, d\}$, respectively.

Sketch of the proof. It suffices to show that $B_{ij}^{d'd''}$, $C_i^{d'd''}$ and $C_0^{d'd''}$ are analytic functions. This property follows from the fact that all the series involved in the definition of these functions are uniformly convergent sums of analytic functions. The argument is similar for all cases. See [13] for details.

Using the above functions we can write (local) defining equations for the Fermi curve.

Lemma 2 (Local Defining Equations for $\hat F(A, V)$).
(i) Let $G = \{0\}$ and $k \in (T_0 \setminus \bigcup_{b \in \Gamma^\# \setminus \{0\}} T_b) \setminus K_R$. Then $k \in \hat F(A, V)$ if and only if
\[ N_0(k) + D_{0,0}(k) = 0. \]
(ii) Let $G = \{0, d\}$ and $k \in (T_0 \cap T_d) \setminus K_R$. Then $k \in \hat F(A, V)$ if and only if
\[ (N_0(k) + D_{0,0}(k))(N_d(k) + D_{d,d}(k)) - D_{0,d}(k)D_{d,0}(k) = 0. \]

Proof. We only prove part (i). The proof of part (ii) is similar. First, by Proposition 7(i) we have $G' = \Gamma^\# \setminus \{0\} = \{b \in \Gamma^\# \mid |N_b(k)| \ge \varepsilon|v|\}$. Furthermore, since $k \in T_0$, we have either $|v - u^\perp| < \varepsilon$ or $|v + u^\perp| < \varepsilon$. In either case this implies $|u| < \varepsilon + |v| < 2\Lambda + |v| < 2|v|$. Hence, the operator $R_{G'G'}$ has a bounded inverse by Lemma 1. Thus, in the region under consideration $\hat F(A, V)$ is given by (7):
\[ 0 = N_0(k) + w_{0,0} - \sum_{b,c \in G'} \frac{w_{0,b}}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,w_{c,0} = N_0(k) + D_{0,0}(k). \]

This is the desired expression.

To study in detail the defining equations above we shall estimate the asymptotic behavior of the functions $B_{ij}^{d'd''}$, $C_i^{d'd''}$, $C_0^{d'd''}$ and $D_{d',d''}$ for large $|v|$. (We sometimes refer to these functions as coefficients.) Since all these functions have a similar form it is convenient to prove these estimates in a general setting and specialize them later. This is the content of Secs. 9 and 10. We next introduce a change of variables in $\mathbb{C}^2$ that will be useful for proving these bounds.

8. Change of Coordinates

Define the (complementary) index $\nu'$ as
\[ \nu' := \nu - (-1)^\nu. \]
Observe that $\nu' = 2$ if $\nu = 1$, $\nu' = 1$ if $\nu = 2$, and $(-1)^\nu = -(-1)^{\nu'}$. The following change of coordinates in $\mathbb{C}^2$ will be useful for our analysis. For $\nu \in \{1, 2\}$ and $d', d'' \in G$ define the functions $w_{\nu,d'}, z_{\nu,d'} : \mathbb{C}^2 \to \mathbb{C}$ as
\[ w_{\nu,d'}(k) := k_1 + d'_1 + i(-1)^\nu(k_2 + d'_2), \qquad z_{\nu,d'}(k) := k_1 + d'_1 - i(-1)^\nu(k_2 + d'_2). \tag{14} \]
Observe that the transformation $(k_1, k_2) \mapsto (w_{\nu,d'}, z_{\nu,d'})$ is just a translation composed with a rotation. Furthermore, if $k \in T_\nu(d') \setminus K_R$ then $|w_{\nu,d'}(k)|$ is "small" and $|z_{\nu,d'}(k)|$ is "large". Indeed,
\[ |w_{\nu,d'}(k)| = |N_{d',\nu}(k)| < \varepsilon \quad\text{and}\quad |z_{\nu,d'}(k)| = |N_{d',\nu'}(k)| \ge |v| > R. \]
Define also
\[ J_\nu^{d'd''} := \frac14\big(B_{11}^{d'd''} - B_{22}^{d'd''} + i(-1)^\nu(B_{12}^{d'd''} + B_{21}^{d'd''})\big), \]
\[ K^{d'd''} := \frac12\big(B_{11}^{d'd''} + B_{22}^{d'd''}\big), \]
\[ L_\nu^{d'd''} := -d'_1 B_{11}^{d'd''} - i(-1)^\nu d'_2 B_{22}^{d'd''} - \frac12(d'_2 + i(-1)^\nu d'_1)(B_{12}^{d'd''} + B_{21}^{d'd''}) + \frac12\big(C_1^{d'd''} + i(-1)^\nu C_2^{d'd''}\big), \]
\[ M^{d'd''} := d_1'^2 B_{11}^{d'd''} + d_2'^2 B_{22}^{d'd''} + d'_1 d'_2(B_{12}^{d'd''} + B_{21}^{d'd''}) - d'_1 C_1^{d'd''} - d'_2 C_2^{d'd''} + C_0^{d'd''}, \]
where $J_\nu^{d'd''}$, $K^{d'd''}$, $L_\nu^{d'd''}$ and $M^{d'd''}$ are functions of $k \in \mathbb{C}^2$ that also depend on the choice of $G \subset \Gamma^\#$. Using these functions we can express $N_{d'}(k) + D_{d',d'}(k)$ and $D_{d',d''}(k)$ as follows.


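Proposition 10. Let $\nu \in \{1, 2\}$ and let $d', d'' \in G$. Then,
\[ N_{d'} + D_{d',d'} = J_{\nu'}^{d'd'} w_{\nu,d'}^2 + J_\nu^{d'd'} z_{\nu,d'}^2 + (1 + K^{d'd'})\, w_{\nu,d'} z_{\nu,d'} + L_{\nu'}^{d'd'} w_{\nu,d'} + L_\nu^{d'd'} z_{\nu,d'} + M^{d'd'}, \]
\[ D_{d',d''} = J_{\nu'}^{d'd''} w_{\nu,d'}^2 + J_\nu^{d'd''} z_{\nu,d'}^2 + K^{d'd''}\, w_{\nu,d'} z_{\nu,d'} + L_{\nu'}^{d'd''} w_{\nu,d'} + L_\nu^{d'd''} z_{\nu,d'} + M^{d'd''}. \]
Furthermore,
\[ J_\nu^{d'd''}(k) = -\sum_{b,c \in G'} \frac{(1, -i(-1)^{\nu'}) \cdot \hat A(d' - b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,(1, -i(-1)^{\nu'}) \cdot \hat A(c - d''), \]
\[ K^{d'd''}(k) = -2\sum_{b,c \in G'} \frac{\hat A(d' - b) \cdot \hat A(c - d'')}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}, \]
\[ L_\nu^{d'd''}(k) = \sum_{b,c \in G'} \frac{\hat q(d' - b) + 2(d' - b) \cdot \hat A(d' - b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,(1, i(-1)^\nu) \cdot \hat A(c - d'') \]
\[ \qquad + \sum_{b,c \in G'} \frac{(1, i(-1)^\nu) \cdot \hat A(d' - b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,\big(\hat q(c - d'') + 2(d' - d'') \cdot \hat A(c - d'')\big) - (1, i(-1)^\nu) \cdot \hat A(d' - d''), \]
\[ M^{d'd''}(k) = -\sum_{b,c \in G'} \frac{\hat q(d' - b) + 2(d' - b) \cdot \hat A(d' - b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,\big(\hat q(c - d'') + 2(d' - d'') \cdot \hat A(c - d'')\big) + \hat q(d' - d'') + 2(d' - d'') \cdot \hat A(d' - d''). \]
Here $(1, -i(-1)^{\nu'}) = (1, i(-1)^\nu)$, and the term $2(d' - d'') \cdot \hat A(c - d'')$ in $M^{d'd''}$, which vanishes for $d' = d''$, is the one obtained by substituting (13) into the definition of $M^{d'd''}$.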

Proof. To simplify the notation write $w = w_{\nu,d'}$, $z = z_{\nu,d'}$, $B_{ij} = B_{ij}^{d'd''}$ and $C_i = C_i^{d'd''}$. First observe that, in view of (14),
\[ N_{d'} = \big(k_1 + d'_1 + i(-1)^\nu(k_2 + d'_2)\big)\big(k_1 + d'_1 - i(-1)^\nu(k_2 + d'_2)\big) = wz. \]
Furthermore,
\[ k_1 = \frac12(w + z) - d'_1, \]
\[ k_2 = \frac{(-1)^\nu}{2i}(w - z) - d'_2, \]
\[ k_1^2 = \frac14(w^2 + z^2) + \frac12 wz - d'_1(w + z) + d_1'^2, \]
\[ k_2^2 = -\frac14(w^2 + z^2) + \frac12 wz + i(-1)^\nu d'_2(w - z) + d_2'^2, \]
\[ k_1 k_2 = \frac{i(-1)^\nu}{4}(z^2 - w^2) - \frac12(d'_2 - i(-1)^\nu d'_1)w - \frac12(d'_2 + i(-1)^\nu d'_1)z + d'_1 d'_2. \]
Hence,
\[ D_{d',d''} = B_{11}k_1^2 + B_{22}k_2^2 + (B_{12} + B_{21})k_1 k_2 + C_1 k_1 + C_2 k_2 + C_0 \]
\[ = \frac14\big(B_{11} - B_{22} - i(-1)^\nu(B_{12} + B_{21})\big)w^2 + \frac14\big(B_{11} - B_{22} + i(-1)^\nu(B_{12} + B_{21})\big)z^2 \]
\[ \quad + \Big(-d'_1 B_{11} + i(-1)^\nu d'_2 B_{22} - \frac12(d'_2 - i(-1)^\nu d'_1)(B_{12} + B_{21}) + \frac12(C_1 - i(-1)^\nu C_2)\Big)w \]
\[ \quad + \Big(-d'_1 B_{11} - i(-1)^\nu d'_2 B_{22} - \frac12(d'_2 + i(-1)^\nu d'_1)(B_{12} + B_{21}) + \frac12(C_1 + i(-1)^\nu C_2)\Big)z \]
\[ \quad + d_1'^2 B_{11} + d_2'^2 B_{22} + d'_1 d'_2(B_{12} + B_{21}) - d'_1 C_1 - d'_2 C_2 + C_0 + \frac12(B_{11} + B_{22})wz \]
\[ = J_{\nu'}^{d'd''}w^2 + J_\nu^{d'd''}z^2 + K^{d'd''}wz + L_{\nu'}^{d'd''}w + L_\nu^{d'd''}z + M^{d'd''}. \]
This proves the expression for $D_{d',d''}$. Consequently, since $N_{d'} = wz$,
\[ N_{d'} + D_{d',d'} = J_{\nu'}^{d'd'}w^2 + J_\nu^{d'd'}z^2 + (1 + K^{d'd'})wz + L_{\nu'}^{d'd'}w + L_\nu^{d'd'}z + M^{d'd'}, \]
which proves the expression for $N_{d'} + D_{d',d'}$. Now, again to simplify the notation write
\[ fg = \sum_{b,c \in G'} \frac{\hat f(b, d')}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,\hat g(c, d''), \]
that is, to represent sums of this form suppress the summation and the other factors. Note that $fg \neq gf$ according to this notation. Then, substituting (13) into the definition of $J_\nu^{d'd''}$ we have
\[ J_\nu^{d'd''} = \frac14\big(B_{11} - B_{22} + i(-1)^\nu(B_{12} + B_{21})\big) = -A_1A_1 + A_2A_2 - i(-1)^\nu(A_1A_2 + A_2A_1) \]
\[ = (A_1 - i(-1)^{\nu'}A_2)(-A_1 + i(-1)^{\nu'}A_2) = -\big((1, -i(-1)^{\nu'}) \cdot A\big)\big((1, -i(-1)^{\nu'}) \cdot A\big) \]
\[ = -\sum_{b,c \in G'} \frac{(1, -i(-1)^{\nu'}) \cdot \hat A(d' - b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,(1, -i(-1)^{\nu'}) \cdot \hat A(c - d''). \]
Similarly, substituting (13) into the definitions of $K^{d'd''}$, $L_\nu^{d'd''}$ and $M^{d'd''}$ we derive the other expressions.
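The algebra behind the change of variables (14), in particular the inversion formulas for $k_1$, $k_2$, the expansion of $k_1 k_2$, and the identity $N_{d'} = wz$, can be verified numerically at an arbitrary point. The sketch below uses hypothetical sample values for $k$ and $d'$ and is meant only as a sanity check.

```python
def check(nu, k, d):
    # Returns the largest discrepancy among the identities following (14)
    s = (-1) ** nu
    w = k[0] + d[0] + 1j * s * (k[1] + d[1])
    z = k[0] + d[0] - 1j * s * (k[1] + d[1])
    k1 = (w + z) / 2 - d[0]                       # inversion formula for k1
    k2 = s * (w - z) / 2j - d[1]                  # inversion formula for k2
    k1k2 = (1j * s / 4 * (z * z - w * w)
            - (d[1] - 1j * s * d[0]) * w / 2
            - (d[1] + 1j * s * d[0]) * z / 2
            + d[0] * d[1])                        # expansion of k1 * k2
    n_d = (k[0] + d[0]) ** 2 + (k[1] + d[1]) ** 2  # N_d(k)
    return max(abs(k1 - k[0]), abs(k2 - k[1]),
               abs(k1k2 - k[0] * k[1]), abs(w * z - n_d))

# Illustrative sample point k and lattice vector d'
errors = [check(nu, (0.4 - 1.3j, 2.1 + 0.7j), (3.0, -2.0)) for nu in (1, 2)]
```

Both entries of `errors` should be zero up to rounding.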


9. Asymptotics for the Coefficients

Let $f$ and $g$ be functions on $\Gamma^\#$ and for $k \in \mathbb{C}^2$ and $d', d'' \in G$ set
\[ \Phi_{d',d''}(k; G) := \sum_{b,c \in G'} \frac{f(d' - b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,g(c - d''). \tag{15} \]
In this section we study the asymptotic behaviour of the function $\Phi_{d',d''}(k)$ for $k$ in the union of ε-tubes with large $|v|$. Here we only give the statements. See Appendix B for the proofs. Reset the constant $R$ as
\[ R := \max\Big\{1,\ \alpha,\ 2\Lambda,\ 140\,\|\hat A\|_{l^1},\ \frac{4}{\varepsilon}\|(1 + b^2)\hat q(b)\|_{l^1}\Big\}, \tag{16} \]
and make the following hypothesis.

Hypothesis 1.
\[ \|b^2\hat q(b)\|_{l^1} < \infty \quad\text{and}\quad \|(1 + b^2)\hat A(b)\|_{l^1} < \frac{2}{63}\varepsilon. \]

Our first lemma provides an expansion for $\Phi_{d',d''}(k)$ "in powers of $1/|z_{\nu,d'}(k)|$".

Lemma 3 (Asymptotics for $\Phi_{d',d''}(k)$). Under Hypothesis 1, let $\nu \in \{1, 2\}$ and let $f$ and $g$ be functions on $\Gamma^\#$ with $\|b^2 f(b)\|_{l^1} < \infty$ and $\|b^2 g(b)\|_{l^1} < \infty$. Suppose either (i) or (ii) where:
(i) $G = \{0\}$ and $k \in (T_\nu(0) \setminus \bigcup_{b \in G'} T_b) \setminus K_R$;
(ii) $G = \{0, d\}$ and $k \in (T_\nu(0) \cap T_{\nu'}(d)) \setminus K_R$.
Then, for $(\mu, d') = (\nu, 0)$ if (i) or $(\mu, d') \in \{(\nu, 0), (\nu', d)\}$ if (ii),
\[ \Phi_{d',d''}(k) = \alpha_{\mu,d'}^{(1)}(k) + \alpha_{\mu,d'}^{(2)}(k) + \alpha_{\mu,d'}^{(3)}(k), \]
where for $1 \le j \le 2$,
\[ |\alpha_{\mu,d'}^{(j)}(k)| \le \frac{C_j}{(2|z_{\mu,d'}(k)| - R)^j} \quad\text{and}\quad |\alpha_{\mu,d'}^{(3)}(k)| \le \frac{C_3}{|z_{\mu,d'}(k)|\,R^2}, \]
where $C_j = C_{j;\Lambda,A,q,f,g}$ and $C_3 = C_{3;\varepsilon,\Lambda,A,q,f,g}$ are constants. Furthermore, the functions $\alpha_{\mu,d'}^{(j)}(k)$ are given by (66) and (69) and are analytic in the region under consideration.

Below we have more information about the function $\alpha_{\mu,d'}^{(1)}(k)$.

Lemma 4 (Asymptotics for $\alpha_{\mu,d'}^{(1)}(k)$). Consider the same hypotheses as in Lemma 3. Then, for $(\mu, d') = (\nu, 0)$ if (i) or $(\mu, d') \in \{(\nu, 0), (\nu', d)\}$ if (ii),
\[ z_{\mu,d'}(k)\,\alpha_{\mu,d'}^{(1)}(k) = \alpha_{\mu,d'}^{(1,0)} + \alpha_{\mu,d'}^{(1,1)}(w(k)) + \alpha_{\mu,d'}^{(1,2)}(k) + \alpha_{\mu,d'}^{(1,3)}(k), \]
where $\alpha_{\mu,d'}^{(1,0)}$ is a constant given by (80), and the remaining functions $\alpha_{\mu,d'}^{(1,j)}$ are given by (79). Furthermore, for $0 \le j \le 2$,
\[ |\alpha_{\mu,d'}^{(1,j)}| \le C_j \quad\text{and}\quad |\alpha_{\mu,d'}^{(1,3)}| \le \frac{C_3}{2|z_{\mu,d'}(k)| - R}, \]
where $C_j = C_{j;\Lambda,A,f,g}$ and $C_3 = C_{3;\Lambda,A,f,g}$ are constants given by (81).


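The next lemma estimates the decay of $\Phi_{d',d''}(k)$ with respect to $z_{\nu',d}(k)$ for $d' \neq d''$.

Lemma 5 (Decay of $\Phi_{d',d''}(k)$ for $d' \neq d''$). Under Hypothesis 1, let $\nu \in \{1, 2\}$ and let $f$ and $g$ be functions on $\Gamma^\#$ with $\|b^2 f(b)\|_{l^1} < \infty$ and $\|b^2 g(b)\|_{l^1} < \infty$. Suppose further that $G = \{0, d\}$ and $k \in (T_\nu(0) \cap T_{\nu'}(d)) \setminus K_R$. Then, for $d', d'' \in G$ with $d' \neq d''$,
\[ |\Phi_{d',d''}(k)| \le \frac{C_{\Gamma^\#,\varepsilon,f,g}}{|z_{\nu',d}(k)|^{3 - 10^{-1}}}, \]
where $C_{\Gamma^\#,\varepsilon,f,g}$ is a constant.

The next proposition relates the quantities $|v|$, $|k_2|$, $|z_{\nu,d'}(k)|$ and $|d|$ for $k$ in the ε-tubes with large $|v|$.

Proposition 11. For $\nu \in \{1, 2\}$ we have:
(i) Let $k \in T_\nu(0) \setminus K_R$. Then,
\[ \frac{1}{|z_{\nu,0}(k)|} \le \frac{1}{|v|} \le \frac{3}{|z_{\nu,0}(k)|} \quad\text{and}\quad \frac{1}{4|v|} \le \frac{1}{|k_2|} \le \frac{8}{|v|}. \]
(ii) Let $k \in (T_\nu(0) \cap T_{\nu'}(d)) \setminus K_R$. Then,
\[ \frac{1}{|z_{\nu,0}(k)|} \le \frac{1}{|v|} \le \frac{3}{|z_{\nu,0}(k)|}, \qquad \frac{1}{|z_{\nu',d}(k)|} \le \frac{1}{|v|} \le \frac{3}{|z_{\nu',d}(k)|}, \qquad \frac{1}{2|z_{\nu',d}(k)|} \le \frac{1}{|d|} \le \frac{2}{|z_{\nu',d}(k)|}. \]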
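Proposition 11(i) says that inside $T_\nu(0)$ with large $|v|$ the quantities $|z_{\nu,0}(k)|$, $|v|$ and $|k_2|$ are comparable. The sketch below samples points of $T_\nu(0)$ by prescribing the small vector $\delta = v + (-1)^\nu u^\perp$ and then tests the inequalities. The value $\varepsilon = 0.05$ and the range $|v| \ge 1.5$ are illustrative assumptions standing in for the hypotheses $\varepsilon < \Lambda/6$ and $|v| > R \ge 1$.

```python
import math
import random

EPS = 0.05              # illustrative tube radius
rng = random.Random(3)  # fixed seed for reproducibility
ok = True
for _ in range(5000):
    nu = rng.choice((1, 2))
    s = (-1) ** nu
    v_len = rng.uniform(1.5, 50.0)
    phi = rng.uniform(0, 2 * math.pi)
    v = (v_len * math.cos(phi), v_len * math.sin(phi))
    r = rng.uniform(0, 0.999 * EPS)
    psi = rng.uniform(0, 2 * math.pi)
    delta = (r * math.cos(psi), r * math.sin(psi))        # delta = v + s * u^perp, |delta| < EPS
    up = (s * (delta[0] - v[0]), s * (delta[1] - v[1]))   # u^perp
    u = (-up[1], up[0])                                   # u = -(u^perp)^perp
    k1, k2 = complex(u[0], v[0]), complex(u[1], v[1])
    w = k1 + 1j * s * k2                                  # w_{nu,0}(k)
    z = k1 - 1j * s * k2                                  # z_{nu,0}(k)
    ok = ok and abs(w) < EPS                              # k lies in T_nu(0)
    ok = ok and v_len <= abs(z) <= 3 * v_len              # 1/|z| <= 1/|v| <= 3/|z|
    ok = ok and v_len / 8 <= abs(k2) <= 4 * v_len         # 1/(4|v|) <= 1/|k2| <= 8/|v|
```

The flag `ok` should remain true for all sampled points.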

10. Bounds on the Derivatives

In the last section, we expressed $\Phi_{d',d''}(k)$ as a sum of certain functions $\alpha_{\mu,d'}^{(j)}(k)$ for $k$ in the ε-tubes with large $|v|$. In this section we provide bounds for the derivatives of all these functions. Here we only give the statements. See Appendix C for the proofs. Our first lemma concerns the derivatives of $\Phi_{d',d''}(k)$.

Lemma 6 (Derivatives of $\Phi_{d',d''}(k)$). Under Hypothesis 1, let $f$ and $g$ be functions in $l^1(\Gamma^\#)$ and suppose either (i) or (ii) where:
(i) $G = \{0\}$ and $k \in (T_0 \setminus \bigcup_{b \in G'} T_b) \setminus K_R$;
(ii) $G = \{0, d\}$ and $k \in (T_0 \cap T_d) \setminus K_R$.
Then, for any integers $n$ and $m$ with $n + m \ge 1$ and for any $d', d'' \in G$,
\[ \Big|\frac{\partial^{n+m}}{\partial k_1^n\,\partial k_2^m}\Phi_{d',d''}(k)\Big| \le \frac{C}{|v|}, \]
where $C$ is a constant with $C = C_{\varepsilon,\Lambda,A,f,g,m,n}$ if (i) or $C = C_{\Lambda,A,f,g,m,n}$ if (ii).

We now improve the estimate of Lemma 6(ii) for $d' \neq d''$.

Lemma 7 (Derivatives of $\Phi_{d',d''}(k)$ for $d' \neq d''$). Consider a constant $\beta \ge 2$ and suppose that $\||b|^\beta\hat q(b)\|_{l^1} < \infty$ and $\|(1 + |b|^\beta)\hat A(b)\|_{l^1} < 2\varepsilon/63$. Let $\nu \in \{1, 2\}$ and let $f$ and $g$ be functions on $\Gamma^\#$ obeying $\||b|^\beta f(b)\|_{l^1} < \infty$ and $\||b|^\beta g(b)\|_{l^1} < \infty$. Suppose further that $G = \{0, d\}$ and $k \in T_0 \cap T_d$ with $|v| > \frac{2}{\varepsilon}\||b|^\beta\hat q(b)\|_{l^1}$. Then, for any integers $n$ and $m$ with $n + m \ge 0$ and for any $d', d'' \in G$ with $d' \neq d''$,
\[ \Big|\frac{\partial^{n+m}}{\partial k_1^n\,\partial k_2^m}\Phi_{d',d''}(k)\Big| \le \frac{C}{|d|^{1+\beta}}, \]
where $C = C_{\varepsilon,\Lambda,A,f,g,m,n}$ is a constant.

Observe that, in particular, this lemma with $m = n = 0$ generalizes Lemma 5. We next have bounds for the derivatives of $\alpha_{\mu,d'}^{(j)}(k)$.

Lemma 8 (Derivatives of $\alpha_{\mu,d'}^{(j)}(k)$). Under Hypothesis 1, let $\nu \in \{1, 2\}$ and let $f$ and $g$ be functions in $l^1(\Gamma^\#)$. Suppose either (i) or (ii) where:
(i) $G = \{0\}$ and $k \in (T_\nu(0) \setminus \bigcup_{b \in G'} T_b) \setminus K_R$;
(ii) $G = \{0, d\}$ and $k \in (T_\nu(0) \cap T_{\nu'}(d)) \setminus K_R$.
Then, there is a constant $\rho = \rho_{\varepsilon,A,q,m,n}$ with $\rho \ge R$ such that, for $|v| \ge \rho$ and for $(\mu, d') = (\nu, 0)$ if (i) or $(\mu, d') \in \{(\nu, 0), (\nu', d)\}$ if (ii), for any integers $n$ and $m$ with $n + m \ge 1$ and for $1 \le j \le 2$,
\[ \Big|\frac{\partial^{n+m}}{\partial k_1^n\,\partial k_2^m}\alpha_{\mu,d'}^{(j)}(k)\Big| \le \frac{C_j}{(2|z_{\mu,d'}(k)| - R)^j} \quad\text{and}\quad \Big|\frac{\partial^{n+m}}{\partial k_1^n\,\partial k_2^m}\alpha_{\mu,d'}^{(3)}(k)\Big| \le \frac{C_3}{|z_{\mu,d'}(k)|\,R^2}, \]
where $C_l = C_{l;f,g,\Lambda,A,q,n,m}$ for $1 \le l \le 3$ are constants. Furthermore,
\[ C_{1;f,g,\Lambda,A,1,0},\ C_{1;f,g,\Lambda,A,0,1} \le 13\Lambda^{-2}\|f\|_{l^1}\|g\|_{l^1} \quad\text{and}\quad C_{1;f,g,\Lambda,A,1,1} \le 65\Lambda^{-3}\|f\|_{l^1}\|g\|_{l^1}. \]

11. The Regular Piece

Proof of Theorem 1. Step 1 (Defining Equation). We first derive a defining equation for the Fermi curve. Without loss of generality we may assume that $\hat A(0) = 0$. Let $G = \{0\}$, recall that $G' = \Gamma^\# \setminus \{0\}$, and consider the region $(T_\nu(0) \setminus \bigcup_{b \in G'} T_b) \setminus K_\rho$, where ρ is a constant to be chosen sufficiently large obeying $\rho \ge R$. By Proposition 7(i) we have $G' = \{b \in \Gamma^\# \mid |N_b(k)| \ge \varepsilon|v|\}$. To simplify the notation write
\[ M_\nu := \big(\hat F(A, V) \cap T_\nu(0)\big) \setminus \Big(K_\rho \cup \bigcup_{b \in \Gamma^\# \setminus \{0\}} T_b\Big). \]
By Lemma 2(i), a point $k$ is in $M_\nu$ if and only if
\[ N_0(k) + D_{0,0}(k) = 0. \]


By Proposition 10, if we set
\[ w(k) := w_{\nu,0}(k) = k_1 + i(-1)^\nu k_2 \quad\text{and}\quad z(k) := z_{\nu,0}(k) = k_1 - i(-1)^\nu k_2, \]
this equation becomes
\[ \beta_1 w^2 + \beta_2 z^2 + (1 + \beta_3)wz + \beta_4 w + \beta_5 z + \beta_6 + \hat q(0) = 0, \tag{17} \]
where
\[ \beta_1 := J_{\nu'}^{00}, \quad \beta_2 := J_\nu^{00}, \quad \beta_3 := K^{00}, \quad \beta_4 := L_{\nu'}^{00}, \quad \beta_5 := L_\nu^{00}, \quad \beta_6 := M^{00} - \hat q(0), \]
with $J_\nu^{00}$, $K^{00}$, $L_\nu^{00}$ and $M^{00}$ given by Proposition 10. Observe that all the coefficients $\beta_1, \ldots, \beta_6$ have exactly the same form as the function $\Phi_{0,0}(k)$ of Lemma 3(i) (see (15)). Thus, by this lemma, for $1 \le i \le 6$ we have
\[ \beta_i = \beta_i^{(1)} + \beta_i^{(2)} + \beta_i^{(3)}, \tag{18} \]
where the function $\beta_i^{(j)}$ is analytic in the region under consideration with
\[ |\beta_i^{(j)}(k)| \le \frac{C}{(2|z(k)| - \rho)^j} \le \frac{C}{|z(k)|^j} \ \text{ for } 1 \le j \le 2 \quad\text{and}\quad |\beta_i^{(3)}(k)| \le \frac{C}{|z(k)|\,\rho^2}, \]
where $C = C_{\varepsilon,\Lambda,q,A}$ is a constant. The exact expression for $\beta_i^{(j)}$ can be easily obtained from the definitions and from Lemma 3(i). Substituting (18) into (17) and dividing both sides of the equation by $z$ yields
\[ w + \beta_2^{(1)} z + g = 0, \tag{19} \]
where
\[ g := \frac{\beta_1 w^2}{z} + (\beta_2^{(2)} + \beta_2^{(3)})z + \beta_3 w + \frac{\beta_4 w}{z} + \beta_5 + \frac{\beta_6}{z} + \frac{\hat q(0)}{z} \tag{20} \]
obeys
\[ |g(k)| \le \frac{C}{\rho}, \tag{21} \]
with a constant $C = C_{\varepsilon,\Lambda,q,A}$. Therefore, a point $k$ is in $M_\nu$ if and only if $F(k) = 0$, where
\[ F(k) := w(k) + \beta_2^{(1)}(k)\,z(k) + g(k) \]
is an analytic function (in the region under consideration).

Step 2 (Candidates for a Solution). Let us now identify which points are candidates to solve the equation $F(k) = 0$. First observe that, by Proposition 2(c) the lines


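$N_\nu(0)$ and $N_{\nu'}(d)$ intersect at
\[ N_\nu(0) \cap N_{\nu'}(d) = \{(i\theta_\nu(d),\ (-1)^{\nu'}\theta_\nu(d))\}. \]
Hence, the second coordinate of this point and the second coordinate of a point $k$ differ by
\[ \mathrm{pr}(k) - \mathrm{pr}(N_\nu(0) \cap N_{\nu'}(d)) = k_2 - (-1)^{\nu'}\theta_\nu(d) = k_2 + (-1)^\nu\theta_\nu(d). \]
Now observe that, if $k \in T_\nu(0) \cap T_{\nu'}(d)$ then $|k_1 + i(-1)^\nu k_2| < \varepsilon$ and
\[ |k_2 + (-1)^\nu\theta_\nu(d)| = \Big|\frac12\big(k_1 + i(-1)^\nu k_2\big) - \frac12\big(k_1 + d_1 - i(-1)^\nu(k_2 + d_2)\big)\Big| \le \frac12|N_{0,\nu}(k) - N_{d,\nu'}(k)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
That is, the second coordinate of $k$ and the second coordinate of $N_\nu(0) \cap N_{\nu'}(d)$ must be apart from each other by at most ε. This gives a necessary condition on the second coordinate of a point $k$ for being in $M_\nu$. Conversely, if a point $k$ is in the $(\varepsilon/4)$-tube inside $T_\nu(0)$, that is, $|k_1 + i(-1)^\nu k_2| < \frac{\varepsilon}{4}$, and its second coordinate differs from the second coordinate of $N_\nu(0) \cap N_{\nu'}(d)$ by at most $\varepsilon/4$, that is, $|k_2 + (-1)^\nu\theta_\nu(d)| < \frac{\varepsilon}{4}$, then
\[ |N_{d,\nu'}(k)| = \big|N_{0,\nu}(k) - 2i(-1)^\nu\big(k_2 + (-1)^\nu\theta_\nu(d)\big)\big| \le \frac{\varepsilon}{4} + 2\,\frac{\varepsilon}{4} < \varepsilon, \]
that is, the point $k$ is also in $T_{\nu'}(d)$ and hence lies in the intersection $T_\nu(0) \cap T_{\nu'}(d)$. This gives a sufficient condition on the first and second coordinates of a point $k$ for being in $T_\nu(0) \cap T_{\nu'}(d)$. For $y \in \mathbb{C}$ define the set of candidates for a solution of $F(k) = 0$ as
\[ M_\nu(y) := \mathrm{pr}^{-1}(y) \cap \Big(T_\nu(0) \setminus \bigcup_{b \in \Gamma^\# \setminus \{0\}} T_b\Big) = \mathrm{pr}^{-1}(y) \cap \Big(T_\nu(0) \setminus \bigcup_{b \in \Gamma^\# \setminus \{0\}} T_{\nu'}(b)\Big). \]
Observe that, if $|y + (-1)^\nu\theta_\nu(b)| \ge \varepsilon$ for all $b \in \Gamma^\# \setminus \{0\}$ then
\[ M_\nu(y) = \mathrm{pr}^{-1}(y) \cap T_\nu(0) = \{(k_1, y) \in \mathbb{C}^2 \mid |k_1 + i(-1)^\nu y| < \varepsilon\}. \tag{22} \]
On the other hand, if $|y + (-1)^\nu\theta_\nu(d)| < \varepsilon$ for some $d \in \Gamma^\# \setminus \{0\}$, then there is at most one such $d$ and consequently
\[ M_\nu(y) = \mathrm{pr}^{-1}(y) \cap \big(T_\nu(0) \setminus T_{\nu'}(d)\big) = \{(k_1, y) \in \mathbb{C}^2 \mid |k_1 + i(-1)^\nu y| < \varepsilon \text{ and } |k_1 + d_1 + i(-1)^{\nu'}(y + d_2)| \ge \varepsilon\}. \tag{23} \]
Indeed, suppose there is another $d' \neq 0$, $d' \neq d$, such that $|y + (-1)^\nu\theta_\nu(d')| < \varepsilon$. Then,
\[ |d - d'| = |2(-1)^\nu\theta_\nu(d - d')| = 2\,\big|y + (-1)^\nu\theta_\nu(d) - \big(y + (-1)^\nu\theta_\nu(d')\big)\big| < 4\varepsilon < 2\Lambda, \]
which contradicts the definition of Λ. Thus, there is no such $d' \neq 0$.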


Step 3 (Uniqueness). We now prove that, given $k_2$, if there exists a solution $k_1(k_2)$ of $F(k_1, k_2) = 0$, then this solution is unique and it depends analytically on $k_2$. This follows easily using the implicit function theorem and the estimates below, which we prove later.

Proposition 12. Under the hypotheses of Theorem 1 we have

(a) \[ |F(k) - w(k)| \le \frac{\varepsilon}{900} + \frac{C_1}{\rho}, \]

(b) \[ \left|\frac{\partial F}{\partial k_1}(k) - 1\right| \le \frac{1}{7 \cdot 3^4} + \frac{C_2}{\rho}, \]

where the constants $C_1$ and $C_2$ depend only on $\varepsilon$, $\Lambda$, $q$ and $A$.

Now suppose that $(k_1, y) \in M_\nu(y)$. Then,

\[ \left|\frac{\partial F}{\partial k_1}(k_1, y) - 1\right| \le \frac{1}{7 \cdot 3^4} + \frac{C_2}{\rho}. \]

Hence, by the implicit function theorem, by choosing the constant $\rho \ge R$ sufficiently large, if $F(k_1^*, y) = 0$ for some $(k_1^*, y) \in M_\nu(y)$, then there is a neighborhood $U \times V \subset \mathbb{C}^2$ which contains $(k_1^*, y)$, and an analytic function $\eta : V \to U$ such that $F(k_1, k_2) = 0$ for all $(k_1, k_2) \in U \times V$ if and only if $k_1 = \eta(k_2)$. In particular this implies that the equation $F(k_1, k_2) = 0$ has at most one solution $(\eta(y), y)$ in $M_\nu(y)$ for each $y \in \mathbb{C}$. We next look for conditions on $y$ to have a solution or have no solution in $M_\nu(y)$.

Step 4 (Existence). We first state an improved version of Proposition 12(a).

Proposition 13. Under the hypotheses of Theorem 1 we have

\[ F(k) - w(k) = \beta_2^{(1,0)} + \beta_2^{(1,1)}(w(k)) + \beta_2^{(1,2)}(k) + h(k), \]

where

\[ \beta_2^{(1,0)} = -2i \sum_{b,c \in G_1} \frac{\theta_{\nu'}(\hat A(-b))}{\theta_{\nu'}(b)} \left[ \delta_{b,c} + \frac{\theta_{\nu'}(\hat A(b - c))}{\theta_{\nu'}(c)} \right] \theta_{\nu'}(\hat A(c)) \]

is a constant that depends only on $\rho$ and $A$ and

\[ h := \beta_2^{(1,3)} + g. \]

Furthermore,

\[ |\beta_2^{(1,0)}| < \frac{1}{100\Lambda}\,\varepsilon^2, \qquad |\beta_2^{(1,1)}(k)| < \frac{1}{40\Lambda^2}\,\varepsilon^3, \qquad |\beta_2^{(1,2)}(k)| < \frac{1}{7^4\Lambda^3}\,\varepsilon^4, \qquad |h(k)| \le C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}. \tag{24} \]


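We now derive conditions for the existence of solutions. Suppose that $F(\eta(y), y) = 0$. Then, since $\eta(y) + i(-1)^\nu y = w(\eta(y), y)$ and $\varepsilon < \Lambda/6$, using the above proposition we obtain

\[ |\eta(y) + i(-1)^\nu y| = |w(\eta(y), y)| = |F(\eta(y), y) - w(\eta(y), y)| \le \frac{\varepsilon^2}{100\Lambda} + \frac{\varepsilon^3}{40\Lambda^2} + \frac{\varepsilon^4}{7^4\Lambda^3} + \frac{C}{\rho} \le \frac{\varepsilon^2}{50\Lambda} + \frac{C}{\rho}. \]

Hence, by choosing the constant $\rho$ sufficiently large we find that

\[ |\eta(y) + i(-1)^\nu y| < \frac{\varepsilon^2}{40\Lambda}. \]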
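The arithmetic behind the last two inequalities can be spelled out as follows. This is only a sketch that uses the hypothesis $\varepsilon < \Lambda/6$ and the three bounds of Proposition 13; the explicit threshold for $\rho$ is our own bookkeeping, with $C$ the constant appearing in the bound on $h$.

```latex
\[
\frac{\varepsilon^3}{40\Lambda^2} < \frac{\varepsilon^2}{240\Lambda},
\qquad
\frac{\varepsilon^4}{7^4\Lambda^3} < \frac{\varepsilon^2}{36\cdot 7^4\,\Lambda}
= \frac{\varepsilon^2}{86436\,\Lambda},
\]
\[
\frac{\varepsilon^2}{100\Lambda} + \frac{\varepsilon^3}{40\Lambda^2} + \frac{\varepsilon^4}{7^4\Lambda^3}
< \Bigl(\frac{1}{100} + \frac{1}{240} + \frac{1}{86436}\Bigr)\frac{\varepsilon^2}{\Lambda}
< \frac{\varepsilon^2}{50\Lambda},
\]
\[
\frac{\varepsilon^2}{50\Lambda} + \frac{C}{\rho} < \frac{\varepsilon^2}{40\Lambda}
\quad\Longleftrightarrow\quad
\rho > \frac{200\,C\,\Lambda}{\varepsilon^2}.
\]
```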

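In view of (23), there is no solution in $M_\nu(y)$ if for some $d \in \Gamma^\# \setminus \{0\}$ we have

\[ |y + (-1)^\nu\theta_\nu(d)| < \varepsilon \quad\text{and}\quad |\eta(y) + d_1 + i(-1)^{\nu'}(y + d_2)| < \varepsilon. \]

This happens if

\[ |y + (-1)^\nu\theta_\nu(d)| \le \frac12\left(\varepsilon - \frac{\varepsilon^2}{40\Lambda}\right) \]

because in this case

\[ |\eta(y) + d_1 + i(-1)^{\nu'}(y + d_2)| = |\eta(y) + i(-1)^\nu y - 2i(-1)^\nu y + d_1 - i(-1)^\nu d_2| \le |\eta(y) + i(-1)^\nu y| + 2\,|y + (-1)^\nu\theta_\nu(d)| < \varepsilon. \]

Therefore, the image set of pr is contained in

\[ \Omega_1 := \left\{ z \in \mathbb{C} \;\middle|\; |z + (-1)^\nu\theta_\nu(b)| > \frac12\left(\varepsilon - \frac{\varepsilon^2}{40\Lambda}\right) \text{ for all } b \in \Gamma^\# \setminus \{0\} \right\}. \]

On the other hand, in view of (22), there is a solution in $M_\nu(y)$ if $|y + (-1)^\nu\theta_\nu(b)| > \varepsilon$ for all $b \in \Gamma^\# \setminus \{0\}$. Recall from Proposition 11(a) that $\rho < |v| < 8|k_2|$. Thus, the image set of pr contains the set

\[ \Omega_2 := \{ z \in \mathbb{C} \mid 8|z| > \rho \text{ and } |z + (-1)^\nu\theta_\nu(b)| > \varepsilon \text{ for all } b \in \Gamma^\# \setminus \{0\} \}. \]

Step 5. Summarizing, we have the following biholomorphic correspondence:

\[ M_\nu \ni k \xrightarrow{\ \mathrm{pr}\ } k_2 \in \Omega, \qquad M_\nu \ni (\eta(y), y) \xleftarrow{\ \mathrm{pr}^{-1}\ } y \in \Omega, \]

where $\Omega_2 \subset \Omega \subset \Omega_1$ and

\[ \eta(y) = -\beta_2^{(1,0)} - i(-1)^\nu y - r(y), \]

with the constant $\beta_2^{(1,0)}$ given by (24),

\[ |\beta_2^{(1,0)}| < \frac{\varepsilon^2}{100\Lambda} \quad\text{and}\quad |r(y)| \le \frac{\varepsilon^3}{50\Lambda^2} + \frac{C}{\rho}. \]

This completes the proof of the theorem.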


Proof of Proposition 12. (a) Recall that $\beta_2 = J_\nu^{00}$. First observe that, by Proposition 10, Lemma 3, and (66), we have

\[ \beta_2^{(1)}(k) = (J_\nu^{00})^{(1)}(k) = \sum_{b,c \in G_1} \frac{(1, i(-1)^\nu) \cdot \hat A(-b)}{N_b(k)}\, S_{b,c}\, (1, -i(-1)^{\nu'}) \cdot \hat A(c). \tag{25} \]

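Thus, by (94) and (99),

\[ |\beta_2^{(1)}(k)| \le \frac{2}{\Lambda(2|z(k)| - R)}\, \sqrt2\, \|\hat A\|_{l^1}\, \frac{45}{44}\, \sqrt2\, \|\hat A\|_{l^1} \le \frac{4}{\Lambda |z(k)|}\, \frac{45}{44} \left(\frac{2\varepsilon}{63}\right)^2 \le \frac{\varepsilon}{900}\, \frac{1}{|z(k)|}. \tag{26} \]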

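Now recall that $|g(k)| \le C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}$. Hence,

\[ |F(k) - w(k)| = |\beta_2^{(1)}(k)\, z(k) + g(k)| \le \frac{\varepsilon}{900} + C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}. \]

This proves part (a).

(b) We first compute

\[
\begin{aligned}
\frac{\partial g}{\partial k_1} ={}& \frac{\partial \beta_1}{\partial k_1}\frac{w^2}{z} + \beta_1 \frac{2wz - w^2}{z^2} + \left( \frac{\partial \beta_2^{(2)}}{\partial k_1} + \frac{\partial \beta_2^{(3)}}{\partial k_1} \right) z + \beta_2^{(2)} + \beta_2^{(3)} + \frac{\partial \beta_3}{\partial k_1}\, w + \beta_3 \\
& + \frac{\partial \beta_4}{\partial k_1}\frac{w}{z} + \beta_4 \frac{z - w}{z^2} + \frac{\partial \beta_5}{\partial k_1} + \frac{\partial \beta_6}{\partial k_1}\frac{1}{z} - \frac{\beta_6}{z^2} - \frac{\hat q(0)}{z^2}.
\end{aligned}
\tag{27}
\]

Now observe that, since $k \in T_\nu(0) \setminus K_\rho$ we have $|w(k)| < \varepsilon$, $3|v| \ge |z|$ and $\rho < |v| \le |z|$. Furthermore, by Lemmas 3(i), 6(i) and 8(i), for $1 \le i \le 6$ and $1 \le j \le 2$,

\[
\begin{aligned}
|\beta_i(k)| &\le \frac{C}{|z(k)|}, & |\beta_i^{(j)}(k)| &\le \frac{C}{|z(k)|^j}, & |\beta_i^{(3)}(k)| &\le \frac{C}{|z(k)|\rho^2}, \\
\left|\frac{\partial \beta_i(k)}{\partial k_1}\right| &\le \frac{C}{|z(k)|}, & \left|\frac{\partial \beta_i^{(j)}(k)}{\partial k_1}\right| &\le \frac{C}{|z(k)|^j}, & \left|\frac{\partial \beta_i^{(3)}(k)}{\partial k_1}\right| &\le \frac{C}{|z(k)|\rho^2},
\end{aligned}
\tag{28}
\]

where $C = C_{\varepsilon,\Lambda,q,A}$ in all cases. Hence,

\[ \left|\frac{\partial g(k)}{\partial k_1}\right| \le C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}. \tag{29} \]

By Lemma 8(i) with $f = g = (1, -i(-1)^{\nu'}) \cdot \hat A$, we obtain

\[ \left| z(k)\, \frac{\partial \beta_2^{(1)}(k)}{\partial k_1} \right| \le |z(k)|\, \frac{13}{\Lambda^2}\, \|(1, -i(-1)^{\nu'}) \cdot \hat A\|_{l^1}^2\, \frac{1}{|z(k)|} \le \frac{26}{\Lambda^2}\, \|\hat A\|_{l^1}^2 < \frac{1}{7 \cdot 3^4}. \tag{30} \]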


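Therefore,

\[
\left|\frac{\partial F}{\partial k_1}(k) - 1\right| = \left|\frac{\partial}{\partial k_1}(F(k) - w(k))\right| = \left|\frac{\partial}{\partial k_1}\bigl(\beta_2^{(1)}(k)\, z(k) + g(k)\bigr)\right| = \left|\frac{\partial \beta_2^{(1)}}{\partial k_1}(k)\, z(k) + \beta_2^{(1)}(k) + \frac{\partial g}{\partial k_1}(k)\right| \le \frac{1}{7 \cdot 3^4} + C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}.
\]

This proves part (b) and completes the proof of the proposition.

Proof of Proposition 13. First observe that

\[ (1, i(-1)^\nu) \cdot A = A_1 + i(-1)^\nu A_2 = A_1 - i(-1)^{\nu'} A_2 = -2i\theta_{\nu'}(A). \]

Thus, recalling (25),

\[ \beta_2^{(1)}(k) = (J_\nu^{00})^{(1)}(k) = \sum_{b,c \in G_1} \frac{2i\theta_{\nu'}(\hat A(-b))}{N_b(k)}\, S_{b,c}\, 2i\theta_{\nu'}(\hat A(c)). \]

Now, by Lemma 4, we have

\[ z(k)\,\beta_2^{(1)}(k) = \beta_2^{(1,0)} + \beta_2^{(1,1)}(w(k)) + \beta_2^{(1,2)}(k) + \beta_2^{(1,3)}(k), \]

where

\[ \beta_2^{(1,0)} = -2i \sum_{b,c \in G_1} \frac{\theta_{\nu'}(\hat A(-b))}{\theta_{\nu'}(b)} \left[ \delta_{b,c} + \frac{\theta_{\nu'}(\hat A(b - c))}{\theta_{\nu'}(c)} \right] \theta_{\nu'}(\hat A(c)) \]

and

\[ |\beta_2^{(1,3)}(k)| \le C_{\Lambda,A}\,\frac{1}{|z(k)|} < C_{\Lambda,A}\,\frac{1}{\rho}. \]

Hence,

\[ F(k) - w(k) = z(k)\,\beta_2^{(1)}(k) + g(k) = \beta_2^{(1,0)} + \beta_2^{(1,1)}(w(k)) + \beta_2^{(1,2)}(k) + h(k) \]

with $h := \beta_2^{(1,3)} + g$. Furthermore, in view of (21),

\[ |h(k)| \le |\beta_2^{(1,3)}(k)| + |g(k)| < C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}. \]

This proves the first part of the proposition. Finally, by (81), since $\|\hat A\|_{l^1} < 2\varepsilon/63$ and $\varepsilon < \Lambda/6$, we find that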

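\[ |\beta_2^{(1,0)}| \le \frac{1}{2\Lambda}\, \|2i\theta_{\nu'}(\hat A)\|_{l^1} \left( 1 + \frac{1}{2\Lambda}\, \|\theta_{\nu'}(\hat A)\|_{l^1} \right) \|2i\theta_{\nu'}(\hat A)\|_{l^1} \le \frac{4}{\Lambda}\, \|\hat A\|_{l^1}^2 < \frac{1}{100\Lambda}\,\varepsilon^2, \]

\[ |\beta_2^{(1,1)}| \le \frac{\varepsilon}{\Lambda^2}\, \|2i\theta_{\nu'}(\hat A)\|_{l^1} \left( 1 + \frac{7}{6\Lambda}\, \|\theta_{\nu'}(\hat A)\|_{l^1} \right) \|2i\theta_{\nu'}(\hat A)\|_{l^1} \le \frac{8}{\Lambda^2}\,\varepsilon\, \|\hat A\|_{l^1}^2 < \frac{1}{40\Lambda^2}\,\varepsilon^3 \]

and

\[ |\beta_2^{(1,2)}| \le \frac{64}{\Lambda^3}\, \|\theta_{\nu'}(\hat A)\|_{l^1}^2\, \|2i\theta_{\nu'}(\hat A)\|_{l^1}\, \|2i\theta_{\nu'}(\hat A)\|_{l^1} \le \frac{256}{\Lambda^3}\, \|\hat A\|_{l^1}^4 < \frac{1}{7^4\Lambda^3}\,\varepsilon^4. \]

This completes the proof.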

12. The Handles

Proof of Theorem 2. Step 1 (Defining Equation). Let $G = \{0, d\}$ and consider the region $(T_\nu(0) \cap T_{\nu'}(d)) \setminus K_\rho$, where $\rho$ is a constant to be chosen sufficiently large obeying $\rho \ge R$. Observe that this requires $d$ to be sufficiently large for $(T_\nu(0) \cap T_{\nu'}(d)) \setminus K_\rho$ to be nonempty. In fact, by Proposition 11(ii), for $k$ in this region we have $\rho < |v| \le 2|d|$. Now, recall from Proposition 7(ii) that $G' = \{b \in \Gamma^\# \mid |N_b(k)| \ge \varepsilon|v|\}$, and to simplify the notation write

\[ H_\nu := \hat F(A, V) \cap (T_\nu(0) \cap T_{\nu'}(d)) \setminus K_\rho. \]

By Lemma 2(ii), a point $k$ is in $H_\nu$ if and only if

\[ (N_0(k) + D_{0,0}(k))(N_d(k) + D_{d,d}(k)) - D_{0,d}(k)\, D_{d,0}(k) = 0. \tag{31} \]

Define

\[
\begin{aligned}
w_1(k) &:= w_{\nu,0} = k_1 + i(-1)^\nu k_2, & z_1(k) &:= z_{\nu,0} = k_1 - i(-1)^\nu k_2, \\
w_2(k) &:= w_{\nu',d} = k_1 + d_1 + i(-1)^{\nu'}(k_2 + d_2), & z_2(k) &:= z_{\nu',d} = k_1 + d_1 - i(-1)^{\nu'}(k_2 + d_2).
\end{aligned}
\tag{32}
\]

Note that, by Proposition 11(ii),

\[ |v| \le |z_1| \le 3|v|, \qquad |v| \le |z_2| \le 3|v| \qquad\text{and}\qquad |d| \le |z_2| \le 2|d|. \]

By Proposition 10,

\[
\begin{aligned}
N_0 + D_{0,0} &= \beta_1 w_1^2 + \beta_2 z_1^2 + (1 + \beta_3)\, w_1 z_1 + \beta_4 w_1 + \beta_5 z_1 + \beta_6 + \hat q(0), \\
N_d + D_{d,d} &= \eta_1 w_2^2 + \eta_2 z_2^2 + (1 + \eta_3)\, w_2 z_2 + \eta_4 w_2 + \eta_5 z_2 + \eta_6 + \hat q(0),
\end{aligned}
\tag{33}
\]

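where

\[ \beta_1 := J_{\nu'}^{00}, \quad \beta_2 := J_\nu^{00}, \quad \beta_3 := K^{00}, \quad \beta_4 := L_{\nu'}^{00}, \quad \beta_5 := L_\nu^{00}, \quad \beta_6 := M^{00} - \hat q(0), \]

and

\[ \eta_1 := J_\nu^{dd}, \quad \eta_2 := J_{\nu'}^{dd}, \quad \eta_3 := K^{dd}, \quad \eta_4 := L_\nu^{dd}, \quad \eta_5 := L_{\nu'}^{dd}, \quad \eta_6 := M^{dd} - \hat q(0), \]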

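with $J_\nu^{d'd''}$, $K^{d'd''}$, $L_\nu^{d'd''}$ and $M^{d'd''}$ given by Proposition 10. Observe that all the coefficients $\beta_1, \dots, \beta_6$ and $\eta_1, \dots, \eta_6$ have exactly the same form as the function $\Phi_{d',d'}(k)$ of Lemma 3(ii) (see (15)). Thus, by this lemma, for $1 \le i \le 6$ we have

\[ \beta_i = \beta_i^{(1)} + \beta_i^{(2)} + \beta_i^{(3)} \quad\text{and}\quad \eta_i = \eta_i^{(1)} + \eta_i^{(2)} + \eta_i^{(3)}, \tag{34} \]

where the functions $\beta_i^{(j)}$ and $\eta_i^{(j)}$ are analytic in the region under consideration with

\[
\begin{aligned}
|\beta_i^{(j)}(k)| &\le \frac{C}{(2|z_1(k)| - \rho)^j} \le \frac{C}{|z_1(k)|^j} \ \text{ for } 1 \le j \le 2 & &\text{and}\quad |\beta_i^{(3)}(k)| \le \frac{C}{|z_1(k)|\rho^2}, \\
|\eta_i^{(j)}(k)| &\le \frac{C}{(2|z_2(k)| - \rho)^j} \le \frac{C}{|z_2(k)|^j} \ \text{ for } 1 \le j \le 2 & &\text{and}\quad |\eta_i^{(3)}(k)| \le \frac{C}{|z_2(k)|\rho^2},
\end{aligned}
\]

where $C = C_{\varepsilon,\Lambda,q,A}$ is a constant. The exact expressions for $\beta_i^{(j)}$ and $\eta_i^{(j)}$ can be easily obtained from the definitions and from Lemma 3(ii). Substituting (34) into (33) yields

\[
\begin{aligned}
\frac{1}{z_1}(N_0 + D_{0,0}) &= w_1 + \beta_2^{(1)} z_1 + g_1, \\
\frac{1}{z_2}(N_d + D_{d,d}) &= w_2 + \eta_2^{(1)} z_2 + g_2,
\end{aligned}
\tag{35}
\]

where

\[
\begin{aligned}
g_1 &:= \frac{\beta_1 w_1^2}{z_1} + (\beta_2^{(2)} + \beta_2^{(3)})\, z_1 + \beta_3 w_1 + \frac{\beta_4 w_1}{z_1} + \beta_5 + \frac{\beta_6}{z_1} + \frac{\hat q(0)}{z_1}, \\
g_2 &:= \frac{\eta_1 w_2^2}{z_2} + (\eta_2^{(2)} + \eta_2^{(3)})\, z_2 + \eta_3 w_2 + \frac{\eta_4 w_2}{z_2} + \eta_5 + \frac{\eta_6}{z_2} + \frac{\hat q(0)}{z_2}
\end{aligned}
\tag{36}
\]

obey

\[ |g_1(k)| \le \frac{C}{\rho} \quad\text{and}\quad |g_2(k)| \le \frac{C}{\rho}, \tag{37} \]

with a constant $C = C_{\varepsilon,\Lambda,q,A}$. This gives us more information about the first term in (31). We next consider the second term in that equation. Write

\[ D_{0,d} = c_1(d) + p_1 \quad\text{and}\quad D_{d,0} = c_2(d) + p_2 \tag{38} \]

with

\[
\begin{aligned}
c_1(d) &:= \hat q(-d) - 2d \cdot \hat A(-d), & p_1 &:= D_{0,d} - \hat q(-d) + 2d \cdot \hat A(-d), \\
c_2(d) &:= \hat q(d) + 2d \cdot \hat A(d), & p_2 &:= D_{d,0} - \hat q(d) - 2d \cdot \hat A(d).
\end{aligned}
\]

We have the following estimates.

Proposition 14. Under the hypotheses of Theorem 2 we have, for any integers $n$ and $m$ with $n + m \ge 0$ and for $1 \le j \le 2$,

\[ \left| \frac{\partial^{n+m}}{\partial k_1^n\, \partial k_2^m}\, p_j(k) \right| \le \frac{C_1}{|d|} \quad\text{and}\quad |c_j(d)| \le \frac{C_2}{|d|}, \]

where the constants $C_1$ and $C_2$ depend only on $\varepsilon$, $\Lambda$, $q$ and $A$.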


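Thus, by dividing both sides of (31) by $z_1 z_2$ and substituting (35) and (38) we find that

\[
0 = \frac{1}{z_1 z_2}\bigl[(N_0 + D_{0,0})(N_d + D_{d,d}) - D_{0,d} D_{d,0}\bigr] = (w_1 + \beta_2^{(1)} z_1 + g_1)(w_2 + \eta_2^{(1)} z_2 + g_2) - \frac{1}{z_1 z_2}(c_1(d) + p_1)(c_2(d) + p_2).
\tag{39}
\]

We now introduce a (nonlinear) change of variables in $\mathbb{C}^2$. Set

\[
\begin{aligned}
x_1(k) &:= w_1(k) + \beta_2^{(1)}(k)\, z_1(k) + g_1(k), \\
x_2(k) &:= w_2(k) + \eta_2^{(1)}(k)\, z_2(k) + g_2(k).
\end{aligned}
\tag{40}
\]

This transformation obeys the following estimates.

Proposition 15. Under the hypotheses of Theorem 2 we have:

(i) For $1 \le j \le 2$ and for $\rho$ sufficiently large,

\[ |x_j(k) - w_j(k)| \le \frac{\varepsilon}{900} + \frac{C}{\rho} < \frac{\varepsilon}{8}. \]

(ii)

\[
\begin{pmatrix} \dfrac{\partial x_1}{\partial k_1} & \dfrac{\partial x_1}{\partial k_2} \\[2ex] \dfrac{\partial x_2}{\partial k_1} & \dfrac{\partial x_2}{\partial k_2} \end{pmatrix}
= \begin{pmatrix} 1 & i(-1)^\nu \\ 1 & i(-1)^{\nu'} \end{pmatrix} (I + M)
\quad\text{and}\quad
\begin{pmatrix} \dfrac{\partial k_1}{\partial x_1} & \dfrac{\partial k_1}{\partial x_2} \\[2ex] \dfrac{\partial k_2}{\partial x_1} & \dfrac{\partial k_2}{\partial x_2} \end{pmatrix}
= \frac12 \begin{pmatrix} 1 & 1 \\ i(-1)^{\nu'} & i(-1)^\nu \end{pmatrix} (I + N)
\]

with

\[ \|M\| \le \frac{4}{7 \cdot 3^4} + \frac{C}{\rho} < \frac12 \quad\text{and}\quad \|N\| \le 4\|M\|. \]

Furthermore, for all $m, i, j \in \{1, 2\}$,

\[ \left| \frac{\partial^2 k_m}{\partial x_i\, \partial x_j} \right| \le \frac{3}{\Lambda^3}\,\varepsilon^2 + \frac{C}{\rho}. \]

Here, all the constants $C$ depend only on $\varepsilon$, $\Lambda$, $q$ and $A$.

By the inverse function theorem, these estimates imply that the above transformation is invertible. Therefore, by rewriting Eq. (39) in terms of these new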


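variables, we conclude that a point $k$ is in $H_\nu$ if and only if $x_1(k)$ and $x_2(k)$ satisfy the equation

\[ x_1 x_2 + r(x_1, x_2) = 0, \tag{41} \]

where

\[ r(x_1, x_2) := -\frac{1}{z_1 z_2}(c_1(d) + p_1)(c_2(d) + p_2). \]

In order to study this defining equation we need some estimates.

Step 2 (Estimates). Using the above inequalities we have, for $i, j, l \in \{1, 2\}$,

\[ \left| \frac{\partial}{\partial x_i}\, p_j(k(x)) \right| \le \sum_{m=1}^{2} \left| \frac{\partial p_j}{\partial k_m} \frac{\partial k_m}{\partial x_i} \right| \le \frac{C}{|d|} \]

and

\[ \left| \frac{\partial^2}{\partial x_i\, \partial x_l}\, p_j(k(x)) \right| \le \sum_{m,n=1}^{2} \left| \frac{\partial^2 p_j}{\partial k_m\, \partial k_n} \frac{\partial k_m}{\partial x_i} \frac{\partial k_n}{\partial x_l} \right| + \sum_{m=1}^{2} \left| \frac{\partial p_j}{\partial k_m} \frac{\partial^2 k_m}{\partial x_i\, \partial x_l} \right| \le \frac{C}{|d|}, \]

so that

\[ |r(x)| \le C\, \frac{1}{|d|^2}\, \frac{1}{|d|}\, \frac{1}{|d|} \le \frac{C}{|d|^4}, \]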

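\[ \left| \frac{\partial}{\partial x_i}\, r(x) \right| \le C\, \frac{1}{|d|^3}\, \frac{1}{|d|}\, \frac{1}{|d|} + C\, \frac{1}{|d|^2}\, \frac{1}{|d|}\, \frac{1}{|d|} \le \frac{C}{|d|^4} \]

and

\[ \left| \frac{\partial^2}{\partial x_i\, \partial x_j}\, r(x) \right| \le \frac{C}{|d|^4}. \]

Here, all the constants depend only on $\varepsilon$, $\Lambda$, $q$ and $A$.

Step 3 (Morse Lemma). We now apply the quantitative Morse lemma in Appendix A for studying Eq. (41). We consider this lemma with $a = b = C/|d|^4$, $\delta = \varepsilon$, and $d$ sufficiently large so that $b < \min\{\frac23 \cdot \frac{1}{55}, \frac{\varepsilon}{4}\}$. Observe that, under this condition we have

\[ (\delta - a)(1 - 19b) > \frac{\varepsilon}{2} \quad\text{and}\quad (\delta - a)(1 - 55b) > \frac{\varepsilon}{4}. \]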
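The two inequalities can be verified directly. The following is a sketch under the assumption that the smallness condition on $b$ reads $b < \min\{\frac{2}{165}, \frac{\varepsilon}{4}\}$ (the threshold is hard to read in the available text), with $a = b$ and $\delta = \varepsilon$ as above.

```latex
\[
\delta - a = \varepsilon - b > \frac{3\varepsilon}{4},
\qquad
1 - 55b > 1 - \frac{2}{3} = \frac{1}{3},
\qquad
1 - 19b > 1 - \frac{38}{165} = \frac{127}{165} > \frac{2}{3},
\]
\[
(\delta - a)(1 - 19b) > \frac{3\varepsilon}{4}\cdot\frac{2}{3} = \frac{\varepsilon}{2},
\qquad
(\delta - a)(1 - 55b) > \frac{3\varepsilon}{4}\cdot\frac{1}{3} = \frac{\varepsilon}{4}.
\]
```

Note that $b < \frac{2}{165}$ also guarantees the hypothesis $b < \frac{1}{55}$ of the lemma in Appendix A.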

According to this lemma, there is a biholomorphism $\Phi_\nu$ defined on

\[ \Omega_1 := \left\{ (z_1, z_2) \in \mathbb{C}^2 \;\middle|\; |z_1| < \frac{\varepsilon}{2} \text{ and } |z_2| < \frac{\varepsilon}{2} \right\} \]

with range containing

\[ \left\{ (x_1, x_2) \in \mathbb{C}^2 \;\middle|\; |x_1| < \frac{\varepsilon}{4} \text{ and } |x_2| < \frac{\varepsilon}{4} \right\} \tag{42} \]

such that

\[ \|D\Phi_\nu - I\| \le \frac{C}{|d|^2}, \qquad ((x_1 x_2 + r) \circ \Phi_\nu)(z_1, z_2) = z_1 z_2 + t_d, \qquad |t_d| \le \frac{C}{|d|^4}, \qquad |\Phi_\nu(0)| \le \frac{C}{|d|^4}, \tag{43} \]

where $D\Phi_\nu$ is the derivative of $\Phi_\nu$ and $t_d$ is a constant that depends on $d$. Hence, if for $\nu = 1$ we define $\phi_{d,1} : \Omega_1 \to T_1(0) \cap T_2(d)$ as

\[ \phi_{d,1}(z_1, z_2) := (k_1(\Phi_1(z_1, z_2)),\, k_2(\Phi_1(z_1, z_2))), \]

where $k(x)$ is the inverse of the transformation (40), we obtain the desired map. Note that the conclusion (ii) of the theorem is immediate. We next prove (i) and (iii).

Step 4 (Proof of (i)). By Proposition 15(i), for $1 \le j \le 2$ we have $|x_j(k) - w_j(k)| \le \frac{\varepsilon}{8}$. Now, recall from (32) the definition of $w_1(k)$ and $w_2(k)$. Then, since

\[ |x_j(k)| \le |x_j(k) - w_j(k)| + |w_j(k)| < \frac{\varepsilon}{8} + |w_j(k)|, \]

the set

\[ \left\{ (k_1, k_2) \in \mathbb{C}^2 \;\middle|\; |w_1(k)| < \frac{\varepsilon}{8} \text{ and } |w_2(k)| < \frac{\varepsilon}{8} \right\} \]

is contained in the set (42). This proves the first part of (i). To prove the second part we use Proposition 15 and (43). First observe that

\[ D\phi_{d,1} = \frac{\partial k}{\partial x}\, D\Phi_1 = \frac12 \begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix} (I + N)(I + D\Phi_1 - I) = \frac12 \begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix} (I + N + R), \]

where

\[ \|N\| \le \frac{1}{3^3} + \frac{C}{\rho} \quad\text{and}\quad \|R\| \le \frac{C}{|d|^2}. \]

Furthermore, from (32) and (40) we have

\[ k_1 = i\theta_\nu(d) + \frac12(w_1 + w_2) = i\theta_\nu(d) + \frac12\bigl(x_1 + x_2 - \beta_2^{(1)} z_1 - \eta_2^{(1)} z_2 - g_1 - g_2\bigr) \]


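and similarly

\[ k_2 = -(-1)^\nu\theta_\nu(d) + \frac{(-1)^\nu}{2i}\bigl(x_1 - x_2 - \beta_2^{(1)} z_1 + \eta_2^{(1)} z_2 - g_1 + g_2\bigr), \]

so that

\[ \phi_{d,1}(0) = k(\Phi_1(0)) = k\!\left( O\!\left(\frac{1}{|d|^4}\right) \right) = (i\theta_\nu(d),\, -(-1)^\nu\theta_\nu(d)) + O\!\left(\frac{\varepsilon}{900}\right) + O\!\left(\frac{1}{\rho}\right). \]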
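The linear part of this computation, namely recovering $k$ from $(w_1, w_2)$, can be checked numerically. The following is a minimal sketch, assuming the definitions $\theta_\nu(y) = \frac12((-1)^\nu y_2 + i y_1)$ and (32), with $(-1)^{\nu'} = -(-1)^\nu$; the function names are ours and purely illustrative.

```python
import random


def sign(nu):
    """Return (-1)**nu for nu in {1, 2}."""
    return -1 if nu == 1 else 1


def theta(nu, y):
    """theta_nu(y) = ((-1)^nu * y2 + i * y1) / 2 for a pair y = (y1, y2)."""
    return 0.5 * (sign(nu) * y[1] + 1j * y[0])


def w_coords(nu, k, d):
    """Return (w1, w2) as in (32); note that (-1)^nu' = -(-1)^nu."""
    s = sign(nu)
    w1 = k[0] + 1j * s * k[1]
    w2 = k[0] + d[0] - 1j * s * (k[1] + d[1])
    return w1, w2


def k_from_w(nu, w1, w2, d):
    """Invert the linear map k -> (w1, w2) using the formulas in the text."""
    s = sign(nu)
    t = theta(nu, d)
    k1 = 1j * t + 0.5 * (w1 + w2)
    k2 = -s * t + s * (w1 - w2) / 2j
    return k1, k2


def max_roundtrip_error(trials=200, seed=0):
    """Largest error of k -> (w1, w2) -> k over random complex k and real d."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        nu = rng.choice((1, 2))
        k = (complex(rng.uniform(-5, 5), rng.uniform(-5, 5)),
             complex(rng.uniform(-5, 5), rng.uniform(-5, 5)))
        d = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        w1, w2 = w_coords(nu, k, d)
        k1, k2 = k_from_w(nu, w1, w2, d)
        worst = max(worst, abs(k1 - k[0]), abs(k2 - k[1]))
    return worst


if __name__ == "__main__":
    print(max_roundtrip_error())
```

Setting $w_1 = w_2 = 0$ in `k_from_w` returns the point $(i\theta_\nu(d), -(-1)^\nu\theta_\nu(d))$, which is the leading term of $\phi_{d,1}(0)$ above.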

Step 5 (Proof of (iii)). To prove part (iii) it suffices to note that $T_1(0) \cap T_2(d) \cap \hat F(A, V)$ is mapped to $T_1(-d) \cap T_2(0) \cap \hat F(A, V)$ by translation by $d$ and define $\phi_{d,2}$ by

\[ \phi_{d,2}(z_1, z_2) := \phi_{d,1}(z_2, z_1) + d. \]

This completes the proof of the theorem.

Proof of Proposition 14. It suffices to estimate

\[ c_{d',d''} := \hat q(d' - d'') - 2(d'' - d') \cdot \hat A(d' - d'') \quad\text{and}\quad p_{d',d''} := D_{d',d''} - c_{d',d''} \]

for $d', d'' \in \{0, d\}$ with $d' \ne d''$. Define $l_\nu^{d'd''} := (1, i(-1)^\nu) \cdot \hat A(d' - d'')$. Observe that, since

\[ |\hat q(d' - d'')| = \frac{1}{|d' - d''|^2}\, |d' - d''|^2\, |\hat q(d' - d'')| \le \frac{1}{|d' - d''|^2} \sum_{b \in \Gamma^\#} |b|^2 |\hat q(b)| \le \|b^2 \hat q(b)\|_{l^1}\, \frac{1}{|d|^2}, \]

and similarly

\[ |\hat A(d' - d'')| \le \|b^2 \hat A(b)\|_{l^1}\, \frac{1}{|d|^2}, \]

it follows that

\[ |c_{d',d''}| \le \frac{C_{A,q}}{|d|} \quad\text{and}\quad |l_\nu^{d'd''}| \le \frac{C_A}{|d|^2}. \]

This gives the desired bounds for $c_1$ and $c_2$. Now, by Proposition 10, we have

2 dd 2 dd ˜ dν d − lνd d )wν,d zν,d wν,d zν,d + (L p = Jνd d wν,d + Jν + K

˜ d d − ld d )zν,d + M ˜dd + (L ν ν

˜ dν d := Ldν d +lνd d and M ˜ d d := M d d −c. Observe that all the coeﬃcients with L ˜ dν d and M ˜ d d have exactly the same form as the function Φd ,d (k) Jνd d , K d d , L of Lemma 7 (see Proposition 10 and (15)). Thus, by this lemma with β = 2, for n+m any integers n and m with n + m ≥ 0, the absolute value of the ∂k∂ n ∂km -derivative 1

2


of each of these functions is bounded above by $C_{\varepsilon,\Lambda,A,q,m,n}\,\frac{1}{|d|^3}$. Hence, if we recall from Proposition 11(ii) that $|z_1(k)| \le 6|d|$ and $|z_2(k)| \le 2|d|$, and apply the Leibniz rule we find that

\[ \left| \frac{\partial^{n+m}}{\partial k_1^n\, \partial k_2^m}\, p_{d',d''}(k) \right| \le C_{m,n}\, \frac{C}{|d|}. \]

This yields the desired bounds for $p_1$ and $p_2$ and completes the proof.

Proof of Proposition 15. (i) Similarly as in (26) we have

\[ |\beta_2^{(1)}(k)| \le \frac{\varepsilon}{900}\, \frac{1}{|z_1(k)|} \quad\text{and}\quad |\eta_2^{(1)}(k)| \le \frac{\varepsilon}{900}\, \frac{1}{|z_2(k)|}. \]

Thus, in view of (37), and by choosing $\rho$ sufficiently large,

\[ |x_1(k) - w_1(k)| \le |\beta_2^{(1)}(k)\, z_1(k) + g_1(k)| \le \frac{\varepsilon}{900} + \frac{C}{\rho} < \frac{\varepsilon}{8}, \]

and similarly $|x_2(k) - w_2(k)| < \varepsilon/8$. This proves part (i).

(ii) Recall (32) and (40). Then, for $1 \le j \le 2$,

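\[
\begin{aligned}
\frac{\partial x_1}{\partial k_j} &= \frac{\partial}{\partial k_j}\bigl(w_1 + z_1 \beta_2^{(1)} + g_1\bigr) = \frac{\partial w_1}{\partial k_j} + z_1 \frac{\partial \beta_2^{(1)}}{\partial k_j} + \frac{\partial z_1}{\partial k_j}\, \beta_2^{(1)} + \frac{\partial g_1}{\partial k_j}, \\
\frac{\partial x_2}{\partial k_j} &= \frac{\partial}{\partial k_j}\bigl(w_2 + z_2 \eta_2^{(1)} + g_2\bigr) = \frac{\partial w_2}{\partial k_j} + z_2 \frac{\partial \eta_2^{(1)}}{\partial k_j} + \frac{\partial z_2}{\partial k_j}\, \eta_2^{(1)} + \frac{\partial g_2}{\partial k_j}.
\end{aligned}
\]

First observe that the functions $g_1$ and $g_2$ are similar to the function $g$ (see (36) and (20)). Thus, it is easy to see that $\frac{\partial g_1}{\partial k_j}$ and $\frac{\partial g_2}{\partial k_j}$ are given by expressions similar to (27). Since $k \in T_\nu(0) \cap T_{\nu'}(d)$ we have $|w_1(k)| < \varepsilon$ and $|w_2(k)| < \varepsilon$. Recall also the inequalities in Proposition 11(ii). Hence, by Lemmas 3(ii), 6(ii) and 8(ii), we obtain (28) with $k_1$ and $z(k)$ replaced by $k_j$ and $z_1(k)$, respectively, and for $k_1$, $z(k)$ and $\beta$ replaced by $k_j$, $z_2(k)$ and $\eta$, respectively. Consequently, similarly as in (29) and using again Lemma 3(ii), for $1 \le j \le 2$ we have

\[ \left| \frac{\partial z_1}{\partial k_j}\, \beta_2^{(1)} + \frac{\partial g_1}{\partial k_j} \right| \le C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho} \quad\text{and}\quad \left| \frac{\partial z_2}{\partial k_j}\, \eta_2^{(1)} + \frac{\partial g_2}{\partial k_j} \right| \le C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}. \]

Now recall that $\beta_2 = J_\nu^{00}$ and $\eta_2 = J_{\nu'}^{dd}$. Then, by Proposition 10, Lemma 3(ii), and (66), it follows that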

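\[
\begin{aligned}
\beta_2^{(1)}(k) &= (J_\nu^{00})^{(1)}(k) = \sum_{b,c \in G_1} \frac{(1, i(-1)^\nu) \cdot \hat A(-b)}{N_b(k)}\, S_{b,c}\, (1, -i(-1)^{\nu'}) \cdot \hat A(c), \\
\eta_2^{(1)}(k) &= (J_{\nu'}^{dd})^{(1)}(k) = \sum_{b,c \in G_1} \frac{(1, i(-1)^{\nu'}) \cdot \hat A(d - b)}{N_b(k)}\, S_{b,c}\, (1, -i(-1)^{\nu}) \cdot \hat A(c - d).
\end{aligned}
\]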


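Hence, by Lemma 8(ii), similarly as in (30), for $1 \le j \le 2$,

\[ \left| z_1(k)\, \frac{\partial \beta_2^{(1)}(k)}{\partial k_j} \right| \le \frac{13}{\Lambda^2}\, \|(1, -i(-1)^{\nu'}) \cdot \hat A\|_{l^1}^2 < \frac{1}{7 \cdot 3^4}, \qquad \left| z_2(k)\, \frac{\partial \eta_2^{(1)}(k)}{\partial k_j} \right| < \frac{1}{7 \cdot 3^4}. \]

Therefore,

\[
\begin{aligned}
\begin{pmatrix} \dfrac{\partial x_1}{\partial k_1} & \dfrac{\partial x_1}{\partial k_2} \\[2ex] \dfrac{\partial x_2}{\partial k_1} & \dfrac{\partial x_2}{\partial k_2} \end{pmatrix}
={}& \begin{pmatrix} 1 & i(-1)^\nu \\ 1 & i(-1)^{\nu'} \end{pmatrix}
+ \begin{pmatrix} z_1(k) \dfrac{\partial \beta_2^{(1)}(k)}{\partial k_1} & z_1(k) \dfrac{\partial \beta_2^{(1)}(k)}{\partial k_2} \\[2ex] z_2(k) \dfrac{\partial \eta_2^{(1)}(k)}{\partial k_1} & z_2(k) \dfrac{\partial \eta_2^{(1)}(k)}{\partial k_2} \end{pmatrix} \\
& + \begin{pmatrix} \beta_2^{(1)} & -i(-1)^\nu \beta_2^{(1)} \\ \eta_2^{(1)} & -i(-1)^{\nu'} \eta_2^{(1)} \end{pmatrix}
+ \begin{pmatrix} \dfrac{\partial g_1}{\partial k_1} & \dfrac{\partial g_1}{\partial k_2} \\[2ex] \dfrac{\partial g_2}{\partial k_1} & \dfrac{\partial g_2}{\partial k_2} \end{pmatrix} \\
=:{}& \begin{pmatrix} 1 & i(-1)^\nu \\ 1 & i(-1)^{\nu'} \end{pmatrix} (I + M_1 + M_2 + M_3),
\end{aligned}
\]

where

\[ \|M_1\| \le 2 \cdot \frac{2}{7 \cdot 3^4} \quad\text{and}\quad \|M_2 + M_3\| \le C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}. \]

Set $M := M_1 + M_2 + M_3$. This proves the first claim. Now, by choosing $\rho$ sufficiently large we can make $\|M\| < \frac12$. Write

\[ P := \begin{pmatrix} 1 & i(-1)^\nu \\ 1 & i(-1)^{\nu'} \end{pmatrix}. \]

Then, by the inverse function theorem and using the Neumann series,

\[
\begin{pmatrix} \dfrac{\partial k_1}{\partial x_1} & \dfrac{\partial k_1}{\partial x_2} \\[2ex] \dfrac{\partial k_2}{\partial x_1} & \dfrac{\partial k_2}{\partial x_2} \end{pmatrix}
= \begin{pmatrix} \dfrac{\partial x_1}{\partial k_1} & \dfrac{\partial x_1}{\partial k_2} \\[2ex] \dfrac{\partial x_2}{\partial k_1} & \dfrac{\partial x_2}{\partial k_2} \end{pmatrix}^{-1}
= (I + M)^{-1} P^{-1} = (I + \tilde M) P^{-1} = P^{-1}(I + P \tilde M P^{-1})
= \frac12 \begin{pmatrix} 1 & 1 \\ i(-1)^{\nu'} & i(-1)^\nu \end{pmatrix} (I + P \tilde M P^{-1}),
\]

with

\[ \|P \tilde M P^{-1}\| \le 2\|\tilde M\| \le \frac{2\|M\|}{1 - \|M\|} \le 4\|M\|. \]

Set $N := P \tilde M P^{-1}$. This proves the second claim.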
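The Neumann-series step can be illustrated numerically on $2 \times 2$ matrices. The following is a minimal sketch, not the paper's argument: it assumes the maximum absolute row sum as matrix norm (the text does not specify the norm at this point), for which $\|P\|\,\|P^{-1}\| = 2$, and all helper names are ours.

```python
import random


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][r] * b[r][j] for r in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(a):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def add(a, b, scale=1):
    """Entrywise a + scale * b."""
    return [[a[i][j] + scale * b[i][j] for j in range(2)] for i in range(2)]


def norm(a):
    """Maximum absolute row sum (an operator norm; assumed choice)."""
    return max(abs(row[0]) + abs(row[1]) for row in a)


IDENTITY = [[1, 0], [0, 1]]


def n_from_m(nu, m):
    """Return N = P M~ P^{-1} with (I + M)^{-1} = I + M~, and a consistency residual."""
    s = -1 if nu == 1 else 1
    p = [[1, 1j * s], [1, -1j * s]]
    p_inv = inverse(p)
    m_tilde = add(inverse(add(IDENTITY, m)), IDENTITY, -1)
    n = matmul(matmul(p, m_tilde), p_inv)
    # Check (P (I + M))^{-1} = P^{-1} (I + N) entry by entry.
    lhs = inverse(matmul(p, add(IDENTITY, m)))
    rhs = matmul(p_inv, add(IDENTITY, n))
    residual = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
    return n, residual


def worst_case(trials=500, seed=0):
    """Largest ||N|| / (4 ||M||) and largest residual over random M with ||M|| < 1/2."""
    rng = random.Random(seed)
    worst_ratio, worst_residual = 0.0, 0.0
    for _ in range(trials):
        nu = rng.choice((1, 2))
        # Each entry has modulus below 0.17 * sqrt(2) < 0.25, so every row sum is < 1/2.
        m = [[complex(rng.uniform(-0.17, 0.17), rng.uniform(-0.17, 0.17))
              for _ in range(2)] for _ in range(2)]
        n, residual = n_from_m(nu, m)
        worst_ratio = max(worst_ratio, norm(n) / (4 * norm(m)))
        worst_residual = max(worst_residual, residual)
    return worst_ratio, worst_residual


if __name__ == "__main__":
    print(worst_case())
```

In this norm the ratio returned by `worst_case` stays below one, in line with $\|N\| \le 4\|M\|$ whenever $\|M\| < \frac12$.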


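Differentiating the matrix identity $T T^{-1} = I$ and applying the chain rule we find that

\[ \frac{\partial^2 k_m}{\partial x_i\, \partial x_j} = -\sum_{l,p=1}^{2} \frac{\partial k_m}{\partial x_l} \left( \frac{\partial}{\partial x_i} \frac{\partial x_l}{\partial k_p} \right) \frac{\partial k_p}{\partial x_j} = -\sum_{l,p,r=1}^{2} \frac{\partial k_m}{\partial x_l}\, \frac{\partial^2 x_l}{\partial k_r\, \partial k_p}\, \frac{\partial k_r}{\partial x_i}\, \frac{\partial k_p}{\partial x_j}. \]

Furthermore, in view of the above calculations we have

\[ \left| \frac{\partial k_i}{\partial x_j} \right| \le \frac12 (1 + \|N\|) \le \frac12 (1 + 4\|M\|) < \frac12 \left( 1 + 4 \cdot \frac12 \right) = \frac32. \]

Thus,

\[ \left| \frac{\partial^2 k_m}{\partial x_i\, \partial x_j} \right| \le 4 \left( \frac32 \right)^3 \sup_{l,r,p} \left| \frac{\partial^2 x_l}{\partial k_r\, \partial k_p} \right|. \]

We now estimate

\[ \frac{\partial^2 x_1}{\partial k_i\, \partial k_j} = \frac{\partial z_1}{\partial k_i} \frac{\partial \beta_2^{(1)}}{\partial k_j} + z_1 \frac{\partial^2 \beta_2^{(1)}}{\partial k_i\, \partial k_j} + \frac{\partial z_1}{\partial k_j} \frac{\partial \beta_2^{(1)}}{\partial k_i} + \frac{\partial^2 g_1}{\partial k_i\, \partial k_j} \quad\text{and}\quad \frac{\partial^2 x_2}{\partial k_i\, \partial k_j}. \]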

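From (27) with $g$, $w$ and $z$ replaced by $g_1$, $w_1$ and $z_1$, respectively, we obtain

\[
\begin{aligned}
\frac{\partial^2 g_1}{\partial k_1^2} ={}& \frac{\partial^2 \beta_1}{\partial k_1^2} \frac{w_1^2}{z_1} + 2\, \frac{\partial \beta_1}{\partial k_1} \frac{2 w_1 z_1 - w_1^2}{z_1^2} + \beta_1 \frac{2 z_1^2 - 4 w_1 z_1 + 2 w_1^2}{z_1^3} \\
& + \left( \frac{\partial^2 \beta_2^{(2)}}{\partial k_1^2} + \frac{\partial^2 \beta_2^{(3)}}{\partial k_1^2} \right) z_1 + 2 \left( \frac{\partial \beta_2^{(2)}}{\partial k_1} + \frac{\partial \beta_2^{(3)}}{\partial k_1} \right) + \frac{\partial^2 \beta_3}{\partial k_1^2}\, w_1 + 2\, \frac{\partial \beta_3}{\partial k_1} \\
& + \frac{\partial^2 \beta_4}{\partial k_1^2} \frac{w_1}{z_1} + 2\, \frac{\partial \beta_4}{\partial k_1} \frac{z_1 - w_1}{z_1^2} + \beta_4 \frac{2(w_1 - z_1)}{z_1^3} \\
& + \frac{\partial^2 \beta_5}{\partial k_1^2} + \frac{\partial^2 \beta_6}{\partial k_1^2} \frac{1}{z_1} - 2\, \frac{\partial \beta_6}{\partial k_1} \frac{1}{z_1^2} + 2\, \frac{\beta_6}{z_1^3} + \frac{2 \hat q(0)}{z_1^3}.
\end{aligned}
\]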

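Hence, by Lemmas 3(ii), 6(ii) and 8(ii),

\[ \left| \frac{\partial^2 g_1}{\partial k_1^2} \right| \le C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}. \]

Similarly we prove that

\[ \left| \frac{\partial^2 g_l}{\partial k_i\, \partial k_j} \right| \le C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho} \]

for all $l, i, j \in \{1, 2\}$ because all the derivatives acting on $g_l$ are essentially the same up to constant factors (see [13]). Furthermore, again by Lemma 8(ii),

\[ \left| \frac{\partial \beta_2^{(1)}}{\partial k_j} \right| \le C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}, \qquad \left| \frac{\partial \eta_2^{(1)}}{\partial k_j} \right| \le C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}, \]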


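and

\[ \left| z_1(k)\, \frac{\partial^2 \beta_2^{(1)}(k)}{\partial k_i\, \partial k_j} \right| \le \frac{65}{\Lambda^3}\, \|(1, -i(-1)^{\nu'}) \cdot \hat A\|_{l^1}^2 < \frac{1}{5\Lambda^3}\,\varepsilon^2, \qquad \left| z_2(k)\, \frac{\partial^2 \eta_2^{(1)}(k)}{\partial k_i\, \partial k_j} \right| < \frac{1}{5\Lambda^3}\,\varepsilon^2. \]

Hence,

\[ \left| \frac{\partial^2 x_l}{\partial k_i\, \partial k_j} \right| \le \frac{1}{5\Lambda^3}\,\varepsilon^2 + C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}. \]

Therefore,

\[ \left| \frac{\partial^2 k_m}{\partial x_i\, \partial x_j} \right| \le 4 \left( \frac32 \right)^3 \sup_{l,r,p} \left| \frac{\partial^2 x_l}{\partial k_r\, \partial k_p} \right| \le \frac{3}{\Lambda^3}\,\varepsilon^2 + C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}. \]

This completes the proof of the proposition.

Acknowledgments

I would like to thank Professor Joel Feldman for suggesting this problem and for the many discussions I have had with him. I am also grateful to Alessandro Michelangeli for useful comments about the manuscript. This work is part of the author's Ph.D. thesis [13] defended at the University of British Columbia in Vancouver, Canada.

Appendix A. Quantitative Morse Lemma

Lemma 9 (Quantitative Morse Lemma [13]). Let $\delta$ be a constant with $0 < \delta < 1$ and assume that $f(x_1, x_2) = x_1 x_2 + r(x_1, x_2)$ is a holomorphic function on $D_\delta = \{(x_1, x_2) \in \mathbb{C}^2 \mid |x_1| \le \delta \text{ and } |x_2| \le \delta\}$. Suppose further that, for all $x \in D_\delta$ and $1 \le i \le 2$, the function $r$ satisfies

\[ \left| \frac{\partial r}{\partial x_i}(x) \right| \le a < \delta \quad\text{and}\quad \left\| \left[ \frac{\partial^2 r}{\partial x_i\, \partial x_j}(x) \right]_{i,j \in \{1,2\}} \right\| \le b < \frac{1}{55}, \]

where $a$ and $b$ are constants. Then $f$ has a unique critical point $\xi = (\xi_1, \xi_2) \in D_\delta$ with $|\xi_1| \le a$ and $|\xi_2| \le a$. Furthermore, let $s = \max\{|\xi_1|, |\xi_2|\}$. Then there is a biholomorphic map $\Phi$ from the domain $D_{(\delta - s)(1 - 19b)}$ to a neighbourhood of $\xi \in D_\delta$ that contains $\{(z_1, z_2) \in \mathbb{C}^2 \mid |z_i - \xi_i| < (\delta - s)(1 - 55b) \text{ for } 1 \le i \le 2\}$ such that

\[ (f \circ \Phi)(z_1, z_2) = z_1 z_2 + c, \]

where $c \in \mathbb{C}$ is a constant fulfilling $|c - r(0, 0)| \le a^2$. The differential $D\Phi$ obeys $\|D\Phi - I\| \le 18b$. If $\frac{\partial r}{\partial x_1}(0, 0) = 0$ and $\frac{\partial r}{\partial x_2}(0, 0) = 0$, then $\xi = 0$ and $s = 0$.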


Appendix B. Asymptotics for the Coefficients: Proofs

Proof of Proposition 11. We first derive a more general inequality and then we prove parts (i) and (ii). First observe that, if $k \in T_\mu(d') \setminus K_R$ then

\[ |v + (-1)^\mu (u + d')^\perp| = |N_{d',\mu}(k)| < \varepsilon < |v|. \]

Hence,

\[ |v| \le |2v - (v + (-1)^\mu (u + d')^\perp)| \le 3|v|. \]

But

\[ |2v - (v + (-1)^\mu (u + d')^\perp)| = |v - (-1)^\mu (u + d')^\perp| = |k_1 + d'_1 - i(-1)^\mu (k_2 + d'_2)| = |z_{\mu,d'}(k)|. \]

Therefore,

\[ \frac{1}{|z_{\mu,d'}(k)|} \le \frac{1}{|v|} \le \frac{3}{|z_{\mu,d'}(k)|}. \tag{44} \]

We now prove parts (i) and (ii).

(i) The first inequality of part (i) follows from the above estimate setting $(\mu, d') = (\nu, 0)$. To prove the second inequality observe that, since $|v| > R \ge 2\Lambda > 12\varepsilon$ by hypothesis and $|v| \le |z_{\nu,0}(k)|$ by (44), on the one hand we have

\[ \frac14 |v| \le \frac{11}{12} |v| = |v| - \frac{1}{12} |v| \le |v| - \frac16 \Lambda \le |v| - \varepsilon \le |z_{\nu,0}(k)| - |k_1 + i(-1)^\nu k_2| \le |z_{\nu,0}(k) - k_1 - i(-1)^\nu k_2| = 2|k_2|. \]

On the other hand, since $|z_{\nu,0}(k)| < 3|v|$ by (44),

\[ 2|k_2| = |2i(-1)^\nu k_2| = |k_1 + i(-1)^\nu k_2 - (k_1 - i(-1)^\nu k_2)| = |k_1 + i(-1)^\nu k_2 - z_{\nu,0}(k)| \le \varepsilon + 3|v| \le 4|v|. \]

Combining these estimates we obtain the second inequality of part (i).

(ii) Similarly, in view of (44), if $k \in T_\mu(d') \setminus K_R$ for $(\mu, d') \in \{(\nu, 0), (\nu', d)\}$ then

\[ \frac{1}{|z_{\nu,0}(k)|} \le \frac{1}{|v|} \le \frac{3}{|z_{\nu,0}(k)|} \quad\text{and}\quad \frac{1}{|z_{\nu',d}(k)|} \le \frac{1}{|v|} \le \frac{3}{|z_{\nu',d}(k)|}. \]

These are the first two inequalities of part (ii). Now, since

\[ z_{\nu',d}(k) = k_1 - i(-1)^{\nu'} k_2 + d_1 - i(-1)^{\nu'} d_2 = z_{\nu',0}(k) + d_1 - i(-1)^{\nu'} d_2 = w_{\nu,0}(k) + d_1 - i(-1)^{\nu'} d_2, \]

$|w_{\nu,0}(k)| < \varepsilon$, and $|d_1 - i(-1)^{\nu'} d_2| = |d|$, it follows that

\[ |z_{\nu',d}(k)| - \varepsilon \le |d| \le |z_{\nu',d}(k)| + \varepsilon. \]

Furthermore, by (44),

\[ \varepsilon < \frac{\Lambda}{6} \le \frac{|v|}{12} \le \frac{|z_{\nu',d}(k)|}{12}. \tag{45} \]


Thus,

\[ \frac12 |z_{\nu',d}(k)| \le |d| \le 2 |z_{\nu',d}(k)|. \]

This yields the third inequality of part (ii) and completes the proof.

Proof of Lemma 3. We consider all cases at the same time. Therefore, we have either hypothesis (i) with $(\mu, d') = (\nu, 0)$ or hypothesis (ii) with $(\mu, d') \in \{(\nu, 0), (\nu', d)\}$. Observe that either $(\nu, \nu') = (1, 2)$ or $(\nu, \nu') = (2, 1)$.

Step 1. Recall the change of variables (14) and set

1 1 G1 := b ∈ G |b − d | < R , G2 := b ∈ G |b − d | ≥ R . 4 4 Then G = G1 ∪ G2 and G1 , G2 ⊂ {b ∈ Γ# | |Nb (k)| ≥ ε|v|} by Proposition 7. Furthermore, by Proposition 11, for (µ, d ) = (ν, 0) if (i) or (µ, d ) ∈ {(ν, 0), (ν , d)} if (ii) we have |zµ,d | ≤ 3|v|. Thus, observing the deﬁnition of G2 , f (d − b) −1 (RG G )b,c g(c − d ) |R1 (k)| := Nb (k) b∈G1 c∈G2

≤

|c − d |2 1 −1 RG |f (d − b)| |g(c − d )| G ε|v| |c − d |2 b∈G1

≤

c∈G2

16 2 1 Cε,f,g −1 RG c g(c) l1 ≤ , G f l1 2 ε|v| R |zµ,d |R2

(46)

and similarly |R2 (k)| ≤ Hence,

Cε,f,g . |zµ,d |R2

(47)





   f (d − b) −1  (RG G )b,c g(c − d ) Φd ,d (k) =  + +  N (k)   b b,c∈G1

=

b∈G1 c∈G2

b∈G2 c∈G

f (d − b) −1 (RG G )b,c g(c − d ) + R1 (k) + R2 (k) Nb (k)

(48)

b,c∈G1

with |R1 (k) + R2 (k)| ≤

Cε,f,g . |zµ,d |R2

(49)

Now, if we set TG G := πG − RG G and recall the convergent series expansion −1 −1 = RG G = (πG − TG G )

∞ j=0

TGj G ,


we can write f (d − b) −1 (RG G )b,c g(c − d ) N (k) b

b,c∈G1

=

∞ f (d − b) j (TG G )b,c g(c − d ). N (k) b j=0

(50)

b,c∈G1

Note, the above equality is ﬁne because G1 is ﬁnite set. Let

1 1 G3 := b ∈ G |b − d | < R , G4 := b ∈ G |b − d | ≥ R . 2 2 Again, observe that G = G3 ∪ G4 . Thus, we can break TG G into TG G = πG T πG = (πG3 + πG4 )T (πG3 + πG4 ) = T33 + T43 + T34 + T44 , where Tij := πGi T πGj for i, j ∈ {3, 4}. Using this decomposition we prove the following. Proposition 16. Under the hypotheses of Lemma 3 we have ∞ f (d − b) j (TG G )b,c g(c − d ) N (k) b j=0 b,c∈G1

=

∞ f (d − b) j (T33 )b,c g(c − d ) + R3 (k) N (k) b j=0 b,c∈G1

with R3 (k) given by (75) and |R3 (k)| ≤

CΛ,f,g . |zµ,d |R2

(51)

This proposition will be proved below. Combining this with (48) and (50) we obtain Φ

d ,d

∞ 3 f (d − b) j (T33 )b,c g(c − d ) + (k) = Rj (k). Nb (k) j=0 j=1

(52)

b,c∈G1

Step 2. We now look in detail at the operator $T_{33}$ and its powers $T_{33}^j$. Recall that $\theta_\mu(b) = \frac12((-1)^\mu b_2 + i b_1)$ and set $\mu' := \mu - (-1)^\mu$ so that $(-1)^{\mu'} = -(-1)^\mu$. Then,

\[ N_b(k) = N_{b,\mu}(k)\, N_{b,\mu'}(k) = (w_{\mu,d'} - 2i\theta_{\mu'}(b - d'))(z_{\mu,d'} - 2i\theta_\mu(b - d')). \]

Extend the definition of $\theta_\mu(y)$ to any $y \in \mathbb{C}^2$. Thus,

\[ 2(k + d') \cdot \hat A(b - c) = -2i\theta_\mu(\hat A(b - c))\, w_{\mu,d'} - 2i\theta_{\mu'}(\hat A(b - c))\, z_{\mu,d'}. \]


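Hence,

\[
T_{b,c} = \frac{1}{N_c(k)}\bigl(2(c + k) \cdot \hat A(b - c) - \hat q(b - c)\bigr) = \frac{2(c - d') \cdot \hat A(b - c) - \hat q(b - c) + 2(k + d') \cdot \hat A(b - c)}{(w_{\mu,d'} - 2i\theta_{\mu'}(c - d'))(z_{\mu,d'} - 2i\theta_\mu(c - d'))} = X_{b,c} + Y_{b,c},
\tag{53}
\]

where

\[ X_{b,c} := \frac{2(c - d') \cdot \hat A(b - c) - \hat q(b - c) - 2i\theta_\mu(\hat A(b - c))\, w_{\mu,d'}}{(w_{\mu,d'} - 2i\theta_{\mu'}(c - d'))(z_{\mu,d'} - 2i\theta_\mu(c - d'))}, \tag{54} \]

\[ Y_{b,c} := \frac{-2i\theta_{\mu'}(\hat A(b - c))\, z_{\mu,d'}}{(w_{\mu,d'} - 2i\theta_{\mu'}(c - d'))(z_{\mu,d'} - 2i\theta_\mu(c - d'))}. \tag{55} \]

Let $X$ and $Y$ be the operators whose matrix elements are, respectively, $X_{b,c}$ and $Y_{b,c}$. Set

\[ X_{33} := \pi_{G_3} X \pi_{G_3} \quad\text{and}\quad Y_{33} := \pi_{G_3} Y \pi_{G_3}. \]

We next prove the following estimates,

\[ \|X_{33}\| \le \left( 20 \|\hat A\|_{l^1} + \frac{4}{\Lambda} \|\hat q\|_{l^1} \right) \frac{1}{|z_{\mu,d'}|_R} < \frac13, \qquad \|Y_{33}\| \le \frac{8}{\Lambda} \|\theta_{\mu'}(\hat A)\|_{l^1} < \frac{1}{14}, \tag{56} \]

where $|z_{\mu,d'}|_R := 2|z_{\mu,d'}| - R$. First observe that the "vector" $b \in \Gamma^\#$ has the same length as the complex number $2i\theta_\mu(b)$:

\[ |b| = |(b_1, b_2)| = |b_1 + i(-1)^\mu b_2| = |2i\theta_\mu(b)|. \tag{57} \]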
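The algebraic identities used in this step can be checked numerically. The following is a minimal sketch under two assumptions that are not restated in this part of the text: $N_b(k) = (k_1 + b_1)^2 + (k_2 + b_2)^2$, and the dot product $\cdot$ is bilinear (no complex conjugation). The function names are ours.

```python
import random


def sign(mu):
    """Return (-1)**mu for mu in {1, 2}."""
    return -1 if mu == 1 else 1


def other(mu):
    """mu' := mu - (-1)^mu, which maps 1 -> 2 and 2 -> 1."""
    return mu - sign(mu)


def theta(mu, y):
    """theta_mu(y) = ((-1)^mu * y2 + i * y1) / 2, for real or complex pairs y."""
    return 0.5 * (sign(mu) * y[1] + 1j * y[0])


def max_identity_error(trials=200, seed=0):
    """Largest deviation found in (57), the factorization of N_b, and the split of 2(k+d').A."""
    rng = random.Random(seed)

    def rc():
        return complex(rng.uniform(-3, 3), rng.uniform(-3, 3))

    def rr():
        return rng.uniform(-3, 3)

    worst = 0.0
    for _ in range(trials):
        mu = rng.choice((1, 2))
        mup, s = other(mu), sign(mu)
        k, a = (rc(), rc()), (rc(), rc())   # complex momentum and a complex vector A
        b, dp = (rr(), rr()), (rr(), rr())  # real lattice-like vectors b and d'
        w = k[0] + dp[0] + 1j * s * (k[1] + dp[1])
        z = k[0] + dp[0] - 1j * s * (k[1] + dp[1])
        y = (b[0] - dp[0], b[1] - dp[1])
        # (57): |b| = |2 i theta_mu(b)| for real b.
        e1 = abs(abs(2j * theta(mu, b)) - (b[0] ** 2 + b[1] ** 2) ** 0.5)
        # N_b(k) = (w - 2 i theta_mu'(b - d')) (z - 2 i theta_mu(b - d')).
        n_b = (k[0] + b[0]) ** 2 + (k[1] + b[1]) ** 2
        e2 = abs(n_b - (w - 2j * theta(mup, y)) * (z - 2j * theta(mu, y)))
        # 2 (k + d') . A = -2 i theta_mu(A) w - 2 i theta_mu'(A) z.
        lhs = 2 * ((k[0] + dp[0]) * a[0] + (k[1] + dp[1]) * a[1])
        rhs = -2j * theta(mu, a) * w - 2j * theta(mup, a) * z
        e3 = abs(lhs - rhs)
        worst = max(worst, e1, e2, e3)
    return worst


if __name__ == "__main__":
    print(max_identity_error())
```

The split of $2(k + d') \cdot \hat A$ into a $w$-part and a $z$-part is what separates $T_{b,c}$ into $X_{b,c}$ (which decays in $|z_{\mu,d'}|$) and $Y_{b,c}$ (which does not).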

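Thus, for $b \in G_3$,

\[ \frac{|2i\theta_\mu(b - d')|}{R} = \frac{|b - d'|}{R} < \frac12. \]

Consequently,

\[ \frac{1}{|z_{\mu,d'} - 2i\theta_\mu(b - d')|} \le \frac{1}{|z_{\mu,d'}| - |2i\theta_\mu(b - d')|} < \frac{1}{|z_{\mu,d'}| - \frac12 R} = \frac{2}{|z_{\mu,d'}|_R}. \tag{58} \]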


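Furthermore, for $b \in G'$,

\[ \frac{1}{|w_{\mu,d'} - 2i\theta_{\mu'}(b - d')|} \le \frac{1}{|b - d'| - |w_{\mu,d'}|} \le \frac{1}{|b - d'| - \varepsilon} \tag{59} \]

\[ \le \frac{1}{2\Lambda - \Lambda} = \frac{1}{\Lambda}. \tag{60} \]

Here we have used that $|w_{\mu,d'}| < \varepsilon < \Lambda$ and $|b - d'| \ge 2\Lambda$ for all $b \in G'$. Using again that $\varepsilon < \Lambda \le |c - d'|/2$ for all $c \in G'$ we have

\[ \frac{|c - d'|}{|c - d'| - \varepsilon} < 2. \tag{61} \]

Finally recall that

\[ \frac{\varepsilon}{\Lambda} < \frac16 \quad\text{and}\quad \frac{1}{|z_{\mu,d'}|} \le \frac{1}{|v|} < \frac{1}{R}, \tag{62} \]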

where the last inequality follows from Proposition 11 since |v| > R by hypothesis. Then, using the above inequalities and Proposition 5, the bounds (56) for X33 and Y33 follow from the estimates      sup  |Xb,c | ≤  sup  + sup + sup c∈G3

b∈G3

b∈G3

c∈G3

c∈G3

b∈G3

b∈G3

c∈G3

ˆ − c)| + |ˆ ˆ − c))| |wµ,d | q (b − c)| + |2iθµ (A(b 2|c − d | |A(b |wµ,d − 2iθµ (c − d )| |zµ,d − 2iθµ (c − d )|   2  sup  ≤ + sup |zµ,d |R c∈G3 b∈G3

×

!

b∈G3

c∈G3

b∈G3

c∈G3

" √ ˆ − c)| ˆ − c)| 2|c − d | |A(b |ˆ q (b − c)| + ε 2 |A(b × + |wµ,d − 2iθµ (c − d )| |wµ,d − 2iθµ (c − d )|   2  sup  ≤ + sup |zµ,d |R c∈G3 b∈G3 " √ ˆ − c)| |ˆ ˆ − c)| q (b − c)| + ε 2 |A(b 2|c − d | |A(b + × |c − d | − ε Λ " " !! √ 1 2 ε 2 ˆ q l ˆ l1 + ≤ A 2 4+ |zµ,d |R Λ Λ & ' 1 q l1 ˆ l1 + 4 ˆ ≤ 20 A Λ |zµ,d |R & ' 1 1 1 q l1 1 ˆ l1 + 4 ˆ < + = ≤ 20 A Λ R 7 4 3 !


and similarly

\[ \left( \sup_{c \in G_3} \sum_{b \in G_3} + \sup_{b \in G_3} \sum_{c \in G_3} \right) |Y_{b,c}| \le \frac{8}{\Lambda} \|\theta_{\mu'}(\hat A)\|_{l^1} < \frac{1}{14}. \]

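Step 3. We now look in detail at $T_{33}^j$. For each integer $j \ge 1$ write

\[ T_{33}^j = (X_{33} + Y_{33})^j = Z_j + W_j + Y_{33}^j, \tag{63} \]

where $W_j$ is the sum of the $j$ terms containing only one factor $X_{33}$ and $j - 1$ factors $Y_{33}$,

\[ W_j := \sum_{m=1}^{j} (Y_{33})^{m-1} X_{33} (Y_{33})^{j-m}, \qquad Z_j := (X_{33} + Y_{33})^j - W_j - Y_{33}^j. \]

In view of (56) we have

\[
\begin{aligned}
\|Y_{33}\|^j &\le \left( \frac{1}{14} \right)^j, \\
\|W_j\| &\le j\, \|X_{33}\|\, \|Y_{33}\|^{j-1} \le j\, \frac{C_{\Lambda,A,q}}{|z_{\mu,d'}|_R} \left( \frac{1}{14} \right)^{j-1}, \\
\|Z_j\| &\le (2^j - j - 1)\, \|X_{33}\|^2 \left( \frac13 \right)^{j-2} \le \frac{C_{\Lambda,A,q}}{|z_{\mu,d'}|_R^2} \left( \frac23 \right)^j.
\end{aligned}
\]

Hence, the series

\[ S := \sum_{j=0}^{\infty} Y_{33}^j = (I - Y_{33})^{-1}, \qquad W := \sum_{j=1}^{\infty} W_j \qquad\text{and}\qquad Z := \sum_{j=2}^{\infty} Z_j \tag{64} \]

converge, and the operator norms of $W$ and $Z$ decay with respect to $|z_{\mu,d'}|$. Indeed,

\[
\begin{aligned}
\|S\| &\le \sum_{j=0}^{\infty} \|Y_{33}\|^j \le \sum_{j=0}^{\infty} \left( \frac{1}{14} \right)^j < C, \\
\|W\| &\le \sum_{j=1}^{\infty} \|W_j\| \le \frac{C_{\Lambda,A,q}}{2|z_{\mu,d'}| - R} \sum_{j=1}^{\infty} j \left( \frac{1}{14} \right)^{j-1} < \frac{C_{\Lambda,A,q}}{|z_{\mu,d'}|_R}, \\
\|Z\| &\le \sum_{j=2}^{\infty} \|Z_j\| \le \frac{C_{\Lambda,A,q}}{|z_{\mu,d'}|_R^2} \sum_{j=2}^{\infty} \left( \frac23 \right)^j \le \frac{C_{\Lambda,A,q}}{|z_{\mu,d'}|_R^2}.
\end{aligned}
\]

Thus, we have the expansion

\[ \sum_{j=0}^{\infty} T_{33}^j = S + W + Z. \]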
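For reference, the three numerical series appearing in these bounds have closed forms. They are standard geometric-series identities and are included only to make the absorbed constants explicit.

```latex
\[
\sum_{j=0}^{\infty} \Bigl(\frac{1}{14}\Bigr)^{j} = \frac{1}{1 - \frac{1}{14}} = \frac{14}{13},
\qquad
\sum_{j=1}^{\infty} j \Bigl(\frac{1}{14}\Bigr)^{j-1} = \frac{1}{\bigl(1 - \frac{1}{14}\bigr)^{2}} = \frac{196}{169},
\qquad
\sum_{j=2}^{\infty} \Bigl(\frac{2}{3}\Bigr)^{j} = \frac{(2/3)^{2}}{1 - \frac{2}{3}} = \frac{4}{3}.
\]
```

The count $2^j - j - 1$ in the bound on $\|Z_j\|$ is the number of words of length $j$ in $X_{33}$ and $Y_{33}$ that contain at least two factors $X_{33}$, and $2^j (1/3)^{j-2} = 9\,(2/3)^j$.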


Step 4. Consequently, ∞ f (d − b) j (T33 )b,c g(c − d ) N (k) b j=0 b,c∈G1

=

b,c∈G1

f (d − b) (S + W + Z)b,c g(c − d ) (wµ,d − 2iθµ (b − d ))(zµ,d − 2iθµ (b − d ))

(1)

(2)

= αµ,d + αµ,d + R4 , where

(1)

αµ,d (k) :=

b,c∈G1 (2) αµ,d (k)

:=

b,c∈G1

and R4 (k) :=

b,c∈G1

(65)

f (d − b) Sb,c (k)g(c − d ) , (wµ,d (k) − 2iθµ (b − d ))(zµ,d (k) − 2iθµ (b − d )) f (d − b) Wb,c (k)g(c − d ) (wµ,d (k) − 2iθµ (b − d ))(zµ,d (k) − 2iθµ (b − d ))

f (d − b) Zb,c (k)g(c − d ) . (wµ,d (k) − 2iθµ (b − d ))(zµ,d (k) − 2iθµ (b − d ))

(66)

(67)

By a short calculation as in (74), using (58) and (60) we ﬁnd that (1)

|αµ,d (k)| ≤ (2)

|αµ,d (k)| ≤ |R4 (k)| ≤

2 1 CΛ,f,g f l1 g l1 S ≤ , Λ 2|zµ,d | − R |zµ,d |R 2 1 CΛ,A,q,f,g f l1 g l1 W ≤ , Λ 2|zµ,d | − R |zµ,d |2R

(68)

2 1 CΛ,A,q,f,g f l1 g l1 Z ≤ . Λ 2|zµ,d | − R |zµ,d |3R

Hence, recalling (52) we conclude that (1)

(2)

(3)

Φd ,d = αµ,d + αµ,d + αµ,d , where (3) αµ,d (k)

:=

4

Rj (k).

(69)

j=1

Furthermore, in view of (49), (51) and (68), since 1 |zµ,d |3R

=

1 1 < , 3 (2|zµ,d | − R) |zµ,d |R2

for 1 ≤ j ≤ 2 we have (j)

|αµ,d (k)| ≤

Cj |zµ,d (k)|jR

(3)

and |αµ,d (k)| ≤

C3 , |zµ,d (k)|R2

where Cj = Cj;Λ,A,q,f,g and C3 = C3;ε,Λ,A,q,f,g are constants. This proves the main statement of the lemma. Finally observe that, since G3 is a ﬁnite set, the matrices


$X_{33}$ and $Y_{33}$ are analytic in $k$ because their matrix elements are analytic functions of $k$. (Note that the functions $w_{\mu,d'}(k)$ and $z_{\mu,d'}(k)$ are analytic.) Consequently, the matrices $W_j$ and $Z_j$ are also analytic, and so are $S_{b,c}$, $W_{b,c}$ and $Z_{b,c}$ because the series (64) converge uniformly with respect to $k$. Thus, all the functions $\alpha^{(j)}_{\mu,d'}(k)$ are analytic in the region under consideration. This completes the proof of the lemma.

Proof of Proposition 16. Step 1. Recall that $T_{G'G'} = T_{33} + T_{34} + T_{43} + T_{44}$ with $T_{ij} = \pi_{G'_i} T \pi_{G'_j}$, and set $X^{(0)}_{33} := 0$, $Y^{(0)}_{34} := T_{34}$, $W^{(0)}_{43} := T_{43}$, and $Z^{(0)}_{44} := T_{44}$. It is straightforward to verify that, for any integer $j \ge 0$,
\[
T^{j+1}_{G'G'} = T^{j+1}_{33} + X^{(j)}_{33} + Y^{(j)}_{34} + W^{(j)}_{43} + Z^{(j)}_{44}, \tag{70}
\]
where
\[
\begin{aligned}
X^{(j)}_{33} &:= T_{33} X^{(j-1)}_{33} + T_{34} W^{(j-1)}_{43} &&: L^2_{G'_3} \to L^2_{G'_3}, \\
Y^{(j)}_{34} &:= T_{33} Y^{(j-1)}_{34} + T_{34} Z^{(j-1)}_{44} &&: L^2_{G'_3} \to L^2_{G'_4}, \\
W^{(j)}_{43} &:= T_{43} T^{j}_{33} + T_{43} X^{(j-1)}_{33} + T_{44} W^{(j-1)}_{43} &&: L^2_{G'_4} \to L^2_{G'_3}, \\
Z^{(j)}_{44} &:= T_{43} Y^{(j-1)}_{34} + T_{44} Z^{(j-1)}_{44} &&: L^2_{G'_4} \to L^2_{G'_4}.
\end{aligned} \tag{71}
\]
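The block recursion (70)-(71) is purely algebraic, so it can be sanity-checked on any matrix split into two index sets. The following is a minimal sketch under stated assumptions: the 5×5 test matrix `T`, the index sets `G3` and `G4`, and all helper names are hypothetical stand-ins chosen for illustration, not objects from the paper.

```python
# Sanity check of the block recursion (70)-(71) on an arbitrary test matrix.
# The index sets G3, G4 and the matrix T are illustrative stand-ins only.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][x] * B[x][j] for x in range(n)) for j in range(n)]
            for i in range(n)]

def madd(*mats):
    n = len(mats[0])
    return [[sum(M[i][j] for M in mats) for j in range(n)] for i in range(n)]

def block(T, rows, cols):
    """Return pi_rows T pi_cols, i.e. T with all other entries set to zero."""
    n = len(T)
    return [[T[i][j] if (i in rows and j in cols) else 0.0 for j in range(n)]
            for i in range(n)]

n = 5
G3, G4 = {0, 1, 2}, {3, 4}
T = [[((3 * i + 5 * j) % 7 - 3) / 10 for j in range(n)] for i in range(n)]
T33, T34 = block(T, G3, G3), block(T, G3, G4)
T43, T44 = block(T, G4, G3), block(T, G4, G4)

zero = [[0.0] * n for _ in range(n)]
X, Y, W, Z = zero, T34, T43, T44   # X^(0), Y^(0), W^(0), Z^(0)
P33, Tpow = T33, T                 # T33^(j+1) and T^(j+1) for j = 0

max_dev = 0.0
for j in range(1, 7):
    # Right-hand sides use the level-(j-1) blocks and P33 = T33^j.
    X, Y, W, Z = (
        madd(matmul(T33, X), matmul(T34, W)),
        madd(matmul(T33, Y), matmul(T34, Z)),
        madd(matmul(T43, P33), matmul(T43, X), matmul(T44, W)),
        madd(matmul(T43, Y), matmul(T44, Z)),
    )
    P33, Tpow = matmul(T33, P33), matmul(T, Tpow)
    rhs = madd(P33, X, Y, W, Z)
    dev = max(abs(Tpow[r][c] - rhs[r][c]) for r in range(n) for c in range(n))
    max_dev = max(max_dev, dev)

print(max_dev)  # deviation between T^(j+1) and the right-hand side of (70)
```

The check only confirms the algebra of (70)-(71); it says nothing about the norm bounds, which depend on the specific operator $T$ of the paper.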

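Step 2. Since $\pi_{G'_1}\pi_{G'_4} = \pi_{G'_4}\pi_{G'_1} = 0$ and $\pi_{G'_1}\pi_{G'_3} = \pi_{G'_3}\pi_{G'_1} = \pi_{G'_1}$, substituting (70) into the sum below for the terms where $j \ge 1$ we have, recalling that $X^{(0)}_{33} = 0$,
\[
\begin{aligned}
\sum_{b,c\in G'_1} \frac{f(d'-b)}{N_b(k)} \Big[\sum_{j=0}^{\infty} T^{j}_{G'G'}\Big]_{b,c} g(c-d'')
&= \sum_{b,c\in G'_1} \frac{f(d'-b)}{N_b(k)} \Big[\sum_{j=0}^{\infty} T^{j}_{33}\Big]_{b,c} g(c-d'') \\
&\quad + \sum_{b,c\in G'_1} \frac{f(d'-b)}{N_b(k)} \Big[\sum_{j=1}^{\infty} X^{(j)}_{33}\Big]_{b,c} g(c-d'').
\end{aligned} \tag{72}
\]
Now recall from (58) and (60) that, for all $b \in G'_3$,
\[
\frac{1}{|N_b(k)|} \le \frac{1}{\Lambda}\,\frac{2}{|z_{\mu,d'}|_R}, \tag{73}
\]
and observe that $G'_1 \subset G'_3$. Let $M$ be either $T_{G'G'}$ or $T_{33}$. Then, the estimate
\[
\begin{aligned}
\Big| \sum_{b,c\in G'_1} \frac{f(d'-b)}{N_b(k)} (M^j)_{b,c}\, g(c-d'') \Big|
&= \Big| \Big\langle \sum_{b\in G'_1} \frac{f(d'-b)}{N_b(k)} \frac{e^{ib\cdot x}}{|\Gamma|^{1/2}},\; M^j \sum_{c\in G'_1} g(c-d'') \frac{e^{ic\cdot x}}{|\Gamma|^{1/2}} \Big\rangle \Big| \\
&\le \frac{1}{\Lambda}\,\frac{2}{|z_{\mu,d'}|_R}\, \|f\|_{l^1}\|g\|_{l^1}\|M\|^{j}
\end{aligned} \tag{74}
\]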

September 14, J070-S0129055X10004107

2010 13:29 WSPC/S0129-055X

148-RMP

Asymptotics for Fermi Curves: Small Magnetic Potential

929

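implies that the left-hand side and the first term on the right-hand side of (72) converge because $\|M\| < 17/18$. Thus, the last term in (72) also converges. Hence, we are left to show that
\[
R_3(k) := \sum_{b,c\in G'_1} \frac{f(d'-b)}{N_b(k)} \Big[\sum_{j=1}^{\infty} X^{(j)}_{33}\Big]_{b,c} g(c-d'') \tag{75}
\]
obeys
\[
|R_3(k)| \le \frac{C_{\Lambda,f,g}}{|z_{\mu,d'}|\,R^{2}}.
\]
In order to do this we need the following inequality, which we prove later.

Proposition 17. Consider a constant $\beta \ge 0$ and suppose that $\|(1+|b|^{\beta})\hat q(b)\|_{l^1} < \infty$ and $\|(1+|b|^{\beta})\hat A(b)\|_{l^1} < 2\varepsilon/63$. Suppose further that $|v| > \frac{2}{\varepsilon}\|(1+|b|^{\beta})\hat q(b)\|_{l^1}$. Then, for any $B, C \subset G'$ and $m \ge 1$,
\[
\|\pi_B T^{m}_{G'G'} \pi_C\| \le \big(1 + (2\Lambda)^{\beta - \lceil\beta\rceil}\, \lceil\beta\rceil\, m^{\lceil\beta\rceil - 1}\big) \Big(\frac{17}{18}\Big)^{m} \sup_{b\in B,\, c\in C} \frac{1}{1 + |b-c|^{\beta}},
\]
where $\lceil\beta\rceil$ is the smallest integer greater than or equal to $\beta$.

Step 3. Now observe that, if $b \in G'_1$ and $c \in G'_4$, then
\[
|b - c| = |b - d' - (c - d')| \ge |c - d'| - |b - d'| \ge \frac{R}{2} - \frac{R}{4} = \frac{R}{4}.
\]
Thus, applying the last proposition with $\beta = 2$ and recalling that $G'_3 \subset G'$, for $m \ge 0$ we have
\[
\|\pi_{G'_1} T^{m}_{33} T_{34}\| \le \|\pi_{G'_1} T^{m}_{G'G'} T_{G'G'_4}\| = \|\pi_{G'_1} T^{m+1}_{G'G'} \pi_{G'_4}\| \le \frac{3(m+1)}{1 + \frac{1}{16}R^{2}} \Big(\frac{17}{18}\Big)^{m+1}.
\]
Furthermore, since $\pi_{G'_4}\pi_{G'_3} = \pi_{G'_4}\pi_{G'_1} = 0$ and $\pi_{G'_3}\pi_{G'_1} = \pi_{G'_1}$, from (70) we obtain
\[
W^{(j)}_{43} \pi_{G'_1} = \pi_{G'_4} T^{j+1}_{G'G'} \pi_{G'_3} \pi_{G'_1} = \pi_{G'_4} T^{j+1}_{G'G'} \pi_{G'_1}.
\]
Hence,
\[
\|W^{(j)}_{43} \pi_{G'_1}\| = \|\pi_{G'_4} T^{j+1}_{G'G'} \pi_{G'_1}\| \le \|T_{G'G'}\|^{j+1} < \Big(\frac{17}{18}\Big)^{j+1}.
\]
Therefore, for $0 \le m < j$,
\[
\|\pi_{G'_1} T^{m}_{33} T_{34} W^{(j-m-1)}_{43} \pi_{G'_1}\| \le \|\pi_{G'_1} T^{m}_{33} T_{34}\|\, \|W^{(j-m-1)}_{43} \pi_{G'_1}\| \le \frac{3(m+1)}{1 + \frac{1}{16}R^{2}} \Big(\frac{17}{18}\Big)^{j+1}.
\]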


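Iterating the first expression in (71) we find that
\[
\begin{aligned}
X^{(j)}_{33} &= T_{34} W^{(j-1)}_{43} + T_{33} X^{(j-1)}_{33} \\
&= T_{34} W^{(j-1)}_{43} + T_{33} T_{34} W^{(j-2)}_{43} + T^{2}_{33} X^{(j-2)}_{33} \\
&\;\;\vdots \\
&= T_{34} W^{(j-1)}_{43} + T_{33} T_{34} W^{(j-2)}_{43} + \cdots + T^{j-2}_{33} T_{34} W^{(1)}_{43} + T^{j-1}_{33} T_{34} W^{(0)}_{43} \\
&= \sum_{m=0}^{j-1} T^{m}_{33} T_{34} W^{(j-m-1)}_{43}.
\end{aligned} \tag{76}
\]
Thus, using the above inequality,
\[
\begin{aligned}
\|\pi_{G'_1} X^{(j)}_{33} \pi_{G'_1}\|
&= \Big\| \sum_{m=0}^{j-1} \pi_{G'_1} T^{m}_{33} T_{34} W^{(j-m-1)}_{43} \pi_{G'_1} \Big\|
\le \sum_{m=0}^{j-1} \|\pi_{G'_1} T^{m}_{33} T_{34} W^{(j-m-1)}_{43} \pi_{G'_1}\| \\
&\le \frac{3}{1 + \frac{1}{16}R^{2}} \Big(\frac{17}{18}\Big)^{j+1} \sum_{m=0}^{j-1} (m+1)
= \frac{3}{2 + \frac{1}{8}R^{2}}\, (j^{2} + j) \Big(\frac{17}{18}\Big)^{j+1}.
\end{aligned}
\]
Consequently,
\[
\Big\| \pi_{G'_1} \Big( \sum_{j=1}^{\infty} X^{(j)}_{33} \Big) \pi_{G'_1} \Big\|
\le \sum_{j=1}^{\infty} \|\pi_{G'_1} X^{(j)}_{33} \pi_{G'_1}\|
\le \frac{3}{2 + \frac{1}{8}R^{2}} \sum_{j=1}^{\infty} (j^{2} + j) \Big(\frac{17}{18}\Big)^{j+1}
\le \frac{C}{R^{2}},
\]
where $C$ is a universal constant. Finally, using this and (73), since $|z_{\mu,d'}| \le 3|v|$ we have
\[
|R_3(k)| = \Big| \sum_{b,c\in G'_1} \frac{f(d'-b)}{N_b(k)} \Big[ \sum_{j=1}^{\infty} X^{(j)}_{33} \Big]_{b,c} g(c-d'') \Big|
\le \frac{6C}{\Lambda}\, \|f\|_{l^1}\|g\|_{l^1}\, \frac{1}{|z_{\mu,d'}|\,R^{2}}.
\]
In view of (72) and (75) this completes the proof.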
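The two elementary facts used in the last estimates, namely $\sum_{m=0}^{j-1}(m+1) = (j^2+j)/2$ and the convergence of $\sum_{j\ge1}(j^2+j)(17/18)^{j+1}$, can be checked directly. The sketch below is illustrative only; the function names are hypothetical, and the closed form $\sum_{j\ge1}(j^2+j)x^{j+1} = 2x^2/(1-x)^3$ is the standard power-series identity obtained by differentiating $\sum_{j\ge1} x^{j+1} = x^2/(1-x)$ twice.

```python
from fractions import Fraction

def inner_sum(j):
    """sum_{m=0}^{j-1} (m + 1), the sum appearing in the bound on X33^(j)."""
    return sum(m + 1 for m in range(j))

def series_closed_form(x):
    """sum_{j>=1} (j^2 + j) x^(j+1) = 2 x^2 / (1 - x)^3, valid for |x| < 1."""
    return 2 * x ** 2 / (1 - x) ** 3

def series_partial(x, terms):
    """Partial sum of the same series, used as an independent numerical check."""
    return sum((j * j + j) * x ** (j + 1) for j in range(1, terms + 1))

x = Fraction(17, 18)
total = series_closed_form(x)             # exact rational value of the series
partial = series_partial(float(x), 3000)  # floating-point partial sum

print(total, partial)
```

Since $3/(2 + R^2/8) = 24/(16 + R^2) \le 24/R^2$, the exact value of the series shows that one admissible (certainly not optimal) choice of the universal constant is $C = 24 \cdot 10404 = 249696$.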


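Proof of Proposition 17. For any $b, c \in \Gamma^{\#}$ set $Q_{b,c} := (1 + |b-c|^{\beta})\,T_{b,c}$. We first claim that, for any $B, C \subset G'$,
\[
\sup_{b\in B} \sum_{c\in C} |Q_{b,c}| < \frac{17}{18} \qquad\text{and}\qquad \sup_{c\in C} \sum_{b\in B} |Q_{b,c}| < \frac{17}{18}. \tag{77}
\]
In fact, using the bounds (11), (12) and $|k| \le 3|v|$, it follows that
\[
\begin{aligned}
\sup_{b\in B} \sum_{c\in C} |Q_{b,c}|
&= \sup_{b\in B} \sum_{c\in C} (1 + |b-c|^{\beta}) \Big| \frac{\hat q(b-c)}{N_c(k)} - \frac{2c\cdot\hat A(b-c)}{N_c(k)} - \frac{2k\cdot\hat A(b-c)}{N_c(k)} \Big| \\
&\le \|(1+|b|^{\beta})\hat q(b)\|_{l^1}\, \frac{1}{\varepsilon|v|} + \|(1+|b|^{\beta})\hat A(b)\|_{l^1}\, \frac{14}{\varepsilon}
< \frac{1}{2} + \frac{4}{9} = \frac{17}{18},
\end{aligned}
\]
and similarly we prove the second bound in (77). Furthermore, since $|T_{b,c}| \le |Q_{b,c}|$ for all $b, c \in \Gamma^{\#}$, for any integer $m \ge 1$ we have
\[
\sup_{b\in B} \sum_{c\in C} |(T^{m}_{BC})_{b,c}| < \Big(\frac{17}{18}\Big)^{m} \qquad\text{and}\qquad \sup_{c\in C} \sum_{b\in B} |(T^{m}_{BC})_{b,c}| < \Big(\frac{17}{18}\Big)^{m}.
\]
Now, let $p$ be the smallest integer greater than or equal to $\beta$, and for any integer $m \ge 1$ and any $\xi_0, \xi_1, \dots, \xi_m \in \Gamma^{\#}$, let $b = \xi_0$ and $c = \xi_m$. Then, by the triangle inequality $|b - c| \le \sum_{i=1}^{m} |\xi_{i-1} - \xi_i|$,
\[
\begin{aligned}
|b-c|^{\beta} &= (2\Lambda)^{\beta} \Big(\frac{|b-c|}{2\Lambda}\Big)^{\beta} \le (2\Lambda)^{\beta} \Big(\frac{|b-c|}{2\Lambda}\Big)^{p} = \frac{(2\Lambda)^{\beta}}{(2\Lambda)^{p}}\, |b-c|^{p} \\
&\le (2\Lambda)^{\beta-p} \sum_{i_1,\dots,i_p=1}^{m} |\xi_{i_1-1} - \xi_{i_1}| \cdots |\xi_{i_p-1} - \xi_{i_p}| \\
&\le (2\Lambda)^{\beta-p} \sum_{i_1,\dots,i_p=1}^{m} \big( |\xi_{i_1-1} - \xi_{i_1}|^{p} + \cdots + |\xi_{i_p-1} - \xi_{i_p}|^{p} \big) \\
&= (2\Lambda)^{\beta-p}\, p\, m^{p-1} \sum_{i=1}^{m} |\xi_{i-1} - \xi_i|^{p} \\
&\le (2\Lambda)^{\beta-p}\, p\, m^{p-1} \prod_{i=1}^{m} \big(1 + |\xi_{i-1} - \xi_i|^{p}\big).
\end{aligned} \tag{78}
\]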

To simplify the notation write s := supb∈B, c∈C sup

b∈B c∈C

1 . 1+|b−c|β

|(TGm G )b,c |

≤ sup b∈B c∈C

1 sup (1 + |b − c|β )|(TGm G )b,c | 1 + |b − c|β b∈B c∈C

Hence,


 ≤ s sup

b∈B c∈C

×

|(TGm G )b,c | + (2Λ)β−p p mp−1 sup

b∈B ξ ∈G 1

(1 + |ξ1 − ξ2 |2 )|Tξ1 ,ξ2 | · · ·

ξ2 ∈G

(1 + |b − ξ1 |β )|Tb,ξ1 | 

(1 + |ξm−1 − c|2 )|Tξm−1 ,c |

c∈C

 m 17 + (2Λ)β−p p mp−1 sup (1 + |b − ξ1 |2 )|Tb,ξ1 | ≤ s 18 b∈B ξ1 ∈G

× sup

ξ1 ∈G ξ ∈G 2

(1+|ξ1 − ξ2 |2 )|Tξ1 ,ξ2 | · · ·

≤ s (1 + (2Λ)β−p p mp−1 )

17 18

sup

ξm−1 ∈G c∈C

 (1 + |ξm−1 − c|2 )|Tξm−1 ,c |

m ,

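and similarly we prove the other inequality. Therefore, by Proposition 5,
\[
\|\pi_B T^{m}_{G'G'} \pi_C\| \le \big(1 + (2\Lambda)^{\beta - \lceil\beta\rceil}\, \lceil\beta\rceil\, m^{\lceil\beta\rceil - 1}\big) \Big(\frac{17}{18}\Big)^{m} \sup_{b\in B,\, c\in C} \frac{1}{1 + |b-c|^{\beta}},
\]
where $\lceil\beta\rceil$ is the smallest integer greater than or equal to $\beta$. This is the desired estimate.

Proof of Lemma 4. To simplify the notation write $w = w_{\mu,d'}$, $z = z_{\mu,d'}$, and $|z|_R = 2|z| - R$. First observe that
\[
\frac{1}{w - 2i\theta_\mu(c-d')} = \frac{-1}{2i\theta_\mu(c-d')} + \frac{w}{2i\theta_\mu(c-d')\,(w - 2i\theta_\mu(c-d'))},
\]
so that
\[
\begin{aligned}
\frac{z}{N_c(k)} &= \frac{-1}{2i\theta_\mu(c-d')} + \frac{w}{2i\theta_\mu(c-d')\,(w - 2i\theta_\mu(c-d'))} + \frac{1}{w - 2i\theta_\mu(c-d')}\, \frac{2i\theta_\mu(c-d')}{z - 2i\theta_\mu(c-d')} \\
&=: \eta^{(0)}_c + \eta^{(w)}_c + \eta^{(z)}_c,
\end{aligned}
\]
where, in view of (58) to (61), since $|w| < \varepsilon$,
\[
|\eta^{(0)}_c| \le \frac{1}{2\Lambda}, \qquad |\eta^{(w)}_c| \le \frac{\varepsilon}{2\Lambda^{2}} \qquad\text{and}\qquad |\eta^{(z)}_c| \le \frac{4}{|z|_R}.
\]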
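The splitting of $z/N_c(k)$ into three terms is a partial-fraction identity, assuming (as the displayed formulas indicate) that $N_c(k) = (w - a)(z - a)$ with $a = 2i\theta_\mu(c - d')$. It can be verified numerically for arbitrary complex values. In this minimal sketch the function name `eta_terms` and the sample values of `w`, `z`, `a` are hypothetical and chosen only so that no denominator vanishes.

```python
def eta_terms(w, z, a):
    """Split z / ((w - a)(z - a)) into eta0 + eta_w + eta_z.

    Here a stands for 2i*theta_mu(c - d'); w and z play the roles of
    w_{mu,d'}(k) and z_{mu,d'}(k).
    """
    eta0 = -1 / a
    eta_w = w / (a * (w - a))
    eta_z = a / ((w - a) * (z - a))
    return eta0, eta_w, eta_z

# Arbitrary sample values (hypothetical, for illustration only).
w = 0.05 + 0.02j
z = 7.0 - 3.0j
a = 2j * (1.5 - 0.5j)

lhs = z / ((w - a) * (z - a))
rhs = sum(eta_terms(w, z, a))
print(abs(lhs - rhs))
```

The first two terms recombine to $1/(w-a)$, and the third accounts for $z/(z-a) = 1 + a/(z-a)$, which is why only $\eta^{(z)}_c$ carries the decay in $|z|_R$.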


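Hence,
\[
\begin{aligned}
Y_{b,c} = \frac{-2i\theta_\mu(\hat A(b-c))\, z}{N_c(k)}
&= -2i\theta_\mu(\hat A(b-c))\,\eta^{(0)}_c - 2i\theta_\mu(\hat A(b-c))\,\eta^{(w)}_c - 2i\theta_\mu(\hat A(b-c))\,\eta^{(z)}_c \\
&=: Y^{(0)}_{b,c} + Y^{(w)}_{b,c} + Y^{(z)}_{b,c}.
\end{aligned}
\]
Let $Y^{(\cdot)}$ be the operator whose matrix elements are $Y^{(\cdot)}_{b,c}$ and set $Y^{(\cdot)}_{33} := \pi_{G'_3} Y^{(\cdot)} \pi_{G'_3}$. Then, similarly as we estimated $\|Y_{33}\|$, using (58) to (61) and Proposition 5, it follows easily that
\[
\|Y^{(0)}_{33}\| \le \frac{1}{2\Lambda} \|\theta_\mu(\hat A)\|_{l^1}, \qquad
\|Y^{(w)}_{33}\| \le \frac{\varepsilon}{2\Lambda^{2}} \|\theta_\mu(\hat A)\|_{l^1}, \qquad
\|Y^{(z)}_{33}\| \le \frac{4}{|z|_R} \|\theta_\mu(\hat A)\|_{l^1}.
\]
Furthermore,
\[
\begin{aligned}
S = (I - Y_{33})^{-1} &= 1 + (1 - Y_{33})^{-1} Y_{33} = 1 + S Y_{33} \\
&= 1 + (1 + S Y_{33}) Y_{33} = 1 + Y^{(0)}_{33} + Y^{(w)}_{33} + Y^{(z)}_{33} + S Y^{2}_{33},
\end{aligned}
\]
where, recalling (56),
\[
\|S Y^{2}_{33}\| \le \|(1 - Y_{33})^{-1}\|\, \|Y_{33}\|^{2} \le \frac{\|Y_{33}\|^{2}}{1 - \|Y_{33}\|} < \frac{14}{13} \|Y_{33}\|^{2} \le \Big(\frac{8}{\Lambda}\Big)^{2} \|\theta_\mu(\hat A)\|^{2}_{l^1}.
\]
Combining all this we have
\[
\begin{aligned}
\frac{z\, S_{b,c}}{N_b(k)} &= (\eta^{(0)}_b + \eta^{(w)}_b)\big(\delta_{b,c} + Y^{(0)}_{b,c} + Y^{(w)}_{b,c} + Y^{(z)}_{b,c} + (S Y^{2}_{33})_{b,c}\big) + \eta^{(z)}_b S_{b,c} \\
&= \big[\eta^{(0)}_b (\delta_{b,c} + Y^{(0)}_{b,c})\big] + \big[\eta^{(0)}_b Y^{(w)}_{b,c} + \eta^{(w)}_b (\delta_{b,c} + Y^{(0)}_{b,c} + Y^{(w)}_{b,c})\big] \\
&\quad + \big[(\eta^{(0)}_b + \eta^{(w)}_b)(S Y^{2}_{33})_{b,c}\big] + \big[(\eta^{(0)}_b + \eta^{(w)}_b) Y^{(z)}_{b,c} + \eta^{(z)}_b S_{b,c}\big] \\
&=: K^{(0)}_{b,c} + K^{(1)}_{b,c} + K^{(2)}_{b,c} + K^{(3)}_{b,c}
\end{aligned}
\]
with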

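\[
\begin{aligned}
|K^{(0)}_{b,c}| &\le \frac{1}{2\Lambda} \Big(1 + \frac{1}{2\Lambda} \|\theta_\mu(\hat A)\|_{l^1}\Big), \\
|K^{(1)}_{b,c}| &\le \frac{\varepsilon}{4\Lambda^{3}} \|\theta_\mu(\hat A)\|_{l^1} + \frac{\varepsilon}{2\Lambda^{2}} \Big(1 + \frac{1}{2\Lambda} \|\theta_\mu(\hat A)\|_{l^1} + \frac{\varepsilon}{\Lambda^{2}} \|\theta_\mu(\hat A)\|_{l^1}\Big)
< \frac{\varepsilon}{2\Lambda^{2}} \Big(1 + \frac{7}{6\Lambda} \|\theta_\mu(\hat A)\|_{l^1}\Big), \\
|K^{(2)}_{b,c}| &\le \frac{1}{\Lambda} \Big(\frac{8}{\Lambda}\Big)^{2} \|\theta_\mu(\hat A)\|^{2}_{l^1} = \frac{64}{\Lambda^{3}} \|\theta_\mu(\hat A)\|^{2}_{l^1}, \\
|K^{(3)}_{b,c}| &\le \frac{3}{2\Lambda} \|\theta_\mu(\hat A)\|_{l^1} \frac{4}{|z|_R} + \frac{14}{13}\, \frac{4}{|z|_R} < \frac{C_{\Lambda,A}}{|z|_R}
\end{aligned}
\]
for all $b, c \in G'_3$. Here, to estimate $|K^{(1)}_{b,c}|$ we have used that $\varepsilon < \Lambda/6$.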


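Finally, recalling (66) and using the above estimates we find that
\[
\begin{aligned}
z_{\mu,d'}(k)\, \alpha^{(1)}_{\mu,d'}(k) &= \sum_{b,c\in G'_1} f(d'-b)\, \frac{z\, S_{b,c}}{N_b(k)}\, g(c-d'')
= \sum_{b,c\in G'_1} f(d'-b) \Big( \sum_{j=0}^{3} K^{(j)}_{b,c} \Big) g(c-d'') \\
&=: \alpha^{(1,0)}_{\mu,d'} + \alpha^{(1,1)}_{\mu,d'}(w(k)) + \alpha^{(1,2)}_{\mu,d'}(k) + \alpha^{(1,3)}_{\mu,d'}(k),
\end{aligned} \tag{79}
\]
where, in particular,
\[
\alpha^{(1,0)}_{\mu,d'} = - \sum_{b,c\in G'_1} \frac{f(d'-b)}{2i\theta_\mu(b-d')} \Big[ \delta_{b,c} + \frac{\theta_\mu(\hat A(b-c))}{\theta_\mu(c-d')} \Big] g(c-d''). \tag{80}
\]
Furthermore, for $0 \le j \le 2$, it follows easily from (79) that $|\alpha^{(1,j)}_{\mu,d'}| \le C_j$ with
\[
\begin{aligned}
C_0 &:= \frac{1}{2\Lambda} \Big(1 + \frac{1}{2\Lambda} \|\theta_\mu(\hat A)\|_{l^1}\Big) \|f\|_{l^1}\|g\|_{l^1}, \\
C_1 &:= \frac{\varepsilon}{2\Lambda^{2}} \Big(1 + \frac{7}{6\Lambda} \|\theta_\mu(\hat A)\|_{l^1}\Big) \|f\|_{l^1}\|g\|_{l^1}, \\
C_2 &:= \frac{64}{\Lambda^{3}} \|\theta_\mu(\hat A)\|^{2}_{l^1} \|f\|_{l^1}\|g\|_{l^1},
\end{aligned} \tag{81}
\]
while for $j = 3$,
\[
|\alpha^{(1,3)}_{\mu,d'}| \le C_{\Lambda,A,f,g}\, \frac{1}{|z|_R}.
\]
This completes the proof of the lemma.

Proof of Lemma 5. To prove this lemma we apply the following (well-known) inequality (see [13] for a proof).

Proposition 18. Let $\alpha$ and $\delta$ be constants with $1 < \alpha \le 2$ and $1 < \delta \le 2$. Suppose that $f$ is a function on $\Gamma^{\#}$ obeying $\||b|^{\alpha} f(b)\|_{l^1} < \infty$. Then, for any $\xi_1, \xi_2 \in \Gamma^{\#}$ with $\xi_1 \ne \xi_2$,
\[
\sum_{b\in\Gamma^{\#}\setminus\{\xi_1,\xi_2\}} \frac{|f(b-\xi_1)|}{|b-\xi_2|^{\delta}} \le \frac{C}{|\xi_1-\xi_2|^{\alpha+\delta-2}} \times
\begin{cases}
1 & \text{if } \alpha, \delta < 2, \\
\ln|\xi_1-\xi_2| & \text{if } \alpha = 2 \text{ or } \delta = 2,
\end{cases}
\]
where $C = C_{\Gamma^{\#},\alpha,\delta,f}$ is a constant.

First observe that $\|\pi_{\{b\}} T^{m}_{G'G'} \pi_{\{c\}}\| = |(T^{m}_{G'G'})_{b,c}|$. Hence, by Proposition 17 with $\beta = 2$, for all $b, c \in G'$ and $m \ge 1$,
\[
|(T^{m}_{G'G'})_{b,c}| = \|\pi_{\{b\}} T^{m}_{G'G'} \pi_{\{c\}}\| \le (1 + 2m) \Big(\frac{17}{18}\Big)^{m} \frac{1}{1 + |b-c|^{2}}.
\]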


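Note that this inequality is also valid for $m = 0$. Thus,
\[
\begin{aligned}
|\Phi_{d',d''}(k)| &= \Big| \sum_{b,c\in G'} \frac{f(d'-b)}{N_b(k)} \Big[ \sum_{m=0}^{\infty} T^{m}_{G'G'} \Big]_{b,c} g(c-d'') \Big| \\
&\le \frac{1}{\varepsilon|v|} \sum_{m=0}^{\infty} (1 + 2m) \Big(\frac{17}{18}\Big)^{m} \sum_{b\in G'} |f(d'-b)| \sum_{c\in G'} \frac{|g(c-d'')|}{1 + |b-c|^{2}} \\
&\le \frac{C}{\varepsilon|v|} \sum_{b\in G'} |f(d'-b)| \Big[ |g(b-d'')| + \sum_{c\in G'\setminus\{b\}} \frac{|g(c-d'')|}{|b-c|^{2}} \Big],
\end{aligned} \tag{82}
\]
where $C$ is a universal constant. Now, by the triangle inequality, Hölder's inequality, and since $\|\cdot\|_{l^2} \le \|\cdot\|_{l^1}$,
\[
\begin{aligned}
\sum_{b\in G'} |f(d'-b)|\,|g(b-d'')|
&= \sum_{b\in G'} \frac{|d'-d''|^{2}}{|d'-d''|^{2}}\, |f(d'-b)|\,|g(b-d'')| \\
&\le \frac{4}{|d'-d''|^{2}} \sum_{b\in G'} \big(|d'-b|^{2} + |b-d''|^{2}\big) |f(d'-b)|\,|g(b-d'')| \\
&\le \frac{4}{|d'-d''|^{2}} \big( \|b^{2} f(b)\|_{l^2}\|g\|_{l^2} + \|f\|_{l^2}\|b^{2} g(b)\|_{l^2} \big) \\
&\le \frac{4}{|d'-d''|^{2}} \big( \|b^{2} f(b)\|_{l^1}\|g\|_{l^1} + \|f\|_{l^1}\|b^{2} g(b)\|_{l^1} \big) \le \frac{C_{f,g}}{|d'-d''|^{2}}.
\end{aligned} \tag{83}
\]
Furthermore, by Proposition 18 with $\alpha = \delta = 2$, for any $0 < \epsilon_1 < 2$,
\[
\sum_{c\in G'\setminus\{b\}} \frac{|g(c-d'')|}{|b-c|^{2}} \le C_{\Gamma^{\#},g}\, \frac{\ln|b-d''|}{|b-d''|^{2}} \le \frac{C_{\Gamma^{\#},g,\epsilon_1}}{|b-d''|^{2-\epsilon_1}}.
\]
Applying this inequality and (83) to (82) we obtain
\[
|\Phi_{d',d''}(k)| \le \frac{C}{\varepsilon|v|} \Big[ \frac{C_{f,g}}{|d'-d''|^{2}} + C_{\Gamma^{\#},g,\epsilon_1} \sum_{b\in G'} \frac{|f(d'-b)|}{|b-d''|^{2-\epsilon_1}} \Big].
\]
Again, by Proposition 18 with $\alpha = 2$ and $\delta = 2 - \epsilon_1$ we conclude that, for any $0 < \epsilon_2 < 2 - \epsilon_1$,
\[
|\Phi_{d',d''}(k)| \le \frac{C}{\varepsilon|v|} \Big( \frac{C_{f,g}}{|d'-d''|^{2}} + C_{\Gamma^{\#},f,g,\epsilon_1} \frac{\ln|d'-d''|}{|d'-d''|^{2-\epsilon_1}} \Big) \le \frac{C_{\varepsilon,\Gamma^{\#},f,g,\epsilon_1,\epsilon_2}}{|v|\,|d'-d''|^{2-\epsilon_1-\epsilon_2}}.
\]
Finally, recall from Proposition 11(ii) that $|z_{\nu',d}| < 3|d|$ and $|z_{\nu',d}| < 3|v|$, observe that $|d'-d''| = |d|$, and set $\epsilon = \epsilon_1 + \epsilon_2$. Then, for any $0 < \epsilon < 2$,
\[
|\Phi_{d',d''}(k)| \le \frac{C_{\varepsilon,\Gamma^{\#},f,g,\epsilon_1,\epsilon_2}}{|d|\,|d|^{2-\epsilon_1-\epsilon_2}} \le \frac{C_{\varepsilon,\Gamma^{\#},f,g,\epsilon}}{|z_{\nu',d}|^{3-\epsilon}}.
\]
Choosing $\epsilon = 10^{-1}$ we obtain the desired inequality.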


Appendix C. Bounds on the Derivatives: Proofs

Proof of Lemma 6. Step 0. When there is no risk of confusion we shall use the same notation to denote an operator or its matrix. Define
\[
F_{BC} := [f(b-c)]_{b\in B,\, c\in C}, \qquad G_{BC} := [g(b-c)]_{b\in B,\, c\in C}, \qquad \Phi_G(k) := [\Phi_{d',d''}(k; G)]_{d',d''\in G}.
\]
Here $F_{BC}$ and $G_{BC}$ are $|B| \times |C|$ matrices and $\Phi_G(k)$ is a $|G| \times |G|$ matrix. First observe that
\[
\Phi_G(k) = \Big[ \sum_{b,c\in G'} \frac{f(d'-b)}{N_b(k)} (R^{-1}_{G'G'})_{b,c}\, g(c-d'') \Big]_{d',d''\in G}
\]
can be written as the product of matrices $F_{GG'} \Delta_k^{-1} R^{-1}_{G'G'} G_{G'G}$. Furthermore, since on $L^2_{G'}$ we have $\Delta_k^{-1} R^{-1}_{G'G'} = (R_{G'G'} \Delta_k)^{-1} = H_k^{-1}$, we can write $\Phi_G(k)$ as $F_{GG'} H_k^{-1} G_{G'G}$. Hence,
\[
\frac{\partial^{n+m}}{\partial k_1^{n} \partial k_2^{m}} \Phi_G(k) = F_{GG'}\, \frac{\partial^{n+m} H_k^{-1}}{\partial k_1^{n} \partial k_2^{m}}\, G_{G'G}. \tag{84}
\]

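This is the quantity we want to estimate.

Step 1. Let $T = T(k)$ be an invertible matrix. Then, applying $\frac{\partial^{m_0}}{\partial k_i^{m_0}}$ to the identity $T T^{-1} = I$ and using the Leibniz rule for $\frac{\partial^{m_0}}{\partial k_i^{m_0}}(T T^{-1})$ we find that
\[
\frac{\partial^{m_0} T^{-1}}{\partial k_i^{m_0}} = -T^{-1} \sum_{m_1=0}^{m_0-1} \binom{m_0}{m_1} \frac{\partial^{m_0-m_1} T}{\partial k_i^{m_0-m_1}}\, \frac{\partial^{m_1} T^{-1}}{\partial k_i^{m_1}}.
\]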
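The Leibniz-rule identity for the derivatives of $T^{-1}$ can be checked numerically for low orders. The sketch below is only an illustration under stated assumptions: the 2×2 matrix family `T(k)` depending on a single real parameter, and all helper names, are hypothetical. It compares the cases $m_0 = 1$ and $m_0 = 2$ of the identity with central finite differences of $k \mapsto T(k)^{-1}$.

```python
def T(k):
    return [[2.0 + k, k * k], [k, 3.0]]

def dT(k):
    return [[1.0, 2.0 * k], [1.0, 0.0]]

def d2T(k):
    return [[0.0, 2.0], [0.0, 0.0]]

def inv(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mul(A, B):
    return [[sum(A[i][x] * B[x][j] for x in range(2)) for j in range(2)]
            for i in range(2)]

def lin(A, B, s, t):
    """Return the linear combination s*A + t*B."""
    return [[s * A[i][j] + t * B[i][j] for j in range(2)] for i in range(2)]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

k0, h = 0.7, 1e-4
Ti = inv(T(k0))

# m0 = 1:  (T^{-1})' = -T^{-1} T' T^{-1}
d1 = lin(mul(Ti, mul(dT(k0), Ti)), Ti, -1.0, 0.0)
# m0 = 2:  (T^{-1})'' = -T^{-1} [ T'' T^{-1} + 2 T' (T^{-1})' ]
inner = lin(mul(d2T(k0), Ti), mul(dT(k0), d1), 1.0, 2.0)
d2 = lin(mul(Ti, inner), Ti, -1.0, 0.0)

# Central finite differences of k -> T(k)^{-1} for comparison.
Tp, Tm = inv(T(k0 + h)), inv(T(k0 - h))
fd1 = lin(Tp, Tm, 1 / (2 * h), -1 / (2 * h))
fd2 = lin(lin(Tp, Tm, 1.0, 1.0), Ti, 1 / h**2, -2 / h**2)

err1, err2 = max_abs_diff(d1, fd1), max_abs_diff(d2, fd2)
print(err1, err2)
```

Each application of the identity lowers the order of the derivative falling on $T^{-1}$, which is why repeated use expresses everything through $T^{-1}$ and derivatives of $T$ alone.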

Iterating this formula m0 − 1 times we obtain   m0 mj−1 mj−1 −mj * −1 mj−1 ∂ mm0 T −1 ∂ m0 T −1 T ∂ = (−T −1 ) mj−1 −mj  m m0 ∂ki mj ∂ki m0 ∂ki j=1 m =0  =

j



mj−1 −mj mj−1 T ∂ (−T −1 ) mj−1 −mj  mj ∂ki

m* 0 −1 mj−1 −1 j=1

mj =0

mm0 −1 −1

∂ mm0 −1 −mm0 T ∂ mm0 T −1 mm0 −1 × (−T −1 ) mm −1 −mm m 0 m m0 ∂ki m0 ∂ki 0 mm0 =0   m* 0 −1 mj−1 mj−1 −mj −1 mj−1 ∂ T ∂ mm0 −1 T = (−1)m0  T −1 mj−1 −mj T −1 mm0 −1 T −1 . mj ∂ki ∂ki j=1 mj =0

(85)


Step 2. In view of (85), it is not diﬃcult to see that linear combination of terms of the form   m nj * ∂ H k  −1  Hk , Hk−1 nj ∂k 2 j=1 where either have

∂ m Hk−1 ∂k2m


is given by a ﬁnite

(86)

+m

m −1 ∂ n ∂ Hk ∂n j=1 nj = m. Thus, when we compute ∂k1n ∂k2m , the derivative ∂k1n acts , nj k ˆ on Hk−1 or ∂ nHjk . However, since ∂H ∂k2 b,c = 2(k2 + b2 )δb,c − 2A2 (b − c), we ∂k

n ∂ n ∂ j Hk ∂k1n ∂knj 2

2

= 0 if nj ≥ 1 and ∂

Hk−1 ∂k1n

n

n ∂ n ∂ j Hk ∂k1n ∂knj 2

=

∂ n Hk ∂k1n

if nj = 0. Similarly, using

again (85), one can see that is given by a ﬁnite linear combination of terms of +n the form (86), with m and k2 replaced by n and k1 , respectively, and j=1 nj = n. ∂ n+m H −1

k Therefore, combining all this we conclude that ∂kn ∂km is given by a ﬁnite linear 1 2 combination of terms of the form   n+m nj * ∂ H k −1  ∆−1 R−1 , ∆−1 (87) nj k RG G k GG ∂k i j j=1

+ +n+m where n+m j=1 nj δ2,ij = m and j=1 nj δ1,ij = n, that is, where the sum of nj for which ij = 2 is equal to m, and the sum of nj for which ij = 1 is equal to n. nj

Hk −1 ∆k πG . n ∂ki j

Step 3. The ﬁrst step in bounding (87) is to estimate ∂

A simple

j

calculation shows that #

∂ nj Hk −1 n ∆ ∂kijj k

$ b,c

  ˆ  2(kij + bij )δb,c + 2Aij (b − c) if nj = 1, 1 × 2δb,c = if nj = 2, Nc (k)   0 if nj ≥ 3.

Furthermore, by Proposition 7, 1 1 ≤ |Nb (k)| ε|v| for all b ∈ G , while by Proposition 3 we have 2 1 ≤ |Nb (k)| Λ|v|

(88)

and |ki + bi | ≤ |ui + bi | + |vi | ≤ |v| + |u + b| ≤

2 |Nb (k)| Λ

for all b ∈ G if G = {0, d}, and for all b ∈ G \{˜b} if G = {0}. Furthermore, |˜b| ≤ Λ + |u| + |v| < Λ + 3|v|,

(89)


since |u| < 2|v| because k ∈ T0 . Now, let 1B (x) be the characteristic function of the set B. Then, using the above estimates we have # $ ∂ nj Hk −1 sup nj ∆k πG ∂k c∈G i j b∈G

!

b,c

2|kij + bij |δnj ,1 + 2δnj ,2 2|Aˆij (b − c)| ≤ sup δb,c + δnj ,1 |Nb (k)| |Nb (k)| c∈G b∈G " ! 2|kij + ˜bij | + 2 2|Aˆij (˜b − c)| ≤ sup δ˜b,c + 1G (˜b) |N˜b (k)| |N˜b (k)| c∈G " ! 2|kij + bij | + 2 2|Aˆij (b − c)| δb,c + + sup |Nb (k)| |Nb (k)| c∈G

"

b∈G \{˜ b}

≤

ˆ l1 2|kij + ˜bij | + 2 + 2 A 1G (˜b) ε|v| " !& ' 2|Aˆij (b − c)| 4 2 + sup + δb,c + Λ |Nb (k)| |Nb (k)| c∈G b∈G \{˜ b}

≤

2 ˆ l1 )1G (˜b) + 4 + 4 + 4 A ˆ l1 (2(|u| + |v| + |˜b|) + 2 + 2 A ε|v| Λ Λ|v| Λ|v|

≤

2 ˆ l1 )1G (˜b) + 4 + 4 + 4 A ˆ l1 (12|v| + 2Λ + 2 + 2 A ε|v| Λ Λ|v| Λ|v|

≤ 1G (˜b) ε−1 CΛ,A + CΛ,A . Similarly, # $ ∂ nj Hk −1 sup ∂k nj ∆k πG b∈G c∈G ij

≤ 1G (˜b) ε−1 CΛ,A + CΛ,A . b,c

Hence, by Proposition 5, % % % ∂ nj H % % % k −1 ˜ −1 CΛ,A + CΛ,A . ∆ π % G % ≤ 1G (b) ε % ∂kinjj k % Step 4. By a similar (and much simpler) calculation (using Proposition 5) we get FGG ≤ f l1 , GGG ≤ g l1 , 1 2 ˜ ∆−1 + (1 − 1G (˜b)) . k πG ≤ 1G (b) ε|v| Λ|v|

(90)


From Lemma 1 we have (RG G )−1 ≤ 18. Thus, the operator norm of (87) is bounded by % %  % % n+m % % * −1 −1 ∂ nj Hk −1 −1 % %  ∆k RG G % ∆k RG G nj % ∂k % % j=1 ij   % % n+m % nj * % % ∂ Hk −1 % −1  −1  ≤ ∆−1 ∆ πG % RG % G , k RG G % ∂kinjj k % j=1 which is bounded either by   n+m * 1 1 18  (ε−1 CΛ,A + CΛ,A ) 18 ≤ ε−(n+m+1) CΛ,A,n,m ε|v| |v| j=1 if G = {0}, or by

  n+m * 1 1 18  CΛ,A 18 g l1 ≤ CΛ,A,n,m Λ|v| |v| j=1

if G = {0, d}. Therefore, % n+m −1 % %∂ Hk % % % % ∂k n ∂k m % ≤ 1

2

ﬁnite sum where # of terms depend on n and m

with C = Cε,Λ,A,n,m if G = {0} (84) and (90) we have n+m ∂ Φ (k) ∂k n ∂k m G = 1 2

C C C ≤ Cn,m ≤ , |v| |v| |v|

(91)

or C = CΛ,A,n,m if G = {0, d}. Finally, recalling

n+m −1 Hk FGG ∂ G G G n m ∂k1 ∂k2 % n+m −1 % %∂ Hk % C % ≤ FGG % % ∂k n ∂k m % GG G ≤ |v| , 1 2

where C = Cε,Λ,A,n,m,f,g if G = {0} or C = CΛ,A,n,m,f,g if G = {0, d}. This is the desired inequality. The proof of the lemma is complete. Proof of Lemma 7. Let R+ be the set of non-negative real numbers and let σ be a real-valued function on R+ such that: (i) σ(t) ≥ 1 for all t ∈ R+ with σ(0) = 1; (ii) σ(s)σ(t) ≥ σ(s + t) for all s, t ∈ R+ ; (iii) σ increases monotonically. For example, for any β ≥ 0 the functions t → eβt and t → (1 + t)β satisfy these properties. Now, let T be a linear operator from L2C to L2B with B, C ⊂ Γ# (or a


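matrix $T = [T_{b,c}]$ with $b \in B$ and $c \in C$) and consider the $\sigma$-norm
\[
\|T\|_{\sigma} := \max\Big\{ \sup_{b\in B} \sum_{c\in C} |T_{b,c}|\, \sigma(|b-c|),\; \sup_{c\in C} \sum_{b\in B} |T_{b,c}|\, \sigma(|b-c|) \Big\}.
\]
In [13] we prove that this norm has the following properties.

Proposition 19 (Properties of $\|\cdot\|_{\sigma}$). Let $S$ and $T$ be linear operators from $L^2_C$ to $L^2_B$ with $B, C \subset \Gamma^{\#}$. Then:
(a) $\|T\| \le \|T\|_{\sigma\equiv 1} \le \|T\|_{\sigma}$;
(b) If $B = C$, then $\|S\,T\|_{\sigma} \le \|S\|_{\sigma} \|T\|_{\sigma}$;
(c) If $B = C$, then $\|(I + T)^{-1}\|_{\sigma} \le (1 - \|T\|_{\sigma})^{-1}$ if $\|T\|_{\sigma} < 1$;
(d) $|T_{b,c}| \le \frac{1}{\sigma(|b-c|)} \|T\|_{\sigma}$ for all $b \in B$ and all $c \in C$.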
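Properties (b) and (d) of the $\sigma$-norm are easy to observe on a finite index set. The following is a minimal sketch, assuming the weight $\sigma(t) = (1+t)^{\beta}$ with $\beta = 2$ and a small square grid in $\mathbb{Z}^2$ as a stand-in for a subset of $\Gamma^{\#}$; the matrices `S` and `T`, the grid, and the helper names are all hypothetical and serve only to exercise the definition.

```python
import math

def dist(b, c):
    return math.hypot(b[0] - c[0], b[1] - c[1])

def sigma_norm(T, B, C, sigma):
    """Weighted max of row sums and column sums, as in the sigma-norm definition."""
    row = max(sum(abs(T[b, c]) * sigma(dist(b, c)) for c in C) for b in B)
    col = max(sum(abs(T[b, c]) * sigma(dist(b, c)) for b in B) for c in C)
    return max(row, col)

def matmul(S, T, B):
    return {(b, c): sum(S[b, x] * T[x, c] for x in B) for b in B for c in B}

beta = 2.0
sigma = lambda t: (1.0 + t) ** beta   # satisfies (i)-(iii) for beta >= 0

# Hypothetical finite index set and test matrices with decaying entries.
B = [(i, j) for i in range(-2, 3) for j in range(-2, 3)]
S = {(b, c): 0.3 / (1.0 + dist(b, c)) ** 5 for b in B for c in B}
T = {(b, c): 0.2j * (b[0] - c[1]) / (1.0 + dist(b, c)) ** 6 for b in B for c in B}

ST = matmul(S, T, B)
nS, nT, nST = (sigma_norm(M, B, B, sigma) for M in (S, T, ST))
print(nS, nT, nST)
```

Submultiplicativity (b) relies on $\sigma(|b-c|) \le \sigma(|b-x|)\,\sigma(|x-c|)$, which follows from properties (ii) and (iii) of $\sigma$ together with the triangle inequality.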

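Now, by using these properties we prove Lemma 7. We follow the same notation as above. First observe that, similarly as in the last proof, we can write
\[
\Phi_{d',d''}(k) = F_{\{d'\}G'}\, \Delta_k^{-1} R^{-1}_{G'G'}\, G_{G'\{d''\}} = F_{\{d'\}G'}\, H_k^{-1}\, G_{G'\{d''\}}.
\]
Now, let $\sigma(|b|) = (1 + |b|)^{\beta}$, and observe that there is a positive constant $C_\beta$ such that $\sigma(|b|) \le C_\beta (1 + |b|^{\beta})$ for all $b \in \Gamma^{\#}$. Then, it is easy to see that
\[
\|F_{\{d'\}G'}\|_{\sigma} = \|f\|_{\sigma} \le C_\beta \|(1 + |b|^{\beta}) f(b)\|_{l^1}, \qquad
\|G_{G'\{d''\}}\|_{\sigma} = \|g\|_{\sigma} \le C_\beta \|(1 + |b|^{\beta}) g(b)\|_{l^1}.
\]
Furthermore, by (77) and Proposition 5,
\[
\|R^{-1}_{G'G'}\|_{\sigma} = \|(I + T_{G'G'})^{-1}\|_{\sigma} \le \sum_{j=0}^{\infty} \|T_{G'G'}\|^{j}_{\sigma} < 18, \tag{92}
\]
and since for diagonal operators the $\sigma$-norm and the operator norm agree, from (90) we have
\[
\|\Delta_k^{-1} \pi_{G'}\|_{\sigma} \le \frac{2}{\Lambda|v|}.
\]
Hence, in view of Propositions 19(b) and 11(ii),
\[
|\Phi_{d',d''}(k)| \le \|F_{\{d'\}G'}\, \Delta_k^{-1} R^{-1}_{G'G'}\, G_{G'\{d''\}}\| \le C_{\beta,f,g,\Lambda,A,m,n}\, \frac{1}{|d|},
\]
and by repeating the proof of Lemma 6 with the operator norm replaced by the $\sigma$-norm we obtain
\[
\Big\| \frac{\partial^{n+m}}{\partial k_1^{n} \partial k_2^{m}} \Phi_{d',d''}(k) \Big\|_{\sigma} \le C_{\beta,f,g,\Lambda,A,m,n}\, \frac{1}{|d|}.
\]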


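Therefore, by Proposition 19(d), for any integers $n$ and $m$ with $n + m \ge 0$,
\[
\Big| \frac{\partial^{n+m}}{\partial k_1^{n} \partial k_2^{m}} \Phi_{d',d''}(k) \Big| \le \frac{1}{1 + |d'-d''|^{\beta}} \Big\| \frac{\partial^{n+m}}{\partial k_1^{n} \partial k_2^{m}} \Phi_{d',d''}(k) \Big\|_{\sigma} \le C_{\beta,f,g,\Lambda,A,m,n}\, \frac{1}{|d|^{1+\beta}}.
\]
This is the desired inequality.

Proof of Lemma 8. Define the operator $M^{(j)} : L^2_{G'_3} \to L^2_{G'_3}$ as
\[
M^{(j)} :=
\begin{cases}
S & \text{if } j = 1, \\
W & \text{if } j = 2, \\
Z & \text{if } j = 3,
\end{cases}
\]
where $S$, $W$ and $Z$ are given by (64). In order to prove Lemma 8, we first prove the following proposition.

Proposition 20. Assume the same hypotheses as in Lemma 8. Then, for any integers $n$ and $m$ with $n + m \ge 1$ and for $1 \le j \le 3$,
\[
\Big\| \frac{\partial^{n+m}}{\partial k_1^{n} \partial k_2^{m}} \Delta_k^{-1} M^{(j)} \Big\| \le \frac{C_j}{(2|z_{\mu,d'}(k)| - R)^{j}},
\]
where $C_1 = C_{1;\Lambda,A,n,m}$ and $C_j = C_{j;\Lambda,A,q,n,m}$ for $2 \le j \le 3$ are constants. Furthermore,
\[
C_{1;\Lambda,A,1,0} \le \frac{13}{\Lambda^{2}}, \qquad C_{1;\Lambda,A,0,1} \le \frac{13}{\Lambda^{2}} \qquad\text{and}\qquad C_{1;\Lambda,A,1,1} \le \frac{65}{\Lambda^{3}}.
\]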

Proof. Step 0. To simplify the notation write w = wµ,d , z = zµ,d and |z|R = 2|z|− R. First observe that, for any analytic function of the form h(k) = ˜h(w(k), z(k)) we have

∂ ∂ ∂ ˜ ∂ ˜ ∂ ∂ ν + − h= h = i(−1) h, h. ∂k1 ∂w ∂z ∂k2 ∂w ∂z Thus, % % n+m % % ∂ −1 (j) % % ∆ M % % ∂k n ∂k m k 1 2 % % m n % % n−r+m−p r+p m n ∂ ∂ % −1 (j) % ∆ M (−1)m−p n−r+m−p = %(i(−1)ν )m % k % % p r ∂z ∂wr+p p=0 r=0 % n−r+m−p r+p % %∂ % ∂ −1 n+m (j) % % sup sup % n−r+m−p ∆ M %. ≤2 r+p k ∂z ∂w p≤r r≤n


Now, by the Leibniz rule, % n % % ∂ ∂ m −1 (j) % % %= ∆ M % ∂z n ∂wm k %

% m n % % m n ∂ n−r+m−p ∆−1 ∂ r+p M (j) % % % k % n−r ∂wm−p r ∂wp % % % p r ∂z ∂z p=0 r=0

% n−r+m−p −1 % %∂ ∆k % % ≤ 2n+m sup sup % % % n−r m−p ∂z ∂w p≤m r≤n

% r+p (j) % %∂ M % % % % ∂z r ∂wp % .

Furthermore, we shall prove below that % %% % % ∂ n−r+m−p ∆−1 % % ∂ r+p M (j) % C % % j,n,m k %% sup sup % , % %≤ n−r ∂wm−p % % % ∂z r ∂wp % |z|n+j p≤m r≤n % ∂z R

(93)

with constants C1,n,m = C1,n,m;Λ,A and Cj,n,m = Cj,n,m;Λ,A,q for 2 ≤ j ≤ 3. Hence, % n m % %∂ ∂ % −1 (j) % n+m Cj,n,m % . % ∂z n ∂wm ∆k M % ≤ 2 |z|n+j R Therefore, being careful with the indices, % n+m % % ∂ % Cj,n−r+m−p,r+p Cj −1 (j) % n+m % sup sup 2n−r+m−p+r+p ≤ j , % ∂k n ∂k m ∆k M % ≤ 2 n−r+m−p+j p≤m r≤n |z|R |z|R 1 2 where C1 = C1;Λ,A,n,m and Cj = Cj;Λ,A,q,n,m for 2 ≤ j ≤ 3. This is the desired inequality. We are left to prove (93) and estimate the constants C1;Λ,A,i,j for i, j ∈ {0, 1} to ﬁnish the proof of the proposition. ∂ r+p ∆−1

Step 1. The ﬁrst step for obtaining (93) is to estimate ∂zr ∂wkp πG3 . Observe that ∂ r+p ∆−1 ∂ r+p (∆−1 ) k k b,c = ∂z r ∂wp b,c ∂z r ∂wp p ∂ ∂r 1 δb,c = p ∂w w − 2iθµ (b − d ) ∂z r z − 2iθµ (b − d ) (−1)r r! δb,c (−1)p p! = (w − 2iθµ (b − d ))p+1 (z − 2iθµ (b − d ))r+1 ≤

|w − 2iθµ (b −

p! r! δb,c d )|p+1 |z −

2iθµ (b − d )|r+1

,

and recall from (58) and (59) that, for all b ∈ G3 , 2 1 ≤ |z − 2iθµ (b − d )| |z|R Then,

and

1 1 ≤ . |w − 2iθµ (b − d )| Λ

∂ r+p ∆−1 p! r! 2r+1 δ b,c k ≤ p+1 r+1 , ∂z r ∂wp b,c Λ |z|R

(94)


and consequently, 

 sup

b∈G3 c∈G 3

+ sup

c∈G3 b∈G

 ≤

r+1

 r+p ∆−1 k  ∂ ∂z r ∂wp b,c

3



r+2 p! r! 2  sup δb,c = p! r! 2 + sup . r+1 Λp+1 |z|R Λp+1 |z|r+1 b∈G3 c∈G c∈G3 b∈G R 3

3

Therefore, by Proposition 5, % r+p −1 % %∂ % p! r! 2r+2 1 ∆k % % ≤ π . G % ∂z r ∂wp 3% Λp+1 |z|r+1 R

(95)

Step 2. We now estimate the second factor in (93). Let us ﬁrst consider the case j = 1, that is, M (1) = S. Since S = (I − Y33 )−1 , the operator S is clearly invertible. ∂pS Thus, by applying (85) with T = S −1 , one can see that ∂w p is given by a ﬁnite linear combination of terms of the form   p nj −1 * ∂ S   S S, (96) ∂wnj j=1 where

+p j=1

nj = p. Hence, when we compute ∂ nj S −1 . ∂w nj

∂r ∂pS ∂z r ∂w p ,

the derivative

∂r ∂z r

acts

−1

either on S or Similarly, using again (85) with T = S , one can see that ∂r S is given by a ﬁnite linear combination of terms of the form (96), with p and w r ∂z +r ∂ r+p S replaced by r and z, respectively, and j=1 mj = r. Thus, we conclude that ∂z r ∂w p is given by a ﬁnite linear combination of terms of the form   r+p * ∂ mj +nj S −1 S,  (97) S ∂z mj ∂wnj j=1 +r+p +r+p where j=1 mj = r and j=1 nj = p. Indeed, observe that the general form of the terms (97) follows directly from (85) because that identity is also valid for mixed derivatives. Since S = (I − Y33 )−1 with Y33 < 1/14 and Yb,c =

ˆ − c)) z −2iθµ (A(b , (w − 2iθµ (c − d ))(z − 2iθµ (c − d ))

(98)

we have S = (I − Y33 )−1 ≤

14 1 ≤ 1 − Y33 13

(99)


and

j+l ∂ j+l ∂ −1 S = j l Yb,c j l ∂z ∂w ∂z ∂w b,c ∂ j −2iθ (A(b ˆ − c)) z ∂ l 1 µ = j . ∂z z − 2iθµ (c − d ) ∂wl w − 2iθµ (c − d )

Furthermore, ˆ − c)) z ˆ − c)) 2iθν (c − d ) ∂ j −2iθµ (A(b (−1)j−1 j! 2iθµ (A(b = j ∂z z − 2iθµ (c − d ) (z − 2iθν (c − d ))j+1 ∂l (−1)l l! 1 = l ∂w w − 2iθµ (c − d ) (w − 2iθµ (c − d ))l+1

for j ≥ 1, for l ≥ 0.

Recall from (59) and (61) that, for all c ∈ G , |c − d | |c − d | ≤ ≤ 2. |w − 2iθµ (c − d )| |c − d | − ε

(100)

Then, using this and (94), for j ≥ 1 and l ≥ 0,

∂ j+l ˆ − c)| j! l! |A(b |c − d | −1 S ≤ j l j+1 l ∂z ∂w |z − 2iθµ (c − d )| |w − 2iθµ (c − d )| |w − 2iθµ (c − d )| b,c ≤

ˆ − c)| 2j+2 j! l! |A(b Λl |z|j+1 R

,

(101)

while for j = 0 and l ≥ 0,

∂ j+l ˆ − c)| |z| l! |A(b −1 S ≤ j l ∂z ∂w |z − 2iθµ (c − d )| |w − 2iθµ (c − d )|l+1 b,c ≤

ˆ − c)| 2 l! |A(b . Λl+1

(102)

Consequently,  

∂ j+l −1  sup  + sup S ∂z j ∂wl b∈G3 c∈G3 b,c c∈G3

b∈G3

 

j+2 |z|R 2 j! l!  ˆ − c)|  |A(b δ0,j sup + sup ≤ 1 − δ0,j + 2Λ Λl |z|j+1 b∈G3 c∈G3 R

j+3 2 j! l! ˆ |z|R ≤ 1 − δ0,j + δ0,j A l1 . 2Λ Λl |z|j+1 R

c∈G3

b∈G3


Therefore, by Proposition (5), % % j+l

j+3 % % ∂ 2 j! l! ˆ |z|R −1 % % % ∂z j ∂wl S % ≤ 1 − δ0,j + 2Λ δ0,j Λl |z|j+1 A l1 . R Thus, for r ≥ 1, in view of (97) where

+r+p j=1


(103)

mj = r,

  % % r+p % % m +n r+p * % % ∂ % % ∂ j j −1 %  % %  S % % ∂z r ∂wp S % ≤ Cr,p % ∂z mj ∂wnj S % S j=1

 ≤ Cr,p 

r+p *

 CΛ,A

j=1

× CΛ,A

r+p *

2

mj +3

mj ! n j ! ˆ  A l1 Λnj

1 − δ0,mj +

j=1

≤ CΛ,A,r,p

|z|R δ0,mj 2Λ

1 m +1 |z|R j

1 , |z|r+1 R

since mj ≥ 1 for at least one 1 ≤ j ≤ r + p. Similarly, if r = 0 then % r+p % % ∂ % % % % ∂z r ∂wp S % ≤ CΛ,A,r,p . Hence, in view of (95), % n−r+m−p −1 % % r+p (1) % % %∂ ∆k % M % % %∂ % sup sup % % % % n−r m−p r ∂z ∂w ∂z ∂wp % p≤m r≤n (m − p)! (n − r)! 2n−r+2 ˆ l1 CΛ,A,r,p A Λm−p+1 |z|n−r+1 p≤m r≤n R

1 |z|R × 1 − δ0,r + δ0,r 2Λ |z|r+1 R

≤ sup sup

≤ CΛ,A,n,m

1 . |z|n+1 R

This proves (93) for j = 1. Step 3. We now estimate the constant C1;Λ,A,i,j for i, j ∈ {0, 1}. First observe that ∂w = |δ1,j + i(−1)ν δ2,j | = 1 and ∂z = |δ1,j − i(−1)ν δ2,j | = 1. ∂kj ∂kj


Thus, in view of (99) and (103), since |z| ≥ |v| > R ≥ 2Λ, % % % % %

% −1 % −1 −1 % % ∂S % % % % = %−S ∂S S % = %−S ∂w ∂S + ∂z ∂S % S% % % % ∂kj % % % ∂kj ∂kj ∂w ∂kj ∂z $ % −1 % % −1 % 2 # 4 ˆ ˆ l1 % ∂S % % ∂S % 3 22 A 2 A l1 % % % % + ≤ ≤ S % + ∂w % % ∂z % 2 |z|2R Λ2 2

≤

ˆ l1 18 A . Λ2

Similarly, ∂S ∂2S =− ∂ki ∂kj ∂ki −S −S

+

∂z ∂kj

∂z ∂S −1 ∂w ∂S −1 + S ∂kj ∂w ∂kj ∂z

∂w ∂S −1 ∂z ∂S −1 + ∂kj ∂w ∂kj ∂z ∂w ∂kj

∂S ∂ki

∂w ∂ 2 S −1 ∂z ∂ 2 S −1 + ∂ki ∂w2 ∂ki ∂z∂w

∂w ∂ 2 S −1 ∂z ∂ 2 S −1 + ∂ki ∂w∂z ∂ki ∂z 2

S,

so that, using the above inequality as well, % % % −1 % % −1 % % 2 % % % % % % % % ∂ S % % ≤ 2 S % ∂S % % ∂S % + % ∂S % % % ∂ki % % ∂w % % ∂z % % ∂ki ∂kj % % 2 −1 % % 2 −1 % % 2 −1 % %∂ S % %∂ S % %∂ S % % % % % % +2% + S % % ∂z∂w % + % ∂z 2 % % 2 ∂w 2

ˆ l1 8 A ˆ l1 3 18 A ≤2 + 2 2 Λ Λ2

$ 2 # 3 ˆ ˆ l1 ˆ l1 3 25 A 26 A 2 A l1 + + 2 Λ3 Λ|z|2R |z|3R

ˆ l1 54 ˆ 55 A 432 ˆ 2 ≤ 4 A l1 + 3 A l1 ≤ Λ Λ Λ3

#

$ ˆ l1 8 A +1 . Λ

Furthermore, by (95), % % % % % % % % ∂∆−1 % % % ∂∆−1 % ∂∆−1 23 8 22 k % k % k % % % % + ≤ % ∂kj % % ∂w % % ∂z % ≤ Λ2 |z|R + Λ|z|2 ≤ Λ2 |z|R R


and


% 2 −1 % % 2 −1 % % 2 −1 % % 2 −1 % % % % ∂ ∆k % % ∂ ∆k % % % %≤% % + 2 % ∂ ∆k % + % ∂ ∆k % % % ∂z∂w % % ∂z 2 % % ∂ki ∂kj % % ∂w2 % ≤

23 24 26 5 · 23 1 + + < . 2 3 Λ3 |z|R Λ2 |z|R Λ|z|R Λ3 |z|R

ˆ l1 < 2ε/63 and ε < Λ/6, Hence, since A % % % % % % % ∂ −1 % % ∂∆−1 % % % −1 % ∂S % k % % % % % ∂kj ∆k S % ≤ % ∂kj % S + ∆k % ∂kj % ≤

8 Λ2 |z|R

ˆ l1 2 18 A 3 13 1 + ≤ 2 2 2 Λ|z|R Λ Λ |z|R

and % % % % ∂2 −1 % % % ∂ki ∂kj ∆k S % % 2 −1 % % %% % % %% % % 2 % % ∂ ∆k % % ∂∆−1 %% % % ∂∆−1 %% % % % −1 % ∂ S % k % % ∂S % k % % ∂S % % % % % ≤% S + % +% + ∆k % % % % % % % % ∂ki ∂kj ∂kj ∂ki ∂ki ∂kj ∂ki ∂kj % $$ # # ˆ l1 8 A ˆ l1 ˆ l1 1 8 18 A 65 1 2 55 A 5 · 23 3 ≤ +2 2 +1 < 3 + . 3 2 3 |z|R Λ 2 Λ Λ Λ Λ Λ Λ |z|R Therefore, C1;Λ,A,1,0 ≤

13 , Λ2

C1;Λ,A,0,1 ≤

13 Λ2

and C1;Λ,A,1,1 ≤

65 , Λ3

as was to be shown. r+p

(2)

r+p

M ∂ W Step 4. To prove (93) for j = 2 we need to bound ∂∂zr ∂w p = ∂z r ∂w p . Recall from (64) that

W =

∞ j=1

Wj =

j ∞

(Y33 )m−1 X33 (Y33 )j−m ,

j=1 m=1

where Yb,c is given above by (98) and X33 ≤ C/|z| < 1/3 with Xb,c =

ˆ − c) − qˆ(b − c) − 2iθµ (A(b ˆ − c))w (c − d ) · A(b . (w − 2iθµ (c − d ))(z − 2iθµ (c − d ))

First observe that ∂ r+p (Y33 )m−1 X33 (Y33 )j−m ∂z r wp


is given by a sum of j r+p terms of the form ∂ l1 +n1 Y33 ∂ lm−1 +nm−1 Y33 ∂ lm +nm X33 ∂ lm+1 +nm+1 Y33 ∂ lj +nj Y33 · · · · · · , l n l n l n l n ∂z 1 ∂w 1 ∂z m−1 ∂w m−1 ∂z m ∂w m ∂z m+1 ∂w m+1 ∂z lj ∂wnj where there are j factors ordered as in the product (Y33 )m−1 X33 (Y33 )j−m . Further+j +j more, for each term in the sum we have i=1 li = r and i=1 ni = p. Thus, % r+p % % ∂ % % % W % ∂z r wp % % % %∞ % % ∂ r+p % % =% Wj % (104) % r p % j=1 ∂z w % % % % %∞ j % % ∂ r+p m−1 j−m % (Y ) X (Y ) =% 33 33 33 % % r p % % j=1 m=1 ∂z w ≤

≤

% j % ∞ % ∂ r+p % m−1 j−m % % (Y ) X (Y ) 33 33 % ∂z r wp 33 % j=1 m=1 ∞

j r+p

% l +n % % ∂ 1 1 Y33 ∂ lm +nm X33 ∂ lj +nj Y33 % % sup % · · · · · · % ∂z l1 ∂wn1 ∂z lm ∂wnm ∂z lj ∂wnj % m=1 I

j r+p

% l +n % l +n % l +n % % % % m m X33 % % j j Y33 % % ∂ 1 1 Y33 % % · · · %∂ % · · · %∂ % sup % % ∂z lm ∂wnm % % ∂z lj ∂wnj % , % ∂z l1 ∂wn1 % m=1 I

j=1

≤

∞ j=1

j

j

(105)

where I :=

j j (li , ni )li ≤ r and ni ≤ p for 1 ≤ i ≤ j with li = r and ni = p . i=1

i=1

(106) +∞ Note, we can diﬀerentiate the series (104) term-by-term because the sum j=1 Wj +j converges uniformly and the sum m=1 is ﬁnite. We next estimate the factors in (105). Combining (101) and (102) we have l +n

li +2 ∂i i 2 l i ! ni ! ˆ |z|R (107) ∂z li ∂wni Yb,c ≤ 1 − δ0,li + 2Λ δ0,li Λni |z|li +1 |A(b − c)|. R Furthermore, using (94) and (100), l +n ∂i i ∂z li ∂wni Xb,c ∂ li ˆ − c) − qˆ(b − c) − 2iθµ (A(b ˆ − c))w ∂ ni (c − d ) · A(b 1 = li ∂z z − 2iθµ (c − d ) ∂wni w − 2iθµ (c − d )


(−1)li l !(−1)ni n !(2θ (A(b ˆ − c))2θµ (c − d ) − (c − d ) · A(b ˆ − c) − qˆ(b − c)) i i µ = (z − 2iθµ (c − d ))li +1 (w − 2iθµ (c − d ))ni +1 ˆ − c)| |c − d | + |ˆ li ! ni ! (2|A(b q (b − c)|) |z − 2iθµ (c − d )|li +1 |w − 2iθµ (c − d )|ni +1

≤

ˆ − c)| |c − d | + |ˆ q (b − c)| 2li +1 li ! ni ! 2|A(b l +1 i n |w − 2iθµ (c − d )| Λ i |z|R

2li +1 li ! ni ! ˆ − c)| + 1 |ˆ q (b − c)| . ≤ 4| A(b Λ Λni |z|lRi +1

≤

(108)

Hence, 

 sup

b∈G3

+ sup

c∈G3

c∈G3



b∈G3

l +n i i  ∂ ∂z li ∂wni Yb,c

 

li +2 2 l i ! ni !  |z|R ˆ − c)|  |A(b ≤ 1 − δ0,li + δ0,li + sup sup 2Λ Λni |z|lRi +1 b∈G3 c∈G3

li +3 2 l i ! ni ! ˆ |z|R δ0,li ≤ 1 − δ0,li + A l1 n 2Λ Λ i |z|lRi +1

c∈G3

b∈G3

and similarly   sup

b∈G3

c∈G3



l +n

2li +2 li ! ni ! ∂i i ˆ q l1 ˆ  + sup . ∂z li ∂wni Xb,c ≤ Λni |z|li +1 4 A l1 + Λ c∈G3 R b∈G

3

Thus, by Proposition (5), since |z| ≥ |v| > R ≥ 2Λ, % l +n %

li +3 % ∂i i % 2 l i ! ni ! ˆ |z|R % % δ ≤ 1 − δ Y + A l1 0,li 0,li % ∂z li ∂wni 33 % 2Λ Λni |z|lRi +1

1 1 2li +3 li ! ni ! ˆ 2li +3 li ! ni ! ˆ 1 ≤ ≤ + A A l1 l l |z|R 2Λ Λni |z|Ri Λni +1 |z|lRi

(109)

and % % l +n

% 2li +2 li ! ni ! % ∂i i ˆ q l1 ˆ % % 1 X + 4 A ≤ l % ∂z li ∂wni 33 % Λ Λni |z|lRi +1 $ # 1 2li +3 li ! ni ! ˆ ˆ q l1 A l1 . = 2Λ + ˆ l1 |z|R Λni +1 |z|li 2 A R

(110)


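Applying these estimates to (105) and recalling that $\sum_{i=1}^{j} l_i = r$ and $\sum_{i=1}^{j} n_i = p$ we have
\[
\begin{aligned}
\Bigl\|\frac{\partial^{r+p}}{\partial z^{r}\partial w^{p}}W\Bigr\|
&\le \sum_{j=1}^{\infty} j^{r+p}\,\sup_I\sum_{m=1}^{j}
\Bigl\|\frac{\partial^{l_1+n_1}Y_{33}}{\partial z^{l_1}\partial w^{n_1}}\Bigr\|\cdots
\Bigl\|\frac{\partial^{l_m+n_m}X_{33}}{\partial z^{l_m}\partial w^{n_m}}\Bigr\|\cdots
\Bigl\|\frac{\partial^{l_j+n_j}Y_{33}}{\partial z^{l_j}\partial w^{n_j}}\Bigr\| \\
&\le \sum_{j=1}^{\infty} j^{r+p}\,\sup_I\sum_{m=1}^{j}
\Bigl(2\Lambda+\frac{\|\hat q\|_{l^1}}{2\|\hat A\|_{l^1}}\Bigr)\frac{1}{|z|_R}
\prod_{i=1}^{j}\frac{2^{l_i+3}\,l_i!\,n_i!}{\Lambda^{n_i+1}|z|_R^{l_i}}\,\|\hat A\|_{l^1} \\
&= \Bigl(2\Lambda+\frac{\|\hat q\|_{l^1}}{2\|\hat A\|_{l^1}}\Bigr)\frac{1}{|z|_R}\,\frac{2^{r}}{\Lambda^{p}|z|_R^{r}}
\sum_{j=1}^{\infty} j^{r+p}\Bigl(\frac{8\|\hat A\|_{l^1}}{\Lambda}\Bigr)^{j}\sup_I\sum_{m=1}^{j}\prod_{i=1}^{j} l_i!\,n_i! \\
&\le \Bigl(2\Lambda+\frac{\|\hat q\|_{l^1}}{2\|\hat A\|_{l^1}}\Bigr)\frac{2^{r}\,r!\,p!}{\Lambda^{p}|z|_R^{r+1}}
\sum_{j=1}^{\infty} j^{r+p+1}\Bigl(\frac{1}{21}\Bigr)^{j}
\le C_{\Lambda,A,q,r,p}\,\frac{1}{|z|_R^{r+1}}.
\end{aligned}
\]
This is the inequality we needed to prove (93) for $j = 2$. In fact, using (95) we obtain
\[
\begin{aligned}
\sup_{p\le m}\,\sup_{r\le n}\,
\Bigl\|\frac{\partial^{n-r+m-p}\Delta_k^{-1}}{\partial z^{n-r}\partial w^{m-p}}\Bigr\|\,
\Bigl\|\frac{\partial^{r+p}M^{(2)}}{\partial z^{r}\partial w^{p}}\Bigr\|
&\le \sup_{p\le m}\,\sup_{r\le n}\,
\frac{(m-p)!\,(n-r)!\,2^{n-r+2}}{\Lambda^{m-p+1}|z|_R^{n-r+1}}\,\frac{C_{\Lambda,A,q,r,p}}{|z|_R^{r+1}} \\
&\le C_{\Lambda,A,q,m,n}\,\frac{1}{|z|_R^{n+2}}.
\end{aligned}
\]

Step 5. To prove (93) for $j = 3$ we need to estimate
$\frac{\partial^{r+p}M^{(3)}}{\partial z^{r}\partial w^{p}} = \frac{\partial^{r+p}Z}{\partial z^{r}\partial w^{p}}$, where
\[
Z=\sum_{j=2}^{\infty}Z_j=\sum_{j=2}^{\infty}\bigl((X_{33}+Y_{33})^{j}-W_j-Y_{33}^{j}\bigr).
\]
First observe that
\[
\frac{\partial^{r+p}}{\partial z^{r}\partial w^{p}}Z_j
=\frac{\partial^{r+p}}{\partial z^{r}\partial w^{p}}\bigl((X_{33}+Y_{33})^{j}-W_j-Y_{33}^{j}\bigr)
\]
is given by a sum of $(2^{j}-j-1)\cdot j^{r+p}$ terms of the form
\[
\underbrace{\frac{\partial^{l_1+n_1}Y_{33}}{\partial z^{l_1}\partial w^{n_1}}\cdots
\frac{\partial^{l_m+n_m}X_{33}}{\partial z^{l_m}\partial w^{n_m}}\cdots
\frac{\partial^{l_j+n_j}Y_{33}}{\partial z^{l_j}\partial w^{n_j}}}_{j\ \text{factors}},
\tag{111}
\]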

September 14, J070-S0129055X10004107

2010 13:29 WSPC/S0129-055X

148-RMP

Asymptotics for Fermi Curves: Small Magnetic Potential

951

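where there are $j-2$ factors involving $X_{33}$ or $Y_{33}$ and two factors containing $X_{33}$. Furthermore, for each term in the sum we have $\sum_{i=1}^{j} l_i = r$ and $\sum_{i=1}^{j} n_i = p$. Thus,
\[
\Bigl\|\frac{\partial^{r+p}}{\partial z^{r}\partial w^{p}}Z_j\Bigr\|
\le (2^{j}-j-1)\,j^{r+p}\,\sup_I
\Bigl\|\frac{\partial^{l_1+n_1}Y_{33}}{\partial z^{l_1}\partial w^{n_1}}\Bigr\|\cdots
\Bigl\|\frac{\partial^{l_m+n_m}X_{33}}{\partial z^{l_m}\partial w^{n_m}}\Bigr\|\cdots
\Bigl\|\frac{\partial^{l_j+n_j}Y_{33}}{\partial z^{l_j}\partial w^{n_j}}\Bigr\|,
\]
where the set $I$ is given above by (106). Now observe that the estimate for the derivatives of $X_{33}$ in (110) is better than the estimate for the derivatives of $Y_{33}$ in (109), because the former has an extra factor $C_{\Lambda,A,q}/|z|_R < 1$. Since the product (111) has at least two factors containing $X_{33}$, we can estimate any of these products by considering the worst case. This happens when there are exactly two factors involving $X_{33}$. Hence, by proceeding in this way, for each $j \ge 2$ we have
\[
\begin{aligned}
\Bigl\|\frac{\partial^{r+p}}{\partial z^{r}\partial w^{p}}Z_j\Bigr\|
&\le (2^{j}-j-1)\,j^{r+p}\,\sup_I\Bigl(2\Lambda+\frac{\|\hat q\|_{l^1}}{2\|\hat A\|_{l^1}}\Bigr)^{2}\frac{1}{|z|_R^{2}}
\prod_{i=1}^{j}\frac{2^{l_i+3}\,l_i!\,n_i!}{\Lambda^{n_i+1}|z|_R^{l_i}}\,\|\hat A\|_{l^1} \\
&\le 2^{j}\,j^{r+p}\Bigl(2\Lambda+\frac{\|\hat q\|_{l^1}}{2\|\hat A\|_{l^1}}\Bigr)^{2}\frac{1}{|z|_R^{2}}\,
\frac{2^{r}\,r!\,p!}{\Lambda^{p}|z|_R^{r}}\Bigl(\frac{8\|\hat A\|_{l^1}}{\Lambda}\Bigr)^{j} \\
&\le C_{\Lambda,A,q,r,p}\;j^{r+p}\Bigl(\frac{2}{21}\Bigr)^{j}\frac{1}{|z|_R^{r+2}},
\end{aligned}
\]
since $\|\hat A\|_{l^1} \le 2\varepsilon/63$ and $\varepsilon < \Lambda/6$. Thus,
\[
\Bigl\|\frac{\partial^{r+p}}{\partial z^{r}\partial w^{p}}Z\Bigr\|
\le \sum_{j=2}^{\infty}\Bigl\|\frac{\partial^{r+p}}{\partial z^{r}\partial w^{p}}Z_j\Bigr\|
\le \frac{C_{\Lambda,A,q,r,p}}{|z|_R^{r+2}}\sum_{j=2}^{\infty}j^{r+p}\Bigl(\frac{2}{21}\Bigr)^{j}
\le \frac{C_{\Lambda,A,q,r,p}}{|z|_R^{r+2}}.
\]
Therefore, recalling (95),
\[
\begin{aligned}
\sup_{p\le m}\,\sup_{r\le n}\,
\Bigl\|\frac{\partial^{n-r+m-p}\Delta_k^{-1}}{\partial z^{n-r}\partial w^{m-p}}\Bigr\|\,
\Bigl\|\frac{\partial^{r+p}M^{(3)}}{\partial z^{r}\partial w^{p}}\Bigr\|
&\le \sup_{p\le m}\,\sup_{r\le n}\,
\frac{(m-p)!\,(n-r)!\,2^{n-r+2}}{\Lambda^{m-p+1}|z|_R^{n-r+1}}\,\frac{C_{\Lambda,A,q,r,p}}{|z|_R^{r+2}} \\
&\le C_{\Lambda,A,q,m,n}\,\frac{1}{|z|_R^{n+3}}.
\end{aligned}
\]
This is the desired inequality for $j = 3$. The proof of the proposition is complete.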
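Two elementary facts used in Steps 4 and 5 can be checked mechanically: the number of length-$j$ words in $X_{33}$, $Y_{33}$ with at least two factors $X_{33}$ is $2^j - j - 1$, and the hypotheses $\|\hat A\|_{l^1} \le 2\varepsilon/63$, $\varepsilon < \Lambda/6$ keep the geometric ratios below $1/21$ and $2/21$. The following sketch is an illustration under those stated hypotheses; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def words_with_at_least_two_x(j):
    """Count length-j words in the letters X, Y that contain at least two X's."""
    return sum(1 for word in product("XY", repeat=j) if word.count("X") >= 2)


def geometric_ratios(lam):
    """Return (8A/lam, 2*8A/lam) at the limiting values A = 2*eps/63, eps = lam/6."""
    eps = lam / 6          # the text assumes eps < lam/6, so this is the limiting case
    a_norm = 2 * eps / 63  # the text assumes ||A||_{l^1} <= 2*eps/63
    ratio = 8 * a_norm / lam
    return ratio, 2 * ratio


# (X + Y)^j has 2^j words; removing Y^j (one word) and the j words with a single X leaves 2^j - j - 1.
for j in range(2, 10):
    assert words_with_at_least_two_x(j) == 2 ** j - j - 1

ratio_w, ratio_z = geometric_ratios(Fraction(1))
assert ratio_w == Fraction(8, 189) and ratio_w < Fraction(1, 21)
assert ratio_z == Fraction(16, 189) and ratio_z < Fraction(2, 21)
```

Since both ratios are strictly less than one, the series $\sum_j j^{r+p+1}(1/21)^j$ and $\sum_j j^{r+p}(2/21)^j$ converge, which is what allows the sums over $j$ to be absorbed into the constants.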


We can now prove Lemma 8. We first prove it for $1 \le j \le 2$ and then for $j = 3$ separately.

Proof of Lemma 8 for $1 \le j \le 2$. Define the $|B| \times |C|$ matrices
\[
F_{BC} := [f(b-c)]_{b\in B,\,c\in C} \quad\text{and}\quad G_{BC} := [g(b-c)]_{b\in B,\,c\in C},
\]

and write w = wµ,d , z = zµ,d and |z|R = 2|z|−R. First observe that, for 1 ≤ j ≤ 2, the functions   (j) f (d − b)M g(c − d ) (j) b,c  [αµ,d (k)]d ∈G =  (b − d ))(z − 2iθµ (b − d )) (w − 2iθ µ b,c∈G1

d ∈G

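are the diagonal entries of the matrix $F_{GG_1}\,\Delta_k^{-1}M^{(j)}\,G_{G_1G}$. Thus, similarly as in the proof of Lemma 6, by Proposition 20, for $1 \le j \le 2$,
\[
\Bigl|\frac{\partial^{n+m}}{\partial k_1^{n}\partial k_2^{m}}\alpha^{(j)}_{\mu,d'}(k)\Bigr|
\le \|F_{GG_1}\|\,\Bigl\|\frac{\partial^{n+m}}{\partial k_1^{n}\partial k_2^{m}}\Delta_k^{-1}M^{(j)}\Bigr\|\,\|G_{G_1G}\|
\le C_j\,\frac{1}{|z|_R^{j}},
\]
where $C_1 = C_{1;\Lambda,A,n,m,f,g}$ and $C_2 = C_{2;\Lambda,A,q,n,m,f,g}$ are constants. Furthermore,
\[
C_{1;\Lambda,A,1,0,f,g} \le \frac{13}{\Lambda^{2}}\|f\|_{l^1}\|g\|_{l^1},\qquad
C_{1;\Lambda,A,0,1,f,g} \le \frac{13}{\Lambda^{2}}\|f\|_{l^1}\|g\|_{l^1}
\quad\text{and}\quad
C_{1;\Lambda,A,1,1,f,g} \le \frac{65}{\Lambda^{3}}\|f\|_{l^1}\|g\|_{l^1}.
\]
This proves the lemma for $1 \le j \le 2$.

Proof of Lemma 8 for $j = 3$. We need to estimate
\[
\frac{\partial^{n+m}}{\partial k_1^{n}\partial k_2^{m}}\alpha^{(3)}_{\mu,d'}(k)
=\sum_{j=1}^{4}\frac{\partial^{n+m}}{\partial k_1^{n}\partial k_2^{m}}R_j(k),
\]
where $R_1,\dots,R_4$ are given by (46), (47), (75) and (67), respectively.

Step 1. We begin with the terms involving $R_1$ and $R_2$, which are easier. We follow the same notation as above. First observe that, similarly as in the proof of Lemma 6, since $\Delta_k^{-1}R_{G'G'}^{-1} = H_k^{-1}$ on $L^2_{G'}$, we have
\[
\begin{aligned}
\Bigl|\frac{\partial^{n+m}}{\partial k_1^{n}\partial k_2^{m}}R_1(k)\Bigr|
&=\Bigl|F_{\{d'\}G_1}\frac{\partial^{n+m}H_k^{-1}}{\partial k_1^{n}\partial k_2^{m}}G_{G_2\{d'\}}\Bigr|
\le \|F_{\{d'\}G_1}\|\,\Bigl\|\frac{\partial^{n+m}H_k^{-1}}{\partial k_1^{n}\partial k_2^{m}}\Bigr\|\,\|G_{G_2\{d'\}}\|, \\
\Bigl|\frac{\partial^{n+m}}{\partial k_1^{n}\partial k_2^{m}}R_2(k)\Bigr|
&=\Bigl|F_{\{d'\}G_2}\frac{\partial^{n+m}H_k^{-1}}{\partial k_1^{n}\partial k_2^{m}}G_{G'\{d'\}}\Bigr|
\le \|F_{\{d'\}G_2}\|\,\Bigl\|\frac{\partial^{n+m}H_k^{-1}}{\partial k_1^{n}\partial k_2^{m}}\Bigr\|\,\|G_{G'\{d'\}}\|.
\end{aligned}
\]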


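Furthermore, we have already proved that $\|F_{\{d'\}G_1}\| \le \|f\|_{l^1}$ and $\|G_{G'\{d'\}}\| \le \|g\|_{l^1}$ (see (90) and (91)), and since $|z| \le 3|v|$, by Proposition 11,
\[
\Bigl\|\frac{\partial^{n+m}H_k^{-1}}{\partial k_1^{n}\partial k_2^{m}}\Bigr\| \le \varepsilon^{-(n+m+1)}\,C_{\Lambda,A,n,m}\,\frac{1}{|z|}.
\]
Now recall that $G_2 = \{b \in G' \mid |b - d'| > \tfrac14 R\}$. Then,
\[
\begin{aligned}
\sup_{b\in\{d'\}}\sum_{c\in G_2}|f(b-c)|
&\le \sum_{c\in G_2}\frac{|d'-c|^{2}}{|d'-c|^{2}}\,|f(d'-c)|
\le \|b^{2}f(b)\|_{l^1}\,\sup_{c\in G_2}\frac{1}{|d'-c|^{2}}
\le \frac{16}{R^{2}}\,\|b^{2}f(b)\|_{l^1}, \\
\sup_{c\in G_2}\sum_{b\in\{d'\}}|f(b-c)|
&\le \sup_{c\in G_2}\frac{|d'-c|^{2}}{|d'-c|^{2}}\,|f(d'-c)|
\le \|b^{2}f(b)\|_{l^1}\,\sup_{c\in G_2}\frac{1}{|d'-c|^{2}}
\le \frac{16}{R^{2}}\,\|b^{2}f(b)\|_{l^1}.
\end{aligned}
\]
Hence, by Proposition 5,
\[
\|F_{\{d'\}G_2}\| \le 16\,\|b^{2}f(b)\|_{l^1}\,\frac{1}{R^{2}}.
\]
Similarly, since the entries of $G_{G_2\{d'\}}$ are values of $g$,
\[
\|G_{G_2\{d'\}}\| \le 16\,\|b^{2}g(b)\|_{l^1}\,\frac{1}{R^{2}}.
\]
Therefore, combining all this, for $1 \le j \le 2$ we obtain
\[
\Bigl|\frac{\partial^{n+m}}{\partial k_1^{n}\partial k_2^{m}}R_j(k)\Bigr| \le \varepsilon^{-(n+m+1)}\,C_{\Lambda,A,n,m,f,g}\,\frac{1}{|z|R^{2}}.
\]

Step 2. Recall from (67) the expression for $R_4$. Then, similarly as above, by applying Proposition 20 for $j = 3$ we find that
\[
\Bigl|\frac{\partial^{n+m}}{\partial k_1^{n}\partial k_2^{m}}R_4(k)\Bigr|
\le \|F_{\{d'\}G_1}\|\,\Bigl\|\frac{\partial^{n+m}}{\partial k_1^{n}\partial k_2^{m}}\Delta_k^{-1}Z\Bigr\|\,\|G_{G_1\{d'\}}\|
\le \|f\|_{l^1}\|g\|_{l^1}\,C_{\Lambda,A,q,n,m}\,\frac{1}{|z|_R^{3}}.
\]

Step 3. To bound the derivatives of $R_3$ (which is given by (75)) we need a few more estimates. Recall from (70) that $W_{43}^{(j)} = \pi_{G_4}T_{G'G'}^{\,j+1}\pi_{G_3}$. First observe that
\[
\frac{\partial^{r+p}}{\partial k_1^{r}\partial k_2^{p}}\,\Delta_k^{-1}\pi_{G_1}T_{33}^{m}T_{34}W_{43}^{(j-m-1)}
=\frac{\partial^{r+p}}{\partial k_1^{r}\partial k_2^{p}}\,\Delta_k^{-1}\pi_{G_1}T_{33}^{m}T_{34}T_{G'G'}^{\,j-m}\pi_{G_3}
\]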


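is given by a sum of $(j+2)^{r+p}$ terms of the form
\[
\frac{\partial^{l_1+n_1}\Delta_k^{-1}}{\partial k_1^{l_1}\partial k_2^{n_1}}\,\pi_{G_1}\,
\frac{\partial^{l_2+n_2}T_{33}}{\partial k_1^{l_2}\partial k_2^{n_2}}\cdots
\frac{\partial^{l_{m+2}+n_{m+2}}T_{34}}{\partial k_1^{l_{m+2}}\partial k_2^{n_{m+2}}}
\times
\frac{\partial^{l_{m+3}+n_{m+3}}T_{G'G'}}{\partial k_1^{l_{m+3}}\partial k_2^{n_{m+3}}}\cdots
\frac{\partial^{l_{j+2}+n_{j+2}}T_{G'G'}}{\partial k_1^{l_{j+2}}\partial k_2^{n_{j+2}}}\,\pi_{G_3}.
\]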
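The count $(j+2)^{r+p}$ is the usual Leibniz bookkeeping: each of the $r+p$ derivatives lands on one of the $j+2$ factors. Grouping the placements by how many derivatives each factor receives gives multinomial weights, and these sum to $(j+2)^{r+p}$. A minimal sketch of that check (the helper names are illustrative, not from the paper):

```python
from math import factorial


def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers that sum to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def multinomial(total, counts):
    """Number of ways to distribute `total` labelled derivatives with the given counts per factor."""
    result = factorial(total)
    for count in counts:
        result //= factorial(count)
    return result


def leibniz_term_count(r, p, factors):
    """Total number of terms when r k1-derivatives and p k2-derivatives hit a product of `factors` factors."""
    return sum(
        multinomial(r, l) * multinomial(p, n)
        for l in compositions(r, factors)
        for n in compositions(p, factors)
    )


# Example: r = 2, p = 1 and j + 2 = 3 factors give 3^(2+1) = 27 terms.
assert leibniz_term_count(2, 1, 3) == 3 ** 3
```

Each pair of tuples $(l_1,\dots,l_{j+2})$, $(n_1,\dots,n_{j+2})$ produced by `compositions` satisfies the two sum constraints on the $l_i$ and $n_i$ that the text imposes on every term.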

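Moreover, for each term in the sum we have $\sum_{i=1}^{j+2} l_i = r$ and $\sum_{i=1}^{j+2} n_i = p$. Thus,
\[
\Bigl\|\frac{\partial^{r+p}}{\partial k_1^{r}\partial k_2^{p}}\,\Delta_k^{-1}\pi_{G_1}T_{33}^{m}T_{34}W_{43}^{(j-m-1)}\Bigr\|
\le (j+2)^{r+p}\,\sup_I\,\Bigl\|\Bigl(\prod_{i=1}^{j+2}\frac{\partial^{l_i+n_i}T_{(i)}}{\partial k_1^{l_i}\partial k_2^{n_i}}\Bigr)\pi_{G_3}\Bigr\|,
\tag{112}
\]
where the set $I$ is given by (106) with $j$ replaced by $j+2$ and
\[
T_{(i)} :=
\begin{cases}
\Delta_k^{-1}\pi_{G_1} & \text{for } i = 1,\\
T_{33} & \text{for } 2 \le i \le m+1,\\
T_{34} & \text{for } i = m+2,\\
T_{G'G'} & \text{for } m+3 \le i \le j+2.
\end{cases}
\tag{113}
\]

Step 3a. The first step in bounding (112) is to estimate $\frac{\partial^{r+p}\Delta_k^{-1}}{\partial k_1^{r}\partial k_2^{p}}\pi_{G_1}$. We follow the same argument that we have used in the proof of Lemma 6 to bound $\frac{\partial^{n+m}H_k^{-1}}{\partial k_1^{n}\partial k_2^{m}}$. In fact, in view of (85) one can see that
\[
\frac{\partial^{p}\Delta_k^{-1}}{\partial k_2^{p}}
=\sum_{\substack{\text{finite sum where \# of}\\ \text{terms depends on } p}}
\Delta_k^{-1}\prod_{j=1}^{p}\Bigl(\frac{\partial^{n_j}\Delta_k}{\partial k_2^{n_j}}\,\Delta_k^{-1}\Bigr),
\tag{114}
\]
where $\sum_{j=1}^{p} n_j = p$. Hence, when we compute $\frac{\partial^{r}}{\partial k_1^{r}}\frac{\partial^{p}\Delta_k^{-1}}{\partial k_2^{p}}$, the derivative $\frac{\partial^{r}}{\partial k_1^{r}}$ acts either on $\Delta_k^{-1}$ or on $\frac{\partial^{n_j}\Delta_k}{\partial k_2^{n_j}}$. However, since $\bigl(\frac{\partial\Delta_k}{\partial k_2}\bigr)_{b,c} = 2(k_2+c_2)\delta_{b,c}$, we have
\[
\frac{\partial^{r}}{\partial k_1^{r}}\frac{\partial^{n_j}}{\partial k_2^{n_j}}\Delta_k = 0 \ \text{ if } n_j \ge 1
\qquad\text{and}\qquad
\frac{\partial^{r}}{\partial k_1^{r}}\frac{\partial^{n_j}}{\partial k_2^{n_j}}\Delta_k = \frac{\partial^{r}}{\partial k_1^{r}}\Delta_k \ \text{ if } n_j = 0.
\]
Similarly, using again (85) one can see that $\frac{\partial^{r}\Delta_k^{-1}}{\partial k_1^{r}}$ is given by a finite sum as in (114), with $p$ and $k_2$ replaced by $r$ and $k_1$, respectively, and $\sum_{j=1}^{r} n_j = r$. Thus, combining all this we conclude that
\[
\frac{\partial^{r+p}\Delta_k^{-1}}{\partial k_1^{r}\partial k_2^{p}}
=\sum_{\substack{\text{finite sum where \# of}\\ \text{terms depends on } r \text{ and } p}}
\Delta_k^{-1}\prod_{j=1}^{r+p}\Bigl(\frac{\partial^{n_j}\Delta_k}{\partial k_{i_j}^{n_j}}\,\Delta_k^{-1}\Bigr),
\tag{115}
\]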


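where $\sum_{j=1}^{r+p} n_j\,\delta_{2,i_j} = p$ and $\sum_{j=1}^{r+p} n_j\,\delta_{1,i_j} = r$. If we observe that
\[
\Bigl(\frac{\partial^{n_j}\Delta_k}{\partial k_{i_j}^{n_j}}\Bigr)_{b,c}
=\begin{cases}
2(k_{i_j}+c_{i_j})\,\delta_{b,c} & \text{if } n_j = 1,\\
2\,\delta_{b,c} & \text{if } n_j = 2,\\
0 & \text{if } n_j \ge 3,
\end{cases}
\]
and extract the “leading term” from the summation in (115), in a sense that will be clear below, we can rewrite (115) in terms of matrix elements as
\[
\frac{\partial^{r+p}}{\partial k_1^{r}\partial k_2^{p}}\frac{1}{N_c(k)}
=\frac{(-1)^{r+p}(r+p)!}{N_c(k)}\Bigl(\frac{2(k_1+c_1)}{N_c(k)}\Bigr)^{r}\Bigl(\frac{2(k_2+c_2)}{N_c(k)}\Bigr)^{p}
+\sum_{\substack{\text{finite sum where \# of}\\ \text{terms depends on } r \text{ and } p}}
\frac{(2(k_1+c_1))^{\alpha_j}\,(2(k_2+c_2))^{\beta_j}}{N_c(k)^{r+p+1}},
\]
where $\alpha_j + \beta_j < r+p$ for every $j$ in the summation. Recall from (88) and (89) that, for all $c \in G' \setminus \{\tilde c\}$,
\[
\frac{|k_i+c_i|}{|N_c(k)|} \le \frac{2}{\Lambda} < \frac{1}{3\varepsilon} < \frac{7}{2\varepsilon}
\qquad\text{and}\qquad
\frac{|k_i+\tilde c_i|}{|N_{\tilde c}(k)|} \le \frac{\Lambda+3|v|}{\varepsilon|v|} \le \frac{7}{2\varepsilon}.
\]
Hence,
\[
\Bigl|\frac{\partial^{r+p}}{\partial k_1^{r}\partial k_2^{p}}\frac{1}{N_c(k)}\Bigr|
\le \frac{(r+p)!}{|N_c(k)|}\Bigl(\frac{7}{\varepsilon}\Bigr)^{r+p}
+\sum_{\substack{\text{finite sum where \# of}\\ \text{terms depends on } r \text{ and } p}}
\Bigl(\frac{7}{\varepsilon}\Bigr)^{\alpha_j+\beta_j}\frac{1}{|N_c(k)|^{2}}
\tag{116}
\]
\[
\le \frac{(r+p)!}{|N_c(k)|}\Bigl(\frac{7}{\varepsilon}\Bigr)^{r+p} + C_{\varepsilon,r,p}\,\frac{1}{|N_c(k)|^{2}}.
\tag{117}
\]
Thus, by Proposition 5, since $|N_c(k)| \ge \varepsilon|v| \ge \varepsilon|z|/3$ for all $c \in G'$, we have
\[
\Bigl\|\frac{\partial^{r+p}\Delta_k^{-1}}{\partial k_1^{r}\partial k_2^{p}}\,\pi_{G_1}\Bigr\|
\le \frac{7^{r+p}(r+p)!}{\varepsilon^{r+p+1}}\,\frac{3}{|z|} + \frac{C_{\varepsilon,r,p}}{|z|^{2}}.
\tag{118}
\]
Now, let $\rho_1 = \rho_{1;\varepsilon,r,p}$ be the constant
\[
\rho_{1;\varepsilon,r,p} := \max_{\substack{l_1\le r\\ n_1\le p}}\ \frac{\varepsilon^{l_1+n_1+1}\,C_{\varepsilon,l_1,n_1}}{4\,(l_1+n_1)!\;7^{l_1+n_1}},
\]
where $C_{\varepsilon,l_1,n_1}$ is the constant in (118). Then, for $|z| > \rho_1$ and for any $l_1 \le r$ and any $n_1 \le p$,
\[
\Bigl\|\frac{\partial^{l_1+n_1}\Delta_k^{-1}}{\partial k_1^{l_1}\partial k_2^{n_1}}\,\pi_{G_1}\Bigr\|
\le \frac{7^{l_1+n_1}(l_1+n_1)!}{\varepsilon^{l_1+n_1+1}}\,\frac{3}{|z|} + \frac{7^{l_1+n_1}(l_1+n_1)!}{\varepsilon^{l_1+n_1+1}}\,\frac{4}{|z|}
= (l_1+n_1)!\,\Bigl(\frac{7}{\varepsilon}\Bigr)^{l_1+n_1+1}\frac{1}{|z|}.
\tag{119}
\]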


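This is the first inequality we need to bound (112). We next estimate the other factors in that expression.

Step 3b. Recall from (53) that
\[
T_{b,c} = \frac{1}{N_c(k)}\bigl(2(c+k)\cdot\hat A(b-c) - \hat q(b-c)\bigr).
\]
By direct calculation we have
\[
\begin{aligned}
\frac{\partial^{r+p}T_{b,c}}{\partial k_1^{r}\partial k_2^{p}}
&=\Bigl(\frac{\partial^{r+p}}{\partial k_1^{r}\partial k_2^{p}}\frac{1}{N_c(k)}\Bigr)\bigl(2(c+k)\cdot\hat A(b-c) - \hat q(b-c)\bigr) \\
&\quad + r\,\Bigl(\frac{\partial^{r-1+p}}{\partial k_1^{r-1}\partial k_2^{p}}\frac{1}{N_c(k)}\Bigr)\,2\hat A_j(b-c)
+ p\,\Bigl(\frac{\partial^{r+p-1}}{\partial k_1^{r}\partial k_2^{p-1}}\frac{1}{N_c(k)}\Bigr)\,2\hat A_j(b-c).
\end{aligned}
\]
Hence, using (116) and (117), since $|N_c(k)| \ge \varepsilon|v| \ge \varepsilon|z|/3$ for all $c \in G'$ and $|v| > 1$,
\[
\begin{aligned}
\Bigl|\frac{\partial^{r+p}T_{b,c}}{\partial k_1^{r}\partial k_2^{p}}\Bigr|
&\le \Bigl((r+p)!\Bigl(\frac{7}{\varepsilon}\Bigr)^{r+p} + \frac{C_{\varepsilon,r,p}}{\varepsilon|v|}\Bigr)
\Bigl(\frac{7}{\varepsilon}\,|\hat A(b-c)| + \frac{|\hat q(b-c)|}{\varepsilon|v|}\Bigr)
+ \frac{C_{\varepsilon,r,p}}{|v|}\,|\hat A(b-c)| \\
&\le (r+p)!\Bigl(\frac{7}{\varepsilon}\Bigr)^{r+p+1}|\hat A(b-c)| + \frac{C_{\varepsilon,r,p}}{|z|}\bigl(|\hat A(b-c)| + |\hat q(b-c)|\bigr).
\end{aligned}
\tag{120}
\]
Therefore, by Proposition 5,
\[
\Bigl\|\frac{\partial^{r+p}T_{G'G'}}{\partial k_1^{r}\partial k_2^{p}}\Bigr\| \le \Theta_{r,p},
\tag{121}
\]
where
\[
\Theta_{r,p} := (r+p)!\Bigl(\frac{7}{\varepsilon}\Bigr)^{r+p+1}\|\hat A\|_{l^1} + C_{\varepsilon,A,q,r,p}\,\frac{1}{|z|}.
\tag{122}
\]
This is the second estimate we need to bound (112). We next derive one more inequality.

Step 3c. Set
\[
Q^{r,p}_{b,c} := (1+|b-c|^{2})\,\frac{\partial^{r+p}T_{b,c}}{\partial k_1^{r}\partial k_2^{p}}.
\]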


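We first prove that, for any $B, C \subset G'$,
\[
\sup_{b\in B}\sum_{c\in C}|Q^{r,p}_{b,c}| \le \Omega_{r,p}
\qquad\text{and}\qquad
\sup_{c\in C}\sum_{b\in B}|Q^{r,p}_{b,c}| \le \Omega_{r,p},
\]
where
\[
\Omega_{r,p} := (r+p)!\Bigl(\frac{7}{\varepsilon}\Bigr)^{r+p+1}\|(1+b^{2})\hat A(b)\|_{l^1} + C_{\varepsilon,A,q,r,p}\,\frac{1}{|z|}.
\tag{123}
\]
In fact, in view of (120) we have
\[
\begin{aligned}
\sup_{b\in B}\sum_{c\in C}|Q^{r,p}_{b,c}|
&=\sup_{b\in B}\sum_{c\in C}(1+|b-c|^{2})\Bigl|\frac{\partial^{r+p}T_{b,c}}{\partial k_1^{r}\partial k_2^{p}}\Bigr| \\
&\le \sup_{b\in B}\sum_{c\in C}(1+|b-c|^{2})
\Bigl[(r+p)!\Bigl(\frac{7}{\varepsilon}\Bigr)^{r+p+1}|\hat A(b-c)| + \frac{C_{\varepsilon,r,p}}{|z|}\bigl(|\hat A(b-c)| + |\hat q(b-c)|\bigr)\Bigr] \\
&\le (r+p)!\Bigl(\frac{7}{\varepsilon}\Bigr)^{r+p+1}\|(1+b^{2})\hat A(b)\|_{l^1} + C_{\varepsilon,A,q,r,p}\,\frac{1}{|z|},
\end{aligned}
\]
and similarly we estimate $\sup_{c\in C}\sum_{b\in B}|Q^{r,p}_{b,c}|$. Now observe that, as in (78), for any integer $m \ge 0$ and for any $\xi_0, \xi_1, \dots, \xi_{m+2} \in \Gamma^{\#}$, let $b = \xi_0$ and $c = \xi_{m+2}$. Then,
\[
|b-c|^{2} \le 2(m+2)\sum_{i=1}^{m+2}|\xi_{i-1}-\xi_i|^{2}.
\]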
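The telescoping inequality above follows from the Cauchy–Schwarz inequality applied to $b - c = \sum_{i}(\xi_{i-1}-\xi_i)$, which in fact gives the constant $m+2$; the factor $2(m+2)$ leaves room to absorb the "$1+$" terms in the products that follow. A small randomized check, using integer points of the plane as a stand-in for the lattice $\Gamma^{\#}$ (an assumption made only for this illustration):

```python
import random


def squared_distance(u, v):
    """Squared Euclidean distance between two points of the plane."""
    return (u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2


def chain_inequality_holds(points):
    """points = [xi_0, ..., xi_{m+2}]; test |xi_0 - xi_{m+2}|^2 <= 2(m+2) * sum_i |xi_{i-1} - xi_i|^2."""
    steps = len(points) - 1  # this is m + 2
    lhs = squared_distance(points[0], points[-1])
    rhs = 2 * steps * sum(squared_distance(points[i - 1], points[i]) for i in range(1, steps + 1))
    return lhs <= rhs


rng = random.Random(0)
for _ in range(1000):
    m = rng.randint(0, 6)
    chain = [(rng.randint(-20, 20), rng.randint(-20, 20)) for _ in range(m + 3)]
    assert chain_inequality_holds(chain)
```

Integer coordinates keep the comparison exact, so the check does not depend on floating-point rounding.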

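To simplify the notation write $\partial^{l_i,n_i} = \frac{\partial^{l_i+n_i}}{\partial k_1^{l_i}\partial k_2^{n_i}}$, and recall from (113) and (123) the definition of $T_{(i)}$ and $\Omega_{r,p}$. Hence, similarly as in the proof of Proposition 17, since $|b-c| \ge R/4$ for all $b \in G_1$ and $c \in G_4$,
\[
\begin{aligned}
\sup_{b\in G_1}\sum_{c\in G_4}\Bigl|\Bigl(\prod_{i=2}^{m+2}\partial^{l_i,n_i}T_{(i)}\Bigr)_{b,c}\Bigr|
&\le \sup_{\substack{b\in G_1\\ c\in G_4}}\frac{1}{1+|b-c|^{2}}\ \sup_{b\in G_1}\sum_{c\in G_4}(1+|b-c|^{2})\Bigl|\Bigl(\prod_{i=2}^{m+2}\partial^{l_i,n_i}T_{(i)}\Bigr)_{b,c}\Bigr| \\
&\le \frac{2(m+2)}{1+\frac{1}{16}R^{2}}\ \sup_{b\in G_1}\sum_{\xi_1\in G_3}(1+|b-\xi_1|^{2})\,|\partial^{l_2,n_2}T_{b,\xi_1}| \\
&\qquad\times\sum_{\xi_2\in G_3}(1+|\xi_1-\xi_2|^{2})\,|\partial^{l_3,n_3}T_{\xi_1,\xi_2}|\ \cdots \\
&\qquad\times\sum_{c\in G_4}(1+|\xi_{m+1}-c|^{2})\,|\partial^{l_{m+2},n_{m+2}}T_{\xi_{m+1},c}|
\end{aligned}
\]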


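\[
\begin{aligned}
&\le \frac{2(m+2)}{1+\frac{1}{16}R^{2}}\ \sup_{b\in G_1}\sum_{\xi_1\in G_3}(1+|b-\xi_1|^{2})\,|\partial^{l_2,n_2}T_{b,\xi_1}| \\
&\qquad\times\sup_{\xi_1\in G_3}\sum_{\xi_2\in G_3}(1+|\xi_1-\xi_2|^{2})\,|\partial^{l_3,n_3}T_{\xi_1,\xi_2}|\ \cdots \\
&\qquad\times\sup_{\xi_{m+1}\in G_3}\sum_{c\in G_4}(1+|\xi_{m+1}-c|^{2})\,|\partial^{l_{m+2},n_{m+2}}T_{\xi_{m+1},c}| \\
&= \frac{2(m+2)}{1+\frac{1}{16}R^{2}}\ \sup_{b\in G_1}\sum_{\xi_1\in G_3}|Q^{l_2,n_2}_{b,\xi_1}|\ \cdots\ \sup_{\xi_{m+1}\in G_3}\sum_{c\in G_4}|Q^{l_{m+2},n_{m+2}}_{\xi_{m+1},c}| \\
&\le \frac{2(m+2)}{1+\frac{1}{16}R^{2}}\prod_{i=2}^{m+2}\Omega_{l_i,n_i}
\end{aligned}
\]
and similarly
\[
\sup_{c\in G_4}\sum_{b\in G_1}\Bigl|\Bigl(\prod_{i=2}^{m+2}\partial^{l_i,n_i}T_{(i)}\Bigr)_{b,c}\Bigr|
\le \frac{2(m+2)}{1+\frac{1}{16}R^{2}}\prod_{i=2}^{m+2}\Omega_{l_i,n_i}.
\]
Therefore, by Proposition 5,
\[
\Bigl\|\pi_{G_1}\prod_{i=2}^{m+2}\frac{\partial^{l_i+n_i}T_{(i)}}{\partial k_1^{l_i}\partial k_2^{n_i}}\Bigr\|
\le \frac{2(m+2)}{1+\frac{1}{16}R^{2}}\prod_{i=2}^{m+2}\Omega_{l_i,n_i}.
\]
We have all we need to bound (112).

Step 3d. From (121) and (119) it follows that
\[
\Bigl\|\prod_{i=m+3}^{j+2}\frac{\partial^{l_i+n_i}T_{(i)}}{\partial k_1^{l_i}\partial k_2^{n_i}}\Bigr\| \le \prod_{i=m+3}^{j+2}\Theta_{l_i,n_i}
\qquad\text{and}\qquad
\Bigl\|\frac{\partial^{l_1+n_1}T_{(1)}}{\partial k_1^{l_1}\partial k_2^{n_1}}\Bigr\| \le (l_1+n_1)!\Bigl(\frac{7}{\varepsilon}\Bigr)^{l_1+n_1+1}\frac{1}{|z|}.
\]
Thus, recalling (112) we get
\[
\begin{aligned}
\Bigl\|\frac{\partial^{r+p}}{\partial k_1^{r}\partial k_2^{p}}\,\Delta_k^{-1}\pi_{G_1}T_{33}^{m}T_{34}W_{43}^{(j-m-1)}\Bigr\|
&\le (j+2)^{r+p}\,\sup_I\,\Bigl\|\Bigl(\prod_{i=1}^{j+2}\frac{\partial^{l_i+n_i}T_{(i)}}{\partial k_1^{l_i}\partial k_2^{n_i}}\Bigr)\pi_{G_3}\Bigr\| \\
&\le (j+2)^{r+p}\,\sup_I\,\Bigl\{(l_1+n_1)!\Bigl(\frac{7}{\varepsilon}\Bigr)^{l_1+n_1+1}\frac{1}{|z|}\,\frac{2(m+2)}{1+\frac{1}{16}R^{2}}
\Bigl[\prod_{i=2}^{m+2}\Omega_{l_i,n_i}\Bigr]\Bigl[\prod_{i=m+3}^{j+2}\Theta_{l_i,n_i}\Bigr]\Bigr\} \\
&\le (j+2)^{r+p}(m+2)\,\frac{C}{|z|R^{2}}\,\sup_I\,(l_1+n_1)!\Bigl(\frac{7}{\varepsilon}\Bigr)^{l_1+n_1+1}
\Bigl[\prod_{i=2}^{m+2}\Omega_{l_i,n_i}\Bigr]\Bigl[\prod_{i=m+3}^{j+2}\Theta_{l_i,n_i}\Bigr],
\end{aligned}
\]
where $C$ is a universal constant. Now, recall the definition of $\Theta_{r,p}$ and $\Omega_{r,p}$ in (122) and (123), observe that $\|\hat A\|_{l^1} < \|(1+b^{2})\hat A(b)\|_{l^1}$, and let $\rho_2 = \rho_{2;\varepsilon,A,q,r,p}$ be a sufficiently large constant such that, for $|z| > \rho_2$ and for any $l_i \le r$ and any $n_i \le p$,
\[
\Theta_{l_i,n_i},\ \Omega_{l_i,n_i} \le 2\,(l_i+n_i)!\Bigl(\frac{7}{\varepsilon}\Bigr)^{l_i+n_i+1}\|(1+b^{2})\hat A(b)\|_{l^1}.
\]
Then,
\[
\begin{aligned}
\Bigl\|\frac{\partial^{r+p}}{\partial k_1^{r}\partial k_2^{p}}\,\Delta_k^{-1}\pi_{G_1}T_{33}^{m}T_{34}W_{43}^{(j-m-1)}\Bigr\|
&\le (j+2)^{r+p}(m+2)\,\frac{C}{|z|R^{2}}\,\sup_I\,(l_1+n_1)!\Bigl(\frac{7}{\varepsilon}\Bigr)^{l_1+n_1+1}
\Bigl[\prod_{i=2}^{m+2}\Omega_{l_i,n_i}\Bigr]\Bigl[\prod_{i=m+3}^{j+2}\Theta_{l_i,n_i}\Bigr] \\
&\le (j+2)^{r+p}\,\frac{(m+2)\,C}{|z|R^{2}}\,\bigl(2\,\|(1+b^{2})\hat A(b)\|_{l^1}\bigr)^{j+1}
\,\sup_I\,\Bigl(\frac{7}{\varepsilon}\Bigr)^{\sum_{i=1}^{j+2}(l_i+n_i)}\Bigl(\frac{7}{\varepsilon}\Bigr)^{j+2}\prod_{i=1}^{j+2}(l_i+n_i)! \\
&\le C\,(r+p)!\Bigl(\frac{7}{\varepsilon}\Bigr)^{r+p+1}(m+2)(j+2)^{r+p}\Bigl(\frac{14}{\varepsilon}\,\|(1+b^{2})\hat A(b)\|_{l^1}\Bigr)^{j+1}\frac{1}{|z|R^{2}} \\
&\le \frac{C_{\varepsilon,r,p}}{|z|R^{2}}\,(m+2)(j+2)^{r+p}\Bigl(\frac{4}{9}\Bigr)^{j+1},
\end{aligned}
\]
where the third inequality uses $\sum_{i=1}^{j+2} l_i = r$, $\sum_{i=1}^{j+2} n_i = p$ and $\prod_{i=1}^{j+2}(l_i+n_i)! \le (r+p)!$, and the last one holds since $\|(1+b^{2})\hat A(b)\|_{l^1} < 2\varepsilon/63$. This establishes a bound for (112).

Step 4. We now apply the last inequality to derive an estimate for the derivatives of $R_3$ and complete the proof of the lemma for $j = 3$. Recall from (76) that
\[
X_{33}^{(j)} = \sum_{m=0}^{j-1}T_{33}^{m}\,T_{34}\,W_{43}^{(j-m-1)}.
\]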


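Then,
\[
\begin{aligned}
\Bigl\|\frac{\partial^{r+p}}{\partial k_1^{r}\partial k_2^{p}}\,\pi_{G_1}\Delta_k^{-1}X_{33}^{(j)}\Bigr\|
&\le \sum_{m=0}^{j-1}\Bigl\|\frac{\partial^{r+p}}{\partial k_1^{r}\partial k_2^{p}}\,\Delta_k^{-1}\pi_{G_1}T_{33}^{m}T_{34}W_{43}^{(j-m-1)}\Bigr\| \\
&\le \sum_{m=0}^{j-1}\frac{C_{\varepsilon,r,p}}{|z|R^{2}}\,(m+2)(j+2)^{r+p}\Bigl(\frac{4}{9}\Bigr)^{j+1} \\
&\le \frac{C_{\varepsilon,r,p}}{|z|R^{2}}\,(j+2)^{r+p}\Bigl(\frac{4}{9}\Bigr)^{j+1}\sum_{m=0}^{j-1}(m+2)
= \frac{C_{\varepsilon,r,p}}{|z|R^{2}}\,(j+2)^{r+p}\Bigl(\frac{4}{9}\Bigr)^{j+1}\frac{1}{2}(j^{2}+3j).
\end{aligned}
\]
Thus, since $G_1 \subset G_3$,
\[
\begin{aligned}
\Bigl\|\pi_{G_1}\frac{\partial^{r+p}}{\partial k_1^{r}\partial k_2^{p}}\Bigl(\Delta_k^{-1}\sum_{j=1}^{\infty}X_{33}^{(j)}\Bigr)\pi_{G_1}\Bigr\|
&\le \sum_{j=1}^{\infty}\Bigl\|\frac{\partial^{r+p}}{\partial k_1^{r}\partial k_2^{p}}\,\pi_{G_1}\Delta_k^{-1}X_{33}^{(j)}\Bigr\| \\
&\le \frac{C_{\varepsilon,r,p}}{|z|R^{2}}\sum_{j=1}^{\infty}(j+2)^{r+p}\Bigl(\frac{4}{9}\Bigr)^{j+1}\frac{1}{2}(j^{2}+3j)
\le C\,C_{\varepsilon,r,p}\,\frac{1}{|z|R^{2}},
\end{aligned}
\]
where $C$ is a universal constant. Therefore,
\[
\begin{aligned}
\Bigl|\frac{\partial^{r+p}}{\partial k_1^{r}\partial k_2^{p}}R_3(k)\Bigr|
&=\Bigl|F_{\{d'\}G_1}\,\frac{\partial^{r+p}}{\partial k_1^{r}\partial k_2^{p}}\Bigl(\Delta_k^{-1}\sum_{j=1}^{\infty}X_{33}^{(j)}\Bigr)G_{G_1\{d'\}}\Bigr| \\
&\le \|F_{\{d'\}G_1}\|\,\Bigl\|\pi_{G_1}\frac{\partial^{r+p}}{\partial k_1^{r}\partial k_2^{p}}\Bigl(\Delta_k^{-1}\sum_{j=1}^{\infty}X_{33}^{(j)}\Bigr)\pi_{G_1}\Bigr\|\,\|G_{G_1\{d'\}}\|
\le C\,C_{\varepsilon,r,p}\,\|f\|_{l^1}\|g\|_{l^1}\,\frac{1}{|z|R^{2}}.
\end{aligned}
\]
Finally, combining all the estimates we have
\[
\Bigl|\frac{\partial^{n+m}}{\partial k_1^{n}\partial k_2^{m}}\alpha^{(3)}_{\mu,d'}(k)\Bigr|
\le \sum_{j=1}^{4}\Bigl|\frac{\partial^{n+m}}{\partial k_1^{n}\partial k_2^{m}}R_j(k)\Bigr|
\le 3\,\frac{C}{|z|R^{2}} + \frac{C}{|z|_R^{3}} \le \frac{4C}{|z|R^{2}},
\]
where $C = C_{\varepsilon,\Lambda,A,q,f,g,m,n}$ is a constant. Set $\rho_{\varepsilon,A,q,m,n} := \max\{\rho_{1;\varepsilon,m,n},\ \rho_{2;\varepsilon,A,q,m,n}\}$. The proof of the lemma for $j = 3$ is complete.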
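The arithmetic behind Step 4 is elementary and can be verified exactly: the inner sum $\sum_{m=0}^{j-1}(m+2)$ equals $\tfrac12(j^2+3j)$, the ratio $\tfrac{14}{\varepsilon}\cdot\tfrac{2\varepsilon}{63}$ equals $\tfrac49$, and the series over $j$ converges because of the geometric factor. The sketch below uses illustrative function names and an arbitrary sample value of $r+p$.

```python
from fractions import Fraction


def inner_sum(j):
    """Sum of (m + 2) for m = 0, ..., j - 1."""
    return sum(m + 2 for m in range(j))


def closed_form(j):
    """The closed form (j^2 + 3j)/2 used in the text."""
    return Fraction(j * j + 3 * j, 2)


def series_partial_sum(r_plus_p, terms):
    """Partial sum of sum_j (j+2)^(r+p) (4/9)^(j+1) (j^2+3j)/2 over j = 1, ..., terms."""
    return sum(
        Fraction(j + 2) ** r_plus_p * Fraction(4, 9) ** (j + 1) * closed_form(j)
        for j in range(1, terms + 1)
    )


for j in range(1, 21):
    assert inner_sum(j) == closed_form(j)

# (14/eps) * (2*eps/63) does not depend on eps:
assert Fraction(14) * Fraction(2, 63) == Fraction(4, 9)

# The tail between 100 and 200 terms is already negligible for r + p = 2.
tail = series_partial_sum(2, 200) - series_partial_sum(2, 100)
assert 0 < tail < Fraction(1, 10 ** 20)
```

The value of the series depends on $r+p$, so in practice the "universal" constant can be thought of as absorbed together with $C_{\varepsilon,r,p}$.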


References

[1] J. Feldman, H. Knörrer and E. Trubowitz, Riemann Surfaces of Infinite Genus, CRM Monograph Series (Amer. Math. Soc., 2003).
[2] D. Gieseker, H. Knörrer and E. Trubowitz, The Geometry of Algebraic Fermi Curves, Perspectives in Mathematics, Vol. 14 (Academic Press, Inc., 1993).
[3] H. Knörrer and E. Trubowitz, A directional compactification of the complex Bloch variety, Comment. Math. Helv. 65 (1990) 114–149.
[4] I. Krichever, Spectral theory of two-dimensional periodic operators and its applications, Russian Math. Surveys 44(2) (1989) 145–225.
[5] H. McKean, Integrable systems and algebraic curves, in Global Analysis (Proc. Biennial Sem. Canad. Math. Congr. Univ. Calgary, 1978), Lecture Notes in Math., Vol. 755 (Springer, 1979), pp. 83–200.
[6] J. Feldman, H. Knörrer and E. Trubowitz, Asymmetric Fermi surfaces for magnetic Schrödinger operators, Comm. Partial Differential Equations 26 (2000) 319–336.
[7] Y. Karpeshina, Spectral properties of the periodic magnetic Schrödinger operator in the high-energy region. Two-dimensional case, Comm. Math. Phys. 251 (2004) 473–514.
[8] L. Erdős, Recent developments in quantum mechanics with magnetic fields, in Spectral Theory and Mathematical Physics: A Festschrift in Honor of Barry Simon’s 60th Birthday, Proc. Sympos. Pure Math., Vol. 76, Part 1 (Amer. Math. Soc., 2007), pp. 401–428.
[9] M. Reed and B. Simon, Methods of Modern Mathematical Physics IV: Analysis of Operators (Academic Press, 1978).
[10] P. Kuchment, Floquet Theory for Partial Differential Equations (Birkhäuser, 1993).
[11] W. Magnus and S. Winkler, Hill’s Equation (Dover, 2004).
[12] S. Gustafson and I. Sigal, Mathematical Concepts of Quantum Mechanics (Springer, 2006).
[13] G. de Oliveira, Asymptotics for Fermi curves of electric and magnetic periodic fields, Ph.D. thesis, The University of British Columbia (2009); http://hdl.handle.net/2429/11114.

September 14, J070-S0129055X10004119

2010 13:30 WSPC/S0129-055X

148-RMP

Reviews in Mathematical Physics Vol. 22, No. 8 (2010) 963–993
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004119

THE 3D SPIN GEOMETRY OF THE QUANTUM TWO-SPHERE

SIMON BRAIN∗,‡ and GIOVANNI LANDI∗,†,§

∗Dipartimento di Matematica e Informatica, Università di Trieste, Via A. Valerio 12/1, 34127 Trieste, Italy
†INFN, Sezione di Trieste, Trieste, Italy
‡[email protected]
§[email protected]

Received 23 March 2010

We study a three-dimensional differential calculus $\Omega^1 S_q^2$ on the standard Podleś quantum two-sphere $S_q^2$, coming from the Woronowicz $4D_+$ differential calculus on the quantum group $SU_q(2)$. We use a frame bundle approach to give an explicit description of $\Omega^1 S_q^2$ and its associated spin geometry in terms of a natural spectral triple over $S_q^2$. We equip this spectral triple with a real structure for which the commutant property and the first order condition are satisfied up to infinitesimals of arbitrary order.

Keywords: Noncommutative geometry; spectral triples; quantum groups; quantum spheres.

Mathematics Subject Classification 2010: 58B34, 17B37

1. Introduction

The standard quantum two-sphere $S_q^2$ has proven to be one of the most important and useful examples in trying to understand the relationship between the geometric/analytic world of noncommutative geometry and the algebraic setting of quantum group theory. At the algebraic level, it is known that $S_q^2$ has a unique left-covariant two-dimensional differential calculus [17, 18]. On the other hand, it is known that this same calculus is recovered via analytic techniques by means of a noncommutative spin geometry [4, 20]. This compatibility has led to the discovery of other noncommutative two-dimensional geometries on $S_q^2$ with a range of interesting properties [7]. In this paper, we extend the investigation to the noncommutative spin geometry of a differential calculus on $S_q^2$ whose dimension is equal to three.

Quantum two-spheres were constructed and classified by Podleś in [16]. The standard sphere $S_q^2$ is unique amongst the Podleś family in that it also appears as the base space of the noncommutative Hopf fibration $SU_q(2) \to S_q^2$ constructed in [1] as a basic example of a quantum principal bundle. By equipping the total space $SU_q(2)$ with the 3D differential calculus of [22], one finds that the two-dimensional


differential calculus on $S_q^2$ appears as an associated vector bundle. This “quantum frame bundle” approach to noncommutative geometry, developed in [13, 14], has been applied successfully to study a host of examples, not least the two-dimensional geometry of the quantum sphere $S_q^2$ itself.

The present paper also uses the frame bundle approach to study the geometry of $S_q^2$, but this time starting with the $4D_+$ differential calculus on $SU_q(2)$ of [22]. This calculus has the advantage of being bicovariant under both left and right translations, in contrast with the 3D calculus, which is only left-covariant. Using the framing theory we recover the three-dimensional differential calculus $\Omega^1 S_q^2$ of [9, 10, 17] on $S_q^2$. The methods we use are well-adapted to the principal bundle structure and as a consequence we immediately find an explicit description of the bimodule relations in $\Omega^1 S_q^2$, including a decomposition into irreducible components. We do not discuss the deeper aspects of the Riemannian geometry such as Hodge structure and connection theory: these will be developed elsewhere [12].

Our main results concern the spin geometry of the three-dimensional calculus $\Omega^1 S_q^2$. Remarkably, we find that the spinor bundle of $S_q^2$ is unchanged from the one used in [4, 14, 20] for the two-dimensional calculus. We construct a Dirac operator $D$ which implements the exterior derivative in $\Omega^1 S_q^2$, finding that the eigenvalues of $|D|$ grow not faster than $q^{-2j}$ for large $j$ and hence that the associated spectral triple has metric dimension zero. Moreover, we equip this spectral triple with a $\mathbb{Z}_2$-grading operator and a real structure which is defined “up to compact operators”, in the sense that the “commutant property” and the “first order condition” for a real spectral triple [3] are satisfied up to infinitesimals of arbitrary order. As we shall see, this is in contrast with [4], where a “true” real structure for the “two-dimensional” calculus on $S_q^2$ was given (cf. also [20]), but is parallel to the results of [7] for the sphere $S_q^2$. We also find that the “KO-theoretic” dimension of this real spectral triple is equal to the classical value, just two.

The paper is organized as follows. In Sec. 2, we give a brief overview of the construction of quantum differential calculi on quantum groups and their homogeneous spaces, followed by the general quantum frame bundle construction itself. Following this, Sec. 3 recalls the elementary geometry of the Hopf fibration $SU_q(2) \to S_q^2$ and the Hopf algebra $U_q(su(2))$ which describes its symmetries. In Sec. 4, we describe the differential structure of the Hopf fibration. We start from the 4D quantum differential calculus on the total space $SU_q(2)$ from which we derive the calculus on the bundle fiber $U(1)$. The structure of the calculus $\Omega^1 S_q^2$ is then obtained as a “framed quantum manifold” in the sense of [14]. Finally, in Sec. 5 we construct our spectral triple $(A[S_q^2], H, D)$ over $S_q^2$, which in addition we equip with a $\mathbb{Z}_2$-grading $\Gamma$ of the spinor bundle $H$ and a real structure $J: H \to H$.

Notation. In this paper, we make frequent use of the “q-numbers” defined by
\[
[x] := \frac{q^{x} - q^{-x}}{q - q^{-1}}
\tag{1.1}
\]


for each $x \in \mathbb{R}$ and $q \neq 1$. Furthermore, for the sake of brevity we introduce the constants
\[
\mu := q + q^{-1}, \qquad \nu := q - q^{-1}
\tag{1.2}
\]
to be used throughout the paper. Our convention is that $\mathbb{N} = \{0, 1, 2, \ldots\}$.

2. Preliminaries on Quantum Principal Bundles

We start with some generalities on differential calculi and quantum principal bundles. These will be endowed both with universal and non-universal compatible calculi.

2.1. Differential structures

Let $P$ be a complex ∗-algebra with unit. A first order differential calculus over $P$ is a pair $(\Omega^1 P, d)$ where $\Omega^1 P$ is a $P$-$P$-bimodule (the one-forms) and $d: P \to \Omega^1 P$ is a linear map obeying the Leibniz rule
\[
d(ab) = a(db) + (da)b, \qquad a, b \in P,
\]
and such that the map $P \otimes P \to \Omega^1 P$ defined by $a \otimes b \mapsto a\,db$ is surjective. The universal differential calculus over $P$ is the pair $(\tilde\Omega^1 P, \tilde d)$, where $\tilde\Omega^1 P := \ker m$ is the kernel of the product map $m: P \otimes P \to P$ on $P$, with obvious bimodule structure
\[
p \cdot (a \otimes b) = pa \otimes b, \qquad (a \otimes b) \cdot p = a \otimes bp, \qquad a, b, p \in P,
\]
and $\tilde d$ is defined by $\tilde d p := 1 \otimes p - p \otimes 1$, for each $p \in P$. It is so-called because any other differential calculus $(\Omega^1 P, d)$ over $P$ arises as a quotient $\Omega^1 P = \tilde\Omega^1 P / N_P$, where $N_P$ is some $P$-$P$-sub-bimodule of $\tilde\Omega^1 P$. With the projection $\pi_P: \tilde\Omega^1 P \to \Omega^1 P$ one has $d = \pi_P \circ \tilde d$.
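As a short worked check (not part of the original text), the defining properties of the universal calculus can be verified directly from the formula $\tilde d p = 1 \otimes p - p \otimes 1$ and the bimodule structure above:

```latex
% \tilde d p lies in the kernel of the product map m:
m(\tilde d p) = m(1 \otimes p) - m(p \otimes 1) = p - p = 0 .

% Leibniz rule, using  a \cdot (x \otimes y) = ax \otimes y  and  (x \otimes y) \cdot b = x \otimes yb :
a\,(\tilde d b) + (\tilde d a)\,b
  = (a \otimes b - ab \otimes 1) + (1 \otimes ab - a \otimes b)
  = 1 \otimes ab - ab \otimes 1
  = \tilde d(ab) .

% Surjectivity of  a \otimes b \mapsto a\,\tilde d b  onto \ker m:
% if \sum_i a_i \otimes b_i \in \ker m, i.e. \sum_i a_i b_i = 0, then
\sum_i a_i\,\tilde d b_i
  = \sum_i a_i \otimes b_i - \Bigl(\sum_i a_i b_i\Bigr) \otimes 1
  = \sum_i a_i \otimes b_i .
```

The last line shows that every element of $\ker m$ is a sum of terms of the form $a\,\tilde d b$.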

If $H$ is a Hopf algebra, we write $m_H: H \otimes H \to H$ and $1_H$ for its product and unit, $\Delta_H: H \to H \otimes H$ and $\epsilon_H: H \to \mathbb{C}$ for its coproduct and counit and $S_H: H \to H$ for its antipode (when there is no possibility of confusion, we omit the subscript $H$). We use Sweedler notation $\Delta(h) = h_{(1)} \otimes h_{(2)}$ for the coproduct. A differential calculus $\Omega^1 H$ over a Hopf algebra $H$ is said to be left-covariant if the coproduct $\Delta$, viewed as a left coaction of $H$ on itself, extends to a left coaction $\Delta_L: \Omega^1 H \to H \otimes \Omega^1 H$ such that $d$ is an intertwiner and $\Delta_L$ is a bimodule map:
\[
\Delta_L(dh) = (\mathrm{id} \otimes d)\Delta_L(h), \qquad
\Delta_L(h\omega) = \Delta(h) \cdot \Delta_L(\omega), \qquad
\Delta_L(\omega h) = \Delta_L(\omega) \cdot \Delta(h)
\]
for all $h \in H$, $\omega \in \Omega^1 H$. A similar definition holds for a right-covariant calculus, now with a right coaction $\Delta_R: \Omega^1 H \to \Omega^1 H \otimes H$. A calculus is said to be bicovariant if it is both left and right covariant with commuting coactions. The universal


calculus $\tilde\Omega^1 H$ is bicovariant when equipped with the left and right tensor product coactions on $H \otimes H$.

Left-covariant differential calculi on a Hopf algebra $H$ are classified as follows after [22]. First, it may be shown that the linear map
\[
r: H \otimes H \to H \otimes H, \qquad r(a \otimes b) := ab_{(1)} \otimes b_{(2)},
\tag{2.1}
\]
is an isomorphism with inverse
\[
r^{-1}: H \otimes H \to H \otimes H, \qquad r^{-1}(a \otimes b) = aS(b_{(1)}) \otimes b_{(2)}.
\tag{2.2}
\]
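To see the maps (2.1) and (2.2) at work, one can specialise to the simplest kind of Hopf algebra: the group algebra of a finite cyclic group, where $\Delta(h) = h \otimes h$, $\epsilon(h) = 1$ and $S(h) = h^{-1}$ on group elements. This choice of Hopf algebra, the dictionary representation of tensors and all function names below are illustrative assumptions, not part of the paper.

```python
N = 5  # order of the cyclic group Z_N; an illustrative choice


def _collect(pairs):
    """Sum the coefficients of equal keys and drop zero coefficients."""
    out = {}
    for key, coeff in pairs:
        out[key] = out.get(key, 0) + coeff
    return {key: coeff for key, coeff in out.items() if coeff != 0}


def r(tensor):
    """r(a ⊗ b) = a b_(1) ⊗ b_(2); on group elements this is g ⊗ h -> gh ⊗ h."""
    return _collect((((g + h) % N, h), c) for (g, h), c in tensor.items())


def r_inverse(tensor):
    """r^{-1}(a ⊗ b) = a S(b_(1)) ⊗ b_(2); on group elements this is g ⊗ h -> g h^{-1} ⊗ h."""
    return _collect((((g - h) % N, h), c) for (g, h), c in tensor.items())


def product_map(tensor):
    """m(a ⊗ b) = ab."""
    return _collect(((g + h) % N, c) for (g, h), c in tensor.items())


def id_tensor_counit(tensor):
    """(id ⊗ ε)(a ⊗ b) = a ε(b), with ε(h) = 1 on group elements."""
    return _collect((g, c) for (g, h), c in tensor.items())


def universal_d(g):
    """Universal differential of a group element: 1 ⊗ g - g ⊗ 1 (the identity is 0 in Z_N)."""
    return _collect([((0, g), 1), ((g, 0), -1)])


# Tensors are dictionaries {(g, h): coefficient} for the basis element g ⊗ h.
t = {(1, 2): 3, (4, 4): -2, (0, 3): 5}
assert r_inverse(r(t)) == t and r(r_inverse(t)) == t
# m = (id ⊗ ε) ∘ r, so r carries ker m into H ⊗ ker ε:
assert product_map(t) == id_tensor_counit(r(t))
assert product_map(universal_d(2)) == {}
assert id_tensor_counit(r(universal_d(2))) == {}
```

The identity $m = (\mathrm{id} \otimes \epsilon) \circ r$ checked here holds in any Hopf algebra by the counit axiom, and it is the reason $r$ restricts to a map from $\ker m$ into $H \otimes \ker\epsilon$.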

Upon restricting $r$ to the universal calculus $\tilde\Omega^1 H$ we obtain an isomorphism
\[
r: \tilde\Omega^1 H \to H \otimes H^{+},
\]
where $H^{+} := \ker \epsilon_H$ denotes the augmentation ideal of $H$. This is in fact an isomorphism of $H$-$H$-bimodules if we equip $H \otimes H^{+}$ with the bimodule structure
\[
a \cdot (b \otimes \omega) = ab \otimes \omega, \qquad (a \otimes \omega) \cdot b = ab_{(1)} \otimes \omega b_{(2)}, \qquad a, b \in H,\ \omega \in H^{+},
\tag{2.3}
\]
and an isomorphism of $H$-$H$-bicomodules if we equip $H \otimes H^{+}$ with the bicomodule structure
\[
\Delta_L(a \otimes \omega) = a_{(1)} \otimes (a_{(2)} \otimes \omega), \qquad
\Delta_R(a \otimes \omega) = (a_{(1)} \otimes \omega_{(1)}) \otimes a_{(2)}\omega_{(2)}, \qquad a \in H,\ \omega \in H^{+}.
\]
Any left-covariant sub-bimodule $N_H$ of $\tilde\Omega^1 H$ is carried to a right ideal $I_H$ of $H^{+}$ by the map $r$ in (2.1). Conversely, any right ideal $I_H$ arises in this way from a left-covariant sub-bimodule of $\tilde\Omega^1 H$. It follows that the left-covariant differential calculi on $H$ are in one-to-one correspondence with right ideals $I_H \subset H^{+}$; indeed, given such an $I_H$, one has $\Omega^1 H \cong H \otimes \Lambda^1$, where $\Lambda^1 \cong H^{+}/I_H$ are the left-invariant one-forms. We also write $\Omega^1_{\mathrm{inv}} H := r^{-1}(\Lambda^1)$.

A left-covariant sub-bimodule $N_H$ is also right-covariant if and only if the corresponding ideal $I_H$ is stable under the right adjoint coaction
\[
\mathrm{Ad}_R: H \to H \otimes H, \qquad \mathrm{Ad}_R(a) = a_{(2)} \otimes S(a_{(1)})a_{(3)},
\]
in the sense that $\mathrm{Ad}_R(I_H) \subset I_H \otimes H$. It follows that bicovariant calculi on $H$ are in one-to-one correspondence with right ideals $I_H$ of $H^{+}$ which are $\mathrm{Ad}_R$-stable [22].

Given a left-covariant differential calculus $\Omega^1 H$ over $H$, the quantum tangent space of $\Omega^1 H$ is the vector space
\[
T_H := \{X \in H' \mid X(1) = 0 \text{ and } X(a) = 0 \text{ for all } a \in I_H\},
\tag{2.4}
\]
where the vector space $H'$ is the linear dual of $H$. This tangent space admits many properties analogous to the classical case, in particular there exists a unique bilinear


form $\langle \cdot \mid \cdot \rangle: T_H \times \Omega^1 H \to \mathbb{C}$ such that
\[
\langle X \mid a\,db \rangle = \epsilon_H(a)X(b), \qquad a, b \in H,\ X \in T_H.
\tag{2.5}
\]
With respect to this bilinear form, the vector spaces $\Omega^1_{\mathrm{inv}} H$ and $T_H$ are non-degenerately paired, so that $\dim \Omega^1_{\mathrm{inv}} H = \dim T_H = \dim \Lambda^1$. This number is said to be the dimension of the left-covariant differential calculus $\Omega^1 H$.

2.2. Quantum principal bundles

The general set-up for a principal fibration of noncommutative spaces is an algebra $P$ (playing the role of the algebra of functions on the total space) which is a right comodule algebra for a Hopf algebra $H$ with coaction $\delta_R: P \to P \otimes H$. The algebra of functions on the base space of the fibration is the subalgebra $M$ of $P$ consisting of coinvariant elements under $\delta_R$,
\[
M := P^{H} = \{p \in P : \delta_R(p) = p \otimes 1\}.
\]
For a well-defined bundle structure at the level of universal differential calculi, one requires exactness of the following sequence [1],
\[
0 \to P(\tilde\Omega^1 M)P \xrightarrow{\ j\ } \tilde\Omega^1 P \xrightarrow{\ \mathrm{ver}\ } P \otimes H^{+} \to 0,
\tag{2.6}
\]
with $H^{+}$ the augmentation ideal, as before. The algebra inclusion $M \hookrightarrow P$ extends to an inclusion $\tilde\Omega^1 M \to \tilde\Omega^1 P$ of universal differential calculi, hence $P(\tilde\Omega^1 M)P$ are the analogues of the horizontal one-forms (classically this corresponds to the space of one-forms which have been pulled back from the base of the fibration). The map ver is defined by $\mathrm{ver}(p \otimes p') = p\,\delta_R(p')$; it is the generator of the vertical one-forms. We say that the inclusion $M \hookrightarrow P$ is a quantum principal bundle with universal calculi and structure quantum group $H$.

Requiring exactness of the sequence (2.6) is equivalent to requiring that the induced canonical map
\[
\chi: P \otimes_M P \to P \otimes H, \qquad p \otimes_M p' \mapsto p\,\delta_R(p')
\tag{2.7}
\]
be bijective. If this is the case, one also says that the triple $(P, H, M)$ is an $H$-Hopf–Galois extension. This bijection condition is enough for a principal bundle structure at the level of universal differential calculi.

For a principal bundle with non-universal calculi extra conditions are required that we briefly recall. Assume then that $P$ and $M$ are equipped with differential calculi $\Omega^1 P = \tilde\Omega^1 P / N_P$ and $\Omega^1 M = \tilde\Omega^1 M / N_M$, where $N_P$ and $N_M$ are sub-bimodules


of $\tilde\Omega^1 P$ and $\tilde\Omega^1 M$, respectively. Assume further that $H$ is equipped with a left-covariant calculus $\Omega^1 H$ corresponding to a right ideal $I_H$. Compatibility of the differential structures means that the calculi satisfy the conditions
\[
N_M = N_P \cap \tilde\Omega^1 M \qquad\text{and}\qquad \delta_R(N_P) \subset N_P \otimes H.
\tag{2.8}
\]
The role of the first condition is to ensure that $\Omega^1 M$ is spanned by elements of the form $m\,dn$ with $m, n \in M$ and is hence obtained by restricting the calculus on $P$. The second condition in (2.8) is sufficient to ensure covariance of $\Omega^1 P$. Finally, we need the sequence
\[
0 \to P(\Omega^1 M)P \to \Omega^1 P \xrightarrow{\ \mathrm{ver}\ } P \otimes \Lambda^1 \to 0
\tag{2.9}
\]
to be exact. This sequence is the analogue of the sequence (2.6) but now at the level of non-universal calculi. The $P$-$P$-bimodule $P(\Omega^1 M)P$ once again makes up the horizontal one-forms and $\mathrm{ver}(p \otimes p') = p\,\delta_R(p')$ is the canonical map which generates the vertical one-forms. The condition
\[
\mathrm{ver}(N_P) = P \otimes I_H
\tag{2.10}
\]
ensures that the map
\[
\mathrm{ver}: \Omega^1 P \to P \otimes \Lambda^1, \qquad \Lambda^1 \cong H^{+}/I_H,
\]
is well-defined and yields that the sequence (2.9) is indeed exact.

2.3. Framed quantum manifolds

Suppose that the total space $P$ of the bundle is itself a Hopf algebra equipped with a Hopf algebra surjection $\pi: P \to H$. Here we have a coaction of $H$ on $P$ by coproduct and projection to $H$,
\[
\delta_R: P \to P \otimes H, \qquad \delta_R = (\mathrm{id} \otimes \pi)\Delta.
\]
The base is then the quantum homogeneous space $M = P^{H}$ of coinvariants and the algebra inclusion $M \hookrightarrow P$ is automatically an $H$-Hopf–Galois extension, i.e. a quantum principal bundle with universal calculi. To impose non-universal differential structure, we suppose that $\Omega^1 P$ is left-covariant for $P$ and $\Omega^1 H$ is left-covariant for $H$, so that they are defined by right ideals $I_P$ and $I_H$ of $P^{+}$ and $H^{+}$, respectively. We ensure the first of (2.8) by taking it as a definition of $\Omega^1 M$; in the case at hand, the remaining compatibility conditions in (2.8)–(2.10) reduce to
\[
(\mathrm{id} \otimes \pi)\mathrm{Ad}_R(I_P) \subset I_P \otimes H, \qquad \pi(I_P) = I_H.
\tag{2.11}
\]
Thus a choice of left-covariant calculus on $P$ satisfying these conditions automatically gives a principal bundle with non-universal calculi [14].

We say that an algebra $M$ is a framed quantum manifold if it is the base of a quantum principal bundle, $M = P^{H}$, to which $\Omega^1 M$ is an associated vector bundle.


To give $M$ as a framed quantum manifold we therefore require not only a quantum principal bundle $\delta_R: P \to P \otimes H$ as above but also a right $H$-comodule $V$, so that $E := (P \otimes V)^{H}$ plays the role of the sections of the corresponding associated vector bundle (the space $P \otimes V$ is equipped with the tensor product coaction). Moreover, we require a “soldering form” $\theta: V \to P\,\Omega^1 M$ such that the map
\[
s_\theta: E \to \Omega^1 M, \qquad p \otimes v \mapsto p\,\theta(v)
\]
is an isomorphism. For a general $M$, it is usually not obvious how to go about looking for a framing. However in the case of a quantum homogeneous space with compatible calculi one has a “standard” framing in the following way [14]. If the conditions in (2.11) are satisfied then the algebra $M = P^{H}$ is automatically framed by the bundle $(P, H, M)$. The $H$-comodule $V$ and soldering form $\theta$ are given explicitly by the formulæ
\[
V = (P^{+} \cap M)/(I_P \cap M), \qquad
\Delta_R v = \tilde v_{(2)} \otimes S\pi(\tilde v_{(1)}), \qquad
\theta(v) = S\tilde v_{(1)}\,d\tilde v_{(2)},
\tag{2.12}
\]
with $\tilde v$ any representative of $v$ in $P^{+} \cap M$ and $\Delta(\tilde v) = \tilde v_{(1)} \otimes \tilde v_{(2)}$ the coproduct on $P$.

3. The Standard Podleś Sphere

We recall here some of the basic geometry of the so-called standard Podleś quantum two-sphere $S_q^2$ of [16]. We begin with the quantum group $A[SU_q(2)]$ and its symmetries $U_q(su(2))$, from which we obtain the quantum sphere $S_q^2$ as the base space of the quantum Hopf fibration $SU_q(2) \to S_q^2$. Finally we sketch the construction of a family of quantum line bundles over $S_q^2$ which shall prove useful in what is to follow.

3.1. The quantum group $SU_q(2)$

Recall that the coordinate algebra $A[M_q(2)]$ of functions on the quantum matrices $M_q(2)$ is the associative unital algebra generated by the entries of the matrix
\[
x = (x^{i}{}_{j}) = \begin{pmatrix} a & b \\ c & d \end{pmatrix}
\]
obeying the relations
\[
ab = qba, \quad ac = qca, \quad bd = qdb, \quad cd = qdc, \quad bc = cb, \quad ad - da = (q - q^{-1})bc,
\tag{3.1}
\]
with $0 \neq q \in \mathbb{C}$ a deformation parameter. The algebra $A[M_q(2)]$ has a coalgebra structure given by $\Delta(x^{i}{}_{j}) = x^{i}{}_{\mu} \otimes x^{\mu}{}_{j}$ and $\epsilon(x^{i}{}_{j}) = \delta^{i}{}_{j}$. From $A[M_q(2)]$ we obtain a

September 14, J070-S0129055X10004119

970

2010 13:30 WSPC/S0129-055X

148-RMP

S. Brain & G. Landi

Hopf algebra $A[SL_q(2)]$ upon quotienting by the determinant relation $ad = 1 + qbc$ (equivalently $da = 1 + q^{-1}bc$) and defining an antipode by

\[ S\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} d & -q^{-1}b \\ -qc & a \end{pmatrix}. \]

When the deformation parameter $q$ is taken to be real, $A[M_q(2)]$ is made into a $*$-algebra by defining the anti-linear involution

\[ x^* = \begin{pmatrix} a^* & b^* \\ c^* & d^* \end{pmatrix} := \begin{pmatrix} d & -qc \\ -q^{-1}b & a \end{pmatrix}. \tag{3.2} \]

It is not difficult to see that $A[SL_q(2)]$ inherits this $*$-structure. Without loss of generality, we take $0 < q < 1$. The compact quantum group $A[SU_q(2)]$ is defined to be the quotient of $A[SL_q(2)]$ by the additional relations $S(x^k{}_l) = (x^l{}_k)^*$. Thus in $A[SU_q(2)]$ we have

\[ x = \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a & -qc^* \\ c & a^* \end{pmatrix}. \tag{3.3} \]

The algebra relations become

\[ ac = qca, \quad ac^* = qc^*a, \quad cc^* = c^*c, \quad aa^* + q^2cc^* = 1, \quad a^*a + c^*c = 1, \tag{3.4} \]

together with their conjugates. On generators, the counit is $\epsilon(a) = \epsilon(a^*) = 1$, $\epsilon(c) = \epsilon(c^*) = 0$ and the antipode is now $S(a) = a^*$, $S(a^*) = a$, $S(c) = -qc$, $S(c^*) = -q^{-1}c^*$, while the coproduct now reads

\[ \Delta(a) = a \otimes a - qc^* \otimes c, \qquad \Delta(c) = c \otimes a + a^* \otimes c, \]
\[ \Delta(a^*) = a^* \otimes a^* - qc \otimes c^*, \qquad \Delta(c^*) = c^* \otimes a^* + a \otimes c^*. \]

3.2. The quantum universal enveloping algebra $U_q(su(2))$

The quantum universal enveloping algebra $U_q(su(2))$ is the unital $*$-algebra generated by the four elements $K$, $K^{-1}$, $E$, $F$, with $KK^{-1} = K^{-1}K = 1$, subject to the relations

\[ K^{\pm1}E = q^{\pm1}EK^{\pm1}, \qquad K^{\pm1}F = q^{\mp1}FK^{\pm1}, \qquad [E, F] = (q - q^{-1})^{-1}(K^2 - K^{-2}) \tag{3.5} \]

and the $*$-structure

\[ K^* = K, \qquad E^* = F, \qquad F^* = E. \]

It becomes a Hopf $*$-algebra when equipped with the coproduct $\Delta$ and counit $\epsilon$ defined on generators by

\[ \Delta(K^{\pm1}) = K^{\pm1} \otimes K^{\pm1}, \qquad \Delta(E) = E \otimes K + K^{-1} \otimes E, \qquad \Delta(F) = F \otimes K + K^{-1} \otimes F, \]
\[ \epsilon(K) = 1, \qquad \epsilon(E) = 0, \qquad \epsilon(F) = 0, \]


and with antipode $S$ defined by $S(K) = K^{-1}$, $S(E) = -qE$, $S(F) = -q^{-1}F$ on generators. The maps $\Delta$, $\epsilon$ are extended as $*$-algebra maps, whereas $S$ extends as a $*$-anti-algebra map. From the relations (3.5), one finds that the quadratic Casimir element

\[ C_q := FE + (q - q^{-1})^{-2}\bigl(qK^2 - 2 + q^{-1}K^{-2}\bigr) - \tfrac14 \tag{3.6} \]

generates the center of the algebra $U_q(su(2))$. The finite-dimensional irreducible $*$-representations $\pi_j$ of $U_q(su(2))$ are indexed by a half-integer $j = 0, 1/2, 1, 3/2, \ldots$ called the spin of the representation. Explicitly, these representations are given by

\[
\begin{aligned}
\pi_j(K)|j, m\rangle &= q^m |j, m\rangle, \\
\pi_j(F)|j, m\rangle &= ([j - m][j + m + 1])^{1/2}\, |j, m + 1\rangle, \\
\pi_j(E)|j, m\rangle &= ([j - m + 1][j + m])^{1/2}\, |j, m - 1\rangle,
\end{aligned}
\tag{3.7}
\]

where the vectors $|j, m\rangle$ for $m = -j, -j + 1, \ldots, j - 1, j$ form an orthonormal basis of the $(2j + 1)$-dimensional irreducible $U_q(su(2))$-module $V^j$. Moreover, $\pi_j$ is a $*$-representation with respect to the Hermitian inner product on $V^j$ for which the vectors $|j, m\rangle$ are orthonormal. In each representation, the Casimir $C_q$ of (3.6) acts as a multiple of the identity, with constant given by

\[ \pi_j(C_q) = \bigl[j + \tfrac12\bigr]^2 - \tfrac14 \tag{3.8} \]

as one may easily verify by direct computation. The Hopf $*$-algebras $A[SU_q(2)]$ and $U_q(su(2))$ are dually paired via a bilinear pairing

\[ (\,\cdot\,,\,\cdot\,)\colon U_q(su(2)) \times A[SU_q(2)] \to \mathbb{C} \tag{3.9} \]

which is non-degenerate. It is defined on generators by

\[ (K, a) = q^{-1/2}, \quad (K^{-1}, a) = q^{1/2}, \quad (K, d) = q^{1/2}, \quad (K^{-1}, d) = q^{-1/2}, \quad (E, c) = 1, \quad (F, b) = 1, \]

with all other combinations of generators pairing to give zero. The pairing is extended to products of generators via the requirements

\[
\begin{aligned}
(\Delta(X), p_1 \otimes p_2) &= (X, p_1 p_2), & (X_1 X_2, p) &= (X_1 \otimes X_2, \Delta(p)), \\
(X, 1) &= \epsilon(X), & (1, p) &= \epsilon(p),
\end{aligned}
\tag{3.10}
\]

for all $X, X_1, X_2 \in U_q(su(2))$ and all $p, p_1, p_2 \in A[SU_q(2)]$. It is compatible with the antipode and the $*$-structures in the sense that, for all $X \in U_q(su(2))$,


$p \in A[SU_q(2)]$,

\[ (S(X), p) = (X, S(p)), \qquad (X^*, p) = \overline{(X, (S(p))^*)}, \qquad (X, p^*) = \overline{((S(X))^*, p)}. \tag{3.11} \]
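As an illustration of how the generator values of the pairing interact with the relations (3.5), one can collect them into 2 × 2 matrices ρ(X) with entries ρ(X)[i][j] = (X, x^i_j). The rule (X1 X2, p) = (X1 ⊗ X2, ∆(p)) in (3.10), together with the matrix form of the coproduct on the x^i_j, then makes ρ multiplicative. The following is a minimal sketch only: the name ρ, the helper functions and the sample value q^{1/2} = 1/2 are assumptions made for this example and do not appear in the text.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lincomb(*terms):
    """Entrywise sum of (scalar, matrix) pairs."""
    return [[sum(c * M[i][j] for c, M in terms) for j in range(2)]
            for i in range(2)]

s = Fraction(1, 2)   # sample value of q^(1/2) (an assumption); any 0 < s < 1 would do
q = s * s
ONE = [[1, 0], [0, 1]]

# rho(X)[i][j] = (X, x^i_j) with x = [[a, b], [c, d]]; unlisted pairings vanish.
K = [[1 / s, 0], [0, s]]       # (K, a) = q^(-1/2), (K, d) = q^(1/2)
K_INV = [[s, 0], [0, 1 / s]]   # (K^-1, a) = q^(1/2), (K^-1, d) = q^(-1/2)
E = [[0, 0], [1, 0]]           # (E, c) = 1
F = [[0, 1], [0, 0]]           # (F, b) = 1

K2, K_INV2 = matmul(K, K), matmul(K_INV, K_INV)

def relations_hold():
    """Check the relations (3.5) on the matrices above."""
    return (
        matmul(K, K_INV) == ONE
        and matmul(K, E) == lincomb((q, matmul(E, K)))
        and matmul(K_INV, E) == lincomb((1 / q, matmul(E, K_INV)))
        and matmul(K, F) == lincomb((1 / q, matmul(F, K)))
        and matmul(K_INV, F) == lincomb((q, matmul(F, K_INV)))
        and lincomb((1, matmul(E, F)), (-1, matmul(F, E)))
            == lincomb((1 / (q - 1 / q), K2), (-1 / (q - 1 / q), K_INV2))
    )

def casimir():
    """The element C_q of (3.6) evaluated on the same matrices."""
    n = (q - 1 / q) ** 2
    return lincomb((1, matmul(F, E)), (q / n, K2), (-2 / n, ONE),
                   (1 / (q * n), K_INV2), (Fraction(-1, 4), ONE))
```

With these matrices the Casimir evaluates to 3/4 times the identity, which agrees with the value [1]^2 − 1/4 given by (3.8) for j = 1/2. Exact rational arithmetic is used so that the comparisons are equalities rather than floating-point approximations.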

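Using the pairing, there is a canonical left action of $U_q(su(2))$ on $A[SU_q(2)]$ defined by

\[ \triangleright\colon U_q(su(2)) \times A[SU_q(2)] \to A[SU_q(2)], \qquad X \triangleright p := p_{(1)}\,(X, p_{(2)}), \tag{3.12} \]

where $X \in U_q(su(2))$, $p \in A[SU_q(2)]$ and $\Delta(p) = p_{(1)} \otimes p_{(2)}$ denotes the coproduct on $A[SU_q(2)]$. In particular, this action works out on generators to be

\[
\begin{aligned}
E \triangleright a &= b, & E \triangleright c &= d, & E \triangleright b &= 0, & E \triangleright d &= 0, \\
F \triangleright b &= a, & F \triangleright d &= c, & F \triangleright a &= 0, & F \triangleright c &= 0, \\
K^{\pm1} \triangleright a &= q^{\pm1/2}a, & K^{\pm1} \triangleright c &= q^{\pm1/2}c, & K^{\pm1} \triangleright b &= q^{\mp1/2}b, & K^{\pm1} \triangleright d &= q^{\mp1/2}d.
\end{aligned}
\tag{3.13}
\]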

This action makes $A[SU_q(2)]$ into a left $U_q(su(2))$-module $*$-algebra, in the sense that

\[ X \triangleright (p_1 p_2) = (X_{(1)} \triangleright p_1)(X_{(2)} \triangleright p_2), \qquad X \triangleright 1 = \epsilon(X)1, \qquad X \triangleright p^* = ((S(X))^* \triangleright p)^* \]

for all $p, p_1, p_2 \in A[SU_q(2)]$, $X \in U_q(su(2))$. There is also a canonical right action of $U_q(su(2))$ on $A[SU_q(2)]$, defined by

\[ \triangleleft\colon A[SU_q(2)] \times U_q(su(2)) \to A[SU_q(2)], \qquad p \triangleleft X := (X, p_{(1)})\,p_{(2)} \tag{3.14} \]

for $X \in U_q(su(2))$ and $p \in A[SU_q(2)]$, with properties similar to those for the left action. These two canonical actions commute with one another.

3.3. Line bundles on the quantum sphere $S^2_q$

The coordinate algebra $H := A[U(1)]$ of the group $U(1)$ is the commutative unital $*$-algebra generated by $t$, $t^*$, subject to the relations $tt^* = t^*t = 1$. It is a Hopf algebra when equipped with the coproduct, counit and antipode

\[ \Delta(t) = t \otimes t, \qquad \epsilon(t) = 1, \qquad S(t) = t^*, \]

extended as $*$-algebra maps. There is a canonical Hopf algebra projection given on generators by

\[ \pi\colon A[SU_q(2)] \to A[U(1)], \qquad \pi\begin{pmatrix} a & b \\ c & d \end{pmatrix} := \begin{pmatrix} t & 0 \\ 0 & t^* \end{pmatrix}. \tag{3.15} \]

Using this projection a right coaction of $H = A[U(1)]$ on $P := A[SU_q(2)]$ is defined by

\[ \delta_R\colon A[SU_q(2)] \to A[SU_q(2)] \otimes A[U(1)], \qquad \delta_R(x^i{}_j) := x^i{}_\mu \otimes \pi(x^\mu{}_j). \tag{3.16} \]
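The coaction (3.16) can be evaluated on the four generators by direct substitution of (3.15). The sketch below does exactly this; the string encoding of the generators and the names `X`, `PI`, `coaction` and `DELTA_R` are assumptions made for the illustration only.

```python
# Generator matrix x = [[a, b], [c, d]] and its image under the projection pi
# of (3.15); None stands for a vanishing entry.
X = [["a", "b"], ["c", "d"]]
PI = [["t", None], [None, "t*"]]   # pi(a) = t, pi(d) = t*, pi(b) = pi(c) = 0

def coaction(i, j):
    """Nonzero terms of delta_R(x^i_j) = sum_mu x^i_mu (x) pi(x^mu_j),
    returned as (left factor, right factor) pairs."""
    return [(X[i][mu], PI[mu][j]) for mu in range(2) if PI[mu][j] is not None]

# delta_R on each generator, keyed by the generator's name.
DELTA_R = {X[i][j]: coaction(i, j) for i in range(2) for j in range(2)}
```

Each generator is mapped to a single term: a and c pick up the factor t, while b and d pick up t*.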


In fact, this coaction is the same thing as a $\mathbb{Z}$-grading on $A[SU_q(2)]$ for which the generators have degrees

\[ \deg(a) = \deg(c) = 1, \qquad \deg(b) = \deg(d) = -1. \tag{3.17} \]

The subalgebra of coinvariants under this coaction is denoted $A[S^2_q]$,

\[ A[S^2_q] := \{ m \in A[SU_q(2)] \mid \delta_R(m) = m \otimes 1 \}. \]

We shall frequently write $M := A[S^2_q]$. This algebra is precisely the subalgebra generated by elements of degree zero: it is the unital $*$-algebra generated by the elements

\[ b_+ := cd, \qquad b_- := ab, \qquad b_0 := bc \tag{3.18} \]

subject to the relations

\[ b_0 b_\pm = q^{\pm2} b_\pm b_0, \qquad q^{-2} b_- b_+ = q^2 b_+ b_- + (1 - q^2) b_0, \]
\[ b_+ b_- = b_0 (1 + q^{-1} b_0) \]

inherited from those of $A[SU_q(2)]$. In the classical limit $q \to 1$, the first line of relations becomes the statement that the algebra is commutative, whereas the second line becomes the sphere relation for the classical two-sphere $S^2$. The quantum sphere $S^2_q$ is precisely the standard Podleś sphere of [16]. The canonical algebra inclusion $M \hookrightarrow P$ is well known to be a Hopf–Galois extension [1] and hence a quantum principal bundle with universal differential calculi whose typical fiber is determined by $H := A[U(1)]$. The coaction (3.16) of $H$ on $A[SU_q(2)]$ is also used to define a family of line bundles over the quantum sphere $S^2_q$, indexed by $n \in \mathbb{Z}$:

\[ L_n := \{ x \in A[SU_q(2)] \mid \delta_R(x) = x \otimes t^{-n} \}. \]

One has the decomposition [15]

\[ A[SU_q(2)] = \bigoplus_{n \in \mathbb{Z}} L_n. \]

In particular $L_0 = A[S^2_q]$ and one finds that $L_n^* \cong L_{-n}$ and $L_n \otimes_{A[S^2_q]} L_m \cong L_{n+m}$ for each $n, m \in \mathbb{Z}$. Moreover,

\[ E \triangleright L_n \subset L_{n+2}, \qquad F \triangleright L_n \subset L_{n-2}, \qquad K^{\pm1} \triangleright L_n \subset L_n \]

for all $n \in \mathbb{Z}$, as can be checked directly using (3.13) and (3.10). It is known that each $L_n$ is a finitely generated projective (say) left $A[S^2_q]$-module of rank one [21]. In this way, we think of the module $L_n$ as the space of sections of a line bundle over $S^2_q$ with winding number $-n$.
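The bookkeeping behind the grading (3.17) and the line bundles L_n can be sketched in a few lines. A monomial of degree k transforms with t^k under the coaction, so it lies in L_n with n = −k. The code below checks, on generators only, that the nonzero actions of E and F listed in (3.13) shift the index n by +2 and −2 respectively. The encoding of monomials as strings and all function names are assumptions made for this example.

```python
DEG = {"a": 1, "c": 1, "b": -1, "d": -1}   # degrees of the generators, from (3.17)

def degree(word):
    """Degree of a monomial written as a string of generators, e.g. "cd"."""
    return sum(DEG[g] for g in word)

def bundle_index(word):
    """The n for which the monomial lies in L_n, i.e. delta_R(x) = x (x) t^(-n)."""
    return -degree(word)

# Nonzero actions of E and F on generators, taken from (3.13).
E_ON = {"a": "b", "c": "d"}
F_ON = {"b": "a", "d": "c"}

def shifts_ok():
    """E raises the bundle index by 2 and F lowers it by 2 (generators only)."""
    e_ok = all(bundle_index(v) == bundle_index(k) + 2 for k, v in E_ON.items())
    f_ok = all(bundle_index(v) == bundle_index(k) - 2 for k, v in F_ON.items())
    return e_ok and f_ok
```

In particular the three generators cd, ab and bc of (3.18) all have index 0, as they must for elements of A[S_q^2]. Extending the check from generators to arbitrary products would use the module-algebra property of the action and is not attempted here.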


4. Differential Structure of the Quantum Hopf Fibration

In this section we equip the quantum group $SU_q(2)$ with a four-dimensional bicovariant differential calculus, originally described in [22]. Using this, the base space $S^2_q$ of the Hopf fibration inherits a three-dimensional differential calculus which was originally described in [17], although we describe it here in terms which are more compatible with the principal bundle structure. Finally, we show that $S^2_q$ is a framed quantum manifold, in the sense that its cotangent bundle is a vector bundle associated to the Hopf fibration $SU_q(2) \to S^2_q$.

4.1. Differential structure on $SU_q(2)$

In the following we write $\epsilon_P$ for the counit of the Hopf algebra $P := A[SU_q(2)]$. In terms of the matrix elements in (3.3), we define $I_P$ to be the right ideal of $P^+ := \mathrm{Ker}\,\epsilon_P$ generated by the nine elements

\[
\begin{gathered}
b^2, \quad c^2, \quad b(a - d), \quad c(a - d), \quad a^2 + q^2 d^2 - (1 + q^2)(ad + q^{-1}bc), \\
zb, \quad zc, \quad z(a - d), \quad z(q^2 a + d - (q^2 + 1)),
\end{gathered}
\tag{4.1}
\]

where $z := q^2 a + d - (q^3 + q^{-1})$. As discussed in Sec. 2.1, this ideal defines a left-covariant first order differential calculus on $SU_q(2)$, which we denote by $\Omega^1 P$. In fact, one checks that $I_P$ is stable under the right adjoint coaction $\mathrm{Ad}_R$ and so this calculus is bicovariant under left and right coactions of $A[SU_q(2)]$. It is precisely the $4D_+$ calculus on $SU_q(2)$ introduced in [22]: indeed, one may check that the space $\Lambda^1 \cong P^+/I_P$ of left-invariant one-forms is a four-dimensional vector space. Following [11], we define elements $L_-$, $L_0$, $L_+$, $L_z$ of $U_q(su(2))$ by

\[ L_- := q^{1/2} F K^{-1}, \qquad L_+ := q^{-1/2} E K^{-1}, \qquad L_0 := K^2 + \nu^2 q^{-1} FE - 1, \qquad L_z := K^{-2} - 1. \]

The vectors $L_0$ and $L_z$ are related to the quantum Casimir (3.6) by

\[ (q - q^{-1})^2 \Bigl( C_q + \tfrac14 - \bigl[\tfrac12\bigr]^2 \Bigr) = qL_0 + q^{-1}L_z. \tag{4.2} \]
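The relation (4.2), read as $(q - q^{-1})^2 (C_q + \tfrac14 - [\tfrac12]^2) = qL_0 + q^{-1}L_z$, follows from (3.6) and the definitions of $L_0$ and $L_z$ by a short computation. The sketch below spells it out under two assumptions that are not stated in this section: that $\nu^2 = (q - q^{-1})^2$, and that the $q$-number is $[x] := (q^x - q^{-x})/(q - q^{-1})$.

```latex
% Assumptions (not stated in this section): \nu^2 = (q - q^{-1})^2 and
% [x] := (q^x - q^{-x})/(q - q^{-1}).
\begin{align*}
qL_0 + q^{-1}L_z
  &= q\bigl(K^2 + \nu^2 q^{-1}FE - 1\bigr) + q^{-1}\bigl(K^{-2} - 1\bigr) \\
  &= \nu^2 FE + qK^2 + q^{-1}K^{-2} - (q + q^{-1}), \\[4pt]
(q - q^{-1})^2\bigl(C_q + \tfrac14\bigr)
  &= (q - q^{-1})^2 FE + qK^2 - 2 + q^{-1}K^{-2}
  && \text{by (3.6)}, \\[4pt]
qL_0 + q^{-1}L_z - (q - q^{-1})^2\bigl(C_q + \tfrac14\bigr)
  &= 2 - q - q^{-1}
   = -\bigl(q^{1/2} - q^{-1/2}\bigr)^2
   = -(q - q^{-1})^2\,\bigl[\tfrac12\bigr]^2 .
\end{align*}
```

Moving the last term to the other side gives (4.2). Both sides carry the overall factor $(q - q^{-1})^2$, so the combination $qL_0 + q^{-1}L_z$ vanishes in the limit $q \to 1$.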

The elements $L_-$, $L_0$, $L_+$, $L_z$ act upon $A[SU_q(2)]$ via the formula (3.12) and together provide a basis for the quantum tangent space $T_P$ of the calculus. Note in particular that the element $C_q - \epsilon_P(C_q)1$ is also an element of $T_P$. Let $\{\omega_-, \omega_0, \omega_+, \omega_z\}$ be a basis of the space of left-invariant one-forms $\Lambda^1$ such that $(L_j, \omega_k) = \delta_{jk}$ for $j, k = -, 0, +, z$. As given in [19], the bimodule relations in the calculus $\Omega^1 P$ with respect to these one-forms are:

\[
\begin{aligned}
\omega_- \begin{pmatrix} a & b \\ c & d \end{pmatrix}
  &= \begin{pmatrix} a & b \\ c & d \end{pmatrix} \omega_-
   + \nu^2 q^{-1} \begin{pmatrix} b & 0 \\ d & 0 \end{pmatrix} \omega_0 ; \\
\omega_+ \begin{pmatrix} a & b \\ c & d \end{pmatrix}
  &= \begin{pmatrix} a & b \\ c & d \end{pmatrix} \omega_+
   + \nu^2 q^{-1} \begin{pmatrix} 0 & a \\ 0 & c \end{pmatrix} \omega_0 ; \\
\omega_0 \begin{pmatrix} a & b \\ c & d \end{pmatrix}
  &= \begin{pmatrix} q^{-1}a & qb \\ q^{-1}c & qd \end{pmatrix} \omega_0 ; \\
\omega_z \begin{pmatrix} a & b \\ c & d \end{pmatrix}
  &= \begin{pmatrix} 0 & a \\ 0 & c \end{pmatrix} \omega_-
   + \nu^2 q^{-1} \begin{pmatrix} a & 0 \\ c & 0 \end{pmatrix} \omega_0
   + \begin{pmatrix} b & 0 \\ d & 0 \end{pmatrix} \omega_+
   + \begin{pmatrix} qa & q^{-1}b \\ qc & q^{-1}d \end{pmatrix} \omega_z .
\end{aligned}
\tag{4.3}
\]

In these terms, the exterior derivative $d\colon A[SU_q(2)] \to \Omega^1 P$ has the form

\[ dp = (L_- \triangleright p)\omega_- + (L_0 \triangleright p)\omega_0 + (L_+ \triangleright p)\omega_+ + (L_z \triangleright p)\omega_z, \qquad p \in A[SU_q(2)], \tag{4.4} \]

where $\triangleright$ is the left action of $U_q(su(2))$ on $A[SU_q(2)]$ defined in (3.12). By using the formulæ (3.13) to compute the action of $L_0$, $L_z$, $L_+$, $L_-$ on the generators of $A[SU_q(2)]$ and then substituting into (4.4), one obtains the explicit expressions

\[
\begin{aligned}
da &= (q^{-1} - 1 + \nu^2 q^{-1})\,a\,\omega_0 + b\,\omega_+ + (q - 1)\,a\,\omega_z, \\
db &= a\,\omega_- + (q - 1)\,b\,\omega_0 + (q^{-1} - 1)\,b\,\omega_z, \\
dc &= (q^{-1} - 1 + \nu^2 q^{-1})\,c\,\omega_0 + d\,\omega_+ + (q - 1)\,c\,\omega_z, \\
dd &= c\,\omega_- + (q - 1)\,d\,\omega_0 + (q^{-1} - 1)\,d\,\omega_z
\end{aligned}
\tag{4.5}
\]

for the differentials of the matrix generators of $A[SU_q(2)]$ in terms of these left-invariant one-forms.

4.2. Framed manifold structure of $S^2_q$

Next, we use Sec. 2.3 to compute the cotangent bundle $\Omega^1 S^2_q$ of the base space $S^2_q$ of the Hopf fibration as an associated vector bundle. As before, we write $P = A[SU_q(2)]$ for the algebra of functions on the total space of the Hopf fibration, $M = A[S^2_q]$ for the algebra of functions on the base and $H = A[U(1)]$ for the structure quantum group. Recall the right coaction $\delta_R\colon P \to P \otimes H$ defined in (3.16) and the canonical projection $\pi\colon P \to H$ defined in (3.15). The differential calculus on $P$ is taken to be the four-dimensional bicovariant calculus $\Omega^1 P$ defined in the previous section; it is defined in terms of the $\mathrm{Ad}_R$-invariant ideal $I_P$ generated by the elements in (4.1). Now writing $\epsilon_H$ for the counit of $H$, we obtain a bicovariant differential calculus $\Omega^1 H$ on $H = A[U(1)]$ by projecting the ideal $I_P$ to obtain an ideal $I_H := \pi(I_P)$ of $\mathrm{Ker}\,\epsilon_H$. As such, $I_H$ is generated by the three elements

\[ t^2 + q^2 t^{*2} - (1 + q^2), \qquad z(t - t^*), \qquad z(q^2 t + t^* - (q^2 + 1)), \tag{4.6} \]

again with $z = q^2 t + t^* - (q^3 + q^{-1})$, where $t$, $t^*$ are the generators of $H$.
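Since H is commutative with t* = t^{-1}, the generators (4.6) can be handled as Laurent polynomials in t. The following sketch checks, in exact rational arithmetic at the sample value q = 1/2, that one particular scalar combination of the three generators is a nonzero multiple of (t − 1) + q(t* − 1). The coefficients `alpha`, `beta`, `gamma`, the dictionary encoding of polynomials and the helper names are all assumptions chosen for this illustration; they are not taken from the text.

```python
from fractions import Fraction

def poly(*terms):
    """Laurent polynomial in t as {exponent: coefficient}; t* is t^(-1)."""
    p = {}
    for exp, coeff in terms:
        p[exp] = p.get(exp, 0) + coeff
    return p

def mul(p, r):
    """Product of two Laurent polynomials."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in r.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return out

def combo(coeffs, polys):
    """Scalar linear combination, with vanishing coefficients dropped."""
    out = {}
    for c, p in zip(coeffs, polys):
        for e, v in p.items():
            out[e] = out.get(e, 0) + c * v
    return {e: v for e, v in out.items() if v != 0}

q = Fraction(1, 2)   # sample value with 0 < q < 1 (an assumption)

# z and the three generators of I_H from (4.6).
z = poly((1, q**2), (-1, 1), (0, -(q**3 + 1 / q)))
g1 = poly((2, 1), (-2, q**2), (0, -(1 + q**2)))
g2 = mul(z, poly((1, 1), (-1, -1)))
g3 = mul(z, poly((1, q**2), (-1, 1), (0, -(q**2 + 1))))

# Coefficients chosen so that the t^2 and t^(-2) terms cancel.
alpha, beta, gamma = q**2 * (1 + q**2), q**6 - 1, -(1 + q**4)
result = combo([alpha, beta, gamma], [g1, g2, g3])

A = result[1]                                  # coefficient of t
target = {1: A, -1: q * A, 0: -(1 + q) * A}    # A * ((t - 1) + q (t* - 1))
```

At this value of q the combination equals A((t − 1) + q(t* − 1)) with A = 765/256, so (t − 1) + q(t* − 1) lies in the span of the generators. The check is numerical at one value of q and is not a proof for general q.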


Lemma 4.1. The calculus $\Omega^1 H$ is one-dimensional. It is spanned as a left module by the left-invariant one-form $\omega_t := t^* dt$ and has bimodule relations

\[ \omega_t\, t = q\,t\,\omega_t, \qquad \omega_t\, t^* = q^{-1} t^* \omega_t, \]

where $t$, $t^*$ are the generators of $H = A[U(1)]$.

Proof. We define an equivalence relation $\sim$ on $H^+$ by $x \sim y$ if and only if $x - y \in I_H$. By taking a linear combination of the generators in (4.6), one finds in particular that $(t - 1) + q(t^* - 1) \sim 0$, which is our key equivalence. Using it, one deduces that

\[
\begin{aligned}
t^2 &= (t + 1)(t - 1) + 1 \sim -q(t + 1)(t^* - 1) + 1 = -q(t^* - t) + 1 \sim (q + 1)(t - 1) + 1, \\
t^{*2} &= (t^* + 1)(t^* - 1) + 1 \sim -q^{-1}(t^* + 1)(t - 1) + 1 = -q^{-1}(t - t^*) + 1 \sim -q^{-1}(1 + q^{-1})(t - 1) + 1,
\end{aligned}
\]

so that every quadratic polynomial in $t$, $t^*$ and 1 is equivalent to a linear combination of $t - 1$ and $t^* - 1$. By induction any polynomial in $t$ is equivalent to such a linear combination. Applying the key equivalence once more tells us that we can always eliminate $t^* - 1$. Thus we take $t - 1$ as a representative of the quotient space $H^+/I_H$ and $\omega_t := r^{-1}(1 \otimes \overline{(t - 1)})$ as the corresponding left-invariant one-form, which spans the calculus $\Omega^1 H$ as a left $H$-module. To obtain the bimodule relations, we compute for example that

\[ \omega_t\, t = \bigl((t^* - 1) \otimes \overline{t - 1}\bigr)\, t = (1 - t) \otimes \overline{t^2 - t} = q\,t\,(t^* - 1) \otimes \overline{t - 1} = q\,t\,\omega_t, \]

where $\overline{\phantom{x}\cdot\phantom{x}}$ denotes an equivalence class modulo $I_H$. The first and last equalities use the definition of the map $r$ and the middle equality uses the bimodule structure (2.3).

The differential calculus $\Omega^1 M$ on the base of the fibration is defined by restricting the calculus $\Omega^1 P$ to $M$. This means that it is defined as the quotient $\Omega^1 M := \widetilde{\Omega}^1 M / N_M$, where $\widetilde{\Omega}^1 M$ denotes the universal calculus on $M$ and $N_M$ is the $M$-$M$-bimodule $N_M := N_P \cap \widetilde{\Omega}^1 M$. We postpone the computation of generators and relations for $\Omega^1 M$ and observe that for now we have the following expressions for the exterior derivative on $M$ in terms of the left-invariant one-forms $\omega_\pm$, $\omega_0$.

Lemma 4.2. The exterior derivative $d$ acts on $M = A[S^2_q]$ as

\[
\begin{pmatrix} db_+ \\ db_0 \\ db_- \end{pmatrix}
=
\begin{pmatrix}
d^2 & \mu\nu^2 q^{-1} cd & qc^2 \\
db & \nu^2 q^{-1}(1 + \mu bc) & ac \\
b^2 & \mu\nu^2 q^{-1} ab & qa^2
\end{pmatrix}
\begin{pmatrix} \omega_+ \\ \omega_0 \\ \omega_- \end{pmatrix}
\tag{4.7}
\]

in terms of the generators $b_\pm$, $b_0$ of $M$ given in (3.18).


Proof. This follows from direct computation. For example, to compute $db_+$ the Leibniz rule yields $db_+ = d(cd) = (dc)d + c(dd)$. One uses the expressions (4.5) to rewrite $dc$, $dd$ in terms of $\omega_\pm$ and $\omega_0$, then the bimodule relations in Eqs. (4.3) to collect all coefficients to the left. Combining like terms yields the expression as stated. The same method works for computing $db_0$ and $db_-$.

Lemma 4.3. With $P$, $H$ and $M$ as above, the differential calculi $\Omega^1 P$, $\Omega^1 H$ and $\Omega^1 M$ satisfy the compatibility conditions of (2.11).

Proof. The relation $\pi(I_P) = I_H$ holds by definition of the calculus on $H$. It is sufficient to verify the $\mathrm{Ad}_R$-condition in (2.11) on generators: one finds that

\[
\begin{aligned}
(\mathrm{id} \otimes \pi)\mathrm{Ad}_R(c^2) &= c^2 \otimes t^4, & (\mathrm{id} \otimes \pi)\mathrm{Ad}_R(c(a - d)) &= c(a - d) \otimes t^2, \\
(\mathrm{id} \otimes \pi)\mathrm{Ad}_R(b^2) &= b^2 \otimes t^{*4}, & (\mathrm{id} \otimes \pi)\mathrm{Ad}_R(b(a - d)) &= b(a - d) \otimes t^{*2}, \\
(\mathrm{id} \otimes \pi)\mathrm{Ad}_R(zc) &= zc \otimes t^2, & (\mathrm{id} \otimes \pi)\mathrm{Ad}_R(zb) &= zb \otimes t^{-2},
\end{aligned}
\]

with all other generators coinvariant under the map $(\mathrm{id} \otimes \pi)\mathrm{Ad}_R$.

This means that we may apply Sec. 2.3 to express $S^2_q$ as a framed quantum manifold. The framing comodule $V$ is computed as follows. Clearly $P^+ \cap M$ is equal to $M^+ = \mathrm{Ker}\,\epsilon_M$, where $\epsilon_M$ is the restriction of the counit $\epsilon_P$ to the subalgebra $M$. In our case, with $M = A[S^2_q]$ being generated by $b_\pm$, $b_0$, we have that $M^+ = \langle b_0, b_\pm \rangle$ as a right ideal. To compute $I_P \cap M$ we note that, since the generators $b(a - d)$, $c(a - d)$, $a^2 + q^2 d^2 - (1 + q^2)(ad + q^{-1}bc)$, $zb$, $zc$, $z(a - d)$, $z(q^2 a + d - (q^2 + 1))$ are not of homogeneous degree, the ideal that each of them generates has no intersection with $M$. Thus we concentrate on the generators $b^2$, $c^2$ of $I_P$. The elements of degree zero in $\langle b^2 \rangle$ include $b^2\{a^2, ac, c^2\}$ and so we see that $b_-^2$, $b_- b_0$, $b_0^2$ all lie in $I_P \cap M$. Similarly, from the ideal $\langle c^2 \rangle$ we see that $b_+^2$ and $b_+ b_0$ are also in $I_P \cap M$. From this discussion we obtain

\[ V = \langle b_0, b_\pm \rangle / \langle b_\pm^2, b_0^2, b_\pm b_0 \rangle. \tag{4.8} \]

Hence $V$ is three-dimensional with representatives $b_\pm$ and $b_0$. We compute the right coaction of $H$ on $V$ from (2.12) as

\[ \Delta_R(b_+) = cd \otimes S\pi(d^2) = b_+ \otimes t^2, \qquad \Delta_R(b_-) = ab \otimes S\pi(a^2) = b_- \otimes t^{*2}, \qquad \Delta_R(b_0) = bc \otimes 1 = b_0 \otimes 1. \]

Hence $V = \mathbb{C} \oplus \mathbb{C} \oplus \mathbb{C}$ and the associated bundle

\[ E = L_{-2} \oplus L_0 \oplus L_{+2} = A[SU_q(2)]_2 \oplus A[SU_q(2)]_0 \oplus A[SU_q(2)]_{-2} \]


is the direct sum of the line bundles over $S^2_q$ with winding numbers $-2$, $0$ and $2$. This yields the following theorem.

Theorem 4.4. The homogeneous space $S^2_q$ is a framed quantum manifold with cotangent bundle $\Omega^1 S^2_q \cong L_{-2} \oplus L_0 \oplus L_{+2}$. The isomorphism is given by the soldering form

\[
\begin{aligned}
\theta(b_+) &= q^2 c^2\, db_- - q\mu\, ac\, db_0 + a^2\, db_+ = \omega_+, \\
\theta(b_0) &= -q\, dc\, db_- + (1 + \mu bc)\, db_0 - q^{-1} ba\, db_+ = \nu^2 q^{-1} \omega_0, \\
\theta(b_-) &= d^2\, db_- - q^{-1}\mu\, bd\, db_0 + q^{-2} b^2\, db_+ = q\,\omega_-
\end{aligned}
\]

and makes $\Omega^1 S^2_q$ projective as a left $A[S^2_q]$-module.

Proof. The only remaining part is to compute the soldering form $\theta(b_\pm)$, $\theta(b_0)$. We find the left coaction on $M = A[S^2_q]$ inherited from the coproduct on $A[SU_q(2)]$ to be

\[
\begin{aligned}
\Delta_L(b_+) &= \Delta_L(cd) = c^2 \otimes b_- + cd \otimes (1 + \mu b_0) + d^2 \otimes b_+, \\
\Delta_L(b_0) &= \Delta_L(bc) = ca \otimes b_- + 1 \otimes b_0 + bc \otimes (1 + \mu b_0) + db \otimes b_+, \\
\Delta_L(b_-) &= \Delta_L(ab) = a^2 \otimes b_- + ab \otimes (1 + \mu b_0) + b^2 \otimes b_+.
\end{aligned}
\]

In fact these coproducts were already used in computing $\Delta_R$ above. This time we apply the antipode $S$ to the first tensor factor to obtain

\[ \theta(b_+) = S(b_{+(1)})\, d(b_{+(2)}) = q^2 c^2\, db_- - q\mu\, ac\, db_0 + a^2\, db_+, \]

similarly for $\theta(b_-)$ and $\theta(b_0)$. This yields the middle expressions as stated. We then insert the expressions from Lemma 4.2 to obtain $\{\omega_+, \nu^2 q^{-1}\omega_0, q\omega_-\}$ for the values of the map $\theta$. According to Sec. 2.3, the map $\theta\colon V \to P\,\Omega^1 M$ is well-defined on $V$. In order to get one-forms on $A[S^2_q]$, one must multiply $\theta(b_-)$ by an element of degree 2, $\theta(b_+)$ by an element of degree $-2$ and $\theta(b_0)$ by an element of degree zero. Moreover, every one-form is obtained in this way. This yields the isomorphism as stated. Since all line bundles $L_n$ are projective, so is $\Omega^1 S^2_q$.

The above also shows that the exterior derivative $d$ in the calculus $\Omega^1 S^2_q$ is given by restriction of the expression in (4.4), namely

\[ dm = (L_- \triangleright m)\omega_- + (L_0 \triangleright m)\omega_0 + (L_+ \triangleright m)\omega_+, \qquad m \in A[S^2_q]. \tag{4.9} \]

We stress that $L_\mp \triangleright m \in L_{\pm2}$ rather than being an element of $A[S^2_q]$. Of course, from (4.4) combined with the fact that the vertical vector field $L_z$ obeys $L_z \triangleright m = 0$


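for all $m \in A[S^2_q]$, we already expected this to be the case. From Theorem 4.4 we know that $\Omega^1 S^2_q$ is spanned as a left module by

\[
\begin{aligned}
\{d^2,\ db,\ b^2\}\,\omega_+ &:= \{\partial_+ b_+,\ \partial_+ b_0,\ \partial_+ b_-\}, \\
\nu^2 q^{-1}\{\mu cd,\ 1 + \mu bc,\ \mu ab\}\,\omega_0 &:= \{\partial_0 b_+,\ \partial_0 b_0,\ \partial_0 b_-\}, \\
\{qc^2,\ ac,\ qa^2\}\,\omega_- &:= \{\partial_- b_+,\ \partial_- b_0,\ \partial_- b_-\}.
\end{aligned}
\tag{4.10}
\]

The bimodule relations in the calculus $\Omega^1 S^2_q$ are in general quite complicated to compute directly, but we can use the expressions in Eqs. (4.10) to break them into smaller pieces which are much easier to work with.

Corollary 4.5. The cotangent bundle $\Omega^1 S^2_q$ has first order differential sub-calculi

\[ \Omega^1_+ \cong L_{-2} \oplus L_0, \qquad \Omega^1_0 \cong L_0, \qquad \Omega^1_- \cong L_0 \oplus L_{+2} \]

with differentials given by $d_+ := \partial_+ + \partial_0$, $d_0 := \partial_0$ and $d_- := \partial_0 + \partial_-$ respectively. These calculi obey the bimodule relations

\[
\begin{aligned}
(\partial_+ b_+)\, b_+ &= q^{-2} b_+(\partial_+ b_+) + q^{-3}\mu^{-1} b_+(\partial_0 b_+), \\
(\partial_+ b_+)\, b_0 &= q^{-4} b_0(\partial_+ b_+) + \mu^{-1} q^{-2}(1 + q^{-3} b_0)(\partial_0 b_+), \\
(\partial_+ b_+)\, b_- &= q^{-2} b_-(\partial_+ b_+) - (q^2 - q^{-2}) b_+(\partial_+ b_-) + \partial_0 b_0 \\
&\quad + (q^2 - q^{-2})^{-1}\bigl(q^{-2} b_-(\partial_0 b_+) - b_+(\partial_0 b_-)\bigr) - q^{-1}\nu\, b_+(\partial_0 b_-),
\end{aligned}
\]

\[
\begin{aligned}
(\partial_+ b_0)\, b_+ &= b_+(\partial_+ b_0) + q^{-3}\mu^{-1} b_0(\partial_0 b_+), \\
(\partial_+ b_0)\, b_0 &= q^{-2} b_0(\partial_+ b_0) + q^{-2}\mu^{-1} b_+(\partial_0 b_-), \\
(\partial_+ b_0)\, b_- &= q^{-2} b_-(\partial_+ b_0) - q^{-1}\nu\, b_0(\partial_+ b_-) + q^{-2}(1 + q^{-1} b_0)(\partial_0 b_-),
\end{aligned}
\]

\[
\begin{aligned}
(\partial_+ b_-)\, b_+ &= q^2 b_+(\partial_+ b_-) + (q^2 - q^{-2})^{-1}\bigl(q^2 b_-(\partial_0 b_+) - b_+(\partial_0 b_-)\bigr), \\
(\partial_+ b_-)\, b_0 &= b_0(\partial_+ b_-) + q^{-1}\mu^{-1} b_0(\partial_0 b_-), \\
(\partial_+ b_-)\, b_- &= q^{-2} b_-(\partial_+ b_-) + q^{-3}\mu^{-1} b_-(\partial_0 b_-),
\end{aligned}
\]

\[
\begin{aligned}
(\partial_- b_+)\, b_+ &= q^2 b_+(\partial_- b_+) + q^3\mu^{-1} b_+(\partial_0 b_+), \\
(\partial_- b_+)\, b_0 &= b_0(\partial_- b_+) + q\mu^{-1} b_0(\partial_0 b_+), \\
(\partial_- b_+)\, b_- &= q^{-2} b_-(\partial_- b_+) + (q^2 - q^{-2})^{-1}\bigl(b_-(\partial_0 b_+) - q^2 b_+(\partial_0 b_-)\bigr),
\end{aligned}
\]

\[
\begin{aligned}
(\partial_- b_0)\, b_+ &= q^2 b_+(\partial_- b_0) + q\nu\, b_0(\partial_- b_+) + \mu^{-1}(1 + q b_0)(\partial_0 b_+), \\
(\partial_- b_0)\, b_0 &= q^2 b_0(\partial_- b_0) + \mu^{-1} b_-(\partial_0 b_+), \\
(\partial_- b_0)\, b_- &= b_-(\partial_- b_0) + q^3\mu^{-1} b_0(\partial_0 b_-),
\end{aligned}
\]

\[
\begin{aligned}
(\partial_- b_-)\, b_+ &= q^2 b_+(\partial_- b_-) + (q^2 - q^{-2}) b_-(\partial_- b_+) + q^2\,\partial_0 b_0 \\
&\quad + (q^2 - q^{-2})^{-1}\bigl(b_-(\partial_0 b_+) - q^2 b_+(\partial_0 b_-)\bigr) + q\nu\, b_-(\partial_0 b_+), \\
(\partial_- b_-)\, b_0 &= q^4 b_0(\partial_- b_-) + \mu^{-1}(1 + q^3 b_0)(\partial_0 b_-), \\
(\partial_- b_-)\, b_- &= q^2 b_-(\partial_- b_-) + q^3\mu^{-1} b_-(\partial_0 b_-).
\end{aligned}
\]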


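Proof. Using the expressions in Eqs. (4.10), the bimodule relations in $\Omega^1 S^2_q$ are determined by a straightforward but laborious computation along the following lines. From the bimodule relations in Eqs. (4.3) one finds that

\[
\begin{aligned}
\omega_+ b_+ &= b_+\omega_+ + \nu^2 q^{-1} c^2 \omega_0, & \omega_- b_+ &= b_+\omega_- + \nu^2 d^2 \omega_0, \\
\omega_+ b_0 &= b_0\omega_+ + \nu^2 q^{-1} ca\, \omega_0, & \omega_- b_0 &= b_0\omega_- + \nu^2 db\, \omega_0, \\
\omega_+ b_- &= b_-\omega_+ + \nu^2 q^{-1} a^2 \omega_0, & \omega_- b_- &= b_-\omega_- + \nu^2 b^2 \omega_0,
\end{aligned}
\]

with $\omega_0$ commuting with each of $b_\pm$, $b_0$. Combining these with the algebra relations in $A[SU_q(2)]$ yields the bimodule relations as stated, together with

\[
\begin{aligned}
(\partial_0 b_+)\, b_+ &= b_+(\partial_0 b_+), \\
(\partial_0 b_+)\, b_0 &= q^{-2} b_0(\partial_0 b_+), \\
(\partial_0 b_+)\, b_- &= q^{-2} b_-(\partial_0 b_+) - q^{-2} b_-(\partial_0 b_+) + b_+(\partial_0 b_-),
\end{aligned}
\]

\[
\begin{aligned}
(\partial_0 b_0)\, b_+ &= q^2 b_+(\partial_0 b_0) - q\mu^{-1}\nu\,(\partial_0 b_+), \\
(\partial_0 b_0)\, b_0 &= b_0(\partial_0 b_0), \\
(\partial_0 b_0)\, b_- &= q^{-2} b_-(\partial_0 b_0) + q^{-1}\mu^{-1}\nu\,(\partial_0 b_-),
\end{aligned}
\]

\[
\begin{aligned}
(\partial_0 b_-)\, b_+ &= q^2 b_+(\partial_0 b_-) + b_-(\partial_0 b_+) - q^{-2} b_+(\partial_0 b_-), \\
(\partial_0 b_-)\, b_0 &= q^2 b_0(\partial_0 b_-), \\
(\partial_0 b_-)\, b_- &= b_-(\partial_0 b_-).
\end{aligned}
\]

The fact that $\Omega^1_+ = L_{-2} \oplus L_0$, $\Omega^1_0 = L_0$ and $\Omega^1_- = L_0 \oplus L_{+2}$ close as sub-bimodules is now clear by inspection. The Leibniz rules for the differentials $d_+$, $d_0$ and $d_-$ follow from the Leibniz rule for $d$ and the direct sum decomposition of $\Omega^1 S^2_q$.

Corollary 4.6. The one-forms in the calculus $\Omega^1 S^2_q$ enjoy the relations

\[
\begin{aligned}
\partial_+ b_0 &= q^{-2} b_-(\partial_+ b_+) - q^2 b_+(\partial_+ b_-), & b_0 b_-(\partial_+ b_+) &= q^3(1 + q b_0)\, b_+(\partial_+ b_-), \\
\partial_- b_0 &= b_+(\partial_- b_-) - q^{-4} b_-(\partial_- b_+), & b_0 b_+(\partial_- b_-) &= q^{-3}(1 + q^{-1} b_0)\, b_-(\partial_- b_+), \\
b_0\, \partial_0 b_0 &= -q\mu\nu^{-1} b_-(\partial_0 b_+) + q^{-1}\mu\nu^{-1} b_+(\partial_0 b_-), \\
b_+(\partial_0 b_0) &= (\mu^{-1} + q^{-2} b_0)\,\partial_0 b_+, & b_-(\partial_0 b_0) &= (\mu^{-1} + q^2 b_0)\,\partial_0 b_-.
\end{aligned}
\]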


Proof. These are obtained in analogy with the proof of Corollary 4.5, from the relations in $A[SU_q(2)]$ acting on $\omega_\pm$ and $\omega_0$. One finds the relations as stated, together with

\[
\begin{aligned}
b_+(\partial_+ b_-) &= q^{-1} b_0(\partial_+ b_0), & b_-(\partial_+ b_+) &= q^2(1 + q b_0)(\partial_+ b_0), \\
b_-(\partial_- b_+) &= q^2 b_0(\partial_- b_0), & b_+(\partial_- b_-) &= q^{-1}(1 + q^{-1} b_0)(\partial_- b_0).
\end{aligned}
\]

There are other relations involving the differential $\partial_0$, but they are quite complicated (since the sphere relation in $A[S^2_q]$ does not explicitly involve the unit) and are not particularly illuminating, so we shall not give them here.

Finally, we use Theorem 4.4 to compute the differentials $\partial_\pm$ and $\partial_0$ in terms of the exterior derivative $d$. Using the algebra relations in $A[SU_q(2)]$ and the expressions in Eqs. (4.10) we find that

\[
\begin{aligned}
\partial_+ b_+ &= q^{-1} b_+^2\, db_- - \mu b_+(1 + q^{-1} b_0)\, db_0 + (1 + q^{-1} b_0)^2\, db_+ + q^{-2}\nu\, b_+ b_-\, db_+, \\
\partial_+ b_0 &= q\, b_+ b_0\, db_- - \mu\, b_+ b_-\, db_0 + q^{-2}(1 + q^{-1} b_0)\, b_-\, db_+, \\
\partial_+ b_- &= q^2 b_0^2\, db_- - q^{-1}\mu\, b_- b_0\, db_0 + q^{-3} b_-^2\, db_+, \\
\partial_0 b_+ &= -\mu\, b_+^2\, db_- + \mu\, b_+(1 + \mu b_0)\, db_0 - q^{-2}\mu\, b_+ b_-\, db_+, \\
\partial_0 b_0 &= (1 + \mu b_0)\bigl(-b_+\, db_- + (1 + \mu b_0)\, db_0 - q^{-2} b_-\, db_+\bigr), \\
\partial_0 b_- &= -\mu\, b_- b_+\, db_- + \mu\, b_-(1 + \mu b_0)\, db_0 - q^{-2}\mu\, b_-^2\, db_+, \\
\partial_- b_+ &= q\, b_+^2\, db_- - q^{-1}\mu\, b_0 b_+\, db_0 + q^{-2} b_0^2\, db_+, \\
\partial_- b_0 &= (1 + q b_0)\, b_+\, db_- - q\mu\, b_0(1 + q b_0)\, db_0 + q^{-2} b_- b_0\, db_+, \\
\partial_- b_- &= \bigl((1 + q b_0)^2 + \nu\, b_- b_+\bigr)\, db_- - \mu\, b_-(1 + q b_0)\, db_0 + q^{-1} b_-^2\, db_+.
\end{aligned}
\]

These expressions may now be used to compute the full bimodule structure of the calculus $\Omega^1 S^2_q$ in terms of the differential $d$, as well as the deeper structure of the noncommutative Riemannian geometry of this calculus, along similar lines to [14]. However, since our objective is to study the spin geometry of the calculus, we have all we need and so we shall not pursue these directions here.

5. The Spectral Geometry of $S^2_q$

In this section, we give the "three-dimensional" differential calculus $\Omega^1 S^2_q$ by a spectral triple on $S^2_q$. This means equipping $S^2_q$ with a spinor bundle $S$ and a Dirac operator $D$ which together implement the exterior derivative $d$ for $\Omega^1 S^2_q$. We then equip this spectral triple with a real structure for which the commutant property and the first order condition for the Dirac operator are satisfied up to infinitesimals of arbitrary order, in parallel with the results of [7] for the "two-dimensional" calculus on $S^2_q$.


5.1. Background on spectral triples

We recall briefly the notion of a spectral triple [2].

Definition 5.1. A unital spectral triple $(A, H, D)$ consists of a complex unital $*$-algebra $A$, faithfully $*$-represented by bounded operators on a (separable) Hilbert space $H$, and a self-adjoint operator $D\colon H \to H$ (the Dirac operator) with the following properties:

(i) the resolvent $(D - \lambda)^{-1}$, $\lambda \notin \mathbb{R}$, is a compact operator on $H$;
(ii) for all $a \in A$ the commutator $[D, \pi(a)]$ is a bounded operator on $H$.

A spectral triple $(A, H, D)$ is called even if there exists a $\mathbb{Z}_2$-grading of $H$, i.e. an operator $\Gamma\colon H \to H$ with $\Gamma = \Gamma^*$ and $\Gamma^2 = 1$, such that $\Gamma D + D\Gamma = 0$ and $\Gamma a = a\Gamma$ for all $a \in A$. Otherwise the spectral triple is said to be odd.

With $0 < n < \infty$, the Dirac operator $D$ is said to be $n^+$-summable if $(D^2 + 1)^{-1/2}$ is in the Dixmier ideal $L^{n+}(H)$. The metric dimension of the spectral triple $(A, H, D)$ is defined to be the infimum of the set of all $n$ such that $D$ is $n^+$-summable.

Given a spectral triple $(A, H, D)$, one associates to it a canonical first order differential calculus $(\Omega^1_D A, d_D)$. In particular, the $A$-$A$-bimodule $\Omega^1_D A$ is defined to be

\[ \Omega^1_D A := \Bigl\{ \omega = \sum_j a_0^j\,[D, a_1^j] \ \Big|\ a_0^j, a_1^j \in A \Bigr\}, \tag{5.1} \]

with the differential $d_D$ given by $d_D a = [D, a]$ for $a \in A$.

The original definition [3] of a real structure on a spectral triple $(A, H, D)$ was given by an anti-unitary operator $J\colon H \to H$ with the properties $J^2 = \pm1$, $JD = \pm DJ$ and

\[ [\pi(a), J\pi(b)J^{-1}] = 0, \qquad [[D, \pi(a)], J\pi(b)J^{-1}] = 0, \qquad a, b \in A. \tag{5.2} \]

These are called the commutant property and the first order condition respectively. However, in many examples involving quantum spaces, one needs to modify these conditions in order to obtain non-trivial spin geometries [5–8]. Following the approach there, we impose the weaker assumption that (5.2) holds only up to infinitesimals of arbitrary order (i.e. up to compact operators $T$ with the property that the singular values $s_k(T)$ satisfy $\lim_{k \to \infty} k^p s_k(T) = 0$ for all $p > 0$).

Definition 5.2. A real structure on a spectral triple $(A, H, D)$ is an anti-unitary operator $J\colon H \to H$ such that

\[ J^2 = \pm1, \qquad JD = \pm DJ, \qquad [\pi(a), J\pi(b)J^{-1}] \in I, \qquad [[D, \pi(a)], J\pi(b)J^{-1}] \in I, \qquad a, b \in A, \tag{5.3} \]


where $I$ is an operator ideal of infinitesimals of arbitrary order. We say that the datum $(A, H, D, J)$ is a real spectral triple (up to infinitesimals). If $(A, H, D, \Gamma)$ is even and $J\Gamma = \pm\Gamma J$, we call the datum $(A, H, D, \Gamma, J)$ an even real spectral triple (up to infinitesimals). The signs above depend on the so-called KO-dimension of the triple. We shall only need the case where the KO-dimension is two; then $J^2 = -1$, $JD = DJ$ and $J\Gamma = -\Gamma J$.

5.2. A Dirac operator on $S^2_q$

In order to define a spectral triple on $S^2_q$, we need a spinor bundle over $S^2_q$ and an associated Dirac operator, which we require should recover the differential calculus $\Omega^1 S^2_q$ via the commutator representation defined in (5.1). Since the differential calculus $\Omega^1 S^2_q$ constructed in Theorem 4.4 is equivariant under a left coaction of $A[SU_q(2)]$ and hence a right action of $U_q(su(2))$, we are led to consider spinor bundles and Dirac operators which are right $U_q(su(2))$-equivariant. Guided by this principle, as well as by the spin structure of the classical two-sphere $S^2$, for the $A[S^2_q]$-module of spinors we take

\[ S = S_+ \oplus S_- := L_{-1} \oplus L_{+1}. \]

As right $U_q(su(2))$-modules, the vector spaces $S_\pm$ are both isomorphic to the direct sum

\[ V := \bigoplus_{j \in \mathbb{N} + \frac12} V^j \tag{5.4} \]

over all irreducible $U_q(su(2))$-modules $V^j$ with spin $j \in \mathbb{N} + \tfrac12$ a half-odd integer. A corresponding basis for $V$ is then given by

\[ \bigl\{\, |j, m\rangle \ \big|\ j \in \mathbb{N} + \tfrac12,\ m = -j, \ldots, j \,\bigr\}, \]

where the vectors $|j, m\rangle$ span the irreducible $U_q(su(2))$-module $V^j$ in Eqs. (3.7). We denote the orthonormal bases of the two different copies $S_\pm$ of $V$ respectively by

\[ |j, m\rangle_\pm, \qquad j \in \mathbb{N} + \tfrac12, \qquad m = -j, \ldots, j. \tag{5.5} \]

We equip $S$ with the inner product which makes this basis orthonormal and write $H$ for the corresponding Hilbert space completion of $S$. As $A[S^2_q]$-modules, the vector spaces $S_\pm$ each carry one of two inequivalent $U_q(su(2))$-equivariant representations of $A[S^2_q]$,

\[ \pi_\pm\colon A[S^2_q] \to \mathrm{End}(S_\pm). \]

Recall that $S_\pm$ are just the subspaces of $A[SU_q(2)]$ with overall degrees $\mp1$ with respect to the $\mathbb{Z}$-grading (3.17), so the representations $\pi_\pm$ on $S_\pm$ are simply given


by restricting the multiplication in A[SUq (2)] to the appropriate degrees. However, it is possible to describe these representations explicitly in terms of the basis (5.5) in the following way. Indeed, the Uq (su(2))-equivariant representations of A[Sq2 ] on V were already described in [7, 21]. To be able to simply quote them we make a change of generators, now writing x1 = −q 1/2 µb+ ,

x0 − 1 = µb0 ,

x−1 = −q −3/2 µb− ,

(5.6)

where b± , b0 are the generators of A[Sq2 ] deﬁned in (3.18), and µ = q + q −1 . With respect to these new generators, the algebra relations of A[Sq2 ] now read x−1 (x0 − 1) = q 2 (x0 − 1)x−1 , x1 (x0 − 1) = q −2 (x0 − 1)x1 , (q 2 x0 + 1)(x0 − 1) = (q + q −1 )x−1 x1 , (q −2 x0 + 1)(x0 − 1) = (q + q −1 )x1 x−1 . Then, with N = ±1/2, the two representations π± = π±1/2 of A[Sq2 ] on S± have the form 0 πN (xi )|j, m ± = α− i (j, m; N )|j − 1, m + i ± + αi (j, m; N )|j, m + i ±

+ α+ i (j, m; N )|j + 1, m + i ± ,

(5.7)

where the coefficients are determined by
\[ \alpha_1^+(j,m;N) = q^{-j+m}\left(\frac{[j+m+1][j+m+2]}{[2j+1][2j+2]}\right)^{1/2}\alpha_N(j+1), \]
\[ \alpha_1^0(j,m;N) = -q^{m+2}\bigl([2][j-m][j+m+1]\bigr)^{1/2}[2j]^{-1}\beta_N(j), \]
\[ \alpha_1^-(j,m;N) = -q^{j+m+1}\left(\frac{[j-m-1][j-m]}{[2j-1][2j]}\right)^{1/2}\alpha_N(j), \]
\[ \alpha_0^+(j,m;N) = q^{m}\left(\frac{[2][j-m+1][j+m+1]}{[2j+1][2j+2]}\right)^{1/2}\alpha_N(j+1), \]
\[ \alpha_0^0(j,m;N) = [2j]^{-1}\bigl([j-m+1][j+m] - q^{-2}[j-m][j+m+1]\bigr)\beta_N(j), \]
\[ \alpha_0^-(j,m;N) = q^{m}\left(\frac{[2][j-m][j+m]}{[2j-1][2j]}\right)^{1/2}\alpha_N(j), \]
\[ \alpha_{-1}^+(j,m;N) = q^{j+m}\left(\frac{[j-m+1][j-m+2]}{[2j+1][2j+2]}\right)^{1/2}\alpha_N(j+1), \]
\[ \alpha_{-1}^0(j,m;N) = q^{m}\bigl([2][j-m+1][j+m]\bigr)^{1/2}[2j]^{-1}\beta_N(j), \]
\[ \alpha_{-1}^-(j,m;N) = -q^{-j+m-1}\left(\frac{[j+m-1][j+m]}{[2j-1][2j]}\right)^{1/2}\alpha_N(j) \]

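(with the convention that $\alpha_i^-(\tfrac12, \pm\tfrac12; N) = 0$) and the real numbers $\alpha_N(j)$, $\beta_N(j)$ are
\[ \alpha_N(j) = q^{N}\,\frac{\bigl([2][j+N][j-N]\bigr)^{1/2}}{\bigl([2j+1][2j]\bigr)^{1/2}}, \]
\[ \beta_N(j) = q^{-1}[2j+2]^{-1}\Bigl(\varepsilon q^{-\varepsilon} - (q - q^{-1})\bigl([j][j+1] - [\tfrac12][\tfrac32]\bigr)\Bigr), \]
with $\varepsilon = \mathrm{sign}(N)$.

Next we come to the Dirac operator. With the $2 \times 2$ Pauli matrices
\[ \sigma_+ := \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad \sigma_0 := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \sigma_- := \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \]
one has the relations
\[ \sigma_+\sigma_- = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad \sigma_-\sigma_+ = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \sigma_0^2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \sigma_+^2 = \sigma_-^2 = 0, \]
\[ \sigma_0\sigma_+ = \sigma_+, \qquad \sigma_+\sigma_0 = -\sigma_+, \qquad \sigma_-\sigma_0 = \sigma_-, \qquad \sigma_0\sigma_- = -\sigma_-. \tag{5.8} \]
Further, we use the differential operators $D_\pm$, $D_0$,
\[ D_\pm := L_\pm, \qquad D_0 := L_0 + q^{-2}L_z = q^{-1}(q - q^{-1})^2\Bigl(C_q + \tfrac14 - [\tfrac12]^2\Bigr), \tag{5.9} \]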

having used the expression (4.2) for the last equality. As will become clear momentarily, the use of $D_0$ instead of $L_0$ (the extra $L_z$ vanishing identically on $A[S_q^2]$) will lead to a Dirac operator whose square is diagonal. We define a Dirac operator $D: S \to S$ by
\[ D = D_+\sigma_+ + D_0\sigma_0 + D_-\sigma_-, \tag{5.10} \]
where the $2 \times 2$ Pauli matrices $\sigma_\pm$, $\sigma_0$ act upon the column vector of $S$ by left multiplication and the vector fields $D_\pm$, $D_0$ operate via the left action of $U_q(su(2))$ (we omit the symbol for this action from now on). As mentioned above, elements $a \in A[S_q^2]$ act as multiplicative operators on $S$ via the representations $\pi_\pm$:
\[ \pi : A[S_q^2] \to \mathrm{End}(S), \qquad \pi(a) := \begin{pmatrix} \pi_+(a) & 0 \\ 0 & \pi_-(a) \end{pmatrix}, \]
although we will not always explicitly denote the representation $\pi$.

Proposition 5.3. The Dirac operator $D: S \to S$ obeys $[D, a] = (L_+a)\sigma_+ + (L_0a)\sigma_0 + (L_-a)\sigma_-$ for each $a \in A[S_q^2]$.


Proof. For $\psi = (\psi_+\ \psi_-)^{\mathrm{tr}} \in S_+ \oplus S_-$, using the derivation property of the vector fields $D_\pm$, $D_0$, the commutator $[D, a]$ works out to be
\[ [D,a]\psi = \begin{pmatrix} (D_+a)\psi_- \\ 0 \end{pmatrix} + \begin{pmatrix} (D_0a)\psi_+ \\ -(D_0a)\psi_- \end{pmatrix} + \begin{pmatrix} 0 \\ (D_-a)\psi_+ \end{pmatrix} = \bigl((D_+a)\sigma_+ + (D_0a)\sigma_0 + (D_-a)\sigma_-\bigr)\psi. \]
To obtain the desired result, one simply substitutes $D_\pm = L_\pm$ and $D_0 = L_0 + q^{-2}L_z$, observing that $L_za = 0$ for all $a \in A[S_q^2]$.

This also shows that for all $a \in A[S_q^2]$ the commutator $[D, a]$ recovers the one-form $da$, acting on the spinors $S$ by "Clifford multiplication". The summand $D_+\sigma_+ + D_-\sigma_-$ in the operator (5.10) is precisely the Dirac operator of [4], corresponding [20] to the "two-dimensional" differential calculus on the sphere $S_q^2$. The extra term $D_0$ in our Dirac operator is the origin of the extra 'direction' in the calculus $\Omega^1 S_q^2$. It is clear from (4.2) that $D_0$ vanishes when $q \to 1$, whence the classical limit of our construction is just the canonical spectral triple on the classical two-sphere $S^2$.

Next, we compute the spectrum of the Dirac operator. We shall use the identities
\[ L_+L_- = qEFK^{-2} = q\left(C_q + \tfrac14 - \frac{q^{-1}K^2 - 2 + qK^{-2}}{(q - q^{-1})^2}\right)K^{-2}, \]
\[ L_-L_+ = q^{-1}FEK^{-2} = q^{-1}\left(C_q + \tfrac14 - \frac{qK^2 - 2 + q^{-1}K^{-2}}{(q - q^{-1})^2}\right)K^{-2}, \tag{5.11} \]
each obtained using the expression (3.6) for the quantum Casimir $C_q$. Moreover, we know from (3.13) that for all $\psi_\pm \in S_\pm$ we have
\[ K^2\psi_\pm = q^{\pm 1}\psi_\pm, \qquad K^{-2}\psi_\pm = q^{\mp 1}\psi_\pm. \tag{5.12} \]

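These facts lead to the following result.

Proposition 5.4. The Dirac operator $D$ obeys
\[ D^2 = q^{-2}\nu^4\Bigl(C_q + \tfrac14 - [\tfrac12]^2\Bigr)^2 + C_q + \tfrac14, \]
where $C_q$ is the quantum Casimir.

Proof. Using the Pauli relations (5.8) one computes that, for $\psi = (\psi_+\ \psi_-)^{\mathrm{tr}} \in S$,
\[ D^2\psi = D_0^2\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\psi + D_+D_-\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\psi + D_-D_+\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}\psi. \tag{5.13} \]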


The crucial fact in this calculation is that $D_0$ is a function of the Casimir $C_q$ and therefore commutes with $D_\pm$. Next, using the relations (5.11) and (5.12) we find
\[ D_\pm D_\mp\psi_\pm = \Bigl(C_q + \tfrac14\Bigr)\psi_\pm \]
for each $\psi_\pm \in S_\pm$. Furthermore, we have that
\[ D_0^2 = q^{-2}\nu^4\Bigl(C_q + \tfrac14 - [\tfrac12]^2\Bigr)^2. \]
Substituting these expressions into (5.13) yields the formula as claimed.

As an immediate consequence we obtain the spectrum of our Dirac operator $D$.

Corollary 5.5. The Dirac operator $D$ defined in (5.10) has spectrum
\[ \mathrm{Spec}(D) = \Bigl\{ \pm\bigl(q^{-2}\nu^4[j]^2[j+1]^2 + [j+\tfrac12]^2\bigr)^{1/2} \Bigm| j \in \mathbb{N} + \tfrac12 \Bigr\} \]
with multiplicities $2j + 1$.

Proof. The eigenvalues of $C_q$ are given in (3.8): each $|j,m\rangle_\pm$ is an eigenvector with eigenvalue $[j+\tfrac12]^2 - \tfrac14$, whence the multiplicity of the $j$th eigenvalue is $2(2j+1)$. From the expression for $D^2$ in Proposition 5.4, we read off its eigenvalues using those for $C_q$, yielding
\[ \mathrm{Spec}(D^2) = \Bigl\{ \lambda_j := q^{-2}\nu^4[j]^2[j+1]^2 + [j+\tfrac12]^2 \Bigm| j \in \mathbb{N} + \tfrac12 \Bigr\}, \tag{5.14} \]
each having multiplicity $2(2j+1)$. Here we have used the identity $[j+\tfrac12]^2 - [\tfrac12]^2 = [j][j+1]$. The eigenvalues of $D$ are therefore just $\pm\lambda_j^{1/2}$ with multiplicities $2j + 1$.

By inspection, we see that the eigenvalues of $|D|$ grow no faster than $q^{-2j}$ for large $j$, in contrast with the Dirac operator of [4], whose eigenvalues diverge no faster than $q^{-j}$. It is the extra term $D_0$ which accounts for this behavior. This result immediately gives us an expression for $D$ in terms of an orthonormal basis of eigenspinors $|j,m;\uparrow\rangle$, $|j,m;\downarrow\rangle$ defined by
\[ D|j,m;\uparrow\rangle = \mu_j|j,m;\uparrow\rangle, \qquad D|j,m;\downarrow\rangle = -\mu_j|j,m;\downarrow\rangle \tag{5.15} \]
with eigenvalues
\[ \mu_j := \bigl(q^{-2}\nu^4[j]^2[j+1]^2 + [j+\tfrac12]^2\bigr)^{1/2}. \]


To proceed further, it will be necessary to have an explicit description of these eigenspinors in terms of the basic spinors $|j,m\rangle_\pm$. By evaluating the actions of $D_\pm$, $D_0$ on $S$ one finds that the Dirac operator is
\[ D|j,m\rangle_\pm = \pm q^{-1}\nu^2[j][j+1]\,|j,m\rangle_\pm + [j+\tfrac12]\,|j,m\rangle_\mp, \tag{5.16} \]
the first term corresponding to the action of $D_0\sigma_0$, the second to the action of $D_\pm\sigma_\pm$. Knowing the eigenvalues of $D$, we find the corresponding eigenspinors to be
\[ |j,m;\uparrow\rangle := \frac{1}{\sqrt{2\mu_j}}\bigl(-\zeta_j^+|j,m\rangle_+ - \zeta_j^-|j,m\rangle_-\bigr), \qquad |j,m;\downarrow\rangle := \frac{1}{\sqrt{2\mu_j}}\bigl(-\zeta_j^-|j,m\rangle_+ + \zeta_j^+|j,m\rangle_-\bigr), \tag{5.17} \]
for $m = -j, -j+1, \ldots, j-1, j$ and $j \in \mathbb{N} + \tfrac12$, where we have written
\[ \zeta_j^+ = \bigl(\mu_j + q^{-1}\nu^2[j][j+1]\bigr)^{1/2}, \qquad \zeta_j^- = \bigl(\mu_j - q^{-1}\nu^2[j][j+1]\bigr)^{1/2}. \tag{5.18} \]
The square roots are what make these vectors unit eigenvectors of (5.16): one has $(\zeta_j^+)^2 + (\zeta_j^-)^2 = 2\mu_j$ and $\zeta_j^+\zeta_j^- = [j+\tfrac12]$. On the two-dimensional subspace $V_{j,m}$ spanned by $|j,m\rangle_+$, $|j,m\rangle_-$ for fixed values of $j$, $m$, the operator which diagonalizes $D$ is just the orthogonal matrix
\[ W_j := \frac{1}{\sqrt{2\mu_j}}\begin{pmatrix} -\zeta_j^+ & -\zeta_j^- \\ -\zeta_j^- & \zeta_j^+ \end{pmatrix}. \tag{5.19} \]
We write $W: H \to H$ for the closure of the operator defined by the matrices $W_j$, $j \in \mathbb{N} + \tfrac12$.

5.3. Spectral properties of $S_q^2$

We now show that the datum $(A[S_q^2], H, D)$ fulfils the conditions required of a spectral triple, which we then equip with a real structure in the sense of Definition 5.2.

Theorem 5.6. The datum $(A[S_q^2], H, D)$ constitutes a unital spectral triple over the sphere $S_q^2$ with metric dimension zero.

Proof. For each $a \in A[S_q^2]$ the commutator $[D, a]$ acts on $S$ by multiplication operators and is therefore itself a bounded operator. In fact, for the summand $D_+\sigma_+ + D_-\sigma_-$ this goes as in [4], whereas for the term $D_0$ one gets multiplication by $L_0a$, which belongs to $A[S_q^2]$ itself. The operator $D$ clearly satisfies $D = D^*$ on the dense domain $S$ of $H$. From Corollary 5.5 it is clear that the only accumulation points of the spectrum of $D$ are at infinity, so the resolvent of $D$ is compact. Since the eigenvalues of $D$ grow exponentially with $j \in \mathbb{N} + \tfrac12$, the metric dimension is just zero.
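The diagonalization on each block $V_{j,m}$ and the exponential growth of $\mu_j$ used in the proof of Theorem 5.6 can be probed numerically. The following is a minimal sketch under stated assumptions, not part of the original argument: it takes $0 < q < 1$, assumes $\nu = q - q^{-1}$ (as suggested by comparing (5.9) with (5.16)), and takes the entries of $W_j$ to be $\pm(\zeta_j^\pm/\ldots)$ in the square-root normalization $\sqrt{(\mu_j \pm q^{-1}\nu^2[j][j+1])/(2\mu_j)}$. The function names are illustrative only.

```python
import math

def qnum(x, q):
    """q-number [x] = (q**x - q**-x) / (q - 1/q)."""
    return (q**x - q**(-x)) / (q - 1.0 / q)

def block_entries(j, q):
    """Entries (a, b) of D on span{|j,m>_+, |j,m>_-}, read off from (5.16).

    Assumption: nu = q - 1/q.
    """
    nu = q - 1.0 / q
    a = nu**2 * qnum(j, q) * qnum(j + 1, q) / q   # contribution of D_0 sigma_0
    b = qnum(j + 0.5, q)                           # contribution of D_pm sigma_pm
    return a, b

def mu(j, q):
    """Positive eigenvalue mu_j of the 2x2 block [[a, b], [b, -a]]."""
    a, b = block_entries(j, q)
    return math.hypot(a, b)

def W(j, q):
    """Matrix W_j of (5.19) in the square-root normalization (assumption)."""
    a, b = block_entries(j, q)
    m = math.hypot(a, b)
    zp = math.sqrt((m + a) / (2 * m))
    zm = b / math.sqrt(2 * m * (m + a))   # equals sqrt((m - a) / (2 m)), computed stably
    return [[-zp, -zm], [-zm, zp]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][l] for k in range(2)) for l in range(2)] for i in range(2)]

def transpose(A):
    return [[A[l][i] for l in range(2)] for i in range(2)]

def diagonalized(j, q):
    """W_j D W_j^T restricted to the block; should equal diag(mu_j, -mu_j)."""
    a, b = block_entries(j, q)
    D = [[a, b], [b, -a]]
    Wj = W(j, q)
    return matmul(matmul(Wj, D), transpose(Wj))

if __name__ == "__main__":
    q = 0.5
    for twice_j in range(1, 12, 2):
        j = twice_j / 2
        M = diagonalized(j, q)
        # columns: j, mu_j, diagonal entries, off-diagonal entry, mu_j * q^(2j)
        print(j, mu(j, q), M[0][0], M[1][1], M[0][1], mu(j, q) * q**(2 * j))
```

In this sketch the last printed column stabilizes as $j$ grows, which is one way to see the $q^{-2j}$ growth of the eigenvalues of $|D|$.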


Proposition 5.7. With the $\mathbb{Z}_2$-grading $\Gamma: H \to H$ defined by
\[ \Gamma|j,m;\uparrow\rangle := |j,m;\downarrow\rangle, \qquad \Gamma|j,m;\downarrow\rangle := |j,m;\uparrow\rangle \]
on the orthonormal basis (5.15) and extended by $A[S_q^2]$-linearity, the datum $(A[S_q^2], H, D, \Gamma)$ constitutes an even spectral triple.

Proof. It is obvious that $\Gamma^2 = 1$ and $\Gamma = \Gamma^*$. The property $\Gamma D + D\Gamma = 0$ follows from the fact that $\Gamma$ interchanges the $+\mu_j$ and $-\mu_j$ eigenspaces of $D$, as may be verified directly on the basis vectors (5.15).

Next a real structure. Since we have made the same choice for the spinors as in [4], it is tempting to take the same real structure as well. However, one quickly finds that this choice is unsuitable, since it neither commutes nor anti-commutes with our Dirac operator $D$. The reason for this lies mainly in the fact that the term $D_0$ in our Dirac operator (5.10) is proportional to the Casimir operator, which is rather a "second order differential operator", if anything. Instead, we define an anti-unitary operator $J: H \to H$ in terms of its action on the orthonormal basis (5.15) by
\[ J|j,m;\uparrow\rangle = (-1)^{m+1/2}|j,-m;\uparrow\rangle, \qquad J|j,m;\downarrow\rangle = (-1)^{m+1/2}|j,-m;\downarrow\rangle \]
and seek to show that this $J$ equips the datum $(A[S_q^2], H, D, \Gamma)$ with a real structure. It is not difficult to check that the $J$ above is equivariant under the right action of $U_q(su(2))$ on $H$, making it a particularly natural choice.

Proposition 5.8. The operator $J$ satisfies $J^2 = -1$, $DJ = JD$ and $\Gamma J = -J\Gamma$.

Proof. The fact that $J^2 = -1$ is immediate. We find that
\[ (DJ - JD)|j,m;\uparrow\rangle = (-1)^{m+1/2}D|j,-m;\uparrow\rangle - \mu_jJ|j,m;\uparrow\rangle = (-1)^{m+1/2}\mu_j|j,-m;\uparrow\rangle - (-1)^{m+1/2}\mu_j|j,-m;\uparrow\rangle = 0, \]
\[ (J\Gamma + \Gamma J)|j,m;\uparrow\rangle = J|j,m;\downarrow\rangle - (-1)^{m+1/2}\Gamma|j,-m;\uparrow\rangle = (-1)^{m+1/2}|j,-m;\downarrow\rangle - (-1)^{m+1/2}|j,-m;\downarrow\rangle = 0, \]
where we have used anti-linearity of $J$. Similar computations hold on $|j,m;\downarrow\rangle$.

Aiming at (modified) commutant and first order conditions as in Definition 5.2, and having in mind the strategy of [7], we denote by $L_q$ the positive trace-class operator defined by
\[ L_q|j,m\rangle_\pm := q^{j}|j,m\rangle_\pm, \qquad j \in \mathbb{N} + \tfrac12, \]
on $H$ and let $K_q$ be the two-sided ideal of $B(H)$ generated by the operators $L_q$. The ideal $K_q$ is an ideal of infinitesimals of arbitrarily high order and so we take


$I = K_q$ as our operator ideal in Definition 5.2. Thus, to prove that $J$ defines a real structure, it remains to check that the commutant property and first order condition in (5.3) are satisfied. The strategy of [7] is based on the fact that the operators $\pi(x_i)$, $i = -1, 0, 1$, can be "approximated" by operators acting diagonally on the Hilbert space of spinors. Specifically, these operators $z_i$, $i = -1, 0, 1$, on $H$ are defined by
\[ z_i|j,m\rangle_\pm = \alpha_i^-(j,m;0)|j-1,m+i\rangle_\pm + \alpha_i^0(j,m;0)|j,m+i\rangle_\pm + \alpha_i^+(j,m;0)|j+1,m+i\rangle_\pm. \tag{5.20} \]

The coefficients are exactly the ones used in (5.7), unless $|m + i| > j + \nu$ for $\nu = -1, 0, 1$, in which case we set $\alpha_i^\nu(j,m;0) = 0$. Momentarily we shall show that the operators $z_i$ approximate the operators $\pi(x_i)$ modulo the ideal $K_q$, but to do this we first need the following technical lemma.

Lemma 5.9. With $W_j$, $j \in \mathbb{N} + \tfrac12$, the operators in (5.19), there exists a constant $C$ (independent of $j$) such that
\[ \|W_jW_{j+1}^* - 1\| < Cq^{j} \]
for all $j \in \mathbb{N} + \tfrac12$.

Proof. One evaluates the norm $\|W_jW_{j+1}^* - 1\|$ by computing the eigenvalues of the $2 \times 2$ matrix $W_jW_{j+1}^* - 1$ and choosing the larger of the two, finding it to be
\[ \|W_jW_{j+1}^* - 1\| = \frac{\zeta_j^+\zeta_{j+1}^+ + \zeta_j^-\zeta_{j+1}^- - \zeta_j^-\zeta_{j+1}^+ + \zeta_j^+\zeta_{j+1}^- - 2\sqrt{\mu_j\mu_{j+1}}}{2\sqrt{\mu_j\mu_{j+1}}}. \]
Using the inequalities $[j] < (q - q^{-1})^{-1}q^{-j}$ and $[j]^{-1} < q^{j-1}$, elementary estimates for each of the terms in this expression yield that $\zeta_j^\pm < C'q^{-j}$ and $\sqrt{\mu_j\mu_{j+1}} < C''q^{-2j}$ for real constants $C'$, $C''$, so it appears at first glance that the above norm has an $O(1)$ behavior. However, a more detailed analysis shows that the coefficient of $q^{-2j}$ in the numerator is in fact zero; the behavior of the numerator is therefore $O(q^{-j})$ and we have our result.

Proposition 5.10. There exist bounded operators $A_i$, $B_i$, $i = -1, 0, 1$, such that $\pi(x_i) - z_i = A_iL_q = L_qB_i$ when acting upon the basis vectors $|j,m;\uparrow\downarrow\rangle$. In particular, $\pi(x_i) - z_i \in K_q$ for $i = -1, 0, 1$.

Proof. From [7, Lemma 4.4], there exist bounded operators $A_i$, $B_i$, $i = -1, 0, 1$, such that $\pi(x_i) - z_i = A_iL_q = L_qB_i$


with respect to the basis $|j,m\rangle_\pm$ of $H$, and so the operators $\pi(x_i)$ are approximated by the operators $z_i$ modulo the ideal $K_q$ of infinitesimals. We need to check that using the operator $W$ to change the basis vectors from $|j,m\rangle_\pm$ to $|j,m;\uparrow\downarrow\rangle$ does not spoil this approximation property. Evaluating $W_jz_iW_j^* - z_i$ on $|j,m;\uparrow\downarrow\rangle$ gives
\[ (W_jz_iW_j^* - z_i)|j,m;\uparrow\downarrow\rangle = \alpha_i^-(j,m;0)(W_{j-1}W_j^* - 1)|j-1,m+i;\uparrow\downarrow\rangle + \alpha_i^+(j,m;0)(W_jW_{j+1}^* - 1)|j+1,m+i;\uparrow\downarrow\rangle. \]
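The decay of $\|W_jW_{j+1}^* - 1\|$ asserted in Lemma 5.9, which is what controls the two terms above, can be checked numerically for a given $q$. The sketch below is illustrative only and rests on the same assumptions as before: $0 < q < 1$, $\nu = q - q^{-1}$, and $W_j$ taken in the square-root normalization that makes it orthogonal. It computes the operator norm directly rather than through a closed formula, so it does not depend on how that formula is written.

```python
import math

def qnum(x, q):
    """q-number [x] = (q**x - q**-x) / (q - 1/q)."""
    return (q**x - q**(-x)) / (q - 1.0 / q)

def W(j, q):
    """Orthogonal matrix W_j of (5.19); nu = q - 1/q is an assumption."""
    nu = q - 1.0 / q
    a = nu**2 * qnum(j, q) * qnum(j + 1, q) / q
    b = qnum(j + 0.5, q)
    m = math.hypot(a, b)
    zp = math.sqrt((m + a) / (2 * m))
    zm = b / math.sqrt(2 * m * (m + a))
    return [[-zp, -zm], [-zm, zp]]

def op_norm(M):
    """Operator norm of a real 2x2 matrix via the largest eigenvalue of M^T M."""
    g11 = M[0][0]**2 + M[1][0]**2
    g22 = M[0][1]**2 + M[1][1]**2
    g12 = M[0][0] * M[0][1] + M[1][0] * M[1][1]
    tr, det = g11 + g22, g11 * g22 - g12**2
    disc = max(tr * tr - 4 * det, 0.0)   # guard against tiny negative rounding
    return math.sqrt((tr + math.sqrt(disc)) / 2)

def defect(j, q):
    """|| W_j W_{j+1}^* - 1 || for real matrices (adjoint = transpose)."""
    A, B = W(j, q), W(j + 1, q)
    P = [[sum(A[i][k] * B[l][k] for k in range(2)) for l in range(2)] for i in range(2)]
    P[0][0] -= 1.0
    P[1][1] -= 1.0
    return op_norm(P)

if __name__ == "__main__":
    q = 0.5
    for twice_j in range(1, 22, 2):
        j = twice_j / 2
        # the ratio in the last column should stay bounded if the lemma holds
        print(j, defect(j, q), defect(j, q) / q**j)
```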

This and Lemma 5.9 yield that $W_jz_iW_j^* - z_i \in K_q$ for all $i = -1, 0, 1$ and all $j \in \mathbb{N} + \tfrac12$.

As a consequence, we immediately get the commutant property, the first of the two conditions in (5.3).

Proposition 5.11. For all $a, b \in A[S_q^2]$ we have $[\pi(a), J\pi(b)J^{-1}] \in K_q$.

Proof. From the derivation property of commutators, it suffices to check this only for the generators $x_{-1}$, $x_0$, $x_1$ of $A[S_q^2]$. With the operators $z_{-1}$, $z_0$, $z_1$ defined in (5.20), we have
\[ Jz_kJ^{-1}|j,m\rangle_\pm = (-1)^{k}\bigl(\alpha_k^-(j,-m;0)|j-1,m-k\rangle_\pm + \alpha_k^0(j,-m;0)|j,m-k\rangle_\pm + \alpha_k^+(j,-m;0)|j+1,m-k\rangle_\pm\bigr). \tag{5.21} \]
Using this, one computes as in [7, Lemma 6.2] that
\[ [z_i, Jz_kJ^{-1}] = 0, \qquad i, k = -1, 0, 1. \tag{5.22} \]
It is straightforward to check that
\[ [\pi(x_i), J\pi(x_k)J^{-1}] = [\pi(x_i) - z_i, J\pi(x_k)J^{-1}] + [z_i, J(\pi(x_k) - z_k)J^{-1}] + [z_i, Jz_kJ^{-1}], \]
whence the assertion follows from Proposition 5.10.

We are now ready for our main theorem regarding the differential structure of $S_q^2$.

Theorem 5.12. The datum $(A[S_q^2], H, D, \Gamma, J)$ constitutes a real even unital spectral triple (up to infinitesimals) with KO-dimension equal to two.

Proof. Having already established Propositions 5.8 and 5.11, it remains to verify the first order condition for $D$, namely that $[[D, a], JbJ^{-1}] \in K_q$ for all $a, b \in A[S_q^2]$. For this, we split the Dirac operator into two pieces, $D = D_\Delta + D_\Omega$, where $D_\Delta = D_0\sigma_0$ and $D_\Omega = D_-\sigma_- + D_+\sigma_+$. By linearity it suffices to check the first order condition for $D_\Delta$ and $D_\Omega$ individually.


Since $D_0$ is a function of the Casimir, each $a \in A[S_q^2]$ is an eigenfunction for the derivation $[D_\Delta, \cdot\,]$, whence the first order condition for $D_\Delta$ follows immediately from the commutant property in Proposition 5.11. On the other hand, the component $D_\Omega$ has eigenvalues $\pm\gamma_j$, $\gamma_j := [j + \tfrac12]$, whose growth with $j$ obeys $\gamma_j < Cq^{-j}$ for $C$ a real constant (as already mentioned, $D_\Omega$ is precisely the Dirac operator considered in [4]). It is easy to compute that
\[ [D_\Omega, z_i]|j,m\rangle_\pm = (\gamma_{j-1} - \gamma_j)\alpha_i^-(j,m;0)|j-1,m+i\rangle_\mp + (\gamma_{j+1} - \gamma_j)\alpha_i^+(j,m;0)|j+1,m+i\rangle_\mp. \]
Using this expression, together with (5.21), one calculates the action of the commutators $[[D_\Omega, z_i], Jz_kJ^{-1}]$ for $i, k = -1, 0, 1$ and finds them to be a sum of five independent weighted shift operators with weights $S_{i,k}^\nu(j,m)$, $\nu = -2, \ldots, 2$, i.e.
\[ [[D_\Omega, z_i], Jz_kJ^{-1}]|j,m\rangle_\pm = \sum_{\nu=-2}^{2} S_{i,k}^\nu(j,m)\,|j+\nu, m+i-k\rangle_\pm. \]
These weights $S_{i,k}^\nu(j,m)$ are estimated using exactly the same method as in [7, Proposition 6.5]. In our case, the growth condition for $\gamma_j$ is sufficient to guarantee that $|S_{i,k}^\nu(j,m)| < C'q^{j}$ for some real constant $C'$. We conclude that $[[D_\Omega, z_i], Jz_kJ^{-1}] \in K_q$ for all $i, k = -1, 0, 1$. Since the $z_i$ approximate the operators $\pi(x_i)$ modulo $K_q$, the proof is complete.

Acknowledgments

Both authors were partially supported by the Italian Project "Cofin08–Noncommutative Geometry, Quantum Groups and Applications". SB is grateful to INdAM–GNSAGA for support and the Department of Mathematics at the University of Trieste for its hospitality. We thank Francesco D'Andrea for very useful comments.

References

[1] T. Brzeziński and S. Majid, Quantum group gauge theory on quantum spaces, Comm. Math. Phys. 157 (1993) 591–638; Erratum, ibid. 167 (1995) 235.
[2] A. Connes, Noncommutative Geometry (Academic Press, 1994).
[3] A. Connes, Gravity coupled with matter and the foundation of noncommutative geometry, Comm. Math. Phys. 182 (1996) 155–176.
[4] L. Dąbrowski and A. Sitarz, Dirac operator on the standard Podleś quantum sphere, in Noncommutative Geometry and Quantum Groups (Warsaw, 2001), Banach Center Publ., Vol. 61 (Polish Acad. Sci., Warsaw, 2003), pp. 49–58.
[5] L. Dąbrowski, G. Landi, M. Paschke and A. Sitarz, The spectral geometry of the equatorial Podleś sphere, C. R. Math. Acad. Sci. Paris 340 (2005) 819–822.
[6] L. Dąbrowski, G. Landi, A. Sitarz, W. D. van Suijlekom and J. C. Várilly, The Dirac operator on SUq(2), Comm. Math. Phys. 259 (2005) 729–759.
[7] L. Dąbrowski, F. D'Andrea, G. Landi and E. Wagner, Dirac operators on all Podleś spheres, J. Noncommut. Geom. 1 (2007) 213–239.


[8] F. D'Andrea, L. Dąbrowski and G. Landi, The isospectral Dirac operator on the 4-dimensional orthogonal quantum sphere, Comm. Math. Phys. 279 (2008) 77–116.
[9] M. Đurđević, Geometry of quantum principal bundles. I, Comm. Math. Phys. 175 (1996) 457–520.
[10] M. Đurđević, Geometry of quantum principal bundles. II, Rev. Math. Phys. 9 (1997) 531–607.
[11] A. Klimyk and K. Schmüdgen, Quantum Groups and Their Representations (Springer Verlag, Berlin Heidelberg, 1997).
[12] G. Landi and A. Zampini, in preparation.
[13] S. Majid, Quantum and braided group Riemannian geometry, J. Geom. Phys. 30 (1999) 113–146.
[14] S. Majid, Noncommutative Riemannian and spin geometry of the standard q-sphere, Comm. Math. Phys. 256 (2005) 255–285.
[15] T. Masuda, K. Mimachi, Y. Nakagami, M. Noumi and K. Ueno, Representations of the quantum group SUq(2) and the little q-Jacobi polynomials, J. Funct. Anal. 99 (1991) 357–387.
[16] P. Podleś, Quantum spheres, Lett. Math. Phys. 14 (1987) 193–202.
[17] P. Podleś, Differential calculus on quantum spheres, Lett. Math. Phys. 18 (1989) 107–119.
[18] P. Podleś, The classification of differential structures on quantum two-spheres, Comm. Math. Phys. 150 (1992) 167–179.
[19] K. Schmüdgen, Commutator representations of differential calculi on the quantum group SUq(2), J. Geom. Phys. 31 (1999) 241–264.
[20] K. Schmüdgen and E. Wagner, Dirac operator and a twisted cyclic cocycle on the standard Podleś quantum sphere, J. Reine Angew. Math. 574 (2004) 219–235.
[21] K. Schmüdgen and E. Wagner, Representations of crossed product algebras of Podleś quantum spheres, J. Lie Theory 17 (2007) 751–790.
[22] S. L. Woronowicz, Differential calculus on compact matrix pseudogroups (quantum groups), Comm. Math. Phys. 122 (1989) 125–170.

October 12, J070-S0129055X10004120

2010 10:1 WSPC/S0129-055X

148-RMP

Reviews in Mathematical Physics Vol. 22, No. 9 (2010) 995–1032
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004120

BOREL SUMMABILITY OF $\varphi_4^4$ PLANAR THEORY VIA MULTISCALE ANALYSIS

MARCELLO PORTA∗ and SERGIO SIMONELLA†

∗Dipartimento di Fisica, Università di Roma "Sapienza", Piazzale Aldo Moro 5, 00185 Roma, Italy

†Dipartimento di Matematica, Università di Roma "Sapienza", Piazzale Aldo Moro 5, 00185 Roma, Italy

∗[email protected]
†[email protected]

Received 23 March 2010

We review the issue of Borel summability in the framework of multiscale analysis and renormalization group, by discussing a proof of Borel summability of the $\varphi_4^4$ massive Euclidean planar theory; this result is not new, since it was obtained by Rivasseau and 't Hooft. However, the techniques that we use have already proved effective in the analysis of various models of condensed matter and field theory; therefore, we take the $\varphi_4^4$ planar theory as a toy model for future applications.

Keywords: Borel summability; $\varphi_4^4$ theory; renormalization group.

Mathematics Subject Classification 2010: 81T08, 81T17, 40G10

1. Introduction

The problem of giving a meaning to the formal perturbative series defining the scalar $\varphi_4^4$ theory, the simplest four-dimensional interacting field theory, has been much debated (see [7] for a critical introduction to the problem) and is still wide open, although several triviality conjectures have been proposed since the work of Landau [1]. Here we focus on the planar restriction of the full perturbative series; that is, we consider only the graphs that can be drawn on a sheet of paper without lines ever crossing at points where no interaction vertex is present. This problem is much easier than the complete case, since the number of topological Feynman graphs contributing to a given order $n$ is much smaller than the original $n!$. In fact, in the planar theory this number is bounded by $(\mathrm{const.})^n$, see [10, 11]. Still, the problem is far from being trivial, since the theory needs to be renormalized; this can be done using the renormalization group, see [6, 12, 13, 25], for instance.


It is well known that the full $\varphi_4^4$ theory with the "wrong" sign of the renormalized coupling constant, that is, the one corresponding to an unstable self-interaction potential, is perturbatively asymptotically free, in the sense that, truncating the beta function at a finite order, the running coupling constant describing the interaction of the fields at energy scale $\mu$ flows to zero in the ultraviolet as $(\log\mu)^{-1}$. This fact does not have any direct physical interpretation in the full $\varphi_4^4$ theory, since the theory is not defined for the considered value of the renormalized coupling constant. Moreover, the beta function itself is not defined, because of the factorial growth of the number of topological Feynman graphs in the order of the series. However, these problems do not affect the planar theory, since it is only defined perturbatively and the number of graphs at a given order is far smaller than the original $n!$. Therefore, one can hope in this case to exploit asymptotic freedom to rigorously construct the theory. This has been done independently by Rivasseau and 't Hooft using quite different methods, see [2–5]; indeed, they proved that the renormalized perturbative series defining the Schwinger functions, which are the result of various resummations, are absolutely convergent. In particular, they proved that the result is the Borel sum of the perturbative series in the renormalized coupling constant. This last fact means that the Schwinger functions can be computed to arbitrary accuracy starting from their perturbative series in the renormalized coupling constant, following a well-defined prescription; moreover, the result is unique within a certain class of functions, the Borel summable ones. Clearly, this does not exclude the existence of other less regular solutions with the same formal perturbative expansion.
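To make the notion of Borel sum concrete, here is a toy illustration that is not taken from the paper: the divergent series $\sum_{n\ge 0}(-1)^n n!\,\lambda^n$ has Borel transform $1/(1+t)$ and Borel sum $\int_0^\infty e^{-t}/(1+\lambda t)\,dt$ for $\lambda > 0$, and the remainder after $N$ terms is bounded by $N!\,\lambda^N$, which is the kind of factorial remainder bound that enters Borel summability criteria. The sketch below, with illustrative function names and a hand-rolled Simpson rule, compares partial sums with the integral.

```python
import math

def borel_sum(lam, t_max=60.0, steps=20000):
    """Composite Simpson approximation of int_0^inf exp(-t) / (1 + lam*t) dt.

    The integral is cut at t_max, where exp(-t) is negligible; steps must be even.
    """
    h = t_max / steps
    f = lambda t: math.exp(-t) / (1.0 + lam * t)
    total = f(0.0) + f(t_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0

def partial_sum(lam, order):
    """Sum of (-1)^n n! lam^n over n < order (a divergent asymptotic series)."""
    return sum((-1)**n * math.factorial(n) * lam**n for n in range(order))

if __name__ == "__main__":
    lam = 0.1
    exact = borel_sum(lam)
    for order in range(0, 31, 2):
        err = abs(exact - partial_sum(lam, order))
        bound = math.factorial(order) * lam**order
        # the error first shrinks, then grows again once order exceeds about 1/lam
        print(order, err, bound)
```

The partial sums never converge, yet the integral is reproduced up to an error of order $N!\,\lambda^N$ at every truncation order $N$; this is the sense in which a Borel summable function is determined by its perturbative series.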
At the time of those works, besides the possibility of giving a mathematically rigorous meaning to a simple quantum field theory, the physical motivation of the study was that the $\varphi_4^4$ planar theory is formally equal to the limit $N \to \infty$ of a massive $SU(N)$ theory in four dimensions, with interaction $\lambda\,\mathrm{Tr}\,\varphi^4$ where $\varphi$ is an $N \times N$ matrix, see [3, 11]. In particular, in 't Hooft's work the planar approximation was seen as a first step towards the more ambitious study of QCD with a large number of colors. In this paper, we review the issue of Borel summability of the $\varphi_4^4$ planar theory using the rigorous renormalization group techniques introduced in [6, 12, 13] (in [6, 13] the flow of the running coupling constants of the planar theory was heuristically discussed), which make possible a transparent proof of the ultraviolet stability of the massive Euclidean $\varphi_4^4$ theory, through the so-called "$n!$ bounds". One of the motivations of our work lies in the fact that very few proofs of Borel summability based on renormalization group methods are present in the literature [8, 9]. Moreover, we take the $\varphi_4^4$ planar theory as a first step towards the study of physically more interesting models, which can be analyzed by similar techniques. As mentioned before, the great gain that one has in the planar restriction of the full $\varphi_4^4$ theory is that the topological Feynman graphs of a given order $n$ are far fewer (their number is bounded as $(\mathrm{const.})^n$, against the $n!$ of the full case). This is in a sense reminiscent of what happens in fermionic field theories, where it is possible


to control the factorial growth of the number of Feynman graphs by exploiting the $-1$ arising in the anticommutation of the fields, showing that the $n$th order of the series, which is given by $n!$ addends, reconstructs the determinant of an $n \times n$ matrix, which is estimated by $(\mathrm{const.})^n$. For instance, we think that the methods described in this paper could be useful to prove Borel summability for the one-dimensional Hubbard model, where one sector of the theory is asymptotically free, while to control the flow of the other running coupling constants one has to prove that the beta function is vanishing [21]. This model has been rigorously constructed in [15] using renormalization group methods similar to those used here, but a proof of Borel summability has not been given yet. Informally, our main result can be stated as follows; we refer the reader to Sec. 3, Theorem 1, for a precise formulation.

Main result. The Schwinger functions of the Euclidean massive planar $\varphi_4^4$ theory are Borel summable in the renormalized coupling constant; in particular, they satisfy the hypotheses of the Nevanlinna–Sokal theorem [14], which are sufficient conditions for Borel summability.

Roughly speaking, our proof goes as follows. First, by choosing the renormalized coupling constant in a suitable complex domain, we prove that the flow equation defining recursively the running coupling constants at all energy scales admits a bounded solution which falls into the radius of convergence of the Schwinger functions, and verifies some special regularity properties. To do that, we use a fixed point argument, similar to the one introduced by 't Hooft in [2].
Then, to conclude the check of the hypotheses of the Nevanlinna–Sokal theorem on Borel summability, we show that it is possible to "undo" the resummation that allowed us to write the Schwinger functions as power series in the running coupling constants, so that the $n$th order Taylor remainder in the renormalized coupling constant $\lambda$ can be bounded proportionally to $n!|\lambda|^{n+1}$ uniformly in the analyticity domain. To prove this second statement, we rely in a crucial way on the Gallavotti–Nicolò tree representation of the beta function; the "undoing" of the resummations, corresponding to rather involved analytical operations, is made clear by a graphical manipulation of these trees. This procedure is quite similar in spirit to what has been done by Rivasseau in [4]. Therefore, we feel that our proof lies halfway between those of Rivasseau and 't Hooft. As mentioned above, in 't Hooft's approach, which is based on renormalization group ideas, the flow of the beta function is studied in a way analogous to the one we follow. However, instead of deriving bounds on the remainder of the resummed perturbative series, 't Hooft, see [2], concludes the proof of Borel summability by checking the analyticity properties of the Borel transform using a totally independent argument, which we have not been able to rigorously reproduce in our framework. As for the comparison with Rivasseau's work, see [4], the main difference is that in his approach the beta function is not introduced: to construct


the planar theory Rivasseau uses a "minimal" resummation procedure, involving only a certain class of Feynman graphs with four external legs, the parquet ones. This defines an asymptotically free "running coupling constant", and it turns out to be enough to prove the finiteness of the planar theory. To conclude, Rivasseau shows that the result of these operations is the Borel sum of the nonrenormalized series, by proving an $n!$ bound on the Taylor remainder; this bound is obtained by undoing the resummation of the parquet subgraphs in a suitable way.

The paper is organized as follows. In Sec. 2, we define the model, fix the notation, briefly review the ideas behind multiscale integration and introduce the beta function and the flow of the running coupling constants; we refer the interested reader to [6, 12, 13] for a detailed introduction to these techniques. In Sec. 3, we state our main result and discuss the strategy of the proof. Finally, in Sec. 4 and in the appendices cited therein, we prove the theorem.

2. Renormalization Group Analysis

In this section we describe the iterative procedure that allows one to express the Schwinger functions of the full $\varphi_4^4$ theory as power series that are finite order by order in the ultraviolet limit, graphically represented in terms of renormalized Feynman graphs; at the same time, we define the planar $\varphi_4^4$ theory by considering at each step only the planar graphs. Our discussion will be quite short; we refer the reader to [6] for a detailed proof of the renormalizability of the $\varphi_4^4$ theory.

The full $\varphi_4^4$ theory. Let $\varphi_x^{(\le N)}$ be a massive gaussian free field with ultraviolet cut-off at length $\gamma^{-N}$, where $\gamma > 1$ is a fixed scale parameter, and $x \in \Lambda$ where $\Lambda$ is a four-dimensional box of side size $L$ with periodic boundary conditions; for simplicity, we set to 1 the value of the mass. We rewrite the field as:
\[ \varphi_x^{(\le N)} = \sum_{j=0}^{N}\varphi_x^{(j)}, \qquad x \in \Lambda, \tag{2.1} \]
where $\{\varphi^{(j)}\}_{j=0}^{N}$ are independent gaussian fields with propagators
\[ C_{x,y}^{(j)} := \int\frac{dp}{(2\pi)^4}\,\frac{f_j(p)}{p^2+1}\,e^{ip\cdot(x-y)}, \qquad f_j(p) := \begin{cases} e^{-p^2/\gamma^{2j}} - e^{-p^2/\gamma^{2(j-1)}} & \text{if } j > 0, \\ e^{-p^2} & \text{if } j = 0, \end{cases} \tag{2.2} \]
and $\int\frac{dp}{(2\pi)^4}$ is a shorthand for $|\Lambda|^{-1}\sum_{p = 2\pi n/L}$ with $n \in \mathbb{Z}^4$; notice that
\[ \lim_{N\to+\infty}\sum_{j=0}^{N} f_j(p) = 1. \tag{2.3} \]
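The identity (2.3) is a telescoping sum: adding the cutoff functions of (2.2) up to scale $N$ leaves $e^{-p^2/\gamma^{2N}}$, which tends to 1. A minimal numerical sketch of this, with the illustrative choice $\gamma = 2$ and hypothetical function names, is:

```python
import math

def f(j, p2, gamma=2.0):
    """Cutoff function f_j of (2.2), as a function of p2 = p^2."""
    if j == 0:
        return math.exp(-p2)
    return math.exp(-p2 / gamma**(2 * j)) - math.exp(-p2 / gamma**(2 * (j - 1)))

def partial_partition(N, p2, gamma=2.0):
    """Sum of f_j(p) over the scales j = 0, ..., N."""
    return sum(f(j, p2, gamma) for j in range(N + 1))

if __name__ == "__main__":
    p2 = 5.0
    for N in (0, 1, 2, 5, 10, 20):
        # second and third columns agree: the sum telescopes to exp(-p2 / gamma^(2N))
        print(N, partial_partition(N, p2), math.exp(-p2 / 2.0**(2 * N)))
```

Each $f_j$ with $j > 0$ is non-negative and is concentrated on momenta with $p^2$ of order $\gamma^{2j}$, which is what makes $\varphi^{(j)}$ a field "on scale $j$".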


The generating functional of the Schwinger functions of the $\varphi_4^4$ theory is given by:
\[ e^{W_N(\zeta f)} := \int \exp\Bigl\{\zeta\int dx\,\varphi_x^{(\le N)}f_x\Bigr\}\,e^{V^{(N)}(\varphi^{(\le N)})}\,P(d\varphi^{(\le N)}), \tag{2.4} \]
where $f_x$ is a Schwartz test function, $\zeta \in \mathbb{R}$, $P(d\varphi^{(\le N)}) := \prod_{j=0}^{N}P(d\varphi^{(j)})$ with $P(d\varphi^{(j)})$ the gaussian distribution of the field $\varphi^{(j)}$ with covariance given by (2.2), and the interaction $V^{(N)}$ is defined as
\[ V^{(N)}(\varphi^{(\le N)}) := \int_\Lambda dx\,\Bigl(\lambda_N :\!(\varphi_x^{(\le N)})^4\!: + \alpha_N :\!(\partial\varphi_x^{(\le N)})^2\!: + \mu_N :\!(\varphi_x^{(\le N)})^2\!: + \nu_N\Bigr), \tag{2.5} \]
where $\lambda_N$, $\alpha_N$, $\mu_N$, $\nu_N$ are called bare coupling constants, and the dots denote the Wick product of the fields (see [6, Appendix C]); notice that in our convention the "wrong" sign of $\lambda_N$ is the positive one. The generic $q$-point Schwinger function of the full $\varphi_4^4$ theory is obtained by differentiating the generating functional $q$ times with respect to $\zeta$ and setting $\zeta = 0$. Now, let $W_N(\zeta f) =: W_N^{p}(\zeta f) + W_N^{np}(\zeta f)$, where $W_N^{p}$, $W_N^{np}$ are respectively the planar and non-planar parts of $W_N$, to be defined recursively in the following; the $q$-point Schwinger function of the planar theory is defined as:
\[ S_{(N)}^{T}(f; q) := \frac{\partial^q}{\partial\zeta^q}W_N^{p}(\zeta f)\Big|_{\zeta=0}. \tag{2.6} \]

We shall denote by $S^T(f; q)$ the limit for $N \to +\infty$ of (2.6).

Multiscale analysis. As explained in [6], we can try to evaluate (2.4) by proceeding in an iterative fashion, integrating the independent fields $\varphi^{(j)}$ starting from the ultraviolet scale $j = N$ and going down to the infrared scale $j = 0$. This iterative integration gives rise to an expansion in Feynman graphs; the restriction to the planar theory will be enforced by considering at each integration step only the planar ones. For simplicity, in what follows, we shall explicitly discuss only the case $f = 0$, which corresponds to the integration of the "partition function". The case $f \neq 0$ is a straightforward extension of our argument, and it will be discussed later. After the integration of $\varphi^{(N)}, \varphi^{(N-1)}, \ldots, \varphi^{(k+1)}$, we rewrite the integral (2.4) as
\[ e^{W_N(0)} = \int e^{V^{(k)}(\varphi^{(\le k)})}\,P(d\varphi^{(\le k)}) = \int e^{V_p^{(k)}(\varphi^{(\le k)}) + V_{np}^{(k)}(\varphi^{(\le k)})}\,P(d\varphi^{(\le k)}), \tag{2.7} \]
where $P(d\varphi^{(\le k)}) := \prod_{j=0}^{k}P(d\varphi^{(j)})$, the field $\varphi^{(\le k)} = \sum_{j=0}^{k}\varphi^{(j)}$ has a propagator given by, in momentum space,
\[ C_p^{(\le k)} := \sum_{j=0}^{k}C_p^{(j)}, \qquad C_p^{(j)} := \frac{f_j(p)}{p^2+1}, \tag{2.8} \]
and the effective potential $V^{(k)}$ together with its planar and non-planar parts $V_p^{(k)}$, $V_{np}^{(k)}$ will be defined recursively. At the beginning, $V^{(N)}(\varphi^{(\le N)}) = V_p^{(N)}(\varphi^{(\le N)})$; on scale

October 12, J070-S0129055X10004120

1000

2010 10:1 WSPC/S0129-055X

148-RMP

M. Porta & S. Simonella

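k we will show that, if # = "p", "np":

$$V_\#^{(k)}(\varphi^{(\le k)}) = \sum_{m \ge 0} \int \frac{dp_1}{(2\pi)^4} \cdots \frac{dp_m}{(2\pi)^4}\, V_\#^{(k)}(p_1, \dots, p_m; m)\, :\!\prod_{i=1}^{m} \varphi^{(\le k)}_{p_i}\!:\, \delta\Big(\sum_i p_i\Big), \tag{2.9}$$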

where $V_\#^{(k)}(p_1, \dots, p_m; m)$ are suitable coefficients to be recursively defined, and the product with m = 0 is interpreted as 1. Let us perform the single scale integration. First, we split $V^{(k)}$ as $LV^{(k)} + RV^{(k)}$, where $R = 1 - L$ and $L$, the localization operator, is a linear operator acting on functions of the form (2.9), defined by its action on the kernels $V_\#^{(k)}(p_1, \dots, p_m; m)$ in the following way (with a slight abuse of notation, due to the presence of the delta function in (2.9) we only write the independent values of the momenta in the arguments of the kernels):

$$\begin{aligned} LV_\#^{(k)}(p_1, p_2, p_3; 4) &:= V_\#^{(k)}(0, 0, 0; 4), \\ LV_\#^{(k)}(p; 2) &:= V_\#^{(k)}(0; 2) + p\,\partial_p V_\#^{(k)}(0; 2) + \frac{1}{2}\, p_i p_j\, \partial_{p_i} \partial_{p_j} V_\#^{(k)}(0; 2), \end{aligned} \tag{2.10}$$

and $LV_\#^{(k)}(p_1, \dots, p_m; m) = 0$ otherwise. By symmetry, it follows that

$$\partial_{p_i} V_\#^{(k)}(0; 2) = 0, \qquad \partial_{p_i} \partial_{p_j} V_\#^{(k)}(0; 2) = 0 \ \text{ for } i \neq j, \qquad \partial_{p_i p_i} V_\#^{(k)}(0; 2) = \partial_{p_j p_j} V_\#^{(k)}(0; 2) \ \text{ for all } i, j; \tag{2.11}$$

finally, we define the running coupling constants of the planar theory on scale k as:

$$\lambda_k := V_p^{(k)}(0, 0, 0; 4), \qquad \alpha_k := \frac{1}{2}\, \partial_{p_1 p_1} V_p^{(k)}(0; 2), \qquad \gamma^{2k} \mu_k := V_p^{(k)}(0; 2), \qquad \gamma^{4k} \nu_k := V_p^{(k)}(0); \tag{2.12}$$

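the corresponding objects in the full theory are obtained by replacing the $V_p^{(k)}$ in (2.12) with $V^{(k)}$. Therefore, setting $\varphi^{(\le k)} =: \varphi^{(\le k-1)} + \varphi^{(k)}$, we can rewrite (2.7) with k replaced by k − 1, and $V^{(k-1)}$ given by

$$V^{(k-1)}(\varphi^{(\le k-1)}) = \log \int P(d\varphi^{(k)})\, e^{V^{(k)}(\varphi^{(\le k-1)} + \varphi^{(k)})} := \sum_{n \ge 0} \frac{1}{n!}\, E_k^T\big(V^{(k)}(\varphi^{(\le k)}); n\big), \tag{2.13}$$

where $E_k^T$ is called the truncated expectation on scale k, and it is defined as:

$$E_h^T\big(X(\varphi^{(h)}); n\big) := \frac{\partial^n}{\partial \zeta^n} \log \int P(d\varphi^{(h)})\, e^{\zeta X(\varphi^{(h)})}\Big|_{\zeta=0}. \tag{2.14}$$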
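Definition (2.14) says that the truncated expectation of order n is the n-th cumulant of X under the Gaussian measure. The Python sketch below illustrates this on a toy model that is our own assumption, not the field-theoretic setting of the text: a single Gaussian variable φ with variance C and the (not Wick-ordered) observable X = φ². The helper names `moments_phi_squared` and `cumulants_from_moments` are hypothetical and were introduced only for this illustration.

```python
from fractions import Fraction
from math import comb


def moments_phi_squared(C, n_max):
    """Moments E[X^n], n = 0..n_max, of X = phi^2 with phi ~ N(0, C).

    Uses the Gaussian identity E[phi^(2n)] = (2n - 1)!! * C^n.
    """
    moments = [Fraction(1)]
    for n in range(1, n_max + 1):
        moments.append(moments[-1] * (2 * n - 1) * C)
    return moments


def cumulants_from_moments(moments):
    """Return [kappa_1, ..., kappa_n] from the moments [m_0, ..., m_n].

    Recursion: kappa_n = m_n - sum_{j=1}^{n-1} binom(n-1, j-1) * kappa_j * m_{n-j}.
    kappa_n plays the role of the truncated expectation E^T(X; n) of (2.14).
    """
    kappa = [Fraction(0)]  # index 0 is an unused placeholder
    for n in range(1, len(moments)):
        correction = sum(
            comb(n - 1, j - 1) * kappa[j] * moments[n - j] for j in range(1, n)
        )
        kappa.append(moments[n] - correction)
    return kappa[1:]


# Toy check with unit variance: the first four truncated expectations of phi^2.
truncated = cumulants_from_moments(moments_phi_squared(Fraction(1), 4))
```

For unit variance the result is 1, 2, 8, 48, which agrees with the known cumulants $2^{n-1}(n-1)!$ of the square of a standard Gaussian variable. Exact rational arithmetic is used so that the comparison does not depend on rounding.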

It is convenient to define also $V^{(-1)}$; for this purpose one thinks of $\varphi^{(\le N)}$ as being given by, see formula [6, Eq. (6.9)],

$$\varphi^{(\le N)} = \varphi^{(-1)} + \varphi^{(0)} + \cdots + \varphi^{(N)}, \tag{2.15}$$


Borel Summability of ϕ44 Planar Theory via Multiscale Analysis


where the field $\varphi^{(-1)}$ is distributed independently of the other $\varphi^{(j)}$, $j \ge 0$, and it has its own covariance $C^{(-1)}_{x,y}$, which need not be specified (because it will eventually be taken to be identically zero whenever it appears in some interesting formulas). The introduction of $V^{(-1)}$ allows us to treat the case k = 0 on the same grounds as the cases k > 0.

Tree expansion and Feynman graphs. The iterative integration described above leads to a representation of the effective potential on scale k − 1 as a power series in the running coupling constants $\lambda_h$, $\alpha_h$, $\mu_h$, $\nu_h$ with $h \ge k$, where the coefficients of the series can be represented in terms of connected Feynman graphs, as briefly explained in the following. The key formula from which we start is (2.13); iterating this formula as suggested by Fig. 1, we end up with a representation of the effective potentials in terms of a sum over Gallavotti–Nicolò trees [6, 12, 13], see Fig. 2:

$$\begin{aligned} V^{(k-1)}(\varphi^{(\le k-1)}) &= \sum_{n \ge 1} \sum_{\gamma \in T_{k-1,n}} V^{(k-1)}(\gamma), \\ V^{(k-1)}(\gamma) &= \sum_{m \ge 0} \int \frac{dp_1}{(2\pi)^4} \cdots \frac{dp_m}{(2\pi)^4}\, V^{(k-1)}(p_1, \dots, p_m; \gamma, m)\, :\!\prod_{i=1}^{m} \varphi^{(\le k-1)}_{p_i}\!:\, \delta\Big(\sum_i p_i\Big), \end{aligned} \tag{2.16}$$

where $T_{k-1,n}$ is the set of trees with root r on scale $h_r = k - 1$ and n endpoints, with value $V^{(k-1)}(\gamma)$. The trees involved in the sum are distinct; two trees are considered identical if it is possible to superpose them together with the labels appended to their vertices by stretching or shortening the branches. Proceeding in a way analogous to [6, Sec. XVI and Appendix C], it follows that the kernels $V^{(k)}(p_1, \dots, p_m; \gamma, m)$ satisfy the following recursion relation:

$$V^{(k-1)}(p_1, \dots, p_m; \gamma, m) = \frac{1}{s!} \sum_{m_1, \dots, m_s} \int \Big[\prod_{j=1}^{s} \tilde V^{(k)}(p_1, \dots, p_{m_j}; \gamma_j, m_j)\Big] \cdot \sum_{\pi \in G_m}\ \sum_{\substack{\vartheta \subset \pi \\ \text{connected}}} \Big[\prod_{\lambda \in \vartheta} C^{(k)}_{p(\lambda)}\Big] \cdot \Big[\prod_{\lambda \in \pi/\vartheta} C^{(\le k-1)}_{p(\lambda)}\Big], \tag{2.17}$$

where $\gamma_1, \dots, \gamma_s$ are the s subtrees of γ with root corresponding to the first nontrivial vertex of γ, $\tilde V^{(k)}(p_1, \dots, p_{m_j}; \gamma_j, m_j)$ is equal to $RV^{(k)}(p_1, \dots, p_{m_j}; \gamma_j, m_j)$ if $\gamma_j$ is nontrivial and to $LV^{(k)}(p_1, \dots, p_{m_j}; m_j)$ otherwise, $G_m$ is a suitable set of Feynman graphs defined below, and the integral is over their loop momenta. This relation is a consequence of the rules of evaluation of the truncated expectations of Wick monomials, see [6, Appendix C]. Formula (2.17) is iterated by replacing each $\tilde V^{(k)}(p_1, \dots, p_{m_j}; \gamma_j, m_j)$ corresponding to nontrivial $\gamma_j$'s with (2.17) with k − 1


Fig. 1. Graphical interpretation of (2.13). The graphical equations for $LV^{(k-1)}$, $RV^{(k-1)}$ are obtained from the equation in the second line by putting an $L$, $R$ label, respectively, over the vertices on scale k.

Fig. 2. The effective potential $V^{(h)}$ can be represented as a sum over Gallavotti–Nicolò trees. The small black dots will be called vertices of the tree. All the vertices except the first (i.e. the one on scale k) have an $R$ label attached, which means that they correspond to the action of $R E^T_{h_v}$, while the first represents $E^T_k$. The generic endpoint e, represented by a fat endpoint, corresponds to $LV^{(h_e - 1)}$. The sum is over distinct trees; two trees are considered identical if it is possible to superpose them together with the labels appended to their vertices by stretching or shortening the branches.

replaced by k. Analogously, the planar part of the effective potential is defined as:

$$V_p^{(k-1)}(p_1, \dots, p_m; \gamma, m) = \frac{1}{s!} \sum_{m_1, \dots, m_s} \int \Big[\prod_{j=1}^{s} \tilde V_p^{(k)}(p_1, \dots, p_{m_j}; \gamma_j, m_j)\Big] \cdot \sum_{\substack{\pi \in G_m \\ \pi\ \text{planar}}}\ \sum_{\substack{\vartheta \subset \pi \\ \text{connected}}} \Big[\prod_{\lambda \in \vartheta} C^{(k)}_{p(\lambda)}\Big] \cdot \Big[\prod_{\lambda \in \pi/\vartheta} C^{(\le k-1)}_{p(\lambda)}\Big]. \tag{2.18}$$

Represent a generic Wick monomial $M_j$ containing the product of $m_j$ fields as a point or as a cluster with $m_j$ emerging lines, depending on whether the corresponding $\gamma_j$ is trivial or not; we shall consider the points as (trivial) clusters, too. Given the Wick monomials $M_1, \dots, M_s$, the symbol $G_m$ denotes the set of connected graphs that can be made by joining pairwise some of the lines associated with the clusters $M_1, \dots, M_s$ in such a way that: (i) two lines emerging from the same cluster cannot be contracted together, (ii) there should be enough lines so that, regarding the clusters as points, the resulting graph is connected, (iii) after the contraction


there should still be m uncontracted lines, representing the Wick monomial M. The resulting graph is enclosed in a new cluster, labeled by k. Furthermore, the condition $\vartheta \subset \pi$ with the subscript "connected" means that the subgraph ϑ still keeps the connection between the boxes. We graphically represent the propagators $C^{(k)}$ by solid lines, while the propagators $C^{(\le k-1)}$ correspond to wavy lines. Finally, the restriction to planarity means that we discard all the graphs that show lines crossing at points where no interacting vertices are present. We refer the reader to [6, Sec. XVI] for a more extensive discussion and for examples.

Clearly, the iteration stops when only trivial subtrees appear in (2.17), (2.18); at this point, the resulting graph looks like a "usual" one, but enclosed in a hierarchical cluster structure, where each cluster has a scale label, and given two clusters $G_v$, $G_{v'}$ then $G_v \subset G_{v'}$ if and only if $h_v > h_{v'}$. After the iteration, the effective potential on scale k is expressed as a power series in the running coupling constants $\lambda_h$, $\alpha_h$, $\mu_h$, $\nu_h$ with h > k. From the analysis of [6, 12, 13], it follows that the contribution of a given tree $\gamma \in T_{k,n}$ to a kernel of the planar theory can be bounded in the following way, setting $\delta := \max_h \{|\lambda_h|, |\alpha_h|, |\mu_h|, |\nu_h|\}$, for some positive $C_m$, ρ:

$$\big|V_p^{(k)}(p_1, \dots, p_m; \gamma, m)\big| \le C_m\, (\mathrm{const.})^n\, \delta^n\, \gamma^{k(4-m)} \prod_{\substack{v > r \\ v\ \text{not e.p.}}} \gamma^{-\rho(h_v - h_{v'})}, \tag{2.19}$$

where the product runs over the vertices of the tree γ and $v'$ is the vertex immediately preceding v; since the number of distinct trees is bounded by $(\mathrm{const.})^n$ it follows that, see [6, Sec. XIX]:

$$\sum_{\gamma \in T_{k,n}} \big|V_p^{(k)}(p_1, \dots, p_m; \gamma, m)\big| \le C_m\, C^n\, \delta^n\, \gamma^{k(4-m)}, \tag{2.20}$$

which means that the planar part of the effective potential can be expressed as a convergent power series in the running coupling constants, provided their absolute values are small enough. This is not the case in the full theory; in the analogue of (2.19), due to the combinatorics of the Feynman graphs, one has to take into account an extra n! factor. Formula (2.19) implies in particular the so-called short memory property of the Gallavotti–Nicolò trees, which states that if two scales of a given tree are constrained to have fixed values, say h, k with h < k, then the bound on the sum over all the remaining scales is improved by a factor $\gamma^{-(\rho/2)(k-h)}$ with respect to (2.20); in other words, long trees are exponentially suppressed.

The expansion of the Schwinger functions. The generating functional of the Schwinger functions can be evaluated by repeating a procedure completely analogous to the one described for the effective potentials; after the integration of the scales N, N − 1, . . . , k + 1 it turns out that:

$$e^{W_N(\zeta f)} = \int P(d\varphi^{(\le k)})\, e^{S^{(k)}(\varphi^{(\le k)}; \zeta f)} = \int P(d\varphi^{(\le k)})\, e^{S_p^{(k)}(\varphi^{(\le k)}; \zeta f) + S_{np}^{(k)}(\varphi^{(\le k)}; \zeta f)}, \tag{2.21}$$


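where the effective potentials $S_\#^{(k)}(\varphi^{(\le k)}; \zeta f)$ have the form:

$$S_\#^{(k)}(\varphi^{(\le k)}; \zeta f) = \sum_{m \ge 0} \sum_{t \ge 0} \int \frac{dp_1}{(2\pi)^4} \cdots \frac{dp_{m+t}}{(2\pi)^4}\, S_\#^{(k)}(p_1, \dots, p_{m+t}; m, t) \cdot \Big[:\!\prod_{i=1}^{m} \varphi^{(\le k)}_{p_i}\!:\Big] \Big[\prod_{j=m+1}^{m+t} \zeta f_{p_j}\Big]\, \delta\Big(\sum_i p_i\Big), \tag{2.22}$$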

and can be represented as sums over trees very similar to the ones introduced for the effective potentials, up to the following differences, see [12, Sec. 7.5] and [13]: (i) special vertices may appear, from which dotted lines representing the "external fields" ζf emerge (that do not contribute to the total number of endpoints), and (ii) no $R$ operation is defined on the path from a given dotted line to the root. We call $T_{k,n,t}$ the set of such trees having root scale k, n endpoints and t dotted lines. See Fig. 3 for an example. Setting

$$S_\#^{(k)}(p_1, \dots, p_{m+t}; m, t) = \sum_{n \ge 1} \sum_{\gamma \in T_{k,n,t}} S_\#^{(k)}(p_1, \dots, p_{m+t}; \gamma, m, t), \tag{2.23}$$

the planar parts of the kernels of the effective potentials are related by the following recursive equation:

$$S_p^{(k-1)}(p_1, \dots, p_{m+t}; \gamma, m, t) = \frac{1}{s!} \sum_{\substack{m_1, \dots, m_s \\ t_1, \dots, t_s}} \int \Big[\prod_{j=1}^{s} \tilde S_p^{(k)}(p_1, \dots, p_{m_j + t_j}; \gamma_j, m_j, t_j)\Big] \cdot \sum_{\substack{\pi \in G_m \\ \pi\ \text{planar}}}\ \sum_{\substack{\vartheta \subset \pi \\ \text{connected}}} \Big[\prod_{\lambda \in \vartheta} C^{(k)}_{p(\lambda)}\Big] \cdot \Big[\prod_{\lambda \in \pi/\vartheta} C^{(\le k-1)}_{p(\lambda)}\Big], \tag{2.24}$$

Fig. 3. A generic tree belonging to $T_{k,6,2}$.


where $\gamma_1, \dots, \gamma_s$ are the s subtrees of γ with root coinciding with the first vertex of γ following the root. If $\gamma_j$ is trivial and corresponds to a dotted line then $\tilde S_p^{(k)}(p_1, \dots, p_{m_j + t_j}; \gamma_j, m_j, t_j) = \delta_{m_j, 1}\, \delta_{t_j, 1}$, while if it corresponds to a solid line $\tilde S_p^{(k)}(p_1, \dots, p_{m_j + t_j}; \gamma_j, m_j, t_j) = \delta_{t_j, 0}\, LV^{(k)}(p_1, \dots, p_{m_j}; m_j)$; if $\gamma_j$ is a nontrivial subtree with $t_j > 0$ then $\tilde S_p^{(k)}(p_1, \dots, p_{m_j + t_j}; \gamma_j, m_j, t_j) = S_p^{(k)}(p_1, \dots, p_{m_j + t_j}; \gamma_j, m_j, t_j)$, while if $\gamma_j$ is nontrivial and $t_j = 0$ then $\tilde S_p^{(k)}(p_1, \dots, p_{m_j}; \gamma_j, m_j, 0) = RV_p^{(k)}(p_1, \dots, p_{m_j}; \gamma_j, m_j)$, with $R = 1 - L$ defined as in (2.10). Clearly, $m_1, \dots, m_s$ and $t_1, \dots, t_s$ are subject to the constraints $\sum_j m_j = m$, $\sum_j t_j = t$. Formula (2.24) is iterated by replacing each $S_p^{(k)}(p_1, \dots, p_{m_j}; \gamma_j, m_j, t_j)$ corresponding to any nontrivial $\gamma_j$ with $t_j > 0$ with (2.24) itself, with k − 1 replaced by k. Therefore, the generic planar Schwinger function $S^T_{(N)}(f; q)$ can be written as:

$$S^T_{(N)}(f; q) = \sum_{n \ge 1} \sum_{\gamma \in T_{-1,n,q}} S_p(\gamma), \qquad S_p(\gamma) := \int \frac{dp_1}{(2\pi)^4} \cdots \frac{dp_q}{(2\pi)^4}\, f_{p_1} \cdots f_{p_q}\, S_p^{(-1)}(p_1, \dots, p_q; \gamma, 0, q), \quad \gamma \in T_{-1,n,q}, \tag{2.25}$$

where $S_p^{(-1)}$ is given by (2.24) with k = 0. Finally, from the theory of [12, Sec. 7.5], it follows that

$$\sum_{\gamma \in T_{-1,n,q}} |S_p(\gamma)| \le \|f\|_1^q\, C_q\, C^n\, \delta^n, \tag{2.26}$$

which implies that in the planar theory the Schwinger functions can be expressed as absolutely convergent power series in the running coupling constants, provided their absolute values are small enough. As is well known, this is not the case in the full theory, since the bound (2.26) has to be multiplied by n!; see [6, 12, 13, 22].

The beta function and its tree expansion. From now on, we shall focus only on the planar theory. The running coupling constants obey recursive equations induced by the iterative integration; it follows that, setting $v_k^{(4)} := \lambda_k$, $v_k^{(2')} := \alpha_k$, $v_k^{(2)} := \mu_k$, $v_k^{(0)} := \nu_k$:

$$v_k^{(a)} = \gamma^{-2\delta_{a,2} - 4\delta_{a,0}}\, v_{k-1}^{(a)} - (Bv)_k^{(a)}, \qquad k \ge 0, \tag{2.27}$$

where the operator B, the beta function of the theory, has the form, see formula [6, Eq. (9.15)]:

$$(Bv)_k^{(a)} := \sum_{r=2}^{\infty}\ \sum_{h_1, \dots, h_r \ge k}^{N}\ \sum_{a_1, \dots, a_r} \beta^{(a)}_{a_1, \dots, a_r}(k; h_1, \dots, h_r) \prod_{i=1}^{r} v_{h_i}^{(a_i)}. \tag{2.28}$$

The quantities $\{v_{-1}^{(a)}\}$ are called the renormalized coupling constants. As the iterative procedure described before suggests, the beta function $(Bv)_k^{(a)}$ can be represented as a sum over trees; the only difference with respect to the trees which


have been introduced previously is that we attach a label $L_a$ to the first vertex, where $L_a$ is defined in the following way: $L_a V_p^{(k)}(p_1, p_2, p_3; 4) := V_p^{(k)}(0, 0, 0; 4)$ if a = 4 and zero otherwise, $L_a V_p^{(k)}(p; 2) := \gamma^{-2k} V_p^{(k)}(0; 2)$ if a = 2 and zero otherwise, $L_a V_p^{(k)}(p; 2) := (1/2)\, \partial_{p_1 p_1} V_p^{(k)}(0; 2)$ if a = 2' and zero otherwise, and finally $L_a V_p^{(k)}(0) := \gamma^{-4k} V_p^{(k)}(0)$ if a = 0 and zero otherwise. From the theory of [6], it follows that in the planar theory:

$$\sum_{h_1, \dots, h_r \ge k} \big|\beta^{(a)}_{a_1, \dots, a_r}(k; h_1, \dots, h_r)\big| \le (\mathrm{const.})^r, \tag{2.29}$$

which means that the beta function is defined as an absolutely convergent power series provided the absolute values of the running coupling constants are small enough; this is not the case in the full theory, since in that case the bound (2.29) has to be multiplied by r!.

Remarks. (1) From the representation of the coefficients of the beta function in terms of Feynman graphs, induced by the iterative integration previously described (see also [6, Secs. IX, XVI–XIX]), it follows that for k > 0, calling $\bar r$ the number of indices i such that $a_i = 4$ (corresponding to the number of vertices with four external lines),

$$\beta^{(4)}_{a_1, \dots, a_r}(k; h_1, \dots, h_r) = 0 \quad \text{unless } \bar r \ge 2, \tag{2.30}$$

$$\beta^{(2')}_{a_1, \dots, a_r}(k; h_1, \dots, h_r) = 0 \quad \text{unless } \bar r \ge 2, \tag{2.31}$$

$$\beta^{(2)}_{a_1, \dots, a_r}(k; h_1, \dots, h_r) = 0 \quad \text{unless } \bar r \ge 1. \tag{2.32}$$

These properties can be understood in the following way. The graphs contributing to (2.30)–(2.32) are all computed at vanishing external momenta, and the momenta flowing on the propagators must have absolute values bigger than 0; in fact, the quantity $(Bv)_k^{(a)}$ arises from the integration of the fields $\varphi^{(h)}$ with $h \ge k$, which if k > 0 have support for momenta p such that |p| > 0. Then, to see property (2.30), simply try to draw on a sheet of paper any graph with four external lines evaluated at vanishing external momenta; as the reader may check, the condition $\bar r < 2$ is not compatible with the fact that the momenta flowing on the propagators have absolute values > 0. Property (2.32) can be seen in an analogous way. To understand (2.31), notice that the graphs contributing to $\beta^{(2')}_{a_1, \dots, a_r}(k; h_1, \dots, h_r)$ have two external lines, and are differentiated twice with respect to the external momentum. Then, proceed as for (2.32), and notice that the only two-legged graphs with $\bar r = 1$ compatible with the requirement on the modulus of the inner momenta are "tadpole" graphs, which do not depend on the value of the external momentum; therefore, their derivatives vanish.

(2) Note that the flow of $\nu_k$ is decoupled from the others, since $\nu_k$ does not appear in the recursive equations defining $\lambda_k$, $\alpha_k$, $\mu_k$ (it is graphically represented by


a vertex with no external lines); moreover, the sequence $\nu_{-1}, \dots, \nu_N$ solves the following equation:

$$\nu_k = \gamma^{-4} \nu_{k-1} - (Bv)_k^{(0)}, \tag{2.33}$$

which implies

$$\nu_k = \gamma^{-4(k+1)} \nu_{-1} - \sum_{j=0}^{k} \gamma^{4(j-k)} (Bv)_j^{(0)}, \tag{2.34}$$

where $(Bv)_j^{(0)}$ is analytic in its arguments for $\max_k \{|\lambda_k|, |\alpha_k|, |\mu_k|\}$ small enough. For these reasons, in what follows we shall focus only on the flows of $\lambda_k$, $\mu_k$, $\alpha_k$. We can rewrite Eq. (2.27) as:

$$v_k^{(a)} = \gamma^{-(k+1)(2\delta_{a,2} + 4\delta_{a,0})}\, v_{-1}^{(a)} - \sum_{j=0}^{k} \gamma^{(j-k)(2\delta_{a,2} + 4\delta_{a,0})} (Bv)_j^{(a)}, \tag{2.35}$$

and this equation can be iterated in order to obtain the formal power series of $v_k^{(a)}$ in the renormalized coupling constants. Again, Eq. (2.35) can be represented graphically. The second term in (2.35) corresponds to the sum of all the possible trees with root scale k enclosed in a frame labeled by a type label a. The correspondence between the framed trees and the trees discussed after (2.28) is made explicit by the example in Fig. 4. In general, the fat endpoint e labeled by $a_e$ and attached to a vertex on scale $h_e - 1$ corresponds to the running coupling constant $v^{(a_e)}_{h_e - 1}$, while the first term in (2.35) is represented as a trivial tree with a thin endpoint labeled by a and root scale k. See Fig. 5 for a graphical representation of (2.35). The iteration of (2.35) produces trees showing thin endpoints, and in general more than one frame; see Fig. 6 for a picture of the situation. Therefore, the nth order contribution in the renormalized coupling constants to $v_k^{(a)}$ is defined graphically as the sum of all the possible framed trees with root scale k enclosed in a frame labeled by a, with n thin endpoints, and where the generic vertex v either has an $R$ label attached or has its corresponding subtree enclosed in a frame. We stress that trees with different type

Fig. 4. Example of framed tree.

Fig. 5. Graphical interpretation of formula (2.35); a sum over the $a_i$'s is understood.

Fig. 6. Graphical interpretation of the iteration of Eq. (2.35).

labels attached to their frames and endpoints are considered different. The same graphical procedure allows one to find the perturbative expansion of the Schwinger functions (or equivalently of the effective potentials) in the renormalized coupling constants, starting from their definition as trees with only "fat" endpoints.

Remark. Given a generic framed tree showing any number of inner frames, we define the maximally pruned framed tree as the tree obtained by replacing the maximal inner frames (i.e. the ones enclosed only by the outermost frame) with fat endpoints of the corresponding type; by properties (2.30)–(2.32) the sum over the scale of the first vertex of a framed tree, see Fig. 4, involves only the term with j = 0 if:

• the type label of the frame is 2 and the maximally pruned framed tree has no endpoints of type 4;
• the type label of the frame is 2' or 4 and the maximally pruned framed tree has at most one endpoint of type 4.

We shall say that a frame is trivial if the enclosed tree verifies one of the above properties; all the other frames will be called nontrivial.

Call $\tilde T_{-1,m,q}$ the set of trees with root scale −1, any number of frames, m endpoints fat or thin, and q dotted lines; given a generic tree $\gamma \in \tilde T_{-1,m,q}$ we call $n_{2',4}(\gamma)$ the number of nontrivial frames (see previous remark) labeled by a = 2', 4 and we denote by $m_a(\gamma)$ the number of endpoints of type a. In the planar theory the following remarkable result is true.


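Theorem [n! Bound]. Let q > 0; there exist two positive constants C, $C_q$ such that, if $m = m_4 + m_{2'} + m_2$:

$$\sum_{\substack{\gamma \in \tilde T_{-1,m,q} \\ n_{2',4}(\gamma) = n \\ m_a(\gamma) = m_a}} |S(\gamma)| \le C^m\, C_q\, \|f\|_1^q\, n! \prod_a \Big[\max_{k \ge -1} \big|v_k^{(a)}\big|\Big]^{m_a}; \tag{2.36}$$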

for q = 0 the bound (2.36) has to be multiplied by |Λ|. We refer the reader to [6, 12, 13], and to Appendix E (see item (2) in the Remark below), for a proof of this result.

Remarks. (1) The "n! bound" (2.36) only applies to the planar theory; in the full theory n is replaced by the number of endpoints of the tree. This proves the ultraviolet stability of the full $\varphi^4_4$ theory; see [6, 12, 13, 22–25].

(2) In [6, 13], it was noticed that in the planar case the bound grows factorially in the number of frames; as we show in Appendix E, it is possible to improve the bound by considering only the nontrivial frames labeled by 4, 2'. Roughly speaking, the factorial is "produced" by the sums appearing in the definitions of the frames; the frames labeled by 2, 0 do not contribute to the factorial because their sums can be controlled thanks to the exponential factor appearing in (2.35) and Fig. 4, and if a frame is trivial the sum is missing.

Notations. From now on we shall set

$$\lambda := \lambda_{-1}, \qquad \alpha := \alpha_{-1}, \qquad \mu := \gamma^{-2} \mu_{-1}; \tag{2.37}$$

moreover, we define the sequences

$$\underline{\lambda} := \{\lambda_k\}_{k \ge 1}, \qquad \underline{\alpha} := \{\alpha_k\}_{k \ge 1}, \qquad \underline{\mu} := \{\mu_k\}_{k \ge 1}. \tag{2.38}$$

Notice that the definition (2.38) does not involve the running coupling constants on scale zero. In fact, for purely technical reasons, the running coupling constants on scale zero have to be treated separately from those on scales > 0. In particular, we first determine the running coupling constants on scales > 0 as functions of those on scale 0, and then we express the running coupling constants on scale 0 as functions of the renormalized ones. The motivation for this procedure is connected with the fact that the properties of the beta function (2.30)–(2.32), which will play a key role in our analysis, are true only for scales k > 0. It is also convenient to introduce

$$\xi_k := (\xi_{2',k}, \xi_{2,k}) := (\alpha_k, \mu_k), \qquad \xi := (\alpha, \mu), \qquad \underline{\xi} := \{\xi_k\}_{k \ge 1}. \tag{2.39}$$

Finally, we define the sets $B_\delta$, $C_\delta$, $W_{\delta,\vartheta}$ with δ > 0, $\vartheta \in (0, \pi/2]$ in the following way, see Fig. 7:

$$B_\delta := \{z \in \mathbb{C} : |z| < \delta\}, \qquad C_\delta := \{z \in \mathbb{C} : \operatorname{Re} z^{-1} > \delta^{-1}\}, \qquad W_{\delta,\vartheta} := \{z \in \mathbb{C} : |z| < \delta,\ |\arg z| < \pi - \vartheta\}. \tag{2.40}$$
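The three domains in (2.40) are easy to test numerically. The Python sketch below encodes the membership conditions and checks on a grid that $C_\delta \subset W_{\delta,\vartheta}$ in the tightest case ϑ = π/2, an inclusion that is used later in the proof. The function names and the grid check are our own illustration; treating the origin as outside $C_\delta$ and $W_{\delta,\vartheta}$ is a convention adopted here so that 1/z and arg z are always defined.

```python
import cmath
import math


def in_B(z, delta):
    """Open disc B_delta = {|z| < delta}."""
    return abs(z) < delta


def in_C(z, delta):
    """C_delta = {Re(1/z) > 1/delta}: the open disc of radius delta/2 centred at delta/2."""
    return z != 0 and (1 / z).real > 1 / delta


def in_W(z, delta, theta):
    """W_{delta,theta} = {|z| < delta, |arg z| < pi - theta}: a disc with a sector removed."""
    return z != 0 and abs(z) < delta and abs(cmath.phase(z)) < math.pi - theta


def grid(delta, n=40):
    """Square grid covering [-delta, delta]^2, used only for the inclusion check."""
    step = 2 * delta / n
    return [
        complex(-delta + i * step, -delta + j * step)
        for i in range(n + 1)
        for j in range(n + 1)
    ]


delta, theta = 0.1, math.pi / 2
C_points = [z for z in grid(delta) if in_C(z, delta)]
inclusion_holds = all(in_W(z, delta, theta) for z in C_points)
```

The condition Re(1/z) > 1/δ is equivalent to |z − δ/2| < δ/2, which is why every point of $C_\delta$ has positive real part and modulus smaller than δ.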


Fig. 7. The domains $B_\delta$, $C_\delta$, $W_{\delta,\vartheta}$.

3. Borel Summability of $\varphi^4_4$ Planar Theory

In this section, we state our main result in a mathematically precise form, we recall what Borel summability is and we outline the ideas of the proof. The technical details are contained in Sec. 4 and in the appendices.

Theorem 1 ('t Hooft–Rivasseau). For any $\vartheta \in (0, \pi/2]$ there exist $\bar\eta > 0$, $\bar\varepsilon > 0$ such that the Schwinger functions $S^T(f; q) = \lim_{N \to +\infty} S^T_{(N)}(f; q)$ of the planar $\varphi^4_4$ theory are analytic for $(\lambda, \alpha, \mu) \in W_{\bar\varepsilon,\vartheta} \times B_{\bar\eta} \times B_{\bar\eta}$, and Borel summable in λ at the origin.

Remark. Not surprisingly, $\bar\varepsilon \to 0$ if $\vartheta \to 0$.

Before discussing a sketch of the proof, let us briefly recall what Borel summability is (see [14, 16]). A formal power series $\sum_n a_n z^n$, $z \in \mathbb{C}$, is said to be Borel summable if the following properties are true:

• the Borel transform $B(t) := \sum_n \frac{a_n}{n!} t^n$ converges for every t in some circle $B_\delta$;
• B(t) admits an analytic continuation in a neighbourhood of the positive real axis;
• the integral

$$f(z) = \frac{1}{z} \int_0^{+\infty} e^{-t/z} B(t)\, dt \tag{3.1}$$

is convergent for $z \in C_{\bar\delta}$ for some $\bar\delta > 0$.

Notice that $f(z) \sim \sum_n a_n z^n$ for $z \to 0$. The function f(z) is called the Borel sum of the formal power series, and if f(z) exists it is unique. Therefore, Borel summability is nothing else than a one-to-one mapping between a certain space of functions and a certain space of power series: all the information on the function is enclosed in the list of its Taylor coefficients. For these reasons, Borel summability is, [17], the perfect substitute for ordinary analyticity when a function is expanded on the boundary of its analyticity domain. By the Nevanlinna–Sokal theorem, [14], to establish whether f(z) is the Borel sum of $\sum_n a_n z^n$ it is sufficient to check the following two properties:

• f(z) is analytic in $C_\delta$ for some δ > 0;


• for every $z \in C_\delta$ and for all M > 0 the following estimate holds:

$$\Big|f(z) - \sum_{n=0}^{M-1} a_n z^n\Big| \le C^M\, M!\, |z|^M, \qquad C > 0. \tag{3.2}$$
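A classical example may help to fix ideas; it is our own illustration and does not appear in the text. The series $\sum_n (-1)^n n!\, z^n$ diverges for every z ≠ 0, but its Borel transform is B(t) = 1/(1 + t), which is analytic near the positive real axis, so (3.1) defines a Borel sum. The Python sketch below evaluates that integral by Simpson's rule and compares it with the truncated series. The check is done only for real positive z, whereas (3.2) is required on the whole of $C_\delta$; the function names and the quadrature parameters are assumptions made for the illustration.

```python
import math


def borel_sum(z, upper=60.0, steps=60000):
    """Borel sum of sum_n (-1)^n n! z^n for real z > 0.

    With B(t) = 1/(1 + t) and the substitution t = z*s, formula (3.1) becomes
    f(z) = int_0^infty exp(-s) / (1 + z*s) ds. The integral is approximated by
    Simpson's rule on [0, upper]; the neglected tail is smaller than exp(-upper).
    """
    h = upper / steps  # steps must be even for Simpson's rule
    total = 0.0
    for i in range(steps + 1):
        s = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * math.exp(-s) / (1.0 + z * s)
    return total * h / 3.0


def partial_sum(z, M):
    """Truncation sum_{n=0}^{M-1} (-1)^n n! z^n of the divergent series."""
    return sum((-1) ** n * math.factorial(n) * z ** n for n in range(M))


z = 0.1
f = borel_sum(z)
# Triples (M, actual remainder, bound M! * z^M), i.e. (3.2) with C = 1.
remainders = [
    (M, abs(f - partial_sum(z, M)), math.factorial(M) * z ** M) for M in range(1, 9)
]
```

For this example the remainder after M terms equals $z^M \int_0^\infty s^M e^{-s} (1 + zs)^{-1} ds$ in absolute value, which is at most $M!\, z^M$; the list `remainders` lets one compare the two numbers order by order.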

Sketch of the Proof. Our proof consists in a check of the two hypotheses of the Nevanlinna–Sokal theorem, and it goes as follows. First, we prove that for any fixed ultraviolet cutoff N > 0 and any $\vartheta \in (0, \pi/2]$ the running coupling constants are analytic for $(\lambda, \alpha, \mu) \in W_{\bar\varepsilon,\vartheta} \times B_{\bar\eta} \times B_{\bar\eta}$; analyticity of the Schwinger functions $S^T_{(N)}(f; q)$ in the same domain is straightforward, since $S^T_{(N)}(f; q)$ is given by an absolutely convergent power series in the running coupling constants, see [6, 12, 13]. Then, we prove that $S^T(f; q) = \lim_{N \to +\infty} S^T_{(N)}(f; q)$ exists, and that the limit is reached uniformly in the analyticity domain. Therefore, $S^T(f; q)$ is analytic in the same analyticity domain as $S^T_{(N)}(f; q)$. To conclude, we show that $S^T(f; q)$, as a function of λ in $W_{\bar\varepsilon,\vartheta}$, verifies the bound (3.2). These two properties imply Borel summability, since $C_{\bar\varepsilon} \subset W_{\bar\varepsilon,\vartheta}$.

Analyticity. To solve the flow equations (2.27) and determine the analyticity properties of the running coupling constants we use a fixed point argument. More precisely, we show that Eqs. (2.27) are solved by sequences parametrized by the renormalized coupling constants (λ, α, µ) which, for finite N, are the fixed points of some operators acting on suitable finite dimensional spaces; all the technical work is reduced to showing that in the considered spaces the operators are contractions. After this, the sequences of running coupling constants are determined through an exponentially convergent procedure. In particular, in the limit N → +∞, for $(\lambda, \alpha, \mu) \in W_{\bar\varepsilon,\vartheta} \times B_{\bar\eta} \times B_{\bar\eta}$, we find that Eqs. (2.27) admit a solution of the form, for some positive C, c:

$$\lambda_k = \frac{1}{\tilde\lambda^{-1} + \sum_{j=0}^{k} \tilde\beta_j}, \qquad |\alpha_k - \alpha| \le c\,(|\lambda| + |\mu|^2), \qquad |\mu_k - \gamma^{-2k} \mu| \le c\,\big[\gamma^{-2k} |\mu|^2 + (|\lambda| + |\xi|)\, |\lambda_k|\big], \tag{3.3}$$

where $\tilde\lambda = \lambda(1 + O(\mu))$, $|\tilde\beta_k - \beta_k| \le C(|\lambda| + |\xi|)$, $\beta_k := \beta^{(4)}_{4,4}(k; k, k) > 0$. To begin, we rewrite the flow equation for $\lambda_k$ as, see (2.27) with a = 4:

$$\lambda_k =: \lambda_{k+1} + \beta_{4,k+1}(\underline{\lambda}, \underline{\xi}), \qquad k \ge 0, \tag{3.4}$$

$$\lambda =: \lambda_0 + f_{4,0}(\lambda_0, \mu_0) + \beta_{4,0}(\lambda_0, \underline{\lambda}, \xi_0, \underline{\xi}), \tag{3.5}$$

where $f_{4,0}$ is linear in $\lambda_0$, and $\beta_{4,h}$ is given by a sum of terms proportional to at least two among $\lambda_0, \dots, \lambda_N$. Then, iterating (2.27) up to the scale 0 we get that, for a = 2', 2:

$$\alpha_k =: \alpha_0 - \sum_{j=1}^{k} \beta_{2',j}(\underline{\lambda}, \underline{\xi}), \qquad \mu_k =: \gamma^{-2k} \mu_0 - \sum_{j=1}^{k} \gamma^{2(j-k)} \beta_{2,j}(\underline{\lambda}, \underline{\xi}), \qquad k \ge 1, \tag{3.6}$$


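$$\alpha_0 =: \alpha - f_{2',0}(\lambda_0, \mu_0) - \beta_{2',0}(\lambda_0, \underline{\lambda}, \xi_0, \underline{\xi}), \qquad \mu_0 =: \mu - f_{2,0}(\mu_0) - \beta_{2,0}(\lambda_0, \underline{\lambda}, \xi_0, \underline{\xi}), \tag{3.7}$$

where $f_{2',0}$ collects terms at most linear in $\lambda_0$, while $\beta_{2',h}$, $\beta_{2,h}$ are given by sums of terms proportional to at least two or one among $\lambda_0, \dots, \lambda_N$, respectively. Setting $\beta_{4,h}(\underline{\lambda}, \underline{\xi}) =: \beta_h \lambda_h^2 + \bar\beta_{4,h}(\underline{\lambda}, \underline{\xi})$, where $\beta_h > 0$ and $\bar\beta_{4,h}$ is of order ≥ 3, Eq. (3.4) can be rewritten as

$$\lambda_k^{-1} = \lambda_{k+1}^{-1} - \beta_{k+1} + R_{k+1}(\underline{\lambda}, \underline{\xi}) \quad \Rightarrow \quad \lambda_k^{-1} = \lambda_0^{-1} + \sum_{j=1}^{k} \beta_j - \sum_{j=1}^{k} R_j(\underline{\lambda}, \underline{\xi}), \tag{3.8}$$

where $R_j$ is given by a sum of terms bounded proportionally to one among $\alpha_j$, $\mu_j$, $\lambda_j$, and it depends only on running coupling constants on scales ≥ j, see Appendix B; the key remark is that, formally, Eq. (3.8) can be seen as defining the fixed point of the map

$$(T_{\lambda_0, \underline{\xi}}\, x)_k = \frac{1}{\lambda_0^{-1} + \sum_{j=1}^{k} \beta_j - \sum_{j=1}^{k} R_j(x, \underline{\xi})}, \qquad k \ge 1, \tag{3.9}$$

where $x = (x_1, \dots, x_N)$ with $x_i \in \mathbb{C}$ and $\underline{\alpha}$, $\underline{\mu}$ satisfy (3.6), which again can be formally seen as the fixed point of the map

$$(\tilde T_{\xi_0, \underline{\lambda}}\, y)_k = \begin{pmatrix} \alpha_0 - \sum_{j=1}^{k} \beta_{2',j}(\underline{\lambda}, y) \\[4pt] \gamma^{-2k} \mu_0 - \sum_{j=1}^{k} \gamma^{2(j-k)} \beta_{2,j}(\underline{\lambda}, y) \end{pmatrix}, \qquad k \ge 1, \tag{3.10}$$

where $y = (y_1, y_2, \dots, y_N)$ and $y_k = (y_{k,2'}, y_{k,2})$ with $y_{k,i} \in \mathbb{C}$. Therefore, we can in principle determine the running coupling constants on scales > 0 as functions of $(\lambda_0, \alpha_0, \mu_0)$ by solving the equations:

$$\underline{\lambda} = T_{\lambda_0, \underline{\xi}}\, \underline{\lambda}, \qquad \underline{\xi} = \tilde T_{\xi_0, \underline{\lambda}}\, \underline{\xi}; \tag{3.11}$$

after this, the dependence of the running coupling constants on the renormalized ones can be deduced from Eqs. (3.5) and (3.7). To solve (3.11), in Sec. 4.1 and in Appendices A and B we prove that if $S \subset \mathbb{C}^N$ is the set of sequences "close enough" to the solution of the flow of $\lambda_k$ truncated to second order and if $\tilde S \subset \mathbb{C}^{2N}$ is a 2N-dimensional ball centered at zero and of suitably small radius, then: (i) if $x \in S$ and $|\alpha_0|$, $|\mu_0|$ are small enough the map $\tilde T_{\xi_0, x}$ leaves $\tilde S$ invariant and is a contraction therein; (ii) the fixed point y(x) of $\tilde T_{\xi_0, x}$ in $\tilde S$ is Hölder continuous in x with exponent 0 < ρ < 1; (iii) given $\vartheta \in (0, \pi/2]$, for all $\lambda_0 \in W_{\varepsilon,\vartheta}$ with ε small enough, the map $T_{\lambda_0, y(\cdot)}$ leaves S invariant and is a contraction therein. To be specific, the distances $d$, $\tilde d$ that we shall adopt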


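in $S$, $\tilde S$ are defined as $d(x, x') := \max_k |x_k - x'_k|$, $\tilde d(y, y') := \max_{k,i} |y_{k,i} - y'_{k,i}|$, respectively. Then, we can construct the sequences solving (3.11) in the following way: take

$$\underline{\lambda}^{(0)} \in S, \qquad \underline{\xi}^{(0)} = \begin{pmatrix} \underline{\alpha}^{(0)} \\ \underline{\mu}^{(0)} \end{pmatrix} \in \tilde S, \tag{3.12}$$

and define, for m ≥ 0,

$$\underline{\xi}^{(m+1)} := \lim_{n \to \infty} \tilde T^{\,n}_{\xi_0, \underline{\lambda}^{(m)}}\, \underline{\xi}^{(m)}, \qquad \underline{\lambda}^{(m+1)} := T_{\lambda_0, \underline{\xi}^{(m+1)}}\, \underline{\lambda}^{(m)}. \tag{3.13}$$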
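The structure of the scheme (3.13), namely solving the inner fixed-point problem completely and then taking a single outer step, can be sketched on a toy problem. In the Python example below the two maps are scalar functions chosen by us only because they are obvious contractions; they are stand-ins for $\tilde T_{\xi_0, \underline{\lambda}}$ and $T_{\lambda_0, \underline{\xi}}$ and have no relation to the actual beta function. All names, constants and tolerances are hypothetical.

```python
import math


def inner_map(y, x, a=0.2):
    """Toy stand-in for y -> T~_{xi_0, x} y: a contraction in y with factor at most 0.3."""
    return a + 0.3 * math.sin(y) + 0.1 * x


def outer_map(x, y, b=0.1):
    """Toy stand-in for x -> T_{lambda_0, y} x: a contraction in x with factor at most 0.2."""
    return b + 0.2 * math.cos(x) + 0.1 * y


def inner_fixed_point(x, y_start, tol=1e-14, max_iter=1000):
    """Approximate the limit of the iterates of inner_map at fixed x."""
    y = y_start
    for _ in range(max_iter):
        y_next = inner_map(y, x)
        if abs(y_next - y) < tol:
            return y_next
        y = y_next
    return y


def nested_scheme(x0, y0, outer_steps=60):
    """Analogue of (3.13): solve the inner problem fully, then take one outer step."""
    x, y = x0, y0
    history = [x]
    for _ in range(outer_steps):
        y = inner_fixed_point(x, y)
        x = outer_map(x, y)
        history.append(x)
    return x, y, history


x_star, y_star, history = nested_scheme(0.0, 0.0)
```

At the end of the loop the pair `(x_star, y_star)` satisfies both fixed-point equations up to rounding, and the successive differences stored in `history` shrink geometrically, which is the toy counterpart of the exponential convergence claimed for (3.13).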

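Assume inductively that for all $0 \le m' \le m$ the sequences $\underline{\xi}^{(m')}$, $\underline{\lambda}^{(m')}$ belong respectively to $\tilde S$, $S$, which is true for $m' = 0$. Property (i) above implies that $\underline{\xi}^{(m+1)}$ belongs to $\tilde S$, while property (iii) implies that $\underline{\lambda}^{(m+1)}$ belongs to $S$. Then, our procedure (3.13) converges exponentially to a limit; in fact, for m ≥ 1, for some 0 < ρ < 1, $C_\rho > 0$ and $0 < \epsilon < 1$:

$$\begin{aligned} \max_{k,i} \big|\xi^{(m+1)}_{k,i} - \xi^{(m)}_{k,i}\big| &\le C_\rho \Big[\max_k \big|\lambda^{(m)}_k - \lambda^{(m-1)}_k\big|\Big]^{\rho} \le C_\rho\, \epsilon^{(m-1)\rho} \Big[\max_k \big|\lambda^{(1)}_k - \lambda^{(0)}_k\big|\Big]^{\rho}, \\ \max_k \big|\lambda^{(m+1)}_k - \lambda^{(m)}_k\big| &\le \epsilon^{m} \max_k \big|\lambda^{(1)}_k - \lambda^{(0)}_k\big|, \end{aligned} \tag{3.14}$$

where we used property (ii) to get the first inequality in the first line, and property (iii) for the remaining ones. Since $\underline{\lambda}^{(1)}$, $\underline{\lambda}^{(0)}$ are bounded, Eqs. (3.14) prove that the limits

$$\underline{\lambda}^* = \lim_{m \to \infty} \underline{\lambda}^{(m)}, \qquad \underline{\xi}^* = \lim_{m \to \infty} \underline{\xi}^{(m)} \tag{3.15}$$

exist in $S$, $\tilde S$ respectively, and by construction

$$\underline{\lambda}^* = T_{\lambda_0, \underline{\xi}^*}\, \underline{\lambda}^*, \qquad \underline{\xi}^* = \tilde T_{\xi_0, \underline{\lambda}^*}\, \underline{\xi}^*, \tag{3.16}$$

i.e. $\underline{\lambda}^*$, $\underline{\xi}^*$ are the sequences of running coupling constants from scale 1 to N of the planar $\varphi^4_4$ theory, parametrized by $\lambda_0$, $\alpha_0$, $\mu_0$. The proof of analyticity of the limits for $(\lambda_0, \alpha_0, \mu_0) \in W_{\varepsilon,\vartheta} \times B_\eta \times B_\eta$ with ε, η small enough is straightforward; it is a consequence of the analyticity properties of the initial data and of the maps $T$, $\tilde T$, and of the fact that convergence is uniform for $(\lambda_0, \alpha_0, \mu_0) \in W_{\varepsilon,\vartheta} \times B_\eta \times B_\eta$. After this, from Eqs. (3.5) and (3.7) we show that $\lambda_0$, $\alpha_0$, $\mu_0$ are analytic for $(\lambda, \alpha, \mu) \in W_{\varepsilon',\vartheta'} \times B_{\eta'} \times B_{\eta'}$ with $\vartheta' > \vartheta$, $\varepsilon' < \varepsilon$, $\eta' < \eta$, and this concludes the proof of analyticity of the running coupling constants in the renormalized ones. Finally, to prove analyticity of the Schwinger functions we use that $S^T_{(N)}(f; q)$ is given by an absolutely convergent power series in the running coupling constants, see Sec. 2, and we prove that the limit for N → ∞ exists and is reached uniformly for $(\lambda, \alpha, \mu) \in W_{\bar\varepsilon,\vartheta} \times B_{\bar\eta} \times B_{\bar\eta}$ with $\bar\varepsilon < \varepsilon'$, $\bar\eta < \eta'$.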
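The mechanism behind the form of $\lambda_k$ in (3.3) and the rewriting (3.8) can be seen on the flow truncated to second order. The Python sketch below makes two simplifying assumptions that are ours: the beta function in (3.4) is replaced by $\beta \lambda_{k+1}^2$ with a constant β > 0, and all other couplings are dropped. In this toy model the remainder in (3.8) is known in closed form, $R_j = \beta^2 \lambda_j / (1 + \beta \lambda_j)$, so the identity can be checked directly. The function names are hypothetical.

```python
import math


def next_coupling(lam_k, beta):
    """Positive root lam_next of lam_k = lam_next + beta * lam_next**2.

    Written as 2*lam_k / (1 + sqrt(1 + 4*beta*lam_k)) to avoid cancellation.
    """
    return 2.0 * lam_k / (1.0 + math.sqrt(1.0 + 4.0 * beta * lam_k))


def truncated_flow(lam_0, beta, n_scales):
    """Return [lam_0, lam_1, ..., lam_{n_scales}] for the second-order toy flow."""
    couplings = [lam_0]
    for _ in range(n_scales):
        couplings.append(next_coupling(couplings[-1], beta))
    return couplings


def inverse_coupling_defect(couplings, beta):
    """Return 1/lam_k - (1/lam_0 + k*beta) + sum_{j=1}^{k} R_j at the last scale k.

    R_j = beta**2 * lam_j / (1 + beta * lam_j) is the exact remainder of the toy
    flow, so the returned value should vanish up to rounding.
    """
    k = len(couplings) - 1
    remainder_sum = sum(beta ** 2 * lam / (1.0 + beta * lam) for lam in couplings[1:])
    return 1.0 / couplings[-1] - (1.0 / couplings[0] + k * beta) + remainder_sum


flow = truncated_flow(lam_0=0.1, beta=1.0, n_scales=1000)
defect = inverse_coupling_defect(flow, beta=1.0)
```

Since each $R_j$ is positive and proportional to $\lambda_j$, the couplings of the toy flow stay slightly above $1/(\lambda_0^{-1} + k\beta)$, and the accumulated correction grows much more slowly than the linear term kβ. This is the toy version of the statement that the sequences in S are small perturbations of the second-order solution.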


Bound on the remainder. In Sec. 4.2, we show that, relying on the tree representation of the beta function described in Sec. 2, it is possible to rewrite the q-point Schwinger function as:

$$S^T(f; q) = S^{T,(\le n)}(f; q) + r^{(n)}(f; q), \tag{3.17}$$

where $S^{T,(\le n)}(f; q)$ is the Taylor expansion of $S^T(f; q)$ up to order n around λ = 0, and $r^{(n)}(f; q)$ is a quantity bounded by $(\mathrm{const.})^{n+1}\, C_q\, \|f\|_1^q\, (n+1)!\, |\lambda|^{n+1}$ uniformly in the analyticity domain. The idea is to use the graphical representation of the beta function depicted in Fig. 5 to "extract" in the tree expansion of the Schwinger function all the possible trees with less than n + 1 thin endpoints corresponding to λ, as suggested by Fig. 6; the main difficulty in this procedure is to check that after having reproduced the Taylor series up to the order n the "unwanted" trees, i.e. the ones showing more than n endpoints of type 4, have less than n + 1 nontrivial frames labeled by a = 2', 4, see the remark after Fig. 6. After having checked this, the desired bound is a straightforward consequence of the n! bound (2.36).

4. Proof of Theorem 1

4.1. Analyticity of the flow of the running coupling constants

In this section we present in a mathematically precise form the properties (i)–(iii) mentioned in the previous section after Eq. (3.11), which, as we already discussed, are the key ingredients in the construction of the sequences of the running coupling constants on scales ≥ 1 as functions of the ones on scale 0. After this, we express the running coupling constants on scale 0 in terms of the renormalized ones, and we prove the analyticity properties required for Borel summability. The spaces of sequences that we shall consider are the following ones:

$$S_{\lambda_0, \delta} := \Big\{x \in \mathbb{C}^N : x_k = \frac{1}{\lambda_0^{-1} + \sum_{j=1}^{k} \beta_j + t_k},\ |t_k| \le \delta \sqrt{k}\Big\}, \qquad \tilde S_\eta := \big\{y \in \mathbb{C}^{2N} : |y_{k,i}| \le \eta\big\}. \tag{4.1}$$

The following two lemmas imply, respectively, properties (i), (ii) and property (iii) stated in Sec. 3.

Lemma 1. For any $\vartheta \in (0, \pi/2]$ there exist $\bar\varepsilon > 0$, $\bar\eta > 0$ such that if $(\lambda_0, \alpha_0, \mu_0) \in W_{2\bar\varepsilon, \vartheta/2} \times B_{2\bar\eta} \times B_{2\bar\eta}$ and $x \in S_{\lambda_0, \bar\varepsilon + \bar\eta}$:

˜ ξ0 ,x is a map from S˜4¯η to S˜4¯η ; (1) T ˜ ξ0 ,x is a contraction in S˜4¯η , i.e. if y ∈ S˜4¯η , y ∈ S˜4¯η (2) T ˜ ξ0 ,x y ˜ ξ0 ,x y max T − T , ≤ max yk,i − yk,i k,i k,i k,i

k,i

0 < < 1;

(4.2)


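(3) given two sequences $x$, $x'$ belonging to $S_{\lambda_0,\bar\varepsilon+\bar\eta}$, the fixed points $y(x)$, $y(x')$ of the maps $\tilde T_{\xi_0,x}$, $\tilde T_{\xi_0,x'}$ in $\tilde S_{4\bar\eta}$ verify the following inequalities:

$$|y_{k,i}(x) - y_{k,i}(x')| \le C\,[\log(1 + \bar\varepsilon k) + 1]\,\max_k |x_k - x'_k|, \tag{4.3}$$

$$|y_{k,i}(x) - y_{k,i}(x')| \le C_\rho \Bigl(\max_k |x_k - x'_k|\Bigr)^{\rho}, \tag{4.4}$$

for some positive $C$, $C_\rho$ and $0 < \rho < 1$.

Lemma 2. For any $\vartheta \in (0, \pi/2)$ there exist $\bar\varepsilon > 0$, $\bar\eta > 0$ such that if $(\lambda_0,\alpha_0,\mu_0) \in W_{2\bar\varepsilon,\vartheta/2}\times B_{2\bar\eta}\times B_{2\bar\eta}$ the fixed point $y(x)$ of $\tilde T_{\xi_0,x}$ in $\tilde S_{4\bar\eta}$ for $x \in S_{\lambda_0,\bar\varepsilon+\bar\eta}$ exists and:

(1) $T_{\lambda_0,y(\cdot)}$ is a map from $S_{\lambda_0,\bar\varepsilon+\bar\eta}$ to $S_{\lambda_0,\bar\varepsilon+\bar\eta}$;

(2) $T_{\lambda_0,y(\cdot)}$ is a contraction in $S_{\lambda_0,\bar\varepsilon+\bar\eta}$, i.e. if $x \in S_{\lambda_0,\bar\varepsilon+\bar\eta}$, $x' \in S_{\lambda_0,\bar\varepsilon+\bar\eta}$,

$$\max_k \bigl|(T_{\lambda_0,y(x)}\,x)_k - (T_{\lambda_0,y(x')}\,x')_k\bigr| \le \epsilon \max_k |x_k - x'_k|, \qquad 0 < \epsilon < 1. \tag{4.5}$$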

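We refer the reader to Appendices A and B for the proofs of these lemmas. As explained in Sec. 3, these two results allow us to construct the sequences of the running coupling constants as functions of those on scale 0, and to determine their analyticity properties. We take

$$\xi^{(0)} = \begin{pmatrix}\alpha^{(0)}\\ \mu^{(0)}\end{pmatrix} \in \tilde S_{4\bar\eta}, \qquad \lambda^{(0)} \in S_{\lambda_0,\bar\varepsilon+\bar\eta} \tag{4.6}$$

analytic for $(\lambda_0,\alpha_0,\mu_0) \in W_{2\bar\varepsilon,\vartheta/2}\times B_{2\bar\eta}\times B_{2\bar\eta}$; to be concrete, we can choose

$$\alpha_k^{(0)} = \mu_k^{(0)} = \bar\eta, \qquad \lambda_k^{(0)} = \frac{1}{\lambda_0^{-1} + \sum_{j=1}^{k}\beta_j}, \qquad \lambda_0 \in W_{2\bar\varepsilon,\vartheta/2}. \tag{4.7}$$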
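As a purely illustrative sketch of the initial sequence $\lambda_k^{(0)}$ in (4.7), the snippet below evaluates it for a complex $\lambda_0$. The constant value `beta = 1.5` for all $\beta_j$ and the sample $\lambda_0 = 0.1\,i$ are assumptions made here for the example; the actual coefficients $\beta_j$ are defined elsewhere in the paper.

```python
from itertools import accumulate


def initial_sequence(lambda0, betas):
    """lambda_k^(0) = 1 / (1/lambda0 + beta_1 + ... + beta_k), for k = 1..len(betas)."""
    partial_sums = accumulate(betas)
    return [1.0 / (1.0 / lambda0 + s) for s in partial_sums]


# Assumption: constant coefficients beta_j = 1.5 (illustrative value only).
beta = 1.5
# Assumption: a purely imaginary bare coupling, to show that complex values are fine.
lam = initial_sequence(lambda0=complex(0.0, 0.1), betas=[beta] * 1000)
```

With these choices $|\lambda_k^{(0)}| = (100 + 2.25\,k^2)^{-1/2}$, which never exceeds $|\lambda_0|$ and decays like $1/(\beta k)$ at large $k$, the toy analogue of asymptotic freedom.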

Then, we can construct the sequences of running coupling constants by proceeding as explained after (3.12); analyticity for $(\lambda_0,\alpha_0,\mu_0) \in W_{2\bar\varepsilon,\vartheta/2}\times B_{2\bar\eta}\times B_{2\bar\eta}$ is a straightforward consequence of the analyticity properties of the maps and of the initial data, and of the fact that convergence is uniform for $(\lambda_0,\alpha_0,\mu_0) \in W_{2\bar\varepsilon,\vartheta/2}\times B_{2\bar\eta}\times B_{2\bar\eta}$. Now we turn to the flow Eqs. (3.5) and (3.7) for the running coupling constants on scale 0. Notice that these equations are different from the ones corresponding to higher scales, because of the presence of the functions $f_{a,0}$. The main consequence of this fact is that choosing $\lambda$ inside $C_\varepsilon$ does not imply that $\lambda_0 \in C_{\varepsilon'}$ for some $\varepsilon'$; this is the reason why we considered $\lambda_0 \in W_{\varepsilon,\vartheta}$ so far. The strategy that we shall adopt is very similar to, but technically much simpler than, the one we followed for the scales $1, \dots, N$, see Appendix C for details: first, we determine with a fixed point argument $\alpha_0$, $\mu_0$ as analytic functions of $\lambda_0$, $\alpha$, $\mu$ in $W_{2\varepsilon,\vartheta/2}\times B_\eta\times B_\eta$ for $\varepsilon$, $\eta$ small enough; then, we plug $\alpha_0$, $\mu_0$ into Eq. (3.5) for $\lambda_0$, and we solve it using again a fixed point argument; finally, we show that the solution has the required analyticity


properties in $\lambda$, $\alpha$, $\mu$. In particular, it follows that $\lambda_0^{-1} = \lambda^{-1}(1 + O(\mu)) + \beta_0$, up to corrections bounded by $(\mathrm{const.})(|\lambda| + |\xi|)$.

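Asymptotic behavior of the running coupling constants. So far, our construction allowed us to conclude that, if $(\lambda,\alpha,\mu) \in W_{\varepsilon,\vartheta}\times B_\eta\times B_\eta$ with $\varepsilon$, $\eta$ small enough:

$$\lambda_k = \frac{1}{\lambda^{-1}(1 + O(\mu)) + \sum_{j=0}^{k}\beta_j + t_k}, \qquad |\alpha_k| \le \eta, \qquad |\mu_k| \le \eta, \tag{4.8}$$

with $|t_k| \le (k+1)\sqrt{\varepsilon + \eta}$; however, these results can be improved to get (3.3). In fact, the flows of $\alpha_k$, $\mu_k$ are given by, for $k \ge 1$:

$$\alpha_k = \alpha_0 - \sum_{j=1}^{k}\beta_{2',j}(\lambda,\xi), \qquad \mu_k = \gamma^{-2k}\mu_0 - \sum_{j=1}^{k}\gamma^{2(j-k)}\beta_{2,j}(\lambda,\xi), \tag{4.9}$$

where:

$$\bigl|\beta_{2',j}(\lambda,\xi)\bigr| \le c\,|\lambda_j|^2, \qquad \bigl|\beta_{2,j}(\lambda,\xi)\bigr| \le c\,\bigl(|\lambda| + |\xi|\bigr)|\lambda_j|. \tag{4.10}$$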
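The role of the factor $\gamma^{2(j-k)}$ in the flow of $\mu_k$ in (4.9) can be seen in a small numerical sketch: the sum over $j$ is dominated by scales close to $k$, so $|\mu_k - \gamma^{-2k}\mu_0|$ is bounded by $\sup_j|\beta_{2,j}|/(1-\gamma^{-2})$ uniformly in $k$. The values of `gamma`, `mu0` and the list `beta2` below are hypothetical inputs chosen for illustration; in the paper $\beta_{2,j}$ depends on the whole sequence of running couplings.

```python
def mu_flow(mu0, beta2, gamma):
    """mu_k = gamma^(-2k) mu0 - sum_{j=1}^k gamma^(2(j-k)) beta2[j-1], for k = 0..len(beta2)."""
    mus = [mu0]
    for k in range(1, len(beta2) + 1):
        s = sum(gamma ** (2 * (j - k)) * beta2[j - 1] for j in range(1, k + 1))
        mus.append(gamma ** (-2 * k) * mu0 - s)
    return mus


gamma = 2.0
# Hypothetical beta-function values decaying like 1/j (assumption, not from the paper).
beta2 = [0.01 / j for j in range(1, 41)]
mus = mu_flow(mu0=0.05, beta2=beta2, gamma=gamma)
```

The same sequence satisfies the one-step recursion $\mu_k = \gamma^{-2}\mu_{k-1} - \beta_{2,k}$, which is how the closed formula in (4.9) arises.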

Therefore it follows that, using the expression for $\lambda_k$ in (4.8), for some $c > 0$:

$$|\alpha_k - \alpha| \le c\,\bigl(|\lambda| + |\mu|^2\bigr), \qquad \bigl|\mu_k - \gamma^{-2k}\mu\bigr| \le c\,\bigl[\gamma^{-2k}|\mu|^2 + (|\lambda| + |\xi|)|\lambda_k|\bigr], \tag{4.11}$$

which give the last two of (3.3). To prove the first of (3.3), simply use (4.11) and the first of (4.8) to replace the running coupling constants appearing in $R_j$, see (3.9) and (B.2).

Analyticity of the Schwinger functions. As we have discussed in Sec. 2, the Schwinger functions $S^T_{(N)}(f;q)$ are given by absolutely convergent power series in the running coupling constants on scales $\le N$; therefore, taking $\bar\varepsilon$, $\bar\eta$ smaller than the radius of convergence of the series, $S^T_{(N)}(f;q)$ is analytic for $(\lambda,\alpha,\mu) \in W_{\bar\varepsilon,\vartheta}\times B_{\bar\eta}\times B_{\bar\eta}$. To prove analyticity in the limit $N \to +\infty$ we show that the sequence $\{S^T_{(N)}(f;q)\}_{N\ge 1}$ is uniformly Cauchy in the analyticity domain. In fact, consider two positive integers $N$, $N'$ such that $N' > N$; then,

$$S^T_{(N')}(f;q) - S^T_{(N)}(f;q) := \delta S^T_{1,(N,N')}(f;q) + \delta S^T_{2,(N,N')}(f;q), \tag{4.12}$$

where $\delta S^T_{1,(N,N')}(f;q)$ is given by a sum of trees with at least one endpoint on scale $k \le N$ corresponding to the difference of running coupling constants $v_k^{(a),N'} - v_k^{(a),N}$ of theories with cutoffs on scales $N$, $N'$, and $\delta S^T_{2,(N,N')}(f;q)$ is given by a sum of GN trees having root scale $-1$ and at least one endpoint on scale $\ge N+1$. The first term can be bounded using the results of Appendix D as:

$$\bigl|\delta S^T_{1,(N,N')}(f;q)\bigr| \le (\mathrm{const.})\,N^{-1}, \tag{4.13}$$


while the second can be estimated using the short memory property of the GN trees (see discussion after (2.20)) as, for some $\rho > 0$,

$$\bigl|\delta S^T_{2,(N,N')}(f;q)\bigr| \le (\mathrm{const.})\,\gamma^{-\rho N}, \qquad \gamma > 1; \tag{4.14}$$

all the bounds are uniform in $(\lambda,\alpha,\mu) \in W_{\bar\varepsilon,\vartheta}\times B_{\bar\eta}\times B_{\bar\eta}$. Therefore the limit exists, and it is analytic in $W_{\bar\varepsilon,\vartheta}\times B_{\bar\eta}\times B_{\bar\eta}$.

4.2. Bounds on the Taylor remainder of the Schwinger functions

In this section we show that for all $n > 0$, $(\lambda,\alpha,\mu) \in W_{\bar\varepsilon,\vartheta}\times B_{\bar\eta}\times B_{\bar\eta}$, the $q$-point Schwinger function $S^T(f;q)$ verifies

$$S^T(f;q) = S^{T,(\le n)}(f;q) + r_n(f;q) \tag{4.15}$$

where $S^{T,(\le n)}(f;q)$ is the Taylor expansion of $S^T(f;q)$ up to the order $n$ in $\lambda = 0$ and $r_n(f;q)$ is a remainder bounded by $C^{n+1}(n+1)!\,|\lambda|^{n+1}$ for some $C > 0$. Result (4.15) concludes the proof of Borel summability of the Schwinger functions of the planar theory. One can try to prove decomposition (4.15) by iterating the graphical definition of the running coupling constants, see discussion after (2.35) and, in particular, Fig. 6 to get an idea of the graphical meaning of the iteration, to “extract” all the possible trees with only thin endpoints and at most $n$ of them labeled by 4; to conclude the proof one has to check at the end that the sum of the values of the trees not belonging to this category is bounded by $C^{n+1}(n+1)!\,|\lambda|^{n+1}$. For simplicity, in the following we shall call “$a$-endpoint” an endpoint labeled by $a$, and “$a$-frame” a frame labeled by $a$; $a$-frames with $a$ equal to $2'$ or 4 will be called “$(2',4)$-frames”.

Empty and square endpoints. We can rewrite (3.4), (3.7) in the more compact form:

$$v_k^{(a)} = \gamma^{-2\delta_{a,2}(k+1)}\,v_{-1}^{(a)} - \gamma^{-2\delta_{a,2}k} f_{a,0}(\lambda_0,\mu_0) - \sum_{j=0}^{k}\gamma^{2(j-k)\delta_{a,2}}\,\beta_{a,j}(\lambda_0,\lambda,\xi_0,\xi). \tag{4.16}$$

We graphically represent $-\gamma^{-2\delta_{a,2}k} f_{a,0}$ as an empty $a$-endpoint and $-\sum_{j=0}^{k}\gamma^{2(j-k)\delta_{a,2}}\beta_{a,j}$ as a square $a$-endpoint. Therefore, in general, the fat $a$-endpoint can be written as the sum of thin, empty and square $a$-endpoints; see Fig. 8. In turn, the empty and the square endpoints can be represented as sums of framed trees with root scale $k$, no inner frames and only fat endpoints, see discussion after (2.35). It is important to notice that the frames appearing in the tree representation of $-\gamma^{-2\delta_{a,2}k} f_{a,0}$ are trivial, see Remark after Fig. 5.


Fig. 8. Fat endpoints are equal to thin plus empty plus square endpoints.

We define the order and the 4-order of fat, thin, empty, and square endpoints as the order of their values in all the renormalized coupling constants and in $\lambda$ only, respectively. Therefore:

• Thin and fat endpoints have order 1; empty endpoints have order 2; square $a$-endpoints have order 1 or 2 depending on whether $a = 2', 4$ or $a = 2$.
• Thin, fat and empty $a$-endpoints have 4-order 0 or 1 depending on whether $a = 2', 2$ or $a = 4$; square endpoints have 4-order 1.

Notice that the reason why we set to 1 the order and the 4-order of the square $a$-endpoints with $a = 2', 4$, which are given by sums of trees with two 4-endpoints, is that we have to exploit asymptotic freedom to control the sum in (4.16); the result can be bounded uniformly in $k$ by $|\lambda|$ but not by $|\lambda|^2$.

Notations. We shall use the following notations:

• $n_{2',4}(\gamma)$ is the number of nontrivial $(2',4)$-frames appearing in a tree $\gamma$;
• $n^{(a)}_{\mathrm{sq}}(\gamma)$ is the number of square $a$-endpoints appearing in a tree $\gamma$, and $n_{\mathrm{sq}}(\gamma) := n^{(4)}_{\mathrm{sq}}(\gamma) + n^{(2')}_{\mathrm{sq}}(\gamma) + n^{(2)}_{\mathrm{sq}}(\gamma)$;
• the order $O(\gamma)$ and the 4-order $O_4(\gamma)$ of a tree $\gamma$ are respectively equal to the sums of the orders, 4-orders of the endpoints of $\gamma$;
• the “expansion” of square and empty endpoints consists in replacing them with their tree expansions in terms of framed trees with no inner frames and only fat endpoints, see discussion after (2.35).

Proof of (3.17). We will proceed by induction. Assume that, at the step $r$ of the induction, for every $n > 0$, $M > 0$ with $M \ge n$ the Schwinger function $S^T(f;q)$ can be written as

$$S^T(f;q) = F^{(r)}_{n,M} + R^{(r),1}_{n,M} + R^{(r),2}_{n,M}, \tag{4.17}$$

where both $F^{(r)}_{n,M}$, $R^{(r),i}_{n,M}$ can be represented as sums over distinct trees such that $n_{2',4}(\gamma) \le n$. Moreover, we assume that:

• the trees $\gamma$ contributing to $F^{(r)}_{n,M}$ are such that $O_4(\gamma) \le n$, $O(\gamma) \le M$ and show fat and thin endpoints;
• the trees $\gamma$ contributing to $R^{(r),i}_{n,M}$ are such that $O_4(\gamma) > n$ or $O(\gamma) > M$, depending on whether $i = 1, 2$, and may have empty and square endpoints.

These assumptions are trivially true at the beginning of the induction, see Sec. 2. As a consequence of result (2.36), and since the number of topologically distinct trees with $m$ endpoints is estimated by $(\mathrm{const.})^m$, $R^{(r),1}_{n,M}$, $R^{(r),2}_{n,M}$ are bounded


respectively by $C^n C_q \|f\|_1^q\, n!\,|\lambda|^{n+1}$, $C^M C_q \|f\|_1^q\, n!\,\delta^{M+1}$ for some positive $C$ and $\delta := \max_k\{|\lambda_k|, |\alpha_k|, |\mu_k|\}$. Now do the following.

(1) Substitute every fat 2-endpoint appearing in $F^{(r)}_{n,M}$ with the sum of a thin plus an empty plus a square 2-endpoint: in this way the fat 2-endpoints disappear, generating new trees such that $n_{2',4}(\gamma) \le n$ that we organize by writing

$$F^{(r)}_{n,M} = A_1^{(r)} + A_2^{(r),1} + A_2^{(r),2}, \tag{4.18}$$

where

$$\begin{aligned}
A_1^{(r)} &:= \text{“sum of trees $\gamma$ such that $O_4(\gamma) \le n$ and $O(\gamma) \le M$”},\\
A_2^{(r),1} &:= \text{“sum of trees $\gamma$ such that $O_4(\gamma) > n$”},\\
A_2^{(r),2} &:= \text{“sum of trees $\gamma$ such that $O_4(\gamma) \le n$ and $O(\gamma) > M$”}.
\end{aligned}$$

(2) Substitute every fat $2'$-endpoint appearing in $A_1^{(r)}$ with the sum of a thin plus an empty plus a square $2'$-endpoint: in this way the fat $2'$-endpoints disappear, generating new trees such that $n_{2',4}(\gamma) \le n$ that we organize by writing

$$A_1^{(r)} = A_3^{(r)} + A_4^{(r),1} + A_4^{(r),2}, \tag{4.19}$$

where

$$\begin{aligned}
A_3^{(r)} &:= \text{“sum of trees $\gamma$ s.t. $n_{2',4}(\gamma) + n^{(2')}_{\mathrm{sq}}(\gamma) \le n - 1 + \delta_{n_{2',4},0}$ and $O(\gamma) \le M$”},\\
A_4^{(r),1} &:= \text{“sum of trees $\gamma$ s.t. $n_{2',4}(\gamma) + n^{(2')}_{\mathrm{sq}}(\gamma) > n - 1 + \delta_{n_{2',4},0}$”},\\
A_4^{(r),2} &:= \text{“sum of trees $\gamma$ s.t. $n_{2',4}(\gamma) + n^{(2')}_{\mathrm{sq}}(\gamma) \le n - 1 + \delta_{n_{2',4},0}$ and $O(\gamma) > M$”}.
\end{aligned}$$

Notice that the trees appearing in $A_4^{(r),1}$ are such that $O_4(\gamma) > n$; in fact, for these trees,

$$O_4(\gamma) \ge n_{2',4}(\gamma) + 1 + n^{(2')}_{\mathrm{sq}}(\gamma) - \delta_{n_{2',4},0} > n, \tag{4.20}$$

where we used that each nontrivial $2'$-frame contains trees of 4-order $\ge 2$, that the square $2'$-endpoints are of 4-order strictly bigger than their corresponding thin and empty endpoints, and the definition of $A_4^{(r),1}$.

(3) Expand each square $a$-endpoint with $a = 2', 2$ appearing in $A_3^{(r)}$, and write

$$A_3^{(r)} = A_5^{(r)} + A_6^{(r),1} + A_6^{(r),2}, \tag{4.21}$$

where

$$\begin{aligned}
A_5^{(r)} &:= \text{“sum of the trees $\gamma$ s.t. $O_4(\gamma) \le n$ and $O(\gamma) \le M$”},\\
A_6^{(r),1} &:= \text{“sum of the trees $\gamma$ s.t. $O_4(\gamma) > n$”},\\
A_6^{(r),2} &:= \text{“sum of the trees $\gamma$ s.t. $O_4(\gamma) \le n$ and $O(\gamma) > M$”}.
\end{aligned}$$

Notice that the trees generated at this step are such that $n_{2',4}(\gamma) \le n$; in fact, for a generic tree $\gamma'$ generated by $\gamma \in A_3^{(r)}$ it follows that $n_{2',4}(\gamma') = n_{2',4}(\gamma) + n^{(2')}_{\mathrm{sq}}(\gamma) \le n$, where the last inequality holds by definition of $A_3^{(r)}$.


(4) Substitute every fat 4-endpoint appearing in $A_5^{(r)}$ with the sum of a thin plus an empty plus a square 4-endpoint: in this way the fat 4-endpoints disappear, generating new trees such that $n_{2',4}(\gamma) \le n$ that we organize by writing

$$A_5^{(r)} = A_7^{(r)} + A_8^{(r),1} + A_8^{(r),2}, \tag{4.22}$$

where

$$\begin{aligned}
A_7^{(r)} &:= \text{“sum of trees $\gamma$ s.t. $n_{2',4}(\gamma) + n^{(4)}_{\mathrm{sq}}(\gamma) \le n - 1 + \delta_{n_{2',4},0}$ and $O(\gamma) \le M$”},\\
A_8^{(r),1} &:= \text{“sum of trees $\gamma$ s.t. $n_{2',4}(\gamma) + n^{(4)}_{\mathrm{sq}}(\gamma) > n - 1 + \delta_{n_{2',4},0}$”},\\
A_8^{(r),2} &:= \text{“sum of trees $\gamma$ s.t. $n_{2',4}(\gamma) + n^{(4)}_{\mathrm{sq}}(\gamma) \le n - 1 + \delta_{n_{2',4},0}$ and $O(\gamma) > M$”}.
\end{aligned}$$

Now, we show that $A_8^{(r),1}$ can be rewritten as a sum of trees such that $O_4(\gamma) > n$ and $n_{2',4}(\gamma) \le n$. Notice that since the 4-order of the 4-square endpoint is equal to the 4-order of its corresponding fat, thin and empty endpoints, we cannot use a bound like the one in (4.20). To “raise the 4-order” of a tree $\gamma$ up to $n+1$ we have to expand a suitable number $\tilde n^{(4)}_{\mathrm{sq}}(\gamma) \le n^{(4)}_{\mathrm{sq}}(\gamma)$ of square 4-endpoints (which are given by sums of trees of 4-order $\ge 2$). If $n_{2',4}(\gamma) = 0$ we choose $\tilde n^{(4)}_{\mathrm{sq}}(\gamma) = 0$, because in this case by definition of $S_4^{(r)}$ the 4-order of $\gamma$ is already $> n$; if $n_{2',4}(\gamma) > 0$ we choose

$$\tilde n^{(4)}_{\mathrm{sq}}(\gamma) := n - n_{2',4}(\gamma), \tag{4.23}$$

with this choice it follows that ($n_{2',4}(\gamma)$ refers to the tree $\gamma$ before this last expansion),

$$O_4(\gamma) \ge n_{2',4}(\gamma) + 1 + \tilde n^{(4)}_{\mathrm{sq}}(\gamma) = n + 1. \tag{4.24}$$

Finally, a generic tree $\gamma'$ produced by this last expansion verifies

$$n_{2',4}(\gamma') = n_{2',4}(\gamma) + \tilde n^{(4)}_{\mathrm{sq}}(\gamma) = n. \tag{4.25}$$

(5) Expand each square 4-endpoint appearing in $A_7^{(r)}$, and write

$$A_7^{(r)} = A_9^{(r)} + A_{10}^{(r),1} + A_{10}^{(r),2}, \tag{4.26}$$

where

$$\begin{aligned}
A_9^{(r)} &:= \text{“sum of the trees $\gamma$ s.t. $O_4(\gamma) \le n$ and $O(\gamma) \le M$”},\\
A_{10}^{(r),1} &:= \text{“sum of the trees $\gamma$ s.t. $O_4(\gamma) > n$”},\\
A_{10}^{(r),2} &:= \text{“sum of the trees $\gamma$ s.t. $O_4(\gamma) \le n$ and $O(\gamma) > M$”}.
\end{aligned} \tag{4.27}$$

Notice that the trees generated at this step are such that $n_{2',4}(\gamma) \le n$; in fact, for a generic tree $\gamma'$ generated by $\gamma \in A_7^{(r)}$ it follows that $n_{2',4}(\gamma') = n_{2',4}(\gamma) + n^{(4)}_{\mathrm{sq}}(\gamma) \le n$, where the last inequality holds by definition of $A_7^{(r)}$.


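(6) Expand each empty $a$-endpoint appearing in $A_9^{(r)}$, and write

$$A_9^{(r)} = F^{(r+1)}_{n,M} + A_{12}^{(r),1} + A_{12}^{(r),2}, \tag{4.28}$$

where

$$\begin{aligned}
F^{(r+1)}_{n,M} &:= \text{“sum of the trees $\gamma$ s.t. $O_4(\gamma) \le n$ and $O(\gamma) \le M$”},\\
A_{12}^{(r),1} &:= \text{“sum of the trees $\gamma$ s.t. $O_4(\gamma) > n$”},\\
A_{12}^{(r),2} &:= \text{“sum of the trees $\gamma$ s.t. $O_4(\gamma) \le n$ and $O(\gamma) > M$”}.
\end{aligned} \tag{4.29}$$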

(7) We are now able to express the generic Schwinger function $S^T(f;q)$ as

$$\begin{aligned}
S^T(f;q) &= F^{(r+1)}_{n,M} + R^{(r),1}_{n,M} + R^{(r),2}_{n,M} + \sum_{j=1,2}\ \sum_{i=1}^{6} A_{2i}^{(r),j}\\
&=: F^{(r+1)}_{n,M} + R^{(r+1),1}_{n,M} + R^{(r+1),2}_{n,M},
\end{aligned} \tag{4.30}$$

where, by construction, all the trees are such that $n_{2',4}(\gamma) \le n$, the remainder $R^{(r+1),1}_{n,M}$ contains distinct trees such that $O_4(\gamma) > n$, while $R^{(r+1),2}_{n,M}$ is given by a sum of distinct trees such that $O(\gamma) > M$. If $F^{(r+1)}_{n,M}$ still contains trees with fat endpoints repeat the process starting from step (1), otherwise we have finished: calling $r^*$ the final step (which is finite, see Remark below), i.e. the integer such that $F^{(r^*)}_{n,M}$ contains trees with only thin endpoints, the $n!$ bound (2.36) implies that, if $\delta = \max_h\{|\lambda_h|, |\alpha_h|, |\mu_h|\}$:

$$\bigl|R^{(r^*),1}_{n,M}\bigr| \le C^n C_q \|f\|_1^q\, n!\,|\lambda|^{n+1}, \qquad \bigl|R^{(r^*),2}_{n,M}\bigr| \le C^M C_q \|f\|_1^q\, n!\,\delta^{M+1}. \tag{4.31}$$

Moreover, $F^{(r^*)}_{n,M}$ differs from $F^{(r^*)}_{n,+\infty}$, the Taylor expansion in $\lambda$ to the order $n$, by a quantity bounded by $C^M C_q \|f\|_1^q\, n!\,\delta^{M+1}$; therefore, for each $\lambda$ in the analyticity domain and for each $n \ge 0$ there exists a finite integer $M(\lambda, n) \ge n$ such that for all $M \ge M(\lambda, n)$ it follows that:

$$\bigl|S^T(f;q) - F^{(r^*)}_{n,+\infty}\bigr| \le 4\,C^n C_q \|f\|_1^q\, n!\,|\lambda|^{n+1}, \tag{4.32}$$

and this bound concludes the proof of Borel summability of the $\varphi^4_4$ planar theory.

Remark. The iteration ends in less than $M+1$ steps (where each step is formed by the seven substeps described above); this means that no trees with fat endpoints are present in $F^{(M+1)}_{n,M}$. We can prove this fact with a simple induction. At the step $r = 0$, the trees with fat endpoints are of order $\ge 0$. Assume inductively that at the $r$th step, the trees belonging to $F^{(r)}_{n,M}$ with at least one fat endpoint are of order $\ge r$. If this is true, by repeating the six substeps described above, we find that the new trees with at least one fat endpoint appearing in $F^{(r+1)}_{n,M}$ must be of order $\ge r+1$, since at the $r$th step the fat endpoints are replaced by thin plus empty plus square endpoints, and the empty endpoints are of order 2 while the squares are given by sums of trees of order $\ge 2$. Hence, after at most $r^* = M+1$ iterations no more trees showing fat endpoints will be present in $F^{(r^*)}_{n,M}$.
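A factorial remainder bound of the type (4.32) is what makes Borel resummation possible. As a minimal illustration on a toy series (not on the Schwinger functions), the sketch below uses Euler's divergent series $\sum_n (-1)^n n!\,\lambda^n$, whose Borel sum for $\lambda > 0$ is $f(\lambda) = \int_0^\infty e^{-t}/(1+\lambda t)\,dt$, and checks numerically that the remainder after order $n$ is bounded by $(n+1)!\,\lambda^{n+1}$. The function names, the Simpson quadrature and the cutoff `t_max` are choices made here for the example.

```python
import math


def borel_sum_euler(lam, t_max=60.0, n_steps=60_000):
    """Simpson approximation of int_0^inf exp(-t) / (1 + lam*t) dt, for lam >= 0.

    The integral is truncated at t_max; the neglected tail is below exp(-t_max).
    n_steps must be even for the composite Simpson rule.
    """
    h = t_max / n_steps

    def g(t):
        return math.exp(-t) / (1.0 + lam * t)

    total = g(0.0) + g(t_max)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0


def partial_sum(lam, order):
    """Truncated Euler series: sum_{n=0}^{order} (-1)^n n! lam^n."""
    return sum((-1) ** n * math.factorial(n) * lam ** n for n in range(order + 1))


lam = 0.1
f = borel_sum_euler(lam)
# Remainder after order n, compared with the factorial bound (n+1)! lam^(n+1).
remainders = [abs(f - partial_sum(lam, n)) for n in range(15)]
bounds = [math.factorial(n + 1) * lam ** (n + 1) for n in range(15)]
```

The partial sums first approach $f(\lambda)$ and then diverge, while each remainder stays below its factorial bound; this is the toy analogue of the estimate that enters the Nevanlinna–Sokal theorem.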


5. Conclusions

In this paper, we discussed the issue of Borel summability in the framework of multiscale analysis and renormalization group, by providing a proof of Borel summability for the $\varphi^4_4$ planar theory using the techniques of [6]. This result is not new, since it has been proven independently by ’t Hooft and Rivasseau, [2–5]. The proof given by ’t Hooft is based on renormalization group methods, and it does not rely on the Nevanlinna–Sokal theorem; we have not been able to fully reproduce ’t Hooft's argument in our rigorous framework. The proof given by Rivasseau, instead, consists in a check of the two hypotheses of the Nevanlinna–Sokal theorem. However, his methods are quite different from the ones that we use, since in his approach the beta function was not introduced. Moreover, in his work a particular choice of the wave function renormalization and of the renormalized mass was made.

One of the motivations of our work is that very few proofs of Borel summability of interacting field theories based on renormalization group methods are present in the literature, [8, 9]. Moreover, our framework has already proved effective in the analysis of various models of condensed matter and field theory. Therefore, we consider our work as a first step towards the analysis of more interesting models. For instance, we think that the ideas of this paper can be applied to the one-dimensional Hubbard model, which has been rigorously constructed through renormalization group methods in [15], but where a proof of Borel summability has not been given yet. In fact, due to the anticommutativity of the fermionic fields the factorial growth of the Feynman graphs can be controlled using the so-called Gram bounds. Moreover, one sector of the theory is asymptotically free, while to control the flow of the other running coupling constants one has to exploit the vanishing of the beta function.
Regarding our work, the first part of this paper consists essentially in a rigorous study of the beta function of an asymptotically free field theory. In particular, we have shown that the theory is analytic for values of the renormalized coupling constant $\lambda$ belonging to a “Watson domain”, see [18] and definition (2.40), and for values of the wave function renormalization and of the renormalized mass close to 1 in absolute value. In the second part of our work, to prove Borel summability we have shown that it is possible to “undo” the resummation that allowed us to write the Schwinger functions as a convergent power series in the running coupling constants, in such a way that the difference between the generic Schwinger function and its Taylor expansion to the order $n$ in $\lambda$ is bounded by $C^n n!\,|\lambda|^{n+1}$ for some positive $C$. Thanks to the Nevanlinna–Sokal theorem, see [14], this last fact along with the above-mentioned analyticity properties implies Borel summability.

Acknowledgments

It is a pleasure to thank Prof. G. Gallavotti for having introduced us to the theory of renormalization, for having proposed the problem and for many very useful discussions, from which all the ideas of this paper emerged. We are also grateful to Dr. A. Giuliani, for constant encouragement and constructive criticism.


Appendix A. Proof of Lemma 1

In this appendix, we present the proof of Lemma 1. Recall that $\bar r$ is the number of running coupling constants of type 4 appearing at a given order $r$ of the perturbative series defining the beta function, see (2.28); moreover, we define $\tilde r := r - \bar r$. We also recall that $y(x)$ denotes the fixed point of the map $\tilde T_{\xi_0,x}$ in $\tilde S_{2\eta}$. All the estimates that we shall derive here and in the next appendix are consequences of the fact that, as can be checked in a straightforward way, if $x, x' \in S_{\lambda_0,\varepsilon+\eta}$, $\lambda_0 \in W_{\varepsilon,\vartheta}$ and $\varepsilon$, $\eta$ are small enough there exists a constant $C_\vartheta > 0$ such that

$$|x_k| \le \frac{C_\vartheta}{|\lambda_0|^{-1} + k}, \qquad \Bigl|\frac{x_k}{x'_h}\Bigr| \le C_\vartheta \quad \text{if } k \ge h; \tag{A.1}$$

the constant $C_\vartheta$ grows as $\sim \vartheta^{-1}$ for $\vartheta \to 0$.
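The first bound in (A.1) and the growth $C_\vartheta \sim \vartheta^{-1}$ can be checked numerically in a simplified setting. The sketch below rests on assumptions that are not taken from this section: the domain is modelled as the sector $\{|\lambda_0| < \varepsilon,\ |\arg\lambda_0| < \pi - \vartheta\}$ (the actual definition (2.40) is given elsewhere in the paper), all $\beta_j$ are set to 1, and $t_k = 0$. Under these assumptions $|\lambda_0^{-1} + k| \ge (|\lambda_0|^{-1} + k)\sin(\vartheta/2)$, so one may take $C_\vartheta = 1/\sin(\vartheta/2)$.

```python
import cmath
import math


def x_sequence(lambda0, n):
    """x_k = 1 / (1/lambda0 + k) for k = 1..n (toy choice: beta_j = 1, t_k = 0)."""
    return [1.0 / (1.0 / lambda0 + k) for k in range(1, n + 1)]


def worst_ratio(theta, radius=0.05, n=200, n_angles=41):
    """Largest sampled value of |x_k| * (1/|lambda0| + k) over the assumed sector."""
    worst = 0.0
    for a in range(n_angles):
        # Arguments sampled uniformly in [-(pi - theta), pi - theta] (assumed sector).
        phi = (math.pi - theta) * (2.0 * a / (n_angles - 1) - 1.0)
        lambda0 = cmath.rect(radius, phi)
        for k, xk in enumerate(x_sequence(lambda0, n), start=1):
            worst = max(worst, abs(xk) * (1.0 / radius + k))
    return worst
```

For example, `worst_ratio(0.1)` is close to $1/\sin(0.05) \approx 20$, while `worst_ratio(1.0)` is close to $1/\sin(0.5) \approx 2.1$: the constant degenerates as the sector opens up towards the negative real axis.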

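Proof of Lemma 1(1). First, we have to prove that if $(\lambda_0,\alpha_0,\mu_0) \in W_{\varepsilon,\vartheta}\times B_\eta\times B_\eta$ and $x \in S_{\lambda_0,\varepsilon+\eta}$ the map $\tilde T_{\xi_0,x}$ leaves invariant $\tilde S_{2\eta}$, for $\vartheta \in (0, \pi/2)$, and $\varepsilon$, $\eta$ small enough; in fact, setting $a = (a_1, \dots, a_r)$, $h = (h_1, \dots, h_r)$:

$$\begin{aligned}
\bigl|(\tilde T_{\xi_0,x}\,y)_{k,2'}\bigr| &\le |\alpha_0| + \sum_{j=1}^{k}\sum_{r\ge 2}\ \sum_{\substack{h_i \ge j\\ i=1,\dots,r}}\ \sum_{\{a_i\}_{i=1}^{r}} \bigl|\beta^{(2')}_{a}(j;h)\bigr| \prod_{\substack{i=1\\ a_i = 4}}^{r}|x_{h_i}| \prod_{\substack{i=1\\ a_i \ne 4}}^{r}|y_{h_i,a_i}|\\
&\le |\alpha_0| + \sum_{j=1}^{k}\sum_{r\ge 2}\ \sum_{\substack{h_i \ge j\\ i=1,\dots,r}}\ \sum_{\{a_i\}_{i=1}^{r}} \bigl|\beta^{(2')}_{a}(j;h)\bigr|\,|x_j|^2\,C_\vartheta^{\bar r}\,\varepsilon^{\bar r - 2}(2\eta)^{\tilde r}\\
&\le |\alpha_0| + C_\vartheta\,\varepsilon
\end{aligned} \tag{A.2}$$

for some $C_\vartheta > 0$. Similarly,

$$\bigl|(\tilde T_{\xi_0,x}\,y)_{k,2}\bigr| \le |\mu_0| + \sum_{j=1}^{k}\gamma^{2(j-k)}\sum_{r\ge 2}\ \sum_{\substack{h_i \ge j\\ i=1,\dots,r}}\ \sum_{\{a_i\}_{i=1}^{r}} \bigl|\beta^{(2)}_{a}(j;h)\bigr|\,C_\vartheta^{\bar r}\,\varepsilon^{\bar r}(2\eta)^{\tilde r} \le |\mu_0| + C_\vartheta(\varepsilon + \eta)^2, \tag{A.3}$$

for $C_\vartheta$ large enough. Hence, if $(\alpha_0, \mu_0) \in B_\eta\times B_\eta$, then both (A.2), (A.3) can be made smaller than $2\eta$ taking $\varepsilon$ small enough.

Proof of Lemma 1(2). Under the same assumption of Lemma 1(1), we show now that $\tilde T_{\xi_0,x}$ is a contraction in $\tilde S_{2\eta}$; in fact,

$$\bigl|(\tilde T_{\xi_0,x}\,y)_{k,2'} - (\tilde T_{\xi_0,x}\,y')_{k,2'}\bigr| \le \sum_{j=1}^{k}\sum_{r\ge 3}\ \sum_{\substack{h_i \ge j\\ i=1,\dots,r}}\ \sum_{\{a_i\}_{i=1}^{r}} \bigl|\beta^{(2')}_{a}(j;h)\bigr| \prod_{\substack{i=1\\ a_i = 4}}^{r}|x_{h_i}| \cdot (6\eta)^{\tilde r - 1}\,\tilde r\,\max_{k,i}\bigl|y_{k,i} - y'_{k,i}\bigr|, \tag{A.4}$$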


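where we used that the second order of the beta function depends only on $x_j$; therefore, we can exploit two of the $x_{h_i}$'s to perform the sum, and it follows that, for $\varepsilon$, $\eta$ small enough:

$$\bigl|(\tilde T_{\xi_0,x}\,y)_{k,2'} - (\tilde T_{\xi_0,x}\,y')_{k,2'}\bigr| \le \epsilon \max_{k,i}\bigl|y_{k,i} - y'_{k,i}\bigr|, \qquad 0 < \epsilon < 1. \tag{A.5}$$

The same result can be proved for the difference of the 2-components, using the $\gamma^{2(j-k)}$ factor to perform the sum over the $j$'s; this concludes the proof of the contractivity of $\tilde T_{\xi_0,x}$.

Proof of Lemma 1(3). We prove here the last item of Lemma 1. Given $y \in \tilde S_{2\eta}$, set

$$y_{k,i,n} := \bigl(\tilde T^{\,n}_{\xi_0,x}\,y\bigr)_{k,i}, \qquad y'_{k,i,n} := \bigl(\tilde T^{\,n}_{\xi_0,x'}\,y\bigr)_{k,i}, \tag{A.6}$$

and assume inductively that for all $0 \le m \le n$ the following bound is true:

$$|y_{k,i,m} - y'_{k,i,m}| \le C\bigl(\log(1 + \varepsilon k) + 1\bigr)\max_k |x_k - x'_k|; \tag{A.7}$$

therefore, from (A.7) it follows that:

$$\begin{aligned}
|y_{k,2',n+1} - y'_{k,2',n+1}| \le{}& \sum_{j=1}^{k}\sum_{r\ge 2}\ \sum_{\substack{h_i \ge j\\ i=1,\dots,r}}\ \sum_{\{a_i\}_{i=1}^{r}} \bigl|\beta^{(2')}_{a}(j;h)\bigr| \Bigl[ C_\vartheta^{\bar r - 1}\,|x_j|\,\bar r\,(3\varepsilon)^{\bar r - 2}(6\eta)^{\tilde r}\\
&+ \sum_{\substack{\ell=1\\ a_\ell \ne 4}}^{r} C\,C_\vartheta^{\bar r}\bigl(\log(1 + \varepsilon h_\ell) + 1\bigr)|x_j|^2 (3\varepsilon)^{\bar r - 2}(6\eta)^{\tilde r - 1} \Bigr] \max_k |x_k - x'_k|,
\end{aligned} \tag{A.8}$$

and

$$\begin{aligned}
|y_{k,2,n+1} - y'_{k,2,n+1}| \le{}& \sum_{j=1}^{k}\gamma^{2(j-k)}\sum_{r\ge 2}\ \sum_{\substack{h_i \ge j\\ i=1,\dots,r}}\ \sum_{\{a_i\}_{i=1}^{r}} \bigl|\beta^{(2)}_{a}(j;h)\bigr| \Bigl[ C_\vartheta^{\bar r - 1}\,\bar r\,(3\varepsilon)^{\bar r - 1}(6\eta)^{\tilde r}\\
&+ \sum_{\substack{\ell=1\\ a_\ell \ne 4}}^{r} C\,C_\vartheta^{\bar r}\bigl(\log(1 + \varepsilon h_\ell) + 1\bigr)(3\varepsilon)^{\bar r}(6\eta)^{\tilde r - 1} \Bigr] \max_k |x_k - x'_k|.
\end{aligned} \tag{A.9}$$

Using the short memory property of the GN trees, see discussion after (2.20), it follows that:

$$\sum_{\substack{h_1,\dots,h_r\\ h_i \ge j}} \bigl|\beta^{(i)}_{a}(j;h)\bigr| \log(1 + \varepsilon h_\ell) \le (\mathrm{const.})^r \log(1 + \varepsilon j); \tag{A.10}$$

plugging this bound into (A.8), (A.9) we can reproduce our inductive assumption (A.7) for $m = n+1$, choosing $\varepsilon$, $\eta$ small enough. This concludes the proof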


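of (4.3). The “Hölder continuity bound” (4.4) can be proved again by induction, replacing (A.7) with

$$|y_{k,i,m} - y'_{k,i,m}| \le C_\rho \Bigl(\max_k |x_k - x'_k|\Bigr)^{\rho}, \qquad 0 < \rho < 1, \tag{A.11}$$

and using in (A.8) the bound, if $h_i \ge j$ for all $i = 1, \dots, r$ and $0 < \rho < 1$:

$$\Bigl|\prod_{\substack{i=1\\ a_i = 4}}^{r} x_{h_i} - \prod_{\substack{i=1\\ a_i = 4}}^{r} x'_{h_i}\Bigr| \le 2\bar r\,\Bigl(\max_k |x_k - x'_k|\Bigr)^{\rho}\,|x_j|^{2-\rho}\,C_\vartheta^{\bar r}(3\varepsilon)^{\bar r - 2}. \tag{A.12}$$

Appendix B. Proof of Lemma 2

In this appendix we present a proof of Lemma 2.

Proof of Lemma 2(1). First, we have to prove that $T_{\lambda_0,y(\cdot)}$ leaves invariant $S_{\lambda_0,\varepsilon+\eta}$ for $(\lambda_0,\alpha_0,\mu_0) \in W_{\varepsilon,\vartheta}\times B_\eta\times B_\eta$ for $\varepsilon$, $\eta$ small enough. We have that

$$(T_{\lambda_0,y(x)}\,x)_k = \frac{1}{\lambda_0^{-1} + \sum_{j=1}^{k}\beta_j - \sum_{j=1}^{k} R_j(x, y(x))}, \tag{B.1}$$

where $y(x) \in \tilde S_{2\eta}$ and

$$R_j(x, y(x)) = \frac{x_j\beta_j^2 + x_j^{-1}\beta_j\,\bar\beta_{4,j}(x, y(x)) - x_j^{-2}\,\bar\beta_{4,j}(x, y(x))}{1 + \beta_j x_j + x_j^{-1}\,\bar\beta_{4,j}(x, y(x))}, \tag{B.2}$$

with

$$\bar\beta_{4,j}(x, y(x)) = \sum_{r\ge 3}\ \sum_{\substack{h_i \ge j\\ i=1,\dots,r}}\ \sum_{\{a_i\}_{i=1}^{r}} \beta^{(4)}_{a}(j;h) \prod_{\substack{i=1\\ a_i = 4}}^{r} x_{h_i} \prod_{\substack{i=1\\ a_i \ne 4}}^{r} y_{h_i,a_i}(x), \tag{B.3}$$

where $\beta^{(4)}_{a_1,\dots,a_r}(j; h_1, \dots, h_r) = 0$ unless there are at least two $a_i$ equal to 4. The final statement follows from the fact that for $\varepsilon$, $\eta$ small enough

$$|R_j(x, y(x))| \le (\mathrm{const.})(\varepsilon + \eta). \tag{B.4}$$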
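The algebra behind (B.1)–(B.2) can be checked with a few lines of code. The check below assumes a one-step recursion of the form $x_{j-1} = x_j + \beta_j x_j^2 + \bar\beta_{4,j}$; this form is an assumption made here for illustration (the flow equations themselves are stated in Sec. 3, outside this appendix). Under it, the identity $x_j^{-1} = x_{j-1}^{-1} + \beta_j - R_j$ holds exactly, and summing over $j$ gives the denominator of (B.1). The numerical values are arbitrary.

```python
def remainder_R(x, beta, beta_bar):
    """R_j as in (B.2) at a single scale: x = x_j, beta = beta_j, beta_bar = bar(beta)_{4,j}."""
    num = x * beta ** 2 + beta * beta_bar / x - beta_bar / x ** 2
    den = 1.0 + beta * x + beta_bar / x
    return num / den


def one_step(x, beta, beta_bar):
    """Assumed one-step flow (hypothetical form): x_{j-1} = x_j + beta_j x_j^2 + bar(beta)_{4,j}."""
    return x + beta * x ** 2 + beta_bar


# Arbitrary complex sample values (assumptions for the check only).
x_j = complex(0.02, 0.01)
beta_j = 1.5
beta_bar_j = complex(1e-5, -2e-5)

lhs = 1.0 / x_j - 1.0 / one_step(x_j, beta_j, beta_bar_j)
rhs = beta_j - remainder_R(x_j, beta_j, beta_bar_j)
```

Here `lhs` and `rhs` agree up to rounding, which is the single-scale version of the telescoping sum in (B.1).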

Proof of Lemma 2(2). To conclude, we have to show that under the same assumptions of the previous item, Tλ0 ,y(x) is a contraction in Sλ0 ,ε+η . Setting y(x) =: y,


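$y(x') =: y'$, from (B.1) we have that

$$(T_{\lambda_0,y(x)}\,x)_k - (T_{\lambda_0,y(x')}\,x')_k = \frac{\sum_{j=1}^{k}\bigl(R_j(x, y) - R_j(x', y')\bigr)}{\Bigl[\lambda_0^{-1} + \sum_{j=1}^{k}\beta_j - \sum_{j=1}^{k} R_j(x, y)\Bigr]\Bigl[\lambda_0^{-1} + \sum_{j=1}^{k}\beta_j - \sum_{j=1}^{k} R_j(x', y')\Bigr]}, \tag{B.5}$$

where $R_j$ is given by (B.2); therefore, to bound the difference of $R_j$'s calculated at different $x$ we have to estimate (the other terms can be worked out in a similar way)

$$\bigl|x_j^{-2}\,\bar\beta_{4,j}(x, y) - x_j'^{-2}\,\bar\beta_{4,j}(x', y')\bigr| \le \sum_{r\ge 3}\ \sum_{\{h_i\}\ge j}\ \sum_{\{a_i\}} \bigl|\beta^{(4)}_{a}(j;h)\bigr| \cdot \Bigl| x_j^{-2}\prod_{\substack{i=1\\ a_i = 4}}^{r} x_{h_i}\prod_{\substack{i=1\\ a_i \ne 4}}^{r} y_{h_i,a_i} - x_j'^{-2}\prod_{\substack{i=1\\ a_i = 4}}^{r} x'_{h_i}\prod_{\substack{i=1\\ a_i \ne 4}}^{r} y'_{h_i,a_i}\Bigr|; \tag{B.6}$$

we have that

$$\Bigl| x_j^{-2}\prod_{\substack{i=1\\ a_i = 4}}^{r} x_{h_i}\prod_{\substack{i=1\\ a_i \ne 4}}^{r} y_{h_i,a_i} - x_j'^{-2}\prod_{\substack{i=1\\ a_i = 4}}^{r} x'_{h_i}\prod_{\substack{i=1\\ a_i \ne 4}}^{r} y'_{h_i,a_i}\Bigr| \le |x_j|^{-2}\,\Bigl|\prod_{\substack{i=1\\ a_i = 4}}^{r} x_{h_i}\prod_{\substack{i=1\\ a_i \ne 4}}^{r} y_{h_i,a_i} - \prod_{\substack{i=1\\ a_i = 4}}^{r} x'_{h_i}\prod_{\substack{i=1\\ a_i \ne 4}}^{r} y'_{h_i,a_i}\Bigr| \tag{B.7}$$

$$\qquad + \max_k |x_k - x'_k|\ \frac{|x_j + x'_j|}{|x_j|^2\,|x'_j|^2}\ \Bigl|\prod_{\substack{i=1\\ a_i = 4}}^{r} x'_{h_i}\prod_{\substack{i=1\\ a_i \ne 4}}^{r} y'_{h_i,a_i}\Bigr| \tag{B.8}$$

and:

$$\text{(B.7)} \le |x_j|^{-1}\,C_\vartheta^{\bar r - 1}\,r\,(3\varepsilon)^{\bar r - 2}(6\eta)^{\tilde r}\max_k |x_k - x'_k| + C_\vartheta^{\bar r}(3\varepsilon)^{\bar r - 2}(6\eta)^{\tilde r - 1}\sum_{\substack{\ell=1\\ a_\ell \ne 4}}^{r} |y_{h_\ell,a_\ell} - y'_{h_\ell,a_\ell}|, \tag{B.9}$$

$$\text{(B.8)} \le \max_k |x_k - x'_k|\,|x_j|^{-1}\,2\,C_\vartheta^{\bar r}(3\varepsilon)^{\bar r - 2}(6\eta)^{\tilde r}. \tag{B.10}$$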


Using (4.3) and the short memory property it follows that:

$$\sum_{\substack{h_1,\dots,h_r\\ h_i \ge j}} \bigl|\beta^{(4)}_{a}(j;h)\bigr|\,\bigl|y_{h_\ell,a_\ell} - y'_{h_\ell,a_\ell}\bigr| \le (\mathrm{const.})^r\,[\log(1 + \varepsilon j) + 1]\,\max_k |x_k - x'_k|; \tag{B.11}$$

therefore, since the other terms arising in the difference (B.5) can be treated exactly in the same way, from (B.9)–(B.11) we find that:

$$\Bigl|\sum_{j=1}^{k}\bigl(R_j(x', y') - R_j(x, y)\bigr)\Bigr| \le (\mathrm{const.})\Bigl[\sum_{j=1}^{k}|x_j|^{-1}(\varepsilon + \eta) + k\bigl(\log(1 + \varepsilon k) + 1\bigr)\Bigr] \max_k |x_k - x'_k|, \tag{B.12}$$

which gives statement (4.5) for $\varepsilon$, $\eta$ small enough. In fact, the denominator of (B.5) is bounded from below as

$$\Bigl|\lambda_0^{-1} + \sum_{j=1}^{k}\bigl(\beta_j - R_j(x, y)\bigr)\Bigr|\,\Bigl|\lambda_0^{-1} + \sum_{j=1}^{k}\bigl(\beta_j - R_j(x', y')\bigr)\Bigr| \ge (\mathrm{const.})\,|x_k|^{-2}; \tag{B.13}$$

using the second of (A.1) our claim (4.5) follows.

Appendix C. The Running Coupling Constants on Scale 0

In this appendix, we discuss how to express the running coupling constants on scale zero as functions of the renormalized ones. First, a straightforward computation shows that the second equation in (3.7) can be rewritten as:

$$\mu_0 = \frac{\mu}{1 + \mu} - \frac{1 - \mu_0}{1 + \mu}\,\beta_{2,0}(\lambda_0, \lambda, \xi_0, \xi); \tag{C.1}$$

this is a consequence of the fact that in (2.28) $\beta^{(2)}_{a_1,\dots,a_r}(0; 0, \dots, 0) = 1$ if $a_i = 2$ for all $i \in [1, r]$. Since the running coupling constants on scale $> 0$ are parametrized by the ones on scale 0, we can rewrite (C.1) as

$$\mu_0 =: \frac{\mu}{1 + \mu} + g_2(\lambda_0, \xi_0, \mu) =: \mu + \tilde f_2(\mu) + g_2(\lambda_0, \xi_0, \mu), \tag{C.2}$$

and plugging (C.2) in the first equation of (3.7) we get

$$\alpha_0 =: \alpha + \tilde f_{2'}(\mu) + g_{2'}(\lambda_0, \xi_0, \mu), \tag{C.3}$$

where: $\tilde f_i(\mu)$ are analytic functions of $\mu \in B_{\bar\eta}$, and $g_i(\lambda_0, \xi_0, \mu)$ are analytic for $(\lambda_0, \xi_0, \mu) \in W_{2\bar\varepsilon,\vartheta/2}\times B_{2\bar\eta}\times B_{2\bar\eta}\times B_{\bar\eta}$. Formulas (C.2), (C.3) can be regarded as a fixed point equation:

$$\xi_{i,0} = \bigl(\tilde M_{\xi,\lambda_0}\,\xi_0\bigr)_i; \tag{C.4}$$


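all we have to do is to check that: (i) for $|\lambda_0|$, $|\xi|$ small enough $\tilde M_{\xi,\lambda_0}$ leaves invariant the set $B_{\frac32\bar\eta}\times B_{\frac32\bar\eta}$, and (ii) $\tilde M_{\xi,\lambda_0}$ is a contraction therein. The property (i) is a straightforward consequence of the fact that

$$|\tilde f_i(\mu)| \le C|\mu|^2, \qquad |g_i(\lambda_0, \xi_0, \mu)| \le C|\lambda_0|\bigl(|\lambda_0| + |\xi|\bigr), \tag{C.5}$$

where in the second inequality we used that $|\lambda_k| \le c|\lambda_0|$, $|\xi_{i,k}| \le c\bigl(|\lambda_0| + |\xi_0|\bigr)$ and that, from (3.7), $|\xi_{i,0}| \le c\bigl(|\lambda_0| + |\xi|\bigr)$; if we choose $(\lambda_0, \xi) \in W_{2\bar\varepsilon,\vartheta/2}\times B_{\bar\eta}\times B_{\bar\eta}$ with $\bar\varepsilon$, $\bar\eta$ small enough then the set $B_{\frac32\bar\eta}\times B_{\frac32\bar\eta} \subset B_{2\bar\eta}\times B_{2\bar\eta}$ is left invariant by (C.4). To prove property (ii), we use a Cauchy estimate. In fact, the Cauchy bound tells us that if $y, y' \in B_{\frac32\bar\eta}\times B_{\frac32\bar\eta}$ then, since $g_i(\lambda_0, y, \mu)$ is analytic for $y \in B_{2\bar\eta}\times B_{2\bar\eta}$ and bounded as (C.5), for $(\lambda_0, \mu) \in W_{2\bar\varepsilon,\vartheta/2}\times B_{\bar\eta}$ with $\bar\varepsilon$, $\bar\eta$ small enough:

$$|g_i(\lambda_0, y, \mu) - g_i(\lambda_0, y', \mu)| \le 2C\,\frac{\bar\varepsilon}{\bar\eta}\,(\bar\varepsilon + \bar\eta)\max_i |y_i - y'_i| \le \epsilon \max_i |y_i - y'_i| \tag{C.6}$$

with $0 < \epsilon < 1$. Therefore, we can construct explicitly the solution $\xi_{i,0}(\lambda_0, \xi)$, and the above properties allow us to conclude that it is analytic for $(\lambda_0, \xi) \in W_{2\bar\varepsilon,\vartheta/2}\times B_{\bar\eta}\times B_{\bar\eta}$.

After this, we are left with Eq. (3.5) for $\lambda_0$; since all the couplings on scale $\ge 1$ are functions of $\lambda_0$, $\xi_0$ and, as we know from our previous analysis, $\xi_{i,0} = \xi_{i,0}(\lambda_0, \xi)$, we can rewrite (3.5) as:

$$\lambda_0 =: \lambda - \lambda_0\,\tilde f_4(\mu) - \beta_0\lambda_0^2 + h(\lambda_0, \xi), \qquad \tilde f_4(\mu) = O(\mu), \qquad |h(\lambda_0, \xi)| \le C|\lambda_0|^2\bigl(|\lambda_0| + |\xi|\bigr), \tag{C.7}$$

where we used that $\mu_0$ satisfies (C.4) with $i = 2$, and $h(\lambda_0, \xi)$ is analytic for $(\lambda_0, \xi) \in W_{2\bar\varepsilon,\vartheta/2}\times B_{\bar\eta}\times B_{\bar\eta}$. Therefore, we can rewrite (C.7) as:

$$\lambda_0 = M_{\tilde\lambda,\xi}\,\lambda_0, \qquad \tilde\lambda := \frac{\lambda}{1 + \tilde f_4(\mu)}, \qquad M_{\tilde\lambda,\xi}\,x := \tilde\lambda + \frac{1}{1 + \tilde f_4(\mu)}\bigl(-\beta_0 x^2 + h(x, \xi)\bigr). \tag{C.8}$$

All we have to do is to check that: (i) if $(\lambda, \xi) \in W_{\bar\varepsilon,\vartheta}\times B_{\bar\eta}\times B_{\bar\eta}$ then $M_{\tilde\lambda,\xi}$ leaves invariant the set $W_{\frac32\bar\varepsilon,\frac23\vartheta} \subset W_{2\bar\varepsilon,\vartheta/2}$, and (ii) $M_{\tilde\lambda,\xi}$ is a contraction therein. Let us prove property (i); for $\bar\varepsilon$, $\bar\eta$ small enough, it is easy to see that if $\lambda \in W_{\bar\varepsilon,\vartheta}$ then $\tilde\lambda \in W_{\frac43\bar\varepsilon,\frac34\vartheta}$ and $x \in W_{\frac32\bar\varepsilon,\frac23\vartheta} \Rightarrow M_{\tilde\lambda,\xi}\,x \in W_{\frac32\bar\varepsilon,\frac23\vartheta}$.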
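A stripped-down version of the fixed point problem (C.8) can be solved numerically. The sketch below makes two simplifying assumptions that are not in the paper: $\tilde f_4 = 0$ and $h = 0$, so that the map reduces to $x \mapsto \lambda - \beta_0 x^2$ and the fixed point is a root of $\beta_0 x^2 + x - \lambda = 0$. The value `beta0 = 1.0` and the sample $\lambda$ are arbitrary.

```python
import cmath


def solve_lambda0(lam, beta0, tol=1e-14, max_iter=1000):
    """Iterate x -> lam - beta0 * x**2 (simplified map: f4 = 0, h = 0), starting from x = lam."""
    x = lam
    for _ in range(max_iter):
        x_new = lam - beta0 * x * x
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("no convergence: |lam| is probably too large for a contraction")


def exact_lambda0(lam, beta0):
    """Root of beta0*x**2 + x - lam = 0 that tends to lam as lam -> 0."""
    return (-1.0 + cmath.sqrt(1.0 + 4.0 * beta0 * lam)) / (2.0 * beta0)


# Arbitrary sample: small complex coupling with a large argument (assumption).
lam = cmath.rect(0.05, 2.5)
beta0 = 1.0
lambda0 = solve_lambda0(lam, beta0)
```

In this simplified setting the map is a contraction as long as $2\beta_0|x| < 1$ on the region visited by the iteration, and the solution satisfies $\lambda_0^{-1} = \lambda^{-1} + \beta_0 + O(\lambda)$, the toy counterpart of the relation between $\lambda_0^{-1}$ and $\lambda^{-1}$ quoted in Sec. 4.1.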

October 12, J070-S0129055X10004120

2010 10:1 WSPC/S0129-055X

148-RMP

Borel Summability of ϕ44 Planar Theory via Multiscale Analysis

1029

We now turn to property (ii). From the analyticity of $h(x,\xi)$ in $x\in W_{2\bar\varepsilon,\frac{\vartheta}{2}}$, using that the distance from a point $x\in W_{\frac32\bar\varepsilon,\frac23\vartheta}$ to the boundary of $W_{2\bar\varepsilon,\frac{\vartheta}{2}}$ is bounded from below by $\frac{|x|}{3}\sin\frac{\vartheta}{6}$, if $x,x'\in W_{\frac32\bar\varepsilon,\frac23\vartheta}$ a Cauchy estimate tells us that:

$$|M_{\tilde\lambda,\xi}\,x-M_{\tilde\lambda,\xi}\,x'| \le 8\left(\beta_0\bar\varepsilon+\frac{3C\bar\varepsilon(\bar\varepsilon+\bar\eta)}{\sin(\vartheta/6)}\right)|x-x'| \le \kappa\,|x-x'| \tag{C.9}$$

with $\kappa<1$; the first inequality follows from the bound on $h$ in (C.7), while the second holds taking $\bar\varepsilon$ small enough (remember that $\vartheta\in(0,\frac{\pi}{2}]$). In conclusion, we can explicitly construct the solution of (C.8), and by a simple inductive argument it follows that it is analytic for $(\lambda,\alpha,\mu)\in W_{\bar\varepsilon,\vartheta}\times B_{\bar\eta}\times B_{\bar\eta}$.

Appendix D. Dependence of the Running Coupling Constants on the Ultraviolet Cutoff

In this appendix we show that the running coupling constants are weakly dependent on the location of the ultraviolet cutoff; in particular, denoting with a superscript $N$ the quantities corresponding to a theory with cutoff on scale $N$, if $(\lambda,\alpha,\mu)\in W_{\varepsilon,\vartheta}\times B_\eta\times B_\eta$ with $\varepsilon,\eta$ small enough we show that there exist two positive constants $C,\rho$ such that for any $k\le N$ and $N<N'$ the following bounds hold:

$$\begin{aligned}
\bigl|\alpha_k^{N}-\alpha_k^{N'}\bigr| &\le \frac{C}{\varepsilon^{-1}+N}+C\eta\gamma^{-\rho N},\\
\bigl|\mu_k^{N}-\mu_k^{N'}\bigr| &\le \frac{C}{\varepsilon^{-1}+N}+C\eta\gamma^{-\rho N},\\
\bigl|\lambda_k^{N}-\lambda_k^{N'}\bigr| &\le \frac{C}{\varepsilon^{-1}+N}.
\end{aligned} \tag{D.1}$$

In the proof, we shall use in a crucial way the short memory property of the GN trees, see discussion after (2.20). Consider first the difference of $\alpha_k^{N}$, $\alpha_k^{N'}$. Denoting by a prime the running coupling constants corresponding to a theory with cutoff $N'$ and neglecting the $N$ label in the others we have that

$$|\alpha_k-\alpha_k'| \le \sum_{j=1}^{k}\bigl|\beta^{N}_{2',j}(\lambda,\alpha,\mu)-\beta^{N'}_{2',j}(\lambda',\alpha',\mu')\bigr|+|\alpha_0-\alpha_0'|, \tag{D.2}$$

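where $\beta^{N}_{2',j}$ is the beta function of the theory with an ultraviolet cutoff on scale $N$. Let $\|a\| := \max_{k\in[0,N]}|a_k|$; using property (2.31) and the bounds in (A.1) it follows that, for some $C_1>0$, $\rho>0$ (neglecting for simplicity the arguments of the beta function):

$$\bigl|\beta^{N}_{2',j}-\beta^{N'}_{2',j}\bigr| \le \frac{C_1}{(\varepsilon^{-1}+j)^2}\bigl(\|\alpha-\alpha'\|+\|\mu-\mu'\|\bigr)+\frac{C_1}{\varepsilon^{-1}+j}\sum_{h\ge j}|\lambda_h-\lambda_h'|\,\gamma^{\rho(j-h)}+C_1(\varepsilon+\eta)\,\frac{\gamma^{\rho(j-N)}}{\varepsilon^{-1}+j}, \tag{D.3}$$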

October 12, J070-S0129055X10004120

1030

2010 10:1 WSPC/S0129-055X

148-RMP

M. Porta & S. Simonella

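$$|\alpha_0-\alpha_0'| \le C_1(\varepsilon+\eta)\bigl(\|\alpha-\alpha'\|+\|\mu-\mu'\|+\|\lambda-\lambda'\|\bigr)+C_1(\varepsilon+\eta)^2\gamma^{-\rho N}, \tag{D.4}$$

where the last terms in (D.3) and (D.4) take into account the contribution of GN trees with at least one endpoint on scale $>N$, and all the others bound the differences of trees with all endpoints on scale $<N$. Therefore, plugging (D.3) and (D.4) in (D.2) we have that, for some $\tilde C_1>0$:

$$\|\alpha-\alpha'\| \le \tilde C_1(\varepsilon+\eta)\bigl(\|\mu-\mu'\|+\|\lambda-\lambda'\|\bigr)+\sum_{j=1}^{N}\frac{\tilde C_1}{\varepsilon^{-1}+j}\sum_{h\ge j}|\lambda_h-\lambda_h'|\,\gamma^{\rho(j-h)}+\frac{\tilde C_1(\varepsilon+\eta)}{\varepsilon^{-1}+N}+\tilde C_1(\varepsilon+\eta)^2\gamma^{-\rho N}. \tag{D.5}$$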
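One elementary step behind (D.5) is that summing the last term of (D.3) over $j$ produces a quantity of order $(\varepsilon^{-1}+N)^{-1}$. The sketch below checks this numerically for sample parameters. The values $\gamma=2$, $\rho=1$ and the grid of $N$, $\varepsilon$ are assumptions made for illustration only, and the comparison constant $(1-\gamma^{-\rho})^{-2}$ comes from the elementary estimate $(\varepsilon^{-1}+N)/(\varepsilon^{-1}+j)\le 1+(N-j)$, not from the paper.

```python
def weighted_tail_sum(N, eps, gamma, rho):
    """Compute sum_{j=1}^{N} gamma**(rho*(j-N)) / (1/eps + j)."""
    return sum(gamma ** (rho * (j - N)) / (1.0 / eps + j) for j in range(1, N + 1))

def ratio(N, eps, gamma=2.0, rho=1.0):
    """The sum rescaled by (1/eps + N); bounded iff the sum is O(1/(1/eps + N))."""
    return weighted_tail_sum(N, eps, gamma, rho) * (1.0 / eps + N)

# Assumed sample parameters (illustration only).
gamma, rho = 2.0, 1.0
q = gamma ** (-rho)
bound = 1.0 / (1.0 - q) ** 2      # sum_{m>=0} (1 + m) q^m

ratios = [ratio(N, eps, gamma, rho)
          for N in (1, 5, 20, 100)
          for eps in (1.0, 0.1, 0.01)]
largest = max(ratios)
```

The rescaled sums stay below the fixed constant `bound` for every sampled $N$ and $\varepsilon$, which is the behaviour used when passing from (D.3) to (D.5).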

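By what has been discussed in Secs. 3 and 4 and in Appendices A and B, it follows that $|\lambda_k-\lambda_k'|$ for $k\ge 1$ can be estimated in the following way, for some $C_2>0$:

$$|\lambda_k-\lambda_k'| \le \frac{C_2\bigl(\|\lambda-\lambda'\|+\|\alpha-\alpha'\|+\|\mu-\mu'\|\bigr)}{\varepsilon^{-1}+k}+\frac{C_2(\varepsilon+\eta)\gamma^{\rho(k-N)}}{(\varepsilon^{-1}+k)^2}, \tag{D.6}$$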

where the first term takes into account the difference of running coupling constants on scale $\le N$, while the last term takes into account trees with root scale $\le k$ having at least one endpoint on scale $>N$. Plugging (D.6) in (D.5) it is straightforward to see that, for some $C_3>0$,

$$\|\alpha-\alpha'\| \le C_3(\varepsilon+\eta)\bigl(\|\lambda-\lambda'\|+\|\mu-\mu'\|\bigr)+\frac{C_3(\varepsilon+\eta)}{\varepsilon^{-1}+N}+C_3(\varepsilon+\eta)^2\gamma^{-\rho N}, \tag{D.7}$$

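which if inserted in (D.6) implies, for some positive $C_4$, $C_5$:

$$\begin{aligned}
\|\lambda-\lambda'\| &\le \varepsilon C_4\|\mu-\mu'\|+C_4\,\varepsilon(\varepsilon+\eta)^2\gamma^{-\rho N}+\frac{C_4\,\varepsilon(\varepsilon+\eta)}{\varepsilon^{-1}+N},\\
\|\alpha-\alpha'\| &\le C_5(\varepsilon+\eta)\|\mu-\mu'\|+\frac{C_5(\varepsilon+\eta)}{\varepsilon^{-1}+N}+C_5(\varepsilon+\eta)^2\gamma^{-\rho N}.
\end{aligned} \tag{D.8}$$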

The difference $\mu_k-\mu_k'$ can be bounded in a way analogous to $\alpha_k-\alpha_k'$, and using (D.8) it follows that

$$\|\mu-\mu'\| \le \frac{C_6}{\varepsilon^{-1}+N}+C_6\,\eta\,\gamma^{-\rho N}, \qquad C_6>0, \tag{D.9}$$

which together with (D.8) proves (D.1).

Appendix E. An Improvement of the n! Bounds in the Planar Theory

In this appendix, we discuss an improvement, valid in the planar case, of the n! bounds proved in [6, Sec. XIX], see formulas (19.5) and (20.2). Here we shall follow

the notations of that work: we recall that the “form factor” $r^{(a)}(\sigma;k)$ of [6] corresponds to the contribution of the tree $\sigma$ with thin endpoints to the formal expansion of $v_k^{(a)}\gamma^{(2\delta_{a,2}+4\delta_{a,0})k}$ in $\lambda,\alpha,\mu$, which is obtained by iteration of the equation graphically represented in Fig. 5. We claim that [6, Eq. (20.2)] is still valid if $f$ is replaced by an $\bar f$ denoting just the number of nontrivial frames (see remark after Fig. 5 for the definition of trivial frame) labeled by $a=2',4$. To prove the claim, observe that one can repeat the proof of Sec. XIX in [6] with the new inductive assumption

$$|r^{(a)}(\sigma;k)| \le \bar\varepsilon^{\,n}\tilde D^{\,n-1}\bar f!\,\sum_{j=0}^{\bar f}\frac{(bk)^j}{j!}\,\gamma^{(2\delta_{a,2}+4\delta_{a,0})k} \tag{E.1}$$

instead of [6, Eq. (19.5)], the only difference being that the number of topological Feynman graphs with $m$ vertices is bounded proportionally to $N_0^m$ where $N_0$ is a suitable constant, because of the restriction to the planar theory. Then if $\tilde f$ is the number of nontrivial $(2',4)$-frames of $\sigma$ excluding the external one, equation [6, Eq. (19.13)] is replaced by, depending on whether the frame enclosing $\sigma$ is trivial or not:

$$\begin{aligned}
|r^{(a)}(\sigma;k)| &\le D_7N_0^mD_4^m\,\bar\varepsilon^{\,n}\tilde D^{\,n-m}D_6^m\,\tilde f!\times\sum_{h=0}^{k}\sum_{r=0}^{\tilde f}\gamma^{(2\delta_{a,2}+4\delta_{a,0})h}\,\frac{(bh)^r}{r!}, && \text{(nontrivial frame)},\\
|r^{(a)}(\sigma;k)| &\le D_7N_0^mD_4^m\,\bar\varepsilon^{\,n}\tilde D^{\,n-m}D_6^m\,\tilde f!, && \text{(trivial frame)};
\end{aligned} \tag{E.2}$$

with respect to [6], we have kept the factor $\gamma^{(2\delta_{a,2}+4\delta_{a,0})h}$ inside the sum, instead of estimating it replacing $h$ with $k$. If the frame enclosing $\sigma$ is trivial the claim follows from the second of (E.2), taking $\tilde D$ large enough (as in [6], here $m\ge 2$). If the frame is nontrivial and $a=2',4$, proceed as in [6, Eq. (19.15)], while if $a=2$ substitute that bound with

$$\sum_{h=0}^{k}\sum_{r=0}^{\tilde f}\gamma^{2h}\,\frac{(bh)^r}{r!} \le \frac{\gamma^{2k}}{1-\gamma^{-2}}\sum_{r=0}^{\bar f}\frac{(bk)^r}{r!}, \tag{E.3}$$

and do the same for $a=0$ ($\gamma^{2k}$ will be replaced by $\gamma^{4k}$). From this the claim follows choosing $\tilde D$ sufficiently large, as explained in [6].

References

[1] L. D. Landau, Collected Papers of L. D. Landau (Gordon and Breach, 1965).
[2] G. ’t Hooft, Borel summability of a four-dimensional field theory, Phys. Lett. B 119 (1982) 369–371.
[3] G. ’t Hooft, Rigorous construction of planar diagram field theories in four dimensional euclidean space, Comm. Math. Phys. 88 (1983) 1–25.
[4] V. Rivasseau, Construction and Borel summability of planar 4-dimensional Euclidean field theory, Comm. Math. Phys. 95 (1984) 445–486.
[5] V. Rivasseau, Rigorous construction and Borel summability for a planar four-dimensional field theory, Phys. Lett. B 137 (1983) 98–102.
[6] G. Gallavotti, Renormalization theory and ultraviolet stability for scalar fields via renormalization group methods, Rev. Mod. Phys. 57 (1985) 471–562.
[7] G. Gallavotti and V. Rivasseau, $\varphi^4$-Field theory in dimension four: A modern introduction to its open problems, Ann. Inst. H. Poincaré 40 (1984) 185–220.
[8] F. Feldman, J. Magnen, V. Rivasseau and R. Sénéor, Construction and Borel summability of infrared $\Phi^4_4$ by a phase space expansion, Comm. Math. Phys. 109 (1987) 437–480.
[9] F. Feldman, J. Magnen, V. Rivasseau and R. Sénéor, A renormalizable field theory: The massive Gross–Neveu model in two-dimensions, Comm. Math. Phys. 103 (1986) 67–103.
[10] J. Koplik, A. Neveu and S. Nussinov, Some aspects of the planar perturbation series, Nucl. Phys. B 123 (1977) 109–131.
[11] E. Brézin, C. Itzykson, G. Parisi and J. B. Zuber, Planar diagrams, Comm. Math. Phys. 59 (1978) 35–51.
[12] G. Gallavotti and F. Nicolò, Renormalization theory for four-dimensional scalar fields, I, Comm. Math. Phys. 100 (1985) 545–590.
[13] G. Gallavotti and F. Nicolò, Renormalization theory for four-dimensional scalar fields, II, Comm. Math. Phys. 101 (1985) 247–282.
[14] A. Sokal, An improvement of Watson’s theorem on Borel summability, J. Math. Phys. 21 (1980) 261–263.
[15] V. Mastropietro, Rigorous proof of Luttinger liquid behaviour in the 1d Hubbard model, J. Stat. Phys. 121 (2005) 373–432.
[16] G. H. Hardy, Divergent Series (Oxford University Press, 1949).
[17] V. Rivasseau, Constructive field theory in zero dimension, Adv. Math. Phys. 2009 (2009) article ID 180159, 12 pp.
[18] G. N. Watson, A theory of asymptotic series, Philos. Trans. R. Soc. Lond. Ser. A 211 (1912) 279–313.
[19] G. Benfatto and G. Gallavotti, Perturbation theory of the Fermi surface in a quantum liquid. A general quasi-particle formalism and one-dimensional systems, Comm. Math. Phys. 258 (2005) 609–655.
[20] G. Gentile and V. Mastropietro, Renormalization group for one-dimensional fermions. A review on mathematical results, Phys. Rep. 352 (2001) 273–437.
[21] G. Benfatto and V. Mastropietro, Ward identities and chiral anomaly in the Luttinger liquid, Comm. Math. Phys. 258 (2005) 609–655.
[22] C. De Calan and V. Rivasseau, Local existence of the Borel transform in Euclidean $\varphi^4_4$, Comm. Math. Phys. 82 (1982) 69–100.
[23] K. Hepp, Proof of the Bogoliubov–Parasiuk theorem on renormalization, Comm. Math. Phys. 2 (1966) 301–326.
[24] W. Zimmermann, Convergence of Bogoliubov’s method of renormalization in momentum space, Comm. Math. Phys. 15 (1969) 208–234.
[25] J. Polchinski, Renormalization and effective Lagrangians, Nucl. Phys. B 231 (1984) 269–295.

October 12, J070-S0129055X10004156

2010 10:3 WSPC/S0129-055X

148-RMP

Reviews in Mathematical Physics Vol. 22, No. 9 (2010) 1033–1059 © World Scientific Publishing Company DOI: 10.1142/S0129055X10004156

PARALLEL TRANSPORT OVER PATH SPACES

SAIKAT CHATTERJEE∗ and AMITABHA LAHIRI
S. N. Bose National Centre for Basic Sciences, Block JD, Sector III, Salt Lake, Kolkata 700098, West Bengal, India

AMBAR N. SENGUPTA
Department of Mathematics, Louisiana State University, Baton Rouge, Louisiana 70803, USA

Received 10 October 2009
Revised 14 June 2010

We develop a differential geometric framework for parallel transport over path spaces and a corresponding discrete theory, an integrated version of the continuum theory, using a category-theoretic framework.

Keywords: Gauge theory; path spaces; double categories.
Mathematics Subject Classification 2010: 81T13, 58Z05, 16E45

1. Introduction

A considerable body of literature has grown up around the notion of “surface holonomy”, or parallel transport on surfaces, motivated by the need to have a gauge theory of interaction between charged string-like objects. Approaches include direct geometric exploration of the space of paths of a manifold (Cattaneo et al. [5], for instance), and a very different, category-theory flavored development (Baez and Schreiber [2], for instance). In the present work, we develop both a path-space geometric theory as well as a category theoretic approach to surface holonomy, and describe some of the relationships between the two.

As is well known [1] from a group-theoretic argument and also from the fact that there is no canonical ordering of points on a surface, attempts to construct a group-valued parallel transport operator for surfaces lead to inconsistencies unless the group is abelian (or an abelian representation is used). So in our setting, there are two interconnected gauge groups $G$ and $H$. We work with a fixed principal $G$-bundle $\pi : P \to M$ and connection $\bar A$; then, viewing the space of $\bar A$-horizontal paths itself as a bundle over the path space of $M$, we study a particular type of connection on this path-space bundle which is specified by means of a second connection $A$ and a field $B$ whose values are in the Lie algebra $LH$ of $H$. We derive explicit formulas describing parallel-transport with respect to this connection. As far as we are aware, this is the first time an explicit description for the parallel transport operator has been obtained for a surface swept out by a path whose endpoints are not pinned. We obtain, in Theorem 2.1, conditions for the parallel-transport of a given point in path-space to be independent of the parametrization of that point, viewed as a path. We also discuss $H$-valued connections on the path space of $M$, constructed from the field $B$.

In Sec. 3, we show how the geometrical data, including the field $B$, lead to two categories. We prove several results for these categories and discuss how these categories may be viewed as “integrated” versions of the differential geometric theory developed in Sec. 2.

In working with spaces of paths, one is confronted with the problem of specifying a differential structure on such spaces. It appears best to proceed within a simpler formalism. Essentially, one continues to use terms such as “tangent space” and “differential form”, except that in each case the specific notion is defined directly (for example, a tangent vector to a space of paths at a particular path $\gamma$ is a vector field along $\gamma$) rather than by appeal to a general theory. Indeed, there is a good variety of choices for general frameworks in this philosophy (see, for instance, [16, 17]). For this reason, we shall make no attempt to build a manifold structure on any space of paths.

∗ Current address: School of Mathematics, Tata Institute of Fundamental Research, Homi Bhabha Road, Mumbai 400005, India.

1.1. Background and motivation

Let us briefly discuss the physical background and motivation for this study. Traditional gauge fields govern interaction between point particles. Such a gauge field is, mathematically, a connection $A$ on a bundle over spacetime, with the structure group of the bundle being the relevant internal symmetry group of the particle species. The amplitude of the interaction, along some path $\gamma$ connecting the point particles, is often obtained from the particle wave functions $\psi$ coupled together using quantities involving the path-ordered exponential integral $\mathcal{P}\exp\bigl(-\int_\gamma \bar A\bigr)$, which is the same as the parallel-transport along the path $\gamma$ by the connection $\bar A$. If we now change our point of view concerning particles, and assume that they are extended string-like entities, then each particle should be viewed not as a point entity but rather a path (segment) in spacetime. Thus, instead of the two particles located at two points, we now have two paths $\gamma_1$ and $\gamma_2$; in place of a path connecting the two point particles we now have a parametrized path of paths, in other words a surface $\Gamma$, connecting $\gamma_1$ with $\gamma_2$. The interaction amplitudes would, one may expect, involve both the gauge field $A$, as expressed through the parallel transports along $\gamma_1$ and $\gamma_2$, and an interaction between these two parallel transport fields. This higher order, or higher dimensional interaction, could be described by means of a gauge field at the higher level: it would be a gauge field over the space of paths in spacetime.

Fig. 1. Point particles interacting via a gauge field.

1.2. Comparison with other works

The approach to higher gauge theory developed and explored by Baez [1], Baez and Schreiber [2, 3], and Lahiri [13], and others cited in these papers, involves an abstract category theoretic framework of 2-connections and 2-bundles, which are higher-dimensional analogs of bundles and connections. There is also the framework of gerbes [6, 4, 14].

We develop both a differential geometric framework and category-theoretic structures. We prove in Theorem 2.1 that a requirement of parametrization invariance imposes a constraint on a quantity called the “fake curvature” which has been observed in a related but more abstract context by Baez and Schreiber [2, Theorem 23]. Our differential geometric approach is close to the works of Cattaneo et al. [5], Pfeiffer [15], and Girelli and Pfeiffer [11]. However, we develop, in addition to the differential geometric aspects, the integrated version in terms of categories of diagrams, an aspect not addressed in [5]; also, it should be noted that our connection form is different from the one used in [5]. To link up with the integrated theory it is essential to explore the effect of the $LH$-valued field $B$. To this end we determine a “bi-holonomy” associated to a path of paths (Theorem 2.2) in terms of the field $B$; this aspect of the theory is not studied in [5] or other works.

Our approach has the following special features:

• we develop the theory with two connections $A$ and $\bar A$ as well as a 2-form $B$ (with the connection $\bar A$ used for parallel-transport along any given string-like object, and the forms $A$ and $B$ used to construct parallel-transports between different strings);
• we determine, in Theorem 2.2, the “bi-holonomy” associated to a path of paths using the $B$-field;
• we allow “quadrilaterals” rather than simply bigons in the category theoretic formulation, corresponding to having strings with endpoints free to move rather than fixed-endpoint strings.

Our category theoretic considerations are related to notions about double categories introduced by Ehresmann [9, 10] and explored further by Kelly and Street [12].

Fig. 2. Gauge fields along paths $c_1$ and $c_2$ interacting across a surface.

2. Connections on Path-Space Bundles

In this section we will construct connections and parallel-transport for a pair of intertwined structures: path-space bundles with structure groups $G$ and $H$, which are Lie groups intertwined as explained below in (2.1). For the physical motivation, it should be kept in mind that $G$ denotes the gauge group for the gauge field along each path, or string, while $H$ governs, along with $G$, the interaction between the gauge fields along different paths.

An important distinction between existing differential geometric approaches (such as Cattaneo et al. [5]) and the “integrated theory” encoded in the category-theoretic framework is that the latter necessarily involves two gauge groups: a group $G$ for parallel transport along paths, and another group $H$ for parallel transport between paths (in path space). We shall develop the differential geometric framework using a pair of groups $(G, H)$ so as to be consistent with the “integrated” theory.

Along with the groups $G$ and $H$, we use a fixed smooth homomorphism $\tau : H \to G$ and a smooth map $G \times H \to H : (g, h) \mapsto \alpha(g)h$ such that each $\alpha(g)$ is an automorphism of $H$, and such that the identities

$$\tau(\alpha(g)h) = g\,\tau(h)\,g^{-1}, \qquad \alpha(\tau(h))h' = h\,h'\,h^{-1}, \tag{2.1}$$

hold for all $g \in G$ and $h, h' \in H$. The derivatives $\tau'(e)$ and $\alpha'(e)$ will be denoted simply as $\tau : LH \to LG$ and $\alpha : LG \to LH$. (This structure is called a Lie 2-group in [1, 2].)

To summarize very rapidly, anticipating some of the notions explained below, we work with a principal $G$-bundle $\pi : P \to M$ over a manifold $M$, equipped with connections $A$ and $\bar A$, and an $\alpha$-equivariant vertical 2-form $B$ on $P$ with values in the Lie algebra $LH$. We then consider the space $\mathcal{P}_{\bar A}P$ of $\bar A$-horizontal paths in $P$, which forms a principal $G$-bundle over the path-space $\mathcal{P}M$ in $M$. Then there is an associated vector bundle $E$ over $\mathcal{P}M$ with fiber $LH$; using the 2-form $B$ and the connection form $\bar A$ we construct, for any section $\sigma$ of the bundle $P \to M$, an $LH$-valued 1-form $\theta^\sigma$ on $\mathcal{P}M$. This being a connection over the path-space in $M$ with structure group $H$, parallel-transport by this connection associates elements of $H$ to parametrized surfaces in $M$. Most of our work is devoted to studying a second connection form $\omega_{(A,B)}$, which is a connection on the bundle $\mathcal{P}_{\bar A}P$ which we construct using a second connection $A$ on $P$. Parallel-transport by $\omega_{(A,B)}$ is related to parallel-transport by the $LH$-valued connection form $\theta^\sigma$.

2.1. Principal bundle and the connection $\bar A$

Consider a principal $G$-bundle

$$\pi : P \to M$$

with the right-action of the Lie group $G$ on $P$ denoted

$$P \times G \to P : (p, g) \mapsto pg = R_g p.$$

Let $\bar A$ be a connection on this bundle. The space $\mathcal{P}_{\bar A}P$ of $\bar A$-horizontal paths in $P$ may be viewed as a principal $G$-bundle over $\mathcal{P}M$, the space of smooth paths in $M$. We will use the notation $pK \in T_pP$, for any point $p \in P$ and Lie-algebra element $K \in LG$, defined by

$$pK = \frac{d}{dt}\Big|_{t=0}\, p\cdot\exp(tK).$$

It will be convenient to keep in mind that we always use $t$ to denote the parameter for a path on the base manifold $M$ or in the bundle space $P$; we use the letter $s$ to parametrize a path in path-space.

2.2. The tangent space to $\mathcal{P}_{\bar A}P$

The points of the space $\mathcal{P}_{\bar A}P$ are $\bar A$-horizontal paths in $P$. Although we call $\mathcal{P}_{\bar A}P$ a “space” we do not discuss any topology or manifold structure on it. However, it is useful to introduce certain differential geometric notions such as tangent spaces on $\mathcal{P}_{\bar A}P$. It is intuitively clear that a tangent vector at a “point” $\tilde\gamma \in \mathcal{P}_{\bar A}P$ ought to be a vector field on the path $\tilde\gamma$. We formalize this idea here (as has been done elsewhere as well, such as in Cattaneo et al. [5]). If $\mathcal{P}X$ is a space of paths on a manifold $X$, we denote by $\mathrm{ev}_t$ the evaluation map

$$\mathrm{ev}_t : \mathcal{P}X \to X : \gamma \mapsto \mathrm{ev}_t(\gamma) = \gamma(t). \tag{2.2}$$

Our first step is to understand the tangent spaces to the bundle $\mathcal{P}_{\bar A}P$. The following result is preparation for the definition (see also [5, Theorem 2.1]).

Proposition 2.1. Let $\bar A$ be a connection on a principal $G$-bundle $\pi : P \to M$, and

$$\tilde\Gamma : [0,1]\times[0,1] \to P : (t,s) \mapsto \tilde\Gamma(t,s) = \tilde\Gamma_s(t)$$

a smooth map, and

$$\tilde v_s(t) = \partial_s\tilde\Gamma(t,s).$$

Then the following are equivalent:

(i) Each transverse path $\tilde\Gamma_s : [0,1] \to P : t \mapsto \tilde\Gamma(t,s)$ is $\bar A$-horizontal.
(ii) The initial path $\tilde\Gamma_0$ is $\bar A$-horizontal, and the “tangency condition”

$$\frac{\partial\bar A(\tilde v_s(t))}{\partial t} = F^{\bar A}\bigl(\partial_t\tilde\Gamma(t,s),\tilde v_s(t)\bigr) \tag{2.3}$$

holds, and thus also

$$\bar A(\tilde v_s(T)) - \bar A(\tilde v_s(0)) = \int_0^T F^{\bar A}\bigl(\partial_t\tilde\Gamma(t,s),\tilde v_s(t)\bigr)\,dt, \tag{2.4}$$

for every $T, s \in [0,1]$.

Equation (2.3), and variations on it, is sometimes referred to as the Duhamel formula and sometimes a “non-abelian Stokes formula”. We can write it more compactly by using the notion of a Chen integral. With suitable regularity assumptions, a 2-form $\Theta$ on a space $X$ yields a 1-form, denoted $\int\Theta$, on the space $\mathcal{P}X$ of smooth paths in $X$; if $c$ is such a path, a “tangent vector” $v \in T_c(\mathcal{P}X)$ is a vector field $t \mapsto v(t)$ along $c$, and the evaluation of the 1-form $\int\Theta$ on $v$ is defined to be

$$\Bigl(\int_c\Theta\Bigr)v = \Bigl(\int_c\Theta\Bigr)(v) = \int_0^1\Theta\bigl(c'(t),v(t)\bigr)\,dt. \tag{2.5}$$

The 1-form $\int\Theta$, or its localization to the tangent space $T_c(\mathcal{P}X)$, is called the Chen integral of $\Theta$. Returning to our context, we then have

$$\mathrm{ev}_T^*\bar A - \mathrm{ev}_0^*\bar A = \int_0^T F^{\bar A}, \tag{2.6}$$

where the integral on the right is a Chen integral; here it is, by definition, the 1-form on $\mathcal{P}_{\bar A}P$ whose value on a vector $\tilde v_s \in T_{\tilde\Gamma_s}\mathcal{P}_{\bar A}P$ is given by the right-hand side of (2.4). The pullback $\mathrm{ev}_t^*\bar A$ has the obvious meaning.
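To make the definition (2.5) of the Chen integral concrete, here is a small numerical sketch. Everything specific in it is an assumption chosen for illustration: the space is $X=\mathbb{R}^2$, the 2-form is the constant area form $dx\wedge dy$, the path is $c(t)=(t,t^2)$, and the helper names `chen_integral` and `area_form` are hypothetical. The integral over $[0,1]$ is approximated by the midpoint rule.

```python
def chen_integral(theta, c, dc, v, n=2000):
    """Midpoint-rule approximation of int_0^1 theta(c(t); c'(t), v(t)) dt."""
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        total += theta(c(t), dc(t), v(t))
    return total / n

def area_form(point, u, w):
    """The constant 2-form dx ^ dy on R^2 evaluated on the pair (u, w)."""
    return u[0] * w[1] - u[1] * w[0]

# Assumed sample path c(t) = (t, t^2) with velocity c'(t) = (1, 2t).
c = lambda t: (t, t * t)
dc = lambda t: (1.0, 2.0 * t)

# Two "tangent vectors" at c, i.e. vector fields along the path.
v_up = lambda t: (0.0, 1.0)      # integrand is identically 1
v_side = lambda t: (t, 0.0)      # integrand is -2 t^2

value_up = chen_integral(area_form, c, dc, v_up)
value_side = chen_integral(area_form, c, dc, v_side)
```

For this data the exact values are $1$ and $-2/3$, so the two numbers give a quick sanity check that the 1-form $\int\Theta$ is linear in the vector field and depends on the whole path, not on a single point.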

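Proof. From the definition of the curvature form $F^{\bar A}$, we have

$$F^{\bar A}(\partial_t\tilde\Gamma,\partial_s\tilde\Gamma) = \partial_t\bigl(\bar A(\partial_s\tilde\Gamma)\bigr) - \partial_s\bigl(\bar A(\partial_t\tilde\Gamma)\bigr) - \bar A\bigl(\underbrace{[\partial_t\tilde\Gamma,\partial_s\tilde\Gamma]}_{0}\bigr) + \bigl[\bar A(\partial_t\tilde\Gamma),\bar A(\partial_s\tilde\Gamma)\bigr].$$

So

$$\begin{aligned}
\partial_t\bigl(\bar A(\partial_s\tilde\Gamma)\bigr) - F^{\bar A}(\partial_t\tilde\Gamma,\partial_s\tilde\Gamma) &= \partial_s\bigl(\bar A(\partial_t\tilde\Gamma)\bigr) - \bigl[\bar A(\partial_t\tilde\Gamma),\bar A(\partial_s\tilde\Gamma)\bigr]\\
&= 0 \quad\text{if } \bar A(\partial_t\tilde\Gamma) = 0,
\end{aligned} \tag{2.7}$$

thus proving (2.3) if (i) holds. Equation (2.4) then follows by integration.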

Next suppose (ii) holds. Then, from the first line in (2.7), we have

$$\partial_s\bigl(\bar A(\partial_t\tilde\Gamma)\bigr) - \bigl[\bar A(\partial_t\tilde\Gamma),\bar A(\partial_s\tilde\Gamma)\bigr] = 0. \tag{2.8}$$

Now let $s \mapsto h(s) \in G$ describe parallel-transport along $s \mapsto \tilde\Gamma(t,s)$; then

$$h'(s)h(s)^{-1} = -\bar A\bigl(\partial_s\tilde\Gamma(t,s)\bigr), \quad\text{and}\quad h(0) = e.$$

Then

$$\partial_s\Bigl(h(s)^{-1}\bar A\bigl(\partial_t\tilde\Gamma(t,s)\bigr)h(s)\Bigr) = \mathrm{Ad}\bigl(h(s)^{-1}\bigr)\Bigl[\partial_s\bigl(\bar A(\partial_t\tilde\Gamma)\bigr) - \bigl[\bar A(\partial_t\tilde\Gamma),\bar A(\partial_s\tilde\Gamma)\bigr]\Bigr] \tag{2.9}$$

and the right-hand side here is 0, as seen in (2.8). Therefore,

$$h(s)^{-1}\bar A\bigl(\partial_t\tilde\Gamma(t,s)\bigr)h(s)$$

is independent of $s$, and hence is equal to its value at $s = 0$. Thus, if $\bar A$ vanishes on $\partial_t\tilde\Gamma(t,0)$ then it also vanishes on $\partial_t\tilde\Gamma(t,s)$ for all $s \in [0,1]$. In conclusion, if the initial path $\tilde\Gamma_0$ is $\bar A$-horizontal, and the tangency condition (2.3) holds, then each transverse path $\tilde\Gamma_s$ is $\bar A$-horizontal.

In view of the preceding result, it is natural to define the tangent spaces to $\mathcal{P}_{\bar A}P$ as follows:

Definition 2.1. The tangent space to $\mathcal{P}_{\bar A}P$ at $\tilde\gamma$ is the linear space of all vector fields $t \mapsto \tilde v(t) \in T_{\tilde\gamma(t)}P$ along $\tilde\gamma$ for which

$$\frac{\partial\bar A(\tilde v(t))}{\partial t} - F^{\bar A}\bigl(\tilde\gamma'(t),\tilde v(t)\bigr) = 0 \tag{2.10}$$

holds for all $t \in [0,1]$. The vertical subspace in $T_{\tilde\gamma}\mathcal{P}_{\bar A}P$ consists of all vectors $\tilde v(\cdot)$ for which $\tilde v(t)$ is vertical in $T_{\tilde\gamma(t)}P$ for every $t \in [0,1]$.

Let us note one consequence:

Lemma 2.1. Suppose $\gamma : [0,1] \to M$ is a smooth path, and $\tilde\gamma$ an $\bar A$-horizontal lift. Let $v : [0,1] \to TM$ be a vector field along $\gamma$, and $\tilde v(0)$ any vector in $T_{\tilde\gamma(0)}P$ with $\pi_*\tilde v(0) = v(0)$. Then there is a unique vector field $\tilde v \in T_{\tilde\gamma}\mathcal{P}_{\bar A}P$ whose projection down to $M$ is the vector field $v$, and whose initial value is $\tilde v(0)$.

Proof. The first-order differential equation (2.10) determines the vertical part of $\tilde v(t)$, from the initial value. Thus $\tilde v(t)$ is this vertical part plus the $\bar A$-horizontal lift of $v(t)$ to $T_{\tilde\gamma(t)}P$.

2.3. Connections induced from $B$

All through our work, $B$ will denote a vertical $\alpha$-equivariant 2-form on $P$ with values in $LH$. In more detail, this means that $B$ is an $LH$-valued 2-form on $P$ which is vertical in the sense that

$$B(u,v) = 0 \quad\text{if } u \text{ or } v \text{ is vertical},$$

and $\alpha$-equivariant in the sense that

$$R_g^*B = \alpha(g^{-1})B \quad\text{for all } g \in G,$$

wherein $R_g : P \to P : p \mapsto pg$ is the right action of $G$ on the principal bundle space $P$, and

$$\alpha(g^{-1})B = d\alpha(g^{-1})|_e\,B,$$

recalling that $\alpha(g^{-1})$ is an automorphism $H \to H$.

Consider an $\bar A$-horizontal $\tilde\gamma \in \mathcal{P}_{\bar A}P$, and a smooth vector field $X$ along $\gamma = \pi\circ\tilde\gamma$; take any lift $\tilde X_{\tilde\gamma}$ of $X$ along $\tilde\gamma$, and set

$$\theta_{\tilde\gamma}(X) \stackrel{\mathrm{def}}{=} \Bigl(\int_{\tilde\gamma}B\Bigr)(\tilde X_{\tilde\gamma}) = \int_0^1 B\bigl(\tilde\gamma'(u),\tilde X_{\tilde\gamma}(u)\bigr)\,du. \tag{2.11}$$

This is independent of the choice of $\tilde X_{\tilde\gamma}$ (as any two choices differ by a vertical vector on which $B$ vanishes) and specifies a linear form $\theta_{\tilde\gamma}$ on $T_\gamma(\mathcal{P}M)$ with values in $LH$. If we choose a different horizontal lift of $\gamma$, a path $\tilde\gamma g$, with $g \in G$, then

$$\theta_{\tilde\gamma g}(X) = \alpha(g^{-1})\theta_{\tilde\gamma}(X). \tag{2.12}$$

Thus, one may view $\tilde\theta$ to be a 1-form on $\mathcal{P}M$ with values in the vector bundle $E \to \mathcal{P}M$ associated to $\mathcal{P}_{\bar A}P \to \mathcal{P}M$ by the action $\alpha$ of $G$ on $LH$.

Now fix a section $\sigma : M \to P$, and for any path $\gamma \in \mathcal{P}M$ let $\tilde\sigma(\gamma) \in \mathcal{P}_{\bar A}P$ be the $\bar A$-horizontal lift with initial point $\sigma(\gamma(0))$. Thus, $\tilde\sigma : \mathcal{P}M \to \mathcal{P}_{\bar A}P$ is a section of the bundle $\mathcal{P}_{\bar A}P \to \mathcal{P}M$. Then we have the 1-form $\theta^\sigma$ on $\mathcal{P}M$ with values in $LH$ given as follows: for any $X \in T_\gamma(\mathcal{P}M)$,

$$(\theta^\sigma)(X) = \theta_{\tilde\sigma(\gamma)}(X). \tag{2.13}$$

We shall view $\theta^\sigma$ as a connection form for the trivial $H$-bundle over $\mathcal{P}M$. Of course, it depends on the section $\tilde\sigma$ of $\mathcal{P}_{\bar A}P \to \mathcal{P}M$, but in a “controlled” manner, i.e. the behavior of $\theta^\sigma$ under change of $\sigma$ is obtained using (2.12).

2.4. Constructing the connection $\omega_{(A,B)}$

Our next objective is to construct connection forms on $\mathcal{P}_{\bar A}P$. To this end, fix a connection $A$ on $P$, in addition to the connection $\bar A$ and the $\alpha$-equivariant vertical $LH$-valued 2-form $B$ on $P$.

The evaluation map at any time $t \in [0,1]$, given by

$$\mathrm{ev}_t : \mathcal{P}_{\bar A}P \to P : \tilde\gamma \mapsto \tilde\gamma(t),$$

commutes with the projections $\mathcal{P}_{\bar A}P \to \mathcal{P}M$ and $P \to M$, and the evaluation map $\mathcal{P}M \to M$. We can pull back any connection $A$ on the bundle $P$ to a connection $\mathrm{ev}_t^*A$ on $\mathcal{P}_{\bar A}P$.

Given a 2-form $B$ as discussed above, consider the $LH$-valued 1-form $Z$ on $\mathcal{P}_{\bar A}P$ specified as follows. Its value on a vector $\tilde v \in T_{\tilde\gamma}\mathcal{P}_{\bar A}P$ is defined to be

$$Z(\tilde v) = \int_0^1 B\bigl(\tilde\gamma'(t),\tilde v(t)\bigr)\,dt. \tag{2.14}$$

Thus

$$Z = \int_0^1 B, \tag{2.15}$$

where on the right we have the Chen integral (discussed earlier in (2.5)) of the 2-form $B$ on $P$, lifting it to an $LH$-valued 1-form on the space of ($\bar A$-horizontal) smooth paths $[0,1] \to P$. The Chen integral here is, by definition, the 1-form on $\mathcal{P}_{\bar A}P$ given by

$$\tilde v \in T_{\tilde\gamma}\mathcal{P}_{\bar A}P \mapsto \int_0^1 B\bigl(\tilde\gamma'(t),\tilde v(t)\bigr)\,dt.$$

Note that $Z$ and the form $\theta$ are closely related:

$$Z(\tilde v) = \theta_{\tilde\gamma}(\pi_*\tilde v). \tag{2.16}$$

Now define the 1-form $\omega_{(A,B)}$ by

$$\omega_{(A,B)} = \mathrm{ev}_1^*A + \tau(Z). \tag{2.17}$$

Recall that $\tau : H \to G$ is a homomorphism, and, for any $X \in LH$, we are writing $\tau(X)$ to mean $\tau'(e)X$; here $\tau'(e) : LH \to LG$ is the derivative of $\tau$ at the identity. The utility of bringing in $\tau$ becomes clear only when connecting these developments to the category theoretic formulation of Sec. 3. A similar construction, but using only one algebra $LG$, is described by Cattaneo et al. ([5]). However, as we pointed out earlier, a parallel transport operator for a surface cannot be constructed using a single group unless the group is abelian. To allow non-abelian groups, we need to have two groups intertwined in the structure described in (2.1), and thus we need $\tau$.

Note that $\omega_{(A,B)}$ is simply the connection $\mathrm{ev}_1^*A$ on the bundle $\mathcal{P}_{\bar A}P$, shifted by the 1-form $\tau(Z)$. In the finite-dimensional setting it is a standard fact that such a shift, by an equivariant form which vanishes on verticals, produces another connection; however, given that our setting is, technically, not identical to the finite-dimensional one, we shall prove this below in Proposition 2.2.
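The intertwining identities (2.1) between $\tau$ and $\alpha$ can be checked by brute force in a finite toy model. The sketch below is not a Lie-group example and is not taken from the paper: it assumes $G = S_3$, $H = A_3$ (the even permutations), $\tau$ the inclusion $H \to G$ and $\alpha$ the conjugation action, purely to show what the two identities assert. All function names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Composition (p o q)(i) = p[q[i]] for permutations stored as tuples."""
    return tuple(p[i] for i in q)

def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def sign(p):
    """Sign of a permutation, computed by counting inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return 1 if inversions % 2 == 0 else -1

G = list(permutations(range(3)))       # toy stand-in for G: the group S_3
H = [p for p in G if sign(p) == 1]     # toy stand-in for H: the subgroup A_3

def tau(h):
    """Assumed homomorphism H -> G: the inclusion."""
    return h

def alpha(g, h):
    """Assumed action of G on H: conjugation g h g^{-1}."""
    return compose(compose(g, h), inverse(g))

# First identity in (2.1): tau(alpha(g) h) = g tau(h) g^{-1}.
first_identity = all(
    tau(alpha(g, h)) == compose(compose(g, tau(h)), inverse(g))
    for g in G for h in H
)
# Second identity in (2.1): alpha(tau(h)) h' = h h' h^{-1}.
second_identity = all(
    alpha(tau(h), k) == compose(compose(h, k), inverse(h))
    for h in H for k in H
)
# alpha(g) must map H to itself for the action to make sense.
stays_in_H = all(alpha(g, h) in H for g in G for h in H)
```

In this toy case both identities hold by construction once conjugation preserves $H$; the point of the check is only to display the roles played by $\tau$ and $\alpha$ in (2.1).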

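Thus,

$$\omega_{(A,B)}(\tilde v) = A(\tilde v(1)) + \int_0^1\tau B\bigl(\tilde\gamma'(t),\tilde v(t)\bigr)\,dt. \tag{2.18}$$

We can rewrite this as

$$\omega_{(A,B)} = \mathrm{ev}_0^*A + \bigl[\mathrm{ev}_1^*(A-\bar A) - \mathrm{ev}_0^*(A-\bar A)\bigr] + \int_0^1\bigl(F^{\bar A}+\tau B\bigr). \tag{2.19}$$

To obtain this we have simply used the relation (2.4). The advantage in (2.19) is that it separates off the endpoint terms and expresses $\omega_{(A,B)}$ as a perturbation of the simple connection $\mathrm{ev}_0^*A$ by a vector in the tangent space $T_{\mathrm{ev}_0^*A}\mathcal{A}$, where $\mathcal{A}$ is the space of connections on the bundle $\mathcal{P}_{\bar A}P$. Here note that the “tangent vectors” to the affine space $\mathcal{A}$ at a connection $\omega$ are the 1-forms $\omega_1-\omega$, with $\omega_1$ running over $\mathcal{A}$. A difference such as $\omega_1-\omega$ is precisely an equivariant $LG$-valued 1-form which vanishes on vertical vectors.

Recall that the group $G$ acts on $P$ on the right

$$P \times G \to P : (p,g) \mapsto R_gp = pg$$

and this induces a natural right action of $G$ on $\mathcal{P}_{\bar A}P$:

$$\mathcal{P}_{\bar A}P \times G \to \mathcal{P}_{\bar A}P : (\tilde\gamma,g) \mapsto R_g\tilde\gamma = \tilde\gamma g.$$

Then for any vector $X$ in the Lie algebra $LG$, we have a vertical vector

$$\tilde X(\tilde\gamma) \in T_{\tilde\gamma}\mathcal{P}_{\bar A}P$$

given by

$$\tilde X(\tilde\gamma)(t) = \frac{d}{du}\Big|_{u=0}\tilde\gamma(t)\exp(uX).$$

Proposition 2.2. The form $\omega_{(A,B)}$ is a connection form on the principal $G$-bundle $\mathcal{P}_{\bar A}P \to \mathcal{P}M$. More precisely,

$$\omega_{(A,B)}\bigl((R_g)_*\tilde v\bigr) = \mathrm{Ad}(g^{-1})\,\omega_{(A,B)}(\tilde v)$$

for every $g \in G$, $\tilde v \in T_{\tilde\gamma}(\mathcal{P}_{\bar A}P)$ and

$$\omega_{(A,B)}(\tilde X) = X$$

for every $X \in LG$.

Proof. It will suffice to show that for every $g \in G$ and every vector $v$ tangent to $\mathcal{P}_{\bar A}P$,

$$Z\bigl((R_g)_*v\bigr) = \mathrm{Ad}(g^{-1})Z(v),$$

and

$$Z(\tilde X) = 0$$

for every $X \in LG$.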

From (2.15) and the fact that $B$ vanishes on verticals it is clear that $Z(\tilde X)$ is 0. The equivariance under the $G$-action follows also from (2.15), on using the $G$-equivariance of the connection form $A$ and of the 2-form $B$, and the fact that the right action of $G$ carries $\bar A$-horizontal paths into $\bar A$-horizontal paths.

2.5. Parallel transport by $\omega_{(A,B)}$

Let us examine how a path is parallel-transported by $\omega_{(A,B)}$. At the infinitesimal level, all we need is to be able to lift a given vector field $v : [0,1] \to TM$, along $\gamma \in \mathcal{P}M$, to a vector field $\tilde v$ along $\tilde\gamma$ such that:

(i) $\tilde v$ is a vector in $T_{\tilde\gamma}(\mathcal{P}_{\bar A}P)$, which means that it satisfies Eq. (2.10):

$$\frac{\partial\bar A(\tilde v(t))}{\partial t} = F^{\bar A}\bigl(\tilde\gamma'(t),\tilde v(t)\bigr); \tag{2.20}$$

(ii) $\tilde v$ is $\omega_{(A,B)}$-horizontal, i.e. satisfies the equation

$$A(\tilde v(1)) + \int_0^1\tau B\bigl(\tilde\gamma'(t),\tilde v(t)\bigr)\,dt = 0. \tag{2.21}$$

The following result gives a constructive description of $\tilde v$.

Proposition 2.3. Assume that $A$, $\bar A$, $B$, and $\omega_{(A,B)}$ are as specified before. Let $\tilde\gamma \in \mathcal{P}_{\bar A}P$, and $\gamma = \pi\circ\tilde\gamma \in \mathcal{P}M$ its projection to a path on $M$, and consider any $v \in T_\gamma\mathcal{P}M$. Then the $\omega_{(A,B)}$-horizontal lift $\tilde v \in T_{\tilde\gamma}\mathcal{P}_{\bar A}P$ is given by

$$\tilde v(t) = \tilde v^h_{\bar A}(t) + \tilde v^v(t),$$

where $\tilde v^h_{\bar A}(t) \in T_{\tilde\gamma(t)}P$ is the $\bar A$-horizontal lift of $v(t) \in T_{\gamma(t)}M$, and

$$\tilde v^v(t) = \tilde\gamma(t)\Bigl[\bar A(\tilde v(1)) - \int_t^1 F^{\bar A}\bigl(\tilde\gamma'(u),\tilde v^h_{\bar A}(u)\bigr)\,du\Bigr] \tag{2.22}$$

wherein

$$\tilde v(1) = \tilde v^h_A(1) + \tilde\gamma(1)X, \tag{2.23}$$

with $\tilde v^h_A(1)$ being the $A$-horizontal lift of $v(1)$ in $T_{\tilde\gamma(1)}P$, and

$$X = -\int_0^1\tau B\bigl(\tilde\gamma'(t),\tilde v^h_{\bar A}(t)\bigr)\,dt. \tag{2.24}$$

Note that $X$ in (2.24) is $A(\tilde v(1))$. Note also that since $\tilde v$ is tangent to $\mathcal{P}_{\bar A}P$, the vector $\tilde v^v(t)$ is also given by

$$\tilde v^v(t) = \tilde\gamma(t)\Bigl[\bar A(\tilde v(0)) + \int_0^t F^{\bar A}\bigl(\tilde\gamma'(u),\tilde v^h_{\bar A}(u)\bigr)\,du\Bigr]. \tag{2.25}$$


Proof. The $\omega_{(A,B)}$-horizontal lift $\tilde v$ of $v$ in $T_{\tilde\gamma}(\mathcal{P}_{\bar A}P)$ is the vector field $\tilde v$ along $\tilde\gamma$ which projects by $\pi_*$ to $v$ and satisfies the condition (2.21):
\[ A(\tilde v(1)) + \int_0^1 \tau B(\tilde\gamma'(t), \tilde v(t))\,dt = 0. \tag{2.26} \]
Now for each $t \in [0, 1]$, we can split the vector $\tilde v(t)$ into an $\bar A$-horizontal part and a vertical part $\tilde v^v(t)$ which is essentially the element $\bar A(\tilde v^v(t)) \in LG$ viewed as a vector in the vertical subspace in $T_{\tilde\gamma(t)}P$:
\[ \tilde v(t) = \tilde v^h_{\bar A}(t) + \tilde v^v(t) \]
and the vertical part here is given by
\[ \tilde v^v(t) = \tilde\gamma(t)\bar A(\tilde v(t)). \]
Since the vector field $\tilde v$ is actually a vector in $T_{\tilde\gamma}(\mathcal{P}_{\bar A}P)$, we have, from (2.20), the relation
\[ \bar A(\tilde v(t)) = \bar A(\tilde v(1)) - \int_t^1 F^{\bar A}(\tilde\gamma'(u), \tilde v^h_{\bar A}(u))\,du. \]
We need now only verify the expression (2.23) for $\tilde v(1)$. To this end, we first split this into $A$-horizontal and a corresponding vertical part:
\[ \tilde v(1) = \tilde v^h_A(1) + \tilde\gamma(1)A(\tilde v(1)). \]

The vector $A(\tilde v(1))$ is obtained from (2.26), and this proves (2.23).

There is an observation to be made from Proposition 2.3. Equation (2.24) has, on the right-hand side, the integral over the entire curve $\tilde\gamma$. Thus, if we were to consider parallel-transport of only, say, the "left half" of $\tilde\gamma$, we would, in general, end up with a different path of paths!

2.6. Reparametrization invariance

If a path is reparametrized, then, technically, it is a different point in path space. Does parallel-transport along a path of paths depend on the specific parametrization of the paths? We shall obtain conditions to ensure that there is no such dependence. Moreover, in this case, we shall also show that parallel transport by $\omega_{(A,B)}$ along a path of paths depends essentially on the surface swept out by this path of paths, rather than the specific parametrization of this surface.

For the following result, recall that we are working with Lie groups $G$, $H$, smooth homomorphism $\tau : H \to G$, smooth map $\alpha : G \times H \to H : (g, h) \mapsto \alpha(g)h$, where each $\alpha(g)$ is an automorphism of $H$, and the maps $\tau$ and $\alpha$ satisfy (2.1). Let $\pi : P \to M$ be a principal $G$-bundle, with connections $A$ and $\bar A$, and $B$ an $LH$-valued $\alpha$-equivariant 2-form on $P$ vanishing on vertical vectors. As before, on the space $\mathcal{P}_{\bar A}P$ of $\bar A$-horizontal paths, viewed as a principal $G$-bundle over the space $\mathcal{P}M$ of smooth paths in $M$, there is the connection form $\omega_{(A,B)}$ given by
\[ \omega_{(A,B)} = \mathrm{ev}_1^*A + \int_0^1 \tau B. \]

By a "smooth path" $s \mapsto \Gamma_s$ in $\mathcal{P}M$, we mean a smooth map
\[ [0, 1]^2 \to M : (t, s) \mapsto \Gamma(t, s) = \Gamma_s(t), \]
viewed as a path of paths $\Gamma_s \in \mathcal{P}M$. With this notation and framework, we have:

Theorem 2.1. Let
\[ \Phi : [0, 1]^2 \to [0, 1]^2 : (t, s) \mapsto (\Phi_s(t), \Phi^t(s)) \]
be a smooth diffeomorphism which fixes each vertex of $[0, 1]^2$. Assume that

(i) either
\[ F^{\bar A} + \tau(B) = 0 \tag{2.27} \]
and $\Phi$ carries each $s$-fixed section $[0, 1] \times \{s\}$ into an $s$-fixed section $[0, 1] \times \{\Phi^0(s)\}$;

(ii) or
\[ [\mathrm{ev}_1^*(A - \bar A) - \mathrm{ev}_0^*(A - \bar A)] + \int_0^1 (F^{\bar A} + \tau B) = 0, \tag{2.28} \]
$\Phi$ maps each boundary edge of $[0, 1]^2$ into itself, and $\Phi^0(s) = \Phi^1(s)$ for all $s \in [0, 1]$.

Then the $\omega_{(A,B)}$-parallel-translate of the point $\tilde\Gamma_0 \circ \Phi_0$ along the path $s \mapsto (\Gamma \circ \Phi)_s$ is $\tilde\Gamma_1 \circ \Phi_1$, where $\tilde\Gamma_1$ is the $\omega_{(A,B)}$-parallel-translate of $\tilde\Gamma_0$ along $s \mapsto \Gamma_s$.

As a special case, if the path $s \mapsto \Gamma_s$ is constant and $\Phi_0$ the identity map on $[0, 1]$, so that $\Gamma_1$ is simply a reparametrization of $\Gamma_0$, then, under conditions (i) or (ii) above, the $\omega_{(A,B)}$-parallel-translate of the point $\tilde\Gamma_0$ along the path $s \mapsto (\Gamma \circ \Phi)_s$ is $\tilde\Gamma_0 \circ \Phi_1$, i.e. the appropriate reparametrization of the original path $\tilde\Gamma_0$.

Note that the path $(\tilde\Gamma \circ \Phi)_0$ projects down to $(\Gamma \circ \Phi)_0$, which, by the boundary behavior of $\Phi$, is actually the path $\Gamma_0 \circ \Phi_0$, in other words $\Gamma_0$ reparametrized. Similarly, $(\tilde\Gamma \circ \Phi)_1$ is an $\bar A$-horizontal lift of the path $\Gamma_1$, reparametrized by $\Phi_1$.

If $A = \bar A$ then conditions (2.28) and (2.27) are the same, and so in this case the weaker condition on $\Phi$ in (ii) suffices.

Proof. Suppose (2.27) holds. Then the connection $\omega_{(A,B)}$ has the form
\[ \mathrm{ev}_0^*A + [\mathrm{ev}_1^*(A - \bar A) - \mathrm{ev}_0^*(A - \bar A)]. \]
The crucial point is that this depends only on the endpoints, i.e. if $\tilde\gamma \in \mathcal{P}_{\bar A}P$ and $\tilde V \in T_{\tilde\gamma}\mathcal{P}_{\bar A}P$ then $\omega_{(A,B)}(\tilde V)$ depends only on $\tilde V(0)$ and $\tilde V(1)$. If the conditions on $\Phi$ in (i) hold then reparametrization has the effect of replacing each $\tilde\Gamma_s$ with $\tilde\Gamma_{\Phi^0(s)} \circ \Phi_s$, which is in $\mathcal{P}_{\bar A}P$, and the vector field $t \mapsto \partial_s(\tilde\Gamma_{\Phi^0(s)} \circ \Phi_s(t))$ is an $\omega_{(A,B)}$-horizontal vector, because its endpoint values are those of $t \mapsto \partial_s(\tilde\Gamma_{\Phi^0(s)}(t))$, since $\Phi_s(t)$ equals $t$ if $t$ is 0 or 1.

Now suppose (2.28) holds. Then $\omega_{(A,B)}$ becomes simply $\mathrm{ev}_0^*A$. In this case $\omega_{(A,B)}(\tilde V)$ depends on $\tilde V$ only through the initial value $\tilde V(0)$. Thus, the $\omega_{(A,B)}$-parallel-transport of $\tilde\gamma \in \mathcal{P}_{\bar A}P$, along a path $s \mapsto \Gamma_s \in \mathcal{P}M$, is obtained by $A$-parallel-transporting the initial point $\tilde\gamma(0)$ along the path $s \mapsto \Gamma^0(s)$, and shooting off $\bar A$-horizontal paths lying above the paths $\Gamma_s$. (Since the paths $\Gamma_s$ do not necessarily have the second component fixed, their horizontal lifts need not be of the form $\tilde\Gamma_s \circ \Phi_s$, except at $s = 0$ and $s = 1$, when the composition $\tilde\Gamma_{\Phi_s} \circ \Phi_s$ is guaranteed to be meaningful.) From this it is clear that parallel translating $\tilde\Gamma_0 \circ \Phi_0$ by $\omega_{(A,B)}$ along the path $s \mapsto \Gamma_s$ results, at $s = 1$, in the path $\tilde\Gamma_1 \circ \Phi_1$.

2.7. The curvature of $\omega_{(A,B)}$

We can compute the curvature of the connection $\omega_{(A,B)}$. This is, by definition,
\[ \Omega_{(A,B)} = d\omega_{(A,B)} + \tfrac{1}{2}[\omega_{(A,B)} \wedge \omega_{(A,B)}], \]
where the exterior differential $d$ is understood in a natural sense that will become clearer in the proof below. More technically, we are using here notions of calculus on smooth spaces; see, for instance, [16] for a survey, and [17] for another approach.

First we describe some notation about Chen integrals in the present context. If $B$ is a 2-form on $P$, with values in a Lie algebra, then its Chen integral $\int_0^1 B$, restricted to $\mathcal{P}_{\bar A}P$, is a 1-form on $\mathcal{P}_{\bar A}P$ given on the vector $\tilde V \in T_{\tilde\gamma}(\mathcal{P}_{\bar A}P)$ by
\[ \left(\int_0^1 B\right)(\tilde V) = \int_0^1 B(\tilde\gamma'(t), \tilde V(t))\,dt. \]
If $C$ is also a 2-form on $P$ with values in the same Lie algebra, we have a product 2-form on the path space $\mathcal{P}_{\bar A}P$ given on $\tilde X, \tilde Y \in T_{\tilde\gamma}(\mathcal{P}_{\bar A}P)$ by
\[
\begin{aligned}
\left(\int_0^1\right)^2 [B \wedge C](\tilde X, \tilde Y)
&= \int_{0 \le u < v \le 1} [B(\tilde\gamma'(u), \tilde X(u)), C(\tilde\gamma'(v), \tilde Y(v))]\,du\,dv \\
&\quad - \int_{0 \le u < v \le 1} [C(\tilde\gamma'(u), \tilde X(u)), B(\tilde\gamma'(v), \tilde Y(v))]\,du\,dv \\
&= \int_0^1\!\!\int_0^1 [B(\tilde\gamma'(u), \tilde X(u)), C(\tilde\gamma'(v), \tilde Y(v))]\,du\,dv.
\end{aligned}
\tag{2.29}
\]


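Proposition 2.4. The curvature of $\omega_{(A,B)}$ is
\[ \Omega^{\omega_{(A,B)}} = \mathrm{ev}_1^*F^A + d\left(\int_0^1 \tau B\right) + \left[\mathrm{ev}_1^*A \wedge \int_0^1 \tau B\right] + \left(\int_0^1\right)^2 [\tau B \wedge \tau B], \tag{2.30} \]
where the integrals are Chen integrals.

Proof. From
\[ \omega_{(A,B)} = \mathrm{ev}_1^*A + \int_0^1 \tau B, \]
we have
\[
\begin{aligned}
\Omega^{\omega_{(A,B)}} &= d\omega_{(A,B)} + \tfrac{1}{2}[\omega_{(A,B)} \wedge \omega_{(A,B)}] \\
&= \mathrm{ev}_1^*dA + d\int_0^1 \tau B + W,
\end{aligned}
\tag{2.31}
\]
where
\[
\begin{aligned}
W(\tilde X, \tilde Y) &= [\omega_{(A,B)}(\tilde X), \omega_{(A,B)}(\tilde Y)] \\
&= [\mathrm{ev}_1^*A(\tilde X), \mathrm{ev}_1^*A(\tilde Y)] + \left[\mathrm{ev}_1^*A(\tilde X), \int_0^1 \tau B(\tilde\gamma'(t), \tilde Y(t))\,dt\right] \\
&\quad + \left[\int_0^1 \tau B(\tilde\gamma'(t), \tilde X(t))\,dt, \mathrm{ev}_1^*A(\tilde Y)\right] \\
&\quad + \int_0^1\!\!\int_0^1 \left[\tau B(\tilde\gamma'(u), \tilde X(u)), \tau B(\tilde\gamma'(v), \tilde Y(v))\right] du\,dv \\
&= [\mathrm{ev}_1^*A, \mathrm{ev}_1^*A](\tilde X, \tilde Y) + \left[\mathrm{ev}_1^*A \wedge \int_0^1 \tau B\right](\tilde X, \tilde Y) + \left(\int_0^1\right)^2 [\tau B \wedge \tau B](\tilde X, \tilde Y).
\end{aligned}
\tag{2.32}
\]

In the case $A = \bar A$, and without $\tau$, the expression for the curvature can be expressed in terms of the "fake curvature" $F^{\bar A} + B$. For a result of this type, for a related connection form, see Cattaneo et al. [5, Theorem 2.6]. A more detailed exploration of the fake curvature would be of interest.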


2.8. Parallel-transport of horizontal paths

As before, $A$ and $\bar A$ are connections on a principal $G$-bundle $\pi : P \to M$, and $B$ is an $LH$-valued $\alpha$-equivariant 2-form on $P$ vanishing on vertical vectors. Also $\mathcal{P}X$ is the space of smooth paths $[0, 1] \to X$ in a space $X$, and $\mathcal{P}_{\bar A}P$ is the space of smooth $\bar A$-horizontal paths in $P$.

Our objective now is to express parallel-transport along paths in $\mathcal{P}M$ in terms of a smooth local section of the bundle $P \to M$:
\[ \sigma : U \to P \]
where $U$ is an open set in $M$. We will focus only on paths lying entirely inside $U$. The section $\sigma$ determines a section $\tilde\sigma$ for the bundle $\mathcal{P}_{\bar A}P \to \mathcal{P}M$: if $\gamma \in \mathcal{P}M$ then $\tilde\sigma(\gamma)$ is the unique $\bar A$-horizontal path in $P$, with initial point $\sigma(\gamma(0))$, which projects down to $\gamma$. Thus,
\[ \tilde\sigma(\gamma)(t) = \sigma(\gamma(t))\bar a(t), \tag{2.33} \]
for all $t \in [0, 1]$, where $\bar a(t) \in G$ satisfies the differential equation
\[ \bar a(t)^{-1}\bar a'(t) = -\mathrm{Ad}(\bar a(t)^{-1})\bar A((\sigma \circ \gamma)'(t)) \tag{2.34} \]
for $t \in [0, 1]$, and the initial value $\bar a(0)$ is $e$.

Recall that a tangent vector $V \in T_\gamma(\mathcal{P}M)$ is a smooth vector field along the path $\gamma$. Let us denote $\tilde\sigma(\gamma)$ by $\tilde\gamma$:
\[ \tilde\gamma \overset{\mathrm{def}}{=} \tilde\sigma(\gamma). \]
Note, for later use, that
\[ \tilde\gamma'(t) = \sigma_*(\gamma'(t))\bar a(t) + \underbrace{\tilde\gamma(t)\bar a(t)^{-1}\bar a'(t)}_{\text{vertical}}. \tag{2.35} \]
Now define the vector
\[ \tilde V = \tilde\sigma_*(V) \in T_{\tilde\gamma}(\mathcal{P}_{\bar A}P) \tag{2.36} \]
to be the vector $\tilde V$ in $T_{\tilde\gamma}(\mathcal{P}_{\bar A}P)$ whose initial value $\tilde V(0)$ is
\[ \tilde V(0) = \sigma_*(V(0)). \]
The existence and uniqueness of $\tilde V$ was proved in Lemma 2.1.

Fig. 3. The section $\tilde\sigma$ applied to a path $c$.

Note that $\tilde V(t) \in T_{\tilde\gamma(t)}P$ and $(\sigma_*V)(t) \in T_{\sigma(\gamma(t))}P$ are generally different vectors. However, $(\sigma_*V)(t)\bar a(t)$ and $\tilde V(t)$ are both in $T_{\tilde\gamma(t)}P$ and differ by a vertical vector because they have the same projection $V(t)$ under $\pi_*$:
\[ \tilde V(t) = (\sigma_*V)(t)\bar a(t) + \text{vertical vector}. \tag{2.37} \]

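Our objective now is to determine the $LG$-valued 1-form
\[ \omega_{(A,\bar A,B)} = \tilde\sigma^*\omega_{(A,B)} \tag{2.38} \]
on $\mathcal{P}M$, defined on any vector $V \in T_\gamma(\mathcal{P}M)$ by
\[ \omega_{(A,\bar A,B)}(V) = \omega_{(A,B)}(\tilde\sigma_*V). \tag{2.39} \]
We can now work out an explicit expression for this 1-form.

Proposition 2.5. With notation as above, and $V \in T_\gamma(\mathcal{P}M)$,
\[ \omega_{(A,\bar A,B)}(V) = \mathrm{Ad}(\bar a(1)^{-1})A_\sigma(V(1)) + \int_0^1 \mathrm{Ad}(\bar a(t)^{-1})\tau B_\sigma(\gamma'(t), V(t))\,dt, \tag{2.40} \]
where $C_\sigma$ denotes the pullback $\sigma^*C$ on $M$ of a form $C$ on $P$, and $\bar a : [0, 1] \to G$ describes parallel-transport along $\gamma$, i.e. satisfies
\[ \bar a(t)^{-1}\bar a'(t) = -\mathrm{Ad}(\bar a(t)^{-1})\bar A_\sigma(\gamma'(t)) \]
with initial condition $\bar a(0) = e$. The formula for $\omega_{(A,\bar A,B)}(V)$ can also be expressed as
\[
\begin{aligned}
\omega_{(A,\bar A,B)}(V) &= A_\sigma(V(0)) + [\mathrm{Ad}(\bar a(1)^{-1})(A_\sigma - \bar A_\sigma)(V(1)) - (A_\sigma - \bar A_\sigma)(V(0))] \\
&\quad + \int_0^1 \mathrm{Ad}(\bar a(t)^{-1})(F^{\bar A}_\sigma + \tau B_\sigma)(\gamma'(t), V(t))\,dt.
\end{aligned}
\tag{2.41}
\]
Note that in (2.41), the terms involving $\bar A_\sigma$ and $F^{\bar A}_\sigma$ cancel each other out.

Proof. From the definition of $\omega_{(A,B)}$ in (2.17) and (2.14), we see that we need only focus on the $B$ term. To this end we have, from (2.35) and (2.37):
\[
\begin{aligned}
B(\tilde\gamma'(t), \tilde V(t)) &= B(\sigma_*(\gamma'(t))\bar a(t) + \text{vertical}, (\sigma_*V)(t)\bar a(t) + \text{vertical}) \\
&= B(\sigma_*(\gamma'(t))\bar a(t), (\sigma_*V)(t)\bar a(t)) \\
&= \alpha(\bar a(t)^{-1})B_\sigma(\gamma'(t), V(t)).
\end{aligned}
\tag{2.42}
\]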


Now recall the relation (2.1)
\[ \tau(\alpha(g)h) = g\tau(h)g^{-1}, \quad \text{for all } g \in G \text{ and } h \in H, \]
which implies
\[ \tau(\alpha(g)K) = \mathrm{Ad}(g)\tau(K) \quad \text{for all } g \in G \text{ and } K \in LH. \]
As usual, we are denoting the derivatives of $\tau$ and $\alpha$ by $\tau$ and $\alpha$ again. Applying this to (2.42) we have
\[ \tau B(\tilde\gamma'(t), \tilde V(t)) = \mathrm{Ad}(\bar a(t)^{-1})\tau B_\sigma(\gamma'(t), V(t)), \]
and this yields the result.

Suppose
\[ \tilde\Gamma : [0, 1]^2 \to P : (t, s) \mapsto \tilde\Gamma(t, s) = \tilde\Gamma_s(t) = \tilde\Gamma^t(s) \]
is smooth, with each $\tilde\Gamma_s$ being $\bar A$-horizontal, and the path $s \mapsto \tilde\Gamma(0, s)$ being $A$-horizontal. Let $\Gamma = \pi \circ \tilde\Gamma$. We will need to use the bi-holonomy $g(t, s)$ which is specified as follows: parallel translate $\tilde\Gamma(0, 0)$ along $\Gamma_0|[0, t]$ by $\bar A$, then up the path $\Gamma^t|[0, s]$ by $A$, back along $\Gamma_s$-reversed by $\bar A$ and then down $\Gamma^0|[0, s]$ by $A$; then the resulting point is
\[ \tilde\Gamma(0, 0)g(t, s). \tag{2.43} \]

The path
\[ s \mapsto \tilde\Gamma_s \]
describes parallel transport of the initial path $\tilde\Gamma_0$ using the connection $\mathrm{ev}_0^*A$. In what follows we will compare this with the path
\[ s \mapsto \hat\Gamma_s \]
which is the parallel transport of $\hat\Gamma_0 = \tilde\Gamma_0$ using the connection $\mathrm{ev}_1^*A$. The following result describes the "difference" between these two connections.

Proposition 2.6. Suppose
\[ \tilde\Gamma : [0, 1]^2 \to P : (t, s) \mapsto \tilde\Gamma(t, s) = \tilde\Gamma_s(t) = \tilde\Gamma^t(s) \]
is smooth, with each $\tilde\Gamma_s$ being $\bar A$-horizontal, and the path $s \mapsto \tilde\Gamma(0, s)$ being $A$-horizontal. Then the parallel translate of $\tilde\Gamma_0$ by the connection $\mathrm{ev}_1^*A$ along the path $[0, s] \to \mathcal{P}M : u \mapsto \Gamma_u$, where $\Gamma = \pi \circ \tilde\Gamma$, results in $\tilde\Gamma_s g(1, s)$, with $g(1, s)$ being the "bi-holonomy" specified as in (2.43).

Proof. Let $\hat\Gamma_s$ be the parallel translate of $\tilde\Gamma_0$ by $\mathrm{ev}_1^*A$ along the path $[0, s] \to \mathcal{P}M : u \mapsto \Gamma_u$. Then the right endpoint $\hat\Gamma_s(1)$ traces out an $A$-horizontal path, starting at $\tilde\Gamma_0(1)$. Thus, $\hat\Gamma_s(1)$ is the result of parallel transporting $\tilde\Gamma(0, 0)$ by $\bar A$ along $\Gamma_0$ then up the path $\Gamma^1|[0, s]$ by $A$. If we then parallel transport $\hat\Gamma_s(1)$ back by $\bar A$ along $\Gamma_s|[0, 1]$-reversed then we obtain the initial point $\hat\Gamma_s(0)$. This point is of the form $\tilde\Gamma_s(0)b$, for some $b \in G$, and so
\[ \hat\Gamma_s = \tilde\Gamma_s b. \]


Then, parallel-transporting $\hat\Gamma_s(0)$ back down $\Gamma^0|[0, s]$-reversed, by $A$, produces the point $\tilde\Gamma(0, 0)b$. This shows that $b$ is the bi-holonomy $g(1, s)$.

Now we can turn to determining the parallel-transport process by the connection $\omega_{(A,B)}$. With $\tilde\Gamma$ as above, let now $\check\Gamma_s$ be the $\omega_{(A,B)}$-parallel-translate of $\tilde\Gamma_0$ along $[0, s] \to \mathcal{P}M : u \mapsto \Gamma_u$. Since $\check\Gamma_s$ and $\tilde\Gamma_s$ are both $\bar A$-horizontal and project by $\pi_*$ down to $\Gamma_s$, we have
\[ \check\Gamma_s = \hat\Gamma_s b_s, \]
for some $b_s \in G$. Since $\omega_{(A,B)} = \mathrm{ev}_1^*A + \tau(Z)$ applied to the $s$-derivative of $\check\Gamma_s$ is 0, and $\mathrm{ev}_1^*A$ applied to the $s$-derivative of $\hat\Gamma_s$ is 0, we have
\[ b_s^{-1}\partial_s b_s + \mathrm{Ad}(b_s^{-1})\tau Z(\partial_s\hat\Gamma_s) = 0. \tag{2.44} \]
Thus, $s \mapsto b_s$ describes parallel transport by $\theta_\sigma$, where the section $\sigma$ satisfies $\sigma \circ \Gamma = \hat\Gamma$. Since $\hat\Gamma_s = \tilde\Gamma_s g(1, s)$, we then have
\[
\begin{aligned}
\frac{db_s}{ds}b_s^{-1} &= -\mathrm{Ad}(g(1, s)^{-1})\tau Z(\partial_s\tilde\Gamma_s) \\
&= -\mathrm{Ad}(g(1, s)^{-1})\int_0^1 \tau B(\partial_t\tilde\Gamma(t, s), \partial_s\tilde\Gamma(t, s))\,dt.
\end{aligned}
\tag{2.45}
\]

To summarize:

Theorem 2.2. Suppose
\[ \tilde\Gamma : [0, 1]^2 \to P : (t, s) \mapsto \tilde\Gamma(t, s) = \tilde\Gamma_s(t) = \tilde\Gamma^t(s) \]
is smooth, with each $\tilde\Gamma_s$ being $\bar A$-horizontal, and the path $s \mapsto \tilde\Gamma(0, s)$ being $A$-horizontal. Then the parallel translate of $\tilde\Gamma_0$ by the connection $\omega_{(A,B)}$ along the path $[0, s] \to \mathcal{P}M : u \mapsto \Gamma_u$, where $\Gamma = \pi \circ \tilde\Gamma$, results in
\[ \tilde\Gamma_s g(1, s)\tau(h_0(s)), \tag{2.46} \]
with $g(1, s)$ being the "bi-holonomy" specified as in (2.43), and $s \mapsto h_0(s) \in H$ solving the differential equation
\[ \frac{dh_0(s)}{ds}h_0(s)^{-1} = -\alpha(g(1, s)^{-1})\int_0^1 B(\partial_t\tilde\Gamma(t, s), \partial_s\tilde\Gamma(t, s))\,dt \tag{2.47} \]
with initial condition $h_0(0)$ being the identity in $H$.

Let $\sigma$ be a smooth section of the bundle $P \to M$ in a neighborhood of $\Gamma([0, 1]^2)$. Let $a_t(s) \in G$ specify parallel transport by $A$ up the path $[0, s] \to M : v \mapsto \Gamma(t, v)$, i.e. the $A$-parallel-translate of $\sigma\Gamma(t, 0)$ up the path $[0, s] \to M : v \mapsto \Gamma(t, v)$ results in $\sigma(\Gamma(t, s))a_t(s)$. On the other hand, $\bar a_s(t)$ will specify parallel transport by $\bar A$ along $[0, t] \to M : u \mapsto \Gamma(u, s)$. Thus,
\[ \tilde\Gamma(t, s) = \sigma(\Gamma(t, s))a_0(s)\bar a_s(t). \tag{2.48} \]


The bi-holonomy is given by
\[ g(1, s) = a_0(s)^{-1}\bar a_s(1)^{-1}a_1(s)\bar a_0(1). \]
Let us look at parallel-transport along the path $s \mapsto \Gamma_s$, by the connection $\omega_{(A,B)}$, in terms of the trivialization $\sigma$. Let $\hat\Gamma_s \in \mathcal{P}_{\bar A}P$ be obtained by parallel transporting $\tilde\Gamma_0 = \tilde\sigma(\Gamma_0) \in \mathcal{P}_{\bar A}P$ along the path
\[ [0, s] \to M : u \mapsto \Gamma^0(u) = \Gamma(0, u). \]
This transport is described through a map
\[ [0, 1] \to G : s \mapsto c(s), \]
specified through
\[ \hat\Gamma_s = \tilde\sigma(\Gamma_s)c(s) = \tilde\Gamma_s a_0(s)^{-1}c(s). \tag{2.49} \]
Then $c(0) = e$ and
\[ c(s)^{-1}c'(s) = -\mathrm{Ad}(c(s)^{-1})\omega_{(A,\bar A,B)}(V(s)), \tag{2.50} \]
where $V_s \in T_{\Gamma_s}\mathcal{P}M$ is the vector field along $\Gamma_s$ given by
\[ V_s(t) = V(s, t) = \partial_s\Gamma(t, s) \quad \text{for all } t \in [0, 1]. \]
Equation (2.50), written out in more detail, is
\[ c(s)^{-1}c'(s) = -\mathrm{Ad}(c(s)^{-1})\left[\mathrm{Ad}(\bar a_s(1)^{-1})A_\sigma(V_s(1)) + \int_0^1 \mathrm{Ad}(\bar a_s(t)^{-1})\tau B_\sigma(\Gamma_s'(t), V_s(t))\,dt\right], \tag{2.51} \]
where $\bar a_s(t) \in G$ describes $\bar A_\sigma$-parallel-transport along $\Gamma_s|[0, t]$. By (2.46), $c(s)$ is given by
\[ c(s) = a_0(s)g(1, s)\tau(h_0(s)), \]
where $s \mapsto h_0(s)$ solves
\[ \frac{dh_0(s)}{ds}h_0(s)^{-1} = -\int_0^1 \alpha(\bar a_s(t)a_0(s)g(1, s))^{-1}B_\sigma(\partial_t\Gamma(t, s), \partial_s\Gamma(t, s))\,dt, \tag{2.52} \]
with initial condition $h_0(0)$ being the identity in $H$. The geometric meaning of $\bar a_s(t)a_0(s)$ is that it describes parallel-transport first by $A_\sigma$ up from $(0, 0)$ to $(0, s)$ and then to the right by $\bar A_\sigma$ from $(0, s)$ to $(t, s)$.

3. Two Categories from Plaquettes

In this section we introduce two categories motivated by the differential geometric framework we have discussed in the preceding sections. We show that the geometric framework naturally connects with certain category theoretic structures introduced by Ehresmann [9, 10] and developed further by Kelly and Street [12].


We work with the pair of Lie groups $G$ and $H$, along with maps $\tau$ and $\alpha$ satisfying (2.1), and construct two categories. These categories will have the same set of objects, and also the same set of morphisms. The set of objects is simply the group $G$:
\[ \mathrm{Obj} = G. \]
The set of morphisms is
\[ \mathrm{Mor} = G^4 \times H, \]
with a typical element denoted
\[ (a, b, c, d; h). \]
It is convenient to visualize a morphism as a plaquette labeled with elements of $G$ (Fig. 4).

To connect with the theory of the preceding sections, we should think of $a$ and $c$ as giving $\bar A$-parallel-transports, $d$ and $b$ as $A$-parallel-transports, and $h$ should be thought of as corresponding to $h_0(1)$ of Theorem 2.2. However, this is only a rough guide; we shall return to this matter later in this section.

For the category Vert, the source (domain) and target (co-domain) of a morphism are:
\[ s_{\mathrm{Vert}}(a, b, c, d; h) = a, \qquad t_{\mathrm{Vert}}(a, b, c, d; h) = c. \]
For the category Horz,
\[ s_{\mathrm{Horz}}(a, b, c, d; h) = d, \qquad t_{\mathrm{Horz}}(a, b, c, d; h) = b. \]
We define vertical composition, that is composition in Vert, using Fig. 5. In this figure, the upper morphism is being applied first and then the lower. Horizontal composition is specified through Fig. 6. In this figure, we have used the notation $\circ_{\mathrm{opp}}$ to stress that, as morphisms, it is the one to the left which is applied first and then the one to the right.

Our first observation is:

Proposition 3.1. Both Vert and Horz are categories, under the specified composition laws. In both categories, all morphisms are invertible.

Fig. 4. Plaquette (edge labels: $a$ bottom, $b$ right, $c$ top, $d$ left; $h$ in the interior).


Fig. 5. Vertical composition: for $c = a'$, the composite of $(a, b, c, d; h)$ and $(a', b', c', d'; h')$ is $(a, b'b, c', d'd; h(\alpha(d^{-1})h'))$.

Fig. 6. Horizontal composition (for $b = d'$): $(a, b, c, d; h) \circ_{\mathrm{opp}} (a', b', c', d'; h') = (a'a, b', c'c, d; (\alpha(a^{-1})h')h)$.

Fig. 7. Identity maps: the identity for Vert and the identity for Horz.


Proof. It is straightforward to verify that the composition laws are associative. The identity map $a \to a$ in Vert is $(a, e, a, e; e)$, and in Horz it is $(e, a, e, a; e)$. These are displayed in Fig. 7. The inverse of the morphism $(a, b, c, d; h)$ in Vert is $(c, b^{-1}, a, d^{-1}; \alpha(d)h^{-1})$; the inverse in Horz is $(a^{-1}, d, c^{-1}, b; \alpha(a)h^{-1})$.

The two categories are isomorphic, but it is best not to identify them. We use $\circ_H$ to denote horizontal composition, and $\circ_V$ to denote vertical composition.

We have seen earlier that if $A$, $\bar A$ and $B$ are such that $\omega_{(A,B)}$ reduces to $\mathrm{ev}_0^*A$ (for example, if $A = \bar A$ and $F^{\bar A} + \tau(B)$ is 0) then all plaquettes $(a, b, c, d; h)$ arising from the connections $A$ and $\omega_{(A,B)}$ satisfy
\[ \tau(h) = a^{-1}b^{-1}cd. \]
Motivated by this observation, we could consider those morphisms $(a, b, c, d; h)$ which satisfy
\[ \tau(h) = a^{-1}b^{-1}cd. \tag{3.1} \]
However, we can look at a broader class of morphisms as well. Suppose
\[ \mathbf{h} \mapsto z(\mathbf{h}) \in Z(G) \]
is a mapping of the morphisms in the category Horz or in Vert into the center $Z(G)$ of $G$, which carries composition of morphisms to products in $Z(G)$:
\[ z(\mathbf{h} \circ \mathbf{h}') = z(\mathbf{h})z(\mathbf{h}'). \]
Then we say that a morphism $\mathbf{h} = (a, b, c, d; h)$ is quasi-flat with respect to $z$ if
\[ \tau(h) = (a^{-1}b^{-1}cd)z(\mathbf{h}). \tag{3.2} \]
A larger class of morphisms could also be considered, by replacing $Z(G)$ by an abelian normal subgroup, but we shall not explore this here.

Proposition 3.2. Composition of quasi-flat morphisms is quasi-flat. Thus, the quasi-flat morphisms form a subcategory in both Horz and Vert.

Proof. Let $\mathbf{h} = (a, b, c, d; h)$ and $\mathbf{h}' = (a', b', c', d'; h')$ be quasi-flat morphisms in Horz, such that the horizontal composition $\mathbf{h}' \circ_H \mathbf{h}$ is defined, i.e. $b = d'$. Then
\[ \mathbf{h}' \circ_H \mathbf{h} = (a'a, b', c'c, d; \{\alpha(a^{-1})h'\}h). \]
Applying $\tau$ to the last component in this, we have
\[
\begin{aligned}
a^{-1}\tau(h')a\tau(h) &= a^{-1}(a'^{-1}b'^{-1}c'd')a(a^{-1}b^{-1}cd)z(\mathbf{h})z(\mathbf{h}') \\
&= ((a'a)^{-1}b'^{-1}(c'c)d)z(\mathbf{h}' \circ_H \mathbf{h}),
\end{aligned}
\tag{3.3}
\]
which says that $\mathbf{h}' \circ_H \mathbf{h}$ is quasi-flat.


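Now suppose $\mathbf{h} = (a, b, c, d; h)$ and $\mathbf{h}' = (a', b', c', d'; h')$ are quasi-flat morphisms in Vert, such that the vertical composition $\mathbf{h}' \circ_V \mathbf{h}$ is defined, i.e. $c = a'$. Then
\[ \mathbf{h}' \circ_V \mathbf{h} = (a, b'b, c', d'd; h\{\alpha(d^{-1})h'\}). \]
Applying $\tau$ to the last component in this, we have
\[
\begin{aligned}
\tau(h)d^{-1}\tau(h')d &= (a^{-1}b^{-1}cd)d^{-1}(a'^{-1}b'^{-1}c'd')d\,z(\mathbf{h})z(\mathbf{h}') \\
&= (a^{-1}(b'b)^{-1}c'd'd)z(\mathbf{h}' \circ_V \mathbf{h}),
\end{aligned}
\tag{3.4}
\]
which says that $\mathbf{h}' \circ_V \mathbf{h}$ is quasi-flat.

For a morphism $\mathbf{h} = (a, b, c, d; h)$ we set
\[ \tau(\mathbf{h}) = \tau(h). \]
If $\mathbf{h} = (a, b, c, d; h)$ and $\mathbf{h}' = (a', b', c', d'; h')$ are morphisms then we say that they are $\tau$-equivalent,
\[ \mathbf{h} =_\tau \mathbf{h}', \]
if $a = a'$, $b = b'$, $c = c'$, $d = d'$, and $\tau(h) = \tau(h')$.

Proposition 3.3. If $\mathbf{h}, \mathbf{h}', \mathbf{h}'', \mathbf{h}'''$ are quasi-flat morphisms for which the compositions on both sides of (3.5) are meaningful, then
\[ (\mathbf{h}''' \circ_H \mathbf{h}'') \circ_V (\mathbf{h}' \circ_H \mathbf{h}) =_\tau (\mathbf{h}''' \circ_V \mathbf{h}') \circ_H (\mathbf{h}'' \circ_V \mathbf{h}). \tag{3.5} \]
Thus, the structures we are using here correspond to double categories as described by Kelly and Street [12, Sec. 1.1].

Proof. This is a lengthy but straightforward verification. We refer to Fig. 8. For a morphism $\mathbf{h} = (a, b, c, d; h)$, let us write
\[ \tau_\partial(\mathbf{h}) = a^{-1}b^{-1}cd. \]
For the left-hand side of (3.5), we have
\[
\begin{aligned}
\mathbf{h}' \circ_H \mathbf{h} &= (a'a, b', c'c, d; \{\alpha(a^{-1})h'\}h), \\
\mathbf{h}''' \circ_H \mathbf{h}'' &= (c'c, b''', f'f, d''; \{\alpha(c^{-1})h'''\}h''), \\
\mathbf{h}^* \overset{\mathrm{def}}{=} (\mathbf{h}''' \circ_H \mathbf{h}'') \circ_V (\mathbf{h}' \circ_H \mathbf{h}) &= (a'a, b'''b', f'f, d''d; h^*),
\end{aligned}
\tag{3.6}
\]
where
\[ h^* = \{\alpha(a^{-1})h'\}h\{\alpha(d^{-1}c^{-1})h'''\}\{\alpha(d^{-1})h''\}. \tag{3.7} \]
Applying $\tau$ and using quasi-flatness of each factor gives
\[
\begin{aligned}
\tau(h^*) &= a^{-1}\tau_\partial(\mathbf{h}')z(\mathbf{h}')a \cdot \tau_\partial(\mathbf{h})z(\mathbf{h}) \cdot d^{-1}c^{-1}\tau_\partial(\mathbf{h}''')cd \cdot z(\mathbf{h}''') \cdot d^{-1}\tau_\partial(\mathbf{h}'')d\,z(\mathbf{h}'') \\
&= (a'a)^{-1}(b'''b')^{-1}(f'f)(d''d)z(\mathbf{h}^*),
\end{aligned}
\tag{3.8}
\]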


Fig. 8. Consistency of horizontal and vertical compositions.

where we have used the fact, from (2.1), that $\alpha$ is converted to a conjugation on applying $\tau$, and the last line follows after algebraic simplification. Thus,
\[ \tau(h^*) = \tau_\partial(\mathbf{h}^*)z(\mathbf{h}^*). \tag{3.9} \]
On the other hand, by an entirely similar computation, we obtain
\[ \mathbf{h}_* \overset{\mathrm{def}}{=} (\mathbf{h}''' \circ_V \mathbf{h}') \circ_H (\mathbf{h}'' \circ_V \mathbf{h}) = (a'a, b'''b', f'f, d''d; h_*), \tag{3.10} \]
where
\[ h_* = \{\alpha(a^{-1})h'\}\{\alpha(a^{-1}b^{-1})h'''\}h\{\alpha(d^{-1})h''\}. \tag{3.11} \]
Applying $\tau$ to this yields, after using (2.1) and computation,
\[ \tau(h_*) = \tau_\partial(\mathbf{h}_*)z(\mathbf{h}_*). \]
Since $\tau(h^*)$ is equal to $\tau(h_*)$, the result (3.5) follows.

Ideally, a discrete model would be the exact "integrated" version of the differential geometric connection $\omega_{(A,B)}$. However, it is not clear if such an ideal transcription is feasible for any such connection $\omega_{(A,B)}$ on the path-space bundle. To make contact with the differential picture we have developed in earlier sections, we should compare quasi-flat morphisms with parallel translation by $\omega_{(A,B)}$ in the case where $B$ is such that $\omega_{(A,B)}$ reduces to $\mathrm{ev}_0^*A$ (for instance, if $A = \bar A$ and the fake curvature $F^{\bar A} + \tau(B)$ vanishes); more precisely, the $h$ for quasi-flat morphisms (taking all $z(\mathbf{h})$ to be the identity) corresponds to the quantity $h_0(1)$ specified through the differential Eq. (2.47).

It would be desirable to have a more thorough relationship between the discrete structures and the differential geometric constructions, even in the case when $z(\cdot)$ is not the identity. We hope to address this in future work.
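The composition laws, the inverses of Proposition 3.1 and the interchange relation (3.5) can be checked mechanically on a small example. The following Python sketch is only an illustration under assumptions that are not taken from the text: it uses $G = H = S_3$ (permutations of three letters), $\tau$ the identity map and $\alpha(g)h = ghg^{-1}$, a choice that satisfies (2.1), and it takes every $z(\mathbf{h})$ to be the identity. All function names (`horz`, `vert`, `flat`, `interchange_holds`) are hypothetical.

```python
from itertools import permutations

# Illustrative assumption: G = H = S_3, tau = identity, alpha = conjugation.
S3 = list(permutations(range(3)))
E = (0, 1, 2)  # identity permutation


def mul(p, q):
    """Group product pq (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(3))


def inv(p):
    """Inverse permutation."""
    r = [0, 0, 0]
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)


def tau(h):
    return h


def alpha(g, h):
    """alpha(g)h = g h g^{-1}, so tau(alpha(g)h) = g tau(h) g^{-1} as in (2.1)."""
    return mul(mul(g, h), inv(g))


def boundary(p):
    """tau_partial(a, b, c, d; h) = a^{-1} b^{-1} c d."""
    a, b, c, d, _ = p
    return mul(mul(mul(inv(a), inv(b)), c), d)


def flat(a, b, c, d):
    """Quasi-flat plaquette with z = e; tau = id here, so h is the boundary word."""
    return (a, b, c, d, boundary((a, b, c, d, E)))


def is_flat(p):
    return tau(p[4]) == boundary(p)


def horz(p2, p1):
    """p2 o_H p1 = (a'a, b', c'c, d; (alpha(a^{-1})h')h), defined when b = d'."""
    a, b, c, d, h = p1
    a2, b2, c2, d2, h2 = p2
    assert b == d2, "horizontal composition undefined"
    return (mul(a2, a), b2, mul(c2, c), d, mul(alpha(inv(a), h2), h))


def vert(p2, p1):
    """p2 o_V p1 = (a, b'b, c', d'd; h(alpha(d^{-1})h')), defined when c = a'."""
    a, b, c, d, h = p1
    a2, b2, c2, d2, h2 = p2
    assert c == a2, "vertical composition undefined"
    return (a, mul(b2, b), c2, mul(d2, d), mul(h, alpha(inv(d), h2)))


def interchange_holds(a, b, c, d, a1, b1, c1, b2, f, d2, b3, f1):
    """Check (3.5) on a 2x2 grid of quasi-flat plaquettes laid out as in Fig. 8."""
    p0 = flat(a, b, c, d)       # lower left
    p1 = flat(a1, b1, c1, b)    # lower right: its left edge d' equals b
    p2 = flat(c, b2, f, d2)     # upper left: its bottom edge a'' equals c
    p3 = flat(c1, b3, f1, b2)   # upper right: a''' = c', d''' = b''
    lhs = vert(horz(p3, p2), horz(p1, p0))
    rhs = horz(vert(p3, p1), vert(p2, p0))
    return is_flat(lhs) and lhs[:4] == rhs[:4] and tau(lhs[4]) == tau(rhs[4])


# Exhaust the choices of the first six edge labels with the remaining six fixed.
assert all(interchange_holds(*(edges + tuple(S3))) for edges in
           ((x, y, S3[1], S3[2], S3[3], S3[4]) for x in S3 for y in S3))
```

Because $\tau$ is injective in this toy model, $\tau$-equivalence is plain equality and a quasi-flat plaquette is determined by its boundary; a model with nontrivial $\ker\tau$ would be needed to see the difference between $=_\tau$ and $=$.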


4. Concluding Remarks

We have constructed in (2.17) a connection $\omega_{(A,B)}$ from a connection $A$ on a principal $G$-bundle $P$ over $M$, and a 2-form $B$ taking values in the Lie algebra of a second structure group $H$. The connection $\omega_{(A,B)}$ lives on a bundle of $\bar A$-horizontal paths, where $\bar A$ is another connection on $P$ which may be viewed as governing the gauge theoretic interaction along each curve. Associated to each path $s \mapsto \Gamma_s$ of paths, beginning with an initial path $\Gamma_0$ and ending in a final path $\Gamma_1$ in $M$, is a parallel transport process by the connection $\omega_{(A,B)}$. We have studied conditions (in Theorem 2.1) under which this transport is "surface-determined", that is, depends more on the surface $\Gamma$ swept out by the path of paths than on the specific parametrization, given by $\Gamma$, of this surface. We also described connections over the path space of $M$ with values in the Lie algebra $LH$ obtained from the $\bar A$ and $B$. We developed an "integrated" version, or a discrete version, of this theory, which is most conveniently formulated in terms of categories of quadrilateral diagrams. These diagrams, or morphisms, arise from parallel transport by $\omega_{(A,B)}$ when $B$ has a special form which makes the parallel transports surface-determined.

Our results and constructions extend a body of literature ranging from differential geometric investigations to category theoretic ones. We have developed both aspects, clarifying their relationship.

Acknowledgments

We are grateful to the anonymous referee for useful comments and pointing us to the reference [12]. Our thanks to Urs Schreiber for the reference [16]. We also thank Swarnamoyee Priyajee Gupta for preparing some of the figures. ANS acknowledges research support from US NSF grant DMS-0601141. AL acknowledges research support from Department of Science and Technology, India under Project No. SR/S2/HEP-0006/2008.

References

[1] J. Baez, Higher Yang–Mills theory, http://arxiv.org/abs/hep-th/0206130.
[2] J. Baez and U. Schreiber, Higher gauge theory, http://arXiv:hep-th/0511710v2.
[3] J. Baez and U. Schreiber, Higher gauge theory II: 2-connections on 2-bundles, http://arxiv.org/abs/hep-th/0412325.
[4] L. Breen and W. Messing, Differential geometry of gerbes, http://arxiv.org/abs/math/0106083.
[5] A. S. Cattaneo, P. Cotta-Ramusino and M. Rinaldi, Loop and path spaces and four-dimensional BF theories: Connections, holonomies and observables, Comm. Math. Phys. 204 (1999) 493–524.
[6] D. Chatterjee, On gerbs, Ph.D. thesis, University of Cambridge (1998).
[7] K.-T. Chen, Algebras of iterated path integrals and fundamental groups, Trans. Amer. Math. Soc. 156 (1971) 359–379.
[8] K.-T. Chen, Iterated integrals of differential forms and loop space homology, Ann. of Math. 97(2) (1973) 217–246.
[9] C. Ehresmann, Catégories structurées, Ann. Sci. École Norm. Sup. 80 (1963) 349–425.
[10] C. Ehresmann, Catégories et structures (Dunod, Paris, 1965).
[11] F. Girelli and H. Pfeiffer, Higher gauge theory — Differential versus integral formulation, J. Math. Phys. 45 (2004) 3949–3971; http://arxiv.org/abs/hep-th/0309173.
[12] G. M. Kelly and R. Street, Review of the elements of 2-categories, in Category Seminar (Proc. Sem., Sydney, 1972/1973), Lecture Notes in Math., Vol. 420 (Springer, Berlin, 1974), pp. 75–103.
[13] A. Lahiri, Surface holonomy and gauge 2-group, Int. J. Geom. Methods Mod. Phys. 1 (2004) 299–309.
[14] M. Murray, Bundle gerbes, J. London Math. Soc. 54 (1996) 403–416.
[15] H. Pfeiffer, Higher gauge theory and a non-abelian generalization of 2-form electrodynamics, Ann. Phys. 308 (2003) 447–477; http://arxiv.org/abs/hep-th/0304074.
[16] A. Stacey, Comparative smootheology; http://arxiv.org/abs/0802.2225.
[17] O. Viro, http://www.pdmi.ras.ru/~olegviro/talks.html.

October 12, J070-S0129055X10004132

2010 10:1 WSPC/S0129-055X

148-RMP

Reviews in Mathematical Physics Vol. 22, No. 9 (2010) 1061–1097
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004132

MODULI SPACES OF G2 MANIFOLDS

SERGEY GRIGORIAN
Max-Planck-Institut für Gravitationsphysik (Albert-Einstein-Institut), Am Mühlenberg 1, D-14476 Golm, Germany
and
Simons Center for Geometry and Physics, Stony Brook University, Stony Brook, NY 11794, USA
[email protected]

Received 27 January 2010
Revised 14 June 2010

This paper is a review of current developments in the study of moduli spaces of $G_2$ manifolds. $G_2$ manifolds are seven-dimensional manifolds with the exceptional holonomy group $G_2$. Although they are odd-dimensional, in many ways they can be considered as an analogue of Calabi–Yau manifolds in seven dimensions. They play an important role in physics as natural candidates for supersymmetric vacuum solutions of M-theory compactifications. Despite the physical motivation, many of the results are of purely mathematical interest. Here we cover the basics of $G_2$ manifolds, local deformation theory of $G_2$ structures and the local geometry of the moduli spaces of $G_2$ structures.

Keywords: Special holonomy; moduli space; M-theory.

Mathematics Subject Classification 2010: 53C25, 53C29, 53Z05

1. Introduction

Ever since antiquity there has been a very close relationship between physics and geometry. Originally, in Timaeus, Plato related four of the five Platonic solids (tetrahedron, hexahedron, octahedron and icosahedron) to the elements fire, earth, air and water, respectively, while the fifth solid, the dodecahedron, was the quintessence of which the cosmos itself is made. Later, Isaac Newton's Laws of Motion and Theory of Gravitation gave a precise mathematical framework in which the motion of objects can be calculated. However, Albert Einstein's General Relativity made it very explicit that the physics of spacetime is determined by its geometry. More recently, this fundamental relationship has been taken to a new level with the development of String and M-theory.

Over the past 25 years, superstring theory has emerged as a successful candidate for the role of a theory that would unify gravity with other interactions. It was later discovered that all five superstring theories can be obtained as special limits of a more general 11-dimensional theory known as M-theory and, moreover, that the low energy limit of M-theory is 11-dimensional supergravity [44, 46]. The complete formulation of M-theory is, however, not known yet.


One of the key features of String and M-theory is that these theories are formulated in 10- and 11-dimensional spacetimes, respectively. One of the techniques to relate this to the visible four-dimensional world is to assume that the remaining six or seven dimensions are curled up as a small, compact, so-called internal space. This is known as compactification. Such a procedure also leads to a remarkable interrelationship between physics and geometry, since the effective physical content of the resulting four-dimensional theory is determined by the geometry of the internal space. Usually the full multidimensional spacetime is regarded as a direct product $M_4 \times X$, where $M_4$ is a four-dimensional non-compact manifold with Lorentzian signature $(-+++)$ and $X$ is a compact six- or seven-dimensional Riemannian manifold. In general, the parameters that define the geometry of the internal space give rise to massless scalar fields known as moduli, and the properties of the moduli space are determined by the class of spaces used in the compactification.

The properties of the internal space in String and M-theory compactifications are governed by physical considerations. A key ingredient of these theories is supersymmetry [45]. Supersymmetry is a physical symmetry between particles whose spins differ by $\frac{1}{2}$, that is, between integer spin bosons and half-integer spin fermions. Mathematically, bosons are represented as functions or tensors and fermions as spinors. When looking for a supersymmetric vacuum for which the metric is the only non-zero field, that is, a Ricci-flat solution that is invariant under supersymmetry transformations, it turns out that a necessary requirement is the existence of a covariantly constant, or parallel, spinor. That is, there must exist a non-trivial spinor $\eta$ on the Riemannian manifold $X$ that satisfies

$$\nabla \eta = 0 \tag{1.1}$$

where $\nabla$ is the relevant spinor covariant derivative [8]. This condition implies that $\eta$ is invariant under parallel transport. Properties of parallel transport on a Riemannian manifold are closely related to the concept of holonomy. Consider a vector $v$ at some point $x$ on $X$. Using the natural Levi–Civita connection that comes from the Riemannian metric, we can parallel transport $v$ along paths in $X$. In particular, consider a closed contractible path $\gamma$ based at $x$. As shown in Fig. 1, if we parallel transport $v$ along $\gamma$, then the new vector $v'$ which we get will necessarily have the same magnitude as the original vector $v$, but otherwise it does not have to be the same. This gives the notion of the holonomy group. Below we give the precise definition.

Definition 1. Let $(X, g)$ be a Riemannian manifold of dimension $n$ with metric $g$ and corresponding Levi–Civita connection $\nabla$, and fix a point $x \in X$. Let $\gamma : [0, 1] \to X$ be a loop based at $x$, that is, a piecewise-smooth path such that $\gamma(0) = \gamma(1) = x$. The parallel transport map $P_\gamma : T_x X \to T_x X$ is then an invertible linear map which preserves $g$ and hence lies in $O(n)$ (in $SO(n)$ if $X$ is orientable). Define the Riemannian holonomy group $\mathrm{Hol}_x(X, g)$ of $\nabla$ based at $x$ to be

$$\mathrm{Hol}_x(X, g) = \{P_\gamma : \gamma \text{ is a loop based at } x\} \subset O(n).$$


Moduli Spaces of G2 Manifolds


Fig. 1. Parallel transport of a vector.
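The behaviour sketched in Fig. 1 can be made concrete on the unit sphere $S^2 \subset \mathbb{R}^3$, an example that is not part of the original text. The following minimal sketch transports a tangent vector once around a circle of constant colatitude by repeatedly projecting onto the next tangent plane, which is a first-order discretization of Levi–Civita transport for the induced metric. The function name `transport_around_latitude`, the step count and the choice of loop are assumptions made for illustration only.

```python
import math

def transport_around_latitude(colatitude, v0, steps=20000):
    """Approximately parallel transport v0 once around the circle of constant
    colatitude on the unit sphere in R^3.

    Each step moves to the next point of the loop and projects the vector
    onto the tangent plane there (first-order scheme, so the result carries
    a small discretization error).
    """
    v = list(v0)
    for k in range(1, steps + 1):
        angle = 2.0 * math.pi * k / steps
        # Point on the loop; on the unit sphere it is also the unit normal.
        x = (math.sin(colatitude) * math.cos(angle),
             math.sin(colatitude) * math.sin(angle),
             math.cos(colatitude))
        normal_part = sum(vi * xi for vi, xi in zip(v, x))
        v = [vi - normal_part * xi for vi, xi in zip(v, x)]
    return v

# Loop at colatitude 60 degrees, based at (sin 60, 0, cos 60).
v_start = (0.0, 1.0, 0.0)          # unit tangent vector at the base point
v_end = transport_around_latitude(math.pi / 3, v_start)
length = math.sqrt(sum(c * c for c in v_end))
```

For this loop the enclosed solid angle is $2\pi(1 - \cos 60^\circ) = \pi$, so the transported vector should come back with (approximately) the same length but pointing in the opposite direction, `v_end ≈ -v_start`: the magnitude is preserved while the direction is not.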

If the manifold $X$ is connected, then it is easy to see that the holonomy group is independent of the base point (up to conjugation), and can hence be defined for the whole manifold. Parallel transport is initially defined for vectors, but can then be naturally extended to other objects like tensors and spinors, with the holonomy group acting on these objects via the relevant representations. Now going back to the covariantly constant spinor $\eta$, (1.1) implies that $\eta$ is invariant under the action of the holonomy group. This shows that the spinor representation of $\mathrm{Hol}(X, g)$ must contain the trivial representation. For $\mathrm{Hol}(X, g) = SO(n)$, this is not possible, since the spinor representation then contains no trivial subrepresentation, so $\mathrm{Hol}(X, g)$ must be a proper subgroup of $SO(n)$. Hence the condition (1.1) implies a reduced holonomy group. Thus, Ricci-flat special holonomy manifolds occur very naturally in String and M-theory.

As shown by Berger [9], the list of possible special holonomy groups is very limited. In particular, if $X$ is simply-connected, and neither locally a product nor symmetric, the only possibilities are given in Fig. 2. In this list manifolds with holonomy $SU(k)$, $Sp(k)$, $G_2$ and $Spin(7)$ are Ricci-flat. Moreover, these groups are subgroups of $SO(n)$ and are simply-connected. This implies that manifolds with these holonomy groups always admit a spin structure ([30, Proposition 3.6.2]). These are also precisely the manifolds that admit a parallel spinor. Kähler manifolds only admit parallel projective spinors, that is, a line subbundle of the spinor bundle.

Geometry       Holonomy    Dimension
Kähler         U(k)        2k
Calabi–Yau     SU(k)       2k
HyperKähler    Sp(k)       4k
Exceptional    G2          7
Exceptional    Spin(7)     8

Fig. 2. List of special holonomy groups.


Thus, for a Ricci-flat supersymmetric vacuum in a 10-dimensional theory, $X$ has to be six-dimensional in order to reduce to four dimensions, and is hence necessarily a Calabi–Yau manifold. Similarly, for an 11-dimensional theory, seven-dimensional manifolds with $G_2$ holonomy arise naturally. We have thus seen that even rather simple physical requirements restrict the geometry of the manifold $X$ to rather special classes. In particular, the study of Calabi–Yau manifolds has been crucial in the development of String Theory, and in fact some very important discoveries in the theory of Calabi–Yau manifolds have been made thanks to advances in the physics. One such major discovery is Mirror Symmetry [41, 27]. This symmetry first appeared in String Theory, where evidence was found that conformal field theories (CFTs) related to compactifications on a Calabi–Yau manifold with Hodge numbers $(h^{1,1}, h^{2,1})$ are equivalent to CFTs on a Calabi–Yau manifold with Hodge numbers $(h^{2,1}, h^{1,1})$. Mirror symmetry is currently a powerful tool both for calculations in String Theory and in the study of Calabi–Yau manifolds and their moduli spaces.

In the mathematical literature, $G_2$ holonomy first appeared in Berger's list of special holonomy groups in 1955 [9]. In 1966, Bonan showed that manifolds with $G_2$ holonomy are Ricci-flat. It was known from general theory that having a holonomy group $G$ is equivalent to having a torsion-free $G$-structure, so it was natural to study $G_2$ structures on manifolds to get a better understanding of $G_2$ holonomy. The different classes of $G_2$ structures were explored by Fernández and Gray in their 1982 paper [18]. In particular, they showed that a torsion-free $G_2$ structure is equivalent to the $G_2$-invariant three-form $\varphi$ being closed and co-closed. It was not known whether the group $G_2$ (or indeed $Spin(7)$) actually appears as a non-symmetric holonomy group until 1987, when Bryant [12] proved the existence of metrics with $G_2$ and $Spin(7)$ holonomy.
In a later paper, Bryant and Salamon [11] constructed complete metrics with $G_2$ holonomy. However, the first compact examples of $G_2$ holonomy manifolds were constructed by Joyce in 1996 [28, 29]. These examples are based on quotients $T^7/\Gamma$ where $\Gamma$ is a finite group. Such quotient spaces usually exhibit singularities, and Joyce showed that it is possible to resolve these singularities in such a way as to get a smooth, compact manifold with $G_2$ holonomy. Since then, a number of other types of constructions have been found, in particular the construction by Kovalev [35], where a compact $G_2$ manifold is obtained by gluing together two non-compact asymptotically cylindrical Riemannian manifolds with holonomy $SU(3)$.

In the $G_2$ holonomy compactification approach to M-theory, the physical content of the four-dimensional theory is given by the moduli of $G_2$ holonomy manifolds. A review of the role of $G_2$ manifolds in M-theory is given by Acharya and Gukov [2] and by Duff [17]. Such a compactification of M-theory is in many ways analogous to Calabi–Yau compactifications in String Theory, where much progress has been made through the study of the Calabi–Yau moduli spaces. In particular, as was shown in [14, 40], the moduli space of complex structures and the


complexified moduli space of Kähler structures are both, in fact, Kähler manifolds. Moreover, both have a special geometry: that is, both have a line bundle whose first Chern class coincides with the Kähler class. However, until recently, the structure of the moduli space of $G_2$ holonomy manifolds had not been studied in as much detail. Generally, it turns out that the study of $G_2$ manifolds is quite difficult. Unlike the study of Calabi–Yau manifolds, where the machinery of algebraic geometry has been used with great success, in the case of $G_2$ manifolds there is no analogue, so analytical rather than algebraic study is needed.

In this review, we aim to give an overview of what is currently known about $G_2$ moduli spaces and the corresponding deformations of $G_2$ structures. We first give an introduction to the properties of the group $G_2$: definitions and representations. Then we look at general properties of $G_2$ structures. Finally we move on to properties of $G_2$ moduli spaces. Note that here we will only be looking at smooth compact $G_2$ manifolds. Properties of the non-compact asymptotically cylindrical $G_2$ manifolds have recently been studied by Kovalev and Nordström [36] and by Nordström [38], while the properties of $G_2$ manifolds with conical singularities have been studied by Karigiannis [33].

2. The Group G2

2.1. Automorphisms of octonions

The group $G_2$ is the smallest of the five exceptional Lie groups, the others being $F_4$, $E_6$, $E_7$ and $E_8$. Surprisingly, all of these Lie groups are related to the octonions, but $G_2$ is especially close. So let us first give a few facts about the octonions. The eight-dimensional algebra of octonions, denoted by $\mathbb{O}$, is the largest possible normed division algebra. The others of course are the real numbers $\mathbb{R}$, the complex numbers $\mathbb{C}$ and the quaternions $\mathbb{H}$. Following Baez [6], it turns out that division algebras can be defined using the notion of triality.
Given three real vector spaces $U, V, W$, a triality is a non-degenerate trilinear map $t : U \times V \times W \to \mathbb{R}$. Non-degenerate here means that for any fixed non-zero elements of $U$ and $V$, the induced functional on $W$ is non-zero, and similarly for the other two pairs of spaces. Hence, $t$ also defines a bilinear map $m : U \times V \to W^*$. For each fixed non-zero element of $U$, this map defines an isomorphism between $V$ and $W^*$, and for each fixed non-zero element of $V$, an isomorphism between $U$ and $W^*$. Hence these three spaces are isomorphic to each other, and if we choose to identify non-zero elements $e_1 \in U$, $e_2 \in V$, and $e_1 e_2 \in W^*$, we can identify the spaces $U, V, W$ with each other, and we can say that $m$ now defines multiplication on $U$ with identity element $e = e_1 = e_2 = e_1 e_2$. Note that in particular, the existence of a non-degenerate trilinear map implies that the original vector spaces $U, V, W$ are all of the same dimension.


Due to the non-degeneracy of the original triality, multiplication by a fixed non-zero element is an isomorphism, so in fact $U$ is a division algebra. Assuming further that $U, V, W$ are inner product spaces, if the triality map satisfies $|t(u, v, w)| \le \|u\|\,\|v\|\,\|w\|$ and is such that for all $u, v$ there exists a non-zero $w$ for which the bound is attained (and similarly for cyclic permutations of $u, v, w$), then we get a normed division algebra. The converse is also true: any normed division algebra defines such a triality. As discussed in detail by Baez [6], on $\mathbb{R}^n$ it is possible to construct bilinear maps $m_n$ involving the vector and spinor representations of $Spin(n)$:

$$m_n : V_n \times S_n^{\pm} \to S_n^{\mp} \quad \text{for } n = 0, 4 \pmod 8 \tag{2.1a}$$

$$m_n : V_n \times S_n \to S_n \quad \text{otherwise} \tag{2.1b}$$

where $V_n$ is the vector representation of $SO(n)$ and $S_n^{(\pm)}$ are the (left- and right-handed) spinor representations. The spinor representations in (2.1) are self-dual, so in principle, by dualizing the maps in (2.1), we could obtain trilinear maps into $\mathbb{R}$. However, in order to obtain trialities, these maps have to be non-degenerate, and hence the dimensions of the relevant representations must agree. This happens only for $n = 1, 2, 4, 8$, and each of these trialities gives a normed division algebra of the corresponding dimension:

$$\begin{aligned}
t_1 &: V_1 \times S_1 \times S_1 \to \mathbb{R} &&\Rightarrow\ \mathbb{R},\\
t_2 &: V_2 \times S_2 \times S_2 \to \mathbb{R} &&\Rightarrow\ \mathbb{C},\\
t_4 &: V_4 \times S_4^{+} \times S_4^{-} \to \mathbb{R} &&\Rightarrow\ \mathbb{H},\\
t_8 &: V_8 \times S_8^{+} \times S_8^{-} \to \mathbb{R} &&\Rightarrow\ \mathbb{O}.
\end{aligned} \tag{2.2}$$

This way, via the trialities we obtain all of the normed division algebras. In general, suppose we have a triality $t : U_1 \times U_2 \times U_3 \to \mathbb{R}$. Then to define a normed division algebra from $t$, we fix two vectors in two of the three spaces. Hence the automorphism group of the division algebra is the subgroup of the automorphism group of the triality that fixes these two vectors. For $t_8$ the automorphism group of the triality turns out to be $Spin(8)$, while $G_2$ is defined as the automorphism group of the corresponding octonion algebra. Thus we have

Definition 2. The group $G_2$ is the automorphism group of the octonion algebra.

Since $G_2$ is the automorphism group of the octonions, it is the subgroup of $Spin(8)$ (the automorphism group of the triality $t_8$) that preserves unit vectors in $V_8$ and $S_8^{+}$. As explained by Baez in [6], the subgroup of $Spin(8)$ that fixes a unit vector in $V_8$ is $Spin(7)$. Moreover, if the representation $S_8^{+}$ is restricted to $Spin(7)$, we get the spinor representation $S_7$. Therefore, $G_2$ is the subgroup of $Spin(7)$ that fixes a unit vector in $S_7$. In this representation, $Spin(7)$ acts transitively on the unit sphere $S^7$, so we have

$$Spin(7)/G_2 \cong S^7. \tag{2.3}$$

Hence we have the following result.


Proposition 3. The group $G_2$ has dimension 14.

Proof. From (2.3), $\dim G_2 = \dim Spin(7) - \dim S^7 = 21 - 7 = 14$.

The automorphism group fixes the identity, so in fact $G_2$ acts non-trivially on octonions that are orthogonal to the identity, the imaginary octonions, denoted by $\mathrm{Im}(\mathbb{O})$, and thus we get a natural seven-dimensional representation of $G_2$. A closer look at this representation reveals another description of $G_2$. Using octonion multiplication, we can define a cross product on $\mathrm{Im}(\mathbb{O})$ by

$$a \times b = \mathrm{Im}(ab) = \frac{1}{2}(ab - ba). \tag{2.4}$$

But $G_2$ preserves octonion multiplication, hence any element of $G_2$ preserves the seven-dimensional cross product. Alternatively, (2.4) can be written as

$$a \times b = ab + \langle a, b \rangle \tag{2.5}$$

where $\langle \cdot, \cdot \rangle$ is the octonionic inner product, in general defined by

$$\langle a, b \rangle = \frac{1}{2}(a b^{*} + b a^{*}).$$

Also, it can be shown that

$$\langle a, b \rangle = -\frac{1}{6}\,\mathrm{Tr}\big(a \times (b \times \cdot)\big). \tag{2.6}$$

Therefore, from (2.5), multiplication of imaginary octonions can be defined in terms of the cross product, hence any transformation preserving the cross product preserves multiplication on $\mathrm{Im}(\mathbb{O})$, and is thus in $G_2$. So, $G_2$ is precisely the group that preserves the seven-dimensional cross product. Moreover, from the cross product we can form a "scalar triple product" on $\mathrm{Im}(\mathbb{O})$ given by

$$\varphi_0(a, b, c) = \langle a, b \times c \rangle = \langle a, bc \rangle. \tag{2.7}$$

This defines $\varphi_0$ as an anti-symmetric trilinear functional, that is, a three-form on $\mathbb{R}^7$. Equivalently, for a basis $e_i$ of $\mathrm{Im}(\mathbb{O})$,

$$e_i \times e_j = \varphi_0{}^{k}{}_{ij}\, e_k. \tag{2.8}$$

So in this description, the components of $\varphi_0$ are essentially the structure constants of the algebra of imaginary octonions. A well known way to encode the multiplication rules for the octonions is the Fano plane [6]. It is shown in Fig. 3. In the diagram, the vertices $e_1, \ldots, e_7$ are the seven square roots of $-1$. Multiplication follows along the six straight lines (sides of the triangle and the altitudes) and along the central circle in the direction of the arrows. So if $e_i, e_j, e_k$ are in this order on a straight line, then $e_i e_j = e_k$ and $e_j e_i = -e_k$.


Fig. 3. Fano plane.

However, from (2.8) we see that ϕ0 encodes precisely the same information as the Fano plane. Suppose x1 , . . . , x7 are coordinates on R7 and let eijk = dxi ∧dxj ∧dxk , then just reading oﬀ from the Fano plane, ϕ0 can be written as ϕ0 = e123 + e145 + e167 + e246 − e257 − e347 − e356 .

(2.9)
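As a small numerical companion to (2.8) and (2.9), the following sketch builds the components of $\varphi_0$ from (2.9) and uses them as structure constants of a cross product on $\mathbb{R}^7$. The helper names (`perm_sign`, `antisymmetric_tensor`, `cross`, `dot`, `basis`) and the sample vectors are illustrative assumptions, not notation from the text; indices run from 1 to 7 as in (2.9).

```python
from itertools import permutations

def perm_sign(p):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def antisymmetric_tensor(terms):
    """Expand {sorted index tuple: coefficient} to all signed index orderings."""
    tensor = {}
    for idx, coeff in terms.items():
        for p in permutations(range(len(idx))):
            tensor[tuple(idx[i] for i in p)] = coeff * perm_sign(p)
    return tensor

# Coefficients of phi_0 read off from (2.9).
PHI0 = antisymmetric_tensor({(1, 2, 3): 1, (1, 4, 5): 1, (1, 6, 7): 1,
                             (2, 4, 6): 1, (2, 5, 7): -1, (3, 4, 7): -1,
                             (3, 5, 6): -1})

def cross(a, b):
    """Cross product (a x b)^k = phi_{0 ijk} a^i b^j, following (2.8)."""
    out = [0] * 7
    for (i, j, k), coeff in PHI0.items():
        out[k - 1] += coeff * a[i - 1] * b[j - 1]
    return out

def dot(a, b):
    """Standard Euclidean inner product on R^7."""
    return sum(x * y for x, y in zip(a, b))

def basis(i):
    """Basis vector e_i of R^7 (1-based)."""
    return [1 if j == i else 0 for j in range(1, 8)]

a = [1, 2, 3, 4, 5, 6, 7]
b = [2, -1, 0, 3, 1, -2, 5]
c = cross(a, b)
```

With these definitions one can check, for instance, that `cross(basis(1), basis(2))` equals `basis(3)`, that `c` is orthogonal to both `a` and `b`, and that `dot(c, c)` equals `dot(a, a) * dot(b, b) - dot(a, b) ** 2`, the familiar norm identity of a cross product.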

Note that in order to keep the same convention for $\varphi_0$ as Joyce [30], in the Fano plane we have a different numbering for the octonions compared to Baez [6]. With this choice of coordinates, the inner product on $\mathrm{Im}(\mathbb{O}) \cong \mathbb{R}^7$ is given by the standard Euclidean metric

$$g_0 = (dx^1)^2 + \cdots + (dx^7)^2. \tag{2.10}$$

As seen from (2.6), G2 preserves the inner product on Im(O), so it clearly preserves g0 and is hence a subgroup of SO(7). Since ϕ0 deﬁnes the seven-dimensional cross product, and G2 is the symmetry group of this cross product, G2 is the stabilizer of ϕ0 in GL(7, R). So we can state: Theorem 4 ([12]). The subgroup of GL(7, R) that preserves the three-form ϕ0 is G2 . From the metric g0 we can deﬁne the Hodge star ∗0 on R7 , and using this, the dual four-form ψ0 = ∗0 ϕ0 which is given by ψ0 = e4567 + e2367 + e2345 + e1357 − e1346 − e1256 − e1247 .

(2.11)
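The relation $\psi_0 = *_0 \varphi_0$ can be checked directly in components. The sketch below assumes the component convention $(*\varphi)_{defg} = \frac{1}{3!}\varphi_{abc}\,\varepsilon_{abcdefg}$ for the Euclidean metric $g_0$ with orientation $dx^1 \wedge \cdots \wedge dx^7$, and compares the result with the coefficients listed in (2.11). All function and variable names are illustrative assumptions.

```python
from itertools import permutations

def perm_sign(p):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def antisymmetric_tensor(terms):
    """Expand {sorted index tuple: coefficient} to all signed index orderings."""
    tensor = {}
    for idx, coeff in terms.items():
        for p in permutations(range(len(idx))):
            tensor[tuple(idx[i] for i in p)] = coeff * perm_sign(p)
    return tensor

# phi_0 from (2.9) and the expected psi_0 from (2.11).
PHI0 = antisymmetric_tensor({(1, 2, 3): 1, (1, 4, 5): 1, (1, 6, 7): 1,
                             (2, 4, 6): 1, (2, 5, 7): -1, (3, 4, 7): -1,
                             (3, 5, 6): -1})
PSI0_EXPECTED = antisymmetric_tensor({(4, 5, 6, 7): 1, (2, 3, 6, 7): 1,
                                      (2, 3, 4, 5): 1, (1, 3, 5, 7): 1,
                                      (1, 3, 4, 6): -1, (1, 2, 5, 6): -1,
                                      (1, 2, 4, 7): -1})

def hodge_star_3form(phi):
    """Euclidean Hodge dual in R^7: (*phi)_{defg} = (1/3!) phi_{abc} eps_{abcdefg}."""
    star = {}
    for p in permutations(range(1, 8)):
        coeff = phi.get(p[:3], 0)
        if coeff:
            star[p[3:]] = star.get(p[3:], 0) + perm_sign(p) * coeff
    # Every component was accumulated 3! times, once per ordering of (a, b, c).
    return {idx: value // 6 for idx, value in star.items()}

PSI0 = hodge_star_3form(PHI0)
matches_2_11 = (PSI0 == PSI0_EXPECTED)
```

Under the stated convention, `matches_2_11` is `True`: the seven terms of (2.11) are exactly the Hodge duals of the seven terms of (2.9), term by term and with the same signs.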

The stabilizer description in Theorem 4 is a key property of $G_2$, and as such it is often taken as the definition of the group $G_2$, in particular in [30]. As we have seen, $G_2$ preserves both $\varphi_0$ and $g_0$,


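so it also preserves $\psi_0$. In particular, $\varphi_0$ and $\psi_0$ give alternate descriptions of the trivial one-dimensional representation of $G_2$. It also turns out that $\psi_0$ is closely related to the associator on $\mathrm{Im}(\mathbb{O})$. As the octonions are non-associative, we can define a non-trivial associator map $[\cdot, \cdot, \cdot] : \mathrm{Im}(\mathbb{O}) \times \mathrm{Im}(\mathbb{O}) \times \mathrm{Im}(\mathbb{O}) \to \mathrm{Im}(\mathbb{O})$ given by

$$[a, b, c] = a(bc) - (ab)c. \tag{2.12}$$

Just as $\varphi_0$ is defined as a dualization of the cross product using the inner product to obtain the map $\varphi_0 : \mathrm{Im}(\mathbb{O}) \times \mathrm{Im}(\mathbb{O}) \times \mathrm{Im}(\mathbb{O}) \to \mathbb{R}$, so it turns out that, up to a constant multiple, the map $\psi_0 : \mathrm{Im}(\mathbb{O}) \times \mathrm{Im}(\mathbb{O}) \times \mathrm{Im}(\mathbb{O}) \times \mathrm{Im}(\mathbb{O}) \to \mathbb{R}$ is a dualization of the associator, given by

$$\psi_0(a, b, c, d) = \frac{1}{2} \langle [a, b, c], d \rangle. \tag{2.13}$$

It is possible to show that $\varphi_0$ and $\psi_0$ satisfy various contraction identities. In particular, from [13, 21, 32], we have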

Proposition 5. The three-form $\varphi_0$ and the corresponding four-form $\psi_0$ satisfy the following identities:

$$\varphi_{0abc}\,\varphi_{0mn}{}^{c} = g_{0am}\, g_{0bn} - g_{0an}\, g_{0bm} + \psi_{0abmn}, \tag{2.14a}$$

$$\varphi_{0abc}\,\psi_{0mnp}{}^{c} = 3\left(g_{0a[m}\,\varphi_{0np]b} - g_{0b[m}\,\varphi_{0np]a}\right), \tag{2.14b}$$

$$\psi_{0abcd}\,\psi_0{}^{mnpq} = 24\,\delta_a^{[m}\delta_b^{n}\delta_c^{p}\delta_d^{q]} + 72\,\psi_{0[ab}{}^{[mn}\delta_c^{p}\delta_{d]}^{q]} - 16\,\varphi_{0[abc}\,\varphi_0{}^{[mnp}\delta_{d]}^{q]}, \tag{2.14c}$$

where $[m\, n\, p]$ denotes antisymmetrization of indices and $\delta^a_b$ is the Kronecker delta, with $\delta^a_b = 1$ if $a = b$ and 0 otherwise.

The above identities can of course be further contracted; the details can be found in [21, 32]. These identities and their contractions are crucial whenever any calculations involving $\varphi_0$ and $\psi_0$ have to be done. In particular, these are very useful when studying $G_2$ manifolds.

2.2. Representations of G2

As we will see in Sec. 3, a crucial role in the study of $G_2$ structures is played by the representations of $G_2$. Since $G_2$ is a subgroup of $SO(7)$, it has a fundamental vector representation on $\mathbb{R}^7$. In the study of $G_2$ manifolds, it is very important to understand the representations of $G_2$ on $p$-forms. So let us consider first the


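representations of $G_2$ on antisymmetric tensors in $\mathbb{R}^7$. For brevity let $V = \mathbb{R}^7$. Following Bryant [13], we first look at the Lie algebra $so(7)$, which is the space of antisymmetric $7 \times 7$ matrices on $V$. For a vector $\omega \in V$, define the map

$$\rho_\varphi : V \to so(7) \quad \text{given by} \quad \rho_\varphi(\omega) = \omega \lrcorner \varphi_0 \tag{2.15}$$

which is clearly injective. Here $\lrcorner$ denotes interior multiplication, $(\omega \lrcorner \varphi_0)_{ab} = \omega^c \varphi_{0cab}$. Conversely, define the map

$$\tau_\varphi : so(7) \to V \quad \text{given by} \quad \tau_\varphi(\alpha_{ab})^c = \frac{1}{6}\,\varphi_0{}^{c}{}_{ab}\,\alpha^{ab}. \tag{2.16}$$

From (2.14), we get that $\tau_\varphi(\rho_\varphi(\omega)) = \omega$, so that $\tau_\varphi$ is a partial inverse of $\rho_\varphi$. Thus we get a decomposition

$$so(7) = \ker \tau_\varphi \oplus \rho_\varphi(V) \tag{2.17}$$

where $\dim \rho_\varphi(V) = 7$ and $\dim \ker \tau_\varphi = 14$. It turns out that $\ker \tau_\varphi$ is in fact a Lie algebra with respect to the matrix commutator. This is the Lie algebra bracket on $so(7)$ and satisfies the Jacobi identity. It is hence only necessary to show that for $\alpha, \beta \in \ker \tau_\varphi$, we have $[\alpha, \beta] \in \ker \tau_\varphi$. This is an exercise in applying the contraction identities for $\varphi_0$. Thus we get a 14-dimensional Lie subalgebra of $so(7)$. However, this is precisely the Lie algebra $g_2$ [32], that is

$$g_2 = \ker \tau_\varphi = \{\alpha \in so(7) : \varphi_{0abc}\,\alpha^{bc} = 0\}. \tag{2.18}$$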
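The statements around (2.15)–(2.18) lend themselves to a direct check in components. The sketch below verifies the contraction identity (2.14a) for $\varphi_0$ and $\psi_0$, checks $\tau_\varphi(\rho_\varphi(\omega)) = \omega$ for a sample vector, and computes $\dim \ker \tau_\varphi$ by exact row reduction. The helper names, the sample vector and the use of `fractions.Fraction` for exact arithmetic are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import permutations

def perm_sign(p):
    """Sign of a permutation, computed by counting inversions."""
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def antisymmetric_tensor(terms):
    """Expand {sorted index tuple: coefficient} to all signed index orderings."""
    return {tuple(idx[i] for i in p): c * perm_sign(p)
            for idx, c in terms.items() for p in permutations(range(len(idx)))}

PHI0 = antisymmetric_tensor({(1, 2, 3): 1, (1, 4, 5): 1, (1, 6, 7): 1, (2, 4, 6): 1,
                             (2, 5, 7): -1, (3, 4, 7): -1, (3, 5, 6): -1})
PSI0 = antisymmetric_tensor({(4, 5, 6, 7): 1, (2, 3, 6, 7): 1, (2, 3, 4, 5): 1,
                             (1, 3, 5, 7): 1, (1, 3, 4, 6): -1, (1, 2, 5, 6): -1,
                             (1, 2, 4, 7): -1})
IDX = range(1, 8)

def identity_2_14a_holds():
    """Check phi_abc phi_mnc = g_am g_bn - g_an g_bm + psi_abmn with g_0 = identity."""
    for a in IDX:
        for b in IDX:
            for m in IDX:
                for n in IDX:
                    lhs = sum(PHI0.get((a, b, c), 0) * PHI0.get((m, n, c), 0) for c in IDX)
                    rhs = (a == m) * (b == n) - (a == n) * (b == m) + PSI0.get((a, b, m, n), 0)
                    if lhs != rhs:
                        return False
    return True

def rho(omega):
    """rho_phi(omega)_{ab} = omega^c phi_{0 cab}, as in (2.15)."""
    return {(a, b): sum(omega[c - 1] * PHI0.get((c, a, b), 0) for c in IDX)
            for a in IDX for b in IDX}

def tau(alpha):
    """tau_phi(alpha)^c = (1/6) phi_{0 cab} alpha_{ab}, as in (2.16)."""
    return [Fraction(sum(PHI0.get((c, a, b), 0) * alpha[(a, b)]
                         for a in IDX for b in IDX), 6) for c in IDX]

def rank(rows):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

# so(7) has one independent component alpha_{ab} for each pair a < b (21 in total).
PAIRS = [(a, b) for a in IDX for b in IDX if a < b]
tau_matrix = [[PHI0.get((c, a, b), 0) for (a, b) in PAIRS] for c in IDX]
dim_kernel = len(PAIRS) - rank(tau_matrix)

omega = [1, 2, 3, 4, 5, 6, 7]
roundtrip = tau(rho(omega))
```

Here `dim_kernel` comes out as 14 and `roundtrip` reproduces `omega`, in line with the decomposition (2.17) of the 21-dimensional space $so(7)$ into a 14-dimensional kernel and a 7-dimensional image.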

This further implies that we get the following decomposition of $so(7)$:

$$so(7) = g_2 \oplus \rho_\varphi(V). \tag{2.19}$$

The group $G_2$ acts via the adjoint representation on the 14-dimensional vector space $g_2$ and via the fundamental vector representation on the seven-dimensional space $\rho_\varphi(V)$. This is a $G_2$-invariant irreducible decomposition of $so(7)$ into the representations 7 and 14. Hence we get the following result:

Theorem 6 ([12]). The space $\Lambda^2$ of two-forms on $V$ decomposes as

$$\Lambda^2 = \Lambda^2_7 \oplus \Lambda^2_{14} \tag{2.20}$$

with the components $\Lambda^2_7$ and $\Lambda^2_{14}$ given by:

$$\Lambda^2_7 = \{\omega \lrcorner \varphi_0 : \omega \text{ a vector}\}, \tag{2.21a}$$

$$\Lambda^2_{14} = \left\{\alpha = \frac{1}{2}\,\alpha_{ab}\, e^a \wedge e^b : (\alpha_{ab}) \in g_2\right\}. \tag{2.21b}$$


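An alternative, but fully equivalent, description of $\Lambda^2_7$ and $\Lambda^2_{14}$ presents them as eigenspaces of the operator

$$T_\psi : \Lambda^2 \to \Lambda^2 \quad \text{given by} \quad T_\psi(\alpha_{ab}) = \psi_{0abcd}\,\alpha^{cd}. \tag{2.22}$$

With this description, we have [32]:

$$\Lambda^2_7 = \{\alpha \in \Lambda^2 : T_\psi \alpha = 4\alpha\}, \tag{2.23a}$$

$$\Lambda^2_{14} = \{\alpha \in \Lambda^2 : T_\psi \alpha = -2\alpha\}. \tag{2.23b}$$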
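The eigenvalue description (2.23) can be probed numerically. In the sketch below, $T_\psi$ is written as a $21 \times 21$ integer matrix acting on the independent components $\alpha_{ab}$, $a < b$; since the sum in (2.22) runs over all $c, d$, each unordered pair contributes twice. The code checks the quadratic relation $T_\psi^2 = 2T_\psi + 8$, which forces the eigenvalues to lie in $\{4, -2\}$, and then reads off the multiplicities from the trace. Names such as `PAIRS`, `T` and `apply_T` are illustrative assumptions.

```python
from itertools import permutations

def perm_sign(p):
    """Sign of a permutation, computed by counting inversions."""
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def antisymmetric_tensor(terms):
    """Expand {sorted index tuple: coefficient} to all signed index orderings."""
    return {tuple(idx[i] for i in p): c * perm_sign(p)
            for idx, c in terms.items() for p in permutations(range(len(idx)))}

PHI0 = antisymmetric_tensor({(1, 2, 3): 1, (1, 4, 5): 1, (1, 6, 7): 1, (2, 4, 6): 1,
                             (2, 5, 7): -1, (3, 4, 7): -1, (3, 5, 6): -1})
PSI0 = antisymmetric_tensor({(4, 5, 6, 7): 1, (2, 3, 6, 7): 1, (2, 3, 4, 5): 1,
                             (1, 3, 5, 7): 1, (1, 3, 4, 6): -1, (1, 2, 5, 6): -1,
                             (1, 2, 4, 7): -1})

PAIRS = [(a, b) for a in range(1, 8) for b in range(a + 1, 8)]
N = len(PAIRS)  # 21 independent components of a two-form on R^7

# (T alpha)_{ab} = sum_{c,d} psi_{abcd} alpha_{cd} = 2 * sum_{c<d} psi_{abcd} alpha_{cd}
T = [[2 * PSI0.get(p + q, 0) for q in PAIRS] for p in PAIRS]

def apply_T(alpha):
    """Apply T_psi to a two-form given by its components on PAIRS."""
    return [sum(T[i][j] * alpha[j] for j in range(N)) for i in range(N)]

# Quadratic relation T^2 = 2 T + 8, i.e. (T - 4)(T + 2) = 0.
T_squared = [[sum(T[i][k] * T[k][j] for k in range(N)) for j in range(N)]
             for i in range(N)]
quadratic_holds = all(T_squared[i][j] == 2 * T[i][j] + (8 if i == j else 0)
                      for i in range(N) for j in range(N))

# With eigenvalues 4 (multiplicity m) and -2 (multiplicity N - m):
# trace = 4 m - 2 (N - m), so m = (trace + 2 N) / 6.
trace = sum(T[i][i] for i in range(N))
mult_4 = (trace + 2 * N) // 6
mult_minus_2 = N - mult_4

# Sample elements: e_1 contracted into phi_0 (in Lambda^2_7) and e^23 - e^45 (in g_2).
alpha_7 = [PHI0.get((1, a, b), 0) for (a, b) in PAIRS]
alpha_14 = [1 if p == (2, 3) else -1 if p == (4, 5) else 0 for p in PAIRS]
```

The multiplicities obtained this way are 7 and 14, and the two sample forms satisfy `apply_T(alpha_7) == [4 * x for x in alpha_7]` and `apply_T(alpha_14) == [-2 * x for x in alpha_14]`, matching (2.23a) and (2.23b).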

Correspondingly, the description of the 7 and 14 pieces of $\Lambda^5$ is obtained from (2.21a) and (2.21b) via Hodge duality. Let us now look at three-forms in more detail. Consider $\mathrm{Sym}^2(V^*)$, the space of symmetric two-tensors on $V$, and define a map

$$i_\varphi : \mathrm{Sym}^2(V^*) \to \Lambda^3 \quad \text{given by} \quad i_\varphi(h)_{abc} = h^{d}{}_{[a}\,\varphi_{0bc]d}. \tag{2.24}$$

We can decompose $\mathrm{Sym}^2(V^*) = \mathbb{R} g_0 \oplus \mathrm{Sym}^2_0(V^*)$, where $\mathbb{R} g_0$ is the set of symmetric tensors proportional to the metric $g_0$ and $\mathrm{Sym}^2_0(V^*)$ is the set of traceless symmetric tensors. This is a $G_2$-invariant irreducible decomposition of $\mathrm{Sym}^2(V^*)$ into one-dimensional and 27-dimensional representations. We clearly have $i_\varphi(g_0)_{abc} = \varphi_{0abc}$, so the map $i_\varphi$ is also $G_2$-invariant and is injective on each summand of this decomposition. Looking at the first summand, we get that $i_\varphi(\mathbb{R} g_0) = \Lambda^3_1$, the one-dimensional singlet representation of $G_2$. Now look at the second summand and consider $i_\varphi(\mathrm{Sym}^2_0(V^*))$. This is 27-dimensional and irreducible, so it gives a 27-dimensional representation of $G_2$ on three-forms: $i_\varphi(\mathrm{Sym}^2_0(V^*)) = \Lambda^3_{27}$. Now, $\Lambda^3$ is 35-dimensional, and we have accounted for $1 + 27 = 28$ dimensions. Thus we still have seven dimensions left unaccounted for in $\Lambda^3$. So let us extend the map $i_\varphi$ to $\Lambda^2$, the antisymmetric two-tensors on $\mathbb{R}^7$. Suppose $\beta \in \Lambda^2_7$. Then $\beta = \omega \lrcorner \varphi_0$ for some vector $\omega \in V$, so

$$i_\varphi(\beta)_{abc} = \varphi_0{}^{d}{}_{[a|e|}\,\varphi_{0bc]d}\,\omega^{e} = \psi_{0abcd}\,\omega^{d} \tag{2.25}$$

where we have used (2.14). This defines a $G_2$-invariant map from $V$ to $\Lambda^3$ and hence gives $\Lambda^3_7$. So overall we thus have a decomposition of three-forms into irreducible representations of $G_2$:

Theorem 7 ([13]). The space $\Lambda^3$ of three-forms on $V$ decomposes as

$$\Lambda^3 = \Lambda^3_1 \oplus \Lambda^3_7 \oplus \Lambda^3_{27} \tag{2.26}$$


where

$$\Lambda^3_1 = \{\chi \in \Lambda^3 : \chi_{abc} = f \varphi_{0abc} \text{ for scalar } f\}, \tag{2.27a}$$

$$\Lambda^3_7 = \{\omega \lrcorner \psi_0 : \omega \text{ a vector}\}, \tag{2.27b}$$

$$\Lambda^3_{27} = \{\chi \in \Lambda^3 : \chi_{abc} = h^{d}{}_{[a}\,\varphi_{0bc]d} \text{ for } h_{ab} \text{ traceless, symmetric}\}. \tag{2.27c}$$

From the identities for contraction of $\varphi_0$ and $\psi_0$, it is possible to see that an equivalent description of $\Lambda^3_{27}$ is

$$\Lambda^3_{27} = \{\chi \in \Lambda^3 : \chi \wedge \varphi_0 = 0 \text{ and } \chi \wedge \psi_0 = 0\}.$$

A similar decomposition of four-forms is again obtained via Hodge duality. Suppose we have $\chi \in \Lambda^3$, and define $\pi_1$, $\pi_7$ and $\pi_{27}$ to be the projections of $\chi$ onto $\Lambda^3_1$, $\Lambda^3_7$ and $\Lambda^3_{27}$, respectively. Using contraction identities for $\varphi_0$ and $\psi_0$, we get the following relations [21]:

Proposition 8. Given a three-form $\chi \in \Lambda^3$, the projections of $\chi$ onto the components (2.26) of $\Lambda^3$ are given by:

$$\pi_1(\chi) = a \varphi_0 \quad \text{where } a = \frac{1}{42}\left(\chi_{abc}\,\varphi_0^{abc}\right) = \frac{1}{7}\langle \chi, \varphi_0 \rangle \quad \text{with } |\pi_1(\chi)|^2 = 7a^2, \tag{2.28a}$$

$$\pi_7(\chi) = \omega \lrcorner \psi_0 \quad \text{where } \omega^a = -\frac{1}{24}\,\chi_{mnp}\,\psi_0^{mnpa} \quad \text{with } |\pi_7(\chi)|^2 = 4|\omega|^2, \tag{2.28b}$$

$$\pi_{27}(\chi) = i_\varphi(h) \quad \text{where } h_{ab} = \frac{3}{4}\,\chi_{mn\{a}\,\varphi_{0b\}}{}^{mn} \quad \text{with } |\pi_{27}(\chi)|^2 = \frac{2}{9}|h|^2. \tag{2.28c}$$

Here $\{a\, b\}$ denotes the traceless symmetric part. Note that similar projections can be defined for four-forms as well.

3. G2 Structures

3.1. Definition

As we shall see, the notion of holonomy is closely related to $G$-structures on manifolds. Let us give the necessary definitions.

Definition 9. Let $X$ be a manifold of dimension $n$. Suppose $TX$ is the tangent bundle over $X$. Define the manifold $F$ by

$$F = \{(x, e_1, \ldots, e_n) : x \in X \text{ and } (e_1, \ldots, e_n) \text{ is a basis for } T_x X\}.$$

This then has a projection $\pi : (x, e_1, \ldots, e_n) \mapsto x$ onto $X$ and a natural left action by $GL(n, \mathbb{R})$ on the fibers. $F$ is thus a principal bundle over $X$ with fiber $GL(n, \mathbb{R})$, called the frame bundle of $X$.

Definition 10. Let $X$ be a manifold of dimension $n$. Let $G$ be a Lie subgroup of $GL(n, \mathbb{R})$. Then a $G$-structure on $X$ is a principal subbundle $P$ of $F$ with fiber $G$.


The framework of $G$-structures is very powerful, and a number of geometrical structures can be reformulated in this language. In particular, a Riemannian metric on a manifold is equivalent to an $O(n)$ structure. We are in particular interested in torsion-free $G$-structures. A $G$-structure is torsion-free if and only if there exists a compatible torsion-free connection on $TX$. A connection $\nabla$ on $TX$ is equivalent to a connection $D$ on the frame bundle $F$, and we say $\nabla$ is compatible with the $G$-structure $P$ if $D$ reduces to a connection on $P$. For example, given a Riemannian metric, a unique torsion-free Levi–Civita connection can always be defined, hence all $O(n)$ structures are torsion-free. On a complex manifold of complex dimension $m$, an integrable complex structure is equivalent to a torsion-free $GL(m, \mathbb{C})$ structure. A Kähler structure is then equivalent to a torsion-free $U(m)$-structure. From [30], we have a key result that relates torsion-free structures and holonomy:

Proposition 11. Let $(X, g)$ be a Riemannian manifold of dimension $n$, with $O(n)$-structure $P$ corresponding to $g$. Let $G$ be a Lie subgroup of $O(n)$. Then $\mathrm{Hol}(g) \subseteq G$ if and only if $X$ admits a torsion-free $G$-structure $Q$ that is a subbundle of $P$.

As Proposition 11 shows, the study of Riemannian holonomy is equivalent to studying torsion-free $G$-structures. Hence in order to study $G_2$ holonomy manifolds we will first consider $G_2$ structures. Now suppose $X$ is a smooth, oriented 7-dimensional manifold. Following Joyce [30], define a three-form $\varphi$ to be positive if locally we can choose a frame such that $\varphi$ is written in the form (2.9), that is, for every $p \in X$ there is an oriented isomorphism $q_p$ between $T_p X$ and $\mathbb{R}^7$ such that $\varphi|_p = \varphi_0$. For each $p \in X$ define $P^3_p X$ to be the set of such three-forms. To each positive $\varphi$ we can associate a metric $g$ and a Hodge dual $*\varphi$ which are identified with $g_0$ and $\psi_0$ under $q_p$, so that in such a frame the associated metric takes the form (2.10).

Since $\varphi_0$ is preserved by $G_2$ and $GL(7, \mathbb{R})_+$ acts transitively on $P^3_p X$, it follows that $P^3_p X \cong GL(7, \mathbb{R})_+ / G_2$ and hence $\dim P^3_p X = \dim GL(7, \mathbb{R})_+ - \dim G_2 = 49 - 14 = 35$. This is equal to the dimension of $\Lambda^3 T^*_p X$, hence $P^3_p X$ is an open subset of $\Lambda^3 T^*_p X$. Moreover, if we consider the bundle $P^3 X$ over $X$ with fiber $P^3_p X$, it will be an open subbundle of $\Lambda^3 T^* X$. Given a positive three-form $\varphi$ on $X$, consider at each point $p$ the set $Q_p$ of isomorphisms $q_p$ between $T_p X$ and $\mathbb{R}^7$ such that $\varphi|_p = \varphi_0$. It is then easy to see that $Q_p \cong G_2$ and that the bundle $Q$ over $X$ with fiber $Q_p$ is in fact a principal subbundle of the frame bundle $F$. So in fact, $Q$ is a $G_2$ structure. The converse is also true: given an oriented $G_2$ structure $Q$, we can uniquely define a positive three-form $\varphi$ and associated metric $g$ and four-form $\psi$ that correspond to $\varphi_0$, $g_0$ and $\psi_0$, respectively. We thus have a key result:

Theorem 12 ([30]). Let $X$ be an oriented seven-dimensional manifold. There exists a one-to-one correspondence between positive three-forms on $X$ and oriented


$G_2$-structures $Q$ on $X$. Moreover, to each positive three-form $\varphi$ we can associate a Riemannian metric $g$ and a corresponding four-form $*_\varphi \varphi = \psi$ such that for each $p \in X$, under the isomorphism $q_p : T_p X \to \mathbb{R}^7$, the quantities $\varphi$, $g$ and $\psi$ are identified with $\varphi_0$, $g_0$ and $\psi_0$, respectively.

So given a positive three-form $\varphi$ on $X$, it is possible to define a metric $g$ associated to $\varphi$. This metric then defines the Hodge star, which we denote by $*_\varphi$ to emphasize the dependence on $\varphi$. Given the Hodge star, we can in turn define the four-form $\psi = *_\varphi \varphi$. Thus in fact both the metric $g$ and the four-form $\psi$ are functions of $\varphi$. By definition, at each point $p \in X$ there is an isomorphism that identifies $\varphi$ with $\varphi_0$, $\psi$ with $\psi_0$ and $g$ with $g_0$. Therefore, properties of $\varphi_0$ and $\psi_0$ such as the contraction identities (2.14) that we encountered in Sec. 2.1 also hold for the differential forms $\varphi$ and $\psi$.

In general, any $G$-structure on a manifold $X$ induces a splitting of bundles of $p$-forms into subbundles corresponding to irreducible representations of $G$. The same is of course true for $G_2$ structures. The decomposition of $p$-forms on $\mathbb{R}^7$ carries over to any manifold with a $G_2$ structure, so from the previous section we have the following decomposition of the spaces of $p$-forms $\Lambda^p$:

$$\Lambda^1 = \Lambda^1_7, \tag{3.1a}$$

$$\Lambda^2 = \Lambda^2_7 \oplus \Lambda^2_{14}, \tag{3.1b}$$

$$\Lambda^3 = \Lambda^3_1 \oplus \Lambda^3_7 \oplus \Lambda^3_{27}, \tag{3.1c}$$

$$\Lambda^4 = \Lambda^4_1 \oplus \Lambda^4_7 \oplus \Lambda^4_{27}, \tag{3.1d}$$

$$\Lambda^5 = \Lambda^5_7 \oplus \Lambda^5_{14}, \tag{3.1e}$$

$$\Lambda^6 = \Lambda^6_7. \tag{3.1f}$$

Here each $\Lambda^p_k$ corresponds to the $k$-dimensional irreducible representation of $G_2$. Moreover, for each $k$ and $p$, $\Lambda^p_k$ and $\Lambda^{7-p}_k$ are isomorphic to each other via Hodge duality, and also the $\Lambda^p_7$ are isomorphic to each other for $p = 1, 2, \ldots, 6$. Define the standard inner product on $\Lambda^p$, so that for $p$-forms $\alpha$ and $\beta$,

$$\langle \alpha, \beta \rangle = \frac{1}{p!}\,\alpha_{a_1 \cdots a_p}\,\beta^{a_1 \cdots a_p}. \tag{3.2}$$

This is related to the Hodge star, since

$$\alpha \wedge *\beta = \langle \alpha, \beta \rangle\, \mathrm{vol} \tag{3.3}$$

where vol is the invariant volume form given locally by

$$\mathrm{vol} = \sqrt{\det g}\; dx^1 \wedge \cdots \wedge dx^7. \tag{3.4}$$

Then the decompositions (3.1) are orthogonal with respect to (3.2). Note that $\langle \varphi, \varphi \rangle = 7$, so in fact we have

$$V = \frac{1}{7} \int \varphi \wedge *\varphi \tag{3.5}$$

where $V$ is the volume of the manifold $X$.


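We know that the metric $g$ is defined by the three-form $\varphi$, and we can use some of the results from Sec. 2.1 to find a direct relationship between the two quantities.

Proposition 13. Given a positive three-form $\varphi$ on a seven-manifold $X$, the associated metric $g$ is given by

$$g_{ab} = (\det s)^{-\frac{1}{9}}\, s_{ab} \tag{3.6}$$

with

$$s_{ab} = \frac{1}{144}\,\varphi_{amn}\,\varphi_{bpq}\,\varphi_{rst}\,\hat{\varepsilon}^{mnpqrst} \tag{3.7}$$

where $\hat{\varepsilon}^{mnpqrst}$ is the alternating symbol with $\hat{\varepsilon}^{12\ldots7} = +1$. Alternatively, for $u, v$ vector fields on $X$,

$$\langle u, v \rangle\, \mathrm{vol} = \frac{1}{6}\,(u \lrcorner \varphi) \wedge (v \lrcorner \varphi) \wedge \varphi \tag{3.8}$$

where $\lrcorner$ denotes interior multiplication: $(u \lrcorner \varphi)_{bc} = u^a \varphi_{abc}$.

Proof. Consider the quantity $P_{ab}$ given by

$$P_{ab} = \varphi_{amn}\,\varphi_{bpq}\,\psi^{mnpq}.$$

Using identities (2.14) to contract $\varphi$ and $\psi$, this gives $P_{ab} = 24 g_{ab}$. Expanding $\psi^{mnpq}$ in terms of $\varphi$ and the Levi–Civita tensor we get

$$P_{ab} = \frac{1}{6}\,\varphi_{amn}\,\varphi_{bpq}\,\varphi_{rst}\,\varepsilon^{mnpqrst}.$$

If we write $\hat{\varepsilon}^{mnpqrst}$ for the alternating symbol with $\hat{\varepsilon}^{12\ldots7} = +1$, so that $\varepsilon^{mnpqrst} = \hat{\varepsilon}^{mnpqrst}/\sqrt{\det g}$, then we get

$$g_{ab}\,\sqrt{\det g} = \frac{1}{144}\,\varphi_{amn}\,\varphi_{bpq}\,\varphi_{rst}\,\hat{\varepsilon}^{mnpqrst}. \tag{3.9}$$

Alternatively, let $u$ and $v$ be vector fields on $X$. Then

$$\langle u, v \rangle\,\sqrt{\det g} = \frac{1}{144}\,(u^a \varphi_{amn})(v^b \varphi_{bpq})\,\varphi_{rst}\,\hat{\varepsilon}^{mnpqrst}.$$

Hence we get (3.8). Now define

$$s_{ab} = \frac{1}{144}\,\varphi_{amn}\,\varphi_{bpq}\,\varphi_{rst}\,\hat{\varepsilon}^{mnpqrst}$$

so that $s_{ab} = g_{ab}\sqrt{\det g}$. Taking the determinant of (3.9) then gives $\det s = (\det g)^{9/2}$, that is, $\sqrt{\det g} = (\det s)^{1/9}$, and we get (3.6).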
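Formulas (3.6) and (3.7) can be evaluated directly for a constant three-form on $\mathbb{R}^7$. The sketch below computes $s_{ab}$ by summing over all permutations, rescales by $(\det s)^{-1/9}$, and applies this to $\varphi_0$ from (2.9) and to the rescaled form $\lambda^3 \varphi_0$ with $\lambda = 2$, for which the metric should be $\lambda^2 g_0$. Function names, the choice $\lambda = 2$ and the use of floating-point arithmetic are assumptions made for illustration.

```python
from itertools import permutations

def perm_sign(p):
    """Sign of a permutation, computed by counting inversions."""
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def antisymmetric_tensor(terms):
    """Expand {sorted index tuple: coefficient} to all signed index orderings."""
    return {tuple(idx[i] for i in p): c * perm_sign(p)
            for idx, c in terms.items() for p in permutations(range(len(idx)))}

PHI0 = antisymmetric_tensor({(1, 2, 3): 1, (1, 4, 5): 1, (1, 6, 7): 1, (2, 4, 6): 1,
                             (2, 5, 7): -1, (3, 4, 7): -1, (3, 5, 6): -1})
SIGNED_PERMS = [(p, perm_sign(p)) for p in permutations(range(1, 8))]

def s_matrix(phi):
    """s_ab = (1/144) phi_amn phi_bpq phi_rst eps^{mnpqrst}, as in (3.7)."""
    acc = [[0.0] * 7 for _ in range(7)]
    for (m, n, p, q, r, s, t), sign in SIGNED_PERMS:
        w = sign * phi.get((r, s, t), 0)
        if not w:
            continue
        for a in range(1, 8):
            x = phi.get((a, m, n), 0)
            if not x:
                continue
            for b in range(1, 8):
                acc[a - 1][b - 1] += w * x * phi.get((b, p, q), 0)
    return [[value / 144.0 for value in row] for row in acc]

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    det = 1.0
    for col in range(len(m)):
        pivot = max(range(col, len(m)), key=lambda i: abs(m[i][col]))
        if m[pivot][col] == 0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for i in range(col + 1, len(m)):
            factor = m[i][col] / m[col][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[col])]
    return det

def metric_from_phi(phi):
    """g_ab = (det s)^(-1/9) s_ab, as in (3.6)."""
    s = s_matrix(phi)
    scale = determinant(s) ** (-1.0 / 9.0)
    return [[scale * value for value in row] for row in s]

lam = 2.0
g_standard = metric_from_phi(PHI0)
g_scaled = metric_from_phi({idx: lam ** 3 * c for idx, c in PHI0.items()})
```

For $\varphi_0$ the result is the identity matrix, in agreement with (2.10). For $\lambda^3 \varphi_0$ one finds $s = \lambda^9 g_0$ and hence $g = \lambda^2 g_0$, a simple instance of the nonlinear dependence of $g$ on $\varphi$.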

Thus we see that even though the three-form $\varphi$ determines the metric $g$, this relationship is rather complicated and nonlinear. In particular, this also shows that $\psi = *_\varphi \varphi$ depends on $\varphi$ in an even more non-trivial fashion, since the Hodge star itself depends on the metric. Here we need to say a few words about the notation used for the $G_2$ three-form $\varphi$ and the associated four-form $\psi$. The notation that we use here is due to


Authors                                                  Three-form                      Dual four-form                          References
Beasley and Witten; Gukov, Yau and Zaslow                Φ                               ∗Φ                                      [7, 22]
Bryant                                                   φ = (1/6) ε_ijk e^ijk           ∗_φ φ = (1/24) ε_ijkl e^ijkl            [12, 13]
Hitchin; Lee and Leung                                   Ω                               Θ = ∗Ω                                  [26, 37]
Joyce                                                    ϕ                               Θ(ϕ) = ∗ϕ                               [28–30]
Karigiannis; Karigiannis and Leung; Grigorian and Yau    ϕ                               ψ = ∗_ϕ ϕ                               [21, 31, 32, 34]

Fig. 4. Notation that is used by different authors.

Karigiannis, where the Hodge dual of $\varphi$ is denoted by $\psi$; it was first introduced in [31]. In Fig. 4, we summarize the different notations used by other authors, where $e^{ijk} = e^i \wedge e^j \wedge e^k$ and $e^{ijkl} = e^i \wedge e^j \wedge e^k \wedge e^l$ for basis covectors $e^i$.

3.2. Torsion-free structures

The definition of a $G_2$ structure only defines the algebraic properties of $\varphi$, and in general does not address the analytical properties of $\varphi$. Using the associated metric $g$ we can define the Levi–Civita connection $\nabla$ on $X$. Then it is natural to ask what the properties of $\nabla \varphi$ are. This quantity is known as the torsion of the $G_2$ structure. Originally the torsion of $G_2$ structures was studied by Fernández and Gray [18], and their analysis revealed that there are in fact a total of 16 torsion classes of $G_2$ structures. Later on, Karigiannis reproduced their results using simple computational arguments [32]. Following [32], consider the three-form $\nabla_X \varphi$ for some vector field $X$. We know that three-forms split as $\Lambda^3_1 \oplus \Lambda^3_7 \oplus \Lambda^3_{27}$, so consider the projections $\pi_1$, $\pi_7$ and $\pi_{27}$ of $\nabla_X \varphi$ onto these components. Using (2.28), we have $\pi_1(\nabla_X \varphi) = a \varphi$, where $a$ is proportional to $X^a (\nabla_a \varphi_{bcd}) \varphi^{bcd}$. Since $\varphi_{bcd} \varphi^{bcd} = 42$ is constant,

$$X^a (\nabla_a \varphi_{bcd})\,\varphi^{bcd} = X^a \nabla_a (\varphi_{bcd}\,\varphi^{bcd}) - \varphi_{bcd}\, X^a \nabla_a \varphi^{bcd} = -X^a (\nabla_a \varphi_{bcd})\,\varphi^{bcd} = 0.$$

Hence we see that the $\Lambda^3_1$ component vanishes. Similarly, for $\Lambda^3_{27}$ we have $\pi_{27}(\nabla_X \varphi) = i_\varphi(h)$ where

$$h_{ab} = \frac{3}{4}\,(X^c \nabla_c \varphi_{mn\{a})\,\varphi_{b\}}{}^{mn} = \frac{3}{4}\, X^c \nabla_c (\varphi_{mn\{a}\,\varphi_{b\}}{}^{mn}) - \frac{3}{4}\,\varphi_{mn\{a}\, X^c \nabla_c \varphi_{b\}}{}^{mn} = -\frac{3}{4}\,(X^c \nabla_c \varphi_{mn\{a})\,\varphi_{b\}}{}^{mn} = 0.$$


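Here we have used the fact that $\varphi_{mna}\varphi_b{}^{mn} = 6g_{ab}$, the traceless part of which vanishes. Therefore, the $\Lambda^3_{27}$ part of $\nabla_X\varphi$ also vanishes. Now consider the $\Lambda^3_7$ component. In this case, $\pi_7(\nabla_X\varphi) = \omega \lrcorner \psi$ where

\[ \omega^a = -\tfrac{1}{24}X^c(\nabla_c\varphi_{mnp})\psi^{mnpa} = \tfrac{1}{24}X^c(\nabla_c\psi^{mnpa})\varphi_{mnp}. \]

This quantity does not vanish in general, so we can conclude that

\[ \nabla_X\varphi \in \Lambda^3_7 \tag{3.10} \]

and thus overall,

\[ \nabla\varphi \in W = \Lambda^1_7 \otimes \Lambda^3_7. \tag{3.11} \]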

Further classification of torsion classes depends on the decomposition of W into components according to irreducible representations of G2. Given (3.11), we can write

\[ \nabla_a\varphi_{bcd} = T_a{}^e\psi_{ebcd} \tag{3.12} \]

where $T_{ab}$ is the full torsion tensor. This two-tensor fully defines ∇ϕ since, pointwise, it has 49 components and the space W is also 49-dimensional (pointwise). In general we can split $T_{ab}$ as

\[ T = \tau_1 g + \tau_7 + \tau_{14} + \tau_{27} \tag{3.13} \]

where $\tau_1$ is a function and gives the 1 component of T, $\tau_7 \in \Lambda^2_7$ gives the 7 component, $\tau_{14} \in \Lambda^2_{14}$ gives the 14 component, and $\tau_{27}$ is traceless symmetric, giving the 27 component. Note that the normalization of these components is different from [32]. Hence we can split W as

\[ W = W_1 \oplus W_7 \oplus W_{14} \oplus W_{27}. \tag{3.14} \]
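The counting behind (3.13) and (3.14) can be sketched in a few lines of Python. This is only an illustrative check, not part of the original argument: the names `COMPONENT_DIMS` and `torsion_classes` are hypothetical, and the sketch assumes that a torsion class is labelled by the subset of components $W_1, W_7, W_{14}, W_{27}$ in which the torsion is allowed to be nonzero.

```python
from itertools import combinations

# Dimensions of the irreducible G2 components of W in (3.14).
# The dictionary and function names are illustrative, not from the paper.
COMPONENT_DIMS = {"W1": 1, "W7": 7, "W14": 14, "W27": 27}


def torsion_classes():
    """List every subset of {W1, W7, W14, W27} with its pointwise dimension."""
    names = list(COMPONENT_DIMS)
    classes = []
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            dim = sum(COMPONENT_DIMS[name] for name in subset)
            classes.append((subset, dim))
    return classes


classes = torsion_classes()
total_dim = sum(COMPONENT_DIMS.values())  # pointwise dimension of W

print("dim W =", total_dim)
print("number of torsion classes =", len(classes))
```

The empty subset corresponds to the torsion-free case, and the full set to a G2 structure with generic torsion.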

The 16 torsion classes arise as the subsets of W which ∇ϕ belongs to. Moreover, as shown in [32], the torsion components $\tau_i$ relate directly to the expressions for dϕ and dψ. In fact, in our notation,

\[ d\varphi = 4\tau_1\psi + 3\tau_7 \wedge \varphi - *\tau_{27}, \tag{3.15a} \]

\[ d\psi = 4\tau_7 \wedge \psi - 2*\tau_{14}. \tag{3.15b} \]

Now suppose dϕ = dψ = 0. Then all four torsion components vanish and hence T = 0, and as a consequence ∇ϕ = 0. The converse is trivially true, since d and $d^*$ can both be expressed in terms of the covariant derivative. This result is due to Fernández and Gray [18]. If we add the fact from Proposition 11 that Hol(g) is a subgroup of G if and only if X admits a torsion-free G-structure, then we get the following important result.

Theorem 14 ([30, Proposition 10.1.3]). Let X be a seven-manifold with a G2 structure defined by the three-form ϕ and equipped with the associated Riemannian metric g. Then the following are equivalent:

(1) The G2 structure is torsion-free;
(2) Hol(g) ⊆ G2 and ϕ is the induced three-form;
(3) ∇ϕ = 0 on X, where ∇ is the Levi–Civita connection of g;
(4) dϕ = dψ = 0, where ψ = ∗ϕ with the Hodge star defined by g.

Different torsion classes of the G2 structure also restrict the curvature of the manifold. Consider the curvature tensor $R_{abcd}$. Then for fixed a, b, we have $(R_{ab})_{cd} \in \Lambda^2$, so we can decompose it as

\[ (R_{ab})_{cd} = (\pi_7R_{ab})_{cd} + (\pi_{14}R_{ab})_{cd}. \tag{3.16} \]

Following Karigiannis [32], consider the operator $T_\psi$ (2.22) acting on $R_{abcd}$. Then we have

\[ g^{ad}T_\psi R_{abcd} = R_{abef}\psi^{ef}{}_{cd}\,g^{ad} = -(R_{beaf} + R_{eabf})\psi^{ef}{}_{cd}\,g^{ad} = -R_{beaf}\psi^{e}{}_{c}{}^{af} + R_{fbea}\psi^{eaf}{}_{c} = -2g^{ad}T_\psi R_{abcd}, \]

where we have used the cyclic identity for $R_{abef}$, and therefore $g^{ad}T_\psi R_{abcd} = 0$. Hence, from (2.23) we get

\[ \mathrm{Ric}_{bd} = 3(\pi_7R_{ab})_{cd}\,g^{ac} = \tfrac{3}{2}(\pi_{14}R_{ab})_{cd}\,g^{ac} \tag{3.17} \]

where $\mathrm{Ric}_{bd}$ is the Ricci tensor. However, in general, by the Ambrose–Singer holonomy theorem [5], if Hol(g) ⊆ G, then $R_{abcd} \in \mathrm{Sym}^2(\mathfrak{g})$ where $\mathfrak{g}$ is the Lie algebra of G. Therefore, in the G2 case, if the G2 structure is torsion-free and hence Hol(g) ⊆ G2, then $R_{abcd} \in \mathrm{Sym}^2(\mathfrak{g}_2)$. This implies that in (3.16) the $\pi_7$ component vanishes, and thus from (3.17) we have the following result:

Theorem 15 ([10]). Let X be a Riemannian seven-manifold with metric g. If Hol(g) ⊆ G2, then X is Ricci-flat.

In fact, this result can also be derived without invoking the general Ambrose–Singer theorem. In [32], Karigiannis expressed the $\Lambda^2_7$ component of the curvature tensor in terms of the torsion tensor $T_{ab}$, so that when the torsion vanishes, the


curvature tensor is fully contained in $\Lambda^2_{14}$, thus directly confirming the Ambrose–Singer theorem in the G2 case. The original proof of Theorem 15 due to Bonan [10] relied on the fact that the Lie algebra structure of $\mathfrak{g}_2$ imposes strong conditions on the Riemann tensor, and that these imply that the Ricci tensor must vanish.

Given a compact manifold with a torsion-free G2 structure, the decompositions (3.1) carry over to de Rham cohomology [30], so that we have

\[ H^1(X, \mathbb{R}) = H^1_7, \tag{3.18a} \]
\[ H^2(X, \mathbb{R}) = H^2_7 \oplus H^2_{14}, \tag{3.18b} \]
\[ H^3(X, \mathbb{R}) = H^3_1 \oplus H^3_7 \oplus H^3_{27}, \tag{3.18c} \]
\[ H^4(X, \mathbb{R}) = H^4_1 \oplus H^4_7 \oplus H^4_{27}, \tag{3.18d} \]
\[ H^5(X, \mathbb{R}) = H^5_7 \oplus H^5_{14}, \tag{3.18e} \]
\[ H^6(X, \mathbb{R}) = H^6_7. \tag{3.18f} \]

Define the refined Betti numbers $b^p_k = \dim(H^p_k)$. Clearly, $b^3_1 = b^4_1 = 1$ and we also have $b^1 = b^k_7$ for k = 1, . . . , 6. Moreover, it turns out that if Hol(X, g) = G2 then $b^1 = 0$, so in this case the $H^k_7$ components vanish in (3.18). To see this, note that on a compact Ricci-flat manifold any harmonic one-form must be parallel, and a nonzero parallel one-form exists if and only if Hol(g) has an invariant one-form. However, the only G2-invariant forms are ϕ and ψ. Therefore there are no nontrivial harmonic one-forms when Hol(g) = G2, and thus $b^1 = 0$.

An example of a construction of a manifold with a torsion-free G2 structure is to consider $X = Y \times S^1$ where Y is a Calabi–Yau three-fold. Define the metric and a three-form on X as

\[ g_X = d\theta^2 + g_Y, \tag{3.19} \]
\[ \varphi = d\theta \wedge \omega + \mathrm{Re}\,\Omega, \tag{3.20} \]

where θ is the coordinate on $S^1$. This then defines a torsion-free G2 structure, with

\[ *\varphi = \tfrac{1}{2}\omega \wedge \omega - d\theta \wedge \mathrm{Im}\,\Omega. \tag{3.21} \]

However, the holonomy of X in this case is $SU(3) \subset G_2$. From the Künneth formula we get the following relations between the refined Betti numbers of X and the Hodge numbers of Y:

\[ b^k_7 = 1 \quad \text{for } k = 1, \ldots, 6, \tag{3.22} \]
\[ b^k_{14} = h^{1,1} - 1 \quad \text{for } k = 2, 5, \tag{3.23} \]
\[ b^k_{27} = h^{1,1} + 2h^{2,1} \quad \text{for } k = 3, 4. \tag{3.24} \]
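As a consistency check, the refined Betti numbers (3.22)–(3.24), together with $b^3_1 = b^4_1 = 1$, should add up to the ordinary Betti numbers that the Künneth formula gives for $Y \times S^1$. The following minimal sketch does this bookkeeping. The function names are hypothetical, and it assumes that Y has $b^1(Y) = 0$ and Betti numbers $(1, 0, h^{1,1}, 2 + 2h^{2,1}, h^{1,1}, 0, 1)$; the sample Hodge numbers are those of the quintic three-fold and serve only as an illustration.

```python
def refined_betti_product(h11, h21):
    """Refined Betti numbers b^k_r of X = Y x S^1, following (3.22)-(3.24)."""
    refined = {}
    for k in range(1, 7):
        refined[(k, 7)] = 1
    for k in (2, 5):
        refined[(k, 14)] = h11 - 1
    for k in (3, 4):
        refined[(k, 1)] = 1          # b^3_1 = b^4_1 = 1
        refined[(k, 27)] = h11 + 2 * h21
    return refined


def betti_from_refined(refined):
    """Sum the refined Betti numbers in each degree k = 1, ..., 6."""
    betti = {k: 0 for k in range(1, 7)}
    for (k, _rep), value in refined.items():
        betti[k] += value
    return betti


def betti_kunneth(h11, h21):
    """Betti numbers of Y x S^1 from Kunneth: b_k(X) = b_k(Y) + b_{k-1}(Y)."""
    # Assumed Betti numbers of the Calabi-Yau three-fold Y (with b_1(Y) = 0).
    b_y = [1, 0, h11, 2 + 2 * h21, h11, 0, 1]
    return {k: b_y[k] + b_y[k - 1] for k in range(1, 7)}


# Illustrative input: Hodge numbers of the quintic three-fold.
h11, h21 = 1, 101
from_refined = betti_from_refined(refined_betti_product(h11, h21))
from_kunneth = betti_kunneth(h11, h21)
print(from_refined)
print(from_kunneth)
```

Both dictionaries agree degree by degree, which is a quick way to catch a misplaced index when transcribing (3.22)–(3.24).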

In [28–30], Joyce describes a possible construction of a smooth manifold with holonomy equal to G2 from a Calabi–Yau manifold Y . So suppose Y is a Calabi–Yau


three-fold as above. Then suppose $\sigma : Y \to Y$ is an antiholomorphic isometric involution on Y, that is, σ preserves the metric on Y and satisfies

\[ \sigma^2 = 1, \tag{3.25a} \]
\[ \sigma^*(\omega) = -\omega, \tag{3.25b} \]
\[ \sigma^*(\Omega) = \bar\Omega. \tag{3.25c} \]

Such an involution σ is known as a real structure on Y. Define now a quotient given by

\[ Z = (Y \times S^1)/\hat\sigma \tag{3.26} \]

where $\hat\sigma : Y \times S^1 \to Y \times S^1$ is defined by $\hat\sigma(y, \theta) = (\sigma(y), -\theta)$. The three-form ϕ defined on $Y \times S^1$ by (3.20) is invariant under the action of $\hat\sigma$ and hence provides Z with a G2 structure. Similarly, the dual four-form ∗ϕ given by (3.21) is also invariant. Generically, the action of σ on Y will have a non-empty fixed point set N, which is in fact a special Lagrangian submanifold of Y [30]. This gives rise to orbifold singularities on Z. The singular set is two copies of N. It is conjectured that it is possible to resolve each singular point using an ALE four-manifold with holonomy SU(2) in order to obtain a smooth manifold with holonomy G2; however, the precise details of the resolution of these singularities are not yet known. We will therefore consider only freely acting involutions, that is, those without fixed points. Manifolds defined by (3.26) with a freely acting involution were called barely G2 manifolds by Harvey and Moore in [25]. The cohomology of barely G2 manifolds is expressed in terms of the cohomology of the underlying Calabi–Yau manifold Y:

\[ H^2(Z) = H^2(Y)^+, \tag{3.27a} \]
\[ H^3(Z) = H^2(Y)^- \oplus H^3(Y)^+. \tag{3.27b} \]

Here the superscripts ± refer to the ± eigenspaces of $\sigma^*$. Thus $H^2(Y)^+$ refers to two-forms on Y which are invariant under the action of the involution σ, and correspondingly $H^2(Y)^-$ refers to two-forms which are odd under σ. Wedging an odd two-form on Y with dθ gives an invariant three-form on $Y \times S^1$, and hence these forms, together with the invariant three-forms $H^3(Y)^+$ on Y, give the three-forms on the quotient space Z. Also note that $H^1(Z)$ vanishes, since the one-form on $S^1$ is odd under $\hat\sigma$. Now, given a three-form on Y, its real part will be invariant under σ, hence $H^3(Y)^+$ is essentially the real part of $H^3(Y)$. Therefore the Betti numbers of Z in terms of the Hodge numbers of Y are

\[ b^1 = 0, \tag{3.28a} \]
\[ b^2 = h^+_{1,1}, \tag{3.28b} \]
\[ b^3 = h^-_{1,1} + h_{2,1} + 1. \tag{3.28c} \]
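The relations (3.28) are simple enough to wrap in a small helper. The sketch below is illustrative only: the function name `barely_g2_betti` and the sample inputs are assumptions, and the inputs are not claimed to correspond to any particular Calabi–Yau manifold with a freely acting real structure. The remaining Betti numbers are filled in by Poincaré duality, $b^k = b^{7-k}$, which holds for any compact oriented seven-manifold.

```python
def barely_g2_betti(h11_plus, h11_minus, h21):
    """Betti numbers b^0..b^7 of a barely G2 manifold from (3.28).

    h11_plus / h11_minus are the dimensions of the even / odd parts of
    H^{1,1}(Y) under the involution; h21 is the Hodge number h_{2,1} of Y.
    """
    b0, b1 = 1, 0
    b2 = h11_plus
    b3 = h11_minus + h21 + 1
    lower = [b0, b1, b2, b3]
    # Poincare duality on a compact oriented 7-manifold: b^k = b^(7-k).
    return lower + lower[::-1]


# Hypothetical sample values, chosen only to exercise the formula.
betti = barely_g2_betti(h11_plus=0, h11_minus=1, h21=101)
euler_characteristic = sum((-1) ** k * b for k, b in enumerate(betti))
print(betti, euler_characteristic)
```

The Euler characteristic is printed as a sanity check, since it must vanish in odd dimension.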

A class of barely G2 manifolds that are constructed from complete intersection Calabi–Yau manifolds has recently been considered in [20], where the Betti numbers of all such manifolds have been calculated explicitly.


Note that barely G2 manifolds have holonomy $SU(3) \rtimes \mathbb{Z}_2$ while the first Betti number still vanishes. This shows that a vanishing first Betti number is necessary but not sufficient for Hol(g) = G2. In fact, as shown by Joyce in [28, 29], Hol(g) = G2 if and only if the fundamental group $\pi_1(X)$ is finite.

Let us briefly describe Joyce's construction of compact torsion-free manifolds with Hol(g) = G2. Here we follow [30]. On $T^7$ we can define a flat G2 structure $(\varphi_0, g_0)$, in the same way as on $\mathbb{R}^7$. Now suppose that Γ is a finite group acting on $T^7$ that preserves the G2 structure. Then we can define the orbifold $T^7/\Gamma$. The key to resolving the orbifold singularities is to consider appropriate Quasi Asymptotically Locally Euclidean (QALE) G2 manifolds. These are seven-manifolds with a torsion-free G2 structure that is asymptotic to the G2 structure on $\mathbb{R}^7/G$, where G is a finite subgroup of G2. The orbifold $T^7/\Gamma$ is then resolved to obtain a smooth compact manifold. However, on the resolution the resulting G2 structure is not necessarily torsion-free, so it is shown that it can be deformed to a torsion-free G2 structure (ϕ, g). Further, the fundamental group is calculated, and if it is finite, then Hol(g) = G2. Using this method, Joyce found 252 topologically distinct G2 holonomy manifolds with unique pairs of Betti numbers $(b^2, b^3)$.

4. Moduli Space

4.1. Deformations of G2 structures

One of the interesting directions in the study of G2 holonomy manifolds is the structure of the moduli space. Essentially, the idea is to consider the space of all torsion-free G2 structures modulo diffeomorphisms on a manifold with fixed topology. The moduli space itself has an interesting geometry that may give further information about G2 manifolds. Currently, we can only say something about the very local structure of the G2 moduli space. For this, we take a fixed G2 structure and deform it slightly. The space of these deformations is the local moduli space.

To study it, we thus need to understand the deformations of G2 structures. Although we are mostly interested in deformations of torsion-free G2 structures, many of the results are valid for any G2 structure. Our aim is to consider infinitesimal deformations of ϕ of the form

\[ \varphi \to \varphi + \varepsilon\chi \tag{4.1} \]

for some three-form χ. As we already know, the G2 structure on X and the corresponding metric g are all determined by the invariant three-form ϕ. Hence, deformations of ϕ will induce deformations of the metric. These deformations of the metric will then also affect the deformation of ψ = ∗ϕ. In principle, "large" deformations could also be considered, and in fact, as we shall see below, in some cases closed expressions can be obtained for large deformations. However, in that case it is difficult to determine the resulting torsion class of the new G2 structure [31]. In order for the deformed ϕ to define a new G2 structure, the new ϕ must also be a positive


form (as per the definition of a G2 structure). However, it is known [30] that the bundle of positive three-forms on X is an open subbundle of $\Lambda^3T^*X$, so we can always find ε small enough for the deformed ϕ to be positive.

Using the decomposition of three-forms (3.1c), we can split χ into $\Lambda^3_1$, $\Lambda^3_7$ and $\Lambda^3_{27}$ parts, and at first let us consider each one separately. As shown by Karigiannis in [31], metric deformations can be made explicit when the three-form deformations are either in $\Lambda^3_1$ or $\Lambda^3_7$. Let us first review some of these results. First suppose

\[ \tilde\varphi = f\varphi. \tag{4.2} \]

We will also use the notation $\tilde\psi = \tilde{*}\tilde\varphi$ where $\tilde{*}$ is the Hodge star derived from the metric $\tilde g$ corresponding to $\tilde\varphi$. Then from (3.9) we get

\[ \tilde g_{ab}\sqrt{\det\tilde g} = \frac{1}{144}\tilde\varphi_{amn}\tilde\varphi_{bpq}\tilde\varphi_{rst}\hat\varepsilon^{mnpqrst} = f^3g_{ab}\sqrt{\det g}. \tag{4.3} \]

After taking the determinant on both sides, we obtain

\[ \det\tilde g = f^{\frac{14}{3}}\det g. \tag{4.4} \]

Substituting (4.4) into (4.3), we finally get

\[ \tilde g_{ab} = f^{\frac{2}{3}}g_{ab}, \tag{4.5} \]

and hence

\[ \tilde\psi = f^{\frac{4}{3}}\psi. \tag{4.6} \]
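The exponents in (4.3)–(4.6) follow from a short counting argument that can be checked with exact rational arithmetic. The sketch below is only an illustration with hypothetical variable names. It assumes the standard fact that under $g \to \lambda^2g$ in n dimensions the Hodge star on p-forms scales by $\lambda^{n-2p}$, and it also lists the first binomial coefficients of $(1 + x)^{4/3}$, which govern the expansion of $f^{4/3}$ when $f = 1 + \varepsilon a$.

```python
from fractions import Fraction


def binomial_series(alpha, order):
    """Coefficients c_0..c_order of (1 + x)**alpha = sum_k c_k x**k."""
    coeffs = [Fraction(1)]
    for k in range(1, order + 1):
        coeffs.append(coeffs[-1] * (alpha - (k - 1)) / k)
    return coeffs


n = 7                                   # dimension of the manifold
metric_exp = Fraction(2, 3)             # g~ = f^(2/3) g, as in (4.5)
det_exp = n * metric_exp                # det g~ = f^(14/3) det g, as in (4.4)
lhs_exp = metric_exp + det_exp / 2      # power of f in g~ sqrt(det g~), cf. (4.3)

# Assumed scaling rule: * on p-forms scales as lambda^(n - 2p) for g -> lambda^2 g.
# Here lambda = f^(1/3) and p = 3.
star_exp = (n - 2 * 3) * metric_exp / 2
psi_exp = star_exp + 1                  # psi~ = *~(f phi), cf. (4.6)

coeffs = binomial_series(psi_exp, 3)
print(det_exp, lhs_exp, psi_exp, coeffs)
```

The three nontrivial coefficients are the numerical factors multiplying $a\varepsilon$, $a^2\varepsilon^2$ and $a^3\varepsilon^3$ in the third-order expansion of $\tilde\psi$.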

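So, a scaling of ϕ gives a conformal transformation of the metric. Hence deformations of ϕ in the direction $\Lambda^3_1$ also give infinitesimal conformal transformations. Suppose f = 1 + εa; then to third order in ε, we can write

\[ \tilde\psi = \left(1 + \tfrac{4}{3}a\varepsilon + \tfrac{2}{9}a^2\varepsilon^2 - \tfrac{4}{81}a^3\varepsilon^3 + O(\varepsilon^4)\right)\psi. \tag{4.7} \]

Given a torsion-free G2 structure, dϕ = dψ = 0, so if we want the deformed structure to be torsion-free as well, f must be constant.

Now, suppose in general that $\tilde\varphi = \varphi + \varepsilon\chi$ for some $\chi \in \Lambda^3$. Then using (3.8) for the definition of the metric associated with $\tilde\varphi$, after some manipulations, we get:

\[ \widetilde{\langle u, v\rangle}\,\widetilde{\mathrm{vol}} = \tfrac{1}{6}(u \lrcorner \varphi) \wedge (v \lrcorner \varphi) \wedge \varphi + \tfrac{1}{2}\varepsilon\left[(u \lrcorner \chi) \wedge *(v \lrcorner \varphi) + (v \lrcorner \chi) \wedge *(u \lrcorner \varphi)\right] + \tfrac{1}{2}\varepsilon^2(u \lrcorner \chi) \wedge (v \lrcorner \chi) \wedge \varphi + \tfrac{1}{6}\varepsilon^3(u \lrcorner \chi) \wedge (v \lrcorner \chi) \wedge \chi. \tag{4.8} \]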


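Rewriting (4.8) in local coordinates, we get

\[ \tilde g_{ab}\frac{\sqrt{\det\tilde g}}{\sqrt{\det g}} = g_{ab} + \tfrac{1}{2}\varepsilon\chi_{mn(a}\varphi_{b)}{}^{mn} + \tfrac{1}{8}\varepsilon^2\chi_{amn}\chi_{bpq}\psi^{mnpq} + \tfrac{1}{24}\varepsilon^3\chi_{amn}\chi_{bpq}(*\chi)^{mnpq}. \tag{4.9} \]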

Now suppose the deformation is in the $\Lambda^3_7$ direction. This implies that

\[ \chi = \omega \lrcorner \psi \tag{4.10} \]

for some vector field ω. Look at the first order term in (4.9). From (2.28) we see that this is essentially a projection onto $\Lambda^3_1 \oplus \Lambda^3_{27}$: the traceless part gives the $\Lambda^3_{27}$ component and the trace gives the $\Lambda^3_1$ component. Hence this term vanishes for $\chi \in \Lambda^3_7$. For the third order term, it is more convenient to study it in (4.8). By looking at

\[ \omega \lrcorner \left((u \lrcorner \omega \lrcorner \psi) \wedge (v \lrcorner \omega \lrcorner \psi) \wedge \psi\right) = 0 \]

we immediately see that the third order term vanishes. So now we are left with

\[ \tilde g_{ab}\sqrt{\det\tilde g} = \left(g_{ab} + \tfrac{1}{8}\varepsilon^2\omega^c\omega^d\psi_{camn}\psi_{dbpq}\psi^{mnpq}\right)\sqrt{\det g} = \left(g_{ab}(1 + \varepsilon^2|\omega|^2) - \varepsilon^2\omega_a\omega_b\right)\sqrt{\det g} \tag{4.11} \]

where we have used a contraction identity for ψ twice. Taking the determinant of (4.11) gives

\[ \sqrt{\det\tilde g} = (1 + \varepsilon^2|\omega|^2)^{\frac{2}{3}}\sqrt{\det g}. \tag{4.12} \]

Eventually we have the following result:

Theorem 16 ([31]). Given a deformation of a G2 structure (4.1) with $\chi = \omega \lrcorner \psi \in \Lambda^3_7$, the new metric $\tilde g_{ab}$ is given by

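\[ \tilde g_{ab} = (1 + \varepsilon^2|\omega|^2)^{-\frac{2}{3}}\left(g_{ab}(1 + \varepsilon^2|\omega|^2) - \varepsilon^2\omega_a\omega_b\right) \tag{4.13} \]

and the deformed four-form $\tilde\psi$ is given by

\[ \tilde\psi = (1 + \varepsilon^2|\omega|^2)^{-\frac{1}{3}}\left(\psi + *\varepsilon(\omega \lrcorner \psi) + \varepsilon^2\omega \lrcorner *(\omega \lrcorner \varphi)\right). \tag{4.14} \]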
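The step from (4.11) to (4.12) and (4.13) rests on the determinant of a rank-one perturbation of a multiple of the identity. The following sketch checks it with exact rational arithmetic in an orthonormal frame ($g_{ab} = \delta_{ab}$). The helper names and the sample vector are hypothetical, and the determinant routine is a generic Gaussian elimination rather than anything specific to the paper.

```python
from fractions import Fraction


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def rhs_of_4_11(omega, eps):
    """Matrix g_ab (1 + eps^2 |omega|^2) - eps^2 omega_a omega_b with g = identity."""
    n = len(omega)
    scale = 1 + eps ** 2 * sum(w * w for w in omega)
    return [[(scale if a == b else 0) - eps ** 2 * omega[a] * omega[b]
             for b in range(n)] for a in range(n)]


# Arbitrary sample data in an orthonormal frame (illustrative only).
omega = [Fraction(k, 3) for k in (1, -2, 0, 4, 1, 3, -1)]
eps = Fraction(1, 2)
scale = 1 + eps ** 2 * sum(w * w for w in omega)

det_rhs = determinant(rhs_of_4_11(omega, eps))

# The left side of (4.11) has determinant (det g~)^(9/2) in seven dimensions,
# so det g~ = scale^(6 / (9/2)) and sqrt(det g~) = scale^(half of that).
det_exponent = Fraction(6) / Fraction(9, 2)
sqrt_det_exponent = det_exponent / 2
print(det_rhs == scale ** 6, det_exponent, sqrt_det_exponent)
```

The resulting exponent for $\sqrt{\det\tilde g}$ is the one appearing in (4.12), and its negative is the prefactor exponent in (4.13).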

One of the key reasons why it is possible to get these closed form expressions for the modified g and ψ is that, as shown by Karigiannis in [31], the determinant of (4.11) can be calculated in closed form. Notice that to first order in ε, both $\sqrt{\det g}$ and $g_{ab}$ remain unchanged under this deformation. Now let us examine the last term in (4.14) in more detail. Firstly, we have

\[ \omega \lrcorner *(\omega \lrcorner \varphi) = *(\omega \wedge (\omega \lrcorner \varphi)) \]


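and

\[ (\omega \wedge (\omega \lrcorner \varphi))_{mnp} = 3\omega_{[m}\omega^a\varphi_{|a|np]} = 3i_\varphi(\omega \circ \omega) \tag{4.15} \]

where $(\omega \circ \omega)_{ab} = \omega_a\omega_b$. Therefore, in (4.14), this term gives $\Lambda^4_1$ and $\Lambda^4_{27}$ components. So, we can write (4.14) as

\[ \tilde\psi = (1 + \varepsilon^2|\omega|^2)^{-\frac{1}{3}}\left(\left(1 + \tfrac{3}{7}\varepsilon^2|\omega|^2\right)\psi + *\varepsilon(\omega \lrcorner \psi) + \varepsilon^2 * i_\varphi((\omega \circ \omega)_0)\right). \tag{4.16} \]

Here $(\omega \circ \omega)_0$ denotes the traceless part of $\omega \circ \omega$, so that $i_\varphi((\omega \circ \omega)_0) \in \Lambda^3_{27}$ and thus, in (4.16), the components in different representations are now explicitly shown. To first order, we thus have the deformations

\[ \tilde\varphi = \varphi + \varepsilon(\omega \lrcorner \psi), \qquad \tilde\psi = \psi + *\varepsilon(\omega \lrcorner \psi). \]

If originally dϕ = dψ = 0, that is, the G2 structure is torsion-free, then for the deformed structure to be torsion-free to first order we need

\[ d(\omega \lrcorner \psi) = d*(\omega \lrcorner \psi) = 0. \tag{4.17} \]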

By expanding $d(\omega \lrcorner \psi)$ in terms of the decomposition of $\Lambda^4$, and setting each term individually to 0, we find that the symmetric part of $\nabla_a\omega_b$ and the $\Lambda^2_7$ part of dω must vanish. Furthermore, by expanding $*d*(\omega \lrcorner \psi)$ in terms of the decomposition of $\Lambda^2$ we find that the $\Lambda^2_{14}$ part of dω must also vanish. Hence we get that ∇ω = 0. If Hol(g) = G2, then we know that in this case ω = 0, so there are no interesting small $\Lambda^3_7$ deformations of manifolds with holonomy equal to G2.

As we have seen above, in the cases when the deformations were in the $\Lambda^3_1$ or $\Lambda^3_7$ directions, there were some simplifications which make it possible to write down all results in closed form. In the case of deformations in $\Lambda^3_{27}$, the only known way to get results for deformations of the metric and the four-form ψ is to consider the deformations order by order in ε. This analysis has been carried out in [21], and here we will review those results. So suppose we have a deformation

\[ \tilde\varphi = \varphi + \varepsilon\chi \]

where $\chi \in \Lambda^3_{27}$. Now let us set up some notation. Define

\[ \tilde s_{ab} = \frac{1}{144}\frac{1}{\sqrt{\det g}}\tilde\varphi_{amn}\tilde\varphi_{bpq}\tilde\varphi_{rst}\hat\varepsilon^{mnpqrst} \tag{4.18} \]
\[ \phantom{\tilde s_{ab}} = \tilde g_{ab}\frac{\sqrt{\det\tilde g}}{\sqrt{\det g}}. \tag{4.19} \]

From (3.9), the untilded $s_{ab}$ is then just equal to $g_{ab}$. We can rewrite (4.19) as

\[ \tilde g_{ab} = \sqrt{\frac{\det g}{\det\tilde g}}\,(g_{ab} + \delta s_{ab}) \tag{4.20} \]


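where $\delta g_{ab} = \tilde g_{ab} - g_{ab}$ is the deformation of the metric and $\delta s_{ab}$ is the deformation of $s_{ab}$, which from (4.9) is given by

\[ \delta s_{ab} = \tfrac{1}{2}\varepsilon\chi_{mn(a}\varphi_{b)}{}^{mn} + \tfrac{1}{8}\varepsilon^2\chi_{amn}\chi_{bpq}\psi^{mnpq} + \tfrac{1}{24}\varepsilon^3\chi_{amn}\chi_{bpq}(*\chi)^{mnpq}. \tag{4.21} \]

Also introduce the following shorthand notation

\[ s_k = \mathrm{Tr}((\delta s)^k) \tag{4.22} \]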

where the trace is taken using the original metric g. From (4.21), note that since $\chi \in \Lambda^3_{27}$, the first order term vanishes when taking the trace, and hence $s_1$ is at least second-order in ε. Clearly, for k > 1, the $s_k$ are at least of order k in ε. As before, take the determinant of (4.18):

\[ \left(\frac{\det\tilde g}{\det g}\right)^{\frac{9}{2}} = \frac{\det(g + \delta s)}{\det(g)}. \tag{4.23} \]

Unlike in the case of $\Lambda^3_7$ deformations, we cannot compute det(g + δs) in closed form, so we have to calculate it order by order in ε. From the standard expansion of det(I + X), we find

\[ \frac{\det(g + \delta s)}{\det g} = 1 + s_1 + \tfrac{1}{2}(s_1^2 - s_2) + \tfrac{1}{6}(s_1^3 - 3s_1s_2 + 2s_3) + O(\varepsilon^4). \tag{4.24} \]

However, as we noted above, $s_1$ is second-order in ε, so this expression actually simplifies:

\[ \frac{\det(g + \delta s)}{\det g} = 1 + s_1 - \tfrac{1}{2}s_2 + \tfrac{1}{3}s_3 + O(\varepsilon^4). \tag{4.25} \]

Raising this to the power of $-\frac{1}{9}$, and expanding again up to third order in ε, we get

\[ \left(\frac{\det g}{\det\tilde g}\right)^{\frac{1}{2}} = 1 + \tfrac{1}{18}s_2 - \tfrac{1}{9}s_1 - \tfrac{1}{27}s_3 + O(\varepsilon^4). \tag{4.26} \]
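Both ingredients of this expansion can be checked mechanically. The sketch below is illustrative only, with hypothetical helper names and an arbitrary sample matrix. It uses the fact that for a 3×3 matrix the trace expansion of det(I + X) terminates at third order, so the cubic truncation in (4.24) can be tested exactly there; it then recovers the coefficients of (4.26) by multiplying those of (4.25) by the exponent $-\frac{1}{9}$, which is enough because the square of $s_1 - \frac{1}{2}s_2 + \frac{1}{3}s_3$ is already $O(\varepsilon^4)$.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Arbitrary rational sample matrix (illustrative only).
x = [[Fraction(1, 2), Fraction(-1, 3), Fraction(2)],
     [Fraction(0), Fraction(1, 5), Fraction(-3, 4)],
     [Fraction(1), Fraction(2, 7), Fraction(-1, 6)]]

x2 = matmul(x, x)
x3 = matmul(x2, x)
s1, s2, s3 = trace(x), trace(x2), trace(x3)

identity_plus_x = [[x[i][j] + (1 if i == j else 0) for j in range(3)]
                   for i in range(3)]
lhs = det3(identity_plus_x)
rhs = 1 + s1 + (s1 ** 2 - s2) / 2 + (s1 ** 3 - 3 * s1 * s2 + 2 * s3) / 6

# Linear term of (1 + u)^(-1/9) applied to u = s1 - s2/2 + s3/3 from (4.25).
exponent = Fraction(-1, 9)
coeff_4_25 = {"s1": Fraction(1), "s2": Fraction(-1, 2), "s3": Fraction(1, 3)}
coeff_4_26 = {name: exponent * c for name, c in coeff_4_25.items()}
print(lhs == rhs, coeff_4_26)
```

In seven dimensions the same trace formula is only the beginning of the expansion, which is why (4.24) carries an $O(\varepsilon^4)$ remainder.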

Using this and (4.20), we can immediately get the deformed metric, but the expressions using the current form of $\delta s_{ab}$ are not very useful. So far, the only property of $\Lambda^3_{27}$ that we have used is that it is orthogonal to ϕ; thus, in fact, up to this point everything applies to $\Lambda^3_7$ as well. Now however, let χ be of the form

\[ \chi_{abc} = h^d{}_{[a}\varphi_{bc]d} \tag{4.27} \]

where $h_{ab}$ is traceless and symmetric, so that $\chi \in \Lambda^3_{27}$. Let us first introduce some further notation. Let $h_1$, $h_2$, $h_3$, $h_4$ be traceless, symmetric matrices, and introduce


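the following shorthand notation

\[ (\varphi h_1h_2\varphi)_{mn} = \varphi_{abm}h_1^{ad}h_2^{be}\varphi_{den}, \tag{4.28a} \]
\[ \varphi h_1h_2h_3\varphi = \varphi_{abc}h_1^{ad}h_2^{be}h_3^{cf}\varphi_{def}, \tag{4.28b} \]
\[ (\psi h_1h_2h_3\psi)_{mn} = \psi_{abcm}\psi_{defn}h_1^{ad}h_2^{be}h_3^{cf}, \tag{4.28c} \]
\[ \psi h_1h_2h_3h_4\psi = \psi_{abcm}\psi_{defn}h_1^{ad}h_2^{be}h_3^{cf}h_4^{mn}. \tag{4.28d} \]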

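It is clear that all of these quantities are symmetric in the $h_i$ and moreover $(\varphi h_1h_2\varphi)_{mn}$ and $(\psi h_1h_2h_3\psi)_{mn}$ are both symmetric in the indices m and n. Then, it can be shown that

\[ \chi_{(a|mn|}\varphi_{b)}{}^{mn} = \tfrac{4}{3}h_{ab}, \]
\[ \chi_{amn}\chi_{bpq}\psi^{mnpq} = -\tfrac{4}{7}|\chi|^2g_{ab} + \tfrac{16}{9}(h^2)_{\{ab\}} - \tfrac{4}{9}(\varphi hh\varphi)_{\{ab\}}, \]
\[ \chi_{amn}\chi_{bpq}(*\chi)^{mnpq} = \tfrac{32}{189}\mathrm{Tr}(h^3)g_{ab} - \tfrac{8}{9}(\varphi hh^2\varphi)_{\{ab\}}, \]

where as before {a b} denotes the traceless symmetric part. Using this and (4.21), we can now express $\delta s_{ab}$ in terms of h:

\[ \delta s_{ab} = \tfrac{2}{3}\varepsilon h_{ab} + g_{ab}\left(-\tfrac{1}{63}\varepsilon^2\mathrm{Tr}(h^2) + \tfrac{4}{567}\varepsilon^3\mathrm{Tr}(h^3)\right) + \varepsilon^2\left(\tfrac{2}{9}(h^2)_{\{ab\}} - \tfrac{1}{18}(\varphi hh\varphi)_{\{ab\}}\right) - \tfrac{\varepsilon^3}{27}(\varphi hh^2\varphi)_{\{ab\}} \tag{4.29} \]

and hence

\[ s_1 = \mathrm{Tr}(\delta s) = -\tfrac{1}{9}\varepsilon^2\mathrm{Tr}(h^2) + \tfrac{4}{81}\varepsilon^3\mathrm{Tr}(h^3), \tag{4.30a} \]
\[ s_2 = \mathrm{Tr}(\delta s^2) = \tfrac{4}{9}\varepsilon^2\mathrm{Tr}(h^2) + \varepsilon^3\left(\tfrac{8}{27}\mathrm{Tr}(h^3) - \tfrac{2}{27}(\varphi hhh\varphi)\right), \tag{4.30b} \]
\[ s_3 = \mathrm{Tr}(\delta s^3) = \tfrac{8}{27}\varepsilon^3\mathrm{Tr}(h^3). \tag{4.30c} \]

Substituting these expressions into (4.26) and (4.20), we can get the full expression for the deformed metric (up to third order in ε) and correspondingly the expression for the deformed four-form ψ: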

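Theorem 17 ([21]). Given a deformation of a G2 structure (4.1) with $\chi_{abc} = h^d{}_{[a}\varphi_{bc]d} \in \Lambda^3_{27}$, the new metric $\tilde g_{ab}$ is given to third order in ε by

\[ \tilde g_{ab} = \left(1 + \tfrac{1}{18}\varepsilon^2\mathrm{Tr}(h^2) + \tfrac{1}{81}\varepsilon^3\mathrm{Tr}(h^3) - \tfrac{1}{243}\varepsilon^3(\varphi hhh\varphi)\right)g_{ab} + \tfrac{2}{3}\varepsilon h_{ab} + \varepsilon^2\left(\tfrac{2}{9}(h^2)_{(ab)} - \tfrac{1}{18}(\varphi hh\varphi)_{ab}\right) + \tfrac{2}{81}\varepsilon^3h_{ab}\mathrm{Tr}(h^2) - \tfrac{\varepsilon^3}{27}(\varphi hh^2\varphi)_{ab} + O(\varepsilon^4) \tag{4.31} \]

and correspondingly, the deformed four-form $\tilde\psi$ is given by

\[ \tilde\psi = \psi - \varepsilon*\chi + \varepsilon^2\left(-\tfrac{1}{189}\mathrm{Tr}(h^2)\psi + \tfrac{1}{6}*i_\varphi((\varphi hh\varphi)_0)\right) + \varepsilon^3\left(-\tfrac{2}{1701}(\varphi hhh\varphi)\psi - \tfrac{5}{108}\mathrm{Tr}(h^2)*\chi + \tfrac{1}{18}*i_\varphi(h^3_0) - \tfrac{1}{36}*i_\varphi((\psi hhh\psi)_0) + \tfrac{1}{324}\alpha \wedge \varphi\right) + O(\varepsilon^4) \tag{4.32} \]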

where $(\varphi hh\varphi)_0$, $h^3_0$ and $(\psi hhh\psi)_0$ denote the traceless parts of $(\varphi hh\varphi)_{ab}$, $(h^3)_{ab}$ and $(\psi hhh\psi)_{ab}$, respectively, and

\[ \alpha_a = \psi_{amnp}\varphi_{rst}h^{mr}h^{ns}h^{pt}. \tag{4.33} \]

In general, if such a deformation is performed on a torsion-free G2 structure, it is not known what conditions h must satisfy in order for the torsion class to be preserved. If we restrict our analysis to first order deformations only, then it is easier to see these conditions. Suppose we have dϕ = dψ = 0 and we apply a deformation (4.1) with $\chi = i_\varphi(h)$ for h traceless and symmetric. Then to first order the conditions for $d\tilde\varphi = d\tilde\psi = 0$ are

\[ d\chi = d*\chi = 0. \]

Hence the deformation must be a form that is closed and co-closed. For a compact manifold this is thus equivalent to χ being harmonic. We can also find what this condition means in terms of h. By decomposing dχ into $\Lambda^4_1$, $\Lambda^4_7$ and $\Lambda^4_{27}$ components, we find that we must have

\[ \nabla_rh^r{}_a = 0, \tag{4.34a} \]
\[ \nabla_mh_{a(b}\varphi^{ma}{}_{c)} = 0. \tag{4.34b} \]

Further, if we decompose $*d*\chi$ into $\Lambda^2_7$ and $\Lambda^2_{14}$ components, we again get (4.34a) and moreover get a new constraint

\[ \nabla_mh_{a[b}\varphi^{ma}{}_{c]} = 0. \tag{4.35a} \]

Thus overall, for h traceless and symmetric, $\chi = i_\varphi(h)$ being closed and co-closed is equivalent to

\[ \nabla_rh^r{}_a = 0 \quad \text{and} \quad \nabla_mh_{ab}\varphi^{ma}{}_c = 0. \]

It also turns out [2] that, if χ is defined as above, then

\[ \Delta\chi = 0 \iff \Delta_Lh = 0 \]

where $\Delta_L$ is the Lichnerowicz operator given by

\[ \Delta_Lh_{ab} = \nabla^2h_{ab} + 2R_{acbd}h^{cd}. \tag{4.36} \]

Therefore, to preserve the torsion-free G2 structure, we have to limit our attention to zero modes of the Lichnerowicz operator. Note that, to linear order, traceless deformations of the metric which preserve the Ricci tensor are also precisely the Lichnerowicz zero modes, and this is consistent with (4.31), where the linear term in the metric deformation is proportional to h.

Let us compare what happens here to what happens on Calabi–Yau manifolds [14]. In that case, deformations of the metric $\delta g_{mn}$ split into deformations of mixed type $\delta g_{\mu\bar\nu}$ and deformations of pure type $\delta g_{\mu\nu}$ and $\delta g_{\bar\mu\bar\nu}$. From the mixed type deformations we can define a real (1, 1)-form

\[ i\,\delta g_{\mu\bar\nu}\,dx^\mu \wedge dx^{\bar\nu} \tag{4.37} \]

and given the holomorphic 3-form Ω, we can use the pure type deformation to define a (2, 1)-form

\[ \Omega_{\kappa\lambda}{}^{\bar\nu}\,\delta g_{\bar\mu\bar\nu}\,dx^\kappa \wedge dx^\lambda \wedge dx^{\bar\mu}. \tag{4.38} \]

In order to preserve the Calabi–Yau structure, the metric deformation must preserve the vanishing Ricci curvature, and hence $\delta g_{mn}$ must satisfy the Lichnerowicz equation:

\[ \Delta_L\delta g_{mn} = 0. \]

However, the Lichnerowicz equation for $\delta g_{mn}$ becomes equivalent to both the (1, 1)-form (4.37) and the (2, 1)-form (4.38) being harmonic. Note that the definition (4.38) is very similar to $\chi_{abc} = h^d{}_{[a}\varphi_{bc]d}$ in the G2 case, with ϕ playing the role of Ω and h the role of $\delta g_{\bar\mu\bar\nu}$.

4.2. Geometry of the moduli space

In the theory of Calabi–Yau moduli spaces, one of the key results is that the local moduli space of complex structure deformations is isomorphic to an open set in $H^{m-1,1}(X)$ where X is a Calabi–Yau m-fold. Moreover, as has been shown by Tian and Todorov [42, 43], any infinitesimal deformation can in fact be lifted to a full deformation. For the moduli spaces of G2 manifolds, however, we can only replicate the results about the local moduli space.

First let us define the moduli space of torsion-free G2 structures. Let $\mathcal{X}$ be the set of positive three-forms $\varphi \in P^3X$ such that $d\varphi = d*_\varphi\varphi = 0$. Here we use $*_\varphi$ to emphasize that the Hodge star is defined using the G2 holonomy metric that is defined by ϕ itself. Then $\mathcal{X}$ gives the set of all three-forms that correspond to oriented, torsion-free G2 structures. However, we do not want to distinguish between three-forms that are related by a diffeomorphism. Hence, let $\mathcal{D}$ be the group of all diffeomorphisms of X isotopic to the identity. This group then acts naturally on three-forms. The


moduli space of torsion-free G2 structures is then defined as the quotient $\mathcal{M} = \mathcal{X}/\mathcal{D}$. The key result by Joyce is that $\mathcal{M}$ is locally diffeomorphic to an open set of $H^3(X, \mathbb{R})$:

Theorem 18 ([28, 29]). Define a map $\Xi : \mathcal{X} \to H^3(X, \mathbb{R})$ by Ξ(ϕ) = [ϕ]. Then Ξ is invariant under the action of $\mathcal{D}$ on $\mathcal{X}$. Moreover, Ξ induces a diffeomorphism between neighborhoods of $\varphi\mathcal{D} \in \mathcal{M}$ and $[\varphi] \in H^3(X, \mathbb{R})$.

Since the dimension of $H^3(X, \mathbb{R})$ is $b^3(X)$, this result implies that $\dim\mathcal{M} = b^3(X)$. The full proof of this result can be found either in [28, 29] or [30]. This result covers the basic local properties of the G2 moduli space, but we do not yet know anything about the global structure of $\mathcal{M}$. So anything we can say about the moduli space only holds in a small neighborhood.

Looking back at the study of Calabi–Yau moduli spaces, we know that the complex structure moduli space admits a Kähler structure, and the Kähler structure moduli space admits a Hessian structure [14]. It turns out that on the G2 moduli space we can also define a Hessian structure. First let us define the notion of a Hessian manifold [39]:

Definition 19. Let M be a smooth manifold and suppose D is a flat, torsion-free connection on M. A Riemannian metric G on a flat manifold (M, D) is called Hessian if G can be locally expressed as

\[ G = D^2H \tag{4.39} \]

that is,

\[ G_{ij} = \frac{\partial^2H}{\partial x^i\partial x^j} \tag{4.40} \]

where $\{x^1, \ldots, x^n\}$ is an affine coordinate system with respect to D. Then H is called the Hessian potential.

Note that this is the closest analogue to a Kähler structure that can be defined on a real manifold. In fact, as shown by Shima [39], if we define a complex structure on the manifold TM, then the straightforward extension of G onto TM is Kähler if and only if G is a Hessian metric on (M, D). Thus the complexification of a Hessian manifold is Kähler.

In the case of the G2 moduli space $\mathcal{M}$, we know that $\mathcal{M}$ is locally diffeomorphic to an open set in $H^3(X, \mathbb{R})$. Suppose we choose a basis $[\phi_0], \ldots, [\phi_n]$ on $H^3(X, \mathbb{R})$ where $n = b^3(X) - 1$. Taking the unique harmonic representatives of the basis elements, we can expand $\varphi \in \mathcal{M}$ as

\[ \varphi = \sum_{N=0}^{n}s^N\phi_N. \tag{4.41} \]

Since $H^3(X, \mathbb{R})$ is a vector space, $s^0, \ldots, s^n$ give an affine coordinate system, which in turn defines a flat connection D = d on $\mathcal{M}$. It is trivial to check that this connection is well-defined [34].


In order to define a metric on $\mathcal{M}$, we have to choose a Hessian potential function on $\mathcal{M}$. The only natural function on $\mathcal{M}$ is the volume function V(ϕ) given by (3.5):

\[ V(\varphi) = \frac{1}{7}\int_X\varphi \wedge \psi. \]

Note that as before, $\psi = *_\varphi\varphi$ is itself a function of ϕ. So we can consider V or some function of V as potential candidates for a Hessian potential. Let us calculate the Hessian of V. Note that under a scaling $s^M \to \lambda s^M$, ϕ scales as $\varphi \to \lambda\varphi$ and from (4.6), ∗ϕ scales as $*\varphi \to \lambda^{\frac{4}{3}}*\varphi$, and so V scales as

\[ V \to \lambda^{\frac{7}{3}}V. \]

So V is homogeneous of order $\frac{7}{3}$ in the $s^M$, and hence

\[ s^M\frac{\partial V}{\partial s^M} = \frac{7}{3}V = \frac{1}{3}\int s^M\phi_M \wedge *\varphi \]

and thus,

\[ \frac{\partial V}{\partial s^M} = \frac{1}{3}\int\phi_M \wedge *\varphi. \tag{4.42} \]

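Using our results on deformations of G2 structures from Sec. 4.1, we can deduce that

\[ \partial_N(*\varphi) = \tfrac{4}{3}*\pi_1(\phi_N) + *\pi_7(\phi_N) - *\pi_{27}(\phi_N). \tag{4.43} \]

Hence differentiating (4.42) again, we find that

\[ \frac{\partial^2V}{\partial s^M\partial s^N} = \frac{4}{9}\int\pi_1(\phi_M) \wedge *\pi_1(\phi_N) + \frac{1}{3}\int\pi_7(\phi_M) \wedge *\pi_7(\phi_N) - \frac{1}{3}\int\pi_{27}(\phi_M) \wedge *\pi_{27}(\phi_N). \tag{4.44} \]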

Note that in the case when $b^1(X) = 0$ (which in particular is true when Hol(g) = G2), since $H^3_7 \cong H^1$, the $H^3_7$ component of $H^3(X, \mathbb{R})$ is trivial. Therefore, the second term in (4.44) vanishes, and we find that the signature of this metric is Lorentzian, namely $(1, b^3 - 1)$. Up to a constant factor, this definition of the moduli space metric has been used in the mathematical literature, in particular by Hitchin in [26] and Karigiannis and Leung in [34]. However, in the physics literature, in particular by Beasley and Witten in [7] and by Gutowski and Papadopoulos in [23], the potential K given by

\[ K = -3\log V \tag{4.45} \]

has been used instead.


The motivation for using this modified potential is two-fold. Firstly, this is more in line with the logarithmic Kähler potentials on Calabi–Yau moduli spaces. Secondly, and perhaps most importantly, the metric that arises from this potential appears as the target space metric of the effective theory in four dimensions when the action for 11-dimensional supergravity is reduced to four dimensions on a G2 manifold. We will hence define the moduli space metric $G_{MN}$ as

\[ G_{MN} = \frac{\partial^2K}{\partial s^M\partial s^N}. \]

Using the definition of K and (4.44), we get

\[ \frac{\partial^2K}{\partial s^M\partial s^N} = \frac{1}{V}\left[\int\pi_1(\phi_M) \wedge *\pi_1(\phi_N) - \int\pi_7(\phi_M) \wedge *\pi_7(\phi_N) + \int\pi_{27}(\phi_M) \wedge *\pi_{27}(\phi_N)\right]. \tag{4.46} \]
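The passage from (4.44) to (4.46) is a chain-rule computation, $\partial_M\partial_NK = -3\,\partial_M\partial_NV/V + 3\,\partial_MV\,\partial_NV/V^2$, and only the coefficients need tracking. The sketch below does this bookkeeping with exact fractions. It is an illustration under a stated assumption: the $\Lambda^3_1$ part of each harmonic $\phi_M$ is taken to be a constant multiple $c_M\varphi$, so that $\int\phi_M \wedge *\varphi = 7c_MV$ and $\int\pi_1(\phi_M) \wedge *\pi_1(\phi_N) = 7c_Mc_NV$. The dictionary names are hypothetical.

```python
from fractions import Fraction

# Coefficients of the three integrals in the Hessian of V, read off from (4.44).
hessian_V = {"pi1": Fraction(4, 9), "pi7": Fraction(1, 3), "pi27": Fraction(-1, 3)}

# Gradient term: dV/ds^M = (1/3) * 7 c_M V by (4.42), so
# (dV/ds^M)(dV/ds^N) / V^2 = (49/9) c_M c_N = (7/9) * (1/V) * int pi1 ^ *pi1.
gradient_term = Fraction(7, 3) ** 2 / 7

# K = -3 log V  =>  K_MN = -3 V_MN / V + 3 V_M V_N / V^2.
hessian_K = {rep: -3 * coeff for rep, coeff in hessian_V.items()}
hessian_K["pi1"] += 3 * gradient_term

print(hessian_K)
```

The resulting signs show directly that only the $\Lambda^3_7$ directions contribute negatively to the Hessian of K.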

In this case, if $b^1(X) = 0$, we get

\[ G_{MN} = \frac{1}{V}\int_X\phi_M \wedge *\phi_N. \tag{4.47} \]

This metric is then in fact Riemannian. In the physics setting, apart from the G2 three-form, there is another three-form C, and when the 11-dimensional supergravity action is reduced to four dimensions, the parameters of ϕ and C naturally combine to give a complexification of the G2 moduli space. The extension of the metric $G_{MN}$ to this complex space is then Kähler [7, 21, 23]. However, since the metric on the complexified space does not depend on C, there is not much difference in treating the moduli space as a complexified Kähler manifold or a real Hessian manifold. Here we will treat $\mathcal{M}$ as a real Hessian manifold.

Now that we have fixed a metric on $\mathcal{M}$, we can proceed to various other geometrical quantities. For this we will need to use higher derivatives of ψ. In what follows, we will assume that $b^1(X) = 0$, so that there are no harmonic forms in $H^3_7$.

Let us introduce local special coordinates on $\mathcal{M}$. Let $\phi_0 = a\varphi$ and $\phi_\mu \in \Lambda^3_{27}$ for $\mu = 1, \ldots, b^3_{27}$, so that $s^0$ defines directions parallel to ϕ and the $s^\mu$ define directions in $H^3_{27}$. Then, from the deformations of ψ in Sec. 4.1, we can extract the higher derivatives of ψ in these directions:

\[ \partial_0\partial_0\psi = \tfrac{4}{9}a^2\psi, \qquad \partial_0\partial_0\partial_0\psi = -\tfrac{8}{27}a^3\psi, \tag{4.48a} \]
\[ \partial_0\partial_\mu\psi = -\tfrac{1}{3}a*\phi_\mu, \qquad \partial_0\partial_0\partial_\mu\psi = \tfrac{2}{9}a^2*\phi_\mu, \tag{4.48b} \]
\[ \partial_\mu\partial_\nu\psi = -\tfrac{2}{189}\mathrm{Tr}(h_\mu h_\nu)\psi + \tfrac{1}{3}*i_\varphi((\varphi h_\mu h_\nu\varphi)_0), \tag{4.48c} \]
\[ \partial_0\partial_\mu\partial_\nu\psi = \tfrac{4}{567}a\,\mathrm{Tr}(h_\mu h_\nu)\psi - \tfrac{2}{9}a*i_\varphi((\varphi h_\mu h_\nu\varphi)_0), \tag{4.48d} \]
\[ \partial_\mu\partial_\nu\partial_\kappa\psi = -\tfrac{5}{18}\mathrm{Tr}(h_\mu h_\nu)*\phi_\kappa + \tfrac{1}{3}*i_\varphi((h_\mu h_\nu h_\kappa)_0) - \tfrac{1}{6}*i_\varphi((\psi h_\mu h_\nu h_\kappa\psi)_0) - \tfrac{4}{567}(\varphi h_\mu h_\nu h_\kappa\varphi)\psi, \tag{4.48e} \]

where $h_\mu$, $h_\nu$ and $h_\kappa$ are traceless symmetric matrices corresponding to the three-forms $\phi_\mu$, $\phi_\nu$ and $\phi_\kappa$, respectively.

On a Hessian manifold, there is a natural symmetric three-tensor given by the derivative of the metric, or equivalently the third derivative of the Hessian potential. We will denote this tensor $A_{MNP}$. By analogy with similar quantities on Calabi–Yau moduli spaces, this tensor is called the Yukawa coupling. Using these expressions, following [21] we can now write down all the components of $A_{MNR}$:

$$A_{000} = -14a^3, \tag{4.49a}$$

$$A_{00\mu} = 0, \tag{4.49b}$$

$$A_{0\mu\nu} = -\frac{2a}{V}\int \phi_\mu\wedge{*\phi_\nu} = -2aG_{\mu\nu}, \tag{4.49c}$$

$$A_{\mu\nu\rho} = -\frac{2}{27V}\int(\varphi h_\mu h_\nu h_\rho\varphi)\,dV. \tag{4.49d}$$

The full Riemann curvature on a Hessian manifold is then defined by

$$R^M{}_{NPQ} = \frac{1}{4}\left(A^M{}_{QR}A^R{}_{NP} - A^M{}_{PR}A^R{}_{NQ}\right). \tag{4.50}$$

Note that since the fourth derivative of $K$ is fully symmetric, the fourth derivative terms vanish here. However, we can also define the Hessian curvature tensor by

$$Q_{KLMN} = \partial_M\partial_N\partial_L\partial_K K - A_{KMR}A^R{}_{LN}. \tag{4.51}$$

This tensor is the equivalent of the Kähler curvature, and carries more information than the actual Riemann tensor (4.50). The Riemann curvature tensor is obtained from $Q$ by

$$R_{MNPQ} = \frac{1}{2}\left(Q_{MNPQ} - Q_{NMPQ}\right). \tag{4.52}$$

From (4.48), we can calculate the fourth derivatives of $K$, and hence get all the components of $Q$:

Theorem 20 ([21]). The components of the Hessian curvature tensor $Q$ corresponding to the metric (4.47) on the local moduli space of torsion-free G2 structures are given by:

$$Q_{0000} = 14a^4, \tag{4.53a}$$

$$Q_{000\mu} = 0, \tag{4.53b}$$

$$Q_{00\mu\nu} = 2a^2G_{\mu\nu}, \tag{4.53c}$$

$$Q_{0\mu\nu\rho} = -A_{\mu\nu\rho}\,a, \tag{4.53d}$$

$$\begin{aligned} Q_{\kappa\mu\nu\rho} ={}& \frac{1}{3}\left(G_{\mu\nu}G_{\kappa\rho} + G_{\mu\kappa}G_{\nu\rho} - \frac{5}{7}G_{\mu\rho}G_{\kappa\nu}\right) - G^{\tau\sigma}A_{\mu\tau\rho}A_{\kappa\nu\sigma} \\ &+ \frac{1}{V}\int\left(-\frac{2}{27}\operatorname{Tr}(h_\kappa h_\mu h_\nu h_\rho) + \frac{1}{27}(\psi h_\kappa h_\mu h_\nu h_\rho\psi) + \frac{5}{81}\operatorname{Tr}(h_{(\kappa}h_\mu)\operatorname{Tr}(h_\nu h_{\rho)})\right)\mathrm{vol}. \end{aligned} \tag{4.53e}$$
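As a consistency check on (4.48a), (4.49a) and (4.53a), one can restrict to the direction parallel to $\varphi$. The sketch below rests on assumptions that are not restated in this section: that moving along $s^0$ rescales $\varphi$ to $(1 + as^0)\varphi$, that $\psi$ and $V$ then scale with the powers $4/3$ and $7/3$ respectively, that $K = -3\log V$ up to an additive constant, that $G_{00} = \partial_0\partial_0 K$, and that $G_{0\mu} = 0$ so that only $R = 0$ contributes to the contraction in (4.51).

```latex
% Assumed scaling: \varphi(s^0) = (1 + a s^0)\varphi, hence
% \psi(s^0) = (1 + a s^0)^{4/3}\psi and V(s^0) = (1 + a s^0)^{7/3} V.
\begin{align*}
\partial_0^2\psi\big|_{s^0=0} &= \tfrac{4}{3}\cdot\tfrac{1}{3}\,a^2\psi = \tfrac{4}{9}a^2\psi,
&
\partial_0^3\psi\big|_{s^0=0} &= \tfrac{4}{3}\cdot\tfrac{1}{3}\cdot\left(-\tfrac{2}{3}\right)a^3\psi = -\tfrac{8}{27}a^3\psi,
\\
K(s^0) &= -7\log(1 + a s^0) + \text{const},
&
G_{00} &= \partial_0^2 K\big|_{s^0=0} = 7a^2,
\\
A_{000} &= \partial_0^3 K\big|_{s^0=0} = -14a^3,
&
\partial_0^4 K\big|_{s^0=0} &= 42a^4,
\\
Q_{0000} &= \partial_0^4 K - A_{000}\,G^{00}A_{000}
&
&= 42a^4 - \frac{(14a^3)^2}{7a^2} = 14a^4 .
\end{align*}
```

Under these assumptions the three values agree with (4.48a), (4.49a) and (4.53a).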

Let us look in more detail at the expression for $A_{\mu\nu\rho}$. If we define $h^a_\mu = h^a_{\mu m}\,dx^m$, then we get

$$A_{\mu\nu\rho} = -\frac{4}{9V}\int\varphi_{abc}\,h^a_\mu\wedge h^b_\nu\wedge h^c_\rho\wedge\psi. \tag{4.54}$$

Expressions for the G2 Yukawa coupling have been derived by different authors, in particular by Lee and Leung [37], de Boer, Naqvi and Shomer [16], and Karigiannis [32]. Similarly, we can rewrite (4.53e) as

$$\begin{aligned} Q_{\kappa\mu\nu\rho} ={}& \frac{1}{3}\left(G_{\mu\nu}G_{\kappa\rho} + G_{\mu\kappa}G_{\nu\rho} - \frac{5}{7}G_{\mu\rho}G_{\kappa\nu}\right) - G^{\tau\sigma}A_{\mu\tau\rho}A_{\kappa\nu\sigma} \\ &+ \frac{8}{9}\frac{1}{V}\int\psi_{abcd}\,h^a_\kappa\wedge h^b_\mu\wedge h^c_\nu\wedge h^d_\rho\wedge\varphi \\ &+ \frac{1}{81}\frac{1}{V}\int\left(5\operatorname{Tr}(h_{(\kappa}h_\mu)\operatorname{Tr}(h_\nu h_{\rho)}) - 6\operatorname{Tr}(h_\kappa h_\mu h_\nu h_\rho)\right)\mathrm{vol}. \end{aligned} \tag{4.55}$$

As we have mentioned previously, by complexifying the G2 moduli space, it is possible to turn the Hessian structure into a Kähler structure. Similarly, the Hessian curvature $Q$ becomes Kähler curvature. On Calabi–Yau manifolds, the complex structure moduli space is naturally a complex manifold, and admits a Kähler structure, while the Kähler structure moduli space is naturally a Hessian manifold, but can be complexified to become Kähler itself. We compare the various quantities on G2 moduli space and on the Calabi–Yau complex structure moduli space in Fig. 5. We can see that there are a number of similarities. This leads to a speculation that perhaps the G2 moduli space possesses more structure than is currently known. One of the key features of Calabi–Yau moduli spaces is the special geometry, that is, both have a line bundle whose first Chern class coincides with the Kähler class [19, 40]. From the physics point of view, special geometry relates to the effective theory having N = 2 supersymmetry. M-theory compactified on G2 manifolds only gives N = 1 supersymmetry, so from this point of view it is perhaps unlikely that the (complexified) G2 moduli space would admit precisely this structure. Moreover, it was shown by Alekseevsky and Cortés in [4] that a so-called special real structure on a Hessian manifold corresponds to a special Kähler structure on the tangent bundle.
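A special real manifold is a Hessian manifold on which the cubic form $DG$ (with $D$ being the flat connection, and $G$ the Hessian metric) is parallel with respect to $D$. In our terms, this would mean that the derivative of the Yukawa coupling $A$ vanishes. This is a rather strong condition which is not necessarily fulfilled in our case. So perhaps instead there is some intermediate structure that could be defined on the G2 moduli space or its complexification.

| Quantity | G2 moduli in $\Lambda^3_{27}$ | Complex structure moduli |
|---|---|---|
| Form | $\varphi$, $\psi$ | $\Omega$ |
| Deformation space | $H^3_{27}$ | $H^{(2,1)}$ |
| Metric deformation | $\frac{2}{3}h_{\mu\nu}$ | $\delta g_{\bar\mu\bar\nu}$ |
| Form deformation | $\chi_{abc} = h^d{}_{[a}\varphi_{bc]d}$ | $\chi_{\alpha\beta\bar\gamma} = -\frac{1}{2}\Omega_{\alpha\beta}{}^{\bar\delta}\,\delta g_{\bar\gamma\bar\delta}$ |
| Kähler potential | $K = -3\log\left(\int\varphi\wedge\psi\right)$ | $K = -\log\left(i\int\Omega\wedge\bar\Omega\right)$ |
| Moduli space metric | $G_{\mu\nu} = \frac{1}{V}\int\phi_\mu\wedge{*\phi_\nu}$ | $G_{\mu\bar\nu} = -\dfrac{\int\chi_\mu\wedge\bar\chi_{\bar\nu}}{\int\Omega\wedge\bar\Omega}$ |
| Yukawa coupling | $A_{\mu\nu\rho} = -\frac{4}{9V}\int\varphi_{abc}\,h^a_\mu\wedge h^b_\nu\wedge h^c_\rho\wedge\psi$ | $\kappa_{\mu\nu\rho} = -\int\Omega_{\alpha\beta\gamma}\,\chi^\alpha_\mu\wedge\chi^\beta_\nu\wedge\chi^\gamma_\rho\wedge\Omega$ |
| Curvature | $Q_{\kappa\mu\nu\rho}$ as in (4.55) | $R_{\mu\bar\nu\rho\bar\sigma} = G_{\mu\bar\nu}G_{\rho\bar\sigma} + G_{\mu\bar\sigma}G_{\rho\bar\nu} - e^{2K_C}\kappa_{\mu\rho}{}^{\bar\tau}\bar\kappa_{\bar\nu\bar\sigma\bar\tau}$ |

Fig. 5. Comparison of G2 moduli space and Calabi–Yau complex structure moduli space.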

5. Concluding Remarks

In this paper, we have reviewed the developments in the study of G2 moduli spaces. Currently only the local picture of the moduli space is known, so in the future it is natural to try to obtain at least some information on the global structure of the G2 moduli space. On Calabi–Yau manifolds, the extension to the global moduli space was originally done by Tian and Todorov [42, 43]. We have seen that there are a number of similarities in the local structure of Calabi–Yau moduli spaces and G2 moduli spaces, so it is feasible that similar global properties of G2 moduli spaces could also be derived. However, torsion-free G2 structures are very nonlinear in some aspects. In particular, the metric depends nonlinearly on $\varphi$ and hence the differential equation $\nabla\varphi = 0$ for a torsion-free structure is also nonlinear. Therefore, it is not clear how to extend infinitesimal deformations of a G2 structure to large deformations, apart from considering deformations order by order. However, even such expansions quickly get very complicated. Another possible topic for study would be to further develop approaches to mirror symmetry on G2 holonomy manifolds [22].

One possible direction for further research is to look at G2 manifolds in a slightly different way. Suppose we have type IIA superstrings on a non-compact Calabi–Yau three-fold with a special Lagrangian submanifold which is wrapped by a D6 brane which also fills $M_4$.


Then, as explained in [3], from the M-theory perspective this looks like an $S^1$ bundle over the Calabi–Yau which is degenerate over the special Lagrangian submanifold, but this seven-manifold is still a G2 manifold. The moduli space of this manifold will then be determined by the Calabi–Yau moduli and the special Lagrangian moduli. This possibly could provide more information about mirror symmetry on Calabi–Yau manifolds [41].

One more direction is to look at G2 manifolds with singularities. So far in this work we have considered only smooth G2 manifolds; however, from a physical point of view, G2 manifolds with singularities are even more interesting, as they yield more realistic matter content [1]. Also, the moduli spaces which we studied are for manifolds with fixed topology. By allowing topological transitions through singularities [15], it may be possible to find some relations between the different moduli spaces. Understanding these questions would improve our grasp of both the geometry and physics of G2 moduli spaces and the interplay between them.

References

[1] B. Acharya and E. Witten, Chiral fermions from manifolds of G(2) holonomy, hep-th/0109152.
[2] B. S. Acharya and S. Gukov, M theory and singularities of exceptional holonomy manifolds, Phys. Rept. 392 (2004) 121–189; hep-th/0409191.
[3] M. Aganagic, A. Klemm and C. Vafa, Disk instantons, mirror symmetry and the duality web, Z. Naturforsch. A 57 (2002) 1–28; hep-th/0105045.
[4] D. V. Alekseevsky and V. Cortés, Geometric construction of the r-map: From affine special real to special Kähler manifolds, arXiv:0811.1658.
[5] W. Ambrose and I. M. Singer, A theorem on holonomy, Trans. Amer. Math. Soc. 75 (1953) 428–443.
[6] J. Baez, The Octonions, Bull. Amer. Math. Soc. (N.S.) 39 (2002) 145–205.
[7] C. Beasley and E. Witten, A note on fluxes and superpotentials in M-theory compactifications on manifolds of G(2) holonomy, JHEP 07 (2002); hep-th/0203061.
[8] K. Becker, M. Becker and J. H. Schwarz, String Theory and M-Theory: A Modern Introduction (Cambridge University Press, 2007).
[9] M. Berger, Sur les groupes d'holonomie homogène des variétés à connexion affine et des variétés riemanniennes, Bull. Soc. Math. France 83 (1955) 279–330.
[10] E. Bonan, Sur les variétés riemanniennes à groupe d'holonomie G2 ou Spin(7), C. R. Acad. Sci. Paris 262 (1966) 127–129.
[11] R. Bryant and S. Salamon, On construction of some complete metrics with exceptional holonomy, Duke Math. J. 58 (1989) 829–850.
[12] R. L. Bryant, Metrics with exceptional holonomy, Ann. of Math. (2) 126(3) (1987) 525–576.
[13] R. L. Bryant, Some remarks on G2-structures, math/0305124.
[14] P. Candelas and X. de la Ossa, Moduli space of Calabi–Yau manifolds, Nucl. Phys. B 355 (1991) 455–481.
[15] M. Cvetic, G. W. Gibbons, H. Lu and C. N. Pope, M-theory conifolds, Phys. Rev. Lett. 88 (2002) 121602, pp. 4; hep-th/0112098.
[16] J. de Boer, A. Naqvi and A. Shomer, The topological G2 string, hep-th/0506211.
[17] M. J. Duff, M-theory on manifolds of G(2) holonomy: The first twenty years, hep-th/0201062.
[18] M. Fernández and A. Gray, Riemannian manifolds with structure group G2, Ann. Mat. Pura Appl. (4) 132 (1982) 19–45.
[19] D. S. Freed, Special Kaehler manifolds, Comm. Math. Phys. 203 (1999) 31–52; hep-th/9712042.
[20] S. Grigorian, Betti numbers of a class of barely G2 manifolds, arXiv:0909.4681.
[21] S. Grigorian and S.-T. Yau, Local geometry of the G2 moduli space, Comm. Math. Phys. 287 (2009) 459–488; arXiv:0802.0723.
[22] S. Gukov, S.-T. Yau and E. Zaslow, Duality and fibrations on G(2) manifolds, hep-th/0203217.
[23] J. Gutowski and G. Papadopoulos, Moduli spaces and brane solitons for M theory compactifications on holonomy G(2) manifolds, Nucl. Phys. B 615 (2001) 237–265; hep-th/0104105.
[24] F. R. Harvey, Spinors and Calibrations (Academic Press, 1990).
[25] J. A. Harvey and G. W. Moore, Superpotentials and membrane instantons, hep-th/9907026.
[26] N. J. Hitchin, The geometry of three-forms in six and seven dimensions, math/0010054.
[27] K. Hori et al., Mirror Symmetry (Amer. Math. Soc., 2003).
[28] D. D. Joyce, Compact Riemannian 7-manifolds with holonomy G2. I, J. Differential Geom. 43 (1996) 291–328.
[29] D. D. Joyce, Compact Riemannian 7-manifolds with holonomy G2. II, J. Differential Geom. 43 (1996) 329–375.
[30] D. D. Joyce, Compact Manifolds with Special Holonomy, Oxford Mathematical Monographs (Oxford University Press, 2000).
[31] S. Karigiannis, Deformations of G2 and Spin(7) structures on manifolds, Canad. J. Math. 57 (2005) 1012–1055; math/0301218.
[32] S. Karigiannis, Geometric flows on manifolds with G2 structure, I, math/0702077.
[33] S. Karigiannis, Desingularization of G2 manifolds with isolated conical singularities, Geom. Topol. 13(3) (2009) 1583–1655; arXiv:0807.3346.
[34] S. Karigiannis and N. C. Leung, Hodge theory for G2-manifolds: Intermediate Jacobians and Abel–Jacobi maps, arXiv:0709.2987.
[35] A. Kovalev, Twisted connected sums and special Riemannian holonomy, math/0012189.
[36] A. Kovalev and J. Nordström, Asymptotically cylindrical 7-manifolds of holonomy G2 with applications to compact irreducible G2-manifolds, arXiv:0907.0497.
[37] J.-H. Lee and N. C. Leung, Geometric structures on G(2) and Spin(7)-manifolds, math/0202045.
[38] J. Nordström, Deformations of asymptotically cylindrical G2-manifolds, Math. Proc. Cambridge Philos. Soc. 145(2) (2008) 311–348; arXiv:0705.4444.
[39] H. Shima, The Geometry of Hessian Structures (World Scientific Publishing, 2007).
[40] A. Strominger, Special geometry, Comm. Math. Phys. 133 (1990) 163–180.
[41] A. Strominger, S.-T. Yau and E. Zaslow, Mirror symmetry is T-duality, Nucl. Phys. B 479 (1996) 243–259; hep-th/9606040.
[42] G. Tian, Smoothness of the universal deformation space of compact Calabi–Yau manifolds and its Petersson–Weil metric, in Mathematical Aspects of String Theory (San Diego, Calif., 1986), Adv. Ser. Math. Phys., Vol. 1 (World Sci. Publishing, 1987), pp. 629–646.
[43] A. Todorov, The Weil–Petersson geometry of the moduli space of SU(n ≥ 3) (Calabi–Yau) manifolds I, Comm. Math. Phys. 126 (1989) 325–346.
[44] P. K. Townsend, The eleven-dimensional supermembrane revisited, Phys. Lett. B 350 (1995) 184–187; hep-th/9501068.
[45] P. C. West, Introduction to Supersymmetry and Supergravity (World Scientific Publishing, Singapore, 1990).
[46] E. Witten, String theory dynamics in various dimensions, Nucl. Phys. B 443 (1995) 85–126; hep-th/9503124.

October 12, J070-S0129055X10004144

2010 10:2 WSPC/S0129-055X

148-RMP

Reviews in Mathematical Physics Vol. 22, No. 9 (2010) 1099–1121 c World Scientiﬁc Publishing Company DOI: 10.1142/S0129055X10004144

A UNIFIED TREATMENT OF CONVEXITY OF RELATIVE ENTROPY AND RELATED TRACE FUNCTIONS, WITH CONDITIONS FOR EQUALITY

ANNA JENČOVÁ
Mathematical Institute, Slovak Academy of Sciences, Štefánikova 49, 814 73 Bratislava, Slovakia
[email protected]

MARY BETH RUSKAI
Department of Mathematics, Tufts University, Medford, MA 02155, USA
[email protected]

Received 14 August 2009
Revised 19 June 2010

We consider a generalization of relative entropy derived from the Wigner–Yanase–Dyson entropy and give a simple, self-contained proof that it is convex. Moreover, special cases yield the joint convexity of relative entropy, and for $\operatorname{Tr}K^*A^pKB^{1-p}$ Lieb's joint concavity in $(A, B)$ for $0 < p < 1$ and Ando's joint convexity for $1 < p \le 2$. This approach allows us to obtain conditions for equality in these cases, as well as conditions for equality in a number of inequalities which follow from them. These include the monotonicity under partial traces, and some Minkowski type matrix inequalities proved by Carlen and Lieb for $\operatorname{Tr}_1(\operatorname{Tr}_2A_{12}^p)^{1/p}$. In all cases, the equality conditions are independent of $p$; for extensions to three spaces they are identical to the conditions for equality in the strong subadditivity of relative entropy.

Keywords: Relative entropy; convex trace functions; Wigner–Yanase–Dyson entropy.

Mathematics Subject Classification 2010: 47A63, 15A45, 94A17

1. Introduction

1.1. Background

For matrices $A_{12} > 0$ acting on a tensor product of two Hilbert spaces, Carlen and Lieb ([7, 8]) considered the trace function $[\operatorname{Tr}_1(\operatorname{Tr}_2A_{12}^p)^{q/p}]^{1/q}$ and proved that it is concave when $0 \le p \le q \le 1$ and convex when $1 \le q$ and $1 \le p \le 2$. They showed that this implies that these functions and the norms they generate satisfy Minkowski type inequalities, including a natural generalization to matrices $A_{123}$ acting on a tensor product of three Hilbert spaces. They also raised the question of the conditions for equality in their inequalities. When $q = 1$, we show that this can be treated using methods developed to treat equality in the strong subadditivity of quantum entropy. Moreover, we obtain conditions for equality in a large class of related convexity inequalities, show that they are independent of $p$ in the range $0 < p < 2$, and show that for inequalities involving $A_{123}$ they are identical to the equality conditions for strong subadditivity (SSA) of quantum entropy given in [13].

These equality conditions are non-trivial and have found many applications in quantum information theory. For example, they play an important role in some recent "no broadcasting" results; see [19] and references therein. They also play a key role in Devetak and Yard's ([9]) "quantum state redistribution" protocol which gives an operational interpretation to the quantum conditional mutual information.

Our approach to proving joint convexity of relative entropy is motivated by Araki's relative modular operator ([5]), introduced to generalize relative entropy to more general situations including type III von Neumann algebras. It was subsequently used by Narnhofer and Thirring ([29]) to give a new proof of SSA. The argument given here is similar to that in [18, 31, 37]; however, the unified treatment for $0 < p < 2$ leading to equality conditions is new. Moreover, a dual treatment can be given for $-1 < p < 1$ allowing extension to the full range $(-1, 2)$.

Wigner and Yanase ([42, 43]) introduced the notion of skew information of a density matrix $\gamma$ with respect to a self-adjoint observable $K$,

$$-\frac{1}{2}\operatorname{Tr}[K, \gamma^p][K, \gamma^{1-p}] \tag{1}$$

for $p = \frac{1}{2}$, and Dyson suggested extending this to $p \in (0, 1)$. Wigner and Yanase [43] proved that (1) is convex in $\gamma$ for $p = \frac{1}{2}$ and, in his seminal paper [20] on convex trace functions, Lieb proved joint concavity for $p \in (0, 1)$ for the more general function

$$(A, B) \mapsto \operatorname{Tr}K^*A^pKB^{1-p} \tag{2}$$

for $K$ fixed and $A, B > 0$ positive semi-definite. This implies convexity of (1) and was a key step in the original proof ([23]) of the strong subadditivity (SSA) inequality of quantum entropy. Moreover, it leads to a proof of joint convexity of relative entropy$^{a}$ as well. It is less well known that Ando ([3, 4]) gave another proof which also showed that for $1 \le p \le 2$, the function (2) is jointly convex in $A, B$. The case $p = 2$ was considered earlier by Lieb and Ruskai ([24]). We modify what one might describe as Lieb's extension of the Wigner–Yanase–Dyson (WYD) entropy to a type of relative entropy in a way that allows a unified treatment of the convexity and concavity of $\operatorname{Tr}K^*A^pKB^{1-p}$ in the range $p \in (0, 2]$ and includes the usual relative entropy as a special case. Our modification retains a linear term, even for $A \ne B$. Although this might seem unnecessary for convexity and concavity questions, it is crucial to a unified treatment.

a In [23], only concavity of the conditional entropy was proved explicitly, but the same argument [36, Sec. V.B] yields joint convexity of the relative entropy. Independently, Lindblad ([26]) observed that this follows directly from (2) by differentiating at p = 1.

Lieb also considered $\operatorname{Tr}K^*A^pKB^q$ with $p, q > 0$ and $0 \le p + q \le 1$ and Ando considered $1 < q \le p \le 2$. In Sec. 2.2, we extend our results to this situation. However, we also show that for $q \ne 1 - p$, equality holds only under trivial conditions. Therefore, we concentrate on the case $q = 1 - p$.

Next, we introduce our notation and conventions. In Sec. 2, we first describe our generalization of relative entropy and prove its convexity; then consider the extension to $q \ne 1 - p$ mentioned above; and finally prove monotonicity under partial traces including a generalization of strong subadditivity to $p \ne 1$. In Sec. 3, we consider several formulations of equality conditions. In Sec. 4, we show how to use these results to obtain equality conditions in the results of Lieb and Carlen ([7, 8]). For completeness, we include an appendix which contains the proof of a basic convexity result from [37] that is key to our results.

1.2. Notation and conventions

We introduce two linear maps on the space $M_d$ of $d \times d$ matrices. Left multiplication by $A$ is denoted $L_A$ and defined as $L_A(X) = AX$; right multiplication by $B$ is denoted $R_B$ and defined as $R_B(X) = XB$. These maps are associated with the relative modular operator $\Delta_{AB} = L_AR_B^{-1}$ introduced by Araki ([5]) in a far more general context. They have the following properties:

(a) The operators $L_A$ and $R_B$ commute since

$$L_A[R_B(X)] = AXB = R_B[L_A(X)] \tag{3}$$

even when $A$ and $B$ do not commute.

(b) $L_A$ and $R_A$ are invertible if and only if $A$ is non-singular, in which case $L_A^{-1} = L_{A^{-1}}$ and $R_A^{-1} = R_{A^{-1}}$.

(c) When $A$ is self-adjoint, $L_A$ and $R_A$ are both self-adjoint with respect to the Hilbert–Schmidt inner product $\langle A, B\rangle = \operatorname{Tr}A^*B$.

(d) When $A \ge 0$, the operators $L_A$ and $R_A$ are positive semi-definite, i.e. $\operatorname{Tr}X^*L_A(X) = \operatorname{Tr}X^*AX \ge 0$ and $\operatorname{Tr}X^*R_A(X) = \operatorname{Tr}X^*XA = \operatorname{Tr}XAX^* \ge 0$.

(e) When $A > 0$, then $(L_A)^p = L_{A^p}$ and $(R_A)^p = R_{A^p}$ for all $p \ge 0$. If $A$ is also non-singular, this extends to all $p \in \mathbb{R}$. More generally, $f(L_A) = L_{f(A)}$ for $f : (0, \infty) \to \mathbb{R}$.

To see why (e) holds, it suffices to observe that $A > 0$ implies $L_A$ and $R_A$ are linear operators for which $f(A)$ can be defined by the spectral theorem for any function $f$ with domain in $(0, \infty)$. It is easy to verify that $A|\phi_j\rangle = \alpha_j|\phi_j\rangle$ implies $L_A|\phi_j\rangle\langle\phi_k| = \alpha_j|\phi_j\rangle\langle\phi_k|$ for $k = 1, \ldots, d$ so that the spectral decomposition of $A$ induces one on $L_A$ with degeneracy $d$ and $f(L_A)|\phi_j\rangle\langle\phi_k| = f(\alpha_j)|\phi_j\rangle\langle\phi_k|$.


For $R_B$ a similar argument goes through starting with left eigenvectors of $B$, i.e. $\langle\phi_j|B = \beta_j\langle\phi_j|$.

If a function is homogeneous of degree 1, then convexity is equivalent to subadditivity. Thus, if $F(\lambda A) = \lambda F(A)$, then $F$ is convex if and only if $F(A) \le \sum_j F(A_j)$ with $A = \sum_j A_j$. We will use this equivalence without further ado.

For $B$ positive semi-definite, we denote the projection onto $(\ker B)^\perp$ by $P_{(\ker B)^\perp}$. We will encounter expressions involving commuting positive semi-definite matrices $A, D$ with $\ker D \subseteq \ker A$. We will simply write $AD^{-1}$ for

$$\lim_{\epsilon\to 0}\sqrt{A}\,(D + \epsilon I)^{-1}\sqrt{A} = AD^{-1}P_{(\ker D)^\perp} = AD^{-1}P_{(\ker A)^\perp} \tag{4}$$

with $D^{-1}$ the generalized inverse.
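Property (a) of Sec. 1.2 can be made concrete by writing $L_A$ and $R_B$ as $d^2 \times d^2$ matrices. The following is a minimal sketch using only the Python standard library; the row-major vectorization convention, under which $L_A$ becomes $A \otimes I$ and $R_B$ becomes $I \otimes B^T$, and all helper names are choices made here for illustration and do not come from the paper.

```python
def matmul(P, Q):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

def transpose(P):
    return [list(row) for row in zip(*P)]

def kron(P, Q):
    """Kronecker product; rows are indexed by (i, k), columns by (j, l)."""
    return [[P[i][j] * Q[k][l] for j in range(len(P[0])) for l in range(len(Q[0]))]
            for i in range(len(P)) for k in range(len(Q))]

def vec(X):
    """Row-major vectorization of X as a column (list of one-element rows)."""
    return [[x] for row in X for x in row]

I2 = [[1, 0], [0, 1]]
A = [[2, 1], [1, 3]]
B = [[1, 4], [0, 5]]            # AB != BA for this pair
X = [[1, 2], [3, 4]]

L_A = kron(A, I2)               # vec(AX) = (A kron I) vec(X)
R_B = kron(I2, transpose(B))    # vec(XB) = (I kron B^T) vec(X)

# L_A and R_B commute as operators although A and B do not commute as matrices.
commute = matmul(L_A, R_B) == matmul(R_B, L_A)
# Applying both to X reproduces AXB, as in (3).
same_as_AXB = matmul(matmul(L_A, R_B), vec(X)) == vec(matmul(matmul(A, X), B))
print(commute, same_as_AXB)
```

Integer entries are used so that the comparisons are exact; with floating-point entries one would compare up to a tolerance instead.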

2. WYD Entropy Revisited and Extended

2.1. Generalization of relative entropy

We now introduce the family of functions

$$g_p(x) = \begin{cases} \dfrac{1}{p(1-p)}(x - x^p) & p \ne 1 \\[1ex] x\log x & p = 1, \end{cases} \tag{5}$$

which are well-defined for $x > 0$ and $p \ne 0$. We will consider $p \in (0, 2]$ although it would suffice to consider $p \in [\frac{1}{2}, 2]$. For $A, B$ strictly positive we define

$$J_p(K, A, B) \equiv \operatorname{Tr}\sqrt{B}\,K^*g_p(L_AR_B^{-1})(K\sqrt{B}) \tag{6}$$

$$= \begin{cases} \dfrac{1}{p(1-p)}\left(\operatorname{Tr}K^*AK - \operatorname{Tr}K^*A^pKB^{1-p}\right) & p \in (0, 1) \cup (1, 2), \\[1ex] \operatorname{Tr}KK^*A\log A - \operatorname{Tr}K^*AK\log B & p = 1, \\[1ex] -\dfrac{1}{2}\left(\operatorname{Tr}K^*AK - \operatorname{Tr}AKB^{-1}K^*A\right) & p = 2. \end{cases} \tag{7}$$

When $p = 1$ and $K = I$, (6) reduces to the usual relative entropy, i.e.

$$J_1(I, A, B) = H(A, B) = \operatorname{Tr}A(\log A - \log B). \tag{8}$$
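The extension by continuity to $p = 1$ can be illustrated in the simplest situation. The sketch below assumes, purely for illustration, that $K = I$ and that $A$ and $B$ are commuting diagonal matrices with positive entries, so that (6) reduces to the scalar sum $\sum_i b_i\,g_p(a_i/b_i)$; the function names are hypothetical and only the Python standard library is used.

```python
import math

def g(p, x):
    """Scalar g_p(x) from (5), for x > 0 and p != 0."""
    if p == 1:
        return x * math.log(x)
    return (x - x ** p) / (p * (1 - p))

def j_commuting(p, a, b):
    """J_p(I, A, B) for commuting A = diag(a), B = diag(b) with positive entries.

    In this case Tr sqrt(B) g_p(L_A R_B^{-1}) sqrt(B) = sum_i b_i * g_p(a_i / b_i).
    """
    return sum(bi * g(p, ai / bi) for ai, bi in zip(a, b))

def relative_entropy(a, b):
    """Classical relative entropy sum_i a_i (log a_i - log b_i), cf. (8)."""
    return sum(ai * (math.log(ai) - math.log(bi)) for ai, bi in zip(a, b))

a = [0.5, 0.3, 0.2]
b = [0.2, 0.2, 0.6]
h = relative_entropy(a, b)
for p in (0.5, 0.9, 0.999, 1.0, 1.001, 1.5):
    # The values approach h as p tends to 1 from either side.
    print(p, j_commuting(p, a, b), h)
```

In this commuting case the $p = 2$ value also matches the last line of (7), namely $-\frac{1}{2}\left(\sum_i a_i - \sum_i a_i^2/b_i\right)$.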

For $p \ne 1$, the function $J_p(K, A, B)$ differs from that considered by Lieb ([20]) and Ando ([3, 4]) by the seemingly irrelevant linear term $\operatorname{Tr}K^*AK$ and the factor $\frac{1}{p(1-p)}$. However, this minor difference allows us to give a unified treatment of $p \in (0, 2]$ because of the extension by continuity to $p = 1$ and the sign change there.

One might expect to associate the exchange $A \leftrightarrow B$ with the symmetry $p \leftrightarrow (1 - p)$ around $p = \frac{1}{2}$. However, there are several subtleties due to the linear term, the exchange $K \leftrightarrow K^*$, and the case $p = 1$. Therefore, we use instead the observation that

$$\begin{aligned} J_p(K^*, B, A) &= \operatorname{Tr}\sqrt{A}\,K\,g_p(L_BR_A^{-1})(K^*\sqrt{A}) \\ &= \operatorname{Tr}\sqrt{B}\,K^*\,\tilde g_{1-p}(L_AR_B^{-1})(K\sqrt{B}) = \tilde J_{1-p}(K, A, B) \end{aligned} \tag{9}$$

where, for $-1 \le p < 1$, we define

$$\tilde g_p(x) = x\,g_{1-p}(x^{-1}) = \begin{cases} \dfrac{1}{p(1-p)}(1 - x^p) & p \ne 0 \\[1ex] -\log x & p = 0 \end{cases} \tag{10}$$

and $\tilde J_p(K, A, B) = \operatorname{Tr}\sqrt{B}\,K^*\,\tilde g_p(L_AR_B^{-1})(K\sqrt{B})$.

The functions $J_p(K, A, B)$ and $\tilde J_p(K, A, B)$ have been considered before, usually with $K = I$, in the context of information geometry ([2, Sec. 7.2] and references therein) and by Petz ([31]) who used the term "quasi-entropy". What is novel here is that we present a simple unified proof of joint convexity in $A, B$ that easily yields equality conditions, shows that they are independent of $p$, and can be extended to other functions.

The special case $J_p(I, A, I)$ is equivalent$^{b}$ to the Tsallis ([40]) entropy. When $K = K^*$, the relation

$$J_p(K, A, A) = -\frac{1}{2p(1-p)}\operatorname{Tr}[K, A^p][K, A^{1-p}] \tag{11}$$

yields the original WYD information (up to a constant) and extends it to the range $(0, 2]$. Moreover, $K = K^*$ implies that $J_p(K, A, A) = \tilde J_{1-p}(K, A, A)$. Although neither $g_p(w)$ nor $\tilde g_{1-p}(w)$ is positive, their average$^{c}$ $G_p(w) \equiv \frac{1}{2}[g_p(w) + w\,g_p(w^{-1})] \ge 0$ on $(0, \infty)$. Therefore, when $K = K^*$,

$$J_p(K, A, A) = \operatorname{Tr}(K\sqrt{A})^*\,G_p(L_AR_A^{-1})(K\sqrt{A}) \ge 0. \tag{12}$$

The function $J_p(I, A, B)$ is a more appealing generalization of relative entropy than $\operatorname{Tr}A^pB^{1-p}$ because of Proposition 1, which one can consider to be a generalization of Klein's inequality ([17]). It allows one to use $J_p(I, A, B)$ as a pseudo-metric, as is commonly done with the relative entropy.

b This was pointed out by Karol Życzkowski.

c The definition of $\tilde g_p$ in (10) differs from that in [18] by the exchange $\tilde g_p \leftrightarrow \tilde g_{1-p}$ so that in [18] $G(w) = \frac{1}{2}[g(w) + \tilde g(w)]$ for any $g$. In the convention used here, $G_p(w) = \frac{1}{2}[g_p(w) + \tilde g_{1-p}(w)]$.

Proposition 1. When $U$ is unitary and $A, B > 0$ with $\operatorname{Tr}A = \operatorname{Tr}B = 1$, then $J_p(U, A, B) \ge 0$ with equality if and only if $A = U^*BU$.

Proof. When $U$ is unitary,

$$J_p(U, A, B) = J_p(I, U^*AU, B) = J_p(I, A, UBU^*). \tag{13}$$

Therefore, it suffices to consider the case $U = I$. For $p \in (0, 1)$ Hölder's inequality implies $\operatorname{Tr}A^pB^{1-p} \le (\operatorname{Tr}A)^p(\operatorname{Tr}B)^{1-p} = 1$ with equality if and only if $A = B$. It immediately follows that

$$J_p(I, A, B) \ge \frac{1}{p(1-p)}(\operatorname{Tr}A - 1) = 0 \quad\text{and}\quad J_p(I, A, B) = 0 \Leftrightarrow A = B. \tag{14}$$

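For $p = 1$, the result is well-known [38, Sec. 2.5.2] and originally due to Klein ([17]). For $p \in (1, 2)$ we write $p = 1 + r$ and again use Hölder's inequality

$$\begin{aligned} 1 = \operatorname{Tr}A &= \operatorname{Tr}\left[B^{-\frac{r}{2(r+1)}}AB^{-\frac{r}{2(r+1)}}\right]B^{\frac{r}{r+1}} \\ &\le \left[\operatorname{Tr}\left(B^{-\frac{r}{2(r+1)}}AB^{-\frac{r}{2(r+1)}}\right)^{1+r}\right]^{\frac{1}{1+r}}(\operatorname{Tr}B)^{\frac{r}{1+r}} \\ &\le \left[\operatorname{Tr}B^{-\frac{r}{2}}A^{1+r}B^{-\frac{r}{2}}\right]^{\frac{1}{1+r}} = \left[\operatorname{Tr}A^{1+r}B^{-r}\right]^{\frac{1}{1+r}} \end{aligned} \tag{15}$$

where we used $\operatorname{Tr}B = 1$ and the second inequality follows from a classic result of Lieb–Thirring [25, Appendix B, Theorem 9].

Because the denominator $p(1 - p)$ changes sign at $p = 0$ and $p = 1$, both $g_p$ and $\tilde g_p$ are convex. In fact, they satisfy the much stronger condition of operator convexity for $p \in (0, 2]$ and $p \in [-1, 1)$ respectively. Since $g(0) = 0$ and

$$\frac{g_p(x)}{x} = \begin{cases} \dfrac{1}{p(1-p)}(1 - x^{p-1}) & p \ne 1 \\[1ex] \log x & p = 1, \end{cases} \tag{16}$$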

it follows that $g_p(x)/x$ is operator monotone [3, 10, 27], for $p \in (0, 2]$, i.e. $g_p$ can be analytically continued to the upper half plane, which it maps into itself. By applying Nevanlinna's theorem [1, Sec. 59, Theorem 2] to $g_p(x)/x$, one finds that $g_p(x)$ has an integral representation of the form

$$\begin{aligned} g_p(x) &= ax + \int_0^\infty\frac{x^2t - x}{x + t}\,d\nu(t) \\ &= ax + \int_0^\infty\left[\frac{x^2}{x + t} - \frac{1}{t} + \frac{1}{x + t}\right]t\,d\nu(t) \end{aligned} \tag{17}$$

with $\nu(t) \ge 0$. Integral representations are not unique, and making a suitable change of variable in the classic formula

$$\int_0^\infty\frac{x^{p-1}}{x + 1}\,dx = \frac{\pi}{\sin p\pi} \equiv \frac{1}{c_p}, \qquad p \in (0, 1) \tag{18}$$

allows us to give the following explicit representations

$$g_p(x) = \begin{cases} \dfrac{1}{p(1-p)}\left[x + c_p\displaystyle\int_0^\infty\left(\frac{t}{x + t} - 1\right)t^{p-1}\,dt\right] & p \in (0, 1), \\[2ex] \displaystyle\int_0^\infty\left[\frac{x^2}{x + t} - 1 + \frac{t}{x + t}\right]\frac{1}{1 + t}\,dt & p = 1, \\[2ex] \dfrac{1}{p(1-p)}\left[x - c_{p-1}\displaystyle\int_0^\infty\frac{x^2}{x + t}\,t^{p-2}\,dt\right] & p \in (1, 2), \\[2ex] \dfrac{1}{2}(-x + x^2) & p = 2. \end{cases} \tag{19}$$
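The first case of (19) can be checked directly from (18). The short computation below is a sketch of the change of variable involved; the substitution $t = xu$ and the name $u$ of the new variable are notation chosen here.

```latex
% For x > 0 and 0 < p < 1, substitute t = x u (so dt = x du) and use (18):
\begin{align*}
\int_0^\infty \frac{t^{p-1}}{x+t}\,dt
  &= x^{p-1}\int_0^\infty \frac{u^{p-1}}{1+u}\,du = \frac{x^{p-1}}{c_p},\\
c_p\int_0^\infty\Big(\frac{t}{x+t}-1\Big)t^{p-1}\,dt
  &= -c_p\,x\int_0^\infty\frac{t^{p-1}}{x+t}\,dt = -x^{p},\\
\frac{1}{p(1-p)}\Big[x + c_p\int_0^\infty\Big(\frac{t}{x+t}-1\Big)t^{p-1}\,dt\Big]
  &= \frac{1}{p(1-p)}\,(x-x^{p}) = g_p(x).
\end{align*}
```

The case $p \in (1, 2)$ follows in the same way, with $p$ replaced by $p - 1$ in the first line.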

Note that for $p \in (0, 2)$ the integrand is supported on $(0, \infty)$. This plays a key role in the equality conditions; therefore, we will henceforth concentrate on $p \in (0, 2)$.

Theorem 2. The function $J_p(K, A, B)$ defined in (6) is jointly convex in $A, B$.

Proof. It follows from (17) that

$$\begin{aligned} J_p(K, A, B) = a\operatorname{Tr}K^*AK + \int_0^\infty\Big[&\operatorname{Tr}K^*A\frac{1}{L_A + tR_B}(AK) - \frac{1}{t}\operatorname{Tr}KBK^* \\ &+ \operatorname{Tr}BK^*\frac{1}{L_A + tR_B}(KB)\Big]\,t\,\nu(t)\,dt. \end{aligned} \tag{20}$$

The joint convexity then follows immediately from that of the map $(X, A, B) \mapsto \operatorname{Tr}X^*\frac{1}{L_A + tR_B}(X)$ which was proved in [37] following the strategy in [24]. The proof is also given in the Appendix. For other approaches, see [30, 31, 11]. The advantage of the argument used here is that it immediately implies that equality holds in joint convexity if and only if it holds for each term in the integrand.

Corollary 3. The relative entropy $H(A, B) = J_1(I, A, B)$ is jointly convex in $A, B$.

2.2. Extensions with $r \ne 1 - p$

We now consider extensions of Theorem 2 to situations considered by Ando ([4]) and Lieb ([20]) in which $B^{1-p}$ is replaced by $B^r$ with $r \ne 1 - p$. Our approach uses an idea from Bekjan ([6]) and Effros ([11]). We will also show that equality holds in these extensions only under trivial conditions. For this we first need an elementary lemma, which we prove for the concave case.

Lemma 4. Let $f : [0, \infty) \to \mathbb{R}$ be a nonlinear operator convex or operator concave function, let $A_1, A_2$ be density matrices and $A = \lambda A_1 + (1 - \lambda)A_2$ with $\lambda \in (0, 1)$. Then $f(A) = \lambda f(A_1) + (1 - \lambda)f(A_2)$ if and only if $A_1 = A_2$.


Proof. Since any operator concave function is analytic, nonlinearity implies that $f$ is strictly concave. If $f(A) = \lambda f(A_1) + (1 - \lambda)f(A_2)$, then

$$\langle v, f(A)v\rangle = \lambda\langle v, f(A_1)v\rangle + (1 - \lambda)\langle v, f(A_2)v\rangle \tag{21}$$

for any vector $v$. Now choose $v$ to be a normalized eigenvector of $A$. Then inserting this on the left above and applying Jensen's inequality to each term on the right, one finds

$$f(\langle v, Av\rangle) \le \lambda f(\langle v, A_1v\rangle) + (1 - \lambda)f(\langle v, A_2v\rangle). \tag{22}$$

But this contradicts concavity unless equality holds, which implies that $v$ is also an eigenvector of $A_1$ and $A_2$. But then the strict concavity of $f$ also implies that $\langle v, A_1v\rangle = \langle v, A_2v\rangle$. Since this holds for an orthonormal basis of eigenvectors of $A$, $A_1$ and $A_2$, we must have $A_1 = A_2$.

Corollary 5. The function $(A, B) \mapsto \operatorname{Tr}K^*A^pKB^r$ is jointly concave on the set of positive definite matrices when $p, r \ge 0$ and $p + r \le 1$. Moreover, when $p + r < 1$ and $K$ is invertible, the concavity is strict unless $B_1 = B_2$ and $A_1 = A_2$.

Proof. It is an immediate consequence of Theorem 2 that $(A, B) \mapsto \operatorname{Tr}K^*A^pKB^{1-p}$ is jointly concave in $A, B$. Now write $\operatorname{Tr}K^*A^pKB^r = \operatorname{Tr}K^*A^pK(B^s)^{1-p}$ with $s = r/(1 - p)$. First, observe that for $0 < s < 1$ the function $f(x) = x^s$ satisfies the hypotheses of Lemma 4. Therefore,

$$(\lambda B_1 + (1 - \lambda)B_2)^s > \lambda B_1^s + (1 - \lambda)B_2^s \tag{23}$$

with $0 < \lambda < 1$ and $B_1 \ne B_2$. The operator monotonicity of $x \mapsto x^{1-p}$ for $0 < p < 1$ then implies

$$(\lambda B_1 + (1 - \lambda)B_2)^r > (\lambda B_1^s + (1 - \lambda)B_2^s)^{1-p}, \tag{24}$$

and the joint concavity of $\operatorname{Tr}K^*A^pKB^{1-p}$ implies

$$\begin{aligned} \operatorname{Tr}K^*A^pK(B^s)^{1-p} &\ge \operatorname{Tr}K^*(\lambda A_1 + (1 - \lambda)A_2)^pK(\lambda B_1^s + (1 - \lambda)B_2^s)^{1-p} \\ &\ge \lambda\operatorname{Tr}K^*A_1^pKB_1^{s(1-p)} + (1 - \lambda)\operatorname{Tr}K^*A_2^pKB_2^{s(1-p)} \end{aligned} \tag{25}$$

where $A = \lambda A_1 + (1 - \lambda)A_2$, $B = \lambda B_1 + (1 - \lambda)B_2$, which is precisely the joint concavity of $\operatorname{Tr}K^*A^pKB^r$. Moreover, equality in joint concavity implies equality in (25) and, since $K^*A^pK$ is strictly positive, this implies equality in (23). Therefore, equality in (25) gives a contradiction unless $B_1 = B_2$. In that case, the joint concavity reduces to concavity in $A$ for which, by a similar argument, equality holds if and only if $A_1 = A_2$.

Corollary 6. The function $(A, B) \mapsto \operatorname{Tr}K^*A^pKB^{1-r}$ is jointly convex on the set of positive definite matrices when $1 < r \le p \le 2$. Moreover, when $r < p$ and $K$ is invertible, the convexity is strict unless $B_1 = B_2$ and $A_1 = A_2$.


Proof. The argument is similar to that for Corollary 5. Write $\operatorname{Tr}K^*A^pKB^{1-r} = \operatorname{Tr}K^*A^pK(B^s)^{1-p}$ with $s = \frac{1-r}{1-p}$. Since $s \in (0, 1)$ and $1 - p \in (-1, 0)$ when $1 < r < p < 2$, it follows that $x^s$ is operator concave and $x^{1-p}$ is operator monotone decreasing.

2.3. Monotonicity under partial traces

Let $X$ and $Z$ denote the generalized Pauli operators whose action on the standard basis is $X|e_k\rangle = |e_{k+1}\rangle$ (with subscript addition mod $d$) and $Z|e_k\rangle = e^{i2\pi k/d}|e_k\rangle$. It is well known and easy to verify that $\frac{1}{d}\sum_k Z^kAZ^{-k}$ is the projection of a matrix onto its diagonal. If $D$ is a diagonal matrix, then $\sum_k X^kDX^{-k} = (\operatorname{Tr}D)I$. Now let $\{W_n\}_{n=1,2,\ldots,d^2}$ denote some ordering of the generalized Pauli operators, e.g., $W_{j+(k-1)d} = X^jZ^k$ with $j, k = 1, 2, \ldots, d$. Then $\frac{1}{d}\sum_n W_nAW_n^* = (\operatorname{Tr}A)I$ and

$$\frac{1}{d}\sum_n(W_n \otimes I_2)A_{12}(W_n \otimes I_2)^* = I_1 \otimes (\operatorname{Tr}_1A) = I_1 \otimes A_2. \tag{26}$$
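The averaging identity $\frac{1}{d}\sum_n W_nAW_n^* = (\operatorname{Tr}A)I$ stated just before (26) can be checked numerically. The following is a minimal sketch for $d = 3$ using only the Python standard library; the helper names are illustrative, and the exponents run over $0, \ldots, d-1$, which gives the same set of operators as $1, \ldots, d$ because $X^d = Z^d = I$.

```python
import cmath

def matmul(P, Q):
    """Product of two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(P):
    """Conjugate transpose."""
    n = len(P)
    return [[complex(P[j][i]).conjugate() for j in range(n)] for i in range(n)]

def identity(d):
    return [[complex(i == j) for j in range(d)] for i in range(d)]

def shift(d):
    """X with X|e_k> = |e_{k+1 mod d}>."""
    return [[complex(i == (j + 1) % d) for j in range(d)] for i in range(d)]

def clock(d):
    """Z with Z|e_k> = exp(2*pi*i*k/d)|e_k>."""
    return [[cmath.exp(2j * cmath.pi * k / d) if i == k else 0j for k in range(d)]
            for i in range(d)]

def twirl(A):
    """(1/d) times the sum of W A W^* over the d^2 operators W = X^j Z^k."""
    d = len(A)
    X, Z = shift(d), clock(d)
    total = [[0j] * d for _ in range(d)]
    Xj = identity(d)
    for _ in range(d):
        Zk = identity(d)
        for _ in range(d):
            W = matmul(Xj, Zk)
            term = matmul(matmul(W, A), dagger(W))
            total = [[total[r][c] + term[r][c] for c in range(d)] for r in range(d)]
            Zk = matmul(Zk, Z)
        Xj = matmul(Xj, X)
    return [[entry / d for entry in row] for row in total]

A = [[1, 2 + 1j, 0], [0.5, 3, -1j], [4, 1, -2]]   # arbitrary test matrix with Tr A = 2
T = twirl(A)
for row in T:
    print([round(z.real, 9) + round(z.imag, 9) * 1j for z in row])
```

The inner sum over powers of $Z$ removes the off-diagonal entries and the outer sum over powers of $X$ averages the diagonal, which is the two-step argument given in the text.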

Using the fact that replacing $W_n$ by $U W_n U^*$ with $U$ unitary simply corresponds to a change of basis which does not affect (26), and then multiplying both sides by $U^* \otimes I_2$ on the left and $U \otimes I_2$ on the right, gives the equivalent expression

$$\frac{1}{d}\sum_n (W_n U^* \otimes I_2) A_{12} (W_n U^* \otimes I_2)^* = I_1 \otimes A_2. \tag{27}$$

Combining this with joint convexity yields a slight generalization of the well-known monotonicity of $J_p(K,A,B)$ under partial traces (MPT), first proved by Lieb in [20] for the case $K_{12} = I_1 \otimes K_2$ when $p \in (0,1)$.

Theorem 7. Let $J_p$ be as in (7), $A_{12}, B_{12}$ strictly positive in $M_{d_1} \otimes M_{d_2}$ and $K_{12} = V_1 \otimes K_2$ with $V_1$ unitary in $M_{d_1}$. Then

$$J_p(K_2, A_2, B_2) \le J_p(K_{12}, A_{12}, B_{12}). \tag{28}$$

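Proof. Writing $W_n$ for $W_n \otimes I_2$ and $V$ for $V_1 \otimes I_2$ and using (27) gives

$$\begin{aligned}
J_p(K_2, A_2, B_2) &= \frac{1}{d_1} J_p(I_1 \otimes K_2,\, I_1 \otimes A_2,\, I_1 \otimes B_2) \\
&= \frac{1}{d_1} J_p\Big(I_1 \otimes K_2,\ \frac{1}{d_1}\sum_n W_n V^* A_{12} V W_n^*,\ \frac{1}{d_1}\sum_n W_n B_{12} W_n^*\Big) \\
&\le \frac{1}{d_1^2}\sum_n J_p\big(I_1 \otimes K_2,\ W_n (V_1^* \otimes I_2) A_{12} (V_1 \otimes I_2) W_n^*,\ W_n B_{12} W_n^*\big) \\
&= J_p(V_1 \otimes K_2, A_{12}, B_{12})
\end{aligned}$$

where the final equality follows from the unitary invariance of the trace.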


A. Jenˇ cov´ a & M. B. Ruskai

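Because $\operatorname{Tr}_{12} (V_1 \otimes K_2) A_{12} (V_1 \otimes K_2)^* = \operatorname{Tr}_2 K_2 A_2 K_2^*$, (28) is equivalent to

$$\operatorname{Tr} K_2^* A_2^p K_2 B_2^{1-p} - \operatorname{Tr} (V_1 \otimes K_2)^* A_{12}^p (V_1 \otimes K_2) B_{12}^{1-p} \ \begin{cases} \ge 0 & p \in (0,1) \\ \le 0 & p \in (1,2). \end{cases} \tag{29}$$

We can obtain a weak reversal of this for $p \in (0,1)$. The argument in the Appendix shows that for any $p$ and fixed $A, B \ge 0$ both $\operatorname{Tr} K^* A^p K B^{1-p}$ and $\operatorname{Tr} K^* A K$ are convex in $K$. This was observed earlier by Lieb ([20]) and also follows from the results in [24]. One can then apply the argument above in the special case $A_{12} = I_1 \otimes A_2$, $B_{12} = I_1 \otimes B_2$ to conclude that

$$\operatorname{Tr} K_2^* A_2^p K_2 B_2^{1-p} \le \frac{1}{d_1} \operatorname{Tr} K_{12}^* (I_1 \otimes A_2)^p K_{12} (I_1 \otimes B_2)^{1-p} \tag{30}$$

$$\le \operatorname{Tr} K_{12}^* (I_1 \otimes A_2)^p K_{12} (I_1 \otimes B_2)^{1-p} \tag{31}$$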

independent of whether $p < 1$ or $p > 1$. However, because the term $\operatorname{Tr} K^* A K$ is convex rather than linear in $K$, (30) does not allow us to draw any conclusions about the monotonicity of $J_p(K_{12}, I_1 \otimes A_2, I_1 \otimes B_2)$.

To prove Theorem 7 we showed that joint convexity implies monotonicity; the reverse implication also holds. Let $A_1, \ldots, A_m, B_1, \ldots, B_m$ be positive definite matrices in $M_d$, $A = \sum_j A_j$, $B = \sum_j B_j$, and put

$$\widetilde{A}_{12} = \sum_j |e_j\rangle\langle e_j| \otimes A_j, \qquad \widetilde{B}_{12} = \sum_j |e_j\rangle\langle e_j| \otimes B_j, \tag{32}$$

for $e_1, \ldots, e_m$ the standard basis of $\mathbb{C}^m$. Then $\widetilde{A}_{12}$ and $\widetilde{B}_{12}$ are block diagonal, and $\widetilde{A}_2 = \operatorname{Tr}_1 \widetilde{A}_{12} = \sum_k A_k = A$ and similarly for $B$. Then if monotonicity under partial traces holds, one can conclude that

$$J_p(K, A, B) = J_p(K, \widetilde{A}_2, \widetilde{B}_2) \le J_p(I_1 \otimes K, \widetilde{A}_{12}, \widetilde{B}_{12}) = \sum_j J_p(K, A_j, B_j). \tag{33}$$

Thus, monotonicity under partial traces also directly implies joint convexity of $J_p$. Applying (28) in the case $K = I$, and $A_{12} \to A_{123}$ and $B_{12} \to A_{12} \otimes I_3$ gives

$$J_p(I_{23}, A_{23}, A_2 \otimes I_3) \le J_p(I_{123}, A_{123}, A_{12} \otimes I_3). \tag{34}$$

When $p = 1$, it follows from (7) that $J_1(I_{23}, A_{23}, A_2 \otimes I_3) = H(A_{23}, A_2 \otimes I_3) = -S(A_{23}) + S(A_2)$ where $S(A) = -\operatorname{Tr} A \log A$. Thus, (34) becomes $-S(A_{23}) + S(A_2) \le -S(A_{123}) + S(A_{12})$ or, equivalently,

$$S(A_2) + S(A_{123}) \le S(A_{12}) + S(A_{23}) \tag{35}$$

which is the standard form of SSA.
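As a sanity check on (35), the inequality can be evaluated in the commuting case, where $A_{123}$ is diagonal and $S$ reduces to the Shannon entropy of a joint probability distribution. The sketch below is restricted to that classical special case by assumption; the function names and the two example distributions are hypothetical choices, not taken from the paper. The second example is a classical Markov chain, for which the conditional mutual information vanishes and (35) holds with equality.

```python
import math

def entropy(probs):
    """Shannon entropy -sum q log q, the diagonal case of S(A) = -Tr A log A."""
    return -sum(q * math.log(q) for q in probs if q > 0)

def marginal(joint, keep):
    """Sum a joint distribution {(x1, x2, x3): prob} over the indices not in keep."""
    out = {}
    for idx, q in joint.items():
        key = tuple(idx[i] for i in keep)
        out[key] = out.get(key, 0.0) + q
    return out

def ssa_gap(joint):
    """S(A_12) + S(A_23) - S(A_2) - S(A_123); SSA (35) says this is >= 0."""
    def S(keep):
        return entropy(marginal(joint, keep).values())
    return S((0, 1)) + S((1, 2)) - S((1,)) - S((0, 1, 2))

def normalized(weights):
    total = sum(weights.values())
    return {idx: w / total for idx, w in weights.items()}

# A generic correlated distribution on three bits (the weights are arbitrary).
generic = normalized({(a, b, c): 1 + a + 2 * b * c + 3 * a * c
                      for a in (0, 1) for b in (0, 1) for c in (0, 1)})

# A Markov chain 1 - 2 - 3: systems 1 and 3 are independent given system 2.
p2 = [0.3, 0.7]
p1_given_2 = [[0.2, 0.8], [0.6, 0.4]]
p3_given_2 = [[0.5, 0.5], [0.1, 0.9]]
markov = {(a, b, c): p2[b] * p1_given_2[b][a] * p3_given_2[b][c]
          for a in (0, 1) for b in (0, 1) for c in (0, 1)}
```

Calling `ssa_gap(generic)` returns a strictly positive number, while `ssa_gap(markov)` vanishes up to rounding.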


3. Equality for Joint Convexity of $J_p(K, A, B)$

3.1. Origin of necessary and sufficient conditions

Looking back at the proof of Theorem 2, we see that for $p \in (0,2)$, equality holds in the joint convexity of $J_p(K, A, B)$ if and only if equality holds in the joint convexity for each term in the integrand in (17). It should be clear from the argument given in the Appendix that this requires $M_j = 0$ for all $j$ with $M_j$ given by (70). This is easily seen to be equivalent to

$$(L_{A_j} + tR_{B_j})^{-1}(X_j) = (L_A + tR_B)^{-1}(X) \tag{36}$$

for all $j$, with $A = \sum_j A_j$, $B = \sum_j B_j$, and $X = \sum_j X_j$ with $X_j = A_j K$ and/or $X_j = K B_j$. By writing $AK = L_A(K)$ in the former case and $KB = R_B(K)$ in the latter we obtain the conditions

$$(I + t\Delta_{A_j B_j}^{-1})^{-1}(K) = (I + t\Delta_{AB}^{-1})^{-1}(K) \quad \forall\, j \quad \forall\, t > 0, \tag{37a}$$

$$(\Delta_{A_j B_j} + tI)^{-1}(K) = (\Delta_{AB} + tI)^{-1}(K) \quad \forall\, j \quad \forall\, t > 0. \tag{37b}$$

From the integral representations (19), one might expect it to be necessary for either or both of (37a) and (37b) to hold depending on $p$. In fact, either will suffice because (37a) holds if and only if (37b) holds. Because $\Delta_{AB}$ is positive definite, by analytic continuation (37b) extends from $t > 0$ to the entire complex plane, except points $-t$ on the negative real axis for which $t \in \operatorname{spectrum}(\Delta_{AB})$. Therefore, by using the Cauchy integral formula, one finds that for any function $G$ analytic on $\mathbb{C} \setminus (-\infty, 0]$

$$G(\Delta_{A_j B_j})(K) = G(\Delta_{AB})(K).$$

Theorem 8. For fixed $K$, and $A = \sum_j A_j$, $B = \sum_j B_j$, the following are equivalent
(a) $J_p(K, A, B) = \sum_j J_p(K, A_j, B_j)$ for all $p \in (0,2)$.
(b) $J_p(K, A, B) = \sum_j J_p(K, A_j, B_j)$ for some $p \in (0,2)$.
(c) $(\Delta_{A_j B_j} + tI)^{-1}(K) = (\Delta_{AB} + tI)^{-1}(K)$ for all $j$ and for all $t > 0$.
(d) $A_j^{it} K B_j^{-it} = A^{it} K B^{-it}$ for all $j$ and for all $t > 0$.
(e) $(\log A - \log A_j) K = K (\log B - \log B_j)$ for all $j$.

Proof. Clearly (a) ⇒ (b). The implications (b) ⇒ (c) ⇒ (d), as well as (b) ⇒ (a), follow from the discussion above. Differentiation of (d) at $t = 0$ gives (d) ⇒ (e), and it is straightforward to verify that (e) ⇒ (b) with $p = 1$. Moreover, (d) implies $\sum_j \operatorname{Tr} K^* A_j^{it} K B_j^{1-it} = \operatorname{Tr} K^* A^{it} K B^{1-it}$ for all $t$, which implies (a) by analytic continuation.

3.2. Sufficient subalgebras

When $K = I$, we can obtain a more useful reformulation of the equality conditions by using results about sufficient subalgebras obtained in [14, 15, 33]. Since the definition and convexity properties of $J_p(I, A, B)$ extend by continuity to positive


semidefinite matrices, with $\ker B \subseteq \ker A$, we will formulate the conditions in this more general situation, using the conventions in Sec. 1.2.

Let $N \subseteq M_d$ be a subalgebra, then there is a trace preserving conditional expectation $E_N$ from $M_d$ onto $N$, such that $\operatorname{Tr} AX = \operatorname{Tr} E_N(A) X$ for all $X \in N$. In particular, if $N = M_{d_1} \otimes I \subseteq M_{d_1} \otimes M_{d_2}$, then we have $E_N(A_{12}) = \operatorname{Tr}_2 A \otimes \frac{1}{d_2} I$.

Let $Q_1, \ldots, Q_m \in M_d^+$ and assume that $\ker Q_m \subseteq \ker Q_j$ for all $j$. The subalgebra $N$ is said to be sufficient for $\{Q_1, \ldots, Q_m\}$ if there is a completely positive trace preserving map $T : N \to M_d$, such that $T(E_N(Q_j)) = Q_j$ for all $j = 1, \ldots, m$. This definition is due to Petz ([33, 32]) and it is a quantum generalization of the well known notion of sufficiency from classical statistics. In [33], it was shown that sufficient subalgebras can be characterized by the condition

$$H(Q_j, Q_m) = H(E_N(Q_j), E_N(Q_m)), \quad \text{for all } j.$$

We combine this with the results of the previous section to obtain other useful characterizations of sufficiency.

Theorem 9. Let $Q_1, \ldots, Q_m \in M_d^+$ be such that $\ker Q_m \subseteq \ker Q_j$ for all $j$. Let $N \subseteq M_d$ be a subalgebra. The following are equivalent.
(i) $N$ is sufficient for $\{Q_1, \ldots, Q_m\}$.
(ii) $E_N(Q_j)^{it} E_N(Q_m)^{-it} P_{(\ker Q_m)^\perp} = Q_j^{it} Q_m^{-it}$, for all $j$, $t \in \mathbb{R}$.
(iii) There exist $Q_{j,0} \in N^+$, and $D \in M_d^+$, such that $\ker D = \ker Q_m$, and $Q_j = Q_{j,0} D$ for $j = 1, \ldots, m$.
(iv) $J_p(I, Q_j, Q_m) = J_p(I, E_N(Q_j), E_N(Q_m))$ for all $j$ and some $p \in (0,1)$.

The proof of the conditions (i)–(iii) can be found in [14], see also [28]. The condition (iv) was proved in [15].

3.3. Equality conditions with $K = I$

Theorem 10. Let $A_1, \ldots, A_m$ and $B_1, \ldots, B_m$ be positive semi-definite matrices with $\ker B_j \subseteq \ker A_j$, and let $A = \sum_j A_j$, $B = \sum_j B_j$. Then the following are equivalent.
(a) $J_p(I, A, B) = \sum_j J_p(I, A_j, B_j)$ for all $p \in (0,2)$.
(b) $J_p(I, A, B) = \sum_j J_p(I, A_j, B_j)$ for some $p \in (0,2)$.
(c) $A_j^{it} B_j^{-it} = A^{it} B^{-it} P_{(\ker B_j)^\perp}$ for all $j$ and $t \in \mathbb{R}$.
(d) There are positive matrices $D_1, \ldots, D_m$, with $\ker D_j = \ker B_j$, such that $[A_j, D_j] = [B_j, D_j] = 0$, and with $D = \sum_j D_j$,

$$A_j = A D^{-1} D_j, \qquad B_j = B D^{-1} D_j. \tag{38}$$
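For $1 \times 1$ matrices condition (d) says that $A_j / B_j$ is the same ratio $A/B$ for every $j$, which makes Theorem 10 easy to probe numerically. The sketch below assumes the scalar form $J_p(1, a, b) = (a - a^p b^{1-p})/(p(1-p))$ for $p \ne 1$; definition (7) is not reproduced in this section, so this normalization is an assumption, inferred from its consistency with (46). The function names are illustrative only.

```python
def J(p, a, b):
    """Scalar J_p with K = 1 for p in (0, 2), p != 1 (assumed normalization)."""
    return (a - a**p * b**(1 - p)) / (p * (1 - p))

def convexity_gap(p, pairs):
    """sum_j J_p(a_j, b_j) - J_p(sum_j a_j, sum_j b_j); joint convexity gives >= 0."""
    a_total = sum(a for a, _ in pairs)
    b_total = sum(b for _, b in pairs)
    return sum(J(p, a, b) for a, b in pairs) - J(p, a_total, b_total)

# Condition (d) in the scalar case: a_j = a * d_j / d and b_j = b * d_j / d.
a, b = 2.0, 5.0
d_weights = [1.0, 2.0, 3.0]
d_total = sum(d_weights)
proportional = [(a * dj / d_total, b * dj / d_total) for dj in d_weights]

# Pairs whose ratios a_j / b_j differ, so condition (d) fails.
generic = [(1.0, 2.0), (3.0, 1.0)]
```

With these inputs `convexity_gap(p, proportional)` vanishes up to rounding for both `p = 0.5` and `p = 1.5`, while `convexity_gap(p, generic)` is strictly positive, in line with the equivalence of (a), (b) and (d).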

Proof. As in Sec. 3.1, (b) implies (36) on $(\ker B_j)^\perp$, with $X_j = B_j$, $X = B$. This gives

$$(\Delta_{A_j B_j} + tI)^{-1}(I) = (\Delta_{AB} + tI)^{-1}(I) \quad \text{on } (\ker B_j)^\perp. \tag{39}$$

Then (c) follows from the Cauchy integral formula as in Sec. 3.1.


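To show (c) implies (d), we will use Theorem 9. First let $N = I \otimes M_d \subseteq M_m \otimes M_d$ and let $\widetilde{A}_{12}, \widetilde{B}_{12}$ be the block-diagonal matrices in $M_m \otimes M_d$ defined by (32). Clearly, we have $\ker \widetilde{A}_{12} \supseteq \ker \widetilde{B}_{12} = \bigoplus_j |e_j\rangle\langle e_j| \otimes \ker B_j$ and $E_N(\widetilde{A}_{12}) = \frac{1}{m} I \otimes A$, $E_N(\widetilde{B}_{12}) = \frac{1}{m} I \otimes B$. Then (c) implies

$$E_N(\widetilde{A}_{12})^{it} E_N(\widetilde{B}_{12})^{-it} P_{(\ker \widetilde{B}_{12})^\perp} = \widetilde{A}_{12}^{it} \widetilde{B}_{12}^{-it}$$

for all $t$. Then by using Theorem 9 with $Q_1 = \widetilde{A}_{12}$, $Q_m = Q_2 = \widetilde{B}_{12}$, we can conclude that there are positive matrices $A_0, B_0 \in M_d$ and $D_{12} \in (M_m \otimes M_d)^+$, such that $\ker D_{12} = \ker \widetilde{B}_{12}$, $[I \otimes A_0, D_{12}] = [I \otimes B_0, D_{12}] = 0$ and

$$\widetilde{A}_{12} = (I \otimes A_0) D_{12}, \qquad \widetilde{B}_{12} = (I \otimes B_0) D_{12}. \tag{40}$$

Since $\widetilde{A}_{12}, \widetilde{B}_{12}$ are block diagonal, $D_{12} = \sum_j |e_j\rangle\langle e_j| \otimes D_j$ must also be block diagonal with $D_j \in M_d^+$, $\ker D_j = \ker B_j$, $[A_0, D_j] = [B_0, D_j] = 0$ for all $j$ and

$$A_j = A_0 D_j, \qquad B_j = B_0 D_j. \tag{41}$$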

Taking $\operatorname{Tr}_1$ in (40) gives $A = A_0 D$ and $B = B_0 D$. Using this in (41) gives (38) which proves (d). The implications (d) ⇒ (a) ⇒ (b) are straightforward.

We return briefly to the case of arbitrary $K$. Note that if the condition (d) holds and $[D_j, K] = 0$ for all $j$, then $J_p(K, A, B) = \sum_j J_p(K, A_j, B_j)$ for all $p \in (0,2)$; this gives a sufficient, but not necessary, condition for equality if $K \ne I$. The next result reduces the case of $K$ unitary to $K = I$. Then, we can apply the conditions of Theorem 10 to $A_j$ and $K B_j K^*$.

Theorem 11. If $K$ is unitary, then $J_p(K, A, B) = \sum_j J_p(K, A_j, B_j)$ if and only if $J_p(I, A, KBK^*) = \sum_j J_p(I, A_j, K B_j K^*)$.

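Proof. When $K$ is unitary, then $K B^p K^* = (KBK^*)^p$ which implies $J_p(K, A, B) = J_p(I, A, KBK^*)$.

One can try to extend the results of this section to the case $\|K\| \le 1$, and hence to all $K$, by using the unitary dilation

$$\mathbf{U} = \begin{pmatrix} K & L \\ -L & K \end{pmatrix}$$

where $L = U(1 - |K|^2)^{1/2}$ and $K = U|K|$ is the polar decomposition. Then, with

$$\mathbf{A} = \begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix}, \qquad \mathbf{B} = \begin{pmatrix} B & 0 \\ 0 & 0 \end{pmatrix}$$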


we have $J_p(K, A, B) = J_p(\mathbf{U}, \mathbf{A}, \mathbf{B})$, where $\mathbf{U}$ is the unitary dilation of $K$ and $\mathbf{A}$, $\mathbf{B}$ are the corresponding block matrices, so that we may use Theorem 11 to get conditions for equality. But note that the conditions of Theorem 10 require that $\ker \mathbf{U}\mathbf{B}_j\mathbf{U}^* \subseteq \ker \mathbf{A}_j$ and it can be shown that this implies $P_{(\ker A_j)^\perp} K P_{(\ker B_j)^\perp} K^* = P_{(\ker A_j)^\perp}$, where $P_N$ denotes a projection onto the subscripted space. In particular, if all $A_j$ and $B_j$ are invertible, this restricts us to unitary $K$.

3.4. Equality in monotonicity under partial trace

It is easy to see that when $A_{12} = A_1 \otimes A_2$ and $B_{12} = B_1 \otimes B_2$, then $J_p(I, A_{12}, B_{12}) = J_p(I, A_2, B_2)$ if and only if $A_1 = B_1$ with $\operatorname{Tr} A_1 = 1$. However, it is not necessary that $A_{12} = A_1 \otimes A_2$. The equality conditions are given by the following theorem.

Theorem 12. Let $K_{12} = I_{12}$ and $A_{12}, B_{12} \in B(H_1 \otimes H_2)^+$, with $\ker B_{12} \subseteq \ker A_{12}$. Equality holds in (28) if and only if
(i) $H_2 = \bigoplus_n H_n^L \otimes H_n^R$,
(ii) $A_{12} = \bigoplus_n A_n^L \otimes A_n^R$ with $A_n^L \in B(H_1 \otimes H_n^L)^+$ and $A_n^R \in B(H_n^R)^+$,
(iii) $B_{12} = \bigoplus_n B_n^L \otimes B_n^R$ with $B_n^L \in B(H_1 \otimes H_n^L)^+$ and $B_n^R \in B(H_n^R)^+$,
(iv) $A_n^L = B_n^L$ for all $n$.

Proof. Let us denote $A_j = \frac{1}{d_1} W_j A_{12} W_j^*$, $B_j = \frac{1}{d_1} W_j B_{12} W_j^*$, with $W_j$ defined as in the proof of Theorem 7. Then we get that equality in (28) is equivalent to

$$J_p\Big(I_{12}, \sum_j A_j, \sum_j B_j\Big) = \sum_j J_p(I_{12}, A_j, B_j).$$

By Theorem 10, equality for some $p$ implies equality for all $p$, so that $J_p(I_{12}, A_{12}, B_{12}) = J_p(I_2, \operatorname{Tr}_1 A, \operatorname{Tr}_1 B) = J_p(I_{12}, E_N(A_{12}), E_N(B_{12}))$ for $p \in (0,1)$, where $N$ is the subalgebra $I_1 \otimes B(H_2) \subseteq B(H_1 \otimes H_2)$. Hence $N$ is sufficient for $\{A_{12}, B_{12}\}$ and, by Theorem 9, there are some $A_R, B_R \in B(H_2)^+$ and $D \in B(H_1 \otimes H_2)^+$, $\ker D = \ker B_{12}$, such that $[(I_1 \otimes A_R), D] = [(I_1 \otimes B_R), D] = 0$ and

$$A_{12} = D(I_1 \otimes A_R), \qquad B_{12} = D(I_1 \otimes B_R). \tag{42}$$

Now let $M_1$ be the subalgebra in $B(H_2)$ generated by $A_R, B_R$. Then $D \in (I_1 \otimes M_1)' = B(H_1) \otimes M_1'$ where $M'$ denotes the commutant of $M$. There is a decomposition $H_2 = \bigoplus_n H_n^L \otimes H_n^R$, such that

$$M_1 = \bigoplus_n 1_n^L \otimes B(H_n^R), \qquad M_1' = \bigoplus_n B(H_n^L) \otimes 1_n^R$$

and $D = \bigoplus_n D_n \otimes 1_n^R$, where $D_n \in B(H_1 \otimes H_n^L)$. Since $A_R, B_R \in M_1$, we get the result, with $A_n^L = B_n^L = D_n$. The converse can be verified directly.

Applying this result in the case A12 → A123 and B12 → A12 ⊗ I3 gives equality conditions in (34). Since these are independent of p, they are identical to the conditions, ﬁrst given in [13], for equality in SSA (35) which corresponds to p = 1.


Corollary 13. Equality holds in (34) if and only if
(i) $H_2 = \bigoplus_n H_n^L \otimes H_n^R$.
(ii) $A_{123} = \bigoplus_n A_n^L \otimes A_n^R$ with $A_n^L \in B(H_1 \otimes H_n^L)$ and $A_n^R \in B(H_n^R \otimes H_3)$.

Proof. It suffices to let $A_{12} \to A_{123}$ and $B_{12} \to A_{12} \otimes I_3$ in Theorem 12.

To apply these results in Sec. 4, it is useful to observe that condition (ii) in Corollary 13 above can be written as

$$A_{123} = (F_L \otimes I_3)(I_1 \otimes F_R) \tag{43}$$

with $F_L \in B(H_1 \otimes H_2)^+$, $F_R \in B(H_2 \otimes H_3)^+$, $[F_L \otimes I_3, I_1 \otimes F_R] = 0$. Combining this with part (d) of Theorem 10 gives the following useful result, which essentially allows us to bypass the need to apply Theorem 10 to $J_p(I, A_j, W_n A_j W_n)$.

Corollary 14. Let $A_j \in M_{d_1} \otimes M_{d_2}$, $A = \sum_j A_j$. Then

$$J_p(I_{12}, A, (\operatorname{Tr}_2 A) \otimes I_2) = \sum_j J_p(I_{12}, A_j, (\operatorname{Tr}_2 A_j) \otimes I_2) \tag{44}$$

if and only if there are $D_j \in M_{d_1}^+$, such that $\ker D_j = \ker \operatorname{Tr}_2 A_j$, $[A_j, D_j \otimes I] = 0$ and $A_j = A(D^{-1} D_j \otimes I)$ with $D = \sum_j D_j$.

Proof. Let $\widetilde{A}_{123} = \sum_j |e_j\rangle\langle e_j| \otimes A_j \in M_m \otimes M_{d_1} \otimes M_{d_2}$, then $A = \widetilde{A}_{23} \in M_{d_1} \otimes M_{d_2}$ and (44) can be written as

$$J_p(I_{23}, \widetilde{A}_{23}, \widetilde{A}_2 \otimes I_3) = J_p(I_{123}, \widetilde{A}_{123}, \widetilde{A}_{12} \otimes I_3).$$

By (43), this is equivalent to the existence of $F_L$ and $F_R$, $[(F_L \otimes I_3), (I_1 \otimes F_R)] = 0$, such that $\widetilde{A}_{123} = (F_L \otimes I_3)(I_1 \otimes F_R)$. Since $\widetilde{A}_{(1)(23)}$ is block-diagonal, $F_L$ must be of the form $F_L = \sum_j |e_j\rangle\langle e_j| \otimes D_j$, so that $A_j = F_R(D_j \otimes I)$. Then $\operatorname{Tr}_2 A_j = D_j \operatorname{Tr}_2 F_R$ which implies that $\ker D_j \subseteq \ker \operatorname{Tr}_2 A_j$. If we let $P_j = P_{(\ker \operatorname{Tr}_2 A_j)^\perp}$, then $P_j$ commutes with $D_j$ and $A_j = (P_j \otimes I) A_j = (P_j D_j \otimes I) F_R$, so that we can assume that $\ker D_j = \ker \operatorname{Tr}_2 A_j$, by taking $P_j D_j$ instead of $D_j$. Taking $\operatorname{Tr}_1$ of (43) gives $A = (D \otimes I_3) F_R = F_R (D \otimes I_3)$ so that $A_j = A(D^{-1} D_j \otimes I)$.

4. Equality in Joint Convexity of Carlen–Lieb

Carlen and Lieb [8] obtained several convexity inequalities from those of the map

$$\Upsilon_{p,q}(K, A) \equiv \operatorname{Tr} (K^* A^p K)^{q/p} \tag{45}$$


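using an identity which we write only for $q = 1$ and $p > 1$ in our notation as

$$\Upsilon_{p,1}(K, A) = (p-1) \inf\Big\{ J_p(K, A, X) + \frac{1}{p} \operatorname{Tr} X + \frac{1}{p(p-1)} \operatorname{Tr} K^* A K : X > 0 \Big\}. \tag{46}$$

We introduce the closely related quantity

$$\widehat{\Upsilon}_{p,1}(K, A) = \inf\Big\{ J_p(K, A, X) + \frac{1}{p} \operatorname{Tr} X : X > 0 \Big\} \tag{47}$$

$$= \frac{1}{p-1}\Big[ \Upsilon_{p,1}(K, A) - \frac{1}{p} \operatorname{Tr} K^* A K \Big] \tag{48}$$

which is well-defined for all $p \in (0,2)$ and allows us to continue to treat the cases $p < 1$ and $p > 1$ simultaneously, as well as include the special case $p = 1$ for which

$$\begin{aligned}
\widehat{\Upsilon}_{1,1}(K, A) &= -\operatorname{Tr} K^* A K \log(K^* A K) + \operatorname{Tr} K^* (A \log A) K + \operatorname{Tr} K^* A K \\
&= S(K^* A K) + \operatorname{Tr} K K^* A \log A + \operatorname{Tr} K^* A K.
\end{aligned} \tag{49}$$

Since we are dealing with finite dimensional spaces, the infimum in (46) has a minimizer which satisfies

$$X_{\min} = (K^* A^p K)^{1/p}. \tag{50}$$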
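The identity (46) and the minimizer (50) can be checked by hand for $1 \times 1$ matrices, where the bracket in (46) multiplied by $p - 1$ simplifies to $\frac{1}{p} y\, x^{1-p} + \frac{p-1}{p} x$ with $y = k^2 a^p$. The sketch below does this numerically. It assumes the scalar form $J_p(k, a, x) = (k^2 a - k^2 a^p x^{1-p})/(p(1-p))$; definition (7) lies outside this section, so that form is an assumption, and the function names and sample values are illustrative only.

```python
def J(p, k, a, x):
    """Scalar J_p(K, A, X) for 1x1 matrices, p in (0, 2), p != 1 (assumed form)."""
    return (k * k * a - k * k * a**p * x**(1 - p)) / (p * (1 - p))

def objective(p, k, a, x):
    """(p - 1) times the bracket in (46), before taking the infimum over x > 0."""
    return (p - 1) * (J(p, k, a, x) + x / p + k * k * a / (p * (p - 1)))

def upsilon(p, k, a):
    """Upsilon_{p,1}(K, A) = Tr (K^* A^p K)^{1/p} for scalars."""
    return (k * k * a**p) ** (1 / p)

def minimizer(p, k, a):
    """X_min from (50); for 1x1 matrices it has the same value as upsilon."""
    return (k * k * a**p) ** (1 / p)

# Sample values with p > 1, as required in (46).
p, k, a = 1.5, 0.7, 2.3
x_min = minimizer(p, k, a)
value_at_min = objective(p, k, a, x_min)
```

Here `value_at_min` agrees with `upsilon(p, k, a)` up to rounding, and moving `x` away from `x_min` in either direction increases the objective, as expected of a minimizer.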

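For fixed $K$, let $X_j$ denote the minimizer associated with $A_j$. Then

$$\begin{aligned}
\widehat{\Upsilon}_{p,1}(K, A_1) + \widehat{\Upsilon}_{p,1}(K, A_2) &= J_p(K, A_1, X_1) + \frac{1}{p} \operatorname{Tr} X_1 + J_p(K, A_2, X_2) + \frac{1}{p} \operatorname{Tr} X_2 \\
&\ge J_p(K, A_1 + A_2, X_1 + X_2) + \frac{1}{p} \operatorname{Tr}(X_1 + X_2) \qquad (51) \\
&\ge \inf\Big\{ J_p(K, A_1 + A_2, X) + \frac{1}{p} \operatorname{Tr} X : X > 0 \Big\} \\
&= \widehat{\Upsilon}_{p,1}(K, A_1 + A_2) \qquad (52)
\end{aligned}$$

which proves convexity of $\widehat{\Upsilon}_{p,1}$. Note that equality above requires both $X = \sum_j X_j$ and $J_p(K, A, X) = \sum_j J_p(K, A_j, X_j)$, where $X$ is the minimizer associated with $A$.

Now we introduce some notation following the strategy in the published version of [8]. Let $|\mathbb{1}\rangle$ denote the vector $(1, 1, \ldots, 1)$ with all components 1 and $|e_1\rangle$ the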


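vector $(1, 0, \ldots, 0)$. Define

$$\mathcal{K} = I \otimes |\mathbb{1}\rangle\langle e_1| = \begin{pmatrix} I & 0 & \cdots & 0 \\ I & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ I & 0 & \cdots & 0 \end{pmatrix} \tag{53}$$

and

$$\mathcal{A}_j = \sum_k A_{jk} \otimes |e_k\rangle\langle e_k| = \begin{pmatrix} A_{j1} & 0 & 0 & \cdots \\ 0 & A_{j2} & 0 & \cdots \\ 0 & 0 & A_{j3} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}, \tag{54}$$

and $\mathcal{A} = \sum_j \mathcal{A}_j = \sum_k A_k \otimes |e_k\rangle\langle e_k|$ with $A_k = \sum_j A_{jk}$. Then

$$\mathcal{K}^* \mathcal{A}^p \mathcal{K} = \sum_k A_k^p \otimes |e_1\rangle\langle e_1|.$$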

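With this notation, we make some definitions following Carlen and Lieb but modified to allow a unified treatment of $p \in (0,2)$.

$$\Phi_{(p,1)}(\mathcal{A}) = \Phi_{(p,1)}(A_1, A_2, A_3 \ldots) \equiv \Upsilon_{p,1}(\mathcal{K}, \mathcal{A}) = \operatorname{Tr} (A_1^p + A_2^p + A_3^p + \cdots)^{1/p}, \tag{55}$$

$$\widehat{\Phi}_{(p,1)}(\mathcal{A}) = \widehat{\Phi}_{(p,1)}(A_1, A_2, A_3 \ldots) \equiv \widehat{\Upsilon}_{p,1}(\mathcal{K}, \mathcal{A}) = \frac{1}{p-1}\Big[ \Phi_{(p,1)}(A_1, A_2, A_3, \ldots) - \frac{1}{p} \operatorname{Tr} \sum_k A_k \Big]. \tag{56}$$

The definitions of $\Phi$ and $\widehat{\Phi}$ apply only when $\mathcal{A}$ is a block diagonal matrix in $M_{d_1} \otimes M_{d_2}$. We now extend this to arbitrary matrices $A_{12} \in M_{d_1} \otimes M_{d_2}$.

$$\Psi_{(p,1)}(A_{12}) \equiv \operatorname{Tr}_1 (\operatorname{Tr}_2 A_{12}^p)^{1/p}, \tag{57}$$

$$\widehat{\Psi}_{(p,1)}(A_{12}) \equiv \frac{1}{p-1}\Big[ \Psi_{(p,1)}(A_{12}) - \frac{1}{p} \operatorname{Tr} A_{12} \Big]. \tag{58}$$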


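For $p = 1$, the formulas with hats are related to the conditional entropy, from which they differ by a constant

$$\begin{aligned}
\widehat{\Phi}_{(1,1)}(A_1, A_2, A_3 \ldots) - \operatorname{Tr} A_{12} &= -\operatorname{Tr} \Big(\sum_k A_k\Big) \log \Big(\sum_k A_k\Big) + \sum_k \operatorname{Tr} A_k \log A_k \\
&= S\Big(\sum_k A_k\Big) - S(A_{12}) \\
&= J_1(I, A_{12}, \operatorname{Tr}_2 A_{12} \otimes I_2),
\end{aligned} \tag{59}$$

$$\widehat{\Psi}_{(1,1)}(A_{12}) - \operatorname{Tr} A_{12} = S(A_1) - S(A_{12}) = H(A_{12}, A_1 \otimes I_2). \tag{60}$$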

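When $A_{12}$ is block diagonal, $\Psi_{(p,1)}(A_{12}) = \Phi_{(p,1)}(A_{12})$ with the understanding that $\operatorname{Tr}_2 A_{12} = \sum_k A_k$. Now let $W_n$ denote the generalized Pauli matrices as in Sec. 2.3, $\mathcal{W}_n = I_1 \otimes W_n$ and define

$$\mathcal{A}_{123} = \sum_n \mathcal{W}_n A_{12} \mathcal{W}_n^* \otimes |e_n\rangle\langle e_n| = \bigoplus_n \mathcal{W}_n A_{12} \mathcal{W}_n^* \tag{61}$$

so that $\mathcal{A}_{123}$ is block diagonal with blocks $\mathcal{W}_n A_{12} \mathcal{W}_n^*$. Then

$$d_2^{\frac{1+p}{p}}\, \Psi_{(p,1)}(A_{12}) = \Phi(\mathcal{A}_{(12)(3)}) = \Phi(\mathcal{W}_1 A_{12} \mathcal{W}_1^*, \mathcal{W}_2 A_{12} \mathcal{W}_2^*, \ldots). \tag{62}$$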

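It is straightforward to show that for $p \in (0,2)$ the functions $\widehat{\Phi}_{(p,1)}(\mathcal{A})$ and $\widehat{\Psi}_{(p,1)}(A)$ are all convex in $A$, inheriting this property from the quantities from which they are defined. In view of (59) and (60), the conditions for equality in the next two theorems are not surprising.

Theorem 15. The function $\widehat{\Phi}_{(p,1)}(\mathcal{A})$ is convex in $\mathcal{A}$ for $p \in (0,2)$. Moreover, the following are equivalent:
(i) $J_p(I, \mathcal{A}, (\operatorname{Tr}_2 \mathcal{A}) \otimes I_2) = \sum_j J_p(I, \mathcal{A}_j, (\operatorname{Tr}_2 \mathcal{A}_j) \otimes I_2)$,
(ii) There are matrices $D_j > 0$, $D = \sum_j D_j$, such that $[A_{jk}, D_j] = 0$, $\ker D_j = \ker(\sum_k A_{jk})$ and $A_{jk} = A_k D^{-1} D_j$,
(iii) $\widehat{\Phi}_{(p,1)}(A_1, A_2, A_3 \ldots) = \sum_j \widehat{\Phi}_{(p,1)}(A_{j1}, A_{j2}, A_{j3} \ldots)$.

Proof. It follows from Corollary 14 and the fact that $\mathcal{A}_j$ are block-diagonal that (i) ⇔ (ii) and it is straightforward to verify that (ii) ⇒ (iii). Moreover, (iii) implies (i) for $p = 1$, by (59). To show that (iii) implies (ii) for $p \ne 1$, observe that (iii) implies $\widehat{\Upsilon}_{p,1}(\mathcal{K}, \mathcal{A}) = \sum_j \widehat{\Upsilon}_{p,1}(\mathcal{K}, \mathcal{A}_j)$, and this implies

$$J_p(\mathcal{K}, \mathcal{A}, \mathcal{X}) = \sum_j J_p(\mathcal{K}, \mathcal{A}_j, \mathcal{X}_j) \tag{63}$$

where $\mathcal{X}_j = (\mathcal{K}^* \mathcal{A}_j^p \mathcal{K})^{1/p} = X_j \otimes |e_1\rangle\langle e_1|$ and $\sum_j \mathcal{X}_j = \mathcal{X} = (\mathcal{K}^* \mathcal{A}^p \mathcal{K})^{1/p} = X \otimes |e_1\rangle\langle e_1|$, with $X_j = (\sum_k A_{jk}^p)^{1/p}$ and $X = (\sum_k A_k^p)^{1/p}$. Since

$$\mathcal{K}^* \mathcal{A}_j^p \mathcal{K} \mathcal{X}_j^{1-p} = \sum_k A_{jk}^p X_j^{1-p} \otimes |e_1\rangle\langle e_1|,$$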


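with a similar expression for $\mathcal{K}^* \mathcal{A}^p \mathcal{K} \mathcal{X}^{1-p}$, we find

$$\sum_k J_p(I, A_k, X) = J_p(\mathcal{K}, \mathcal{A}, \mathcal{X}) = \sum_j J_p(\mathcal{K}, \mathcal{A}_j, \mathcal{X}_j) = \sum_{k,j} J_p(I, A_{jk}, X_j).$$

Convexity then implies that we must have

$$J_p(I, A_k, X) = \sum_j J_p(I, A_{jk}, X_j) \quad \forall\, k. \tag{64}$$

Since $\ker X_j \subseteq \ker A_{jk}$, Theorem 10 implies that $A_k^{it} X^{-it} P_{(\ker X_j)^\perp} = A_{jk}^{it} X_j^{-it}$ for all $k, j, t$. After writing $\widetilde{A}_k = \sum_j |e_j\rangle\langle e_j| \otimes A_{jk}$, $\widetilde{X} = \sum_j |e_j\rangle\langle e_j| \otimes X_j$, this reads

$$\Big(I \otimes \frac{1}{m} \operatorname{Tr}_1 \widetilde{A}_k\Big)^{it} \Big(I \otimes \frac{1}{m} \operatorname{Tr}_1 \widetilde{X}\Big)^{-it} P_{(\ker \widetilde{X})^\perp} = \widetilde{A}_k^{it} \widetilde{X}^{-it}, \tag{65}$$

so that, by Theorem 9, there are elements $B_k \in M_d^+$ and $D \in (M_m \otimes M_d)^+$, such that $\ker D = \ker \widetilde{X}$, $[(I \otimes B_k), D] = 0$ and $\widetilde{A}_k = (I \otimes B_k) D$. As before, one finds $D = \sum_j |e_j\rangle\langle e_j| \otimes D_j$ for some $D_j \in M_d^+$ which implies (ii).

Theorem 16. The function $\widehat{\Psi}_{(p,1)}(A_{12})$ is convex in $A_{12}$ for $p \in (0,2)$. Moreover, if we let $\mathcal{A}_{123}$ denote the block diagonal matrix with blocks $\mathcal{W}_n A \mathcal{W}_n^*$, the following are equivalent:
(i) $J_p(I, \mathcal{A}_{123}, \mathcal{A}_1 \otimes I_{23}) = \sum_j J_p(I, (\mathcal{A}_{123})_j, (\mathcal{A}_1)_j \otimes I_{23})$ with $\mathcal{A}_{123}$ defined by (61),
(ii) There are matrices $D_j \in M_{d_1}^+$, $D = \sum_j D_j$, such that $\ker D_j = \ker (A_1)_j$, $[A_j, D_j \otimes I] = 0$ and $A_j = A(D^{-1} D_j \otimes I)$.
(iii) $\widehat{\Psi}_{(p,1)}(A) = \sum_j \widehat{\Psi}_{(p,1)}(A_j)$.

Proof. It follows from the definition of $\mathcal{A}_{123}$ that $d_2^{\frac{1+p}{p}} \Psi_{(p,1)}(A) = \Phi(\mathcal{A}_{123})$. The equivalence (i) ⇔ (iii) follows immediately from Theorem 15, and (i) ⇔ (ii) can be shown to follow from Corollary 14.

Theorem 17. The following monotonicity inequalities hold,

$$\widehat{\Psi}_{(p,1)}(A_{23}) \le \widehat{\Psi}_{(p,1)}(A_{123}), \quad p \in (0,2), \tag{66a}$$

$$\Psi_{(p,1)}(A_{23}) \ge \Psi_{(p,1)}(A_{123}), \quad p \in (0,1), \tag{66b}$$

$$\Psi_{(p,1)}(A_{23}) \le \Psi_{(p,1)}(A_{123}), \quad p \in [1,2). \tag{66c}$$

Moreover, equality holds if and only if the conditions of Corollary 13 are satisfied.

Proof. It suffices to give the proof for $\widehat{\Psi}$ since the other inequalities follow immediately. The argument is similar to that for Theorem 7. Let $W_n$ denote the generalized Pauli matrices of Sec. 2.3, but now let $\mathcal{W}_n = W_n \otimes I_{23}$. Then the convexity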


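of $\widehat{\Psi}_{(p,1)}(A_{23})$ implies

$$\begin{aligned}
\widehat{\Psi}_{(p,1)}(A_{23}) &= \frac{1}{d_1} \widehat{\Psi}_{(p,1)}(I_1 \otimes A_{23}) \\
&= \frac{1}{d_1} \widehat{\Psi}_{(p,1)}\Big( \frac{1}{d_1} \sum_n \mathcal{W}_n A_{123} \mathcal{W}_n^* \Big) \\
&\le \frac{1}{d_1^2} \sum_n \widehat{\Psi}_{(p,1)}(\mathcal{W}_n A_{123} \mathcal{W}_n^*) = \widehat{\Psi}_{(p,1)}(A_{123})
\end{aligned}$$

where we used the invariance of $\widehat{\Psi}$ under unitaries of the form $U_1 \otimes I_{23}$. In the case $p = 1$, it follows from (60) that $\widehat{\Psi}_{(1,1)}(A_{23}) \le \widehat{\Psi}_{(1,1)}(A_{123})$ becomes

$$S(A_2) - S(A_{23}) \le S(A_{12}) - S(A_{123}) \tag{67}$$

which is SSA. Because the equality conditions in Theorem 16 are independent of $p$, they are identical to those for SSA, which are given in Corollary 13.

The Carlen–Lieb triple Minkowski inequality for the case $q = 1$ is an immediate corollary of Theorem 17. Observe that

$$\operatorname{Tr}_3 \operatorname{Tr}_1 (\operatorname{Tr}_2 A_{123}^p)^{1/p} = \Psi_{(p,1)}(A_{(13),(2)}), \tag{68a}$$

$$\operatorname{Tr}_3 [\operatorname{Tr}_2 (\operatorname{Tr}_1 A_{123})^p]^{1/p} = \Psi_{(p,1)}(A_{32}) \tag{68b}$$

so that it follows immediately from (66c) that

$$\operatorname{Tr}_3 [\operatorname{Tr}_2 (\operatorname{Tr}_1 A_{123})^p]^{1/p} = \Psi_{(p,1)}(A_{32}) \le \Psi_{(p,1)}(A_{132}) = \operatorname{Tr}_3 \operatorname{Tr}_1 (\operatorname{Tr}_2 A_{123}^p)^{1/p} \tag{69}$$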

for $1 < p \le 2$ and from (66b) that the inequality reverses for $0 < p < 1$. Moreover, the conditions for equality are again independent of $p$ and identical to those for equality in SSA, given in Corollary 13.

5. Final Remarks

It should be clear that the results in Sec. 2 are not restricted to $J_p(K, A, B)$. The function $g_p(x)$ given in (6) can be replaced by any operator convex function of the form $g(x) = x f(x)$ with $f$ operator monotone on $(0, \infty)$. Moreover, if the measure $\nu(t)$ in (17) is supported on $(0, \infty)$, then the conditions for equality are identical to those in Sec. 3. In particular, our results go through with $g_p$ replaced by $\widetilde{g}_p$ and $J_p(I, A, B)$ replaced by $\widetilde{J}_p(I, A, B)$, which is well-defined for $p \in [-1, 1)$ with $\widetilde{J}_0(I, A, B) = H(B, A)$. Thus our results can be extended to all $p \in (-1, 2)$.

The case $p = 2$ reduces to the convexity of $(A, X) \to \operatorname{Tr} X^* A^{-1} X$ with $A > 0$ proved in [24]. One can show that equality holds if and only if $X_j = A_j T$ $\forall\, j$ with $T = A^{-1} X$. We recently learned that Kiefer ([16]) proved the $p = 2$ convexity, by a different method, much earlier and also found these equality conditions.


There have been various attempts, e.g., the Rényi ([35]) and Tsallis ([40]) entropies, to generalize quantum entropy in a way that gives the usual von Neumann entropy at $p = 1$. In this paper we have considered two extensions of the conditional entropy involving an exponent $p \in (0,2)$, namely,

• $J_p(I, A_{12}, A_1)$ which gives $\operatorname{Tr} A_{123}^p A_{12}^{1-p} \le \operatorname{Tr} A_{23}^p A_2^{1-p}$ for $p \in (0,1)$, with the inequality reversed for $p \in (1,2)$, in agreement with (29), and can be thought of as a pseudo-metric; and
• $\widehat{\Psi}_{(p,1)}(A_{12})$ which gives $\operatorname{Tr}_2 (\operatorname{Tr}_3 A_{23}^p)^{1/p} \ge \operatorname{Tr}_{12} (\operatorname{Tr}_3 A_{123}^p)^{1/p}$ for $p \in (0,1)$, with the inequality reversed for $p \in (1,2)$, and can be thought of as a pseudo-norm.

These expressions are quite different for $p \ne 1$, but arise from quantities with the same convexity and monotonicity properties, as well as the same equality conditions which are independent of $p$. Moreover, both yield SSA at $p = 1$ and the equality conditions for $p \ne 1$ are identical to those for SSA. This independence of non-trivial equality conditions on the precise form of the function seems remarkable. If one uses $\widetilde{g}_p$ and $\widetilde{J}_p(I, A, B)$ from (10), then the inequalities above hold with $p \in (1,2)$ replaced by $p \in (-1, 0)$ and SSA corresponds to $p = 0$.

Acknowledgments

The first-named author was supported by the grants VEGA 2/0032/09, APVV-0071-06, Center of Excellence SAS - Quantum Technologies and ERDF OP R&D Project CE QUTE ITMS 26240120009. The second-named author was partially supported by National Science Foundation under Grant DMS-0604900.

Appendix. Proof of the Key Schwarz Inequality

For completeness, we include the proof of the joint convexity of $(A, B, X) \to \operatorname{Tr} X^* (L_A + tR_B)^{-1}(X)$ when $A, B > 0$ and $t > 0$. Since this function is homogeneous of degree one, it suffices to prove subadditivity. Now let

$$M_j = (L_{A_j} + tR_{B_j})^{-1/2}(X_j) - (L_{A_j} + tR_{B_j})^{1/2}(\Lambda). \tag{70}$$

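Then one can verify that

$$\begin{aligned}
0 \le \sum_j \operatorname{Tr} M_j^* M_j &= \sum_j \langle M_j, M_j \rangle \\
&= \sum_j \operatorname{Tr} X_j^* (L_{A_j} + tR_{B_j})^{-1}(X_j) - \operatorname{Tr} \Big(\sum_j X_j^*\Big) \Lambda \\
&\quad - \operatorname{Tr} \Lambda^* \Big(\sum_j X_j\Big) + \operatorname{Tr} \Lambda^* \sum_j (L_{A_j} + tR_{B_j}) \Lambda.
\end{aligned} \tag{71}$$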


Next, observe that for any matrix $W$,

$$\sum_j (L_{A_j} + tR_{B_j})(W) = \sum_j (A_j W + tW B_j) = L_{\sum_j A_j}(W) + tR_{\sum_j B_j}(W).$$

Therefore, inserting the choice $\Lambda = (L_{\sum_j A_j} + tR_{\sum_j B_j})^{-1}(\sum_j X_j)$ in (71) yields

$$\operatorname{Tr} \Big(\sum_j X_j\Big)^* \frac{1}{L_{\sum_j A_j} + tR_{\sum_j B_j}} \Big(\sum_j X_j\Big) \le \sum_j \operatorname{Tr} X_j^* \frac{1}{L_{A_j} + tR_{B_j}} (X_j) \tag{72}$$

for any $t \ge 0$.

References

[1] N. I. Akhiezer and I. M. Glazman, Theory of Operators in Hilbert Space, Vol. II (Frederick Ungar Publishing, NY, 1963).
[2] S. Amari and H. Nagaoka, Methods of Information Geometry, Translations of Mathematical Monographs, Vol. 191 (American Mathematical Society and Oxford University Press, 2000).
[3] T. Ando, Topics on Operator Inequalities, Lecture Notes (Hokkaido University, 1978).
[4] T. Ando, Concavity of certain maps on positive definite matrices and applications to Hadamard products, Lin. Alg. Appl. 26 (1979) 203–241.
[5] H. Araki, Relative entropy of states of von Neumann algebras, Publ RIMS Kyoto Univ. 9 (1976) 809–833.
[6] T. Bekjan, On joint convexity of trace functions, Lin. Alg. Appl. 390 (2004) 321–327.
[7] E. Carlen and E. Lieb, A Minkowski type trace inequality and strong subadditivity of quantum entropy, Amer. Math. Soc. Trans. 189(2) (1999) 59–62; Reprinted in [21].
[8] E. A. Carlen and E. H. Lieb, A Minkowski type trace inequality and strong subadditivity of quantum entropy II: Convexity and concavity, Lett. Math. Phys. 83 (2008) 107–126; arXiv:0710.4167.
[9] I. Devetak and J. Yard, Exact cost of redistributing multipartite quantum states, Phys. Rev. Lett. 100 (2008) 230501, 4 pp.
[10] W. F. Donoghue Jr., Monotone Matrix Functions and Analytic Continuation (Springer, 1974).
[11] E. G. Effros, A matrix convexity approach to some celebrated quantum inequalities, Proc. Natl. Acad. Sci. 106 (2009) 1006–1008; arXiv:0802.1234.
[12] H. Epstein, Remarks on two theorems of E. Lieb, Comm. Math. Phys. 31 (1973) 317–325.
[13] P. Hayden, R. Jozsa, D. Petz and A. Winter, Structure of states which satisfy strong subadditivity of quantum entropy with equality, Comm. Math. Phys. 246 (2004) 359–374; arXiv:quant-ph/0304007.
[14] A. Jenčová and D. Petz, Sufficiency in quantum statistical inference, Comm. Math. Phys. 263 (2006) 259–276; arXiv:math-ph/0412093.
[15] A. Jenčová and D. Petz, Sufficiency in quantum statistical inference. A survey with examples, J. Infin. Dimens. Anal. Quantum Prob. Relat. Top. 9 (2006) 331–352; arXiv:quant-ph/0604091.
[16] J. Kiefer, Optimum experimental designs, J. Roy. Statist. Soc. Ser. B 21 (1959) 272–310.
[17] O. Klein, Zur quantenmechanischen Begründung des zweiten Hauptsatzes der Wärmelehre, Z. Phys. 72 (1931) 767–775.


[18] A. Lesniewski and M. B. Ruskai, Relative entropy and monotone Riemannian metrics on non-commutative probability space, J. Math. Phys. 40 (1999) 5702–5724.
[19] S. Luo, N. Li and X. Cao, Relation between “no broadcasting” for noncommuting states and “no local broadcasting” for quantum correlations, Phys. Rev. A 79 (2009) 054305, 3 pp.
[20] E. H. Lieb, Convex trace functions and the Wigner–Yanase–Dyson conjecture, Adv. Math. 11 (1973) 267–288; Reprinted in [21].
[21] M. Loss and M. B. Ruskai (eds.), Inequalities: Selecta of E. Lieb (Springer, 2002).
[22] E. H. Lieb and M. B. Ruskai, A fundamental property of the quantum-mechanical entropy, Phys. Rev. Lett. 30 (1973) 434–436; Reprinted in [21].
[23] E. H. Lieb and M. B. Ruskai, Proof of the strong subadditivity of quantum mechanical entropy, J. Math. Phys. 14 (1973) 1938–1941; Reprinted in [21].
[24] E. H. Lieb and M. B. Ruskai, Some operator inequalities of the Schwarz type, Adv. Math. 12 (1974) 269–273; Reprinted in [21].
[25] E. H. Lieb and W. Thirring, Inequalities for the moments of the eigenvalues of the Schrödinger Hamiltonian and their relation to Sobolev inequalities, in Studies in Mathematical Physics, eds. E. Lieb, B. Simon and A. Wightman (Princeton University Press, 1976), pp. 269–303; Reprinted in [21].
[26] G. Lindblad, Expectations and entropy inequalities, Comm. Math. Phys. 39 (1974) 111–119.
[27] K. Löwner, Über monotone Matrix Funktionen, Math. Z. 38 (1934) 177–216.
[28] M. Mosonyi and D. Petz, Structure of sufficient quantum coarse-grainings, Lett. Math. Phys. 68 (2004) 19–30.
[29] H. Narnhofer and W. Thirring, From relative entropy to entropy, Fizika 17 (1985) 257–265.
[30] M. Ohya and D. Petz, Quantum Entropy and Its Use, 2nd edn. (Springer-Verlag, 2004).
[31] D. Petz, Quasi-entropies for finite quantum systems, Rep. Math. Phys. 23 (1986) 57–65.
[32] D. Petz, Sufficiency of channels over von Neumann algebras, Quart. J. Math. 39 (1988) 907–1008.
[33] D. Petz, Sufficient subalgebras and the relative entropy of states of a von Neumann algebra, Comm. Math. Phys. 105 (1986) 123–131.
[34] D. Petz, Monotone Metrics on Matrix Spaces, Lin. Alg. Appl. 244 (1996) 81–96.
[35] A. Rényi, On measures of entropy and information, in Proc. 4th Berkeley Sympos. Math. Statist. and Prob., Vol. I (Univ. California Press, Berkeley, 1961), pp. 547–561.
[36] M. B. Ruskai, Inequalities for quantum entropy: A review with conditions for equality, J. Math. Phys. 43 (2002) 4358–4375; Erratum ibid., 46 (2005) 019901; arXiv:quant-ph/0205064.
[37] M. B. Ruskai, Another short and elementary proof of strong subadditivity of quantum entropy, Rep. Math. Phys. 60 (2007) 1–12; arXiv:quant-ph/0604206.
[38] D. Ruelle, Statistical Mechanics (Benjamin, 1969).
[39] B. Simon, The Statistical Mechanics of Lattice Gases (Princeton Univ. Press, 1993).
[40] C. Tsallis, Possible generalization of Boltzmann–Gibbs statistics, J. Stat. Phys. 52 (1988) 479–487.
[41] A. Wehrl, General properties of entropy, Rev. Mod. Phys. 50 (1978) 221–260.
[42] E. P. Wigner and M. M. Yanase, Information content of distributions, Proc. Nat. Acad. Sci. 49 (1963) 910–918.
[43] E. P. Wigner and M. M. Yanase, On the positive semi-definite nature of certain matrix expressions, Canad. J. Math. 16 (1964) 397–406.

November 16, J070-S0129055X1000417X

2010 15:27 WSPC/S0129-055X

148-RMP

Reviews in Mathematical Physics Vol. 22, No. 10 (2010) 1123–1145 c World Scientiﬁc Publishing Company DOI: 10.1142/S0129055X1000417X

ON THE HERMAN–KLUK SEMICLASSICAL APPROXIMATION

DIDIER ROBERT
Département de Mathématiques, Laboratoire Jean Leray, CNRS-UMR 6629, Université de Nantes, 2 rue de la Houssinière, F-44322 Nantes Cedex 03, France
[email protected]

Received 19 November 2009

For a subquadratic symbol $H$ on $\mathbb{R}^d \times \mathbb{R}^d = T^*(\mathbb{R}^d)$, the quantum propagator of the time dependent Schrödinger equation $i\hbar \frac{\partial \psi}{\partial t} = \hat{H}\psi$ is a Semiclassical Fourier-Integral Operator when $\hat{H} = H(x, \hbar D_x)$ ($\hbar$-Weyl quantization of $H$). Its Schwartz kernel is described by a quadratic phase and an amplitude. At every time $t$, when $\hbar$ is small, it is “essentially supported” in a neighborhood of the graph of the classical flow generated by $H$, with a full uniform asymptotic expansion in $\hbar$ for the amplitude. In this paper, our goal is to revisit this well-known and fundamental result with emphasis on the flexibility for the choice of a quadratic complex phase function and on global $L^2$ estimates when $\hbar$ is small and time $t$ is large. One of the simplest choices of the phase is known in chemical physics as the Herman–Kluk formula. Moreover, we prove that the semiclassical expansion for the propagator is valid for $|t| \ll \frac{1}{4\delta} |\log \hbar|$ where $\delta > 0$ is a stability parameter for the classical system.

Keywords: Coherent states; time dependent Schrödinger equations; Semiclassical Fourier-Integral Operator; Ehrenfest time.

Mathematics Subject Classification 2010: 35Q41, 81Q05, 81S30, 35S30

1. Introduction and Results

Let us consider the time-dependent Schrödinger equation

\[ i\hbar\frac{\partial\psi(t)}{\partial t} = \hat{H}(t)\psi(t), \qquad \psi(t = t_0) = \psi_0, \tag{1.1} \]

where $\psi_0$ is an initial state, $\hat{H}(t)$ is a quantum Hamiltonian defined as a continuous family of self-adjoint operators in the Hilbert space $L^2(\mathbb{R}^d)$, depending on time $t$ and on the Planck constant $\hbar > 0$, which plays the role of a small parameter in the system of units considered in this paper. $\hat{H}(t)$ is supposed to be the $\hbar$-Weyl quantization of a classical smooth observable $H(t, X)$, $X = (x, \xi) \in \mathbb{R}^d \times \mathbb{R}^d$ (see [27] for more details concerning semiclassical Weyl quantization).


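Our main results concern subquadratic Hamiltonians $H$; this means here that $H(t, X)$ is continuous in $t \in \mathbb{R}$, $C^\infty$ smooth in $X \in \mathbb{R}^{2d}$ and satisfies, for every $\gamma \in \mathbb{N}^{2d}$, $|\gamma| \ge 2$,

\[ |\partial_X^\gamma H(t, X)| \le C_{T,\gamma}, \qquad \forall t,\ |t - t_0| \le T,\ \forall X \in \mathbb{R}^{2d}, \tag{1.2} \]

where $\partial_X = \frac{\partial}{\partial X}$ and $C_{T,\gamma} > 0$.

Let us introduce some classes of symbols ("classical observables") defined as follows. Let $m, n \in \mathbb{N}$.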

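Definition 1.1. We say that a symbol $s$ is in $O_m(n)$ if $s$ is a smooth function on the Euclidean space $\mathbb{R}^n$ such that for every $\gamma \in \mathbb{N}^n$, $|\gamma| \ge m$, we have

\[ |s|_{\infty,\gamma} := \sup_{X \in \mathbb{R}^n} |\partial_X^\gamma s(X)| < +\infty. \tag{1.3} \]

If $s(\varepsilon)$ depends on a parameter $\varepsilon \in P$ we say that $s(\varepsilon)$ is bounded in $O_m(n)$ if for every $\gamma$, we have $\sup_{\varepsilon \in P} |s(\varepsilon)|_{\infty,\gamma} < +\infty$.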
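As a quick illustration of Definition 1.1, the following worked example (added here as a sketch; the two symbols $s_1$, $s_2$ are chosen only for illustration) shows one symbol in $O_1(n)$ and one that belongs to $O_2(n)$ but not to $O_1(n)$.

```latex
% Illustration of Definition 1.1; s_1 and s_2 are example symbols, not notation from the paper.
% Notation: \langle X \rangle := (1 + |X|^2)^{1/2}.
\[
  s_1(X) = \langle X \rangle \in O_1(n),
  \qquad\text{since}\qquad
  \partial_{X_j} s_1(X) = \frac{X_j}{\langle X \rangle},
  \qquad |\partial_{X_j} s_1(X)| \le 1,
\]
% and each further derivative gains one more factor \langle X \rangle^{-1}, so it stays bounded.
\[
  s_2(X) = |X|^2 \in O_2(n) \setminus O_1(n),
  \qquad\text{since}\qquad
  \partial_{X_j} s_2(X) = 2X_j \ \text{is unbounded, while}\quad
  \partial_{X_j}\partial_{X_k} s_2(X) = 2\delta_{jk}.
\]
```

In particular a subquadratic Hamiltonian in the sense of (1.2) is one that lies in $O_2(2d)$, uniformly for $|t - t_0| \le T$.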

It is well known that the subquadratic assumption entails that Eq. (1.1) is solved by a unique quantum unitary propagator in $L^2(\mathbb{R}^d)$ such that $\psi_t = U(t, t_0)\psi_0$, $\forall t \in \mathbb{R}$. For the same reason, the classical dynamics is also well defined $\forall t \in \mathbb{R}$. $z_t = (q_t, p_t)$ is the classical path in the phase space $\mathbb{R}^{2d}$ such that $z_{t_0} = z$ and satisfying

\[ \dot{q}_t = \partial_p H(t, q_t, p_t), \qquad \dot{p}_t = -\partial_q H(t, q_t, p_t), \qquad q_{t_0} = q,\ p_{t_0} = p. \tag{1.4} \]

It defines a Hamiltonian flow: $\phi^t(z) = z_t$ ($\phi^{t_0}(z) = z$). Let us introduce the stability Jacobi matrix of this Hamiltonian flow: $F(t) = \partial_z\phi^t(z)$. $F(t)$ is a $2d \times 2d$ symplectic matrix with four $d \times d$ blocks, $F(t) = \begin{pmatrix} A_t & B_t \\ C_t & D_t \end{pmatrix}$, where

\[ A_t = \frac{\partial q_t}{\partial q}, \qquad B_t = \frac{\partial q_t}{\partial p}, \qquad C_t = \frac{\partial p_t}{\partial q}, \qquad D_t = \frac{\partial p_t}{\partial p}. \tag{1.5} \]

We also introduce the classical action

\[ S(t, z) = \int_{t_0}^{t} \big(p_s \cdot \dot{q}_s - H(s, z_s)\big)\,ds \tag{1.6} \]

where $u \cdot v$ denotes the usual scalar product for $u, v \in \mathbb{R}^d$, and the phase function

\[ \Phi(t, z; x, y) = S(t, z) + p_t \cdot (x - q_t) - p \cdot (y - q) + \frac{i}{2}\big(|x - q_t|^2 + |y - q|^2\big). \tag{1.7} \]
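The classical objects entering the phase (1.7) can be computed numerically. The following is a minimal sketch, not taken from the paper, for $d = 1$: it integrates (1.4), the variational equation $\dot{F} = J\,\partial^2_{X,X}H\,F$ for the blocks in (1.5), and the action (1.6) with a classical Runge–Kutta step. The Hamiltonian $H(q, p) = p^2/2 + \cos q$ (which has bounded second derivatives), the function names and the step count are all assumptions made for illustration.

```python
import math

def rhs(y):
    """Right-hand side for y = (q, p, A, B, C, D, S) with H(q, p) = p**2/2 + cos(q)."""
    q, p, A, B, C, D, S = y
    hqq = -math.cos(q)                # d^2H/dq^2; here d^2H/dp^2 = 1 and d^2H/dqdp = 0
    return (
        p,                            # dq/dt =  dH/dp
        math.sin(q),                  # dp/dt = -dH/dq
        C, D,                         # dA/dt, dB/dt: first row of J * Hess(H) * F
        -hqq * A, -hqq * B,           # dC/dt, dD/dt: second row of J * Hess(H) * F
        p * p - (0.5 * p * p + math.cos(q)),  # dS/dt = p * dq/dt - H
    )

def rk4_step(y, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(y)
    k2 = rhs(tuple(a + 0.5 * dt * b for a, b in zip(y, k1)))
    k3 = rhs(tuple(a + 0.5 * dt * b for a, b in zip(y, k2)))
    k4 = rhs(tuple(a + dt * b for a, b in zip(y, k3)))
    return tuple(
        a + dt / 6.0 * (b1 + 2.0 * b2 + 2.0 * b3 + b4)
        for a, b1, b2, b3, b4 in zip(y, k1, k2, k3, k4)
    )

def flow(q0, p0, t, steps=2000):
    """Return (q_t, p_t, A_t, B_t, C_t, D_t, S(t, z)) starting from z = (q0, p0), t0 = 0."""
    y = (q0, p0, 1.0, 0.0, 0.0, 1.0, 0.0)   # F(0) = identity, S(0, z) = 0
    dt = t / steps
    for _ in range(steps):
        y = rk4_step(y, dt)
    return y

q, p, A, B, C, D, S = flow(0.3, 0.5, 2.0)
det_F = A * D - B * C   # for d = 1, symplectic means det F(t) = 1
print(q, p, det_F, S)
```

For $d = 1$ the symplectic condition on $F(t)$ reduces to $\det F(t) = 1$, which gives a cheap sanity check on the integration; energy conservation along the path is another.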

For applications, it is useful to introduce semi-classical subquadratic symbols. These symbols have an asymptotic expansion in the semiclassical parameter $\hbar > 0$, $H_\hbar(t, X) \asymp \sum_{j \ge 0} \hbar^j H_j(t, X)$, such that the following conditions are satisfied.

\[ \forall j \ge 0,\ H_j(t, \bullet) \in O_{(2-j)_+}(2d) \text{ and are bounded in } O_{(2-j)_+}(2d) \text{ for } t \in \mathbb{R}, \tag{1.8} \]

\[ \forall N \ge 1,\ \hbar^{-N-1}\Big(H_\hbar(t, X) - \sum_{0 \le j \le N} \hbar^j H_j(t, X)\Big) \text{ is bounded in } O_0 \text{ for } t \in \mathbb{R} \text{ and } \hbar \in\, ]0, 1]. \tag{1.9} \]

Let us recall the definition of Weyl quantization. For any symbol $s$ in $O_m(2d)$, and for any $\psi \in \mathcal{S}(\mathbb{R}^d)$, we have

\[ \mathrm{Op}^w_\hbar[s]\psi(x) = (2\pi\hbar)^{-d}\int_{\mathbb{R}^{2d}} e^{\frac{i}{\hbar}(x-y)\cdot\xi}\, s\Big(\frac{x+y}{2}, \xi\Big)\psi(y)\,dy\,d\xi. \tag{1.10} \]

We shall also use the notation $\hat{s} = \mathrm{Op}^w_\hbar[s]$.

The Herman–Kluk formula is included in the following asymptotic result, which will be discussed in detail in this paper. This formula was discovered by several authors in the chemical-physics literature in the eighties. We refer to the introductions of [22, 29] for interesting historical expositions. It is rather surprising that until the recent paper [29] and the Ph.D. thesis [33] there was no explicit connection in the mathematical literature between the Herman–Kluk formula and Fourier-Integral Operators with complex phases.

Theorem 1.2. Let $H_\hbar(t)$ be a time dependent semiclassical subquadratic Hamiltonian and $K_\hbar(t; x, y)$ be the Schwartz kernel of its propagator $U_\hbar(t, t_0)$. Then there exists a semi-classical symbol of order 0, $a_\hbar(t; z) = \sum_{0 \le j < +\infty} a_j(t; z)\hbar^j$, where $a_j$ is continuous in $t$, such that

\[ K_\hbar(t; x, y) \asymp \int_{\mathbb{R}^{2d}} e^{\frac{i}{\hbar}\Phi(t,z;x,y)}\, a(\hbar; t; z)\,dz \tag{1.11} \]

in the $L^2$ uniform norm. More precisely, if we denote

\[ K^{(\hbar,N)}(t; x, y) = (2\pi\hbar)^{-3d/2}\int_{\mathbb{R}^{2d}} e^{\frac{i}{\hbar}\Phi(t,z;x,y)}\Big(\sum_{0 \le j \le N} a_j(t; z)\hbar^j\Big)dz \tag{1.12} \]

and $U^{(\hbar,N)}(t, t_0)$ the operator, in $L^2(\mathbb{R}^d)$, with the Schwartz kernel $K^{(\hbar,N)}(t; x, y)$, then, for every $T > 0$ and every $N \ge 1$, there exists $C(T, N) > 0$ such that for the $L^2$ operator norm we have

\[ \|U_\hbar(t, t_0) - U^{(\hbar,N)}(t, t_0)\| \le C(T, N)\hbar^{N+1}, \qquad \forall t,\ |t - t_0| \le T,\ \hbar \in\, ]0, 1]. \tag{1.13} \]

The leading term is

\[ a_0(t; z) = {\det}^{1/2}\big(A_t + D_t + i(B_t - C_t)\big)\exp\Big(-i\int_{t_0}^{t} H_1(z_s)\,ds\Big) \tag{1.14} \]


where the square root is defined by continuity starting from $t = t_0$ ($a_0(t_0; z) = 2^{d/2}$). Moreover, the amplitudes $a_j$ are smooth functions defined by transport equations (see the proof below) and, for every $T > 0$, they are bounded in $O_0$ for $|t| \le T$.

In [29], the authors give a rigorous proof of this result with an additional hypothesis: they assume that $H(x, \xi)$ is a polynomial in $\xi$. Here we consider more general subquadratic symbols. In particular our result applies to relativistic Hamiltonians like $\sqrt{1 + |\xi|^2} + V(x)$. Using a global diagonalization (see [28, Sec. 3]), the result can be extended to Dirac systems. Similar results are true with more general quadratic phases and for systems with diagonalizable leading symbols (see [4, 28]). Let us define the quadratic phase

\[ \Phi^{(\Theta_t,\Gamma)}(t, z; x, y) = S(t, z) + p_t \cdot (x - q_t) - p \cdot (y - q) + \frac{1}{2}\big(\Theta_t(x - q_t) \cdot (x - q_t) - \bar{\Gamma}(y - q) \cdot (y - q)\big) \tag{1.15} \]

where $\Gamma$, $\Theta_t$ are complex symmetric matrices with a definite-positive imaginary part, $\Theta_t$ is $C^1$ in $t$. $\Gamma$ is constant, $\Theta_t$ may depend smoothly on $t$ and $z$ such that the following conditions are satisfied:

\[ \exists c_T > 0, \quad \Im\Theta_t v \cdot v \ge \frac{1}{c_T}|v|^2, \qquad \forall t,\ |t| \le T,\ \forall z \in \mathbb{R}^{2d}, \tag{1.16} \]

\[ \forall \gamma,\ |\gamma| \ge 1,\ \exists C_{T,\gamma}, \quad \|\partial_z^\gamma\Theta_t\| \le C_{T,\gamma}, \qquad \forall z \in \mathbb{R}^{2d},\ \forall |t| \le T. \tag{1.17} \]

So we have

Theorem 1.3. Under the assumptions of Theorem 1.2 and (1.16), (1.17), we have

\[ K_\hbar(t; x, y) \asymp (2\pi\hbar)^{-3d/2}\int_{\mathbb{R}^{2d}} e^{\frac{i}{\hbar}\Phi^{(\Theta_t,\Gamma)}(t,z;x,y)} f(\hbar; t; z)\,dz \tag{1.18} \]

where $f(\hbar; t; z) = \sum_{0 \le j < +\infty} f_j(t; z)\hbar^j$ with the same meaning as in Theorem 1.2. In particular

\[ f_0(t, z) = 2^{d/2}\,{\det}^{1/2}[M(\Theta_t, \Gamma)] \tag{1.19} \]

where $M(\Theta_t, \Gamma) = i\big(C + D\bar{\Gamma} - \Theta_t(A + B\bar{\Gamma})\big)$.

There exist several methods to prove this theorem. In [29], the authors prove it as a consequence of a symbolic calculus for FIO with complex quadratic phases. In [5], the authors proved a weaker result for $\Gamma = iI$ and $\Theta_t = \Gamma_t$, where $\Gamma_t$ is determined by the propagation of Gaussian coherent states: $\Gamma_t = (C + D\Gamma)(A + B\Gamma)^{-1}$ (see Sec. 2 of this paper). Laptev–Sigal in [23] have also considered a similar formula for the propagator (see Sec. 5 of this paper) but assume that the initial data has a compact support in momenta. Kay ([22]) explains how to compute all the semiclassical corrections $a_j$ but did not give estimates on the error term, so the expansion is not rigorously established. Here we choose another approach, maybe more explicit and


simpler. We shall prove the general Theorem 1.3 as a consequence of the particular case of Theorem 1.2 by using a real deformation of the phase $\Phi^{(\Theta_t,\Gamma)}$ on the simpler one $\Phi^{(iI,iI)}$. Moreover, we give a direct proof of Theorem 1.2, proving the necessary properties for Fourier integrals with complex quadratic phases. This way we can easily get explicit estimates for the error terms for large times.

Let us assume that the conditions on $\hat{H}(t)$ are satisfied for $T = +\infty$. Moreover assume that there exists a positive real function $\mu(T) \ge 1$, $T > 0$, such that the classical flow $\phi^t$ satisfies, for every multiindex $\gamma$, $|\gamma| \ge 1$, and for some $C_\gamma > 0$,

\[ |\partial_z^\gamma\phi^{t,t'}(z)| \le C_\gamma\,\mu(T)^{|\gamma|}, \qquad \text{for } |t| + |t'| \le T,\ \forall z \in \mathbb{R}^{2d}. \tag{1.20} \]

We have discussed in [5] the condition (1.20). In particular this condition is fulfilled with $\mu(T) = e^{\delta T}$ for $\delta = \sup_{X \in \mathbb{R}^{2d},\, t \in \mathbb{R}} \|J\partial^2_{X,X}H(t, X)\|$.

Theorem 1.4. Choosing the phase as in Theorem 1.2, for $j \ge 0$ the amplitudes $a_j(t, z)$ satisfy the following estimates: for every multiindex $\gamma$ there exists a constant $C_{j,\gamma}$ such that

\[ |\partial_z^\gamma a_j(t, z)| \le C_{j,\gamma}\,|{\det}^{1/2}M_t|\,\mu(t)^{4j+|\gamma|}, \qquad \forall t \in \mathbb{R},\ \forall z \in \mathbb{R}^{2d}. \tag{1.21} \]

Hence we have the following Ehrenfest type estimate. For every $N \ge 1$ and every $\varepsilon > 0$ there exists $C_{N,\varepsilon}$ such that we have

\[ \|U_\hbar(t, t_0) - U_\hbar^{(N)}(t, t_0)\| \le C_{N,\varepsilon}\,\hbar^{\varepsilon(N+1)}, \qquad \forall t,\ |t| \le \frac{1-\varepsilon}{4\delta}|\log\hbar|,\ \forall\hbar \in\, ]0, 1]. \tag{1.22} \]
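The time scale in (1.22) can be made concrete with a few lines of arithmetic. The sketch below is only heuristic bookkeeping: it assumes $\mu(T) = e^{\delta T}$ and assumes that the order-$N$ error scales like $\mu(T)^{4(N+1)}\hbar^{N+1}$, which is the growth suggested by (1.21); the function names are hypothetical.

```python
import math

def ehrenfest_time(hbar, delta, eps):
    """T = (1 - eps) / (4 * delta) * |log hbar|, the time bound appearing in (1.22)."""
    return (1.0 - eps) / (4.0 * delta) * abs(math.log(hbar))

def error_scale(hbar, delta, eps, N):
    """Assumed scaling mu(T)**(4*(N+1)) * hbar**(N+1) with mu(T) = exp(delta * T)."""
    T = ehrenfest_time(hbar, delta, eps)
    return math.exp(4.0 * (N + 1) * delta * T) * hbar ** (N + 1)

# Example values (arbitrary): hbar = 1e-3, delta = 0.5, eps = 0.1, N = 2.
T = ehrenfest_time(1e-3, 0.5, 0.1)
scale = error_scale(1e-3, 0.5, 0.1, 2)
print(T, scale, (1e-3) ** (0.1 * 3))
```

Under these assumptions the exponential growth of the classical flow exactly consumes the factor $\hbar^{(1-\varepsilon)(N+1)}$, leaving $\hbar^{\varepsilon(N+1)}$, which is why the admissible time is logarithmic in $\hbar$.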

In previous works, an Ehrenfest time $T_E = c|\log\hbar|$, $c > 0$, was estimated for propagation of Gaussians in [9] and propagation of observables in [6]. For Gaussians we got $c = \frac{1}{6\delta}$, for observables $c = \frac{1}{2\delta}$. In [29], the authors gave an Ehrenfest time without explicit estimate on $c$.

2. Gaussians Coherent States and Quadratic Hamiltonians

The phase functions $\Phi^{(\Theta,\Gamma)}$ in (1.7) and (1.15) are closely related with Gaussian coherent states. This can be seen by proving a particular case of Theorem 1.2 for quadratic time-dependent Hamiltonians:

\[ H_t(q, p) = \frac{1}{2}\big(G_t q \cdot q + 2L_t q \cdot p + K_t p \cdot p\big) \]

where $q, p \in \mathbb{R}^d$, $K_t$, $L_t$, $G_t$ are real $d \times d$ matrices, continuous in time $t \in \mathbb{R}$, $G_t$, $K_t$ are symmetric. The classical motion in the phase space is given by the linear differential equation

\[ \begin{pmatrix} \dot{q} \\ \dot{p} \end{pmatrix} = J \cdot \begin{pmatrix} G_t & L_t^T \\ L_t & K_t \end{pmatrix}\begin{pmatrix} q \\ p \end{pmatrix}, \qquad J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}, \tag{2.1} \]


where $L^T$ is the transposed matrix of $L$, $J$ defines the symplectic form $\sigma(X, X') := JX \cdot X'$, $X = (x, \xi)$, $X' = (x', \xi')$. This equation defines a linear symplectic transformation, $F_t$, such that $F_0 = I$ (we take here $t_0 = 0$). It can be represented as a $2d \times 2d$ matrix which can be written as four $d \times d$ blocks:

\[ F_t = \begin{pmatrix} A_t & B_t \\ C_t & D_t \end{pmatrix}. \tag{2.2} \]

The quantum evolution for the Hamiltonian $\hat{H}(t)$ is denoted by $U(t)$ ($U(0) = I$). We can compute the matrix elements of $U(t)$ on the coherent states basis $\varphi_z$. This has been done in [24, p. 249 (6.36)] and [3, 12, 10]. We follow here the presentation given in [10].

Let us introduce some notations which will be used later. $g$ denotes the Gaussian function $g(x) = \pi^{-d/4}e^{-|x|^2/2}$ and $\Lambda_\hbar$ is the dilation operator $\Lambda_\hbar\psi(x) = \hbar^{-d/4}\psi(\hbar^{-1/2}x)$. So $\varphi_0 = \Lambda_\hbar g$, and the general Gaussian coherent states are defined as follows:

\[ \varphi_z^{(\Gamma)} = \hat{T}(z)\varphi^{(\Gamma)}, \tag{2.3} \]

where $\hat{T}(z)$ is the Weyl translation operator, $z = (q, p)$,

\[ \hat{T}(z) = \exp\Big(\frac{i}{\hbar}(p \cdot x - q \cdot \hbar D_x)\Big), \tag{2.4} \]

where $D_x = -i\frac{\partial}{\partial x}$ and $z = (q, p) \in \mathbb{R}^d \times \mathbb{R}^d$. $\varphi^{(\Gamma)}$ is the Gaussian state

\[ \varphi^{(\Gamma)}(x) = (\pi\hbar)^{-d/4}\,a_\Gamma\exp\Big(\frac{i}{2\hbar}\Gamma x \cdot x\Big), \tag{2.5} \]

where $\Gamma$ is a complex symmetric matrix such that $\Im\Gamma$ is definite-positive and $a_\Gamma$ is a normalization constant ($a_\Gamma = {\det}^{1/4}\Im\Gamma$). It is convenient to introduce here the Siegel space $\Sigma_+(d)$ of $d \times d$ complex matrices $\Gamma$ such that $\Im\Gamma$ is definite-positive. (See [13] for properties of $\Sigma_+(d)$.)

Let us define the Fourier–Bargmann transform $F_B^{(\Gamma)}$ as follows: for $\psi \in L^2(\mathbb{R}^d)$,

\[ F_B^{(\Gamma)}[\psi](z) = (2\pi\hbar)^{-d/2}\langle\psi, \varphi_z^{(\Gamma)}\rangle, \qquad z \in \mathbb{R}^{2d}. \tag{2.6} \]

$\varphi_z^{(\Gamma)}$ is the following coherent state living at $z$, $z = (q, p) \in \mathbb{R}^d \times \mathbb{R}^d$: for $x \in \mathbb{R}^d$,

\[ \varphi_z^{(\Gamma)}(x) = (\pi\hbar)^{-d/4}\,a_\Gamma\exp\Big(\frac{i}{\hbar}\Big(p \cdot x - \frac{p \cdot q}{2}\Big) + \frac{i\,\Gamma(x - q) \cdot (x - q)}{2\hbar}\Big). \tag{2.7} \]

$F_B^{(\Gamma)}$ is an isometry from $L^2(\mathbb{R}^d)$ into $L^2(\mathbb{R}^{2d})$ (with the Lebesgue measures). If $\Gamma = iI$ we denote $F_B = F_B^{iI}$; its range consists of $F \in L^2(\mathbb{R}^{2d})$ such that $\exp\big(\frac{p^2}{2\hbar} + i\frac{q \cdot p}{2\hbar}\big)F(q, p)$ is holomorphic in $\mathbb{C}^d$ in the variable $q - ip$. In other words,

\[ F_B\psi(z) = E_\psi(q - ip)\exp\Big(-\frac{p^2}{2\hbar} - i\frac{q \cdot p}{2\hbar}\Big) \tag{2.8} \]

where $E_\psi$ is entire in $\mathbb{C}^d$ (see [25]). Moreover we have the inversion formula

\[ \psi(x) = \int_{\mathbb{R}^{2d}} F_B^{(\Gamma)}[\psi](z)\,\varphi_z^{(\Gamma)}(x)\,dz, \qquad \text{in the } L^2\text{-sense}. \tag{2.9} \]

These properties are well known (see [25, 5]). Sometimes we shall use the shorter notation $\tilde{\psi}^\Gamma = F_B^{(\Gamma)}\psi$ and $\tilde{\psi}^\Gamma = \tilde{\psi}$.

Let us denote by $\hat{R}[F_t]$ the quantum propagator for the Hamiltonian $\hat{H}(t)$ (this is the metaplectic representation of $F_t$) and $K^{(F_t)}$ its Schwartz kernel. We know that $\Lambda_\hbar\hat{R}[F_t]g$ is the following Gaussian state [10, 13],

\[ \Lambda_\hbar\hat{R}[F_t]g(x) = (\pi\hbar)^{-d/4}\,a_\Gamma(t)\exp\Big(\frac{i}{2\hbar}\Gamma_t x \cdot x\Big) \tag{2.10} \]

where $a_\Gamma(t) = [\det(A_t + \Gamma B_t)]^{-1/2}a_\Gamma$, the complex square root is computed by continuityᵃ from $t = t_0 = 0$, and

\[ \Gamma_t = (C_t + \Gamma D_t)(A_t + \Gamma B_t)^{-1}, \qquad \Gamma_{t_0} = \Gamma. \]
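The normalization $a_\Gamma = {\det}^{1/4}\Im\Gamma$ in (2.5) and the propagation rule for $\Gamma_t$ can be checked numerically in dimension $d = 1$, where all matrices are scalars. The following sketch is illustrative only: the function names, the trapezoidal quadrature and the sample values of $\Gamma$ and $\hbar$ are assumptions, and the harmonic oscillator blocks $A = D = \cos t$, $B = \sin t$, $C = -\sin t$ are used as a test case.

```python
import cmath
import math

def gaussian_state(x, gamma, hbar):
    """phi^(Gamma)(x) of (2.5) for d = 1, with a_Gamma = (Im Gamma)**(1/4)."""
    a_gamma = gamma.imag ** 0.25
    return (math.pi * hbar) ** -0.25 * a_gamma * cmath.exp(1j * gamma * x * x / (2.0 * hbar))

def l2_norm_squared(gamma, hbar, half_width=12.0, n=20001):
    """Trapezoidal approximation of the integral of |phi|^2 over [-L, L]."""
    L = half_width * math.sqrt(hbar / gamma.imag)   # many standard deviations wide
    h = 2.0 * L / (n - 1)
    total = 0.0
    for k in range(n):
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * abs(gaussian_state(-L + k * h, gamma, hbar)) ** 2
    return total * h

def propagate_gamma(gamma, A, B, C, D):
    """Scalar (d = 1) version of Gamma_t = (C + Gamma D)(A + Gamma B)^(-1)."""
    return (C + gamma * D) / (A + gamma * B)

norm_sq = l2_norm_squared(complex(0.3, 2.0), 0.05)
t = 0.7
gamma_t = propagate_gamma(1j, math.cos(t), math.sin(t), -math.sin(t), math.cos(t))
print(norm_sq, gamma_t)
```

For the harmonic oscillator the choice $\Gamma = i$ is a fixed point of the map $\Gamma \mapsto \Gamma_t$, which reflects the fact that the standard Gaussian keeps its shape under this flow.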

Proposition 2.1. We have the following exact formula (Θ,Γ) M (Θt , Γ) (t,z;x,y) K (Ft ) (x, y) = 2d/2 (2π)−3d/2 det1/2 eΦ dz i 2d R

(2.11)

(2.12)

¯ − Θt (A + B Γ) ¯ and where Γ, Θt ∈ Σ+ (d), Θt is C 1 in t; M (Θt , Γ) = C + DΓ Φ(Θt ,Γ) (t, z; x, y) =

1 (qt · pt − q · p) + pt · (x − qt ) − p · (y − q) 2 1 ¯ − q) · (y − q)). + (Θt (x − qt ) · (x − qt ) − Γ(y 2

Let us remark that here the action is S(t, z) = 12 (qt · pt − q · p). First of all let us remark that the integral (2.12) is an oscillating integral and is deﬁned, as usual, by integrations by parts. We shall give two proofs of this formula. Proof I. We start with any Γ0 in the Siegel space Σ+ (d). Using the formula ψ(x) = (2π)−d

ψ, ϕΓz 0 ϕΓz 0 dz R2d

we get the formula K (Ft ) (x, y) = (2π)−d

(Γ

R2d

ϕz 0 (y)ϕz(Γt t ) (x)dz.

(2.13)

ᵃ This definition of $\det^{1/2}$ is different from the $\det^{1/2}$ function on $\Sigma_+(d)$; this is explained in [10] to compute the Maslov index.


So, we get K (Ft ) (x, y) = (2π)−3d/2 k0 (t)

i

(Γt ,Γ0 ) (t,z;x,y)

eΦ

dz,

(2.14)

R2d

where k0 (t) = 2d/2

det1/2 ( Γ0 ) det1/2 (A + BΓ0 )

.

Now we shall transform the phase Φ(Γt ,Γ0 ) into the phase Φ(Θ,Γ0 ) . Let us introduce Θ(s) = sΘ + (1 − s)Γt , 0 ≤ s ≤ 1. We have Θ(s) ∈ Σ+ (d). We want to ﬁnd k(t, s) such that k(t, 0) = k0 (t) and (Θt ,Γ0 ) i ∂ (t,z;x,y) eΦ dz = 0, ∀ s ∈ [0, 1]. (2.15) k(t, s) ∂s R2d We have (Θt ,Γ0 ) i ∂ i Φ(Θt ,Γ0 ) i e (Θt − Γt )(x − qt ) · (x − qt )e Φ = . ∂s 2

The main trick used here and later in this paper, and also in all the previous papers on this subject ([23, 22, 29]), is to integrate by parts to convert each factor (x − qt ) into , using the following equality ¯ p )ΦΘ,Γ = (C τ + ΓD ¯ τ − (Aτ + ΓB ¯ τ )Θ)(x − qt ) (∂q + Γ∂

(2.16)

where Aτ denotes the transposed matrix of A. Let us introduce the matrix ¯ − Θ(A + B Γ). ¯ M = M (Θ, Γ) = C + DΓ So we have i

(Θ,Γ)

M τ (x − qt )e Φ

=

¯ p e i ΦΘ,Γ . ∂q + Γ∂ i

(2.17)

Let us remark that $M$ is invertible. This is a consequence of the following lemma (see [11, 13] or [28, Appendix A], for proofs).

Lemma 2.2. For every linear symplectic map $F : T^*(\mathbb{R}^d) \to T^*(\mathbb{R}^d)$, $F = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$, and every $\Gamma \in \Sigma_+(d)$, the matrices $A + B\Gamma$ and $C + D\Gamma$ are invertible in $\mathbb{C}^d$ and $(C + D\Gamma)(A + B\Gamma)^{-1} \in \Sigma_+(d)$.
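In dimension $d = 1$ the statement of Lemma 2.2 can be checked by hand: the symplectic group is then the set of real $2 \times 2$ matrices with determinant 1, and a direct computation gives $\Im\big((C + D\Gamma)(A + B\Gamma)^{-1}\big) = \Im\Gamma / |A + B\Gamma|^2$. The sketch below tests this identity on random samples; it is an illustration under these $d = 1$ assumptions, with hypothetical function names, and not a proof of the lemma.

```python
import random

def moebius(gamma, A, B, C, D):
    """Scalar version of Gamma -> (C + D Gamma)(A + B Gamma)^(-1)."""
    return (C + D * gamma) / (A + B * gamma)

def random_symplectic(rng):
    """A real 2x2 matrix (A, B; C, D) with determinant 1, i.e. symplectic for d = 1."""
    A = rng.uniform(0.5, 2.0)
    B = rng.uniform(-2.0, 2.0)
    C = rng.uniform(-2.0, 2.0)
    D = (1.0 + B * C) / A          # enforces A*D - B*C = 1
    return A, B, C, D

rng = random.Random(0)             # fixed seed so the run is reproducible
checks = []
for _ in range(1000):
    A, B, C, D = random_symplectic(rng)
    gamma = complex(rng.uniform(-3.0, 3.0), rng.uniform(0.1, 3.0))   # Im Gamma > 0
    image = moebius(gamma, A, B, C, D)
    expected_imag = gamma.imag / abs(A + B * gamma) ** 2
    checks.append((
        image.imag > 0.0,
        abs(image.imag - expected_imag) <= 1e-9 * (1.0 + expected_imag),
    ))

all_in_siegel = all(positive for positive, _ in checks)
identity_holds = all(matches for _, matches in checks)
print(all_in_siegel, identity_holds)
```

The identity also shows why $A + B\Gamma$ cannot vanish for $d = 1$: its imaginary part is $B\,\Im\Gamma$, so it is zero only if $B = 0$, in which case $A \ne 0$ because the determinant is 1.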

So we have ¯ = C + DΓ − Θ(A ¯ + BΓ) = ((C + DΓ)(A + BΓ)−1 − Θ)(A ¯ M + BΓ)−1 . ¯ ∈ Σ+ (d) so is invertible. But (C + DΓ)(A + BΓ)−1 − Θ) Denote M (t, s) = M (Θs , Γt ). Let us recall the Liouville formula ∂s det(M (t, s)) = det(M (t, s)) Tr(∂s M (t, s)M (t, s)−1 ).

(2.18)


So, integrating by parts in (q, p) we get k(t, s) = k(t, 0)

det1/2 M (t, s)

(2.19)

det1/2 M (t, 0)

k(t,0) Now we have to compute det1/2 . A simple computation gives M (t, 0) = (D − M(t,0) ¯ Γt B)(Γ0 − Γ0 ). The proof of (2.12) follows from the formula

det(D − Γt B) = det(A + BΓ0 )−1 .

(2.20)

This equality follows from the symplecticity of $F$ ($D^\tau B = B^\tau D$). We have $B^\tau\Gamma_t B - D^\tau B = -(A + B\Gamma_0)^{-1}B$. So we get (2.20) if $\det B \ne 0$. The general case follows by a density argument. Let us remark that we can exchange the role of $\Theta$ and $\Gamma$ by considering the adjoint $U(t)^*$ of $U(t)$.

Proof II. We solve directly the Schrödinger equation

\[ \Big(i\hbar\frac{\partial}{\partial t} - \hat{H}(t)\Big)\psi(t, x) = 0 \tag{2.21} \]

for any initial data $\psi(x) := \psi(0, x)$, $\psi \in \mathcal{S}(\mathbb{R}^d)$, using the ansatz

\[ \psi(t, x) = (2\pi\hbar)^{-3d/2}\,k(t)\int_{\mathbb{R}^{2d} \times \mathbb{R}^d} e^{\frac{i}{\hbar}\Phi^{(\Theta,\Gamma)}(t,z;x,y)}\,\psi(y)\,dz\,dy. \tag{2.22} \]

We have to compute k(t) such that k(0) = 2d/2 . Let us remark that if we integrate ﬁrst in y then the integral (2.22) in z converges because the Fourier–Bargmann transform of ψ, FB ψ, is in the Schwartz space S(R2d ). For simplicity, we assume here that Θ = Γ = iI. The general case can be reached by the same method or by using the deformation argument of Proof I as we shall see later for more general Hamiltonians. ˆ Here the Hamiltonian H(t) is a quadratic form. So using dilations we can assume that = 1. A simple computation left to the reader, gives the following: Lemma 2.3. ˆ = Gx · x + i(L + Lτ )x · x − Kx · x + Tr(K − iL) (g −1 H(t)g)(x) where g(x) = e

|x|2 − 2

(2.23)

.

So we get ˆ (i∂t − H(t))ψ(t) = (2π)−3d/2

i

(Θ,Γ)

eΦ

(t,z;x,y)

b(t, x, z)ψ(y)dzdy

R2d ×Rd

(2.24) where b(t, z, z) = i∂t k(t) − k(t)(E(x − qt ) · (x − qt ) + Tr(K − iL)).


As in Proof I, we integrate by parts in the variable z ∈ R2d , using (∂q − i∂p )Φ = M τ (x − qt ) with M = C − B − i(A + D), which is invertible (see below Lemma 3.2). Using the Hamilton equation of motion we get M˙ = −E(A − iB) − i(K − iL)M. So, we ﬁnd the following diﬀerential equation for k(t), 1 k˙ = Tr( M M˙ k. 2 Using the Liouville formula, we get again (2.12) for this particular phase.

(2.25)

(2.26)

3. Proof of Theorems 1.2 and 1.4

As usual for this kind of problem there are two steps:

(1) Determine the amplitudes $a_j$ by solving transport differential equations by induction;
(2) Estimate the error between the approximated propagator and the exact one.

3.1. Transport equations

It is convenient to write

\[ e^{\frac{i}{\hbar}\Phi} = (\pi\hbar)^{d/2}\,\varphi_{z_t}(x)\,\bar{\varphi}_z(y)\,e^{\frac{i}{\hbar}(S(t,z) + (p \cdot q - p_t \cdot q_t)/2)}. \tag{3.1} \]

Then we have to compute $\hat{H}_\hbar(t)\varphi_{z_t}$. It is not difficult to add contributions of the lower order terms of the Hamiltonian, so we shall assume for simplicity that $H_\hbar(t) = H_0(t) := H(t)$.

Lemma 3.1. For every $N \ge 2$ we have

\[ \hat{H}(t)\varphi_{z_t}(x) = \sum_{|\gamma| \le N}\frac{\hbar^{|\gamma|/2}}{\gamma!}\,\partial_X^\gamma H(t, z_t)\,\Pi_\gamma\Big(\frac{x - q_t}{\sqrt{\hbar}}\Big)\varphi_{z_t}(x) + \hbar^{(N+1)/2}\,\hat{T}(z_t)\Lambda_\hbar\,\mathrm{Op}^w_1[R_N(t, z_t)]g(x) \tag{3.2} \]

where

\[ R_N(t, z_t, X) = \int_0^1\frac{(1 - s)^N}{N!}\sum_{|\gamma| = N+1}\partial_X^\gamma H(t, z_t + s\sqrt{\hbar}X)\,X^\gamma\,ds \tag{3.3} \]

and $\Pi_\gamma$ is a universal polynomial of degree $\le |\gamma|$ which is even or odd according to whether $|\gamma|$ is even or odd.

Proof. Let us recall that $\varphi_z = \hat{T}(z)\Lambda_\hbar g$. In this proof we put $z_t = z$. An easy property of Weyl quantization gives

\[ \Lambda_\hbar^{-1}\hat{T}(z)^{-1}\hat{H}(t)\hat{T}(z)\Lambda_\hbar = \mathrm{Op}^w_1[H(\sqrt{\hbar}\,\bullet + z)]. \tag{3.4} \]

So the lemma follows easily from the Taylor formula with integral remainder.


In this ﬁrst step, we do not take care of remainder estimates, this will be done in the next step. Let us denote I(a, Φ) the formal operator having the Schwartz kernel i e Φ(t,z;x,y) a(t, z)dz. (3.5) Ka (x, y) = (2π)−3d/2 R2d

From the Lemma 3.1, we can write ˆ H(t)I(a, Φ) ∼ I(b, Φ),

where b ∼

|γ|/2 γ

We have Πγ (x) =

γ!

γ ∂X H(t, zt )Πγ

x − qt √

hγ,β xβ .

a.

(3.6)

(3.7)

β≤γ

The quadratic part can be computed as for quadratic Hamiltonians and the linear part disappears with the classical motion. So we have b ∼ H(t, zt )a + (∂q H(t, zt ) + i∂p H(t, zt )) · (x − qt )a x − qt x − qt √ √ + E · + Tr(K − iL) a 2 H(t, X) the Hessian matrix of H(t). We have where we denote ∂X,X G L 2 ∂X,X H(t, zt ) = , E = G + 2iL − K, L K

(3.8)

(3.9)

2 2 2 H(t, zt ), L := ∂q,p H(t, zt ), K := ∂p,p H(t, zt ). with G := ∂q,q At Bt 2 Here the stability matrix Ft = Ct Dt satisﬁes F˙t = J∂X,X H(t, zt )Ft , Ft=0 = I. As in the quadratic case we want to transform the power of (x − qt ) into power of .

Lemma 3.2. Let us denote Mt = (Ct − Bt ) − i(At + Dt ). We have |det Mt | ≥ 2−d , (∂q − i∂p )e

i Φ

and

= iMtτ (x − qt )e

(3.10) i Φ

Proof. For simplicity, let us forget the lower index t. Let us consider the 2d × 2d matrix I + A − iC B + i(I − D) I + F + iJ(I − F ) = C − i(I − A) I + D + iB I + A − iC −i(D + iB) + i = . i(A − iC) I + D + iB

(3.11)

(3.12)


Using [13, Lemma 4, Appendix A], we get det(I + F + iJ(I − F )) = det((I + A − iC)(I + D + iB) − (A − iC − I)(D + iB − I)) = 2d det(A + D + i(B − C)).

(3.13)

Using that F is symplectic, we get (I + F + iJ(I − F ))∗ (I + F + iJ(I − F )) = (I + F τ )(I + F ) + (1 − F τ )(I − F ) ≥ I2d

(3.14)

hence (3.10) follows. Let us recall classical computations for the derivatives of the action ∂q S = (∂q qt )τ pt − p,

(3.15)

∂p S = (∂p qt )τ pt .

(3.16)

Then we can compute ∂q Φ, ∂p Φ and we get (3.11). Integrate by parts like in the quadratic case, we get ˆ Φ) I(f, Φ) (i∂t − H(t))I(a,

(3.17)

where

|γ|/2 γ x − qt 1 −1 ˙ √ ∂ H(t, zt )Πγ f ∼ i ∂t a − Tr(M M )a + a. 2 γ! X

(3.18)

|γ|≥3

Hence using the Liouville formula, we get the ﬁrst term a0 (t, z) = 2d/2 det1/2 (iM .

(3.19)
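To see what "defined by continuity" means for the square root in the leading amplitude, here is a small sketch for the one-dimensional harmonic oscillator $H = (q^2 + p^2)/2$, using the form (1.14) of Theorem 1.2 with $H_1 = 0$. In that case $A_t = D_t = \cos t$, $B_t = \sin t$, $C_t = -\sin t$, so $A_t + D_t + i(B_t - C_t) = 2e^{it}$. The function names and the grid of 400 time steps are assumptions made for illustration.

```python
import cmath
import math

def hk_determinant(t):
    """A_t + D_t + i(B_t - C_t) for the d = 1 harmonic oscillator (equals 2 * exp(i t))."""
    A, B, C, D = math.cos(t), math.sin(t), -math.sin(t), math.cos(t)
    return complex(A + D, B - C)

def continuous_sqrt(values):
    """Square roots chosen so that each one is the closer of the two roots to its predecessor."""
    roots = []
    previous = None
    for value in values:
        root = cmath.sqrt(value)              # principal branch
        if previous is not None and abs(root + previous) < abs(root - previous):
            root = -root                      # switch sheet to stay continuous
        roots.append(root)
        previous = root
    return roots

times = [2.0 * math.pi * k / 400 for k in range(401)]
a0 = continuous_sqrt([hk_determinant(t) for t in times])
print(a0[0], a0[200], a0[-1])
```

After one full period the determinant is back at its initial value 2, but the continuous square root has changed sign, $a_0 = -\sqrt{2}$. A principal-branch square root would miss this sign, which is the kind of phase information the continuity prescription keeps track of.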

We shall obtain the next terms aj by successive integrations by parts. This is solved more explicitly with the following lemma. Lemma 3.3. For any symbol b ∈ O0 (2d), and every multiindex α ∈ N2d we have i i (x − qt )α e Φ b(z)dz = |β| fα,β (t, z)e Φ ∂zβ b(z)dz (3.20) R2d

|α| 2 ≤|β|≤|α|

R2d

where fα,β (t, z) are symbols of order 0, uniformly bounded in O0 (2d) on bounded time intervals. They only depend on the classical flow φt (z) and its derivatives. More precisely, let us assume that there exists a positive function µ(T ) such that for every γ ∈ N2d we have sup |∂zγ φt (z)| ≤ Cγ µ(T )|γ| .

(3.21)

|∂z fα,β (z)| ≤ Cα,β; µ(T )|α|−|β|+| |.

(3.22)

|a|≤T

Then we have


Proof. The lemma is easily obtained by induction on |α| using Lemma 3.2. Now, to determine the transport equation, we solve inductively on j ≥ 0, the equation   ˆ  (i∂t − H(t))I k ak (t), Φ = O(j+2 ). (3.23) 0≤k≤j+1

Reasoning by induction on j ≥ 0, we get the transport equation for aj+1 (t) by cancellation of the coeﬃcient of j+1 in (3.23). ∂t aj+1 (t, z) =

1 ˙ −1 Tr M M aj+1 (t, z) + bj (t, z), 2

where

bj (t, z) =

aj+1 (0, z) = 0,

Fj,k,α (t, z)∂zα ak (t, z).

(3.24)

(3.25)

|α|+2k≤2(j+2)

Moreover, Fj,k,α (t, z) depends only on the classical ﬂow φt (z) and its derivatives and satisﬁes |∂zγ Fj,k,α (t, z)| ≤ Cj,k,α,γ µ(T )2(j−k+2)+|γ|−|α| where Cj,k,α,γ only depends on sup|t|≤T |H(t)|∞,γ , 2 ≤ |γ| ≤ j + 2. So we get, for every j ≥ 0, t det1/2 M (t, z)M (s, z)−1 bj (s, z)ds. aj+1 (t, z) =

(3.26)

(3.27)

0

Moreover, from (3.25) and (3.26), we get the following estimate, for every j ≥ 0, |t| ≤ T , z ∈ R2d , |∂zγ aj (t, z)| ≤ Cj,γ |det1/2 M (t, z)|µ(T )4j+|γ|

(3.28)

with the same remark as in (3.26) for the constant Cj,γ . 3.2. Error estimates Let us denote

where a(N ) (t) =

(N ) ˆ RN (t) = (i∂t − H(t))I(a (t), Φ)

k ak . Using the Duhamel formula, we have t U (t) − U N,(t) ≤ −1 R(s)ds

(3.29)

0≤k≤N

(3.30)

0

where t0 = 0, U (t) = U (t, 0), U N, (t) = I(a(N ) (t), Φ). So we have to estimate RN (t). Let us denote K (N ) (x, y) the Schwartz kernel ˜ (N ) (X, Y ) the Schwartz kernel of RN (t) in the Fourier–Bargmann of RN (t) and K


representation: ˜ (N ) (X, Y ) = K Rd ×Rd

K (N ) (x, y)ϕX (y)ϕY (x)dxdy.

(3.31)

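Let $\tilde{R}_N(t)$ be the operator with Schwartz kernel $\tilde{K}^{(N)}(X, Y)$. The following lemma is well known. Here we forget $N$ and $t$ for simplicity.

Lemma 3.4. We have the $L^2$ norm estimate

\[ \|R\|_{L^2(\mathbb{R}^d)} \le (2\pi\hbar)^{-d}\,\|\tilde{R}\|_{L^2(\mathbb{R}^{2d})}. \tag{3.32} \]

In particular, we have

\[ \|R\|_{L^2(\mathbb{R}^d)} \le (2\pi\hbar)^{-d}\max\Big\{\sup_Y\int_{\mathbb{R}^{2d}}|\tilde{K}(X, Y)|\,dX,\ \sup_X\int_{\mathbb{R}^{2d}}|\tilde{K}(X, Y)|\,dY\Big\}. \tag{3.33} \]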
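The second estimate of Lemma 3.4 is of Schur (or Carleman) type. A finite-dimensional analogue is easy to test: for a real matrix $K$, the $\ell^2$ operator norm is bounded by the larger of the maximal absolute row sum and the maximal absolute column sum. The sketch below checks this on a Gaussian-shaped kernel; the matrix size, the kernel and the function names are assumptions chosen for illustration, and the discrete setting is only an analogy for the integral kernels of the text.

```python
import math

def schur_bound(K):
    """Larger of the maximal absolute row sum and the maximal absolute column sum."""
    row = max(sum(abs(v) for v in r) for r in K)
    col = max(sum(abs(K[i][j]) for i in range(len(K))) for j in range(len(K[0])))
    return max(row, col)

def operator_norm_estimate(K, iterations=500):
    """Power iteration on K^T K; returns |K v| for a unit vector v, a lower estimate of the norm."""
    n, m = len(K), len(K[0])
    v = [1.0 / math.sqrt(m)] * m
    sigma = 0.0
    for _ in range(iterations):
        w = [sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]   # w = K v
        u = [sum(K[i][j] * w[i] for i in range(n)) for j in range(m)]   # u = K^T K v
        norm_u = math.sqrt(sum(x * x for x in u))
        if norm_u == 0.0:
            return 0.0
        sigma = math.sqrt(sum(x * x for x in w))                        # |K v| with |v| = 1
        v = [x / norm_u for x in u]
    return sigma

size = 30
kernel = [[math.exp(-0.5 * (i - j) ** 2) for j in range(size)] for i in range(size)]
norm_estimate = operator_norm_estimate(kernel)
bound = schur_bound(kernel)
print(norm_estimate, bound)
```

Kernels that decay quickly away from the diagonal, like the Gaussian factors appearing in the Fourier–Bargmann representation, have uniformly bounded row and column sums, which is what makes this kind of bound effective.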

Proof. For inequality (3.32) we use that the Fourier–Bargmann transform is an isometry. Inequality (3.33) is known as Carleman (or Schur) L2 estimate. Using Lemma 3.1, we get ˜ (N ) (X, Y ) = 2−3d/2 (π)−d K i (N ) ×

Tˆ (zt )Λ Opw (t, z)e δ(t,z) dz 1 [RN (t)]g, ϕY ϕX , ϕz a R2d

(3.34) t ·qt where δ(t, z) = S(t, z) + p·q−p . 2 Using Weyl commutation formula, we have

i |X − z|2

ϕX , ϕz = exp − + σ(X, z) , 4 2 w −zt .

Tˆ (zt )Λ Opw 1 [RN (t)]g, ϕY = Op1 [RN (t)]g, g Y√

(3.35) (3.36)

We know the Wigner function W0,Z of the pair (g, gZ ), Z ∈ R2d ([28]) 2 Z W0,Z (X) = 22d exp − X − − iσ(X, Z) . 2

(3.37)

By a well-known property of Weyl quantization ([13]), for any symbol s, we have −d

Opw 1 [s]g, gZ = (2π)

R2d

s(X)W0,Z (X)dX

(3.38)


We shall use the following lemma Lemma 3.5. Let be f ∈ O0 (2d). For every γ ∈ N2d and m > 0 there exists Cγ,m such that γ −|X−Z|2 −iJZ·X dX 2d X f (X)e R

≤ Cγ,m (1 + |Z|)−m

sup |α|≤m+|γ|; Y ∈R2d

|∂Yα f (Y )|.

(3.39)

Proof. It is enough to assume |Z| ≥ 1. We integrate m times by parts with the diﬀerential operator L=

2(X − Z) − iJZ · X ∂X 4|X − z|2 + |JZ|2

(3.40)

α , with |lm,α | ≤ Cm,α (|Z| + |X − Z|)−m , where using that (Lτ )m = |α|≤m lm,α ∂X θ(X) = −|X − Z|2 − iJZ · X. So using Lemma 3.5 we get the following estimate: for every N ; N there exists CN,N (depending only on semi-norms |H(t)|∞,γ , 2 ≤ |γ| ≤ N + N , such that for X, Y ∈ R2d and |t| ≤ T we have

N +1

˜ (N ) (X, Y )| ≤ CN,N (µ(T ))N +N 2 −d |K −N |X−z|2 |Y − zt | × e− 4 |a(N ) (t, z)|dz. 1+ √ 2d R

(3.41)

Let us denote φ∗t = φ0,t = (φt )−1 . We have the Lipchitz estimate, for |t| ≤ T , |φ∗,t Y − z| ≤ µ(T )|Y − zt |.

(3.42)

So we get −N −N t∗ 2 | Y − X| |Y − z |φ t − |X−z| 4 √ e dz ≤ CN 1 + 1+ √ R2d µ(T )

(3.43)

and ˜ (N ) (X, Y )| |K N +N

≤ CN,N (µ(T ))

N +1 2

−N ∗ |φt Y − X| √ sup |a(N ) (t, z)|. 1+ µ(T ) z∈R2d ,|t|≤T (3.44)

Then using Lemma 3.4 and choosing N > 2d, we get the following uniform L2 estimate for the remainder term, for |t| ≤ T , RN (t) ≤ CN (µ(T ))N +1 (N +1)/2

sup z∈R2d ,|t|≤T

|a(N ) (t, z)|.

(3.45)

If T is ﬁxed, pushing the expansion up to 2N instead of N we get easily Theorem 1.2 using the Duhamel formula.


Using global estimates on $a_j(t, z)$ obtained from the transport equation (3.28) and pushing the asymptotic expansion up to $2N$, we get the proof of Theorem 1.4 using again the Duhamel formula.

4. Varying Phase. Proof of Theorem 1.3

To avoid technicalities, we fix the time $t$. It would not be difficult to follow a time parameter $t$ if necessary for applications. So in this section, $\phi$ is a symplectic diffeomorphism in $\mathbb{R}^{2d}$, such that $\phi$, $\phi^{-1}$ are Lipschitz continuous and $\phi \in O_1(2d)$. We denote $z = (q, p) \in \mathbb{R}^{2d}$, $\phi(z) = (Q(z), P(z)) \in \mathbb{R}^d \times \mathbb{R}^d$ and $S$ an action for $\phi$, i.e. a primitive on $\mathbb{R}^{2d}$ of the closed 1-form $P\,dQ - p\,dq$. We consider the following phases

\[ \Phi^{(\phi,\Theta,\Gamma)}(z; x, y) = S(z) + P \cdot (x - Q) - p \cdot (y - q) + \frac{1}{2}\big(\Theta(x - Q) \cdot (x - Q) - \bar{\Gamma}(y - q) \cdot (y - q)\big). \tag{4.1} \]

This class of Fourier-Integral Operators with complex quadratic phase was already analyzed in [29]. We want to show here how to vary the choice of the matrices $\Theta$, $\Gamma$ for a given canonical transformation $\phi$ of $\mathbb{R}^{2d}$. As in Sec. 3, let us denote $I(a, \Phi)$ the operator with the Schwartz kernel

\[ K_a(x, y) = (2\pi\hbar)^{-3d/2}\int_{\mathbb{R}^{2d}} e^{\frac{i}{\hbar}\Phi^{(\phi,\Theta,\Gamma)}(z;x,y)}\,a(z)\,dz \tag{4.2} \]

where $a \in O_0(2d)$, $\Phi = \Phi^{(\phi,\Theta,\Gamma)}$. Using a Fourier–Bargmann transform and the following estimate: there exist $C > 0$, $c > 0$ such that for all $X \in \mathbb{R}^{2d}$, we have

\[ |\langle\varphi^\Gamma, \varphi_X\rangle| \le C\exp\Big(-\frac{c|X|^2}{\hbar}\Big), \tag{4.3} \]

we can estimate the Fourier–Bargmann transform $\tilde{K}_a(X, Y)$ of $K_a$ and prove that $I(a, \Phi)$ is bounded in $L^2(\mathbb{R}^d)$ (see Sec. 3, Lemma 3.5 and Sec. 5 below).

Our goal in this section is to prove the following result which gives Theorem 1.3 as a particular case.

Proposition 4.1. Let $\Theta$, $\Theta'$, $\Gamma$, $\Gamma'$ be four matrices in $\Sigma_+(d)$ and $a \in O_0(2d)$. $\Theta$, $\Theta'$ may be $z$ dependent such that

\[ \exists c > 0, \quad \Im\Theta^{(\prime)}v \cdot v \ge c|v|^2, \qquad \forall z \in \mathbb{R}^{2d}, \tag{4.4} \]

\[ \forall\gamma,\ |\gamma| \ge 1,\ \exists C_\gamma, \quad \|\partial_z^\gamma\Theta^{(\prime)}\| \le C_\gamma, \qquad \forall z \in \mathbb{R}^{2d}. \tag{4.5} \]

Then there exists a semi-classical symbol $a' \sim \sum_j \hbar^j a'_j$ of order 0 such that we have for the $L^2$ operator norm,

\[ I(a, \Phi^{(\phi,\Theta,\Gamma)}) = I(a', \Phi^{(\phi,\Theta',\Gamma')}) + O(\hbar^\infty). \tag{4.6} \]


Moreover we have for the principal symbol $a'_0$ the formula

\[ a'_0(z) = a_0(z)\,\frac{{\det}^{1/2}(M(1))}{{\det}^{1/2}(M(0))} \tag{4.7} \]

where $M(s) := C + D\bar{\Gamma} - ((1 - s)\Theta + s\Theta')(A + B\bar{\Gamma})$.

Proof. The method is rather simple and is an extension of what we have already done for quadratic Hamiltonians (Proof I), except that here we have to solve transport equations in the deformation parameter $s$ to get the lower order correction terms. Let us remark that this class of Fourier-Integral Operators is closed under adjointness:

\[ I(a, \Phi^{(\Theta,\Gamma)})^* = I(a^*, \Phi^*), \tag{4.8} \]

where $a^*(Z) = \bar{a}(\phi^{-1}Z)$, $Z = (Q, P)$, $Z = \phi(z)$ and

\[ \Phi^*(Z; x, y) = -S(\phi^{-1}Z) + p \cdot (x - q) - P \cdot (y - Q) + \frac{1}{2}\big(\Gamma(x - q) \cdot (x - q) - \bar{\Theta}(y - Q) \cdot (y - Q)\big). \tag{4.9} \]

So by transitivity we can assume that Γ = Γ . As in the quadratic Hamiltonian case, let us introduce, Θs = (1 − s)Θ + sΘ , Φ(s) = Φ(Θs ,Γ) , 0 ≤ s ≤ 1 and look for (s) a semiclassical symbol a(s) = j j aj such that (s) i ∂ e Φ (z;x,y) a(s) (z)dz = O(∞ ), ∀ s ∈ [0, 1]. (4.10) ∂s R2d However, we have ∂ (s) i Φ (z; x, y) = (Θ − Θ)(x − Q) · (x − Q) ∂s and we have to ﬁnd a C 1 family symbol a(s) , 0 ≤ s ≤ 1 such that i (s) (s) I ∂s a + (Θ − Θ)(x − Q) · (x − Q)a , Φ = O(∞ ).

(4.11)

(4.12)

The principal term a0 = a(1) is computed as in the quadratic case. Let us suppose for a moment that Θ, Θ are constant. Then as in the quadratic case we have ¯ p )Φ(s) = (C τ + ΓD ¯ τ − (Aτ + ΓB ¯ τ )Θs )(x − Q) (4.13) (∂q + Γ∂ A B where A = ∂q Q, B = ∂p Q, C = ∂q P , D = ∂p P and F = C D is a symplectic matrix. ¯ is invertible so we can integrate ¯ − Θs (A + B Γ) We know that M (s) := C + DΓ by parts as in Sec. 3. and as above we can achieve the proof of Proposition 4.1.


When Θ, Θ are z dependent, the integrations by part are more tricky. We have to use ¯ p )Φ(s) = M τ (s, z)(x − Q) + N (s, z)(x − Q, x − Q) (∂q + Γ∂

(4.14)

where N (s, z)(x, y) is a bilinear application in (x, y) ∈ Rd × Rd into d × d matrices, with coeﬃcients in O0 in z, C 1 in s. Hence we have (Θ,Γ) i ¯ p e i Φ(Θ,Γ) = (M τ )−1 (s, z) ∂q + Γ∂ (x − Q)e Φ i

(Θ,Γ)

− (M τ )− (s, z)N (s, z)(x − Q, x − Q)e i Φ

.

(4.15)

So we apply (4.15) and the following lemmas to proceed like in Sec. 3. Lemma 4.2. For any symbol b ∈ O0 (2d), for every multiindex α ∈ N2d and every N ≥ |α|/2 we have (s) i (x − Q)α e Φ b(z)dz 2d R (s) i = |β| fα,β (s, z)e Φ ∂ β b(z)dz |α| 2 ≤|β|≤N

+

R2d

|β|+|γ|=N +1,|β|≥1

|γ|

i

R2d

(s)

gα,β (s, z)(x − Q)β e Φ gβ,γ ∂ γ b(z)dz

(4.16)

where fα,β (s, z), gα,β (s, z) are symbols of order 0, uniformly bounded in O0 (2d) for s ∈ [0, 1]. Lemma 4.3. For every b ∈ O0 (2d) and β ∈ Nd we have the crude L2 estimate, uniform in s ∈ [0, 1], I((x − Q)β b, Φ(s) = O(|β|/2 ). Using these two lemmas we get the full semiclassical symbol a ∼ a0 (z) = a0

det1/2 (M (s)) det1/2 (M (0))

(4.17) j

j aj , where (4.18)

and for j ≥ 1, aj is computed by induction as solution for s = 1 of the diﬀerential equation ∂s aj (s) = Tr M˙ (s)M −1 (s) aj (s) + bj (s), aj (0) = aj . (4.19) where bj (s) depends on the ak (s), k ≤ j − 1. Remark 4.4. Considering the adjoint operator, it is possible to exchange the role of the matrices Θ and Γ. If the symbol a depends smoothly on some parameter λ, it is not diﬃcult to show that a also depends smoothly in λ.


Proof of Lemma 4.2. This is done by an induction on N such that α ≤ N.

Proof of Lemma 4.3. Let us begin by giving a simple proof of (4.3) when Θ is z dependent satisfying the assumptions (4.4) and (4.5) of Proposition 4.1. We shall prove the more general estimate: for every $\beta \in \mathbb{N}^d$ there exist $C > 0$, $c > 0$ such that

\[ |\langle x^{\alpha} g^{\Theta}, g_Y \rangle| \le C e^{-c|Y|^2}, \qquad \forall\, Y \in \mathbb{R}^{2d}. \tag{4.20} \]

Let us denote $Y = (y, \eta) \in \mathbb{R}^d \times \mathbb{R}^d$. By a direct estimate we easily get

\[ |\langle x^{\alpha} g^{\Theta}, g_Y \rangle| \le C e^{-2c|y|^2}, \qquad \forall\, (y,\eta) \in \mathbb{R}^{2d}. \tag{4.21} \]

Using the Fourier transform and the Plancherel formula, we exchange y and η and we get (4.20). Now we can follow the method of Sec. 3 to estimate the $L^2$ norm of operators using a Fourier–Bargmann transformation.

Let $\tilde K(X,Y)$ be the Fourier–Bargmann kernel of $I((x-Q)^{\beta} b, \Phi^{(s)})$. We have

\[ \tilde K(X,Y) = 2^{-3d/2} (\pi\hbar)^{-d}\, \hbar^{|\alpha|/2} \int_{\mathbb{R}^{2d}} \langle \hat T(Z)\Lambda_\hbar(x^{\beta} g^{\Theta}), \varphi_Y \rangle\, \langle \varphi_X, \varphi_z \rangle\, b(z)\, e^{\frac{i}{\hbar}\delta(t,z)}\,dz \tag{4.22} \]

where $Z = (Q,P) = \phi(z)$ and

\[ |\langle \hat T(Z)\Lambda_\hbar(x^{\beta} g^{\Theta}), \varphi_Y \rangle| = |\langle x^{\beta} g^{\Theta}, g_{(Y-Z)/\sqrt{\hbar}} \rangle|. \tag{4.23} \]

So we get

\[ |\tilde K(X,Y)| \le C \hbar^{|\alpha|/2} \int_{\mathbb{R}^{2d}} \exp\Big(-\frac{c}{\hbar}\big(|Y-\phi(z)|^2 + |X-z|^2\big)\Big)\,dz. \tag{4.24} \]

Using that φ is a Lipschitz canonical transformation, we have, for $C_0$ large enough and $c_0 > 0$ small enough,

\[ |\tilde K(X,Y)| \le C_0\, \hbar^{|\alpha|/2} \exp\Big(-\frac{c_0}{\hbar}|Y-\phi(X)|^2\Big). \tag{4.25} \]

Hence we get the proof of Lemma 4.3 using Lemma 3.4. We have proved Proposition 4.1 and Theorem 1.3.

5. Semiclassical Fourier Integral Operators

In [23, 8] and in the recent preprint [30], the authors have considered Fourier-Integral Operators defined by the following simpler phase

\[ \Psi^{(\phi,\Theta)}(p; x, y) = S(y,p) + P\cdot(x-Q) + \frac{1}{2}\,\Theta(x-Q)\cdot(x-Q) \tag{5.1} \]

where $(Q,P) = \phi(y,p)$, φ is a bilipschitz canonical transformation as above, and $\Theta \in \Sigma_+(d)$.

November 16, J070-S0129055X1000417X

1142

2010 15:27 WSPC/S0129-055X

148-RMP

D. Robert

In [23, 8] the authors have proved semiclassical expansions for the propagator of the Schrödinger equation for initial data with compact support. This result is extended in [30], for the Schrödinger Hamiltonian $-\hbar^2\Delta + V$, to general data in $L^2$ with uniform norm estimates. We shall give here some extensions of results of [30] using the same techniques as in Secs. 3 and 4, so we shall not repeat the details. Let us denote by $J(a, \Psi^{(\phi,\Theta)})$ the operator whose Schwartz kernel is

\[ K(x,y) = (2\pi\hbar)^{-d} \int_{\mathbb{R}^d} e^{\frac{i}{\hbar}\Psi^{(\phi,\Theta)}(p;x,y)}\, a(y,p)\,dp. \tag{5.2} \]

A natural question discussed in this section is to compare the Fourier-Integral Operators $I(a, \Phi^{(\phi,\Theta,\Gamma)})$ defined with 2d "frequency variables" and $J(a, \Psi^{(\phi,\Theta)})$ defined with d "frequency variables". A Fourier integral operator in $L^2(\mathbb{R}^d)$ is always a quantization of a canonical transformation φ in the cotangent space $T^*(\mathbb{R}^d)$. A nice way to make this relationship clear is to use a Fourier–Bargmann transform (see [7, 31]). This can be easily done in the same way for Semiclassical Fourier-Integral Operators, as we shall see now.

Definition 5.1. A family of operators, depending on a small parameter $\hbar \in\, ]0,1]$, $U_\hbar : S(\mathbb{R}^d) \to S'(\mathbb{R}^d)$, is a Semiclassical Fourier-Integral Operator of order $m \in \mathbb{R}$ associated to the canonical bilipschitz transformation $\phi : T^*(\mathbb{R}^d) \to T^*(\mathbb{R}^d)$ if for every N we have $U_\hbar = U_N + R_N$, where $U_N : S(\mathbb{R}^d) \to S'(\mathbb{R}^d)$ and $R_N = O(\hbar^N)$, and for every $N \ge 0$ there exists $C_N$ such that

\[ |\tilde K_N(X,Y)| \le C_N\, \hbar^{m-3d/2} \Big(1 + \frac{|Y-\phi(X)|}{\sqrt{\hbar}}\Big)^{-N}, \qquad \forall\, X, Y \in \mathbb{R}^{2d},\ \hbar \in\, ]0,1], \tag{5.3} \]

where $\tilde K_N(X,Y)$ is the Schwartz kernel of $F_B U_N F_B^*$.

Remark 5.2.
(1) In this definition, which coincides with a definition given in [31] for $\hbar = 1$, a Semiclassical Fourier-Integral Operator has, up to a negligible operator in ħ, a kernel living in a neighborhood of the graph of a canonical transformation φ. But this definition says nothing concerning the asymptotic expansion of $\tilde K(X,Y)$ in a neighborhood of the graph of φ when ħ is small. So this definition is certainly too permissive. But for fixed ħ it is suitable, as proven in [31].
(2) Using the Carleman–Schur estimate, a Semiclassical Fourier-Integral Operator of order 0 is uniformly bounded in $L^2(\mathbb{R}^d)$. This is a straightforward consequence of the definition. This class of Semiclassical Fourier-Integral Operators of order 0 is clearly closed under composition.
(3) In Definition 5.1, it is equivalent to use any Fourier–Bargmann transformation $F_B^{(\Gamma)}$, $\Gamma \in \Sigma_+(d)$.
(4) There are other definitions of Semiclassical Fourier-Integral Operators using Lagrangian analysis and real phase functions. For this point of view, see for example [1].


(5) Fourier-Integral Operators with complex phase were used to study the propagation of singularities of P.D.E. Many papers and books have been published on this subject; among them let us point out [2, 26, 32].

Now we shall see that the operators already considered in this paper are Semiclassical Fourier-Integral Operators.

Proposition 5.3. Let $a = a(x,z)$, $a \in O_0(3d)$, and $u = u(x,y,p)$, $u \in O_0(3d)$, be amplitudes and let $\Theta, \Gamma \in \Sigma_+(d)$, where Θ may depend on z or (y, p), be such that (1.16), (1.17) are satisfied. Then $I(a, \Phi^{(\phi,\Theta,\Gamma)})$ and $J(u, \Psi^{(\phi,\Theta)})$ are Semiclassical Fourier-Integral Operators of order 0.

Proof. Concerning $I(a, \Phi^{(\phi,\Theta,\Gamma)})$, we get the result following Sec. 3.2, estimate (3.44). The proof for $J(u, \Psi^{(\phi,\Theta)})$ is almost the same. For simplicity we assume Θ constant. For Θ depending on (y, p) we could proceed as in Sec. 4.

Let us denote $X = (\tilde x, \tilde\xi)$, $Y = (\tilde y, \tilde\eta)$. We want to estimate

\[ \tilde K(X,Y) = (2\pi\hbar)^{-d} \int_{\mathbb{R}^{3d}} e^{\frac{i}{\hbar}\tilde\Phi}\, u(x,y,p)\,dp\,dx\,dy \tag{5.4} \]

where

\[ \tilde\Phi = S(y,p) + P\cdot(x-Q) + \frac{1}{2}\Theta(x-Q)\cdot(x-Q) + \frac{i}{2}(\tilde x - y)\cdot(\tilde x - y) + \tilde\xi\cdot(\tilde x - y) + \frac{i}{2}(\tilde y - x)\cdot(\tilde y - x) + \tilde\eta\cdot(\tilde y - x). \tag{5.5} \]

Let us remark that we have

\[ F^{-1} = \begin{pmatrix} D^{\tau} & -B^{\tau} \\ -C^{\tau} & A^{\tau} \end{pmatrix} \quad\text{if}\quad F = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \text{ is symplectic.} \]

So, because F is symplectic, we know that $D^{\tau} - B^{\tau}\Theta$ is invertible. Hence we have

\[ \partial_y \tilde\Phi = (C^{\tau} - A^{\tau}\Theta)(x-Q) + (\tilde\xi - p) + i(\tilde x - y), \tag{5.6} \]

\[ \partial_p \tilde\Phi = (D^{\tau} - B^{\tau}\Theta)(x-Q), \tag{5.7} \]

\[ \partial_x \tilde\Phi = \Theta(x-Q) + (P - \tilde\eta) + i(\tilde y - x). \tag{5.8} \]

So we get the necessary estimates on $\tilde K$ by integrations by parts using

\[ \partial_y \tilde\Phi - (-A^{\tau}\Theta + C^{\tau})(D^{\tau} - B^{\tau}\Theta)^{-1}\partial_p \tilde\Phi = (\tilde\xi - p) + i(\tilde x - y), \tag{5.9} \]

\[ \partial_x \tilde\Phi - \Theta(D^{\tau} - B^{\tau}\Theta)^{-1}\partial_p \tilde\Phi = (P - \tilde\eta) + i(\tilde y - x). \tag{5.10} \]
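The block formula for the inverse of a symplectic matrix used in the proof above can be checked numerically. The following is a minimal sketch in pure Python for d = 2. The helper names (`matmul`, `block`, `symplectic_inverse`) and the sample matrices S, T are illustrative assumptions, not taken from the paper. The test matrix F is built as a product of two elementary symplectic shears, and Θ = iI is used as a sample element of Σ+(d).

```python
# Check F^{-1} = [[D^T, -B^T], [-C^T, A^T]] for a symplectic F = [[A, B], [C, D]],
# and that D^T - B^T Theta is invertible for Theta = i*I (illustrative choices).

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def neg(X):
    return [[-x for x in row] for row in X]

def block(A, B, C, D):
    """Assemble the 2d x 2d matrix [[A, B], [C, D]] from d x d blocks."""
    return [ra + rb for ra, rb in zip(A, B)] + [rc + rd for rc, rd in zip(C, D)]

def split(F):
    """Return the d x d blocks A, B, C, D of a 2d x 2d matrix."""
    d = len(F) // 2
    return ([r[:d] for r in F[:d]], [r[d:] for r in F[:d]],
            [r[:d] for r in F[d:]], [r[d:] for r in F[d:]])

def symplectic_inverse(F):
    A, B, C, D = split(F)
    return block(transpose(D), neg(transpose(B)), neg(transpose(C)), transpose(A))

def det2(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
S = [[1, 2], [2, 0]]          # symmetric, so [[I, S], [0, I]] is symplectic
T = [[0, 1], [1, 3]]          # symmetric, so [[I, 0], [T, I]] is symplectic
F = matmul(block(I2, S, Z2, I2), block(I2, Z2, T, I2))

identity4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
product = matmul(F, symplectic_inverse(F))

A, B, C, D = split(F)
Theta = [[1j, 0], [0, 1j]]    # sample complex symmetric matrix with Im(Theta) > 0
BtTheta = matmul(transpose(B), Theta)
Dt = transpose(D)
M = [[Dt[i][j] - BtTheta[i][j] for j in range(2)] for i in range(2)]
print(product == identity4, det2(M))
```

For this particular F the determinant of $D^{\tau} - B^{\tau}\Theta$ is nonzero, which is the invertibility that the integrations by parts in (5.9) and (5.10) rely on.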

The following result is a slight generalization of [23, 8, 30].

Theorem 5.4. Under the assumptions of Theorem 1.2 and (1.16), (1.17), we have

\[ K(t; x, y) \asymp (2\pi\hbar)^{-d} \int_{\mathbb{R}^d} e^{\frac{i}{\hbar}\psi^{(\phi^t,\Theta_t)}(t,y,p,x)}\, u(\hbar; t, y, p)\,dp \tag{5.11} \]

where $u(\hbar; t, y, p) = \sum_{0 \le j < +\infty} u_j(t; y, p)\,\hbar^j$ has the same meaning as in Theorem 1.2.


In particular

\[ u_0(t, y, p) = \det{}^{1/2}(D - \Theta B). \tag{5.12} \]

Sketch of Proof. This result can be proved following the same strategy as for proving Theorem 1.3. We first prove the theorem for some Θ (Θ = iI), following the proof of Theorem 1.2. Then we can get the theorem for any Θ by the variation argument as in the proof of Theorem 1.3. The $L^2$ estimate for the operator norm of Fourier-Integral Operators is used to control the remainder terms.

Remark 5.5. It is not difficult to adapt the proof of Theorem 1.4 concerning an Ehrenfest time estimate to the setting of Theorem 5.4.

References

[1] I. Alexandrova, Semi-classical wave front set and Fourier-Integral Operators, Canad. J. Math. 60 (2008) 241–263.
[2] V. M. Babich and V. S. Buldreyev, Asymptotic Methods in Short Waves Diffraction Problem (Moscow Nauka, 1972, in Russian); (Springer, 1991, English translation).
[3] V. Bargmann, On the Hilbert space of analytic functions and associated integral transform, Comm. Pure Appl. Math. 14 (1961) 187–214.
[4] J. M. Bily, Propagation d'états cohérents et applications, Ph.D. thesis, Université de Nantes (2001).
[5] J. M. Bily and D. Robert, The semi-classical Van–Vleck formula. Application to the Aharonov–Bohm effect, in Long Time Behaviour of Classical and Quantum Systems, Proceedings of the Bologna APTEX International Conference, Bologna, Italy, September 13–17, 1999 (World Scientific, 2001), pp. 89–106.
[6] A. Bouzouina and D. Robert, Uniform semiclassical estimates for the propagation of quantum observables, Duke Math. J. 111(2) (2002) 223–252.
[7] J. M. Bony, Evolution equations and microlocal analysis, in Hyperbolic Problems and Related Topics, Grad. Ser. Anal. (International Press, 2003), pp. 17–40.
[8] J. Butler, Global h Fourier integral operators with complex-valued phase functions, Bull. London Math. Soc. 34(4) (2002) 479–489.
[9] M. Combescure and D. Robert, Semiclassical spreading of quantum wave packets and applications near unstable fixed points of the classical flow, Asymptot. Anal. 14 (1997) 377–404.
[10] M. Combescure and D. Robert, Quadratic quantum Hamiltonians revisited, Cubo 8(1) (2006) 61–86.
[11] A. Córdoba and C. Fefferman, Wave packets and Fourier Integral Operators, Comm. Partial Differential Equations 3(11) (1978) 979–1005.
[12] B. Fedosov, Deformation Quantization and Index Theory, Mathematical Topics, Vol. 9 (Akademie Verlag, 1996).
[13] G. B. Folland, Harmonic Analysis in Phase Space, Annals of Mathematics Studies, Vol. 122 (Princeton University Press, Princeton, NJ, 1989).
[14] D. Fujiwara, A construction of the fundamental solution for the Schrödinger equation, J. Anal. Math. 35 (1979) 41–96.
[15] G. Hagedorn, Semiclassical quantum mechanics I: The limit for coherent states, Comm. Math. Phys. 71 (1980) 77–93.


[16] G. Hagedorn and A. Joye, Exponentially accurate semi-classical dynamics: Propagation, localization, Ehrenfest times, scattering and more general states, Ann. Henri Poincaré 1(5) (2000) 837–883.
[17] E. J. Heller, Time-dependent approach to semiclassical dynamics, J. Chem. Phys. 62(4) (1975) 1544–1555.
[18] E. J. Heller, Frozen Gaussians: A very simple semiclassical approximation, J. Chem. Phys. 75(6) (1981) 2923–2931.
[19] M. F. Herman and E. Kluk, A semiclassical justification for the use of non-spreading wavepackets in dynamics calculations, Chem. Phys. 91(1) (1984) 27–34.
[20] L. Hörmander, The Analysis of Linear Partial Differential Operators I (Springer-Verlag, 1983).
[21] K. Kay, Integral expressions for the semi-classical time-dependent propagator, J. Chem. Phys. 100(6) (1994) 4377–4392.
[22] K. Kay, The Herman–Kluk approximation: Derivation and semiclassical corrections, Chem. Phys. 322 (2006) 3–12.
[23] A. Laptev and I. M. Sigal, Global Fourier Integral Operators and semiclassical asymptotics, Rev. Math. Phys. 12(5) (2000) 749–766.
[24] R. Littlejohn, The semiclassical evolution of wave packets, Phys. Rep. 138(4–5) (1986) 193–291.
[25] A. Martinez, An Introduction to Semiclassical and Microlocal Analysis, Universitext (Springer-Verlag, 2002).
[26] J. Ralston, Gaussian beams and propagation of singularities, in Studies in Partial Differential Equations, MAA Stud. Math., Vol. 23 (Math. Assoc. America, 1982), pp. 246–248.
[27] D. Robert, Autour de l'approximation Semi-Classique, Progress in Mathematics, No. 68 (Birkhäuser, 1987).
[28] D. Robert, Propagation of coherent states in quantum mechanics and applications, in Partial Differential Equations and Applications, Sémin. Congr., Vol. 15 (Soc. Math. France, 2007), pp. 181–250.
[29] V. Rousse and T. Swart, A mathematical justification for the Herman–Kluk propagator, Comm. Math. Phys. 286 (2009) 725–750.
[30] V. Rousse, Semiclassical simple initial value representations, Université Paris 12 (2009), arXiv:0904.0387.
[31] D. Tataru, Phase space transforms and microlocal analysis, in Phase Space Analysis of Partial Differential Equations, Publ. Cent. Ri. Mat Ennio Giorgi, Vol. II (Scuola Norm. Sup. Pisa, 2004), pp. 505–524.
[32] J. Sjöstrand, Singularités analytiques microlocales, Astérisque 95 (1982) 1–166.
[33] T. Swart, Initial value representation, Ph.D. thesis, Freie Universität Berlin (2008).

November 16, J070-S0129055X10004168

2010 15:27 WSPC/S0129-055X

148-RMP

Reviews in Mathematical Physics
Vol. 22, No. 10 (2010) 1147–1179
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004168

ALMOST ADDITIVE THERMODYNAMIC FORMALISM: SOME RECENT DEVELOPMENTS

LUIS BARREIRA
Departamento de Matemática, Instituto Superior Técnico, 1049-001 Lisboa, Portugal
[email protected]

Received 9 November 2009
Revised 13 July 2010

This is a survey on recent developments concerning a thermodynamic formalism for almost additive sequences of functions. While the nonadditive thermodynamic formalism applies to much more general sequences, at the present stage of the theory there are no general results concerning, for example, a variational principle for the topological pressure or the existence of equilibrium or Gibbs measures (at least without further restrictive assumptions). On the other hand, in the case of almost additive sequences, it is possible to establish a variational principle and to discuss the existence and uniqueness of equilibrium and Gibbs measures, among several other results. After presenting in a self-contained manner the foundations of the theory, the survey includes the description of three applications of the almost additive thermodynamic formalism: a multifractal analysis of Lyapunov exponents for a class of nonconformal repellers; a conditional variational principle for limits of almost additive sequences; and the study of dimension spectra that consider simultaneously limits into the future and into the past.

Keywords: Almost additive sequences; thermodynamic formalism.

Mathematics Subject Classification 2010: 37C45, 37D20, 37D35

1. Introduction

The point of departure for this survey is the nonadditive thermodynamic formalism developed in [1], having in mind certain applications to the dimension theory of dynamical systems, as detailed below. Our main aim is to survey some recent developments in the particular case of almost additive sequences of functions.

During the last two decades, the dimension theory of dynamical systems progressively developed into an independent field of research, roughly speaking with the objective of measuring the complexity from the dimensional point of view of the objects that remain invariant under the dynamics, such as the invariant sets and measures. The first monograph that clearly took this point of view was Pesin's book ([36]), which describes the state of the art up to 1997. We refer to our book ([4]) for a detailed description of many of the more recent results in the area.


The nonadditive thermodynamic formalism is a generalization of the classical thermodynamic formalism, in which the topological pressure P (ϕ) of a continuous function ϕ (with respect to a given dynamics on a compact metric space), is replaced by the topological pressure P (Φ) of a sequence of continuous functions Φ = (ϕn )n . The classical pressure P (ϕ) was introduced by Ruelle in [39] for expansive maps (see also his book [40]), and by Walters in [46] in the general case. For arbitrary sets (not necessarily compact), the nonadditive topological pressure also generalizes (and imitates) the notion of topological pressure introduced by Pesin and Pitskel in [37], which is equivalent to the notion introduced earlier by Bowen in [13] (see [37]). The nonadditive thermodynamic formalism contains as a particular case a new formulation of the subadditive thermodynamic formalism earlier introduced by Falconer in [19]. The main motivation behind the nonadditive thermodynamic formalism is to allow certain applications to a more general class of invariant sets in the context of the dimension theory of dynamical systems. We ﬁrst recall that the unique solution s of the equation P (sϕ) = 0,

(1)

where ϕ is a certain function associated to a given invariant set, is often related to the Hausdorff dimension of the set. Equation (1) was introduced by Bowen in [15] (in his study of quasi-circles) and is usually called Bowen's equation. It is also appropriate to call it the Bowen–Ruelle equation, taking into account the fundamental role of the thermodynamic formalism developed by Ruelle, and of his article [41]. Virtually all known equations used to compute or to estimate the dimension of invariant sets are particular cases of Eq. (1) or of an appropriate generalization. We recommend [42] for a detailed and informative related discussion.

On the other hand, in certain applications of dimension theory (we refer to the examples in [1, 4]), one is naturally led to consider sequences $\Phi = (\varphi_n)_n$ that may satisfy no additivity between the functions $\varphi_n$. The nonadditive topological pressure and its associated thermodynamic formalism allow us to consider these generalizations in a unified framework. In particular, this made it possible to establish in [1] sharp lower and upper dimension estimates for repellers and hyperbolic sets, including for a class of nondifferentiable maps, without further effort. The dimension estimates are obtained as solutions of appropriate generalizations of Eq. (1), now involving the nonadditive topological pressure.

Given a continuous function $\varphi : X \to \mathbb{R}$ on a compact metric space X, the classical topological pressure of ϕ, with respect to a continuous map $f : X \to X$, satisfies the variational principle

\[ P(\varphi) = \sup_{\mu} \Big( h_\mu(f) + \int_X \varphi\,d\mu \Big), \]

where $h_\mu(f)$ is the Kolmogorov–Sinai entropy of f with respect to the measure µ, and where the supremum is taken over all f-invariant probability measures on X.
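As a concrete illustration of the classical variational principle, consider the full shift on finitely many symbols with a potential that depends only on the first symbol. In that case the pressure is log Σ_i exp(ϕ_i), and a Bernoulli measure with weights q has entropy −Σ q_i log q_i and integral Σ q_i ϕ_i. The sketch below (function names and the sample potential are illustrative assumptions) compares h_µ + ∫ϕ dµ with the pressure for a few Bernoulli measures only, so it probes the supremum on a subfamily of invariant measures rather than on all of them.

```python
import math

def pressure_locally_constant(phi):
    """Pressure of a potential depending only on the first symbol: log sum_i exp(phi_i)."""
    return math.log(sum(math.exp(v) for v in phi))

def free_energy(q, phi):
    """h_mu + integral of phi for the Bernoulli measure with weights q."""
    entropy = -sum(qi * math.log(qi) for qi in q if qi > 0)
    return entropy + sum(qi * vi for qi, vi in zip(q, phi))

def gibbs_weights(phi):
    """Weights q_i proportional to exp(phi_i); these attain the supremum."""
    z = sum(math.exp(v) for v in phi)
    return [math.exp(v) / z for v in phi]

phi = [0.0, -1.0, 0.5]                      # sample potential values (assumption)
P = pressure_locally_constant(phi)
candidates = [[1 / 3, 1 / 3, 1 / 3], [0.5, 0.25, 0.25],
              [0.1, 0.1, 0.8], gibbs_weights(phi)]
for q in candidates:
    print([round(x, 4) for x in q], round(free_energy(q, phi), 6), "<=", round(P, 6))
```

The last candidate, with weights proportional to exp(ϕ_i), is the equilibrium measure in this simple setting: its free energy equals the pressure, while the others fall strictly below it.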


The thermodynamic formalism developed in [1] also includes a variational principle for the topological pressure, although with a restrictive assumption on the sequence Φ. Namely, if there exists a continuous function $\varphi : X \to \mathbb{R}$ such that

\[ \varphi_{n+1} - \varphi_n \circ f \to \varphi \quad \text{uniformly when } n \to \infty, \tag{2} \]

then

\[ P(\Phi) = \sup_{\mu} \Big( h_\mu(f) + \int_X \varphi\,d\mu \Big), \]

again with the supremum taken over all f -invariant probability measures on X. The restrictive assumption in (2) caused that until recently there was no available discussion of equilibrium and Gibbs measures, in the general context of the nonadditive thermodynamic formalism. But it is well known that equilibrium and Gibbs measures play a prominent role in dimension theory and in particular in the multifractal analysis of dynamical systems, in which the spectra are often obtained by providing equilibrium measures with the appropriate local entropy or the appropriate pointwise dimension. Equilibrium and Gibbs measures can also be for example measures of full topological entropy or full Hausdorﬀ dimension. It is sometimes possible to develop the theory without a variational principle for the topological pressure, and thus without these measures, but the corresponding proofs tend to be more technical. Clearly, from the points of view of dimension theory and multifractal analysis, it is desirable to continue using equilibrium and Gibbs measures even when the classical thermodynamical formalism cannot be used. The discussion above justiﬁes the interest in looking for more general classes of sequences of functions, although perhaps not arbitrary sequences, for which it is still possible to establish a corresponding variational principle for the topological pressure, and to study the associated equilibrium and Gibbs measures, among several other results. This is precisely what happens with the so-called almost additive sequences, for which it is possible not only to establish a variational principle, but also to discuss the existence and uniqueness of equilibrium and Gibbs measures. We recall that a sequence Φ = (ϕn )n is said to be almost additive if there is a constant C > 0 such that −C + ϕn + ϕm ◦ f n ≤ ϕn+m ≤ C + ϕn + ϕm ◦ f n

(3)

for every $n, m \in \mathbb{N}$. Clearly, for any function ϕ the sequence

\[ \varphi_n = \sum_{k=0}^{n-1} \varphi \circ f^k \]

is almost additive, since in this case $\varphi_{n+m} = \varphi_n + \varphi_m \circ f^n$ for every $n, m \in \mathbb{N}$. Nontrivial examples of almost additive sequences occur for example in the study of Lyapunov exponents for nonconformal maps by Barreira


and Gelfert in [7] (see Sec. 7). Following [3], we consider in particular repellers and hyperbolic sets of $C^1$ transformations, and for an almost additive sequence Φ of continuous functions we describe several results towards the foundations of an almost additive thermodynamic formalism. This includes the formula

\[ P(\Phi) = \lim_{n\to\infty} \frac{1}{n} \log \sum_{x : f^n(x) = x} \exp \varphi_n(x) \]

for the topological pressure, for the class of almost additive sequences Φ with tempered variation. We also describe a variational principle for the topological pressure of an almost additive sequence, namely

\[ P(\Phi) = \sup_{\mu} \Big( h_\mu(f) + \lim_{n\to\infty} \frac{1}{n} \int_X \varphi_n\,d\mu \Big), \tag{4} \]

and we discuss the existence and uniqueness of equilibrium and invariant Gibbs measures, among several other results, for example concerning characterizations of unique equilibrium measures. Mummert ([34]) independently established identity (4), although under an additional assumption on the sequence Φ that can be removed by repeating verbatim arguments in [3]. Cao, Feng and Huang considered more recently in [16] the general class of subadditive sequences, and they also obtained the variational principle in (4), but they do not discuss the existence of equilibrium or Gibbs measures. Earlier results in this direction were obtained by Käenmäki in [30] for a particular class of subadditive sequences, while also discussing the existence of an equilibrium measure.

After presenting the foundations of the almost additive thermodynamic formalism, we describe three applications of the formalism. The first application, following Barreira and Gelfert in [7], considers nonconformal repellers in $\mathbb{R}^2$ satisfying a cone condition. The main objective is to obtain a multifractal analysis for the level sets of the Lyapunov exponents. In particular, we consider certain almost additive sequences related to the Lyapunov exponents to which one can apply the almost additive thermodynamic formalism. However, we emphasize that the results in [7] were obtained independently of the theory described in the survey. We also point out that the proofs of some results in Secs. 4–6 can be considered a distillation of arguments in that paper. We recall that a differentiable map f is said to be conformal on a given set provided that the differential $d_x f$ is a multiple of an isometry at every point x of the set.
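The almost additivity condition (3) and the periodic-point formula for P(Φ) stated earlier in this introduction can be made concrete on a toy model. The sketch below is an illustration under stated assumptions and is not the setting of [3] or [7]: it takes the full shift on two symbols, two 2×2 matrices with positive entries (chosen arbitrarily), and ϕ_n(x) = log‖M_{x_1}···M_{x_n}‖ with ‖·‖ the sum of entries. For positive matrices one has ‖A‖‖B‖·m/(dM) ≤ ‖AB‖ ≤ ‖A‖‖B‖, where m and M are the smallest and largest entries of the generators and d = 2, so (3) holds with C = log(dM/m). Periodic points of period n correspond to words of length n, and the sum over words equals the sum of the entries of (M_1 + M_2)^n, so the limit is the logarithm of the spectral radius of M_1 + M_2.

```python
import itertools
import math

# Two positive 2x2 matrices (illustrative choice, not from the survey).
M = {1: ((2.0, 1.0), (1.0, 1.0)), 2: ((1.0, 1.0), (1.0, 3.0))}

def mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def norm(A):
    return sum(sum(row) for row in A)       # sum of entries; all entries are positive

def phi(word):
    """phi_n at any point whose first n symbols are `word`."""
    P = M[word[0]]
    for s in word[1:]:
        P = mul(P, M[s])
    return math.log(norm(P))

def almost_additivity_defect(max_len):
    """Largest |phi_{n+m} - phi_n - phi_m o sigma^n| over all words with n, m <= max_len."""
    worst = 0.0
    for n in range(1, max_len + 1):
        for m in range(1, max_len + 1):
            for w in itertools.product(M, repeat=n + m):
                worst = max(worst, abs(phi(w) - phi(w[:n]) - phi(w[n:])))
    return worst

def pressure_estimate(n):
    """(1/n) log of the sum of exp(phi_n) over the periodic points of period n."""
    return math.log(sum(math.exp(phi(w)) for w in itertools.product(M, repeat=n))) / n

entries = [x for A in M.values() for row in A for x in row]
C = math.log(2 * max(entries) / min(entries))      # the constant log(d*M/m)
exact = math.log((7 + math.sqrt(17)) / 2)          # log spectral radius of M_1 + M_2
print(almost_additivity_defect(4), "<=", C)
print(pressure_estimate(10), "vs", exact)
```

The finite-n estimate approaches the limit only at rate O(1/n), so the two printed pressure values agree to roughly one decimal place at n = 10.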
We emphasize that the dimension theory and the multifractal analysis of dynamical systems are only completely understood in the case of conformal uniformly hyperbolic dynamics, either invertible or noninvertible. This includes saddle-type hyperbolic diﬀeomorphisms on surfaces, and holomorphic maps in the complex plane with a hyperbolic Julia set. The study of the dimension of invariant sets of nonconformal transformations has proven to be much more delicate. The main diﬃculty is related with the possibility of existence of distinct Lyapunov exponents in diﬀerent directions, which may change from point to point. Another diﬃculty is that certain number-theoretical


properties may play an important role. Nevertheless, there exist several noteworthy results concerning the dimension theory of certain classes of invariant sets of nonconformal transformations, namely due to Falconer ([18, 20]), Bothe ([12]), Simon ([44]), and Simon and Solomyak ([45]). We refer to [4] for a related discussion.

The second application, following Barreira and Doutor in [5], has the objective of establishing a conditional variational principle for the multifractal spectra obtained from limits of almost additive sequences. This means that we consider the level sets

\[ K_\alpha = \Big\{ x \in X : \lim_{n\to\infty} \frac{\varphi_n(x)}{\psi_n(x)} = \alpha \Big\}, \]

where $(\varphi_n)_n$ and $(\psi_n)_n$ are almost additive sequences, and we give a description of their topological entropy or Hausdorff dimension in terms of a conditional variational principle. For example, in the case of the topological entropy the conditional variational principle takes the form

\[ h(f \mid K_\alpha) = \max\Big\{ h_\mu(f) : \lim_{n\to\infty} \frac{\int_X \varphi_n\,d\mu}{\int_X \psi_n\,d\mu} = \alpha \Big\}, \]

where h(f | Kα ) denotes the topological entropy on Kα . It is also shown that the spectra, such as α → h(f | Kα ), are continuous, and that the associated irregular sets have full dimension. The approach in [5] builds on related arguments in former work of Barreira et al. in [9], although now for almost additive sequences. The multifractal analysis of dynamical systems can be considered a subﬁeld of the dimension theory of dynamical systems, and it studies the complexity of the level sets of invariant local quantities obtained from a dynamical system. The concept of multifractal analysis was suggested by Halsey et al. in [27]. The ﬁrst rigorous approach is due to Collet, Lebowitz and Porzio in [17] for a class of measures invariant under 1-dimensional Markov maps. In [32], Lopes considered the measure of maximal entropy for hyperbolic Julia sets, and in [38], Rand studied Gibbs measures for a class of repellers. We refer the reader to the books [4, 36] for details and further references. The third application, following Barreira and Doutor in [6], is a complete description of the dimension spectra of limits of almost additive sequences on a hyperbolic set of a surface diﬀeomorphism. The main novelty is that we consider simultaneously limits into the future and into the past. More precisely, the spectra are obtained by computing the Hausdorﬀ dimension of the level sets of limits of almost additive sequences both for positive and negative time. We emphasize that the description of the spectra is not a consequence of the results considering simply limits into the future (or into the past). The main diﬃculty is that although the local product structure provided by the intersection of stable and unstable manifolds is


bi-Lipschitz equivalent to a product, the level sets are never compact (so that their box dimension is strictly larger than their Hausdorff dimension), and thus the product of level sets may have a dimension that need not be the sum of the dimensions of the sets. Instead we construct explicitly noninvariant measures concentrated on each product of level sets having the appropriate pointwise dimension. This approach builds on former work of Barreira and Valls in [11] in the additive case.

2. Nonadditive Topological Pressure

2.1. General theory

We recall in this section the notion of nonadditive topological pressure introduced by Barreira in [1]. The main idea is to replace each sequence of functions $\sum_{k=0}^{n-1} \varphi \circ f^k$ in the definition of topological pressure by an arbitrary sequence $\varphi_n$. Let $f : X \to X$ be a continuous transformation of a compact metric space X. Given a finite open cover $\mathcal{U}$ of X, we denote by $W_n(\mathcal{U})$ the collection of vectors $U = (U_0, \ldots, U_n)$ with $U_0, \ldots, U_n \in \mathcal{U}$. For each $U \in W_n(\mathcal{U})$, we write $m(U) = n$, and we consider the open set

\[ X(U) = \bigcap_{k=0}^{n} f^{-k} U_k. \]

These sets can be thought of as cylinder sets. Now let Φ be a sequence of continuous functions $\varphi_n : X \to \mathbb{R}$ for each $n \in \mathbb{N}$. We define

\[ \gamma_n(\Phi, \mathcal{U}) = \sup\{ |\varphi_n(x) - \varphi_n(y)| : x, y \in X(U) \text{ for some } U \in W_n(\mathcal{U}) \} \tag{5} \]

for each $n \in \mathbb{N}$, and we always assume that

\[ \limsup_{\operatorname{diam} \mathcal{U} \to 0}\ \limsup_{n\to\infty} \frac{\gamma_n(\Phi, \mathcal{U})}{n} = 0. \tag{6} \]

We observe that condition (6) holds automatically when Φ is an additive sequence, that is, when

\[ \varphi_n = \sum_{k=0}^{n-1} \varphi \circ f^k \tag{7} \]

for a given continuous function $\varphi : X \to \mathbb{R}$ and each $n \in \mathbb{N}$ (this is an immediate consequence of the uniform continuity of any continuous function on the compact metric space X). Now we proceed with the construction of the nonadditive topological pressure. For each $U \in W_n(\mathcal{U})$ we write

\[ \varphi(U) = \begin{cases} \sup_{X(U)} \varphi_n & \text{if } X(U) \ne \emptyset, \\ -\infty & \text{if } X(U) = \emptyset. \end{cases} \tag{8} \]


Given a set $Z \subset X$ and a number $\alpha \in \mathbb{R}$, we define the function

\[ M(Z, \alpha, \Phi, \mathcal{U}) = \lim_{n\to\infty} \inf_{\Gamma} \sum_{U \in \Gamma} \exp(-\alpha m(U) + \varphi(U)), \]

where the infimum is taken over all finite or countable collections $\Gamma \subset \bigcup_{k \ge n} W_k(\mathcal{U})$ such that $\bigcup_{U \in \Gamma} X(U) \supset Z$ (in other words, such that the cylinder sets X(U) cover the set Z). One can show that the function $\alpha \mapsto M(Z, \alpha, \Phi, \mathcal{U})$ jumps from +∞ to 0 at a unique value of α, and thus we can define

\[ P_Z(\Phi, \mathcal{U}) = \inf\{ \alpha \in \mathbb{R} : M(Z, \alpha, \Phi, \mathcal{U}) = 0 \}. \]

Theorem 2.1 ([1]). The following properties hold:
(1) The limit $P_Z(\Phi) := \lim_{\operatorname{diam} \mathcal{U} \to 0} P_Z(\Phi, \mathcal{U})$ exists;
(2) If there exist constants $c_1, c_2 < 0$ such that $c_1 n \le \varphi_n \le c_2 n$ for every $n \in \mathbb{N}$, and the topological entropy $h(f \mid X)$ is finite, then there exists a unique number $s \in \mathbb{R}$ such that $P_Z(s\Phi) = 0$.

The number $P_Z(\Phi)$ is called the nonadditive topological pressure of the sequence of functions Φ (with respect to f on Z). We note that the set Z need not be compact nor f-invariant. For simplicity, when there is no danger of confusion, we simply refer to $P_Z(\Phi)$ as the topological pressure of Φ (with respect to f on Z). We also write $P(\Phi) = P_X(\Phi)$. One can easily verify that if Φ is the (additive) sequence of functions in (7), then P(Φ) coincides with the classical topological pressure of the function ϕ.

The number $h(f \mid Z) = P_Z(0)$ is called the topological entropy of f on Z. It coincides with the notion of topological entropy for noncompact sets introduced in [37], and is equivalent to the notion of topological entropy introduced earlier by Bowen in [13]. It can be described as follows. Given a set $Z \subset X$ and a number $\alpha \in \mathbb{R}$, we define the function

\[ N(Z, \alpha, \mathcal{U}) = \lim_{n\to\infty} \inf_{\Gamma} \sum_{U \in \Gamma} \exp(-\alpha m(U)), \]

where the infimum is taken over all finite or countable collections $\Gamma \subset \bigcup_{k \ge n} W_k(\mathcal{U})$ such that $\bigcup_{U \in \Gamma} X(U) \supset Z$. Then

\[ h(f \mid Z) = \lim_{\operatorname{diam} \mathcal{U} \to 0} \inf\{ \alpha \in \mathbb{R} : N(Z, \alpha, \mathcal{U}) = 0 \}. \]
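To see how the jump of α ↦ N(Z, α, 𝒰) arises, one can evaluate the defining sum on one explicit family of covers. The sketch below assumes the full shift on p symbols with 𝒰 the cover by the p cylinders of length one, and uses only the collections Γ = W_n(𝒰), which consist of p^(n+1) nonempty cylinders with m(U) = n. Since the definition takes an infimum over all admissible Γ, these particular covers only give the upper bound h ≤ log p; the function names are illustrative.

```python
import math

def log_cover_sum(p, alpha, n):
    """log of the sum of exp(-alpha * m(U)) over Gamma = W_n(U):
    p**(n+1) cylinders, each with m(U) = n (computed in logs to avoid overflow)."""
    return (n + 1) * math.log(p) - alpha * n

def critical_alpha(p, n):
    """Value of alpha at which the sum over W_n(U) equals 1."""
    return (n + 1) * math.log(p) / n

p = 3
for n in (10, 100, 1000):
    print(n, critical_alpha(p, n),
          log_cover_sum(p, math.log(p) + 0.1, n),   # tends to -infinity
          log_cover_sum(p, math.log(p) - 0.1, n))   # tends to +infinity
```

As n grows the threshold tends to log p: above it the sums tend to 0 and below it they diverge, which mirrors the jump from +∞ to 0 in the definition.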

2.2. Equilibrium measures for subadditive sequences

As described in the introduction, the nonadditive thermodynamic formalism developed in [1] also includes a variational principle for the topological pressure, although


with a restrictive assumption on the sequence Φ (see (2)). Nevertheless, it is still meaningful to consider some particular classes of dynamics and potentials, and to look for equilibrium and Gibbs measures. With this in mind we describe in this section results by Käenmäki [30] and by Feng and Käenmäki [25] concerning the construction of equilibrium measures for a class of subadditive sequences in the particular case of symbolic dynamics. These sequences are well adapted to the study of the dimension of a class of limit sets of iterated function systems (see [30]) and of the multifractal analysis of the top Lyapunov exponent of products of matrices (see [21, 23, 26]). We refer to the following sections for related results concerning the existence of equilibrium and Gibbs measures for other classes of dynamics and potentials.

We first introduce some notation to consider the particular case of symbolic dynamics. Given $p \in \mathbb{N}$, we write $\Sigma_n = \{1, \ldots, p\}^n$ for each $n \in \mathbb{N}$ and $|\omega| = n$ for each $\omega \in \Sigma_n$. We also write

\[ \Sigma = \{1, \ldots, p\}^{\mathbb{N}} \quad\text{and}\quad \Sigma^* = \bigcup_{n \in \mathbb{N}} \Sigma_n, \]

and we consider the shift map $\sigma : \Sigma \to \Sigma$ defined by $\sigma(i_1 i_2 \cdots) = (i_2 i_3 \cdots)$. Given $t \ge 0$ and $\omega \in \Sigma^*$, let $\mathcal{C}$ be the class of all (parametrized) functions $\psi^t_\omega : \Sigma \to \mathbb{R}^+$ with $\psi^0_\omega = 1$ satisfying the following properties:
(1) there exists $K_t > 0$ such that $\psi^t_\omega(\omega_1) \le K_t\, \psi^t_\omega(\omega_2)$ for any $\omega_1, \omega_2 \in \Sigma$;
(2) for every $\omega' \in \Sigma$ and $j \in [1, |\omega|] \cap \mathbb{N}$ we have $\psi^t_\omega(\omega') \le \psi^t_{\omega|j}(\sigma^j(\omega)\omega')\, \psi^t_{\sigma^j(\omega)}(\omega')$, where $\omega|j$ denotes the first j elements of ω, and where $\sigma^j(\omega)\omega'$ denotes the juxtaposition of the two sequences;
(3) for each $\delta > 0$ there exist $a = a(\delta)$, $b = b(\delta) \in (0,1)$ depending only on δ, with $a(\delta) \to 1$ and $b(\delta) \to 1$ when $\delta \to 0$, such that $\psi^t_\omega(\omega')\, a^{|\omega|} \le \psi^{t+\delta}_\omega(\omega') \le \psi^t_\omega(\omega')\, b^{|\omega|}$ for every $\omega' \in \Sigma$.

We note that this class of functions contains as particular examples several classes earlier considered by Falconer [18, 20] and by Barreira [2], in connection with the study of the dimension of repellers of nonconformal transformations. For any function in the class $\mathcal{C}$, using the subadditivity it is shown in [30] that given $\omega' \in \Sigma$ and a σ-invariant probability measure µ in Σ, the limits

\[ p(t) = \lim_{n\to\infty} \frac{1}{n} \log \sum_{\omega \in \Sigma_n} \psi^t_\omega(\omega') \tag{9} \]

and

\[ s_\mu(t) = \lim_{n\to\infty} \frac{1}{n} \sum_{\omega \in \Sigma_n} \mu(C_\omega) \log \psi^t_\omega(\omega') \]


exist, where Cω ⊂ Σ is the set of sequences whose ﬁrst n elements are equal to those of ω). Moreover, they are independent of ω . To verify that p(t) is indeed a particular case of the nonadditive topological pressure, given ω ∈ Σ and n ∈ N we deﬁne a sequence ϕn : Σ → R by ϕtn (ω) = sup log ψωt (ω ). ω ∈Cω

(10)

Then the ﬁrst condition on the class C ensures that (6) holds, and we can show that p(t) coincides with the nonadditive topological pressure of the sequence Φt = (ϕn )n for any ω . This follows readily from results in [1] using the second condition on C. Moreover, by the third condition we can readily apply Theorem 2.1 to conclude that there exists a unique t ≥ 0 such that p(t) = 0 (the proof of this statement in [30] follows the same argument). This zero is often related to the dimension of certain classes of limit sets of iterated function systems and repellers (see for example [1, 2, 4, 18, 20]). In addition, the following property holds. Theorem 2.2 ([30]). We have p(t) ≥ hµ (σ) + sµ (t).

(11)

By Kingman's subadditive ergodic theorem, we have
$$s_\mu(t) = \lim_{n \to \infty} \frac{1}{n} \int_\Sigma \varphi_n^t \, d\mu,$$
and thus, inequality (11) can be written in the form
$$P(\Phi_t) \ge h_\mu(\sigma) + \lim_{n \to \infty} \frac{1}{n} \int_\Sigma \varphi_n^t \, d\mu.$$
This inequality is due to Falconer [19] in the general case of arbitrary subadditive sequences (and not only for the sequences $\Phi_t$) with a bounded distortion condition (which in the present context is given by the first condition on $\mathcal{C}$). Assuming a certain Lipschitz property for the elements of the sequence (more generally for topological Markov chains), he also obtained the variational principle
$$P(\Phi_t) = \sup_\mu \left( h_\mu(\sigma) + \lim_{n \to \infty} \frac{1}{n} \int_\Sigma \varphi_n^t \, d\mu \right). \tag{12}$$
In an analogous manner to that in the classical additive theory, we say that a $\sigma$-invariant probability measure $\mu$ in $\Sigma$ is an equilibrium measure for the sequence $\Phi_t$ if it attains the supremum in (12). In the present context the existence of equilibrium measures was established by Käenmäki.

Theorem 2.3 ([30]). For each $t \ge 0$ there exists an equilibrium measure for the sequence $\Phi_t$.

The existence of these equilibrium measures is used in [30] to study the dimension of a class of limit sets of iterated function systems.

Now we consider a particular class of functions in $\mathcal{C}$ that are obtained from products of matrices. Given $p, m \in \mathbb{N}$, let $M_1, \ldots, M_p$ be $m \times m$ matrices. For each $t > 0$, $n \in \mathbb{N}$ and $\omega \in \Sigma_n$, we consider the constant function
$$\bar\psi_\omega^t = \|M_{i_1} \cdots M_{i_n}\|^t,$$
where $\omega = (i_1 \cdots i_n)$, and again we define a sequence $\bar\Phi_t$ as in (10), that is,
$$\bar\varphi_n^t(\omega) = \sup_{\omega' \in C_\omega} \log \bar\psi_\omega^t(\omega') = \sup_{\omega' \in C_\omega} \log \|M_{i_1} \cdots M_{i_n}\|^t,$$

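where $\omega = (i_1 \cdots i_n)$. One can easily verify that the functions $\bar\psi_\omega^t$ belong to the class $\mathcal{C}$, and that $p(t)$ in (9) is given by
$$p(t) = \lim_{n \to \infty} \frac{1}{n} \log \sum_{\omega \in \Sigma_n} \|M_{i_1} \cdots M_{i_n}\|^t.$$
Moreover, given a $\sigma$-invariant probability measure $\mu$ in $\Sigma$, we have
$$s_\mu(t) = t \lim_{n \to \infty} \frac{1}{n} \sum_{\omega \in \Sigma_n} \mu(C_\omega) \log \|M_{i_1} \cdots M_{i_n}\|,$$
and it follows from (12) (see also [16]) that
$$p(t) = \sup_\mu \bigl( h_\mu(\sigma) + s_\mu(t) \bigr).$$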
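The finite-$n$ quantity inside the limit defining $p(t)$ for matrix products can be evaluated by brute force. The following is a minimal sketch, not part of the cited works: the function names are hypothetical, the Frobenius norm is an arbitrary choice (all matrix norms are equivalent and so give the same limit), and the enumeration over all $p^n$ words is only feasible for small $n$. It assumes, as in the hypothesis of Theorem 2.4, that at least one product of length $n$ is nonzero.

```python
import itertools
import math

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def frobenius_norm(a):
    """Frobenius norm (an assumed choice; any matrix norm yields the same limit)."""
    return math.sqrt(sum(x * x for row in a for x in row))

def pressure_approx(matrices, t, n):
    """Return (1/n) log of the sum of ||M_{i1} ... M_{in}||^t over all words of length n."""
    total = 0.0
    for word in itertools.product(range(len(matrices)), repeat=n):
        product = matrices[word[0]]
        for index in word[1:]:
            product = matmul(product, matrices[index])
        norm = frobenius_norm(product)
        if norm > 0.0:  # vanishing products are skipped
            total += norm ** t
    return math.log(total) / n

if __name__ == "__main__":
    # Two nonnegative 2 x 2 matrices, chosen only for illustration.
    M = [[[1.0, 1.0], [0.0, 1.0]], [[1.0, 0.0], [1.0, 1.0]]]
    for n in (2, 4, 6, 8):
        print(n, pressure_approx(M, 1.0, n))
```

For $1 \times 1$ matrices $(a)$ and $(b)$ the sum factorizes as $(a^t + b^t)^n$, so the approximation equals $\log(a^t + b^t)$ for every $n$, which gives a quick sanity check.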

The following result is due to Feng and Käenmäki.

Theorem 2.4 ([25]). If for each $n \in \mathbb{N}$ there exist $i_1, \ldots, i_n \in \{1, \ldots, p\}$ such that $M_{i_1} \cdots M_{i_n} \ne 0$, then for each $t \ge 0$ there exist at most $m$ ergodic equilibrium measures for the sequence $\bar\Phi_t$. If in addition the only proper vector space $V$ such that $M_i V \subset V$ for $i = 1, \ldots, p$ is the origin, then for each $t \ge 0$ there exists a unique equilibrium measure for the sequence $\bar\Phi_t$.

The irreducibility condition in Theorem 2.4 concerning the subspaces $V$ is used in [23] to show that there exist $c > 0$ and $k \in \mathbb{N}$ such that for each $\omega, \omega' \in \Sigma^*$ there exists $\bar\omega \in \bigcup_{j=1}^{k} \Sigma_j$ for which
$$\|M_{\omega \bar\omega \omega'}\| \ge c \, \|M_\omega\| \cdot \|M_{\omega'}\|, \tag{13}$$
where $M_\omega = M_{i_1} \cdots M_{i_n}$ for $\omega = (i_1 \cdots i_n)$. It is essentially this property that allows one to establish the existence of a unique equilibrium measure in [25]. We note that property (13) ensures that the sequence $\bar\Phi_t$ is almost additive (see (3)), and thus the existence of a unique ergodic measure in Theorem 2.4 as well as its Gibbs property (also obtained in [25]) follow from general results in [3] for the class of almost additive sequences (compare with the results in Secs. 4 and 5).

3. Topological Pressure for Almost Additive Sequences

We introduce in this section the class of almost additive sequences, and we present formulas for the nonadditive topological pressure. For definiteness we consider only

the case of functions defined on a repeller. We refer to the remaining sections for further developments.

3.1. Repellers and Markov partitions

We recall in this section the notion of repeller and the notion of Markov partition. Let $f\colon M \to M$ be a $C^1$ map, and let $\Lambda \subset M$ be a compact $f$-invariant set (this means that $f^{-1}\Lambda = \Lambda$). We say that $f$ is expanding on $\Lambda$, and that $\Lambda$ is a repeller of $f$, if there exist constants $c > 0$ and $\beta > 1$ such that
$$\|d_x f^n v\| \ge c \beta^n \|v\|$$
for every $x \in \Lambda$, $n \in \mathbb{N}$, and $v \in T_x M$. In addition, we always assume in this presentation that there is an open set $U \supset \Lambda$ such that $\Lambda = \bigcap_{n \in \mathbb{N}} f^{-n} U$, and that $f$ is topologically mixing on $\Lambda$. We recall that a collection of closed sets $R_1, \ldots, R_p \subset \Lambda$ is said to be a Markov partition of the repeller $\Lambda$ if:

(1) $\Lambda = \bigcup_{i=1}^{p} R_i$, and $\overline{\operatorname{int} R_i} = R_i$ for $i = 1, \ldots, p$;
(2) $\operatorname{int} R_i \cap \operatorname{int} R_j = \emptyset$ whenever $i \ne j$;
(3) $f(R_i) \supset R_j$ whenever $f(\operatorname{int} R_i) \cap \operatorname{int} R_j \ne \emptyset$.

We note that here the interior of each set $R_i$ is computed with respect to the induced topology on $\Lambda$. Any repeller has Markov partitions with arbitrarily small diameter
$$\max\{\operatorname{diam} R_i : i = 1, \ldots, p\} \tag{14}$$
(see [41]). Given a Markov partition $R_1, \ldots, R_p$ of $\Lambda$, we define a $p \times p$ matrix $A = (a_{ij})$ with entries
$$a_{ij} = \begin{cases} 1 & \text{if } f(\operatorname{int} R_i) \cap \operatorname{int} R_j \ne \emptyset, \\ 0 & \text{if } f(\operatorname{int} R_i) \cap \operatorname{int} R_j = \emptyset, \end{cases} \tag{15}$$
and we consider the corresponding topological Markov chain $\sigma\colon \Sigma_A \to \Sigma_A$ defined by the shift map $\sigma(i_1 i_2 \cdots) = (i_2 i_3 \cdots)$ in the set
$$\Sigma_A = \bigl\{ (i_1 i_2 \cdots) \in \{1, \ldots, p\}^{\mathbb{N}} : a_{i_k i_{k+1}} = 1 \text{ for every } k \in \mathbb{N} \bigr\}. \tag{16}$$

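We denote by $\Sigma_{A,n}$ the set of $n$-tuples $(i_1 \cdots i_n)$ for which there is a sequence $(j_1 j_2 \cdots) \in \Sigma_A$ such that $i_\ell = j_\ell$ for $\ell = 1, \ldots, n$. For each $(i_1 \cdots i_n) \in \Sigma_{A,n}$ we define
$$\Delta_{i_1 \cdots i_n} = \bigcap_{\ell=0}^{n-1} f^{-\ell} R_{i_{\ell+1}}, \tag{17}$$
and setting
$$\chi(i_1 i_2 \cdots) = \bigcap_{\ell=0}^{\infty} f^{-\ell} R_{i_{\ell+1}} = \bigcap_{n=1}^{\infty} \Delta_{i_1 \cdots i_n},$$
we obtain a coding map $\chi\colon \Sigma_A \to \Lambda$ for the repeller.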
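The set $\Sigma_{A,n}$ of admissible words is easy to enumerate for a given transition matrix. The sketch below is illustrative only: the function names are hypothetical, symbols are indexed from 0 rather than from 1, and it assumes that every row of $A$ contains a 1 (which holds when the chain is topologically mixing), so that every finite word respecting the transitions extends to a sequence in $\Sigma_A$.

```python
import itertools

def admissible_words(A, n):
    """All n-tuples (i_1 ... i_n) with A[i_k][i_{k+1}] == 1 for consecutive symbols."""
    p = len(A)
    return [w for w in itertools.product(range(p), repeat=n)
            if all(A[w[k]][w[k + 1]] == 1 for k in range(n - 1))]

def count_by_matrix_power(A, n):
    """Number of admissible words of length n, as the sum of the entries of A^(n-1)."""
    p = len(A)
    power = [[1 if i == j else 0 for j in range(p)] for i in range(p)]
    for _ in range(n - 1):
        power = [[sum(power[i][k] * A[k][j] for k in range(p)) for j in range(p)]
                 for i in range(p)]
    return sum(sum(row) for row in power)

if __name__ == "__main__":
    # Golden mean shift: the word (1, 1) is forbidden (0-based symbols).
    A = [[1, 1], [1, 0]]
    for n in range(1, 6):
        print(n, len(admissible_words(A, n)), count_by_matrix_power(A, n))
```

For the golden mean matrix the counts follow the Fibonacci numbers 2, 3, 5, 8, and both functions agree.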

3.2. Formulas for the topological pressure

Now we introduce the class of almost additive sequences, and we describe corresponding formulas for the nonadditive topological pressure both using and avoiding Markov partitions. We say that the sequence of functions $\Phi = (\varphi_n)_n$ with $\varphi_n\colon \Lambda \to \mathbb{R}$ for each $n \in \mathbb{N}$ is almost additive (with respect to $f$ on $\Lambda$) if there exists a constant $C > 0$ such that for every $n, m \in \mathbb{N}$ and $x \in \Lambda$ we have
$$-C + \varphi_n(x) + \varphi_m(f^n(x)) \le \varphi_{n+m}(x) \le C + \varphi_n(x) + \varphi_m(f^n(x)). \tag{18}$$
Clearly, any additive sequence of functions $\varphi_n = \sum_{k=0}^{n-1} \varphi \circ f^k$ is almost additive. Nontrivial examples of almost additive sequences occur naturally for example in the study of nonconformal repellers (see Sec. 7 for a detailed description).

Now let $\Lambda$ be a repeller of $f$, and let $\Delta_{i_1 \cdots i_n}$ be the sets in (17) obtained from a given Markov partition. We write
$$\gamma_n(\Phi) = \sup\bigl\{ |\varphi_n(x) - \varphi_n(y)| : x, y \in \Delta_{i_1 \cdots i_n} \text{ and } (i_1 \cdots i_n) \in \Sigma_{A,n} \bigr\}. \tag{19}$$

One can easily verify that $\gamma_n(\Phi)$ coincides with $\gamma_n(\Phi, \mathcal{U})$ in (5) for the open cover $\mathcal{U}$ of $\Lambda$ formed by the elements $R_1, \ldots, R_p$ of the Markov partition (with respect to the induced topology on $\Lambda$). We say that $\Phi$ has tempered variation if $\gamma_n(\Phi)/n \to 0$ as $n \to \infty$. Clearly, any sequence with tempered variation satisfies condition (6). The following result provides a formula for the topological pressure of an almost additive sequence with tempered variation.

Theorem 3.1 ([7, Proposition 3]). Let $\Lambda$ be a repeller of a $C^1$ map, and let $\Phi = (\varphi_n)_n$ be an almost additive sequence of continuous functions on $\Lambda$ with tempered variation. Then
$$P(\Phi) = \lim_{n \to \infty} \frac{1}{n} \log \sum_{i_1 \cdots i_n} \exp \varphi_n(x_{i_1 \cdots i_n}) \tag{20}$$
for any points $x_{i_1 \cdots i_n} \in \Delta_{i_1 \cdots i_n}$, for each $(i_1 \cdots i_n) \in \Sigma_{A,n}$ and $n \in \mathbb{N}$.

The statement in Theorem 3.1 was first established by Barreira and Gelfert in [7], and was then extended by Barreira in [3] to other classes of transformations (see Secs. 5 and 6). We emphasize that identity (20) ensures not only that the nonadditive topological pressure of an almost additive sequence is a limit, but also that the limit is independent of the particular Markov partition used to define it. For a continuous function $\varphi\colon \Lambda \to \mathbb{R}$, we recall that the (classical) topological pressure of $\varphi$ (with respect to $f$ on $\Lambda$) is given by
$$P(\varphi) = \lim_{n \to \infty} \frac{1}{n} \log \sum_{i_1 \cdots i_n} \exp \max_{x \in \Delta_{i_1 \cdots i_n}} \sum_{k=0}^{n-1} \varphi(f^k(x)), \tag{21}$$
where $\Delta_{i_1 \cdots i_n}$ are the sets in (17) obtained from any given Markov partition. One can easily verify that the limit in (21) exists (by showing that the first sum defines a

submultiplicative sequence). Furthermore, the limit is independent of the particular Markov partition used to define it (see [36, 47] for details). We note that identity (20) includes identity (21) (which is often taken as the definition of topological pressure) as a particular case.

We have also the following alternative characterization of the topological pressure. It has the advantage of avoiding Markov partitions and the associated symbolic dynamics. Let $\operatorname{Fix}(f) = \{x \in \Lambda : f(x) = x\}$ be the set of fixed points of $f$ in $\Lambda$.

Theorem 3.2 ([3]). Let $\Lambda$ be a repeller of a $C^1$ map, and let $\Phi = (\varphi_n)_n$ be an almost additive sequence of continuous functions on $\Lambda$ with tempered variation. Then
$$P(\Phi) = \lim_{n \to \infty} \frac{1}{n} \log \sum_{x \in \operatorname{Fix}(f^n)} \exp \varphi_n(x). \tag{22}$$

4. Results for Repellers

We describe in this section several results of the almost additive thermodynamic formalism, again for definiteness in the particular case of functions defined in a repeller. In particular, we describe a variational principle for the topological pressure. We also introduce, for almost additive sequences, the notions of equilibrium measure and of Gibbs measure, and we consider the problem of existence and uniqueness of these measures.

4.1. Variational principle for the topological pressure

To formulate the variational principle for the topological pressure, we first recall the notion of Kolmogorov–Sinai entropy. Given a measurable transformation $f\colon \Lambda \to \Lambda$, we denote by $\mathcal{M}$ the family of $f$-invariant probability measures in $\Lambda$. We recall that a measure $\mu$ in $\Lambda$ is said to be $f$-invariant if $\mu(f^{-1}A) = \mu(A)$ for every measurable set $A \subset \Lambda$. Given a measure $\mu \in \mathcal{M}$ and a partition $\xi$ of $\Lambda$ into measurable subsets, we define
$$H_\mu(\xi) = -\sum_{C \in \xi} \mu(C) \log \mu(C),$$
with the convention that $0 \log 0 = 0$. The Kolmogorov–Sinai entropy of $f$ with respect to $\mu$ is given by
$$h_\mu(f) = \sup\{ h_\mu(f, \xi) : H_\mu(\xi) < \infty \},$$
where
$$h_\mu(f, \xi) = \inf_{n \in \mathbb{N}} \frac{1}{n} H_\mu(\xi_n),$$

for the partition $\xi_n$ of $\Lambda$ into the sets $\bigcap_{k=0}^{n-1} f^{-k} C_{k+1}$ with $C_1, \ldots, C_n \in \xi$. In the case of invariant measures in repellers, the entropy can be obtained as follows. Given a Markov partition of the repeller $\Lambda$, we consider the partition $\xi_n = \{\Delta_{i_1 \cdots i_n} : (i_1 \cdots i_n) \in \Sigma_{A,n}\}$ of $\Lambda$. Its entropy is given by
$$H_\mu(\xi_n) = -\sum_{i_1 \cdots i_n} \mu(\Delta_{i_1 \cdots i_n}) \log \mu(\Delta_{i_1 \cdots i_n}),$$
and
$$h_\mu(f) = \lim_{n \to \infty} \frac{1}{n} H_\mu(\xi_n) = \inf_{n \in \mathbb{N}} \frac{1}{n} H_\mu(\xi_n).$$
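The convergence of $H_\mu(\xi_n)/n$ to the entropy can be observed on the symbolic side. The following sketch is an illustration under stated assumptions, not a construction from the text: it takes a stationary Markov measure on the shift, given by a stochastic matrix `P` and a stationary vector `pi` that the caller must supply, and lets cylinders of length $n$ play the role of the sets $\Delta_{i_1 \cdots i_n}$. All function names are hypothetical.

```python
import itertools
import math

def cylinder_measure(pi, P, word):
    """Markov measure of the cylinder determined by the finite word."""
    mass = pi[word[0]]
    for a, b in zip(word, word[1:]):
        mass *= P[a][b]
    return mass

def partition_entropy(pi, P, n):
    """H_mu(xi_n) for the partition into cylinders of length n (with 0 log 0 = 0)."""
    total = 0.0
    for word in itertools.product(range(len(pi)), repeat=n):
        mass = cylinder_measure(pi, P, word)
        if mass > 0.0:
            total -= mass * math.log(mass)
    return total

def markov_entropy(pi, P):
    """Closed-form entropy of a stationary Markov measure."""
    return -sum(pi[i] * P[i][j] * math.log(P[i][j])
                for i in range(len(pi)) for j in range(len(pi)) if P[i][j] > 0.0)

if __name__ == "__main__":
    P = [[0.5, 0.5], [1.0, 0.0]]   # supported on the golden mean shift
    pi = [2.0 / 3.0, 1.0 / 3.0]    # stationary: pi P = pi
    for n in (1, 2, 4, 8):
        print(n, partition_entropy(pi, P, n) / n, markov_entropy(pi, P))
```

For a stationary Markov measure one has $H_\mu(\xi_n) = H_\mu(\xi_1) + (n-1) h$, where $h$ is the closed-form value, so $H_\mu(\xi_n)/n$ decreases to $h$ at rate $1/n$.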

The following is a variational principle for the topological pressure.

Theorem 4.1 ([3]). Let $\Lambda$ be a repeller of a $C^1$ map $f$, and let $\Phi = (\varphi_n)_n$ be an almost additive sequence of continuous functions on $\Lambda$ with tempered variation. Then
$$P(\Phi) = \max_{\mu \in \mathcal{M}} \left( h_\mu(f) + \int_\Lambda \lim_{n \to \infty} \frac{\varphi_n(x)}{n} \, d\mu(x) \right) = \max_{\mu \in \mathcal{M}} \left( h_\mu(f) + \lim_{n \to \infty} \frac{1}{n} \int_\Lambda \varphi_n \, d\mu \right), \tag{23}$$
including the existence in $L^1(\Lambda, \mu)$ of the first limit, and the existence of the second limit.

As in the classical theory, the inequality
$$P(\Phi) \ge \max_{\mu \in \mathcal{M}} \left( h_\mu(f) + \lim_{n \to \infty} \frac{1}{n} \int_\Lambda \varphi_n \, d\mu \right)$$
is easier to establish than the reverse inequality. The argument uses the subadditivity of the sequence $\psi_n = \varphi_n + C$ (see (18)), that is, the property
$$\psi_{n+m} \le \psi_n + \psi_m \circ f^n, \quad n, m \in \mathbb{N},$$
together with Kingman's subadditive ergodic theorem. The proof of the reverse inequality uses arguments analogous to those in the proof of [1, Theorem 1.7], which in turn were inspired by arguments of Bowen in [14]. The fact that the supremum can be replaced by a maximum in (23) follows from the upper semicontinuity of the map
$$\mathcal{M} \ni \mu \mapsto h_\mu(f) + \lim_{n \to \infty} \frac{1}{n} \int_\Lambda \varphi_n \, d\mu, \tag{24}$$
since $\mu \mapsto h_\mu(f)$ is upper semicontinuous in this setting, and since the limit in (24) is continuous in $\mu$.
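The subadditivity of $\psi_n = \varphi_n + C$ used in the argument for the easier inequality can be checked in one line from the right-hand inequality in (18):

```latex
\psi_{n+m} = \varphi_{n+m} + C
  \le \bigl( C + \varphi_n + \varphi_m \circ f^n \bigr) + C
  = (\varphi_n + C) + (\varphi_m + C) \circ f^n
  = \psi_n + \psi_m \circ f^n .
```

In the same way, the left-hand inequality in (18) shows that the sequence $-\varphi_n + C$ is subadditive.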

4.2. Equilibrium and Gibbs measures

We continue to consider a repeller $\Lambda$ of a $C^1$ map $f$. In an analogous manner to that in the classical additive theory, we say that a measure $\mu \in \mathcal{M}$ is an equilibrium measure for the almost additive sequence $\Phi$ (with respect to $f$ on $\Lambda$) if it attains any of the maxima in (23) (and thus both maxima), that is, if
$$P(\Phi) = h_\mu(f) + \lim_{n \to \infty} \frac{1}{n} \int_\Lambda \varphi_n \, d\mu.$$
The existence of equilibrium measures is thus an immediate consequence of Theorem 4.1.

Theorem 4.2 ([3]). Let $\Lambda$ be a repeller of a $C^1$ map. Then any almost additive sequence of continuous functions on $\Lambda$ with tempered variation has at least one equilibrium measure.

We also say that a probability measure $\mu$ in $\Lambda$ (which need not be $f$-invariant) is a Gibbs measure for the sequence $\Phi$ (with respect to $f$ on $\Lambda$, and to a given Markov partition of $\Lambda$) if there exists a constant $K > 0$ such that
$$K^{-1} \le \frac{\mu(\Delta_{i_1 \cdots i_n})}{\exp[-nP(\Phi) + \varphi_n(x)]} \le K$$
for every $n \in \mathbb{N}$, $(i_1 \cdots i_n) \in \Sigma_{A,n}$, and $x \in \Delta_{i_1 \cdots i_n}$. It turns out, as in the classical additive theory, that invariant Gibbs measures are always equilibrium measures. The argument is simple. We first note that if $\mu$ is an $f$-invariant Gibbs measure, then the limit
$$h_\mu(x) := \lim_{n \to \infty} -\frac{1}{n} \log \mu(\Delta_{i_1 \cdots i_n}) = P(\Phi) - \lim_{n \to \infty} \frac{\varphi_n(x)}{n} \tag{25}$$
exists for $\mu$-almost every $x \in \Lambda$ (by Theorem 4.1 the second limit in (25) exists in $L^1(\Lambda, \mu)$, and thus it also exists for $\mu$-almost every $x \in \Lambda$). By Shannon–McMillan–Breiman's theorem we obtain
$$h_\mu(f) = \int_\Lambda h_\mu(x) \, d\mu(x) = P(\Phi) - \int_\Lambda \lim_{n \to \infty} \frac{\varphi_n(x)}{n} \, d\mu(x),$$
and hence $\mu$ is an equilibrium measure.

To formulate the following result we need to consider the stronger notion of bounded variation. We say that the sequence of functions $\Phi = (\varphi_n)_n$ has bounded variation if $\sup_{n \in \mathbb{N}} \gamma_n(\Phi) < \infty$ (see (19) for the definition of $\gamma_n(\Phi)$). For example, one can easily verify that if $\Phi$ is the additive sequence $\varphi_n = \sum_{k=0}^{n-1} \varphi \circ f^k$ for some Hölder continuous $\varphi$ in a repeller, then $\Phi$ has bounded variation. Clearly, if $\Phi$ has bounded variation, then it has tempered variation. The following statement says in particular that for each almost additive sequence with bounded variation there exists a unique equilibrium measure.

Theorem 4.3 ([3]). Let $\Lambda$ be a repeller of a $C^1$ map, and let $\Phi$ be an almost additive sequence of continuous functions on $\Lambda$ with bounded variation. Then:

(1) there is a unique equilibrium measure for $\Phi$;
(2) there is a unique invariant Gibbs measure for $\Phi$;
(3) the two measures coincide and are mixing.

In particular, the unique equilibrium measure for an almost additive sequence with bounded variation is an invariant Gibbs measure. We refer to [34] for some results related to those in this section, although using a different notion of equilibrium measure.

4.3. Characterizations of unique equilibrium measures

The unique equilibrium measure in Theorem 4.3 can be characterized as follows. We denote by $\delta_x$ the probability measure with $\delta_x(\{x\}) = 1$.

Theorem 4.4 ([3]). Let $\Lambda$ be a repeller of a $C^1$ map, and let $\Phi = (\varphi_n)_n$ be an almost additive sequence of continuous functions on $\Lambda$ with bounded variation. Then the unique equilibrium measure for $\Phi$ is the weak limit of the sequence of invariant probability measures
$$\mu_n = \frac{\sum_{x \in \operatorname{Fix}(f^n)} e^{\varphi_n(x)} \delta_x}{\sum_{x \in \operatorname{Fix}(f^n)} e^{\varphi_n(x)}}. \tag{26}$$
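The sums over $\operatorname{Fix}(f^n)$ appearing in (22) and in the normalization of (26) can be computed explicitly in the simplest symbolic model. The sketch below is an illustration under assumptions that are not in the text: the dynamics is a topological Markov chain with 0-based symbols, and the sequence is additive, generated by a potential `g[i][j]` depending only on the first two symbols. In that case the periodic-point sum equals the trace of the $n$-th power of the weighted matrix $L_{ij} = a_{ij} e^{g(i,j)}$, a classical identity. The function names are hypothetical.

```python
import itertools
import math

def periodic_sum(A, g, n):
    """Sum of exp(phi_n(x)) over the points x fixed by the n-th power of the shift."""
    p = len(A)
    total = 0.0
    for w in itertools.product(range(p), repeat=n):
        # A periodic point is an admissible word that is also admissible cyclically.
        pairs = [(w[k], w[(k + 1) % n]) for k in range(n)]
        if all(A[i][j] == 1 for i, j in pairs):
            total += math.exp(sum(g[i][j] for i, j in pairs))
    return total

def trace_of_power(L, n):
    """Trace of the n-th power of the square matrix L."""
    p = len(L)
    M = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    for _ in range(n):
        M = [[sum(M[i][k] * L[k][j] for k in range(p)) for j in range(p)]
             for i in range(p)]
    return sum(M[i][i] for i in range(p))

if __name__ == "__main__":
    A = [[1, 1], [1, 0]]
    g = [[0.1, -0.3], [0.5, 0.0]]
    L = [[A[i][j] * math.exp(g[i][j]) for j in range(2)] for i in range(2)]
    for n in (2, 4, 8, 12):
        print(n, math.log(periodic_sum(A, g, n)) / n, math.log(trace_of_power(L, n)) / n)
```

As $n$ grows, both columns approach the logarithm of the spectral radius of $L$, which is the classical value of the pressure in this additive model.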

Now we present another characterization of the unique equilibrium measures. Given a sequence of continuous functions $\Phi = (\varphi_n)_n$ with bounded variation, we set
$$a_{i_1 \cdots i_n} = \max\{ \exp \varphi_n(y) : y \in \Delta_{i_1 \cdots i_n} \},$$
with the convention that $a_{i_1 \cdots i_n} = 0$ if $\Delta_{i_1 \cdots i_n} = \emptyset$. We also set
$$\alpha_n = \sum_{i_1 \cdots i_n} a_{i_1 \cdots i_n}.$$
We define a probability measure $\nu_n$ in the algebra generated by the sets $\Delta_{i_1 \cdots i_n}$ by
$$\nu_n(\Delta_{i_1 \cdots i_n}) = a_{i_1 \cdots i_n} / \alpha_n$$
for each $(i_1 \cdots i_n) \in \Sigma_{A,n}$, and we extend it arbitrarily to the Borel $\sigma$-algebra of $\Lambda$. Since $\Lambda$ is compact, the family of probability measures in $\Lambda$ is compact in the weak* topology, and hence, there exists a subsequence $(\nu_{n_k})_k$ converging to some probability measure $\nu$ in the weak* topology. A priori the accumulation point $\nu$ need not be unique. We denote the set of all accumulation points of the sequence $(\nu_n)_n$ by $\mathcal{M}(\Phi)$. As explained above, $\mathcal{M}(\Phi) \ne \emptyset$. The following statement shows that all accumulation points are Gibbs measures.

Theorem 4.5 ([3]). Let $\Lambda$ be a repeller of a $C^1$ map, and let $\Phi$ be an almost additive sequence of continuous functions on $\Lambda$ with bounded variation. Then each measure in $\mathcal{M}(\Phi)$ is an ergodic Gibbs measure for $\Phi$.

Moreover, the following is a characterization of the unique invariant Gibbs measure.

Theorem 4.6 ([3]). Let $\Lambda$ be a repeller of a $C^1$ map, and let $\Phi$ be an almost additive sequence of continuous functions on $\Lambda$ with bounded variation. Then the unique invariant Gibbs measure for $\Phi$ is the unique invariant measure in $\mathcal{M}(\Phi)$.

When $\Phi$ is an almost additive sequence of continuous functions in $\Lambda$ with tempered variation (but not necessarily with bounded variation), we can still show that there exist an ergodic probability measure $\nu$ in $\Lambda$, a constant $K > 0$, and a positive sequence $(\rho_n)_n$ decreasing to 0, such that
$$K^{-1} e^{-n\rho_n} \le \frac{\nu(\Delta_{i_1 \cdots i_n})}{\exp[-nP(\Phi) + \varphi_n(x)]} \le K e^{n\rho_n} \tag{27}$$
for every $n \in \mathbb{N}$, $(i_1 \cdots i_n) \in \Sigma_{A,n}$, and $x \in \Delta_{i_1 \cdots i_n}$. We emphasize that the measure $\nu$ need not be invariant. Furthermore, in general it may not be possible to obtain an invariant measure through an averaging procedure, due to the extra small exponentials in (27). On the other hand, it is still reasonable to call the measure $\nu$ in (27) a weak Gibbs measure for $\Phi$, as proposed by Yuri in [48].

5. Results for Hyperbolic Sets

We consider in this section the case of functions defined in a hyperbolic set, and we formulate corresponding results to those in Sec. 4 for functions defined in a repeller.

5.1. Hyperbolic sets and Markov partitions

Let $f\colon M \to M$ be a diffeomorphism of a smooth manifold $M$, and let $\Lambda \subset M$ be a compact $f$-invariant set. We say that $\Lambda$ is a hyperbolic set for $f$ if for every point $x \in \Lambda$ there exists a decomposition of the tangent space $T_x M = E^s(x) \oplus E^u(x)$ such that
$$d_x f E^s(x) = E^s(f(x)) \quad\text{and}\quad d_x f E^u(x) = E^u(f(x)),$$
and there exist constants $\lambda \in (0, 1)$ and $c > 0$ such that
$$\|d_x f^n | E^s(x)\| \le c\lambda^n \quad\text{and}\quad \|d_x f^{-n} | E^u(x)\| \le c\lambda^n$$
for every $x \in \Lambda$ and $n \in \mathbb{N}$. In addition, we always assume in this presentation that there is an open set $U \supset \Lambda$ such that
$$\Lambda = \bigcap_{n \in \mathbb{Z}} f^n U, \tag{28}$$
and that $f$ is topologically mixing on $\Lambda$. Given $\varepsilon > 0$ sufficiently small, for each $x \in \Lambda$ the local stable and unstable manifolds (of size $\varepsilon$) are given by
$$V^s(x) = \{ y \in M : d(f^n(y), f^n(x)) < \varepsilon \text{ for every } n \ge 0 \}$$

and
$$V^u(x) = \{ y \in M : d(f^n(y), f^n(x)) < \varepsilon \text{ for every } n \le 0 \},$$
where $d$ is the distance on $M$.

Now we briefly recall the notion of Markov partition for a hyperbolic set. A collection of closed sets $R_1, \ldots, R_p \subset \Lambda$ with sufficiently small diameter (given by (14)) is called a Markov partition of $\Lambda$ if:

(1) $\Lambda = \bigcup_{i=1}^{p} R_i$, and $\overline{\operatorname{int} R_i} = R_i$ for $i = 1, \ldots, p$;
(2) $V^s(x) \cap V^u(y) \subset R_i$ and $\operatorname{card}(V^s(x) \cap V^u(y)) = 1$ for $x, y \in R_i$;
(3) $\operatorname{int} R_i \cap \operatorname{int} R_j = \emptyset$ whenever $i \ne j$;
(4) if $x \in f(\operatorname{int} R_i) \cap \operatorname{int} R_j$, then $f^{-1}(V^u(f(x)) \cap R_j) \subset V^u(x) \cap R_i$ and $f(V^s(x) \cap R_i) \subset V^s(f(x)) \cap R_j$.

The interior of each set $R_i$ is computed with respect to the induced topology on $\Lambda$. Any hyperbolic set satisfying (28) has Markov partitions with arbitrarily small diameter (see, for example, [14]). Given a Markov partition $R_1, \ldots, R_p$ of a hyperbolic set $\Lambda$, we define as in the case of repellers a $p \times p$ matrix $A = (a_{ij})$ with entries given by (15), and we consider the corresponding two-sided topological Markov chain defined by the shift map on the set
$$\Sigma_A = \bigl\{ (\cdots i_0 i_1 \cdots) \in \{1, \ldots, p\}^{\mathbb{Z}} : a_{i_k i_{k+1}} = 1 \text{ for every } k \in \mathbb{Z} \bigr\}. \tag{29}$$
We continue to denote by $\Sigma_{A,n}$ the set of $n$-tuples $(i_1 \cdots i_n)$ for which there is a sequence $(\cdots j_0 j_1 j_2 \cdots) \in \Sigma_A$ such that $i_\ell = j_\ell$ for $\ell = 1, \ldots, n$. For each $(i_1 \cdots i_n) \in \Sigma_{A,n}$ we consider again the sets $\Delta_{i_1 \cdots i_n}$ defined by (17).

5.2. Formulation of the results

Repeating arguments in the proofs of Theorems 3.1 and 3.2 we obtain the following statement, thus providing formulas for the topological pressure of an almost additive sequence.

Theorem 5.1 ([3]). Let $\Lambda$ be a hyperbolic set of a $C^1$ map, and let $\Phi$ be an almost additive sequence of continuous functions on $\Lambda$ with tempered variation. Then identities (20) and (22) hold for any points $x_{i_1 \cdots i_n} \in \Delta_{i_1 \cdots i_n}$, for each $(i_1 \cdots i_n) \in \Sigma_{A,n}$ and $n \in \mathbb{N}$.

We also formulate corresponding versions of Theorems 4.1 and 4.3.

Theorem 5.2 ([3]). Let $\Lambda$ be a hyperbolic set of a $C^1$ map, and let $\Phi$ be an almost additive sequence of continuous functions on $\Lambda$ with tempered variation. Then (23)

holds, including the existence in $L^1(\Lambda, \mu)$ of the first limit, and the existence of the second limit. In particular, this shows that the sequence $\Phi$ has at least one equilibrium measure.

Theorem 5.3 ([3]). Let $\Lambda$ be a hyperbolic set of a $C^1$ map, and let $\Phi$ be an almost additive sequence of continuous functions on $\Lambda$ with bounded variation. Then:

(1) there is a unique equilibrium measure for $\Phi$;
(2) there is a unique invariant Gibbs measure for $\Phi$;
(3) the two measures are equal, are mixing, and coincide with the weak limit of the sequence of invariant probability measures $\mu_n$ in (26).

6. Further Generalizations

Some of the former results for repellers and hyperbolic sets can be generalized to more general classes of dynamics. We first present a variational principle for the topological pressure.

Theorem 6.1 ([3]). Let $f$ be a continuous map in a compact metric space $\Lambda$, and let $\Phi$ be an almost additive sequence of continuous functions in $\Lambda$ satisfying (6). Then
$$P(\Phi) = \sup_{\mu \in \mathcal{M}} \left( h_\mu(f) + \int_\Lambda \lim_{n \to \infty} \frac{\varphi_n(x)}{n} \, d\mu(x) \right) = \sup_{\mu \in \mathcal{M}} \left( h_\mu(f) + \lim_{n \to \infty} \frac{1}{n} \int_\Lambda \varphi_n \, d\mu \right),$$
including the existence in $L^1(\Lambda, \mu)$ of the first limit, and the existence of the second limit.

We also formulate a criterion for the existence of equilibrium measures.

Theorem 6.2 ([3]). Let $f$ be a continuous map in a compact metric space $\Lambda$ such that $\mathcal{M} \ni \mu \mapsto h_\mu(f)$ is upper semicontinuous, and let $\Phi$ be an almost additive sequence of continuous functions on $\Lambda$ satisfying (6). Then there exists an equilibrium measure for $\Phi$.

For example, if $f$ is an expansive continuous map in $\Lambda$, then the entropy is upper semicontinuous, and hence each almost additive sequence has an equilibrium measure. We recall that $f$ is said to be expansive if there exists $\delta > 0$ such that if
$$d(f^n(x), f^n(y)) < \delta \quad\text{for every } n \in \mathbb{N},$$
then $x = y$ (when $f$ is invertible we replace $\mathbb{N}$ by $\mathbb{Z}$). For example, when $f$ is a one-sided or two-sided topologically mixing topological Markov chain, the entropy is

upper semicontinuous. Incidentally, all these transformations satisfy specification. On the other hand, there are plenty of transformations not satisfying specification for which the entropy is still upper semicontinuous. For example, all $\beta$-shifts are expansive, and thus the entropy is upper semicontinuous (see [31] for details), but for $\beta$ in a residual set of full Lebesgue measure (although the complement has full Hausdorff dimension) the corresponding $\beta$-shift does not satisfy specification (see [43]).

Finally, we describe some regularity properties of the topological pressure. We denote by $A(\Lambda)$ the family of almost additive sequences of continuous functions satisfying (6). Let also $E(\Lambda) \subset A(\Lambda)$ be the family of sequences with a unique equilibrium measure.

Theorem 6.3 ([5]). Let $f$ be a continuous map in a compact metric space $\Lambda$ such that $\mathcal{M} \ni \mu \mapsto h_\mu(f)$ is upper semicontinuous. Then:

(1) given $\Phi \in A(\Lambda)$, the function $t \mapsto P(\Phi + t\Psi)$ is differentiable at $t = 0$ for every $\Psi \in A(\Lambda)$ if and only if $\Phi \in E(\Lambda)$; in this case the unique equilibrium measure $\mu$ of $\Phi$ is ergodic, and
$$\frac{d}{dt} P(\Phi + t\Psi) \Big|_{t=0} = \lim_{n \to \infty} \int_\Lambda \frac{\psi_n}{n} \, d\mu; \tag{30}$$
(2) for each open set $U \subset \mathbb{R}$, if $\Phi + t\Psi \in E(\Lambda)$ for every $t \in U$, then the function $t \mapsto P(\Phi + t\Psi)$ is of class $C^1$ in $U$.

The proof of Theorem 6.3 partially follows arguments in [31].

7. Application I: Nonconformal Repellers

We describe in this section a class of nonconformal repellers considered by Barreira and Gelfert in [7] to which one can apply the results in Sec. 4, in connection with the study of Lyapunov exponents of nonconformal transformations.

7.1. Cone condition and bounded distortion

To describe the class of repellers under consideration, we first introduce what we call a cone condition. Given a number $\gamma \le 1$ and a 1-dimensional subspace $E(x) \subset \mathbb{R}^2$, we consider the cone
$$C_\gamma(x) = \{ (u, v) \in E(x) \oplus E(x)^\perp : \|v\| \le \gamma \|u\| \}.$$
We say that a differentiable map $f\colon \mathbb{R}^2 \to \mathbb{R}^2$ satisfies a cone condition on a set $\Lambda \subset \mathbb{R}^2$ if there exist $\gamma \le 1$ and for each $x \in \Lambda$ a 1-dimensional subspace $E(x) \subset \mathbb{R}^2$ varying continuously with $x$ such that
$$(d_x f) C_\gamma(x) \subset \{0\} \cup \operatorname{int} C_\gamma(f x). \tag{31}$$
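Condition (31) can be probed numerically for a single linear map. The following sketch is only a sampled sanity check, not a proof, and its names are hypothetical: it fixes one subspace $E$ (given by its angle with the horizontal direction) for both the source and the target cone, samples unit vectors, and tests whether the image of every sampled vector of the cone lies strictly inside the cone. Vectors on the boundary of the cone may be classified either way because of rounding.

```python
import math

def cone_coords(w, angle):
    """Coordinates (u, v) of w along E and along its orthogonal complement."""
    e = (math.cos(angle), math.sin(angle))
    e_perp = (-math.sin(angle), math.cos(angle))
    return (w[0] * e[0] + w[1] * e[1], w[0] * e_perp[0] + w[1] * e_perp[1])

def in_cone(w, angle, gamma, strict=False):
    """Test |v| <= gamma |u| (or the strict inequality for the interior)."""
    u, v = cone_coords(w, angle)
    return abs(v) < gamma * abs(u) if strict else abs(v) <= gamma * abs(u)

def maps_cone_into_interior(M, angle, gamma, samples=720):
    """Sampled check that M C_gamma is contained in {0} union int C_gamma."""
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        w = (math.cos(theta), math.sin(theta))
        if in_cone(w, angle, gamma):
            image = (M[0][0] * w[0] + M[0][1] * w[1], M[1][0] * w[0] + M[1][1] * w[1])
            if image != (0.0, 0.0) and not in_cone(image, angle, gamma, strict=True):
                return False
    return True

if __name__ == "__main__":
    positive = [[2.0, 1.0], [1.0, 3.0]]     # all entries positive
    rotation = [[0.0, -1.0], [1.0, 0.0]]    # quarter turn, moves the cone off itself
    print(maps_cone_into_interior(positive, math.pi / 4, 1.0))
    print(maps_cone_into_interior(rotation, math.pi / 4, 1.0))
```

With $\gamma = 1$ and $E$ at angle $\pi/4$, the cone is the union of the closed first and third quadrants, which a matrix with positive entries maps into the open quadrants.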

Following [7], we present several examples of maps satisfying a cone condition.

Example 7.1. Assume that for each $x \in \Lambda$ the derivative $d_x f$ is represented by a positive $2 \times 2$ matrix. Then the first quadrant $Q$ is invariant under these linear transformations, that is, $(d_x f) Q \subset Q$ for each $x \in \Lambda$. Therefore, the map $f$ satisfies the cone condition in (31) with $\gamma = 1$, taking for $E(x)$ the 1-dimensional subspace making an angle of $\pi/4$ with the horizontal direction. This example is related to work in [26] (see also [22]).

Another class of examples corresponds to the existence of a strongly unstable foliation.

Example 7.2. Let $\Lambda$ be a locally maximal repeller in the sense that in some open neighborhood $U$ the repeller $\Lambda$ is the only invariant set. In this case $f^{-1}\Lambda \cap U = \Lambda$. Assume that there exists a strongly unstable foliation of the set $U$, that is, a foliation by 1-dimensional $C^2$ leaves $V(x)$ such that:

(1) $f(V(x)) \supset V(f x)$ for every $x \in U \cap f^{-1} U$;
(2) there exist constants $c > 0$ and $\lambda \in (0, 1)$ such that
$$\frac{|\det d_x f^n|}{\|d_x f^n | T_x V(x)\|^2} \le c\lambda^n \quad\text{for all } x \in \bigcap_{i=0}^{n} f^{-i} U \text{ and } n \in \mathbb{N}.$$

It is shown by Hu in [29] that this assumption is equivalent to:

(1) for some choice of subspaces $E(x)$ varying continuously with $x$, the cone condition in (31) holds for every $x \in U \cap f^{-1} U$;
(2) there exist 1-dimensional subspaces $F(x) \subset \{0\} \cup \operatorname{int} C_\gamma(x)$ for each $x \in U \cap f^{-1} U$ such that $d_x f F(x) = F(f x)$.

Thus, repellers with a strongly unstable foliation satisfy a cone condition.

Notice that the cone condition in (31) is weaker than assuming the existence of a strongly unstable foliation. In particular, (31) does not ensure the existence of an invariant distribution $F(x)$ as in Example 7.2. On the other hand, when there exists a strongly unstable foliation, the invariant distribution $F(x)$ is given by (see [29])
$$F(x) = \bigcap_{n \in \mathbb{N}} \bigcup_{y \in f^{-n} x} d_y f^n C_\gamma(y).$$
It is thus independent of the particular preimages $x_n \in f^{-n} x$, that is,
$$F(x) = \bigcap_{n \in \mathbb{N}} d_{x_n} f^n C_\gamma(x_n).$$
We can also consider repellers with a dominated splitting.

Example 7.3. We say that the repeller $\Lambda$ possesses a dominated splitting if there exists a decomposition $T_\Lambda \mathbb{R}^2 = E \oplus F$ such that:

(1) $d_x f E(x) = E(f x)$ and $d_x f F(x) = F(f x)$ for every $x \in \Lambda$;
(2) there exist constants $c > 0$ and $\lambda \in (0, 1)$ such that
$$\|d_x f^n | E\| \cdot \|(d_x f)^{-n} | F\| \le c\lambda^n \quad\text{for all } x \in \Lambda \text{ and } n \in \mathbb{N}.$$

It follows easily from the definition that the subspaces $E(x)$ and $F(x)$ vary continuously with $x$. Furthermore, one can verify that when there exists a dominated splitting of $\Lambda$, the map $f$ satisfies a cone condition on $\Lambda$. We note that the existence of a strongly unstable foliation does not ensure the existence of a dominated splitting, due to the requirement of a $df$-invariant decomposition $E \oplus F$ (more precisely, the existence of a strongly unstable foliation only ensures the existence of the invariant distribution $F$ in Example 7.2).

Now we consider certain almost additive sequences of functions obtained from the singular values of a $2 \times 2$ matrix $A$, namely
$$\sigma_1(A) = \|A\| \quad\text{and}\quad \sigma_2(A) = \|A^{-1}\|^{-1}$$
(with respect to the 2-norm in $\mathbb{R}^2$). Given a $C^1$ map $f\colon \mathbb{R}^2 \to \mathbb{R}^2$, we define sequences of functions $\Phi_i = (\varphi_{i,n})_n$ for $i = 1, 2$ by
$$\varphi_{i,n}(x) = \log \sigma_i(d_x f^n) \tag{32}$$
for each $n \in \mathbb{N}$ and $i = 1, 2$. Clearly, the functions $\varphi_{i,n}$ are continuous. These sequences are related to the Lyapunov exponents of the map $f$ (see Sec. 7.2). We first present a criterion for almost additivity.

Proposition 7.4 ([7]). Let $\Lambda$ be a repeller of a $C^1$ map $f\colon \mathbb{R}^2 \to \mathbb{R}^2$. If $f$ satisfies a cone condition on $\Lambda$, then $\Phi_i$ is almost additive for $i = 1, 2$.

For a map $f$ as in Proposition 7.4, we consider a number $\delta > 0$ such that for every $x \in \Lambda$ the map is invertible on the ball $B(x, \delta)$ (simply take a Lebesgue number of a cover by balls with the property that $f$ is invertible on each of them). For each $x \in \Lambda$ and $n \in \mathbb{N}$ we define
$$B_n(x, \delta) = \bigcap_{\ell=0}^{n-1} f^{-\ell} B(f^\ell x, \delta).$$
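The almost additivity asserted in Proposition 7.4 for the sequences in (32) can be observed numerically in the setting of Example 7.1. The sketch below rests on assumptions that are not in the text: the derivatives $d_{f^k x} f$ along an orbit are replaced by an arbitrary finite list of positive $2 \times 2$ matrices, all names are hypothetical, and the singular values come from the closed formula for $2 \times 2$ matrices. Because $\sigma_2$ is obtained from the determinant, the computation loses accuracy for long products, so only short products are used.

```python
import math
import random

def matmul2(a, b):
    """Product of two 2 x 2 matrices stored as lists of rows."""
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)] for i in range(2)]

def singular_values(a):
    """Return (sigma_1, sigma_2) = (||A||, ||A^{-1}||^{-1}) for an invertible 2 x 2 matrix."""
    s = sum(x * x for row in a for x in row)          # trace of A^T A
    d = abs(a[0][0] * a[1][1] - a[0][1] * a[1][0])    # |det A|
    sigma1 = math.sqrt((s + math.sqrt(max(s * s - 4.0 * d * d, 0.0))) / 2.0)
    return sigma1, d / sigma1

def product(seq):
    """Return A_n ... A_1 for seq = [A_1, ..., A_n], mimicking the chain rule."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for a in seq:
        result = matmul2(a, result)
    return result

def additivity_defect(seq, n, i):
    """phi_{i,n+m}(x) - phi_{i,n}(x) - phi_{i,m}(f^n x), with i = 0 or 1 (0-based)."""
    whole = singular_values(product(seq))[i]
    head = singular_values(product(seq[:n]))[i]
    tail = singular_values(product(seq[n:]))[i]
    return math.log(whole) - math.log(head) - math.log(tail)

if __name__ == "__main__":
    mats = [[[2.0, 1.0], [1.0, 1.5]], [[1.0, 2.0], [1.5, 1.0]], [[1.2, 1.0], [1.0, 2.0]]]
    rng = random.Random(0)
    seq = [rng.choice(mats) for _ in range(8)]
    for n in range(1, 8):
        print(n, additivity_defect(seq, n, 0), additivity_defect(seq, n, 1))
```

For these matrices, whose entries lie in $[1, 2]$, the defect for $\sigma_1$ stays in $[-\log 10, 0]$ by an elementary estimate on sums of entries, and the defect for $\sigma_2$ is its negative because the determinant is multiplicative.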

We always assume that the diameter of the Markov partition used to define the sets $\Delta_{i_1 \cdots i_n}$ in (17) is at most $\delta/2$ (we recall that any repeller has Markov partitions of arbitrarily small diameter). This ensures that $\Delta_{i_1 \cdots i_n} \subset B_n(x, \delta)$ for every $x = \chi(i_1 i_2 \cdots) \in \Lambda$ and $n \in \mathbb{N}$. We say that $f$ has bounded distortion on $\Lambda$ if there exists $\delta > 0$ such that
$$\sup\bigl\{ \|d_y f^n (d_z f^n)^{-1}\| : x \in \Lambda \text{ and } y, z \in B_n(x, \delta) \bigr\} < \infty.$$

Now we give a condition for bounded distortion in the case of $C^{1+\alpha}$ transformations. Given $\alpha > 0$, we say that $f$ is $\alpha$-bunched on $\Lambda$ if
$$\|(d_x f)^{-1}\|^{1+\alpha} \|d_x f\| < 1$$
for every $x \in \Lambda$ (this notion was introduced in [1] in the context of dimension theory of nonconformal transformations). The following statement is an immediate consequence of the proof of [2, Theorem 4].

Proposition 7.5. Let $\Lambda$ be a repeller of a $C^{1+\alpha}$ map $f\colon M \to M$. If $f$ is $\alpha$-bunched on $\Lambda$, then $f$ has bounded distortion on $\Lambda$.

Now we consider the sequences $\Phi_i$ for $i = 1, 2$ introduced in (32) and we present a criterion for bounded variation.

Proposition 7.6 ([7, Proposition 1]). Let $\Lambda$ be a repeller of a $C^1$ transformation $f\colon \mathbb{R}^2 \to \mathbb{R}^2$. If $f$ has bounded distortion on $\Lambda$, then $\Phi_i$ has bounded variation for $i = 1, 2$.

7.2. Variational principle and Gibbs measures

It follows from Propositions 7.4 and 7.6 that if a $C^1$ map $f\colon \mathbb{R}^2 \to \mathbb{R}^2$ satisfies a cone condition on $\Lambda$ and has bounded distortion on $\Lambda$, then $\Phi_i$ is an almost additive sequence with bounded variation for $i = 1, 2$. This allows us to apply the results in Sec. 4 to recover the corresponding statements of Barreira and Gelfert in [7].

To explain the relation between the sequences $\Phi_i$ and the theory of Lyapunov exponents, we first recall some basic notions. Given a differentiable transformation $f\colon M \to M$ (which is not necessarily invertible), for each $x \in M$ and $v \in T_x M$ we define the Lyapunov exponent of $(x, v)$ by
$$\chi(x, v) = \limsup_{n \to +\infty} \frac{1}{n} \log \|d_x f^n v\|, \tag{33}$$
with the convention that $\log 0 = -\infty$. It follows from the abstract theory of Lyapunov exponents (see [8] for full details) that for each $x \in M$ there exist a positive integer $s(x) \le \dim M$, numbers $\chi_1(x) < \cdots < \chi_{s(x)}(x)$, and linear subspaces
$$\{0\} = E_0(x) \subset E_1(x) \subset \cdots \subset E_{s(x)}(x) = T_x M$$
such that for $i = 1, \ldots, s(x)$ we have
$$E_i(x) = \{ v \in T_x M : \chi(x, v) \le \chi_i(x) \},$$
and $\chi(x, v) = \chi_i(x)$ whenever $v \in E_i(x) \setminus E_{i-1}(x)$. It follows from Oseledets' multiplicative ergodic theorem (see, for example, [8]), or more precisely from its version for noninvertible transformations, that for each finite $f$-invariant measure in $M$ there is a set $X \subset M$ of full measure such that if $x \in X$, then
$$\lim_{n \to +\infty} \frac{1}{n} \log \|d_x f^n v\| = \chi_i(x)$$


for every $v \in E_i(x) \setminus E_{i-1}(x)$ and $i = 1, \ldots, s(x)$, with uniform convergence in $v$ on each subspace $F \subset E_i(x)$ such that $F \cap E_{i-1}(x) = \{0\}$ (in particular, the lim sup in (33) is now a limit). For $M = \mathbb{R}^2$ and each $x \in \mathbb{R}^2$, when $s(x) = 1$ we set

\[ \lambda_1(x) = \chi_1(x) \quad \text{and} \quad \lambda_2(x) = \chi_1(x), \]

and when $s(x) = 2$ we set

\[ \lambda_1(x) = \chi_1(x) \quad \text{and} \quad \lambda_2(x) = \chi_2(x). \]

The numbers $\lambda_1(x)$ and $\lambda_2(x)$ are the values of the Lyapunov exponent $v \mapsto \chi(x, v)$ counted with multiplicities. It follows again from Oseledets' multiplicative ergodic theorem that for each finite $f$-invariant measure in $\mathbb{R}^2$ there is a set $X \subset \mathbb{R}^2$ of full measure such that

\[ \lim_{n \to +\infty} \frac{\varphi_{i,n}(x)}{n} = \lim_{n \to +\infty} \frac{1}{n} \log \sigma_i(d_x f^n) = \lambda_i(x) \]

for each $x \in X$ and $i = 1, 2$ (see (32)). Combining these observations with the criteria in Propositions 7.4 and 7.6, we readily obtain the following statement of Barreira and Gelfert by applying the results in Sec. 4.

Theorem 7.7 ([7]). Let $\Lambda$ be a repeller of a $C^1$ map $f : \mathbb{R}^2 \to \mathbb{R}^2$. If $f$ satisfies a cone condition on $\Lambda$, and $f$ has bounded distortion on $\Lambda$, then for $i = 1, 2$ the following properties hold:

(1) the topological pressure satisfies the variational principle

\[ P(\Phi_i) = \max_{\mu \in \mathcal{M}} \left\{ h_\mu(f) + \int_\Lambda \lambda_i(x)\, d\mu(x) \right\} = \max_{\mu \in \mathcal{M}} \left\{ h_\mu(f) + \lim_{n \to \infty} \frac{1}{n} \int_\Lambda \log \sigma_i(d_x f^n)\, d\mu(x) \right\}; \]

(2) there is a unique equilibrium measure $\mu_i$ for $\Phi_i$, and this is the unique invariant Gibbs measure for $\Phi_i$;

(3) there is a constant $K > 0$ such that

\[ K^{-1} \le \frac{\mu_i(\Delta_{i_1 \cdots i_n})}{\exp[-nP(\Phi_i)]\, \sigma_i(d_x f^n)} \le K \]

for every $n \in \mathbb{N}$, $(i_1 \cdots i_n) \in \Sigma_{A,n}$, and $x \in \Delta_{i_1 \cdots i_n}$;

(4) the measure $\mu_i$ is mixing, and

\[ \frac{\sum_{x \in \mathrm{Fix}(f^n)} \sigma_i(d_x f^n)\, \delta_x}{\sum_{x \in \mathrm{Fix}(f^n)} \sigma_i(d_x f^n)} \to \mu_i \quad \text{as } n \to \infty. \]
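The identity $\lim_n \frac{1}{n} \log \sigma_i(d_x f^n) = \lambda_i(x)$ used above can be checked numerically in the simplest situation. The following is a minimal sketch, not taken from [7]: it assumes a linear map $f(x) = Ax$ on $\mathbb{R}^2$, so that $d_x f^n = A^n$ for every $x$, and the helper names `matmul2` and `finite_time_exponents` are illustrative only.

```python
import math

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def finite_time_exponents(a, n):
    """Return ((1/n) log sigma_1(A^n), (1/n) log sigma_2(A^n)) for an invertible 2x2 matrix A.

    Assumption of this sketch: f is linear, so the derivative cocycle is just A^n.
    """
    power = ((1.0, 0.0), (0.0, 1.0))
    for _ in range(n):
        power = matmul2(a, power)
    (p, q), (r, s) = power
    frob = p * p + q * q + r * r + s * s              # sigma_1^2 + sigma_2^2
    # log|det A^n| is computed from A itself to avoid cancellation in A^n.
    log_det = n * math.log(abs(a[0][0] * a[1][1] - a[0][1] * a[1][0]))
    det_sq = math.exp(2.0 * log_det)                  # (sigma_1 * sigma_2)^2
    # sigma_1^2 is the larger root of t^2 - frob * t + det_sq = 0.
    s1_sq = (frob + math.sqrt(max(frob * frob - 4.0 * det_sq, 0.0))) / 2.0
    log_s1 = 0.5 * math.log(s1_sq)
    return log_s1 / n, (log_det - log_s1) / n

if __name__ == "__main__":
    # Expanding triangular example: the exponents are log 3 and log 2.
    print(finite_time_exponents(((3.0, 1.0), (0.0, 2.0)), 40))
```

For the expanding matrix in the usage line both finite-time values are positive and approach $\log 3$ and $\log 2$ at rate $O(1/n)$. For a symmetric matrix the singular values of $A^n$ are exactly the $n$-th powers of the absolute eigenvalues, so the finite-time values coincide with the limits up to rounding.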


8. Application II: Multifractal Analysis

We describe in this section a conditional variational principle for the $u$-dimension spectrum established by Barreira and Doutor in [5]. This contains as a particular case a conditional variational principle for the entropy spectrum (see Theorem 8.3 below). For simplicity of the exposition, we do not consider the multidimensional case in [5] but only the case of a single ratio of almost additive functions. We emphasize that this is already a nontrivial result when compared to the existing results in the classical case of additive sequences.

8.1. Notion of u-dimension

We recall in this section the notion of $u$-dimension introduced by Barreira and Schmeling in [10]. Let $f : X \to X$ be a continuous transformation of a compact metric space, and let $\mathcal{U}$ be a finite open cover of $X$. Let also $u : X \to \mathbb{R}^+$ be a continuous function. Given a set $Z \subset X$ and a number $\alpha \in \mathbb{R}$, we define the function

\[ N(Z, \alpha, u, \mathcal{U}) = \lim_{n \to \infty} \inf_\Gamma \sum_{U \in \Gamma} \exp(-\alpha u(U)), \]

where $u(U)$ is defined as in (8), and where the infimum is taken over all finite or countable collections $\Gamma \subset \bigcup_{k \ge n} W_k(\mathcal{U})$ such that $\bigcup_{U \in \Gamma} X(U) \supset Z$. Setting

\[ \dim_{u,\mathcal{U}} Z = \inf\{\alpha \in \mathbb{R} : N(Z, \alpha, u, \mathcal{U}) = 0\}, \]

one can show that the limit

\[ \dim_u Z = \lim_{\operatorname{diam} \mathcal{U} \to 0} \dim_{u,\mathcal{U}} Z \]

exists. The number $\dim_u Z$ is called the $u$-dimension of the set $Z$ (with respect to $f$). For example, if $u = 1$, then $\dim_u Z$ is equal to the topological entropy $h(f \mid Z)$ of $f$ on $Z$ (see Sec. 2). The following result is an easy consequence of the definitions.

Proposition 8.1. The number $\dim_u Z = \alpha$ is the unique root $\alpha$ of the equation $P_Z(-\alpha U) = 0$, where $U = (u_n)_n$ with $u_n = \sum_{k=0}^{n-1} u \circ f^k$ for each $n \in \mathbb{N}$.

Furthermore, given a probability measure $\mu$ in $X$, we set

\[ \dim_{u,\mathcal{U}} \mu = \inf\{\dim_{u,\mathcal{U}} Z : \mu(Z) = 1\}. \]

One can show that the limit

\[ \dim_u \mu = \lim_{\operatorname{diam} \mathcal{U} \to 0} \dim_{u,\mathcal{U}} \mu \]

exists, and we call it the $u$-dimension of $\mu$. Moreover, the lower and upper $u$-pointwise dimensions of $\mu$ at the point $x \in X$ are defined by

\[ \underline{d}_{\mu,u}(x) = \lim_{\operatorname{diam} \mathcal{U} \to 0} \liminf_{n \to \infty} \inf_U \left( -\frac{\log \mu(X(U))}{u(U)} \right) \]


and

\[ \overline{d}_{\mu,u}(x) = \lim_{\operatorname{diam} \mathcal{U} \to 0} \limsup_{n \to \infty} \sup_U \left( -\frac{\log \mu(X(U))}{u(U)} \right), \]

where the infimum and supremum are taken over all vectors $U \in W_n(\mathcal{U})$ such that $x \in X(U)$. If $\mu \in \mathcal{M}$ is ergodic, then

\[ \dim_u \mu = \underline{d}_{\mu,u}(x) = \overline{d}_{\mu,u}(x) = \frac{h_\mu(f)}{\int_X u\, d\mu} \]

for $\mu$-almost every $x \in X$ (see [10]).

8.2. Conditional variational principle

We formulate in this section a conditional variational principle for the $u$-dimension of sets defined in terms of ratios of almost additive sequences. This corresponds to a multifractal analysis of the level sets of limits of ratios of almost additive sequences. We continue to consider a continuous map $f : X \to X$ of a compact metric space. Let $\Phi = (\varphi_n)_n$ and $\Psi = (\psi_n)_n$ be almost additive sequences of functions in $X$. We assume that

\[ \liminf_{m \to \infty} \frac{\psi_m(x)}{m} > 0 \quad \text{and} \quad \psi_n(x) > 0 \]

for every $x \in X$ and $n \in \mathbb{N}$. Given $\alpha \in \mathbb{R}$ we define

\[ K_\alpha = \left\{ x \in X : \lim_{n \to \infty} \frac{\varphi_n(x)}{\psi_n(x)} = \alpha \right\}. \tag{34} \]

The function $F_u : \mathbb{R} \to \mathbb{R}$ defined by $F_u(\alpha) = \dim_u K_\alpha$ is called the $u$-dimension spectrum of the pair $(\Phi, \Psi)$ (with respect to $f$). We also consider the function $\mathcal{P} : \mathcal{M} \to \mathbb{R}$ defined by

\[ \mathcal{P}(\mu) = \lim_{n \to \infty} \frac{\int_X \varphi_n\, d\mu}{\int_X \psi_n\, d\mu}. \]

The following is a conditional variational principle for the spectrum $F_u$. We consider the (additive) sequence of functions $U = (u_n)_n$ with $u_n = \sum_{k=0}^{n-1} u \circ f^k$ for each $n \in \mathbb{N}$. We recall that $E(X)$ denotes the family of almost additive sequences satisfying (6) with a unique equilibrium measure.

Theorem 8.2 ([5]). Let $f$ be a continuous map of a compact metric space $X$ such that $\mu \mapsto h_\mu(f)$ is upper semi-continuous, and assume that $\operatorname{span}\{\Phi, \Psi, U\} \subset E(X)$.


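If $\alpha \notin \mathcal{P}(\mathcal{M})$, then $K_\alpha = \emptyset$. Otherwise, if $\alpha \in \operatorname{int} \mathcal{P}(\mathcal{M})$, then $K_\alpha \ne \emptyset$, and the following properties hold:

(1) $F_u$ satisfies the variational principle

\[ F_u(\alpha) = \max\left\{ \frac{h_\mu(f)}{\int_X u\, d\mu} : \mu \in \mathcal{M} \text{ and } \mathcal{P}(\mu) = \alpha \right\}; \]

(2) we have $F_u(\alpha) = \min\{T_u(\alpha, q) : q \in \mathbb{R}\}$, where $T_u(\alpha, q)$ is the unique real number satisfying

\[ P(q(\Phi - \alpha\Psi) - T_u(\alpha, q) U) = 0; \tag{35} \]

(3) there is an ergodic measure $\mu_\alpha \in \mathcal{M}$ such that $\mathcal{P}(\mu_\alpha) = \alpha$, $\mu_\alpha(K_\alpha) = 1$, and

\[ \dim_u \mu_\alpha = \frac{h_{\mu_\alpha}(f)}{\int_X u\, d\mu_\alpha} = F_u(\alpha). \]

In addition, the spectrum $F_u$ is continuous in $\operatorname{int} \mathcal{P}(\mathcal{M})$.

The proof of Theorem 8.2 builds on earlier work of Barreira et al. in [9]. We note that the number $T_u(\alpha, q)$ is defined implicitly by (35). By Theorem 6.3, the function $(p, \alpha, q) \mapsto P(q(\Phi - \alpha\Psi) - pU)$ is of class $C^1$. By the implicit function theorem, we conclude that $(\alpha, q) \mapsto T_u(\alpha, q)$ is also of class $C^1$ in $\mathbb{R}^2$, since by (30),

\[ \frac{\partial}{\partial p} P(q(\Phi - \alpha\Psi) - pU) \Big|_{(p,q) = (T_u(\alpha,q),\, q)} = -\int_X u\, d\mu_q < 0, \]

where $\mu_q$ is the unique equilibrium measure of $q(\Phi - \alpha\Psi) - T_u(\alpha, q) U$.

Now we formulate explicitly a particular case of Theorem 8.2. Let $\Phi = (\varphi_n)_n$ be an almost additive sequence of functions $\varphi_n : X \to \mathbb{R}$. Given $\alpha \in \mathbb{R}$, we consider the level set

\[ K_\alpha = \left\{ x \in X : \lim_{n \to \infty} \frac{\varphi_n(x)}{n} = \alpha \right\}. \]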

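The entropy spectrum $E : \mathbb{R} \to \mathbb{R}$ (of the sequence $\Phi$) is defined by $E(\alpha) = h(f \mid K_\alpha)$, where $h(f \mid K_\alpha)$ denotes the topological entropy of $f$ on $K_\alpha$ (see Secs. 2 and 8.1). We also consider the function $\mathcal{P} : \mathcal{M} \to \mathbb{R}$ defined by

\[ \mathcal{P}(\mu) = \lim_{n \to \infty} \frac{1}{n} \int_X \varphi_n\, d\mu. \]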


The following statement is a conditional variational principle for the entropy spectrum $E$. It is an immediate consequence of Theorem 8.2.

Theorem 8.3. Let $f$ be a continuous map of a compact metric space $X$ such that $\mu \mapsto h_\mu(f)$ is upper semi-continuous, and assume that the almost additive sequence $\Phi$ has a unique equilibrium measure. If $\alpha \notin \mathcal{P}(\mathcal{M})$, then $K_\alpha = \emptyset$. Otherwise, if $\alpha \in \operatorname{int} \mathcal{P}(\mathcal{M})$, then $K_\alpha \ne \emptyset$, and the following properties hold:

(1) $E$ satisfies the variational principle

\[ E(\alpha) = \max\{h_\mu(f) : \mu \in \mathcal{M} \text{ and } \mathcal{P}(\mu) = \alpha\}; \]

(2) $E(\alpha) = \min\{P(q\Phi) - q\alpha : q \in \mathbb{R}\}$;

(3) there is an ergodic measure $\mu_\alpha \in \mathcal{M}$ such that $\mathcal{P}(\mu_\alpha) = \alpha$, $\mu_\alpha(K_\alpha) = 1$, and $h_{\mu_\alpha}(f) = E(\alpha)$.

In addition, the spectrum $E$ is continuous in $\operatorname{int} \mathcal{P}(\mathcal{M})$.

Now we consider the associated irregular sets, on which the limits in (34) do not exist. We consider only the particular case of topological Markov chains. Namely, let $\Phi$ and $\Psi$ be almost additive sequences in $\Sigma_A$, either as in (16) or as in (29). The irregular set of the pair $(\Phi, \Psi)$ is defined by

\[ I = \left\{ x \in \Sigma_A : \liminf_{n \to \infty} \frac{\varphi_n(x)}{\psi_n(x)} < \limsup_{n \to \infty} \frac{\varphi_n(x)}{\psi_n(x)} \right\}, \]

and we denote by $m_u$ the equilibrium measure of $u$, when it is unique.

Theorem 8.4 ([5]). Let $\sigma \mid \Sigma_A$ be a topologically mixing topological Markov chain. If $\operatorname{span}\{\Phi, \Psi, U\} \subset E(\Sigma_A)$, and $\mathcal{P}(m_u) \in \operatorname{int} \mathcal{P}(\mathcal{M}_\sigma)$, then $\dim_u I = \dim_u \Sigma_A$.

Theorem 8.4 follows from the application of results in [10] combined with Theorem 8.2.

9. Application III: Dimension Spectra

Our last application of the almost additive thermodynamic formalism considers dimension spectra of level sets associated to the limits of ratios of almost additive sequences. Moreover, we take into account simultaneously limits of ratios of sequences into the future and into the past.
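Before turning to dimension spectra, the Legendre-type formula $E(\alpha) = \min\{P(q\Phi) - q\alpha : q \in \mathbb{R}\}$ of Theorem 8.3 can be illustrated in the classical additive case. The sketch below rests on assumptions that are not part of the text: $f$ is the full shift on two symbols, $\varphi_n$ is the Birkhoff sum of a function $\varphi(x) = a_{x_0}$ depending only on the first symbol, and the pressure is given by the standard formula $P(q\Phi) = \log \sum_i e^{q a_i}$. The function names are illustrative.

```python
import math

def pressure(q, a):
    """P(q*Phi) = log sum_i exp(q * a_i) for phi(x) = a[x_0] on the full shift (assumed setting)."""
    m = max(q * ai for ai in a)                      # shift for numerical stability
    return m + math.log(sum(math.exp(q * ai - m) for ai in a))

def entropy_spectrum(alpha, a, lo=-60.0, hi=60.0, iters=200):
    """Approximate min over q of P(q*Phi) - q*alpha by ternary search.

    The objective is convex in q, so ternary search on a bracket applies;
    the bracket [lo, hi] is an ad hoc choice of this sketch.
    """
    def g(q):
        return pressure(q, a) - q * alpha
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) < g(m2):
            hi = m2
        else:
            lo = m1
    return g((lo + hi) / 2.0)

if __name__ == "__main__":
    # With a = (0, 1) the level set K_alpha fixes the frequency of the symbol 1.
    print(entropy_spectrum(0.3, (0.0, 1.0)))
```

In this special case the minimum is attained at $q = \log(\alpha/(1-\alpha))$ and equals the Bernoulli entropy $-\alpha\log\alpha - (1-\alpha)\log(1-\alpha)$, which is also the maximum in item (1) of Theorem 8.3, attained by the Bernoulli measure with weights $(1-\alpha, \alpha)$.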


Let $f : M \to M$ be a $C^{1+\varepsilon}$ surface diffeomorphism with a hyperbolic set $\Lambda$ satisfying the same hypotheses as in Sec. 5.1. We always assume that $\dim E^s(x) = \dim E^u(x) = 1$ for every $x \in \Lambda$. Let $t_s$ and $t_u$ be the unique real numbers such that

\[ P(t_s \log \|df \mid E^s\|) = P(t_u \log \|df^{-1} \mid E^u\|) = 0, \]

where $P$ denotes the (classical) topological pressure with respect to $f$ on $\Lambda$. It was shown by McCluskey and Manning in [33] that

\[ \dim_H(\Lambda \cap V^s(x)) = t_s \quad \text{and} \quad \dim_H(\Lambda \cap V^u(x)) = t_u \]

for every $x \in \Lambda$, where $\dim_H$ denotes the Hausdorff dimension. Moreover, it was shown by Palis and Viana in [35] that

\[ \dim_H(\Lambda \cap V^s(x)) = \dim_B(\Lambda \cap V^s(x)), \qquad \dim_H(\Lambda \cap V^u(x)) = \dim_B(\Lambda \cap V^u(x)) \]

for every $x \in \Lambda$, where $\dim_B$ denotes the upper box dimension. Since the stable and unstable distributions have codimension 1, it follows from results of Hasselblatt in [28] that the maps $x \mapsto E^s(x)$ and $x \mapsto E^u(x)$ are Lipschitz. This implies that

\[ \dim_H \Lambda = \dim_H[(\Lambda \cap V^s(x)) \times (\Lambda \cap V^u(x))] = \dim_H(\Lambda \cap V^s(x)) + \dim_H(\Lambda \cap V^u(x)) = t_s + t_u. \tag{36} \]

Indeed, if $\dim_H A = \dim_B A$, then for any set $B$ we have $\dim_H(A \times B) = \dim_H A + \dim_H B$.

Now we proceed with the description of the dimension spectra. We denote by $L^+$ (respectively, $L^-$) the family of almost additive sequences of continuous functions with respect to $f$ (respectively, $f^{-1}$) that have bounded variation with respect to $f$ (respectively, $f^{-1}$). We only consider almost additive sequences

\[ \Phi^+ = (\varphi_n^+)_n, \quad \Phi^- = (\varphi_n^-)_n, \quad \Psi^+ = (\psi_n^+)_n, \quad \text{and} \quad \Psi^- = (\psi_n^-)_n \]

such that

\[ \liminf_{m \to \infty} \frac{\psi_m^\pm(x)}{m} > 0 \quad \text{and} \quad \psi_n^\pm(x) > 0 \]

for every $n \in \mathbb{N}$ and $x \in \Lambda$. Given $(\Phi^+, \Psi^+) \in L^+ \times L^+$ and $\alpha \in \mathbb{R}$ we define

\[ K_\alpha^+ = \left\{ x \in \Lambda : \lim_{n \to \infty} \frac{\varphi_n^+(x)}{\psi_n^+(x)} = \alpha \right\}, \]

and given $(\Phi^-, \Psi^-) \in L^- \times L^-$ and $\alpha \in \mathbb{R}$ we define

\[ K_\alpha^- = \left\{ x \in \Lambda : \lim_{n \to \infty} \frac{\varphi_n^-(x)}{\psi_n^-(x)} = \alpha \right\}. \]


We also consider the dimension spectrum $D : \mathbb{R}^2 \to \mathbb{R}$ defined by

\[ D(\alpha, \beta) = \dim_H(K_\alpha^+ \cap K_\beta^-). \]

The following is a conditional variational principle for the spectrum $D$.

Theorem 9.1 ([6]). If $\alpha \in \operatorname{int} \mathcal{P}^+(\mathcal{M})$ and $\beta \in \operatorname{int} \mathcal{P}^-(\mathcal{M})$, then

\[ \begin{aligned} D(\alpha, \beta) &= \dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H \Lambda \\ &= \max\left\{ \frac{h_\mu(f)}{-\int_\Lambda \log \|df \mid E^s\|\, d\mu} : \mu \in \mathcal{M} \text{ and } \mathcal{P}^+(\mu) = \alpha \right\} \\ &\quad + \max\left\{ \frac{h_\mu(f)}{\int_\Lambda \log \|df \mid E^u\|\, d\mu} : \mu \in \mathcal{M} \text{ and } \mathcal{P}^-(\mu) = \beta \right\}. \end{aligned} \tag{37} \]

Moreover, the spectrum $D$ is analytic in $\operatorname{int} \mathcal{P}^+(\mathcal{M}) \times \operatorname{int} \mathcal{P}^-(\mathcal{M})$.

The proof of Theorem 9.1 follows to some extent arguments of Barreira and Valls ([11]) in the additive case. In particular, it involves constructing a measure $\nu = \nu_{\alpha\beta}$ sitting on the set $K_\alpha^+ \cap K_\beta^-$, that is, such that $\nu(K_\alpha^+ \cap K_\beta^-) = 1$, having the "right" pointwise dimension. This means that

\[ \liminf_{r \to 0} \frac{\log \nu(B(x, r))}{\log r} \ge \dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H \Lambda \]

for $\nu$-almost every $x \in \Lambda$, and

\[ \limsup_{r \to 0} \frac{\log \nu(B(x, r))}{\log r} \le \dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H \Lambda \]

for every $x \in K_\alpha^+ \cap K_\beta^-$. These properties, together with general results in dimension theory (see, for example, [4]), readily yield the first identity in (37). The second identity follows from Theorem 8.2. The measure $\nu$, although never invariant, is constructed essentially as a product of (invariant) equilibrium measures along the stable and unstable directions, for which the results in Sec. 4 are essential. More precisely, set

\[ U = q^+(\Phi - \alpha\Psi) - (\dim_H K_\alpha^+ - t_s) D_u \]


and

\[ S = q^-(\Phi - \beta\Psi) - (\dim_H K_\beta^- - t_u) D_s, \]

where $D_u$ and $D_s$ are the additive sequences

\[ \sum_{k=0}^{n-1} \log \|df \mid E^u\| \circ f^k \quad \text{and} \quad \sum_{k=0}^{n-1} \log \|df^{-1} \mid E^s\| \circ f^k, \]

and where $q^+, q^- \in \mathbb{R}$ are such that

\[ P(U) = P(S) = 0. \]

By the almost additive thermodynamic formalism there exist unique equilibrium measures $\nu^u$ and $\nu^s$ respectively of $U$ and $S$. Roughly speaking, the measure $\nu_{\alpha\beta}$ is given by the product $\nu^u \times \nu^s$ at the level of symbolic dynamics. It is also shown in [6] that

\[ \dim_H K_\alpha^+ = \dim_H(K_\alpha^+ \cap V^u(x)) + t_s \quad \text{and} \quad \dim_H K_\beta^- = \dim_H(K_\beta^- \cap V^s(y)) + t_u \]

for every $x \in K_\alpha^+$ and $y \in K_\beta^-$. Together with (36) and (37), this shows that

\[ D(\alpha, \beta) = \dim_H(K_\alpha^+ \cap V^u(x)) + \dim_H(K_\beta^- \cap V^s(y)) \]

for every $x \in K_\alpha^+$ and $y \in K_\beta^-$.

Note Added in Proof. In the meantime, I became aware of the interesting paper [24] by Feng and Huang. Their work considers the more general case of asymptotically subadditive sequences and is a quite substantial advance towards a general theory.

Acknowledgment

The author was partially supported by FCT through CAMGSD, Lisbon.

References

[1] L. Barreira, A non-additive thermodynamic formalism and applications to dimension theory of hyperbolic dynamical systems, Ergodic Theory Dynam. Systems 16 (1996) 871–927.
[2] L. Barreira, Dimension estimates in nonconformal hyperbolic dynamics, Nonlinearity 16 (2003) 1657–1672.
[3] L. Barreira, Nonadditive thermodynamic formalism: Equilibrium and Gibbs measures, Discrete Contin. Dyn. Syst. 16 (2006) 279–305.
[4] L. Barreira, Dimension and Recurrence in Hyperbolic Dynamics, Progress in Mathematics, Vol. 272 (Birkhäuser, 2008).
[5] L. Barreira and P. Doutor, Almost additive multifractal analysis, J. Math. Pures Appl. 92 (2009) 1–17.


[6] L. Barreira and P. Doutor, Dimension spectra of almost additive sequences, Nonlinearity 22 (2009) 2761–2773.
[7] L. Barreira and K. Gelfert, Multifractal analysis for Lyapunov exponents on nonconformal repellers, Comm. Math. Phys. 267 (2006) 393–418.
[8] L. Barreira and Ya. Pesin, Lyapunov Exponents and Smooth Ergodic Theory, Univ. Lect. Ser., Vol. 23 (Amer. Math. Soc., 2002).
[9] L. Barreira, B. Saussol and J. Schmeling, Higher-dimensional multifractal analysis, J. Math. Pures Appl. 81 (2002) 67–91.
[10] L. Barreira and J. Schmeling, Sets of "non-typical" points have full topological entropy and full Hausdorff dimension, Israel J. Math. 116 (2000) 29–70.
[11] L. Barreira and C. Valls, Multifractal structure of two-dimensional horseshoes, Comm. Math. Phys. 266 (2006) 455–470.
[12] H. Bothe, The Hausdorff dimension of certain solenoids, Ergodic Theory Dynam. Systems 15 (1995) 449–474.
[13] R. Bowen, Topological entropy for noncompact sets, Trans. Amer. Math. Soc. 184 (1973) 125–136.
[14] R. Bowen, Equilibrium States and the Ergodic Theory of Anosov Diffeomorphisms, Lect. Notes in Math., Vol. 470 (Springer, 1975).
[15] R. Bowen, Hausdorff dimension of quasi-circles, Inst. Hautes Études Sci. Publ. Math. 50 (1979) 259–273.
[16] Y.-L. Cao, D.-J. Feng and W. Huang, The thermodynamic formalism for sub-additive potentials, Discrete Contin. Dyn. Syst. 20 (2008) 639–657.
[17] P. Collet, J. Lebowitz and A. Porzio, The dimension spectrum of some dynamical systems, J. Stat. Phys. 47 (1987) 609–644.
[18] K. Falconer, The Hausdorff dimension of self-affine fractals, Math. Proc. Cambridge Philos. Soc. 103 (1988) 339–350.
[19] K. Falconer, A subadditive thermodynamic formalism for mixing repellers, J. Phys. A 21 (1988) 1737–1742.
[20] K. Falconer, Bounded distortion and dimension for non-conformal repellers, Math. Proc. Cambridge Philos. Soc. 115 (1994) 315–334.
[21] D.-J. Feng, Lyapunov exponents for products of matrices and multifractal analysis. I. Positive matrices, Israel J. Math. 138 (2003) 353–376.
[22] D.-J. Feng, The variational principle for products of non-negative matrices, Nonlinearity 17 (2004) 447–457.
[23] D.-J. Feng, Lyapunov exponents for products of matrices and multifractal analysis. II. General matrices, Israel J. Math. 170 (2009) 355–394.
[24] D.-J. Feng and W. Huang, Lyapunov spectrum of asymptotically sub-additive potentials, Comm. Math. Phys. 297 (2010) 1–43.
[25] D.-J. Feng and A. Käenmäki, Equilibrium states for the pressure function for products of matrices, preprint (2009).
[26] D.-J. Feng and K. Lau, The pressure function for products of non-negative matrices, Math. Res. Lett. 9 (2002) 363–378.
[27] T. Halsey, M. Jensen, L. Kadanoff, I. Procaccia and B. Shraiman, Fractal measures and their singularities: The characterization of strange sets, Phys. Rev. A 34 (1986) 1141–1151; Errata, ibid. 34 (1986) 1601.
[28] B. Hasselblatt, Regularity of the Anosov splitting and of horospheric foliations, Ergodic Theory Dynam. Systems 14 (1994) 645–666.
[29] H. Hu, Box dimensions and topological pressure for some expanding maps, Comm. Math. Phys. 191 (1998) 397–407.


[30] A. Käenmäki, On natural invariant measures on generalised iterated function systems, Ann. Acad. Sci. Fenn. Math. 29 (2004) 419–458.
[31] G. Keller, Equilibrium States in Ergodic Theory, London Mathematical Society Student Texts, Vol. 42 (Cambridge University Press, 1998).
[32] A. Lopes, The dimension spectrum of the maximal measure, SIAM J. Math. Anal. 20 (1989) 1243–1254.
[33] H. McCluskey and A. Manning, Hausdorff dimension of horseshoes, Ergodic Theory Dynam. Systems 3 (1983) 251–260.
[34] A. Mummert, The thermodynamic formalism for almost-additive sequences, Discrete Contin. Dyn. Syst. 16 (2006) 435–454.
[35] J. Palis and M. Viana, On the continuity of the Hausdorff dimension and limit capacity for horseshoes, in Dynamical Systems (Valparaiso, 1986), eds. R. Bamón, R. Labarca and J. Palis, Lect. Notes in Math., Vol. 1331 (Springer, 1988), pp. 150–160.
[36] Ya. Pesin, Dimension Theory in Dynamical Systems: Contemporary Views and Applications, Chicago Lectures in Mathematics (Chicago University Press, 1997).
[37] Ya. Pesin and B. Pitskel', Topological pressure and the variational principle for noncompact sets, Funct. Anal. Appl. 18 (1984) 307–318.
[38] D. Rand, The singularity spectrum $f(\alpha)$ for cookie-cutters, Ergodic Theory Dynam. Systems 9 (1989) 527–541.
[39] D. Ruelle, Statistical mechanics on a compact set with $\mathbb{Z}^\nu$ action satisfying expansiveness and specification, Trans. Amer. Math. Soc. 185 (1973) 237–251.
[40] D. Ruelle, Thermodynamic Formalism, Encyclopedia of Mathematics and Its Applications, Vol. 5 (Addison-Wesley, 1978).
[41] D. Ruelle, Repellers for real analytic maps, Ergodic Theory Dynam. Systems 2 (1982) 99–107.
[42] H. Rugh, On the dimensions of conformal repellers. Randomness and parameter dependency, Ann. of Math. (2) 168 (2008) 695–748.
[43] J. Schmeling, Symbolic dynamics for β-shifts and self-normal numbers, Ergodic Theory Dynam. Systems 17 (1997) 675–694.
[44] K. Simon, The Hausdorff dimension of the Smale–Williams solenoid with different contraction coefficients, Proc. Amer. Math. Soc. 125 (1997) 1221–1228.
[45] K. Simon and B. Solomyak, Hausdorff dimension for horseshoes in $\mathbb{R}^3$, Ergodic Theory Dynam. Systems 19 (1999) 1343–1363.
[46] P. Walters, A variational principle for the pressure of continuous transformations, Amer. J. Math. 97 (1976) 937–971.
[47] P. Walters, An Introduction to Ergodic Theory, Graduate Texts in Mathematics, Vol. 79 (Springer, 1982).
[48] M. Yuri, Zeta functions for certain non-hyperbolic systems and topological Markov approximations, Ergodic Theory Dynam. Systems 18 (1998) 1589–1612.

November 16, J070-S0129055X10004181

2010 15:28 WSPC/S0129-055X

148-RMP

Reviews in Mathematical Physics Vol. 22, No. 10 (2010) 1181–1208
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004181

PAULI–FIERZ MODEL WITH KATO-CLASS POTENTIALS AND EXPONENTIAL DECAYS

TAKERU HIDAKA and FUMIO HIROSHIMA∗
Faculty of Mathematics, Kyushu University, Fukuoka 819-0385, Japan
∗[email protected]

Received 29 March 2010
Revised 25 October 2010

A generalized Pauli–Fierz Hamiltonian with Kato-class potential, $K_{\mathrm{PF}}$, in nonrelativistic quantum electrodynamics is defined and studied by a path measure. $K_{\mathrm{PF}}$ is defined as the self-adjoint generator of a strongly continuous one-parameter symmetric semigroup, and it is shown that its bound states decay exponentially in space pointwise and that the ground state is unique.

Keywords: Pauli–Fierz model; exponential decay; ground states; functional integrations.

Mathematics Subject Classification 2010: 81Q10, 46N50

1. Introduction

In this paper, we investigate generalized Pauli–Fierz Hamiltonians with Kato-class potentials in nonrelativistic quantum electrodynamics by a path measure. It includes not only Kato-class potentials but also general cutoff functions of quantized radiation fields. Basic ingredients in this paper are path measures and functional integral representation of semigroups. It has been shown that functional integral representations are useful tools to investigate the spectrum of models in quantum field theory. See, e.g., [4, 9, 15, 18, 20, 22, 23, 28, 29].

The strongly continuous one-parameter semigroup $(e^{-tH_p})_{t \ge 0}$ generated by the Schrödinger operator $H_p = \frac{1}{2}(p - a)^2 + V$ on $L^2(\mathbb{R}^d)$ with some external potential $V$ and vector potential $a = (a_1, \ldots, a_d)$ is expressed by a path measure, which is known as the Feynman–Kac–Itô formula ([25]):

\[ (f, e^{-tH_p} g) = \int dx\, \bar{f}(x)\, E^x\!\left[ e^{-\int_0^t V(B_s)\,ds - i\int_0^t a(B_s) \circ dB_s}\, g(B_t) \right], \tag{1.1} \]

where $E^x$ denotes the expectation value with respect to the Wiener measure $P^x$, $(B_t)_{t \ge 0}$ the $d$-dimensional Brownian motion and $\int_0^t a(B_s) \circ dB_s$ a Stratonovich integral.
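To make formula (1.1) concrete, the following minimal sketch estimates the expectation on its right-hand side by Monte Carlo in one dimension. It is an illustration under assumptions that are not in the paper: the vector potential is taken to be $a = 0$ (so the Stratonovich term is absent), Brownian paths are discretized with an Euler scheme, the time integral of $V$ is approximated by the trapezoid rule, and the function name `feynman_kac` and its parameters are hypothetical.

```python
import math
import random

def feynman_kac(x, t, potential, g, n_paths=4000, n_steps=50, seed=0):
    """Monte Carlo estimate of E^x[exp(-int_0^t V(B_s) ds) g(B_t)] in d = 1 with a = 0.

    Sketch only: paths are sampled on a uniform grid and the time integral
    of V along each path is approximated by the trapezoid rule.
    """
    rng = random.Random(seed)
    dt = t / n_steps
    sd = math.sqrt(dt)                      # standard deviation of one Brownian increment
    total = 0.0
    for _ in range(n_paths):
        b = x
        action = 0.0
        for _ in range(n_steps):
            b_next = b + sd * rng.gauss(0.0, 1.0)
            action += 0.5 * (potential(b) + potential(b_next)) * dt
            b = b_next
        total += math.exp(-action) * g(b)
    return total / n_paths

if __name__ == "__main__":
    # Harmonic oscillator H_p = p^2/2 + x^2/2 with ground state g(x) = exp(-x^2/2), H_p g = g/2,
    # so that (e^{-t H_p} g)(x) = exp(-t/2) g(x).
    est = feynman_kac(0.5, 1.0, lambda y: 0.5 * y * y, lambda y: math.exp(-0.5 * y * y))
    print(est, math.exp(-0.5) * math.exp(-0.125))
```

The harmonic oscillator is chosen for the usage example because the exact value of $(e^{-tH_p} g)(x)$ is known, so the statistical and discretization errors of the estimate can be read off directly.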


Conversely, since a Kato-class potential $V$ satisfies

\[ \sup_x E^x\!\left[ e^{-\int_0^t V(B_s)\,ds} \right] < \infty, \quad t \ge 0, \tag{1.2} \]

the family of mappings $S_t$ defined by

\[ S_t g(x) = E^x\!\left[ e^{-\int_0^t V(B_s)\,ds - i\int_0^t a(B_s) \circ dB_s}\, g(B_t) \right], \quad t \ge 0, \tag{1.3} \]

turns out to be a strongly continuous one-parameter symmetric semigroup for a Kato-class potential $V$. The Schrödinger operator with a Kato-class potential $V$ is then defined as the self-adjoint generator of $(S_t)_{t \ge 0}$. See, e.g., [3, 26, 27, 19]. The three-dimensional Kato class includes singular external potentials such as $V(x) = -|x|^{-a}$, $0 \le a < 2$. We extend this to the Pauli–Fierz Hamiltonian.

The Pauli–Fierz Hamiltonian $H_{\mathrm{PF}}$ is a self-adjoint operator defined on the tensor product of Hilbert spaces

\[ \mathscr{H} = L^2(\mathbb{R}^d) \otimes L^2(Q), \tag{1.4} \]

where $L^2(Q)$ is an $L^2$-space over a probability space $(Q, \mathcal{B}, \mu)$ with a Gaussian measure $\mu$, and it describes the Schrödinger representation of the standard Boson Fock space. The Pauli–Fierz Hamiltonian $H_{\mathrm{PF}}$ is given by

\[ H_{\mathrm{PF}} = \frac{1}{2}(p \otimes 1 + \sqrt{\alpha}\,\mathscr{A})^2 + V \otimes 1 + 1 \otimes H_f(m), \tag{1.5} \]

where $\alpha \ge 0$ is a coupling constant, $H_f(m)$ the free field Hamiltonian with a field mass $m \ge 0$ and $\mathscr{A} = (\mathscr{A}_1, \ldots, \mathscr{A}_d)$ a quantized radiation field with a cutoff function. See Sec. 2 for further details of notations. Under some conditions on cutoff functions and $V$ it is proven that (1.5) is self-adjoint and $e^{-tH_{\mathrm{PF}}}$ is then defined by the spectral resolution. In [14], $(F, e^{-tH_{\mathrm{PF}}} G)$ is also presented by a path measure:

\[ (F, e^{-tH_{\mathrm{PF}}} G) = \int dx\, (F(x), (T_t G)(x))_{L^2(Q)}, \tag{1.6} \]

where $T_t$ is of the form

\[ T_t G(x) = E^x\!\left[ e^{-\int_0^t V(B_s)\,ds}\, J_0^*\, e^{i\sqrt{\alpha}\,\mathscr{A}_E(K_t)}\, J_t\, G(B_t) \right] \in L^2(Q) \tag{1.7} \]

for each $x \in \mathbb{R}^d$. Compare with (1.3) and see (2.47) for details.

Our construction of generalized Pauli–Fierz Hamiltonians is close to the procedure used to define the Schrödinger operator with Kato-class potentials. We believe however that it is worthwhile extending it to the Pauli–Fierz Hamiltonian from the mathematical point of view. It will be shown that the family of operators $T_t : \mathscr{H} \to \mathscr{H}$, $t \ge 0$, can also be defined for Kato-class potentials $V$ and general cutoff functions in $\mathscr{A}$, and the generalized Pauli–Fierz Hamiltonian $K_{\mathrm{PF}}$ is defined as the self-adjoint generator of $(T_t)_{t \ge 0}$. Of course, under some conditions $K_{\mathrm{PF}}$ coincides with $H_{\mathrm{PF}}$, but $K_{\mathrm{PF}}$ allows more singular potentials $V$ and more general cutoff functions in $\mathscr{A}$.


Cutoff functions of $\mathscr{A}_\mu(x)$, $\mu = 1, 2, 3$, of the standard Pauli–Fierz Hamiltonian in three dimensions are of the form

\[ e^{-ikx}\, e_\mu(k, j)\, \hat{\varphi}(k) / \sqrt{|k|} \tag{1.8} \]

with some function $\hat{\varphi}$ and polarization vectors $e(k, j) = (e_1(k, j), e_2(k, j), e_3(k, j))$, $j = 1, 2$. In [8], the so-called Nelson model on a pseudo-Riemannian manifold is studied by a path measure. Generalized Pauli–Fierz Hamiltonians include a mathematical analogue of the Nelson model on a pseudo-Riemannian manifold, which is unitarily transformed to the Pauli–Fierz Hamiltonian with a variable mass. The cutoff function of the Pauli–Fierz Hamiltonian with a variable mass $v$ is (1.8) with $e^{ikx}$ and $e_\mu(k, j)\hat{\varphi}(k)$ replaced by $\Psi(k, x)$ and $\hat{\phi}_\mu^j(k)$, respectively:

\[ \Psi(k, x)\, \hat{\phi}_\mu^j(k) / \sqrt{|k|}. \tag{1.9} \]

Here $\hat{\phi}_\mu^j(k)$ is some function and $\Psi(k, x)$, $k \ne 0$, is the unique solution of the Lippmann–Schwinger equation ([21]):

\[ \Psi(k, x) = e^{ikx} - \frac{1}{4\pi} \int \frac{e^{i|k||x-y|}\, v(y)}{|x - y|}\, \Psi(k, y)\, dy. \tag{1.10} \]

The main results of the present paper are as follows:

(1) we define the generalized Pauli–Fierz Hamiltonian $K_{\mathrm{PF}}$ with Kato-class potentials and generalized cutoff functions, i.e. we prove that $(T_t)_{t \ge 0}$ is a strongly continuous one-parameter symmetric semigroup;
(2) $K_{\mathrm{PF}}$ is an extension of $H_{\mathrm{PF}}$;
(3) bound states of $K_{\mathrm{PF}}$ decay exponentially in space pointwise, and the ground state is unique if it exists.

We explain an outline of (1)–(3) above. First we define the strongly continuous one-parameter symmetric semigroup $(T_t)_{t \ge 0}$ with Kato-class potentials and general cutoff functions by functional integral representations. Then $K_{\mathrm{PF}}$ is defined by $T_t = e^{-tK_{\mathrm{PF}}}$ for $t \ge 0$. We introduce two assumptions, Assumptions 2.1 and 2.12, on cutoff functions of $\mathscr{A}$. The former is stronger than the latter. One advantage of defining the generalized Pauli–Fierz Hamiltonian by a path measure is that we need only a weak condition on cutoff functions (Assumption 2.12) and external potentials. Then for arbitrary $\alpha \in \mathbb{R}$, Kato-class potential $V$ and cutoff function $\hat{\rho}_\mu^j(x, k)$ satisfying $\hat{\rho}_\mu^j(x, k) \in C_b^1(\mathbb{R}_x^d; L^2(\mathbb{R}_k^d))$, we can define $K_{\mathrm{PF}}$ as a self-adjoint operator.

Secondly, we can show that

\[ \frac{1}{2}(p \otimes 1 + \sqrt{\alpha}\,\mathscr{A})^2 \mathbin{\dot{+}} V_+ \otimes 1 \mathbin{\dot{-}} V_- \otimes 1 + 1 \otimes H_f(m) \tag{1.11} \]

is well defined for $V_\pm$ such that $0 \le V_+ \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and $0 \le V_-$ is relatively form bounded with respect to $p^2/2$ with a relative bound strictly smaller than one. It is shown that $K_{\mathrm{PF}} = (1.11)$ under Assumption 2.1 on cutoff functions.


Finally it is shown that bound states of $K_{\mathrm{PF}}$ decay exponentially in space pointwise. Showing the spatial exponential decay of bound states is very important for studying the spectral properties of Pauli–Fierz type models. In [2, 11, 10] the spatial exponential decay of bound states is shown, but our method is completely different from theirs. Since $\varphi_b = e^{tE} e^{-tK_{\mathrm{PF}}} \varphi_b$ for $\varphi_b$ such that $K_{\mathrm{PF}} \varphi_b = E\varphi_b$, exponential decay of $\varphi_b(x)$ is proven by showing $\sup_x \|\varphi_b(x)\|_{L^2(Q)} < \infty$ and estimating $e^{tE} E^x[e^{-\int_0^t V(B_s)\,ds}]$. We conclude that

\[ \|\varphi_b(x)\|_{L^2(Q)} \le D e^{-C|x|^\beta} \tag{1.12} \]

for almost every $x \in \mathbb{R}^d$, and the constants $D$ and $C$ are independent of the field mass $m$. Here the exponent $\beta$, $\beta \ge 1$, is determined by the behavior of the external potential $V$. When $\liminf_{|x| \to \infty} V(x) < E$, we can take $\beta = 1$, and when $V(x) = |x|^{2n}$, $\beta = n + 1$ is obtained. See Theorem 3.1 for the details. Furthermore, from a standard argument [15] it follows that the transformed operator $e^{i(\pi/2)N} T_t e^{-i(\pi/2)N}$ is a positivity improving semigroup, where $N$ denotes the number operator in $L^2(Q)$. Then we conclude that the ground state of $K_{\mathrm{PF}}$ is unique if it exists.

This paper is organized as follows: Section 2 is devoted to constructing a strongly continuous symmetric semigroup $(T_t)_{t \ge 0}$ and defining the self-adjoint operator $K_{\mathrm{PF}}$. In Sec. 3, we show the spatial exponential decay of bound states of $K_{\mathrm{PF}}$ pointwise. The paper closes with an Appendix.

2. Generalized Pauli–Fierz Hamiltonian

2.1. Definitions

Let us begin by defining a generalized Pauli–Fierz Hamiltonian by a path measure. We use the notation $E_P$ for the expectation with respect to a probability measure $P$, i.e. $\int \cdots\, dP = E_P[\cdots]$. Let $\mathscr{S}_{\mathrm{real}} = \mathscr{S}_{\mathrm{real}}(\mathbb{R}^d)$ be the set of real-valued Schwartz test functions on $\mathbb{R}^d$. We set $Q = \bigoplus_{j=1}^{d-1} \mathscr{S}_{\mathrm{real}}$. There exist a σ-field $\mathcal{B}$, a probability measure $\mu$ on a measurable space $(Q, \mathcal{B})$ and a Gaussian random variable $\mathscr{A}(\Phi)$ indexed by $\Phi = (\Phi_1, \ldots, \Phi_{d-1}) \in \bigoplus_{j=1}^{d-1} L^2_{\mathrm{real}}(\mathbb{R}^d)$ such that

\[ E_\mu[\mathscr{A}(\Phi)] = 0 \tag{2.1} \]

and the covariance is given by

\[ E_\mu[\mathscr{A}(\Phi)\mathscr{A}(\Psi)] = \frac{1}{2} \sum_{j=1}^{d-1} (\Phi_j, \Psi_j)_{L^2(\mathbb{R}^d)}. \tag{2.2} \]
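The two defining properties (2.1) and (2.2) can be illustrated with a finite-dimensional stand-in. The sketch below is an assumption-laden toy model, not the construction of the paper: each copy of $L^2_{\mathrm{real}}(\mathbb{R}^d)$ is replaced by $\mathbb{R}^n$ with the Euclidean inner product, and the field is realized as $\mathscr{A}(\Phi) = 2^{-1/2} \sum_{j,k} \Phi_{j,k}\, \xi_{j,k}$ with independent standard normal variables $\xi_{j,k}$. All function names are illustrative.

```python
import math
import random

def sample_configuration(n_components, dim, rng):
    """Draw independent standard normals xi[j][k], one per component j and coordinate k."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_components)]

def field(phi, xi):
    """Toy field A(phi) = 2^{-1/2} * sum_{j,k} phi[j][k] * xi[j][k]; linear in phi."""
    return sum(p * z for pj, zj in zip(phi, xi) for p, z in zip(pj, zj)) / math.sqrt(2.0)

def exact_covariance(phi, psi):
    """Right-hand side of (2.2) in the toy model: (1/2) * sum_j (phi_j, psi_j)."""
    return 0.5 * sum(p * q for pj, qj in zip(phi, psi) for p, q in zip(pj, qj))

def empirical_covariance(phi, psi, n_samples=20000, seed=1):
    """Sample average of A(phi) * A(psi), approximating E[A(phi) A(psi)]."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_samples):
        xi = sample_configuration(len(phi), len(phi[0]), rng)
        acc += field(phi, xi) * field(psi, xi)
    return acc / n_samples

if __name__ == "__main__":
    phi = [[1.0, 2.0, 0.0], [0.5, 0.0, 1.0]]   # two components, i.e. d - 1 = 2
    psi = [[0.5, -1.0, 3.0], [1.0, 1.0, 0.0]]
    print(empirical_covariance(phi, psi), exact_covariance(phi, psi))
```

In the toy model the mean of `field(phi, xi)` is zero by symmetry and the covariance formula holds exactly, so the empirical average differs from `exact_covariance` only by sampling error.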

Throughout, the scalar product on a Hilbert space $L$ is denoted by $(F, G)_L$, where it is antilinear in $F$ and linear in $G$. We omit $L$ when no confusion arises. For general $\Phi \in \bigoplus^{d-1} L^2(\mathbb{R}^d)$, $\mathscr{A}(\Phi)$ is defined by

\[ \mathscr{A}(\Phi) = \mathscr{A}(\Re\Phi) + i\mathscr{A}(\Im\Phi). \tag{2.3} \]


Thus $\mathscr{A}(\Phi)$ is linear in $\Phi$ over $\mathbb{C}$. The Boson Fock space is defined by $L^2(Q, d\mu) = L^2(Q)$. It is known that the linear hull of

\[ \left\{ {:}\mathscr{A}(\phi_1) \cdots \mathscr{A}(\phi_n){:} \;\middle|\; \phi_j \in \bigoplus^{d-1} L^2(\mathbb{R}^d),\ j = 1, \ldots, n,\ n \ge 0 \right\} \tag{2.4} \]

is dense in $L^2(Q)$, where ${:}X{:}$ denotes the Wick product of $X$. See the Appendix for the definition of the Wick product.

Let us define the free field Hamiltonian $H_f(m)$ on $L^2(Q)$. Define the map $\Gamma(T) : L^2(Q) \to L^2(Q)$ by $\Gamma(T)1 = 1$ and

\[ \Gamma(T)\, {:}\mathscr{A}(\phi_1) \cdots \mathscr{A}(\phi_n){:} = {:}\mathscr{A}(T\phi_1) \cdots \mathscr{A}(T\phi_n){:} \tag{2.5} \]

for a contraction operator $T$ on $\bigoplus^{d-1} L^2(\mathbb{R}^d)$. Then $\Gamma(T)$ is also a contraction on (2.4) and can be uniquely extended to a contraction operator on the whole space $L^2(Q)$, which is denoted by the same symbol $\Gamma(T)$. We can check that $\Gamma(T)\Gamma(S) = \Gamma(TS)$. Then $\{\Gamma(e^{-ith})\}_{t \in \mathbb{R}}$ for a self-adjoint operator $h$ defines a strongly continuous one-parameter unitary group on $L^2(Q)$. The self-adjoint generator of $\{\Gamma(e^{-ith})\}_{t \in \mathbb{R}}$ is denoted by $d\Gamma(h)$, i.e.

\[ \Gamma(e^{-ith}) = e^{-it\,d\Gamma(h)}, \quad t \in \mathbb{R}. \tag{2.6} \]

Let

\[ h = \bigoplus^{d-1} \omega(-i\partial), \tag{2.7} \]

where

\[ \omega(k) = \sqrt{|k|^2 + m^2}, \quad m \ge 0,\ k \in \mathbb{R}^d. \tag{2.8} \]

Then we set

\[ H_f(m) = d\Gamma(h) \tag{2.9} \]

and it is called the free field Hamiltonian on $L^2(Q)$. Let $p = -i\partial = (-i\partial_{x_1}, \ldots, -i\partial_{x_d})$ be momentum operators in $L^2(\mathbb{R}_x^d)$. We define the Schrödinger operator $H_p$ by

\[ H_p = \frac{1}{2} p^2 + V, \tag{2.10} \]

where $V$ denotes a real-valued external potential. The conditions on $V$ will be specified later. The zero coupling Hamiltonian is now given by the self-adjoint operator

\[ H_p \otimes 1 + 1 \otimes H_f(m) \tag{2.11} \]

on the Hilbert space

\[ \mathscr{H} = L^2(\mathbb{R}_x^d) \otimes L^2(Q). \tag{2.12} \]

The Pauli–Fierz Hamiltonian $H_{\mathrm{PF}}$ is defined by replacing $p \otimes 1$ in the zero coupling Hamiltonian (2.11) with $p \otimes 1 + \sqrt{\alpha}\,\mathscr{A}$, where $\alpha \ge 0$ is a coupling


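constant and

\[ \mathscr{A}_\mu = \int_{\mathbb{R}^d}^{\oplus} \mathscr{A}_\mu(x)\, dx \tag{2.13} \]

is the so-called quantized radiation field. Here we used the identification $\mathscr{H} \cong \int_{\mathbb{R}^d}^{\oplus} L^2(Q)\, dx$. We shall define $\mathscr{A}_\mu(x)$ below. Let

\[ \rho_\mu^j(\cdot, x) = \left( \hat{\phi}_\mu^j\, \Psi(\cdot, x) / \sqrt{\omega} \right)^{\vee}, \quad j = 1, \ldots, d - 1,\ \mu = 1, \ldots, d, \tag{2.14} \]

where $\phi_\mu^j$ is a cutoff function and $\hat{X}$ (respectively $\check{X}$) denotes the (respectively inverse) Fourier transform of $X$. Note that $\hat{\rho}_\mu^j(k, x) = \hat{\phi}_\mu^j(k)\Psi(k, x)/\sqrt{\omega(k)}$. Examples of cutoff functions are given later. The quantized radiation field is defined by

\[ \mathscr{A}_\mu(x) = \mathscr{A}\!\left( \bigoplus_{j=1}^{d-1} \rho_\mu^j(x) \right), \quad \mu = 1, \ldots, d, \tag{2.15} \]

for each $x \in \mathbb{R}^d$. Now we arrive at the definition of the Pauli–Fierz Hamiltonian. It is defined by

\[ H_{\mathrm{PF}} = \frac{1}{2}(p \otimes 1 + \sqrt{\alpha}\,\mathscr{A})^2 + V \otimes 1 + 1 \otimes H_f(m). \tag{2.16} \]

We omit $\otimes$ for notational convenience in what follows. Then $H_{\mathrm{PF}}$ is expressed as

\[ H_{\mathrm{PF}} = \frac{1}{2}(p + \sqrt{\alpha}\,\mathscr{A})^2 + V + H_f(m). \tag{2.17} \]

Assumption 2.1. Suppose that $\hat{\rho}_\mu^j \in C_b^1(\mathbb{R}_x^d; L^2(\mathbb{R}_k^d))$ and

\[ \sqrt{\omega}\,\hat{\rho}_\mu^j,\ \hat{\rho}_\mu^j,\ \hat{\rho}_\mu^j/\sqrt{\omega},\ \partial_{x_\mu}\hat{\rho}_\mu^j,\ \partial_{x_\mu}\hat{\rho}_\mu^j/\sqrt{\omega} \in L^\infty(\mathbb{R}_x^d; L^2(\mathbb{R}_k^d)). \tag{2.18} \]

Under Assumption 2.1 it follows that

\[ \|(p \cdot \mathscr{A} + \mathscr{A} \cdot p)F\| \le c_1 \|(p^2 + H_f(m) + 1)F\|, \tag{2.19} \]

\[ \|\mathscr{A} \cdot \mathscr{A} F\| \le c_2 \|(H_f(m) + 1)F\|. \tag{2.20} \]

Moreover, $H_{\mathrm{PF}}$ is self-adjoint on $D(p^2) \cap D(H_f(m))$ under Assumption 2.1. See [16, 17, 12] for the proof. We give examples of cutoff functions $\rho_\mu^j$.

Example 2.2 (Standard Pauli–Fierz Hamiltonian). The standard Pauli–Fierz Hamiltonian is defined by $H_{\mathrm{PF}}$ with the dimension $d = 3$, $m = 0$, and

\[ \Psi(k, x) = e^{+ikx}, \quad \hat{\phi}_\mu^j(k) = \hat{\varphi}(k)\, e_\mu(k, j)/\sqrt{\omega}, \]

where $e(k, j) = (e_1(k, j), e_2(k, j), e_3(k, j))$, $j = 1, 2$, denote polarization vectors, and $\hat{\varphi}$ is an ultraviolet cutoff function. Suppose that $\sqrt{\omega}\,\hat{\varphi}$, $\hat{\varphi}/\sqrt{\omega}$, $\hat{\varphi}/\omega \in L^2(\mathbb{R}^d)$. Then $\rho_\mu^j(k, x) \in C_b^1(\mathbb{R}_x^d; L^2(\mathbb{R}_k^d))$ and (2.18) is fulfilled.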


Example 2.3 (The Pauli–Fierz Hamiltonian with a Variable Mass). The Pauli–Fierz Hamiltonian with a variable mass $v$ instead of $m$ is studied in [13]. Then $d = 3$, $m = 0$, and $\Psi(k,x)$ is the unique solution to the Lippmann–Schwinger equation ([21]):

$$\Psi(k,x) = e^{+ikx} - \frac{1}{4\pi}\int\frac{e^{i|k||x-y|}v(y)}{|x-y|}\Psi(k,y)\,dy. \tag{2.21}$$

$\Psi(k,x)$ formally satisfies $(-\Delta_x + v(x))\Psi(k,x) = |k|^2\Psi(k,x)$, $k\neq 0$.

It is established that the Pauli–Fierz Hamiltonian with a variable mass has a ground state for arbitrary values of coupling constants when $|v(x)|\le C(1+|x|^2)^{-\beta/2}$, $\beta > 3$, with some constant $C$. Then it is also seen that

$$|\Psi(k,x) - e^{ikx}| \le C(1+|x|^2)^{-1/2}. \tag{2.22}$$

Since

$$\partial_{x_\mu}\Psi(k,x) = ik_\mu e^{ikx} - \frac{1}{4\pi}\int_{\mathbb{R}^3}\Bigl(i|k| - \frac{1}{|x-y|}\Bigr)\frac{(x_\mu - y_\mu)e^{i|k||x-y|}v(y)}{|x-y|^2}\Psi(k,y)\,dy, \tag{2.23}$$

it follows that

$$\sup_{k\in D,\ x\in\mathbb{R}^d}|\partial_{x_\mu}\Psi(k,x)| < \infty \tag{2.24}$$

for any compact set $D$ with $D\not\ni 0$. Let $\operatorname{supp}\hat\phi^j_\mu\subset D$. Then $\hat\rho^j_\mu\in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$ follows from (2.22) and (2.24). In addition to the condition $\operatorname{supp}\hat\phi^j_\mu\subset D$, let us suppose that $\hat\phi^j_\mu/\sqrt{\omega},\ \sqrt{\omega}\hat\phi^j_\mu,\ \hat\phi^j_\mu/\omega\in L^2(\mathbb{R}^d_k)$. Then (2.18) is fulfilled.

2.2. Feynman–Kac type formulae

Let us prepare the Euclidean version of the quantized radiation field $\mathscr{A}(\Phi)$ to construct a functional integral representation of $e^{-tH_{PF}}$ in the same way as [14]. Let $Q_E = \oplus^{d-1}\mathscr{S}'_{\mathrm{real}}(\mathbb{R}^{d+1})$. There exist a probability measure $\mu_E$ on a measurable space $(Q_E, \mathscr{B}_E)$ and a Gaussian random variable $\mathscr{A}_E(\Phi)$ indexed by $\Phi\in\oplus^{d-1}L^2(\mathbb{R}^{d+1})$ such that $E_{\mu_E}[\mathscr{A}_E(\Phi)] = 0$ and the covariance is given by

$$E_{\mu_E}[\mathscr{A}_E(\Phi)\mathscr{A}_E(\Psi)] = \frac{1}{2}\sum_{j=1}^{d-1}(\Phi_j,\Psi_j)_{L^2(\mathbb{R}^{d+1})}.$$


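Both $L^2(Q)$ and $L^2(Q_E)$ are connected through the second quantization of the family of isometries $\{j_t\}_{t\in\mathbb{R}}$ between $L^2(\mathbb{R}^d)$ and $L^2(\mathbb{R}^{d+1})$:

$$\widehat{j_tf}(k_0,k) = \frac{e^{-ik_0t}}{\sqrt{\pi}}\sqrt{\frac{\omega(k)}{\omega(k)^2 + |k_0|^2}}\,\hat f(k). \tag{2.25}$$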
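As a consistency check on the normalization in (2.25), the following short computation is a sketch that uses only the standard Fourier transform of the Lorentzian, valid wherever $\omega(k) > 0$. It recovers the isometry property (take $s = t$) and the kernel $e^{-|t-s|\omega}$ that links the family $\{j_t\}$ to the free field semigroup.

```latex
\begin{align*}
(j_t f, j_s g)_{L^2(\mathbb{R}^{d+1})}
  &= \int_{\mathbb{R}^d} dk\, \overline{\hat f(k)}\,\hat g(k)\,
     \frac{1}{\pi}\int_{\mathbb{R}}
     \frac{\omega(k)\, e^{i k_0 (t-s)}}{\omega(k)^2 + k_0^2}\, dk_0 \\
  &= \int_{\mathbb{R}^d} dk\, \overline{\hat f(k)}\,\hat g(k)\, e^{-|t-s|\omega(k)}
   = \bigl(f,\, e^{-|t-s|\omega(-i\partial)} g\bigr)_{L^2(\mathbb{R}^d)},
\end{align*}
% using \int_{\mathbb{R}} \frac{\omega\, e^{i k_0 \tau}}{\omega^2 + k_0^2}\, dk_0
%       = \pi e^{-\omega |\tau|} \quad (\omega > 0).
```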

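Define $J_t = \Gamma(\oplus^{d-1}j_t)\colon L^2(Q)\to L^2(Q_E)$. From the identity $j_t^*j_s = e^{-|t-s|\omega(-i\partial)}$ it follows that $J_t^*J_s = e^{-|t-s|H_f(m)}$. Let $\mathscr{X} = C([0,\infty);\mathbb{R}^d)$ be the set of continuous paths on $[0,\infty)$. Let $(B_t)_{t\ge 0}$ denote the $d$-dimensional Brownian motion starting at $x\in\mathbb{R}^d$ on $(\mathscr{X}, B(\mathscr{X}), P^x)$ with the Wiener measure $P^x$. That is, $P^x(B_0 = x) = 1$. Let $C_b^n(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$ be the set of strongly $n$-times differentiable $L^2(\mathbb{R}^d)$-valued functions on $\mathbb{R}^d$ such that $\sup_x\|\partial_x^zf(x)\|_{L^2(\mathbb{R}^d)} < \infty$ for $|z|\le n$. For $f_\mu\in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$, $\mu = 1,\dots,d$, we can define an $L^2(\mathbb{R}^d)$-valued Stratonovich integral:

$$\sum_{\mu=1}^d\int_0^tf_\mu(B_s)\circ dB_s^\mu = \int_0^tf(B_s)\cdot dB_s + \frac{1}{2}\int_0^t\partial\cdot f(B_s)\,ds, \tag{2.26}$$

where $f(B_s)\cdot dB_s = \sum_{\mu=1}^df_\mu(B_s)\,dB_s^\mu$ and $\partial\cdot f(B_s) = \sum_{\mu=1}^d(\partial_{x_\mu}f_\mu)(B_s)$. We also define an $L^2(\mathbb{R}^{d+1})$-valued Stratonovich integral by

$$\sum_{\mu=1}^d\int_0^tj_sf_\mu(B_s)\circ dB_s^\mu = \lim_{n\to\infty}\sum_{j=1}^n\sum_{\mu=1}^d\int_{t(j-1)/n}^{tj/n}j_{t(j-1)/n}f_\mu(B_s)\circ dB_s^\mu, \tag{2.27}$$

where $\lim_{n\to\infty}$ is a strong limit in $L^2(\mathscr{X}; L^2(\mathbb{R}^{d+1}))$. By the Itô isometry we have the identity for $S\le T$

$$E^x\Bigl[\Bigl(\int_0^Sj_sf(B_s)\cdot dB_s,\ \int_0^Tj_sg(B_s)\cdot dB_s\Bigr)_{L^2(\mathbb{R}^{d+1})}\Bigr] = \sum_{\mu=1}^d\int_0^SE^x[(f_\mu(B_s), g_\mu(B_s))]\,ds.$$

Hence we have the bound

$$E^x\Bigl[\Bigl\|\sum_{\mu=1}^d\int_0^tj_sf_\mu(B_s)\circ dB_s^\mu\Bigr\|^2\Bigr] \le \int_0^tds\,E^x\Bigl[2\sum_{\mu=1}^d\|f_\mu(B_s)\|^2 + \frac{1}{2}\|\partial\cdot f(B_s)\|^2\Bigr]. \tag{2.28}$$

The next proposition is fundamental.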
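Before turning to the proposition, the correction term in (2.26) can be illustrated numerically. The sketch below is a deliberately simplified, scalar-valued instance (an assumption: $d = 1$, $f(x) = x$, real-valued rather than $L^2$-valued), for which $\partial\cdot f = 1$ and the Stratonovich integral exceeds the Itô integral by $t/2$. The function name and the discretization are illustrative choices, not taken from the paper.

```python
import math
import random


def ito_and_stratonovich(t=1.0, n=100_000, seed=0):
    """Discretize int_0^t B dB for scalar Brownian motion started at 0.

    Returns (ito_sum, stratonovich_sum, b_t), where
      ito_sum          evaluates the integrand at the left endpoint,
      stratonovich_sum evaluates it at the midpoint of each increment.
    """
    rng = random.Random(seed)
    sd = math.sqrt(t / n)          # standard deviation of one increment
    b = 0.0
    ito = 0.0
    strat = 0.0
    for _ in range(n):
        db = rng.gauss(0.0, sd)
        ito += b * db                  # left-endpoint (Ito) rule
        strat += (b + 0.5 * db) * db   # midpoint (Stratonovich) rule
        b += db
    return ito, strat, b


if __name__ == "__main__":
    ito, strat, b_t = ito_and_stratonovich()
    print("Stratonovich - Ito:", strat - ito)    # close to t/2
    print("B_t^2 / 2:         ", 0.5 * b_t ** 2)  # agrees with the midpoint sum
```

The midpoint sum telescopes to $B_t^2/2$ exactly, while the difference of the two sums is half the quadratic variation, which concentrates around $t/2$ as the mesh is refined.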


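Proposition 2.4. Let $V$ be bounded. Suppose Assumption 2.1. Then

$$(F, e^{-tH_{PF}}G) = \int dx\,E^x\Bigl[e^{-\int_0^tV(B_s)ds}\bigl(J_0F(B_0),\ e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_t)\bigr)_{L^2(Q)}\Bigr], \tag{2.29}$$

where $K_t$ is the $\oplus^{d-1}L^2(\mathbb{R}^{d+1})$-valued stochastic integral given by

$$K_t = \bigoplus_{j=1}^{d-1}\sum_{\mu=1}^d\int_0^tj_s\rho^j_\mu(\cdot,B_s)\circ dB_s^\mu. \tag{2.30}$$

Here

$$\sum_{\mu=1}^d\int_0^tj_s\rho^j_\mu(\cdot,B_s)\circ dB_s^\mu = \int_0^tj_s\rho^j(\cdot,B_s)\cdot dB_s + \frac{1}{2}\int_0^tj_s\,\partial\cdot\rho^j(\cdot,B_s)\,ds.$$

Proof. Suppose that $\hat\rho^j_\mu\in C_b^2(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$. Then (2.29) is proven in the same way as [16, Lemma 4.8]. Next we suppose that $\hat\rho^j_\mu(k,x)\in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$. Let $\chi\in C^\infty(\mathbb{R}^d)$ and $\varphi\in C_0^\infty(\mathbb{R}^d)$ be such that

$$\chi(x)\ \begin{cases} = 1, & |x| < 1,\\ \le 1, & 1\le|x|\le 2,\\ = 0, & 2 < |x|,\end{cases}\qquad \varphi\ge 0,$$

and $\int\varphi(x)\,dx = 1$. Define $\chi_M(x) = \chi(x/M)$ and the mollifier $\varphi_n(x) = n^d\varphi(nx)$. Let

$$\hat\rho^j_\mu(k,x)_{M,n} = \bigl(\varphi_n * (\hat\rho^j_\mu(k,\cdot)\chi_M(\cdot))\bigr)(x), \qquad \hat\rho^j_\mu(k,x)_M = \hat\rho^j_\mu(k,x)\chi_M(x).$$

We note that $\hat\rho^j_\mu(k,x)_{M,n}\in C_b^\infty(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$. Since $\hat\rho^j_\mu(k,x)_{M,n}\to\hat\rho^j_\mu(k,x)_M$ in $L^p(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$ for $1\le p<\infty$ as $n\to\infty$, there exists a subsequence $n'$ such that $\hat\rho^j_\mu(k,x)_{M,n'}\to\hat\rho^j_\mu(k,x)_M$ strongly in $L^2(\mathbb{R}^d_k)$ for almost every $x\in\mathbb{R}^d$. Furthermore, $\hat\rho^j_\mu(k,x)_M\to\hat\rho^j_\mu(k,x)$ for each $x\in\mathbb{R}^d$ in $L^2(\mathbb{R}^d_k)$. Then

$$\lim_{M\to\infty}\lim_{n'\to\infty}\hat\rho^j_\mu(k,x)_{M,n'} = \hat\rho^j_\mu(k,x) \tag{2.31}$$

strongly in $L^2(\mathbb{R}^d_k)$ for almost every $x\in\mathbb{R}^d$. In the same way as above, we can also see that

$$\lim_{M\to\infty}\lim_{n'\to\infty}\partial_x^z\hat\rho^j_\mu(k,x)_{M,n'} = \partial_x^z\hat\rho^j_\mu(k,x) \tag{2.32}$$

strongly in $L^2(\mathbb{R}^d_k)$ for almost every $x\in\mathbb{R}^d$ and for $|z|\le 1$. Thus (2.29) holds with $\hat\rho^j_\mu$ replaced by $\hat\rho^j_\mu(k,x)_{M,n'}$. $H_{PF}$ with $\hat\rho^j_\mu$ replaced by $\hat\rho^j_\mu(k,x)_{M,n'}$ is denoted by $H_{PF}(M,n')$. Let $F\in C_0^\infty\otimes D(H_f(m))$. Then we can prove directly that

$$\lim_{M\to\infty}\lim_{n'\to\infty}H_{PF}(M,n')F = H_{PF}F.$$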


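Since $C_0^\infty\otimes D(H_f(m))$ is a core of $H_{PF}(M,n')$ and $H_{PF}$,

$$\lim_{M\to\infty}\lim_{n'\to\infty}e^{-tH_{PF}(M,n')} = e^{-tH_{PF}} \tag{2.33}$$

strongly. Moreover

$$(F, e^{-tH_{PF}(M,n')}G) = \int dx\,E^x\Bigl[\bigl(J_0F(x),\ e^{i\sqrt{\alpha}\mathscr{A}_E(K_t(M,n'))}e^{-\int_0^tV(B_s)ds}J_tG(B_t)\bigr)\Bigr], \tag{2.34}$$

where $K_t(M,n')$ is defined by $K_t$ with $\rho^j_\mu(k,x)$ replaced by $\rho^j_\mu(k,x)_{M,n'}$. The operator $N = d\Gamma(1)$ is called the number operator in $L^2(Q)$. Let $F\in D(N)$. Then the bound

$$\|\mathscr{A}(\Phi)F\| \le 2\|\Phi\|\,\|(N+1)^{1/2}F\|$$

is known. From (2.34) and

$$\bigl|e^{i\sqrt{\alpha}\mathscr{A}_E(K_t(M,n'))} - e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}\bigr| \le \sqrt{\alpha}\,|\mathscr{A}_E(K_t(M,n') - K_t)|$$

it follows that

$$\begin{aligned}
&|(F, e^{-tH_{PF}(M,n')}G) - (F, e^{-tH_{PF}}G)|\\
&\quad\le \sqrt{\alpha}\int dx\,E^x\Bigl[\bigl(|J_0F(x)|,\ |\mathscr{A}_E(K_t(M,n') - K_t)|\,e^{-\int_0^tV(B_s)ds}|J_tG(B_t)|\bigr)\Bigr]\\
&\quad\le C\sqrt{\alpha}\int dx\,\|(N+1)^{1/2}F(x)\|\,E^x\bigl[\|K_t(M,n') - K_t\|\,\|G(B_t)\|\bigr]\\
&\quad\le C\sqrt{\alpha}\int dx\,\|(N+1)^{1/2}F(x)\|\,\bigl(E^x[\|K_t(M,n') - K_t\|^2]\bigr)^{1/2}\bigl(E^x[\|G(B_t)\|^2]\bigr)^{1/2}.
\end{aligned}$$

We estimate $E^x[\|K_t(M,n') - K_t\|^2]$. By (2.28), we have

$$E^x[\|K_t(M,n') - K_t\|^2] \le \sum_{j=1}^{d-1}\int_0^tE^x\Bigl[2\sum_{\mu=1}^d\|\delta\rho^j_\mu(B_s)\|^2 + \frac{1}{2}\|\delta\partial\cdot\rho^j(B_s)\|^2\Bigr]ds,$$

where $\delta f = f - f_{M,n'}$. By (2.31) and (2.32), we see that

$$\lim_{M\to\infty}\lim_{n'\to\infty}E^x[\|K_t(M,n') - K_t\|^2] = 0$$

for each $x\in\mathbb{R}^d$. Then by the Lebesgue dominated convergence theorem we have

$$\lim_{M\to\infty}\lim_{n'\to\infty}\text{r.h.s. (2.34)} = \int dx\,E^x\Bigl[\bigl(J_0F(x),\ e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}e^{-\int_0^tV(B_s)ds}J_tG(B_t)\bigr)\Bigr]. \tag{2.35}$$

Then (2.29) also holds for $\hat\rho^j_\mu\in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$. Thus the proposition follows.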


2.3. One-parameter symmetric semigroup and generalized Pauli–Fierz Hamiltonian

We can extend the functional integral representations in Proposition 2.4 to more general external potentials and $\rho^j_\mu$.

Definition 2.5 (Kato-Class Potentials). An external potential $V\colon\mathbb{R}^d\to\mathbb{R}$ is called a Kato-class potential if and only if

$$\begin{cases}\displaystyle\sup_{x\in\mathbb{R}^d}\int_{B_1(x)}|\lambda(x-y)V(y)|\,dy < \infty, & d = 1,\\[2mm] \displaystyle\lim_{r\to 0}\sup_{x\in\mathbb{R}^d}\int_{B_r(x)}|\lambda(x-y)V(y)|\,dy = 0, & d\ge 2\end{cases} \tag{2.36}$$

holds, where $B_r(x)$ denotes the closed ball of radius $r$ centered at $x$, and

$$\lambda(x) = \begin{cases}1, & d = 1,\\ -\log|x|, & d = 2,\\ |x|^{2-d}, & d\ge 3.\end{cases} \tag{2.37}$$

We denote the set of Kato-class potentials by $\mathscr{K}_{\mathrm{Kato}}$. An equivalent characterization of the Kato class is as follows:

Proposition 2.6. A function $V$ is in $\mathscr{K}_{\mathrm{Kato}}$ if and only if

$$\lim_{t\downarrow 0}\sup_{x\in\mathbb{R}^d}E^x\Bigl[\int_0^t|V(B_s)|\,ds\Bigr] = 0. \tag{2.38}$$

Proof. See, e.g., [1, 6, 26, 27].

Definition 2.7. Let $\mathscr{K}$ be the set of external potentials $V = V_+ - V_-$ such that $0\le V_+\in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and $0\le V_-\in\mathscr{K}_{\mathrm{Kato}}$.

Example 2.8. In [1, 26, 27], it is shown that $L^p_u(\mathbb{R}^d)\subset\mathscr{K}_{\mathrm{Kato}}$, where

$$L^p_u(\mathbb{R}^d) = \Bigl\{f\ \Big|\ \sup_x\int_{|x-y|\le 1}|f(y)|^p\,dy < \infty\Bigr\}$$

with

$$p\ \begin{cases} = 1, & d = 1,\\ > \dfrac{d}{2}, & d\ge 2.\end{cases} \tag{2.39}$$

In particular, if $V\in L^p(\mathbb{R}^d) + L^\infty(\mathbb{R}^d)$ with $p$ as in (2.39), then $V\in\mathscr{K}_{\mathrm{Kato}}$.

Example 2.9. Let $d = 3$ and $V(x) = P(x) - \dfrac{a}{|x|^b}$, where $a\ge 0$, $0\le b<2$ and $P(x) = \sum_{j=0}^{2n}a_jx^j$ is a polynomial such that $a_{2n} > 0$. Then $V\in\mathscr{K}$.
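To see where the restriction $0\le b<2$ in Example 2.9 comes from, one can evaluate the integral in (2.36) for the singular part $a|x|^{-b}$ at the centre $x = 0$, where the singularity of $\lambda$ and that of the potential coincide. This is only a sketch: it checks the worst-looking point rather than the full supremum over $x$.

```latex
\[
  \int_{B_r(0)} |y|^{2-d}\,\frac{a}{|y|^{b}}\,dy \,\Big|_{d=3}
  \;=\; 4\pi a \int_0^r \varrho^{-1-b}\,\varrho^{2}\,d\varrho
  \;=\; \frac{4\pi a}{2-b}\; r^{\,2-b}
  \;\xrightarrow[\; r\to 0 \;]{}\; 0
  \qquad (0 \le b < 2),
\]
% whereas for b >= 2 the radial integral \int_0^r \varrho^{1-b}\, d\varrho diverges,
% so the condition in (2.36) fails.
```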


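Now we shall see that the random variable $\int_0^tV_\pm(B_s)\,ds$ is integrable with respect to the Wiener measure $P^x$ for $V\in\mathscr{K}$.

Lemma 2.10. Let $0\le V\in L^1_{\mathrm{loc}}(\mathbb{R}^d)$. Then $P^x\bigl(\int_0^tV(B_s)\,ds < \infty\bigr) = 1$ for each $x\in\mathbb{R}^d$.

Proof. Since $V\in L^1_{\mathrm{loc}}(\mathbb{R}^d)$, we can see that $E^x\bigl[\int_0^t1_N(B_s)V(B_s)\,ds\bigr] < \infty$ for the indicator function

$$1_N(k) = \begin{cases}1, & |k|\le N,\\ 0, & |k| > N.\end{cases}$$

Then there exists a measurable set $\mathscr{N}_N\subset\mathscr{X}$ such that $P^x(\mathscr{N}_N) = 0$ and $\int_0^t1_N(B_s(\omega))V(B_s(\omega))\,ds < \infty$ for $\omega\in\mathscr{X}\setminus\mathscr{N}_N$. Set $\mathscr{N} = \bigcup_{N=1}^\infty\mathscr{N}_N$. For $\omega\in\mathscr{X}\setminus\mathscr{N}$ we can see that $\int_0^t1_N(B_s(\omega))V(B_s(\omega))\,ds < \infty$ for arbitrary $N\ge 1$. Let $\omega\in\mathscr{X}\setminus\mathscr{N}$. There exists $N = N(\omega)\ge 1$ such that $\sup_{0\le s\le t}|B_s(\omega)| < N$. Hence

$$\int_0^tV(B_s(\omega))\,ds = \int_0^t1_N(B_s(\omega))V(B_s(\omega))\,ds < \infty, \quad \omega\in\mathscr{X}\setminus\mathscr{N}.$$

Thus the lemma follows.

When $V_-\in\mathscr{K}_{\mathrm{Kato}}$, it can be seen that the exponential $e^{\int_0^tV_-(B_s)ds}$ is integrable with respect to $P^x$, and the supremum of $E^x[e^{\int_0^tV_-(B_s)ds}]$ in $x$ is finite. We shall check it.

Lemma 2.11. Let $V\in\mathscr{K}_{\mathrm{Kato}}$. Then there exist $\beta > 0$ and $\gamma > 0$ such that

$$\sup_xE^x\bigl[e^{\int_0^tV(B_s)ds}\bigr] < \gamma e^{\beta t}. \tag{2.40}$$

Furthermore, when $V\in L^p(\mathbb{R}^d)$ with

$$p\ \begin{cases} = 1, & d = 1,\\ > \dfrac{d}{2}, & d\ge 2,\end{cases}$$

there exists $C$ such that

$$\beta\le C\|V\|_p. \tag{2.41}$$

Proof. By Proposition 2.6, there exists $t^* > 0$ such that

$$\alpha_t = \sup_xE^x\Bigl[\int_0^t|V(B_s)|\,ds\Bigr] < 1$$

for all $t\le t^*$, and $\alpha_t\to 0$ as $t\to 0$. It is known as Khasminskii's lemma that

$$\sup_xE^x\bigl[e^{\int_0^tV(B_s)ds}\bigr] < \frac{1}{1-\alpha_t} \tag{2.42}$$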


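for all $t\le t^*$. By means of the Markov property of the Brownian motion we have

$$E^x\bigl[e^{\int_0^{2t^*}V(B_s)ds}\bigr] = E^x\Bigl[e^{\int_0^{t^*}V(B_s)ds}\,E^{B_{t^*}}\bigl[e^{\int_0^{t^*}V(B_s)ds}\bigr]\Bigr] \le \Bigl(\frac{1}{1-\alpha_{t^*}}\Bigr)^2.$$

Repeating this procedure, we can see that

$$\sup_xE^x\bigl[e^{\int_0^tV(B_s)ds}\bigr] \le \Bigl(\frac{1}{1-\alpha_{t^*}}\Bigr)^{[t/t^*]+1} \tag{2.43}$$

for all $t > 0$, where $[z] = \max\{w\in\mathbb{Z}\mid w\le z\}$. Set $\gamma = \dfrac{1}{1-\alpha_{t^*}}$ and $\beta = \dfrac{1}{t^*}\log\dfrac{1}{1-\alpha_{t^*}}$. Then (2.40) is proven. Next we prove (2.41). Suppose $V\in L^p(\mathbb{R}^d)$. In the case of $d = 1$, we directly see that

$$\alpha_t = \sup_x\int_0^tE^x[|V(B_s)|]\,ds \le \int_0^t(2\pi s)^{-1/2}\,ds\,\|V\|_1. \tag{2.44}$$

Next, we let $d\ge 2$ and $q$ be such that $\frac{1}{p} + \frac{1}{q} = 1$. The following estimates are due to [1, Proof of Theorem 4.5]. Let an arbitrary $\varepsilon > 0$ be fixed. We have

$$\begin{aligned}
\int_0^tE^x[|V(B_s)|]\,ds &= \int_0^tE^x[|V(B_s)|\chi_{|B_s-x|\ge\varepsilon}]\,ds + \int_0^tE^x[|V(B_s)|\chi_{|B_s-x|<\varepsilon}]\,ds\\
&\le t\int_{|y|\ge\varepsilon}(2\pi t)^{-d/2}e^{-|y|^2/(2t)}|V(x+y)|\,dy + e^t\int_0^\infty ds\,E^x[e^{-s}|V(B_s)|\chi_{|B_s-x|<\varepsilon}].
\end{aligned}$$

It is easy to see that

$$t\int_{|y|\ge\varepsilon}(2\pi t)^{-d/2}e^{-|y|^2/(2t)}|V(x+y)|\,dy \le t(2\pi t)^{-d/2}\Bigl(\int_{|y|\ge\varepsilon}e^{-q|y|^2/(2t)}\,dy\Bigr)^{1/q}\|V\|_p. \tag{2.45}$$

Let $f$ be the integral kernel of $(\frac{1}{2}p^2 + 1)^{-1}$. Then we see that

$$\int_0^\infty ds\,E^x[e^{-s}|V(B_s)|\chi_{|B_s-x|<\varepsilon}] \le \int_{|x-y|<\varepsilon}f(x-y)|V(y)|\,dy.$$

Since $|f(z)|\le C\lambda(z)$ for $|z|\le\frac{1}{2}$ with some constant $C$, we have

$$\int_0^\infty ds\,E^x[e^{-s}|V(B_s)|\chi_{|B_s-x|<\varepsilon}] \le C\int_{|x-y|<\varepsilon}\lambda(x-y)|V(y)|\,dy$$

and then

$$\int_0^\infty ds\,E^x[e^{-s}|V(B_s)|\chi_{|B_s-x|<\varepsilon}] \le C\Bigl(\int_{|z|<\varepsilon}\lambda(z)^q\,dz\Bigr)^{1/q}\|V\|_p \tag{2.46}$$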


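by the Hölder inequality. Hence from (2.44)–(2.46), there exists $C_t(\varepsilon)$ such that $\alpha_t\le C_t(\varepsilon)\|V\|_p$ and $\lim_{t\to 0}C_t(\varepsilon) = C\bigl(\int_{|z|<\varepsilon}\lambda(z)^q\,dz\bigr)^{1/q}$. Then for sufficiently small $T$ and $\varepsilon$ we have $\beta\le\dfrac{1}{T}\log\dfrac{1}{1-C_T(\varepsilon)\|V\|_p}$, and then there exists $D_T$ such that $\beta\le D_T\|V\|_p$. Then (2.41) follows.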
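For orientation, the two ingredients of this proof can be written out. The display below sketches the standard series argument behind Khasminskii's lemma, stated for $V\ge 0$ (an assumption made here only to keep the display short; in general one replaces $V$ by $|V|$). The second line is the elementary bound that produces the constants $\gamma$ and $\beta$ from the iterated estimate.

```latex
\begin{align*}
E^x\Bigl[e^{\int_0^t V(B_s)\,ds}\Bigr]
  &= \sum_{n=0}^{\infty} \int_{0\le s_1\le\cdots\le s_n\le t}
     E^x\bigl[V(B_{s_1})\cdots V(B_{s_n})\bigr]\, ds_1\cdots ds_n
   \;\le\; \sum_{n=0}^{\infty} \alpha_t^{\,n}
   \;=\; \frac{1}{1-\alpha_t}, \\[1mm]
\Bigl(\frac{1}{1-\alpha_{t^*}}\Bigr)^{[t/t^*]+1}
  &\le \frac{1}{1-\alpha_{t^*}}\,
       \exp\Bigl(\frac{t}{t^*}\,\log\frac{1}{1-\alpha_{t^*}}\Bigr)
   \;=\; \gamma\, e^{\beta t}.
\end{align*}
% Each n-fold integral is bounded by \alpha_t^n: apply the Markov property at
% time s_{n-1} and bound the innermost integral by \alpha_t, then iterate.
```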

The functional integral representation (2.29) introduced in Proposition 2.4 is well defined not only for bounded external potentials and $\rho^j_\mu$ satisfying (2.18) but also for more general external potentials and $\rho^j_\mu$. We can identify the Hilbert space $\mathscr{H}$ with $L^2(\mathbb{R}^d\times Q)$ with the scalar product $(F,G) = \int dx\,(F(x),G(x))_{L^2(Q)}$. The functional integral representation of $(F, e^{-tH_{PF}}G)$ is also given by

$$(F, e^{-tH_{PF}}G) = \int dx\,\Bigl(F(x),\ E^x\bigl[e^{-\int_0^tV(B_s)ds}J_0^*e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_t)\bigr]\Bigr)_{L^2(Q)}.$$

From this expression we shall define $(T_t)_{t\ge 0}$ by (2.47) below.

Assumption 2.12. We suppose that $V\in\mathscr{K}$ and $\hat\rho^j_\mu = \hat\rho^j_\mu(k,x)\in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$.

Note that under Assumption 2.12, $\mathscr{A}_\mu(x)$ is not relatively bounded with respect to $H_f(m)$ in the case of $m = 0$. Under Assumption 2.12, however, we define the family of linear operators $\{T_t\}_{t\ge 0}$ on $\mathscr{H}$ by

$$T_tF(x) = E^x\bigl[e^{-\int_0^tV(B_s)ds}J_0^*e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tF(B_t)\bigr] \tag{2.47}$$

for all $t\ge 0$. Note that $K_t$ is well defined since $\hat\rho^j_\mu\in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$.

Lemma 2.13. Suppose Assumption 2.12. Then $T_t$ is bounded on $\mathscr{H}$ for $t\ge 0$.

Proof. By the definition of $T_t$ we have

$$\|T_tF\|^2_{\mathscr{H}} \le \int dx\,E^x\bigl[e^{-2\int_0^tV(B_s)ds}\bigr]E^x\bigl[\|F(B_t)\|^2_{L^2(Q)}\bigr].$$

Since $V\in\mathscr{K}$, $C = \sup_xE^x[e^{-2\int_0^tV(B_s)ds}] < \infty$. Thus $\|T_tF\|^2_{\mathscr{H}}\le C\|F\|^2_{\mathscr{H}}$ follows.

In what follows we shall show that $\{T_t\}_{t\ge 0}$ is a strongly continuous one-parameter symmetric semigroup on $\mathscr{H}$. In order to show it we introduce the second quantization of the Euclidean group $\{u_t, r\}$ on $L^2(\mathbb{R}^{d+1})$, where the time shift operator $u_t\colon L^2(\mathbb{R}^{d+1})\to L^2(\mathbb{R}^{d+1})$ is defined by $u_tf(x_0,x) = f(x_0-t,x)$ and the time reflection $r\colon L^2(\mathbb{R}^{d+1})\to L^2(\mathbb{R}^{d+1})$ by $rf(x_0,x) = f(-x_0,x)$ for $(x_0,x)\in\mathbb{R}\times\mathbb{R}^d$. The second quantizations of $u_t$ and $r$ are denoted by $U_t\colon L^2(Q_E)\to L^2(Q_E)$ and $R\colon L^2(Q_E)\to L^2(Q_E)$, respectively. Note that $r^* = r$,


$rr = r^*r = 1$, $u_t^* = u_{-t}$ and $u_t^*u_t = 1$, and that $U_t$ and $R$ are unitary. The time shift $u_t$, the time reflection $r$ and the isometry $j_t\colon L^2(\mathbb{R}^d)\to L^2(\mathbb{R}^{d+1})$ satisfy the lemma below.

Lemma 2.14. (1) $u_tj_s = j_{s+t}$ and $U_tJ_s = J_{s+t}$. (2) $rj_s = j_{-s}r$ and $RU_s = U_{-s}R$.

Proof. By the definition of $j_s$ we have

$$j_sf(x) = \frac{1}{\sqrt{\pi}(2\pi)^{(d+1)/2}}\int e^{i(k_0(x_0-s)+k\cdot x)}\sqrt{\frac{\omega(k)}{\omega(k)^2+|k_0|^2}}\,\hat f(k)\,dk_0\,dk.$$

Then $u_tj_s = j_{s+t}$ follows, and $U_tJ_s = \Gamma(u_t)\Gamma(j_s) = \Gamma(u_tj_s) = \Gamma(j_{s+t}) = J_{s+t}$. (2) is similarly proven.

Lemma 2.15. Suppose Assumption 2.12. Then it follows that $T_tT_s = T_{t+s}$ for all $t,s\ge 0$.

Proof. By the definition of $T_t$, we have

$$T_sT_tF(x) = E^x\Bigl[e^{-\int_0^sV(B_r)dr}J_0^*e^{i\sqrt{\alpha}\mathscr{A}_E(K_s)}J_s\,E^{B_s}\bigl[e^{-\int_0^tV(B_r)dr}J_0^*e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tF(B_t)\bigr]\Bigr]. \tag{2.48}$$

Let $E_s = J_sJ_s^*$, $s\in\mathbb{R}$, be the family of projections. By the formulae $J_sJ_0^* = J_sJ_s^*U_{-s}^* = E_sU_{-s}^*$ and $J_t = U_{-s}J_{t+s}$, (2.48) is expressed as

$$T_sT_tF(x) = E^x\Bigl[e^{-\int_0^sV(B_r)dr}J_0^*e^{i\sqrt{\alpha}\mathscr{A}_E(K_s)}E_s\,E^{B_s}\bigl[e^{-\int_0^tV(B_r)dr}U_{-s}^*e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}U_{-s}J_{t+s}F(B_t)\bigr]\Bigr].$$

Since $U_s$ is unitary, we have

$$U_{-s}^*e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}U_{-s} = e^{i\sqrt{\alpha}\mathscr{A}_E(u_{-s}^*K_t)} \tag{2.49}$$

as an operator, where the exponent is given by

$$u_{-s}^*K_t = \bigoplus_{j=1}^{d-1}\sum_{\mu=1}^d\int_0^tj_{r+s}\rho^j_\mu(B_r)\circ dB_r^\mu.$$

Let $(\mathscr{F}_t)_{t\ge 0}$ be the natural filtration of the Brownian motion $(B_t)_{t\ge 0}$. By the Markov property of the projections $E_t$ ([24]), we can neglect $E_s$ in (2.49) and we have

$$T_sT_tF(x) = E^x\Bigl[e^{-\int_0^sV(B_r)dr}J_0^*e^{i\sqrt{\alpha}\mathscr{A}_E(K_s)}\,E^x\bigl[e^{-\int_s^{s+t}V(B_r)dr}e^{i\sqrt{\alpha}\mathscr{A}_E(K_s^{s+t})}J_{t+s}F(B_{s+t})\,\big|\,\mathscr{F}_s\bigr]\Bigr],$$

where $E^x[\cdots\mid\mathscr{F}_s]$ denotes the conditional expectation with respect to $(\mathscr{F}_t)_{t\ge 0}$ and

$$K_s^{s+t} = \bigoplus_{j=1}^{d-1}\sum_{\mu=1}^d\int_s^{s+t}j_r\rho^j_\mu(B_r)\circ dB_r^\mu.$$


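Hence we obtain that

$$T_sT_tF(x) = E^x\bigl[e^{-\int_0^{s+t}V(B_r)dr}J_0^*e^{i\sqrt{\alpha}\mathscr{A}_E(K_{s+t})}J_{s+t}F(B_{s+t})\bigr] = T_{s+t}F(x)$$

and the lemma is proven.

Next we check the symmetric property of $T_t$.

Lemma 2.16. Suppose Assumption 2.12. Then it follows that $T_t^* = T_t$ for $t\ge 0$.

Proof. By the functional integral representation and the unitarity of the time reflection $R$ on $L^2(Q_E)$, we have

$$\begin{aligned}
(F, T_tG) &= \int dx\,E^x\bigl[e^{-\int_0^tV(B_s)ds}\bigl(RJ_0F(B_0),\ Re^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}RRJ_tG(B_t)\bigr)\bigr]\\
&= \int dx\,E^x\bigl[e^{-\int_0^tV(B_s)ds}\bigl(J_0F(B_0),\ e^{i\sqrt{\alpha}\mathscr{A}_E(rK_t)}J_{-t}G(B_t)\bigr)\bigr],
\end{aligned}$$

where the exponent is $rK_t = \bigoplus_{j=1}^{d-1}\sum_{\mu=1}^d\int_0^tj_{-s}\rho^j_\mu(B_s)\circ dB_s^\mu$. By means of the time shift $U_t$ we also have

$$\begin{aligned}
(F, T_tG) &= \int dx\,E^x\bigl[e^{-\int_0^tV(B_s)ds}\bigl(U_tJ_0F(B_0),\ U_te^{i\sqrt{\alpha}\mathscr{A}_E(rK_t)}U_t^*U_tJ_{-t}G(B_t)\bigr)\bigr]\\
&= \int dx\,E^x\bigl[e^{-\int_0^tV(B_s)ds}\bigl(J_tF(B_0),\ e^{i\sqrt{\alpha}\mathscr{A}_E(u_trK_t)}J_0G(B_t)\bigr)\bigr],
\end{aligned}$$

where $u_trK_t = \bigoplus_{j=1}^{d-1}\sum_{\mu=1}^d\int_0^tj_{t-s}\rho^j_\mu(B_s)\circ dB_s^\mu$. Finally we set $\tilde B_s = B_{t-s} - B_t$, which equals $B_s$ in law. Then we have

$$(F, T_tG) = \int dx\,E^0\bigl[e^{-\int_0^tV(x+\tilde B_s)ds}\bigl(J_tF(x),\ e^{i\sqrt{\alpha}\mathscr{A}_E(u_tr\tilde K_t)}J_0G(x+\tilde B_t)\bigr)\bigr], \tag{2.50}$$

where

$$u_tr\tilde K_t = \bigoplus_{j=1}^{d-1}\sum_{\mu=1}^d\int_0^tj_{t-s}\rho^j_\mu(x+\tilde B_s)\circ d\tilde B_s^\mu = \lim_{n\to\infty}\bigoplus_{j=1}^{d-1}\sum_{i=1}^n\Delta_j(i),$$

$\lim_{n\to\infty}$ is in the strong sense of $L^2(\mathscr{X}; L^2(\mathbb{R}^{d+1}))$, and

$$\Delta_j(i) = \sum_{\mu=1}^d\int_{t(i-1)/n}^{ti/n}j_{t-t(i-1)/n}\rho^j_\mu(x+\tilde B_s)\circ d\tilde B_s^\mu.$$

Then exchanging $\int dx$ and $E^0$ in (2.50) we have

$$(F, T_tG) = \lim_{n\to\infty}E^0\Bigl[\int dx\,e^{-\int_0^tV(x+\tilde B_s)ds}\bigl(J_tF(x),\ e^{i\sqrt{\alpha}\mathscr{A}_E(\bigoplus_{j=1}^{d-1}\sum_{i=1}^n\Delta_j(i))}J_0G(x+\tilde B_t)\bigr)\Bigr]$$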


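and changing the variable $x - B_t$ to $x$ in $\int dx$ we have

$$(F, T_tG) = \lim_{n\to\infty}E^0\Bigl[\int dx\,e^{-\int_0^tV(x+B_s)ds}\bigl(J_tF(x+B_t),\ e^{i\sqrt{\alpha}\mathscr{A}_E(\bigoplus_{j=1}^{d-1}\sum_{i=1}^n\tilde\Delta_j(i))}J_0G(x)\bigr)\Bigr],$$

where

$$\tilde\Delta_j(i) = -\sum_{\mu=1}^d\int_{t(i-1)/n}^{ti/n}j_{t-t(i-1)/n}\rho^j_\mu(x+B_s)\circ dB_s^\mu$$

and

$$\lim_{n\to\infty}\sum_{i=1}^n\tilde\Delta_j(i) = -\sum_{\mu=1}^d\int_0^tj_s\rho^j_\mu(x+B_s)\circ dB_s^\mu.$$

We thus can finally see that

$$(F, T_tG) = \int dx\,E^x\bigl[e^{-\int_0^tV(B_s)ds}\bigl(J_tF(B_t),\ e^{-i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_0G(B_0)\bigr)\bigr] = (T_tF, G).$$

Then the lemma follows.

Lemma 2.17. Suppose Assumption 2.12. Then $T_t$ is strongly continuous in $t\ge 0$ on $\mathscr{H}$.

Proof. Since $T_t$ is uniformly bounded and the semigroup property $T_tT_s = T_{t+s}$ holds, it is enough to show the weak continuity at $t = 0$. By the Lebesgue dominated convergence theorem it suffices to show that

$$E^x\bigl[(J_0F(B_0),\ e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_t))\bigr] \to E^x\bigl[(J_0F(B_0),\ J_0G(B_0))\bigr]$$

as $t\to 0$ for each $x\in\mathbb{R}^d$. We decompose

$$\begin{aligned}
&E^x\bigl[(J_0F(B_0),\ e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_t))\bigr] - E^x\bigl[(J_0F(B_0),\ J_0G(B_0))\bigr]\\
&\quad= E^x\bigl[(J_0F(B_0),\ e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_t))\bigr] - E^x\bigl[(J_0F(B_0),\ e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_0))\bigr]\\
&\qquad+ E^x\bigl[(J_0F(B_0),\ e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_0))\bigr] - E^x\bigl[(J_0F(B_0),\ e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_0G(B_0))\bigr]\\
&\qquad+ E^x\bigl[(J_0F(B_0),\ e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_0G(B_0))\bigr] - E^x\bigl[(J_0F(B_0),\ J_0G(B_0))\bigr].
\end{aligned}$$

The first and second differences on the right-hand side above converge to zero as $t\to 0$, since $B_t$ and $J_t$ are continuous in $t$. We will check that the third difference also goes to zero. We have

$$\begin{aligned}
&\bigl|E^x\bigl[(J_0F(B_0),\ e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_0G(B_0))\bigr] - E^x\bigl[(J_0F(B_0),\ J_0G(B_0))\bigr]\bigr|\\
&\quad\le \bigl(E^x[\|\sqrt{\alpha}\mathscr{A}_E(K_t)J_0F(B_0)\|^2]\bigr)^{1/2}\bigl(E^x[\|G(B_t)\|^2]\bigr)^{1/2}.
\end{aligned}$$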


We have a bound

$$E^x[\|\mathscr{A}_E(K_t)J_0F(B_0)\|^2] \le \|\sqrt{N+1}F(x)\|^2\,E^0[\|K_t(x)\|^2_{L^2(\mathbb{R}^{d+1})}],$$

where $K_t(x) = \bigoplus_{j=1}^{d-1}\sum_{\mu=1}^d\int_0^tj_s\rho^j_\mu(x+B_s)\circ dB_s^\mu$. We have

$$E^0[\|K_t(x)\|^2_{L^2(\mathbb{R}^{d+1})}] \le \sum_{j=1}^{d-1}\int_0^tds\,E^x\Bigl[2\sum_{\mu=1}^d\|\rho^j_\mu(B_s)\|^2 + \frac{1}{2}\|\partial\cdot\rho^j(B_s)\|^2\Bigr]. \tag{2.51}$$

Then $\lim_{t\to 0}E^x[\|\mathscr{A}_E(K_t)J_0F(B_0)\|^2] = 0$ follows and the proof is complete.

Theorem 2.18. Suppose Assumption 2.12. Let $V\in\mathscr{K}$. Then $\{T_t\}_{t\ge 0}$ is a strongly continuous one-parameter symmetric semigroup. In particular, there exists a self-adjoint operator $K_{PF}$ bounded below such that

$$e^{-tK_{PF}} = T_t, \quad t\ge 0, \tag{2.52}$$

and

$$e^{-tK_{PF}}F(x) = E^x\bigl[e^{-\int_0^tV(B_s)ds}J_0^*e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tF(B_t)\bigr]. \tag{2.53}$$

Proof. This follows from Lemmas 2.15–2.17.

Definition 2.19 (Generalized Pauli–Fierz Hamiltonians). Suppose Assumption 2.12. We define a generalized Pauli–Fierz Hamiltonian with an external potential $V\in\mathscr{K}$ by the self-adjoint operator $K_{PF}$ in (2.52).

Corollary 2.20. Suppose Assumption 2.12. Let us identify $\mathscr{H}$ with $L^2(\mathbb{R}^d\times Q)$. Then under this identification $e^{i(\pi/2)N}e^{-tK_{PF}}e^{-i(\pi/2)N}$, $t > 0$, is positivity improving. In particular the ground state of $K_{PF}$ is unique if it exists.

Proof. By (2.53), we can see that

$$(F, e^{i(\pi/2)N}e^{-tK_{PF}}e^{-i(\pi/2)N}G) = \int dx\,E^x\bigl[\bigl(J_0F(x),\ e^{-\int_0^tV(B_s)ds}e^{i(\pi/2)N}e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}e^{-i(\pi/2)N}J_tG(B_t)\bigr)\bigr].$$

Since in [15] it is shown that $e^{i(\pi/2)N}e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}e^{-i(\pi/2)N}$ is positivity improving, $(F, e^{i(\pi/2)N}e^{-tK_{PF}}e^{-i(\pi/2)N}G) > 0$ for all $0\le F, G\in\mathscr{H}$ with $F\neq 0$ and $G\neq 0$. Then the corollary follows.

Let $L^p(\mathbb{R}^d; L^2(Q)) = \{f\colon\mathbb{R}^d\to L^2(Q)\mid\int\|f(x)\|^p_{L^2(Q)}\,dx < \infty\}$ and set the $L^p$ norm as $\|F\|_p = \bigl(\int\|F(x)\|^p_{L^2(Q)}\,dx\bigr)^{1/p}$.

Corollary 2.21. Suppose Assumption 2.12. Then $e^{-tK_{PF}}$ can be extended to a bounded operator from $L^p(\mathbb{R}^d; L^2(Q))$ to itself for $1\le p\le\infty$.


Proof. Let $p\neq\infty$, $p\neq 1$ and $\frac{1}{p} + \frac{1}{q} = 1$. Then we have

$$\|e^{-tK_{PF}}F(x)\|^p_{L^2(Q)} \le \bigl(E^x[e^{-\int_0^tV(B_s)ds}\|F(B_t)\|]\bigr)^p \le \bigl(E^x[e^{-q\int_0^tV(B_s)ds}]\bigr)^{p/q}E^x[\|F(B_t)\|^p_{L^2(Q)}].$$

Thus we have

$$\int\|e^{-tK_{PF}}F(x)\|^p_{L^2(Q)}\,dx \le C\int\|F(x)\|^p_{L^2(Q)}\,dx.$$

In the case of $p = \infty$ and $p = 1$, the proof is similar.

2.4. Quadratic form and $K_{PF}$

By the functional integral representation, we have the so-called diamagnetic inequality

$$|(F, e^{-tH_{PF}}G)| \le (|F|,\ e^{-t(H_p+H_f(m))}|G|). \tag{2.54}$$

By means of the diamagnetic inequality, we can see that when $|V|^{1/2}$ is relatively bounded with respect to $(p^2/2)^{1/2}$ with a relative bound $a\ge 0$, it is also relatively bounded with respect to $(\frac{1}{2}(p+\sqrt{\alpha}\mathscr{A})^2 + H_f(m))^{1/2}$ with a relative bound $\le a$. See [14]. Let $V = V_+ - V_-$ be such that $V_+\in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and $V_-$ is infinitesimally small with respect to $p^2/2$ in the sense of forms. Then under Assumption 2.1 we can define the self-adjoint operator

$$H_{PF} = \frac{1}{2}(p+\sqrt{\alpha}\mathscr{A})^2 + H_f(m)\ \dot{+}\ V_+\ \dot{-}\ V_- \tag{2.55}$$

by the quadratic form sum $\dot{\pm}$.

Theorem 2.22. Let $V\in\mathscr{K}$ and suppose Assumption 2.1. Then $K_{PF} = H_{PF}$, where $H_{PF}$ is defined by (2.55).

Proof. The functional integral representation of $e^{-tH_{PF}}$ for (2.55) can be given by the procedure below [25, 14]. Let

$$V_{n,m}(x) = \begin{cases}n, & V(x)\ge n,\\ V(x), & m < V(x) < n,\\ m, & V(x)\le m.\end{cases}$$

Thus $V_{n,m}\in L^\infty(\mathbb{R}^d)$ and then the functional integral representation of $e^{-tH_{PF}}$ with external potential $V_{n,m}$, which is denoted by $e^{-tH_{PF}(n,m)}$, is given by Proposition 2.4. By the monotone convergence theorem for forms, we can see that $\lim_{n\to\infty}\lim_{m\to-\infty}e^{-tH_{PF}(n,m)} = e^{-tH_{PF}}$, where $H_{PF}$ is defined by (2.55). On the


other hand, the functional integral representation of $I = (F, e^{-tH_{PF}(n,m)}G) = \Re I + i\Im I$ is divided into the positive part and the negative part as $I = (\Re I)_+ - (\Re I)_- + i(\Im I)_+ - i(\Im I)_-$, and each term converges as $n\to\infty$, $m\to-\infty$ by the monotone convergence theorem for integrals. Then the functional integral representation is given by

$$\begin{aligned}
(F, e^{-tH_{PF}}G) &= \lim_{n,m}\int dx\,E^x\bigl[\bigl(J_0F(B_0),\ e^{-\int_0^tV_{n,+}(B_s)ds}e^{+\int_0^tV_{m,-}(B_s)ds}e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_t)\bigr)\bigr]\\
&= \int dx\,E^x\bigl[\bigl(J_0F(B_0),\ e^{-\int_0^tV(B_s)ds}e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_t)\bigr)\bigr].
\end{aligned} \tag{2.56}$$

Since $V\in\mathscr{K}$, we see that $V_+\in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and $V_-$ is infinitesimally small with respect to $p^2/2$ in the sense of forms [6, Theorem 1.12]. Moreover $(F, e^{-tK_{PF}}G)$ equals the right-hand side of (2.56). Then we conclude that $e^{-tH_{PF}} = e^{-tK_{PF}}$. Thus the theorem follows.

3. Pointwise Spatial Exponential Decays

In this section, we show the spatial exponential decay of bound states of $K_{PF}$. Let $\varphi_b$ be a bound state of $K_{PF}$ associated with eigenvalue $E$:

$$K_{PF}\varphi_b = E\varphi_b. \tag{3.1}$$

Assumption 3.1. We say that $V = W + U\in\mathscr{E}$ if and only if $W\in L^1_{\mathrm{loc}}(\mathbb{R}^d)$, $\inf_xW(x) > -\infty$ and $0 > U\in L^p(\mathbb{R}^d)$ for some

$$p\ \begin{cases} = 1, & d = 1,\\ > \dfrac{d}{2}, & d\ge 2.\end{cases}$$

Let $W + U\in\mathscr{E}$ and set $W = W_+ - W_-$, where $W_\pm\ge 0$ is given by $W_+(x) = \max\{0, W(x)\}$ and $W_-(x) = \max\{0, -W(x)\}$. Since $U\in L^p(\mathbb{R}^d)\subset\mathscr{K}_{\mathrm{Kato}}$, $W_-\in L^\infty\subset\mathscr{K}_{\mathrm{Kato}}$ and $W_+\in L^1_{\mathrm{loc}}(\mathbb{R}^d)$, we note that $\mathscr{E}\subset\mathscr{K}$. We set

$$W_\infty = \inf_xW(x). \tag{3.2}$$

A fundamental estimate to show the spatial exponential decay of bound states is the lemma below.

Lemma 3.2. Let $V = W + U\in\mathscr{E}$. Suppose that $\hat\rho^j_\mu\in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$. Then for arbitrary $t, a > 0$ and each $0 < \alpha < 1/2$, there exist constants $D_1$, $D_2$ and $D_3$ such that

$$\|\varphi_b(x)\|_{L^2(Q)} \le D_1e^{D_2\|U\|_pt}e^{Et}\bigl(D_3e^{-\frac{\alpha}{4}\frac{a^2}{t}}e^{-tW_\infty} + e^{-tW_a(x)}\bigr)\|\varphi_b\|_{\mathscr{H}}, \tag{3.3}$$

where $W_a(x) = \inf\{W(y)\mid|x-y| < a\}$.


Proof. It is a slight modification of [5]. Since $\varphi_b = e^{tE}e^{-tK_{PF}}\varphi_b$, we have

$$\varphi_b(x) = E^x\bigl[J_0^*e^{-\int_0^tV(B_s)ds}e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_t\varphi_b(B_t)\bigr]e^{tE}. \tag{3.4}$$

Hence for almost every $x$ it follows that

$$\|\varphi_b(x)\|_{L^2(Q)} \le e^{tE}E^x\bigl[e^{-\int_0^tV(B_s)ds}\|\varphi_b(B_t)\|_{L^2(Q)}\bigr]. \tag{3.5}$$

By this, we have

$$\|\varphi_b(x)\|_{L^2(Q)} \le e^{tE}\bigl(E^x[e^{-4\int_0^tW(B_s)ds}]\bigr)^{1/4}\bigl(E^x[e^{-4\int_0^tU(B_s)ds}]\bigr)^{1/4}\|\varphi_b\|_{\mathscr{H}},$$

where we used the Schwarz inequality and

$$\begin{aligned}
E^x[\|\varphi_b(B_t)\|^2_{L^2(Q)}] &= \int(2\pi t)^{-d/2}e^{-|y|^2/2t}\|\varphi_b(x+y)\|^2_{L^2(Q)}\,dy\\
&= \int e^{-\pi|z|^2}\|\varphi_b(x+\sqrt{2\pi t}z)\|^2_{L^2(Q)}\,dz \le \|\varphi_b\|^2_{\mathscr{H}}.
\end{aligned}$$

Let $A = \{\omega\in\mathscr{X}\mid\sup_{0\le s\le t}|B_s(\omega)| > a\}$. Then it follows from a martingale inequality that

$$E^0[1_A] \le 2P^0(|B_t|\ge a) = 2(2\pi)^{-d/2}S_{d-1}\int_{a/\sqrt{t}}^\infty e^{-r^2/2}r^{d-1}\,dr \le \xi_\alpha e^{-\alpha a^2/t}$$

with some $\xi_\alpha$ for each $0 < \alpha < 1/2$. Thus it follows that

$$\begin{aligned}
E^x[e^{-4\int_0^tW(B_s)ds}] &= E^0[1_Ae^{-4\int_0^tW(B_s+x)ds}] + E^x[1_{A^c}e^{-4\int_0^tW(B_s)ds}]\\
&\le e^{-4tW_\infty}E^0[1_A] + e^{-4tW_a(x)}\\
&\le \xi_\alpha e^{-\alpha a^2/t}e^{-4tW_\infty} + e^{-4tW_a(x)}.
\end{aligned}$$

Next we estimate $E^x[e^{-4\int_0^tU(B_s)ds}]$. Since $U$ is in the Kato class, there exist constants $D_1$ and $D_2$ such that $E^x[e^{-4\int_0^tU(B_s)ds}]\le D_1e^{D_2\|U\|_pt}$ by Lemma 2.11. Setting $D_3 = \xi_\alpha^{1/4}$, we obtain the lemma by the inequality $(a+b)^{1/4}\le a^{1/4} + b^{1/4}$ for $a, b\ge 0$.

For $V = W + U\in\mathscr{E}$, we define

$$\Sigma = \liminf_{|x|\to\infty}V(x). \tag{3.6}$$

Since $U\in L^p(\mathbb{R}^d)$, $\liminf_{|x|\to\infty}U(x) = 0$ and hence

$$\Sigma = \liminf_{|x|\to\infty}W(x). \tag{3.7}$$

Moreover $\Sigma\ge W_\infty$ holds.
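The Gaussian tail estimate used in the proof of Lemma 3.2 can be checked numerically in the simplest setting. The sketch below assumes $d = 1$, where $2P^0(|B_t|\ge a) = 2\,\mathrm{erfc}(a/\sqrt{2t})$, and takes $\xi_\alpha = 2$; both the restriction to one dimension and this value of $\xi_\alpha$ are illustrative assumptions, not constants from the paper. In higher dimensions the factor $r^{d-1}$ in the radial integral is what forces $\alpha$ to be strictly below $1/2$.

```python
import math


def tail_probability(a, t):
    """2 * P^0(|B_t| >= a) for one-dimensional Brownian motion (d = 1)."""
    return 2.0 * math.erfc(a / math.sqrt(2.0 * t))


def gaussian_bound(a, t, alpha, xi=2.0):
    """xi * exp(-alpha * a^2 / t); xi = 2 is an illustrative choice valid for d = 1."""
    return xi * math.exp(-alpha * a * a / t)


if __name__ == "__main__":
    for alpha in (0.1, 0.3, 0.49):
        for a in (0.5, 1.0, 3.0):
            lhs = tail_probability(a, 1.0)
            rhs = gaussian_bound(a, 1.0, alpha)
            print(f"alpha={alpha}, a={a}: tail={lhs:.3e} <= bound={rhs:.3e}")
```

The check rests on the elementary inequality $\mathrm{erfc}(x)\le e^{-x^2}$ for $x\ge 0$, which gives $2\,\mathrm{erfc}(a/\sqrt{2t})\le 2e^{-a^2/(2t)}\le 2e^{-\alpha a^2/t}$ whenever $\alpha\le 1/2$.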


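Theorem 3.3. Suppose that $V = W + U\in\mathscr{E}$ and $\hat\rho^j_\mu\in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$.

Confining Case 1. Suppose that $W(x)\ge\gamma|x|^{2n}$ outside a compact set $K$ for some $n > 0$ and some $\gamma > 0$. Let $0 < \alpha < 1/2$. Then there exists a constant $C_1$ such that

$$\|\varphi_b(x)\|_{L^2(Q)} \le C_1\exp\Bigl(-\frac{\alpha c}{16}|x|^{n+1}\Bigr)\|\varphi_b\|_{\mathscr{H}}, \tag{3.8}$$

where $c = \inf_{x\in\mathbb{R}^d\setminus K}W_{|x|/2}(x)/|x|^{2n}$.

Confining Case 2. Suppose that $\lim_{|x|\to\infty}W(x) = \infty$. Then there exist constants $C$ and $\delta$ such that

$$\|\varphi_b(x)\|_{L^2(Q)} \le C\exp(-\delta|x|)\|\varphi_b\|_{\mathscr{H}}. \tag{3.9}$$

Non-Confining Case. Suppose that $\Sigma > E$ and $\Sigma > W_\infty$. Let $0 < \beta < 1$. Then there exists a constant $C_2$ such that

$$\|\varphi_b(x)\|_{L^2(Q)} \le C_2\exp\Bigl(-\frac{\beta}{8\sqrt{2}}\frac{(\Sigma - E)}{\sqrt{\Sigma - W_\infty}}|x|\Bigr)\|\varphi_b\|_{\mathscr{H}}. \tag{3.10}$$

Proof. Since $\sup_x\|\varphi_b(x)\|_{L^2(Q)} < \infty$, it is enough to show all the statements for sufficiently large $|x|$.

Confining Case 1. Note that $W_{|x|/2}(x)\ge c|x|^{2n}$ for $x\in\mathbb{R}^d\setminus K$. Then we have bounds for $x\in\mathbb{R}^d\setminus K$:

$$|x|\,W_{|x|/2}(x)^{1/2} \ge c|x|^{n+1}, \tag{3.11}$$

$$|x|\,W_{|x|/2}(x)^{-1/2} \le c|x|^{1-n}. \tag{3.12}$$

Inserting $t = t(x) = W_{|x|/2}(x)^{-1/2}|x|$ and $a = a(x) = \frac{|x|}{2}$ in (3.3), we have

$$\|\varphi_b(x)\| \le e^{-\frac{\alpha}{16}c|x|^{n+1}}D_1e^{(D_2\|U\|_p+E)c|x|^{1-n}}\bigl(D_3e^{c|x|^{1-n}|W_\infty|} + e^{-(1-\frac{\alpha}{16})c|x|^{n+1}}\bigr)\|\varphi_b\|_{\mathscr{H}} \tag{3.13}$$

for $x\in\mathbb{R}^d\setminus K$. Then (3.8) follows.

Non-Confining Case. Rewrite formula (3.3) as

$$\|\varphi_b(x)\| \le D_1e^{D_2\|U\|_pt}\bigl(D_3e^{-\frac{\alpha}{4}\frac{a^2}{t}}e^{-t(W_\infty-E)} + e^{-t(W_a(x)-E)}\bigr)\|\varphi_b\|_{\mathscr{H}}. \tag{3.14}$$

Then, keeping both $\Sigma = \liminf_{|x|\to\infty}(-W_-(x))$ and $\Sigma > W_\infty$, it is possible to choose a decomposition $V = W + U\in\mathscr{E}$ such that $D_2\|U\|_p\le(\Sigma - E)/2$, since $\liminf_{|x|\to\infty}U(x) = 0$. Inserting $t = t(x) = \varepsilon|x|$ and $a = a(x) = \frac{|x|}{2}$ in (3.14),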


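we have

$$\begin{aligned}
\|\varphi_b(x)\| &\le D_1e^{D_2\|U\|_p\varepsilon|x|}\bigl(D_3e^{-\frac{\alpha}{16\varepsilon}|x|}e^{-\varepsilon|x|(W_\infty-E)} + e^{-\varepsilon|x|(W_{|x|/2}(x)-E)}\bigr)\|\varphi_b\|_{\mathscr{H}}\\
&\le D_1\bigl(D_3e^{-(\frac{\alpha}{16\varepsilon}+\varepsilon(W_\infty-E)-\frac{\varepsilon}{2}(\Sigma-E))|x|} + e^{-\varepsilon((W_{|x|/2}(x)-E)-\frac{1}{2}(\Sigma-E))|x|}\bigr)\|\varphi_b\|_{\mathscr{H}}.
\end{aligned}$$

Choosing $\varepsilon = \dfrac{\sqrt{\alpha/16}}{\sqrt{\Sigma - W_\infty}}$, so that $\frac{\alpha}{16\varepsilon} = \varepsilon(\Sigma - W_\infty)$, the exponent on the first term above turns out to be

$$\frac{\alpha}{16\varepsilon} + \varepsilon(W_\infty - E) - \frac{\varepsilon}{2}(\Sigma - E) = \frac{\varepsilon}{2}(\Sigma - E).$$

Moreover we see that $\liminf_{|x|\to\infty}W_{|x|/2}(x) = \Sigma$, and obtain

$$\|\varphi_b(x)\|_{L^2(Q)} \le C_2e^{-\frac{\varepsilon}{2}(\Sigma-E)|x|}\|\varphi_b\|_{\mathscr{H}}$$

for sufficiently large $|x|$. Then (3.10) follows.

Confining Case 2. Finally, we prove confining case 2. In this case for arbitrary $c > 0$ there exists $N$ such that $W_{|x|/2}(x)\ge c$ for all $|x| > N$. Inserting $t = t(x) = \varepsilon|x|$ and $a = a(x) = \frac{|x|}{2}$ in (3.3), we obtain that

$$\begin{aligned}
\|\varphi_b(x)\| &\le D_1e^{D_2\|U\|_p\varepsilon|x|}\bigl(D_3e^{-\frac{\alpha}{16\varepsilon}|x|}e^{-\varepsilon|x|(W_\infty-E)} + e^{-\varepsilon|x|(W_{|x|/2}(x)-E)}\bigr)\|\varphi_b\|_{\mathscr{H}}\\
&\le D_1\bigl(D_3e^{-(\frac{\alpha}{16\varepsilon}-\varepsilon D_2\|U\|_p+\varepsilon(W_\infty-E))|x|} + e^{-\varepsilon|x|(c-E-D_2\|U\|_p)}\bigr)\|\varphi_b\|_{\mathscr{H}}
\end{aligned}$$

for $|x| > N$. Choosing sufficiently large $c$ and sufficiently small $\varepsilon$ such that

$$\frac{\alpha}{16\varepsilon} - \varepsilon D_2\|U\|_p + \varepsilon(W_\infty - E) > 0, \qquad c - E - D_2\|U\|_p > 0,$$

we have $\|\varphi_b(x)\|\le C'e^{-\delta'|x|}$ for sufficiently large $|x|$. Then (3.9) follows.

We give several remarks on Theorem 3.3.

Independence of Bose Mass $m$. Suppose that $\omega(k) = \sqrt{|k|^2 + m^2}$. Let $\varphi_b$ be a normalized ground state of $K_{PF}$, $\|\varphi_b\|_{\mathscr{H}} = 1$, and $E_m = \inf\sigma(K_{PF})$. It is shown that there exist constants $C_1$ and $C_2$ such that

$$\|\varphi_b(x)\|_{L^2(Q)} \le C_1e^{-C_2|x|^n}, \quad n\ge 1,$$

by Theorem 3.3. Since the ground state energy $E_m$ is decreasing in $m$, we can take $C_1$ and $C_2$ independent of $m < M$ with some $M$. This fact is nontrivial and useful to show the existence of ground states of the Pauli–Fierz model with $m = 0$. This is used in, e.g., [13].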
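Returning to the non-confining case in the proof of Theorem 3.3, the choice of $\varepsilon$ can be verified by direct substitution. The computation below is a sketch; the identification $\beta = \sqrt{2\alpha}$ between the parameter $0<\alpha<1/2$ of Lemma 3.2 and the parameter $0<\beta<1$ of (3.10) is our reading of the statement and is not spelled out in the text.

```latex
\begin{align*}
\varepsilon = \frac{\sqrt{\alpha/16}}{\sqrt{\Sigma - W_\infty}}
  \quad &\Longrightarrow \quad
  \frac{\alpha}{16\,\varepsilon} = \varepsilon\,(\Sigma - W_\infty), \\[1mm]
\frac{\alpha}{16\,\varepsilon} + \varepsilon\,(W_\infty - E) - \frac{\varepsilon}{2}(\Sigma - E)
  &= \varepsilon\,(\Sigma - E) - \frac{\varepsilon}{2}(\Sigma - E)
   = \frac{\varepsilon}{2}(\Sigma - E) \\
  &= \frac{\sqrt{\alpha}}{8}\,\frac{\Sigma - E}{\sqrt{\Sigma - W_\infty}}
   = \frac{\beta}{8\sqrt{2}}\,\frac{\Sigma - E}{\sqrt{\Sigma - W_\infty}},
   \qquad \beta = \sqrt{2\alpha} \in (0,1).
\end{align*}
```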


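Condition $W_\infty < \Sigma$. When $\inf_xV(x) < \Sigma$, it is possible to decompose $V = W + U\in\mathscr{E}$ such that $W_\infty < \Sigma$. In fact, for arbitrary $\varepsilon > 0$, there exists $y\in\mathbb{R}^d$ such that

$$V(y) < \inf_xV(x) + \varepsilon.$$

Suppose that $\inf_xV(x) + \varepsilon < \Sigma$. Let $O_y\subset\mathbb{R}^d$ be a neighborhood of $y$. Then define

$$u(x) = \begin{cases}U(x), & x\in O_y,\\ 0, & x\notin O_y.\end{cases}$$

Let $\tilde W = W + u$ and $\tilde U = U - u$. This yields that $V = \tilde W + \tilde U\in\mathscr{E}$ and $\tilde W_\infty < \inf_xV(x) + \varepsilon < \Sigma$.

Threshold. The threshold is defined by

$$\Sigma_\infty = \lim_{R\to\infty}\inf_{F\in D_R,\ \|F\|=1}(F, H_{PF}F),$$

where $D_R = \{F\in D(H_{PF})\mid F(x) = 0,\ |x| < R\}$. We note that $\Sigma_\infty\ge\Sigma$, and $\Sigma = \Sigma_\infty = \infty$ in confining cases. The bound given in [10] is $\|e^{+C|\cdot|}1_{(-\infty,\lambda]}(H_{PF})\|_{\mathscr{H}} < \infty$, where $C^2 + \lambda < \Sigma_\infty$. From this the bound

$$\int dx\,e^{+\delta|x|}\|\varphi_b(x)\|^2_{L^2(Q)} \le C\|\varphi_b\|_{\mathscr{H}} \tag{3.15}$$

follows, where $\delta < \Sigma_\infty - E$.

Theorem 3.3, however, gives pointwise bounds:

$$\|\varphi_b(x)\|_{L^2(Q)} \le C_1\exp(-C_2|x|^\beta)\|\varphi_b\|_{\mathscr{H}}, \quad \beta\ge 1. \tag{3.16}$$

In particular, the superexponential decay, $\|\varphi_b(x)\|\le C_1e^{-C_2|x|^{n+1}}\|\varphi_b\|_{\mathscr{H}}$, is shown for the case of polynomially increasing potentials (Confining Case 1), while in non-confining cases, we show that in (3.16), $\beta = 1$ and

$$C_2 < \frac{\Sigma - E}{8\sqrt{2}\sqrt{\Sigma - W_\infty}}. \tag{3.17}$$

We give examples of external potentials.

Example 3.4 (Confining Potentials). Let $V = V_+ - V_-$ be such that $V_+\in L^p_{\mathrm{loc}}(\mathbb{R}^d)$ and $V_-\in L^p(\mathbb{R}^d)$, where

$$p\ \begin{cases} = 1, & d = 1,\\ > \dfrac{d}{2}, & d\ge 2.\end{cases}$$

In this case $V\in\mathscr{E}$.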


Example 3.5 (Coulomb Potentials). Suppose Assumption 2.1. Then $H_{PF} = K_{PF}$. Let $V = -\alpha Z/|x|$ be the Coulomb potential. Then $\inf\sigma(H_p) = -\alpha Z/2$. We have $(\phi\otimes 1, H_{PF}\,\phi\otimes 1)_{\mathscr{H}} = (\phi, (H_p + V_{\mathrm{eff}})\phi)_{L^2(\mathbb{R}^d)}$ for $\phi\in D(\frac{1}{2}p^2)$, where

$$V_{\mathrm{eff}}(x) = \frac{\alpha}{2}\sum_{j=1}^{d-1}\sum_{\mu,\nu=1}^d(\rho^j_\mu(x), \rho^j_\nu(x))_{L^2(\mathbb{R}^d)}.$$

Let $V_\infty = \sup_x\bigl|\sum_{j=1}^{d-1}\sum_{\mu,\nu=1}^d(\rho^j_\mu(x), \rho^j_\nu(x))_{L^2(\mathbb{R}^d)}\bigr|$. Thus

$$\inf\sigma(H_{PF}) \le -\frac{\alpha}{2}(Z - V_\infty).$$

When $Z > V_\infty$, $\inf\sigma(H_{PF}) < \lim_{|x|\to\infty}V(x) = 0$ follows for all values of the coupling constant $\alpha$. Then ground states of $H_{PF}$ decay as $C_1e^{-C_2|x|}$ pointwise for all values of coupling constants.

Acknowledgments

FH acknowledges support of Grant-in-Aid for Science Research (B) 20340032 from JSPS and Grant-in-Aid for Challenging Exploratory Research 22654018 from JSPS.

Appendix

In this appendix, we show the unitary equivalence between $H_{PF}$ and the Pauli–Fierz Hamiltonian defined on $L^2(\mathbb{R}^d)\otimes\mathscr{F}$, where $\mathscr{F} = \bigoplus_{n=0}^\infty\otimes_s^n(\oplus^{d-1}L^2(\mathbb{R}^d))$ is the Boson Fock space over $\oplus^{d-1}L^2(\mathbb{R}^d)$. Let $\Omega = \{1, 0, 0, \dots\}\in\mathscr{F}$ be the Fock vacuum. The creation operator and the annihilation operator in $\mathscr{F}$ are denoted by $a^*(f)$ and $a(f)$, respectively, where $f = (f_1,\dots,f_{d-1})\in\oplus^{d-1}L^2(\mathbb{R}^d)$. They satisfy canonical commutation relations:

$$[a(f), a^*(g)] = \sum_{j=1}^{d-1}(\bar f_j, g_j)_{L^2(\mathbb{R}^d)}, \qquad [a^*(f), a^*(g)] = 0 = [a(f), a(g)].$$

The field operator in $\mathscr{F}$ is given by

$$A(\hat\phi) = \frac{1}{\sqrt{2}}\bigl(a^*(\hat\phi) + a(\tilde{\hat\phi})\bigr),$$

where $\tilde{\hat\phi}(k) = \hat\phi(-k)$. The quantized radiation field is defined by $A_\mu = \int^\oplus_{\mathbb{R}^d}A_\mu(x)\,dx$ under the identification $L^2(\mathbb{R}^d)\otimes\mathscr{F}\cong L^2(\mathbb{R}^d;\mathscr{F})$ and $A_\mu(x) = A(\hat\rho_\mu(x))$, where a


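cutoff function is given by $\hat\rho_\mu(x) = \hat\rho_\mu(k, x) = \oplus_{j=1}^{d-1} \hat\phi^j_\mu(k)\Psi(k, x)/\sqrt{\omega(k)}$. Finally the free field Hamiltonian is defined by

\[ d\Gamma(\omega) = \bigoplus_{k=0}^{\infty} \sum_{i=1}^{k} \underbrace{1 \otimes \cdots \overset{i}{\omega} \cdots \otimes 1}_{k}. \tag{A.1} \]

Then the Pauli–Fierz Hamiltonian in $L^2(\mathbb R^d) \otimes \mathcal F$ is given by

\[ \hat H_{PF} = \frac12 (p \otimes 1 + \sqrt{\alpha}\,A)^2 + V \otimes 1 + 1 \otimes d\Gamma(\omega). \tag{A.2} \]

Suppose that $V$ is relatively bounded with respect to $\tfrac12 p^2$ with a relative bound strictly smaller than one, and that $\hat\rho^j_\mu \in C_b^1(\mathbb R^d_x; L^2(\mathbb R^d_k))$ and

\[ \sqrt{\omega}\,\hat\rho^j_\mu,\ \hat\rho^j_\mu,\ \hat\rho^j_\mu/\sqrt{\omega},\ \partial_{x_\mu}\hat\rho^j_\mu,\ \partial_{x_\mu}\hat\rho^j_\mu/\sqrt{\omega} \in L^\infty(\mathbb R^d_x; L^2(\mathbb R^d_k)). \tag{A.3} \]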

See Assumption 2.1. Then $\hat H_{PF}$ is self-adjoint on $D(p^2 \otimes 1) \cap D(1 \otimes d\Gamma(\omega))$. Now let us see the relationship between $L^2(Q)$ and $\mathcal F$. Let $U : \mathcal F \to L^2(Q)$ be defined by $U\Omega = 1$,

\[ U\, {:}A(\hat\phi_1) \cdots A(\hat\phi_n){:}\, \Omega = {:}\mathcal A(\phi_1) \cdots \mathcal A(\phi_n){:}, \]

where the Wick product on the left-hand side is defined by moving all the creation operators to the left and annihilation operators to the right without any commutation relations, while the Wick product on the right-hand side is defined recursively by ${:}\mathcal A(\phi){:} = \mathcal A(\phi)$ and

\[ {:}\mathcal A(\phi) \prod_{j=1}^{n} \mathcal A(\phi_j){:} = \mathcal A(\phi)\, {:}\prod_{j=1}^{n} \mathcal A(\phi_j){:} - \frac12 \sum_{k=1}^{n} (f_k, f)\, {:}\prod_{j \ne k} \mathcal A(\phi_j){:}. \]

The operator $U$ can be extended to a unitary operator from $\mathcal F$ to $L^2(Q)$, and it also implements $U\, d\Gamma(\omega)\, U^{-1} = H_f(m)$. Then under (A.3) it follows that $(1 \otimes U)$ maps $D(\tfrac12 p^2 \otimes 1) \cap D(1 \otimes d\Gamma(\omega))$ to $D(\tfrac12 p^2 \otimes 1) \cap D(1 \otimes H_f(m))$ and

\[ (1 \otimes U)\, \hat H_{PF}\, (1 \otimes U^{-1}) = H_{PF}. \tag{A.4} \]

References

[1] M. Aizenman and B. Simon, Brownian motion and Harnack's inequality for Schrödinger operators, Comm. Pure Appl. Math. 35 (1982) 209–270.
[2] V. Bach, J. Fröhlich and I. M. Sigal, Spectral analysis for systems of atoms and molecules coupled to the quantized radiation field, Comm. Math. Phys. 207 (1999) 249–290.
[3] K. Broderix, D. Hundertmark and H. Leschke, Continuity properties of Schrödinger semigroups with magnetic fields, Rev. Math. Phys. 12 (2000) 181–225.


[4] V. Betz, F. Hiroshima, J. Lőrinczi, R. A. Minlos and H. Spohn, Ground state properties of the Nelson Hamiltonian: a Gibbs measure-based approach, Rev. Math. Phys. 14 (2002) 173–198.
[5] R. Carmona, Pointwise bounds for Schrödinger operators, Comm. Math. Phys. 62 (1978) 97–106.
[6] H. L. Cycon, R. G. Froese, W. Kirsch and B. Simon, Schrödinger Operators (Springer-Verlag, 1987).
[7] C. Fefferman, J. Fröhlich and G. M. Graf, Stability of ultraviolet-cutoff quantum electrodynamics with non-relativistic matter, Comm. Math. Phys. 190 (1997) 309–330.
[8] C. Gérard, F. Hiroshima, A. Panati and A. Suzuki, Infrared divergence of a scalar quantum field model on a pseudo Riemannian manifold, Interdiscip. Inform. Sci. 15 (2009) 399–421.
[9] M. Gubinelli, Gibbs measures for self-interacting Wiener paths, Mark. Proc. Rel. Fields 12 (2006) 747–766.
[10] M. Griesemer, Exponential decay and ionization thresholds in non-relativistic quantum electrodynamics, J. Funct. Anal. 210 (2004) 321–340.
[11] M. Griesemer, E. Lieb and M. Loss, Ground states in non-relativistic quantum electrodynamics, Invent. Math. 145 (2001) 557–595.
[12] D. Hasler and I. Herbst, On the self-adjointness and domain of Pauli–Fierz type Hamiltonians, Rev. Math. Phys. 20 (2008) 787–800.
[13] T. Hidaka, On the existence of ground states for the Pauli–Fierz model with a variable mass, preprint (2010).
[14] F. Hiroshima, Functional integral representation of a model in quantum electrodynamics, Rev. Math. Phys. 9 (1997) 489–530.
[15] F. Hiroshima, Ground states of a model in nonrelativistic quantum electrodynamics II, J. Math. Phys. 41 (2000) 661–674.
[16] F. Hiroshima, Essential self-adjointness of translation invariant quantum field models for arbitrary coupling constants, Comm. Math. Phys. 211 (2000) 585–613.
[17] F. Hiroshima, Self-adjointness of the Pauli–Fierz Hamiltonian for arbitrary values of coupling constants, Ann. Henri Poincaré 3 (2002) 171–201.
[18] F. Hiroshima, Fiber Hamiltonians in nonrelativistic quantum electrodynamics, J. Funct. Anal. 252 (2007) 314–355.
[19] F. Hiroshima, T. Ichinose and J. Lőrinczi, Path integral representation for Schrödinger operator with Bernstein function of the Laplacian, preprint (2009).
[20] F. Hiroshima and J. Lőrinczi, Functional integral representations of the Pauli–Fierz model with spin 1/2, J. Funct. Anal. 254 (2008) 2127–2185.
[21] T. Ikebe, Eigenfunction expansion associated with the Schrödinger operators and their applications to scattering theory, Arch. Ration. Mech. Anal. 5 (1960) 1–34.
[22] J. Lőrinczi, R. A. Minlos and H. Spohn, The infrared behaviour in Nelson's model of a quantum particle coupled to a massless scalar field, Ann. Henri Poincaré 3 (2002) 1–28.
[23] E. Nelson, Schrödinger particles interacting with a quantized scalar field, in Proc. Conf. Analysis in Function Space, eds. W. T. Martin and I. Segal (MIT Press, Cambridge 1964), p. 87.
[24] B. Simon, The P(φ)2 Euclidean (Quantum) Field Theory (Princeton Univ. Press, 1974).
[25] B. Simon, Functional Integral Representation and Quantum Physics (Academic Press, 1979).
[26] B. Simon, Schrödinger semigroups, Bull. Amer. Math. Soc. 7 (1982) 447–526.


[27] B. Simon, Kato's inequality and the comparison of semigroups, J. Funct. Anal. 32 (1979) 97–101.
[28] H. Spohn, Ground state of quantum particle coupled to a scalar boson field, Lett. Math. Phys. 44 (1998) 9–16.
[29] H. Spohn, Dynamics of Charged Particles and their Radiation Field (Cambridge University Press, 2004).

November 16, J070-S0129055X10004193

2010 15:28 WSPC/S0129-055X

148-RMP

Reviews in Mathematical Physics Vol. 22, No. 10 (2010) 1209–1240 © World Scientific Publishing Company DOI: 10.1142/S0129055X10004193

EIGENFUNCTION EXPANSIONS AND SPACETIME ESTIMATES FOR GENERATORS IN DIVERGENCE-FORM

MATANIA BEN-ARTZI
Institute of Mathematics, Hebrew University, Jerusalem 91904, Israel
[email protected]

Received 28 January 2010
Revised 10 August 2010

Let $H = -\sum_{j,k=1}^{n} \frac{\partial}{\partial x_j} a_{j,k}(x) \frac{\partial}{\partial x_k}$ be a formally self-adjoint (elliptic) operator in $L^2(\mathbb R^n)$, $n \ge 2$. The real coefficients $a_{j,k}(x) = a_{k,j}(x)$ are assumed to be bounded and to coincide with those of $-\Delta$ outside of a ball. The paper deals with two topics: (i) An eigenfunction expansion theorem, proving in particular that $H$ is unitarily equivalent to $-\Delta$, and (ii) Global spacetime estimates for the associated inhomogeneous wave equation, proved under suitable ("nontrapping") additional assumptions on the coefficients. The main tool used here is a Limiting Absorption Principle (LAP) in the framework of weighted Sobolev spaces, which holds also at the threshold.

Keywords: Divergence-type operator; limiting absorption principle; eigenfunction expansion; spacetime estimates.

Mathematics Subject Classification 2010: 35J15, 35L15, 47F05

1. Introduction

Let $H = -\sum_{j,k=1}^{n} \partial_j a_{j,k}(x) \partial_k$, where $a_{j,k}(x) = a_{k,j}(x)$, be a formally self-adjoint operator in $L^2(\mathbb R^n)$, $n \ge 2$. The notations $\partial_j = \frac{\partial}{\partial x_j}$ and $\partial_t = \frac{\partial}{\partial t}$ are used throughout the paper. We assume that the real measurable matrix function $a(x) = \{a_{j,k}(x)\}_{1 \le j,k \le n}$ satisfies, with some positive constants $a_1 > a_0 > 0$, $\Lambda_0 > 0$,

\[ a_0 I \le a(x) \le a_1 I, \qquad x \in \mathbb R^n, \tag{1.1} \]

\[ a(x) = I \qquad \text{for } |x| > \Lambda_0. \tag{1.2} \]

In what follows, we shall use the notation $H = -\nabla \cdot a(x)\nabla$. We retain the notation $H$ for the self-adjoint (Friedrichs) extension associated with the form $(a(x)\nabla\varphi, \nabla\psi)$, where $(\,,)$ is the scalar product in $L^2(\mathbb R^n)$. When $a(x) \equiv I$ we set $H = H_0 = -\Delta$. Operators of this type appear in geometry (Laplacian on noncompact Riemannian manifolds) as well as in physics, typically when physical parameters vary in space (such as the acoustic propagator in a medium with variable speed of sound).
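To make the quadratic form $(a(x)\nabla\varphi, \nabla\varphi)$ and the bounds (1.1) concrete, here is a minimal numerical sketch. It is not taken from the paper: it assumes a one-dimensional finite-difference discretization on a bounded interval with zero boundary values (the paper works on $\mathbb R^n$ with $n \ge 2$), and the function names, grid sizes and the sample discontinuous coefficient are hypothetical choices made only for illustration.

```python
import numpy as np

def divergence_form_matrix(a_mid, h):
    """Finite-difference matrix of u -> -(a u')' with zero boundary values.

    a_mid[i] is the coefficient at the midpoint between grid nodes i and i+1;
    nodes 0 and N+1 carry the boundary value 0, so len(a_mid) == N + 1.
    """
    a_mid = np.asarray(a_mid, dtype=float)
    diag = (a_mid[:-1] + a_mid[1:]) / h**2
    off = -a_mid[1:-1] / h**2
    return np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)

def quadratic_form(a_mid, h, u):
    """Discrete analogue of (a u', u'), with the same scaling as u @ H @ u."""
    du = np.diff(np.concatenate(([0.0], u, [0.0]))) / h
    return float(np.sum(a_mid * du**2))

# Hypothetical sample data: a is discontinuous and equals 1 for |x| > Lambda0.
N, L = 200, 10.0
h = 2 * L / (N + 1)
x_mid = -L + h * (np.arange(N + 1) + 0.5)
a0, a1, Lambda0 = 0.5, 2.0, 3.0
a_mid = np.where(np.abs(x_mid) < 1.0, a1,
                 np.where(np.abs(x_mid) <= Lambda0, a0, 1.0))

H = divergence_form_matrix(a_mid, h)
H0 = divergence_form_matrix(np.ones(N + 1), h)  # discrete analogue of -Laplacian

u = np.random.default_rng(0).standard_normal(N)
form_H = u @ H @ u
form_H0 = u @ H0 @ u
```

In this discrete setting the matrix is symmetric, its form coincides with the weighted sum of squared differences, and the two-sided bound $a_0 (H_0 u, u) \le (Hu, u) \le a_1 (H_0 u, u)$ mirrors (1.1).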


Under our assumptions (1.1) and (1.2), it follows that $\sigma(H)$, the spectrum of $H$, is the half-axis $[0, \infty)$, and is entirely continuous. In particular, the equality $(Hu, u) = (a(x)\nabla u, \nabla u)$ shows that $H$ has no eigenvalue at zero. In addition, if the coefficient matrix $a(x)$ is smooth, the absence of singular continuous spectrum follows from the classical work of Mourre ([58]). However, it seems that there is no proof in the literature establishing the absolute continuity of the spectrum in our case of non-smooth (and even discontinuous) coefficients. This fact is implied by our Theorem A stated in Sec. 3 below. The "threshold" $z = 0$ plays a special role in this setting, as we shall see later.

The mere fact that both $H$ and $H_0$ are spectrally absolutely continuous over $[0, \infty)$ does not imply that they are "identical", namely, in the functional analytic setting, that they are "unitarily equivalent". Thus one question that arises is:

Question 1. Are the operators $H$ and $H_0$ unitarily equivalent, under the above assumptions on the coefficients?

We next recall the definition of the wave operators related to $H$, $H_0$ [50, Chap. X]. Consider the family of unitary operators

\[ W(t) = \exp(itH)\exp(-itH_0), \qquad -\infty < t < \infty. \]

The strong limits

\[ W_\pm(H, H_0) = \mathop{\text{s-}\lim}_{t \to \pm\infty} W(t), \tag{1.3} \]

if they exist, are called the wave operators (relating $H$, $H_0$). These operators play an important role in scattering theory. They are clearly isometries. If the range of $W_+$ is equal to the absolutely continuous subspace of $H$ (which here is $L^2(\mathbb R^n)$ itself), we say that it is complete, with a similar definition for $W_-$. If either one is complete, then it is unitary (in the case at hand) and provides a unitary equivalence between $H$ and $H_0$. A second question that arises therefore is:

Question 2. Do the wave operators exist and, if so, are they complete?

As noted above, a positive answer to this question entails a positive answer to the first question.

Another aspect related to the spectral theory of $H$ is its associated eigenfunction expansion. When available, it serves as an analytic tool which is sharper than the abstract spectral theorem. In the case of $H_0$, the Fourier transform

\[ (\mathcal F g)(\xi) = \hat g(\xi) = (2\pi)^{-\frac n2} \int_{\mathbb R^n} g(x) e^{-i\xi x}\,dx, \tag{1.4} \]

serves to express $g(x)$ as

\[ g(x) = (2\pi)^{-\frac n2} \int_{\mathbb R^n} \hat g(\xi) e^{i\xi x}\,d\xi, \tag{1.5} \]


which can be viewed as an "expansion" of $g$ in terms of the "generalized eigenfunctions" (or "modes") $\exp(i\xi x)$, associated with the eigenvalues $|\xi|^2$. Furthermore, the operator $\mathcal F$ is unitary and $\mathcal F H_0 \mathcal F^{-1}$ is just multiplication by $|\xi|^2$ in Fourier space. Such ("diagonalizing") expansions have been used extensively in quantum mechanics (for example, the Airy transform associated with the Stark Hamiltonian). It is therefore natural to pose the following question:

Question 3. Can one associate a similar "eigenfunction expansion" with the operator $H$? More specifically, can one replace the exponentials $\exp(i\xi x)$ by some approximating generalized eigenfunctions ("distorted plane waves") so that the resulting transform remains unitary and diagonalizes the operator?

As a final topic in this paper, we turn back to the evolution (unitary) group $\exp(-itH)u_0$, which solves the Schrödinger equation

\[ i\partial_t u = Hu, \qquad u(0) = u_0. \]
The last 30 years have seen very intensive research on the global (spacetime) properties of these solutions, known as "Strichartz and smoothing" estimates. Instead of treating the Schrödinger equation we choose here to address the generalized wave equation,

\[ \partial_t^2 u = -Hu + f, \tag{1.6} \]

subject to initial conditions $u(0) = u_0$, $\partial_t u(0) = v_0$. The conservation of energy for this equation (in the homogeneous case, $f = 0$) is given by

\[ \int_{\mathbb R^n} \bigl[|H^{\beta}\partial_t u(x, t)|^2 + |H^{\beta + \frac12} u(x, t)|^2\bigr]\,dx = \int_{\mathbb R^n} \bigl[|H^{\beta} v_0(x)|^2 + |H^{\beta + \frac12} u_0(x)|^2\bigr]\,dx, \tag{1.7} \]

for any $\beta \in \mathbb R$, and any $t \in \mathbb R$. In this context, the dispersive character of the equation means that the solution "escapes" from any bounded set, as $|t| \to \infty$, in some average sense. We would like to estimate this decay in terms of the initial energy norm, namely, the right-hand side of (1.7). We therefore ask:

Question 4. Can one establish global $L^2$ spacetime estimates for solutions of (1.6) in terms of the initial energy norm?

In this paper, we answer affirmatively the first three questions. As for Question 4, we provide such estimates by imposing restrictive hypotheses on the coefficient matrix. The precise statements, as well as discussions of the relevant bibliography for each topic, are given in Sec. 3.
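The conservation law (1.7) can be checked by a short formal computation. The sketch below is not from the paper; it assumes that $u$ is regular enough for all terms to make sense and uses only the self-adjointness of $H$ and the homogeneous equation $\partial_t^2 u = -Hu$.

```latex
\begin{align*}
\frac{d}{dt}\Bigl[\|H^{\beta}\partial_t u\|_0^2 + \|H^{\beta+\frac12}u\|_0^2\Bigr]
 &= 2\,\mathrm{Re}\,(H^{\beta}\partial_t^2 u,\, H^{\beta}\partial_t u)
  + 2\,\mathrm{Re}\,(H^{\beta+\frac12}\partial_t u,\, H^{\beta+\frac12}u) \\
 &= -2\,\mathrm{Re}\,(H^{\beta+1}u,\, H^{\beta}\partial_t u)
  + 2\,\mathrm{Re}\,(H^{\beta}\partial_t u,\, H^{\beta+1}u) \\
 &= 0 .
\end{align*}
```

The first term uses $\partial_t^2 u = -Hu$, the second moves $H^{\frac12}$ across the scalar product, and the two real parts cancel since $\mathrm{Re}\,(v, w) = \mathrm{Re}\,(w, v)$.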


The main technical tool used here consists of a close study of the properties of the resolvent $R(z)$ as $z$ approaches the real axis. To be more specific, we introduce the general notion of the "continuity up to the spectrum" of the resolvent.

Definition 1.1. Let $[\alpha, \beta] \subseteq \mathbb R$. We say that $H$ satisfies the "Limiting Absorption Principle" (LAP) in $[\alpha, \beta]$ if $R(z)$, $z \in \mathbb C^{\pm}$, can be extended continuously to $\mathrm{Im}\, z = 0$, $\mathrm{Re}\, z \in [\alpha, \beta]$, in a suitable operator topology. In this case we denote the limiting values by $R^{\pm}(\lambda)$, $\alpha \le \lambda \le \beta$.

The precise specification of the operator topology in the above definition is left open. Typically, it will be the uniform operator topology associated with weighted-$L^2$ or Sobolev spaces, which are introduced in Sec. 2. Note that the limiting values $R^{-}(\lambda)$ are, generally speaking, different from $R^{+}(\lambda)$. In fact, one has (formally) the "Stieltjes formula"

\[ A(\lambda) = \frac{1}{2\pi i}\bigl(R^{+}(\lambda) - R^{-}(\lambda)\bigr) = \frac{d}{d\lambda} E(\lambda), \]

where $E(\lambda)$ is the spectral family associated with $H$. The operator $A(\lambda)$, $\lambda \in [0, \infty)$, known in the physical literature as the "density of states" ([28, Chap. XIII]), plays an important role in our study.

The paper is organized as follows. Basic functional spaces and notations are introduced in Sec. 2. Our results are stated as Theorems A–C in Sec. 3. Around each of the three theorems, we discuss some background material as well as relevant references. Obviously, the large amount of existing literature excludes any possibility of compiling an exhaustive bibliography. Section 4 is devoted to revisiting the LAP as applied to the Laplacian $H_0$, and in particular obtaining uniform "low energy" estimates. In Sec. 5, we prove Theorem A, the LAP for $H$. The eigenfunction expansion theorem, Theorem B, is proved in Sec. 6. The global spacetime estimates for the generalized wave equation (1.6), as stated in Theorem C, are proved in Sec. 7. Some of the results presented here were announced in [9].

2. Functional Spaces and Notation

Throughout this paper we shall make use of the following weighted-$L^2$ and Sobolev spaces. First, for $s \in \mathbb R$ and $m$ a nonnegative integer we define

\[ L^{2,s}(\mathbb R^n) := \Bigl\{ u(x) \,/\, \|u\|_{0,s}^2 = \int_{\mathbb R^n} (1 + |x|^2)^s |u(x)|^2\,dx < \infty \Bigr\}, \tag{2.1} \]

\[ H^{m,s}(\mathbb R^n) := \Bigl\{ u(x) \,/\, D^{\alpha} u \in L^{2,s},\ |\alpha| \le m,\ \|u\|_{m,s}^2 = \sum_{|\alpha| \le m} \|D^{\alpha} u\|_{0,s}^2 \Bigr\} \tag{2.2} \]

(we write $L^2$ for $L^{2,0}$ and $\|u\|_0 = \|u\|_{0,0}$).
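A simple worked example may help to read the weights in (2.1). It is not part of the paper; the test function $u_a$ below is a hypothetical choice used only to show how the index $s$ measures decay at infinity.

```latex
\[
u_a(x) = (1+|x|^2)^{-a/2}
\quad\Longrightarrow\quad
\|u_a\|_{0,s}^2 = \int_{\mathbb{R}^n} (1+|x|^2)^{s-a}\,dx
 = |S^{n-1}| \int_0^\infty (1+r^2)^{s-a}\, r^{n-1}\,dr ,
\]
\[
\text{which is finite if and only if } 2(s-a) + n - 1 < -1,
\quad\text{i.e.}\quad s < a - \tfrac{n}{2}.
\]
```

In particular (taking $a = 0$), a bounded function of modulus one such as $\exp(i\xi x)$ belongs to $L^{2,-s}(\mathbb R^n)$ exactly when $s > \frac n2$, so such "plane waves" live in the negatively weighted spaces and not in $L^2$.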


More generally, for any $\sigma \in \mathbb R$, let $H^{\sigma} \equiv H^{\sigma,0}$ be the Sobolev space of order $\sigma$, namely,

\[ H^{\sigma} = \{ \hat u \,/\, u \in L^{2,\sigma},\ \|\hat u\|_{\sigma,0} = \|u\|_{0,\sigma} \} \tag{2.3} \]

where the Fourier transform is defined as in (1.4). For negative indices, we denote by $\{H^{-m,s}, \|\cdot\|_{-m,s}\}$ the dual space of $H^{m,-s}$. In particular, observe that any function $f \in H^{-1,s}$ can be represented (not uniquely) as

\[ f = f_0 + \sum_{k=1}^{n} i^{-1} \frac{\partial}{\partial x_k} f_k, \qquad f_k \in L^{2,s},\ 0 \le k \le n. \tag{2.4} \]

In the case $n = 2$ and $s > 1$, we define

\[ L_0^{2,s}(\mathbb R^2) = \{ u \in L^{2,s}(\mathbb R^2) \,/\, \hat u(0) = 0 \}, \]

and set $H_0^{-1,s}(\mathbb R^2)$ to be the space of functions $f \in H^{-1,s}(\mathbb R^2)$ which have a representation (2.4) where $f_k \in L_0^{2,s}$, $k = 0, 1, 2$. For any two normed spaces $X$, $Y$, we denote by $B(X, Y)$ the space of bounded linear operators from $X$ to $Y$, equipped with the operator-norm $\|\cdot\|_{B(X,Y)}$ topology.

3. Statement of Results and Background

3.1. The limiting absorption principle (LAP)

We note that the operator $H$ can be extended in an obvious way (retaining the same notation) as a bounded operator $H: H^{1}_{\mathrm{loc}} \to H^{-1}_{\mathrm{loc}}$. In particular, $H: H^{1,-s} \to H^{-1,-s}$, for all $s \ge 0$. Furthermore, the graph-norm of $H$ in $H^{-1,-s}$ is equivalent to the norm of $H^{1,-s}$. Similarly, we can consider the resolvent $R(z)$ as defined on $L^{2,s}$, $s \ge 0$, where $L^{2,s}$ is densely and continuously embedded in $H^{-1,s}$.

The basic technical tool used in the present paper is given in the following theorem. It has its own significance, stating that the resolvent is continuous up to the spectrum, including the threshold at $\lambda = 0$.

Theorem A. Suppose that $a(x)$ satisfies (1.1), (1.2). Then the operator $H$ satisfies the LAP in $\mathbb R$. More precisely, let $s > 1$ and consider the resolvent $R(z) = (H - z)^{-1}$, $\mathrm{Im}\, z \ne 0$, as a bounded operator from $L^{2,s}(\mathbb R^n)$ to $H^{1,-s}(\mathbb R^n)$. Then:

(a) $R(z)$ is bounded with respect to the $H^{-1,s}(\mathbb R^n)$ norm. Using the density of $L^{2,s}$ in $H^{-1,s}$, we can therefore view $R(z)$ as a bounded operator from $H^{-1,s}(\mathbb R^n)$ to $H^{1,-s}(\mathbb R^n)$.

(b) The operator-valued functions, defined respectively in the lower and upper half-planes,

\[ z \to R(z) \in B(H^{-1,s}(\mathbb R^n), H^{1,-s}(\mathbb R^n)), \qquad s > 1,\ \pm\mathrm{Im}\, z > 0, \tag{3.1} \]


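can be extended continuously from $\mathbb C^{\pm} = \{z \,/\, \pm\mathrm{Im}\, z > 0\}$ to $\overline{\mathbb C^{\pm}} = \mathbb C^{\pm} \cup \mathbb R$ (with respect to the operator-norm topology of $B(H^{-1,s}(\mathbb R^n), H^{1,-s}(\mathbb R^n))$). In the case $n = 2$, replace $H^{-1,s}$ by $H_0^{-1,s}$.

Notation. We denote the limiting values of the resolvent on the real axis by

\[ R^{\pm}(\lambda) = \lim_{\epsilon \to \pm 0} R(\lambda + i\epsilon). \]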

The spectrum of $H$ is therefore entirely absolutely continuous. In particular, it follows that the limiting values $R^{\pm}(\lambda)$ are continuous at $\lambda = 0$ and $H$ has no resonance there.

The main focus of Theorem A is the LAP for $H$ at "low energies", i.e. in intervals $[\alpha, \beta]$ where $\alpha < 0 < \beta$. However, to review the existing literature, we consider first the LAP in $(0, \infty)$, namely, over the interior of the spectrum. Under assumptions close to ours here (but also assuming that $a(x)$ is continuously differentiable) a weaker version (roughly, "strong" instead of "uniform" convergence of the resolvents) was obtained by Eidus ([34, Theorem 4 and Remark 1]). His approach relied on elliptic (kernel) estimates.

The systematic treatment of the LAP started with the work of Agmon ([1]). He established it for operators of the type $H_0 + V$, where $V$ is a short-range perturbation. To obtain the LAP for $H_0$ he considered the action of division by symbols with simple zeros in weighted Sobolev spaces. We therefore label this approach as the "Fourier approach" (see [41, Chap. 14]). The short-range potential was treated by perturbation methods. Soon thereafter, two other approaches to the LAP were proposed, first the "Commutator method" (known as "Mourre's method") proposed in the classical paper [58] and then the "Spectral method", initiated in joint works of the author with Devinatz ([12, 13]). In its implementation for partial differential operators, this method relies on estimates of traces of Sobolev functions on characteristic manifolds, somewhat in analogy to the division by symbols with simple zeros in the case of the Fourier method. In fact, it implies the Hölder continuity of the limiting values $R^{\pm}(\lambda)$ in a suitable operator topology. All three approaches yielded simple proofs for the LAP associated with $H = H_0 + V$, where $V$ is short-range, in the interior $(0, \infty)$ of the spectrum.

Using one of the aforementioned approaches, the LAP for $H$ has later been established, with $V$ being a long-range or Stark-like potential ([5, 45]), a potential in $L^p(\mathbb R^n)$ ([36, 47]), a potential depending only on direction $(x/|x|)$ ([38]) or a perturbation of such a potential ([61, 62]). In these latter cases the condition $\alpha > 0$ is replaced by $\alpha > \limsup_{|x| \to \infty} V(x)$. The LAP for operators of the type $f(-\Delta) + V$, for a certain class of functions $f$, was derived in [17], using the spectral method. A remarkable success of Mourre's method was in its application to the LAP in the case of the $N$-body Schrödinger operator (outside of thresholds) ([60]).


As mentioned in the Introduction, if the coefficient matrix $a(x)$ is smooth, the operator $H$ can be viewed as the Laplace–Beltrami operator $\Delta_g$ on noncompact manifolds, where $g$ is a smooth metric that approaches the Euclidean metric at infinity. The LAP in this case (in the interior of the spectrum) has already been established by Mourre. We refer to [65] and references therein for the case of perturbations of such operators. More recent works that employ the Mourre method for the derivation of the LAP in the interior of the spectrum, for asymptotically Euclidean spaces, are [75, Sec. 5] and [19, Theorem 2.2].

We now turn back to our topic here, the LAP in intervals containing the threshold at the bottom of the spectrum. The study of the resolvent near the threshold $\lambda = 0$ is sometimes referred to as "low energy estimates". The literature in this case is considerably more limited. An inspection of the aforementioned works shows that the methods they employ cannot be extended in a straightforward way to our operator $H$. This case has been studied for the Laplacian $H_0$ in [12, Appendix A] and for $H$ in the one-dimensional case ($n = 1$) in [8, 10, 27]. The present paper deals with the multi-dimensional case $n \ge 2$. In recent works, Bouclet ([21]) and Bony and Häfner ([20]) have applied the Mourre method in order to establish "low energy" LAP for $\Delta_g$ on noncompact manifolds of dimension $n \ge 3$, where the metric $g(x)$ is smooth but long-range. The paper [64] deals with the two-dimensional ($n = 2$) case, but the resolvent $R(z)$ is restricted to continuous compactly supported functions $f$, thus enabling the use of pointwise decay estimates of $R(z)f$ at infinity.

Finally we mention that the case of the closely related "acoustic propagator", where the matrix $a(x) = b(x_1)I$ is scalar and dependent on a single coordinate, has been extensively studied [10, 22, 29, 31, 48, 49, 53], as well as the "anisotropic" case where $b(x_1)$ is a general positive matrix ([11]).

The LAP for the periodic case (namely, $a(x)$ is symmetric and periodic) has recently been established in [59]. Note that in this case the spectrum is absolutely continuous and consists of a union of intervals ("bands").

The proof of Theorem A, based on the spectral approach, is given in Sec. 5. It uses an extended version of the LAP for $H_0$, with the resolvent $R_0(z)$ acting on elements of $H^{-1,s}$, for suitable positive values of $s$ (see Sec. 4). Since $L^{2,s}$ (respectively $H^{1,-s}$) is densely and continuously embedded in $H^{-1,s}$ (respectively $L^{2,-s}$), we conclude that the resolvents $R_0(z)$, $R(z)$ can be extended continuously to $\overline{\mathbb C^{\pm}}$ in the $B(L^{2,s}(\mathbb R^n), L^{2,-s}(\mathbb R^n))$ operator topology. Using a well-known theorem of Kato and Kuroda ([51]), we obtain as an immediate consequence the following corollary concerning the existence and completeness of the wave operators (see (1.3) for the definition).

Corollary 3.1. The wave operators $W_\pm(H, H_0)$ exist and are complete.


Indeed, all that is needed is that H, H0 satisfy the LAP in R, with respect to the same operator topologies. We refer to the paper [46] where the existence and completeness of the wave operators W± (H, H0 ) is established under suitable smoothness assumptions on a(x) (however, a(x) − I is not assumed to be compactly supported and H can include also magnetic and electric potentials).

3.2. The eigenfunction expansion theorem

The spectral theorem (for self-adjoint operators) can be viewed as a "generalized eigenfunction theorem". In fact, using the result of Theorem A one can obtain a more refined version in this case as follows. Let $\{E(\lambda), \lambda \in \mathbb R\}$ be the spectral family associated with $H$. Let $A(\lambda) = \frac{d}{d\lambda} E(\lambda)$ be its weak derivative. More precisely, we use the well-known formula,

\[ A(\lambda) = \frac{1}{2\pi i} \lim_{\epsilon \to 0+} \bigl(R(\lambda + i\epsilon) - R(\lambda - i\epsilon)\bigr) = \frac{1}{2\pi i}\bigl(R^{+}(\lambda) - R^{-}(\lambda)\bigr). \tag{3.2} \]
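Formula (3.2) becomes transparent in the scalar picture supplied by the spectral theorem. The following sketch is not taken from the paper and is purely formal: it uses $R(z) = \int (\mu - z)^{-1}\,dE(\mu)$ and the elementary identity for the Poisson kernel, written here with the ad hoc symbol $P_\epsilon$.

```latex
\begin{align*}
\frac{1}{2\pi i}\Bigl[\frac{1}{\mu-\lambda-i\epsilon} - \frac{1}{\mu-\lambda+i\epsilon}\Bigr]
 &= \frac{1}{\pi}\,\frac{\epsilon}{(\mu-\lambda)^2+\epsilon^2}
 =: P_\epsilon(\mu-\lambda),
 \qquad \int_{\mathbb{R}} P_\epsilon(\mu-\lambda)\,d\lambda = 1, \\
\frac{1}{2\pi i}\bigl(R(\lambda+i\epsilon) - R(\lambda-i\epsilon)\bigr)
 &= \int_{\mathbb{R}} P_\epsilon(\mu-\lambda)\,dE(\mu).
\end{align*}
```

Since $P_\epsilon$ is an approximate identity as $\epsilon \to 0+$, the right-hand side is a mollified version of $dE/d\lambda$; the LAP is what guarantees that the limit exists in the weighted operator topology.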

By Theorem A, we know that $A(\lambda) \in B(L^{2,s}(\mathbb R^n), L^{2,-s}(\mathbb R^n))$. The formal relation $(H - \lambda)A(\lambda) = 0$ can be given a rigorous meaning if, for example, we can find a bounded operator $T$ such that $T^{*}A(\lambda)T$ is bounded in $L^2(\mathbb R^n)$ and has a complete set (necessarily at most countable) of eigenvectors. These will serve as "generalized eigenvectors" for $H$. We refer to [18, Chaps. V and VI] and [23] for a development of this approach for self-adjoint elliptic operators. Note that by this approach we have at most a countable number of such generalized eigenvectors for any fixed $\lambda$. In the case of $H_0 = -\Delta$, they correspond to $|x|^{-\frac{n-3}{2}} J_{\kappa_j}(\sqrt{\lambda}|x|)\psi_j(\omega)$, where $\kappa_j = \sqrt{\lambda_j + \frac{(n-1)(n-3)}{4}}$, $\lambda_j$ being the $j$th eigenvalue of the Laplace–Beltrami operator on the unit sphere $S^{n-1}$, $\psi_j$ the corresponding eigenfunction and $J_\nu$ is the Bessel function of order $\nu$.

On the other hand, the Fourier expansion (1.5) can be viewed as expressing a function in terms of the "generalized eigenfunctions" $\exp(i\xi x)$ of $H_0$. Observe that now there is a continuum of such functions corresponding to $\lambda > 0$, namely, $|\xi|^2 = \lambda$. From the physical point of view, this expansion in terms of "plane waves" proves to be more useful for many applications. In particular, replacing $-\Delta$ by the Schrödinger operator $-\Delta + V(x)$ one can expect, under certain hypotheses on the potential $V$, a similar expansion in terms of "distorted plane waves". This has been accomplished, in increasing order of generality (more specifically, decay assumptions on $V(x)$ as $|x| \to \infty$) in [1, 2, 44, 63, 68]. See also [74] for an eigenfunction expansion for relativistic Schrödinger operators.

Here we use the LAP result of Theorem A in order to derive a similar expansion for the operator $H$. In fact, our generalized eigenfunctions are given by the following definition.


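Definition 3.2. For every $\xi \in \mathbb R^n$ let

\[ \psi_\pm(x, \xi) = -R^{\mp}(|\xi|^2)\bigl((H - |\xi|^2)\exp(i\xi x)\bigr) = R^{\mp}(|\xi|^2)\Bigl(\sum_{l,j=1}^{n} \partial_l (a_{l,j}(x) - \delta_{l,j})\partial_j \exp(i\xi x)\Bigr). \tag{3.3} \]

The generalized eigenfunctions of $H$ are defined by

\[ \varphi_\pm(x, \xi) = \exp(i\xi x) + \psi_\pm(x, \xi). \tag{3.4} \]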
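The reason why (3.3) and (3.4) produce generalized eigenfunctions can be seen from a short formal computation. It is only a heuristic sketch, not the argument of the paper: it assumes that the limiting resolvent still inverts $H - \lambda$, i.e. $(H - \lambda)R^{\mp}(\lambda) = I$ in a suitable distributional sense, with $\lambda = |\xi|^2$.

```latex
\begin{align*}
(H - |\xi|^2)\,\varphi_\pm(\cdot,\xi)
 &= (H - |\xi|^2)\exp(i\xi x) + (H - |\xi|^2)\,\psi_\pm(\cdot,\xi) \\
 &= (H - |\xi|^2)\exp(i\xi x)
    - (H - |\xi|^2)\,R^{\mp}(|\xi|^2)\bigl((H - |\xi|^2)\exp(i\xi x)\bigr) \\
 &= 0 .
\end{align*}
```

The second form in (3.3) follows from $(H_0 - |\xi|^2)\exp(i\xi x) = 0$, so that $(H - |\xi|^2)\exp(i\xi x) = (H - H_0)\exp(i\xi x) = -\sum_{l,j} \partial_l (a_{l,j} - \delta_{l,j})\partial_j \exp(i\xi x)$, which is compactly supported by (1.2).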

We assume $n \ge 3$ in order to simplify the statement of the theorem. As we show below (see Proposition 6.1) the generalized eigenfunctions are (at least) continuous in $x$, so that the integral in the statement makes sense.

Theorem B. Suppose that $n \ge 3$ and that $a(x)$ satisfies (1.1) and (1.2). For any compactly supported $f \in L^2(\mathbb R^n)$ define

\[ (\mathbb F_\pm f)(\xi) = (2\pi)^{-\frac n2} \int_{\mathbb R^n} f(x)\,\overline{\varphi_\pm(x, \xi)}\,dx, \qquad \xi \in \mathbb R^n. \tag{3.5} \]

Then the transformations $\mathbb F_\pm$ can be extended as unitary transformations (for which we retain the same notation) of $L^2(\mathbb R^n)$ onto itself. Furthermore, these transformations "diagonalize" $H$ in the following sense. A function $f \in L^2(\mathbb R^n)$ is in the domain $D(H)$ if and only if $|\xi|^2 (\mathbb F_\pm f)(\xi) \in L^2(\mathbb R^n)$ and

\[ H = \mathbb F_\pm^{*} M_{|\xi|^2} \mathbb F_\pm, \tag{3.6} \]

where $M_{|\xi|^2}$ is the multiplication operator by $|\xi|^2$.

3.3. Spacetime estimates for a generalized wave equation

The Strichartz estimates ([72]) have become a fundamental ingredient in the study of nonlinear wave equations. They are $L^p$ spacetime estimates that are derived for operators whose leading part has constant coefficients. We refer to the books [4, 70, 71] for detailed accounts and further references. Here we focus exclusively on spacetime estimates pertinent to the framework of this paper, namely, weighted $L^2$ estimates. Indeed, once the "low energy estimates" of Theorem A are established, the method of proof here follows a standard methodology.

We recall first some results related to the Cauchy problem for the classical wave equation

\[ \Box u = \frac{\partial^2 u}{\partial t^2} - \Delta u = 0, \tag{3.7} \]

subject to the initial data

\[ u(x, 0) = u_0(x), \qquad \partial_t u(x, 0) = v_0(x), \qquad x \in \mathbb R^n. \tag{3.8} \]


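The Morawetz estimate [56] yields

\[ \int_{\mathbb R}\int_{\mathbb R^n} |x|^{-3} |u(x, t)|^2\,dx\,dt \le C\bigl(\|\nabla u_0\|_0^2 + \|v_0\|_0^2\bigr), \qquad n \ge 4, \tag{3.9} \]

while in [7] we gave the estimate

\[ \int_{\mathbb R}\int_{\mathbb R^n} |x|^{-2\alpha - 1} |u(x, t)|^2\,dx\,dt \le C_\alpha\bigl(\||\nabla|^{\alpha} u_0\|_0^2 + \||\nabla|^{\alpha - 1} v_0\|_0^2\bigr), \qquad n \ge 3, \tag{3.10} \]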

for every $\alpha \in (0, 1)$. Related results were obtained in [55] (allowing also dissipative terms), [42] (with some gain in regularity), [76] (with short-range potentials) and [39] for spherically symmetric solutions.

Here we consider the equation

\[ \frac{\partial^2 u}{\partial t^2} + Hu = \frac{\partial^2 u}{\partial t^2} - \sum_{i,j=1}^{n} \partial_i a_{i,j}(x)\partial_j u = f(x, t), \tag{3.11} \]

subject to the initial data (3.8). We first replace the assumptions (1.1) and (1.2) by stronger ones as follows.

(H1)
\[ a(x) = g^{-1}(x) = (g^{i,j}(x))_{1 \le i,j \le n} \tag{3.12} \]
where $g(x) = (g_{i,j}(x))_{1 \le i,j \le n}$ is a smooth Riemannian metric on $\mathbb R^n$ such that
\[ g(x) = I \qquad \text{for } |x| > \Lambda_0. \tag{3.13} \]

(H2) The Hamiltonian flow associated with $h(x, \xi) = (g(x)\xi, \xi)$ is nontrapping for any (positive) value of $h$.
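As a sanity check of hypothesis (H2), consider the Euclidean case $g = I$. This worked example is not in the paper; it only illustrates the nontrapping condition in the simplest situation, using Hamilton's equations $\dot x = \partial h/\partial\xi$, $\dot\xi = -\partial h/\partial x$.

```latex
\[
h(x,\xi) = |\xi|^2:
\qquad \dot x = 2\xi, \quad \dot\xi = 0
\quad\Longrightarrow\quad
x(t) = x_0 + 2t\,\xi_0, \quad \xi(t) = \xi_0 ,
\]
\[
|x(t)| \;\ge\; 2|t|\,|\xi_0| - |x_0| \;\longrightarrow\; \infty
\quad (|t| \to \infty) \qquad \text{whenever } h = |\xi_0|^2 > 0 .
\]
```

Every trajectory on a positive energy level is a straight line and leaves any compact set, so (H2) holds trivially for $g = I$; for a general metric the bicharacteristics are geodesics of $g$ and the condition excludes, for instance, closed geodesics.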

Recall that (H2) means that the flow associated with the Hamiltonian vectorfield $H = \frac{\partial h}{\partial \xi}\frac{\partial}{\partial x} - \frac{\partial h}{\partial x}\frac{\partial}{\partial \xi}$ leaves any compact set in $\mathbb R^n_x$. Identical hypotheses are imposed in the study of resolvent estimates in semiclassical theory ([24, 25]).

In our estimates we use "homogeneous Sobolev spaces" associated with the operator $H$. We note that since $H$ has no eigenvalue at zero, the operators $H^{-1}$ and $H^{-\frac12}$ are well defined self-adjoint operators. Note that $\|H^{\frac12}\theta\|_0$ is equivalent to the homogeneous Sobolev norm $\|\nabla\theta\|_0$.

Theorem C. Suppose that $n \ge 3$ and that $a(x)$ satisfies Hypotheses (H1) and (H2). Let $s > 1$.

(a) (Local Energy Decay) Let $u_0 \in D(H^{\frac12})$ and $v_0 \in L^2(\mathbb R^n)$. Then there exists a constant $C_1 = C_1(s, n) > 0$ such that the solution to (3.11) and (3.8) satisfies,

\[ \int_{\mathbb R}\int_{\mathbb R^n} (1 + |x|^2)^{-s}\bigl[|H^{\frac12} u(x, t)|^2 + |u_t(x, t)|^2\bigr]\,dx\,dt \le C_1\Bigl\{\|H^{\frac12} u_0\|_0^2 + \|v_0\|_0^2 + \int_{\mathbb R}\int_{\mathbb R^n} |f(x, t)|^2\,dx\,dt\Bigr\}. \tag{3.14} \]


(b) (Amplitude Decay) Assume $f = 0$. Let $u_0 \in L^2(\mathbb R^n)$ and $v_0 \in D(H^{-\frac12})$. There exists a constant $C_2 = C_2(s, n) > 0$ such that the solution to (3.11) and (3.8) satisfies,

\[ \int_{\mathbb R}\int_{\mathbb R^n} (1 + |x|^2)^{-s} |u(x, t)|^2\,dx\,dt \le C_2\bigl[\|u_0\|_0^2 + \|H^{-\frac12} v_0\|_0^2\bigr]. \tag{3.15} \]

These estimates generalize similar estimates obtained for the classical ($g = I$) wave equation ([7, 55]).

Remark 3.3. The estimate (3.14) is an "energy decay estimate" for the wave equation (3.11). A localized (in space) version of the estimate has served to obtain global (small amplitude) existence theorems for the corresponding nonlinear equation ([25, 40]).

Remark 3.4. The referee has pointed out to the author the recent preprint [19, Theorem 1.3], where a more general result is obtained, with the metric being long-range.

The weighted $L^2$-spacetime estimates for the dispersive equation

\[ i^{-1}\frac{\partial}{\partial t} u = Lu, \]

have been extensively treated in recent years. In general, in this case there is also a gain of derivatives (so called "smoothing") in addition to the energy decay. For the Schrödinger operator $L = -\Delta + V(x)$, with various assumptions on the potential $V$, we refer to [3, 6, 7, 15, 16, 42, 52, 67, 69, 77] and references therein. Smoothing estimates in the presence of magnetic potentials are considered in [30]. The Schrödinger operator on a Riemannian manifold is considered in [24, 33]. For more general operators, see [14, 17, 26, 43, 57, 66, 73] and references therein.

4. The Operator $H_0 = -\Delta$

Let $\{E_0(\lambda)\}$ be the spectral family associated with $H_0$, so that

\[ (E_0(\lambda)h, h) = \int_{|\xi|^2 \le \lambda} |\hat h|^2\,d\xi, \qquad \lambda \ge 0,\ h \in L^2(\mathbb R^n). \tag{4.1} \]

Following the methodology of [13, 32], we see that the weak derivative $A_0(\lambda) = \frac{d}{d\lambda} E_0(\lambda)$ exists in $B(L^{2,s}, L^{2,-s})$ for any $s > \frac12$ and $\lambda > 0$. (Here and below we write $L^{2,s}$ for $L^{2,s}(\mathbb R^n)$.) Furthermore,

\[ \langle A_0(\lambda)h, h\rangle = (2\sqrt{\lambda})^{-1} \int_{|\xi|^2 = \lambda} |\hat h|^2\,d\tau, \tag{4.2} \]

where $\langle\,,\rangle$ is the $(L^{2,-s}, L^{2,s})$ pairing (conjugate linear with respect to the second term) and $d\tau$ is the Lebesgue surface measure. Recall that by the standard trace

November 16, J070-S0129055X10004193

2010 15:28 WSPC/S0129-055X

148-RMP

M. Ben-Artzi

1220

lemma we have

|ξ|2 =λ

ˆ 2 dτ ≤ C h ˆ 2 s, |h| H

s>

1 . 2

(4.3)
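As a sanity check on (4.1) and (4.2), the following sketch evaluates the surface integral numerically in dimension $n = 2$. The choice $\hat h(\xi) = \exp(-|\xi|^2/2)$ and all function names are illustrative assumptions, not taken from the text. For this radial $\hat h$ the circle $|\xi|^2 = \lambda$ has length $2\pi\sqrt{\lambda}$, so (4.2) should give $\langle A_0(\lambda)h, h\rangle = \pi e^{-\lambda}$, and integrating over $\lambda \ge 0$ should return $\|\hat h\|_0^2 = \pi$, as (4.1) predicts when $\lambda \to \infty$.

```python
import math

def h_hat_sq(rho):
    """|h^(xi)|^2 on |xi| = rho for the illustrative choice h^(xi) = exp(-|xi|^2 / 2)."""
    return math.exp(-rho * rho)

def spectral_density(lam, n_angles=90):
    """<A0(lam) h, h> from (4.2) in n = 2: (2 sqrt(lam))^-1 times the circle integral."""
    rho = math.sqrt(lam)
    d_theta = 2.0 * math.pi / n_angles
    # The surface measure on the circle |xi| = rho is rho * d_theta.
    circle_integral = sum(h_hat_sq(rho) * rho * d_theta for _ in range(n_angles))
    return circle_integral / (2.0 * rho)

def total_mass(lam_max=40.0, steps=2000):
    """Midpoint rule for the integral of <A0(lam) h, h> over (0, lam_max)."""
    d_lam = lam_max / steps
    return sum(spectral_density((i + 0.5) * d_lam) * d_lam for i in range(steps))

mass = total_mass()  # should be close to pi = ||h^||_0^2
```

The factor $(2\sqrt{\lambda})^{-1}$ in (4.2) is exactly the Jacobian of the change of variable $\lambda = |\xi|^2$, which is why the two computations agree.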

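However, we can refine this estimate near $\lambda = 0$ as follows.

Proposition 4.1. Let $\frac12 < s < \frac32$, $h \in L^{2,s}$. For $n = 2$ assume further that $s > 1$ and $h \in L_0^{2,s}$. Then
$$\int_{|\xi|^2 = \lambda} |\hat h|^2 \, d\tau \le C \min(\lambda^{\gamma}, 1)\, \|\hat h\|_{H^s}^2, \tag{4.4}$$
where
$$0 < \gamma = s - \tfrac12, \quad n \ge 3, \qquad\qquad 0 < \gamma < s - \tfrac12, \quad n = 2, \tag{4.5}$$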

and $C = C(s, \gamma, n)$.

Proof. If $n \ge 3$, the proof follows as in [16, Appendix], using the “generalized Hardy inequality” due to Herbst [37], namely, that multiplication by $|\xi|^{-s}$ is bounded from $H^s$ into $L^2$ (see also [54, Sec. 9.4]). If $n = 2$ and $1 < s < \frac32$ we have, for $h \in L_0^{2,s}$,
$$|\hat h(\xi)| = |\hat h(\xi) - \hat h(0)| \le C_{s,\delta}\, |\xi|^{\delta}\, \|\hat h\|_{H^s},$$
for any $0 < \delta < \min(1, s - 1)$. Using this estimate in the integral in (4.4) the claim follows also in this case.

Combining Eqs. (4.2)–(4.4) we conclude that,
$$|\langle A_0(\lambda)f, g\rangle| \le \langle A_0(\lambda)f, f\rangle^{\frac12}\, \langle A_0(\lambda)g, g\rangle^{\frac12} \le C \min(\lambda^{-\frac12}, \lambda^{\eta})\, \|f\|_{0,s}\, \|g\|_{0,\sigma}, \quad f \in L^{2,s}, \ g \in L^{2,\sigma}, \tag{4.6}$$

where either

(i) $n \ge 3$, $\frac12 < s, \sigma < \frac32$, $s + \sigma > 2$ and $0 < 2\eta = s + \sigma - 2$,

or

(ii) $n = 2$, $1 < s < \frac32$, $\frac12 < \sigma < \frac32$, $s + \sigma > 2$, $0 < 2\eta < s + \sigma - 2$ and $\hat f(0) = 0$. (4.7)

In both cases, $A_0(\lambda)$ is Hölder continuous and vanishes at $0, \infty$, so as in [13] we obtain the following.

Proposition 4.2. The operator-valued function
$$z \to R_0(z) \in \begin{cases} B(L^{2,s}, L^{2,-\sigma}), & n \ge 3, \\ B(L_0^{2,s}, L^{2,-\sigma}), & n = 2, \end{cases} \tag{4.8}$$


where $s, \sigma$ satisfy (4.7), can be extended continuously from $\mathbb{C}^{\pm}$ to $\overline{\mathbb{C}^{\pm}}$, in the respective uniform operator topologies.

Remark 4.3. We note that the conditions (4.7) yield the continuity of $A_0(\lambda)$ across the threshold $\lambda = 0$ and hence the continuity property of the resolvent as in Proposition 4.2. However, for the local continuity at any $\lambda_0 > 0$, it suffices to take $s, \sigma > \frac12$, as in [1]. This remark applies equally to the statements below, where the resolvent is considered in other functional settings.

We shall now extend this proposition to more general function spaces. Let $g \in H^{1,\sigma}$, where $s, \sigma$ satisfy (4.7). Let $f \in H^{-1,s}$ have a representation of the form (2.4). Equation (4.2) can be extended to yield an operator (for which we retain the same notation) $A_0(\lambda) \in B(H^{-1,s}, H^{-1,-\sigma})$, defined by (where now $\langle\,,\rangle$ is used for the $(H^{-1,s}, H^{1,\sigma})$ pairing),
$$\Big\langle A_0(\lambda)\Big[f_0 + i^{-1}\sum_{k=1}^{n} \frac{\partial}{\partial x_k} f_k\Big], g\Big\rangle = (2\sqrt{\lambda})^{-1} \int_{|\xi|^2 = \lambda} \Big[\hat f_0(\xi) + \sum_{k=1}^{n} \xi_k \hat f_k(\xi)\Big]\, \overline{\hat g(\xi)}\, d\tau, \quad f \in H^{-1,s}, \ g \in H^{1,\sigma}, \tag{4.9}$$
(replace $H^{-1,s}$ by $H_0^{-1,s}$ if $n = 2$).

Observe that this definition makes good sense even though the representation (2.4) is not unique, since
$$f = f_0 + i^{-1}\sum_{k=1}^{n} \frac{\partial}{\partial x_k} f_k = \tilde f_0 + i^{-1}\sum_{k=1}^{n} \frac{\partial}{\partial x_k} \tilde f_k$$
implies
$$\hat f_0(\xi) + \sum_{k=1}^{n} \xi_k \hat f_k(\xi) = \hat{\tilde f}_0(\xi) + \sum_{k=1}^{n} \xi_k \hat{\tilde f}_k(\xi)$$
(as tempered distributions).

To estimate the operator-norm of $A_0(\lambda)$ in this setting we use (4.9) and the considerations preceding Proposition 4.2, to obtain, instead of (4.6), for $k = 1, 2, \ldots, n$,
$$\Big|\Big\langle A_0(\lambda)\frac{\partial}{\partial x_k} f_k, g\Big\rangle\Big| \le C \min(\lambda^{-\frac12}, \lambda^{\eta})\, \|f\|_{-1,s}\, \|g\|_{1,\sigma}, \quad f \in H^{-1,s}, \ g \in H^{1,\sigma}, \tag{4.10}$$
where $s, \sigma$ satisfy (4.7) (replace $H^{-1,s}$ by $H_0^{-1,s}$ if $n = 2$).


We now define the extension of the resolvent operator by
$$R_0(z) = \int_0^{\infty} \frac{A_0(\lambda)}{\lambda - z}\, d\lambda, \quad \operatorname{Im} z \ne 0. \tag{4.11}$$
The convergence of the integral (in operator-norm) follows from the estimate (4.10). The LAP in this case is given in the following proposition.

Proposition 4.4. The operator-valued function $R_0(z)$ is well-defined (and analytic) for nonreal $z$ in the following functional setting.
$$z \to R_0(z) \in \begin{cases} B(H^{-1,s}, H^{1,-\sigma}), & n \ge 3, \\ B(H_0^{-1,s}, H^{1,-\sigma}), & n = 2, \end{cases} \tag{4.12}$$
where $s, \sigma$ satisfy (4.7). Furthermore, it can be extended continuously from $\mathbb{C}^{\pm}$ to $\overline{\mathbb{C}^{\pm}}$, in the respective uniform operator topologies. The limiting values are denoted by $R_0^{\pm}(\lambda)$. The extended function satisfies
$$(H_0 - z)R_0(z)f = f, \quad f \in H^{-1,s}, \ z \in \overline{\mathbb{C}^{\pm}}, \tag{4.13}$$
where for $z = \lambda \in \mathbb{R}$, $R_0(z) = R_0^{\pm}(\lambda)$.

Proof. We assume for simplicity $n \ge 3$. By Definition (4.11) and the estimate (4.10), we get readily $R_0(z) \in B(H^{-1,s}, H^{-1,-\sigma})$ if $\operatorname{Im} z \ne 0$, as well as the analyticity of the map $z \to R_0(z)$, $\operatorname{Im} z \ne 0$. Furthermore, the extension to $\operatorname{Im} z = 0$ is carried out as in [13].

Equation (4.13) is obvious if $\operatorname{Im} z \ne 0$ and $f \in L^{2,s}$. By the density of $L^{2,s}$ in $H^{-1,s}$, the continuity of $R_0(z)$ on $H^{-1,s}$ and the continuity of $H_0 - z$ (in the sense of distributions), we can extend it to all $f \in H^{-1,s}$. As $z \to \lambda \pm i \cdot 0$, we have $R_0(z)f \to R_0^{\pm}(\lambda)f$ in $H^{-1,-\sigma}$. Applying the (constant coefficient) operator $H_0 - z$ yields, in the sense of distributions, $f = (H_0 - z)R_0(z)f \to (H_0 - \lambda)R_0^{\pm}(\lambda)f$, which establishes (4.13) also for $\operatorname{Im} z = 0$.

Finally, the established continuity of $z \to R_0(z) \in B(H^{-1,s}, H^{-1,-\sigma})$ (up to the real boundary) and Eq. (4.13) imply the continuity of the map $z \to H_0 R_0(z) \in B(H^{-1,s}, H^{-1,-\sigma})$. The stronger continuity claim (4.12) follows since the norm of $H^{1,-\sigma}$ is equivalent to the graph-norm of $H_0$ as a map of $H^{-1,-\sigma}$ to itself.

Remark 4.5. The main point here is the fact that the limiting values can be extended continuously to the threshold at $\lambda = 0$. In the neighborhood of any $\lambda > 0$ this proposition follows from [68, Theorem 2.3], where a very different proof is used. In fact, using the terminology there, the limit functions $R_0^{\pm}(\lambda)f$ are the unique (on either side of the positive real axis) radiative functions and they satisfy a suitable “Sommerfeld radiation condition”. We recall it here for the sake of completeness, since we will need it in the next section.


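Let $z = k^2 \in \mathbb{C} \setminus \{0\}$, $\operatorname{Im} k \ge 0$. For $f \in H^{-1,s}$ let $u = R_0(z)f \in H^{1,-\sigma}$ be as defined above. Then
$$Ru = \int_{|x| > \Lambda_0} \Big| r^{-\frac{n-1}{2}} \frac{\partial}{\partial r}\big(r^{\frac{n-1}{2}} u\big) - iku \Big|^2 dx < \infty, \tag{4.14}$$
where $r = |x|$. We shall refer to $Ru$ as the radiative norm of $u$. Furthermore, we can take $\frac12 < s, \sigma$, as in Remark 4.3.

5. The Operator $H$

Fix $[\alpha, \beta] \subseteq \mathbb{R}$ and let
$$\Omega = \{z \in \mathbb{C}^{+} : \alpha < \operatorname{Re} z < \beta, \ 0 < \operatorname{Im} z < 1\}. \tag{5.1}$$
Let $z = \mu + i\varepsilon \in \Omega$ and consider the equation
$$(H - z)u = f \in H^{-1,s}, \quad u \in H^{1,-\sigma}, \quad (f \in H_0^{-1,s} \text{ if } n = 2). \tag{5.2}$$
(Observe that in the case $n = 2$ also $u \in L_0^{2,\sigma}$.)

With $\Lambda_0$ as in (1.2), let $\chi(x) \in C^{\infty}(\mathbb{R}^n)$ be such that
$$\chi(x) = \begin{cases} 0, & |x| < \Lambda_0 + 1, \\ 1, & |x| > \Lambda_0 + 2. \end{cases} \tag{5.3}$$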

Equation (5.2) can be written as
$$(H_0 - z)(\chi u) = \chi f - 2\nabla\chi \cdot \nabla u - u\Delta\chi. \tag{5.4}$$
Letting $\psi(x) = 1 - \chi(\frac{x}{2}) \in C_0^{\infty}(\mathbb{R}^n)$ and using Proposition 4.4 and standard elliptic estimates, we obtain from (5.4)
$$\|u\|_{1,-\sigma} \le C\big[\|f\|_{-1,s} + \|\psi u\|_{0,-s}\big], \tag{5.5}$$
where $s, \sigma$ satisfy (4.7), and $C > 0$ depends only on $\Lambda_0, \sigma, s, n$. We note that since $\psi$ is compactly supported, the term $\|\psi u\|_{0,-s}$ can be replaced by $\|\psi u\|_{0,-s'}$ for any real $s'$. In fact, the second term in the right-hand side can be dispensed with, as is demonstrated in the following proposition.

Proposition 5.1. The solution to (5.2) satisfies,
$$\|u\|_{1,-\sigma} \le C\|f\|_{-1,s}, \tag{5.6}$$
where $s, \sigma$ satisfy (4.7) and $C > 0$ depends only on $\sigma, s, n, \Lambda_0$.

Proof. In view of (5.5), we only need to show that
$$\|\psi u\|_{0,-s} \le C\|f\|_{-1,s}. \tag{5.7}$$


Since $L^{2,s}(\mathbb{R}^n)$ is dense in $H^{-1,s}(\mathbb{R}^n)$ it suffices to prove this inequality for $f \in L^{2,s}(\mathbb{R}^n) \cap H^{-1,s}(\mathbb{R}^n)$ (using the norm of $H^{-1,s}$). We argue by contradiction. Let
$$\{z_k\}_{k=1}^{\infty} \subseteq \Omega, \qquad \{f_k\}_{k=1}^{\infty} \subseteq L^{2,s}(\mathbb{R}^n) \cap H^{-1,s}(\mathbb{R}^n)$$
(with $\hat f_k(0) = 0$ if $n = 2$) and
$$\{u_k = R(z_k)f_k\}_{k=1}^{\infty} \subseteq H^{1,-\sigma}(\mathbb{R}^n)$$
be such that,
$$\|\psi u_k\|_{0,-s} = 1, \quad \|f_k\|_{-1,s} \le k^{-1}, \quad k = 1, 2, \ldots, \qquad z_k \to z_0 \in \bar\Omega \ \text{ as } k \to \infty. \tag{5.8}$$
By (5.5), $\{u_k\}_{k=1}^{\infty}$ is bounded in $H^{1,-\sigma}$. Replacing the sequence by a suitable subsequence (without changing notation) and using the Rellich compactness theorem we may assume that there exists a function $u \in L^{2,-\sigma'}$, $\sigma' > \sigma$, such that,
$$u_k \to u \ \text{ in } L^{2,-\sigma'} \text{ as } k \to \infty. \tag{5.9}$$
Furthermore, by weak compactness we actually have (restricting again to a subsequence if needed)
$$u_k \xrightarrow{\ w\ } u \ \text{ in } H^{1,-\sigma} \text{ as } k \to \infty. \tag{5.10}$$
Since $H$ maps continuously $H^{1,-\sigma}$ into $H^{-1,-\sigma}$ we have
$$Hu_k \xrightarrow{\ w\ } Hu \ \text{ in } H^{-1,-\sigma} \text{ as } k \to \infty,$$
so that from $(H - z_k)u_k = f_k$ we infer that
$$(H - z_0)u = 0. \tag{5.11}$$
In view of (5.4) and Remark 4.5, the functions $\chi u_k$ are “radiative functions”. Since they are uniformly bounded in $H^{1,-\sigma}$ their “radiative norms” (4.14) are uniformly bounded.

Suppose first that $z_0 \ne 0$. In view of Remark 4.5, we can take $s, \sigma > \frac12$. Then the limit function $u$ is a radiative solution to $(H_0 - z_0)u = 0$ in $|x| > \Lambda_0 + 2$ and hence must vanish there (see [68]). By the unique continuation property of solutions to (5.11) we conclude that $u \equiv 0$. Thus by (5.9) we get $\|\psi u_k\|_{0,-\sigma'} \to 0$ as $k \to \infty$, which contradicts (5.8).

We are therefore left with the case $z_0 = 0$. In this case $u \in H^{1,-\sigma}$ satisfies the equation
$$\nabla \cdot (a(x)\nabla u) = 0. \tag{5.12}$$


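In particular, $\Delta u = 0$ in $|x| > \Lambda_0$ and
$$\int_{\Lambda_0}^{\infty} r^{-2\sigma} \int_{|x| = r} \Big(|u|^2 + \Big|\frac{\partial u}{\partial r}\Big|^2\Big)\, d\tau\, dr < \infty. \tag{5.13}$$
Consider first the case $n \ge 3$. We may then use the representation of $u$ by spherical harmonics, so that, with $x = r\omega$, $\omega \in S^{n-1}$,
$$u(x) = r^{-\frac{n-1}{2}} \Big\{ \sum_{j=0}^{\infty} b_j r^{\mu_j} h_j(\omega) + \sum_{j=0}^{\infty} c_j r^{-\nu_j} h_j(\omega) \Big\}, \quad r > \Lambda_0, \tag{5.14}$$
where,
$$\mu_j(\mu_j - 1) = \nu_j(\nu_j + 1) = \lambda_j + \frac{(n-1)(n-3)}{4}, \tag{5.15}$$
$0 = \lambda_0 < \lambda_1 \le \lambda_2 \le \cdots$ being the eigenvalues of the Laplace–Beltrami operator on $S^{n-1}$, and $h_j(\omega)$ the corresponding spherical harmonics. Since $\lambda_1 = n - 1$, it follows that
$$\mu_0 = \frac{n-1}{2}, \quad \mu_0 + 1 \le \mu_1 \le \mu_2 \cdots, \qquad \frac{n-3}{2} = \nu_0 < \nu_1 \le \nu_2 \cdots. \tag{5.16}$$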
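The exponents in (5.15) and (5.16) can be checked with a few lines of arithmetic. The sketch below is illustrative only: it indexes the spherical harmonics by their degree $\ell$ rather than by the counting index $j$ used in the text, and it assumes the standard eigenvalues $\ell(\ell + n - 2)$ of the Laplace–Beltrami operator on $S^{n-1}$. Under that assumption the quadratic (5.15) has the closed-form roots $\mu = \ell + \frac{n-1}{2}$ and $\nu = \ell + \frac{n-3}{2}$.

```python
import math

def sphere_eigenvalue(ell, n):
    """Eigenvalue ell(ell + n - 2) of the Laplace-Beltrami operator on S^(n-1)."""
    return ell * (ell + n - 2)

def radial_exponents(ell, n):
    """Roots mu, nu of mu(mu - 1) = nu(nu + 1) = lambda + (n-1)(n-3)/4, cf. (5.15)."""
    rhs = sphere_eigenvalue(ell, n) + (n - 1) * (n - 3) / 4.0
    disc = math.sqrt(1.0 + 4.0 * rhs)   # equals 2*ell + n - 2
    mu = (1.0 + disc) / 2.0             # exponent of the growing branch r^mu
    nu = (-1.0 + disc) / 2.0            # exponent of the decaying branch r^(-nu)
    return mu, nu

# Degree 0 in dimension 3 gives (mu_0, nu_0) = ((n-1)/2, (n-3)/2) as in (5.16).
mu0, nu0 = radial_exponents(0, 3)
```

For $n = 2$ and $\ell = 0$ the formula returns $\nu = -\frac12$, which is not a decaying branch; this is consistent with the separate treatment of the case $n = 2$, where a logarithmic term takes its place.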

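We now observe that (5.13) forces $b_0 = b_1 = \cdots = 0$. Also, by (5.14)
$$\int_{|x| = r} \frac{\partial u}{\partial r}\, d\tau = -(n - 2)\,|S^{n-1}|\, c_0, \quad r > \Lambda_0, \tag{5.17}$$
($|S^{n-1}|$ is the surface measure of $S^{n-1}$), while integrating (5.12) we get
$$\int_{|x| = r} \frac{\partial u}{\partial r}\, d\tau = 0, \quad r > \Lambda_0. \tag{5.18}$$
Thus $c_0 = 0$. It now follows from (5.14) that, for $r > \Lambda_0$,
$$\int_{|x| = r} \Big(|u|^2 + \Big|\frac{\partial u}{\partial r}\Big|^2\Big)\, d\tau \le \Big(\frac{r}{\Lambda_0}\Big)^{-2\nu_1} \int_{|x| = \Lambda_0} \Big(|u|^2 + \Big|\frac{\partial u}{\partial r}\Big|^2\Big)\, d\tau. \tag{5.19}$$
Multiplying (5.12) by $\bar u$ and integrating by parts over the ball $|x| \le r$, we infer from (5.19) that the boundary term vanishes as $r \to \infty$. Thus $\nabla u \equiv 0$, in contradiction to (5.8) and (5.9).

It remains to deal with the case $n = 2$. Instead of (5.14), we now have
$$u(x) = r^{-\frac12} \Big\{ b_0 r^{\frac12} \log r + \sum_{j=0}^{\infty} b_j r^{\mu_j} h_j(\omega) + \sum_{j=1}^{\infty} c_j r^{-\nu_j} h_j(\omega) \Big\}, \quad r > \Lambda_0, \tag{5.20}$$
where $\mu_0 = \frac12$, $\mu_1 = \frac32$, $\nu_1 = \frac12$.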


As in the derivation above, the condition (5.13) yields $b_0 = b_1 = \cdots = 0$. Also, we get $b_0 = 0$ in view of (5.18). It now follows that
$$\int_{|x| = r} \bar u\, \frac{\partial u}{\partial r}\, d\tau = -2\pi \sum_{j=1}^{\infty} \Big(\nu_j + \frac12\Big) |c_j|^2 r^{-2\nu_j - 1}, \quad r \ge \Lambda_0, \tag{5.21}$$
from which, as in the argument following (5.19), we deduce that $u \equiv 0$, again in contradiction to (5.8) and (5.9).

Proof of Theorem A. Part (a) of the theorem is actually covered by Proposition 5.1. Moreover, the proposition implies that the operator-valued function
$$z \to R(z) \in B(H^{-1,s}(\mathbb{R}^n), H^{1,-\sigma}(\mathbb{R}^n)), \quad s > 1, \ z \in \Omega,$$
is uniformly bounded, where $s, \sigma$ satisfy (4.7). Here and below replace $H^{-1,s}$ by $H_0^{-1,s}$ if $n = 2$.

We next show that the function $z \to R(z)$ can be continuously extended to $\bar\Omega$ in the weak topology of $B(H^{-1,s}(\mathbb{R}^n), H^{1,-\sigma}(\mathbb{R}^n))$. To this end, we take $f \in H^{-1,s}(\mathbb{R}^n)$ and $g \in H^{-1,\sigma}(\mathbb{R}^n)$ and consider the function
$$z \to \langle g, R(z)f\rangle, \quad z \in \Omega,$$
where $\langle\,,\rangle$ is the $(H^{-1,\sigma}, H^{1,-\sigma})$ pairing. We need to show that it can be extended continuously to $\bar\Omega$.

In view of the uniform boundedness established in Proposition 5.1, we can take $f, g$ in dense sets (of the respective spaces). In particular, we can take $f \in L^{2,s}(\mathbb{R}^n)$ and $g \in L^{2,\sigma}(\mathbb{R}^n)$, so that the continuity property in $\Omega$ is obvious.

Consider therefore a sequence $\{z_k\}_{k=1}^{\infty} \subseteq \Omega$ such that $z_k \to z_0 \in [\alpha, \beta]$ as $k \to \infty$. The sequence $\{u_k = R(z_k)f\}_{k=1}^{\infty}$ is bounded in $H^{1,-\sigma}(\mathbb{R}^n)$. Therefore there exists a subsequence $\{u_{k_j}\}_{j=1}^{\infty}$ which converges to a function $u \in L^{2,-\sigma'}$, $\sigma' > \sigma$. We can further assume that $u_{k_j} \xrightarrow{\ w\ } u$ in $H^{1,-\sigma}$ as $j \to \infty$. It follows that
$$\langle g, u_{k_j}\rangle \to \langle g, u\rangle \ \text{ as } j \to \infty.$$
Passing to the limit in $(H - z_{k_j})u_{k_j} = f$ we see that the limit function satisfies
$$(H - z_0)u = f.$$
We now repeat the argument employed in the proof of Proposition 5.1. If $z_0 \ne 0$ we note that the functions $\{\chi u_k\}_{k=1}^{\infty}$ are radiative functions with uniformly bounded “radiative norms” (4.14) in $|x| > \Lambda_0 + 2$. The same is therefore true for the limit function $u$. If $z_0 = 0$ the function $u \in H^{1,-\sigma}$ solves $Hu = f$.


In both cases this function is unique and we get the convergence
$$\langle g, R(z_k)f\rangle = \langle g, u_k\rangle \to \langle g, u\rangle \ \text{ as } k \to \infty.$$
We can now define
$$R^{+}(z_0)f = u, \tag{5.22}$$
with an analogous definition for $R^{-}(z_0)$.

At this point we can readily deduce the following extension of the resolvent $R(z)$ as the inverse of $H - z$.
$$(H - z)R(z)f = f, \quad f \in H^{-1,s}, \ z \in \overline{\mathbb{C}^{\pm}}, \tag{5.23}$$
where $R(z) = R^{\pm}(\lambda)$ when $z = \lambda \in \mathbb{R}$.

Indeed, observe that if $\operatorname{Im} z \ne 0$ then $(H - z)R(z)f = f$ for $f \in L^{2,s}(\mathbb{R}^n)$ and $(H - z)R(z) \in B(H^{-1,s}, H^{-1,-\sigma})$, so the assertion follows from the density of $L^{2,s}(\mathbb{R}^n)$ in $H^{-1,s}(\mathbb{R}^n)$. For $z = \lambda \in \mathbb{R}$ we use the (just established) weak continuity of the map $z \to (H - z)R(z)$ from $H^{-1,s}$ into $H^{-1,-\sigma}$ in $\overline{\mathbb{C}^{\pm}}$.

The passage “from weak to uniform continuity” (in the operator topology) is a classical argument due to Agmon ([1]). In [8], we have applied it in the case $n = 1$. Here we outline the proof in the case $n > 1$.

We establish first the continuity of the operator-valued function $z \to R(z)$, $z \in \bar\Omega$, in the uniform operator topology of $B(H^{-1,s}(\mathbb{R}^n), L^{2,-\sigma}(\mathbb{R}^n))$.

Let $\{z_k\}_{k=1}^{\infty} \subseteq \bar\Omega$ and $\{f_k\}_{k=1}^{\infty} \subseteq H^{-1,s}(\mathbb{R}^n)$ be sequences such that $z_k \to z \in \bar\Omega$ as $k \to \infty$ and $f_k$ converges weakly to $f$ in $H^{-1,s}(\mathbb{R}^n)$. It suffices to prove that the sequence $u_k = R(z_k)f_k$, which is bounded in $H^{1,-\sigma}(\mathbb{R}^n)$, converges strongly in $L^{2,-\sigma}(\mathbb{R}^n)$. Since this is clear if $\operatorname{Im} z \ne 0$, we can take $z \in [\alpha, \beta]$.

Note first that we can take $\frac12 < \sigma' < \sigma$ so that $s, \sigma'$ satisfy (4.7). Then the sequence $\{u_k\}_{k=1}^{\infty}$ is bounded in $H^{1,-\sigma'}(\mathbb{R}^n)$ and there exists a subsequence $\{u_{k_j}\}_{j=1}^{\infty}$ which converges to a function $u \in L^{2,-\sigma}$. We can further assume that $u_{k_j} \xrightarrow{\ w\ } u$ in $H^{1,-\sigma'}$ as $j \to \infty$. It follows that the limit function satisfies (see Eq. (5.23))
$$(H - z)u = f.$$
Once again we consider separately the cases $z \ne 0$ and $z = 0$. In the first case, in view of (5.23) and Remark 4.5 the functions $\chi u_k$ are “radiative functions”. Since they are uniformly bounded in $H^{1,-\sigma}$ their “radiative norms” (4.14) are uniformly bounded, and we conclude that also $Ru < \infty$. In the second case, we simply note that $u \in H^{1,-\sigma}$ solves $Hu = f$.

As in the proof of Proposition 5.1 we conclude that in both cases the limit is unique, so that the whole sequence $\{u_k\}_{k=1}^{\infty}$ converges to $u$ in $L^{2,-\sigma}(\mathbb{R}^n)$. Thus, the continuity in the uniform operator topology of $B(H^{-1,s}(\mathbb{R}^n), L^{2,-\sigma}(\mathbb{R}^n))$ is proved.


Finally, we claim that the operator-valued function $z \to R(z)$ is continuous in the uniform operator topology of $B(H^{-1,s}(\mathbb{R}^n), H^{1,-\sigma}(\mathbb{R}^n))$. Indeed, if we invoke Eq. (5.23) we get that also $z \to HR(z)$ is continuous in the uniform operator topology of $B(H^{-1,s}(\mathbb{R}^n), H^{-1,-\sigma}(\mathbb{R}^n))$. Since the domain of $H$ in $H^{-1,-\sigma}(\mathbb{R}^n)$ is $H^{1,-\sigma}(\mathbb{R}^n)$, the claim follows. The conclusion of the theorem follows by taking $\sigma = s$.

Remark 5.2. In view of (5.4) and Remark 4.5 it follows that for $\lambda > 0$ the functions $R^{\pm}(\lambda)f$, $f \in H^{-1,s}$, are “radiative”, i.e. satisfy a Sommerfeld radiation condition.

6. The Eigenfunction Expansion Theorem

In this section we prove Theorem B stated in Sec. 3. We first collect some basic properties of the generalized eigenfunctions in the following proposition.

Proposition 6.1. The generalized eigenfunctions
$$\varphi_{\pm}(x, \xi) = \exp(i\xi x) + \psi_{\pm}(x, \xi)$$
(see (3.4)) are in $H^1_{loc}(\mathbb{R}^n)$ for each fixed $\xi \in \mathbb{R}^n$ and satisfy the equation
$$(H - |\xi|^2)\varphi_{\pm}(x, \xi) = 0. \tag{6.1}$$
In addition, these functions have the following properties:

(i) The map
$$\mathbb{R}^n \ni \xi \to \psi_{\pm}(\cdot, \xi) \in H^{1,-s}(\mathbb{R}^n), \quad s > 1,$$
is continuous.

(ii) For any compact $K \subseteq \mathbb{R}^n$ the family of functions $\{\varphi_{\pm}(x, \xi), \xi \in K\}$ is uniformly bounded and uniformly Hölder continuous in $x \in \mathbb{R}^n$.

Proof. Since $(H - |\xi|^2)\exp(i\xi x) \in H^{-1,s}$, $s > 1$, Eq. (6.1) follows from the definition (3.3) in view of Eq. (5.23). Furthermore, the map
$$\mathbb{R}^n \ni \xi \to (H - |\xi|^2)\exp(i\xi x) \in H^{-1,s}(\mathbb{R}^n), \quad s > 1,$$
is continuous, so the continuity assertion (i) follows from Theorem A.

For $s > 1$ the set of functions $\{\psi_{\pm}(\cdot, \xi), \xi \in K\}$ is uniformly bounded in $H^{1,-s}$. Thus, in view of (6.1), it follows from the De Giorgi–Nash–Moser Theorem [35, Chap. 8] that the set $\{\varphi_{\pm}(x, \xi), \xi \in K\}$ is uniformly bounded and uniformly Hölder continuous in $\{|x| < R\}$ for every $R > 0$. In particular, we can take $R > \Lambda_0$ (see Eq. (1.2)). In the exterior domain $\{|x| > R\}$ the set $\{\psi_{\pm}(x, \xi), \xi \in K\}$ is bounded in $H^{1,-s}$, $s > 1$, and we have $(H_0 - |\xi|^2)\psi_{\pm}(x, \xi) = 0$. In addition the boundary values $\{\psi_{\pm}(x, \xi), |x| = R, \xi \in K\}$ are uniformly bounded. From well-known properties of solutions of the Helmholtz equation, we


conclude that this set is uniformly bounded and therefore, invoking once again the De Giorgi–Nash–Moser Theorem, uniformly Hölder continuous.

Proof of Theorem B. We use the LAP proved in Theorem A, adapting the methodology of Agmon’s proof ([1]) for the eigenfunction expansion in the case of Schrödinger operators with short-range potentials. To simplify notation, we prove for $\mathbb{F}_{+}$.

Let $u \in H^1$ be compactly supported. For any $z$ such that $\operatorname{Im} z \ne 0$ we can write its Fourier transform as
$$\hat u(\xi) = (2\pi)^{-\frac n2} \int_{\mathbb{R}^n} u(x)\exp(-i\xi x)\, dx = \frac{(2\pi)^{-\frac n2}}{|\xi|^2 - z} \int_{\mathbb{R}^n} u(x)(H_0 - z)\exp(-i\xi x)\, dx.$$
Let $\theta \in C_0^{\infty}(\mathbb{R}^n)$ be a (real) cutoff function such that $\theta(x) = 1$ for $x$ in a neighborhood of the support of $u$. We can rewrite the above equality as
$$\hat u(\xi) = \frac{(2\pi)^{-\frac n2}}{|\xi|^2 - z} \langle (H_0 - z)u(x), \theta(x)\exp(i\xi x)\rangle,$$
where $\langle\cdot,\cdot\rangle$ is the $(H^{-1,s}, H^{1,-s})$ bilinear pairing (conjugate linear with respect to the second term). We have therefore, with $f = (H - z)u$,
$$\hat u(\xi) = \frac{(2\pi)^{-\frac n2}}{|\xi|^2 - z} \big( \langle (H - z)u(x), \theta(x)\exp(i\xi x)\rangle + \langle (H_0 - H)\exp(i\xi x), u(x)\rangle \big)$$
$$= \frac{(2\pi)^{-\frac n2}}{|\xi|^2 - z} \big( \langle f(x), \theta(x)\exp(i\xi x)\rangle + \langle f(x), R(\bar z)(H_0 - H)\exp(i\xi x)\rangle \big). \tag{6.2}$$
Introducing the function
$$\tilde f(\xi, z) = \hat f(\xi) + (2\pi)^{-\frac n2} \langle f(x), R(\bar z)(H_0 - H)\exp(i\xi x)\rangle,$$
we have
$$\widehat{R(z)f}(\xi) = \hat u(\xi) = \frac{\tilde f(\xi, z)}{|\xi|^2 - z}, \quad \operatorname{Im} z \ne 0. \tag{6.3}$$
We now claim that this equation is valid for all compactly supported $f \in H^{-1}$. Indeed, let $u = R(z)f \in H^{1,-s}$, $s > 1$. Let $\psi(x) = 1 - \chi(x)$, where $\chi(x)$ is defined in (5.3). We set
$$u_k(x) = \psi(k^{-1}x)u(x), \quad f_k(x) = (H - z)(\psi(k^{-1}x)u(x)), \quad k = 1, 2, 3, \ldots.$$
The equality (6.3) is satisfied with $u, f$ replaced, respectively, by $u_k, f_k$.


Since
$$\psi(k^{-1}x)u(x) \to u(x) \ \text{ as } k \to \infty$$
in $H^{1,-s}$, we have
$$(H - z)(\psi(k^{-1}x)u(x)) \to (H - z)u = f(x) \ \text{ as } k \to \infty$$
in $H^{-1,-s}$, where in the last step we have used Eq. (5.23). In addition, since $(H_0 - H)\exp(i\xi x)$ is compactly supported
$$\langle f_k(x), R(\bar z)(H_0 - H)\exp(i\xi x)\rangle = \langle (H_0 - H)\exp(i\xi x), R(z)f_k(x)\rangle \to \langle (H_0 - H)\exp(i\xi x), R(z)f\rangle = \langle f, R(\bar z)(H_0 - H)\exp(i\xi x)\rangle \ \text{ as } k \to \infty.$$

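Combining these considerations with the continuity of the Fourier transform (on tempered distributions) we establish that (6.3) is valid for all compactly supported $f \in H^{-1}$.

Let $\{E(\lambda), \lambda \in \mathbb{R}\}$ be the spectral family associated with $H$. Let $A(\lambda) = \frac{d}{d\lambda}E(\lambda)$ be its weak derivative. More precisely, we use the well-known formula,
$$A(\lambda) = \frac{1}{2\pi i} \lim_{\epsilon \to 0^{+}} \big(R(\lambda + i\epsilon) - R(\lambda - i\epsilon)\big),$$
to get (using Theorem A), for any $f \in H^{-1,s}$, $s > 1$,
$$\langle f, A(\lambda)f\rangle = \frac{1}{2\pi i} \langle f, (R^{+}(\lambda) - R^{-}(\lambda))f\rangle.$$
We now take $f \in L^2$ and compactly supported. From the resolvent equation we infer
$$R(\lambda + i\epsilon) - R(\lambda - i\epsilon) = 2i\epsilon\, R(\lambda + i\epsilon)R(\lambda - i\epsilon), \quad \epsilon > 0,$$
so that
$$\langle f, A(\lambda)f\rangle = \lim_{\epsilon \to 0^{+}} \frac{\epsilon}{\pi} \|R(\lambda + i\epsilon)f\|_0^2, \quad \epsilon > 0.$$
Using Eq. (6.3) and Parseval’s theorem we therefore have,
$$\langle f, A(\lambda)f\rangle = \lim_{\epsilon \to 0^{+}} \frac{\epsilon}{\pi} \big\|(|\xi|^2 - (\lambda + i\epsilon))^{-1}\tilde f(\xi, \lambda + i\epsilon)\big\|_0^2, \quad \epsilon > 0. \tag{6.4}$$
Note that $\tilde f(\xi, z)$ can be extended continuously as $z \to \lambda + i \cdot 0$ by
$$\tilde f(\xi, \lambda) = \hat f(\xi) + (2\pi)^{-\frac n2} \langle f(x), R^{-}(\lambda)(H_0 - H)\exp(i\xi x)\rangle. \tag{6.5}$$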
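The resolvent identity and the limit $\frac{\epsilon}{\pi}\|R(\lambda + i\epsilon)f\|_0^2$ used above can be illustrated in a toy model that is not part of the paper: replace $H$ by multiplication by a single real number $h$, so that $R(z) = (h - z)^{-1}$. In this hypothetical scalar setting the quantity $\frac{\epsilon}{\pi}|R(\lambda + i\epsilon)|^2$ is a Poisson kernel centred at $h$; its integral over a window of half-width $W$ around $h$ is $\frac{2}{\pi}\arctan(W/\epsilon)$, which tends to 1 as $\epsilon \to 0^{+}$, in line with a spectral measure of total mass 1.

```python
import math

def resolvent(h, z):
    """Resolvent (h - z)^-1 of multiplication by the real number h (toy model)."""
    return 1.0 / (h - z)

def identity_gap(h, lam, eps):
    """Size of R(lam + i eps) - R(lam - i eps) - 2 i eps R(lam + i eps) R(lam - i eps)."""
    r_plus = resolvent(h, complex(lam, eps))
    r_minus = resolvent(h, complex(lam, -eps))
    return abs(r_plus - r_minus - 2j * eps * r_plus * r_minus)

def smoothed_density(h, lam, eps):
    """(eps / pi) |R(lam + i eps)|^2, the scalar analogue of the limit before (6.4)."""
    return eps / math.pi * abs(resolvent(h, complex(lam, eps))) ** 2

def density_mass(h, eps, half_width=50.0, steps=20000):
    """Midpoint rule for the integral of smoothed_density over [h - W, h + W]."""
    d_lam = 2.0 * half_width / steps
    start = h - half_width
    return sum(smoothed_density(h, start + (i + 0.5) * d_lam, eps) * d_lam
               for i in range(steps))

gap = identity_gap(2.0, 1.3, 0.05)   # should vanish up to rounding
mass = density_mass(2.0, 0.05)       # close to (2/pi) * atan(50 / 0.05)
```

In the operator setting the same algebra holds with $|R|^2$ replaced by $\|R(\lambda + i\epsilon)f\|_0^2$, since $R(\lambda - i\epsilon)$ is the adjoint of $R(\lambda + i\epsilon)$ for self-adjoint $H$.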

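In order to study properties of $\tilde f(\xi, z)$ as a function of $\xi$ we compute
$$\tilde f(\xi, z) = \hat f(\xi) + (2\pi)^{-\frac n2} \Big\langle \Big[\sum_{l,j=1}^{n} \partial_l (a_{l,j}(x) - \delta_{l,j})\partial_j\Big] \exp(i\xi x), R(z)f(x) \Big\rangle$$
$$= \hat f(\xi) + (2\pi)^{-\frac n2}\, i \sum_{l,j=1}^{n} \xi_j \int_{\mathbb{R}^n} (a_{l,j}(x) - \delta_{l,j})\, \partial_l(R(z)f(x)) \exp(-i\xi x)\, dx, \tag{6.6}$$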


where in the last step we have used that both $\partial_l(R(z)f(x))$ and $(a_{l,j}(x) - \delta_{l,j})\exp(-i\xi x)$ are in $L^2$. Consider now the integral
$$g(\xi, z) = \int_{\mathbb{R}^n} (a_{l,j}(x) - \delta_{l,j})\, \partial_l(R(z)f(x)) \exp(-i\xi x)\, dx, \quad z \in \Omega,$$
where $\Omega$ is as in (5.1). In view of Theorem A the family $\{\partial_l R(z)f(x)\}_{z \in \Omega}$ is uniformly bounded in $L^{2,-s}$, $s > 1$, so by Parseval’s theorem we get
$$\|g(\cdot, z)\|_0 < C, \quad z \in \Omega,$$
where $C$ only depends on $f$. This estimate and (6.6) imply that, if $f \in L^2$ is compactly supported:

(i) The function
$$\mathbb{R}^n \times \bar\Omega \ni (\xi, z) \to \tilde f(\xi, z)$$
is continuous. For real $z$ it is given by (6.5).

(ii)
$$\lim_{k \to \infty} \int_{|\xi| > k} (|\xi|^2 - z)^{-1}\, |\tilde f(\xi, z)|^2\, d\xi = 0,$$
uniformly in $z \in \Omega$.

As $z \to |\xi|^2 + i \cdot 0$, we have by Theorem A and Eq. (3.4),
$$\lim_{z \to |\xi|^2 + i \cdot 0} \tilde f(\xi, z) = (2\pi)^{-\frac n2} \int_{\mathbb{R}^n} f(x)\overline{\varphi_{+}(x, \xi)}\, dx = \mathbb{F}_{+}f(\xi), \tag{6.7}$$
so that, taking (i) and (ii) into account we obtain from (6.4), for any compactly supported $f \in L^2$,
$$\langle f, A(\lambda)f\rangle = \frac{1}{2\sqrt{\lambda}} \int_{|\xi|^2 = \lambda} |\mathbb{F}_{+}f(\xi)|^2\, d\sigma, \quad \lambda > 0, \tag{6.8}$$
where $d\sigma$ is the surface Lebesgue measure. It follows that for any $[\alpha, \beta] \subseteq [0, \infty)$,
$$((E(\beta) - E(\alpha))f, f) = \int_{\alpha}^{\beta} \langle f, A(\lambda)f\rangle\, d\lambda = \int_{\alpha \le |\xi|^2 \le \beta} |\mathbb{F}_{+}f(\xi)|^2\, d\xi. \tag{6.9}$$
Letting $\alpha \to 0$, $\beta \to \infty$, we get
$$\|f\|_0 = \|\mathbb{F}_{+}f\|_0. \tag{6.10}$$
Thus $f \to \mathbb{F}_{+}f \in L^2(\mathbb{R}^n)$ is an isometry for compactly supported functions, which can be extended by density to all $f \in L^2(\mathbb{R}^n)$. Furthermore, since the spectrum of $H$ is entirely absolutely continuous, it follows that for every $f \in L^2$, Eq. (6.8) holds for almost all $\lambda > 0$ (with respect to the Lebesgue measure).


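Let $f \in D(H)$. By the spectral theorem
$$\langle Hf, A(\lambda)Hf\rangle = \lambda^2 \langle f, A(\lambda)f\rangle = \frac{1}{2\sqrt{\lambda}} \int_{|\xi|^2 = \lambda} \big||\xi|^2\, \mathbb{F}_{+}f(\xi)\big|^2\, d\sigma, \quad \lambda > 0.$$
In particular,
$$\|Hf\|_0^2 = \int_{\mathbb{R}^n} \big||\xi|^2\, \mathbb{F}_{+}f(\xi)\big|^2\, d\xi. \tag{6.11}$$
Conversely, if the right-hand side of (6.11) is finite, then $\int_0^{\infty} \lambda^2 \langle f, A(\lambda)f\rangle\, d\lambda < \infty$, so $f \in D(H)$.

The adjoint operator $\mathbb{F}_{+}^{*}$ is a partial isometry (on the range of $\mathbb{F}_{+}$). If $f(x) \in L^2(\mathbb{R}^n)$ is compactly supported and $g(\xi) \in L^2(\mathbb{R}^n)$ is likewise compactly supported then
$$(\mathbb{F}_{+}f, g) = (2\pi)^{-\frac n2} \int_{\mathbb{R}^n} \Big( \int_{\mathbb{R}^n} f(x)\overline{\varphi_{+}(x, \xi)}\, dx \Big) \overline{g(\xi)}\, d\xi = (2\pi)^{-\frac n2} \int_{\mathbb{R}^n} f(x)\, \overline{\Big( \int_{\mathbb{R}^n} g(\xi)\varphi_{+}(x, \xi)\, d\xi \Big)}\, dx,$$
where in the change of order of integration Proposition 6.1 was taken into account. It follows that for a compactly supported $g(\xi) \in L^2(\mathbb{R}^n)$,
$$(\mathbb{F}_{+}^{*}g)(x) = (2\pi)^{-\frac n2} \int_{\mathbb{R}^n} g(\xi)\varphi_{+}(x, \xi)\, d\xi, \tag{6.12}$$
and the extension to all $g \in L^2(\mathbb{R}^n)$ is obtained by the fact that $\mathbb{F}_{+}^{*}$ is a partial isometry.

Now if $f \in D(H)$, $g \in L^2(\mathbb{R}^n)$, we have
$$(Hf, g) = \int_{\mathbb{R}^n} |\xi|^2\, \mathbb{F}_{+}f(\xi)\, \overline{\mathbb{F}_{+}g(\xi)}\, d\xi = \int_{\mathbb{R}^n} \mathbb{F}_{+}^{*}(|\xi|^2\, \mathbb{F}_{+}f(\xi))\, \overline{g(\xi)}\, d\xi,$$
which is the statement (3.6) of the theorem.

It follows from the spectral theorem that for every interval $J = [\alpha, \beta] \subseteq [0, \infty)$ and for every $f \in L^2(\mathbb{R}^n)$ we have, with $E_J = E(\beta) - E(\alpha)$ and $\chi_J$ the characteristic function of $J$,
$$E_J f(x) = \mathbb{F}_{+}^{*}(\chi_J(|\xi|^2)\, \mathbb{F}_{+}f(\xi)),$$
or
$$\mathbb{F}_{+}E_J f(\xi) = \chi_J(|\xi|^2)\, \mathbb{F}_{+}f(\xi).$$
It remains to prove that the isometry $\mathbb{F}_{+}$ is onto (and hence unitary). So, suppose to the contrary that for some nonzero $g(\xi) \in L^2(\mathbb{R}^n)$
$$(\mathbb{F}_{+}^{*}g)(x) = 0.$$
In particular, for any $f \in L^2(\mathbb{R}^n)$ and any interval $J$ as above,
$$0 = (E_J f, \mathbb{F}_{+}^{*}g) = (\mathbb{F}_{+}E_J f, g) = (\chi_J(|\xi|^2)\, \mathbb{F}_{+}f(\xi), g(\xi)) = (\mathbb{F}_{+}f(\xi), \chi_J(|\xi|^2)g(\xi)),$$
so that
$$\mathbb{F}_{+}^{*}(\chi_J(|\xi|^2)g(\xi)) = 0.$$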


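By Eq. (6.12), we have, for any $0 \le \alpha < \beta$,
$$\int_{\alpha < |\xi|^2 < \beta} g(\xi)\varphi_{+}(x, \xi)\, d\xi = 0,$$
so that, in view of the continuity properties of $\varphi_{+}(x, \xi)$ (see Proposition 6.1), for a.e. $\lambda \in (0, \infty)$,
$$\int_{|\xi|^2 = \lambda} g(\xi)\varphi_{+}(x, \xi)\, d\sigma = 0. \tag{6.13}$$
From the definition (3.4), we get
$$\int_{|\xi|^2 = \lambda} g(\xi)\exp(i\xi x)\, d\sigma - \int_{|\xi|^2 = \lambda} g(\xi)R^{-}(\lambda)((H - \lambda)\exp(i\xi x))\, d\sigma = 0. \tag{6.14}$$
Since $(H - \lambda)\exp(i\xi x)$ is compactly supported (when $|\xi|^2 = \lambda$), the continuity property of $R^{-}(\lambda)$ enables us to write
$$\int_{|\xi|^2 = \lambda} g(\xi)R^{-}(\lambda)((H - \lambda)\exp(i\xi x))\, d\sigma = R^{-}(\lambda) \int_{|\xi|^2 = \lambda} g(\xi)(H - \lambda)\exp(i\xi x)\, d\sigma,$$
which, by Remark 5.2, satisfies a Sommerfeld radiation condition. We conclude that the function
$$G(x) = \int_{|\xi|^2 = \lambda} g(\xi)\exp(i\xi x)\, d\sigma \in H^{1,-s}, \quad s > \frac12,$$
is a radiative solution (see Remark 4.5) of $(-\Delta - \lambda)G = 0$, and hence must vanish. Since this holds for a.e. $\lambda > 0$, we get $\hat g(\xi) = 0$, hence $g = 0$.

7. Global Spacetime Estimates

Proof of Theorem C. (a) Define, with $G = H^{\frac12}$,
$$u_{\pm} = \frac12 (Gu \pm i\partial_t u). \tag{7.1}$$
Then
$$\partial_t u_{\pm} = \mp iGu_{\pm} \pm \frac{i}{2} f. \tag{7.2}$$
Defining
$$U(t) = \begin{pmatrix} u_{+}(t) \\ u_{-}(t) \end{pmatrix} \tag{7.3}$$
we have
$$i^{-1}U'(t) = -KU + F, \qquad K = \begin{pmatrix} G & 0 \\ 0 & -G \end{pmatrix}, \quad F(t) = \begin{pmatrix} \frac12 f(\cdot, t) \\ -\frac12 f(\cdot, t) \end{pmatrix}. \tag{7.4}$$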
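The reduction (7.1)–(7.2) to a first-order system can be verified on a scalar toy model. The sketch below is an illustration under stated assumptions, not the paper's argument: it replaces the operator $G$ by a positive number $g$, assumes the wave equation has the form $\partial_t^2 u + G^2 u = f$ (the form that is consistent with (7.2)), and uses the hypothetical test function $u(t) = t^3$ with the forcing $f$ chosen so that the equation holds exactly.

```python
def reduction_residual(g, t, sign):
    """Residual of (7.2) for the toy solution u(t) = t**3 with scalar G = g > 0.

    sign = +1 checks u_plus, sign = -1 checks u_minus.
    """
    u, du, d2u = t**3, 3.0 * t**2, 6.0 * t
    f = d2u + g * g * u                            # forcing so that u'' + g^2 u = f
    u_pm = 0.5 * (g * u + sign * 1j * du)          # definition (7.1)
    dt_u_pm = 0.5 * (g * du + sign * 1j * d2u)     # exact time derivative of u_pm
    rhs = -sign * 1j * g * u_pm + sign * 0.5j * f  # right-hand side of (7.2)
    return abs(dt_u_pm - rhs)

residual_plus = reduction_residual(1.7, 0.9, +1)
residual_minus = reduction_residual(1.7, 0.9, -1)
```

Both residuals vanish up to rounding because the terms $\pm\frac{i}{2}g^2 u$ cancel; multiplying (7.2) by $i^{-1}$ then gives the two rows of (7.4).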


Note that, as is common when treating evolution equations, we write $U(t), F(t), \ldots$ for $U(x, t), F(x, t), \ldots$ when there is no risk of confusion.

The operator $K$ is a self-adjoint operator on $D = L^2(\mathbb{R}^n) \oplus L^2(\mathbb{R}^n)$. Its spectral family $E_K(\lambda)$ is given by $E_K(\lambda) = E_G(\lambda) \oplus (I - E_G(-\lambda))$, $\lambda \in \mathbb{R}$, where $E_G$ is the spectral family of $G$.

Let $E(\lambda)$ be the spectral family of $H$, and let $A(\lambda) = \frac{d}{d\lambda}E(\lambda)$ be its weak derivative (3.2). By the definition of $G$ we have $E_G(\lambda) = E(\lambda^2)$, hence its weak derivative is given by
$$A_G(\lambda) = \frac{d}{d\lambda}E_G(\lambda) = 2\lambda A(\lambda^2), \quad \lambda > 0. \tag{7.5}$$
In view of the LAP (Theorem A), we therefore have that the operator-valued function
$$A_G(\lambda) \in B(L^{2,s}(\mathbb{R}^n), L^{2,-s}(\mathbb{R}^n)), \quad s > 1,$$
is continuous for $\lambda \ge 0$. Denoting $D^s = L^{2,s}(\mathbb{R}^n) \oplus L^{2,s}(\mathbb{R}^n)$, it follows that
$$A_K(\lambda) = \frac{d}{d\lambda}E_K(\lambda) = A_G(\lambda) \oplus A_G(-\lambda), \quad \lambda \in \mathbb{R},$$
is continuous with values in $B(D^s, D^{-s})$ for $s > 1$.

Making use of Hypotheses (H1) and (H2), we invoke [65, Theorem 5.1] to conclude that $\limsup_{\mu \to \infty} \mu^{\frac12} \|A(\mu)\|_{B(L^{2,s}, L^{2,-s})} < \infty$, so that by (7.5) there exists a constant $C > 0$, such that
$$\|A_G(\lambda)\|_{B(L^{2,s}, L^{2,-s})} < C, \quad \lambda \ge 0. \tag{7.6}$$
It follows that also
$$\|A_K(\lambda)\|_{B(D^s, D^{-s})} < C, \quad s > 1, \ \lambda \in \mathbb{R}. \tag{7.7}$$
Let $\langle\,,\rangle$ be the bilinear pairing between $D^{-s}$ and $D^s$ (conjugate linear with respect to the second term). For any $\psi, \chi \in D^s$ we have, in view of the fact that $A_K(\lambda)$ is a weak derivative of a spectral measure,
$$\text{(i)} \quad |\langle A_K(\lambda)\psi, \chi\rangle|^2 \le \langle A_K(\lambda)\psi, \psi\rangle \cdot \langle A_K(\lambda)\chi, \chi\rangle, \qquad \text{(ii)} \quad \int_{-\infty}^{\infty} \langle A_K(\lambda)\psi, \psi\rangle\, d\lambda = \|\psi\|^2_{L^2(\mathbb{R}^n) \oplus L^2(\mathbb{R}^n)}. \tag{7.8}$$
We first treat the pure Cauchy problem, i.e. $f \equiv 0$.

To estimate $U(x, t) = e^{-itK}U(x, 0)$ we use a duality argument. Some of the following computations will be rather formal, but they can easily be justified by


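a density argument, as in [7, 17]. We shall use $((\,,))$ for the scalar product in $L^2(\mathbb{R}^{n+1}) \oplus L^2(\mathbb{R}^{n+1})$. Take $w(x, t) \in C_0^{\infty}(\mathbb{R}^{n+1}) \oplus C_0^{\infty}(\mathbb{R}^{n+1})$. Then,
$$((U, w)) = \int_{-\infty}^{\infty} \int_{\mathbb{R}^n} e^{-itK}U(x, 0) \cdot \overline{w(x, t)}\, dx\, dt = \int_{-\infty}^{\infty} \Big\langle A_K(\lambda)U(x, 0), \int_{-\infty}^{\infty} e^{it\lambda}w(\cdot, t)\, dt \Big\rangle d\lambda = (2\pi)^{1/2} \int_{-\infty}^{\infty} \langle A_K(\lambda)U(x, 0), \tilde w(\cdot, \lambda)\rangle\, d\lambda,$$
where
$$\tilde w(x, \lambda) = (2\pi)^{-\frac12} \int_{\mathbb{R}} w(x, t)e^{it\lambda}\, dt.$$
Noting (7.8) and (7.7), and using the Cauchy–Schwarz inequality,
$$|((U, w))| \le (2\pi)^{1/2} \|U(x, 0)\|_0 \cdot \Big( \int_{-\infty}^{\infty} \langle A_K(\lambda)\tilde w(\cdot, \lambda), \tilde w(\cdot, \lambda)\rangle\, d\lambda \Big)^{\frac12} \le C \|U(x, 0)\|_0 \cdot \Big( \int_{-\infty}^{\infty} \|\tilde w(\cdot, \lambda)\|^2_{D^s}\, d\lambda \Big)^{\frac12}.$$
It follows from the Plancherel theorem that
$$|((U, w))| \le C \|U(x, 0)\|_0 \Big( \int_{\mathbb{R}} \|w(\cdot, t)\|^2_{D^s}\, dt \Big)^{\frac12}.$$
Let $\phi(x, t) \in C_0^{\infty}(\mathbb{R}^{n+1}) \oplus C_0^{\infty}(\mathbb{R}^{n+1})$, and take $w(x, t) = (1 + |x|^2)^{-\frac s2}\phi(x, t)$, so that
$$|(((1 + |x|^2)^{-\frac s2}U, \phi))| \le C \cdot \|U(x, 0)\|_0 \cdot \|\phi\|_{L^2(\mathbb{R}^{n+1})}.$$
This concludes the proof of the part involving the Cauchy data in (3.14), in view of (7.3).

To prove the part concerning the inhomogeneous equation, it suffices to take $u_0 = v_0 = 0$. In this case the Duhamel principle yields, for $t > 0$,
$$U(t) = \int_0^t e^{-i(t - \tau)K}F(\tau)\, d\tau,$$
where we have used the form (7.4) of the equation.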


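Integrating the inequality
$$\|U(t)\|_{D^{-s}} \le \int_0^t \|e^{-i(t - \tau)K}F(\tau)\|_{D^{-s}}\, d\tau,$$
we get
$$\int_0^{\infty} \|U(t)\|_{D^{-s}}\, dt \le \int_0^{\infty} \int_{\tau}^{\infty} \|e^{-i(t - \tau)K}F(\tau)\|_{D^{-s}}\, dt\, d\tau.$$
Invoking the first part of the proof we obtain
$$\int_0^{\infty} \|U(t)\|_{D^{-s}}\, dt \le C \int_0^{\infty} \|F(\tau)\|_0\, d\tau,$$
which proves the part related to the inhomogeneous term in (3.14).

(b) Define $v_{\pm}(x, t) = \exp(\pm itG)\phi_{\pm}(x)$, where
$$\phi_{\pm}(x) = \frac12 [u_0(x) \mp G^{-1}v_0(x)].$$
Then clearly
$$u(x, t) = v_{+}(x, t) + v_{-}(x, t). \tag{7.9}$$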

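We establish the estimate (3.15) for $v_{+}$. Taking $w(x, t) \in C_0^{\infty}(\mathbb{R}^{n+1})$ we proceed as in the first part of the proof. Let $\langle\,,\rangle$ be the $(L^{2,-s}(\mathbb{R}^n), L^{2,s}(\mathbb{R}^n))$ pairing. Then
$$(v_{+}, w) = \int_{-\infty}^{\infty} \int_{\mathbb{R}^n} e^{itG}\phi_{+}(x) \cdot \overline{w(x, t)}\, dx\, dt = \int_0^{\infty} \Big\langle A_G(\lambda)\phi_{+}, \int_{-\infty}^{\infty} e^{-it\lambda}w(\cdot, t)\, dt \Big\rangle d\lambda = (2\pi)^{1/2} \int_0^{\infty} \langle A_G(\lambda)\phi_{+}, \tilde w(\cdot, \lambda)\rangle\, d\lambda,$$
where
$$\tilde w(x, \lambda) = (2\pi)^{-\frac12} \int_{\mathbb{R}} w(x, t)e^{-it\lambda}\, dt.$$
Noting (7.6) as well as the inequalities (7.8) (with $A_G$ replacing $A_K$) and using the Cauchy–Schwarz inequality,
$$|(v_{+}, w)| \le (2\pi)^{1/2} \|\phi_{+}\|_0 \cdot \Big( \int_0^{\infty} \langle A_G(\lambda)\tilde w(\cdot, \lambda), \tilde w(\cdot, \lambda)\rangle\, d\lambda \Big)^{1/2} \le C \|\phi_{+}\|_0 \cdot \Big( \int_0^{\infty} \|\tilde w(\cdot, \lambda)\|^2_{0,s}\, d\lambda \Big)^{\frac12}.$$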


The Plancherel theorem yields
$$|(v_{+}, w)| \le C \|\phi_{+}\|_0 \Big( \int_{\mathbb{R}} \|w(\cdot, t)\|^2_{0,s}\, dt \Big)^{1/2}.$$
Let $\omega \in C_0^{\infty}(\mathbb{R}^{n+1})$, and take $w(x, t) = (1 + |x|^2)^{-\frac s2}\omega(x, t)$, so that
$$|((1 + |x|^2)^{-\frac s2}v_{+}, \omega)| \le C \cdot \|\phi_{+}\|_0 \cdot \|\omega\|_{L^2(\mathbb{R}^{n+1})}.$$
This (with the similar estimate for $v_{-}$) concludes the proof of the estimate (3.15).

Remark 7.1 (Optimality of the Requirement $s > 1$). A key point in the proof was the use of the uniform bound (7.6). In view of the relation (7.5), this is reduced to the uniform boundedness of $\lambda A(\lambda^2)$, $\lambda \ge 0$, in $B(L^{2,s}, L^{2,-s})$. By [65, Theorem 5.1] the boundedness at infinity, $\limsup_{\mu \to \infty} \mu^{\frac12} \|A(\mu)\| < \infty$, holds already with $s > \frac12$. Thus the further restriction $s > 1$ is needed in order to ensure the boundedness at $\lambda = 0$ (Theorem A).

Remark 7.2. Clearly we can take $[0, T]$ as the time interval, instead of $\mathbb{R}$, for any $T > 0$.

Acknowledgments

This work was partially done during my visits to the Department of Mathematics at Stanford University (Spring 2004) and the Department of Mathematics of the Université de Provence (Marseille, Spring 2006). I am grateful for the hospitality of both departments with special thanks to Professors Rafe Mazzeo and Yves Dermenjian. In addition, very stimulating discussions with S. Agmon, K. Hidano, Y. Pinchover, M. Ruzhansky, M. Sugimoto and T. Umeda are happily acknowledged. The author thanks the referee for calling his attention to the works [19–21].

References

[1] S. Agmon, Spectral properties of Schrödinger operators and scattering theory, Ann. Sc. Norm. Super. Pisa 2 (1975) 151–218.
[2] S. Agmon, J. Cruz-Sampedro and I. Herbst, Spectral properties of Schrödinger operators with potentials of order zero, J. Funct. Anal. 167 (1999) 345–369.
[3] Y. Ameur and B. Walther, Smoothing estimates for the Schrödinger equation with an inverse-square potential, preprint (2007).
[4] M. Beals and W. Strauss, Lp estimates for the wave equation with a potential, Comm. Partial Differential Equations 18 (1993) 1365–1397.
[5] M. Ben-Artzi, Unitary equivalence and scattering theory for Stark-like Hamiltonians, J. Math. Phys. 25 (1984) 951–964.
[6] M. Ben-Artzi, Global estimates for the Schrödinger equation, J. Funct. Anal. 107 (1992) 362–368.
[7] M. Ben-Artzi, Regularity and smoothing for some equations of evolution, in Nonlinear Partial Differential Equations and Their Applications; Collège de France Seminar, Longman Scientific, Vol. 11, eds. H. Brezis and J. L. Lions (Longman Sci. Tech. 1994), pp. 1–12.


[8] M. Ben-Artzi, On spectral properties of the acoustic propagator in a layered band, J. Differential Equations 136 (1997) 115–135.
[9] M. Ben-Artzi, Spectral theory for divergence-form operators, in Spectral and Scattering Theory and Related Topics, ed. H. Ito, Vol. 1607 (RIMS Kokyuroku, 2008), pp. 77–84.
[10] M. Ben-Artzi, Y. Dermenjian and J.-C. Guillot, Analyticity properties and estimates of resolvent kernels near thresholds, Comm. Partial Differential Equations 25 (2000) 1753–1770.
[11] M. Ben-Artzi, Y. Dermenjian and A. Monsef, Resolvent kernel estimates near thresholds, Differential Integral Equations 19 (2006) 1–14.
[12] M. Ben-Artzi and A. Devinatz, The limiting absorption principle for a sum of tensor products applications to the spectral theory of differential operators, J. Anal. Math. 43 (1983/84) 215–250.
[13] M. Ben-Artzi and A. Devinatz, The Limiting Absorption Principle for Partial Differential Operators, Memoirs of the AMS, Vol. 364 (Amer. Math. Soc., 1987).
[14] M. Ben-Artzi and A. Devinatz, Local smoothing and convergence properties for Schrödinger-type equations, J. Funct. Anal. 101 (1991) 231–254.
[15] M. Ben-Artzi and A. Devinatz, Regularity and decay of solutions to the Stark evolution equations, J. Funct. Anal. 154 (1998) 501–512.
[16] M. Ben-Artzi and S. Klainerman, Decay and regularity for the Schrödinger equation, J. Anal. Math. 58 (1992) 25–37.
[17] M. Ben-Artzi and J. Nemirovsky, Remarks on relativistic Schrödinger operators and their extensions, Ann. Inst. H. Poincaré 67 (1997) 29–39.
[18] Ju. M. Berezanskii, Expansion in Eigenfunctions of Selfadjoint Operators, Translations of Mathematical Monographs, Vol. 17 (Amer. Math. Soc., 1968).
[19] J.-F. Bony and D. Häfner, The semilinear wave equation on asymptotically Euclidean manifolds, arXiv:0810.0464.
[20] J.-F. Bony and D. Häfner, Low frequency resolvent estimates for long range perturbations of the Euclidean Laplacian, arXiv:0903.5531.
[21] J.-M. Bouclet, Low frequency estimates for long range perturbations in divergence form, arXiv:0806.3377.
[22] A. Boutet de Monvel-Berthier and D. Manda, Spectral and scattering theory for wave propagation in perturbed stratified media, J. Math. Anal. Appl. 191 (1995) 137–167.
[23] F. E. Browder, The eigenfunction expansion theorem for the general self-adjoint singular elliptic partial differential operator. I. The analytical foundation, Proc. Natl. Acad. Sci. 40 (1954) 454–459.
[24] N. Burq, Semi-classical estimates for the resolvent in nontrapping geometries, Int. Math. Res. Not. 5 (2002) 221–241.
[25] N. Burq, Global Strichartz estimates for nontrapping geometries: About an article by H. Smith and C. Sogge, Comm. Partial Differential Equations 28 (2003) 1675–1683.
[26] H. Chihara, Smoothing effects of dispersive pseudodifferential equations, Comm. Partial Differential Equations 27 (2002) 1953–2005.
[27] A. Cohen and T. Kappeler, Scattering and inverse scattering for steplike potentials in the Schrödinger equation, Indiana Univ. Math. J. 34 (1985) 127–180.
[28] C. Cohen-Tannoudji, B. Diu and F. Laloë, Quantum Mechanics (John Wiley, 1977).
[29] E. Croc and Y. Dermenjian, Analyse spectrale d’une bande acoustique multistratifiée. Partie I: Principe d’absorption limite pour une stratification simple, SIAM J. Math. Anal. 26 (1995) 880–924.
[30] P. D’ancona and L. Fanelli, Strichartz and smoothing estimates for dispersive equations with magnetic potentials, Comm. Partial Differential Equations 33 (2008) 1082–1112.


[31] S. DeBièvre and W. Pravica, Spectral analysis for optical fibers and stratified fluids I: The limiting absorption principle, J. Funct. Anal. 98 (1991) 404–436.
[32] V. G. Deich, E. L. Korotayev and D. R. Yafaev, Theory of potential scattering, taking into account spatial anisotropy, J. Soviet Math. 34 (1986) 2040–2050.
[33] S.-I. Doi, Smoothing effects of Schrödinger evolution groups on Riemannian manifolds, Duke Math. J. 82 (1996) 679–706.
[34] D. M. Eidus, The principle of limiting absorption, in American Mathematical Society Translations, Series 2, Vol. 47 (Amer. Math. Soc., Providence, 1965), pp. 157–192. (Originally in Russian, Mat. Sb. 57 (1962) 13–44).
[35] D. Gilbarg and N. S. Trudinger, Elliptic Partial Differential Equations of Second Order (Springer-Verlag, 1977).
[36] M. Goldberg and W. Schlag, A limiting absorption principle for the three-dimensional Schrödinger equation with $L^p$ potentials, Int. Math. Res. Not. 75 (2004) 4049–4071.
[37] I. Herbst, Spectral theory of the operator $(p^2 + m^2)^{\frac{1}{2}} - Ze^2/r$, Comm. Math. Phys. 53 (1977) 285–294.
[38] I. Herbst, Spectral and scattering theory for Schrödinger operators with potentials independent of $|x|$, Amer. J. Math. 113 (1991) 509–565.
[39] K. Hidano, Morawetz–Strichartz estimates for spherically symmetric solutions to wave equations and applications to semilinear Cauchy problems, Differential Integral Equations 20 (2007) 735–754.
[40] K. Hidano, J. Metcalfe, H. F. Smith, C. D. Sogge and Y. Zhou, On abstract Strichartz estimates and the Strauss conjecture for nontrapping obstacles, to appear in Trans. Amer. Math. Soc. (2009); http://front.math.ucdavis.edu/0805.1673.
[41] L. Hörmander, The Analysis of Linear Partial Differential Operators II (Springer-Verlag, 1983).
[42] T. Hoshiro, On weighted $L^2$ estimates of solutions to wave equations, J. Anal. Math. 72 (1997) 127–140.
[43] T. Hoshiro, Decay and regularity for dispersive equations with constant coefficients, J. Anal. Math. 91 (2003) 211–230.
[44] T. Ikebe, Eigenfunction expansions associated with the Schrödinger operators and their application to scattering theory, Arch. Ration. Mech. Anal. 5 (1960) 1–34.
[45] T. Ikebe and Y. Saito, Limiting absorption method and absolute continuity for the Schrödinger operators, J. Math. Kyoto Univ. Ser. A 7 (1972) 513–542.
[46] T. Ikebe and T. Tayoshi, Wave and scattering operators for second-order elliptic operators in $\mathbb{R}^n$, Publ. RIMS Kyoto Univ. Ser. A 4 (1968) 483–496.
[47] A. D. Ionescu and W. Schlag, Agmon–Kato–Kuroda theorems for a large class of perturbations, Duke Math. J. 131 (2006) 397–440.
[48] M. Kadowaki, Low and high energy resolvent estimates for wave propagation in stratified media and their applications, J. Differential Equations 179 (2002) 246–277.
[49] M. Kadowaki, Resolvent estimates and scattering states for dissipative systems, Publ. RIMS Kyoto Univ. Ser. A 38 (2002) 191–209.
[50] T. Kato, Perturbation Theory for Linear Operators (Springer-Verlag, 1966).
[51] T. Kato and S. T. Kuroda, The abstract theory of scattering, Rocky Mountain J. Math. 1 (1971) 127–171.
[52] T. Kato and K. Yajima, Some examples of smooth operators and the associated smoothing effect, Rev. Math. Phys. 1 (1989) 481–496.
[53] K. Kikuchi and H. Tamura, The limiting amplitude principle for acoustic propagators in perturbed stratified fluids, J. Differential Equations 93 (1991) 260–282.
[54] V. G. Maz’ya and T. O. Shaposhnikova, Theory of Sobolev Multipliers (Springer-Verlag, 2008).


[55] K. Mochizuki, Scattering theory for wave equations with dissipative terms, Publ. RIMS Kyoto Univ. Ser. A 12 (1976) 383–390.
[56] C. S. Morawetz, Time decay for the Klein–Gordon equation, Proc. Roy. Soc. Ser. A 306 (1968) 291–296.
[57] K. Morii, Time-global smoothing estimates for a class of dispersive equations with constant coefficients, Ark. Mat. 46 (2008) 363–375.
[58] E. Mourre, Absence of singular continuous spectrum for certain self-adjoint operators, Comm. Math. Phys. 78 (1980/81) 391–408.
[59] M. Murata and T. Tsuchida, Asymptotics of green functions and the limiting absorption principle for elliptic operators with periodic coefficients, J. Math. Kyoto Univ. 46 (2006) 713–754.
[60] P. Perry, I. M. Sigal and B. Simon, Spectral analysis of $N$-body Schrödinger operators, Ann. Math. 114 (1981) 519–567.
[61] B. Perthame and L. Vega, Morrey–Campanato estimates for Helmholtz equations, J. Funct. Anal. 164 (1999) 340–355.
[62] B. Perthame and L. Vega, Energy decay and Sommerfeld condition for Helmholtz equation with variable index at infinity, preprint (2002).
[63] A. Ja. Povzner, The expansion of arbitrary functions in terms of eigenfunctions of the operator $-\Delta u + cu$, in American Mathematical Society Translations, Series 2, Vol. 60 (Amer. Math. Soc., 1966) 1–49. (Originally in Russian, Mat. Sb. 32 (1953) 109–156).
[64] A. G. Ramm, Justification of the limiting absorption principle in $\mathbb{R}^2$, in Operator Theory and Applications, Fields Institute Communications, Vol. 25, eds. A. G. Ramm, P. N. Shivakumar and A. V. Strauss (Amer. Math. Soc., 2000), pp. 433–440.
[65] D. Robert, Asymptotique de la phase de diffusion à haute énergie pour des perturbations du second ordre du laplacien, Ann. Sci. Ecole Norm. Sup. (4) 25 (1992) 107–134.
[66] M. Ruzhansky and M. Sugimoto, Global $L^2$-boundedness theorems for a class of Fourier integral operators, Comm. Partial Differential Equations 31 (2006) 547–569.
[67] M. Ruzhansky and M. Sugimoto, A smoothing property of Schrödinger equations in the critical case, Math. Ann. 335 (2006) 645–673.
[68] Y. Saito, Spectral Representations for Schrödinger Operators with Long-Range Potentials, Lecture Notes in Mathematics, Vol. 727 (Springer-Verlag, 1979).
[69] B. Simon, Best constants in some operator smoothness estimates, J. Funct. Anal. 107 (1992) 66–71.
[70] C. D. Sogge, Lectures on Non-Linear Wave Equations, 2nd edn. (International Press, 2008).
[71] W. A. Strauss, Nonlinear Wave Equations, CBMS Lectures, Vol. 73 (Amer. Math. Soc., 1989).
[72] R. S. Strichartz, Restrictions of Fourier transforms to quadratic surfaces and decay of solutions of wave equations, Duke Math. J. 44 (1977) 705–714.
[73] M. Sugimoto, Global smoothing properties of generalized Schrödinger equations, J. Anal. Math. 76 (1998) 191–204.
[74] T. Umeda, Generalized eigenfunctions of relativistic Schrödinger operators I, Electronic J. Differential Equations 127 (2006) 1–46.
[75] A. Vasy and J. Wunsch, Positive commutators at the bottom of the spectrum, J. Funct. Anal. 259 (2010) 503–523.
[76] G. Vodev, Local energy decay of solutions to the wave equation for short-range potentials, Asymptot. Anal. 37 (2004) 175–187.
[77] B. G. Walther, A sharp weighted $L^2$-estimate for the solution to the time-dependent Schrödinger equation, Ark. Mat. 37 (1999) 381–393.

November 16, 2010 15:28 WSPC/S0129-055X 148-RMP J070-S0129055X10004211

Reviews in Mathematical Physics Vol. 22, No. 10 (2010) 1241–1243 © World Scientific Publishing Company DOI: 10.1142/S0129055X10004211

REVIEWS IN MATHEMATICAL PHYSICS Author Index Volume 22 (2010)

Barreira, L., Almost additive thermodynamic formalism: Some recent developments, 10 (2010) 1147
Bassi, A., Dürr, D. & Kolb, M., On the long time behavior of free stochastic Schrödinger evolutions, 1 (2010) 55
Ben-Artzi, M., Eigenfunction expansions and spacetime estimates for generators in divergence-form, 10 (2010) 1209
Ben Halima, M., Construction of certain fuzzy flag manifolds, 5 (2010) 533
Brain, S. & Landi, G., The 3D spin geometry of the quantum two-sphere, 8 (2010) 963
Bru, J.-B. & de Siqueira Pedra, W., Effect of a locally repulsive interaction on s-wave superconductors, 3 (2010) 233
Chatterjee, S., Lahiri, A. & Sengupta, A. N., Parallel transport over path spaces, 9 (2010) 1033
Daudé, T. & Nicoleau, F., Inverse scattering in de Sitter–Reissner–Nordström black hole spacetimes, 4 (2010) 431
de Oliveira, G., Asymptotics for Fermi curves: Small magnetic potential, 8 (2010) 881
De Roeck, W., Maes, C., Netočný, K. & Rey-Bellet, L., A note on the non-commutative Laplace–Varadhan integral lemma, 7 (2010) 839
de Siqueira Pedra, W., see Bru, J.-B., 3 (2010) 233
Demirel, S. & Harrell, II, E. M., On semiclassical and universal inequalities for eigenvalues of quantum graphs, 3 (2010) 305
Dimassi, M. & Petkov, V., Spectral shift function for operators with crossed magnetic and electric fields, 4 (2010) 355
Dirr, G., see Schulte-Herbrüggen, T., 6 (2010) 597
Dürr, D., see Bassi, A., 1 (2010) 55
Fehér, L. & Pusztai, B. G., Derivations of the trigonometric $BC_n$ Sutherland model by quantum Hamiltonian reduction, 6 (2010) 699
Glaser, S. J., see Schulte-Herbrüggen, T., 6 (2010) 597
Grigorian, S., Moduli spaces of $G_2$ manifolds, 9 (2010) 1061
Guha, P., Euler–Poincaré flows on the loop Bott–Virasoro group and space of tensor densities and (2 + 1)-dimensional integrable systems, 5 (2010) 485
Harrell, II, E. M., see Demirel, S., 3 (2010) 305
Helmke, U., see Schulte-Herbrüggen, T., 6 (2010) 597
Hidaka, T. & Hiroshima, F., Pauli–Fierz model with Kato-class potentials and exponential decays, 10 (2010) 1181
Hiroshima, F., see Hidaka, T., 10 (2010) 1181
Ichinose, W., On the Feynman path integral for nonrelativistic quantum electrodynamics, 5 (2010) 549
Jenčová, A. & Ruskai, M. B., A unified treatment of convexity of relative entropy and related trace functions, with conditions for equality, 9 (2010) 1099
Jensen, A. & Yajima, K., Spatial growth of fundamental solutions for certain perturbations of the harmonic oscillator, 2 (2010) 193
Kolb, M., see Bassi, A., 1 (2010) 55
Kriz, I., Perturbative deformations of conformal field theories revisited, 2 (2010) 117
Kusuoka, S. & Liang, S., A classical mechanical model of Brownian motion with plural particles, 7 (2010) 733
Lahiri, A., see Chatterjee, S., 9 (2010) 1033
Landi, G., see Brain, S., 8 (2010) 963
Liang, S., see Kusuoka, S., 7 (2010) 733
Longo, R., Martinetti, P. & Rehren, K.-H., Geometric modular action for disjoint intervals and boundary conformal field theory, 3 (2010) 331
Maes, C., see De Roeck, W., 7 (2010) 839
Marin, L., Dynamical bounds for Sturmian Schrödinger operators, 8 (2010) 859
Martinetti, P., see Longo, R., 3 (2010) 331
Matte, O. & Stockmeyer, E., Spectral theory of no-pair Hamiltonians, 1 (2010) 1
Morsella, G. & Tomassini, L., From global symmetries to local currents: The free (scalar) case in four dimensions, 1 (2010) 91
Nachtergaele, B., Schlein, B., Sims, R., Starr, S. & Zagrebnov, V., On the existence of the dynamics for anharmonic quantum oscillator systems, 2 (2010) 207
Netočný, K., see De Roeck, W., 7 (2010) 839
Nicoleau, F., see Daudé, T., 4 (2010) 431
Petkov, V., see Dimassi, M., 4 (2010) 355
Porta, M. & Simonella, S., Borel summability of $\varphi^4_4$ planar theory via multiscale analysis, 9 (2010) 995
Pusztai, B. G., see Fehér, L., 6 (2010) 699
Rehren, K.-H., see Longo, R., 3 (2010) 331
Rey-Bellet, L., see De Roeck, W., 7 (2010) 839
Robert, D., On the Herman–Kluk semiclassical approximation, 10 (2010) 1123
Ruskai, M. B., see Jenčová, A., 9 (2010) 1099
Sanders, K., The locally covariant Dirac field, 4 (2010) 381
Sango, M., Density dependent stochastic Navier–Stokes equations with non-Lipschitz random forcing, 6 (2010) 669
Schlein, B., see Nachtergaele, B., 2 (2010) 207
Schulte-Herbrüggen, T., Glaser, S. J., Dirr, G. & Helmke, U., Gradient flows for optimization in quantum information and quantum dynamics: Foundations and applications, 6 (2010) 597
Sengupta, A. N., see Chatterjee, S., 9 (2010) 1033
Simonella, S., see Porta, M., 9 (2010) 995
Sims, R., see Nachtergaele, B., 2 (2010) 207
Starr, S., see Nachtergaele, B., 2 (2010) 207
Stockmeyer, E., see Matte, O., 1 (2010) 1
Tomassini, L., see Morsella, G., 1 (2010) 91
Yajima, K., see Jensen, A., 2 (2010) 193
Zagrebnov, V., see Nachtergaele, B., 2 (2010) 207
Zhang, R. B. & Zhang, X., Projective module description of embedded noncommutative spaces, 5 (2010) 507
Zhang, X., see Zhang, R. B., 5 (2010) 507
