February 11, 2010 10:0 WSPC/148-RMP
J070-S0129055X10003874
Reviews in Mathematical Physics Vol. 22, No. 1 (2010) 1–53 © World Scientific Publishing Company DOI: 10.1142/S0129055X10003874
SPECTRAL THEORY OF NO-PAIR HAMILTONIANS
OLIVER MATTE
Mathematisches Institut, Ludwig-Maximilians-Universität, Theresienstraße 39, D-80333 München, Germany
[email protected]

EDGARDO STOCKMEYER∗
Institut für Mathematik, Johannes Gutenberg-Universität, Staudingerweg 9, D-55099 Mainz, Germany
[email protected]
Received 27 October 2008
Revised 18 August 2009

We prove an HVZ theorem for a general class of no-pair Hamiltonians describing an atom or positively charged ion with several electrons in the presence of a classical external magnetic field. Moreover, we show that there exist infinitely many eigenvalues below the essential spectrum and that the corresponding eigenfunctions are exponentially localized. The novelty is that the electrostatic and magnetic vector potentials as well as a nonlocal exchange potential are included in the projection determining the model. As a main technical tool, we derive various commutator estimates involving spectral projections of Dirac operators with external fields. Our results apply to all coupling constants $e^2 Z < 1$.

Keywords: Dirac operator; Brown and Ravenhall; no-pair operator; pseudo-relativistic; Furry picture; intermediate pictures; HVZ theorem; exponential localization.

Mathematics Subject Classification 2000: 81Q10, 47B25
1. Introduction

The relativistic dynamics of a single electron moving in the potential of a static nucleus, $V_C \le 0$, in the presence of an external classical magnetic field $B = \mathrm{curl}\, A$ is generated by the Dirac operator$^{a}$

$$ D_{A,V_C} := \alpha\cdot(-i\nabla + A) + \beta + V_C . \tag{1.1} $$

∗ On leave from Mathematisches Institut, Ludwig-Maximilians-Universität, Theresienstraße 39, D-80333 München, Germany.

a Energies are measured in units of $mc^2$, $m$ denoting the rest mass of an electron and $c$ the speed of light. Length is measured in units of $\hbar/(mc)$, which is the Compton wave length divided by $2\pi$. $\hbar$ is Planck's constant divided by $2\pi$. In these units, the square of the elementary charge equals the fine structure constant, $e^2 \approx 1/137.037$.
Here an electron is a state lying in the positive spectral subspace of $D_{A,V_C}$. A ground state of the one-electron atom modeled by $D_{A,V_C}$ can be characterized as an energy minimizing bound state of the restriction of $D_{A,V_C}$ to its positive spectral subspace, $\Lambda^+_{A,V_C} L^2(\mathbb{R}^3, \mathbb{C}^4)$, where

$$ \Lambda^+_{A,V_C} = \frac{1}{2} + \frac{1}{2}\,\mathrm{sgn}(D_{A,V_C}). \tag{1.2} $$

This is confirmed by Dirac's interpretation of the negative spectral subspace as a completely filled sea of virtual electrons which, on account of Pauli's exclusion principle, forces an additional electron to attain a state of positive energy.

On the other hand, it is well known that there is no canonical a priori given atomic or molecular Hamiltonian generating the relativistic time evolution of $N > 1$ interacting electrons. Guided by non-relativistic quantum mechanics one might naively propose to start with the formal expression

$$ \sum_{j=1}^{N} D^{(j)}_{A,V_C} + \sum_{1\le j<k\le N} W_{jk}, \tag{1.3} $$
where the superscript $(j)$ means that the operator below acts on the $j$th electron and $W_{jk} \ge 0$ is the interaction potential between the $j$th and $k$th electron. It then turns out, however, that (1.3) suffers from the phenomenon of continuum dissolution, which is also known as the Brown–Ravenhall disease [9]. That is, the eigenvalue problem associated with (1.3) has no normalizable solutions; see, e.g., [43, 48]. A frequently used ansatz to find a reasonable and semi-bounded Hamiltonian for an $N$-electron atomic or molecular system again incorporates the concept of a Dirac sea. Namely, one projects (1.3) onto the $N$-fold antisymmetric tensor product of a suitable one-electron subspace, i.e. one considers operators of the form

$$ H_N := \Lambda^{+,N}_{A,V} \Big( \sum_{j=1}^{N} D^{(j)}_{A,V_C} + \sum_{1\le j<k\le N} W_{jk} \Big) \Lambda^{+,N}_{A,V}, \tag{1.4} $$

where

$$ \Lambda^{+,N}_{A,V} := \prod_{j=1}^{N} \Lambda^{+(j)}_{A,V}. \tag{1.5} $$
Here $\Lambda^+_{A,V}$ is defined as in (1.1) and (1.2) but with $V_C$ replaced by a new potential $V$. A Hamiltonian of this kind was first introduced by Brown and Ravenhall in [9]. We emphasize that $H_N$ can formally be derived from quantum electrodynamics (QED) by a procedure that neglects the creation and annihilation of electron-positron pairs [47], the latter being defined with respect to $\Lambda^+_{A,V}$. For this reason operators of the form (1.4) are often called no-pair Hamiltonians.

Models of this type are widely used as a starting point for numerical computations in quantum chemistry. We refer the reader to the recent textbook [43] for a detailed exposition of the application of no-pair models in quantum chemistry, for examples
of molecular systems which can be studied effectively by these methods, and an exhaustive reference list. Roughly speaking, no-pair models are supposed to give a good description of heavy atoms and molecules where relativistic effects play an important role in the understanding of their chemical properties but where QED effects can be neglected since their contributions live on a negligible energy scale; compare [43, Chap. 8]. In particular, the binding energies of a molecular system are low enough so that processes involving pair creation and annihilation do not need to be taken into account in the investigation of its chemical properties. Another broad field of application for no-pair models is the theoretical and numerical study of highly ionized heavy atoms; see, e.g. [11,29,30,45] for a review. In fact, since very accurate spectroscopic data is available for highly charged heavy ions they provide important tests of relativistic atomic structure theory and QED. We quote from the introduction to [11]: “Although the proper point of departure for relativistic atomic structure calculations is quantum electrodynamics, very few atomic structure calculations have been carried out entirely within the QED framework. Indeed, almost all relativistic calculations of the structure of many-electron atoms are based on some variant of the Hamiltonian introduced a half century ago by Brown and Ravenhall to understand the helium fine structure”. As already indicated above, various QED effects like retardation of the electron-electron interaction, electron self-energy, and vacuum polarization are not accounted for by the no-pair Hamiltonian. However, by comparing the splitting between eigenvalues of the no-pair operator with experimental values and taking corrections due to the finite mass of the nucleus into account one can infer the size of the omitted QED corrections. 
The good agreement of the discrepancy found in this way with theoretical computations of QED effects provides a test of QED in the strong fields of highly charged nuclei; see, e.g., [29, §3]. In particular, the no-pair energy levels may serve as a first approximation for more accurate and complicated QED calculations; see, for instance, [6]. In practice, the eigenvalues of the no-pair operator and the finite nuclear mass corrections can be determined by means of the (formal) relativistic many-body perturbation theory (MBPT) as described in [11,29,30,45]. There exist variants of MBPT where the negative-energy states discarded in the no-pair approximation are re-introduced in perturbative expansions, which becomes important in certain physical situations [12, 14]. We remark that in the articles cited above the electron–electron interaction is often given by the Coulomb–Breit potential and $A$ is equal to zero.

In the discussion of no-pair models in quantum chemistry and in atomic physics (see, e.g., [43, Chap. 8] and [11]) it is assumed that the spectrum of the Hamiltonian in (1.4) shows the usual qualitative features well known, for instance, for multi-particle Schrödinger operators. That is, the essential spectrum is supposed to cover some positive half-axis and there should exist infinitely many eigenvalues below the ionization threshold (provided $N$ is not too large). In view of the variety and number of applications, it therefore seems worthwhile to complement the treatment of no-pair models in the quantum chemical and physical literature by
mathematically rigorous results and, in particular, to confirm the assumptions on the spectral properties of $H_N$ by providing mathematical proofs.

We would like to give some further comments on the connection between no-pair Hamiltonians and certain computational techniques in quantum chemistry. Namely, there exists a block-diagonalization scheme which is used to represent the formal Coulomb–Dirac operator in (1.3) as a two-fold direct sum of operators acting on two-spinors [23,26]. For the one-particle Coulomb–Dirac operator $D_{A,V_C}$, the two blocks are unitarily equivalent to the restriction of $D_{A,V_C}$ to its positive and negative spectral subspaces, respectively. In the general case, the upper left block turns out to be unitarily equivalent to the no-pair Hamiltonian. One may then expand the upper left block with respect to the Coulomb coupling constant. The partial sums in this expansion then give reasonable approximations for Hamiltonians describing a relativistic molecular system and are implemented in numerical algorithms in quantum chemistry [43, Chaps. 11 and 12]. It has been rigorously shown in [25, 46] that, for sufficiently small Coulomb coupling constants and sufficiently large orders in the expansion, each partial sum in the expansion has a distinguished self-adjoint realization and that the sequence of partial sums converges in the norm resolvent sense. In particular, the spectra of the partial sums, which are directly studied numerically, converge to the spectrum of the no-pair Hamiltonian $H_N$.

Obviously, the question arises how to choose the new potential $V$ determining the projection (1.5) or, in other words, how to fix the Dirac sea for one electron in the presence of the others in a physically efficient way. Various possibilities are discussed from a physical and numerical point of view in [29,45,47,48]; see also [12] where the potential dependence of MBPT results is discussed and a possibility to eliminate this dependence is proposed.
The choice $V = 0$ is referred to as the free picture, or Brown–Ravenhall model [9]. It has by now been studied in many mathematical works [2–5, 10, 17, 20, 21, 24, 27, 28, 35, 37–40, 50–52]. This is due to the fact that the free projection, $\Lambda^+_{0,0}$, can be calculated explicitly both in momentum and position space [2, 40]. The free picture is considered as one extreme case in a family of intermediate pictures [48]. The opposite extreme case, sometimes called the Furry picture (see, e.g., [45, §III.F] and [48]), is given by substituting the (negative) Coulomb potential $V_C$ for $V$. Other members of that family are obtained by choosing $V$ to be equal to $V_C$ plus some additional positive and in general nonlocal operator. The Furry and intermediate pictures give better numerical results in comparison to the free picture [47, 48] and are the preferred choices in MBPT. The additional non-local term may, for instance, incorporate the interaction with the remaining electrons. An example would be the Hartree–Fock potential generated by a set of appropriately chosen orbitals, which is in fact a choice often employed in relativistic MBPT [11, 29, 30, 45].

In this paper, we do not aim to contribute to the subtle question of optimizing the choice of $V$. Rather we try to keep the assumptions on $V$ as general as our techniques permit. Namely, we consider a class of potentials which can be written as $V = V_C + V_H + V_E$, where $V_C$ may have several Coulomb-type singularities,
$V_H$ is a bounded potential function vanishing at infinity, and $V_E$ is a compact nonlocal operator that behaves nicely under conjugation with exponential weights. As already mentioned above, our goal then is to establish some basic qualitative spectral properties of $H_N$. First, we show that $H_N$ is well-defined on a natural dense subspace (which is not obvious) and, thus, has a self-adjoint Friedrichs extension. We further locate its essential spectrum and, assuming that the number of electrons does not exceed the sum of all nuclear charges, we prove that there exist infinitely many eigenvalues below the ionization threshold. Moreover, we show that the corresponding eigenfunctions are exponentially localized. Our results apply to all nuclear charges $Z < e^{-2} \approx 137.037$. The “easy part” of the HVZ theorem, i.e. the upper bound on the ionization threshold, holds for a certain class of possibly unbounded magnetic fields. The “hard part”, i.e. the lower bound on the ionization threshold, as well as our results on existence of eigenvectors and their exponential localization are derived for bounded magnetic fields.

Although the general strategy of our proofs is fairly standard, the discussion of general no-pair models poses a variety of new mathematical problems. As an essential and novel technical input, necessary to obtain any of the results mentioned above, we first derive various commutator estimates involving the spectral projection $\Lambda^+_{A,V}$, exponential weights, and cut-off functions. They describe the non-local properties of $\Lambda^+_{A,V}$ in an $L^2$-sense and might be of independent interest. Similar estimates are obtained in [36] in the case $V = 0$. The case where $V$ does not vanish requires, however, more care due to the complex extension theory for singular Dirac operators.

In order to derive the exponential localization estimate, we employ a strategy from [1] which has been complemented by a number of useful observations in [19].
The argument presented in [1] is advantageous for us since no a priori knowledge on the spectrum is required to prove the localization. (In particular, no eigenvalue equations are exploited in the argument.) Inspired by a remark in [19] we rather argue in the opposite direction and infer the lower bound on the ionization threshold from the localization estimate. This possibility is very convenient in the analysis of the no-pair Hamiltonian $H_N$ since the corresponding argument is very simple and requires only form bounds on the interaction potentials $W_{ij}$. The comparably bad behavior of singular Dirac operators and the resulting weak control on the interaction potentials also complicate the derivation of the upper bound on the ionization threshold, which is obtained by a modified version of the standard Weyl criterion [13] where a suitable strictly monotonic function of the operator is considered. In order to prove the existence of infinitely many bound states we employ minimax principles, proceeding along the lines of [40] where the Brown–Ravenhall model (with $A = 0$) is considered. The main problem here is to replace those arguments in [40] where explicit position or momentum space representations of $\Lambda^+_{0,0}$ are used by new and somewhat more abstract ones.

We remark that our results on spectral projectors also allow us to analyze the Hamiltonian $H_N$ in the free picture, proceeding along the discussion of the Furry and intermediate pictures presented here. There is, however, a subtlety to
consider: Namely, for vanishing magnetic fields, it is known that the one-particle Brown–Ravenhall operator is stable if and only if $Z \le Z_c := \frac{1}{e^2}\,\frac{2}{(2/\pi)+(\pi/2)} \approx 124.2$ [17]. In the presence of an exterior $B$-field one can show that $H_1$ is still bounded below in the free picture, for all $Z \le Z_c$, provided the vector potential is locally bounded and Lipschitz continuous in a neighborhood of the nuclei [36]. (For smaller values of the coupling constant, one can actually prove the stability of matter of the second kind in the free picture treating a gauge fixed vector potential as a variable in the minimization and adding the field energy to the multi-particle Hamiltonian; see [34, 35], where the quantized electro-magnetic field is considered. In this situation it is essential that the vector potential is included in the projection, for otherwise the model is always unstable if $N > 1$ [21].)

Finally, we comment on some closely related recent work. In the free picture and for vanishing exterior magnetic field, an HVZ theorem and the existence of infinitely many eigenvalues below the essential spectrum have been proved in [40], for nuclear charges $Z \le Z_c$. The case $N = 2$ is also treated in [27]. A more general HVZ theorem that applies to different particle species and a wider class of interaction potentials and exterior fields in the free picture has been established in [37]. Moreover, the reduction to irreducible representations of the groups of rotation and reflection and permutations of identical particles is considered in [37]. The $L^2$-exponential localization of eigenvectors in the Brown–Ravenhall model is studied in [38,39], improving and generalizing earlier results from [2]. In all these works, the authors employ explicit position space representations of $\Lambda^+_{0,0}$. An HVZ theorem in the free picture with constant magnetic field is established in [28], again using explicit representations for the projection based on Mehler's formula.
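The numerical value of $Z_c$ quoted above follows from $e^{-2} \approx 137.037$ by elementary arithmetic. As a minimal sketch in plain Python (the name `critical_charge` is an illustrative assumption, not notation taken from [17]):

```python
import math

INVERSE_E2 = 137.037  # e^{-2}, the inverse fine structure constant quoted in the text


def critical_charge(inverse_e2=INVERSE_E2):
    """Z_c = e^{-2} * 2 / (2/pi + pi/2), the free-picture threshold from the text."""
    return inverse_e2 * 2.0 / (2.0 / math.pi + math.pi / 2.0)
```

Evaluating `critical_charge()` gives a value close to 124.2, which lies below the bound $e^{-2} \approx 137.037$ available in the Furry and intermediate pictures.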
By employing somewhat more abstract arguments we are able to study a wider class of projections in this paper. Similar results on the spectral projectors are used in [36] to study the regularity of the eigenvectors of $H_1$ in the free picture and to derive pointwise exponential decay estimates for their partial derivatives of any order (for $Z \le Z_c$ and assuming that all partial derivatives of $A$ increase more slowly than any exponential function). The rate of exponential decay found in [36] is actually the same as the one known for the Chandrasekhar operator and, hence, seems to be the optimal one. For many-electron atoms it is, however, more difficult to prove the exponential localization and, as in [2,38], we shall only get suboptimal bounds on the decay rate in the present article. For recent developments and numerous references to the literature on HVZ theorems we refer to [33].

The article is organized as follows. In Sec. 2, we introduce our mathematical model precisely and state our main theorems. Section 3 is devoted to the study of some non-local properties of $\Lambda^+_{A,V}$ expressed in terms of various commutator estimates which form the basis of the spectral analysis of $H_N$. Moreover, it contains results that allow us to compare the projections $\Lambda^+_{A,V}$ and $\Lambda^+_{A,0}$. In Sec. 4, we derive the exponential localization estimate for $H_N$ and in Sec. 5 infer a lower bound on the
threshold energy. Section 6 deals with Weyl sequences and, finally, in Sec. 7 we show that $H_N$ possesses infinitely many eigenvalues below the threshold energy.

Some frequently used notation. Open balls in $\mathbb{R}^3$ with radius $R > 0$ and center $z \in \mathbb{R}^3$ are denoted by $B_R(z)$. Spectral projections of a self-adjoint operator, $T$, on some Hilbert space are denoted by $E_\lambda(T)$ and $E_I(T)$, if $\lambda \in \mathbb{R}$ and $I$ is an interval. $\mathcal{D}(T)$ denotes the domain of the operator $T$ and $\mathcal{Q}(T)$ its form domain. The characteristic function of a subset $M \subset \mathbb{R}^n$ is denoted by $1_M$. $C, C', C'', \ldots$ denote constants whose values might change from one estimate to another.

2. The Model and Main Results

In our choice of units the free Dirac operator reads

$$ D_0 := -i\alpha\cdot\nabla + \beta := -i\sum_{j=1}^{3} \alpha_j \partial_{x_j} + \beta. $$
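Here $\alpha = (\alpha_1, \alpha_2, \alpha_3)$ and $\beta =: \alpha_0$ are $4 \times 4$ hermitian matrices which satisfy the Clifford algebra relations

$$ \{\alpha_i, \alpha_j\} = 2\delta_{ij}\,\mathbb{1}, \qquad 0 \le i, j \le 3. \tag{2.1} $$

In Dirac's representation, which we fix throughout the paper, they are given as

$$ \alpha_j = \begin{pmatrix} 0 & \sigma_j \\ \sigma_j & 0 \end{pmatrix}, \quad j = 1, 2, 3, \qquad \beta = \begin{pmatrix} \mathbb{1} & 0 \\ 0 & -\mathbb{1} \end{pmatrix}, $$

where $\sigma_1, \sigma_2, \sigma_3$ are the standard Pauli matrices. $D_0$ is a self-adjoint operator in the Hilbert space $\mathscr{H} := L^2(\mathbb{R}^3, \mathbb{C}^4)$ with domain $H^1(\mathbb{R}^3, \mathbb{C}^4)$. Its spectrum is purely absolutely continuous and given by the union of two half-lines,

$$ \sigma(D_0) = \sigma_{ac}(D_0) = (-\infty, -1] \cup [1, \infty). \tag{2.2} $$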
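The relations (2.1) and the location of the spectrum in (2.2) can be checked numerically. The following is a minimal sketch in plain Python and not part of the original argument; the helper names (`block`, `matmul`, `lincomb`, `free_symbol`, `positive_projection`) are illustrative assumptions. It builds the matrices in Dirac's representation, tests the Clifford relations, and uses the fact that the Fourier symbol $\alpha\cdot p + \beta$ of $D_0$ squares to $(1+|p|^2)\,\mathbb{1}$, so that its eigenvalues are $\pm\sqrt{1+|p|^2}$ and its positive spectral projection is $\frac12\big(\mathbb{1} + (\alpha\cdot p+\beta)/\sqrt{1+|p|^2}\big)$.

```python
import math
from itertools import product

# 2x2 building blocks: identity, zero, and the standard Pauli matrices.
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
NEG_I2 = [[-1, 0], [0, -1]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]


def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lincomb(cx, x, cy, y):
    """Entrywise linear combination cx * x + cy * y."""
    n = len(x)
    return [[cx * x[i][j] + cy * y[i][j] for j in range(n)] for i in range(n)]


def max_abs(x):
    """Largest absolute value of an entry."""
    return max(abs(entry) for row in x for entry in row)


IDENTITY4 = block(I2, Z2, Z2, I2)
BETA = block(I2, Z2, Z2, NEG_I2)
ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]
MATRICES = [BETA] + ALPHA  # alpha_0 := beta, then alpha_1, alpha_2, alpha_3


def clifford_defect():
    """Largest entry of {alpha_i, alpha_j} - 2 delta_ij 1 over 0 <= i, j <= 3."""
    worst = 0.0
    for i, j in product(range(4), repeat=2):
        anti = lincomb(1, matmul(MATRICES[i], MATRICES[j]),
                       1, matmul(MATRICES[j], MATRICES[i]))
        target = 2 if i == j else 0
        worst = max(worst, max_abs(lincomb(1, anti, -target, IDENTITY4)))
    return worst


def free_symbol(p):
    """Fourier symbol alpha . p + beta of the free Dirac operator at momentum p."""
    d = BETA
    for pj, aj in zip(p, ALPHA):
        d = lincomb(1, d, pj, aj)
    return d


def positive_projection(p):
    """(1 + symbol / energy) / 2, the positive spectral projection of the symbol."""
    energy = math.sqrt(1.0 + sum(pj * pj for pj in p))
    return lincomb(0.5, IDENTITY4, 0.5 / energy, free_symbol(p))
```

For a sample momentum such as `p = (0.3, -1.2, 2.0)`, the square of `free_symbol(p)` agrees with $(1+|p|^2)$ times the identity up to rounding, and `positive_projection(p)` is idempotent with trace 2, reflecting the two-fold degeneracy of each branch of the spectrum.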
Next, we formulate our precise hypotheses on the exterior electrostatic potential $V_C$ and on the potential $V$ determining the Dirac sea. We think that, at least with regard to the commutator estimates in Sec. 3, it is interesting to keep the conditions on $V_C$ and $V$ fairly general.

Hypothesis 2.1. There is a finite set $Y \subset \mathbb{R}^3$, $\#Y < \infty$, such that $V_C \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3 \setminus Y, \mathscr{L}(\mathbb{C}^4))$ is almost everywhere hermitian and

$$ V_C(x) \to 0, \qquad |x| \to \infty. \tag{2.3} $$
Moreover, there exist $\gamma \in (0, 1)$ and $\varepsilon > 0$ such that the balls $B_\varepsilon(y)$, $y \in Y$, are mutually disjoint and, for $0 < |x| < \varepsilon$ and $y \in Y$,

$$ \|V_C(y + x)\| \le \frac{\gamma}{|x|}. \tag{2.4} $$
Example 2.1. The main example for a potential satisfying Hypothesis 2.1 is certainly the Coulomb potential generated by a finite number of static nuclei,

$$ V_C(x) = -\sum_{y \in Y} \frac{e^2 Z_y}{|x - y|}\,\mathbb{1}, \qquad x \in \mathbb{R}^3 \setminus Y. $$
In this case the restriction on the strength of the singularities of $V_C$ imposed in Hypothesis 2.1 allows for all nuclear charges, $0 \le Z_y < e^{-2} \approx 137.037$, $y \in Y$.

Hypothesis 2.2. $V = V_C + V_H + V_E$, where $V_C$ fulfills Hypothesis 2.1 and $V_H \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3, \mathscr{L}(\mathbb{C}^4))$ is an almost everywhere hermitian matrix-valued function dropping off to zero at infinity,

$$ V_H(x) \to 0, \qquad |x| \to \infty. \tag{2.5} $$
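$V_E$ is compact and has the following property: There exist $m > 0$ and some increasing function $c : [0, m) \to (0, \infty)$ such that, for every $F \in C^1(\mathbb{R}^3, \mathbb{R})$ with $|\nabla F| \le a < m$,

$$ \forall \chi \in C^1(\mathbb{R}^3, [0, 1]) : \quad \big\| [e^{F} V_E e^{-F}, \chi] \big\| \le c(a)\, \|\nabla\chi\|_\infty, \tag{2.6} $$

$$ \big\| [V_E, e^{F}]\, e^{-F} \big\| \le c(a)\, \|\nabla F\|_\infty, \tag{2.7} $$

$$ \lim_{R \to \infty} \big\| 1_{\mathbb{R}^3 \setminus B_R(0)}\, e^{F} V_E e^{-F}\, 1_{\mathbb{R}^3 \setminus B_R(0)} \big\| = 0. \tag{2.8} $$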
Example 2.2. (i) Possible choices for $V_H$ and $V_E$ satisfying the conditions of Hypothesis 2.2 are the Hartree and non-local exchange potentials corresponding to a set of exponentially localized orbitals $\varphi_1, \ldots, \varphi_M \in \mathscr{H}$, $|\varphi_i(x)| \le C e^{-m|x|}$, $1 \le i \le M$, for some $C \in (0, \infty)$. Their Hartree potential is given as

$$ V_H(x) := e^2 \sum_{i=1}^{M} \Big( |\varphi_i|^2 * \frac{1}{|\cdot|} \Big)(x), \qquad x \in \mathbb{R}^3. $$

It incorporates the presence of $M$ electrons in a fixed state into the Dirac sea by a smeared out background density. The exchange potential corresponding to $\varphi_1, \ldots, \varphi_M$ is the integral operator with matrix-valued kernel

$$ V_E(x, y) := e^2 \sum_{i=1}^{M} \frac{\varphi_i(x)\, \varphi_i^*(y)}{|x - y|}. $$
It is a correction to the Hartree potential accounting for the Pauli principle. In the sense of quadratic forms we then have $V_C \le V = V_C + V_H + V_E \le 0$, which justifies the notion “intermediate picture”. These choices of $V_H$ and $V_E$ are discussed, e.g., in [11, 29, 30, 45].

(ii) More generally, we may set $V_H := \varrho * |\cdot|^{-1}$, for some $0 \le \varrho \in L^1 \cap L^{5/3}(\mathbb{R}^3)$. In this case we find some $C \in (0, \infty)$ such that $0 \le V_H \le C/|\cdot|$. Moreover, standard theorems on integral operators show that every kernel with values in the set of hermitian $(4 \times 4)$-matrices satisfying

$$ \|V_E(x, y)\| \le C\, \frac{e^{-m|x-y|}}{\langle x\rangle^{\rho}\, |x - y|\, \langle y\rangle^{\rho}}, $$
for some $m, \rho, C > 0$, yields a compact operator satisfying the conditions of Hypothesis 2.2.

As a first consequence of Hypothesis 2.2 we find, for every locally bounded vector potential $A : \mathbb{R}^3 \to \mathbb{R}^3$, a distinguished self-adjoint realization of the Dirac operator

$$ D_{A,V} = \alpha\cdot(-i\nabla + A) + \beta + V, $$

whose essential spectrum is again contained in $(-\infty, -1] \cup [1, \infty)$; see Lemma 3.2 below, where we recall some important well-known facts on Dirac operators with singular potentials. Therefore, it makes sense to define the spectral projections,

$$ \Lambda^+_{A,V} := E_{[e_0, \infty)}(D_{A,V}), \qquad \Lambda^-_{A,V} := 1 - \Lambda^+_{A,V}, \tag{2.9} $$

where

$$ e_0 \in \varrho(D_{A,V}) \cap (-1, 1), \tag{2.10} $$

and $\varrho(D_{A,V})$ denotes the resolvent set of $D_{A,V}$. For later reference we introduce the parameter

$$ \delta_0 := \sqrt{1 - e_0^2}. \tag{2.11} $$
Many of our technical results on $D_{A,V}$, for instance the various commutator estimates of Sec. 3, actually hold true under the mere assumption that the components of $A$ are locally bounded. Of course, if not all eigenvalues of $D_{A,V}$ are larger than $-1$ and $e_0$ is chosen between $-1$ and the lowest eigenvalue, the physical relevance of the $N$-particle Hamiltonian $H_N$ becomes rather questionable. We remark that such situations are not excluded by our hypotheses. For instance, if $V_C$ is the Coulomb potential and the intensity of a constant exterior magnetic field is increased, then the lowest eigenvalue of $D_{A,V_C}$ eventually reaches the lower continuum [16]. Nevertheless, our theorems hold for any choice of $e_0$ as in (2.10).
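In order to define the atomic no-pair Hamiltonian precisely we first set

$$ \mathscr{H}_N := \bigotimes_{i=1}^{N} \mathscr{H}, \qquad \mathscr{H}_N^+ := \Lambda^{+,N}_{A,V}\, \mathscr{H}_N, \qquad N \in \mathbb{N}, \qquad \mathscr{H}^+ := \mathscr{H}_1^+, $$

where $\Lambda^{+,N}_{A,V}$ is given by (1.5) and (2.9). We let $W : \mathbb{R}^3 \times \mathbb{R}^3 \to [0, \infty]$ denote the interaction potential between two electrons.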
Hypothesis 2.3. There is some $\gamma > 0$ such that, for all $x, y \in \mathbb{R}^3$, $x \neq y$,

$$ 0 \le W(x, y) = W(y, x) \le \gamma\, |x - y|^{-1}. \tag{2.12} $$
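When we consider $N$ electrons located at $x_1, \ldots, x_N \in \mathbb{R}^3$ we denote their common position variable by $X = (x_1, \ldots, x_N)$. Furthermore, we often write $W_{jk}$ for the maximal multiplication operator in $\mathscr{H}_N$ induced by the function $(\mathbb{R}^3)^N \ni X \mapsto W(x_j, x_k)$. For $N \in \mathbb{N}$, we introduce a symmetric, semi-bounded operator acting in $\mathscr{H}_N^+$ by

$$ \mathcal{D}(\mathring{H}_N) := \Lambda^{+,N}_{A,V}\, \mathscr{D}_N, \qquad \mathscr{D}_N := \bigotimes_{i=1}^{N} \mathscr{D}, \qquad \mathscr{D} := C_0^\infty(\mathbb{R}^3, \mathbb{C}^4), \tag{2.13} $$

$$ \mathring{H}_N \Phi := \Lambda^{+,N}_{A,V} \Big( \sum_{i=1}^{N} D^{(i)}_{A,V_C} + \sum_{1 \le i < j \le N} W_{ij} \Big) \Lambda^{+,N}_{A,V}\, \Phi, \qquad \Phi \in \mathcal{D}(\mathring{H}_N). \tag{2.14} $$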
Proposition 2.1. Assume that $V$ fulfills Hypothesis 2.2, $W$ fulfills Hypothesis 2.3, and that $A \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3, \mathbb{R}^3)$ satisfies $\|e^{-\tau_0 |x|} A\|_\infty < \infty$, for some $0 \le \tau_0 < \min\{\delta_0, m\}$. Then the operator $\mathring{H}_N$ given by (2.13) and (2.14) is well-defined, symmetric, and semi-bounded from below.

Proof. The only claim that is not obvious is that $W_{ij} \Lambda^{+,N}_{A,V} \phi$ is again square-integrable, for every $\phi \in \mathscr{D}_N$. This follows, however, from Corollary 3.3.

We denote the Friedrichs extension of $\mathring{H}_N$ by $H_N$. Note that we do not require the elements in the domain of $H_N$ to be anti-symmetric since in our proofs it is convenient to consider $H_N$ as an operator acting in the full tensor product. Of course, in the end we shall be interested in the restriction of $H_N$ to the anti-symmetric (fermionic) subspace of $\mathscr{H}_N^+$. We denote the anti-symmetrization operator on $\mathscr{H}_N$ by $\mathcal{A}_N$,

$$ (\mathcal{A}_N \Phi)(X) = \frac{1}{N!} \sum_{\sigma \in S_N} \mathrm{sgn}(\sigma)\, \Phi(x_{\sigma(1)}, \ldots, x_{\sigma(N)}), \qquad \Phi \in \mathscr{H}_N, \tag{2.15} $$

where $S_N$ is the group of permutations of $\{1, \ldots, N\}$, and define the no-pair Hamiltonian by

$$ H_N^{\mathcal{A}} := H_N \restriction_{\mathcal{A}_N \mathscr{H}_N^+}. $$
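The action of the anti-symmetrization operator in (2.15) can be illustrated on a finite-dimensional toy model. The sketch below is an illustration under simplifying assumptions and not part of the paper's construction: the continuous variables $x_j$ are replaced by labels in a finite set, $\Phi$ is stored as a dictionary from $N$-tuples of labels to amplitudes, and the names `sign` and `antisymmetrize` are hypothetical.

```python
from itertools import permutations, product
from math import factorial


def sign(perm):
    """Sign of a permutation of 0..N-1, computed from its number of inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def antisymmetrize(phi, n_particles, n_labels):
    """Apply (1/N!) sum_sigma sgn(sigma) phi(x_sigma(1), ..., x_sigma(N)).

    phi maps N-tuples of one-particle labels to amplitudes; missing keys count as 0.
    """
    result = {}
    for config in product(range(n_labels), repeat=n_particles):
        total = 0.0
        for perm in permutations(range(n_particles)):
            permuted = tuple(config[perm[k]] for k in range(n_particles))
            total += sign(perm) * phi.get(permuted, 0.0)
        result[config] = total / factorial(n_particles)
    return result
```

Applying `antisymmetrize` twice gives the same result as applying it once, in line with the operator in (2.15) being a projection, and amplitudes at configurations with a repeated label vanish, as the Pauli principle requires.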
Our first main result is the following theorem, where

$$ E_N^{\mathcal{A}} := \inf \sigma(H_N^{\mathcal{A}}), \quad N \in \mathbb{N}, \qquad E_0^{\mathcal{A}} := 0. \tag{2.16} $$
Theorem 2.1 (Exponential Localization). Assume that $V$ and $W$ fulfill Hypotheses 2.2 and 2.3, respectively. If $A \in C^1(\mathbb{R}^3, \mathbb{R}^3)$ and $B = \mathrm{curl}\, A$ is bounded and if $I \subset \mathbb{R}$ is an interval with $\sup I < E_{N-1}^{\mathcal{A}} + 1$, then there exists $b \in (0, \infty)$ such that $\mathrm{Ran}(E_I(H_N^{\mathcal{A}})) \subset \mathcal{D}(e^{b|X|})$ and

$$ \big\| e^{b|X|} E_I(H_N^{\mathcal{A}}) \big\| < \infty. $$

Proof. This theorem is proved in Sec. 4.

Remark 2.1. In the case $N = 1$ the assertion of Theorem 2.1 still holds true under the assumptions of Proposition 2.1. This follows from the proof of Theorem 2.1. In fact, for $N = 1$, we do not have to control error terms involving the interaction $W$, which is the only reason why $B$ is assumed to be bounded in Theorem 2.1. If also $V = V_C$, then we obtain an exponential localization estimate for an eigenvector, $\phi_E$, with eigenvalue $E \in (-1, 1)$ of the Dirac operator $D_{A,V_C}$. The estimate on the decay rate which could be extracted from our proof is, however, suboptimal due to error terms coming from the projections; see also [7] for decay estimates for Dirac operators.

Next, we introduce a hypothesis which is used to prove the “easy part” of the HVZ theorem below.

Hypothesis 2.4. (i) For every $\lambda \ge 1$, there exist radii, $1 \le R_1 < R_2 < \cdots$, $R_n \nearrow \infty$, and $\psi_1, \psi_2, \ldots \in \mathscr{D}$ such that

$$ \|\psi_n\| = 1, \qquad \mathrm{supp}(\psi_n) \subset \mathbb{R}^3 \setminus B_{R_n}(0), \qquad \lim_{n \to \infty} \|(D_A - \lambda)\psi_n\| = 0. \tag{2.17} $$

(ii) $A \in C^1(\mathbb{R}^3, \mathbb{R}^3)$ and $B = \mathrm{curl}\, A$ has the following property: There are $b_1 \in (0, \infty)$ and $0 \le \tau < \min\{\delta_0, m\}$ ($m$ and $\delta_0$ are the parameters appearing in Hypothesis 2.2 and (2.11)) such that, for all $x, y \in \mathbb{R}^3$,

$$ |B(x) - B(y)| \le b_1\, e^{\tau |x - y|}. \tag{2.18} $$
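Example 2.3 ([22]). We recall a result from [22] which provides a large class of examples where Hypothesis 2.4(i) is fulfilled: Suppose that $A \in C^\infty(\mathbb{R}^3, \mathbb{R}^3)$, $B = \mathrm{curl}\, A$, and set, for $x \in \mathbb{R}^3$ and $\nu \in \mathbb{N}$,

$$ \epsilon_0(x) := |B(x)|, \qquad \epsilon_\nu(x) := \frac{\sum_{|\alpha| = \nu} |\partial^\alpha B(x)|}{1 + \sum_{|\alpha| < \nu} |\partial^\alpha B(x)|}. $$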
Suppose further that there exist $\nu \in \mathbb{N}_0$, $z_1, z_2, \ldots \in \mathbb{R}^3$, and $\rho_1, \rho_2, \ldots > 0$ such that $\rho_n \nearrow \infty$, the balls $B_{\rho_n}(z_n)$, $n \in \mathbb{N}$, are mutually disjoint and

$$ \sup\{ \epsilon_\nu(x) \mid x \in B_{\rho_n}(z_n) \} \to 0, \qquad n \to \infty. $$

Then there is a Weyl sequence, $\psi_1, \psi_2, \ldots$, that satisfies the conditions of Hypothesis 2.4(i).

The fact that the additional assumption of Part (ii) of the next theorem yields a lower bound on the ionization threshold is an observation made in [19] for Schrödinger operators.

Theorem 2.2 (HVZ). Assume that $V$ and $W$ fulfill Hypotheses 2.2 and 2.3, respectively. Then the following assertions hold true:

(i) If Hypothesis 2.4 is fulfilled also, then $\sigma_{\mathrm{ess}}(H_N^{\mathcal{A}}) \supset [E_{N-1}^{\mathcal{A}} + 1, \infty)$.

(ii) Assume additionally that, for every interval $I \subset \mathbb{R}$ with $\sup I < E_{N-1}^{\mathcal{A}} + 1$, there is some $g \in C(\mathbb{R}, (0, \infty))$ such that $g(r) \to \infty$, as $r \to \infty$, and $g(|X|)\, E_I(H_N^{\mathcal{A}}) \in \mathscr{L}(\mathcal{A}_N \mathscr{H}_N)$. Then $\sigma_{\mathrm{ess}}(H_N^{\mathcal{A}}) \subset [E_{N-1}^{\mathcal{A}} + 1, \infty)$. In particular, this inclusion is valid if $A \in C^1(\mathbb{R}^3, \mathbb{R}^3)$ and $B = \mathrm{curl}\, A$ is bounded.
Proof. (i) follows directly from Lemma 6.3 and (ii) follows from Theorems 5.1 and 2.1.

To show the existence of infinitely many eigenvalues below the bottom of the essential spectrum of $H_N^{\mathcal{A}}$ we certainly need a condition on the relationship between $V_C$, $W$, and the magnetic field. To formulate it we set, for $\delta, R > 0$,

$$ S_\delta(R) := \{ x \in \mathbb{R}^3 : (1 - \delta) R \le |x| \le (1 + \delta) R \}, \tag{2.19} $$

$$ v(\delta, R) := \sup_{x \in S_\delta(R)}\ \sup_{|v| = 1} \langle v \,|\, V_C(x)\, v \rangle_{\mathbb{C}^4}. \tag{2.20} $$
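Hypothesis 2.5. (i) $V$ fulfills Hypothesis 2.2.

(ii) $A \in C^1(\mathbb{R}^3, \mathbb{R}^3)$ and $B = \mathrm{curl}\, A$ is bounded.

(iii) There exist radii $1 \le R_1 < R_2 < \cdots$, $R_n \nearrow \infty$, some constant $\delta \in (0, 1/N)$, and a sequence of spinors, $\psi_1, \psi_2, \ldots \in \mathscr{D}$, with vanishing lower spinor components, $\psi_n = (\psi_{n,1}, \psi_{n,2}, 0, 0)^\top$, $n \in \mathbb{N}$, such that

$$ \|\psi_n\| = 1, \qquad \mathrm{supp}(\psi_n) \subset \{ R_n < |x| < (1 + \delta/2) R_n \}, \qquad 2 R_n \le R_{n+1}, \tag{2.21} $$

for all $n \in \mathbb{N}$, and

$$ \|(D_A - 1)\psi_n\| = O(1/R_n), \qquad n \to \infty. \tag{2.22} $$

(iv) $W$ fulfills Hypothesis 2.3 and, for every $\delta \in (0, \frac{1}{N})$, we find some $\varepsilon \in (0, 1)$ such that

$$ \limsup_{n \to \infty}\ R_n \Big\{ v(\delta, R_n) + (N - 1) \sup_{|x - y| \ge (1 - \varepsilon) R_n} W(x, y) \Big\} < 0. $$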
Example 2.4. $V = V_C + V_H + V_E$ and $W$ fulfill Hypothesis 2.5(i) and (iv), if $V_C$ is given as in Example 2.1 with $\sum_{y \in Y} Z_y \ge N$, $V_H$ and $V_E$ are given as in Example 2.2(i) or (ii), and $W$ is the Coulomb repulsion, $W(x, y) = e^2/|x - y|$. Hypothesis 2.5(iii) is fulfilled under the following strengthened version of the condition given in [22]: Suppose again that $A \in C^\infty(\mathbb{R}^3, \mathbb{R}^3)$, $B = \mathrm{curl}\, A$, and let $B_{\rho_n}(z_n)$ denote the balls appearing in Example 2.3. Suppose additionally that there is some $C \in (0, \infty)$ such that $\rho_n < |z_n| \le C \rho_n$, for all $n \in \mathbb{N}$, and that either

$$ \sup\{ |B(x)| : x \in B_{\rho_n}(z_n) \} \le C/|z_n|^2, \qquad n \in \mathbb{N}, $$

or

$$ \forall\, n \in \mathbb{N} : |B(z_n)| \ge 1/C \qquad \text{and} \qquad \sup\{ \epsilon_\nu(x) \mid x \in B_{\rho_n}(z_n) \} = o(\rho_n^{-\nu}). $$

Then we find a Weyl sequence $\psi_1, \psi_2, \ldots$ satisfying the conditions in Hypothesis 2.5(iii). This follows by inspecting and adapting all relevant proofs in [22]. We leave this procedure to the reader since it is straightforward but somewhat lengthy.

Theorem 2.3 (Existence of Bound States). Assume that $V$, $W$, and $A$ fulfill Hypothesis 2.5. Then $H_N^{\mathcal{A}}$ has infinitely many eigenvalues below the infimum of its essential spectrum, $\inf \sigma_{\mathrm{ess}}(H_N^{\mathcal{A}}) = E_{N-1}^{\mathcal{A}} + 1$.

Proof. This theorem is proved in Sec. 7.
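For a single nucleus of charge $Z$ at the origin, $V_C(x) = -e^2 Z/|x|$, and the Coulomb repulsion $W(x, y) = e^2/|x - y|$, the expression in Hypothesis 2.5(iv) can be evaluated in closed form: $v(\delta, R) = -e^2 Z/((1+\delta)R)$ and the supremum of $W$ over $|x - y| \ge (1 - \varepsilon)R$ equals $e^2/((1-\varepsilon)R)$, so the quantity in braces, multiplied by $R_n$, equals $e^2\{-Z/(1+\delta) + (N-1)/(1-\varepsilon)\}$ for every $n$. The following minimal sketch is restricted to this one-nucleus special case, which is an assumption made here for illustration; the function names are hypothetical.

```python
E2 = 1.0 / 137.037  # square of the elementary charge in the units of the text


def threshold_expression(z, n, delta, eps):
    """e^2 * (-Z/(1+delta) + (N-1)/(1-eps)) for one nucleus and Coulomb repulsion."""
    return E2 * (-z / (1.0 + delta) + (n - 1) / (1.0 - eps))


def admissible_eps(z, n, delta):
    """Return some eps in (0, 1) making the expression negative, or None.

    The expression is negative exactly when eps < 1 - (N-1)(1+delta)/Z.
    """
    bound = 1.0 - (n - 1) * (1.0 + delta) / z
    return bound / 2.0 if bound > 0.0 else None
```

With $Z = N = 10$ and $\delta = 0.05 < 1/N$ an admissible $\varepsilon$ exists, whereas for $Z = 5 < N = 10$ none does, which is consistent with the requirement $\sum_{y \in Y} Z_y \ge N$ in Example 2.4.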
3. Spectral Projections of the Dirac Operator

In this section, we study spectral projections of Dirac operators with singular potentials in magnetic fields. We start by recalling some basic well-known facts about Dirac operators in Sec. 3.1. A crucial role is played by the resolvent identity stated in that subsection, which applies to Coulomb singularities with coupling constants up to $e^2 Z < 1$. We remark that the domains of the Dirac operators studied here are not known in general and actually change when the strength of a Coulomb-type potential is increased. Consequently, the usual resolvent identities are not applicable and all formal manipulations involving Dirac operators and their spectral projections have to be treated carefully in the whole paper. In Sec. 3.2 we derive some norm estimates on resolvents of Dirac operators which are conjugated with exponential weight functions. We verify that the conjugated resolvent stays bounded provided the weights increase with an exponential rate smaller than $\sqrt{1 - (\Re z)^2}$, where $z \in (-1, 1) + i\mathbb{R}$ is the spectral parameter. The simple Neumann-type argument we employ to prove this for non-vanishing electrostatic potentials might be new. In Sec. 3.3, we derive the main technical tools of this paper, namely, various commutator estimates involving spectral projections of singular Dirac operators. Some long and technical proofs are postponed to Sec. 3.5. Finally, in Sec. 3.4, we study the difference of projections with and without electrostatic potentials.
3.1. Basic properties of Dirac operators with singular potentials in magnetic fields

In the next lemma, we collect various well-known results on Dirac operators which play an important role in the whole paper. To this end we let $H^s_c := H^s_c(\mathbb{R}^3,\mathbb{C}^4)$ denote all elements of $H^s := H^s(\mathbb{R}^3,\mathbb{C}^4)$, $s\in\mathbb{R}$, having compact support. Moreover, we denote the canonical extension of $D_0$ to an element of $\mathcal{L}(H^{1/2},H^{-1/2})$ by $\check{D}_0$. It shall sometimes be convenient to consider the singular part of $V_C$,

\[
V_C^s(x) := \sum_{y\in\mathcal{Y}} \rho(x-y)\,V_C(x), \qquad x\in\mathbb{R}^3, \tag{3.1}
\]

where $\rho\in C_0^\infty(\mathbb{R}^3,[0,1])$ equals 1 on $B_{\varepsilon/2}(0)$ and 0 outside $B_\varepsilon(0)$. Here $\varepsilon$ is the parameter appearing in Hypothesis 2.1. We let $V_C^s(x) = S(x)\,|V_C^s|(x)$ denote the polar decomposition of $V_C^s(x)$. By Hardy's inequality we know that $V_C^s$ is a bounded operator from $H^1(\mathbb{R}^3,\mathbb{C}^4)$ to $L^2(\mathbb{R}^3,\mathbb{C}^4)$. By duality and interpolation it possesses a unique extension $\check{V}_C^s\in\mathcal{L}(H^{1/2},H^{-1/2})$. Given some $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ we set $A^s := (1-\vartheta)A$, where $\vartheta\in C_0^\infty(\mathbb{R}^3,[0,1])$ is equal to 1 on some ball containing $\mathrm{supp}(V_C^s)$, so that $V_C^s$ and $A^s$ have disjoint supports by definition. We let $\alpha\cdot A^s(x) = U(x)\,|\alpha\cdot A^s(x)|$ denote the polar decomposition of $\alpha\cdot A^s(x)$ and note that the operator sum $\check{D}_0 + \alpha\cdot A^s + \check{V}_C^s$ is well-defined as an element of $\mathcal{L}(H_c^{1/2},H_c^{-1/2})$, for every $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$. As a consequence, the application of the following lemma eventually becomes more convenient.

Lemma 3.1 ([8, 41, 42, 44]). Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and $V_C$ fulfills Hypothesis 2.1. Then there is a unique self-adjoint operator, $D_{A^s,V_C^s}$, such that:

(i) $\mathcal{D}(D_{A^s,V_C^s}) \subset H^{1/2}_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{C}^4)$.

(ii) For all $\psi\in H_c^{1/2}(\mathbb{R}^3,\mathbb{C}^4)$ and $\phi\in\mathcal{D}(D_{A^s,V_C^s})$,

\[
\langle\psi\,|\,D_{A^s,V_C^s}\phi\rangle = \langle |D_0|^{1/2}\psi\,|\,\mathrm{sgn}(D_0)|D_0|^{1/2}\phi\rangle + \langle |\alpha\cdot A^s|^{1/2}\psi\,|\,U|\alpha\cdot A^s|^{1/2}\phi\rangle + \langle |V_C^s|^{1/2}\psi\,|\,S|V_C^s|^{1/2}\phi\rangle .
\]

Proof. In [44, Proposition 4.3] it is observed that the claim follows from [8, Theorem 1.3] and [41, 42].

Consequently, we may define a self-adjoint operator,

\[
D_{A,V} := D_{A^s,V_C^s} + \alpha\cdot(A-A^s) + (V_C - V_C^s) + V_H + V_E \tag{3.2}
\]

on the domain $\mathcal{D}(D_{A,V}) = \mathcal{D}(D_{A^s,V_C^s})$. Notice that in (3.2) we only add bounded operators to $D_{A^s,V_C^s}$. We state some of its properties in the following lemma, where

\[
R_{A,V}(z) := (D_{A,V}-z)^{-1}, \qquad z\in\varrho(D_{A,V}). \tag{3.3}
\]
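The polar decomposition of the matrix-valued multiplier $\alpha\cdot A^s(x)$ used above is explicit: the Clifford relations give $(\alpha\cdot v)^2 = |v|^2\,\mathbb{1}$ for $v\in\mathbb{R}^3$, so $|\alpha\cdot v| = |v|\,\mathbb{1}$ and the partial isometry is $\alpha\cdot v/|v|$ wherever $v\neq 0$. The following is a minimal numerical sketch of this fact. It assumes the standard (Dirac) representation of the $\alpha$ matrices, which is one common choice and need not be the one fixed in (2.1); all function names are illustrative.

```python
import math

# Pauli matrices; the alpha matrices below are the standard (Dirac) representation,
# an assumption made for this illustration only.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def alpha(j):
    """4x4 matrix alpha_j = [[0, sigma_j], [sigma_j, 0]]."""
    s = SIGMA[j]
    return [
        [0, 0, s[0][0], s[0][1]],
        [0, 0, s[1][0], s[1][1]],
        [s[0][0], s[0][1], 0, 0],
        [s[1][0], s[1][1], 0, 0],
    ]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def alpha_dot(v):
    """The Hermitian matrix alpha . v for a real 3-vector v."""
    mats = [alpha(j) for j in range(3)]
    return [[sum(v[j] * mats[j][r][c] for j in range(3)) for c in range(4)]
            for r in range(4)]


def deviation_from_scalar(v):
    """Largest entry of (alpha.v)^2 - |v|^2 * identity, in absolute value."""
    m = alpha_dot(v)
    square = matmul(m, m)
    norm_sq = sum(x * x for x in v)
    return max(abs(square[r][c] - (norm_sq if r == c else 0))
               for r in range(4) for c in range(4))


def polar_factor(v):
    """U = alpha.v / |v|, the unitary factor in alpha.v = U |alpha.v| for v != 0."""
    norm = math.sqrt(sum(x * x for x in v))
    return [[entry / norm for entry in row] for row in alpha_dot(v)]
```

For any nonzero `v`, `deviation_from_scalar(v)` should vanish up to rounding, and `polar_factor(v)` squares to the identity, which is the finite-dimensional content of $\|\alpha\cdot v\| = |v|$.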
Lemma 3.2 ([8, 41, 42, 44]). Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and that $V$ fulfills Hypothesis 2.2. Then the following assertions hold true:

(a) $1_{B_R(0)}(D_{A,V}-i)^{-1}$ is compact, for every $R>0$.

(b) $\sigma_{\mathrm{ess}}(D_{A,V}) = \sigma_{\mathrm{ess}}(D_A)$, $\sigma(D_A)\subset(-\infty,-1]\cup[1,\infty)$.

(c) $D_{A,V}$ is essentially self-adjoint on

\[
\mathcal{D}_e := \{\phi\in H_c^{1/2}(\mathbb{R}^3,\mathbb{C}^4) : \check{D}_0\phi + \alpha\cdot A\phi + \check{V}_C^s\phi \in L^2(\mathbb{R}^3,\mathbb{C}^4)\} \tag{3.4}
\]

and, for $\phi\in\mathcal{D}_e$, $D_{A,V}\phi$ is given as a sum of four vectors in $H^{-1/2}$,

\[
D_{A,V}\phi = \check{D}_0\phi + \alpha\cdot A\phi + \check{V}_C^s\phi + (V - V_C^s)\phi .
\]

Moreover, $\mathcal{D}_e = \mathcal{D}(D_{A,V})\cap\mathcal{E}'$, where $\mathcal{E}'$ denotes the dual space of $C^\infty(\mathbb{R}^3,\mathbb{C}^4)$.

(d) For $\chi\in C_0^\infty(\mathbb{R}^3)$ and $\phi\in\mathcal{D}(D_{A,V})$, we have $\chi\phi\in\mathcal{D}_e\subset\mathcal{D}(D_{A,V})$ and

\[
[D_{A,V},\chi]\phi = -i(\alpha\cdot\nabla\chi)\phi + [V_E,\chi]\phi .
\]

In particular, for $z\in\varrho(D_{A,V})$,

\[
[\chi,R_{A,V}(z)] = R_{A,V}(z)[D_{A,V},\chi]R_{A,V}(z) = R_{A,V}(z)\big(-i(\alpha\cdot\nabla\chi) + [V_E,\chi]\big)R_{A,V}(z). \tag{3.5}
\]

(e) If $A$ is bounded, then $\mathcal{D}(D_{A,V})\subset H^{1/2}(\mathbb{R}^3,\mathbb{C}^4)$.

Proof. Since $V_E$ is compact it is clear that all assertions hold true as soon as they hold for $V_E = 0$, which we assume in the following. To prove (a) we write

\[
1_{B_R(0)}(D_{A,V}-i)^{-1} = \big(1_{B_R(0)}|D_0|^{-1/2}\big)\big(|D_0|^{1/2}\chi(D_{A,V}-i)^{-1}\big),
\]

where $\chi\in C_0^\infty(\mathbb{R}^3,[0,1])$ equals 1 in a neighborhood of $B_R(0)$. Then we use that $1_{B_R(0)}|D_0|^{-1/2}$ is compact and that $|D_0|^{1/2}\chi(D_{A,V}-i)^{-1}$ is bounded by Lemma 3.1 and the closed graph theorem. By standard arguments, we obtain the identity $\sigma_{\mathrm{ess}}(D_{A,V}) = \sigma_{\mathrm{ess}}(D_A)$ from (a) since $V$ drops off to zero at infinity; see, e.g., [49, §4.3.4]. The inclusion $\sigma(D_A)\subset(-\infty,-1]\cup[1,\infty)$ follows from supersymmetry arguments; see, e.g., [49, §5.6]. The assertions in (c) follow from [8, §2], (d) follows from [8, Lemma G], and (e) from [41].

Next, we recall the useful resolvent identity (3.6) (see, e.g., [18, 53]) which is used very often in the sequel. It should be regarded as a substitute for the second resolvent identity, which is typically not applicable when comparing two different Dirac operators in this paper. For, in general, the domain of one of these Dirac operators is not included in the domain of the other. The vector potential $\tilde{A}$ in Eq. (3.6) below could for instance be the gradient of some gauge potential or just be
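equal to zero. We recall another well-known resolvent identity [41] at the beginning of Sec. 3.5.

Lemma 3.3. Assume that $V$ fulfills Hypothesis 2.2, and that $A,\tilde{A}\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$. Let $\tilde{V}$ be either $V_C^s$ (given by (3.1)) or 0, let $z\in\varrho(D_{\tilde{A},\tilde{V}})\cap\varrho(D_{A,V})$ and $\chi\in C^\infty(\mathbb{R}^3,\mathbb{R})$ be constant outside some ball in $\mathbb{R}^3$, and assume that $(V_C-\tilde{V})\chi$ and $\alpha\cdot(A-\tilde{A})\chi$ are bounded. Then

\[
\chi R_{A,V}(z) = \chi R_{\tilde{A},\tilde{V}}(z) + R_{\tilde{A},\tilde{V}}(z)\,i\alpha\cdot(\nabla\chi)\big(R_{\tilde{A},\tilde{V}}(z) - R_{A,V}(z)\big) - R_{\tilde{A},\tilde{V}}(z)\,\chi\big(V-\tilde{V}+\alpha\cdot(A-\tilde{A})\big)R_{A,V}(z). \tag{3.6}
\]

Proof. Let $\phi\in\widetilde{\mathcal{D}}_e := \{\psi\in H_c^{1/2} \,|\, \check{D}_0\psi + \alpha\cdot\tilde{A}\psi + \tilde{V}\psi\in L^2\}$. Since $\chi$ can be written as $\chi = c+\vartheta$, for some $c\in\mathbb{R}$ and $\vartheta\in C_0^\infty(\mathbb{R}^3,\mathbb{R})$, Lemmas 3.2(c) and (d) imply that $\chi\phi\in\widetilde{\mathcal{D}}_e$. By the definition of $\mathcal{D}_e$ in (3.4) and the assumptions on $\chi$ it further follows that $\chi\phi\in\mathcal{D}_e\subset\mathcal{D}(D_{A,V})$ and

\[
D_{\tilde{A},\tilde{V}}\chi\phi = D_{A,V}\chi\phi + \{-V+\tilde{V}-\alpha\cdot(A-\tilde{A})\}\chi\phi .
\]

Therefore, we obtain

\begin{align*}
\big(R_{\tilde{A},\tilde{V}}(z) - R_{A,V}(z)\big)\chi(D_{\tilde{A},\tilde{V}}-z)\phi
&= \big(R_{\tilde{A},\tilde{V}}(z) - R_{A,V}(z)\big)\big((D_{\tilde{A},\tilde{V}}-z)\chi + i\alpha\cdot(\nabla\chi)\big)\phi \\
&= \chi\phi - R_{A,V}(z)\big(D_{A,V}-z-V+\tilde{V}-\alpha\cdot(A-\tilde{A})\big)\chi\phi + \big(R_{\tilde{A},\tilde{V}}(z) - R_{A,V}(z)\big)i\alpha\cdot(\nabla\chi)\phi \\
&= R_{A,V}(z)\big(V-\tilde{V}+\alpha\cdot(A-\tilde{A})\big)\chi R_{\tilde{A},\tilde{V}}(z)(D_{\tilde{A},\tilde{V}}-z)\phi \\
&\quad + \big(R_{\tilde{A},\tilde{V}}(z) - R_{A,V}(z)\big)i\alpha\cdot(\nabla\chi)R_{\tilde{A},\tilde{V}}(z)(D_{\tilde{A},\tilde{V}}-z)\phi .
\end{align*}

As $D_{\tilde{A},\tilde{V}}$ is essentially self-adjoint on $\widetilde{\mathcal{D}}_e$, we know that $(D_{\tilde{A},\tilde{V}}-z)\widetilde{\mathcal{D}}_e$ is dense, which together with the calculation above implies

\[
\big(R_{\tilde{A},\tilde{V}}(z) - R_{A,V}(z)\big)\chi = \big(R_{\tilde{A},\tilde{V}}(z) - R_{A,V}(z)\big)i\alpha\cdot(\nabla\chi)R_{\tilde{A},\tilde{V}}(z) + R_{A,V}(z)\big(V-\tilde{V}+\alpha\cdot(A-\tilde{A})\big)\chi R_{\tilde{A},\tilde{V}}(z). \tag{3.7}
\]

Taking the adjoint of (3.7) (with $z$ replaced by $\bar{z}$) we obtain (3.6).

3.2. Conjugation of $R_{A,V}(z)$ with exponential weights

As a preparation for the localization estimates for the spectral projections, we shall now study the conjugation of $R_{A,V}(z)$ with exponential weight functions $e^F$ acting as multiplication operators on $\mathcal{H}$. To this end we recall that $e_0\in(-1,1)$ is an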
element of the resolvent set of $D_{A,V}$ and set

\[
\delta_0 := \inf\{|e_0-\lambda| : \lambda\in\sigma(D_{A,V})\} > 0, \tag{3.8}
\]

\[
\varrho_0 := \min\{1-e_0,\,e_0+1\}, \qquad \Gamma := e_0 + i\mathbb{R}. \tag{3.9}
\]

Notice that the decay rate in the following lemma is determined only by the decay rate $m$ appearing in Hypothesis 2.2 and the number $\triangle_0$ defined in (2.11). In the next proof and henceforth we shall often use the abbreviations

\[
D_A := D_{A,0}, \qquad R_A(z) := (D_A-z)^{-1}, \qquad z\in\varrho(D_A). \tag{3.10}
\]

We remark that, for $V=0$, the bound (3.11) below follows from a well-known computation (see, e.g., [7]) which is recalled in the next proof. For non-vanishing, singular potentials $V$ a bound on the operator norm of the conjugated resolvent seems to be less well known and the Neumann-type argument we use to prove it might be a new observation.

Lemma 3.4. Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and that $V$ fulfills Hypothesis 2.2. Let $0<a<\min\{\triangle_0,m\}$. Then there is some $C_a\in(0,\infty)$ such that, for all $F\in C^\infty(\mathbb{R}^3,\mathbb{R})$ with $F(0)=0$, $F\ge 0$ or $F\le 0$, $\|\nabla F\|_\infty\le a$, and all $z=e_0+i\eta\in\Gamma$,

\[
\|e^F R_{A,V}(z)e^{-F}\| \le \frac{C_a}{\sqrt{1+\eta^2}} . \tag{3.11}
\]

Proof. First, we assume that $F$ is constant outside some ball in $\mathbb{R}^3$. Then it suffices to treat the case $F\ge 0$, since otherwise we could consider the adjoint of $e^F R_{A,V}(z)e^{-F}$. Since $F$ is smooth and constant outside some compact set a straightforward calculation (see [7]) using Lemma 3.2(d) yields, for $z\in\mathbb{C}$ and $\varphi\in\mathcal{D}(D_A)\cap\mathcal{E}'$,

\begin{align*}
&\frac{1}{4\varepsilon}\|e^F(D_A-z)e^{-F}\varphi\|^2 + 3\varepsilon\|\alpha\cdot(-i\nabla+A)\varphi\|^2 + 3\varepsilon(1+|z|^2)\|\varphi\|^2 + 3\varepsilon\langle\varphi\,|\,|\nabla F|^2\varphi\rangle \\
&\qquad\ge \Re\langle e^{-F}(D_A+\bar{z})e^F\varphi\,|\,e^F(D_A-z)e^{-F}\varphi\rangle \\
&\qquad= \|\alpha\cdot(-i\nabla+A)\varphi\|^2 + \langle\varphi\,|\,(1-\Re z^2-|\nabla F|^2)\varphi\rangle .
\end{align*}

This and the assumption $|\nabla F|\le a$ permit us to get, for $z=e_0+i\eta\in\Gamma$, that is, $\Re z^2 = e_0^2-\eta^2$, and for every $0<\varepsilon<(1-e_0^2-a^2)/9 = (\triangle_0^2-a^2)/9$,

\[
\|e^F R_A(z)e^{-F}\| \le \frac{1}{\sqrt{4\varepsilon}\,\sqrt{1-e_0^2-a^2-9\varepsilon+\eta^2/2}} \le \frac{C_{a,\varepsilon}}{\sqrt{1+\eta^2}} . \tag{3.12}
\]
We choose $\varepsilon := (1-e_0^2-a^2)/10$ in what follows. Next, we pick some $R>\max\{|y| : y\in\mathcal{Y}\}$ and $\chi\in C^\infty(\mathbb{R}^3,[0,1])$ such that $\chi\equiv 0$ on $B_R(0)$, $\chi\equiv 1$ on $\mathbb{R}^3\setminus B_{R+2}(0)$, and $\|\nabla\chi\|_\infty\le 1$. We set $\bar\chi := 1-\chi$. Furthermore, we let $\tilde\chi$ denote the characteristic function of $\mathbb{R}^3\setminus B_R(0)$. We choose $R$ so large (depending on $a$, but not on $F$; recall (2.8)) that

\[
\sup_{|x|\ge R}\|V_C(x)\| + \sup_{|x|\ge R}\|V_H(x)\| + \|\tilde\chi e^F V_E e^{-F}\tilde\chi\| \le \frac{1}{2C_{a,\varepsilon}} . \tag{3.13}
\]

Conjugating (3.6) with exponential weights and rearranging the terms we find, for $z\in\Gamma$,

\begin{align*}
&\big\{1 + e^F R_A(z)e^{-F}\big(\tilde\chi V_C + \tilde\chi V_H + \chi e^F V_E e^{-F}\tilde\chi\big)\big\}\,\chi e^F R_{A,V}(z)e^{-F} \\
&\qquad= \chi e^F R_A(z)e^{-F} - \big(e^F R_A(z)e^{-F}\big)\big(e^F i\alpha\cdot\nabla\chi\big)\big(R_{A,V}(z)-R_A(z)\big)e^{-F} \\
&\qquad\quad - \big(e^F R_A(z)e^{-F}\big)\big(\chi e^F V_E e^{-F}\bar\chi\big)\big(1_{B_{R+2}(0)}e^F\big)R_{A,V}(z)e^{-F} .
\end{align*}

Here the operator $\{\cdots\}$ on the left side can be inverted by means of a Neumann series and $\|\{\cdots\}^{-1}\|\le 2$ by (3.12) and (3.13). Furthermore, we recall the identity

\[
\|\alpha\cdot v\|_{\mathcal{L}(\mathbb{C}^4)} = |v|, \qquad v\in\mathbb{R}^3, \tag{3.14}
\]

which follows from the Clifford algebra relations (2.1), and observe that, by the choice of $\chi$, the assumption on $F$, and (3.14),

\[
\|e^F i\alpha\cdot\nabla\chi\| \le e^{a(R+2)}, \qquad \|e^{-F}\|\le 1 .
\]

Moreover, we have, for $z=e_0+i\eta\in\Gamma$,

\[
\|R_A(z)\| \le \frac{1}{\sqrt{\varrho_0^2+\eta^2}}, \qquad \|R_{A,V}(z)\| \le \frac{1}{\sqrt{\delta_0^2+\eta^2}} . \tag{3.15}
\]

Using these remarks together with (2.7) and (3.12), we obtain

\[
\|\chi e^F R_{A,V}(z)e^{-F}\| \le \frac{C_a'\,e^{R+2}}{\sqrt{1+\eta^2}}, \qquad z=e_0+i\eta\in\Gamma .
\]

This estimate implies the assertion if $F$ is constant outside some ball since, certainly, $\|\bar\chi e^F R_{A,V}(z)e^{-F}\| \le e^{a(R+2)}(\delta_0^2+\eta^2)^{-1/2}$.

Let us now assume that $F\ge 0$ is not necessarily bounded. Let $F_1,F_2,\ldots\in C^\infty(\mathbb{R}^3,[0,\infty))$ be constant near infinity and such that $F_n = F$ on $B_n(0)$ and $F_n\to F$. Then $e^{-F_n}R_{A,V}(z)e^{F_n}\phi \to e^{-F}R_{A,V}(z)e^{F}\phi$ by the dominated convergence theorem, for every $\phi\in\mathscr{D}$. Since $e^{-F_n}R_{A,V}(z)e^{F_n}$ obeys the estimate (3.11) uniformly in $n$, we see that the densely defined operator $e^{-F}R_{A,V}(z)e^{F}\!\upharpoonright_{\mathscr{D}}$ is bounded and satisfies (3.11), too. But this is the case if and only if its adjoint, $e^{F}R_{A,V}(\bar z)e^{-F} = (e^{-F}R_{A,V}(z)e^{F})^*$, is an element of $\mathcal{L}(\mathcal{H})$ and satisfies (3.11) as well.
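The inversion step in the proof above rests on the elementary Neumann series bound: if $\|K\|\le 1/2$ in a submultiplicative norm, then $1+K$ is invertible and $\|(1+K)^{-1}\|\le\sum_n\|K\|^n\le 2$. The following sketch illustrates this bound on a small matrix. The matrix `K` and the use of the maximal row sum norm are assumptions chosen for the illustration; the argument in the proof is carried out in the operator norm on $\mathcal{H}$.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    """n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def row_sum_norm(a):
    """Operator norm induced by the sup norm: the maximal absolute row sum."""
    return max(sum(abs(x) for x in row) for row in a)


def neumann_inverse(k, terms=60):
    """Approximate (1 + K)^(-1) by the partial sum of (-K)^n for n = 0..terms."""
    n = len(k)
    neg_k = [[-x for x in row] for row in k]
    total = identity(n)
    power = identity(n)
    for _ in range(terms):
        power = mat_mul(power, neg_k)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total


# Hypothetical perturbation with row_sum_norm(K) = 1/2.
K = [
    [0.20, -0.10, 0.10],
    [0.00, 0.30, -0.20],
    [0.25, 0.05, -0.15],
]
```

With this `K`, the product of `1 + K` and `neumann_inverse(K)` agrees with the identity up to rounding, and the norm of the approximate inverse stays below 2, in line with the estimate $\|\{\cdots\}^{-1}\|\le 2$ quoted in the proof.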
In the applications of the previous lemma, the following observation is very useful.

Lemma 3.5. Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and that $V$ fulfills Hypothesis 2.2. Let $0<a<\min\{\triangle_0,m\}$. Then there is some $C_a\in(0,\infty)$ such that, for all $F\in C^\infty(\mathbb{R}^3,\mathbb{R})$ with $F(0)=0$, $F\ge 0$ or $F\le 0$, $\|\nabla F\|_\infty\le a$, which are constant outside some ball in $\mathbb{R}^3$, and for all $\phi\in\mathcal{H}$,

\[
\int_\Gamma \big\||D_{A,V}|^{1/2}e^F R_{A,V}(z)e^{-F}\phi\big\|^2\,|dz| \le C_a\|\phi\|^2, \tag{3.16}
\]

and, for $\phi\in\mathcal{D}(|D_{A,V}|^{1/2})$,

\[
\int_\Gamma \big\|e^F R_{A,V}(z)e^{-F}|D_{A,V}|^{1/2}\phi\big\|^2\,|dz| \le C_a\|\phi\|^2 . \tag{3.17}
\]

Proof. For later reference we additionally pick some $\chi\in C^\infty(\mathbb{R}^3,\mathbb{R})$ which is constant outside some large ball and infer from Lemma 3.2(d) that, for $z\in\Gamma$,

\[
[R_{A,V}(z),\chi e^F] = R_{A,V}(z)\{i\alpha\cdot(\nabla\chi+\chi\nabla F) + [\chi e^F,V_E]e^{-F}\}e^F R_{A,V}(z). \tag{3.18}
\]

The special case $\chi\equiv 1$ implies

\[
e^F R_{A,V}(z)e^{-F} = R_{A,V}(z) - R_{A,V}(z)\{i\alpha\cdot\nabla F + [e^F,V_E]e^{-F}\}e^F R_{A,V}(z)e^{-F}. \tag{3.19}
\]

Taking the adjoint and replacing $F$ by $-F$ and $\bar z$ by $z$ we also get

\[
e^F R_{A,V}(z)e^{-F} = R_{A,V}(z) - e^F R_{A,V}(z)e^{-F}\{i\alpha\cdot\nabla F + e^F[V_E,e^{-F}]\}R_{A,V}(z). \tag{3.20}
\]

Now, let $T$ be a self-adjoint operator on some Hilbert space, $\mathcal{K}$, such that $(-\delta_0,\delta_0)\subset\varrho(T)$. Then, for $\phi\in\mathcal{K}$,

\[
\int_{\mathbb{R}} \big\||T|^{1/2}(T-i\eta)^{-1}\phi\big\|^2\,d\eta = \int_{\mathbb{R}}\int_{\mathbb{R}}\frac{|\lambda|}{\lambda^2+\eta^2}\,d\eta\;d\|E_\lambda(T)\phi\|^2 = \pi\|\phi\|^2, \tag{3.21}
\]

and it is elementary to check that, for $\eta\in\mathbb{R}$,

\[
\big\||T|^{1/2}(T-i\eta)^{-1}\big\| \le \frac{\delta_0^{1/2}}{\sqrt{\delta_0^2+\eta^2}}\,1_{(-\delta_0,\delta_0)}(\eta) + \frac{1_{(-\delta_0,\delta_0)^c}(\eta)}{\sqrt{2|\eta|}} . \tag{3.22}
\]

Using (3.21) and (3.22) with $T = D_{A,V}-e_0$ and taking (2.8), (3.11), (3.14), and (3.15) into account, we readily derive the asserted estimate (3.16) from (3.19). The second estimate (3.17) is obtained analogously by means of (3.20).
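Both scalar facts behind (3.21) and (3.22) can be checked numerically. For fixed $\lambda\ne 0$ one has $\int_{-R}^{R}|\lambda|/(\lambda^2+\eta^2)\,d\eta = 2\arctan(R/|\lambda|)\to\pi$, and the function $\lambda\mapsto\lambda^{1/2}/\sqrt{\lambda^2+\eta^2}$ is maximal at $\lambda=|\eta|$ with value $(2|\eta|)^{-1/2}$ and decreasing beyond. The sketch below is only an illustration of these two statements; the quadrature and grid parameters are arbitrary choices.

```python
import math


def truncated_integral(lam, cutoff, steps=100_000):
    """Midpoint rule for the integral of |lam| / (lam^2 + eta^2) over [-cutoff, cutoff]."""
    h = 2.0 * cutoff / steps
    total = 0.0
    for k in range(steps):
        eta = -cutoff + (k + 0.5) * h
        total += abs(lam) / (lam * lam + eta * eta)
    return total * h


def closed_form(lam, cutoff):
    """Exact value 2 * arctan(cutoff / |lam|) of the truncated integral."""
    return 2.0 * math.atan(cutoff / abs(lam))


def sup_weighted_resolvent(eta, delta0, grid=20_000, lam_max=50.0):
    """Grid approximation of sup over lam in [delta0, lam_max] of lam^(1/2) / sqrt(lam^2 + eta^2)."""
    best = 0.0
    for k in range(grid + 1):
        lam = delta0 + (lam_max - delta0) * k / grid
        best = max(best, math.sqrt(lam) / math.sqrt(lam * lam + eta * eta))
    return best


def bound_3_22(eta, delta0):
    """Right-hand side of (3.22) as a function of eta and the gap parameter delta0."""
    if abs(eta) < delta0:
        return math.sqrt(delta0) / math.sqrt(delta0 ** 2 + eta ** 2)
    return 1.0 / math.sqrt(2.0 * abs(eta))
```

By symmetry in $\lambda$ it suffices to scan $\lambda\ge\delta_0$, which is what `sup_weighted_resolvent` does; its values stay below `bound_3_22` for every tested `eta`.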
3.3. Commutators

In this subsection, we derive the crucial technical prerequisites for the spectral analysis of $H_N$, namely various commutator estimates involving the projection $\Lambda^+_{A,V}$, cut-off functions, and exponential weights $e^F$. Roughly speaking, these estimates allow one to adapt many arguments known from the spectral analysis of partial differential operators that involve partitions of unity and conjugations with exponential weights to our non-local model. Our standard assumptions on the cut-off and weight functions are

\[
\chi\in C^\infty(\mathbb{R}^3,[0,1]) \text{ is constant outside some ball} \tag{3.23}
\]

and

\[
F\in C^\infty(\mathbb{R}^3,\mathbb{R}),\quad F\ge 0 \text{ or } F\le 0,\quad F(0)=0,\quad |\nabla F|\le a,\quad F \text{ is constant outside some ball.} \tag{3.24}
\]

To shorten the presentation, we generalize our estimates to unbounded $F$ only if this is explicitly used in this article.

Proposition 3.1. Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and that $V$ fulfills Hypothesis 2.2 and let $0<a_0<\min\{\triangle_0,m\}$. Then there is some constant $C_{a_0}\in(0,\infty)$ such that, for all $a\in[0,a_0]$ and $\chi$, $F$ satisfying (3.23) and (3.24),

\[
\big\||D_{A,V}|^{1/2}[\Lambda^+_{A,V},\chi e^F]e^{-F}|D_{A,V}|^{1/2}\big\| \le C_{a_0}(\|\nabla\chi\|_\infty + a). \tag{3.25}
\]
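Proof. We shall employ the identity

\[
[\Lambda^+_{A,V},\chi e^F] = \tfrac{1}{2}[\mathrm{sgn}(D_{A,V}-e_0),\chi e^F] \tag{3.26}
\]

and the representation of the sign function as a Cauchy principal value (see, e.g., [31, p. 359]),

\[
\mathrm{sgn}(D_{A,V}-e_0)\psi = \int_\Gamma R_{A,V}(z)\psi\,\frac{dz}{\pi} := \lim_{R\to\infty}\int_{-R}^{R} R_{A,V}(e_0+i\eta)\psi\,\frac{d\eta}{\pi}, \tag{3.27}
\]

for $\psi\in\mathcal{H}$, where $\Gamma$ is defined in (3.9). Taking also (2.6), (2.7), and (3.18) into account we obtain

\begin{align*}
&\big|\big\langle |D_{A,V}|^{1/2}\phi\,\big|\,[\Lambda^+_{A,V},\chi e^F]e^{-F}|D_{A,V}|^{1/2}\psi\big\rangle\big| \\
&\quad\le \int_\Gamma \big\||D_{A,V}|^{1/2}R_{A,V}(\bar z)\phi\big\|\,\big\|i\alpha\cdot(\nabla\chi+\chi\nabla F)+[\chi e^F,V_E]e^{-F}\big\|\cdot\big\|e^F R_{A,V}(z)e^{-F}|D_{A,V}|^{1/2}\psi\big\|\,\frac{|dz|}{2\pi} \\
&\quad\le C'_{a_0}\big(\|\nabla\chi\|_\infty + \|\chi\nabla F\|_\infty + a\big)\Big(\int_\Gamma \big\||D_{A,V}|^{1/2}R_{A,V}(z)\phi\big\|^2\frac{|dz|}{2\pi}\Big)^{1/2}\Big(\int_\Gamma \big\|e^F R_{A,V}(z)e^{-F}|D_{A,V}|^{1/2}\psi\big\|^2\frac{|dz|}{2\pi}\Big)^{1/2}, \tag{3.28}
\end{align*}

for $\phi,\psi\in\mathcal{D}(|D_A|^{1/2})\supset\mathrm{Ran}(R_A(z))$. By virtue of (3.16) and (3.17), we first infer that

\[
[\Lambda^+_{A,V},\chi e^F]e^{-F}|D_{A,V}|^{1/2}\psi \in \mathcal{D}\big((|D_{A,V}|^{1/2})^*\big) = \mathcal{D}(|D_{A,V}|^{1/2}).
\]

We conclude by recalling that an operator $T:\mathcal{D}(T)\to\mathcal{K}$ on some Hilbert space $\mathcal{K}$ is bounded if and only if

\[
\sup\{|\langle\phi\,|\,T\psi\rangle| : \phi\in X,\ \psi\in\mathcal{D}(T),\ \|\phi\|=\|\psi\|=1\} \tag{3.29}
\]

is finite, in which case it is equal to the norm of $T$. Here $X\subset\mathcal{K}$ is a subspace with $\overline{X}\supset\mathrm{Ran}(T)$.

Given some suitable weight function, $F$, we abbreviate

\[
\Lambda^F_{A,V} := e^F\Lambda^+_{A,V}e^{-F}. \tag{3.30}
\]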
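The principal value representation (3.27) used in the proof of Proposition 3.1 reduces, via the spectral theorem, to a scalar identity: for real $x\neq 0$, $\frac{1}{\pi}\int_{-R}^{R}\frac{d\eta}{x-i\eta} = \frac{2}{\pi}\arctan(R/x)\to\mathrm{sgn}(x)$ as $R\to\infty$, where $x$ plays the role of $\lambda-e_0$ for $\lambda$ in the spectrum. The imaginary parts cancel by symmetry, which is why the symmetric cut-off matters. A small numerical sketch of this scalar identity follows; the quadrature parameters are arbitrary choices made for the illustration.

```python
import math


def sign_via_resolvent(x, cutoff, steps=200_000):
    """Midpoint rule for (1/pi) * integral over [-cutoff, cutoff] of d(eta) / (x - i*eta)."""
    h = 2.0 * cutoff / steps
    total = 0j
    for k in range(steps):
        eta = -cutoff + (k + 0.5) * h
        total += 1.0 / complex(x, -eta)
    return total * h / math.pi


def truncated_sign(x, cutoff):
    """Closed form (2/pi) * arctan(cutoff / x) of the symmetric truncated integral."""
    return 2.0 / math.pi * math.atan(cutoff / x)
```

For instance, `sign_via_resolvent(0.5, 1000.0)` has real part close to `truncated_sign(0.5, 1000.0)`, which differs from 1 only by a term of order $x/R$, and a negligible imaginary part; negative `x` gives values close to $-1$.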
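Corollary 3.1. Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and that $V$ fulfills Hypothesis 2.2. Let $0<a<\min\{\triangle_0,m\}$. Then there is some $C(a)\in(0,\infty)$ such that, for all $F\in C^\infty(\mathbb{R}^3,\mathbb{R})$ satisfying $F(0)=0$, $F\ge 0$ or $F\le 0$, and $\|\nabla F\|_\infty\le a$, we have $\Lambda^F_{A,V}\in\mathcal{L}(\mathcal{H})$ and $\|\Lambda^F_{A,V}\|\le C(a)$.

Proof. First, we assume that $F$ satisfies (3.24). In this case the claim follows from Proposition 3.1 because $[e^F,\Lambda^+_{A,V}]e^{-F} = \Lambda^F_{A,V}-\Lambda^+_{A,V}$. If $F$ is unbounded, then we apply an approximation argument similar to the one at the end of the proof of Lemma 3.4.

Corollary 3.2. Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and that $V$ fulfills Hypothesis 2.2 and let $0<a_0<\min\{\triangle_0,m\}$. Then there is some constant $C\in(0,\infty)$ such that, for all $a\in[0,a_0]$, $\chi$, $F$ satisfying (3.23), (3.24), and $\|\nabla\chi\|_\infty\le 1$, $L\in\mathcal{L}(\mathcal{H})$, and $\varphi\in\mathcal{H}$,

\[
\big|\langle\varphi\,|\,\Lambda^F_{A,V}\chi L\chi\Lambda^F_{A,V}\varphi\rangle - \langle\varphi\,|\,\chi\Lambda^+_{A,V}L\Lambda^+_{A,V}\chi\varphi\rangle\big| \le (a+\|\nabla\chi\|_\infty)\,C\,\|L\|\,\|\varphi\|^2. \tag{3.31}
\]

Moreover, for all $\varphi\in\mathcal{D}(D_{A,V})$,

\begin{align*}
&\big|\langle\varphi\,|\,\Lambda^F_{A,V}\chi D_{A,V_C}\chi\Lambda^F_{A,V}\varphi\rangle - \langle\varphi\,|\,\chi\Lambda^+_{A,V}D_{A,V_C}\Lambda^+_{A,V}\chi\varphi\rangle\big| \\
&\qquad\le (a+\|\nabla\chi\|_\infty)\inf_{0<\varepsilon\le 1}\big\{\varepsilon\langle\varphi\,|\,\chi\Lambda^+_{A,V}D_{A,V_C}\Lambda^+_{A,V}\chi\varphi\rangle + C\varepsilon^{-1}\|\varphi\|^2\big\}. \tag{3.32}
\end{align*}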
If $V_C = V = 0$, then (3.32) still holds true, if $\chi D_A\chi$ is replaced by $\chi|D_A|\chi$ on the left side.

Proof. The proof of (3.31) is a rather obvious application of Proposition 3.1 and in fact a simpler analogue of the derivation of (3.32) below. Once (3.31) is verified, it suffices to prove (3.32) with $D_{A,V_C}$ replaced by $D_{A,V}$ since $V_H$ and $V_E$ are bounded. Without loss of generality we may further assume that $D_{A,V}$ is positive on the range of $\Lambda^+_{A,V}$. For otherwise we could add a suitable constant to $D_{A,V}$. To prove (3.32) we first recall that $\Lambda^+_{A,V}$ maps the domain of $D_{A,V}$ into itself and by Lemma 3.2(d) we know that multiplication with $\chi$ or $e^{\pm F}$ leaves $\mathcal{D}(D_{A,V})$ invariant, too. We thus have the following identity on $\mathcal{D}(D_{A,V})$,

\begin{align*}
\Lambda^F_{A,V}\chi D_{A,V}\chi\Lambda^F_{A,V} - \chi\Lambda^+_{A,V}D_{A,V}\Lambda^+_{A,V}\chi
&= e^F[\Lambda^+_{A,V},\chi e^{-F}]D_{A,V}\Lambda^+_{A,V}\chi + \chi\Lambda^+_{A,V}D_{A,V}[\chi e^F,\Lambda^+_{A,V}]e^{-F} \\
&\quad + e^F[\Lambda^+_{A,V},\chi e^{-F}]D_{A,V}[\chi e^F,\Lambda^+_{A,V}]e^{-F}.
\end{align*}

It follows that the absolute value on the left-hand side of (3.32) is less than or equal to

\[
\sum_{\sharp=\pm}\big\||D_{A,V}|^{1/2}\Lambda^+_{A,V}\chi\varphi\big\|\,\big\||D_{A,V}|^{1/2}[\Lambda^+_{A,V},\chi e^{\sharp F}]e^{-\sharp F}\varphi\big\| + \sum_{\sharp=\pm}\big\||D_{A,V}|^{1/2}[\Lambda^+_{A,V},\chi e^{\sharp F}]e^{-\sharp F}\varphi\big\|^2,
\]

which together with Proposition 3.1 implies (3.32). The last statement of the corollary is valid since the argument above works equally well with $|D_{A,V}|$ in place of $D_{A,V}$ because $\Lambda^+_{A,V}|D_{A,V}| = \Lambda^+_{A,V}D_{A,V}$.

In order to carry out explicit computations it is important to know that functions in the image set $\Lambda^+_{A,V}\mathscr{D}$ still have a certain regularity. This is ensured by the following lemma. As a first consequence we shall see in the corollary below that $H_N$ is actually well-defined on $\mathscr{D}_N$.

Lemma 3.6. Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ satisfies $\|Ae^{-\tau_0|x|}\|_\infty<\infty$, for some $0\le\tau_0<\min\{m,\triangle_0\}$, and that $V$ fulfills Hypothesis 2.2. Then $\Lambda^+_{A,V}\phi\in H^{1/2}(\mathbb{R}^3,\mathbb{C}^4)$, for every $\phi\in\mathscr{D}$.
Proof. Let $\phi\in\mathscr{D}$. We pick some $\chi\in C_0^\infty(\mathbb{R}^3,[0,1])$ with $\chi\equiv 1$ on $\mathrm{supp}(\phi)$. Furthermore, we pick $\zeta\in C_0^\infty(\mathbb{R}^3,[0,1])$ such that $\zeta\equiv 1$ on $\mathrm{supp}(\chi)\cup B_R(0)$, where $R>\max\{|y| : y\in\mathcal{Y}\}$. We set $\bar\zeta := 1-\zeta$. Since $\mathcal{D}(D_{A,V})\subset H^{1/2}_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{C}^4)$ and the spectral projection $\Lambda^+_{A,V}$ maps the domain of $D_{A,V}$ into itself it follows that $\zeta\Lambda^+_{A,V}\phi\in H^{1/2}(\mathbb{R}^3,\mathbb{C}^4)$. Furthermore, we pick a (smooth, locally finite) partition of unity on $\mathbb{R}^3$, $\{J_\nu\}_{\nu\in\mathbb{N}}$, $\sum_{\nu=1}^\infty J_\nu = 1$, such that $\sum_{\nu=1}^\infty|\nabla J_\nu|\le C$, for some constant $C\in(0,\infty)$. Setting $\zeta_\nu := J_\nu\bar\zeta$, $\nu\in\mathbb{N}$, $\tilde\phi := \Lambda^+_{A,V}(D_{A,V}-i)\phi$, and using (3.6), we obtain

\begin{align*}
\bar\zeta\Lambda^+_{A,V}\phi &= \sum_{\nu=1}^\infty \zeta_\nu R_{A,V}(i)\Lambda^+_{A,V}(D_{A,V}-i)\phi \\
&= \bar\zeta R_0(i)\tilde\phi - \sum_{\nu=1}^\infty R_0(i)\,i\alpha\cdot(\nabla\zeta_\nu)\big(R_{A,V}(i)-R_0(i)\big)\tilde\phi \tag{3.33} \\
&\quad - \sum_{\nu=1}^\infty R_0(i)\zeta_\nu(V+\alpha\cdot A)R_{A,V}(i)\tilde\phi . \tag{3.34}
\end{align*}

Here the sum in (3.33) commutes with the first resolvent and the strong limit $\sum_{\nu=1}^\infty i\alpha\cdot(\nabla\zeta_\nu)$ defines a bounded operator on $L^2(\mathbb{R}^3,\mathbb{C}^4)$. To treat (3.34) we first use that $\sum_{\nu=1}^\infty\zeta_\nu V = \bar\zeta V$ is bounded. Next, we pick some $F\in C^\infty(\mathbb{R}^3,[0,\infty))$ vanishing on some ball containing 0 and $\mathrm{supp}(\phi)$ and satisfying $F(x) = a|x|-a'$, for $x$ outside some sufficiently large ball with $\tau_0<a<\min\{m,\triangle_0\}$, $a'>0$. Then we write

\[
(\alpha\cdot A)R_{A,V}(i)\tilde\phi = (e^{-F}\alpha\cdot A)\,(e^F R_{A,V}(i)e^{-F})\,(e^F\Lambda^+_{A,V}e^{-F})\,\big(D_{A,V_C+V_H} + i\alpha\cdot\nabla F + e^F V_E e^{-F}\big)\phi .
\]

Using (2.7), Lemma 3.4 and Corollary 3.1 we see that $(\alpha\cdot A)R_{A,V}(i)\tilde\phi$ is an element of $L^2(\mathbb{R}^3,\mathbb{C}^4)$. These remarks imply that $\bar\zeta\Lambda^+_{A,V}\phi$ belongs to $\mathrm{Ran}(R_0(i)) + \mathrm{Ran}(\bar\zeta R_0(i)) = H^1(\mathbb{R}^3,\mathbb{C}^4)$.

We may now conclude that $H_N$ is well-defined on the dense domain $\mathscr{D}_N$ defined in (2.13).

Corollary 3.3. Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ satisfies $\|Ae^{-\tau_0|x|}\|_\infty<\infty$, for some $0\le\tau_0<\min\{m,\triangle_0\}$, and that $V$ fulfills Hypothesis 2.2. Then, for $\Psi\in\mathscr{D}_N$ and $1\le i<j\le N$,

\[
\int_{\mathbb{R}^{3N}}\frac{1}{|x_i-x_j|^2}\,\big|\Lambda^{+,N}_{A,V}\Psi(X)\big|^2\,dX < \infty .
\]

Proof. Let $\phi,\psi\in\mathscr{D}$. Thanks to Lemma 3.6 we know that both $\Lambda^+_{A,V}\phi$ and $\Lambda^+_{A,V}\psi$ belong to $H^{1/2}(\mathbb{R}^3,\mathbb{C}^4)$ and, hence, to $L^3(\mathbb{R}^3,\mathbb{C}^4)$ by the Sobolev inequality for $|\tfrac{1}{i}\nabla|$. An application of the Hardy–Littlewood–Sobolev inequality thus yields

\[
\int_{\mathbb{R}^6}\frac{1}{|x-y|^2}\,|\Lambda^+_{A,V}\phi(x)|^2\,|\Lambda^+_{A,V}\psi(y)|^2\,dx\,dy < \infty .
\]

This estimate clearly implies the full assertion.
In our applications it is important to control commutators that are multiplied with square-roots of the electron-electron interactions $W(x_i,x_j)$. In order to formulate an appropriate estimate we set

\[
W_y(x) := W(x,y) = W(y,x), \qquad x,y\in\mathbb{R}^3, \tag{3.35}
\]

in what follows. The proof of the next proposition is somewhat lengthy and is hence postponed to Sec. 3.5. This is due to the fact that the singularity of $W_y$ may be located anywhere and that we allow for unbounded magnetic fields. We remark that, even in the case $V=0$, a diamagnetic inequality is not very useful in this context since, for unbounded magnetic fields, one cannot compare $|-i\nabla+A|$ with $|D_A|$. We tackle this problem by a procedure that involves a partition of unity, local gauge transformations, and exponential decay estimates which control the correlation between different regions in position space. As a result we obtain a commutator estimate which can be chosen to depend only on the local magnitude of $|B|$ either at the singularity $y$ or on the support of the involved cut-off function. For any function $\chi$ on $\mathbb{R}^3$ we use the notation

\[
\|B\|_{\infty,\chi} := \sup\{|B(x)| : x\in\mathrm{supp}(\chi)\}. \tag{3.36}
\]

Proposition 3.2. Assume that $A\in C^1(\mathbb{R}^3,\mathbb{R}^3)$ and $B=\mathrm{curl}\,A$ satisfies (2.18) and that $V$ fulfills Hypothesis 2.2. Let $0\le a_0<\min\{m,\triangle_0\}$ and $\mathcal{N}\subset\mathbb{R}^3$ be a neighborhood of the set of singularities, $\mathcal{Y}$, of $V_C$. Then there is some constant, $C_{a_0,\mathcal{N}}\in(0,\infty)$, such that, for all $a\in[0,a_0]$, all $\chi$, $F$ satisfying (3.23), (3.24) which are constant on $\mathcal{N}$, and all $y\in\mathbb{R}^3$,

\[
\big\|W_y^{1/2}[\Lambda^+_{A,V},e^F\chi]e^{-F}\big\| \le C_{a_0,\mathcal{N}}\big(1+\min\{|B(y)|,\|B\|_{\infty,\chi}\}\big)(a+\|\nabla\chi\|_\infty). \tag{3.37}
\]

If $V_E=0$, then $\|B\|_{\infty,\chi}$ can be replaced by $\|B\|_{\infty,\chi\nabla F+\nabla\chi}$ in (3.37).

Corollary 3.4. Assume that $A\in C^1(\mathbb{R}^3,\mathbb{R}^3)$ and $B=\mathrm{curl}\,A$ is bounded and that $V$ fulfills Hypothesis 2.2. Then we find, for every $\varepsilon>0$, some constant $C_{a_0,\varepsilon}\in(0,\infty)$ such that, for all $F$ satisfying (3.24), $\varphi\in\mathscr{D}_N$, and $1\le i\le N$,

\begin{align*}
&\big|\langle\varphi\,|\,(1\otimes e^F)\Lambda^{+,N}_{A,V}W_{iN}\Lambda^{+,N}_{A,V}(1\otimes e^{-F})\varphi\rangle - \langle\varphi\,|\,\Lambda^{+,N}_{A,V}W_{iN}\Lambda^{+,N}_{A,V}\varphi\rangle\big| \\
&\qquad\le a\big\{\varepsilon\langle\varphi\,|\,\Lambda^{+,N}_{A,V}W_{iN}\Lambda^{+,N}_{A,V}\varphi\rangle + C_{a_0,\varepsilon}\|\varphi\|^2\big\},
\end{align*}

where $e^F$ acts only on the last variable.

Proof. This corollary is proved by means of Proposition 3.2 in the same way as Corollary 3.2. We also recall that $\Lambda^{+,N}_{A,V}\mathscr{D}_N\subset\mathcal{D}(W_{ij})$.

The technique used in the proof of Proposition 3.2 also yields the following result whose proof can be found in Sec. 3.5, too:
Lemma 3.7. Assume that $A\in C^1(\mathbb{R}^3,\mathbb{R}^3)$ and $B=\mathrm{curl}\,A$ satisfies (2.18) and that $V$ fulfills Hypothesis 2.2. Then there is some constant $C\in(0,\infty)$ such that, for all $\psi\in\mathcal{D}(D_{A,V})$,

\[
\big\|W_y^{1/2}\Lambda^+_{A,V}\psi\big\| \le C\big(1+\min\{|B(y)|,\|B\|_{\infty,\psi}\}\big)\|(D_{A,V}-i)\psi\| . \tag{3.38}
\]

3.4. Differences of projections

In our applications it is eventually necessary to have some control on the difference between $\Lambda^+_{A,V}$ and $\Lambda^+_A := \Lambda^+_{A,0}$.

Lemma 3.8. Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and that $V$ fulfills Hypothesis 2.2. Then there is some $C\in(0,\infty)$ such that, for all $\zeta\in C^\infty(\mathbb{R}^3,[0,1])$ which are constant outside some ball such that $\zeta V_C$ is bounded,

\[
\big\||D_A|^{1/2}\zeta(\Lambda^+_A-\Lambda^+_{A,V})\big\| \le C\big(\|\zeta V\| + \|\nabla\zeta\|_\infty\big).
\]

In particular, $\zeta\Lambda^+_{A,V}\varphi\in\mathcal{D}(|D_A|^{1/2})$, for every $\varphi\in\mathcal{D}(D_A)$.

Proof. Due to (3.27) the norm in the statement (if it exists) is bounded from above by

\[
\sup_{\substack{\phi\in\mathcal{D}(|D_A|^{1/2}),\ \psi\in\mathcal{H} \\ \|\phi\|=\|\psi\|=1}}\ \int_\Gamma \big|\big\langle |D_A|^{1/2}\phi\,\big|\,\zeta\big(R_A(z)-R_{A,V}(z)\big)\psi\big\rangle\big|\,\frac{|dz|}{\pi} .
\]

We next use (3.6), (3.15), and (3.17) to conclude that the asserted bound holds true.

We note the following trivial consequence of the previous lemma: Namely, we pick some $\theta\in C_0^\infty(\mathbb{R}^3,[0,1])$ with $\theta\equiv 1$ on $B_1(0)$ and $\theta\equiv 0$ outside $B_2(0)$, and set $\theta_R(x) := \theta(x/R)$, for $R\ge 1$, $x\in\mathbb{R}^3$. By virtue of Hypothesis 2.2 and Lemma 3.8 we then have, for every $\zeta$ as in the statement of Lemma 3.8,

\[
\big\||D_A|^{1/2}(1-\theta_R)\zeta(\Lambda^+_A-\Lambda^+_{A,V})\big\| \le C\Big(\|(1-\theta_R)V\| + \frac{\|\nabla\theta\|_\infty}{R}\Big) \to 0, \tag{3.39}
\]

as $R$ tends to infinity.

Corollary 3.5. Assume that $A\in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and that $V$ fulfills Hypothesis 2.2. Then there is some $C\in(0,\infty)$ such that, for every $\zeta\in C^\infty(\mathbb{R}^3,\mathbb{R})$, which is constant outside some ball and such that $\zeta V_C^s = 0$ and $\|\zeta V\|+\|\nabla\zeta\|_\infty\le 1$, and
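every $\varphi\in\mathscr{D}$,

\begin{align*}
&\big|\langle\varphi\,|\,\Lambda^+_{A,V}\zeta D_{A,V_C}\zeta\Lambda^+_{A,V}\varphi\rangle - \langle\varphi\,|\,\Lambda^+_A\zeta D_A\zeta\Lambda^+_A\varphi\rangle\big| \\
&\qquad\le \big(\|\zeta V\|+\|\nabla\zeta\|_\infty\big)\inf_{0<\varepsilon\le 1}\Big\{\varepsilon\langle\varphi\,|\,\Lambda^+_A\zeta|D_A|\zeta\Lambda^+_A\varphi\rangle + \frac{C}{\varepsilon}\|\varphi\|^2\Big\}, \tag{3.40}
\end{align*}

and

\begin{align*}
&\big|\langle\varphi\,|\,\zeta\Lambda^+_{A,V}D_{A,V_C}\Lambda^+_{A,V}\zeta\varphi\rangle - \langle\varphi\,|\,\zeta\Lambda^+_A D_A\Lambda^+_A\zeta\varphi\rangle\big| \\
&\qquad\le \big(\|\zeta V\|+\|\nabla\zeta\|_\infty\big)\inf_{0<\varepsilon\le 1}\Big\{\varepsilon\langle\varphi\,|\,\zeta\Lambda^+_A D_A\Lambda^+_A\zeta\varphi\rangle + \frac{C}{\varepsilon}\|\varphi\|^2\Big\}. \tag{3.41}
\end{align*}

The last estimate still holds true (with a new constant $C$), if $D_A$ and $D_{A,V_C}$ are replaced by $D_A-1$ and $D_{A,V_C}-1$, respectively.

Proof. Let $\varphi\in\mathscr{D}$ and let $\theta_R$ be the cut-off function constructed in the paragraph preceding (3.39). On account of Lemma 3.6 we know that $\Lambda^+_{A,V}\varphi\in H^{1/2}$ and, hence, we infer from Lemma 3.2(c) that $\theta_R\zeta\Lambda^+_{A,V}\varphi\in H_c^{1/2}$ belongs to the domain of $D_A$. Applying also the formula appearing in (ii) of Lemma 3.1 and using $\zeta V_C^s=0$ we obtain

\begin{align*}
\langle\varphi\,|\,\Lambda^+_{A,V}\zeta D_{A,V_C^s}\zeta\Lambda^+_{A,V}\varphi\rangle
&= \lim_{R\to\infty}\langle\theta_R\zeta\Lambda^+_{A,V}\varphi\,|\,D_{A,V_C^s}\zeta\Lambda^+_{A,V}\varphi\rangle \\
&= \lim_{R\to\infty}\langle D_A\theta_R\zeta\Lambda^+_{A,V}\varphi\,|\,\zeta\Lambda^+_{A,V}\varphi\rangle .
\end{align*}

Writing $\delta\Lambda^+ := \Lambda^+_{A,V}-\Lambda^+_A$ we further get

\begin{align*}
\langle D_A\theta_R\zeta\Lambda^+_{A,V}\varphi\,|\,\zeta\Lambda^+_{A,V}\varphi\rangle
&= \langle D_A\theta_R\zeta\Lambda^+_A\varphi\,|\,\zeta\Lambda^+_A\varphi\rangle + \langle D_A\theta_R\zeta\Lambda^+_A\varphi\,|\,\zeta\delta\Lambda^+\varphi\rangle \\
&\quad + \langle\theta_R\zeta\delta\Lambda^+\varphi\,|\,D_A\zeta\Lambda^+_A\varphi\rangle + \langle D_A\theta_R\zeta\delta\Lambda^+\varphi\,|\,\zeta\delta\Lambda^+\varphi\rangle .
\end{align*}

By virtue of Lemma 3.8, we know that $\zeta\delta\Lambda^+\varphi\in\mathcal{D}(|D_A|^{1/2})$ and it is easy to see that $D_A\theta_R\zeta\Lambda^+_A\varphi\to D_A\zeta\Lambda^+_A\varphi$, as $R\to\infty$. Using also (3.39) we arrive at

\begin{align*}
&\big|\langle\varphi\,|\,\Lambda^+_{A,V}\zeta D_{A,V_C^s}\zeta\Lambda^+_{A,V}\varphi\rangle - \langle D_A\zeta\Lambda^+_A\varphi\,|\,\zeta\Lambda^+_A\varphi\rangle\big| \\
&\qquad\le 2\big\||D_A|^{1/2}\zeta\Lambda^+_A\varphi\big\|\,\big\||D_A|^{1/2}\zeta\delta\Lambda^+\varphi\big\| + \big\||D_A|^{1/2}\zeta\delta\Lambda^+\varphi\big\|^2 .
\end{align*}

Therefore, we obtain (3.40) by applying Lemma 3.8 once again. (3.41) follows from a straightforward combination of (3.40) and Corollary 3.2. The last statement of Corollary 3.5 follows from (3.41) and Lemma 3.8.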
3.5. Proofs of Proposition 3.2 and Lemma 3.7

First, we recall a useful resolvent identity. We remind the reader that $V_C^s = S|V_C^s|$ denotes the polar decomposition of the potential defined in (3.1) and set

\[
M(z) := |V_C^s|^{1/2}R_0(z)|V_C^s|^{1/2}, \qquad z\in\varrho(D_0). \tag{3.42}
\]

Lemma 3.9 ([32, 41, 42]). Assume that $V_C$ fulfills Hypothesis 2.1 and let $V_C^s$ be given by (3.1). Then there exist $\eta_0>0$ and $\gamma_0\in(\gamma,1)$ such that, for every $\eta\in\mathbb{R}\setminus(-\eta_0,\eta_0)$, we have $\|M(i\eta)\|\le\gamma_0$ and

\[
R_{0,V_C^s}(i\eta) = R_0(i\eta) - R_0(i\eta)|V_C^s|^{1/2}\big(1+SM(i\eta)\big)^{-1}S|V_C^s|^{1/2}R_0(i\eta). \tag{3.43}
\]

Proof. The inequality $\||\cdot|^{-1/2}R_0(i\eta)|\cdot|^{-1/2}\|\le 1$ has been conjectured in [41] and proved in [32]. By means of this inequality and the arguments of [42, pp. 2 and 3 (with $k_s\equiv 1$)] we find some $\gamma_0\in(\gamma,1)$ such that $\|M(i\eta)\|\le\gamma_0$ provided $|\eta|$ is large enough. The resolvent formula (3.43) then follows from [41, Lemma 2.2 and Theorem 2.2].

Proof of Proposition 3.2. We pick some $\zeta\in C_0^\infty(\mathbb{R}^3,[0,1])$ such that $\zeta=1$ in a neighborhood of $\mathcal{Y}$ and $\mathrm{supp}(\zeta)\subset\mathring{\mathcal{N}}$. We set $\bar\zeta := 1-\zeta$. Moreover, we pick some $\rho\in C_0^\infty(\mathbb{R}^3,[0,1])$ with $\rho\equiv 1$ on $B_{1/2}(0)$ and $\mathrm{supp}(\rho)\subset B_1(0)$ and set $\rho_y(x) := \rho(x-y)$, $x\in\mathbb{R}^3$. On account of Proposition 3.1 it suffices to consider

\[
\big\langle\rho_y W_y^{1/2}\phi\,\big|\,[\Lambda^+_{A,V},e^F\chi]e^{-F}\psi\big\rangle = \int_\Gamma\big\langle\rho_y W_y^{1/2}\phi\,\big|\,\zeta\,\mathcal{T}(z)\psi\big\rangle\frac{dz}{2\pi} + \int_\Gamma\big\langle\rho_y W_y^{1/2}\phi\,\big|\,\bar\zeta\,\mathcal{T}(z)\psi\big\rangle\frac{dz}{2\pi}, \tag{3.44}
\]

for $\phi\in H^{1/2}(\mathbb{R}^3,\mathbb{C}^4)$ and $\psi\in\mathcal{H}$, where, by (3.18) and (3.27),

\[
\mathcal{T}(z) := R_{A,V}(z)\,T\,e^F R_{A,V}(z)e^{-F}, \qquad z\in\Gamma, \tag{3.45}
\]

\[
T := i\alpha\cdot(\chi\nabla F+\nabla\chi) + [e^F\chi,V_E]e^{-F} = O(\|\nabla\chi\|_\infty+a). \tag{3.46}
\]

To study the first integral in (3.44) we write, using (3.6),

\[
\zeta R_{A,V}(z) = \zeta R_{0,V_C^s}(z) + R_{0,V_C^s}(z)\,i\alpha\cdot\nabla\zeta\,\big(R_{0,V_C^s}(z)-R_{A,V}(z)\big) - R_{0,V_C^s}(z)\,\zeta\{V-V_C^s+\alpha\cdot A\}R_{A,V}(z),
\]

where $V_C^s = S|V_C^s|$ is defined in (3.1). Since $\mathcal{D}(D_{0,V_C^s})\subset H^{1/2}(\mathbb{R}^3,\mathbb{C}^4)$ due to Lemma 3.2(e) we find $C,C'\in(0,\infty)$ such that, for all $y\in\mathbb{R}^3$ and all $z\in\Gamma$,

\[
\big\|W_y^{1/2}R_{0,V_C^s}(z)\big\| \le C\big\||D_0|^{1/2}R_{0,V_C^s}(z)\big\| \le C' . \tag{3.47}
\]
By definition of $\mathcal{T}(z)$ and $T$ and by (3.11) we thus have

\[
\int_\Gamma\big|\big\langle\rho_y W_y^{1/2}\phi\,\big|\,\zeta\{\mathcal{T}(z) - R_{0,V_C^s}(z)\,T\,e^F R_{A,V}(z)e^{-F}\}\psi\big\rangle\big|\frac{|dz|}{2\pi} \le C_{a,\zeta}\,(a+\|\nabla\chi\|_\infty)\,\|\phi\|\,\|\psi\|\int_{\mathbb{R}}\frac{d\eta}{1+\eta^2} . \tag{3.48}
\]

To treat the remaining part of the first integral in (3.44) we employ the first resolvent formula and (3.43) to obtain, for $\eta\in\mathbb{R}$ with $|\eta|\ge\eta_0$ and $z=e_0+i\eta$ ($\eta_0$ is the parameter appearing in Lemma 3.9),

\[
R_{0,V_C^s}(z) = R_0(i\eta) + e_0 R_{0,V_C^s}(z)R_{0,V_C^s}(i\eta) - R_0(i\eta)|V_C^s|^{1/2}\big(1+SM(i\eta)\big)^{-1}S|V_C^s|^{1/2}|D_0|^{-1/2}|D_0|^{1/2}R_0(i\eta). \tag{3.49}
\]

Here the operator $(1+SM(i\eta))^{-1}$ is uniformly bounded for $|\eta|\ge\eta_0$. Moreover,

\[
\big\|W_y^{1/2}R_0(i\eta)|V_C^s|^{1/2}\big\| \le C, \tag{3.50}
\]

uniformly in $y\in\mathbb{R}^3$ and $\eta\in\mathbb{R}$, which is a simple consequence of Kato's inequality. In view of (3.48), (3.49) and (3.22) it is therefore clear that it suffices to discuss the contribution coming from the bare resolvent $R_0(i\eta)$ in (3.49). On account of (3.11), (3.21) and (3.46) we find by means of the Cauchy–Schwarz inequality,

\begin{align*}
\int_{|\eta|\ge\eta_0}\big|\big\langle R_0(-i\eta)W_y^{1/2}\rho_y\zeta\phi\,\big|\,T\,e^F R_{A,V}(e_0+i\eta)e^{-F}\psi\big\rangle\big|\frac{d\eta}{2\pi}
&\le \|T\|\,\big\|(|D_0|^{-1/2}W_y^{1/2})\rho_y\zeta\phi\big\|\,\|\psi\| \\
&= O(a+\|\nabla\chi\|_\infty)\,\|\phi\|\,\|\psi\| .
\end{align*}

Since the remaining part of the integral over $\{|\eta|<\eta_0\}$ does not pose any further problem, we see altogether that the first integral in (3.44) is absolutely convergent and of order $O(\|\nabla\chi\|_\infty+a)$.

Next, we treat the second integral on the right-hand side of (3.44). Since we eventually have to change the gauge locally, we pick a (smooth) partition of unity, $\{J_\nu\}_{\nu\in\mathbb{Z}^3}$, on $\mathbb{R}^3$ such that $\mathrm{supp}(J_\nu)\subset B_1(\nu)$, for every $\nu\in\mathbb{Z}^3$. We can certainly
assume that $\sum_{\nu\in\mathbb{Z}^3}|\nabla J_\nu|\le C$, for some $C\in(0,\infty)$. We set

\[
\mathcal{G}(\chi) := \{\nu\in\mathbb{Z}^3 : J_\nu\chi\ne 0\}, \qquad \mu := \sum_{\nu\in\mathcal{G}(\chi)}J_\nu, \qquad \bar\mu := 1-\mu,
\]

so that $\mu\chi=\chi$, $\bar\mu\chi=0$, and re-write the operator defined in (3.46) as

\begin{align*}
T &= i\alpha\cdot(\chi\nabla F+\nabla\chi) + \chi[e^F,V_E]e^{-F} + [\chi,V_E] \\
&= \mu\{i\alpha\cdot(\chi\nabla F+\nabla\chi) + \chi[e^F,V_E]e^{-F} + [\chi,V_E]\} - \{\bar\mu V_E\chi\}\mu \\
&=: \mu U_1 - U_2\mu = \sum_{\nu\in\mathcal{G}(\chi)}(J_\nu U_1 - U_2 J_\nu).
\end{align*}

For every $\nu\in\mathbb{Z}^3$, there is some gauge potential, $g_\nu\in C^2(\mathbb{R}^3,\mathbb{R})$, such that $A-\nabla g_\nu = A_\nu$, where $A_\nu$ is defined by

\[
A_\nu(x) := \int_0^1 B(\nu+t(x-\nu))\wedge t(x-\nu)\,dt, \qquad x\in\mathbb{R}^3. \tag{3.51}
\]
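The potential in (3.51) is the Poincaré (radial) gauge centred at $\nu$: for a divergence-free field $B$ it satisfies $\mathrm{curl}\,A_\nu = B$, it vanishes at $x=\nu$, and it is bounded by $\tfrac{1}{2}|x-\nu|$ times the supremum of $|B|$ on the segment from $\nu$ to $x$, which is what makes it useful for local estimates. The sketch below evaluates (3.51) by Simpson's rule and checks $\mathrm{curl}\,A_\nu = B$ by central differences, reading $\wedge$ as the cross product in $\mathbb{R}^3$. The sample field and all function names are assumptions made for this illustration only.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def poincare_gauge(b_field, nu, x, steps=200):
    """Simpson approximation of A_nu(x) = int_0^1 B(nu + t(x - nu)) x t(x - nu) dt.

    `steps` must be even.
    """
    d = tuple(xi - ni for xi, ni in zip(x, nu))

    def integrand(t):
        point = tuple(ni + t * di for ni, di in zip(nu, d))
        return cross(b_field(point), tuple(t * di for di in d))

    h = 1.0 / steps
    total = [0.0, 0.0, 0.0]
    for k in range(steps + 1):
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        value = integrand(k * h)
        for i in range(3):
            total[i] += weight * value[i]
    return tuple(h / 3.0 * s for s in total)


def curl(vector_field, x, h=1e-4):
    """Central-difference approximation of the curl of vector_field at x."""
    def partial(component, direction):
        plus, minus = list(x), list(x)
        plus[direction] += h
        minus[direction] -= h
        return (vector_field(tuple(plus))[component]
                - vector_field(tuple(minus))[component]) / (2.0 * h)

    return (partial(2, 1) - partial(1, 2),
            partial(0, 2) - partial(2, 0),
            partial(1, 0) - partial(0, 1))


def sample_field(p):
    """Hypothetical divergence-free magnetic field B(x) = (0, 0, 1 + x_1)."""
    return (0.0, 0.0, 1.0 + p[0])
```

For a constant field the formula reduces to the symmetric gauge $A_\nu(x) = \tfrac{1}{2}B\wedge(x-\nu)$, which gives a quick sanity check of the orientation of the wedge product.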
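By virtue of (3.6) we obtain with $\tilde{A}_\nu := \nabla g_\nu$,

\begin{align*}
\rho_y\bar\zeta R_{A,V}(z)\,T &= \sum_{\nu\in\mathcal{G}(\chi)}\rho_y\bar\zeta R_{A,V}(z)(J_\nu U_1-U_2J_\nu) \\
&= \sum_{\nu\in\mathcal{G}(\chi)}\rho_y\bar\zeta R_{\tilde{A}_\nu}(z)(J_\nu U_1-U_2J_\nu) \\
&\quad + \sum_{\nu\in\mathcal{G}(\chi)}R_{\tilde{A}_\nu}(z)\,i\alpha\cdot\nabla(\rho_y\bar\zeta)\big(R_{\tilde{A}_\nu}(z)-R_{A,V}(z)\big)(J_\nu U_1-U_2J_\nu) \\
&\quad - \sum_{\nu\in\mathcal{G}(\chi)}R_{\tilde{A}_\nu}(z)\,\rho_y\bar\zeta\,(V+\alpha\cdot A_\nu)R_{A,V}(z)(J_\nu U_1-U_2J_\nu) \\
&=: S_1+S_2-S_3 . \tag{3.52}
\end{align*}

The proof of Proposition 3.2 is finished as soon as we have shown Lemma 3.10 below.

Lemma 3.10. In the situation above there exists a $\chi$- and $F$-independent constant, $C_{a_0}\in(0,\infty)$, such that, for $j=1,2,3$,

\[
I_j := \int_\Gamma\big|\big\langle W_y^{1/2}\phi\,\big|\,S_j\,e^F R_{A,V}(z)e^{-F}\psi\big\rangle\big|\,|dz| \le C_{a_0}\big(1+\min\{|B(y)|,\|B\|_{\infty,\chi}\}\big)(\|\nabla\chi\|_\infty+a)\,\|\phi\|\,\|\psi\| .
\]

Proof of Lemma 3.10. In our estimates below, which involve non-local operators, we exploit the fact that the interference between spatially separated regions decays exponentially. Therefore, we start by introducing appropriate exponential weight functions: We pick some $\tilde a\in(\tau,\min\{\triangle_0,m\})$ and a convex, even function $\tilde f\in C^\infty(\mathbb{R},[0,\infty))$ such that $\tilde f\equiv 0$ on $[-2,2]$, $\tilde f(t) = \tilde a|t|-3\tilde a$, for $|t|\ge 4$, and $0\le\tilde f'\le\tilde a$ on $(0,\infty)$. We define $f_\nu(x) := \tilde f(|x-\nu|)$, $x\in\mathbb{R}^3$, so that $f_\nu\equiv 0$ on $B_2(\nu)$ and $f_\nu\ge\tilde a\,\mathrm{dist}(\cdot,B_2(\nu))-\tilde a$ with equality outside $B_4(\nu)$. Moreover, $|\nabla f_\nu|\le\tilde a$. We further pick some non-decreasing $\theta\in C^\infty(\mathbb{R},\mathbb{R})$ such that $\theta(t)=t$, for $t\le 1$, $\theta\equiv 2$ on $[3,\infty)$ and $\theta'\le 1$. We set $\theta_{\nu,y}(t) := (|\nu-y|+1)\,\theta(t/(|\nu-y|+1))$, $t\in\mathbb{R}$, and $f_{\nu,y} := \theta_{\nu,y}\circ f_\nu$, so that $f_{\nu,y}$ is bounded, $f_{\nu,y} = f_\nu\ge\tilde a\,\mathrm{dist}(\cdot,B_2(\nu))-\tilde a$ on $B_1(y)\supset\mathrm{supp}(\rho_y)$, and $|\nabla f_{\nu,y}|\le\tilde a$. By construction $e^{f_{\nu,y}}J_\nu = J_\nu$. Setting

\[
\psi_{\nu,y}(z) := \big(J_\nu U_1 - e^{f_{\nu,y}}U_2e^{-f_{\nu,y}}J_\nu\big)e^F R_{A,V}(z)e^{-F}\psi
\]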
we thus have (Jν U1 − U2 Jν )eF RA,V (z)e−F ψ = e−fν,y ψ ν,y (z). Observing that µχ = 0 implies efν,y U2 e−fν,y = µ[efν,y VE e−fν,y , χ] and employing (2.6) and (3.11) we further find some constant C ∈ (0, ∞) such that, for all z = e0 + iη ∈ Γ, ν ∈ Z3 , and y ∈ R3 , a + ∇χ ∞ ψ ν,y (z) ≤ C ψ . 1 + η2
(3.53)
Taking these remarks into account we obtain |e−fν,y RAeν (¯ z )efν,y Wy1/2 (y e−fν,y )ζφ | ψ ν,y (z)||dz|. I1 ≤ Γ
ν∈G (χ)
ν = 0, whence (2.1), (2.12) and ν = ∇gν is a gradient we have curl A Now, since A Hardy’s inequality imply ν )ϕ ≤ C |D e |ϕ , Wy ϕ ≤ C ∇(eigν ϕ) = C (−i∇ + A Aν for ϕ ∈ D, with some ν- and y-independent constant C ∈ (0, ∞). Standard argu1/2 ments now imply that |DAeν |−1/2 Wy is a bounded operator whose norm is uniformly bounded in ν and y. Setting φν := |DAeν |−1/2 Wy1/2 (y e−fν,y )ζφ and applying (3.17) we thus find
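\[
\begin{aligned}
I_1 &\le \sum_{\nu\in G(\chi)} \Big(\int_\Gamma \big\| e^{-f_{\nu,y}} R_{\widetilde A_\nu}(\bar z)\, e^{f_{\nu,y}}\, |D_{\widetilde A_\nu}|^{1/2}\phi_\nu \big\|^2\, |dz|\Big)^{1/2} \Big(\int_\Gamma \|\psi_{\nu,y}(z)\|^2\, |dz|\Big)^{1/2} \\
&\le C \sum_{\nu\in G(\chi)} \|\phi_\nu\|\, (a+\|\nabla\chi\|_\infty) \Big(\int_{\mathbb{R}} \frac{d\eta}{1+\eta^2}\Big)^{1/2} \|\psi\| \\
&\le C' \sum_{\nu\in\mathbb{Z}^3} \sup\{ e^{-f_{\nu,y}(x)} : x\in B_1(y)\}\, \|\phi\|\, (a+\|\nabla\chi\|_\infty)\, \|\psi\| \\
&\le C'' (a+\|\nabla\chi\|_\infty)\, \|\phi\|\, \|\psi\| .
\end{aligned}
\]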
For I2 , we obtain the estimate by means of (3.22), (3.11) and (3.53), |dz| |DAeν |−1/2 Wy1/2 φ |DAeν |1/2 RAeν (z) I2 ≤ Γ
ν∈G (χ)
· |∇(y ζ)|e−fν,y efν,y (RA,V (z) − RAeν (z))e−fν,y ψ ν,y (z) sup{e−fν,y (x) : x ∈ B1 (y)} φ ψ ≤ C( ∇χ ∞ + a) ν∈Z3
≤ C ( ∇χ ∞ + a) φ ψ . To derive a bound on I3 we employ the special properties of the gauge transformed vector potentials Aν . Namely, we make use of the bound y (x)e−fν,y (x) |Aν (x)| ≤ y (x)
|x − ν| (b1 e−(˜a−τ )|x−ν|+3˜a + |B(ν)|e−˜a|x−ν|+3˜a ), 2 (3.54)
for all ν ∈ Z3 and x ∈ R3 , which follows from (2.18). Since also |B(ν)| ≤ |B(y)| + |B(ν) − B(y)| and |x − y| ≤ 1, if y (x) = 0, we infer again from (2.18) that y e−fν,y |Aν | ∞ ν∈G (χ)
≤ C
ν∈G (χ)
e
−ˆ a|y−ν|
1 + min |B(y)|, sup |B(ν)|
,
(3.55)
ν∈G (χ)
for some sufficiently small a ˆ > 0. Using these observations and the uniform boundfν,y −fν,y edness of ζe V e , which is implied by Hypothesis 2.2 and the choice of ζ, we find some χ-, F -, and y-independent constant C ∈ (0, ∞) such that RAe(¯ z )Wy1/2 φ { y e−fν,y ∞ ζefν,y V e−fν,y I3 ≤ ν∈G (χ)
Γ
+ y e−fν,y |Aν | ∞ } efν,y RA,V (z)e−fν,y ψ ν,y (z) |dz| ≤ C (1 + min{|B(y)|, B ∞,χ })( ∇χ ∞ + a) φ ψ . This completes the proof of Lemma 3.10 and, at the same time, the proof of Proposition 3.2. (The last assertion of Proposition 3.2 follows by inspecting the arguments above.) Proof of Lemma 3.7. We use the notation introduced in the proofs of Proposition 3.2 and Lemma 3.10 in the following. We already know from Corollary 3.6 1/2 that the vector Λ+ A,V ψ belongs to D(Wy ), but we do not have any control on the norm on the left in (3.38) yet. It is certainly sufficient to derive the asserted bound
with Wy1/2 replaced by y Wy1/2 . As in the proof of Corollary 3.6, we first pick some ζ ∈ C0∞ (R3 , [0, 1]) (independent of ψ) such that ζ ≡ 1 on some large open ball containing Y and set ζ = 1 − ζ. By the closed graph theorem |D0 |1/2 ζRA,V (i) is bounded whence y Wy1/2 ζΛ+ A,V ψ ≤ C |D0 |1/2 ζRA,V (i) (DA,V − i)ψ .
We denote the characteristic function of the support of ψ by χ. To treat the remaining piece of the norm we set ψ ν := Λ+ A,V (DA,V − i)Jν ψ, and write analogously to (3.52), |Wy1/2 φ | y ζΛ+ A,V ψ| ≤ |Wy1/2 φ | y ζRA,V (i)ψ ν | ν∈G (χ)
≤
ν∈G (χ)
+
|Wy1/2 y ζφ | RAeν (i)ψ ν |
ν∈G (χ)
+
ν∈G (χ)
|Wy1/2 φ | RAeν (i)α · ∇(y ζ)(RA,V (i) − RAeν (i))ψ ν | |Wy1/2 φ | RAeν (i)y ζ(V + α · Aν )RA,V (i)ψ ν |
=: Q1 + Q2 + Q3 , where φ ∈ H 1/2 (R3 , C4 ). Again we use exponential weights constructed in the beginning of the proof of Lemma 3.10 and abbreviate −fν,y fν,y )e (DA,V − i)e−fν,y Jν ψ, ψν,y := (efν,y Λ+ A,V e
so that by Corollary 3.1, ˜) ψ ≤ C (DA,V − i)ψ , ψν,y ≤ C (DA,V − i)ψ + O( ∇Jν ∞ + a where C, C ∈ (0, ∞) neither depend on ν nor y. Writing also efν,y RAeν (i)e−fν,y = RAeν (i)(1 − iα · ∇fν,y efν,y RAeν (i)e−fν,y ) and using (3.11) we thus obtain Q1 ≤
ν∈G (χ)
|DAeν |−1/2 Wy1/2 y e−fν,y φ
· |DAeν |1/2 efν,y RAeν (i)e−fν,y ψν,y ≤ C φ (DA,V − i)ψ .
Using (3.55) we further find |DAeν |−1/2 Wy1/2 φ |DAeν |1/2 RAeν (i) Q3 ≤ ν∈G (χ)
· { y e−fν,y ∞ ζefν,y V e−fν,y + y e−fν,y |Aν | ∞ } ψν,y ≤ C (1 + min{|B(y)|, B ∞,ψ }) φ (DA,V − i)ψ . The remaining term, Q2 , can be dealt with similarly.

4. Exponential Localization

In this section, we prove Theorem 2.1. To this end, we adapt an argument from [1] and some useful improvements of the latter from [19] to our non-local situation. In the proof below, we present the general strategy of the argument. In doing so, we refer to three technical lemmata whose proofs are postponed to the end of this section. The argument from [1] is advantageous here since it does not require any a priori knowledge of the spectrum of $H_N$. Rather, it makes it possible to prove the exponential localization of spectral projections directly and to infer results on the nature of the spectrum from the localization estimate. In particular, the argument avoids the use of eigenvalue equations which are, for instance, exploited in Agmon type estimates. Throughout this section we always assume that the assumptions of Theorem 2.1 are fulfilled.

Proof of Theorem 2.1. Since $H_N$ is bounded from below we may suppose that $\inf I > -\infty$. By assumption we have $\sup I < E^A_{N-1} + 1$. Moreover, we consider $H_N$ as an operator on the unprojected $N$-particle space $\mathscr{H}_N$. In this case we have to keep in mind that 0 becomes an infinitely degenerate eigenvalue of $H_N$. Our goal is to show that there are $b, C \in (0,\infty)$ such that $\int_{\mathbb{R}^{3N}} e^{2b|X|}\,|\Phi(X)|^2\,dX \le C$, for all normalized $\Phi \in \mathrm{Ran}(E_I(H_N))$ such that $\Phi = \mathcal{A}_N\Phi$ and $\Phi = \Lambda^{+,N}_{A,V}\Phi$. Borrowing an idea from [38] we simplify the problem by using the bounds
\[
e^{2b|X|} \le \max_{j=1,\dots,N} e^{2b\sqrt{N}\,|x_j|} \le \sum_{j=1}^{N} e^{2b\sqrt{N}\,|x_j|}, \qquad X = (x_1,\dots,x_N) \in (\mathbb{R}^3)^N,
\]
and the anti-symmetry of $\Phi = \mathcal{A}_N\Phi$. (We are not aiming to derive good estimates on the decay rate here.) Indeed, it suffices to show that there exist $a, C \in (0,\infty)$ such that
\[
\int_{\mathbb{R}^{3N}} e^{2a|x_N|}\,|\Phi(X)|^2\,dX \le C, \tag{4.1}
\]
for all $\Phi = \mathcal{A}_N\Phi = \Lambda^{+,N}_{A,V}\Phi \in \mathrm{Ran}(E_I(H_N))$, $\|\Phi\| = 1$. Then Theorem 2.1 holds true with $b = a/\sqrt{N}$. Furthermore, it suffices to show that (4.1) holds true with $a|x_N|$ replaced
by $F(x_N)$, for every (bounded) $F : \mathbb{R}^3 \to \mathbb{R}$ satisfying (3.24). This is in fact an obvious consequence of the monotone convergence theorem applied to the integrals $\int_{\mathbb{R}^{3N}} e^{2F_n(x_N)}\,|\Phi(X)|^2\,dX$ with $\Phi$ as above, where $F_1, F_2, \dots$ is a suitable increasing sequence of functions satisfying (3.24) and converging to $a|x_N|$. Therefore, it suffices to find some $a > 0$ such that
\[
\big\| (\mathcal{A}_{N-1} \otimes e^{F})\, E_I(H_N)\, (\mathcal{A}_{N-1} \otimes 1)\, \Lambda^{+,N}_{A,V} \big\| < \infty, \tag{4.2}
\]
for every $F$ satisfying (3.24), where $\mathcal{A}_{N-1}$ denotes anti-symmetrization of the first $N-1$ variables and $e^F$ acts only on the $N$th electron variable.

We start by introducing a comparison operator. To this end we pick some $\chi \in C^\infty(\mathbb{R}^3,[0,1])$ such that $\chi \equiv 1$ outside $B_2(0)$ and $\chi \equiv 0$ on $B_1(0)$ and set $\chi_R := \chi(\cdot/R)$ and $\bar\chi_R := 1 - \chi_R$, for $R \ge 1$. Furthermore, we define orthogonal projections
\[
P_{N-1} := \mathcal{A}_{N-1}\Lambda^{+,N-1}_{A,V}, \quad P^\perp_{N-1} = 1_{\mathscr{H}_{N-1}} - P_{N-1}, \qquad
Q_N := (\mathcal{A}_{N-1}\otimes 1)\Lambda^{+,N}_{A,V}, \quad Q^\perp_N = 1_{\mathscr{H}_N} - Q_N .
\]
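The elementary weight bound used at the start of the proof, $e^{2b|X|} \le \max_j e^{2b\sqrt{N}|x_j|} \le \sum_j e^{2b\sqrt{N}|x_j|}$, rests only on $|X| \le \sqrt{N}\max_j |x_j|$. The following is a minimal numerical sanity check of that chain on random configurations; the function name `weights` and the sampling ranges are illustrative choices, not part of the paper.

```python
import math
import random

def weights(X, b):
    """Return (lhs, mid, rhs) for the chain
    e^{2b|X|} <= max_j e^{2b sqrt(N)|x_j|} <= sum_j e^{2b sqrt(N)|x_j|},
    where X is a list of N points in R^3 (hypothetical helper)."""
    N = len(X)
    norms = [math.sqrt(sum(c * c for c in x)) for x in X]  # |x_j|
    total = math.sqrt(sum(r * r for r in norms))           # |X|
    lhs = math.exp(2 * b * total)
    terms = [math.exp(2 * b * math.sqrt(N) * r) for r in norms]
    return lhs, max(terms), sum(terms)

random.seed(0)
ok = True
for _ in range(1000):
    N = random.randint(1, 6)
    X = [[random.uniform(-3, 3) for _ in range(3)] for _ in range(N)]
    lhs, mid, rhs = weights(X, b=0.3)
    # small relative tolerance for the equality case (e.g. N = 1)
    ok = ok and lhs <= mid * (1 + 1e-12) and mid <= rhs
print(ok)
```

The first inequality is sharp when all $|x_j|$ coincide, which is why the check allows a tiny relative tolerance.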
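Then the comparison operator is defined, a priori on the domain $\mathscr{D}_N \subset \mathscr{H}_N$, by
\begin{align}
\widetilde H_N &:= Q_N H_N Q_N + H^A_{N-1} \otimes \Lambda^-_{A,V} + E^A_{N-1} P^\perp_{N-1} \otimes 1 + P_{N-1} \otimes \Lambda^+_{A,V}(1 - E^A_1)\bar\chi_R \Lambda^+_{A,V} + Q^\perp_N \notag \\
&= H^A_{N-1} \otimes 1 + E^A_{N-1} P^\perp_{N-1} \otimes 1 \tag{4.3} \\
&\quad + P_{N-1} \otimes \Lambda^+_{A,V}\{D_{A,V_C} + (1 - E^A_1)\bar\chi_R\}\Lambda^+_{A,V} + Q^\perp_N \tag{4.4} \\
&\quad + \sum_{i=1}^{N-1} Q_N W_{iN} Q_N . \tag{4.5}
\end{align}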
We denote the Friedrichs extension of $\widetilde H_N$ again by the same symbol. (The idea to introduce an additional cut-off function $\bar\chi_R$ in (4.4) to compensate for the Coulomb singularity in the last variable $x_N$ is borrowed from [19]; together with the other additional terms in (4.3) and (4.4) it ensures that the spectrum of $\widetilde H_N$ stays away from the interval $I$.) Notice that on $\mathscr{D}_N$ we have $H^A_{N-1} \otimes 1 + E^A_{N-1} P^\perp_{N-1} \otimes 1 \ge E^A_{N-1} 1_{\mathscr{H}_N}$. Furthermore, Lemma 4.3 below implies that
\[
P_{N-1} \otimes \Lambda^+_{A,V}\{D_{A,V_C} + (1 - E^A_1)\bar\chi_R\}\Lambda^+_{A,V} + Q^\perp_N \ge 1 - o(1)\, P_{N-1} \otimes 1,
\]
as $R$ tends to infinity. We now pick some $\varepsilon > 0$ with $\sup I < E^A_{N-1} + 1 - \varepsilon$. Then the above remarks imply
\[
\langle \psi \,|\, \widetilde H_N \psi \rangle \ge (E^A_{N-1} + 1 - \varepsilon/2)\, \|\psi\|^2, \qquad \psi \in \mathscr{D}_N, \tag{4.6}
\]
for all sufficiently large $R \ge 1$. Next, we define
\[
H'_N := Q_N H_N Q_N + H^A_{N-1} \otimes \Lambda^-_{A,V} .
\]
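The quadratic-form lower bound on the comparison operator stated above can be unpacked term by term. The following sketch only combines the two operator inequalities quoted in the text with the positivity of the interaction potentials $W_{iN} \ge 0$ (used elsewhere in the paper); the symbol $\widetilde H_N$ for the comparison operator and $\bar\chi_R := 1 - \chi_R$ for the cut-off are notational assumptions.

```latex
\begin{align*}
\langle \psi \,|\, \widetilde H_N \psi \rangle
 &= \big\langle \psi \,\big|\, \big(H^A_{N-1}\otimes 1 + E^A_{N-1} P^\perp_{N-1}\otimes 1\big)\psi \big\rangle \\
 &\quad + \big\langle \psi \,\big|\, \big(P_{N-1}\otimes \Lambda^+_{A,V}\{D_{A,V_C} + (1-E^A_1)\bar\chi_R\}\Lambda^+_{A,V} + Q_N^\perp\big)\psi \big\rangle
   + \sum_{i=1}^{N-1} \langle Q_N\psi \,|\, W_{iN}\, Q_N\psi \rangle \\
 &\ge E^A_{N-1}\|\psi\|^2 + \big(\|\psi\|^2 - o(1)\,\|(P_{N-1}\otimes 1)\psi\|^2\big) + 0 \\
 &\ge \big(E^A_{N-1} + 1 - o(1)\big)\|\psi\|^2 , \qquad R \to \infty .
\end{align*}
% Choosing R so large that the o(1)-term is at most eps/2 gives the bound
% with the constant E^A_{N-1} + 1 - eps/2.
```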
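Then $\widetilde H_N$ and $H'_N$ have the same domain since they differ by a bounded operator on their common form core $\mathscr{D}_N$. We further pick some $\widetilde\chi_I \in C_0^\infty(\mathbb{R},[0,1])$, such that $\widetilde\chi_I \equiv 1$ on $I$ and $\mathrm{supp}(\widetilde\chi_I) \subset (-\infty, E^A_{N-1} + 1 - \varepsilon)$. Then $\widetilde\chi_I(\widetilde H_N) = 0$ by (4.6). As in [1] we now observe that
\[
Q_N E_I(H_N) Q_N = Q_N E_I(H'_N) Q_N = \big(\widetilde\chi_I(H'_N) - \widetilde\chi_I(\widetilde H_N)\big) Q_N . \tag{4.7}
\]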
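We preserve the symbol $\widetilde\chi_I$ to denote an almost analytic extension of $\widetilde\chi_I$ (see, e.g., [15]) to a smooth, compactly supported function on the complex plane satisfying
\[
\mathrm{supp}(\widetilde\chi_I) \subset (-\infty, E^A_{N-1} + 1 - \varepsilon) + i(-\delta,\delta), \qquad
\partial_{\bar z}\widetilde\chi_I(z) = O_N(|\Im z|^N), \quad N \in \mathbb{N}, \tag{4.8}
\]
where $\partial_{\bar z} = \tfrac12(\partial_{\Re z} + i\,\partial_{\Im z})$. Here we may choose $\delta > 0$ as small as we please. We shall apply the Helffer–Sjöstrand formula (see, e.g., [15]),
\[
\widetilde\chi_I(T) = \int_{\mathbb{C}} (T - z)^{-1}\, d\widetilde\chi_I(z), \qquad
d\widetilde\chi_I(z) := \frac{i}{2\pi}\, \partial_{\bar z}\widetilde\chi_I(z)\, dz \wedge d\bar z,
\]
which holds for every self-adjoint operator $T$ on some Hilbert space. By means of (4.7), we then find the representation
\[
Q_N E_I(H_N) Q_N = \int_{\mathbb{C}} \big[(H'_N - z)^{-1} - (\widetilde H_N - z)^{-1}\big]\, d\widetilde\chi_I(z)\, Q_N . \tag{4.9}
\]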
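For some $F$ as in (3.24) (which acts only on the last variable in what follows), we abbreviate
\[
\Lambda^{F,N}_{A,V} := e^{F} \Lambda^{+,N}_{A,V} e^{-F} .
\]
Then (4.9) and the second resolvent identity together with the trivial identities $Q^\perp_N Q_N = 0 = (P^\perp_{N-1} \otimes 1) Q_N$ yield
\begin{align}
\big\| (\mathcal{A}_{N-1} \otimes e^F) E_I(H_N) (\mathcal{A}_{N-1} \otimes 1) \Lambda^{+,N}_{A,V} \big\|
&\le \int_{\mathbb{C}} \big\| e^F (\widetilde H_N - z)^{-1} P_{N-1} \otimes \{\Lambda^+_{A,V}(1 - E^A_1)\bar\chi_R \Lambda^+_{A,V}\} \big\| \, \big\| Q_N (H'_N - z)^{-1} Q_N \big\| \, |d\widetilde\chi_I(z)| \notag \\
&\le (1 - E^A_1) \int_{\mathbb{C}} \big\| e^F (\widetilde H_N - z)^{-1} e^{-F} \big\| \, \big\| \Lambda^{F,N}_{A,V} \big\| \, \big\| e^F \bar\chi_R \big\| \, \frac{|d\widetilde\chi_I(z)|}{|\Im z|} \notag \\
&\le C_{a,R} \int_{\mathbb{C}} \big\| e^F (\widetilde H_N - z)^{-1} e^{-F} \big\| \, \frac{|d\widetilde\chi_I(z)|}{|\Im z|} . \tag{4.10}
\end{align}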
In the last step, we apply Proposition 3.1 and $\|e^F \bar\chi_R\| \le e^{2aR}$. By (4.8) $|d\widetilde\chi_I(z)|/|\Im z|$ is a finite measure. To conclude the proof of Theorem 2.1 it thus remains to show that the norm of $e^F(\widetilde H_N - z)^{-1} e^{-F}$ is uniformly bounded in all $z \in \mathrm{supp}(\widetilde\chi_I) \setminus \mathbb{R}$ and $F$ satisfying (3.24). This is done in the rest of this proof.

Since $F$ satisfies (3.24) we know that $1_{N-1} \otimes e^F$ is an isomorphism on $\mathscr{H}_N$. Therefore, the densely defined operators $e^F \widetilde H_N e^{-F}$ and $\widetilde H_N$ have the same resolvent set and
\[
R_F(z) := e^F (\widetilde H_N - z)^{-1} e^{-F} = (e^F \widetilde H_N e^{-F} - z)^{-1}, \qquad z \in \rho(\widetilde H_N). \tag{4.11}
\]
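In particular, $e^F \widetilde H_N e^{-F}$ is closed because its resolvent set is not empty. Using the identity $(R_F(z)^{-1})^* = (R_F(z)^*)^{-1}$ we readily verify that $(e^F \widetilde H_N e^{-F})^* = e^{-F} \widetilde H_N e^{F}$. Since $e^{\pm F}$ maps $\mathscr{D}_N$ into itself we further have
\[
\mathscr{D}_N \subset \mathcal{D}(e^{\pm F} \widetilde H_N e^{\mp F}) = e^{\pm F} \mathcal{D}(\widetilde H_N) \subset e^{\pm F} \mathcal{Q}(\widetilde H_N). \tag{4.12}
\]
The following two lemmata, whose proofs are postponed to the end of this section, show that $e^F \widetilde H_N e^{-F}$ is a small form perturbation of $\widetilde H_N$. We define $T : \mathscr{D}_N \to \mathscr{H}_N$ by
\[
T\varphi := e^F \widetilde H_N e^{-F}\varphi - \widetilde H_N \varphi, \qquad \varphi \in \mathscr{D}_N . \tag{4.13}
\]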
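Lemma 4.1. Assume that $F : \mathbb{R}^3 \to \mathbb{R}$ satisfies (3.24). Then we have, as $a > 0$ tends to zero,
\[
|\langle \varphi \,|\, T\varphi \rangle| \le a \langle \varphi \,|\, \widetilde H_N \varphi \rangle + O(a) \langle \varphi \,|\, \varphi \rangle, \qquad \varphi \in \mathscr{D}_N . \tag{4.14}
\]

Lemma 4.2. There exist constants $c_1, c_2 \in (0,\infty)$ such that, for all $F : \mathbb{R}^3 \to \mathbb{R}$ satisfying (3.24) and all $\varphi \in \mathscr{D}_N$,
\[
|\langle e^{\pm F}\varphi \,|\, \widetilde H_N e^{\pm F}\varphi \rangle| \le c_1 \|e^{\pm F}\|^2 \langle \varphi \,|\, \widetilde H_N \varphi \rangle + c_2 \|e^{\pm F}\|^2 \|\varphi\|^2 . \tag{4.15}
\]
In particular, $e^{\pm F} \mathcal{Q}(\widetilde H_N) \subset \mathcal{Q}(\widetilde H_N)$.

If $a < 1/2$ then Lemma 4.1 implies that $(e^F \widetilde H_N e^{-F})|_{\mathscr{D}_N}$ has a distinguished, sectorial, closed extension, $\widetilde H^F_N$, that is the only closed extension having the properties $\mathcal{D}(\widetilde H^F_N) \subset \mathcal{Q}(\widetilde H_N)$, $\mathcal{D}((\widetilde H^F_N)^*) \subset \mathcal{Q}(\widetilde H_N)$, and $i\eta \in \rho(\widetilde H^F_N)$, for all $\eta \in \mathbb{R}$ with sufficiently large absolute value; see [31]. Thanks to (4.11), (4.12) and Lemma 4.2, we know that $e^F \widetilde H_N e^{-F}$ is a closed extension enjoying these properties, whence
\[
\widetilde H^F_N = e^F \widetilde H_N e^{-F} .
\]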
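We are now prepared to derive a uniform bound on the norm under the integral sign in (4.10). For $z \in \mathrm{supp}(\widetilde\chi_I)$ and $\varphi \in \mathscr{D}_N$, we obtain
\begin{align}
\Re \langle \varphi \,|\, (\widetilde H^F_N - z)\varphi \rangle
&= \Re \langle \varphi \,|\, (\widetilde H_N - z)\varphi \rangle + \Re \langle \varphi \,|\, T\varphi \rangle \notag \\
&\ge (1 - a) \Big\langle \varphi \,\Big|\, \Big(\widetilde H_N - \frac{\Re z}{1-a}\Big)\varphi \Big\rangle - O(a) \|\varphi\|^2 . \tag{4.16}
\end{align}
By (4.6) and (4.8), we thus find $a \in (0,1/2)$ and $R \in [1,\infty)$ such that, for all $z \in \mathrm{supp}(\widetilde\chi_I)$ and $\varphi \in \mathscr{D}_N$,
\[
\Re \langle \varphi \,|\, (\widetilde H^F_N - z)\varphi \rangle \ge \frac{\varepsilon}{4} \|\varphi\|^2 .
\]
This inequality implies that, for $z \in \mathrm{supp}(\widetilde\chi_I)$, the numerical range of $\widetilde H^F_N - z$ is contained in the half space $\{\zeta \in \mathbb{C} : \Re\zeta \ge \varepsilon/4\}$ [31, Theorem VI.1.18 and Corollary VI.2.3]. Moreover, by (4.11) the deficiency of $\widetilde H^F_N - z$ is zero, for all $z \in \mathbb{C} \setminus \mathbb{R}$, and we may hence estimate the norm of $(\widetilde H^F_N - z)^{-1}$ by the inverse distance of $z$ to the numerical range of $\widetilde H^F_N$ [31, Theorem V.3.2]. We thus arrive at
\[
\big\| (\widetilde H^F_N - z)^{-1} \big\| \le \frac{4}{\varepsilon}, \qquad z \in \mathrm{supp}(\widetilde\chi_I),
\]
which together with (4.10) proves Theorem 2.1.

Lemma 4.3. For every sufficiently large $R \ge 1$, there is some $c_R \in (0,\infty)$ such that $c_R \to 0$, as $R \to \infty$, and, for all $\varphi \in \mathscr{D}$,
\[
\langle \varphi \,|\, \Lambda^+_{A,V}[D_{A,V_C} + (1 - E^A_1)\bar\chi_R]\Lambda^+_{A,V}\varphi \rangle \ge \|\Lambda^+_{A,V}\varphi\|^2 - c_R \|\varphi\|^2 . \tag{4.17}
\]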
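Proof. To begin with we introduce a scaled partition of unity. Namely, we pick some $\tilde\mu \in C_0^\infty(\mathbb{R}^3,[0,1])$ such that $\tilde\mu \equiv 1$ on $B_2(0)$ and observe that $\theta := \tilde\mu^2 + (1 - \tilde\mu)^2$ is strictly positive. We further set, for $R \ge 1$ and $x \in \mathbb{R}^3$, $\mu_1(x) \equiv \mu_{R,1}(x) := \tilde\mu(x/R)/\theta^{1/2}(x/R)$, and $\mu_2(x) \equiv \mu_{R,2}(x) := (1 - \tilde\mu(x/R))/\theta^{1/2}(x/R)$, so that $\mu_1^2 + \mu_2^2 = 1$. Since $\mu_1\nabla\mu_1 + \mu_2\nabla\mu_2 = \nabla(\mu_1^2 + \mu_2^2)/2 = 0$ it follows that, for $\varphi \in \mathscr{D}$,
\begin{align}
\langle \varphi \,|\, \Lambda^+_{A,V}[D_{A,V_C} + (1 - E^A_1)\bar\chi_R]\Lambda^+_{A,V}\varphi \rangle
&= \sum_{j=1,2} \langle \varphi \,|\, \Lambda^+_{A,V}[\mu_j D_{A,V_C}\mu_j + (1 - E^A_1)\mu_j^2 \bar\chi_R]\Lambda^+_{A,V}\varphi \rangle \notag \\
&=: \sum_{j=1,2} Y_j . \tag{4.18}
\end{align}
To treat the summand with $j = 1$ we use that, by construction, $\mu_1 \bar\chi_R = \mu_1$, for every $R \ge 1$. Taking also Corollary 3.2 and (2.16) into account we find, for all $R \ge 1$ and $\varphi \in \mathscr{D}$,
\begin{align}
Y_1 &\ge (1 - 1/R) \langle \mu_1\varphi \,|\, \Lambda^+_{A,V}[D_{A,V_C} - E^A_1]\Lambda^+_{A,V}\mu_1\varphi \rangle + \|\mu_1 \Lambda^+_{A,V}\varphi\|^2 - O(1/R)\|\varphi\|^2 \notag \\
&\ge \|\mu_1 \Lambda^+_{A,V}\varphi\|^2 - O(1/R)\|\varphi\|^2 . \tag{4.19}
\end{align}
We next turn to the summand with $j = 2$ in (4.18) where $\mu_2^2 \bar\chi_R \ge 0$. Applying successively Corollaries 3.2 and 3.5, Proposition 3.1, and Lemma 3.8 we deduce that, for all $\varphi \in \mathscr{D}$ and every $\varepsilon > 0$,
\begin{align}
\langle \varphi \,|\, \Lambda^+_{A,V}\mu_2 D_{A,V_C}\mu_2 \Lambda^+_{A,V}\varphi \rangle
&\ge (1 - \varepsilon) \langle \varphi \,|\, \mu_2 \Lambda^+_A D_A \Lambda^+_A \mu_2 \varphi \rangle - o_\varepsilon(1)\|\varphi\|^2 \notag \\
&\ge (1 - \varepsilon) \|\Lambda^+_A \mu_2 \varphi\|^2 - o_\varepsilon(1)\|\varphi\|^2 \notag \\
&\ge (1 - \varepsilon)^2 \|\mu_2 \Lambda^+_A \varphi\|^2 - o_\varepsilon(1)\|\varphi\|^2 \notag \\
&\ge (1 - \varepsilon)^3 \|\mu_2 \Lambda^+_{A,V}\varphi\|^2 - o_\varepsilon(1)\|\varphi\|^2 , \tag{4.20}
\end{align}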
as $R \to \infty$. We conclude by combining (4.18)–(4.20) and using $\mu_1^2 + \mu_2^2 = 1$.

Proof of Lemma 4.1. We have to study the contribution to $T = e^F \widetilde H_N e^{-F} - \widetilde H_N$ coming from each term in (4.3)–(4.5). The terms in (4.3) commute with $e^F$ and hence give no contribution. In order to estimate the contribution coming from the left term in (4.4) we first observe that Corollary 3.1 and (3.32) imply the following identities on $\mathscr{D}$,
\begin{align*}
e^F \Lambda^+_{A,V} D_{A,V_C} \Lambda^+_{A,V} e^{-F}
&= \Lambda^F_{A,V}(D_{A,V_C} + i\alpha\cdot\nabla F)\Lambda^F_{A,V} \\
&= \Lambda^F_{A,V} D_{A,V_C} \Lambda^F_{A,V} + O(a) \\
&= (1 + a)\Lambda^+_{A,V} D_{A,V_C} \Lambda^+_{A,V} + O(a).
\end{align*}
The term in (4.4) involving the cut-off function $\bar\chi_R$ yields a contribution of order $O(a)$, too, due to Corollary 3.2 (with $L = (1 - E^A_1)\bar\chi_R$ and $\chi = 1$). To account for the projection on the right in (4.4) we write $Q^\perp_N = 1_{\mathscr{H}_N} - P_{N-1} \otimes \Lambda^+_{A,V}$ and use Proposition 3.1 to obtain $e^F Q^\perp_N e^{-F} - Q^\perp_N = O(a)$. Finally, we apply Corollary 3.4 to all terms in (4.5) (this is the only place in this section where we use the assumption that $B$ is bounded) and arrive at
\[
|\langle \varphi \,|\, T\varphi \rangle| \le a \Big\langle \varphi \,\Big|\, Q_N \Big( D^{(N)}_{A,V_C} + \sum_{i=1}^{N-1} W_{iN} \Big) Q_N \varphi \Big\rangle + O(a)\|\varphi\|^2 .
\]
Since $H^A_{N-1} \ge E^A_{N-1}$ this completes the proof of Lemma 4.1.
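Proof of Lemma 4.2. We drop the $\pm$-signs in (4.15) since they do not play any role in this proof. It is clear that we only have to comment on those terms in (4.3)–(4.5) that involve unbounded operators. Since $H_{N-1} \otimes 1$ commutes with $e^F$ and since $H_{N-1} \ge E^A_{N-1}$ we first find, for $\varphi \in \mathscr{D}_N$,
\[
\langle e^F\varphi \,|\, (H_{N-1} \otimes 1 - E^A_{N-1} P^\perp_{N-1} \otimes 1) e^F \varphi \rangle \le \|e^{2F}\| \, \langle \varphi \,|\, (H_{N-1} \otimes 1 - E^A_{N-1} P^\perp_{N-1} \otimes 1)\varphi \rangle . \tag{4.21}
\]
By virtue of Proposition 3.2 we can estimate $|\langle \varphi \,|\, e^F Q_N W_{iN} Q_N e^F \varphi \rangle|$, for $\varphi \in \mathscr{D}_N$, as
\begin{align}
\|W_{iN}^{1/2} Q_N e^F \varphi\|^2
&\le 2 \|e^F W_{iN}^{1/2} Q_N \varphi\|^2 + 2 \|W_{iN}^{1/2}[e^F, Q_N] e^{-F} e^F \varphi\|^2 \notag \\
&\le 2 \|e^F\|^2 \|W_{iN}^{1/2} Q_N \varphi\|^2 + O(a^2) \|e^F\|^2 \|\varphi\|^2 . \tag{4.22}
\end{align}
(If $B$ is unbounded, then the $O$-symbol in (4.22) depends on the supremum of $|B|$ on $\mathrm{supp}(\nabla F)$.) It remains to prove that there are constants $c_3, c_4 \in (0,\infty)$ such that
\[
\langle \varphi \,|\, e^F \Lambda^+_{A,V} D_{A,V_C} \Lambda^+_{A,V} e^F \varphi \rangle \le c_3 \|e^F\|^2 \langle \varphi \,|\, \Lambda^+_{A,V} D_{A,V_C} \Lambda^+_{A,V} \varphi \rangle + c_4 \|e^F\|^2 \|\varphi\|^2 , \tag{4.23}
\]
for $\varphi \in \mathscr{D}$. Moreover, since $V_H$ and $V_E$ are bounded it suffices to prove this estimate with $D_{A,V_C}$ replaced by $\widetilde D := D_{A,V} - e_0$, which is positive on the range of $\Lambda^+_{A,V}$. We abbreviate $\Lambda^\pm := \Lambda^\pm_{A,V}$ in the rest of this proof and seek bounds on both terms on the right side of
\[
\|(\widetilde D \Lambda^+)^{1/2} e^F \varphi\|^2 \le 2 \sum_{\sharp = \pm} \|(\widetilde D \Lambda^+)^{1/2} e^F \Lambda^\sharp \varphi\|^2 . \tag{4.24}
\]
Here the norm with $\sharp = -$ equals $\|(\widetilde D \Lambda^+)^{1/2}[e^F, \Lambda^-]\varphi\|$ and is not greater than some $O(a)\|e^F\|\,\|\varphi\|$ due to Proposition 3.1. We next define
\[
\widehat D := \Lambda^+(D_{A,V} - e_0)\Lambda^+ + 1 = \Lambda^+ \widetilde D \Lambda^+ + 1 \ge 1 .
\]
In fact, because of $\|(\widetilde D \Lambda^+)^{1/2} \widehat D^{-1/2}\| \le 1$ and
\begin{align*}
\|\widehat D^{1/2} e^F \Lambda^+ \varphi\|^2
&\le \|e^F \widehat D^{1/2} \Lambda^+ \varphi\|^2 + \|[\widehat D^{1/2}, e^F]\Lambda^+ \varphi\|^2 \\
&\le \|e^F\|^2 \|\Lambda^+ \widehat D^{1/2} \varphi\|^2 + \|\widehat D^{1/2}[\widehat D^{-1/2}, e^F]\Lambda^+ \widehat D^{1/2} \varphi\|^2
\end{align*}
we shall see that (4.23) holds true as soon as we have shown that
\[
\|\widehat D^{1/2}[\widehat D^{-1/2}, e^F]\Lambda^+\| = O(a)\|e^F\| . \tag{4.25}
\]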
To check whether (4.25) is correct we first note that, on $\mathscr{D}$,
\begin{align}
[\widehat D, e^F] &= \Lambda^+[\widetilde D, e^F] + [\Lambda^+, e^F]\widetilde D \notag \\
&= -\Lambda^+ i\alpha\cdot\nabla F\, e^F + \Lambda^+([V_E, e^F]e^{-F})e^F + [\Lambda^+, e^F]\widetilde D . \tag{4.26}
\end{align}
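We apply the norm-convergent integral representation
\[
T^{-1/2} = \frac{1}{\pi} \int_0^\infty \frac{1}{T + t}\, \frac{dt}{\sqrt{t}}, \tag{4.27}
\]
which holds for any strictly positive operator, $T$, on some Hilbert space. For $\phi, \psi \in \mathscr{D}$, it implies
\[
\langle \widehat D^{1/2}\phi \,|\, [\widehat D^{-1/2}, e^F]\Lambda^+ \psi \rangle
= \frac{1}{\pi} \int_0^\infty \Big\langle \widehat D^{1/2}\phi \,\Big|\, \frac{-1}{\widehat D + t}\, [\widehat D, e^F]\, \frac{\Lambda^+}{\widehat D + t}\, \psi \Big\rangle \frac{dt}{\sqrt{t}} . \tag{4.28}
\]
We estimate the contribution of the first term on the right side of (4.26) to (4.28) as
\[
\Big| \Big\langle \widehat D^{1/2}\phi \,\Big|\, \frac{\Lambda^+}{\widehat D + t}\, i\alpha\cdot\nabla F\, e^F\, \frac{\Lambda^+}{\widehat D + t}\, \psi \Big\rangle \Big|
\le \frac{O(a)\|e^F\|}{(1 + t)^{3/2}}\, \|\phi\|\, \|\psi\|, \qquad t \ge 0. \tag{4.29}
\]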
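By the spectral theorem, the operator identity (4.27) reduces to the scalar identity $T^{-1/2} = \pi^{-1}\int_0^\infty (T+t)^{-1} t^{-1/2}\,dt$ for numbers $T > 0$. The sketch below checks this scalar version numerically; the helper name `inv_sqrt_via_resolvent` and the substitution $t = \tan^2\theta$ are illustrative choices made here, not taken from the paper.

```python
import math

def inv_sqrt_via_resolvent(T, n=2000):
    """Approximate (1/pi) * int_0^inf (T + t)^{-1} t^{-1/2} dt for a scalar T > 0.

    With t = tan(theta)^2 one has dt / sqrt(t) = 2 sec(theta)^2 d(theta), so the
    integral becomes (2/pi) * int_0^{pi/2} d(theta) / (T cos^2 + sin^2), which has
    a smooth integrand and is evaluated here by the midpoint rule."""
    h = (math.pi / 2) / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        total += 1.0 / (T * math.cos(theta) ** 2 + math.sin(theta) ** 2)
    return (2.0 / math.pi) * total * h

for T in (0.25, 1.0, 4.0, 100.0):
    # second and third columns should agree up to quadrature error
    print(T, inv_sqrt_via_resolvent(T), T ** -0.5)
```

The same integrability at $t \to \infty$ is what makes the bounds (4.29) and (4.30) sufficient: $(1+t)^{-3/2} t^{-1/2}$ and $(1+t)^{-1} t^{-1/2}$ are both integrable on $(0,\infty)$.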
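In view of (2.7) the second term in (4.26) can be dealt with similarly. To account for the third term in (4.26) we apply Proposition 3.1 and obtain, for $t \ge 0$,
\[
\Big| \Big\langle \frac{1}{\widehat D + t}\, \phi \,\Big|\, \{\widehat D^{1/2}[\Lambda^+, e^F]e^{-F}\}\, e^F\, \frac{\widetilde D \Lambda^+}{\widehat D + t}\, \psi \Big\rangle \Big|
\le \frac{O(a)\|e^F\|}{1 + t}\, \|\phi\|\, \|\psi\| . \tag{4.30}
\]
Equations (4.28)–(4.30) show that (4.25) holds true, which completes the proof of Lemma 4.2.

5. The Lower Bound on $\inf \sigma_{\mathrm{ess}}(H^A_N)$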
In order to prove the "hard part" of the HVZ theorem, Theorem 2.2(ii), we employ an idea we learned from [19]: One may use a localization estimate for spectral projections to prove their compactness. Of course one might try to derive a lower bound on the ionization threshold by a more direct argument, for instance, by following the general strategy presented in [33]. Since we have already derived an exponential localization estimate we find it, however, more convenient here to adapt the observation from [19] to our non-local model. Another advantage of the proof below is that we can work with the square root of $H_N$. This is important since only form bounds on perturbations of $H_N$ are available.

Theorem 5.1. Let the assumptions of Theorem 2.2(ii) be fulfilled and let $I \subset \mathbb{R}$ be an interval with $\sup I < 1 + E^A_{N-1}$. Then the spectral projection $E_I(H^A_N)$ is a compact operator on $\mathcal{A}_N \mathscr{H}^+_N$. In particular,
\[
\sigma_{\mathrm{ess}}(H^A_N) \subset [1 + E^A_{N-1}, \infty).
\]

Proof. Let $g \in C(\mathbb{R},(0,\infty))$ satisfy $g(r) \to \infty$, $r \to \infty$, and $g(|X|)E_I(H^A_N) \in \mathscr{L}(\mathcal{A}_N \mathscr{H}_N)$ and set $h := 1/g$. We let $\varrho_R$ denote a smoothed characteristic function of the closed ball in $\mathbb{R}^3$ with radius $R > 0$ and center 0 and set $\chi_R(X) := \varrho_R(x_1) \cdots \varrho_R(x_N)$, for $X = (x_1,\dots,x_N) \in (\mathbb{R}^3)^N$. First, we argue that it suffices to show that $E_I(H^A_N)\chi_R(-\Delta + 1)^{1/8}$ is a (densely defined) bounded operator from $\mathscr{H}_N$ to $\mathcal{A}_N \mathscr{H}^+_N$. In fact, let us assume that this is the case. Since $(-\Delta + 1)^{-1/8} h(|X|)$ is compact and $g(|X|)E_I(H^A_N)$ is bounded, it then follows that
\[
E_I(H^A_N)[\chi_R h(|X|)]g(|X|)E_I(H^A_N) = E_I(H^A_N)\chi_R(-\Delta + 1)^{1/8}\,[(-\Delta + 1)^{-1/8} h(|X|)]\,g(|X|)E_I(H^A_N)
\]
is compact. Since $\chi_R h(|X|)$ converges to $h(|X|)$ in the operator norm, as $R$ tends to infinity, it further follows that $E_I(H^A_N) = E_I(H^A_N)h(|X|)g(|X|)E_I(H^A_N)$ is compact, too.

To verify that $E_I(H^A_N)\chi_R(-\Delta + 1)^{1/8}$ is bounded we set $S := 1 + \sum_{j=1}^N |D^{(j)}_{A,V}|$ and write, for some sufficiently large $c > 0$,
\[
E_I(H^A_N)\chi_R(-\Delta + 1)^{1/8} = E_I(H^A_N)(H_N + c)^{1/2}\{(H_N + c)^{-1/2}\Lambda^{+,N}_{A,V} S^{1/2}\}\{S^{-1/2}\chi_R(-\Delta + 1)^{1/8}\}. \tag{5.1}
\]
Here the left curly bracket in (5.1) is a bounded operator from $\mathscr{H}_N$ to $\mathscr{H}^+_N$ since $\Lambda^{+,N}_{A,V} S \Lambda^{+,N}_{A,V} \le H_N + c$, provided $c$ is large enough, due to the positivity of the interaction potentials and the boundedness of $V_H$ and $V_E$. To see that the right curly bracket in (5.1) is a bounded operator in $\mathscr{H}_N$ we first notice that it is a restriction of $S^{-1/2}T^*$, where $T := (-\Delta + 1)^{1/8}\chi_R$ is closed. It thus remains to show that $T S^{-1/2} = T^{**} S^{-1/2} = (S^{-1/2}T^*)^*$ belongs to $\mathscr{L}(\mathscr{H}_N)$. To this end we recall that $(-\Delta^{(i)} + 1)^{1/4}\varrho^{(i)}_R(|D^{(i)}_{A,V}| + 1)^{-1}$ is bounded on $L^2(\mathbb{R}^3_{x_i},\mathbb{C}^4)$ since $\mathcal{D}(D_{A,V}) \subset H^{1/2}_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{C}^4)$. It follows that
\[
(-\Delta^{(i)} + 1)^{1/4}\chi_R S^{-1} = (-\Delta^{(i)} + 1)^{1/4}\chi_R(|D^{(i)}_{A,V}| + 1)^{-1}(|D^{(i)}_{A,V}| + 1)S^{-1}
\]
is bounded, for $i = 1,\dots,N$, and, hence, $\chi_R(-\Delta + 1)^{1/4}\chi_R S^{-1} \in \mathscr{L}(\mathscr{H}_N)$. Since $\chi_R(-\Delta + 1)^{1/4}\chi_R$ is a restriction of $T^*T$ we see that $T^*T S^{-1} \in \mathscr{L}(\mathscr{H}_N)$, which implies $|T| S^{-1/2} \in \mathscr{L}(\mathscr{H}_N)$ and, hence, $T S^{-1/2} \in \mathscr{L}(\mathscr{H}_N)$.

6. Weyl Sequences

In this section, we prove the "easy part" of our HVZ theorem, namely Part (i) of Theorem 2.2 asserting that
\[
\sigma_{\mathrm{ess}}(H^A_N) \supset [E^A_{N-1} + 1, \infty).
\]
This is done by constructing suitable Weyl sequences for $H^A_N$. The difficulties we encounter are similar to those in [40] where the Brown–Ravenhall model (free picture without magnetic field) is considered. We have, however, to replace those arguments in [40] that require explicit momentum or position space representations
of the free projection $\Lambda^+_0$ by more abstract ones; see, e.g., Lemma 6.2. Another new technical complication is caused by the related facts that $\Lambda^+_{A,V}$ maps the dense subspace $\mathscr{D}$ merely into $H^{1/2}$ when $V$ has a strong Coulombic singularity and that, compared to the free picture, it is more difficult to control the singularities of the interaction potentials. For this reason we shall eventually study the square root of $H_N$ rather than $H_N$ itself.

We fix some spectral parameter $\lambda \ge 1$ throughout the whole section and $\{\psi_n\}_{n\in\mathbb{N}}$ will always denote a corresponding Weyl sequence as in Hypothesis 2.4(i). In this and the following section we shall repeatedly employ the following sequence of cut-off functions: We pick some $\chi \in C^\infty(\mathbb{R},[0,1])$ such that $\chi \equiv 0$ on $(-\infty, 1 - \varepsilon/4]$ and $\chi \equiv 1$ on $[1,\infty)$. Here $\varepsilon \in (0,1)$ is a fixed parameter whose value becomes important only in Sec. 7. We set $\chi_n := \chi(|x|/R_n)$, for $x \in \mathbb{R}^3$ and $n \in \mathbb{N}$, where $R_n$ is given by Hypothesis 2.4(i). Then $\chi_n\psi_n = \psi_n$ and $\|\nabla\chi_n\|_\infty = R_n^{-1}\|\nabla\chi\|_\infty \to 0$, as $n \to \infty$. To begin with we draw two simple conclusions from our hypotheses:

Lemma 6.1. Assume that $A \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and $V$ fulfill Hypotheses 2.4(i) and 2.2, respectively. Then
\[
\lim_{n\to\infty} \|(D_{A,V} - \lambda)\psi_n\| = 0, \qquad \lim_{n\to\infty} \|(D_{A,V_C} - \lambda)\Lambda^+_{A,V}\psi_n\| = 0. \tag{6.1}
\]
Proof. The first identity is clear from the hypotheses. To treat the second we employ the cut-off functions defined in the paragraph preceding the statement of this lemma and abbreviate $V_{HE} := V_H + V_E$. By means of Proposition 3.1 and $V_{HE}\chi_n \to 0$ we then obtain
\[
\|V_{HE}\Lambda^+_{A,V}\psi_n\| \le \|V_{HE}\chi_n \Lambda^+_{A,V}\psi_n\| + \|V_{HE}[\Lambda^+_{A,V}, \chi_n]\psi_n\| \to 0,
\]
as $n$ tends to infinity. Therefore, the second identity follows from the first.

Lemma 6.2. Assume that $A \in L^\infty_{\mathrm{loc}}(\mathbb{R}^3,\mathbb{R}^3)$ and $V$ fulfill Hypotheses 2.4(i) and 2.2, respectively. Let $\varepsilon > 0$ and set $I_\varepsilon := (\lambda - \varepsilon, \lambda + \varepsilon)$. Then we have, as $n$ tends to infinity,
\[
\|E_{I_\varepsilon}(D_{A,V})\psi_n\| \to 1, \qquad \text{in particular,} \qquad \|\Lambda^+_{A,V}\psi_n\| \to 1. \tag{6.2}
\]
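Proof. Clearly, $\|E_{I_\varepsilon}(D_{A,V})\psi_n\| \le 1$ since $\psi_n$ is normalized. Suppose that there is some $\delta > 0$ such that $\liminf \|E_{I_\varepsilon}(D_{A,V})\psi_n\|^2 \le 1 - \delta$. Then we have $\lim_{\ell\to\infty} \|E_{I_\varepsilon}(D_{A,V})\psi_{n_\ell}\|^2 \le 1 - \delta$, for an appropriate subsequence, and
\begin{align*}
\lim_{\ell\to\infty} \|(D_{A,V} - \lambda)\psi_{n_\ell}\|^2
&\ge \varepsilon^2 \liminf_{\ell\to\infty} \int_{\mathbb{R}} (1 - 1_{I_\varepsilon}(s))\, d\|E_s(D_{A,V})\psi_{n_\ell}\|^2 \\
&= \varepsilon^2 - \varepsilon^2 \lim_{\ell\to\infty} \|E_{I_\varepsilon}(D_{A,V})\psi_{n_\ell}\|^2 \ge \varepsilon^2\delta > 0.
\end{align*}
This is a contradiction to (6.1).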
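The spectral inequality behind this proof, $\|(D - \lambda)\psi\|^2 \ge \varepsilon^2(\|\psi\|^2 - \|E_{I_\varepsilon}(D)\psi\|^2)$, can be illustrated in finite dimensions, where a self-adjoint operator is a diagonal matrix in its eigenbasis. The following is a minimal sketch under that simplifying assumption; `spectral_bound` and its arguments are hypothetical names chosen for illustration.

```python
import random

def spectral_bound(eigs, coeffs, lam, eps):
    """For a diagonal self-adjoint matrix with eigenvalues `eigs` and a vector with
    coordinates `coeffs` in the eigenbasis, return (lhs, rhs) where
    lhs = ||(D - lam) psi||^2 and rhs = eps^2 * (||psi||^2 - ||E_I psi||^2),
    with I = (lam - eps, lam + eps)."""
    lhs = sum(abs(c) ** 2 * (e - lam) ** 2 for e, c in zip(eigs, coeffs))
    inside = sum(abs(c) ** 2 for e, c in zip(eigs, coeffs) if abs(e - lam) < eps)
    total = sum(abs(c) ** 2 for c in coeffs)
    return lhs, eps ** 2 * (total - inside)

random.seed(1)
all_ok = True
for _ in range(500):
    n = random.randint(1, 8)
    eigs = [random.uniform(-5, 5) for _ in range(n)]
    coeffs = [random.uniform(-1, 1) for _ in range(n)]
    lhs, rhs = spectral_bound(eigs, coeffs, lam=1.0, eps=0.7)
    all_ok = all_ok and lhs >= rhs - 1e-12  # tolerance for rounding only
print(all_ok)
```

Only spectral components outside $I_\varepsilon$ contribute to the right-hand side, and each of them carries a factor $(s - \lambda)^2 \ge \varepsilon^2$ on the left, which is the whole argument.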
In the following we show that $E^A_{N-1} + \lambda$ belongs to $\sigma_{\mathrm{ess}}(H_N)$ by means of a suitable Weyl sequence. Instead of applying Weyl's criterion directly to $H_N$ we shall, however, use a slightly strengthened version of it in Lemma 6.3 (see, e.g., [13]) which allows us to work with quadratic forms. As already mentioned above this is important since, for instance, it seems that one cannot expect Proposition 3.2 to hold with $W^{1/2}$ replaced by $W$. (At least not for large nuclear charges $e^2 Z \ge \sqrt{3}/2$.) To construct the Weyl sequence we pick, for every $n \in \mathbb{N}$, some
\[
\Phi_n = \mathcal{A}_{N-1}\Phi_n \in \Lambda^{+,N-1}_{A,V}\mathscr{D}_{N-1} \quad \text{such that} \quad
\langle \Phi_n \,|\, H^A_{N-1}\Phi_n \rangle < E^A_{N-1} + \frac{1}{n}, \quad \|\Phi_n\| = 1. \tag{6.3}
\]
This is possible since $H_{N-1}$ is defined as a Friedrichs extension starting from $\Lambda^{+,N-1}_{A,V}\mathscr{D}_{N-1}$. We further set
\[
\Upsilon_n(x) := \int_{\mathbb{R}^{3(N-2)}} |\Phi_n(x, X')|^2\, dX' . \tag{6.4}
\]
Next, we pick 0 < a < min{m, 0 }, $r \in (0, 1 - \varepsilon/4)$, and $r' \in (0,1)$ such that
\[
(1 - r)a > (1 + r)\tau, \qquad s := r + r' - 1 > 0. \tag{6.5}
\]
Here $\tau$ appears in (2.18). We further pick some cut-off function, $\vartheta \in C^\infty(\mathbb{R},[0,1])$, such that $\vartheta \equiv 0$ on $(-\infty, s/2]$ and $\vartheta \equiv 1$ on $[s,\infty)$. By Lemma 3.6 we know that $|D^{(1)}_0|^{1/2}\Phi_n \in \mathscr{H}_{N-1}$, where the superscript (1) again indicates that the operator acts on the first variable. Therefore, we find a subsequence, $\{R_{k_n}\}_{n\in\mathbb{N}}$, of $\{R_k\}_{k\in\mathbb{N}}$ such that, for every $n \in \mathbb{N}$,
\[
\int_{\mathbb{R}^{3(N-1)}} \big|\, |D^{(1)}_0|^{1/2}\vartheta(x_1/R_{k_n})\Phi_n(X) \big|^2\, dX < \frac{1}{n}. \tag{6.6}
\]
As a candidate for a Weyl sequence we then try $\{\mathcal{A}_N\Psi_n\}_{n\in\mathbb{N}}$, where
\[
\Psi_n := \Phi_n \otimes \Lambda^+_{A,V}\psi_{k_n} \in \Lambda^{+,N}_{A,V}\mathscr{D}_N, \qquad n \in \mathbb{N}. \tag{6.7}
\]
To simplify the notation we again write $n$ instead of $k_n$ in the following. Finally, we pick some $c > 1$ and set
\[
f(t) := (t + c)^{-1/2}(t - E^A_{N-1} - \lambda), \qquad t > -c.
\]
lim inf AN Ψn > 0, n→∞
lim f (HN )AN Ψn = 0.
n→∞
(6.8)
In particular, ENA−1 + λ ∈ σess (HN ). Proof. First, suppose that (6.8) holds true. If c > 1 is chosen sufficiently large, then f is strictly monotonically increasing on σ(HN ). If I is some small open
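interval around $E^A_{N-1} + \lambda$ we thus get $E_I(H_N) = E_{f(I)}(f(H_N))$. By (6.8) and the Weyl criterion applied to $f(H_N)$ it follows that $\infty = \dim \mathrm{Ran}(E_{f(I)}(f(H_N))) = \dim \mathrm{Ran}(E_I(H_N))$. To verify (6.8) we first notice that $\Psi_n \rightharpoonup 0$, as $n \to \infty$, because of (2.17). Exactly as in [40, §4] we can also check that $\liminf \|\mathcal{A}_N\Psi_n\| > 0$. So it suffices to show that $f(H_N)\Psi_n \to 0$, as $H_N$ commutes with $\mathcal{A}_N$. Since $\psi_n$ and $\Phi_n$ are normalized and $\Psi_n = \Phi_n \otimes \Lambda^+_{A,V}\psi_n \in \Lambda^{+,N}_{A,V}\mathscr{D}_N$ we obtain
\begin{align}
\|f(H_N)\Psi_n\|
&\le \big\| (H_N + c)^{-1/2}(H_{N-1} - E^A_{N-1})^{1/2} \otimes 1_{\mathscr{H}} \big\| \, \big\| (H_{N-1} - E^A_{N-1})^{1/2}\Phi_n \big\| \tag{6.9} \\
&\quad + \big\| (D_{A,V_C} - \lambda)\Lambda^+_{A,V}\psi_n \big\| \tag{6.10} \\
&\quad + \sum_{i=1}^{N-1} \big\| (H_N + c)^{-1/2}\Lambda^{+,N}_{A,V} W_{iN}^{1/2} \big\| \, \big\| W_{iN}^{1/2}(\Phi_n \otimes \Lambda^+_{A,V}\psi_n) \big\| . \tag{6.11}
\end{align}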
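We first show that the operator norm in (6.11) is actually finite. In fact,
\[
\big\| (H_N + c)^{-1/2}\Lambda^{+,N}_{A,V} W_{iN}^{1/2} \big\| = \big\| W_{iN}^{1/2}\Lambda^{+,N}_{A,V}(H_N + c)^{-1/2} \big\| \le 1,
\]
since $W \ge 0$, $\Lambda^+_{A,V} D_{A,V_C} \Lambda^+_{A,V} \ge -C\Lambda^+_{A,V}$, $H_{N-1} \ge E^A_{N-1}\Lambda^{+,N-1}_{A,V}$, and, hence,
\[
\|W_{iN}^{1/2}\Lambda^{+,N}_{A,V}\phi\|^2 = \langle \phi \,|\, \Lambda^{+,N}_{A,V} W_{iN} \Lambda^{+,N}_{A,V}\phi \rangle \le \langle \phi \,|\, (H_N + c)\phi \rangle,
\]
for $\phi \in \Lambda^{+,N}_{A,V}\mathscr{D}_N$. Using similar estimates and (6.3) it is straightforward to check that the term in (6.9) converges to zero provided $c > 1$ is sufficiently large. The norm in (6.10) tends to zero by Lemma 6.1. The claim now follows from Lemma 6.4 below which implies that the remaining norm in (6.11) tends to zero, too.

The first inequality of the following lemma is used in the proof of Lemma 6.3 and the second one in Sec. 7.

Lemma 6.4. There are $\kappa, C \in (0,\infty)$ such that, for all $n \in \mathbb{N}$,
\[
\int_{\mathbb{R}^6} W(x,y)\Upsilon_n(y)|\Lambda^+_{A,V}\psi_n(x)|^2\, d(x,y) \le \sup_{|x-y| \ge (1-r')R_n} W(x,y) + C e^{-\kappa R_n} + \frac{1}{n}.
\]
If $B$ is bounded, then there is some $C' \in (0,\infty)$ such that, for all $n \in \mathbb{N}$,
\begin{align*}
\int_{\mathbb{R}^6} W(x,y)\Upsilon_n(y)|\Lambda^+_{A,V}\psi_n(x)|^2\, d(x,y)
&\le \sup_{|x-y| \ge (1-\varepsilon)R_n} W(x,y)\, \|\Lambda^+_{A,V}\psi_n\|^2 + C'(1 + \|B\|_\infty)\, e^{-a\varepsilon R_n/2} \\
&\quad + C' \int_{\{|y| \ge \varepsilon R_n/2\}} \Upsilon_n(y)\, dy\, (1 + \|B\|_\infty)\, \|(D_{A,V} + i)\psi_n\|^2 .
\end{align*}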
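Proof. For $n \in \mathbb{N}$, we pick a weight function, $F_n \in C^\infty(\mathbb{R}^3,[0,\infty))$, with $F_n \equiv 0$ on $\mathbb{R}^3 \setminus B_{R_n}(0)$, $F_n \ge (1 - r)aR_n - a'$ on $B_{rR_n}(0)$ and $\|\nabla F_n\|_\infty \le a$. Here $a$ and $r$ are the parameters from (6.5) and $a' > 0$ is some fixed, $n$-independent constant. Since $\psi_n = \chi_n\psi_n$ and $1_{B_{rR_n}(0)}\chi_n = 0$ we obtain
\begin{align}
&\int_{\{|x-y| < (1-r')R_n\}} 1_{B_{rR_n}(0)}(x)\, W(x,y)\, \Upsilon_n(y)\, |\Lambda^+_{A,V}\psi_n(x)|^2\, d(x,y) \notag \\
&\quad \le \|1_{B_{rR_n}(0)} e^{-F_n}\|_\infty \sup_{|y| \le (r+1-r')R_n} \big\| W_y^{1/2} e^{F_n}[\Lambda^+_{A,V}, \chi_n e^{-F_n}]\psi_n \big\|^2\, \|\Upsilon_n\|_1 \notag \\
&\quad \le C e^{-(1-r)aR_n} \sup_{|y| \le (1+r)R_n} (1 + |B(y)|) \le C' e^{-([1-r]a - [1+r]\tau)R_n} . \tag{6.12}
\end{align}
In the last two steps we make use of Proposition 3.2 and (2.18). Next, if $|x - y| \le (1 - r')R_n$ and $1_{B_{rR_n}(0)}(x) = 0$, then $|y| \ge (r + r' - 1)R_n = sR_n$, and by the choice of $\vartheta$ (see the paragraph below (6.5)) it follows that
\begin{align}
&\int_{\{|x-y| \le (1-r')R_n\}} (1 - 1_{B_{rR_n}(0)}(x))\, W(x,y)\, \Upsilon_n(y)\, |\Lambda^+_{A,V}\psi_n(x)|^2\, d(x,y) \notag \\
&\quad \le \sup_{|x| \ge rR_n} \int_{\mathbb{R}^3} W(x,y)\, \vartheta(y/R_n)\, \Upsilon_n(y)\, dy\, \|\Lambda^+_{A,V}\psi_n\|^2 . \tag{6.13}
\end{align}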
On account of (6.5), Kato's inequality, and (6.6) the first asserted estimate follows from (6.12) and (6.13). The second one is derived similarly by means of Lemma 3.7 and the replacements $r \to 1 - \varepsilon/2$, $r' \to \varepsilon$. Note that $1_{B_{(1-\varepsilon/2)R_n}(0)}\chi_n = 0$, which is used to derive the analogue of (6.12).

7. Existence of Eigenvalues

In this section, we prove Theorem 2.3 which asserts that $H^A_N$ possesses infinitely many eigenvalues below $\inf \sigma_{\mathrm{ess}}(H^A_N) = 1 + E^A_{N-1}$. We proceed along the lines of [40, §6] with a few changes. In particular, as in the previous section we replace the arguments of [40] that employ explicit position or momentum space representations of $\Lambda^+_0$ by more abstract ones. A crucial observation is the new argument used to prove Lemma 7.1. Throughout this section we always assume without further notice that the assumptions of Theorem 2.3, i.e. Hypothesis 2.5, are fulfilled.

Proof of Theorem 2.3. We proceed by induction on $N$ and start with the induction step. So, we pick $N \in \mathbb{N}$, $N \ge 2$, and assume that $H^A_{N-1}$ possesses infinitely many eigenvalues below $E^A_{N-2} + 1$. In particular, we can pick a normalized ground state of $H^A_{N-1}$, which we denote by $\Phi$. Moreover, we denote the transposition operator which flips the $i$th and $N$th electron variable by $\pi_{iN}$, $1 \le i < N$, and set $\pi_{NN} := 1$. The vectors $\psi_1, \psi_2, \dots$ are the elements of the sequence appearing in Hypothesis 2.5.
Now, let $d \in \mathbb{N}$. By Lemma 7.7 below we know that, for all sufficiently large $m_0 \in \mathbb{N}$, the set $\{\mathcal{A}_N(\Phi \otimes \Lambda^+_{A,V}\psi_n)\}_{n=m_0}^{m_0+d}$, where
\[
\mathcal{A}_N(\Phi \otimes \Lambda^+_{A,V}\psi_n) = \frac{1}{N}\sum_{i=1}^{N} (-1)^{N-i}\pi_{iN}(\Phi \otimes \Lambda^+_{A,V}\psi_n), \qquad n \in \mathbb{N},
\]
is linearly independent. Our goal then is to show that the expectation of
\[
\widehat H_N := H_N - E^A_{N-1} - 1
\]
with respect to any linear combination of the vectors $\{\mathcal{A}_N(\Phi \otimes \Lambda^+_{A,V}\psi_n)\}_{n=m_0}^{m_0+d}$ is strictly negative provided $m_0 \in \mathbb{N}$ is large enough. Since $d$ is arbitrary the assertion of Theorem 2.3 then follows from the minimax principle. For $c_{m_0},\dots,c_{m_0+d} \in \mathbb{C}$, and
\[
\Psi := \sum_{n=m_0}^{m_0+d} c_n \sum_{i=1}^{N} \frac{(-1)^{N-i}}{N^{1/2}}\, \pi_{iN}(\Phi \otimes \Lambda^+_{A,V}\psi_n), \tag{7.1}
\]
we obtain as in [40] by means of the anti-symmetry of $\Phi$,
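\begin{align}
\langle \Psi \,|\, \widehat H_N\Psi \rangle
&\le \sum_{n=m_0}^{m_0+d} |c_n|^2 \langle \Phi \otimes \Lambda^+_{A,V}\psi_n \,|\, \widehat H_N(\Phi \otimes \Lambda^+_{A,V}\psi_n) \rangle \tag{7.2} \\
&\quad + (N-1)\sum_{n,m=m_0}^{m_0+d} |c_n||c_m|\, |\langle \pi_{1N}(\Phi \otimes \Lambda^+_{A,V}\psi_n) \,|\, \widehat H_N(\Phi \otimes \Lambda^+_{A,V}\psi_m) \rangle| \tag{7.3} \\
&\quad + \sum_{\substack{n,m=m_0 \\ m \ne n}}^{m_0+d} |c_n||c_m|\, |\langle \Phi \otimes \Lambda^+_{A,V}\psi_n \,|\, \widehat H_N(\Phi \otimes \Lambda^+_{A,V}\psi_m) \rangle| . \tag{7.4}
\end{align}
Combining the eigenvalue equation $(H_{N-1} - E^A_{N-1})\Phi = 0$ with Lemmata 7.1–7.4, Hypothesis 2.5, and (6.2), we find some $\delta_0 > 0$ such that the scalar product in (7.2) is bounded from above by $-\delta_0 R_{m_0}^{-1} + o(R_{m_0}^{-1})$, as $m_0$ gets large. Here the numbers $R_1, R_2, \dots$ are those appearing in Hypothesis 2.5. Lemmata 7.5 and 7.6 imply that the scalar products in (7.3) and (7.4) are of order $O(R_{m_0}^{-K})$, as $m_0 \to \infty$, for every $K \in \mathbb{N}$. By the Cauchy–Schwarz inequality we find some $\delta_0' > 0$ such that
\[
\langle \Psi \,|\, \widehat H_N\Psi \rangle \le -\delta_0' \sum_{n=m_0}^{m_0+d} |c_n|^2,
\]
for all $c_{m_0},\dots,c_{m_0+d} \in \mathbb{C}$, if $m_0$ is sufficiently large (depending on $d$). This concludes the induction step. Finally, the case $N = 1$ is treated in the same way as the induction step $N \to N+1$ (setting $E^A_0 := 0$ and ignoring $\Phi$, $W$, and the term (7.3)).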
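The Cauchy–Schwarz step at the end of the induction argument can be made explicit as follows. This is only a sketch of the bookkeeping: it assumes that the bounds quoted for (7.2)–(7.4) hold uniformly for the $d+1$ indices $n, m \in \{m_0,\dots,m_0+d\}$, and the choice $K = 2$ is made here purely for concreteness.

```latex
\begin{align*}
\sum_{n,m=m_0}^{m_0+d} |c_n|\,|c_m|
  &= \Big(\sum_{n=m_0}^{m_0+d} |c_n|\Big)^2
   \le (d+1) \sum_{n=m_0}^{m_0+d} |c_n|^2 , \\
\langle \Psi \,|\, \widehat H_N \Psi\rangle
  &\le \Big( -\delta_0 R_{m_0}^{-1} + o(R_{m_0}^{-1}) + N\,(d+1)\, O(R_{m_0}^{-2}) \Big)
       \sum_{n=m_0}^{m_0+d} |c_n|^2 .
\end{align*}
% For fixed d the bracket is negative once m_0 (and hence R_{m_0}) is large,
% which yields a strictly negative upper bound proportional to sum |c_n|^2.
```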
To show that the contribution coming from the (one-particle) kinetic energy of $\psi_n$ decreases faster than its negative potential energy we make use of the requirement that the $\psi_n$ have vanishing lower spinor components, $\psi_n = (\psi_{n,1}, \psi_{n,2}, 0, 0)^\top$, $n \in \mathbb{N}$. This has also been used in [40] together with explicit formulas for $\Lambda^+_0$. We replace these arguments by the following observation:
Lemma 7.1. There is some $C \in (0, \infty)$ such that
$$0 \le \langle \Lambda^+_A \psi_n \,|\, (D_A - 1)\Lambda^+_A \psi_n \rangle \le C R_n^{-2}, \qquad n \in \mathbb{N}.$$
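The argument for this lemma uses only elementary algebra of the Dirac matrices. A minimal sanity check of that algebra is sketched here, assuming the standard (Dirac) representation of $\alpha_1, \alpha_2, \alpha_3, \beta$ and writing `P` for the projection onto the first two spinor components; the helper names are hypothetical and the check says nothing about the analytic part of the proof.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def sub(a, b):
    """Entrywise difference of two matrices of equal shape."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]


def is_zero(a, tol=1e-12):
    return all(abs(x) < tol for row in a for x in row)


ZERO2 = [[0, 0], [0, 0]]
ID2 = [[1, 0], [0, 1]]
NEG_ID2 = [[-1, 0], [0, -1]]
SIGMA = [                      # Pauli matrices
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# Standard representation (an assumption of this sketch).
ALPHA = [block(ZERO2, s, s, ZERO2) for s in SIGMA]
BETA = block(ID2, ZERO2, ZERO2, NEG_ID2)
ID4 = block(ID2, ZERO2, ZERO2, ID2)
P = block(ID2, ZERO2, ZERO2, ZERO2)  # projection onto upper components


def check_identities():
    """Return a dict mapping each identity to True/False."""
    results = {}
    results["(beta - 1) p = 0"] = is_zero(matmul(sub(BETA, ID4), P))
    results["p alpha_i p = 0"] = all(
        is_zero(matmul(matmul(P, a), P)) for a in ALPHA)
    results["[p, alpha_i alpha_j] = 0"] = all(
        is_zero(sub(matmul(P, matmul(a, b)), matmul(matmul(a, b), P)))
        for a in ALPHA for b in ALPHA)
    return results


if __name__ == "__main__":
    for name, ok in check_identities().items():
        print(name, ok)
```

The first identity gives $(\beta - 1)\psi = 0$ for spinors with vanishing lower components, the second gives $p\,\alpha_i\psi = 0$ for such spinors, and the third is the matrix part of the statement that $p$ commutes with $D_A^2$, since (assuming $D_A = \alpha\cdot(-i\nabla + A) + \beta$) the square $D_A^2$ contains the Dirac matrices only through the products $\alpha_i\alpha_j$.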
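Proof. Since the last two components of $\psi_n$ are zero we have $(\beta - 1)\psi_n = 0$. If we denote the projection onto the first two spinor components, $L^2(\mathbb{R}^3, \mathbb{C}^4) \ni (\varphi_1, \varphi_2, \varphi_3, \varphi_4)^\top \mapsto (\varphi_1, \varphi_2, 0, 0)^\top$, by $p$ then we also have $p\,\alpha_i \psi_n = 0 = p\,\alpha_i \partial_i \psi_n$, $i = 1, 2, 3$, and, therefore, $p(D_A - 1)\psi_n = 0$. Moreover, $[p, D_A^2] = 0$ and, hence, $[p, |D_A|^{-1}] = 0$. This implies
$$|\langle \Lambda^+_A \psi_n \,|\, (D_A - 1)\Lambda^+_A \psi_n \rangle| = \frac{1}{2} \langle \psi_n \,|\, p(D_A - 1)\psi_n \rangle + \frac{1}{2} \langle \mathrm{sgn}(D_A)\psi_n \,|\, (D_A - 1)\psi_n \rangle$$
$$= \frac{1}{2} \langle (D_A - 1)\psi_n \,|\, |D_A|^{-1} (D_A - 1)\psi_n \rangle + \frac{1}{2} \langle \psi_n \,|\, |D_A|^{-1} p(D_A - 1)\psi_n \rangle$$
$$\le \frac{1}{2} \|(D_A - 1)\psi_n\|^2 = O(1/R_n^2).$$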
In the last step we apply Hypothesis 2.5.
In the following we split $V_C$ into a singular and regular part, $V_C = V_C^s + V_C^r$, where $V_C^s$ is defined in (3.1). By Hypothesis 2.1 $V_C^r$ is bounded.
Lemma 7.2. As $n$ tends to infinity,
$$\langle \Lambda^+_{A,V}\psi_n \,|\, (D_{A,V_C} - 1)\Lambda^+_{A,V}\psi_n \rangle = \langle \Lambda^+_{A,V}\psi_n \,|\, V_C^r \Lambda^+_{A,V}\psi_n \rangle + \langle \Lambda^+_A \psi_n \,|\, (D_A - 1)\Lambda^+_A \psi_n \rangle + o(R_n^{-1}).$$
Proof. We let χn , n ∈ N, denote the cut-off functions introduced in the paragraph preceding Lemma 6.1. Then the assertion follows from Corollary 3.5 applied to DA,VCs − 1 with ζ = χn , since by Lemma 7.1 and Hypothesis 2.2, + −1 2 ( χn V + ∇χn ) Λ+ A ψn | (DA − 1)ΛA ψn ψn = o(Rn ). In the next lemma, we single out the leading order negative contribution to (7.2).
Lemma 7.3. There is some constant $C \in (0, \infty)$ such that, for all sufficiently large $n \in \mathbb{N}$,
$$\langle \Lambda^+_{A,V}\psi_n \,|\, V_C^r \Lambda^+_{A,V}\psi_n \rangle \le v(\delta, R_n)\, \|\Lambda^+_{A,V}\psi_n\|^2 + C e^{-R_n/C},$$
where v (δ, Rn ) is given by (2.20). Proof. We pick some even function f ∈ C ∞ (R, [0, ∞)) such that f ≡ 1 on [δ, ∞), f ≡ 0 on [0, δ/2], and |f | ≤ 4/δ. (Recall (2.21).) For some a ∈ (0, δ min{0 , m}/4), we define exponential weights, Fn (x) := aRn f (|x|/Rn ), n ∈ N. Using the notation introduced in (2.19) and (2.20) we then obtain, for all sufficiently large n ∈ N, + r + r + Λ+ A,V ψn | VC ΛA,V ψn ≤ ΛA,V ψn | 1Sδ (Rn ) VC ΛA,V ψn −Fn + VCr 1R3 \Sδ (Rn ) e−Fn eFn Λ+ . A,V e
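where, by (2.20) and Pythagoras' theorem,
$$\langle \Lambda^+_{A,V}\psi_n \,|\, 1_{S_\delta(R_n)} V_C^r \Lambda^+_{A,V}\psi_n \rangle \le v(\delta, R_n) \big( \|\Lambda^+_{A,V}\psi_n\|^2 - \|1_{\mathbb{R}^3 \setminus S_\delta(R_n)} \Lambda^+_{A,V}\psi_n\|^2 \big)$$
$$\le v(\delta, R_n)\, \|\Lambda^+_{A,V}\psi_n\|^2 + |v(\delta, R_n)|\, \|1_{\mathbb{R}^3 \setminus S_\delta(R_n)} e^{-F_n}\|^2\, \|e^{F_n} \Lambda^+_{A,V} e^{-F_n}\|^2.$$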
By (2.19), (2.21), and the choice of $F_n$ we know that $\|1_{\mathbb{R}^3 \setminus S_\delta(R_n)} e^{-F_n}\| \le C e^{-a\delta R_n/2}$, which implies the assertion of the lemma.
From now on, we always assume that the induction hypothesis made in the proof of Theorem 2.3 is fulfilled and that $\Phi$ is a normalized ground state eigenvector of $H^A_{N-1}$. So, $H^A_{N-1}\Phi = E^A_{N-1}\Phi$, $E^A_{N-1} < E^A_{N-2} + 1$. Given $\delta \in (0, \frac{1}{N})$ we pick some $\varepsilon \in (0, 1)$ as in Hypothesis 2.5(i). Then the following assertion is valid:
Lemma 7.4. As $n$ tends to infinity, we have, for $1 \le i \le N - 1$,
$$\langle \Phi \otimes \Lambda^+_{A,V}\psi_n \,|\, W_{iN}\, \Phi \otimes \Lambda^+_{A,V}\psi_n \rangle \le \sup_{|x-y| \ge (1-\varepsilon)R_n} W(x, y)\, \|\Lambda^+_{A,V}\psi_n\|^2 + O(R_n^{-\infty}).$$
Proof. This follows from Lemma 6.4 with $\Upsilon_n(y) = \int_{\mathbb{R}^{3(N-2)}} |\Phi(y, X)|^2\, dX$ and the exponential decay of $\Phi$, which is ensured by Theorem 2.1 and the induction hypothesis.
Now, we turn to the discussion of the terms in (7.3).
Lemma 7.5. As $n$ and $m$ tend to infinity,
$$\sigma_{nm} := |\langle \pi_{1N}(\Phi \otimes \Lambda^+_{A,V}\psi_n) \,|\, \widetilde{H}_N (\Phi \otimes \Lambda^+_{A,V}\psi_m) \rangle| = O(R_{\min\{n,m\}}^{-\infty}).$$
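Proof. We pick $\chi \in C_0^\infty(\mathbb{R}^3, [0, 1])$ such that $\chi \equiv 1$ on $B_{1/4}(0)$ and $\chi \equiv 0$ outside $B_{1/2}(0)$ and set $\chi_n := \chi(\cdot/R_n)$ and $\overline{\chi}_n := 1 - \chi_n$, for $n \in \mathbb{N}$. As in [40] we find
$$\sigma_{nm} \le |\langle \{\chi_n \Lambda^+_{A,V}\psi_n\} \otimes \Phi \,|\, \Phi \otimes \{(D_{A,V_C} - 1)\Lambda^+_{A,V}\psi_m\} \rangle|$$
$$\quad + |\langle \{\Lambda^+_{A,V}\psi_n\} \otimes \Phi \,|\, \{\overline{\chi}_n^{(1)} \Phi\} \otimes \{(D_{A,V_C} - 1)\Lambda^+_{A,V}\psi_m\} \rangle|$$
$$\quad + \sum_{i=1}^{N-1} |\langle \{\chi_n \Lambda^+_{A,V}\psi_n\} \otimes \Phi \,|\, W_{iN}\, \Phi \otimes (\Lambda^+_{A,V}\psi_m) \rangle|$$
$$\quad + \sum_{i=1}^{N-1} |\langle \{\Lambda^+_{A,V}\psi_n\} \otimes \Phi \,|\, W_{iN}\, (\overline{\chi}_n^{(1)} \Phi) \otimes (\Lambda^+_{A,V}\psi_m) \rangle|$$
$$=: Y_1 + Y_2 + \sum_{i=1}^{N-1} Y_{3i} + \sum_{i=1}^{N-1} Y_{4i}.$$
For the first two summands we find
$$Y_1 + Y_2 \le \|(D_{A,V_C} - 1)\Lambda^+_{A,V}\psi_m\| \big( \|\chi_n \Lambda^+_{A,V}\psi_n\| + \|\overline{\chi}_n^{(1)} \Phi\| \big),$$
where the right-hand side is of order $O(R_{\min\{n,m\}}^{-\infty})$ due to the exponential localization of $\Phi$ and the support properties of $\psi_n$ and $\chi_n$. Moreover, we observe that, for $i = 2, \dots, N - 1$,
$$Y_{3i} \le \|\chi_n \Lambda^+_{A,V}\psi_n\|\, \|W_{iN}^{1/2} \Phi\|\, \|\Phi\| \sup_{y \in \mathbb{R}^3} \|W_y^{1/2} \Lambda^+_{A,V}\psi_m\|,$$
$$Y_{4i} \le \|\Lambda^+_{A,V}\psi_n\|\, \|W_{iN}^{1/2} \Phi\|\, \|\overline{\chi}_n^{(1)} \Phi\| \sup_{y \in \mathbb{R}^3} \|W_y^{1/2} \Lambda^+_{A,V}\psi_m\|.$$
Here the norms $\|W_{iN}^{1/2} \Phi\|$, $i = 2, \dots, N - 1$, are actually finite since $\Phi \in \mathrm{Ker}(H_{N-1} - E^A_{N-1})$ implies
$$\|W_{iN}^{1/2} \Phi\|^2 = \langle \Phi \,|\, \Lambda^{+,N-1}_{A,V} W_{iN} \Lambda^{+,N-1}_{A,V} \Phi \rangle \le (E^A_{N-1} + C)\, \|\Phi\|^2,$$
for some constant $C \in (0, \infty)$. Finally,
$$Y_{31} \le \sup_{y \in \mathbb{R}^3} \|W_y^{1/2} \chi_n \Lambda^+_{A,V}\psi_n\|\, \|\Phi\|^2 \sup_{y \in \mathbb{R}^3} \|W_y^{1/2} \Lambda^+_{A,V}\psi_m\|,$$
$$Y_{41} \le \sup_{y \in \mathbb{R}^3} \|W_y^{1/2} \Lambda^+_{A,V}\psi_n\|\, \|\Phi\|\, \|\overline{\chi}_n^{(1)} \Phi\| \sup_{y \in \mathbb{R}^3} \|W_y^{1/2} \Lambda^+_{A,V}\psi_m\|.$$
We pick $f \in C^\infty(\mathbb{R}, [0, \infty))$ such that $f \equiv 0$ on $[1, \infty)$, $f \equiv 1$ on $(-\infty, 1/2]$, and $-3 \le f' \le 0$, and set $F_n(x) = a R_n f(|x|/R_n)$, $x \in \mathbb{R}^3$, $n \in \mathbb{N}$, where a ∈ (0, min{0 , m}/3). Since $\chi_n \psi_n = 0$, we find
$$\sup_{y \in \mathbb{R}^3} \|W_y^{1/2} \chi_n \Lambda^+_{A,V}\psi_n\| \le \sup_{|x| \le R_n/2} e^{-F_n(x)}\, \sup_{y \in \mathbb{R}^3} \|W_y^{1/2} [\Lambda^+_{A,V}, \chi_n e^{F_n}]\, e^{-F_n} \psi_n\|.$$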
This estimate, the exponential decay of $\Phi$, and Lemma 3.7 imply that the terms $Y_{3i}$ and $Y_{4i}$, $1 \le i \le N - 1$, are of order $O(R_{\min\{n,m\}}^{-\infty})$ as well.
Finally, we discuss the terms in (7.4).
Lemma 7.6. As $n$ tends to infinity, we have, for all $m > n$,
$$|\langle \Phi \otimes \Lambda^+_{A,V}\psi_n \,|\, \widetilde{H}_N (\Phi \otimes \Lambda^+_{A,V}\psi_m) \rangle| = O(R_n^{-\infty}).$$
Proof. We pick a family of smooth weight functions, {Fk }k,∈N , such that Fk ≡ 0 on supp(ψk ), Fk is constant outside some ball containing supp(ψk ) and supp(ψ ), ∇Fk ∞ ≤ a < min{0 , m}, and
gk := e−Fk −Fk ∞ ≤ Ce−a
min{Rk ,R }
,
k, ∈ N,
where a, a ∈ (0, min{0 , m}) and C ∈ (0, ∞) do not depend on k, ∈ N. In view of (2.21) it is easy to see that such a family exists. Then we observe that + |Φ ⊗ Λ+ A,V ψn | HN (Φ ⊗ ΛA,V ψm )| + + ≤ |Λ+ A,V ψn | (DA,V − 1)ψm | + |ΛA,V ψn | (VH + VE )ΛA,V ψm | 1/2 1/2 + + |WiN Φ ⊗ Λ+ A,V ψn | WiN Φ ⊗ ΛA,V ψm | 1≤i
+ (iα · ∇Fmn + VH + [eFmn , VE ]e−Fmn )ψm } −Fk + gnm eFmn (VH + VE )e−Fmn sup eFk Λ+ ψk 2 A,V e k=
−Fk + gnm (N − 1) sup sup Wy1/2 eFk Λ+ ψk 2 . A,V e k= y∈R3
By virtue of Proposition 3.2 and Lemma 3.7, we know that all terms behind the factors $g_{nm}$ appearing here are uniformly bounded, which shows that the assertion holds true.
Applying the above arguments in an easier situation, we obtain the following lemma.
Lemma 7.7. For every $d \in \mathbb{N}$, there is some $n_0 \in \mathbb{N}$ such that the set of vectors $\{\mathcal{A}_N(\Phi \otimes \Lambda^+_{A,V}\psi_n)\}_{n=m_0}^{m_0+d}$ is linearly independent, for all $m_0 \in \mathbb{N}$, $m_0 \ge n_0$.
Proof. We pick $\Psi$ as in (7.1) and estimate $\|\Psi\|^2$ from below by an obvious analogue of (7.2)–(7.4) with $\widetilde{H}_N$ replaced by the identity. Now, by virtue of Lemma 6.2 there
is some $m_1 \in \mathbb{N}$ such that $\|\Phi \otimes \Lambda^+_{A,V}\psi_n\| = \|\Lambda^+_{A,V}\psi_n\| \ge 1/2$, for all $n \ge m_1$. The proof of Lemma 7.5 shows that
$$|\langle \pi_{1N}(\Phi \otimes \Lambda^+_{A,V}\psi_n) \,|\, \Phi \otimes \Lambda^+_{A,V}\psi_m \rangle| = O(R_{\min\{n,m\}}^{-\infty}).$$
Furthermore, by employing the exponential weights from the proof of Lemma 7.6 we see that
$$|\langle \Phi \otimes \Lambda^+_{A,V}\psi_n \,|\, \Phi \otimes \Lambda^+_{A,V}\psi_m \rangle| = |\langle \psi_n \,|\, \Lambda^+_{A,V}\psi_m \rangle| \le C e^{-\min\{R_n, R_m\}/C}.$$
Altogether we find some $C \in (0, \infty)$ such that, for $d \in \mathbb{N}$ and all sufficiently large $m_0 \in \mathbb{N}$,
$$\|\Psi\|^2 \ge \frac{1}{2} \sum_{n=m_0}^{m_0+d} |c_n|^2 - \frac{C(N-1)}{R_{m_0}} \sum_{n,m=m_0}^{m_0+d} |c_n||c_m| - \frac{C}{R_{m_0}} \sum_{\substack{n,m=m_0 \\ n \neq m}}^{m_0+d} |c_n||c_m|.$$
Hence, the Cauchy–Schwarz inequality implies that, for sufficiently large $m_0$, $\Psi$ in (7.1) is zero if and only if $c_{m_0} = \dots = c_{m_0+d} = 0$.
Acknowledgments
It is a pleasure to thank Hubert Kalf, Sergey Morozov, and Heinz Siedentop for useful remarks and helpful discussions. Moreover, we thank Sergey Morozov for making parts of his manuscripts [38] available to us prior to publication.
References
[1] V. Bach, J. Fröhlich and I. Sigal, Quantum electrodynamics of confined nonrelativistic particles, Adv. Math. 137 (1998) 299–395.
[2] V. Bach and O. Matte, Exponential decay of eigenfunctions of the Bethe–Salpeter operator, Lett. Math. Phys. 55 (2001) 53–62.
[3] A. A. Balinsky and W. D. Evans, On the virial theorem for the relativistic operator of Brown and Ravenhall, and the absence of embedded eigenvalues, Lett. Math. Phys. 44 (1998) 233–248.
[4] A. A. Balinsky and W. D. Evans, Stability of one-electron molecules in the Brown–Ravenhall model, Comm. Math. Phys. 202 (1999) 481–500.
[5] A. A. Balinsky and W. D. Evans, On the spectral properties of the Brown–Ravenhall operator, J. Comput. Appl. Math. 148 (2002) 239–255.
[6] P. Beiersdorfer, M. H. Chen, K. T. Cheng and J. Sapirstein, Transition energies of the 3s–3p$_{3/2}$ resonance lines in sodiumlike to phosphoruslike uranium, Phys. Rev. A 68 (2003) 022507, 7 pp.
[7] A. Berthier and V. Georgescu, On the point spectrum of Dirac operators, J. Funct. Anal. 71 (1987) 309–338.
[8] A. M. Boutet de Monvel and R. Purice, A distinguished self-adjoint extension for the Dirac operator with strong local singularities and arbitrary behavior at infinity, Rep. Math. Phys. 34 (1994) 351–360.
[9] G. E. Brown and D. G. Ravenhall, On the interaction of two electrons, Proc. Roy. Soc. London A 208 (1951) 552–559.
[10] R. Cassanas and H. Siedentop, The ground-state energy of heavy atoms according to Brown and Ravenhall: Absence of relativistic effects in leading order, J. Phys. A 39 (2006) 10405–10414.
[11] K. T. Cheng, M. H. Chen and W. R. Johnson, Accurate relativistic calculations including QED contributions for few-electron systems, in Relativistic Electronic Structure Theory. Part 2: Applications, ed. P. Schwerdtfeger, Theoretical and Computational Chemistry, Vol. 14 (Elsevier, 2002), pp. 120–187.
[12] K. T. Cheng, M. H. Chen and J. Sapirstein, Potential independence of the solution to the relativistic many-body problem and the role of negative energy states in heliumlike ions, Phys. Rev. A 59 (1999) 259–266.
[13] H. L. Cycon, R. G. Froese, W. Kirsch and B. Simon, Schrödinger Operators, Texts and Monographs in Physics (Springer, Berlin-Heidelberg, 1987).
[14] A. Derevianko, W. R. Johnson, D. P. Plante and I. M. Savukov, Negative-energy contributions to transition amplitudes in heliumlike ions, Phys. Rev. A 58 (1998) 4453–4461.
[15] M. Dimassi and J. Sjöstrand, Spectral Asymptotics in the Semi-Classical Limit, London Math. Soc. Lecture Note Series, Vol. 268 (Cambridge University Press, Cambridge, 1999).
[16] J. Dolbeault, M. J. Esteban and M. Loss, Relativistic hydrogenic atoms in strong magnetic fields, Ann. Henri Poincaré 8 (2007) 749–779.
[17] W. D. Evans, P. Perry and H. Siedentop, The spectrum of relativistic one-electron atoms according to Bethe and Salpeter, Comm. Math. Phys. 178 (1996) 733–746.
[18] V. Georgescu and M. Măntoiu, On the spectral theory of singular Dirac type Hamiltonians, J. Operator Theory 46 (2001) 289–321.
[19] M. Griesemer, Exponential decay and ionization thresholds in non-relativistic quantum electrodynamics, J. Funct. Anal. 210 (2004) 321–340.
[20] M. Griesemer and H. Siedentop, A minimax principle for the eigenvalues in spectral gaps, J. London Math. Soc. (2) 60 (1999) 490–500.
[21] M. Griesemer and C. Tix, Instability of a pseudo-relativistic model of matter with self-generated magnetic field, J. Math. Phys. 40 (1999) 1780–1791.
[22] B. Helffer, J. Nourrigat and X. P. Wang, Sur le spectre de l'équation de Dirac (dans R³ ou R²) avec champ magnétique, Ann. Sci. École Norm. Sup. 22(4) (1989) 515–533.
[23] B. A. Heß, M. Reiher and A. Wolf, The generalized Douglas–Kroll transformation, J. Chem. Phys. 117 (2002) 9215–9226.
[24] G. Hoever and H. Siedentop, Stability of the Brown–Ravenhall operator, Math. Phys. Electr. J. 5 (1999) Paper 6, 11 pp.
[25] M. Huber and E. Stockmeyer, Perturbative implementation of the Furry picture, Lett. Math. Phys. 79 (2007) 99–108.
[26] G. Jansen and B. A. Heß, Revision of the Douglas–Kroll transformation, Phys. Rev. A 39 (1989) 6016–6017.
[27] D. H. Jakubaßa-Amundsen, The HVZ theorem for a pseudo-relativistic operator, Ann. Henri Poincaré 8 (2007) 337–360.
[28] D. H. Jakubaßa-Amundsen, Heat kernel estimates and spectral properties of a pseudorelativistic operator with magnetic field, J. Math. Phys. 49 (2008) 032305, 22 pp.
[29] W. R. Johnson, Relativistic many-body theory applied to highly-charged ions, in Many-Body Theory of Atomic Structure and Photoionization, ed. T. N. Chang (World Scientific, 1993), pp. 19–46.
[30] W. R. Johnson, Relativistic many-body perturbation theory for highly charged ions, in Many-Body Atomic Physics, eds. J. J. Boyle and M. S. Pindzola (University Press, 1998), pp. 39–64.
[31] T. Kato, Perturbation Theory for Linear Operators, Classics in Mathematics (Springer, Berlin-Heidelberg, 1995).
[32] T. Kato, Holomorphic families of Dirac operators, Math. Z. 183 (1983) 399–406.
[33] Y. Last and B. Simon, The essential spectrum of Schrödinger, Jacobi, and CMV operators, J. d'Analyse Math. 98 (2006) 183–220.
[34] E. H. Lieb and M. Loss, Stability of a model of relativistic quantum electrodynamics, Comm. Math. Phys. 228 (2002) 561–588.
[35] E. H. Lieb, H. Siedentop and J. P. Solovej, Stability and instability of relativistic electrons in classical electromagnetic fields, J. Statist. Phys. 89 (1997) 37–59.
[36] O. Matte and E. Stockmeyer, On the eigenfunctions of no-pair operators in classical magnetic fields, Integr. Equ. Oper. Theory 65 (2009) 255–283.
[37] S. Morozov, Essential spectrum of multiparticle Brown–Ravenhall operators in external field, Documenta Math. 13 (2008) 51–79.
[38] S. Morozov, Multi-particle Brown–Ravenhall operators in external fields, PhD thesis, Universität München (2008).
[39] S. Morozov, Exponential decay of eigenfunctions of Brown–Ravenhall operators, J. Phys. A 42 (2009) 475206, 16 pp.
[40] S. Morozov and S. Vugalter, Stability of atoms in the Brown–Ravenhall model, Ann. Henri Poincaré 7 (2006) 661–687.
[41] G. Nenciu, Self-adjointness and invariance of the essential spectrum for Dirac operators defined as quadratic forms, Comm. Math. Phys. 48 (1976) 235–247.
[42] G. Nenciu, Distinguished self-adjoint extension for the Dirac operator with potential dominated by multicenter Coulomb potentials, Helvetica Phys. Acta 50 (1977) 1–3.
[43] M. Reiher and A. Wolf, Relativistic Quantum Chemistry (Wiley-VCH, Weinheim, 2009).
[44] R. Richard and R. Tiedra de Aldecoa, On the spectrum of magnetic Dirac operators with Coulomb-type perturbations, J. Funct. Anal. 250 (2007) 625–641.
[45] J. Sapirstein, Theoretical methods for the relativistic atomic many-body problem, Rev. Modern Phys. 70 (1998) 55–76.
[46] H. Siedentop and E. Stockmeyer, The Douglas–Kroll–Heß method: Convergence and block-diagonalization of Dirac operators, Ann. Henri Poincaré 7 (2006) 45–58.
[47] J. Sucher, Foundations of the relativistic theory of many-electron atoms, Phys. Rev. A 22 (1980) 348–362.
[48] J. Sucher, Relativistic many-electron Hamiltonians, Phys. Scripta 36 (1987) 271–281.
[49] B. Thaller, The Dirac Equation, Texts and Monographs in Physics (Springer, Berlin-Heidelberg, 1992).
[50] C. Tix, Strict positivity of a relativistic Hamiltonian due to Brown and Ravenhall, Bull. London Math. Soc. 30 (1998) 283–290.
[51] C. Tix, Self-adjointness and spectral properties of a pseudo-relativistic Hamiltonian due to Brown and Ravenhall, preprint (1997) 20 pp.; mp_arc 97-441.
[52] C. Tix, Lower bound for the ground state energy of the no-pair Hamiltonian, Phys. Lett. B 405 (1997) 293–296.
[53] J. Xia, On the contribution of the Coulomb singularity of arbitrary charge to the Dirac Hamiltonian, Trans. Amer. Math. Soc. 351 (1999) 1989–2023.
February 11, 2010 10:1 WSPC/148-RMP
J070-S0129055X10003886
Reviews in Mathematical Physics Vol. 22, No. 1 (2010) 55–89 c World Scientific Publishing Company DOI: 10.1142/S0129055X10003886
ON THE LONG TIME BEHAVIOR OF FREE STOCHASTIC SCHRÖDINGER EVOLUTIONS
ANGELO BASSI Department of Physics, University of Trieste, Strada Costiera 11, 34151 Trieste, Italy and Istituto Nazionale di Fisica Nucleare, Trieste Section, Via Valerio 2, 34127 Trieste, Italy
[email protected]
DETLEF DÜRR∗ and MARTIN KOLB† Mathematisches Institut der L.M.U., Theresienstr. 39, 80333 München, Germany
∗[email protected] †[email protected]
Received 10 March 2009 Revised 15 August 2009
We discuss the time evolution of the wave function which is the solution of a stochastic Schrödinger equation describing the dynamics of a free quantum particle subject to spontaneous localizations in space. We prove global existence and uniqueness of solutions. We observe that there exist three time regimes: the collapse regime, the classical regime and the diffusive regime. Concerning the latter, we assert that the general solution converges almost surely to a diffusing Gaussian wave function having a finite spread both in position as well as in momentum. This paper corrects and completes earlier works on this issue.
Keywords: Collapse models; GRW-model; Hilbert space valued diffusions; large time behavior.
Mathematics Subject Classification 2000: 60H30, 60J60, 82C31, 81S99, 35R60
1. Introduction
Stochastic differential equations (SDEs) in infinite dimensional spaces are a subject of growing interest within the mathematical physics and physics communities working in quantum mechanics; they are currently used in models of spontaneous wave function collapse [1–14], in the theory of continuous quantum measurement [15, 17–25], and in the theory of open quantum systems [26–28]. In the first case, the Schrödinger equation is modified by adding appropriate nonlinear and stochastic terms which induce the (random) collapse of the wave function in
space; in this way, one achieves the goal of a unified description of microscopic quantum phenomena and macroscopic classical ones, avoiding the occurrence of macroscopic quantum superpositions. Current research focuses on designing experiments which discriminate between collapse and non-collapse theories, see references in [16]. In the second case, using the projection postulate, stochastic terms in the Schrödinger equation are used to describe the effect of a continuous measurement. In the third case, slightly generalizing the notion of continuous measurement to generic interactions with environments, SDEs are used as phenomenological equations describing the interaction of a quantum system with an environment, the stochastic terms encoding the effect of the environment on the system. Looking directly at the stochastic differential equation for the wave function, rather than the deterministic equation of the Lindblad type for the statistical operator, has some advantages with respect to the standard master equation approach, e.g. for faster numerical simulations [29].
Among the different SDEs which have been considered so far, the following equation, defined in the Hilbert space $\mathcal{H} \equiv L^2(\mathbb{R})$, is of particular interest [17–19, 30–37, 39]
$$d\psi_t = \left[ -\frac{i}{\hbar} \frac{p^2}{2m}\, dt + \sqrt{\lambda}\,(q - \langle q \rangle_t)\, dW_t - \frac{\lambda}{2} (q - \langle q \rangle_t)^2\, dt \right] \psi_t, \qquad \psi_0 = \psi. \tag{1.1}$$
The first term on the right-hand side represents the usual quantum Hamiltonian of a free particle in one dimension, $p$ being the momentum operator. The second and third terms of the equation, as we shall see, induce the localization of the wave function in space; $q$ is the position operator and $\langle q \rangle_t$ denotes the quantum expectation $\langle \psi_t | q \psi_t \rangle$ of $q$ with respect to $\psi_t$. The parameter $\lambda$ is a fixed positive constant which sets the strength of the collapse mechanism, while $W_t$ is a standard Wiener process defined on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with filtration $\{\mathcal{F}_t, t \ge 0\}$.
Equation (1.1) plays a special role among the SDEs in Hilbert spaces because it is the simplest exactly solvable equation describing the time evolution of a nontrivial physical system. Within the theory of continuous quantum measurement, it describes a measurement-like process designed to measure the position of a free quantum particle; within decoherence theory it represents one of the possible unravellings of the master equation first derived by Joos and Zeh [40]. Within collapse models (like GRW-models), it may describe the evolution of a free quantum particle (or the center of mass of an isolated system) subject to spontaneous localizations in space [1, 2] in the following sense. Realistic models of spontaneous wave function collapse are based on a more complicated stochastic differential equation: The difference between Eq. (1.1) and the equations of the standard localization models such as GRW [1] and CSL [2] is most easily described on the level of the Lindblad equations for the respective statistical operators $\rho_t := \mathbb{E}_{\mathbb{P}}[|\psi_t\rangle\langle\psi_t|]$, induced by the stochastic dynamics of the wave function. By virtue of Eq. (1.1) (see, e.g., [9]):
$$\frac{d}{dt} \rho_t = -\frac{i}{2m\hbar} [p^2, \rho_t] - \frac{\lambda}{2} [q, [q, \rho_t]], \tag{1.2}$$
with the “Lindblad term” in position representation
$$-\frac{\lambda_{GRW}\, \alpha}{4} (x - y)^2 \rho_t(x, y). \tag{1.3}$$
For the GRW dynamics as described in [1] the corresponding Lindblad term of the GRW master equation in the position representation reads:
$$-\lambda_{GRW} \left[ 1 - e^{-\alpha (x - y)^2/4} \right] \rho_t(x, y). \tag{1.4}$$
When the distances involved are smaller than the length $1/\sqrt{\alpha} \simeq 10^{-5}$ cm characterizing the model we have that
$$-\lambda_{GRW} \left[ 1 - e^{-\alpha (x - y)^2/4} \right] \simeq -\frac{\lambda_{GRW}\, \alpha}{4} (x - y)^2, \qquad \text{for } |x - y| \ll \frac{1}{\sqrt{\alpha}}. \tag{1.5}$$
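The quality of the small-distance approximation (1.5) can be checked numerically. The following is a minimal sketch: the value $\alpha = 10^{10}\,\mathrm{cm}^{-2}$ follows from $1/\sqrt{\alpha} \simeq 10^{-5}$ cm, while the rate $\lambda_{GRW} = 10^{-16}\,\mathrm{s}^{-1}$ is an illustrative assumption that is not fixed by the text above; the function names are hypothetical.

```python
import math


def grw_lindblad_factor(lam_grw, alpha, x, y):
    """Factor multiplying rho_t(x, y) in (1.4): -lam_grw * (1 - exp(-alpha (x-y)^2 / 4))."""
    u = alpha * (x - y) ** 2 / 4.0
    # expm1(-u) = exp(-u) - 1, which stays accurate for very small u
    return lam_grw * math.expm1(-u)


def quadratic_factor(lam_grw, alpha, x, y):
    """Small-distance approximation in (1.5): -lam_grw * alpha * (x-y)^2 / 4."""
    return -lam_grw * alpha * (x - y) ** 2 / 4.0


def relative_error(lam_grw, alpha, x, y):
    """Relative deviation of the quadratic approximation (requires x != y)."""
    exact = grw_lindblad_factor(lam_grw, alpha, x, y)
    return abs(quadratic_factor(lam_grw, alpha, x, y) - exact) / abs(exact)


if __name__ == "__main__":
    alpha = 1e10      # cm^-2, from 1/sqrt(alpha) ~ 1e-5 cm
    lam_grw = 1e-16   # s^-1, illustrative assumption
    for distance in (1e-8, 1e-6, 1e-5, 1e-4):  # in cm
        print(distance, relative_error(lam_grw, alpha, distance, 0.0))
```

The relative error behaves like $\alpha (x-y)^2/8$ for small separations, so the approximation is excellent at atomic distances and breaks down once $|x - y|$ exceeds $1/\sqrt{\alpha}$.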
Accordingly, the stochastic dynamics of Eq. (1.1) approximates, at least on the statistical level, the GRW dynamics for all atomic and subatomic distances. Since this is a regime of growing interest [41, 13, 42, 43], it is reasonable to study first the simpler Eq. (1.1).
Equation (1.1) is nonlinear. Nonlinearity is a fundamental ingredient because only in this way is it possible to reproduce the collapse of the wave function. It is well known how to “linearize” the equation, i.e. how to express its solutions as a function of the solutions of a suitable linear SDE [31, 44]. We briefly review this procedure. Let us consider the following linear SDE:
$$d\phi_t = \left[ -\frac{i}{\hbar} \frac{p^2}{2m}\, dt + \sqrt{\lambda}\, q\, d\xi_t - \frac{\lambda}{2} q^2\, dt \right] \phi_t, \qquad \phi_0 = \phi, \tag{1.6}$$
defined in the same Hilbert space $\mathcal{H} \equiv L^2(\mathbb{R})$; the stochastic process $\xi_t$ is a standard Wiener process with respect to the probability space $(\Omega, \mathcal{F}, \mathbb{Q})$ and filtration $\{\mathcal{F}_t, t \ge 0\}$, where $\mathbb{Q}$ is a new probability measure whose relation with $\mathbb{P}$ will soon be established. This equation does not conserve the norm of the state vector, as the evolution is not unitary; we therefore introduce the normalized state vectors:
$$\psi_t = \begin{cases} \phi_t / \|\phi_t\| & \text{if } \|\phi_t\| \neq 0, \\ 0 & \text{otherwise.} \end{cases} \tag{1.7}$$
A standard application of Itô calculus shows that, if $\phi_t$ solves Eq. (1.6), then $\psi_t$ defined in (1.7) solves the following nonlinear SDE:
$$d\psi_t = \left[ -\frac{i}{\hbar} \frac{p^2}{2m}\, dt + \sqrt{\lambda}\,(q - \langle q \rangle_t)(d\xi_t - 2\sqrt{\lambda}\, \langle q \rangle_t\, dt) - \frac{\lambda}{2} (q - \langle q \rangle_t)^2\, dt \right] \psi_t, \tag{1.8}$$
for the same initial condition $\psi = \phi$. Equation (1.8) is a well defined collapse equation; however, it is not suitable for physical applications, as the collapse does not occur with the correct quantum probabilities. This can be seen by analyzing the time evolution of particular solutions, such as Gaussian wave functions; it can also be easily understood by noting
that there is no fundamental difference between Eqs. (1.8) and (1.6), since any solution of Eq. (1.8) can be obtained from a solution of Eq. (1.6) simply by normalizing the wave function. In turn, Eq. (1.6) does not contain any information as to why the wave function should collapse according to the Born probability rule, i.e. the Wiener process $\xi_t$ is not forced to pick most likely those values necessary to reproduce quantum probabilities during the collapse process. The way to include such a feature into the dynamical evolution of the wave function is to replace the measure $\mathbb{Q}$ with a new measure (which will turn out to be the measure $\mathbb{P}$ previously introduced) so that the process $\xi_t$, according to the new measure, is forced to take with higher probability the values which account for quantum probabilities. This is precisely the key idea behind the original GRW model of spontaneous wave function collapse [1]: the wave function is more likely to collapse where it is more appreciably different from zero. The mathematical structure of the GRW model suggests that the squared norm $\|\phi_t\|^2$ should be used as density for the change of measure. We now formalize these steps. In [31], Holevo has proven that for initial condition $\|\phi_0\|^2 = 1$ the process $(\|\phi_t\|^2)_{t \ge 0}$ is a martingale satisfying the equation
$$\|\phi_t\|^2 = \|\phi_0\|^2 + 2\sqrt{\lambda} \int_0^t \langle q \rangle_s\, \|\phi_s\|^2\, d\xi_s. \tag{1.9}$$
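We shall always work with normalized initial states. The martingale $\|\phi_t\|^2$ can be used as a Radon–Nikodym derivative to generate a new probability measure $\mathbb{P}$ from $\mathbb{Q}$, according to the usual formula:
$$\mathbb{P}[E] := \mathbb{E}_{\mathbb{Q}}[1_E\, \|\phi_t\|^2], \qquad \forall\, E \in \mathcal{F}_t, \quad \forall\, t < +\infty, \tag{1.10}$$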
where $1_E$ is the indicator function relative to the measurable subset $E$. We recall that the martingale property, together with the property $\mathbb{E}_{\mathbb{Q}}[\|\phi_t\|^2] = 1$, guarantees consistency among different times, so that (1.10) defines indeed a unique probability measure $\mathbb{P}$ on $\mathcal{F}$. In the following, for simplicity we will write $d\mathbb{P}/d\mathbb{Q} \equiv \|\phi_t\|^2$. One can then show that Eq. (1.8), with the stochastic dynamics defined on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$ in place of $(\Omega, \mathcal{F}, \mathbb{Q})$, correctly describes the desired physical situations.
A drawback of the change of measure is that the equation is defined in terms of the stochastic process $\xi_t$, which is no longer a Wiener process with respect to the measure $\mathbb{P}$, as it was with respect to the measure $\mathbb{Q}$. This can be a source of many difficulties, e.g. when analyzing the properties of the solutions of the equation. The disadvantage can be removed by resorting to Girsanov's theorem, which connects Wiener processes defined on the same measurable space, but with respect to different probability measures. According to this theorem, the process
$$W_t := \xi_t - 2\sqrt{\lambda} \int_0^t \langle q \rangle_s\, ds, \tag{1.11}$$
is a Wiener process with respect to (Ω, F , P) and filtration {Ft , t ≥ 0}, and thus is the natural process for describing the stochastic dynamics with respect to the
measure P. It is immediate to see that, once written in terms of $W_t$, Eq. (1.8) reduces to Eq. (1.1); thus the link between Eqs. (1.6) and (1.1) is established. The above discussion should also have given a first idea of why SDEs like Eq. (1.1) are those which are used in Quantum Mechanics to describe the collapse of the wave function; we will come back to this point later in the paper.
The first important problem to address concerns the status of the solutions of Eq. (1.6). In [31], Holevo has proven the existence and uniqueness of topological weak solutions of a rather general class of SDEs with unbounded operators, to which Eq. (1.6) belongs. (See the end of the section for the notation.) The problem of the existence and uniqueness of topological strong solutions of Eq. (1.6) has been addressed in [30]; there, however, the proof relies on the expansion of wave functions in terms of Gaussian states, which in general is problematic and requires special care, as shown in [45]. An explicit representation of the strong solution of Eq. (1.6) has been given in [37]; the representation is written in terms of path integrals and is not particularly suitable for analyzing the time evolution of the general solution. A much more convenient representation, given in terms of the Green's function of Eq. (1.6), was first derived in [32, 35]; the Green's function reads:
$$G_t(x, y) = K_t \exp\left[ -\frac{\alpha_t}{2} (x^2 + y^2) + \beta_t\, x y + \bar{a}_t\, x + \bar{b}_t\, y + \bar{c}_t \right]; \tag{1.12}$$
the coefficients $K_t$, $\alpha_t$ and $\beta_t$ are deterministic and equal to
$$K_t = \sqrt{\frac{\lambda}{\upsilon \pi \sinh \upsilon t}}, \tag{1.13}$$
$$\alpha_t = \frac{2\lambda}{\upsilon} \coth \upsilon t, \tag{1.14}$$
$$\beta_t = 2\, \frac{\lambda}{\upsilon} \sinh^{-1} \upsilon t, \tag{1.15}$$
while the remaining coefficients are functions of the Wiener process $\xi_t$:
$$\bar{a}_t = \sqrt{\lambda}\, \sinh^{-1} \upsilon t \int_0^t \sinh \upsilon s\; d\xi_s, \tag{1.16}$$
$$\bar{b}_t = 2i\, \frac{\hbar \lambda}{m \upsilon} \int_0^t \frac{\bar{a}_s}{\sinh \upsilon s}\, ds, \tag{1.17}$$
$$\bar{c}_t = \frac{i \hbar}{m} \int_0^t \bar{a}_s^2\, ds. \tag{1.18}$$
In the above expressions, we have introduced the following two constants:
$$\upsilon \equiv \frac{1 + i}{2}\, \omega, \qquad \omega \equiv 2 \sqrt{\frac{\hbar \lambda}{m}}. \tag{1.19}$$
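For orientation, the deterministic coefficients (1.13)–(1.15) and the constants (1.19) can be evaluated with a few lines of code. This is only a sketch under stated assumptions: $\sinh^{-1}\upsilon t$ is read as the reciprocal $1/\sinh \upsilon t$, the principal branch of the complex square root is used for $K_t$, and units with $\hbar = 1$ are chosen by default; the function names are hypothetical.

```python
import cmath
import math


def frequencies(lam, mass, hbar=1.0):
    """Return (omega, upsilon) as defined in (1.19)."""
    omega = 2.0 * math.sqrt(hbar * lam / mass)
    upsilon = 0.5 * (1.0 + 1.0j) * omega
    return omega, upsilon


def green_coefficients(t, lam, mass, hbar=1.0):
    """Return (K_t, alpha_t, beta_t) from (1.13)-(1.15) for t > 0."""
    _, upsilon = frequencies(lam, mass, hbar)
    sinh_ut = cmath.sinh(upsilon * t)
    # Principal branch of the square root (an assumption of this sketch).
    k_t = cmath.sqrt(lam / (upsilon * math.pi * sinh_ut))
    alpha_t = 2.0 * lam / (upsilon * cmath.tanh(upsilon * t))  # coth = 1/tanh
    beta_t = 2.0 * lam / (upsilon * sinh_ut)                   # reciprocal of sinh
    return k_t, alpha_t, beta_t


if __name__ == "__main__":
    lam, mass = 1.0, 1.0
    omega, upsilon = frequencies(lam, mass)
    print("omega =", omega, "upsilon =", upsilon)
    for t in (0.1, 1.0, 10.0):
        print(t, green_coefficients(t, lam, mass))
```

With these definitions $\upsilon^2 = 2i\hbar\lambda/m$, and for $\omega t \gg 1$ the coefficient $\alpha_t$ approaches the constant $2\lambda/\upsilon$ (with positive real part) while $\beta_t$ decays exponentially.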
As we shall see, the parameter $\omega$, which has the dimensions of a frequency, will set the time scales for the collapse of the wave function. The representation in terms of the Green's function (1.12), as we have said, is particularly suitable for analyzing the time evolution of the general solution of Eq. (1.6), and thus of Eq. (1.1), even though we will see that, when studying the long time behavior, another representation is more convenient. Our first result concerns the meaning of the solution of Eq. (1.6) in terms of
$$\phi_t(x) := \int dy\; G_t(x, y)\, \phi(y) \tag{1.20}$$
for given initial condition $\phi$.
Theorem 1.1. Let $\phi_t$ be defined as in (1.20); then the following three statements hold true with $\mathbb{Q}$-probability 1:
(1) $\phi \in L^2(\mathbb{R}) \Rightarrow \phi_t \in L^2(\mathbb{R})$, (1.21)
(2) $\phi \in L^2_B(\mathbb{R}) \Rightarrow \phi_t$ is a topological strong solution of Eq. (1.6), (1.22)
(3) $\phi \in L^2(\mathbb{R}) \Rightarrow \lim_{t \to 0} \|\phi_t - \phi\| = 0$, (1.23)
where $L^2_B(\mathbb{R})$ is the subspace of all bounded functions of $L^2(\mathbb{R})$.
Having the explicit solution of Eq. (1.6), and thus of Eq. (1.1), the next relevant problem is to unfold its physical content. Previous analyses of similar equations [2, 8, 10, 14, 39] have shown that one can identify three regimes, which are more or less well separated depending on the value of the parameters $\lambda$ and $m$.
(1) Collapse regime. A wave function having an initial large spread localizes in space, the localization occurring in agreement with the Born probability rule.
(2) Classical regime. The localized wave function moves in space like a classical free particle, since the fluctuations due to the Wiener process can be safely ignored. That a well localized Schrödinger wave function should move along a classical path is connected to the validity of Ehrenfest- or Egorov-type theorems [38].
(3) Diffusive regime. Eventually, the random fluctuations become dominant and the wave function starts to diffuse appreciably.
It is not an easy task to spell out rigorously these regimes and their properties. We shall, however, be a bit more specific on this in the following section. We shall afterwards focus on the simplest regime, namely the diffusive one, which in fact has been intensively looked at in the previous years [7, 19, 26, 34, 35, 39], and we shall prove a remarkable property of the solutions of Eq. (1.1): Any solution converges almost surely to a Gaussian wave function having a fixed spread.
Theorem 1.2. Let $\psi_t$ be a solution of Eq. (1.1); then under conditions which we will specify, the following property holds true with $\mathbb{P}$-probability 1:
$$\lim_{t \to \infty} \|\psi_t - \psi_t^\infty\| = 0, \tag{1.24}$$
where ψt∞ , defined in (5.2), is a Gaussian wave function with a fixed spread both in position and momentum. Theorems 1.1 and 1.2 have been extensively discussed before in the literature [7, 19, 26, 30, 34–36, 39], proving that the community has devoted much attention to the problem. However, these proofs are not complete or flawed. Concerning Theorem 1.1, in particular Statement (3) was not proven [30, 34–36]. While Statements (1) and (2) are rather straightforward conclusions from the Gaussian kernel of the propagator, the third statement is much more subtle and does not follow from purely analytical arguments. Concerning Theorem 1.2, none of the previous proofs is decisive. In [35, 36], the major flaw was that it was overlooked that the eigenfunction expansion of the relevant dissipative operator (not self-adjoint) does not give rise to an orthonormal basis. In [19], the long time behavior was analyzed by expanding the general solution in terms of coherent states, while in [26,39] it was analyzed by scrutinizing the time evolution of the spread in position of the solution; in [45] it has been shown that both approaches are not conclusive. Finally, [7] proposed Theorem 1.2 as a conjecture, but shows stability of ψt∞ only against small perturbations. Building on previous work of Holevo, Mora and Rebolledo recently enhanced in [46, 47] the general theory of stochastic Schr¨odinger equations. In particular, they developed criteria for the existence of regular invariant measures for a large class of stochastic Schr¨odinger equations as an important step towards an understanding of the large time behavior. Until now however the only complete and detailed results on the large time behavior seem to be Theorems 1.1 and 1.2. We conclude this introductory section by summarizing the content of the paper. In Sec. 2, we will present a qualitative analysis of the time evolution of the general solution of Eq. 
(1.1); we will discuss the three regimes previously introduced, also giving numerical estimates, and we will set out the main problems which we aim to solve. In Sec. 3, we will analyze the structure of the Green's function (1.12) and prove Theorem 1.1. In Sec. 4, we will introduce another representation of the general solution of Eq. (1.1), which is more suitable for analyzing its long time behavior. Sec. 5 will be devoted to the proof of Theorem 1.2. Finally, Sec. 6 will contain some concluding remarks and an outlook.

Notation. We will work in the complex and separable Hilbert space $L^2(\mathbb{R})$, with the norm and the scalar product denoted, respectively, by $\|\cdot\|$ and $\langle\cdot|\cdot\rangle$. We will also consider the subspace $L^2_B(\mathbb{R})$ of all bounded functions of $L^2(\mathbb{R})$. Given an operator $O$, we denote by $D(O)$ its domain and by $R(O)$ its range. Since the real and imaginary parts of some coefficients appear in some expressions, for ease of readability we write $z^R$ or $z_R$ for the real part of the complex number $z$, and $z^I$ or $z_I$ for its imaginary part.

Given the linear SDE (1.6), a topological strong solution is an $L^2$-valued process such that, for any $t > 0$,
\[ \phi_t = \phi - \frac{i}{\hbar}\int_0^t \frac{p^2}{2m}\,\phi_s\,ds + \sqrt{\lambda}\int_0^t q\,\phi_s\,d\xi_s - \frac{\lambda}{2}\int_0^t q^2\phi_s\,ds \tag{1.25} \]
holds with $Q$-probability 1. A topological weak solution instead is an $L^2$-valued process such that, for any $t > 0$ and for any $\chi \in D(p^2) \cap D(q^2)$,
\[ \langle\chi|\phi_t\rangle = \langle\chi|\phi\rangle - \frac{i}{\hbar}\int_0^t \frac{1}{2m}\langle p^2\chi|\phi_s\rangle\,ds + \sqrt{\lambda}\int_0^t \langle q\chi|\phi_s\rangle\,d\xi_s - \frac{\lambda}{2}\int_0^t \langle q^2\chi|\phi_s\rangle\,ds \tag{1.26} \]
holds with $Q$-probability 1. Topological strong and weak solutions for the nonlinear SDE (1.1) are defined in a similar way. There is also a distinction between strong and weak solutions in a stochastic sense [48], depending on whether the probability space, the filtration and the Wiener process are given a priori (strong solution) or whether they can be constructed in such a way as to solve the required SDE (weak solution). Throughout the paper, we will deal only with strong solutions in the stochastic sense.

2. Time Evolution of the General Solution

We begin our discussion with a qualitative analysis of the time evolution of the general solution of Eq. (1.1); we will identify the regimes introduced in the previous section, corresponding to three different behaviors of the wave function. These regimes of course depend on the value of the mass $m$ of the particle and also on the value of the coupling constant $\lambda$, which sets the strength of the collapse mechanism. As discussed, e.g., in [39], it is physically appropriate to take $\lambda$ proportional to the mass $m$ according to the formula:
\[ \lambda := \lambda_0\,\frac{m}{m_0}, \tag{2.1} \]
where $\lambda_0$ is now assumed to be a universal coupling constant, while $m_0$ is taken equal to the mass of a nucleon ($\simeq 1.67 \times 10^{-27}$ kg). To be definite, in the following we take $\lambda_0 \simeq 1.00 \times 10^{-2}\ \mathrm{m^{-2}\,sec^{-1}}$, so that the localization mechanism has the same strength as that of the GRW model [1]. Though, as we discussed in the introduction, Eq. (1.1) is used also in the context of the theory of continuous measurement as well as in the theory of decoherence, for brevity and clarity in the following we will only make reference to its application within models of spontaneous wave function collapse.

1. The collapse regime

The first important effect of the dynamics embodied in Eq. (1.1) is that a wave function, which initially is well spread out in space, becomes rapidly localized. This is most easily seen through the Green's function representation of the solution. The Green's function $G_t(x, y)$ in (1.12) can be rewritten as follows:
\[ G_t(x, y) = K_t \exp\left[-\frac{\tilde\alpha_t}{2}x^2 + \tilde a_t x + \tilde c_t\right] \exp\left[-\frac{\alpha_t}{2}\,(y - Y_t^x)^2\right], \tag{2.2} \]
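where we have introduced the new parameters:
\[ \tilde\alpha_t = \alpha_t - \frac{\beta_t^2}{\alpha_t} = \frac{2\lambda}{\upsilon}\tanh\upsilon t, \tag{2.3} \]
\[ \tilde a_t = \bar a_t + \frac{\beta_t \bar b_t}{\alpha_t}, \tag{2.4} \]
\[ \tilde c_t = \bar c_t + \frac{\bar b_t^2}{2\alpha_t}, \tag{2.5} \]
\[ Y_t^x = \frac{\beta_t x + \bar b_t}{\alpha_t}. \tag{2.6} \]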
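The $y$-part of $G_t(x, y)$ is a Gaussian function whose spread in position (equal to $1/\sqrt{\alpha_t^R}$) rapidly decreases in time, and afterwards remains very small. In particular, we have:
\[ \alpha_t^R = \frac{2\lambda}{\omega}\,\frac{\sinh\omega t - \sin\omega t}{\cosh\omega t - \cos\omega t} \simeq \begin{cases} \dfrac{2}{3}\lambda t \simeq (3.99 \times 10^{24}\ \mathrm{m^{-2}\,kg^{-1}\,sec^{-1}})\,m t & t \ll \omega^{-1}, \\[2mm] \dfrac{2\lambda}{\omega} \simeq (2.39 \times 10^{29}\ \mathrm{m^{-2}\,kg^{-1}})\,m & t \to +\infty, \end{cases} \tag{2.7} \]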
with $\omega \simeq 5.01 \times 10^{-5}\ \mathrm{sec^{-1}}$ independent of the mass of the particle. Let us introduce a length $\ell$, and let us say that a wave function is localized when its spread is smaller than $\ell$. For the sake of definiteness, we take $\ell \simeq 1.00 \times 10^{-7}$ m, corresponding to the width of the collapsing Gaussian of the GRW model. By means of this length, we can define the collapse time $t_1$ as the time when the spread of the $y$-part of the Green's function $G_t(x, y)$ becomes smaller than $\ell$. By using the small time approximation of $\alpha_t^R$ given in (2.7), we can set:
\[ t_1 := \frac{3}{2\lambda\ell^2} \simeq \frac{2.51 \times 10^{-11}\ \mathrm{kg\,sec}}{m}. \tag{2.8} \]
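The numerical coefficients in (2.7) and (2.8) can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: it assumes the relation $\omega = 2\sqrt{\hbar\lambda_0/m_0}$ (taken here to be the content of Eq. (1.19), and consistent with the quoted value of $\omega$), uses the CODATA value of $\hbar$, and the function names are illustrative only.

```python
import math

HBAR = 1.054571817e-34  # J s; CODATA value, not quoted in the text
LAMBDA0 = 1.00e-2       # m^-2 s^-1, value chosen in the text
M0 = 1.67e-27           # kg, nucleon mass quoted in the text
ELL = 1.00e-7           # m, localization length chosen in the text

# Assumption: omega = 2*sqrt(hbar*lambda0/m0), taken to be the content of Eq. (1.19).
OMEGA = 2.0 * math.sqrt(HBAR * LAMBDA0 / M0)


def coupling(mass_kg):
    """Coupling constant lambda = lambda0 * m / m0 of Eq. (2.1)."""
    return LAMBDA0 * mass_kg / M0


def alpha_real(mass_kg, t):
    """Real part of alpha_t, first expression in Eq. (2.7)."""
    x = OMEGA * t
    ratio = (math.sinh(x) - math.sin(x)) / (math.cosh(x) - math.cos(x))
    return 2.0 * coupling(mass_kg) / OMEGA * ratio


def collapse_time(mass_kg):
    """Collapse time t1 = 3 / (2 * lambda * ell^2) of Eq. (2.8)."""
    return 3.0 / (2.0 * coupling(mass_kg) * ELL**2)


if __name__ == "__main__":
    print(f"omega              = {OMEGA:.3e} 1/s")
    print(f"(2/3) lambda / m   = {2.0 / 3.0 * coupling(1.0):.3e} m^-2 kg^-1 s^-1")
    print(f"2 lambda / (m w)   = {2.0 * coupling(1.0) / OMEGA:.3e} m^-2 kg^-1")
    print(f"t1 * m             = {collapse_time(1.0):.3e} kg s")
```

The small-time and large-time branches of (2.7) follow from the expansions $\sinh x - \sin x \approx x^3/3$ and $\cosh x - \cos x \approx x^2$ for small $x$, and from the ratio tending to 1 for large $x$.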
As we see, and as we expect, this time decreases for increasing masses, i.e. for increasing values of $\lambda$, and is very small for macroscopic particles. Let us assume that the initial state $\phi(x)$ is not already localized, and in particular that it does not change appreciably on the scale set by $\ell$; this is a physically reasonable assumption when $\phi$ represents the state of the center of mass of a macroscopic object. In this case, from the time $t_1$ on, the $y$-part of the Green's function $G_t(x, y)$ acts like a Dirac delta on $\phi(x)$, and the solution at time $t$ of the linear equation can be written as follows:
\[ \phi_t(x) \simeq \sqrt{\frac{2\pi}{\alpha_t}}\,K_t \exp\left[-\frac{\tilde\alpha_t}{2}x^2 + \tilde a_t x + \tilde c_t\right]\phi(Y_t^x). \tag{2.9} \]
This is a Gaussian state whose spread is controlled by $\tilde\alpha_t$, which evolves in time in a way similar to $\alpha_t$; in particular:
\[ \tilde\alpha_t^R = \frac{2\lambda}{\omega}\,\frac{\sinh\omega t + \sin\omega t}{\cosh\omega t + \cos\omega t} \simeq \begin{cases} 2\lambda t \simeq (1.20 \times 10^{25}\ \mathrm{m^{-2}\,kg^{-1}\,sec^{-1}})\,m t & t \ll \omega^{-1}, \\[2mm] \dfrac{2\lambda}{\omega} \simeq (2.39 \times 10^{29}\ \mathrm{m^{-2}\,kg^{-1}})\,m & t \to +\infty. \end{cases} \tag{2.10} \]
As we see, the spread $1/\sqrt{\tilde\alpha_t^R}$ is well below $\ell$, for any $t \geq t_1$. We can then conclude that, for times greater than the collapse time, any state initially well spread out in space is mapped into a very well localized wave function.

An important issue is where the wave function collapses to, given that the initial state is spread out in space. We now show that the position of the wave function after the collapse is distributed in very good agreement with the Born probability rule. A reasonable measure of where the wave function is, after it has collapsed, is given by the quantum average of the position operator $\langle q\rangle_t$. Accordingly, the probability for the collapsed wave function to lie within a Borel measurable set $A$ of $\mathbb{R}$ can be simply defined to be $P_t^{\mathrm{coll}}[A] := P[\omega : \langle q\rangle_t \in A]$. Though this probability is mathematically well defined for any Borel measurable subset $A$, it is physically meaningful only when $A$ represents an interval $\Delta$ much larger than the spread of the wave function itself, or a union of such intervals. In such a case, as discussed in [49], one can show that:
\[ P_t^{\mathrm{coll}}[A] \simeq E_P[\|P_\Delta\psi_t\|^2] \equiv \int_\Delta p_t(x)\,dx, \tag{2.11} \]
where $P_\Delta(x)$ is the characteristic function of the interval $\Delta$ of the real axis and $p_t(x) = E_P[|\psi_t(x)|^2]$. The idea behind the approximate equality (2.11) is that when $\psi_t$ lies within $\Delta$, then $P_\Delta\psi_t \simeq \psi_t$, so that $\|P_\Delta\psi_t\|^2$ is almost equal to 1, while when it lies outside $\Delta$, it is practically 0. The critical situations, which require special care, are those when the wave function lies at the edges of $\Delta$. In [39] it has been proven that:
\[ p_t(x) = \sqrt{\frac{\mu_t}{\pi}}\int dy\, e^{-\mu_t y^2}\, p_t^{\mathrm{Sch}}(x + y), \tag{2.12} \]
\[ \mu_t = \frac{3 m m_0}{2\hbar^2\lambda_0 t^3} \simeq (2.27 \times 10^{43}\ \mathrm{m^{-2}\,kg^{-1}\,sec^{3}})\,\frac{m}{t^3}, \tag{2.13} \]
where $p_t^{\mathrm{Sch}}(x) = |\psi_t^{\mathrm{Sch}}(x)|^2$ and $\psi_t^{\mathrm{Sch}}(x)$ is the solution of the standard free-particle Schrödinger equation, for the given initial condition $\phi(x)$. For the times we are considering ($t = t_1$), the Gaussian term in (2.12) is much more peaked than any typical quantum probability distribution $p_t^{\mathrm{Sch}}(x)$, and consequently acts like a Dirac delta on it; accordingly, $p_t(x) \simeq p_t^{\mathrm{Sch}}(x)$. Finally, for macroscopic systems and for the
times we are considering, the wave function solution of the free-particle Schrödinger equation does not change appreciably, implying that $p_t^{\mathrm{Sch}}(x) \simeq p_0^{\mathrm{Sch}}(x) = |\phi(x)|^2$, which means precisely that the collapse probability is distributed in agreement with the Born probability rule.

2. The classical regime

After time $t_1$, we are left with a wave function which, when $m$ is the mass of a macroscopic particle, is very well localized in space, almost point-like. This is the way in which collapse models reproduce the particle-like behavior of classical systems, within the framework of a wave-like dynamics. The relevant question now is to determine the time evolution of the position and momentum of the wave function, to see whether it matches Newton's laws. When the wave function is well localized in space ($t > t_1$), one can reasonably assume that it can be approximated by the Gaussian state to which, as we shall see, it asymptotically converges. We will analyze the time evolution of such a Gaussian state in the following, and we will see that its mean position $\bar x_t$ and momentum $\hbar\bar k_t$ evolve in time as follows (see Eqs. (5.28) and (5.29)):
\[ \bar x_t = \bar x_{t_1} + \frac{\hbar}{m}\,\bar k_{t_1}(t - t_1) + \sqrt{\lambda}\,\frac{\hbar}{m}\int_{t_1}^{t} W_s\,ds + \sqrt{\frac{\hbar}{m}}\,(W_t - W_{t_1}), \tag{2.14} \]
\[ \bar k_t = \bar k_{t_1} + \sqrt{\lambda}\,(W_t - W_{t_1}). \tag{2.15} \]
We can easily recognize in the deterministic parts of the above equations the free-particle equations of motion of classical mechanics, describing a particle moving along a straight line with constant velocity; the remaining terms are the fluctuations around the classical motion, driven by the Brownian motion $W_t$. The important feature of the above equations is that these fluctuations, for macroscopic masses, are very small, for very long times. As a matter of fact, if we estimate the Brownian motion fluctuations by setting $W_t \sim \sqrt{t}$, we have for the stochastic terms in Eq. (2.14):
\[ \sqrt{\lambda}\,\frac{\hbar}{m}\int_{t_1}^{t} W_s\,ds \simeq \frac{2}{3}\sqrt{\lambda}\,\frac{\hbar}{m}\,t^{3/2} \simeq (1.63 \times 10^{-22}\ \mathrm{m\,kg^{1/2}\,sec^{-3/2}})\,\frac{t^{3/2}}{\sqrt{m}}, \tag{2.16} \]
\[ \sqrt{\frac{\hbar}{m}}\,(W_t - W_{t_1}) \simeq \sqrt{\frac{\hbar t}{m}} \simeq (1.02 \times 10^{-17}\ \mathrm{m\,kg^{1/2}\,sec^{-1/2}})\,\sqrt{\frac{t}{m}}. \tag{2.17} \]
We see that the random fluctuations decrease with the square root of the mass $m$ of the particle, which means that the bigger the system, the more deterministic its motion. This is how collapse models recover classical determinism at the macroscopic level, from a fundamentally stochastic theory. We can introduce a time $t_2$, defined as the time after which the fluctuations become larger than a length $L$; we can set, e.g., $L \simeq 1.00 \times 10^{-3}$ m. Since the fluctuations
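in (2.16) grow faster than those in (2.17), we can set:
\[ t_2 \simeq \left(\frac{3 L m}{2\hbar\sqrt{\lambda}}\right)^{2/3} \simeq (3.55 \times 10^{12}\ \mathrm{sec\,kg^{-1/3}})\,\sqrt[3]{m} \simeq (1.13 \times 10^{5}\ \mathrm{year\,kg^{-1/3}})\,\sqrt[3]{m}. \tag{2.18} \]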
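To get a feeling for the width of the classical window $[t_1, t_2]$, one can simply evaluate (2.8) and (2.18) for a few masses. The sketch below uses the numerical coefficients exactly as quoted in those two equations; the sample masses, the function names and the seconds-per-year conversion are illustrative assumptions and do not come from the text.

```python
SECONDS_PER_YEAR = 3.156e7  # approximate conversion factor, not from the text


def collapse_time(mass_kg):
    """t1 in seconds, using the coefficient quoted in Eq. (2.8)."""
    return 2.51e-11 / mass_kg


def classical_horizon(mass_kg):
    """t2 in seconds, using the coefficient quoted in Eq. (2.18)."""
    return 3.55e12 * mass_kg ** (1.0 / 3.0)


if __name__ == "__main__":
    # Hypothetical sample masses, from a speck of dust up to one kilogram.
    for mass in (1e-15, 1e-9, 1e-3, 1.0):
        t1 = collapse_time(mass)
        t2 = classical_horizon(mass)
        print(f"m = {mass:8.1e} kg   t1 = {t1:9.2e} s   "
              f"t2 = {t2:9.2e} s = {t2 / SECONDS_PER_YEAR:9.2e} yr")
```

For all the sample masses above the collapse time lies many orders of magnitude below $t_2$, which is the quantitative content of the statement that the classical regime lasts very long for macroscopic systems.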
The time $t_2$ defines the time interval $[t_1, t_2]$ during which the classical regime holds. As we can see, for macroscopic systems this is a very long time, much longer than the time during which a macro-object can be kept isolated from the rest of the universe so that its dynamics is described by Eq. (1.1). To summarize, during the classical regime, which for macroscopic systems lasts very long, the wave function behaves, for all practical purposes, like a point moving deterministically in space according to Newton's laws. In other words, the wave function reproduces the motion of a classical particle.

3. The diffusive regime

After time $t_2$, two new effects become dominant. First, the wave function converges towards a Gaussian state, as we shall prove. Second, the motion becomes more and more erratic: the dynamics begins to depart from the classical one, showing its intrinsic stochastic nature.

A thorough mathematical analysis of these time regimes and their main properties is still lacking. In this paper, as we have anticipated, we focus only on the long time behavior of the solutions of Eq. (1.1), leaving the study of the remaining properties as open problems for future research.

3. Solution of the Equation

In the first part of this section, we derive the Green's function (1.12) in a way which will make clear the connection between Eq. (1.6) and the equation of the so-called non-self-adjoint (NSA) harmonic oscillator [52–54]. This connection is important for two reasons. From a physical point of view, it gives deep insight into how the collapse of the wave function actually works. From a mathematical point of view, it will allow us to prove rigorously both Theorems 1.1 and 1.2 presented in the introductory section. A way to connect Eq. (1.6) with that of the NSA harmonic oscillator is to apply suitable transformations to the wave function in such a way as to transform the SDE into a Schrödinger-like equation. We will do this in two steps.
We present this section in detail for convenience, although the approach goes back to Kolokoltsov [35].

1. Reduction of Eq. (1.6) to a linear differential equation with random coefficients

The idea is to remove the stochastic differential term $\sqrt{\lambda}\,q\,d\xi_t$ from Eq. (1.6): borrowing the language of quantum mechanics, we shift to a sort of interaction picture by defining a suitable operator which maps the solution of Eq. (1.6) to the solution
of a new equation which does not have that stochastic term. To this end, let us consider the operator $Q_a : D(Q_a) \subseteq L^2(\mathbb{R}) \to L^2(\mathbb{R})$ defined as follows:
\[ Q_a\phi(x) = e^{ax}\phi(x), \qquad a \in \mathbb{C}; \tag{3.1} \]
where $D(Q_a)$ is defined as the set of all $\phi(x) \in L^2(\mathbb{R})$ such that $e^{ax}\phi(x) \in L^2(\mathbb{R})$. It should be noted that, in general, the operator $Q_a$ is unbounded and its domain $D(Q_a)$ is dense in $L^2(\mathbb{R})$ but does not coincide with it. We will settle all technical issues in the second part of the section. We now define the vector:
\[ \phi_t^{(1)} = Q_{-\sqrt{\lambda}\xi_t}\,\phi_t; \tag{3.2} \]
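an easy application of Itô calculus shows that $\phi_t^{(1)}$ satisfies the differential equation:
\[ d\phi_t^{(1)} = \left[-\frac{i}{\hbar}\,Q_{-\sqrt{\lambda}\xi_t}\,\frac{p^2}{2m}\,Q^{-1}_{-\sqrt{\lambda}\xi_t} - \lambda q^2\right]\phi_t^{(1)}\,dt, \qquad \phi_0^{(1)} = \phi. \tag{3.3} \]
The stochastic differential $\sqrt{\lambda}\,q\,d\xi_t$ has disappeared; in turn, the free Hamiltonian $p^2/2m$ has been replaced by the operator $Q_{-\sqrt{\lambda}\xi_t}(p^2/2m)Q^{-1}_{-\sqrt{\lambda}\xi_t}$ which, due to the specific commutation relations between $q$ and $p$, takes the simple form:
\[ Q_{-\sqrt{\lambda}\xi_t}\,p^2\,Q^{-1}_{-\sqrt{\lambda}\xi_t} = p^2 - 2i\hbar\sqrt{\lambda}\,\xi_t\,p - \lambda\hbar^2\xi_t^2. \tag{3.4} \]
Equation (3.3) can then be rewritten as follows:
\[ i\hbar\frac{d}{dt}\phi_t^{(1)} = \left[\frac{p^2}{2m} - i\hbar\lambda q^2 - \frac{i\hbar}{m}\sqrt{\lambda}\,\xi_t\,p - \frac{\lambda\hbar^2}{2m}\xi_t^2\right]\phi_t^{(1)}. \tag{3.5} \]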
This is a standard differential equation with random coefficients; note that the operator on the right-hand side is not self-adjoint, due to the presence of the second and third terms. The last term of Eq. (3.5) is a multiple of the identity operator and can be removed by defining:
\[ \phi_t^{(2)} = \exp\left[-\frac{i\lambda\hbar}{2m}\int_0^t \xi_s^2\,ds\right]\phi_t^{(1)}; \tag{3.6} \]
we then obtain:
\[ i\hbar\frac{d}{dt}\phi_t^{(2)} = \left[\frac{p^2}{2m} - i\hbar\lambda q^2 - \frac{i\hbar}{m}\sqrt{\lambda}\,\xi_t\,p\right]\phi_t^{(2)}. \tag{3.7} \]
The third term on the right-hand side contains a time-dependent coefficient, and the next step aims at removing it.

2. Reduction of Eq. (3.7) to a differential equation with constant coefficients

The idea we now follow is to perform a transformation similar to a boost. We introduce the operator $P_a : D(P_a) \subseteq L^2(\mathbb{R}) \to L^2(\mathbb{R})$ defined as:
\[ P_{ia/\hbar}\,\phi(x) = \phi(x + a), \qquad a \in \mathbb{C}, \tag{3.8} \]
where $D(P_a)$ is the set of all $\phi(x) \in L^2(\mathbb{R})$ which can be analytically continued to the line $x + a$ in the complex plane $\mathbb{C}$, and such that $\phi(x + a) \in L^2(\mathbb{R})$. Similarly to $Q_a$, also $P_a$ is in general an unbounded operator and its domain $D(P_a)$, though dense, does not coincide with $L^2(\mathbb{R})$; we will come back to this point later in this section. We define the operator:
\[ V_t = \exp(-i a_t/\hbar)\,P_{i b_t/\hbar}\,Q_{-i c_t/\hbar}, \tag{3.9} \]
where the coefficients $a_t$, $b_t$ and $c_t$, yet to be determined, will turn out to be complex random functions of time. One can easily verify that:
\[ V_t\,q\,V_t^{-1} = q + b_t, \tag{3.10} \]
\[ V_t\,p\,V_t^{-1} = p + c_t, \tag{3.11} \]
and similarly for higher powers of $q$ and $p$. Let us define the vector:
\[ \varphi_t = V_t\,\phi_t^{(2)}, \tag{3.12} \]
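which solves the equation:
\[ i\hbar\frac{d}{dt}\varphi_t = \left[\frac{p^2}{2m} - i\hbar\lambda q^2 - \left(\dot b_t - \frac{1}{m}c_t + \frac{i\hbar}{m}\sqrt{\lambda}\,\xi_t\right)p + (\dot c_t - 2i\hbar\lambda b_t)\,q + \left(\dot a_t + \dot c_t b_t + \frac{1}{2m}c_t^2 - \frac{i\hbar}{m}\sqrt{\lambda}\,\xi_t c_t - i\hbar\lambda b_t^2\right)\right]\varphi_t. \tag{3.13} \]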
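The time-dependent part of the equation can be removed by requiring that $a_t$, $b_t$ and $c_t$ satisfy the first-order differential equations:
\[ \begin{cases} m\dot b_t - c_t = -i\hbar\sqrt{\lambda}\,\xi_t, & b_0 = 0, \\ \dot c_t - 2i\hbar\lambda b_t = 0, & c_0 = 0, \end{cases} \tag{3.14} \]
and
\[ \dot a_t + i\hbar\lambda b_t^2 + \frac{1}{2m}c_t^2 - \frac{i\hbar}{m}\sqrt{\lambda}\,\xi_t c_t = 0, \qquad a_0 = 0. \tag{3.15} \]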
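The structure of the linear system (3.14) can be illustrated with a toy computation. The sketch below is only an illustration under loud assumptions: the noise $\xi_t$ is frozen to a constant value `XI` (in the actual problem it is a random process), and units with $\hbar = m = \lambda = 1$ are used. For constant $\xi$ the system has the closed-form solution $b_t = -i\hbar\sqrt{\lambda}\,\xi\,\sinh(\upsilon t)/(m\upsilon)$, $c_t = i\hbar\sqrt{\lambda}\,\xi\,(1 - \cosh\upsilon t)$ with $\upsilon^2 = 2i\hbar\lambda/m$, against which a standard fourth-order Runge-Kutta integration is compared.

```python
import cmath
import math

HBAR = M = LAM = 1.0   # illustrative units, not physical values
XI = 0.3               # frozen value of xi_t; purely hypothetical
UPS = cmath.sqrt(2j * HBAR * LAM / M)  # satisfies ups^2 = 2 i hbar lam / m


def rhs(b, c):
    """Right-hand sides (db/dt, dc/dt) of the system (3.14) with xi_t = XI."""
    db = (c - 1j * HBAR * math.sqrt(LAM) * XI) / M
    dc = 2j * HBAR * LAM * b
    return db, dc


def integrate(t_end, steps=2000):
    """Classical RK4 integration of (3.14) from b_0 = c_0 = 0 up to t_end."""
    b = c = 0j
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(b, c)
        k2 = rhs(b + 0.5 * h * k1[0], c + 0.5 * h * k1[1])
        k3 = rhs(b + 0.5 * h * k2[0], c + 0.5 * h * k2[1])
        k4 = rhs(b + h * k3[0], c + h * k3[1])
        b += h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        c += h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return b, c


def closed_form(t):
    """Closed-form solution of (3.14) for constant xi_t = XI."""
    amp = 1j * HBAR * math.sqrt(LAM) * XI
    b = -amp * cmath.sinh(UPS * t) / (M * UPS)
    c = amp * (1.0 - cmath.cosh(UPS * t))
    return b, c


if __name__ == "__main__":
    print("numerical  :", integrate(2.0))
    print("closed form:", closed_form(2.0))
```

Even in this frozen-noise caricature both $b_t$ and $c_t$ acquire non-vanishing real and imaginary parts, which is why the operators $P_a$ and $Q_a$ have to be defined for complex $a$.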
The first two equations form a non-homogeneous linear system of first-order differential equations, which has a unique $Q$-a.s. continuous random solution; the third equation instead determines the global factor $a_t$, which is also random. With such a choice for the three parameters, Eq. (3.13) becomes:
\[ i\hbar\frac{d}{dt}\varphi_t = \left[\frac{p^2}{2m} - i\hbar\lambda q^2\right]\varphi_t, \qquad \varphi_0 = \phi, \tag{3.16} \]
which is the equation of the so-called non-self-adjoint (NSA) harmonic oscillator, whose solution and most important properties are well known. Before continuing, we note that in the case of a more general Hamiltonian $H = p^2/2m + V(q)$ appearing in Eq. (1.1) in place of just the free evolution $p^2/2m$, the potential $V(q)$ would have
been transformed, when going from Eq. (3.7) to Eq. (3.16), according to the rule $V_t V(q) V_t^{-1} = V(q + b_t)$; in this case, we would not be able to remove completely the time-dependent terms from the equation and we would not be able to reduce the original equation to one whose solution is known. However, besides the free particle case, all equations containing terms at most quadratic in $q$ and $p$ (among them, the important case of the harmonic oscillator) can be solved in a similar way. The solution of Eq. (3.16) admits a representation in terms of the Green's function, also known as Mehler's formula:
\[ G_t^{\mathrm{NSA}}(x, y) = \sqrt{\frac{\lambda}{\upsilon\pi\sinh\upsilon t}}\,\exp\left[-\frac{\lambda}{\upsilon}(x^2 + y^2)\coth\upsilon t + 2\frac{\lambda}{\upsilon}\,xy\,\sinh^{-1}\upsilon t\right], \tag{3.17} \]
with $\upsilon$ and $\omega$ defined as in (1.19). In this way, we have established the link between the solutions of the SDE (1.6) and those of the equation for the NSA harmonic oscillator (3.16), which we summarize in the following lemma, whose proof is straightforward.

Lemma 3.1. Let $T_t^{\mathrm{NSA}}$ be the evolution operator represented by the Green's function $G_t^{\mathrm{NSA}}(x, y)$ and $T_t$ the one represented by $G_t(x, y)$; then:
\[ T_t \equiv \exp(i\vartheta_t/\hbar)\,Q_{\sqrt{\lambda}\xi_t + (i c_t/\hbar)}\,P_{-i b_t/\hbar}\,T_t^{\mathrm{NSA}}, \tag{3.18} \]
where the two random functions $b_t$ and $c_t$ solve the linear system (3.14), and $\vartheta_t$, which includes all global phase factors, i.e. those independent of $x$, solves the equation:
\[ \dot\vartheta_t = -i\hbar\lambda b_t^2 - \frac{1}{2m}c_t^2 + \frac{i\hbar}{m}\sqrt{\lambda}\,\xi_t c_t + \frac{\lambda\hbar^2}{2m}\xi_t^2, \qquad \vartheta_0 = 0. \tag{3.19} \]
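That Mehler's formula (3.17) indeed solves Eq. (3.16) can be checked numerically with finite differences. The sketch below is a sanity check rather than a proof; it works in illustrative units $\hbar = m = \lambda = 1$ and assumes the relation $\upsilon^2 = 2i\hbar\lambda/m$ (so that $\upsilon = 1 + i$ here), which is the relation between $\upsilon$ and the physical constants used throughout this section. The step size and the sample points are arbitrary choices.

```python
import cmath

HBAR = M = LAM = 1.0                    # illustrative units
UPS = cmath.sqrt(2j * HBAR * LAM / M)   # assumed relation ups^2 = 2 i hbar lam / m


def g_nsa(t, x, y):
    """Green's function of the NSA harmonic oscillator, Eq. (3.17)."""
    s = cmath.sinh(UPS * t)
    coth = cmath.cosh(UPS * t) / s
    pref = cmath.sqrt(LAM / (UPS * cmath.pi * s))
    expo = -(LAM / UPS) * (x * x + y * y) * coth + 2.0 * (LAM / UPS) * x * y / s
    return pref * cmath.exp(expo)


def residual(t, x, y, h=1e-4):
    """Relative residual of i*hbar*dG/dt = [p^2/2m - i*hbar*lam*x^2] G."""
    g = g_nsa(t, x, y)
    dg_dt = (g_nsa(t + h, x, y) - g_nsa(t - h, x, y)) / (2.0 * h)
    d2g_dx2 = (g_nsa(t, x + h, y) - 2.0 * g + g_nsa(t, x - h, y)) / h**2
    lhs = 1j * HBAR * dg_dt
    rhs = -(HBAR**2 / (2.0 * M)) * d2g_dx2 - 1j * HBAR * LAM * x * x * g
    return abs(lhs - rhs) / abs(g)


if __name__ == "__main__":
    for point in ((0.7, 0.3, -0.5), (1.3, -0.4, 0.8)):
        print(point, residual(*point))
```

The residual is limited only by the truncation and round-off errors of the central differences, so values many orders of magnitude below 1 are what one expects to see.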
We now proceed to prove in which sense $\phi_t := T_t\phi$ is the topological strong solution of Eq. (1.6) for the given initial condition $\phi$. We first need to establish some properties of the Green's function $G_t^{\mathrm{NSA}}(x, y)$ which will be necessary for the subsequent theorem.

Lemma 3.2. The absolute value of $G_t^{\mathrm{NSA}}(x, y)$ is equal to:
\[ |G_t^{\mathrm{NSA}}(x, y)| = \sqrt{\frac{2\lambda}{\pi\omega\sqrt{\cosh\omega t - \cos\omega t}}}\,\exp\left[-\frac{\lambda}{\omega}(x^2 + y^2)\,p_t + 4\frac{\lambda}{\omega}\,xy\,q_t\right], \tag{3.20} \]
where we have introduced the following quantities:
\[ p_t = \frac{\sinh\omega t - \sin\omega t}{\cosh\omega t - \cos\omega t}, \tag{3.21} \]
\[ q_t = \frac{\sinh(\omega t/2)\cos(\omega t/2) - \cosh(\omega t/2)\sin(\omega t/2)}{\cosh\omega t - \cos\omega t}; \tag{3.22} \]
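note that the function $p_t$ is positive for any $t > 0$. The integral of $|G_t^{\mathrm{NSA}}(x, y)|^2$ with respect to $y$ is equal to:
\[ \int dy\,|G_t^{\mathrm{NSA}}(x, y)|^2 = \sqrt{\frac{2\lambda}{\pi\omega(\sinh\omega t - \sin\omega t)}}\,\exp\left[-2\frac{\lambda}{\omega}\,\frac{p_t^2 - 4q_t^2}{p_t}\,x^2\right]. \tag{3.23} \]
A simple calculation shows that $p_t^2 - 4q_t^2 > 0$ for any $t > 0$; this means that $G_t^{\mathrm{NSA}}(x, \cdot)$, taken as a function of $y$, belongs to $L^2(\mathbb{R})$ for any $x \in \mathbb{R}$ and $t > 0$; moreover:
\[ \int dx\,\|G_t^{\mathrm{NSA}}(x, \cdot)\|^2 < +\infty \quad \text{for any } t > 0. \tag{3.24} \]
Finally, the following expression holds true:
\[ \int dy\,|e^{bx}G_t^{\mathrm{NSA}}(x + a, y)|^2 = \sqrt{\frac{2\lambda}{\pi\omega(\sinh\omega t - \sin\omega t)}}\,\exp\Bigg\{-2\frac{\lambda}{\omega}\Bigg[\frac{p_t^2 - 4q_t^2}{p_t}\,x^2 + 2\left(p_t a_R + \bar p_t a_I - 4\frac{q_t}{p_t}(q_t a_R + \bar q_t a_I)\right)x + p_t(a_R^2 - a_I^2) + 2\bar p_t a_R a_I - 4\frac{(q_t a_R + \bar q_t a_I)^2}{p_t}\Bigg] + 2 b_R x\Bigg\}, \tag{3.25} \]
with
\[ \bar p_t = \frac{\sinh\omega t + \sin\omega t}{\cosh\omega t - \cos\omega t}, \tag{3.26} \]
\[ \bar q_t = \frac{\sinh(\omega t/2)\cos(\omega t/2) + \cosh(\omega t/2)\sin(\omega t/2)}{\cosh\omega t - \cos\omega t}. \tag{3.27} \]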
The above formulas imply that, for any $a, b \in \mathbb{C}$, for any $x \in \mathbb{R}$ and for any $t > 0$, the function $e^{bx}G_t^{\mathrm{NSA}}(x + a, \cdot)$ belongs to $L^2(\mathbb{R})$ and:
\[ \int dx\,\|e^{bx}G_t^{\mathrm{NSA}}(x + a, \cdot)\|^2 < +\infty. \tag{3.28} \]
We are now in a position to state and prove the main theorem of this section.

Theorem 3.1. Let $P_a$ and $Q_a$ be defined, respectively, as in (3.8) and (3.1); let $b_t$ and $c_t$ solve the linear system (3.14) and $\vartheta_t$ be the solution of Eq. (3.19). Finally, let $\phi_t = T_t\phi$, with $\phi \in L^2(\mathbb{R})$ and $T_t$ defined as in (3.18). Then the following three statements hold true with probability 1:
(3.29)
(2) φ ∈ L2B (R) ⇒ φt is a topological strong solution of Eq. (1.6).
(3.30)
(3) φ ∈ L2 (R) ⇒ limt→0 φt − φ = 0.
(3.31)
Proof. Statement 1. Let $\phi$ belong to $L^2(\mathbb{R})$; since also $G_t^{\mathrm{NSA}}(x, \cdot)$ belongs to $L^2(\mathbb{R})$ for any $x \in \mathbb{R}$ and $t > 0$, Hölder's inequality implies that $G_t^{\mathrm{NSA}}(x, \cdot)\phi$ belongs to $L^1(\mathbb{R})$; accordingly, the operator $T_t^{\mathrm{NSA}}$ is well defined for any $t > 0$, and maps any $L^2(\mathbb{R})$-function into a measurable function. By using the Schwarz inequality together with relation (3.24), we have:
\[ \int dx\left|\int dy\,G_t^{\mathrm{NSA}}(x, y)\phi(y)\right|^2 \leq \|\phi\|^2\int dx\,\|G_t^{\mathrm{NSA}}(x, \cdot)\|^2 < +\infty; \tag{3.32} \]
thus $T_t^{\mathrm{NSA}}\phi$ belongs to $L^2(\mathbb{R})$ for any $\phi$ in $L^2(\mathbb{R})$ and for any $t > 0$. In a similar way, since also $G_t^{\mathrm{NSA}}(x + a, \cdot)$ belongs to $L^2(\mathbb{R})$ for any $a \in \mathbb{C}$ and because of (3.28), one proves that $P_a T_t^{\mathrm{NSA}}\phi$ belongs to $L^2(\mathbb{R})$ for any $\phi \in L^2(\mathbb{R})$, for any complex $a$ and for any $t > 0$, i.e. that $D(P_a)$ contains $R(T_t^{\mathrm{NSA}})$. Using once more the same inequalities and (3.28), one shows also that $Q_b P_a T_t^{\mathrm{NSA}}\phi$ belongs to $L^2(\mathbb{R})$ for any $\phi$ in $L^2(\mathbb{R})$, for any $a, b \in \mathbb{C}$ and $t > 0$.

Remark. Actually a stronger statement is true, as can be readily seen from the Gaussian form of the Green's function $G_t$ of the operator $T_t$: for positive $t$, it maps $L^2(\mathbb{R})$ to the Schwartz space $\mathcal{S}(\mathbb{R})$. We shall need this information in the proof of Statement 3.

Statement 2. Let us consider the vector $\varphi_t := T_t^{\mathrm{NSA}}\phi$, with $\phi \in L^2_B(\mathbb{R})$. By construction, $\varphi_t$ solves Eq. (3.16), once one proves that the integration
\[ \int dy\,G_t^{\mathrm{NSA}}(x, y)\phi(y) \tag{3.33} \]
can be exchanged with the first and second partial derivatives with respect to $x$ and with the first partial derivative with respect to $t$. We note that the function $G_t^{\mathrm{NSA}}(x, y)\phi(y)$ satisfies the following two properties: (i) the function $y \mapsto G_t^{\mathrm{NSA}}(x, y)\phi(y)$ is measurable and integrable on $\mathbb{R}$ for any $t > 0$ and for any $x \in \mathbb{R}$; (ii) the first and second partial derivatives with respect to $x$ and the first partial derivative with respect to $t$ exist for any $t > 0$, $x \in \mathbb{R}$ and $y \in \mathbb{R}$ and can be bounded uniformly with respect to $t$ and $x$. Accordingly, one can apply, e.g., [50, Theorem 12.13, p. 199] to conclude that the operations of integration and differentiation can be exchanged. Having proved that $\varphi_t$ solves Eq. (3.16), a direct application of Itô calculus proves that $\phi_t$, defined as in (3.18), is a topological strong solution of Eq. (1.6).

Statement 3. Let $\phi = \phi_0 \in C_c^\infty(\mathbb{R})$ be given. Since $\phi_t$ solves Eq. (1.6) in a strong sense, it also solves the SDE in a weak sense; hence, using, e.g., [31, Eq. (1.1)], one has:
\[ \lim_{t\to0}\langle\varphi|\phi_t\rangle = \langle\varphi|\phi_0\rangle \qquad \forall\,\varphi \in C_c^\infty(\mathbb{R}). \tag{3.34} \]
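We extend (3.34) to the general case of $\varphi \in L^2(\mathbb{R})$. Since $C_c^\infty(\mathbb{R})$ is dense in $L^2(\mathbb{R})$, for any $\varphi \in L^2(\mathbb{R})$ there exists a sequence $\{\varphi_n \in C_c^\infty(\mathbb{R}), n \in \mathbb{N}\}$ which approximates it. By the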
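triangle and Schwarz inequalities we get
\[ |\langle\varphi|\phi_t\rangle - \langle\varphi|\phi_0\rangle| \leq |\langle\varphi_n|\phi_t\rangle - \langle\varphi_n|\phi_0\rangle| + \|\varphi - \varphi_n\|\,\|\phi_t\| + \|\varphi - \varphi_n\|\,\|\phi_0\|. \tag{3.35} \]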
The first term on the right-hand side can be made arbitrarily small because of (3.34); the second and third terms can also be made arbitrarily small by choosing $n$ sufficiently large, while $\|\phi_t\|$ can be bounded as it converges to $\|\phi_0\|$ for $t \to 0$, due to Eq. (1.9). This proves that:
\[ \lim_{t\to0}\langle\varphi|\phi_t\rangle = \langle\varphi|\phi_0\rangle \qquad \forall\,\varphi \in L^2(\mathbb{R}). \tag{3.36} \]
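Statement 3 for test functions $\phi \in C_c^\infty(\mathbb{R})$ now follows directly from Eq. (1.9), Eq. (3.36) and observing that $\|\phi_t - \phi_0\|^2 = \|\phi_t\|^2 + \|\phi_0\|^2 - 2\langle\phi_0|\phi_t\rangle^R$. It remains to extend the strong continuity of $T_t$ from the subspace $C_c^\infty(\mathbb{R})$ to $L^2(\mathbb{R})$. For this, observe that for $\phi \in C_c^\infty(\mathbb{R})$, $(\|\phi_t\|^2)_{t\geq0}$ defines a stochastic process with continuous paths and by Holevo's result (cf. Eq. (1.9)) it is a martingale. For given $f \in L^2(\mathbb{R})$ choose a sequence $(\varphi^n)_{n\in\mathbb{N}} \subset C_c^\infty(\mathbb{R})$ which converges to $f$ in $L^2(\mathbb{R})$. Doob's inequality for submartingales implies that for all $n, m \in \mathbb{N}$, $T > 0$ and $\lambda > 0$
\[ Q\left[\sup_{0\leq t\leq T}\left|\|\varphi_t^n\|^2 - \|\varphi_t^m\|^2\right| > \lambda\right] \leq \frac{1}{\lambda}\,E_Q\left[\left|\|\varphi_T^n\|^2 - \|\varphi_T^m\|^2\right|\right]. \tag{3.37} \]
We now show that
\[ \lim_{n,m\to\infty}E_Q\left[\left|\|\varphi_T^n\|^2 - \|\varphi_T^m\|^2\right|\right] = 0. \tag{3.38} \]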
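The elementary inequality
\[ \left|\|\varphi_t^n\|^2 - \|\varphi_t^m\|^2\right| \leq (\|\varphi_t^n\| + \|\varphi_t^m\|)\,\|\varphi_t^n - \varphi_t^m\| \]
implies that
\[ \begin{aligned} E_Q\left[\left|\|\varphi_t^n\|^2 - \|\varphi_t^m\|^2\right|\right] &\leq E_Q\left[(\|\varphi_t^n\| + \|\varphi_t^m\|)\,\|\varphi_t^n - \varphi_t^m\|\right] \\ &\leq \left(E_Q[(\|\varphi_t^n\| + \|\varphi_t^m\|)^2]\right)^{1/2}\left(E_Q[\|\varphi_t^n - \varphi_t^m\|^2]\right)^{1/2} \\ &\leq \sqrt{2}\left(E_Q[\|\varphi_t^n\|^2] + E_Q[\|\varphi_t^m\|^2]\right)^{1/2}\|\varphi^n - \varphi^m\| \\ &= \sqrt{2}\left(\|\varphi^n\|^2 + \|\varphi^m\|^2\right)^{1/2}\|\varphi^n - \varphi^m\|. \end{aligned} \]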
The right-hand side converges to 0 as $n, m \to \infty$. Therefore, the sequence of stochastic processes $(\|\varphi_t^n\|^2)_{t\geq0}$ is a Cauchy sequence in the complete metric space $(D, d)$ of adapted processes with right-continuous paths having left limits, where the metric $d$ is defined as (see [51, pp. 56–57] for background concerning this topology)
\[ d(X, Y) = \sum_{n=1}^{\infty}\frac{1}{2^n}\,E_Q\left[\min\left(1, \sup_{0\leq s\leq n}|(X - Y)_s|\right)\right] \qquad (X, Y \in D). \]
Therefore $(\|\varphi_t^n\|^2)_{t\geq0}$ converges locally uniformly in probability to a stochastic process. This stochastic process again has to be continuous almost surely, since a subsequence of $(\|\varphi_t^n\|^2)_{t\geq0}$ converges locally uniformly with probability one. Since $\lim_{n\to\infty}\|\varphi_t^n\|^2 = \|f_t\|^2$ almost surely, we know that $[0, \infty) \ni t \mapsto \|f_t\|^2$ is continuous, in particular $\lim_{t\to0}\|f_t\| = \|f\|$ almost surely, and that it defines, by the lemma of Fatou, a positive continuous supermartingale. Therefore, it has a unique decomposition $\|f_t\|^2 = M_t - A_t$, where $(M_t)_{t\geq0}$ is a continuous martingale and $(A_t)_{t\geq0}$ is an increasing process. In fact, as we shall show now, the increasing process is identically 0, i.e. $(\|f_t\|^2)_{t\geq0}$ is a positive martingale for every $f \in L^2(\mathbb{R})$. For that, recall from the remark above that for positive $\varepsilon$ the function $f_\varepsilon$ almost surely belongs to the Schwartz space and in particular to the domain of the generator. By Holevo's result cited above, $(\|T_{t-\varepsilon}f_\varepsilon\|^2)_{t\geq\varepsilon}$ is a continuous martingale. Therefore, $A_t = 0$ for $t > 0$ and hence it equals 0 almost surely.

In order to ensure strong convergence $\lim_{t\to0}\|f_t - f\| = 0$ we need only show that weak convergence holds, i.e. $\lim_{t\to0}\langle\phi|f_t\rangle = \langle\phi|f\rangle$. Observing that
\[ |\langle\phi|f_t\rangle - \langle\phi|f\rangle| \leq |\langle\phi|f_t\rangle - \langle\phi|\varphi_t^n\rangle| + |\langle\phi|\varphi_t^n\rangle - \langle\phi|\varphi^n\rangle| + |\langle\phi|\varphi^n\rangle - \langle\phi|f\rangle|, \]
it suffices to show that for some $T > 0$, $\lim_{n\to\infty}\sup_{t\leq T}|\langle\phi|f_t\rangle - \langle\phi|\varphi_t^n\rangle| = 0$. But $\sup_{t\leq T}|\langle\phi|f_t\rangle - \langle\phi|\varphi_t^n\rangle| \leq \|\phi\|\sup_{t\leq T}\|f_t - \varphi_t^n\|$. Therefore, we need only establish that $\lim_{n\to\infty}\sup_{t\leq T}\|f_t - \varphi_t^n\| = 0$. This is done by an argument similar to the one above, namely we show that for every $\varepsilon > 0$
\[ \lim_{n\to\infty}Q\left[\sup_{t\leq T}\|f_t - \varphi_t^n\|^2 > \varepsilon\right] = 0, \]
because then there exists a subsequence which is almost surely convergent to 0. But as we showed above, $(\|g_t\|^2)_{t\geq0}$ is a martingale for every $g \in L^2(\mathbb{R})$. Hence $(\|f_t - \varphi_t^n\|^2)_{t\geq0}$ is a martingale and we can again apply Doob's inequality as we did before.

Remark 1. The Gaussian form of the Green's function (1.12) is a consequence of the fact that Eq. (1.6) contains terms which are at most quadratic in $q$ and $p$. This in particular implies that the dynamics preserves the shape of initially Gaussian wave functions; in fact, as shown e.g. in [30, 34, 35, 39], a state
\[ \phi_t(x) = \exp[-\sigma_t(x - x_t^m)^2 + i k_t^m x + \varsigma_t], \tag{3.39} \]
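is a solution of Eq. (1.6) provided that the two real parameters $x_t^m$, $k_t^m$ and the two complex parameters $\sigma_t$, $\varsigma_t$ satisfy the following stochastic differential equations:
\[ d\sigma_t = \left[\lambda - \frac{2i\hbar}{m}(\sigma_t)^2\right]dt, \tag{3.40} \]
\[ dx_t^m = \frac{\hbar}{m}k_t^m\,dt + \frac{\sqrt{\lambda}}{2\sigma_t^R}\,[d\xi_t - 2\sqrt{\lambda}\,x_t^m\,dt], \tag{3.41} \]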
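\[ dk_t^m = -\sqrt{\lambda}\,\frac{\sigma_t^I}{\sigma_t^R}\,[d\xi_t - 2\sqrt{\lambda}\,x_t^m\,dt], \tag{3.42} \]
\[ d\varsigma_t^R = \left[\lambda(x_t^m)^2 + \frac{\hbar}{m}\sigma_t^I + \frac{\lambda}{4\sigma_t^R}\right]dt + \sqrt{\lambda}\,x_t^m\,[d\xi_t - 2\sqrt{\lambda}\,x_t^m\,dt], \tag{3.43} \]
\[ d\varsigma_t^I = \left[-\frac{\hbar}{2m}(k_t^m)^2 - \frac{\hbar}{m}\sigma_t^R + \frac{\lambda\sigma_t^I}{4(\sigma_t^R)^2}\right]dt + \sqrt{\lambda}\,\frac{\sigma_t^I}{\sigma_t^R}\,x_t^m\,[d\xi_t - 2\sqrt{\lambda}\,x_t^m\,dt]. \tag{3.44} \]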
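Equation (3.40) is a deterministic Riccati equation, and a candidate closed form $\sigma_t = (\lambda/\upsilon)\coth(\upsilon t + \kappa)$ can be tested numerically. The sketch below is a finite-difference check in illustrative units $\hbar = m = \lambda = 1$, assuming $\upsilon^2 = 2i\hbar\lambda/m$; the value of $\kappa$ is an arbitrary hypothetical initial condition.

```python
import cmath

HBAR = M = LAM = 1.0                    # illustrative units
UPS = cmath.sqrt(2j * HBAR * LAM / M)   # assumed relation ups^2 = 2 i hbar lam / m
KAPPA = 0.4 + 0.1j                      # hypothetical constant fixing sigma_0


def sigma(t):
    """Candidate solution sigma_t = (lam/ups) * coth(ups*t + kappa)."""
    arg = UPS * t + KAPPA
    return (LAM / UPS) * cmath.cosh(arg) / cmath.sinh(arg)


def riccati_residual(t, h=1e-5):
    """|d sigma/dt - (lam - (2 i hbar / m) sigma^2)| with a central difference."""
    dsigma = (sigma(t + h) - sigma(t - h)) / (2.0 * h)
    return abs(dsigma - (LAM - (2j * HBAR / M) * sigma(t)**2))


if __name__ == "__main__":
    print("residual at t = 0.5 :", riccati_residual(0.5))
    print("sigma at large time :", sigma(30.0))
    print("lam / ups           :", LAM / UPS)
```

For large times $\coth(\upsilon t + \kappa) \to 1$, so $\sigma_t$ approaches the fixed point $\lambda/\upsilon$ of (3.40) irrespective of $\kappa$; this is the asymptotic width of the Gaussian state.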
In particular, the solution of Eq. (3.40) is $\sigma_t = (\lambda/\upsilon)\coth(\upsilon t + \kappa)$, where $\kappa$ sets the initial condition. These results will be useful in the subsequent analysis.

4. Representation of the Solution in Terms of Eigenstates of the NSA Harmonic Oscillator

We now turn to the problem of analyzing the long time behavior of the solution of the (norm-preserving) nonlinear Eq. (1.1). The representation of the solution $\phi_t$ of Eq. (1.6) in terms of the Green's function (1.12) is not suitable for controlling the long time behavior; it turns out to be more convenient to express $\phi_t$ in terms of the eigenstates of the NSA harmonic oscillator, resorting to the connection which we previously established between Eqs. (1.6) and (3.16). In this way, as we shall see, the collapse process will be manifest: the coefficients of the superposition will decrease exponentially in time, the damping being the faster, the higher the associated eigenstate. Accordingly, when normalization is also taken into account, in the large time limit only the ground state survives, which has a Gaussian shape.

We first recall a few basic features of the Hamiltonian of the NSA harmonic oscillator,
\[ H \equiv \frac{p^2}{2m} - i\hbar\lambda q^2, \tag{4.1} \]
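which has been studied in particular by Davies in a series of papers [52, 53] and reviewed in his recent book [54]. The eigenvalues of $H$ are complex and equal to:
\[ \lambda_n \equiv \frac{1 - i}{2}\,\hbar\omega_n, \qquad \omega_n \equiv \left(n + \frac{1}{2}\right)\omega, \tag{4.2} \]
and the corresponding eigenvectors are:
\[ \phi^{(n)}(x) \equiv \sqrt{z}\,e^{-z^2x^2/2}\,\bar H_n(zx), \qquad z^2 \equiv (1 - i)\sqrt{\frac{\lambda m}{\hbar}}, \tag{4.3} \]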
where $\bar H_n(x)$ is the normalized Hermite polynomial of degree $n$. Since the argument of $\bar H_n$ in (4.3) is complex, these eigenstates are not orthogonal; it can be shown that they are linearly independent and form a complete set, however they do not form a basis. As such, they cannot be directly used to expand an initial state into a superposition of the eigenstates of $H$. This problem can be circumvented in the following way, also discussed by Davies.
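The non-orthogonality of the eigenstates (4.3) can be made concrete with a small numerical experiment. The sketch below works in illustrative units in which $\sqrt{\lambda m/\hbar} = 1$, so that $z^2 = 1 - i$; it builds the normalized Hermite polynomials from the standard three-term recurrence and uses a trapezoidal rule on a truncated interval. It checks that the integral of $\phi^{(n)}\phi^{(m)}$ taken without complex conjugation equals $\delta_{n,m}$, while the ordinary $L^2$ norms $\|\phi^{(n)}\|^2$ exceed 1 and grow with $n$. All function names are hypothetical.

```python
import cmath
import math

S = 1.0                          # sqrt(lam * m / hbar) in illustrative units
Z = cmath.sqrt((1 - 1j) * S)     # z^2 = (1 - i) sqrt(lam m / hbar), Eq. (4.3)


def hermite_normalized(n, u):
    """H_n(u) / sqrt(sqrt(pi) 2^n n!), with H_n from the standard recurrence."""
    h_prev, h = 0.0, 1.0
    for k in range(n):
        # H_{k+1}(u) = 2 u H_k(u) - 2 k H_{k-1}(u)
        h_prev, h = h, 2.0 * u * h - 2.0 * k * h_prev
    return h / math.sqrt(math.sqrt(math.pi) * 2**n * math.factorial(n))


def phi(n, x):
    """Eigenfunction phi^(n)(x) of Eq. (4.3)."""
    return cmath.sqrt(Z) * cmath.exp(-(Z * x)**2 / 2.0) * hermite_normalized(n, Z * x)


def integrate(f, half_width=10.0, steps=8000):
    """Trapezoidal rule on [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.5 * (f(-half_width) + f(half_width))
    for k in range(1, steps):
        total += f(-half_width + k * h)
    return total * h


def overlap_no_conjugate(n, m):
    """Integral of phi^(n)(x) * phi^(m)(x), without complex conjugation."""
    return integrate(lambda x: phi(n, x) * phi(m, x))


def norm_squared(n):
    """Ordinary L^2 norm squared of phi^(n)."""
    return integrate(lambda x: abs(phi(n, x))**2)


if __name__ == "__main__":
    print("overlap (2,2):", overlap_no_conjugate(2, 2))
    print("overlap (1,3):", overlap_no_conjugate(1, 3))
    print("norms squared:", [round(norm_squared(n), 4) for n in range(4)])
```

The growth of the norms with $n$ is the numerical counterpart of the exponential growth of the non-orthogonal projections discussed by Davies.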
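It is easy to see that the sequences $\{\phi^{(n)}\}$ and $\{\phi^{(n)*}\}$ form a bi-orthonormal system; one then defines the (non-orthogonal) projection operators:
\[ P_n\phi \equiv \langle\phi|\phi^{(n)*}\rangle\,\phi^{(n)} = \alpha_n\phi^{(n)}, \tag{4.4} \]
which satisfy the relations:
\[ \|P_n\| = \|\phi^{(n)}\|^2, \qquad P_nP_m = \delta_{n,m}P_n, \qquad \text{and} \qquad \lim_{n\to+\infty}\frac{\ln\|P_n\|}{n} = 2c, \tag{4.5} \]
where $c$ is an appropriate constant [54]. As we see, although the states $\phi^{(n)}$ are normalized, in the sense that
\[ \int_{-\infty}^{+\infty}\phi^{(n)}(x)\,\phi^{(m)}(x)\,dx = \delta_{n,m}, \tag{4.6} \]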
the norm of the projection operators $P_n$ grows exponentially as $n \to +\infty$. Finally, the following equality holds true [54]:
\[ T_t^{\mathrm{NSA}} = \sum_{n=0}^{\infty}e^{-(1+i)\omega_n t/2}\,P_n \qquad \text{for } t > 4c/\omega. \tag{4.7} \]
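The threshold $t > 4c/\omega$ in (4.7) comes from balancing the decay $|e^{-(1+i)\omega_n t/2}| = e^{-(n+1/2)\omega t/2}$ against the growth $\|P_n\| \sim e^{2cn}$. A minimal arithmetic sketch follows; the value $c = 1$ is purely an illustrative assumption standing in for a constant of order 1, and $\omega$ is the value quoted in Sec. 2.

```python
OMEGA = 5.01e-5  # 1/s, value quoted in Sec. 2


def expansion_threshold(c):
    """Time 4c/omega (in seconds) after which the series (4.7) converges."""
    return 4.0 * c / OMEGA


def log_term_bound(n, t, c):
    """Logarithm of the rough bound exp(2cn) * exp(-(n + 1/2) * omega * t / 2)."""
    return 2.0 * c * n - (n + 0.5) * OMEGA * t / 2.0


if __name__ == "__main__":
    c = 1.0  # hypothetical value, for illustration only
    t_star = expansion_threshold(c)
    print(f"4c/omega = {t_star:.0f} s = {t_star / 3600.0:.1f} h")
    for t in (0.5 * t_star, 2.0 * t_star):
        print(f"t = {t:9.0f} s:", [round(log_term_bound(n, t, c), 2) for n in (0, 10, 20)])
```

Below the threshold the logarithm of the bound increases with $n$, so the terms cannot be controlled; above it the bound decreases linearly in $n$ and the series converges geometrically.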
A remarkable property of the above representation of the solution of Eq. (3.16) in terms of the eigenstates of the operator (4.1) is that it holds not for any $t \geq 0$, as one would naively expect, but only for $t > 4c/\omega$. The reason is that the norm of the projection operators $P_n$ grows exponentially with $n$, so one has to wait for $t$ to be large enough in order for the term $e^{-n\omega t/2}$ to suppress the exponential growth of the projectors. From a physical point of view, recalling the discussion of Sec. 2, since the constant $c$ is of order 1 [54] and $\omega \simeq 5.01 \times 10^{-5}\ \mathrm{sec^{-1}}$, we see that the representation (4.7) holds true only in part of the classical regime and in the diffusive regime, which is the one we are interested in studying now, but not in the physically more crucial collapse regime.

We now apply the above results to our problem; we will first proceed in an informal way, and at the end we will prove the relevant theorems. Let $\phi \in L^2(\mathbb{R})$; then, according to (3.18) and (4.7):
\[ \phi_t(x) = T_t\phi = e^{[\sqrt{\lambda}\xi_t + i c_t/\hbar]x + i\vartheta_t/\hbar}\sum_{n=0}^{+\infty}\alpha_n e^{-(1+i)\omega_n t/2}\,\phi^{(n)}(x - b_t) \tag{4.8} \]
\[ \phantom{\phi_t(x)} = e^{-z^2(x - \bar x_t)^2/2 + i\bar k_t x + \gamma_t}\,\sqrt{z}\sum_{n=0}^{+\infty}\alpha_n e^{-(1+i)\omega_n t/2}\,\bar H_n[z(x - b_t)], \tag{4.9} \]
where αn = φ|φ(n) (see Eq. (4.4)), while the two real parameters x¯t , k¯t and the complex parameter γt are defined as follows: √ I I (4.10) x ¯t = bR t + bt − (2/mω)ct + (ω/2 λ)ξt , √ I λξt , (4.11) k¯t = (mω/)bIt + (1/)(cR t − ct ) + ¯2t ) + (i/)θt . γt = −(1 − i)(mω/4)(b2t − x
(4.12)
By resorting to Eqs. (3.14) and (3.19), and after a rather long calculation, we obtain the following set of SDEs for these parameters:
$$d\bar x_t = \frac{\hbar}{m}\,\bar k_t\,dt + \sqrt{\frac{\hbar}{m}}\,[d\xi_t - 2\sqrt{\lambda}\,\bar x_t\,dt], \tag{4.13}$$
$$d\bar k_t = \sqrt{\lambda}\,[d\xi_t - 2\sqrt{\lambda}\,\bar x_t\,dt], \tag{4.14}$$
$$d\gamma_t^R = \Big[\lambda\bar x_t^2 + \frac{\omega}{4}\Big]dt + \sqrt{\lambda}\,\bar x_t\,[d\xi_t - 2\sqrt{\lambda}\,\bar x_t\,dt], \tag{4.15}$$
$$d\gamma_t^I = -\Big[\frac{\hbar}{2m}\,\bar k_t^2 + \frac{\omega}{4}\Big]dt - \sqrt{\lambda}\,\bar x_t\,[d\xi_t - 2\sqrt{\lambda}\,\bar x_t\,dt]; \tag{4.16}$$
the initial conditions are: $\bar x_0 = \bar k_0 = \gamma_0 = 0$. Note that these equations are equivalent to (3.41)–(3.44), with $\sigma_t = \sigma_\infty = \lambda/\upsilon = z^2/2$, $\bar x_t = x_t^m$, $\bar k_t = k_t^m$ and $\gamma_t = \varsigma_t + (1+i)\omega/4$; as a matter of fact, the above equations describe the time evolution (according to Eq. (1.6)) of the ground state of the NSA harmonic oscillator, which is:
$$\phi_t^\infty(x) = \exp\Big[-\frac{z^2}{2}(x - \bar x_t)^2 + i\bar k_t x + \gamma_t - \frac{1+i}{4}\,\omega t\Big], \qquad \phi_0^\infty(x) = \phi^{(0)}(x). \tag{4.17}$$
As we shall prove in the next section, this is the state to which, apart from normalization, any initial state converges in the long time limit, hence the name $\phi_t^\infty$.

As we see, due to the stochastic part of the dynamics, the argument of the Gaussian weighting factor and that of the Hermite polynomials of Eq. (4.9) are different functions of time, while for analyzing the long time behavior of the wave function, it is more convenient that both arguments display the same time dependence. We thus modify the argument of the Hermite polynomials, to make it equal to that of the weighting factor. To this end, let us define $\zeta_t = \bar x_t - b_t$; we can then write:
$$\bar H_n[z(x - b_t)] = \frac{1}{\sqrt{\sqrt{\pi}\,2^n n!}}\,H_n[z(x - \bar x_t) + z\zeta_t]$$
$$\phantom{\bar H_n[z(x - b_t)]} = \frac{1}{\sqrt{\sqrt{\pi}\,2^n n!}}\sum_{m=0}^{n}\binom{n}{m}(2z\zeta_t)^{n-m}\,H_m[z(x - \bar x_t)]$$
$$\phantom{\bar H_n[z(x - b_t)]} = \sum_{m=0}^{n}\frac{\sqrt{n!}}{\sqrt{m!}\,(n-m)!}\,(\sqrt{2}\,z\zeta_t)^{n-m}\,\bar H_m[z(x - \bar x_t)], \tag{4.18}$$
where $H_m$ is the standard (not normalized) Hermite polynomial of degree $m$; in going from the first to the second line, we have used property (A.2). Resorting to the above relation, we can rewrite Eq. (4.9) as follows:
$$\phi_t(x) = e^{i\bar k_t x + \gamma_t - (1+i)\omega t/4}\sum_{m=0}^{+\infty}\bar\alpha_t^{(m)}\,e^{-(1+i)m\omega t/2}\,\phi^{(m)}(x - \bar x_t); \tag{4.19}$$
the functions $\phi^{(m)}$ are the eigenstates defined in (4.3), while the time dependent coefficients $\bar\alpha_t^{(m)}$ are defined as follows:
$$\bar\alpha_t^{(m)} = \sum_{k=0}^{+\infty}\alpha_{k+m}\,\frac{\sqrt{(k+m)!}}{\sqrt{m!}\,k!}\,(\sqrt{2}\,z\bar\zeta_t)^k, \tag{4.20}$$
where we have introduced the new quantity $\bar\zeta_t \equiv e^{-(1+i)\omega t/2}\zeta_t$. Equations (4.19) and (4.20) represent the two main formulas, which we will use in the next section to analyze the large time behavior. Before doing this, we need to set these formulas on rigorous ground; we do this with the following lemma and theorem.

Lemma 4.1. Let $\phi \in L^2(\mathbb{R})$ and $\alpha_n = \langle\phi|\phi^{(n)}\rangle$, with $\phi^{(n)}$ defined as in (4.3). Then the series (4.20) defining $\bar\alpha_t^{(m)}$ is a.s. convergent for any $m$ and any $t > 0$. Moreover, one has the following bound on the coefficients:
$$|\bar\alpha_t^{(m)}| \le N_t\,e^{(c+1/2)m}, \qquad N_t \equiv A\sum_{k=0}^{+\infty}\frac{e^{k(c+1)}\,|\sqrt{2}\,z\bar\zeta_t|^k}{\sqrt{k^k}} \qquad\text{a.s.}, \tag{4.21}$$
where $A$ is a constant independent of the Brownian motion $\xi_t$.

Proof. Because of (4.5), there exists a constant $C_1$ such that:
$$|\alpha_n| \le \|\phi\|\,\|\phi^{(n)}\| = \|\phi^{(n)}\| \le C_1 e^{nc}. \tag{4.22}$$
Secondly, using Stirling's formula, there exists a constant $C_2$ such that:
$$C_2^{-1}\sqrt{2\pi n}\,n^n e^{-n} < n! < C_2\sqrt{2\pi n}\,n^n e^{-n}, \tag{4.23}$$
for $n > 1$; we can then write the following estimate:
$$\frac{\sqrt{(k+m)!}}{\sqrt{m!}\,k!} \le \frac{C_2^2}{\sqrt{2\pi}}\,\sqrt[4]{\frac{k+m}{mk^2}}\;\frac{(k+m)^{(k+m)/2}\,e^{-(k+m)/2}}{m^{m/2}e^{-m/2}\,k^k e^{-k}} \le \frac{C_2^2}{\sqrt{\pi}}\,e^{-k(\ln k - 2)/2 + m/2}; \tag{4.24}$$
in the second inequality, we have used $(k+m)\ln(k+m) \le k\ln k + m\ln m + k + m$. Using Eqs. (4.22) and (4.24), we have the following bound:
$$\Big|\alpha_{k+m}\,\frac{\sqrt{(k+m)!}}{\sqrt{m!}\,k!}\,(\sqrt{2}\,z\bar\zeta_t)^k\Big| \le \frac{C_1C_2^2}{\sqrt[4]{\pi}}\,e^{(c+1/2)m}\,\frac{e^{k(c+1)}\,|\sqrt{2}\,z\bar\zeta_t|^k}{\sqrt{k^k}}, \qquad k, m \ge 1. \tag{4.25}$$
The cases $k = 0$ and $m = 0$ can be treated separately, giving the same bound, with the only possible difference of an overall constant factor. This proves convergence of the series defined in (4.20) and the bound (4.21).
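The two elementary ingredients of the estimate (4.24) can be checked numerically. The following is a minimal sketch in Python and is not part of the original argument; the function names and the size of the grid are illustrative choices. On a finite grid of integers $k, m \ge 1$ it tests the logarithmic inequality $(k+m)\ln(k+m) \le k\ln k + m\ln m + k + m$ and compares the ratio $\sqrt{(k+m)!}/(\sqrt{m!}\,k!)$ with the envelope $e^{-k(\ln k - 2)/2 + m/2}$.

```python
from math import lgamma, log

def log_ratio(k, m):
    # log of sqrt((k+m)!) / (sqrt(m!) * k!), via log-gamma to avoid overflow
    return 0.5 * lgamma(k + m + 1) - 0.5 * lgamma(m + 1) - lgamma(k + 1)

def log_envelope(k, m):
    # log of exp(-k (ln k - 2)/2 + m/2), the k- and m-dependence on the right of (4.24)
    return -0.5 * k * (log(k) - 2.0) + 0.5 * m

def log_inequality_gap(k, m):
    # right-hand side minus left-hand side of
    # (k+m) ln(k+m) <= k ln k + m ln m + k + m
    return k * log(k) + m * log(m) + k + m - (k + m) * log(k + m)

GRID = range(1, 151)  # illustrative finite grid; the proof covers all k, m >= 1

smallest_gap = min(log_inequality_gap(k, m) for k in GRID for m in GRID)
largest_excess = max(log_ratio(k, m) - log_envelope(k, m) for k in GRID for m in GRID)

print(f"smallest gap in the logarithmic inequality: {smallest_gap:.4f}")
print(f"largest value of log(ratio / envelope):     {largest_excess:.4f}")
```

A positive `smallest_gap` and a negative `largest_excess` mean that, on this grid, both inequalities hold with room to spare; the check is of course no substitute for the proof, which covers all $k, m \ge 1$.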
Theorem 4.1. Let the conditions of Lemma 4.1 be satisfied; let moreover $\bar\zeta_t \equiv e^{-(1+i)\omega t/2}\zeta_t$, where $\zeta_t = \bar x_t - b_t$ with $\bar x_t$ and $b_t$ solutions of Eqs. (4.13) and (3.14), respectively. Then the series defined in (4.19) is a.s. norm convergent for $t > \bar t \equiv (4c+1)/\omega$. In addition, the following equality holds true:
$$T_t\phi = e^{i\bar k_t x + \gamma_t - (1+i)\omega t/4}\sum_{m=0}^{+\infty}\bar\alpha_t^{(m)}\,e^{-(1+i)m\omega t/2}\,\phi^{(m)}(x - \bar x_t), \qquad t > \bar t, \tag{4.26}$$
where $T_t$ is the evolution operator associated to the Green's function (1.12).

Proof. According to (4.5) and (4.21), one has:
$$\big\|\bar\alpha_t^{(m)}\,e^{-(1+i)m\omega t/2}\,\phi^{(m)}[z(x - \bar x_t)]\big\| \le C_1N_t\,e^{(2c+1/2-\omega t/2)m}, \tag{4.27}$$
from which the conclusion follows, since the right-hand side is summable in $m$ exactly when $t > \bar t$. Comparing the two expressions of Eqs. (3.18) and (4.19) when the initial state $\phi$ is an eigenstate $\phi^{(n)}$, we see that they coincide on the dense subspace of all finite linear combinations of the $\phi^{(n)}$, and hence on the whole of $L^2(\mathbb{R})$.

5. The Long Time Behavior

We are now in a position to study the long time behavior of the solution of Eq. (1.1). Looking at expressions (4.19) for the solution $\phi_t$ and (4.20) for the coefficients $\bar\alpha_t^{(m)}$, it should be clear what the long time behavior of the normalized solution $\psi_t = \phi_t/\|\phi_t\|$ is: whatever the initial condition, at any time $t > 0$ the wave function $\phi_t$ picks a component on the ground state $\phi^{(0)}(x - \bar x_t)$, since $\bar\alpha_t^{(0)} \ne 0$ as long as at least one of the coefficients $\alpha_k$ is not null, which is always the case. Equation (4.19) on the other hand shows that each term of the superposition has an exponential damping factor, which is stronger the higher the eigenvalue. Accordingly, after normalization, only the eigenstate with the weakest damping factor survives, which is the ground state. Hence we expect that the general solution of Eq. (1.1) converges a.s., in the large time limit, to the ground state $\phi^{(0)}(x - \bar x_t)$, which is a Gaussian state. That this is true is proven in the following theorem.

Theorem 5.1. Let $\phi_t$ be a strong solution of Eq. (1.6) that admits, for $t > \bar t$, a representation as in (4.26). Let $\psi_t \equiv \phi_t/\|\phi_t\|$ (when $\|\phi_t\| \ne 0$), which can be written as follows:
$$\psi_t = \psi_t^\infty + e^{i(\bar k_t x + \gamma_t^I - \omega t/4)}\sum_{m=1}^{+\infty}\frac{\bar\alpha_t^{(m)}}{r_t}\,e^{-(1+i)m\omega t/2}\,\phi_m(x - \bar x_t), \tag{5.1}$$
with:
$$\psi_t^\infty := \frac{\bar\alpha_t^{(0)}}{r_t}\,e^{i(\bar k_t x + \gamma_t^I - \omega t/4)}\,\phi_0(x - \bar x_t), \tag{5.2}$$
$$r_t := \Big\|\sum_{m=0}^{+\infty}\bar\alpha_t^{(m)}\,e^{-(1+i)m\omega t/2}\,\phi_m(x - \bar x_t)\Big\|. \tag{5.3}$$
Then, with P-probability 1:
$$\lim_{t\to\infty}\|\psi_t - \psi_t^\infty\| = 0. \tag{5.4}$$
Note that, apart from global factors, $\psi_t^\infty$ is the ground state of the NSA harmonic oscillator, randomly displaced both in position space and in momentum space.

Proof. According to Eq. (5.1), all we need to prove is that, with P-probability 1:
$$\lim_{t\to\infty}\Big\|\sum_{m=1}^{+\infty}\frac{\bar\alpha_t^{(m)}}{r_t}\,e^{-(1+i)m\omega t/2}\,\phi_m(x - \bar x_t)\Big\| = 0. \tag{5.5}$$
Resorting to (4.27) and summing the geometric series in $m$, whose ratio is $e^{2c+1/2-\omega t/2} = e^{-\omega(t-\bar t)/2}$, one can write the following bound:
$$\Big\|\sum_{m=1}^{+\infty}\frac{\bar\alpha_t^{(m)}}{r_t}\,e^{-(1+i)m\omega t/2}\,\phi_m(x - \bar x_t)\Big\| \le C_1\,\frac{N_t}{r_t}\,\frac{e^{-\omega(t-\bar t)/2}}{1 - e^{-\omega(t-\bar t)/2}}, \tag{5.6}$$
thus all we need to establish is the long time behavior of $r_t$ and $N_t$. Lemmas 5.1 and 5.2 (see Eqs. (5.7) and (5.12)) state that, with P-probability 1, $r_t$ converges asymptotically to a finite and non-null random variable, while $N_t$ converges to a finite random variable. From the above properties, the conclusion of the theorem follows immediately.

In the remainder of the section, we prove the required lemmas.

Lemma 5.1. Let $r_t$ be defined as in (5.3). Then, with P-probability 1,
$$\lim_{t\to\infty} r_t = r_\infty \quad\text{finite and not null.} \tag{5.7}$$

Proof. According to Eqs. (4.19) and (5.3), the following equality holds:
$$\|\phi_t\| = e^{\gamma_t^R - \omega t/4}\,r_t; \tag{5.8}$$
resorting to the stochastic differentials (1.9) and (4.15) for $\|\phi_t\|^2$ and $\gamma_t^R$, respectively, one can write the following stochastic differential equation for $r_t^2$:
$$dr_t^2 = [2\sqrt{\lambda}\,(q_t - \bar x_t)\,d\xi_t + 4\lambda(\bar x_t^2 - q_t\bar x_t)\,dt]\,r_t^2, \qquad r_0^2 = 1. \tag{5.9}$$
By using relation (1.11), the above equation can be re-written in terms of the Wiener process $W_t$ as follows:
$$dr_t^2 = [2\sqrt{\lambda}\,(q_t - \bar x_t)\,dW_t + 4\lambda(q_t - \bar x_t)^2\,dt]\,r_t^2, \qquad r_0^2 = 1, \tag{5.10}$$
whose solution is:
$$r_t^2 = \exp\Big[2\sqrt{\lambda}\int_0^t(q_s - \bar x_s)\,dW_s + 2\lambda\int_0^t(q_s - \bar x_s)^2\,ds\Big]. \tag{5.11}$$
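That (5.11) solves (5.10) follows from Itô's formula: the exponent has stochastic differential $2\sqrt{\lambda}\,g\,dW + 2\lambda g^2\,dt$ with $g = q - \bar x$, and the Itô correction $\tfrac12(2\sqrt{\lambda}\,g)^2\,dt = 2\lambda g^2\,dt$ supplies the other half of the drift in (5.10). A minimal numerical sketch of this check is given below. It is only an illustration: the process $q_t - \bar x_t$ is replaced by a hypothetical deterministic function $g(t) = e^{-t}$, and the values of $\lambda$, the time horizon, the number of steps and the seed are arbitrary choices, none of them taken from the model.

```python
import math
import random

def simulate(lam=0.25, horizon=1.0, steps=20000, seed=1):
    """Integrate dR = R [2 sqrt(lam) g dW + 4 lam g^2 dt], R(0) = 1, by
    Euler-Maruyama and evaluate exp(2 sqrt(lam) int g dW + 2 lam int g^2 ds)
    along the same discretized Brownian path."""
    rng = random.Random(seed)
    dt = horizon / steps

    def g(t):
        # hypothetical stand-in for q_t - xbar_t (not the model's process)
        return math.exp(-t)

    r_euler = 1.0     # Euler-Maruyama approximation of r_t^2
    stoch_int = 0.0   # running Ito sum approximating int g dW
    drift_int = 0.0   # running Riemann sum approximating int g^2 ds
    for i in range(steps):
        t = i * dt
        dw = rng.gauss(0.0, math.sqrt(dt))
        gt = g(t)
        r_euler += r_euler * (2.0 * math.sqrt(lam) * gt * dw + 4.0 * lam * gt ** 2 * dt)
        stoch_int += gt * dw
        drift_int += gt ** 2 * dt
    r_closed = math.exp(2.0 * math.sqrt(lam) * stoch_int + 2.0 * lam * drift_int)
    return r_euler, r_closed

r_euler, r_closed = simulate()
print("Euler-Maruyama:", r_euler)
print("closed form   :", r_closed)
```

For a small time step the two numbers should agree up to the discretization error of the Euler-Maruyama scheme, which shrinks as `steps` grows. Replacing the drift coefficient `4.0 * lam` by `2.0 * lam` in the update (that is, forgetting the Itô correction) is expected to produce a visible mismatch.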
The crucial point is to establish the behavior of the difference $q_t - \bar x_t$ between the mean position of the general solution $\psi_t$ and the mean position of the "asymptotic" state $\psi_t^\infty$. Since $\psi_t$ converges to $\psi_t^\infty$, we expect $q_t - \bar x_t$ to vanish asymptotically. That this is actually true with P-probability 1 is proven in Lemma 5.3 (see Eq. (5.15)), where indeed it is shown that the convergence is exponentially fast. This fact, together with (5.11), concludes the proof of the lemma.

Lemma 5.2. Let $N_t$ be defined as in (4.21). Then, with P-probability 1,
$$\lim_{t\to\infty} N_t = N_\infty \quad\text{finite.} \tag{5.12}$$

Proof. Looking back at Eq. (4.21), we see that in order to prove this lemma it is sufficient to show that $\bar\zeta_t$ tends to a finite limit as $t \to \infty$, with P-probability 1. According to our previous definition, $\bar\zeta_t$ is equal to:
$$\bar\zeta_t = e^{-(1+i)\omega t/2}(\bar x_t - b_t). \tag{5.13}$$
Equations (3.14) and (4.10), together with the change of measure (1.11), lead to the following stochastic differential equation for $\bar\zeta_t$ in terms of the Wiener process $W_t$:
$$d\bar\zeta_t = \frac{\omega}{2\sqrt{\lambda}}\,e^{-(1+i)\omega t/2}\,[dW_t + 2\sqrt{\lambda}\,(q_t - \bar x_t)\,dt], \qquad \bar\zeta_0 = 0. \tag{5.14}$$
Once again, the large time behavior of $q_t - \bar x_t$ (see Eq. (5.15)) yields the conclusion of the lemma.

Lemma 5.3. Let $q_t \equiv \langle\psi_t|q|\psi_t\rangle$ and let $\bar x_t$ be defined as in (4.10). Then, with P-probability 1:
$$h_t \equiv q_t - \bar x_t = O(e^{-\omega t/2}). \tag{5.15}$$

Proof. Let us consider the Gaussian solution of Eq. (1.6):
$$\phi_t^G(x) \equiv G_t(x, 0) = K_t\exp\Big[-\frac{\alpha_t}{2}x^2 + \bar a_t x + \bar c_t\Big] \tag{5.16}$$
$$\phantom{\phi_t^G(x) \equiv G_t(x, 0)} = K_t\exp\Big[-\frac{\alpha_t}{2}(x - \bar x_t^G)^2 + i\bar k_t^G x + \tilde c_t\Big], \tag{5.17}$$
where $G_t(x, y)$ is the Green's function defined in (1.12) and
$$\bar x_t^G = \frac{\bar a_t^R}{\alpha_t^R}, \qquad \bar k_t^G = \bar a_t^I - \frac{\alpha_t^I}{\alpha_t^R}\,\bar a_t^R, \qquad \tilde c_t = \bar c_t + \frac{\alpha_t}{2}(\bar x_t^G)^2. \tag{5.18}$$
Note that $\bar x_t^G$ is the mean position of the Gaussian state $\phi_t^G$, while $\bar k_t^G$ is its average momentum. Obviously we can write:
$$h_t = (q_t - \bar x_t^G) + (\bar x_t^G - \bar x_t); \tag{5.19}$$
Lemma B.1 proves that $q_t - \bar x_t^G$ has the required asymptotic behavior (see Eq. (B.1)), so all we need to show is that also $\bar x_t^G - \bar x_t$ behaves as required. Lemma B.1 was first proven in [35]; for completeness, we reproduce it in Appendix B, adapting it to our notation. The proof of the lemma is instructive because it makes clear why it is convenient to analyze $q_t - \bar x_t^G$ separately from $\bar x_t^G - \bar x_t$.

By letting the ground state of the NSA harmonic oscillator evolve according to the Green's function $G_t(x, y)$, one can express $\bar x_t$ in terms of the functions (1.13)–(1.18); a straightforward calculation leads to the following result:
$$\bar x_t^G - \bar x_t = \frac{\omega}{2\lambda}\Big[(p_t^{-1} - 1)\,\bar a_t^R - \Big(\frac{\beta_t\bar b_t}{\alpha_t + \alpha_\infty}\Big)^{R}\Big], \tag{5.20}$$
where $\alpha_\infty \equiv \lim_{t\to\infty}\alpha_t = 2\lambda/\upsilon$. By inspecting expressions (3.21) and (1.15), we recognize that $p_t^{-1} - 1 = O(e^{-\omega t})$ and $|\beta_t| = O(e^{-\omega t/2})$, thus in order to prove the lemma all we have to do is to control the long time behavior of $\bar a_t$, which in turn sets the asymptotic behavior of $\bar b_t$ through (1.17). Inverting Eq. (5.18) we get:
$$\bar a_t = \alpha_t\bar x_t^G + i\bar k_t^G, \tag{5.21}$$
thus we can control $\bar a_t$ by controlling $\bar x_t^G$ and $\bar k_t^G$. These two quantities, being the average position and (modulo $\hbar$) average momentum of the Gaussian solution (5.17), satisfy the stochastic differential equations (3.41) and (3.42), with $\alpha_t/2$ in place of $\sigma_t$. By using the change of measure (1.11), we can re-express these equations in terms of the Wiener process $W_t$ as follows:
$$d\bar x_t^G = \Big[\frac{\hbar}{m}\,\bar k_t^G + \frac{2\lambda}{\alpha_t^R}\,f_t\Big]dt + \frac{\sqrt{\lambda}}{\alpha_t^R}\,dW_t, \tag{5.22}$$
$$d\bar k_t^G = -2\lambda\,\frac{\alpha_t^I}{\alpha_t^R}\,f_t\,dt - \sqrt{\lambda}\,\frac{\alpha_t^I}{\alpha_t^R}\,dW_t, \tag{5.23}$$
with $f_t \equiv q_t - \bar x_t^G$. By integrating the second equation, and by using the strong law of large numbers applied to $W_t$, Eq. (B.1) for $f_t$ and the fact that $\alpha_t$ has a finite asymptotic limit, one can show that, with P-probability 1, the process $\bar k_t^G$ grows slower than $t^2$, for $t \to \infty$. By integrating now the first equation, and by using the same properties as before, one can show that $\bar x_t^G$ grows slower than $t^3$, for $t \to \infty$ and again with P-probability 1. According to Eqs. (5.21) and (1.17), we then have, with P-probability 1:
$$\bar a_t = o(t^3) \ \text{ as } t \to \infty, \qquad \lim_{t\to\infty}\bar b_t = \bar b_\infty \quad\text{finite.} \tag{5.24}$$
This proves that $\bar x_t - \bar x_t^G$ has the required asymptotic behavior, hence the conclusion of the lemma.
In this way, we have proven that any initial state is P-a.s. norm convergent to the Gaussian state (5.2), which can be written as follows:
$$\psi_t^\infty \equiv \sqrt[4]{\frac{z_R^2}{\pi}}\,\exp\Big[-\frac{z^2}{2}(x - \bar x_t)^2 + i\bar k_t x + i\Big(\gamma_t^I - \frac{\omega}{4}\,t\Big)\Big], \tag{5.25}$$
which has a fixed finite spread both in position and in momentum, given by [39]:
$$\Delta q = \langle\psi_t^\infty|(q - \bar x_t)^2|\psi_t^\infty\rangle^{1/2} = \sqrt{\frac{\hbar}{m\omega}}, \tag{5.26}$$
$$\Delta p = \langle\psi_t^\infty|(p - \hbar\bar k_t)^2|\psi_t^\infty\rangle^{1/2} = \sqrt{\frac{\hbar m\omega}{2}}. \tag{5.27}$$
This corresponds almost to the minimum allowed by Heisenberg's uncertainty relations, as $\Delta q\,\Delta p = \hbar/\sqrt{2}$. Note also that, the more massive the particle, the smaller the spread in position of the asymptotic Gaussian state: this is a well known effect of the localizing property of Eq. (1.1). Finally, Eqs. (4.13) and (4.14), together with the change of measure (1.11), tell how the average position $\bar x_t$ and momentum $\bar k_t$ evolve in time, as a function of the Wiener process $W_t$:
$$d\bar x_t = \frac{\hbar}{m}\,\bar k_t\,dt + \omega h_t\,dt + \frac{\omega}{2\sqrt{\lambda}}\,dW_t, \tag{5.28}$$
$$d\bar k_t = 2\lambda h_t\,dt + \sqrt{\lambda}\,dW_t, \tag{5.29}$$
which imply that there exist two random variables $X$ and $K$ such that [35]:
$$\bar x_t = X + \frac{\hbar}{m}\,Kt + \sqrt{\lambda}\,\frac{\hbar}{m}\int_0^t W_s\,ds + \sqrt{\frac{\hbar}{m}}\,W_t + O(e^{-\omega t/2}), \tag{5.30}$$
$$\bar k_t = K + \sqrt{\lambda}\,W_t + O(e^{-\omega t/2}). \tag{5.31}$$
These parameters fully describe the time evolution of the Gaussian state (5.25).

6. Conclusions and Outlook

In Sec. 2, we have identified three interesting time regimes during which the wave function, depending on the values of the parameters $\lambda$ and $m$, evolves in a different way. In the central sections of this paper, we have analyzed the long time behavior, which pertains to the third regime, the diffusive one. There are many other properties of the solutions of Eq. (1.1) which deserve to be analyzed, and in this concluding section, we would like to point out a number of interesting open problems.

I. Collapse regime

Let $\ell$ be the length which discriminates between a localized and a non-localized wave function, i.e. such that, defining with $\Delta_\psi q$ the spread in position of a wave
function $\psi$, we say that $\psi$ is localized in space whenever $\Delta_\psi q \le \ell$. In our case, we must take $\ell > \sqrt{\hbar/m\omega}$, where $\sqrt{\hbar/m\omega}$ is the asymptotic spread (see Eq. (5.26)).

Problem I.1: Find bounds on the collapse time

Let $\psi_t$ be the solution of Eq. (1.1), for a given initial condition $\psi \in L^2(\mathbb{R})$ such that $\Delta_\psi q > \ell$. Let us define the collapse time $T^\psi_{\mathrm{COL}}$ as the first time at which the wave function is localized in space:
$$T^\psi_{\mathrm{COL}} := \min\{t : \Delta_{\psi_t} q \le \ell\}. \tag{6.1}$$
How is $T^\psi_{\mathrm{COL}}$ distributed? Find best possible bounds (depending on the parameters defining the model) for the distribution function. The dependence of the collapse time on the parameters of the model is physically relevant, as for macroscopic bodies the collapse is supposed to happen at a much shorter time, producing a classical macroscopic body. This time must be much earlier than the time at which diffusion becomes effective. Bounds on the collapse time will hopefully lead to experimentally testable deviations from linear quantum mechanics (i.e. where the superposition principle holds on all scales).

Problem I.2: collapse probability

Let $\bar\psi := \psi_t$, for $t = E_P[T^\psi_{\mathrm{COL}}]$. Let $\bar x := \langle\bar\psi|q|\bar\psi\rangle$ be the position of the wave function at the average time at which it is localized in space. Show that the distribution of $\bar x$ is close to the Born probability given by $|\psi(x)|^2$.

II. Classical regime

In the classical regime, the wave function is expected to move, on the average, like a classical free particle.

Problem II.1: classical motion

Let $\bar q_t$ and $\bar p_t$ be the (quantum) average position and momentum of $\psi_t$. Let $t > T^\psi_{\mathrm{COL}}$. Show that the random trajectories $\bar q_\cdot$ and $\bar p_\cdot$ are, with high probability and for a reasonably large amount of time, close to the classical trajectories. The closeness will of course depend on the parameters defining the model.

III. Diffusive regime

This regime follows the classical one and has been analyzed in this paper: as we have seen, the wave function keeps diffusing in the Hilbert space, eventually taking a Gaussian shape, as described in Sec. 5.
Acknowledgments

The work was supported by the EU grant No. MEIF CT 2003-500543 and by DFG (Germany). We thank GianCarlo Ghirardi, Lajos Diósi and the referees for helpful comments and an anonymous referee for pointing out a flaw in a previous version.
Appendix A. Properties of Hermite Polynomials

We list here the main properties of Hermite polynomials, which are used in the paper. The primary definition of the Hermite polynomials is
$$H_n(z) = n!\sum_{m=0}^{\lfloor n/2\rfloor}\frac{(-1)^m(2z)^{n-2m}}{m!\,(n-2m)!}, \tag{A.1}$$
where $z$ is any complex number. These polynomials satisfy the following addition rule
$$H_n(z_1 + z_2) = \sum_{m=0}^{n}\binom{n}{m}(2z_2)^{n-m}H_m(z_1). \tag{A.2}$$
When the argument is real ($z = x \in \mathbb{R}$), they form an orthogonal set with respect to the weight $\exp[-x^2]$; the normalized Hermite polynomials are:
$$\bar H_n(x) = \frac{1}{N_n}\,H_n(x), \qquad N_n = \sqrt{\sqrt{\pi}\,2^n n!}. \tag{A.3}$$
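The addition rule (A.2) is what allows the argument of the Hermite polynomials to be shifted in (4.18), and it can be checked directly from the explicit sum (A.1), including at complex arguments. The following Python lines are a minimal sketch of such a check; the function names and the two complex test points are arbitrary illustrative choices.

```python
from math import comb, factorial

def hermite(n, z):
    """Hermite polynomial H_n(z) evaluated from the explicit sum (A.1)."""
    return factorial(n) * sum(
        (-1) ** m * (2 * z) ** (n - 2 * m) / (factorial(m) * factorial(n - 2 * m))
        for m in range(n // 2 + 1)
    )

def hermite_addition_rhs(n, z1, z2):
    """Right-hand side of the addition rule (A.2)."""
    return sum(
        comb(n, m) * (2 * z2) ** (n - m) * hermite(m, z1)
        for m in range(n + 1)
    )

# arbitrary complex test points (illustrative values only)
z1, z2 = 0.3 + 0.7j, -1.1 + 0.2j

max_rel_err = 0.0
for n in range(8):
    lhs = hermite(n, z1 + z2)
    rhs = hermite_addition_rhs(n, z1, z2)
    max_rel_err = max(max_rel_err, abs(lhs - rhs) / max(1.0, abs(lhs)))

print("largest relative mismatch between the two sides of (A.2):", max_rel_err)
```

The mismatch should be at the level of floating-point rounding. With $z_1 = z(x - \bar x_t)$ and $z_2 = z\zeta_t$ this is the step from the first to the second line of (4.18).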
Appendix B. Lemma 3.1 in [34]

Lemma B.1. Let $\phi \in L^2(\mathbb{R})$, $\|\phi\| = 1$ and let $\phi_t = T_t\phi$. Then, with P-probability 1:
$$f_t \equiv q_t - \bar x_t^G = O(e^{-\omega t/2}), \tag{B.1}$$
where $q_t = \langle\psi_t|q|\psi_t\rangle$, and $\bar x_t^G$ has been defined in (5.18).

Proof. Using the expression (1.12) for $G_t(x, y)$ together with the Schwarz inequality, we can derive the following bound on $\phi_t$:
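$$|\phi_t(x)|^2 \le |K_t|^2\sqrt{\frac{\pi}{\alpha_t^R}}\,\exp\Big[-2\,\frac{\lambda}{\omega}\,\frac{p_t^2 - 4q_t^2}{p_t}\,x^2 + 2\Big(\bar a_t^R + \frac{2\,\bar b_t^R q_t}{p_t}\Big)x + 2\bar c_t^R + \frac{\omega}{2\lambda}\,\frac{(\bar b_t^R)^2}{p_t}\Big], \tag{B.2}$$
which holds for any $t > 0$. The above inequality implies that it is sufficient to consider $\phi \in L^2(\mathbb{R})$ such that:
$$|\phi(x)| \le Ce^{-Ax^2}, \tag{B.3}$$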
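where $C$ and $A$ are random variables. A direct calculation leads to the following expression for the quantum average $\langle\phi_t|q|\phi_t\rangle$ (a star denotes complex conjugation):
$$\langle\phi_t|q|\phi_t\rangle = |K_t|^2\sqrt{\frac{\pi}{\alpha_t^R}}\,\exp\Big[2\bar c_t^R + \frac{(\bar a_t^R)^2}{\alpha_t^R}\Big]\int dy_1\,dy_2\;\phi^*(y_1)\,\phi(y_2)\,\Big[\frac{\beta_t^*y_1 + \beta_t y_2}{2\alpha_t^R} + \frac{\bar a_t^R}{\alpha_t^R}\Big]$$
$$\qquad\cdot\exp\Big[-\frac12\Big(\alpha_t^* - \frac{\beta_t^{*2}}{2\alpha_t^R}\Big)y_1^2 - \frac12\Big(\alpha_t - \frac{\beta_t^2}{2\alpha_t^R}\Big)y_2^2\Big]$$
$$\qquad\cdot\exp\Big[\Big(\bar b_t^* + \frac{\beta_t^*\bar a_t^R}{\alpha_t^R}\Big)y_1 + \Big(\bar b_t + \frac{\beta_t\bar a_t^R}{\alpha_t^R}\Big)y_2 + \frac{|\beta_t|^2}{2\alpha_t^R}\,y_1y_2\Big]. \tag{B.4}$$
As we shall soon see, all exponential terms in the above expression can be controlled. The crucial factors are the two within brackets: the first term decays exponentially in time, since $\beta_t = O(e^{-\omega t/2})$, while $\alpha_t$ has a finite asymptotic limit; the term $\bar a_t^R/\alpha_t^R$, instead, does not decay in time (see the discussion in connection with the proof of Lemma 5.3). Since $\|\phi_t\|^2$ is equal to the expression (B.4) without the terms in square brackets, and because of (5.18), we have that
$$f_t\|\phi_t\|^2 = \langle\phi_t|q|\phi_t\rangle - \frac{\bar a_t^R}{\alpha_t^R}\,\|\phi_t\|^2 \tag{B.5}$$
$$\phantom{f_t\|\phi_t\|^2} = |K_t|^2\sqrt{\frac{\pi}{\alpha_t^R}}\,\exp\Big[2\bar c_t^R + \frac{(\bar a_t^R)^2}{\alpha_t^R}\Big]\int dy_1\,dy_2\;\phi^*(y_1)\,\phi(y_2)\,\frac{\beta_t^*y_1 + \beta_t y_2}{2\alpha_t^R}$$
$$\qquad\cdot\exp\Big[-\frac12\Big(\alpha_t^* - \frac{\beta_t^{*2}}{2\alpha_t^R}\Big)y_1^2 - \frac12\Big(\alpha_t - \frac{\beta_t^2}{2\alpha_t^R}\Big)y_2^2\Big]$$
$$\qquad\cdot\exp\Big[\Big(\bar b_t^* + \frac{\beta_t^*\bar a_t^R}{\alpha_t^R}\Big)y_1 + \Big(\bar b_t + \frac{\beta_t\bar a_t^R}{\alpha_t^R}\Big)y_2 + \frac{|\beta_t|^2}{2\alpha_t^R}\,y_1y_2\Big]. \tag{B.6}$$
According to the discussion above, we expect the quantity $f_t\|\phi_t\|^2$ to decay exponentially in time, as we shall now prove; this is the reason why, in proving Lemma 5.3, it was convenient to split the difference $h_t$ as done in Eq. (5.19). Using the inequality $y_1y_2 \le (y_1^2 + y_2^2)/2$, we can write:
$$|f_t|\,\|\phi_t\|^2 = \Big|\langle\phi_t|q|\phi_t\rangle - \frac{\bar a_t^R}{\alpha_t^R}\,\|\phi_t\|^2\Big| \tag{B.7}$$
$$\phantom{|f_t|\,\|\phi_t\|^2} \le \frac{|\beta_t|}{2\alpha_t^R}\,|K_t|^2\sqrt{\frac{\pi}{\alpha_t^R}}\,\exp\Big[2\bar c_t^R + \frac{(\bar a_t^R)^2}{\alpha_t^R}\Big]\int dy_1\,dy_2\,|\phi(y_1)||\phi(y_2)|\,(|y_1| + |y_2|)\,g(y_1)\,g(y_2), \tag{B.8}$$
with:
$$g(y) \equiv \exp\Big[-\frac12\Big(\alpha_t^R - \frac{(\beta_t^R)^2}{\alpha_t^R}\Big)y^2 + \Big(\bar b_t^R + \frac{\beta_t^R\bar a_t^R}{\alpha_t^R}\Big)y\Big]. \tag{B.9}$$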
Next, by using the inequality $g(y_1)\,g(y_2) \le (g(y_1)^2 + g(y_2)^2)/2$ and the symmetry between $y_1$ and $y_2$, we have:
$$|f_t|\,\|\phi_t\|^2 \le \frac{|\beta_t|}{2\alpha_t^R}\,|K_t|^2\sqrt{\frac{\pi}{\alpha_t^R}}\,\exp\Big[2\bar c_t^R + \frac{(\bar a_t^R)^2}{\alpha_t^R}\Big]\int dy_1\,dy_2\,|\phi(y_1)||\phi(y_2)|\,(|y_1| + |y_2|)\,g(y_1)^2. \tag{B.10}$$
Now, a direct computation shows that
$$\|G_t(\cdot, y)\|^2 \equiv \int dx\,|G_t(x, y)|^2 = |K_t|^2\sqrt{\frac{\pi}{\alpha_t^R}}\,\exp\Big[2\bar c_t^R + \frac{(\bar a_t^R)^2}{\alpha_t^R}\Big]\,g(y)^2; \tag{B.11}$$
the key point is that, since $G_t(x, y)$ solves Eq. (1.6), then $\|G_t(\cdot, y)\|^2$ is a positive martingale with respect to the measure Q, for any value of $y$; we call $\mathrm{Mar}_Q(t, y)$ this martingale. We can then write:
$$|f_t|\,\|\phi_t\|^2 \le \frac{|\beta_t|}{2\alpha_t^R}\int dy_1\,dy_2\,|\phi(y_1)||\phi(y_2)|\,(|y_1| + |y_2|)\,\mathrm{Mar}_Q(t, y_1) \le \frac{|\beta_t|}{2\alpha_t^R}\int dy\;e^{-Ay^2}(A_1|y| + A_2)\,\mathrm{Mar}_Q(t, y), \tag{B.12}$$
where $A_1$ and $A_2$ are suitable constants. In going from the first to the second expression, we have used (B.3). The quantity
$$\frac{1}{2\alpha_t^R}\int dy\;e^{-Ay^2}(A_1|y| + A_2)\,\mathrm{Mar}_Q(t, y) \tag{B.13}$$
is another positive martingale with respect to Q, which we call $\mathrm{Mar}_Q(t)$. We arrive in this way at the inequality:
$$|f_t| \le |\beta_t|\,\frac{\mathrm{Mar}_Q(t)}{\|\phi_t\|^2}. \tag{B.14}$$
Since $\mathrm{Mar}_Q(t)$ is a positive martingale with respect to Q, then $\mathrm{Mar}_P(t) = \mathrm{Mar}_Q(t)/\|\phi_t\|^2$ is a positive martingale with respect to P which, by Doob's convergence theorem, has a P-a.s. finite limit for $t \to +\infty$. The conclusion of the lemma then follows from Eq. (1.15), according to which $\beta_t = O(e^{-\omega t/2})$.

References

[1] G. C. Ghirardi, A. Rimini and T. Weber, Unified dynamics for microscopic and macroscopic systems, Phys. Rev. D 34 (1986) 470–491.
[2] G. C. Ghirardi, P. Pearle and A. Rimini, Markov processes in Hilbert space and continuous spontaneous localization of systems of identical particles, Phys. Rev. A 42 (1990) 78–89.
[3] G. C. Ghirardi, R. Grassi and P. Pearle, Relativistic dynamical reduction models: General framework and examples, Found. Phys. 20 (1990) 1271–1316.
[4] P. Pearle, Reduction of the state vector by a nonlinear Schrödinger equation, Phys. Rev. D 13 (1976) 857–868.
[5] P. Pearle, Combining stochastic dynamical state-vector reduction with spontaneous localization, Phys. Rev. A 39 (1989) 2277–2289.
[6] P. Pearle, Collapse Models, in Open Systems and Measurement in Relativistic Quantum Theory, eds. F. Petruccione and H.-P. Breuer (Springer-Verlag, Berlin, 1999).
[7] L. Diósi, Localized solution of simple nonlinear quantum Langevin-equation, Phys. Lett. A 132 (1988) 233–236.
[8] L. Diósi, Models for universal reduction of macroscopic quantum fluctuations, Phys. Rev. A 40 (1989) 233–236.
[9] L. Diósi, Relativistic theory for continuous measurement of quantum fields, Phys. Rev. A 42 (1990) 5086–5092.
[10] S. L. Adler, D. C. Brody, T. A. Brun and L. P. Hughston, Martingale models for quantum state reduction, J. Phys. A 34 (2001) 8795–8820.
[11] S. L. Adler and T. A. Brun, Generalized stochastic Schrödinger equations for state vector collapse, J. Phys. A 34 (2001) 4797–4809.
[12] S. L. Adler, Quantum Theory as an Emergent Phenomenon. The Statistical Mechanics of Matrix Models as the Precursor of Quantum Field Theory (Cambridge University Press, Cambridge, 2004).
[13] A. Bassi, E. Ippoliti and S. L. Adler, Towards quantum superpositions of a mirror: An exact open systems analysis, Phys. Rev. Lett. 94 (2005) 030401, 4 pp.
[14] A. Bassi, E. Ippoliti and B. Vacchini, On the energy increase in space-collapse models, J. Phys. A 38 (2005) 8017–8038.
[15] V. P. Belavkin, Non-demolition measurements, non-linear filtering and dynamic programming in quantum stochastic processes, in Lecture Notes in Control and Information Science, ed. A. Blaquière, Vol. 121 (Springer-Verlag, Berlin, 1988).
[16] S. L. Adler and A. Bassi, Is quantum theory exact? Science 325 (2009) 275–276.
[17] V. P. Belavkin and P. Staszewski, A quantum particle undergoing continuous observation, Phys. Lett. A 140 (1989) 359–362.
[18] V. P. Belavkin and P. Staszewski, Nondemolition observation of a free quantum particle, Phys. Rev. A 45 (1992) 1347–1357.
[19] D. Chruściński and P. Staszewski, On the asymptotic solutions of Belavkin's stochastic wave equation, Phys. Scripta 45 (1992) 193–199.
[20] A. Barchielli, Direct and heterodyne detection and other applications of quantum stochastic calculus to quantum optics, Quantum Opt. 2 (1990) 423–441.
[21] A. Barchielli, On the quantum theory of measurements continuous in time, Proceedings of the XXV Symposium on Mathematical Physics (Toruń, 1992), Rep. Math. Phys. 33 (1993) 21–34.
[22] A. Barchielli and A. S. Holevo, Constructing quantum measurement processes via classical stochastic calculus, Stochastic Process. Appl. 58 (1995) 293–317.
[23] Ph. Blanchard and A. Jadczyk, On the interaction between classical and quantum systems, Phys. Lett. A 175 (1993) 157–164.
[24] Ph. Blanchard and A. Jadczyk, Event-enhanced quantum theory and piecewise deterministic dynamics, Ann. Physik 4(8) (1995) 583–599.
[25] Ph. Blanchard and A. Jadczyk, Events and piecewise deterministic dynamics in event-enhanced quantum theory, Phys. Lett. A 203 (1995) 260–266.
[26] J. Halliwell and A. Zoupas, Quantum state diffusion, density matrix diagonalization, and decoherent histories: A model, Phys. Rev. D 52 (1995) 7294–7307.
[27] J. Halliwell and A. Zoupas, Post-decoherence density matrix propagator for quantum Brownian motion, Phys. Rev. D 55 (1997) 4697–4704.
[28] H.-P. Breuer and F. Petruccione, The Theory of Open Quantum Systems (Oxford University Press, New York, 2002).
[29] H.-P. Breuer, U. Dorner and F. Petruccione, Numerical integration methods for stochastic wave function equations, Comp. Phys. Comm. 132 (2000) 30–43.
[30] D. Gatarek and N. Gisin, Continuous quantum jumps and infinite-dimensional stochastic equations, J. Math. Phys. 32 (1991) 2152–2157.
[31] A. S. Holevo, On dissipative stochastic equations in a Hilbert space, Probab. Theory Related Fields 104 (1996) 483–500.
[32] V. P. Belavkin and V. N. Kolokol'tsov, Quasiclassical asymptotics of quantum stochastic equations, Teoret. Mat. Fiz. 89 (1991) 163–177 (Russian); translation in Theoret. and Math. Phys. 89(2) (1991) 1127–1138.
[33] V. N. Kolokol'tsov, Application of quasiclassical methods to the study of Belavkin's quantum filtering equation, Mat. Zametki 50 (1991) 153–156 (Russian); translation in Math. Notes 50 (1991) 1204–1206.
[34] V. N. Kolokol'tsov, Scattering theory for the Belavkin equation describing a quantum particle with continuously observed coordinate, J. Math. Phys. 36 (1995) 2741–2760.
[35] V. N. Kolokol'tsov, Localization and analytic properties of the solutions of the simplest quantum filtering equation, Rev. Math. Phys. 10 (1998) 801–828.
[36] V. N. Kolokol'tsov, Semiclassical Analysis for Diffusion and Stochastic Processes, Lecture Notes in Mathematics, Vol. 1724 (Springer-Verlag, Berlin, 2000).
[37] S. Albeverio, V. N. Kolokol'tsov and O. G. Smolyanov, Continuous quantum measurement: Local and global approaches, Rev. Math. Phys. 9 (1997) 907–920.
[38] S. Teufel, Adiabatic Perturbation Theory in Quantum Dynamics, Lecture Notes in Mathematics, Vol. 1821 (Springer-Verlag, Berlin, 2003).
[39] A. Bassi, Collapse models: Analysis of the free particle dynamics, J. Phys. A 38 (2005) 3173–3192.
[40] E. Joos and H. D. Zeh, The emergence of classical properties through interaction with the environment, Z. Phys. B 59 (1985) 223–243.
[41] W. Marshall, C. Simon, R. Penrose and D. Bouwmeester, Towards quantum superpositions of a mirror, Phys. Rev. Lett. 91 (2003) 130401, 4 pp.
[42] J. Z. Bernád, L. Diósi and T. Geszti, Quest for quantum superpositions of a mirror: High and moderately low temperatures, Phys. Rev. Lett. 97 (2006) 250404, 4 pp.
[43] S. L. Adler, A density tensor hierarchy for open system dynamics: Retrieving the noise, J. Phys. A 40 (2007) 8959–8990.
[44] A. Barchielli, Some stochastic differential equations in quantum optics and measurement theory: The case of diffusive processes, in Contributions in Probability – in Memory of Alberto Frigerio, ed. C. Cecchini (Forum, Udine, 1996), pp. 43–55.
[45] A. Bassi and D. Dürr, On the long-time behavior of Hilbert space diffusion, Europhys. Lett. 84 (2008) 10005.
[46] C. M. Mora and R. Rebolledo, Regularity of solutions to linear stochastic Schrödinger equations, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 10 (2007) 237–259.
[47] C. M. Mora and R. Rebolledo, Basic properties of nonlinear stochastic Schrödinger equations driven by Brownian motions, Ann. Appl. Probab. 18 (2008) 591–619.
[48] R. S. Liptser and A. N. Shiryaev, Statistics of Random Processes (Springer-Verlag, Berlin, 2001).
[49] A. Bassi, G. C. Ghirardi and D. G. M. Salvetti, The Hilbert-space operator formalism within dynamical reduction models, J. Phys. A 40 (2007) 13755–13772.
[50] R. G. Bartle, A Modern Theory of Integration, Graduate Studies in Mathematics, Vol. 32 (American Mathematical Society, Providence, RI, 2001).
[51] P. E. Protter, Stochastic Integration and Differential Equations (Springer-Verlag, Berlin, 2004).
[52] E. B. Davies, Pseudo-spectra, the harmonic oscillator and complex resonances, R. Soc. Lond. Proc. Ser. A Math. Phys. Eng. Sci. 455 (1999) 585–599.
[53] E. B. Davies and A. B. J. Kuijlaars, Spectral asymptotics of the non-self-adjoint harmonic oscillator, J. London Math. Soc. (2) 70(2) (2004) 420–426.
[54] E. B. Davies, Linear Operators and Their Spectra, Cambridge Studies in Advanced Mathematics, Vol. 106 (Cambridge University Press, Cambridge, 2007).
February 11, 2010 11:24 WSPC/148-RMP
J070-S0129055X10003904
Reviews in Mathematical Physics Vol. 22, No. 1 (2010) 91–115 c World Scientific Publishing Company DOI: 10.1142/S0129055X10003904
FROM GLOBAL SYMMETRIES TO LOCAL CURRENTS: THE FREE (SCALAR) CASE IN FOUR DIMENSIONS
GERARDO MORSELLA∗ and LUCA TOMASSINI† Department of Mathematics, Tor Vergata University, via della Ricerca Scientifica I-00133 Roma, Italy ∗
[email protected] †
[email protected] Received 4 May 2009 Revised 15 October 2009
Within the framework of algebraic quantum field theory, we propose a new method of constructing local generators of (global) gauge symmetries in field theoretic models, starting from the existence of unitary operators implementing locally the flip automorphism on the doubled theory. We show, in the simple example of the internal symmetries of a multiplet of free scalar fields, that through the pointlike limit of such local generators the conserved Wightman currents associated with the symmetries are recovered. Keywords: Quantum Noether theorem; split property; flip automorphism. Mathematics Subject Classification 2010: 81T05, 46L45
1. Introduction

One of the most important features of field theoretic models is the existence of local conserved currents corresponding to space-time and internal (gauge) symmetries. While in the framework of classical Lagrangian field theory a clarification of this issue comes from Noether's theorem (which provides an explicit formula for the conserved current associated to any continuous symmetry of the Lagrangian itself), it is well known that in the quantum case several drawbacks contribute to make the situation more confusing. For example, symmetries which are present at the classical level can disappear upon quantization due to renormalization effects. In [1, 2], a different approach to the problem was outlined in the context of algebraic quantum field theory. It consisted of two main steps: (1) given double cones $O$, $\hat O$ with bases $B$, $\hat B$ in the time-zero plane centered at the origin and such that $\bar O \subset \hat O$, start from generators $Q$ of global space-time or gauge transformations and construct local ones, i.e. operators $J^Q_{O,\hat O}$ generating the correct symmetry on the field algebra $F(O)$ and localized in $\hat O$ (i.e. affiliated to $F(\hat O)$); and (2) these local generators should play the role of integrals of (time components of) Wightman
February 11, 2010 11:24 WSPC/148-RMP
92
J070-S0129055X10003904
G. Morsella & L. Tomassini
currents over $B$ with a smooth cut-off in $\hat B$ and possibly some smearing in time, so that one is led to conjecture that
\[ \frac{1}{\lambda^3}\int_{\mathbb{R}^4} f(x)\,\alpha_x\big(J^Q_{\lambda O,\lambda\hat O}\big)\,dx \to c\,j_0^Q(f) \tag{1.1} \]
holds, in a suitable sense, as $\lambda \to 0$. Here $\alpha$ denotes space-time translations, $j_0^Q(x)$ the sought-for Wightman current, $f \in \mathcal{S}(\mathbb{R}^4)$ any test function and $c$ a constant which (in view of the above interpretation of $J^Q_{O,\hat O}$) would be expected to satisfy
\[ \mathrm{vol}(B) \le c \le \mathrm{vol}(\hat B). \tag{1.2} \]
It is important to note that there is a large ambiguity in the choice of the local generators: since their action in $O' \cap \hat O$ is not fixed by the above requirements we are free to add perturbations in $\mathcal{F}(O' \cap \hat O)$. Thus, the limit (1.1) is not to be expected to converge in full generality, but we can still hope that a "canonical" choice or construction of the local generators might solve the problem (see below).
The first problem above was completely solved in [1] for the case of Abelian gauge transformation groups, while in [2, 3] the general case (including discrete and space-time symmetries and supersymmetries) was treated. The final result was that in physically reasonable theories what was called by the authors a canonical local unitary implementation of global symmetries exists and if a part of them actually constitutes a Lie group the corresponding canonical local generators provide a local representation of the associated Lie (current) algebras. A key assumption was identified in the so-called split property (for double cones), which holds in theories with a realistic thermodynamic behavior [4]. It expresses a strong form of statistical independence between the regions $O$ and $\hat O'$ and is equivalent to the existence of normal product states $\phi$ on $\mathcal{F}(O) \vee \mathcal{F}(\hat O)'$ such that $\phi(AB) = \omega(A)\omega(B)$ ($\omega$ being the vacuum state) for $A \in \mathcal{F}(O)$ and $B \in \mathcal{F}(\hat O)'$ [5].
However, the above-mentioned construction crucially depends on such a highly elusive object as the unique vector representative of the state $\phi$ in the (natural) cone
\[ \mathcal{P}^\natural_\Omega\big(\mathcal{F}(O) \vee \mathcal{F}(\hat O)'\big) = \overline{\Delta^{1/4}\big(\mathcal{F}(O) \vee \mathcal{F}(\hat O)'\big)_+\Omega} \tag{1.3} \]
(see [6]), where $\Omega$ indicates the vacuum vector and $\Delta$ the modular operator of the pair $(\mathcal{F}(O) \vee \mathcal{F}(\hat O)', \Omega)$, so that finding an explicit expression of the local generators appears as an almost hopeless task. This makes it extremely hard to proceed to the above-mentioned second step, i.e. the determination of the current fields themselves. Notwithstanding this, the reconstruction of the energy momentum tensor of a certain (optimal) class of 2-dimensional conformal models was carried out in [7], while partial results for the U(1)-current in the free massless 4-dimensional case were obtained in [8], showing that for the local generators of [3] the drawbacks briefly discussed after Eq. (1.1) might be less severe. However, in both cases the existence of a unitary implementation of dilations was crucial for handling the limit $\lambda \to 0$.
From Global Symmetries to Local Currents
In what follows, we restrict our attention to the case of continuous symmetries and propose a new method for obtaining local generators based on the existence of local unitary implementations of the flip automorphism, a requirement actually equivalent, under standard assumptions, to the split property [9]. This method turns out to be particularly suited for carrying out step (2) above, at least in the free field case.
To be more specific, we consider a quantum field theory defined by a net $O \to \mathcal{F}(O)$ of von Neumann algebras on open double cones in Minkowski 4-dimensional spacetime acting irreducibly on a Hilbert space $\mathcal{H}$ with scalar product $\langle\cdot,\cdot\rangle$ satisfying the following standard assumptions:
(1) there is a unitary strongly continuous representation $V$ on $\mathcal{H}$ of a compact Lie group $G$, which acts locally on $\mathcal{F}$,
\[ V(g)\mathcal{F}(O)V(g)^* = \mathcal{F}(O), \qquad g \in G, \]
and we set $\beta_g := \mathrm{Ad}\,V(g)$;
(2) (split property) for each pair of double cones $O_1 \Subset O_2$ (i.e. $\bar O_1 \subset O_2$) there exists a type I factor $\mathcal{N}$ such that $\mathcal{F}(O_1) \subset \mathcal{N} \subset \mathcal{F}(O_2)$.
To such a theory, we associate the doubled theory $O \to \tilde{\mathcal{F}}(O) := \mathcal{F}(O) \,\bar\otimes\, \mathcal{F}(O)$, with the corresponding unitary representation of $G$ given by $\tilde V(g) := V(g) \otimes V(g)$. In this situation, it is well known that for each pair of double cones $O_1 \Subset O_2$ there exists a local implementation of the flip automorphism of $\tilde{\mathcal{F}}(O_1)$, i.e. a unitary operator $W_{O_1,O_2} \in \tilde{\mathcal{F}}(O_2)$ such that
\[ W_{O_1,O_2}\, F_1 \otimes F_2\, W^*_{O_1,O_2} = F_2 \otimes F_1, \qquad F_1, F_2 \in \mathcal{F}(O_1). \tag{1.4} \]
Assume now, for the argument's sake, that there is a 1-parameter subgroup $\theta \in \mathbb{R} \to g_\theta \in G$ of $G$, such that the generator $Q$ of the corresponding unitary group $\theta \to V(g_\theta)$ is a bounded operator on $\mathcal{H}$. Considering the conditional expectation (Fubini mapping) $E_\Phi : B(\mathcal{H}) \,\bar\otimes\, B(\mathcal{H}) \to B(\mathcal{H})$ defined by
\[ E_\Phi(A_1 \otimes A_2) = \langle\Phi, A_2\Phi\rangle A_1, \qquad A_1, A_2 \in B(\mathcal{H}), \]
where $\Phi \in \mathcal{H}$ is such that $\|\Phi\| = 1$, we can define the operator
\[ J^Q_{O_1,O_2} := \Xi^\Phi_{O_1,O_2}(Q) := E_\Phi\big(W_{O_1,O_2}(1 \otimes Q)W^*_{O_1,O_2}\big), \tag{1.5} \]
and it is then easy to see that such operator gives a local implementation of the infinitesimal symmetry generated by $Q$ in the following natural sense:
\[ J^Q_{O_1,O_2} \in \mathcal{F}(O_2), \qquad [J^Q_{O_1,O_2}, F] = [Q, F], \qquad \forall\, F \in \mathcal{F}(O_1). \tag{1.6} \]
We also note that for this last equation to hold, it is sufficient that $W_{O_1,O_2}$ is only a semi-local implementation of the flip, i.e. a unitary in $\mathcal{F}(O_2) \,\bar\otimes\, B(\mathcal{H})$ for which (1.4) holds.
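As a finite-dimensional illustration of (1.4)-(1.6), the following sketch replaces the local algebra by the full matrix algebra on $\mathbb{C}^2$ and the flip by the swap matrix on $\mathbb{C}^2 \otimes \mathbb{C}^2$. The helper names (`kron`, `swap`, `fubini`) and the particular choices of `Q`, `F` and `PHI` are illustrative assumptions and not part of the construction in the text; in this toy case the local generator simply coincides with `Q`.

```python
# Toy model: H = C^2, flip W on H (x) H, Fubini map E_Phi,
# and J = E_Phi(W (1 (x) Q) W*), mimicking Eq. (1.5).

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    # Kronecker product: row (i, k) -> i*len(b) + k, column (j, l) -> j*len(b[0]) + l
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for r1, r2 in zip(a, b) for x, y in zip(r1, r2))

def swap(d):
    # W e_i (x) e_j = e_j (x) e_i
    w = [[0] * (d * d) for _ in range(d * d)]
    for i in range(d):
        for j in range(d):
            w[j * d + i][i * d + j] = 1
    return w

def fubini(t, phi):
    # E_Phi(A1 (x) A2) = <Phi, A2 Phi> A1, extended linearly to B(H (x) H)
    d = len(phi)
    return [[sum(complex(phi[k]).conjugate() * t[i * d + k][j * d + l] * phi[l]
                 for k in range(d) for l in range(d))
             for j in range(d)] for i in range(d)]

ONE = [[1, 0], [0, 1]]
Q = [[1, 0], [0, -1]]      # bounded "charge" (assumption)
F = [[0, 1], [0, 0]]       # an element of the toy local algebra
PHI = [0.6, 0.8j]          # unit vector in C^2

W = swap(2)
J = fubini(matmul(matmul(W, kron(ONE, Q)), dagger(W)), PHI)
```

In this sketch `J` agrees with `Q` and therefore has the same commutator with `F`, which is the finite-dimensional shadow of (1.6); the nontrivial content of the actual construction lies in the localization of $W_{O_1,O_2}$, which a matrix toy model cannot capture.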
The assumption of boundedness for $Q$ is of course very strong, and it is not expected to be satisfied in physically interesting models. In the unbounded case it is however possible, in the slightly more restrictive setting of [2, 3], to make sense of Eqs. (1.5) and (1.6) producing a self-adjoint operator $J^Q_{O_1,O_2}$ affiliated to $\mathcal{F}(O_2)$ and implementing the commutator with $Q$ on a suitable dense subalgebra of $\mathcal{F}(O_1)$. More explicitly, assume that the triple $\Lambda = (\mathcal{F}(O_1), \mathcal{F}(O_2), \Omega)$ is a standard split W*-inclusion in the sense of [10] and consider the unitary standard implementation $U_\Lambda : \mathcal{H} \to \mathcal{H} \otimes \mathcal{H}$ of the isomorphism
\[ \eta : F_1 F_2' \in \mathcal{F}(O_1) \vee \mathcal{F}(O_2)' \to F_1 \otimes F_2' \in \mathcal{F}(O_1) \,\bar\otimes\, \mathcal{F}(O_2)'. \]
This was used in [3] to define the universal localizing map $\psi_\Lambda : B(\mathcal{H}) \to B(\mathcal{H})$,
\[ \psi_\Lambda(T) = U_\Lambda^*(T \otimes 1)U_\Lambda, \qquad T \in B(\mathcal{H}), \]
where the standard type-I factor $\mathcal{N}_\Lambda = \psi_\Lambda(B(\mathcal{H}))$ satisfies $\mathcal{F}(O_1) \subset \mathcal{N}_\Lambda \subset \mathcal{F}(O_2)$. For the commutant standard inclusion $\Lambda' = (\mathcal{F}(O_2)', \mathcal{F}(O_1)', \Omega)$ [10], one has $\psi_{\Lambda'}(T) = U_\Lambda^*(1 \otimes T)U_\Lambda$. For any unitarily equivalent triple $\Lambda_0 = (V_0\mathcal{F}(O_1)V_0^*, V_0\mathcal{F}(O_2)V_0^*, V_0\Omega)$, one finds $U_{\Lambda_0} \cdot V_0 = V_0 \otimes V_0 \cdot U_\Lambda$. Notice that in the case of gauge transformations $\Lambda = \Lambda_0$ and so
\[ U_\Lambda V(g) = V(g) \otimes V(g) \cdot U_\Lambda. \tag{1.7} \]
It is then straightforward to verify that, with $Z_{1,3}$ the unitary interchanging the first and third factors in $\mathcal{H} \otimes \mathcal{H} \otimes \mathcal{H} \otimes \mathcal{H}$, the operator $W_\Lambda = (U_\Lambda^* \otimes U_\Lambda^*)Z_{1,3}(U_\Lambda \otimes U_\Lambda)$ is a local implementation of the flip. Setting $g = g_\theta$ in (1.7) and differentiating with respect to $\theta$, a simple computation shows that
\[ W_\Lambda(1 \otimes Q)W_\Lambda^* = \psi_\Lambda(Q) \otimes 1 + 1 \otimes \psi_{\Lambda'}(Q) = J^Q_\Lambda \otimes 1 + 1 \otimes J^Q_{\Lambda'}, \]
where $J^Q_\Lambda$, $J^Q_{\Lambda'}$ are the canonical local implementations of [2, 3], which of course satisfy (1.6). Choosing now $\Phi = U_\Lambda^*(\Omega \otimes \Omega)$, we see that $\Xi^\Phi_{O_1,O_2}(Q) = J^Q_\Lambda$. The above construction (1.5) therefore includes the canonical one as a particular case.
As remarked above, the control of the limit (1.1) for such operators does not seem within reach of the presently known techniques. However, we shall see in Sec. 3 below that if $Q$ is the (unbounded) generator of a 1-parameter subgroup of a compact Lie gauge group acting on a finite multiplet of free scalar fields of mass $m \ge 0$, it is possible to provide a different explicit (semi-)local implementation of the flip $W_{O_1,O_2}$ such that the limit (1.1) can actually be performed for the corresponding generator $J^Q_{O_1,O_2}$ (which is self-adjoint and satisfies (1.6) in the same sense as $J^Q_\Lambda$).
The rest of the paper is organized as follows. In Sec. 2, we introduce a new class of test functions spaces and use it to obtain estimates concerning certain free field bilinears; as it is shown in the Appendix, these estimates also allow to
establish the existence of the above-mentioned unitaries. This is used in Sec. 3, where we go into the study of our models of 4-dimensional free fields. We focus on the case of a single-charged free field with U(1) symmetry, the multiplet case being an easy generalization discussed at the end. We elaborate on the explicit realization of local unitaries implementing the flip automorphisms introduced for the neutral field case in [9], make use of the multiple commutator theorem in [11] to get an expression for the corresponding local generators of the U(1) symmetry and prove their (essential) self-adjointness on a suitable domain. Finally, convergence of the limit (1.1) is proved and the constant $c$ there shown to satisfy (1.2) (and in particular to be different from zero).
2. Test Functions Spaces and N-Bounds for Free Field Bilinears
We collect here some technical results, needed in the following section, on the extension of bilinear expressions in two commuting complex free scalar fields $\phi_i$, $i = 1, 2$, and their derivatives, to suitable spaces of tempered distributions. Using this, we will also obtain useful N-bounds for such operators.
The Hilbert space $\tilde{\mathcal{H}}$ on which the fields $\phi_i$ act is the bosonic second quantization of $K = L^2(\mathbb{R}^3) \otimes \mathbb{C}^4$. For $\Phi \in \tilde{\mathcal{H}}$, we denote by $\Phi^{(n)}$ its component in $K^{\otimes_S n}$ (the symmetrized $n$-fold tensor power of $K$) and by $\tilde D_0$ we indicate the dense space of $\Phi \in \tilde{\mathcal{H}}$ such that $\Phi^{(n)} = 0$ for all but finitely many $n \in \mathbb{N}_0$. Let $\tilde N$ be the number operator, defined by $(\tilde N\Phi)^{(n)} = n\Phi^{(n)}$ on the domain $D(\tilde N)$ of vectors $\Phi \in \tilde{\mathcal{H}}$ such that $\sum_n n^2\|\Phi^{(n)}\|^2 < \infty$. Fixing an orthonormal basis $(e_i^\tau)_{i=1,2}^{\tau=+,-}$ of $\mathbb{C}^4$, we can identify elements $\Phi \in K^{\otimes_S n}$ with collections $\Phi = (\Phi^{\tau_1\ldots\tau_n}_{i_1\ldots i_n})^{\tau_1\ldots\tau_n=+,-}_{i_1\ldots i_n=1,2}$ of functions on $\mathbb{R}^{3n}$, such that $\Phi^{\tau_1\ldots\tau_n}_{i_1\ldots i_n}(p_1,\ldots,p_n)$ is symmetric for the simultaneous interchange of $(\tau_k, i_k, p_k)$ and $(\tau_h, i_h, p_h)$, and
\[ \sum_{\substack{\tau_1,\ldots,\tau_n=+,-\\ i_1,\ldots,i_n=1,2}} \int_{\mathbb{R}^{3n}} dp_1\cdots dp_n\, |\Phi^{\tau_1\ldots\tau_n}_{i_1\ldots i_n}(p_1,\ldots,p_n)|^2 < \infty. \]
We introduce then the operators on $\tilde{\mathcal{H}}$
\[ c_i^{\tau,-}(\psi) = a(\psi \otimes e_i^\tau), \qquad c_i^{\tau,+}(\psi) = a(\bar\psi \otimes e_i^{-\tau})^*, \]
where $\psi \in L^2(\mathbb{R}^3)$ and $a(\xi)$, $\xi \in K$, is the usual Fock space annihilation operator. Their commutation relations are
\[ [c_i^{\tau,\sigma}(\psi), c_j^{\rho,\varepsilon}(\varphi)] = -\sigma\,\delta_{ij}\,\delta_{\tau,-\rho}\,\delta_{\sigma,-\varepsilon} \int_{\mathbb{R}^3} dp\, \psi(p)\varphi(p). \]
Introducing also the maps $j_\sigma : \mathcal{S}(\mathbb{R}^4) \to L^2(\mathbb{R}^3)$, $\sigma = +, -$, defined by $j_\sigma f(p) := \sqrt{2\pi/\omega_m(p)}\,\hat f(\sigma\omega_m(p), \sigma p)$ (where $\hat f(p) = \int_{\mathbb{R}^4} \frac{dx}{(2\pi)^2} f(x)e^{ipx}$ is the Fourier transform of $f$ and $\omega_m(p) = \sqrt{|p|^2 + m^2}$) and the notation $\phi_i^\dagger(f) := \phi_i(\bar f)^*$,
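we have
\[ \phi_i(f) = \frac{1}{\sqrt 2}\sum_{\sigma=+,-} c_i^{-,\sigma}(j_\sigma f), \qquad \phi_i^\dagger(f) = \frac{1}{\sqrt 2}\sum_{\sigma=+,-} c_i^{+,\sigma}(j_\sigma f). \]
With the notation $\partial := \partial_0$, we have, for $f \in \mathcal{S}(\mathbb{R}^8)$ and $\Phi \in \tilde D_0$,
\[ \big(:\!\partial^l\phi_i\,\partial^k\phi_j^\dagger\!:(f)\Phi\big)^{(n)} = \sum_{\sigma,\varepsilon} :\!\partial^l c_i^{-,\sigma}\partial^k c_j^{+,\varepsilon}\!:(f)^{(n)}\,\Phi^{(n-\sigma-\varepsilon)}, \tag{2.1} \]
where $:\!\partial^l c_i^{-,\sigma}\partial^k c_j^{+,\varepsilon}\!:(f)^{(n)} : K^{\otimes_S(n-\sigma-\varepsilon)} \to K^{\otimes_S n}$ is a bounded operator whose expression can be obtained from the formal expression of $\phi_i$ in terms of creation and annihilation operators. For instance, if $\Phi \in K^{\otimes_S n}$,
\[ \big(:\!\partial^l c_i^{-,+}\partial^k c_j^{+,-}\!:(f)^{(n)}\Phi\big)^{\tau_1\ldots\tau_n}_{i_1\ldots i_n}(p_1,\ldots,p_n) = \sum_{r=1}^n \delta_{\tau_r,+}\,\delta_{i,i_r}\, i^{l+k}(-1)^k\pi \int_{\mathbb{R}^3} dp\, \omega_m(p)^{k-1/2}\omega_m(p_r)^{l-1/2}\, \hat f(p_{r,+}, -p_+)\, \Phi^{+\tau_1\ldots\hat\tau_r\ldots\tau_n}_{j\,i_1\ldots\hat i_r\ldots i_n}(p, p_1, \ldots, \hat p_r, \ldots, p_n), \]
where the hat over an index means that the index itself must be omitted and where we have introduced the convention (which we will use systematically in the following) of denoting simply by $q_\sigma \in \mathbb{R}^4$ the 4-vector $(\sigma\omega_m(q), q)$, $\sigma = +, -$.
We now want to show that such operators can be extended to suitable spaces of tempered distributions on $\mathbb{R}^8$, which in turn are left invariant by the operation induced by the commutator of field bilinears.
Definition 2.1. We denote by $\hat{\mathcal{C}}$ the space of functions $f \in C^\infty(\mathbb{R}^8)$ such that for all $r \in \mathbb{N}$, $\alpha, \beta \in \mathbb{N}_0^4$,
\[ \|f\|_{r,\alpha,\beta} = \sup_{(p,q)\in\mathbb{R}^8} |(1 + |p + q|)^r\, \partial_p^\alpha\partial_q^\beta f(p,q)| < \infty. \]
Introducing the notation $\tilde f(p,q) := f(q,p)$ and the expressions
\[ (T^{k,l}(f)\Phi)(p) := \int_{\mathbb{R}^3} dq\, \omega_m(p)^{k-1/2}\omega_m(q)^{l-1/2} f(p_+, -q_+)\Phi(q), \]
\[ \Phi_f^{k,l,\sigma}(p,q) := f(\sigma p_+, \sigma q_+)\,\omega_m(p)^{k-1/2}\omega_m(q)^{l-1/2}, \]
where $k, l = 0, 1$ and $\sigma = +, -$, we denote by $\hat{\mathcal{C}}^{k,l}$ the space of functions $f \in \hat{\mathcal{C}}$ such that $T^{k,l}(|f|), T^{l,k}(|\tilde f|) : L^2(\mathbb{R}^3) \to L^2(\mathbb{R}^3)$ are bounded operators and $\Phi_f^{k,l,\sigma} \in L^2(\mathbb{R}^6)$. Furthermore, we introduce on $\hat{\mathcal{C}}^{k,l}$ the seminorm
\[ \|f\|_{k,l} := \max\{\|T^{k,l}(|f|)\|, \|T^{l,k}(|\tilde f|)\|, \|\Phi_f^{k,l,\sigma}\|_{L^2(\mathbb{R}^6)}\}. \]
The spaces $\hat{\mathcal{C}}^{k,l}$ depend also on the mass $m$ appearing in $\omega_m$, but we have avoided to indicate this explicitly in order not to burden the notations. It is clear that functions in $\hat{\mathcal{C}}$ are bounded with all their derivatives and therefore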
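$\hat{\mathcal{C}}^{k,l} \subset \mathcal{S}'(\mathbb{R}^8)$. We denote then by $\mathcal{C}^{k,l}$ the space of distributions $f \in \mathcal{S}'(\mathbb{R}^8)$ such that $\hat f \in \hat{\mathcal{C}}^{k,l}$. It is also easy to verify that $\mathcal{S}(\mathbb{R}^8) \subset \mathcal{C}^{k,l}$.
Lemma 2.1. The expression
\[ \hat C^{l,k}(f,g)(p,q) := (-1)^l\pi \sum_{\sigma=\pm} \sigma(i\sigma)^{k+l} \int_{\mathbb{R}^3} dk\, \omega_m(k)^{l+k-1} f(p, -\sigma k_+)\, g(\sigma k_+, q) \tag{2.2} \]
defines a bilinear map $\hat C^{l,k} : \hat{\mathcal{C}}^{l',l} \times \hat{\mathcal{C}}^{k,k'} \to \hat{\mathcal{C}}^{l',k'}$, such that $\|\hat C^{l,k}(f,g)\|_{l',k'} \le 2\pi\|f\|_{l',l}\|g\|_{k,k'}$.
Proof. We start by showing that if $f, g \in \hat{\mathcal{C}}$ then $\hat C^{l,k}(f,g) \in \hat{\mathcal{C}}$. Setting $\varepsilon = 2/|p + q|$, and $e = \varepsilon(p + q)/2$, it is clearly sufficient to show that, as $\varepsilon \to 0$,
\[ I_{h,r}(\varepsilon) := \int_{\mathbb{R}^3} \frac{|x|^{h/2}\,dx}{(1 + |x + \varepsilon^{-1}e|)^r (1 + |x - \varepsilon^{-1}e|)^r} \le O(\varepsilon^{s(r,h)}), \tag{2.3} \]
where $h = k + l - 1 = -1, 0, 1$, and $s(r,h) \to +\infty$ as $r \to +\infty$. Consider first the case $h = 0$. Choosing the $x_3$ axis along $e$ and evaluating the integral in prolate spheroidal coordinates
\[ x_1 = \varepsilon^{-1}\sqrt{(u^2-1)(1-v^2)}\cos\varphi, \quad x_2 = \varepsilon^{-1}\sqrt{(u^2-1)(1-v^2)}\sin\varphi, \quad x_3 = \varepsilon^{-1}uv, \]
one gets
\[ I_{0,r}(\varepsilon) = 2\pi\varepsilon^{2r-3}\left[\int_{1+\varepsilon}^{+\infty} du\, J_{r-1}(u) + \varepsilon^2\int_{1+\varepsilon}^{+\infty} du\, J_r(u) - 2\varepsilon\int_{1+\varepsilon}^{+\infty} du\, u J_r(u)\right], \]
where, by recursion,
\[ J_r(u) := \int_{-1}^{1} \frac{dv}{(u^2 - v^2)^r} = \sum_{k=1}^{r-1} \frac{2(2r-3)\cdots(2r-2k+1)}{(2r-2)\cdots(2r-2k)}\, \frac{1}{u^{2k}(u^2-1)^{r-k}} + \frac{(2r-3)!!}{(2r-2)!!}\, \frac{1}{u^{2r-1}} \log\frac{u+1}{u-1}, \]
which easily gives estimate (2.3) with $s(r,0) = r - 3$. Take now $h = -1$. Dividing the integration region into the subregions $\{|x| \le 1\}$, $\{|x| > 1\}$ and using the Cauchy–Schwarz inequality in the first integral, one gets
\[ I_{-1,r}(\varepsilon) \le \left(\int_{|x|\le 1} |x|^{-1}dx\right)^{1/2} I_{0,2r}(\varepsilon)^{1/2} + I_{0,r}(\varepsilon) \le O(\varepsilon^{r-3}). \]
Finally, for $h = 1$, taking into account the bound $|x|^{1/2}/\big((1 + |x + \varepsilon^{-1}e|)(1 + |x - \varepsilon^{-1}e|)\big) \le 1/2$, one gets $I_{1,r}(\varepsilon) \le O(\varepsilon^{r-4})$.
We now show that if $f \in \hat{\mathcal{C}}^{l',l}$, $g \in \hat{\mathcal{C}}^{k,k'}$, then $\hat C^{l,k}(f,g) \in \hat{\mathcal{C}}^{l',k'}$. We introduce the notation $K_\Psi$ to denote the Hilbert–Schmidt operator on $L^2(\mathbb{R}^3)$ with kernel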
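$\Psi \in L^2(\mathbb{R}^6)$. It is then easy to verify that, if $\Phi \in L^2(\mathbb{R}^3)$,
\[ \|T^{l',k'}(|\hat C^{l,k}(f,g)|)\Phi\|_2 \le \pi\big(\|T^{l',l}(|f|)\,T^{k,k'}(|g|)\,|\Phi|\|_2 + \|K_{\Phi^{l',l,+}_{|f|}} K_{\Phi^{k,k',-}_{|g|}} |\Phi|\|_2\big), \]
\[ \|T^{k',l'}(|\widetilde{\hat C^{l,k}(f,g)}|)\Phi\|_2 \le \pi\big(\|T^{k',k}(|\tilde g|)\,T^{l,l'}(|\tilde f|)\,|\Phi|\|_2 + \|K_{\Phi^{l',l,-}_{|f|}} K_{\Phi^{k,k',+}_{|g|}} |\Phi|\|_2\big), \]
so that $T^{l',k'}(|\hat C^{l,k}(f,g)|)$ and $T^{k',l'}(|\widetilde{\hat C^{l,k}(f,g)}|)$ are bounded. Furthermore one has, for $\Psi \in L^2(\mathbb{R}^6)$,
\[ |\langle \Phi^{l',k',+}_{\hat C^{l,k}(f,g)}, \Psi\rangle_{L^2(\mathbb{R}^6)}| \le \pi\big(\langle \Phi^{k,k',+}_{|g|}, (T^{l',l}(|f|)^* \otimes 1)|\Psi|\rangle_{L^2(\mathbb{R}^6)} + \langle \Phi^{l',l,+}_{|f|}, (1 \otimes T^{k',k}(|\tilde g|)^*)|\Psi|\rangle_{L^2(\mathbb{R}^6)}\big), \]
\[ |\langle \Phi^{l',k',-}_{\hat C^{l,k}(f,g)}, \Psi\rangle_{L^2(\mathbb{R}^6)}| \le \pi\big(\langle \Phi^{l',l,-}_{|f|}, (1 \otimes T^{k,k'}(|g|))|\Psi|\rangle_{L^2(\mathbb{R}^6)} + \langle \Phi^{k,k',-}_{|g|}, (T^{l,l'}(|\tilde f|) \otimes 1)|\Psi|\rangle_{L^2(\mathbb{R}^6)}\big), \]
so that by Riesz theorem $\Phi^{l',k',\sigma}_{\hat C^{l,k}(f,g)} \in L^2(\mathbb{R}^6)$. The bound on $\|\hat C^{l,k}(f,g)\|_{l',k'}$ now follows at once from the above estimates.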
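The integrals $J_r(u)$ appearing in the first part of the proof can be checked numerically. The sketch below builds $J_r$ from $J_1(u) = u^{-1}\log\frac{u+1}{u-1}$ through the standard integration-by-parts reduction $J_r = \frac{1}{(r-1)u^2(u^2-1)^{r-1}} + \frac{2r-3}{(2r-2)u^2}J_{r-1}$, which is our own restatement of the recursion mentioned in the proof and not a formula quoted from the text, and compares it with a composite Simpson quadrature. The function names are illustrative.

```python
import math

def J_reduction(r, u):
    """J_r(u) = int_{-1}^{1} dv / (u^2 - v^2)^r for u > 1, via the reduction formula."""
    j = math.log((u + 1.0) / (u - 1.0)) / u            # J_1(u)
    for s in range(2, r + 1):
        j = (1.0 / ((s - 1) * u**2 * (u**2 - 1.0) ** (s - 1))
             + (2 * s - 3) / ((2 * s - 2) * u**2) * j)
    return j

def J_simpson(r, u, n=2000):
    """Composite Simpson rule on [-1, 1] with n (even) subintervals."""
    h = 2.0 / n
    f = lambda v: 1.0 / (u * u - v * v) ** r
    total = f(-1.0) + f(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(-1.0 + i * h)
    return total * h / 3.0

def J2_closed(u):
    """The r = 2 instance of the closed formula displayed in the proof."""
    return 1.0 / (u**2 * (u**2 - 1.0)) + math.log((u + 1.0) / (u - 1.0)) / (2.0 * u**3)
```

For instance, `J_reduction(3, 1.5)` and `J_simpson(3, 1.5)` should agree to high accuracy, and `J2_closed` reproduces the $r = 2$ case of the displayed sum.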
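For $(f,g) \in \mathcal{C}^{l',l} \times \mathcal{C}^{k,k'}$, we write $C^{l,k}(f,g) := \hat C^{l,k}(\hat f, \hat g)^\vee$.
Proposition 2.1. The following statements hold for any $i, j \in \{1,2\}$, $k, l \in \{0,1\}$, $n \in \mathbb{N}$, $\sigma, \varepsilon \in \{+,-\}$, with $n - \sigma - \varepsilon \ge 0$.
(1) The map $f \in \mathcal{S}(\mathbb{R}^8) \to\, :\!\partial^l c_i^{-,\sigma}\partial^k c_j^{+,\varepsilon}\!:(f)^{(n)} \in B(K^{\otimes_S(n-\sigma-\varepsilon)}, K^{\otimes_S n})$ can be extended to a map (denoted by the same symbol) from $\mathcal{C}^{l,k}$ to $B(K^{\otimes_S(n-\sigma-\varepsilon)}, K^{\otimes_S n})$, such that
\[ \|:\!\partial^l c_i^{-,\sigma}\partial^k c_j^{+,\varepsilon}\!:(f)^{(n)}\| \le \pi\|\hat f\|_{l,k}(n+2). \tag{2.4} \]
(2) For each $f \in \mathcal{C}^{l,k}$ the operator $:\!\partial^l\phi_i\,\partial^k\phi_j^\dagger\!:(f)$, defined on $\tilde D_0$ by formula (2.1), satisfies
\[ \|(\tilde N+1)^{-1/2} :\!\partial^l\phi_i\,\partial^k\phi_j^\dagger\!:(f)\, (\tilde N+1)^{-1/2}\| \le \upsilon\|\hat f\|_{l,k}, \tag{2.5} \]
\[ \|(\tilde N+1)^{-1/2}\, [\tilde N, :\!\partial^l\phi_i\,\partial^k\phi_j^\dagger\!:(f)]\, (\tilde N+1)^{-1/2}\| \le \upsilon\|\hat f\|_{l,k}, \tag{2.6} \]
for some $\upsilon > 0$. If furthermore $(f,g) \in \mathcal{C}^{l',l} \times \mathcal{C}^{k,k'}$, there holds, on $\tilde D_0$,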
[: ∂ l φi ∂ l φ†i : (f ), : ∂ k φj ∂ k φ†j : (g)]
= δij : ∂ l φi ∂ k φ†j : (C l,k (f, g)) − δi ,j : ∂ k φj ∂ l φ†i : (C k ,l (g, f )) ,+ + il+l +k+k π 2 δi ,j δi,j ((−1)l+l Φlfˆ,l,− , Φk,k L2 (R6 ) g ˆ k,k ,− − (−1)k+k Φlfˆ,l,+ , Φ L2 (R6 ) )1. g ˆ
(2.7)
Proof. (1) Define the contraction operator $\Pi(\psi) : K^{\otimes(n+2)} \to K^{\otimes n}$, $\psi \in K^{\otimes 2}$, by $\Pi(\psi)\psi_1 \otimes\cdots\otimes \psi_{n+2} = \langle\psi, \psi_1 \otimes \psi_2\rangle\, \psi_3 \otimes\cdots\otimes \psi_{n+2}$. It is easily seen from the usual expressions of creation and annihilation operators (see, e.g., [12, Sec. X.7]) that for $f \in \mathcal{S}(\mathbb{R}^8)$
\[ :\!\partial^l c_i^{-,+}\partial^k c_j^{+,-}\!:(f)^{(n)} = i^l(-i)^k\pi \sum_{r=1}^n V_r\big((T^{l,k}(\hat f) \otimes |e_i^+\rangle\langle e_j^+|) \otimes 1 \otimes\cdots\otimes 1\big), \]
\[ :\!\partial^l c_i^{-,+}\partial^k c_j^{+,+}\!:(f)^{(n)} = \frac{i^{l+k}\pi}{\sqrt{n(n-1)}} \sum_{r\ne s}^{1,n} W_{r,s}\, \Pi\big(\Phi^{l,k,+}_{\hat f} \otimes (e_i^+ \otimes e_j^-)\big)^*, \]
\[ :\!\partial^l c_i^{-,-}\partial^k c_j^{+,-}\!:(f)^{(n)} = (-i)^{l+k}\pi\sqrt{(n+1)(n+2)}\; \Pi\big(\Phi^{l,k,-}_{\hat f} \otimes (e_i^- \otimes e_j^+)\big), \]
where for $\psi_i \in K$, $i = 1,\ldots,n$,
\[ V_r\,\psi_1 \otimes\cdots\otimes \psi_n = \psi_2 \otimes\cdots\otimes \underbrace{\psi_1}_{r\text{th place}} \otimes\cdots\otimes \psi_n, \]
\[ W_{r,s}\,\psi_1 \otimes\cdots\otimes \psi_n = \psi_3 \otimes\cdots\otimes \underbrace{\psi_1}_{r\text{th place}} \otimes\cdots\otimes \underbrace{\psi_2}_{s\text{th place}} \otimes\cdots\otimes \psi_n. \]
Thus the above formulas provide an extension of $:\!\partial^l c_i^{-,\sigma}\partial^k c_j^{+,\varepsilon}\!:(\cdot)^{(n)}$ to $\mathcal{C}^{l,k}$ and the bound (2.4) holds.
(2) The bounds (2.5) and (2.6), with $\upsilon = 4\pi(\sqrt 3 + 1)$, follow easily from (2.4). Equation (2.7) is obtained by a straightforward (if lengthy) calculation, using the above expressions for $:\!\partial^l c_i^{-,\sigma}\partial^k c_j^{+,\varepsilon}\!:(\cdot)^{(n)}$.
Remark 2.1. It is not difficult to see that the above defined extension of $:\!\partial^l c_i^{-,\sigma}\partial^k c_j^{+,\varepsilon}\!:(\cdot)^{(n)}$ to $\mathcal{C}^{l,k}$ is unique in the family of linear maps $S : \mathcal{C}^{l,k} \to B(K^{\otimes_S(n-\sigma-\varepsilon)}, K^{\otimes_S n})$ which are sequentially continuous when $B(K^{\otimes_S(n-\sigma-\varepsilon)}, K^{\otimes_S n})$ is equipped with the strong operator topology and $\mathcal{C}^{l,k}$ is equipped with the topology induced by the family of seminorms
\[ \|f\|_{k,l,\Psi} = \max\{\|T^{k,l}(|\hat f|)\Psi\|, \|T^{l,k}(|\tilde{\hat f}|)\Psi\|, \|\Phi^{k,l,\sigma}_{\hat f}\|_{L^2(\mathbb{R}^6)}\}, \qquad \Psi \in L^2(\mathbb{R}^3), \]
with respect to which $\mathcal{S}(\mathbb{R}^8)$ is sequentially dense in $\mathcal{C}^{l,k}$. On the other hand, we point out the fact that, according to Eq. (2.7), the linear span of extended field bilinears is stable under the operation of taking commutators. Together with Proposition A.1 in the Appendix, this implies that in the construction of the local symmetry generator carried out in the following section, Eq. (3.8), only the above defined extensions are relevant.
According to the results in [12, Sec. X.5], the bounds (2.5) and (2.6) imply that $:\!\partial^l\phi_i\,\partial^k\phi_j^\dagger\!:(f)$ can be extended to an operator, denoted by the same symbol, whose domain contains $D(\tilde N)$.
3. Reconstruction of the Free Field Noether Currents
We start by considering the theory of a complex free scalar field $\phi$ of mass $m \ge 0$. The Hilbert space of the theory is the symmetric Fock space $\mathcal{H} = \Gamma(L^2(\mathbb{R}^3) \otimes \mathbb{C}^2)$. As customary, we denote by $D_0 \subset \mathcal{H}$ the space of finite particle vectors, and by $N$ the number operator $N = d\Gamma(1)$, with domain $D(N)$. The local field algebras are defined as usual by
\[ \mathcal{F}(O) := \{e^{i[\phi(f)+\phi(f)^*]^-} : f \in \mathcal{D}(O)\}'', \]
and if we consider
\[ V(\theta) := \Gamma\left(1 \otimes \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix}\right), \]
we obtain a continuous unitary representation of U(1) (i.e. a $2\pi$-periodic representation of $\mathbb{R}$) on $\mathcal{H}$, $\theta \in \mathbb{R} \to V(\theta)$, which induces a group of gauge automorphisms $\beta_\theta := \mathrm{Ad}\,V(\theta)$ of $\mathcal{F}$ such that $\beta_\theta(\phi(f)) = e^{i\theta}\phi(f)$. We denote by $Q$ the self-adjoint generator of this group. It is easy to see that $\|(N+1)^{-1/2}Q(N+1)^{-1/2}\| \le 1$ and $[N, Q] = 0$, so that thanks to Nelson's commutator theorem (cf. [12, Sec. X.5]) $D(N) \subset D(Q)$. Furthermore we introduce the unitary operator $Z$ on $\mathcal{H}$ such that $Z\phi(f)Z^* = -\phi(f)$, $Z\Omega = \Omega$.
In order to find an explicit representation of the (semi-)local implementation of the flip automorphism we consider, following [9], the doubled theory $O \to \tilde{\mathcal{F}}(O) := \mathcal{F}(O) \,\bar\otimes\, \mathcal{F}(O)$, generated by the two commuting complex scalar fields $\phi_1(f) := \phi(f) \otimes 1$, $\phi_2(f) := 1 \otimes \phi(f)$. There is a continuous unitary representation of U(1) on $\tilde{\mathcal{H}} = \mathcal{H} \otimes \mathcal{H}$, $\zeta \in \mathbb{R} \to Y(\zeta)$, which induces a group of gauge automorphisms $\gamma_\zeta := \mathrm{Ad}\,Y(\zeta)$ of $\tilde{\mathcal{F}}$ such that
\[ \gamma_\zeta(\phi_1(f)) = \cos\zeta\,\phi_1(f) - \sin\zeta\,\phi_2(f), \qquad \gamma_\zeta(\phi_2(f)) = \sin\zeta\,\phi_1(f) + \cos\zeta\,\phi_2(f). \tag{3.1} \]
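The relevance of the rotation (3.1) for the flip can be seen at the level of coefficients: rotating by $\zeta = \pi/2$ and then changing the sign of the second field (the action of $\mathrm{Ad}(1 \otimes Z)$, with $Z$ as introduced above) exchanges $\phi_1$ and $\phi_2$. The following minimal sketch checks this on coefficient vectors $(a, b)$ of $a\phi_1 + b\phi_2$; the matrix representation and the names `gamma`, `compose`, `Z2` are illustrative assumptions, not objects defined in the text.

```python
import math

def gamma(zeta):
    """Matrix of gamma_zeta on coefficient vectors (a, b) of a*phi_1 + b*phi_2, read off from (3.1)."""
    c, s = math.cos(zeta), math.sin(zeta)
    return [[c, s], [-s, c]]

def compose(m2, m1):
    """Matrix product m2 @ m1, i.e. apply m1 first and then m2."""
    return [[sum(m2[i][k] * m1[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Ad(1 (x) Z): phi_1 -> phi_1, phi_2 -> -phi_2
Z2 = [[1, 0], [0, -1]]

# Rotation by pi/2 followed by the sign change on the second field
flip = compose(Z2, gamma(math.pi / 2))
```

Up to rounding, `flip` is the permutation matrix exchanging the two coefficients, and `gamma` satisfies the group law `compose(gamma(a), gamma(b)) == gamma(a + b)`.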
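In Proposition A.1 in the Appendix it is shown that the Noether current of this U(1) symmetry
\[ J_\mu(x) = \phi_1(x)\partial_\mu\phi_2(x)^* + \phi_1(x)^*\partial_\mu\phi_2(x) - \partial_\mu\phi_1(x)\phi_2(x)^* - \partial_\mu\phi_1(x)^*\phi_2(x) \tag{3.2} \]
is a well-defined Wightman field that when smeared with an $h \in \mathcal{S}_{\mathbb{R}}(\mathbb{R}^4)$ gives an operator which is essentially self-adjoint on $D(\tilde N)$, and generates a group of unitaries which locally implements the symmetry: given 3-dimensional open balls $B_r$, $B_{r+\delta}$ centered at the origin of radii $r + \delta > r > 0$ together with functions $\varphi \in \mathcal{D}_{\mathbb{R}}(B_{r+\delta-\tau})$, $\psi \in \mathcal{D}_{\mathbb{R}}((-\tau,\tau))$ such that $\tau < \delta/2$, $\varphi(x) = 1$ for each $x \in B_{r+\tau}$ and $\int_{\mathbb{R}}\psi = 1$, it holds that
\[ e^{i\zeta J_0(\psi\otimes\varphi)} \in \tilde{\mathcal{F}}(O_{r+\delta}), \qquad e^{i\zeta J_0(\psi\otimes\varphi)} F e^{-i\zeta J_0(\psi\otimes\varphi)} = \gamma_\zeta(F), \qquad \forall\, F \in \tilde{\mathcal{F}}(O_r), \tag{3.3} \]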
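where $O_r$, $O_{r+\delta}$ are the double cones with bases $B_r$, $B_{r+\delta}$, respectively. It then follows easily that setting $h_\lambda := \psi_\lambda \otimes \varphi_\lambda$ with $\varphi_\lambda(x) = \varphi(\lambda^{-1}x)$ and $\psi_\lambda(t) = \lambda^{-1}\psi(\lambda^{-1}t)$, the unitary operator
\[ W_{\lambda O_r,\lambda O_{r+\delta}} := (1 \otimes Z)\,e^{i\frac{\pi}{2}J_0(h_\lambda)} \in \mathcal{F}(\lambda O_{r+\delta}) \,\bar\otimes\, B(\mathcal{H}) \tag{3.4} \]
is a semi-local implementation of the flip automorphism on $\tilde{\mathcal{F}}(\lambda O_r)$ for each $\lambda > 0$. In what follows, we will keep the functions $\varphi$, $\psi$ fixed and we will assume that $\varphi(Rx) = \varphi(x)$ for each $R \in O(3)$.
For a function $h \in \mathcal{S}(\mathbb{R}^4)$, we introduce the distribution $h_\delta \in \mathcal{S}'(\mathbb{R}^8)$ defined by $h_\delta(x,y) = h(x)\delta(x-y)$ (i.e. $\langle h_\delta, f\rangle = \int_{\mathbb{R}^4} dx\, h(x)f(x,x)$ for $f \in \mathcal{S}(\mathbb{R}^8)$).
Proposition 3.1. Let the operator $W_{\lambda O_r,\lambda O_{r+\delta}}$ be defined as above. The operator $\Xi_{\lambda O_r,\lambda O_{r+\delta}}(Q)$ defined on $D(N)$ by
\[ \Xi_{\lambda O_r,\lambda O_{r+\delta}}(Q)\Phi = P_1\, W_{\lambda O_r,\lambda O_{r+\delta}}(1 \otimes Q)W^*_{\lambda O_r,\lambda O_{r+\delta}}\,\Phi \otimes \Omega, \qquad \Phi \in D(N), \tag{3.5} \]
where $P_1(\Phi_1 \otimes \Phi_2) = \langle\Omega, \Phi_2\rangle\Phi_1$, is essentially self-adjoint. Furthermore, there are distributions $K^{l,k}_{n,m}(\lambda) \in \mathcal{C}^{l,k}$, $n \in \mathbb{N}$, $l, k = 0, 1$, $m \ge 0$, defined recursively by
\[ K^{1,0}_{1,m}(\lambda) = -K^{0,1}_{1,m}(\lambda) := (h_\lambda)_\delta, \qquad K^{0,0}_{1,m}(\lambda) = K^{1,1}_{1,m}(\lambda) = 0, \tag{3.6} \]
\[ K^{l,k}_{n+1,m}(\lambda) = i(-1)^n \sum_{r=0}^{1} \big[(-1)^{l+1} C^{1-l,r}((h_\lambda)_\delta, K^{r,k}_{n,m}(\lambda)) + (-1)^k C^{r,1-k}(K^{l,r}_{n,m}(\lambda), (h_\lambda)_\delta)\big], \tag{3.7} \]
such that, for all $\Phi \in D(N)$,
\[ \Xi_{\lambda O_r,\lambda O_{r+\delta}}(Q)\Phi = \sum_{n=1}^{+\infty} \frac{\pi^{2n}}{(2n)!\,4^n} \sum_{l,k=0}^{1} :\!\partial^l\phi\,\partial^k\phi^\dagger\!:(K^{l,k}_{2n,m}(\lambda))\Phi, \tag{3.8} \]
the series being absolutely convergent for all $\lambda \in (0,1]$.
Proof. We start by observing that, for all $\Phi \in \mathcal{H}$ for which the right-hand side of (3.5) is defined, one has
\[ \Xi_{\lambda O_r,\lambda O_{r+\delta}}(Q)\Phi = P_1\, e^{i\frac{\pi}{2}J_0(h_\lambda)}(1 \otimes Q)e^{-i\frac{\pi}{2}J_0(h_\lambda)}\,\Phi \otimes \Omega. \tag{3.9} \]
It follows from this formula that $\Xi_{\lambda O_r,\lambda O_{r+\delta}}(Q)$ is well-defined (and symmetric) on $D(N)$: according to formula (A.1) in the Appendix for $J_0(h_\lambda)$, Proposition 2.1(2) and [11, Lemma 2], we have $e^{i\frac{\pi}{2}J_0(h_\lambda)}D(\tilde N) \subset D(\tilde N)$ and $D(N) \subset D(Q)$ as remarked above.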
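Recalling now the definition of $Q$ one has on $D(\tilde N)$
\[ Q_1(\lambda) := i[J_0(h_\lambda), 1 \otimes Q] = \sum_{j=1}^{2} \big[:\!\partial\phi_j\,\phi_{j'}^\dagger\!:((h_\lambda)_\delta) - :\!\phi_j\,\partial\phi_{j'}^\dagger\!:((h_\lambda)_\delta)\big] = \sum_{j=1}^{2}\sum_{l,k=0}^{1} :\!\partial^l\phi_j\,\partial^k\phi_{j'}^\dagger\!:(K^{l,k}_{1,m}(\lambda)), \]
where $j' = 3 - j$. Proceeding now inductively using formula (2.7), one verifies that there are operators $Q_n(\lambda)$ such that, on $\tilde D_0$,
\[ Q_{n+1}(\lambda) = i[J_0(h_\lambda), Q_n(\lambda)], \tag{3.10} \]
\[ Q_{2n}(\lambda) = \sum_{j=1}^{2}\sum_{l,k=0}^{1} (-1)^{j+1} :\!\partial^l\phi_j\,\partial^k\phi_j^\dagger\!:(K^{l,k}_{2n,m}(\lambda)), \qquad Q_{2n+1}(\lambda) = \sum_{j=1}^{2}\sum_{l,k=0}^{1} :\!\partial^l\phi_j\,\partial^k\phi_{j'}^\dagger\!:(K^{l,k}_{2n+1,m}(\lambda)), \tag{3.11} \]
where the distributions $K^{l,k}_{n,m}(\lambda) \in \mathcal{C}^{l,k}$ satisfy (3.7). It is also easy to verify inductively that the distributions $K^{l,k}_{n,m}(\lambda)$ are real ($g \in \mathcal{S}'$ being real if $\overline{\langle g, f\rangle} = \langle g, \bar f\rangle$), so that $Q_n(\lambda)$ is symmetric. Arguing again by induction, it follows from (3.7) and Lemma 2.1, that
\[ \|\hat K^{l,k}_{n,m}(\lambda)\|_{l,k} \le (8\pi)^{n-1}\big(\max\{\|\widehat{(h_\lambda)_\delta}\|_{0,1}, \|\widehat{(h_\lambda)_\delta}\|_{1,0}\}\big)^n \le (8\pi)^{n-1}\|h\|_{\mathcal{S}}^n, \]
where $\|h\|_{\mathcal{S}}$ is some fixed Schwartz norm of $h$. The last inequality above follows from Lemma A.1 and from the observation that, switching for a moment to the notation $\|\cdot\|^{(m)}_{l,k}$ in order to make explicit the dependence on the mass $m$ of the seminorms $\|\cdot\|_{l,k}$, one has
\[ \|\widehat{(h_\lambda)_\delta}\|^{(m)}_{l,1-l} = \|\hat h_\delta\|^{(\lambda m)}_{l,1-l}, \qquad l = 0, 1. \]
Using now the bounds in Proposition 2.1(2) and the results in [12, Sec. X.5], we see that $Q_n(\lambda)$ can be extended to an operator (denoted by the same symbol) which is essentially self-adjoint on any core for $\tilde N$. The domain $\tilde D_0$ being such a core, Eq. (3.10) can be assumed to hold weakly on $D(\tilde N) \times D(\tilde N)$ and we are therefore in the position of applying [11, Theorem 1∞] to obtain
\[ e^{i\frac{\pi}{2}J_0(h_\lambda)}(1 \otimes Q)e^{-i\frac{\pi}{2}J_0(h_\lambda)} = 1 \otimes Q + \sum_{n=1}^{+\infty} \frac{1}{n!}\left(\frac{\pi}{2}\right)^n Q_n(\lambda) \]
and the series converges strongly absolutely on $D(\tilde N)$. Combining this with (3.9), and the fact that
\[ P_1 :\!\partial^l\phi_j\,\partial^k\phi_{j'}^\dagger\!:(K^{l,k}_{2n+1,m}(\lambda))\,\Phi \otimes \Omega = 0 = P_1 :\!\partial^l\phi_2\,\partial^k\phi_2^\dagger\!:(K^{l,k}_{2n,m}(\lambda))\,\Phi \otimes \Omega, \]
Eq. (3.8) readily follows, upon identification of $\phi_1(f) = \phi(f) \otimes 1$ with $\phi(f)$.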
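The multiple commutator expansion used in this proof has an elementary finite-dimensional analogue, $e^{itA}Be^{-itA} = B + \sum_{n\ge1}\frac{t^n}{n!}Q_n$ with $Q_1 = i[A,B]$ and $Q_{n+1} = i[A,Q_n]$. The sketch below checks it for $2\times2$ matrices at $t = \pi/2$; the Pauli matrices standing in for $J_0(h_\lambda)$ and $1 \otimes Q$ are purely illustrative assumptions, and none of the domain questions settled by [11] arise for bounded matrices.

```python
import math

def mm(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lin(alpha, a, beta, b):
    """Entrywise linear combination alpha*a + beta*b."""
    return [[alpha * a[i][j] + beta * b[i][j] for j in range(2)] for i in range(2)]

SX = [[0, 1], [1, 0]]      # stands in for J_0(h_lambda) (assumption)
SZ = [[1, 0], [0, -1]]     # stands in for 1 (x) Q (assumption)
ID = [[1, 0], [0, 1]]

def conjugated_exact(t):
    # exp(i t SX) = cos(t) 1 + i sin(t) SX, because SX^2 = 1
    u = lin(math.cos(t), ID, 1j * math.sin(t), SX)
    u_dag = lin(math.cos(t), ID, -1j * math.sin(t), SX)
    return mm(mm(u, SZ), u_dag)

def conjugated_series(t, terms=40):
    # B + sum_{n>=1} t^n / n! * Q_n, with Q_1 = i[A, B] and Q_{n+1} = i[A, Q_n]
    total, q = [row[:] for row in SZ], SZ
    for n in range(1, terms + 1):
        q = lin(1j, mm(SX, q), -1j, mm(q, SX))
        total = lin(1, total, t ** n / math.factorial(n), q)
    return total
```

In this toy case the odd terms $Q_{2n+1}$ are proportional to a matrix different from $B$ while the even terms are multiples of $B$, loosely mirroring the split between $Q_{2n+1}(\lambda)$ and $Q_{2n}(\lambda)$ in (3.11).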
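It remains to prove that $\Xi_{\lambda O_r,\lambda O_{r+\delta}}(Q)$ is essentially self-adjoint on $D(N)$, but this again follows from the easily obtained N-bounds
\[ \|(N+1)^{-1/2}\,\Xi_{\lambda O_r,\lambda O_{r+\delta}}(Q)\,(N+1)^{-1/2}\| \le \gamma\cosh(4\pi^2\|h\|_{\mathcal{S}}), \qquad \|(N+1)^{-1/2}\,[N, \Xi_{\lambda O_r,\lambda O_{r+\delta}}(Q)]\,(N+1)^{-1/2}\| \le \gamma\cosh(4\pi^2\|h\|_{\mathcal{S}}), \tag{3.12} \]
where $\gamma > 0$ is a suitable numerical constant.
We now show that the unitary group generated by the operator $\Xi_{O_r,O_{r+\delta}}(Q)$ defined in the above proposition provides a local implementation of the U(1) symmetry.
Proposition 3.2. For each $\theta \in \mathbb{R}$ and $F \in \mathcal{F}(O_r)$ there holds:
\[ e^{i\theta\Xi_{O_r,O_{r+\delta}}(Q)} \in \mathcal{F}(O_{r+\delta}), \qquad e^{i\theta\Xi_{O_r,O_{r+\delta}}(Q)} F e^{-i\theta\Xi_{O_r,O_{r+\delta}}(Q)} = \beta_\theta(F). \]
Proof. Since the free field enjoys Haag duality property, it is sufficient to show that
\[ e^{i\theta\Xi_{O_r,O_{r+\delta}}(Q)}\, e^{i[\phi(f)+\phi(f)^*]^-}\, e^{-i\theta\Xi_{O_r,O_{r+\delta}}(Q)} = e^{i[\phi(f)+\phi(f)^*]^-} \]
if $\mathrm{supp}\, f \subset O_{r+\delta}'$ and that
\[ e^{i\theta\Xi_{O_r,O_{r+\delta}}(Q)}\, e^{i[\phi(f)+\phi(f)^*]^-}\, e^{-i\theta\Xi_{O_r,O_{r+\delta}}(Q)} = e^{i[e^{i\theta}\phi(f)+e^{-i\theta}\phi(f)^*]^-} \]
if $\mathrm{supp}\, f \subset O_r$. Applying once again [11, Theorem 1∞] and keeping in mind the previously obtained N-bounds for $\Xi_{O_r,O_{r+\delta}}(Q)$, Eq. (3.12), one sees that in order to achieve this, it is enough to show that for all $\Phi_1, \Phi_2 \in D(N)$
\[ \langle\Xi_{O_r,O_{r+\delta}}(Q)\Phi_1, \phi(f)\Phi_2\rangle - \langle\phi(f)^*\Phi_1, \Xi_{O_r,O_{r+\delta}}(Q)\Phi_2\rangle = 0 \tag{3.13} \]
for $\mathrm{supp}\, f \subset O_{r+\delta}'$ and
\[ \langle\Xi_{O_r,O_{r+\delta}}(Q)\Phi_1, \phi(f)\Phi_2\rangle - \langle\phi(f)^*\Phi_1, \Xi_{O_r,O_{r+\delta}}(Q)\Phi_2\rangle = \langle\Phi_1, \phi(f)\Phi_2\rangle \tag{3.14} \]
for $\mathrm{supp}\, f \subset O_r$. In order to prove the latter equation we compute
\[ \begin{aligned} \langle\Xi_{O_r,O_{r+\delta}}(Q)\Phi_1, \phi(f)\Phi_2\rangle &= \langle(1 \otimes Q)e^{-i\frac{\pi}{2}J_0(h)}\Phi_1 \otimes \Omega,\; e^{-i\frac{\pi}{2}J_0(h)}(\phi(f) \otimes 1)\Phi_2 \otimes \Omega\rangle \\ &= \langle(1 \otimes Q)e^{-i\frac{\pi}{2}J_0(h)}\Phi_1 \otimes \Omega,\; (1 \otimes \phi(f))e^{-i\frac{\pi}{2}J_0(h)}\Phi_2 \otimes \Omega\rangle \\ &= \langle(1 \otimes \phi(f)^*)e^{-i\frac{\pi}{2}J_0(h)}\Phi_1 \otimes \Omega,\; (1 \otimes Q)e^{-i\frac{\pi}{2}J_0(h)}\Phi_2 \otimes \Omega\rangle \\ &\quad + \langle e^{-i\frac{\pi}{2}J_0(h)}\Phi_1 \otimes \Omega,\; (1 \otimes \phi(f))e^{-i\frac{\pi}{2}J_0(h)}\Phi_2 \otimes \Omega\rangle \\ &= \langle\phi(f)^*\Phi_1, \Xi_{O_r,O_{r+\delta}}(Q)\Phi_2\rangle + \langle\Phi_1, \phi(f)\Phi_2\rangle, \end{aligned} \]
where in the second and fourth equalities we used (3.1) and (3.3), and in the third equality the fact that, as noted in the proof of Proposition 3.1, $e^{-i\frac{\pi}{2}J_0(h)}\Phi_i \otimes \Omega \in D(\tilde N)$ and that for $\tilde\Phi_1, \tilde\Phi_2 \in D(\tilde N)$, there holds
\[ \langle(1 \otimes Q)\tilde\Phi_1, (1 \otimes \phi(f))\tilde\Phi_2\rangle - \langle(1 \otimes \phi(f)^*)\tilde\Phi_1, (1 \otimes Q)\tilde\Phi_2\rangle = \langle\tilde\Phi_1, (1 \otimes \phi(f))\tilde\Phi_2\rangle, \]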
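which in turn is an easy consequence of the commutation relation
\[ [Q, \phi(f)]\Phi = \phi(f)\Phi, \qquad \Phi \in D(N), \]
of the fact that $\tilde N$ is the closure of $N \otimes 1 + 1 \otimes N$ and of the $\tilde N$-bounds holding for $1 \otimes Q$ and $1 \otimes \phi(f)$. The proof of (3.13) being analogous, we get the statement.
In the following lemma, we collect some properties of the distributions $K^{l,k}_{n,m} := K^{l,k}_{n,m}(1)$ which will be needed further on. We will use systematically the notations
\[ \|f\|_\alpha := \sup_{p\in\mathbb{R}^4}(1 + |p_0| + |\mathbf{p}|)^\alpha|f(p)|, \qquad \|\varphi\|_\alpha := \sup_{\mathbf{p}\in\mathbb{R}^3}(1 + |\mathbf{p}|)^\alpha|\varphi(\mathbf{p})|, \]
\[ \|\psi\|_{1,\infty} := \max\{\|\psi\|_\infty, \|\psi'\|_\infty\}, \qquad \|\varphi\|_{1,\alpha} := \max\{\|\varphi\|_\alpha, \|\partial_1\varphi\|_\alpha, \ldots, \|\partial_3\varphi\|_\alpha\}, \]
for $f \in \mathcal{S}(\mathbb{R}^4)$, $\varphi \in \mathcal{S}(\mathbb{R}^3)$, $\psi \in \mathcal{S}(\mathbb{R})$ and $\alpha > 0$.
Lemma 3.1. The following statements hold.
(1) The functions $\hat K^{l,k}_{n,m}$ enjoy the following symmetry properties:
\[ \hat K^{l,k}_{n,m}(p,q) = -\hat K^{k,l}_{n,m}(q,p), \qquad \hat K^{l,k}_{n,m}(p_0, R\mathbf{p}, q_0, R\mathbf{q}) = \hat K^{l,k}_{n,m}(p,q) \tag{3.15} \]
for all $p = (p_0, \mathbf{p})$, $q = (q_0, \mathbf{q}) \in \mathbb{R}^4$, and all $R \in O(3)$.
(2) Given $\alpha > 5$ there exists a constant $C_1 > 0$ such that, uniformly for all $m \in [0,1]$ and all smearing functions $\varphi \in \mathcal{D}_{\mathbb{R}}(B_{r+\delta-\tau})$, $\psi \in \mathcal{D}_{\mathbb{R}}((-\tau,\tau))$,
\[ |\hat K^{l,k}_{n,m}(p,q)| \le \frac{C_1^{n-1}}{4\pi^2}\,\|\hat\psi\|_\infty^n\,\|\hat\varphi\|_\alpha^n\,(1 + |\mathbf{p}|)^{2-l}(1 + |\mathbf{q}|)^{2-k}, \qquad n \in \mathbb{N}, \tag{3.16} \]
for all $p = (p_0, \mathbf{p})$, $q = (q_0, \mathbf{q}) \in \mathbb{R}^4$.
(3) For each $n \in \mathbb{N}$, the function $(p,q,m) \in \mathbb{R}^8 \times [0,1] \to \hat K^{l,k}_{n,m}(p,q)$ is continuous.
(4) For each $n \in \mathbb{N}$, the function $(p,q,m) \in \mathbb{R}^8 \times [0,1/e] \to \hat K^{l,k}_{n,m}(p,q)$ is of class $C^1$. Moreover, given $\alpha > 5$, there exists a constant $C_2 \ge C_1$ such that uniformly for all $m \in [0,1/e]$ and all smearing functions $\varphi \in \mathcal{D}_{\mathbb{R}}(B_{r+\delta-\tau})$, $\psi \in \mathcal{D}_{\mathbb{R}}((-\tau,\tau))$,
\[ \left|\frac{\partial}{\partial u_\mu}\hat K^{l,k}_{n,m}(p,q)\right| \le \frac{C_1^{n-1}}{4\pi^2}\,\|\hat\psi\|_{1,\infty}^n\,\|\hat\varphi\|_{1,\alpha}^n\,(1 + |\mathbf{p}|)^{2-l}(1 + |\mathbf{q}|)^{2-k}, \tag{3.17} \]
\[ \left|\frac{\partial}{\partial m}\hat K^{l,k}_{n,m}(p,q)\right| \le m|\log m|\,\frac{C_2^{n-1}}{4\pi^2}\,\|\hat\psi\|_{1,\infty}^n\,\|\hat\varphi\|_{1,\alpha}^n\,(1 + |\mathbf{p}|)^{2-l}(1 + |\mathbf{q}|)^{2-k}, \tag{3.18} \]
for all $p = (p_0, \mathbf{p})$, $q = (q_0, \mathbf{q}) \in \mathbb{R}^4$, and where $u$ in (3.17) is $p$ or $q$.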
Proof. (1) Both properties in (3.15) follow easily by induction from the recursive definition of $\hat K^{l,k}_{n,m}$, taking into account rotational invariance of the function $\varphi$.
(2) We start by observing that, by interchanging $\mathbf{k}$ with $-\mathbf{k}$ in the $\sigma = -1$ summand, formula (2.2) can be rewritten as
\[ \hat C^{l,k}(f,g)(p,q) := (-1)^l\pi \sum_{\sigma=\pm} \sigma(i\sigma)^{k+l} \int_{\mathbb{R}^3} d\mathbf{k}\, \omega_m(\mathbf{k})^{l+k-1} f(p, -k_\sigma)\, g(k_\sigma, q), \tag{3.19} \]
where we recall that $k_\sigma = (\sigma\omega_m(\mathbf{k}), \mathbf{k})$. Since $\alpha > 5$, there exists a fixed constant
\[ B_1 > \int_{\mathbb{R}^3} \frac{d\mathbf{k}}{|\mathbf{k}|(1 + |\mathbf{p} - \mathbf{k}|)^\alpha}, \quad \int_{\mathbb{R}^3} \frac{|\mathbf{k}|^s\, d\mathbf{k}}{(1 + |\mathbf{k}|)^\alpha}, \qquad s = 0, 1, 2, \quad \mathbf{p} \in \mathbb{R}^3. \]
It is then easily computed that for $h = -1, 0, 1$, $j = 1, 2$ and $m \in [0,1]$,
\[ \int_{\mathbb{R}^3} d\mathbf{k}\, \frac{\omega_m(\mathbf{k})^h (1 + |\mathbf{k}|)^j}{(1 + |\mathbf{p} - \mathbf{k}|)^\alpha} \le 7B_1(1 + |\mathbf{p}|)^{h+j}, \]
so that estimate (3.16) follows by induction from (3.7) and the above expression for $\hat C^{l,k}$, provided one defines $C_1 := 14B_1/\pi$ and keeps in mind that $\hat h_\delta(p,q) = \frac{1}{4\pi^2}\hat\psi(p_0 + q_0)\,\hat\varphi(\mathbf{p} + \mathbf{q})$.
(3) Using (3.16) and the fact that $\hat h \in \mathcal{S}(\mathbb{R}^4)$, we obtain a bound to the integrands in $\hat C^{1-l,r}(\hat h_\delta, \hat K^{r,k}_{n,m})$ and $\hat C^{r,1-k}(\hat K^{l,r}_{n,m}, \hat h_\delta)$ with an integrable function of $\mathbf{k}$, uniformly for $(p,q,m)$ in a prescribed neighborhood of any given $(\bar p, \bar q, \bar m) \in \mathbb{R}^8 \times [0,1]$. By a straightforward application of Lebesgue's dominated convergence theorem, the continuity of $(p,q,m) \to \hat K^{l,k}_{n,m}(p,q)$ follows then by induction from the recursive relation (3.7).
(4) Since $\hat K^{l,k}_{n,m} \in \hat{\mathcal{C}}^{l,k}$, we already know that it is differentiable with respect to the components of $p$ and $q$. The estimate (3.17) and the continuity of $(p,q,m) \to \frac{\partial}{\partial u_\mu}\hat K^{l,k}_{n,m}(p,q)$ then follow by an easy adaptation of the inductive arguments of points (2) and (3) above, using also (3.16). In order to show that $\hat K^{l,k}_{n,m}$ is continuously differentiable in $m$ and satisfies (3.18), we proceed again by induction using (3.7). The $m$-derivative of the integrands in $\hat C^{1-l,r}(\hat h_\delta, \hat K^{r,k}_{n,m})$ is given, apart from numerical constants, by
\[ \frac{m(r-l)}{\omega_m(\mathbf{k})^{2+l-r}}\,\hat h(p - k_\sigma)\hat K^{r,k}_{n,m}(k_\sigma, q) - \frac{\sigma m}{\omega_m(\mathbf{k})^{1+l-r}}\left[\partial_0\hat h(p - k_\sigma)\hat K^{r,k}_{n,m}(k_\sigma, q) - \hat h(p - k_\sigma)\frac{\partial\hat K^{r,k}_{n,m}}{\partial p_0}(k_\sigma, q)\right] + \omega_m(\mathbf{k})^{r-l}\,\hat h(p - k_\sigma)\frac{\partial}{\partial m}\hat K^{r,k}_{n,m}(k_\sigma, q). \]
It is now straightforward to verify, using (3.16), (3.17) and the inductive hypothesis (3.18), that it is possible to bound the last three terms in the above
expression with an integrable function of $\mathbf{k}$, uniformly for $(p,q,m)$ in a given neighborhood of a fixed $(\bar p, \bar q, \bar m) \in \mathbb{R}^8 \times [0,1/e]$. The same reasoning also applies to the first term when $2 + l - r < 3$ and also when $2 + l - r = 3$ for $|\mathbf{k}| \ge 1/2$. For $|\mathbf{k}| \le 1/2$ and $2 + l - r = 3$ the first term can be bounded uniformly in a neighborhood of $(\bar p, \bar q)$ by the function $m(m + |\mathbf{k}|)^{-3}$, apart from a constant (depending on the chosen neighborhood). By maximizing the function $x \to x^3|\log x|^\beta/(m + x)^3$ in the interval $[0, 1/2]$, with $\beta > 1$, one finds the bound
β −β/3 3 mW0 e 3 m 3m ≤ (m + |k|)3 β |k|3 |log|k||β
3 ,
where $W_0$ is the principal branch of Lambert's $W$ function [13]. From the asymptotic expansion of $W_0$ given in [13, Eq. (4.20)] it is then easily seen that the numerator on the right-hand side converges to 0 as $m\to0$; since the function $\mathbf k\mapsto|\mathbf k|^{-3}|\log|\mathbf k||^{-\beta}$ is integrable for $|\mathbf k|\le1/2$, interchangeability of differentiation with respect to $m$ and integration with respect to $\mathbf k$ in $\hat C^{1-l,r}(\hat h_\delta,\hat K^{r,k}_{n,m})$ for all values of $l,r,k=0,1$ follows. A completely analogous argument applies of course to $\hat C^{r,1-k}(\hat K^{l,r}_{n,m},\hat h_\delta)$, so that we conclude that $\hat K^{l,k}_{n+1,m}$ is continuously differentiable in $m$. To complete the inductive step, it remains to be shown that estimate (3.18) holds for $\frac{\partial}{\partial m}\hat K^{l,k}_{n+1,m}$. In order to do that, we argue again in a similar way as in point (2) by choosing constants $B_2,B_3>0$ such that
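\[
B_2\ge\int_{\mathbb R^3}\frac{d\mathbf k}{|\mathbf k|^t(1+|\mathbf p-\mathbf k|)^\alpha},\quad\int_{\mathbb R^3}\frac{|\mathbf k|^s\,d\mathbf k}{(1+|\mathbf k|)^\alpha},\qquad s=0,1,\quad t=0,1,2,\quad\mathbf p\in\mathbb R^3,
\]
\[
B_3\ge\log\bigl(1+\sqrt{1+m^2}\bigr)-\frac{1}{\sqrt{1+m^2}},\qquad m\in[0,1/e].
\]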
Taking now into account the identity
\[
\int_0^1\frac{x^2}{(m^2+x^2)^{3/2}}\,dx=\log\bigl(1+\sqrt{1+m^2}\bigr)-\frac{1}{\sqrt{1+m^2}}-\log m,
\]
it is easy to verify that the estimate
\[
\left|\frac{\partial}{\partial m}\hat C^{1-l,r}(\hat h_\delta,\hat K^{r,k}_{n,m})(p,q)\right|\le\frac{m|\log m|}{8\pi^3}\,[16\pi(1+B_3)+16B_2+7B_1]\,C_2^{n-1}\,\|\hat\psi\|^{n+1}_{1,\infty}\|\hat\varphi\|^{n+1}_{1,\alpha}\,(1+|\mathbf p|)^{2-l}(1+|\mathbf q|)^{2-k}
\]
holds for all $m\in[0,1/e]$ together with a similar one for $\frac{\partial}{\partial m}\hat C^{r,1-k}(\hat K^{l,r}_{n,m},\hat h_\delta)$. Choosing $C_2:=\frac{2}{\pi}[16\pi(1+B_3)+16B_2+7B_1]\ge C_1$, one finally gets (3.18) for $\hat K^{l,k}_{n+1,m}$.
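The closed form of the integral $\int_0^1 x^2(m^2+x^2)^{-3/2}\,dx$ used in the last step follows from the antiderivative $\operatorname{arsinh}(x/m)-x/\sqrt{m^2+x^2}$. The following Python sketch compares it with a direct numerical quadrature; the function names and the choice of composite Simpson integration are ours and serve only as a sanity check. Note also that, for $m\in(0,1/e]$, one has $|\log m|\ge1$, so that $m(B_3+|\log m|)\le(1+B_3)\,m|\log m|$, which is presumably where the factor $(1+B_3)$ in the estimate comes from.

```python
import math


def integral_closed_form(m):
    """log(1 + sqrt(1 + m^2)) - 1/sqrt(1 + m^2) - log(m), valid for m > 0."""
    s = math.sqrt(1.0 + m * m)
    return math.log(1.0 + s) - 1.0 / s - math.log(m)


def simpson(func, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def integral_numeric(m):
    """Numerical value of the integral of x^2 / (m^2 + x^2)^(3/2) over [0, 1]."""
    return simpson(lambda x: x * x / (m * m + x * x) ** 1.5, 0.0, 1.0)


if __name__ == "__main__":
    for m in (0.05, 0.1, math.exp(-1.0)):
        print(m, integral_numeric(m), integral_closed_form(m))
```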
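In the next theorem, which is our main result, we denote by $D_{0,\mathscr S}$ the dense subspace of $\mathscr H$ of finite particle vectors such that the $n$-particle wave functions are in $\mathscr S(\mathbb R^{3n})$ for each $n\in\mathbb N$.

Theorem 3.1. There holds, for each $f\in\mathscr S(\mathbb R^4)$ and each $\Phi\in D_{0,\mathscr S}$,
\[
\lim_{\lambda\to0}\frac{1}{\lambda^3}\int_{\mathbb R^4}dx\,f(x)\,\alpha_x\bigl(\Xi^{\lambda O_r,\lambda O_{r+\delta}}(Q)\bigr)\Phi=c\,j_0(f)\Phi,\tag{3.20}
\]
where $j_0(f)={:}\partial\phi\phi^\dagger-\phi\partial\phi^\dagger{:}(f_\delta)$ is the Noether current associated to the U(1) symmetry of the charged Klein–Gordon field of mass $m\ge0$ smeared with the test function $f$ and
\[
c=-(2\pi)^4\sum_{n=1}^{+\infty}\frac{\pi^{2n}}{4^n(2n)!}\left[\hat K^{0,1}_{2n,0}(0,0)+i\,\frac{\partial\hat K^{0,0}_{2n,0}}{\partial p_0}(0,0)\right].\tag{3.21}
\]

Proof. Since $D_{0,\mathscr S}$ is translation invariant and contained in $D(N)$, according to Proposition 2.1 and the estimates given in the proof of Proposition 3.1 there exists a $\upsilon>0$ such that, for each $x\in\mathbb R^4$,
\[
\bigl\|\alpha_x\bigl({:}\partial^l\phi\partial^k\phi^\dagger{:}(K^{l,k}_{2n,m}(\lambda))\bigr)\Phi\bigr\|\le\upsilon(8\pi)^{2n-1}\|h\|^{2n}_{\mathscr S}\,\|(N+1)\Phi\|,
\]
and
\[
\bigl\|\alpha_x\bigl({:}\partial^l\phi\partial^k\phi^\dagger{:}(K^{l,k}_{2n,m}(\lambda))\bigr)\Phi-\alpha_y\bigl({:}\partial^l\phi\partial^k\phi^\dagger{:}(K^{l,k}_{2n,m}(\lambda))\bigr)\Phi\bigr\|
\le\upsilon(8\pi)^{2n-1}\|h\|^{2n}_{\mathscr S}\,\|(U(x)^*-U(y)^*)(N+1)\Phi\|
+\bigl\|(U(x)-U(y))\,{:}\partial^l\phi\partial^k\phi^\dagger{:}(K^{l,k}_{2n,m}(\lambda))\,U(y)^*\Phi\bigr\|,
\]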
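so that the function $x\mapsto\alpha_x(\Xi^{\lambda O_r,\lambda O_{r+\delta}}(Q))\Phi$ is continuous and bounded in norm for each $\Phi\in D_{0,\mathscr S}$, the integral in (3.20) exists in the Bochner sense and furthermore it is possible to interchange the integral and the series.

Given now $K\in\mathcal C^{l,k}$, it is easy to see that the pointwise product $\hat K\hat f_\delta$ still belongs to $\hat{\mathcal C}^{l,k}$ and $\|\hat K\hat f_\delta\|_{l,k}\le\frac{1}{(2\pi)^2}\|\hat f\|_\infty\|\hat K\|_{l,k}$, so that we can define $K*f:=(2\pi)^4(\hat K\hat f_\delta)^\vee\in\mathcal C^{l,k}$. It is then straightforward to check that
\[
\int_{\mathbb R^4}dx\,f(x)\,\alpha_x\bigl({:}\partial^l\phi\partial^k\phi^\dagger{:}(K^{l,k}_{2n,m}(\lambda))\bigr)\Phi={:}\partial^l\phi\partial^k\phi^\dagger{:}(K^{l,k}_{2n,m}(\lambda)*f)\Phi.
\]
Furthermore one has $\hat K^{l,k}_{2n,m}(\lambda)(p,q)=\lambda^{2+l+k}\hat K^{l,k}_{2n,\lambda m}(\lambda p,\lambda q)$ and, with the notation $(\delta_\lambda K)^\wedge(p,q)=\hat K(\lambda p,\lambda q)$, we see that we are left with the calculation of
\[
\lim_{\lambda\to0}\sum_{l,k}^{0,1}\lambda^{l+k-1}\sum_{n=1}^{+\infty}\frac{\pi^{2n}}{4^n(2n)!}\,{:}\partial^l\phi\partial^k\phi^\dagger{:}(\delta_\lambda K^{l,k}_{2n,\lambda m}*f)\Phi.\tag{3.22}
\]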
As a first step in this calculation, we show that it is possible to interchange the limit and the series. Of course, it is sufficient to consider vectors Φ with vanishing n-particles components except for n = N with any fixed N ∈ N. For simplicity, we will give here only the relevant estimates in the case m > 0, the case m = 0 being
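treated in a similar way. Using then the notations for creation and annihilation operators and for wave functions introduced in Sec. 2 and the formulas in the proof of Proposition 2.1, we have
\[
\bigl\|{:}\partial^lc^{-,+}\partial^kc^{+,-}{:}(\delta_\lambda K^{l,k}_{2n,\lambda m}*f)^{(N)}\Phi\bigr\|\le16\pi^5N\,\bigl\|\bigl((T^{l,k}((\delta_\lambda K^{l,k}_{2n,\lambda m})^\wedge\hat f_\delta)\otimes|e_+\rangle\langle e_+|)\otimes1\otimes\cdots\otimes1\bigr)\Phi\bigr\|,
\]
together with the estimate, for $\lambda\in[0,1/m]$,
\[
\bigl|\bigl[\bigl((T^{l,k}((\delta_\lambda K^{l,k}_{2n,\lambda m})^\wedge\hat f_\delta)\otimes|e_+\rangle\langle e_+|)\otimes1\otimes\cdots\otimes1\bigr)\Phi\bigr]_{\tau_1\dots\tau_N}(\mathbf p_1,\dots,\mathbf p_N)\bigr|
\le\frac{C_1^{n-1}B_1}{4\pi^2}\,\|\hat\psi\|^n_\infty\|\hat\varphi\|^n_\alpha\|\hat f\|_\beta\,\frac{(1+|\mathbf p_1|)^{2-l}\,\omega_m(\mathbf p_1)^{l-1/2}}{(1+|\mathbf p_2|)^\alpha\cdots(1+|\mathbf p_N|)^\alpha}
\times\int_{\mathbb R^3}d\mathbf q\,\frac{\omega_m(\mathbf q)^{k-1/2}(1+|\mathbf q|)^{2-k}}{(1+|\mathbf q|)^\gamma(1+|\mathbf p_1-\mathbf q|)^\beta},
\]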
where we have used (3.16) and the fact that $\Phi\in D_{0,\mathscr S}$ (which gives the constant $B_1>0$). It is now easy to see that the right hand side is a square integrable function of $(\mathbf p_1,\dots,\mathbf p_N)$ if $\alpha>3/2$, $\beta>3$, $\gamma>15/2$ and therefore we get
\[
\bigl\|{:}\partial^lc^{-,+}\partial^kc^{+,-}{:}(\delta_\lambda K^{l,k}_{2n,\lambda m}*f)^{(N)}\Phi\bigr\|\le B_2\,C_1^{n-1}\|\hat\psi\|^n_\infty\|\hat\varphi\|^n_\alpha,
\]
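where $B_2>0$ is a constant depending on $m$, $f$, $\Phi$ but not on $n$ and $\lambda$. A similar estimate holds then for $\|{:}\partial^lc^{-,-}\partial^kc^{+,+}{:}(\delta_\lambda K^{l,k}_{2n,\lambda m}*f)^{(N)}\Phi\|$. Furthermore we have
\[
\bigl\|{:}\partial^lc^{-,-}\partial^kc^{+,-}{:}(\delta_\lambda K^{l,k}_{2n,\lambda m}*f)^{(N-2)}\Phi\bigr\|\le16\pi^5N(N-1)\,\bigl\|\Phi^{l,k,-}_{(\delta_\lambda K^{l,k}_{2n,\lambda m})^\wedge\hat f_\delta}\bigr\|_{L^2(\mathbb R^6)}\,\|\Phi\|,
\]
with
\[
\bigl\|\Phi^{l,k,-}_{(\delta_\lambda K^{l,k}_{2n,\lambda m})^\wedge\hat f_\delta}\bigr\|^2_{L^2(\mathbb R^6)}\le\frac{C_1^{2(n-1)}B_3}{16\pi^4}\,\|\hat\psi\|^{2n}_\infty\|\hat\varphi\|^{2n}_\alpha\|\hat f\|^2_\beta\times\int_{\mathbb R^6}d\mathbf p\,d\mathbf q\,\frac{(1+|\mathbf p|)^3(1+|\mathbf q|)^3}{(1+|\mathbf p|+|\mathbf q|)^{2\beta}},
\]
for some $\beta>6$. A similar estimate holds for $\|{:}\partial^lc^{-,+}\partial^kc^{+,+}{:}(\delta_\lambda K^{l,k}_{2n,\lambda m}*f)^{(N+2)}\Phi\|$. In summary, we get, uniformly for $\lambda\in[0,1/m]$,
\[
\bigl\|{:}\partial^l\phi\partial^k\phi^\dagger{:}(\delta_\lambda K^{l,k}_{2n,\lambda m}*f)\Phi\bigr\|\le B_4\,C_1^{n-1}\|\hat\psi\|^n_\infty\|\hat\varphi\|^n_\alpha,
\]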
with $B_4$ independent of $\lambda$ and $n$, so that, if $l+k\ge1$, it is possible to interchange the limit and the sum in (3.22). The term in (3.22) with $l=k=0$ needs however a separate treatment, due to the divergent prefactor $\lambda^{-1}$. We first observe that, due to the first relation in (3.15), we have $\hat K^{0,0}_{n,m}(0,0)=0$. Using bounds (3.17) and (3.18), we thus obtain the estimate
\[
\left|\frac1\lambda\hat K^{0,0}_{2n,\lambda m}(\lambda p_\sigma,\lambda q_{\sigma'})\right|=\left|\frac1\lambda\int_0^\lambda d\mu\,\frac{d}{d\lambda}\hat K^{0,0}_{2n,\lambda m}(\lambda p_\sigma,\lambda q_{\sigma'})\Big|_{\lambda=\mu}\right|
\le\frac{3C_2^{n-1}}{4\pi^2}\,\|\hat\psi\|^n_{1,\infty}\|\hat\varphi\|^n_{1,\alpha}\,(m+|\mathbf p|+|\mathbf q|)(1+|\mathbf p|)^2(1+|\mathbf q|)^2,
\]
valid for $\sigma,\sigma'=\pm$ and for $\lambda\in[0,\lambda_0]$, with $\lambda_0:=\min\{1/(em),1\}$. Then a straightforward adaptation of the above arguments easily gives, uniformly for $\lambda\in[0,\lambda_0]$,
\[
\frac1\lambda\bigl\|{:}\phi\phi^\dagger{:}(\delta_\lambda K^{0,0}_{2n,\lambda m}*f)\Phi\bigr\|\le B_5\,C_2^{n-1}\|\hat\psi\|^n_{1,\infty}\|\hat\varphi\|^n_{1,\alpha},\tag{3.23}
\]
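with $B_5>0$ a constant independent of $\lambda$ and $n$. The same estimates above, being uniform in $\lambda\in[0,1/m]$, together with use of Lemma 3.1(3), allow us also to conclude that
\[
\lim_{\lambda\to0}{:}\partial^l\phi\partial^k\phi^\dagger{:}(\delta_\lambda K^{l,k}_{2n,\lambda m}*f)\Phi=(2\pi)^4\hat K^{l,k}_{2n,0}(0,0)\,{:}\partial^l\phi\partial^k\phi^\dagger{:}(f_\delta)\Phi.\tag{3.24}
\]
Furthermore there holds
\[
\lim_{\lambda\to0}\left(\frac1\lambda\,\delta_\lambda K^{0,0}_{2n,\lambda m}*f\right)^{\wedge}(p,q)=(2\pi)^4(p_0-q_0)\,\frac{\partial\hat K^{0,0}_{2n,0}}{\partial p_0}(0,0)\,\hat f(p+q),
\]
since, as a consequence of (3.15), we have
\[
\frac{\partial\hat K^{0,0}_{2n,0}}{\partial p_i}(0,0)=0=\frac{\partial\hat K^{0,0}_{2n,0}}{\partial q_i}(0,0),\quad i=1,2,3,\qquad\frac{\partial\hat K^{0,0}_{2n,0}}{\partial p_0}(0,0)=-\frac{\partial\hat K^{0,0}_{2n,0}}{\partial q_0}(0,0).
\]
Exploiting again the uniformity in $\lambda\in[0,\lambda_0]$ of the estimates leading to (3.23), we finally get
\[
\lim_{\lambda\to0}\frac1\lambda\,{:}\phi\phi^\dagger{:}(\delta_\lambda K^{0,0}_{2n,\lambda m}*f)\Phi=-(2\pi)^4\,i\,\frac{\partial\hat K^{0,0}_{2n,0}}{\partial p_0}(0,0)\,{:}\partial\phi\phi^\dagger-\phi\partial\phi^\dagger{:}(f_\delta)\Phi.
\]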
Together with (3.24), this gives the statement.

We stress that vanishing of the constant $c$ in the previous theorem is still by no means ruled out. That in general this is not the case can be seen by choosing the time-smearing function $\psi\in\mathscr D_{\mathbb R}((-\tau,\tau))$ sufficiently close to a $\delta$ function and the space-smearing function $\varphi\in\mathscr D_{\mathbb R}(B_{r+\delta-\tau})$ to a characteristic function.

Proposition 3.3. Assume that the time-smearing function $\psi$ used in the construction of $\Xi^{\lambda O_r,\lambda O_{r+\delta}}(Q)$ satisfies $\psi(t)=\tau^{-1}\psi_1(\tau^{-1}t)$, where $\psi_1\in\mathscr D_{\mathbb R}((-1,1))$ is such that $\int_{\mathbb R}\psi_1=1$, and that the space-smearing function $\varphi$ is such that $\varphi\in\mathscr D_{\mathbb R}(B_{r+\delta/2+\varepsilon})$, $0\le\varphi\le1$ and $\varphi(\mathbf x)=1$ for all $\mathbf x\in B_{r+\delta/2-\varepsilon}$, with $\varepsilon<\delta/2-\tau$. Then, denoting with $c(\tau,\varepsilon)$ the corresponding constant given by Eq. (3.21), there holds
\[
\lim_{\varepsilon\to0}\lim_{\tau\to0}c(\tau,\varepsilon)=\frac43\pi\left(r+\frac\delta2\right)^3.\tag{3.25}
\]
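The value in (3.25) can be sanity-checked against the series that appears in the proof, namely $\lim_{\tau\to0}c(\tau,\varepsilon)=-\frac12\sum_{n\ge1}\frac{(-1)^n\pi^{2n}}{(2n)!}\int_{\mathbb R^3}d\mathbf x\,\varphi(\mathbf x)^{2n}$. If $\varphi$ is replaced by the characteristic function of the ball $B_{r+\delta/2}$, every integral equals the volume of the ball, and the remaining numerical series is $\cos\pi-1=-2$, so that the constant reduces to that volume. The short Python sketch below evaluates the partial sums; the function names and the sample values of $r$ and $\delta$ are illustrative assumptions of ours.

```python
import math


def series_partial_sum(terms=40):
    """Partial sum of sum_{n>=1} (-1)^n pi^(2n) / (2n)!, which tends to cos(pi) - 1 = -2."""
    return sum(
        (-1) ** n * math.pi ** (2 * n) / math.factorial(2 * n)
        for n in range(1, terms + 1)
    )


def limiting_constant(r, delta, terms=40):
    """Value of -1/2 * series * volume for phi = characteristic function of B_{r + delta/2}."""
    volume = 4.0 / 3.0 * math.pi * (r + delta / 2.0) ** 3
    return -0.5 * series_partial_sum(terms) * volume


if __name__ == "__main__":
    r, delta = 1.0, 0.5  # sample radii, chosen only for illustration
    print(series_partial_sum())
    print(limiting_constant(r, delta), 4.0 / 3.0 * math.pi * (r + delta / 2.0) ** 3)
```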
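Proof. By induction, it is straightforward to prove the following formula for $\hat K^{l,k}_{n,0}$:
\[
\hat K^{l,k}_{n,0}(p,q)=\frac{(-1)^{k+n-1}\,i^{\,n-l-k}\,\eta_n}{(2\pi)^{n+1}}\sum_{r_1,\dots,r_{n-2}}^{0,1}\ \sum_{\sigma_1,\dots,\sigma_{n-1}}\ \prod_{j=1}^{n-1}\sigma_j^{\,r_j-r_{j-1}}
\]
\[
\times\int_{\mathbb R^{3(n-1)}}\prod_{j=1}^{n-1}d\mathbf k_j\,|\mathbf k_j|^{r_j-r_{j-1}}\,\hat h(p-k_{1,\sigma_1})\,\hat h(k_{1,\sigma_1}-k_{2,\sigma_2})\cdots\hat h(k_{n-1,\sigma_{n-1}}+q),
\]
where $\eta_n=i$ for $n$ even and $\eta_n=-1$ for $n$ odd and $r_0:=l$, $r_{n-1}:=1-k$. Since $\hat\psi(p_0)=\hat\psi_1(\tau p_0)\to(2\pi)^{-1/2}$ as $\tau\to0$ and $k_{j,\sigma_j}=(\sigma_j|\mathbf k_j|,\mathbf k_j)$, it is easy to see that in the limit $\tau\to0$ the dependence on the $\sigma_j$'s drops off the integral in the second line of the above equation and therefore
\[
\lim_{\tau\to0}\hat K^{0,1}_{2n,0}(0,0)=\frac{(-1)^n2^{2n-1}}{(2\pi)^{3n+1}}\,\hat\varphi*\cdots*\hat\varphi(0)=\frac{(-1)^n4^n}{2(2\pi)^4}\int_{\mathbb R^3}d\mathbf x\,\varphi(\mathbf x)^{2n}.
\]
Analogously, since $\hat\psi'(p_0)=\tau\hat\psi_1'(\tau p_0)\to0$ as $\tau\to0$, one has from the above formula
\[
\lim_{\tau\to0}\frac{\partial\hat K^{0,0}_{2n,0}}{\partial p_0}(0,0)=0.
\]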
But, thanks to the estimates (3.16), (3.17), the convergence of the series (3.21) is uniform in $\tau$, so that one has
\[
\lim_{\tau\to0}c(\tau,\varepsilon)=-\frac12\sum_{n=1}^{+\infty}\frac{(-1)^n\pi^{2n}}{(2n)!}\int_{\mathbb R^3}d\mathbf x\,\varphi(\mathbf x)^{2n}.
\]
Since $\varphi$ is bounded above by the characteristic function of the ball $B_{r+\delta}$ for $\varepsilon<\delta/2$, the convergence of the above series is also uniform in $\varepsilon$ so that, taking into account that $\varphi$ converges to the characteristic function of the ball $B_{r+\delta/2}$ when $\varepsilon\to0$, we finally get (3.25).

It is straightforward to extend the above analysis to treat the case of the net $O\to\mathscr F(O)$ generated by a multiplet of free scalar fields $\phi_a$, $a=1,\dots,d$, with the action of a compact Lie group $G$ defined by
\[
V(g)\phi_a(f)V(g)^*=\sum_{b=1}^dv(g)_{ab}\,\phi_b(f),\qquad g\in G,
\]
where $v$ is a $d$-dimensional unitary representation.
More precisely, consider the 1-parameter subgroup $\theta\in\mathbb R\to g_{\theta\xi}\in G$ associated to a Lie algebra element $\xi\in\mathfrak g$ and correspondingly the global generator $Q^\xi$ of $\theta\to V(g_{\theta\xi})$, which satisfies on $D(N)$
\[
[Q^\xi,\phi_a(f)]=-i\sum_{b=1}^dt(\xi)_{ab}\,\phi_b(f),
\]
$\xi\to t(\xi)$ being the representation of $\mathfrak g$ (through antihermitian matrices) associated to $v$. Then considering again the U(1) symmetry of the doubled theory and the associated Noether current $J_0$ it is possible to define a semi-local implementation of the flip as in Eq. (3.4) and to construct a local implementation $\Xi^{\lambda O_r,\lambda O_{r+\delta}}(Q^\xi)$ of $Q^\xi$ as in Eq. (3.5), which is essentially self-adjoint on $D(N)$ and for which an expansion analogous to (3.8) holds:
\[
\Xi^{\lambda O_r,\lambda O_{r+\delta}}(Q^\xi)\Phi=\sum_{n=1}^{+\infty}\frac{\pi^{2n}}{4^n(2n)!}\sum_{l,k}^{0,1}\sum_{a,b=1}^dt(\xi)_{ab}\,{:}\partial^l\phi_a\partial^k\phi_b^\dagger{:}(K^{l,k}_{2n,m}(\lambda))\Phi,
\]
where $K^{l,k}_{2n,m}(\lambda)$ are the distributions defined in (3.6) and (3.7). Finally, the analogue of formula (3.20) holds, where on the right-hand side the appropriate Noether current
\[
j_0^\xi(f)=\sum_{a,b=1}^dt(\xi)_{ab}\,{:}\phi_a\partial\phi_b^\dagger-\partial\phi_a\phi_b^\dagger{:}(f_\delta)
\]
appears and the normalization constant $c$ is again given by (3.21).

4. Summary and Outlook

In the present work we have shown that it is in principle possible to construct operators implementing locally a given infinitesimal symmetry of a local net of von Neumann algebras (local generators), starting from the existence of unitary operators implementing (semi-)locally the flip automorphism on the tensor product of the net with itself. In particular, in a large class of free scalar field models our construction provides an efficient tool to obtain manageable such local generators through the explicit expression of the local flip given in Eq. (3.4). Moreover, we showed that it is possible to recover, up to a well-determined strictly positive normalization constant, the associated Noether currents through a natural scaling limit of these generators in which the localization region shrinks to a point. As expected, the above-mentioned constant is found to depend only on the volume of the initial localization region of the generator and not on the mass and isospin of the model. The existence of this limit depends in this case on control of the energy behavior of the generators (namely the existence of H-bounds) rather than on dilation invariance of the (thus massless) theory, which was a key ingredient of previous similar results [7, 8].
These results have been obtained in the spirit of giving a consistency check towards a full quantum Noether theorem according to the program set down in [1] and recalled in the introduction. In order to proceed further in this direction it is apparent that two main problems have to be tackled. First, it is necessary to extend the construction of local generators proposed in the Introduction to a suitably general class of theories. Second, it would be desirable to gain a deeper understanding of the general properties granting the existence and non-triviality of the pointlike limit of the free generators, which are presently under investigation. Among other things, this is likely connected with the problem of clarifying if it is generally possible, through a suitable choice of the local flip implementation, to gain control over the "boundary part" of the local symmetry implementation, whose arbitrariness is considered to be an important obstruction for the reconstruction of Noether currents. The methods of [14] can be expected to be useful to put this analysis in a more general framework. Finally, we believe that our method could help to shed some light on the difficult problem of obtaining sharply localized charges from global ones.

Acknowledgments

We would like to thank Sergio Doplicher for originally suggesting the problem to one of us and for his constant support and encouragement, and Sebastiano Carpi for several interesting and useful discussions. We also thank the referees for suggesting several improvements in the exposition. This work was supported by MIUR, GNAMPA-INDAM, the SNS, the Marie Curie Research Training Network MRTNCT-2006-031962 EU-NCG and the ERC Advanced Grant 227458 "Operator Algebras and Conformal Field Theory".

Appendix. Local Implementation of the Doubled Theory U(1) Symmetry

In this Appendix, we show that the smeared Noether current associated to the U(1) symmetry of the theory of two complex free scalar fields of mass $m\ge0$, Eq. (3.1), is represented by a self-adjoint operator which generates a group locally implementing the symmetry. Although this material is more or less standard, we include it here both for the convenience of the reader and because the proof of self-adjointness of (Wick-ordered) bilinear expressions in the free field (and its derivatives) can be found in the literature only for mass $m>0$ (see [15, 16]). For this reason, we will only emphasize the main differences in the (possibly) massless case. To begin with, the main estimates in the appendix of [15], which are valid only for $m>0$, have to be sharpened as in the following lemma.

Lemma A.1. Let $h\in\mathscr S(\mathbb R^4)$, and consider the tempered distribution $h_\delta(x,y)=h(x)\delta(x-y)$. Then $h_\delta\in\mathcal C^{0,1}\cap\mathcal C^{1,0}$ for all $m\ge0$, and $\|\hat h_\delta\|_{0,1},\|\hat h_\delta\|_{1,0}\le\|h\|_{\mathscr S}$, where $\|\cdot\|_{\mathscr S}$ is some Schwartz norm independent of $m$ varying in bounded intervals.
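Proof. One has $\hat h_\delta(p,q)=\frac{1}{(2\pi)^2}\hat h(p+q)$, which implies $\hat h_\delta\in\hat{\mathcal C}$. We denote by $w(\mathbf p,\mathbf q)$ the integral kernel defining $T^{1,0}(|\hat h_\delta|)$. It is easy to see that, for $|\mathbf q|\ge1$,
\[
\sqrt{\frac{\omega_m(\mathbf p)}{\omega_m(\mathbf q)}}\le(1+|\mathbf p-\mathbf q|)^{1/2},
\]
and therefore, being $\hat h\in\mathscr S(\mathbb R^4)$, there exists a $C_1>0$ and an $r>3$ such that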
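\[
\int_{\mathbb R^3}d\mathbf p\left|\int_{|\mathbf q|>1}d\mathbf q\,w(\mathbf p,\mathbf q)\Phi(\mathbf q)\right|^2\le\int_{\mathbb R^3}d\mathbf p\left(\int_{|\mathbf q|>1}d\mathbf q\,\frac{C_1}{(1+|\mathbf p-\mathbf q|)^r}|\Phi(\mathbf q)|\right)^2\le C_1^2\left(\int_{\mathbb R^3}\frac{d\mathbf p}{(1+|\mathbf p|)^r}\right)^2\|\Phi\|^2_{L^2(\mathbb R^3)},
\]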
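where use was made of the Young inequality $\|f*g\|_{L^2}\le\|f\|_{L^1}\|g\|_{L^2}$. On the other hand, there exist $C_2>0$ and $s>2$ such that, for $|\mathbf p|>1$,
\[
\left|\int_{|\mathbf q|\le1}d\mathbf q\,w(\mathbf p,\mathbf q)\Phi(\mathbf q)\right|\le\int_{|\mathbf q|\le1}d\mathbf q\,\sqrt{\frac{\omega_m(\mathbf p)}{|\mathbf q|}}\,\frac{C_2}{(1+|\mathbf p-\mathbf q|)^s}|\Phi(\mathbf q)|\le C_2\frac{\sqrt{\omega_m(\mathbf p)}}{|\mathbf p|^s}\int_{|\mathbf q|\le1}d\mathbf q\,\frac{|\Phi(\mathbf q)|}{\sqrt{|\mathbf q|}}\le C_2\frac{\sqrt{2\pi\omega_m(\mathbf p)}}{|\mathbf p|^s}\,\|\Phi\|_{L^2(\mathbb R^3)},
\]
and a $C_3>0$ such that, for $|\mathbf p|\le1$,
\[
\left|\int_{|\mathbf q|\le1}d\mathbf q\,w(\mathbf p,\mathbf q)\Phi(\mathbf q)\right|\le C_3\int_{|\mathbf q|\le1}d\mathbf q\,\sqrt{\frac{\omega_m(\mathbf p)}{|\mathbf q|}}\,|\Phi(\mathbf q)|\le\sqrt{2\pi}\,C_3(1+m^2)^{1/4}\,\|\Phi\|_{L^2(\mathbb R^3)}.
\]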
Putting these inequalities together, we obtain
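\[
\|T^{1,0}(|\hat h_\delta|)\Phi\|_{L^2(\mathbb R^3)}\le\left[\sqrt2\,C_1\int_{\mathbb R^3}\frac{d\mathbf p}{(1+|\mathbf p|)^r}+\sqrt{2\pi}\,C_2\left(\int_{|\mathbf p|>1}d\mathbf p\,\frac{\omega_m(\mathbf p)}{|\mathbf p|^{2s}}\right)^{1/2}+2\pi\sqrt{2/3}\,C_3(1+m^2)^{1/4}\right]\|\Phi\|_{L^2(\mathbb R^3)},
\]
so that, since the constants $C_i$ can be expressed by Schwartz norms of $h$, we conclude that $\|T^{1,0}(|\hat h_\delta|)\|\le\|h\|_{\mathscr S}$ for a suitable Schwartz norm $\|\cdot\|_{\mathscr S}$. The proofs that $\|T^{1,0}(|\hat{\tilde h}_\delta|)\|,\|T^{0,1}(|\hat h_\delta|)\|,\|T^{0,1}(|\hat{\tilde h}_\delta|)\|\le\|h\|_{\mathscr S}$ are completely analogous and it is immediate to see that $\Phi^{0,1,\sigma}_{\hat h_\delta},\Phi^{1,0,\sigma}_{\hat h_\delta}\in L^2(\mathbb R^6)$, $\sigma=\pm$, and that their norms can be bounded by $\|h\|_{\mathscr S}$.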
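Two elementary facts enter the proof of Lemma A.1 and can be checked independently. First, since $\omega_m(\mathbf p)\le\omega_m(\mathbf q)+|\mathbf p-\mathbf q|$ and $\omega_m(\mathbf q)\ge1$ for $|\mathbf q|\ge1$, one has $\omega_m(\mathbf p)\le(1+|\mathbf p-\mathbf q|)\,\omega_m(\mathbf q)$. Second, the numerical constant of the last term arises as $\sqrt{2\pi}\cdot\sqrt{4\pi/3}=2\pi\sqrt{2/3}$, where $2\pi=\int_{|\mathbf q|\le1}d\mathbf q\,|\mathbf q|^{-1}$ and $4\pi/3$ is the volume of the unit ball. The Python sketch below tests both by random sampling; it reflects our reading of the estimates, and the function names, sampling ranges and seed are arbitrary illustrative choices.

```python
import math
import random


def omega(m, p):
    """Relativistic energy omega_m(p) = sqrt(m^2 + |p|^2) for a 3-vector p."""
    return math.sqrt(m * m + sum(c * c for c in p))


def dist(p, q):
    """Euclidean distance |p - q|."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def random_vector(rng, radius):
    """Random 3-vector of the given length, with direction from a normalized Gaussian."""
    v = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in v)) or 1.0
    return [radius * c / norm for c in v]


def worst_ratio(samples=20000, seed=0):
    """Largest sampled value of omega(p) / (omega(q) * (1 + |p - q|)) with |q| >= 1."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        m = rng.uniform(0.0, 2.0)
        q = random_vector(rng, 1.0 + rng.expovariate(1.0))  # |q| >= 1
        p = random_vector(rng, rng.expovariate(0.2))
        ratio = omega(m, p) / (omega(m, q) * (1.0 + dist(p, q)))
        worst = max(worst, ratio)
    return worst


if __name__ == "__main__":
    # The inequality predicts that this value never exceeds 1.
    print(worst_ratio())
    # Both expressions for the constant of the last term.
    print(math.sqrt(2 * math.pi) * math.sqrt(4 * math.pi / 3), 2 * math.pi * math.sqrt(2 / 3))
```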
This lemma, together with Proposition 2.1, shows that the timelike component $J_0(h)$ of the current (3.2) is well-defined for $h\in\mathscr S(\mathbb R^4)$. Using the fact that $|p_i|\le\omega_m(\mathbf p)$, the proof above shows that the spacelike components $J_i(h)$, $i=1,2,3$, are well-defined too.

Proposition A.1. The following statements hold.

(1) For each $h\in\mathscr S(\mathbb R^4)$, the operator $J_\mu(h)$ defined on $D(\tilde N)$ by
\[
J_\mu(h):=\sum_{j=1}^2(-1)^j\bigl[{:}\partial_\mu\phi_j\phi_{j'}^\dagger{:}(h_\delta)-{:}\phi_j\partial_\mu\phi_{j'}^\dagger{:}(h_\delta)\bigr],\tag{A.1}
\]
where $j'=3-j$, defines a Wightman field such that $J_\mu(h)$ is essentially self-adjoint for real $h$.

(2) If $h\in\mathscr D_{\mathbb R}(O)$, $O$ a double cone, then $e^{i\zeta J_\mu(h)}\in\tilde{\mathscr F}(O)$, $\zeta\in\mathbb R$.

(3) Given a 3-dimensional open ball $B_r$ of radius $r$ centered at the origin together with functions $\varphi\in\mathscr D_{\mathbb R}(\mathbb R^3)$, $\psi\in\mathscr D_{\mathbb R}((-\tau,\tau))$ such that $\varphi(\mathbf x)=1$ for each $\mathbf x\in B_{r+\tau}$ and $\int_{\mathbb R}\psi=1$, it holds that
\[
e^{i\zeta J_0(\psi\otimes\varphi)}Fe^{-i\zeta J_0(\psi\otimes\varphi)}=\gamma_\zeta(F),\qquad\forall F\in\tilde{\mathscr F}(O_r),\tag{A.2}
\]
where $O_r$ is the double cone with base $B_r$.

Proof. (1) According to Lemma A.1 one has $\|\hat h_\delta\|_{0,1},\|\hat h_\delta\|_{1,0}\le\|h\|_{\mathscr S}$, so that $J_\mu$ is a Wightman field and $J_\mu(h)$ is symmetric for real $h$. Given now a $\Phi\in\mathscr K^{\otimes_Sn}$, $J_\mu(h)^p\Phi$ is the sum of $16^p$ vectors of the form
\[
{:}\partial_\mu^{l_p}c^{-,\sigma_p}_{j_p}\partial_\mu^{k_p}c^{+,\varepsilon_p}_{j_p'}{:}(h)^{(n_p)}\cdots{:}\partial_\mu^{l_1}c^{-,\sigma_1}_{j_1}\partial_\mu^{k_1}c^{+,\varepsilon_1}_{j_1'}{:}(h)^{(n_1)}\Phi
\]
with $n_j=n_{j-1}+\sigma_j+\varepsilon_j$, $j=1,\dots,p$ ($n_0:=n$). Therefore, by (2.4),
\[
\|J_\mu(h)^p\Phi\|\le\left(\frac{4\|h\|_{\mathscr S}}{\pi}\right)^p(n+2(p+1))\cdots(n+4)\,\|\Phi\|,
\]
and we see that $\Phi$ is an analytic vector for $J_\mu(h)$. Since any element in $\tilde D_0$ is a finite sum of such vectors, essential self-adjointness of $J_\mu(h)$ follows.

(2) A straightforward but lengthy calculation shows that, on $\tilde D_0$,
\[
[J_\mu(h),\phi_j(f)+\phi_j(f)^*]=(-1)^{j+1}i\,(\phi_{j'}(g)+\phi_{j'}(g)^*),\qquad g=h(\partial_\mu\Delta*f)+\partial_\mu(h(\Delta*f)),\tag{A.3}
\]
where, as customary, $\Delta$ is the Fourier transform of $\frac{1}{2\pi i}\varepsilon(p_0)\delta(p^2-m^2)$. Since $\operatorname{supp}\Delta$ is contained in the closed light cone and $\tilde D_0$ is an invariant dense set of analytic vectors for both $J_\mu(h)$ and $\phi_j(f)+\phi_j(f)^*$, we see by standard arguments that $e^{i\zeta J_\mu(h)}$ commutes with $e^{i[\phi_j(f)+\phi_j(f)^*]^-}$ if $\operatorname{supp}h$ is spacelike from $\operatorname{supp}f$, i.e. $e^{i\zeta J_\mu(h)}\in\tilde{\mathscr F}(O')'=\tilde{\mathscr F}(O)$ if $\operatorname{supp}h\subset O$.
(3) Take $f\in\mathscr D(O_r)$. Since $\operatorname{supp}\Delta*f$ does not intersect $[-\tau,\tau]\times\{\mathbf x:\varphi(\mathbf x)\neq1\}$ we have that
\[
\psi\otimes\varphi\,(\partial_0\Delta*f)+\partial_0(\psi\otimes\varphi\,(\Delta*f))=\psi\otimes1\,(\partial_0\Delta*f)+\partial_0(\psi\otimes1\,(\Delta*f)).
\]
On the other hand, a calculation shows that, thanks to $\int_{\mathbb R}\psi=1$,
\[
\Delta*\bigl(\psi\otimes1\,(\partial_0\Delta*f)+\partial_0(\psi\otimes1\,(\Delta*f))\bigr)=\Delta*f,
\]
and, since $\Delta*f_1=0$ implies $f_1=(\Box+m^2)f_2$ with $f_i\in\mathscr S(\mathbb R^4)$, the commutation relations (A.3) become
\[
[J_0(\psi\otimes\varphi),\phi_j(f)+\phi_j(f)^*]=(-1)^{j+1}i\,(\phi_{j'}(f)+\phi_{j'}(f)^*).
\]
Furthermore, thanks to the estimates (2.5) and (2.6) we can apply the multiple commutator theorems in [11] to conclude, as in the proof of [9, Theorem 2], that (A.2) holds.

References

[1] S. Doplicher, Local aspects of superselection rules, Comm. Math. Phys. 85 (1982) 73–86.
[2] S. Doplicher and R. Longo, Local aspects of superselection rules. II, Comm. Math. Phys. 88 (1983) 399–409.
[3] D. Buchholz, S. Doplicher and R. Longo, On Noether's theorem in quantum field theory, Ann. Phys. 170 (1986) 1–17.
[4] D. Buchholz and E. H. Wichmann, Causal independence and the energy level density of states in local quantum field theory, Comm. Math. Phys. 106 (1986) 321–344.
[5] D. Buchholz, Product states for local algebras, Comm. Math. Phys. 36 (1974) 287–304.
[6] S. Stratila, Modular Theory in Operator Algebras (Abacus Press, Bucharest, 1981).
[7] S. Carpi, Quantum Noether's theorem and conformal field theory: A study of some models, Rev. Math. Phys. 11 (1999) 519–532.
[8] L. Tomassini, Sul teorema di Noether quantistico: Studio del campo libero di massa zero in quattro dimensioni, Master's thesis, Università di Roma "La Sapienza" (1999).
[9] C. D'Antoni and R. Longo, Interpolation by type I factors and the flip automorphism, J. Funct. Anal. 51 (1983) 361–371.
[10] S. Doplicher and R. Longo, Standard and split inclusions of von Neumann algebras, Invent. Math. 75 (1984) 493–536.
[11] J. Fröhlich, Application of commutator theorems to the integration of representations of Lie algebras and commutation relations, Comm. Math. Phys. 54 (1977) 135–150.
[12] M. Reed and B. Simon, Methods of Modern Mathematical Physics. Vol. II: Fourier Analysis, Self-Adjointness (Academic Press, New York, 1975).
[13] R. M. Corless, G. H. Gonnet, D. E. G. Hare, D. J. Jeffrey and D. E. Knuth, On the Lambert W function, Adv. Comput. Math. 5 (1996) 329–359.
[14] H. Bostelmann, Phase space properties and the short distance structure in quantum field theory, J. Math. Phys. 46 (2005) 052301, 17 pp.
[15] J. Langerholc and B. Schroer, On the structure of the von Neumann algebras generated by local functions of the free Bose field, Comm. Math. Phys. 1 (1965) 215–239.
[16] S. Albeverio, B. Ferrario and M. W. Yoshida, On the essential self-adjointness of Wick powers of relativistic fields and of fields unitary equivalent to random fields, Acta Appl. Math. 80 (2004) 309–334.
March 10, J070-S0129055X10003916
2010 10:13 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 22, No. 2 (2010) 117–192 c World Scientific Publishing Company DOI: 10.1142/S0129055X10003916
PERTURBATIVE DEFORMATIONS OF CONFORMAL FIELD THEORIES REVISITED
IGOR KRIZ Mathematics Department, University of Michigan, Ann Arbor, MI 48109-1109 USA
[email protected]

Received 30 March 2009
Revised 12 October 2009

The purpose of this paper is to revisit the theory of perturbative deformations of conformal field theory from a mathematically rigorous, purely worldsheet point of view. We specifically include the case of N = (2, 2) conformal field theories. From this point of view, we find certain surprising obstructions, which appear to indicate that, contrary to previous findings, not all deformations along marginal fields exist perturbatively. This includes the case of deformation of the Gepner model of the Fermat quintic along certain cc fields. In other cases, including Gepner models of K3-surfaces and the free field theory, our results coincide with known predictions. We give a partial interpretation of our results via renormalization and mirror symmetry.

Keywords: N = (2, 2) conformal field theories; perturbative deformation; Gepner model.

Mathematics Subject Classification 2010: 83E30, 53D37, 81T15
1. Introduction

Recently, there has been renewed interest in the mathematics of the moduli space of conformal field theories, in particular, in connection with speculations about elliptic cohomology. The purpose of this paper is to investigate this space by perturbative methods from first principles and from a purely "worldsheet" point of view. It is conjectured that at least at generic points, the moduli space of CFT's is a manifold, and in fact, its tangent space consists of marginal fields, i.e. primary fields of weight (1, 1) of the conformal field theory (this is the bosonic case; in the supersymmetric case there are modifications which we will discuss later). This then means that there should exist an exponential map from the tangent space at a point to the moduli space, i.e. it should be possible to construct a continuous 1-parameter set of conformal field theories by "turning on" a given marginal field. There is a more or less canonical mathematical procedure for applying a "Pexp" type construction to the field which has been turned on, and obtaining a perturbative expansion in the deformation parameter. This process, however, returns certain
cohomological obstructions, similar to Gerstenhaber's obstructions to the existence of deformations of associative algebras [26–29]. Physically, these obstructions can be interpreted as changes of dimension of the deforming field, and can occur, in principle, at any order of the perturbative path. The primary obstruction is well known, and was used, e.g., by Ginsparg in his work on c = 1 conformal field theories [30]. The obstruction also occurred in earlier work, see [45–47, 63–65, 61], from the point of view of continuous lines in the space of critical models. In the models considered, notably the Baxter model [11], the Ashkin–Teller model [8] and the Gaussian model [48], vanishing of the primary obstruction did correspond to a continuous line of deformations, and it was therefore believed that the primary obstruction tells the whole story. (A similar story also occurs in the case of deformations of boundary sectors, see [1, 2, 12, 22, 51, 52, 58, 38].) In a certain sense, the main point of the present paper is analyzing, or giving examples of, the role of the higher obstructions. We shall see that these obstructions can be non-zero in cases where the deformation is believed to exist, most notably in the case of deforming the Gepner model of the Fermat quintic along a cc field, cf. [3, 55, 23, 60, 44, 66, 67, 14, 15, 17]. Some discussion of marginality of primary fields in N = 2-supersymmetric theories to higher order exists in the literature. Notably, Dixon [19] verified the vanishing, for any N = (2, 2)-theory and any linear combination of cc, ac, ca and aa fields, of an amplitude integral which physically expresses the change of central charge (a similar calculation is also given in Distler–Greene [18]). Earlier work of Zamolodchikov [70, 71] showed that the renormalization β-function vanishes for theories where c does not change during the renormalization process.
However, we find that the calculation [19] does not guarantee that the primary field would remain marginal along the perturbative deformation path, due to subtleties involving singularities of the integral. The obstruction we discuss in this paper is an amplitude integral which physically expresses directly the change of dimension of the deforming field, and it turns out this may not vanish. We will return to this discussion in Sec. 3 below. This puzzle of having obstructions where none should appear will not be fully explained in this paper, although a likely interpretation of the result will be discussed. It is possible that our effect does not impact the general question of the existence of the nonlinear σ-model, which is widely believed to exist (e.g., [3, 55, 23, 60, 44, 66, 67, 14, 15, 17]), but simply concerns questions of its perturbative construction. One caveat is that the case we investigate here is still not truly physical, since we specialize to the case of cc fields, which are not real. The actual physical deformations of CFT’s should occur along real fields, e.g., a combination of a cc field and its complex-conjugate aa field (we give a discussion of this in case of the free field theory at the end of Sec. 4). The obstructions discussed here however are not linear, and hence a priori the case of the corresponding real field in the Gepner model is much more difficult to analyze, in particular, it requires regularization of the deforming parameter, and is not discussed here. Nevertheless, it is still surprising that an obstruction occurs for a single cc field; for example, this
does not happen in the case of the (compactified or uncompactified) free field theory. Also, there is strong evidence that obstructions to deformations along cc fields and the corresponding real fields are equivalent (see the remark after Example 2 in Sec. 6). Since an nth order obstruction indeed means that the marginal field gets deformed into a field of non-zero weight, which changes to the order of the nth power of the deformation parameter, usually [30, 45–47, 63–65, 61], when obstructions occur, one therefore concludes that the CFT does not possess continuous deformations in the given direction. Other interpretations are possible. One thing to observe is that our conclusion is only valid for purely perturbative theories where we assume that all fields have power series expansion in the deformation parameters with coefficients which are fields in the original theory. This is not the only possible scenario. Therefore, as remarked above, our results merely indicate that in the case when our algebraic obstruction is non-zero, non-perturbative corrections must be made to the theory to maintain the presence of marginal fields along the deformation path. In fact, evidence in favor of this interpretation exists in the form of the analysis of Nemeschansky and Sen [55,35] of higher order corrections to the β-function of the nonlinear σ-model. Grisaru, Van de Ven and Zanon [35] found that the four-loop contribution to the β-function of the nonlinear σ model for Calabi–Yau manifolds is non-zero, and [55] found a recipe how to cancel this singularity by deforming the manifold to metric which is non-Ricci flat at higher orders of the deformation parameter. 
The expansion [4] used in this analysis is around the 0 curvature tensor, but assuming for the moment that a similar phenomenon occurs if we expanded around the Fermat quintic vacuum, then there are no fields present in the Gepner model which would correspond perturbatively to these higher order corrections in the direction of non-Ricci flat metric: bosonically, such fields would have to have critical conformal dimension classically, since the σ-model Lagrangian is classically conformally invariant for non-Ricci flat target Kähler manifolds. However, quantum mechanically, there is a one-loop correction proportional to the Ricci tensor, thus indicating that fields expressing such perturbative deformations would have to be of generalized weight (cf. [39–42]). Fields of generalized weight, however, are not present in the Gepner model, which is a rational CFT, and more generally are excluded by unitarity (see discussions in Remarks after Theorems 2 and 3 in Sec. 3 below). Thus, although this argument is not completely mathematical, renormalization analysis seems to confirm our finding that deformations of the Fermat quintic model must in general be non-perturbative. It is also noteworthy that the β-function is known to vanish to all orders for K3-surfaces because of N = (4, 4) supersymmetry. Accordingly, we also find that the phenomenon we see for the Fermat quintic is not present in the case of the Fermat quartic (see Sec. 7 below). It is also worth noting that other non-perturbative phenomena such as instanton corrections also arise when passing from K3-surfaces to Calabi–Yau 3-folds ([14, 15, 17]). Finally, one must also remark that the proof of [55] of the β-function cancellation
is not mathematically complete because of convergence questions, and thus one still cannot exclude even the scenario that not all nonlinear σ models would exist as exact CFT’s, thus creating some type of “string landscape” picture also in this context (cf. [20]). We should remark that this scenario also has a compelling interpretation from the point of view of the relationship between classical and quantum geometry (see the end of the Concluding Remarks). In this paper, we shall be mostly interested in the strictly perturbative picture. The main point of this paper is an analysis of the algebraic obstructions in certain canonical cases. We discuss two main kinds of examples, namely the free field theory (both bosonic and N = 1-supersymmetric), and the Gepner models of the Fermat quintic and quartic, which are exactly solvable N = 2-supersymmetric conformal field theories which should be the nonlinear σ-models of the Fermat quintic Calabi– Yau 3-fold and the Fermat quartic K3-surface (in the case of the Fermat quartic, this was actually proved in [54]). In the case of the free field theory, what happens is essentially that all non-trivial gravitational deformations of the free field theory are algebraically obstructed. In the case of a free theory compactified on a torus, the only gravitational deformations which are algebraically unobstructed come from linear change of metric on the torus. (We will focus on gravitational deformations; there are other examples, for example the sine-Gordon interaction [69, 13], which are not discussed in detail here.) The Gepner case deserves special attention. From the moduli space of Calabi– Yau 3-folds, there is supposed to be a σ-model map into the moduli space of CFT’s. 
In fact, when we have an exactly solvable Calabi–Yau σ-model, one gets operators in CFT corresponding to the cohomology groups $H^{1,1}$ and $H^{2,1}$, which measure deformations of complex structure and Kähler metric, respectively, and these in turn give rise to infinitesimal deformations. Now the Fermat quintic
$$x^5 + y^5 + z^5 + t^5 + u^5 = 0 \tag{1}$$
in $\mathbb{CP}^4$ has a model conjectured by Gepner [24, 25] which is embedded in the tensor product of 5 copies of the N = 2-supersymmetric minimal model of central charge 9/5. The weight (1/2, 1/2) cc and ac fields correspond to the 100 infinitesimal deformations of complex structure and 1 infinitesimal deformation of Kähler metric of the quintic (1). Despite the numerical matches in dimension, however, it is not quite correct to say that the gravitational deformations, corresponding to the moduli space of Calabi–Yau manifolds, occur by turning on cc and ac fields. This is because, to preserve unitarity, a physical deformation can only occur when we turn on a real field, and the fields in question are not real. In fact, the complex conjugate of a cc field is an aa field, and the complex conjugate of an ac field is a ca field. The complex conjugate must be added to get a real field, and a physical deformation (we discuss this calculationally in the case of the free field theory in Sec. 4).

In this paper, we do not discuss deformations of the Gepner model by turning on real fields. As shown in the case of the free field theory in Sec. 4, such deformations
March 10, J070-S0129055X10003916
2010 10:13 WSPC/S0129-055X
148-RMP
Perturbative Deformations of Conformal Field Theories Revisited
121
require for example regularization of the deformation parameter, and are much more difficult to calculate. Because of this, we work only with the case of one cc and one ac field. We will show that at least one cc deformation, whose real version corresponds to the quintics
$$x^5 + y^5 + z^5 + t^5 + u^5 + \lambda x^3 y^2 = 0 \tag{2}$$
for small (but not infinitesimal) λ is algebraically obstructed. (One suspects that similar algebraic obstructions also occur for other fields, but the computation is too difficult at the moment; for the cc field corresponding to $xyztu$, there is some evidence suggesting that the deformation may exponentiate.)

It is an interesting question if nonlinear σ-models of Calabi–Yau 3-folds must also contain non-perturbative terms. If so, likely, this phenomenon is generic, which could be a reason why mathematicians have so far discovered so few of these conformal field theories, despite ample physical evidence of their existence [3, 55, 23, 60, 44, 66, 67].

Originally prompted by a question of Igor Frenkel, we also consider the case of the Fermat quartic K3 surface $x^4 + y^4 + z^4 + t^4 = 0$ in $\mathbb{CP}^3$. This is done in Sec. 7. It is interesting that the problems of the Fermat quintic do not arise in this case, and all the infinitesimally critical fields exponentiate in the purely perturbative sense. This dovetails with the result of Alvarez-Gaumé and Ginsparg [5] that the β-function vanishes to all orders for critical perturbative models with N = (4, 4) supersymmetry, and hence from the renormalization point of view, the nonlinear σ model is conformal for the Ricci flat metric on K3-surfaces. There are also certain differences between the ways mathematical considerations of moduli space and mirror symmetry vary in the K3 and Calabi–Yau 3-fold cases, which could be related to the behavior of the non-perturbative effects. This will be discussed in Sec. 8.

To relate more precisely in what setup these results occur, we need to describe what kind of deformations we are considering. It is well known that one can obtain infinitesimal deformations from primary fields.
In the bosonic case, the weight of these fields must be (1, 1), in the N = 1-supersymmetric case in the NS-NS sector the critical weight is (1/2, 1/2) and in the N = 2-supersymmetric case the infinitesimal deformations we consider are along so called ac or cc fields of weight (1/2, 1/2). For more specific discussion, see Sec. 2 below. There may exist infinitesimal deformations which are not related to primary fields (see the remarks at the end of Sec. 3). However, they are excluded under a certain continuity assumption which we also state in Sec. 2. Therefore, the approach we follow is exponentiating infinitesimal deformations along primary fields of appropriate weights. In the “algebraic” approach, we assume that both the primary field and amplitudes can be updated at all points of the deformation parameter. Additionally, we assume one can obtain a perturbative power
series expansion in the deformation parameter, and we do not allow counterterms of generalized weight or non-perturbative corrections. We describe a cohomological obstruction theory similar to Gerstenhaber’s theory [26–29] for associative algebras, which in principle controls the coefficients at individual powers of the deformation parameter. Obstructions can be written down explicitly under certain conditions. This is done in Sec. 3. The primary obstruction in fact is the one which occurs for the deformations of the free field theory at gravitational fields of non-zero momentum (“gravitational waves”). In the case of the Gepner model of the Fermat quintic, the primary obstruction vanishes but in the case (2), one can show there is an algebraic obstruction of order 5 (i.e. given by a 7 point function in the Gepner model). It should be pointed out that even in the “algebraic” case, there are substantial complications we must deal with. The moduli space of CFT’s is not yet well defined. There are different definitions of conformal field theory, for example the Segal approach [59, 36, 37] is quite substantially different from the vertex operator approach (see [41] and references therein). Since these definitions are not known to be equivalent, and their realizations are supposed to be points of the moduli space, the space itself therefore cannot be defined until a particular definition is selected. Next, it remains to be specified what structure there should be on the moduli space. Presumably, there should at least be a topology, so that we need to ask what is a nearby conformal field theory. That, too, has not been answered. These foundational questions are enormously difficult, mostly from the philosophical point of view: it is very easy to define ad hoc notions which immediately turn out insufficiently general to be desirable. Because of that, we only make minimal definitions needed to examine the existing paradigm in the context outlined. 
Let us, then, confine ourselves to observing that even in the perturbative case, the situation is not purely algebraic, and rather involves infinite sums which need to be discussed in terms of analysis. For example, the obstructions may in fact be undefined, because they may involve infinite sums which do not converge. Such phenomena must be treated carefully, since they do not automatically mean that perturbative exponentiation fails. In fact, because the deformed primary fields are only determined up to a scalar factor, there is a possibility of regularization along the deformation parameter. We briefly discuss this theoretically in Sec. 3, and then give an example in the case of the free field theory in Sec. 4.

We also briefly discuss sufficient conditions for exponentiation. The main tool is the case in which Theorem 1 gives a truly local formula for the infinitesimal amplitude changes, which could be interpreted as an “infinitesimal isomorphism” in a special case. We then give in Sec. 3 conditions under which such infinitesimal isomorphisms can be exponentiated. This includes the case of a coset theory, which does not require regularization, and a more general case when regularization may occur. In the final Secs. 5 and 6, namely the case of the Gepner model, the main problem is finding a setup for the vertex operators which would be explicit enough to allow evaluating the obstructions in question; the positive result is obtained using
a generalization of the coset construction. The formulas required are obtained from the Coulomb gas approach (= Feigin–Fuchs realization), which is taken from [34].

The present paper is organized as follows: In Sec. 2, we give the general setup in which we work, show under which condition we can restrict ourselves to deformations along a primary field, and derive the formula for infinitesimally deformed amplitudes, given in Theorem 1. In Sec. 3, we discuss exponentiation theoretically, in terms of obstruction theory, explicit formulas for the primary and higher obstructions, and regularization. We also discuss supersymmetry, and in the end show a mechanism by which non-perturbative deformations may still be possible when algebraic obstructions occur. In Sec. 4, we give the example of the free field theories, the trivial deformations which come from 0 momentum gravitational deforming fields, and the primary obstruction to deforming along primary fields of non-zero momentum. In Sec. 5, we will discuss the Gepner model of the Fermat quintic, and in Sec. 6, we will discuss examples of non-zero algebraic obstructions to perturbative deformations in this case, as well as speculations about unobstructed deformations. In Sec. 7, we will discuss the (unobstructed) deformations for the Fermat quartic K3 surface, and in Sec. 8, we attempt to summarize and discuss our possible conclusions.

2. Infinitesimal Deformations of Conformal Field Theories

We shall work in the framework of [59] (see also [36–38]). In the bosonic case (without considering supersymmetry), a conformal field theory in this framework is characterized by a Hilbert space of states $H$, and for a worldsheet, by which one means a Riemann surface $\Sigma$ (a 1-dimensional complex manifold) with analytically parametrized boundary components, a trace class element
$$U_\Sigma \in \hat{\bigotimes} H^* \,\hat{\otimes}\, \hat{\bigotimes} H \tag{3}$$
defined up to scalar multiple. One assumes that these elements depend on $\Sigma$ analytically (i.e. are real-analytic functions on the moduli space of worldsheets). Here $H^*$ denotes the Hilbert space dual of $H$, and $\hat{\otimes}$ denotes the Hilbert tensor product. In (3), the tensor product of copies of $H^*$ (respectively, $H$) is over the inbound (respectively, outbound) boundary components of $\Sigma$. Inbound and outbound boundary components are distinguished by orientation. For an annulus in $\mathbb{C}$ enclosed by two concentric circles oriented counterclockwise, the inside circle is inbound. The elements (3) are subject to gluing identities (gluing of Riemann surfaces corresponds to trace). These elements can also be viewed (perhaps even more conventionally, but less symmetrically) as operators
$$U_\Sigma : \hat{\bigotimes} H \to \hat{\bigotimes} H \tag{4}$$
where the tensor product in the source (respectively, target) is over inbound (respectively, outbound) boundary components. In this paper, we shall almost exclusively consider the case when $\Sigma$ is a Riemann surface of genus 0, since this is the key
case for deformation theory.

It should be noted that a physical CFT has still more structure. Namely, we want to consider the operators (4) where $\Sigma$ is an annulus approaching the degenerate annulus which is the unit circle with both inbound and outbound parametrizations equal to the identity. In such a limit, the operator (4) should approach the identity $H \to H$. Also, one requires reflection-positivity (which is the Wick rotation of unitarity). This means that if we denote by $\bar{\Sigma}$ the Riemann surface complex conjugate to $\Sigma$, then $U_{\bar{\Sigma}}$ is adjoint to $U_\Sigma$. One also requires for a physical theory that $H$ actually be the complexification of a real vector space, and that the quadratic form one obtains by taking limits to the degenerate annulus $S^1$ with boundary parametrizations by $z$ (the identity) and $1/z$ be related to the Hermitian form on $H$ by complex conjugation.

Treating the supersymmetric case mathematically is more technical, but analogous. Essentially, one must work on the super-moduli space of superconformal surfaces (for a very quick review mostly sufficient for the purposes of this paper, see [49]).

The structure just described originates in conformally invariant 2-dimensional quantum field theory. From the point of view of 2-dimensional quantum field theory, the element (3) can be viewed as a generalization of the vacuum expectation value in the sense that no field is inserted inside the worldsheet. From the point of view of conformal field theory, this element is a CFT amplitude.

Now in a bosonic (= non-supersymmetric) CFT $H$, if we have a primary field $u$ of weight (1, 1), then, as observed in [59], we can make an infinitesimal deformation of $H$ as follows: For a worldsheet $\Sigma$ with associated element $U_\Sigma$ (see (3)), the infinitesimal deformation of the vacuum is
$$V_\Sigma = \int_{x\in\Sigma} U_{\Sigma^x_u}. \tag{5}$$
Here $U_{\Sigma^x_u}$ is obtained by choosing a holomorphic embedding $f : D \to \Sigma$, $f(0) = x$, where $D$ is the standard disk. Let $\Sigma'$ be the worldsheet obtained by cutting $f(D)$ out of $\Sigma$, and let $U_{\Sigma^x_u}$ be obtained by gluing the vacuum $U_{\Sigma'}$ with the field $u$ inserted at $f(\partial D)$. The element $U_{\Sigma^x_u}$ is proportional to $\|f'(0)\|^2$, since $u$ is (1, 1)-primary, so it transforms the same way as a measure and we can define the integral (5) without coupling with a measure. The integral (5) is an infinitesimal deformation of the original CFT structure in the sense that $U_\Sigma + V_\Sigma\epsilon$ satisfies CFT gluing identities in the ring $\mathbb{C}[\epsilon]/\epsilon^2$.

The main topic of this paper is studying (in this and analogous supersymmetric cases) the question as to when the infinitesimal deformation (5) can be exponentiated at least to perturbative level, i.e. when there exist for each $n \in \mathbb{N}$
elements
$$u^0, \dots, u^{n-1} \in H, \qquad u^0 = u$$
and for every worldsheet $\Sigma$
$$U_\Sigma^0, \dots, U_\Sigma^n \in \hat{\bigotimes} H^* \,\hat{\otimes}\, \hat{\bigotimes} H$$
such that
$$U_\Sigma(m) = \sum_{i=0}^{m} U_\Sigma^i \epsilon^i, \qquad U_\Sigma^0 = U_\Sigma \tag{6}$$
satisfy gluing axioms in $\mathbb{C}[\epsilon]/\epsilon^{m+1}$, $0 \le m \le n$,
$$u(m) = \sum_{i=0}^{m} u^i \epsilon^i \tag{7}$$
is primary of weight (1, 1) with respect to (6), $0 < m \le n$, and
$$\frac{dU_\Sigma(m)}{d\epsilon} = \int_{x\in\Sigma} U_{\Sigma^x_{u(m-1)}}(m-1) \tag{8}$$
in the same sense as in (5).

We should remark that a priori, it is not known that all deformations of CFT come from primary fields: One could, in principle, simply ask for the existence of vacua (6) such that (6) satisfy gluing axioms over $\mathbb{C}[\epsilon]/\epsilon^{m+1}$. As remarked in [59], it is not known whether all perturbative deformations of CFT’s are obtained from primary fields $u$ as described above. However, one can indeed prove that the primary fields $u$ exist given suitable continuity assumptions. Suppose the vacua $U_\Sigma(m)$ exist for $0 \le m \le n$. We notice that the integral on the right-hand side of (8) is, by definition, the limit of integrals over regions $R$ which are proper subsets of $\Sigma$ such that the measure of $\Sigma - R$ goes to 0 (fix an analytic metric on $\Sigma$ compatible with the complex structure). Let, thus, $\Sigma_{D_1,\dots,D_k}$ be a worldsheet obtained from $\Sigma$ by cutting out disjoint holomorphically embedded copies $D_1, \dots, D_k$ of the unit disk $D$. Then we calculate
$$\begin{aligned}
\frac{dU_\Sigma(m)}{d\epsilon} &= \int_{x\in\Sigma} U_{\Sigma^x_{u(m-1)}}(m-1) \\
&= \lim_{\mu(\Sigma_{D_1,\dots,D_k})\to 0} U_{\Sigma_{D_1,\dots,D_k}}(m-1) \int_{x\in\bigcup D_i} U_{(\bigcup D_i)^x_{u(m-1)}}(m-1) \\
&= \lim_{\mu(\Sigma_{D_1,\dots,D_k})\to 0} \sum_i U_{\Sigma_{D_i}}(m-1) \int_{x\in D_i} U_{(D_i)^x_{u(m-1)}}(m-1) \\
&= \lim_{\mu(\Sigma_{D_1,\dots,D_k})\to 0} \sum_i U_{\Sigma_{D_i}}(m-1)\, \frac{dU_{D_i}(m)}{d\epsilon}
\end{aligned}$$
assuming (8) for $\Sigma = D$, so the assumption we need is
$$\frac{dU_\Sigma(m)}{d\epsilon} = \lim_{\mu(\Sigma_{D_1,\dots,D_k})\to 0} \sum_i U_{\Sigma_{D_i}}(m-1) \circ \frac{dU_{D_i}(m)}{d\epsilon}. \tag{9}$$
The composition notation on the right-hand side means gluing. Granted (9), we can recover $dU_\Sigma(m)/d\epsilon$ from $dU_D(m)/d\epsilon$ for the unit disk $D$. Here $\mu$ denotes the Lebesgue measure (this is well defined on a worldsheet at least up to absolute continuity, which is sufficient for taking the limit in the above computation).

Now in the case of the unit disk, we get a candidate for $u(m-1)$ in the following way: Assume that $H$ is topologically spanned by subspaces $H_{(m_1,m_2)}$ of $\epsilon$-weight $(m_1, m_2)$ where $m_1, m_2 \ge 0$, $H_{(0,0)} = \langle U_D \rangle$. Then $U_D(m)$ is invariant under rigid rotation, so
$$U_D(m) \in \hat{\bigoplus_{k\ge 0}} H_{(k,k)}[\epsilon]/\epsilon^{m+1}. \tag{10}$$
We see that if $A_q$ is the standard annulus with boundary components $S^1$, $qS^1$ with standard parametrizations, then
$$u(m-1) = \lim_{q\to 0} \frac{1}{q^2}\, U_{A_q} \frac{dU_D(m)}{d\epsilon} \tag{11}$$
exists and is equal to the weight (1, 1) summand of (10). In fact, by (9) and the definition of integral, we already see that (8) holds. We do not know however yet that $u(m-1)$ is primary. To see that, however, we note that for any annulus $A = D - D'$ where $f : D \to D'$ is a holomorphic embedding with derivative $r$, (9) also implies (for the same reason, namely the exhaustion principle) that (8) is valid with $u(m-1)$ replaced by
$$\frac{U_A u(m-1)}{\|r\|^2}. \tag{12}$$
Since this is true for any $\Sigma$, in particular where $\Sigma$ is any disk, the integrands must be equal, so (12) and $u(m-1)$ have the same vertex operators, so at least in the absence of null elements,
$$\frac{U_A u(m-1)}{\|r\|^2} = u(m-1) \tag{13}$$
which means that $u(m-1)$ is primary of weight (1, 1), which says precisely that the expression on the left-hand side of (13) is independent of $A$.

We have presented an argument by which, making certain assumptions, deformations of CFT’s occur along primary fields of critical weights. This is a question raised in [59]. We shall see however that there are problems with this formulation even in the simplest possible case: Consider the free (bosonic) CFT of dimension 1, and the primary field $x_{-1}\tilde{x}_{-1}$. (We disregard here the issue that $H$ itself lacks a satisfactory Hilbert space structure, see [37]; we could eliminate this problem by
compactifying the theory on a torus or by considering the state spaces of given momentum.) Let us calculate
$$U_D^1 = \int_D \exp(zL_{-1}) \exp(\bar{z}\tilde{L}_{-1})\, x_{-1}\tilde{x}_{-1} = \frac{1}{2} \sum_{k\ge 1} \frac{x_{-k}\tilde{x}_{-k}}{k}. \tag{14}$$
We see that the element (14) is not an element of $H$, since its norm is $\sum_{k\ge 1} 1 = \infty$. This occurs despite the fact that the norm on $H$ is preserved by the deformation, i.e. the deformation is unitary. (This is because the inner product is conjugate by reality to the quadratic form which is the operator associated with the degenerate worldsheet with two outbound boundary components $S^1 = \{z \in \mathbb{C} \mid \|z\| = 1\}$ parametrized by $z$ and $1/z$; in the class of measures equivalent to the Lebesgue measure by absolute continuity, this worldsheet has measure 0 and hence the deformation acts trivially on it.) The explanation is simply that the infinitesimally deformed vacuum is
$$1 + \frac{1}{2} \sum_{n>0} \frac{x_{-n}\tilde{x}_{-n}}{n}\,\epsilon. \tag{15}$$
When computing the square norm of (15), the second summand is orthogonal to the first, hence its square norm occurs with coefficient $\epsilon^2$, which disappears when calculating up to linear order in $\epsilon$ (which is what we are doing in an infinitesimal deformation); such phenomena routinely occur when one attempts to differentiate unitary processes on Hilbert spaces. In our case, as we shall see, the situation is further complicated by the fact that the process actually has to be regularized.

There are other problems as well. For one thing, we wish to consider theories which really do not have Hilbert axiomatizations in the proper sense, including Minkowski signature theories, where the Hilbert approach is impossible for physical reasons. Therefore, we prefer a “vertex operator algebra” approach where we discard the Hilbert completion and restrict ourselves to examining tree level amplitudes. One axiomatization of such theories was given in [41] under the term “full field algebra”. In the present paper, however, we prefer to work from scratch, listing the properties we will use explicitly, and referring to our objects as conformal field theories in the vertex operator formulation.
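Returning to the computation (14), the series and its divergence can be checked term by term. The following is only a sketch under two assumptions that are not stated in the text: the Heisenberg normalization $[x_m, x_n] = m\,\delta_{m+n,0}$ (so that $[L_{-1}, x_{-k}] = k\,x_{-k-1}$), and integration over $D$ against a rotation-invariant measure $d\mu$ whose overall constant $c$ is left open.

```latex
% Assumptions (not from the text): [x_m, x_n] = m\,\delta_{m+n,0}, hence
% [L_{-1}, x_{-k}] = k\,x_{-k-1}, and L_{-1} annihilates the vacuum,
% so that L_{-1}^n x_{-1} = n!\, x_{-n-1}.
\begin{align*}
\exp(zL_{-1})\,x_{-1}
  &= \sum_{n\ge 0}\frac{z^n}{n!}\,L_{-1}^n x_{-1}
   = \sum_{k\ge 1} z^{k-1}x_{-k}, \\
% angular integration kills k \neq l; radially \int_0^1 r^{2k-2}\,r\,dr = 1/(2k)
\int_D z^{k-1}\bar z^{\,l-1}\,d\mu
  &= \delta_{k,l}\,\frac{c}{k}
   \qquad (c=\pi \text{ for } d\mu = dx\,dy), \\
\int_D \exp(zL_{-1})\exp(\bar z\tilde L_{-1})\,x_{-1}\tilde x_{-1}\,d\mu
  &= c\sum_{k\ge 1}\frac{x_{-k}\tilde x_{-k}}{k}, \\
% each summand has the same norm, independently of k
\Bigl\|\frac{x_{-k}\tilde x_{-k}}{k}\Bigr\|^2
  &= \frac{k\cdot k}{k^2} = 1 .
\end{align*}
```

Since the summands are mutually orthogonal (they have different weights) and each has norm 1, no choice of the constant $c$ makes the series converge in $H$; only the $1/k$ dependence matters for the divergence.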
As mentioned in the introduction, our approach in this paper is essentially to build the minimal possible machinery in which we can phrase the concept of perturbative deformation of a CFT along a primary field of critical weight to arbitrary degree, and identifying obstructions to obtaining such deformation. Actually identifying the deformed conformal field theory upon plugging in a value of the deformation parameter (provided the obstructions vanish) by means of a general abstract machinery (i.e. not assuming we can recognize the theory by other means) is a difficult problem which remains untreated in the present paper. Therefore, speaking purely mathematically, we are actually defining the concept of perturbative
deformation along with finding our obstructions. It would be far superior to define rigorously the moduli space of conformal field theories upfront, with enough geometry to allow us to define paths. Such technology, however, is not mathematically available at the present time.

Regarding the approach to constructing and treating fields, the vertex operator approach is largely superior from the computational point of view. One can convert to the more symmetric and foundationally more powerful Hilbert space approach when we have appropriate convergence of the operators constructed. We shall proceed by using either language according to what is more convenient at each particular time.

For now, let us consider untopologized vector spaces
$$V = \bigoplus V_{(w_L,w_R)}. \tag{16}$$
Here $(w_L, w_R)$ are weights (we refer to $w_L$, respectively $w_R$, as the left, respectively right, component of the weight), so we assume $w_L - w_R \in \mathbb{Z}$ and usually
$$w_L, w_R \ge 0, \tag{17}$$
$$V_{(0,0)} = \langle U_D \rangle. \tag{18}$$
The “no ghost” assumptions (17), (18) will sometimes be dropped. If there is a Hilbert space $H$, then $V$ is interpreted as the “subspace of states of finite weights”. We assume that for $u \in V_{w_L,w_R}$, we have vertex operators of the form
$$Y(u, z, \bar{z}) = \sum_{(v_L,v_R)} u_{-v_L-w_L,\,-v_R-w_R}\, z^{v_L} \bar{z}^{v_R}. \tag{19}$$
Here $u_{a,b}$ are operators which raise the left (respectively, right) component of weight by $a$ (respectively, $b$). We additionally assume $v_L - v_R \in \mathbb{Z}$ and that for a given $w$, the weights of operators which act on $w$ are discrete. Even more strongly, we assume that
$$Y(u, z, \bar{z}) = \sum_i Y_i(u, z)\,\tilde{Y}_i(u, \bar{z}) \tag{20}$$
where
$$Y_i(u, z) = \sum_{v_L} u_{i;-v_L-w_L}\, z^{v_L}, \qquad \tilde{Y}_i(u, \bar{z}) = \sum_{v_R} \tilde{u}_{i;-v_R-w_R}\, \bar{z}^{v_R}, \tag{21}$$
where all the operators $Y_i(u, z)$ commute with all $\tilde{Y}_j(v, \bar{z})$. The main axiom which fields (19) must satisfy is “commutativity” and “associativity” analogous to the case of vertex operator algebras, i.e. there must exist for fields $u, v, w \in V$ and $w' \in V^\vee$ of finite weight, a “4-point function”
$$w' Z(u, v, z, \bar{z}, t, \bar{t})\, w \tag{22}$$
which is real-analytic and unbranched outside the loci of $z = 0$, $t = 0$, $z = \infty$, $t = \infty$ and $z = t$, and whose expansion in $t$ first and $z$ second (respectively, $z$ first
and $t$ second, respectively, $z - t$ first and $t$ second) is
$$w' Y(u, z, \bar{z}) Y(v, t, \bar{t}) w, \qquad w' Y(v, t, \bar{t}) Y(u, z, \bar{z}) w, \qquad w' Y(Y(u, z - t, \bar{z} - \bar{t}) v, t, \bar{t}) w,$$
respectively. Here, for example, by an expansion in $t$ first and $z$ second we mean a series in the variable $z$ whose coefficients are series in the variable $t$, and the other cases are analogous. Comment: the existence of a 4-point function is the appropriate generalization of “locality”.

We also assume that Virasoro algebras $\langle L_n \rangle$, $\langle \tilde{L}_n \rangle$ with equal central charges $c_L = c_R$ act and that
$$Y(L_{-1}u, z, \bar{z}) = \frac{\partial}{\partial z} Y(u, z, \bar{z}), \qquad Y(\tilde{L}_{-1}u, z, \bar{z}) = \frac{\partial}{\partial \bar{z}} Y(u, z, \bar{z}) \tag{23}$$
and
$$V_{w_L,w_R} \text{ is the weight } (w_L, w_R) \text{ subspace of } V \text{ with respect to } (L_0, \tilde{L}_0). \tag{24}$$

Remark. Even though the axioms outlined here are meant for theories which are initial points of the proposed perturbative deformations, they are too restrictive for the theories obtained as a result of the deformations themselves. To capture those deformations, it is best to revert to Segal’s approach, restricting attention to genus 0 worldsheets with a unique outbound boundary component (tree level amplitudes). Operators will then be expanded both in the weight grading and in the perturbative parameter (i.e. the coefficient at each power of the deformation parameter will be an element of the product-completed state space of the original theory). To avoid discussion of topology, we simply require that perturbative coefficients of all compositions of such operators converge in the product topology with respect to the weight grading, and the analytic topology in each graded summand.

In this section, we discuss infinitesimal perturbations, i.e. the deformed theory is defined over $\mathbb{C}[\epsilon]/(\epsilon^2)$ where $\epsilon$ is the deformation parameter. One case where such infinitesimal deformations can be described explicitly is the following

Theorem 1. Consider fields $u, v, w \in V$ where $u$ is primary of weight (1, 1). Next, assume that
$$Z(u, v, z, \bar{z}, t, \bar{t}) = \sum_{\alpha,\beta} Z_{\alpha,\beta}(u, v, z, \bar{z}, t, \bar{t})$$
where
$$Z_{\alpha,\beta}(u, v, z, \bar{z}, t, \bar{t}) = \sum_i Z_{\alpha,\beta,i}(u, v, z, t)\, \tilde{Z}_{\alpha,\beta,i}(u, v, \bar{z}, \bar{t})$$
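and for $w' \in V^\vee$ of finite weight, $w' Z_{\alpha,\beta,i}(u, v, z, t)(z - t)^\alpha z^\beta$ (respectively, $w' \tilde{Z}_{\alpha,\beta,i}(u, v, \bar{z}, \bar{t})\,\overline{(z - t)}^{\alpha} \bar{z}^\beta$) is a meromorphic (respectively, antimeromorphic) function of $z$ on $\mathbb{CP}^1$, with poles (if any) only at $0, t, \infty$. Now write
$$Y_{u,\alpha,\beta}(v, t, \bar{t}) = (i/2) \int_\Sigma Z_{\alpha,\beta}(u, v, z, \bar{z}, t, \bar{t})\, dz\, d\bar{z}, \tag{25}$$
so
$$Y_u(v, t, \bar{t}) = Y(v, t, \bar{t}) + \epsilon \sum_{\alpha,\beta} Y_{u,\alpha,\beta}(v, t, \bar{t})$$
is the infinitesimally deformed vertex operator, where $\Sigma$ is the degenerate worldsheet with unit disks cut out around $0, t, \infty$. Assume now further that we can expand
$$Z_{\alpha,\beta,i}(u, v, z, t) = Y_{\alpha,\beta,i}(v, t)\, Y_{\alpha,\beta,i}(u, z) \quad \text{when } z \text{ is near } 0, \tag{26}$$
$$Z_{\alpha,\beta,i}(u, v, z, t) = Y'_{\alpha,\beta,i}(u, z)\, Y_{\alpha,\beta,i}(v, t) \quad \text{when } z \text{ is near } \infty, \tag{27}$$
$$Z_{\alpha,\beta,i}(u, v, z, t) = Y_{\alpha,\beta,i}(Y''_{\alpha,\beta,i}(u, z - t) v, t) \quad \text{when } z \text{ is near } t. \tag{28}$$
Write
$$Y_{\alpha,\beta,i}(u, z) = \sum_n u_{\alpha,\beta,i,-n-\beta}\, z^{n+\beta-1}, \qquad Y'_{\alpha,\beta,i}(u, z) = \sum_n u'_{\alpha,\beta,i,-n-\alpha-\beta}\, z^{n+\alpha+\beta-1}, \qquad Y''_{\alpha,\beta,i}(u, z) = \sum_n u''_{\alpha,\beta,i,-n-\alpha}\, z^{n+\alpha-1}.$$
(Analogously with the ˜’s.) Assume now
$$u_{\alpha,\beta,i,0}\, w = 0, \qquad u''_{\alpha,\beta,i,0}\, v = 0, \qquad u'_{\alpha,\beta,i,0}\, Y_{\alpha,\beta,i}(v, t) w = 0 \tag{29}$$
and analogously for the ˜’s (note that these conditions are only nontrivial when $\beta = 0$, respectively, $\alpha = 0$, respectively, $\alpha = -\beta$). Denote now by $\omega_{\alpha,\beta,i,0}$, $\omega_{\alpha,\beta,i,\infty}$, $\omega_{\alpha,\beta,i,t}$ the indefinite integrals of (26)–(28) in the variable $z$, obtained using the formula
$$\int z^k\, dz = \frac{z^{k+1}}{k+1} \quad \text{for } k \ne -1$$
(thus fixing the integration constant), and analogously with the ˜’s. Let then
$$C_{\alpha,\beta,i} = \omega_{\alpha,\beta,i,\infty} - \omega_{\alpha,\beta,i,t}, \qquad D_{\alpha,\beta,i} = \omega_{\alpha,\beta,i,\infty} - \omega_{\alpha,\beta,i,0},$$
$$\tilde{C}_{\alpha,\beta,i} = \tilde{\omega}_{\alpha,\beta,i,\infty} - \tilde{\omega}_{\alpha,\beta,i,t}, \qquad \tilde{D}_{\alpha,\beta,i} = \tilde{\omega}_{\alpha,\beta,i,\infty} - \tilde{\omega}_{\alpha,\beta,i,0}$$
(see the comment in the proof on branching). Let
$$\phi_{\alpha,\beta,i} = \pi \sum_n \frac{u_{\alpha,\beta,i,-n}\, \tilde{u}_{\alpha,\beta,i,-n}}{n} \tag{30}$$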
and similarly for the ˜’s, the ′’s and the ″’s. (The definition makes sense when applied to fields on which the term with denominator 0 vanishes.) Then
$$Y_{\alpha,\beta,u}(v, t, \bar{t}) w = \sum_i \Bigl( \phi'_{\alpha,\beta,i} Y(v, t, \bar{t}) w - Y(\phi''_{\alpha,\beta,i} v, t, \bar{t}) w - Y(v, t, \bar{t}) \phi_{\alpha,\beta,i} w + C_{\alpha,\beta,i} \tilde{C}_{\alpha,\beta,i} (-1 + e^{-2\pi i \alpha}) + D_{\alpha,\beta,i} \tilde{D}_{\alpha,\beta,i} (1 - e^{2\pi i \beta}) \Bigr). \tag{31}$$
Additionally, when $\alpha = 0$, then $D_{\alpha,\beta,i} = \tilde{D}_{\alpha,\beta,i} = 0$, and when $\beta = 0$ then $C_{\alpha,\beta,i} = \tilde{C}_{\alpha,\beta,i} = 0$, and
$$Y_{\alpha,\beta,u}(v, t, \bar{t}) w = \sum_i \Bigl( \phi'_{\alpha,\beta,i} Y(v, t, \bar{t}) w - Y(\phi''_{\alpha,\beta,i} v, t, \bar{t}) w - Y(v, t, \bar{t}) \phi_{\alpha,\beta,i} w \Bigr). \tag{32}$$
Equation (32) is also valid when $\alpha = -\beta$.

Remark 1. Note that technically, the integral (25) is not defined on the degenerate worldsheet described. This can be treated in the standard way, namely by considering an actual worldsheet $\Sigma'$ obtained by gluing on standard annuli on the boundary components. It is easily checked that if we denote by $A^u_q$ the infinitesimal deformation of $A_q$ by $u$, then
$$A^u_q(w) = \phi A_q(w) - A_q(\phi w).$$
Therefore, the theorem can be stated equivalently for the worldsheet $\Sigma'$. The only change needs to be made in formula (31), where $\phi'$ needs to be multiplied by $s^{-2n}$ and $\phi''$ needs to be multiplied by $r^{-2n}$, where $r$ and $s$ are radii of the corresponding boundary components. Because however this is equivalent, we can pretend to work on the degenerate worldsheet $\Sigma$ directly, in particular avoiding inconvenient scaling factors in the statement.

Remark 2. The validity of this theorem is rather restricted by its assumptions. Most significantly, its assumption states that the chiral 4-point function can be rendered meromorphic in one of the variables by multiplying by a factor of the form $z^\alpha (z - t)^\beta$. This is essentially equivalent to the fusion rules being “abelian”, i.e. 1-dimensional for each pair of labels, and each pair of labels has exactly one product. As we will see (and as is well known), the N = 2 minimal model is an example of a “non-abelian” theory. Speaking more generally in terms of function theory, branched analytic functions on $\mathbb{CP}^1$ (at a risk of great confusion, we recall that those were called “Abelsche Funktionen” by Riemann) are finite-dimensional vector spaces which are locally spaces of holomorphic functions, outside of finitely many points $z_1, \dots, z_n$ on $\mathbb{CP}^1$. One also assumes that the singularities at $z_i$ are of bounded polynomial growth. Such a function then defines a finite-dimensional representation of the fundamental group $\pi_1(\mathbb{CP}^1 - \{z_1, \dots, z_n\})$, called the holonomy representation.
In particular, chiral correlation functions of a full field algebra are branched functions in this sense. The key issue is whether the holonomy representation is a sum of one-dimensional
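representations (in which case it factors through the abelianization of the fundamental group, i.e. the first homology group). Then and only then is the function a sum of contributions which can be rendered holomorphic by multiplication with appropriate products of $(z_i - z_j)^{\alpha_{ij}}$. A most basic example of a branched function with non-abelian holonomy is the hypergeometric function, which occurs as the 4-point function of parafermions and N = 2-minimal models.

Even for an abelian theory, the theorem only calculates the deformation in the “0 charge sector” because of the assumption (29). Because of this, even for a free field theory, we will need to discuss an extension of the argument. Since in that case, however, stating precise assumptions is even more complicated, we prefer to treat the special case only, and to postpone the discussion to Sec. 4 below.

Proof. Let us work on the scaled real worldsheet $\Sigma'$. Let
$$\eta_{\alpha,\beta,i} = Z_{\alpha,\beta,i}(u, v, z, t)\, dz, \qquad \tilde{\eta}_{\alpha,\beta,i} = \tilde{Z}_{\alpha,\beta,i}(u, v, \bar{z}, \bar{t})\, d\bar{z}.$$
Denote by $\partial_0$, $\partial_\infty$, $\partial_t$ the boundary components of $\Sigma'$ near $0, \infty, t$. Then the form $\omega_{\alpha,\beta,i,\infty}\, \tilde{\eta}_{\alpha,\beta,i}$ is unbranched on a domain obtained by making a cut $c$ connecting $\partial_0$ and $\partial_t$. We have
$$\int_{\partial_t} \omega_{\alpha,\beta,i,t}\, \tilde{\eta} = -Y(\phi''_{\alpha,\beta,i} v, t, \bar{t}), \tag{33}$$
$$\int_{\partial_0} \omega_{\alpha,\beta,i,0}\, \tilde{\eta} = -Y(v, t, \bar{t})\, \phi_{\alpha,\beta,i}. \tag{34}$$
But we want to integrate $\omega_{\alpha,\beta,i,\infty}\, \tilde{\eta}_{\alpha,\beta,i}$ over the boundary $\partial\Sigma'$:
$$\int_{\partial\Sigma'} \omega_{\alpha,\beta,i,\infty}\, \tilde{\eta}_{\alpha,\beta,i} = \int_{\partial_t} \omega_{\alpha,\beta,i,\infty}\, \tilde{\eta}_{\alpha,\beta,i} + \int_{\partial_0} \omega_{\alpha,\beta,i,\infty}\, \tilde{\eta}_{\alpha,\beta,i} + \int_{\partial_\infty} \omega_{\alpha,\beta,i,\infty}\, \tilde{\eta}_{\alpha,\beta,i} + \int_{c^+} \omega_{\alpha,\beta,i,\infty}\, \tilde{\eta}_{\alpha,\beta,i} + \int_{c^-} \omega_{\alpha,\beta,i,\infty}\, \tilde{\eta}_{\alpha,\beta,i} \tag{35}$$
where $c^+$, $c^-$ are the two parts of $\partial\Sigma'$ along the cut $c$, oriented from $\partial_t$ to $\partial_0$ and back respectively. Before going further, let us look at two points $x^+ \in c^+$, $x^- \in c^-$ which project to the same point on $c$. We have
$$\begin{aligned}
C(e^{-2\pi i \alpha} - 1)\,\tilde{\eta}(x^-) &= C\tilde{\eta}(x^+) - C\tilde{\eta}(x^-) \\
&= (\omega_t + C)\,\tilde{\eta}(x^+) - (\omega_t + C)\,\tilde{\eta}(x^-) \\
&= \omega_\infty \tilde{\eta}(x^+) - \omega_\infty \tilde{\eta}(x^-) \\
&= (\omega_0 + D)\,\tilde{\eta}(x^+) - (\omega_0 + D)\,\tilde{\eta}(x^-) \\
&= D\tilde{\eta}(x^+) - D\tilde{\eta}(x^-) \\
&= D(e^{2\pi i \beta} - 1)\,\tilde{\eta}(x^-)
\end{aligned}$$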
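(the subscripts $\alpha, \beta, i$ were omitted throughout to simplify the notation). This implies the relation
$$C_{\alpha,\beta,i}(e^{-2\pi i \alpha} - 1) = D_{\alpha,\beta,i}(e^{2\pi i \beta} - 1). \tag{36}$$
Comment. This is valid when the constants $C_{\alpha,\beta,i}$, $D_{\alpha,\beta,i}$ are both taken at the point $x^-$; note that since the chiral forms are branched, we would have to adjust the statement if we measured the constants elsewhere. This however will not be of much interest to us as in the present paper we are most interested in the case when the constants vanish.

In any case, note that (36) implies $C_{\alpha,\beta,i} = 0$ when $\beta \equiv 0 \bmod \mathbb{Z}$ and $\alpha \not\equiv 0 \bmod \mathbb{Z}$, and $D_{\alpha,\beta,i} = 0$ when $\alpha \equiv 0 \bmod \mathbb{Z}$ and $\beta \not\equiv 0 \bmod \mathbb{Z}$. There is an analogous relation to (36) between $\tilde{C}_{\alpha,\beta,i}$, $\tilde{D}_{\alpha,\beta,i}$. Note that when $\alpha = 0 = \beta$, all the forms in sight are unbranched, and (32) follows directly. To treat the case $\alpha = -\beta$, proceed analogously, but replacing $\omega_{\alpha,\beta,i,\infty}$ by $\omega_{\alpha,\beta,i,0}$ or $\omega_{\alpha,\beta,i,t}$. Thus, we have finished proving (32) under its hypotheses.

Returning to the general case, let us study the right-hand side of (35). Subtracting the first two terms from (33), (34), we get
$$\int_{\partial_t} C_{\alpha,\beta,i}\, \tilde{\eta}_{\alpha,\beta,i}, \qquad \int_{\partial_0} D_{\alpha,\beta,i}\, \tilde{\eta}_{\alpha,\beta,i}, \tag{37}$$
respectively. On the other hand, the sum of the last two terms, looking at points $x^+$, $x^-$ for each $x \in c$, can be rewritten as
$$\int_{c^+} C_{\alpha,\beta,i}(-e^{-2\pi i \alpha} + 1)\, \tilde{\eta}_{\alpha,\beta,i} = \int_{c^-} D_{\alpha,\beta,i}(-e^{2\pi i \beta} + 1)\, \tilde{\eta}_{\alpha,\beta,i}. \tag{38}$$
Now recall (30). Choosing $\tilde{\omega}_{\alpha,\beta,i,\infty}$ as the primitive function of $\tilde{\eta}_{\alpha,\beta,i}$, we see that for the end point $x$ of $c^-$,
$$\begin{aligned}
\tilde{\omega}_{\alpha,\beta,i,\infty}(x^+) - \tilde{\omega}_{\alpha,\beta,i,\infty}(x^-) &= \tilde{\omega}_{\alpha,\beta,i,t}(x^+) - \tilde{\omega}_{\alpha,\beta,i,t}(x^-) \\
&= (e^{-2\pi i \alpha} - 1)\, \tilde{\omega}_{\alpha,\beta,i,t}(x^-) \\
&= (e^{-2\pi i \alpha} - 1)\, \tilde{\omega}_{\alpha,\beta,i,\infty}(x^-) + (e^{-2\pi i \alpha} - 1)\, \tilde{C}_{\alpha,\beta,i}.
\end{aligned} \tag{39}$$
Similarly, for the beginning point $y$ of $c^-$,
$$\begin{aligned}
-\tilde{\omega}_{\alpha,\beta,i,\infty}(y^+) + \tilde{\omega}_{\alpha,\beta,i,\infty}(y^-) &= -\tilde{\omega}_{\alpha,\beta,i,0}(y^+) + \tilde{\omega}_{\alpha,\beta,i,0}(y^-) \\
&= -(e^{2\pi i \beta} - 1)\, \tilde{\omega}_{\alpha,\beta,i,0}(y^-) \\
&= -(e^{2\pi i \beta} - 1)\, \tilde{\omega}_{\alpha,\beta,i,\infty}(y^-) - (e^{2\pi i \beta} - 1)\, \tilde{D}_{\alpha,\beta,i}.
\end{aligned} \tag{40}$$
Then (39), (40) multiplied by $C_{\alpha,\beta,i}$ are the integrals (37), while the integral (38) is
$$-D_{\alpha,\beta,i}(1 - e^{2\pi i \beta})\, \tilde{\omega}_{\alpha,\beta,i,0}(y^-) + C_{\alpha,\beta,i}(1 - e^{-2\pi i \alpha})\, \tilde{\omega}_{\alpha,\beta,i,0}(x^-). \tag{41}$$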
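Relation (36) can be read off case by case. The following worked cases are a sketch that uses only (36) itself; the shorthand $a = e^{-2\pi i\alpha} - 1$, $b = e^{2\pi i\beta} - 1$ is introduced here for illustration and is not notation from the text.

```latex
% Shorthand (hypothetical, for this sketch only):
%   a = e^{-2\pi i\alpha} - 1,  b = e^{2\pi i\beta} - 1.
\[
C\,a = D\,b, \qquad a = 0 \iff \alpha\in\mathbb{Z}, \qquad b = 0 \iff \beta\in\mathbb{Z}.
\]
\[
\begin{array}{lll}
\beta\in\mathbb{Z},\ \alpha\notin\mathbb{Z}: & C\,a = 0,\ a\neq 0 & \Longrightarrow\ C = 0,\\[2pt]
\alpha\in\mathbb{Z},\ \beta\notin\mathbb{Z}: & 0 = D\,b,\ b\neq 0 & \Longrightarrow\ D = 0,\\[2pt]
\alpha,\beta\in\mathbb{Z}: & 0 = 0 & \text{(no constraint from (36))},\\[2pt]
\alpha,\beta\notin\mathbb{Z}: & D = C\,a/b & \text{(one constant determines the other)}.
\end{array}
\]
```

In each of the first three cases, every correction term of the form $C\tilde{C}a$ or $D\tilde{D}b$ contains a vanishing factor, which is why only the fourth case contributes beyond the $\phi$-terms.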
March 10, J070-S0129055X10003916
134
2010 10:13 WSPC/S0129-055X
148-RMP
I. Kriz
Adding this, we get ˜ α,β,i (1 − e2πiβ ), Cα,β,i C˜α,β,i (−1 + e−2πiα ) + Dα,β,i D as claimed.
3. Exponentiation of Infinitesimal Deformations

Let us now look at primary weight (1, 1) fields u. We would like to investigate whether the infinitesimal deformation of vertex operators (more precisely worldsheet vacua or string amplitudes) along u indeed continues to a finite deformation, or at least to perturbative level, as discussed in the previous section. Looking again at Eq. (8), we see that we have in principle a series of obstructions similar to those of Gerstenhaber [26–29], namely if we denote by

$$L_n(m) = \sum_{i=0}^{m} L_n^i\epsilon^i, \qquad L_n^0 = L_n \tag{42}$$

a deformation of the operator $L_n$ in $\mathrm{Hom}(V, V)[\epsilon]/\epsilon^m$, we must have

$$L_n(m)u(m) = 0 \in V[\epsilon]/\epsilon^{m+1} \quad \text{for } n > 0, \tag{43}$$

$$L_0(m)u(m) = u(m) \in V[\epsilon]/\epsilon^{m+1}. \tag{44}$$

This can be rewritten as

$$L_n u^m = -\sum_{i\geq 1} L_n^i u^{m-i}, \qquad (L_0 - 1)u^m = -\sum_{i\geq 1} L_0^i u^{m-i}. \tag{45}$$
(Analogously for the ˜’s. In the following, we will work on the obstruction for the chiral part, the antichiral part is analogous.) At first, these equations seem very overdetermined. Similarly as in the case of Gerstenhaber’s obstruction theory, however, of course the obstructions are of cohomological nature. If we denote by $A$ the Lie algebra $\langle L_0 - 1, L_1, L_2, \ldots\rangle$, then the system

$$L_n(m)u(m-1), \qquad (L_0(m) - 1)u(m-1) \tag{46}$$

is divisible by $\epsilon^m$ in $V[\epsilon]/\epsilon^{m+1}$, and is obviously a coboundary, hence a cocycle with respect to $L_0(m) - 1, L_1(m), \ldots$. Hence, dividing by $\epsilon^m$, we get a 1-cocycle in $H^1(A, C)$. Solving (45) means expressing this $A$-cocycle as a coboundary.

In the absence of ghosts (= elements of negative weights), there is another simplification we may take advantage of. Suppose we have a 1-cocycle $c = (x_0, x_1, \ldots)$ of $A$, representing an element of $H^1(A, C)$. (In our applications, we will be interested in the case when the $x_i$’s are given by (46).) Writing out the cocycle condition explicitly, we obtain the equations

$$L'_k x_j - L'_j x_k = (k - j)x_{j+k},$$

where $L'_k = L_k$ for $k > 0$, $L'_0 = L_0 - 1$. In particular, $L_k x_0 - L'_0 x_k = kx_k$, or

$$L_k x_0 = (L_0 + k - 1)x_k \quad \text{for } k > 0. \tag{47}$$
In the absence of ghosts, (47) means that for $k \geq 1$, $x_k$ is determined by $x_0$ with the exception of the weight 0 summand $(x_1)_0$ of $x_1$. Additionally, if we denote the weight $k$ summand of $y$ in general by $y_k$, then

$$c = dy \tag{48}$$

means

$$(x_0)_k = (k - 1)y_k, \tag{49}$$

$$(x_0)_1 = 0. \tag{50}$$

The rest of Eq. (48) then follows from (47), with the exception of the weight 0 summand of $x_1$. We must, then, have

$$(x_1)_0 \in \mathrm{Im}\, L_1. \tag{51}$$

Conditions (50), (51), for

$$x_k = -\sum_{i\geq 1} L_k^i u^{m-i},$$

are the conditions for solving (45), i.e. the actual obstruction. For m = 1, we get what we call the primary obstruction. Calculating the integral (5) over an annulus and passing to the appropriate limits (the infinitesimal annuli expressing the operators $L_n$), we obtain

$$L_k^1 = \tilde L_{-k}^1 = \sum_{m,i} u_{i,m+k}\tilde u_{i,m}, \tag{52}$$

so (50) becomes

$$\sum_i u_{i,0}\tilde u_{i,0}u = 0. \tag{53}$$

The condition (51) becomes

$$\sum_i u_{i,1}\tilde u_{i,0}u \in \mathrm{Im}\, L_1, \qquad \sum_i u_{i,0}\tilde u_{i,1}u \in \mathrm{Im}\, \tilde L_1. \tag{54}$$
This investigation is also interesting in the supersymmetric context. In the case of N = 1 worldsheet supersymmetry, we have additional operators $G_r^i$, and in the N = 2 SUSY case, we have operators $G_r^{+i}$, $G_r^{-i}$, $J_n^i$ (cf. [31, 49]), defined as the $\epsilon^i$-coefficient of the deformation of $G_r$, resp. $G_r^+$, $G_r^-$, $J_n$ analogously to Eq. (42). In the N = 1-supersymmetric case, the critical deforming fields have weight (1/2, 1/2) (as do a and c fields in the N = 2 case), so in both cases the first Eq. (42) remains the same as in the N = 0 case, the second becomes

$$(L_0 - 1/2)u^m = -\sum_{i\geq 1} L_0^i u^{m-i}. \tag{55}$$

Additionally, for N = 1, we get

$$G_r u^m = -\sum_{i\geq 1} G_r^i u^{m-i}, \qquad r \geq 1/2 \tag{56}$$

(similarly when ˜’s are present). In the N = 1-supersymmetric case, we therefore deal with the Lie algebra $A$, which is the free C-vector space on $L_n$, $G_r$, $n \geq 0$, $r \geq 1/2$. For a cocycle which has value $x_k$ on $L_k$ and $z_r$ on $G_r$, Eq. (47) becomes

$$L_k x_0 = (L_0 + k - 1/2)x_k \quad \text{for } k > 0, \tag{57}$$

so in the absence of ghosts, $x_k$ is always determined by $x_0$. If the 1-cocycle

$$(x_k, z_r) \text{ is the coboundary of } y, \tag{58}$$

we additionally get $(x_0)_k = (k - 1/2)y_k$, so $(x_0)_{1/2} = 0$. On the other hand, on the $z$’s, we get

$$G_r x_0 = (L_0 + r - 1/2)z_r, \qquad r \geq 1/2, \tag{59}$$
so we see that in the absence of ghosts, all $z_r$’s are determined, with the exception of $(z_{1/2})_0$. Therefore our obstruction is

$$(x_0)_{1/2} = 0, \qquad (z_{1/2})_0 \in \mathrm{Im}(G_{1/2}). \tag{60}$$

For the primary obstruction, we have

$$L_k^1 = \tilde L_{-k}^1 = \sum_m (G_{-1/2}\tilde G_{-1/2}u)_{m+k,m}, \tag{61}$$

$$G_r^1 = 2\sum_m (\tilde G_{-1/2}u)_{m+r,m}, \qquad \tilde G_r^1 = 2\sum_m (G_{-1/2}u)_{m,m+r}, \tag{62}$$

so the obstruction becomes

$$\sum_m (G_{-1/2}\tilde G_{-1/2}u)_{m,m} = 0, \qquad \sum_m (\tilde G_{-1/2}u)_{m+1/2,m} \in \mathrm{Im}(G_{1/2}), \qquad \sum_m (G_{-1/2}u)_{m,m+1/2} \in \mathrm{Im}(\tilde G_{1/2}). \tag{63}$$
In the case of N = 2 supersymmetry, there is an additional complication, namely chirality. This means that in addition to the conditions

$$(L_0 - 1/2)u = 0, \qquad L_n u = G_r^\pm u = J_{n-1}u = 0 \quad \text{for } n \geq 1,\ r \geq 1/2, \tag{64}$$

we require that u be chiral primary, which means

$$G^+_{-1/2}u = 0. \tag{65}$$

(There is also the possibility of antichiral primary, which has

$$G^-_{-1/2}u = 0 \tag{66}$$

instead, and similarly for the ˜’s.) Let us now write down the obstruction equations for the chiral primary case. We get the first Eqs. (45), (55), and an analogue of (56) with $G_r^i$ replaced by $G_r^{+i}$ and $G_r^{-i}$. Additionally, we have the equation

$$G^+_{-1/2}u^m = -\sum_{i\geq 1} G^{+i}_{-1/2}u^{m-i}$$

and analogously for the ˜’s.

In this situation, we consider the super-Lie algebra $A_2$ which is the free C-vector space on $L_n$, $J_n$, $n \geq 0$, $G_r^-$, $r \geq 1/2$ and $G_s^+$, $s \geq -1/2$. One easily verifies that this is a super-Lie algebra on which the central extension vanishes canonically ([31], Sec. 3.1). Looking at a 1-cocycle whose value is $x_k$, $z_r^\pm$, $t_k$ on $L_k$, $G_r^\pm$, $J_k$ respectively, we get Eq. (57), and additionally

$$G_r^\pm x_0 = (L_0 + r - 1/2)z_r^\pm, \qquad r \geq 1/2 \text{ for } -,\ r \geq -1/2 \text{ for } + \tag{67}$$

and

$$J_n x_0 = (n - 1/2)t_n, \qquad n \geq 0. \tag{68}$$

We see that the cocycle is determined by $x_0$, with the exception of $(z^\pm_{1/2})_0$, $(z^+_{-1/2})_1$. Therefore, we get the condition

$$(x_0)_{1/2} = 0, \qquad (z^\pm_{1/2})_0 \in \mathrm{Im}(G^\pm_{1/2}), \qquad (z^+_{-1/2})_1 = G^+_{-1/2}u \ \text{ where } G^+_{1/2}u = 0 \tag{69}$$

and similarly for the ˜’s.
In the case of deformation along a cc field u, we have

$$L_k^1 = \tilde L_{-k}^1 = \sum_m (G^-_{-1/2}\tilde G^-_{-1/2}u)_{m+k,m}, \tag{70}$$

$$G_r^{+,1} = \sum_m 2(\tilde G^-_{-1/2}u)_{m+r+1/2,m}, \qquad \tilde G_r^{+,1} = \sum_m 2(G^-_{-1/2}u)_{m,m+r+1/2}, \qquad G_r^{-,1} = \tilde G_r^{-,1} = 0, \qquad J_n^1 = 0 = \tilde J_n^1, \tag{71}$$

so the obstructions are, in a sense, analogous to (63) with $G_r$ replaced by $G_r^-$.

Remark. The relevant computation in verifying that (70), (71) (and the analogous cases before) form a cocycle uses formulas of the following type ([72]):

$$\mathrm{Res}_z(a(z)v(w)z^n) - \mathrm{Res}_z(v(w)a(z)z^n) = \mathrm{Res}_{z-w}((a(z-w)v)(w)z^n). \tag{72}$$

For example, when v is primary of weight 1, $a = L_{-2}$, the right-hand side of (72) is

$$\begin{aligned}
\mathrm{Res}_{z-w}\big(L_0 v(z-w)^{-2}\,n(z-w)w^{n-1} + L_{-1}v(z-w)^{-1}w^n\big) &= nv(w)w^{n-1} + L_{-1}v(w)w^n \\
&= \sum_k nv_k w^{n-k-2} + \sum_k (-k-1)v_k w^{n-k-2} \\
&= \sum_k (n-k-1)v_k w^{n-k-2}.
\end{aligned}$$

The left-hand side is $\sum_k [L_{n-1}, v_{k-n+1}]w^{n-k-2}$, so we get $[L_{n-1}, v_{k-n+1}] = (n-k-1)v_k$, as needed. Other required identities follow in a similar way. Let us verify one interesting case when $a = G^-_{-3/2}$, u chiral primary. Then the right-hand side of (72) is

$$\mathrm{Res}_{z-w}\big(G^-_{-1/2}v(w)(z-w)^{-1}w^n\big) = (G^-_{-1/2}v)(w)\,w^n = \sum_k (G^-_{-1/2}v)_k\,w^{n-k-1}.$$

This implies

$$[G_r^-, u_s] = (G^-_{-1/2}u)_{r+s}, \tag{73}$$
as needed. We have now analyzed the primary obstructions for exponentiation of infinitesimal CFT deformations. However, in order for a perturbative exponentiation to exist, there are also higher obstructions which must vanish. The basic principle for obtaining these obstructions was formulated above. However, in practice, it may often happen that those obstructions will not converge. This may happen for two different basic reasons. One possibility is that the deformation of the deforming field
itself does not converge. This is essentially a violation of perturbativity, but may in some cases be resolved by regularizing the CFT anomaly along the deformation parameter. We will discuss this at the end of this section, and will give an example in Sec. 4 below. Even if all goes well with the parameter, however, there may be another problem, namely the expressions for $L_n^i$ etc. may not converge due to the fact that our deformation formulas concern vacua of actual worldsheets, while $L_n^i$ etc. correspond to degenerate worldsheets. Similarly, vertex operators may not converge in the deformed theories. We will show here how to deal with this problem.

The main strategy is to rephrase the conditions from the above part of this section in terms of “finite annuli”. We start with the N = 0 (non-supersymmetric) case. Similarly as in (42), we can expand

$$U_{A_r}(m) = \sum_{h=0}^{m} U_{A_r}^h\epsilon^h. \tag{74}$$
In the non-supersymmetric case, the basic fact we have is the following:

Theorem 2. Assuming $u^k$ (considered as fields in the original undeformed CFT) have weight > (1, 1) for k < h, r ∈ (0, 1) we have

$$U_{A_r}^h = \sum_{(m_k,\,k\leq h)} u_{m_h,m_h}\cdots u_{m_1,m_1}U_{A_r} \cdot \int_{s_h=r}^{1} s_h^{2m_h-1}\int_{s_{h-1}=r}^{s_h} s_{h-1}^{2m_{h-1}-1}\cdots\int_{s_1=r}^{s_2} s_1^{2m_1-1}\,ds_1\cdots ds_h. \tag{75}$$

(Note that the integral on the right-hand side is over a simplex.)

$$u^h = \sum_{(m_k,\,k\leq h)} \frac{1}{2^h(m_h+\cdots+m_1)(m_{h-1}+\cdots+m_1)\cdots m_1}\,u_{m_h,m_h}\cdots u_{m_1,m_1}u. \tag{76}$$

In particular, the obstruction is the vanishing of the sum (with the term $m_h+\cdots+m_1$ omitted from the denominator) of the terms in (76) with $m_h+\cdots+m_1 = 0$.

Proof. The identity (75) is essentially by definition. The key point is that in the higher deformed vacua, there are terms in the integrand obtained by inserting $u^k$, k > 1 to boundaries of disjoint disks $D_i$ cut out of $A_r$. Then there are corrective terms to be integrated on the worldsheets obtained by cutting out those disks. But the point is that under our weight assumption, all the disks $D_i$ can be shrunk to a single point, at which point the term disappears, and we are left with integrals of several copies of u inserted at different points. If we are using vertex operators to express the integral, the operators must additionally be applied in time order (i.e. fields at points of lower modulus are inserted first). There is an h! permutation factor which cancels with the Taylor denominator. This gives (75).
Now (76) is proved by induction. For h = 1, the calculation is done above ((52)). Assume now the induction hypothesis, and evaluate the integral in the standard way of taking primitive functions successively from the inside out. The primitive function of $m^s$ is taken to be $m^{s+1}/(s+1)$ (by the induction hypothesis, and the assumptions that lower order obstructions vanish, the case s = −1 never occurs). Then the contributing term of the integral where the k − 1 innermost integrals have the upper bound and the kth innermost integral has the lower bound is equal to

$$U_{A_r}^{h-k}u^k, \qquad h > k \geq 1$$

(and this term occurs with a minus sign because of the lower bound involved). The summand which has all upper bounds except in the last integral is equal to

$$\frac{1 - r^{2(m_1+\cdots+m_h)}}{2^h(m_h+\cdots+m_1)(m_{h-1}+\cdots+m_1)\cdots m_1}\,u_{m_h,m_h}\cdots u_{m_1,m_1}u\,r^2, \tag{77}$$

which is supposed to be equal to $-U_{A_r}u^h + r^2u^h$. This gives the desired solution.

Remark. The formula (77) of course does not apply to the case $m_1+\cdots+m_h = 0$. In that case, the correct formula is

$$\frac{-\ln(r)}{(m_{h-1}+\cdots+m_1)\cdots m_1}\,u_{m_h,m_h}\cdots u_{m_1,m_1}u\,r^2. \tag{78}$$
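The scalar calculus behind (77) and (78) can be checked numerically. The Python sketch below treats only the case h = 2 of the iterated integral in (75): it compares a Simpson approximation with the two pieces obtained by taking primitive functions from the inside out, namely the "all upper bounds" piece and the piece coming from the lower bound of the inner integral. It does not test the operator identity itself. The function names, the restriction to h = 2, the assumption that $m_1, m_2 \neq 0$, and the sample values are illustrative choices, not taken from the text.

```python
import math

def inner_primitive(s2, r, m1):
    """Closed form of the inner integral of s1**(2*m1 - 1) over [r, s2], assuming m1 != 0."""
    return (s2 ** (2 * m1) - r ** (2 * m1)) / (2 * m1)

def simplex_integral(r, m1, m2, steps=2000):
    """Simpson approximation of the h = 2 iterated integral over r <= s1 <= s2 <= 1."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    width = (1.0 - r) / steps

    def integrand(s2):
        return s2 ** (2 * m2 - 1) * inner_primitive(s2, r, m1)

    total = integrand(r) + integrand(1.0)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * integrand(r + j * width)
    return total * width / 3.0

def closed_form(r, m1, m2):
    """Upper-bound piece minus lower-bound piece, assuming m1 != 0 and m2 != 0."""
    m_sum = m1 + m2
    if m_sum != 0:
        # generic case: the factor (1 - r**(2*(m1+m2))) / (2*(m1+m2)) as in (77)
        upper = (1 - r ** (2 * m_sum)) / (4 * m_sum * m1)
    else:
        # resonant case m1 + m2 = 0: the outermost integral of 1/s gives -ln(r), as in (78)
        upper = -math.log(r) / (2 * m1)
    lower = r ** (2 * m1) * (1 - r ** (2 * m2)) / (4 * m1 * m2)
    return upper - lower
```

For instance, `simplex_integral(0.5, 1, 2)` and `closed_form(0.5, 1, 2)` should agree to high accuracy, and the resonant choice `m1 = 1, m2 = -1` exhibits the logarithm that replaces the power of r when the total mode number vanishes.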
So the question becomes whether there could exist a field $u^h$ such that $U_{A_r}u^h - r^2u^h$ is equal to the quantity (78). One sees immediately that such field does not exist in the product-completed space of the original theory. What this approach does not settle however is whether it may be possible to add such non-perturbative fields to the theory and preserve CFT axioms, which could facilitate existence of deformations in some generalized sense, despite the algebraic obstruction. It would have to be, however, a field of generalized weight in the sense of [39–42]. In effect, written in infinitesimal terms, the expression (78) becomes

$$L_0u^h - u^h = -\frac{1}{(m_{h-1}+\cdots+m_1)\cdots m_1}\,u_{m_h,m_h}\cdots u_{m_1,m_1}u.$$

The right-hand side $wu$ is a field of holomorphic weight 1, so we see that we have a matrix relation

$$L_0\begin{pmatrix} u^h \\ u \end{pmatrix} = \begin{pmatrix} 1 & w \\ 0 & 1 \end{pmatrix}\begin{pmatrix} u^h \\ u \end{pmatrix}.$$

This is an example of what one means by a field of generalized weight. One should note, however, that fields of generalized weight are excluded in unitary conformal field theories. By Wick rotation, the unitarity axiom of a conformal field theory becomes the axiom of reflection positivity [59]: the operator $U_\Sigma$ associated with a
worldsheet Σ is defined up to a 1-dimensional complex line $L_\Sigma$ (which is often more strongly assumed to have a positive real structure). If we denote by $\bar\Sigma$ the complex-conjugate worldsheet (note that this reverses orientation of boundary components), then reflection positivity requires that we have an isomorphism $L_{\bar\Sigma} \cong L_\Sigma^*$ (the dual line), and using this isomorphism, an identification $U_{\bar\Sigma} = U_\Sigma^*$ (here the asterisk denotes the adjoint operator). Specializing to annuli $A_r$, $|r| \leq 1$, we see that the annulus for r real is self-conjugate, so the corresponding operators are selfadjoint, and hence diagonalizable. On the other hand, for $|r| = 1$, we obtain unitary operators, and unitary representations of $S^1$ on Hilbert space split into eigenspaces of integral weights. The central extension given by $L$ is then trivial and hence the operators corresponding to all $A_r$ commute, and hence are simultaneously diagonalizable, thus excluding the possibility of generalized weight. The possibility, of course, remains that the correlation function of the deformed theory can be modified by a non-perturbative correction.

Let us note that if left uncorrected, the term (78) can be interpreted infinitesimally as

$$L_0u(\epsilon) - u(\epsilon) = C\epsilon^m v \mod \epsilon^{m+1}, \tag{79}$$
where v is another field of weight 1. Note that in case that u = v, (79) can be interpreted as saying that u changes weight at order m of the perturbation parameter. In the general case, we obtain a matrix involving all the (holomorphic) weight 1 fields in the unperturbed theory. Excluding fields of generalized weight in the unperturbed theory (which would translate to fields of generalized weight in the perturbed theory), the matrix must have other eigenvalues than 1, thus showing that some critical fields will change weight.

In the N = 1-supersymmetric case, an analogous statement holds, except the assumption is that the weight of $u^k$ is greater than (1/2, 1/2) for k < h, and the integral (75) must be replaced by

$$U_{A_r}^h = \sum_{m_k} (G_{-1/2}\tilde G_{-1/2}u)_{m_h,m_h}\cdots(G_{-1/2}\tilde G_{-1/2}u)_{m_1,m_1}U_{A_r} \cdot \int_{s_h=r}^{1} s_h^{m_h-1}\int_{s_{h-1}=r}^{s_h} s_{h-1}^{m_{h-1}-1}\cdots\int_{s_1=r}^{s_2} s_1^{m_1-1}\,ds_1\cdots ds_h, \tag{80}$$

and accordingly

$$u^h = \sum_{m_k} \frac{1}{2^h(m_h+\cdots+m_1)(m_{h-1}+\cdots+m_1)\cdots m_1} \cdot (G_{-1/2}\tilde G_{-1/2}u)_{m_h,m_h}\cdots(G_{-1/2}\tilde G_{-1/2}u)_{m_1,m_1}u, \tag{81}$$
so the obstruction again states that the term with $m_h+\cdots+m_1 = 0$ must vanish. In the N = 2 case, when u is a cc field, we simply replace $G$ by $G^-$ in (80) and (81).

But in the supersymmetric case, to preserve supersymmetry along the deformation, we must also investigate the “finite” analogs of the obstructions associated with $G_{1/2}$ in the N = 1 case, and $G^\pm_{1/2}$, $G^+_{-1/2}$ in the N = 2 c case (and similarly for the a case, and the ˜’s). In fact, to tell the whole story, we should seriously investigate integration of the deforming fields over super-Riemann surfaces (= super-worldsheets). This can be done; one approach is to treat the case of the superdisk first, using Stokes theorem twice with the differentials $\partial$, $\bar\partial$ replaced by $D$, $\bar D$ respectively in the N = 1 case (and the same at one chirality for the N = 2 case). A general super-Riemann surface is then partitioned into superdisks.

For the purpose of obstruction theory, the following special case is sufficient. We treat the N = 2 case, since it is of main interest for us. Let us consider the case of cc fields (the other cases are analogous). First we note (see (71)) that $G^-$ is unaffected by deformation via a cc field, so the obstructions derived from $G^-_{-1/2}$ and $G^-_{1/2}$ are trivial (and similarly at the ˜’s).

To understand the obstruction associated with $G^+_{1/2}$, we will study “finite” (as opposed to infinitesimal) annuli obtained by exponentiating $G^+_{1/2}$. Now the element $G^+_{1/2}$ is odd. Thinking of the super-semigroup of superannuli as a supermanifold, then it makes no sense to speak of “odd points” of the supermanifold. It makes sense, however, to speak of a family of odd elements parametrized by an odd parameter s: this is simply the same thing as a map from the (0|1)-dimensional superaffine line into the supermanifold. In this sense, we can speak of the “finite” odd annulus

$$\exp(sG^+_{1/2}). \tag{82}$$
Now we wish to study the deformations of the operator associated with (82) along a cc field u as a perturbative expansion in $\epsilon$. Thinking of $G^+_{1/2}$ as an N = 2-supervector field, we have

$$G^+_{1/2} = (z + \theta^+\theta^-)\frac{\partial}{\partial\theta^+} - z\theta^-\frac{\partial}{\partial z}. \tag{83}$$

We see that (83) deforms infinitesimally only the variables $\theta^+$ and $z$, not $\theta^-$. Thus, more specifically, (82) results in the transformation

$$z \mapsto \exp(s\theta^-)z, \qquad \theta^- \mapsto \theta^-. \tag{84}$$

This gives rise to the formula, valid when $u^k$ have weight > (1/2, 1/2) for 1 ≤ k < h,

$$U^h_{\exp(sG^+_{1/2})} = \sum_{m_k} \int_{t_h=\exp(s\theta^-)}^{1} t_h^{m_h-1}\int_{t_{h-1}=\exp(s\theta^-)}^{t_h} t_{h-1}^{m_{h-1}-1}\cdots\int_{t_1=\exp(s\theta^-)}^{t_2} t_1^{m_1-1}\,dt_1\cdots dt_h \cdot v_{m_h,m_h}\cdots v_{m_1,m_1}U_{\exp(sG^+_{1/2})}, \tag{85}$$

where $v_{m_k,m_k}$ is equal to

$$(\tilde G^-_{-1/2}u)_{m_k+1/2,\,m_k} \tag{86}$$
in summands of (85) where the factor resulting from integrating the $t_k$ variable has a $\theta^-$ factor, and

$$(G^-_{-1/2}\tilde G^-_{-1/2}u)_{m_k,m_k} \tag{87}$$

in other summands. (We see that each summand can be considered as a product of factors resulting from integrating the individual variables $t_k$; in at most one factor, (86) can occur, otherwise the product vanishes.) Realizing that $\exp(ms\theta^-) = 1 + ms\theta^-$, this gives that the obstruction (under the weight assumption for $u^k$) is that the summand for $m_1+\cdots+m_h = 0$ (with the denominator $m_1+\cdots+m_h$ omitted) in the following expression vanish:

$$\sum_{m_k}\sum_{k=1}^{h} \frac{1}{m_1+\cdots+m_h}\cdots\frac{1}{m_1}\,(G^-_{-1/2}\tilde G^-_{-1/2}u)_{m_h,m_h}\cdots m_k(\tilde G^-_{-1/2}u)_{m_k+1/2,\,m_k}\cdots(G^-_{-1/2}\tilde G^-_{-1/2}u)_{m_1,m_1}u. \tag{88}$$
To investigate the higher obstructions further, we need the language of correlation functions. Specifically, the CFT’s whose deformations we will consider are “RCFT’s”. The simplest way of building an RCFT is from “chiral sectors” $H_\lambda$ where λ runs through a set of labels, by the recipe

$$H = \bigoplus_\lambda H_\lambda \otimes H_{\lambda^*}$$

where $\lambda^*$ denotes the contragredient label (cf. [38]). (In the case of the Gepner model, we will need a slightly more general scenario, but our methods still apply to that case analogously.) Further, we will have a symmetric bilinear form

$$B : H_\lambda \otimes H_{\lambda^*} \to \mathbb{C}$$

with respect to which the adjoint to $Y(v, z)$ is $(-z^{-2})^n Y(e^{zL_1}v, 1/z)$ where v is of weight n. There is also a real structure

$$H_\lambda \cong \bar H_{\lambda^*},$$

thus specifying a real structure on H, $\overline{u \otimes v} = \bar u \otimes \bar v$, and inner product

$$\langle u_1 \otimes v_1, u_2 \otimes v_2\rangle = B(u_1, \bar u_2)B(v_1, \bar v_2).$$

We also have an inner product $H_\lambda \otimes_{\mathbb{R}} H_{\lambda^*} \to \mathbb{C}$ given by $\langle u, v\rangle = B(u, \bar v)$. Then we have the $\mathbb{P}^1$-chiral correlation function

$$\langle u(z_\infty)^* \mid v_m(z_m)v_{m-1}(z_{m-1})\cdots v_1(z_1)v_0(z_0)\rangle \tag{89}$$
which can be defined by taking the vacuum operator associated with the degenerate worldsheet Σ obtained by “cutting out” unit disks with centers $z_0, \ldots, z_m$ from the unit disk with center $z_\infty$, applying this operator to $v_0 \otimes \cdots \otimes v_m$, and taking inner product with u. Thus, the correlation function (89) is in fact the same thing as applying the field on either side of (89) to the identity, and taking the inner product. This object (89) is however not simply a function of $z_0, \ldots, z_\infty$. Instead, there is a finite-dimensional vector space $M_\Sigma$ depending holomorphically on Σ (called the modular functor) such that (89) is a linear function $M_\Sigma \to \mathbb{C}$. However, now one assumes that M is a “unitary modular functor” in the sense of Segal [59]. This means that $M_\Sigma$ has the structure of a positive-definite inner product space for not just the Σ as above, but an arbitrary worldsheet. The inner product is not valued in $\mathbb{C}$, but in $\det(\Sigma)^{2c}$ where c is the central charge. Since the determinant of Σ as above is the same as $\det(\mathbb{P}^1)$ (hence in particular constant), we can make the inner product $\mathbb{C}$-valued in our case. If the deforming field is of the form

$$u \otimes \tilde u, \tag{90}$$

the “higher $L_0$ obstruction” (under the weight assumptions given above) can be further written as

$$\int_{0\leq |z_1|\leq\cdots\leq |z_m|\leq 1} \langle v(0)^* \mid u(z_m)\cdots u(z_1)u(0)\rangle \cdot \langle\tilde v^* \mid \tilde u(\bar z_m)\cdots\tilde u(\bar z_1)\tilde u(0)\rangle\,dz_1\,d\bar z_1\cdots dz_m\,d\bar z_m \quad \text{for } w(v) \leq 1 \tag{91}$$

(w is weight) in the N = 0 case and

$$\int_{0\leq |z_1|\leq\cdots\leq |z_m|\leq 1} \langle v(0)^* \mid (G^-_{-1/2}u)(z_m)\cdots(G^-_{-1/2}u)(z_1)u(0)\rangle \cdot \langle\tilde v^* \mid (\tilde G^-_{-1/2}\tilde u)(\bar z_m)\cdots(\tilde G^-_{-1/2}\tilde u)(\bar z_1)\tilde u(0)\rangle\,dz_1\,d\bar z_1\cdots dz_m\,d\bar z_m \quad \text{for } w(v) \leq 1/2 \tag{92}$$

in the N = 2 cc case. The $G^+_{1/2}$-obstruction in the N = 2 case can be written as

$$\sum_{k=1}^{m}\int_{0\leq |z_1|\leq\cdots\leq |z_m|\leq 1} \langle v(0)^* \mid (G^-_{-1/2}u)(z_m)\cdots u(z_k)\cdots(G^-_{-1/2}u)(z_1)u(0)\rangle \cdot \langle\tilde v^* \mid (\tilde G^-_{-1/2}\tilde u)(\bar z_m)\cdots(\tilde G^-_{-1/2}\tilde u)(\bar z_1)\tilde u(0)\rangle\,dz_1\,d\bar z_1\cdots dz_m\,d\bar z_m \quad \text{for } w(v) \leq 0,\ w(\tilde v) \leq 1/2 \tag{93}$$
and similarly for the ˜. We see that these obstructions vanish when we have

$$\langle v(z_\infty)^* \mid u(z_m)\cdots u(z_0)\rangle = 0, \quad \text{for } w(v) \leq 1 \tag{94}$$

in the N = 0 case (and similarly for the ˜’s), and

$$\langle v(z_\infty)^* \mid G^-_{-1/2}u(z_m)\cdots G^-_{-1/2}u(z_1)u(z_0)\rangle = 0, \quad \text{for } w(v) \leq 1/2, \tag{95}$$

and similarly for the ˜’s. Observe further that when $\tilde u = \bar u$, the condition for the ˜’s is equivalent to the condition for u, and further (94), (95) are also necessary in this case, as in (91), (92) we may also choose $\tilde v = \bar v$, which makes the integrand non-negative (and only 0 if it is 0 at each chirality). In the N = 2 case, it turns out that the condition (95) simplifies further:

Theorem 3. Let u be a chiral primary field of weight 1/2. Then the necessary and sufficient condition (95) for existence of perturbative CFT deformations along the field $u \otimes \bar u$ is equivalent to the same vanishing condition applied to only chiral primary fields v of weight 1/2.

Proof. In order for the fields (95) to correlate, they would have to have the same J-charge $Q_J$. We have

$$Q_J u = 1, \qquad Q_J(G^-_{-1/2}u) = 0,$$

so $Q_J$ of the right-hand side of (95) is 1. Thus, for the function (95) to be possibly non-zero, we must have

$$Q_J v = 1. \tag{96}$$

But then we have

$$w(v) \geq \frac{1}{2}Q_J v = \frac{1}{2} \tag{97}$$

with equality arising if and only if v is chiral primary of weight 1/2 (see [31, Sec. 3.3]).

Remark 1. We see therefore that in the N = 2 SUSY case, there is in fact no need to assume that the weight of $u^k$ is > (1/2, 1/2) for k < h. If the obstruction vanishes for k < h, then we have

$$u^k = \frac{1}{k!}\int_D (G^-_{-1/2}\tilde G^-_{-1/2}u)(z_k)\cdots(G^-_{-1/2}\tilde G^-_{-1/2}u)(z_1)u\,dz_1\cdots dz_k\,d\bar z_1\cdots d\bar z_k \tag{98}$$

where the integrand is understood as a (k + 1)-point function (and not its power series expansion in any particular range), over the unit disk.
Additionally, for any worldsheet Σ,

$$U_\Sigma^h = \frac{1}{h!}\int_\Sigma (G^-_{-1/2}\tilde G^-_{-1/2}u)(z_h)\cdots(G^-_{-1/2}\tilde G^-_{-1/2}u)(z_1)\,dz_1\cdots dz_h\,d\bar z_1\cdots d\bar z_h \tag{99}$$

(it is to be understood that in both (98), (99), the fields are inserted into holomorphic images of disks where the origin maps to the point of insertion with derivative of modulus 1 with respect to the measure of integration).

When the obstruction occurs at step k, the integral (98) has a divergence of logarithmic type. In the N = 0 case, there is a third possibility, namely that the obstruction vanishes, but the field $u^h$ in Theorem 2 has summands of weight < (1, 1) (< (1/2, 1/2) for N = 1). In this case, the integral (98) will have a divergence of power type, and the integral of terms of weight < (1, 1) (respectively, < (1/2, 1/2)) has to be taken in range from ∞ to 1 rather than from 0 to 1 to get a convergent integral. The formula (99) is not correct in that case.

Remark 2. In [19], a different correlation function is considered as a measure of marginality of u to higher perturbative order. The situation there is actually more general, allowing combinations of both chiral and antichiral primaries. In the present setting of chiral primaries only, the correlation function considered in [19] amounts to

$$\langle 1 \mid (G^-_{-1/2}u)(z_n)\cdots(G^-_{-1/2}u)(z_1)\rangle. \tag{100}$$
It is easy to see using the standard contour deformation argument that (100) indeed vanishes, which is also observed in [18]. In [19], this type of vanishing is taken as evidence that the N = 2 CFT deformations exist. It appears, however, that even though the vanishing of (100) follows from the vanishing of (95), the opposite implication does not hold. (In fact, we will see examples in Sec. 6 below.) The explanation seems to be that [19] writes down an integral expressing the change of central charge when deforming by a combination of cc fields and ac fields, and proves its vanishing. While this is correct formally, we see from Remark 1 above that in fact a singularity can occur in the integral when our obstruction is non-zero: the integral can marginally diverge for k points while it is convergent for < k points. It would be nice if the obstruction theory a la Gerstenhaber we described here settled in general the question of deformations of conformal field theory, at least in the vertex operator formulation. It is, however, not that simple. The trouble is that we are not in a purely algebraic situation. Rather, compositions of operators which are infinite series may not converge, and even if they do, the convergence cannot be understood in the sense of being eventually constant, but in the sense of analysis, i.e. convergence of sequences of real numbers. Specifically, in our situation, there is the possibility of divergence of the terms on the right-hand side of (45). Above we dealt with one problem, that in general, we do not expect infinitesimal deformations to converge on the degenerate worldsheets of vertex operators, so we may have to replace (45) by equations involving finite annuli instead. However, that is not the only problem. We may encounter
regularization along the flow parameter. This stems from the fact that Eqs. (43), (44) only determine $u(\epsilon)$ up to scalar multiple, where the scalar may be of the form

$$1 + \sum_{i\geq 1} K_i\epsilon^i = f(\epsilon). \tag{101}$$

But the point is (as we shall see in an example in the next section) that we may only be able to get a well defined value of

$$f^{-1}(\epsilon)u(\epsilon) = v(\epsilon) \tag{102}$$

when the constants $K_i$ are infinite. The obstruction then is

$$L_n(m)f(m)v(m) = 0 \in V[\epsilon]/\epsilon^{m+1} \quad \text{for } n > 0, \qquad (1 - L_0(m))f(m)v(m) = 0 \in V[\epsilon]/\epsilon^{m+1}. \tag{103}$$

At first, it may seem that it is difficult to make this rigorous mathematically with the infinite constants present. However, we may use the following trick. Suppose we want to solve

$$\begin{aligned}
c_1a_{11} + \cdots + c_na_{1n} &= b_1 \\
&\ \,\vdots \\
c_1a_{m1} + \cdots + c_na_{mn} &= b_m
\end{aligned} \tag{104}$$

in a, say, finite-dimensional vector space V. Then we may rewrite (104) as

$$(b_1, \ldots, b_m) = 0 \in \bigoplus_m V \Big/ \langle(a_{11}, \ldots, a_{m1}), \ldots, (a_{1n}, \ldots, a_{mn})\rangle. \tag{105}$$

This of course does not give anything new in the algebraic situation, i.e. when the $a_{ij}$’s are simply elements of the vector space V. When, however, the vectors $(a_{11}, \ldots, a_{m1}), \ldots, (a_{1n}, \ldots, a_{mn})$ are (possibly divergent) infinite sums

$$(a_{1j}, \ldots, a_{mj}) = \sum_k (a_{1jk}, \ldots, a_{mjk}),$$

then the right-hand side of (105) can be interpreted as

$$\bigoplus_m V \Big/ \langle(a_{11k}, \ldots, a_{m1k}), \ldots, (a_{1nk}, \ldots, a_{mnk})\rangle.$$
In that sense, (105) always makes sense, while (104) may not when interpreted directly. We interpret (103) in this way.

Let us now turn to the question of sufficient conditions for exponentiation of infinitesimal deformations. Suppose there exists a subspace $W \subset V$ closed under vertex operators which contains u and such that for all elements $v \in W$, we have that

$$\sum_i Y_i(u, z)\tilde Y_i(u, \bar z)v$$

involve only $z^n\bar z^m$ with $m, n \in \mathbb{Z}$, $m, n \neq -1$. Then, by Theorem 1,

$$1 - \epsilon\phi : W \to \hat W[\epsilon]/\epsilon^2$$

is an infinitesimal isomorphism between W and the infinitesimally u-deformed W. It follows, in the non-regularized case, that then

$$\exp(-\epsilon\phi)u \tag{106}$$

is a globally deformed primary field of weight (1, 1), and

$$\exp(-\epsilon\phi) : W \to \hat W[[\epsilon]] \tag{107}$$

is an isomorphism between W and the exponentiated deformation of W. However, since we now know the primary fields along the deformation, vacua can be recovered from Eq. (8) of the last section.

Such non-regularized exponentiation occurs in the case of the coset construction. Set

$$W = \Big\langle v \ \Big|\ \sum_i Y_i(u, z)\tilde Y_i(u, \bar z)v \text{ involve only } z^n\bar z^m \text{ with } m, n \geq 0,\ m, n \in \mathbb{Z}\Big\rangle.$$

Then W is called the coset of V by u. Then W is closed under vertex operators, and if $u \in W$, the formulas (106), (107) apply without regularization.

The case with regularization occurs when there exists some constant

$$K(\epsilon) = 1 + \sum_{n\geq 1} K_n\epsilon^n$$

where $K_n$ are possibly infinite constants such that

$$K(\epsilon)\exp(-\epsilon\phi)u \tag{108}$$

is finite in the sense described above (see (105)). We will see an example of this in the next section.

All these constructions are easily adapted to supersymmetry. The formulas (106), (107) hold without change, but the deformation is with respect to $G_{-1/2}\tilde G_{-1/2}u$, respectively $G^-_{-1/2}\tilde G^-_{-1/2}u$, $G^+_{-1/2}\tilde G^-_{-1/2}u$, depending on the situation applicable.

4. The Deformations of Free Field Theories

As our first application, let us consider the 1-dimensional bosonic free field conformal field theory, where the deformation field is

$$u = x_{-1}\tilde x_{-1}. \tag{109}$$

In this case, the infinitesimal isomorphism of Theorem 1 satisfies

$$\phi = \pi\sum_{n\in\mathbb{Z}} \frac{x_{-n}\tilde x_{-n}}{n} \tag{110}$$
and the sufficient condition of exponentiability from the last section is met when we take W the subspace consisting of states of momentum 0. Then W is closed
March 10, J070-S0129055X10003916
2010 10:13 WSPC/S0129-055X
148-RMP
Perturbative Deformations of Conformal Field Theories Revisited
149
under vertex operators, u ∈ W and the n = 0 term of (110) drops out in this case. However, this is an example where regularization is needed. It can be realized as follows: Write
$$\phi = \sum_{n>0} \phi_n$$
where
$$\phi_n = \pi\left(\frac{x_{-n}\tilde x_{-n}}{n} - \frac{x_n \tilde x_n}{n}\right).$$
We have
$$\exp\phi = \prod_{n>0}\exp\phi_n. \tag{111}$$
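To calculate $\exp\phi_n$ explicitly, we observe that
$$\left[\frac{x_{-n}\tilde x_{-n}}{n}, \frac{x_n\tilde x_n}{n}\right] = -\frac{x_{-n}x_n}{n} - \frac{\tilde x_{-n}\tilde x_n}{n} - 1,$$
and setting
$$e = \frac{x_{-n}\tilde x_{-n}}{n}, \quad f = \frac{x_n\tilde x_n}{n}, \quad h = -\frac{x_{-n}x_n}{n} - \frac{\tilde x_{-n}\tilde x_n}{n} - 1, \tag{112}$$
we obtain the $sl_2$ Lie algebra
$$[e, f] = h, \quad [e, h] = 2e, \quad [f, h] = -2f. \tag{113}$$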
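The first relation in (113) can be checked by hand. The following short computation is a sketch that assumes only the commutation relations $[x_n, x_{-n}] = [\tilde x_n, \tilde x_{-n}] = n$ and that tilded and untilded modes commute with each other.

```latex
\begin{align*}
x_n \tilde x_n\, x_{-n}\tilde x_{-n}
  &= (x_{-n}x_n + n)(\tilde x_{-n}\tilde x_n + n) \\
  &= x_{-n}\tilde x_{-n}\,x_n\tilde x_n + n\,x_{-n}x_n + n\,\tilde x_{-n}\tilde x_n + n^2, \\
[e, f]
  &= \frac{1}{n^2}\bigl(x_{-n}\tilde x_{-n}\,x_n\tilde x_n - x_n\tilde x_n\,x_{-n}\tilde x_{-n}\bigr) \\
  &= -\frac{x_{-n}x_n}{n} - \frac{\tilde x_{-n}\tilde x_n}{n} - 1 = h.
\end{align*}
```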
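Note that conventions regarding the normalization of e, f, h vary, but the relations (113) are satisfied for example for
$$e = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad f = \begin{pmatrix} 0 & 0 \\ -1 & 0 \end{pmatrix}, \quad h = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{114}$$
In $SL_2$, we compute
$$\exp(\pi(-f+e)) = \exp \pi\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} \cosh\pi & \sinh\pi \\ \sinh\pi & \cosh\pi \end{pmatrix} = \begin{pmatrix} 1 & \tanh\pi \\ 0 & 1 \end{pmatrix}\begin{pmatrix} \frac{1}{\cosh\pi} & 0 \\ 0 & \cosh\pi \end{pmatrix}\begin{pmatrix} 1 & 0 \\ \tanh\pi & 1 \end{pmatrix}. \tag{115}$$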
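The matrix statements (113)–(115) are easy to confirm numerically. The sketch below is an illustration only: the helper names are hypothetical, and the parameter `t` stands in for the exponent so that the same check covers both the value π used above and the negative exponent needed for the inverse.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

# The matrices of (114).
E = [[0, 1], [0, 0]]
F = [[0, 0], [-1, 0]]
H = [[-1, 0], [0, 1]]

def sl2_relations_hold():
    """Check [e,f] = h, [e,h] = 2e, [f,h] = -2f as in (113)."""
    return (comm(E, F) == H
            and comm(E, H) == scale(2, E)
            and comm(F, H) == scale(-2, F))

def exp_e_minus_f(t):
    """exp(t(e - f)) = exp(t [[0,1],[1,0]]) in closed form."""
    return [[math.cosh(t), math.sinh(t)], [math.sinh(t), math.cosh(t)]]

def gauss_factors(t):
    """Upper * diagonal * lower factorization as in (115)."""
    upper = [[1.0, math.tanh(t)], [0.0, 1.0]]
    diag = [[1.0 / math.cosh(t), 0.0], [0.0, math.cosh(t)]]
    lower = [[1.0, 0.0], [math.tanh(t), 1.0]]
    return matmul(matmul(upper, diag), lower)

def max_error(t):
    """Largest entrywise difference between the two sides of (115)."""
    lhs, rhs = exp_e_minus_f(t), gauss_factors(t)
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

Calling `max_error(math.pi)` compares the two sides of (115), and `max_error(-math.pi)` does the same for the factorization of exp(π(f − e)).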
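In the translation (112), this is
$$\exp\left(\tanh(\pi)\frac{x_{-n}\tilde x_{-n}}{n}\right)\exp\left((-\ln\cosh\pi)\left(\frac{x_{-n}x_n}{n} + \frac{\tilde x_{-n}\tilde x_n}{n} + 1\right)\right)\cdot\exp\left(-\tanh(\pi)\frac{x_n\tilde x_n}{n}\right). \tag{116}$$
To exponentiate the middle term, we claim
$$\exp\left(z\frac{x_{-n}x_n}{n}\right) = {:}\exp\left((e^z - 1)\frac{x_{-n}x_n}{n}\right){:} \tag{117}$$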
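Identity (117) can be sanity-checked on occupation-number states before reading the proof. The sketch below assumes the standard fact that $x_{-n}x_n/n$ acts as a number operator with eigenvalue k on the state with k quanta of mode n, and that the normal-ordered power ${:}(x_{-n}x_n/n)^m{:} = x_{-n}^m x_n^m/n^m$ has eigenvalue k(k−1)···(k−m+1) there; the function names are hypothetical.

```python
import math

def falling_factorial(k, m):
    """k (k-1) ... (k-m+1), the eigenvalue of the normal-ordered m-th power."""
    out = 1
    for j in range(m):
        out *= (k - j)
    return out

def normal_ordered_exp_eigenvalue(c, k):
    """Eigenvalue of :exp(c N): on the k-quanta state (the series stops at m = k)."""
    return sum(c ** m / math.factorial(m) * falling_factorial(k, m)
               for m in range(k + 1))

def both_sides_of_117(z, k):
    """Return (left, right) eigenvalues of (117) on the k-quanta state."""
    left = math.exp(z * k)
    right = normal_ordered_exp_eigenvalue(math.exp(z) - 1.0, k)
    return left, right
```

On each such state the right-hand side collapses to the binomial sum $(1 + (e^z - 1))^k = e^{zk}$, which is the left-hand side.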
To prove (117), differentiate both sides by z. On the left-hand side, we get
$$\frac{x_{-n}x_n}{n}\exp\left(z\frac{x_{-n}x_n}{n}\right).$$
Thus, if the derivative by z of the right-hand side y of (117) is
$$\frac{x_{-n}x_n}{n}\,{:}\exp\left((e^z - 1)\frac{x_{-n}x_n}{n}\right){:}, \tag{118}$$
then we have the differential equation $y' = \frac{x_{-n}x_n}{n}y$, which proves (117) (looking also at the initial condition at z = 0). Now we can calculate (118) by moving the $x_n$ occurring before the normal order symbol to the right. If we do this simply by changing (118) to normal order, we get
$${:}\frac{x_{-n}x_n}{n}\exp\left((e^z - 1)\frac{x_{-n}x_n}{n}\right){:}, \tag{119}$$
but if we want equality with (118), we must add the terms coming from the commutator relations $[x_n, x_{-n}] = n$, which gives the additional term
$$(e^z - 1)\,{:}\frac{x_{-n}x_n}{n}\exp\left((e^z - 1)\frac{x_{-n}x_n}{n}\right){:}. \tag{120}$$
Adding together (119) and (120) gives
$$e^z\,{:}\frac{x_{-n}x_n}{n}\exp\left((e^z - 1)\frac{x_{-n}x_n}{n}\right){:}, \tag{121}$$
which is the derivative by z of the right-hand side of (117), as claimed. Using (117), (116) becomes
$$\Phi_n = \frac{1}{\cosh\pi}\exp\left(\tanh(\pi)\frac{x_{-n}\tilde x_{-n}}{n}\right){:}\exp\left(\left(\frac{1}{\cosh\pi} - 1\right)\left(\frac{x_{-n}x_n}{n} + \frac{\tilde x_{-n}\tilde x_n}{n}\right)\right){:}\exp\left(-\tanh(\pi)\frac{x_n\tilde x_n}{n}\right) \tag{122}$$
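which is in normal order. Let us write
$$\Phi_n = \frac{1}{\cosh\pi}\Phi'_n.$$
Then the product
$$\Phi' = \prod_{n\geq 1}\Phi'_n \tag{123}$$
is in normal order, and is the regularized isomorphism from the exponentiated deformation W of the conformal field theory in vertex operator formulation on to the original W. The inverse, which goes from W to W, is best calculated by regularizing the exponential of −φ. We get
$$\exp(\pi(f - e)) = \exp\pi\begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} \cosh\pi & -\sinh\pi \\ -\sinh\pi & \cosh\pi \end{pmatrix} = \begin{pmatrix} 1 & -\tanh\pi \\ 0 & 1 \end{pmatrix}\begin{pmatrix} \frac{1}{\cosh\pi} & 0 \\ 0 & \cosh\pi \end{pmatrix}\begin{pmatrix} 1 & 0 \\ -\tanh\pi & 1 \end{pmatrix}$$
$$= \frac{1}{\cosh\pi}\exp\left(-\tanh(\pi)\frac{x_{-n}\tilde x_{-n}}{n}\right){:}\exp\left(\left(\frac{1}{\cosh\pi} - 1\right)\left(\frac{x_{-n}x_n}{n} + \frac{\tilde x_{-n}\tilde x_n}{n}\right)\right){:}\exp\left(\tanh(\pi)\frac{x_n\tilde x_n}{n}\right).$$
So expressing this as
$$\Psi_n = \frac{1}{\cosh\pi}\Psi'_n,$$
the product
$$\Psi' = \prod_{n\geq 1}\Psi'_n \tag{124}$$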
is the regularized isomorphism from W to W. Even though Ψ′ and Φ′ are only elements of $\hat W$, the element $u(\epsilon) = \Psi' u$ is the regularized chiral primary field in W, and can be used in a regularized version of Eq. (8) to calculate the vacua on V, which will converge on non-degenerate Segal worldsheets. In this approach, however, the resulting CFT structure on V remains opaque, while as it turns out, in the present case it can be identified by another method. In fact, to answer the question, we must treat precisely the case missing in Theorem 1, namely when the weight 0 part of the vertex operator of the deforming
field, which in this case is determined by the momentum, does not vanish. The answer is actually known in string theory to correspond to constant deformation of the metric on spacetime, which ends up isomorphic to the original free field theory. From the point of view of string theory, what we shall give is a “purely worldsheet argument” establishing this fact. Let us look first at the infinitesimal deformation of the operator $Y(v, t, \bar t)$ for some field v ∈ V which is an eigenstate of momentum. We have three forms which coincide where defined:
$$Y(x_{-1}\tilde x_{-1}, z, \bar z)\,Y(v, t, \bar t)\,dz\,d\bar z \tag{125}$$
$$Y(v, t, \bar t)\,Y(x_{-1}\tilde x_{-1}, z, \bar z)\,dz\,d\bar z \tag{126}$$
$$Y(Y(x_{-1}\tilde x_{-1}, z - t, \overline{z - t})v, t, \bar t)\,dz\,d\bar z. \tag{127}$$
By chiral splitting, if we assume v is a monomial in the modes, we can denote (125)–(127) by $\eta\tilde\eta$ (without forming a sum of terms). Again, integrating (125)–(127) term by term dz, we get forms $\omega_\infty$, $\omega_0$, $\omega_t$, respectively. Here we set
$$\int \frac{1}{z}\,dz = \ln z.$$
Again, these are branched forms. Selecting points $p_0$, $p_\infty$, $p_t$ on the corresponding boundary components, we can, say, make cuts $c_{0,t}$ and $c_{0,\infty}$ connecting the points $p_0$, $p_t$ and $p_0$, $p_\infty$. Cutting the worldsheet in this way, we obtain well defined branches $\omega_\infty$, $\omega_0$, $\omega_t$. To complicate things further, we have constant discrepancies
$$C_{0t} = \omega_0 - \omega_t, \qquad C_{0\infty} = \omega_0 - \omega_\infty. \tag{128}$$
These can be calculated for example by comparing with the 4 point function
$$Y_+(x_{-1}, z)Y(v, t) + Y(v, t)Y_-(x_{-1}, z) + Y(Y_-(x_{-1}, z - t)v, t) \tag{129}$$
where $Y_-(v, z)$ denotes the sum of the terms in Y(v, z) involving negative powers of z, and $Y_+(v, z)$ is the sum of the other terms. Another way to approach this is as follows: one notices that
$$Y(x_{-1}, z)\,dz = \partial_\epsilon\,Y(1_\epsilon, z)S_{-\epsilon}\big|_{\epsilon=0} \tag{130}$$
where $S_m$ denotes the operator which adds m to momentum. It follows that
$$C_{0t} = \partial_\epsilon\bigl(Z(x_{-1}, v, z, t)S_{-\epsilon} - Z(x_{-1}, S_\epsilon v, z, t)\bigr)\big|_{\epsilon=0}, \qquad C_{0\infty} = \partial_\epsilon\bigl(Z(x_{-1}, v, z, t)S_{-\epsilon} - S_\epsilon Z(x_{-1}, v, z, t)\bigr)\big|_{\epsilon=0}. \tag{131}$$
Now the deformation is obtained by integrating the forms
$$\omega_0\tilde\eta, \tag{132}$$
$$(\omega_t + C_{0t})\tilde\eta, \tag{133}$$
$$(\omega_\infty + C_{0\infty})\tilde\eta, \tag{134}$$
on the boundary components around 0, t and ∞, and along both sides of the cuts $c_{0,t}$, $c_{0,\infty}$. To get the integrals of the terms in (132)–(134) which do not involve the discrepancy constants, we need to integrate
$$\left(\sum_{n\neq 0}\frac{x_{-n}}{n}z^n + x_0\ln z\right)\left(\sum_m \tilde x_{-m}\bar z^{m-1}\right). \tag{135}$$
To do this, observe that (pretending we work on the degenerate worldsheet, and hence omitting scaling factors, taking curved integrals over |z| = 1),
$$\oint\frac{\ln z}{\bar z}\,d\bar z = -\oint\frac{\ln\bar z}{\bar z}\,d\bar z = -2\pi i\ln\bar z - \frac{1}{2}(2\pi i)^2 \tag{136}$$
$$\oint\ln z\cdot\bar z^{m-1}\,d\bar z = -2\pi i\,\frac{1}{m}\bar z^m. \tag{137}$$
(Actually, the first term on the right-hand side of (136) depends on the branch of the logarithm taken and hence cannot figure in the final result; the reader may note that this is indeed the case.) Integrating (135), we obtain terms
$$-2\pi i\,x_0\left(\sum_{m\neq 0}\tilde x_{-m}\frac{\bar z^m}{m} + \tilde x_0\ln\bar z\right) \tag{138}$$
which will cancel with the integral along the cuts (to calculate the integral over the cuts, pair points on both sides of the cut which project to the same point in the original worldsheet), and “local” terms
$$2\pi i\sum_{n\neq 0}\frac{x_{-n}\tilde x_{-n}}{n} - \frac{1}{2}(2\pi i)^2 x_0\tilde x_0. \tag{139}$$
The discrepancies play no role on the cuts (as the forms $C_{0t}\tilde\eta$, $C_{0\infty}\tilde\eta$ are unbranched), but using the formula (131), we can compensate for the discrepancies to linear order in ε by applying on each boundary component
$$S_{-2\pi i\tilde x_0}. \tag{140}$$
In (138), however, when integrating $\tilde\eta$, we obtain also discrepancy terms conjugate to (140), so the correct expression is
$$S_{-2\pi i\tilde x_0}\tilde S_{-2\pi i x_0}. \tag{141}$$
The term (141) is also “local” on the boundary components, so the sum of (139) and (141) is the formula for the infinitesimal isomorphism between the free CFT and the infinitesimally deformed theory. To exponentiate, suppose now we are working in a D-dimensional free CFT, and the deformation field is
$$M x_{-1}\tilde x_{-1}. \tag{142}$$
Then the formula for the exponentiated isomorphism multiplies left momentum by
$$\exp M \tag{143}$$
and right momentum by
$$\exp M^T. \tag{144}$$
But of course, in the free theory, the left momentum must equal the right momentum, so this formula works only when M is a symmetric matrix. Thus, to cover the general case, we must discuss the case when M is antisymmetric. In this case, it may seem that we obtain indeed a different CFT which is defined in the same way as the free CFT with the exception that the left momentum $m_L$ and right momentum $m_R$ are related by the formula $m_L = Am_R$ for some fixed orthogonal matrix A. As it turns out, however, this theory is still isomorphic to the free CFT. The isomorphism replaces the left moving oscillators $x_{i,-n}$ by their transform via the matrix A (which acts on this Heisenberg representation by transport of structure). Next, let us discuss the case of a deforming gravitational field of non-zero momentum, i.e. when
$$u = M x_{-1}\tilde x_{-1}1_\lambda \tag{145}$$
with $\lambda \neq 0$. Of course, in order for (145) to be of weight (1, 1), we must have
$$\|\lambda\| = 0. \tag{146}$$
Clearly, then, the metric cannot be Euclidean, hence there will be ghosts and a part of our theory does not apply. Note that in order for (145) to be primary, we also must have
$$M = \sum_i \mu_i\otimes\tilde\mu_i \tag{147}$$
where
$$\langle\mu_i, \lambda\rangle = \langle\tilde\mu_i, \lambda\rangle = 0. \tag{148}$$
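Despite the indefinite signature, we still have the primary obstruction, which is
$$\operatorname{coeff}_{z^{-1}\bar z^{-1}}\,{:}\left(\sum_{m,n} M x_{-m}\tilde x_{-n}z^{m-1}\bar z^{n-1}\right)\exp\left(\lambda\sum_{k\neq 0}\left(\frac{x_{-k}}{k}z^k + \frac{\tilde x_{-k}}{k}\bar z^k\right)\right){:}\;M x_{-1}\tilde x_{-1}1_\lambda \tag{149}$$
(we omit the $z^{\langle\lambda, x_0\rangle}$ term, since the power is 0 by (146)). In the notation (147), this is
$$\sum_{i,j}(\mu_i x_0 - \mu_i x_0\,\lambda x_{-1}\,\lambda x_1 + \mu_i x_{-1}\,\lambda x_1 + \lambda x_{-1}\,\mu_i x_1)\otimes(\tilde\mu_j\tilde x_0 - \tilde\mu_j\tilde x_0\,\lambda\tilde x_{-1}\,\lambda\tilde x_1 + \tilde\mu_j\tilde x_{-1}\,\lambda\tilde x_1 + \lambda\tilde x_{-1}\,\tilde\mu_j\tilde x_1)\,M x_{-1}\tilde x_{-1}1_\lambda$$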
which in the presence of (148) reduces to the condition
$$\|M\|^2\,\lambda\otimes\lambda\,x_{-1}\tilde x_{-1} = 0. \tag{150}$$
This is false unless
$$\|M\|^2 = 0 \tag{151}$$
which means that (145) is a null state, along which the deformation is not interesting in the sense of string theory. More generally, the distributional form of (150) is
$$\int_{\|\lambda\|^2 = 0}\lambda\otimes\lambda\,\|M(\lambda)\|^2 = 0. \tag{152}$$
If we set $f(\lambda) = \delta_{\|\lambda\|^2 = 0}\|M(\lambda)\|^2$ then the Fourier transform of f will be a function g satisfying
$$\sum_i \pm\frac{\partial^2 g}{\partial\lambda_i^2} = 0$$
where the signs correspond to the metric, which we assume is diagonal with entries ±1. The Fourier transform of the condition (152) is then
$$\frac{\partial^2}{\partial\lambda_i\partial\lambda_j}g = 0. \tag{153}$$
Assuming a decay condition under which the Fourier transform makes sense, (153) implies g = 0, hence (151), so in this case also the obstruction is nonzero unless (145) is a null state. In this discussion, we restricted our attention to deforming fields of gravitational origin. It is important to note that other choices are possible. As a very basic example, let us look at the 1-dimensional Euclidean model. Then there is a possibility of critical fields of the form
$$a1_{\sqrt 2} + b1_{-\sqrt 2}. \tag{154}$$
This includes the sine-Gordon interaction [69] when a = b. (We see hyperbolic rather than trigonometric functions because we are working in Euclidean spacetime rather than in the time coordinate, which is the case usually discussed.) The primary obstruction in this case states that the weight (0, 0) descendant of (154) applied to (154) is 0. Since the descendant is $(4ab)x_{-1}$, we obtain the condition a = 0 or b = 0. It is interesting to note that in the case of the compactification on a circle, these cases were investigated very successfully by Ginsparg [30], who used the obstruction to completely characterize the component of the moduli space of c = 1 CFT’s originating from the free Euclidean compactified theory. The result is that only free theories compactified at different radii, and their Z/2-orbifolds, occur.
There are many other possible choices of non-gravitational deformation fields, one for each field in the physical spectrum of the theory. We do not discuss these cases in the present paper. Let us now look at the N = 1-supersymmetric free field theory. In this case, as pointed out above, in the NS-NS sector, critical gravitational fields for deformations have weight (1/2, 1/2). We could also consider the NS-R and R-R sectors, where the critical weights are (1/2, 0) and (0, 0), respectively. These deforming fields parametrize soul directions in the space of infinitesimal deformations. The soul parameters $\theta$, $\tilde\theta$ have weights (1/2, 0), (0, 1/2), which explains the difference of critical weights in these sectors. Let us, however, focus on the body of the space of gravitational deformations, i.e. the NS-NS sector. Let us first look at the weight (1/2, 1/2) primary field
$$M\psi_{-1/2}\tilde\psi_{-1/2}. \tag{155}$$
The point is that the infinitesimal deformation is obtained by integrating the insertion operators of
$$G_{-1/2}\tilde G_{-1/2}M\psi_{-1/2}\tilde\psi_{-1/2} = M x_{-1}\tilde x_{-1}.$$
Therefore, (155) behaves exactly the same as a deformation along the field (142) in the bosonic case. Again, if M is a symmetric matrix, exponentiating the deformation leads to a theory isomorphic via scaling the momenta, while if M is antisymmetric, the isomorphism involves transforming the left moving modes by the orthogonal matrix exp(M). In the case of momentum $\lambda \neq 0$, we again have indefinite signature, and the field
$$u = M\psi_{-1/2}\tilde\psi_{-1/2}1_\lambda. \tag{156}$$
Once again, for (156) to be primary, we must have (147), (148). Moreover, again the actual infinitesimal deformation is obtained by applying the insertion operators of $G_{-1/2}\tilde G_{-1/2}u$, so the treatment is exactly the same as for the deformation along the field (145) in the bosonic case. Again, we discover that under a suitable decay condition, the obstruction is always nonzero for gravitational deformations of nonzero momentum. It is worth noting that in both the bosonic and supersymmetric cases, one can apply the same analysis to free field theories compactified on a torus. In this case, however, scaling momenta changes the geometry of the torus, so using deformation fields of 0 momentum, we find exponential deformations which change (constantly) the metric on the torus. This seems to confirm, in the restricted sense investigated here, a conjecture stated in [59]. Remark. Since one can consider Calabi–Yau manifolds which are tori, one sees that there should also exist an N = 2-supersymmetric version of the free field theory compactified on a torus. (It is in fact not difficult to construct such a model directly,
it is a standard construction.) Now since we are in the Calabi–Yau case, marginal cc fields should correspond to deformations of complex structure, and marginal ac fields should correspond to deformations of Kähler metric in this case. But on the other hand, we already identified gravitational fields which should be the sources of such deformations. Additionally, deformations in those directions require regularization of the deformation parameter, and hence cannot satisfy the conclusion of Theorem 3. This is explained by observing that we must be careful with reality. The gravitational fields we considered are in fact real, but neither chiral nor antichiral primary in either the left or the right moving sector. By contrast, chiral primary (or antiprimary) fields are not real. This is due to the fact that $G^+_{-3/2}$ and $G^-_{-3/2}$ are not real in the N = 2 superconformal algebra, but are in fact complex conjugate to each other. Therefore, to get to the real gravitational fields, we must take real parts, or in other words linear combinations of chiral and antichiral primaries, resulting in the need for regularization. It is in fact a fun exercise to calculate explicitly how our higher N = 2 obstruction theory operates in this case. Let us consider the N = 2-supersymmetric free field theory, since the compactification behaves analogously. The minimum number of dimensions for N = 2 supersymmetry is 2. Let us denote the bosonic fields by x, y and their fermionic superpartners by ξ, ψ. Then the 0-momentum summand of the state space (NS sector) is (a Hilbert completion of)
$$\mathrm{Sym}(x_n, y_n \mid n < 0)\otimes\Lambda\left(\xi_r, \psi_r \mid r < 0,\ r \in \mathbb{Z} + \tfrac{1}{2}\right).$$
The “body” parts of the bosonic and fermionic vertex operators are given by the usual formulas
$$Y(x_{-1}, z) = \sum x_{-n}z^{n-1}, \qquad Y(y_{-1}, z) = \sum y_{-n}z^{n-1},$$
$$Y(\xi_{-1/2}, z) = \sum \xi_{-s}z^{n-s-1/2}, \qquad Y(\psi_{-1/2}, z) = \sum \psi_{-s}z^{n-s-1/2},$$
$$[\xi_r, \xi_{-r}] = [\psi_r, \psi_{-r}] = 1, \qquad [x_n, x_{-n}] = [y_n, y_{-n}] = n.$$
We have, say,
$$G^1_{-3/2} = \xi_{-1/2}x_{-1} + \psi_{-1/2}y_{-1}, \qquad G^2_{-3/2} = \xi_{-1/2}y_{-1} - \psi_{-1/2}x_{-1}.$$
As usual,
$$G^\pm_{-3/2} = \frac{1}{\sqrt 2}(G^1_{-3/2} \pm iG^2_{-3/2}).$$
With these conventions, we have a critical chiral primary
$$u = \xi_{-1/2} - i\psi_{-1/2} \tag{157}$$
(and its complex conjugate critical antichiral primary). We then see that for a non-zero coefficient C,
$$C\,G^-_{-1/2}u = x_{-1} - iy_{-1}. \tag{158}$$
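The combinatorial identity (159), which the following paragraph uses to recover formula (81), can be verified with exact rational arithmetic. This is only an illustrative check under the assumption that the $n_i$ are positive integers (so that no partial sum vanishes); the function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def product_side(ns):
    """1 / (n_1 n_2 ... n_k)."""
    out = Fraction(1)
    for n in ns:
        out /= n
    return out

def permutation_side(ns):
    """Sum over all orderings of 1 / (product of the partial sums of that ordering)."""
    total = Fraction(0)
    for perm in permutations(ns):
        term = Fraction(1)
        partial = 0
        for n in perm:
            partial += n          # n_sigma(1) + ... + n_sigma(j)
            term /= partial
        total += term
    return total
```

Repeated values are allowed, since `itertools.permutations` treats positions as distinct, which matches the sum over all permutations of the index set {1, ..., k}.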
We now notice that formulas analogous to (112) etc. apply to (158), but the −1 summands of h will appear with opposite signs for the real and imaginary summands, so they will cancel out, so the regularizations (123), (124) are not needed, as expected. Next, let us study the formula (81). The key observation here is that we have the combinatorial identity
$$\frac{1}{n_1\cdots n_k} = \sum_\sigma\frac{1}{(n_{\sigma(1)} + \cdots + n_{\sigma(k)})(n_{\sigma(1)} + \cdots + n_{\sigma(k-1)})\cdots n_{\sigma(1)}} \tag{159}$$
where the sum on the right is over all permutations on the set {1, . . . , k}. Now in the present case, we have the infinitesimal isomorphism on the 0-momentum part, up to non-zero coefficient,
$$\phi = \sum\frac{(x_n - iy_n)(\tilde x_n + i\tilde y_n)}{n} \tag{160}$$
and in the absence of regularization, the expansion of the exponentiated isomorphism on the 0-momentum parts is simply exp(φ). (The + sign in the ˜’s is caused by the fact that we are in the complex conjugate Hilbert space.) Applying this to (157), we see that we have formulas analogous to (116)–(122), and applying the exponentiated isomorphism to (157), all the terms in normal order involving $x_{>0}$, $y_{>0}$ will vanish, so we end up with
$$\exp\left(D\sum_{n<0}\frac{(x_n - iy_n)(\tilde x_n + i\tilde y_n)}{n}\right)u$$
for some non-zero coefficient D. Applying (159), we get (81). Finally, the obstruction in chiral form
$$\langle u^*(0), (G^-_{-1/2}u)(z_k), \ldots, (G^-_{-1/2}u)(z_1), u(0)\rangle$$
must vanish identically. To see this, we simply observe from (157), (158) that in the present case, u is in the coset model with respect to $G^-_{-1/2}u$ (see the discussion below formula (250)). Thus, in the N = 2-free field theory, the obstruction theory works as expected, and in the case discussed, the obstructions vanish. It is worth noting that in 2n-dimensional N = 2-free field theory, we thus have an $n^2$-dimensional space of cc + aa real fields, and an $n^2$-dimensional space of real ca + ac fields, and although regularization occurs, there is no obstruction to exponentiating the deformation by turning on any linear combination of those fields. For a free N =
2-theory compactified on an n-dimensional abelian variety, this precisely recovers the deformations in the corresponding component of the moduli space of Calabi–Yau varieties. However, other deformations exist. For an interesting calculation of deformations of the N = 2-free field theory in “sine-Gordon” directions, see [13].

5. The Gepner Model of the Fermat Quintic

The finite weight states of one chirality (say, left moving) of the Gepner model of the Fermat quintic are embedded in the 5-fold tensor product of the N = 2-supersymmetric minimal model of central charge 9/5 [24, 25, 32]. More precisely, the Gepner model is an orbifold construction. This construction has two versions. In [24, 25, 32], one is interested in actual string theories, so the 5-fold tensor product of central charge 9 of N = 2 minimal models is tensored with a free supersymmetric CFT on 4 Minkowski coordinates. This is then viewed in lightcone gauge, so in effect, one tensors with a 2-dimensional supersymmetric Euclidean free CFT, resulting in N = 2-supersymmetric CFT of central charge 12. Finally, one performs an orbifolding/GSO projection to give a candidate for a theory for which both modularity and spacetime SUSY can be verified. (Actually, this is still not quite precise, as in Gepner’s original work, the true point of interest is the construction of heterotic string theories; from our point of view, however, the difference does not matter.) What we care about is that it is also possible to create an orbifold theory of central charge 9 which is the candidate of the nonlinear σ-model itself, without the spacetime coordinates. (The spacetime coordinates can be added to this construction and usual GSO projection performed if one is interested in the corresponding string theory.) The essence of this construction not involving the spacetime coordinates is formula (2.10) of [33].
In the case of the level 3 N = 2-minimal model (more precisely, the unitary N = 2 Virasoro minimal model of A-type), the orbifold construction is with respect to the diagonal Z/5-action which acts on the eigenstates of $J_0$ eigenvalue (= “U(1)-charge”) j/5 by $e^{2\pi ij/5}$. As we shall review, the NS part of the level 3 N = 2 minimal model has two sectors of U(1)-charge j/5, which we will for the moment ad hoc denote $H_{j/5}$ and $H'_{j/5}$ for j ∈ Z/5Z. In the FF realization (see below), these sectors correspond to ℓ = 0, ℓ = 1, respectively. Then the NS-NS sector of the 5-fold tensor product of minimal model has the form
$$\bigoplus_{(i_k)}\ \widehat{\bigotimes_{k=1}^{5}}\ \bigl(H_{i_k/5}\hat\otimes H^*_{i_k/5}\oplus H'_{i_k/5}\hat\otimes H'^*_{i_k/5}\bigr). \tag{161}$$
The corresponding sector of the orbifold construction (formula (2.10) of [33]) has the form
$$\bigoplus_{(i_k):\ \sum i_k\in 5\mathbb{Z}}\ \bigoplus_{j\in\mathbb{Z}/5\mathbb{Z}}\ \widehat{\bigotimes_{k=1}^{5}}\ \bigl(H_{i_k/5}\hat\otimes H^*_{(j+i_k)/5}\oplus H'_{i_k/5}\hat\otimes H'^*_{(j+i_k)/5}\bigr). \tag{162}$$
Mathematically speaking, this orbifold can be constructed by noting that, ignoring for the moment supersymmetry, the N = 2-minimal model is a tensor product of the parafermion theory of the same level and a lattice theory (see [31] and also below). The orbifold construction does not affect the parafermionic factor, and on the lattice coordinate, which in this case does not possess a non-zero Z/2-valued form, and hence physically models a free theory compactified on a torus, the orbifold simply means replacing the torus by its factor by the free action of the diagonal Z/5 translation group, which is represented by another lattice theory. On this construction, N = 2 supersymmetry is then easily restored using the same formulas as in (161), since the U(1)-charge of the G’s is integral. The calculations in this and the next section proceed entirely in the orbifold (162), and hence can be derived from the structure of the level 3 N = 2-minimal model. It should be pointed out that a mathematical approach to the fusion rules of the N = 2 minimal models was given in [42]. We shall use the Coulomb gas realization of the N = 2-minimal model, cf. [34, 53]. Let us restrict attention to the NS sector. Then, essentially, the left moving sector of the minimal model is a subquotient of the lattice theory where the lattice is 3-dimensional, and spanned by
2 1 i 2 i 2 2 3 √ , 0, 0 , √ , , , √ , 0, i . 3 15 15 2 5 2 3 15 We will adopt the convention that we shall abbreviate (k, , m)MM = (k, , m) for the lattice label
k i √ , 15 2
2 mi , 5 2
2 . 3
We shall also write (, m)MM = (m, , m)MM . Call the oscillator corresponding to the jth coordinate xj,m , j = 0, 1, 2. Then the conformal vector is 1 2 i 2 1 1 2 x x1,−2 + x22,−1 . − x + (163) 2 0,−1 2 1,−1 2 5 2 The superconformal algebra is generated by 1 5 + G−3/2 = i x2,−1 − x1,−1 1( √5 ,0,i√ 2 ) , 3 2 3 15 1 5 x2,−1 + x1,−1 1(− √5 ,0,−i√ 2 ) . G− −3/2 = −i 3 2 3 15
(164)
For future reference, we will sometimes use the notation (a, b, c)xn = ax0,n + bx1,n + cx2,n and also sometimes abbreviate (b, c)xn = (0, b, c)xn . The module labels are realized by labels (, m) = 1( √m
15
0 ≤ ≤ 3,
,− i 2
√2
im 3, 2
√2 , ) 3
m = −, − + 2, . . . , − 2, .
(165) (166)
It is obvious that to stay within the range (166), we must understand the fusion rules and how they are applied. The basic principle is that labels are indentified as follows: No identifications are imposed on the 0th lattice coordinate. This means that upon any identification, the 0th coordinate must be the same for the labels identified. Therefore, the identification is governed by the 1st and 2nd coordinates, which give the Coulomb gas (= Feigin–Fuchs) realization of the corresponding parafermionic theory (the Z/3 parafermion model). The keypoint here are the parafermionic currents 1 5 x2,−1 − x1,−1 1(0,i√ 2 ) , ψ1,−2/3 = i 3 PF 2 3 (167) 1 5 + = −i ψ1,−2/3 x2,−1 + x1,−1 1(0,−i√ 2 ) 3 PF 2 3 (the 0th coordinate is omitted). Clearly, the parafermionic currents act on the labels by ψ1,−2/3 : (, m)P F → (, m + 2)P F , + : (, m)P F → (, m − 2)P F . ψ1,−2/3
(168)
The lattice labels $(\ell, m)_{PF}$ allowed are those which have non-negative weight. This condition coincides with (166). Now we impose the identification for parafermionic labels: $(\ell, m)_{PF} = (3 - \ell, m - 3)_{PF}$. This implies
$$(1, -1)_{PF}\sim(2, 2)_{PF}, \quad (1, 1)_{PF}\sim(2, -2)_{PF}, \quad (0, 0)_{PF}\sim(3, -3)_{PF}\sim(3, 3)_{PF}. \tag{169}$$
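The identifications (169) can be reproduced by enumerating the labels in the range (166) and merging them under (ℓ, m) ∼ (3 − ℓ, m − 3). The sketch below is illustrative only. The weight formula ℓ(ℓ + 2)/20 − m²/12 used in it is the standard level 3 parafermion weight and is an assumption here, since it is not written out in the text; it is included to confirm that identified labels share a weight and that every label in the range (166) has non-negative weight.

```python
from fractions import Fraction

def pf_labels():
    """Labels (l, m) with 0 <= l <= 3 and m = -l, -l+2, ..., l, as in (166)."""
    return [(l, m) for l in range(4) for m in range(-l, l + 1, 2)]

def pf_weight(l, m):
    """Assumed standard level-3 parafermion weight l(l+2)/20 - m^2/12."""
    return Fraction(l * (l + 2), 20) - Fraction(m * m, 12)

def identify(label):
    """The identification (l, m) ~ (3 - l, m - 3)."""
    l, m = label
    return (3 - l, m - 3)

def pf_classes():
    """Equivalence classes of allowed labels under the identification."""
    labels = pf_labels()
    parent = {x: x for x in labels}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for x in labels:
        y = identify(x)
        if y in parent:               # only merge with labels inside the range (166)
            parent[find(x)] = find(y)

    groups = {}
    for x in labels:
        groups.setdefault(find(x), set()).add(x)
    return sorted(sorted(g) for g in groups.values())
```

The ten labels of (166) fall into six classes, three of which are exactly the identifications listed in (169).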
Now in the Gepner model corresponding to the quintic, the cc fields allowed are ((3, 2, 0, 0, 0)L, (3, 2, 0, 0, 0)R),
(170)
((3, 1, 1, 0, 0)L, (3, 1, 1, 0, 0)R),
(171)
((2, 2, 1, 0, 0)L, (2, 2, 1, 0, 0)R),
(172)
((2, 1, 1, 1, 0)L, (2, 1, 1, 1, 0)R),
(173)
((1, 1, 1, 1, 1)L, (1, 1, 1, 1, 1)R),
(174)
and the ac field allowed is ((−1, −1, −1, −1, −1)L, (1, 1, 1, 1, 1)R).
(175)
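The number of cc fields obtained from the tuples (170)–(174) once all orderings of the five tensor factors are counted can be checked by brute force. This is a small illustrative sketch; the names are hypothetical.

```python
from itertools import permutations

# Left-moving tuples of the cc fields (170)-(174); the right-moving tuple is the same.
CC_TUPLES = [
    (3, 2, 0, 0, 0),
    (3, 1, 1, 0, 0),
    (2, 2, 1, 0, 0),
    (2, 1, 1, 1, 0),
    (1, 1, 1, 1, 1),
]

def distinct_orderings(t):
    """Number of distinct rearrangements of the tuple t."""
    return len(set(permutations(t)))

def total_cc_fields():
    """Total number of cc fields, counting every distinct ordering once."""
    return sum(distinct_orderings(t) for t in CC_TUPLES)
```

The individual counts are 20, 30, 30, 20 and 1, for a total of 101.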
Here we wrote ℓ for $1_{(\ell, \ell)_{MM}}$ (ℓ = 0, . . . , 3), which is a chiral primary in the N = 2 minimal model of weight ℓ/10, and −ℓ for $1_{(\ell, -\ell)_{MM}}$, which is antichiral primary of weight ℓ/10. The tuple notation in (170)–(175) really means tensor product. We omit permutations of the fields (170)–(173), so counting all permutations, there are 101 fields (170)–(174). We will need an understanding of the fusion rules in the Z/3 parafermion model and N = 2-supersymmetric minimal model of central charge 9/5. In the Z/3 parafermion model, we have 6 labels
$$(0, 0)_{PF},\ (3, 1)_{PF},\ (3, -1)_{PF}, \tag{176}$$
$$(1, 1)_{PF},\ (1, -1)_{PF},\ (2, 0)_{PF}. \tag{177}$$
This can be described as follows: the labels (176) have the same fusion rules as the √ lattice L = i 6 ⊂ C, i.e. L /L
(178)
where L is the dual lattice (into which L is embedded using the standard quadratic
form on C). This dual lattice is 2i 23 , and the fusion rule is “abelian”, which means that the product of labels has only onepossible label as outcome, and is described
by the product in L /L. The label ± 2i 23 corresponds to (0, ±2)P F ∼ (3, ∓1)P F . Next, the product of (2, 0)P F with (3, ∓1)P F has only one possible outcome, (2, ±2)P F = (1, ∓1)P F . The product of (2, 0)P F with itself has two possible outcomes, (2, 0)P F and (0, 0)P F . All other products are determined by commutativity, associativity and unitality of fusion rules. The result can be summarized as follows: We call (176) level 0, 3 labels and (177) level 1, 2 labels. Every level 1, 2 label has a corresponding label of level 0, 3.
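The fusion rules just described determine the full multiplication table of the six labels. The sketch below encodes each label as a pair (a, b), with a in Z/3 recording the level 0, 3 part and b in {0, 1} recording whether a factor of $(2, 0)_{PF}$ is present. This encoding, the choice of $(3, 1)_{PF}$ as the generator of Z/3, and the function names are assumptions made for illustration; the only inputs taken from the text are that level 0, 3 labels multiply as an abelian group of order 3 and that $(2, 0)_{PF}$ times itself gives $(2, 0)_{PF}$ and $(0, 0)_{PF}$.

```python
from collections import Counter

# (a, b): a = power of (3,1)_PF in Z/3, b = 1 if a factor (2,0)_PF is present.
NAMES = {
    (0, 0): "(0,0)", (1, 0): "(3,1)", (2, 0): "(3,-1)",
    (0, 1): "(2,0)", (1, 1): "(1,1)", (2, 1): "(1,-1)",
}

def fuse(x, y):
    """Fusion product of two labels, as a multiset of labels."""
    a = (x[0] + y[0]) % 3
    if x[1] + y[1] < 2:
        return Counter({(a, x[1] + y[1]): 1})
    # Two level 1, 2 factors: (2,0) * (2,0) = (0,0) + (2,0).
    return Counter({(a, 0): 1, (a, 1): 1})

def fuse_multiset(s, y):
    """Extend fuse linearly to a multiset s on the left."""
    out = Counter()
    for x, mult in s.items():
        for z, n in fuse(x, y).items():
            out[z] += mult * n
    return out

def is_associative():
    """Check (x*y)*z == x*(y*z) on all triples of labels."""
    basis = list(NAMES)
    for x in basis:
        for y in basis:
            for z in basis:
                left = fuse_multiset(fuse(x, y), z)
                right = Counter()
                for w, n in fuse(y, z).items():
                    for v, k in fuse(x, w).items():
                        right[v] += n * k
                if left != right:
                    return False
    return True
```

For instance, `fuse((0, 1), (1, 0))` returns the single label encoded as (1, 1), that is $(2, 0)_{PF}$ times $(3, 1)_{PF}$ gives $(1, 1)_{PF}$.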
The correspondence is
$$(0, 0)_{PF}\leftrightarrow(2, 0)_{PF}, \quad (3, 1)_{PF}\leftrightarrow(1, 1)_{PF}, \quad (3, -1)_{PF}\leftrightarrow(1, -1)_{PF}. \tag{179}$$
As described above, the fusion rules on level 0, 3 are determined by the lattice theory of L. Additionally, multiplication preserves the correspondence (179), while the level of the product is restricted only by requiring that any level added to level 0, 3 is the original level. To put it in another way still, the Verlinde algebra is
$$\mathbb{Z}[\zeta]/(\zeta^3 - 1)\otimes\mathbb{Z}[\epsilon]/(\epsilon^2 - \epsilon - 1) \tag{180}$$
where $\zeta = (3, 1)_{PF}$ and $\epsilon = (2, 0)_{PF}$. In the N = 2 supersymmetric minimal model (MM) case, we allow labels
$$(3k + m, \ell, m)_{MM} \tag{181}$$
where (ℓ, m) is a PF label, k ∈ Z. Two labels (181) are identified subject to identifications of PF labels, and also
$$(j, \ell, m)_{MM}\sim(j + 15, \ell, m)_{MM}, \tag{182}$$
and, as a result of SUSY,
$$(j, \ell, m)_{MM}\sim(j - 5, \ell, m - 2)_{MM}. \tag{183}$$
(By ∼ we mean that the labels (i.e. VA modules) are identified, but we do not imply that the states involved actually coincide; in the case (183), they have different weights.) Recalling again that we abbreviate $(m, \ell, m)_{MM}$ as $(\ell, m)_{MM}$, we get the following labels for the c = 9/5 N = 2 SUSY MM:
$$\begin{aligned}(0, 0)_{MM}&\leftrightarrow(2, 0)_{MM}\\(3, 3)_{MM}&\leftrightarrow(2, -2)_{MM}\\(3, 1)_{MM}&\leftrightarrow(1, 1)_{MM}\\(3, -1)_{MM}&\leftrightarrow(1, -1)_{MM}\\(3, -3)_{MM}&\leftrightarrow(2, 2)_{MM}.\end{aligned} \tag{184}$$
Again, the left column of (184) represents 0, 3 level labels, the right column represents level 1, 2 labels. The left column labels multiply as the labels of the lattice super-CFT corresponding to the lattice Λ in C ⊕ C spanned by
$$(\sqrt{15}, 0), \quad \left(\frac{5}{\sqrt{15}}, i\sqrt{\frac{2}{3}}\right) \tag{185}$$
(recall that a super-CFT can be assigned to a lattice with integral quadratic form; the quadratic form on C ⊕ C is the standard one, the complexification of the Euclidean inner product). The dual lattice of (185) is spanned by
$$\left(\frac{3}{\sqrt{15}}, 0\right), \quad \left(\frac{5}{\sqrt{15}}, i\sqrt{\frac{2}{3}}\right), \tag{186}$$
which correspond to the labels $(3, 0, 0)_{MM}$, $(5, 3, -1)_{MM}$, respectively. We see that
$$\Lambda'/\Lambda\cong\mathbb{Z}/5. \tag{187}$$
In (184), the rows (counted from top to bottom as 0, . . . , 4) match the corresponding residue class (187). The fusion rules for $(2, 0)_{MM}$, $(0, 0)_{MM}$ are the same as in the PF case. Hence, again, multiplication of labels preserves the rows (184), and the Verlinde algebra is isomorphic to
$$\mathbb{Z}[\eta]/(\eta^5 - 1)\otimes\mathbb{Z}[\epsilon]/(\epsilon^2 - \epsilon - 1) \tag{188}$$
where η is $(3, 3)_{MM}$. Remark. As remarked in Sec. 3, the positive definiteness of the modular functor, which is crucial for our theory to work, is a requirement for a physical CFT. It is interesting to note, however, that if we do not include this requirement, other possible choices of real structure are possible on the modular functor: The Verlinde algebra of a lattice modular functor with another modular functor M with two labels 1 and ε, and Verlinde algebras (180), (188) are tensor products of lattice Verlinde algebras and the algebra $\mathbb{Z}[\epsilon]/(\epsilon^2 - \epsilon - 1)$. The real structure of this last modular functor can be changed by multiplying by −1 the complex conjugation in $M_\Sigma$ for a worldsheet Σ precisely when Σ has an odd number of boundary components labelled on level 1, 2. The resulting modular functor of this operation is not positive-definite. Let us now discuss the question of vertex operators in the PF realization of the minimal model. Clearly, since the 0th coordinate acts as a lattice coordinate and is not involved in renaming, it suffices to address the question for the parafermions. Now in the Feigin–Fuchs realization of the level 3 PF model, any state can be written as
$$u1_\lambda \tag{189}$$
where λ is one of the labels (166) and u is a state of the Heisenberg representation of the Heisenberg algebra generated by $x_{i,m}$, i = 1, 2, $m \neq 0$. The situation is however further complicated by the fact that not all Heisenberg states u are allowed for a given label λ. We shall call the states which are in the image of the embedding
admissible. For example, since the λ = 0 part of the PF model is isomorphic to the coset model $SU(2)/S^1$ of the same level, states
$$(a, b)x_{-1}(0, 0)_{PF} \tag{190}$$
are not admissible for $(a, b) \neq (0, 0)$. One can show that admissible states are exactly those which are generated from the ground states (166) by vertex operators and PF currents. Because not all states are admissible, however, there are also states whose vertex operators are 0 on admissible states. Let us call them null states. For example, since (190) is not admissible, it follows that
$$(a, b)x_{-1}(3, 3)_{PF}, \tag{191}$$
which is easily seen to be admissible for any choice of (a, b), is null. Determining explicitly which states are admissible and which are null is extremely tricky (cf. [34]). Fortunately, we do not need to address the question for our purposes. This is because we will only deal with states which are explicitly generated by the primary fields, and hence automatically admissible; because of this, we can ignore null states, which do not affect correlation functions of admissible states. On the other hand, we do need an explicit formula for vertex operators. One method for obtaining vertex operators is as follows. We may rename fields using the identifications (169) and also PF currents: a PF current applied to a renamed field must be equal to the same current applied to the original field. Note that this way we may get Heisenberg states above labels which fail to satisfy (166). Such states are also admissible, even though the corresponding “ground states” (which have the same name as the label) are not. Now if we have two admissible states
$$u_i1_{(\ell_i, m_i)}, \quad i = 1, 2$$
where $0 \leq \ell_i \leq 3$ and $\ell_1 + \ell_2 \leq 3$, then the lattice vertex operator
$$(u_11_{(\ell_1, m_1)})(z)\,u_21_{(\ell_2, m_2)} \tag{192}$$
always satisfies our fusion rules, and (up to scalar multiple constant on each module) is a correct vertex operator of the PF theory. This is easily seen simply by the fact that (192) intertwines correctly with module vertex operators (which are also lattice operators). While in our examples, it will suffice to always consider operators obtained in the form (192), it is important to realize that they do not describe the PF vertex operators completely. The problem is that when we want to iterate vertex operators, we would have to keep renaming states. But when two ground states $1_\lambda$, $1_\mu$ are identified via the formula (169), it does not follow that we would have
$$u1_\lambda = u1_\mu \tag{193}$$
for every Heisenberg state $u$. On the contrary, we saw for example that (190) is inadmissible, while (191) is null. One also notes that one has for example the identification
\[ \lambda x_{-1} 1_\lambda = L_{-1} 1_\lambda = L_{-1} 1_\mu = \mu x_{-1} 1_\mu, \tag{194} \]
which is not of the form (193). Because of this, to describe the full force of the PF theory completely, one needs another device for obtaining vertex operators (although we will not need this in the present paper). Briefly, it is shown in [34] that up to scalar multiple, any vertex operator $u(z)v = Y(u, z)(v)$ where $u$, $v$ are admissible states can be written as
\[ \oint \cdots \oint (a_k x_{-1} 1_{(0,-2)})(t_k) \cdots (a_1 x_{-1} 1_{(0,-2)})(t_1)\, u(z) v \, dt_1 \cdots dt_k \tag{195} \]
where the operators in the argument of (195) are lattice vertex operators and the number $k$ is selected to conform with the given fusion rule. While it is easy to show that operators of the form (195) are correct vertex operators on admissible states (again up to a scalar multiple, constant on each irreducible module), as the “screening operators” $a x_{-1} 1_{(0,-2)}$ commute with PF currents, selecting the bounds of integration (“contours”) is much more tricky. Despite the notation, it is not correct to imagine these as integrals over closed curves, at least not in general. One approach which works is to bring the argument of (195) to normal order, which expands it as an infinite sum of terms of the form
\[ \prod (t_i - t_j)^{\alpha_{ij}} \prod t_k^{\beta_k} \tag{196} \]
(where we put $t_0 = z$) with coefficients which are lattice vertex operators. Then to integrate (196), for $\alpha_{ij}, \beta_k > 0$, we may simply integrate $t_i$ from $0$ to $t_{i-1}$, and define the integral by analytic continuation in the variables $\alpha_{ij}$, $\beta_k$ otherwise. The functions obtained in this way are generalized hypergeometric functions, and fail for example the assumptions of Theorem 1 (see Remark 2 after the theorem). The explanation is in the fact that, as we already saw, the fusion rules are not “abelian” in this case.

6. The Gepner Model: The Obstruction

We will now show that for the Gepner model of the Fermat quintic, the function (95) may not vanish for the deforming field (170). This means that not all perturbative deformations corresponding to marginal fields exist in this case. We emphasize that our result applies to deformations of the CFT itself (of central charge 9). A different
approach is possible by embedding the model into string theory, and investigating the deformations in that setting (cf. [16]). Our results do not automatically apply to deformations in that setting.

We will consider $v = u = (3, 3, 3) \otimes (2, 2, 2)$. (In the remaining three coordinates, we will always put the vacuum, so we will omit them from our notation.) First note that by Theorem 3, this is actually the only relevant case of (95), since the only other chiral primary field of weight 1/2 with only two non-vacuum coordinates is $(2, 2, 2) \otimes (3, 3, 3)$, which cannot correlate with the right-hand side of (95), whose first coordinate is on level 0, 3. In any case, we will show therefore that the Gepner model has an obstruction against continuous perturbative deformation along the field (170) in the moduli space of exact conformal field theories.

Now the chiral correlation function (95) is a complicated multivalued function because of the integrals (196), which are generalized hypergeometric functions. As remarked above, the modular functor has a canonical flat connection on the space of degenerate worldsheets whose boundary components are shifts of the unit circle with the identity parametrization. The flat connection comes from the fact that these degenerate worldsheets are related to each other by applying $\exp(zL_{-1})$ to their boundary components. This is why we can speak of analytic continuation of a branch of the correlation function corresponding to a particular fusion rule. It can further be shown (although we do not need to use that result here) that the continuations of the correlation function corresponding to any one particular fusion rule generate the whole correlation function (i.e. the whole modular functor is generated by any one non-zero section).

Let us now investigate which number $m$ we need in (95). In our case, we have
\[ G^-_{-1/2}(u) = G^-_{-1/2}(3, 3, 3) \otimes (2, 2, 2) - (3, 3, 3) \otimes G^-_{-1/2}(2, 2, 2). \tag{197} \]
(The sign will be justified later; it is not needed at this point.) The first summand of (197) has $x_{0,0}$-charge $(-2/\sqrt{15}, 2/\sqrt{15})$, the second has $x_{0,0}$-charge $(3/\sqrt{15}, -3/\sqrt{15})$. Thus, the charges can add up to 0 only if $m$ is a multiple of 5. The smallest possible obstruction is therefore for $m = 5$, in which case (95) is a 7-point function. Let us focus on this case. This function, however, is too big to calculate completely. Because of this, we use the following trick. First, it is equivalent to consider the question of vanishing of the function
\[ \langle 1 | (G^-_{-1/2}u)(z_5) \cdots (G^-_{-1/2}u)(z_1)\, u(z_0)\, \bar u(t) \rangle. \tag{198} \]
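The charge count behind the restriction on $m$ can be written out explicitly. The following is only a sketch: it assumes that the $x_{0,0}$-charges of $u$ and $\bar u$ cancel each other, and it introduces the bookkeeping symbol $j$ (not used elsewhere in the text) for the number of factors $G^-_{-1/2}u$ that contribute their first summand of (197).

```latex
% First tensor coordinate, charges measured in units of 1/\sqrt{15}:
% j factors contribute -2, the remaining m - j factors contribute +3.
\[
  -2j + 3(m - j) = 0
  \iff 3m = 5j .
\]
% Since 3 and 5 are coprime, 5 divides m, and then j = 3m/5.
% The second tensor coordinate gives 2j - 3(m - j) = 0, the same condition.
\[
  m = 5 \;\Longrightarrow\; j = 3,\quad m - j = 2 .
\]
```

Under these assumptions, the case $m = 5$ uses the first summand of (197) three times and the second summand twice.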
Now by the OPE, it is possible to transform any correlation function of the form
\[ \langle \cdots | \cdots v(z) w(t) \cdots \rangle \tag{199} \]
to the correlation function
\[ \langle \cdots | \cdots (v_n w)(t) \cdots \rangle \tag{200} \]
(all other entries are the same). More precisely, (199) is expanded, in a certain range and choice of branch, into a series in $z - t$ with coefficients (200) for values of $n$ belonging to a coset $\mathbb{Q}/\mathbb{Z}$. By the above argument, therefore, the function (199) vanishes if and only if the function (200) vanishes for all possible choices of $n$ associated with one fixed choice of fusion rule.

In the case of (198), we shall divide the fields on the right-hand side into two sets $G_x$, $G_y$ containing two copies of $G^-_{-1/2}u$ each, and a set $G_z$ containing the remaining three fields $u$, $\bar u$ and $G^-_{-1/2}u$. Each set $G_x$, $G_y$, $G_z$ will be reduced to a single field using the transition from (199) to (200) (twice in the case of $G_z$). To simplify notation (eliminating the subscripts), we will denote the fields resulting from $G_x$, $G_y$, $G_z$ by $a(x)$, $b(y)$, $c(z)$, respectively. Thus, $x$, $y$, $z$ are appropriate choices among the variables $z_i$, $t$, depending on how the transition from (199) to (200) is applied. This reduces the correlation function (198) to
\[ \langle 1 | a(x) b(y) c(z) \rangle. \tag{201} \]
Most crucially, however, we make the following simplification: We shall choose the fusion rules in such a way that the fields $a$, $b$, $c$ are level 0, 3 in the Feigin–Fuchs realization, and at most one of the charges will be 3 (in each coordinate). Then, (201) is just a lattice correlation function, for the computation of which we have an algorithm.

To make the calculation correctly, we must keep careful track of signs. When taking a tensor product of super-CFT’s, one must add appropriate signs analogous to the Koszul–Milnor signs in algebraic topology. Now a modular functor of a super-CFT decomposes into an even part and an odd part. Additionally, more than one choice of this decomposition may be possible for the same theory, depending on which bottom states of irreducible modules are chosen as even or odd. The sign of a fusion rule is then determined by whether composition along the pair of pants with given labels preserves parity of states or not. Mathematically, this phenomenon was noticed by Deligne in the case of the determinant line (cf. [50]). (Deligne also noticed that in some cases no consistent choice of signs is possible and a more refined formalism is needed; a single fermion of central charge 1/2 is an example; this is also discussed in [50]. However, this will not be needed here.)

In the case of the N = 2-minimal model, there is a choice of parities of ground states of irreducible modules which makes the whole modular functor (all the fusion rules) even: simply choose the parity of $(k, \ell, m)$ to be $k \bmod 2$. We easily see that this is compatible with supersymmetry. Now in this case of completely even modular functor, the signs simplify, and we put
\[ Y(u \otimes v, z)(r \otimes s) = (-1)^{\pi(r)\pi(v)}\, Y(u, z)r \otimes Y(v, z)s \tag{202} \]
(where $\pi(u)$ means the parity of $u$). Regarding supersymmetry (if present), an element $H$ of the superconformal algebra also acts on a tensor product by
\[ H(u \otimes v) = Hu \otimes v + (-1)^{\pi(H)\pi(u)}\, u \otimes Hv, \tag{203} \]
in particular
\[ G^-_{-1/2}(u \otimes v) = (G^-_{-1/2}u) \otimes v + (-1)^{\pi(u)}\, u \otimes (G^-_{-1/2}v). \tag{204} \]
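As a quick consistency check, the sign in (197) can be read off from (204). This is a sketch under two assumptions taken from the surrounding text: the parity of a label $(k, \ell, m)$ is $k \bmod 2$, and $G^-_{-1/2}$ is odd.

```latex
% Parity of the first tensor factor of u = (3,3,3) \otimes (2,2,2):
\[
  \pi\bigl((3,3,3)\bigr) = 3 \bmod 2 = 1 .
\]
% Inserting this into (204):
\[
  G^-_{-1/2}\bigl((3,3,3)\otimes(2,2,2)\bigr)
  = G^-_{-1/2}(3,3,3)\otimes(2,2,2)
  + (-1)^{1}\,(3,3,3)\otimes G^-_{-1/2}(2,2,2),
\]
% which reproduces the relative minus sign of (197).
```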
We see that because of (204), the fields $a$, $b$, $c$ may have the form of sums of several terms.

Example 1. Recall that the inner product (more precisely symmetric bilinear form) of labels considered as lattice points is
\[ \langle (r_1, s_1, t_1), (r_2, s_2, t_2) \rangle = \frac{r_1 r_2}{15} + \frac{s_1 s_2}{10} - \frac{t_1 t_2}{6}. \tag{205} \]
Recall also (from the definition of energy-momentum tensor) that the weight of the label ground states is calculated by
\[ w(r, s, t)_{MM} = \frac{r^2}{30} + w(s, t)_{PF} = \frac{r^2}{30} + \frac{s(s + 2)}{20} - \frac{t^2}{12}. \tag{206} \]
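The formulas (205) and (206) are easy to evaluate mechanically, which helps when checking the weights quoted in this example. The following is a minimal sketch in exact rational arithmetic; the function names are illustrative only, and it assumes that each oscillator factor $x_{-1}$ raises the weight by 1.

```python
from fractions import Fraction as F

def inner(a, b):
    """Bilinear form (205) on labels (r, s, t)."""
    return F(a[0] * b[0], 15) + F(a[1] * b[1], 10) - F(a[2] * b[2], 6)

def weight(label, n_osc=0):
    """Weight (206) of the ground state of label (r, s, t).

    n_osc counts x_{-1} factors; each is assumed to add 1 to the weight.
    """
    r, s, t = label
    return F(r * r, 30) + F(s * (s + 2), 20) - F(t * t, 12) + n_osc

def tensor_weight(*factors):
    """Total weight of a tensor product given as (label, n_osc) pairs."""
    return sum((weight(lab, n) for lab, n in factors), F(0))

print(tensor_weight(((3, 3, 3), 0), ((2, 2, 2), 0)))     # u:     1/2
print(tensor_weight(((-2, 3, 1), 0), ((2, 3, -1), 0)))   # (212): 8/5
print(tensor_weight(((3, 0, 0), 0), ((-3, 0, 0), 0)))    # (213): 3/5
print(tensor_weight(((-4, 0, 2), 1), ((4, 0, -2), 1)))   # (219): 12/5
print(inner((3, 0, 0), (-3, 0, 0)))                      # pairing of the two factors of (213)
```

Exact fractions are used instead of floats so that the values can be compared directly with the weights stated in the text.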
Now we have u = (3, 3, 3) ⊗ (2, 2, 2) = (3, 0, 0) ⊗ (2, 1, −1).
(207)
We begin by choosing the field $c$. Compose first $u$ and
\[ \bar u = (-3, 3, -3) \otimes (-2, 2, -2) = (-3, 0, 0) \otimes (-2, 1, 1). \tag{208} \]
We choose the non-zero $u_n \bar u$ of the bottom weight for the fusion rule which adds the lattice charges on the right-hand side of (207), (208). The result is
\[ u_{-1/10}\, \bar u = (0, 0, 0) \otimes (0, 2, 0). \tag{209} \]
Next, apply $G^-_{-1/2}u$ to (209). Again, we will choose the bottom descendant. Now $G^-_{-1/2}u$ has two summands,
\[ (-2, 3, 1) \otimes (2, 1, -1) \tag{210} \]
and
\[ (3, 0, 0) \otimes (0, 5, 3)x_{-1}(-3, 1, 3) \tag{211} \]
(the term (211) involves renaming to stay within no-ghost PF labels after composition). Applying (210) to (209) gives bottom descendant
\[ (-2, 3, 1) \otimes (2, 3, -1) \text{ of weight } 8/5, \tag{212} \]
applying (211) to (209) gives bottom descendant
\[ (3, 0, 0) \otimes (-3, 0, 0) \text{ of weight } 3/5. \tag{213} \]
Since (213) has lower weight than (212), (212) may be ignored, and we can choose c = (3, 0, 0) ⊗ (−3, 0, 0).
(214)
Now again, using the formula (204), we see that in the sets of fields $G_x$, $G_y$ we need one summand (211) and three summands (210) to get to $x_{0,0}$-charge 0. Thus, one of the groups $G_x$, $G_y$ will contain two summands of (210) and the other will contain one. We employ the following convention:

We choose $G_y$ to contain two summands (210) and $G_x$ to contain one summand (210) and one summand (211). (215)

This leads to the following:

We must choose the fields $a$ and $b$ of the same weights and symmetrize the resulting correlation function with respect to $x$ and $y$. (216)

We will choose $b$ first. Again, we will choose the bottom weight (nonzero) descendant of (210) applied to itself renamed as
\[ (0, 5, -3)x_{-1}(-2, 0, -2) \otimes (2, 2, 2), \tag{217} \]
which is
\[ (-4, 3, -1) \otimes (4, 3, 1). \tag{218} \]
We rename to level 0, which gives
\[ b = (0, 5, 3)x_{-1}(-4, 0, 2) \otimes (0, 5, -3)x_{-1}(4, 0, -2), \quad w(b) = 12/5. \tag{219} \]
Then a must have weight 12/5 to satisfy (216). When calculating a, however, there is an additional subtlety. This time, we actually have to take into account two summands, from applying (210) to (211) and vice versa, i.e. (211) to (210). In both cases, we must rename to get the desired fusion rule. To this end, we may replace (211) by (3, 0, 0) ⊗ (−3, 2, 0).
(220)
However, when applying (210) and (220) to each other in opposite order, the renamings then do not correspond, resulting in the possibility of a wrong coefficient/sign (since renamings are correct only up to constants which we have not calculated). To reconcile this, we must use exactly the same renamings step by step, related only by applying PF currents. To this end, we may compare the renaming of applying
\[ (0, 5, -3)x_{-1}(-2, 0, -2) \otimes (2, 2, 2) \tag{221} \]
to
\[ \tfrac{1}{2}\,(3, 0, 0) \otimes (0, 5, -3)x_{-1}(-3, 1, -3) \tag{222} \]
(the $\tfrac{1}{2}$ comes from the PF current $(5, -3)x_{-1}(0, -2)$ which takes $(2, 2)$ to $2(2, 0)$) and
\[ (3, 0, 0) \otimes (-3, 2, 0) \tag{223} \]
to
\[ (0, 5, -3)x_{-1}(-2, 0, -2) \otimes (2, 1, -1). \tag{224} \]
We see that the bottom descendant of applying (221) to (222) is
\[ (0, 5, -3)x_{-1}(1, 0, -2) \otimes (-1)(-1, 3, -1) \tag{225} \]
while the bottom descendant of applying (223) to (224) is
\[ (0, 5, -3)x_{-1}(1, 0, -2) \otimes (-1, 3, -1). \tag{226} \]
The expression (225) is the negative of (226). On the other hand, we see that the bottom descendants of applying (210) to (220) and vice versa are the same. This means that we are allowed to apply the names (210) and (220) to each other in either order, but we must take the results with opposite signs. Now (226) has weight 7/5, so to get weight 12/5, we must take the descendant of applying (210) to (220) and vice versa which is of weight 1 higher than the bottom. This gives
\[ ((-2, 3, 1) - (3, 0, 0))x_{-1}(1, 3, 1) \otimes (-1, 3, -1) + (1, 3, 1) \otimes ((2, 1, -1) - (-3, 2, 0))x_{-1}(-1, 3, -1), \]
which is
\[ a = (-5, 3, 1)x_{-1}(1, 3, 1) \otimes (-1, 3, -1) + (1, 3, 1) \otimes (5, -1, -1)x_{-1}(-1, 3, -1). \tag{227} \]
Now the correlation function of $a(x)$, $b(y)$, $c(z)$ given in (227), (219), (214) is an ordinary lattice correlation function. The algorithm for calculating the lattice correlation function of fields $u_i(x_i)$ which are of the form $1_{\lambda_i}(x_i)$ or $\mu_i x_{-1} 1_{\lambda_i}(x_i)$ with the label $1_{\sum \lambda_i}$ is as follows: The correlation function is a multiple of
\[ \prod_{i<j} (x_i - x_j)^{\langle \lambda_i, \lambda_j \rangle} \]
by a certain factor, which is a sum over all the ways we may “absorb” any $\mu_i x_{-1}$ factors. Each such factor may either be absorbed by another $\mu_j x_{-1}$, which results in a factor
\[ \langle \mu_i, \mu_j \rangle (x_i - x_j)^{-2}, \quad i \neq j \tag{228} \]
or by another lattice label $1_{\lambda_j}$, which results in a factor
\[ \langle \mu_i, \lambda_j \rangle (x_i - x_j)^{-1}, \quad i \neq j. \tag{229} \]
Each $\mu_i x_{-1}$ must be absorbed exactly once (and the mechanism (228) is considered as absorbing both $\mu_i$ and $\mu_j$), but one lattice label $1_{\lambda_j}$ may absorb several different $\mu_i x_{-1}$’s via (229). Evaluating the correlation function of $a(x)$, $b(y)$, $c(z)$ with the vacuum using this algorithm, we get
\[ \frac{2(y - z)}{(x - z)(x - y)^3}. \]
Symmetrizing with respect to $x$, $y$, we get
\[ -\frac{2(x - 2z + y)}{(y - z)(x - z)(x - y)^2} \]
(our total correlation function factor), which is non-zero.

In more detail, we can calculate separately the contributions to the correlation function of the two summands (227). For the first summand, the factor before the $\otimes$ sign contributes
\[ -\frac{1}{(x - z)(y - x)}, \tag{230} \]
the factor after the $\otimes$ sign contributes
\[ \frac{1}{y - x}. \tag{231} \]
Multiplying (230) and (231), we get
\[ -\frac{1}{(x - z)(x - y)^2}, \]
and symmetrizing with respect to $x$ and $y$,
\[ -\frac{x - 2z + y}{(x - z)(x - y)^2 (y - z)}, \tag{232} \]
which is the total contribution of the first summand (227).
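The absorption rules (228), (229) can be turned into a short recursive routine, which is convenient for cross-checking the factors computed by hand. The following is a minimal sketch, not the author's implementation: the function names and the encoding of a field as a triple `(mu, lam, x)` are illustrative choices, the prefactor $\prod (x_i - x_j)^{\langle \lambda_i, \lambda_j \rangle}$ is left out, and the formulas are compared at arbitrary rational sample points rather than symbolically.

```python
from fractions import Fraction as F

def inner(a, b):
    """Bilinear form (205) on labels (r, s, t)."""
    return F(a[0] * b[0], 15) + F(a[1] * b[1], 10) - F(a[2] * b[2], 6)

def absorption_factor(fields):
    """Sum over all ways of absorbing the mu_i x_{-1} factors.

    `fields` is a list of triples (mu, lam, x) for the field mu x_{-1} 1_lam
    at the point x; mu is None for a pure lattice field 1_lam(x).
    """
    pending = [i for i, (mu, _, _) in enumerate(fields) if mu is not None]

    def rec(todo):
        if not todo:
            return F(1)
        i, rest = todo[0], todo[1:]
        mu_i, _, x_i = fields[i]
        # (229): mu_i is absorbed by a lattice label 1_{lam_j}, j != i.
        single = sum((inner(mu_i, lam_j) / (x_i - x_j)
                      for j, (_, lam_j, x_j) in enumerate(fields) if j != i), F(0))
        total = single * rec(rest)
        # (228): mu_i and a not yet absorbed mu_j absorb each other.
        for pos, j in enumerate(rest):
            mu_j, _, x_j = fields[j]
            total += (inner(mu_i, mu_j) / (x_i - x_j) ** 2
                      * rec(rest[:pos] + rest[pos + 1:]))
        return total

    return rec(pending)

def first_summand(x, y, z):
    """Factors (before, after the tensor sign) for the first summand of (227),
    with a at x, b of (219) at y and c of (214) at z."""
    before = [((-5, 3, 1), (1, 3, 1), x), ((0, 5, 3), (-4, 0, 2), y), (None, (3, 0, 0), z)]
    after = [(None, (-1, 3, -1), x), ((0, 5, -3), (4, 0, -2), y), (None, (-3, 0, 0), z)]
    return absorption_factor(before), absorption_factor(after)

x, y, z = F(2), F(5), F(11)                      # arbitrary distinct sample points
before, after = first_summand(x, y, z)
assert before == -1 / ((x - z) * (y - x))        # (230)
assert after == 1 / (y - x)                      # (231)

# Symmetrize the product in x and y by exchanging the positions of a and b.
before_sw, after_sw = first_summand(y, x, z)
sym = before * after + before_sw * after_sw
assert sym == -(x - 2 * z + y) / ((x - z) * (x - y) ** 2 * (y - z))   # (232)
```

Checking at a few rational points is of course weaker than a symbolic identity, but it keeps the sketch free of external dependencies; the same routine can be applied to the second summand of (227) and to other choices of $c$.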
For the second summand (227), the factor after $\otimes$ contributes
\[ -\frac{2}{(x - y)^2} - \frac{1}{(x - z)(y - x)}, \tag{233} \]
and the factor before the $\otimes$ sign contributes
\[ \frac{1}{y - x}. \tag{234} \]
Multiplying, we get
\[ \frac{x - 2z + y}{(x - z)(x - y)^3}. \]
After symmetrizing with respect to $x$, $y$, we get also (232), so both summands of (227) contribute equally to the correlation function.

Example 2. In this example, we keep the same $a(x)$ and $b(y)$ as in the previous example, but change $c(z)$. To select $c(z)$, this time we start with $G^-_{-1/2}u$ represented as
\[ (3, 0, 0) \otimes (0, 5, -3)x_{-1}(-3, 1, -3) + C(-2, 3, 1) \otimes (2, 1, -1) \tag{235} \]
($C$ is a non-zero normalization constant which we do not need to evaluate explicitly), which we apply to $\bar u$ represented as
\[ (-3, 0, 0) \otimes (-2, 1, 1). \tag{236} \]
From the two summands (235), we get bottom descendants
\[ (0, 0, 0) \otimes (-5, 2, -2) \text{ of weight } 9/10 \tag{237} \]
and
\[ (-5, 3, 1) \otimes (0, 2, 0) \text{ of weight } 19/10. \tag{238} \]
Therefore, we may ignore (238) and select (237) only. Now applying (237) to u written as (3, 0, 0) ⊗ (2, 1, −1),
(239)
we select a descendant of weight 1 above the label (3, 0, 0) ⊗ (−3, 3, −3). Recalling from the conjugate of (191) that weight 1 states above the label (3, −3)P F = (0, 0)P F must vanish, we get c = (3, 0, 0) ⊗ (1, 0, 0)x−1 (−3, 0, 0)
(240)
(up to a non-zero multiplicative constant). This gives the correlation function
\[ \frac{(x - 2z + y)^2}{5(y - z)^2 (x - z)^2 (x - y)^2}. \tag{241} \]
Let us write again in more detail the contributions of the two summands (227). For the first summand, the contribution of the factor before $\otimes$ is again (230) (which remains unchanged), and the contribution of the factor after $\otimes$ is
\[ \frac{-y - 3z + 4x}{15(y - z)(x - z)(x - y)}. \tag{242} \]
Multiplying, we get
\[ \frac{-y - 3z + 4x}{15(y - z)(x - z)^2 (x - y)^2}, \]
and symmetrizing with respect to $x$, $y$,
\[ -\frac{y^2 + 6yz - 6z^2 - 8xy + 6zx + x^2}{15(y - z)^2 (x - z)^2 (x - y)^2}. \tag{243} \]
This is the total contribution of the first summand (227). For the second summand (227), the coordinate before $\otimes$ contributes again (234), and the coordinate after $\otimes$ contributes
\[ \frac{2(-yx - 3yz - 3xz + 3z^2 + 2x^2 + 2y^2)}{15(y - z)(x - z)^2 (x - y)^2}. \tag{244} \]
Multiplying, we get
\[ -\frac{2(-yx - 3yz - 3xz + 3z^2 + 2x^2 + 2y^2)}{15(y - z)(x - z)^2 (x - y)^3}. \]
Symmetrizing with respect to $x$, $y$, we get
\[ \frac{2(-yx - 3yz - 3xz + 3z^2 + 2x^2 + 2y^2)}{15(y - z)^2 (x - z)^2 (x - y)^2} \tag{245} \]
which is the total contribution of the second summand (227). Adding the contributions (243) and (245) (which are not equal in this case) gives (241).

Remark. When $u$ is, say, a cc field of weight (1/2, 1/2) in an N = (2, 2) CFT, then we have a CPT-conjugate aa field $v$. Physical CFT’s require a real structure, and the fields $u$, $v$ are not real. As already noted in the Remark at the end of Sec. 4 for the case of the free field theory, deforming along the field $u$ (or $v$), which is the case considered in this section, breaks the real structure of the CFT. Truly physical infinitesimal deformations therefore occur not along the fields $u$, $v$ but along the field
\[ u + v. \tag{246} \]
(This, of course, explains why the dimension of the space of, say, infinitesimal cc-deformations is the dimension of the space of deformations of the complex structure, and not the double of that number.) In the literature, the contribution of the CPT-conjugate is often ignored (cf. [31, formulas (4.5) and (4.7)]). Nevertheless, from the point of view of obstruction theory, considering (246) and the original cc field $u$ should be equivalent. An argument can be sketched as follows: Let $a$ be a non-zero complex number. Then replacing $u$ by $au$, (246) becomes
\[ au + \bar a v. \tag{247} \]
Since the obstructions are homogeneous, instead of (247), we may consider
\[ u + bv, \quad b = \frac{\bar a}{a}. \tag{248} \]
Then $b$ is an arbitrary element of the unit circle $S^1$. Thus, even when we restrict to real deformations, the obstruction should vanish for every field (248) with $b \in S^1$. But the chiral part of the obstruction is holomorphic in $b$, so vanishing for all $b \in S^1$ implies vanishing for $b = 0$, and hence if all of the real deformations along (247) are unobstructed, so is the deformation along $u$. While this argument is compelling, we learned in Sec. 4 that when deforming along fields of the form (246), regularization along the deformation parameter is required. Therefore, to make the argument precise in the present setting, we would either need to develop a general regularization scheme to the same order to which obstructions vanish, or compute the regularization parameters explicitly in the present case. Working this out would be a substantial improvement of the present result.

The remainder of this section is dedicated to comments on possible perturbative deformations along the fields $(1^5, 1^5)$, $(-1^5, 1^5)$ (the exponent here denotes repetition of the field in a tensor product, and 1 again stands for $(1, 1, 1)_{MM}$, etc.). We will present some evidence (although not proof) that the obstruction might vanish in this case. The results we do obtain will prove useful in the next section. Such a conjecture would have a geometric interpretation. In Gepner’s conjectured interpretation of the model we are investigating as the σ-model of the Fermat quintic, the field (175) corresponds to the dilaton. It seems reasonable to conjecture that the dilaton deformation should exist, since the theory should not choose a particular global size of the quintic. Similarly, the field (174) can be explained as the dilaton on the mirror manifold of the quintic, which should correspond to deformations of complex structure of the form
\[ x^5 + y^5 + z^5 + t^5 + u^5 + \lambda xyztu = 0. \tag{249} \]
Therefore, our analysis predicts that (the body of) the moduli space of N = 2-supersymmetric CFT’s containing the Gepner model is 2-dimensional, and contains σ-models of the quintics (249), where the metric is any multiple of the metric for which the σ-model exists (which is unique up to a scalar multiple).
To discuss possible deformations along the fields $(1^5, 1^5)$, $(-1^5, 1^5)$, let us first review a simpler case, namely the coset construction: In a VOA $V$, we set, for $u \in V$ homogeneous,
\[
\begin{aligned}
Y(u, z) &= \sum_{n \in \mathbb{Z}} u_{-n+w(u)} z^n, \\
Y_-(u, z) &= \sum_{n < 0} u_{-n+w(u)} z^n, \\
Y_+(u, z) &= Y(u, z) - Y_-(u, z).
\end{aligned}
\tag{250}
\]
The coset model of $u$ is
\[ V_u = \langle v \in V \mid Y_-(u, z)v = 0 \text{ and } Y_+(u, z)v \text{ involves only integral powers of } z \rangle. \tag{251} \]
Then $V_u$ is a sub-VOA of $V$. To see this, recall that
\[ Y(u, z)Y(v, t)w = Y_+(u, z)Y_-(v, t)w + Y_+(v, t)Y_-(u, z)w + Y(Y_-(u, z - t)v)w. \tag{252} \]
When $v, w \in V_u$, the last two terms of the right-hand side of (252) vanish, which proves that $Y(v, t)w \in V_u[[t]][t^{-1}]$.

Now in the case of N = 2-super-VOA’s, let us stick to the NS sector. Then (250) still correctly describes the “body” of a vertex operator. The complete vertex operator takes the form
\[ Y(u, z, \theta^+, \theta^-) = \sum_{n \in \mathbb{Z}} \left( u_{-n+w(u)} z^n + u^+_{-n-1/2+w(u)} z^n \theta^+ + u^-_{-n-1/2+w(u)} z^n \theta^- + u^\pm_{-n-1+w(u)} z^n \theta^+ \theta^- \right). \tag{253} \]
We still define $Y_-(u, z, \theta^+, \theta^-)$ to be the sum of terms involving $n < 0$, and $Y_+(u, z, \theta^+, \theta^-)$ the sum of the remaining terms. The compatibility relations for an N = 2-super-VOA are
\[ D^+ Y(u, z, \theta^+, \theta^-) = Y(G^+_{-1/2}u, z, \theta^+, \theta^-), \qquad D^- Y(u, z, \theta^+, \theta^-) = Y(G^-_{-1/2}u, z, \theta^+, \theta^-), \tag{254} \]
where
\[ D^+ = \frac{\partial}{\partial \theta^+} + \theta^+ \frac{\partial}{\partial z}, \qquad D^- = \frac{\partial}{\partial \theta^-} + \theta^- \frac{\partial}{\partial z}. \tag{255} \]
Now using (252) again, for $u \in V$ homogeneous, we will have a sub-N = 2-VOA $V_u$ defined by (251), which is further endowed with the operators $G^-_{-1/2}$, $G^+_{-1/2}$.
In the case of lack of locality, only a weaker conclusion holds. Assume first we have “abelian” fusion rules in the same sense as in Remark 2 after Theorem 1.

Lemma 4. Suppose we have fields $u_i$, $i = 0, \ldots, n$ such that for $i > j$,
\[ Y(u_i, z)u_j = \sum_{n \ge 0} (u_i)_{-n-\alpha_{ij}-w(u_i)} z^{n+\alpha_{ij}} \tag{256} \]
with $0 \le \alpha_{ij} < 1$. Consider further points $z_0 = 0, z_1, \ldots, z_n$. Then
\[ \prod_{n \ge i > j \ge 0} (z_i - z_j)^{-\alpha_{ij}}\, Y(u_n, z_n) \cdots Y(u_1, z_1)u_0 \tag{257} \]
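where each $(z_i - z_j)^{-\alpha_{ij}}$ is expanded in $z_j$ is a power series whose coefficients involve nonnegative integral powers of $z_0, \ldots, z_n$ only.

Proof. Induction on $n$. Assuming the statement is true for $n - 1$, note that by assumption, (257), when coupled to $w \in V^\vee$ of finite weight, is a meromorphic function in $z_n$ with possible singularities at $z_0 = 0, z_1, \ldots, z_{n-1}$. Thus, (257) can be expanded at its singularities, and is equal to
\[
\begin{aligned}
&\prod_{n-1 \ge i > j \ge 0} (z_i - z_j)^{-\alpha_{ij}} \cdot \Bigg( \Big[ z_n^{-\alpha_{n0}}\, \mathrm{expand}_{z_n}\Big( \prod_{j \neq 0} (z_n - z_j)^{-\alpha_{nj}} \Big)\, Y(u_{n-1}, z_{n-1}) \cdots Y(u_1, z_1) Y(u_n, z_n) u_0 \Big]_{z_n < 0} \\
&\quad + \sum_{i=1}^{n-1} \Big[ (z_n - z_i)^{-\alpha_{ni}}\, \mathrm{expand}_{(z_n - z_i)}\Big( \prod_{n-1 \ge j \neq i} (z_n - z_j)^{-\alpha_{nj}} \Big) \cdot Y(u_{n-1}, z_{n-1}) \cdots Y(u_{i+1}, z_{i+1})\, Y(Y(u_n, z_n - z_i)u_i, z_i) \\
&\qquad\qquad \cdot\, Y(u_{i-1}, z_{i-1}) \cdots Y(u_1, z_1) u_0 \Big]_{(z_n - z_i) < 0} \\
&\quad + \Big[ z_n^{-\alpha_{n0} - \cdots - \alpha_{n,n-1}}\, \mathrm{expand}_{1/z_n}\Big( \prod_{j=1}^{n-1} \Big( 1 - \frac{z_j}{z_n} \Big)^{-\alpha_{nj}} \Big) \cdot Y(u_n, z_n) Y(u_{n-1}, z_{n-1}) \cdots Y(u_1, z_1) u_0 \Big]_{z_n \ge 0} \Bigg).
\end{aligned}
\tag{258}
\]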
In (258), $\mathrm{expand}_?(?)$ means that the argument is expanded in the variable given as the subscript. The symbol $(?)_{?<0}$ (respectively, $(?)_{?\ge 0}$) means that we take only terms in the argument (which is a power series in the subscript) which involve negative (respectively, non-negative) powers of the subscript. In any case, by the assumption of the lemma, all summands (258) vanish with the exception of the last, which is the induction step.

In the case of non-abelian fusion rules, an analogous result unfortunately fails. Assume for simplicity that $u_0 = \cdots = u_n$ holds in (258) with
\[ 0 \le \alpha^F_{ij} < 1 \text{ true for any fusion rule } F. \tag{259} \]
We would like to conclude that the correlation function
\[ \langle v, u(z_n) \cdots u(z_1)u \rangle \tag{260} \]
involves only non-negative powers of $z_i$ when expanded in $z_1, \ldots, z_n$ (in this order). Unfortunately, this is not necessarily the case. Note that we know that (260) converges to 0 when two of the arguments $z_i$ approach each other while the others remain separate. However, this does not imply that the function (260) converges to 0 when three or more of the arguments approach each other simultaneously. To give an example, let us consider the solution of the Fuchsian differential equation on $\mathbb{P}^1 - \{0, t, \infty\}$
\[ y' = \left( \frac{A}{z} + \frac{B}{z - t} \right) y \tag{261} \]
for square matrices $A$, $B$ (with $t \neq 0$ constant). Since the solution $y$ has bounded singularities, multiplying $y$ by $z^m (z - t)^n$ for large enough integers $m$, $n$ makes the resulting function $Y$ converge to 0 when $z$ approaches 0 or $t$. If, however, the expansion of $Y$ at $\infty$ involved only non-negative powers of $z$, it would have only finitely many terms, and hence abelian monodromy. It is well known, however, that this is not necessarily the case. In fact, any irreducible monodromy occurs for a solution of Eq. (261) for suitable matrices $A$, $B$ (cf. [7]). Therefore, the following result may be used as evidence, but not proof, of the exponentiability of deformations along $(1^5, 1^5)$ and $(1^5, -1^5)$.

Lemma 5. The assumption (259) is satisfied for the field $u = G^-_{-1/2}((1, 1, 1), \ldots, (1, 1, 1))$ in the 5-fold tensor product of the N = 2 minimal model of central charge 9/5.

Before proving this, let us state the following consequence: Indeed, assuming Lemma 5 and setting $w = ((1, 1, 1), \ldots, (1, 1, 1))$, the obstruction is
\[ \langle w' | (G^-_{-1/2}w)(z_n) \cdots (G^-_{-1/2}w)(z_1)w \rangle. \tag{262} \]
(The antichiral primary case is analogous.) But using the fact that
\[ G^-_{-1/2}\big( (G^-_{-1/2}w)(z_n) \cdots (G^-_{-1/2}w)(z_1)w \big) = (G^-_{-1/2}w)(z_n) \cdots (G^-_{-1/2}w)(z_1)\, G^-_{-1/2}w \]
along with injectivity of $G^-_{-1/2}$ on chiral primaries of weight 1/2, we see that the non-vanishing of (262) implies the non-vanishing of (260) with $u = G^-_{-1/2}w$ for some $v$ of weight 1, which would contradict Lemma 5.

Proof of Lemma 5. We have
\[ G^-_{-1/2}(1, 1, 1) = (-4, 1, -1). \tag{263} \]
We have in our lattice
\[
\begin{aligned}
(1, 1, 1) \cdot (1, 1, 1) &= 1/15 + 1/10 - 1/6 = 0, \\
(-4, 1, -1) \cdot (-4, 1, -1) &= 16/15 + 1/10 - 1/6 = 1, \\
(1, 1, 1) \cdot (-4, 1, -1) &= -4/15 + 1/10 + 1/6 = 0,
\end{aligned}
\tag{264}
\]
so we see that with the fusion rules which stay on levels 1, 2, the vertex operators $u(z)u$ have only non-singular terms. However, this is not sufficient to verify (259). In effect, when we use the fusion rule which goes to levels 0, 3, $(1, 1, 1)(z)(-4, 1, -1)$ and $(-4, 1, -1)(z)(1, 1, 1)$ will have most singular term $z^{-2/5}$, so when we write again 1 instead of $(1, 1, 1)$ and $G$ instead of $G^-_{-1/2}(1, 1, 1)$, with the least favorable choice of fusion rules, it seems $u(z)u$ can have singular term $z^{-4/5}$, coming from the expressions
\[ (G1111)(z)(1G111) \tag{265} \]
and
\[ (1G111)(z)(G1111) \tag{266} \]
(and expressions obtained by permuting coordinates). Note that with other combinations of fusion rules, various other singular terms can arise with $z^{>-4/5}$. Now the point is, however, that we will show that with any choice of fusion rule, the most singular terms of (265) and (266) come with opposite signs and hence cancel out. Since the $z$ exponents of other terms are higher by an integer, this is all we need.
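The exponents $-2/5$ and $-4/5$ can be cross-checked against the weight formula (206). The following is only a sketch: it assumes that the bottom state reached by the fusion rule going to levels 0, 3 carries the label $(-3, 0, 0)$, that $G^-_{-1/2}$ raises the weight by $1/2$, and that the leading power of $z$ in $a(z)b$ is the weight of the result minus the weights of $a$ and $b$.

```latex
\[
  w(1,1,1) = \tfrac{1}{30} + \tfrac{3}{20} - \tfrac{1}{12} = \tfrac{1}{10},
  \qquad
  w(G) = \tfrac{1}{10} + \tfrac{1}{2} = \tfrac{3}{5},
  \qquad
  w(-3,0,0) = \tfrac{9}{30} = \tfrac{3}{10}.
\]
% Leading exponent of 1(z)G and of G(z)1:
\[
  w(-3,0,0) - w(1,1,1) - w(G)
  = \tfrac{3}{10} - \tfrac{1}{10} - \tfrac{3}{5} = -\tfrac{2}{5}.
\]
% In (265), (266) two tensor coordinates each contribute this exponent,
% while the remaining three coordinates pair 1 with 1:
\[
  2 \cdot \left(-\tfrac{2}{5}\right) = -\tfrac{4}{5}.
\]
```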
Recalling the Koszul–Milnor sign rules for the minimal model, recall that 1 is odd and $G$ is even, so
\[ (G \otimes 1)(z)(1 \otimes G) = -G(z)1 \otimes 1(z)G, \tag{267} \]
\[ (1 \otimes G)(z)(G \otimes 1) = 1(z)G \otimes G(z)1. \tag{268} \]
We have
\[
\begin{aligned}
1(z)G &= (1, 1, 1)(z)(-4, 2, 2) = M(-3, 0, 0)z^{-2/5} + \text{HOT}, \\
G(z)1 &= (-4, 2, 2)(z)(1, 1, 1) = N(-3, 0, 0)z^{-2/5} + \text{HOT},
\end{aligned}
\]
with some non-zero coefficients $M$, $N$, so the bottom descendants of (267) and (268) are
\[ -MN(-3, 0, 0) \otimes (-3, 0, 0) \]
respectively,
\[ MN(-3, 0, 0) \otimes (-3, 0, 0), \]
so they cancel out, as required.

7. The Case of the Fermat Quartic K3-Surface

The Gepner model of the K3 Fermat quartic is an orbifold analogous to (162) with 5 replaced by 4 of the 4-fold tensor product of the level 2 N = 2-minimal model, although one must be careful about certain subtleties arising from the fact that the level is even. The model has central charge 6. The level 2 PF model is the 1-dimensional fermion (of central charge 1/2), viewed as a bosonic CFT. As such, that model has 3 labels, the NS label with integral weights (denote by $NS$), the NS label with weights $\mathbb{Z} + \tfrac{1}{2}$ (denote by $NS'$), and the R label (denote by $R$). The fusion rules are given by the fact that $NS$ is the unit label,
\[
\begin{aligned}
NS' * NS' &= NS, \\
NS' * R &= R, \\
R * R &= NS + NS'.
\end{aligned}
\tag{269}
\]
We shall again find it useful to use the free field realization of the N = 2 minimal model, which we used in the last two sections. In the present case, the theory is a subquotient of a lattice theory spanned by
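\[ \left( \frac{1}{\sqrt{2}}, 0, 0 \right), \quad \left( \frac{1}{\sqrt{8}}, \frac{i}{\sqrt{8}}, \frac{i}{2} \right), \quad \left( \frac{1}{\sqrt{2}}, 0, i \right). \]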
Analogously as before, we write $(k, \ell, m)$ for
\[ \left( \frac{k}{\sqrt{8}}, \frac{\ell i}{\sqrt{8}}, \frac{m i}{2} \right). \]
The conformal vector is
\[ \frac{1}{2} x_{0,-1}^2 - \frac{1}{2} x_{1,-1}^2 + \frac{i}{2\sqrt{2}} x_{1,-2} + \frac{1}{2} x_{2,-1}^2. \]
The superconformal vectors are
\[ G^-_{-3/2} = (0, 4, 2)x_{-1}(4, 0, 2), \qquad G^+_{-3/2} = (0, 4, -2)x_{-1}(-4, 0, -2). \]
The fermionic labels will again be denoted by omitting the first coordinate: $(\ell, m)_F$. The fermionic identifications are:
\[ (2, 2)_F \sim (2, -2)_F \sim (0, 0)_F, \qquad (1, 1)_F \sim (1, -1)_F. \tag{270} \]
A priori the lattice $\langle \sqrt{8} \rangle$ has 8 labels $k/\sqrt{8}$, $0 \le k \le 7$, but the $G^-$ definition together with (270) forces the MM identification of labels
\[ (1, 1, 1) \sim (-3, 1, -1) \sim (-3, 1, 1). \]
The labels of the level 2 MM are therefore
\[ (2k, 0, 0),\ 0 \le k \le 3, \qquad (2k + 1, 1, 1),\ 0 \le k \le 1. \]
The fusion rules are
\[
\begin{aligned}
(k, 0, 0) * (\ell, 0, 0) &= (k + \ell, 0, 0), \\
(k, 0, 0) * (\ell, 1, 1) &= (k + \ell, 1, 1), \\
(k, 1, 1) * (\ell, 1, 1) &= (k + \ell, 0, 0) + (k + \ell + 4, 0, 0),
\end{aligned}
\]
so the Verlinde algebra is simply
\[ \mathbb{Z}[a, b]/(a^4 = 1,\ b^2 = a + a^3,\ a^2 b = b) \]
where $a = (2, 0, 0)$, $b = (1, 1, 1)$.

One subtlety of the even level MM in comparison with odd level concerns signs. Since the $k$-coordinates of $G^-$ and $G^+$ are even, we can no longer use the $k$-coordinate of an element as an indication of parity ($u$ and $G^\pm u$ cannot have the same parity). Because of this, we must introduce odd fusion rules. There are various ways of doing this. For example, let the bottom states of $(2k, 0, 0)$, $(1, 1, 1)$ and $(-1, 1, 1)$ be even. Then the fusion rules on level $\ell = 0$ are even, as are the fusion rules combining levels 0 and 1. The fusion rules
\[ (1, 1, 1) * (1, 1, 1) \to (2, 0, 0), \qquad (1, 1, 1) * (-1, 1, 1) \to (2, 0, 0) \]
are even, the remaining fusion rules (adding 4 to the $k$-coordinate on the right-hand side) are odd. Now the c fields of the MM are $(0,0,0)$, $(1,1,1)$, $(2,2,2)$ and the a fields are $(0,0,0)$, $(-1,1,-1)$, $(-2,2,-2)$. If we denote by $H_{1,2k+1}$ the state space of label $(2k+1,1,1)$, $0 \le k \le 1$, and by $H_{0,2k}$ the state space of label $(2k,0,0)$, $0 \le k \le 3$, then the state space of the 4-fold tensor product of the level 2 minimal model is
\[ \hat{\bigotimes}_{i=0}^{3}\left(\bigoplus_{k_i=0}^{3} H_{0,2k_i}\hat{\otimes}H^*_{0,2k_i} \oplus \bigoplus_{k_i=0}^{1} H_{1,2k_i+1}\hat{\otimes}H^*_{1,2k_i+1}\right). \tag{271} \]
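The Verlinde algebra relations stated earlier in this section for the level 2 minimal model can be checked mechanically from the three fusion rules. The following is only a sketch under stated assumptions: labels are encoded as hypothetical pairs `(k, level)`, with $k$ taken modulo 8 on level 0 and modulo 4 on level 1 (the latter reflecting the identification $(1,1,1) \sim (-3,1,1)$), and the signs of odd fusion rules are ignored.

```python
from collections import Counter

# Hypothetical encoding (not from the paper): a label (k, 0, 0) is stored as
# (k mod 8, 0) and a label (k, 1, 1) as (k mod 4, 1).

def normalize(label):
    """Reduce the k-coordinate: mod 8 on level 0, mod 4 on level 1."""
    k, level = label
    return (k % 8, 0) if level == 0 else (k % 4, 1)

def fuse(x, y):
    """Fusion of two labels, returned as a multiset (Counter) of labels."""
    (k, level_x), (l, level_y) = x, y
    if level_x == 1 and level_y == 1:
        # (k,1,1) * (l,1,1) = (k+l,0,0) + (k+l+4,0,0)
        return Counter([normalize((k + l, 0)), normalize((k + l + 4, 0))])
    # (k,0,0) * (l,0,0) = (k+l,0,0) and (k,0,0) * (l,1,1) = (k+l,1,1)
    return Counter([normalize((k + l, max(level_x, level_y)))])

def product(u, v):
    """Extend fuse bilinearly to formal sums of labels."""
    out = Counter()
    for x, cx in u.items():
        for y, cy in v.items():
            for z, cz in fuse(x, y).items():
                out[z] += cx * cy * cz
    return out

one = Counter({(0, 0): 1})
a = Counter({(2, 0): 1})   # a = (2, 0, 0)
b = Counter({(1, 1): 1})   # b = (1, 1, 1)

a2 = product(a, a)
a3 = product(a2, a)
a4 = product(a2, a2)
b2 = product(b, b)
a2b = product(a2, b)
```

With this encoding, `a4`, `b2` and `a2b` can be compared with `one`, `a + a3` and `b` to confirm the three relations of the quotient ring.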
The Gepner model is an orbifold with respect to the $\mathbb{Z}/4$-group which acts by $i^{\ell}$ on products in (271) where the sum of the subscripts $2k_i$ or $2k_i+1$ is congruent to $\ell$ modulo 4. Therefore, the state space of the Gepner model is the sum over $\beta \in \mathbb{Z}/4$ and $\alpha_i \in \mathbb{Z}/4$,
\[ \sum_{i=0}^{3}\alpha_i = 0 \in \mathbb{Z}/4, \]
of
\[ \hat{\bigotimes}_{i=0}^{3}\left(\bigoplus_{2k_i \equiv \alpha_i \bmod 4} H_{0,2k_i}\hat{\otimes}H^*_{0,2k_i+2\beta} \oplus \bigoplus_{2k_i+1 \equiv \alpha_i} H_{1,2k_i+1}\hat{\otimes}H^*_{1,2k_i+1+2\beta}\right). \tag{272} \]
It is important to note that each summand (272) in which all the factors have the "odd" subscripts $1, 2k_i+1$ occurs twice in the orbifold state space. If we write again $\ell$ for $(\ell,\ell,\ell)$ and $-\ell$ for $(-\ell,\ell,-\ell)$, then the critical cc fields are chirally symmetric permutations of
\[ \begin{array}{l} (2,2,0,0),\ (2,2,0,0) \\ (2,1,1,0),\ (2,1,1,0) \\ (1,1,1,1),\ (1,1,1,1). \end{array} \tag{273} \]
Note that applying all the possible permutations to the fields (273), we obtain only 19 fields, while there should be 20, which is the rank of $H^{1,1}(X)$ for a K3-surface $X$. However, this is where the preceding comment comes into play: the last field (273) corresponds to a term (272) where all the factors have odd subscripts, and hence there are two copies of that summand in the model, so the last field (273) occurs "twice". By the fact that the Fermat quartic Gepner model has $N = (4,4)$ worldsheet supersymmetry (see e.g. [9, 54] and references therein), the spectral flow guarantees
that the number of critical ac fields is the same as the number of critical cc fields. Concretely, the critical ac fields are the permutations of
\[ \begin{array}{l} (0,0,-2,-2),\ (2,2,0,0) \\ (0,-1,-1,-2),\ (2,1,1,0) \\ (-1,-1,-1,-1),\ (1,1,1,1). \end{array} \tag{274} \]
As above, the last field (274) occurs in 2 copies, thus the rank of the space of critical ac fields is also 20.

We wish to investigate whether infinitesimal deformations along the fields (273), (274) exponentiate perturbatively. To this end, let us first see when the "coset-type scenario" occurs. This is sufficient to prove convergence in the present case. This is due to the fact that in the present theory, there is an even number of fermions, in which case it is well known by the boson-fermion correspondence that the correlation functions follow abelian fusion rules, and therefore Lemma 4 applies. To prove that the coset scenario occurs, let us look at the chiral c fields $u = (2,2,0,0)$, $(2,1,1,0)$, $(1,1,1,1)$ and study the singularities of
\[ G^-_{-1/2}(z)(G^-_{-1/2}u). \tag{275} \]
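The field counts quoted above for (273) and (274) (19 from permutations alone, 20 once the doubled last field is included) are easy to confirm by enumeration. The sketch below is illustrative only: it assumes that a "chirally symmetric permutation" permutes the left and right 4-tuples simultaneously, so each field is encoded as a tuple of (left, right) pairs; the variable names are hypothetical.

```python
from itertools import permutations

# Each critical field is given as (left 4-tuple, right 4-tuple), as in (273), (274).
cc_fields = [
    ((2, 2, 0, 0), (2, 2, 0, 0)),
    ((2, 1, 1, 0), (2, 1, 1, 0)),
    ((1, 1, 1, 1), (1, 1, 1, 1)),
]
ac_fields = [
    ((0, 0, -2, -2), (2, 2, 0, 0)),
    ((0, -1, -1, -2), (2, 1, 1, 0)),
    ((-1, -1, -1, -1), (1, 1, 1, 1)),
]

def distinct_arrangements(left, right):
    """Number of distinct simultaneous permutations of the left and right tuples."""
    pairs = tuple(zip(left, right))
    return len(set(permutations(pairs)))

def count_fields(fields):
    """Return (count from permutations alone, count with the last field doubled)."""
    naive = sum(distinct_arrangements(l, r) for l, r in fields)
    return naive, naive + 1  # the all-odd summand of (272) occurs twice

cc_naive, cc_total = count_fields(cc_fields)
ac_naive, ac_total = count_fields(ac_fields)
```

The three patterns contribute 6, 12 and 1 arrangements respectively, which is where the count of 19 comes from.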
By Lemma 4, if (275) are non-singular, the obstructions vanish. The inner product is
\[ \langle (k,\ell,m), (k',\ell',m') \rangle = \frac{kk'}{8} + \frac{\ell\ell'}{8} - \frac{mm'}{4}, \]
\[ w(k,\ell,m) = \frac{k^2}{16} + \frac{\ell(\ell+2)}{16} - \frac{m^2}{8}. \]
Next, $(2,2,2) = (2,0,0)$,
\[ G^-_{-1/2}(2,0,0) = (0,4,-2)x_{-1}(-2,0,-2), \qquad G^-_{-1/2}(1,1,1) = (-3,1,-1). \]
If we again replace, to simplify notation, the symbol $G^-_{-1/2}$ by $G$, then we have:

The most singular $z$-power of $2(z)2$ is $\frac{2\cdot 2}{8} = \frac{1}{2}$, (276)

The most singular $z$-power of $G2(z)2$ is $\frac{-2\cdot 2}{8} = -\frac{1}{2}$. (277)

For $G2(z)G2$, rename the rightmost $G2$ as $(-2,2,0)$. We get:

The most singular $z$-power of $G2(z)G2$ is $-1 + \frac{(-2)\cdot(-2)}{8} = -\frac{1}{2}$, (278)

The most singular $z$-power of $1(z)1$ is 0 for the even fusion rule and 1/2 for the odd fusion rule, (279)

The most singular $z$-power of $G1(z)1$ is $\frac{-3}{8} + \frac{1}{8} + \frac{1}{4} = 0$ for the even fusion rule and $-1/2$ for the odd fusion rule, (280)

The most singular $z$-power of $G1(z)G1$ is $\frac{9}{8} + \frac{1}{8} - \frac{1}{4} = 1$ for the even fusion rule and 3/2 for the odd fusion rule. (281)
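The exponents in (276)-(281) for the even fusion rules follow from the inner product on labels, so they can be recomputed with exact rational arithmetic. This is a minimal sketch, assuming the pairing $kk'/8 + \ell\ell'/8 - mm'/4$ and weight $k^2/16 + \ell(\ell+2)/16 - m^2/8$ read off from the displayed formulas; the function and variable names are hypothetical, and the shift by $-1$ in (278) is taken from the text rather than derived.

```python
from fractions import Fraction

def pairing(u, v):
    """Inner product <(k,l,m),(k',l',m')> = kk'/8 + ll'/8 - mm'/4."""
    (k, l, m), (k2, l2, m2) = u, v
    return Fraction(k * k2, 8) + Fraction(l * l2, 8) - Fraction(m * m2, 4)

def weight(u):
    """Conformal weight w(k,l,m) = k^2/16 + l(l+2)/16 - m^2/8."""
    k, l, m = u
    return Fraction(k * k, 16) + Fraction(l * (l + 2), 16) - Fraction(m * m, 8)

two = (2, 0, 0)              # the label written "2"
g2 = (-2, 2, 0)              # G2 after renaming
g2_label = (-2, 0, -2)       # label part of G2 = (0,4,-2) x_{-1} (-2,0,-2)
one = (1, 1, 1)              # the label written "1"
g1 = (-3, 1, -1)             # G1

powers = {
    "2(z)2": pairing(two, two),               # (276)
    "G2(z)2": pairing(g2, two),               # (277)
    "G2(z)G2": -1 + pairing(g2_label, g2),    # (278); the -1 is as stated in the text
    "1(z)1": pairing(one, one),               # (279), even fusion rule
    "G1(z)1": pairing(g1, one),               # (280), even fusion rule
    "G1(z)G1": pairing(g1, g1),               # (281), even fusion rule
}
```

As a consistency check on the identification $(2,2,2) = (2,0,0)$, `weight((2, 2, 2))` and `weight((2, 0, 0))` both evaluate to 1/4.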
One therefore sees that for the field u = (1, 1, 1, 1), (275) is non-singular: In the case of the least favorable (odd) fusion rules, the most singular term appears to be −1, coming from (G1, 1, 1, 1) ⊗ (1, G1, 1, 1).
(282)
However, this term cancels with (1, G1, 1, 1) ⊗ (G1, 1, 1, 1).
(283)
To see this, note that the last two coordinates do not enter the picture. We have an odd (respectively, even) pair of pants $P_-$ (respectively, $P_+$) in the MM with input 1, 1. They add up to a pair of pants in MM⊗MM. On (282), we have pairs of pants $P_i \in \{P_-, P_+\}$,
\[ P(G1 \otimes 1) \otimes (1 \otimes G1) = (P_1 \otimes P_2)(G1 \otimes 1) \otimes (1 \otimes G1) = sP_1(G1 \otimes 1) \otimes P_2(1 \otimes G1) \tag{284} \]
where $s$ is the sign of permuting $P_2$ past $G1 \otimes 1$. Here we use the fact that 1 is even. On the other hand,
\[ P(1 \otimes G1) \otimes (G1 \otimes 1) = (P_1 \otimes P_2)(1 \otimes G1) \otimes (G1 \otimes 1) = -sP_1(1 \otimes G1) \otimes P_2(G1 \otimes 1) \tag{285} \]
(as $G1$ is odd, so there is a minus sign from permuting it with itself). From (73), the lowest terms of $P_i(1 \otimes G1)$ and $P_i(G1 \otimes 1)$ have opposite signs, so (284) and (285) cancel out.

The situation is simpler for $u = (2,2,0,0)$, in which case all the fusion rules are even, and the most singular term of $(G2 \otimes 2)(z)(2 \otimes G2)$ appears to be $z^{-1}$. However, again note that 2 is even and $G2$ is odd, so
\[ (G2 \otimes 2)(z)(2 \otimes G2) = G2(z)2 \otimes 2(z)G2, \tag{286} \]
while
\[ (2 \otimes G2)(z)(G2 \otimes 2) = -2(z)G2 \otimes G2(z)2. \tag{287} \]
Renaming $G2$ as $(-2,2,0)$, the bottom descendant of both $G2(z)2$ and $2(z)G2$ is $(0,2,0)$ with some coefficient, so (286) and (287) cancel out. Thus, the deformations along the first and last fields of (273) and (274) exponentiate.
The field $u = (2,1,1,0)$ is difficult to analyse, since in this case, (275) has singular channels and the coset-type scenario does not occur. We do not know how to calculate the obstruction directly in this case. It is however possible to present an indirect argument why these deformations exist.

In one precise formulation, the boson-fermion correspondence asserts that a tensor product of two copies of the 1-dimensional chiral fermion theory considered bosonically (= the level 2 parafermion) is an orbifold of the lattice theory $\langle 2 \rangle$ by the $\mathbb{Z}/2$-group whose generator acts on the lattice by sign. This has an $N = 2$-supersymmetric version. We tensor with two copies of the lattice theory associated with $\langle\sqrt{8}\rangle$, picking out the sector
\[ \left(\frac{m}{\sqrt{8}}, \frac{n}{\sqrt{8}}, \frac{p}{4}\right) \quad \text{where } m \equiv n \equiv p \bmod 2. \tag{288} \]
The fermionic currents of the individual coordinates are
\[ \psi_{-1/2,1} = (1) + (-1), \qquad \psi_{-1/2,2} = i((1) - (-1)), \tag{289} \]
so the SUSY generators are
\[ G^{\pm}_{-3/2,1} = \left(\pm\frac{4}{\sqrt{8}}, 0\right) \otimes ((1) + (-1)), \qquad G^{\pm}_{-3/2,2} = \left(0, \pm\frac{4}{\sqrt{8}}\right) \otimes i((1) + (-1)), \tag{290} \]
\[ G = G_{\cdot 1} + G_{\cdot 2}. \]
The $\mathbb{Z}/2$ group acts trivially on the new lattice coordinate. A note is due on the signs: To each state, we can assign a pair of parities, which will correspond to the parities of the 2 coordinates in the orbifold. This then also determines the sign of fusion rules.

Now consider our field as a tensor product of $(2,0)$ and $(1,1)$, each in a tensor product of two copies of the minimal models. Considering each of these factors as orbifold of the $N = 2$-supersymmetric lattice theory, let us lift to the lattice theory:
\[ (2,0) \to \left(\frac{2}{\sqrt{8}}, 0\right) \otimes (0), \tag{291} \]
\[ (1,1) \to \left(\frac{1}{\sqrt{8}}, \frac{1}{\sqrt{8}}\right) \otimes ((1/2) + (-1/2)). \tag{292} \]
Then the fields (291), (292) are $\mathbb{Z}/2$-invariant. In the case of (291), we can proceed in the lift instead of the orbifold, because the fusion rules in the orbifold are abelian anyway. In the case of (292), the choice amounts to choosing a particular fusion rule. But now the point is that
\[ G^-_{-1/2}(2,0) \to \left(-\frac{2}{\sqrt{8}}, 0\right) \otimes ((1) + (-1)), \tag{293} \]
\[ G^-_{-1/2}(1,1) \to \left(-\frac{3}{\sqrt{8}}, \frac{1}{\sqrt{8}}\right) \otimes ((1/2) + (-1/2)), \tag{294} \]
\[ G^-_{-1/2}(1,2) \to \left(\frac{1}{\sqrt{8}}, -\frac{3}{\sqrt{8}}\right) \otimes ((1/2) + (-1/2)). \tag{295} \]
Thus, the lift of $G^-_{-1/2}u$ is a sum of lattice labels! Now the critical summands of the operator
\[ G^-_{-1/2}(u)(z_k) \cdots G^-_{-1/2}(u)(z_1)(u)(0) \tag{296} \]
have $k = 4m$, and we have $2m$ summands (293), and $m$ summands (294), (295), respectively. All
\[ \binom{4m}{2m,\ m,\ m} \]
possibilities occur. It is the bottom (= label) term which we must compute in order to evaluate our obstruction. But by our sign discussion, when we swap a (294) term with a (295) term, the label summands cancel out. Now adding all such possible
\[ \binom{4m}{2m,\ m-1,\ m-1,\ 2} \]
pairs, all critical summands of (296) will occur with equal coefficients by symmetry, and hence also the bottom coefficient of (289) is 0, thus showing that the vanishing of our obstruction for this field lifts to the lattice theory. Since the field (292) is invariant under the $\mathbb{Z}/2$-orbifolding (and although (291) is not, the same conclusion holds when replacing it with its orbifold image), the entire perturbative deformation can also be orbifolded, yielding the desired deformation. We thus conclude that for the Gepner model of the K3 Fermat quartic, all the critical fields exponentiate to perturbative deformations.

8. Conclusions and Discussion

In this paper, we have investigated perturbative deformations of CFT's by turning on a marginal cc field, by the method of recursively updating the field along the deformation path. A certain algebraic obstruction arises. We work out some examples, including free field theories, and some $N = (2,2)$ supersymmetric Gepner models. In the $N = (2,2)$ case, in the case of a single cc field, the obstruction we find can be made very explicit, and perhaps surprisingly, does not automatically vanish. By explicit computation, we found that the obstruction does not vanish for a particular critical cc field in the Gepner model of the Fermat quintic 3-fold
(we saw some indication, although not proof, that it may vanish for the field corresponding to adding the symmetric term xyztu to the superpotential, and for the unique critical ac field). By comparison, the obstruction vanishes for the critical cc fields and ac fields in the Gepner model of the Fermat quartic K3-surface. Our calculations are not completely physical in the sense that cc fields are not real: real fields are obtained by adding in each case the complex-conjugate aa field, in which case the calculation is more complicated and is not done here. Assuming (as seems likely) that the real field case exhibits similar behavior as we found, why are the K3 and 3-fold cases different, and what does the obstruction in the 3-fold case indicate? In the K3-case, our perturbative analysis conforms with the Aspinwall–Morrison construction [9] of the big moduli space of K3’s, and corresponding (2, 2)- (in fact, (4, 4)-) CFT’s, and also with the findings of Nahm and Wendland [54, 62]. In the 3-fold case, however, the straightforward perturbative construction of the deformed nonlinear σ-model fails. This corresponds to the discussion of Nemeschansky–Sen [55] of the renormalization of the nonlinear σ-model. They expand around the 0 curvature tensor, but it seems natural to assume that similar phenomena would occur if we could expand around the Fermat quintic vacuum. Then [55] find that non-Ricci flat deformations must be added to the Lagrangian at higher orders of the deformation parameter in order to cancel the β function. Therefore, if we want to do this perturbatively, fields must be present in the original (unperturbed) model which would correspond to non-Ricci flat deformation. No such fields are present in the Gepner model. 
(Even if we do not a priori assume that the marginal fields of the Gepner model correspond to Ricci flat deformations, we see that different fields are needed at higher order of the perturbation parameter, so there are not enough fields in the model.) More generally, ignoring for the moment the worldsheet SUSY, the bosonic superpartners are fields which are of weight 1 classically (as the classical nonlinear σ-model Lagrangian is conformally invariant even in the non-Ricci flat case). A 1-loop correction arises in the quantum picture [4], indicating that the corresponding deformation fields must be of generalized weight (cf. [39–42]). However, such fields are excluded in unitary CFT’s, which is the reason why these deformations must be non-perturbative. One does not see this phenomenon on the level of the corresponding topological models, since these are invariant under varying the metric within the same cohomological class, and hence do not see the correction term [68]. Also, it is worth noting that in the K3-case, the β function vanishes directly for the Ricci-flat metric by the N = (4, 4) supersymmetry ( [5]), and hence the correction terms of [55] are not needed. Accordingly, we have found that the corresponding perturbative deformations exist. From the point of view of mirror symmetry, mirror-symmetric families of hypersurfaces in toric varieties were proposed by Batyrev [10]. In the case of the Fermat quintic, the exact mirror is a singular orbifold and the nonlinear σ-model deformations corresponding to the Batyrev dual family exist perturbatively by our analysis.
To obtain mirror candidates for the additional deformations, one uses crepant resolutions of the mirror orbifold (see [57] for a survey). In the K3-case, this approach seems validated by the fact that the mirror orbifolds can indeed be viewed as a limit of non-singular K3-surfaces [6]. In the 3-fold case, however, this is not so clear. The moduli spaces of Calabi–Yau 3-folds are not locally symmetric spaces. The crepant resolution is not unique even in the more restrictive category of algebraic varieties; different resolutions are merely related by flops. It is therefore not clear what the exact mirrors are of those deformations of the Fermat quintic where the deformation does not naturally occur in the Batyrev family, and resolution of singularities is needed. In other words, the McKay correspondence sees only “topological” invariants, and not the finer geometrical information present in the whole nonlinear σ model. In [21], Fan, Jarvis and Ruan constructed exactly mathematically the A-models corresponding to Landau–Ginzburg orbifolds via Gromov–Witten theory applied to the Witten equation. Using mirror symmetry conjectures, this may be used to construct mathematically candidates of topological gravity-coupled A-models as well as B-models of Calabi–Yau varieties. Gromov–Witten theory, however, is a rich source of examples where such gravity-coupled topological models exist, while a full conformally invariant (2, 2)-σ-model does not. For example, Gromov–Witten theory can produce highly non-trivial topological models for 0-dimensional orbifolds (cf. [56, 43]). Why does our analysis not contradict the calculation of Dixon [19] that the central charge does not change for deformation of any N = 2 CFT along any linear combination of ac and cc fields? Zamolodchikov [70,71] defined an invariant c which is a non-decreasing function in a renormalization group flow in a 2-dimensional QFT, and is equal to the central charge in a conformal field theory. 
It may therefore appear that by [19], all infinitesimal deformations along critical ac and cc fields in an N = 2-CFT exponentiate. However, we saw that when our obstruction occurs, additional counterterms corresponding to those of Nemeschansky–Sen are needed. This corresponds to non-perturbative corrections of the correlation function needed to fix c, and the functions [19] cannot be used directly in our case. Finally, let us briefly discuss the significance of our result to the relationship between classical and quantum geometry. One of the well known effects (and also great puzzles) of string duality (as reviewed, say, in [31]) is that a smooth path in the moduli space of conformal field theories corresponding to Calabi–Yau varieties can correspond to a discontinuous path in the classical moduli space of the Calabi– Yau varieties themselves, and more specifically that the topology of the underlying Calabi–Yau variety can change along such path. In view of our result, it is possible that this picture needs to be refined. Namely, what we perceive as a smooth path in quantum geometry may actually consist of discrete steps “tunneling” across the changes of topology. An explanation of such phenomenon could be that the moduli space of quantum geometries should itself be quantized, and can have a discrete rather than continuous spectrum.
Acknowledgments The author thanks D. Burns, I. Dolgachev, I. Frenkel, Doron Gepner, Y. Z. Huang, I. Melnikov, K. Wendland and E. Witten for explanations and discussions. Special thanks to H. Xing, who contributed many useful ideas to this project before changing his field of interest. The author is supported by grants from the NSA and the MCTP. References [1] I. Affleck and A. W. Ludwig, Universal noninteger ground state degeneracy in critical quantum systems, Phys. Rev. Lett. 67 (1991) 161–164. [2] I. Affleck and A. W. Ludwig, Exact conformal field theory results on the multichannel Kondo effect: Single Fermion Green’s function, selfenergy and resistivity, J. High Energy Phys. 11 (2000) 21. [3] L. Alvarez-Gaume, S. Coleman and P. Ginsparg, Finiteness of Ricci flat N = 2 supersymmetric σ-models. Comm. Math. Phys. 103(3) (1986) 423–430. [4] L. Alvarez-Gaume and D. Z. Freedman, K¨ ahler geometry and renormalization of supersymmetric σ-models, Phys. Rev. D 22 (1980) 846–853. [5] L. Alvarez-Gaume and P. Ginsparg, Finiteness of Ricci flat supersymmetric σ models, Comm. Math. Phys. 102 (1985) 311–326. [6] M. T. Anderson, The L2 structure of moduli spaces of Einstein metrics on 4-manifolds, Geom. Funct. Anal. 2 (1992) 29–89. [7] D. V. Anosov and A. A. Bolibruch, The Riemann–Hilbert Problem, Aspects of Mathematics, E22 (Friedr. Vieweg and Sohn, Braunschweig, 1994). [8] J. Ashkin and G. Teller, Statistics of two-dimensional lattices with four components, Phys. Rev. 64 (1943) 178–184. [9] P. S. Aspinwall and D. R .Morrison, String theory on K3 surfaces, in Mirror Symmetry, eds. B. R. Greene and S. T. Yau, Vol. II, AMS/IP Stud. Adv. Math. (Amer. Math. Soc., 1994), pp. 703–716. [10] V. V. Batyrev, Dual polyhedra and mirror symmetry for Calabi–Yau hypersurfaces in toric varieties, J. Alg. Geom. 3 (1994) 493–535. [11] R. J. Baxter, Eight-vertex model in lattice statistics, Phys. Rev. Lett. 26 (1971) 832–833. [12] P. Bouwknecht and D. 
Ridout, A note on the equality of algebraic and geometric D-brane charges in WZW, J. High Energy Phys. 0405 (2004) 029. [13] R. Cohen and D. Gepner, Interacting bosonic models and their solution, Mod. Phys. Lett. A 6 (1991) 2249. [14] M. Dine, N. Seiberg, X. G. Wen and E. Witten, Non-perturbative effects on the string world sheet, Nucl. Phys. B 278 (1986) 769–969. [15] M. Dine, N. Seiberg, X. G. Wen and E. Witten, Non-perturbative effects on the string world sheet, Nucl. Phys. B 289 (1987) 319–363. [16] L. J. Dixon, V. S. Kaplunovsky and J. Louis, On effective field theories describing (2, 2) vacua of the heterotic string, Nucl. Phys. B 329 (1990) 27–82. [17] J. Ellis, C. Gomez, D. V. Nanopoulos and M. Quiros, World sheet instanton effects on no-scale structure, Phys. Lett. B 173 (1986) 59–64. [18] J. Distler and B. Greene, Some exact results on the superpotential from Calabi–Yau compactifications, Nucl. Phys. B 309 (1988) 295–316. [19] L. Dixon, Some worldsheet properties of superstring compactifications, on orbifolds and otherwise, in Superstrings, Unified Theories and Cosmology, Proc. ICTP Summer school, 1987, ed. G. Furlan (World Scientific, 1988), pp. 67–126.
[20] M. R. Douglas and W. Taylor, The landscape of intersecting brane models, J. High Energy Phys. 0701 (2007) 031. [21] H. Fan, T. J. Jarvis and Y. Ruan, The Witten equation, mirror symmetry and quantum singularity theory, arXiv:0712.4021. [22] S. Fredenhagen and V. Schomerus, Branes on group manifolds, Gluon condensates, and twisted K-theory, J. High Energy Phys. 104 (2001) 7. [23] D. Friedan, Nonlinear models in 2+ dimensions, Ann. Phys. 163(2) (1985) 318–419. [24] D. Gepner, Space-time supersymmetry in compactified string theory and superconformal models, Nucl. Phys. B 296 (1988) 757–778. [25] D. Gepner, Exactly solvable string compactifications on manifolds of SU (N ) holonomy, Phys. Lett. B 199 (1987) 380. [26] M. Gerstenhaber, On the deformation of rings and algebras I, Ann. of Math. 79 (1964) 59–103. [27] M. Gerstenhaber, On the deformation of rings and algebras II, Ann. of Math. 84 (1966) 1–19. [28] M. Gerstenhaber, On the deformation of rings and algebras III, Ann. of Math. 88 (1968) 1–34. [29] M. Gerstenhaber, On the deformation of rings and algebras IV, Ann. of Math. 99 (1974) 257–276. [30] P. Ginsparg, Curiosities at c = 1, Nucl. Phys. B 295 (1988) 153–170. [31] B. R. Greene, String theory on Calabi–Yau manifolds, hep-th/9702155. [32] B. R. Greene and M. R. Plesser, Duality in Calabi–Yau moduli space, Nucl. Phys. B 338 (1990) 15–37. [33] B. R. Greene, C. Vafa and N. P. Warner, Calabi–Yau manifolds and renormalization group flows, Nucl. Phys. B 324 (1989) 371–390. [34] P. A. Griffin and O. F. Hernandez, Structure of irreducible SU (2) parafermion modules derived vie the Feigin–Fuchs construction, Int. J. Modern Phys. A 7 (1992) 1233–1265. [35] M. T. Grisaru, A. E. M. Van Den and D. Zanon, Four-loop β-function for the N = 1 and N = 2 supersymmetric non-linear sigma model in two dimensions, Phys. Lett. B 173 (1986) 423. [36] P. Hu and I. Kriz, Conformal field theory and elliptic cohomology, Adv. Math. 189 (2004) 325–412. [37] P. Hu and I. 
Kriz, Closed and open conformal field theories and their anomalies, Comm. Math. Phys. 254 (2005) 221–253. [38] P. Hu and I. Kriz, A mathematical formalism for the Kondo effect in WZW branes, J. Math. Phys. 48 (2007) 072301, 31 pp. [39] Y. Z. Huang, J. Lepowsky and L. Zhang, Logarithmic tensor product theory for generalized modules for a conformal vertex algebra, Part I, math/0609833. [40] Y. Z. Huang, J. Lepowsky and L. Zhang, A logarithmic generalization of tensor product theory for modules for a vertex operator algebra, Internat. J. Math. 17 (2006) 975–1012; math/0311235. [41] Y. Z. Huang and L. Kong, Full field algebras, QA/0511328. [42] Y. Z. Huang and A. Milas, Intertwining operator superalgebras and vertex tensor categories for superconformal algebras, II, Trans. Amer. Math. Soc. 354 (2002) 363– 385. [43] P. Johnson, Equivariant Gromov–Witten theory of one dimensional stacks, Ph.D. thesis, Univ. of Michigan (2009). [44] S. Kachru and E. Witten, Computing the complete massless spectrum of a Landau– Ginzburg orbifold, Nucl. Phys. B 407 (1993) 637–666.
[45] L. P. Kadanoff, Multicritical behavior of the Kosterlitz–Thouless critical point, Ann. Phys. 120 (1979) 39–71. [46] L. P. Kadanoff and A. C. Brown, Correlation functions on the critical lines of the Baxter and Ashten–Teller models, Ann. Phys. 121 (1979) 318–342. [47] L. P. Kadanoff and F. J. Wegner, Some critical properties of the eight-vertex model, Phys. Rev. D 4 (1971) 3989–3993. [48] M. Kohmoto and L. P. Kadanoff, Lower bound RSRG approximation for a large system, J. Phys. A 13 (1980) 3339–3343. [49] I. Kriz, Some notes on the N -superconformal algebra, http://www.math.lsa. umich.edu/˜ikriz. [50] I. Kriz, On spin and modularity in conformal field theory, Ann. Sci. ENS 36 (2003) 57–112. [51] J. Maldacena, G. Moore and N. Seiberg, D-brane instantons and K-theory charges, J. High Energy Phys. 111 (2004) 62. [52] G. Moore, K-theory from a physical perspective, Topology, Geometry and Quantum Field Theory, London Math. Soc. Lecture Ser., Vol. 308 (Cambridge Univ. Press, 2004), pp. 194–234. [53] G. Mussardo, G. Sotkov and M. Stanishkov, N = 2 superconformal minimal models, Int. J. Mod. Phys. A 4(5) (1989) 1135–1206. [54] W. Nahm and K. Wendland, A hiker’s guide to K3, Comm. Math. Phys. 216 (2001) 85–103. [55] D. Nemeschansky and A. Sen, Conformal invariance of supersymmetric σ-models on Calabi–Yau manifolds, Phys. Lett. B 178(4) (1986) 365–369. [56] A. Okounkov and R. Pandharipande, The equivariant Gromov–Witten theory of P1 , Ann. of Math. (2) 163 (2006) 561–605. [57] M. Reid, La Correspondence de McKay, 52eme annee, session de Novembre 1999, no. 897, Asterisque 276 (2002) 53–72. [58] V. Schomerus, Lectures on branes in curved backgrounds, Class. Quant. Grav. 19 (2002) 5781–5847. [59] G. Segal, The definition of conformal field theory, in Topology, Geometry and Quantum Field Theory, London Math. Soc. Lecture Note Ser., Vol. 308 (Cambridge University Press, 2004), pp. 421–577. [60] C. Vafa and N. 
Warner, Catastrophes and the classification of conformal theories, Phys. Lett. B 218 (1989) 51–58. [61] F. J. Wegner, Corrections to scaling laws, Phys. Rev. B 5 (1972) 4529–4536. [62] K. Wendland, A family of SCFT’s hosting all very attractive relatives to the (2)4 Gepner model, J. High Energy Phys. 0603 (2006) 102. [63] K. G. Wilson, The renormalization group: Critical phenomena and the Kondo problem, Rev. Mod. Phys. 47 (1975) 773–840. [64] K. G. Wilson, Non-Lagrangian models of current algebra, Phys. Rev. 179 (1969) 1499–1512. [65] K. G. Wilson, Operator-product expansions and anomalous dimensions in the Thirring model, Phys. Rev. D 2 (1970) 1473–1493. [66] E. Witten, Phases of N = 2 theories in two dimensions, Nucl. Phys. B 403 (1993) 159–222. [67] E. Witten, On the Landau–Ginzburg description of N = 2 minimal models, Int. J. Mod. Phys. A 9 (1994) 4783–4800. [68] E. Witten, Topological sigma models, Comm. Math. Phys. 118 (1988) 411–449. [69] A. B. Zamolodchikov, Integrable field theory from conformal field theory, Adv. Stud. Pure Math. 19 (1989) 641–674.
[70] A. B. Zamolodchikov, “Irreversibility” of the flux of the renormalization group in a 2D field theory, JETP Lett. 43 (1986) 730–732. [71] A. B. Zamolodchikov, Renormalization group and perturbation theory about fixed points in two-dimensional field theory, Sov. J. Nucl. Phys. 46 (1987) 1090–1096. [72] Y. Zhu, Modular invariance of characters of vertex operator algebras, J. Amer. Math. Soc. 9 (1996) 237–302.
March 10, J070-S0129055X10003928
2010 10:14 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 22, No. 2 (2010) 193–206 © World Scientific Publishing Company DOI: 10.1142/S0129055X10003928
SPATIAL GROWTH OF FUNDAMENTAL SOLUTIONS FOR CERTAIN PERTURBATIONS OF THE HARMONIC OSCILLATOR
ARNE JENSEN∗ and KENJI YAJIMA†
∗Department of Mathematical Sciences, Aalborg University, Fr. Bajers Vej 7G, DK-9220 Aalborg Ø, Denmark
[email protected]
†Department of Mathematics, Gakushuin University, 1-5-1 Mejiro, Toshima-ku, Tokyo 171-8588, Japan
[email protected]
Received 5 June 2009
Revised 24 November 2009
We consider the fundamental solution for the Cauchy problem for perturbations of the harmonic oscillator by time dependent potentials which grow at spatial infinity slower than quadratic but faster than linear functions and whose Hessian matrices have a fixed sign. We prove that the fundamental solution at resonant times grows indefinitely at spatial infinity with an algebraic growth rate, which increases indefinitely when the growth rate of perturbations at infinity decreases from the near quadratic to the near linear ones.
Keywords: Fundamental solution; Schrödinger equation; harmonic oscillator.
Mathematics Subject Classification 2010: 35A08, 35B10, 35J10, 81Q20
1. Introduction

We consider $d$-dimensional time dependent Schrödinger equations
\[ i\frac{\partial u}{\partial t} = \left(-\frac{1}{2}\Delta + V(t,x)\right)u(t,x), \qquad (t,x) \in \mathbb{R}^1 \times \mathbb{R}^d. \tag{1} \]
We assume throughout this paper that $V(t,x)$ is smooth with respect to the $x$ variables, and $V(t,x)$ and its derivatives $\partial_x^{\alpha}V(t,x)$ are jointly continuous with respect to $(t,x)$. Under the conditions to be imposed on $V(t,x)$ in what follows Eq. (1) generates a unique unitary propagator $\{U(t,s) : t, s \in \mathbb{R}\}$ in the Hilbert space $\mathcal{H} = L^2(\mathbb{R}^d)$, so that the solution in $\mathcal{H}$ of (1) with the initial condition $u(s,x) = \varphi(x) \in \mathcal{H}$
is uniquely given by $u(t) = U(t,s)\varphi$. The distribution kernel $E(t,s,x,y)$ of $U(t,s)$ is called the fundamental solution (FDS for short) of Eq. (1):
\[ U(t,s)\varphi(x) = \int E(t,s,x,y)\varphi(y)\,dy. \]
We write $E(t,x,y) = E(t,0,x,y)$. It is well known that the FDS of the free Schrödinger equation, viz. Eq. (1) with $V = 0$, is given by
\[ E_0(t,x,y) = \frac{e^{\mp i\pi d/4}\,e^{i(x-y)^2/2t}}{|2\pi t|^{d/2}}, \qquad t \gtrless 0, \tag{2} \]
and that of the harmonic oscillator, viz. Eq. (1) with $V(t,x) = x^2/2$, is given for non-resonant times $m\pi < t < (m+1)\pi$, $m \in \mathbb{Z}$, via Mehler's formula:
\[ E_h(t,x,y) = \frac{e^{-id(1+2m)\pi/4}}{|2\pi\sin t|^{d/2}}\,e^{(i/\sin t)((x^2+y^2)\cos t/2 - x\cdot y)}, \tag{3} \]
and, for resonant times $t - s = m\pi$, by
\[ E_h(m\pi,x,y) = e^{-imd\pi/2}\,\delta(x - (-1)^m y). \tag{4} \]
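Mehler's formula (3) can be sanity-checked numerically: away from resonant times the kernel should satisfy Eq. (1) with $V = x^2/2$ in the variables $(t,x)$. The following is only an illustrative sketch for $d = 1$, using central finite differences; the function names, step sizes and sample points are assumptions made for this example.

```python
import cmath
import math

def mehler_kernel(t, x, y):
    """Mehler kernel (3) for d = 1 at a non-resonant time m*pi < t < (m+1)*pi."""
    m = math.floor(t / math.pi)
    s, c = math.sin(t), math.cos(t)
    phase = cmath.exp(-1j * (1 + 2 * m) * math.pi / 4)
    amplitude = 1.0 / math.sqrt(abs(2 * math.pi * s))
    action = ((x * x + y * y) * c / 2 - x * y) / s
    return phase * amplitude * cmath.exp(1j * action)

def schrodinger_residual(t, x, y, ht=1e-5, hx=1e-3):
    """|i dE/dt - (-(1/2) d^2E/dx^2 + (x^2/2) E)| via central differences."""
    E = mehler_kernel
    dE_dt = (E(t + ht, x, y) - E(t - ht, x, y)) / (2 * ht)
    d2E_dx2 = (E(t, x + hx, y) - 2 * E(t, x, y) + E(t, x - hx, y)) / (hx * hx)
    rhs = -0.5 * d2E_dx2 + 0.5 * x * x * E(t, x, y)
    return abs(1j * dE_dt - rhs)

# One sample point in (0, pi) and one in (pi, 2*pi).
residual_first_interval = schrodinger_residual(1.0, 0.7, -0.3)
residual_second_interval = schrodinger_residual(4.0, 0.7, -0.3)
```

Both residuals are at the level of the discretization error. The check is local in $t$, so it does not test the phase jump between consecutive intervals, only that (3) solves the equation inside each one.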
Note that the FDS for the free Schrödinger equation is smooth and spatially bounded for any $t \neq 0$; for the harmonic oscillator the FDS has this property only at non-resonant times; at resonant times $t = m\pi$ singularities of the initial function $E_h(0,x,y) = \delta(x-y)$ recur at $x = (-1)^m y$, however, it is smooth and decays rapidly at spatial infinity. Actually, it vanishes outside the singular point $x = (-1)^m y$ when $t = m\pi$.

We begin with a brief review on properties of the FDS for (1) with general potentials $V(t,x)$ laying emphasis on its smoothness and boundedness with respect to the spatial variables $(x,y)$. We denote the classical Hamiltonian and Lagrangian corresponding to (1), respectively, by $H(t,x,p) = p^2/2 + V(t,x)$ and $L(t,q,v) = v^2/2 - V(t,q)$, and $(x(t,s,y,k), p(t,s,y,k))$ is the solution of the initial value problem for Hamilton's equations
\[ \dot{x}(t) = \partial_p H(t,x,p), \qquad \dot{p}(t) = -\partial_x H(t,x,p); \qquad x(s) = y, \qquad p(s) = k. \tag{5} \]
We write $(x(t,0,y,k), p(t,0,y,k)) = (x(t,y,k), p(t,y,k))$. Suppose first that $V(t,x)$ increases at most quadratically at spatial infinity in the sense that
\[ \sup_t |\partial_x^{\alpha}V(t,x)| \le C_{\alpha} \qquad \text{for all } |\alpha| \ge 2. \tag{6} \]
Then, in the seminal work [4], Fujiwara has shown that there exists a $T$ depending only on $V$ such that the following results hold for the time interval $0 \le \pm(t-s) < T$: The map $\mathbb{R}^d \ni k \mapsto x(t,s,y,k) \in \mathbb{R}^d$ is a diffeomorphism for every fixed $y \in \mathbb{R}^d$ and, therefore, there exists a unique path of (5) such that $x(s) = y$ and $x(t) = x$; if we write
\[ S(t,s,x,y) = \int_s^t \left(\dot{x}(r)^2/2 - V(r,x(r))\right)dr \]
for the action integral of the path, the FDS $E(t,s,x,y)$ has the form
\[ E(t,s,x,y) = \frac{e^{\mp i\pi d/4}}{(2\pi|t-s|)^{d/2}}\,e^{iS(t,s,x,y)}\,a(t,s,x,y), \qquad t \gtrless s, \tag{7} \]
where $a(t,s,x,y)$ is a smooth function of $(x,y)$ such that, for any $\alpha$ and $\beta$, $\partial_x^{\alpha}\partial_y^{\beta}a(t,s,x,y)$ are $C^1$ with respect to $(t,s,x,y)$ and
\[ |\partial_x^{\alpha}\partial_y^{\beta}(a(t,s,x,y) - 1)| \le C_{\alpha\beta}(t-s)^2. \tag{8} \]
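For the harmonic oscillator the diffeomorphism property of $k \mapsto x(t,s,y,k)$ can be seen concretely: the flow of (5) with $V = x^2/2$ and $s = 0$ is $x(t,y,k) = y\cos t + k\sin t$, so $\partial x/\partial k = \sin t$ is non-zero for $0 < t < \pi$ and vanishes at $t = \pi$. The sketch below checks this numerically in dimension 1 with a leapfrog integrator; the integrator, step counts and function names are choices made for this illustration, not part of the paper.

```python
import math

def flow(t, y, k, steps=4000):
    """Integrate x' = p, p' = -x from (x, p) = (y, k) at time 0 up to time t (leapfrog)."""
    dt = t / steps
    x, p = y, k
    for _ in range(steps):
        p -= 0.5 * dt * x   # half kick: p' = -dV/dx = -x
        x += dt * p         # drift:     x' = p
        p -= 0.5 * dt * x   # half kick
    return x, p

def dx_dk(t, y, k, h=1e-3):
    """Central-difference estimate of the derivative of x(t, y, k) with respect to k."""
    return (flow(t, y, k + h)[0] - flow(t, y, k - h)[0]) / (2 * h)

y0, k0 = 0.3, 0.5
x_numeric = flow(1.0, y0, k0)[0]
x_exact = y0 * math.cos(1.0) + k0 * math.sin(1.0)
jacobian_at_1 = dx_dk(1.0, y0, k0)        # close to sin(1), non-zero
jacobian_at_pi = dx_dk(math.pi, y0, k0)   # close to 0: the map k -> x degenerates
```

At $t = \pi$ every trajectory starting at $y$ arrives at $-y$ regardless of $k$, which is the classical counterpart of the recurrence of singularities in (4).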
Moreover, the semi-classical approximation for the amplitude function is valid in the sense that, as $|t-s| \to 0$,
$$\frac{a(t,s,x,y)}{(2\pi|t-s|)^{d/2}} = (2\pi)^{-d/2} \left|\det \frac{\partial}{\partial k}\, x(t,s,y,k)\right|^{-1/2} + O(|t-s|^{-(d-2)/2}), \tag{9}$$
where $k$ is the (unique) point such that $x = x(t,s,y,k)$. In particular, $E(t,s,x,y)$ is smooth and bounded with respect to the spatial variables $(x,y) \in \mathbb{R}^d \times \mathbb{R}^d$ for every $0 < |t-s| < T$ (see [9] for a generalization to the case when magnetic fields are present). For the free Schrödinger equation or for the harmonic oscillator, the relation (9) holds without the error term $O(|t-s|^{-(d-2)/2})$.

Under the condition (6), the structure (7) of the FDS in general breaks down at later times because singularities of the initial data $\delta(x)$ may recur in finite time, as the FDS of the harmonic oscillator (4) explicitly demonstrates. If $V(t,x)$ is subquadratic at spatial infinity in the sense that
$$\lim_{|x|\to\infty} \sup_{t} |\partial_x^\alpha V(t,x)| = 0, \quad |\alpha| = 2; \qquad |\partial_x^\alpha V(t,x)| \le C_\alpha \quad \text{for all } |\alpha| \ge 3, \tag{10}$$
then this recurrence of singularities does not take place, however, and the FDS is of the form (7) for any finite time ([12]). More precisely, if $V$ satisfies (10), then for any $T > 0$, there exists $R > 0$ such that, for any $t$ and $s$ with $0 < \pm(t-s) \le T$ and for any pair $(x,y) \in \mathbb{R}^d \times \mathbb{R}^d$ with $x^2 + y^2 \ge R^2$, there exists a unique path of (5) such that $x(s) = y$ and $x(t) = x$, and the FDS for $0 < \pm(t-s) \le T$ may be written in the form (7), where, for $(x,y)$ with $x^2 + y^2 \ge R^2$, $S(t,s,x,y)$ is the action integral of this path. Moreover, we have $a(t,s,x,y) \to 1$ as $x^2 + y^2 \to \infty$. In particular, $E(t,s,x,y)$ is smooth and bounded with respect to $(x,y)$ for any $t \ne s$.

On the other hand, if $d = 1$ and $V(t,x)$ does not depend on $t$, and if $V$ is convex and $V(x) \ge C|x|^{2+\varepsilon}$ for large $|x|$ for some $\varepsilon > 0$ and $C > 0$, then, under certain additional technical assumptions on the derivatives, $E(t,x,y)$ is nowhere $C^1$ with respect to $(t,x,y)$ ([10]). It is also known that, if $V$ satisfies $C_1|x|^\delta \le V(x) \le C_2|x|^\delta$ near infinity with constants $\delta > 10$ and $0 < C_1 \le C_2 < \infty$, then $E(t,0,x,y)$ is
unbounded with respect to $(x,y)$ for any $t \in \mathbb{R}$ ([13]). These results have been proven only in one dimension so far; however, it is believed that similar results hold in all dimensions.

In this way, properties of the FDS experience a sharp transition when the growth rate at spatial infinity of the potential $V(t,x)$ changes from subquadratic to superquadratic. Thus, the FDS for the borderline case, viz. perturbations of the harmonic oscillator
$$i\frac{\partial u}{\partial t} = \left(-\frac{1}{2}\Delta + \frac{1}{2}x^2 + W(t,x)\right)u(t,x), \quad (t,x) \in \mathbb{R}^1 \times \mathbb{R}^d, \tag{11}$$
where $W(t,x)$ is subquadratic in the sense that it satisfies (10) with $W$ in place of $V$, has attracted the particular interest of many authors, and the following properties of $E(t,s,x,y)$ have been established (see, e.g., [14, 5, 12, 2, 3]). We may set $s = 0$, which we will do, and we will write $E(t,x,y)$ for $E(t,0,x,y)$; $\langle x \rangle = (1+|x|^2)^{1/2}$.

(a) The structure of the FDS $E_h(t,x,y)$ at non-resonant times as stated in (3) is stable under perturbations and $E(t,x,y)$ is smooth and spatially bounded for $m\pi < t < (m+1)\pi$. However, $E(t,x,y)$ at resonant times is more sensitive to perturbations:

(b) If $W$ is sublinear, viz. $|\partial_x^\alpha W(t,x)| = o(1)$, $|\alpha| = 1$, as $|x| \to \infty$ uniformly with respect to $t$, then the recurrence of singularities at resonant times $m\pi$, $m \in \mathbb{Z}$, persists ($\mathrm{WF}_x$ denotes the wavefront set):
$$\mathrm{WF}_x E(m\pi,x,y) = \{(-1)^m (y,\xi) : \xi \in \mathbb{R}^d \setminus \{0\}\},$$
and it decays rapidly at spatial infinity, viz. for any $N$,
$$|E(m\pi,x,y)| \le C_N \langle x-y \rangle^{-N}, \quad |x-y| \ge 1. \tag{12}$$

(c) If $W$ is of linear type, viz. $|\partial_x^\alpha W(t,x)| \le C$ for $|\alpha| = 1$, singularities of $E(0,x,y)$ can propagate at resonant times. For example, if $W = a\langle x \rangle$, then with $\hat\xi = \xi/|\xi|$,
$$\mathrm{WF}_x E(m\pi,x,y) = \{(-1)^m (y + 2am\hat\xi, \xi) : \xi \in \mathbb{R}^d \setminus \{0\}\},$$
but it still decays rapidly at spatial infinity:
$$|E(m\pi,x,y)| \le C_N \langle x-y \rangle^{-N}, \quad |x-y| \ge 1. \tag{13}$$

(d) If $W$ is superlinear and satisfies the following sign condition on the Hessian matrix $\partial_x^2 W = (\partial^2 W/\partial x_j \partial x_k)$ that
$$C_1 \langle x \rangle^{-\delta} \le \partial_x^2 W(t,x) \le C_2 \langle x \rangle^{-\delta}, \quad (t,x) \in \mathbb{R}^1 \times \mathbb{R}^d \tag{14}$$
for some constants $0 < \delta < 1$ and $0 < C_1 < C_2 < \infty$ or $-\infty < C_1 < C_2 < 0$, then $E(m\pi,x,y)$, $m \in \mathbb{Z}$, is $C^\infty$ with respect to $(x,y)$, viz. singularities at resonant times $t = m\pi$ are swept away.
This paper is concerned with the properties of the FDS $E(t,x,y)$ when $t$ is at resonant times $t \in \pi\mathbb{Z}$. We show that, in the last case (d) above, $E(m\pi,x,y)$ increases indefinitely as $|x| \to \infty$ at the algebraic rate $C|x|^{d\delta/(2-2\delta)}$, exhibiting a sharp contrast to the decay result (12) or (13) for the case when $W$ is at most linearly increasing at spatial infinity. More precisely, we prove the following theorem:

Theorem 1.1. Suppose that $W(t,x)$ is subquadratic and satisfies the sign condition (14) for some $0 < \delta < 1$ and $0 < C_1 < C_2 < \infty$ or $-\infty < C_1 < C_2 < 0$. Let $m \in \mathbb{Z}$ and $y \in \mathbb{R}^d$ be fixed. Let $\chi \in C_0^\infty(\mathbb{R}^d \setminus \{0\})$ be such that $\chi(x) = 1$ for $a \le |x| \le b$, $0 < a < b < \infty$ being constants. Then there exist constants $0 < M_1 < M_2$, independent of $R \ge 1$, such that
$$M_1 R^{d\delta/(2-2\delta)} \le \left(\int_{\mathbb{R}^d} |E(m\pi,x,y)|^2\, \chi\!\left(\frac{x}{R}\right)^{2} \frac{dx}{R^d}\right)^{1/2} \le M_2 R^{d\delta/(2-2\delta)}. \tag{15}$$
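To get a feeling for the exponent in (15), the small Python sketch below tabulates $r(\delta) = d\delta/(2-2\delta)$ and compares it with the value obtained from the semiclassical scaling $|E| \sim |k|^{d\delta/2}$, $|x| \sim |k|^{1-\delta}$ used in the heuristic discussion of this section. The function names are hypothetical and chosen only for this illustration.

```python
def growth_exponent(d: int, delta: float) -> float:
    """Exponent r(delta) = d*delta/(2 - 2*delta) appearing in (15)."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return d * delta / (2.0 - 2.0 * delta)


def heuristic_exponent(d: int, delta: float) -> float:
    """Same exponent from the scaling |E| ~ |k|^(d*delta/2) and |x| ~ |k|^(1-delta)."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    power_of_k_in_E = d * delta / 2.0   # |E| ~ |k|^power_of_k_in_E
    power_of_k_in_x = 1.0 - delta       # |x| ~ |k|^power_of_k_in_x
    return power_of_k_in_E / power_of_k_in_x


if __name__ == "__main__":
    for delta in (0.1, 0.25, 0.5, 0.75, 0.9):
        print(f"d=3, delta={delta:4.2f}: r = {growth_exponent(3, delta):.4f}")
```

The exponent is increasing in $\delta$ and unbounded as $\delta \to 1$, which is the behaviour the semiclassical picture is meant to explain.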
It is interesting to note that, when $\delta$ increases from 0 to 1, the growth rate as $|x| \to \infty$ of $W(t,x)$ decreases (hence $W(t,x)$ becomes weaker), whereas that of $E(m\pi,x,y)$ as $|x-y| \to \infty$, $r(\delta) = d\delta/(2-2\delta)$, increases from 0 indefinitely to infinity. This seemingly contradictory behavior may be understood via the semiclassical picture as follows. For functions $a(x)$ and $b(x)$ on $\Omega$, $a \sim b$ means that $A_1 a(x) \le b(x) \le A_2 a(x)$, $x \in \Omega$, for constants $0 < A_1 < A_2$. At time 0 consider the ensemble $\Gamma$ of classical particles in the phase space $\mathbb{R}^d \times \mathbb{R}^d$ sitting on the linear Lagrangian manifold $\{(x,p) \in \mathbb{R}^d \times \mathbb{R}^d : x = y,\ p \in \mathbb{R}^d\}$ with uniform momentum distribution $(2\pi)^{-d/2}\,dp$. Semiclassically, this is described by the wave function $\delta(x-y) = E(0,x,y)$. After time $m\pi$, $\Gamma$ will be transported by the Hamilton flow (5) to the Lagrangian manifold $\{(x(m\pi,y,k), p(m\pi,y,k)) : k \in \mathbb{R}^d\}$. As we shall see below, we have $|p(m\pi,y,k)| \sim |k|$ and $|x(m\pi,y,k)| \sim |k|^{1-\delta}$ as $|k| \to \infty$. It follows at least semiclassically (see (9)) that
$$|E(m\pi,x,y)| \sim \left|\det \frac{\partial x}{\partial k}\right|^{-1/2} \sim |k|^{d\delta/2} \sim |x|^{d\delta/(2-2\delta)}, \quad |x| \to \infty,$$
which is consistent with (15).

Here is another remark, which clarifies that Theorem 1.1 is more or less consistent with the known results. We should note that if $\delta = 0$, then $W = c\langle x \rangle^2$, and $m\pi$ is no longer a resonant time for $V = x^2/2 + W$, and the corresponding $E(m\pi,x,y)$ is bounded as $|x-y| \to \infty$; on the other hand, if $\delta = 1$, then $W = c\langle x \rangle$ and, as in (c) above, a large portion of $E(m\pi,x,y)$ is concentrated in a bounded domain $|x-y| \le 2cm$, which may be regarded as the extreme case of $C\langle x \rangle^{d\delta/(2-2\delta)}$ as $\delta \to 1$.

We mention here that the result of the theorem was conjectured by Martinez and the second author in [7], where a similar problem is studied in the semi-classical
setting. More precisely, they consider the FDS of the semi-classical Schrödinger equation
$$ih\frac{\partial u}{\partial t} = \left(-\frac{h^2}{2}\Delta + \frac{1}{2}x^2 + h^\mu W(x)\right)u,$$
where $W(x)$ is $t$-independent and satisfies the same conditions as in this paper, (10) and (14); and they prove that the FDS at the resonant times may be written in the form
$$E(m\pi,x,y) = h^{-d(1+\nu)/2}\, a(x,y,h)\, e^{iS(x,y)/h}, \quad \nu = \mu/(1-\delta), \tag{16}$$
where $S(x,y)$ is the action integral of the path of (5) connecting $x(0) = y$ and $x(m\pi) = x$, and $a(x,y,h)$ satisfies $C^{-1} \le |a(x,y,h)| \le C$ uniformly with respect to $h$ on every compact subset $K$ of $\mathbb{R}^{2d} \setminus \{(x,(-1)^m x) : x \in \mathbb{R}^d\}$. Thus, $E(m\pi,x,y)$ has the extra growing factor $h^{-d\nu/2}$ as $h \to 0$ compared to $E(t,x,y)$ at non-resonant times $t \ne m\pi$, and they remark that, if their arguments applied to non-smooth potentials, (16) would imply the estimate (15) of Theorem 1.1 for the homogeneous potential $W(x) = C|x|^{2-\delta}$.

It is well known that the boundedness of $E(t,s,x,y)$ with respect to $(x,y)$ implies the so-called $L^p$-$L^q$ estimates of the propagator $U(t,s)$ (hence, also finite time Strichartz estimates). There are examples of Schrödinger equations with smooth coefficients which exhibit a breakdown of the estimates, e.g., the harmonic oscillator at resonant times. However, to the best knowledge of the authors, in all known examples they are broken because of local singularities, and Theorem 1.1 is the first example in which they are broken because of the growth at spatial infinity of the FDS (see [8] for $L^p$-$L^q$ estimates for potentials which are singular but decay at infinity). For the micro-local smoothing estimate which may be applied for proving the smoothness of the FDS, see for example [1] or [6].

The rest of the paper is devoted to the proof of this theorem. We prove it only in the $m = 1$ case. The proof for the other cases is similar. In Sec. 2, we recall several known facts, which will be used in Sec. 3, where the theorem is proved. We often omit some of the variables of functions, if no confusion is to be feared. For functions $f$ of several variables, we write $f \in C^k(x)$ or $f \in C^k(t,x)$ etc., if $f$ is of class $C^k$ with respect to $x$ or $(t,x)$, etc.

2. Preliminaries

We first recall some results on the Hamiltonian flow generated by (5) when $V(t,x) = x^2/2 + W(t,x)$ and $W$ is subquadratic. We set the initial time $s = 0$ and omit the variable $s$.
The solutions $(x(t), p(t)) = (x(t,y,k), p(t,y,k))$ of (5) satisfy the integral equations
$$x(t) = y\cos t + k\sin t - \int_0^t \sin(t-s)\,\partial_x W(s,x(s))\,ds, \tag{17}$$
$$p(t) = -y\sin t + k\cos t - \int_0^t \cos(t-s)\,\partial_x W(s,x(s))\,ds. \tag{18}$$
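The integral equations (17) and (18) can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: it takes $d = 1$, the time-independent sample perturbation $W(x) = \langle x \rangle^{2-\delta}$ with $\delta = 1/2$ (chosen here for convenience, not taken from the paper), integrates $\dot x = p$, $\dot p = -x - W'(x)$ with a classical RK4 scheme, and compares the endpoint with the right-hand sides of (17) and (18) evaluated by the trapezoid rule. All function names are hypothetical.

```python
import math

DELTA = 0.5  # illustrative value of the exponent delta in (14)


def dW(x):
    """W'(x) for the sample perturbation W(x) = (1 + x^2)^((2 - DELTA)/2), d = 1."""
    return (2.0 - DELTA) * x * (1.0 + x * x) ** (-DELTA / 2.0)


def rk4_flow(y, k, t_end, n):
    """Integrate x' = p, p' = -x - W'(x) from (y, k); return all x samples and the final p."""
    h = t_end / n
    x, p = y, k
    xs = [x]
    for _ in range(n):
        k1x, k1p = p, -x - dW(x)
        x2, p2 = x + 0.5 * h * k1x, p + 0.5 * h * k1p
        k2x, k2p = p2, -x2 - dW(x2)
        x3, p3 = x + 0.5 * h * k2x, p + 0.5 * h * k2p
        k3x, k3p = p3, -x3 - dW(x3)
        x4, p4 = x + h * k3x, p + h * k3p
        k4x, k4p = p4, -x4 - dW(x4)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        p += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        xs.append(x)
    return xs, p


def trapezoid(values, h):
    """Composite trapezoid rule for equally spaced samples."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


def integral_equation_residuals(y, k, t_end, n=4000):
    """Return |x(t) - RHS of (17)| and |p(t) - RHS of (18)| at t = t_end."""
    xs, p_end = rk4_flow(y, k, t_end, n)
    h = t_end / n
    forces = [dW(x) for x in xs]
    sin_part = trapezoid([math.sin(t_end - j * h) * forces[j] for j in range(n + 1)], h)
    cos_part = trapezoid([math.cos(t_end - j * h) * forces[j] for j in range(n + 1)], h)
    x_rhs = y * math.cos(t_end) + k * math.sin(t_end) - sin_part
    p_rhs = -y * math.sin(t_end) + k * math.cos(t_end) - cos_part
    return abs(xs[-1] - x_rhs), abs(p_end - p_rhs)


if __name__ == "__main__":
    print(integral_equation_residuals(0.3, 5.0, 2.0))
```

Both residuals should be small, limited by the trapezoid rule rather than by the ODE solver.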
Since the subquadratic condition implies $|\dot x(t)| + |\dot p(t)| \le C(1 + |x(t)| + |p(t)|)$ for a constant $C > 0$ and, hence,
$$e^{-C|t|}(1 + |y| + |k|) \le (1 + |x(t)| + |p(t)|) \le e^{C|t|}(1 + |y| + |k|), \tag{19}$$
it follows, as $y^2 + k^2 \to \infty$, uniformly with respect to $t$ in compact intervals, that
$$|x(t) - (y\cos t + k\sin t)| = o(|y| + |k|), \tag{20}$$
$$|p(t) - (-y\sin t + k\cos t)| = o(|y| + |k|). \tag{21}$$
We fix $m \in \mathbb{Z}$, $m \ne 0$, and $0 < \varepsilon < \pi/2$, and consider $t$ in the interval $I = [m\pi - \varepsilon, m\pi + \varepsilon]$. Then, the following results have been proved in Lemmas 2.3, 2.5 and 3.5, respectively, of [12] by using the integral equations (17) and (18).

(i) For any $\alpha$ and $\beta$, as $R^2 = y^2 + k^2 \to \infty$,
$$\partial_y^\alpha \partial_k^\beta (\partial_y x(t) - (\cos t)\mathbf{1}) \to 0, \quad \partial_y^\alpha \partial_k^\beta (\partial_k x(t) - (\sin t)\mathbf{1}) \to 0, \tag{22}$$
$$\partial_y^\alpha \partial_k^\beta (\partial_y p(t) + (\sin t)\mathbf{1}) \to 0, \quad \partial_y^\alpha \partial_k^\beta (\partial_k p(t) - (\cos t)\mathbf{1}) \to 0, \tag{23}$$
uniformly with respect to $t \in I$. Here $\mathbf{1}$ is the $d \times d$ identity matrix.

(ii) Let $R > 0$ be sufficiently large. Then, for any $t \in I$ and $(\xi,y) \in \mathbb{R}^{2d}$ with $\xi^2 + y^2 \ge R^2$, there exists a unique $k \in \mathbb{R}^d$ such that the solution $(x(s,y,k), p(s,y,k))$ of (5) satisfies
$$p(t,y,k) = \xi. \tag{24}$$

(iii) Let $R$ be as in (ii) and define $\varphi(t,\xi,y)$ for $t \in I$ and $\xi^2 + y^2 > R^2$ by
$$\varphi(t,\xi,y) = x(t,y,k)\cdot\xi - \int_0^t L(s, x(s,y,k), \dot x(s,y,k))\,ds,$$
where $k$ is determined by (24). Then $\varphi \in C^\infty(\xi,y)$ and $\partial_\xi^\alpha \partial_y^\beta \varphi \in C^1(t,\xi,y)$ for any $\alpha$, $\beta$; $\varphi$ is a generating function of the canonical map $(p(t,y,k), y) \mapsto (x(t,y,k), k)$:
$$(\partial_\xi \varphi)(t, p(t,y,k), y) = x(t,y,k), \quad (\partial_y \varphi)(t, p(t,y,k), y) = k, \tag{25}$$
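and $\varphi$ satisfies the Hamilton–Jacobi equation $\partial_t \varphi = \xi^2/2 + V(t, \partial_\xi \varphi)$. Moreover, as $\xi^2 + y^2 \to \infty$, $\partial_\xi^\alpha \partial_y^\beta \varphi$ approaches the corresponding function of the harmonic oscillator whenever $|\alpha + \beta| \ge 2$:
$$\sup_{t \in I} \left|\partial_\xi^\alpha \partial_y^\beta \left(\varphi(t,\xi,y) - \frac{(\xi^2 + y^2)\sin t + 2\xi\cdot y}{2\cos t}\right)\right| \to 0.$$
Furthermore, we have the following representation formula of the FDS [12, Theorem 1.3(2)].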
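Theorem 2.1. Let $W$ be subquadratic. Then, for $t \in I = [m\pi - \varepsilon, m\pi + \varepsilon]$, the FDS $E(t,x,y)$ of (11) may be written in the following form
$$E(t,x,y) = \lim_{\varepsilon \downarrow 0} \int_{\mathbb{R}^d} \frac{i^{-(m+1)d}\, e^{ix\cdot\xi - i\tilde\varphi(t,\xi,y) - \varepsilon\xi^2/2}}{(2\pi)^d |\cos t|^{d/2}}\, a(t,\xi,y)\,d\xi, \tag{26}$$
where the integral converges in the $C^\infty$ topology with respect to $(x,y)$ and the functions $\tilde\varphi$ and $a$ satisfy the following properties:

(a) $\tilde\varphi \in C^\infty(\xi,y)$, $\partial_\xi^\alpha \partial_y^\beta \tilde\varphi \in C^1(t,\xi,y)$ for any $\alpha$, $\beta$ and
$$\tilde\varphi(t,\xi,y) = \varphi(t,\xi,y) \quad \text{for } t \in I, \quad \xi^2 + y^2 \ge R^2.$$

(b) $a \in C^\infty(\xi,y)$, $\partial_\xi^\alpha \partial_y^\beta a \in C^1(t,\xi,y)$ for any $\alpha$, $\beta$ and
$$\lim_{\xi^2 + y^2 \to \infty}\, \sup_{t \in I} |\partial_\xi^\alpha \partial_y^\beta (a(t,\xi,y) - 1)| = 0$$
for any $\alpha$ and $\beta$.

We call integrals of the form (26) oscillatory integrals and often write them simply as
$$\int_{\mathbb{R}^d} \frac{i^{-(m+1)d}\, e^{ix\cdot\xi - i\tilde\varphi(t,\xi,y)}}{(2\pi)^d |\cos t|^{d/2}}\, a(t,\xi,y)\,d\xi.$$
When $W$ satisfies the sign condition (14), the phase function $\varphi(\pi,\xi,y)$ satisfies the following properties which are essential for the proof of the theorem. From now on we let $m = 1$.

Proposition 2.2. Let $W$ be subquadratic and satisfy (14). Let $L > 0$. Then, there exist constants $C > 0$ and $R > 0$ depending only on $L$ such that for every $|\xi| \ge R$ and $|y| \le L$:
$$C_1 |\xi|^{1-\delta} \le |\partial_\xi \varphi(\pi,\xi,y)| \le C_2 |\xi|^{1-\delta}, \tag{27}$$
$$|\partial_\xi^\alpha \varphi(\pi,\xi,y)| \le C|\xi|^{-\delta}, \quad |\alpha| \ge 2. \tag{28}$$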
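The two-sided bound (27) can be probed numerically in a toy case. Since $(\partial_\xi \varphi)(\pi, p(\pi,y,k), y) = x(\pi,y,k)$ by (25), the bound amounts to $|x(\pi,y,k)| \sim |k|^{1-\delta}$ together with $|p(\pi,y,k)| \sim |k|$. The sketch below is a numerical illustration only, not a proof: it assumes $d = 1$ and the sample perturbation $W(x) = \langle x \rangle^{2-\delta}$ with $\delta = 1/2$ (an illustrative choice whose second derivative lies between $(2-\delta)(1-\delta)\langle x \rangle^{-\delta}$ and $(2-\delta)\langle x \rangle^{-\delta}$, so that (14) holds), and integrates the flow up to time $\pi$ with RK4. The function names are hypothetical.

```python
import math

DELTA = 0.5  # illustrative value of the exponent delta in (14)


def dW(x):
    """W'(x) for the sample perturbation W(x) = (1 + x^2)^((2 - DELTA)/2), d = 1."""
    return (2.0 - DELTA) * x * (1.0 + x * x) ** (-DELTA / 2.0)


def flow_to_pi(y, k, n=20000):
    """RK4 for x' = p, p' = -x - W'(x) on [0, pi]; returns (x(pi), p(pi))."""
    h = math.pi / n
    x, p = y, k
    for _ in range(n):
        k1x, k1p = p, -x - dW(x)
        x2, p2 = x + 0.5 * h * k1x, p + 0.5 * h * k1p
        k2x, k2p = p2, -x2 - dW(x2)
        x3, p3 = x + 0.5 * h * k2x, p + 0.5 * h * k2p
        k3x, k3p = p3, -x3 - dW(x3)
        x4, p4 = x + h * k3x, p + h * k3p
        k4x, k4p = p4, -x4 - dW(x4)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        p += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return x, p


def scaled_endpoint(y, k):
    """Return |x(pi)| / |k|^(1 - DELTA) and |p(pi)| / |k|."""
    x_pi, p_pi = flow_to_pi(y, k)
    return abs(x_pi) / abs(k) ** (1.0 - DELTA), abs(p_pi) / abs(k)


if __name__ == "__main__":
    for k in (1e2, 1e3, 1e4):
        print(k, scaled_endpoint(0.0, k))
```

If the scaling holds, both printed ratios should stay within fixed positive bounds as $k$ grows, while $|x(\pi)|$ itself grows much more slowly than $|k|$.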
Proof. The upper bound in estimate (27) is obvious from (25), (17) and (20); the lower bound is proved in [11, pp. 61–63] for time independent perturbations $W(t,x) = W(x)$, and the proof applies to the time dependent case as well, if we use [12, Lemmas 2.1 and 2.2] instead of [11, Lemmas 4.2 and 4.3]. From [11, pp. 61–63], we also have for $|\xi| \ge R$ and $k$ such that $p(\pi,y,k) = \xi$
$$\partial_k x(\pi,y,k) \sim |\xi|^{-\delta}. \tag{29}$$
Differentiating $(\partial_\xi \varphi)(\pi, p(\pi,y,k), y) = x(\pi,y,k)$ with respect to $k$, we have
$$(\partial_\xi^2 \varphi)(\pi,\xi,y)\,\partial_k p(\pi,y,k) = \partial_k x(\pi,y,k) \tag{30}$$
and, applying the second result of (23) and (29), we obtain (28) for the case $|\alpha| = 2$. For higher derivatives, we further differentiate (30) and apply (22) and (23) in addition to (29). Estimate (28) follows inductively.

Lemma 2.3. Let $L > 0$ and $0 < a < b < \infty$ be fixed arbitrarily and let $\chi \in C_0^\infty(\mathbb{R}^d)$ be supported by $\{x \in \mathbb{R}^d : a \le |x| \le b\}$. Then, there exist $R_0 > 0$ and $C_0 > 0$, such that for all $R > R_0$ and $|y| \le L$
$$\frac{1}{R^d}\int_{\mathbb{R}^d} |\chi(\partial_\xi \varphi(\pi,\xi,y)/R)|^2\,d\xi \le C_0 R^{d\delta/(1-\delta)}. \tag{31}$$
If $\chi(x) > \delta > 0$ for $a_1 < |x| < b_1$, $a < a_1 < b_1 < b$, then we also have the lower bound:
$$C_1 R^{d\delta/(1-\delta)} \le \frac{1}{R^d}\int_{\mathbb{R}^d} |\chi(\partial_\xi \varphi(\pi,\xi,y)/R)|^2\,d\xi. \tag{32}$$

Proof. For sufficiently large $R > 0$, we have by virtue of (21) that $1/2 \le |p(\pi,y,k)|/|k| \le 2$ for $|y| \le L$ and $|k| \ge R$, and (27) implies $C_1 |k|^{1-\delta} \le |x(\pi,y,k)| \le C_2 |k|^{1-\delta}$. It follows that, if $\chi(x(\pi,y,k)/R) \ne 0$, then $aR/C_2 \le |k|^{1-\delta} \le bR/C_1$. Hence, whenever $\chi(\partial_\xi \varphi(\pi,\xi,y)/R) \ne 0$, we have $D_1 R^{1/(1-\delta)} \le |\xi| \le D_2 R^{1/(1-\delta)}$ and
$$\frac{1}{R^d}\int |\chi(\partial_\xi \varphi(\pi,\xi,y)/R)|^2\,d\xi \le C R^{d\delta/(1-\delta)}.$$
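The exponent in this bound comes from a simple volume count, spelled out below for convenience. The constant $c_d$ (the volume of the unit ball in $\mathbb{R}^d$) is notation introduced only for this remark.

```latex
% The integrand is bounded by \sup|\chi|^2 and vanishes unless
% D_1 R^{1/(1-\delta)} \le |\xi| \le D_2 R^{1/(1-\delta)},
% a set of volume at most c_d D_2^d R^{d/(1-\delta)}.
\[
\frac{1}{R^d}\int_{\mathbb{R}^d} |\chi(\partial_\xi\varphi(\pi,\xi,y)/R)|^2\,d\xi
\le \frac{\sup|\chi|^2\, c_d\, D_2^{d}\, R^{d/(1-\delta)}}{R^{d}}
= C\,R^{\,d/(1-\delta)-d}
= C\,R^{\,d\delta/(1-\delta)},
\]
% since d/(1-\delta) - d = d\,(1-(1-\delta))/(1-\delta) = d\delta/(1-\delta).
```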
A similar argument yields the lower bound in the second case. We omit the obvious details.

3. Proof of Theorem 1.1

Before starting the proof we remark the following: If we were able to prove the faster decay as $|\xi| \to \infty$ for the higher derivatives $\partial_\xi^\alpha \varphi$, say,
$$|\partial_\xi^\alpha \varphi(\pi,\xi,y)| \le C|\xi|^{-\delta-|\alpha|}, \tag{33}$$
then the standard stationary phase method combined with a change of scale would yield the pointwise estimate
$$|E(m\pi,x,y)| \sim C|x|^{d\delta/(2-2\delta)} \quad \text{as } |x| \to \infty. \tag{34}$$
However, (33) does not seem to hold in general, and this requires a weaker formulation of the theorem and the somewhat more complicated proof given below.
We need to estimate
$$I(R) \equiv \frac{1}{R^d}\int_{\mathbb{R}^d} \left|\chi\!\left(\frac{x}{R}\right) E(\pi,x,y)\right|^2 dx. \tag{35}$$
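In what follows, we omit the variable $\pi$ and the domain of integration $\mathbb{R}^d$ from integral signs, and write $\tilde\varphi$ as $\varphi$. Since $y$ is fixed in the following computation, we sometimes omit the variable $y$ as well. This should not cause any confusion. Then, by virtue of (26), (35) may be written as an oscillatory integral
$$\begin{aligned}
I(R) &= \lim_{\varepsilon \downarrow 0} \frac{1}{(2\pi)^{2d} R^d} \int \chi\!\left(\frac{x}{R}\right)^{2} \left|\int e^{i(x\cdot\xi - \varphi(\xi)) - \varepsilon\xi^2/2}\, a(\xi)\,d\xi\right|^2 dx \\
&= \lim_{\varepsilon \downarrow 0} \frac{1}{(2\pi)^{2d} R^d} \iiint e^{ix\cdot(\xi-\eta) + i(\varphi(\eta) - \varphi(\xi)) - \varepsilon(\xi^2 + \eta^2)/2}\, a(\xi)\overline{a(\eta)}\, \chi\!\left(\frac{x}{R}\right)^{2} d\xi\,d\eta\,dx \\
&= \lim_{\varepsilon \downarrow 0} \frac{1}{(2\pi)^{d}} \iint \hat\chi_2(R(\eta - \xi))\, e^{i(\varphi(\eta) - \varphi(\xi)) - \varepsilon(\xi^2 + \eta^2)/2}\, a(\xi)\overline{a(\eta)}\,d\xi\,d\eta,
\end{aligned} \tag{36}$$
where we wrote $\chi_2(x) = \chi(x)^2$ and we defined the Fourier transform by
$$\hat f(\xi) = (\mathcal{F}f)(\xi) = \frac{1}{(2\pi)^d}\int e^{-ix\cdot\xi} f(x)\,dx.$$
In what follows we omit the limit sign $\lim_{\varepsilon \downarrow 0}$ and the damping factors which arise from $\exp(-\varepsilon(\xi^2 + \eta^2)/2)$. In the right-hand side of (36), we change variables $\eta$ to $\zeta = \eta - \xi$ and expand by Taylor's formula as
$$a(\xi + \zeta) = \sum_{|\alpha| \le N} \frac{1}{\alpha!}\,\zeta^\alpha a^{(\alpha)}(\xi) + \sum_{|\alpha| = N+1} \frac{1}{\alpha!}\,\zeta^\alpha b_\alpha(\xi,\zeta)$$
in the resulting formula, where $a^{(\alpha)} = \partial_\xi^\alpha a$ and where we wrote
$$b_\alpha(\xi,\zeta) = (N+1)\int_0^1 (1-\theta)^N a^{(\alpha)}(\xi + \theta\zeta)\,d\theta.$$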
where BN (R) is the sum over α with |α| = N + 1 of constants times χ ˆ2 (Rζ)ζ α ei(ϕ(ξ+ζ)−ϕ(ξ)) a(ξ)bα (ξ, ζ)dξdζ =
e
−iϕ(ξ)
a(ξ)
e
iϕ(ξ+ζ)
α
χ ˆ2 (Rζ)ζ bα (ξ, ζ)dζ dξ.
(37)
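We take $\ell \in \mathbb{N}$ such that $\ell(1-\delta) > d$ and apply integration by parts $\ell$ times to the inner integral, which we denote by $I(\xi,R)$, by using the identity
$$\frac{1 - i\,\partial_\zeta \varphi(\xi+\zeta)\cdot\partial_\zeta}{1 + (\partial_\zeta \varphi(\xi+\zeta))^2}\, e^{i\varphi(\xi+\zeta)} = e^{i\varphi(\xi+\zeta)}.$$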
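For completeness, here is a short verification of this identity and of the resulting formula for the transpose of the operator on the left. The shorthand $g = \partial_\zeta \varphi(\xi+\zeta)$ and the symbols $P$, $u$, $f$ are introduced only for this remark.

```latex
% Write g = \partial_\zeta\varphi(\xi+\zeta), u = e^{i\varphi(\xi+\zeta)} and
% P = (1+g^2)^{-1}(1 - i\,g\cdot\partial_\zeta).
\begin{align*}
\partial_\zeta u = i g\,u
\;&\Longrightarrow\;
(1 - i\,g\cdot\partial_\zeta)\,u = (1+g^2)\,u
\;\Longrightarrow\; P u = u,\\
\int (P u)\,f\,d\zeta
&= \int u\,\Bigl[\frac{f}{1+g^2} + i\,\mathrm{div}_\zeta\Bigl(\frac{g f}{1+g^2}\Bigr)\Bigr]\,d\zeta
\end{align*}
% for rapidly decreasing f, by one integration by parts. Expanding the divergence
% by the product rule gives the three terms of the transpose:
%   f/(1+g^2) + i\,\mathrm{div}_\zeta\bigl(g/(1+g^2)\bigr) f + i\,g\cdot\partial_\zeta f/(1+g^2).
```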
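Thus, if we write $M$ for the transpose of the differential operator on the left, we have
$$I(\xi,R) = \int e^{i\varphi(\xi+\zeta)}\, M^\ell\bigl(\hat\chi_2(R\zeta)\,\zeta^\alpha\, \overline{b_\alpha(\xi,\zeta)}\bigr)\,d\zeta. \tag{38}$$
Since $M$ has the form
$$M = \frac{1}{1 + (\partial_\zeta \varphi)^2} + i\,\mathrm{div}_\zeta\!\left(\frac{\partial_\zeta \varphi}{1 + (\partial_\zeta \varphi)^2}\right) + \frac{i\,\partial_\zeta \varphi\cdot\partial_\zeta}{1 + (\partial_\zeta \varphi)^2},$$
$\partial_\zeta^\alpha \varphi$ are bounded for $|\alpha| \ge 2$ and since
$$C^{-1}\langle \xi+\zeta \rangle^{2(1-\delta)} \le 1 + (\partial_\xi \varphi(\xi+\zeta))^2 \le C\langle \xi+\zeta \rangle^{2(1-\delta)}$$
by virtue of (27), $M^\ell$ is an $\ell$th order differential operator with respect to $\partial_\zeta$ whose coefficients are bounded by $C\langle \xi+\zeta \rangle^{-\ell(1-\delta)}$. Hence
$$|I(\xi,R)| \le C \sum_{|\beta+\gamma+\delta| \le \ell} \int \langle \xi+\zeta \rangle^{-\ell(1-\delta)} R^{|\beta|}\, |(\partial_\zeta^\beta \hat\chi_2)(R\zeta)|\, |\zeta^{\alpha-\gamma}|\, |\partial_\zeta^\delta b_\alpha(\xi,\zeta)|\,d\zeta.$$
Since $\hat\chi_2(\zeta)$ is rapidly decreasing and $\partial_\zeta^\delta b_\alpha(\xi,\zeta)$ are bounded, the integrand is bounded for any $L > 0$ by a constant times
$$\langle \xi \rangle^{-\ell(1-\delta)} \langle \zeta \rangle^{\ell(1-\delta)} \langle R\zeta \rangle^{-L} R^{|\beta|} |\zeta|^{N+1-|\gamma|}.$$
It follows, by changing variables $\zeta$ to $\zeta/R$, and by taking $L$ large enough, that for $R > 1$
$$|I(\xi,R)| \le C R^{-N-1-d+\ell} \langle \xi \rangle^{-\ell(1-\delta)} \int \langle \zeta/R \rangle^{\ell(1-\delta)} \langle \zeta \rangle^{N+1-L}\,d\zeta \le C R^{-N-1-d+\ell} \langle \xi \rangle^{-\ell(1-\delta)}.$$
Thus, for $\ell$ such that $\ell(1-\delta) > d$ we may estimate the remainder $B_N(R)$ in (37) by
$$|B_N(R)| \le \frac{C}{R^{N+d+1-\ell}} \int |a(\xi)|\, \langle \xi \rangle^{-\ell(1-\delta)}\,d\xi \le \frac{C}{R^{N+d+1-\ell}},$$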
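and we may ignore $B_N(R)$ by taking $N$ large enough. We have next to deal with the first terms in (37), which are the sum over $|\alpha| \le N$ of
$$A_\alpha = \frac{1}{(2\pi)^d \alpha!} \int \left(\int \hat\chi_2(R\zeta)\,\zeta^\alpha e^{i(\varphi(\xi+\zeta) - \varphi(\xi))}\,d\zeta\right) a(\xi)\overline{a^{(\alpha)}(\xi)}\,d\xi. \tag{39}$$
By using Taylor's formula, we write
$$e^{i(\varphi(\xi+\zeta) - \varphi(\xi))} = e^{i\zeta\cdot\partial_\xi \varphi(\xi)}\, e^{i\Psi(\xi,\zeta)}, \quad \Psi(\xi,\zeta) = \zeta\cdot\left(\int_0^1 (1-\theta)\,\frac{\partial^2 \varphi}{\partial \xi^2}(\xi + \theta\zeta)\,d\theta\right)\zeta,$$
and expand $e^{i\Psi}$ via Taylor's formula:
$$e^{i(\varphi(\xi+\zeta) - \varphi(\xi))} = e^{i\zeta\cdot\partial_\xi \varphi(\xi)} \left(\sum_{m=0}^{N} \frac{(i\Psi)^m}{m!} + \frac{(i\Psi)^{N+1}}{N!} \int_0^1 (1-\theta)^N e^{i\theta\Psi}\,d\theta\right),$$
where we take $N$ large enough so that $(N+1)\delta > d$. We then insert this into the right-hand side of (39). Note that $|\Psi(\xi,\zeta)| \le C\langle \xi \rangle^{-\delta} \langle \zeta \rangle^{\delta} |\zeta|^2$ by virtue of (28). It follows that the contribution to $A_\alpha$ of the term containing $(i\Psi)^{N+1}$ is bounded, by taking $L$ such that $L > (2+\delta)(N+1) + |\alpha| + d$, by
$$\begin{aligned}
C_{LN} \iint \langle R\zeta \rangle^{-L} |\zeta|^{2(N+1)+|\alpha|} \langle \xi \rangle^{-(N+1)\delta} \langle \zeta \rangle^{(N+1)\delta}\,d\xi\,d\zeta
&\le C_{LN} R^{-2(N+1)-|\alpha|-d} \int \langle \zeta \rangle^{-L+(N+1)\delta} |\zeta|^{2(N+1)+|\alpha|}\,d\zeta \cdot \int \langle \xi \rangle^{-(N+1)\delta}\,d\xi \\
&\le C R^{-(d+|\alpha|+2N+2)}.
\end{aligned}$$
Thus, we may again ignore this term and we are left for $A_\alpha$ with
$$\frac{1}{(2\pi)^d \alpha!} \sum_{m=0}^{N} \frac{1}{m!} \int \left(\int e^{i\zeta\cdot\partial_\xi \varphi(\xi)}\, \hat\chi_2(R\zeta)\,\zeta^\alpha (i\Psi(\xi,\zeta))^m\,d\zeta\right) a(\xi)\overline{a^{(\alpha)}(\xi)}\,d\xi.$$
Here we repeat the same argument as in the first step to the inner integral. We expand $\Psi(\xi,\zeta)$ further by Taylor's formula:
$$\Psi(\xi,\zeta) = \sum_{2 \le |\alpha| \le N} \frac{\zeta^\alpha}{\alpha!}\,\varphi^{(\alpha)}(\xi) + L_N(\xi,\zeta), \quad L_N(\xi,\zeta) = \sum_{|\alpha| = N+1} C_\alpha \zeta^\alpha \int_0^1 (1-\theta)^N \varphi^{(\alpha)}(\xi + \theta\zeta)\,d\theta$$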
and expand the product $\Psi(\xi,\zeta)^m$. We estimate the contribution to $A_\alpha$ of the terms which contain $L_N$, by performing integration by parts $\ell$ times, $\ell(1-\delta) > d$, by using the identity
$$\frac{1 - i\,\partial_\xi \varphi(\xi)\cdot\partial_\zeta}{1 + |\partial_\xi \varphi(\xi)|^2}\, e^{i\zeta\cdot\partial_\xi \varphi(\xi)} = e^{i\zeta\cdot\partial_\xi \varphi(\xi)}$$
and the estimate (27). This yields the bound $CR^{-2(N+1)-d+\ell}$ for the contribution and we ignore them. The rest is a sum of the terms of the form
$$C_{\beta_1\cdots\beta_m}\, \zeta^\beta \varphi^{(\beta_1)}(\xi) \cdots \varphi^{(\beta_m)}(\xi), \quad \beta = \beta_1 + \cdots + \beta_m,$$
and their contributions to $A_\alpha$ are given by constants times
$$\iint e^{i\zeta\cdot\partial_\xi \varphi(\xi)}\, \hat\chi_2(R\zeta)\,\zeta^{\alpha+\beta} \varphi^{(\beta_1)}(\xi) \cdots \varphi^{(\beta_m)}(\xi)\, a(\xi)\overline{a^{(\alpha)}(\xi)}\,d\xi\,d\zeta = \frac{1}{(iR)^{|\alpha|+|\beta|} R^d} \int (\partial_\zeta^{\alpha+\beta} \chi_2)(\partial_\xi \varphi(\xi)/R)\, \varphi^{(\beta_1)}(\xi) \cdots \varphi^{(\beta_m)}(\xi)\, a(\xi)\overline{a^{(\alpha)}(\xi)}\,d\xi.$$
Here $|\beta_1|, \ldots, |\beta_m| \ge 2$ and $|\varphi^{(\beta_1)}(\xi) \cdots \varphi^{(\beta_m)}(\xi)| \le C\langle \xi \rangle^{-m\delta}$ by (28), and this integral is bounded in modulus by
$$\frac{C}{R^{|\alpha|+|\beta|} R^d} \int |(\partial_\zeta^{\alpha+\beta} \chi_2)(\partial_\xi \varphi(\xi)/R)|\, \langle \xi \rangle^{-m\delta}\,d\xi \le C R^{d\delta/(1-\delta)} R^{-|\alpha+\beta|} R^{-m\delta/(1-\delta)},$$
by virtue of Lemma 2.3. Thus the main contribution to $I(R)$ is given by the term with $m = 0$ and $\alpha = 0$:
$$\frac{1}{(2\pi)^d}\, \frac{1}{R^d} \int \chi(\partial_\xi \varphi(\xi)/R)^2\, |a(\xi)|^2\,d\xi.$$
Since $a(\xi) \to 1$ as $|\xi| \to \infty$, this is comparable with $CR^{d\delta/(1-\delta)}$ for large $R$ by virtue of Lemma 2.3. The theorem follows.

Acknowledgements

The first author was partially supported by the Danish Natural Science Research Council grant "Mathematical Physics". The second author was supported by JSPS grant in aid for scientific research No. 18340041. This work was done while the second author was visiting the Department of Mathematical Sciences of Aalborg University. He acknowledges the hospitality of the department.

References

[1] W. Craig, T. Kappeler and W. Strauss, Microlocal dispersive smoothing for the Schrödinger equation, Comm. Pure Appl. Math. 48 (1995) 769–860.
[2] S. Doi, Dispersion of singularities of solutions for Schrödinger equations, Comm. Math. Phys. 250 (2004) 473–505.
[3] S. Doi, Smoothness of solutions for Schrödinger equations with unbounded potentials, Publ. RIMS Kyoto Univ. 41 (2005) 175–221.
[4] D. Fujiwara, Remarks on the convergence of the Feynman path integrals, Duke Math. J. 47 (1980) 41–96.
[5] L. Kapitanski, I. Rodnianski and K. Yajima, On the fundamental solution of a perturbed harmonic oscillator, Topol. Methods Nonlinear Anal. 9 (1997) 77–106.
[6] A. Martinez, S. Nakamura and V. Sordoni, Analytic smoothing effect for the Schrödinger equation with long-range perturbation, Comm. Pure Appl. Math. 59(9) (2006) 1330–1351.
[7] A. Martinez and K. Yajima, On the fundamental solution of semiclassical Schrödinger equations at resonant times, Comm. Math. Phys. 216 (2001) 357–373.
[8] W. Schlag, Dispersive estimates for Schrödinger operators: A survey, in Mathematical Aspects of Nonlinear Dispersive Equations, Ann. of Math. Stud., Vol. 163 (Princeton Univ. Press, Princeton, NJ, 2007), pp. 255–285.
[9] K. Yajima, Schrödinger evolution equations with magnetic fields, J. d'Analyse Math. 56 (1991) 29–76.
[10] K. Yajima, Smoothness and non-smoothness of the fundamental solution of time dependent Schrödinger equations, Comm. Math. Phys. 181 (1996) 605–629.
[11] K. Yajima, On fundamental solution of time dependent Schrödinger equations, Contemp. Math. 217 (1998) 49–68.
[12] K. Yajima, On the behavior at infinity of the fundamental solution of time dependent Schrödinger equation, Rev. Math. Phys. 13 (2001) 891–920.
[13] G. P. Zhang and K. Yajima, Smoothing property for Schrödinger equations with potential super-quadratic at infinity, Comm. Math. Phys. 221 (2001) 573–590.
[14] S. Zelditch, Reconstruction of singularities for solutions of Schrödinger equation, Comm. Math. Phys. 90 (1983) 1–26.
March 10, J070-S0129055X1000393X
2010 10:13 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 22, No. 2 (2010) 207–231
© 2010 by the authors
DOI: 10.1142/S0129055X1000393X

ON THE EXISTENCE OF THE DYNAMICS FOR ANHARMONIC QUANTUM OSCILLATOR SYSTEMS∗

BRUNO NACHTERGAELE†, BENJAMIN SCHLEIN‡, ROBERT SIMS§, SHANNON STARR¶ and VALENTIN ZAGREBNOV‖

†Department of Mathematics, University of California, Davis, CA 95616, USA
[email protected]

‡Centre for Mathematical Sciences, University of Cambridge, Cambridge, CB3 0WB, UK
[email protected]

§Department of Mathematics, University of Arizona, Tucson, AZ 85721, USA
[email protected]

¶Department of Mathematics, University of Rochester, Rochester, NY 14627, USA
[email protected]

‖Université de la Méditerranée (Aix-Marseille II), Centre de Physique Théorique-UMR 6207 CNRS, Luminy - Case 907, 13288 Marseille, Cedex 09, France
[email protected]

Received 18 September 2009

We construct a $W^*$-dynamical system describing the dynamics of a class of anharmonic quantum oscillator lattice systems in the thermodynamic limit. Our approach is based on recently proved Lieb–Robinson bounds for such systems on finite lattices [19].

Keywords: Thermodynamic limit; infinite-system dynamics; anharmonic lattice.

Mathematics Subject Classification 2010: 82C10, 82C20, 81Q15, 37K60, 46L55
1. Introduction

The dynamics of a finite quantum system, i.e. one with a finite number of degrees of freedom described by a Hilbert space $\mathcal{H}$, is given by the Schrödinger equation.

∗ © 2010 by the authors. This paper may be reproduced, in its entirety, for non-commercial purposes.
The Hamiltonian $H$ is a densely defined self-adjoint operator on $\mathcal{H}$, and for a vector $\psi(t)$ in the domain of $H$ the state at time $t$ satisfies
$$i\partial_t \psi(t) = H\psi(t). \tag{1.1}$$
For all initial conditions $\psi(0) \in \mathcal{H}$, the unique solution is given by
$$\psi(t) = e^{-itH}\psi(0), \quad \text{for all } t \in \mathbb{R}.$$
Due to Stone's Theorem $e^{-itH}$ is a strongly continuous one-parameter group of unitary operators on $\mathcal{H}$, and the self-adjointness of $H$ is the necessary and sufficient condition for the existence of a unique continuous solution for all times. An alternative description of this dynamics is the so-called Heisenberg picture in which the time evolution is defined on the algebra of observables instead of the Hilbert space of states. The corresponding Heisenberg equation is
$$\partial_t A(t) = i[H, A(t)], \tag{1.2}$$
where, for each $t \in \mathbb{R}$, $A(t) \in \mathcal{B}(\mathcal{H})$ is a bounded linear operator on $\mathcal{H}$. Its solutions are given by a one-parameter group of $*$-automorphisms, $\tau_t$, of $\mathcal{B}(\mathcal{H})$: $A(t) = \tau_t(A(0))$.

For the description of physical systems we expect the Hamiltonian, $H$, to have some additional properties. For example, for finite systems such as atoms or molecules, stability of the system requires that $H$ is bounded from below. In this case, the infimum of the spectrum is expected to be an eigenvalue and is called the ground state energy. When the model Hamiltonian, $H$, is describing bulk matter rather than finite systems, we expect some additional properties. For example, the stability of matter requires that the ground state energy has a lower bound proportional to $N$, where $N$ is the number of degrees of freedom. Much progress on this stability property has been made in the last several decades [24, 12]. We also expect that the dynamics of local observables of bulk matter, or large systems in general, depends only on the local environment. Mathematically this is best expressed by the existence of the dynamics in the thermodynamic limit, i.e. in infinite volume. This is the question we address in this paper.

There are two settings that allow one to prove a rich set of important physical properties of quantum dynamical systems, including infinite ones: the $C^*$-dynamical systems and the $W^*$-dynamical systems [3]. In both cases, the algebra of observables can be thought of as a norm-closed $*$-subalgebra $\mathcal{A}$ of some algebra of the form $\mathcal{B}(\mathcal{H})$, but in the case of the $W^*$-dynamical systems, we additionally require that the algebra is closed for the weak operator topology, which makes it a von Neumann algebra. For a $C^*$-dynamical system, the group of automorphisms $\tau_t$ is assumed to be strongly continuous, i.e. for all $A \in \mathcal{A}$, the map $t \mapsto \tau_t(A)$ is continuous in $t$ for the operator norm ($C^*$-norm) on $\mathcal{A}$. In a $W^*$-dynamical system the continuity is with respect to the weak topology.
In the case of lattice systems with a finite-dimensional Hilbert space of states associated with each lattice site, such as quantum spin-lattice systems and lattice fermions, it has been known for a long time that under rather general conditions the dynamics can be described by a $C^*$-dynamical system, including in the thermodynamic limit [4]. When the Hilbert space at each site is infinite-dimensional and the finite-system Hamiltonians are unbounded, this is no longer possible and the weak continuity becomes a natural assumption. The class of systems we will primarily focus on here are lattices of quantum oscillators, but the underlying lattice structure is not essential for our method. Systems defined on suitable graphs, such as the systems considered in [6, 7], can also be analyzed with the same methods.

In a recent preprint [1], it was shown that convergence of the dynamics in the thermodynamic limit can be obtained for a modified topology. Here, we follow a somewhat different approach. The main difference is that we study the thermodynamic limit of anharmonic perturbations of an infinite harmonic lattice system described by an explicit $W^*$-dynamical system. The more traditional way is to first define the dynamics of anharmonic systems in finite volume (which can be done by standard means [21]), and then to study the limit in which the volume tends to infinity. This is what is done in [1], but it appears that controlling the continuity of the limiting dynamics is more straightforward in our approach. In fact, we are able to show that the resulting dynamics for the class of anharmonic lattices we study is indeed weakly continuous, and we obtain a $W^*$-dynamical system for the infinite system.

The $W^*$-dynamical setting is obtained by considering the GNS representation of a ground state or thermal equilibrium state of the harmonic system. The ground states and thermal states are quasi-free states in the sense of [22], or convex mixtures of quasi-free states.
In the ground state case the GNS representations are the well-known Fock representations. For the thermal states the GNS representations have been constructed by Araki and Woods [2]. Common to both approaches, ours and the one of [1], is the crucial role played by an estimate of the speed of propagation of perturbations in the system, commonly referred to as Lieb–Robinson bounds [8, 11, 16–18]. Briefly, if $A$ and $B$ are two observables of a spatially extended system, localized in regions $X$ and $Y$ of our graph, respectively, and $\tau_t$ denotes the time evolution of the system, a Lieb–Robinson bound is an estimate of the form
$$\|[\tau_t(A), B]\| \le C e^{-a(d(X,Y) - v|t|)},$$
where $C$, $a$, and $v$ are positive constants and $d(X,Y)$ denotes the distance between $X$ and $Y$. Lieb–Robinson bounds for anharmonic lattice systems were recently proved in [19], and this work builds on the results obtained there. Our results are mainly limited to short-range interactions that are either bounded or unbounded perturbations of the harmonic interaction (linear springs).

To conclude the introduction, let us mention that the same question, the existence of the dynamics for infinite oscillator lattices, can be and has been asked for
classical systems. Two classic papers are [10, 15]. Many properties of this classical infinite volume harmonic dynamics have been studied in detail, e.g., [23, 9], and some recent progress on locality estimates for anharmonic systems is reported in [5, 20].

The paper is organized as follows. We begin with a section discussing bounded interactions. In this case, the existence of the dynamics follows by mimicking the proof valid in the context of quantum spin systems. Section 3 describes the infinite volume harmonic dynamics on general graphs. It is motivated by an explicit example on $\mathbb{Z}^d$. Next, in Sec. 4, we discuss finite volume perturbations of the infinite volume harmonic dynamics and prove that such systems satisfy a Lieb–Robinson bound. In Sec. 5, we demonstrate that the existence of the dynamics and its continuity follow from the Lieb–Robinson estimates established in the previous section.

2. Bounded Interactions

The goal of this section is to prove the existence of the dynamics for oscillator systems with bounded interactions. Since oscillator systems with bounded interactions can be treated as a special case of more general models with bounded interactions, we will use a slightly more general setup in this section, which we now introduce.

We will denote by $\Gamma$ the underlying structure on which our models will be defined. Here $\Gamma$ will be an arbitrary set of sites equipped with a metric $d$. For $\Gamma$ with countably infinite cardinality, we will need to assume that there exists a non-increasing function $F : [0,\infty) \to (0,\infty)$ for which:

(i) $F$ is uniformly integrable over $\Gamma$, i.e.
$$\|F\| := \sup_{x \in \Gamma} \sum_{y \in \Gamma} F(d(x,y)) < \infty, \tag{2.1}$$
and

(ii) $F$ satisfies
$$C := \sup_{x,y \in \Gamma} \sum_{z \in \Gamma} \frac{F(d(x,z))\,F(d(z,y))}{F(d(x,y))} < \infty. \tag{2.2}$$
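Conditions (i) and (ii) can be explored numerically for a concrete weight. The sketch below is an illustration under stated assumptions: it takes $\Gamma = \mathbb{Z}$ with $d(x,y) = |x-y|$ and the polynomially decaying choice $F(r) = (1+r)^{-2}$, and it evaluates truncated versions of the sums in (2.1) and (2.2). By translation invariance it suffices to fix $x = 0$. The function names and the truncation parameter are hypothetical.

```python
import math

POWER = 2.0  # F(r) = (1 + r)^(-POWER); here POWER = d + eps with d = 1, eps = 1


def F(r):
    """Polynomially decaying, non-increasing weight on [0, infinity)."""
    return (1.0 + r) ** (-POWER)


def truncated_norm(n_max):
    """Partial sum of (2.1) on Gamma = Z with x = 0 and |y| <= n_max."""
    return sum(F(abs(y)) for y in range(-n_max, n_max + 1))


def truncated_convolution_constant(n_max, separations):
    """Largest ratio in (2.2) over the given values of |x - y|, with |z| <= n_max."""
    best = 0.0
    for n in separations:
        total = sum(F(abs(z)) * F(abs(z - n)) for z in range(-n_max, n_max + 1))
        best = max(best, total / F(n))
    return best


if __name__ == "__main__":
    print("truncated ||F||:", truncated_norm(2000))
    print("exact ||F||    :", math.pi ** 2 / 3.0 - 1.0)
    print("truncated C    :", truncated_convolution_constant(2000, range(0, 200)))
```

For this weight the exact value of (2.1) is $2\zeta(2) - 1 = \pi^2/3 - 1$, and the elementary estimate $\sum_z F(|x-z|)F(|z-y|) \le 2^{3}\,\|F\|\,F(|x-y|)$ (split the sum according to whether $|x-z|$ or $|z-y|$ is at least $|x-y|/2$) shows that the constant in (2.2) is finite.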
Given such a set $\Gamma$ and a function $F$, by the triangle inequality, for any $a \ge 0$ the function $F_a(x) = e^{-ax}F(x)$ also satisfies (i) and (ii) above with $\|F_a\| \le \|F\|$ and $C_a \le C$. In typical examples, one has that $\Gamma \subset \mathbb{Z}^d$ for some integer $d \ge 1$, and the metric is just given by $d(x,y) = |x-y| = \sum_{j=1}^{d} |x_j - y_j|$. In this case, the function $F$ can be chosen as $F(|x|) = (1+|x|)^{-d-\epsilon}$ for any $\epsilon > 0$.

To each $x \in \Gamma$, we will associate a Hilbert space $\mathcal{H}_x$. In many relevant systems, one considers $\mathcal{H}_x = L^2(\mathbb{R}, dq_x)$, but this is not essential. With any finite subset
Λ ⊂ Γ, the Hilbert space of states over Λ is given by
\[ \mathcal{H}_\Lambda = \bigotimes_{x\in\Lambda} \mathcal{H}_x, \]
and the local algebra of observables over Λ is then defined to be
\[ \mathcal{A}_\Lambda = \bigotimes_{x\in\Lambda} \mathcal{B}(\mathcal{H}_x), \]
where $\mathcal{B}(\mathcal{H}_x)$ denotes the algebra of bounded linear operators on $\mathcal{H}_x$. If $\Lambda_1 \subset \Lambda_2$, then there is a natural way of identifying $\mathcal{A}_{\Lambda_1} \subset \mathcal{A}_{\Lambda_2}$, and we may thereby define the algebra of quasi-local observables by the inductive limit
\[ \mathcal{A}_\Gamma = \bigcup_{\Lambda\subset\Gamma} \mathcal{A}_\Lambda, \]
where the union is over all finite subsets Λ ⊂ Γ; see [3, 4] for a discussion of these issues in general.

The result discussed in this section corresponds to bounded perturbations of local self-adjoint Hamiltonians. We fix a collection of on-site local operators $H^{\mathrm{loc}} = \{H_x\}_{x\in\Gamma}$ where each $H_x$ is a self-adjoint operator over $\mathcal{H}_x$. In addition, we will consider a general class of bounded perturbations. These are defined in terms of an interaction Φ, which is a map from the set of subsets of Γ to $\mathcal{A}_\Gamma$ with the property that for each finite set X ⊂ Γ, $\Phi(X) \in \mathcal{A}_X$ and $\Phi(X)^* = \Phi(X)$. As with the Lieb–Robinson bound proven in [19], we will need a growth condition on the set of interactions Φ for which we can prove the existence of the dynamics in the thermodynamic limit. This condition is expressed in terms of the following norm. For any $a \ge 0$, denote by $\mathcal{B}_a(\Gamma)$ the set of interactions for which
\[ \|\Phi\|_a := \sup_{x,y\in\Gamma} \frac{1}{F_a(d(x,y))} \sum_{X \ni x,y} \|\Phi(X)\| < \infty. \tag{2.3} \]
Now, for a fixed sequence of local Hamiltonians $H^{\mathrm{loc}} = \{H_x\}_{x\in\Gamma}$, as described above, an interaction $\Phi \in \mathcal{B}_a(\Gamma)$, and a finite subset Λ ⊂ Γ, we will consider self-adjoint Hamiltonians of the form
\[ H_\Lambda = H_\Lambda^{\mathrm{loc}} + H_\Lambda^{\Phi} = \sum_{x\in\Lambda} H_x + \sum_{X\subset\Lambda} \Phi(X), \tag{2.4} \]
acting on $\mathcal{H}_\Lambda$ (with domain given by $\bigotimes_{x\in\Lambda} D(H_x)$ where $D(H_x) \subset \mathcal{H}_x$ denotes the domain of $H_x$). As these operators are self-adjoint, they generate a dynamics, or time evolution, $\{\tau_t^\Lambda\}$, which is the one-parameter group of automorphisms defined by
\[ \tau_t^\Lambda(A) = e^{itH_\Lambda} A e^{-itH_\Lambda} \quad \text{for any } A \in \mathcal{A}_\Lambda. \]
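Theorem 2.1. Under the conditions stated above, for all $t \in \mathbb{R}$ and $A \in \mathcal{A}_\Gamma$, the norm limit
\[ \lim_{\Lambda\to\Gamma} \tau_t^\Lambda(A) = \tau_t(A) \tag{2.5} \]
exists in the sense of non-decreasing exhaustive sequences of finite volumes Λ and defines a group of ∗-automorphisms $\tau_t$ on the completion of $\mathcal{A}_\Gamma$. The convergence is uniform for t in a compact set.

Proof. Let Λ ⊂ Γ be a finite set. Consider the unitary propagator
\[ U_\Lambda(t,s) = e^{itH_\Lambda^{\mathrm{loc}}} e^{-i(t-s)H_\Lambda} e^{-isH_\Lambda^{\mathrm{loc}}} \tag{2.6} \]
and its associated interaction-picture evolution defined by
\[ \tau_{t,\mathrm{int}}^\Lambda(A) = U_\Lambda(0,t)\, A\, U_\Lambda(t,0) \quad \text{for all } A \in \mathcal{A}_\Gamma. \tag{2.7} \]
Clearly, $U_\Lambda(t,t) = \mathbb{1}$ for all $t \in \mathbb{R}$, and it is also easy to check that
\[ i\frac{d}{dt} U_\Lambda(t,s) = H_\Lambda^{\mathrm{int}}(t)\, U_\Lambda(t,s) \quad \text{and} \quad -i\frac{d}{ds} U_\Lambda(t,s) = U_\Lambda(t,s)\, H_\Lambda^{\mathrm{int}}(s) \]
with the time-dependent generator
\[ H_\Lambda^{\mathrm{int}}(t) = e^{iH_\Lambda^{\mathrm{loc}} t} H_\Lambda^{\Phi} e^{-iH_\Lambda^{\mathrm{loc}} t} = \sum_{Z\subset\Lambda} e^{iH_\Lambda^{\mathrm{loc}} t}\, \Phi(Z)\, e^{-iH_\Lambda^{\mathrm{loc}} t}. \tag{2.8} \]
Fix T > 0 and X ⊂ Γ finite. For any $A \in \mathcal{A}_X$, we will show that for any non-decreasing, exhausting sequence $\{\Lambda_n\}$ of Γ, the sequence $\{\tau_{t,\mathrm{int}}^{\Lambda_n}(A)\}$ is Cauchy in norm, uniformly for $t \in [-T,T]$. Moreover, the bounds establishing the Cauchy property depend on A only through X and $\|A\|$. Since
\[ \tau_t^\Lambda(A) = \tau_{t,\mathrm{int}}^\Lambda\big(e^{itH_\Lambda^{\mathrm{loc}}} A e^{-itH_\Lambda^{\mathrm{loc}}}\big) = \tau_{t,\mathrm{int}}^\Lambda\big(e^{it\sum_{x\in X} H_x} A e^{-it\sum_{x\in X} H_x}\big), \]
an analogous statement then immediately follows for $\{\tau_t^{\Lambda_n}(A)\}$, since they are all also localized in X and have the same norm as A.

Take $n \le m$ with $X \subset \Lambda_n \subset \Lambda_m$ and calculate
\[ \tau_{t,\mathrm{int}}^{\Lambda_m}(A) - \tau_{t,\mathrm{int}}^{\Lambda_n}(A) = \int_0^t \frac{d}{ds}\big\{ U_{\Lambda_m}(0,s) U_{\Lambda_n}(s,t)\, A\, U_{\Lambda_n}(t,s) U_{\Lambda_m}(s,0) \big\}\, ds. \tag{2.9} \]
A short calculation shows that
\[ \begin{aligned} \frac{d}{ds}\, & U_{\Lambda_m}(0,s) U_{\Lambda_n}(s,t)\, A\, U_{\Lambda_n}(t,s) U_{\Lambda_m}(s,0) \\ &= i\, U_{\Lambda_m}(0,s) \big[ \big(H_{\Lambda_m}^{\mathrm{int}}(s) - H_{\Lambda_n}^{\mathrm{int}}(s)\big),\, U_{\Lambda_n}(s,t) A U_{\Lambda_n}(t,s) \big] U_{\Lambda_m}(s,0) \\ &= i\, U_{\Lambda_m}(0,s)\, e^{isH_{\Lambda_n}^{\mathrm{loc}}} \big[ \tilde{B}(s),\, \tau_{s-t}^{\Lambda_n}(\tilde{A}(t)) \big] e^{-isH_{\Lambda_n}^{\mathrm{loc}}}\, U_{\Lambda_m}(s,0), \end{aligned} \tag{2.10} \]
where
\[ \tilde{A}(t) = e^{-itH_{\Lambda_n}^{\mathrm{loc}}} A e^{itH_{\Lambda_n}^{\mathrm{loc}}} = e^{-itH_X^{\mathrm{loc}}} A e^{itH_X^{\mathrm{loc}}} \tag{2.11} \]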
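and
\[ \begin{aligned} \tilde{B}(s) &= e^{-isH_{\Lambda_n}^{\mathrm{loc}}} \big(H_{\Lambda_m}^{\mathrm{int}}(s) - H_{\Lambda_n}^{\mathrm{int}}(s)\big) e^{isH_{\Lambda_n}^{\mathrm{loc}}} \\ &= \sum_{Z\subset\Lambda_m} e^{isH_{\Lambda_m\setminus\Lambda_n}^{\mathrm{loc}}} \Phi(Z) e^{-isH_{\Lambda_m\setminus\Lambda_n}^{\mathrm{loc}}} - \sum_{Z\subset\Lambda_n} \Phi(Z) \\ &= \sum_{\substack{Z\subset\Lambda_m: \\ Z\cap\Lambda_m\setminus\Lambda_n \neq \emptyset}} e^{isH_{\Lambda_m\setminus\Lambda_n}^{\mathrm{loc}}} \Phi(Z) e^{-isH_{\Lambda_m\setminus\Lambda_n}^{\mathrm{loc}}}. \end{aligned} \tag{2.12} \]
Combining the results of (2.9)–(2.12), and using unitarity, we find that
\[ \big\| \tau_{t,\mathrm{int}}^{\Lambda_m}(A) - \tau_{t,\mathrm{int}}^{\Lambda_n}(A) \big\| \le \int_0^t \big\| \big[ \tau_{s-t}^{\Lambda_n}(\tilde{A}(t)),\, \tilde{B}(s) \big] \big\|\, ds \tag{2.13} \]
and by the Lieb–Robinson bound proven in [19], it is clear that
\[ \begin{aligned} \big\| \big[ \tau_{s-t}^{\Lambda_n}(\tilde{A}(t)),\, \tilde{B}(s) \big] \big\| &\le \sum_{\substack{Z\subset\Lambda_m: \\ Z\cap\Lambda_m\setminus\Lambda_n \neq \emptyset}} \big\| \big[ \tau_{s-t}^{\Lambda_n}(\tilde{A}(t)),\, e^{isH_{\Lambda_m\setminus\Lambda_n}^{\mathrm{loc}}} \Phi(Z) e^{-isH_{\Lambda_m\setminus\Lambda_n}^{\mathrm{loc}}} \big] \big\| \\ &\le \frac{2\|A\|}{C_a} \big(e^{2\|\Phi\|_a C_a |t-s|} - 1\big) \sum_{y\in\Lambda_m\setminus\Lambda_n} \sum_{\substack{Z\subset\Lambda_m: \\ y\in Z}} \|\Phi(Z)\| \sum_{x\in X} \sum_{z\in Z} F_a(d(x,z)) \\ &\le \frac{2\|A\|}{C_a} \big(e^{2\|\Phi\|_a C_a |t-s|} - 1\big) \sum_{y\in\Lambda_m\setminus\Lambda_n} \sum_{z\in\Lambda_m} \sum_{\substack{Z\subset\Lambda_m: \\ y,z\in Z}} \|\Phi(Z)\| \sum_{x\in X} F_a(d(x,z)) \\ &\le \frac{2\|A\|\,\|\Phi\|_a}{C_a} \big(e^{2\|\Phi\|_a C_a |t-s|} - 1\big) \sum_{y\in\Lambda_m\setminus\Lambda_n} \sum_{x\in X} \sum_{z\in\Lambda_m} F_a(d(x,z))\, F_a(d(z,y)) \\ &\le 2\|A\|\,\|\Phi\|_a \big(e^{2\|\Phi\|_a C_a |t-s|} - 1\big) \sum_{y\in\Lambda_m\setminus\Lambda_n} \sum_{x\in X} F_a(d(x,y)). \end{aligned} \tag{2.14} \]
With the estimate above and the properties of the function $F_a$, it is clear that
\[ \sup_{t\in[-T,T]} \big\| \tau_{t,\mathrm{int}}^{\Lambda_m}(A) - \tau_{t,\mathrm{int}}^{\Lambda_n}(A) \big\| \to 0 \quad \text{as } n,m \to \infty, \tag{2.15} \]
and the rate of convergence only depends on the norm $\|A\|$ and the set X where A is supported. This proves the claim.

If all local Hamiltonians $H_x$ are bounded, $\{\tau_t\}$ is strongly continuous. If the $H_x$ are allowed to be densely defined unbounded self-adjoint operators, we only have weak continuity and the dynamics is more naturally defined on a von Neumann algebra. This can be done when we have a sufficiently nice invariant state for the model with only the on-site Hamiltonians. For example, suppose that for each $x \in \Gamma$,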
we have a normalized eigenvector $\phi_x$ of $H_x$. Then, for all $A \in \mathcal{A}_\Lambda$, for any finite Λ ⊂ Γ, define
\[ \rho(A) = \Big\langle \bigotimes_{x\in\Lambda} \phi_x,\; A \bigotimes_{x\in\Lambda} \phi_x \Big\rangle. \tag{2.16} \]
ρ can be regarded as a state of the infinite system defined on the norm completion of $\mathcal{A}_\Gamma$. The GNS Hilbert space $\mathcal{H}_\rho$ of ρ can be constructed as the closure of $\mathcal{A}_\Gamma \bigotimes_{x\in\Gamma} \phi_x$. Let $\psi \in \mathcal{A}_\Gamma \bigotimes_{x\in\Gamma} \phi_x$. Then
\[ \begin{aligned} \|(\tau_t(A) - \tau_{t_0}(A))\psi\| \le{} & \|(\tau_t(A) - \tau_t^{(\Lambda_n)}(A))\psi\| + \|(\tau_t^{(\Lambda_n)}(A) - \tau_{t_0}^{(\Lambda_n)}(A))\psi\| \\ & + \|(\tau_{t_0}^{(\Lambda_n)}(A) - \tau_{t_0}(A))\psi\|. \end{aligned} \tag{2.17} \]
For sufficiently large $\Lambda_n$, the limit $t \to t_0$ of the middle term vanishes by Stone's theorem. The two other terms are handled by (2.5). It is clear how to extend the continuity to $\psi \in \mathcal{H}_\rho$.

We will discuss this type of situation in more detail in the next three sections where we consider models that include quadratic (unbounded) interactions as well.

3. The Harmonic Lattice

As noted in the introduction, we will consider anharmonic perturbations of infinite harmonic lattices. In this section, we discuss the properties of the harmonic systems that we need to assume in general in order to study the perturbations in the thermodynamic limit. We will also show in detail that a standard harmonic lattice model possesses all the required properties.

3.1. The CCR algebra of observables

We begin by introducing the CCR algebra on which the harmonic dynamics will be defined. Following [14], one can define the CCR algebra over any real linear space $\mathcal{D}$ equipped with a non-degenerate, symplectic bilinear form σ, i.e. $\sigma : \mathcal{D} \times \mathcal{D} \to \mathbb{R}$ with the property that if $\sigma(f,g) = 0$ for all $f \in \mathcal{D}$, then $g = 0$, and
\[ \sigma(f,g) = -\sigma(g,f) \quad \text{for all } f,g \in \mathcal{D}. \tag{3.1} \]
In typical examples, $\mathcal{D}$ will be a complex inner product space associated with Γ, e.g., $\mathcal{D} = \ell^2(\Gamma)$ or a subspace thereof such as $\mathcal{D} = \ell^1(\Gamma)$, or $\ell^2(\Gamma_0)$, with $\Gamma_0 \subset \Gamma$, and
\[ \sigma(f,g) = \mathrm{Im}[\langle f,g \rangle]. \tag{3.2} \]
The Weyl operators over $\mathcal{D}$ are defined by associating non-zero elements $W(f)$ to each $f \in \mathcal{D}$ which satisfy
\[ W(f)^* = W(-f) \quad \text{for each } f \in \mathcal{D}, \tag{3.3} \]
and
\[ W(f)W(g) = e^{-i\sigma(f,g)/2}\, W(f+g) \quad \text{for all } f,g \in \mathcal{D}. \tag{3.4} \]
It is well known that there is a unique, up to ∗-isomorphism, $C^*$-algebra generated by these Weyl operators with the property that $W(0) = \mathbb{1}$, $W(f)$ is unitary for all $f \in \mathcal{D}$, and $\|W(f) - \mathbb{1}\| = 2$ for all $f \in \mathcal{D}\setminus\{0\}$, see, e.g., [4, Theorem 5.2.8]. This algebra, commonly known as the CCR algebra, or Weyl algebra, over $\mathcal{D}$, we will denote by $\mathcal{W} = \mathcal{W}(\mathcal{D})$.

3.2. Quasi-free dynamics

The anharmonic dynamics we study in this paper will be defined as perturbations of harmonic, technically quasi-free, dynamics. A quasi-free dynamics on $\mathcal{W}(\mathcal{D})$ is a one-parameter group of ∗-automorphisms $\tau_t$ of the form
\[ \tau_t(W(f)) = W(T_t f), \quad f \in \mathcal{D}, \tag{3.5} \]
where $T_t : \mathcal{D} \to \mathcal{D}$ is a group of real-linear, symplectic transformations, i.e.
\[ \sigma(T_t f, T_t g) = \sigma(f,g). \tag{3.6} \]
As $\|W(f) - W(g)\| = 2$ for all $f \neq g \in \mathcal{D}$, one should not expect $\tau_t$ to be strongly continuous; only a weaker form of continuity is present. This means that $\tau_t$ does not define a $C^*$-dynamical system on $\mathcal{W}$, and thus we look for a $W^*$-dynamical setting in which the weaker form of continuity is naturally expressed.

In the present context, it suffices to regard a $W^*$-dynamical system as a pair $\{\mathcal{M}, \alpha_t\}$ where $\mathcal{M}$ is a von Neumann algebra and $\alpha_t$ is a weakly continuous, one-parameter group of ∗-automorphisms of $\mathcal{M}$. For the harmonic systems we are considering, a specific $W^*$-dynamical system arises as follows. Let ρ be a state on $\mathcal{W}$ and denote by $(\mathcal{H}_\rho, \pi_\rho, \Omega_\rho)$ the corresponding GNS representation. We will assume that ρ is both regular and $\tau_t$-invariant. Recall that ρ is regular if and only if $t \mapsto \rho(W(tf))$ is continuous for all $f \in \mathcal{D}$, and $\tau_t$-invariance means
\[ \rho(\tau_t(A)) = \rho(A) \quad \text{for all } A \in \mathcal{W}. \tag{3.7} \]
For the von Neumann algebra $\mathcal{M}$, take the weak closure of $\pi_\rho(\mathcal{W})$ in $\mathcal{L}(\mathcal{H}_\rho)$ and let $\alpha_t$ be the weakly continuous, one-parameter group of ∗-automorphisms of $\mathcal{M}$ obtained by lifting $\tau_t$ to $\mathcal{M}$. The latter step is possible since ρ is $\tau_t$-invariant; see, e.g., [3, Corollary 2.3.17].

3.3. Lieb–Robinson bounds for harmonic lattices

To prove the existence of the dynamics for anharmonic models, we use that the unperturbed harmonic system satisfies a Lieb–Robinson bound. Such an estimate
depends directly on properties of σ and $T_t$. In fact, it is easy to calculate that
\[ [\tau_t(W(f)), W(g)] = \{W(T_t f) - W(g) W(T_t f) W(-g)\}\, W(g) = \{1 - e^{i\sigma(T_t f, g)}\}\, W(T_t f) W(g), \tag{3.8} \]
using the Weyl relations (3.4). For the examples we consider below, one can prove that for every $a > 0$, there exist positive numbers $c_a$ and $v_a$ for which
\[ |\sigma(T_t f, g)| \le c_a e^{v_a |t|} \sum_{x,y\in\mathbb{Z}^d} |f(x)|\,|g(y)|\, \frac{e^{-a|x-y|}}{(1+|x-y|)^{d+1}} \tag{3.9} \]
holds for all $t \in \mathbb{R}$ and all $f,g \in \ell^2(\mathbb{Z}^d)$. In general, we will assume that the harmonic dynamics satisfies an estimate of this type. Namely, we suppose that there exists a number $a_0 > 0$ for which given $0 < a \le a_0$, there are numbers $c_a$ and $v_a$ for which
\[ |1 - e^{i\sigma(T_t f, g)}| \le c_a e^{v_a |t|} \sum_{x,y\in\Gamma} |f(x)|\,|g(y)|\, F_a(d(x,y)) \tag{3.10} \]
holds for all $t \in \mathbb{R}$ and all $f,g \in \ell^2(\Gamma)$. Here we describe the spatial decay in Γ through the functions $F_a$ as introduced in Sec. 2. Since the Weyl operators are unitary, the norm estimate
\[ \|[\tau_t(W(f)), W(g)]\| \le c_a e^{v_a |t|} \sum_{x,y} |f(x)|\,|g(y)|\, F_a(d(x,y)) \tag{3.11} \]
readily follows.

3.4. An important example

Using the example given below, we illustrate the general discussion above in terms of a standard harmonic model defined over $\Gamma = \mathbb{Z}^d$. We begin with a description of some well-known calculations that are valid for these models when restricted to a finite volume. This analysis motivates the definition of the harmonic dynamics in the infinite volume. We then demonstrate that this infinite volume dynamics satisfies a Lieb–Robinson bound. By representing this dynamics in a suitable state, the relevant weak continuity is readily verified. Interestingly, our analysis also applies to the massless case of ω = 0, see below, and we discuss this briefly. We end this subsection with some final comments.

3.4.1. Finite volume analysis

We consider a system of coupled harmonic oscillators restricted to a finite volume. Specifically, on cubic subsets $\Lambda_L = (-L, L]^d \subset \mathbb{Z}^d$, we analyze Hamiltonians of the form
\[ H_L^h = \sum_{x\in\Lambda_L} p_x^2 + \omega^2 q_x^2 + \sum_{j=1}^d \lambda_j (q_x - q_{x+e_j})^2 \tag{3.12} \]
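acting in the Hilbert space
\[ \mathcal{H}_{\Lambda_L} = \bigotimes_{x\in\Lambda_L} L^2(\mathbb{R}, dq_x). \tag{3.13} \]
Here the quantities $p_x$ and $q_x$, which appear in (3.12) above, are the single site momentum and position operators regarded as operators on the full Hilbert space $\mathcal{H}_{\Lambda_L}$ by setting
\[ p_x = \mathbb{1} \otimes \cdots \otimes \mathbb{1} \otimes \Big(-i\frac{d}{dq}\Big) \otimes \mathbb{1} \cdots \otimes \mathbb{1} \quad \text{and} \quad q_x = \mathbb{1} \otimes \cdots \otimes \mathbb{1} \otimes q \otimes \mathbb{1} \cdots \otimes \mathbb{1}, \tag{3.14} \]
i.e. these operators act non-trivially only in the x-th factor of $\mathcal{H}_{\Lambda_L}$. These operators satisfy the canonical commutation relations
\[ [p_x, p_y] = [q_x, q_y] = 0 \quad \text{and} \quad [q_x, p_y] = i\delta_{x,y}, \tag{3.15} \]
valid for all $x,y \in \Lambda_L$. In addition, $\{e_j\}_{j=1}^d$ are the canonical basis vectors in $\mathbb{Z}^d$, the numbers $\lambda_j \ge 0$ and $\omega \ge 0$ are the parameters of the system, and the Hamiltonian is assumed to have periodic boundary conditions, in the sense that $q_{x+e_j} = q_{x-(2L-1)e_j}$ if $x \in \Lambda_L$ but $x + e_j \notin \Lambda_L$. It is well known that Hamiltonians of this form can be diagonalized in Fourier space. We review this quickly to establish some notation and refer the interested reader to [19] for more details.

Introducing the operators
\[ Q_k = \frac{1}{\sqrt{|\Lambda_L|}} \sum_{x\in\Lambda_L} e^{-ik\cdot x} q_x \quad \text{and} \quad P_k = \frac{1}{\sqrt{|\Lambda_L|}} \sum_{x\in\Lambda_L} e^{-ik\cdot x} p_x, \tag{3.16} \]
defined for each $k \in \Lambda_L^* = \{ \frac{x\pi}{L} : x \in \Lambda_L \}$, and setting
\[ \gamma(k) = \sqrt{\omega^2 + 4\sum_{j=1}^d \lambda_j \sin^2(k_j/2)}, \tag{3.17} \]
one finds that
\[ H_L^h = \sum_{k\in\Lambda_L^*} \gamma(k)\,(2 b_k^* b_k + 1) \tag{3.18} \]
where the operators $b_k$ and $b_k^*$ satisfy
\[ b_k = \frac{1}{\sqrt{2\gamma(k)}} P_k - i\sqrt{\frac{\gamma(k)}{2}}\, Q_k \quad \text{and} \quad b_k^* = \frac{1}{\sqrt{2\gamma(k)}} P_{-k} + i\sqrt{\frac{\gamma(k)}{2}}\, Q_{-k}. \tag{3.19} \]
In this sense, we regard the Hamiltonian $H_L^h$ as diagonalizable.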
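The claim that (3.12) is diagonalized by Fourier modes with frequencies γ(k) can be checked numerically in a small case. The sketch below works in d = 1: it applies the coupling matrix of the quadratic form ω²q_x² + λ(q_x − q_{x+1})², with periodic boundary conditions, to the plane waves e^{ikx} with k in Λ_L^*, and compares the result with γ(k)² from (3.17). The parameter values and helper names are illustrative assumptions, not part of the original analysis.

```python
import cmath
import math

def gamma(k, omega, lam):
    """Dispersion relation (3.17) in dimension d = 1."""
    return math.sqrt(omega**2 + 4.0 * lam * math.sin(k / 2.0) ** 2)

def apply_coupling(q, omega, lam):
    """Apply h, where q^T h q = sum_x omega^2 q_x^2 + lam (q_x - q_{x+1})^2 on a ring,
    i.e. (h q)_x = omega^2 q_x + lam (2 q_x - q_{x+1} - q_{x-1})."""
    n = len(q)
    return [omega**2 * q[i] + lam * (2 * q[i] - q[(i + 1) % n] - q[(i - 1) % n])
            for i in range(n)]

def max_eigen_residual(L, omega, lam):
    """Largest entry of h e_k - gamma(k)^2 e_k over all k in Lambda_L^*."""
    sites = list(range(-L + 1, L + 1))        # Lambda_L = (-L, L] has 2L sites
    worst = 0.0
    for m in sites:                            # k = m * pi / L
        k = m * math.pi / L
        wave = [cmath.exp(1j * k * x) for x in sites]
        image = apply_coupling(wave, omega, lam)
        g2 = gamma(k, omega, lam) ** 2
        worst = max(worst, max(abs(hv - g2 * v) for hv, v in zip(image, wave)))
    return worst

residual = max_eigen_residual(L=6, omega=0.7, lam=1.3)
print(f"max |h e_k - gamma(k)^2 e_k| = {residual:.2e}")
```

The residual should vanish up to rounding, reflecting the identity ω² + 2λ(1 − cos k) = ω² + 4λ sin²(k/2).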
Using the above diagonalization, one can determine the action of the dynamics corresponding to $H_L^h$ on the Weyl algebra $\mathcal{W}(\ell^2(\Lambda_L))$. In fact, by setting
\[ W(f) = \exp\Big[ i \sum_{x\in\Lambda_L} \big(\mathrm{Re}[f(x)]\, q_x + \mathrm{Im}[f(x)]\, p_x\big) \Big], \tag{3.20} \]
for each $f \in \ell^2(\Lambda_L)$, it is easy to verify that (3.3) and (3.4) hold with $\sigma(f,g) = \mathrm{Im}[\langle f,g \rangle]$. It is convenient to express these Weyl operators in terms of annihilation and creation operators, i.e.
\[ a_x = \frac{1}{\sqrt{2}}(q_x + ip_x) \quad \text{and} \quad a_x^* = \frac{1}{\sqrt{2}}(q_x - ip_x), \tag{3.21} \]
which satisfy
\[ [a_x, a_y] = [a_x^*, a_y^*] = 0 \quad \text{and} \quad [a_x, a_y^*] = \delta_{x,y} \quad \text{for all } x,y \in \Lambda_L. \tag{3.22} \]
One finds that
\[ W(f) = \exp\Big[ \frac{i}{\sqrt{2}} \big(a(f) + a^*(f)\big) \Big], \tag{3.23} \]
where, for each $f \in \ell^2(\Lambda_L)$, we have set
\[ a(f) = \sum_{x\in\Lambda_L} \overline{f(x)}\, a_x, \quad a^*(f) = \sum_{x\in\Lambda_L} f(x)\, a_x^*. \tag{3.24} \]
Now, the dynamics corresponding to $H_L^h$, which we denote by $\tau_t^L$, is trivial with respect to the diagonalizing variables, i.e.
\[ \tau_t^L(b_k) = e^{-2i\gamma(k)t}\, b_k \quad \text{and} \quad \tau_t^L(b_k^*) = e^{2i\gamma(k)t}\, b_k^*, \tag{3.25} \]
where $b_k$ and $b_k^*$ are as defined in (3.19). Hence, if we further introduce
\[ b_x = \frac{1}{\sqrt{|\Lambda_L|}} \sum_{k\in\Lambda_L^*} e^{ikx}\, b_k \quad \text{and} \quad b_x^* = \frac{1}{\sqrt{|\Lambda_L|}} \sum_{k\in\Lambda_L^*} e^{ikx}\, b_k^*, \tag{3.26} \]
for each $x \in \Lambda_L$ and, analogously to (3.24), define
\[ b(f) = \sum_{x\in\Lambda_L} \overline{f(x)}\, b_x, \quad b^*(f) = \sum_{x\in\Lambda_L} f(x)\, b_x^*, \tag{3.27} \]
for each $f \in \ell^2(\Lambda_L)$, then one has that
\[ \tau_t^L(b(f)) = b([F^{-1} M_t F] f), \tag{3.28} \]
where F is the unitary Fourier transform on $\ell^2(\Lambda_L)$ and $M_t$ is the operator of multiplication by $e^{2i\gamma(k)t}$ in Fourier space with γ(k) as in (3.17). We need only determine the relation between the a's and the b's.
A short calculation shows that there exists a linear mapping $U : \ell^2(\Lambda_L) \to \ell^2(\Lambda_L)$ and an anti-linear mapping $V : \ell^2(\Lambda_L) \to \ell^2(\Lambda_L)$ for which
\[ b(f) = a(Uf) + a^*(Vf), \tag{3.29} \]
a relation known in the literature as a Bogoliubov transformation [13]. In fact, one has that
\[ U = \frac{i}{2} F^{-1} M_{\Gamma_+} F \quad \text{and} \quad V = \frac{i}{2} F^{-1} M_{\Gamma_-} F J \tag{3.30} \]
where J is complex conjugation and $M_{\Gamma_\pm}$ is the operator of multiplication by
\[ \Gamma_\pm(k) = \frac{1}{\sqrt{\gamma(k)}} \pm \sqrt{\gamma(k)}, \tag{3.31} \]
with γ(k) as in (3.17). Using the fact that $\Gamma_\pm$ is real valued and even, it is easy to check that
\[ U^*U - V^*V = \mathbb{1} = UU^* - VV^* \tag{3.32} \]
and
\[ V^*U - U^*V = 0 = VU^* - UV^* \tag{3.33} \]
where we stress that $V^*$ is the adjoint of the anti-linear mapping V. The relation (3.29) is invertible, in fact,
\[ a(f) = b(U^*f) - b^*(V^*f), \tag{3.34} \]
and therefore
\[ W(f) = \exp\Big[ \frac{i}{\sqrt{2}} \big( b((U^* - V^*)f) + b^*((U^* - V^*)f) \big) \Big]. \tag{3.35} \]
Clearly then,
\[ \tau_t(W(f)) = W(T_t f), \tag{3.36} \]
where the mapping $T_t$ is given by
\[ T_t = (U + V) F^{-1} M_t F (U^* - V^*), \tag{3.37} \]
and we have used (3.28).

3.4.2. Infinite volume dynamics

It is now clear how to define the infinite volume harmonic dynamics. Consider a subspace $\mathcal{D} \subset \ell^2(\mathbb{Z}^d)$ and define $\mathcal{W}(\mathcal{D})$ as above with $\sigma(f,g) = \mathrm{Im}[\langle f,g \rangle]$. First, assume ω > 0, take $\gamma : [-\pi,\pi)^d \to \mathbb{R}$ as in (3.17), and set U and V as in (3.30) with
(3.31). If ω > 0, both U and V are bounded transformations on $\ell^2(\mathbb{Z}^d)$. We will treat the case ω = 0 by a limiting argument. The mapping $T_t$ defined by setting
\[ T_t = (U + V) F^{-1} M_t F (U^* - V^*), \tag{3.38} \]
is well-defined on $\ell^2(\mathbb{Z}^d)$. To define the dynamics on $\mathcal{W}(\mathcal{D})$ we will need to choose subspaces $\mathcal{D}$ that are $T_t$-invariant. On such $\mathcal{D}$, $T_t$ is clearly real-linear. With (3.32) and (3.33), one can easily verify the group properties $T_0 = \mathbb{1}$, $T_{s+t} = T_s \circ T_t$, and
\[ \mathrm{Im}[\langle T_t f, T_t g \rangle] = \mathrm{Im}[\langle f,g \rangle], \tag{3.39} \]
i.e. $T_t$ is symplectic in the sense of (3.6). Using [4, Theorem 5.2.8], there is a unique one-parameter group of ∗-automorphisms on $\mathcal{W}(\mathcal{D})$, which we will denote by $\tau_t$, that satisfies
\[ \tau_t(W(f)) = W(T_t f) \quad \text{for all } f \in \mathcal{D}. \tag{3.40} \]
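The group property and the symplectic invariance (3.39) can be sanity-checked numerically on a small periodic chain, used here as a finite stand-in for Z^d with d = 1 and ω > 0. The sketch assumes the mode-wise form T_t f = C_t f + i S_t γ^{-1} Re f − S_t γ Im f, where C_t and S_t denote the Fourier multipliers cos(2γ(k)t) and sin(2γ(k)t); this form is what one obtains by expanding (3.38) with (3.30) and (3.31), and it should be treated as an assumption of the example. All names and parameter values are hypothetical.

```python
import math

def gamma(k, omega, lam):
    """Dispersion relation (3.17) for d = 1."""
    return math.sqrt(omega**2 + 4.0 * lam * math.sin(k / 2.0) ** 2)

def kernel(symbol, n):
    """Real convolution kernel of a Fourier multiplier with even symbol on a ring of n sites."""
    ks = [2.0 * math.pi * m / n for m in range(n)]
    return [sum(symbol(k) * math.cos(k * x) for k in ks) / n for x in range(n)]

def convolve(ker, u):
    """Periodic convolution (ker * u)(x) = sum_y ker(x - y) u(y)."""
    n = len(u)
    return [sum(ker[(x - y) % n] * u[y] for y in range(n)) for x in range(n)]

def T(t, f, omega=1.0, lam=0.5):
    """Assumed mode-wise form: T_t f = C f + i S gamma^{-1} Re f - S gamma Im f."""
    n = len(f)
    c = kernel(lambda k: math.cos(2 * gamma(k, omega, lam) * t), n)
    s_over = kernel(lambda k: math.sin(2 * gamma(k, omega, lam) * t) / gamma(k, omega, lam), n)
    s_times = kernel(lambda k: math.sin(2 * gamma(k, omega, lam) * t) * gamma(k, omega, lam), n)
    re = [z.real for z in f]
    im = [z.imag for z in f]
    new_re = [a - b for a, b in zip(convolve(c, re), convolve(s_times, im))]
    new_im = [a + b for a, b in zip(convolve(c, im), convolve(s_over, re))]
    return [complex(a, b) for a, b in zip(new_re, new_im)]

def sigma(f, g):
    """sigma(f, g) = Im <f, g>, with the inner product conjugate-linear in f."""
    return sum(a.conjugate() * b for a, b in zip(f, g)).imag

n = 8
f = [complex(math.sin(1.3 * x), math.cos(0.7 * x)) for x in range(n)]
g = [complex(math.cos(2.1 * x), 0.5 * x) for x in range(n)]

symplectic_defect = abs(sigma(T(0.9, f), T(0.9, g)) - sigma(f, g))
group_defect = max(abs(a - b) for a, b in zip(T(0.4, T(0.5, f)), T(0.9, f)))
identity_defect = max(abs(a - b) for a, b in zip(T(0.0, f), f))
print(symplectic_defect, group_defect, identity_defect)
```

All three defects should be zero up to rounding error; the test vectors f and g are arbitrary.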
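This defines the harmonic dynamics on $\mathcal{W}(\mathcal{D})$. Here it is important that $T_t : \mathcal{D} \to \mathcal{D}$. As was demonstrated in [19], the mapping $T_t$ can be expressed as a convolution. In fact,
\[ T_t f = f * \Big( H_t^{(0)} - \frac{i}{2}\big(H_t^{(-1)} + H_t^{(1)}\big) \Big) + \overline{f} * \Big( \frac{i}{2}\big(H_t^{(1)} - H_t^{(-1)}\big) \Big), \tag{3.41} \]
where
\[ \begin{aligned} H_t^{(-1)}(x) &= \mathrm{Im}\Big[ \frac{1}{(2\pi)^d} \int \frac{1}{\gamma(k)}\, e^{i(k\cdot x - 2\gamma(k)t)}\, dk \Big], \\ H_t^{(0)}(x) &= \mathrm{Re}\Big[ \frac{1}{(2\pi)^d} \int e^{i(k\cdot x - 2\gamma(k)t)}\, dk \Big], \\ H_t^{(1)}(x) &= \mathrm{Im}\Big[ \frac{1}{(2\pi)^d} \int \gamma(k)\, e^{i(k\cdot x - 2\gamma(k)t)}\, dk \Big]. \end{aligned} \tag{3.42} \]
Using analysis similar to what is proven in [19], the following result holds.

Lemma 3.1. Consider the functions defined in (3.42). For $\omega \ge 0$, $\lambda_1, \ldots, \lambda_d \ge 0$, but such that $c_{\omega,\lambda} = (\omega^2 + 4\sum_{j=1}^d \lambda_j)^{1/2} > 0$, and any $\mu > 0$, the bounds
\[ \begin{aligned} |H_t^{(0)}(x)| &\le e^{-\mu\left(|x| - c_{\omega,\lambda} \max\left(\frac{2}{\mu},\, e^{(\mu/2)+1}\right)|t|\right)}, \\ |H_t^{(-1)}(x)| &\le c_{\omega,\lambda}^{-1}\, e^{-\mu\left(|x| - c_{\omega,\lambda} \max\left(\frac{2}{\mu},\, e^{(\mu/2)+1}\right)|t|\right)}, \\ |H_t^{(1)}(x)| &\le c_{\omega,\lambda}\, e^{\mu/2}\, e^{-\mu\left(|x| - c_{\omega,\lambda} \max\left(\frac{2}{\mu},\, e^{(\mu/2)+1}\right)|t|\right)} \end{aligned} \tag{3.43} \]
hold for all $t \in \mathbb{R}$ and $x \in \mathbb{Z}^d$. Here $|x| = \sum_{j=1}^d |x_j|$.

Given the estimates in Lemma 3.1, Eq. (3.41) and Young's inequality imply that $T_t$ can be defined as a transformation of $\ell^p(\mathbb{Z}^d)$, for $p \ge 1$. However, the symplectic form limits us to consider $\mathcal{D} = \ell^p(\mathbb{Z}^d)$ with $1 \le p \le 2$.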
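To get a feeling for the first bound in Lemma 3.1, the following minimal sketch evaluates H_t^{(0)}(x) in d = 1 by a periodic rectangle rule and compares it with the stated right-hand side for a few sample values. The choice ω = λ = 1, the grid of (t, x, µ), the quadrature size, and all function names are assumptions made for illustration; the check samples the inequality and does not replace the proof in [19].

```python
import math

def gamma(k, omega, lam):
    """Dispersion relation (3.17) for d = 1."""
    return math.sqrt(omega**2 + 4.0 * lam * math.sin(k / 2.0) ** 2)

def H0(t, x, omega, lam, n_quad=4096):
    """H_t^{(0)}(x) = (2 pi)^{-1} Re int_{-pi}^{pi} exp(i (k x - 2 gamma(k) t)) dk,
    approximated by a rectangle rule on the periodic integrand."""
    h = 2.0 * math.pi / n_quad
    total = 0.0
    for j in range(n_quad):
        k = -math.pi + j * h
        total += math.cos(k * x - 2.0 * gamma(k, omega, lam) * t)
    return total * h / (2.0 * math.pi)

def lemma_bound(t, x, omega, lam, mu):
    """Right-hand side of the first bound in Lemma 3.1 for d = 1."""
    c = math.sqrt(omega**2 + 4.0 * lam)
    rate = c * max(2.0 / mu, math.exp(mu / 2.0 + 1.0))
    return math.exp(-mu * (abs(x) - rate * abs(t)))

OMEGA, LAM = 1.0, 1.0   # illustrative parameters
checks = [(abs(H0(t, x, OMEGA, LAM)), lemma_bound(t, x, OMEGA, LAM, mu))
          for t in (0.05, 0.2, 1.0)
          for x in (0, 3, 10)
          for mu in (0.5, 1.0, 2.0)]
all_ok = all(value <= bound + 1e-12 for value, bound in checks)
print("bound respected on all samples:", all_ok)
```

The bound is only informative when |x| is large compared with |t|; for small |x| the right-hand side exceeds 1, while |H_t^{(0)}(x)| ≤ 1 trivially.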
The following bound now readily follows:
\[ |\mathrm{Im}\langle T_t f, g \rangle| \le \big(1 + 2e^{\mu/2} c_{\omega,\lambda} + 2c_{\omega,\lambda}^{-1}\big) \sum_{x,y} |f(x)|\,|g(y)|\, e^{-\mu\left(|x-y| - c_{\omega,\lambda} \max\left(\frac{2}{\mu},\, e^{(\mu/2)+1}\right)|t|\right)}. \tag{3.44} \]
This implies an estimate of the form (3.9), and hence a Lieb–Robinson bound as in (3.11).

A simple corollary of Lemma 3.1 follows.

Corollary 3.2. Consider the functions defined in (3.42). For $\omega \ge 0$, $\lambda_1, \ldots, \lambda_d \ge 0$, but with $c_{\omega,\lambda} = (\omega^2 + 4\sum_{j=1}^d \lambda_j)^{1/2} > 0$, take $\|\cdot\|_1$ to be the $\ell^1$-norm. One has that
\[ \|H_t^{(0)} - \delta_0\|_1 \to 0 \quad \text{as } t \to 0, \tag{3.45} \]
and
\[ \|H_t^{(m)}\|_1 \to 0 \quad \text{as } t \to 0, \quad \text{for } m \in \{-1, 1\}. \tag{3.46} \]
Proof. The estimates in Lemma 3.1 imply that the functions $H_t^{(m)}$ are bounded by exponentially decaying functions (in |x|). These estimates are uniform for t in compact sets, e.g., $t \in [-1,1]$, and therefore dominated convergence applies. It is clear that $H_0^{(0)}(x) = \delta_0(x)$ while $H_0^{(m)}(x) = 0$ for $m \in \{-1, 1\}$. This proves the corollary.

3.4.3. Representing the dynamics

The infinite-volume ground state of the model (3.12) is the vacuum state for the b-operators, as can be seen from (3.18). This state is defined on $\mathcal{W}(\mathcal{D})$ by
\[ \rho(W(f)) = e^{-\frac{1}{4}\|(U^* - V^*)f\|^2}. \tag{3.47} \]
By standard arguments this defines a state on $\mathcal{W}(\mathcal{D})$ [4]. Using (3.38), (3.32) and (3.33), one readily verifies that ρ is $\tau_t$-invariant. ρ is regular by observation. The weak continuity of the dynamics in the GNS representation of ρ will follow from the continuity of the functions of the form
\[ t \mapsto \rho(W(g_1) W(T_t f) W(g_2)), \quad \text{for } g_1, g_2, f \in \mathcal{D}. \tag{3.48} \]
When ω > 0, this continuity can be easily observed from the following expression:
\[ \rho(W(g_1) W(T_t f) W(g_2)) = e^{i\sigma(g_1,g_2)/2}\, e^{i\sigma(T_t f,\, g_2 - g_1)/2}\, e^{-\|(U^* - V^*)(g_1 + g_2 + T_t f)\|^2/4}. \tag{3.49} \]
Note that $T_t$ is differentiable with bounded derivative and that both U and V are bounded. This establishes the continuity in the case that ω > 0.

As discussed in the introduction of the section, the $W^*$-dynamical system is now defined by considering the GNS representation $\pi_\rho$ of ρ. This yields a von
Neumann algebra $\mathcal{M} = \pi_\rho(\mathcal{W}(\mathcal{D}))''$. The invariance of ρ implies that the dynamics is implementable by unitaries $U_t$, i.e.
\[ \pi_\rho(\tau_t(W(f))) = U_t^*\, \pi_\rho(W(f))\, U_t. \tag{3.50} \]
Using $U_t$, the dynamics can be extended to $\mathcal{M}$. As a consequence of (3.48), this extended dynamics is weakly continuous.

3.4.4. The case of ω = 0

We now discuss the case ω = 0. Here, the maps $T_t$ are defined using the convolution formula (3.41). By Lemma 3.1, $T_t$ is well-defined as a transformation of $\ell^p(\mathbb{Z}^d)$, for $1 \le p \le 2$. Both the group property of $T_t$ and the invariance of the symplectic form σ follow in the limit ω → 0 by dominated convergence, which is justified by Lemma 3.1. This demonstrates that the dynamics is well defined.

We represent the dynamics in a state ρ defined by (3.47), but with the understanding that $\|(U^* - V^*)f\|$ may take on the value +∞, in which case $\rho(W(f)) = 0$. ρ is still clearly regular. It remains to show that the dynamics is weakly continuous. Observe that
\[ T_t f - f = f * \big(H_t^{(0)} - \delta_0\big) - f * \Big( \frac{i}{2}\big(H_t^{(-1)} + H_t^{(1)}\big) \Big) + \overline{f} * \Big( \frac{i}{2}\big(H_t^{(1)} - H_t^{(-1)}\big) \Big), \tag{3.51} \]
follows from (3.41). Using Young's inequality and Corollary 3.2, it is clear that $\|T_t f - f\| \to 0$ as $t \to 0$ for any $f \in \ell^p(\mathbb{Z}^d)$ with $1 \le p \le 2$. A calculation shows that
\[ (U^* - V^*)(T_t f - f) = F_1 * \big(H_t^{(0)} - \delta_0\big) - F_2 * H_t^{(-1)} - iF_3 * H_t^{(1)}, \tag{3.52} \]
where
\[ F_1 = F^{-1} M_{\sqrt{\gamma}} F\, \mathrm{Im}[f] - iF^{-1} M_{\gamma^{-1/2}} F\, \mathrm{Re}[f], \quad F_2 = F^{-1} M_{\sqrt{\gamma}} F\, \mathrm{Re}[f] \quad \text{and} \quad F_3 = F^{-1} M_{\gamma^{-1/2}} F\, \mathrm{Im}[f]. \tag{3.53} \]
A similar argument to what is given above now implies that $\|(U^* - V^*)(T_t f - f)\| \to 0$ as $t \to 0$, for any $f \in \mathcal{D}_0$, where
\[ \mathcal{D}_0 = \{ f \in \ell^2(\mathbb{Z}^d) : F^{-1} M_{\gamma^{-1/2}} F\, \mathrm{Re}[f] \in \ell^2(\mathbb{Z}^d) \}. \tag{3.54} \]
No additional assumption on Im[f] is necessary since $F_3$ is convolved with $H_t^{(1)}$. Given the form of (3.49), this suffices to prove weak continuity. In fact, one can check that $T_t$ leaves $\mathcal{D}_0$ invariant and that if $f \in \mathcal{D}_0$, then $(U^* - V^*) T_t f \in \ell^2(\mathbb{Z}^d)$ for all $t \in \mathbb{R}$. This establishes weak continuity of the dynamics, defined on $\mathcal{W}(\mathcal{D}_0)$.

Remark 3.3. We observe that, when ω = 0, the finite volume Hamiltonian $H_L^h$ (3.12) is translation invariant and commutes with the total momentum operator $P_0$
(see (3.16)). In fact, $H_L^h$ can be written as
\[ H_L^h = P_0^2 + \sum_{k\in\Lambda_L^*\setminus\{0\}} P_k^* P_k + \gamma^2(k)\, Q_k^* Q_k = P_0^2 + \sum_{k\in\Lambda_L^*\setminus\{0\}} \gamma(k)\,(2 b_k^* b_k + 1) \]
where we used the notation (3.16) and, for $k \neq 0$, we introduced the operators $b_k$, $b_k^*$ as in (3.19). In this case, the operator $H_L^h$ does not have eigenvectors: its spectrum is purely continuous. By a unitary transformation, the Hilbert space $\mathcal{H}_{\Lambda_L}$ (see (3.13)) can be mapped into the space $L^2(\mathbb{R}, dP_0; \mathcal{H}_b)$ of square integrable functions of $P_0 \in \mathbb{R}$, with values in $\mathcal{H}_b$. Here, $\mathcal{H}_b$ denotes the Fock space generated by all creation and annihilation operators $b_k^*$, $b_k$ with $k \neq 0$. It is then easy to construct vectors which minimize the energy for a given distribution of the total momentum: for an arbitrary (complex valued) $f \in L^2(\mathbb{R})$ with $\|f\| = 1$, we define $\psi_f \in L^2(\mathbb{R}, dP_0; \mathcal{H}_b)$ by setting $\psi_f(P_0) = f(P_0)\,\Omega$ (where Ω is the Fock vacuum in $\mathcal{H}_b$). These vectors are not invariant with respect to the time evolution. It is simple to check that the Schrödinger evolution of $\psi_f$ is given by $e^{-iH_L^h t}\psi_f = \psi_{f_t}$, where $f_t(P_0) = e^{-itP_0^2} f(P_0)$ is the free evolution of f. In particular, for ω = 0, $H_L^h$ does not have a ground state in the traditional sense of an eigenvector. For this reason, when ω = 0, it is not a priori clear what the natural choice of state should be. As is discussed above, one possibility is to consider first $\omega \neq 0$ and then take the limit ω → 0. This yields a ground state for the infinite system with vanishing center of mass momentum of the oscillators. By considering non-zero values for the center of mass momentum, one can also define other states with similar properties.

3.4.5. Some final comments

The analysis in the following sections and our main result are not limited to the class of examples we discussed above. For example, harmonic systems defined on more general graphs, such as the ones considered in [6, 7], can also be treated. Also note that our choice of time-invariant state, while natural, is by no means the only possible state. Instead of the vacuum state defined in (3.47), equilibrium states at positive temperatures could be used in exactly the same way.

It would also make sense to study the convergence of the equilibrium or ground states for the perturbed dynamics and to consider the dynamics in the representation of the limiting infinite system state, but we have not studied this situation and will not discuss it in this paper.

4. Perturbing the Harmonic Dynamics

In this section, we will discuss finite volume perturbations of the infinite volume harmonic dynamics which we defined in Sec. 3. To begin, we recall a fundamental result about perturbations of quantum dynamics defined by adding a bounded term
to the generator. This is a version of what is usually known as the Dyson or Duhamel expansion. The following statement summarizes [4, Proposition 5.4.1].

Proposition 4.1. Let $\{\mathcal{M}, \alpha_t\}$ be a $W^*$-dynamical system and let δ denote the infinitesimal generator of $\alpha_t$. Given any $P = P^* \in \mathcal{M}$, set $\delta_P$ to be the bounded derivation with domain $D(\delta_P) = \mathcal{M}$ satisfying $\delta_P(A) = i[P, A]$ for all $A \in \mathcal{M}$. It follows that $\delta + \delta_P$ generates a one-parameter group of ∗-automorphisms $\alpha^P$ of $\mathcal{M}$ which is the unique solution of the integral equation
\[ \alpha_t^P(A) = \alpha_t(A) + i \int_0^t \alpha_s^P\big([P, \alpha_{t-s}(A)]\big)\, ds. \tag{4.1} \]
In addition, the estimate
\[ \|\alpha_t^P(A) - \alpha_t(A)\| \le \big(e^{|t|\,\|P\|} - 1\big)\, \|A\| \tag{4.2} \]
holds for all $t \in \mathbb{R}$ and $A \in \mathcal{M}$.

Since the initial dynamics $\alpha_t$ is assumed weakly continuous, the norm estimate (4.2) can be used to show that the perturbed dynamics is also weakly continuous. Hence, for each $P = P^* \in \mathcal{M}$ the pair $\{\mathcal{M}, \alpha_t^P\}$ is also a $W^*$-dynamical system. Thus, if $P_i = P_i^* \in \mathcal{M}$ for i = 1, 2, then one can define $\alpha_t^{P_1 + P_2}$ iteratively.

4.1. A Lieb–Robinson bound for on-site perturbations

In this section, we will consider perturbations of the harmonic dynamics defined in Sec. 3. Recall that our general assumptions for the harmonic dynamics on Γ are as follows. We assume that the harmonic dynamics, $\tau_t^0$, is defined on a Weyl algebra $\mathcal{W}(\mathcal{D})$ where $\mathcal{D}$ is a subspace of $\ell^2(\Gamma)$. In fact, we assume there exists a group $T_t$ of real-linear transformations which leave $\mathcal{D}$ invariant and satisfy
\[ \tau_t^0(W(f)) = W(T_t f) \quad \text{for all } f \in \mathcal{D}. \tag{4.3} \]
In addition, we assume that this harmonic dynamics satisfies a Lieb–Robinson bound. Specifically, we suppose that there exists a number $a_0 > 0$ for which given any $0 < a \le a_0$, there are positive numbers $c_a$ and $v_a$ for which
\[ |1 - e^{i\sigma(T_t f, g)}| \le c_a e^{v_a |t|} \sum_{x,y\in\Gamma} |f(x)|\,|g(y)|\, F_a(d(x,y)); \tag{4.4} \]
here the spatial decay in Γ is described by the function $F_a$ as introduced in Sec. 2. As we discussed in Sec. 3, the estimate (4.4) immediately implies the Lieb–Robinson bound
\[ \|[\tau_t^0(W(f)), W(g)]\| \le c_a e^{v_a |t|} \sum_{x,y\in\Gamma} |f(x)|\,|g(y)|\, F_a(d(x,y)). \tag{4.5} \]
Finally, we assume that we have represented this harmonic dynamics in a regular and $\tau_t^0$-invariant state ρ for which the pair $\{\mathcal{M}, \tau_t^0\}$, with $\mathcal{M} = \pi_\rho(\mathcal{W}(\mathcal{D}))''$, is a $W^*$-dynamical system.
Our first estimate involves perturbations defined as finite sums of on-site terms. More specifically, the perturbations we consider are defined as follows. To each site $x \in \Gamma$, we will associate a finite measure $\mu_x$ on $\mathbb{C}$, and an element $P_x \in \mathcal{W}(\mathcal{D})$ which has the form
\[ P_x = \int_{\mathbb{C}} W(z\delta_x)\, \mu_x(dz). \tag{4.6} \]
We require that each $\mu_x$ is even, i.e. invariant under $z \mapsto -z$, to ensure self-adjointness, i.e. $P_x^* = P_x$. Our Lieb–Robinson bounds hold under the additional assumption that the second moment is uniformly bounded, i.e.
\[ \sup_{x\in\Gamma} \int_{\mathbb{C}} |z|^2\, |\mu_x|(dz) < \infty. \tag{4.7} \]
We use Proposition 4.1 to define the perturbed dynamics. Fix a finite set Λ ⊂ Γ. Set
\[ P^\Lambda = \sum_{x\in\Lambda} P_x, \tag{4.8} \]
and note that $(P^\Lambda)^* = P^\Lambda \in \mathcal{W}(\mathcal{D})$. We will denote by $\tau_t^{(\Lambda)}$ the dynamics that results from applying Proposition 4.1 to the $W^*$-dynamical system $\{\mathcal{M}, \tau_t^0\}$ and $P^\Lambda$.

Before we begin the proof of our estimate, we discuss two examples.

Example 1. Let $\mu_x$ be supported on $[-\pi, \pi)$ and absolutely continuous with respect to Lebesgue measure, i.e. $\mu_x(dz) = v_x(z)\, dz$. If $v_x$ is in $L^2([-\pi,\pi))$, then $P_x$ is proportional to an operator of multiplication by the inverse Fourier transform of $v_x$. Moreover, since the support of $\mu_x$ is real, $P_x$ corresponds to multiplication by a function depending only on $q_x$.

Example 2. Let $\mu_x$ have finite support, e.g., take $\mathrm{supp}(\mu_x) = \{z, -z\}$ for some number $z = \alpha + i\beta \in \mathbb{C}$. Then
\[ P_x = W(z\delta_x) + W(-z\delta_x) = 2\cos(\alpha q_x + \beta p_x). \tag{4.9} \]
We now state our first result.

Theorem 4.2. Let $\tau_t^0$ be a harmonic dynamics defined on Γ as described above. Suppose that
\[ \kappa = \sup_{x\in\Gamma} \int_{\mathbb{C}} |z|^2\, |\mu_x|(dz) < \infty, \tag{4.10} \]
and define the perturbed dynamics $\tau_t^{(\Lambda)}$ as indicated above. For every $0 < a \le a_0$, there exist positive numbers $c_a$ and $v_a$ for which the estimate
\[ \|[\tau_t^{(\Lambda)}(W(f)), W(g)]\| \le c_a e^{(v_a + c_a \kappa C_a)|t|} \sum_{x,y} |f(x)|\,|g(y)|\, F_a(d(x,y)) \tag{4.11} \]
holds for all $t \in \mathbb{R}$ and for any functions $f,g \in \mathcal{D}$.
Here the numbers $c_a$ and $v_a$ are as in (4.4), whereas $C_a$ is the convolution constant as defined in (2.2) with respect to the function $F_a$.

Proof. Fix t > 0 and define the function $\Psi_t : [0,t] \to \mathcal{W}(\mathcal{D})$ by setting
\[ \Psi_t(s) = \big[\tau_s^{(\Lambda)}\big(\tau_{t-s}^0(W(f))\big),\, W(g)\big]. \tag{4.12} \]
It is clear that $\Psi_t$ interpolates between the commutator associated with the original harmonic dynamics, $\tau_t^0$ at s = 0, and that of the perturbed dynamics, $\tau_t^{(\Lambda)}$ at s = t. A calculation shows that
\[ \frac{d}{ds}\Psi_t(s) = i \sum_{x\in\Lambda} \big[\tau_s^{(\Lambda)}\big([P_x, W(T_{t-s}f)]\big),\, W(g)\big], \tag{4.13} \]
where differentiability is guaranteed by the results of Proposition 4.1. The inner commutator can be expressed as
\[ [P_x, W(T_{t-s}f)] = \int_{\mathbb{C}} [W(z\delta_x), W(T_{t-s}f)]\, \mu_x(dz) = W(T_{t-s}f)\, L_{t-s;x}(f), \tag{4.14} \]
where
\[ L_{t-s;x}^*(f) = L_{t-s;x}(f) = \int_{\mathbb{C}} W(z\delta_x) \big\{ e^{i\sigma(T_{t-s}f,\, z\delta_x)} - 1 \big\}\, \mu_x(dz) \in \mathcal{W}(\mathcal{D}). \tag{4.15} \]
Thus $\Psi_t$ satisfies
\[ \frac{d}{ds}\Psi_t(s) = i \sum_{x\in\Lambda} \Psi_t(s)\, \tau_s^{(\Lambda)}(L_{t-s;x}(f)) + i \sum_{x\in\Lambda} \tau_s^{(\Lambda)}(W(T_{t-s}f)) \big[\tau_s^{(\Lambda)}(L_{t-s;x}(f)),\, W(g)\big]. \tag{4.16} \]
The first term above is norm preserving. In fact, define a unitary evolution $U_t(\cdot)$ by setting
\[ \frac{d}{ds} U_t(s) = -i \sum_{x\in\Lambda} \tau_s^{(\Lambda)}(L_{t-s;x}(f))\, U_t(s) \quad \text{with } U_t(0) = \mathbb{1}. \tag{4.17} \]
It is easy to see that
\[ \frac{d}{ds}\big(\Psi_t(s) U_t(s)\big) = i \sum_{x\in\Lambda} \tau_s^{(\Lambda)}(W(T_{t-s}f)) \big[\tau_s^{(\Lambda)}(L_{t-s;x}(f)),\, W(g)\big]\, U_t(s), \tag{4.18} \]
and therefore,
\[ \Psi_t(t) U_t(t) = \Psi_t(0) + i \sum_{x\in\Lambda} \int_0^t \tau_s^{(\Lambda)}(W(T_{t-s}f)) \big[\tau_s^{(\Lambda)}(L_{t-s;x}(f)),\, W(g)\big]\, U_t(s)\, ds. \tag{4.19} \]
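Estimating in norm, we find that
\[ \|[\tau_t^{(\Lambda)}(W(f)), W(g)]\| \le \|[\tau_t^0(W(f)), W(g)]\| + \sum_{x\in\Lambda} \int_0^t \big\|\big[\tau_s^{(\Lambda)}(L_{t-s;x}(f)),\, W(g)\big]\big\|\, ds. \tag{4.20} \]
Moreover, using (4.15) and the bound (4.4), it is clear that
\[ \big\|\big[\tau_s^{(\Lambda)}(L_{t-s;x}(f)),\, W(g)\big]\big\| \le c_a e^{v_a(t-s)} \sum_{x'\in\Gamma} |f(x')|\, F_a(d(x,x')) \int_{\mathbb{C}} |z|\, \big\|\big[\tau_s^{(\Lambda)}(W(z\delta_x)),\, W(g)\big]\big\|\, |\mu_x|(dz) \tag{4.21} \]
holds. Combining (4.21), (4.20), and (4.5), we have proven that
\[ \begin{aligned} \|[\tau_t^{(\Lambda)}(W(f)), W(g)]\| \le{} & c_a e^{v_a t} \sum_{x,y} |f(x)|\,|g(y)|\, F_a(d(x,y)) \\ & + c_a \sum_{x'\in\Gamma} |f(x')| \sum_{x\in\Lambda} F_a(d(x,x')) \int_0^t e^{v_a(t-s)} \int_{\mathbb{C}} |z|\, \big\|\big[\tau_s^{(\Lambda)}(W(z\delta_x)),\, W(g)\big]\big\|\, |\mu_x|(dz)\, ds. \end{aligned} \tag{4.22} \]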
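The way an inequality of the shape (4.22) produces the exponent in (4.11) can be made plausible with a scalar caricature that drops all spatial sums: the integral equation φ(t) = c e^{vt} + b ∫_0^t e^{v(t−s)} φ(s) ds is solved by φ(t) = c e^{(v+b)t}, with b playing the role of c_a κ C_a. The sketch below checks this fixed-point property numerically with Simpson's rule. The scalar model, the numerical values, and the function names are assumptions made for illustration; the actual argument is the iteration scheme of [19].

```python
import math

def claimed_solution(t, c, v, b):
    """phi(t) = c * exp((v + b) t), the candidate suggested by the exponent in (4.11)."""
    return c * math.exp((v + b) * t)

def integral_equation_rhs(t, c, v, b, panels=2000):
    """c e^{v t} + b * int_0^t e^{v (t - s)} phi(s) ds, with the integral computed by
    composite Simpson's rule on an even number of panels."""
    if t == 0.0:
        return c
    h = t / panels

    def integrand(s):
        return math.exp(v * (t - s)) * claimed_solution(s, c, v, b)

    acc = integrand(0.0) + integrand(t)
    for j in range(1, panels):
        acc += (4.0 if j % 2 else 2.0) * integrand(j * h)
    return c * math.exp(v * t) + b * acc * h / 3.0

c, v, b = 1.5, 0.8, 0.6      # b stands in for c_a * kappa * C_a (illustrative values)
residual = max(abs(integral_equation_rhs(t, c, v, b) - claimed_solution(t, c, v, b))
               for t in (0.0, 0.5, 1.0, 2.0))
print(f"fixed-point residual: {residual:.2e}")
```

In the scalar model the growth rate v of the unperturbed bound is simply shifted by the coupling b, which mirrors the shift from v_a to v_a + c_a κ C_a.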
Following the iteration scheme applied in [19], one arrives at (4.11) as claimed. 4.2. Multiple site anharmonicities In this section, we will prove that Lieb–Robinson bounds, similar to those in Theorem 4.2, also hold for perturbations involving short range interactions. We introduce these as follows. For each finite subset X ⊂ Γ, we associate a finite measure µX on CX and an element PX ∈ W(D) with the form W (z · δX )µX (dz), (4.23) PX = CX
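where, for each $z \in \mathbb{C}^X$, the function $z\cdot\delta_X : \Gamma \to \mathbb{C}$ is given by
\[
(z\cdot\delta_X)(x) = \sum_{x'\in X} z_{x'}\,\delta_{x'}(x) =
\begin{cases} z_x & \text{if } x \in X, \\ 0 & \text{otherwise.} \end{cases} \tag{4.24}
\]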
We will again require that $\mu_X$ is invariant with respect to $z \to -z$, and hence, $P_X$ is self-adjoint. In analogy to (4.8), for any finite subset $\Lambda \subset \Gamma$, we will set
\[
P^{\Lambda} = \sum_{X\subset\Lambda} P_X, \tag{4.25}
\]
where the sum is over all subsets of $\Lambda$. Here we will again let $\tau_t^{(\Lambda)}$ denote the dynamics resulting from Proposition 4.1 applied to the $W^*$-dynamical system $\{M, \tau_t^0\}$ and the perturbation $P^{\Lambda}$ defined by (4.25).
The main assumption on these multi-site perturbations follows. There exists a number $a_1 > 0$ such that for all $0 < a \le a_1$, there is a number $\kappa_a > 0$ for which, given any pair $x_1, x_2 \in \Gamma$,
\[
\sum_{\substack{X\subset\Gamma: \\ x_1,x_2\in X}} \int_{\mathbb{C}^X} |z_{x_1}||z_{x_2}|\,|\mu_X|(dz) \le \kappa_a F_a(d(x_1,x_2)). \tag{4.26}
\]
Theorem 4.3. Let $\tau_t^0$ be a harmonic dynamics defined on $\Gamma$. Assume that (4.26) holds, and that $\tau_t^{(\Lambda)}$ denotes the corresponding perturbed dynamics. For every $0 < a \le \min(a_0, a_1)$, there exist positive numbers $c_a$ and $v_a$ for which the estimate
\[
\|[\tau_t^{(\Lambda)}(W(f)), W(g)]\| \le c_a e^{(v_a + c_a\kappa_a C_a^2)|t|} \sum_{x,y} |f(x)||g(y)|\,F_a(d(x,y)) \tag{4.27}
\]
holds for all $t \in \mathbb{R}$ and for any functions $f, g \in \mathcal{D}$.

The proof of this result closely follows that of Theorem 4.2, and so we only comment on the differences.

Proof. For $f, g \in \mathcal{D}$ and $t > 0$, define $\Psi_t : [0,t] \to \mathcal{W}(\mathcal{D})$ as in (4.12). The derivative calculation beginning with (4.13) proceeds as before. Here
\[
L_{t-s;X}(f) = \int_{\mathbb{C}^X} W(z\cdot\delta_X)\{e^{i\sigma(T_{t-s}f,\,z\cdot\delta_X)} - 1\}\,\mu_X(dz), \tag{4.28}
\]
is also self-adjoint. The norm estimate
\[
\|[\tau_t^{(\Lambda)}(W(f)), W(g)]\| \le \|[\tau_t^0(W(f)), W(g)]\| + \sum_{X\subset\Lambda} \int_0^t \|[\tau_s^{(\Lambda)}(L_{t-s;X}(f)), W(g)]\|\,ds, \tag{4.29}
\]
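holds similarly. With (4.28), it is easy to see that the integrand in (4.29) is bounded by
\[
c_a e^{v_a(t-s)} \sum_{x\in\Gamma} |f(x)| \sum_{x'\in X} F_a(d(x,x')) \int_{\mathbb{C}^X} |z_{x'}|\,\|[\tau_s^{(\Lambda)}(W(z\cdot\delta_X)), W(g)]\|\,|\mu_X|(dz), \tag{4.30}
\]
the analogue of (4.21), for $0 < a \le a_0$. Moreover, if $0 < a \le \min(a_0, a_1)$, then
\[
\|[\tau_t^{(\Lambda)}(W(f)), W(g)]\| \le c_a e^{v_a t} \sum_{x,y\in\Gamma} |f(x)||g(y)|\,F_a(d(x,y))
+ c_a \sum_{x\in\Gamma} |f(x)| \sum_{X\subset\Lambda} \sum_{x'\in X} F_a(d(x,x')) \int_0^t e^{v_a(t-s)} \int_{\mathbb{C}^X} |z_{x'}|\,\|[\tau_s^{(\Lambda)}(W(z\cdot\delta_X)), W(g)]\|\,|\mu_X|(dz)\,ds. \tag{4.31}
\]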
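The estimate claimed in (4.27) follows by iteration. In fact, the first term in the iteration is bounded by
\[
\begin{aligned}
& c_a \sum_{x} |f(x)| \sum_{X\subset\Lambda} \sum_{x_1\in X} F_a(d(x,x_1)) \int_0^t e^{v_a(t-s)} \int_{\mathbb{C}^X} |z_{x_1}|\, c_a e^{v_a s} \sum_{x_2\in X} \sum_{y} |z_{x_2}||g(y)|\,F_a(d(x_2,y))\,|\mu_X|(dz)\,ds \\
&\quad \le c_a t \cdot c_a e^{v_a t} \sum_{x,y} |f(x)||g(y)| \sum_{x_1,x_2\in\Gamma} F_a(d(x,x_1))\,F_a(d(x_2,y)) \sum_{\substack{X\subset\Gamma: \\ x_1,x_2\in X}} \int_{\mathbb{C}^X} |z_{x_1}||z_{x_2}|\,|\mu_X|(dz) \\
&\quad \le \kappa_a c_a t \cdot c_a e^{v_a t} \sum_{x,y} |f(x)||g(y)| \sum_{x_1,x_2\in\Gamma} F_a(d(x,x_1))\,F_a(d(x_1,x_2))\,F_a(d(x_2,y)) \\
&\quad \le \kappa_a C_a^2 c_a t \cdot c_a e^{v_a t} \sum_{x,y} |f(x)||g(y)|\,F_a(d(x,y)).
\end{aligned} \tag{4.32}
\]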
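To see how the exponent in (4.27) can arise, here is a sketch for $t > 0$. It assumes, as the pattern of (4.32) suggests but as is not spelled out here, that each further iteration contributes one more factor $\kappa_a C_a^2 c_a$ and one more time integration over an ordered simplex, so that the $n$-th iterate is bounded by
\[
c_a e^{v_a t}\,\frac{(\kappa_a C_a^2 c_a t)^n}{n!} \sum_{x,y} |f(x)||g(y)|\,F_a(d(x,y)),
\qquad \text{using} \quad \int_0^t \int_0^{s_1} \cdots \int_0^{s_{n-1}} ds_n \cdots ds_1 = \frac{t^n}{n!}.
\]
Under this assumption, summing over $n \ge 0$ gives
\[
\sum_{n=0}^{\infty} c_a e^{v_a t}\,\frac{(\kappa_a C_a^2 c_a t)^n}{n!} \sum_{x,y} |f(x)||g(y)|\,F_a(d(x,y))
= c_a e^{(v_a + c_a\kappa_a C_a^2)t} \sum_{x,y} |f(x)||g(y)|\,F_a(d(x,y)),
\]
which is the right-hand side of (4.27). For $n = 1$ the bound reduces to the last line of (4.32).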
The higher order iterates are treated similarly.

5. Existence of the Dynamics

In this section, we demonstrate that the finite volume dynamics analyzed in the previous section converge to a limiting dynamics as the volume $\Lambda$ on which the perturbation is defined tends to $\Gamma$. We state this as Theorem 5.1 below.

Theorem 5.1. Let $\tau_t^0$ be a harmonic dynamics defined on $\mathcal{W}(\ell^1(\Gamma))$ as described in Sec. 4.1. Let $\{\Lambda_n\}$ denote a non-decreasing, exhaustive sequence of finite subsets of $\Gamma$. Consider a family of perturbations $P^{\Lambda_n}$ as defined in (4.25) and (4.23) which satisfy (4.26). Suppose in addition that
\[
M = \sup_{x\in\Gamma} \sum_{\substack{X\subset\Gamma: \\ x\in X}} \int_{\mathbb{C}^X} |z_x|\,|\mu_X|(dz) < \infty. \tag{5.1}
\]
Then, for each $f \in \ell^1(\Gamma)$ and $t \in \mathbb{R}$ fixed, the limit
\[
\lim_{n\to\infty} \tau_t^{(\Lambda_n)}(W(f)) \tag{5.2}
\]
exists in norm. The limiting dynamics, which we denote by $\tau_t$, is weakly continuous.

It is important to note that since the estimates in Theorem 4.3 are independent of $\Lambda$, the limiting dynamics also satisfies a Lieb–Robinson bound as in (4.27). We now prove Theorem 5.1.
Proof. Fix a Weyl operator $W(f)$ with $f \in \ell^1(\Gamma)$. Let $T > 0$ and take $m \le n$. Iteratively applying Proposition 4.1, we have that
\[
\tau_t^{(\Lambda_n)}(W(f)) = \tau_t^{(\Lambda_m)}(W(f)) + i \int_0^t \tau_s^{(\Lambda_n)}\big([P^{\Lambda_n\setminus\Lambda_m}, \tau_{t-s}^{(\Lambda_m)}(W(f))]\big)\,ds, \tag{5.3}
\]
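for all $-T \le t \le T$. The bound
\[
\begin{aligned}
\|[P^{\Lambda_n\setminus\Lambda_m}, \tau_{t-s}^{(\Lambda_m)}(W(f))]\|
&\le \sum_{\substack{X\subset\Lambda_n: \\ X\cap\Lambda_n\setminus\Lambda_m \ne \emptyset}} \int_{\mathbb{C}^X} \|[W(z\cdot\delta_X), \tau_{t-s}^{(\Lambda_m)}(W(f))]\|\,|\mu_X|(dz) \\
&\le c_a e^{(v_a + c_a\kappa_a C_a^2)(t-s)} \sum_{x\in\Gamma} |f(x)| \sum_{\substack{X\subset\Lambda_n: \\ X\cap\Lambda_n\setminus\Lambda_m \ne \emptyset}} \sum_{y\in X} F_a(d(x,y)) \int_{\mathbb{C}^X} |z_y|\,|\mu_X|(dz) \\
&\le c_a e^{(v_a + c_a\kappa_a C_a^2)(t-s)} \sum_{x\in\Gamma} |f(x)| \sum_{y\in\Lambda_n\setminus\Lambda_m} F_a(d(x,y)) \sum_{\substack{X\subset\Gamma: \\ y\in X}} \int_{\mathbb{C}^X} |z_y|\,|\mu_X|(dz) \\
&\le M c_a e^{(v_a + c_a\kappa_a C_a^2)(t-s)} \sum_{x\in\Gamma} |f(x)| \sum_{y\in\Lambda_n\setminus\Lambda_m} F_a(d(x,y))
\end{aligned} \tag{5.4}
\]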
follows readily from Theorem 4.3 and assumption (5.1). For $f \in \ell^1(\Gamma)$ and fixed $t$, the upper estimate above goes to zero as $n, m \to \infty$. In fact, the convergence is uniform for $t \in [-T, T]$. This proves (5.2). By an $\epsilon/3$ argument, similar to what is done at the end of Sec. 2, weak continuity follows since we know it holds for the finite volume dynamics. This completes the proof of Theorem 5.1.

Acknowledgments

The work reported in this paper was supported by the National Science Foundation: B.N. under Grants #DMS-0605342 and #DMS-0757581, R.S. under Grant #DMS-0757424, and S.S. under Grants #DMS-0757327 and #DMS-0706927. The authors would also like to acknowledge the hospitality of the Department of Mathematics at U.C. Davis where a part of this work was completed.

References

[1] L. Amour, P. Levy-Bruhl and J. Nourrigat, Dynamics and Lieb–Robinson estimates for lattices of interacting anharmonic oscillators, to appear in Colloq. Math., Special volume dedicated to A. Hulanicki; arXiv:0904.2717.
[2] H. Araki and E. J. Woods, Representations of the canonical commutation relations describing a non-relativistic infinite free Bose gas, J. Math. Phys. 4 (1963) 637–662.
[3] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics. Volume 1, 2nd edn. (Springer-Verlag, 1987).
[4] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics. Volume 2, 2nd edn. (Springer-Verlag, 1997).
[5] P. Buttà, E. Caglioti, S. Di Ruzza and C. Marchioro, On the propagation of a perturbation in an anharmonic system, J. Stat. Phys. 127 (2007) 313–325.
[6] M. Cramer and J. Eisert, Correlations, spectral gap, and entanglement in harmonic quantum systems on generic lattices, New J. Phys. 8 (2006) 71.
[7] M. Cramer, A. Serafini and J. Eisert, Locality of dynamics in general harmonic quantum systems, in Quantum Information and Many Body Quantum Systems, eds. M. Ericsson and S. Montangero (Edizioni della Normale, 2008).
[8] M. Hastings and T. Koma, Spectral gap and exponential decay of correlations, Comm. Math. Phys. 265(3) (2006) 781–804.
[9] J. L. van Hemmen, Dynamics and ergodicity of the infinite harmonic crystal, Phys. Rept. 65 (1980) 45–149.
[10] O. E. Lanford, J. Lebowitz and E. H. Lieb, Time evolution of infinite anharmonic systems, J. Statist. Phys. 16(6) (1977) 453–461.
[11] E. H. Lieb and D. W. Robinson, The finite group velocity of quantum spin systems, Comm. Math. Phys. 28 (1972) 251–257.
[12] E. H. Lieb and R. Seiringer, The Stability of Matter in Quantum Mechanics (Cambridge University Press, 2009).
[13] J. Manuceau and A. Verbeure, Quasi-free states of the CCR algebra and Bogoliubov transformations, Comm. Math. Phys. 9 (1968) 293–302.
[14] J. Manuceau, M. Sirugue, D. Testard and A. Verbeure, The smallest C*-algebra for canonical commutation relations, Comm. Math. Phys. 32 (1973) 231–243.
[15] C. Marchioro, A. Pellegrinotti, M. Pulvirenti and L. Triolo, Velocity of a perturbation in infinite lattice systems, J. Statist. Phys. 19(5) (1978) 499–510.
[16] B. Nachtergaele and R. Sims, Lieb–Robinson bounds and the exponential clustering theorem, Comm. Math. Phys. 265(1) (2006) 119–130.
[17] B. Nachtergaele, Y. Ogata and R. Sims, Propagation of correlations in quantum lattice systems, J. Statist. Phys. 124(1) (2006) 1–13.
[18] B. Nachtergaele and R. Sims, Locality estimates for quantum spin systems, in New Trends in Mathematical Physics, Selected Contributions of the XVth International Congress on Mathematical Physics (Springer-Verlag, 2009), pp. 591–614.
[19] B. Nachtergaele, H. Raz, B. Schlein and R. Sims, Lieb–Robinson bounds for harmonic and anharmonic lattice systems, Comm. Math. Phys. 286 (2009) 1073–1098.
[20] H. Raz and R. Sims, Estimating the Lieb–Robinson velocity for classical anharmonic lattice systems, J. Statist. Phys. 137 (2009) 79–108.
[21] M. Reed and B. Simon, Methods of Modern Mathematical Physics, II, Fourier Analysis, Self-Adjointness (Academic Press, 1975).
[22] D. W. Robinson, The ground state of the Bose gas, Comm. Math. Phys. 1 (1965) 159–174.
[23] H. Spohn and J. L. Lebowitz, Stationary non-equilibrium states of infinite harmonic systems, Comm. Math. Phys. 54 (1977) 97–120.
[24] W. Thirring and F. Dyson (eds), The Stability of Matter: From Atoms to Stars: Selecta of Elliott H. Lieb, 4th edn. (Springer-Verlag, 2005).
April
20,
2010 14:17 WSPC/S0129-055X
148-RMP
J070-00395
Reviews in Mathematical Physics Vol. 22, No. 3 (2010) 233–303 © World Scientific Publishing Company DOI: 10.1142/S0129055X10003953
EFFECT OF A LOCALLY REPULSIVE INTERACTION ON s-WAVE SUPERCONDUCTORS
J.-B. BRU∗ and W. DE SIQUEIRA PEDRA†
∗Departamento de Matemáticas, Facultad de Ciencia y Tecnología, Universidad del País Vasco, Apartado 644, 48080 Bilbao, Spain and IKERBASQUE, Basque Foundation for Science, 48011, Bilbao, Spain
jeanbernard [email protected] [email protected]
†Institut für Mathematik, Universität Mainz, Staudingerweg 9, 55099 Mainz, Germany
[email protected]
Received 23 September 2009 Revised 22 February 2010
The thermodynamic impact of the Coulomb repulsion on s-wave superconductors is analyzed via a rigorous study of equilibrium and ground states of the strong coupling BCS-Hubbard Hamiltonian. We show that the one-site electron repulsion can favor superconductivity at fixed chemical potential by increasing the critical temperature and/or the Cooper pair condensate density. If the one-site repulsion is not too large, a first or a second order superconducting phase transition can appear at low temperatures. The Meißner effect is shown to be rather generic but coexistence of superconducting and ferromagnetic phases is also shown to be feasible, for instance, near half-filling and at strong repulsion. Our proof of a superconductor-Mott insulator phase transition implies a rigorous explanation of the necessity of doping insulators to create superconductors. These mathematical results are consequences of “quantum large deviation” arguments combined with an adaptation of the proof of Størmer’s theorem [1] to even states on the CAR algebra. Keywords: Superconductivity; s-wave; Coulomb interaction; Hubbard model; Meißner effect; Mott insulators; equilibrium states; Størmer’s theorem. Mathematics Subject Classification 2010: 82B20, 82D55
Contents

1. Introduction
2. Grand-Canonical Pressure and Gap Equation
3. Phase Diagram at Fixed Chemical Potential
   3.1. Existence of a s-wave superconducting phase transition
   3.2. Electron density per site and electron-hole symmetry
   3.3. Superconductivity versus magnetization: Meißner effect
   3.4. Coulomb correlation density
   3.5. Superconductor-Mott insulator phase transition
   3.6. Mean-energy per site and the specific heat
4. Phase Diagram at Fixed Electron Density per Site
   4.1. Thermodynamics away from any critical point
   4.2. Coexistence of ferromagnetic and superconducting phases
5. Concluding Remarks
6. Mathematical Foundations of the Thermodynamic Results
   6.1. Thermodynamic limit of the pressure: Proof of Theorem 2.1
   6.2. Equilibrium and ground states of the strong coupling BCS-Hubbard model
7. Analysis of the Variational Problem
Appendix. Griffiths Arguments
1. Introduction

Since the discovery of mercury superconductivity in 1911 by the Dutch physicist Onnes, the study of superconductors has continued to intensify, see, e.g., [2]. Since that discovery, a significant number of superconducting materials have been found. These include usual metals, like lead, aluminum, zinc or platinum, magnetic materials, heavy-fermion systems, organic compounds and ceramics. A complete description of their thermodynamic properties is an entire subject by itself, see [2–4] and references therein. In addition to zero-resistivity and many other complex phenomena, superconductors manifest the celebrated Meißner or Meißner–Ochsenfeld effect, i.e. they can become perfectly diamagnetic. The highest^{a} critical temperature for superconductivity obtained nowadays is between 100 and 200 Kelvin via doped copper oxides, which are originally insulators. In contrast to most superconductors, note that superconductivity in magnetic superconductors only exists on a finite range of non-zero temperatures.

^{a} In January 2008, a critical temperature over 180 Kelvin was reported in a Pb-doped copper oxide.

Theoretical foundations of superconductivity go back to the celebrated BCS theory, which appeared in the late fifties (1957) and explains conventional type I
superconductors. This theory is based on the so-called (reduced) BCS Hamiltonian
\[
H^{\mathrm{BCS}}_{\Lambda} := \sum_{k\in\Lambda^*} (\varepsilon_k - \mu)\big(\tilde a^*_{k,\uparrow}\tilde a_{k,\uparrow} + \tilde a^*_{k,\downarrow}\tilde a_{k,\downarrow}\big) + \frac{1}{|\Lambda|} \sum_{k,k'\in\Lambda^*} \gamma_{k,k'}\,\tilde a^*_{k,\uparrow}\tilde a^*_{-k,\downarrow}\tilde a_{k',\downarrow}\tilde a_{-k',\uparrow} \tag{1.1}
\]
defined in a cubic box $\Lambda \subset \mathbb{R}^3$ of volume $|\Lambda|$. Here $\Lambda^*$ is the dual group of $\Lambda$ seen as a torus (periodic boundary condition) and the operator $\tilde a^*_{k,s}$ (respectively $\tilde a_{k,s}$) creates (respectively annihilates) a fermion with spin $s \in \{\uparrow,\downarrow\}$ and momentum $k \in \Lambda^*$. The function $\varepsilon_k$ represents the kinetic energy, the real number $\mu$ is the chemical potential and $\gamma_{k,k'}$ is the BCS coupling function. The choice $\gamma_{k,k'} = -\gamma < 0$ is often used in the Physics literature and the case $\varepsilon_k = 0$ is known as the strong coupling limit of the BCS model. The lattice approximation of the BCS Hamiltonian amounts to replacing the box $\Lambda \subset \mathbb{R}^3$ by $\Lambda \subset \mathbb{Z}^3$ (or, more generally, by $\Lambda \subset \mathbb{Z}^{d\ge 1}$) and the strong coupling limit of the reduced BCS model is in this case known as the strong coupling (with $\gamma_{k,k'} = -\gamma$) BCS model.^{b} The assumptions $\varepsilon_k = 0$ and $\gamma_{k,k'} = -\gamma$ are of interest, because in this case the BCS Hamiltonian can be explicitly diagonalized. The exact solution of the strong coupling BCS model has been well known since the sixties [6, 7]. This model is in a sense unrealistic: among other things, its representation of the kinetic energy of electrons is rather poor. Nevertheless, it became popular because it displays most of the basic properties of real conventional type I superconductors. See, e.g., [8, Chap. VII, Sec. 4]. Even though the analysis of the thermodynamics of the BCS Hamiltonian was rigorously performed in the eighties [9, 10] (see also the innovative work of Bernadskii and Minlos in 1972 [11]), generalizations of the strong coupling approximation of the BCS model are still a subject of research. For instance, strong coupling-BCS-type models with superconducting phases at arbitrarily high temperatures are treated in [12]. In fact, a general theory of superconductivity is still a subject of debate, especially for high-$T_c$ superconductors.
An important phenomenon ignored in the BCS theory is the Coulomb interaction between electrons or holes, which can imply strong correlations, for instance in high-$T_c$ superconductors. To study these correlations, most theoretical methods, inspired by Beliaev [5], use perturbation theory or renormalization group derived from the diagram approach of Quantum Field Theory. However, even if these approaches have been successful in explaining many physical properties of superconductors [3, 4], only a few rigorous results exist on superconductivity. For instance, the effect of the Coulomb interaction on superconductivity is not rigorously known. This problem was of course addressed in theoretical Physics right after the emergence of the Fröhlich model and the BCS theory, see, e.g., [13].

^{b} See also (1.2) with $\lambda = 0$ and $h = 0$.
In particular, the authors explain in [13, Chap. VI], by means of diagrammatic perturbation theory, that the effect of the Coulomb interaction on the Fröhlich model should be to lower the critical temperature of the superconducting phase by lowering the electron density. We rigorously show that this phenomenology is only true, for our model, in a specific region of parameters.

Indeed, the aim of the present paper is to understand the possible thermodynamic impact of the Coulomb repulsion in the strong coupling approximation. More precisely, we study the thermodynamic properties of the strong coupling BCS-Hubbard model defined in the box^{c} $\Lambda_N := \{\mathbb{Z} \cap [-L, L]\}^{d\ge 1}$ of volume $|\Lambda_N| = N \ge 2$ by the Hamiltonian
\[
H_N := -\mu \sum_{x\in\Lambda_N} (n_{x,\uparrow} + n_{x,\downarrow}) - h \sum_{x\in\Lambda_N} (n_{x,\uparrow} - n_{x,\downarrow}) + 2\lambda \sum_{x\in\Lambda_N} n_{x,\uparrow} n_{x,\downarrow} - \frac{\gamma}{N} \sum_{x,y\in\Lambda_N} a^*_{x,\uparrow} a^*_{x,\downarrow} a_{y,\downarrow} a_{y,\uparrow} \tag{1.2}
\]
for real parameters $\mu$, $h$, $\lambda$, and $\gamma \ge 0$. The operator $a^*_{x,s}$ (respectively $a_{x,s}$) creates (respectively annihilates) a fermion with spin $s \in \{\uparrow,\downarrow\}$ at lattice position $x \in \mathbb{Z}^d$ whereas $n_{x,s} := a^*_{x,s} a_{x,s}$ is the particle number operator at position $x$ and spin $s$. The first term of the right-hand side of (1.2) represents the strong coupling limit of the kinetic energy, with $\mu$ being the chemical potential of the system. Note that this “strong coupling limit”, explained above for the BCS Hamiltonian, is also called “atomic limit” in the context of the Hubbard model, see, e.g., [14, 15]. The second term in the right-hand side of (1.2) corresponds to the interaction between spins and the magnetic field $h$. The one-site interaction with coupling constant $\lambda$ represents the (screened) Coulomb repulsion as in the celebrated Hubbard model. So, the parameter $\lambda$ should be taken as a positive number but our results are also valid for any real $\lambda$. The last term is the BCS interaction written in the $x$-space since
\[
\frac{\gamma}{N} \sum_{x,y\in\Lambda_N} a^*_{x,\uparrow} a^*_{x,\downarrow} a_{y,\downarrow} a_{y,\uparrow} = \frac{\gamma}{N} \sum_{k,q\in\Lambda_N^*} \tilde a^*_{k,\uparrow} \tilde a^*_{-k,\downarrow} \tilde a_{q,\downarrow} \tilde a_{-q,\uparrow}, \tag{1.3}
\]
with $\Lambda_N^*$ being the reciprocal lattice of quasi-momenta and where $\tilde a_{q,s}$ is the corresponding annihilation operator for $s \in \{\uparrow,\downarrow\}$. Observe that the thermodynamics of the model for $\gamma = 0$ can easily be computed. Therefore, we restrict the analysis to the case $\gamma > 0$. Note also that the homogeneous BCS interaction (1.3) can imply a superconducting phase and the mediator implying this effective interaction does not matter here, i.e. it could be due to phonons, as in conventional type I superconductors, or anything else.

^{c} Without loss of generality, we choose $N$ such that $L := (N^{1/d} - 1)/2 \in \mathbb{N}$.

We show that the one-site repulsion suppresses superconductivity for large $\lambda \ge 0$. In particular, the repulsive term in (1.2) cannot imply any superconducting state if $\gamma = 0$. However, the first elementary but nonetheless important property
of this model is that the presence of an electron repulsion is not incompatible with superconductivity if $|\lambda - \mu|$ and $(\lambda + |h|)$ are not too big as compared to the coupling constant $\gamma$ of the BCS interaction. In this case, the superconducting phase appears at low temperatures as either a first order or a second order phase transition. More surprisingly, the one-site repulsion can even favor superconductivity at fixed chemical potential $\mu$ by increasing the critical temperature and/or the Cooper pair condensate density. This contradicts the naive guess that any one-site repulsion between electron pairs should at least reduce the formation of Cooper pairs. It is however important to mention that the physical behavior described by the model depends on which parameter, $\mu$ or $\rho$, is fixed. (It does not mean that the canonical and grand-canonical ensembles are not equivalent for this model.) Indeed, we also analyze the thermodynamic properties at fixed electron density $\rho$ per site in the grand-canonical ensemble, as is done for the perfect Bose gas in the proof of Bose–Einstein condensation.

The analysis of the thermodynamics of the strong coupling BCS-Hubbard model is performed in detail. In particular, we prove that the Meißner effect is rather generic but also that the coexistence of superconducting and ferromagnetic phases is possible (as in the Vonsovkii–Zener model [16, 17]), for instance at large $\lambda > 0$ and densities near half-filling. The latter situation is related to a superconductor-Mott insulator phase transition. This transition furthermore gives a rigorous explanation of the need to dope insulators to obtain superconductors. Indeed, at large enough coupling constant $\lambda$, the superconductor-Mott insulator phase transition corresponds to the breakdown of superconductivity together with the appearance of a gap in the chemical potential as soon as the electron density per site becomes an integer, i.e. 0, 1 or 2. If the system has an electron density per site equal to 1 without being a superconductor, then any non-zero magnetic field $h \ne 0$ implies a ferromagnetic phase.

Note that the present setting is still too simplified with respect to real superconductors. For instance, the anti-ferromagnetic phase or the presence of vortices, which can appear in (type II) high-$T_c$ superconductors [3, 4], are not modeled. However, the BCS-Hubbard Hamiltonian (1.2) may be a good model for certain kinds of superconductors or ultra-cold Fermi gases in optical lattices, where the strong coupling approximation is experimentally justified. Actually, even if the strong coupling assumption is a severe simplification, it may be used in order to analyze the thermodynamic impact of the Coulomb repulsion, as all parameters of the model have a phenomenological interpretation and can be directly related to experiments. See discussions in Sec. 5. Moreover, the range of parameters in which we are interested turns out to be related to a first order phase transition. This kind of phase transition is known to be stable under small perturbations of the Hamiltonian. In particular, by including a small kinetic part it can be shown by high-low temperature expansions that the model
\[
H_{N,\varepsilon} := H_N + \sum_{x,y\in\Lambda_N} \varepsilon(x - y)\big(a^*_{y,\downarrow} a_{x,\downarrow} + a^*_{y,\uparrow} a_{x,\uparrow}\big)
\]
has essentially the same correlation functions as $H_N$, up to corrections of order $\|\varepsilon\|_1$ ($\ell^1$-norm of $\varepsilon$). This analysis will be the subject of a separate paper. For any $\varepsilon \ne 0$ notice that the model $H_{N,\varepsilon}$ is no longer permutation invariant but only translation invariant. Such translation invariant models are studied in a systematic way in [18]. Their detailed analysis is, however, generally much more difficult to perform. Considering first models having more symmetries, for instance permutation invariance, is in this case technically easier.

Coming back to the strong coupling BCS-Hubbard model $H_N$, it turns out that the thermodynamic limit of its (grand-canonical) pressure^{d}
\[
p_N(\beta, \mu, \lambda, \gamma, h) := \frac{1}{\beta N} \ln \mathrm{Trace}(e^{-\beta H_N}) \tag{1.4}
\]
exists at any fixed inverse temperature $\beta > 0$. It corresponds to a variational problem which has minimizers^{e} in the set $E_{\mathcal{U}}^{S,+}$ of (even^{f}) permutation invariant states on the CAR $C^*$-algebra $\mathcal{U}$ generated by annihilation and creation operators:
\[
p(\beta, \mu, \lambda, \gamma, h) := \lim_{N\to\infty} \{p_N(\beta, \mu, \lambda, \gamma, h)\} = - \inf_{\omega\in E_{\mathcal{U}}^{S,+}} \mathcal{F}(\omega). \tag{1.5}
\]

^{d} Our notation for the “Trace” does not include the Hilbert space where it is evaluated but it should be deduced from operators involved in each statement.
^{e} Because $\omega \to \mathcal{F}(\omega)$ is lower semicontinuous and $E_{\mathcal{U}}^{S,+}$ is compact with respect to the weak$^*$ topology.
^{f} See Remark 6.1 in Sec. 6.1.

Here the map $\omega \to \mathcal{F}(\omega) := e(\omega) - \beta^{-1}\tilde S(\omega)$ is the affine (lower weak$^*$-semicontinuous) free-energy density functional defined on $E_{\mathcal{U}}^{S,+}$ from the mean energy per volume
\[
e(\omega) := \lim_{N\to\infty} \{N^{-1}\omega(H_N)\} < \infty
\]
and the entropy density
\[
\tilde S(\omega) := - \lim_{N\to\infty} \frac{1}{N} \mathrm{Trace}\big(D_{\omega|\mathcal{U}_N} \log D_{\omega|\mathcal{U}_N}\big) < \infty.
\]
Note that $D_{\omega|\mathcal{U}_N}$ is the density matrix associated to the state $\omega$ restricted on the local CAR $C^*$-algebra $\mathcal{U}_N \simeq B(\bigwedge \mathbb{C}^{\Lambda_N\times\{\uparrow,\downarrow\}})$ (isomorphism). Such a derivation of the pressure as a minimization problem over states on a $C^*$-algebra is also performed for various quantum spin systems, see, e.g., [19–23].

The minimum of the variational problem (1.5) is attained for any weak$^*$-limit point of local Gibbs states
\[
\omega_N(\cdot) := \frac{\mathrm{Trace}(\,\cdot\; e^{-\beta H_N})}{\mathrm{Trace}(e^{-\beta H_N})} \tag{1.6}
\]
associated with $H_N$. Similarly to what is done for general translation invariant models (see [24, 25]), the set of equilibrium states of the strong coupling BCS-Hubbard model is naturally defined to be the set $\Omega_\beta = \Omega_\beta(\mu, \lambda, \gamma, h)$ of minimizers
of (1.5). Note that $\Omega_\beta$ is a non-empty convex subset^{g} of $E_{\mathcal{U}}^{S,+}$ and the extreme decomposition in $\Omega_\beta$ coincides with the one in $E_{\mathcal{U}}^{S,+}$, i.e. $\Omega_\beta$ is a face^{h} in $E_{\mathcal{U}}^{S,+}$. So, pure equilibrium states are extreme states of $\Omega_\beta$. Meanwhile, any weak$^*$ limit point as $n \to \infty$ of an equilibrium state sequence $\{\omega^{(n)}\}_{n\in\mathbb{N}}$ with diverging inverse temperature $\beta_n \to \infty$ is, by definition, a ground state $\omega \in E_{\mathcal{U}}^{S,+}$.

Here we have left the Fock space representation of the model to go to a representation-free formulation of thermodynamic phases. This means that $H_N$ is no longer seen as a Hamiltonian acting on the Fock space but as a (self-adjoint) element of the CAR $C^*$-algebra $\mathcal{U}$ with thermodynamic phases described by states on $\mathcal{U}$. Doing so we take advantage of the non-uniqueness of the representation of the CAR $C^*$-algebra $\mathcal{U}$. This property is indeed necessary to get non-unique equilibrium and ground states which imply phase transitions. This fact was first observed by Haag in 1962 [26], who established that the non-uniqueness of the ground state of the BCS model in infinite volume is related to the existence of several inequivalent^{i} irreducible representations^{j} of the Hamiltonian, see also [6, 27].

^{g} The map $\omega \to \mathcal{F}(\omega)$ on the convex set $E_{\mathcal{U}}^{S,+}$ is affine and lower semicontinuous, thus $\Omega_\beta$ is a non-empty face of $E_{\mathcal{U}}^{S,+}$.
^{h} A face $F$ of a compact convex set $K$ is a subset of $K$ with the property that if $\omega = \sum_{n=1}^m \lambda_n\omega_n \in F$ with $\sum_{n=1}^m \lambda_n = 1$ and $\{\omega_n\}_{n=1}^m \subset K$, then $\{\omega_n\}_{n=1}^m \subset F$.
^{i} This means that there is no isomorphism between $\mathfrak{h}_{j_1}$ and $\mathfrak{h}_{j_2}$ whenever $\mathfrak{h}_{j_1}$ and $\mathfrak{h}_{j_2}$ are the Hilbert spaces corresponding to two different irreducible representations.
^{j} This means that the Hamiltonian can be seen as an operator acting on several Hilbert spaces $\{\mathfrak{h}_j\}_{j\in J}$ with no (non-trivial) invariant subspace.

Equilibrium states define tangents to the convex map $(\beta, \mu, \lambda, \gamma, h) \to p(\beta, \mu, \lambda, \gamma, h)$. The analysis of the set of tangents of this map hence gives information about the expectations of many important observables with respect to equilibrium states. The main technical point in the present work is therefore to find an explicit representation of the pressure by using the permutation invariance of the model in a crucial way. Indeed, we adapt to our case of fermions on a lattice the methods of [19] used to find the pressure of spin systems of mean-field type. Then, it is proven that it suffices to minimize the variational problem (1.5) with respect to the set $\mathcal{E}_{\mathcal{U}}^{S,+}$ of extreme states in $E_{\mathcal{U}}^{S,+}$. By adapting the proof of Størmer’s theorem [1] to even states on the CAR algebra, we show next that extreme, permutation invariant and even states are product states
\[
\omega_\zeta := \bigotimes_{x\in\mathbb{Z}^d} \zeta_x
\]
obtained by “copying” some one-site even state $\zeta$ to all other sites. This result is a non-commutative version of the celebrated de Finetti Theorem from (classical) probability theory [28]. Using this, the variational problem (1.5) can be drastically simplified to a minimization problem on a finite dimensional manifold. In the end, it leads to another explicit, rather simple, variational problem on $\mathbb{R}_0^+$, which can
be rigorously analyzed by analytic or numerical methods to obtain the complete thermodynamic behavior of the model.

Observe however, that not all correlation functions can be drawn from an explicit formula for the pressure by taking derivatives combined with Griffiths arguments [29–31] on the convergence of derivatives of convex functions, unless the (infinite volume) pressure is shown to be differentiable with respect to any perturbation. Showing differentiability of the pressure as well as the explicit computation of its corresponding derivative can be a very hard task, for instance for correlation functions involving many lattice points. By contrast, the method presented in this paper gives access to all correlation functions at once. This is one basic (mathematical) message of this method, which is generalized in [18] to all translation invariant Fermi systems without requiring any quantum spin representation.

In fact, we precisely characterize the sets $\Omega_\beta$ for all $\beta \in (0, \infty]$, where $\Omega_\infty$ is the set of ground states with parameters $\mu$, $\gamma$, $\lambda$, and $h$. This detailed study yields our main rigorous results on the strong coupling BCS-Hubbard model $H_N$, which can be summarized as follows:

• There is a set of parameters $\mathcal{S}$, defining the superconducting phase, with equilibrium and ground states breaking the $U(1)$-gauge symmetry and showing off-diagonal long range order (ODLRO).
• Depending on the parameters, the superconducting phase transition is either a first order or a second order phase transition.
• The superconducting phase $\mathcal{S}$ is characterized by the formation of Cooper pairs (shown by proving bounds for the density-density correlations) and a depleted Cooper pair condensate, the density $r_\beta \in [0, 1/4]$ of which is defined by the gap equation.
• From our proof of Størmer’s theorem [1] for even states on the CAR algebra, we observe that the superconducting phase $\mathcal{S}$ corresponds to a s-wave superconductor, i.e. a superconductor with two-point correlation function, for $x, y \in \mathbb{Z}^d$, $s_1, s_2 \in \{\uparrow,\downarrow\}$ and within $\mathcal{S}$, equal to $\omega(a_{x,s_1} a_{y,s_2}) = r_\beta^{1/2} e^{i\phi} \ne 0$ if $x = y$ and $s_1 \ne s_2$, and $\omega(a_{x,s_1} a_{y,s_2}) = 0$ otherwise. (Here $\omega$ is any pure state of $\Omega_\beta$; $\phi \in [0, 2\pi)$ is determined by $\omega$.)
• We observe the Meißner effect by analyzing the relation between superconductivity and magnetization. (The effect is mathematically defined here by the absence of magnetization in presence of superconductivity. Steady surface currents around the bulk of the superconductor are not analyzed as it is a finite volume effect.)
• We establish the existence of a superconductor-Mott insulator phase transition for integer electron density per site.
• The coexistence of ferromagnetic and superconducting phases is shown to be feasible at (critical) points of the boundary $\partial\mathcal{S}$ of $\mathcal{S}$, by applying the decomposition theory for states [32] on the weak$^*$-compact and convex set $\Omega_\beta$.
• The critical temperature θc for the superconducting phase transition with respect to λ, γ or h is analyzed in the case of fixed chemical potential µ and also in the case of constant electron density ρ. It shows that θc can be an increasing function of the positive coupling constant λ > 0 at fixed µ ∈ R but not at fixed ρ > 0. • For λ ∼ γ the critical temperature θc shows — as a function of the electron density ρ — the typical behavior observed (only) in high-Tc superconductors: θc is zero or very small for ρ ∼ 1 and is much larger for ρ away from 1. Thus, our model provides a simple rigorous microscopic explanation for such experimentally well-known behavior of high-Tc superconductors. • Together with our study of the heat capacity, all these results can be used to fix experimentally all parameters of HN . Note that our study of equilibrium states is reminiscent of the work of Fannes, Spohn and Verbeure [33], performed however within a different framework. By opposition with our setting, their analysis [33] concerns symmetric states on an infinite tensor product of one C ∗ -algebra and their definition of equilibrium states uses the so-called correlation inequalities for KMS-states, see [29, Appendix E]. To conclude, this paper is organized as follows. In Sec. 2, we give the thermodynamic limit of the pressure pN (1.4) as well as the gap equation. Then, our main results concerning the thermodynamic properties of the model are formulated in Sec. 3 at fixed chemical potential µ and in Sec. 4, at fixed electron density ρ per site. Section 5 briefly explains our result on the level of equilibrium states and gives additional remarks. In order to keep the main issues and the physical implications as transparent as possible, we reduce the technical and formal aspects to a minimum in Secs. 2–5. In particular, in Secs. 2–4 we only stay on the level of pressure and thermodynamic limit of local Gibbs states. 
The generalization of the results on the level of equilibrium and ground states is postponed to Sec. 6.2. Indeed, the rather long Sec. 6 gives the detailed mathematical foundations of our phase diagrams. In particular, in Sec. 6.1 we introduce the C*-algebraic machinery needed in our analysis and prove various technical facts, to conclude in Sec. 6.2 with the rigorous study of equilibrium and ground states. In Sec. 7, we collect some useful properties of the qualitative behavior of the Cooper pair condensate density, whereas the Appendix is devoted to Griffiths arguments [29–31].
2. Grand-Canonical Pressure and Gap Equation

In order to obtain the thermodynamic behavior of the strong coupling BCS-Hubbard model HN, it is essential to get first the thermodynamic limit N → ∞ of its grand-canonical pressure pN (1.4). The rigorous derivation of this limit is performed in Sec. 6.1. We explain here the final result with the heuristic behind it. The first important remark is that one can guess the correct variational problem by the so-called approximating Hamiltonian method [34–36] originally proposed by Bogoliubov Jr. [37]. In our case, the correct approximation of the Hamiltonian HN
April 20, 2010 14:17 WSPC/S0129-055X
242
148-RMP
J070-00395
J.-B. Bru & W. de Siqueira Pedra
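is the c-dependent Hamiltonian
\[
H_N(c) := -\mu \sum_{x\in\Lambda_N} (n_{x,\uparrow}+n_{x,\downarrow}) - h \sum_{x\in\Lambda_N} (n_{x,\uparrow}-n_{x,\downarrow}) + 2\lambda \sum_{x\in\Lambda_N} n_{x,\uparrow}n_{x,\downarrow} - \frac{\gamma}{N}\sum_{x\in\Lambda_N}\bigl((Nc)\,a^{*}_{x,\uparrow}a^{*}_{x,\downarrow} + (N\bar{c})\,a_{x,\downarrow}a_{x,\uparrow}\bigr), \tag{2.1}
\]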
with c ∈ C, see also [6, 7]. The main advantage of this Hamiltonian in comparison with HN is the fact that it is a sum of shifts of the same local operator. For an appropriate order parameter c ∈ C, it leads to a good approximation of the pressure pN as N → ∞. This can be partially seen from the inequality
\[
\gamma N|c|^{2} + H_N(c) - H_N = \frac{\gamma}{N}\Bigl(\sum_{x\in\Lambda_N} a^{*}_{x,\uparrow}a^{*}_{x,\downarrow} - N\bar{c}\Bigr)\Bigl(\sum_{x\in\Lambda_N} a_{x,\downarrow}a_{x,\uparrow} - Nc\Bigr) \ge 0,
\]
which is valid as soon as γ ≥ 0. Observe that the constant term γN|c|² is not included in the definition of HN(c). Hence, by using the Golden–Thompson inequality, which yields Trace(e^{A+B*B}) ≥ Trace(e^{A}), the thermodynamic limit p(β, µ, λ, γ, h) of the pressure pN (1.4) is bounded from below by
\[
p(\beta,\mu,\lambda,\gamma,h) \ge \sup_{c\in\mathbb{C}}\{-\gamma|c|^{2} + p(c)\}. \tag{2.2}
\]
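The function p(c) = p(β, µ, λ, γ, h; c) is the pressure associated with HN(c) for any N ≥ 1. It can easily be computed since HN(c) is a sum of local operators which commute with each other. Indeed, for any N ≥ 1, this pressure equals^l
\[
p(c) := \frac{1}{\beta N}\ln \mathrm{Trace}\bigl(e^{-\beta H_N(c)}\bigr) = \frac{1}{\beta}\ln \mathrm{Trace}\bigl(e^{-\beta H_1(c)}\bigr) = \frac{1}{\beta}\ln \mathrm{Trace}\bigl(e^{\beta\{(\mu+h)n_{\uparrow}+(\mu-h)n_{\downarrow}+\gamma(c\,a^{*}_{\uparrow}a^{*}_{\downarrow}+\bar{c}\,a_{\downarrow}a_{\uparrow})-2\lambda n_{\uparrow}n_{\downarrow}\}}\bigr). \tag{2.3}
\]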
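The one-site trace in (2.3) can be checked numerically. The sketch below is an illustration only and is not part of the original analysis: it assumes a Jordan–Wigner matrix representation of the two fermionic modes on a 4-dimensional space, builds H1(c) from (2.1) with N = 1, and compares the resulting pressure with the closed form β⁻¹ ln 2 + µ + β⁻¹ ln{cosh(βh) + e^{−λβ} cosh(βg)}, g = {(µ − λ)² + γ²|c|²}^{1/2}, which is the expression underlying the function f of Theorem 2.1. The function names and the parameter values are hypothetical.

```python
import numpy as np

# Jordan-Wigner matrices for the two fermionic modes (spin up, spin down)
# of a single site; the one-site Fock space is C^2 (x) C^2.
a = np.array([[0.0, 1.0], [0.0, 0.0]])   # one-mode annihilation operator
z = np.diag([1.0, -1.0])                  # parity factor enforcing anticommutation
a_up = np.kron(a, np.eye(2))
a_dn = np.kron(z, a)
n_up = a_up.T @ a_up                      # real matrices: transpose = adjoint
n_dn = a_dn.T @ a_dn


def pressure_from_trace(beta, mu, lam, gamma, h, c):
    """p(c) = (1/beta) ln Trace exp(-beta H_1(c)), with H_1(c) from (2.1) at N = 1."""
    pair = c * (a_up.T @ a_dn.T)          # c a*_up a*_dn
    h1 = (-mu * (n_up + n_dn) - h * (n_up - n_dn)
          + 2.0 * lam * (n_up @ n_dn)
          - gamma * (pair + pair.conj().T))
    energies = np.linalg.eigvalsh(h1)     # h1 is Hermitian by construction
    return np.log(np.sum(np.exp(-beta * energies))) / beta


def pressure_closed_form(beta, mu, lam, gamma, h, c):
    """Closed form obtained by diagonalizing the 2x2 pair sector by hand."""
    g = np.sqrt((mu - lam) ** 2 + gamma ** 2 * abs(c) ** 2)
    return (np.log(2.0) / beta + mu
            + np.log(np.cosh(beta * h) + np.exp(-lam * beta) * np.cosh(beta * g)) / beta)


if __name__ == "__main__":
    args = (2.0, 1.0, 0.575, 2.6, 0.1)    # beta, mu, lambda, gamma, h (arbitrary)
    c = 0.3 + 0.2j
    print(pressure_from_trace(*args, c), pressure_closed_form(*args, c))
```

The two printed numbers should agree up to rounding, and replacing c by |c| should not change them, which reflects the gauge invariance of c → p(c).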
To be useful, the variational problem in (2.2) should also be an upper bound of p(β, µ, λ, γ, h). By adapting the proof of Størmer’s theorem [1] to even states on the CAR algebra and by using the Petz–Raggio–Verbeure proof for spin systems [19] as a guideline, we prove this in Sec. 6.1. Thus, the thermodynamic limit of the pressure of the model HN exists and can explicitly be computed by using the approximating Hamiltonian HN(c):

Theorem 2.1 (Grand-Canonical Pressure). For any β, γ > 0 and µ, λ, h ∈ R, the thermodynamic limit p(β, µ, λ, γ, h) of the grand-canonical pressure pN (1.4) equals
\[
p(\beta,\mu,\lambda,\gamma,h) = \sup_{c\in\mathbb{C}}\{-\gamma|c|^{2}+p(c)\} = \beta^{-1}\ln 2 + \mu + \sup_{r\ge 0} f(r) < \infty,
\]
where the real function f(r) = f(β, µ, λ, γ, h; r) is defined by
\[
f(r) := -\gamma r + \frac{1}{\beta}\ln\bigl\{\cosh(\beta h) + e^{-\lambda\beta}\cosh(\beta g_r)\bigr\},
\]
with g_r := {(µ − λ)² + γ² r}^{1/2}.

^l Here a0,↑, a0,↓ and n0,↑, n0,↓ are replaced, respectively, by a↑, a↓ and n↑, n↓.

Remark 2.1. The fact that the pressure pN coincides as N → ∞ with the variational problem given by the so-called approximating Hamiltonian (here HN(c)) was previously proven via completely different methods in [34] for a large class of Hamiltonians (including HN) with BCS-type interaction. However, as explained in the introduction, our proof gives deeper results, not expressed in Theorem 2.1, on the level of states, cf. (1.5) and (6.33). In contrast to the approximating Hamiltonian method [34–37], it leads to a natural notion of equilibrium and ground states and allows the direct analysis of correlation functions. For more details, we recommend Sec. 6, particularly Sec. 6.2.

From the gauge invariance of the map c → p(c) observe that any maximizer cβ ∈ C of the first variational problem given in Theorem 2.1 has the form rβ^{1/2} e^{iφ} with rβ ≥ 0 being a solution of
\[
\sup_{r\ge 0} f(r) = f(r_\beta) \tag{2.4}
\]
and φ ∈ [0, 2π). For any β, γ > 0 and real numbers µ, λ, h, it is also clear that the order parameter rβ is always bounded since f(r) diverges to −∞ when r → ∞. Up to (special) points (β, µ, λ, γ, h) corresponding to a phase transition of first order, it is always unique and continuous with respect to each parameter (see Sec. 7).

For low inverse temperatures β (high temperature regime) rβ = 0. Indeed, straightforward computations at low enough β show that the function f(r) is concave as a function of r ≥ 0 whereas ∂r f(0) < 0, see Sec. 7. On the other hand, any non-zero solution rβ of the variational problem (2.4) has to be a solution of the gap equation (or Euler–Lagrange equation)
\[
\tanh(\beta g_{r_\beta}) = \frac{2 g_{r_\beta}}{\gamma}\Bigl(1 + \frac{e^{\lambda\beta}\cosh(\beta h)}{\cosh(\beta g_{r_\beta})}\Bigr). \tag{2.5}
\]
If g_r = 0, observe that one uses in (2.5) the asymptotics x^{-1} tanh x ∼ 1 as x → 0, see also (7.2). Because tanh(x) ≤ 1 for x ≥ 0, we then conclude that
\[
0 \le r_\beta \le \max\{0, r_{\max}\}, \quad\text{with}\quad r_{\max} := \frac{1}{4} - \gamma^{-2}(\mu-\lambda)^{2}. \tag{2.6}
\]
In particular, if γ ≤ 2|µ − λ|, then rβ = 0 for any β > 0. However, at large enough β > 0 (low temperature regime) and at fixed λ, h, µ ∈ R, there is a unique γc > 2|λ − µ| such that rβ > 0 for any γ ≥ γc. In other words, the domain of parameters (β, µ, λ, γ, h) where rβ is strictly positive is non-empty, see Figs. 1 and 2 and Sec. 7. Observe in Fig. 2 that a positive λ, i.e. a one-site repulsion, can significantly increase (right figure) the critical temperature θc = θc(µ, λ, γ, h), which is defined such that rβ > 0 if and only if β > θc^{-1}.
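The variational problem (2.4) and the gap equation (2.5) are easy to explore numerically. The following sketch is a hedged illustration rather than the method used in the paper: it maximizes f on a uniform grid over [0, 1/4] (justified by the bound (2.6)) and evaluates the residual of (2.5) at the maximizer. The function names, the grid size and the parameter values (taken from the figure captions) are choices made for this example only.

```python
import numpy as np


def g(r, mu, lam, gamma):
    """g_r = {(mu - lambda)^2 + gamma^2 r}^(1/2)."""
    return np.sqrt((mu - lam) ** 2 + gamma ** 2 * r)


def f(r, beta, mu, lam, gamma, h):
    """The function f(r) of Theorem 2.1."""
    return -gamma * r + np.log(
        np.cosh(beta * h) + np.exp(-lam * beta) * np.cosh(beta * g(r, mu, lam, gamma))
    ) / beta


def order_parameter(beta, mu, lam, gamma, h, n=200001):
    """Grid maximizer of f on [0, 1/4]; by (2.6) no maximizer lies outside.

    A plain grid search is an illustrative shortcut: it also copes with
    first order transitions, where f has two competing local maxima.
    """
    r = np.linspace(0.0, 0.25, n)
    return float(r[np.argmax(f(r, beta, mu, lam, gamma, h))])


def gap_residual(r, beta, mu, lam, gamma, h):
    """Left-hand side minus right-hand side of the gap equation (2.5)."""
    gr = g(r, mu, lam, gamma)
    rhs = 2.0 * gr / gamma * (1.0 + np.exp(lam * beta) * np.cosh(beta * h) / np.cosh(beta * gr))
    return float(np.tanh(beta * gr) - rhs)


if __name__ == "__main__":
    mu, lam, gamma, h = 1.0, 0.575, 2.6, 0.0
    for beta in (0.5, 30.0):
        r = order_parameter(beta, mu, lam, gamma, h)
        print(beta, r, gap_residual(r, beta, mu, lam, gamma, h) if r > 0 else None)
```

At β = 0.5 the maximizer is r = 0 (high temperature regime), whereas at β = 30 it is strictly positive, close to rmax, and the residual of (2.5) is small, limited only by the grid spacing.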
Fig. 1. Illustration, as a function of µ, of the critical temperature θc = θc(µ, λ, γ, h) such that rβ > 0 if and only if β > θc^{-1} (blue area) for γ = 2.6, h = 0 and with λ = −0.575 (left figure), 0 (center figure) and 0.575 (right figure). The blue line corresponds to a second order phase transition, whereas the red dashed line represents the domain of µ with a first order phase transition. The black dashed line is the chemical potential µ = λ corresponding to an electron density per site equal to 1, see Sec. 3. (Color online.)
Fig. 2. Illustration, as a function of λ, of the critical temperature θc = θc(µ, λ, γ, h) for γ = 2.6, h = 0 and with µ = −0.5 (left figure), µ = 1 (center figure) and µ = 1.25 (right figure). The blue line corresponds to a second order phase transition, whereas the red dashed line represents the domain of λ with a first order phase transition. The black dashed line is the coupling constant λ = µ corresponding to an electron density per site equal to 1, see Sec. 3. (Color online.)
From Lemma 7.1, the set of maximizers of the variational problem (2.4) has at most two elements in [0, 1/4]. It follows by continuity of (β, µ, λ, γ, h, r) → f(β, µ, λ, γ, h; r), and from the fact that the interval [0, 1/4] is compact, that the set
\[
S := \{(\beta,\mu,\lambda,\gamma,h) : \beta,\gamma > 0 \text{ and } r_\beta > 0 \text{ is the unique maximizer of (2.4)}\} \tag{2.7}
\]
is open. In Sec. 3.1, we prove that the set S corresponds to the superconducting phase since the order parameter solution of (2.4) can be interpreted as the Cooper pair condensate density. The boundary ∂S of the set S is called the set of critical points of our model. By definition, if (2.4) has more than one maximizer, then (β, µ, λ, γ, h) ∈ ∂S, whereas if (β, µ, λ, γ, h) ∉ S ∪ ∂S, then r = 0 is the unique maximizer of (2.4). For more details on the study of the variational problem (2.4), we recommend Sec. 7.

3. Phase Diagram at Fixed Chemical Potential

By using our main theorem, i.e. Theorem 2.1, we can now explain the thermodynamic behavior of the strong coupling BCS-Hubbard model HN. The rigorous
proofs are however given in Sec. 6.2. Actually, we concentrate here on the physics of the model extracted from the (finite volume) grand-canonical Gibbs state ωN (1.6) associated with HN. We start by showing the existence of a superconducting phase transition in the thermodynamic limit.

3.1. Existence of an s-wave superconducting phase transition

The solution rβ of (2.4) can be interpreted as an order parameter related to the Cooper pair condensate density ωN(c0* c0)/N, where
\[
c_0 := \frac{1}{\sqrt{N}}\sum_{x\in\Lambda_N} a_{x,\downarrow}a_{x,\uparrow} = \frac{1}{\sqrt{N}}\sum_{k\in\Lambda_N^{*}} \tilde{a}_{k,\downarrow}\tilde{a}_{-k,\uparrow}
\]
(respectively c0*) annihilates (respectively creates) one Cooper pair within the condensate, i.e. in the zero-mode for electron pairs. Indeed, in Sec. 6.2 (see Theorem 6.3) we prove, by using a notion of equilibrium states, the following.

Theorem 3.1 (Cooper Pair Condensate Density). For any β, γ > 0 and real numbers µ, λ, h away from any critical point, the (infinite volume) Cooper pair condensate density equals
\[
\lim_{N\to\infty}\frac{1}{N}\,\omega_N(c_0^{*}c_0) = \lim_{N\to\infty}\frac{1}{N^{2}}\sum_{x,y\in\Lambda_N}\omega_N\bigl(a^{*}_{y,\downarrow}a^{*}_{y,\uparrow}a_{x,\uparrow}a_{x,\downarrow}\bigr) = r_\beta \le \max\{0, r_{\max}\},
\]
with rmax ≤ 1/4 defined in (2.6). The (uniquely defined) order parameter rβ = rβ(µ, λ, γ, h) is an increasing function of γ > 0.

Remark 3.1. In fact, Theorem 3.1 fails only if the order parameter rβ is discontinuous with respect to γ > 0 at fixed (β, µ, λ, h). In this case, the thermodynamic limit of the Cooper pair condensate density is bounded by the left and right limits of the corresponding (infinite volume) density, see the Appendix, in particular (A.1). Similar remarks can be made for Theorems 3.4–3.7.

At least for large enough β and γ, we have explained that rβ > 0, see Figs. 1 and 2. Illustrations of the Cooper pair condensate density rβ as a function of β and λ are given in Fig. 3. In other words, a superconducting phase transition can appear in our model. Its order depends on the parameters: it can be a first order or a second order superconducting phase transition, cf. Fig. 3 and Sec. 7 for more details.

In numerical investigations, rβ was always found to be an increasing function of β > 0. Unfortunately we are able to prove only a part of this fact in Sec. 7. Therefore, a superconducting phase appearing only in a range of non-zero temperatures, as for magnetic superconductors, cannot rigorously be excluded. But we conjecture that our model can never show this phenomenon, i.e. rβ should always be an increasing function of β > 0.
Fig. 3. In the figure on the left, we have three illustrations of the Cooper pair condensate density rβ as a function of the inverse temperature β for λ = 0 (blue line), λ = 0.45 (red line) and λ = 0.575 (green line). The figure on the right represents a 3D illustration of rβ as a function of λ and β. The color from red to blue reflects the decrease of the temperature. In all figures, µ = 1, γ = 2.6 and h = 0. (Color online.)
Observe that a non-trivial solution rβ ≠ 0 is a manifestation of the breakdown of the U(1)-gauge symmetry. To see this phenomenon, we need to perturb the Hamiltonian HN with the external field
\[
\alpha\sqrt{N}\,\bigl(e^{-i\phi}c_0 + e^{i\phi}c_0^{*}\bigr)
\]
for any α ≥ 0 and φ ∈ [0, 2π). This leads to the perturbed Gibbs state ωN,α,φ(·) defined by (1.6) with HN replaced by
\[
H_{N,\alpha,\phi} := H_N - \alpha\sum_{x\in\Lambda_N}\bigl(e^{-i\phi}a_{x,\downarrow}a_{x,\uparrow} + e^{i\phi}a^{*}_{x,\uparrow}a^{*}_{x,\downarrow}\bigr), \tag{3.1}
\]
see (6.42). We then obtain the following result for the so-called Bogoliubov quasi-averages (cf. Theorem 6.2).

Theorem 3.2 (Breakdown of the U(1)-Gauge Symmetry). For any β, γ > 0 and real numbers µ, λ, h away from any critical point, and for any φ ∈ [0, 2π), one gets for the Bogoliubov quasi-average below:
\[
\lim_{\alpha\downarrow 0}\lim_{N\to\infty}\omega_{N,\alpha,\phi}\bigl(c_0/\sqrt{N}\bigr) = \lim_{\alpha\downarrow 0}\lim_{N\to\infty}\frac{1}{N}\sum_{x\in\Lambda_N}\omega_{N,\alpha,\phi}\bigl(a_{x,\downarrow}a_{x,\uparrow}\bigr) = r_\beta^{1/2}e^{i\phi},
\]
with rβ ≥ 0 being the unique solution of (2.4), see Theorem 2.1.

Note that the breakdown of the U(1)-gauge symmetry should be “seen” in experiments via the so-called off diagonal long range order (ODLRO) property of the correlation functions [38], see Sec. 6.2. In fact, because of the permutation invariance, Theorem 3.1 still holds if we remove the space average, i.e. for any
lattice sites x and y ≠ x,
\[
\lim_{N\to\infty}\omega_N\bigl(a^{*}_{y,\downarrow}a^{*}_{y,\uparrow}a_{x,\uparrow}a_{x,\downarrow}\bigr) = r_\beta,
\]
see Theorem 6.3. Similar remarks can be made for Theorems 3.4–3.7. Observe also that the type of superconductivity described here is the s-wave superconductivity, which is defined via the two-point correlation function.

Theorem 3.3 (s-Wave Superconductivity). For any β, γ > 0 and real numbers µ, λ, h away from any critical point, and for any φ ∈ [0, 2π), x, y ∈ Z^d and s1, s2 ∈ {↑, ↓}, the two-point correlation function defined from the Bogoliubov quasi-averages equals
\[
\lim_{\alpha\downarrow 0}\lim_{N\to\infty}\omega_{N,\alpha,\phi}\bigl(a_{x,s_1}a_{y,s_2}\bigr) = r_\beta^{1/2}e^{i\phi}\,\delta_{x,y}\,(1-\delta_{s_1,s_2}),
\]
with rβ ≥ 0 being the unique solution of (2.4), see Theorem 2.1. Here δx,y = 1 if and only if x = y.

In other words, for x, y ∈ Z^d and s1, s2 ∈ {↑, ↓} the two-point correlation function inside the superconducting phase is non-zero if and only if x = y and s1 ≠ s2. More generally, for any infinite volume equilibrium state ω, we have ω(a_{x,s1} a_{y,s2}) = ω(a_{0,s1} a_{0,s2}) δx,y, see Sec. 6. We conclude now this analysis by giving the zero-temperature limit β → ∞ of the Cooper pair condensate density rβ proven in Sec. 7.

Corollary 3.1 (Cooper Pair Condensate Density at Zero-Temperature). The Cooper pair condensate density r∞ = r∞(µ, λ, γ, h) is equal at zero-temperature to
\[
r_\infty := \lim_{\beta\to\infty} r_\beta =
\begin{cases}
r_{\max} & \text{for any } \gamma > \Gamma_{|\mu-\lambda|,\lambda+|h|}, \\
0 & \text{for any } \gamma < \Gamma_{|\mu-\lambda|,\lambda+|h|},
\end{cases}
\]
with rmax ≤ 1/4 (cf. (2.6) and Fig. 4) and
\[
\Gamma_{x,y} := 2\bigl(y + \{y^{2}-x^{2}\}^{1/2}\bigr)\chi_{[0,y)}(x)\,\chi_{(0,\infty)}(y) + 2x\,\chi_{[y,\infty)}(x) \ge 0
\]
being defined for any x ∈ R+ and y ∈ R. Here χK is the characteristic function of the set K ⊂ R.

Remark 3.2. If γ = Γ_{|µ−λ|,λ+|h|}, straightforward estimations show that the order parameter rβ converges to r∞ = 0, see Sec. 7. This special case is a critical point at sufficiently large β. We exclude it in our discussion since all thermodynamic limits of densities in Sec. 3 are performed away from any critical point, see, for instance, Theorem 3.1.

The result of Corollary 3.1 is in accordance with Theorem 3.1 in the sense that the order parameter r∞ is an increasing function of γ ≥ 0. Observe also that
\[
\sup_{\lambda\in\mathbb{R}}\{r_\infty(\mu,\lambda,\gamma,h)\} = r_\infty(\mu,\mu,\gamma,h) = \frac{1}{4}
\]
Fig. 4. In the figure on the left, the blue area represents the domain of (λ, γ) with 1 ≤ γ ≤ 6, where the (zero-temperature) Cooper pair condensate density r∞ is non-zero at µ = 1 and h = 0. The figure on the right represents a 3D illustration of r∞ when 1 ≤ γ ≤ 6 and −2.5 ≤ λ ≤ 2.5 with again µ = 1, h = 0. (Color online.)
for any fixed γ > Γ_{0,µ+|h|}, whereas for any real numbers µ, λ, h,
\[
\lim_{\gamma\to\infty} r_\infty(\mu,\lambda,\gamma,h) = \frac{1}{4}.
\]
In other words, the superconducting phase for µ = λ is as perfect as for γ = ∞. In particular, in order to optimize the Cooper pair condensate density, if µ > 0, then it is necessary to increase the one-site repulsion by tuning λ to µ. Consequently, the direct repulsion between electrons can favor the superconductivity at fixed µ. This phenomenon is confirmed by the following analysis.

First observe that Eq. (2.5) has no solution if γ ≤ 2|µ| and λ = 0. In other words, the strong coupling BCS theory has no phase transition as soon as γ ≤ 2|µ| and µ ≠ 0. However, even if γ ≤ 2|µ|, there is a range of λ where a superconducting phase takes place. For instance, take µ > 0 and note that γ > Γ_{|µ−λ|,λ+|h|} when
\[
0 \le \mu - \frac{\gamma}{2} < \lambda < \mu + \frac{\gamma}{2} - \sqrt{\gamma(\mu+|h|)}. \tag{3.2}
\]
This last inequality can always be satisfied for some λ > 0, if µ + |h| < γ ≤ 2µ. Therefore, although there is no superconductivity for γ ≤ 2|µ| and λ = 0, there is a range of λ ≥ 0 defined by (3.2) for µ + |h| < γ ≤ 2µ, where the superconductivity appears at low enough temperature, see Corollary 3.1 and Fig. 4. In the region γ ≥ 2µ > 0 where the superconducting phase can occur for λ = 0, observe also that the critical temperature θc for λ > 0 can sometimes be larger than the one for λ = 0, cf. Fig. 2.
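The zero-temperature criterion of Corollary 3.1 and the window (3.2) lend themselves to a direct numerical check. The sketch below is only an illustration with hypothetical function names: it implements Γ_{x,y} and r∞ as stated in Corollary 3.1 (returning 0 at the excluded critical value γ = Γ, in line with Remark 3.2) and evaluates the bounds of (3.2) for one arbitrary choice µ = 1, h = 0, γ = 1.8, which satisfies µ + |h| < γ ≤ 2µ.

```python
import math


def Gamma(x, y):
    """Gamma_{x,y} of Corollary 3.1 for x >= 0 and real y."""
    if x < y:                      # then y > 0: first characteristic-function branch
        return 2.0 * (y + math.sqrt(y * y - x * x))
    return 2.0 * x                 # branch chi_{[y, infinity)}(x)


def r_infinity(mu, lam, gamma, h):
    """Zero-temperature Cooper pair condensate density (Corollary 3.1)."""
    if gamma > Gamma(abs(mu - lam), lam + abs(h)):
        return 0.25 - (mu - lam) ** 2 / gamma ** 2      # r_max of (2.6)
    return 0.0                     # gamma <= Gamma (the equality case is critical)


def lambda_window(mu, gamma, h):
    """Lower and upper bounds on lambda in (3.2)."""
    return mu - gamma / 2.0, mu + gamma / 2.0 - math.sqrt(gamma * (mu + abs(h)))


if __name__ == "__main__":
    mu, gamma, h = 1.0, 1.8, 0.0
    print(lambda_window(mu, gamma, h))
    for lam in (0.0, 0.05, 0.15, 0.3, 0.55, 0.6):
        print(lam, r_infinity(mu, lam, gamma, h))
```

For these values the window is roughly 0.1 < λ < 0.558: r∞ vanishes at λ = 0 (no superconductivity without repulsion since γ ≤ 2µ) and becomes strictly positive inside the window, which is the effect described in the text.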
Remark 3.3. The effect of a one-site repulsion on the superconducting phase transition may be surprising since one would naively guess that any repulsion between pairs of electrons should destroy the formation of Cooper pairs. In fact, the one-site and BCS interactions in (1.2) are not diagonal in the same basis, i.e. they do not commute. In particular, the Hubbard interaction cannot be directly interpreted as
a repulsion between Cooper pairs. This interpretation is only valid for large λ ≥ 0. Indeed, at fixed µ and γ > 0, if λ is large enough, there is no superconducting phase.

3.2. Electron density per site and electron-hole symmetry

We give next the grand-canonical density of electrons per site in the system (cf. Theorem 6.4).

Theorem 3.4 (Electron Density per Site). For any β, γ > 0 and real numbers µ, λ, h away from any critical point, the (infinite volume) electron density equals
\[
\lim_{N\to\infty}\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\uparrow}+n_{x,\downarrow}) = d_\beta := 1 + \frac{(\mu-\lambda)\sinh(\beta g_{r_\beta})}{g_{r_\beta}\bigl(e^{\beta\lambda}\cosh(\beta h)+\cosh(\beta g_{r_\beta})\bigr)},
\]
with dβ = dβ(µ, λ, γ, h) ∈ [0, 2], rβ ≥ 0 being the unique solution of (2.4) and g_r := {(µ − λ)² + γ² r}^{1/2}, see Theorem 2.1 and Fig. 5.

At low enough temperature and for γ > Γ_{|µ−λ|,λ+|h|}, Corollary 3.1 tells us that a superconducting phase appears, i.e. rβ > 0. In this case, it is important to note that the electron density becomes independent of the temperature. Indeed, by combining Theorem 3.4 with (2.5) one gets that
\[
d_\beta = 1 + 2\gamma^{-1}(\mu-\lambda) \tag{3.3}
\]
is linear as a function of µ in the domain of (β, µ, λ, γ, h) where rβ > 0, i.e. in the presence of superconductivity, see Fig. 5. We give next the electron density per site in the zero-temperature limit β → ∞, which straightforwardly follows from Theorem 3.4 combined with Corollary 3.1.

Corollary 3.2 (Electron Density per Site at Zero-Temperature). The (infinite volume) electron density d∞ = d∞(µ, λ, γ, h) ∈ [0, 2] at zero-temperature
Fig. 5. In the left and center figures, we give illustrations of the electron density dβ as a function of the chemical potential µ for β < βc (red line) and β > βc (blue line) at coupling constant λ = 0 (left figure, β = 1.4, 2.45) and λ = 0.575 (center figure, β = 4, 6.45). In the figure on the right, dβ is given as a function of β at µ = 0.3 with λ > µ equal to 0.35 (orange line, second order phase transition), 0.575 (blue line, first order phase transition) and 1.575 (green line, no phase transition). In all figures, γ = 2.6, h = 0 and βc = θc^{-1} is the critical inverse temperature. (Color online.)
is equal to
\[
d_\infty := \lim_{\beta\to\infty} d_\beta = 1 + \frac{\mathrm{sgn}(\mu-\lambda)}{1+\delta_{|\mu-\lambda|,\lambda+|h|}(1+\delta_{h,0})}\,\chi_{[\lambda+|h|,\infty)}(|\mu-\lambda|)
\]
for γ < Γ_{|µ−λ|,λ+|h|}, whereas within the superconducting phase, i.e. for γ > Γ_{|µ−λ|,λ+|h|} (Corollary 3.1), d∞ = 1 + 2γ^{-1}(µ − λ). Recall that sgn(0) := 0.

To conclude, observe that (2 − dβ) is the density of holes in the system. So, if µ > λ, then dβ ∈ (1, 2], i.e. there are more electrons than holes in the system, whereas dβ ∈ [0, 1) for µ < λ, i.e. there are more holes than electrons. This phenomenon can directly be seen in the Hamiltonian HN, where there is a symmetry between electrons and holes as in the Hubbard model. Indeed, by replacing the creation operators a*x,↓ and a*x,↑ of electrons by the annihilation operators −bx,↓ and −bx,↑ of holes, we can map the Hamiltonian HN (1.2) for electrons to another strong coupling BCS-Hubbard model for holes defined via the Hamiltonian
\[
\widehat{H}_N := -\mu_{\mathrm{hole}}\sum_{x\in\Lambda_N}(\hat{n}_{x,\uparrow}+\hat{n}_{x,\downarrow}) - h_{\mathrm{hole}}\sum_{x\in\Lambda_N}(\hat{n}_{x,\uparrow}-\hat{n}_{x,\downarrow}) + 2\lambda\sum_{x\in\Lambda_N}\hat{n}_{x,\uparrow}\hat{n}_{x,\downarrow} - \frac{\gamma}{N}\sum_{x,y\in\Lambda_N} b^{*}_{y,\uparrow}b^{*}_{y,\downarrow}b_{x,\downarrow}b_{x,\uparrow} + 2(\lambda-\mu)N - \gamma,
\]
with
\[
\hat{n}_{x,\downarrow} := b^{*}_{x,\downarrow}b_{x,\downarrow}, \qquad \hat{n}_{x,\uparrow} := b^{*}_{x,\uparrow}b_{x,\uparrow}, \qquad h_{\mathrm{hole}} := -h \quad\text{and}\quad \mu_{\mathrm{hole}} := 2\lambda - \mu - \gamma N^{-1}.
\]
Therefore, if one knows the thermodynamic behavior of HN for any h ∈ R and µ ≥ λ (regime with more electrons than holes), one directly gets the thermodynamic properties for µ < λ (regime with more holes than electrons), which correspond to the ones given by the hole Hamiltonian defined above with hhole = −h and a chemical potential for holes µhole > λ at large enough N. Note that the last constant term in the hole Hamiltonian shifts the grand-canonical pressure by a constant, but also the (infinite volume) mean-energy per site (Sec. 3.6).

3.3. Superconductivity versus magnetization: Meißner effect

It is well known that for magnetic fields h with |h| below some critical value h_β^{(c)}, type I superconductors become perfectly diamagnetic in the sense that the magnetic induction in the bulk is zero. Magnetic fields with strength above h_β^{(c)} destroy the superconducting phase completely. This property is the celebrated Meißner or Meißner–Ochsenfeld effect. For small fields h (i.e. |h| < h_β^{(c)}) the magnetic field in the bulk of the superconductor is (almost) cancelled by the presence of steady surface currents. As we do not analyze transport here, we only give the magnetization density explicitly as a function of the external magnetic field h for the strong coupling BCS-Hubbard model. Note that type II superconductors cannot be covered
in the strong coupling regime since the vortices appearing in presence of magnetic fields come from the magnetic kinetic energy.

Theorem 3.5 (Magnetization Density). For any β, γ > 0 and real numbers µ, λ, h away from any critical point, the (infinite volume) magnetization density equals
\[
\lim_{N\to\infty}\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\uparrow}-n_{x,\downarrow}) = m_\beta := \frac{\sinh(\beta h)\,e^{\lambda\beta}}{e^{\lambda\beta}\cosh(\beta h)+\cosh(\beta g_{r_\beta})},
\]
with mβ = mβ(µ, λ, γ, h) ∈ [−1, 1], rβ ≥ 0 being the unique solution of (2.4) and g_r := {(µ − λ)² + γ² r}^{1/2}, see Theorem 2.1 and Fig. 6.

This theorem, deduced from Theorem 6.4, does not seem to show any Meißner effect since mβ ≠ 0 as soon as h ≠ 0. However, when the Cooper pair condensate density rβ is strictly positive, from Theorem 3.5 combined with (2.5) note that
\[
m_\beta = \frac{2 g_{r_\beta}\,e^{\lambda\beta}\sinh(\beta h)}{\gamma\sinh(\beta g_{r_\beta})}. \tag{3.4}
\]
In particular, it decays exponentially as β → ∞ when rβ → r∞ > 0, see Fig. 6. We give therefore the zero-temperature limit β → ∞ of mβ in the next corollary.

Corollary 3.3 (Magnetization Density at Zero-Temperature). The (infinite volume) magnetization density m∞ = m∞(µ, λ, γ, h) ∈ [−1, 1] at zero-temperature is equal to
\[
m_\infty := \lim_{\beta\to\infty} m_\beta = \frac{\mathrm{sgn}(h)}{1+\delta_{|\mu-\lambda|,\lambda+|h|}}\,\chi_{[0,\lambda+|h|]}(|\mu-\lambda|),
\]
Fig. 6. In the figure on the left, we have an illustration of the electron density dβ (blue line), the Cooper pair condensate density rβ (red line) and the magnetization density mβ (green line) as functions of the magnetic field h at β = 7, µ = 1, λ = 0.575 and γ = 2.6. The figure on the right represents a 3D illustration of mβ = mβ (1, 0.575, 2.6, h) as a function of h and β. The color from red to blue reflects the decrease of the temperature. In both figures, we can see the Meißner effect (in the 3D illustration, the area with no magnetization corresponds to rβ > 0). (Color online.)
for γ < Γ_{|µ−λ|,λ+|h|} (see Corollary 3.1), whereas for γ > Γ_{|µ−λ|,λ+|h|} there is no magnetization at zero-temperature since mβ decays exponentially^m as β → ∞ to m∞ = 0.

Consequently, there is no superconductivity, i.e. r∞ = 0, when γ < Γ_{|µ−λ|,λ+|h|} and, as soon as h ≠ 0 with |µ − λ| < λ + |h|, there is a perfect magnetization at zero-temperature, i.e. m∞ = sgn(h). Observe that the condition |µ − λ| > λ + |h| implies from Corollary 3.2 that either d∞ = 0 or d∞ = 2, which implies that m∞ must be zero. On the other hand, if γ > Γ_{|µ−λ|,λ}, we can define the critical magnetic field at zero-temperature by the unique positive solution
\[
h_\infty^{(c)} := \gamma\Bigl(\frac{1}{4} + \gamma^{-2}(\mu-\lambda)^{2}\Bigr) - \lambda > 0 \tag{3.5}
\]
of the equation Γ_{|µ−λ|,λ+y} = γ for y ≥ 0. Then, by increasing |h| up to h_∞^{(c)}, the (zero-temperature) Cooper pair condensate density r∞ stays constant, whereas the (zero-temperature) magnetization density m∞ is zero, i.e. r∞ = rmax and m∞ = 0 for |h| < h_∞^{(c)}, see Corollary 3.1. However, as soon as |h| > h_∞^{(c)}, r∞ = 0 and m∞ = sgn(h), i.e. there is no Cooper pair and a pure magnetization takes place. In other words, the model manifests a pure Meißner effect at zero-temperature corresponding to a superconductor of type I, cf. Fig. 6.

Finally, note that we give an energetic interpretation of the critical magnetic field h_∞^{(c)} after Corollary 3.5. Observe also that a measurement of h_∞^{(c)} (3.5) implies, for instance, a measurement of the chemical potential µ if one knew γ and λ, which could be found via the asymptotics (3.15) of the specific heat, see discussions in Sec. 5.

3.4. Coulomb correlation density

The space distribution of electrons is still unknown and for such a consideration, we need the (infinite volume) Coulomb correlation density
\[
\lim_{N\to\infty}\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\uparrow}n_{x,\downarrow}). \tag{3.6}
\]
Together with the electron and magnetization densities dβ and mβ, the knowledge of (3.6) allows us in particular to explain in detail the difference between superconducting and non-superconducting phases in terms of space distributions of electrons. Actually, by the Cauchy–Schwarz inequality for the states one gets that
\[
\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\uparrow}n_{x,\downarrow}) \le \sqrt{\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\uparrow})}\;\sqrt{\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\downarrow})}. \tag{3.7}
\]

^m Actually, mβ = O(e^{−(γ−2(λ+|h|))β/2}) for γ > Γ_{|µ−λ|,λ+|h|} ≥ 2(λ + |h|).
From Theorems 3.4 and 3.5, the densities of electrons with spin up ↑ and down ↓ equal, respectively,
\[
\lim_{N\to\infty}\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\uparrow}) = \frac{d_\beta+m_\beta}{2}\in[0,1]
\]
and
\[
\lim_{N\to\infty}\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\downarrow}) = \frac{d_\beta-m_\beta}{2}\in[0,1]
\]
for any β, γ > 0 and µ, λ, h away from any critical point. Consequently, by using (3.7) in the thermodynamic limit, the (infinite volume) Coulomb correlation density is always bounded by
\[
0 \le \lim_{N\to\infty}\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\uparrow}n_{x,\downarrow}) \le w_{\max} := \frac{1}{2}\sqrt{d_\beta^{2}-m_\beta^{2}}. \tag{3.8}
\]
If, for instance, (3.6) equals zero, then as soon as an electron is on a definite site, the probability to have a second electron with opposite spin at the same place goes to zero as N → ∞. In this case, there would be no formation of pairs of electrons on a single site. This phenomenon does not appear exactly at finite temperature due to thermal fluctuations. Indeed, we can explicitly compute the Coulomb correlation in the thermodynamic limit (cf. Theorem 6.4):

Theorem 3.6 (Coulomb Correlation Density). For any β, γ > 0 and real numbers µ, λ, h away from any critical point, the (infinite volume) Coulomb correlation density equals^n
\[
\lim_{N\to\infty}\frac{1}{N}\sum_{x\in\Lambda_N}\omega_N(n_{x,\uparrow}n_{x,\downarrow}) = w_\beta := \frac{1}{2}\bigl(d_\beta - m_\beta\coth(\beta h)\bigr),
\]
with wβ = wβ(µ, λ, γ, h) ∈ (0, wmax), see Fig. 7. Here dβ and mβ are, respectively, defined in Theorems 3.4 and 3.5.

^n If h = 0, then wβ(µ, λ, γ, 0) := lim_{h→0} wβ(µ, λ, γ, h).

Consequently, because g_{rβ} ≥ |λ − µ|, for any inverse temperature β > 0 the Coulomb correlation density is never zero, i.e. wβ > 0, even if the electron density dβ is exactly 1, i.e. if λ = µ. Moreover, the upper bound in (3.8) is also never attained. However, for low temperatures, wβ goes exponentially fast with respect to β to one of the bounds in (3.8), cf. Fig. 7. Indeed, one has the following zero-temperature limit:

Corollary 3.4 (Coulomb Correlation Density at Zero-Temperature). The (infinite volume) Coulomb correlation density w∞ = w∞(µ, λ, γ, h) ∈ [0, 1] at
Fig. 7. Illustration of the Coulomb correlation density wβ (red lines) and its corresponding upper bound wmax (blue lines) as a function of β > 0 at µ = 0.2, γ = 2.6, for λ = 1.305 > µ (left figure, dβ < 1), λ = 0.2 = µ (two right figures, dβ = 1), and from the left to the right, with h = 0 (mβ = 0), and h = 0.3, 0.35 (where mβ > 0). The dashed green lines indicate that d∞/2 = 0.5 in the three cases. In the figure on the left there is no superconducting phase, in contrast to the right figures where we see a phase transition for β > 2.3 (second order) or 2.6 (first order). (Color online.)
zero-temperature is equal to
\[
w_\infty := \lim_{\beta\to\infty} w_\beta = \frac{1+\mathrm{sgn}(\mu-\lambda)}{2\bigl(1+\delta_{|\mu-\lambda|,\lambda+|h|}(1+\delta_{h,0})\bigr)}\,\chi_{[\lambda+|h|,\infty)}(|\mu-\lambda|)
\]
for γ < Γ_{|µ−λ|,λ+|h|}, whereas w∞ = d∞/2 for γ > Γ_{|µ−λ|,λ+|h|}, see Corollaries 3.1 and 3.2.

If |µ − λ| > λ + |h|, the interpretation of this asymptotics is clear since either d∞ = 0 for µ < λ or d∞ = 2 for µ > λ. The interesting phenomena occur when |µ − λ| < λ + |h|. In this case, if there is no superconducting phase, i.e. γ < Γ_{|µ−λ|,λ+|h|}, then wβ converges towards w∞ = 0 as β → ∞. In particular, as explained above, if an electron is on a definite site, the probability to have a second electron with opposite spin at the same place goes to zero as N → ∞ and β → ∞. However, in the superconducting phase, i.e. for γ > Γ_{|µ−λ|,λ+|h|}, the upper bound wmax (3.8) is asymptotically attained. Since wmax = d∞/2 as β → ∞, it means that 100% of electrons form Cooper pairs in the limit of zero-temperature, which is in accordance with the fact that the magnetization density must disappear, i.e. m∞ = 0, cf. Corollary 3.3.

As explained in Sec. 3.1, the highest Cooper pair condensate density is 1/4, which corresponds to an electron density d∞ = 1. Actually, although all electrons form Cooper pairs at small temperatures, there are never 100% of electron pairs in the condensate, see Fig. 8. In the special case where d∞ = 1, only 50% of Cooper pairs are in the condensate. The same analysis can be done for hole pairs by changing ax to −b*x in the definition of extensive quantities. Define the electron and hole pair condensate fractions respectively by vβ := 2rβ/dβ and v̂β := 2r̂β/d̂β, where r̂β and d̂β are the hole condensate density and the hole density respectively. Because of the electron-hole symmetry, r̂β = rβ and d̂β = 2 − dβ. In particular, when rβ > 0, we asymptotically get that v̂β + vβ → 1 as β → ∞. Hence, in the superconducting phase, an electron pair condensate fraction below 50% means in fact that there are more than 50% of hole pair condensate and conversely at low temperatures.
For more details concerning ground states in relation with this phenomenon, see discussions around (6.60) in Sec. 6.2.
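The densities of Theorems 3.4–3.6 can be evaluated together once the order parameter rβ is known. The sketch below is an illustration under stated assumptions, not the authors' procedure: rβ is obtained by a plain grid maximization of f on [0, 1/4], the Coulomb correlation is computed in the algebraically equivalent form wβ = (dβ − e^{λβ}cosh(βh)/D)/2 with D := e^{λβ}cosh(βh) + cosh(βg_{rβ}), which avoids coth(βh) at h = 0, and g_{rβ} > 0 is assumed. Function names and parameter values are hypothetical.

```python
import numpy as np


def g(r, mu, lam, gamma):
    """g_r = {(mu - lambda)^2 + gamma^2 r}^(1/2)."""
    return np.sqrt((mu - lam) ** 2 + gamma ** 2 * r)


def order_parameter(beta, mu, lam, gamma, h, n=200001):
    """Grid maximizer of f on [0, 1/4] (illustrative numerical shortcut)."""
    r = np.linspace(0.0, 0.25, n)
    f = -gamma * r + np.log(
        np.cosh(beta * h) + np.exp(-lam * beta) * np.cosh(beta * g(r, mu, lam, gamma))
    ) / beta
    return float(r[np.argmax(f)])


def densities(beta, mu, lam, gamma, h):
    """Return (r, d, m, w, w_max) from Theorems 3.4-3.6 and the bound (3.8)."""
    r = order_parameter(beta, mu, lam, gamma, h)
    gr = g(r, mu, lam, gamma)                       # assumed strictly positive
    boltz = np.exp(lam * beta) * np.cosh(beta * h)
    denom = boltz + np.cosh(beta * gr)
    d = 1.0 + (mu - lam) * np.sinh(beta * gr) / (gr * denom)
    m = np.sinh(beta * h) * np.exp(lam * beta) / denom
    w = 0.5 * (d - boltz / denom)                   # = (d - m coth(beta h)) / 2
    w_max = 0.5 * np.sqrt(d * d - m * m)
    return r, float(d), float(m), float(w), float(w_max)


if __name__ == "__main__":
    mu, lam, gamma = 1.0, 0.575, 2.6
    for h in (0.1, 0.3):                            # below / above the field (3.5)
        print(h, densities(30.0, mu, lam, gamma, h))
```

With these values the critical field (3.5) is about 0.14. At β = 30 the run with h = 0.1 should give rβ > 0, dβ close to 1 + 2γ⁻¹(µ − λ) as in (3.3), a nearly vanishing mβ and wβ close to wmax, whereas h = 0.3 should give rβ = 0, mβ close to 1 and wβ close to 0.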
Fig. 8. The fraction of electron pairs in the condensate (in %) is given in the right and left figures as a function of µ. In the figure on the left, λ = h = 0, with inverse temperatures β = 2.45 (orange line), 3.45 (red line) and 30 (blue line). In the figure on the right, λ = 0.575 and h = 0.1 with β = 5 (orange line), 7 (red line) and 30 (blue line). The center figure illustrates the electron density dβ also as a function of µ at β = 30 (low temperature regime) for λ = h = 0 (red line) and for λ = 0.575 and h = 0.1 (green line). In all figures, γ = 2.6. (Color online.)
3.5. Superconductor-Mott insulator phase transition
By Corollary 3.2, if λ > 0 and the system is not in the superconducting phase (i.e. if rβ = 0), then the electron density converges to either 0, 1 or 2 as β → ∞ since
$$d_\infty = 1 + \mathrm{sgn}(\mu - \lambda). \tag{3.9}$$
We define the phase where the system does not form a pair condensate and the electron density is around 1 as a Mott insulator phase. More precisely, we say that the system forms a Mott insulator if, for some ϵ < 1, some 0 < β0 < ∞, some µ0 ∈ R and some δµ > 0, the electron density satisfies dβ ∈ (1 − ϵ, 1 + ϵ) and rβ = 0 for all (β, µ) ∈ (β0, ∞) × (µ0 − δµ, µ0 + δµ).
As discussed in Sec. 3.4, observe that we have, in this phase, exactly one electron (or hole) localized in each site at the low temperature limit since dβ → 1 and wβ → 0 as β → ∞. To extract the whole region of parameters where such a thermodynamic phase takes place, a preliminary analysis of the function $\Gamma_{x,y}$ defined in Corollary 3.1 is first required. Observe that $\Gamma_{0,y}$ > 0 if and only if y > 0. Consequently, for any real numbers λ and h such that λ + |h| ≤ 0 we have $\Gamma_{0,\lambda+|h|}$ = 0. However, if λ + |h| > 0 then $\Gamma_{0,\lambda+|h|}$ > 0. Meanwhile, at fixed y > 0, the continuous function $\Gamma_{x,y}$ of x ≥ 0 is convex with minimum for x = y, i.e.
$$\inf_{x \geq 0}\{\Gamma_{x,y}\} = \Gamma_{y,y} = 2y > 0. \tag{3.10}$$
In particular, $\Gamma_{x,y}$ is strictly decreasing as a function of x ∈ [0, y] and strictly increasing for x ≥ y. Now, by combining Corollaries 3.1–3.4, we are in a position to extract the set of parameters corresponding to insulating or superconducting phases:
(1) For any γ > 0 and µ, λ ∈ R such that |µ − λ| > max{γ/2, λ + |h|},
J.-B. Bru & W. de Siqueira Pedra
observe first that there is no superconductivity (r∞ = 0), either no electrons or no holes (see (3.9)) and, in any case, no magnetization since m∞ = 0. It is a standard (non-ferromagnetic) insulator.
The next step is now to analyze the thermodynamic behavior for
$$|\mu - \lambda| < \max\{\gamma/2, \lambda + |h|\}, \tag{3.11}$$
which depends on the strength of γ > 0. In items (2) to (4), we assume that (3.11) is satisfied.
(2) If the BCS coupling constant γ satisfies 0 < γ ≤ $\Gamma_{\lambda+|h|,\lambda+|h|}$ = 2(λ + |h|), then from (3.10) combined with Corollary 3.1 there is no Cooper pair for any µ and any λ. In particular, under the condition (3.11) there is a perfect magnetization, i.e. m∞ = sgn(h), and exactly one electron or one hole per site since d∞ = 1 and w∞ = 0. In other words, we obtain a ferromagnetic Mott insulator phase.
(3) Now, if γ > 0 becomes too strong, i.e. γ > $\Gamma_{0,\lambda+|h|}$ = 4(λ + |h|), then for any µ ∈ R such that |µ − λ| < γ/2 there are Cooper pairs because r∞ = rmax > 0, an electron density d∞ equal to (3.3) and no magnetization (m∞ = 0). In this case, observe that all quantities are continuous at |µ − λ| = γ/2. This is a superconducting phase.
(4) The superconductor-Mott insulator phase transition only appears in the intermediate regime where
$$\Gamma_{\lambda+|h|,\lambda+|h|} = 2(\lambda + |h|) < \gamma < \Gamma_{0,\lambda+|h|} = 4(\lambda + |h|), \tag{3.12}$$
cf. Fig. 9. Indeed, the equation $\Gamma_{x,\lambda+|h|}$ = γ has two solutions
$$x_1 := \frac{\gamma^{1/2}\{4(\lambda + |h|) - \gamma\}^{1/2}}{2} \quad \text{and} \quad x_2 := \frac{\gamma}{2} > x_1.$$
In particular, for any µ ∈ R such that |µ − λ| ∈ (x1, γ/2), the BCS coupling constant γ is strong enough to imply superconductivity (r∞ = rmax > 0), with an electron density d∞ equal to (3.3) and no magnetization (m∞ = 0). We are in the superconducting phase. However, for any µ ∈ R such that |µ − λ| < x1, the BCS coupling constant γ becomes too weak and there is no superconductivity (r∞ = 0), exactly one electron per site, i.e. d∞ = 1 and w∞ = 0, and a pure magnetization if h ≠ 0, i.e. m∞ = sgn(h). In this regime, one gets a ferromagnetic Mott insulator phase. All quantities are continuous at |µ − λ| = γ/2 but not at |µ − λ| = x1. In other words, we get a superconductor-Mott insulator phase transition by tuning the chemical potential µ. An illustration of this phase transition is given in Fig. 10, see also Fig. 8.
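The case distinction (1) to (4) can be sketched as a small zero-temperature classifier. This is only an illustration under stated assumptions: the function names and phase labels are hypothetical, exact boundaries are reported as "boundary" without further analysis, and the closed form used in `gamma_threshold` is not quoted from Corollary 3.1 but is an assumed expression chosen because it reproduces the values $\Gamma_{0,y}$ = 4y, $\Gamma_{y,y}$ = 2y and the two solutions x1, x2 given above.

```python
import math


def gamma_threshold(x: float, y: float) -> float:
    """Assumed closed form of Gamma_{x,y} for x >= 0 (not quoted from the text)."""
    if x < 0.0:
        raise ValueError("x must be non-negative")
    if x < y:
        # Decreasing branch on [0, y): equals 4y at x = 0 and tends to 2y as x -> y.
        return 2.0 * (y + math.sqrt(y * y - x * x))
    return 2.0 * x  # increasing branch for x >= y


def zero_temperature_phase(mu: float, lam: float, gamma: float, h: float) -> str:
    """Classify the zero-temperature phase following items (1)-(4)."""
    if gamma <= 0.0:
        raise ValueError("gamma must be positive")
    x = abs(mu - lam)
    y = lam + abs(h)
    edge = max(gamma / 2.0, y)
    if x > edge:
        return "insulator"              # item (1): no electrons or no holes
    if x == edge:
        return "boundary"
    # From here on, condition (3.11) holds.
    if gamma <= 2.0 * y:
        return "mott insulator"         # item (2), ferromagnetic if h != 0
    if gamma > 4.0 * y:
        return "superconductor"         # item (3)
    # Item (4): intermediate regime 2y < gamma <= 4y.
    x1 = math.sqrt(gamma * (4.0 * y - gamma)) / 2.0
    if x > x1:
        return "superconductor"
    if x < x1:
        return "mott insulator"
    return "boundary"


if __name__ == "__main__":
    # Parameters of Fig. 10: lambda = 0.575, gamma = 2.6, h = 0.1.
    for mu in (0.575, 1.0, 2.0):
        print(mu, zero_temperature_phase(mu, 0.575, 2.6, 0.1))
```

With the parameters of Fig. 10, the coupling lies in the intermediate regime (3.12), so sweeping µ through λ ± x1 switches the classification between the two phases.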
[Figure 9: two panels in the (γ, λ) plane, with γ on the horizontal axis and λ on the vertical axis.]
Fig. 9. In both figures, the blue area represents the domain of (λ, γ) where there is a superconducting phase at zero temperature for µ = 1 and h = 0. The two increasing straight lines (green and brown) are γ = 4λ and γ = 2λ for γ ≥ 1. In particular, between these two lines (2λ < γ < 4λ), there is a superconductor-Mott insulator phase transition by tuning µ. (Color online.)
[Figure 10: three panels plotted against µ. Left and center panels: dβ, rβ and mβ; right panel: critical temperature θc.]
Fig. 10. Here λ = 0.575, γ = 2.6, and h = 0.1. In the two figures on the left, we plot the electron density dβ (blue line), the Cooper pair condensate density rβ (red line) and the magnetization density mβ (green line) as functions of µ for β = 7 (left figure) or 30 (low temperature regime, center figure). Observe the superconductor-Mott insulator phase transition which appears in both cases. In the right figure, we illustrate as a function of µ the corresponding critical temperature θc. The blue line corresponds to a second order phase transition, whereas the red dashed line represents the domain of µ with first order phase transition. The black dashed line is the chemical potential µ = λ corresponding to an electron density per site equal to 1. (Color online.)
3.6. Mean-energy per site and the specific heat
To conclude, low-Tc superconductors and high-Tc superconductors differ in the behavior of their specific heat. The former show a discontinuity of the specific heat at the critical point, whereas the specific heat of high-Tc superconductors is continuous. It is therefore interesting to give now the mean energy per site in the thermodynamic limit in order to compute next the specific heat.
Theorem 3.7 (Mean-Energy per Site). For any β, γ > 0 and real numbers µ, λ, h away from any critical point, the (infinite volume) mean energy per site is equal to
$$\lim_{N \to \infty}\{N^{-1}\omega_N(H_N)\} = \varepsilon_\beta := -\mu d_\beta - h m_\beta + 2\lambda w_\beta - \gamma r_\beta,$$
see Theorems 3.1–3.6 and Fig. 11.
[Figure 11: three panels. Left and center: εβ versus β; right: εβ as a function of β and h.]
Fig. 11. In the two figures on the left, we give the mean energy per site εβ as a function of β at h = 0 for λ = 0 (left figure, second order BCS phase transition) or λ = 0.575 (center figure, first order phase transition). The dashed line in both figures is the mean energy per site with zero Cooper pair condensate density. In the right figure, εβ is given as a function of β and h at λ = 0.575. The color from red to blue reflects the decrease of the temperature and the plateau corresponds to the superconducting phase. In all figures, µ = 1 and γ = 2.6. (Color online.)
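At zero-temperature, Corollaries 3.1–3.4 imply an explicit computation of the mean energy per site:
Corollary 3.5 (Mean-Energy per Site at Zero-Temperature). The (infinite volume) mean energy per site ε∞ = ε∞(µ, λ, γ, h) at zero-temperature is equal to
$$\varepsilon_\infty := \lim_{\beta \to \infty}\varepsilon_\beta = -\mu + \frac{\lambda - |\lambda - \mu|}{1 + \delta_{|\mu-\lambda|,\lambda+|h|}(1 + \delta_{h,0})}\,\chi_{[\lambda+|h|,\infty)}(|\mu - \lambda|) - \frac{|h|}{1 + \delta_{|\mu-\lambda|,\lambda+|h|}}\,\chi_{[0,\lambda+|h|]}(|\mu - \lambda|)$$
for γ < $\Gamma_{|\mu-\lambda|,\lambda+|h|}$, whereas for γ > $\Gamma_{|\mu-\lambda|,\lambda+|h|}$
$$\varepsilon_\infty := \lim_{\beta \to \infty}\varepsilon_\beta = -\frac{\gamma}{4} + (\lambda - \mu)(1 + \gamma^{-1}(\mu - \lambda)),$$
cf. Corollary 3.1.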
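The two zero-temperature energies of Corollary 3.5 can be compared numerically. The sketch below makes several assumptions that should be read as such: the superconducting densities are taken as d∞ = 1 + 2(µ − λ)/γ (the inversion of (4.3)), w∞ = d∞/2, m∞ = 0 and r∞ = d∞(2 − d∞)/4; the non-superconducting state is restricted to |µ − λ| < λ + |h|, where d∞ = 1, w∞ = 0 and m∞ = sgn(h); and `crossing_field` is merely the solution of "superconducting energy = −µ − |h|" obtained from these expressions, not a quotation of (3.5). All function names are illustrative.

```python
def energy_from_densities(mu, lam, gamma, h, d, m, w, r):
    """Mean energy per site as in Theorem 3.7: -mu d - h m + 2 lam w - gamma r."""
    return -mu * d - h * m + 2.0 * lam * w - gamma * r


def energy_superconducting(mu, lam, gamma):
    """Zero-temperature energy in the superconducting phase (Corollary 3.5)."""
    return -gamma / 4.0 + (lam - mu) * (1.0 + (mu - lam) / gamma)


def energy_mott(mu, h):
    """Zero-temperature energy for |mu - lam| < lam + |h| without condensate."""
    return -mu - abs(h)


def crossing_field(mu, lam, gamma):
    """Value of |h| at which the two energies above coincide.

    A non-positive result means the energies do not cross for any field.
    """
    return gamma / 4.0 - lam + (mu - lam) ** 2 / gamma


if __name__ == "__main__":
    mu, lam, gamma = 1.0, 0.575, 2.6       # parameters used in Fig. 11
    d = 1.0 + 2.0 * (mu - lam) / gamma     # assumed superconducting density
    r = d * (2.0 - d) / 4.0
    print("from densities :", energy_from_densities(mu, lam, gamma, 0.0, d, 0.0, d / 2.0, r))
    print("closed form    :", energy_superconducting(mu, lam, gamma))
    h_c = crossing_field(mu, lam, gamma)
    print("crossing field :", h_c, "-> Mott energy", energy_mott(mu, h_c))
```

The first two printed values agree, which is a consistency check between Theorem 3.7 and the superconducting expression of Corollary 3.5 under the assumed densities.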
Note that the critical magnetic field $h_\infty^{(c)}$ (3.5) has a direct interpretation in terms of the zero-temperature mean energy per site ε∞. Indeed, if |µ − λ| < λ + |h|, i.e. d∞ ∉ {0, 2}, by equating ε∞ in the superconducting phase with the mean energy ε∞ = −µ − |h| in the non-superconducting (ferromagnetic) state, we directly get that the magnetic field should be equal to |h| = $h_\infty^{(c)}$ (3.5). In other words, the critical magnetic field $h_\infty^{(c)}$ corresponds to the point where the mean energies at zero-temperature in both cases are equal to each other, as it should be. Note that this phenomenon is not true at non-zero temperature since the mean energy per site can be discontinuous as a function of h (even if λ = 0), see Fig. 11.
Now, the specific heat at finite volume equals
$$c_{N,\beta} := -\beta^2 \partial_\beta\{N^{-1}\omega_N(H_N)\} = N^{-1}\beta^2 \omega_N([H_N - \omega_N(H_N)]^2). \tag{3.13}$$
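The identity (3.13) between the β-derivative of the mean energy and the energy fluctuations holds for any Gibbs state over a finite spectrum, and can be checked numerically. The sketch below uses a hypothetical four-level toy spectrum rather than the spectrum of HN; the function names and the finite-difference step are illustrative choices.

```python
import math


def gibbs_mean_and_variance(levels, beta):
    """Mean and variance of the energy in the Gibbs state over `levels`."""
    ground = min(levels)
    # Energies are shifted by the ground level for numerical stability.
    weights = [math.exp(-beta * (e - ground)) for e in levels]
    z = sum(weights)
    mean = sum(wt * e for wt, e in zip(weights, levels)) / z
    variance = sum(wt * (e - mean) ** 2 for wt, e in zip(weights, levels)) / z
    return mean, variance


def specific_heat_from_derivative(levels, beta, n_sites=1, step=1e-5):
    """Left-hand side of (3.13), with a central finite difference in beta."""
    upper, _ = gibbs_mean_and_variance(levels, beta + step)
    lower, _ = gibbs_mean_and_variance(levels, beta - step)
    return -beta ** 2 * (upper - lower) / (2.0 * step) / n_sites


def specific_heat_from_variance(levels, beta, n_sites=1):
    """Right-hand side of (3.13): beta^2 times the energy variance per site."""
    _, variance = gibbs_mean_and_variance(levels, beta)
    return beta ** 2 * variance / n_sites


if __name__ == "__main__":
    toy_levels = [0.0, -1.1, -0.9, -0.85]  # hypothetical spectrum, N = 1
    for beta in (0.5, 2.0, 7.0):
        print(beta,
              specific_heat_from_derivative(toy_levels, beta),
              specific_heat_from_variance(toy_levels, beta))
```

The variance form makes the non-negativity of the finite-volume specific heat explicit, which the derivative form does not.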
However, its thermodynamic limit
$$c_\beta := \lim_{N \to \infty} c_{N,\beta} = -\beta^2 \partial_\beta \varepsilon_\beta + C_\beta \tag{3.14}$$
cannot be easily computed because one cannot exchange the limit N → ∞ and the derivative ∂β, i.e. Cβ = Cβ(µ, λ, γ, h) may be non-zero. For instance, Griffiths arguments [29–31] (Appendix) would allow one to exchange any derivative of the pressure pN and the limit N → ∞ by using the convexity of pN. To compute (3.14) in this way, we would need to prove the (piece-wise) convexity of $\varepsilon_{N,\beta} := N^{-1}\omega_N(H_N)$ as a function of β > 0. As suggested by Fig. 11, this convexity property might hold but it is not proven here. Notice however that if experimental measurements of the specific heat come from a discrete derivative of the mean energy per site εβ, then they clearly amount to neglecting the term Cβ. In this case, i.e. assuming Cβ = 0, we find again the well-known BCS-type behavior of the specific heat in presence of a second order phase transition, see Fig. 12. In addition, if Cβ = 0, then for any µ, λ, h and γ > $\Gamma_{|\mu-\lambda|,\lambda+|h|}$ (Corollary 3.1), we explicitly obtain via direct computations the well-known exponential decay of the specific heat at zero-temperature for s-wave superconductors:
$$c_\beta = \frac{1}{4}(2\lambda\gamma + \gamma^2 - 4\lambda^2)\beta^2 e^{-\beta\gamma} + o(\beta^2 e^{-\beta\gamma}) \quad \text{as } \beta \to \infty. \tag{3.15}$$
(Note that this asymptotic could give access to γ and also λ, see discussions in Sec. 5.) However, if a first order phase transition appears, then the (infinite volume) mean energy per site εβ is discontinuous at the critical temperature θc (cf. Fig. 11) and the specific heat $c_{\theta_c^{-1}}$ is infinite. In Fig. 12, we give an illustration of the ratio ∆c/cmax between the jump ∆c at θ = θc and the maximum value cmax of $c_{\theta_c^{-1}}$. For most standard superconductors (see footnote o), note that the measured values are between 0.6 and 0.7. Numerical computations suggest that this ratio ∆c/cmax may always be bounded in our model by one as soon as a second order phase transition appears.
[Figure 12: three panels. Left and center: specific heat $c_\beta = c_{\theta^{-1}}$ versus θ/θc; right: ∆c/cmax versus λ.]
Fig. 12. Here µ = 1, γ = 2.6 and h = 0. Assuming Cβ = 0, we give 3 plots of the specific heat cβ as a function of the ratio θ/θc between θ := β^{-1} and the critical temperature θc for λ = 0, 0.5 (both left figure, respectively blue and red lines, second order phase transition), and λ = 0.575 (center figure, blue line, first order phase transition). The dashed red line in the center figure indicates what the specific heat at finite volume might be since $c_{\theta_c^{-1}} = +\infty$. The right figure is a plot as a function of λ of the relative specific heat jump, i.e. the ratio ∆c/cmax between the jump ∆c at θ = θc and the maximum value cmax of $c_{\theta_c^{-1}}$ at the same point. The yellow colored area indicates that this ratio numerically computed is formally infinite due to a first order phase transition. (Color online.)
o At least for the following elements: Hg, In, Nb, Pb, Sn, Ta, Tl, V.
4. Phase Diagram at Fixed Electron Density per Site
In any finite volume, the electron density per site is strictly increasing as a function of the chemical potential µ by strict convexity of the pressure. Therefore, for any fixed electron density ρ ∈ (0, 2) there exists a unique µN,β = µN,β(ρ, λ, γ, h) such that
$$\rho = \frac{1}{N}\sum_{x \in \Lambda_N}\omega_N(n_{x,\uparrow} + n_{x,\downarrow}), \tag{4.1}$$
where ωN represents the (finite volume) grand-canonical Gibbs state (1.6) associated with HN and taken at inverse temperature β and chemical potential µ = µN,β. The aim of this section is now to analyze the thermodynamic properties of the model for a fixed ρ instead of a fixed chemical potential µ. We start by investigating it away from any critical point.
4.1. Thermodynamics away from any critical point
In the thermodynamic limit and away from any critical point, the chemical potential µN,β converges to a solution µβ = µβ(ρ, λ, γ, h) of the equation
$$\rho = d_\beta(\mu, \lambda, \gamma, h), \tag{4.2}$$
see Theorem 3.4. For instance, if ρ = 1, the chemical potential µβ is simply given by λ, i.e. µβ(1, λ, γ, h) = λ. At least away from any critical point, this chemical potential µβ is always uniquely defined. Indeed, outside the superconducting phase (see Sec. 3.1), the electron density dβ given by Theorem 3.4 is a strictly increasing continuous function of the chemical potential µ at fixed β > 0. In other words, for any fixed electron density ρ ∈ (0, 2), Eq. (4.2) has a unique solution µβ, i.e. the chemical potential µβ is the inverse of the electron density dβ taken as a function of µ ∈ R. On the other hand, inside the superconducting phase, from (3.3) the chemical potential µβ is also unique and equals
$$\mu_\beta = \frac{\gamma}{2}(\rho - 1) + \lambda, \tag{4.3}$$
see Figs. 5 and 10. In particular, µβ does not depend on h or β as soon as rβ > 0. The gap equation (2.5) then reads
$$\tanh(\beta\gamma g_r) = 2 g_r\left(1 + \frac{e^{\lambda\beta}\cosh(\beta h)}{\cosh(\beta\gamma g_r)}\right), \quad \text{with } g_r := \frac{1}{2}\{(\rho - 1)^2 + 4r\}^{1/2},$$
and 0 ≤ rβ ≤ max{0, ρ(2 − ρ)/4}, for any fixed electron density ρ ∈ (0, 2). Hence, the thermodynamic behavior of the strong coupling BCS-Hubbard model HN is simply given for any ρ ∈ (0, 2), away from any critical point, by setting µ = µβ
[Figure 13: two panels showing rβ versus β.]
Fig. 13. Illustrations of the Cooper pair condensate density rβ as a function of the inverse temperature β for γ = 2.6, h = 0, and densities ρ = 1, 1.7 (respectively left and right figures), with λ = 0 (blue line), 0.5 (red line), 0.75 (green line), and 1 (orange line). The dashed line indicates the value of r∞. (Color online.)
in Sec. 3. In particular, the superconducting phase can appear by tuning each parameter: the BCS coupling constant γ (see (2.6)), the inverse temperature β > 0 (see Corollary 3.1), the coupling constant λ, the magnetic field h (see Sec. 3.3), the chemical potential µ or the electron density ρ (see Sec. 3.5). Therefore, to explain the phase diagram at fixed electron density, it is sufficient to give the behavior of the Cooper pair condensate density rβ as a function of ρ ∈ (0, 2). Everything can be easily performed via numerical methods, see Fig. 13. We restrict our rigorous analysis to the zero-temperature limit of rβ, which is a straightforward consequence of Corollary 3.1 and (4.3).
Corollary 4.1 (Zero-Temperature Cooper Pair Condensate Density). At zero-temperature, fixed electron density ρ ∈ (0, 2) and λ, h ∈ R, the Cooper pair condensate density rβ converges as β → ∞ towards r∞ = ρ(2 − ρ)/4 when γ > max{$\tilde\Gamma_{\rho,\lambda+|h|}$, 0}. Here
$$\tilde\Gamma_{x,y} := \frac{4y}{x(x - 2) + 2}\,\chi_{[0,\infty)}(y)$$
is a function defined for any x, y ∈ R.
Remark 4.1. The case 0 < γ < $\tilde\Gamma_{\rho,\lambda+|h|}$ is more subtle than its analogue with a fixed chemical potential µ, because phase mixtures can take place. See Sec. 4.2.
As explained above, as soon as γ > $\tilde\Gamma_{\rho,\lambda+|h|}$ we can extract from this corollary all the zero-temperature thermodynamics of the strong coupling BCS-Hubbard model by using Corollaries 3.1–3.4. If λ + |h| > 0 and γ satisfy the inequalities
$$\gamma > \min_{\rho \in (0,2)}\{\tilde\Gamma_{\rho,\lambda+|h|}\} = \tilde\Gamma_{0,\lambda+|h|} = \tilde\Gamma_{2,\lambda+|h|} = 2(\lambda + |h|)$$
and
$$\gamma < \max_{\rho \in (0,2)}\{\tilde\Gamma_{\rho,\lambda+|h|}\} = \tilde\Gamma_{1,\lambda+|h|} = 4(\lambda + |h|),$$
it is also clear that the superconductor-Mott insulator phase transition appears by tuning the electron density ρ in the same way as described in Sec. 3.5 for µ. See Fig. 10. In this case however, we recommend Sec. 4.2 for more details because of the subtlety mentioned in Remark 4.1. See Figs. 15 and 16 below.
From (4.3) combined with Corollary 4.1, note that the asymptotics (3.15) of the specific heat at zero-temperature is still valid at fixed electron density ρ as soon as γ > max{$\tilde\Gamma_{\rho,\lambda+|h|}$, 0}. Meanwhile, from Corollary 4.1 the zero-temperature Cooper pair condensate density r∞ does not depend on λ, γ, or h, as soon as γ > $\tilde\Gamma_{\rho,\lambda+|h|}$ is satisfied. Indeed, the chemical potential µβ in the case where rβ > 0 is renormalized, cf. (4.3). In other words, at zero-temperature, the thermodynamic behavior of the strong coupling BCS-Hubbard model for γ > $\tilde\Gamma_{\rho,\lambda+|h|}$ is equal to the well-known behavior of the BCS theory in the strong coupling approximation (λ = h = 0). This phenomenon is also seen by using renormalization methods where it is believed that the Coulomb interaction simply modifies the mass of electrons by creating quasi-particles (which however do not exist in our model).
4.2. Coexistence of ferromagnetic and superconducting phases
Observe that the electron density dβ given by Theorem 3.4 can have discontinuities as a function of the chemical potential µ. This phenomenon appears at the superconductor-Mott insulator phase transition, see Sec. 3.5 and Fig. 10. Because of electron-hole symmetry (Sec. 3.2), without loss of generality we can restrict our study to the case where dβ ∈ [0, 1], i.e. ρ ∈ [0, 1] and µβ ≤ λ. In this regime, the electron density dβ has, at most, one discontinuity point at the so-called critical chemical potential $\mu_\beta^{(c)}$ ≤ λ. In particular, there are two critical electron densities
$$d_\beta^\pm := d_\beta(\mu_\beta^{(c)} \pm 0, \lambda, \gamma, h) \quad \text{with } d_\beta^+ > d_\beta^-.$$
Similarly, we can also define two critical Cooper pair condensate densities $r_\beta^\pm$, two critical magnetization densities $m_\beta^\pm$ (see footnote p) and two critical Coulomb correlation densities $w_\beta^\pm$. Of course, since $r_\beta^+ > r_\beta^- = 0$, we are here on a critical point, i.e.
$$(\beta, \mu_\beta^{(c)}, \lambda, \gamma, h) \in \partial S$$
(see (2.7)), with β, γ > 0 and λ, h ∈ R such that this critical chemical potential $\mu_\beta^{(c)} = \mu_\beta^{(c)}(\lambda, \gamma, h)$ exists.
p If h = 0, then $m_\beta^\pm = 0$.
The thermodynamics of the model for ρ ∉ [$d_\beta^-$, $d_\beta^+$] is already explained in Sec. 4.1 because the solution rβ of (2.4) is unique at µ = µβ. The chemical potential µN,β converges to µβ = $\mu_\beta^{(c)}$, if ρ ∈ [$d_\beta^-$, $d_\beta^+$]. In this case the variational problem (2.4) has exactly two maximizers $r_\beta^\pm$. The thermodynamic behavior of the system
in this regime is not, a priori, clear except from the obvious fact that
$$\lim_{N \to \infty}\frac{1}{N}\sum_{x \in \Lambda_N}\omega_N(n_{x,\uparrow} + n_{x,\downarrow}) = \rho$$
per definition. In particular, it cannot be deduced from the above results. We handle this situation within a much more general framework in Theorem 6.5. As a consequence of this study (see discussions after Theorem 6.5), all the extensive quantities can be obtained in the thermodynamic limit:
Theorem 4.1 (Densities in Coexistent Phases). Take β, γ > 0 and real numbers λ, h in the domain of definition of the critical chemical potential $\mu_\beta^{(c)}$. For any ρ ∈ [$d_\beta^-$, $d_\beta^+$], all densities are uniquely defined:
(i) The Cooper pair condensate density equals
$$\lim_{N \to \infty}\frac{1}{N^2}\sum_{x,y \in \Lambda_N}\omega_N(a^*_{y,\downarrow}a^*_{y,\uparrow}a_{x,\uparrow}a_{x,\downarrow}) = \tau_\rho r_\beta^+, \quad \text{with } \tau_\rho := \frac{\rho - d_\beta^-}{d_\beta^+ - d_\beta^-} \in [0, 1].$$
(ii) The magnetization density equals
$$\lim_{N \to \infty}\frac{1}{N}\sum_{x \in \Lambda_N}\omega_N(n_{x,\uparrow} - n_{x,\downarrow}) = (1 - \tau_\rho)m_\beta^- + \tau_\rho m_\beta^+.$$
(iii) The Coulomb correlation density equals
$$\lim_{N \to \infty}\frac{1}{N}\sum_{x \in \Lambda_N}\omega_N(n_{x,\uparrow}n_{x,\downarrow}) = (1 - \tau_\rho)w_\beta^- + \tau_\rho w_\beta^+.$$
(iv) The mean energy per site equals
$$\lim_{N \to \infty}\{N^{-1}\omega_N(H_N)\} = (1 - \tau_\rho)\varepsilon_\beta^- + \tau_\rho\varepsilon_\beta^+,$$
with $\varepsilon_\beta^\pm := -\mu_\beta^{(c)}\rho - h m_\beta^\pm + 2\lambda w_\beta^\pm - \gamma r_\beta^\pm$.
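The convex interpolation of Theorem 4.1 can be sketched as follows. The class and function names are illustrative, and the numerical inputs in the usage example are hypothetical values chosen only to exercise the formulas; in practice the critical densities would come from solving the variational problem (2.4) at the critical chemical potential.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CriticalDensities:
    d: float  # electron density
    r: float  # Cooper pair condensate density
    m: float  # magnetization density
    w: float  # Coulomb correlation density


def mixing_weight(rho, d_minus, d_plus):
    """tau_rho = (rho - d^-) / (d^+ - d^-), required to lie in [0, 1]."""
    if d_minus == d_plus:
        raise ValueError("the two critical densities must differ")
    tau = (rho - d_minus) / (d_plus - d_minus)
    if not 0.0 <= tau <= 1.0:
        raise ValueError("rho must lie between the two critical densities")
    return tau


def coexistence_densities(rho, minus, plus, mu_c, lam, gamma, h):
    """Densities (i)-(iv) of Theorem 4.1 for the phase with r^- = 0 and the one with r^+ > 0."""
    if minus.r != 0.0:
        raise ValueError("the 'minus' phase is the one without condensate")
    tau = mixing_weight(rho, minus.d, plus.d)

    def energy(phase):
        # epsilon^{+-} as defined in item (iv), with rho in the first term.
        return -mu_c * rho - h * phase.m + 2.0 * lam * phase.w - gamma * phase.r

    return {
        "tau": tau,
        "r": tau * plus.r,
        "m": (1.0 - tau) * minus.m + tau * plus.m,
        "w": (1.0 - tau) * minus.w + tau * plus.w,
        "energy": (1.0 - tau) * energy(minus) + tau * energy(plus),
    }


if __name__ == "__main__":
    # Hypothetical critical values, for illustration only.
    no_condensate = CriticalDensities(d=1.0, r=0.0, m=1.0, w=0.0)
    condensate = CriticalDensities(d=1.2, r=0.24, m=0.0, w=0.6)
    print(coexistence_densities(1.1, no_condensate, condensate,
                                mu_c=0.83, lam=0.575, gamma=2.6, h=0.1))
```

Because every quantity is linear in τρ, a non-zero magnetization and a non-zero condensate density are obtained simultaneously for any ρ strictly between the two critical densities, provided the condensate-free phase is magnetized.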
As a consequence of this theorem, as soon as the magnetic field h ≠ 0, there is a coexistence of ferromagnetic and superconducting phases at low temperatures for ρ ∈ ($d_\beta^-$, $d_\beta^+$). In other words, the Meißner effect is not valid in this interval of electron densities. An illustration of this is given in Fig. 14. Such a phenomenon was also observed in experiments and, from our results, it should occur rather near half-filling (but not exactly at half-filling) and at strong repulsion λ > 0. Additionally, observe that this coexistence of thermodynamic phases can also appear at the critical magnetic field $h_\beta^{(c)}$ (see Sec. 3.3).
Remark 4.2. Coexistence of ferromagnetic and superconducting phases has already been rigorously investigated, see, e.g., [16, 17]. For instance, in [16] such
[Figure 14: three panels. Left and center: rβ and mβ versus β; right: rβ, µβ and mβ versus ρ.]
Fig. 14. In the two figures on the left, we give illustrations of the Cooper pair condensate density rβ and the magnetization density mβ as functions of the inverse temperature β for densities ρ = 0.6 (orange line), 0.7 (magenta line), 0.8 (red line), 0.9 (cyan line). In the figure on the right, we illustrate the coexistence of ferromagnetic and superconducting phases via graphs of rβ, mβ and the chemical potential µβ as functions of ρ for β = 30 (low temperature regime). In all figures, λ = 0.575, γ = 2.6, and h = 0.1. (The small discontinuities around ρ = 1 in the right figure are numerical anomalies.) (Color online.)
phenomenon is shown to be impossible in the ground state of the Vonsovskii–Zener model applied to s-wave superconductors (see footnote q), whereas at finite temperature, numerical computations [17] suggest the contrary. This last analysis [17] is however not performed in detail.
q It is a combination of the BCS interaction (1.3) with the Zener s–d exchange interaction.
The second interesting physical aspect related to densities ρ between the critical densities $d_\beta^-$ and $d_\beta^+$ is a smoothing effect of the extensive quantities (magnetization density, Cooper pair condensate density, etc.) as functions of the inverse temperature β. Indeed, since the critical chemical potential $\mu_\beta^{(c)}$ only exists when a first order phase transition occurs, one could expect that the extensive quantities are not continuous as functions of β > 0. In fact, for ρ ∈ ($d_\beta^-$, $d_\beta^+$), there is a convex interpolation between quantities related to the solutions $r_\beta^- = 0$ and $r_\beta^+ > 0$ of (2.4), see Theorem 4.1. The continuity of the extensive quantities then follows, see Fig. 14. It does not imply however that all densities always become continuous at fixed ρ as a function of the inverse temperature β. For instance, in Fig. 13, the green and orange graphs give two illustrations of a discontinuity of the order parameter rβ at fixed electron density ρ = 1 where µβ = λ. To understand this first order phase transition, another extensive quantity should be additionally fixed, see discussions in Sec. 5 and Fig. 17.
Following these last results, we give now in Fig. 15 other plots of the critical temperature θc = θc(ρ, λ, γ, h), which is defined as usual such that rβ > 0 if and only if β > $\theta_c^{-1}$. In this figure, observe that a positive λ, i.e. a one-site repulsion, can never increase the critical temperature if the electron density ρ is fixed instead of the chemical potential µ, compare with Fig. 2. We also show in Fig. 15 (right figure) that if the density of holes equals the density of electrons, i.e. ρ = 1, then we have a Mott insulator, whereas a small doping of electrons or holes implies either a superconducting phase (blue area) or a superconductor-Mott insulator
[Figure 15: three panels showing the critical temperature θc. Left and center: θc versus λ; right: θc versus ρ.]
Fig. 15. Illustration, as a function of λ (the two figures on the left) or ρ (figure on the right), of the critical temperature θc = θc(ρ, λ, γ, h) for γ = 2.6, h = 0.1 and with ρ = 1 (left figure), ρ = 0.7 (center figure) and λ = 0.575 (right figure). The blue and yellow areas correspond respectively to the superconducting and ferromagnetic-superconducting phases, whereas the red dashed line indicates the domain of λ with a first order phase transition as a function of β or the temperature θ := β^{-1} (it only exists in the left figure). The dashed green line (left figure) is the asymptote when λ → −∞. In the right figure, observe that there is no phase transition for ρ = 1. (Color online.)
[Figure 16: three panels. Left and center: εβ versus β; right: specific heat $c_\beta = c_{\theta^{-1}}$ versus θ/θc.]
Fig. 16. In the two figures on the left, we give illustrations of the mean energy per site εβ as a function of the inverse temperature β for densities ρ = 0.7 (magenta line), 0.9 (cyan line), 1 (green line), 1.1 (blue line) and 1.3 (red line). For ρ = 1, there is no phase transition and for ρ = 0.9 or 1.1 only a ferromagnetic-superconducting phase appears, whereas for ρ = 0.7 or 1.3 this last phase is followed for larger β by a superconducting phase. In the figure on the right, assuming Cβ = 0, we give two plots of the specific heat cβ as a function of the ratio θ/θc between θ := β^{-1} and the critical temperature θc for densities ρ = 0.7 (magenta line) and 0.9 (cyan line). In all figures, λ = 0.575, γ = 2.6, and h = 0.1. (Color online.)
(ferromagnetic) phase (yellow area) related to the superconductor-Mott insulator phase transition described in Sec. 3.5 and Fig. 10. To conclude, Fig. 16 illustrates various thermodynamic features of the system at fixed ρ. First, as a function of β > 0, εβ is continuously differentiable only for ρ = 1. In other words, there is no phase transition, in contrast to the cases with ρ = 0.7, 0.9 or ρ = 1.1, 1.3. This is the Mott insulator phase transition illustrated in Fig. 10. As in Fig. 10, we also observe the electron-hole symmetry implying that ρ = 0.7 and ρ = 1.3, or ρ = 0.9 and ρ = 1.1, have the same phase transitions at exactly the same critical points. As explained in Sec. 3.1, the mean energy per site εβ for ρ = 0.7, 1.3, or ρ = 0.9, 1.1, differs by a constant, i.e. in absolute value by |2λ − µβ|. At high temperatures, i.e. when β → 0, the function εβ diverges to ±∞ if ρ = 1 ∓ ε with ε ∈ (0, 1) whereas it stays finite at ρ = 1. Indeed, when β → 0 the electron density dβ converges to 1 at fixed µ, λ, γ, h, see Theorem 3.4 and Fig. 5. If ρ = 1 ∓ ε, it follows that the chemical potential µβ diverges to ∓∞ as β → 0,
implying that εβ → ±∞. In other words, it is energetically unfavorable to fix an electron density ρ ≠ 1 at high temperatures. Finally, the specific heat cβ has only one jump in the case of one phase transition and two jumps when there are two phase transitions, namely when the superconductor-Mott insulator (ferromagnetic) phase and the purely superconducting phase appear.
5. Concluding Remarks
(1) First, it is important to note that two different physical behaviors can be extracted from the strong coupling BCS-Hubbard model HN: a first one at fixed chemical potential µ and a second one at fixed electron density ρ ∈ (0, 2). This does not mean that the canonical and grand-canonical ensembles are not equivalent for this model. But the influence of the direct interaction with coupling constant λ drastically changes from the case at fixed µ to the other one at fixed ρ. For instance, via Corollary 4.1 (see also Fig. 15), any one-site repulsion between pairs of electrons is in any case unfavorable to the formation of Cooper pairs, as soon as the electron density ρ is fixed. This property is however wrong at fixed chemical potential µ, see Fig. 2. In other words, fixing the electron density ρ is not equivalent (see footnote r) to fixing the chemical potential µ in the model. Physically, a fixed electron density can be modified by doping the superconductor. Changing the chemical potential may be more difficult. One naive proposition would be to impose an electric potential on a superconductor which is coupled to an additional conductor serving as a reservoir of electrons or holes at fixed chemical potential.
r "Equivalent" is not taken here in the sense of the equivalence of ensembles.
(2) A measurement of the asymptotics as β → ∞ of the specific heat cβ (see (3.14) with Cβ = 0) in a superconducting phase would determine, by using (3.15), first the parameter γ > 0 via the exponential decay and then the coupling constant λ. Next, the measurement of the critical magnetic field at very low temperature would allow one to obtain by (3.5) the chemical potential µ and hence the electron density at zero-temperature. Since the inverse temperature β as well as the magnetic field h can directly be measured, all parameters of the strong coupling BCS-Hubbard model HN (1.2) would be experimentally found. In particular, its thermodynamic behavior, explained in Secs. 2–4, could finally be compared with the real system. One could for instance check if the critical temperature θc given by HN in appropriate dimension corresponds to the one measured in the real superconductor. Such studies would highlight the thermodynamic impact of the kinetic energy.
(3) In Sec. 4, the electron density is fixed but one could have fixed any other extensive quantity: the Cooper pair condensate density, the magnetization density, the Coulomb correlation density or the mean-energy per site. For instance, if the magnetization density m ∈ R is fixed, by strict convexity of the pressure there is a
unique magnetic field hN,β = hN,β(µ, λ, γ, m) such that
$$m = \frac{1}{N}\sum_{x \in \Lambda_N}\omega_N(n_{x,\uparrow} - n_{x,\downarrow}).$$
In the thermodynamic limit, we then have hN,β converging to hβ, solution of the equation mβ = m at fixed β, γ > 0 and µ, λ ∈ R. By using Theorem 6.5, we would obtain the thermodynamics of the system for any β, γ > 0 and µ, λ, m ∈ R. More generally, when one of the extensive quantities rβ, dβ, mβ, wβ, or εβ is discontinuous at a critical point, then the thermodynamic limit of the local Gibbs states ωN can be uniquely determined by fixing the corresponding extensive quantity between its critical values. The other extensive quantities are determined in this case by an obvious transcription of Theorem 4.1 for the considered discontinuous quantity at the critical point. Observe, however, that rβ, dβ, mβ, wβ, and εβ should be related, respectively, to the parameters γ, µ, h, λ and β. For instance, the existence of a magnetic field hN,β solution of (4.1) at fixed ρ ∈ (0, 2) is not clear at finite volume.
Figure 17 gives an example of an electron density always equal to 1 for µ = λ together with discontinuity of all other extensive quantities. In order to get well-defined quantities at the thermodynamic limit in this example for parameters allowing a first order phase transition, it is not sufficient to have the electron density fixed. At the critical point we could for instance fix the magnetization density m ∈ R in the ferromagnetic case (h = 0.1) or, in any case, the Coulomb correlation density w ≥ 0 which determines a coupling constant λN,β converging to λβ, see the right illustrations of Fig. 17 with the existence of a critical magnetic field and a critical coupling constant.
(4) To conclude, as explained in the introduction, for a suitable space of states it is possible to define a free energy density functional F (1.5) associated with the Hamiltonians HN. The states minimizing this functional are equilibrium states and imply all the thermodynamics of the strong coupling BCS-Hubbard model discussed in Secs. 3 and 4. Indeed, the weak∗-limit ω∞ of the local Gibbs state ωN as N → ∞ exists and belongs to our set of equilibrium states for any β, γ > 0
Fig. 17. In the two figures on the left, we give illustrations of the Cooper pair condensate density $r_\beta$ (blue line), the magnetization density $m_\beta$ (green line), the Coulomb correlation density $w_\beta$ (red line), and the mean energy per site $\epsilon_\beta$ (orange line) as functions of the inverse temperature $\beta$ for $h = 0$ (left figure) and $h = 0.1$ (center figure), whereas $\mu = \lambda = 0.375$, i.e. $\rho = 1$. In the figure on the right, we illustrate $m_{\beta_c}$ (green line) and $w_{\beta_c}$ (red line), respectively, as functions of $h$ with $\mu = \lambda = 0.375$ and of $\lambda$ with $(\mu, h) = (0.375, 0.1)$, at the critical inverse temperature $\beta_c := \theta_c^{-1} \simeq 3.04$. (Color online.)
April 20, 2010 14:17 WSPC/S0129-055X
268
148-RMP
J070-00395
J.-B. Bru & W. de Siqueira Pedra
and $\mu, \lambda, h \in \mathbb{R}$, cf. Theorem 6.5. In Sec. 6.2, we prove in particular the following properties of equilibrium states:

(i) Any pure equilibrium state $\omega$ satisfies $\omega(a_{x,\downarrow} a_{x,\uparrow}) = r_\beta^{1/2} e^{i\phi}$ for some $\phi \in [0, 2\pi)$. In particular, if $r_\beta \neq 0$ they are not $U(1)$-gauge invariant and show off diagonal long range order [38] (ODLRO), cf. Theorem 6.1, Theorem 6.3 and Corollary 6.1.
(ii) All densities are uniquely defined: the electron density of any equilibrium state $\omega$ is given by $\omega(n_{x,\uparrow} + n_{x,\downarrow}) = d_\beta$, its magnetization density by $\omega(n_{x,\uparrow} - n_{x,\downarrow}) = m_\beta$, and its Coulomb correlation density equals $\omega(n_{x,\uparrow} n_{x,\downarrow}) = w_\beta$, cf. Theorem 6.4.
(iii) The Cooper fields $\Phi_x := a^*_{x,\downarrow} a^*_{x,\uparrow} + a_{x,\uparrow} a_{x,\downarrow}$ and $\Psi_x := i(a^*_{x,\downarrow} a^*_{x,\uparrow} - a_{x,\uparrow} a_{x,\downarrow})$ for pure states become classical in the limit $\gamma\beta \to \infty$, i.e. their fluctuations go to zero in this limit, cf. Theorem 6.6.

Any weak∗ limit point of equilibrium states with diverging inverse temperature is (by definition) a ground state. For $\gamma > 0$ and $\mu, \lambda, h \in \mathbb{R}$, most ground states inherit the properties (i)–(iii) of equilibrium states. In particular, within the GNS representation [32] of pure ground states, Cooper fields are exactly c-numbers, see Corollary 6.2. In this case, correlation functions can explicitly be computed at any order in Cooper fields. Furthermore, notice that even in the case $h = 0$ where the Hamiltonian $H_N$ is spin invariant, there exist ground states breaking the spin $SU(2)$-symmetry. For more details including a precise formulation of these results, we recommend Sec. 6, in particular, Sec. 6.2.

6. Mathematical Foundations of the Thermodynamic Results

The aim of this section is to give all the detailed proofs of the thermodynamics of the strong coupling BCS-Hubbard model $H_N$ (1.2). The central result of this section is the thermodynamic limit of the pressure, i.e. the proof of Theorem 2.1. The main ingredient in this analysis is the celebrated Størmer Theorem [1], which we adapt here for the CAR algebra (see Lemma 6.8).
Our approach follows the Petz–Raggio–Verbeure results in [19], but we would like to mention that the analysis of permutation invariant quantum systems in the thermodynamic limit (with Størmer's theorem as the background) has also been carried out for different classes of systems by other authors, see, e.g., [33, 39]. Finally, we introduce in Sec. 6.2 a notion of equilibrium and ground states by a usual variational principle for the free energy density. The thermodynamics of the strong coupling BCS-Hubbard model described in Secs. 3 and 4 is encoded in this notion, and the thermodynamic limits of local Gibbs states used above for simplicity are special cases of the equilibrium and ground states defined in Sec. 6.2.

Before we proceed, we first define some basic mathematical objects needed in our analysis. Let $\mathcal{I}$ be the set of finite subsets of $\mathbb{Z}^{d \geq 1}$. For any $\Lambda \in \mathcal{I}$ we then define $\mathcal{U}_\Lambda$ as the $C^*$-algebra generated by $\{a_{x,\uparrow}, a_{x,\downarrow}\}_{x \in \Lambda}$ and the identity. Choosing some
April 20, 2010 14:17 WSPC/S0129-055X
148-RMP
J070-00395
Effect of a Locally Repulsive Interaction on s-Wave Superconductors
269
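fixed bijective map $\kappa : \mathbb{N} \to \mathbb{Z}^d$, $\mathbb{N} := \{1, 2, \ldots\}$, $\mathcal{U}_N$ denotes the local $C^*$-algebra $\mathcal{U}_{\{\kappa(1),\ldots,\kappa(N)\}}$ at fixed $N \in \mathbb{N}$, whereas $\mathcal{U}$ is the full $C^*$-algebra, i.e. the closure of the union of all $\mathcal{U}_N$ for any integer $N \geq 1$. Note that
\[
n_{\kappa(l),\uparrow} := a^*_{\kappa(l),\uparrow} a_{\kappa(l),\uparrow}
\quad \text{and} \quad
n_{\kappa(l),\downarrow} := a^*_{\kappa(l),\downarrow} a_{\kappa(l),\downarrow}
\]
are the electron number operators on the site $\kappa(l)$, respectively, with spin up $\uparrow$ and down $\downarrow$. To simplify the notation, as soon as a statement clearly concerns the one-site algebra $\mathcal{U}_1 = \mathcal{U}_{\{\kappa(1)\}}$, we replace $a_{\kappa(1),\uparrow}$, $a_{\kappa(1),\downarrow}$ and $n_{\kappa(1),\uparrow}$, $n_{\kappa(1),\downarrow}$, respectively, by $a_\uparrow$, $a_\downarrow$ and $n_\uparrow$, $n_\downarrow$, whereas any state on $\mathcal{U}_1$ is denoted by $\zeta$ and not by $\omega$, which is by definition a state on more than one site (on $\mathcal{U}_\Lambda$, $\mathcal{U}_N$ or $\mathcal{U}$). Important one-site Gibbs states in our analysis are the states $\zeta_c$ associated for any $c \in \mathbb{C}$ with the Hamiltonian $H_1(c)$ (2.1) and defined by
\[
\zeta_c(A) := \frac{\mathrm{Trace}\big(A\, e^{\beta\{(\mu-h)n_\uparrow + (\mu+h)n_\downarrow + \gamma(c\, a^*_\downarrow a^*_\uparrow + \bar{c}\, a_\uparrow a_\downarrow) - 2\lambda n_\uparrow n_\downarrow\}}\big)}{\mathrm{Trace}\big(e^{\beta\{(\mu-h)n_\uparrow + (\mu+h)n_\downarrow + \gamma(c\, a^*_\downarrow a^*_\uparrow + \bar{c}\, a_\uparrow a_\downarrow) - 2\lambda n_\uparrow n_\downarrow\}}\big)},
\tag{6.1}
\]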
for any $A \in \mathcal{U}_1$. Finally, note that our notation for the "Trace" does not include the Hilbert space where it is evaluated. Using the isomorphisms $\mathcal{U}_\Lambda \simeq \mathcal{B}(\bigwedge \mathbb{C}^{\Lambda \times \{\uparrow,\downarrow\}})$ of $C^*$-algebras, the corresponding Hilbert space is deduced from the local algebra where the operators involved in each statement are living. Now, we are in position to start the proof of Theorem 2.1. It is followed by a rigorous analysis of the corresponding equilibrium and ground states.

6.1. Thermodynamic limit of the pressure: Proof of Theorem 2.1

Since we have already shown the lower bound (2.2) in Sec. 2, to finish the proof of Theorem 2.1 it remains to obtain
\[
\limsup_{N \to \infty} \{p_N(\beta, \mu, \lambda, \gamma, h)\} \leq \sup_{c \in \mathbb{C}} \{-\gamma |c|^2 + p(c)\}.
\tag{6.2}
\]
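The one-site Gibbs states $\zeta_c$ of (6.1) and the right-hand side of (6.2) involve only $4 \times 4$ matrices, so they can be explored numerically. The following is a minimal sketch and not part of the proof. It assumes that $p(c) = \beta^{-1} \ln \mathrm{Trace}(e^{-\beta H_1(c)})$ with $-\beta H_1(c)$ equal to the exponent in (6.1), it realises $a_\uparrow$, $a_\downarrow$ through a Jordan–Wigner construction chosen here for illustration, and all function names are hypothetical. Since $p(c)$ depends only on $|c|$, the supremum is searched over real $c \in [0, 1/2]$.

```python
import numpy as np

# One-site CAR algebra on C^4 via a Jordan-Wigner construction
# (an illustrative choice, not taken from the text).
_S = np.array([[0.0, 1.0], [0.0, 0.0]])
A_UP = np.kron(_S, np.eye(2))
A_DN = np.kron(np.diag([1.0, -1.0]), _S)
N_UP, N_DN = A_UP.T @ A_UP, A_DN.T @ A_DN
PAIR = A_UP @ A_DN  # a_up a_down; PAIR.T represents a_down^* a_up^*


def h1(c, mu, lam, gamma, h):
    """H_1(c), read off from (6.1) as minus the exponent divided by beta."""
    return (-(mu - h) * N_UP - (mu + h) * N_DN + 2.0 * lam * (N_UP @ N_DN)
            - gamma * (c * PAIR.T + np.conj(c) * PAIR))


def pressure(c, beta, mu, lam, gamma, h):
    """p(c) = (1/beta) ln Trace exp(-beta H_1(c)), computed stably."""
    w = np.linalg.eigvalsh(h1(c, mu, lam, gamma, h))
    return (np.log(np.exp(-beta * (w - w.min())).sum()) - beta * w.min()) / beta


def gibbs_expectation(op, c, beta, mu, lam, gamma, h):
    """zeta_c(op) for the one-site Gibbs state (6.1)."""
    w, v = np.linalg.eigh(h1(c, mu, lam, gamma, h))
    weights = np.exp(-beta * (w - w.min()))
    density = (v * (weights / weights.sum())) @ v.conj().T
    return np.trace(op @ density)


def maximize_over_modulus(beta, mu, lam, gamma, h, num=5001):
    """Grid search for sup over c >= 0 of -gamma c^2 + p(c)."""
    grid = np.linspace(0.0, 0.5, num)
    vals = [-gamma * c**2 + pressure(c, beta, mu, lam, gamma, h) for c in grid]
    best = int(np.argmax(vals))
    return grid[best], vals[best]
```

With this convention, a maximizer $c$ of $-\gamma|c|^2 + p(c)$ should satisfy $\zeta_c(a_\uparrow a_\downarrow) = c$ up to the grid resolution, which can be checked by comparing `gibbs_expectation(PAIR, c, ...)` with the value returned by `maximize_over_modulus`.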
We split this proof into several lemmata. But first, we need some additional definitions. We define the set of all $S$-invariant even states. Let $S$ be the set of bijective maps from $\mathbb{N}$ to $\mathbb{N}$ which leave invariant all but finitely many elements. It is a group with respect to composition. The condition
\[
\eta_s : a_{\kappa(l),\#} \mapsto a_{\kappa(s(l)),\#}, \qquad s \in S, \quad l \in \mathbb{N},
\tag{6.3}
\]
defines a group homomorphism $\eta : S \to \mathrm{Aut}(\mathcal{U})$, $s \mapsto \eta_s$ uniquely. Here, $\#$ stands for a spin up $\uparrow$ or down $\downarrow$. Then, let
\[
E_{\mathcal{U}}^{S,+} := \big\{\omega \in E_{\mathcal{U}} : \omega \circ \eta_s = \omega \text{ for any } s \in S, \text{ and }
\omega(a^*_{\kappa(l_1),\#} \cdots a^*_{\kappa(l_t),\#} a_{\kappa(m_1),\#} \cdots a_{\kappa(m_\tau),\#}) = 0 \text{ if } t + \tau \text{ is odd}\big\}
\]
be the set of all $S$-invariant even states, where $E_{\mathcal{U}}$ is the set of all states on $\mathcal{U}$. The set $E_{\mathcal{U}}^{S,+}$ is weak∗-compact and convex. In particular, the set of extreme points of $E_{\mathcal{U}}^{S,+}$, denoted by $\mathcal{E}_{\mathcal{U}}^{S,+}$, is not empty.
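Remark 6.1. Any permutation invariant (p.i.) state on $\mathcal{U}$ is in fact automatically even, see, e.g., [25, Example 5.2.21]. We explicitly write the evenness of states in the definition of $E_{\mathcal{U}}^{S,+}$ because this property is essential in our arguments below.

Now, to fix the notation and for the reader's convenience, we collect well-known results about the so-called relative entropy, cf. [25, 40]. Let $\omega^{(1)}$ and $\omega^{(2)}$ be two states on the local algebra $\mathcal{U}_\Lambda$, with $\omega^{(1)}$ being faithful. Define the relative entropy$^{s}$
\[
S(\omega^{(1)} | \omega^{(2)}) := \mathrm{Trace}(D_{\omega^{(2)}} \ln D_{\omega^{(2)}}) - \mathrm{Trace}(D_{\omega^{(2)}} \ln D_{\omega^{(1)}}),
\]
where $D_{\omega^{(j)}}$ is the density matrix associated to the state $\omega^{(j)}$ with $j = 1, 2$. The relative entropy is super-additive: for any $\Lambda_1, \Lambda_2 \in \mathcal{I}$, $\Lambda_1 \cap \Lambda_2 = \emptyset$, and for any even states $\omega^{(1)}$, $\omega^{(2)}$, $\omega^{(1,2)}$, respectively, on $\mathcal{U}_{\Lambda_1}$, $\mathcal{U}_{\Lambda_2}$ and $\mathcal{U}_{\Lambda_1 \cup \Lambda_2}$, $\omega^{(1)}$ and $\omega^{(2)}$ faithful, we have
\[
S(\omega^{(1)} \otimes \omega^{(2)} \,|\, \omega^{(1,2)}) \geq S(\omega^{(1)} \,|\, \omega^{(1,2)}|_{\mathcal{U}_{\Lambda_1}}) + S(\omega^{(2)} \,|\, \omega^{(1,2)}|_{\mathcal{U}_{\Lambda_2}}).
\tag{6.4}
\]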
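For even states $\omega^{(1)}$ and $\omega^{(2)}$, respectively on $\mathcal{U}_{\Lambda_1}$ and $\mathcal{U}_{\Lambda_2}$ with $\Lambda_1 \cap \Lambda_2 = \emptyset$, the even state $\omega^{(1)} \otimes \omega^{(2)}$ is the unique extension of $\omega^{(1)}$ and $\omega^{(2)}$ on $\mathcal{U}_{\Lambda_1 \cup \Lambda_2}$ satisfying for all $A \in \mathcal{U}_{\Lambda_1}$ and all $B \in \mathcal{U}_{\Lambda_2}$,
\[
\omega^{(1)} \otimes \omega^{(2)}(AB) = \omega^{(1)}(A)\, \omega^{(2)}(B).
\]
The state $\omega^{(1)} \otimes \omega^{(2)}$ is called the product of $\omega^{(1)}$ and $\omega^{(2)}$. The product of even states is an associative operation. In particular, products of even states can be defined with respect to any countable set $\{\mathcal{U}_{\Lambda_n}\}_{n \in \mathbb{N}}$ of subalgebras of $\mathcal{U}$ with $\Lambda_m \cap \Lambda_n = \emptyset$ for $m \neq n$. Observe that the relative entropy becomes additive with respect to product states: if $\omega^{(1,2)} = \hat{\omega}^{(1)} \otimes \hat{\omega}^{(2)}$, where $\hat{\omega}^{(1)}$ and $\hat{\omega}^{(2)}$ are two even states respectively on $\mathcal{U}_{\Lambda_1}$ and $\mathcal{U}_{\Lambda_2}$, then (6.4) is satisfied with equality. The relative entropy is also convex: for any states $\omega^{(1)}$, $\omega^{(2)}$, and $\omega^{(3)}$ on $\mathcal{U}_\Lambda$, $\omega^{(1)}$ faithful, and for any $\tau \in (0, 1)$
\[
S(\omega^{(1)} \,|\, \tau \omega^{(2)} + (1 - \tau)\omega^{(3)}) \leq \tau S(\omega^{(1)} \,|\, \omega^{(2)}) + (1 - \tau) S(\omega^{(1)} \,|\, \omega^{(3)}).
\tag{6.5}
\]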
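The definition of $S$ and the inequalities (6.4) and (6.5) can be checked numerically on small matrices. The sketch below is only an illustration under stated assumptions: it replaces the product of even states on the CAR algebra by the ordinary tensor product of matrix algebras, it draws random full-rank density matrices, and every helper name is hypothetical.

```python
import numpy as np


def random_density(dim, rng):
    """Random full-rank density matrix (hypothetical test helper)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    d = g @ g.conj().T + 1e-3 * np.eye(dim)
    return d / np.trace(d).real


def log_density(d):
    """Matrix logarithm of a positive definite Hermitian matrix."""
    w, v = np.linalg.eigh(d)
    return (v * np.log(w)) @ v.conj().T


def rel_entropy(d1, d2):
    """S(omega1 | omega2) = Tr(D2 ln D2) - Tr(D2 ln D1), with D1 faithful."""
    return np.trace(d2 @ (log_density(d2) - log_density(d1))).real


def partial_traces(d12, dim1, dim2):
    """Restrictions of a state on a tensor product to each factor."""
    t = d12.reshape(dim1, dim2, dim1, dim2)
    return np.einsum('ijkj->ik', t), np.einsum('ijik->jk', t)


def convexity_gap(d1, d2, d3, tau):
    """Right-hand side minus left-hand side of (6.5)."""
    mixed = tau * d2 + (1.0 - tau) * d3
    return (tau * rel_entropy(d1, d2) + (1.0 - tau) * rel_entropy(d1, d3)
            - rel_entropy(d1, mixed))


def superadditivity_gap(s1, s2, d12):
    """Left-hand side minus right-hand side of (6.4) for matrix algebras."""
    r1, r2 = partial_traces(d12, s1.shape[0], s2.shape[0])
    return (rel_entropy(np.kron(s1, s2), d12)
            - rel_entropy(s1, r1) - rel_entropy(s2, r2))
```

Both gaps should be non-negative for any admissible input, and `superadditivity_gap` should vanish when `d12` is itself a tensor product, in line with the additivity statement above.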
Meanwhile
\[
S(\omega^{(1)} \,|\, \tau \omega^{(2)} + (1 - \tau)\omega^{(3)}) \geq \tau \log \tau + (1 - \tau) \log(1 - \tau) + \tau S(\omega^{(1)} \,|\, \omega^{(2)}) + (1 - \tau) S(\omega^{(1)} \,|\, \omega^{(3)}),
\tag{6.6}
\]
for any $\tau \in (0, 1)$. Note that the relative entropy makes sense in a class of states on $\mathcal{U}$ much larger than that of even states on $\mathcal{U}_\Lambda$ (cf. [40]), but this is not needed here.

$^{s}$ As in [40] we use the Araki–Kosaki definition, which has the opposite sign to the one given in [25].

The condition
\[
\sigma : a_{\kappa(l),\#} \mapsto a_{\kappa(l+1),\#}
\]
uniquely defines a homomorphism $\sigma$ on $\mathcal{U}$ called right-shift homomorphism. Any state $\omega$ on $\mathcal{U}$ such that $\omega = \omega \circ \sigma$ is called shift-invariant and we denote by $E_{\mathcal{U}}^{\sigma}$ the
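set of shift-invariant states on $\mathcal{U}$. An important class of shift-invariant states are product states $\omega_\zeta$ obtained by "copying" some even state $\zeta$ of the one-site algebra $\mathcal{U}_1$ on all other sites, i.e.
\[
\omega_\zeta := \bigotimes_{k=0}^{\infty} \zeta \circ \sigma^k.
\tag{6.7}
\]
Such product states are important and used below as reference states. More generally, a state $\omega$ is $L$-periodic with $L \in \mathbb{N}$ if $\omega = \omega \circ \sigma^L$. For each $L \in \mathbb{N}$, the set of all $L$-periodic states from $E_{\mathcal{U}}$ is denoted by $E_{\mathcal{U}}^{\sigma^L}$. Let $\zeta$ be any faithful even state on $\mathcal{U}_1$ and let $\omega$ be any $L$-periodic state on $\mathcal{U}$. It immediately follows from super-additivity (6.4) that for any $N, M \in \mathbb{N}$
\[
S(\omega_\zeta|_{\mathcal{U}_{(M+N)L}} \,|\, \omega|_{\mathcal{U}_{(M+N)L}}) \geq S(\omega_\zeta|_{\mathcal{U}_{ML}} \,|\, \omega|_{\mathcal{U}_{ML}}) + S(\omega_\zeta|_{\mathcal{U}_{NL}} \,|\, \omega|_{\mathcal{U}_{NL}}).
\]
In particular, the following limit exists
\[
\tilde{S}(\zeta, \omega) := \lim_{N \to \infty} \frac{S(\omega_\zeta|_{\mathcal{U}_{NL}} \,|\, \omega|_{\mathcal{U}_{NL}})}{NL} = \sup_{N \in \mathbb{N}} \frac{S(\omega_\zeta|_{\mathcal{U}_{NL}} \,|\, \omega|_{\mathcal{U}_{NL}})}{NL}
\tag{6.8}
\]
and is the relative entropy density of $\omega$ with respect to the reference state $\zeta$. This functional has the following important properties:

Lemma 6.1 (Properties of the Relative Entropy Density). At any fixed $L \in \mathbb{N}$, the relative entropy density functional $\omega \mapsto \tilde{S}(\zeta, \omega)$ is lower weak∗-semicontinuous, i.e. for any faithful even state $\zeta \in E_{\mathcal{U}_1}$ and any $r \in \mathbb{R}$, the set
\[
M_r := \{\omega \in E_{\mathcal{U}}^{\sigma^L} : \tilde{S}(\zeta, \omega) > r\}
\]
is open with respect to the weak∗-topology. It is also affine, i.e. for any faithful state $\zeta \in E_{\mathcal{U}_1}$ and states $\omega, \omega' \in E_{\mathcal{U}}^{\sigma^L}$
\[
\tilde{S}(\zeta, \tau \omega + (1 - \tau)\omega') = \tau \tilde{S}(\zeta, \omega) + (1 - \tau) \tilde{S}(\zeta, \omega'),
\]
with $\tau \in (0, 1)$.

Proof. Without loss of generality, let $L = 1$. From the second equality of (6.8),
\[
M_r = \bigcup_{N \in \mathbb{N}} \{\omega \in E_{\mathcal{U}}^{\sigma} : S(\omega_\zeta|_{\mathcal{U}_N} \,|\, \omega|_{\mathcal{U}_N}) > rN\}.
\]
As the maps $\omega \mapsto S(\omega_\zeta|_{\mathcal{U}_N} \,|\, \omega|_{\mathcal{U}_N})$ are weak∗-continuous for each $N$, it follows that $M_r$ is the union of open sets, which implies the lower weak∗-semicontinuity of the relative entropy density functional. Moreover, from (6.5) and (6.6) we directly obtain that $\tilde{S}(\zeta, \omega)$ is affine, since the term $\tau \log \tau + (1 - \tau)\log(1 - \tau)$ in (6.6) does not depend on $N$ and vanishes after division by $N$ in the limit $N \to \infty$.

Notice that any p.i. state is automatically shift-invariant. Thus, the mean relative entropy density is a well-defined functional on $E_{\mathcal{U}}^{S,+}$. Now, we need to define on $E_{\mathcal{U}}^{S,+}$ the functional $\Delta(\omega)$ relating to the mean BCS interaction energy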
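per site:

Lemma 6.2 (BCS Energy per Site for p.i. States). For any $\omega \in E_{\mathcal{U}}^{S,+}$, the mean BCS interaction energy per site in the thermodynamic limit
\[
\Delta(\omega) := \lim_{N \to \infty} \frac{\gamma}{N^2} \sum_{l,m=1}^{N} \omega(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow})
= \gamma\, \omega(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow} a_{\kappa(2),\downarrow} a_{\kappa(2),\uparrow})
\]
is well-defined and the affine map $\Delta : E_{\mathcal{U}}^{S,+} \to \mathbb{C}$, $\omega \mapsto \Delta(\omega)$ is weak∗-continuous.

Proof. First,
\[
\sum_{l,m=1}^{N} \omega(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow})
= \sum_{l=1}^{N} \omega(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(l),\downarrow} a_{\kappa(l),\uparrow})
+ \sum_{\substack{l,m=1 \\ l \neq m}}^{N} \omega(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow}).
\tag{6.9}
\]
Since $\omega \in E_{\mathcal{U}}^{S,+}$, for any $l \neq m$ observe that
\[
\omega(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow}) = \omega(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow} a_{\kappa(2),\downarrow} a_{\kappa(2),\uparrow}),
\tag{6.10}
\]
whereas
\[
\omega(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(l),\downarrow} a_{\kappa(l),\uparrow}) = \omega(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow} a_{\kappa(1),\downarrow} a_{\kappa(1),\uparrow}).
\tag{6.11}
\]
The first sum in (6.9) thus consists of $N$ equal, bounded terms and vanishes after division by $N^2$ as $N \to \infty$, while the second sum consists of $N(N-1)$ equal terms. Therefore, by combining (6.9) with (6.10) and (6.11), the lemma follows.

Now, we define by
\[
\omega^{H}(A) := \frac{\mathrm{Trace}(A\, e^{-\beta H})}{\mathrm{Trace}(e^{-\beta H})}, \qquad A \in \mathcal{U}_\Lambda,
\tag{6.12}
\]
the Gibbs state associated with any self-adjoint element $H$ of $\mathcal{U}_\Lambda$ at inverse temperature $\beta > 0$. This definition is of course in accordance with the Gibbs state $\omega_N$ (1.6) associated with the Hamiltonian$^{t}$ $H_N$ (1.2) since $\omega_N = \omega^{H_N}$ for any $N \in \mathbb{N}$. Note however, that the state $\omega_N$ is seen either as defined on the local algebra $\mathcal{U}_N$ or as defined on the whole algebra $\mathcal{U}$ by periodically extending it (with period $N$).

$^{t}$ With the appropriate numbering of sites defined by the bijective map $\kappa$.

Next we give an important property of Gibbs states (6.12):

Lemma 6.3 (Passivity of Gibbs States). Let $H_0$, $H_1$ be self-adjoint elements from $\mathcal{U}_\Lambda$ and define for any state $\omega$ on $\mathcal{U}_\Lambda$
\[
F_\Lambda(\omega) := -\omega(H_1) - \beta^{-1} S(\omega^{H_0} | \omega) + P^{H_0},
\]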
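where $P^{H} := \beta^{-1} \ln \mathrm{Trace}(e^{-\beta H})$ for any self-adjoint $H \in \mathcal{U}_\Lambda$. Then $P^{H_1 + H_0} \geq F_\Lambda(\omega)$ for any state $\omega$ on $\mathcal{U}_\Lambda$, with equality if $\omega = \omega^{H_0 + H_1}$. Note that $-F_\Lambda(\omega)$ is the free energy associated with the state $\omega$.

Proof. For any self-adjoint $H \in \mathcal{U}_\Lambda$ and any state $\omega$ on $\mathcal{U}_\Lambda$ observe that
\[
\mathrm{Trace}(D_\omega \ln D_{\omega^{H}}) = \mathrm{Trace}(D_\omega \ln(\exp(-\beta P^{H} - \beta H))) = -\beta \omega(H) - \beta P^{H},
\tag{6.13}
\]
which implies that
\[
P^{H_1 + H_0} = -\beta^{-1}\big(\mathrm{Trace}(D_{\omega^{H_0 + H_1}} \ln D_{\omega^{H_0 + H_1}}) - \mathrm{Trace}(D_{\omega^{H_0 + H_1}} \ln D_{\omega^{H_0}})\big) - \omega^{H_0 + H_1}(H_1) + P^{H_0},
\tag{6.14}
\]
i.e. $P^{H_1 + H_0} = F_\Lambda(\omega^{H_0 + H_1})$. Without loss of generality take any faithful state $\omega$ on $\mathcal{U}_\Lambda$. In this case, there are positive numbers $\lambda_j$ with $\sum_j \lambda_j = 1$ and vectors $|j\rangle$ from the Hilbert space $\mathcal{H}_\Lambda$ such that $\omega(\cdot) = \sum_j \lambda_j \langle j| \cdot |j\rangle$. In particular, from (6.13) we have
\[
-\beta \omega(H_1) - S(\omega^{H_0} | \omega) + \beta P^{H_0} = \sum_j \lambda_j \big(-\ln \lambda_j - \beta \langle j| H_0 + H_1 |j\rangle\big).
\]
Consequently, by convexity of the exponential function combined with Jensen's inequality we obtain that
\[
\exp(-\beta \omega(H_1) - S(\omega^{H_0} | \omega) + \beta P^{H_0}) \leq \sum_j \lambda_j \exp(-\ln \lambda_j - \beta \langle j| H_0 + H_1 |j\rangle)
\leq \mathrm{Trace}(\exp(-\beta(H_0 + H_1))) = \exp(\beta P^{H_1 + H_0}).
\]
Note that the last inequality uses the so-called Peierls–Bogoliubov inequality, which is again a consequence of Jensen's inequality.

This proof is standard (see, e.g., [25]). It is only given in detail here because we also need Eqs. (6.13) and (6.14) later. Observe that Lemma 6.3 applied to $\omega = \omega^{H_0}$ gives the Bogoliubov (convexity) inequality [29]. We can also deduce from this lemma that the pressure $p_N(\beta, \mu, \lambda, \gamma, h)$ (1.4) associated with $H_N$ equals
\[
p_N(\beta, \mu, \lambda, \gamma, h) = \frac{\gamma}{N^2} \sum_{l,m=1}^{N} \omega_N(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow})
- \frac{1}{\beta N} S(\omega_{\zeta_0}|_{\mathcal{U}_N} \,|\, \omega_N) + p_N(\beta, \mu, \lambda, 0, h),
\tag{6.15}
\]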
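for any $\beta, \gamma > 0$ and real numbers $\mu, \lambda, h$. Recall that $\omega_{\zeta_0}$ is the shift-invariant state obtained by "copying" the state $\zeta_0$ (6.1) of the one-site algebra $\mathcal{U}_1$, see (6.7).

Lemma 6.4 (From $S$ to the Relative Entropy Density $\tilde{S}$ at Finite $N$). Let $\tilde{\omega}_N$ be the shift-invariant state defined by
\[
\tilde{\omega}_N := \frac{1}{N}(\omega_N + \omega_N \circ \sigma + \cdots + \omega_N \circ \sigma^{N-1}),
\]
where $\sigma$ is the right-shift homomorphism. Then $S(\omega_{\zeta_0}|_{\mathcal{U}_N} \,|\, \omega_N) = N \tilde{S}(\zeta_0, \tilde{\omega}_N)$, cf. (6.8).

Proof. By Lemma 6.1 combined with (6.8), the relative entropy density $\tilde{S}(\zeta_0, \tilde{\omega}_N)$ equals
\[
\tilde{S}(\zeta_0, \tilde{\omega}_N) = \lim_{M \to \infty} \frac{1}{MN} \bigg\{ \frac{1}{N} \sum_{k=0}^{N-1} S(\omega_{\zeta_0}|_{\mathcal{U}_{MN}} \,|\, \omega_N \circ \sigma^k|_{\mathcal{U}_{MN}}) \bigg\},
\tag{6.16}
\]
for any fixed $N \in \mathbb{N}$. By using now the additivity of the relative entropy for product states observe that
\[
S(\omega_{\zeta_0}|_{\mathcal{U}_{MN}} \,|\, \omega_N \circ \sigma^k|_{\mathcal{U}_{MN}}) = (M - 1) S(\omega_{\zeta_0}|_{\mathcal{U}_N} \,|\, \omega_N|_{\mathcal{U}_N}) + S(\omega_{\zeta_0}|_{\mathcal{U}_k} \,|\, \omega_N|_{\mathcal{U}_k}) + S(\omega_{\zeta_0}|_{\mathcal{U}_{N-k}} \,|\, \omega_N|_{\mathcal{U}_{N-k}}),
\tag{6.17}
\]
for any $k \in \{0, \ldots, N - 1\}$, with $S(\omega_{\zeta_0}|_{\mathcal{U}_0} \,|\, \omega_N|_{\mathcal{U}_0}) := 0$ by definition. Therefore the equality $S(\omega_{\zeta_0}|_{\mathcal{U}_N} \,|\, \omega_N) = N \tilde{S}(\zeta_0, \tilde{\omega}_N)$ directly follows from (6.16) combined with (6.17).

We are now in position to give a first general upper bound for the pressure $p_N(\beta, \mu, \lambda, \gamma, h)$ by using the equality (6.15) together with Lemmas 6.2 and 6.4.

Lemma 6.5 (General Upper Bound of the Pressure $p_N$). For any $\beta, \gamma > 0$ and $\mu, \lambda, h \in \mathbb{R}$, one gets that
\[
\limsup_{N \to \infty} \{p_N(\beta, \mu, \lambda, \gamma, h)\} \leq p(\beta, \mu, \lambda, 0, h) + \sup_{\omega \in \mathcal{E}_{\mathcal{U}}^{S,+}} \{\Delta(\omega) - \beta^{-1} \tilde{S}(\zeta_0, \omega)\},
\]
where we recall that $\mathcal{E}_{\mathcal{U}}^{S,+}$ is the non-empty set of extreme points of $E_{\mathcal{U}}^{S,+}$.

Proof. By (6.15) combined with Lemma 6.4 one gets
\[
p_N(\beta, \mu, \lambda, \gamma, h) = \frac{\gamma}{N^2} \sum_{l,m=1}^{N} \omega_N(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow})
- \beta^{-1} \tilde{S}(\zeta_0, \tilde{\omega}_N) + p_N(\beta, \mu, \lambda, 0, h).
\tag{6.18}
\]
The last term of this equality is independent of $N \in \mathbb{N}$ since
\[
p_N(\beta, \mu, \lambda, 0, h) = \frac{1}{\beta} \ln \mathrm{Trace}\big(e^{\beta[(\mu - h)n_\uparrow + (\mu + h)n_\downarrow - 2\lambda n_\uparrow n_\downarrow]}\big) =: p(\beta, \mu, \lambda, 0, h),
\tag{6.19}
\]
cf. (2.3).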
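The trace in (6.19) can be evaluated in closed form. The worked computation below is a sketch that assumes only that $n_\uparrow$ and $n_\downarrow$ are commuting projections on the four-dimensional one-site space, so that their joint eigenvalues are $(0,0)$, $(1,0)$, $(0,1)$ and $(1,1)$:

```latex
\begin{align*}
\mathrm{Trace}\big(e^{\beta[(\mu-h)n_\uparrow + (\mu+h)n_\downarrow - 2\lambda n_\uparrow n_\downarrow]}\big)
  &= 1 + e^{\beta(\mu-h)} + e^{\beta(\mu+h)} + e^{2\beta(\mu-\lambda)}, \\
p(\beta,\mu,\lambda,0,h)
  &= \frac{1}{\beta}\ln\big(1 + 2e^{\beta\mu}\cosh(\beta h) + e^{2\beta(\mu-\lambda)}\big).
\end{align*}
```

Assuming, as (6.19) indicates, that $H_N$ at $\gamma = 0$ is a sum of commuting one-site terms, the $N$-site trace factorizes into $N$ copies of this one-site trace, and dividing its logarithm by $\beta N$ returns the same value for every $N$.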
However, the other terms require the knowledge of the states $\omega_N$ and $\tilde{\omega}_N$ in the limit $N \to \infty$. Actually, because the unit ball in the dual of $\mathcal{U}$ is a compact metrizable space with respect to the weak∗-topology, the sequence $\{\tilde{\omega}_N\}$ converges in the weak∗-topology along a subsequence towards $\omega_\infty$. Meanwhile, it is easy to see that for all $A \in \mathcal{U}_\Lambda$, $\Lambda \in \mathcal{I}$,
\[
\lim_{N \to \infty} \{\omega_N(A) - \tilde{\omega}_N(A)\} = 0.
\]
Thus, the sequences of states $\omega_N$ and $\tilde{\omega}_N$ have the same limit points. Since $\omega_N$ is even and permutation invariant with respect to the $N$ first sites, the state $\omega_\infty$ belongs to $E_{\mathcal{U}}^{S,+}$. We now estimate the first term of (6.18) as in Lemma 6.2 to get
\[
\limsup_{N \to \infty} \{p_N(\beta, \mu, \lambda, \gamma, h)\} \leq p(\beta, \mu, \lambda, 0, h) + \gamma\, \omega_\infty(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow} a_{\kappa(2),\downarrow} a_{\kappa(2),\uparrow})
+ \beta^{-1} \limsup_{N \to \infty} \{-\tilde{S}(\zeta_0, \tilde{\omega}_N)\}.
\tag{6.20}
\]
From Lemma 6.1 the relative entropy density is lower semicontinuous in the weak∗-topology, which implies that
\[
\limsup_{N \to \infty} \{-\tilde{S}(\zeta_0, \tilde{\omega}_N)\} \leq -\tilde{S}(\zeta_0, \omega_\infty).
\]
By combining this last inequality with (6.20) we then find that
\[
\limsup_{N \to \infty} \{p_N(\beta, \mu, \lambda, \gamma, h)\} \leq p(\beta, \mu, \lambda, 0, h) + \Delta(\omega_\infty) - \beta^{-1} \tilde{S}(\zeta_0, \omega_\infty),
\tag{6.21}
\]
with $\omega_\infty \in E_{\mathcal{U}}^{S,+}$.

Now, from Lemma 6.2 the functional $\omega \mapsto \Delta(\omega)$ is affine and weak∗-continuous, whereas by Lemma 6.1 the map $\omega \mapsto \tilde{S}(\zeta_0, \omega)$ is affine and lower weak∗-semicontinuous. The free energy functional $\omega \mapsto \Delta(\omega) - \beta^{-1} \tilde{S}(\zeta_0, \omega)$ is, in particular, convex and upper weak∗-semicontinuous. Meanwhile recall that $E_{\mathcal{U}}^{S,+}$ is a weak∗-compact and convex set. Therefore, from the Bauer maximum principle [32, Lemma 4.1.12] it follows that
\[
\sup_{\omega \in E_{\mathcal{U}}^{S,+}} \{\Delta(\omega) - \beta^{-1} \tilde{S}(\zeta_0, \omega)\} = \sup_{\omega \in \mathcal{E}_{\mathcal{U}}^{S,+}} \{\Delta(\omega) - \beta^{-1} \tilde{S}(\zeta_0, \omega)\}.
\tag{6.22}
\]
Together with (6.21), this last equality implies the upper bound stated in the lemma.

Since even states on $\mathcal{U}$ are entirely determined by their action on even elements from $\mathcal{U}$, observe that we can identify the set of even p.i. states of $\mathcal{U}$ with the set of p.i. states on the even sub-algebra $\mathcal{U}^+$. We want to show next that the set of extreme points $\mathcal{E}_{\mathcal{U}}^{S,+}$ belongs to the set of strongly clustering states on the even sub-algebra $\mathcal{U}^+$ of $\mathcal{U}$. By strongly clustering states $\omega$ with respect to $\mathcal{U}^+$, we mean that for any $B$ in $\mathcal{U}^+$, there exists a net $\{B_j\} \subseteq \mathrm{Co}\{\eta_s(B) : s \in S\}$ such that for any $A \in \mathcal{U}^+$,
\[
\lim_j |\omega(A\, \eta_s(B_j)) - \omega(A)\omega(B)| = 0
\]
uniformly in $s \in S$. Here, $\mathrm{Co}\, M$ denotes the convex hull of the set $M$.
Lemma 6.6 (Characterization of the Set of Extreme States of $E_{\mathcal{U}}^{S,+}$). Any extreme state $\omega \in \mathcal{E}_{\mathcal{U}}^{S,+}$ is strongly clustering with respect to the even sub-algebra $\mathcal{U}^+$ and conversely.

Proof. We use some standard facts about extreme decompositions of states which can be found in [32, Theorems 4.3.17 and 4.3.22]. To satisfy the requirements of these theorems, we need to prove that the $C^*$-algebra $\mathcal{U}^+$ of even elements of $\mathcal{U}$ is asymptotically abelian with respect to the action of the group $S$. This is proven as follows. For each $l \in \mathbb{N}$ define the map $\pi^{(l)} : \mathbb{N} \to \mathbb{N}$ by
\[
\pi^{(l)}(k) :=
\begin{cases}
k + 2^{l-1}, & \text{if } 1 \leq k \leq 2^{l-1}, \\
k - 2^{l-1}, & \text{if } 2^{l-1} + 1 \leq k \leq 2^{l}, \\
k, & \text{if } k > 2^{l}.
\end{cases}
\tag{6.23}
\]
In other words, the map $\pi^{(l)}$ exchanges the block $\{1, \ldots, 2^{l-1}\}$ with $\{2^{l-1} + 1, \ldots, 2^{l}\}$, and leaves the rest invariant. For any $A, B \in \mathcal{U}_\Lambda \cap \mathcal{U}^+$ with $\Lambda \in \mathcal{I}$, it is then not difficult to see that
\[
\lim_{l \to \infty} [A, \eta_{\pi^{(l)}}(B)] = 0
\]
in the norm sense. Recall that the map $\eta_{\pi^{(l)}}$ is defined via (6.3). By density of local elements of $\mathcal{U}^+$ the limit above equals zero for all $A, B \in \mathcal{U}^+$. Therefore, by using now [32, Theorems 4.3.17 and 4.3.22], all states $\omega \in \mathcal{E}_{\mathcal{U}}^{S,+}$ are then strongly clustering with respect to $\mathcal{U}^+$ and conversely.

We show next that p.i. states, which are strongly clustering with respect to the even sub-algebra $\mathcal{U}^+$, have clustering properties with respect to the whole algebra $\mathcal{U}$.

Lemma 6.7 (Extension of the Strongly Clustering Property). Let $\omega \in E_{\mathcal{U}}^{S,+}$ be any strongly clustering state with respect to $\mathcal{U}^+$. Then, for any $A, B \in \mathcal{U}$ and $\varepsilon > 0$, there are $B_\varepsilon \in \mathrm{Co}\{\eta_s(B) : s \in S\}$ and $l_\varepsilon$ such that for any $l \geq l_\varepsilon$,
\[
|\omega(A\, \eta_{\pi^{(l)}}(B_\varepsilon)) - \omega(A)\omega(B)| < \varepsilon.
\]

Proof. By density of local elements it suffices to prove the lemma for any $A, B \in \mathcal{U}_N$ and $N \in \mathbb{N}$. The operators $A$ and $B$ can always be written as sums $A = A^+ + A^-$ and $B = B^+ + B^-$, where $A^+$ and $B^+$ are in the even sub-algebra $\mathcal{U}^+$ whereas $A^-$ and $B^-$ are odd elements, i.e. they are sums of monomials of odd degree in annihilation and creation operators. Since $\omega$ is assumed to be strongly clustering with respect to $\mathcal{U}^+$, for any $\varepsilon > 0$ there are positive numbers $\lambda_1, \ldots, \lambda_k$ with $\lambda_1 + \cdots + \lambda_k = 1$, and maps $s_1, \ldots, s_k \in S$ such that for any $l \in \mathbb{N}$,
\[
\bigg| \omega\bigg( A^+ \eta_{\pi^{(l)}}\bigg( \sum_{j=1}^{k} \lambda_j \eta_{s_j}(B^+) \bigg) \bigg) - \omega(A^+)\omega(B^+) \bigg| < \varepsilon.
\tag{6.24}
\]
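As an aside, the block exchange $\pi^{(l)}$ of (6.23), which enters the estimates above through $\eta_{\pi^{(l)}}$, is easy to sketch in code. The function names below are hypothetical, and sites are identified with their indices $k \in \mathbb{N}$ through the map $\kappa$. The point illustrated is that once $2^{l-1} \geq N$, the image of $\{1, \ldots, N\}$ under $\pi^{(l)}$ is disjoint from $\{1, \ldots, N\}$, so that $\eta_{\pi^{(l)}}$ moves an element of $\mathcal{U}_N$ to sites outside the first $N$.

```python
def block_swap(l, k):
    """pi^(l)(k) from (6.23): exchange {1..2^(l-1)} with {2^(l-1)+1..2^l}."""
    half = 2 ** (l - 1)
    if 1 <= k <= half:
        return k + half
    if half + 1 <= k <= 2 * half:
        return k - half
    return k


def moved_support(l, support):
    """Image under pi^(l) of a set of site indices."""
    return {block_swap(l, k) for k in support}


def smallest_separating_l(n_sites):
    """Smallest l with 2^(l-1) >= n_sites, so that pi^(l) moves
    {1..n_sites} off itself (hypothetical helper)."""
    l = 1
    while 2 ** (l - 1) < n_sites:
        l += 1
    return l
```

For example, `moved_support(smallest_separating_l(5), set(range(1, 6)))` can be compared with `set(range(1, 6))` to confirm that the two supports do not overlap.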
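By parity and linearity of $\omega$ observe that $\omega(A^+)\omega(B^+) = \omega(A)\omega(B)$, whereas
\[
\omega(A\, \eta_{\pi^{(l)}}(B_\varepsilon)) = \omega\bigg( A^+ \eta_{\pi^{(l)}}\bigg( \sum_{j=1}^{k} \lambda_j \eta_{s_j}(B^+) \bigg) \bigg)
\tag{6.25}
\]
for large enough $l$ with the operator $B_\varepsilon \in \mathrm{Co}\{\eta_s(B) : s \in S\}$ defined by
\[
B_\varepsilon := \sum_{j=1}^{k} \lambda_j \eta_{s_j}(B).
\tag{6.26}
\]
The equality (6.25) follows from parity and the statement
\[
\omega(A\, \eta_{\pi^{(l)}}(\tilde{B}^-)) = 0
\]
for any $\omega \in E_{\mathcal{U}}^{S,+}$, $A, \tilde{B}^- \in \mathcal{U}_N$, $\tilde{B}^-$ odd, and sufficiently large $l$. This can be seen as follows. Since any element of $\mathcal{U}_N$ with defined parity can be written as a linear combination of two self-adjoint elements with the same parity, we assume without loss of generality that $(\tilde{B}^-)^* = \tilde{B}^-$. Choose $l' \in \mathbb{N}$ large enough such that the support of $\tilde{B}^-_l := \eta_{\pi^{(l)}}(\tilde{B}^-)$ does not intersect $\{\kappa(1), \ldots, \kappa(N)\}$ for all $l \geq l'$. The map $\pi^{(l)} : \mathbb{N} \to \mathbb{N}$ is defined by (6.23). Define $\tilde{B}^-_{l,m} := \sigma^{m 2^{l+1}}(\tilde{B}^-_l)$, $m \in \mathbb{N}_0 := \{0, 1, 2, \ldots\}$, where $\sigma$ is the right-shift homomorphism. For any $J \in \mathbb{N}$
\[
\omega\bigg( A \sum_{m=0}^{J} \tilde{B}^-_{l,m} \bigg) = (J + 1)\, \omega(A \tilde{B}^-_{l,0})
\]
by symmetry of $\omega$. Use now the Cauchy–Schwarz inequality for states to get
\[
(J + 1) |\omega(A \tilde{B}^-_{l,0})| \leq \sqrt{\omega(A^* A)} \sqrt{\sum_{m,m'=0}^{J} \omega(\tilde{B}^-_{l,m} \tilde{B}^-_{l,m'})}.
\]
Since per construction, $\tilde{B}^-_{l,m}$ and $\tilde{B}^-_{l,m'}$ anti-commute if $m \neq m'$,
\[
\sum_{m,m'=0}^{J} \omega(\tilde{B}^-_{l,m} \tilde{B}^-_{l,m'}) = \sum_{m=0}^{J} \omega(\tilde{B}^-_{l,m} \tilde{B}^-_{l,m}).
\]
By symmetry of $\omega$, the right-hand side of the equation above equals $(J + 1)\, \omega((\tilde{B}^-_{l,0})^2)$. Hence, we conclude that
\[
|\omega(A \tilde{B}^-_{l,0})| \leq (J + 1)^{-1/2} \sqrt{\omega(|A|^2)\, \omega((\tilde{B}^-_{l,0})^2)},
\]
for any $J \in \mathbb{N}$, i.e. $\omega(A \tilde{B}^-_{l,0}) = 0$ for all $l \geq l'$.

Therefore, the lemma follows from (6.24) and (6.25) with $B_\varepsilon \in \mathrm{Co}\{\eta_s(B) : s \in S\}$ defined by (6.26) for any $\varepsilon > 0$.

We now identify the set of clustering states on $\mathcal{U}$ with the set of product states by the following lemma, which is a non-commutative version of de Finetti Theorem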
of probability theory [28]. Størmer [1] was the first to show the corresponding result for infinite tensor products of $C^*$-algebras.

Lemma 6.8 (Strongly Clustering p.i. States are Product States). Any p.i. and strongly clustering (in the sense of Lemma 6.7) state $\omega$ is a product state (6.7) with the one-site state $\zeta = \zeta_\omega := \omega|_{\mathcal{U}_1}$ being the restriction of $\omega$ on the local (one-site) algebra $\mathcal{U}_1$.

Proof. Let $l_1, \ldots, l_k \in \mathbb{N}$ with $l_i \neq l_j$ whenever $i \neq j$, and for any $j \in \{1, \ldots, k\}$ take $A_j \in \mathcal{U}_1$. To prove the lemma we need to show that
\[
\omega(\sigma^{l_1}(A_1) \cdots \sigma^{l_k}(A_k)) = \zeta_\omega(A_1) \cdots \zeta_\omega(A_k).
\tag{6.27}
\]
The proof of this last equality for any $k \geq 1$ is performed by induction. First, for $k = 1$ the equality (6.27) immediately follows by symmetry of the state $\omega$. Now, assume the equality (6.27) verified at fixed $k \geq 1$. The state $\omega$ is strongly clustering in the sense of Lemma 6.7. Therefore for each $\varepsilon > 0$ there are $q \in \mathbb{N}$, positive numbers $\lambda_1, \ldots, \lambda_q$ with $\lambda_1 + \cdots + \lambda_q = 1$, and maps $s_1, \ldots, s_q \in S$ such that
\[
\bigg| \sum_{j=1}^{q} \lambda_j\, \omega\big(\sigma^{l_1}(A_1) \cdots \sigma^{l_k}(A_k)\, \eta_{\pi^{(l)} \circ s_j}(\sigma^{l_{k+1}}(A_{k+1}))\big) - \omega(\sigma^{l_1}(A_1) \cdots \sigma^{l_k}(A_k))\, \omega(\sigma^{l_{k+1}}(A_{k+1})) \bigg| < \varepsilon,
\tag{6.28}
\]
for any $l \in \mathbb{N}$. Fix $N$ sufficiently large such that the operators $\sigma^{l_m}(A_m)$ and $\eta_{s_j}(\sigma^{l_{k+1}}(A_{k+1}))$ belong to $\mathcal{U}_N$ for any $m \in \{1, \ldots, k + 1\}$ and $j \in \{1, \ldots, q\}$. Choose $l' \in \mathbb{N}$ large enough such that the support of $\eta_{\pi^{(l)} \circ s_j}(\sigma^{l_{k+1}}(A_{k+1}))$ does not intersect $\{\kappa(1), \ldots, \kappa(N)\}$ for all $l \geq l'$ and $j \in \{1, \ldots, q\}$, which by symmetry of $\omega$ implies that
\[
\omega\big(\sigma^{l_1}(A_1) \cdots \sigma^{l_k}(A_k)\, \eta_{\pi^{(l)} \circ s_j}(\sigma^{l_{k+1}}(A_{k+1}))\big) = \omega\big(\sigma^{l_1}(A_1) \cdots \sigma^{l_k}(A_k)\, \sigma^{l_{k+1}}(A_{k+1})\big).
\]
Combined with (6.28) and $\lambda_1 + \cdots + \lambda_q = 1$, it yields
\[
|\omega(\sigma^{l_1}(A_1) \cdots \sigma^{l_k}(A_k)\, \sigma^{l_{k+1}}(A_{k+1})) - \omega(\sigma^{l_1}(A_1) \cdots \sigma^{l_k}(A_k))\, \zeta_\omega(A_{k+1})| < \varepsilon.
\]
Since the equality (6.27) is assumed to be verified at fixed $k \geq 1$, it follows that
\[
|\omega(\sigma^{l_1}(A_1) \cdots \sigma^{l_{k+1}}(A_{k+1})) - \zeta_\omega(A_1) \cdots \zeta_\omega(A_{k+1})| < \varepsilon,
\]
for any $\varepsilon > 0$. In other words, by induction the equality (6.27) is proven for any $k \geq 1$.

As far as the upper bound is concerned, we combine Lemma 6.5 with Lemmas 6.6–6.8 to obtain that
\[
\limsup_{N \to \infty} \{p_N(\beta, \mu, \lambda, \gamma, h)\} \leq p(\beta, \mu, \lambda, 0, h) + \sup_{\zeta \in E_{\mathcal{U}_1}^{+}} \{\gamma |\zeta(a^*_\uparrow a^*_\downarrow)|^2 - \beta^{-1} S(\zeta_0 | \zeta)\}.
\tag{6.29}
\]
Here $E_{\mathcal{U}_1}^{+}$ denotes the set of even states on the (one-site) algebra $\mathcal{U}_1$. Now the proof of the upper bound (6.2) easily follows from the passivity of Gibbs states on
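$\mathcal{U}_1$. Indeed, we apply Lemma 6.3 to the one-site Hamiltonians $H_0 = H_1(0)$ (see (2.1)) and
\[
H_1 = -\frac{c}{2}\, a^*_\uparrow a^*_\downarrow - \frac{\bar{c}}{2}\, a_\uparrow a_\downarrow
\]
in order to bound the relative entropy $S(\zeta_0 | \zeta)$. More precisely, it follows that
\[
p(\beta, \mu, \lambda, 0, h) - \beta^{-1} S(\zeta_0 | \zeta) \leq p(c/(2\gamma)) - x\, \mathrm{Re}\{\zeta(a_\uparrow a_\downarrow)\} - y\, \mathrm{Im}\{\zeta(a_\uparrow a_\downarrow)\},
\tag{6.30}
\]
for any state $\zeta \in E_{\mathcal{U}_1}^{+}$ and any $c \in \mathbb{C}$ with $x := \mathrm{Re}\{c\}$ and $y := \mathrm{Im}\{c\}$. Consequently, from (6.29) we deduce that
\[
\begin{aligned}
\limsup_{N \to \infty} \{p_N(\beta, \mu, \lambda, \gamma, h)\}
&\leq \sup_{\zeta \in E_{\mathcal{U}_1}^{+}} \inf_{x,y \in \mathbb{R}} \big\{ \gamma(\mathrm{Re}\{\zeta(a_\uparrow a_\downarrow)\}^2 + \mathrm{Im}\{\zeta(a_\uparrow a_\downarrow)\}^2) \\
&\qquad\qquad - x\, \mathrm{Re}\{\zeta(a_\uparrow a_\downarrow)\} - y\, \mathrm{Im}\{\zeta(a_\uparrow a_\downarrow)\} + p((x + iy)/(2\gamma)) \big\} \\
&\leq \sup_{t,s \in \mathbb{R}} \inf_{x,y \in \mathbb{R}} \big\{ \gamma(t^2 + s^2) - tx - sy + p((x + iy)/(2\gamma)) \big\}.
\end{aligned}
\]
In particular, by fixing $x = 2t\gamma$ and $y = 2s\gamma$ in the infimum, so that $\gamma(t^2 + s^2) - tx - sy = -\gamma(t^2 + s^2)$ and $p((x + iy)/(2\gamma)) = p(t + is)$, we finally obtain
\[
\limsup_{N \to \infty} \{p_N(\beta, \mu, \lambda, \gamma, h)\} \leq \sup_{t,s \in \mathbb{R}} \{-\gamma(t^2 + s^2) + p(t + is)\},
\]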
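i.e. the upper bound (6.2) for any $\beta, \gamma > 0$ and $\mu, \lambda, h \in \mathbb{R}$.

6.2. Equilibrium and ground states of the strong coupling BCS-Hubbard model

It follows immediately from the passivity of Gibbs states that
\[
p(\beta, \mu, \lambda, \gamma, h) \geq \Delta(\omega) - \beta^{-1} \tilde{S}(\zeta_0, \omega) + p(\beta, \mu, \lambda, 0, h),
\tag{6.31}
\]
for any $\omega \in E_{\mathcal{U}}^{S,+}$, cf. (6.1) and Lemmas 6.2 and 6.3. Therefore, by using Lemma 6.5 with (6.22) the (infinite volume) pressure can be written as
\[
p(\beta, \mu, \lambda, \gamma, h) = \sup_{\omega \in E_{\mathcal{U}}^{S,+}} \{\Delta(\omega) - \beta^{-1} \tilde{S}(\zeta_0, \omega)\} + p(\beta, \mu, \lambda, 0, h).
\]
Moreover, as shown above (see the upper bound in the proof of Lemma 6.5), any weak∗ limit point $\omega_\infty$ of local Gibbs states $\omega_N$ (1.6) when $N \to \infty$ satisfies (6.31) with equality. Indeed, by using (6.13) one obtains for any state $\omega$ that
\[
\frac{1}{N}\big(-\omega(H_N) - \beta^{-1} S(\mathrm{tr}_N \,|\, \omega|_{\mathcal{U}_N})\big) = \frac{\gamma}{N^2} \sum_{l,m=1}^{N} \omega(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow})
- \frac{1}{\beta N} S(\omega_{\zeta_0}|_{\mathcal{U}_N} \,|\, \omega|_{\mathcal{U}_N}) + p_N(\beta, \mu, \lambda, 0, h),
\tag{6.32}
\]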
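with $p_N$ being the (finite volume) pressure (1.4) associated with the Hamiltonian $H_N$ (1.2), $\omega_{\zeta_0}$ being the product state obtained by "copying" the state $\zeta_0$ (6.1) on the one-site algebra $\mathcal{U}_1$ (see (6.7)), and with the trace state $\mathrm{tr}_N$ defined on the local algebra $\mathcal{U}_N$ for $N \in \mathbb{N}$ by
\[
\mathrm{tr}_N(\cdot) := \frac{\mathrm{Trace}(\cdot)}{\mathrm{Trace}(I_{\mathcal{U}_N})}.
\]
For any permutation invariant state $\omega$ it is straightforward to check that the limits
\[
\lim_{N \to \infty} \{N^{-1} S(\omega_{\zeta_0}|_{\mathcal{U}_N} \,|\, \omega|_{\mathcal{U}_N})\}
\]
and
\[
e(\omega) := \lim_{N \to \infty} \{N^{-1} \omega(H_N)\} = \omega(H_1(0)) - \Delta(\omega)
\]
exist for any fixed parameters $\beta, \gamma > 0$ and $\mu, \lambda, h \in \mathbb{R}$, see respectively (2.1) and Lemma 6.2 for the definitions of $H_1(0)$ and $\Delta(\omega)$. Combined with (6.19) and (6.32) it then follows that the usual entropy density
\[
\tilde{S}(\omega) := -\lim_{N \to \infty} \{N^{-1} S(\mathrm{tr}_N \,|\, \omega|_{\mathcal{U}_N})\} = -\lim_{N \to \infty} \frac{1}{N} \mathrm{Trace}(D_{\omega|_{\mathcal{U}_N}} \log D_{\omega|_{\mathcal{U}_N}}) < \infty
\]
of the permutation invariant state $\omega$ also exists and
\[
\lim_{N \to \infty} \frac{1}{\beta N} S(\omega_{\zeta_0}|_{\mathcal{U}_N} \,|\, \omega|_{\mathcal{U}_N}) = e(\omega) + \Delta(\omega) - \beta^{-1} \tilde{S}(\omega) + p(\beta, \mu, \lambda, 0, h).
\]
The set $\Omega_\beta = \Omega_\beta(\mu, \lambda, \gamma, h)$ of equilibrium states of the strong coupling BCS-Hubbard model is defined by
\[
\Omega_\beta := \big\{\omega \in E_{\mathcal{U}}^{S,+} : -e(\omega) + \beta^{-1} \tilde{S}(\omega) = p(\beta, \mu, \lambda, \gamma, h) = \Delta(\omega) - \beta^{-1} \tilde{S}(\zeta_0, \omega) + p(\beta, \mu, \lambda, 0, h)\big\}.
\]
Note that $\Omega_\beta$ contains per construction all weak∗ limit points of local Gibbs states $\omega_N$ as $N \to \infty$. Consequently, the equilibrium states are, as usual, the minimizers of the free energy functional
\[
\omega \mapsto \mathcal{F}(\omega) := e(\omega) - \beta^{-1} \tilde{S}(\omega)
\tag{6.33}
\]
on the convex and weak∗-compact set $E_{\mathcal{U}}^{S,+}$, cf. (1.5). They also maximize the upper semicontinuous affine functional $\omega \mapsto \Delta(\omega) - \beta^{-1} \tilde{S}(\zeta_0, \omega)$. It follows that $\Omega_\beta$ is a closed face of $E_{\mathcal{U}}^{S,+}$ and we have in this set a notion of pure and mixed thermodynamic phases (equilibrium states) by identifying purity with extremity. In particular, it is convex and weak∗-compact.

Each weak∗-limit $\omega$ of equilibrium states $\omega^{(n)} \in \Omega_{\beta_n}(\mu_n, \lambda_n, \gamma_n, h_n)$ such that $(\mu_n, \lambda_n, \gamma_n, h_n) \to (\mu, \lambda, \gamma, h)$ and $\beta_n \to \infty$ is called a ground state of the strong coupling BCS-Hubbard model. The set of all ground states with parameters $\gamma > 0$ and $\mu, \lambda, h \in \mathbb{R}$ is denoted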
by $\Omega_\infty = \Omega_\infty(\mu, \lambda, \gamma, h)$. Extreme states of the weak∗-compact convex set $\Omega_\infty$ are called pure ground states.

We analyze now the set of pure equilibrium states, i.e. the equilibrium states $\omega \in \Omega_\beta$ belonging to the set $\mathcal{E}_{\mathcal{U}}^{S,+}$ of extreme points of $E_{\mathcal{U}}^{S,+}$, cf. (6.22). First, from Lemmas 6.6–6.8 recall that any extreme state is a product state $\omega_\zeta$ (6.7), i.e. it is obtained by "copying" a state $\zeta$ on the one-site algebra $\mathcal{U}_1$ to the other sites. In particular, by combining (6.22) with (6.31) observe that
\[
p(\beta, \mu, \lambda, \gamma, h) = \sup_{\zeta \in E_{\mathcal{U}_1}^{+}} \{\gamma |\zeta(a^*_\uparrow a^*_\downarrow)|^2 - \beta^{-1} S(\zeta_0 | \zeta)\} + p(\beta, \mu, \lambda, 0, h).
\tag{6.34}
\]
Therefore, a product state $\omega_\zeta$ is a pure equilibrium state if and only if $\zeta$ belongs to the set $G_\beta = G_\beta(\mu, \lambda, \gamma, h)$ of one-site equilibrium states defined by
\[
G_\beta := \big\{\zeta \in E_{\mathcal{U}_1}^{+} : \gamma |\zeta(a^*_\uparrow a^*_\downarrow)|^2 - \beta^{-1} S(\zeta_0 | \zeta) = p(\beta, \mu, \lambda, \gamma, h) - p(\beta, \mu, \lambda, 0, h)\big\}.
\tag{6.35}
\]
In other words, the study of pure states of $\Omega_\beta$ can be reduced, without loss of generality, to the analysis of $G_\beta$. The first important statement concerns the characterization of the set $G_\beta$ in relation with the variational problems (2.4) and (6.34).

Theorem 6.1 (Explicit Description of One-Site Equilibrium States). For any $\beta, \gamma > 0$ and $\mu, \lambda, h \in \mathbb{R}$, the set $G_\beta$ of one-site equilibrium states is given by the states $\zeta_{c_\beta}$ (6.1) with $c_\beta := r_\beta^{1/2} e^{i\phi}$ for any order parameter $r_\beta$ solution of (2.4) and any phase $\phi \in [0, 2\pi)$.

Proof. Take any solution $r_\beta$ of (2.4) and any $\phi \in [0, 2\pi)$. Then, from (6.14) observe that
\[
-\beta^{-1} S(\zeta_0 | \zeta_{c_\beta}) + p(\beta, \mu, \lambda, 0, h) = -\gamma\, \zeta_{c_\beta}(c_\beta a^*_\uparrow a^*_\downarrow + \bar{c}_\beta a_\downarrow a_\uparrow) + p(c_\beta).
\tag{6.36}
\]
Since $\zeta_{c_\beta}(a_\downarrow a_\uparrow) = c_\beta$ and $\zeta_{c_\beta}(a^*_\uparrow a^*_\downarrow) = \bar{c}_\beta$, the last equality combined with Theorem 2.1 implies that
\[
\gamma |\zeta_{c_\beta}(a_\downarrow a_\uparrow)|^2 - \beta^{-1} S(\zeta_0 | \zeta_{c_\beta}) = p(\beta, \mu, \lambda, \gamma, h) - p(\beta, \mu, \lambda, 0, h).
\tag{6.37}
\]
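The identities (6.36) and (6.37) lend themselves to a numerical sanity check on the one-site space. The sketch below is an illustration only, with hypothetical helper names. It assumes the sign convention of (6.36), i.e. a pairing term $-\gamma(c\, a^*_\uparrow a^*_\downarrow + \bar{c}\, a_\downarrow a_\uparrow)$ in $H_1(c)$, it takes $p(c) = \beta^{-1} \ln \mathrm{Trace}(e^{-\beta H_1(c)})$, and it evaluates $p(\beta, \mu, \lambda, \gamma, h)$ through the variational formula $\sup_c \{-\gamma|c|^2 + p(c)\}$ of Theorem 2.1 by a grid search over real $c \in [0, 1/2]$.

```python
import numpy as np

# One-site operators via a Jordan-Wigner construction (illustrative choice).
_S = np.array([[0.0, 1.0], [0.0, 0.0]])
A_UP = np.kron(_S, np.eye(2))
A_DN = np.kron(np.diag([1.0, -1.0]), _S)
N_UP, N_DN = A_UP.T @ A_UP, A_DN.T @ A_DN
PAIR = A_DN @ A_UP  # a_down a_up; PAIR.T represents a_up^* a_down^*


def h1(c, mu, lam, gamma, h):
    """H_1(c) in the convention of (6.36) (an assumption of this sketch)."""
    return (-(mu - h) * N_UP - (mu + h) * N_DN + 2.0 * lam * (N_UP @ N_DN)
            - gamma * (c * PAIR.T + np.conj(c) * PAIR))


def gibbs(hmat, beta):
    """Density matrix and pressure of the Gibbs state (6.12) of hmat."""
    w, v = np.linalg.eigh(hmat)
    boltz = np.exp(-beta * (w - w.min()))
    density = (v * (boltz / boltz.sum())) @ v.conj().T
    return density, (np.log(boltz.sum()) - beta * w.min()) / beta


def log_density(d):
    w, v = np.linalg.eigh(d)
    return (v * np.log(w)) @ v.conj().T


def rel_entropy(d1, d2):
    """S(omega1 | omega2) = Tr(D2 ln D2) - Tr(D2 ln D1)."""
    return np.trace(d2 @ (log_density(d2) - log_density(d1))).real


def sides_of_6_36(c, beta, mu, lam, gamma, h):
    """Left- and right-hand sides of (6.36) for an arbitrary c."""
    d0, p0 = gibbs(h1(0.0, mu, lam, gamma, h), beta)
    dc, pc = gibbs(h1(c, mu, lam, gamma, h), beta)
    lhs = -rel_entropy(d0, dc) / beta + p0
    rhs = -gamma * np.trace((c * PAIR.T + np.conj(c) * PAIR) @ dc).real + pc
    return lhs, rhs


def sides_of_6_37(beta, mu, lam, gamma, h, num=5001):
    """Both sides of (6.37) at a grid maximizer of -gamma c^2 + p(c)."""
    grid = np.linspace(0.0, 0.5, num)
    vals = [-gamma * c**2 + gibbs(h1(c, mu, lam, gamma, h), beta)[1]
            for c in grid]
    best = int(np.argmax(vals))
    d0, p0 = gibbs(h1(0.0, mu, lam, gamma, h), beta)
    dc, _ = gibbs(h1(grid[best], mu, lam, gamma, h), beta)
    lhs = gamma * abs(np.trace(PAIR @ dc)) ** 2 - rel_entropy(d0, dc) / beta
    return lhs, vals[best] - p0
```

The first check holds for every $c$ because it is just (6.14) specialised to one site, whereas the second holds only at a maximizer and is therefore accurate up to the grid resolution.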
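In other words, $\zeta_{c_\beta}$ is a maximizer of the variational problem defined in (6.34) and hence, $\zeta_{c_\beta} \in G_\beta$. On the other hand, any state $\zeta \in G_\beta$ satisfies (6.37) and by combining Theorem 2.1 with the inequality (6.30) for $c = 2\gamma \zeta(a_\downarrow a_\uparrow)$ it follows that
\[
-\gamma |\zeta(a_\downarrow a_\uparrow)|^2 + p(\zeta(a_\downarrow a_\uparrow)) \geq \sup_{c \in \mathbb{C}} \{-\gamma |c|^2 + p(c)\}.
\]
Hence, $\zeta(a_\downarrow a_\uparrow) = r_\beta^{1/2} e^{i\phi} = c_\beta$ for some $\phi \in [0, 2\pi)$. It remains to prove that the equality $\zeta(a_\downarrow a_\uparrow) = c_\beta$ uniquely defines the one-site equilibrium state $\zeta \in G_\beta$. It follows from $\zeta(a_\downarrow a_\uparrow) = \zeta_{c_\beta}(a_\downarrow a_\uparrow) = c_\beta$ with $\zeta, \zeta_{c_\beta} \in G_\beta$ that $S(\zeta_0 | \zeta_{c_\beta}) = S(\zeta_0 | \zeta)$ and
\[
\gamma\, \zeta(c_\beta a^*_\uparrow a^*_\downarrow + \bar{c}_\beta a_\downarrow a_\uparrow) - \beta^{-1} S(\zeta_0 | \zeta) = P^{H_1(c_\beta)} - P^{H_1(0)}
\tag{6.38}
\]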
because of (6.36), see (2.1) for the definition of H1 (c). By Lemma 6.3, one obtains for any self-adjoint A ∈ U1 that
\[ -\zeta(A) + \gamma\zeta(c_\beta a_\uparrow^* a_\downarrow^* + \bar{c}_\beta a_\downarrow a_\uparrow) - \beta^{-1} S(\zeta_0|\zeta) \le P_{H_1(c_\beta)+A} - P_{H_1(0)}. \tag{6.39} \]
Consequently, we obtain by combining (6.38) and (6.39) that
\[ P_{H_1(c_\beta)+A} - P_{H_1(c_\beta)} \ge -\zeta(A) \]
for any self-adjoint A ∈ U1 and ζ ∈ Gβ such that ζ(a↓ a↑ ) = cβ . In other words, the functional {−ζ} is tangent to the pressure at H1 (cβ ). Since the convex map $A \mapsto P_{H_1(c_\beta)+A}$ is continuously differentiable and self-adjoint elements separate states, the tangent functional is unique and ζ = ζcβ .
It follows immediately from the theorem above that pure states of Ωβ solve the gap equation:
Corollary 6.1 (Gap Equation for Pure Equilibrium States). For any β, γ > 0 and µ, λ, h ∈ R, pure states from Ωβ are precisely the product states ωζcβ satisfying the gap equation ωζcβ (aκ(l),↑ , aκ(l),↓ ) = cβ for any l ∈ N and with $c_\beta := r_\beta^{1/2} e^{i\phi}$ being any maximizer of the first variational problem given in Theorem 2.1.
If cβ ≠ 0, observe that the gap equation ωζcβ (aκ(l),↑ , aκ(l),↓ ) = cβ with ζc defined in (6.1) corresponds to the Euler–Lagrange equation satisfied by the solutions $c_\beta := r_\beta^{1/2} e^{i\phi}$ of the first variational problem given in Theorem 2.1. The phase φ ∈ [0, 2π) is arbitrary because of the gauge invariance of the map c → p(c), and the gap equation ωζcβ (aκ(l),↑ , aκ(l),↓ ) = cβ can be reduced to (2.5). In other words, if cβ ≠ 0, the gap equation can be written in two different ways: either ωζcβ (aκ(l),↑ , aκ(l),↓ ) = cβ from the viewpoint of extreme equilibrium states, or (2.5) from the viewpoint of the order parameter rβ .
From this last corollary observe also that the existence of non-zero maximizers cβ ≠ 0 implies the existence of equilibrium states breaking the U (1)-gauge symmetry satisfied by HN (1.2). This breakdown of the U (1)-gauge symmetry for cβ ≠ 0 is already explained by Theorem 3.2, which can be proven by our notion of equilibrium states as follows. Consider the upper semicontinuous convex map on EUS,+ defined for any α ≥ 0 and φ ∈ [0, 2π) by
\[ \omega \mapsto -e(\omega) + \beta^{-1}\tilde{S}(\omega) + 2\alpha\, \mathrm{Re}\{ e^{i\phi}\omega(a_\downarrow^* a_\uparrow^*) \}. \tag{6.40} \]
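From Sec. 6.1 it is straightforward to check that
\[ p_{\alpha,\phi}(\beta,\mu,\lambda,\gamma,h) := \lim_{N\to\infty} \frac{1}{\beta N} \ln \mathrm{Trace}(e^{-\beta H_{N,\alpha,\phi}}) = \sup_{\omega \in E^{S,+}_{\mathcal{U}}} \{ -e(\omega) + \beta^{-1}\tilde{S}(\omega) + 2\alpha\, \mathrm{Re}\{ e^{i\phi}\omega(a_\downarrow^* a_\uparrow^*) \} \}, \tag{6.41} \]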
April 20, 2010 14:17 WSPC/S0129-055X
148-RMP
J070-00395
Effect of a Locally Repulsive Interaction on s-Wave Superconductors
283
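with the Hamiltonian HN,α,φ defined in (3.1). Moreover, any weak∗ -limits ω∞,α,φ of local Gibbs states
\[ \omega_{N,\alpha,\phi}(\cdot) := \frac{\mathrm{Trace}(\,\cdot\; e^{-\beta H_{N,\alpha,\phi}})}{\mathrm{Trace}(e^{-\beta H_{N,\alpha,\phi}})} \tag{6.42} \]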
are equilibrium states (see the proof of Lemma 6.5 applied to HN,α,φ ), i.e. the state ω∞,α,φ belongs to the (non-empty) convex set Ωβ,α,φ = Ωβ,α,φ (µ, λ, γ, h) of maximizers of (6.40) at fixed α ≥ 0 and φ ∈ [0, 2π). In fact, one gets the following statement, which implies Theorem 3.2.
Theorem 6.2 (Breakdown of the U (1)-Gauge Symmetry). Take β, γ > 0 and real numbers µ, λ, h away from any critical point. Then at fixed phase φ ∈ [0, 2π),
\[ \lim_{\alpha\downarrow 0} \lim_{N\to\infty} \frac{1}{N} \sum_{l=1}^{N} \omega_{N,\alpha,\phi}(a_{\kappa(l),\downarrow} a_{\kappa(l),\uparrow}) = \lim_{\alpha\downarrow 0} \omega_{\infty,\alpha,\phi}(a_{\kappa(1),\downarrow} a_{\kappa(1),\uparrow}) = r_\beta^{1/2} e^{i\phi}, \]
with ω∞,α,φ ∈ Ωβ,α,φ being the unique maximizer of (6.40) for sufficiently small α ≥ 0.
Proof. First we need to characterize pure states of Ωβ,α,φ as is done in Corollary 6.1 for α = 0. By convexity and upper semicontinuity, note that maximizers of (6.40) are taken on the set of extreme states whereas the set of extreme maximizers is a face. Since extreme states are product states (cf. Lemmas 6.6–6.8), we get that
\[ \sup_{\omega \in E^{S,+}_{\mathcal{U}}} \{ -e(\omega) + \beta^{-1}\tilde{S}(\omega) + 2\alpha\, \mathrm{Re}\{ e^{i\phi}\omega(a_\downarrow^* a_\uparrow^*) \} \} = \sup_{c\in\mathbb{C}} \{ -\gamma|c|^2 + p(c + \alpha\gamma^{-1} e^{i\phi}) \}, \tag{6.43} \]
as in the case α = 0 (see (2.3) for the definition of p(c)). If cβ,α,φ = cβ,α,φ (µ, λ, γ, h) ∈ C is a maximizer of
\[ -\gamma|c|^2 + p(c + \alpha\gamma^{-1} e^{i\phi}), \tag{6.44} \]
then observe that zβ,α,φ := cβ,α,φ + αγ −1 eiφ maximizes the function −γ|z − αγ −1 eiφ |2 + p(z) of the complex variable z ∈ C. By gauge invariance of the map z → p(β, µ, λ, h; z), it follows that zβ,α,φ ∈ eiφ R and thus cβ,α,φ ∈ eiφ R. Using this, we extend Corollary 6.1 to α ≥ 0 and φ ∈ [0, 2π). In other words, for any β, γ > 0, α ≥ 0, φ ∈ [0, 2π) and µ, λ, h ∈ R, pure states of Ωβ,α,φ are product states ωζcβ,α,φ satisfying the gap equation
\[ \omega_{\zeta_{c_{\beta,\alpha,\phi}}}(a_{\kappa(l),\uparrow}, a_{\kappa(l),\downarrow}) = c_{\beta,\alpha,\phi}, \tag{6.45} \]
for any l ∈ N and with cβ,α,φ ∈ eiφ R being any maximizer of (6.44).
As |c| → ∞, notice that p(c) = O(|c|). So, by gauge invariance we obtain
\[ \sup_{c\in\mathbb{C}} \{ -\gamma|c|^2 + p(c + \alpha\gamma^{-1} e^{i\phi}) \} = \max_{s\in[-M,M]} \{ -\gamma|s e^{i\phi}|^2 + p([s + \alpha\gamma^{-1}] e^{i\phi}) \} = \max_{s\in[-M,M]} \{ -\gamma s^2 + p(s + \alpha\gamma^{-1}) \}, \]
for any α ∈ (0, 1) and M < ∞ sufficiently large. Consequently, if the parameters β, µ, λ, γ, and h are such that the maximizer rβ (2.4) is unique, then the maximizer cβ,α,φ ∈ eiφ R of (6.44) is also unique as soon as α > 0 is sufficiently small. Indeed the map s → p(s) is continuous on the compact interval [−M, M ]. In particular, from (6.45) there is a unique maximizer of (6.40), i.e.
\[ \Omega_{\beta,\alpha,\phi} = \{ \omega_{\zeta_{c_{\beta,\alpha,\phi}}} \}. \tag{6.46} \]
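The reduction to a one-dimensional maximization over s ∈ [−M, M] can be illustrated numerically. The sketch below is only a hedged illustration: it assumes that, up to an additive constant, p(s) = β⁻¹ ln{cosh(βh) + e^{−λβ} cosh(β g)} with g = {(µ − λ)² + γ²s²}^{1/2}, a form that is consistent with (7.1) but is not restated in this section, and the parameter values and the helper names `p`, `objective` and `argmax_on_grid` are hypothetical. It locates the maximizer by a plain grid search and compares a small α > 0 with α = 0.

```python
import math

# Illustrative parameters (assumed for this sketch only)
BETA, MU, LAM, GAMMA, H = 4.0, 1.0, 0.0, 2.6, 0.0

def p(s):
    """Assumed pressure along the real axis, up to an additive constant."""
    g = math.sqrt((MU - LAM) ** 2 + GAMMA ** 2 * s * s)
    inner = math.cosh(BETA * H) + math.exp(-LAM * BETA) * math.cosh(BETA * g)
    return math.log(inner) / BETA

def objective(s, alpha):
    """The function s -> -gamma s^2 + p(s + alpha / gamma)."""
    return -GAMMA * s * s + p(s + alpha / GAMMA)

def argmax_on_grid(alpha, lo, hi, step=1e-4):
    """Grid search for the maximizer of the objective on [lo, hi]."""
    n = int(round((hi - lo) / step))
    best_s, best_val = lo, objective(lo, alpha)
    for k in range(1, n + 1):
        s = lo + k * step
        val = objective(s, alpha)
        if val > best_val:
            best_s, best_val = s, val
    return best_s

# alpha = 0: the objective is even in s, so search only s >= 0
s_zero = argmax_on_grid(0.0, 0.0, 2.0)
# small alpha > 0: the symmetry s -> -s is broken, search the full interval
s_alpha = argmax_on_grid(1e-3, -2.0, 2.0)
```

At α = 0 the objective is even in s, so ±s_zero are both maximizers; for α > 0 the grid search selects the positive branch, which stays close to s_zero for small α.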
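Moreover, cβ,α,φ converges to $r_\beta^{1/2} e^{i\phi}$ as α → 0. Therefore, it follows from (6.45) that
\[ \lim_{\alpha\downarrow 0} \omega_{\zeta_{c_{\beta,\alpha,\phi}}}(a_{\kappa(l),\downarrow} a_{\kappa(l),\uparrow}) = r_\beta^{1/2} e^{i\phi} \tag{6.47} \]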
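for any l ∈ N. By permutation invariance,
\[ \frac{1}{N} \sum_{l=1}^{N} \omega_{N,\alpha,\phi}(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow}) = \omega_{N,\alpha,\phi}(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow}). \]
Now, let $\{N_j^{(1)}\}$ and $\{N_j^{(2)}\}$ be two subsequences in N such that
\[ \lim_{j\to\infty} \omega_{N_j^{(1)},\alpha,\phi}(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow}) = \limsup_{N\to\infty} \omega_{N,\alpha,\phi}(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow}), \]
\[ \lim_{j\to\infty} \omega_{N_j^{(2)},\alpha,\phi}(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow}) = \liminf_{N\to\infty} \omega_{N,\alpha,\phi}(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow}). \]
We can assume without loss of generality that $\omega_{N_j^{(2)}}$ and $\omega_{N_j^{(1)}}$ both converge with respect to the weak∗ -topology as j → ∞. Since any weak∗ -limits ω∞,α,φ of local Gibbs states ωN,α,φ (6.42) are equilibrium states (see again the proof of Lemma 6.5), i.e. ω∞,α,φ ∈ Ωβ,α,φ , the theorem then follows from (6.46) and (6.47). Indeed, for any β, γ > 0 and µ, λ, h ∈ R away from any critical point, the sequence ωN,α,φ of local Gibbs states converges towards ω∞,α,φ = ωζcβ,α,φ in the weak∗ -topology as soon as α ≥ 0 is sufficiently small.
From Corollary 6.1 note that the expectation values of the Cooper fields
\[ \Phi_{\kappa(l)} := a^*_{\kappa(l),\downarrow} a^*_{\kappa(l),\uparrow} + a_{\kappa(l),\uparrow} a_{\kappa(l),\downarrow}, \qquad \Psi_{\kappa(l)} := i(a^*_{\kappa(l),\downarrow} a^*_{\kappa(l),\uparrow} - a_{\kappa(l),\uparrow} a_{\kappa(l),\downarrow}) \tag{6.48} \]
are
\[ \omega_{\zeta_{c_\beta}}(\Phi_{\kappa(l)}) = 2\,\mathrm{Re}\{c_\beta\} \quad\text{and}\quad \omega_{\zeta_{c_\beta}}(\Psi_{\kappa(l)}) = 2\,\mathrm{Im}\{c_\beta\} \tag{6.49} \]
for any pure state ωζcβ of Ωβ and l ∈ N, where we recall that $c_\beta := r_\beta^{1/2} e^{i\phi}$ is some maximizer of the first variational problem given in Theorem 2.1. In particular,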
ω(Φκ(l) ) ≠ 0 or ω(Ψκ(l) ) ≠ 0 for any pure state ω ∈ Ωβ is a manifestation of the breakdown of the U (1)-gauge symmetry. Unfortunately, the operators Φκ(l) and Ψκ(l) do not correspond to any experiment, as they are not gauge invariant. More generally, experiments only “see” the restriction of states ωζcβ to the subalgebra of gauge invariant elements. Consequently, the next step is to prove the so-called off diagonal long range order (ODLRO) property proposed by Yang [38] to define the superconducting phase. Indeed, one detects the presence of U (1)-gauge symmetry breaking by considering the asymptotics, as |l − m| → ∞, of the (U (1)-gauge symmetric) Cooper pair correlation function
\[ G_\omega(l,m) := \omega(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow}) \tag{6.50} \]
associated with some state ω. In particular, if Gω (l, m) converges to some fixed non-zero value whenever |l − m| → ∞, the state ω shows off diagonal long range order (ODLRO). This property can directly be analyzed for equilibrium states from our next statement.
Theorem 6.3 (Cooper Pair Correlation Function). For any β, γ > 0 and µ, λ, h ∈ R away from any critical point, the Cooper pair correlation function GωN (l, m) associated with the local Gibbs state ωN converges for fixed l ≠ m towards
\[ \lim_{N\to\infty} G_{\omega_N}(l,m) = G_\omega(l,m) = r_\beta, \]
for any equilibrium state ω ∈ Ωβ , and with rβ being the solution of (2.4).
Proof. By similar arguments as in the proof of Theorem 6.2, if Gω (l, m) = rβ for all equilibrium states ω, then
\[ \lim_{N\to\infty} G_{\omega_N}(l,m) = r_\beta. \]
By permutation invariance of ω ∈ Ωβ , note that Gω (l, m) = Gω (1, 2)
(6.51)
for any l ≠ m. If ω = ωζcβ is an extreme equilibrium state, then one clearly has
\[ G_{\omega_{\zeta_{c_\beta}}}(1,2) = \zeta_{c_\beta}(a_\uparrow^* a_\downarrow^*)\, \zeta_{c_\beta}(a_\downarrow a_\uparrow) = |c_\beta|^2 = r_\beta. \]
On the other hand, the set Ωβ of equilibrium states for fixed parameters β, γ > 0, and µ, λ, h ∈ R is weak∗ -compact. In particular, if ω ∈ Ωβ is not extreme, the function Gω (1, 2) is given, up to arbitrarily small errors, by convex sums of the form
\[ \sum_{j=1}^{k} \lambda_j G_{\omega^{(j)}}(1,2), \qquad \lambda_1,\ldots,\lambda_k \ge 0, \qquad \lambda_1 + \cdots + \lambda_k = 1, \tag{6.52} \]
where {ω (j) }j=1,...,k are extreme equilibrium states. Since any weak∗ -limit ω∞ of local Gibbs states ωN (1.6) is an equilibrium state (see proof of Lemma 6.5), the theorem is then a consequence of (6.51) and (6.52).
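The distinction between the gauge-dependent order parameter and the gauge-invariant pair correlation can be made concrete with a small numerical sketch. It only uses the two identities quoted in the proof, namely ζ(a↓a↑) = cβ and G(1, 2) = |cβ|² for each extreme state with cβ = rβ^{1/2} e^{iφ}; the function name `phase_average`, the sample value of r and the uniform phase grid are illustrative assumptions.

```python
import cmath
import math

def phase_average(r, n_phases=360):
    """Average c = sqrt(r) e^{i phi} and |c|^2 over uniformly spaced phases.

    Each phase labels one extreme equilibrium state; equal weights give a
    convex combination of the form (6.52).
    """
    c_values = [
        math.sqrt(r) * cmath.exp(2j * math.pi * k / n_phases)
        for k in range(n_phases)
    ]
    mean_order = sum(c_values) / n_phases                  # averaged order parameter
    mean_corr = sum(abs(c) ** 2 for c in c_values) / n_phases  # averaged G(1, 2)
    return mean_order, mean_corr

mean_order, mean_corr = phase_average(0.08)
```

Every term of the convex sum contributes the same value |cβ|² = rβ, so the correlation is insensitive to the weights, while the averaged order parameter cancels for the uniform weights chosen here.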
Since
\[ \frac{1}{N^2} \sum_{l,m=1}^{N} \omega_N(a^*_{\kappa(l),\uparrow} a^*_{\kappa(l),\downarrow} a_{\kappa(m),\downarrow} a_{\kappa(m),\uparrow}) = \frac{N(N-1)}{N^2}\, \omega_N(a^*_{\kappa(1),\uparrow} a^*_{\kappa(1),\downarrow} a_{\kappa(2),\downarrow} a_{\kappa(2),\uparrow}) + O(N^{-1}), \]
note that this theorem implies Theorem 3.1. Therefore, away from any critical point, if an equilibrium state shows ODLRO then all pure equilibrium states break the U (1)-gauge symmetry. Conversely, if all pure equilibrium states break the U (1)-gauge symmetry, then all equilibrium states show ODLRO. This is due to the fact that the order parameter rβ is unique away from any critical point. In particular, from Sec. 7, at sufficiently small inverse temperature β there is no ODLRO and Ωβ = {ωζ0 }, whereas for sufficiently large β and γ all equilibrium states show ODLRO.
For any β, γ > 0 and real numbers µ, λ, h at some critical point, this property is not satisfied in general. There are indeed cases where the phase transition is of first order, cf. Fig. 3. In this situation, 0 and some rβ > 0 are maximizers at the same time, and hence, there are some equilibrium states breaking the U (1)-gauge symmetry and other equilibrium states which do not show ODLRO in this specific situation.
Observe now that the superconducting phase is not only characterized by ODLRO and the breakdown of the U (1)-gauge symmetry. Indeed, the two-point correlation function determines its type: s-wave, d-wave, p-wave, etc. In fact, for any extreme equilibrium state ω = ωζcβ , x, y ∈ Zd and s1 , s2 ∈ {↑, ↓}, one clearly has
\[ \omega_{\zeta_{c_\beta}}(a_{x,s_1} a_{y,s_2}) = \begin{cases} \zeta_{c_\beta}(a_{x,s_1})\,\zeta_{c_\beta}(a_{y,s_2}) & \text{if } x \neq y \\ \zeta_{c_\beta}(a_{x,s_1} a_{x,s_2}) & \text{if } x = y \end{cases} = \begin{cases} 0 & \text{if } x \neq y, \\ 0 & \text{if } x = y,\ s_1 = s_2, \\ c_\beta & \text{if } x = y,\ s_1 \neq s_2. \end{cases} \]
As a consequence, for any equilibrium state ω ∈ Ωβ , we have ω(ax,s1 ay,s2 ) = ω(a0,s1 a0,s2 )δx,y and we obtain an s-wave superconducting phase. In particular, Theorem 3.3 is a simple consequence of these last equalities combined with (6.46), (6.47) and the fact that any weak∗ -limits ω∞,α,φ ∈ Ωβ,α,φ of local Gibbs states ωN,α,φ (6.42) are equilibrium states (see again the proof of Lemma 6.5).
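The double-sum identity used above to pass from Theorem 6.3 to Theorem 3.1 is elementary counting: among the N² pairs (l, m), N(N − 1) are off-diagonal and N are diagonal. A hedged sketch follows, assuming a permutation-invariant correlation that takes one value for l ≠ m and another for l = m; the function name `pair_average` and the numerical values are hypothetical.

```python
def pair_average(n_sites, g_offdiag, g_diag):
    """Average of G(l, m) over all ordered pairs (l, m) of n_sites sites.

    g_offdiag is the common value of G(l, m) for l != m (permutation
    invariance), g_diag the common value for l == m.
    """
    total = 0.0
    for l in range(n_sites):
        for m in range(n_sites):
            total += g_diag if l == m else g_offdiag
    return total / n_sites ** 2

n = 50
avg = pair_average(n, 0.08, 0.3)
# Closed form: N(N - 1)/N^2 times the off-diagonal value, plus an O(1/N) term
closed_form = (n * (n - 1) * 0.08 + n * 0.3) / n ** 2
```

The difference between the average and the off-diagonal value is (g_diag − g_offdiag)/N, which is the O(N⁻¹) remainder in the displayed identity.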
Now we would like to pursue this analysis of equilibrium states by showing that their definition is in accordance with results of Theorems 3.4–3.6. This statement is given in the next theorem. Theorem 6.4 (Uniqueness of Densities for Equilibrium States). Take β, γ > 0 and real numbers µ, λ, h away from any critical point. Then, for any
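equilibrium state ω ∈ Ωβ and l ∈ N, all densities are uniquely defined:
(i) The electron density is equal to
\[ \lim_{N\to\infty} \frac{1}{N} \sum_{l'=1}^{N} \omega_N(n_{\kappa(l'),\uparrow} + n_{\kappa(l'),\downarrow}) = \omega(n_{\kappa(l),\uparrow} + n_{\kappa(l),\downarrow}) = d_\beta, \]
cf. Theorem 3.4.
(ii) The magnetization density is equal to
\[ \lim_{N\to\infty} \frac{1}{N} \sum_{l'=1}^{N} \omega_N(n_{\kappa(l'),\uparrow} - n_{\kappa(l'),\downarrow}) = \omega(n_{\kappa(l),\uparrow} - n_{\kappa(l),\downarrow}) = m_\beta, \]
cf. Theorem 3.5.
(iii) The Coulomb correlation density is equal to
\[ \lim_{N\to\infty} \frac{1}{N} \sum_{l'=1}^{N} \omega_N(n_{\kappa(l'),\uparrow}\, n_{\kappa(l'),\downarrow}) = \omega(n_{\kappa(l),\uparrow}\, n_{\kappa(l),\downarrow}) = w_\beta, \]
cf. Theorem 3.6.
Proof. Suppose first that ω ∈ Ωβ is pure. Then, from Corollary 6.1 it follows that
\[ \omega(n_{\kappa(l),\uparrow} + n_{\kappa(l),\downarrow}) = \omega_{\zeta_{c_\beta}}(n_{\kappa(l),\uparrow} + n_{\kappa(l),\downarrow}), \]
with $c_\beta = r_\beta^{1/2} e^{i\phi}$ for some φ ∈ [0, 2π). Thus, by using the gauge invariance of the map c → p(c) we directly get
\[ \omega(n_{\kappa(l),\uparrow} + n_{\kappa(l),\downarrow}) = \partial_\mu p(\beta,\mu,\lambda,\gamma,h; c_\beta) = \partial_\mu p(\beta,\mu,\lambda,\gamma,h; r_\beta^{1/2}) = d_\beta. \tag{6.53} \]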
At fixed parameters β, γ > 0, µ, λ, h ∈ R, recall that the set Ωβ of equilibrium states is weak∗ -compact. In particular, if ω ∈ Ωβ is not pure, it is the weak∗ -limit of convex combinations of pure states. Therefore, we obtain (6.53) for any ω ∈ Ωβ . Similarly one gets
\[ \omega(n_{\kappa(l),\uparrow} - n_{\kappa(l),\downarrow}) = m_\beta \quad\text{and}\quad \omega(n_{\kappa(l),\uparrow}\, n_{\kappa(l),\downarrow}) = w_\beta, \tag{6.54} \]
for any equilibrium state ω ∈ Ωβ and l ∈ N. Moreover, since any weak∗ -limit ω∞ of local Gibbs states ωN (1.6) is an equilibrium state, i.e. ω∞ ∈ Ωβ , we therefore deduce from (6.53) and (6.54), exactly as in the proof of Theorem 6.2, the existence of the limits in the statements (i)–(iii).
Observe that the weak∗ -limit ω∞ ∈ Ωβ of local Gibbs states ωN (1.6) can easily be performed, even at critical points, by using the decomposition theory for states [32]:
Theorem 6.5 (Asymptotics of the Local Gibbs State ωN as N → ∞). Recall that for any φ ∈ [0, 2π), $c_\beta := r_\beta^{1/2} e^{i\phi}$ is a maximizer of the first variational
problem given in Theorem 2.1, whereas the states ζc and ωζ are respectively defined by (6.1) and (6.7). Take any β, γ > 0, µ, λ, h ∈ R, and let N → ∞.
(i) Away from any critical point, the local Gibbs state ωN converges in the weak∗ -topology towards the equilibrium state
\[ \omega_\infty(\cdot) = \frac{1}{2\pi} \int_0^{2\pi} \omega_{\zeta_{c_\beta}}(\cdot)\, d\phi. \tag{6.55} \]
(ii) For each weak∗ limit point ω∞ of local Gibbs states ωN with parameters (βN , γN , µN , λN , hN ) converging to any critical point (β, γ, µ, λ, h) ∈ ∂S (2.7), there is τ ∈ [0, 1] such that
\[ \omega_\infty(\cdot) = (1-\tau)\,\omega_{\zeta_0}(\cdot) + \frac{\tau}{2\pi} \int_0^{2\pi} \omega_{\zeta_{c_\beta}}(\cdot)\, d\phi. \]
Proof. By U (1)-gauge symmetry of the Hamiltonians HN (1.2) recall that any weak∗ -limit ω∞ of local Gibbs states ωN (1.6) is a U (1)-invariant equilibrium state. So, in order to prove the first part of the theorem it suffices to show that the equilibrium state given in (i) is the unique U (1)-invariant state in Ωβ . If the solution rβ of (2.4) is zero, then this follows immediately from Corollary 6.1.
Let rβ > 0 be the unique maximizer of (2.4), i.e. $c_\beta := r_\beta^{1/2} e^{i\phi} \neq 0$ for any φ ∈ [0, 2π). Let ∂Ωβ = {ωζ : ζ ∈ Gβ } be the set of all extreme states of Ωβ , see (6.35) for the definition of the set Gβ of one-site equilibrium states. Observe that the closed convex hull of ∂Ωβ is precisely Ωβ and that ∂Ωβ is the image of the torus [0, 2π) under the continuous map φ → ωζcβ , with $c_\beta := r_\beta^{1/2} e^{i\phi}$. This last map defines a homeomorphism between the torus and ∂Ωβ . In particular, the set ∂Ωβ is compact and for each equilibrium state ω ∈ Ωβ there is a uniquely defined probability measure $d\hat{m}_\omega$ on the torus such that
\[ \omega(A) = \int_0^{2\pi} \omega_{\zeta_{c_\beta}}(A)\, d\hat{m}_\omega(\phi), \quad \text{for all } A \in \mathcal{U}. \tag{6.56} \]
See, e.g., [41, Proposition 1.2]. By U (1)-invariance of ω∞ , for any n ∈ N one has from (6.56) that
\[ \omega_\infty\Big( \prod_{l=1}^{n} a_{\kappa(l),\uparrow} a_{\kappa(l),\downarrow} \Big) = r_\beta^{n/2} \int_0^{2\pi} e^{in\phi}\, d\hat{m}_{\omega_\infty}(\phi) = 0. \]
Therefore, if rβ > 0, there is a unique probability measure compatible with the U (1)-gauge symmetry of ω∞ : $d\hat{m}_{\omega_\infty}(\phi)$ must be the uniform probability measure on [0, 2π).
From Lemma 7.1 the cardinality of the set of maximizers of (2.4) is at most 2. Indeed, away from any critical point, it is 1 whereas at a critical point it can be either 1 (second order phase transition) or 2 (first order phase transition). For
more details, see Sec. 7. In both cases, we can use the same arguments as above. By similar estimates as in the proof of Lemma 6.5 it immediately follows that all limit points of the Gibbs states ωN with parameters (βN , γN , µN , λN , hN ) converging to (β, γ, µ, λ, h) ∈ ∂S as N → ∞ belong to Ωβ = Ωβ (µ, λ, γ, h). Since the set of all U (1)-invariant equilibrium states from Ωβ is {ω (τ ) for any τ ∈ [0, 1]} with
\[ \omega^{(\tau)}(\cdot) := (1-\tau)\,\omega_{\zeta_0}(\cdot) + \frac{\tau}{2\pi} \int_0^{2\pi} \omega_{\zeta_{c_\beta}}(\cdot)\, d\phi, \tag{6.57} \]
we obtain the second statement (ii).
This theorem is a generalization of results obtained for the strong coupling BCS model [7] (see (1.2) with λ = 0 and h = 0). Note however, that Thirring’s analysis [7] of the asymptotics of local Gibbs states comes from explicit computations, whereas we use the structure of sets of states, as explained for instance in [33].
Observe that Theorem 4.1 is a simple consequence of Theorem 6.5. Indeed, assume for instance that the order parameter rβ = rβ (µ, λ, γ, h) and the electron density per site dβ = dβ (µ, λ, γ, h) jump respectively from $r_\beta^- = 0$ to $r_\beta^+$ and from $d_\beta^-$ to $d_\beta^+$ by crossing a critical chemical potential $\mu_\beta^{(c)}$ at fixed parameters (β, λ, γ, h). An example of such behavior is given in Fig. 10 for an electron density smaller than one. If $\rho \in [d_\beta^-, d_\beta^+]$, then the unique solution µN,β = µN,β (ρ, λ, γ, h) of (4.1) must converge towards $\mu_\beta^{(c)}$ as N → ∞. Meanwhile, at fixed $(\beta, \mu_\beta^{(c)}, \lambda, \gamma, h)$,
\[ \omega_{\zeta_0}(n_\uparrow + n_\downarrow) = d_\beta^- \quad\text{and}\quad \omega_{\zeta_{c_\beta^+}}(n_\uparrow + n_\downarrow) = d_\beta^+, \]
with $c_\beta^+ := (r_\beta^+)^{1/2} e^{i\phi}$ and φ ∈ [0, 2π). Any weak∗ -limit ω∞ of local Gibbs states ωN satisfies by construction ω∞ (n↑ + n↓ ) = ρ and has the form ω (τ ) (·) (6.57), by Theorem 6.5. Hence, the Gibbs state ωN converges in the weak∗ -topology towards ω (τρ ) (·) with τρ defined in Theorem 4.1. Indeed, the existence of the limits (i)–(iii) in Theorem 4.1 follows from the uniqueness of the limiting equilibrium state with fixed electron density $\rho \in [d_\beta^-, d_\beta^+]$.
We give now various important properties of densities in ground states, i.e. for β = ∞, which immediately follow from Theorem 6.4. Recall that the set Ω∞ of ground states is the set of all weak∗ limit points as n → ∞ of all equilibrium state sequences {ω (n) }n∈N with diverging inverse temperature βn → ∞. Take γ > 0 and parameters µ, λ, h such that |µ − λ| ≠ λ + |h|. Then the electron and Coulomb correlation densities equal, respectively,
\[ d := \omega(n_{\kappa(l),\uparrow} + n_{\kappa(l),\downarrow}) = d_\infty \quad\text{and}\quad w := \omega(n_{\kappa(l),\uparrow}\, n_{\kappa(l),\downarrow}) = w_\infty, \tag{6.58} \]
for any ground state ω ∈ Ω∞ and l ∈ N, cf. Corollaries 3.2 and 3.4.
If additionally γ > Γ|µ−λ|,λ+|h| , we are in the superconducting phase for ground states, cf. Corollary 3.1. Indeed, for any ϕ ∈ [0, 2π), there is a ground state ω ∈ Ω∞ such that for any l ∈ N,
\[ \omega(a_{\kappa(l),\downarrow} a_{\kappa(l),\uparrow}) = r_{\max}^{1/2} e^{i\varphi}. \]
In the superconducting phase, from Corollary 3.4 we observe that d∞ = 2w∞ , whereas the magnetization density equals
\[ m := \omega(n_{\kappa(l),\uparrow} - n_{\kappa(l),\downarrow}) = m_\infty = 0, \tag{6.59} \]
for any superconducting state ω ∈ Ω∞ and l ∈ N. This is the Meißner effect, see Corollary 3.3. On the other hand, the Cauchy–Schwarz inequality for the states implies the inequalities
\[ 0 \le \omega(n_{\kappa(l),\uparrow}\, n_{\kappa(l),\downarrow}) \le \sqrt{\omega(n_{\kappa(l),\uparrow})}\, \sqrt{\omega(n_{\kappa(l),\downarrow})} \tag{6.60} \]
for any l ∈ N and ω ∈ EU+ . In fact, in the superconducting phase the second inequality of (6.60) is an equality for any ω ∈ Ω∞ . Indeed, (6.59) and Corollary 3.4 yield
\[ \omega(n_{\kappa(l),\uparrow}\, n_{\kappa(l),\downarrow}) = \omega(n_{\kappa(l),\uparrow}) = \omega(n_{\kappa(l),\downarrow}), \tag{6.61} \]
for any ω ∈ Ω∞ and l ∈ N. It shows that 100% of electrons form Cooper pairs in superconducting ground states.
In the case where h ≠ 0 with γ > Γ|µ−λ|,λ+|h| and |µ − λ| ≠ λ + |h|, the density vector (d, m, w) defined by (6.58) and (6.59) is also unique as in the superconducting phase. It equals (d∞ , m∞ , w∞ ), see Corollaries 3.2–3.4. However, if h = 0 with γ < Γ|µ−λ|,λ , or γ = Γ|µ−λ|,λ+|h| , or |µ − λ| = λ + |h|, then the density vector (d, m, w) belongs, in general, to a non-trivial convex set. In other words, there are phase transitions involving these densities. In particular, even in the case h = 0 where the Hamiltonian HN (1.2) is spin invariant, there are ground states breaking the spin SU (2)-symmetry. For instance, take β, γ > 0 and parameters µ, λ such that |µ − λ| < λ and γ < Γ|µ−λ|,λ . Then for any ω ∈ Ω∞ and l ∈ N, the electron density equals d = d∞ = 1, whereas the Coulomb correlation density is w = w∞ = 0. In particular, the first inequality of (6.60) is an equality showing that 0% of electrons form Cooper pairs. But, even if the magnetic field vanishes, i.e. h = 0, for any x ∈ (−1, 1) there exists a ground state ω (x) ∈ Ω∞ with magnetization density m = x (see (6.59) for the definition of m).
Therefore, all the thermodynamics of the strong coupling BCS-Hubbard model discussed in Secs. 3.1–3.5 is encoded in the notion of equilibrium and ground states ω ∈ Ωβ with β ∈ (0, ∞]. However, there is still an important open question related to the thermodynamics of this model. It concerns the problem of fluctuations of the Cooper pair condensate density (Theorem 3.1) or Cooper fields Φκ(l) and Ψκ(l) (6.48) as a function of the temperature. Unfortunately, no results in that direction are known as soon as the thermodynamic limit is concerned. We prove however a
simple statement about fluctuations of Cooper fields for pure states from Ωβ in the limit γβ → ∞.
Theorem 6.6 (Fluctuations of Cooper Fields). Take β, γ > 0 and real numbers µ, λ, h away from any critical point. Then, for any pure state ωζcβ ∈ Ωβ and l ∈ N, the fluctuations of Cooper fields Φκ(l) and Ψκ(l) (6.48) are bounded by
\[ 0 \le \omega_{\zeta_{c_\beta}}(\{\Phi_{\kappa(l)} - \omega_{\zeta_{c_\beta}}(\Phi_{\kappa(l)})\}^2) \le 2\gamma^{-1}\beta^{-1}, \]
\[ 0 \le \omega_{\zeta_{c_\beta}}(\{\Psi_{\kappa(l)} - \omega_{\zeta_{c_\beta}}(\Psi_{\kappa(l)})\}^2) \le 2\gamma^{-1}\beta^{-1}, \]
i.e. they vanish in the limit γβ → ∞.
Proof. Recall that properties of pure states are characterized in Corollary 6.1, i.e. they are product states ωζcβ with the one-site state ζcβ being defined in (6.1). In particular, they satisfy (6.49). Now, to avoid triviality, assume that $c_\beta := r_\beta^{1/2} e^{i\phi} \neq 0$ and let f(τ ) be the function defined for any τ ∈ R by
\[ \mathrm{f}(\tau) := -\gamma|c_\beta + \tau|^2 + p(c_\beta + \tau). \]
Since cβ ≠ 0 is a maximizer of the function −γ|c|2 + p(c) of c ∈ C, one has ∂τ2 f(0) ≤ 0, i.e. ∂τ2 p(cβ + τ )|τ =0 ≤ 2γ. From straightforward computations, observe that p(cβ + τ ) is a convex function of τ ∈ R with
\[ \beta^{-1}\gamma^{-2} \{\partial_\tau^2 p(c_\beta + \tau)\}|_{\tau=0} = \omega_{\zeta_{c_\beta}}(\{\Phi_{\kappa(l)} - \omega_{\zeta_{c_\beta}}(\Phi_{\kappa(l)})\}^2) \ge 0. \]
From this last equality combined with {∂τ2 p(cβ + τ )}|τ =0 ≤ 2γ, we deduce the theorem for Φκ(l) . Moreover, from similar arguments using the function f̂(τ ) := f(iτ ) instead of f, the fluctuations of the Cooper field Ψκ(l) are also bounded by 2γ −1 β −1 .
From Theorem 6.6, note that Cooper fields are c-numbers in the corresponding GNS-representation [32] of pure ground states defined as weak∗ -limits of pure equilibrium states:
Corollary 6.2 (Cooper Fields for Pure Ground States). Let ω ∈ Ω∞ be any weak∗ -limit of pure equilibrium states and let (ψ, π, H) be the corresponding GNS-representation of ω on bounded operators on the Hilbert space H with cyclic vacuum ψ. Then ω is pure and for any l ∈ N, π(Φκ(l) ) = ω(Φκ(l) )IH and π(Ψκ(l) ) = ω(Ψκ(l) )IH .
Proof. A pure equilibrium state is a product state (6.7) and any weak∗ -limit of product states in EUS,+ is also a product state. Thus, by Lemma 6.6, any ground state ω ∈ Ω∞ defined as the weak∗ -limit of pure equilibrium states is extreme in EUS,+ and hence extreme in Ω∞ . Clearly, for such a ground state, π(ω(Φκ(l) )) = ω(Φκ(l) )IH for any l ∈ N. Let $\tilde\Phi := \Phi_{\kappa(l)} - \omega(\Phi_{\kappa(l)})$. From Theorem 6.6 combined with the
Cauchy–Schwarz inequality we obtain for any A ∈ U that
\[ \|\pi(\tilde\Phi)\pi(A)\psi\|_{\mathcal{H}}^2 = \omega(A^* \tilde\Phi \tilde\Phi A) \le \|A\| \sqrt{\omega(\tilde\Phi(\tilde\Phi A A^* \tilde\Phi^2))} \le \|A\|^2 \|\tilde\Phi\|^{3/2} [\omega(\tilde\Phi^2)]^{1/4} = 0. \]
From the cyclicity of ψ, it follows that π(Φκ(l) ) = ω(Φκ(l) )IH . The proof of π(Ψκ(l) ) = ω(Ψκ(l) )IH is performed in the same way. We omit the details.
In particular, for such pure ground states ω in Ω∞ , correlation functions can explicitly be computed at any order in Cooper fields. For instance, for all N ∈ N, all kj , lj ∈ N, mj , nj ∈ N0 , j = 1, . . . , N , and any An ∈ U, n = 1, . . . , N + 1, one has
\[ \omega(A_1 \Phi^{m_1}_{\kappa(k_1)} \Psi^{n_1}_{\kappa(l_1)} A_2 \cdots A_N \Phi^{m_N}_{\kappa(k_N)} \Psi^{n_N}_{\kappa(l_N)} A_{N+1}) = \omega(\Phi^{m_1}_{\kappa(k_1)})\, \omega(\Psi^{n_1}_{\kappa(l_1)}) \cdots \omega(\Phi^{m_N}_{\kappa(k_N)})\, \omega(\Psi^{n_N}_{\kappa(l_N)})\, \omega(A_1 \cdots A_{N+1}). \]
7. Analysis of the Variational Problem
The variational problem (2.4) is quite explicit but, for the reader's convenience, we collect here some properties of its solution rβ with respect to β, γ > 0 and µ, λ, h ∈ R. We show in particular that rβ > 0 exists in a non-empty domain of (β, γ, µ, λ, h) with some monotonicity properties as well as the existence of both first and second order phase transitions. We conclude this section by giving the asymptotics of rβ as β → ∞, i.e. by proving Corollary 3.1.
(1) We start by showing that rβ = 0 for sufficiently small inverse temperatures β at fixed γ, µ, λ and h. Indeed, for any r ≥ 0 one computes that
\[ \partial_r f(r) = \gamma \left( \frac{\gamma \sinh(\beta g_r)}{2 g_r (e^{\lambda\beta}\cosh(\beta h) + \cosh(\beta g_r))} - 1 \right), \tag{7.1} \]
cf. Theorem 2.1. Direct estimations show that if 0 < β < 2γ −1 , then ∂r f (r) < 0 for any r ≥ 0, i.e. rβ = 0.
(2) Fix now β > 0 and µ, λ, h ∈ R, then rβ > 0 for sufficiently large coupling constants γ. Indeed, for large enough γ > 0 there is at least one strictly positive solution r̃β > 0 of (2.5). Since direct computations using again (2.5) imply that
\[ \frac{d}{d\gamma} \{ f(\beta,\mu,\lambda,\gamma,h; \tilde r_\beta(\gamma)) - f(\beta,\mu,\lambda,\gamma,h; 0) \} = \tilde r_\beta(\gamma) > 0, \]
and
\[ f(\beta,\mu,\lambda,\gamma,h; \tilde r_\beta) - f(\beta,\mu,\lambda,\gamma,h; 0) = O(\gamma) \quad \text{as } \gamma \to \infty, \]
for any fixed β > 0 and µ, λ, h ∈ R, there is a unique γc > 2|λ − µ| such that f (r̃β ) > f (0), i.e. rβ > 0 for γ > γc . The domain of parameters (β, µ, λ, γ, h) where rβ is strictly positive is therefore non-empty, cf. Figs. 3 and 4.
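Items (1) and (2) can be checked numerically from the right-hand side of (7.1). The following is a minimal sketch under one labeled assumption: it takes g_r = {(µ − λ)² + γ²r}^{1/2}, which is consistent with the formulas of this section but is defined elsewhere in the paper. The helper names `g`, `d_f` and `max_slope` and all parameter values are illustrative.

```python
import math

def g(r, mu, lam, gamma):
    """Assumed form of g_r (not restated in this section)."""
    return math.sqrt((mu - lam) ** 2 + gamma ** 2 * r)

def d_f(r, beta, mu, lam, gamma, h):
    """Right-hand side of (7.1), i.e. the derivative of f with respect to r."""
    x = g(r, mu, lam, gamma)
    # sinh(beta x) / x tends to beta as x -> 0
    ratio = beta if x < 1e-12 else math.sinh(beta * x) / x
    denom = math.exp(lam * beta) * math.cosh(beta * h) + math.cosh(beta * x)
    return gamma * (gamma * ratio / (2.0 * denom) - 1.0)

def max_slope(beta, mu, lam, gamma, h, r_max=5.0, steps=2000):
    """Largest value of d_f on a uniform grid of [0, r_max]."""
    return max(
        d_f(k * r_max / steps, beta, mu, lam, gamma, h)
        for k in range(steps + 1)
    )

# Item (1): beta = 0.7 < 2 / gamma for gamma = 2.6, so the slope stays negative
slope_high_temperature = max_slope(0.7, 1.0, 0.0, 2.6, 0.0)
# Item (2): at fixed beta, a large coupling makes the slope at r = 0 positive
slope_strong_coupling = d_f(0.0, 1.0, 1.0, 0.0, 50.0, 0.0)
```

A negative maximal slope means f is strictly decreasing on the sampled range, so its maximizer is r = 0; a positive slope at r = 0 rules out r = 0 as the maximizer.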
(3) To get an intuitive idea of the behavior of the function f (r) (cf. Theorem 2.1), we analyze the cardinality of the set S of strictly positive solutions of the gap equation (2.5):
Lemma 7.1 (Cardinality of the Set S). If βγ ≤ 6, the gap equation (2.5) has at most one strictly positive solution, whereas it has at most two strictly positive solutions when βγ > 6.
Proof. From (7.1), any strictly positive maximizer rβ > 0 of (2.4) is a solution of the equation
\[ h_1(g_r) = 0, \quad \text{with} \quad h_1(x) := \frac{\gamma \sinh(\beta x)}{2x} - e^{\lambda\beta}\cosh(\beta h) - \cosh(\beta x). \tag{7.2} \]
This last equation is equivalent to the gap equation (2.5). For any x > 0, observe that
\[ \partial_x h_1(x) = \frac{\beta\gamma}{2x}\cosh(x\beta) - \left( \frac{\gamma}{2x^2} + \beta \right) \sinh(x\beta) = 0 \tag{7.3} \]
if and only if
\[ (2\beta^{-1}\gamma^{-1})^{1/2}\, y = \sqrt{\frac{y}{\tanh(y)} - 1} =: C(y), \qquad y = \beta x > 0. \tag{7.4} \]
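As a numerical sanity check of (7.4), the sketch below evaluates C(y) and tests on a grid whether the straight line (2β⁻¹γ⁻¹)^{1/2} y ever lies below C(y), which is the condition for a strictly positive solution. The function names `C` and `has_positive_root` and the grid parameters are illustrative choices, and a grid test is of course only indicative, not a proof.

```python
import math

def C(y):
    """The function C(y) = sqrt(y / tanh(y) - 1) from (7.4), for y > 0."""
    return math.sqrt(y / math.tanh(y) - 1.0)

def has_positive_root(beta_gamma, y_max=50.0, steps=50000):
    """Grid test: does C(y) exceed the line sqrt(2 / (beta gamma)) y somewhere?

    If it does, the line and the concave curve C must cross at some y > 0,
    because C(y) grows only like sqrt(y) for large y.
    """
    slope = math.sqrt(2.0 / beta_gamma)
    for k in range(1, steps + 1):
        y = k * y_max / steps
        if C(y) > slope * y:
            return True
    return False

# Slope of C at the origin, estimated at a small argument
slope_at_origin = C(1e-3) / 1e-3
```

The estimated slope at the origin is close to (1/3)^{1/2} = (2/6)^{1/2}, which is where the threshold βγ = 6 comes from: the line is flatter than C near the origin exactly when βγ > 6.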
The map y → C(y) is strictly concave for y > 0, C(0) = 0, and ∂y C(0) = (2/6)1/2 . Therefore, if βγ > 6 there is a unique strictly positive solution ỹ = βx̃ > 0 of (7.4), and there is no strictly positive solution of (7.4) when βγ < 6. Since h1 (0) could be negative in some cases and h1 (x) diverges exponentially to −∞ as x → ∞, the cardinality of the set of strictly positive solutions of the gap equation (2.5) is at most two if βγ > 6, or at most one if βγ ≤ 6.
Consequently, if the gap equation (2.5) has no solution, then f (r) is strictly decreasing for any r ≥ 0. If the gap equation (2.5) has one unique solution rβ > 0, the function f (r) is increasing until its (strictly positive) maximizer rβ > 0 and decreasing next for r ≥ rβ . Finally, when there are two strictly positive solutions of (2.5), the lower one must be a local minimum whereas the larger solution must be a local maximum. In this case the function f (r) decreases for r ≥ 0 until its local minimum, then increases until its local maximum, and finally decreases again to diverge towards −∞. Note that none of these cases can be excluded, i.e. they all appear depending on β, γ > 0 and µ, λ, h ∈ R. See Figs. 3 and 18.
(4) We study now the dependence of rβ > 0 with respect to variations of each parameter. So, let us fix the parameters {β, µ, λ, γ, h}\{ν} with ν = β, µ, λ, γ, or h and consider the function ξ(r, ν) := ∂r f (r, ν) for r ≥ 0 and ν in the open set of definition of f (r, ν) = f (β, µ, λ, γ, h; r), see (7.1). Recall that rβ > 0 is a solution at ν = ν0 of the gap equation (2.5), i.e. ξ(rβ , ν0 ) = 0. Straightforward computations imply that
\[ \partial_r^2 f(r) = \frac{\gamma^4 \beta}{4 g_r^2 (e^{\lambda\beta}\cosh(\beta h) + \cosh(\beta g_r))}\, h_2(g_r), \tag{7.5} \]
[Figure 18: three panels, each plotting f (r) against r.]
Fig. 18. Illustrations of the function f (r) for r ∈ [0, 1/4] at (µ, γ, h) = (1, 2.6, 0) with inverse temperatures β = βc − 0.3 (orange line), β = βc (red line), β = βc + 0.5 (blue line), and with coupling constants λ = 0 (left figure), λ = 0.45 (center figure) and λ = 0.575 (right figure). Here βc = θc−1 is the critical inverse temperature which, from left to right, equals 2.04, 3.46 and 6.35, respectively. (Color online.)
for any r > 0 with
\[ h_2(x) := \frac{e^{\lambda\beta}\cosh(\beta h)\cosh(\beta x) + 1}{e^{\lambda\beta}\cosh(\beta h) + \cosh(\beta x)} - \frac{\sinh(\beta x)}{\beta x}. \tag{7.6} \]
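Formulas (7.5) and (7.6) can be cross-checked against (7.1) by a finite difference. The sketch below is a hedged illustration that again assumes g_r = {(µ − λ)² + γ²r}^{1/2}; the function names `d_f`, `h2`, `d2_f` and the parameter values are hypothetical choices made for this check only.

```python
import math

def g(r, mu, lam, gamma):
    """Assumed form of g_r (defined elsewhere in the paper)."""
    return math.sqrt((mu - lam) ** 2 + gamma ** 2 * r)

def d_f(r, beta, mu, lam, gamma, h):
    """First derivative of f, as in (7.1), for r > 0."""
    x = g(r, mu, lam, gamma)
    denom = math.exp(lam * beta) * math.cosh(beta * h) + math.cosh(beta * x)
    return gamma * (gamma * math.sinh(beta * x) / (2.0 * x * denom) - 1.0)

def h2(x, beta, lam, h):
    """The function h_2 of (7.6)."""
    e = math.exp(lam * beta) * math.cosh(beta * h)
    return ((e * math.cosh(beta * x) + 1.0) / (e + math.cosh(beta * x))
            - math.sinh(beta * x) / (beta * x))

def d2_f(r, beta, mu, lam, gamma, h):
    """Second derivative of f, as in (7.5)."""
    x = g(r, mu, lam, gamma)
    denom = math.exp(lam * beta) * math.cosh(beta * h) + math.cosh(beta * x)
    return gamma ** 4 * beta * h2(x, beta, lam, h) / (4.0 * x ** 2 * denom)

params = (3.0, 1.0, 0.45, 2.6, 0.2)  # (beta, mu, lambda, gamma, h), illustrative
r0, eps = 0.1, 1e-6
# Central finite difference of (7.1), to be compared with (7.5)
finite_difference = (d_f(r0 + eps, *params) - d_f(r0 - eps, *params)) / (2 * eps)
closed_form = d2_f(r0, *params)
```

The same helper `h2` also shows the sign statement that follows in the text: when e^{λβ} cosh(βh) ≤ 1, the first term of (7.6) is at most 1 while sinh(βx)/(βx) exceeds 1, so h2 is negative.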
It yields that there is at most one strictly positive solution, r̃ ≥ 0 of ∂r ξ(r, ν0 ) = 0 for each fixed set of parameters. For instance, if eλβ cosh(βh) ≤ 1, then it is straightforward to check that ∂r ξ(r, ν0 ) < 0 for any r > 0. In the situation where the gap equation (2.5) has two strictly positive solutions, rβ > 0 cannot solve ∂r ξ(r, ν0 ) = 0, since in this case the equation h2 (x) = 0 would have at least two strictly positive solutions, as rβ is a maximizer. Consequently, to simplify our study we restrict ourselves to the very large set of parameters where ∂r ξ(rβ , ν0 ) ≠ 0. In this case, the differential dξ has maximal rank at (rβ , ν0 ) and from the implicit function theorem, there are ε > 0 and a smooth and strictly positive function rβ (ν) > 0 (if ν = β, then of course rβ (ν) := rν ) defined on the ball Bε (ν0 ) centered on the point ν0 and with radius ε such that ξ(ν, rβ (ν)) = 0 for any ν ∈ Bε (ν0 ). By continuity of the function ∂r ξ we can choose ε > 0 such that ∂r ξ(ν, rβ (ν)) does not change its sign for ν ∈ Bε (ν0 ). Thus rβ (ν) describes the evolution of the solution of (2.4) for ν ∈ Bε (ν0 ). If rβ = rβ (ν0 ) > 0 is the unique maximizer of (2.4) with ∂r ξ(rβ , ν0 ) ≠ 0, then the function rβ (ν) describes the smooth evolution of the Cooper pair condensate density with respect to small perturbations of ν0 . Observe that
\[ \partial_\nu \xi(r_\beta(\nu), \nu) = \{\partial_\nu r_\beta(\nu)\}\{\partial_r \xi(r,\nu)\}|_{r=r_\beta(\nu)} + \{\partial_\nu \xi(r,\nu)\}|_{r=r_\beta(\nu)} = 0 \]
and {∂r ξ(r, ν0 )}|r=rβ (ν0 ) < 0 because rβ is a maximizer. Consequently, one obtains
\[ \mathrm{sgn}\{\partial_\nu r_\beta(\nu_0)\} = \mathrm{sgn}\{\{\partial_\nu \partial_r f(r,\nu_0)\}|_{r=r_\beta(\nu_0)}\}. \]
In other words, the function rβ (ν) of ν ∈ Bε (ν0 ) is either increasing if {∂ν ∂r f (r, ν0 )}|r=rβ (ν0 ) > 0,
or decreasing if {∂ν ∂r f (r, ν0 )}|r=rβ (ν0 ) < 0, as soon as rβ > 0 is the unique maximizer of (2.4) with ∂r ξ(rβ , ν0 ) ≠ 0.
(5) By applying this last result respectively to ν0 = γ > Γ|µ−λ|,λ+|h| (Corollary 3.1) and ν0 = h ∈ R, we obtain that rβ > 0 is an increasing function of γ > 0 and a decreasing function of |h| because via (2.5) one has
\[ \{\partial_\gamma \partial_r f(r,\gamma)\}|_{r=r_\beta} > 4\gamma^{-2}(\mu-\lambda)^2 \ge 0 \]
at fixed parameters (β, µ, λ, h) and
\[ \{\partial_h \partial_r f(r,h)\}|_{r=r_\beta} = -\frac{2 g_{r_\beta} \beta e^{\lambda\beta} \sinh(\beta h)}{\sinh(\beta g_{r_\beta})} \]
at fixed (β, µ, λ, γ).
(6) If γ > Γ|µ−λ|,λ+|h| , for any fixed (β, γ, λ, h) the order parameter rβ > 0 is a decreasing function of |µ − λ| under the condition that eλβ cosh(βh) ≤ 1, as
\[ \{\partial_\mu \partial_r f(r,\mu)\}|_{r=r_\beta} = \frac{\gamma^2 \beta (\mu-\lambda)}{2 g_{r_\beta}^2 (e^{\lambda\beta}\cosh(\beta h) + \cosh(\beta g_{r_\beta}))}\, h_2(g_{r_\beta}), \]
cf. (7.6). If eλβ cosh(βh) > 1, the behavior of rβ > 0 is no longer monotone as a function of |µ − λ| (λ being fixed), cf. Fig. 10. The behavior of rβ as a function of λ or β is also not clear in general. But, at least as a function of the inverse temperature β > 0, we can give simple sufficient conditions to get its monotonicity. Indeed, direct computations show that
\[ \{\partial_\beta \partial_r f(r,\beta)\}|_{r=r_\beta} = (\gamma + 2\lambda)\, g_{r_\beta} \frac{\cosh(\beta g_{r_\beta})}{\sinh(\beta g_{r_\beta})} - (\lambda\gamma + 2 g_{r_\beta}^2) - 2 h g_{r_\beta} \frac{e^{\lambda\beta}\sinh(\beta h)}{\sinh(\beta g_{r_\beta})}. \]
By combining this last equality with (2.5), we then get that
\[ \{\partial_\beta \partial_r f(r,\beta)\}|_{r=r_\beta} \ge 0 \tag{7.7} \]
with rβ > 0 if and only if
\[ g_{r_\beta}^2 \le \frac{\gamma(\gamma\cosh(\beta g_{r_\beta}) - 2 e^{\lambda\beta}\cosh(\beta h)(\lambda + h\tanh(\beta h)))}{4(\cosh(\beta g_{r_\beta}) + e^{\lambda\beta}\cosh(\beta h))}. \tag{7.8} \]
From (2.5) combined with tanh(x) < 1, we also have
\[ g_{r_\beta}^2 < \frac{\gamma^2 \cosh^2(\beta g_{r_\beta})}{4(\cosh(\beta g_{r_\beta}) + e^{\lambda\beta}\cosh(\beta h))^2}. \tag{7.9} \]
April 20, 2010 14:17 WSPC/S0129-055X
296
148-RMP
J070-00395
J.-B. Bru & W. de Siqueira Pedra
condition g_{rβ} ≥ (λ + h tanh(βh)) tanh(β g_{rβ}), under which rβ is an increasing function of β > 0. This inequality is also equivalent to
\[ g_{r_\beta} \le \tanh(\beta g_{r_\beta})\left(\frac{\gamma}{2} - (\lambda + h\tanh(\beta h))\,\frac{e^{\lambda\beta}\cosh(\beta h)}{\cosh(\beta g_{r_\beta})}\right). \]
In particular, by using again the gap equation (2.5), if
\[ \gamma > 2(\lambda + h\tanh(\beta h))\left(1 + \frac{e^{\lambda\beta}\cosh(\beta h)}{\cosh(\beta g_{r_\beta})}\right), \]
then rβ > 0 is an increasing function of β > 0. Since tanh x ≤ 1, another sufficient condition to get (7.7) is λ + |h| ≤ g_{rβ}. In particular, if λ < |µ − λ| and γ > Γ_{|µ−λ|,λ+|h|} with h sufficiently small, then rβ > 0 is again an increasing function of β > 0. Therefore, the domain of (µ, λ, γ, h) where rβ > 0 is proven to be an increasing function of β > 0 is rather large. Actually, from a huge number of numerical computations, we conjecture that rβ > 0 is always an increasing function of β > 0. In other words, this conjecture implies that the condition expressed in Corollary 3.1 on (µ, λ, γ, h) should be necessary to obtain a superconductor at a fixed temperature.
(7) Observe that the order of the phase transition depends on the parameters. For instance, assume λ ≤ 0, h = 0 and γ > Γ_{|µ−λ|,λ}. Then, at any inverse temperature β > 0 it follows from (7.5) that f(r) is a strictly concave function of r > 0. This property justifies the existence and uniqueness of the inverse temperature βc solution of the equation
\[ \frac{\tanh(\beta|\mu-\lambda|)}{|\mu-\lambda|} = \frac{2}{\gamma}\left(1 + \frac{e^{\lambda\beta}}{\cosh(\beta|\mu-\lambda|)}\right), \]
i.e. (2.5) for λ ≤ 0, h = 0 and r = 0. In particular, βc is such that the Cooper pair condensate density continuously goes from rβ = 0 for β ≤ βc to rβ > 0 for β > βc. In this case the superconducting phase transition is of second order, cf. Fig. 3. The appearance of a first order phase transition at some fixed (µ, λ, γ, h) is also not surprising. Indeed, recall that the function f(r) may have a local minimum and a local maximum, see discussions below Lemma 7.1. For instance, assume now λ = µ > 0, h = 0 and 4λ = Γ_{0,λ} < γ ≤ 6λ. Then, from (7.1) for r = 0,
\[ \partial_r f(0) = \frac{\gamma}{e^{\lambda\beta}+1}\left(\frac{\gamma\beta}{2} - (e^{\lambda\beta}+1)\right). \]
Since by explicit computations
\[ \min_{x>0}\frac{e^{x}+1}{x} > 3, \]
it follows that ∂r f(0) < 0 for any β > 0 whenever λ = µ > 0, h = 0 and 0 < γ ≤ 6λ. Therefore, as soon as there is a superconducting phase transition, for instance if
4λ < γ ≤ 6λ (cf. Corollary 3.1), the function rβ of β > 0 must be discontinuous at the critical point. This case is an example of a first order superconducting phase transition. Numerical illustrations of a similar first order phase transition are also given in Fig. 3.
(8) We conclude this section by a computation of the asymptotics of the order parameter rβ as β → ∞. We prove in particular Corollary 3.1. From (2.6), we already know that rβ = 0 for any γ ≤ 2|µ̃_λ| with µ̃_λ := µ − λ. Therefore, we consider here that γ > 2|µ̃_λ| and we look for the domain where the parameter rβ is strictly positive in the limit β → ∞. Recall that rβ is solution of the variational problem (2.4), i.e.
\[ \frac{1}{\beta}\ln 2 + \sup_{r\ge 0} f(r) = -\gamma r_\beta + \frac{1}{\beta}\ln\{e^{\beta h} + e^{-\beta h} + e^{\beta(g_{r_\beta}-\lambda)} + e^{-\beta(g_{r_\beta}+\lambda)}\}. \tag{7.10} \]
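The large-β behaviour of the variational problem (7.10) can be explored numerically. The following Python sketch is only an illustration under stated assumptions: it takes g_r = ((µ − λ)² + γ² r)^{1/2}, a form inferred here from (7.12) rather than quoted from the definition given earlier in the paper, it drops the constant β⁻¹ ln 2 (which does not affect the maximizer), and it locates maximizers on a grid. The function names and the parameter values are hypothetical.

```python
import math

def g(r, mu, lam, gamma):
    # Assumed form of g_r (inferred from (7.12)): g_r^2 = (mu - lam)^2 + gamma^2 * r
    return math.sqrt((mu - lam) ** 2 + gamma ** 2 * r)

def f(r, beta, mu, lam, gamma, h):
    # -gamma*r + (1/beta) ln{e^{bh} + e^{-bh} + e^{b(g-lam)} + e^{-b(g+lam)}}, cf. (7.10),
    # evaluated with a stabilized log-sum-exp to avoid overflow at large beta
    gr = g(r, mu, lam, gamma)
    exps = [beta * h, -beta * h, beta * (gr - lam), -beta * (gr + lam)]
    m = max(exps)
    return -gamma * r + (m + math.log(sum(math.exp(e - m) for e in exps))) / beta

def w(r, mu, lam, gamma):
    # Candidate limit of f as beta -> infinity when g_r > lam + |h|
    return -gamma * r + g(r, mu, lam, gamma) - lam

def argmax_on_grid(func, r_hi=1.0, n=20001):
    grid = [r_hi * i / (n - 1) for i in range(n)]
    return max(grid, key=func)

# Hypothetical parameters with |mu - lam| > lam + |h| and gamma > 2|mu - lam|
mu, lam, gamma, h = 1.0, 0.2, 3.0, 0.1
beta = 200.0
r_beta = argmax_on_grid(lambda r: f(r, beta, mu, lam, gamma, h))
r_w = argmax_on_grid(lambda r: w(r, mu, lam, gamma))
print(r_beta, r_w)
```

For these parameters the two grid maximizers should agree closely, which is the regime in which the order parameter is expected to converge to the maximizer of w.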
When β → ∞ the last exponential term can always be neglected for our analysis since g_{rβ} ≥ 0.
Now, assume first that g_0 = |µ̃_λ| > λ + |h|. Then g_r > λ + |h| for any r ≥ 0 and when β → ∞ the function f(r) converges to w(r) := −γr + g_r − λ. In particular, the order parameter rβ converges towards the unique maximizer rmax (2.6) of the function w(r) for r ≥ 0, i.e.
\[ r_\infty := \lim_{\beta\to\infty} r_\beta = r_{\max}, \tag{7.11} \]
for any γ > 2|µ̃_λ| and real numbers µ, λ, h satisfying |µ̃_λ| > λ + |h|.
Assume now that |µ̃_λ| ≤ λ + |h| and let rmin be the solution of g_r = λ + |h|, i.e.
\[ r_{\min} := \gamma^{-2}((\lambda+|h|)^2 - \tilde\mu_\lambda^2) \ge 0. \tag{7.12} \]
Then, for any r ∈ [0, rmin], f(r) = −γr + |h| + o(1) as β → ∞. In particular, since γ > 0,
\[ \sup_{0\le r\le r_{\min}} f(r) = f(\delta) = |h| + o(1), \quad \text{with } \delta = o(1) \text{ as } \beta\to\infty. \tag{7.13} \]
The solution rβ of the variational problem (7.10) converges either to 0, or to some strictly positive value r∞ > rmin. In the case where r∞ > rmin, we would have
\[ f(r_\infty) = w(r_\infty) + o(1) \quad \text{as } \beta\to\infty. \tag{7.14} \]
Now, if |µ̃_λ| ≤ λ + |h| and γ ≤ 2(λ + |h|), then rmin ≥ rmax, cf. (2.6) and (7.12). In this regime, straightforward computations show that
\[ |h| - \sup_{r\ge r_{\min}} w(r) = |h| - w(r_{\min}) = \gamma^{-1}((|h|+\lambda)^2 - \tilde\mu_\lambda^2) \ge 0. \tag{7.15} \]
In other words, the order parameter rβ converges towards
\[ r_\infty := \lim_{\beta\to\infty} r_\beta = 0, \tag{7.16} \]
for any γ ≤ 2(λ + |h|) and real numbers µ, λ, h satisfying |µ̃_λ| ≤ λ + |h|. However, if |µ̃_λ| ≤ λ + |h| and γ > 2(λ + |h|), then rmin < rmax. In particular one gets
\[ |h| - \sup_{r\ge r_{\min}} w(r) = |h| - w(r_{\max}) = -\frac{1}{4\gamma}(\gamma - \tilde\Gamma_{|\tilde\mu_\lambda|,\lambda+|h|})(\gamma - \Gamma_{|\tilde\mu_\lambda|,\lambda+|h|}), \tag{7.17} \]
with Γ_{x,y} ≥ 2y defined for any x ∈ R+ and y ∈ R in Corollary 3.1 and
\[ \tilde\Gamma_{|\tilde\mu_\lambda|,\lambda+|h|} := 2\left(\lambda+|h| - \sqrt{(\lambda+|h|)^2 - \tilde\mu_\lambda^2}\right) \le 2|\tilde\mu_\lambda|. \]
In particular,
\[ \sup_{r\ge r_{\min}} w(r) = w(r_{\max}) > |h|, \tag{7.18} \]
for any γ > Γ_{|µ̃_λ|,λ+|h|} ≥ 2|µ̃_λ|. Therefore, by combining (7.13) with (7.14) and (7.18), we obtain
\[ r_\infty := \lim_{\beta\to\infty} r_\beta = r_{\max}, \tag{7.19} \]
for any γ > Γ_{|µ̃_λ|,λ+|h|} and real numbers µ, λ, h satisfying |µ̃_λ| ≤ λ + |h|.
Finally, if γ = Γ_{|µ̃_λ|,λ+|h|} and |µ̃_λ| < λ + |h|, observe that (7.17) is zero. So, we analyze the next order term to know which number, 0 or rmax, maximizes the function f(r) when β → ∞. On the one hand, straightforward estimations imply that
\[ f(0) - |h| = \beta^{-1}(e^{-\beta(\lambda+|h|-|\tilde\mu_\lambda|)} + e^{-2\beta|h|})(1+o(1)) \quad \text{as } \beta\to\infty. \tag{7.20} \]
On the other hand, if γ = Γ_{|µ̃_λ|,λ+|h|} with |µ̃_λ| < λ + |h|, then by using (2.6) one obtains
\[ f(r_{\max}) - |h| = \beta^{-1} e^{-\beta\sqrt{(\lambda+|h|)^2 - \tilde\mu_\lambda^2}}\,(1+o(1)) \quad \text{as } \beta\to\infty. \tag{7.21} \]
Therefore, if γ = Γ_{|µ̃_λ|,λ+|h|} and |µ̃_λ| < λ + |h|, it is trivial to check from (7.20)–(7.21) that f(0) > f(rmax) when β → ∞. Consequently, the limits (7.11), (7.16) and (7.19) together with (2.6) imply Corollary 3.1 for any γ ≠ Γ_{|µ−λ|,λ+|h|}, whereas if γ = Γ_{|µ−λ|,λ+|h|}, the order parameter rβ converges to r∞ = 0.

Appendix. Griffiths Arguments
As we have an explicit representation of the pressure, it can be verified in some cases that rβ is a C¹-function of parameters (for instance, for special choices of parameters one could check that ∂r ξ(rβ, ν0) ≠ 0, see Sec. 7), implying that p(β, µ, λ, γ, h) is differentiable with respect to parameters. In this particular situation, the proofs of
Theorems 3.1, 3.2, 3.4–3.7 done in Sec. 6.2 could also be performed without our notion of equilibrium states by using Griffiths arguments [29–31], which are based on convexity properties of the pressure. We explain it shortly and we conclude by a discussion of an alternative proof of Theorem 3.2.
Remark A.1. Our method gives access to all correlation functions at once (cf. Theorem 6.5). It is generalized in [18] to all translation invariant Fermi systems. However, computing all correlation functions with Griffiths arguments [29–31] requires the differentiability of the pressure with respect to any perturbation as well as the computation of its corresponding derivative. This is generally a very hard task, for instance for correlation functions involving many lattice points.
(1) Take self-adjoint operators PN acting on the fermionic Fock space and assume the existence of the (infinite volume) grand-canonical pressure
\[ p_\varepsilon(\beta,\mu,\lambda,\gamma,h) := \lim_{N\to\infty} p_{N,\varepsilon}(\beta,\mu,\lambda,\gamma,h) \]
for any fixed ε in a neighborhood V of 0. In this case, observe that the finite volume pressure
\[ p_{N,\varepsilon}(\beta,\mu,\lambda,\gamma,h) := \frac{1}{\beta N}\ln \mathrm{Trace}(e^{-\beta(H_N - \varepsilon P_N)}) \]
is convex as a function of ε ∈ V and ∂ε p_{N,0} = N⁻¹ ωN(PN). Consequently, the point-wise convergence of the function p_{N,ε} towards pε implies that
\[ \liminf_{N\to\infty}\lim_{\varepsilon\to 0^-}\partial_\varepsilon p_{N,\varepsilon} \ge \lim_{\varepsilon\to 0^-}\partial_\varepsilon p_\varepsilon \quad\text{and}\quad \limsup_{N\to\infty}\lim_{\varepsilon\to 0^+}\partial_\varepsilon p_{N,\varepsilon} \le \lim_{\varepsilon\to 0^+}\partial_\varepsilon p_\varepsilon, \tag{A.1} \]
see Griffiths lemma [30, 31] or [29, Appendix C]. In particular, one gets
\[ \lim_{N\to\infty}\{\partial_\varepsilon p_{N,0}\} = \lim_{N\to\infty}\{N^{-1}\omega_N(P_N)\} = \partial_\varepsilon p_\varepsilon|_{\varepsilon=0}, \tag{A.2} \]
under the assumption that pε is differentiable at ε = 0. (2) Therefore, by taking PN =
a∗x,↑ a∗x,↓ ay,↓ ay,↑ ,
x,y∈ΛN
we obtain from (A.2) that 1 ∗ ∗ = ∂γ p(β, µ, λ, γ, h), a a a a lim y,↓ y,↑ x,↑ x,↓ N →∞ N 2 x,y∈ΛN
as soon as the (infinite volume) pressure p (β, µ, λ, γ, h) has continuous derivative with respect to γ > 0. Combined with Theorem 2.1 and (2.5) we would obtain
Theorem 3.1. Meanwhile, Theorems 3.4–3.7 could have been deduced in the same way from (A.2) combined with explicit computations using (2.5).
(3) A direct proof of Theorem 3.2 using Griffiths arguments is more delicate. One uses similar arguments as in [29, 42]. We give them for the interested reader. For any φ ∈ [0, 2π), first recall that the pressure p_{α,φ} associated with H_{N,α,φ} (3.1) in the thermodynamic limit is given by (6.41), which equals (6.43). Additionally, if the parameters β, µ, λ, γ, and h are such that (2.4) has a unique maximizer rβ, then the variational problem (6.43) has a unique maximizer c_{β,α,φ} ∈ e^{iφ}R for α > 0 sufficiently small, and c_{β,α,φ} converges to rβ^{1/2} e^{iφ} as α → 0, see proof of Theorem 6.2. Now, let us denote by
\[ N_N := \sum_{x\in\Lambda_N}(n_{x,\uparrow} + n_{x,\downarrow}) \]
the full particle number operator. By straightforward computations, observe that
\[ [a_{x,\uparrow}, N_N] = a_{x,\uparrow} \quad\text{and}\quad [a_{x,\downarrow}, N_N] = a_{x,\downarrow}, \tag{A.3} \]
for any lattice site labelled by x ∈ ΛN, where [A, B] := AB − BA. Therefore the unitary operator U_φ := e^{−(iφ/2) N_N} realizes a global gauge transformation because one deduces from (A.3) that
\[ U_\varphi a_{x,\uparrow} U_\varphi^* = e^{i\varphi/2} a_{x,\uparrow} \quad\text{and}\quad U_\varphi a_{x,\downarrow} U_\varphi^* = e^{i\varphi/2} a_{x,\downarrow}. \tag{A.4} \]
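The relations (A.3) and (A.4) can be checked by hand on the smallest possible example. The Python sketch below is a toy illustration only: it assumes a single fermionic mode represented by 2×2 matrices in the basis (|0⟩, |1⟩), so the number operator is just a*a, and it ignores the spin and lattice structure of the model. All names are hypothetical.

```python
import cmath

def matmul(A, B):
    # Product of two 2x2 matrices given as nested lists
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(A):
    # Conjugate transpose of a 2x2 matrix
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

a = [[0j, 1 + 0j], [0j, 0j]]      # annihilation operator: a|1> = |0>, a|0> = 0
n = matmul(dagger(a), a)           # number operator a*a = diag(0, 1)

# Commutator [a, n] = a n - n a, to be compared with a itself, cf. (A.3)
an, na = matmul(a, n), matmul(n, a)
commutator = [[an[i][j] - na[i][j] for j in range(2)] for i in range(2)]

phi = 0.7                          # arbitrary gauge angle
# U = exp(-i phi n / 2); n is diagonal, so the exponential acts entrywise on the diagonal
U = [[1 + 0j, 0j], [0j, cmath.exp(-0.5j * phi)]]
conjugated = matmul(matmul(U, a), dagger(U))
expected = [[cmath.exp(0.5j * phi) * a[i][j] for j in range(2)] for i in range(2)]
print(commutator, conjugated, expected)
```

In this toy setting the conjugated operator picks up the phase e^{iφ/2}, in line with (A.4).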
In particular, the unitary transformation of the Hamiltonian H_{N,α,φ} (3.1) equals U_φ H_{N,α,φ} U_φ^* = H_{N,α,0}. It implies on the corresponding Gibbs states (6.42) that
\[ \omega_{N,\alpha,\varphi}(B_N) = e^{i\varphi}\,\omega_{N,\alpha,0}(B_N), \tag{A.5} \]
with the operator BN defined by
\[ B_N := \sum_{x\in\Lambda_N} a_{x,\downarrow} a_{x,\uparrow}. \]
In other words, it suffices to prove Theorem 3.2 for φ = 0. Take φ = 0. Observe that
\[ 0 = \omega_{N,\alpha,0}([H_{N,\alpha,0}, N_N]) = \alpha\,\omega_{N,\alpha,0}(B_N - B_N^*). \tag{A.6} \]
Additionally, by using the positive semidefinite Bogoliubov–Duhamel scalar product
\[ (X,Y)_{H_{N,\alpha,0}} := \beta^{-1} e^{-\beta N p_{N,\alpha,0}(\beta,\mu,\lambda,\gamma,h)}\int_0^\beta \mathrm{Trace}(e^{-(\beta-\tau)H_{N,\alpha,0}} X^* e^{-\tau H_{N,\alpha,0}} Y)\,d\tau \]
with respect to the Hamiltonian H_{N,α,0} (see, e.g., [25, 29, 42]), one gets that
\[ 0 \le \beta([N_N, H_{N,\alpha,0}], [N_N, H_{N,\alpha,0}])_{H_{N,\alpha,0}} = \omega_{N,\alpha,0}([N_N, [H_{N,\alpha,0}, N_N]]) = \alpha\,\omega_{N,\alpha,0}(B_N + B_N^*). \tag{A.7} \]
So, by combining (A.6) with (A.7) it follows that ω_{N,α,0}(BN) = ω_{N,α,0}(BN^*) ≥ 0 for any α ≥ 0. In particular ω_{N,α,0}(BN) = ω_{N,α,0}(BN^*) is a real number. The function p_{N,α,0} is a convex function of α ≥ 0 because
\[ \beta(\{(B_N + B_N^*) - \omega_{N,\alpha,0}(B_N + B_N^*)\}, \{(B_N + B_N^*) - \omega_{N,\alpha,0}(B_N + B_N^*)\})_{H_{N,\alpha,0}} = \partial_\alpha^2 p_{N,\alpha,0}(\beta,\mu,\lambda,\gamma,h). \]
Then, under the assumption that p_{α,0} is differentiable at α = 0 away from any critical point, the equations (A.2), with PN = BN + BN^* and (6.43), imply that
\[ \lim_{N\to\infty}\frac{1}{N}\,\omega_{N,\alpha,0}(B_N + B_N^*) = \lim_{N\to\infty}\partial_\alpha\frac{1}{\beta N}\ln \mathrm{Trace}(e^{-\beta H_{N,\alpha,0}}) = \partial_\alpha p_{\alpha,0}(\beta,\mu,\lambda,\gamma,h) = \zeta_{c_{\beta,\alpha,0}}(a_\downarrow^* a_\uparrow^* + a_\uparrow a_\downarrow), \]
for any α > 0 sufficiently small and with ζ_c(·) defined for any c ∈ C by (6.1). Returning back to the original Hamiltonian H_{N,α,φ} (3.1) for any φ ∈ [0, 2π), we conclude from (A.5) combined with the last equalities that
\[ \lim_{N\to\infty}\frac{1}{N}\sum_{x\in\Lambda_N}\omega_{N,\alpha,\varphi}(a_{x,\uparrow}a_{x,\downarrow}) = \frac{e^{i\varphi}}{2}\,\zeta_{c_{\beta,\alpha,0}}(a_\downarrow^* a_\uparrow^* + a_\uparrow a_\downarrow). \]
Therefore, by taking the limit α → 0, Theorem 3.2 would follow if one additionally checks that p_{α,0} is differentiable at α = 0 away from any critical point.

Acknowledgments
We are very grateful to Volker Bach and Jakob Yngvason for their hospitality at the Erwin Schrödinger International Institute for Mathematical Physics, at the Physics University of Vienna, and at the Institute of Mathematics of the Johannes Gutenberg–University that allowed us to work on different aspects of the present paper. We also thank N. S. Tonchev and V. A. Zagrebnov for giving us relevant references, as well as the referee for having helped us to improve the paper. Additionally, J.-B. B. especially thanks the mathematical physics group of the Department of Physics of the University of Vienna for the very nice working environment.
References [1] E. Størmer, Symmetric states of infinite tensor product C ∗ -algebras, J. Funct. Anal. 3 (1969) 48–68. [2] J. R. Schrieffer and M. Tinkham, Superconductivity, Rev. Mod. Phys. 71 (1999) S313–S317. [3] Y. Yanase, T. Jujo, T. Nomura, H. Ikeda, T. Hotta and K. Yamada, Theory of superconductivity in strongly correlated electron systems, Phys. Rep. 387 (2003) 1–149. [4] A. L. Patrick, N. Nagaosa and X.-G. Wen, Doping a Mott insulator: Physics of hightemperature superconductivity, Rev. Mod. Phys. 78 (2006) 17–85. [5] S. T. Beliaev, Application of the methods of quantum field theory to a system of bosons, Sov. Phys. JETP 7 (1958) 289–299. [6] W. Thirring and A. Wehrl, On the mathematical structure of the B.C.S.-model, Comm. Math. Phys. 4 (1967) 303–314. [7] W. Thirring, On the mathematical structure of the B.C.S.-model. II, Comm. Math. Phys. 7 (1968) 181–189. [8] D. J. Thouless, The Quantum Mechanics of Many-Body Systems, 2nd edn. (Academic Press, New York, 1972). [9] N. G. Duffield and J. V. Pul´e, A new method for the thermodynamics of the BCS model, Comm. Math. Phys. 118 (1988) 475–494. [10] G. A. Raggio and R. F. Werner, The Gibbs variational principle for general BCS-type models, Europhys. Lett. 9 (1989) 633–638. [11] I. A. Bernadskii and R. A. Minlos, Exact solution of the BCS model, Theor. Math. Phys. 12(2) (1972) 779–787. [12] N. Ilieva and W. Thirring, High-Tc superconductivity by phase cloning, arXiv:hepth/0701245v3 (2007). [13] N. N. Bogoliubov, V. V. Tolmachev and D. V. Shirkov, A New Method in the Theory of Superconductivity (Academy of Sciences Press, Moscow, 1958) and (Consult. Bureau, Inc., N.Y., Chapman Hall Ltd., London, 1959). [14] R. J. Bursill and C. J. Thompson, Variational bounds for lattice fermion models II: Extended Hubbard model in the atomic limit, J. Phys. A Math. Gen. 26 (1993) 4497–4511. [15] F. P. Mancini, F. Mancini and A. 
Naddeo, Exact solution of the extended Hubbard model in the atomic limit on the Bethe lattice, arXiv:0711.0318v1 (2007). [16] I. G. Brankov and N. S. Tonchev, On the SD model for coexistence of ferromagnetism and superconductivity, Phys. Stat. Sol. (B) 102 (1980) 179–187. [17] N. N. Bogoliubov Jr., A. N. Ermilov and A. M. Kurbatov, On coexistence of superconductivity and ferromagnetism, Phys. A 101 (1980) 613–628. [18] J.-B. Bru and W. de Siqueira Pedra, Non-cooperative equilibria of Fermi systems with long range interactions, in preparation. [19] D. Petz, G. A. Raggio and A. Verbeure, Asymptotics of Varadhan-type and the Gibbs variational principle, Comm. Math. Phys. 121 (1989) 271–282. [20] G. A. Raggio and R. F. Werner, Quantum statistical mechanics of general mean field systems, Helv. Phys. Acta 62 (1989) 980–1003. [21] G. A. Raggio and R. F. Werner, The Gibbs variational principle for inhomogeneous mean field systems, Helv. Phys. Acta 64 (1991) 633–667. [22] F. Hiai, M. Mosonyi, H. Ohno and D. Petz, Free energy density for mean field perturbation of states of a one-dimensional spin chain, Rev. Math. Phys. 20(3) (2008) 335–365.
[23] W. De Roeck, C. Maes, K. Netocny and L. Rey-Bellet, A note on the non-commutative Laplace-Varadhan integral Lemma, arXiv:0808.0293v2 [math-ph] (2009). [24] G. L. Sewell, Quantum Theory of Collective Phenomena (Clarendon Press, Oxford, 1986). [25] O. Brattelli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Vol. II, 2nd edn. (Springer-Verlag, New York, 1996). [26] R. Haag, The mathematical structure of the Bardeen–Cooper–Schrieffer model, Il Nuovo Cimento 25(2) (1962) 287–299. [27] G. Emch, Algebraic Methods in Statistical Mechanics and Quantum Field Theory (Wiley-Interscience, New York, 1972). [28] L. Accardi, De Finetti theorem, in Encyclopaedia of Mathematics, ed. M. Hazewinkel (Kluwer Academic Publishers, 2001). [29] V. A. Zagrebnov and J.-B. Bru, The Bogoliubov model of weakly imperfect Bose gas, Phys. Rep. 350 (2001) 291–434. [30] R. Griffiths, A proof that the free energy of a spin system is extensive, J. Math. Phys. 5 (1964) 1215–1222. [31] K. Hepp and E. H. Lieb, Equilibrium statistical mechanics of matter interacting with the quantized radiation field, Phys. Rev. A 8 (1973) 2517–2525. [32] O. Brattelli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Vol. I, 2nd edn. (Springer-Verlag, New York, 1996). [33] M. Fannes, H. Spohn and A. Verbeure, Equilibrium states for mean field models, J. Math. Phys. 21(2) (1980) 355–358. [34] N. N. Bogoliubov Jr., J. G. Brankov, V. A. Zagrebnov, A. M. Kurbatov and N. S. Tonchev, Metod approksimiruyushchego gamil’toniana v statisticheskoi fizikex (Izdat. Bulgar. Akad. Nauk,y Sofia, 1981). [35] N. N. Bogoliubov Jr., J. G. Brankov, V. A. Zagrebnov, A. M. Kurbatov and N. S. Tonchev, Some classes of exactly soluble models of problems in Quantum Statistical Mechanics: The method of the approximating Hamiltonian, Russ. Math. Surv. 39 (1984) 1–50. [36] J. G. Brankov, D. M. Danchev and N. S. 
Tonchev, Theory of Critical Phenomena in Finite-Size Systems: Scaling and Quantum Effects (World Scientific, 2000). [37] N. N. Bogoliubov Jr., On model dynamical systems in statistical mechanics, Physica 32 (1966) 933–944. [38] C. N. Yang, Concept of off-diagonal long range order and the quantum phases of liquid He and of superconductors, Rev. Mod. Phys. 34 (1962) 694–704. [39] S. Adams and T. Dorlas, C ∗ -Algebraic approach to the Bose–Hubbard model, J. Math. Phys. 48 (2007) 103304, 14 pp. [40] H. Araki and H. Moriya, Equilibrium statistical mechanics of fermion lattice systems, Rev. Math. Phys. 15 (2003) 93–198. [41] R. R. Phelps, Lectures on Choquet’s Theorem, Lecture Notes in Mathematics, Vol. 1757, 2nd edn. (Springer-Verlag, 2001). [42] J. Ginibre, On the asymptotic exactness of the Bogoliubov approximation for many Bosons systems, Comm. Math. Phys. 8 (1968) 26–51.
x The Approximating Hamiltonian Method in Statistical Physics.
y Publ. House Bulg. Acad. Sci.
April 20, 2010 14:17 WSPC/S0129-055X
148-RMP
J070-00396
Reviews in Mathematical Physics Vol. 22, No. 3 (2010) 305–329 c World Scientific Publishing Company DOI: 10.1142/S0129055X10003965
ON SEMICLASSICAL AND UNIVERSAL INEQUALITIES FOR EIGENVALUES OF QUANTUM GRAPHS
SEMRA DEMIREL∗ and EVANS M. HARRELL, II†
∗Department of Mathematics, University of Stuttgart, Pfaffenwaldring 57, D-70569 Stuttgart, Germany
[email protected]
†School of Mathematics, Georgia Institute of Technology, Atlanta GA 30332-0160, USA
[email protected]
Received 12 November 2009
We study the spectra of quantum graphs with the method of trace identities (sum rules), which are used to derive inequalities of Lieb–Thirring, Payne–Pólya–Weinberger, and Yang types, among others. We show that the sharp constants of these inequalities and even their forms depend on the topology of the graph. Conditions are identified under which the sharp constants are the same as for the classical inequalities. In particular, this is true in the case of trees. We also provide some counterexamples where the classical form of the inequalities is false.
Keywords: Quantum graph; semiclassical; Lieb–Thirring inequality; sum rule; universal spectral bounds.
Mathematics Subject Classification 2010: 81Q35, 34L15, 34L40, 81Q20, 47E05, 47A75
1. Introduction
This article is focused on inequalities for the means, moments, and ratios of eigenvalues of quantum graphs. A quantum graph is a metric graph with one-dimensional Schrödinger operators acting on the edges and appropriate boundary conditions imposed at the vertices and at the finite external ends, if any. Here we shall define the Hamiltonian H on a quantum graph as the minimal (Friedrichs) self-adjoint extension of the quadratic form
\[ \varphi \in C_c^\infty \mapsto E(\varphi) := \int_\Gamma |\varphi'|^2\,ds, \tag{1.1} \]
which leads to vanishing Dirichlet boundary conditions at the ends of exterior edges and to the conditions at each vertex vk that φ is continuous and moreover
\[ \sum_j \frac{\partial\varphi}{\partial x_{kj}}(0^+) = 0, \tag{1.2} \]
where the sum runs over all edges emanating from vk, and x_{kj} designates the distance from vk along the jth edge. (Edges connecting vk to itself are accounted twice.) In the literature, these vertex conditions are usually known as Kirchhoff or Neumann conditions. Other vertex conditions are possible, and are amenable to our methods with some complications, but they will not be considered in this article. For details about the definition of H, we refer to [15].
Quantum mechanics on graphs has a long history in physics and physical chemistry [21, 24], but recent progress in experimental solid state physics has renewed attention on them as idealized models for thin domains. While the problem of quantum systems in high dimensions has to be solved numerically, since quantum graphs are locally one-dimensional their spectra can often be determined explicitly. A large literature on the subject has arisen, for which we refer to the bibliography given in [3, 7].
The subject of inequalities for means, moments, and ratios of eigenvalues is rather well developed for Laplacians on domains and for Schrödinger operators, and it is our aim to determine the extent to which analogous theorems apply to quantum graphs. For example, when there is a potential energy V(x) in appropriate function spaces, Lieb–Thirring inequalities provide an upper bound for the moments of the negative eigenvalues Ej(α) of the Schrödinger operator H(α) = −α∇² + V(x) in L²(R^d), α > 0, of the form
\[ \alpha^{d/2}\sum_{E_j(\alpha)<0}(-E_j(\alpha))^\gamma \le L_{\gamma,d}\int_{\mathbb{R}^d}(V_-(x))^{\gamma+d/2}\,dx \tag{1.3} \]
for some constant L_{γ,d} ≥ L^{cl}_{γ,d}, where L^{cl}_{γ,d}, known as the classical constant, is given by
\[ L^{cl}_{\gamma,d} = \frac{1}{(4\pi)^{d/2}}\,\frac{\Gamma(\gamma+1)}{\Gamma(\gamma+d/2+1)}. \]
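As a quick sanity check on the formula for the classical constant, the following Python sketch evaluates it with the standard-library gamma function and compares it with the two one-dimensional values quoted in Example 1.3 of this article, namely 3/16 for γ = 3/2 and 8/(15π) for γ = 2. The function name is hypothetical.

```python
import math

def classical_lt_constant(gamma, d):
    # L^cl_{gamma,d} = Gamma(gamma + 1) / ((4 pi)^{d/2} * Gamma(gamma + d/2 + 1))
    return math.gamma(gamma + 1) / ((4 * math.pi) ** (d / 2) * math.gamma(gamma + d / 2 + 1))

L_32_1 = classical_lt_constant(1.5, 1)   # to be compared with 3/16
L_2_1 = classical_lt_constant(2.0, 1)    # to be compared with 8/(15 pi)
print(L_32_1, 3 / 16)
print(L_2_1, 8 / (15 * math.pi))
```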
It is known that (1.3) holds true for various ranges of γ ≥ 0 depending on the dimension d; see [5, 13, 19, 20, 23, 27]. In particular, in [18] Laptev and Weidl proved that L_{γ,d} = L^{cl}_{γ,d} for all γ ≥ 3/2 and d ≥ 1, and Stubbe [25] has recently given a new proof of sharp Lieb–Thirring inequalities for γ ≥ 2 and d ≥ 1 by showing monotonicity with respect to coupling constants. His proof is based on general trace identities for operators [11, 12] known as sum rules, which will again be used as the foundation of the present article. When there is no potential energy but instead the Laplacian is given Dirichlet conditions on the boundary of a bounded domain, then the means of the first n eigenvalues are bounded from below by the Berezin–Li–Yau inequality in terms of the volume of the domain, and in addition there is a large family of universal bounds on the spectrum, dating from the work of Payne, Pólya, and Weinberger [22], which constrain the spectrum without any reference to properties of the domain. (For a review of the subject, see [2].) It turns out that there are far-reaching analogies between these “universal” inequalities for Dirichlet Laplacians and Lieb–Thirring
inequalities, which have led to common proofs based on sum rules [8–12, 25]. More precisely, some sharp Lieb–Thirring inequalities and some universal inequalities of the PPW family can be viewed as corollaries of a “Yang-type” inequality like (2.5) below, which in turn follows from a sum-rule identity.
In one dimension, a domain is merely an interval and the spectrum of the Dirichlet Laplacian is a familiar elementary calculation, for which the question of universal bounds is trivial. A quantum graph, however, has a spectrum that responds in complex ways to its connectedness; if the total length is finite and appropriate boundary conditions are imposed at exterior vertices, then the spectrum is discrete, and questions about counting functions, moments, etc. and their relation to the topology of the graph become interesting, even in the absence of a potential energy. Below we shall prove several inequalities for the spectra of finite quantum graphs, with the aid of the same trace identities we use to derive Lieb–Thirring inequalities.
For Lieb–Thirring inequalities on quantum graphs, the essential question is whether a form of (1.3) holds with the sharp constant for d = 1, or whether the connectedness of the graph can change the state of affairs. In [6], Ekholm, Frank and Kovařík proved Lieb–Thirring inequalities for Schrödinger operators on regular metric trees for any γ ≥ 1/2, but without sharp constants. We shall show below that trees enjoy a Lieb–Thirring inequality with the sharp constant when γ ≥ 2, but that this circumstance depends on the topology of the graph.
We begin with some simple explicit examples showing that neither the expected Lieb–Thirring inequality nor the analogous universal inequalities for finite quantum graphs without potential hold in complete generality. As it will be convenient to have a uniform way of describing examples, we shall let x_{ij} denote the distance from vertex vi along the jth edge Γj emanating from vi. We note that every edge corresponds to two distinct coordinates x_{ij} = L − x_{i′j′}, where L is the length of the edge, and that a homoclinic loop from a vertex vi to itself is accounted as two edges.
For the operator −d²/dx² on an interval, with vanishing Dirichlet boundary conditions, the universal inequality of Payne–Pólya–Weinberger reduces to E2/E1 ≤ 5, and the Ashbaugh–Benguria theorem becomes E2/E1 ≤ 4, both of which are trivial in one dimension. But for which quantum graphs do these classic inequalities continue to be valid? We shall show below that the classic PPW and related inequalities can be proved for the case of trees, with Dirichlet boundary conditions imposed at all external ends of edges, using the method of sum rules. The sum-rule proof does not work for every graph, however, so the question naturally arises whether the topology makes a real difference, or whether a better method of proof is required. The following examples show that the failure of the sum-rule proof in the case of multiply connected graphs is not an artifact of the method but due to a true topological effect.
We refer to graphs consisting of a circle attached to a single external edge as “simple balloon graphs.” The external edge may either be infinite or of finite length with a vanishing boundary condition at its exterior end. Consider first the graph
Fig. 1. The “balloon graph” (loop Γ1 and string Γ2 joined at the vertex v1).
Γ := Γ1 ∪ Γ2, which consists of a loop Γ1 to which a finite external interval Γ2 is attached at a vertex v1. Without loss of generality, we may fix the length of the loop as 2π, while the “string” will be of length L.
Example 1.1 (Violation of the Analogue of PPW). Let us begin with the case of a balloon graph with L < ∞, and assume that there is no potential. We set α = 1. Thus H locally has the form −d²/dx² with Dirichlet condition at the end of the string Γ2 and vertex condition (1.2) at v1 connecting it to the loop. For convenience, we slightly simplify the coordinate system, letting xs := x12 be the distance on Γs := Γ2 from the node, and x := x11 − π on Γ1. Thus x increases from −π at v1 to x = +π when it joins it again.
It is possible to analyze the eigenvalues of the balloon graph quite explicitly: With a Dirichlet condition at xs = L, any eigenfunction must be of the form a sin(k(L − xs)) on Γs. On Γ1 symmetry dictates that the eigenfunction must be proportional to either sin kx or cos kx. There are thus two categories of eigenfunctions and eigenvalues. Eigenfunctions of the form sin kx contribute nothing to the vertex condition (1.2) (because the outward derivatives at the node are equal in magnitude with opposite signs), and therefore the derivative of a sin(k(L − xs)) must vanish at xs = 0. If k is a positive integer, then k² is an eigenvalue corresponding to an eigenfunction that vanishes on Γs. Otherwise, the conditions on Γs cannot be achieved without violating the condition of continuity with the eigenfunction on Γ1. To summarize: the eigenvalues of the first category are the squares of positive integers. The second category of eigenfunctions match cos kx on the loop to a sin(k(L − xs)) on the interval. The boundary conditions and continuity lead after a standard calculation to the transcendental equation
\[ \cot kL = 2\tan k\pi. \tag{1.4} \]
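Equation (1.4) is easy to solve numerically. The Python sketch below is an illustration rather than part of the original argument: it rewrites (1.4) in the pole-free form cos(kL) cos(kπ) − 2 sin(kL) sin(kπ) = 0, brackets sign changes on a grid and refines them by bisection. For the particular choice L = π it compares the ratio of the two lowest roots squared with the closed-form value ((π − arctan(1/√2))/arctan(1/√2))². The helper names, the grid size and the search window (0, 1) are hypothetical choices; the window stays below k = 1, the smallest root of the first category.

```python
import math

def secular(k, L):
    # Pole-free form of cot(kL) = 2 tan(k pi):
    # cos(kL) cos(k pi) - 2 sin(kL) sin(k pi) = 0
    return math.cos(k * L) * math.cos(k * math.pi) - 2 * math.sin(k * L) * math.sin(k * math.pi)

def roots(L, k_max=1.0, n=20000):
    """Bracket sign changes of the secular function on (0, k_max) and refine by bisection."""
    found = []
    ks = [k_max * i / n for i in range(1, n)]
    for a, b in zip(ks, ks[1:]):
        fa, fb = secular(a, L), secular(b, L)
        if fa * fb < 0:
            for _ in range(60):
                m = 0.5 * (a + b)
                fm = secular(m, L)
                if fa * fm <= 0:
                    b, fb = m, fm
                else:
                    a, fa = m, fm
            found.append(0.5 * (a + b))
    return found

k1, k2 = roots(math.pi)[:2]          # two lowest second-category wave numbers for L = pi
ratio = (k2 / k1) ** 2               # E2 / E1
t = math.atan(1 / math.sqrt(2))
closed_form = ((math.pi - t) / t) ** 2
print(ratio, closed_form)
```

Calling `roots(L)` with other string lengths gives a rough way to see how the ratio varies with L.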
There are three interesting situations to consider. In the limit L → 0, an asymptotic analysis of (1.4) shows that the eigenvalues tend to {(n/2)²}. In the limit L → ∞, the lower eigenvalues tend to {(n + 1/2)² π²/L²}, which are the eigenvalues of an interval of length L with Dirichlet conditions at L and Neumann conditions at 0. The ratio of the first two eigenvalues in this limit is approximately 9, which is already greater than the classically anticipated value of 5 or 4. The highest value of the ratio is, somewhat surprisingly, attained for an intermediate value of L, viz., L = π, for which (1.4) can be easily solved, yielding k = ±(1/π) arctan(1/√2) + j
for a positive integer j. The corresponding fundamental ratio of the lowest two eigenvalues becomes
\[ \frac{E_2}{E_1} = \left(\frac{\pi - \arctan\frac{1}{\sqrt{2}}}{\arctan\frac{1}{\sqrt{2}}}\right)^2 \doteq 16.8453. \]
(We spare the reader the direct calculation showing that the critical value of the ratio occurs precisely at L = π, establishing this value as the maximum among all simple balloons.)
Example 1.2 (Showing that E2/E1 can be Arbitrarily Large). A modification of Example 1.1 with more complex topology shows that no upper bound on the ratio of the first two eigenvalues is possible for the graph analogue of the Dirichlet problem. We again set α = 1 and assume V = 0, and consider a “fancy balloon” graph consisting of an external edge, Γs, the “string,” of length π joined at v1 to N edges Γm, m = 1, . . . , N of length π, all of which meet at a second vertex v2. We observe that the eigenfunctions may be chosen either even or odd under pairwise permutation of the edges Γm. This is because if Pf represents the linear transformation of a function f defined on the graph by permuting two of the variables {x21, . . . , x2N}, and φj is an eigenfunction of the quantum graph with eigenvalue Ej, then so are φj ± Pφj. (In particular, continuity and (1.2) are preserved by these superpositions.) Moreover, the fundamental eigenfunction is even under any permutation, because it is unique and does not change sign. By continuity and the conditions (1.2) at the vertices, as in Example 1.1, a straightforward exercise shows that E1 = ((1/π) arctan(1/√N))², and that there are other even-parity eigenvalues
\[ \left(j \pm \frac{1}{\pi}\arctan\frac{1}{\sqrt{N}}\right)^2 \]
for all positive integers j. Odd parity, when combined with continuity, forces the eigenfunctions to vanish at the nodes, and thus leads to eigenvalues of the form j², for positive integers j. The fundamental ratio E2/E1 for this example can be seen to be
\[ \left(\frac{\pi - \arctan\frac{1}{\sqrt{N}}}{\arctan\frac{1}{\sqrt{N}}}\right)^2, \]
which is roughly π²N for large N.
Remarks. (1) With no external edges, the lowest eigenvalue of a quantum graph is E1 = 0, so one might intuitively argue that for a graph with a large and
S. Demirel & E. M. Harrell, II
complex interior part the effect of an exterior edge with a boundary condition is small. The theorems and examples given below, however, point towards a more nuanced intuition. (2) Another instructive example is the "bunch-of-balloons" graph, with many nonintersecting loops attached to the string at $v_1$. We leave the details to the interested reader.

Example 1.3 (Violation of Classical Lieb–Thirring). Next consider a balloon graph with $L = \infty$ and the Schrödinger operator $H(1) := -\frac{d^2}{dx^2} + V(x)$ on $L^2(\Gamma)$ with vertex conditions (1.2). Let the potential $V$ be given by
\[ V(x) := \begin{cases} V_1(x) := \dfrac{-2a^2}{\cosh^2(ax)}, & x \in \Gamma_1 = [-\pi, \pi], \\ V_2(x) := 0, & x_s \in \Gamma_2 = [0, \infty). \end{cases} \]
Then the eigenfunction corresponding to the eigenvalue $-a^2$ is given by $C/\cosh(ax)$ on $\Gamma_1$ and by $e^{-a x_s}$ on $\Gamma_2$. The continuity condition gives $C = \cosh(a\pi)$ and the condition (1.2) at $v_1$ leads to the equation
\[ \tanh(a\pi) = \frac{1}{2}. \tag{1.5} \]
Denoting the ratio
\[ Q(\gamma, V) := \frac{|E_1|^{\gamma}}{\int_{\Gamma} |V(x)|^{\gamma + 1/2}\, dx}, \]
we compute
\[ Q(3/2, V) = a^3 \left( 2\int_0^{\pi} \frac{4a^4}{\cosh^4(ax)}\, dx \right)^{-1} = \left( 8 \int_0^{a\pi} \frac{dy}{\cosh^4(y)} \right)^{-1} = \left( \frac{8}{3}\tanh(a\pi)\left(2 + \operatorname{sech}^2(a\pi)\right) \right)^{-1}. \]
Because of (1.5), $\operatorname{sech}^2(a\pi) = 1 - \tanh^2(a\pi) = \frac{3}{4}$, and therefore
\[ Q(3/2, V) = \frac{3}{11} > \frac{3}{16} = L^{cl}_{3/2,1}. \tag{1.6} \]
Note that the ratio Q(3/2, V ) is independent of the length of the loop, as expected because any length L can be achieved by a change of scale.
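The closed-form values quoted in Examples 1.1 to 1.3 can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names `balloon_ratio`, `simpson` and `q_three_halves` are hypothetical, and the integral in Example 1.3 is evaluated with a plain composite Simpson rule rather than the antiderivative used in the text.

```python
import math


def balloon_ratio(n_edges: int) -> float:
    """E2/E1 for a string of length pi joined to n_edges arcs of length pi.

    n_edges = 2 is the simple balloon with L = pi (Example 1.1);
    general n_edges is the "fancy balloon" of Example 1.2.
    """
    theta = math.atan(1.0 / math.sqrt(n_edges))
    return ((math.pi - theta) / theta) ** 2


def simpson(f, a: float, b: float, n: int = 2000) -> float:
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def q_three_halves() -> float:
    """Q(3/2, V) = (8 * int_0^{a pi} sech^4(y) dy)^(-1), with tanh(a pi) = 1/2."""
    a_pi = math.atanh(0.5)  # solves the vertex equation (1.5)
    integral = simpson(lambda y: 1.0 / math.cosh(y) ** 4, 0.0, a_pi)
    return 1.0 / (8.0 * integral)


ratio_simple = balloon_ratio(2)                               # Example 1.1
ratio_fancy_scaled = balloon_ratio(10_000) / (math.pi ** 2 * 10_000)  # Example 1.2
q32 = q_three_halves()                                        # Example 1.3

print(ratio_simple, ratio_fancy_scaled, q32, 3 / 11, 3 / 16)
```

The second printed number should be close to 1, reflecting the growth $E_2/E_1 \approx \pi^2 N$, and the third should agree with $3/11$, which exceeds the classical constant $3/16$.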
The ratio $Q(\gamma, V)$ can also be calculated explicitly for the case $\gamma = 2$. In this case
\[ Q(2, V) = \left( 2^{7/2} \left( \frac{3}{4}\arctan(\tanh(a\pi/2)) + \frac{3}{16}\operatorname{sech}(a\pi) + \frac{1}{8}\operatorname{sech}^3(a\pi) \right) \right)^{-1} \doteq 0.2009 > L^{cl}_{2,1} = \frac{8}{15\pi} \doteq 0.1697. \]

2. Lieb–Thirring Inequalities for Quantum Graphs

2.1. Classical Lieb–Thirring inequality for metric trees

Our point of departure is the family of sum-rule identities from [11, 12]. Let $H$ and $G$ be abstract self-adjoint operators satisfying certain mapping conditions. We suppose that $H$ has nonempty discrete spectrum lying below the continuum, $\{E_j : H\phi_j = E_j\phi_j\}$. In the situations of interest in this article the spectrum will either be entirely discrete, in which case we focus on spectral subsets of the form $J := \{E_j, j = 1, \dots, k\}$, or else, when there is a continuum, it will lie on the positive real axis and we shall take $J$ as the negative part of the spectrum. Let $P_A$ denote the spectral projector associated with $H$ and a Borel set $A$. Then, given a pair of self-adjoint operators $H$ and $G$ with domains $D(H)$ and $D(G)$, such that $G(\mathcal{J}) \subset D(H) \subset D(G)$, where $\mathcal{J}$ is the subspace spanned by the eigenfunctions $\phi_j$ corresponding to the eigenvalues $E_j$, it is shown in [11, 12] that:
\[ \sum_{E_j \in J} (z - E_j)^2 \langle [G, [H, G]]\phi_j, \phi_j \rangle - 2(z - E_j) \langle [H, G]\phi_j, [H, G]\phi_j \rangle = 2 \sum_{E_j \in J} \int_{\kappa \in J^c} (z - E_j)(z - \kappa)(\kappa - E_j)\, dG^2_{j\kappa}, \tag{2.1} \]
where $dG^2_{j\kappa} := |\langle G\phi_j, dP_\kappa G\phi_j \rangle|$ corresponds to the matrix elements of the operator $G$ with respect to the spectral projections onto $J$ and $J^c$. Because of our choice of $J$,
\[ \sum_{E_j \in J} (z - E_j)^2 \langle [G, [H, G]]\phi_j, \phi_j \rangle - 2(z - E_j) \langle [H, G]\phi_j, [H, G]\phi_j \rangle \le 0. \tag{2.2} \]
In this section $H$ is the Schrödinger operator on the graph $\Gamma$, namely
\[ H(\alpha) = -\alpha \frac{d^2}{dx^2} + V(x) \quad \text{in } L^2(\Gamma), \quad \alpha > 0, \]
with the usual conditions (1.2) at each vertex $v_i$. In particular, if any leaves (i.e. edges with one free end) are of finite length, vanishing Dirichlet boundary conditions are imposed at their ends. Without loss of generality we may assume that $V \in C_0^\infty$
for the operator $H(\alpha)$. Under this assumption, for any $\alpha > 0$, $H(\alpha)$ has at most a finite number of negative eigenvalues. We denote negative eigenvalues of $H(\alpha)$ by $E_j(\alpha)$ corresponding to the normalized eigenfunctions $\phi_j$. We shall be able to derive inequalities of the standard one-dimensional type when it is possible to choose $G$ to be multiplication by the arclength along some distinguished subsets of the graph. This depends on the following:

Lemma 2.1. Suppose that there exists a continuous, piecewise-linear function $G$ on the graph $\Gamma$, such that at each vertex $v_k$
\[ \sum_j \frac{\partial G}{\partial x_{kj}}(0^+) = 0. \tag{2.3} \]
Suppose that $\Gamma = \bigcup_m \Gamma_m$ with $(G')^2 = a_m$ on $\Gamma_m$. If $H$ has nonempty essential spectrum, assume that $z \le \inf \sigma_{ess}(H)$. Then
\[ \sum_{j,m} (z - E_j)_+^2\, a_m \|\chi_{\Gamma_m}\phi_j\|^2 - 4\alpha (z - E_j)_+\, a_m \|\chi_{\Gamma_m}\phi_j'\|^2 \le 0. \tag{2.4} \]
We observe that $\sum_m \chi_{\Gamma_m} = 1 \Leftrightarrow a_m \neq 0$.

Proof. The formula (2.4) is a direct application of (2.2), when we note that, locally, $[H, G] = -\alpha\left(2G' \frac{d}{dx_{kj}} + G''\right)$ and $[G, [H, G]] = 2\alpha (G')^2$. (A factor of $2\alpha$ has been divided out.) The reason for the condition (2.3) is that $G\phi_j$ must be in the domain of definition of $H$, which requires that at each vertex,
\[ 0 = \sum_j \frac{\partial (G\phi_j)}{\partial x_{kj}}(0^+) = G \sum_j \frac{\partial \phi_j}{\partial x_{kj}}(0^+) + \phi_j(0^+) \sum_j \frac{\partial G}{\partial x_{kj}}(0^+) = \phi_j(0^+) \sum_j \frac{\partial G}{\partial x_{kj}}(0^+). \]

If we are so fortunate that $(G')^2$ is the same constant on every edge, then (2.4) reduces to the quadratic inequality
\[ \sum_j (z - E_j)_+^2 - 4\alpha (z - E_j)_+ \|\phi_j'\|^2 \le 0, \tag{2.5} \]
familiar from [8, 9, 11, 12, 25], where it was shown that it implies universal spectral bounds for Laplacians and Lieb–Thirring inequalities for Schrödinger operators in routine ways. Equation (2.5) can be considered as a Yang-type inequality, after [30].
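As a concrete illustration of (2.5), the following sketch evaluates its left-hand side for the simplest tree, an interval. The setting is an assumption made for the example and is not taken from the text: the interval $[0, \pi]$ with Dirichlet ends, $V = 0$ and $\alpha = 1$, for which $E_j = j^2$ and $\|\phi_j'\|^2 = j^2$. The helper name `quadratic_form` is hypothetical.

```python
def quadratic_form(z, eigenvalues, kinetic, alpha=1.0):
    """Left-hand side of (2.5): sum_j (z - E_j)_+^2 - 4*alpha*(z - E_j)_+ * ||phi_j'||^2."""
    total = 0.0
    for energy, kin in zip(eigenvalues, kinetic):
        gap = max(z - energy, 0.0)          # positive part (z - E_j)_+
        total += gap * gap - 4.0 * alpha * gap * kin
    return total


# Assumed model: Dirichlet interval [0, pi], V = 0, alpha = 1.
# Then E_j = j^2 and ||phi_j'||^2 = E_j.
levels = [float(j * j) for j in range(1, 41)]      # E_1 .. E_40 = 1600

# Scan z below E_40 so that every eigenvalue under z is in the list.
grid = [0.25 * i for i in range(1, 6001)]          # z in (0, 1500]
worst = max(quadratic_form(z, levels, levels) for z in grid)

print(worst, quadratic_form(4.0, levels, levels))
```

On this grid the form never becomes positive; it vanishes for $z \le E_1$ and is strictly negative afterwards, for instance $(4 - 1)(4 - 5) = -3$ at $z = 4$.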
Stubbe's monotonicity argument

In [25], Stubbe showed that some of the classical sharp Lieb–Thirring inequalities follow from the quadratic inequality (2.5). Here we apply the same argument to quantum graphs: For any $\alpha > 0$, the functions $E_j(\alpha)$ are non-positive, continuous and increasing. $E_j(\alpha)$ is continuously differentiable except at countably many values where $E_j(\alpha)$ fails to be isolated or enters the continuum. By the Feynman–Hellmann theorem,
\[ \frac{d}{d\alpha} E_j(\alpha) = \langle \phi_j, -\phi_j'' \rangle = \|\phi_j'\|^2. \]
Setting $z = 0$ and multiplying by $\alpha$, (2.5) reads
\[ \alpha \sum_{E_j(\alpha) < 0} (-E_j(\alpha))^2 + 2\alpha^2 \frac{d}{d\alpha} \sum_{E_j(\alpha) < 0} (-E_j(\alpha))^2 \le 0. \]
We denote by $\infty \ge \alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_k \ge \cdots > 0$ the values at which $E_j(\alpha)$ appears. For any $\alpha \in\, ]\alpha_{N+1}, \alpha_N[$ the number of eigenvalues is constant, and therefore
\[ \frac{d}{d\alpha} \left( \alpha^{1/2} \sum_{E_j(\alpha) < 0} (-E_j(\alpha))^2 \right) \le 0. \]
This means that $\alpha^{1/2} \sum_{E_j(\alpha) < 0} (-E_j(\alpha))^2$ is monotone decreasing in $\alpha$. Hence, by Weyl's asymptotics (see [4, 28]),
\[ \alpha^{1/2} \sum_{E_j(\alpha) < 0} (-E_j(\alpha))^2 \le \lim_{\alpha \to 0+} \alpha^{1/2} \sum_{E_j(\alpha) < 0} (-E_j(\alpha))^2 = L^{cl}_{2,1} \int_\Gamma (V_-(x))^{2 + 1/2}\, dx. \]

Remark 2.2. Strictly speaking the Feynman–Hellmann theorem only holds for nondegenerate eigenvalues. In the case of degenerate eigenvalues, one has to take the right basis in the corresponding degeneracy space and to change the numbering if necessary, see, e.g., [26].

The balloon counterexamples given above might lead one to think that the existence of cycles poses a barrier for a quantum graph to have an inequality of the form (2.5). Consider, however, the following example.

Example 2.3 (Hash Graphs). Let $\Gamma$ be a planar graph consisting of (or metrically isomorphic to) the union of a closed family of vertical lines and line segments $F_v$ and a closed family of horizontal lines and line segments $F_h$. We assume that for some $\delta > 0$ the distance between any two lines or line segments in $F_v$ is at least $\delta$, and that the same is true of $F_h$. (The assumption on the spacing of the lines allows an unproblematic definition of the vertex conditions (1.2).) We impose Dirichlet boundary conditions at any ends of finite line segments. We also suppose a
"crossing condition", that there are no vertices touching exactly three edges. (That is, no line segment from $F_v$ has an end point in $F_h$ and vice versa.) Regarding the graph as a subset of the $xy$-plane, we let $G(x, y) = x + y$. It is immediate from the crossing condition that $G$ satisfies (2.3). Furthermore, the derivative of $G$ along every edge is 1, and therefore the quadratic inequality (2.5) holds.

A quadratic inequality (2.5) can arise in a different way, if there is a family of piecewise affine functions $G_\ell$, each with a range of values $a_{m\ell}$, but such that $\sum_\ell a_{m\ell} = 1$ (or any other fixed positive constant). This occurs in our next example. Even when this is not possible, if we can arrange that $0 < a_{\min} \le \sum_\ell a_{m\ell} \le a_{\max}$, then the resulting weaker quadratic inequality
\[ \sum_j (z - E_j)_+^2 - 4\alpha \frac{a_{\max}}{a_{\min}} (z - E_j)_+ \|\phi_j'\|^2 \le 0 \tag{2.6} \]
will still lead to universal spectral bounds that may be useful. We speculate about this circumstance below.

Example 2.4 ($Y$-Graph). As the next example we consider a simple graph, namely the $Y$-graph, which is a star-shaped graph with three positive half-axes $\Gamma_i$, $i = 1, 2, 3$, joined at a single vertex $v_1$. If we set
\[ G_1(x) := \begin{cases} g_1 := 0, & x_{11} \in \Gamma_1, \\ g_2 := -x_{12}, & x_{12} \in \Gamma_2, \\ g_3 := x_{13}, & x_{13} \in \Gamma_3, \end{cases} \]
then obviously $G_1(\mathcal{J}) \subset D(H_\Gamma(\alpha))$ holds, and with Lemma 2.1 we get
\[ \sum_j (z - E_j)_+^2 \left( \|\chi_{\Gamma_2}\phi_j\|^2 + \|\chi_{\Gamma_3}\phi_j\|^2 \right) - 4\alpha (z - E_j)_+ \left( \|\chi_{\Gamma_2}\phi_j'\|^2 + \|\chi_{\Gamma_3}\phi_j'\|^2 \right) \le 0. \tag{2.7} \]
As $\Gamma_1$ does not contribute to this inequality, we cyclically permute the zero part of $G$, i.e. we next choose $G_2(x)$, such that $g_2 = 0$, $g_1 = x_{11}$ and $g_3 = -x_{13}$, and finally $G_3(x)$, such that $g_3 = 0$, $g_1 = x_{11}$ and $g_2 = -x_{12}$. These give us two further inequalities analogous to (2.7). Summing all three inequalities, and noting that on every edge, $\sum_{\ell=1}^{3} a_{m\ell} = 2$, we finally obtain
\[ \sum_j 2(z - E_j)_+^2 - 8\alpha (z - E_j)_+ \|\phi_j'\|^2 \le 0, \tag{2.8} \]
which when divided by 2 yields the quadratic inequality (2.5).

We next extend the averaging argument to prove (2.5) for arbitrary metric trees. A metric tree $\Gamma$ consists of a set of vertices, a set of leaves and a set of edges, i.e. segments of the real axis, which connect the vertices, such that there is exactly
one path connecting any two vertices. It is common in graph theory to distinguish between edges and leaves; a leaf is joined to a vertex at only one of its endpoints, i.e. there is a free end, at which we shall set Dirichlet boundary conditions. (When the distinction is not material we shall refer to both edges and leaves as edges. It is also common to regard one free end as the distinguished "root" $r$ of the tree, but for our purposes all free ends of the graph have the same status.) We denote the vertices by $v_i$, $i = 1, \dots, n$. The edges including leaves will be denoted by $e$. We shall explicitly write $l_j$ for leaves when the distinction matters.

Theorem 2.5. For any tree graph with a finite number of vertices and edges, the mapping
\[ \alpha \mapsto \alpha^{1/2} \sum_{E_j(\alpha) < 0} (-E_j(\alpha))^2 \]
is nonincreasing for all $\alpha > 0$. Consequently
\[ \alpha^{1/2} \sum_{E_j(\alpha) < 0} (-E_j(\alpha))^2 \le L^{cl}_{2,1} \int_\Gamma (V_-(x))^{2 + 1/2}\, dx \]
for all $\alpha > 0$.

Remark 2.6. By the monotonicity principle of Aizenman and Lieb (see [1]), Theorem 2.5 is also true with the sharp constant for higher moments of eigenvalues. Alternatively, the extension to higher values of $\gamma$ can be obtained directly from the trace inequality of [10] for power functions with $\gamma > 2$. Furthermore, Theorem 2.5 can be extended by a density argument to potentials $V \in L^{\gamma + 1/2}(\Gamma)$.

To prepare the proof of Theorem 2.5, we first formulate some auxiliary results.

Lemma 2.7. For all integers $n \ge 2$,
\[ \sum_{k=0}^{\left[\frac{n-1}{2}\right]} \binom{n-1}{2k} = \sum_{k=0}^{\left[\frac{n}{2}\right] - 1} \binom{n-1}{2k+1}. \tag{2.9} \]

Proof. This is a simple computation: both sides equal $2^{n-2}$.

Definition 2.8. Let $E$ be the set of all edges $e \subset \Gamma$. We call the mapping $C : E \to \{0, 1\}$ a coloring and say that $C$ is an admissible coloring if at each vertex $v \in \Gamma$ the number $\#\{e : e \text{ emanates from } v,\ C(e) = 1\}$ is even. We let $A(\Gamma)$ denote the set of all admissible colorings on $\Gamma$.
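Lemma 2.7 and Definition 2.8 are easy to probe by brute force on a star, i.e. a single vertex with $m$ leaves. The following sketch is illustrative only; the names `even_sum`, `odd_sum` and `star_colorings` are hypothetical. It checks (2.9) for $n \ge 2$ (for $n = 1$ the right-hand sum is empty) and lists the admissible colorings of the $Y$-graph.

```python
from itertools import product
from math import comb


def even_sum(n: int) -> int:
    """Left-hand side of (2.9): sum over k = 0 .. [(n-1)/2] of C(n-1, 2k)."""
    return sum(comb(n - 1, 2 * k) for k in range((n - 1) // 2 + 1))


def odd_sum(n: int) -> int:
    """Right-hand side of (2.9): sum over k = 0 .. [n/2]-1 of C(n-1, 2k+1)."""
    return sum(comb(n - 1, 2 * k + 1) for k in range(n // 2))


def star_colorings(m: int):
    """Admissible colorings of a star with m leaves: an even number of leaves get color 1."""
    return [c for c in product((0, 1), repeat=m) if sum(c) % 2 == 0]


# Lemma 2.7 for n = 2 .. 14; both sides equal 2^(n-2).
lemma_holds = all(even_sum(n) == odd_sum(n) == 2 ** (n - 2) for n in range(2, 15))

# Y-graph (m = 3): how often is each leaf colored 1 among admissible colorings?
y_colorings = star_colorings(3)
y_counts = [sum(c[i] for c in y_colorings) for i in range(3)]

print(lemma_holds, y_colorings, y_counts)
```

For the $Y$-graph every leaf is colored 1 in exactly two admissible colorings, which matches the value $\sum_{\ell} a_{m\ell} = 2$ found in Example 2.4.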
Theorem 2.9. Let $\Gamma_n$ be a metric tree with $n$ vertices. For an edge $e \subset \Gamma_n$, we denote by $a(e, n) := \#\{C \in A(\Gamma_n) : C(e) = 1\}$ the number of all admissible mappings $C \in A(\Gamma_n)$, such that $C(e) = 1$ for $e \subset \Gamma_n$. Then
\[ a(e, n) \text{ is independent of } e \subset \Gamma_n. \tag{2.10} \]
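The statement (2.10) can be tested by exhaustive enumeration on a small tree. The sketch below is a hypothetical illustration: the sample tree (two vertices joined by one edge, with two leaves at `v1` and three leaves at `v2`) and the function name `admissible_counts` are choices made for the example, and free ends of leaves are assumed to carry no parity condition.

```python
from itertools import product


def admissible_counts(edges, vertices):
    """Count admissible colorings and, for each edge e, those with C(e) = 1.

    edges: list of tuples naming the vertices an edge touches
           (a leaf touches exactly one vertex, an interior edge two).
    vertices: list of vertex names at which the parity condition is imposed.
    """
    counts = [0] * len(edges)
    total = 0
    for coloring in product((0, 1), repeat=len(edges)):
        admissible = all(
            sum(c for c, ends in zip(coloring, edges) if v in ends) % 2 == 0
            for v in vertices
        )
        if admissible:
            total += 1
            for i, c in enumerate(coloring):
                counts[i] += c
    return total, counts


# Hypothetical sample tree: leaves at v1 (2), interior edge v1-v2, leaves at v2 (3).
tree_edges = [("v1",), ("v1",), ("v1", "v2"), ("v2",), ("v2",), ("v2",)]
total, counts = admissible_counts(tree_edges, ["v1", "v2"])

print(total, counts)
```

Every entry of `counts` coincides, so each edge is colored 1 equally often, as Theorem 2.9 asserts for this tree.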
Proof. We shall prove (2.10) by induction over the number of vertices of $\Gamma$. The case with one vertex $v_1$ is trivial because of the symmetry of the graph. Given a metric tree $\Gamma_n$ with $n$ vertices, we can decompose it as follows. $\Gamma_n$ consists of a metric tree $\Gamma_{n-1}$ with $n - 1$ vertices to which $m - 1$ leaves $l_j$, $j = 2, \dots, m$, are attached to the free end of a leaf $l_1 \subset \Gamma_{n-1}$. We call the vertex at which the leaves $l_j$, $j = 1, \dots, m$, are joined $v_n$. Hence,
\[ \Gamma_n := \Gamma_{n-1} \cup v_n \cup \bigcup_{j=2}^{m} l_j. \]
By the induction hypothesis,
\[ a(e, n-1) := \#\{C \in A(\Gamma_{n-1}) : C(e) = 1\} \text{ is independent of } e \subset \Gamma_{n-1}. \tag{2.11} \]
Obviously for every edge or leaf $e \neq l_1$ in $\Gamma_{n-1}$, we have
\[ a(e, n-1) = \#\{C \in A(\Gamma_{n-1}) : C(e) = 1 \wedge C(l_1) = 1\} + \#\{C \in A(\Gamma_{n-1}) : C(e) = 1 \wedge C(l_1) = 0\}. \tag{2.12} \]
Now, we have to show that $a(e, n)$ is independent of $e \subset \Gamma_n$. Note first that for each fixed leaf $l_j$ of the subgraph $\Gamma^* = v_n \cup \bigcup_{j=1}^{m} l_j$, we have
\[ \mu_1 := \#\{C \in A(\Gamma^*) : C(l_j) = 1,\ l_j \in \Gamma^*\} = \sum_{k=0}^{\left[\frac{m}{2}\right] - 1} \binom{m-1}{2k+1} \tag{2.13} \]
and
\[ \mu_0 := \#\{C \in A(\Gamma^*) : C(l_j) = 0,\ l_j \in \Gamma^*\} = \sum_{k=0}^{\left[\frac{m-1}{2}\right]} \binom{m-1}{2k}. \tag{2.14} \]
Hence, for arbitrary neighboring edges $e', e'' \subset \Gamma_{n-1}$ the following equality holds,
\[ a(e', n) = \mu_1 \#\{C \in A(\Gamma_{n-1}) : C(e') = 1 \wedge C(l_1) = 1\} + \mu_0 \#\{C \in A(\Gamma_{n-1}) : C(e') = 1 \wedge C(l_1) = 0\}, \tag{2.15} \]
and, respectively,
\[ a(e'', n) = \mu_1 \#\{C \in A(\Gamma_{n-1}) : C(e'') = 1 \wedge C(l_1) = 1\} + \mu_0 \#\{C \in A(\Gamma_{n-1}) : C(e'') = 1 \wedge C(l_1) = 0\}. \tag{2.16} \]
By Lemma 2.7, $\mu := \mu_0 = \mu_1$. Therefore, with (2.12) the equalities (2.15) and (2.16) read $a(e', n) = \mu\, a(e', n-1)$, $a(e'', n) = \mu\, a(e'', n-1)$. Furthermore, by the induction hypothesis, $a(e', n-1) = a(e'', n-1)$, from which it immediately follows that
\[ a(e', n) = \mu\, a(e', n-1) = \mu\, a(e'', n-1) = a(e'', n). \]
This proves Theorem 2.9.

Proof of Theorem 2.5. In order to apply Stubbe's monotonicity argument [25], we need to establish inequality (2.5) for metric trees. To do this, we proceed as for the example of the $Y$-graph. Let $\mathcal{J}$ denote the subspace spanned by the eigenfunctions $\phi_j$ on $L^2(\Gamma)$ corresponding to the eigenvalues $E_j$. Note first that there exist self-adjoint operators $G$, which are given by piecewise affine functions $g_i$ on the edges (or leaves) of $\Gamma$, such that $G(\mathcal{J}) \subset D(H(\alpha)) \subset D(G)$. Edges (or leaves) on which constant functions $g_i$ are given do not contribute to the sum rule. Therefore we average over a family of operators $G$, such that every edge $e$ (or leaf) of the tree appears equally often in association with an affine function having $G' = \pm 1$ on $e$. We let $\mathcal{G}$ denote the set of continuous operators $G(x) = \{g_i(x) \text{ affine},\ x \in e_i \text{ (or } l_i)\}$, which satisfy (1.2) at the vertices $v$ of $\Gamma$. Indeed it is not necessary to average over all the operators $G \in \mathcal{G}$, because it makes no difference in Lemma 2.1, for instance, whether $g_i' = 1$ or $g_i' = -1$. Therefore we define an equivalence relation $\sim$ on $\mathcal{G}$ as follows: Let $\tilde{G} = \{\tilde{g}_i(x) \text{ affine},\ x \in e_i \text{ (or } l_i)\}$ be another operator in $\mathcal{G}$. We say that $G \sim \tilde{G} \Leftrightarrow \forall i \in \{1, \dots, n\} : |g_i'(x)| = |\tilde{g}_i'(x)|$. We define $\mathcal{G}^* := \mathcal{G}/\!\sim$. Then we can consider the isomorphism
\[ I : A(\Gamma) \to \mathcal{G}^*, \tag{2.17} \]
where for each $C \in A(\Gamma)$ we choose an affine function $G_C \in \mathcal{G}^*$ on $\Gamma$, such that $|G_C'(e)| = C(e)$ for every $e \subset \Gamma$. By Theorem 2.9, we know that $\#\{C \in A(\Gamma) : C(e) = 1\}$ is independent of $e \subset \Gamma$. This means that summing up all inequalities corresponding to (2.4), which we get from each $G_C \in \mathcal{G}^*$, leads to
\[ \sum_j (z - E_j)_+^2\, p - 4\alpha (z - E_j)_+\, p\, \|\phi_j'\|^2 \le 0, \tag{2.18} \]
where $p := \sum_\ell a_{m\ell} = \#\{C \in A(\Gamma) : C(e) = 1\}$ and we have used the normalization $\|\phi_j\| = 1$. Having the analogue of inequality (2.5) for metric trees, we can reformulate the monotonicity argument for our case. This proves Theorem 2.5.

Remark 2.10. The proof applies equally to metric trees with leaves of infinite lengths.

2.2. Modified Lieb–Thirring inequalities for one-loop graphs

In this section we consider the graph $\Gamma$ consisting of a circle to which two leaves are attached. It is not hard to see that the construction leading to Lieb–Thirring inequalities with the sharp classical constant fails for one-loop graphs, because no family of auxiliary functions $G_\ell$ exists with the side condition that $\sum_\ell a_{m\ell} = 1$ throughout $\Gamma$. Unlike the case of the balloon graph, it is possible to replace the classical inequality with a weakened version (2.6) as mentioned above. There is, however, another option, based on commutators with exponential functions, following an idea of [10]: As usual, we define the one-parameter family of Schrödinger operators
\[ H(\alpha) = -\alpha \frac{d^2}{dx^2} + V(x), \quad \alpha > 0, \]
in $L^2(\Gamma)$ with the usual conditions (1.2) at each vertex $v_i$ of $\Gamma$. The leaves are denoted by $\Gamma_1 := [0, \infty)$ and $\Gamma_2 := [0, \infty)$, while we write $\Gamma_3$ and $\Gamma_4$ for the semicircles with lengths $L$. Let $\phi_j$ be the eigenfunctions of $H(\alpha)$ corresponding to the eigenvalues $E_j(\alpha)$.

Theorem 2.11. Let $q := 2\pi/L$. For all $\alpha > 0$ the mapping
\[ \alpha \mapsto \alpha^{1/2} \sum_{E_j(\alpha) < 0} \left( z - \frac{3}{16}\alpha q^2 - E_j \right)_+^2 \tag{2.19} \]
is nonincreasing. Furthermore, for all $z \in \mathbb{R}$ and all $\alpha > 0$ the following sharp Lieb–Thirring inequality holds:
\[ R_2(z, \alpha) \le \alpha^{-1/2} L^{cl}_{2,1} \int_\Gamma \left( V(x) - \left( z + \frac{3}{16} q^2 \alpha \right) \right)_-^{2 + 1/2} dx, \tag{2.20} \]
where
\[ R_2(z, \alpha) := \sum_{E_j(\alpha)} (z - E_j(\alpha))_+^2. \]
Remark 2.12. Once again, Theorem 2.11 can be extended to potentials V ∈ Lγ+1/2 (Γ) and is true for all γ ≥ 2, either by the monotonicity principle of Aizenman and Lieb [1] or by the trace formula of [10] for γ ≥ 2.
For the proof of Theorem 2.11, we make use of a theorem of Harrell and Stubbe:

Theorem 2.13 ([10, Theorem 2.1]). Let $H$ be a self-adjoint operator on a Hilbert space $\mathcal{H}$, with a nonempty set $J$ of finitely degenerate eigenvalues lying below the rest of the spectrum $J^c$ and $\{\phi_j\}$ an orthonormal set of eigenfunctions of $H$. Let $G$ be a linear operator with domain $D_G$ and adjoint $G^*$ defined on $D_{G^*}$ such that $G(D_H) \subseteq D_H \subseteq D_G$ and $G^*(D_H) \subseteq D_H \subseteq D_{G^*}$, respectively. Then
\[ \frac{1}{2} \sum_{E_j \in J} (z - E_j)^2 \left( \langle [G^*, [H, G]]\phi_j, \phi_j \rangle + \langle [G, [H, G^*]]\phi_j, \phi_j \rangle \right) \le \sum_{E_j \in J} (z - E_j) \left( \|[H, G]\phi_j\|^2 + \|[H, G^*]\phi_j\|^2 \right). \tag{2.21} \]

Remark 2.14. Strictly speaking, in [10] it was assumed that the spectrum was purely discrete. However, the extension to the case where continuous spectrum is allowed in $J^c$ follows exactly as in [11, Theorem 2.1].

Proof of Theorem 2.11. In this case, it is not possible to get a quadratic inequality from Lemma 2.1 without worsening the constants. This follows from the fact that the conditions $\phi_3(0) = \phi_4(0)$ and $\phi_3(L) = \phi_4(L)$ imply that the piecewise linear function $G$ has to be defined equally on $\Gamma_3$ and $\Gamma_4$. Consequently, the condition (1.2) can be satisfied only with different values of $a_m$ as in (2.6), namely $a_1 = a_2 = 4a_3 = 4a_4$. Our proof of Theorem 2.11 consists of three steps. First, we apply Lemma 2.1, after which we apply Theorem 2.13. Finally we combine both results and apply the line of argument given in [10].

First step. Using Lemma 2.1 with the choice
\[ G(x) := \begin{cases} g_1 := -2x_{11}, & x_{11} \in \Gamma_1, \\ g_2 := 2x_{22} + L, & x_{22} \in \Gamma_2, \\ g_3 := x_{13}, & x_{13} \in \Gamma_3, \\ g_4 := x_{14}, & x_{14} \in \Gamma_4, \end{cases} \]
we obtain
\[ 4 \left( \sum_{E_j(\alpha) < 0} (z - E_j(\alpha))_+^2\, p_{12}(j) - 4\alpha \sum_{E_j(\alpha) < 0} (z - E_j(\alpha))_+\, p'_{12}(j) \right) + \sum_{E_j(\alpha) < 0} (z - E_j(\alpha))_+^2\, p_{34}(j) - 4\alpha \sum_{E_j(\alpha) < 0} (z - E_j(\alpha))_+\, p'_{34}(j) \le 0, \tag{2.22} \]
where $p_{ik}(j) := \|\chi_{\Gamma_i}\phi_j\|^2 + \|\chi_{\Gamma_k}\phi_j\|^2$ and $p'_{ik}(j) := \|\chi_{\Gamma_i}\phi_j'\|^2 + \|\chi_{\Gamma_k}\phi_j'\|^2$.
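Second step. Next, in Theorem 2.13 we set
\[ G(x) := \begin{cases} g_1 := 1, & x_{11} \in \Gamma_1, \\ g_2 := 1, & x_{22} \in \Gamma_2, \\ g_3 := e^{-i2\pi x_{13}/L}, & x_{13} \in \Gamma_3, \\ g_4 := e^{i2\pi x_{14}/L}, & x_{14} \in \Gamma_4. \end{cases} \]
It is easy to see that $G\phi_j \in D(H(\alpha))$. With $q := 2\pi/L$, the first commutators work out to be
\[ [H_j, g_j] = 0, \quad j = 1, 2, \]
\[ [H_3, g_3] = e^{-iqx_{13}} \alpha \left( q^2 + 2iq \frac{d}{dx} \right), \qquad [H_4, g_4] = e^{iqx_{14}} \alpha \left( q^2 - 2iq \frac{d}{dx} \right); \]
whereas for the second commutators,
\[ [g_j^*, [H_j, g_j]] = [g_j, [H_j, g_j^*]] = 0, \quad j = 1, 2, \qquad [g_j^*, [H_j, g_j]] = [g_j, [H_j, g_j^*]] = 2\alpha q^2, \quad j = 3, 4. \tag{2.23} \]
From inequality (2.21), we get
\[ \sum_{E_j(\alpha) \in J} (z - E_j(\alpha))^2\, p_{34}(j) \le \alpha \sum_{E_j(\alpha) \in J} (z - E_j(\alpha)) \left( q^2 p_{34}(j) + 4 p'_{34}(j) \right). \tag{2.24} \]

Third step. Adding (2.22) and three times (2.24), dividing by 4, and using $p_{12}(j) + p_{34}(j) = \|\phi_j\|^2 = 1$ together with $p'_{12}(j) + p'_{34}(j) = \|\phi_j'\|^2 = \frac{d}{d\alpha} E_j(\alpha)$, we finally obtain
\[ R_2(z, \alpha) + 2\alpha \frac{d}{d\alpha} R_2(z, \alpha) \le \frac{3}{4} \alpha q^2 \sum_{E_j \in J} (z - E_j)\, p_{34}(j), \tag{2.25} \]
or, since $0 \le p_{34}(j) \le 1$ and with $R_1(z, \alpha) := \sum_{E_j(\alpha)} (z - E_j(\alpha))_+$,
\[ 2R_2(z, \alpha) + 4\alpha \frac{d}{d\alpha} R_2(z, \alpha) - \frac{3}{2} \alpha q^2 R_1(z, \alpha) \le 0, \tag{2.26} \]
which is equivalent to
\[ \frac{\partial}{\partial \alpha} \left( \alpha^{1/2} R_2(z, \alpha) \right) \le \frac{3q^2}{8} \alpha^{1/2} R_1(z, \alpha). \tag{2.27} \]
Letting $U(z, \alpha) := \alpha^{1/2} R_2(z, \alpha)$, the inequality has the form
\[ \frac{\partial U}{\partial \alpha} \le \frac{3}{16} q^2 \frac{\partial U}{\partial z}. \tag{2.28} \]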
Since the expression in (2.19) can be written as $U(z - \frac{3}{16} q^2 \alpha, \alpha)$, an application of the chain rule shows that the monotonicity claimed in (2.19) follows from (2.28). (We note that (2.28) can be solved by changing to characteristic variables $\xi := \alpha - \frac{16z}{3q^2}$, $\eta := \alpha + \frac{16z}{3q^2}$, in terms of which
\[ \frac{\partial U}{\partial \xi} \le 0. \tag{2.29} \]
That is, $U$ decreases as $\xi$ increases while $\eta$ is fixed.) By shifting the variable in (2.29), we also obtain
\[ U(z, \alpha) \le U\left( z + \frac{3}{16} q^2 (\alpha - \alpha_s), \alpha_s \right) \tag{2.30} \]
for $\alpha \ge \alpha_s$. By Weyl's asymptotics, for all $\gamma \ge 0$,
\[ \lim_{\alpha \to 0+} \alpha^{d/2} \sum_{E_j(\alpha)} (z - E_j(\alpha))_+^{\gamma} = L^{cl}_{\gamma,d} \int_\Gamma (V(x) - z)_-^{\gamma + d/2}\, dx, \tag{2.31} \]
see [4, 28]. Hence, as $\alpha_s \to 0$, the right-hand side of (2.30) tends to
\[ L^{cl}_{2,1} \int_\Gamma \left( V(x) - \left( z + \frac{3}{16} q^2 \alpha \right) \right)_-^{2 + 1/2} dx, \]
so the conclusion of Theorem 2.11 follows.

Remark 2.15. Theorem 2.11 can be generalized to one-loop graphs to which $2n$, $n \in \mathbb{N}$, equidistant semiaxes are attached.

To summarize, in this section we have seen that for some classes of quantum graphs a quadratic inequality (2.5) can be proved with the classical constants, and that for some other classes of graphs similar statements can be proved at the price of worse constants as in (2.6), or of a shift in the zero-point energy as in (2.20). It is reasonable to ask whether one can look at the connectedness of a graph and say whether a weak Yang-type inequality (2.6) can be proved. As we have seen, this is the case if there exists a family of continuous functions $G_\ell$ on the graph such that
• On each edge, all the derivatives $\{G_\ell'\}$ are constant.
• At each vertex $v_k$, each function $G_\ell$ satisfies $\sum_j \frac{dG_\ell}{dx_{kj}}(0^+) = 0$.
• For each edge $e$ there exists at least one function $G_\ell$ with $G_\ell' \neq 0$.
Interestingly, the question of the existence of such a family of functions can be rephrased in terms of the theory of electrical resistive circuits, a subject dating from the mid-19th century [14]. We first note that for a suitable family of functions to exist, there must be at least two leaves, which can be regarded as external leads of an electric circuit, bearing some resistance. (In the finite case let the resistance be equivalent to the length of the leaf, and in the infinite case let it be some fixed finite value, at least as large as the length of any finite leaf.) Each internal edge is regarded as a wire bearing a resistance equal to the length of the edge. If we regard the value of $G'$ as a current, then Kirchhoff's condition at the vertex of an electric circuit is exactly the condition (1.2) that $\sum_j \frac{dG}{dx_{kj}}(0^+) = 0$, and the condition that the electric potential $G$ must be uniquely defined at all vertices is equivalent to global continuity of $G$. It has been known since Weyl [29] that the currents and potentials in an electric circuit are uniquely determined by the voltages applied at the leads. There are, however, circuits such that no matter what voltages are applied to the external leads, there will be an internal wire where no current flows; the best known of these is the Wheatstone bridge. (See, for instance, the Wikipedia article on the Wheatstone bridge.) Let us call a metric graph a generalized Wheatstone bridge when the corresponding circuit has exactly two external leads and a configuration for which no current will flow in at least one of its wires.

Then we conjecture that there are only two impediments to the existence of a suitable family of functions $G_\ell$, and therefore to a weakened quadratic inequality (2.6), namely: Unless a quantum graph contains either
• a subgraph that can be disconnected from all leaves by the removal of one point (such as a balloon graph or a graph shaped like the letter $\alpha$); or
• a subgraph that when disconnected from the graph by cutting two edges is a generalized Wheatstone bridge,
then an inequality of the form (2.6) holds. Otherwise the best that can be obtained may be a modified quadratic inequality with a variable shift, as in Theorem 2.11.
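The circuit picture can be made concrete with a small computation. The sketch below is an illustration under stated assumptions, not the authors' construction: it models the classical Wheatstone bridge with two leads `A` and `B`, two interior nodes `C` and `D`, wires `A-C`, `C-B`, `A-D`, `D-B` with resistances `r1` to `r4` (equal to the edge lengths, as in the text), and a bridge wire `C-D` with resistance `r5`. The function name `bridge_current` and this labeling are hypothetical. Node potentials follow from Kirchhoff's current law at `C` and `D`, solved exactly with rational arithmetic.

```python
from fractions import Fraction


def bridge_current(r1, r2, r3, r4, r5, v_in=Fraction(1), v_out=Fraction(0)):
    """Current through the bridge wire C-D of a Wheatstone bridge.

    Wires: A-C (r1), C-B (r2), A-D (r3), D-B (r4), C-D (r5).
    Lead A is held at potential v_in, lead B at v_out.
    """
    g1, g2, g3, g4, g5 = (Fraction(1) / Fraction(r) for r in (r1, r2, r3, r4, r5))
    # Kirchhoff's current law at C and D as a 2x2 linear system for (V_C, V_D).
    a11, a12 = g1 + g2 + g5, -g5
    a21, a22 = -g5, g3 + g4 + g5
    b1 = v_in * g1 + v_out * g2
    b2 = v_in * g3 + v_out * g4
    det = a11 * a22 - a12 * a21
    v_c = (b1 * a22 - a12 * b2) / det      # Cramer's rule
    v_d = (a11 * b2 - a21 * b1) / det
    return (v_c - v_d) * g5


balanced = bridge_current(1, 2, 2, 4, 3)                       # r1/r2 == r3/r4
balanced_other_voltage = bridge_current(1, 2, 2, 4, 3, Fraction(7), Fraction(-2))
unbalanced = bridge_current(1, 2, 2, 1, 3)                     # r1/r2 != r3/r4

print(balanced, balanced_other_voltage, unbalanced)
```

In the balanced configuration the bridge wire carries no current for any applied voltages, so a function $G$ built from such potentials has $G' = 0$ on that edge; changing one length breaks the balance and a nonzero current appears.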
Fig. 2. The "Wheatstone bridge".
3. Universal Bounds for Finite Quantum Graphs

In this section, we derive differential inequalities for Riesz means of eigenvalues of the Dirichlet Laplacian on bounded metric trees $\Gamma$ with at least one leaf (free edge). From these inequalities we derive Weyl-type bounds on the averages of the eigenvalues of the Dirichlet Laplacian
\[ H_D := \left( -\frac{d^2}{dx^2} \right)_D \quad \text{in } L^2(\Gamma), \]
with the conditions (1.2) at each vertex $v_i$. At the ends of the leaves, vanishing Dirichlet boundary conditions are imposed. We recall that with the methods of [9, 12] these are consequences of the same quadratic inequality (2.5) as was used above to prove Lieb–Thirring inequalities. When the total length of the graph is finite, the operator $H_D$ on $D(H_D)$ has a positive discrete spectrum $\{E_j\}_{j=1}^{\infty}$, allowing us to define the Riesz mean of order $\rho$,
\[ R_\rho(z) := \sum_j (z - E_j)_+^{\rho} \tag{3.1} \]
for $\rho > 0$ and real $z$.

Theorem 3.1. Let $\Gamma$ be a metric tree of finite length and with finitely many edges and vertices, and let $H_D$ be the Dirichlet Laplacian in $L^2(\Gamma)$ with domain $D(H_D)$. Then for $z > 0$,
\[ R_1(z) \ge \frac{5}{4z} R_2(z); \tag{3.2} \]
\[ R_2'(z) \ge \frac{5}{2z} R_2(z); \tag{3.3} \]
and consequently $R_2(z)/z^{5/2}$ is a nondecreasing function of $z$.

Proof. The claims are vacuous for $z \le E_1$, so we henceforth assume $z > E_1$. The line of reasoning of the proof of Theorem 2.5 applies just as well to the operator $H_D$ on $D(H_D)$, yielding
\[ \sum_j (z - E_j)_+^2 - 4(z - E_j)_+ \|\phi_j'\|^2 \le 0. \tag{3.4} \]
Since $V \equiv 0$, $\|\phi_j'\|^2 = E_j$. Observing that
\[ \sum_j (z - E_j)_+ E_j = zR_1(z) - R_2(z), \]
we get from (3.4) $5R_2(z) - 4zR_1(z) \le 0$. This proves (3.2). Inequality (3.3) follows from (3.2), as $R_2'(z) = 2R_1(z)$.

Since by Theorem 3.1, $R_2(z) z^{-5/2}$ is a nondecreasing function, we obtain a lower bound of the form $R_2(z) \ge Cz^{5/2}$ for all $z \ge z_0$ in terms of $R_2(z_0)$. Upper bounds can be obtained from the limiting behavior of $R_2(z)$ as $z \to \infty$, as given by the Weyl law. In the following, we follow [9] to derive Weyl-type bounds on the averages of the eigenvalues of $H_D$ in $L^2(\Gamma)$.

Corollary 3.2. For $z \ge 5E_1$,
\[ 16 E_1^{-1/2} \left( \frac{z}{5} \right)^{5/2} \le R_2(z) \le L^{cl}_{2,1} |\Gamma| z^{5/2}, \]
where
\[ L^{cl}_{2,1} := \frac{\Gamma(3)}{(4\pi)^{1/2}\, \Gamma(7/2)}, \]
and $|\Gamma|$ is the total length of the tree.
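Theorem 3.1 and Corollary 3.2 can be illustrated on a tree whose spectrum is explicit. The sketch below rests on assumptions that are not in the text: an equilateral star with three leaves of length 1, Dirichlet conditions at the free ends, and Kirchhoff conditions (continuity and vanishing sum of outgoing derivatives) at the central vertex. For that graph the eigenvalues are $((n - \frac{1}{2})\pi)^2$, simple, and $(n\pi)^2$, doubly degenerate. The helper names `star_eigenvalues` and `riesz_mean` are hypothetical.

```python
import math


def star_eigenvalues(leaf_length: float, z_max: float):
    """Eigenvalues <= z_max of the assumed equilateral 3-star, with multiplicity."""
    eigs = []
    n = 1
    while ((n - 0.5) * math.pi / leaf_length) ** 2 <= z_max:
        eigs.append(((n - 0.5) * math.pi / leaf_length) ** 2)    # symmetric states
        n += 1
    n = 1
    while (n * math.pi / leaf_length) ** 2 <= z_max:
        eigs.extend([(n * math.pi / leaf_length) ** 2] * 2)      # states vanishing at the vertex
        n += 1
    return sorted(eigs)


def riesz_mean(eigs, z: float, rho: int) -> float:
    """R_rho(z) = sum_j (z - E_j)_+^rho."""
    return sum((z - e) ** rho for e in eigs if e < z)


leaf = 1.0
total_length = 3 * leaf
z_top = 400.0
eigs = star_eigenvalues(leaf, z_top)
e1 = eigs[0]
l_cl = math.gamma(3) / (math.sqrt(4 * math.pi) * math.gamma(3.5))   # equals 8/(15 pi)

grid = [e1 + 0.5 * i for i in range(1, int((z_top - e1) / 0.5))]    # all z < z_top

# (3.2): R_1(z) >= 5 R_2(z) / (4 z)
theorem_ok = all(
    riesz_mean(eigs, z, 1) >= 5 * riesz_mean(eigs, z, 2) / (4 * z) - 1e-9 for z in grid
)
# R_2(z) / z^(5/2) is nondecreasing
ratios = [riesz_mean(eigs, z, 2) / z ** 2.5 for z in grid]
monotone = all(b >= a - 1e-12 for a, b in zip(ratios, ratios[1:]))
# Corollary 3.2 for z >= 5 E_1
corollary_ok = all(
    16 * e1 ** -0.5 * (z / 5) ** 2.5 <= riesz_mean(eigs, z, 2) <= l_cl * total_length * z ** 2.5
    for z in grid
    if z >= 5 * e1
)

print(theorem_ok, monotone, corollary_ok, l_cl)
```

All three checks succeed on this grid, and the ratio $R_2(z)/z^{5/2}$ creeps upward towards the Weyl value $L^{cl}_{2,1}|\Gamma|$ without reaching it.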
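Proof. By Theorem 3.1, for all $z \ge z_0$,
\[ \frac{R_2(z)}{z^{5/2}} \ge \frac{R_2(z_0)}{z_0^{5/2}}. \tag{3.5} \]
As $R_2(z_0) \ge (z_0 - E_1)_+^2$ for any $z_0 > E_1$, it follows from (3.5) that
\[ R_2(z) \ge (z_0 - E_1)_+^2 \left( \frac{z}{z_0} \right)^{5/2}. \]
The coefficient $(z_0 - E_1)_+^2 / z_0^{5/2}$ is maximized when $z_0 = 5E_1$. Thus we get
\[ 16 E_1^{-1/2} \left( \frac{z}{5} \right)^{5/2} \le R_2(z). \]
For metric trees with total length $|\Gamma|$, the Weyl law states that
\[ \lim_{n \to \infty} \frac{\sqrt{E_n}}{n} = \frac{\pi}{|\Gamma|}, \]
(see [16]). It follows that
\[ \frac{R_2(z)}{z^{5/2}} \to L^{cl}_{2,1} |\Gamma|, \]
as $z \to \infty$. Since $R_2(z)/z^{5/2}$ is nondecreasing, we get
\[ \frac{R_2(z)}{z^{5/2}} \le L^{cl}_{2,1} |\Gamma|, \quad \forall\, z < \infty. \tag{3.6} \]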
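In summary, we get from Theorem 3.1 and Corollary 3.2 the following two-sided estimate:
\[ 4 E_1^{-1/2} \left( \frac{z}{5} \right)^{3/2} \le \frac{5}{4z} R_2(z) \le R_1(z). \tag{3.7} \]
In order to obtain similar estimates, related to higher eigenvalues, we introduce the notation
\[ \overline{E}_j := \frac{1}{j} \sum_{\ell \le j} E_\ell \]
for the means of eigenvalues $E_\ell$; similarly, the means of the squared eigenvalues are denoted
\[ \overline{E_j^2} := \frac{1}{j} \sum_{\ell \le j} E_\ell^2. \]
For a given $z$, we let $\mathrm{ind}(z)$ be the greatest integer $i$ such that $E_i \le z$. Then obviously,
\[ R_2(z) = \mathrm{ind}(z) \left( z^2 - 2z \overline{E}_{\mathrm{ind}(z)} + \overline{E^2_{\mathrm{ind}(z)}} \right). \]
As for any integer $j$ and all $z \ge E_j$, $\mathrm{ind}(z) \ge j$, we get
\[ R_2(z) \ge D(z, j) := j \left( z^2 - 2z \overline{E}_j + \overline{E_j^2} \right). \]
Using Theorem 3.1 for $z \ge z_j \ge E_j$, it follows that
\[ R_2(z) \ge D(z_j, j) \left( \frac{z}{z_j} \right)^{5/2}. \tag{3.8} \]
Furthermore, $\overline{E}_j^{\,2} \le \overline{E_j^2}$ by the Cauchy–Schwarz inequality, and hence
\[ D(z, j) = j \left( (z - \overline{E}_j)^2 + \overline{E_j^2} - \overline{E}_j^{\,2} \right) \ge j (z - \overline{E}_j)^2. \tag{3.9} \]
This establishes the following

Corollary 3.3. Suppose that $z \ge 5\overline{E}_j$. Then
\[ R_2(z) \ge \frac{16 j z^{5/2}}{25 (5\overline{E}_j)^{1/2}} \tag{3.10} \]
and, therefore,
\[ R_1(z) \ge \frac{4 j z^{3/2}}{5 (5\overline{E}_j)^{1/2}}. \tag{3.11} \]

Proof. Combining Eqs. (3.8) and (3.9), we get
\[ R_2(z) \ge j (z_j - \overline{E}_j)^2 \left( \frac{z}{z_j} \right)^{5/2}. \]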
Inserting $z_j = 5\overline{E}_j$ the first statement follows. (This choice of $z_j$ maximizes the constant appearing in (3.10).) The second statement results from substituting the first statement into (3.7).

The Legendre transform is an effective tool for converting bounds on $R_\rho(z)$ into bounds on the spectrum, as has been realized previously, e.g., in [17]. Recall that if $f(z)$ is a convex function on $\mathbb{R}^+$ that is superlinear in $z$ as $z \to +\infty$, its Legendre transform
\[ \mathcal{L}[f](w) := \sup_z \{wz - f(z)\} \]
is likewise a superlinear convex function. Moreover, for each $w$, the supremum in this formula is attained at some finite value of $z$. We also note that if $f(z) \ge g(z)$ for all $z$, then $\mathcal{L}[g](w) \ge \mathcal{L}[f](w)$ for all $w$. The Legendre transform of the two sides of inequality (3.11) is a straightforward calculation (e.g., see [9]). The result is
\[ (w - [w]) E_{[w]+1} + [w] \overline{E}_{[w]} \le \frac{125}{108} \frac{w^3}{j^2} \overline{E}_j, \tag{3.12} \]
for certain values of $w$ and $j$. In Corollary 3.3 it is supposed that $z \ge 5\overline{E}_j$. Let $z_{\max}$ be the value for which $\mathcal{L}[f](w) = w z_{\max} - f(z_{\max})$, where $f$ is the right-hand side of (3.11). Then by an elementary calculation,
\[ w = \frac{6j}{5} \left( \frac{z_{\max}}{5\overline{E}_j} \right)^{1/2}. \]
It follows that inequality (3.12) is valid for $w \ge 6j/5$. Meanwhile, for any $w$ we can always find an integer $k$ such that on the left-hand side of (3.12), $k - 1 \le w < k$. If $k > 6j/5$ and if we let $w$ approach $k$ from below, we obtain from (3.12)
\[ E_k + (k - 1) \overline{E}_{k-1} \le \frac{125}{108} \frac{k^3}{j^2} \overline{E}_j. \]
The left-hand side of this equation is the sum of the eigenvalues $E_1$ through $E_k$, so we get the following:

Corollary 3.4. For $k \ge \frac{6}{5} j$, the means of the eigenvalues of the Dirichlet Laplacian on an arbitrary metric tree with finitely many edges and vertices satisfy a universal Weyl-type bound,
\[ \frac{\overline{E}_k}{\overline{E}_j} \le \frac{125}{108} \left( \frac{k}{j} \right)^2. \tag{3.13} \]

In [10] it was shown that a similar inequality with a different constant can be proved for all $k \ge j$ in the context of the Dirichlet Laplacian on Euclidean domains. The very same argument applies to quantum graphs with $V = 0$. With
this assumption $\|\phi_j'\|^2 = E_j$, so with $\alpha = 1$ (2.5) can be rewritten as a quadratic inequality,
\[ P_j(z) := \sum_{\ell=1}^{j} (z - E_\ell)(z - 5E_\ell) \le 0 \tag{3.14} \]
for $z \in [E_j, E_{j+1}]$ (cf. [10, Eq. (4.6)]). From (3.2) and (3.5) for $z \ge z_0 \ge E_j$,
\[ R_1(z) \ge \frac{5}{4z} R_2(z) \ge \frac{5}{4} z^{3/2} z_0^{-5/2} \sum_{\ell=1}^{j} (z_0 - E_\ell)^2. \tag{3.15} \]
The derivative of the right-hand side of (3.15) with respect to $z_0$, by a calculation, is a negative quantity times $P_j(z_0)$, and therefore an optimal choice for the value of (3.15) is the root
\[ z_0 = 3\overline{E}_j + \sqrt{D_j} \le 5\overline{E}_j, \tag{3.16} \]
where $D_j$ is the discriminant of $P_j$. The inequality in (3.16) results from the Cauchy–Schwarz inequality as in [10, 12]. Because $P_j(z_0) = 0$,
\[ 0 = \sum_{\ell=1}^{j} (z_0 - E_\ell)(z_0 - 5E_\ell) = 5 \sum_{\ell=1}^{j} (z_0 - E_\ell)^2 - 4z_0 \sum_{\ell=1}^{j} (z_0 - E_\ell), \]
so (3.15) reads
\[ R_1(z) \ge \left( \frac{z}{z_0} \right)^{3/2} \sum_{\ell=1}^{j} (z_0 - E_\ell) = \left( \frac{z}{z_0} \right)^{3/2} j (z_0 - \overline{E}_j). \]
From the left-hand side of (3.16), $z_0 - \overline{E}_j \ge \frac{2}{3} z_0$, so
\[ R_1(z) \ge \frac{2}{3} j z_0^{-1/2} z^{3/2}. \tag{3.17} \]
The Legendre transform of (3.17) is
\[ k \overline{E}_k \le \frac{z_0}{3j^2} k^3, \tag{3.18} \]
and a calculation of the maximizing $z$ in the Legendre transform of the right-hand side of (3.17) shows that (3.18) is valid for all $k > j$. In particular, with the inequality on the right-hand side of (3.16), we have established the following:
Corollary 3.5. For $k \ge j$, the means of the eigenvalues of $H_D$ in $L^2(\Gamma)$ satisfy
$$\frac{\overline{E_k}}{\overline{E_j}} \le \frac{5}{3}\left(\frac{k}{j}\right)^{2}. \tag{3.19}$$
Remark 3.6. Relaxing the assumption to $k \ge j$ comes at the price of making the constant on the right-hand side larger. It would be possible to interpolate between (3.19) and (3.13) for $k \in [j, 6j/5]$ with a slightly better inequality.
Acknowledgment
The authors are grateful to several people for useful comments, including Rupert L. Frank, Lotfi Hermi, Thomas Morley, Joachim Stubbe, and Timo Weidl, and to Michael Music for calculations and insights generated by them. We also wish to express our appreciation to the Mathematisches Forschungsinstitut Oberwolfach for hosting a workshop in February 2009, where this collaboration began, and to the Erwin Schrödinger Institut for hospitality.
References
[1] M. Aizenman and E. H. Lieb, On semiclassical bounds for eigenvalues of Schrödinger operators, Phys. Lett. A 66(6) (1978) 427–429.
[2] M. S. Ashbaugh, The universal eigenvalue bounds of Payne–Pólya–Weinberger, Hile–Protter, and H. C. Yang, Spectral and Inverse Spectral Theory (Goa, 2000), Proc. Indian Acad. Sci. Math. Sci. 112 (2002) 3–30.
[3] G. Berkolaiko, R. Carlson, S. A. Fulling and P. Kuchment (eds.), Quantum Graphs and Their Applications, Contemporary Mathematics, Vol. 415 (American Mathematical Society, 2006).
[4] M. Sh. Birman, The spectrum of singular boundary problems, Amer. Math. Soc. Trans. (2) 53 (1966) 23–80.
[5] M. Cwikel, Weak type estimates for singular values and the number of bound states of Schrödinger operators, Ann. Math. (2) 106(1) (1977) 93–100.
[6] T. Ekholm, R. L. Frank and H. Kovařík, Eigenvalue estimates for Schrödinger operators on metric trees, arXiv:0710.5500.
[7] P. Exner, J. P. Keating, P. Kuchment, T. Sunada and A. Teplyaev (eds.), Analysis on Graphs and Its Applications, Proceedings of Symposia in Pure Mathematics, Vol. 77 (American Mathematical Society, Providence, RI, 2008); Papers from the program held in Cambridge, January 8–June 29 (2007).
[8] E. M. Harrell, II and L. Hermi, On Riesz means of eigenvalues, arXiv:0712.4088.
[9] E. M. Harrell, II and L. Hermi, Differential inequalities for Riesz means and Weyl-type bounds for eigenvalues, J. Funct. Anal. 254(12) (2008) 3173–3191.
[10] E. M. Harrell, II and J. Stubbe, Trace identities for commutators with applications to the distribution of eigenvalues, arXiv:0903.0563v1.
[11] E. M. Harrell, II and J. Stubbe, Universal bounds and semiclassical estimates for eigenvalues of abstract Schrödinger operators, arXiv:0808.1133.
[12] E. M. Harrell, II and J. Stubbe, On trace identities and universal eigenvalue estimates for some partial differential operators, Trans. Amer. Math. Soc. 349(5) (1997) 1797–1809.
[13] D. Hundertmark, Bound state problems in quantum mechanics, in Spectral Theory and Mathematical Physics: A Festschrift in Honor of Barry Simon’s 60th Birthday, Proc. Sympos. Pure Math., Vol. 76, Part 1 (Amer. Math. Soc., Providence, RI, 1980), pp. 463–496.
[14] G. R. Kirchhoff, Über die Auflösung der Gleichungen, auf welche man bei der Untersuchung der linearen Vertheilung galvanischer Ströme geführt wird, Poggendorf’s Ann. Phys. Chem. 72 (1847) 497–508.
[15] P. Kuchment, Quantum graphs: An introduction and a brief survey, in Analysis on Graphs and Its Applications, Proc. Symp. Pure. Math. (Amer. Math. Soc., Providence, RI, 2008), pp. 291–314.
[16] P. Kurasov, Schrödinger operators on graphs and geometry. I. Essentially bounded potentials, J. Funct. Anal. 254(4) (2008) 934–953.
[17] A. Laptev and T. Weidl, Recent results on Lieb–Thirring inequalities, in Journées “Équations aux Dérivées Partielles” (La Chapelle sur Erdre, 2000), Exp. No. XX (Univ. Nantes, Nantes, 2000), 14pp.
[18] A. Laptev and T. Weidl, Sharp Lieb–Thirring inequalities in high dimensions, Acta Math. 184(1) (2000) 87–111.
[19] E. H. Lieb, The number of bound states of one-body Schrödinger operators and the Weyl problem, in Geometry of the Laplace Operator (Proc. Sympos. Pure Math., Univ. Hawaii, Honolulu, Hawaii, 1979), Proc. Sympos. Pure Math., Vol. 36 (Amer. Math. Soc., Providence, RI, 1980), pp. 241–252.
[20] E. H. Lieb and W. Thirring, Inequalities for the moments of the eigenvalues of the Schrödinger Hamiltonian and their relation to Sobolev inequalities, Studies in Mathematical Physics: Essays in Honor of Valentine Bargmann (Princeton Univ. Press, 1976), pp. 269–303.
[21] L. Pauling, The diamagnetic anisotropy of aromatic molecules, J. Chem. Phys. 4 (1936) 673–677.
[22] L. E. Payne, G. Pólya and H. F. Weinberger, On the ratio of consecutive eigenvalues, J. Math. Phys. 35 (1956) 289–298.
[23] G. V. Rozenblum, Distribution of the discrete spectrum of singular differential operators, Izv. Vysš. Učebn. Zaved. Matematika 1(164) (1976) 75–86.
[24] K. Ruedenberg and C. W. Scherr, Free-electron network model for conjugated systems, I, Theory, J. Chem. Phys. 21 (1953) 1565–1581.
[25] J. Stubbe, Universal monotonicity of eigenvalue moments and sharp Lieb–Thirring inequalities, preprint (2008).
[26] W. Thirring, A Course in Mathematical Physics: Quantum Mechanics of Atoms and Molecules, Vol. 3 (Springer-Verlag, 1991), pp. 149–150.
[27] T. Weidl, On the Lieb–Thirring constants $L_{\gamma,1}$ for $\gamma \ge 1/2$, Comm. Math. Phys. 178(1) (1996) 135–146.
[28] H. Weyl, Das asymptotische Verteilungsgesetz der Eigenwerte linearer partieller Differentialgleichungen, Math. Ann. 71 (1912) 441–479.
[29] H. Weyl, Repartición de corriente en una red conductora, Rev. Mat. Hisp. Amer. 5(1) (1923) 153–164.
[30] H. C. Yang, Estimates of the difference between consecutive eigenvalues, preprint (1995); revision of International Centre for Theoretical Physics, preprint IC/91/60, Trieste (April 1991).
Reviews in Mathematical Physics Vol. 22, No. 3 (2010) 331–354 © World Scientific Publishing Company DOI: 10.1142/S0129055X10003977
GEOMETRIC MODULAR ACTION FOR DISJOINT INTERVALS AND BOUNDARY CONFORMAL FIELD THEORY
ROBERTO LONGO∗,§, PIERRE MARTINETTI∗,†,‡,¶ and KARL-HENNING REHREN†,‡
∗Dipartimento di Matematica, Università di Roma 2 “Tor Vergata”, 00133 Roma, Italy
†Institut für Theoretische Physik, Universität Göttingen, Friedrich-Hund-Platz 1, 37077 Göttingen, Germany
‡Courant Centre “Higher Order Structures in Mathematics”, Universität Göttingen, Bunsenstr. 3-5, 37073 Göttingen, Germany
§[email protected]
¶[email protected]
[email protected]
Received 7 December 2009
Revised 14 January 2010
Dedicated to John E. Roberts on the occasion of his 70th birthday
In suitable states, the modular group of local algebras associated with unions of disjoint intervals in chiral conformal quantum field theory acts geometrically. We translate this result into the setting of boundary conformal QFT and interpret it as a relation between temperature and acceleration. We also discuss novel aspects (“mixing” and “charge splitting”) of geometric modular action for unions of disjoint intervals in the vacuum state.
Keywords: Quantum field theory; modular theory.
Mathematics Subject Classifications 2010: 81T40
1. Introduction
Geometric modular action is a most remarkable feature of quantum field theory [2], emerging from the combination of the basic principles: unitarity, locality, covariance and positive energy [1]. It associates thermal properties with localization [17, 30], and is intimately related to the Unruh effect [34] and Hawking radiation [31]. It allows for a reconstruction of space and time along with their symmetries [7], and for a construction of full-fledged quantum field theories [23, 16] out of purely algebraic data together with a Hilbert space vector (the vacuum).
The modular group [32, Chap. VI, Theorem 1.19] is an intrinsic group of automorphisms of a von Neumann algebra $M$, associated with a cyclic and separating vector $\Phi$, provided by the theory of Tomita and Takesaki [17, 32]. In quantum field theory, $M$ may be the algebra of observables localized in a wedge region $\{x \in \mathbb{R}^4 : x^1 > |x^0|\}$ and $\Phi = \Omega$ the vacuum state. In this situation it follows [1] that the associated modular group is the 1-parameter group of Lorentz boosts in the 1-direction, which preserves the wedge, i.e. it has a geometric action on the subalgebras of observables localized in subregions of the wedge. Geometric modular action was also established for the algebras of observables localized in lightcones or double cones in the vacuum state in conformally invariant QFT [5, 16], and for interval algebras in chiral conformal QFT [4]. It is known, however, that the modular group of the vacuum state is not geometric (“fuzzy”) for double cone algebras in massive QFT (see, e.g., [2, 29]), and the same is true for the modular group of wedge algebras or conformal double cone algebras in thermal states [3].
In this contribution, we shall be interested in modular groups for algebras associated with disconnected regions (such as unions of disjoint intervals in chiral conformal QFT). Our starting point is the observation [21] that in chiral conformal QFT (the precise assumptions will be specified below), for any finite number $n$ of disjoint intervals $I_i$ on the circle one can find product states (not the vacuum if $n > 1$) on the algebras $\mathcal{A}(\bigcup_i I_i) = \bigvee_i \mathcal{A}(I_i)$ whose modular groups act geometrically inside the intervals. For $n = 2$, let $E = I_1 \cup I_2$ and $E' = S^1 \setminus \overline{E}$ the complement of the closure of $E$. By locality, $\mathcal{A}(E) \subset \mathcal{A}(E')'$, where the inclusion is in general proper. The larger algebra $\mathcal{A}(E')'$ admits the physical re-interpretation as a double cone algebra $B_+(O)$ in boundary conformal QFT [25], as will be explained in Sec. 2.2.
The above state on $\mathcal{A}(E)$ can be extended to a state on $B_+(O) = \mathcal{A}(E')'$ such that the geometric modular action is preserved. We shall compute the geometric flow in the double cone $O$ in Sec. 2. Adopting the interpretation of $\frac{d\tau}{ds}$ as inverse temperature $\beta$ (where $\tau$ is the proper time along an orbit and $s$ the modular group parameter) [11, 28], we compute the relation between temperature and acceleration. There is no simple proportionality as in the case of the Hawking temperature.
In Sec. 3, we shall connect our results with a recent work by Casini and Huerta [9]. In a first quantization approach as in [14], these authors have succeeded in computing the operator resolvent in the formula of [14] for the modular operator. From this, they obtained the modular flow for disjoint intervals and double cones in 2 dimensions in the theory of free Fermi fields. Unlike [21], they consider the vacuum state. They find a geometric modular action in the massless case (including the chiral case), but this action involves a “mixing” (“modular teleportation” [9]) between the different intervals resp. double cones. We shall discuss how, upon descent to gauge-invariant subtheories, the mixing leads to the new phenomenon of “charge splitting” (Sec. 3.3).
Ignoring the mixing, the geometric part of the vacuum modular flow for two intervals in the chiral free Fermi model is the same as the purely geometric modular flow in the previous non-vacuum product state, provided a “canonical” choice for the latter is made, in the model-independent approach. We shall make the result of Casini and Huerta (which was obtained by formal manipulations of operator kernels) rigorous by establishing the KMS property of the vacuum state with respect to the modular action they found. We shall also present a preliminary discussion of the question, to what extent the result may be expected to hold in other than free Fermi theories.
2. Geometric Modular Flow for n-Intervals
Let $I \mapsto \mathcal{A}(I)$ be a diffeomorphism covariant local net on the circle $S^1$: the orientation-preserving diffeomorphisms $\gamma$ of $S^1$ are unitarily implemented by $U(\gamma)$ such that $\mathrm{Ad}\,U(\gamma)$ maps $\mathcal{A}(I)$ onto $\mathcal{A}(\gamma(I))$ and $\mathrm{Ad}\,U(\gamma)|_{\mathcal{A}(I)} = \mathrm{id}|_{\mathcal{A}(I)}$ if $\gamma|_I = \mathrm{id}|_I$; in particular, for localized diffeomorphisms $U(\gamma)$ are local observables, associated with the stress-energy tensor; see, e.g., [27, Sec. 3].
An n-interval is the union $E := \bigcup_{k=1}^{n} I_k$ of $n$ open intervals $I_k \subset S^1$ ($k = 1, \dots, n$) with mutually disjoint closure. The complement $E' = S^1 \setminus \overline{E}$ is another n-interval. If there is an interval $I \subset S^1$ such that $E = \{z \in S^1 : z^n \in I\}$, we write $E = \sqrt[n]{I}$, and call $E$ symmetric. In this case, $E' = \sqrt[n]{I'}$. Note that every 2-interval is a Möbius transform of a symmetric 2-interval, while the same is not true for $n > 2$. We are interested in the algebras
$$\mathcal{A}(E) := \bigvee_{i=1}^{n} \mathcal{A}(I_i) \quad\text{and}\quad \hat{\mathcal{A}}(E) := \mathcal{A}(E')', \tag{2.1}$$
and their states with geometric modular action. By $\Omega$ we denote the vacuum vector, and by $U$ the projective unitary representation of the diffeomorphism group in the vacuum representation, with generators $L_n$ ($n \in \mathbb{Z}$) and central charge $c$.
2.1. Product states with geometric modular action
For $n = 1$, $E$ is just an interval and $\hat{\mathcal{A}}(I) = \mathcal{A}(I)$ (Haag duality).
Proposition 1 (Bisognano–Wichmann Property) ([4, Theorem 2.3]). The modular group of unitaries for the pair $(\mathcal{A}(I), \Omega)$ is given by the 1-parameter group of Möbius transformations that fixes the interval $I$,
$$\Delta^{it}_{\mathcal{A}(I),\Omega} = U(\Lambda_I(-2\pi t)).$$
For $I = S^1_+$ the upper half circle, the generator of the subgroup $U(\Lambda_{S^1_+}(t))$ is the dilation operator $D = i(L_1 - L_{-1})$. It follows that $D$ as well as its Möbius conjugates $D_I$ (the generators of the subgroups $U(\Lambda_I(t))$) are “of modular origin”:
$$-2\pi \cdot D_I = \log \Delta_{\mathcal{A}(I),\Omega}. \tag{2.2}$$
Fig. 1. Flow $f_t$ in the 3-intervals $E = \sqrt[3]{S^1_+} = I_1 \cup I_2 \cup I_3$ and $E' = \sqrt[3]{S^1_-}$.
Let now
$$L_0^{(n)} = \frac{1}{n} L_0 + \frac{c}{24}\,\frac{n^2 - 1}{n}, \qquad L_{\pm 1}^{(n)} = \frac{1}{n} L_{\pm n}, \tag{2.3}$$
and $U^{(n)}$ the covering representation of the Möbius group with generators $L_k^{(n)}$ ($k = 0, \pm 1$). The unitary one-parameter groups $V(t) = U^{(n)}(\Lambda_I(-2\pi t))$ act on the diffeomorphism covariant net by
$$V(t)\,\mathcal{A}(J)\,V(t)^* = \mathcal{A}(f_t(J)) \qquad (J \subset \sqrt[n]{I}) \tag{2.4}$$
where the geometric flow $f_t$ is given by (cf. Fig. 1)
$$f_t(z) = \sqrt[n]{\Lambda_I(-2\pi t)(z^n)}, \tag{2.5}$$
with the branch of $\sqrt[n]{\cdot}$ chosen in the same connected component of $E$ as $z$, i.e. $f_t$ is a diffeomorphism of $S^1$ which preserves each component of $E$ separately. The same formulae hold also for $J \subset \sqrt[n]{I'}$.
The question arises whether for $n > 1$ the generators $D_I^{(n)}$ of $V(t)$ also have “modular origin” as in (2.2). However, unlike with $n = 1$, we have the following lemma and corollary:
Lemma. In a unitary positive-energy representation of $sl(2, \mathbb{R})$ of weight $h > 0$, there is no vector $\Phi$ such that $D\Phi = 0$, where $D = i(L_1 - L_{-1})$.
Proof. An orthonormal basis of the representation is given by the vectors $|n\rangle = (n!\,(2h)_n)^{-1/2} L_{-1}^n |h\rangle$, where $|h\rangle$ is the lowest weight vector. Solving the eigenvalue equation $L_1 \Phi = L_{-1} \Phi$ by the ansatz $\Phi = \sum_n c_n |n\rangle$ produces a recursion for the coefficients $c_n$ whose solution is not square-summable.
Corollary. For $n > 1$, no cyclic and separating vector $\Phi$ exists in a positive-energy representation of the net $\mathcal{A}$ such that the modular Hamiltonian $\log \Delta_{\mathcal{A}(E),\Phi}$ would equal $-2\pi D_I^{(n)}$.
Proof. By modular theory, $\log \Delta_{\mathcal{A}(E),\Phi}\,\Phi = 0$. But because $L_0^{(n)} \ge \frac{c}{24}\,\frac{n^2 - 1}{n} > 0$, the lemma states that no vector $\Phi$ can be annihilated by $D_I^{(n)}$, which is a Möbius conjugate of $D^{(n)}$.
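The recursion mentioned in the proof of the Lemma can be made explicit. What follows is a minimal sketch, assuming the standard lowest-weight matrix elements $L_1|n\rangle = \sqrt{n(2h+n-1)}\,|n-1\rangle$ and $L_{-1}|n\rangle = \sqrt{(n+1)(2h+n)}\,|n+1\rangle$ (they follow from the normalization of $|n\rangle$ but are not spelled out in the text). Comparing coefficients in $L_1\Phi = L_{-1}\Phi$ then gives $c_1 = 0$ and $c_{m+1}\sqrt{(m+1)(2h+m)} = c_{m-1}\sqrt{m(2h+m-1)}$, so that $c_{2k}^2$ decays only like $1/k$ and the partial sums of $\sum_n |c_n|^2$ grow logarithmically. The function names `squared_coefficients` and `partial_norm` are hypothetical.

```python
import math


def squared_coefficients(h, kmax):
    """Return [c_0^2, c_2^2, ..., c_{2*kmax}^2], normalized to c_0 = 1.

    Uses c_{m+1}^2 = c_{m-1}^2 * m(2h+m-1) / ((m+1)(2h+m)) for odd m = 2k-1.
    The odd coefficients vanish because the m = 0 equation forces c_1 = 0.
    """
    c2 = [1.0]
    for k in range(1, kmax + 1):
        m = 2 * k - 1
        ratio = m * (2 * h + m - 1) / ((m + 1) * (2 * h + m))
        c2.append(c2[-1] * ratio)
    return c2


def partial_norm(h, kmax):
    """Partial sum of |c_n|^2 over n = 0, 2, ..., 2*kmax."""
    return math.fsum(squared_coefficients(h, kmax))


if __name__ == "__main__":
    # The partial sums keep growing (roughly like log k): no normalizable solution.
    for kmax in (10, 100, 1000, 10000):
        print(kmax, partial_norm(0.5, kmax))
```

For $h = 1/2$ the coefficients reduce to Wallis-type products, $c_{2k}^2 \approx 1/(\pi k)$, which makes the logarithmic divergence easy to see.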
Instead, the appropriate generalization of (2.2) for the modular origin of the generators $D_I^{(n)}$ was given in [21], assuming that the net $\mathcal{A}$ is completely rational. This means that the split property holds and the µ-index $\mu_{\mathcal{A}} = [\hat{\mathcal{A}}(E) : \mathcal{A}(E)]$ is finite, and implies that $\mathcal{A}(E) \subset \hat{\mathcal{A}}(E)$ is irreducible and there is a unique conditional expectation $\varepsilon_E : \hat{\mathcal{A}}(E) \to \mathcal{A}(E)$ [22, Proposition 5 and Sec. 3]. In the sequel, $\frac{d\psi}{d\psi'}$ is the Connes spatial derivative for a pair of faithful normal states $\psi$ and $\psi'$ on a von Neumann algebra $M$ and its commutant $M'$, which is the canonical positive operator such that $\left(\frac{d\psi}{d\psi'}\right)^{it}$ implements $\sigma_t^{\psi}$ on $M$ and $\left(\frac{d\psi}{d\psi'}\right)^{-it}$ implements $\sigma_t^{\psi'}$ on $M'$ [10, Theorem 9].
Proposition 2 ([21, Corollary 16]). There is a faithful normal state $\varphi_E$ on $\mathcal{A}(E)$ ($E = \sqrt[n]{I}$) and a second faithful normal state $\varphi_{E'}$ on $\mathcal{A}(E')$, such that the following hold: The modular automorphism group $\sigma_t^{\varphi_E}$ is implemented by $V(t)$, $\sigma_t^{\varphi_{E'}}$ is implemented by $V(-t)$, and
$$-2\pi D_I^{(n)} = \log \frac{d\hat\varphi_E}{d\varphi_{E'}} + \frac{n-1}{2} \log \mu_{\mathcal{A}}. \tag{2.6}$$
Here, $\hat\varphi_E = \varphi_E \circ \varepsilon_E$ extends the state on $\mathcal{A}(E)$ to a state on $\hat{\mathcal{A}}(E)$. Moreover, $\frac{d\hat\varphi_E}{d\varphi_{E'}} = \frac{d\varphi_E}{d\hat\varphi_{E'}}$.
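The state $\varphi_E$ on $\mathcal{A}(E)$ is given by $\varphi_E := \left(\bigotimes_{k=1}^{n} \varphi_k\right) \circ \chi_E$ where $\chi_E : \mathcal{A}(E) \equiv \bigvee_{k=1}^{n} \mathcal{A}(I_k) \to \bigotimes_{k=1}^{n} \mathcal{A}(I_k)$ is the natural isomorphism given by the split property ($I_k$ are the components of $E$), and the states $\varphi_k$ on $\mathcal{A}(I_k)$ are given by $\varphi_k = \omega \circ \mathrm{Ad}\,U(\gamma_k)$, where $\omega$ is the vacuum state, and $U(\gamma_k)$ implement diffeomorphisms $\gamma_k$ that equal $z \mapsto z^n$ on $I_k$. (By locality, $\varphi_k$ do not depend on the behavior of $\gamma_k$ outside $I_k$.)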
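Corollary. Let $\varphi_E$ and $\hat\varphi_E$ be the states on $\mathcal{A}(E)$ and on $\hat{\mathcal{A}}(E)$, respectively, as in Proposition 2. For intervals $J_k \subset I_k$ (= the components of $E$) and $F = \bigcup_{k=1}^{n} J_k$, we have the geometric modular actions
$$\sigma_t^{\varphi_E}(\mathcal{A}(J_k)) = \mathcal{A}(f_t(J_k)), \quad\text{hence}\quad \sigma_t^{\varphi_E}(\mathcal{A}(F)) = \mathcal{A}(f_t(F)), \tag{2.7}$$
$$\sigma_t^{\hat\varphi_E}(\mathcal{A}(J_k)) = \mathcal{A}(f_t(J_k)), \quad\text{and}\quad \sigma_t^{\hat\varphi_E}(\hat{\mathcal{A}}(F)) = \hat{\mathcal{A}}(f_t(F)). \tag{2.8}$$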
Proof. (2.7) is obvious from (2.4). By the defining implementation properties of the Connes spatial derivative, we conclude from (2.6) that $\sigma^{\hat\varphi_E}$ is implemented by $V(t)$. This implies (2.8), by the $U^{(n)}$-covariance of the algebras under consideration.
(We include the obvious statement (2.7) for later comparison with the geometric modular flow in [9], for which only the second equality in (2.7) holds while the first is violated.)
For $n = 1$, one may just choose $\gamma = \mathrm{id}$, so that both $\varphi_I$ and $\varphi_{I'}$ are given by the restrictions of the vacuum state, and (2.6) reduces to (2.2). For $n > 1$, the state $\varphi_E$ is different from the vacuum state, but it is rotation invariant on $\mathcal{A}(E)$ in the sense that $\varphi_E \circ \mathrm{Ad}\,U(\mathrm{rot}_t) = \varphi_E$ on $\mathcal{A}(J_k)$ for $\overline{J_k} \subset I_k$ and $t$ small enough that $\mathrm{rot}_t(J_k) \subset I_k$. ($\mathrm{rot}_t$ stands for the rotations $z \mapsto e^{it} z$.)
Namely, if $J \subset I$ such that $gJ \subset I$ for $g$ in a neighborhood $N$ of the identity of the Möbius group, then by construction, $\varphi_E \circ \mathrm{Ad}\,U^{(n)}(g) = \varphi_E$ on $\mathcal{A}(\sqrt[n]{J})$ for $g \in N$. In particular, the same is true for the rotations $\mathrm{rot}_t$ with $t$ in a neighborhood of 0. Since $U^{(n)}(\mathrm{rot}_t) = U(\mathrm{rot}_{t/n}) \cdot (\text{complex phase})$, the rotation invariance on $\mathcal{A}(E)$ follows.
One could actually have chosen any other family of diffeomorphisms $\gamma_k$ that map $I_k$ onto $I$, resulting in product states $\varphi_E^{(\gamma_k)}$ with a different geometric flow on $E$. In that case, the unitary 1-parameter group $V(t)$ satisfying the properties of Proposition 2 is a diffeomorphism conjugate of $U^{(n)}(\Lambda_I(-2\pi t))$. One might expect that our choice of $\varphi_E$ is the only one in this class which enjoys the rotation invariance on $\mathcal{A}(E)$. Surprisingly, this is not the case:
Let $\varphi_E^{(\gamma_k)}$ be a product state on $\mathcal{A}(E)$ that is given on $\mathcal{A}(I_k)$ by $\omega \circ \mathrm{Ad}\,U(\gamma_k)$, where $\gamma_k$ are diffeomorphisms of $S^1$ that map $I_k$ onto $I$. Then this state is rotation invariant on $\mathcal{A}(E)$, by construction, if and only if $\omega \circ \mathrm{Ad}\,U(h_k)$ are rotation invariant on $\mathcal{A}(I)$, where $h_k$ are diffeomorphisms of $S^1$, defined on $I$ by $h_k(z^n) = \gamma_k(z)$ for $z \in I_k$. In particular, $h_k$ map $I$ onto $I$. The condition that $\omega \circ \mathrm{Ad}\,U(h)$ is rotation invariant on $\mathcal{A}(I)$ can be evaluated for the 2-point function of the stress-energy tensor in that state. Using the inhomogeneous transformation law under diffeomorphisms $h$, involving the Schwarz derivative $D_z h = \frac{h'''}{h'} - \frac{3}{2}\left(\frac{h''}{h'}\right)^2$, the quantity
$$2c \cdot \left(\frac{\frac{dh_t(z)}{dz}\,\frac{dh_t(w)}{dw}}{(h_t(z) - h_t(w))^2}\right)^{2} + \frac{c^2}{36} \cdot D_z h_t(z) \cdot D_w h_t(w), \tag{2.9}$$
where $h_t = h \circ \mathrm{rot}_t$, must be independent of $t$ for $z, w \in I$ and $t$ in a neighborhood of zero. Working out the singular parts of the expansion in $w$ around $z$, one finds that $D_z h_t(z)$ must be independent of $t$ for $z \in I$. This already implies that the second (regular) term is separately invariant, so that, in particular, the invariance condition does not depend on the central charge $c$.
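Solving
$$\partial_t D_z h_t(z) = 0 \quad\Leftrightarrow\quad z^2 \cdot D_z h(z) = \text{const.}, \tag{2.10}$$
when the constant is parametrized as $\frac{1}{2}(1 - \nu^2)$, yields
$$h(z) = \mu(z^\nu) = \frac{A z^\nu + B}{C z^\nu + D} \qquad \text{for } z \in I, \tag{2.11}$$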
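where $\mu$ is a Möbius transformation.^a The state $\omega \circ \mathrm{Ad}\,U(h)$ is indeed rotation invariant on $\mathcal{A}(I)$ by $h \circ \mathrm{rot}_t(z) = \mu \circ \mathrm{rot}_{\nu t}(z^\nu)$ and Möbius invariance of $\omega$.
^a The sign of the exponent $\nu$ can be reversed by exchanging $A \leftrightarrow B$ and $C \leftrightarrow D$. In order that $h$ takes values in $S^1$, $\nu$ must be either real or imaginary, with corresponding reality conditions on the matrix $\begin{pmatrix} A & B \\ C & D \end{pmatrix}$. Requiring $h$ also to preserve the orientation, we find: If $\nu > 0$, then $\begin{pmatrix} A & B \\ C & D \end{pmatrix} \in SU(1,1)$. If $i\nu > 0$, then $\begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \begin{pmatrix} i & 1 \\ -i & 1 \end{pmatrix} \cdot SL(2, \mathbb{R})$, where $\begin{pmatrix} i & 1 \\ -i & 1 \end{pmatrix}$ is the Cayley transformation $x \mapsto \frac{1+ix}{1-ix}$.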
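The step from (2.10) to (2.11) can be checked directly in one direction. The following worked computation is a sketch that uses the definition $D_z h = \frac{h'''}{h'} - \frac{3}{2}\left(\frac{h''}{h'}\right)^2$ together with the standard fact (not stated in the text) that the Schwarz derivative is unchanged under left composition with a Möbius transformation $\mu$; the symbol $p$ for the power map is introduced here only for illustration.

```latex
\begin{align*}
p(z) &= z^{\nu}: \qquad
  \frac{p''}{p'} = \frac{\nu-1}{z}, \qquad
  \frac{p'''}{p'} = \frac{(\nu-1)(\nu-2)}{z^{2}}, \\
D_z p &= \frac{(\nu-1)(\nu-2)}{z^{2}} - \frac{3}{2}\,\frac{(\nu-1)^{2}}{z^{2}}
       = \frac{(\nu-1)\bigl(-\tfrac{1}{2}\nu - \tfrac{1}{2}\bigr)}{z^{2}}
       = \frac{1-\nu^{2}}{2z^{2}}, \\
D_z(\mu\circ p) &= D_z p
  \quad\Longrightarrow\quad
  z^{2}\cdot D_z h(z) = \tfrac{1}{2}\,(1-\nu^{2})
  \quad\text{for } h = \mu(z^{\nu}).
\end{align*}
```

This reproduces the parametrization of the constant used in (2.10).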
For each value of $\nu$, requiring $h$ to preserve the endpoints of the interval $I$ fixes the Möbius transformation up to left composition with the 1-parameter subgroup $\Lambda_I(t)$. Because $\omega$ is invariant under $\Lambda_I(t)$, the state $\omega \circ \mathrm{Ad}\,U(h)$ is uniquely determined by the exponent $\nu$ in (2.11). One has therefore a 1-parameter family of product states, all rotation-invariant on $\mathcal{A}(I)$, but with different modular flows on $I$.
Going back to the product states on $\mathcal{A}(E)$ by composition with $z \mapsto z^n$, there is one parameter $\nu_k$ for each interval, i.e. for the choice of the states $\omega \circ \mathrm{Ad}\,U(\gamma_k)$ on $\mathcal{A}(I_k)$. The state is invariant also under “large” rotations by $2\pi/n$, if and only if these parameters are the same for all $k$.
2.2. Geometric modular action in boundary CFT
The case $n = 2$ is of particular interest in boundary conformal quantum field theory (BCFT) [25]. With every 2-interval $E$ such that $-1 \notin E$, one associates a double cone $O_E$ in the halfspace $M_+ = \{(t, x) \in \mathbb{R}^2 : x > 0\}$ as follows. The boundary $x = 0$, $t \in \mathbb{R}$ is the pre-image of $\dot{S}^1 := S^1 \setminus \{-1\}$ under the Cayley transform $C : \mathbb{R} \ni t \mapsto z = (1 + it)/(1 - it) \in S^1$. Let $E = I_- \cup I_+ \subset \dot{S}^1$ with $I_- < I_+$ in the counter-clockwise order, and $I_\pm^{\mathbb{R}} = C^{-1}(I_\pm) \subset \mathbb{R}$. Then
$$O_E := I_+^{\mathbb{R}} \times I_-^{\mathbb{R}} \equiv \{(t, x) : t \pm x \in I_\pm^{\mathbb{R}}\}. \tag{2.12}$$
(When there can be no confusion, we shall drop the subscript $E$.) Now, the algebras
$$B_+(O) := \hat{\mathcal{A}}(E) \tag{2.13}$$
have the re-interpretation as local algebras of BCFT, which extend the subalgebras of chiral observables
$$A_+(O) := \mathcal{A}(E) \equiv \mathcal{A}(I_-) \vee \mathcal{A}(I_+). \tag{2.14}$$
Under this re-interpretation, the second statement in (2.8) asserts that the modular group $\sigma_t^{\hat\varphi_E}$ acts geometrically inside the associated diamond $O$:
$$\sigma_s^{\hat\varphi_E}(B_+(Q)) = B_+(f_s^O(Q)), \tag{2.15}$$
where the double cone $Q = O_F \subset O$ corresponds to a sub-2-interval $F \subset E$, and the flow $f_s^O$ on $O$ arises from the pair of flows $f_s$ (2.5) on $I_+$ and $I_-$, by the said transformations, i.e.
$$f_s^O(t + x, t - x) \equiv (u_s, v_s) = \left(C^{-1} \circ f_s \circ C(t + x),\; C^{-1} \circ f_s \circ C(t - x)\right). \tag{2.16}$$
For $I_+^{\mathbb{R}} = (a, b) \subset \mathbb{R}_+$ and $I_-^{\mathbb{R}} = (-1/a, -1/b)$ (corresponding to a symmetric 2-interval $E$), we have computed the velocity field
$$\partial_s u_s = 2\pi\, \frac{(u_s - a)(a u_s + 1)(u_s - b)(b u_s + 1)}{(b - a)(1 + ab) \cdot (1 + u_s^2)} =: -2\pi V^O(u_s) \tag{2.17}$$
for $u_s \in I_+^{\mathbb{R}}$, and the same equation for $v_s \in I_-^{\mathbb{R}}$.
For $I_+^{\mathbb{R}} = (a_1, b_1)$ and $I_-^{\mathbb{R}} = (a_2, b_2)$ corresponding to a non-symmetric 2-interval $\tilde{E}$, there is a Möbius transformation $m$ that maps $\tilde{E}$ onto a symmetric interval $E$. Choosing the state $\varphi_{\tilde{E}} := \varphi_E \circ \mathrm{Ad}\,U(m)$ on $\mathcal{A}(\tilde{E})$, the resulting geometric modular flow is given by $\tilde{f}_s = m^{-1} \circ f_s \circ m$. Going through the same steps, we find
$$\partial_s u_s = -2\pi V^O(u_s) = 2\pi\, \frac{(u_s - a_1)(u_s - b_1)(u_s - a_2)(u_s - b_2)}{L u_s^2 - 2M u_s + N} \tag{2.18}$$
with
$$L = b_1 - a_1 + b_2 - a_2, \qquad M = b_1 b_2 - a_1 a_2, \qquad N = b_2 a_2 (b_1 - a_1) + b_1 a_1 (b_2 - a_2). \tag{2.19}$$
This differential equation is solved by
$$\log\left(-\frac{(u_s - a_1)(u_s - a_2)}{(u_s - b_1)(u_s - b_2)}\right) = -2\pi s + \text{const.} \tag{2.20}$$
The modular orbits for $u = t + x$, $v = t - x$ are obtained by eliminating $s$:
$$\frac{(u - a_1)(u - a_2)}{(u - b_1)(u - b_2)} \cdot \frac{(v - b_1)(v - b_2)}{(v - a_1)(v - a_2)} = \text{const.} \tag{2.21}$$
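That (2.20) solves (2.18) can be confirmed numerically. The sketch below integrates (2.18) with a classical fourth-order Runge–Kutta scheme and checks that the logarithm in (2.20) decreases linearly with slope $-2\pi$. The interval endpoints used in the example are arbitrary illustrative values, and the function names are hypothetical.

```python
import math


def coefficients(a1, b1, a2, b2):
    """L, M, N from (2.19)."""
    L = b1 - a1 + b2 - a2
    M = b1 * b2 - a1 * a2
    N = b2 * a2 * (b1 - a1) + b1 * a1 * (b2 - a2)
    return L, M, N


def rhs(u, a1, b1, a2, b2):
    """Right-hand side of (2.18), i.e. du/ds."""
    L, M, N = coefficients(a1, b1, a2, b2)
    num = (u - a1) * (u - b1) * (u - a2) * (u - b2)
    return 2 * math.pi * num / (L * u * u - 2 * M * u + N)


def log_ratio(u, a1, b1, a2, b2):
    """Left-hand side of (2.20)."""
    return math.log(-(u - a1) * (u - a2) / ((u - b1) * (u - b2)))


def integrate(u0, s_end, params, steps=1000):
    """Classical RK4 integration of (2.18) from s = 0 to s = s_end."""
    h = s_end / steps
    u = u0
    for _ in range(steps):
        k1 = rhs(u, *params)
        k2 = rhs(u + 0.5 * h * k1, *params)
        k3 = rhs(u + 0.5 * h * k2, *params)
        k4 = rhs(u + h * k3, *params)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return u


if __name__ == "__main__":
    params = (1.0, 3.0, -2.0, -0.5)  # (a1, b1, a2, b2), illustrative values
    u0, s_end = 2.0, 0.25
    u1 = integrate(u0, s_end, params)
    drift = log_ratio(u1, *params) - log_ratio(u0, *params)
    print(drift, -2 * math.pi * s_end)
```

The same check works for a starting point $v_0 \in (a_2, b_2)$, since (2.18) holds in both intervals.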
2.3. General boundary CFT
Up to this point, we have taken the boundary CFT to be given by $B_+(O) := \hat{\mathcal{A}}(E)$, which equals the relative commutant $B_+(O) = \mathcal{A}(K)' \cap \mathcal{A}(L)$ by virtue of Haag duality of the local chiral net $\mathcal{A}$. Here, $K$ and $L \subset \dot{S}^1$ are the open intervals between $I_+$ and $I_-$, and spanned by $I_+$ and $I_-$, respectively, i.e. $L = I_+ \cup \overline{K} \cup I_-$.
The general case of a boundary CFT was studied in [25]. If $\mathcal{A}$ is completely rational, every irreducible local boundary CFT net containing $\mathcal{A}(E)$ is intermediate between $\mathcal{A}(E)$ and a maximal (Haag dual) BCFT net:
$$\mathcal{A}(I_+) \vee \mathcal{A}(I_-) \equiv A_+(O) \subset B_+(O) \subset B_+^{\mathrm{dual}}(O) \equiv B(K)' \cap B(L), \tag{2.22}$$
where $I \mapsto B(I)$ is a conformally covariant, possibly nonlocal net on $\dot{S}^1$, which extends $\mathcal{A}$ and is relatively local with respect to $\mathcal{A}$ [25, Proposition 2.9(ii)]. (Its extension to the circle in general requires a covering.) If $\mathcal{A}$ is completely rational, the local subfactors $\mathcal{A}(I) \subset B(I)$ automatically have finite index (not depending on $I \subset \dot{S}^1$) by the same argument as in [20, p. 39], and there are only finitely many such extensions [19, Theorem 2.4]. There is then a unique global conditional expectation $\varepsilon$ that maps each $B(I)$ onto $\mathcal{A}(I)$. The expectation $\varepsilon$ commutes with Möbius transformations and preserves the vacuum state. By relative locality, $\varepsilon$ maps $B(K)' \cap B(L)$ into (in general, not onto) $\mathcal{A}(K)' \cap \mathcal{A}(L)$, hence
$$\mathcal{A}(E) \equiv A_+(O) \subset \varepsilon(B_+(O)) \subset \hat{\mathcal{A}}(E). \tag{2.23}$$
The product state $\hat\varphi_E$ on $\hat{\mathcal{A}}(E)$ induces a faithful normal state $\hat\varphi_E \circ \varepsilon$ on $B_+(O)$.
Proposition 3. In a completely rational, diffeomorphism invariant BCFT, the modular group of the state $\hat\varphi_E \circ \varepsilon$ acts geometrically on $B_+(Q)$, $Q \subset O$, i.e. $\sigma_s^{\hat\varphi_E \circ \varepsilon}(B_+(Q)) = B_+(f_s^O(Q))$, where $f_s^O$ is the flow (2.16).
Proof. $B_+(O)$ is generated by $A_+(O)$ and an isometry $v$ [24] such that every element $b \in B_+(O)$ has a unique representation as $b = av$ with $a \in A_+(O)$, and $va = \theta(a)v$ where $\theta$ is a dual canonical endomorphism of $B_+(O)$ into $A_+(O)$. For a double cone $Q \subset O$, the isometry $v$ may be chosen to belong to $B_+(Q)$, in which case $\theta$ is localized in $Q$. We know that the modular group restricts to the modular group of $A_+(O)$, which acts geometrically; in particular, it takes $A_+(Q)$ to $A_+(f_s^O(Q))$. It then follows by the properties of the conditional expectation that $\sigma_s^{\hat\varphi_E \circ \varepsilon}(v) \equiv v_s = u_s v$ where $u_s \in \mathcal{A}(E)$ is a unitary cocycle of intertwiners $u_s : \theta \to \theta_s \equiv \sigma_s^{\hat\varphi_E} \circ \theta \circ (\sigma_s^{\hat\varphi_E})^{-1}$. Since $\sigma_s^{\hat\varphi_E}$ acts geometrically in $A_+(O)$, $\theta_s$ is localized in $f_s^O(Q)$, and $A_+(f_s^O(Q)) \cdot v_s = B_+(f_s^O(Q))$. This proves the claim.
Thus, in every BCFT, the modular group of the state $\hat\varphi_E \circ \varepsilon$ on $B_+(O_E)$ acts geometrically inside the double cone $O_E$ by the same flow (2.20), (2.21).
2.4. Local temperature in boundary conformal QFT
We shall show that the states $\hat\varphi_E \circ \varepsilon$, whose geometric modular action we have just discussed, are manufactured far from thermal equilibrium. We adopt the notion of “local temperature” introduced in [8], where one compares the expectation values of suitable “thermometer observables” $\Phi(x)$ in a given state $\varphi$ with their expectation values in global KMS reference states $\omega_\beta$ of inverse temperature $\beta$. If one can represent the expectation values as weighted averages
$$\varphi(\Phi(x)) = \int d\rho_x(\beta)\, \omega_\beta(\Phi(x)) \tag{2.24}$$
(where the thermal functions $\beta \mapsto \omega_\beta(\Phi(x))$ do not depend on $x$ because KMS states are translation invariant), then one may regard the state $\varphi$ at each point $x$ as a statistical average of thermal equilibrium states.
In BCFT, this analysis can be carried out very easily for the product states $\varphi_E$ with the energy density $2T_{00}(t, x) = T(t + x) + T(t - x)$ as thermometer observable. One has $\omega_\beta(T(\cdot)) = \frac{\pi^2}{24}\, c\, \beta^{-2}$ in the KMS states, while the inhomogeneous transformation law of $T$ under diffeomorphisms gives $\varphi_E(T(y)) = -\frac{c}{24\pi} D_y \gamma_\pm(y) = -\frac{c}{4\pi}(1 + y^2)^{-2}$ if $y \in I_\pm^{\mathbb{R}}$, where $\gamma_\pm(y) = C^{-1} \circ (z \mapsto z^2) \circ C(y) = \frac{2y}{1 - y^2}$, i.e. negative energy density inside the double cone $O = I_+^{\mathbb{R}} \times I_-^{\mathbb{R}}$. The product states $\varphi_E$ can therefore not be interpreted as local thermal equilibrium states in the sense of [8]. The possibility of locally negative energy density in quantum field theory is well known, and its relation to the Schwarz derivative in two-dimensional conformal QFT was first discussed in [15].
2.5. Modular temperature in boundary conformal QFT
The “thermal time hypothesis” [11] provides a very different thermal interpretation of states with geometric modular action. According to this hypothesis, one interprets the norm of the vector $\partial_s$ tangent to the modular orbit $x^\mu(s)$ as the inverse
temperature $\beta_s$ of the state as seen by a physical observer with accelerated trajectory $x^\mu(s)$. In the vacuum state on the Rindler wedge algebra, this gives precisely the Unruh temperature $\beta_s = \frac{d\tau}{ds} = \frac{2\pi}{\kappa}$ ($\tau$ is the proper time, and $\kappa$ the acceleration). One may also give a local interpretation, by viewing $\beta_s$ as the inverse temperature of the state for an observer at each point whose trajectory is tangent to the unique modular orbit through that point.
For these interpretations to make sense it is important that $\partial_s$ is a timelike vector. Indeed, it is easily seen that the flow (2.17), (2.18) gives negative sign for both $\partial_s u_s$ and $\partial_s v_s$, because the velocity field $V^O$ is positive inside the interval. Hence the tangent vector is past-directed timelike. This conforms with a general result, proven in more than 2 spacetime dimensions:
Proposition 4 ([32, Satz 6.5]). Let $\mathcal{A}(O)$ be a local algebra and $U_t$ a unitary 1-parameter group such that $U_t \mathcal{A}(Q) U_t^* = \mathcal{A}(f_t Q)$ where $f_t$ is an automorphism of $O$ taking double cones in $O$ to double cones. If there is a vector $\Phi$, cyclic and separating for $\mathcal{A}(O)$, such that $U_t A \Phi$ has an analytic continuation into a strip $-\beta < \mathrm{Im}\, t < 0$, then $-\partial_t (f_t x)|_{t=0} \in \overline{V_+}$. In particular, the flow of a geometric modular action is always past-directed null or timelike.
From (2.18), we get the proper time $(d\tau)^2 = du\, dv$ and hence the inverse temperature $\beta = \frac{d\tau}{ds}$ as a function of the position $x^\mu = (t, x)$:
$$\beta(t, x)^2 = \frac{du}{ds}\,\frac{dv}{ds} = 4\pi^2 \cdot V^O(t + x)\, V^O(t - x). \tag{2.25}$$
The temperature diverges on the boundaries of the double cone ($V^O(a_i) = V^O(b_i) = 0$), and is positive everywhere in its interior. For comparison with the ordinary Unruh effect, we also compute the acceleration in the momentarily comoving frame
$$\kappa = \left(-\frac{\partial^2 x^\mu}{\partial \tau^2}\,\frac{\partial^2 x_\mu}{\partial \tau^2}\right)^{1/2} = \frac{d^2x/dt^2}{(1 - (dx/dt)^2)^{3/2}} = \frac{u' v'' - u'' v'}{2 (u' v')^{3/2}}, \tag{2.26}$$
where the prime stands for $\partial_s$, and we have used $\frac{dx}{dt} = \frac{x'}{t'} = \frac{u' - v'}{u' + v'}$ and $\frac{d^2x}{dt^2} = \frac{(dx/dt)'}{t'} = 4\,\frac{u'' v' - u' v''}{(u' + v')^3}$. Thus
$$\kappa(t, x) = \left.\frac{(V^O)'(u) - (V^O)'(v)}{2\sqrt{V^O(u)\, V^O(v)}}\right|_{u = t + x,\; v = t - x} = \frac{(V^O)'(t + x) - (V^O)'(t - x)}{\pi^{-1} \beta(t, x)} \tag{2.27}$$
as a function of the position $(t, x)$. The product
$$\beta(t, x) \cdot \kappa(t, x) = \pi\, \partial_x \left(V^O(t + x) + V^O(t - x)\right) = \pi\, \partial_t \left(V^O(t + x) - V^O(t - x)\right) \tag{2.28}$$
Fig. 2. Influence of the boundary. Left: modular orbit of an arbitrary point in the symmetric double cone $O = \{(t, x) : A \le t + x \le B,\; -\frac{1}{A} \le t - x \le -\frac{1}{B}\}$. Right: a zoom on the modular orbit $(u_s, v_s)$ going through the center of the double cone. The plot represents the curve $(\tilde{u}_s, v_s)$ where $\tilde{u}_s = f \cdot (u_s - u_s^{\mathrm{diag}}) + u_s^{\mathrm{diag}}$, with $(u_s^{\mathrm{diag}}, v_s)$ the straight line joining the two tips of the double cone (a special vacuum modular orbit in the absence of the boundary), and $f = 100$ a zoom factor.
has the maximal value $2\pi$ (Unruh temperature) near the left and right edges of the double cone, and equals 0 along a timelike curve connecting the past and future tips. This curve is in general not itself a modular orbit.
In general, the modular orbits are not boost trajectories. However, the quantitative departure is very small. As an illustration, we display a true modular orbit, as well as a plot with one coordinate exaggerated by a zoom factor of 100 (Fig. 2). There exists however one distinguished modular orbit with a simple dynamics, namely the boost
$$u_s v_s = -1 \qquad \forall s \in \mathbb{R} \tag{2.29}$$
(in the symmetric case, for simplicity) which is a solution of (2.21) for const. = 1. It is the Lorentz boost of a wedge in $M_+$, whose edge lies on the boundary $x = 0$. The same is true also for non-symmetric intervals, although the formula (2.29) is more involved. Along this distinguished orbit the inverse temperature (2.25) simply reads
$$\beta = \left|\frac{\partial_s u_s}{u_s}\right| = \left|\frac{d}{ds} \ln u_s\right|. \tag{2.30}$$
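The relations (2.25)–(2.27) and the boost orbit (2.29) lend themselves to a quick numerical consistency check in the symmetric case. The following is only a sketch: it takes $V^O$ from (2.17), approximates its derivative by a central difference (an implementation choice, not part of the text), and uses $u' = -2\pi V^O(u)$, $u'' = -2\pi (V^O)'(u)\,u'$. Function names and sample points are hypothetical.

```python
import math


def V(u, a, b):
    """Velocity profile V^O of (2.17) for the symmetric double cone."""
    num = (u - a) * (a * u + 1) * (u - b) * (b * u + 1)
    return -num / ((b - a) * (1 + a * b) * (1 + u * u))


def dV(u, a, b, eps=1e-6):
    """Central-difference approximation of dV^O/du."""
    return (V(u + eps, a, b) - V(u - eps, a, b)) / (2 * eps)


def beta(u, v, a, b):
    """Inverse temperature (2.25) at light-cone coordinates (u, v)."""
    return 2 * math.pi * math.sqrt(V(u, a, b) * V(v, a, b))


def kappa_from_orbit(u, v, a, b):
    """Acceleration from the last expression in (2.26)."""
    up = -2 * math.pi * V(u, a, b)
    vp = -2 * math.pi * V(v, a, b)
    upp = -2 * math.pi * dV(u, a, b) * up
    vpp = -2 * math.pi * dV(v, a, b) * vp
    return (up * vpp - upp * vp) / (2 * (up * vp) ** 1.5)


def kappa_closed_form(u, v, a, b):
    """Acceleration from (2.27)."""
    return (dV(u, a, b) - dV(v, a, b)) / (2 * math.sqrt(V(u, a, b) * V(v, a, b)))


if __name__ == "__main__":
    a, b = 1.0, 2.0
    u, v = 1.4, -0.8  # u in (a, b), v in (-1/a, -1/b)
    print(kappa_from_orbit(u, v, a, b), kappa_closed_form(u, v, a, b))
    # On the boost orbit v = -1/u one has V(-1/u) = V(u)/u^2,
    # so (2.25) reduces to beta = 2*pi*V(u)/u.
    print(beta(u, -1 / u, a, b), 2 * math.pi * V(u, a, b) / u)
```

The second comparison reflects the identity $V^O(-1/u) = V^O(u)/u^2$, which is also what keeps $u_s v_s = -1$ invariant under the flow.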
One can express the proper time $\tau$ of the observer following the boost as a function of the modular parameter,
$$\tau(s) = \ln u_s - \ln u_0, \tag{2.31}$$
hence $\beta(\tau) = 2\pi\, \frac{V^O(u_0 e^\tau)}{u_0 e^\tau}$. Choosing $u_0 = 1$, one can write the inverse temperature as a function of the proper time in the form
$$\beta(\tau) = 2\pi\, \frac{(\sinh(\tau_{\max}) - \sinh(\tau)) \cdot (\sinh(\tau) - \sinh(\tau_{\min}))}{(\sinh(\tau_{\max}) - \sinh(\tau_{\min})) \cdot \cosh(\tau)}, \tag{2.32}$$
where $\tau_{\min}$ and $\tau_{\max}$ are functions of the coordinates of the double cone. As for double cones in Minkowski space [28], the temperature is infinite at the tips of the double cone ($\tau = \tau_{\min}$ or $\tau = \tau_{\max}$) and reaches its minimum in the middle of the observer's “lifetime”. Unfortunately, for generic orbits we have no closed formula for the temperature as a function of the proper time, so as to compare with the “plateau behavior” (constant temperature for most of the “lifetime”) as in [28], that occurs in CFT without boundary for vacuum modular orbits close to the edges of the double cone.

3. The Vacuum Modular Flow

Casini and Huerta [9] recently found that the vacuum modular group for the algebra of a free Fermi field in the union of $n$ disjoint intervals $(a_k, b_k) \subset \mathbb{R}$ is given by the formula
$$\sqrt{\frac{dx_j}{d\zeta}}\cdot\sigma_t(\psi(x_j)) = \sum_k O_{jk}(t)\,\sqrt{\frac{dx_k(t)}{d\zeta}}\cdot\psi(x_k(t)). \tag{3.1}$$
Here,
$$e^{\zeta(x)} = -\prod_k \frac{x - a_k}{x - b_k} \tag{3.2}$$
defines a uniformization function $\zeta$ that maps each interval $(a_k, b_k)$ onto $\mathbb{R}$, and $e^{\zeta} \in \mathbb{R}_+$ has $n$ pre-images $x_k = x_k(\zeta)$, one in each interval, i.e. $-\prod_l \frac{x_k(\zeta) - a_l}{x_k(\zeta) - b_l} = e^{\zeta}$. The geometric modular flow is given by$^{b}$
$$\zeta(t) = \zeta_0 - 2\pi t, \tag{3.3}$$
i.e. a separate flow $x_k(t) = x_k(\zeta - 2\pi t)$ in each interval. The orthogonal matrix $O$ yields a “mixing” of the fields on the different trajectories $x_i(t)$, and is determined by the differential equation
$$\dot O(t) = K(t)\,O(t) \tag{3.4}$$

$^{b}$ In [9], the notation is different: the authors “counter” the flow so that the position of $\sigma_t(\psi(x_j(\zeta + 2\pi t)))$ remains constant, except for the mixing.
14:17 WSPC/S0129-055X
148-RMP
J070-00397
Geometric Modular Action for Disjoint Intervals
343
where $K_{jj}(t) = 0$ and
$$K_{jk}(t) = 2\pi\,\frac{\sqrt{\frac{dx_j(t)}{d\zeta}\,\frac{dx_k(t)}{d\zeta}}}{x_j(t) - x_k(t)} \qquad (j \neq k). \tag{3.5}$$
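The uniformization function (3.2) and the geometric flow (3.3) can be explored numerically. The following is a minimal sketch, not taken from [9] or from this paper: the three intervals are hypothetical example data, and the pre-images $x_k(\zeta)$ are found by plain bisection, relying on the fact stated above that $\zeta$ maps each interval monotonically onto $\mathbb{R}$.

```python
import math

# Hypothetical example data: three disjoint intervals (a_k, b_k) on the real line.
INTERVALS = [(-3.0, -2.0), (-1.0, 1.0), (2.0, 5.0)]

def zeta(x, intervals=INTERVALS):
    """Uniformization function (3.2): e^zeta = -prod_k (x - a_k)/(x - b_k)."""
    prod = 1.0
    for a, b in intervals:
        prod *= (x - a) / (x - b)
    # Inside any one interval exactly one factor is negative, so -prod > 0.
    return math.log(-prod)

def preimage(z, k, intervals=INTERVALS, iters=200):
    """Bisection for the unique x_k in (a_k, b_k) with zeta(x_k) = z."""
    lo, hi = intervals[k]
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if zeta(mid, intervals) < z:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def flow(x0, t, intervals=INTERVALS):
    """Geometric flow (3.3), zeta(t) = zeta_0 - 2*pi*t: one point per interval."""
    z_t = zeta(x0, intervals) - 2.0 * math.pi * t
    return [preimage(z_t, k, intervals) for k in range(len(intervals))]
```

Calling `flow(0.3, 0.1)` returns one flowed point in each of the three intervals; the mixing matrix $O(t)$ is not modelled here.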
Remark. The mixing is a “minimal” way to evade an absurd conclusion from Takesaki's Theorem ([32, Chap. IX, Theorem 4.2]): Without mixing the modular group would globally preserve the component interval subalgebras. Then, the Reeh–Schlieder property of the vacuum vector would imply that the $n$-interval algebra coincides with each of its component interval subalgebras.

Proposition 5. For $\bigcup_k (a_k, b_k) \subset \mathbb{R}$ the Cayley transform of a symmetric $n$-interval $E = \sqrt[n]{I} \subset S^1 \setminus \{-1\}$, the geometric part (3.3) of the flow (without mixing) is the same as (2.5).

Proof. We use variables $u_k = \frac{1+ia_k}{1-ia_k}$, $v_k = \frac{1+ib_k}{1-ib_k}$, $z = \frac{1+ix}{1-ix}$, and the identity $2i(x - a) = (1 - ix)(1 - ia)(z - u)$. Then
$$e^{\zeta} = -\prod_k \frac{x - a_k}{x - b_k} = \text{const.}\cdot\prod_k \frac{z - u_k}{z - v_k} = \text{const.}\cdot\frac{z^n - U}{z^n - V} \tag{3.6}$$
where $U = u_k^n$, $V = v_k^n$ such that $I = (U, V) \subset S^1$. Therefore, the flow (3.3) is equivalent to
$$\frac{z(t)^n - U}{z(t)^n - V} = e^{-2\pi t}\cdot\frac{z^n - U}{z^n - V}, \tag{3.7}$$
which in turn is easily seen to be equivalent to (2.5). Keep in mind, however, that the modular group of the product state in Sec. 2.1 does not “mix” the intervals $(a_k, b_k)$. Since every 2-interval is a Möbius transform of a symmetric 2-interval, the statement of Proposition 5 is also true for general 2-intervals, with the flow (2.20).

3.1. Verification of the KMS condition

The authors of [9] have obtained the flow (3.1) using formal manipulations. We shall establish the KMS property of the vacuum state for this flow. Because this property distinguishes the modular group [32, Chap. VIII, Theorem 1.2], we obtain an independent proof of the claim. We take $\bigcup_k (a_k, b_k) \subset \mathbb{R}$ the Cayley transform of a symmetric $n$-interval $E = \sqrt[n]{I} \subset \dot S^1$. We first solve the differential equation (3.4) for the mixing.
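With angular variables $x = \tan\frac{\xi}{2}$, and $\pi > \xi_0 > \xi_1 > \cdots > \xi_{n-1} > -\pi$, the non-diagonal elements of the matrix $K$ can be written as
$$K_{kl}(t) = 2\pi\cdot\frac{\sqrt{\frac{dx_k(t)}{d\xi_k(t)}\,\frac{dx_l(t)}{d\xi_l(t)}}}{x_k(t) - x_l(t)}\,\sqrt{\frac{d\xi_k(t)}{d\zeta}\,\frac{d\xi_l(t)}{d\zeta}} = 2\pi\cdot\frac{\sqrt{\frac{d\xi_k(t)}{d\zeta}\,\frac{d\xi_l(t)}{d\zeta}}}{2\sin\frac{\xi_k(t) - \xi_l(t)}{2}} \tag{3.8}$$
for $k \neq l$. For symmetric intervals, $\xi_k = \xi_0 - k\cdot\frac{2\pi}{n}$ and $\frac{d\xi_k}{d\zeta} = \frac{d\xi_0}{d\zeta} > 0$, hence
$$K_{kl}(t) = -2\pi\cdot\frac{\frac{d\xi_0(t)}{d\zeta}}{2\sin\frac{(k-l)\pi}{n}} = \Omega_{kl}\cdot\dot\xi_0(t), \qquad \Omega_{kl} = \frac{1}{2\sin\frac{(k-l)\pi}{n}}. \tag{3.9}$$
With the constant anti-symmetric matrix $\Omega = (\Omega_{kl})_{k,l=0}^{n-1}$, we obtain the orthogonal mixing matrix:

Corollary. The mixing matrix is given by
$$O(t) = e^{(\xi_0(t) - \xi_0(0))\cdot\Omega}. \tag{3.10}$$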
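Remark. The mixing matrix $O(t)$ always belongs to the same one-parameter subgroup of $SO(n)$, with generator $\Omega$. For $n = 2$, this is just
$$O(t) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \quad\text{with}\quad \theta(t) = \frac{1}{2}(\xi_0(t) - \xi_0). \tag{3.11}$$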
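The statements (3.9) to (3.11) are easy to test numerically. The sketch below is an illustration under stated assumptions, not part of the original argument: it builds $\Omega$ from (3.9), exponentiates it with a truncated Taylor series (a hypothetical helper `expm`, adequate only for small matrices of moderate norm), and returns $O = e^{\Delta\xi\cdot\Omega}$ for a given increment $\Delta\xi = \xi_0(t) - \xi_0(0)$.

```python
import math

def omega(n):
    """Generator (3.9): Omega_kl = 1 / (2 sin((k - l) pi / n)), zero on the diagonal."""
    return [[0.0 if k == l else 1.0 / (2.0 * math.sin((k - l) * math.pi / n))
             for l in range(n)] for k in range(n)]

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def expm(a, terms=60):
    """Matrix exponential via a truncated Taylor series (illustrative helper)."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for p in range(1, terms):
        term = [[x / p for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def mixing_matrix(n, delta_xi):
    """O = exp(delta_xi * Omega), with delta_xi = xi_0(t) - xi_0(0) as in (3.10)."""
    return expm([[delta_xi * x for x in row] for row in omega(n)])
```

For $n = 2$ the result is the rotation by $\theta = \Delta\xi/2$ of (3.11); for larger $n$ one can check orthogonality by multiplying `mixing_matrix(n, d)` with its transpose.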
If $E$ is not symmetric, the general formula is
$$\theta(t) = \arctan\frac{L x_0(t) - M}{\sqrt{LN - M^2}} - \arctan\frac{L x_0(0) - M}{\sqrt{LN - M^2}} \tag{3.12}$$
with notations as in (2.18).$^{c}$

$^{c}$ The authors of [9] also compute this angle, but misrepresent it as the arctan of the difference, rather than the difference of the arctan's.

Next, we compute the vacuum expectation values $\langle\sigma_t(\psi(x_i))\,\sigma_s(\psi(y_j))\rangle$ for $x_i \in I_i$, $y_j \in I_j$, using (3.1) and $\langle\psi(x)\psi(y)\rangle = \frac{-i}{x - y - i\varepsilon}$. Passing to angular variables $x \to \xi$, $y \to \eta$ by
$$\frac{\sqrt{dx}\,\sqrt{dy}}{x - y - i\varepsilon} = \frac{\sqrt{d\xi}\,\sqrt{d\eta}}{2\sin\frac{\xi - \eta - i\varepsilon}{2}}, \tag{3.13}$$
this gives
$$\langle\sigma_t(\psi(x_i))\,\sigma_s(\psi(y_j))\rangle = \sum_{kl} \big(e^{(\xi_0(t) - \xi_0)\cdot\Omega}\big)_{ik}\,\big(e^{(\eta_0(s) - \eta_0)\cdot\Omega}\big)_{jl}\cdot\frac{-i\,\sqrt{\frac{d\xi_k(t)}{dx_i}}\,\sqrt{\frac{d\eta_l(s)}{dy_j}}}{2\sin\frac{\xi_k(t) - \eta_l(s) - i\varepsilon}{2}}. \tag{3.14}$$
Notice that again $d\xi_k$, $d\eta_l$ in the square roots do not depend on $k$ and $l$. To perform the sums over $k$ and $l$, we need a couple of trigonometric identities:

Lemma. For $n \in \mathbb{N}$ and $k = 0, 1, \ldots, n-1$, let $\sin_k(\alpha) := \sin(\alpha - k\frac{\pi}{n})$. Then (sums and products always extending from 0 to $n - 1$):
(i) $\prod_k \sin_k(\alpha) = (-2)^{1-n}\sin(n\alpha)$.
(ii) For $j = 0, \ldots, n-1$ one has $\sum_{k:\,k \neq j} \cot((j - k)\frac{\pi}{n}) = 0$.
(iii) For $j = 0, \ldots, n-1$ one has
$$\sum_k \big(e^{2(\alpha - \beta)\Omega}\big)_{jk}\cdot\frac{1}{\sin_k(\alpha)} = \frac{\sin(n\beta)}{\sin(n\alpha)}\cdot\frac{1}{\sin_j(\beta)}. \tag{3.15}$$

Proof. (i) is just another way of writing $\prod_k (z - \omega_k) = z^n - 1$ where $\omega_k = e^{ik\frac{2\pi}{n}}$ are the $n$th roots of unity, and $z = e^{2i\alpha}$. Dividing (i) by $\sin_j(\alpha)$, taking the logarithm, and taking the derivative at $\alpha = 0$, yields (ii). For (iii), we have to show that the expression
$$(-2)^{1-n}\sin(n\alpha)\sum_k \big(e^{2\alpha\Omega}\big)_{jk}\cdot\frac{1}{\sin_k(\alpha)} = \sum_k \big(e^{2\alpha\Omega}\big)_{jk}\prod_{l:\,l \neq k}\sin_l(\alpha) \tag{3.16}$$
is independent of $\alpha$. Taking the derivative with respect to $\alpha$ and inserting (3.9), we have to show that
$$\sum_{k:\,k \neq j}\frac{1}{\sin((j - k)\frac{\pi}{n})}\cdot\prod_{l:\,l \neq k}\sin_l(\alpha) + \sum_{k:\,k \neq j}\cos\Big(\alpha - k\frac{\pi}{n}\Big)\cdot\prod_{l:\,l \neq j,k}\sin_l(\alpha) = 0. \tag{3.17}$$
Writing $\cos(\alpha - k\frac{\pi}{n}) = (\sin_k(\alpha)\cos((j - k)\frac{\pi}{n}) - \sin_j(\alpha))/\sin((j - k)\frac{\pi}{n})$, this sufficient condition reduces to the identity (ii).
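The three identities of the lemma lend themselves to a quick numerical sanity check. The following sketch is only an illustration, not part of the proof: the function names are hypothetical, $\Omega$ is taken from (3.9), and the matrix exponential is again a truncated Taylor series that is adequate only for small matrices of moderate norm.

```python
import math

def sin_k(alpha, k, n):
    """sin_k(alpha) := sin(alpha - k*pi/n), as in the lemma."""
    return math.sin(alpha - k * math.pi / n)

def identity_i(alpha, n):
    """Return both sides of (i): prod_k sin_k(alpha) and (-2)^(1-n) sin(n alpha)."""
    lhs = math.prod(sin_k(alpha, k, n) for k in range(n))
    rhs = (-2.0) ** (1 - n) * math.sin(n * alpha)
    return lhs, rhs

def identity_ii(j, n):
    """Left-hand side of (ii): sum over k != j of cot((j - k) pi / n)."""
    return sum(1.0 / math.tan((j - k) * math.pi / n) for k in range(n) if k != j)

def expm(a, terms=60):
    """Matrix exponential via a truncated Taylor series (illustrative helper)."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for p in range(1, terms):
        term = [[sum(term[i][m] * a[m][j] for m in range(n)) / p
                 for j in range(n)] for i in range(n)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def identity_iii(alpha, beta, j, n):
    """Return both sides of (3.15), using 2*Omega_kl = 1/sin((k - l) pi / n)."""
    gen = [[0.0 if k == l else (alpha - beta) / math.sin((k - l) * math.pi / n)
            for l in range(n)] for k in range(n)]
    e = expm(gen)
    lhs = sum(e[j][k] / sin_k(alpha, k, n) for k in range(n))
    rhs = math.sin(n * beta) / math.sin(n * alpha) / sin_k(beta, j, n)
    return lhs, rhs
```

For instance, `identity_iii(0.7, 0.25, j, 3)` returns two numbers that agree to rounding accuracy for each `j` in `range(3)`, provided none of the sines in the denominators vanishes.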
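Using (3.15) with $2\alpha = \xi_0(t) - \eta_l(s)$ and $2\beta = \xi_0 - \eta_l(s)$ in the expression (3.14), and once again with $2\alpha = \eta_0(s) - \xi_0$ and $2\beta = \eta_0 - \xi_0$, we get
$$\langle\sigma_t(\psi(x_i))\,\sigma_s(\psi(y_j))\rangle = \frac{-i\,\sin\big(n\,\frac{\xi_0 - \eta_0 - i\varepsilon}{2}\big)}{2\sin\frac{\xi_i - \eta_j - i\varepsilon}{2}}\cdot\frac{\sqrt{\frac{d\xi_0(t)}{dx_i}}\,\sqrt{\frac{d\eta_0(s)}{dy_j}}}{\sin\big(n\,\frac{\xi_0(t) - \eta_0(s) - i\varepsilon}{2}\big)}. \tag{3.18}$$
We exhibit the $t$- and $s$-dependent terms:
$$\frac{\sqrt{d\xi_0(t)}\,\sqrt{d\eta_0(s)}}{2\sin\frac{n\xi_0(t) - n\eta_0(s) - i\varepsilon}{2}} = \frac{\sqrt{d\Xi_0(t)}\,\sqrt{dH_0(s)}}{2n\,\sin\frac{\Xi_0(t) - H_0(s) - i\varepsilon}{2}} = \frac{1}{n}\,\frac{\sqrt{dX(t)}\,\sqrt{dY(s)}}{X(t) - Y(s) - i\varepsilon}. \tag{3.19}$$
The first equality is the invariance of the 2-point function under a Möbius transformation $\mu$ mapping $I$ to $S^1_+$, such that for $z = e^{i\xi} \in E$ and $w = e^{i\eta} \in E$ we get $\mu(z^n) = e^{i\Xi} = \frac{1+iX}{1-iX} \in S^1_+$ and $\mu(w^n) = e^{iH} = \frac{1+iY}{1-iY} \in S^1_+$ with $X, Y \in \mathbb{R}_+$; the second equality is again (3.13) for the inverse transformation $\Xi \to X$, $H \to Y$. By Proposition 5, the flow on $\mathbb{R}_+$ is just $X(t) = e^{-2\pi t}\cdot X$, giving
$$\langle\sigma_t(\psi(x_i))\,\sigma_s(\psi(y_j))\rangle = \frac{e^{-\pi(t+s)}}{e^{-2\pi t}X - e^{-2\pi s}Y - i\varepsilon}\cdot f(x_i, y_j). \tag{3.20}$$
This expression manifestly satisfies the KMS condition in the form
$$\langle\psi(x)\,\sigma_{-i/2}(\psi(y))\rangle = \langle\psi(y)\,\sigma_{-i/2}(\psi(x))\rangle. \tag{3.21}$$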
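To see why (3.20) satisfies (3.21), one can evaluate the $t$- and $s$-dependent factor at $t = 0$, $s = -i/2$. The sketch below does this with complex arithmetic. It is an illustration only: the prefactor $f(x_i, y_j)$ is left out and assumed to be symmetric under the exchange of its arguments, and $\varepsilon$ is set to zero, which is harmless here because $X + Y > 0$.

```python
import cmath
import math

def modular_factor(t, s, X, Y, eps=0.0):
    """The t,s-dependent factor of (3.20); the prefactor f(x_i, y_j) is omitted."""
    numerator = cmath.exp(-math.pi * (t + s))
    denominator = (cmath.exp(-2.0 * math.pi * t) * X
                   - cmath.exp(-2.0 * math.pi * s) * Y
                   - 1j * eps)
    return numerator / denominator

def kms_sides(X, Y):
    """Both sides of (3.21), up to the (assumed symmetric) prefactor f."""
    left = modular_factor(0.0, -0.5j, X, Y)   # <psi(x) sigma_{-i/2}(psi(y))>
    right = modular_factor(0.0, -0.5j, Y, X)  # <psi(y) sigma_{-i/2}(psi(x))>
    return left, right
```

At $s = -i/2$ one has $e^{-\pi s} = i$ and $e^{-2\pi s} = -1$, so both sides reduce to $i/(X + Y)$, which is symmetric in $X$ and $Y$.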
We conclude that the KMS condition holds for the Casini–Huerta flow for symmetric $n$-intervals:

Corollary. For symmetric $n$-intervals $E = \sqrt[n]{I}$, (3.1) is the modular automorphism group of the algebra $A(E)$ with respect to the vacuum state.

Proof. Smearing with test functions of appropriate support, the KMS property holds for bounded generators of the CAR algebra $A(E)$. Because $\psi$ is a free field, the KMS property of the 2-point function in the vacuum extends to the KMS property of the corresponding quasifree (i.e. Fock) state of the CAR algebra.

Remark. It is quite remarkable that by virtue of the mixing, through the identity (ii) of the lemma, the ratio of the modular vacuum correlation functions
$$\frac{\langle\sigma^{(n)}_t(\psi(x_i))\,\sigma^{(n)}_s(\psi(y_j))\rangle}{\langle\sigma^{(1)}_t(\psi(X))\,\sigma^{(1)}_s(\psi(Y))\rangle} \tag{3.22}$$
is independent of the modular parameters $t$, $s$. Here, in the numerator $\sigma^{(n)}$ is the modular group for a symmetric $n$-interval $\subset \mathbb{R}$, and in the denominator $\sigma^{(1)}$ is the modular group for the 1-interval $\mathbb{R}_+$.

3.2. Product states for general n-intervals

With hindsight from [9], we can generalize to non-symmetric $n$-intervals the model-independent construction of a product state, as in Sec. 2.1, by replacing the function $z \to z^n$ as follows. If $C$ stands for the Cayley transformation $x \to z = \frac{1+ix}{1-ix}$, and $\bigcup_k (a_k, b_k) \subset \mathbb{R}$ the pre-image of a symmetric $n$-interval $E = \sqrt[n]{I}$, then $U = C(a_k)^n \in S^1$ and $V = C(b_k)^n \in S^1$ do not depend on $k$. One computes the uniformization function (3.2) in this case to be given by
$$e^{\zeta} = C^{-1}\circ\mu\circ(z \to z^n)\circ C(x) \tag{3.23}$$
where $\mu : S^1 \to S^1$ is the Möbius transformation $Z \to C\Big(\frac{(-1)^n - V}{(-1)^n - U}\cdot\frac{Z - U}{V - Z}\Big)$, that takes $I$ to $S^1_+$. For a general $n$-interval $E = \bigcup_k I_k \subset \dot S^1$, one may choose $\mu$ an arbitrary Möbius transformation, and replace $z \to z^n$ by the function
$$g(z) := \mu^{-1}\circ C\circ e^{\zeta}\circ C^{-1}, \tag{3.24}$$
where $\zeta$ is the uniformization function (3.2). Thus, $g$ maps each component $I_k$ onto the same interval $I = \mu^{-1}(S^1_+)$, i.e. we have $E = g^{-1}(I)$. Repeating the construction of Proposition 2 with factor states $\varphi_k = \omega\circ\mathrm{Ad}\,U(\gamma_k)$, where the diffeomorphisms $\gamma_k$ coincide with $g$ on $I_k$, one obtains a product state with the geometric modular flow
$$f_t(z) = g^{-1}\big(\Lambda_I(-2\pi t)\,g(z)\big), \tag{3.25}$$
instead of (2.5). By construction, this flow corresponds to ζ(t) = ζ(0) − 2πt as before, which in turn coincides with the geometric part of the vacuum modular flow (3.1).
3.3. Lessons from the free Fermi model

Charge splitting

It is tempting to ask whether, and in which precise sense, the free Fermi field result extends also to the free Bose case. (The authors of [9] are positive about this, but did not present a proof.) In the chiral situation, the free Bose net $A(I)$ (the current algebra with central charge $c = 1$) is given by the neutral subalgebras of the complex free Fermi net $F(I)$. Because the vacuum state is invariant under the charge transformation, there is a vacuum-preserving conditional expectation $\varepsilon : F(I) \to A(I)$, implying that the vacuum modular group of $F(E)$ restricts to the vacuum modular group of $C(E) := \varepsilon(F(E))$. We have
$$\begin{array}{ccccc} & & F(E) & & \\ & & \varepsilon\downarrow & & \\ A(E) & \subset & C(E) & \subset & \hat A(E), \end{array} \tag{3.26}$$
where both inclusions are strict: $C(E)$ contains neutral products of integer charged elements of $F(I_k)$ in different component intervals, which do not belong to $A(E)$, while $\hat A(E)$ contains “charge transporters” [6, 22] for the continuum of superselection sectors of the current algebra with central charge $c = 1$, which do not belong to $C(E)$.

Being the restriction of the vacuum modular group of $F(E)$, the action of the vacuum modular group of $C(E)$ can be directly read off. It acts geometrically, i.e. takes $C(F)$ to $C(f_t(F))$,$^{d}$ but it does not take $A(F)$ to $A(f_t(F))$, because the mixing takes a neutral product of two Fermi fields in one component $J_k$ of $F$ to a linear combination of neutral products of Fermi fields in different components $f_t(J_j)$, belonging to $C(f_t(F))$ but not to $A(f_t(F))$. Let us call this feature “charge splitting” (stronger than “mixing”).

The inclusion situation (3.26) does not permit one to determine the vacuum modular flow of $A(E)$ from that of $C(E)$, because there is no vacuum-preserving conditional expectation $C(E) \to A(E)$ that would imply that the modular group restricts. (Of course, this would be a contradiction, because we have already seen that the modular group of $F(E)$, and hence that of $C(E)$, does not preserve $A(E)$.) Similarly, we cannot conclude that the vacuum modular flow of $\hat A(E)$ should extend that of $C(E)$, or that of $A(E)$. Proposition 6 below actually shows that this scenario must be excluded.

Application to BCFT

It is instructive to discuss the consequence of the free Fermi field mixing and the ensuing charge splitting for $C(E)$ under the geometric re-interpretation of boundary CFT, as in Sec. 2.2. For definiteness and simplicity, we consider the case when $A$ is the even subnet of the real free Fermi net, i.e. $A$ is the Virasoro net with $c = \frac{1}{2}$. Unlike the $c = 1$ free Bose net, this model is completely rational.
The same considerations as in the previous argument apply also in this case: Again, the inclusions $A(E) \subset C(E) := \varepsilon(F(E)) \equiv F(E)^{\mathbb{Z}_2} \subset \hat A(E)$ are strict, the latter because charge transporters for the Ramond sector (weight $h = \frac{1}{16}$) do not belong to $C(E)$. The vacuum modular flow for $C(E)$ is induced by that for $F(E)$, but it does not pass to $A(E)$ or $\hat A(E)$.

$^{d}$ Here and below, $F \subset E$ always stands for an $n$-interval $F = \bigcup_k J_k$ where $J_k$ are the components of the pre-image of some interval under the function $\zeta$ (3.2), i.e. in the symmetric case, $F = \sqrt[n]{J}$ with $J \subset I$.
Let therefore $E \subset \dot S^1$ be 2-intervals and $O = I^{\mathbb{R}}_+ \times I^{\mathbb{R}}_- \subset M_+$ the associated double cones. The net
$$O \to C(O) = F(E)^{\mathbb{Z}_2} \tag{3.27}$$
is a BCFT net intermediate between the “minimal” net $A_+(O) = A(E)$ and the “maximal” (Haag dual) net $B_+(O) = \hat A(E)$, see [25]. It is generated by fields $\prod_{i=1}^{n}\psi(u_i)\prod_{j=1}^{m}\psi(v_j)$ with $n + m$ even, and $u_i$ smeared in $I^{\mathbb{R}}_+$, $v_j$ smeared in $I^{\mathbb{R}}_-$. The vacuum modular flow of $C(O)$ mixes $f_t u_i$ with $f_t u_i'$ and $f_t v_j$ with $f_t v_j'$, where $u \to u'$ and $v \to v'$ are the bijections of the two intervals onto each other connecting the two pre-images of the uniformization function $\zeta$. Hence, if $\psi(u)^n\psi(v)^m$ (in schematical notation) belongs to $C(Q)$ for a double cone $Q \subset O$, the vacuum modular flow takes it to linear combinations of
$$\psi(f_t u)^{n_1}\,\psi(f_t u')^{n_2}\,\psi(f_t v)^{m_1}\,\psi(f_t v')^{m_2} \tag{3.28}$$
with $n_1 + n_2 = n$, $m_1 + m_2 = m$. Grouping the charged factors to neutral (even) “bi-localized” products, these generators belong to the local algebra of 6 double cones $\bigvee_{\alpha=1}^{6} C(f_t Q_\alpha) \subset C(f_t \hat Q)$ around 6 points as indicated in Fig. 3.

Fig. 3. The 6 regions mixed by the vacuum modular flow in boundary CFT. $(u, v)$ is a point in $Q \subset O$. The boost is the distinguished orbit in $O$ as in Sec. 2.5, and defines $u' = -\frac{1}{u}$ and $v' = -\frac{1}{v}$. If $(u, v)$ lies on the boost, then the points $(v, u')$ and $(v', u)$ lie on the boundary. Consequently, if a double cone $Q \subset O$ around $(u, v)$ intersects the distinguished orbit, then four of the 6 associated double cones $Q_\alpha$ merge with each other, while the other two touch the boundary and degenerate to left wedges. (The flow $f_t$ itself, as in Fig. 2, is suppressed.)

In spite of the fact that two of the 6 double cones $Q_\alpha$ lie outside $\hat Q$, the corresponding algebras $C(Q_\alpha)$ are contained in $C(\hat Q)$. But their bi-localized generators, such as $\psi(u)\psi(v')$, cannot be associated with points in $\hat Q$, because on the boundary they are localized in the entire interval $J_+$ spanned by $u$ and $v'$ [26, Sec. 2], hence belong to $\bigvee_{J_-} C(J_+ \times J_-) \subset C(\hat Q)$. Therefore, in the geometric re-interpretation of boundary CFT, the discrete mixing (charge splitting) on top of the geometric modular action induces a truly “fuzzy” action on BCFT algebras associated with double cones $Q \subset O$! The fuzziness seems, however, not to be described by a pseudo differential operator, as suggested in [30, 29], but rather reflects the nonlocality of an operator product expansion for bi-localized fields.

3.4. Preliminaries for a general theory

Also in the general case of a local chiral net $A$, there is a notion of “charge splitting”: Superselection sectors are described by DHR endomorphisms of the local net, which are localized in some interval [12, 13]. Intertwiners that change the interval of localization (charge transporters) are observables, i.e. they do not carry a charge themselves, but they may be regarded as operators that annihilate a charge in one interval and create the same charge in another interval. These charge transporters do not belong to $A(E)$ (where the 2-interval $E$ is the union of the two intervals), but together with $A(E)$ generate $\hat A(E)$, see the discussion in [22, Sec. 5].

Therefore, one may speculate whether the combination of geometric action with charge splitting could be a general feature for the vacuum modular group of suitable $n$-interval algebras intermediate between $A(E)$ and $\hat A(E)$, i.e. the modular group does not preserve the subalgebras $A(F)$, let alone the algebras of the component intervals $A(J_k)$. The discussion of the algebras $A(E) \subset C(E) \subset \hat A(E)$ in the preceding subsection shows that there cannot be a simple general answer. Nevertheless, we can derive a few first general results.

Proposition 6. Let $\Phi \in H$ be a joint cyclic and separating vector for $A(E)$ and $A(E')$, e.g., the vacuum.
(i) If the modular automorphism group of $(\hat A(E), \Phi)$ globally preserves the subalgebra $A(E)$, then $A(E) = \hat A(E)$.
(ii) If the adjoint action of the modular unitaries $\Delta^{it}$ for $(A(E), \Phi)$ globally preserves $\hat A(E)$, or, equivalently, $A(E')$, then $A(E) = \hat A(E)$.

Proof. By assumption, $\Phi$ is also cyclic and separating for $\hat A(E) = A(E')'$ and $\hat A(E') = A(E)'$. Then (i) follows directly by Takesaki's Theorem [32, Chap. IX, Theorem 4.2]. For (ii), note that $\Delta^{it}$ preserves $A(E')$ if and only if it preserves $A(E')' = \hat A(E)$; and $\Delta^{-it}$ implements the modular automorphism group for $(A(E)' = \hat A(E'), \Phi)$. Thus, the statement is equivalent to (i), with $E$ replaced by $E'$.
The obvious relevance of Proposition 6(ii) is that in the generic case when $\hat A(E)$ is strictly larger than $A(E)$, there can be no vector state satisfying the Reeh–Schlieder property such that $A(E)$ has geometric modular action on $A(E)$ and on $A(E')$. In particular, the modular unitaries will not belong to the diffeomorphism group, but we may expect that Connes spatial derivatives as in Proposition 2 do. Recall that we have already seen (in the Remark after (3.4)) that mixing necessarily occurs. By Proposition 6(i), it is not possible that $\hat A(E)$ has geometric modular action without charge splitting.

4. Loose Ends

We have put into relation and contrasted the two facts that (i) in diffeomorphism covariant conformal quantum field theory there is a construction of states on the von Neumann algebras of local observables associated with disconnected unions of $n$ intervals ($n$-intervals), such that the modular group acts by diffeomorphisms of the intervals [21], and (ii) in the theory of free chiral Fermi fields, the modular action of the vacuum state on $n$-interval algebras is given by a combination of a geometric flow with a “mixing” among the intervals [9].

The absence of the mixing in (i) can be ascribed to the choice of “product” states in which quantum correlations across different intervals are suppressed. (In the reinterpretation of 2-interval algebras as double cone algebras in boundary conformal field theory [25], the influence of the boundary was shown to weaken, as expected on physical grounds, in the limit when the double cone is far away from the boundary [26]. Indeed, it can be seen from the formula (3.12) for the mixing angle that in this limit the mixing in (ii) also disappears.) On the other hand, there is some freedom in the choice of product states, which allows one to deform the geometric modular flow within each of the intervals.
It comes therefore as a certain surprise that the geometric part of the vacuum modular flow in (ii) coincides with the purely geometric flow in the product states in (i), precisely when the latter are chosen in a “canonical” way (involving the simple function $z \to z^n$ on the circle, corresponding to $\nu = 1$ in (2.11), in the case of symmetric $n$-intervals, and the function $g$ (3.24) in the general case). This means that the relative Connes cocycle between the vacuum state and the “canonical” product state is just the mixing, while for all other product states, it will also involve a geometric component. Two circles of questions arise: First, is the geometric part of the vacuum flow specific for the free Fermi model, or is it universal? And if it is universal, what takes the place of the mixing in the general case? Putting aside some technical complications of the proof, the authors of [9] claim a universal behavior for free fields, while in this paper, we have given first indications how the geometric behavior should “propagate” to subtheories and
to field extensions, also strongly supporting the idea of a universal behavior. Insight from the theory of superselection sectors suggests that the mixing in the general case should be replaced by a “charge splitting”. On the other hand, Takesaki's Theorem poses obstructions against the idea that charge splitting on top of a geometric modular flow could be the general answer (Proposition 6).

Second, the notion of “canonical” ($\nu = 1$) in the above should be given a physical meaning, related to the absence of a geometric component in the Connes cocycle. In the free Fermi case, the geometric part of the modular Hamiltonian contains the stress-energy tensor $\sim\psi(x)\partial_x\psi(x)$, while the mixing part can be expressed in terms of $\psi(x_k)\psi(x_l)$ with $x_k$ and $x_l$ belonging to different intervals. The absence of derivatives suggests that the Connes cocycle is “more regular in the UV” in the case when the geometric parts coincide, than in the general case. The same should be true for the generalized product state constructed in Sec. 3.2. A precise formulation of this UV regularity is wanted.

Acknowledgments

We thank Jakob Yngvason for bringing to our attention the article of Casini and Huerta [9], and Horacio Casini for discussions about their work. We also thank the Erwin Schrödinger Institute (Vienna) for the hospitality at the “Operator Algebras and Conformal Field Theory” program, August–December 2008, where this work has been initiated. This work was supported in part by ERC Advanced Grant 227458 OACFT “Operator Algebras and Conformal Field Theory”, and by the EU network “Noncommutative Geometry” MRTN-CT-2006-0031962. R.L. is partially supported by PRIN-MIUR and GNAMPA-INDAM. P.M. and K.H.R. are supported in part by the German Research Foundation (Deutsche Forschungsgemeinschaft (DFG)) through the Institutional Strategy of the University of Göttingen.

References

[1] J. Bisognano and E. H. Wichmann, On the duality condition for quantum fields, J. Math. Phys. 17 (1976) 303–321.
[2] H.-J. Borchers, On revolutionizing QFT with modular theory, J. Math. Phys. 41 (2000) 3604–3673.
[3] H.-J. Borchers and J. Yngvason, Modular groups of quantum fields in thermal states, J. Math. Phys. 40 (1999) 601–624.
[4] R. Brunetti, D. Guido and R. Longo, The conformal spin and statistics theorem, Comm. Math. Phys. 156 (1993) 201–219.
[5] D. Buchholz, On the structure of local quantum fields with non-trivial interaction, in Proc. Intern. Conf. Operator Algebras, Ideals, and Their Applications in Physics, ed. H. Baumgärtel (Teubner, 1977), pp. 146–153.
[6] D. Buchholz, G. Mack and I. T. Todorov, The current algebra on the circle as a germ of local field theories, Nucl. Phys. B 5B (Proc. Suppl.) (1988) 20–56.
[7] D. Buchholz, O. Dreyer, M. Florig and S. J. Summers, Geometric modular action and spacetime symmetry groups, Rev. Math. Phys. 12 (2000) 475–560.
[8] D. Buchholz, I. Ojima and H. Roos, Thermodynamic properties of non-equilibrium states in quantum field theory, Ann. Phys. 297 (2002) 219–242.
[9] H. Casini and M. Huerta, Reduced density matrix and internal dynamics for multicomponent regions, Class. Quant. Grav. 26 (2009) 185005, 15 pp.
[10] A. Connes, On the spatial theory of von Neumann algebras, J. Funct. Anal. 35 (1980) 153–164.
[11] A. Connes and C. Rovelli, Von Neumann algebra automorphisms and time thermodynamics relation in general covariant quantum theories, Class. Quant. Grav. 11 (1994) 2899–2918.
[12] S. Doplicher, R. Haag and J. E. Roberts, Local observables and particle statistics, I, Comm. Math. Phys. 23 (1971) 199–230.
[13] S. Doplicher, R. Haag and J. E. Roberts, Local observables and particle statistics, II, Comm. Math. Phys. 35 (1974) 49–85.
[14] F. Figliolini and D. Guido, The Tomita operator for the free scalar field, Ann. Inst. Henri Poincaré Phys. Theor. 51 (1989) 419–435.
[15] E. E. Flanagan, Quantum inequalities in two-dimensional Minkowski spacetime, Phys. Rev. D 56 (1997) 4922–4926.
[16] D. Guido, R. Longo and H.-W. Wiesbrock, Extensions of conformal nets and superselection structures, Comm. Math. Phys. 192 (1998) 217–244.
[17] R. Haag, N. Hugenholtz and M. Winnink, On the equilibrium states in quantum statistical mechanics, Comm. Math. Phys. 5 (1967) 215–236.
[18] P. Hislop and R. Longo, Modular structure of the local algebras associated with the free massless scalar field theory, Comm. Math. Phys. 84 (1982) 71–85.
[19] M. Izumi and H. Kosaki, On a subfactor analogue of the second cohomology, Rev. Math. Phys. 14 (2002) 733–757.
[20] M. Izumi, R. Longo and S. Popa, A Galois correspondence for compact groups of automorphisms of von Neumann algebras with a generalization to Kac algebras, J. Funct. Anal. 155 (1998) 25–63.
[21] Y. Kawahigashi and R. Longo, Noncommutative spectral invariants and black hole entropy, Comm. Math. Phys. 257 (2005) 193–225.
[22] Y. Kawahigashi, R. Longo and M. Müger, Multi-interval subfactors and modularity of representations in conformal field theory, Comm. Math. Phys. 219 (2001) 631–669.
[23] R. Kähler and H.-W. Wiesbrock, Modular theory and the reconstruction of four-dimensional quantum field theories, J. Math. Phys. 42 (2001) 74–86.
[24] R. Longo and K.-H. Rehren, Nets of subfactors, Rev. Math. Phys. 7 (1995) 567–597.
[25] R. Longo and K.-H. Rehren, Local fields in boundary conformal QFT, Rev. Math. Phys. 16 (2004) 909–960.
[26] R. Longo and K.-H. Rehren, How to remove the boundary: An operator algebraic procedure, Comm. Math. Phys. 285 (2009) 1165–1182.
[27] R. Longo and F. Xu, Topological sectors and a dichotomy in conformal field theory, Comm. Math. Phys. 251 (2004) 321–364.
[28] P. Martinetti and C. Rovelli, Diamond's temperature: Unruh effect for bounded trajectories and thermal time hypothesis, Class. Quant. Grav. 20 (2003) 4919–4932.
[29] T. Saffary, On the generator of massive modular groups, Lett. Math. Phys. 77 (2006) 235–248.
[30] B. Schroer and H.-W. Wiesbrock, Modular theory and geometry, Rev. Math. Phys. 12 (2000) 139–158.
[31] G. Sewell, Relativity of temperature and the Hawking effect, Phys. Lett. A 79 (1980) 23–24.
[32] M. Takesaki, Theory of Operator Algebras, II, Springer Encyclopedia of Mathematical Sciences, Vol. 125 (Springer-Verlag, 2003).
[33] S. Trebels, Über die geometrische Wirkung modularer Automorphismen, PhD thesis, Göttingen (1997); (in German, see also [2, Chap. III.4]).
[34] W. G. Unruh, Notes on black-hole evaporation, Phys. Rev. D 14 (1976) 870–892.
May 11, J070-S0129055X10003941
2010 10:6 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 22, No. 4 (2010) 355–380 c World Scientific Publishing Company DOI: 10.1142/S0129055X10003941
SPECTRAL SHIFT FUNCTION FOR OPERATORS WITH CROSSED MAGNETIC AND ELECTRIC FIELDS
MOUEZ DIMASSI∗ and VESSELIN PETKOV†
∗Département de Mathématiques, Université Paris 13, 99, Avenue J.-B. Clément, 93430 Villetaneuse, France
[email protected]
†Université Bordeaux I, Institut de Mathématiques de Bordeaux, 351, Cours de la Libération, 33405 Talence, France
[email protected]
Received 19 August 2009
Revised 8 January 2010
We obtain a representation formula for the derivative of the spectral shift function $\xi(\lambda; B, \epsilon)$ related to the operators $H_0(B, \epsilon) = (D_x - By)^2 + D_y^2 + \epsilon x$ and $H(B, \epsilon) = H_0(B, \epsilon) + V(x, y)$, $B > 0$, $\epsilon > 0$. We establish a limiting absorption principle for $H(B, \epsilon)$ and an estimate $O(\epsilon^{n-2})$ for $\xi'(\lambda; B, \epsilon)$, provided $\lambda \notin \sigma(Q)$, where $Q = (D_x - By)^2 + D_y^2 + V(x, y)$.

Keywords: Magnetic potential; Stark operator; spectral shift function.

Mathematics Subject Classification 2010: 35P25, 35Q40

1. Introduction

Consider the two-dimensional Schrödinger operator with homogeneous magnetic and electric fields
$$H = H(B, \epsilon) = H_0(B, \epsilon) + V(x, y), \qquad D_x = -i\partial_x, \qquad D_y = -i\partial_y,$$
where $H_0 = H_0(B, \epsilon) = (D_x - By)^2 + D_y^2 + \epsilon x$. Here $B > 0$ and $\epsilon > 0$ are proportional to the strength of the homogeneous magnetic and electric fields. We assume that $V, \partial_x V \in C^0(\mathbb{R}^2; \mathbb{R}) \cap L^\infty(\mathbb{R}^2; \mathbb{R})$ and $V(x, y)$ satisfies the estimate
$$|V(x, y)| \le C(1 + |x|)^{-2-\delta}(1 + |y|)^{-1-\delta}, \qquad \delta > 0. \tag{1.1}$$
For $\epsilon \neq 0$ we have $\sigma_{\mathrm{ess}}(H_0(B, \epsilon)) = \sigma_{\mathrm{ess}}(H(B, \epsilon)) = \mathbb{R}$. On the other hand, for decreasing potentials $V$ we may have embedded eigenvalues $\lambda \in \mathbb{R}$ and this situation is completely different from that with $\epsilon = 0$ when the spectrum of $H(B, 0)$ is formed
by eigenvalues with finite multiplicities which may accumulate only to Landau levels $\lambda_n = (2n + 1)B$, $n \in \mathbb{N}$ (see [9, 13, 15] and the references cited there). The spectral properties of $H$ and the existence of resonances have been studied in [5, 7, 8] under the assumption that $V(x, y)$ admits a holomorphic extension in the $x$-variable into a domain $\Gamma_{\delta_0} = \{z \in \mathbb{C} : 0 \le |\mathrm{Im}\, z| \le \delta_0\}$. Moreover, without any assumption on the analyticity of $V(x, y)$ we show in Proposition 2 below that the operator $(H - z)^{-1} - (H_0 - z)^{-1}$ for $z \in \mathbb{C}$, $\mathrm{Im}\, z \neq 0$, is trace class and following the general setup [11, 20], we define the spectral shift function $\xi(\lambda) = \xi(\lambda; B, \epsilon)$ related to $H_0(B, \epsilon)$ and $H(B, \epsilon)$ by
$$\langle \xi', f \rangle = \mathrm{tr}(f(H) - f(H_0)), \qquad f \in C_0^\infty(\mathbb{R}).$$
By this formula $\xi(\lambda)$ is defined modulo a constant but for the analysis of the derivative $\xi'(\lambda)$ this is not important. Moreover, the above property of the resolvents and the Birman–Kuroda theorem imply $\sigma_{\mathrm{ac}}(H_0(B, \epsilon)) = \sigma_{\mathrm{ac}}(H(B, \epsilon)) = \mathbb{R}$. A representation of the derivative $\xi'(\lambda; B, \epsilon)$ has been obtained in [5] for strong magnetic fields $B \to +\infty$ under the assumption that $V(x, y)$ admits an analytic continuation in the $x$-direction. Moreover, the distribution of the resonances $z_j$ of the perturbed operator $H(B, \epsilon)$ has been examined in [5] and a Breit–Wigner representation of $\xi'(\lambda; B, \epsilon)$ involving the resonances $z_j$ was established.

In the literature there are many works concerning Schrödinger operators with magnetic fields ($\epsilon = 0$) but only a few dealing with magnetic and Stark potentials ($\epsilon \neq 0$) (see [5, 7, 8] and the references given there). It should be mentioned that the tools in [5, 7, 8] are related to the resonances of the perturbed problem and to define the resonances one supposes that the potential $V(x, y)$ has an analytic continuation in the $x$ variable. In this paper we consider the operator $H$ without any assumption on the analytic continuation of $V(x, y)$ and without the restriction $B \to +\infty$. Our purpose is to study $\xi'(\lambda; B, \epsilon)$ and the existence of embedded eigenvalues of $H$. To examine the behavior of the spectral shift function we need a representation of the derivative $\xi'(\lambda; B, \epsilon)$. The key point in this direction is the following

Theorem 1. Let $V, \partial_x V \in C^0(\mathbb{R}^2; \mathbb{R}) \cap L^\infty(\mathbb{R}^2; \mathbb{R})$ and let (1.1) hold for $V$ and $\partial_x V$. Then for every $f \in C_0^\infty(\mathbb{R})$ and $\epsilon \neq 0$ we have
$$\mathrm{tr}(f(H) - f(H_0)) = -\frac{1}{\epsilon}\,\mathrm{tr}(\partial_x V f(H)). \tag{1.2}$$
The formula (1.2) has been proved by Robert and Wang [18] for Stark Hamiltonians in absence of magnetic field ($B = 0$). In fact, the result in [18] says that
$$\xi'(\lambda; 0, \epsilon) = -\frac{1}{\epsilon}\int_{\mathbb{R}^2} \partial_x V\,\frac{\partial e}{\partial\lambda}(x, y, x, y; \lambda, 0, \epsilon)\,dx\,dy, \tag{1.3}$$
where $e(\cdot, \cdot; \lambda, 0, \epsilon)$ is the spectral function of $H(0, \epsilon)$.
The presence of magnetic field $B \neq 0$ and Stark potential leads to some serious difficulties. The operator $H$ is not elliptic for $|x| + |y| \to \infty$ and we have double characteristics. On the other hand, the commutator $[H, x]$ involves the term $(D_x - By)$ and it creates additional difficulties. The proof of Theorem 1 is long and technical. We are going to study the trace class properties of the operators $\psi(H \pm i)^{-N}$, $\partial_x\circ\psi(H \pm i)^{-N-1}$, $(H \pm i)\partial_x\circ\psi(H \pm i)^{-N-2}$ etc. for $N \ge 2$ and $\psi \in C_0^\infty(\mathbb{R}^2)$ (see Lemmas 1 and 2). Moreover, by an argument similar to that in [5, Proposition 2.1], we obtain estimates for the trace norms of the operators
$$(z - H)^{-1}V(z' - H)^{-1}, \qquad V(z - H)^{-1}(z' - H)^{-1}, \qquad z \notin \mathbb{R}, \quad z' \notin \mathbb{R}$$
and we apply an approximation argument. Notice that in [18] the spectral shift function is related to the trace of the time delay operator $T(\lambda)$ defined via the corresponding scattering matrix $S(\lambda)$ (see [17]). In contrast to [18], our proof is direct and neither $T(\lambda)$ nor $S(\lambda)$ corresponding to the operator $H(B, \epsilon)$ are used.

The second question examined in this work is the existence of embedded real eigenvalues and the limiting absorption principle for $H$. In the physical literature one conjectures that for $\epsilon \neq 0$ there are no embedded eigenvalues. We establish in Sec. 3 a weaker result saying that in any interval $[a, b]$ we may have at most a finite number of embedded eigenvalues with finite multiplicities. Under the assumption for analytic continuation of $V$ it was proved in [7] that for some finite interval $[\alpha(B, \epsilon), \beta(B, \epsilon)]$ there are no resonances $z$ of $H(B, \epsilon)$ with $\mathrm{Re}\, z \notin [\alpha(B, \epsilon), \beta(B, \epsilon)]$. Since the real resonances $z$ coincide with the eigenvalues of $H(B, \epsilon)$, we obtain some information for the embedded eigenvalues. On the other hand, exploiting the analytic continuation and the resonances we proved in [5] that for $B \to +\infty$ the real parts $\mathrm{Re}\, z_j$ of the resonances $z_j$ lie outside some neighborhoods of the Landau levels. Thus the Landau levels play a role in the distribution of the resonances. It is known that the spectrum of the operator $Q = (D_x - By)^2 + D_y^2 + V(x, y)$ with decreasing potential $V$ is formed by eigenvalues (see [9, 13, 15]). In this paper, we establish a limiting absorption principle for $\lambda \notin \sigma(Q)$. In particular, we show that there are no embedded eigenvalues outside $\sigma(Q)$. This agrees with the result in [5] obtained under the restrictions on the behavior of $V$ and $B \to +\infty$. On the other hand, the result of Proposition 3 and the estimates (4.3) have been established by Wang [19] for Stark operators with $B = 0$.

Following the results in Sec. 4 and the representation of $\xi'(\lambda; B, \epsilon)$ given in [5], it is natural to expect that for $\lambda \notin \sigma(Q)$ the derivative of the spectral shift function $\xi'(\lambda; B, \epsilon)$ must be bounded. In fact, we prove the following stronger result.

Theorem 2. Let the potential $V \in C^\infty(\mathbb{R}^2; \mathbb{R})$ satisfy with some $\delta > 0$ and $n \in \mathbb{N}$, $n \ge 2$ the estimates
$$|\partial_x^\alpha \partial_y^\beta V(x, y)| \le C_{\alpha,\beta}(1 + |x|)^{-n-\delta-|\alpha|}(1 + |y|)^{-2-\delta-|\beta|}, \qquad \forall\alpha, \quad \forall\beta. \tag{1.4}$$
Then for $\lambda_0 \notin \sigma(Q)$ we have
$$\xi'(\lambda; B, \epsilon) = O(\epsilon^{n-2}) \tag{1.5}$$
uniformly for $\lambda$ in a small neighborhood $\Xi \subset \mathbb{R}$ of $\lambda_0$.
May 11, J070-S0129055X10003941
358
2010 10:6 WSPC/S0129-055X
148-RMP
M. Dimassi & V. Petkov
The estimate (1.5) has been obtained in [18] in the absence of a magnetic field, $B = 0$ (for a Breit–Wigner formula see [10], [4] for Stark Hamiltonians and [5] for the operator $H(B, \epsilon)$). Our approach is quite different from that in [18]. Our proof does not use a representation similar to (1.3), which leads to complications connected with the behavior of the spectral function $e(\cdot, \cdot; \lambda, B, \epsilon)$ corresponding to $H(B, \epsilon)$. The formula (1.2) plays a crucial role and our analysis is based on a complex analysis argument combined with a representation of $f(H)$ involving the almost analytic continuation of $f \in C_0^\infty(\mathbb{R})$. In this direction, our argument is similar to that developed in [4, 5].

The plan of this paper is as follows. In Sec. 2, we establish Theorem 1. The embedded eigenvalues and Mourre estimates are examined in Sec. 3. In Sec. 4, we prove Proposition 3 concerning the limiting absorption principle for $H(B, \epsilon)$. Finally, in Sec. 5, we establish Theorem 2.

2. Representation of the Spectral Shift Function

Throughout this work we will use the notations of [3] for symbols and pseudodifferential operators. In particular, if $m : \mathbb{R}^4 \to [0, +\infty[$ is an order function (see [3, Definition 7.4]), we say that $a(z, \zeta) \in S^0(m)$ if for every $\alpha \in \mathbb{N}^4$ there exists $C_\alpha > 0$ such that
$$|\partial_{z,\zeta}^\alpha a(z, \zeta)| \leq C_\alpha m(z, \zeta).$$
In the special case when $m = 1$, we will write $S^0$ instead of $S^0(1)$. We will use the standard Weyl quantization of symbols. More precisely, if $p(z, \zeta)$, $(z, \zeta) \in \mathbb{R}^4$, is a symbol in $S^0(m)$, then $P^w(z, D_z)$ is the operator defined by
$$P^w(z, D_z)u(z) = (2\pi)^{-2} \iint e^{i(z - z')\cdot\zeta}\, p\Big(\frac{z + z'}{2}, \zeta\Big) u(z')\, dz'\, d\zeta, \quad u \in \mathcal{S}(\mathbb{R}^2).$$
We denote by $P^w(z, hD_z)$ the semiclassical quantization obtained as above by quantizing $p(z, h\zeta)$.

Our goal in this section is to prove Theorem 1. For this purpose we need some lemmas. We set
$$Q_0 = H_0 - \epsilon x = (D_x - By)^2 + D_y^2, \quad Q = Q_0 + V,$$
and in Lemma 1 we will use the notation $H_1 = H$. For simplicity we assume that $\epsilon = B = 1$. The general case can be covered by the same argument.

Lemma 1. Assume that $V, \partial_x V \in C^0(\mathbb{R}^2; \mathbb{R}) \cap L^\infty(\mathbb{R}^2; \mathbb{R})$ and let $\psi \in C_0^\infty(\mathbb{R}^2)$. Then for $N \geq 2$, $j = 0, 1$ and for $\operatorname{Im} z \neq 0$, the following operators are trace class:
(i) $\psi(H_j \pm i)^{-N}$, $\partial_x \circ \psi(H_j \pm i)^{-N-1}$, $(H_j \pm i)\partial_x \circ \psi(H_j \pm i)^{-N-2}$.
(ii) $(H_j \pm i)^{-N}\psi$, $(H_j \pm i)^{-N-1}\psi \cdot \partial_x$.
(iii) $\psi \circ \partial_x (H_j \pm i)^{-N-1}$, $(H_j \pm i)\psi \circ \partial_x (H_j \pm i)^{-N-2}$.
(iv) $(H_j \pm i)\partial_x (H_j \pm i)^{-N-2}\psi$.
(v) $(H_1 + i)\partial_x (H_1 + i)^{-N-1}(H_1 - z)^{-1}\psi$.
Moreover,
$$\|(H_1 + i)\partial_x (H_1 + i)^{-N-1}(H_1 - z)^{-1}\psi\|_{\mathrm{tr}} = O\Big(\frac{|z| + 1}{|\operatorname{Im} z|^2}\Big). \tag{2.1}$$
Proof. We will prove the lemma only for $(H_1 + i)$; the case concerning $(H_1 - i)$ is similar. On the other hand, the statements for $(H_0 + i)$ follow from those for $(H_1 + i)$ when $V = 0$. From the first resolvent equation, we obtain
$$\begin{aligned}
(H_1 + z)^{-1} &= (Q_0 + z)^{-1} - (Q_0 + z)^{-1}(x + V)(H_1 + z)^{-1} \\
&= (Q_0 + z)^{-1} + \sum_{j=1}^{N+2} (-1)^j (Q_0 + z)^{-1}\big((x + V)(Q_0 + z)^{-1}\big)^j \\
&\quad + (-1)^{N+3}\big((Q_0 + z)^{-1}(x + V)\big)^{N+3}(H_1 + z)^{-1}.
\end{aligned} \tag{2.2}$$
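As a sanity check on (2.2), the expansion can be recovered by iterating the resolvent identity. The sketch below uses the shorthand $R = (H_1 + z)^{-1}$, $R_0 = (Q_0 + z)^{-1}$ and $W = x + V$, with a generic truncation order $M$; this shorthand is introduced here only for the illustration.

```latex
\begin{align*}
R &= R_0 - R_0 W R \\
  &= R_0 - R_0 W R_0 + (R_0 W)^2 R \\
  &= \sum_{j=0}^{M} (-1)^j R_0 (W R_0)^j + (-1)^{M+1} (R_0 W)^{M+1} R ,
\end{align*}
% Each step substitutes R = R_0 - R_0 W R into the last term and uses
% (R_0 W)^{j} R_0 = R_0 (W R_0)^{j}.
```

Taking $M = N + 2$ reproduces the sum and the remainder term in (2.2).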
Taking $(N - 1)$ derivatives with respect to $z$ in the above identity and setting $z = i$, we see that $(H_1 + i)^{-N}$ is a linear combination of terms
$$K_N := (Q_0 + i)^{-j_1} W (Q_0 + i)^{-j_2} W \cdots (Q_0 + i)^{-j_r} W (H_1 + i)^{-p},$$
with $j_1 + \cdots + j_r \geq N$, $j_1 \geq 1$, $p \geq 0$ and $W(x) = x + V(x)$. Recall that if $P \in S^0(m)$ with $m \in L^1(\mathbb{R}^4)$ (respectively, $m \in L^2(\mathbb{R}^4)$), then the corresponding operator is trace class (respectively, Hilbert–Schmidt). By using this and the fact that the symbol of $(Q_0 + i)^{-1}$ is in $S^0(\langle \xi - y, \eta \rangle^{-2})$, we deduce that the operator
$$K^j_{l,p,l',p'} := \langle x \rangle^{-l} \langle y \rangle^{-p} (Q_0 + i)^{-j} \langle x \rangle^{l'} \langle y \rangle^{p'}$$
is trace class for $l - l', p - p' > 1$, $j \geq 2$ and Hilbert–Schmidt for $l - l', p - p' > 1/2$, $j \geq 1$. Next, we write $\psi K_N$ as follows:
$$\psi K_N = \psi \langle x \rangle^{3r} \langle y \rangle^{2r} K^{j_1}_{3r,2r,3r-2,2r-2} W \langle x \rangle^{-1} K^{j_2}_{3r-3,2r-2,3r-5,2r-4} W \langle x \rangle^{-1} \cdots K^{j_r}_{3,2,1,0} W \langle x \rangle^{-1} (H_1 + i)^{-p}. \tag{2.3}$$
Since $j_1 + j_2 + \cdots + j_r \geq N \geq 2$, in the above decomposition there are at least two Hilbert–Schmidt operators or one trace class operator. Combining this with the fact that $\psi \langle x \rangle^{3r} \langle y \rangle^{2r}$, $W \langle x \rangle^{-1}$ and $(H_1 + i)^{-p}$ are bounded from $L^2(\mathbb{R}^2)$ into $L^2(\mathbb{R}^2)$, we conclude that $\psi K_N$ is a trace class operator. Thus $\psi(H_1 + i)^{-N}$ is also a trace class operator. Repeating the same arguments, we obtain the proof for $\partial_x \circ \psi(H_j \pm i)^{-N-1}$. As above, to treat $(H_j \pm i)\partial_x \circ \psi(H_j \pm i)^{-N-2}$, it suffices to show that $(H_j \pm i)\partial_x \circ \psi K_N$ is trace class. If we have $j_1 \geq 2$, the proof is completely similar to that of $\psi(H_1 + i)^{-N}$. In the case where $j_1 = 1$, since $(H_1 + i)\partial_x (Q_0 + i)^{-1}$ is not bounded,
we have to exploit the following representation
$$(H_1 + i)\partial_x \circ \psi K_N = (H_1 + i)(\partial_x \psi)K_N + (H_1 + i)\psi(Q_0 + i)^{-1}\partial_x \circ W (Q_0 + i)^{-j_2} W \cdots (Q_0 + i)^{-j_r} W (H_1 + i)^{-p}.$$
Next we use the fact that $\partial_x W \in L^\infty$ and repeat the argument of the proof above. Recall that $A$ is trace class if and only if the adjoint operator $A^*$ is trace class. Consequently, (i) implies (ii). Since $\psi \cdot \partial_x = \partial_x \cdot \psi - (\partial_x \psi)$, the assertion (iii) follows from (i). To deal with (iv), we apply the following obvious identity with $z = -i$,
$$\partial_x (H - z)^{-1} = (H - z)^{-1}\partial_x + (H - z)^{-1}(1 + \partial_x V)(H - z)^{-1}, \tag{2.4}$$
and obtain
$$(H_1 + i)\partial_x (H_1 + i)^{-N}\psi = (H_1 + i)^{-N}\partial_x \psi + \sum_{j=0}^{N-1} (H_1 + i)^{-j}(1 + \partial_x V)(H_1 + i)^{-N+j}\psi. \tag{2.5}$$
Applying (i) and (ii) to each term on the right-hand side of (2.5), we get (iv). Now we pass to the proof of (v). Applying (2.4), we obtain
$$(H_1 + i)\partial_x (H_1 + i)^{-N-1}(H_1 - z)^{-1}\psi = (H_1 + i)(H_1 - z)^{-1}\partial_x (H_1 + i)^{-N-1}\psi + (H_1 + i)(H_1 - z)^{-1}(1 + \partial_x V)(H_1 - z)^{-1}(H_1 + i)^{-N}\psi.$$
Combining the above equation with (i), (ii), (iv) and using the estimate
$$\|(H_1 + i)(H_1 - z)^{-1}\| = O\Big(\frac{|z| + 1}{|\operatorname{Im} z|}\Big),$$
we get (2.1).

Lemma 2. Assume that $V(x, y) = \phi(x, y)W(x, y)$, where $\phi \in C_0^\infty(\mathbb{R}^2; \mathbb{R})$ and $W, \partial_x W \in C^0(\mathbb{R}^2; \mathbb{R}) \cap L^\infty(\mathbb{R}^2; \mathbb{R})$. Then for $N \geq 4$ the operator
$$(H + i)\partial_x \big[(H + i)^{-N} - (H_0 + i)^{-N}\big]$$
is trace class.

Proof. Taking $(N - 1)$ derivatives with respect to $z$ in the resolvent identity
$$(H + z)^{-1} - (H_0 + z)^{-1} = -(H + z)^{-1} V (H_0 + z)^{-1}$$
and setting $z = i$, we see that $(H + i)^{-N} - (H_0 + i)^{-N}$ is a linear combination of terms
$$(H + i)^{-j} V (H_0 + i)^{-(N+1-j)}$$
with $1 \leq j \leq N$. Composing the above terms with $(H + i)\partial_x$ and applying Lemma 1, we complete the proof.

Lemma 3. Assume that $V$ satisfies the assumptions of Lemma 1. Let $f \in C_0^\infty(\mathbb{R})$ and $\psi \in C_0^\infty(\mathbb{R}^2)$. Then the operators
$$\psi f(H_i), \quad H_i \psi \partial_x f(H_i), \quad \psi \partial_x H_i f(H_i)$$
are trace class and we have
$$\operatorname{tr}(H_i \psi \partial_x f(H_i)) = \operatorname{tr}(\psi \partial_x H_i f(H_i)).$$

Proof. Set $g(x) = (x + i)^4 f(x)$. Since $g(H_i)$ is bounded, it follows from Lemma 1 that the operators
$$\psi(H_i + i)^{-4} g(H_i), \quad H_i \psi \partial_x (H_i + i)^{-4} g(H_i), \quad \psi \partial_x (H_i + i)^{-4} H_i g(H_i)$$
are trace class, and the cyclicity of the trace yields
$$\begin{aligned}
\operatorname{tr}(H_i \psi \partial_x f(H_i)) &= \operatorname{tr}(H_i \psi \partial_x (H_i + i)^{-4} g(H_i)) = \operatorname{tr}(H_i g(H_i) \psi \partial_x (H_i + i)^{-4}) \\
&= \operatorname{tr}(\psi \partial_x (H_i + i)^{-4} g(H_i) H_i) = \operatorname{tr}(\psi \partial_x H_i f(H_i)).
\end{aligned}$$
Notice that in the above equalities we have used the fact that the operators $g(H_i)$, $H_i$ and $(H_i + i)^{-4}$ commute.

Lemma 4. Let $V$ be as in Lemma 2. Then for every $f \in C_0^\infty(\mathbb{R})$ the operators
$$f(H) - f(H_0), \quad \partial_x (f(H) - f(H_0)) \quad \text{and} \quad (H \pm i)\partial_x (f(H) - f(H_0))$$
are trace class.

Proof. Let $g(x) = (x + i)^4 f(x)$ be as above. We decompose
$$(H + i)\partial_x (f(H) - f(H_0)) = (H + i)\partial_x \big((H + i)^{-4} - (H_0 + i)^{-4}\big) g(H_0) + (H + i)\partial_x (H + i)^{-4}(g(H) - g(H_0)) = I + II.$$
According to Lemma 2, the operator $I$ is trace class. To treat $II$, we use the Helffer–Sjöstrand formula
$$\begin{aligned}
II &= -\frac{1}{\pi} \int \bar\partial \tilde g(z)\, (H + i)\partial_x (H + i)^{-4}\big((z - H)^{-1} - (z - H_0)^{-1}\big)\, L(dz) \\
&= -\frac{1}{\pi} \int \bar\partial \tilde g(z)\, (H + i)\partial_x (H + i)^{-4}(z - H)^{-1} V (z - H_0)^{-1}\, L(dz),
\end{aligned}$$
where $\tilde g(z) \in C_0^\infty(\mathbb{C})$ is an almost analytic continuation of $g$ such that $\bar\partial \tilde g(z) = O(|\operatorname{Im} z|^\infty)$, while $L(dz)$ is the Lebesgue measure on $\mathbb{C}$. Now applying Lemma 1(v), we see that the operator $(H + i)\partial_x (H + i)^{-4}(z - H)^{-1} V$
is trace class. Since $|z|$ is bounded on $\operatorname{supp} \tilde g$, we can apply (2.1) to the right-hand side of the above equation and, combining this with $\bar\partial \tilde g(z) = O(|\operatorname{Im} z|^\infty)$, we deduce that $II$ is trace class. Summing up, we conclude that $(H + i)\partial_x (f(H) - f(H_0))$ is trace class. The same argument works for $(H - i)\partial_x (f(H) - f(H_0))$. The proofs concerning $f(H) - f(H_0)$ and $\partial_x (f(H) - f(H_0))$ are similar and simpler.

To establish Theorem 1, we also need the following abstract result. For the reader's convenience, we present a proof.

Proposition 1. Let $A$ be a trace class operator on some Hilbert space $\mathcal{H}$ and let $\{K_n\}$ be a sequence of bounded linear operators which converges strongly to $K \in \mathcal{L}(\mathcal{H})$. Then
$$\lim_{n \to \infty} \|K_n A - K A\|_{\mathrm{tr}} = 0.$$
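Proof. First assume that $A$ is a finite rank operator having the form $A = \sum_{k=1}^m \langle \cdot, \psi_k \rangle \phi_k$, where $\psi_k, \phi_k \in \mathcal{H}$. Since
$$\|A\|_{\mathrm{tr}} \leq \sum_{k=1}^m \|\phi_k\| \|\psi_k\|,$$
we have
$$\|(K_n - K)A\|_{\mathrm{tr}} \leq \sum_{k=1}^m \|(K_n - K)\phi_k\| \|\psi_k\| \to 0, \quad n \to \infty. \tag{2.6}$$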
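The finite rank estimate (2.6) can be illustrated numerically. The following is a minimal sketch under assumptions that are not part of the paper: the Hilbert space is a finite truncation of $\ell^2$ of hypothetical size `M`, `K_n` is the projection onto the first `n` coordinates (so `K_n` tends to the identity strongly but not in norm as `M` grows), and `A` is the rank one operator built from two hypothetical sequences `phi` and `psi`. For a rank one operator the trace norm equals the product of the two vector norms, which is what the helper computes.

```python
import math


def norm(v):
    """Euclidean norm of a finite sequence."""
    return math.sqrt(sum(x * x for x in v))


def truncate(v, n):
    """K_n: keep the first n coordinates and zero the rest (an orthogonal projection)."""
    return [x if k < n else 0.0 for k, x in enumerate(v)]


def rank_one_trace_norm(phi, psi, n):
    """Trace norm of (K_n - I) A for the rank one operator A = <., psi> phi.

    A rank one operator u <., v> has trace norm ||u|| * ||v||,
    so the result is ||(K_n - I) phi|| * ||psi||, as in (2.6) with m = 1.
    """
    diff = [a - b for a, b in zip(truncate(phi, n), phi)]
    return norm(diff) * norm(psi)


M = 2000  # size of the finite truncation of l^2 (an assumption of this sketch)
phi = [1.0 / (k + 1) for k in range(M)]       # hypothetical square-summable sequence
psi = [1.0 / (k + 1) ** 2 for k in range(M)]  # hypothetical square-summable sequence

# The trace norms decrease as n grows, mirroring the limit in (2.6).
bounds = [rank_one_trace_norm(phi, psi, n) for n in (1, 10, 100, 1000)]
```

The values in `bounds` are governed by the tail of `phi` beyond index `n`, which is exactly the quantity $\|(K_n - K)\phi_k\|$ appearing in (2.6).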
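The general case can be covered by an approximation. Since $K_n$ converges strongly, it follows from the Banach–Steinhaus theorem that $\mu = \sup_n \|K_n\| < \infty$. Let $\eta$ be an arbitrary positive constant and let $A_\eta$ be a finite rank operator such that $\|A - A_\eta\|_{\mathrm{tr}} \leq \frac{\eta}{2\mu}$. We have
$$\|(K_n - K)A\|_{\mathrm{tr}} \leq \|(K_n - K)(A - A_\eta)\|_{\mathrm{tr}} + \|(K_n - K)A_\eta\|_{\mathrm{tr}} \leq \eta + \|(K_n - K)A_\eta\|_{\mathrm{tr}}.$$
Next we apply (2.6) for the finite rank operator $A_\eta$ and obtain
$$\limsup_{n \to \infty} \|(K_n - K)A\|_{\mathrm{tr}} \leq \eta,$$
which implies Proposition 1, since $\eta$ is arbitrary.

Proof of Theorem 1. Assume first that $V = \phi W$, where $\phi \in C_0^\infty(\mathbb{R}^2; \mathbb{R})$ and $W, \partial_x W \in C^0(\mathbb{R}^2; \mathbb{R}) \cap L^\infty(\mathbb{R}^2; \mathbb{R})$. Choose a function $\chi \in C_0^\infty(\mathbb{R}^2)$ such that $\chi = 1$ for $|(x, y)| \leq 1$. For $R > 0$ set
$$\chi_R(x, y) = \chi\Big(\frac{x}{R}, \frac{y}{R}\Big),$$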
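and introduce
$$B_R := [\chi_R \partial_x, H] f(H) - [\chi_R \partial_x, H_0] f(H_0).$$
Here $[A, B] = AB - BA$ denotes the commutator of $A$ and $B$. According to Lemma 3, we have
$$\operatorname{tr}([\chi_R \partial_x, H] f(H)) = \operatorname{tr}([\chi_R \partial_x, H_0] f(H_0)) = 0.$$
Thus
$$\operatorname{tr}(B_R) = 0. \tag{2.7}$$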
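The vanishing of these traces is a direct consequence of Lemma 3 with $\psi = \chi_R$. The step can be spelled out as follows, for $H_i$ equal to either $H$ or $H_0$ and using that $H_i$ commutes with $f(H_i)$:

```latex
\begin{align*}
[\chi_R \partial_x, H_i]\, f(H_i)
  &= \chi_R \partial_x H_i f(H_i) - H_i \chi_R \partial_x f(H_i), \\
\operatorname{tr}\big([\chi_R \partial_x, H_i]\, f(H_i)\big)
  &= \operatorname{tr}\big(\chi_R \partial_x H_i f(H_i)\big)
   - \operatorname{tr}\big(H_i \chi_R \partial_x f(H_i)\big) = 0 .
\end{align*}
% Both operators on the right are trace class by Lemma 3,
% and Lemma 3 states that their traces coincide.
```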
On the other hand, a simple calculation shows that
$$B_R = \chi_R \big([\partial_x, H] f(H) - [\partial_x, H_0] f(H_0)\big) + [\chi_R, H_0]\partial_x (f(H) - f(H_0)) := B_R^1 + B_R^2, \tag{2.8}$$
where we have used that $[\chi_R, H] = [\chi_R, H_0]$. Since $[\partial_x, H] = 1 + \partial_x V$ and $[\partial_x, H_0] = 1$, it follows from Lemma 3, Lemma 4 and Proposition 1 that
$$\lim_{R \to \infty} \operatorname{tr}(B_R^1) = \operatorname{tr}(f(H) - f(H_0)) + \operatorname{tr}(\partial_x V f(H)). \tag{2.9}$$
Next, we claim that
$$\lim_{R \to \infty} B_R^2 = 0. \tag{2.10}$$
Using that
$$[\chi_R, H_0] = -\frac{2}{R}(D_x \chi_R)(D_x - y) - \frac{2}{R}(D_y \chi_R) D_y + \frac{1}{R^2}(\Delta \chi_R),$$
we decompose $B_R^2$ as a sum of three terms $B_R^2 = I_R^1 + I_R^2 + I_R^3$, where
$$\begin{aligned}
I_R^1 &= -\frac{2}{R}(D_x \chi_R)(D_x - y)\partial_x (f(H) - f(H_0)), \\
I_R^2 &= -\frac{2}{R}(D_y \chi_R) D_y \partial_x (f(H) - f(H_0)), \\
I_R^3 &= \frac{1}{R^2}(\Delta \chi_R)\partial_x (f(H) - f(H_0)).
\end{aligned}$$
To treat $I_R^1$, we set $Q = H - x$ and write
$$\begin{aligned}
I_R^1 &= -\frac{2}{R}(D_x \chi_R)(D_x - y)(Q - i)^{-1}(H - i)\partial_x (f(H) - f(H_0)) \\
&\quad + \frac{2}{R}(D_x \chi_R)\big[(D_x - y)(Q - i)^{-1}, x\big]\partial_x (f(H) - f(H_0)) \\
&\quad + \frac{2}{R} x (D_x \chi_R)(D_x - y)(Q - i)^{-1}\partial_x (f(H) - f(H_0)).
\end{aligned}$$
The operators $[(D_x - y)(Q - i)^{-1}, x]$ and $(D_x - y)(Q - i)^{-1}$ are bounded, while $\partial_x (f(H) - f(H_0))$ and $(H - i)\partial_x (f(H) - f(H_0))$ are trace class operators
(see Lemma 4). On the other hand, $\frac{2}{R}(D_x \chi_R)$ and $\frac{2}{R} x (D_x \chi_R)$ converge strongly to zero. Indeed, since $\chi(x, y) = 1$ for $|(x, y)| \leq 1$, we get
$$\int \Big|\frac{x}{R}(D_x \chi_R) u\Big|^2 dx\, dy \leq \sup_{(x,y) \in \mathbb{R}^2} |x D_x \chi(x, y)|^2 \int_{\{|(x,y)| \geq R\}} |u|^2\, dx\, dy \to 0, \quad R \to \infty,$$
for all $u \in L^2(\mathbb{R}^2)$. Applying Proposition 1, we conclude that
$$\lim_{R \to \infty} I_R^1 = 0. \tag{2.11}$$
To deal with $I_R^2$, $I_R^3$, notice that the operators $D_y (Q - i)^{-1}$ and $[D_y (Q - i)^{-1}, x]$ are bounded, and we repeat the above argument. Thus we deduce
$$\lim_{R \to \infty} I_R^j = 0, \quad j = 2, 3. \tag{2.12}$$
Consequently, (2.11) and (2.12) imply (2.10) and the claim is proved. Now, combining (2.7)–(2.10), we obtain Theorem 1 in the case where $V$ satisfies the assumption of Lemma 2 and $\epsilon = 1$.

Proposition 2. Assume that $V \in L^\infty(\mathbb{R}^2; \mathbb{R})$ satisfies (1.1). Then for $z \notin \mathbb{R}$, $z' \notin \mathbb{R}$ the operators $(z - H)^{-1} V (z' - H)^{-1}$, $V (z - H)^{-1}(z' - H)^{-1}$, $(H - z)^{-1} - (H_0 - z)^{-1}$ are trace class and
$$\begin{aligned}
\|(z - H)^{-1} V (z' - H)^{-1}\|_{\mathrm{tr}} &\leq C_1 |\operatorname{Im} z|^{-1} |\operatorname{Im} z'|^{-1}, \\
\|V (z - H)^{-1}(z' - H)^{-1}\|_{\mathrm{tr}} &\leq C_1 |\operatorname{Im} z|^{-1} |\operatorname{Im} z'|^{-1}.
\end{aligned} \tag{2.13}$$
Moreover, if $g \in C_0^\infty(\mathbb{R})$, then the operator $V g(H)$ is trace class.

Proof. Set $g_\delta(x, y) = \langle x \rangle^{-1-\frac{\delta}{2}} \langle y \rangle^{-\frac{1+\delta}{2}}$ and $f_\delta(x, y) = \langle x \rangle^{-2-\delta} \langle y \rangle^{-1-\delta}$, where $\delta$ is the constant in (1.1). According to Lemma 8 in the Appendix, $g_\delta (H_0 + i)^{-1}$, $(H_0 + i)^{-1} g_\delta$ are Hilbert–Schmidt operators and $f_\delta (H_0 + i)^{-2}$ is a trace class operator. Since $g_\delta^{-1} V g_\delta^{-1}, V f_\delta^{-1} \in L^\infty$, it follows that
$$(H_0 + i)^{-1} V (H_0 + i)^{-1} = (H_0 + i)^{-1} g_\delta [g_\delta^{-1} V g_\delta^{-1}] g_\delta (H_0 + i)^{-1}$$
and $V (H_0 + i)^{-2}$ are trace class operators. Next we write
$$(H + i)^{-1} - (H_0 + i)^{-1} = -(H_0 + i)^{-1} V (H_0 + i)^{-1} + (H + i)^{-1} V (H_0 + i)^{-1} V (H_0 + i)^{-1}$$
and conclude that $(H + i)^{-1} - (H_0 + i)^{-1} = -(H + i)^{-1} V (H_0 + i)^{-1}$ is trace class. Now consider the following equalities
$$\begin{aligned}
(i + H)^{-1} V (i + H)^{-1} &= (i + H_0)^{-1} V (i + H_0)^{-1} + (i + H)^{-1} V (i + H_0)^{-1} V (i + H_0)^{-1} \\
&\quad + (i + H_0)^{-1} V (i + H_0)^{-1} V (i + H)^{-1} \\
&\quad + (i + H)^{-1} V (i + H_0)^{-1} V (i + H_0)^{-1} V (i + H)^{-1}
\end{aligned}$$
and
$$V (H + i)^{-2} = V (H_0 + i)^{-2} - V (H_0 + i)^{-1}(H + i)^{-1} V (H_0 + i)^{-1} - V (H + i)^{-1} V (H_0 + i)^{-1}(H + i)^{-1}.$$
By using the trace class properties established above, we get (2.13) for $z = z' = -i$. By applying the first resolvent equation
$$(H - z)^{-1} = (H + i)^{-1} + (i - z)(H + i)^{-1}(H - z)^{-1},$$
we obtain the general case. To examine $V g(H)$, consider the function $h(x) = (x + i)^2 g(x)$. Then $V g(H) = V (H + i)^{-2} h(H)$ and since $V (H + i)^{-2}$ is trace class, we obtain the result.

For $R > 0$ introduce
$$H_R := H_0 + \chi_R(x, y) V(x, y),$$
where $\chi_R(x, y) = \chi(\frac{x}{R}, \frac{y}{R})$ with $\chi \in C_0^\infty(\mathbb{R}^2)$ such that $\chi = 1$ in a neighborhood of $|(x, y)| \leq 1$.
Remark 1. The result of Proposition 2 concerning the trace class property of $(H - z)^{-1} - (H_0 - z)^{-1}$, $\operatorname{Im} z \neq 0$, improves considerably [5, Proposition 2], where much more regular potentials have been examined. On the other hand, if the potential $V$ satisfies (1.1) and $V, \partial_x V \in C^0(\mathbb{R}^2; \mathbb{R}) \cap L^\infty(\mathbb{R}^2; \mathbb{R})$, then the statements of Proposition 2 hold for the operators $(z - H_R)^{-1} V (z' - H)^{-1}$, $z \notin \mathbb{R}$, $z' \notin \mathbb{R}$.

The proof of Theorem 1 in the general case will be a simple consequence of the following

Lemma 5. Let $V(x, y)$ be as in Theorem 1. Then for $f \in C_0^\infty(\mathbb{R})$ we have
$$\lim_{R \to \infty} \operatorname{tr}(f(H_R) - f(H)) = 0, \tag{2.14}$$
$$\lim_{R \to \infty} \operatorname{tr}(\partial_x (\chi_R V) f(H_R)) = \operatorname{tr}(\partial_x V f(H)). \tag{2.15}$$

Proof. Let $g(x) = (x + i) f(x)$. We decompose
$$f(H_R) - f(H) = \big((H_R + i)^{-1} - (H + i)^{-1}\big) g(H) + (H_R + i)^{-1}(g(H_R) - g(H)) = J_R + K_R.$$
From the first resolvent identity, we obtain
$$J_R = (H_R + i)^{-1}(1 - \chi_R) V (H + i)^{-1} g(H) = (H_R + i)^{-1}(1 - \chi_R) V f(H).$$
According to Proposition 2, the operator $V f(H)$ is trace class and $(H_R + i)^{-1}(1 - \chi_R)$ converges strongly to zero. Then from Proposition 1 it follows that
$$\lim_{R \to \infty} \operatorname{tr} J_R = 0. \tag{2.16}$$
To treat $\operatorname{tr} K_R$, as in the proof of Lemma 4, we use the Helffer–Sjöstrand formula and write
$$\begin{aligned}
\operatorname{tr} K_R &= -\frac{1}{\pi} \int \bar\partial \tilde g(z) \operatorname{tr}\big((H_R + i)^{-1}((z - H_R)^{-1} - (z - H)^{-1})\big)\, L(dz) \\
&= \frac{1}{\pi} \int \bar\partial \tilde g(z) \operatorname{tr}\big((H_R + i)^{-1}(z - H_R)^{-1}(1 - \chi_R) V (z - H)^{-1}\big)\, L(dz).
\end{aligned}$$
By cyclicity of the trace we obtain
$$\begin{aligned}
&\operatorname{tr}\big((H_R + i)^{-1}(z - H_R)^{-1}(1 - \chi_R) V (z - H)^{-1}\big) \\
&\quad = \operatorname{tr}\big((z - H_R)^{-1}(1 - \chi_R) V (z - H)^{-1}(H_R + i)^{-1}\big) \\
&\quad = \operatorname{tr}\big((z - H_R)^{-1}(1 - \chi_R) V (z - H)^{-1}(H + i)^{-1}\big) \\
&\qquad + \operatorname{tr}\big((1 - \chi_R) V (H_R + i)^{-1}(z - H_R)^{-1}(1 - \chi_R) V (z - H)^{-1}(H + i)^{-1}\big).
\end{aligned}$$
Now notice that for $z \notin \mathbb{R}$ the operators $(1 - \chi_R) V (H_R + i)^{-1}(z - H_R)^{-1}(1 - \chi_R)$ and $(z - H_R)^{-1}(1 - \chi_R)$ converge strongly to zero. On the other hand, from Proposition 2 we deduce that the operator $V (z - H)^{-1}(i + H)^{-1}$ is trace class. Thus for $z \notin \mathbb{R}$, we conclude that the integrand converges to 0 as $R \to \infty$. An application of the Lebesgue dominated convergence theorem combined with the estimates (2.13) yields
$$\lim_{R \to \infty} \operatorname{tr} K_R = 0. \tag{2.17}$$
Putting together (2.16) and (2.17), we obtain (2.14). Next, we pass to the proof of (2.15). A simple calculation shows that
$$\partial_x (\chi_R V) f(H_R) = \partial_x (\chi_R V)(f(H_R) - f(H)) + \frac{1}{R}(\partial_x \chi)_R V f(H) + \chi_R \partial_x V f(H). \tag{2.18}$$
Repeating the same arguments as in the proof of (2.14), we show that
$$\lim_{R \to \infty} \operatorname{tr}\big(\partial_x (\chi_R V)(f(H_R) - f(H))\big) = 0. \tag{2.19}$$
On the other hand, since $\frac{1}{R}(\partial_x \chi)_R$ (respectively $\chi_R$) converges strongly to zero (respectively 1), it follows from Proposition 1 that
$$\lim_{R \to \infty} \operatorname{tr}\Big(\frac{1}{R}(\partial_x \chi)_R V f(H)\Big) = 0, \quad \lim_{R \to \infty} \operatorname{tr}(\chi_R \partial_x V f(H)) = \operatorname{tr}(\partial_x V f(H)),$$
which together with (2.18) and (2.19) yields (2.15).

End of the proof of Theorem 1. Applying Theorem 1 to $H_R$, we obtain
$$\operatorname{tr}[f(H_R) - f(H)] + \operatorname{tr}[f(H) - f(H_0)] = \operatorname{tr}[f(H_R) - f(H_0)] = -\operatorname{tr}(\partial_x (\chi_R V) f(H_R)),$$
and an application of Lemma 5 implies Theorem 1.
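The limiting step can be written out explicitly. The sketch below keeps the normalization $\epsilon = 1$ used throughout this section and only rearranges the identity for $H_R$ before letting $R \to \infty$ with the help of (2.14) and (2.15):

```latex
\begin{align*}
\operatorname{tr}[f(H) - f(H_0)]
  &= \operatorname{tr}[f(H_R) - f(H_0)] - \operatorname{tr}[f(H_R) - f(H)] \\
  &= -\operatorname{tr}\big(\partial_x(\chi_R V)\, f(H_R)\big)
     - \operatorname{tr}[f(H_R) - f(H)] \\
  &\xrightarrow[R \to \infty]{} -\operatorname{tr}\big(\partial_x V\, f(H)\big) .
\end{align*}
% The left-hand side does not depend on R, so it equals the limit.
% This matches the relation obtained from (2.7)-(2.10) in the case V = \phi W.
```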
3. Mourre Estimate and Embedded Eigenvalues

Consider the operator $Q = (D_x - By)^2 + D_y^2 + V(x, y)$, and set $\langle x \rangle = (1 + |x|^2)^{1/2}$, $\langle D_x \rangle = (1 + D_x^2)^{1/2}$.

Lemma 6. Assume that $V, \partial_x V \in C^0(\mathbb{R}^2; \mathbb{R}) \cap L^\infty(\mathbb{R}^2; \mathbb{R})$ and let
$$\|\mathbf{1}_{\{|x| + |y| > R\}}(x, y)\, \partial_x V\|_{L^\infty} \to 0 \quad \text{for } R \to +\infty.$$
Then for all $f \in C_0^\infty(\mathbb{R})$, the operator $f(H)\partial_x V f(H)$ is compact.

Proof. Let $\varphi(x, y) \in C_0^\infty(\mathbb{R}^2)$ be equal to one near zero. Set $\varphi_n(x, y) = \varphi(\frac{x}{n}, \frac{y}{n})$. According to Lemma 3, the operator $f(H)\varphi_n \partial_x V f(H)$ is trace class. The set of compact operators is closed with respect to the norm $\|\cdot\|_{\mathcal{L}(L^2)}$ and the lemma follows from the obvious estimate
$$\|f(H)(1 - \varphi_n)\partial_x V f(H)\|_{\mathcal{L}(L^2)} \leq \|f^2(H)\|_{\mathcal{L}(L^2)} \|(1 - \varphi_n)\partial_x V\|_\infty.$$

Theorem 3. Let $[a, b] \subset \mathbb{R}$. Under the assumptions of Lemma 6, there exists a compact operator $K$ such that
$$\mathbf{1}_{[a,b]}(H)[\partial_x, H]\mathbf{1}_{[a,b]}(H) \geq \epsilon\, \mathbf{1}_{[a,b]}(H) + \mathbf{1}_{[a,b]}(H) K \mathbf{1}_{[a,b]}(H). \tag{3.1}$$

Proof. Since the operator $\partial_x$ commutes with $(D_x - By)$ and $D_y^2$, we have $[\partial_x, H] = \epsilon + \partial_x V$. Consequently,
$$\begin{aligned}
\mathbf{1}_{[a,b]}(H)[\partial_x, H]\mathbf{1}_{[a,b]}(H) &= \epsilon\, \mathbf{1}_{[a,b]}(H) + \mathbf{1}_{[a,b]}(H)\partial_x V \mathbf{1}_{[a,b]}(H) \\
&= \epsilon\, \mathbf{1}_{[a,b]}(H) + \mathbf{1}_{[a,b]}(H) f(H)\partial_x V f(H)\mathbf{1}_{[a,b]}(H),
\end{aligned} \tag{3.2}$$
where $f \in C_0^\infty(\mathbb{R})$ is a cut-off function such that $f = 1$ on $[a, b]$. Thus, Theorem 3 follows from Lemma 6.

The use of commutators with the operator $\partial_x$ is well known for the analysis of the operator without magnetic field ($B = 0$) (see the pioneering work [2] and [1] for a more complete list of references). On the other hand, to treat crossed magnetic and electric fields we need Lemma 1 and Lemma 3.

Corollary 1. In addition to the assumptions of Theorem 3, assume that $\partial_x^2 V \in C^0(\mathbb{R}^2) \cap L^\infty(\mathbb{R}^2)$. Then the point spectrum of $H$ in $[a, b]$ is finite and with finite multiplicity. Moreover, the singular continuous spectrum of $H$ is empty.

Proof. Set $A = D_x$ and let $\alpha \in \mathbb{R}$. The explicit formula
$$e^{i\alpha A}(H + i)^{-1} = (e^{i\alpha A} H e^{-i\alpha A} + i)^{-1} e^{i\alpha A} = (H + \epsilon\alpha + V(x + \alpha, y) - V(x, y) + i)^{-1} e^{i\alpha A}$$
shows that $e^{i\alpha A}$ leaves $D(H)$ invariant. On the other hand, since
$$\|H e^{i\alpha A}(H + i)^{-1}\psi\| = \|e^{-i\alpha A} H e^{i\alpha A}(H + i)^{-1}\psi\| = \|(H - \epsilon\alpha + V(x - \alpha, y) - V(x, y))(H + i)^{-1}\psi\|,$$
we deduce that for each $\varphi \in D(H)$
$$\sup_{|\alpha| < 1} \|H e^{i\alpha A}\varphi\| < \infty.$$
Combining this with the fact that $i[A, H] = \epsilon + \partial_x V$, $[A, [A, H]] = -\partial_x^2 V$ and using (3.1), we conclude that the self-adjoint operator $A$ is a conjugate operator for $H$ at every $E \in \mathbb{R}$ in the sense of [14]. Consequently, Corollary 1 follows from the main result in [14] (see also [1, 6]).

Remark 2. For any sign-definite and bounded potential $V(x, y)$ such that $|V(x, y)| \to 0$ as $|x| + |y| \to \infty$ sufficiently fast, it was established in [13, 15] that for $\epsilon = 0$ the potential $V$ creates an infinite number of eigenvalues of $Q$ which accumulate at the Landau levels. The above corollary shows that only a finite number of these eigenvalues may survive in the presence of a non-vanishing constant electric field. In general, the problem of absence of embedded eigenvalues when $\epsilon \neq 0$ remains open and this is an interesting conjecture.

For a fixed value of $\epsilon \neq 0$, the following result shows that there are potentials for which $H$ has absolutely continuous spectrum without embedded eigenvalues.

Corollary 2. Fix $\epsilon > 0$. Assume that $\partial_x^\alpha V \in C^0(\mathbb{R}^2; \mathbb{R}) \cap L^\infty(\mathbb{R}^2; \mathbb{R})$, $\alpha = 0, 1, 2$, and
$$\epsilon + \partial_x V(x, y) > c > 0, \tag{3.3}$$
uniformly in $(x, y) \in \mathbb{R}^2$. Then $H$ has no eigenvalues. Moreover, for $s > 1/2$, the following estimate holds uniformly in $\lambda$ in a compact interval:
$$\|\langle D_x \rangle^{-s}(H - \lambda \pm i0)^{-1}\langle D_x \rangle^{-s}\| = O(1). \tag{3.4}$$

Proof. Let $[a, b]$ be a compact interval in $\mathbb{R}$. From (3.1) and (3.3), we have
$$\mathbf{1}_{[a,b]}(H)[\partial_x, H]\mathbf{1}_{[a,b]}(H) \geq c\, \mathbf{1}_{[a,b]}(H). \tag{3.5}$$
According to the proof of Corollary 1, $A = D_x$ is a conjugate operator in the sense of [14]. Combining this with (3.5) we deduce from [14] that $H$ has no eigenvalue in $\mathbb{R}$. Applying once more the Mourre theorem (see [1, 6, 14]), we obtain the estimate (3.4).
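To make condition (3.3) concrete, here is one hypothetical potential, chosen for this illustration and not taken from the paper, for which the hypotheses of Corollary 2 can be checked by hand. The amplitude $a \geq 0$ is an arbitrary parameter of the example.

```latex
\begin{align*}
V(x, y) &= a \arctan(x)\, e^{-y^2}, \qquad a \ge 0, \\
\partial_x V(x, y) &= \frac{a\, e^{-y^2}}{1 + x^2} \;\ge\; 0, \qquad
\partial_x^2 V(x, y) = -\frac{2 a x\, e^{-y^2}}{(1 + x^2)^2}, \\
\epsilon + \partial_x V(x, y) &\ge \epsilon \;>\; c := \epsilon / 2 \;>\; 0 .
\end{align*}
% V, \partial_x V and \partial_x^2 V are continuous and bounded on R^2,
% so (3.3) holds for every \epsilon > 0.
```

More generally, (3.3) holds whenever $\|\partial_x V\|_{L^\infty} < \epsilon$ or $\partial_x V \geq 0$ everywhere, provided the regularity assumptions are satisfied.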
4. Limiting Absorption Principle

In this section, we treat the case when $\epsilon$ is small enough. Notice that when $\epsilon$ tends to zero, in general the assumption $\epsilon + \partial_x V > c > 0$ is not satisfied and we cannot apply Corollary 2. Our goal is to study the behavior of the resolvent $(H - \lambda \pm i\delta)^{-1}$ as $\delta \to 0$ for $\lambda \notin \sigma(Q)$. For such $\lambda$ we could have eigenvalues of $H$ and a direct application of the Mourre argument is not possible. We will obtain the result assuming that $\epsilon$ is small and for this purpose we need the following

Lemma 7. Assume that $V \in L^\infty(\mathbb{R}^2; \mathbb{R})$ and let $\lambda \notin \sigma(Q)$. Let $\chi \in C_0^\infty(\mathbb{R}; \mathbb{R})$ be equal to 1 near $\lambda$ and let $\operatorname{supp} \chi \cap \sigma(Q) = \emptyset$. Then
$$\|\chi(H)\langle x \rangle^{-2}\| \leq C\epsilon^2. \tag{4.1}$$

Proof. Since $\operatorname{supp} \chi \cap \sigma(Q) = \emptyset$, the operators $(z - Q)^{-1}$ and $(z - Q)^{-1} x (z - Q)^{-1}$ are analytic operator valued functions for $z$ in a complex neighborhood of $\operatorname{supp} \chi$. Let $\tilde\chi(z) \in C_0^\infty(\mathbb{C})$ be an almost analytic continuation of $\chi(x)$ such that $\bar\partial \tilde\chi(z) = O(|\operatorname{Im} z|^\infty)$ and $\operatorname{supp} \tilde\chi(z) \cap \sigma(Q) = \emptyset$. We have the representation
$$\chi(H) = -\frac{1}{\pi} \int \bar\partial \tilde\chi(z)(z - H)^{-1}\, L(dz),$$
where $L(dz)$ is the Lebesgue measure in $\mathbb{C}$. By using the resolvent identity, we get
$$(z - H)^{-1} = (z - Q)^{-1} + \epsilon (z - Q)^{-1} x (z - Q)^{-1} + \epsilon^2 (z - H)^{-1} x (z - Q)^{-1} x (z - Q)^{-1},$$
and we obtain
$$\chi(H) = \chi(Q) - \frac{\epsilon}{\pi} \int \bar\partial \tilde\chi(z)(z - Q)^{-1} x (z - Q)^{-1}\, L(dz) - \frac{\epsilon^2}{\pi} \int \bar\partial \tilde\chi(z)(z - H)^{-1} x (z - Q)^{-1} x (z - Q)^{-1}\, L(dz).$$
Since $\operatorname{supp} \tilde\chi(z) \cap \sigma(Q) = \emptyset$, the first two terms on the right-hand side vanish. Consequently,
$$\chi(H) = -\frac{\epsilon^2}{\pi} \int \bar\partial \tilde\chi(z)(z - H)^{-1} x (z - Q)^{-1} x (z - Q)^{-1}\, L(dz). \tag{4.2}$$
Next, we observe that
$$x (z - Q)^{-1} = (z - Q)^{-1} x + (z - Q)^{-1}[x, Q](z - Q)^{-1} = (z - Q)^{-1} x + L_1.$$
We have $[x, Q] = 2(D_x - By)$. Thus it is easy to see that for $z \notin \sigma(Q)$, $L_1 = (z - Q)^{-1}[x, Q](z - Q)^{-1}$ is a bounded operator, since $(D_x - By)(i - Q)^{-1}$ is bounded
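and $(z - Q)^{-1} = (i - Q)^{-1} + (i - Q)^{-1}(i - z)(z - Q)^{-1}$. We write
$$x (z - Q)^{-1} x (z - Q)^{-1} = (z - Q)^{-1} x (z - Q)^{-1} x + (z - Q)^{-1} x L_1 + L_1 (z - Q)^{-1} x + L_1^2 = \sum_{j=1}^4 I_j.$$
The operators $I_4 = L_1^2$ and $I_3 \langle x \rangle^{-2} = L_1 (z - Q)^{-1} x \langle x \rangle^{-2}$ are bounded. To see that $I_1 \langle x \rangle^{-2}$ is bounded, note that
$$I_1 \langle x \rangle^{-2} = (z - Q)^{-2} x^2 \langle x \rangle^{-2} + (z - Q)^{-1} L_1 x \langle x \rangle^{-2}.$$
Finally,
$$I_2 \langle x \rangle^{-2} = (z - Q)^{-2} x [x, Q](z - Q)^{-1}\langle x \rangle^{-2} + (z - Q)^{-1} L_1 [x, Q](z - Q)^{-1}\langle x \rangle^{-2}$$
and since the second term on the right-hand side is bounded, it remains to examine the operator
$$x [x, Q](z - Q)^{-1}\langle x \rangle^{-2} = [x, Q] x (z - Q)^{-1}\langle x \rangle^{-2} + 2(z - Q)^{-1}\langle x \rangle^{-2}.$$
Applying the above argument, we see that the last operator is bounded. Consequently, the operator under integration in (4.2) is bounded by $O(|\operatorname{Im} z|^{-1})$ and this proves the statement.

Proposition 3. Assume that $\partial_x^\alpha V \in C^0(\mathbb{R}^2; \mathbb{R}) \cap L^\infty(\mathbb{R}^2; \mathbb{R})$ for $\alpha = 0, 1, 2$ and let $\langle x \rangle^2 \partial_x V \in L^\infty(\mathbb{R}^2)$. Let $[a, b]$ be a compact interval such that $[a, b] \cap \sigma(Q) = \emptyset$. Then for $s > 1/2$ and sufficiently small $\epsilon_0 > 0$ we have the following estimate, uniformly with respect to $\lambda \in [a, b]$ and $\epsilon \in\, ]0, \epsilon_0]$:
$$\|\langle D_x \rangle^{-s}(H - \lambda \pm i0)^{-1}\langle D_x \rangle^{-s}\| \leq C\epsilon^{-1}. \tag{4.3}$$
Moreover, $H$ has no embedded eigenvalues and no singular continuous spectrum in $[a, b]$.

Proof. Let $[a - \delta, b + \delta] \cap \sigma(Q) = \emptyset$ for $0 < \delta \ll 1$. Choose a function $\chi(t) \in C_0^\infty(\mathbb{R}; \mathbb{R})$ such that $\operatorname{supp} \chi \subset [a - \delta, b + \delta]$ and $\chi(t) = 1$ for $a_1 = a - \delta/2 \leq t \leq b + \delta/2 = b_1$. Then
$$\begin{aligned}
\mathbf{1}_{[a_1,b_1]}(H)[\partial_x, H]\mathbf{1}_{[a_1,b_1]}(H) &= \epsilon\, \mathbf{1}_{[a_1,b_1]}(H) + \mathbf{1}_{[a_1,b_1]}(H)\partial_x V \mathbf{1}_{[a_1,b_1]}(H) \\
&= \epsilon\, \mathbf{1}_{[a_1,b_1]}(H) + \mathbf{1}_{[a_1,b_1]}(H)(\chi(H)\langle x \rangle^{-2})(\langle x \rangle^2 \partial_x V)\mathbf{1}_{[a_1,b_1]}(H).
\end{aligned}$$
Our assumption implies that the multiplication operator $\langle x \rangle^2 \partial_x V$ is in $L^\infty$, while Lemma 7 says that $\|\chi(H)\langle x \rangle^{-2}\| \leq C\epsilon^2$. Thus
$$\mathbf{1}_{[a_1,b_1]}(H)(\chi(H)\langle x \rangle^{-2})(\langle x \rangle^2 \partial_x V)\mathbf{1}_{[a_1,b_1]}(H) \leq C_1 \epsilon^2\, \mathbf{1}_{[a_1,b_1]}(H)$$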
and with a constant $c_0 > 0$ we deduce
$$\mathbf{1}_{[a_1,b_1]}(H)[\partial_x, H]\mathbf{1}_{[a_1,b_1]}(H) \geq c_0 \epsilon\, \mathbf{1}_{[a_1,b_1]}(H).$$
Then it is well known (see, for instance, [1, 6, 14]) that for $\lambda \in [a, b]$ we get (4.3) and $H$ has no eigenvalues and no singular continuous spectrum in $[a, b]$.

Remark 3. As we mentioned in Remark 2, for sign-definite rapidly decreasing potentials the spectrum of the operator $Q$ is formed by an infinite number of eigenvalues having as points of accumulation the Landau levels $\mu_n = (2n + 1)B$, $n \in \mathbb{N}$. For such potentials Proposition 3 shows that the embedded eigenvalues of $H$ could appear only in small neighborhoods of the eigenvalues of $Q$. Since in every interval we may have only a finite number of eigenvalues of $H$, it is clear that for some eigenvalues $\nu$ of $Q$ there are no eigenvalues of $H$ in their neighborhoods. Moreover, it was proved in [12] that for potentials $V \in C_0^\infty(\mathbb{R}^2)$ we have
$$\sigma(Q) \cap\, ]\mu_n - B, \mu_n + B[\ \subset (\mu_n - Cn^{-1/2}, \mu_n + Cn^{-1/2}), \quad n \geq N,$$
with $C > 0$ and $N$ depending only on $\sup |V|$ and the diameter of the support of $V$. Thus for $M$ large the embedded eigenvalues $\lambda \geq M$ of $H$ are sufficiently close to the Landau levels $\mu_n$.

5. Estimates for the Derivative of the Spectral Shift Function

First we notice that the assumption (1.4) makes it possible to define the spectral shift function $\xi(\lambda, \epsilon)$ related to the operators $H_0(\epsilon) = H_0(B, \epsilon)$ and $H(\epsilon) = H_0(B, \epsilon) + V(x, y)$ by the equality
$$\langle \xi', f \rangle = \operatorname{tr}(f(H(\epsilon)) - f(H_0(\epsilon))), \quad f \in C_0^\infty(\mathbb{R}).$$
Here and below we omit the dependence on $B$ in the notations. Our purpose in this section is to establish Theorem 2. For the proof we need the following

Proposition 4. Under the assumptions of Theorem 2, for $\lambda_0 \notin \sigma(Q)$ and $1/2 < s < \min(1/2 + \delta/4, 1)$ the operator
$$\langle D_x \rangle^s \partial_x V [(Q - z)^{-1} x]^n \langle D_x \rangle^s$$
is trace class for $z$ in a small complex neighborhood $\Xi \subset \mathbb{C}$ of $\lambda_0$.

Proof. Before starting the proof, notice that it is easy to establish the statement for $z \ll 0$, since in this case the operator $(Q - z)^{-1}$ is a pseudodifferential one and we can apply the calculus of pseudodifferential operators and the criterion which guarantees that a pseudodifferential operator is trace class (see, for instance, [3, Theorem 9.4]). For $z \in \mathbb{R}^+ \setminus \sigma(Q)$ this is not the case and $(Q - z)^{-1}$ is a bounded operator but not a pseudodifferential one. We may replace $(Q - z)^{-1}$ by the pseudodifferential operator $(Q - i)^{-1}$ modulo bounded operators, but then it is difficult to examine the product involving many bounded operators and factors $x^k$. To overcome this difficulty, we are going to apply a convenient decomposition by product of operators
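having in mind that the operator on the left of such a product must be trace class. First, we treat the case $n = 2$; the general case will be covered by a recurrence. We start with the analysis of the operator
$$\langle D_x \rangle^{2s} \partial_x V [(Q - z)^{-1} x]^2. \tag{5.1}$$
Our goal is to show that (5.1) is a trace class operator. Write
$$\begin{aligned}
&\langle D_x \rangle^{2s} \partial_x V \langle x \rangle^2 \langle x \rangle^{-2}(Q - z)^{-1} x (Q - z)^{-1} x \\
&\quad = \langle D_x \rangle^{2s}(\partial_x V)\langle x \rangle^2 (Q - z)^{-1}\langle x \rangle^{-2} x (Q - z)^{-1} x \\
&\qquad + \langle D_x \rangle^{2s} \partial_x V \langle x \rangle^2 (Q - z)^{-1}[Q, \langle x \rangle^{-2}](Q - z)^{-1} x (Q - z)^{-1} x \\
&\quad = \langle D_x \rangle^{2s} \partial_x V \langle x \rangle^2 (Q - z)^{-2}\big[\langle x \rangle^{-2} x^2 + [Q, \langle x \rangle^{-2} x](Q - z)^{-1} x\big] \\
&\qquad + \langle D_x \rangle^{2s} \partial_x V \langle x \rangle^2 (Q - z)^{-1}[Q, \langle x \rangle^{-2}](Q - z)^{-1} x (Q - z)^{-1} x \\
&\quad = T_1 + T_2.
\end{aligned}$$
To deal with $T_1$, we use the representation $T_1 = \langle D_x \rangle^{2s} \partial_x V \langle x \rangle^2 (Q - z)^{-2} W_1$ and we will show that the operator
$$\begin{aligned}
W_1 &= \langle x \rangle^{-2} x^2 + [Q, \langle x \rangle^{-2} x](Q - z)^{-1} x \\
&= \langle x \rangle^{-2} x^2 - i\Big[(D_x - By)\frac{1 - x^2}{(1 + x^2)^2} + \frac{1 - x^2}{(1 + x^2)^2}(D_x - By)\Big](Q - z)^{-1} x
\end{aligned}$$
is bounded. Consider the operator
$$\begin{aligned}
(D_x - By)\frac{1 - x^2}{(1 + x^2)^2}(Q - z)^{-1} x &= (D_x - By)\frac{(1 - x^2) x}{(1 + x^2)^2}(Q - i)^{-1}\big[1 + (z - i)(Q - z)^{-1}\big] \\
&\quad + (D_x - By)\frac{1 - x^2}{(1 + x^2)^2}(Q - z)^{-1}[Q, x](Q - z)^{-1}.
\end{aligned}$$
The pseudodifferential operator
$$(D_x - By)\frac{(1 - x^2) x}{(1 + x^2)^2}(Q - i)^{-1}$$
is bounded and the product of this operator with $[1 + (z - i)(Q - z)^{-1}]$ is bounded, too. As in the proof of Lemma 7, we see that $[Q, x](Q - z)^{-1}$ is bounded and with the same argument we treat the other terms. Thus we conclude that $W_1$ is a bounded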
operator. Next we write $T_2 = \langle D_x \rangle^{2s} \partial_x V \langle x \rangle^2 (Q - z)^{-2} W_2$, where
$$W_2 = [Q, \langle x \rangle^{-2}] x (Q - z)^{-1} x + [Q, [Q, \langle x \rangle^{-2}]](Q - z)^{-1} x (Q - z)^{-1} x = W_{21} + W_{22}.$$
We have
$$W_{21} = 2i\Big\{(D_x - By)\frac{x^2}{(1 + x^2)^2}(Q - z)^{-1} x + \frac{x}{(1 + x^2)^2}(D_x - By) x (Q - z)^{-1} x\Big\}$$
and as above we deduce that $W_{21}$ is a bounded operator. For the analysis of $W_{22}$, we write
$$W_{22} = \Big\{\frac{1 - 3x^2}{(1 + x^2)^3}\, 4(D_x - By)^2 + R_1(x)(D_x - By) + R_2(x) + (4\partial_x V + 8BD_y)\frac{x}{(1 + x^2)^2}\Big\}(Q - z)^{-1} x (Q - z)^{-1} x.$$
A simple calculation gives
$$\begin{aligned}
(Q - z)^{-1} x (Q - z)^{-1} x &= (Q - z)^{-1} x^2 (Q - z)^{-1} + (Q - z)^{-1} x M_1 \\
&= x^2 (Q - z)^{-2} + 4(Q - z)^{-1} x (D_x - By)(Q - z)^{-2} + x (Q - z)^{-1} M_1 + (Q - z)^{-1} M_2 \\
&= x^2 (Q - z)^{-2} + 4x (Q - z)^{-1} M_3 + (Q - z)^{-1} M_4 \\
&= x^2 (Q - i)^{-2} M_5 + 4x (Q - i)^{-1} M_6 + (Q - i)^{-1} M_7,
\end{aligned}$$
where $M_k$, $k = 1, 2, \ldots$, denote bounded operators. The pseudodifferential calculus implies that the product of the term in the brackets $\{\cdots\}$ with $x^j (Q - i)^{-j}$, $j = 1, 2$, is a bounded operator. Combining this with the above equality, we conclude that $W_{22}$ is bounded. Now it remains to see that the operator
$$T = \langle D_x \rangle^{2s} \partial_x V \langle x \rangle^2 (Q - z)^{-2}$$
is trace class. For this purpose we replace $(Q - z)^{-2}$ by
$$(Q - i)^{-2}\big[I + (z - i)(Q - z)^{-1}\big]^2$$
and consider the pseudodifferential operator
$$\langle D_x \rangle^{2s} \partial_x V \langle x \rangle^2 (Q - i)^{-2} \tag{5.2}$$
with principal symbol
$$g_s(x, y, \xi, \eta) = \frac{\langle \xi \rangle^{2s}(\partial_x V)(x, y)(1 + x^2)}{((\xi - By)^2 + \eta^2 + V(x, y) - i)^2}.$$
We use the estimate $\langle \xi \rangle^{2s} \leq C \langle \xi - By \rangle^{2s} \langle y \rangle^{2s}$ and we apply Theorem 9.4 in [3] to deduce that (5.2) is a trace class operator. In fact we have
$$\sum_{|\alpha| \leq 5} \|\partial_{x,y,\xi,\eta}^\alpha g_s\|_{L^1(\mathbb{R}^4)} < \infty,$$
since $2s < 2$ guarantees that the integral with respect to $\xi$ is convergent, while $2s < 1 + \delta/2$ and the estimate (1.4) imply that the integral with respect to $y$ is convergent. Consequently, $T$ is a trace class operator and this completes the analysis of (5.1). Notice also that the same argument implies that the operator $\langle D_x \rangle^s \partial_x V [(Q - z)^{-1} x]^2$ is trace class.

To prove that the operator $\langle D_x \rangle^s \partial_x V [(Q - z)^{-1} x]^2 \langle D_x \rangle^s$ is trace class, we commute the operator $\langle D_x \rangle^s$ with $(Q - z)^{-1} x$ and $\partial_x V$ in order to reduce the proof to that of (5.1). The commutators $[x, \langle D_x \rangle^s]$ and $[V, \langle D_x \rangle^s] x$ are bounded since $s < 1$. Next,
$$\begin{aligned}
[(Q - z)^{-1}, \langle D_x \rangle^s] x &= (Q - z)^{-1}[V, \langle D_x \rangle^s](Q - z)^{-1} x \\
&= (Q - z)^{-1}[V, \langle D_x \rangle^s]\big(x (Q - z)^{-1} + (Q - z)^{-1} M_1\big) = (Q - z)^{-1} M_2
\end{aligned}$$
and we obtain operators which can be handled by the above argument. Thus the assertion is proved for $n = 2$.

Passing to the general case $n > 2$, assume that the assertion holds for $n = 2, \ldots, k - 1$, and suppose that $V$ satisfies the estimate (1.4) with $n = k$. The idea is to replace the operator $\langle D_x \rangle^s \partial_x V [(Q - z)^{-1} x]^k \langle D_x \rangle^s$ by the trace class operator
$$\langle D_x \rangle^s (\partial_x V) x^k (Q - z)^{-2} \langle D_x \rangle^s$$
plus a sum of several operators which are trace class according to the recurrence assumption. Notice that if $M_j$ is a bounded operator obtained as a product of $(D_x - By)$ and $(Q - z)^{-j}$, $j \geq 1$, the operator $\langle D_x \rangle^{-s} M_j \langle D_x \rangle^s$ becomes a bounded operator and this makes it possible to exploit the representation
$$\langle D_x \rangle^s \partial_x V (Q - z)^{-1} x \cdots M_j \langle D_x \rangle^s = \big[\langle D_x \rangle^s \partial_x V (Q - z)^{-1} x \cdots \langle D_x \rangle^s\big]\big(\langle D_x \rangle^{-s} M_j \langle D_x \rangle^s\big).$$
Thus we reduce the analysis to the trace class property of $\langle D_x \rangle^s \partial_x V (Q - z)^{-1} x \cdots \langle D_x \rangle^s$. For simplicity of notation we will write $A \sim_t B$ if the difference $A - B$ is a trace class operator.
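Two elementary properties of the relation $\sim_t$ are used repeatedly in the recurrence that follows. They are recorded here as a short sketch; both follow from the fact that trace class operators form a two-sided ideal in the bounded operators, and the letters $A$, $B$, $C$, $M$ are generic operators introduced only for this note.

```latex
\begin{align*}
&A \sim_t B,\ B \sim_t C
  \;\Longrightarrow\; A - C = (A - B) + (B - C) \text{ is trace class, i.e. } A \sim_t C, \\
&A \sim_t B,\ M \text{ bounded}
  \;\Longrightarrow\; AM - BM = (A - B)M \text{ and } MA - MB = M(A - B)
  \text{ are trace class.}
\end{align*}
% In particular, if B is trace class and A \sim_t B, then A is trace class.
```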
We start with the observation that
$$\langle D_x\rangle^{s}\partial_x V[(Q-z)^{-1}x]^{k}\langle D_x\rangle^{s}\sim_t\langle D_x\rangle^{s}\partial_x V[(Q-z)^{-1}x]^{k-2}(Q-z)^{-1}x^2(Q-z)^{-1}\langle D_x\rangle^{s}.$$
We can establish this by a recurrence. For $k-1$ we apply the equality
$$\langle D_x\rangle^{s}\partial_x V[(Q-z)^{-1}x]^{k-1}\langle D_x\rangle^{s}=\langle D_x\rangle^{s}\partial_x V[(Q-z)^{-1}x]^{k-3}(Q-z)^{-1}x^2(Q-z)^{-1}\langle D_x\rangle^{s}-\langle D_x\rangle^{s}\partial_x V[(Q-z)^{-1}x]^{k-2}(Q-z)^{-1}[Q,x](Q-z)^{-1}\langle D_x\rangle^{s}$$
$$\sim_t\langle D_x\rangle^{s}\partial_x V[(Q-z)^{-1}x]^{k-3}(Q-z)^{-1}x^2(Q-z)^{-1}\langle D_x\rangle^{s}.$$
Commuting $(Q-z)^{-1}$ and $x^2$, we obtain the result for $k-1$ and in the same way we continue for $p\le k-1$. Next we commute $(Q-z)^{-1}$ and $x^2$ and get
$$\langle D_x\rangle^{s}\partial_x V[(Q-z)^{-1}x]^{k-2}(Q-z)^{-1}x^2(Q-z)^{-1}\langle D_x\rangle^{s}\sim_t\langle D_x\rangle^{s}\partial_x V[(Q-z)^{-1}x]^{k-3}(Q-z)^{-1}x^3(Q-z)^{-2}\langle D_x\rangle^{s}.$$
Indeed, $[Q,x^2]=4(D_x-By)x=-4ix(D_x-By)-2$ yields
$$(Q-z)^{-1}x^2(Q-z)^{-1}=x^2(Q-z)^{-2}-4i(Q-z)^{-1}x(D_x-By)(Q-z)^{-1}-2(Q-z)^{-2}$$
and for the term $\langle D_x\rangle^{s}\partial_x V[(Q-z)^{-1}x]^{k-1}(D_x-By)(Q-z)^{-1}\langle D_x\rangle^{s}$ we use the recurrence assumption and the fact that $M_2=(D_x-By)(Q-z)^{-1}$ is a bounded operator. In the same way for $1\le j\le k-1$ we show that
$$\langle D_x\rangle^{s}\partial_x V[(Q-z)^{-1}x]^{k-j}(Q-z)^{-1}x^{j}(Q-z)^{-2}\langle D_x\rangle^{s}\sim_t\langle D_x\rangle^{s}\partial_x V[(Q-z)^{-1}x]^{k-j-1}(Q-z)^{-1}x^{j+1}(Q-z)^{-2}\langle D_x\rangle^{s},$$
taking into account the equality $[Q,x^{j}]=2j(D_x-By)x^{j-1}=2jx^{j-1}(D_x-By)-2ij(j-1)x^{j-1}$ and the recurrence assumption. Finally, we prove that
$$\langle D_x\rangle^{s}\partial_x V[(Q-z)^{-1}x]^{k}\langle D_x\rangle^{s}\sim_t\langle D_x\rangle^{s}(\partial_x V)x^{k}(Q-z)^{-2}\langle D_x\rangle^{s}$$
and, as in the proof in the case $n=2$, we conclude that the operator on the right-hand side is trace class.

After this preparation we pass to the proof of Theorem 2.

Proof of Theorem 2. Let $\Xi\subset\mathbb{R}$ be a small neighborhood of $\lambda_0$ such that $\Xi\cap\sigma(Q)=\emptyset$. To simplify the notation we will write $H(\epsilon)$, $\xi(\lambda,\epsilon)$ instead of $H(B,\epsilon)$, $\xi(\lambda;B,\epsilon)$. Given $f\in C_0^\infty(\Xi)$, introduce an almost analytic continuation $\tilde f\in C_0^\infty(\mathbb{C})$ of $f$ so that $\bar\partial\tilde f(z)=O(|\mathrm{Im}\,z|^\infty)$ and $\mathrm{supp}\,\tilde f\cap\sigma(Q)=\emptyset$. Since
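$(z-Q)^{-1}$ is analytic over the support of $\tilde f(z)$, applying the resolvent equality, we get
$$\partial_x V f(H(\epsilon))=-\frac{1}{\pi}\int\bar\partial\tilde f(z)\,\partial_x V(z-H(\epsilon))^{-1}L(dz)=(-1)^{n+1}\frac{\epsilon^{n}}{\pi}\int\bar\partial\tilde f(z)\,\partial_x V[(z-Q)^{-1}x]^{n}(z-H(\epsilon))^{-1}L(dz). \tag{5.3}$$
Taking into account Proposition 4 and the cyclicity of the trace, we get
$$\mathrm{tr}\int\bar\partial\tilde f(z)\langle D_x\rangle^{-s}\big[\langle D_x\rangle^{s}\partial_x V[(z-Q)^{-1}x]^{n}\langle D_x\rangle^{s}\big]\langle D_x\rangle^{-s}(z-H(\epsilon))^{-1}L(dz)$$
$$=\mathrm{tr}\int\bar\partial\tilde f(z)\big[\langle D_x\rangle^{s}\partial_x V[(z-Q)^{-1}x]^{n}\langle D_x\rangle^{s}\big]\langle D_x\rangle^{-s}(z-H(\epsilon))^{-1}\langle D_x\rangle^{-s}L(dz).$$
Set $W(z)=\langle D_x\rangle^{s}\partial_x V[(z-Q)^{-1}x]^{n}\langle D_x\rangle^{s}$ and note that for $z\in\mathrm{supp}\,\tilde f$ this operator is trace class and $W(z)$ is analytic. We write
$$-\frac{1}{\pi}\int\bar\partial\tilde f(z)\,\mathrm{tr}\big(\partial_x V[(z-Q)^{-1}x]^{n}(z-H(\epsilon))^{-1}\big)L(dz)$$
$$=\frac{1}{\pi}\lim_{\eta\searrow 0}\Big[\int_{\mathrm{Im}\,z>0}\bar\partial\tilde f(z+i\eta)\,\mathrm{tr}\big(W(z+i\eta)\langle D_x\rangle^{-s}(H(\epsilon)-(z+i\eta))^{-1}\langle D_x\rangle^{-s}\big)L(dz)$$
$$\qquad+\int_{\mathrm{Im}\,z<0}\bar\partial\tilde f(z-i\eta)\,\mathrm{tr}\big(W(z-i\eta)\langle D_x\rangle^{-s}(H(\epsilon)-(z-i\eta))^{-1}\langle D_x\rangle^{-s}\big)L(dz)\Big].$$
Notice that the functions $\mathrm{tr}\big(W(z\pm i\eta)\langle D_x\rangle^{-s}(H(\epsilon)-(z\pm i\eta))^{-1}\langle D_x\rangle^{-s}\big)$ are analytic in $\pm\mathrm{Im}\,z>0$. Applying the Green formula, as in [4, Lemma 1], we deduce
$$\langle\xi'(\lambda,\epsilon),f\rangle=\mathrm{tr}\big(f(H(\epsilon))-f(H_0)\big)=-\frac{1}{\epsilon}\,\mathrm{tr}\big(\partial_x V f(H(\epsilon))\big)$$
$$=\lim_{\eta\searrow 0}\frac{(-1)^{n}\epsilon^{n-1}}{2\pi i}\int f(\lambda)\,\mathrm{tr}\Big(W(\lambda)\big[\langle D_x\rangle^{-s}\big((H(\epsilon)-(\lambda+i\eta))^{-1}-(H(\epsilon)-(\lambda-i\eta))^{-1}\big)\langle D_x\rangle^{-s}\big]\Big)d\lambda,$$
where the integral is taken in the sense of distributions. On the other hand, Proposition 4 combined with (4.3) shows that the right-hand side of the above representation is finite and has order $O(\epsilon^{n-2})$. Thus for all $f\in C_0^\infty(\Xi)$ we obtain
$$\langle\xi'(\lambda,\epsilon),f\rangle=\int f(\lambda)T(\lambda)\,d\lambda$$
with $T(\lambda)=O(\epsilon^{n-2})$ and this completes the proof.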
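The passage from the first to the second integral in (5.3) rests on iterating the resolvent equality. The following is only a sketch of that step, under the assumption that $H(\epsilon)-Q=\epsilon x$ (with $\epsilon$ the small parameter of the model); the shorthands $R_Q(z)$ and $R_H(z)$ are introduced here for illustration and are not notation from the paper.

```latex
% Shorthand (assumed): R_Q(z) = (z-Q)^{-1}, R_H(z) = (z-H(\epsilon))^{-1}.
\begin{align*}
R_H(z) &= R_Q(z) + \epsilon\,R_Q(z)\,x\,R_H(z) \\
       &= \sum_{j=0}^{n-1}\epsilon^{j}\,[R_Q(z)x]^{j}R_Q(z)
          \;+\;\epsilon^{n}\,[R_Q(z)x]^{n}R_H(z).
\end{align*}
```

Each term of the finite sum is analytic in $z$ on the support of $\tilde f$, so it drops out after integration against $\bar\partial\tilde f(z)\,L(dz)$. Only the remainder survives, which accounts for the factor $\epsilon^{n}$ and the $n$-fold product $[(z-Q)^{-1}x]^{n}$ in (5.3).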
Acknowledgments

The authors are grateful to the referees for their thorough and careful reading of the paper. Their remarks and suggestions led to an improvement of the first version of this paper. The second author was partially supported by the ANR project NONAa.

Appendix

The proof of the following lemma is similar to the proof of [5, Proposition 2.1] and for the reader's convenience we give it here.

Lemma 8. Let $\delta>0$ and let $k_j(x,y)=\langle x\rangle^{-j(1+\delta)}\langle y\rangle^{-j(\frac12+\delta)}$, $j=1,2$. The operators $G_2:=k_2(H_0+i)^{-2}$, $G_2^*$ (respectively, $G_1:=k_1(H_0+i)^{-1}$, $G_1^*$) are trace class (respectively, Hilbert–Schmidt).

Proof. Without loss of generality, we may assume that $B=\epsilon=1$. Introduce the unitary operator $U:L^2(\mathbb{R}^2)\to L^2(\mathbb{R}^2)$ by
$$(Uu)(x,y)=\frac{1}{2\pi}\int_{\mathbb{R}^2}e^{i\varphi(x,y,x',y')}u(x',y')\,dx'\,dy',$$
where $\varphi(x,y,x',y')=xy-xy'-x'y+x'y'-\frac12 y'$. A simple calculation shows that
$$\tilde H_0=U^{-1}H_0U=(D_y^2+y^2)+x-\frac14,\qquad \tilde k_j^{\omega}=U^{-1}k_jU=k_j^{\omega}\Big(x-D_y-\frac12,\;y+D_x\Big).$$
Since $U$ is unitary, it suffices to prove the lemma for $\tilde G_j:=U^{-1}G_jU=\tilde k_j^{\omega}(\tilde H_0+i)^{-j}$. Let $\chi(t)\in C_0^\infty(\mathbb{R};[0,1])$ be a cut-off function such that $\chi(t)=1$ for $|t|\le 1$ and $\chi(t)=0$ for $|t|\ge 2$. Fix a number $k$, $\max\{1,\frac{2}{1+2\delta}\}<k<2$, and introduce the symbol
$$q(x,y,\eta)=\chi\Big(\frac{\langle y,\eta\rangle^{k}}{|\eta^2+y^2+(x+i)|}\Big),$$
where $\langle y,\eta\rangle=(1+y^2+\eta^2)^{1/2}$. It is clear that $q(x,y,\eta)\in S^0(\mathbb{R}^4_{(x,\xi,y,\eta)})$ and we set $A=q^{\omega}(x,y,D_y)$. We decompose
$$\tilde k_j^{\omega}(\tilde H_0+i)^{-j}=A\tilde k_j^{\omega}(\tilde H_0+i)^{-j}+(I-A)\tilde k_j^{\omega}(\tilde H_0+i)^{-j}=L_j+M_j.$$
To treat $L_j$, notice that on the support of $q(x,y,\eta)$ we have $(\eta^2+y^2+x+i)^{-1}\in S^0(\mathbb{R}^4;\langle y,\eta\rangle^{-k})$. In fact, on the support of $q$ we obtain
$$\langle y,\eta\rangle^{k}\le 2|\eta^2+y^2+x+i|, \tag{A.1}$$
and it is easy to estimate the derivatives of $(\eta^2+y^2+x+i)^{-1}$. According to the calculus of pseudodifferential operators, $L_j$ becomes a pseudodifferential operator with symbol in
$$S^0\big(\mathbb{R}^4;\langle y,\eta\rangle^{-k}\langle x-\eta\rangle^{-j(1+\delta)}\langle y+\xi\rangle^{-j(\frac12+\delta)}\big),$$
and the trace norm (respectively, Hilbert–Schmidt norm) of $L_2$ (respectively, $L_1$) can be estimated (see, for instance, [3, Proposition 9.2 and Theorem 9.4]) by
$$\|L_1\|_{HS}^2+\|L_2\|_{tr}\le C_0\int\langle y,\eta\rangle^{-2k}\langle x-\eta\rangle^{-2-2\delta}\langle y+\xi\rangle^{-1-2\delta}\,dx\,d\xi\,dy\,d\eta\le C_0\int\langle y,\eta\rangle^{-2k}\,dy\,d\eta\le C_0. \tag{A.2}$$
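To deal with $M_j$, $j=1,2$, we will show that $(I-A)\tilde k_2^{\omega}$ is a trace class operator and $(I-A)\tilde k_1^{\omega}$ is a Hilbert–Schmidt one. Notice that on the support of the symbol of $(I-A)$ we have $\langle y,\eta\rangle^{k}\ge|\eta^2+y^2+x+i|$. Taking into account the estimate $\partial_x^{l}\partial_y^{m}k_j(x,y)=O_{l,m}\big(\langle x\rangle^{-j(1+\delta)}\langle y\rangle^{-j(\frac12+\delta)}\big)$, we get
$$\|(I-A)\tilde k_1^{\omega}\|_{HS}^2+\|(I-A)\tilde k_2^{\omega}\|_{tr}\le C_1\int_{\langle y,\eta\rangle^{k}\ge|\eta^2+y^2+x+i|}\langle x-\eta\rangle^{-2-2\delta}\langle y+\xi\rangle^{-1-2\delta}\,dx\,d\xi\,dy\,d\eta$$
$$\le C_2\int_{\langle y,\eta\rangle^{k}\ge|\eta^2+y^2+x+i|}\langle x-\eta\rangle^{-2-2\delta}\,dx\,dy\,d\eta\le C_2\int_{\langle y,\eta\rangle^{k}\ge|\eta^2+y^2+\eta+u+i|}\langle u\rangle^{-2-2\delta}\,du\,dy\,d\eta$$
$$\le C_2\int_{\substack{\langle y,\eta\rangle^{k}\ge|\eta^2+y^2+\eta+u|\\ |u|\le\frac12\langle y,\eta\rangle^{k}}}\langle u\rangle^{-2-2\delta}\,du\,dy\,d\eta+C_2\int_{\substack{\langle y,\eta\rangle^{k}\ge|\eta^2+y^2+\eta+u|\\ |u|\ge\frac12\langle y,\eta\rangle^{k}}}\langle u\rangle^{-2-2\delta}\,du\,dy\,d\eta$$
$$\le C_2\int_{|u|\le C_3,\,|y|\le C_3,\,|\eta|\le C_3}\langle u\rangle^{-2-2\delta}\,du\,dy\,d\eta+C_2\int_{|u|\ge\frac12\langle y,\eta\rangle^{k}}\langle u\rangle^{-2-2\delta}\,du\,dy\,d\eta$$
$$\le C_4+C_5\int\langle u\rangle^{-2-2\delta}\Big(\int_0^{(2|u|)^{1/k}}r\,dr\Big)du\le C_4+C_6\int\langle u\rangle^{-2-2\delta+2/k}\,du\le C_7, \tag{A.3}$$
since $-2-2\delta+2/k<-1$. Using (A.1)–(A.3) and the fact that $A$ is a trace class (respectively, Hilbert–Schmidt) operator if and only if $A^*$ is a trace class (respectively, Hilbert–Schmidt) operator, we complete the proof of the lemma.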
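The lower bound on $k$ fixed at the start of the proof is exactly what makes the last integral in (A.3) converge. As a short check, using only the quantities already introduced in the proof:

```latex
\[
\int_0^{(2|u|)^{1/k}} r\,dr=\tfrac12\,(2|u|)^{2/k},
\qquad
-2-2\delta+\frac{2}{k}<-1
\iff \frac{2}{k}<1+2\delta
\iff k>\frac{2}{1+2\delta}.
\]
```

For instance, with $\delta=\frac12$ the condition $\max\{1,\frac{2}{1+2\delta}\}<k<2$ reduces to $1<k<2$, and any such $k$ gives an integrable power of $\langle u\rangle$.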
References

[1] W. O. Amrein, A. M. Boutet de Monvel and V. Georgescu, C0-Groups, Commutator Methods and Spectral Theory of N-Body Hamiltonians, Progress in Mathematics, Vol. 135 (Birkhäuser-Verlag, Basel, 1996).
[2] F. Bentosela, R. Carmona, P. Duclos, B. Simon, B. Souillard and R. Weder, Schrödinger operators with an electric field and random or deterministic potentials, Comm. Math. Phys. 88 (1983) 387–397.
[3] M. Dimassi and J. Sjöstrand, Spectral Asymptotics in Semiclassical Limit, London Mathematical Society, Lecture Notes Series, Vol. 268 (Cambridge University Press, 1999).
[4] M. Dimassi and V. Petkov, Spectral shift function and resonances for nonsemibounded and Stark Hamiltonians, J. Math. Pures Appl. 82 (2003) 1303–1342.
[5] M. Dimassi and V. Petkov, Resonances for magnetic Stark hamiltonians in two dimensional case, Int. Math. Res. Not. 77 (2004) 4147–4179.
[6] C. Gerard, A proof of the abstract limiting absorption principle by energy estimates, J. Funct. Anal. 254 (2008) 2707–2724.
[7] C. Ferrari and H. Kovarik, Resonances width in crossed electric and magnetic fields, J. Phys. A Math. Gen. 37 (2004) 7671–7697.
[8] C. Ferrari and H. Kovarik, On the exponential decay of magnetic Stark resonances, Rep. Math. Phys. 56 (2005) 197–207.
[9] V. Ivrii, Analysis and Precise Spectral Asymptotics, Springer Monographs in Mathematics (Springer, Berlin, 1998).
[10] M. Klein, D. Robert and X. P. Wang, Breit–Wigner formula for the scattering phase in the Stark effect, Comm. Math. Phys. 131(1) (1990) 109–124.
[11] M. G. Krein, On the trace formula in perturbation theory, Mat. Sb. 33 (1953) 597–626 (in Russian).
[12] E. Korotyaev and A. Pushnitski, A trace formula and high energy spectral asymptotics for the perturbed Landau Hamiltonian, J. Funct. Anal. 217 (2004) 221–248.
[13] M. Melgaard and G. Rosenblum, Eigenvalue asymptotics for weakly perturbed Dirac and Schrödinger operators with constant magnetic fields of full rank, Comm. Partial Differential Equations 28 (2003) 697–736.
[14] E. Mourre, Absence of singular continuous spectrum for certain self-adjoint operators, Comm. Math. Phys. 78(3) (1981) 391–408.
[15] G. Raikov and S. Warzel, Quasi-classical versus non-classical spectral asymptotics for magnetic Schrödinger operators with decreasing electric potentials, Rev. Math. Phys. 14 (2002) 1051–1072.
[16] M. Reed and B. Simon, Methods of Modern Mathematical Physics, IV, Analysis of Operators (Academic Press, New York, 1978).
[17] D. Robert and X. P. Wang, Existence of time-delay operators for Stark Hamiltonians, Comm. Partial Differential Equations 14 (1989) 63–98.
[18] D. Robert and X. P. Wang, Time-delay and spectral density for Stark Hamiltonians. II. Asymptotics of trace formulae, Chinese Ann. Math. Ser. B 12(3) (1991) 358–383.
[19] X. P. Wang, Weak coupling asymptotics of Schrödinger operators with Stark effect, in Harmonic Analysis, Lecture Notes in Math., Vol. 1494 (Springer, Berlin, 1991), pp. 185–195.
[20] D. Yafaev, Mathematical Scattering Theory (Amer. Math. Society, Providence, RI, 1992).
May 11, J070-S0129055X10003990
2010 10:6 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 22, No. 4 (2010) 381–430 © World Scientific Publishing Company DOI: 10.1142/S0129055X10003990
THE LOCALLY COVARIANT DIRAC FIELD
KO SANDERS Institute of Theoretical Physics, University of Göttingen, Friedrich-Hund-Platz 1, D-37077 Göttingen, Germany and Courant Research Center, “Higher Order Structures in Mathematics”, University of Göttingen, Germany
[email protected] Received 25 November 2009 Revised 1 March 2010 We describe the free Dirac field in a four-dimensional spacetime as a locally covariant quantum field theory in the sense of Brunetti, Fredenhagen and Verch, using a representation independent construction. The freedom in the geometric constructions involved can be encoded in terms of the cohomology of the category of spin spacetimes. If we restrict ourselves to the observable algebra, the cohomological obstructions vanish and the theory is unique. We establish some basic properties of the theory and discuss the class of Hadamard states, filling some technical gaps in the literature. Finally, we show that the relative Cauchy evolution yields commutators with the stress-energy-momentum tensor, as in the scalar field case. Keywords: Quantum field theory; curved spacetime; Dirac field. Mathematics Subject Classifications 2010: 81T20
1. Introduction

Quantum field theory in curved spacetime is relevant for several purposes, such as the construction of cosmological models and a better understanding of quantum field theory in Minkowski spacetime. In order to achieve these goals in a more realistic setting, it is important to go beyond the well-studied free scalar field. In this paper, we will present a proof, already contained in [1], of the fact that the free Dirac field in a four-dimensional globally hyperbolic spacetime can be described as a locally covariant quantum field theory in the sense of [2]. Our presentation of the Dirac field is representation independent and we emphasize categorical methods throughout in order to point out an interesting problem concerning the uniqueness of the theory. The obstruction for the definition of a unique theory can be formulated in terms of the cohomology of the category of spacetimes with a spin structure, in particular its first Stiefel–Whitney class. It seems difficult to compute this class for a category, but we will show that a unique theory
can always be obtained by restriction to the observable algebras generated by even polynomials in the field, in which case the cohomological obstructions vanish. Hadamard states can be defined in terms of a series expansion of their two-point distribution, detailing their local singularity structure. Alternatively, they can be characterized by a microlocal condition. The equivalence of these two definitions has been investigated by several authors using different techniques of proof, but in our opinion none of these arguments has been fully convincing. In our discussion, we hope to close any remaining gaps in the different proofs and establish the equivalence on firm ground. We also compute the relative Cauchy evolution of this field and obtain commutators with the stress-energy-momentum tensor, in complete analogy with the scalar field case ([2]). For this, we use a point-splitting procedure to renormalize the stress-energy-momentum tensor. Because we only need commutators with this tensor we do not need to treat the so-called trace anomaly, a finite multiple of the identity operator, in detail. We refer the interested reader to [3], who also construct the extended algebra of Wick powers, relevant for perturbation theory. A Spin-Statistics Theorem in a generally covariant framework may be found in [4]. The contents of this paper are organized as follows. In Sec. 2, we review some of the mathematical background material that we need in order to describe the Dirac field. This includes first of all the Dirac algebra and the Spin group, followed by a categorical formulation of some of the differential geometry that we will need. In Sec. 3, we describe the classical free Dirac field, starting with the geometric and algebraic aspects in Secs. 3.1 and 3.2 and the equations of motion and their fundamental solutions in Sec. 3.3. We discuss the uniqueness of the functorial constructions and their cohomological obstructions in Sec. 3.4. 
We then proceed to the quantum Dirac field in Sec. 4. In Sec. 4.1, we quantize the classical Dirac field in a local and covariant way and collect some of its basic properties. Section 4.2 deals with Hadamard states and includes a discussion of the existing results concerning the equivalence of the microlocal and the series expansion definitions. For this purpose we also refer to Appendix A, which contains several relevant and useful (but expected) results in microlocal analysis. Section 4.3 contains our discussion of the relative Cauchy evolution of the free Dirac field, obtaining commutators with the stress-energy-momentum tensor, but the proof of our main result there is deferred to Appendix B, because it consists of rather involved computations. Finally we end with some conclusions. Our presentation of locally covariant quantum field theory is based on the original [2] and on [5]. For the Dirac field in curved spacetime, we largely follow [6, 7], as well as our earlier [1]. For results on Clifford algebras, we refer to [8] (see also [9] for a short review). 2. Mathematical Preliminaries To prepare for our discussion of the locally covariant Dirac field, we present in the current section some mathematical preliminaries concerning the Dirac algebra, the
Spin group and a categorical formulation of relevant aspects of differential geometry. These merely serve to fix our notation and set the scene for the subsequent sections. We also point out the relations with some other definitions and conventions in the literature.

2.1. The Dirac algebra and the Spin group

The Spin group can be embedded in the Clifford algebra of Minkowski spacetime, which we call the Dirac algebra. Therefore, we will first briefly recall some results on Clifford algebras, for which we refer to [8] (note the difference in sign convention in the Clifford multiplication). Let $\mathbb{R}^{r,s}$ be a finite dimensional real vector space with dimension $n=r+s$ and with a non-degenerate bilinear form $g_{ab}$ which has $r$ positive and $s$ negative eigenvalues. The Clifford algebra $Cl_{r,s}$ is defined as the $\mathbb{R}$-linear associative algebra generated by a unit element $I$ and an orthonormal basis $e_a$ of $\mathbb{R}^{r,n-r}$ subject to the relations:
$$e_ae_b+e_be_a=2g_{ab}I.$$
This definition is independent of the choice of basis. We may identify $\mathbb{R}^{r,s}\subset Cl_{r,s}$ as the subspace of monomials in the basis $e_a$ of degree one. The even, respectively odd, subspace of this Clifford algebra is the one spanned by monomials of even, respectively odd, degree in the basis vectors and is denoted by $Cl^0_{r,s}$, respectively $Cl^1_{r,s}$. Note that the even subspace is also a subalgebra. In the following we will be especially interested in Minkowski spacetime, $M_0:=\mathbb{R}^{1,3}$, where the bilinear form is $\eta=\mathrm{diag}(1,-1,-1,-1)$ and where we choose an orthonormal basis $g_a$, $a=0,1,2,3$ with $\|g_0\|^2=1$, $\|\cdot\|^2$ denoting the Minkowski pseudo-norm squared. The associated Clifford algebra is called the Dirac algebra $D:=Cl_{1,3}$ and it is characterized by
$$g_ag_b+g_bg_a=2\eta_{ab}I. \tag{1}$$
As a vector space, the Clifford algebra is naturally isomorphic to the exterior algebra. This motivates the term volume form for the element $g_5:=g_0g_1g_2g_3$ (or in general $e:=e_1\cdots e_{r+s}$). Note the following properties:

Lemma 2.1. We have $g_5^2=-I$ and $g_5vg_5^{-1}=-v$ for all $v\in M_0$. More generally, if $u\in M_0$ has $u^2=\|u\|^2I\neq 0$, then $u^{-1}=\frac{1}{\|u\|^2}u$ and $v\mapsto -uvu^{-1}$ defines a reflection of $M_0$ in the hyperplane perpendicular to $u$.

Proof. These equalities follow directly from Eq. (1). For the last claim, e.g., we compute:
$$-uvu^{-1}=v-(uv+vu)u^{-1}=v-\frac{2\langle u,v\rangle}{\|u\|^2}\,u,\qquad v\in M_0.$$

Standard arguments with Clifford algebras [8] give:
$$D=Cl_{1,3}\simeq Cl^0_{1,4}\simeq Cl^0_{4,1},\qquad Cl_{4,1}\simeq M(4,\mathbb{C}),$$
where $M(4,\mathbb{C})$ denotes the algebra of complex $(4\times 4)$-matrices. In fact, $Cl_{4,1}$ is generated by the generators $g_a$ of $D$ together with a central element $\omega$, corresponding to $iI\in M(4,\mathbb{C})$. Hence:
$$M(4,\mathbb{C})\simeq\mathbb{C}\otimes_{\mathbb{R}}D. \tag{2}$$
This also implies that the center of $D$ is spanned by $I$ (over $\mathbb{R}$). The following Fundamental Theorem provides all the essential information we need on the Dirac algebra (for an elementary algebraic proof, we refer to Pauli [10]):

Theorem 2.2 (Fundamental Theorem). The Dirac algebra $D$ is simple and has a unique irreducible complex representation (i.e. an $\mathbb{R}$-linear representation $\pi:D\to M(n,\mathbb{C})$), up to equivalence. This is the representation $\pi_0:D\to M(4,\mathbb{C})$ determined by $\pi_0(g_a)=\gamma_a$ with the Dirac matrices
$$\gamma_0:=\begin{pmatrix}0&I\\ I&0\end{pmatrix},\qquad \gamma_i:=\begin{pmatrix}0&-\sigma_i\\ \sigma_i&0\end{pmatrix},$$
where $\sigma_i$ are the Pauli matrices
$$\sigma_1:=\begin{pmatrix}0&1\\ 1&0\end{pmatrix},\qquad \sigma_2:=\begin{pmatrix}0&-i\\ i&0\end{pmatrix},\qquad \sigma_3:=\begin{pmatrix}1&0\\ 0&-1\end{pmatrix}.$$
The equivalence with another irreducible complex representation $\pi$ of $D$ is implemented by $\pi(S)=L\pi_0(S)L^{-1}$ for all $S\in D$, where $L\in GL(4,\mathbb{C})$ is unique up to a non-zero complex factor. Consequently, for every set of matrices $\gamma'_a\in M(4,\mathbb{C})$ satisfying Eq. (1) there is an $L\in GL(4,\mathbb{C})$, unique up to a non-zero complex constant, such that $\gamma'_a=L\gamma_aL^{-1}$.

Proof. One can show [8] that $D\simeq M(2,\mathbb{H})$, where $\mathbb{H}$ is the skew field of quaternions. This algebra is simple, because it is a full matrix algebra. The given matrices $\gamma_a$ satisfy the Clifford relations (1) and therefore extend to a representation of $D$ in $M(4,\mathbb{C})$. Any complex representation $\pi:D\to M(n,\mathbb{C})$ extends to a complex representation $\tilde\pi$ of $M(4,\mathbb{C})$, using Eq. (2) and the trivial center of $D$, which is irreducible if $\pi$ is irreducible. As $M(4,\mathbb{C})$ has only one irreducible representation up to equivalence (see [11]), namely the defining one on $\mathbb{C}^4$, this determines $\pi$ up to equivalence, as stated. If $K,L\in GL(4,\mathbb{C})$ are two matrices which implement the same equivalence, then $KL^{-1}$ commutes with $D$ and hence $K=cL$, where $c\in\mathbb{C}$ is non-zero because $K$ is invertible. Note that $\pi'(g_a):=\gamma'_a$ extends to a complex representation of $D$ in $M(4,\mathbb{C})$ which is faithful (as $D$ is simple). The last statement then follows from the previous one.

For notational convenience, we define $\gamma_5:=\pi_0(g_5)$.
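As a quick sanity check on Theorem 2.2, the following sketch verifies numerically that the Dirac matrices given there satisfy the Clifford relations (1), and that the product of the four matrices squares to minus the identity and anticommutes with each of them, as Lemma 2.1 states for $g_5$. It uses plain Python lists instead of a linear algebra library; all helper names (`matmul`, `block`, `clifford_relations_hold`, and so on) are illustrative choices and not part of the paper.

```python
# Numerical check of the Clifford relations (1) in the representation pi_0.
# All function and variable names here are hypothetical helpers.

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    """Scalar multiple of a matrix."""
    return [[c * x for x in row] for row in a]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[i] + tr[i] for i in range(2)] + [bl[i] + br[i] for i in range(2)]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# gamma_0 = [[0, I], [I, 0]], gamma_i = [[0, -sigma_i], [sigma_i, 0]]
GAMMA = [block(Z2, I2, I2, Z2)] + [block(Z2, scale(-1, s), s, Z2) for s in SIGMA]
ETA = [1, -1, -1, -1]  # diagonal of the Minkowski form
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO4 = scale(0, I4)

def clifford_relations_hold():
    """Check gamma_a gamma_b + gamma_b gamma_a = 2 eta_ab I for all a, b."""
    for a in range(4):
        for b in range(4):
            anti = add(matmul(GAMMA[a], GAMMA[b]), matmul(GAMMA[b], GAMMA[a]))
            expected = scale(2 * ETA[a] if a == b else 0, I4)
            if anti != expected:
                return False
    return True

GAMMA5 = matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3]))

def gamma5_anticommutes():
    """Check gamma_5 gamma_a + gamma_a gamma_5 = 0 for all a."""
    return all(add(matmul(GAMMA5, g), matmul(g, GAMMA5)) == ZERO4 for g in GAMMA)
```

All entries are small integers or multiples of $i$, so the comparisons are exact and no numerical tolerance is needed.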
We can define a determinant and trace function on D by det S = det π(S) and Tr(S) = Tr(π(S)) for all S ∈ D, where π is any irreducible complex representation of D. This is well-defined by the Fundamental Theorem. The following lemma is often useful in computations: Lemma 2.3. We have Tr(ga gb ) = 4ηab and Tr([gb , gc ]gd ga ) = 8(ηcd ηba − ηbd ηca ).
Proof. Using the cyclicity of the trace and Eq. (1) we find:
$$\mathrm{Tr}(g_ag_b)=\tfrac12\mathrm{Tr}(g_ag_b+g_bg_a)=\mathrm{Tr}(\eta_{ab}I)=4\eta_{ab}$$
and
$$\mathrm{Tr}([g_b,g_c]g_dg_a)=\mathrm{Tr}(g_b[g_c,g_dg_a])=\mathrm{Tr}(g_b\{g_c,g_d\}g_a-g_bg_d\{g_c,g_a\})=2\,\mathrm{Tr}(\eta_{cd}g_bg_a-g_bg_d\eta_{ca})=8(\eta_{cd}\eta_{ba}-\eta_{bd}\eta_{ca}).$$

We now turn to the Spin group, which is the universal covering group of the special Lorentz group, a double covering which can be constructed in an elegant way inside the Dirac algebra.

Definition 2.4. The Pin and Spin groups of $Cl_{r,s}$ are defined as
$$\mathrm{Pin}_{r,s}:=\{S\in Cl_{r,s}\mid S=u_1\cdots u_k,\ u_i\in\mathbb{R}^{r,s},\ u_i^2=\pm I\},\qquad \mathrm{Spin}_{r,s}:=\mathrm{Pin}_{r,s}\cap Cl^0_{r,s}.$$

We let $\mathrm{Spin}^0_{1,3}$ denote the connected component of $\mathrm{Spin}_{1,3}$ which contains the identity. We also define the Lorentz group $L:=O_{1,3}$, the special Lorentz group $L_+:=SO_{1,3}$ and the special orthochronous Lorentz group $L^\uparrow_+:=SO^0_{1,3}$, which is the connected component of $L_+$ containing the identity. The special orthochronous Lorentz group preserves the orientation and time-orientation. For $S\in\mathrm{Pin}_{1,3}$ the map $v\mapsto SvS^{-1}$ on $M_0$ is a product of reflections (up to a sign) by Lemma 2.1. Together with the fact that $\det u=\|u\|^4$ for all $u\in M_0$ this gives rise to another useful characterisation of the group $\mathrm{Pin}_{1,3}$, which we shall not prove$^{a}$:

Proposition 2.5. $\mathrm{Pin}_{1,3}=\{S\in D\mid \det S=1,\ \forall v\in M_0:\ SvS^{-1}\in M_0\}$.

It can be seen from Proposition 2.5 that $\mathrm{Pin}_{1,3}$ and $\mathrm{Spin}_{1,3}$ are indeed Lie groups. For the universal covering homomorphism $\Lambda$ between $\mathrm{Pin}_{1,3}$ and the Lorentz group, we have the following formulae$^{b,c}$:

Proposition 2.6. The map $\Lambda:\mathrm{Pin}_{1,3}\to L$ defined by $S\mapsto\Lambda^a{}_b(S)\in M(4,\mathbb{R})$ such that $Sg_bS^{-1}=g_a\Lambda^a{}_b(S)$ is the universal covering homomorphism of Lie groups, which restricts to the universal covering homomorphism $\mathrm{Spin}^0_{1,3}\to L^\uparrow_+$. We have $\Lambda^a{}_b(S)=\frac14\mathrm{Tr}(g^aSg_bS^{-1})$ and the inverse of the derivative $d\Lambda:\mathfrak{spin}^0_{1,3}\to\mathfrak{l}^\uparrow_+$ at $S=I$ is given by:
$$(d\Lambda)^{-1}(\lambda^b{}_a)=\frac14\lambda^b{}_a\,g_bg^a.$$

$^{a}$ The definition of the Spin group in [12] corresponds to our group $\mathrm{Pin}_{1,3}$. In [6, 7] one uses the term Spin group for the group $\mathcal{S}:=\{S\in M(4,\mathbb{C})\mid\det S=1,\ SvS^{-1}\in M_0\text{ for all }v\in M_0\}$. Note that this group cannot give a double covering of the Lorentz group, as claimed in [6] (but not in [7]), because for any $S\in\mathcal{S}$ the matrices $iS$, $-S$, $-iS$ are in $\mathcal{S}$ too. Its usefulness is based on its simple definition and the fact that $\mathcal{S}^0=\mathrm{Spin}^0_{1,3}$.
$^{b}$ These results are well known, but we record them for definiteness to correct a sign error in the spin connection (5) that has occurred in [6, 7, 13].
$^{c}$ Lower case Latin indices are raised and lowered with $\eta^{ab}$, respectively, $\eta_{ab}$ throughout.

Proof. For the first sentence we refer to [8, Theorem 2.10] and subsequent remarks. Using the Clifford relations (1), we see that
$$\Lambda^a{}_b(S)=\frac14\eta^{ac}\mathrm{Tr}(\eta_{cd}\Lambda^d{}_b(S)I)=\frac18\eta^{ac}\mathrm{Tr}((g_cg_d+g_dg_c)\Lambda^d{}_b(S))=\frac14\eta^{ac}\mathrm{Tr}(g_cg_d\Lambda^d{}_b(S))=\frac14\mathrm{Tr}(g^aSg_bS^{-1}).$$
Expanding $\Lambda(S+\epsilon s+O(\epsilon^2))$ up to second order in $\epsilon$ we find $d\Lambda(s)^a{}_b=\frac14\mathrm{Tr}([g_b,g^a]s)$. We check that $L(\lambda^b{}_a):=\frac14\lambda^b{}_a\,g_bg^a$ is an inverse of $d\Lambda$:
$$d\Lambda(L(\lambda^d{}_e))^a{}_b=\frac{1}{16}\eta^{ac}\eta^{ef}\lambda^d{}_e\,\mathrm{Tr}([g_b,g_c]g_dg_f)=\frac12\eta^{ac}\eta^{ef}\lambda^d{}_e(\eta_{cd}\eta_{bf}-\eta_{bd}\eta_{cf})=\frac12(\lambda^a{}_b-\eta^{ae}\eta_{bd}\lambda^d{}_e)=\lambda^a{}_b,$$
where we used Lemma 2.3 and the symmetry properties of $\lambda^d{}_e\in\mathfrak{l}^\uparrow_+$ in the last line.
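The trace formula of Proposition 2.6 can be illustrated in the representation $\pi_0$ of Theorem 2.2. The sketch below makes one specific choice that is not taken from the paper: it assumes the element $S=\cos(\theta/2)\,I+\sin(\theta/2)\,g_1g_2$, which is invertible with inverse $\cos(\theta/2)\,I-\sin(\theta/2)\,g_1g_2$ because $(g_1g_2)^2=-I$. It then evaluates $\frac14\mathrm{Tr}(g^aSg_bS^{-1})$ entry by entry. The helper names (`spin_rotation`, `covering_map`, and the matrix utilities) are hypothetical.

```python
import math

# Matrix helpers on nested lists (illustrative, not from the paper).
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def block(tl, tr, bl, br):
    return [tl[i] + tr[i] for i in range(2)] + [bl[i] + br[i] for i in range(2)]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
GAMMA = [block(Z2, I2, I2, Z2)] + [block(Z2, scale(-1, s), s, Z2) for s in SIGMA]
ETA = [1, -1, -1, -1]
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def spin_rotation(theta):
    """Return (S, S^{-1}) for S = cos(theta/2) I + sin(theta/2) g_1 g_2."""
    g12 = matmul(GAMMA[1], GAMMA[2])
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return add(scale(c, I4), scale(s, g12)), add(scale(c, I4), scale(-s, g12))

def covering_map(S, S_inv):
    """Matrix with entries (1/4) Tr(g^a S g_b S^{-1}); row a, column b."""
    lam = [[0.0] * 4 for _ in range(4)]
    for a in range(4):
        g_up_a = scale(ETA[a], GAMMA[a])  # g^a = eta^{ab} g_b (eta is diagonal)
        for b in range(4):
            prod = matmul(matmul(g_up_a, S), matmul(GAMMA[b], S_inv))
            lam[a][b] = (sum(prod[i][i] for i in range(4)) / 4).real
    return lam
```

For this choice of $S$ the resulting matrix is a rotation by $\theta$ in the plane spanned by $g_1$ and $g_2$, leaving $g_0$ and $g_3$ fixed. Calling `covering_map` on $-S$ and $-S^{-1}$ returns the same matrix, which reflects the two-to-one nature of the covering.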
2.2. Some category theory and differential geometry

The language of locally covariant quantum field theory uses category theory to express the physical ideas of locality and covariance. Any object or construction that is extended from a single spacetime (usually Minkowski spacetime) to the categorical framework gets the adjective “locally covariant”. The essence of local covariance seems to have a geometric origin and, because the Dirac field in curved spacetimes involves a substantial amount of geometric constructions, it will be convenient to present the relevant differential geometry in a categorical setting here. We refrain from the urge to call this “locally covariant differential geometry”, which appears to be a pleonasm.

A category $\mathsf{C}$ consists of a set of objects $c$ and a set of morphisms or arrows$^{d}$ $\gamma:c_1\to c_2$ between objects of $\mathsf{C}$, such that the composition of morphisms, when defined, is associative and each object admits an identity morphism (we refer to [14] for more details). A (covariant) functor $F:\mathsf{C}\to\mathsf{B}$ is a map between categories, which maps objects $c$ to objects $F(c)$ and morphisms $\gamma:c_1\to c_2$ to morphisms $F(\gamma):F(c_1)\to F(c_2)$ such that an identity morphism maps to an identity morphism and the composition of morphisms is preserved. A contravariant functor $F:\mathsf{C}\to\mathsf{B}$ is defined similarly, but reverses the direction of the morphisms: $F(\gamma):F(c_2)\to F(c_1)$. A natural transformation $t:F\Rightarrow G$ between covariant functors $F:\mathsf{C}\to\mathsf{B}$ and $G:\mathsf{C}\to\mathsf{B}$ is a map which assigns to each object $c$ a morphism $t(c)$ of $\mathsf{B}$, called the component of $t$ at $c$, such that for every morphism $\gamma:c_1\to c_2$ of $\mathsf{C}$ we have $t(c_2)\circ F(\gamma)=G(\gamma)\circ t(c_1)$, which can be depicted as a commutative diagram. When a natural transformation $t$ admits another natural transformation $s$ such that $t(c)\circ s(c)=\mathrm{id}_{G(c)}$ and $s(c)\circ t(c)=\mathrm{id}_{F(c)}$ for all objects $c$, then $t$ is called a natural equivalence. In this case, we write $t:F\Leftrightarrow G$. A natural transformation between contravariant functors or between a covariant and a contravariant functor is defined similarly, except that some arrows in the commutative diagram are reversed. A subcategory $\mathsf{B}$ of $\mathsf{C}$ consists of a subset of the objects of $\mathsf{C}$ and a subset of its morphisms in such a way that $\mathsf{B}$ still satisfies the axioms of a category.

$^{d}$ It is very often convenient to depict the morphisms in a diagram as arrows between objects.

In our case, all categories will be concrete, i.e. the objects will be sets with a certain structure and the morphisms will be maps between sets. The identity morphism will always be the identity map and the composition of maps, when defined, is automatically associative. In short, our categories will be subcategories of the category $\mathsf{Set}$, whose objects are sets$^{e}$ and whose morphisms are maps. For our discussion of differential geometry we start with the following

Definition 2.7. The category $\mathsf{Man}^n$ of smooth manifolds is the category whose objects are $C^\infty$ manifolds $M$ of (finite) dimension $n$ and whose morphisms are $C^\infty$ embeddings $\mu:M_1\to M_2$. The category $\mathsf{Bund}'$ of fiber bundles is the category whose objects are smooth fiber bundles $p:B\to M$ over objects $M$ of $\mathsf{Man}^n$ with bundle projection map $p$, and whose morphisms are $C^\infty$ maps $\beta:B_1\to B_2$ covering a morphism $\mu:M_1\to M_2$ of $\mathsf{Man}^n$, i.e. such that $p_2\circ\beta=\mu\circ p_1$. We denote by $\mathsf{Bund}$ the subcategory whose morphisms restrict to isomorphisms of the fibers. The categories $\mathsf{VBund}'_{\mathbb{R}}$, respectively $\mathsf{VBund}'_{\mathbb{C}}$, of real (complex) vector bundles are the subcategories of $\mathsf{Bund}'$ whose objects $V$ are real (complex) vector bundles and whose morphisms $\nu:V_1\to V_2$ are real (complex) linear maps of the fibers.
Again we denote by $\mathsf{VBund}_{\mathbb{R}}$ and $\mathsf{VBund}_{\mathbb{C}}$ the subcategories whose morphisms restrict to isomorphisms of the fibers.

We could have taken all smooth maps between manifolds as morphisms of $\mathsf{Man}^n$ or allowed all dimensions. However, local diffeomorphisms allow us to transport more structure, which enables us to describe more of the canonical differential geometric constructions as functors. We describe the most important examples below. For fiber bundles, on the other hand, it will be useful to allow maps which are not isomorphisms on the fibers.$^{f,g}$

$^{e}$ See [14] for some relevant remarks concerning the foundations of set theory and the use of small sets.
$^{f}$ The unprimed categories, whose morphisms are isomorphisms of the fibers, can be described as fibered categories over $\mathsf{Man}^n$, cf. [15, p. 44].
$^{g}$ The functors $B:\mathsf{Man}^n\to\mathsf{Bund}$ below are all of a special type, namely, they associate to a manifold $M$ a fiber bundle whose base space is again $M$. Although we will only use functors of this type when describing the Dirac field, the restriction is not technically necessary in our definitions.
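The definitions of functor and natural transformation recalled at the start of this subsection can be made concrete in the category of sets and maps. The following toy sketch is not taken from the paper: it assumes the "list" functor, which sends a set to the lists over it and a map to its entrywise application, and uses list reversal as a natural transformation from this functor to itself. All names are illustrative.

```python
# Toy illustration in the category of sets and maps (hypothetical names).

def compose(g, f):
    """Composition of maps: first f, then g."""
    return lambda x: g(f(x))

def list_functor(f):
    """Action of the list functor on a morphism f: apply f to every entry."""
    return lambda xs: [f(x) for x in xs]

def reverse_component(xs):
    """Component of the natural transformation 'reverse'; the same formula at every object."""
    return list(reversed(xs))

def preserves_composition(g, f, samples):
    """Functor law F(g . f) = F(g) . F(f), checked on sample lists."""
    lhs = list_functor(compose(g, f))
    rhs = compose(list_functor(g), list_functor(f))
    return all(lhs(xs) == rhs(xs) for xs in samples)

def naturality_holds(f, samples):
    """Naturality square t(c2) . F(f) = G(f) . t(c1), with F = G = list functor."""
    lhs = compose(reverse_component, list_functor(f))
    rhs = compose(list_functor(f), reverse_component)
    return all(lhs(xs) == rhs(xs) for xs in samples)
```

Reversal is its own inverse, so in this toy setting it is even a natural equivalence in the sense defined above. A check on finitely many samples is of course only an illustration, not a proof.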
Two of the most basic functors in differential geometry are:

The tangent bundle functor $T:\mathsf{Man}^n\to\mathsf{VBund}_{\mathbb{R}}$ assigns to every manifold $M$ the tangent bundle $TM$ and to every morphism $\mu:M_1\to M_2$ the differential $d\mu:TM_1\to TM_2$.

The cotangent bundle functor$^{h}$ $T^*:\mathsf{Man}^n\to\mathsf{VBund}_{\mathbb{R}}$ assigns to every manifold $M$ the cotangent bundle $T^*M$ and to every morphism $\mu:M_1\to M_2$ the push-forward $\mu_*:T^*M_1\to T^*M_2$, which is defined as $\mu_*\omega:=\omega\circ d\mu^{-1}$.

In a similar way, one can define the functor $\Lambda^k:\mathsf{Man}^n\to\mathsf{VBund}_{\mathbb{R}}$ of exterior $k$-forms and the exterior algebra functor $\Lambda:\mathsf{Man}^n\to\mathsf{VBund}_{\mathbb{R}}$, both with push-forwards. Another example is:

The density bundle functor $|\Lambda^n|:\mathsf{Man}^n\to\mathsf{VBund}_{\mathbb{R}}$ assigns to every spacetime $M$ the one-dimensional trivial vector bundle of densities $|\Lambda^nM|$, where $n$ is the dimension of $M$. This is the vector bundle whose fiber at $x\in M$ consists of functions $d:\Lambda^n_xM\to\mathbb{R}$ such that $d(r\omega)=|r|\,d(\omega)$ for all $r\in\mathbb{R}$ and $\omega\in\Lambda^n_xM$ (cf. [16, Appendix A.3]). A morphism $\mu$ is mapped to the push-forward defined by $\mu_*d:=d\circ\mu^*$, where $\mu^*\omega:=\omega\circ d\mu$ is the pull-back.

By standard constructions, one can take (finite) direct sums and tensor products of functors from $\mathsf{Man}^n$ into $\mathsf{VBund}_{\mathbb{R}}$ which map $M$ into a vector bundle over $M$. One obtains another such functor in the obvious way. For functors $V$ into $\mathsf{VBund}_{\mathbb{R}}$ one can also define the dual, denoted by $V^*$, where the morphism between dual vector bundles is the push-forward of the original morphism. This generalizes the example of $T^*$ above. As another standard construction one can define the complexification $V_{\mathbb{C}}$ of any functor $V$ into $\mathsf{VBund}_{\mathbb{R}}$ (respectively, $\mathsf{VBund}'_{\mathbb{R}}$), which is a functor into $\mathsf{VBund}_{\mathbb{C}}$ (respectively, $\mathsf{VBund}'_{\mathbb{C}}$).

Now we turn to some examples of natural transformations:

The canonical pairing between a functor $V:\mathsf{Man}^n\to\mathsf{VBund}_{\mathbb{R}}$ which maps $M$ to a vector bundle $VM$ over $M$ and its dual $V^*$ is a natural transformation $\langle\,,\rangle:V^*\otimes V\Rightarrow\Lambda^0$ whose components cover the identity morphism.

Complex conjugation is a natural equivalence $\overline{\phantom{x}}:V_{\mathbb{C}}\Leftrightarrow V_{\mathbb{C}}$ in $\mathsf{VBund}_{\mathbb{R}}$ (or $\mathsf{VBund}'_{\mathbb{R}}$) between complexified vector bundles, which sends each section to its complex conjugate.

A further example of a natural equivalence is the fiber-wise multiplication by a real number $r\neq 0$. (For $r=0$, this only yields a natural transformation.) Furthermore, the constructions mentioned above (dual, direct sum, tensor product) and the natural transformations (pairing, fiber-wise multiplication) can also be applied directly to complex vector bundles in a canonical (Hermitean) way.

$^{h}$ It is tempting to think of a contravariant functor that maps manifolds to their cotangent bundles and morphisms $\mu$ to the pull-back, $\mu^*\omega:=\omega\circ d\mu$, which indeed reverses the directions of arrows and changes the order of compositions. However, the pull-back is only defined on the image of $\mu$, so in general this does not define a morphism in $\mathsf{VBund}_{\mathbb{R}}$.
May 11, J070-S0129055X10003990
2010 10:6 WSPC/S0129-055X
148-RMP
The Locally Covariant Dirac Field
389
It will be convenient to consider distributions and integration in a categorical setting too:
Definition 2.8. TVec is the category of topological vector spaces with injective continuous linear maps as morphisms.
The functor C : Mann → TVec is the constant functor, i.e. it assigns to each object the one-dimensional space ℂ and to each morphism the identity morphism.
The functor of test-sections is the functor $C^\infty_0 \colon$ VBundC → TVec which maps each complex vector bundle V to the space $C^\infty_0(V)$ of compactly supported smooth sections of V in the test-section topology.^i A morphism ν, covering a morphism µ, is mapped to the push-forward $\nu_*$ defined by $\nu_*(f) = \nu \circ f \circ \mu^{-1}$ on µ(M1), extended by 0 to all of M2.
The functor of smooth sections is the contravariant functor $C^\infty \colon$ VBundC → TVec which maps each complex vector bundle V to the space $C^\infty(V)$ of smooth sections of V in the usual topology. A morphism ν, covering a morphism µ, is mapped to the pull-back $\nu^*$ defined by $\nu^*(f) = \nu^{-1} \circ f \circ \mu$.
The functor of distributions is the contravariant functor Distr : VBundC → TVec which maps each complex vector bundle V to the space $(C^\infty_0(V))'$ of distributions on V with the weak topology induced by $C^\infty_0(V)$. A morphism ν, covering a morphism µ, is mapped to the pull-back $\nu^*$ defined by $\nu^* u := u \circ \nu_*$.
We will not need compactly supported distributions, but they can be defined as the functor dual to $C^\infty$. Notice that objects which are not compactly supported, such as smooth sections or distributions, behave contravariantly, whereas compactly supported ones behave covariantly. Also note that the pull-back of a smooth section can only be defined for morphisms that restrict to isomorphisms of the fibers. The following constructions will be of importance in Sec. 4:
Integration is a natural transformation $\int \colon C^\infty_0 \circ |\Lambda^n| \Rightarrow \mathbb{C}$ which assigns to each $\omega \in C^\infty_0(|\Lambda^n M|)$ the integral $\int_M \omega$.
Canonical Injections. Let f : VBundC → VBundC be the forgetful functor. For any functor V : Mann → VBundC there is a canonical natural transformation $\kappa \colon C^\infty_0 \circ f \circ V \Rightarrow C^\infty \circ V$, whose components are the canonical injections $C^\infty_0(VM) \subset C^\infty(VM)$. Similarly, there is a canonical natural transformation $\iota \colon C^\infty \circ (V \otimes |\Lambda^n|) \Rightarrow \mathrm{Distr} \circ f \circ V^*$ given by $\iota_M(f \otimes \omega) := \int_M \langle \,\cdot\,, f \rangle\, \omega$ for any smooth section f of VM and any density ω on M. Each component of ι is injective.
Where convenient we will identify a functor V : Mann → VBundC with the functor f ◦ V, omitting the forgetful functor, as this rarely leads to confusion. Furthermore, any natural transformation t : V1 ⇒ V2 between a pair of functors Vi : Mann → VBundC, i = 1, 2, lifts to a corresponding natural transformation
i For a precise definition of the well-known topologies on test-sections and smooth sections we refer to [17, Chap. 17].
K. Sanders
$T \colon C^\infty_0 \circ V_1 \Rightarrow C^\infty_0 \circ V_2$ defined pointwise by $T_M f := t_M \circ f$. The same statement holds for $T \colon C^\infty \circ V_1 \Rightarrow C^\infty \circ V_2$, if the Vi are functors into the category VBundC.
Next we add the structure of a semi-Riemannian metric:
Definition 2.9. The category SRMann of semi-Riemannian manifolds is the subcategory of Mann whose objects M = (M, g) are $C^\infty$ manifolds M of dimension n with a semi-Riemannian metric g and whose morphisms m : M1 → M2 are given by the isometric morphisms in Mann, i.e. morphisms µ : M1 → M2 such that $\mu_* g_1 = g_2|_{\mu(M_1)}$.
Again there is a canonical forgetful functor f : SRMann → Mann, which is often left implicit, so we will write e.g. T for the functor T ◦ f. The extra structure of a semi-Riemannian metric gives rise to extra functors and natural equivalences that are of interest to us:
The metric identification is a natural equivalence G : T ⇔ T∗ whose component at M = (M, g) is given by the map $G_M \colon TM \to T^*M$ such that $v \mapsto g(v, \cdot)$.
The frame bundle functor F : SRMann → VBundR assigns to each object M the frame bundle FM, i.e. the bundle whose fiber at a point x ∈ M consists of all orthonormal bases of $T_xM$ in the metric g. This fiber is a subset of $T^{\otimes n}M$. A morphism m is mapped to the push-forward $\mu_*$ acting on $FM \subset T^{\otimes n}M$.
The volume form functor vol : SRMann → VBundR is defined as vol := |Λn| ◦ f. When m : M1 → M2 is a morphism and $d\mathrm{vol}_i := \sqrt{|\det g_i|}$ is the metric induced volume form on Mi, then vol maps $d\mathrm{vol}_1$ to the restriction of $d\mathrm{vol}_2$ to m(M1).
There is a canonical natural equivalence from Λ0 to vol, which consists of multiplication with the metric induced volume form. Similarly there are natural equivalences between any functor V : SRMann → VBundC and V ⊗ |Λn|. Therefore we obtain a canonical natural transformation $\iota \colon C^\infty \circ V \Rightarrow \mathrm{Distr} \circ V^*$ whose components are injective.
Finally we should mention the Clifford bundle functor Cl : SRMann → VBundR, which assigns to each object M = (M, g) the Clifford bundle ClM, which is the vector bundle whose fiber at x ∈ M is the Clifford algebra of $(T_xM, g)$ viewed as a linear space. Ignoring the algebraic structure, this functor is naturally equivalent to Λ ◦ f.
Although we will not do so, it is possible to use this functor as a basic object for the description of fermions (cf. [18]).
3. The Classical Dirac Field
After these mathematical preliminaries we are now ready to start constructing the classical free Dirac field (as a locally covariant classical field). We will first describe the geometric and algebraic constructions, before we discuss the Dirac equation and its fundamental solutions. We close by investigating to what extent the relations between the Dirac operator, charge conjugation and adjoint map fix the structure
of the theory and find that the non-uniqueness can be characterised in terms of the cohomology of the category of spin spacetimes.
3.1. Geometric aspects
In order to describe the Dirac field we need to introduce the notion of a spin structure on a spacetime, combining the geometric and the algebraic results of Sec. 2. This is the purpose of the current subsection. The systems that we will consider are intended to model Dirac quantum fields living in a (region of) spacetime which is endowed with a fixed Lorentzian metric (a background gravitational field). Mathematically these regions are modelled as follows:
Definition 3.1. By the term globally hyperbolic spacetime we will mean a connected, Hausdorff, $C^\infty$ Lorentzian manifold M = (M, g) of dimension d = 4, which is oriented, time-oriented and admits a Cauchy surface. A subset O ⊂ M of a globally hyperbolic spacetime M is called causally convex iff for all x, y ∈ O all causal curves in M from x to y lie entirely in O. The category Spac is the subcategory of SRMann whose objects are all globally hyperbolic spacetimes M = (M, g) and whose morphisms are isometric embeddings ψ that preserve the orientation and time-orientation and such that ψ(M1) is causally convex.
By a theorem of Geroch any globally hyperbolic spacetime is paracompact ([19, Appendix]). Most notations we use concerning the causal structure of spacetimes are standard, cf. [20]. The importance of causally convex sets is that for any morphism ψ the causal structure of M1 coincides with that of ψ(M1) inside M2:
$$\psi\big(J^\pm_{M_1}(x)\big) = J^\pm_{M_2}(\psi(x)) \cap \psi(M_1), \qquad x \in M_1.$$
If O ⊂ M is a connected open causally convex set, then $(O, g|_O)$ defines a globally hyperbolic spacetime in its own right. In this case there is a canonical morphism $I_{M,O} \colon O \to M$ given by the canonical embedding ι : O → M. We will often drop $I_{M,O}$ and ι from the notation and simply write O ⊂ M.
Notice that there is a forgetful functor f : Spac → SRMann and that we can define the functor $F^\uparrow_+ \colon$ Spac → Bund of oriented, time-oriented orthonormal frames $F^\uparrow_+ M$ for the tangent bundle, in analogy to Sec. 2.2. This is a principal $L^\uparrow_+$-bundle over M, where the special orthochronous Lorentz group $L^\uparrow_+$ acts from the right, i.e., given $e = (x, e_0, \ldots, e_3) \in F^\uparrow_+ M$, where x ∈ M and $e_a \in T_xM$ such that $g_x(e_a, e_b) = \eta_{ab}$ and $e_0$ is future pointing, the action of Λ is defined by $R_\Lambda e = e' = (x, e'_0, \ldots, e'_3)$ where $e'_a = e_b \Lambda^b{}_a$.
Definition 3.2. A spin structure on M is a pair (SM, π), where SM is a principal $\mathrm{Spin}^0_{1,3}$-bundle over M, the spin frame bundle, with a right action $R_S$, $S \in \mathrm{Spin}^0_{1,3}$,
and π : SM → FM, the spin frame projection, is a base-point preserving bundle homomorphism such that $\pi \circ R_S = R_{\Lambda(S)} \circ \pi$, where $S \mapsto \Lambda(S)$ is the universal covering map (cf. Proposition 2.6). A globally hyperbolic spin spacetime SM = (M, g, SM, π) is an object M = (M, g) of Spac which is endowed with the spin structure (SM, π). The category SSpac is the subcategory of Bund whose objects are all globally hyperbolic spin spacetimes SM = (M, g, SM, π) and whose morphisms χ : SM1 → SM2 cover a morphism ψ : M1 → M2 in Spac and satisfy $\chi \circ (R_1)_S = (R_2)_S \circ \chi$ and $\pi_2 \circ \chi = \psi_* \circ \pi_1$, where $p_i$ are the bundle projections, $\pi_i$ the spin frame projections and $\psi_*$ the push-forward.
Note that a morphism acts as a diffeomorphism of the fibers, because it intertwines the group action. Every globally hyperbolic spacetime admits a spin structure, which need not be unique [6, 8, 19, 21]. We will regard distinct spin structures on the same underlying spacetime as distinct spin spacetimes.^j
Spinor and cospinor fields are sections of vector bundles associated to the spin frame bundle. We will require that the assignment of these vector bundles is functorial:
Definition 3.3. A locally covariant spinor bundle is a functor V : SSpac → VBundC, written as $SM \mapsto V_{SM}$, $\chi \mapsto \nu$, such that χ and ν cover the same morphism ψ in Spac and such that each $V_{SM}$ is a vector bundle associated to the spin frame bundle SM through some representation. The dual functor V∗ is called a locally covariant cospinor bundle. Smooth sections of $V_{SM}$, respectively $V^*_{SM}$, are called (Dirac) spinors (or spinor fields), respectively cospinors (cospinor fields).
The condition in the definition of a locally covariant spinor bundle ensures that the vector bundle $V_{SM}$ and the spin frame bundle SM are both bundles over the same spacetime M. For definiteness we pick out the following standard choice of locally covariant spinor and cospinor bundles:
Definition 3.4.
The standard locally covariant Dirac spinor bundle D0 : SSpac → VBundC is the locally covariant spinor bundle which associates to each object SM of SSpac the associated vector bundle $D_0M = SM \times_{\mathrm{Spin}^0_{1,3}} \mathbb{C}^4$ of SM with the representation π0, and which maps each morphism χ : SM1 → SM2 to the morphism ξ : D0M1 → D0M2 given by ξ([E, z]) := [χ(E), z]. The standard locally covariant Dirac cospinor bundle $D^*_0$ is the dual functor of D0.
Recall that a point in D0M consists of an equivalence class of pairs $(E, z) \in SM \times \mathbb{C}^4$, where the equivalence is given by $[R_S E, z] = [E, \pi_0(S) z]$. The dual functor $D^*_0$ then assigns to each SM the dual vector bundle $D^*_0M$ whose points are equivalence classes of pairs $(E, w^*) \in SM \times (\mathbb{C}^4)^*$, where the equivalence is given by $[R_S E, w^*] = [E, w^* \pi_0(S^{-1})]$. (Here we consider $w^* \in (\mathbb{C}^4)^*$ as a row vector, whereas $z \in \mathbb{C}^4$ is treated as a column vector.)
For any object SM the unique torsion-free connection $\nabla_{SM}$ on TM which is compatible with the metric, $\nabla_{SM}\, g = 0$, can be described by an $l^\uparrow_+$-valued one-form $(\Omega_{SM})^b{}_a$ on the orthonormal frame bundle $F^\uparrow_+ M$ (cf. [22, Chap. 2, Proposition 1.1]), where $l^\uparrow_+$ is the Lie algebra of $L^\uparrow_+$, which can be identified with the tangent space of the fiber of $F^\uparrow_+ M$ at any point. For every local section e of $F^\uparrow_+ M$ the pull-back $\omega^b{}_a := e^*(\Omega^b{}_a)$ consists exactly of the connection one-forms of $\nabla_{SM}$ expressed in the orthonormal frame $e_a$. The one-form $(\Omega_{SM})^b{}_a$ can be pulled back by the spin frame projection π and lifted to a $\mathrm{spin}^0_{1,3}$-valued one-form $\Sigma_{SM}$ on SM:
$$\Sigma_{SM} := (d\Lambda)^{-1}\, \pi^*\big((\Omega_{SM})^b{}_a\big) = \tfrac{1}{4}\, \pi^*\big((\Omega_{SM})^b{}_a\big)\, g_b\, g^a,$$
where the last equality uses Proposition 2.6. The one-form $\Sigma_{SM}$ determines a connection on the spin frame bundle SM. For any associated vector bundle DM we then find a connection, also denoted by $\nabla_{SM}$, determined by the connection one-forms $\sigma := E^*(\Sigma_{SM})$ in a local section E of SM, as represented on DM (we will give an explicit expression for σ in Eq. (5)). The connection can be viewed as a map $\nabla_{SM} \colon C^\infty_0(D_0M) \to C^\infty_0(T^*M \otimes D_0M)$, which is a component of a natural transformation^k $\nabla \colon C^\infty_0 \circ D_0 \Rightarrow C^\infty_0 \circ (T^* \otimes D_0)$. The Leibniz rule allows us to extend it to mixed spinor-tensors, using, e.g., $\nabla_a \langle v, u \rangle = \langle \nabla_a v, u \rangle + \langle v, \nabla_a u \rangle$.
3.2. Adjoints, charge conjugation and the Dirac operator
We now define the adjoint and charge conjugation maps on spinors and cospinors. These are special cases of the Fundamental Theorem 2.2, using the complex conjugate and adjoint matrices^l (cf. [23]).
j There exists another approach to spinors, which considers on each spacetime the Clifford bundle. This Clifford bundle is functorial in its dependence on the spacetime, but it does not generally define a spin structure. Indeed, at each point one can identify the Spin group inside the fiber of the Clifford bundle, but there may not be any projection from these Spin groups onto the frame bundle that intertwines the actions of the structure groups, the obstruction being a topological twist. (Conversely, every spin structure can be seen as a topologically twisted copy of the Spin groups in the Clifford bundle.) Nevertheless, it appears to provide sufficient structure to describe all the relevant physics in a functorial way. We refer to [18] for more information on this approach.
k Alternatively we could have written the connection as a natural transformation from the 1-jet bundle extension of D0 to T∗ ⊗ D0.
l On a general representation space of complex dimension four, one can define many complex conjugations and Hermitean inner products. In order to obtain the desired equalities involving adjoint and charge conjugate spinors later on, we need these two operations to be compatible, i.e. $\langle \bar v, \bar w \rangle = \overline{\langle v, w \rangle}$. Without loss of generality we can then use the standard complex conjugation and Hermitean inner product on $\mathbb{C}^4$.
Theorem 3.5. For any irreducible complex representation π of the Dirac algebra D there are matrices A, C ∈ GL(4, ℂ) such that
$$A = A^*, \qquad \pi(g_a)^* = A \pi(g_a) A^{-1}, \qquad A \pi(n) > 0, \qquad C \bar{C} = I, \qquad -\overline{\pi(g_a)} = C \pi(g_a) C^{-1} \tag{3}$$
for all future pointing time-like vectors n ∈ M0 ⊂ D. We have for all $S \in \mathrm{Spin}^0_{1,3}$:
$$A = -C^* A^T C, \qquad \pi(S)^* A \pi(S) = A, \qquad \pi(S^{-1})\, C^{-1}\, \overline{\pi(S)} = C^{-1}.$$
Moreover, if A′, C′ ∈ M(4, ℂ) have the properties stated above for the irreducible complex representation π′ of D, then there is an L ∈ GL(4, ℂ), unique up to a sign, such that $L^* A' L = A$, $(\bar L)^{-1} C' L = C$ and $\pi = L^{-1} \pi' L$ on D.
Proof. To prove the existence of A and C in the representation π0 we may take A = A0 := γ0, C = C0 := γ2 and check the required properties straightforwardly. Note for example that
$$\gamma_0\, n^a \gamma_a = \begin{pmatrix} n^0 I + n^i \sigma_i & 0 \\ 0 & n^0 I - n^i \sigma_i \end{pmatrix} > 0,$$
because $\det(n^0 I \pm n^i \sigma_i) = n^2 > 0$ and $\mathrm{Tr}(n^0 I \pm n^i \sigma_i) = 2 n^0 > 0$. To prove the existence of A and C in a general irreducible complex representation π one writes $\gamma_a = K \pi(g_a) K^{-1}$ by Theorem 2.2 and verifies that $A = K^* A_0 K$ and $C = \bar{K}^{-1} C_0 K$ will do.
Given A′, C′ satisfying Eq. (3) for π′ we can fix K ∈ GL(4, ℂ) such that $\pi' = K \pi K^{-1}$ on D and the desired matrix L must be L = zK for some z ≠ 0 by the Fundamental Theorem 2.2. Now set $\tilde{A} := K^* A' K$ and $\tilde{C} := (\bar K)^{-1} C' K$ and note that $\tilde A$ and $\tilde C$ satisfy (3) for π. Because the sets of matrices $\pi(g_a)^*$ and $-\overline{\pi(g_a)}$ both satisfy the relations (1) we must have $A = a \tilde{A}$ and $C = c \tilde{C}$ for some non-zero complex factors a and c, again by the Fundamental Theorem. Also, |c| = 1 because $C \bar C = I$ and a > 0 because $A = A^*$ and $A \pi(n) > 0$ for future pointing time-like vectors. Since $L^* A' L = |z|^2 \tilde A$ and $(\bar L)^{-1} C' L = (z/\bar z)\, \tilde C$, we need $|z|^2 = a$ and $z = c \bar z$, which fixes z (and L) up to a sign. This proves the last statement.
The equation $A = -C^* A^T C$ holds for A0, C0 and therefore also in general. For a unit vector $u = u^a g_a$ we have $u^2 = \pm I$ and hence
$$\pi(u)^* A \pi(u) = u^a u^b\, \pi(g_a)^* A \pi(g_b) = u^a u^b A \pi(g_a g_b) = A \pi(u^2) = \pm A.$$
For $S \in \mathrm{Spin}_{1,3}$, we must therefore have that $\pi(S)^* A \pi(S) = \pm A$, by definition of the Spin group. For S = I, the sign is a plus, so by continuity and connectedness we conclude that $\pi(S)^* A \pi(S) = A$ for all $S \in \mathrm{Spin}^0_{1,3}$. For C, we use the fact that $\pi(u^{-1})\, C^{-1}\, \overline{\pi(u)} = -\pi(u)^{-1} \pi(u) C^{-1} = -C^{-1}$ and hence $\pi(S^{-1})\, C^{-1}\, \overline{\pi(S)} = C^{-1}$ for all $S \in \mathrm{Spin}_{1,3}$, since S is a product of an even number of unit vectors.
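The relations of Theorem 3.5 for the standard choice A0 = γ0, C0 = γ2 can be sanity-checked numerically. The following is a minimal sketch, not taken from the text: it assumes that the γa (lower index, signature (+,−,−,−)) are given in the chiral representation, which is consistent with the block-diagonal form of γ0 n^a γa displayed in the proof, and that relations (1) are the Clifford relations. All helper names are hypothetical.

```python
# Pure-Python 4x4 complex matrix helpers (no external libraries needed).
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(4)] for i in range(4)]

def scale(c, X):
    return [[c * x for x in row] for row in X]

def conj(X):
    return [[x.conjugate() for x in row] for row in X]

def transpose(X):
    return [[X[j][i] for j in range(4)] for i in range(4)]

def dagger(X):
    return transpose(conj(X))

def close(X, Y, tol=1e-12):
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(4) for j in range(4))

def block(P, Q, R, S):
    """Assemble the 4x4 matrix [[P, Q], [R, S]] from 2x2 blocks."""
    return [P[0] + Q[0], P[1] + Q[1], R[0] + S[0], R[1] + S[1]]

def neg2(P):
    return [[-x for x in row] for row in P]

I2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# ASSUMPTION: chiral representation, gamma_a with lower index.
GAMMA = [block(ZERO2, I2, I2, ZERO2)] + [block(ZERO2, neg2(s), s, ZERO2) for s in SIGMA]
ETA = [1, -1, -1, -1]
IDENT = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

A0, A0_INV = GAMMA[0], GAMMA[0]            # gamma_0 squares to +I
C0, C0_INV = GAMMA[2], scale(-1, GAMMA[2])  # gamma_2 squares to -I

def check_relations():
    """Check the Clifford relations and the algebraic relations of Theorem 3.5."""
    ok = close(A0, dagger(A0))                                   # A = A^*
    ok = ok and close(matmul(C0, conj(C0)), IDENT)               # C conj(C) = I
    ok = ok and close(A0, scale(-1, matmul(matmul(dagger(C0), transpose(A0)), C0)))
    for a in range(4):
        g = GAMMA[a]
        for b in range(4):
            anti = add(matmul(g, GAMMA[b]), matmul(GAMMA[b], g))
            ok = ok and close(anti, scale(2 * ETA[a] if a == b else 0, IDENT))
        ok = ok and close(dagger(g), matmul(matmul(A0, g), A0_INV))
        ok = ok and close(scale(-1, conj(g)), matmul(matmul(C0, g), C0_INV))
    return ok

def positivity_blocks(n):
    """Return (det, trace) of the two diagonal 2x2 blocks of A0 * n^a gamma_a."""
    slashed = [[sum(n[a] * GAMMA[a][i][j] for a in range(4)) for j in range(4)]
               for i in range(4)]
    M = matmul(A0, slashed)
    # In this representation the off-diagonal 2x2 blocks must vanish.
    assert all(abs(M[i][j]) < 1e-12 for i in (0, 1) for j in (2, 3))
    assert all(abs(M[i][j]) < 1e-12 for i in (2, 3) for j in (0, 1))
    result = []
    for k in (0, 2):
        a, b, c, d = M[k][k], M[k][k + 1], M[k + 1][k], M[k + 1][k + 1]
        result.append(((a * d - b * c).real, (a + d).real))
    return result
```

For a future pointing time-like n, both blocks returned by `positivity_blocks` should have positive determinant (equal to n²) and positive trace (equal to 2n⁰), which is the positivity criterion used in the proof. The check is insensitive to the overall sign of C0, so it does not depend on whether γ2 is read with an upper or lower index.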
Note that $g_5 \in \mathrm{Spin}_{1,3} \setminus \mathrm{Spin}^0_{1,3}$. Indeed, using π0 and A0 = γ0 in Theorem 3.5 we see that $\gamma_5^* A_0 \gamma_5 = -A_0$, so $g_5 \in \mathrm{Spin}_{1,3}$ by definition, but not in $\mathrm{Spin}^0_{1,3}$.
In the following theorem we use the fact that for any pair of natural transformations t, t′ : SSpac ⇒ VBundC we can define the sum t + t′ and the tensor product t ⊗ t′ componentwise.
Theorem 3.6. The standard locally covariant Dirac spinor and cospinor bundles admit natural (ℂ-antilinear) equivalences ${}^+ \colon D_0 \Leftrightarrow D^*_0$, ${}^c \colon D_0 \Leftrightarrow D_0$, ${}^c \colon D^*_0 \Leftrightarrow D^*_0$ in VBundR and a natural transformation γ : D0 ⇒ T∗ ⊗ D0 in VBundC such that all components cover the identity morphism and the following equations hold both on spinors and cospinors (i.e. we denote the inverses of ${}^+$ and ${}^c$ by the same symbol):
$$\begin{aligned}
{}^+ \circ {}^+ &= 1 = {}^c \circ {}^c, & {}^+ \circ {}^c &= -1 \circ {}^c \circ {}^+, \\
\langle , \rangle \circ S \circ ({}^+ \otimes {}^+) &= \bar{\ } \circ \langle , \rangle = \langle , \rangle \circ ({}^c \otimes {}^c), & & \\
(1 \otimes {}^+) \circ \gamma &= \gamma^* \circ {}^+, & (1 \otimes {}^c) \circ \gamma &= -1 \circ \gamma \circ {}^c, \\
(1 + S \otimes 1) \circ (1 \otimes \gamma) \circ \gamma &= (2 \circ g) \otimes 1, & \nabla \circ \gamma &= \gamma \circ \nabla,
\end{aligned} \tag{4}$$
where $S \colon D_0 \otimes D^*_0 \Leftrightarrow D^*_0 \otimes D_0$ and $S \colon T^* \otimes T^* \Leftrightarrow T^* \otimes T^*$ swap the factors in the tensor product, $g \colon \Lambda^0 \Rightarrow T^* \otimes T^*$ maps the function 1 to the metric g and $\gamma^* \colon D^*_0 \Rightarrow T^* \otimes D^*_0$ is the adjoint map of γ under the canonical pairing $\langle , \rangle$. Furthermore, for every object SM, every time-like future pointing tangent vector n ∈ TM and every v ∈ D0M we have $\langle n \otimes v^+, \gamma(v) \rangle \geq 0$.
The natural transformation γ can also be seen as a natural transformation T ⇒ End(D0) or T ⇒ End($D^*_0$). Equations (4) simply give the usual computational rules for spinors and cospinors in a functorial setting. Thus, for every SM and every p ∈ D0M, q ∈ $D^*_0 M$ we have:
$$\begin{aligned}
p^{++} &= p = p^{cc}, & p^{c+} &= -p^{+c}, \\
\langle p^+, q^+ \rangle &= \overline{\langle q, p \rangle} = \langle q^c, p^c \rangle, & & \\
(\gamma_\mu p)^+ &= p^+ \gamma_\mu, & (\gamma_\mu p)^c &= -\gamma_\mu p^c, \\
\gamma_\mu \gamma_\nu + \gamma_\nu \gamma_\mu &= 2 g_{\mu\nu} I, & \nabla_a \gamma_b &\equiv 0,
\end{aligned}$$
where we have dropped the subscript SM to lighten the notation.
Proof. The canonical pairing $\langle , \rangle \colon D^*_0 \otimes D_0 \Rightarrow \Lambda^0_{\mathbb C}$ on SM is given by $\langle [E, w^*], [E, z] \rangle = \langle w, z \rangle$, where the right-hand side is the standard Hermitean inner product on $\mathbb{C}^4$. Note that this is well-defined, because we can always get the same E ∈ SM on the left-hand side by a suitable action of $\mathrm{Spin}^0_{1,3}$. The components of the natural equivalences ${}^+$ and ${}^c$ on each SM are defined using the matrices A0 and C0 of Theorem 3.5 and their properties:
$$[E, z]^c := [E, C_0^{-1} \bar z], \qquad [E, w^*]^c := [E, \bar w^* C_0], \qquad [E, z]^+ := [E, z^* A_0], \qquad [E, w^*]^+ := [E, A_0^{-1} w].$$
396
2010 10:6 WSPC/S0129-055X
148-RMP
K. Sanders
These are well-defined isomorphisms in VBundR and they give rise to natural equivalences satisfying the first two lines of Eq. (4).
Now fix E ∈ SM, let $e_a$ be the orthonormal basis $(e_0, \ldots, e_3) = \pi(E)$ of $T_{p(E)}M$, where π : SM → FM is the spin frame projection, and let $e^a$ be the dual basis of $T^*_{p(E)}M$. We define the component of the natural transformation γ on SM to be
$$\gamma([E, z]) := e^a \otimes [E, \gamma_a z].$$
This is well-defined, because a different section $E' := R_S E$ gives rise to the frame $e'_a = e_b \Lambda^b{}_a(S)$ and the dual frame $(e')^a = \Lambda^a{}_b(S^{-1})\, e^b$ and on the other hand $\pi_0(S^{-1}) \gamma_a \pi_0(S) = \gamma_b \Lambda^b{}_a(S^{-1})$ by definition of Λ (Proposition 2.6). γ is indeed a morphism in VBundC and gives rise to a natural transformation. The third line of Eq. (4) follows again from the properties of A0 and C0 (see Theorem 3.5):
$$\begin{aligned}
\gamma([E, z]^c) &= e^a \otimes [E, \gamma_a C_0^{-1} \bar z] = -e^a \otimes [E, C_0^{-1} \overline{\gamma_a z}] = -(\gamma([E, z]))^c, \\
\gamma^*([E, z]^+) &= e^a \otimes [E, z^* A_0 \gamma_a] = e^a \otimes [E, z^* \gamma_a^* A_0] = (\gamma([E, z]))^+
\end{aligned}$$
and similarly on cospinors. Also,
$$\begin{aligned}
\nabla_b \gamma_a &= \sigma_b \gamma_a - \gamma_a \sigma_b - \Gamma^c_{ba} \gamma_c = \tfrac{1}{4} \Gamma^c_{bd} \big( \gamma_c \gamma^d \gamma_a - \gamma_a \gamma_c \gamma^d \big) - \Gamma^c_{ba} \gamma_c \\
&= \tfrac{1}{4} \Gamma^c_{bd} \big( \gamma_c \{\gamma^d, \gamma_a\} - \{\gamma_a, \gamma_c\} \gamma^d - 4 \delta^d_a \gamma_c \big) = -\tfrac{1}{2} \Gamma^c_{bd} \big( \delta^d_a \gamma_c + \eta_{ac} \gamma^d \big) = 0.
\end{aligned}$$
Finally, for every object SM, every future pointing time-like tangent vector n ∈ TM and every v = [E, z] ∈ D0M we have $\langle n \otimes v^+, \gamma(v) \rangle = \langle z, A_0 n^a \gamma_a z \rangle \geq 0$, again by Theorem 3.5.
In terms of the Christoffel symbols $\Gamma^\rho_{\mu\nu}$, the frame $e^a_\rho$ and representing $g_a$ on D0M using the End(D0M)-valued one-forms γ, the connection one-forms of the spin connection can be expressed as^m
$$\sigma_b := \tfrac{1}{4} \Gamma^a_{bc}\, \gamma_a \gamma^c, \qquad \Gamma^a_{bc} = -e^\rho_c \big( e^\sigma_b \partial_\sigma e^a_\rho \big) + e^a_\rho e^\mu_b e^\nu_c\, \Gamma^\rho_{\mu\nu}. \tag{5}$$
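The last equality in the computation of $\nabla_b \gamma_a$ in the proof of Theorem 3.6 rests on the antisymmetry of the connection coefficients in an orthonormal frame. The following worked step spells this out; the lowered-index symbol $\Gamma_{abc} := \eta_{ad} \Gamma^d_{bc}$ is introduced here only for illustration and does not appear in the text.

```latex
\begin{align*}
0 = \nabla_b \eta_{ac}
  &= -\Gamma^d_{ba}\,\eta_{dc} - \Gamma^d_{bc}\,\eta_{ad}
   = -\big(\Gamma_{cba} + \Gamma_{abc}\big), \\
\Gamma^c_{bd}\,\big(\delta^d_a \gamma_c + \eta_{ac}\gamma^d\big)
  &= \Gamma^c_{ba}\,\gamma_c + \Gamma_{abd}\,\gamma^d
   = \big(\Gamma_{cba} + \Gamma_{abc}\big)\,\gamma^c = 0 .
\end{align*}
```

The first line uses that the components $\eta_{ac}$ of the metric in an orthonormal frame are constant, so metric compatibility reduces to a purely algebraic condition on the coefficients.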
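The Dirac operator is defined on spinors and cospinors by $\slashed{\nabla}_{SM} := \gamma^a \nabla_a$. This defines natural transformations $\slashed{\nabla} \colon C^\infty_0 \circ D_0 \Rightarrow C^\infty_0 \circ D_0$, respectively $\slashed{\nabla} \colon C^\infty_0 \circ D^*_0 \Rightarrow C^\infty_0 \circ D^*_0$. The intertwining relations of the adjoint and charge conjugation with the Dirac operator follow from their intertwining with γ in Theorem 3.6:
Proposition 3.7. $\slashed{\nabla} \circ {}^+ = {}^+ \circ \slashed{\nabla}$, $\quad \slashed{\nabla} \circ {}^c = -1 \circ {}^c \circ \slashed{\nabla}$.
m Note the sign error in [6, 7].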
Proof. Recall that ${}^+$ and ${}^c$ can be defined pointwise on test-sections. Hence, on any object SM, for a cospinor field v and a spinor field u,
$$\begin{aligned}
(\slashed{\nabla} v)^c &= \big( (\partial_a v - v \sigma_a) \gamma^a \big)^c = (\partial_a \bar v - \bar v \bar\sigma_a)\, \bar\gamma^a C = -\big( \partial_a (\bar v C) - \bar v C \sigma_a \big) \gamma^a = -\slashed{\nabla} (\bar v C) = -\slashed{\nabla} v^c, \\
(\slashed{\nabla} u)^+ &= \big( \gamma^a (\partial_a u + \sigma_a u) \big)^+ = (\partial_a u^* + u^* \sigma_a^*) (\gamma^a)^* A = \big( \partial_a (u^* A) - u^* A \sigma_a \big) \gamma^a = \slashed{\nabla} (u^* A) = \slashed{\nabla} u^+,
\end{aligned}$$
where the minus sign in the last line appears because the order of the two factors of γ in the expression for $\sigma_a$ needs to be changed. It follows that $(\slashed{\nabla} v)^+ = (\slashed{\nabla} v^{++})^+ = (\slashed{\nabla} v^+)^{++} = \slashed{\nabla} v^+$ and $(\slashed{\nabla} u)^c = (\slashed{\nabla} u^+)^{+c} = -(\slashed{\nabla} u^+)^{c+} = (\slashed{\nabla} u^{+c})^+ = -(\slashed{\nabla} u^{c+})^+ = -\slashed{\nabla} u^c$.
Remark 3.8. A change in the sign convention, $\tilde\eta := -\eta$, has no physical consequences. In fact, this simply gives rise to $D \simeq Cl_{3,1}$ as the Dirac algebra, but since $Cl^0_{3,1} = Cl^0_{1,3}$ nothing changes in the representation^n of the group $\mathrm{Spin}^0_{1,3} = \mathrm{Spin}^0_{3,1}$. To accommodate this change one can set $\tilde\gamma_a := i \gamma_a$ in Eq. (1), which yields the same Dirac algebra and other constructions (although we do get signs for all covectors when raising or lowering indices with $\tilde\eta$). This also implies that one should drop the factor i in front of the Dirac operator in the Dirac equation (6) below, which ensures that $P_c P = P P_c$ will still be a wave operator. We can also keep the same matrices A, C, which now must satisfy the relations:
$$-\tilde\gamma_a^* = A \tilde\gamma_a A^{-1}, \qquad \overline{\tilde\gamma_a} = C \tilde\gamma_a C^{-1}.$$
The spinor and cospinor bundle and the adjoint and charge conjugation maps then remain the same and all the relations between these operations and the Dirac operator remain valid.
3.3. The Dirac equation and its fundamental solutions
The Dirac equation on spinor and cospinor fields, respectively, on a spin spacetime SM is
$$(-i \slashed{\nabla} + m) u = 0, \qquad (i \slashed{\nabla} + m) v = 0, \tag{6}$$
where the constant m ≥ 0 is to be interpreted as the mass of the field. These equations can be derived as the Euler–Lagrange equations from the action $S_D := \int L_D$ with the Lagrangian density^o
$$L_D := \langle u^+, (-i \slashed{\nabla} + m) u \rangle\, d\mathrm{vol}_g \tag{7}$$
by varying with respect to u and $u^+$, viewed as independent fields. The canonical momentum of the field u on a Cauchy surface C with future pointing normal vector field n is defined as
$$\pi(x) := \frac{1}{\sqrt{-\det g(x)}}\, \frac{\delta S_D}{\delta (n^\mu \nabla_\mu u(x))} = -i\, u^+(x)\, \slashed{n}(x). \tag{8}$$
n Notice that a complex irreducible representation of $Cl_{1,3}$ extends to an irreducible representation of M(4, ℂ) and therefore also gives a complex irreducible representation of $Cl_{3,1}$ and vice versa. The standard Clifford algebra isomorphism $Cl_{3,1} \simeq M(4, \mathbb{R})$ appears if and only if the representation of $Cl_{1,3}$ is a Majorana representation, i.e. if $\bar\gamma_a = -\gamma_a$. In that case we also find (see, e.g., [12, p. 332]) $\mathrm{Pin}_{3,1} \simeq \{ S \in M(4, \mathbb{R}) \mid \det S = 1,\ \forall v \in M_0 \colon S v S^{-1} \in M_0 \} = \mathrm{Pin}_{1,3}$.
We will write $P := -i \slashed{\nabla} + m$ for the operator on spinors and $P_c := i \slashed{\nabla} + m$ for the operator on cospinors. These are components of natural transformations $P \colon C^\infty_0 \circ D_0 \Rightarrow C^\infty_0 \circ D_0$, $P \colon C^\infty \circ D_0 \Rightarrow C^\infty \circ D_0$ and $P_c \colon C^\infty_0 \circ D^*_0 \Rightarrow C^\infty_0 \circ D^*_0$, $P_c \colon C^\infty \circ D^*_0 \Rightarrow C^\infty \circ D^*_0$, which we denote by the same symbol. We then have by Proposition 3.7:
$$P \circ {}^c = {}^c \circ P, \qquad P_c \circ {}^c = {}^c \circ P_c, \qquad P_c \circ {}^+ = {}^+ \circ P, \qquad P \circ {}^+ = {}^+ \circ P_c, \tag{9}$$
i.e. if a spinor field u is a solution to the Dirac equation, then so are $u^+$ and $u^c$. (The adjoint and charge conjugation of u are defined pointwise.)
For a distribution v on D0M we define the transpose $P^*$ by $\langle P^* v, u \rangle := \langle v, P u \rangle$ and similarly for $P_c$. In this way the transposes give rise to natural transformations $P^* \colon \mathrm{Distr} \circ D_0 \Rightarrow \mathrm{Distr} \circ D_0$ and $P_c^* \colon \mathrm{Distr} \circ D^*_0 \Rightarrow \mathrm{Distr} \circ D^*_0$.
Lemma 3.9. Let $\iota \colon C^\infty \circ D^*_0 \Rightarrow \mathrm{Distr} \circ D_0$ and $\iota \colon C^\infty \circ D_0 \Rightarrow \mathrm{Distr} \circ D^*_0$ be the canonical natural transformations (see the end of Sec. 2.2). Then $P^* \circ \iota = \iota \circ P_c$ and $P_c^* \circ \iota = \iota \circ P$.
Proof. This follows from the fact that for each object SM we have $\int_M \langle u, \slashed{\nabla} v \rangle\, d\mathrm{vol}_g = -\int_M \langle \slashed{\nabla} u, v \rangle\, d\mathrm{vol}_g$ if at least one of $u \in C^\infty(D_0M)$ and $v \in C^\infty(D^*_0M)$ is compactly supported. This in turn follows from $\langle \slashed{\nabla} v, u \rangle + \langle v, \slashed{\nabla} u \rangle = \nabla_a \langle v, \gamma^a u \rangle$ and Gauss' law.
One can find unique advanced and retarded fundamental solutions for the Dirac equation, both for spinors and cospinors [6, 24]:
Theorem 3.10. There are unique natural transformations $S^\pm \colon C^\infty_0 \circ D_0 \Rightarrow C^\infty \circ D_0$ and $S_c^\pm \colon C^\infty_0 \circ D^*_0 \Rightarrow C^\infty \circ D^*_0$ such that $S^\pm \circ P = P \circ S^\pm = \kappa$, $S_c^\pm \circ P_c = P_c \circ S_c^\pm = \kappa$ and such that for each $u \in C^\infty_0(D_0M)$, $v \in C^\infty_0(D^*_0M)$ we have
$$\mathrm{supp}(S^\pm u) \subset J^\pm(\mathrm{supp}(u)), \qquad \mathrm{supp}(S_c^\pm v) \subset J^\pm(\mathrm{supp}(v)).$$
Moreover,
$$\begin{aligned}
S^\pm \circ {}^c &= {}^c \circ S^\pm, & S_c^\pm \circ {}^c &= {}^c \circ S_c^\pm, \\
S_c^\pm \circ {}^+ &= {}^+ \circ S^\pm, & S^\pm \circ {}^+ &= {}^+ \circ S_c^\pm, \\
\textstyle\int \circ \langle , \rangle \circ (1 \otimes S^\pm) &= \textstyle\int \circ \langle , \rangle \circ (S_c^\mp \otimes 1). & &
\end{aligned}$$
o The Lagrangian is a natural transformation from the functor J1D0, which assigns to each spin spacetime SM the first-order jet bundle J1D0M of the spinor bundle D0M, to the functor |Λn| of densities. A component of this natural transformation covers the identity morphism of SM and is only a morphism in Bund, not in VBundR, because it is not linear.
Proof. The components of $S^\pm$ and $S_c^\pm$ are the advanced (−) and retarded (+) fundamental solutions for P and $P_c$, which are given by $S^\pm := (i \slashed{\nabla} + m) E^\pm$ and $S_c^\pm := (-i \slashed{\nabla} + m) E^\pm$ respectively, where $E^\pm$ are the unique advanced and retarded fundamental solutions for the normally hyperbolic operator $(i \slashed{\nabla} + m)(-i \slashed{\nabla} + m) = (-i \slashed{\nabla} + m)(i \slashed{\nabla} + m) = \slashed{\nabla}^2 + m^2$. We refer to [6, Theorem 2.1] for a detailed proof of the existence and uniqueness of these operators (see also [16] for the existence and uniqueness of $E^\pm$).
The naturality of $S^\pm$ and $S_c^\pm$ follows from their uniqueness and the naturality of P and $P_c$. In detail: for every morphism χ : SM1 → SM2 and every $f \in C^\infty_0(D_0M_1)$ the unique smooth solution to $P u = \chi_* f$ on M2 with $\mathrm{supp}(u) \subset J^\pm(\mathrm{supp}(\chi_* f))$ pulls back to a solution $v := \chi^* u$ of $P v = f$ on M1 with $\mathrm{supp}(v) \subset J^\pm(\mathrm{supp}(f))$. By uniqueness we must then have $u = S^\pm \chi_* f$ and $\chi^* u = S^\pm f$, i.e. $\chi^* \circ S^\pm \circ \chi_* = S^\pm$. The same holds for cospinors. The commutation of $S^\pm$ and $S_c^\pm$ with charge conjugation and adjoints follows from Eq. (9).
For arbitrary $u \in C^\infty_0(D_0M)$ and $v \in C^\infty_0(D^*_0M)$ we can find a $\varphi \in C^\infty_0(M)$ which is identically one on the compact set $\mathrm{supp}(S^\pm u) \cap \mathrm{supp}(S_c^\mp v)$. We then compute:
$$\int_M \langle v, S^\pm u \rangle = \int_M \langle P_c S_c^\mp v, \varphi S^\pm u \rangle = \int_M \langle S_c^\mp v, P \varphi S^\pm u \rangle = \int_M \langle S_c^\mp v, \varphi P S^\pm u \rangle = \int_M \langle S_c^\mp v, u \rangle,$$
which proves the last claim.
We define the advanced-minus-retarded fundamental solutions $S := S^- - S^+$ and $S_c := S_c^- - S_c^+$, which are natural transformations $S \colon C^\infty_0 \circ D_0 \Rightarrow C^\infty \circ D_0$ and $S_c \colon C^\infty_0 \circ D^*_0 \Rightarrow C^\infty \circ D^*_0$ respectively.
3.4. The non-uniqueness of the functorial Dirac structure
We have seen that the (standard) structure of Dirac spinors and cospinors, adjoints, charge conjugation and the Dirac operator is entirely determined by the functor D0 and the natural equivalences ${}^+$, ${}^c$ and γ. We formalise this with a definition:
Definition 3.11. By a Dirac structure $D := (D, {}^+, {}^c, \gamma)$ we mean a locally covariant spinor bundle D with a dual bundle $D^*$, natural equivalences ${}^+ \colon D \Leftrightarrow D^*$, ${}^c \colon D \Leftrightarrow D$,
and ${}^c \colon D^* \Leftrightarrow D^*$ in VBundR and a natural transformation γ : D ⇒ T∗ ⊗ D in VBundC, all of whose components cover the identity morphism and satisfying the relations (4) and $\langle \gamma_{SM}(v^+, v), n \rangle \geq 0$ for every time-like future pointing vector n ∈ TM. We call $D_0 := (D_0, {}^+, {}^c, \gamma)$ of Theorem 3.6 the standard Dirac structure. The category DStruc has all Dirac structures as objects and its morphisms t : D1 → D2 are all natural transformations t : D1 ⇒ D2 whose components are injective morphisms covering the identity morphism and intertwining the adjoints, charge conjugation and γ as follows:
$${}^{+_2} \circ t = t \circ {}^{+_1}, \qquad {}^{c_2} \circ t = t \circ {}^{c_1}, \qquad \gamma_2 \circ (t \otimes t) = \gamma_1.$$
For each Dirac structure, one can perform the constructions of Sec. 3.3. Because the Dirac algebra D has a unique irreducible complex representation (up to equivalence) one might expect that the category DStruc admits a corresponding unique initial object, perhaps up to isomorphism. This is an object from which there exists a morphism into any other object. However, as we will explain in this section there is a certain cohomological obstruction of the category SSpac involved. We will first consider the standard Dirac structure, which would be a good candidate for an initial object, and prove the following weaker property:
Proposition 3.12. Any morphism t from a Dirac structure D to the standard Dirac structure D0 is an isomorphism.
Proof. Let t : D → D0 be a morphism. By the injectivity of the components of t : D ⇒ D0 we see that the complex dimension of the fiber of DM is at most four. On the other hand, the vector bundles DM are modules for the Dirac algebra represented by γ. Because this algebra is simple, and because Eqs. (4) exclude the trivial representation, we find that DM must have complex dimension at least four. Therefore, t : D ⇒ D0 must be a natural equivalence and it follows that t : D → D0 is an isomorphism.
Corollary 3.13. If we construct a Dirac structure $D_\pi$ analogous to D0, but using a different representation π and matrices A, C, then $D_\pi$ is isomorphic to D0.
Proof. Because we use the same representation on all spacetimes we can construct a natural equivalence $t \colon D_\pi \Leftrightarrow D_0$ whose components are of the form $t_{SM}([E, z]) := [E, Lz]$ for some L ∈ GL(4, ℂ) which is independent of SM (cf. Theorem 3.5).
Corollary 3.14. If $D' := (D_0, {}^{+_1}, {}^{c_1}, \gamma')$ is any Dirac structure with the standard locally covariant Dirac spinor bundle D0, then D′ is isomorphic to the standard Dirac structure D0.
Proof. At each point x in each object SM we can view $\gamma'_a$ as matrices that represent the Dirac algebra in a representation π′. Using the Fundamental Theorem 2.2, we write $\gamma'_a = L \gamma_a L^{-1}$ for some L(x) ∈ GL(4, ℂ). As $\gamma'_a$ is well-defined
on D0 we must have π0 (S)γa π0 (S −1 ) = γb Λba (S) for all S ∈ Spin 01,3 . This also holds for the matrices γ, so we conclude from the Fundamental Theorem that π0 (S)L(x) = c(x)L(x)π0 (S), where c ≡ 1 by taking S = I. We can now define a natural equivalence t: D0 ⇔ D0 by [E, z] → [E, L(p(E))z] such that γ ◦ t = t ◦ γ. If we also define +2 := t ◦+1 ◦t−1 and c2 := t ◦c1 ◦t−1 , then D ⇔ (D0 ,+2 ,c2 , γ) ⇔ D0 , where the last equivalence follows from the previous corollary. In fact, the proof of Corollary 3.13 shows that for any SM the quadruple (DM,+ ,c , γ) is unique up to an isomorphism tSM , if DM has four-dimensional complex fibers. The isomorphism tSM itself, however, is only unique up to a sign. In other words, on each spin spacetime we find a discrete Z2 -symmetry that preserves all physical relations.p Consider two Dirac structures D and D whose locally covariant spinor bundles D and D have four-dimensional complex fibers. Comparing the action of these functors on morphisms of SSpac one finds a diagram that commutes up to a sign. The existence of an initial object in the category DStruc then boils down to the question whether one can choose signs for all spin spacetimes SM in such a way that all the diagrams commute. The answer is not at all obvious, but can be neatly formulated in terms of the first Stiefel–Whitney class of the category SSpac. To explain this we will briefly recall the definition of cohomology groups for categories (cf. [26]). If C is any category, we can first build a simplicial set from it called the nerve of the category (cf. [27]). A 0-simplex is simply an object of C, a 1-simplex is a morphism between two objects, a 2-simplex is a commutative triangle, etc. We will write Σn for the set of all n-simplices. For n ≥ 1 every n-simplex has n + 1 faces, which are described by maps ∂j : Σn → Σn−1 , 0 ≤ j ≤ n, which remove the jth vertex from the diagram. 
To find the cohomology of $C$ with values in an Abelian group$^{q}$ $G$, we define an $n$-cochain with values in $G$ to be a map $v\colon \Sigma_n \to G$. We denote the set of $n$-cochains with values in $G$ by $C^n(G)$ and we define the coboundary map $d\colon C^n(G) \to C^{n+1}(G)$ by

$$dv(s) := \sum_{j=0}^{n+1} (-1)^j v(\partial_j s), \qquad s \in \Sigma_{n+1},$$

where we have written the group operation of $G$ additively. One checks that $d^2 = 0$ and defines $v$ to be closed iff $dv = 0$ and exact iff $v = dt$ for some $(n-1)$-cochain $t$. The sets of closed and exact $n$-cochains are denoted by $B^n(G)$ and $Z^n(G)$, respectively.

$^{p}$ This may be compared to [25], who use complex spinor structures and then find a local (gauge) symmetry instead of our more restricted global symmetries.
$^{q}$ [26] also considers the non-Abelian case, which is much more involved.

They inherit an Abelian group structure from $G$ and because
$Z^n(G) \subset B^n(G)$ is necessarily normal one can define the $n$th cohomology group as the quotient $H^n(G) := B^n(G)/Z^n(G)$.

Now let us return to the study of Dirac structures. Suppose that $\mathcal{D}$ and $\mathcal{D}'$ both have four-dimensional complex fibers. Without loss of generality, we may assume that both Dirac structures coincide on each spin spacetime, but the action of their locally covariant spinor bundles on a morphism $\chi$ agrees only up to a sign $v(\chi) \in \{\pm 1\}$. We can view $v\colon \chi \mapsto v(\chi)$ as a 1-cochain on the category SSpac with values in $\mathbb{Z}_2 = \{0, 1\}$, where 0 corresponds to $+1$ and 1 to $-1$. Notice that for a composition of morphisms $\chi = \chi_1 \circ \chi_2$ we find $v(\chi) = v(\chi_1) + v(\chi_2)$ in $\mathbb{Z}_2$, because the Dirac structures are both functorial. In cohomological terms this means precisely that $dv = 0$. If there is a natural equivalence $t\colon D \Leftrightarrow D'$, then the components $t_{SM}$ are automorphisms of the Dirac structure at each $SM$, i.e. $t_{SM} = \pm 1$, that compensate for all the minus signs in $v$. If we view $t$ as a 0-cochain with values in $\mathbb{Z}_2$, this means exactly that $v = dt$. So we have proved:

Theorem 3.15. The number of inequivalent Dirac structures whose locally covariant spinor bundles have four-dimensional complex fibers equals the number of first Stiefel–Whitney classes of the category SSpac, i.e. the number of elements in $H^1(\mathbb{Z}_2)$.

Remark 3.16. For scalar and vector fields the problem above can be avoided in a natural way. Taking $L^\uparrow_+$ in the defining (four-vector) representation, the vector bundle associated to $F^\uparrow_+ M$ is just the tangent bundle $TM$. A morphism in Spac determines a unique morphism on the tangent bundle, so no topological obstructions occur. Similarly for the scalar field, where one uses the trivial one-dimensional representation of $L^\uparrow_+$, whose associated vector bundle is $\Lambda^0(M) = M \times \mathbb{R}$. Again a morphism in Mann automatically determines a unique morphism on these associated vector bundles, now by the requirement that the volume element is preserved.
In general one is dealing with representations of $\mathrm{Spin}^0_{1,3}$ and associates to each morphism in SSpac an intertwining operator between such representations. For the associated vector bundles of $SM$, the physical requirements that we imposed on the bundle morphisms, concerning the adjoint and charge conjugation maps and $\gamma$, reduce the intertwiners exactly to a choice of lifting $L^\uparrow_+$ to its double cover. In this way it leads to the same first Stiefel–Whitney class that characterizes the number of spin structures on a manifold. For the general case it is expected that one needs a non-Abelian cohomology theory to quantify the obstruction for finding initial objects.

4. The Locally Covariant Quantum Dirac Field

After our discussion of the classical Dirac field in Sec. 3 we now turn to the quantum Dirac field, its construction, its Hadamard states and its relative Cauchy evolution.
4.1. Quantization of the free Dirac field

First, we will quantize the free Dirac field in a generally covariant way and establish some of its properties. For this purpose we also present the main ideas of locally covariant quantum field theory as introduced in [2] (see also [5]).

In the following, any quantum physical system will be described by a topological $*$-algebra $\mathcal{A}$ with a unit $I$, whose self-adjoint elements are the observables of the system. An injective and continuous $*$-homomorphism expresses the notion of a subsystem, whereas a state is described by a normalized and positive continuous linear functional $\omega$, i.e. $\omega(A^*A) \ge 0$ for all $A \in \mathcal{A}$ and $\omega(I) = 1$. The state space of $\mathcal{A}$ is the set of all states and is denoted by $\mathcal{A}^{*+}_1$. Every state gives rise to a GNS-representation $\pi_\omega$ (see [28, Theorem 8.6.2]), which is characterized uniquely, up to unitary equivalence, by the GNS-quadruple $(\pi_\omega, \mathcal{H}_\omega, \Omega_\omega, \mathcal{D}_\omega)$. Here $\mathcal{H}_\omega$ is the Hilbert space on which $\pi_\omega(\mathcal{A})$ acts as (possibly unbounded) operators with the dense, invariant domain $\mathcal{D}_\omega := \pi_\omega(\mathcal{A})\Omega_\omega$. The vector $\Omega_\omega$ is cyclic and satisfies $\omega(A) = \langle \Omega_\omega, \pi_\omega(A)\Omega_\omega \rangle$ for all $A \in \mathcal{A}$. The collection of all systems forms a category TAlg:

Definition 4.1. The category TAlg has as its objects all unital topological $*$-algebras $\mathcal{A}$ and as its morphisms all continuous and injective $*$-homomorphisms $\alpha$ such that $\alpha(I) = I$. A locally covariant quantum field theory is a (covariant) functor $\mathbf{A}\colon \mathrm{SSpac} \to \mathrm{TAlg}$, written as $SM \mapsto \mathcal{A}_{SM}$, $\chi \mapsto \alpha_\chi$. A locally covariant quantum field theory $\mathbf{A}$ is called causal if and only if any pair of morphisms $\psi_i\colon SM_i \to SM$, $i = 1, 2$, such that $\psi_1(M_1) \subset (\psi_2(M_2))^\perp$ in $M$ yields $[\alpha_{\Psi_1}(\mathcal{A}_{SM_1}), \alpha_{\Psi_2}(\mathcal{A}_{SM_2})] = \{0\}$ in $\mathcal{A}_{SM}$. A locally covariant quantum field theory $\mathbf{A}$ satisfies the time-slice axiom iff for all morphisms $\psi\colon SM_1 \to SM_2$ such that $\psi(M_1)$ contains a Cauchy surface for $M_2$ we have $\alpha_\Psi(\mathcal{A}_{SM_1}) = \mathcal{A}_{SM_2}$. Notice that the condition $\psi_1(M_1) \subset (\psi_2(M_2))^\perp$ is symmetric in $i = 1, 2$.
The causality condition formulates how the quantum physical system interplays with the classical gravitational background field, whereas the time-slice axiom expresses the existence of a causal dynamical law.

We now fix a choice of Dirac structure $\mathcal{D} := (D, {}^+, {}^c, \gamma)$, in order to turn the free Dirac field into a locally covariant field theory. Because we want to impose the canonical anti-commutation relations it will also be convenient to quantize spinor and cospinor fields simultaneously by introducing the following terminology:

Definition 4.2. The locally covariant double spinor bundle is the covariant functor $D \oplus D^*$. We define the following natural equivalences and natural transformations
on this bundle, indicated by their components at $SM$:

$$(p \oplus q)^c := p^c \oplus q^c, \qquad \gamma_\mu(p \oplus q) := (\gamma_\mu p) \oplus (\gamma_\mu q),$$

$$(p \oplus q)^+ := q^+ \oplus p^+, \qquad \langle p \oplus q, p' \oplus q' \rangle := \langle p^+, p' \rangle + \langle q', q^+ \rangle,$$

$$\tau(p \oplus q) := p \oplus (-q).$$

A double spinor (field) is an element of $C^\infty(DM \oplus D^*M)$. A double test-spinor (field) is an element of $C_0^\infty(DM \oplus D^*M)$. The adjoint, charge conjugation and other operations are defined pointwise. We also define the operator $P := P \oplus P_c$, its advanced ($-$) and retarded ($+$) fundamental solutions $S^\pm(u \oplus v) := (S^\pm u) \oplus (S_c^\pm v)$ and $S := S^- - S^+$.

The exterior tensor product $\mathcal{V}_1 \boxtimes \mathcal{V}_2$ of two vector bundles $\mathcal{V}_i$ with fiber $V_i$ over manifolds $M_i$, $i = 1, 2$, is the vector bundle over $M_1 \times M_2$ whose fiber is $V_1 \otimes V_2$ and whose local trivializations are determined by $(O_1 \times O_2) \times (V_1 \otimes V_2)$, where $O_i \times V_i$ are local trivializations of $\mathcal{V}_i$.

The Dirac Borchers–Uhlmann algebra $\mathcal{F}^0_{SM}$ on a spin spacetime $SM$ is the topological $*$-algebra

$$\mathcal{F}^0_{SM} := \bigoplus_{n=0}^{\infty} C_0^\infty\bigl((DM \oplus D^*M)^{\boxtimes n}\bigr),$$

where the direct sum is algebraic (i.e. only finitely many non-zero summands are allowed) and

(1) the product is given by continuous linear extension of $f_1 \cdot f_2 := f_1 \boxtimes f_2$,
(2) the $*$-operation is given by continuous antilinear extension of $(f_1 \boxtimes \cdots \boxtimes f_n)^* := f_n^+ \boxtimes \cdots \boxtimes f_1^+$,
(3) as a topological vector space $\mathcal{F}^0_{SM}$ is the strict inductive limit $\mathcal{F}^0_{SM} = \bigcup_{N=0}^{\infty} \bigoplus_{n=0}^{N} C_0^\infty\bigl((DM \oplus D^*M)^{\boxtimes n}|_{K_N^{\times n}}\bigr)$, where $K_N$ is an exhausting and increasing sequence of compact subsets of $M$ and the test-section space of the restricted vector bundle $(DM \oplus D^*M)^{\boxtimes n}|_{K_N^{\times n}}$ is given the test-section topology.

The topology of $\mathcal{F}^0_{SM}$ is such that a state is given by a sequence of $n$-point distributional sections $\omega_n$ of $(DM \oplus D^*M)^{\boxtimes n}$. A morphism $\chi\colon SM_1 \to SM_2$ in SSpac determines a unique morphism $\alpha_\chi\colon \mathcal{F}^0_{SM_1} \to \mathcal{F}^0_{SM_2}$ that is given by the algebraic and continuous extension of the morphism $DM_1 \oplus D^*M_1 \to DM_2 \oplus D^*M_2$ that is supplied by the functor $D$. Together with this map on morphisms the map $SM \mapsto \mathcal{F}^0_{SM}$ becomes a locally covariant quantum field theory $\mathbf{F}^0\colon \mathrm{SSpac} \to \mathrm{TAlg}$.

Our next task will be to divide out the ideals that generate the dynamics and the canonical anti-commutation relations. We define the natural transformation $(\,,)\colon (C_0^\infty \circ (D \oplus D^*)) \otimes_{\mathbb{R}} (C_0^\infty \circ (D \oplus D^*)) \Rightarrow \mathbb{C}$ whose components are the sesquilinear forms:

$$(f_1, f_2) := i \int_M \langle f_1, \tau S f_2 \rangle.$$
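Note that this is indeed a natural transformation, because it can be written as a composition of natural transformations including $\langle\,,\rangle$, $\int$, ${}^+$ and $\kappa$.

Lemma 4.3. On each object $SM$ the sesquilinear form $(\,,)$ is Hermitean, $(f_1, f_2) = \overline{(f_1^c, f_2^c)} = \overline{(f_2, f_1)}$, and there holds $(f_1^+, f_2^+) = (f_2, f_1)$. For any spacelike Cauchy surface $C \subset M$ with future pointing unit normal vector field $n^a$ we have

$$(u_1 \oplus v_1, u_2 \oplus v_2) = \int_C \langle (Su_1)^+, \slashed{n}(Su_2) \rangle + \langle S_c v_2, \slashed{n}(S_c v_1)^+ \rangle. \tag{10}$$

Proof. The symmetry properties follow straightforwardly from the computational rules of Theorems 3.6 and 3.10. For the last statement we also need a partial integration (see, e.g., [20, Eq. (B.2.26)] for Gauss' law) and we use the Dirac equation:

$$\begin{aligned}
(u_1 \oplus v_1, u_2 \oplus v_2)
&= i \int_{J^+(C)} \bigl( \langle P_c S_c^- u_1^+, Su_2 \rangle + \langle P_c S_c^- v_2, Sv_1^+ \rangle \bigr)
 + i \int_{J^-(C)} \bigl( \langle P_c S_c^+ u_1^+, Su_2 \rangle + \langle P_c S_c^+ v_2, Sv_1^+ \rangle \bigr) \\
&= - \int_{J^+(C)} \bigl( \nabla_a \langle S_c^- u_1^+, \gamma^a Su_2 \rangle + \nabla_a \langle S_c^- v_2, \gamma^a Sv_1^+ \rangle \bigr)
 - \int_{J^-(C)} \bigl( \nabla_a \langle S_c^+ u_1^+, \gamma^a Su_2 \rangle + \nabla_a \langle S_c^+ v_2, \gamma^a Sv_1^+ \rangle \bigr) \\
&= \int_C \bigl( n_a \langle S_c^- u_1^+, \gamma^a Su_2 \rangle + n_a \langle S_c^- v_2, \gamma^a Sv_1^+ \rangle \bigr)
 - \int_C \bigl( n_a \langle S_c^+ u_1^+, \gamma^a Su_2 \rangle + n_a \langle S_c^+ v_2, \gamma^a Sv_1^+ \rangle \bigr) \\
&= \int_C \langle (Su_1)^+, \slashed{n}(Su_2) \rangle + \langle S_c v_2, \slashed{n}(S_c v_1)^+ \rangle.
\end{aligned}$$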
From Eq. (10) we notice that $(\,,)$ is positive semi-definite and hence defines a (degenerate) inner product. We proceed by dividing $\mathcal{F}^0_{SM}$ by the closed ideal $J_{SM}$ of $\mathcal{F}^0_{SM}$ generated by all elements of the form $Pf$ or $f_1^+ \cdot f_2 + f_2 \cdot f_1^+ - (f_1, f_2)I$.

Theorem 4.4. The ideal $J_{SM}$ is a $*$-ideal and for any morphism $\chi\colon SM_1 \to SM_2$ we have $\alpha_\chi(J_{SM_1}) \subset J_{SM_2}$. We can define the locally covariant quantum field theory $\mathbf{F}\colon \mathrm{SSpac} \to \mathrm{TAlg}$ which assigns to every spin spacetime $SM$ the $C^*$-algebra $\mathcal{F}_{SM} := \overline{\mathcal{F}^0_{SM}/J_{SM}}$, the completion of the quotient in its $C^*$-norm.

Proof. The elements that generate $J_{SM}$ are invariant under adjoints and under a morphism they are mapped to elements of the same form. This proves the first statement. It follows that the quotients $\mathcal{F}^0_{SM}/J_{SM}$ are topological $*$-algebras and
that a morphism $\alpha_\chi\colon \mathcal{F}^0_{SM_1} \to \mathcal{F}^0_{SM_2}$ descends to the quotients as a well-defined morphism. That each algebra $\mathcal{F}^0_{SM}/J_{SM}$ has a $C^*$-norm follows from the fact that they are the inductive limits of finite-dimensional Clifford algebras ([29]). The morphisms on the quotients are necessarily continuous in the norm and therefore extend to morphisms on the $C^*$-algebras $\mathcal{F}_{SM}$.
Definition 4.5. A locally covariant quantum field in the locally covariant vector bundle $V$ for the locally covariant quantum field theory $\mathbf{A}$ is a natural transformation $\Phi\colon C_0^\infty \circ V \Rightarrow f \circ \mathbf{A}$, where we let $f\colon \mathrm{TAlg} \to \mathrm{TVec}$ be the forgetful functor.

We define the locally covariant quantum fields $B\colon D \oplus D^* \Rightarrow \mathbf{F}$, $\psi\colon D^* \Rightarrow \mathbf{F}$ and $\psi^+\colon D \Rightarrow \mathbf{F}$ by $B_{SM}(f) := 0 \oplus f \oplus 0 \oplus \cdots + J_{SM}$, $\psi_{SM}(v) := B_{SM}(0 \oplus v)$ and $\psi^+_{SM}(u) := B_{SM}(u \oplus 0)$. That the latter really are locally covariant quantum fields is a consequence of

Proposition 4.6. The operator-valued maps $B_{SM}$, $\psi_{SM}$, $\psi^+_{SM}$ are $C^*$-algebra-valued distributions and:

(1) $P \circ \psi = 0$ and $P_c \circ \psi^+ = 0$,
(2) $\psi^+_{SM}(u) = \psi_{SM}(u^+)^*$,
(3) $\{\psi^+_{SM}(u), \psi_{SM}(v)\} = (v^+ \oplus 0, u \oplus 0)I = -i \int_M \langle v, Su \rangle I$ and the other anticommutators vanish.

Proof. The first item is $P B_{SM}(f) = B_{SM}(P^*f) = B_{SM}(Pf) = 0$, where $P^*$ is the formal adjoint of $P$. The last two items follow from the definitions of $\psi_{SM}$ and $\psi^+_{SM}$ and the properties of $B_{SM}$ after a straightforward computation.

It remains to show that $\psi_{SM}$, $\psi^+_{SM}$ are $C^*$-algebra-valued distributions, because the result for $B_{SM}$ then follows. The $C^*$-subalgebra of $\mathcal{F}_{SM}$ generated by $I$, $\psi_{SM}(v)$, $\psi_{SM}(v)^*$ is a Clifford algebra which is isomorphic to $M(2, \mathbb{C})$ and an explicit isomorphism is given by $\psi_{SM}(v) \mapsto \begin{pmatrix} 0 & \sqrt{c} \\ 0 & 0 \end{pmatrix}$, where $c = (0 \oplus v, 0 \oplus v) = -i \int_M \langle v, Sv^+ \rangle > 0$. It follows that $\|\psi_{SM}(v)\| = \sqrt{c}$ is the operator norm of the corresponding matrix, i.e.$^{r}$

$$\|\psi_{SM}(v)\|^2 = -i \int_M \langle v, Sv^+ \rangle \, d\mathrm{vol}_g.$$
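The norm computation above can be checked numerically on the $2 \times 2$ matrix model. The following Python sketch assumes an arbitrary sample value $c = 2.5$ (a hypothetical number standing in for $-i \int_M \langle v, Sv^+ \rangle$) and a nilpotent matrix with upper-right entry $\sqrt{c}$. It verifies that this matrix $a$ satisfies $a^2 = 0$ and $\{a, a^*\} = cI$, and that its squared operator norm, computed as the largest eigenvalue of $a^*a$, equals $c$. All helper names (`matmul`, `adjoint`, `add`, `op_norm_sq`) are illustrative and not taken from the text.

```python
import math


def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def adjoint(x):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(x[j][i]).conjugate() for j in range(2)] for i in range(2)]


def add(x, y):
    """Entrywise sum of two 2x2 matrices."""
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]


def op_norm_sq(x):
    """Squared operator norm: the largest eigenvalue of the Hermitian matrix x* x."""
    h = matmul(adjoint(x), x)
    tr = (h[0][0] + h[1][1]).real
    det = (h[0][0] * h[1][1] - h[0][1] * h[1][0]).real
    return (tr + math.sqrt(max(tr * tr - 4.0 * det, 0.0))) / 2.0


c = 2.5  # assumed sample value of c > 0
a = [[0.0, math.sqrt(c)], [0.0, 0.0]]  # matrix model of psi(v)

a_squared = matmul(a, a)  # should vanish: psi(v)^2 = 0
anticommutator = add(matmul(a, adjoint(a)), matmul(adjoint(a), a))  # should be c * I

if __name__ == "__main__":
    print(a_squared)
    print(anticommutator)
    print(op_norm_sq(a))
```

The transposed matrix would serve equally well: both are nilpotent with the same anticommutator and the same norm.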
In the test-spinor topology we then have continuous maps $v \mapsto v \oplus v^+ \mapsto -i \int_M \langle v, Sv^+ \rangle$, from which it follows that $v \mapsto \psi_{SM}(v)$ is norm continuous, i.e. it is a $C^*$-algebra-valued distribution. The proof for $\psi^+_{SM}$ is analogous.

Note that the last two conditions of Proposition 4.6 can also be formulated in terms of natural transformations, because the algebraic operations in $\mathcal{F}_{SM}$ can be expressed as such.

$^{r}$ The factor 2 in [7, Remark 2, p. 340] seems to be erroneous.

The theory $\mathbf{F}$ is the quantized free Dirac field and $\psi$ ($\psi^+$) is
the locally covariant Dirac (co)spinor field. Alternatively we could have used the algebras $\mathcal{F}^0_{SM}/J_{SM}$ themselves instead of completing them to $C^*$-algebras.

To see that the anti-commutator is the canonical one (cf. [24]) we apply [6, Proposition 2.4(c)] which says that $S|_{C \times C} = -i\delta\slashed{n}$ for a Cauchy surface $C$ with future pointing normal vector field $n$. Comparing with Eq. (8) and using $\slashed{n}^2 = I$ we then find

$$\{-i\psi^+_{SM}(\slashed{n}(x)), \psi_{SM}(y)\} = -\int_M \langle y, S\slashed{n}x \rangle I = i\delta(y, x)I$$
as expected.

So far our construction depends on the choice of a Dirac structure, although naturally equivalent Dirac structures yield naturally equivalent theories and quantum fields. The following theorem restricts attention to the observable algebra, dividing out the freedom of choice completely and yielding a unique theory, but for many purposes it is not convenient to use it directly because it lacks locally covariant Dirac (co)spinor fields.

Theorem 4.7. Let $\mathbf{B}\colon \mathrm{SSpac} \to \mathrm{TAlg}$ be the locally covariant quantum field theory that assigns to each spin spacetime $SM$ the $C^*$-subalgebra of $\mathcal{F}_{SM}$ generated by all even polynomials in elements $B(f)$, with the induced action on morphisms. For all Dirac structures with four-dimensional complex fibers the resulting theories $\mathbf{B}$ are isomorphic.

Proof. The algebras $\mathcal{B}_{SM}$ generated by the even polynomials are $C^*$-algebras. Morphisms respect evenness and so restrict to morphisms on $\mathbf{B}$, making $\mathbf{B}$ a well-defined locally covariant quantum field theory. Now consider two Dirac structures $\mathcal{D}$ and $\mathcal{D}_0$ with associated functors $\mathbf{F}, \mathbf{B}$ and $\mathbf{F}_0, \mathbf{B}_0$. If both Dirac structures have four-dimensional complex fibers, then we infer from the comment below Corollary 3.13 that there are $*$-isomorphisms $\alpha_{SM}\colon \mathcal{F}_{SM} \to (\mathcal{F}_0)_{SM}$ such that for any morphism $\chi\colon SM_1 \to SM_2$ we have $\alpha_{SM_2} \circ \alpha_\chi = \epsilon_\chi \cdot (\alpha_0)_\chi \circ \alpha_{SM_1}$, where the sign $\epsilon_\chi = \pm 1$ depends only on $\chi$. It follows from the evenness that the $\alpha_{SM}$ descend to $*$-isomorphisms $\alpha_{SM}\colon \mathcal{B}_{SM} \to (\mathcal{B}_0)_{SM}$ that intertwine with the morphisms. Hence, $\mathbf{B}$ and $\mathbf{B}_0$ are naturally equivalent.

Proposition 4.8. The locally covariant quantum field theory $\mathbf{B}\colon \mathrm{SSpac} \to \mathrm{TAlg}$ of Theorem 4.7 is causal and satisfies the time-slice axiom.

Proof. Causality follows from the anti-commutation relations,

$$[B_{SM}(f_1)B_{SM}(f_2), B_{SM}(f_3)] = B_{SM}(f_1)\{B_{SM}(f_2), B_{SM}(f_3)\} - \{B_{SM}(f_1), B_{SM}(f_3)\}B_{SM}(f_2) = (f_2, f_3)B_{SM}(f_1) - (f_1, f_3)B_{SM}(f_2),$$

together with the support properties of $S$.
For the time-slice axiom, we let $\chi\colon SM_1 \to SM_2$ be a morphism in SSpac, covering a morphism $\psi\colon M_1 \to M_2$ in Spac,
such that $N := \psi(M_1) \subset M_2$ contains a Cauchy surface $C \subset M_2$. Then we can choose Cauchy surfaces $C^\pm \subset N$ such that $C^\pm \subset I^\pm(C)$ and a smooth partition of unity $\phi^+, \phi^-$ with $\mathrm{supp}\,\phi^\pm \subset J^\pm(C^\mp)$. Let $f \in C_0^\infty(DM_2 \oplus D^*M_2)$ and write

$$f = P(S^+f - \phi^+Sf) + \tilde{f}, \tag{11}$$

where $\tilde{f} := P(\phi^+Sf) = -P(\phi^-Sf)$ is supported in $J^-(C^+) \cap J^+(C^-) \subset N$ and $\phi^+Sf - S^+f$ has compact support. Hence, $B_{SM_2}(f) = B_{SM_2}(\tilde{f}) = \alpha_\chi(B_{SM_1}(\chi^*(\tilde{f})))$. Because the algebra $\mathcal{F}_{SM_2}$ is generated by such elements this shows that $\alpha_\chi$ is a $*$-isomorphism.
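The decomposition (11) can be verified in one line. The following worked check is a sketch that uses only the defining property $PS^\pm f = f$ of the fundamental solutions, its consequence $PSf = 0$ for $S = S^- - S^+$, and the partition of unity $\phi^+ + \phi^- = 1$:

$$P(S^+f - \phi^+Sf) + P(\phi^+Sf) = PS^+f = f, \qquad P(\phi^+Sf) + P(\phi^-Sf) = P\bigl((\phi^+ + \phi^-)Sf\bigr) = PSf = 0.$$

The second identity gives $P(\phi^+Sf) = -P(\phi^-Sf)$, so this element vanishes wherever either $\phi^+$ or $\phi^-$ vanishes identically, i.e. its support lies in $\mathrm{supp}\,\phi^+ \cap \mathrm{supp}\,\phi^-$.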
Remark 4.9. A Majorana spinor is a spinor $u$ such that $u = u^c$. In this case the adjoint is anti-Majorana: $u^{+c} = -u^{c+} = -u^+$. We call a double spinor $f = u \oplus v$ Majorana iff $u$ and $v^+$ are Majorana, which means that $f^c = \tau f$. Such spinors are sections of a subbundle of the Dirac spinor bundle, which can be described by a Majorana representation. Notice that every spinor is a unique complex linear combination of Majorana spinors.

To quantize Majorana spinors we note that $\langle h^c, f \rangle = \langle h^+, f^{c+} \rangle$. This leads us to define the charge conjugation on the quantized fields$^{s}$ by $\psi^c(v) := \psi^+(v^{c+})$ and $\psi^{+c}(u) := \psi(u^{c+})$, or equivalently $B^c(f) := B(f^{c+}) = B(f^c)^*$. We impose the Majorana condition $B^c(f) = B(\tau f)$ by dividing out the ideal generated by all elements of the form $B(f - \tau f^{c+})$. More precisely, if $\mathcal{H}$ is the Hilbert space obtained from $C_0^\infty(DM \oplus D^*M)$ by dividing out the ideal of double spinors $f$ for which $(f, f) = 0$, then there is an orthogonal decomposition $\mathcal{H} = \mathcal{H}_+ \oplus \mathcal{H}_-$, where the elements in $\mathcal{H}_\pm$ satisfy $\tau f^{c+} = \pm f$. Indeed, every double spinor can be written as $f = f_+ + if_-$, where $f_\pm := \frac{1}{2}(f \pm \tau f^{c+})$ are in $\mathcal{H}_\pm$ and the orthogonality follows from Lemma 4.3. For the $C^*$-algebraic quantization we then have $\mathcal{F} = \mathcal{F}_+ \otimes \mathcal{F}_-$, where $\mathcal{F}_-$ is the $C^*$-algebra of quantized Majorana spinors and $\mathcal{F}_+$ the $C^*$-algebra of quantized anti-Majorana spinors (see [30, Sec. 5.2]). The generators $\psi(v)$ and $\psi^+(u)$ of $\mathcal{F}_-$ satisfy the additional relation $\psi^c = \psi$ and $\psi^{+c} = -\psi^+$.

4.2. Hadamard states

After Radzikowski's result [31] that a state for a scalar field is of Hadamard form if and only if its wave front set has a certain form, several people set out to extend this result to the Dirac field, or more general quantum fields [32–34]. All three papers have provided an original contribution in their method of proof, but upon careful analysis they all have minor gaps.
We feel that it is justified to comment on this here and to provide the necessary results to fill any remaining gaps. The most general results are the most recent ones, due to Sahlmann and Verch [34], who set out to prove the equivalence of the Hadamard form of a state, defined in terms of the Hadamard parametrix, with a wave front set condition analogous to the scalar field case.

$^{s}$ Our definition differs slightly from that of [13].

One of the techniques used is the scaling limit, but
the proof of their Proposition 2.8, which relates the wave front set of a distribution to that of its scaling limit, is in our opinion insufficient (see footnote w). In the Appendix, we prove a similar statement as Proposition A.2, thereby filling any gap in [34] and establishing the desired equivalence on firm ground. For the Dirac field, Hollands has proved that this wave front set condition implies a specific form of the polarization set ([35, Theorem 4.1]).

The scaling limit result can also be used to find the wave front sets of the advanced and retarded fundamental solutions $E^\pm$ of normally hyperbolic operators on a globally hyperbolic spacetime, a result that we prove as Theorem A.5. Our proof is largely analogous to the work of Radzikowski and the outcome is in direct analogy to the results of Duistermaat and Hörmander [36] for the scalar case. To find the wave front sets of the fundamental solutions $S^\pm$ for the Dirac equation we use (and correct) an idea of [35].

Finally, we comment on the results by Kratzert [32], which use a spacetime deformation argument to compute the wave front set and polarization set of Hadamard states. This result has a gap, already identified in [34], concerning the case of points $(x, \xi; y, \xi')$ where either $\xi = 0$ or $\xi' = 0$, which prevents the propagation of the singularity from the original to the deformed spacetime. This gap can be avoided using either a propagation of Hadamard form result as in [34], or using the commutation or anti-commutation relations and the explicit form of $WF(E)$, respectively $WF(S)$. The latter argument, which appears to be implicit in Radzikowski's paper [31], works as follows: when $(x, \xi; y, 0) \in WF(\omega_2)$ then also $(y, 0; x, \xi) \in WF(\omega_2)$ by the (anti-)commutation relations and the fact that $WF(E)$ (or $WF(S)$) has no points with either entry equal to 0. Using the calculus of Hilbert-space-valued distributions, Theorem A.4, we then find that both $(x, \xi; x, -\xi) \in WF(\omega_2)$ and $(x, -\xi; x, \xi) \in WF(\omega_2)$.
Because $\xi \neq 0$ (by definition the wave front set does not contain the zero covector) these points can both be propagated into a deformed spacetime, where $WF(\omega)$ is known to satisfy the required microlocal condition. This, however, leads to a contradiction, because $WF(\omega_2) \cap -WF(\omega_2) = \emptyset$ and hence $\xi = 0$. Therefore, $WF(\omega_2)$ cannot contain points with one of the covectors equal to 0.

After these historical notes we feel free to define the notion of Hadamard states directly in terms of a wave front set condition, rather than using the Hadamard parametrix. If $\omega$ is a state on $\mathcal{F}_{SM}$ then we may consider the GNS-representation $(\mathcal{H}_\omega, \pi_\omega, \Omega_\omega)$ associated to $\omega$ and the $\mathcal{H}_\omega$-valued distribution on $DM \oplus D^*M$ defined by:

$$v_\omega(f) := \pi_\omega(B_{SM}(f))\Omega_\omega.$$

Definition 4.10. A state $\omega$ on $\mathcal{F}_{SM}$ is called Hadamard if and only if

$$WF(v_\omega) = N^+ := \{(x, \xi) \in T^*M \mid \xi^2 = 0,\ \xi^\mu \text{ is future pointing or } 0\}.$$

A state $\omega$ on $\mathcal{B}_{SM}$ is called Hadamard if and only if it can be extended to a Hadamard state on $\mathcal{F}_{SM}$. The set of all Hadamard states on $\mathcal{B}_{SM}$ will be denoted by $\mathcal{S}_{SM}$.
Note that every state on $\mathcal{B}_{SM}$ can be extended to $\mathcal{F}_{SM}$, by the Hahn–Banach Theorem and Proposition 4.6. The Hadamard condition is independent of the choice of extension, because it depends solely on the two-point distribution as the following proposition shows (cf. [34], we give a short proof using the more advanced microlocal techniques developed in the Appendix).

Proposition 4.11. For a state $\omega$ on $\mathcal{F}_{SM}$ the following conditions are equivalent:

(1) $\omega$ is Hadamard,
(2) $WF(v_\omega) \subset N^+$,
(3) the two-point distribution $\omega_2(f_1, f_2) := \omega(B_{SM}(f_1)B_{SM}(f_2))$ has $WF(\omega_2) = \mathcal{C} := \{(x, -\xi; y, \xi') \in T^*M^{\times 2} \setminus Z \mid (x, \xi) \sim (y, \xi'),\ (x, \xi) \in N^+\}$, where $(x, \xi) \sim (y, \xi')$ if and only if there is an affinely parameterized light-like geodesic from $x$ to $y$ to which $\xi, \xi'$ are cotangent,
(4) there is a two-point distribution $w$ such that $\omega_2(f_1, f_2) = iw(Pf_1, f_2)$ and $WF(w) = \mathcal{C}$.

Proof. First, note that $\omega_2$ is a bidistribution on $DM \oplus D^*M$, because $B_{SM}$ is an $\mathcal{F}_{SM}$-valued distribution and multiplication in $\mathcal{F}_{SM}$ and $\omega$ are continuous. By Theorem A.4 the third statement implies the first, which trivially implies the second. To show that the second statement implies the third, we use the argument of [37, Proposition 6.1]. By Theorem A.4 we see that $WF(\omega_2) \subset N^- \times N^+$, where $N^- := -N^+$. Defining $\tilde{\omega}_2(f_1, f_2) := \omega_2(f_2, f_1)$ we find $WF(\tilde{\omega}_2) \cap WF(\omega_2) = \emptyset$. Now, $(\omega_2 + \tilde{\omega}_2)(f_1, f_2) = i \int_M \langle f_1, \tau Sf_2 \rangle$, so $WF(\omega_2) \cup WF(\tilde{\omega}_2) = WF(S) = WF(E)$ by Proposition A.7 and hence $WF(\omega_2) = WF(E) \cap N^- \times N^+ = \mathcal{C}$ by Corollary A.6.

Now, assume that $\omega_2(f_1, f_2) = iw(Pf_1, f_2)$, where $WF(w) = \mathcal{C}$. Then $WF(\omega_2) = WF((P^* \otimes I)w) \subset WF(w) = \mathcal{C}$. It follows that $WF(v_\omega) \subset N^+$. For the converse we suppose that $\omega$ is Hadamard and we choose a smooth real-valued function $\phi^+$ on $M$ such that $\phi^+ \equiv 0$ to the past of some Cauchy surface $C_-$ and such that $\phi^- := 1 - \phi^+ \equiv 0$ to the future of another Cauchy surface $C_+$. We then define

$$w(f_1, f_2) := -i\omega_2(\phi^+S^-f_1 + \phi^-S^+f_1, f_2).$$
Note that $w$ is a bidistribution which is well-defined, because $\phi^+S^-f_1$ and $\phi^-S^+f_1$ are compactly supported. By construction $iw(Pf_1, f_2) = \omega_2(f_1, f_2)$. We now estimate the wave front set of $w$ as follows. The wave front sets of $S^\pm$ are determined in Proposition A.7. Then we may apply [38, Theorems 8.2.9 and 8.2.13] (in combination with Eq. (17)) to estimate the wave front sets of the tensor products $\phi^\pm(x)S^\mp(x, y)\delta(x', y')$ and the compositions in $iw(x, x') = \sum_\pm \omega_2(y, y')(\phi^\pm(x)S^\mp(x, y)\delta(x', y'))$ respectively and, using $WF(\omega_2) = \mathcal{C}$, we find:

$$WF(iw) \subset \bigcup_\pm WF(S^\mp \otimes \delta) \circ WF(\omega_2) \subset WF(\omega_2) = WF((P^* \otimes I)w) \subset WF(w),$$

i.e. $WF(w) = WF(\omega_2) = \mathcal{C}$.
The second characterization in Proposition 4.11 is especially useful, because it shows we do not need to compute the entire wave front set, as long as we can estimate it. Employing similar techniques as above one can use the anti-commutation relations and the wave front set of $\omega_2$ to estimate the wave front sets of all higher $n$-point distributions [39], showing that a Hadamard state necessarily satisfies the microlocal spectrum condition ($\mu$SC) of [40] and it follows that the set of such states is closed under operations from the algebra. We formulate this and other properties of Hadamard states in the following

Proposition 4.12. The set $\mathcal{S}_{SM}$ of all Hadamard states on $\mathcal{B}_{SM}$ satisfies:

(1) $\alpha^*_\chi(\mathcal{S}_{SM_1}) \subset \mathcal{S}_{SM_2}$ for every morphism $\chi\colon SM_1 \to SM_2$,
(2) $\mathcal{S}_{SM}$ is closed under operations from $\mathcal{B}_{SM}$,
(3) $\alpha^*_\chi(\mathcal{S}_{SM_1}) = \mathcal{S}_{SM_2}$ for every morphism $\chi\colon SM_1 \to SM_2$ such that $\psi(M_1)$ contains a Cauchy surface of $M_2$.

Proof. The first property follows from Proposition 4.11 and the fact that wave front sets are local and geometric objects (cf. [38, Chap. 8]). The second property relies on the anti-commutation relations, which imply that the truncated $n$-point distributions are totally anti-symmetric (cf. [1, 39]). The final property follows from the second characterisation in Proposition 4.11, Eq. (17) in the Appendix, the equation of motion and the Propagation of Singularities Theorem for the wave front set, which in this case follows from the propagation of the polarization set [41].

One can also prove that the state spaces are locally physically equivalent [5] and that all quasi-free Hadamard states are locally quasi-equivalent [42]. Whether the latter remains true for all Hadamard states appears to be unknown. We conclude this section with the remark that the functor $\mathbf{S}\colon \mathrm{SSpac} \to \mathrm{TVec}$ defined by $SM \mapsto \mathcal{S}_{SM}$ and $\chi \mapsto \alpha^*_\chi$ (restricted to the relevant state space) is a locally covariant state space for the theory $\mathbf{B}$ [2].

4.3. The relative Cauchy evolution of the Dirac field and the stress-energy-momentum-tensor

Now that we have a locally covariant free Dirac field at our disposal, we will investigate the idea of relative Cauchy evolution for this field and prove that it yields commutators with the stress-energy-momentum tensor. This result is completely analogous to the result for the free scalar field of [2].

Suppose that we have two objects $\mathcal{M}_0 = (M, g_0, SM_0, p_0)$ and $\mathcal{M}_g = (M, g, SM_g, p_g)$ in SSpac, where $M$ is the same in both cases and such that outside a compact set $K \subset M$ we have $g = g_0$, $SM_g = SM_0$ and $p_g = p_0$. Now let $N^\pm \subset \mathcal{M}_0$ be causally convex open regions, each containing a Cauchy surface for $\mathcal{M}_0$, such that $K$ lies to the future of $N^-$ (i.e. $K \subset J^+(N^-) \setminus N^-$ in $\mathcal{M}_0$ and hence also in $\mathcal{M}_g$) and to the past of $N^+$. We view $N^\pm$ as objects in SSpac and
consider the canonical morphisms $\iota_0^\pm\colon N^\pm \to \mathcal{M}_0$ and $\iota_g^\pm\colon N^\pm \to \mathcal{M}_g$. By the time-slice axiom, Proposition 4.8, these give rise to $*$-isomorphisms $\beta_0^\pm\colon \mathcal{B}_{N^\pm} \to \mathcal{B}_{\mathcal{M}_0}$ and $\beta_g^\pm\colon \mathcal{B}_{N^\pm} \to \mathcal{B}_{\mathcal{M}_g}$. We then define

$$\beta_g := \beta_0^+ \circ (\beta_g^+)^{-1} \circ \beta_g^- \circ (\beta_0^-)^{-1}.$$

The $*$-isomorphism $\beta_g\colon \mathcal{B}_{\mathcal{M}_0} \to \mathcal{B}_{\mathcal{M}_0}$ measures the change in an operator $A \in \mathcal{B}_{N^-}$ as it evolves to $N^+$ in the metric $g$ instead of $g_0$.$^{t}$ $\beta_g$ can be extended to a $*$-isomorphism of the algebra $\mathcal{F}_{\mathcal{M}_0}$, where we fix the signs for the isomorphisms between the spinor bundles involved by identifying the double spinor bundles over $N^\pm \subset \mathcal{M}_0$ and $N^\pm \subset \mathcal{M}_g$. It represents the relative Cauchy evolution of the free Dirac field.

We will want to compute the variation of the $*$-isomorphism $\beta_g$ as well as that of the action for the free Dirac field with respect to the metric $g$. For this purpose, we will suppose that the compact set $K \subset M$ has a contractible neighborhood $O$ which does not intersect either $N^\pm$. Let $\epsilon \mapsto g_\epsilon$ be a smooth curve from $[0, 1]$ into the space of Lorentzian metrics on $M$ starting at $g_0$ and such that $g_\epsilon = g_0$ outside $K$ for every $\epsilon$. The spin bundle $SM_\epsilon$ must be trivial over the contractible region $O$. If we assume it to be diffeomorphic to $SM_0$ outside $K$ we can simply take $SM_\epsilon = SM_0$ as a manifold and, choosing a fixed representation and matrices $A$, $C$, we obtain $D_\epsilon M = DM$. The deformation of the spin structure is contained entirely in the spin frame projection $\pi_\epsilon\colon SM_0 \to FM_\epsilon$. Let $E$ be a section of $SM_0$ over $O$ and set $(e_\epsilon)_a := \pi_\epsilon(E)$. We require that $e_\epsilon$ varies smoothly with $\epsilon$ and that $(e_\epsilon)_a = (e_0)_a$ outside $K$. To show that projections $\pi_\epsilon$ with these properties exist we can apply the Gram–Schmidt orthonormalisation procedure to $(e_0)_a$ for all $\epsilon$ simultaneously. The assignment $E \mapsto e_\epsilon$ determines $\pi_\epsilon$ completely, using the intertwining properties. The family of frames $e_\epsilon$ determines principal fiber bundle isomorphisms $FM_\epsilon \to FM_0$ between the frame bundles by $\lambda_\epsilon\colon \{(e_\epsilon)_a\} \mapsto \{(e_0)_a\}$ on $K$ and extending it by the identity on the rest of $M$. By definition this isomorphism intertwines the action of $L^\uparrow_+$ on the orthonormal frame bundles.

Remark 4.13. There may be many deformations of the spin structure, i.e. many families of projections $\pi_\epsilon$ which satisfy our requirements.
However, the variation of terms like $\langle v, Pu \rangle$ will not depend on this choice. Indeed, if $\pi'_\epsilon$ is a different deformation of the spin structure, then $e'_\epsilon := \pi'_\epsilon(E) = R_{\Lambda_\epsilon} e_\epsilon = \pi_\epsilon(R_{S_\epsilon}E)$ for some smooth curve $S_\epsilon$ in $\mathrm{Spin}^0_{1,3}$. However, using the invariance of $\langle\,,\rangle$ under the action of the gauge group $\mathrm{Spin}^0_{1,3}$, the variation will be equal in both cases.

$^{t}$ In [2], it seems the authors have the scattering of a state in mind as it passes through the perturbed metric, which leads them to consider the $*$-isomorphisms $\beta_g^{-1}$ rather than $\beta_g$. When we take the variation with respect to $g$ this gives rise to a sign.

(Also $\delta u = 0$ for
every spinor u, because D M = DM.) In this sense, the variation will only depend on the variation of the metric. 4.3.1. The stress-energy-momentum tensor The classical stress-energy-momentum tensor for the Dirac field is defined as a variation of the action S = M LD , with the Lagrangian density (7), with respect to g µν (x): 2 δS , Tµν (x) := −det g(x) δg µν (x)
(12)
where ψ is a free classical Dirac spinor, ψ + its adjoint. An explicit computation yieldsu Tµν =
i (ψ + , γ(µ ∇ν) ψ − ∇(µ ψ + , γν) ψ). 2
Here the brackets around indices denote symmetrization as an idempotent operation and in the following indices between $|\cdots|$ are to be excluded from the symmetrization. Following [7] we quantize the stress-energy-momentum tensor via a point-split procedure, i.e. we want to find a bi-distribution of scalar test-functions which reduces to $T_{\mu\nu}$ on the diagonal and which can be quantized in a straightforward way. For this purpose we use a local spin frame $E_A$ and recall that the components $\gamma_a{}^A{}_B$ of $\gamma_a$ are constant. We define:
\[ T^s_{ab}(x,y) := \frac{i}{2}\Big(\langle\psi^+, E_A\rangle(x)\,\gamma_{(a}{}^A{}_{|B|}\,\langle E^B, e^\mu_{b)}\nabla_\mu\psi\rangle(y) - \langle e^\mu_{(a}\nabla_{|\mu}\psi^+, E_{A|}\rangle(x)\,\gamma_{b)}{}^A{}_B\,\langle E^B, \psi\rangle(y)\Big), \]
which reduces to $T_{ab} := e^\mu_a e^\nu_b T_{\mu\nu}$ in the limit $y \to x$. Performing a partial integration, $\int_M \nabla_\mu(e^\mu_a\langle v, u\rangle) = 0$, we can write $T^s_{ab}$ as a bidistribution of scalar test-functions $h_1, h_2$,
\[ T^s_{ab}(h_1,h_2) = \frac{i}{2}\Big(-\psi^+(E_A h_1)\,\gamma_{(a}{}^A{}_{|B}\,\psi(\nabla_{\mu|}(E^B e^\mu_{b)} h_2)) + \psi^+(\nabla_\mu(e^\mu_{(a}E_{|A|}h_1))\,\gamma_{b)}{}^A{}_B\,\psi(E^B h_2)\Big). \tag{13} \]
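The index conventions used in Eq. (13) can be illustrated on small arrays. The following is a minimal sketch, assuming the normalization $T_{(ab)} = \frac{1}{2}(T_{ab} + T_{ba})$ (which is what makes the operation idempotent); the function names are hypothetical and the nested lists merely stand in for tensor components.

```python
def symmetrize(t):
    """Return T_(ab) = (T_ab + T_ba) / 2 for a square array given as nested lists."""
    n = len(t)
    return [[(t[a][b] + t[b][a]) / 2 for b in range(n)] for a in range(n)]


def symmetrize_outer(t):
    """Return T_(a|c|b): symmetrize a rank-3 array t[a][c][b] over a and b,
    leaving the middle index c (the one between bars) untouched."""
    n = len(t)
    return [[[(t[a][c][b] + t[b][c][a]) / 2 for b in range(n)]
             for c in range(n)] for a in range(n)]


# Illustrative components (not taken from the paper).
T = [[1, 2], [4, 3]]
T_SYM = symmetrize(T)
```

Applying `symmetrize` a second time leaves `T_SYM` unchanged, which is the idempotency referred to in the text.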
Equation (13) can be promoted to the quantized case by replacing $\psi$ and $\psi^+$ by the components $\psi_{SM}$ and $\psi^+_{SM}$ of the corresponding locally covariant quantum field. The expression (13) can be viewed as a formal expression for the same distribution with quantized field operators.

u For explicit computations, we refer to [43, Sec. 4], which uses a Lagrangian that differs from ours by a total derivative. Varying with respect to $g_{\mu\nu}$ would yield the opposite sign.
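Proposition 4.14. For all $f \in C_0^\infty(DM \oplus D^*M)$ and $h \in C_0^\infty(M)$ we have:
\[ \int_M [T^s_{ab}(x,x), B_{SM}(f)]\,h(x)\,d\mathrm{vol}_g(x) = \frac{1}{2}\left\{(\nabla_{(a}B_{SM})(\gamma_{b)}(S\tau f)h) - B_{SM}(\gamma_{(b}\nabla_{a)}(S\tau f)h)\right\}, \]
where $\nabla_a := e^\mu_a\nabla_\mu$.

Proof. For $f = u \oplus v$ we use Proposition 4.6 to obtain:
\[ \begin{aligned}
\{B_{SM}(f), \psi^+_{SM}(E_A h)\} &= -i\int_M \langle v, S E_A h\rangle\, I = i\int_M \langle S_c v, E_A h\rangle\, I, \\
\{B_{SM}(f), \psi_{SM}(\nabla_\mu E^B e^\mu_b h)\} &= -i\int_M \langle \nabla_\mu E^B e^\mu_b h, Su\rangle\, I = i\int_M \langle E^B, e^\mu_b\nabla_\mu Su\rangle h\, I, \\
\{B_{SM}(f), \psi^+_{SM}(\nabla_\mu e^\mu_a E_A h)\} &= -i\int_M \langle v, S\nabla_\mu e^\mu_a E_A h\rangle\, I = -i\int_M \langle e^\mu_a\nabla_\mu S_c v, E_A h\rangle\, I, \\
\{B_{SM}(f), \psi_{SM}(E^B h)\} &= -i\int_M \langle E^B, Su\rangle h\, I.
\end{aligned} \]
With Eq. (13), the commutation relations and $[AB, C] = A\{B, C\} - \{A, C\}B$ this implies
\[ \begin{aligned}
[T^s_{ab}(x,y), B_{SM}(f)] = \frac{1}{2}\Big\{ &\psi^+_{SM}(E_A(x))\,\gamma_{(a}{}^A{}_{|B|}\,\langle E^B, \nabla_{b)}Su\rangle(y) + \langle S_c v, E_A\rangle(x)\,\gamma_{(a}{}^A{}_{|B|}\,(\nabla_{b)}\psi_{SM})(E^B(y)) \\
&- (\nabla_{(a}\psi^+_{SM})(E_{|A|}(x))\,\gamma_{b)}{}^A{}_B\,\langle E^B, Su\rangle(y) - \langle\nabla_{(a}S_c v, E_{|A|}\rangle(x)\,\gamma_{b)}{}^A{}_B\,\psi_{SM}(E^B(y)) \Big\}.
\end{aligned} \]
In this expression, we are multiplying distributions with smooth functions, so we may take the coincidence limit yielding:
\[ \begin{aligned}
[T^s_{ab}(x,x), B_{SM}(f)] &= \frac{1}{2}\Big\{ \psi^+_{SM}(\gamma_{(a}\nabla_{b)}(Su)(x)) + \nabla_{(b}\psi_{SM}(S_c v\,\gamma_{a)}(x)) - \nabla_{(a}\psi^+_{SM}(\gamma_{b)}Su(x)) - \psi_{SM}(\nabla_{(a}(S_c v)\gamma_{b)}(x)) \Big\} \\
&= \frac{-1}{2}\left\{\nabla_{(a}B_{SM}(\gamma_{b)}S\tau f(x)) - B_{SM}(\gamma_{(b}\nabla_{a)}(S\tau f)(x))\right\},
\end{aligned} \]
from which the result follows.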
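The algebraic identity $[AB, C] = A\{B, C\} - \{A, C\}B$ used in the proof holds in any associative algebra. As a quick sanity check it can be tested on random integer matrices; this is only a sketch in which finite matrices stand in for the field operators (an assumption made for illustration), and all function names are hypothetical.

```python
import random


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lincomb(s, a, t, b):
    """Return s*a + t*b for square matrices a, b and scalars s, t."""
    return [[s * x + t * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def commutator(a, b):
    return lincomb(1, matmul(a, b), -1, matmul(b, a))


def anticommutator(a, b):
    return lincomb(1, matmul(a, b), 1, matmul(b, a))


def identity_holds(n=4, trials=20, seed=0):
    """Check [AB, C] == A{B, C} - {A, C}B on random integer matrices."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c = ([[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
                   for _ in range(3))
        lhs = commutator(matmul(a, b), c)
        rhs = lincomb(1, matmul(a, anticommutator(b, c)),
                      -1, matmul(anticommutator(a, c), b))
        if lhs != rhs:
            return False
    return True
```

Integer entries keep the comparison exact, so `identity_holds()` tests the identity itself rather than a floating-point approximation of it.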
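This result can be written for spinors and cospinors separately as:
\[ \begin{aligned}
\int_M [T^s_{ab}(x,x), \psi_{SM}(v)]\,h(x)\,d\mathrm{vol}_g(x) &= \frac{1}{2}\left\{\nabla_{(a}\psi_{SM}((S_c v)\gamma_{b)}h) - \psi_{SM}(\nabla_{(a}(S_c v)\gamma_{b)}h)\right\}, \\
\int_M [T^s_{ab}(x,x), \psi^+_{SM}(u)]\,h(x)\,d\mathrm{vol}_g(x) &= \frac{-1}{2}\left\{\nabla_{(a}\psi^+_{SM}(\gamma_{b)}Su\,h) - \psi^+_{SM}(\gamma_{(a}\nabla_{b)}(Su)h)\right\}.
\end{aligned} \]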
4.3.2. Relative Cauchy evolution

To compute the relative Cauchy evolution explicitly, we first note that the isomorphism $\beta_g$ can be characterized in terms of its action on the generators $B_{M_0}(f)$ of $F_{M_0}$ as follows:

Proposition 4.15. For $f \in C_0^\infty(DN^+ \oplus D^*N^+)$, we have $\beta_g B_0(f) = B_0(T_g f)$, where $T_g f = P_g\phi^+ S_g P_0\phi^- S_0 f$. Here the subscripts on $B$, $P$ and $S$ indicate whether they are the objects defined on $M_0$ or $M_g$ and the smooth functions $\phi^\pm$ are such that $\phi^\pm \equiv 1$ to the past of some Cauchy surface in $N^\pm$ and $\phi^\pm \equiv 0$ to the future of some other Cauchy surface in $N^\pm$.

Proof. Note that $\beta_g^- \circ (\beta_0^-)^{-1} B_0(\tilde f) = B_g(\tilde f)$ for any $\tilde f \in C_0^\infty(DN^- \oplus D^*N^-)$. Similarly, for $f' \in C_0^\infty(DN^+ \oplus D^*N^+)$ we have $\beta_0^+ \circ (\beta_g^+)^{-1} B_g(f') = B_0(f')$. The functions $\phi^\pm$, $1 - \phi^\pm$ have been chosen appropriately in order to apply Eq. (11) in Proposition 4.8. We then have $B_0(\tilde f) = B_0(f)$, where $\tilde f := -P_0\phi^- S_0 f$. Notice that $\tilde f$ indeed has a compact support in $N^-$. Similarly, $B_g(\tilde f) = B_g(f')$, where $f' := -P_g\phi^+ S_g\tilde f$ has support in $N^+$. Hence, for $f' = T_g f$:
\[ \beta_g B_0(f) = \beta_g B_0(\tilde f) = \beta_0^+ \circ (\beta_g^+)^{-1} B_g(\tilde f) = \beta_0^+ \circ (\beta_g^+)^{-1} B_g(f') = B_0(f'). \]

On each spin spacetime $M_\epsilon = (M, g_\epsilon, SM_0, \pi_\epsilon)$ we can now quantize the Dirac field and obtain relative Cauchy evolutions $\beta_\epsilon := \beta_{g_\epsilon}$ on $F_{N^+}$ as before.

Proposition 4.16. Writing $\delta := \partial_\epsilon|_{\epsilon=0}$ we have for all $f \in C_0^\infty(DN^+ \oplus D^*N^+)$:
\[ \delta(\beta_\epsilon B_0(f)) = B_0(\tau(\delta\slashed{\nabla}_\epsilon)S_0 f). \]

Proof. Using the fact that $B_0$ is a $C^*$-algebra-valued distribution and Proposition 4.15 we find:
\[ \begin{aligned}
\delta(\beta_\epsilon B_0(f)) &= \delta(B_0(P_\epsilon\phi^+ S_\epsilon P_0\phi^- S_0 f)) = B_0(\delta(P_\epsilon\phi^+ S_\epsilon)P_0\phi^- S_0 f) \\
&= B_0(\delta(P_\epsilon)\phi^+ S_0 P_0\phi^- S_0 f) + B_0(P_0\phi^+\delta(S_\epsilon)P_0\phi^- S_0 f).
\end{aligned} \]
Now, because $P_0\phi^- S_0 f \in C_0^\infty(DN^- \oplus D^*N^-)$ we see that $\delta(S_\epsilon)P_0\phi^- S_0 f$ vanishes on $J^-(N^-)$ and that $\phi^+\delta(S_\epsilon)P_0\phi^- S_0 f$ has compact support. Because $B_0$ solves the Dirac equation we conclude that the second term vanishes. The first term can be rewritten using Eq. (11), which yields $S_0 f = -S_0 P_0(\phi^- S_0 f)$ and hence:
\[ \delta(\beta_\epsilon B_0(f)) = -B_0(\delta(P_\epsilon)\phi^+ S_0 f) = -B_0(\delta(P_\epsilon)S_0 f). \]
For the last equality, we used the fact that $\delta(P_\epsilon)$ is supported in $K$, where $\phi^+ \equiv 1$. Recall that $P = (-i\slashed{\nabla} + m) \oplus (i\slashed{\nabla} + m)$ to get the final result.

To compute the variation of the Dirac operator we may work in a local frame on $O$, where it is supported. Because the Dirac adjoint map is independent of $\epsilon$ we only need to compute this variation either for spinors or for cospinors:

Lemma 4.17. For $v \in C_0^\infty(D^*M)$ we have $\delta(\slashed{\nabla}_\epsilon)v = (\delta(\slashed{\nabla}_\epsilon)v^+)^+$.

Proof. Because the adjoint operation is continuous we have:
\[ \delta(\slashed{\nabla}_\epsilon)v = \partial_\epsilon\slashed{\nabla}_\epsilon v|_{\epsilon=0} = \partial_\epsilon(\slashed{\nabla}_\epsilon v^+)^+|_{\epsilon=0} = (\partial_\epsilon\slashed{\nabla}_\epsilon v^+|_{\epsilon=0})^+ = (\delta(\slashed{\nabla}_\epsilon)v^+)^+. \]

It is interesting to note that only the variation of the Dirac operator is of importance for the variation of the relative Cauchy evolution, just like for the stress-energy-momentum tensor (cf. [43]). It will also turn out that the variation only depends on the variation of the metric and not on the other freedom in the variation of the orthonormal frame, even though we are now acting on it with the $C^*$-algebra-valued field (cf. Remark 4.13). This will follow from the proof of the following theorem, for which we refer to Appendix B.

Theorem 4.18. For a double test-spinor $f \in C_0^\infty(DM_0 \oplus D^*M_0)$ and $x \in K$:
\[ \frac{\delta}{\delta g^{\alpha\beta}(x)}(\beta_g B_0(f)) = -B_0\left(\frac{\delta}{\delta g^{\alpha\beta}(x)}P_g S_0 f\right) = \frac{-i}{2}\,e^a_\alpha e^b_\beta\,[T^s_{ab}(x,x), B_0(f)]. \tag{14} \]

This result compares well with the scalar field case, [2, Theorem 4.3].$^{v}$ As particular cases we obtain for $\psi$ and $\psi^+$:
\[ \begin{aligned}
\frac{\delta}{\delta g^{\alpha\beta}(x)}(\beta_g\psi(v)) &= \frac{-i}{2}\,e^a_\alpha e^b_\beta\,[T^s_{ab}(x,x), \psi(v)], \\
\frac{\delta}{\delta g^{\alpha\beta}(x)}(\beta_g\psi^+(u)) &= \frac{-i}{2}\,e^a_\alpha e^b_\beta\,[T^s_{ab}(x,x), \psi^+(u)].
\end{aligned} \]

v The sign explained in footnote t cancels the sign due to the variation with respect to $g^{\alpha\beta}$ instead of $g_{\alpha\beta}$.
It follows that the same result also holds for products and sums of smeared field operators.
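One way to see why the result extends to products and sums is that both sides act as derivations: a commutator $[T, \cdot\,]$ is linear and satisfies the Leibniz rule $[T, AB] = [T, A]B + A[T, B]$. The sketch below checks this rule on random integer matrices; the matrices are only finite-dimensional stand-ins for smeared field operators (an assumption for illustration), and the function names are hypothetical.

```python
import random


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lincomb(s, a, t, b):
    """Return s*a + t*b for square matrices a, b and scalars s, t."""
    return [[s * x + t * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def commutator(t, a):
    return lincomb(1, matmul(t, a), -1, matmul(a, t))


def leibniz_defect(t, a, b):
    """Entries of [T, AB] - ([T, A]B + A[T, B]); all zero if the rule holds."""
    lhs = commutator(t, matmul(a, b))
    rhs = lincomb(1, matmul(commutator(t, a), b), 1, matmul(a, commutator(t, b)))
    return lincomb(1, lhs, -1, rhs)


def random_matrix(n, rng):
    return [[rng.randint(-4, 4) for _ in range(n)] for _ in range(n)]


RNG = random.Random(1)
T_OP, A_OP, B_OP = (random_matrix(3, RNG) for _ in range(3))
DEFECT = leibniz_defect(T_OP, A_OP, B_OP)
```

Every entry of `DEFECT` is zero, so a commutator formula known on generators propagates to their products, and linearity covers sums.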
5. Conclusions

A rigorous formulation of quantum field theories in curved spacetime, going beyond the well-known scalar field, is a prerequisite for constructing more realistic cosmological models as well as for improving our understanding of quantum field theory in Minkowski spacetime. The main purpose of this paper was to present the free Dirac field in a four-dimensional globally hyperbolic spacetime as a locally covariant quantum field theory in the sense of [2] and to compute the relative Cauchy evolution of this field, obtaining commutators with the stress-energy-momentum tensor in analogy with the free real scalar field. We achieved this in a representation independent way and in a functorial, and therefore manifestly covariant, framework. We established some basic properties of the locally covariant free Dirac field and remarked on the quantization of Majorana spinors. We also provided a detailed discussion of Hadamard states, closing any gaps in the existing proofs of the equivalence of the definitions in terms of the series expansion of their two-point distribution and a microlocal condition, respectively. Furthermore, we argued that the observable part of the theory is uniquely determined by the relations between adjoints, charge conjugation and the Dirac operator, although the geometric constructions themselves may not be unique due to the cohomological properties of the category of spin spacetimes. On a mathematical level we have consistently replaced a single spin spacetime SM by the category SSpac of such spacetimes, and the differential geometry on SM by the corresponding functorial descriptions. On a physical level, however, we should not conclude from this that SSpac is now the physical arena in which our system lives, instead of a collection of systems. (See [1, Chap. 1] for more detailed philosophical remarks on the interpretation of the locally covariant approach.)
Acknowledgments

I would like to thank Chris Fewster for suggesting to use the cohomological language in Sec. 3.4 and for bringing the problem of computing the relative Cauchy evolution for the Dirac field to my attention. I would also like to thank Romeo Brunetti for correcting some of my misconceptions in the early stages of this computation. An anonymous referee made several important corrections and helpful suggestions, for which I am grateful. Much of this work was performed as part of my PhD thesis at the University of York and I would also like to thank the University of Trento for their kind hospitality during my visit in October 2007. Furthermore, this research was supported by the German Research Foundation (Deutsche Forschungsgemeinschaft (DFG)) through the Institutional Strategy of the University of Göttingen
and the Graduiertenkolleg 1493 “Mathematische Strukturen in der modernen Quantenphysik”.

Appendix A. Results in Microlocal Analysis

In this appendix, we will list some results concerning the microlocal analysis of distributions. For a detailed treatment of scalar distributions we refer to [38], whereas Hilbert and Banach-space-valued distributions are treated in [1, 37]. More details concerning distributional sections of vector bundles can be found in, e.g., [1, 16, 34, 41]. Before we discuss distributional sections of vector bundles, we first consider the scaling limit of a distribution in an open set of $\mathbb{R}^n$:

Definition A.1. Let $O \subset \mathbb{R}^n$ be a convex open region containing 0. For all $\lambda > 0$ we define the scaling map $\delta_\lambda : O \to O$ by $\delta_\lambda(x) := \lambda x$. Let $u$ be a distribution on $O$. The scaling degree $d$ of $u$ at 0 is defined as
\[ d := \inf\{\beta \in [-\infty, \infty) \mid \lim_{\lambda\to0}\lambda^\beta\delta_\lambda^* u = 0\}, \]
where $(\delta_\lambda^* u)(f) := \lambda^{-n}u(f \circ \delta_\lambda^{-1})$. If $u_0 := \lim_{\lambda\to0}\lambda^d\delta_\lambda^* u$ exists we call it the scaling limit of $u$ at 0.

Note that the scaling limit may fail to exist (e.g., $u(x) = \log|x|$) or it may vanish (e.g., if $0 \notin \mathrm{supp}(u)$). On a manifold, we will only consider scaling limits in a certain choice of local coordinates. How this limit depends on this choice of coordinates will not be relevant for us. We now prove the following result$^{w}$:

Proposition A.2. Let $u$ be a distribution on a convex open region $O \subset \mathbb{R}^n$ containing 0 with scaling limit $u_0$ at 0. Then $\{0\} \times \pi_2(WF(u_0)) \subset WF(u)$, where $\pi_2$ denotes the projection on the second coordinate.

w A similar result was also claimed as [34, Proposition 2.8], but we find their proof unconvincing. In particular, when localizing the scaling limit $u_0$ with a test-function $\chi_0$ and estimating (cf. [34, Eq. (2.11)])
\[ \widehat{\chi_0 u_0}(\xi) = \lim_{\lambda\to0}\lambda^{d-n}\,u\!\left(\chi_0\!\left(\tfrac{\cdot}{\lambda}\right)e^{-i\frac{\xi}{\lambda}\cdot}\right) \]
the test-function $\chi_0(\tfrac{\cdot}{\lambda})$ becomes singular in the limit $\lambda \to 0$. The quoted reference pays insufficient attention to this issue in the last sentence of their proof, because their last estimate does not involve any $\chi_0$.

Proof. Suppose that $(0, \xi_0) \notin WF(u)$ with $\xi_0 \neq 0$. We will prove that $(x, \xi_0) \notin WF(u_0)$ for all $x$. By assumption, we can choose $\chi \in C_0^\infty(O)$ and an open conic neighborhood $\Gamma \subset \mathbb{R}^n$ of $\xi_0$ such that $\chi \equiv 1$ on a neighborhood of 0 and $(\mathrm{supp}(\chi) \times \Gamma) \cap WF(u) = \emptyset$. We set $v := \chi u$ and $v^\lambda := \lambda^d\delta_\lambda^* v$, where $d$ is the scaling degree of $u$ at 0. Notice that $WF(v) \cap T_0^*O = WF(u) \cap T_0^*O$ and $u_0 = \lim_{\lambda\to0}v^\lambda$, so without loss of generality, we may prove the result with $v$ replacing $u$ and we can view the $v^\lambda$ as compactly supported distributions on all of $\mathbb{R}^n$. Notice that for $\lambda > 0$ we have $\delta_\lambda^* u_0 = \lambda^{-d}u_0$, i.e. $u_0$ is a homogeneous distribution and therefore it is tempered ([38, Theorem 7.1.18]). We now prove that $v^\lambda$ converges to $u_0$ in the sense of tempered distributions on $\mathbb{R}^n$. For this we first write $v = \sum_{|\alpha|\le r}(-1)^{|\alpha|}\partial^\alpha v_\alpha$, where $r$ is the order of $v$ and the $v_\alpha$ are compactly supported distributions of order 0 (see [38, Sec. 2.1]). Note that $|\alpha|$
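\[ |w^\lambda(\phi)| \le C\sum_{|\alpha|\le r}\sup|\partial^\alpha\phi|, \qquad \mathrm{supp}(\phi) \subset B_1, \tag{15} \]
for some $C, r > 0$, where $B_1$ is the (Euclidean) unit ball and $0 < \lambda \le 1$. In fact, for $\lambda \ge 1$ we also have
\[ |w^\lambda(\phi)| = \lambda^{d-n}|w(\phi \circ \delta_\lambda^{-1})| \le C\sum_{d-n\le|\alpha|\le r}\lambda^{d-n-|\alpha|}\sup|\partial^\alpha\phi| \le C\sum_{d-n\le|\alpha|\le r}\sup|\partial^\alpha\phi|, \]
so the estimate (15) holds for all $\lambda > 0$. Now, let $\phi \in S(\mathbb{R}^n)$ be a function of rapid decrease and choose a partition of unity on $\mathbb{R}^n$ as follows. We let $\chi_0 \in C_0^\infty(\mathbb{R}^n)$ be positive such that $\chi_0 \equiv 1$ on $B_1$ and $\chi_0(x) = 0$ when $\|x\| \ge 2$. We then set $\chi_m(x) := \chi_0(2^{-m}x) - \chi_0(2^{1-m}x)$ and note that:
\[ \mathrm{supp}(\chi_{m\ge1}) \subset \{x \mid 2^{m-1} \le \|x\| \le 2^{m+1}\}, \qquad \sum_{m=0}^\infty\chi_m = 1, \]
where the sum is finite near every point. We define $\phi_m := \chi_m\phi$ and $\mu_m := 2^{-m-1}$ and rescale $\phi_m$ in order to apply the estimate (15):
\[ \begin{aligned}
|w^\lambda(\phi_m)| &= \mu_m^{d-n}\left|w^{\lambda/\mu_m}\!\left(\phi_m\!\left(\tfrac{\cdot}{\mu_m}\right)\right)\right| \\
&\le C\sum_{|\alpha|\le r}\mu_m^{d-n-|\alpha|}\sup\left|(\partial^\alpha\phi_m)\!\left(\tfrac{\cdot}{\mu_m}\right)\right| \\
&\le C_1\sum_{|\alpha|\le r,\ |\beta|\le r+n-d}\sup_{\mathbb{R}^n}|x^\beta\partial^\alpha\phi_m|, \qquad m \ge 0,
\end{aligned} \tag{16} \]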
where the last line uses $\mu_m^{d-n-|\alpha|} \le (4\|x\|)^{|\alpha|+n-d}$ for $m \ge 1$, which follows from $d - n \le |\alpha|$ and the support properties of $\chi_m$. (For $m = 0$ we simply estimate $\mu_0^{d-n-|\alpha|}$ by a constant to arrive at the last line of (16).) We now note that $\max_\alpha\sup_x|\partial^\alpha\chi_m| \le c$ for some $c$ independent of $m$, as the derivatives only bring out extra factors of $2^{-m} \le 1$. Moreover, for $m \ge 0$ we notice that $\chi_{m+1} + \chi_m + \chi_{m-1} \equiv 1$ on $\mathrm{supp}(\chi_m)$, where we define $\chi_{-1} := 0$. Therefore (16) leads to
\[ |w^\lambda(\phi_m)| \le C_2\sum_{|\alpha|\le r,\ |\beta|\le r+n-d}\sup_{\mathbb{R}^n}|x^\beta\partial^\alpha\phi|(\chi_{m+1} + \chi_m + \chi_{m-1}) \]
and summing over $m \ge 0$ then gives:
\[ |w^\lambda(\phi)| \le 3C_2\sum_{|\alpha|\le r,\ |\beta|\le r+n-d}\sup_{\mathbb{R}^n}|x^\beta\partial^\alpha\phi|. \]
This shows that $w^\lambda(\phi)$ can be estimated by a seminorm on $S(\mathbb{R}^n)$ uniformly in $\lambda$. It then follows that $w^\lambda \to u_0$ and hence $v^\lambda \to u_0$ as tempered distributions. Indeed, for any $\phi \in S(\mathbb{R}^n)$ and $\epsilon > 0$ we can choose $\phi' \in C_0^\infty(\mathbb{R}^n)$ and $\lambda_0 > 0$ such that $|w^\lambda(\phi - \phi')| < \frac{\epsilon}{2}$ for all $\lambda > 0$ and $|w^\lambda(\phi')| < \frac{\epsilon}{2}$ for all $\lambda < \lambda_0$.

Fourier transformation is a continuous operation on tempered distributions, so we can compute:
\[ |\hat u_0(\xi)| = \lim_{\lambda\to0}\lambda^{d-n}\left|\hat v\!\left(\tfrac{\xi}{\lambda}\right)\right| \le C_N\lim_{\lambda\to0}\lambda^{d-n}\left\|\tfrac{\xi}{\lambda}\right\|^{-N} = C_N\|\xi\|^{-N}\lim_{\lambda\to0}\lambda^{N+d-n} \]
for all $\xi$ in $\Gamma$, all $N \in \mathbb{N}$ and suitable $C_N > 0$. For $N > n - d$ the limit yields $\hat u_0(\xi) = 0$ near $\xi_0$. We then apply [38, Theorem 8.1.8], which says that for a homogeneous distribution we have for all $x \neq 0$ that $(x, \xi_0) \in WF(u_0)$ if and only if $(\xi_0, -x) \in WF(\hat u_0)$ and also $(0, \xi_0) \in WF(u_0)$ if and only if $\xi_0 \in \mathrm{supp}(\hat u_0)$.

For a distribution $u$ with values in a Banach space $B$ one can define the wave front set by using estimates of the norm $\|u(\chi e^{i\xi\cdot})\|$, which replace the corresponding estimates of the absolute value $|u(\chi e^{i\xi\cdot})|$ for scalar distributions [37]. Alternatively, one can use the following equivalent characterization ([1, Theorem A.1.4]):
\[ WF(u) = \bigcup_{l\in B'}WF(l \circ u)\setminus Z. \tag{17} \]
A similar idea works for a distributional section $u$ of a vector bundle $V = O \times \mathbb{R}^m$ over a contractible region $O$ of $\mathbb{R}^n$. Indeed, using a basis $e_i$ for $\mathbb{R}^m$ with dual basis $e^i$ we can identify $u$ with a distribution $\tilde u$ on $O$ with values in $B \otimes (\mathbb{R}^m)^*$, where the correspondence is given by
\[ \tilde u(h) := \sum_{i=1}^m u(he_i) \otimes e^i, \qquad u\!\left(\sum_{i=1}^m f^ie_i\right) = \sum_{i=1}^m\langle\tilde u(f^i), e_i\rangle, \]
where $\langle\,,\rangle$ denotes the canonical pairing of $\mathbb{R}^m$ with the second factor of $B \otimes (\mathbb{R}^m)^*$. We set by definition $WF(u) := WF(\tilde u)$. Equation (17) allows a straightforward generalization of many results for scalar distributions on open sets of $\mathbb{R}^n$ to Banach-space-valued distributional sections of a vector bundle over regions of $\mathbb{R}^n$. Moreover, by showing how these results transform under changes of coordinates they can be formulated for vector bundles on a manifold. We list a number of these results in the following theorem (cf. [1, 38]):

Theorem A.3. If $u, v$ are distributional sections of a complex vector bundle $V$ over the spacetime $M$ with values in the Banach space $B$, then:
(1) $\mathrm{sing\,supp}(u)$ is the projection of $WF(u)$ on the first variable,
(2) $u \in C^\infty(V, B)$ if and only if $WF(u) = \emptyset$,
(3) $WF(u + v) \subset WF(u) \cup WF(v)$,
(4) if $P$ is a linear partial differential operator on $V$ with smooth coefficients and (matrix-valued) principal symbol$^{x}$ $p(x;\xi)$, then $WF(Pu) \subset WF(u) \subset WF(Pu) \cup \Omega_P$, where $\Omega_P := \{(x;\xi) \in T^*M \mid \xi \neq 0, \det p(x;\xi) = 0\}$,
(5) if $x \in M$, $\phi : U \to \mathbb{R}^n$ is a local trivialization on a convex neighborhood $U$ with $\phi(x) = 0$ and $(\phi^{-1})^*u$ has a scaling limit $u_0$ at 0, then $\phi^*(\{0\} \times \pi_2(WF(u_0))) \subset WF(u) \cap T_x^*M$.

In the last item, the scaling limit depends not just on the choice of coordinates, but also on the choice of a frame $e_i$ of $V$ over $U$ and we let the scaling maps $\delta_\lambda$ act on sections of $V$ componentwise: $(\sum_i f^ie_i) \circ \delta_\lambda^{-1} = \sum_i(f^i \circ \delta_\lambda^{-1})e_i$.

x See [16] for the definition of the principal symbol.

In the particular case where $B$ is a Hilbert space, we also have (see [1, 37]):

Theorem A.4. Let $H$ be a Hilbert space and $V_i$, $i = 1, 2$, two finite-dimensional (complex) vector bundles over smooth $n_i$-dimensional spacetimes $M_i$ with complex conjugations $J_i$, i.e. the $J_i$ are antilinear, base-point preserving bundle isomorphisms $J_i : V_i \to V_i$ such that $J_i^2 = -\mathrm{id}$. Let $u_i$, $i = 1, 2$, be two $H$-valued distributional sections of $V_i$ and let $w_{ij}$ be the distributional sections of the vector bundle $X_i \boxtimes X_j$ over $M_i \times M_j$ determined by $w_{ij}(f_1 \boxtimes f_2) := \langle u_i(J_if_1), u_j(f_2)\rangle$. Then
\[ (x, \xi) \in WF(u_1) \Leftrightarrow (x, -\xi; x, \xi) \in WF(w_{11}) \]
and
\[ WF(w_{ij}) \subset -(WF(u_i) \cup Z) \times (WF(u_j) \cup Z), \]
where $Z$ denotes the zero-section.

Finally, we establish some results on the wave front sets of advanced and retarded fundamental solutions $E^\pm$ (for their existence and uniqueness we refer to [16]) and $S^\pm$, $S_c^\pm$. These results are analogous to [36, Theorem 6.5.3], but now for operators in a vector bundle. Note that for distributional sections of vector bundles there is a Propagation of Singularities Theorem, which follows from the propagation of the polarization set [41].

Theorem A.5. Let $E^\pm$ be the advanced (−) and retarded (+) fundamental solutions for a normally hyperbolic operator $P$ acting on the sections of a vector bundle $DM$ over a globally hyperbolic spacetime $M = (M, g)$ of dimension $n \ge 2$. Then
\[ \begin{aligned}
WF(E^\pm) = {} & \{(x, \xi; y, \eta) \in T^*M^{\times2}\setminus Z \mid x \in J^\pm(y),\ x \neq y,\ (x, -\xi) \sim (y, \eta)\} \\
& \cup \{(x, -\xi; x, \xi) \in T^*M^{\times2}\setminus Z \mid (x, \xi) \in T^*M\setminus Z\} =: A^\pm \cup B
\end{aligned} \tag{18} \]
where $Z$ is the zero-section and $(x, \xi) \sim (y, \eta)$ if and only if there is a light-like geodesic $\gamma$ from $x$ to $y$ to which $\xi$ and $\eta$ are cotangent such that they are each other's parallel transport along $\gamma$.

Proof. The first part of this proof follows closely the proof of [31]. We start by reducing the problem to a local one as follows. The principal symbol of $P$ is $p(x, \xi) = g_{\mu\nu}(x)\xi^\mu\xi^\nu I$, where $I$ is the identity operator on $DM$, so by the Propagation of Singularities Theorem, the singularities of $E^\pm$ propagate along light-like geodesics by parallel transport. By definition the points in the set $A^\pm$ are invariant under the same parallel transport. Now consider a point $p := (x, \xi; y, \eta)$ with $x \neq y$. If $\xi = \eta = 0$ then $p$ is not contained in any set on either side of the equality, so we may assume $\xi \neq 0$ (the case $\eta \neq 0$ is analogous). Let $S$ be a spacelike Cauchy surface through $y$ and propagate $(x, \xi)$ along the light-like geodesic $\gamma$ towards $S$. If $\gamma$ ends at $S$ in $x' \neq y$ then $p$ is not contained in $A^\pm$ or $B$, nor is it contained in $WF(E^\pm)$, because $E(x', y) = 0$ when $x'$ and $y$ are spacelike, so it cannot have any singularities there. If $\gamma$ ends at $y$, on the other hand, we can find a point $p' := (x', \xi'; y, \eta)$, where $x'$ on $\gamma$ is in any given causally convex neighborhood of $y$ and $\xi'$ is the parallel transport of $\xi$ along $\gamma$ to $x'$. Then $p \in WF(E^\pm)$ if and only if $p' \in WF(E^\pm)$ and $p \in A^\pm$ if and only if $p' \in A^\pm$. Hence, it suffices to prove the claim locally.

On a sufficiently small causally convex domain $O \subset M$ we can find for every $k \in \mathbb{N}$ a $C^k$-section $W^k$ of $DM \boxtimes D^*M$ on $O^{\times2}$ such that ([16, Proposition 2.5.1]):
\[ E^\pm(x, y) = \sum_{j=0}^{k+1}V_j(x, y)\,f^*(1 \otimes R^\pm(2 + 2j, \cdot))(x, y) + W^k(x, y). \tag{19} \]
Here, the Hadamard coefficients $V_j$ are uniquely defined smooth sections of $DM \boxtimes D^*M$ on $O^{\times2}$, $R^\pm(\alpha, y)$ are the retarded (+) and advanced (−) Riesz distributions (or rather distribution densities) on Minkowski spacetime and they are pulled back by the smooth diffeomorphism $f : O^{\times2} \to TO$ defined by $(x, y) \mapsto (x, \exp_x^{-1}(y))$. This means we use Riemannian normal coordinates for $y$ centered on $x$, which is well-defined because $O$ is causally convex. The Riesz distributions have many useful properties, of which we will only use for all $j \ge 0$:
\[ \begin{aligned}
WF(R^\pm(2j + 2, \cdot)) &= \{(x, \xi) \in T^*M_0\setminus Z \mid x = 0 \text{ or } x^2 = 0,\ x \in J^\pm(0),\ \xi \parallel x\}, \\
R^\pm(2 + 2j, \lambda x) &= \lambda^{2+2j-n}R^\pm(2 + 2j, x), \qquad \lambda > 0.
\end{aligned} \tag{20} \]
(These can be proved using [16, Proposition 1.2.4 items 4 and 5], $\Box^{j+1}R^\pm(2 + 2j, \cdot) = \delta$ and the wave front sets of the distinguished parametrices as determined in [36].) Hence, for all $j \in \mathbb{N}$:
\[ \begin{aligned}
WF(f^*(1 \otimes R^\pm(2 + 2j, \cdot))) &= f^*(WF(1 \otimes R^\pm(2 + 2j, \cdot))) = f^*(Z|_O \times WF(R^\pm(2 + 2j, \cdot))) \\
&= \{(x, \xi; y, \eta) \mid (\xi, \eta) = df^T(0, \eta') \text{ for some } (\exp_x^{-1}(y), \eta') \in WF(R^\pm(2 + 2j, \cdot))\} \\
&= (A^\pm \cup B) \cap T^*O^{\times2},
\end{aligned} \tag{21} \]
where $df^T$ is the transpose of the derivative $df$ at $(x, y)$. The last equality uses the wave front set of the Riesz distributions in Eq. (20) and the properties of Riemannian normal coordinates (cf. [31]). It follows that $WF(E^\pm|_{O^{\times2}}) \subset (A^\pm \cup B) \cap T^*O^{\times2}$, because for each order of differentiation $N$ we can choose a sufficiently high order $k$ in Eq. (19) to make the required estimate in the definition of the wave front set.

We can prove the opposite inclusion, if we can show that the wave front set of the finite sum in (19) also contains $(A^\pm \cup B) \cap T^*O^{\times2}$, which we will do using scaling limits (cf. [34]). First, we may employ the Riemannian normal coordinates $f : O^{\times2} \to TO$ as above. Next, we may assume that $O$ is also a contractible coordinate neighbourhood, so we can consider local coordinates $\phi : O \to \mathbb{R}^n$ on $O$ and the associated coordinate map $d\phi$ on $TO$. Moreover, we can choose $\phi$ in such a way that $\phi(x_0) = 0$ for an arbitrarily given $x_0 \in O$. The composition $d\phi \circ f$ then defines coordinates on $O^{\times2}$ such that $(x_0, x_0) \mapsto 0 \in \mathbb{R}^{2n}$. Using a frame $E_A$ for $DM|_O$ and the dual frame $E^B$ we can express the terms in the sum of Eq. (19) in the local coordinates $d\phi \circ f$ as $V_j{}^A{}_B(x, y)R^\pm(2 + 2j, y)$. From Eq. (20), we then find the scaling behavior
\[ \delta_\lambda^*\big(V_j{}^A{}_B(x, y)R^\pm(2 + 2j, y)\big) = \lambda^{2+2j-n}\big(V_j{}^A{}_B(\lambda x, \lambda y)R^\pm(2 + 2j, y)\big) \]
for all $\lambda > 0$. In the scaling limit only the lowest order term survives:
\[ \lim_{\lambda\to0}\lambda^{n-2}(\delta_\lambda \circ f^{-1} \circ d\phi^{-1})^*E(x, y) = V_0{}^A{}_B(0, 0)R(2, y)E^B(x)E_A(y) = R(2, y)E^A(x)E_A(y), \]
424
2010 10:6 WSPC/S0129-055X
148-RMP
K. Sanders
where we wrote $R(2, y) := R^-(2, y) - R^+(2, y)$ and we used the explicit expression $V_0{}^A{}_B(x, x) = \delta^A_B$ ([16, Lemmas 2.2.2 and 1.3.17]). Now, the last item of Theorem A.3 (which follows from Proposition A.2) implies that $WF(E) \supset (d\phi \circ f)^*(\{(0, 0)\} \times \pi_2(WF(1 \otimes R(2, \cdot))))$, because $E^A(x)E_A(y)$ is smooth and not identically vanishing. From Eq. (20) and the support properties of $R^\pm(2, \cdot)$ we easily compute $\pi_2(WF(1 \otimes R(2, \cdot))) = \{(0, \xi) \mid \xi^2 = 0\}$. Pulling this back to $O^{\times2}$ and using the properties of Riemannian normal coordinates yields $WF(E) \supset \{(x_0, -\xi; x_0, \xi) \mid \xi^2 = 0\}$. Because $E$ is a bi-solution to the wave equation we can apply the Propagation of Singularities Theorem to find that $WF(E) \supset A^+ \cup A^-$ on $O^{\times2}$ and from the support properties of $E^+$ and $E^-$ we then conclude that $WF(E^\pm) \supset A^\pm$. Finally, $WF(E^\pm) \supset WF(PE^\pm) = WF(\delta) = B$. This completes the proof.

Corollary A.6. In the notation of Theorem A.5, $WF(E) = A^+ \cup A^-\setminus Z$.

Proof. By Theorem A.5 and the support properties of $E^\pm$, we have $WF(E) = A^+ \cup A^-$ away from the diagonal. The inclusion $\supset$ then follows from the closedness of the wave front set. For the opposite inclusion we consider a point on the diagonal and use the Propagation of Singularities Theorem to find an approximating sequence of points off the diagonal.

Proposition A.7. For the fundamental solutions of the Dirac equation we have, in the notation of Theorem A.5: $WF(S^\pm) = WF(S_c^\pm) = A^\pm \cup B$ and $WF(S) = WF(S_c) = A^+ \cup A^-\setminus Z$. In other words, $WF(S^\pm) = WF(S_c^\pm) = WF(E^\pm)$ and $WF(S) = WF(S_c) = WF(E)$.

Proof. Because $S^\pm = (i\slashed{\nabla} + m)E^\pm$ and $S_c^\pm = (-i\slashed{\nabla} + m)E^\pm$ (see [6]) we immediately find $WF(S^\pm) \subset WF(E^\pm)$ and $WF(S_c^\pm) \subset WF(E^\pm)$. Similarly $WF(S) \subset WF(E)$ and $WF(S_c) \subset WF(E)$. Now suppose that $WF(S) = WF(S_c) = WF(E) = A^+ \cup A^-$, which we will prove below.
By the support properties of the fundamental solutions we then find that away from the diagonal $WF(S^\pm) = WF(S_c^\pm) = A^\pm$, whereas on the diagonal $WF(E^\pm) = B \supset WF(S^\pm) \supset WF(PS^\pm) = WF(\delta) = B$ and similarly for cospinors. To complete the proof we need to show that $WF(S) \supset WF(E)$ and $WF(S_c) \supset WF(E)$, for which we adapt (and correct) an idea of [33]. We prove the case of $S$, because the other case follows by taking adjoints (cf. Theorem 3.10). Further note that it is sufficient to prove the claim on the diagonal, because the Propagation of Singularities Theorem applies both to $E$ and to $S$. Now suppose that $(x, -\xi; x, \xi) \in WF(E)\setminus WF(S)$. We will derive a contradiction as follows. For every time-like, future pointing normalized vector $n_0 \in T_xM$ we can find a smooth spacelike Cauchy surface $C$ through $x$ such that $n_0$ is normal to $C$. We let $n$ denote the future pointing normal vector field on $C$ and $\iota : C \to M$ the canonical injection. By [6, Proposition 2.4(c)] we can restrict $S$ to $C^{\times2}$ to find $S|_{C^{\times2}} = -i\delta\slashed{n}$ and in particular $(x, -d\iota_x^T(\xi); x, d\iota_x^T(\xi)) \in WF(S|_{C^{\times2}})$. By (a component version of) [38, Theorem 8.2.4], on the other hand:
\[ WF(S|_{C^{\times2}}) \subset (\iota \times \iota)^*(WF(S)) = \{(x, d\iota_x^T(\xi); y, d\iota_y^T(\xi')) \mid (x, \xi; y, \xi') \in WF(S)\}. \]
Therefore, there must be a point $(x, -\eta; x, \eta) \in WF(S)$ such that
\[ (x, -d\iota_x^T(\eta); x, d\iota_x^T(\eta)) = (x, -d\iota_x^T(\xi); x, d\iota_x^T(\xi)). \]
Notice, however, that the transpose of $d\iota$ is nothing else than restricting the dual vector $\xi$ to the tangent space of $C$. Because $WF(S) \subset WF(E)$, there are only two possibilities: $\eta = \xi$ or $\eta = \xi - 2(\xi_an_0^a)n_0$. The first contradicts our assumption, so we have $\eta = \xi - 2(\xi_an_0^a)n_0$. Now $(x, -\eta; x, \eta) \in WF(S)$ must hold for every normalised, time-like, future pointing vector $n_0 \in T_xM$. Choosing a sequence of vectors $n_0$ such that $\eta \to \xi$ and using the closedness of the wave front set we find again $(x, -\xi; x, \xi) \in WF(S)$. Hence, $WF(E) = WF(S)$.

Appendix B. Proof of Theorem 4.18

The computations involved in the proof of Theorem 4.18 are somewhat similar to the computation of the stress-energy-momentum tensor. We will work in components and in local coordinates on $O$, using Greek indices to indicate the coordinate frame and coordinate derivatives. To ease the notation we will drop the subscript $\epsilon$ on the local frame $e^\mu_a$. As $\gamma^a$ is independent of $\epsilon$ we may use Eqs. (5) to vary
\[ \slashed{\nabla}v = \left(\partial_av - \frac{1}{4}\Gamma^c_{ab}\,v\gamma_c\gamma^b\right)\gamma^a = e^\alpha_a\partial_\alpha v\,\gamma^a + \frac{1}{4}e^\alpha_ae^\beta_b\{\partial_\alpha e^c_\beta - e^c_\gamma\Gamma^\gamma_{\alpha\beta}\}\,v\gamma_c\gamma^b\gamma^a, \tag{22} \]
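which yields:
\[ \begin{aligned}
\delta\slashed{\nabla}v = {} & \delta e^\alpha_a\,e^d_\alpha\nabla_dv\gamma^a - \frac{1}{4}\delta e^\beta_b\,e^d_\beta\Gamma^c_{ad}\,v\gamma_c\gamma^b\gamma^a + \frac{1}{4}\partial_a\delta e^c_\beta\,e^\beta_b\,v\gamma_c\gamma^b\gamma^a \\
& - \frac{1}{4}\delta e^c_\gamma\,e^\alpha_ae^\beta_b\Gamma^\gamma_{\alpha\beta}\,v\gamma_c\gamma^b\gamma^a - \frac{1}{4}\delta\Gamma^\gamma_{\alpha\beta}\,e^\alpha_ae^\beta_be^c_\gamma\,v\gamma_c\gamma^b\gamma^a.
\end{aligned} \tag{23} \]
We can perform an integration by parts as follows:
\[ \begin{aligned}
\frac{1}{4}\partial_a\delta e^c_\beta\,e^\beta_b\,v\gamma_c\gamma^b\gamma^a = {} & \frac{-i}{4}P_c(\delta e^c_\beta e^\beta_b\,v\gamma_c\gamma^b) + \frac{i}{4}\delta e^c_\beta e^\beta_b\,P_c(v\gamma_c\gamma^b) - \frac{1}{4}\delta e^c_\beta\,\partial_ae^\beta_b\,v\gamma_c\gamma^b\gamma^a \\
& - \frac{1}{4}\delta e^d_\beta\,e^\beta_b\Gamma^c_{ad}\,v\gamma_c\gamma^b\gamma^a + \frac{1}{4}\delta e^c_\beta\,e^\beta_d\Gamma^d_{ab}\,v\gamma_c\gamma^b\gamma^a \\
= {} & \frac{-i}{4}P_c(\delta e^c_\beta e^\beta_b\,v\gamma_c\gamma^b) + \frac{i}{4}\delta e^c_\beta e^\beta_b\,(P_cv)\gamma_c\gamma^b - \frac{1}{4}\delta e^c_\beta e^\beta_b\,\nabla_av[\gamma_c\gamma^b, \gamma^a] \\
& - \frac{1}{4}\delta e^c_\beta\,\partial_ae^\beta_b\,v\gamma_c\gamma^b\gamma^a + \frac{1}{4}\delta e^\beta_b\,e^d_\beta\Gamma^c_{ad}\,v\gamma_c\gamma^b\gamma^a + \frac{1}{4}\delta e^c_\beta\,e^\beta_d\Gamma^d_{ab}\,v\gamma_c\gamma^b\gamma^a.
\end{aligned} \tag{24} \]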
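Because $[\gamma_c\gamma^b, \gamma^a] = \gamma_c\{\gamma^b, \gamma^a\} - \{\gamma_c, \gamma^a\}\gamma^b = 2\eta^{ab}\gamma_c - 2\delta_c^a\gamma^b$ and $e^c_\beta = g_{\mu\beta}\eta^{cd}e^\mu_d$ we can write:
\[ \begin{aligned}
-\frac{1}{4}\delta e^c_\beta e^\beta_b\,\nabla_av[\gamma_c\gamma^b, \gamma^a] &= -\frac{1}{2}\delta(g_{\mu\beta}\eta^{cd}e^\mu_d)e^\beta_b\eta^{ab}\nabla_av\gamma_c + \frac{1}{2}\delta e^c_\beta e^\beta_b\nabla_cv\gamma^b \\
&= -\frac{1}{2}\delta g_{\mu\beta}\,\eta^{cd}e^\mu_de^\beta_b\eta^{ab}\nabla_av\gamma_c - \delta e^\mu_d\,e^a_\mu\nabla_av\gamma^d \\
&= \frac{1}{2}\delta g^{\alpha\beta}e^a_\alpha e^b_\beta\nabla_av\gamma_b - \delta e^\alpha_a\,e^d_\alpha\nabla_dv\gamma^a.
\end{aligned} \tag{25} \]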
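The commutator identity $[\gamma_c\gamma^b, \gamma^a] = 2\eta^{ab}\gamma_c - 2\delta_c^a\gamma^b$ only relies on the Clifford relation $\{\gamma^a, \gamma^b\} = 2\eta^{ab}$, so it can be checked in any explicit representation. The sketch below uses the Dirac representation with signature $(+,-,-,-)$; both the representation and the signature are assumptions made for this illustration, not conventions taken from the text, and the helper names are hypothetical.

```python
ETA = [1, -1, -1, -1]  # assumed signature (+, -, -, -)


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lincomb(s, a, t, b):
    """Return s*a + t*b for square matrices a, b and scalars s, t."""
    return [[s * x + t * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def neg(m):
    return [[-x for x in row] for row in m]


ZERO2 = [[0, 0], [0, 0]]
ID2 = [[1, 0], [0, 1]]
ID4 = block(ID2, ZERO2, ZERO2, ID2)
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# gamma^0 = diag(1, 1, -1, -1), gamma^i = [[0, sigma_i], [-sigma_i, 0]]
GAMMA_UP = [block(ID2, ZERO2, ZERO2, neg(ID2))] + [
    block(ZERO2, s, neg(s), ZERO2) for s in PAULI]
# gamma_a = eta_{ab} gamma^b for the diagonal metric above
GAMMA_DOWN = [lincomb(ETA[a], GAMMA_UP[a], 0, GAMMA_UP[a]) for a in range(4)]


def eta_up(a, b):
    return ETA[a] if a == b else 0


def clifford_relation_holds():
    """Check {gamma^a, gamma^b} == 2 eta^{ab} for all index pairs."""
    return all(
        lincomb(1, matmul(GAMMA_UP[a], GAMMA_UP[b]),
                1, matmul(GAMMA_UP[b], GAMMA_UP[a]))
        == lincomb(2 * eta_up(a, b), ID4, 0, ID4)
        for a in range(4) for b in range(4))


def commutator_identity_holds():
    """Check [gamma_c gamma^b, gamma^a] == 2 eta^{ab} gamma_c - 2 delta_c^a gamma^b."""
    for a in range(4):
        for b in range(4):
            for c in range(4):
                prod = matmul(GAMMA_DOWN[c], GAMMA_UP[b])
                lhs = lincomb(1, matmul(prod, GAMMA_UP[a]),
                              -1, matmul(GAMMA_UP[a], prod))
                rhs = lincomb(2 * eta_up(a, b), GAMMA_DOWN[c],
                              -2 * int(a == c), GAMMA_UP[b])
                if lhs != rhs:
                    return False
    return True
```

All matrix entries are Gaussian integers, so the comparisons are exact. Any other representation satisfying the same Clifford relation would pass the same check.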
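When substituting Eqs. (24) and (25) into (23), we can recombine the terms
\[ \frac{-1}{4}\delta e^c_\beta\,\partial_ae^\beta_b\,v\gamma_c\gamma^b\gamma^a - \frac{1}{4}\delta e^c_\gamma\,e^\alpha_ae^\beta_b\Gamma^\gamma_{\alpha\beta}\,v\gamma_c\gamma^b\gamma^a = \frac{-1}{4}\delta e^c_\gamma\,e^\gamma_d\Gamma^d_{ab}\,v\gamma_c\gamma^b\gamma^a \]
to obtain
\[ \delta\slashed{\nabla}v = \frac{-i}{4}P_c(\delta e^c_\beta e^\beta_b\,v\gamma_c\gamma^b) + \frac{i}{4}\delta e^c_\beta e^\beta_b\,(P_cv)\gamma_c\gamma^b + \frac{1}{2}\delta g^{\alpha\beta}e^a_\alpha e^b_\beta\nabla_av\gamma_b - \frac{1}{4}\delta\Gamma^\gamma_{\alpha\beta}\,e^\alpha_ae^\beta_be^c_\gamma\,v\gamma_c\gamma^b\gamma^a. \tag{26} \]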
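Note that the variations of the frame $\delta e^\alpha_a$ cancel out, except in the terms with $P_c$. These are harmless when we compute $B_0(\delta\slashed{\nabla}S_0f)$, because both $B_0$ and $v$ solve the Dirac equation. Therefore, the final answer will not depend on variations of the frame, as desired. In the last term of Eq. (26), we can use the symmetry of the Christoffel symbol:
\[ \begin{aligned}
-\frac{1}{4}\delta\Gamma^\gamma_{\alpha\beta}\,e^\alpha_ae^\beta_be^c_\gamma\,v\gamma_c\gamma^b\gamma^a &= -\frac{1}{4}\delta\Gamma^\gamma_{(\alpha\beta)}\,e^\alpha_ae^\beta_be^c_\gamma\,v\gamma_c\gamma^b\gamma^a = -\frac{1}{4}\delta\Gamma^\gamma_{\alpha\beta}\,e^\alpha_ae^\beta_be^c_\gamma\,v\gamma_c\eta^{ab} \\
&= -\frac{1}{4}\delta\Gamma^\gamma_{\alpha\beta}\,g^{\alpha\beta}e^c_\gamma\,v\gamma_c \\
&= -\frac{1}{4}\delta g^{\gamma\mu}g_{\mu\nu}\Gamma^\nu_{\alpha\beta}\,g^{\alpha\beta}e^c_\gamma\,v\gamma_c - \frac{1}{4}\partial_\alpha\delta g_{\beta\mu}\,e^\mu_ag^{\alpha\beta}\,v\gamma^a + \frac{1}{8}\partial_\mu\delta g_{\alpha\beta}\,e^\mu_ag^{\alpha\beta}\,v\gamma^a.
\end{aligned} \tag{27} \]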
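We handle the last term using an integration by parts as before:
\[ \begin{aligned}
\frac{1}{8}\partial_a\delta g_{\alpha\beta}\,g^{\alpha\beta}\,v\gamma^a &= \frac{-i}{8}P_c(\delta g_{\alpha\beta}g^{\alpha\beta}v) + \frac{i}{8}\delta g_{\alpha\beta}g^{\alpha\beta}P_cv - \frac{1}{8}\delta g_{\alpha\beta}\,\partial_ag^{\alpha\beta}\,v\gamma^a \\
&= \frac{-i}{8}P_c(\delta g_{\alpha\beta}g^{\alpha\beta}v) + \frac{i}{8}\delta g_{\alpha\beta}g^{\alpha\beta}P_cv - \frac{1}{8}\delta g^{\alpha\beta}\,\partial_ag_{\alpha\beta}\,v\gamma^a,
\end{aligned} \tag{28} \]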
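where we used $\delta g_{\alpha\beta}\,\partial_ag^{\alpha\beta} = -\delta g^{\alpha\beta}g_{\alpha\mu}g_{\beta\nu}\,\partial_ag^{\mu\nu} = \delta g^{\alpha\beta}\,\partial_ag_{\alpha\beta}$. The penultimate term in (27) is:
\[ \begin{aligned}
-\frac{1}{4}\partial_\alpha\delta g_{\beta\mu}\,e^\mu_ag^{\alpha\beta}\,v\gamma^a &= \frac{1}{4}\partial_b(\delta g^{\alpha\beta}g_{\alpha\mu}g_{\beta\nu})\,e^\mu_ae^b_\rho g^{\rho\nu}\,v\gamma^a \\
&= \frac{1}{4}\partial_b(\delta g^{\alpha\beta}e^a_\alpha e^b_\beta)\,v\gamma_a - \frac{1}{4}\delta g^{\alpha\beta}g_{\alpha\mu}g_{\beta\nu}\,\partial_b(e^\mu_ae^b_\rho g^{\nu\rho})\,v\gamma^a \\
&= \frac{1}{4}\nabla_b(\delta g^{\alpha\beta}e^a_\alpha e^b_\beta)\,v\gamma_a - \frac{1}{4}\delta g^{\alpha\beta}(\Gamma^a_{bc}e^c_\alpha e^b_\beta + \Gamma^b_{bc}e^a_\alpha e^c_\beta)\,v\gamma_a \\
&\quad - \frac{1}{4}\delta g^{\alpha\beta}g_{\alpha\mu}g_{\beta\nu}\,\partial_b(e^\mu_ae^b_\rho g^{\nu\rho})\,v\gamma^a.
\end{aligned} \tag{29} \]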
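Both Eq. (28) and Eq. (29) trade variations of $g_{\alpha\beta}$ for variations of $g^{\alpha\beta}$ via $\delta g^{\alpha\beta} = -g^{\alpha\mu}\,\delta g_{\mu\nu}\,g^{\nu\beta}$. The following sketch checks this first-order formula by an exact finite difference and then verifies the contraction identity $\delta g_{\alpha\beta}\,\partial g^{\alpha\beta} = \delta g^{\alpha\beta}\,\partial g_{\alpha\beta}$ used for Eq. (28). The matrices `G`, `DELTA_G` and `PARTIAL_G` are arbitrary symmetric test data chosen for illustration (an invertible symmetric matrix stands in for the metric components), and the function names are hypothetical.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse(m):
    """Exact Gauss-Jordan inverse over the rationals."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def inverse_variation(g_low, dg_low):
    """First-order variation of the inverse: -g^{-1} (dg) g^{-1}."""
    g_up = inverse(g_low)
    return [[-x for x in row] for row in matmul(matmul(g_up, dg_low), g_up)]


def contract(a, b):
    """Full contraction a_{ij} b^{ij} of two square arrays."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def finite_difference_error(g_low, dg_low, eps):
    """Largest entry of (inv(g + eps*dg) - inv(g)) / eps - inverse_variation."""
    shifted = [[x + eps * y for x, y in zip(rg, rd)]
               for rg, rd in zip(g_low, dg_low)]
    g_up, g_up_shifted = inverse(g_low), inverse(shifted)
    exact = inverse_variation(g_low, dg_low)
    return max(abs((s - u) / eps - e)
               for rs, ru, re in zip(g_up_shifted, g_up, exact)
               for s, u, e in zip(rs, ru, re))


# Illustrative symmetric test data (not taken from the paper).
G = [[2, 1, 0, 0], [1, -3, 1, 0], [0, 1, -2, 1], [0, 0, 1, -4]]
DELTA_G = [[1, 0, 2, 0], [0, -1, 0, 1], [2, 0, 3, 0], [0, 1, 0, 1]]
PARTIAL_G = [[0, 1, 0, 0], [1, 2, 0, -1], [0, 0, 1, 0], [0, -1, 0, 2]]

LHS = contract(DELTA_G, inverse_variation(G, PARTIAL_G))   # dg_{ab} (partial g)^{ab}
RHS = contract(inverse_variation(G, DELTA_G), PARTIAL_G)   # (dg)^{ab} partial g_{ab}
```

`LHS` and `RHS` agree exactly because both equal $-\mathrm{tr}(g^{-1}\,\delta g\,g^{-1}\,\partial g)$ by cyclicity of the trace; rational arithmetic avoids any rounding in the comparison.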
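The first term on the right-hand side of Eq. (29) is
\[ \frac{1}{4}\nabla_b(\delta g^{\alpha\beta}e^a_\alpha e^b_\beta)\,v\gamma_a = \frac{1}{4}\nabla_b(\delta g^{\alpha\beta}e^a_\alpha e^b_\beta\,v\gamma_a) - \frac{1}{4}\delta g^{\alpha\beta}e^a_\alpha e^b_\beta\nabla_bv\gamma_a. \tag{30} \]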
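The other terms can be simplified with some computation:
\[ \begin{aligned}
-\frac{1}{4}&\delta g^{\alpha\beta}\big(\Gamma^a_{bc}e^c_\alpha e^b_\beta + \Gamma^b_{bc}e^a_\alpha e^c_\beta + g_{\alpha\mu}g_{\beta\nu}\eta^{ac}\partial_b(e^\mu_ce^b_\rho g^{\rho\nu})\big)v\gamma_a \\
&= -\frac{1}{4}\delta g^{\alpha\beta}\big(-\partial_\beta e^a_\alpha + e^a_\gamma\Gamma^\gamma_{\beta\alpha} - e^a_\alpha\partial_ce^c_\beta + e^a_\alpha\Gamma^\mu_{\mu\beta} + e^a_\alpha g_{\beta\nu}\partial_\rho g^{\rho\nu} + e^a_\alpha\partial_be^b_\beta + g_{\alpha\mu}\eta^{ac}\partial_\beta e^\mu_c\big)v\gamma_a \\
&= -\frac{1}{4}\delta g^{\alpha\beta}\big(-\eta^{ac}e^\mu_c\partial_\beta g_{\alpha\mu} + e^a_\gamma\Gamma^\gamma_{\beta\alpha} + e^a_\alpha\Gamma^\mu_{\mu\beta} - e^a_\alpha g^{\rho\nu}\partial_\rho g_{\beta\nu}\big)v\gamma_a \\
&= -\frac{1}{8}\delta g^{\alpha\beta}\big(-2e^a_\gamma g^{\gamma\mu}\partial_\beta g_{\alpha\mu} + e^a_\gamma g^{\gamma\mu}(2\partial_\beta g_{\alpha\mu} - \partial_\mu g_{\alpha\beta}) + e^a_\alpha g^{\mu\gamma}\partial_\beta g_{\mu\gamma} - 2e^a_\alpha g^{\rho\nu}\partial_\rho g_{\beta\nu}\big)v\gamma_a \\
&= \frac{1}{8}\delta g^{\alpha\beta}\big(e^a_\gamma g^{\gamma\mu}\partial_\mu g_{\alpha\beta} + 2e^a_\alpha g_{\beta\mu}g^{\rho\nu}\Gamma^\mu_{\rho\nu}\big)v\gamma_a.
\end{aligned} \tag{31} \]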
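Substituting Eqs. (27)–(31) into (26) yields:
\[ \begin{aligned}
\delta\slashed{\nabla}v = {} & \frac{-i}{4}P_c(\delta e^c_\beta e^\beta_b\,v\gamma_c\gamma^b) + \frac{i}{4}\delta e^c_\beta e^\beta_b\,(P_cv)\gamma_c\gamma^b - \frac{i}{8}P_c(\delta g_{\alpha\beta}g^{\alpha\beta}v) + \frac{i}{8}\delta g_{\alpha\beta}g^{\alpha\beta}P_cv \\
& + \frac{1}{4}\delta g^{\alpha\beta}e^a_\alpha e^b_\beta\nabla_av\gamma_b + \frac{1}{4}\nabla_b(\delta g^{\alpha\beta}e^a_\alpha e^b_\beta\,v\gamma_a).
\end{aligned} \tag{32} \]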
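Using Lemma 4.17, we find for a spinor $u \in C^\infty(DM)$:
\[ \begin{aligned}
\delta\slashed{\nabla}u = {} & \frac{i}{4}P(\delta e^c_\beta e^\beta_b\,\gamma^b\gamma_cu) - \frac{i}{4}\delta e^c_\beta e^\beta_b\,\gamma^b\gamma_c(Pu) + \frac{i}{8}P(\delta g_{\alpha\beta}g^{\alpha\beta}u) - \frac{i}{8}\delta g_{\alpha\beta}g^{\alpha\beta}Pu \\
& + \frac{1}{4}\delta g^{\alpha\beta}e^a_\alpha e^b_\beta\gamma_b\nabla_au + \frac{1}{4}\nabla_b(\delta g^{\alpha\beta}e^a_\alpha e^b_\beta\gamma_au).
\end{aligned} \tag{33} \]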
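Using Proposition 4.16 and Eqs. (32) and (33) we notice that the terms with $P_c$ and $P$ cancel out in the following equality, because $B_0$ and $S_0f$ both satisfy the Dirac equation:
\[ \begin{aligned}
\delta(\beta_\epsilon B_0(f)) = -B_0(\delta P_\epsilon S_0f) &= \frac{i}{4}B_0(\delta g^{\alpha\beta}e^a_\alpha e^b_\beta\gamma_b\nabla_aS_0\tau f) + \frac{i}{4}B_0(\nabla_b(\delta g^{\alpha\beta}e^a_\alpha e^b_\beta\gamma_aS_0\tau f)) \\
&= \frac{i}{4}\delta g^{\alpha\beta}e^a_\alpha e^b_\beta\big(B_0(\gamma_{(b}\nabla_{a)}S_0\tau f) - \nabla_{(b}B_0(\gamma_{a)}S_0\tau f)\big).
\end{aligned} \tag{34} \]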
We now compare with Proposition 4.14 to get the final result.
References

[1] K. Sanders, Aspects of locally covariant quantum field theory, PhD thesis, University of York (2008); also available online, arXiv:0809.4828v1 [math-ph].
[2] R. Brunetti, K. Fredenhagen and R. Verch, The generally covariant locality principle — a new paradigm for local quantum field theory, Comm. Math. Phys. 237 (2003) 31–68.
[3] C. Dappiaggi, T.-P. Hack and N. Pinamonti, The extended algebra of observables for Dirac fields and the trace anomaly of their stress-energy tensor, Rev. Math. Phys. 21 (2009) 1241–1312.
[4] R. Verch, A spin-statistics theorem for quantum fields on curved spacetime manifolds in a generally covariant framework, Comm. Math. Phys. 223 (2001) 261–288.
[5] C. J. Fewster, Quantum energy inequalities and local covariance II: Categorical formulation, Gen. Relativ. Gravit. 39 (2007) 1855–1890.
[6] J. Dimock, Dirac quantum fields on a manifold, Trans. Amer. Math. Soc. 269 (1982) 133–147.
[7] C. J. Fewster and R. Verch, A quantum weak energy inequality for Dirac fields in curved spacetime, Comm. Math. Phys. 225 (2002) 331–359.
[8] H. B. Lawson and M.-L. Michelsohn, Spin Geometry (Princeton University Press, Princeton, 1989).
[9] R. Coquereaux, Clifford algebras, spinors and fundamental interactions: Twenty years after, arXiv:math-ph/0509040v1.
[10] W. Pauli, Contributions mathématiques à la théorie des matrices de Dirac, Ann. Inst. H. Poincaré 6 (1936) 109–136.
[11] B. L. van der Waerden, Group Theory and Quantum Mechanics (Springer, Berlin, 1974).
[12] Y. Choquet-Bruhat, C. de Witt-Morette and M. Dillard-Bleick, Analysis, Manifolds and Physics (North Holland, Amsterdam, 1977).
May 11, J070-S0129055X10003990
2010 10:6 WSPC/S0129-055X
148-RMP
The Locally Covariant Dirac Field
429
[13] S. P. Dawson and C. J. Fewster, An explicit quantum weak energy inequality for Dirac fields in curved spacetimes, Class. Quantum Grav. 23 (2006) 6659–6681. [14] S. Mac Lane, Categories for the Working Mathematician (Springer, New York, 1971). [15] S. Mac Lane and I. Moerdijk, Sheaves in Geometry and Logic: A First Introduction to Topos Theory (Springer, New York, 1992). [16] C. B¨ ar, N. Ginoux and F. Pf¨ affle, Wave Equations on Lorentzian Manifolds and Quantization (EMS, Z¨ urich, 2007). [17] J. Dieudonn´e, Treatise on Analysis, Vol. III (Academic Press, New York-London, 1972). [18] J. Tolksdorf, Clifford modules and generalized Dirac operators, Internat. J. Theoret. Phys. 40 (2001) 191–209. [19] R. Geroch, Spinor structures of space-times in general relativity. I, J. Math. Phys. 9 (1968) 1739–1744. [20] R. M. Wald, General Relativity (University of Chicago Press, Chicago-London, 1984). [21] R. Geroch, Spinor structures of space-times in general relativity. II, J. Math. Phys. 11 (1970) 343–348. [22] S. Kobayashi and K. Nomizu, Foundations of Differential Geometry, Vol. I (Interscience, New York, 1963). [23] R. H. Good Jr., Properties of the Dirac matrices, Rev. Mod. Phys. 27 (1955) 187–211. [24] A. Lichnerowicz, Champs spinoriels et propagateurs en relativit´e g´en´erale, Bull. Soc. Math. France 92 (1964) 11–100. [25] D. Canarutto and A. Jadczyk, Fundamental geometric structures for the Dirac equation in general relativity, Acta Appl. Math. 51 (1998) 59–92. [26] J. E. Roberts and G. Ruzzi, A cohomological description of connections and curvature tensors over posets, Theory Appl. Categ. 16 (2006) 855–895. ´ [27] G. Segal, Classifying spaces and spectral sequences, Inst. Hautes Etudes Sci. Publ. Math. 34 (1968) 105–112. [28] K. Schm¨ udgen, Unbounded Operator Algebras and Representation Theory (Birkh¨ auser, Basel, 1990). [29] H. Araki, On the diagonalization of a bilinear Hamiltonian by a Bogoliubov transformation, Publ. Res. Inst. Math. Sci. Ser. 
A 4 (1968/1969) 387–412. [30] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Vol. 2 (Springer, Berlin, 1996). [31] M. J. Radzikowski, Micro-local approach to the Hadamard condition in quantum field theory on curved space-time, Comm. Math. Phys. 179 (1996) 529–553. [32] K. Kratzert, Singularity structure of the two point function of the free Dirac field on a globally hyperbolic spacetime, Ann. Phys. (8) 9 (2000) 475–498. [33] S. Hollands, The Hadamard condition for Dirac fields and adiabatic states on Robertson–Walker spacetimes, Comm. Math. Phys. 216 (2001) 635–661. [34] H. Sahlmann and R. Verch, Microlocal spectrum condition and Hadamard form for vector-valued quantum fields in curved spacetime, Rev. Math. Phys. 13 (2001) 1203– 1246. [35] S. Hollands, The operator product expansion for perturbative quantum field theory in curved spacetime, Comm. Math. Phys. 273 (2007) 1–36. [36] J. J. Duistermaat and L. H¨ ormander, Fourier integral operators. II, Acta Math. 128 (1972) 183–269. [37] A. Strohmaier, R. Verch and M. Wollenberg, Microlocal analysis of quantum fields on curved space-times: Analytic wave front sets and Reeh–Schlieder theorems, J. Math. Phys. 43 (2002) 5514–5530.
May 11, J070-S0129055X10003990
430
2010 10:6 WSPC/S0129-055X
148-RMP
K. Sanders
[38] L. H¨ ormander, The Analysis of Linear Partial Differential Operators, Vol. I (Springer, Berlin, 2003). [39] K. Sanders, Equivalence of the (generalized) Hadamard and microlocal spectrum condition for (generalized) free fields in curved spacetime, Comm. Math. Phys. 295 (2010) 485–501. [40] R. Brunetti, K. Fredenhagen and M. K¨ ohler, The microlocal spectrum condition and Wick polynomials of free fields on curved spacetimes, Comm. Math. Phys. 180 (1996) 633–652. [41] N. Dencker, On the propagation of polarization sets for systems of real principal type, J. Funct. Anal. 46 (1982) 351–372. [42] C. D’Antoni and S. Hollands, Nuclearity, local quasiequivalence and split property for Dirac quantum fields in curved spacetime, Comm. Math. Phys. 261 (2006) 133–159. [43] M. Forger and H. R¨ omer, Currents and the energy-momentum tensor in classical field theory: A fresh look at an old problem, Ann. Phys. 309 (2004) 306–389.
May 11, J070-S0129055X10004004
2010 10:7 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 22, No. 4 (2010) 431–484
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004004

INVERSE SCATTERING IN DE SITTER–REISSNER–NORDSTRÖM BLACK HOLE SPACETIMES

THIERRY DAUDÉ
Department of Mathematics and Statistics, McGill University, 805 Sherbrooke South West, Montréal QC, H3A 2K6, Canada
[email protected]

FRANÇOIS NICOLEAU
Département de Mathématiques, Laboratoire Jean Leray – UMR 6629, Université de Nantes, 2, rue de la Houssinière, BP 92208, 44322 Nantes Cedex 03, France
[email protected]

Received 4 October 2009
Revised 15 March 2010

In this paper, we study the inverse scattering of massive charged Dirac fields in the exterior region of (de Sitter)–Reissner–Nordström black holes. Firstly, we obtain a precise high-energy asymptotic expansion of the diagonal elements of the scattering matrix (i.e. of the transmission coefficients) and we show that the leading terms of this expansion allow one to recover uniquely the mass, the charge and the cosmological constant of the black hole. Secondly, in the case of nonzero cosmological constant, we show that the knowledge of the reflection coefficients of the scattering matrix on any interval of energy also permits one to recover uniquely these parameters.

Keywords: Inverse scattering; black holes; Dirac equation.

Mathematics Subject Classification 2010: 81U40, 35P25
1. Introduction

This paper deals with inverse scattering problems in black hole spacetimes and is a continuation of our previous work [4]. Here we shall study the inverse scattering of massive charged Dirac fields that propagate in the outer region of (de Sitter)–Reissner–Nordström black holes, an important family of spherically symmetric, charged exact solutions of the Einstein equations that will be thoroughly described in Sec. 2. These spacetimes are completely characterized by three parameters: the mass M > 0 and the electric charge Q ∈ R of the black hole as well as the cosmological constant Λ ≥ 0 of the universe. In what follows, these parameters will be considered as the “unknowns” of our inverse problem. In fact, the inverse scattering problem we have in mind is the following: we assume that we are static observers living in the exterior region of a (dS)-RN black hole, that is the region between the exterior event horizon of the black hole and the cosmological horizon when Λ > 0, or the region lying beyond the exterior event horizon of the black hole when Λ = 0. The geometry of the spacetime in which these observers live is thus fixed in some sense. What we do not assume, however, is that these observers know the exact values of the parameters M, Q and Λ a priori. Hence the natural question we address is: do such observers have any means to measure or characterize uniquely these parameters by an inverse scattering experiment?

Let us explain more precisely the exact inverse scattering problem studied in this paper. First of all, we shall use the direct scattering theory for massive charged Dirac fields established in [3] for RN black holes and more generally in [18] for dS-RN black holes. The point of view adopted in these papers to describe the geometry of the black hole is that of static observers located far from the horizons (think typically of a telescope on earth aimed at the black hole). We shall keep this point of view here, which means in practice that all the relevant objects (such as the wave and scattering operators) used in this work will be expressed by means of the Regge–Wheeler coordinate system. This choice of coordinates has an important consequence for the understanding of the boundaries of the outer region of (dS)-RN black holes, namely, either the exterior event horizon of the black hole and the cosmological horizon when Λ > 0, or the event horizon of the black hole and spacelike infinity when Λ = 0. These boundaries are indeed perceived by such observers as asymptotic regions of the spacetime which, moreover, may have very different geometrical structures.
This entails the following nice and peculiar picture concerning the propagation properties of the Dirac fields ([3, 18]). First, it can be proved that the energy of the fields contained in any compact set between the two asymptotic regions vanishes at late times. Therefore, the fields scatter toward these asymptotic regions. Second, from the point of view of our particular observers, Dirac fields are shown to obey there simple but different equations that reflect the different geometries of the asymptotic regions. Therefore, two distinct wave operators must be introduced according to the asymptotic region we consider. Let us denote for the moment the wave operators corresponding to the part of Dirac fields which scatters toward the event horizon of the black hole by $W^{\pm}_{(-\infty)}$ and the wave operators corresponding to the part of Dirac fields which scatters toward the cosmological horizon or spatial infinity by $W^{\pm}_{(+\infty)}$. These wave operators will be precisely defined in Sec. 2. The main result obtained in [3, 18] asserts that the global wave operators defined by
\[
W^{\pm} = W^{\pm}_{(-\infty)} + W^{\pm}_{(+\infty)}, \tag{1.1}
\]
exist and are asymptotically complete. This allows us to define a global scattering operator S by the usual formula $S = (W^{+})^{*} W^{-}$.
The scattering operator S will be the main object of study of this paper. It encodes the scattering data as viewed by observers living far from the horizons of a (dS)-RN black hole. We thus rephrase and make precise our initial problem in the following way. We assume that our observers have experimental access to the scattering operator S. More precisely, we assume that they can measure the expectation values of S, i.e. any quantity of the form $\langle S\psi, \phi\rangle$, where $\langle\cdot,\cdot\rangle$ denotes the scalar product of the energy Hilbert space $\mathcal{H}$ on which S acts and ψ, φ are any elements of $\mathcal{H}$. The question we address is now: is the knowledge of S and any of its related quantities sufficient information to uniquely characterize the parameters M, Q and Λ of (dS)-RN black holes? We can in fact be a bit more precise in the statement of the problem if we remark that the scattering operator S can be decomposed using (1.1) as
\[
S = T_L + T_R + L + R,
\]
where
\[
T_L = (W^{+}_{(+\infty)})^{*}\, W^{-}_{(-\infty)}, \qquad T_R = (W^{+}_{(-\infty)})^{*}\, W^{-}_{(+\infty)},
\]
and
\[
R = (W^{+}_{(+\infty)})^{*}\, W^{-}_{(+\infty)}, \qquad L = (W^{+}_{(-\infty)})^{*}\, W^{-}_{(-\infty)}.
\]
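The decomposition can be checked by inserting (1.1) into $S = (W^{+})^{*} W^{-}$ and expanding the product. The display below is only this bookkeeping step, written out under the definitions given above; it is not an additional result.

```latex
\begin{align*}
S &= \bigl(W^{+}_{(-\infty)} + W^{+}_{(+\infty)}\bigr)^{*}
     \bigl(W^{-}_{(-\infty)} + W^{-}_{(+\infty)}\bigr) \\
  &= \underbrace{(W^{+}_{(+\infty)})^{*} W^{-}_{(-\infty)}}_{T_L}
   + \underbrace{(W^{+}_{(-\infty)})^{*} W^{-}_{(+\infty)}}_{T_R}
   + \underbrace{(W^{+}_{(-\infty)})^{*} W^{-}_{(-\infty)}}_{L}
   + \underbrace{(W^{+}_{(+\infty)})^{*} W^{-}_{(+\infty)}}_{R}.
\end{align*}
```

The two cross terms pair wave operators attached to different asymptotic regions, while L and R pair wave operators attached to the same region.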
Each of the terms in S corresponds to a different inverse scattering experiment. For instance, the first two terms $T_R$ and $T_L$ (in fact the diagonal elements of S) are understood as transmission operators. These terms measure the part of a signal which is transmitted from one asymptotic region to the other in a scattering process. Conversely, the last two terms L and R (the anti-diagonal elements of S) are understood as reflection operators and correspond to the opposite experiment. These terms measure the part of the signal which is reflected from an asymptotic region to itself. The quantities of interest, that is the inverse scattering data, will thus be either the expectation values $\langle T_R\psi, \phi\rangle$, $\langle T_L\psi, \phi\rangle$ of the transmission operators, or the expectation values $\langle L\psi, \phi\rangle$, $\langle R\psi, \phi\rangle$ of the reflection operators.

In this paper, we shall study two types of inverse problems. Firstly, in the two cases of RN black holes (Λ = 0) and dS-RN black holes (Λ > 0), we shall prove that the parameters M, Q, Λ are uniquely determined if we assume that the high energies of the transmission operators $T_R$ or $T_L$ are known. Note here that the same analysis would not be possible working with the reflection operators R or L. The high energies of the reflection operators are indeed non-measurable and thus cannot be used to determine uniquely the parameters. This was mentioned in [4] (see also [12] where a similar problem was studied). Secondly, in the case of dS-RN black holes only (Λ > 0), we shall prove the same uniqueness result under the assumption that the reflection operators L or R are known on any (possibly small) interval of energy. The reason why we do not treat this second type of inverse problem in the case of RN black holes is the following. The structure of the scattering operator (at any energy) turns out to be more complicated in the case of RN black holes than in dS-RN black holes. This is again a consequence of the very different geometries of the asymptotic regions in the RN case (see below for a brief explanation). Obtaining the same uniqueness result in this last case would thus require a better understanding of the scattering matrix. We are currently investigating this problem.

Let us now recall the results of [4] where the first kind of inverse problem was addressed in the case of Reissner–Nordström black holes (i.e. with only the two parameters M, Q unknown and the cosmological constant Λ equal to 0). Using the direct scattering theory for massless Dirac fields obtained in [3, 20] and a high energy asymptotic expansion of the expectation values $\langle T_R\psi, \phi\rangle$ or $\langle T_L\psi, \phi\rangle$ (as defined above), a partial answer was then given: the mass M and the modulus of the charge |Q| are uniquely determined from the leading terms of this high energy asymptotic expansion. Note that the indeterminacy of the sign of the charge is not surprising in that case since the propagation of massless Dirac fields is only influenced by the geometry of the black hole which in turn only depends on |Q| (see the expression of the metric (2.2) in Sec. 2).

In this paper, we continue our investigation and improve our results in several directions. In Sec. 3, we reconsider the case Λ = 0 corresponding to RN black holes but study the inverse scattering of massive charged Dirac fields instead of massless Dirac fields. Using the same approach as in [4], we show that the mass M as well as the charge Q are uniquely determined by the leading terms of the high energy asymptotic expansion of the transmission operators $T_R$ or $T_L$. In fact, the advantage of considering massive charged Dirac fields is that an explicit term associated to the interaction between the electric charge of the fields and that of the black hole appears in the equation and allows one to recover Q and not only |Q|.
From the mathematical side, the analysis turns out to be much more involved than in [4]. The reasons are twofold. First, from the point of view of our observers, massive Dirac fields have completely distinct behaviors when approaching the different asymptotic regions. At the event horizon of the black hole for instance, the attraction exerted by the black hole is so strong that massive Dirac fields seem to behave as massless Dirac fields. The asymptotic dynamics there turns out to be very simple and is shown to obey a system of transport equations along the null radial geodesics of the black hole.ᵃ This is a consequence of the particular geometry (of hyperbolic type) near the event horizon (and more generally near any horizon). Conversely, RN black holes are asymptotically flat at spacelike infinity. There, the fields simply behave like massive Dirac fields in Minkowski spacetime and the mass of the fields, slowing down the propagation, plays an important role. In consequence, the dynamics near the two asymptotic regions are quite different and must be treated separately. The second and related difficulty comes from the appearance of long-range terms in the equation but only at a single asymptotic region: spacelike infinity. This entails new technical difficulties such as a modification of the standard wave operators at infinity and we need to work harder to obtain the high energy asymptotic expansion of the transmission operators. An interesting feature we would like to emphasize is that, even though we are considering high energies, the rest mass and the charge of Dirac fields do contribute to the asymptotic expansion of the scattering matrix. This can be clearly seen from the reconstruction formulae obtained in Theorem 3.2. Finally, we also mention that the model studied in this section can be viewed as a good intermediate model before studying the same inverse problem in the more complicated geometrical setting of Kerr black holes. Indeed, as shown in [13], the appearance of long-range terms in the equation (even for massless Dirac fields) is unavoidable in that case as a side effect of the rotation of the spacetime.

ᵃ We emphasize again here that this simple expression for the asymptotic dynamics at the event horizon (in fact at any horizon) is only true from the point of view of observers living far from the horizons. Adopting another point of view such as the one of local observers living near a horizon would lead to very different asymptotic dynamics.

In Sec. 4, we consider the case of nonzero cosmological constant Λ > 0, that is de Sitter–Reissner–Nordström black holes, and the three parameters M, Q, Λ are supposed to be a priori unknown. The two asymptotic regions are the event horizon of the black hole and the cosmological horizon. From the point of view of our observers, massive Dirac fields seem to behave as massless Dirac fields when approaching the horizons and as before, their propagation there obeys essentially a system of transport equations along the null radial geodesics of the black hole. However, different oscillations appear in the dynamics near these two horizons, once again due to the interaction between the charge of the field and that of the black hole. In consequence, Dirac fields evolve asymptotically according to slightly different dynamics in that case too. In Sec.
4.1, using the results of the previous part, we shall obtain a high energy asymptotic expansion of the transmission operators $T_R$ and $T_L$ and we shall prove that the parameters M, Q and Λ are uniquely characterized by the leading terms of this asymptotic expansion.

In Sec. 4.2, we consider an inverse scattering problem based on the knowledge of the reflection operators R or L on a (possibly small) interval of energy. As already mentioned, a high energy asymptotic expansion of these reflection operators does not give any information and cannot be used to solve the inverse problem. To study this case, we follow instead the usual stationary approach of inverse scattering theory on the line. We refer for instance to the review by Faddeev [8] and to the important paper by Deift and Trubowitz [6] for a presentation of the method for Schrödinger operators and to the nice paper [1] for a recent application to Dirac operators (see also [12, 15]). We shall first obtain a stationary representation of the scattering operator S in terms of the usual transmission and reflection “coefficients” (note that these turn out to be matrices in our case). This is done after a series of simplifications of our model, which finally reduces to a particular case of the model studied in [1]. Then we use the analysis of [1], namely, a classical Marchenko method based on a detailed analysis of the stationary solutions of the corresponding Dirac equation, to prove the following result: the knowledge of one of the reflection operators L or R at all energies is enough to uniquely characterize the parameters M, Q and Λ. Finally, we improve this result by observing that, in the dS-RN model, the reflection operators R or L are in fact analytic in the energy variable on a small strip containing the real axis. Hence it is enough to know R or L on any interval of energy in order to uniquely know them for all energies. Applying the result of [1], this leads to the uniqueness of the parameters in that case too. Note finally that a stationary representation of the scattering operator in the case of RN black holes would differ drastically from the one obtained in Sec. 4.2 for dS-RN black holes. This is due to the presence of long-range terms at spacelike infinity that change the asymptotic behaviors of stationary solutions and thus the structure of the scattering matrix. In particular, the stationary representation obtained in [1] could not be used in this case.

We finish this introduction by saying a few words on the main technical tool used in Secs. 3 and 4 to prove our uniqueness results from the high energies of the transmission operators $T_R$ or $T_L$. These are based on a high-energy expansion of the scattering operator S following an approach introduced by Enss and Weder in [7] in the case of multidimensional Schrödinger operators. (Note that the case of multidimensional Dirac operators in flat spacetime was treated later by Jung in [17].) Their result can be summarized as follows. Using purely time-dependent methods, they showed, roughly speaking, that the first term of the high-energy expansion of S is exactly the Radon transform of the potential they are looking for. Since they work in dimension greater than two, this Radon transform can be inverted and the potential thus uniquely recovered. In our problem however, due to the spherical symmetry of the black hole, we are led to study a family of one-dimensional Dirac equations and the above Radon transform simply becomes an integral of a one-dimensional function, hence a number, and cannot be inverted.
Fortunately, in our models, it turns out that this integral can be explicitly computed and in general already gives physically relevant information. Nevertheless, it is not enough to uniquely characterize all the parameters of the black hole. In fact, we need to calculate several terms of the asymptotic expansion (and thus obtain several integrals) to prove our result. To do this, we follow the stationary technique introduced by one of us [21] which is close in spirit to the Isozaki–Kitada method used in long-range scattering theory [16]. The basic idea is to replace the wave operators (and thus the scattering operator) by explicit Fourier Integral Operators, called modifiers, from which we are able to compute the high-energy expansion readily. The construction of these modifiers and the precise determination of their phases and amplitudes will be given in a self-contained manner in Sec. 3. Note also that the similar results proved in our previous paper [4] could not be applied directly to our new model because of the presence of long-range terms in the equation. Finally, we mention that, while this method was well-known for Schrödinger operators and applied successfully to various situations (see [2, 21–23]), it has required some substantial modifications when applied to Dirac operators, essentially because of the matrix-valued nature of the equation. To deal with these difficulties, we made extensive use of the paper by Gâtel and Yafaev [9] where a direct scattering theory of massive Dirac fields in flat spacetime was studied and modifiers were constructed.
2. (De Sitter)–Reissner–Nordström Black Holes and Dirac Equation

In this section, we describe the geometry of the exterior regions of (de Sitter)–Reissner–Nordström black holes. In particular, we emphasize the point of view adopted for the observers as well as the different properties of the asymptotic regions mentioned in the introduction, clearly distinguishing between the cases of zero and nonzero cosmological constant Λ. We then express in a synthetic manner the equations that govern the evolution of massive charged Dirac fields in these spacetimes. We end this section by recalling the known direct scattering results of [3, 18] and introducing the scattering operator S.

2.1. (De Sitter)–Reissner–Nordström black holes

In Schwarzschild coordinates a (de Sitter)–Reissner–Nordström black hole is described by a four-dimensional smooth manifold
\[
\mathcal{M} = \mathbb{R}_t \times \mathbb{R}^{+}_r \times S^2_\omega,
\]
equipped with the Lorentzian metric
\[
g = F(r)\,dt^2 - F(r)^{-1}\,dr^2 - r^2\,d\omega^2, \tag{2.1}
\]
where
\[
F(r) = 1 - \frac{2M}{r} + \frac{Q^2}{r^2} - \frac{\Lambda r^2}{3}, \tag{2.2}
\]
and $d\omega^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$ is the Euclidean metric on the sphere $S^2$. The constants M > 0, Q ∈ R appearing in (2.2) are interpreted as the mass and the electric charge of the black hole and Λ ≥ 0 is the cosmological constant of the universe. Observe that the function (2.2) and thus the metric (2.1) do not depend on the angular variables (θ, ϕ) ∈ $S^2$, reflecting the fact that dS-RN black holes are spherically symmetric spacetimes. The family $(\mathcal{M}, g)$ are in fact exact solutions of the Einstein–Maxwell equations
\[
G_{\mu\nu} = 8\pi T_{\mu\nu}, \qquad G_{\mu\nu} = R_{\mu\nu} + \frac{1}{2} R g_{\mu\nu} + \Lambda g_{\mu\nu}. \tag{2.3}
\]
Here $G_{\mu\nu}$, $R_{\mu\nu}$ and R denote respectively the Einstein tensor, the Ricci tensor and the scalar curvature of $(\mathcal{M}, g)$ while $T_{\mu\nu}$ is the energy-momentum tensor
\[
T_{\mu\nu} = \frac{1}{4\pi}\Bigl(F_{\mu\rho} F_{\nu}{}^{\rho} - \frac{1}{4} g_{\mu\nu} F_{\rho\sigma} F^{\rho\sigma}\Bigr), \tag{2.4}
\]
where $F_{\mu\nu}$ is the electromagnetic two-form solution of the Maxwell equations $\nabla^{\mu} F_{\nu\rho} = 0$, $\nabla_{[\mu} F_{\nu\rho]} = 0$ and given here in terms of a global electromagnetic vector potential
\[
F_{\mu\nu} = \nabla_{[\mu} A_{\nu]}, \qquad A_\nu\, dx^\nu = -\frac{Q}{r}\, dt. \tag{2.5}
\]
The metric g has two types of singularities. Firstly, the point {r = 0} for which the function F is singular. This is a true singularity or curvature singularity.ᵇ Secondly, the spheres whose radii are the roots of F (note that the coefficient of the metric g involving $F^{-1}$ blows up in this case). We must distinguish here two cases. When the cosmological constant is positive (Λ > 0) and small enough, there are three positive roots $0 \le r_- < r_0 < r_+ < +\infty$. The spheres of radius $r_-$, $r_0$ and $r_+$ are called, respectively, the Cauchy, event and cosmological horizons of the dS-RN black hole. When Λ = 0, the number of these roots depends on the respective values of the constants M and Q. In this paper, we only consider the case M > |Q| for which the function F has two zeros at the values $r_- = M - \sqrt{M^2 - Q^2}$ and $r_0 = M + \sqrt{M^2 - Q^2}$. The spheres of radius $r_-$ and $r_0$ are called, respectively, the Cauchy and event horizons of the RN black hole. In both situations, the horizons are not true singularities in the sense given for {r = 0}, but in fact coordinate singularities. It turns out that, using appropriate coordinate systems, these horizons can be understood as regular null hypersurfaces that can be crossed one way but would require speeds greater than that of light to be crossed the other way. We refer to [14, 28] for an introduction to black hole spacetimes and their general properties.

As mentioned in the introduction, we shall consider in this paper inverse scattering problems from the point of view of static observers living in the exterior region of a (dS)-RN black hole, that is the region $\{r_0 < r < r_+\}$ when Λ > 0 or the region $\{r_0 < r < +\infty\}$ when Λ = 0, and located far from the horizons. Such observers are well described by the variable t of the Schwarzschild coordinates, meaning that t corresponds to their proper time. Since the metric is singular there, it is important to understand the roles of the singularities (the horizons) as the natural boundaries of the exterior region.
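As a rough numerical illustration of the horizon structure described above, the following Python sketch locates the positive roots of F in (2.2) by a sign-change scan followed by bisection, and evaluates the quantity F'(r)/2 at a root. The function names, the scan window and the sample values M = 1, Q = 0.5, Λ = 0.01 are choices made for this illustration only and do not come from the paper.

```python
import math


def F(r, M, Q, Lam):
    """Metric function (2.2): F(r) = 1 - 2M/r + Q^2/r^2 - Lam*r^2/3."""
    return 1.0 - 2.0 * M / r + Q * Q / (r * r) - Lam * r * r / 3.0


def half_F_prime(r, M, Q, Lam):
    """F'(r)/2, obtained by differentiating (2.2) term by term."""
    return M / (r * r) - Q * Q / r**3 - Lam * r / 3.0


def bisect(f, a, b, tol=1e-13, max_iter=200):
    """Plain bisection on [a, b]; assumes f changes sign between a and b."""
    fa = f(a)
    for _ in range(max_iter):
        mid = 0.5 * (a + b)
        fm = f(mid)
        if fa * fm <= 0.0:
            b = mid
        else:
            a, fa = mid, fm
        if b - a < tol:
            break
    return 0.5 * (a + b)


def positive_roots(M, Q, Lam, n_grid=20000):
    """Positive roots of F inside a heuristic scan window (an assumption)."""
    r_min = 1e-3 * M
    r_max = 1.5 * math.sqrt(3.0 / Lam) if Lam > 0 else 4.0 * M
    f = lambda r: F(r, M, Q, Lam)
    grid = [r_min + (r_max - r_min) * k / n_grid for k in range(n_grid + 1)]
    roots = []
    for a, b in zip(grid, grid[1:]):
        if (f(a) > 0.0) != (f(b) > 0.0):  # sign change inside [a, b]
            roots.append(bisect(f, a, b))
    return roots


if __name__ == "__main__":
    # Lam = 0: compare with the closed forms M -/+ sqrt(M^2 - Q^2).
    print(positive_roots(1.0, 0.5, 0.0))
    # Small Lam > 0: three roots r_- < r_0 < r_+ are expected.
    for r in positive_roots(1.0, 0.5, 0.01):
        print(r, half_F_prime(r, 1.0, 0.5, 0.01))
```

For the sample values, F'(r)/2 comes out positive at the event horizon and negative at the cosmological horizon, which is the sign pattern of the surface gravities used in the construction of the Regge–Wheeler coordinate.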
It turns out that the horizons are perceived by such observers as asymptotic regions of spacetime. Precisely, this means that they are never reached in a finite time t by incoming and outgoing null radial geodesics, i.e. the trajectories followed by classical light-rays aimed radially at the black hole and either at the cosmological horizon if Λ > 0 or at infinity if Λ = 0. To see this point more easily, we introduce a new radial coordinate x, called the Regge–Wheeler coordinate, which has the property of straightening the null radial geodesics and will, at the same time, greatly simplify the later analysis. Since for all Λ ≥ 0 the function F(r) in (2.2) remains positive in the exterior region, x can be defined implicitly by the relation
\[
\frac{dr}{dx} = F(r) > 0, \tag{2.6}
\]
or explicitly, by
\[
x = \frac{1}{2\kappa_0}\Bigl[\log(r - r_0) - \int_{r_0}^{r}\Bigl(\frac{1}{y - r_0} - \frac{2\kappa_0}{F(y)}\Bigr)\,dy\Bigr] + C, \tag{2.7}
\]
where the quantity
\[
\kappa_0 = \frac{1}{2} F'(r_0) > 0
\]
is called the surface gravity of the event horizon and C is any constant of integration. Note that, when Λ > 0, the Regge–Wheeler variable could also be defined explicitly by
\[
x = \frac{1}{2\kappa_+}\Bigl[\log(r_+ - r) - \int_{r}^{r_+}\Bigl(\frac{1}{r_+ - y} + \frac{2\kappa_+}{F(y)}\Bigr)\,dy\Bigr] + C, \tag{2.8}
\]
where the quantity
\[
\kappa_+ = \frac{1}{2} F'(r_+) < 0
\]
is called the surface gravity of the cosmological horizon. Moreover, in the case Λ = 0, the expression (2.7) simplifies to
\[
x = r + \frac{1}{2\kappa_0}\log(r - r_0) - \frac{r_-^2}{r_0 - r_-}\log(r - r_-) + C. \tag{2.9}
\]

ᵇ It means that certain scalars obtained by contracting the Riemann tensor blow up when r → 0.
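The Λ = 0 closed form of the Regge–Wheeler coordinate can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: it writes the coefficient of log(r − r₋) as 1/(2κ₋) with κ₋ = F'(r₋)/2, which is what the partial-fraction decomposition of 1/F gives when F = (r − r₋)(r − r₀)/r², and it verifies by a central finite difference that dx/dr = 1/F(r), in agreement with (2.6). The function names and the sample values M = 1, Q = 0.5 are hypothetical.

```python
import math


def horizons_rn(M, Q):
    """Cauchy and event horizon radii for Lam = 0, assuming M > |Q|."""
    s = math.sqrt(M * M - Q * Q)
    return M - s, M + s


def F_rn(r, M, Q):
    """Metric function (2.2) with Lam = 0."""
    return 1.0 - 2.0 * M / r + Q * Q / (r * r)


def regge_wheeler_rn(r, M, Q, C=0.0):
    """Closed-form x(r) for Lam = 0, valid for r > r_0 (sketch)."""
    r_minus, r0 = horizons_rn(M, Q)
    kappa0 = (r0 - r_minus) / (2.0 * r0**2)            # F'(r_0)/2 > 0
    kappa_minus = (r_minus - r0) / (2.0 * r_minus**2)  # F'(r_-)/2 < 0
    return (r
            + math.log(r - r0) / (2.0 * kappa0)
            + math.log(r - r_minus) / (2.0 * kappa_minus)
            + C)


def defect(r, M, Q, h=1e-6):
    """|F(r) * dx/dr - 1| with dx/dr taken by a central difference."""
    slope = (regge_wheeler_rn(r + h, M, Q) - regge_wheeler_rn(r - h, M, Q)) / (2.0 * h)
    return abs(slope * F_rn(r, M, Q) - 1.0)


if __name__ == "__main__":
    M, Q = 1.0, 0.5
    for r in (2.0, 3.0, 5.0, 10.0):
        print(r, regge_wheeler_rn(r, M, Q), defect(r, M, Q))
    # x decreases without bound as r approaches the event horizon from outside.
    _, r0 = horizons_rn(M, Q)
    print(regge_wheeler_rn(r0 + 1e-12, M, Q))
```

The last line illustrates the logarithmic divergence that pushes the event horizon to x = −∞.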
In the coordinate system (t, x, ω), it is easy to see from the logarithm in (2.7) and (2.9) and the positive sign of $\kappa_0$ that the event horizon $\{r = r_0\}$ is pushed away to {x = −∞} for all Λ ≥ 0. Similarly it follows from (2.8) and the negative sign of $\kappa_+$ that the cosmological horizon $\{r = r_+\}$ is pushed away to {x = +∞} when Λ > 0. Hence in any case the Regge–Wheeler variable x runs over the full real line R. Moreover, by (2.6), the metric now takes the form
\[
g = F(r)\,(dt^2 - dx^2) - r^2\,d\omega^2, \tag{2.10}
\]
from which it is immediate to see that the incoming and outgoing null radial geodesics are generated by the vector fields $\frac{\partial}{\partial t} \pm \frac{\partial}{\partial x}$ and take the simple form
\[
\gamma^{\pm}(t) = (t,\ x_0 \pm t,\ \omega_0), \qquad t \in \mathbb{R}, \tag{2.11}
\]
where $(x_0, \omega_0) \in \mathbb{R} \times S^2$ are fixed. These are simply straight lines with velocity ±1 mimicking, at least in the t−x plane, the situation of a one-dimensional Minkowski spacetime. Finally, using (2.11), we can check directly that the event horizon and the cosmological horizon (when Λ > 0) are asymptotic regions of spacetime in the sense given above.

From now on, we shall only consider the exterior region of dS-RN black holes and we shall work on the manifold $B = \mathbb{R}_t \times \Sigma$ with $\Sigma = \mathbb{R}_x \times S^2_\omega$, equipped with the metric (2.10). Such a manifold B is globally hyperbolic, meaning that the foliation $\Sigma_t = \{t\} \times \Sigma$ by the level hypersurfaces of the function t is a foliation of B by Cauchy hypersurfaces (see [28] for a definition of global hyperbolicity and Cauchy hypersurfaces). In consequence, we can view the propagation of massive charged Dirac fields as an evolution equation in t on the spacelike hypersurface Σ, that is a cylindrical manifold having two distinct ends: {x = −∞} corresponding to the event horizon of the black hole and {x = +∞} corresponding to the cosmological horizon when Λ > 0 and to spacelike infinity when Λ = 0. Note that the geometries of these ends are distinct in general. The event and cosmological horizons are indeed exponentially large ends of Σ whereas spacelike infinity is an asymptotically flat end of Σ (in the latter, observe that the metric (2.2) tends to the Minkowski metric expressed in spherical coordinates when r → +∞). The difference between these geometries will be easily seen from the distinct asymptotic behaviors of Dirac fields near these regions given in the next subsection.

2.2. Dirac equation and direct scattering results

Scattering theory for massive charged Dirac fields on the spacetime B has been the object of the papers [3, 18]. We briefly recall here the main results of these papers. In particular, we use the form of the Dirac equation obtained therein. First, the evolution equation satisfied by massive charged Dirac fields in B can be written in the Hamiltonian form
\[
i\partial_t \psi = H\psi, \tag{2.12}
\]
where ψ is a 4-components spinor belonging to the Hilbert space H = L2 (R × S 2 ; C4 ), and the Hamiltonian H is given by H = Γ1 Dx + a(x)DS 2 + b(x)Γ0 + c(x).
(2.13)
Here we use the following notations. The symbol $D_x$ stands for $-i\partial_x$ whereas $D_{S^2}$ denotes the Dirac operator on $S^2$ which, in spherical coordinates, takes the form
$$D_{S^2} = -i\Gamma^2\Big(\partial_\theta + \frac{\cot\theta}{2}\Big) - \frac{i}{\sin\theta}\,\Gamma^3\partial_\varphi. \tag{2.14}$$
The potentials $a, b, c$ are scalar smooth functions given in terms of the metric (2.1) by
$$a(x) = \frac{\sqrt{F(r)}}{r}, \qquad b(x) = m\sqrt{F(r)}, \qquad c(x) = \frac{qQ}{r}, \tag{2.15}$$
where $m$ and $q$ denote the mass and the electric charge of the fields respectively. Finally, the matrices $\Gamma^1, \Gamma^2, \Gamma^3, \Gamma^0$ appearing in (2.13) and (2.14) are usual $4\times 4$ Dirac matrices that satisfy the anticommutation relations
$$\Gamma^i\Gamma^j + \Gamma^j\Gamma^i = 2\delta_{ij}\,\mathrm{Id}, \qquad \forall\, i,j = 0,\dots,3. \tag{2.16}$$
Second, we use the spherical symmetry of the equation to simplify further the expression of the Hamiltonian H. Since the Dirac operator DS 2 has compact resolvent, it can be diagonalized into an infinite sum of matrix-valued multiplication operators. The eigenfunctions associated to DS 2 are a generalization of the usual spherical harmonics called spin-weighted spherical harmonics. We refer to
Gel’Fand and Sapiro [10] for a detailed presentation of these generalized spherical harmonics and to [3, 18] for an application to our model. There thus exists a family of eigenfunctions $F_n^l$ of $D_{S^2}$, with the indices $(l,n)$ running in the set $I = \{(l,n) : l - \tfrac12 \in \mathbb{N},\ l - |n| \in \mathbb{N}\}$, which forms a Hilbert basis of $L^2(S^2;\mathbb{C}^4)$ with the following property. The Hilbert space $\mathcal{H}$ can then be decomposed into the infinite direct sum
$$\mathcal{H} = \bigoplus_{(l,n)\in I}\big[L^2(\mathbb{R}_x;\mathbb{C}^4)\otimes F_n^l\big] := \bigoplus_{(l,n)\in I}\mathcal{H}_{ln},$$
where $\mathcal{H}_{ln} = L^2(\mathbb{R}_x;\mathbb{C}^4)\otimes F_n^l$ is identified with $L^2(\mathbb{R};\mathbb{C}^4)$ and, more importantly, we obtain the orthogonal decomposition for the Hamiltonian $H$
$$H = \bigoplus_{(l,n)\in I} H^{ln},$$
with
$$H^{ln} := H|_{\mathcal{H}_{ln}} = \Gamma^1 D_x + a_l(x)\Gamma^2 + b(x)\Gamma^0 + c(x), \tag{2.17}$$
and $a_l(x) = -a(x)(l+\tfrac12)$. Note that the Dirac operator $D_{S^2}$ has been replaced in the expression of $H^{ln}$ by $-(l+\tfrac12)\Gamma^2$ thanks to the good properties of the spin-weighted spherical harmonics $F_n^l$. The operator $H^{ln}$ is a selfadjoint operator on $\mathcal{H}_{ln}$ with domain $D(H^{ln}) = H^1(\mathbb{R};\mathbb{C}^4)$. Finally we use the following representation for the Dirac matrices $\Gamma^1$, $\Gamma^2$ and $\Gamma^0$ appearing in (2.17):
$$\Gamma^1 = \begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&-1&0\\ 0&0&0&-1 \end{pmatrix}, \quad \Gamma^2 = \begin{pmatrix} 0&0&0&1\\ 0&0&-1&0\\ 0&-1&0&0\\ 1&0&0&0 \end{pmatrix}, \quad \Gamma^0 = \begin{pmatrix} 0&0&-i&0\\ 0&0&0&i\\ i&0&0&0\\ 0&-i&0&0 \end{pmatrix}. \tag{2.18}$$
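The representation (2.18) can be checked against the anticommutation relations (2.16) by direct multiplication. The following is a minimal sketch, assuming the entries of (2.18) are read as $\Gamma^1 = \mathrm{diag}(1,1,-1,-1)$ together with the two off-diagonal matrices written in the code; the helper names `matmul`, `anticommutator` and `satisfies_clifford` are illustrative and do not come from the paper.

```python
# Pure-Python check of the anticommutation relations (2.16).
# Assumption: the matrix entries below are our reading of (2.18).

GAMMA1 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
GAMMA2 = [[0, 0, 0, 1], [0, 0, -1, 0], [0, -1, 0, 0], [1, 0, 0, 0]]
GAMMA0 = [[0, 0, -1j, 0], [0, 0, 0, 1j], [1j, 0, 0, 0], [0, -1j, 0, 0]]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def anticommutator(a, b):
    """Return ab + ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]


def satisfies_clifford(mats, tol=1e-12):
    """True if G_i G_j + G_j G_i = 2 delta_ij Id for every pair in mats."""
    for i, a in enumerate(mats):
        for j, b in enumerate(mats):
            ac = anticommutator(a, b)
            for r in range(4):
                for c in range(4):
                    target = 2 if (i == j and r == c) else 0
                    if abs(ac[r][c] - target) > tol:
                        return False
    return True


print(satisfies_clifford([GAMMA1, GAMMA2, GAMMA0]))
```

Only the three matrices entering the reduced Hamiltonian (2.17) are tested; $\Gamma^3$ is not needed once the angular operator has been diagonalized.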
In this paper, it will be often enough to restrict our analysis to a fixed harmonic. To simplify notations we shall thus simply write H, H and a(x) instead of Hln , H ln and al (x) respectively and we shall indicate in the course of the text whether we work on the global problem or on a fixed harmonic. Let us summarize now the direct scattering results obtained in [3, 18]. It is well known that the main information of interest in scattering theory concerns the nature of the spectrum of the Hamiltonian H. Our first result goes in this sense. Using essentially a Mourre theory (see [19]), it was shown in [3, 18] that, for all Λ ≥ 0, σpp (H) = ∅,
σsing (H) = ∅.
In other words, the spectrum of H is purely absolutely continuous. Consequently, massive charged Dirac fields scatter toward the two asymptotic regions at late times and they are expected to obey simpler equations there. This is one of the main pieces of information encoded in the notion of wave operators that we introduce now.
We first treat the case $\Lambda = 0$ corresponding to RN black holes. From (2.2) and (2.9), the potentials $a, b, c$ have very different asymptotics as $x \to \pm\infty$ (according to our discussion above this reflects the fact that the geometries near the two asymptotic regions are very different). At the event horizon, there exists $\alpha > 0$ such that
$$|a(x)|,\ |b(x)|,\ |c(x) - c_0| = O(e^{\alpha x}), \qquad x \to -\infty, \tag{2.19}$$
where the constant $c_0$ is given by (see (2.15))
$$c_0 = \frac{qQ}{r_0}.$$
Hence we can write the Hamiltonian $H$ as
$$H = H_0 + V_0, \qquad H_0 = \Gamma^1 D_x + c_0, \qquad V_0(x) = a(x)\Gamma^2 + b(x)\Gamma^0 + (c(x) - c_0),$$
where the potential $V_0$ is then short-range when $x \to -\infty$. In consequence, we can choose the asymptotic dynamic generated by the Hamiltonian $H_0 = \Gamma^1 D_x + c_0$ as the comparison dynamic in this region. The Hamiltonian $H_0$ is a selfadjoint operator on $\mathcal{H}$ with its spectrum covering the full real line, i.e. $\sigma(H_0) = \mathbb{R}$. Note finally that due to the simple diagonal form of the matrix $\Gamma^1$, the comparison dynamic $e^{-itH_0}$ is essentially a system of transport equations along the curves $x \pm t$, that is the null radial geodesics of the black hole.
Conversely at infinity, the potentials $a, b, c$ have the asymptotics
$$|a(x)|,\ |b(x) - m|,\ |c(x)| = O\Big(\frac{1}{x}\Big), \qquad x \to +\infty. \tag{2.20}$$
Hence we can write the Hamiltonian $H$ as
$$H = H_0^m + V_0^m, \qquad H_0^m = \Gamma^1 D_x + m\Gamma^0, \qquad V_0^m(x) = a(x)\Gamma^2 + (b(x) - m)\Gamma^0 + c(x),$$
where the potential $V_0^m$ is now a long-range potential having Coulomb decay when $x \to +\infty$. The asymptotic dynamic is generated by the Hamiltonian $H_0^m = \Gamma^1 D_x + m\Gamma^0$, a classical one-dimensional Dirac Hamiltonian in Minkowski spacetime. The Hamiltonian $H_0^m$ is a selfadjoint operator on $\mathcal{H}$ and its spectrum has a gap, i.e. $\sigma(H_0^m) = (-\infty, -m] \cup [+m, +\infty)$. However, contrary to the preceding case, the asymptotic dynamic $e^{-itH_0^m}$ cannot be used alone as a comparison dynamic because of the long-range potential $V_0^m$, but must be (Dollard)-modified. In order to define this modification and for other use, we need to introduce the classical velocity operators
$$V_0 = \Gamma^1, \qquad V_m = D_x (H_0^m)^{-1},$$
associated to the Hamiltonians $H_0$ and $H_0^m$, respectively. The classical velocity operators are selfadjoint operators on $\mathcal{H}$ and their spectra are simply $\sigma(\Gamma^1) = \{-1, +1\}$ and $\sigma(V_m) = [-1, +1]$. Let us also denote by $P_\pm$ and $P_\pm^m$ the projections onto the positive and negative spectrum of $\Gamma^1$ and $V_m$, i.e.
$$P_\pm = \mathbf{1}_{\mathbb{R}^\pm}(\Gamma^1), \qquad P_\pm^m = \mathbf{1}_{\mathbb{R}^\pm}(V_m).$$
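The two spectra quoted above can be read off from the Fourier symbols of the operators. The short computation below is a sketch that uses only the anticommutation relations (2.16); the notation $V_m(\xi)$ for the symbol of the velocity operator is introduced here for illustration.

```latex
\begin{align*}
(\Gamma^1\xi + m\Gamma^0)^2
  &= \xi^2(\Gamma^1)^2 + m^2(\Gamma^0)^2
     + \xi m\,(\Gamma^1\Gamma^0 + \Gamma^0\Gamma^1)
   = (\xi^2 + m^2)\,\mathrm{Id}, \\
(\Gamma^1\xi + m\Gamma^0)^{-1}
  &= \frac{1}{\xi^2 + m^2}\,(\Gamma^1\xi + m\Gamma^0), \\
V_m(\xi)
  &= \frac{\xi}{\xi^2 + m^2}\,(\Gamma^1\xi + m\Gamma^0),
  \qquad
  V_m(\xi)^2 = \frac{\xi^2}{\xi^2 + m^2}\,\mathrm{Id}.
\end{align*}
```

Hence the symbol of $H_0^m$ has the eigenvalues $\pm\sqrt{\xi^2+m^2}$, which sweep out $(-\infty,-m]\cup[m,+\infty)$ as $\xi$ ranges over $\mathbb{R}$, while the symbol of $V_m$ has the eigenvalues $\pm|\xi|/\sqrt{\xi^2+m^2}$, which lie in $(-1,1)$ and whose closure is $[-1,1]$.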
As shown in [3], a great interest of these projections is that they make it easy to separate the part of the fields that propagates toward the event horizon from the part that propagates toward infinity. They will be used in the definition of the wave operators below. Moreover, the classical velocity operator $V_m$ enters in the expression of the Dollard modified comparison dynamic at infinity proposed in [3] and given by
$$U(t) = e^{-itH_0^m}\, e^{-i\int_0^t \big[(b(sV_m) - m)\,m\,(H_0^m)^{-1} + c(sV_m)\big]\,ds}. \tag{2.21}$$
Let us make here two comments. First, the potential $a(x)\Gamma^2$ turns out to be a "false" long-range term. This is clear from (2.21) where the asymptotic dynamic $e^{-itH_0^m}$ has been modified by an extra phase which only involves the long-range potentials $b$ and $c$. We refer to [3] for an explanation of this particular point. Second, we shall propose in Sec. 3 a new time-independent modification of the comparison dynamic $e^{-itH_0^m}$ which will be a direct byproduct of our construction of modifiers in the spirit of Isozaki–Kitada's work [16]. This new modification will be shown to be equivalent to the Dollard modification (2.21) in Theorem 3.3.
We are now in position to introduce the wave operators associated to $H$. At the event horizon, we define
$$W^\pm_{(-\infty)} = \operatorname*{s-lim}_{t\to\pm\infty} e^{itH} e^{-itH_0} P_\mp, \tag{2.22}$$
whereas at infinity, we define
$$W^\pm_{(+\infty)} = \operatorname*{s-lim}_{t\to\pm\infty} e^{itH}\, U(t)\, P^m_\pm. \tag{2.23}$$
Finally, the global wave operators are given by
$$W^\pm = W^\pm_{(-\infty)} + W^\pm_{(+\infty)}. \tag{2.24}$$
Note here our use of the projections $P_\pm$ and $P_\pm^m$ to separate the part of the field propagating toward the event horizon from the part of the field propagating toward infinity. In fact without these projections, the wave operators (2.22) and (2.23) would not exist at all. More precisely the main result of [3] is
Theorem 2.1. The wave operators $W^\pm_{(-\infty)}$, $W^\pm_{(+\infty)}$ and $W^\pm$ exist on $\mathcal{H}$. Moreover, the global wave operators $W^\pm$ are partial isometries with initial spaces $\mathcal{H}^\pm_{scat} = P_\mp(\mathcal{H}) + P^m_\pm(\mathcal{H})$ and final space $\mathcal{H}$. In particular, $W^\pm$ are asymptotically complete, i.e. $\operatorname{Ran} W^\pm = \mathcal{H}$.
As a direct consequence of Theorem 2.1, we can define the scattering operator $S$ by the usual formula
$$S = (W^+)^* W^-. \tag{2.25}$$
It is clear that $S$ is a well-defined operator on $\mathcal{H}$ and a partial isometry from $\mathcal{H}^-_{scat}$ into $\mathcal{H}^+_{scat}$.
We now treat the case $\Lambda > 0$ corresponding to dS-RN black holes which turns out to be a little bit more symmetric at the two (event and cosmological) horizons.
According to (2.2), (2.7) and (2.8), the potentials $a, b, c$ have the following asymptotics as $x \to \pm\infty$. There exists $\alpha > 0$ such that
$$|a(x)|,\ |b(x)| = O(e^{-\alpha|x|}), \qquad |x| \to \infty, \tag{2.26}$$
and
$$|c(x) - c_0| = O(e^{\alpha x}), \qquad x \to -\infty, \tag{2.27}$$
$$|c(x) - c_+| = O(e^{-\alpha x}), \qquad x \to +\infty, \tag{2.28}$$
where the constants $c_0$ and $c_+$ are given by (see (2.15))
$$c_0 = \frac{qQ}{r_0}, \qquad c_+ = \frac{qQ}{r_+}. \tag{2.29}$$
Hence, the potentials $a, b$ are short-range when $x \to \pm\infty$ and $c - c_0$ and $c - c_+$ are short-range when $x \to -\infty$ and $x \to +\infty$, respectively. At the event horizon, we choose as before the asymptotic dynamic generated by the Hamiltonian $H_0 = \Gamma^1 D_x + c_0$ as the comparison dynamic while, at the cosmological horizon, we choose the asymptotic dynamic generated by the Hamiltonian $H_+ = \Gamma^1 D_x + c_+$ as the comparison dynamic. The Hamiltonians $H_0$ and $H_+$ are clearly selfadjoint operators on $\mathcal{H}$ and their spectra are exactly the real line, i.e. $\sigma(H_0) = \sigma(H_+) = \mathbb{R}$. Finally, we observe that the dynamics $e^{-itH_0}$ and $e^{-itH_+}$ are essentially systems of transport equations along the null radial geodesics of the black hole, but they differ by the distinct oscillations $e^{-itc_0}$ and $e^{-itc_+}$. We need the classical velocity operators associated to $H_0$ and $H_+$ in order to separate the part of the fields that propagates toward the event horizon from the part that propagates toward the cosmological horizon. It turns out that they are equal to $V_0 = \Gamma^1$ in both cases and the associated projections onto the positive and negative spectrum are still $P_\pm$. Thus we can introduce the wave operators as before. At the event horizon, we define
$$W^\pm_{(-\infty)} = \operatorname*{s-lim}_{t\to\pm\infty} e^{itH} e^{-itH_0} P_\mp, \tag{2.30}$$
and at the cosmological horizon, we define
$$W^\pm_{(+\infty)} = \operatorname*{s-lim}_{t\to\pm\infty} e^{itH} e^{-itH_+} P_\pm. \tag{2.31}$$
Finally, the global wave operators are given by
$$W^\pm = W^\pm_{(-\infty)} + W^\pm_{(+\infty)}. \tag{2.32}$$
The main result of [18] is
Theorem 2.2. The wave operators $W^\pm_{(-\infty)}$, $W^\pm_{(+\infty)}$ and $W^\pm$ exist on $\mathcal{H}$. Moreover, the global wave operators $W^\pm$ are isometries on $\mathcal{H}$. In particular, $W^\pm$ are asymptotically complete, i.e. $\operatorname{Ran} W^\pm = \mathcal{H}$.
Thanks to Theorem 2.2, we can define the scattering operator $S$ as in (2.25) by $S = (W^+)^* W^-$ which is a well-defined isometry on $\mathcal{H}$.
We deduce from the previous discussion that, for all $\Lambda \ge 0$, the scattering operator $S$ is a well-defined operator on $\mathcal{H}$. For all $\psi, \phi \in \mathcal{H}$, we shall consider in the following the expectation values of $S$, given by $\langle S\psi, \phi\rangle$, as the known data of our inverse problem. Moreover, using (2.24) and (2.32), we observe that these expectation values can be decomposed into 4 natural components
$$\langle S\psi, \phi\rangle = \langle W^-\psi, W^+\phi\rangle = \langle T_R\psi, \phi\rangle + \langle T_L\psi, \phi\rangle + \langle L\psi, \phi\rangle + \langle R\psi, \phi\rangle,$$
where
$$\langle T_R\psi, \phi\rangle = \langle W^-_{(+\infty)}\psi, W^+_{(-\infty)}\phi\rangle, \qquad \langle T_L\psi, \phi\rangle = \langle W^-_{(-\infty)}\psi, W^+_{(+\infty)}\phi\rangle, \tag{2.33}$$
$$\langle L\psi, \phi\rangle = \langle W^-_{(-\infty)}\psi, W^+_{(-\infty)}\phi\rangle, \qquad \langle R\psi, \phi\rangle = \langle W^-_{(+\infty)}\psi, W^+_{(+\infty)}\phi\rangle. \tag{2.34}$$
It follows from our definitions of the wave operators (2.22), (2.30) and (2.23), (2.31) that the previous quantities can be interpreted in terms of transmission and reflection between the different asymptotic regions, i.e. $\{x = -\infty\}$ for the event horizon of the black hole and $\{x = +\infty\}$ for either spacelike infinity if $\Lambda = 0$, or the cosmological horizon if $\Lambda > 0$. For instance, $\langle T_R\psi, \phi\rangle$ corresponds to the part of a signal transmitted from $\{x = +\infty\}$ to $\{x = -\infty\}$ in a scattering process whereas the term $\langle T_L\psi, \phi\rangle$ corresponds to the part of a signal transmitted from $\{x = -\infty\}$ to $\{x = +\infty\}$. Hence $T_R$ stands for "transmitted from the right" and $T_L$ for "transmitted from the left". Conversely, $\langle L\psi, \phi\rangle$ corresponds to the part of a signal reflected from $\{x = -\infty\}$ to $\{x = -\infty\}$ in a scattering process whereas the term $\langle R\psi, \phi\rangle$ corresponds to the part of a signal reflected from $\{x = +\infty\}$ to $\{x = +\infty\}$.
3. The Inverse Problem when Λ = 0
In this section, we study the inverse problem at high energy in the case $\Lambda = 0$ that corresponds to RN black holes. Let us recall here that all the results and formulae given hereafter are always obtained on a fixed spin-weighted spherical harmonic. Therefore the notations $\mathcal{H}$, $H$, $a(x)$ are a shorthand for $\mathcal{H}_{ln}$, $H^{ln}$, $a_l(x)$ defined in the preceding section. In order to state our main result, we make two assumptions.
Assumption 1. We assume that our observers may measure the high energies of the transmitted operators $T_R$ or $T_L$. Precisely, we assume that one of the following functions of $\lambda \in \mathbb{R}$
$$F_l(\lambda) = \langle T_R\, e^{i\lambda x}\psi, e^{i\lambda x}\phi\rangle, \qquad G_l(\lambda) = \langle T_L\, e^{i\lambda x}\psi, e^{i\lambda x}\phi\rangle,$$
is known for all large values of $\lambda$, for all $l \in \mathbb{N}$ where $l$ indexes the spin-weighted spherical harmonics and for all $\psi, \phi \in \mathcal{H}$ with $\hat\psi, \hat\phi \in C_0^\infty(\mathbb{R}; \mathbb{C}^4)$.
Assumption 2. We also assume that the mass $m$ and the charge $q$ of the Dirac fields considered in these inverse scattering experiments are known and fixed. Moreover we assume that $q \neq 0$ since the case $q = 0$ is similar to the one treated in [4].
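Assumption 1 presumes access to the values of $F_l(\lambda)$ or $G_l(\lambda)$ for large $\lambda$. As a purely illustrative sketch, and not the authors' procedure, the two leading coefficients of a generic expansion $f(\lambda) = c_0 + c_1/\lambda + O(\lambda^{-2})$ could be estimated from two samples at $\lambda$ and $2\lambda$. The function name `two_term_coefficients` and the synthetic test function are hypothetical.

```python
def two_term_coefficients(f, lam):
    """Estimate (c0, c1) in f(lam) = c0 + c1/lam + O(lam**-2).

    Uses the two samples f(lam) and f(2*lam). The combination
    2*f(2*lam) - f(lam) cancels the 1/lam term, and the difference
    f(lam) - f(2*lam) isolates it, with errors O(lam**-2) and O(lam**-1).
    """
    f1 = f(lam)
    f2 = f(2 * lam)
    c0 = 2 * f2 - f1
    c1 = 2 * lam * (f1 - f2)
    return c0, c1


def synthetic(lam):
    """Hypothetical complex-valued data with a known two-term expansion."""
    return (3 + 1j) + (5 - 2j) / lam + 7 / lam ** 2


c0, c1 = two_term_coefficients(synthetic, 1.0e4)
print(c0, c1)
```

The estimate of $c_1$ converges only at rate $O(\lambda^{-1})$, so in practice larger $\lambda$ or more samples would be needed for the subleading term than for the leading one.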
The main result of this section is now summarized in the following theorem.
Theorem 3.1. Under Assumptions 1 and 2, the parameters $M$ and $Q$ of the RN black hole are uniquely determined.
Following our previous paper [4], the proof of Theorem 3.1 will be based on a high-energy asymptotic expansion of the functions $F_l(\lambda)$ and $G_l(\lambda)$ when $\lambda \to +\infty$. Precisely we shall prove the following formulae:
Theorem 3.2 (Reconstruction Formulae). Let $\psi, \phi \in C_0^\infty(\mathbb{R}; \mathbb{C}^4)$. Then for $\lambda$ large, we obtain
$$F_l(\lambda) = \langle \Theta(x) P_-\psi, P_-\phi\rangle + \frac{i}{2\lambda}\langle A(x) P_-\psi, P_-\phi\rangle + O(\lambda^{-2}), \tag{3.1}$$
$$G_l(\lambda) = \langle \Theta(x) P_+\psi, P_+\phi\rangle - \frac{i}{2\lambda}\langle A(x) P_+\psi, P_+\phi\rangle + O(\lambda^{-2}), \tag{3.2}$$
where $\Theta(x)$ and $A(x)$ are multiplication operators given by
$$\Theta(x) = e^{-i\int_{-\infty}^{0}[c(s) - c_0]\,ds + i c_0 x}, \tag{3.3}$$
$$A(x) = \Theta(x)\Big(\int_{-\infty}^{+\infty} a_l^2(s)\,ds + \int_{-\infty}^{0} b^2(s)\,ds + \int_{0}^{+\infty} (b(s) - m)^2\,ds + m^2 x\Big). \tag{3.4}$$
Remark 3.1. In Theorem 3.2, we have emphasized the dependence of the functions $F_l(\lambda)$ and $G_l(\lambda)$ on the parameter $l$ since the reconstruction formulae (3.1) and (3.2) can be derived if we work on a fixed spin-weighted spherical harmonic only. Nevertheless, as indicated in Assumption 1 we shall need to know these formulae on all spin-weighted spherical harmonics, hence for all $l \in \mathbb{N}$, in order to prove the uniqueness result stated in Theorem 3.1.
Remark 3.2. In the reconstruction formulae of Theorem 3.2, the physical contributions are the phase $e^{ic_0 x}$ appearing in (3.3) and the function $\int_{-\infty}^{+\infty} a_l^2(s)\,ds + m^2 x$ appearing in (3.4). The presence of these terms clearly shows that the charge $q$ (through $c_0$) and the mass $m$ of the Dirac fields contribute to the high energy asymptotics of the transmitted operators. On the other hand, the constant terms $\int_{-\infty}^{0}[c(s) - c_0]\,ds$ in (3.3) and $\int_{-\infty}^{0} b^2(s)\,ds + \int_{0}^{+\infty}(b(s) - m)^2\,ds$ in (3.4) may appear unnatural at first sight since they depend explicitly on the particular value 0 of the Regge–Wheeler variable $x$. They are in fact due to our particular choice of Dollard modification in the definition of the modified wave operators $W^\pm_{(+\infty)}$. Recall here indeed that there is no canonical choice for the (necessary) modifications entailed by the presence of long-range potentials at infinity. This point can be easily seen for instance from the Isozaki–Kitada modifications, constructed in the next subsection, whose phases are defined only up to a constant of integration (see (3.26) and Remark 3.4 after it). The above constant terms can thus be understood as constants of integration depending on our particular choice of modification. Finally, we emphasize that these
constants of integration do not play any role in our proof of the uniqueness of the parameters.
Remark 3.3. In this paper, we use the high-energy asymptotics of the quantum wave operators for Dirac fields in order to reconstruct the mass and the charge of the black hole. Another interesting question would be to study the same inverse problem, but from the semiclassical dynamics, or even from the classical ones. To the authors' knowledge, these problems are still open. However, for semiclassical Schrödinger operators with energies localized in an arbitrarily small interval, an inverse scattering problem was studied in [24], for regular potentials at infinity; for the Newton equations at high energies, this problem was treated by Novikov in [25].
We now explain our strategy to prove Theorem 3.2. Using (2.22), (2.23), (2.33) and the fact that $e^{i\lambda x}$ corresponds to a translation by $\lambda$ in momentum space, we first rewrite $F_l(\lambda)$ and $G_l(\lambda)$ as follows
$$F_l(\lambda) = \langle W^-_{(+\infty)}(\lambda)\psi, W^+_{(-\infty)}(\lambda)\phi\rangle, \tag{3.5}$$
$$G_l(\lambda) = \langle W^-_{(-\infty)}(\lambda)\psi, W^+_{(+\infty)}(\lambda)\phi\rangle, \tag{3.6}$$
with
$$W^\pm_{(-\infty)}(\lambda) = e^{-i\lambda x}\, W^\pm_{(-\infty)}\, e^{i\lambda x} = \operatorname*{s-lim}_{t\to\pm\infty} e^{itH(\lambda)} e^{-itH_0(\lambda)} P_\mp,$$
$$W^\pm_{(+\infty)}(\lambda) = e^{-i\lambda x}\, W^\pm_{(+\infty)}\, e^{i\lambda x} = \operatorname*{s-lim}_{t\to\pm\infty} e^{itH(\lambda)} e^{-iX(t,\lambda)} e^{-itH_0^m(\lambda)} P^{m,\lambda}_\pm,$$
where we use the notations
$$H(\lambda) = \Gamma^1(D_x + \lambda) + a(x)\Gamma^2 + b(x)\Gamma^0 + c(x), \qquad H_0(\lambda) = \Gamma^1(D_x + \lambda) + c_0,$$
$$H_0^m(\lambda) = \Gamma^1(D_x + \lambda) + m\Gamma^0, \qquad V_m(\lambda) = (D_x + \lambda)\big(H_0^m(\lambda)\big)^{-1}, \qquad P^{m,\lambda}_\pm = \mathbf{1}_{\mathbb{R}^\pm}(V_m(\lambda)),$$
$$X(t,\lambda) = \int_0^t \Big[(b(sV_m(\lambda)) - m)\,m\,\big(H_0^m(\lambda)\big)^{-1} + c(sV_m(\lambda))\Big]\,ds.$$
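The statement that $e^{i\lambda x}$ acts as a translation by $\lambda$ in momentum space, which underlies the $\lambda$-shifted operators above, can be made explicit by a one-line computation. The following is a sketch in which $f$ denotes an arbitrary bounded Borel function, a symbol introduced here only for illustration.

```latex
\begin{align*}
e^{-i\lambda x}\, D_x \big(e^{i\lambda x}\psi\big)
  &= e^{-i\lambda x}\,(-i)\big(i\lambda\, e^{i\lambda x}\psi
     + e^{i\lambda x}\partial_x\psi\big)
   = (D_x + \lambda)\psi, \\
e^{-i\lambda x}\, f(D_x)\, e^{i\lambda x}
  &= f(D_x + \lambda), \\
e^{-i\lambda x}\, H\, e^{i\lambda x}
  &= \Gamma^1(D_x + \lambda) + a(x)\Gamma^2 + b(x)\Gamma^0 + c(x)
   = H(\lambda).
\end{align*}
```

The last line uses that the multiplication operators $a(x)$, $b(x)$, $c(x)$ and the constant matrices commute with $e^{i\lambda x}$; the same conjugation turns $e^{itH}$ into $e^{itH(\lambda)}$ and $H_0$, $H_0^m$, $V_m$ into their $\lambda$-shifted counterparts.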
In order to obtain an asymptotic expansion of the functions $F_l(\lambda)$ and $G_l(\lambda)$, it is thus enough to obtain an asymptotic expansion of the $\lambda$-shifted wave operators $W^\pm_{(\pm\infty)}(\lambda)$. To do this, we follow the procedure presented in [21, 22], which is inspired by the well-known Isozaki–Kitada method [16] developed in the setting of long-range stationary scattering theory. It consists simply in replacing the wave operators $W^\pm_{(\pm\infty)}(\lambda)$ by "well-chosen" energy modifiers $J^\pm_{(\pm\infty)}(\lambda)$, defined as Fourier Integral Operators (FIO) with explicit phases and amplitudes. Well-chosen here means practically that we look for $J^\pm_{(\pm\infty)}(\lambda)$ satisfying for $\lambda$ large enough
$$W^\pm_{(-\infty)}(\lambda)\psi = \lim_{t\to\pm\infty} e^{itH(\lambda)} J^\pm_{(-\infty)}(\lambda)\, e^{-itH_0(\lambda)} P_\mp\psi, \tag{3.7}$$
$$W^\pm_{(+\infty)}(\lambda)\psi = \lim_{t\to\pm\infty} e^{itH(\lambda)} J^\pm_{(+\infty)}(\lambda)\, e^{-itH_0^m(\lambda)} P^{m,\lambda}_\pm\psi, \tag{3.8}$$
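and
$$\big\|\big(W^\pm_{(\pm\infty)}(\lambda) - J^\pm_{(\pm\infty)}(\lambda)\big)\psi\big\| = O(\lambda^{-2}), \tag{3.9}$$
for any fixed $\psi \in \mathcal{H}$ such that $\hat\psi \in C_0^\infty(\mathbb{R}; \mathbb{C}^4)$. Note that the decay $O(\lambda^{-2})$ in (3.9) could be improved to any inverse power decay but turns out to be enough for our purpose here. In particular if we manage to construct such $J^\pm_{(\pm\infty)}(\lambda)$ satisfying (3.9) then we obtain by (3.5) and (3.6)
$$F_l(\lambda) = \langle J^-_{(+\infty)}(\lambda)\psi, J^+_{(-\infty)}(\lambda)\phi\rangle + O(\lambda^{-2}), \qquad G_l(\lambda) = \langle J^-_{(-\infty)}(\lambda)\psi, J^+_{(+\infty)}(\lambda)\phi\rangle + O(\lambda^{-2}), \tag{3.10}$$
from which we can calculate the first terms of the asymptotics easily. Let us here give a simple but useful result which allows us to simplify slightly the expressions of (3.7) and (3.8).
Lemma 3.1. For all $\xi \in \mathbb{R}^*$, set
$$\nu^\pm(\xi) = \pm\operatorname{sgn}(\xi)\sqrt{\xi^2 + m^2}. \tag{3.11}$$
Then, for all $\psi$ with $\operatorname{supp}\hat\psi \subset \mathbb{R}^*$,
$$e^{-itH_0^m} P^m_\pm\psi = e^{-it\nu^\pm(D_x)} P^m_\pm\psi. \tag{3.12}$$
Moreover,
$$e^{-itH_0} P_\pm = e^{\mp itD_x - itc_0} P_\pm. \tag{3.13}$$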
Proof. The Fourier representation of the operator $H_0^m$ is $\Gamma^1\xi + m\Gamma^0$ and has precisely one positive eigenvalue $\sqrt{\xi^2 + m^2}$ and one negative eigenvalue $-\sqrt{\xi^2 + m^2}$. Similarly, the Fourier representation of the classical velocity operator $V_m$ is $\frac{\xi}{\xi^2 + m^2}(\Gamma^1\xi + m\Gamma^0)$. Hence, for $\xi > 0$, $P^m_+$ is the projection onto the positive spectrum of $\Gamma^1\xi + m\Gamma^0$ and $P^m_-$ is the projection onto the negative spectrum of $\Gamma^1\xi + m\Gamma^0$. For $\xi < 0$, it is the opposite. This implies immediately (3.12). Finally the equality (3.13) is a direct consequence of the definitions of $H_0$ and $P_\pm$.
According to Lemma 3.1, the projections $P_\pm$ and $P^m_\pm$ allow us to "scalarize" the Hamiltonians $H_0$ and $H_0^m$ in the expressions (3.7) and (3.8) of $W^\pm_{(\pm\infty)}(\lambda)$. Precisely these expressions read now
$$W^\pm_{(-\infty)}(\lambda)\psi = \lim_{t\to\pm\infty} e^{itH(\lambda)} J^\pm_{(-\infty)}(\lambda)\, e^{\pm it(D_x + \lambda) - itc_0} P_\mp\psi, \tag{3.14}$$
$$W^\pm_{(+\infty)}(\lambda)\psi = \lim_{t\to\pm\infty} e^{itH(\lambda)} J^\pm_{(+\infty)}(\lambda)\, e^{-it\nu^\pm(D_x + \lambda)} P^{m,\lambda}_\pm\psi. \tag{3.15}$$
This minor simplification will be important in the forthcoming construction of the modifiers $J^\pm_{(\pm\infty)}(\lambda)$.
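To see concretely why (3.13) describes transport along the curves $x \pm t$, one can combine $\Gamma^1 P_\pm = \pm P_\pm$ with the fact that $e^{-itD_x} = e^{-t\partial_x}$ is a translation. The following lines are a short sketch of that computation for a generic function $f$, a symbol used here only for illustration.

```latex
\begin{align*}
\Gamma^1 P_\pm = \pm P_\pm
  \quad &\Longrightarrow \quad
  H_0 P_\pm = (\pm D_x + c_0)\, P_\pm, \\
\big(e^{\mp itD_x} f\big)(x) &= f(x \mp t), \\
\big(e^{-itH_0} P_\pm \psi\big)(x) &= e^{-itc_0}\,(P_\pm\psi)(x \mp t).
\end{align*}
```

Thus, as $t$ increases, the component $P_+\psi$ is carried toward $x = +\infty$ and the component $P_-\psi$ toward the event horizon $x = -\infty$, which is consistent with the appearance of $P_\mp$ in the wave operators at the horizon.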
Before entering into the details, let us give a hint on how to construct the modifiers $J^\pm_{(\pm\infty)}(\lambda)$, a priori defined as FIOs with "scalar" phases $\varphi^\pm_{(\pm\infty)}(x, \xi, \lambda)$ and "matrix-valued" amplitudes $p^\pm_{(\pm\infty)}(x, \xi, \lambda)$, i.e. defined for all $\psi \in \mathcal{H}$ by
$$J^\pm_{(\pm\infty)}(\lambda)\psi = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{i\varphi^\pm_{(\pm\infty)}(x,\xi,\lambda)}\, p^\pm_{(\pm\infty)}(x,\xi,\lambda)\, \hat\psi(\xi)\, d\xi.$$
If we assume for instance that (3.15) is true then we easily get
$$\big(W^\pm_{(+\infty)}(\lambda) - J^\pm_{(+\infty)}(\lambda)\big)\psi = i\int_0^{\pm\infty} e^{itH(\lambda)}\, C^\pm_{(+\infty)}(\lambda)\, e^{-it\nu^\pm(D_x + \lambda)} P^{m,\lambda}_\pm\psi\, dt, \tag{3.16}$$
where
$$C^\pm_{(+\infty)}(\lambda) := H(\lambda) J^\pm_{(+\infty)}(\lambda) - J^\pm_{(+\infty)}(\lambda)\, \nu^\pm(D_x + \lambda), \tag{3.17}$$
are also FIOs with phases $\varphi^\pm_{(+\infty)}(x, \xi, \lambda)$ and amplitudes $c^\pm_{(+\infty)}(x, \xi, \lambda)$. From (3.16), we get the simple estimate
$$\big\|\big(W^\pm_{(+\infty)}(\lambda) - J^\pm_{(+\infty)}(\lambda)\big)\psi\big\| \le \int_0^{\pm\infty} \big\|C^\pm_{(+\infty)}(\lambda)\, e^{-it\nu^\pm(D_x + \lambda)} P^{m,\lambda}_\pm\psi\big\|\, dt. \tag{3.18}$$
In order that (3.9) be true it is then clear from (3.18) that the FIOs $C^\pm_{(+\infty)}(\lambda)$ have to be "small" in some sense. Precisely we shall need that the amplitudes $c^\pm_{(+\infty)}(x, \xi, \lambda)$ be short-range in the variable $x$ at infinity (i.e. when $x \to +\infty$) and of order $O(\lambda^{-2})$ when $\lambda \to +\infty$. Note here the role played by the projections $P^{m,\lambda}_\pm$ which allow us to consider the part of the Dirac fields that propagates toward infinity. This explains why the amplitudes $c^\pm_{(+\infty)}(x, \xi, \lambda)$ must be short-range in the variable $x$ only at infinity. Similarly, for the construction of the modifiers $J^\pm_{(-\infty)}(\lambda)$, we shall require that the amplitudes $c^\pm_{(-\infty)}(x, \xi, \lambda)$ of the corresponding operators $C^\pm_{(-\infty)}(\lambda)$ be short-range in the variable $x$ only at the event horizon (i.e. when $x \to -\infty$) and of order $O(\lambda^{-2})$ when $\lambda \to +\infty$.
3.1. Asymptotics of $W^\pm_{(+\infty)}(\lambda)$
In this subsection, we construct the modifiers $J^\pm_{(+\infty)}(\lambda)$ and give the asymptotics of $W^\pm_{(+\infty)}(\lambda)$ when $\lambda \to +\infty$. For simplicity, we shall omit the lower index $(+\infty)$ in all the objects defined hereafter. We first look at the problem at fixed energy (i.e. we take $\lambda = 0$ in the previous formulae). Hence we aim to construct modifiers $J^\pm$ with scalar phases $\varphi^\pm(x, \xi)$ and matrix-valued amplitudes $p^\pm(x, \xi)$ such that the amplitudes $c^\pm(x, \xi)$ of the operators $C^\pm = HJ^\pm - J^\pm\nu^\pm(D_x)$ be short-range in $x$ when $x \to +\infty$. We adapt here to our case the treatment given by Gâtel and Yafaev in [9] where a similar problem was considered in Minkowski spacetime (see also our recent paper [4]).
The operators $C^\pm$ are clearly FIOs with phases $\varphi^\pm(x, \xi)$ and amplitudes
$$c^\pm(x, \xi) = B^\pm(x, \xi)\, p^\pm(x, \xi) - i\Gamma^1 \partial_x p^\pm(x, \xi), \tag{3.19}$$
where
$$B^\pm(x, \xi) = \Gamma^1 \partial_x\varphi^\pm(x, \xi) + a(x)\Gamma^2 + b(x)\Gamma^0 + c(x) - \nu^\pm(\xi). \tag{3.20}$$
As usual, we look for phases $\varphi^\pm$ close to $x\xi$ and amplitudes $p^\pm$ close to 1. So the term $\partial_x p^\pm$ in (3.19) should be short-range and can be neglected in a first approximation. With $p^\pm = 1$, we are thus led to solve $B^\pm = 0$. However a direct calculation leads then to matrix-valued phases $\varphi^\pm$ whereas we look for scalar ones. We follow [9] and solve in fact $(B^\pm)^2 = 0$. Using crucially the anticommutation properties of the Dirac matrices (2.16), we get the new equation
$$(B^\pm)^2 = (\partial_x\varphi^\pm)^2 + a^2 + b^2 + (c - \nu^\pm)^2 + 2(c - \nu^\pm)(B^\pm - c + \nu^\pm) = 0. \tag{3.21}$$
If we put $B^\pm = 0$ in (3.21), we obtain the scalar equation
$$r^\pm(x, \xi) := (\partial_x\varphi^\pm)^2 + a^2 + b^2 - (c - \nu^\pm)^2 = 0. \tag{3.22}$$
We look for an approximate solution of (3.22) of the form $\varphi^\pm(x, \xi) = x\xi + \phi^\pm(x, \xi)$ where $\phi^\pm(x, \xi)$ should be a priori relatively small in the variable $x$. Recalling that $(\nu^\pm)^2 = \xi^2 + m^2$ by (3.11), we must then solve
$$2\xi\,\partial_x\phi^\pm + (\partial_x\phi^\pm)^2 + a^2 + (b^2 - m^2) - c^2 + 2c\nu^\pm = 0. \tag{3.23}$$
If we neglect $(\partial_x\phi^\pm)^2$ in (3.23), we finally get
$$2\xi\,\partial_x\phi^\pm = -\big[a^2 + (b^2 - m^2) - c^2 + d^\pm\big], \tag{3.24}$$
where we have introduced the notation $d^\pm(x, \xi) = 2c(x)\nu^\pm(\xi)$. Note that by (2.20) and (3.11), the following estimate holds
$$\forall\, \alpha, \beta \in \mathbb{N}, \qquad |\partial_x^\alpha \partial_\xi^\beta d^\pm(x, \xi)| \le C_{\alpha\beta}\, \langle x\rangle^{-1-\alpha} \langle\xi\rangle^{1-\beta}, \qquad \forall\, x \in \mathbb{R}^+,\ \forall\, \xi \in \mathbb{R}^*. \tag{3.25}$$
Therefore, using (2.20) again and the previous estimate (3.25), we see that $a^2 - c^2$ is short-range when $x \to +\infty$ whereas $b^2 - m^2$ and $d^\pm$ are long-range (of Coulomb type) when $x \to +\infty$. Hence we can define two solutions of (3.24) for all $\xi \neq 0$ as follows
$$\phi^\pm(x, \xi) = \frac{1}{2\xi}\int_x^{+\infty} [a^2(s) - c^2(s)]\,ds - \frac{1}{2\xi}\int_0^x \big[(b^2(s) - m^2) + d^\pm(s, \xi)\big]\,ds + \frac{1}{2\xi}\int_0^{+\infty} (b(s) - m)^2\,ds. \tag{3.26}$$
Remark 3.4. Let us emphasize that we only add the quantity $\frac{1}{2\xi}\int_0^{+\infty}(b(s) - m)^2\,ds$ in (3.26) in order to prove that the Isozaki–Kitada and the Dollard modifications
coincide (see Theorem 3.3). In the general case however, the phases $\tilde\phi^\pm(x, \xi)$, solutions of (3.24), would clearly take the form, for all $\xi \neq 0$,
$$\tilde\phi^\pm(x, \xi) = \frac{1}{2\xi}\int_x^{+\infty} [a^2(s) - c^2(s)]\,ds - \frac{1}{2\xi}\int_0^x (b^2(s) - m^2)\,ds - \frac{\nu^\pm(\xi)}{\xi}\int_0^x c(s)\,ds + C(\xi), \tag{3.27}$$
where $C(\xi)$ is a constant of integration.
With the choice (3.26), we obtain for $\xi \neq 0$ (see (3.22)),
$$r^\pm(x, \xi) = (\partial_x\phi^\pm)^2 = \frac{1}{4\xi^2}\big[a^2(x) + (b^2(x) - m^2) - c^2(x) + d^\pm(x, \xi)\big]^2. \tag{3.28}$$
Moreover it is easy to see that the remainders $r^\pm$ satisfy the estimates
$$\forall\, \alpha, \beta \in \mathbb{N}, \qquad |\partial_x^\alpha \partial_\xi^\beta r^\pm(x, \xi)| \le C_{\alpha\beta}\, \langle x\rangle^{-2-\alpha} \langle\xi\rangle^{-\beta}, \qquad \forall\, x \in \mathbb{R}^+,\ \forall\, \xi \in \mathbb{R}^*. \tag{3.29}$$
In our derivation of the phases (3.26), it is important to keep in mind that we did not find an approximate solution of $B^\pm = 0$ but instead of $(B^\pm)^2 = 0$. Therefore we cannot expect to take $p^\pm = 1$ as a first approximation and we have to work a bit more. So we look for $p^\pm$ such that $B^\pm p^\pm$ be as small as possible. According to (3.21) and (3.22), we first note that
$$(B^\pm)^2 = r^\pm + 2(c - \nu^\pm) B^\pm. \tag{3.30}$$
We find now a relation between $B^\pm$ and $(B^\pm)^2$. Using (3.20) and (3.24), we can reexpress $B^\pm$ as
$$B^\pm = B_0^\pm + 2\nu^\pm K^\pm, \tag{3.31}$$
where
$$B_0^\pm = \Gamma^1\xi + m\Gamma^0 - \nu^\pm, \tag{3.32}$$
$$K^\pm = \frac{1}{2\nu^\pm}\Big[-\frac{1}{2\xi}\big(a^2 + (b^2 - m^2) - c^2 + d^\pm\big)\Gamma^1 + a\Gamma^2 + (b - m)\Gamma^0 + c\Big]. \tag{3.33}$$
If we take the square of (3.31) we get
$$(B^\pm)^2 = (B_0^\pm)^2 + 2\nu^\pm B_0^\pm K^\pm + 2\nu^\pm K^\pm B^\pm. \tag{3.34}$$
However, from (3.32) and (3.11) we see that $(B_0^\pm)^2 = -2\nu^\pm B_0^\pm$. Whence (3.34) becomes
$$(B^\pm)^2 = -2\nu^\pm B_0^\pm (1 - K^\pm) + 2\nu^\pm K^\pm B^\pm. \tag{3.35}$$
Now we insert the expression (3.35) of $(B^\pm)^2$ into (3.30) and we obtain
$$r^\pm = -2\nu^\pm B_0^\pm (1 - K^\pm) + 2\nu^\pm\Big(1 + K^\pm - \frac{c}{\nu^\pm}\Big) B^\pm. \tag{3.36}$$
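We would like to isolate $B^\pm$ in (3.36). We thus need to invert the functions $(1 + K^\pm - \frac{c}{\nu^\pm})$. Using (2.19), (2.20) and (3.25), we get the following global asymptotics for $K^\pm$: for all $\alpha, \beta \in \mathbb{N}$,
$$|\partial_x^\alpha \partial_\xi^\beta K^\pm(x, \xi)| \le \begin{cases} C_{\alpha\beta}\, \langle x\rangle^{-1-\alpha} \langle\xi\rangle^{-1-\beta}, & \forall\, x \in \mathbb{R}^+,\ \forall\, \xi \in \mathbb{R}^*, \\ C_{\alpha\beta}\, \langle x\rangle^{-\alpha} \langle\xi\rangle^{-1-\beta}, & \forall\, x \in \mathbb{R}^-,\ \forall\, \xi \in \mathbb{R}^*. \end{cases} \tag{3.37}$$
Let us consider the set $X = \{\xi \in \mathbb{R},\ |\xi| \ge R\}$ where $R \gg 1$ is a constant. It follows immediately from the asymptotics (3.37) and those of $\frac{c(x)}{\nu^\pm(\xi)}$ that $(1 + K^\pm - \frac{c}{\nu^\pm})$ and $(1 - K^\pm)$ are invertible for all $(x, \xi) \in \mathbb{R} \times X$ if the constant $R$ is assumed to be large enough. In consequence, we can write (3.36) as
$$B^\pm (1 - K^\pm)^{-1} = \frac{1}{2\nu^\pm}\Big(1 + K^\pm - \frac{c}{\nu^\pm}\Big)^{-1} r^\pm (1 - K^\pm)^{-1} + \Big(1 + K^\pm - \frac{c}{\nu^\pm}\Big)^{-1} B_0^\pm, \tag{3.38}$$
for all $(x, \xi) \in \mathbb{R} \times X$. The first term in the right-hand side of (3.38) is small thanks to (3.29) but the second one is not. We choose $p^\pm$ in such a way that they cancel this term. To do this, we observe that the Fourier representations of the projections $P^m_\pm$, i.e. the operators
$$P^m_\pm(\xi) = \mathbf{1}_{\mathbb{R}^\pm}\Big(\frac{\xi}{\xi^2 + m^2}(\Gamma^1\xi + m\Gamma^0)\Big) = \frac{1}{2}\Big(I_4 \pm \frac{\operatorname{sgn}(\xi)}{\sqrt{\xi^2 + m^2}}(\Gamma^1\xi + m\Gamma^0)\Big), \qquad \forall\, \xi \neq 0, \tag{3.39}$$
satisfy the following equations
$$B_0^\pm(\xi)\, P^m_\pm(\xi) = 0, \tag{3.40}$$
by Lemma 3.1 and (3.32). According to (3.38), a natural choice for $p^\pm$ is thus
$$p^\pm = (1 - K^\pm)^{-1} P^m_\pm(\xi), \tag{3.41}$$
for which we have
$$q^\pm := B^\pm p^\pm = \frac{1}{2\nu^\pm}\Big(1 + K^\pm - \frac{c}{\nu^\pm}\Big)^{-1} r^\pm (1 - K^\pm)^{-1} P^m_\pm(\xi). \tag{3.42}$$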
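The key cancellation (3.40) rests only on the explicit form (3.39) of the symbols $P^m_\pm(\xi)$. A minimal numerical sketch of this, assuming the entries of $\Gamma^1$ and $\Gamma^0$ as read from (2.18) and using illustrative helper names, is the following.

```python
import math

# Assumption: matrix entries as we read them from (2.18).
I4 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
G1 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
G0 = [[0, 0, -1j, 0], [0, 0, 0, 1j], [1j, 0, 0, 0], [0, -1j, 0, 0]]


def lin(*terms):
    """Linear combination sum(coeff * matrix) over (coeff, matrix) pairs."""
    return [[sum(k * mat[i][j] for k, mat in terms) for j in range(4)]
            for i in range(4)]


def mul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two 4x4 matrices."""
    return all(abs(a[i][j] - b[i][j]) <= tol
               for i in range(4) for j in range(4))


def nu(xi, m, sign):
    """nu^{sign}(xi) from (3.11), with sign = +1 or -1."""
    return sign * math.copysign(1.0, xi) * math.sqrt(xi * xi + m * m)


def projection(xi, m, sign):
    """Symbol P^m_{sign}(xi) from (3.39), for xi != 0."""
    k = sign * math.copysign(1.0, xi) / math.sqrt(xi * xi + m * m)
    return lin((0.5, I4), (0.5 * k * xi, G1), (0.5 * k * m, G0))


def check_projection(xi, m, sign):
    """Check that P is idempotent and that B_0 P = 0, i.e. (3.40)."""
    p = projection(xi, m, sign)
    b0 = lin((xi, G1), (m, G0), (-nu(xi, m, sign), I4))  # (3.32)
    zero = lin((0, I4))
    return close(mul(p, p), p) and close(mul(b0, p), zero)


print(check_projection(2.0, 1.0, +1), check_projection(-2.0, 1.0, -1))
```

The same computation shows that $P^m_+(\xi) + P^m_-(\xi) = I_4$, so the two symbols are complementary projections for every $\xi \neq 0$.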
Let us summarize the situation at this stage. For $\xi \neq 0$, we have defined the phases $\varphi^\pm(x, \xi) = x\xi + \phi^\pm(x, \xi)$ by (3.26) and for $\xi \in X$, the amplitudes $p^\pm$ are given by (3.41). Directly from the definitions and from the asymptotics (2.19) and (2.20) of the potentials $a, b, c$, the following estimates hold.
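Lemma 3.2 (Estimates on the Phases, the Amplitudes and Related Quantities). For all $x \in \mathbb{R}^+$ and $\xi \in X$ with $R$ large enough, we have
$$\forall\, \beta \in \mathbb{N}, \qquad |\partial_\xi^\beta \phi^\pm(x, \xi)| \le C_\beta\, \log\langle x\rangle\, \langle\xi\rangle^{-\beta}. \tag{3.43}$$
$$\forall\, |\alpha| \ge 1,\ \forall\, \beta \in \mathbb{N}, \qquad |\partial_x^\alpha \partial_\xi^\beta \phi^\pm(x, \xi)| \le C_{\alpha\beta}\, \langle x\rangle^{-\alpha} \langle\xi\rangle^{-\beta}. \tag{3.44}$$
$$|\partial^2_{x,\xi}(\varphi^\pm(x, \xi) - x\xi)| \le \frac{C}{R^2}. \tag{3.45}$$
$$\forall\, \alpha, \beta \in \mathbb{N}, \qquad |\partial_x^\alpha \partial_\xi^\beta K^\pm(x, \xi)| \le C_{\alpha\beta}\, \langle x\rangle^{-1-\alpha} \langle\xi\rangle^{-1-\beta}. \tag{3.46}$$
$$\forall\, \alpha, \beta \in \mathbb{N}, \qquad |\partial_x^\alpha \partial_\xi^\beta \big[p^\pm(x, \xi) - P^m_\pm(\xi)\big]| \le C_{\alpha\beta}\, \langle x\rangle^{-1-\alpha} \langle\xi\rangle^{-1-\beta}. \tag{3.47}$$
$$\forall\, \alpha, \beta \in \mathbb{N}, \qquad |\partial_x^\alpha \partial_\xi^\beta r^\pm(x, \xi)| \le C_{\alpha\beta}\, \langle x\rangle^{-2-\alpha} \langle\xi\rangle^{-\beta}. \tag{3.48}$$
$$\forall\, \alpha, \beta \in \mathbb{N}, \qquad |\partial_x^\alpha \partial_\xi^\beta q^\pm(x, \xi)| \le C_{\alpha\beta}\, \langle x\rangle^{-2-\alpha} \langle\xi\rangle^{-1-\beta}. \tag{3.49}$$
$$\forall\, \alpha, \beta \in \mathbb{N}, \qquad |\partial_x^\alpha \partial_\xi^\beta c^\pm(x, \xi)| \le C_{\alpha\beta}\, \langle x\rangle^{-2-\alpha} \langle\xi\rangle^{-1-\beta}. \tag{3.50}$$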
Thanks to (3.43)–(3.45) and (3.47), for $R$ large enough, we can define precisely our modifiers $J^\pm$ as bounded operators on $\mathcal{H}$ (see [27], for instance). Let $\chi^+ \in C^\infty(\mathbb{R})$ be a cutoff function in space variables such that $\chi^+(x) = 0$ if $x \le \frac12$ and $\chi^+(x) = 1$ if $x \ge 1$. Let also $\theta \in C^\infty(\mathbb{R})$ be a cutoff function in energy variables such that $\theta(\xi) = 0$ if $|\xi| \le \frac12$ and $\theta(\xi) = 1$ if $|\xi| \ge 1$. For $R$ large enough, $J^\pm$ are the Fourier Integral Operators with phases $\varphi^\pm(x, \xi)$ and amplitudes
$$P^\pm(x, \xi) = \chi^+(x)\, p^\pm(x, \xi)\, \theta\Big(\frac{\xi}{R}\Big). \tag{3.51}$$
We finish this part by a first application of the previous construction. In the next Theorem, the modifiers $J^\pm_{(+\infty)}$ are shown to be time-independent modifications of Isozaki–Kitada type equivalent to the Dollard modification (2.21). Precisely, we have
Theorem 3.3. For any $\psi \in \mathcal{H}$ such that $\operatorname{supp}\hat\psi \subset X$, we have
$$W^\pm_{(+\infty)}\psi = \lim_{t\to\pm\infty} e^{itH} J^\pm_{(+\infty)}\, e^{-it\nu^\pm(D_x)} P^m_\pm\psi. \tag{3.52}$$
Proof. We only sketch the proof for the case (+). By definition of $P^m_+$, we have
$$U(t) P^m_+\psi = e^{-it\nu^+(D_x)}\, e^{-i\int_0^t \Big[\Big(b\Big(\frac{s|D_x|}{\sqrt{D_x^2 + m^2}}\Big) - m\Big)\frac{m}{\nu^+(D_x)} + c\Big(\frac{s|D_x|}{\sqrt{D_x^2 + m^2}}\Big)\Big]\,ds}\, P^m_+\psi := V(t) P^m_+\psi. \tag{3.53}$$
Then, we write:
$$e^{itH} J^+_{(+\infty)}\, e^{-it\nu^+(D_x)} P^m_+\psi = e^{itH} V(t)\, \big(V^*(t)\, e^{-it\nu^+(D_x)}\big) \times \big(e^{it\nu^+(D_x)} J^+_{(+\infty)}\, e^{-it\nu^+(D_x)}\big) P^m_+\psi \tag{3.54}$$
$$= e^{itH} V(t)\, \Big(e^{i\int_0^t [\cdots]\,ds}\, e^{it\nu^+(D_x)} J^+_{(+\infty)}\, e^{-it\nu^+(D_x)}\Big) P^m_+\psi. \tag{3.55}$$
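The cutoff functions $\chi^+$ and $\theta$ entering the amplitudes (3.51) are only required to be smooth with the stated plateau values; the text does not fix a particular choice. One standard construction, given here as a hedged sketch with hypothetical function names, builds them from the flat function $e^{-1/t}$.

```python
import math


def _flat(t):
    """Smooth function equal to exp(-1/t) for t > 0 and to 0 for t <= 0."""
    return math.exp(-1.0 / t) if t > 0 else 0.0


def smooth_step(t):
    """Smooth monotone step: 0 for t <= 0, 1 for t >= 1."""
    return _flat(t) / (_flat(t) + _flat(1.0 - t))


def chi_plus(x):
    """Space cutoff: 0 for x <= 1/2 and 1 for x >= 1."""
    return smooth_step(2.0 * x - 1.0)


def theta(xi):
    """Energy cutoff: 0 for |xi| <= 1/2 and 1 for |xi| >= 1."""
    return smooth_step(2.0 * abs(xi) - 1.0)


def amplitude_cutoff(x, xi, big_r):
    """Scalar factor chi_plus(x) * theta(xi / R) multiplying p in (3.51)."""
    return chi_plus(x) * theta(xi / big_r)


print(chi_plus(0.75), theta(-2.0), amplitude_cutoff(3.0, 10.0, 5.0))
```

The denominator in `smooth_step` never vanishes because at least one of $t$ and $1 - t$ is at least $\frac12$, and `theta` is smooth at $\xi = 0$ because it vanishes identically near the origin.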
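The classical flow associated with the Hamiltonian $\nu^+(\xi) = \operatorname{sgn}(\xi)\sqrt{\xi^2 + m^2}$ is given by
$$\Phi^t(x, \xi) = \Big(x + t\,\frac{|\xi|}{\sqrt{\xi^2 + m^2}},\ \xi\Big). \tag{3.56}$$
Then, using Egorov's theorem, we see that $e^{it\nu^+(D_x)} J^+_{(+\infty)}\, e^{-it\nu^+(D_x)}$ is a FIO with phase $\varphi^+(t, x, \xi) = x\xi + \phi^+(x + t\eta, \xi)$, and with principal symbol $P^+(x + t\eta, \xi)$ (meaning that the other terms of the symbol are $o(1)$ when $t \to +\infty$), where $\eta = \frac{|\xi|}{\sqrt{\xi^2 + m^2}}$.
Thus, $e^{i\int_0^t [\cdots]\,ds}\, e^{it\nu^+(D_x)} J^+_{(+\infty)}\, e^{-it\nu^+(D_x)}$ is a FIO with the same principal symbol and with phase $\varphi_1^+(t, x, \xi) = x\xi + \phi_1^+(t, x, \xi)$ where
$$\phi_1^+(t, x, \xi) = \frac{1}{2\xi}\int_{x + t\eta}^{+\infty} [a^2(s) - c^2(s)]\,ds - \frac{1}{2\xi}\int_0^{x + t\eta} \big[(b^2(s) - m^2) + 2c(s)\nu^+(\xi)\big]\,ds + \frac{1}{2\xi}\int_0^{+\infty} (b(s) - m)^2\,ds + \int_0^t \Big[(b(s\eta) - m)\frac{m}{\nu^+(\xi)} + c(s\eta)\Big]\,ds. \tag{3.57}$$
Since $\frac{1}{2\xi}\int_{x + t\eta}^{+\infty} [a^2(s) - c^2(s)]\,ds = o(1)$ when $t \to +\infty$, and by making a change of variables in the last integral, we obtain
$$\phi_1^+(t, x, \xi) = -\frac{1}{2\xi}\int_0^{x + t\eta} \big[(b^2(s) - m^2) + 2c(s)\nu^+(\xi)\big]\,ds + \frac{1}{2\xi}\int_0^{+\infty} (b(s) - m)^2\,ds + \frac{1}{2\xi}\int_0^{t\eta} \big[2(b(s) - m)m + 2c(s)\nu^+(\xi)\big]\,ds + o(1). \tag{3.58}$$
Using again that $\frac{1}{2\xi}\int_{t\eta}^{x + t\eta} \big[(b^2(s) - m^2) + 2c(s)\nu^+(\xi)\big]\,ds = o(1)$, we see that
$$\phi_1^+(t, x, \xi) = -\frac{1}{2\xi}\int_0^{t\eta} \big[(b^2(s) - m^2) + 2c(s)\nu^+(\xi)\big]\,ds + \frac{1}{2\xi}\int_0^{+\infty} (b(s) - m)^2\,ds + \frac{1}{2\xi}\int_0^{t\eta} \big[2(b(s) - m)m + 2c(s)\nu^+(\xi)\big]\,ds + o(1). \tag{3.59}$$
Then,
$$\phi_1^+(t, x, \xi) = -\frac{1}{2\xi}\int_0^{t\eta} (b(s) - m)^2\,ds + \frac{1}{2\xi}\int_0^{+\infty} (b(s) - m)^2\,ds + o(1) = o(1). \tag{3.60}$$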
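Two elementary facts drive the passage from (3.57) to (3.60): the change of variables $s \mapsto s\eta$ produces the factor $1/\eta = \nu^+(\xi)/\xi$, and the integrands combine through $-(b^2 - m^2) + 2(b - m)m = -(b - m)^2$. The sketch below checks both numerically and evaluates the leftover phase $\frac{1}{2\xi}\int_{t\eta}^{+\infty}(b(s) - m)^2\,ds$ in closed form for a hypothetical Coulomb-like profile $b(s) = m(1 - \kappa/(1 + s))$, which is an assumption made only for illustration and is not the actual potential of the paper.

```python
import math


def eta(xi, m):
    """Classical speed |xi| / sqrt(xi^2 + m^2) appearing in (3.56)."""
    return abs(xi) / math.sqrt(xi * xi + m * m)


def nu_plus(xi, m):
    """nu^+(xi) = sgn(xi) sqrt(xi^2 + m^2), see (3.11)."""
    return math.copysign(1.0, xi) * math.sqrt(xi * xi + m * m)


def integrand_defect(b, m):
    """-(b^2 - m^2) + 2 (b - m) m + (b - m)^2, which vanishes identically."""
    return -(b * b - m * m) + 2.0 * (b - m) * m + (b - m) ** 2


def residual_phase(t, xi, m, kappa):
    """(1 / (2 xi)) * integral over [t * eta, inf) of (b(s) - m)^2 ds.

    Closed form for the hypothetical profile b(s) = m (1 - kappa / (1 + s)),
    for which (b(s) - m)^2 = (m kappa)^2 / (1 + s)^2.
    """
    lower = t * eta(xi, m)
    return (m * kappa) ** 2 / (2.0 * xi * (1.0 + lower))


print(residual_phase(10.0, 2.0, 1.0, 0.5), residual_phase(1.0e6, 2.0, 1.0, 0.5))
```

For this model profile the leftover phase decays like $1/t$, which illustrates the $o(1)$ in (3.60); the rate for the true potential $b$ depends on its actual asymptotics (2.20).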
Using (3.43), (3.44), (3.47) and the continuity of FIOs, we see that
$$e^{i\int_0^t [\cdots]\,ds}\, e^{it\nu^+(D_x)} J^+_{(+\infty)}\, e^{-it\nu^+(D_x)} P^m_+\psi = P^m_+\psi + o(1), \tag{3.61}$$
and Theorem 3.3 follows from (3.55) and (3.61).
We now construct the modifiers at high energy $J^\pm_{(+\infty)}(\lambda)$ so that they satisfy (3.9) and (3.15). We still omit the lower index $(+\infty)$ in the next notations. Comparing (3.15) and (3.52) suggests constructing $J^\pm(\lambda)$ close to $e^{-i\lambda x} J^\pm e^{i\lambda x}$, which are clearly FIOs with phases $\varphi^\pm(x, \xi, \lambda) = x\xi + \phi^\pm(x, \xi + \lambda)$ and amplitudes $P^\pm(x, \xi + \lambda)$. With $J^\pm(\lambda) = e^{-i\lambda x} J^\pm e^{i\lambda x}$, we see from (3.50) that the amplitudes
$$c^\pm(x, \xi, \lambda) = B^\pm(x, \xi + \lambda)\, P^\pm(x, \xi + \lambda) - i\Gamma^1 \partial_x P^\pm(x, \xi + \lambda),$$
of the operators $C^\pm(\lambda) = H(\lambda) J^\pm(\lambda) - J^\pm(\lambda)\, \nu^\pm(D_x + \lambda)$ would satisfy the estimate
$$c^\pm(x, \xi, \lambda) = O(\langle x\rangle^{-2} \lambda^{-1}), \tag{3.62}$$
for $\xi$ in a compact set. Here and in the following, the notation $f(x, \lambda) = O(\langle x\rangle^{-2} \lambda^{-1})$ means that $f(x, \lambda)$ decays as $\langle x\rangle^{-2}$ when $x \to +\infty$ and as $\lambda^{-1}$ when $\lambda \to +\infty$. We want however the amplitudes $c^\pm(x, \xi, \lambda)$ to be of order $O(\langle x\rangle^{-2} \lambda^{-2})$ and the decay in (3.62) is not sufficient for our purpose. In consequence, we need to refine our construction. Following the procedure given in [4], we look for modifiers $J^\pm(\lambda)$ defined as FIOs with phases $\varphi^\pm(x, \xi, \lambda)$ and with new amplitudes $P^\pm(x, \xi, \lambda)$ that take the form
$$P^\pm(x, \xi, \lambda) = \Big(p^\pm(x, \xi + \lambda) + \frac{1}{\lambda}\, p^\pm(x, \xi + \lambda)\, l^\pm(x) + \frac{1}{\lambda^2}\, P_\mp k^\pm(x)\Big), \tag{3.63}$$
(up to suitable cutoff functions defined later), where $P_\pm$ denote the projections onto the positive and negative spectrum of $\Gamma^1$. Here the correctors $l^\pm$, $k^\pm$ (that can be matrix-valued) will be functions of $x$ only and should satisfy some decay in $x$ (see below). It will be clear in the next calculations why we add such correctors to the amplitudes $p^\pm(x, \xi + \lambda)$. We now choose $l^\pm$ and $k^\pm$ in (3.63) so that the amplitudes
$$c^\pm(x, \xi, \lambda) = B^\pm(x, \xi + \lambda)\Big(p^\pm(x, \xi + \lambda) + \frac{1}{\lambda}\, p^\pm(x, \xi + \lambda)\, l^\pm(x) + \frac{1}{\lambda^2}\, P_\mp k^\pm(x)\Big) - i\Gamma^1\Big(\partial_x p^\pm(x, \xi + \lambda) + \frac{1}{\lambda}\, \partial_x p^\pm(x, \xi + \lambda)\, l^\pm(x) + \frac{1}{\lambda}\, p^\pm(x, \xi + \lambda)\, \partial_x l^\pm(x) + \frac{1}{\lambda^2}\, P_\mp \partial_x k^\pm(x)\Big), \tag{3.64}$$
of the operators $C^\pm(\lambda)$ be of order $O(\langle x\rangle^{-2} \lambda^{-2})$.
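To prove this, we need the asymptotics of the different functions appearing in (3.64). For $x\in\mathbb{R}^+$ and for $\lambda$ large enough, we obtain (after long and tedious calculations)
\[
\nu^\pm(\xi+\lambda) = \pm\Big(\lambda + \xi + \frac{m^2}{2\lambda}\Big) + O(\lambda^{-2}), \tag{3.65}
\]
\[
d^\pm(x,\xi+\lambda) = \pm 2c(x)\Big(\lambda + \xi + \frac{m^2}{2\lambda}\Big) + O(\langle x\rangle^{-1}\lambda^{-2}), \tag{3.66}
\]
\[
K^\pm(x,\xi+\lambda) = \pm\frac{1}{2\lambda}\big[2P_\mp c(x) + a(x)\Gamma^2 + (b(x)-m)\Gamma^0\big] + O(\langle x\rangle^{-1}\lambda^{-2}), \tag{3.67}
\]
\[
P_\pm^m(\xi+\lambda) = P_\pm + O(\lambda^{-1}), \tag{3.68}
\]
\[
p^\pm(x,\xi+\lambda) = P_\pm + O(\lambda^{-1}), \tag{3.69}
\]
\[
\partial_xp^\pm(x,\xi+\lambda) = \pm\frac{1}{2\lambda}P_\mp\big(a'(x)\Gamma^2 + b'(x)\Gamma^0\big) + O(\langle x\rangle^{-2}\lambda^{-2}), \tag{3.70}
\]
\[
B^\pm(x,\xi+\lambda) = \mp 2(\xi+\lambda)P_\mp + 2c(x)P_\mp + a(x)\Gamma^2 + b(x)\Gamma^0 + O(\lambda^{-1}), \tag{3.71}
\]
\[
q^\pm(x,\xi+\lambda) = B^\pm(x,\xi+\lambda)p^\pm(x,\xi+\lambda) = \pm\frac{1}{2\lambda}c^2(x)P_\pm + O(\langle x\rangle^{-2}\lambda^{-2}). \tag{3.72}
\]
We mention that the following simple equalities have been used several times to prove the preceding asymptotics:
\[
1+\Gamma^1 = 2\begin{pmatrix} I_2 & 0\\ 0 & 0\end{pmatrix} = 2P_+,\qquad
1-\Gamma^1 = 2\begin{pmatrix} 0 & 0\\ 0 & I_2\end{pmatrix} = 2P_-. \tag{3.73}
\]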
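Two of the ingredients above are easy to check numerically: the projection identities (3.73) and the expansion (3.65). The sketch below is illustrative only. The diagonal form of $\Gamma^1$ is read off from (3.73), but the matrices `G2` and `G0` are a hypothetical choice of Hermitian matrices satisfying the usual anticommutation relations, not necessarily the representation fixed in (2.18); the helper names `nu` and `nu_expansion` are likewise introduced here for illustration.

```python
import numpy as np

# 2x2 building blocks
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
s1 = np.array([[0, 1], [1, 0]], dtype=complex)      # Pauli sigma_1
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)   # Pauli sigma_2

# Gamma^1 is diagonal, as implied by (3.73).
# ASSUMPTION: G2 and G0 are one possible choice anticommuting with G1;
# the paper's own representation (2.18) is not reproduced here.
G1 = np.block([[I2, Z2], [Z2, -I2]])
G2 = np.block([[Z2, s1], [s1, Z2]])
G0 = np.block([[Z2, s2], [s2, Z2]])

I4 = np.eye(4, dtype=complex)
P_plus = (I4 + G1) / 2    # projection onto the positive spectrum of Gamma^1
P_minus = (I4 - G1) / 2   # projection onto the negative spectrum of Gamma^1


def anticommutator(A, B):
    """Return AB + BA."""
    return A @ B + B @ A


def nu(xi, m, sign=+1):
    """nu^{+/-}(xi) = +/- sgn(xi) * sqrt(xi^2 + m^2)."""
    return sign * np.sign(xi) * np.sqrt(xi**2 + m**2)


def nu_expansion(xi, lam, m, sign=+1):
    """Leading terms of nu^{+/-}(xi + lam) for large lam, cf. (3.65)."""
    return sign * (lam + xi + m**2 / (2 * lam))
```

With these definitions, `P_plus` is the block matrix `diag(I2, 0)`, and the difference `nu(xi + lam, m) - nu_expansion(xi, lam, m)` shrinks like $\lambda^{-2}$ for fixed `xi` and `m`.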
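By (3.69)–(3.72), the amplitudes $c^\pm(x,\xi,\lambda)$ take the form
\[
c^\pm(x,\xi,\lambda) = \pm\frac{1}{2\lambda}c^2P_\pm \pm \frac{1}{2\lambda^2}c^2P_\pm l^\pm + \frac{1}{\lambda^2}\Big[\mp 2(\xi+\lambda)P_\mp + 2cP_\mp + a\Gamma^2 + b\Gamma^0 + O\Big(\frac{1}{\lambda}\Big)\Big]P_\mp k^\pm - i\Gamma^1\Big[\pm\frac{1}{2\lambda}P_\mp(a'\Gamma^2+b'\Gamma^0) \pm \frac{1}{2\lambda^2}P_\mp(a'\Gamma^2+b'\Gamma^0)l^\pm + \frac{1}{\lambda}\Big(P_\pm + O\Big(\frac{1}{\lambda}\Big)\Big)\partial_xl^\pm + \frac{1}{\lambda^2}P_\mp\partial_xk^\pm\Big] + O\Big(\frac{1}{\langle x\rangle^2\lambda^2}\Big).
\]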
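From the asymptotics (2.20) of the potentials $a$, $b$, $c$, we rewrite this last expression as
\[
c^\pm(x,\xi,\lambda) = \pm\frac{1}{2\lambda}c^2P_\pm \mp \frac{2}{\lambda}P_\mp k^\pm \mp \frac{i}{2\lambda}\Gamma^1P_\mp(a'\Gamma^2+b'\Gamma^0) - \frac{i}{\lambda}\Gamma^1P_\pm\partial_xl^\pm + R(x,\lambda), \tag{3.74}
\]
where the remainder $R(x,\lambda)$ satisfies
\[
R(x,\lambda) = O\Big(\frac{1+|l^\pm(x)|}{\langle x\rangle^2\lambda^2} + \frac{|\partial_xl^\pm(x)|}{\lambda^2} + \frac{|k^\pm(x)|}{\lambda^2} + \frac{|k^\pm(x)|}{\langle x\rangle\lambda^2} + \frac{|\partial_xk^\pm(x)|}{\lambda^2}\Big). \tag{3.75}
\]
Now we choose the correctors $l^\pm$, $k^\pm$ in such a way that the terms of order $O(\lambda^{-1})$ in (3.74) cancel. Once this is done, we shall have to check that the remainder (3.75) is of order $O(\langle x\rangle^{-2}\lambda^{-2})$. There are clearly two different types of terms in the expression (3.74). On the one hand, the terms
\[
\pm\frac{1}{2\lambda}c^2P_\pm - \frac{i}{\lambda}\Gamma^1P_\pm\partial_xl^\pm = \frac{1}{\lambda}P_\pm\Big[\pm\frac{1}{2}c^2 \mp i\partial_xl^\pm\Big]
\]
"live" in $\mathcal{H}_\pm = P_\pm(\mathcal{H})$. On the other hand, the terms
\[
\mp\frac{2}{\lambda}P_\mp k^\pm \mp \frac{i}{2\lambda}\Gamma^1P_\mp(a'\Gamma^2+b'\Gamma^0) = \frac{1}{\lambda}P_\mp\Big[\mp 2k^\pm + \frac{i}{2}(a'\Gamma^2+b'\Gamma^0)\Big]
\]
"live" in $\mathcal{H}_\mp = P_\mp(\mathcal{H})$. Since $\mathcal{H}$ is the direct sum of $\mathcal{H}_-$ and $\mathcal{H}_+$, i.e. $\mathcal{H} = \mathcal{H}_-\oplus\mathcal{H}_+$, we can consider separately the equations
\[
\pm\frac{1}{2}c^2 \mp i\partial_xl^\pm = 0, \tag{3.76}
\]
\[
\mp 2k^\pm + \frac{i}{2}(a'\Gamma^2+b'\Gamma^0) = 0, \tag{3.77}
\]
in order to cancel the terms of order $O(\lambda^{-1})$ in (3.74). We first solve (3.76) and obtain
\[
l^\pm(x) = l(x) = \frac{i}{2}\int_x^{+\infty}c^2(s)\,ds. \tag{3.78}
\]
Then we solve (3.77) and get
\[
k^\pm(x) = \pm\frac{i}{4}\big(a'(x)\Gamma^2 + b'(x)\Gamma^0\big). \tag{3.79}
\]
The functions $l$ and $k^\pm$ clearly satisfy, when $x\to+\infty$,
\[
l(x) = O(\langle x\rangle^{-1}),\qquad \partial_xl(x) = O(\langle x\rangle^{-2}),\qquad k^\pm(x) = O(\langle x\rangle^{-2}). \tag{3.80}
\]
Finally, with this choice of correcting terms $l$ and $k^\pm$, we conclude from (3.74) and (3.75) that $c^\pm(x,\xi,\lambda) = R(x,\lambda) = O(\langle x\rangle^{-2}\lambda^{-2})$.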
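As a quick sanity check, one can substitute (3.78) and (3.79) back into (3.76) and (3.77). The computation below is a sketch that uses only $\Gamma^1P_\pm = \pm P_\pm$ and the formulas above; the decay statement at the end assumes $c(x) = O(\langle x\rangle^{-1})$ and $a'(x), b'(x) = O(\langle x\rangle^{-2})$ as $x\to+\infty$, which is what (3.80) requires of the asymptotics (2.20).

```latex
% (3.78) solves (3.76):
\partial_x l(x) = \partial_x\Big(\frac{i}{2}\int_x^{+\infty}c^2(s)\,ds\Big) = -\frac{i}{2}c^2(x)
\ \Longrightarrow\
\pm\frac{1}{2}c^2 \mp i\,\partial_x l
  = \pm\frac{1}{2}c^2 \mp i\Big(-\frac{i}{2}c^2\Big)
  = \pm\frac{1}{2}c^2 \mp \frac{1}{2}c^2 = 0 .

% (3.79) solves (3.77):
\mp 2k^\pm + \frac{i}{2}(a'\Gamma^2 + b'\Gamma^0)
  = \mp 2\Big(\pm\frac{i}{4}\Big)(a'\Gamma^2 + b'\Gamma^0) + \frac{i}{2}(a'\Gamma^2 + b'\Gamma^0) = 0 .

% Decay (3.80), under the stated assumptions on c, a', b':
|l(x)| \le \frac{1}{2}\int_x^{+\infty}|c(s)|^2ds = O(\langle x\rangle^{-1}),\quad
|\partial_x l(x)| = \frac{1}{2}|c(x)|^2 = O(\langle x\rangle^{-2}),\quad
|k^\pm(x)| \le \frac{1}{4}\big(|a'(x)| + |b'(x)|\big) = O(\langle x\rangle^{-2}).
```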
In fact, we can prove that for all $x\in\mathbb{R}^+$, $\xi$ in a compact set and $\lambda$ large enough,
\[
\forall\,\alpha,\beta\in\mathbb{N},\qquad |\partial_x^\alpha\partial_\xi^\beta c^\pm(x,\xi,\lambda)| \le C_{\alpha\beta}\langle x\rangle^{-2-\alpha}\lambda^{-2}. \tag{3.81}
\]
Let us summarize the previous results. The modifiers $J^\pm(\lambda)$ are (formally) constructed as FIOs with phases $\varphi^\pm(x,\xi,\lambda) = x\xi + \phi^\pm(x,\xi+\lambda)$, where
\[
\phi^\pm(x,\xi+\lambda) = \frac{1}{2(\xi+\lambda)}\Big[\int_x^{+\infty}[a^2(s)-c^2(s)]\,ds - \int_0^x[(b^2(s)-m^2) + d^\pm(s,\xi+\lambda)]\,ds + \int_0^{+\infty}(b(s)-m)^2\,ds\Big], \tag{3.82}
\]
and amplitudes
\[
P^\pm(x,\xi,\lambda) = p^\pm(x,\xi+\lambda) + \frac{1}{\lambda}p^\pm(x,\xi+\lambda)l(x) + \frac{1}{\lambda^2}P_\mp k^\pm(x), \tag{3.83}
\]
where $l$ and $k^\pm$ are given by (3.78) and (3.79) respectively. Unfortunately, since $\phi^\pm(x,\xi+\lambda) = O(\langle x\rangle)$ when $x\to-\infty$, this phase does not belong to a good class of oscillating symbols. So, we have to introduce some technical cutoff functions in the amplitude in order to localize $x$ far away from $-\infty$. Moreover, these cutoff functions must be negligible in the asymptotics in the previous calculus. We follow the strategy exposed in [22], which we briefly recall here.

We consider a fixed test function $\psi\in C_0^\infty(\mathbb{R})$ and we want to calculate the asymptotics of $W^\pm_{(+\infty)}(\lambda)\psi$. Since $\hat\psi\notin C_0^\infty(\mathbb{R})$, at high energies, translation of wave packets does not dominate over spreading. So we introduce a cutoff function (depending on $\lambda$) in order to control the spreading. Let $\chi_0\in C_0^\infty(\mathbb{R})$ be a cutoff function such that $\chi_0(\xi) = 1$ if $|\xi|\le 1$, $\chi_0(\xi) = 0$ if $|\xi|\ge 2$. Using the Fourier representation, we easily have:
\[
\forall\,\varepsilon>0,\ \forall\,N\ge 1,\qquad \Big\|\Big(\chi_0\Big(\frac{D_x}{\lambda^\varepsilon}\Big) - 1\Big)\psi\Big\|_{L^2(\mathbb{R})} = O(\lambda^{-N}). \tag{3.84}
\]
Now, let us define the classical propagation zone:
\[
\Omega = \{x+t;\ x\in\operatorname{supp}\psi,\ t\in\mathbb{R}^+\}, \tag{3.85}
\]
and let $\eta^+\in C^\infty(\mathbb{R})$ be a cutoff function such that $\eta^+ = 1$ in a neighborhood of $\Omega$ and $\eta^+ = 0$ in a neighborhood of $-\infty$. We consider
\[
K^\pm(\lambda) = (\eta^+ - 1)\,e^{-it\nu^\pm(D_x+\lambda)}P_\pm^{m,\lambda}\chi_0\Big(\frac{D_x}{\lambda^\varepsilon}\Big)\psi. \tag{3.86}
\]
Lemma 3.3. For $\lambda\gg 1$, $\varepsilon\in\,]0,1[$, $t\in\mathbb{R}^\pm$, and $N\ge 1$, we have:
\[
\|K^\pm(\lambda)\|_{L^2(\mathbb{R})} = O(\langle t\rangle^{-N}\lambda^{-N}). \tag{3.87}
\]
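Proof. We only sketch the proof for the case (+). Using the Fourier transform and (3.39), we easily see that
\[
K^+(\lambda) = \frac{(\eta^+(x)-1)\lambda^\varepsilon}{4\pi}\iint e^{i\varphi(\xi)}\Big(I_4 + \frac{\Gamma^1(\lambda^\varepsilon\xi+\lambda) + m\Gamma^0}{\sqrt{(\lambda^\varepsilon\xi+\lambda)^2+m^2}}\Big)\chi_0(\xi)\,d\xi\ \psi(y)\,dy, \tag{3.88}
\]
where $\varphi(\xi) = \lambda^\varepsilon(x-y)\xi - t\sqrt{(\lambda^\varepsilon\xi+\lambda)^2+m^2}$. So,
\[
\partial_\xi\varphi(\xi) = \lambda^\varepsilon\Big(x - y - t\,\frac{1+\lambda^{\varepsilon-1}\xi}{\sqrt{(1+\lambda^{\varepsilon-1}\xi)^2 + m^2\lambda^{-2}}}\Big). \tag{3.89}
\]
Since $\xi$ is in a compact set, $\varepsilon<1$ and $y\in\operatorname{supp}\psi$, we easily obtain for $x\in\operatorname{supp}(\eta^+-1)$ and $\lambda\gg 1$,
\[
|\partial_\xi\varphi(\xi)| \ge c\lambda^\varepsilon(1+t), \tag{3.90}
\]
for a suitable constant $c>0$. We conclude by a standard non-stationary phase argument.

Now, we can define precisely our modifiers $J^\pm(\lambda)$ in order to calculate the asymptotics of $W^\pm_{(+\infty)}(\lambda)\psi$. According to (3.84), it suffices to calculate the asymptotics of $W^\pm_{(+\infty)}(\lambda)\chi_0(\frac{D_x}{\lambda^\varepsilon})\psi$. We first remark that for $\lambda\gg 1$ and $\varepsilon<1$, we have $\xi+\lambda\in X$ if $\frac{\xi}{\lambda^\varepsilon}\in\operatorname{supp}\chi_0$. So, we can define the modifiers $J^\pm(\lambda)$ as FIOs with phases $\varphi^\pm(x,\xi,\lambda) = x\xi + \phi^\pm(x,\xi+\lambda)$, where $\phi^\pm(x,\xi+\lambda)$ are given by (3.82), and with amplitudes
\[
P^\pm(x,\xi,\lambda) = \eta^+(x)\Big[p^\pm(x,\xi+\lambda) + \frac{1}{\lambda}p^\pm(x,\xi+\lambda)l(x) + \frac{1}{\lambda^2}P_\mp k^\pm(x)\Big]\chi_0\Big(\frac{\xi}{\lambda^\varepsilon}\Big), \tag{3.91}
\]
where $l$ and $k^\pm$ are given by (3.78) and (3.79), respectively. With this definition, we can mimic the proof of Theorem 3.3 to get

Lemma 3.4. For $\psi\in C_0^\infty(\mathbb{R})$ and for $\lambda$ large, we have
\[
W^\pm_{(+\infty)}(\lambda)\chi_0\Big(\frac{D_x}{\lambda^\varepsilon}\Big)\psi = \lim_{t\to\pm\infty}e^{itH(\lambda)}J^\pm_{(+\infty)}(\lambda)e^{-it\nu^\pm(D_x+\lambda)}P_\pm^{m,\lambda}\psi. \tag{3.92}
\]
Moreover, it is easy to see that the estimates (3.81) are still satisfied, so we can prove our main estimate (3.9). Precisely we get

Lemma 3.5. For $\psi\in C_0^\infty(\mathbb{R})$ and when $\lambda$ tends to infinity, the following estimate holds:
\[
\|(W^\pm_{(+\infty)}(\lambda) - J^\pm_{(+\infty)}(\lambda))\psi\| = O(\lambda^{-2}).
\]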
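Proof. Everything done in [4], Lemma 3.3 works here in the same way. All the contributions coming from the cutoff function $\eta^+$ are negligible, by the same arguments as in Lemma 3.3, since the support of the derivatives of $\eta^+$ is far away from $\Omega$.

We end this section by giving the asymptotics of $W^\pm_{(+\infty)}(\lambda)$ when $\lambda$ is large. According to Lemma 3.5, we have for any $\psi\in C_0^\infty(\mathbb{R};\mathbb{C}^4)$, $W^\pm_{(+\infty)}(\lambda)\psi = J^\pm_{(+\infty)}(\lambda)\psi + O(\lambda^{-2})$. Thus we only need to compute the asymptotics of the modifiers $J^\pm_{(+\infty)}(\lambda)$, which we shall consider as pseudodifferential operators with symbols
\[
j^\pm(x,\xi,\lambda) = e^{i\phi^\pm(x,\xi+\lambda)}P^\pm(x,\xi,\lambda).
\]
Using the explicit expressions (3.82) and (3.91), we first get the asymptotics
\[
\phi^\pm(x,\xi+\lambda) = \mp\int_0^xc(s)\,ds + \frac{1}{2\lambda}\Big[\int_x^{+\infty}(a^2-c^2)(s)\,ds - \int_0^x(b^2(s)-m^2)\,ds + \int_0^{+\infty}(b(s)-m)^2\,ds\Big] + O\Big(\frac{\log\langle x\rangle}{\lambda^2}\Big), \tag{3.93}
\]
\[
P^\pm(x,\xi,\lambda) = \eta^+(x)\Big[P_\pm \pm \frac{1}{2\lambda}P_\mp(a\Gamma^2+b\Gamma^0) + \frac{l(x)}{\lambda}P_\pm + O\Big(\frac{1}{\lambda^2}\Big)\Big]. \tag{3.94}
\]
Moreover, using a Taylor expansion of $e^t$ at $t=0$, we get from (3.93)
\[
e^{i\phi^\pm(x,\xi+\lambda)} = e^{\mp iC^+(x)}\Big(1 + \frac{i}{2\lambda}\tilde C^+(x) + O\Big(\frac{\log\langle x\rangle}{\lambda^2}\Big)\Big), \tag{3.95}
\]
with
\[
C^+(x) = \int_0^xc(s)\,ds,\qquad \tilde C^+(x) = \int_x^{+\infty}(a^2-c^2)(s)\,ds - \int_0^x(b^2(s)-m^2)\,ds + \int_0^{+\infty}(b(s)-m)^2\,ds. \tag{3.96}
\]
Combining now (3.94) and (3.95), we obtain
\[
j^\pm(x,\xi,\lambda) = e^{\mp iC^+(x)}\eta^+(x)\Big[P_\pm + \frac{i}{2\lambda}\tilde C^+(x)P_\pm \pm \frac{1}{2\lambda}P_\mp(a\Gamma^2+b\Gamma^0) + \frac{l(x)}{\lambda}P_\pm\Big] + O\Big(\frac{1}{\lambda^2}\Big). \tag{3.97}
\]
However, notice from (3.78) that
\[
\frac{i}{2\lambda}\tilde C^+(x) + \frac{l(x)}{\lambda} = \frac{i}{2\lambda}\Big[\int_x^{+\infty}a^2(s)\,ds - \int_0^x(b^2(s)-m^2)\,ds + \int_0^{+\infty}(b(s)-m)^2\,ds\Big],
\]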
and from the anticommutation properties (2.16) of the Dirac matrices that $P_\mp(a\Gamma^2+b\Gamma^0) = (a\Gamma^2+b\Gamma^0)P_\pm$. Hence (3.97) becomes
\[
j^\pm(x,\xi,\lambda) = e^{\mp iC^+(x)}\eta^+(x)\Big[1 + \frac{i}{2\lambda}\Big(\int_x^{+\infty}a^2(s)\,ds - \int_0^x(b^2(s)-m^2)\,ds + \int_0^{+\infty}(b(s)-m)^2\,ds\Big) \pm \frac{1}{2\lambda}(a\Gamma^2+b\Gamma^0)\Big]P_\pm + O\Big(\frac{1}{\lambda^2}\Big). \tag{3.98}
\]
Eventually, if we introduce the notation
\[
R^\pm(x) = \frac{i}{2}\Big[\int_x^{+\infty}a^2(s)\,ds - \int_0^x(b^2(s)-m^2)\,ds + \int_0^{+\infty}(b(s)-m)^2\,ds\Big] \pm \frac{1}{2}(a\Gamma^2+b\Gamma^0), \tag{3.99}
\]
we deduce from (3.98) and the fact that $\eta^+(x) = 1$ on $\operatorname{supp}\psi$ the following proposition.

Proposition 3.1. For any $\psi\in C_0^\infty(\mathbb{R};\mathbb{C}^4)$,
\[
W^\pm_{(+\infty)}(\lambda)\psi = e^{\mp iC^+(x)}\Big(1 + \frac{1}{\lambda}R^\pm(x)\Big)P_\pm\psi + O\Big(\frac{1}{\lambda^2}\Big), \tag{3.100}
\]
where $C^+(x)$ and $R^\pm(x)$ are given by (3.96) and (3.99), respectively.

3.2. Asymptotics of $W^\pm_{(-\infty)}(\lambda)$

In this subsection, we focus on what happens at the event horizon and give the asymptotics of $W^\pm_{(-\infty)}(\lambda)$ when $\lambda\to+\infty$. In fact, we shall derive them from the results obtained in the preceding Sec. 3.1 after some simplifications of our model. As usual, we shall omit the lower index $(-\infty)$ in the objects defined or used hereafter. Recall that the expressions of the wave operators at the event horizon are given by (see (2.22))
\[
W^\pm = \operatorname*{s-lim}_{t\to\pm\infty}e^{itH}e^{-itH_0}P_\mp,
\]
where $H_0 = \Gamma^1D_x + c_0$, $H = \Gamma^1D_x + a\Gamma^2 + b\Gamma^0 + c$ and the potentials $a$, $b$, $c-c_0$ satisfy (2.19) when $x\to-\infty$. We first simplify this expression in a convenient way. Let us introduce the unitary transform $U$ on $\mathcal{H}$
\[
U = e^{-i\Gamma^1C^-(x)},\qquad C^-(x) = \int_{-\infty}^x[c(s)-c_0]\,ds + c_0x, \tag{3.101}
\]
and define the selfadjoint operators on $\mathcal{H}$
\[
A_0 = \Gamma^1D_x,\qquad A = U^*HU. \tag{3.102}
\]
Using (3.101), a short calculation shows that the operator $A$ can be rewritten as
\[
A = \Gamma^1D_x + W(x), \tag{3.103}
\]
where
\[
W(x) = e^{i\Gamma^1C^-(x)}\big(a(x)\Gamma^2 + b(x)\Gamma^0\big)e^{-i\Gamma^1C^-(x)}. \tag{3.104}
\]
Note that according to the anticommutation properties (2.16) of the Dirac matrices, the potential $W$ satisfies $W\Gamma^1 + \Gamma^1W = 0$ and $W^2(x) = a^2(x) + b^2(x)$. Moreover from (2.19), we get the following estimate for $W$:
\[
\exists\,\alpha>0,\qquad W(x) = O(e^{\alpha x}),\qquad x\to-\infty. \tag{3.105}
\]
Using the unitarity of $U$ and (3.102) we rewrite $W^\pm$ as
\[
W^\pm = U\operatorname*{s-lim}_{t\to\pm\infty}e^{itA}U^*e^{-itH_0}P_\mp = U\operatorname*{s-lim}_{t\to\pm\infty}e^{itA}e^{-itA_0}\,e^{itA_0}U^*e^{-itH_0}P_\mp. \tag{3.106}
\]
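The algebraic properties of $W$ stated before (3.105) can be checked numerically at a single point $x$. The sketch below is illustrative only: $\Gamma^1$ is taken diagonal as in (3.73), while `G2` and `G0` are a hypothetical choice of Hermitian matrices anticommuting with `G1` and with each other (not necessarily the representation (2.18)); the real numbers `a`, `b`, `C` stand for the values $a(x)$, $b(x)$, $C^-(x)$ at that point.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)

# Gamma^1 diagonal as in (3.73); G2, G0 are an ASSUMED representation.
G1 = np.block([[I2, Z2], [Z2, -I2]])
G2 = np.block([[Z2, s1], [s1, Z2]])
G0 = np.block([[Z2, s2], [s2, Z2]])
I4 = np.eye(4, dtype=complex)
P_minus = (I4 - G1) / 2


def exp_i_theta_G1(theta):
    """exp(i*theta*Gamma^1) = cos(theta) I + i sin(theta) Gamma^1, since (Gamma^1)^2 = I."""
    return np.cos(theta) * I4 + 1j * np.sin(theta) * G1


def W_matrix(a, b, C):
    """W = exp(i Gamma^1 C) (a Gamma^2 + b Gamma^0) exp(-i Gamma^1 C), cf. (3.104)."""
    return exp_i_theta_G1(C) @ (a * G2 + b * G0) @ exp_i_theta_G1(-C)
```

For any real `a`, `b`, `C`, the matrix `W_matrix(a, b, C)` is Hermitian, anticommutes with `G1`, squares to `(a**2 + b**2)` times the identity, and satisfies $WP_- = e^{2iC}(a\Gamma^2+b\Gamma^0)P_-$, the identity used later in the proof of Theorem 3.2.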
Now we can simplify the strong limit appearing in (3.106) in two steps. First we claim that
\[
\operatorname*{s-lim}_{t\to\pm\infty}e^{itA_0}U^*e^{-itH_0}P_\mp = e^{i\Gamma^1c_0x}P_\mp. \tag{3.107}
\]
Indeed, using the particular diagonal form of $\Gamma^1$ given in (2.18) and since $e^{-itH_0} = e^{-itA_0}e^{-itc_0}$, we have
\[
e^{itA_0}U^*e^{-itH_0}P_\mp = e^{itA_0}e^{i\Gamma^1C^-(x)}e^{-itA_0}e^{-itc_0}P_\mp = e^{i\Gamma^1C^-(x\mp t)}e^{-itc_0}P_\mp. \tag{3.108}
\]
When $t\to+\infty$, the right-hand side of (3.108) can be written using (3.101) as
\[
e^{-iC^-(x-t)}e^{-itc_0}P_- = e^{-i\left(\int_{-\infty}^{x-t}(c(s)-c_0)\,ds + c_0x\right)}P_-,
\]
from which (3.107) follows when $t\to+\infty$. The case $t\to-\infty$ is obtained similarly. Second, since the potential $W$ decays exponentially when $x\to-\infty$ by (3.105), it follows from the methods used in [3, 18] that the wave operators
\[
W^\pm(A,A_0) = \operatorname*{s-lim}_{t\to\pm\infty}e^{itA}e^{-itA_0}P_\mp \tag{3.109}
\]
exist on $\mathcal{H}$. Hence by (3.106), (3.107), (3.109) and the chain rule, we obtain the following nice expressions for $W^\pm$:
\[
W^\pm = U\,W^\pm(A,A_0)\,e^{i\Gamma^1c_0x}P_\mp. \tag{3.110}
\]
At last, since $U$ and $e^{i\Gamma^1c_0x}$ commute with $e^{i\lambda x}$, it is clear from (3.110) that it is enough to know the asymptotics of $W^\pm(A,A_0,\lambda) = e^{-i\lambda x}W^\pm(A,A_0)e^{i\lambda x}$ when $\lambda\to+\infty$ in order to get the asymptotics of $W^\pm(\lambda)$. Note here that the $\lambda$-shifted wave operator $W^\pm(A,A_0,\lambda)$ is exactly the kind of wave operator studied in our previous paper [4] in which the asymptotics of
$W^\pm(A,A_0,\lambda)$ were calculated. Nevertheless, we can also easily derive these asymptotics from the results of the preceding section. For completeness this is what we choose to do here.

We thus follow our usual strategy and construct modifiers $J_0^\pm(\lambda)$ corresponding to $W^\pm(A,A_0,\lambda)$. This problem is in fact similar to the one in Sec. 3.1. It suffices to replace $H_0^m$ by $A_0$ and $H$ by $A$ in our calculations. From the explicit form (3.102) and (3.103) of the operators $A_0$ and $A$, we deduce that we can use the results obtained in Sec. 3.1 with the following changes:
(1) Since the mass $m$ does not appear in $A_0$, we take $m = 0$.
(2) The long-range matrix-valued potential $b$ and scalar potential $c$ do not appear in $A$ (see (3.103) and (3.105)), hence we put $b(x) = c(x) = 0$.
(3) The short-range matrix-valued potential $a(x)\Gamma^2$ is replaced by $W(x)$.
(4) The projections $P_\pm^m$ are replaced by $P_\mp$ since we work at the event horizon.
Noting that these changes also entail that $\nu^\pm(\xi) = \mp\xi$ and $d^\pm(x,\xi) = 0$, we obtain the following results. At fixed energy $\lambda = 0$, the modifiers $J_0^\pm$ are defined as FIOs with phases
\[
\varphi^\pm(x,\xi) = x\xi + \frac{1}{2\xi}\int_x^{-\infty}W^2(s)\,ds,
\]
and amplitudes$^{d}$
\[
p^\pm(x,\xi) = (1-K^\pm(x,\xi))^{-1}P_\mp,\qquad K^\pm(x,\xi) = \mp\frac{1}{2\xi}\Big(-\frac{W^2(x)}{2\xi}\Gamma^1 + W(x)\Big). \tag{3.111}
\]
At high energy, the modifiers $J_0^\pm(\lambda)$ are defined as FIOs with phases
\[
\varphi^\pm(x,\xi,\lambda) = x\xi + \frac{1}{2(\xi+\lambda)}\int_x^{-\infty}W^2(s)\,ds, \tag{3.112}
\]
and amplitudes
\[
P^\pm(x,\xi,\lambda) = p^\pm(x,\xi+\lambda) + \frac{1}{\lambda^2}P_\pm k^\pm(x), \tag{3.113}
\]
where $k^\pm(x) = \mp\frac{i}{4}W'(x)$. Using these definitions and (3.105), we can prove that the symbols $c^\pm(x,\xi,\lambda)$ of the operators $C^\pm(\lambda) = A(\lambda)J_0^\pm(\lambda) - J_0^\pm(\lambda)A_0(\lambda)$ satisfy the estimates
\[
\forall\,\mu,\beta\in\mathbb{N},\qquad |\partial_x^\mu\partial_\xi^\beta c^\pm(x,\xi,\lambda)| \le C_{\mu\beta}\frac{e^{\alpha x}}{\lambda^2}, \tag{3.114}
\]
for all $x\in\mathbb{R}^-$ and $\lambda$ large enough. Finally, as in the proof of Lemma 3.5, the estimates (3.114) are the main ingredients to prove the properties equivalent to (3.14) and (3.9). Precisely we have

Lemma 3.6. For any $\psi\in C_0^\infty(\mathbb{R};\mathbb{C}^4)$ and for $\lambda$ large, the following estimate holds:
\[
\|(W^\pm(A,A_0,\lambda) - J_0^\pm(\lambda))\psi\| = O(\lambda^{-2}).
\]
$^{d}$ In the same way as in the preceding section, we should add some technical cutoff functions which are negligible in the asymptotics.
We now use Lemma 3.6 to compute the asymptotics of $W^\pm(A,A_0,\lambda)\psi$ up to the order $O(\lambda^{-2})$. For any $\psi\in C_0^\infty(\mathbb{R};\mathbb{C}^4)$ and for $\lambda$ large, we have
\[
W^\pm(A,A_0,\lambda)\psi = J_0^\pm(\lambda)\psi + O\Big(\frac{1}{\lambda^2}\Big).
\]
Hence, it is enough to compute the asymptotics of $J_0^\pm(\lambda)$ for $\lambda$ large. Using (3.111)–(3.113) and after some calculations, we obtain
\[
J_0^\pm(\lambda)\psi = \Big[1 + \frac{1}{2\lambda}\Big(i\int_x^{-\infty}W^2(s)\,ds \mp W(x)\Big)\Big]P_\mp\psi + O\Big(\frac{1}{\lambda^2}\Big). \tag{3.115}
\]
Note that we naturally retrieve the same formulae as in [4]. Eventually, combining (3.110) and (3.115), we obtain the asymptotics of $W^\pm(\lambda)$ for $\lambda$ large.

Proposition 3.2. For any $\psi\in C_0^\infty(\mathbb{R})$,
\[
W^\pm_{(-\infty)}(\lambda)\psi = U\Big(1 + \frac{1}{\lambda}Q^\pm(x)\Big)e^{i\Gamma^1c_0x}P_\mp\psi + O\Big(\frac{1}{\lambda^2}\Big), \tag{3.116}
\]
where $U$ is given by (3.101), $Q^\pm(x) = \frac{1}{2}\big(i\int_x^{-\infty}W^2(s)\,ds \mp W(x)\big)$ and $W(x)$ is given by (3.104).

3.3. Proofs of Theorems 3.1 and 3.2

In this last subsection, we use the asymptotics of $W^\pm_{(\pm\infty)}(\lambda)$ obtained in Propositions 3.1 and 3.2 to prove the reconstruction formulae given in Theorem 3.2 and finally prove Theorem 3.1.
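Proof of Theorem 3.2. We only treat the case of the transmission operator $T_R$ and give the proof of (3.1), since the proof of (3.2) corresponding to the transmission operator $T_L$ is similar. Recall that we want to compute the asymptotic expansion when $\lambda\to+\infty$ of
\[
F_l(\lambda) = \langle T_Re^{i\lambda x}\psi, e^{i\lambda x}\phi\rangle = \langle W^-_{(+\infty)}(\lambda)\psi, W^+_{(-\infty)}(\lambda)\phi\rangle,
\]
for $\psi,\phi\in C_0^\infty(\mathbb{R};\mathbb{C}^4)$. Using Propositions 3.1 and 3.2 and the notations therein, we have
\[
F_l(\lambda) = \Big\langle e^{iC^+(x)}\Big(1+\frac{1}{\lambda}R^-(x)\Big)P_-\psi,\ U\Big(1+\frac{1}{\lambda}Q^+(x)\Big)e^{i\Gamma^1c_0x}P_-\phi\Big\rangle + O\Big(\frac{1}{\lambda^2}\Big)
\]
\[
= \langle e^{iC^+(x)}P_-\psi,\ Ue^{i\Gamma^1c_0x}P_-\phi\rangle + \frac{1}{\lambda}\Big[\langle e^{iC^+(x)}P_-\psi,\ UQ^+e^{i\Gamma^1c_0x}P_-\phi\rangle + \langle e^{iC^+(x)}R^-P_-\psi,\ Ue^{i\Gamma^1c_0x}P_-\phi\rangle\Big] + O\Big(\frac{1}{\lambda^2}\Big). \tag{3.117}
\]
We now compute separately the terms of different orders in (3.117).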
Order 0. Since $\Gamma^1P_- = -P_-$, the term of order 0 reads
\[
\langle e^{i[C^+(x)-C^-(x)+c_0x]}P_-\psi, P_-\phi\rangle. \tag{3.118}
\]
Moreover from (3.96) and (3.101), the phase $C^+(x)-C^-(x)+c_0x$ takes the simple form
\[
C^+(x)-C^-(x)+c_0x = -\int_{-\infty}^0[c(s)-c_0]\,ds + c_0x. \tag{3.119}
\]
Order 1. Using $\Gamma^1P_- = -P_-$ again, the term of order 1 can be written as
\[
\langle e^{i[C^+(x)-C^-(x)+c_0x]}\big(R^- + (Q^+)^*\big)P_-\psi, P_-\phi\rangle.
\]
Since $W^2 = a^2+b^2$ and $WP_- = e^{2iC^-}(a\Gamma^2+b\Gamma^0)P_-$ by (2.16), the term $(Q^+)^*P_-$ takes the form
\[
(Q^+)^*P_- = \Big[-\frac{i}{2}\int_x^{-\infty}(a^2+b^2)(s)\,ds - \frac{1}{2}e^{2iC^-}(a\Gamma^2+b\Gamma^0)\Big]P_-. \tag{3.120}
\]
Moreover from (3.99) the term $R^-$ is
\[
R^- = \frac{i}{2}\int_x^{+\infty}a^2(s)\,ds - \frac{i}{2}\int_0^x(b^2(s)-m^2)\,ds + \frac{i}{2}\int_0^{+\infty}(b(s)-m)^2\,ds - \frac{1}{2}(a\Gamma^2+b\Gamma^0). \tag{3.121}
\]
Hence adding (3.120) and (3.121), the term of order 1 reads
\[
\Big\langle e^{i[C^+(x)-C^-(x)+c_0x]}\Big(\frac{i}{2}\int_{-\infty}^{+\infty}a^2(s)\,ds + \frac{i}{2}\int_{-\infty}^0b^2(s)\,ds + \frac{i}{2}\int_0^{+\infty}(b(s)-m)^2\,ds + \frac{i}{2}m^2x\Big)P_-\psi, P_-\phi\Big\rangle - \Big\langle e^{i[C^+(x)-C^-(x)+c_0x]}\Big(\frac{1}{2}e^{2iC^-}(a\Gamma^2+b\Gamma^0) + \frac{1}{2}(a\Gamma^2+b\Gamma^0)\Big)P_-\psi, P_-\phi\Big\rangle. \tag{3.122}
\]
Finally, using that $e^{i[C^+(x)-C^-(x)+c_0x]}$ is scalar, that $(a\Gamma^2+b\Gamma^0)P_\pm = P_\mp(a\Gamma^2+b\Gamma^0)$ by (2.16) and the fact that $\langle P_+\psi, P_-\phi\rangle = 0$, we see that the last term in (3.122) cancels, i.e.
\[
\Big\langle e^{i[C^+(x)-C^-(x)+c_0x]}\Big(\frac{1}{2}e^{2iC^-}(a\Gamma^2+b\Gamma^0) + \frac{1}{2}(a\Gamma^2+b\Gamma^0)\Big)P_-\psi, P_-\phi\Big\rangle = 0.
\]
Hence the term of order 1 is
\[
\Big\langle e^{i[C^+(x)-C^-(x)+c_0x]}\Big(\frac{i}{2}\int_{-\infty}^{+\infty}a^2(s)\,ds + \frac{i}{2}\int_{-\infty}^0b^2(s)\,ds + \frac{i}{2}\int_0^{+\infty}(b(s)-m)^2\,ds + \frac{i}{2}m^2x\Big)P_-\psi, P_-\phi\Big\rangle. \tag{3.123}
\]
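For the reader's convenience, the phase identity (3.119) and the scalar part of the order-1 term can be recovered in a few lines. This is only a sketch based on the definitions (3.96), (3.99), (3.101) and $Q^+ = \frac12(i\int_x^{-\infty}W^2 - W)$, with $W$ self-adjoint and $W^2 = a^2+b^2$.

```latex
% Phase (3.119): split the integral defining C^-(x) at s = 0.
C^+(x) - C^-(x) + c_0x
  = \int_0^x c(s)\,ds - \int_{-\infty}^0[c(s)-c_0]\,ds - \int_0^x[c(s)-c_0]\,ds - c_0x + c_0x
  = -\int_{-\infty}^0[c(s)-c_0]\,ds + c_0x .

% Scalar part of R^- + (Q^+)^*: note that -\int_x^{-\infty} = \int_{-\infty}^x.
\frac{i}{2}\int_x^{+\infty}a^2\,ds + \frac{i}{2}\int_{-\infty}^x(a^2+b^2)\,ds - \frac{i}{2}\int_0^x(b^2-m^2)\,ds
  = \frac{i}{2}\int_{-\infty}^{+\infty}a^2\,ds + \frac{i}{2}\int_{-\infty}^0 b^2\,ds + \frac{i}{2}m^2x .
```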
If we introduce the following functions
\[
\Theta(x) = e^{-i\int_{-\infty}^0[c(s)-c_0]\,ds + ic_0x},
\]
\[
A(x) = \Theta(x)\Big(\int_{-\infty}^{+\infty}a^2(s)\,ds + \int_{-\infty}^0b^2(s)\,ds + \int_0^{+\infty}(b(s)-m)^2\,ds + m^2x\Big),
\]
we have proved the reconstruction formula (3.1) and thus Theorem 3.2.
Proof of Theorem 3.1. We show here that the reconstruction formula (3.1) entails the uniqueness of the parameters $M$ and $Q$ under the additional assumption that the charge $q$ of the Dirac fields is known, fixed and nonzero. The same result can be shown from the reconstruction formula (3.2) in a similar way. We first compute one of the integrals that appear in (3.1), which will be useful in the later analysis. Using the explicit expressions of $F$, $a_l$ given in (2.2) and (2.15) as well as the definition of the Regge–Wheeler variable $x(r)$ given in (2.6), an easy calculation shows that
\[
\int_{\mathbb{R}}a_l^2(s)\,ds = \Big(l+\frac{1}{2}\Big)^2\frac{1}{r_0}, \tag{3.124}
\]
where $r_0$ is the radius of the event horizon. Now let us consider two transmission operators $T_{l,1}$ and $T_{l,2}$ corresponding, respectively, to parameters $M_j$, $Q_j$, $m_j$ ($j = 1,2$) and $q_1 = q_2 = q$, where $q$ is supposed to be known and nonzero. In what follows, all the objects corresponding to $T_{l,j}$ with $j = 1,2$ will be denoted by the usual notations with a lower index $j$. We suppose that $T_{l,1} = T_{l,2}$. Consequently, we also have $F_{l,1}(\lambda) = F_{l,2}(\lambda)$. Our goal is to prove that $M_1 = M_2$ and $Q_1 = Q_2$. Using Theorem 3.2 and identifying the terms of the same order in the reconstruction formula (3.1), we thus get
\[
\Theta_1(x) = \Theta_2(x), \tag{3.125}
\]
\[
A_1(x) = A_2(x). \tag{3.126}
\]
By (3.3) and a standard continuity argument, (3.125) leads to the equality
\[
-i\int_{-\infty}^0[c_1(s)-c_{0,1}]\,ds + ic_{0,1}x = -i\int_{-\infty}^0[c_2(s)-c_{0,2}]\,ds + ic_{0,2}x + 2ik\pi, \tag{3.127}
\]
where $k\in\mathbb{Z}$. If we differentiate (3.127) with respect to $x$, we obtain
\[
c_{0,1} = c_{0,2} := c_0. \tag{3.128}
\]
Now by (3.124), (3.126) leads to the equality
\[
\frac{i}{2}\Big(l+\frac{1}{2}\Big)^2\frac{1}{r_{0,1}} + \frac{i}{2}\int_{-\infty}^0b_1^2(s)\,ds + \frac{i}{2}\int_0^{+\infty}(b_1(s)-m_1)^2\,ds + \frac{i}{2}m_1^2x = \frac{i}{2}\Big(l+\frac{1}{2}\Big)^2\frac{1}{r_{0,2}} + \frac{i}{2}\int_{-\infty}^0b_2^2(s)\,ds + \frac{i}{2}\int_0^{+\infty}(b_2(s)-m_2)^2\,ds + \frac{i}{2}m_2^2x. \tag{3.129}
\]
If we differentiate (3.129) with respect to $x$, we first get
\[
m_1 = m_2 := m. \tag{3.130}
\]
Hence the mass $m$ of the Dirac fields is uniquely determined. Moreover, using (3.130), (3.124) and the homogeneity in the parameter $l$, we obtain from (3.129)
\[
r_{0,1} = r_{0,2} := r_0. \tag{3.131}
\]
Therefore the radius $r_0$ of the event horizon is also uniquely determined. Now if we insert (3.131) and $c_0 = \frac{qQ}{r_0}$ into (3.128), we get (since $q$ is supposed to be nonzero) $Q_1 = Q_2 := Q$. The charge $Q$ of the black hole is thus uniquely determined. Eventually, since $r_0$ is a zero of the function $F$, we get from (2.2) that
\[
M_1 = M_2 := M = \frac{r_0^2+Q^2}{2r_0},
\]
and the mass $M$ of the black hole is uniquely determined. This finishes the proof of Theorem 3.1.

4. The Inverse Problem for dS-RN Black Holes ($\Lambda > 0$)

In this section, we study the inverse problem in the case $\Lambda > 0$ corresponding to dS-RN black holes. In a first part, we prove the same kind of results as in Sec. 3, that is, we prove that the parameters $M$, $Q$ and $\Lambda$ are uniquely determined by the high energies of the transmission operators $T_L$ or $T_R$. In a second part, we prove by means of a purely stationary method that the parameters $M$, $Q$ and $\Lambda$ can also be uniquely determined from the knowledge of the reflection operators $L$ or $R$ on any interval of energy.

4.1. The inverse problem at high energy

As in Sec. 3, we shall assume here that one of the following functions of $\lambda\in\mathbb{R}$
\[
F_l(\lambda) = \langle T_Re^{i\lambda x}\psi, e^{i\lambda x}\phi\rangle,\qquad G_l(\lambda) = \langle T_Le^{i\lambda x}\psi, e^{i\lambda x}\phi\rangle,
\]
is known for all large values of $\lambda$, for all $l\in\mathbb{N}$ and for all $\psi,\phi\in\mathcal{H}$ with $\hat\psi,\hat\phi\in C_0^\infty(\mathbb{R};\mathbb{C}^4)$. We emphasize that in this case the construction of the modifiers is simpler than in the previous section due to the decay of the potentials at infinity; the phases of the modifiers constructed later will belong to a good class of oscillating symbols. In particular, we do not need a technical cutoff function $\eta^+$ and a cutoff function $\chi_0$ in order to control the spreading of the wave packets as in Sec. 3, and we can consider test functions $\psi,\phi\in\mathcal{H}$ with $\hat\psi,\hat\phi\in C_0^\infty(\mathbb{R};\mathbb{C}^4)$. We also assume that
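the mass $m$ and the charge $q$ of the Dirac fields are known and fixed. Furthermore, the charge $q$ is supposed to be nonzero. Then our main result is

Theorem 4.1. Under the previous assumptions, the parameters $M$, $Q$ and $\Lambda$ of the dS-RN black hole are uniquely determined.

This theorem will follow from the following reconstruction formulae obtained on each spin-weighted spherical harmonic.

Theorem 4.2 (Reconstruction Formulae). Let $\psi,\phi\in\mathcal{H}$ be such that $\hat\psi,\hat\phi\in C_0^\infty(\mathbb{R};\mathbb{C}^4)$. Then for $\lambda$ large, we have
\[
F_l(\lambda) = \langle\Theta(x)P_-\psi, P_-\phi\rangle + \frac{1}{\lambda}\langle A(x)P_-\psi, P_-\phi\rangle + O(\lambda^{-2}), \tag{4.1}
\]
\[
G_l(\lambda) = \langle\Theta(x)P_+\psi, P_+\phi\rangle - \frac{1}{\lambda}\langle A(x)P_+\psi, P_+\phi\rangle + O(\lambda^{-2}), \tag{4.2}
\]
where $\Theta(x)$ and $A(x)$ are multiplication operators given by
\[
\Theta(x) = e^{-i\beta - i(c_+-c_0)x},\qquad A(x) = \frac{i}{2}\Big(\int_{-\infty}^{+\infty}\big(a_l^2(s)+b^2(s)\big)\,ds\Big)\Theta(x), \tag{4.3}
\]
and the constant $\beta$ is given by
\[
\beta = \int_{-\infty}^0[c(s)-c_0]\,ds + \int_0^{+\infty}[c(s)-c_+]\,ds.
\]
We shall prove Theorem 4.2 using the same global strategy as in the proof of Theorem 3.2. From (2.30), (2.31), (2.33) and the fact that $e^{i\lambda x}$ corresponds to a translation by $\lambda$ in momentum space, we express $F_l(\lambda)$ and $G_l(\lambda)$ as follows:
\[
F_l(\lambda) = \langle W^-_{(+\infty)}(\lambda)\psi, W^+_{(-\infty)}(\lambda)\phi\rangle, \tag{4.4}
\]
\[
G_l(\lambda) = \langle W^-_{(-\infty)}(\lambda)\psi, W^+_{(+\infty)}(\lambda)\phi\rangle, \tag{4.5}
\]
with
\[
W^\pm_{(-\infty)}(\lambda) = e^{-i\lambda x}W^\pm_{(-\infty)}e^{i\lambda x} = \operatorname*{s-lim}_{t\to\pm\infty}e^{itH(\lambda)}e^{-itH_0(\lambda)}P_\mp, \tag{4.6}
\]
\[
W^\pm_{(+\infty)}(\lambda) = e^{-i\lambda x}W^\pm_{(+\infty)}e^{i\lambda x} = \operatorname*{s-lim}_{t\to\pm\infty}e^{itH(\lambda)}e^{-itH_+(\lambda)}P_\pm, \tag{4.7}
\]
and
\[
H(\lambda) = \Gamma^1(D_x+\lambda) + a(x)\Gamma^2 + b(x)\Gamma^0 + c(x),\qquad H_0(\lambda) = \Gamma^1(D_x+\lambda) + c_0,\qquad H_+(\lambda) = \Gamma^1(D_x+\lambda) + c_+.
\]
Consequently, it is enough to obtain an asymptotic expansion of the $\lambda$-shifted wave operators $W^\pm_{(\pm\infty)}(\lambda)$ in order to prove the reconstruction formulae (4.1) and (4.2).

Note first that the $\lambda$-shifted wave operators $W^\pm_{(-\infty)}(\lambda)$ given by (4.6) are exactly the same as in the case $\Lambda = 0$ studied in Sec. 3.2. For completeness we recall here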
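the asymptotic expansion of $W^\pm_{(-\infty)}(\lambda)$ obtained in Proposition 3.2. For any $\psi\in\mathcal{H}$, $\hat\psi\in C_0^\infty(\mathbb{R};\mathbb{C}^4)$, we have
\[
W^\pm_{(-\infty)}(\lambda)\psi = U\Big(1+\frac{1}{\lambda}Q^\pm(x)\Big)e^{i\Gamma^1c_0x}P_\mp\psi + O\Big(\frac{1}{\lambda^2}\Big), \tag{4.8}
\]
where
\[
U = e^{-i\Gamma^1C^-(x)},\qquad C^-(x) = \int_{-\infty}^x[c(s)-c_0]\,ds + c_0x, \tag{4.9}
\]
\[
Q^\pm(x) = \frac{1}{2}\Big(i\int_x^{-\infty}W^2(s)\,ds \mp W(x)\Big),\qquad W(x) = e^{i\Gamma^1C^-(x)}\big(a(x)\Gamma^2+b(x)\Gamma^0\big)e^{-i\Gamma^1C^-(x)}. \tag{4.10}
\]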
Note second that the $\lambda$-shifted wave operators $W^\pm_{(+\infty)}(\lambda)$ given by (4.7) are very similar to (4.6), the constant $c_0$ being replaced by $c_+$ and the projections $P_\mp$ being replaced by $P_\pm$ since we now work at the cosmological horizon. Hence they can be studied in exactly the same way as in Sec. 3.2. Since there are slight modifications in some formulae, we recall here the procedure but omit the proofs. Using the unitary transform (4.9), we simplify the wave operators $W^\pm_{(+\infty)}$ as follows:
\[
W^\pm_{(+\infty)} = U\operatorname*{s-lim}_{t\to\pm\infty}e^{itA}e^{-itA_0}\,e^{itA_0}U^*e^{-itH_+}P_\pm, \tag{4.11}
\]
where we have used again the notations $A_0 = \Gamma^1D_x$ and $A = U^*HU = \Gamma^1D_x + W(x)$ from (3.102) and (3.103), with the potential $W$ given by (4.10). We also recall that by (2.16) this new potential $W(x)$ satisfies the properties
\[
\Gamma^1W + W\Gamma^1 = 0,\qquad W^2 = a^2+b^2, \tag{4.12}
\]
as well as the global estimate
\[
\exists\,\alpha>0,\qquad W(x) = O(e^{-\alpha|x|}),\qquad \forall\,x\in\mathbb{R}. \tag{4.13}
\]
The potential $W$ is thus very short-range both at the event horizon and at the cosmological horizon. Now an easy calculation shows that (compare with (3.107) and its proof)
\[
\operatorname*{s-lim}_{t\to\pm\infty}e^{itA_0}U^*e^{-itH_+}P_\pm = e^{i\Gamma^1\beta}e^{i\Gamma^1c_+x}P_\pm, \tag{4.14}
\]
where the constant $\beta$ is given by
\[
\beta = \int_{-\infty}^0[c(s)-c_0]\,ds + \int_0^{+\infty}[c(s)-c_+]\,ds. \tag{4.15}
\]
Furthermore, it is immediate from (4.13) that the wave operators $W^\pm(A,A_0) = \operatorname*{s-lim}_{t\to\pm\infty}e^{itA}e^{-itA_0}$ exist on $\mathcal{H}$. Hence we conclude by the chain rule that $W^\pm_{(+\infty)}$
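take the nice form (compare with the expressions (3.110) obtained for $W^\pm_{(-\infty)}$)
\[
W^\pm_{(+\infty)} = U\,W^\pm(A,A_0)\,e^{i\Gamma^1\beta}e^{i\Gamma^1c_+x}P_\pm. \tag{4.16}
\]
Since $U$ and $e^{i\Gamma^1\beta}e^{i\Gamma^1c_+x}$ commute with $e^{i\lambda x}$, we finally get the following expression for $W^\pm_{(+\infty)}(\lambda)$:
\[
W^\pm_{(+\infty)}(\lambda) = U\,W^\pm(A,A_0,\lambda)\,e^{i\Gamma^1\beta}e^{i\Gamma^1c_+x}P_\pm,
\]
where $W^\pm(A,A_0,\lambda) = e^{-i\lambda x}W^\pm(A,A_0)e^{i\lambda x}$. Clearly it is enough to know the asymptotics of $W^\pm(A,A_0,\lambda)P_\pm$ when $\lambda\to+\infty$ in order to get the asymptotics of $W^\pm_{(+\infty)}(\lambda)$. In fact, the calculations are exactly the same as what has been done in Sec. 3.2 (it suffices to replace $P_\mp$ by $P_\pm$ in these calculations) or in [4]. Hence we only give the final result without more details. For any $\psi\in\mathcal{H}$, $\hat\psi\in C_0^\infty(\mathbb{R};\mathbb{C}^4)$, we finally obtain
\[
W^\pm_{(+\infty)}(\lambda)\psi = U\Big(1+\frac{1}{\lambda}\tilde Q^\pm(x)\Big)e^{i\Gamma^1\beta}e^{i\Gamma^1c_+x}P_\pm\psi + O\Big(\frac{1}{\lambda^2}\Big), \tag{4.17}
\]
where $U$ is given by (4.9), $\tilde Q^\pm(x) = \frac{1}{2}\big(i\int_x^{+\infty}W^2(s)\,ds \pm W(x)\big)$ and $W$ is given by (4.10).

Proof of Theorem 4.2. We now use the asymptotic expansions (4.8) and (4.17) to prove the reconstruction formulae (4.1) and (4.2). Since the proofs are analogous, we only treat (4.1). Using the previous notations we clearly have
\[
F_l(\lambda) = \Big\langle U\Big(1+\frac{1}{\lambda}\tilde Q^-(x)\Big)e^{i\Gamma^1\beta}e^{i\Gamma^1c_+x}P_-\psi,\ U\Big(1+\frac{1}{\lambda}Q^+(x)\Big)e^{i\Gamma^1c_0x}P_-\phi\Big\rangle + O\Big(\frac{1}{\lambda^2}\Big). \tag{4.18}
\]
Since $U$ is unitary and since $\Gamma^1P_- = -P_-$, we reexpress (4.18) as
\[
F_l(\lambda) = \langle e^{-i\beta-i(c_+-c_0)x}P_-\psi, P_-\phi\rangle + \frac{1}{\lambda}\Big\langle e^{-i\beta-i(c_+-c_0)x}\big(\tilde Q^-(x) + (Q^+)^*(x)\big)P_-\psi, P_-\phi\Big\rangle + O\Big(\frac{1}{\lambda^2}\Big). \tag{4.19}
\]
From the explicit expressions of $Q^+$ and $\tilde Q^-$, (4.19) becomes
\[
F_l(\lambda) = \langle e^{-i\beta-i(c_+-c_0)x}P_-\psi, P_-\phi\rangle + \frac{1}{\lambda}\Big\langle e^{-i\beta-i(c_+-c_0)x}\Big(\frac{i}{2}\int_{-\infty}^{+\infty}W^2(s)\,ds - W(x)\Big)P_-\psi, P_-\phi\Big\rangle + O\Big(\frac{1}{\lambda^2}\Big). \tag{4.20}
\]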
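Eventually observe that $W(x)P_- = P_+W(x)$ by (2.16) and that $\langle P_+\psi, P_-\phi\rangle = 0$. Hence we obtain for (4.20)
\[
F_l(\lambda) = \langle e^{-i\beta-i(c_+-c_0)x}P_-\psi, P_-\phi\rangle + \frac{i}{2\lambda}\Big\langle\Big(\int_{-\infty}^{+\infty}W^2(s)\,ds\Big)e^{-i\beta-i(c_+-c_0)x}P_-\psi, P_-\phi\Big\rangle + O\Big(\frac{1}{\lambda^2}\Big). \tag{4.21}
\]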
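The step from (4.19) to (4.21) rests on a short computation, sketched here using only the formulas for $Q^+$ and $\tilde Q^-$ given in (4.10) and after (4.17), the self-adjointness of $W(x)$, and the fact that $W^2 = a^2+b^2$ is a real scalar function.

```latex
% Adjoint of Q^+ (W self-adjoint, W^2 real scalar):
(Q^+)^*(x) = \frac{1}{2}\Big(-i\int_x^{-\infty}W^2(s)\,ds - W(x)\Big)
           = \frac{1}{2}\Big(i\int_{-\infty}^{x}W^2(s)\,ds - W(x)\Big).

% Sum appearing in (4.19):
\tilde Q^-(x) + (Q^+)^*(x)
  = \frac{1}{2}\Big(i\int_x^{+\infty}W^2(s)\,ds - W(x)\Big)
  + \frac{1}{2}\Big(i\int_{-\infty}^{x}W^2(s)\,ds - W(x)\Big)
  = \frac{i}{2}\int_{-\infty}^{+\infty}W^2(s)\,ds - W(x).

% The W(x) term drops out, since W P_- = P_+ W and P_+ P_- = 0:
\langle \Theta(x)\,W(x)P_-\psi,\ P_-\phi\rangle
  = \langle P_+\,\Theta(x)W(x)\psi,\ P_-\phi\rangle = 0,
\qquad \Theta(x) = e^{-i\beta - i(c_+-c_0)x}.
```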
Denoting
\[
\Theta(x) = e^{-i\beta-i(c_+-c_0)x},\qquad A(x) = \frac{i}{2}\Big(\int_{-\infty}^{+\infty}W^2(s)\,ds\Big)\Theta(x) = \frac{i}{2}\Big(\int_{-\infty}^{+\infty}\big(a_l^2(s)+b^2(s)\big)\,ds\Big)\Theta(x),
\]
we have proved the reconstruction formula (4.1). This finishes the proof of Theorem 4.2.

Proof of Theorem 4.1. We prove here that the parameters $M$, $Q$ and $\Lambda$ are uniquely determined from the knowledge of the high energies of the transmission operator $T_R$. Note that the proof with the high energies of $T_L$ is the same. Consider two transmission operators $T_{R,1}$ and $T_{R,2}$ corresponding to parameters $M_j$, $Q_j$, $\Lambda_j$ with $j = 1,2$, where moreover $m$ and $q\neq 0$ are supposed to be known and fixed. In what follows, we shall denote all the objects associated to $T_{R,j}$ by the usual notations with a lower index $j$. We assume that $T_{R,1} = T_{R,2}$. From the definition of $F_l(\lambda)$ it then follows that $F_{l,1}(\lambda) = F_{l,2}(\lambda)$. We now identify the terms of the same order in the asymptotic expansion (4.1). Since the admissible test functions $\psi$, $\phi$ are dense in $\mathcal{H}$, we get
\[
\Theta_1(x) = \Theta_2(x),\qquad \forall\,x\in\mathbb{R}, \tag{4.22}
\]
\[
A_1(x) = A_2(x),\qquad \forall\,x\in\mathbb{R}. \tag{4.23}
\]
Let us analyze the term of order 0 first. From (4.22) and (4.3), we have
\[
-i\beta_1 - i(c_{+,1}-c_{0,1})x = -i\beta_2 - i(c_{+,2}-c_{0,2})x + 2ik\pi,\qquad \forall\,x\in\mathbb{R}, \tag{4.24}
\]
where $k\in\mathbb{Z}$. If we differentiate (4.24) with respect to $x$, we thus obtain
\[
c_{0,1}-c_{+,1} = c_{0,2}-c_{+,2}. \tag{4.25}
\]
Hence using (4.25) and (2.29), we see that the quantity
\[
X = c_0 - c_+ = qQ\,\frac{r_+-r_0}{r_0r_+} \tag{4.26}
\]
is uniquely determined.
We analyze now the term of order O(λ−1 ). From (4.23), (4.3) and (4.22) again, we have +∞ +∞ 2 W1 (s)ds = W22 (s)ds. (4.27) −∞
−∞
Using that W 2 (x) = a2l (x) + b2 (x) and the expressions of the potentials al and b given by (2.15) and the definition of the Regge–Wheeler variable (2.6), we can compute explicitely the integrals that appear in (4.27). In fact we have 2 +∞ 1 1 1 2 W (s)ds = l + − (4.28) + m2 (r+ − r0 ). 2 r0 r+ −∞ By homogeneity in l and since m is considered as known and fixed, we deduce from (4.27) and (4.28) that r+,2 − r0,2 r+,1 − r0,1 = , r0,1 r+,1 r0,2 r+,2 r+,1 − r0,1 = r+,2 − r0,2 .
(4.29) (4.30)
Hence the quantities Y =
r+ − r0 , r0 r+
Z = r+ − r0 ,
(4.31)
are uniquely determined. We can now show the uniqueness of the parameters M, Q and Λ as follows. We first note the following relation X = qQY.
(4.32)
Since $X$, $Y$ are uniquely determined and $q$ is supposed to be known and fixed, we deduce from (4.32) that $Q$ is uniquely determined, i.e. $Q_1 = Q_2 = Q$. Moreover, from (4.31) we deduce that $r_+ - r_0$ and $r_0 r_+$ are uniquely determined. Hence so are $r_0$ and $r_+$ as the unique solutions of the obvious polynomial of second order. Now recall that $r_0$ and $r_+$ are roots of $F(r) = 0$. The equations $F(r_0) = 0$ and $F(r_+) = 0$ can be written using (2.2) as the linear system
\[ \begin{pmatrix} \dfrac{2}{r_+} & \dfrac{r_+^2}{3} \\[2mm] \dfrac{2}{r_0} & \dfrac{r_0^2}{3} \end{pmatrix} \begin{pmatrix} M \\ \Lambda \end{pmatrix} = \begin{pmatrix} 1 + \dfrac{Q^2}{r_+^2} \\[2mm] 1 + \dfrac{Q^2}{r_0^2} \end{pmatrix}. \tag{4.33} \]
The determinant of (4.33) is $\frac{2}{3}\,\frac{r_0^3 - r_+^3}{r_0 r_+}$ and is clearly nonzero. Hence $(M, \Lambda)$ is the unique solution of the system (4.33), whose coefficients depend only on $r_0$, $r_+$, $Q$, which are uniquely determined by the previous discussion. We thus conclude that $M$ and $\Lambda$ are also uniquely determined, i.e. $M_1 = M_2$ and $\Lambda_1 = \Lambda_2$, and the proof of Theorem 4.1 is finished.
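The uniqueness argument above is constructive, and it can be sketched in a few lines of Python. The sketch below assumes that the metric function has the form $F(r) = 1 - 2M/r + Q^2/r^2 - \Lambda r^2/3$, which is the form encoded by the linear system (4.33); the function names and the numerical values are hypothetical.

```python
import math


def solve_M_Lambda(r0, rp, Q):
    """Solve the 2x2 linear system (4.33) for (M, Lambda) by Cramer's rule."""
    a11, a12, b1 = 2.0 / rp, rp ** 2 / 3.0, 1.0 + Q ** 2 / rp ** 2
    a21, a22, b2 = 2.0 / r0, r0 ** 2 / 3.0, 1.0 + Q ** 2 / r0 ** 2
    det = a11 * a22 - a12 * a21   # equals (2/3)(r0^3 - rp^3)/(r0 rp)
    M = (b1 * a22 - a12 * b2) / det
    Lam = (a11 * b2 - a21 * b1) / det
    return M, Lam


def recover_parameters(X, Y, Z, q):
    """Recover (M, Q, Lambda) from the quantities X, Y, Z of (4.26) and (4.31)."""
    Q = X / (q * Y)                # relation (4.32)
    prod = Z / Y                   # r0 * rp
    # r0 is the positive root of r^2 + Z r - r0*rp = 0, and rp = r0 + Z.
    r0 = 0.5 * (-Z + math.sqrt(Z * Z + 4.0 * prod))
    rp = r0 + Z
    M, Lam = solve_M_Lambda(r0, rp, Q)
    return M, Q, Lam


def F(r, M, Q, Lam):
    """Assumed dS-RN metric function (see the lead-in)."""
    return 1.0 - 2.0 * M / r + Q ** 2 / r ** 2 - Lam * r ** 2 / 3.0


# Round trip on hypothetical data: horizons r0 = 1, r+ = 3, Q = 1/2, q = 2.
r0_true, rp_true, Q_true, q = 1.0, 3.0, 0.5, 2.0
M_true, Lam_true = solve_M_Lambda(r0_true, rp_true, Q_true)
Y = (rp_true - r0_true) / (r0_true * rp_true)
Z = rp_true - r0_true
X = q * Q_true * Y
M_rec, Q_rec, Lam_rec = recover_parameters(X, Y, Z, q)
```

The round trip first builds $(M, \Lambda)$ from chosen horizons, then recovers all three parameters from $(X, Y, Z)$ alone, which mirrors the order of the proof.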
May 11, J070-S0129055X10004004
2010 10:7 WSPC/S0129-055X
148-RMP
Inverse Scattering in de Sitter–Reissner–Nordstr¨ om Black Hole Spacetimes
473
4.2. The inverse problem on an interval of energy

In this last subsection, we solve the inverse problem when the reflection operators $L$ or $R$ are supposed to be known on a (possibly small) interval of energy. We follow the usual stationary approach of inverse scattering on the line and refer to [8, 6] for a presentation of the general method in the case of one-dimensional Schrödinger operators and to [1] for an application to massless Dirac operators (see also [12, 15] for massive Dirac operators). We first determine a stationary representation of the scattering operator $S$ expressed in terms of the usual transmission and reflection coefficients (here matrices). We do this by a series of simplifications of our model, which finally reduces to the exact framework studied in [1]. We then use the exponential decay of the potentials to show that the reflection coefficients $R$ and $L$ can be extended analytically to a small strip around the real axis. Consequently, by analytic continuation, the reflection coefficients $R$ or $L$ are uniquely determined on $\mathbb{R}$ if they are known on any interval of energy. Finally, we use the results of [1], which rely on a classical Marchenko method, to prove that the parameters $M$, $Q$ and $\Lambda$ are uniquely determined by the knowledge of $R(\xi)$ or $L(\xi)$ for all energies.

Recall that the scattering operator $S$ is defined by $S = (W^+)^* W^-$, where the global wave operators $W^\pm$ are given when $\Lambda > 0$ by
\[ W^\pm = W^\pm_{(-\infty)} + W^\pm_{(+\infty)}, \tag{4.34} \]
with
\[ W^\pm_{(-\infty)} = \text{s-}\lim_{t \to \pm\infty} e^{itH} e^{-itH_0} P_\mp, \qquad W^\pm_{(+\infty)} = \text{s-}\lim_{t \to \pm\infty} e^{itH} e^{-itH_+} P_\pm. \tag{4.35} \]
We now use the unitary transform $U$ introduced in (3.101) and the corresponding simplified expressions of $W^\pm_{(\pm\infty)}$ obtained in (3.110) and (4.16) to express (4.34) as
\[ W^\pm = U\, W^\pm(A, A_0)\left(e^{i\Gamma^1 c_0 x} P_\mp + e^{i\Gamma^1 \beta} e^{i\Gamma^1 c_+ x} P_\pm\right). \tag{4.36} \]
Here we have used the notations introduced in Secs. 3.2 and 4.1. Let us denote by $G_\pm$ the operators $e^{i\Gamma^1 c_0 x} P_\mp + e^{i\Gamma^1 \beta} e^{i\Gamma^1 c_+ x} P_\pm$ appearing in (4.36) and by $S(A, A_0)$ the scattering operator associated to the operators $A$ and $A_0$, i.e. $S(A, A_0) = (W^+(A, A_0))^* W^-(A, A_0)$. Using the unitarity of $U$ we thus immediately get the following expression for the scattering operator $S$:
\[ S = G_+^*\, S(A, A_0)\, G_-. \tag{4.37} \]
The pair of operators $(A, A_0)$ acting on $\mathcal{H}$ turns out to fit the framework studied in [1]. Recall that they are given by $A_0 = \Gamma^1 D_x$ and $A = A_0 + W(x)$, where the potential $W(x) = e^{i\Gamma^1 C^-(x)} (a(x)\Gamma^2 + b(x)\Gamma^0) e^{-i\Gamma^1 C^-(x)}$ is the $4 \times 4$ matrix-valued function
\[ W(x) = \begin{pmatrix} 0 & k(x) \\ k^*(x) & 0 \end{pmatrix}, \qquad k(x) = e^{2iC^-(x)} \begin{pmatrix} -ib(x) & a(x) \\ -a(x) & ib(x) \end{pmatrix}. \tag{4.38} \]
Here $k^*(x)$ denotes the conjugate transpose of the matrix-valued function $k(x)$. Moreover $W$ satisfies (4.12) and (4.13) and thus its entries belong to $L^1(\mathbb{R})$. This is precisely the kind of operator studied in [1]. Note however that our potential $W$ is better than $L^1(\mathbb{R})$ since it is exponentially decreasing at both ends $x \to \pm\infty$. This will be used hereafter.

As a consequence, we can use the following stationary representation of $S(A, A_0)$ obtained in [1]. Let us introduce the unitary transform $\mathcal{F}$ on $\mathcal{H}$ defined by
\[ \mathcal{F}\psi(\xi) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{-i\Gamma^1 x\xi}\, \psi(x)\,dx. \tag{4.39} \]
Then we have (see [1, p. 143])
\[ S(A, A_0) = \mathcal{F}^* S_0(\xi)\, \mathcal{F}, \tag{4.40} \]
where the scattering matrix $S_0(\xi)$ takes the form
\[ S_0(\xi) = \begin{pmatrix} T_L(\xi) & R(\xi) \\ L(\xi) & T_R(\xi) \end{pmatrix}. \tag{4.41} \]
Here TL (ξ) and TR (ξ) are 2 × 2 matrix-valued functions which correspond to the usual transmission coefficients of S whereas L(ξ) and R(ξ) are 2 × 2 matrix-valued functions which correspond to the usual reflection coefficients of S. We refer to [1, Secs. 2 and 3] for the definition and the construction of the scattering matrix S0 (ξ). Hence (4.37) becomes S = (F G+ )∗ S0 (ξ)F G− .
(4.42)
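We now finish our factorization of the scattering operator $S$ as follows. Using $2 \times 2$ block matrix notations, we note that
\[ G_+ = \begin{pmatrix} e^{i\beta} & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} e^{ic_+ x} & 0 \\ 0 & e^{-ic_0 x} \end{pmatrix}, \qquad G_- = \begin{pmatrix} 1 & 0 \\ 0 & e^{-i\beta} \end{pmatrix} \begin{pmatrix} e^{ic_0 x} & 0 \\ 0 & e^{-ic_+ x} \end{pmatrix}, \]
and we define two unitary transforms $\mathcal{F}_\pm$ on $\mathcal{H}$ by
\[ \mathcal{F}_+\psi(\xi) = \mathcal{F}\left[\begin{pmatrix} e^{ic_+ x} & 0 \\ 0 & e^{-ic_0 x} \end{pmatrix}\psi\right](\xi) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \begin{pmatrix} e^{-ix\xi + ic_+ x} & 0 \\ 0 & e^{ix\xi - ic_0 x} \end{pmatrix} \psi(x)\,dx, \tag{4.43} \]
and
\[ \mathcal{F}_-\psi(\xi) = \mathcal{F}\left[\begin{pmatrix} e^{ic_0 x} & 0 \\ 0 & e^{-ic_+ x} \end{pmatrix}\psi\right](\xi) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \begin{pmatrix} e^{-ix\xi + ic_0 x} & 0 \\ 0 & e^{ix\xi - ic_+ x} \end{pmatrix} \psi(x)\,dx. \tag{4.44} \]
Then we have
\[ \mathcal{F} G_+ = \begin{pmatrix} e^{i\beta} & 0 \\ 0 & 1 \end{pmatrix} \mathcal{F}_+, \qquad \mathcal{F} G_- = \begin{pmatrix} 1 & 0 \\ 0 & e^{-i\beta} \end{pmatrix} \mathcal{F}_-. \tag{4.45} \]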
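Hence we conclude from (4.45) that the scattering operator (4.42) factorizes as
\[ S = \mathcal{F}_+^* \begin{pmatrix} e^{-i\beta}\, T_L(\xi) & e^{-2i\beta} R(\xi) \\ L(\xi) & e^{-i\beta}\, T_R(\xi) \end{pmatrix} \mathcal{F}_-. \tag{4.46} \]
We summarize this result as a proposition.

Proposition 4.1. The scattering operator $S$ has the following stationary representation. If $\mathcal{F}_\pm$ are the unitary transforms defined in (4.43) and (4.44), then
\[ S = \mathcal{F}_+^*\, S(\xi)\, \mathcal{F}_-, \tag{4.47} \]
where the $4 \times 4$ scattering matrix $S(\xi)$ is given by
\[ S(\xi) = \begin{pmatrix} e^{-i\beta}\, T_L(\xi) & e^{-2i\beta} R(\xi) \\ L(\xi) & e^{-i\beta}\, T_R(\xi) \end{pmatrix}, \tag{4.48} \]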
and the quantities $T_L$, $T_R$ and $L$, $R$ are the $2 \times 2$ matrices that correspond to the transmission and reflection matrices of $S(A, A_0)$ respectively and are obtained in [1, Secs. 2 and 3].

Remark 4.1. As the notations suggest, the diagonal elements of the scattering matrix $S(\xi)$ given in (4.48) are simply the stationary representations of the transmission operators $T_L$ and $T_R$ introduced in Sec. 2, (2.33). The anti-diagonal elements of $S(\xi)$ are in turn the stationary representations of the reflection operators $L$ and $R$ in (2.34).

Remark 4.2. The unitary operators $\mathcal{F}_\pm$ appearing in the stationary representation (4.47) of $S$ are natural in the following sense. Let us define the two selfadjoint operators on $\mathcal{H}$
\[ H^+ = (\Gamma^1 D_x + c_+) P_+ + (\Gamma^1 D_x + c_0) P_-, \qquad H^- = (\Gamma^1 D_x + c_0) P_+ + (\Gamma^1 D_x + c_+) P_-. \]
It is then clear from (4.34) and (4.35) that the global wave operators can be written in a classical form as
\[ W^\pm = \text{s-}\lim_{t \to \pm\infty} e^{itH} e^{-itH^\pm}. \]
Now it is an easy calculation to show that the unitary transforms $\mathcal{F}_\pm$ introduced in (4.43) and (4.44) are precisely the unitary transforms which diagonalize the operators $H^\pm$, i.e. $H^\pm = \mathcal{F}_\pm^* M_\xi \mathcal{F}_\pm$, where $M_\xi$ denotes the multiplication operator by $\xi$. We conclude that (4.47) together with (4.48) is the expected stationary representation of the scattering operator $S$.
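The passage from (4.42) and (4.45) to the matrix in (4.46) and (4.48) is a product of block matrices with constant diagonal phase factors. The following Python sketch checks this bookkeeping numerically. It is only an illustration under simplifying assumptions: each $2 \times 2$ block is replaced by a complex scalar (the phases are scalars, so the block structure is unaffected), and the sample unitary matrix and the helper names are hypothetical.

```python
import cmath
import math


def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dressed_scattering_matrix(S0, beta):
    """Return diag(e^{-i beta}, 1) * S0 * diag(1, e^{-i beta}), cf. (4.45)-(4.46)."""
    phase = cmath.exp(-1j * beta)
    left = [[phase, 0], [0, 1]]    # adjoint of diag(e^{i beta}, 1)
    right = [[1, 0], [0, phase]]
    return matmul(matmul(left, S0), right)


def is_unitary(S, tol=1e-12):
    """Check S^* S = I entrywise up to the tolerance tol."""
    n = len(S)
    adj = [[S[j][i].conjugate() for j in range(n)] for i in range(n)]
    prod = matmul(adj, S)
    return all(abs(prod[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))


# Scalar stand-ins for the blocks of S0 in (4.41); this S0 is unitary.
theta, beta = 0.7, 0.3
TL = TR = complex(math.cos(theta))
R = L = 1j * math.sin(theta)
S0 = [[TL, R], [L, TR]]
S = dressed_scattering_matrix(S0, beta)
```

In this toy setting the entries of `S` reproduce the pattern of (4.48), and unitarity of $S_0(\xi)$ is inherited by $S(\xi)$.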
In the sequel, we shall use the explicit link between our scattering matrix $S(\xi)$ and the scattering matrix $S_0(\xi)$ thoroughly studied in [1] in order to solve the inverse problem. Let us first briefly summarize some of the main results obtained in [1]. Under the assumption $W \in L^1(\mathbb{R})$, the scattering matrix $S_0(\xi)$ is continuous for $\xi \in \mathbb{R}$ and tends to $I_4$ when $\xi \to \pm\infty$. It is also unitary for each $\xi \in \mathbb{R}$ (see [1, Theorem 3.1] for a proof of these statements and for other properties of $S_0(\xi)$). Moreover, the following partial characterization result holds:

Theorem 4.3 ([1, Theorem 6.3]). Assume that the reflection operators $R(\xi)$ and $L(\xi)$ are $2 \times 2$ matrix-valued functions satisfying
\[ \sup_{\xi \in \mathbb{R}} \|R(\xi)\| < 1, \quad \sup_{\xi \in \mathbb{R}} \|L(\xi)\| < 1, \quad \hat{R}(\alpha) \in L^1(\mathbb{R}), \quad \hat{L}(\alpha) \in L^1(\mathbb{R}), \tag{4.49} \]
\[ \int_0^{+\infty} \alpha\, \|\hat{R}(\alpha)\|^2\,d\alpha < \infty, \qquad \int_{-\infty}^{0} |\alpha|\, \|\hat{L}(\alpha)\|^2\,d\alpha < \infty, \tag{4.50} \]
where $\hat{R}(\alpha)$ and $\hat{L}(\alpha)$ denote the usual Fourier transforms of $R(\xi)$ and $L(\xi)$ and $\|\cdot\|$ is the Euclidean norm of a given matrix. Then the matrix-valued function $k(x) \in L^1(\mathbb{R})$ in (4.38) (and thus the potential $W(x)$) can be uniquely recovered from the knowledge of $R(\xi)$ and $L(\xi)$ for all $\xi \in \mathbb{R}$.

We make several comments on this result and how we can apply it to our model:
• The proof of the above theorem uses a classical Marchenko method. For instance, the matrix-valued function $k(x)$ can be obtained after solving the following Marchenko integral equations for $\alpha > 0$ (see [1, Eqs. (6.9) and (6.11)])
\[ B_1(x, \alpha) = -\hat{R}(\alpha + 2x) + \int_0^{+\infty}\!\!\int_0^{+\infty} B_1(x, \gamma)\, \hat{R}(\delta + \gamma + 2x)^*\, \hat{R}(\alpha + \delta + 2x)\,d\gamma\,d\delta, \tag{4.51} \]
\[ B_2(x, \alpha) = -\hat{L}(\alpha - 2x)^* + \int_0^{+\infty}\!\!\int_0^{+\infty} B_2(x, \gamma)\, \hat{L}(\delta + \gamma - 2x)\, \hat{L}(\alpha + \delta - 2x)^*\,d\gamma\,d\delta. \tag{4.52} \]
Under the assumption (4.49), the integral equations (4.51) and (4.52) are uniquely solvable in $L^1(\mathbb{R}^+)$ ([1, Theorem 6.2]). Moreover, under the additional assumption (4.50), the matrix-valued function $k(x)$ defined using the boundary values of $B_1$ and $B_2$ by the formulae (see [1, Eq. (4.19)])
\[ k(x) = 2iB_1(x, 0^+), \quad \forall x > 0, \qquad k(x) = -2iB_2(x, 0^+), \quad \forall x < 0, \]
can be shown to be in $L^1(\mathbb{R})$ and thus corresponds to the potential we are looking for.
• If the potential $W$ belongs to $L^1(\mathbb{R})$, then the condition (4.49) is automatically satisfied (see [1, Theorem 4.2 and Eq. (6.17)]). Although this condition is the natural one under which one could expect to reconstruct the potential $k$ in the class $L^1$, the authors of [1] had to add the extra assumption (4.50) (which must then be checked) in order to prove their result. We refer to [1, p. 154] for more details on this point. In our case, we shall prove the condition (4.50) as follows. Using the exponential decay of $W$, we are first able to show that the reflection coefficients $R(\xi)$ and $L(\xi)$ (in fact the whole scattering matrix $S_0(\xi)$) are analytic on a small strip around the real axis. Moreover the functions $R(\cdot + i\eta)$ and $L(\cdot + i\eta)$ can be shown to belong to $L^2(\mathbb{R})$ uniformly for $|\eta|$ small enough. It then follows from standard results on the Fourier transform (see, for instance, [26, Theorem IX.13]) that $\hat{R}(\alpha)$ and $\hat{L}(\alpha)$ satisfy
\[ e^{\epsilon|\alpha|}\hat{R}(\alpha) \in L^2(\mathbb{R}), \qquad e^{\epsilon|\alpha|}\hat{L}(\alpha) \in L^2(\mathbb{R}), \qquad \forall \epsilon \text{ small enough}, \]
from which (4.50) follows immediately.
• From (4.51) and (4.52) and the reconstruction procedure explained above, we see that the knowledge of $R(\xi)$ and $L(\xi)$ for all $\xi \in \mathbb{R}$ is used to recover the potential $k(x)$ for all $x \in \mathbb{R}$. In fact it is enough to know either $R(\xi)$ or $L(\xi)$ for all $\xi \in \mathbb{R}$, since then the whole scattering matrix $S_0(\xi)$ can be uniquely recovered. The procedure is explained in [1, p. 147, Eqs. (5.3)–(5.5)] and we reproduce it for completeness. Assume for instance that $R(\xi)$ is known for all $\xi \in \mathbb{R}$. Then the transmission coefficients $T_L(\xi)$ and $T_R(\xi)$ can be obtained by performing the factorizations
\[ T_L(\xi) T_L(\xi)^* = I_2 - R(\xi) R(\xi)^*, \qquad T_R(\xi)^* T_R(\xi) = I_2 - R(\xi)^* R(\xi), \qquad \xi \in \mathbb{R}. \tag{4.53} \]
Under the assumption k ∈ L1 (R), it was shown in [1] that the above factorization problems are in fact left or right canonical Wiener–Hopf factorization in the Wiener algebra W 4 and thus lead to unique TL (ξ) and TR (ξ) (see for instance [11, Theorem 9.2, p. 831]). At last, the reflection coefficient L(ξ) is recovered from R(ξ) by the formula L(ξ) = −TR (ξ)R(ξ)∗ (TL (ξ)∗ )−1 .
(4.54)
• Finally we explain how we can apply this result to our model. In view of Proposition 4.1, we assume for instance that $e^{-2i\beta} R(\xi)$ is known for all $\xi \in \mathbb{R}$. Then it is easy to see from (4.53) and (4.54) that we can uniquely recover $T_L(\xi)$ and $T_R(\xi)$ by performing Wiener–Hopf factorizations and then $e^{2i\beta} L(\xi)$ for all $\xi \in \mathbb{R}$. Note that the exponential term $e^{-2i\beta}$ disappears in the factorization (4.53). If we assume that the assumptions (4.49) and (4.50) hold (this will be checked below), then we can apply Theorem 4.3 as follows. Multiplying the integral equations (4.51) and (4.52) by $e^{-2i\beta}$ and solving them, we conclude that we can uniquely recover $e^{2i\beta} k(x)$ (and not $k(x)$) for all $x \in \mathbb{R}$. We shall show below that this implies the uniqueness of the parameters $M$, $Q$ and $\Lambda$ of the black hole.

Let us now show the analyticity of $R(\xi)$ and $L(\xi)$ on a small strip around the real axis and prove there the uniform $L^2$ estimates mentioned above. To do this we need to introduce some objects whose existence has been shown in [1, Secs. 1–3].
The reflection coefficients $R(\xi)$ and $L(\xi)$ can be expressed in terms of solutions of the stationary problem
\[ [\Gamma^1 D_x + W(x)]\, X(x, \xi) = \xi X(x, \xi), \qquad \xi \in \mathbb{R}, \tag{4.55} \]
where $X(x, \xi)$ is understood as a $4 \times 4$ matrix-valued function. Of special interest are the Jost solutions $F_l(x, \xi)$ and $F_r(x, \xi)$ of (4.55), which are singled out by the specific asymptotics at infinity
\[ F_l(x, \xi) = e^{i\Gamma^1 \xi x}(I_4 + o(1)), \quad x \to +\infty, \]
\[ F_r(x, \xi) = e^{i\Gamma^1 \xi x}(I_4 + o(1)), \quad x \to -\infty. \]
For each ξ ∈ R, these two solutions exist, are fundamental matrices of (4.55) and are related as follows ([1, Proposition 2.2]). There exist two 4 × 4 matrix valued functions al (ξ) and ar (ξ) such that Fl (x, ξ) = Fr (x, ξ)al (ξ),
Fr (x, ξ) = Fl (x, ξ)ar (ξ),
and satisfying $a_l(\xi) a_r(\xi) = a_r(\xi) a_l(\xi) = I_4$ for all $\xi \in \mathbb{R}$. Note that $F_l(x, \xi)$ and $F_r(x, \xi)$ satisfy the asymptotics (at the opposite ends)
\[ F_l(x, \xi) = e^{i\Gamma^1 \xi x}(a_l(\xi) + o(1)), \quad x \to -\infty, \qquad F_r(x, \xi) = e^{i\Gamma^1 \xi x}(a_r(\xi) + o(1)), \quad x \to +\infty. \tag{4.56} \]
Let us now express $a_l(\xi)$ and $a_r(\xi)$ using $2 \times 2$ block matrix notations as
\[ a_l(\xi) = \begin{pmatrix} a_{l1}(\xi) & a_{l2}(\xi) \\ a_{l3}(\xi) & a_{l4}(\xi) \end{pmatrix}, \qquad a_r(\xi) = \begin{pmatrix} a_{r1}(\xi) & a_{r2}(\xi) \\ a_{r3}(\xi) & a_{r4}(\xi) \end{pmatrix}. \]
Then the reflection coefficients are defined by ([1, Eqs. (3.6) and (3.7)])
\[ R(\xi) = a_{r2}(\xi)\, a_{r4}(\xi)^{-1} = -a_{l1}(\xi)^{-1} a_{l2}(\xi), \qquad L(\xi) = a_{l3}(\xi)\, a_{l1}(\xi)^{-1} = -a_{r4}(\xi)^{-1} a_{r3}(\xi). \]
Since the situations are obviously symmetric, we shall only prove the analyticity and the uniform $L^2$ estimate on a small strip around the real axis for $R(\xi)$ (the proof for $L(\xi)$ being identical). Moreover, we shall only consider the definition $R(\xi) = -a_{l1}(\xi)^{-1} a_{l2}(\xi)$ for simplicity.

To go further, we use some integral representations of the coefficients $a_{l1}(\xi)$ and $a_{l2}(\xi)$ obtained in [1]. These are given in terms of the Faddeev matrix $M_l(x, \xi)$ defined by
\[ M_l(x, \xi) = F_l(x, \xi)\, e^{-i\Gamma^1 \xi x}. \]
It is easy to see from (4.55) that $M_l(x, \xi)$ must satisfy the integral equation ([1, Eq. (2.12)])
\[ M_l(x, \xi) = I_4 - i\Gamma^1 \int_x^{+\infty} e^{-i\Gamma^1 \xi(y - x)}\, W(y)\, M_l(y, \xi)\, e^{i\Gamma^1 \xi(y - x)}\,dy, \tag{4.57} \]
and from (4.56) that $M_l(x, \xi)$ must satisfy the asymptotics $M_l(x, \xi) = I_4 + o(1)$ when $x \to +\infty$. In fact, using once again $2 \times 2$ block matrix notations for $M_l(x, \xi)$,
\[ M_l(x, \xi) = \begin{pmatrix} M_{l1}(x, \xi) & M_{l2}(x, \xi) \\ M_{l3}(x, \xi) & M_{l4}(x, \xi) \end{pmatrix}, \]
and iterating (4.57) once, we get the uncoupled system of integral equations for $M_{l3}(x, \xi)$ and $M_{l4}(x, \xi)$ ([1, Eqs. (2.15) and (2.16)])
\[ M_{l3}(x, \xi) = i\int_x^{+\infty} e^{2i\xi(y - x)}\, k(y)^*\,dy + \int_x^{+\infty}\!\!\int_y^{+\infty} e^{2i\xi(y - x)}\, k(y)^* k(z)\, M_{l3}(z, \xi)\,dz\,dy, \tag{4.58} \]
\[ M_{l4}(x, \xi) = I_2 + \int_x^{+\infty}\!\!\int_y^{+\infty} e^{-2i\xi(z - y)}\, k(y)^* k(z)\, M_{l4}(z, \xi)\,dz\,dy, \tag{4.59} \]
and similar equations for $M_{l1}(x, \xi)$ and $M_{l2}(x, \xi)$ that we shall not need. Finally, the following integral representations for the coefficients $a_{l1}(\xi)$ and $a_{l2}(\xi)$ hold ([1, Eqs. (2.25) and (2.26)])
\[ a_{l1}(\xi) = I_2 - i\int_{\mathbb{R}} k(y)\, M_{l3}(y, \xi)\,dy, \tag{4.60} \]
\[ a_{l2}(\xi) = -i\int_{\mathbb{R}} e^{-2i\xi y}\, k(y)\, M_{l4}(y, \xi)\,dy. \tag{4.61} \]
We first study the coefficient $a_{l2}(\xi)$ expressed in terms of the Faddeev matrix $M_{l4}(x, \xi)$. Under the assumption $k \in L^1(\mathbb{R})$, a solution $M_{l4}(x, \xi)$ of (4.59) with the right asymptotics is easily shown to exist by iteration. Moreover for each fixed $x \in \mathbb{R}$, this solution can be extended to a continuous function in the variable $\xi$ when $\mathrm{Im}\,\xi \leq 0$ and analytic when $\mathrm{Im}\,\xi < 0$ ([1, Proposition 2.3]). We now prove the following result.

Lemma 4.1. Define the function $P(x, \xi) = \int_x^{+\infty} e^{2|\mathrm{Im}\,\xi||y|}\, \|k(y)\|\,dy$. Then there exists $\kappa > 0$ small enough such that
(i) For all $\xi$ satisfying $|\mathrm{Im}\,\xi| \leq \kappa$ and for all $x \in \mathbb{R}$, the function $P(x, \xi)$ is uniformly bounded.
(ii) For each fixed $x \in \mathbb{R}$, the Faddeev matrix $M_{l4}(x, \xi)$ can be extended analytically to the strip $|\mathrm{Im}\,\xi| < \kappa$. Moreover, for each such $\xi$, it satisfies the estimate
\[ \|M_{l4}(x, \xi)\| \leq C \cosh(P(x, \xi)). \tag{4.62} \]
(iii) For each fixed $x \in \mathbb{R}$, the derivative $M_{l4}'(x, \xi)$ of the Faddeev matrix with respect to the variable $x$ can be extended analytically to the strip $|\mathrm{Im}\,\xi| < \kappa$. Moreover, for each such $\xi$, it satisfies the estimate
\[ \|M_{l4}'(x, \xi)\| \leq C \sinh(P(x, \xi)). \tag{4.63} \]
Proof. The first assertion is a direct consequence of the definition of $P(x, \xi)$ and (4.13) (take for instance $\kappa = \alpha/2$ where $\alpha$ is the positive number that appears in (4.13)). Solving (4.59) by iteration leads us to set $M_{l4}(x, \xi) = \sum_{n=0}^{\infty} u_n(x, \xi)$ with $u_0(x, \xi) = I_2$ and
\[ u_n(x, \xi) = \int_x^{+\infty}\!\!\int_y^{+\infty} e^{-2i\xi(z - y)}\, k(y)^* k(z)\, u_{n-1}(z, \xi)\,dz\,dy, \quad \forall n \geq 1. \tag{4.64} \]
By induction we get the estimates
\[ \|u_n(x, \xi)\| \leq \frac{P(x, \xi)^{2n}}{(2n)!}, \quad \forall n \in \mathbb{N}. \tag{4.65} \]
Together with (i), this entails the second assertion. To prove the third one, we consider the series of derivatives $\sum_{n=1}^{\infty} u_n'(x, \xi)$. From (4.64), note that
\[ u_n'(x, \xi) = -\int_x^{+\infty} e^{-2i\xi(z - x)}\, k(x)^* k(z)\, u_{n-1}(z, \xi)\,dz. \]
By induction and using (4.65), we get the estimates $\|u_n'(x, \xi)\| \leq C\, \frac{P(x, \xi)^{2n-1}}{(2n-1)!}$ for all $n \geq 1$, from which we deduce (iii).
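The iteration scheme (4.64) and the bound (4.65) can be illustrated on a scalar toy model. The Python sketch below is only an illustration under simplifying assumptions: $k$ is replaced by the hypothetical positive scalar function $e^{-|x|}$, we take $\xi = 0$, and the integrals are truncated to $[-20, 20]$ and computed by the trapezoid rule. In this scalar case the induction gives $u_n = P^{2n}/(2n)!$ exactly, so the Picard series sums to $\cosh(P(x, 0))$.

```python
import math


def tail_integral(f, h):
    """Return g with g[i] ~ integral of f from x_i to the right end of the grid.

    f holds samples on a uniform grid of step h; the trapezoid rule is used.
    """
    g = [0.0] * len(f)
    for i in range(len(f) - 2, -1, -1):
        g[i] = g[i + 1] + 0.5 * h * (f[i] + f[i + 1])
    return g


def picard_sum(k, h, n_terms=12):
    """Sum u_0 + ... + u_{n_terms} of the scalar analogue of (4.64) at xi = 0."""
    u = [1.0] * len(k)            # u_0 = 1
    total = u[:]
    for _ in range(n_terms):
        inner = tail_integral([ki * ui for ki, ui in zip(k, u)], h)
        u = tail_integral([ki * vi for ki, vi in zip(k, inner)], h)
        total = [t + v for t, v in zip(total, u)]
    return total


h = 0.01
xs = [-20.0 + i * h for i in range(4001)]
k = [math.exp(-abs(x)) for x in xs]   # hypothetical exponentially decaying k
P = tail_integral(k, h)               # P(x, 0) on the grid
M = picard_sum(k, h)
max_err = max(abs(m - math.cosh(p)) for m, p in zip(M, P))
```

The quantity `max_err` measures the discrepancy between the truncated Picard series and $\cosh(P)$; it is limited only by the quadrature error of the grid.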
Corollary 4.1. Let $\kappa$ be the positive number defined in Lemma 4.1. The coefficient $a_{l2}(\xi)$ is analytic on the strip $|\mathrm{Im}\,\xi| < \kappa$. Moreover, it satisfies there the estimate
\[ \|a_{l2}(\xi)\| = O(|\xi|^{-1}), \quad |\xi| \to \infty. \tag{4.66} \]
Proof. The analyticity on the strip $|\mathrm{Im}\,\xi| < \kappa$ follows directly from (4.61) and Lemma 4.1. To prove the second assertion, we integrate by parts in (4.61). For all $\xi$ with $|\mathrm{Im}\,\xi| < \kappa$, we obtain
\[ a_{l2}(\xi) = -\frac{1}{2\xi} \int_{\mathbb{R}} e^{-2i\xi y} \big(k'(y)\, M_{l4}(y, \xi) + k(y)\, M_{l4}'(y, \xi)\big)\,dy. \tag{4.67} \]
Since $k'$ also satisfies the estimate (4.13) and using Lemma 4.1 again, we conclude that $\|a_{l2}(\xi)\| \leq \frac{C}{|\xi|}$.

We now study the coefficient $a_{l1}(\xi)$ expressed in terms of the Faddeev matrix $M_{l3}(x, \xi)$. Once again under the assumption $k \in L^1(\mathbb{R})$, a solution $M_{l3}(x, \xi)$ of (4.58) with the right asymptotics is easily shown to exist by iteration. Moreover for each fixed $x \in \mathbb{R}$, this solution can be extended to a continuous function in the variable $\xi$ when $\mathrm{Im}\,\xi \geq 0$ and analytic when $\mathrm{Im}\,\xi > 0$ ([1, Proposition 2.3]). Using the same function $P(x, \xi)$ and positive number $\kappa$ as in Lemma 4.1, let us prove the following result.

Lemma 4.2. For each fixed $x \in \mathbb{R}$, the Faddeev matrix $M_{l3}(x, \xi)$ can be extended analytically to the strip $|\mathrm{Im}\,\xi| < \kappa$. Moreover, for each such $\xi$, it satisfies the estimates
\[ \|M_{l3}(x, \xi)\| \leq C e^{2|\mathrm{Im}\,\xi||x|} \sinh(P(x, \xi)), \tag{4.68} \]
\[ \|M_{l3}(x, \xi)\| \leq \frac{C}{|\xi|}\big(1 + e^{2|\mathrm{Im}\,\xi||x|}\big), \quad |\xi| \geq 1. \tag{4.69} \]

Proof. We solve (4.58) by iteration. Hence we set $M_{l3}(x, \xi) = \sum_{n=0}^{\infty} v_n(x, \xi)$ with
\[ v_0(x, \xi) = i\int_x^{+\infty} e^{2i\xi(y - x)}\, k(y)^*\,dy, \]
and
\[ v_n(x, \xi) = \int_x^{+\infty}\!\!\int_y^{+\infty} e^{2i\xi(y - x)}\, k(y)^* k(z)\, v_{n-1}(z, \xi)\,dz\,dy. \tag{4.70} \]
We can prove the following estimate by induction
\[ \|v_n(x, \xi)\| \leq e^{2|\mathrm{Im}\,\xi||x|}\, \frac{P(x, \xi)^{2n+1}}{(2n+1)!}, \quad \forall n \in \mathbb{N}, \tag{4.71} \]
which immediately implies (4.68). Moreover, since $P(x, \xi)$ is uniformly bounded on $|\mathrm{Im}\,\xi| < \kappa$, we deduce from (4.68) the analyticity of $M_{l3}(x, \xi)$ on the same strip. To prove (4.69), we integrate by parts in (4.58) with respect to the variable $y$. For all $\xi$ with $|\mathrm{Im}\,\xi| < \kappa$, we obtain
\[ M_{l3}(x, \xi) = -\frac{k^*(x)}{2\xi} - \frac{e^{-2i\xi x}}{2\xi} \int_x^{+\infty} e^{2i\xi y}\, (k^*)'(y)\,dy - \frac{k^*(x) K(x)}{2i\xi} - \frac{e^{-2i\xi x}}{2i\xi} \int_x^{+\infty} e^{2i\xi y} \big((k^*)'(y) K(y) - k^*(y) k(y) M_{l3}(y, \xi)\big)\,dy, \tag{4.72} \]
where we have introduced the function $K(x) = \int_x^{+\infty} k(y)\, M_{l3}(y, \xi)\,dy$. Now using (4.13) for $k$ and $k'$, (4.68) and the uniform estimate $\|K(x)\| \leq C$ for all $\xi$ with $|\mathrm{Im}\,\xi| < \kappa$, we deduce from (4.72) that (4.69) holds when $|\xi|$ is large.

Corollary 4.2. Let $\kappa$ be the positive number defined in Lemma 4.1. Then the coefficient $a_{l1}(\xi)$ is analytic on the strip $|\mathrm{Im}\,\xi| < \kappa$ and tends to $I_2$ when $|\xi| \to \infty$. Furthermore, possibly considering a smaller $\kappa$, the coefficient $a_{l1}(\xi)$ is invertible on the strip $|\mathrm{Im}\,\xi| < \kappa$ and $a_{l1}^{-1}(\xi)$ is analytic and uniformly bounded there.

Proof. The first assertion is a direct consequence of (4.60) and Lemma 4.2. Since $a_{l1}(\xi)$ tends to $I_2$ when $|\xi| \to \infty$, $a_{l1}(\xi)$ is clearly invertible for $|\xi|$ large enough. Since $a_{l1}(\xi)$ is also invertible on the real axis ([1, Proposition 2.10]), we conclude that $a_{l1}(\xi)$ is invertible on a strip $|\mathrm{Im}\,\xi| < \epsilon$ with $0 < \epsilon < \kappa$ small enough and that $a_{l1}^{-1}(\xi)$ is analytic and uniformly bounded on $|\mathrm{Im}\,\xi| < \epsilon$. Denoting this $\epsilon$ by $\kappa$, we have proved the corollary.
Let us put all these results together. Since $R(\xi) = -a_{l1}^{-1}(\xi)\, a_{l2}(\xi)$, Corollaries 4.1 and 4.2 imply that the reflection coefficient $R(\xi)$ is analytic on a strip $|\mathrm{Im}\,\xi| < \kappa$ where $\kappa$ is a small enough positive number. Moreover, using the estimates of the same corollaries, we see that $R(\cdot + i\eta) \in L^2(\mathbb{R})$ for all $|\eta| < \kappa$. In fact, we have
\[ \sup_{|\eta| < \kappa} \|R(\cdot + i\eta)\|_{L^2} < \infty. \]
Finally it follows from [26, Theorem IX.13] that the Fourier transform $\hat{R}(\alpha)$ satisfies the estimate
\[ e^{\kappa|\alpha|}\hat{R}(\alpha) \in L^2(\mathbb{R}). \tag{4.73} \]
In particular, the assumption (4.50) in Theorem 4.3 is satisfied by $R(\xi)$. We finish this paper by solving the inverse problem.

Theorem 4.4. Assume that one of the reflection matrices $L(\xi)$ or $e^{-2i\beta} R(\xi)$ appearing in (4.48) is known on a (possibly small) interval of $\mathbb{R}$. Assume moreover that the mass $m$ and the charge $q \neq 0$ of the Dirac fields are known and fixed. Then the parameters $M$, $Q$ and $\Lambda$ of the dS-RN black hole are uniquely determined.

Proof. We only give the proof when the reflection matrix $e^{-2i\beta} R(\xi)$ is supposed to be known on an interval $I$ of $\mathbb{R}$ since the proof with $L(\xi)$ can be treated the same way. We thus consider two reflection matrices $e^{-2i\beta_1} R_1(\xi)$ and $e^{-2i\beta_2} R_2(\xi)$ corresponding to parameters $M_j$, $Q_j$ and $\Lambda_j$ with $j = 1, 2$, where moreover the parameters $m$ and $q \neq 0$ are supposed to be known and fixed. As usual we shall denote all the objects related to $e^{-2i\beta_j} R_j(\xi)$ by a lower index $j$ in what follows. Assume that $e^{-2i\beta_1} R_1(\xi) = e^{-2i\beta_2} R_2(\xi)$ for all $\xi \in I$. By analyticity, we thus have
\[ e^{-2i\beta_1} R_1(\xi) = e^{-2i\beta_2} R_2(\xi), \quad \forall \xi \in \mathbb{R}. \]
Using the procedure explained after Theorem 4.3, this also entails that
\[ e^{2i\beta_1} L_1(\xi) = e^{2i\beta_2} L_2(\xi), \quad \forall \xi \in \mathbb{R}. \]
Thanks to (4.73) and the corresponding result for $L(\xi)$, we can apply Theorem 4.3 (and the remarks following this theorem). Hence we obtain the equality $e^{2i\beta_1} k_1(x) = e^{2i\beta_2} k_2(x)$ for all $x \in \mathbb{R}$ or equivalently
\[ e^{2i\Gamma^1 \beta_1}\, W_1(x) = e^{2i\Gamma^1 \beta_2}\, W_2(x), \quad \forall x \in \mathbb{R}. \tag{4.74} \]
Now recall that $W^2$ is a positive function since
\[ W^2(x) = a_l^2(x) + b^2(x) = \left(l + \frac{1}{2}\right)^2 \frac{F(r)}{r^2} + m^2 F(r). \]
Hence taking the square of (4.74) and then the modulus, we have
\[ W_1^2(x) = a_{l,1}^2(x) + b_1^2(x) = a_{l,2}^2(x) + b_2^2(x) = W_2^2(x), \quad \forall x \in \mathbb{R}. \tag{4.75} \]
Note in particular that
\[ \int_{-\infty}^{+\infty} W_1^2(s)\,ds = \int_{-\infty}^{+\infty} W_2^2(s)\,ds. \tag{4.76} \]
Moreover by homogeneity in $l$ and since $a_l$ and $b$ are positive functions, we deduce from (4.75) that
\[ a_{l,1}(x) = a_{l,2}(x), \qquad b_1(x) = b_2(x), \quad \forall x \in \mathbb{R}. \tag{4.77} \]
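Now since
\[ W(x) = e^{-2i\Gamma^1 C^-(x)} \big(a_l(x)\Gamma^2 + b(x)\Gamma^0\big) \]
by (2.16), it follows from (4.74) and (4.77) that
\[ e^{2i\Gamma^1 \beta_1} e^{-2i\Gamma^1 C_1^-(x)} = e^{2i\Gamma^1 \beta_2} e^{-2i\Gamma^1 C_2^-(x)}, \quad \forall x \in \mathbb{R}, \]
or equivalently that
\[ \beta_1 - C_1^-(x) = \beta_2 - C_2^-(x) + k\pi, \quad \forall x \in \mathbb{R}, \tag{4.78} \]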
where $k \in \mathbb{Z}$. Differentiating (4.78), we obtain
\[ c_1(x) = c_2(x), \quad \forall x \in \mathbb{R}. \tag{4.79} \]
Letting $x$ tend to $\pm\infty$, we obtain from (4.79) and (2.15)
\[ c_{0,1} = c_{0,2}, \qquad c_{+,1} = c_{+,2}. \tag{4.80} \]
Finally, we notice that (4.76) and (4.80) are precisely the conditions under which the parameters $M$, $Q$ and $\Lambda$ were shown to be uniquely determined in the proof of Theorem 4.1 (see the conditions (4.25) and (4.27)). We thus apply the same procedure as before to conclude the proof of the theorem.

References
[1] T. Aktosun, M. Klaus and C. van der Mee, Direct and inverse scattering for selfadjoint Hamiltonian systems on the line, Integr. Equa. Oper. Theory 38 (2000) 129–171.
[2] S. Arians, Geometric approach to inverse scattering for the Schrödinger equation with magnetic and electric potentials, J. Math. Phys. 38(6) (1997) 2761–2773.
[3] T. Daudé, Time-dependent scattering theory for massive charged Dirac fields by a Reissner–Nordström black hole, preprint, Université Bordeaux 1 (2004); available online at http://tel.archives-ouvertes.fr/tel-00011974/en/.
[4] T. Daudé and F. Nicoleau, Recovering the mass and the charge of a Reissner–Nordström black hole by an inverse scattering experiment, Inverse Problems 24 (2008) 025017, 18 pp; Corrigendum, ibid. 25 (2009) 059801.
[5] J. Dereziński and C. Gérard, Scattering Theory of Classical and Quantum N-Particle Systems (Springer, 1997).
[6] P. Deift and E. Trubowitz, Inverse scattering on the line, Comm. Pure Appl. Math 32 (1979) 121–251.
[7] V. Enss and R. Weder, The geometrical approach to multidimensional inverse scattering, J. Math. Phys. 36(8) (1995) 3902–3921.
[8] L. D. Faddeev, The inverse problem in the quantum theory of scattering II, Itogi Nauki i Tekhniki. Ser. Sovrem. Probl. Mat. 3 (1974) 93–180.
[9] Y. Gâtel and D. R. Yafaev, Scattering theory for the Dirac operator with a long-range electromagnetic potential, J. Funct. Anal. 184 (2001) 136–176.
[10] I. M. Gel'fand and Z. Y. Sapiro, Representations of the group of rotations of 3-dimensional space and their applications, Amer. Math. Soc. Trans. 11(2) (1956) 207–316.
[11] I. Gohberg, S. Goldberg and M. A. Kaashoek, Classes of Linear Operators, Vol. 2, Operator Theory: Advances and Applications, Vol. 63 (Birkhäuser, 1993).
[12] B. Grébert, Inverse scattering for the Dirac operator on the real line, Inverse Problems 8 (1992) 787–807.
[13] D. Häfner and J.-P. Nicolas, Scattering of massless Dirac fields by a Kerr black hole, Rev. Math. Phys. 16(1) (2004) 29–123.
[14] S. W. Hawking and G. F. R. Ellis, The Large Scale Structure of Space-Time, Cambridge Monographs on Mathematical Physics, No. 1 (Cambridge Univ. Press, 1973).
[15] D. B. Hinton, A. K. Jordan, M. Klaus and J. K. Shaw, Inverse scattering on the line for a Dirac system, J. Math. Phys. 32(11) (1991) 3015–3030.
[16] H. Isozaki and H. Kitada, Modified wave operators with time-independent modifiers, Papers of the College of Arts and Sciences Tokyo Univ. 32 (1985) 81–107.
[17] W. Jung, Geometric approach to inverse scattering for Dirac equation, J. Math. Phys. 36(8) (1995) 3902–3921.
[18] F. Melnyk, The Hawking effect for spin 1/2 fields, Comm. Math. Phys. 244(3) (2004) 483–525.
[19] E. Mourre, Absence of singular continuous spectrum for certain self-adjoint operators, Comm. Math. Phys. 78 (1981) 391–408.
[20] J.-P. Nicolas, Scattering of linear Dirac fields by a spherically symmetric black hole, Ann. Inst. Henri Poincaré Physique Théorique 62(2) (1995) 145–179.
[21] F. Nicoleau, A stationary approach to inverse scattering for Schrödinger operators with first order perturbation, Comm. Partial Differential Equations 22(3–4) (1997) 527–553.
[22] F. Nicoleau, An inverse scattering problem with the Aharonov–Bohm effect, J. Math. Phys. 8 (2000) 5223–5237.
[23] F. Nicoleau, Inverse scattering for Stark Hamiltonians with short-range potentials, Asymptot. Anal. 35(3–4) (2003) 349–359.
[24] F. Nicoleau, An inverse scattering problem for the Schrödinger equation in a semiclassical process, J. Math. Pures Appl. 86 (2006) 463–470.
[25] R. Novikov, Small angle scattering and X-ray transform in classical mechanics, Arkiv Mat. 37(1) (1999) 141–169.
[26] M. Reed and B. Simon, Methods of Modern Mathematical Physics, Vol. 2 (Academic Press, 1975).
[27] D. Robert, Autour de l'approximation semiclassique, Progress in Mathematics, Vol. 68 (Birkhäuser, Basel, 1987).
[28] R. Wald, General Relativity (University of Chicago Press, 1984).
[29] R. Weder, Multidimensional inverse scattering in an electric field, J. Funct. Anal. 139(2) (1996) 441–465.
June 2, 2010 14:55 WSPC/S0129-055X
148-RMP
J070-00398
Reviews in Mathematical Physics Vol. 22, No. 5 (2010) 485–505 c World Scientific Publishing Company DOI: 10.1142/S0129055X10003989
EULER–POINCARÉ FLOWS ON THE LOOP BOTT–VIRASORO GROUP AND SPACE OF TENSOR DENSITIES AND (2 + 1)-DIMENSIONAL INTEGRABLE SYSTEMS
PARTHA GUHA Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, D-04103 Leipzig, Germany and S. N. Bose National Centre for Basic Sciences, JD Block, Sector-3, Salt Lake, Calcutta-700098, India
[email protected] Received 22 July 2009 Revised 22 January 2010 Dedicated to Professor Tudor Ratiu on his 60th birthday with great respect and admiration

Following the work of Ovsienko and Roger ([54]), we study the loop Virasoro algebra. Using this algebra, we formulate the Euler–Poincaré flows on the coadjoint orbit of the loop Virasoro algebra. We show that the Calogero–Bogoyavlenskii–Schiff equation and various other (2 + 1)-dimensional Korteweg–deVries (KdV) type systems follow from this construction. Using the right invariant $H^1$ inner product on the Lie algebra of the loop Bott–Virasoro group, we formulate the Euler–Poincaré framework of the (2 + 1)-dimensional analogue of the Camassa–Holm equation. This equation appears to be the Camassa–Holm analogue of the Calogero–Bogoyavlenskii–Schiff type (2 + 1)-dimensional KdV equation. We also derive the (2 + 1)-dimensional generalization of the Hunter–Saxton equation. Finally, we give an Euler–Poincaré formulation of a one-parameter family of (1 + 1)-dimensional partial differential equations, known as the b-field equations. Later, we extend our construction to the algebra of loop tensor densities to study the Euler–Poincaré framework of the (2 + 1)-dimensional extension of the b-field equations.

Keywords: Diffeomorphism; loop Virasoro algebra; tensor densities; Calogero–Bogoyavlenskii–Schiff equation; (2 + 1)-dimensional Camassa equation; b-field equation.

Mathematics Subject Classifications 2010: 53A07, 53B50
1. Introduction

The study of higher dimensional integrable systems is one of the most challenging areas in integrable systems. Early studies of integrable systems were mainly restricted to (1 + 1)-dimensional systems because of the difficulty of finding physically significant higher-dimensional solutions which are localized in all directions. Recently, much progress has been achieved in understanding the properties and solutions of two-dimensional integrable models such as the Kadomtsev–Petviashvili (KP) and Davey–Stewartson (DS) equations [1]. One of the most striking features of (2 + 1)-dimensional systems is the existence of exponentially localized structures, called dromions, which are driven by two perpendicular line ghost solitons in the case of the DS equation or two non-perpendicular line ghost solitons in the case of the KP equation. One should recall that the name dromions as well as their spectral meaning were introduced by Fokas and Santini [27]. Recently, rich dromion structures were also found in (2 + 1)-dimensional KdV equations [49, 50, 59, 63].

After the discovery of dromions, the question arises whether there exist exponentially localized structures in (2 + 1)-dimensional breaking soliton equations as well. In such systems, the spectral parameter becomes a multivalued function; in other words, the spectral parameter possesses so-called breaking behavior. The solutions of these equations may become multivalued. There is an equation exhibiting breaking solitons, formulated by Bogoyavlenskii in [5, 6], as one of the (2 + 1)-dimensional reductions of the self-dual Yang–Mills equations. In a series of papers Bogoyavlenskii studied such breaking soliton equations. He extended the well-known Lax representation to the generalized form
\[ L_t = P(L) + \sum_{k=1}^{n} R_k(L, L_{y_k}) + [L, A]. \]
Here $P(L)$ and $R_k(L, L_{y_k})$ are certain meromorphic functions of the operator $L$. In [5, 6], Bogoyavlenskii constructed several hydrodynamic-type systems which are connected to the Toda lattice and the Volterra model. It has been shown that these systems possess the breaking behavior, the Hamiltonian forms and conservation laws. The continuous limits of these systems include the equation
\[ v_t = 4vv_y + 2v_x \partial_x^{-1} v_y - v_{xxy} + \beta_0 (6vv_x - v_{xxx}), \tag{1} \]
which, after the substitution $v = u_x$, is reduced to the potential form
\[ u_{tx} = 4u_x u_{xy} + 2u_y u_{xx} - u_{xxxy}, \tag{2} \]
where we set $\beta_0 = 0$. Schiff [64] obtained the same equation by a different route: he derived Eq. (1) from the reduction of the self-dual Yang–Mills (SDYM) equations from four to three dimensions. There has been considerable interest in showing that the self-dual Yang–Mills equations act as a master integrable equation from which many integrable systems can be obtained by suitable reductions, and this was the original motivation of Schiff. It has been shown in [66] that the generalized SDYM equations contain, as dimensional reductions, various (2 + 1)-dimensional integrable soliton hierarchies which generalize the nonlinear Schrödinger and KdV hierarchies.

One can also derive (2 + 1)-dimensional KdV type systems by another method. Using classical differential geometry, Konopelchenko [44] derived a (2 + 1)-dimensional KdV equation. In geodesic coordinates, the Gauss equation is reduced to the Schrödinger equation, where the Gaussian curvature plays the role of a
potential. It can be shown that a special case is governed by the KdV equation for the Gaussian curvature. In this framework, Konopelchenko [44] studied the integrable dynamics of curvature via the KdV equation, higher KdV equations and other (2 + 1)-dimensional integrable equations with breaking solitons. The bihamiltonian operators for (2 + 1)-dimensional integrable systems were introduced in [28–30].

In an interesting paper, Fokas, Olver and Rosenau [26] proposed an algorithmic construction of the (2 + 1)-dimensional integrable system
\[ q_{xt} - \nu q_{xxxt} + a q_{xy} + b q_{xxxy} + c (q_{xx} q_y + 2 q_x q_{xy}) - c\nu (q_{xxxx} q_y + 2 q_{xxx} q_{xy}) = 0, \tag{3} \]
which yields peakon/dromion type solutions. This equation can be identified with the potential form of the Camassa–Holm (integrable) analogue of the Calogero–Bogoyavlenskii–Schiff equation.

Recently the one-parameter family of shallow water equations of the form
\[ u_t - u_{xxt} + (b + 1) u u_x = b u_x u_{xx} + u u_{xxx}, \tag{4} \]
where $b$ is a real parameter, has drawn some attention. This equation is known as the b-field equation. It was introduced by Degasperis, Holm and Hone [18, 19], who showed the existence of multi-peakon solutions for any value of $b$, although only the special cases $b = 2, 3$ are integrable, having bihamiltonian formulations. The $b = 2$ case is the well-known Camassa–Holm (CH) equation [8], and $b = 3$ is the integrable system discovered by Degasperis and Procesi [20]. One must note that only for $b = 2, 3$ is Eq. (4) also hydrodynamically relevant [14, 41]. It is worth remembering that the $b = 2$ case was later recognized as being included in a class of integrable equations derived from hereditary symmetries by Fokas and Fuchssteiner [25].

Using the Helmholtz field $m := u - u_{xx}$, the b-field equation, or Degasperis–Holm–Hone (DHH) equation (4), can be reformulated in the compact form
\[ m_t + u m_x + b u_x m = 0, \tag{5} \]
where the three terms correspond respectively to evolution, convection and stretching of the one-dimensional flow. In this paper we also study an Euler–Poincaré formulation of the (2 + 1)-dimensional b-field equation. It is worth mentioning that well-posedness and blow-up for the b-field equation were proved in [23], and that its invariance properties were used by Henry [38] to investigate the equation qualitatively.

In a recent paper [35], the author formulated the Euler–Poincaré (EP) framework of the Degasperis–Procesi (DP) equation. It turns out that the DP equation is the Euler–Poincaré flow on the combined space of Hill's (second order) operators and first order differential operators on the circle. In that paper, the author also gave the EP formulation of the two-component generalization of the DP equation. It has also been shown [35] that the Hamiltonian structure obtained from the EP framework exactly coincides with the Hamiltonian structures of the
DP equation obtained by Degasperis et al. In this paper, we give a much shorter derivation of the DP and the b-field equations using the deformation of the vector field structure on $S^1$. Following the work of Ovsienko and Roger [54], we study the loop Virasoro algebra. Using this algebra, we are able to derive the (2 + 1)-dimensional b-field equation.

The aim of this paper is to contribute towards a theory of integrable geodesic flows on infinite-dimensional Lie groups, a subject which has attracted tremendous attention since Arnold's seminal paper [2] on the Euler equation in hydrodynamics. Later, Ebin and Marsden [22] established a proper geometric setting for this problem. They showed that the geodesic spray is smooth. This led to very nice existence proofs; the limit of zero viscosity for manifolds with no boundary was shown to exist for the first time. It is worth mentioning that in recent years, for equations like the Camassa–Holm model for shallow water waves or the Hunter–Saxton [40] model for nematic liquid crystals, the geometric structure has been used to study qualitative properties of the solutions. The geometric structure of the Hunter–Saxton equation and its relevance have been studied by Lenells [45]. The geometric approach leads to the construction of global weak solutions in the periodic case [46], the case of global weak solutions without periodicity being investigated by Bressan and Constantin [3]. One must note that in (1 + 1) dimensions both the Camassa–Holm and the Degasperis–Procesi equations admit traveling waves that are peaked and orbitally stable, so that these patterns are physically detectable [17, 48]. These peakons capture the main feature of the exact traveling wave solutions of greatest height of the governing equations for water waves [16].

The KdV equation is an Euler–Poincaré equation on the Virasoro–Bott group (see [42, 57, 65]).
This group is defined as the unique (up to isomorphism) nontrivial central extension of the group $\mathrm{Diff}(S^1)$ of all diffeomorphisms of $S^1$. The inertia operator is given by the standard $L^2$-metric on $S^1$. It is known that the two-component KdV and Camassa–Holm equations are also geodesic flows on the extended Virasoro–Bott group [32–34]. It is worth pointing out that for the Camassa–Holm equation the geometric approach leads to a proof that the equation satisfies the Least Action Principle [10, 11].

Infinite-dimensional groups also play an important role in the construction of (2 + 1)-dimensional integrable systems. One form of the (2 + 1)-dimensional KdV and nonlinear Schrödinger equations can be derived from the toroidal Lie algebra. Here the variable $x$ is associated to the action of the usual affine part of the toroidal Lie algebra [4, 61], while the evolutions in $y$ and $t$ are induced by the action of the genuine toroidal part. The weight of $v$ and the relative weight of $y$ and $t$ are balanced with that of $x$, which leaves two freedoms in determining the weights of all the variables.

In this paper we study the (formal) Euler–Poincaré framework [52] of various (2 + 1)-dimensional KdV type systems. Until now there has been no systematic method for constructing (2 + 1)-dimensional integrable systems from the viewpoint of geodesic flows or the Euler–Poincaré framework. In particular, we show that the
Calogero–Bogoyavlenskii–Schiff equation arises as a geodesic flow on the loop Bott–Virasoro group. This equation is an eminent member of the (2 + 1)-dimensional family of KdV equations [39]. We also study the Euler–Poincaré framework of the Bogoyavlenskii–Konopelchenko equation. In fact, only a short list of (2 + 1)-dimensional equations is known to admit an EP formulation. Recently, Ovsienko [58] studied the bihamiltonian properties of the Martinez Alonso–Shabat type system
\[ u_{xt} = u_y u_{xy} - u_{yy} u_x. \]
Another nonlinear differential equation,
\[ u_{tx} = u_{xx} u_y - u_{xy} u_x + c u_{yy}, \]
has been mentioned in [54].

In the second half of the paper, we construct a higher-dimensional Camassa–Holm equation. We show that the (2 + 1)-dimensional Camassa–Holm equation arises as a geodesic flow with respect to the right-invariant $H^1$ metric on the cotangent loop Virasoro group. We also compute the (2 + 1)-dimensional Hunter–Saxton equation. The results of this paper were announced [36] at the Oberwolfach meeting on geometrical mechanics; this is a long version of [36].

The paper is organized as follows. In Sec. 2, we present the Euler–Poincaré formalism and frozen Poisson structures. The loop Virasoro algebra is introduced in Sec. 3. In Sec. 4, we give the Euler–Poincaré formulation of the (2 + 1)-dimensional KdV equation. Section 5 is devoted to the derivation of the (2 + 1)-dimensional Camassa–Holm equation and the Hunter–Saxton equation. In Sec. 6, we present the Euler–Poincaré framework of the b-field equation. The formulation of the (2 + 1)-dimensional b-field equation is given in Sec. 7.

2. The Euler–Poincaré Formalism

The Euler–Poincaré equations were born in 1901 (see [52] for details) when Poincaré made an extensive generalization of the classical Euler equations for the rigid body and ideal fluids.
He did this by formulating the equations on a general Lie algebra, with the rigid body being associated with the rotation Lie algebra and fluids with the Lie algebra of divergence-free vector fields.

We give a rapid introduction to the Euler–Poincaré framework. Let $G$ be a Lie group, $\mathfrak{g}$ its Lie algebra and $\mathfrak{g}^*$ the dual of $\mathfrak{g}$. $G$ can be thought of as the configuration space of some physical system, for example the group $SO(3)$ for a rigid body and the group $\mathrm{sDiff}(M)$ of volume-preserving diffeomorphisms for an ideal fluid filling a domain $M$. The dual space $\mathfrak{g}^*$ of any Lie algebra $\mathfrak{g}$ carries a natural Lie–Poisson structure:
\[ \{f, g\}_{LP}(\mu) := \langle [df, dg], \mu \rangle \]
for any $\mu \in \mathfrak{g}^*$ and $f, g \in C^\infty(\mathfrak{g}^*)$.
The Hamiltonian vector field on $\mathfrak{g}^*$ corresponding to a Hamiltonian function $f$, computed with respect to the Lie–Poisson structure, is given by
\[ \frac{d\mu}{dt} = \mathrm{ad}^*_{df}\,\mu, \qquad \mu \in \mathfrak{g}^*, \tag{6} \]
which implies that the Hamiltonian vector field is $X_f(\mu) = \mathrm{ad}^*_{df}\,\mu$.

Let us fix some quadratic form, an energy function, on $\mathfrak{g}$. Consider the right translations of this quadratic form to the tangent space at any point of the group. In this way we define a right-invariant Riemannian metric on the group using the energy function. The geodesic flow on $G$ with respect to this metric represents the extremals of the principle of least action, which trace out the actual motion of the physical system. We identify the Lie algebra and its dual by means of this quadratic form. This identification is done via the inertia operator $I : \mathfrak{g} \to \mathfrak{g}^*$. This allows us to rewrite the Euler–Poincaré equation on the dual space $\mathfrak{g}^*$. It has been proved that the EP equation on $\mathfrak{g}^*$ is Hamiltonian with respect to the natural Lie–Poisson structure on the dual space.

Definition 2.1. The Euler–Poincaré equation on $\mathfrak{g}^*$ corresponding to the Hamiltonian $H(\mu) = \frac{1}{2}\langle I^{-1}\mu, \mu \rangle$ is given by
\[ \frac{d\mu}{dt} = \mathrm{ad}^*_{I^{-1}\mu}\,\mu, \qquad I^{-1}\mu \in \mathfrak{g}, \tag{7} \]
where $\mathrm{ad}^*_{(\cdot)}\mu$ is the coadjoint operator, dual to the operator $[\mu, \cdot]$ defining the structure of the Lie algebra $\mathfrak{g}$. Equation (7) characterizes the evolution of a point $\mu \in \mathfrak{g}^*$.

2.1. Frozen Lie–Poisson structure

Consider the dual $\mathfrak{g}^*$ of the Lie algebra $\mathfrak{g}$ with the Poisson structure given by the “frozen” Lie–Poisson structure. In other words, we fix some point $\mu_0 \in \mathfrak{g}^*$ and define a Poisson structure by
\[ \{f, g\}_0 := \langle [df(\mu), dg(\mu)], \mu_0 \rangle, \]
which satisfies the Jacobi identity. It plays an important role in integrable systems, particularly in the construction of the first Hamiltonian structure of the underlying integrable system. We can give another interpretation [12, 13] of the frozen structure from the definition of a cocycle. Given an inertia operator $I : \mathfrak{g} \to \mathfrak{g}^*$, one can define a constant Poisson structure
\[ \{f, g\}_0(\mu) = \langle df, I\,dg \rangle, \]
where $\mu \in \mathfrak{g}^*$.
A two-cocycle $\omega$ is called a coboundary if there is a point $\mu_0 \in \mathfrak{g}^*$ such that
\[ \omega(p, q) = \langle [p, q], \mu_0 \rangle, \]
where $p, q \in \mathfrak{g}$. Since the Poisson structure is generated by a coboundary $\omega$, we obtain the frozen Lie–Poisson structure. This behaves like a Lie–Poisson structure frozen at the point $\mu_0 \in \mathfrak{g}^*$, and it coincides with the previous definition of the frozen structure. It is easy to check that the above Poisson structures are compatible, i.e. their linear combination, or pencil of Poisson structures,
\[ \{\,,\}_\lambda = \{\,,\}_0 + \lambda \{\,,\}_{LP} \tag{8} \]
is again a Poisson structure for all $\lambda \in \mathbb{R}$. It was shown by Khesin and Misiolek [43] that:

Proposition 2.2. The brackets $\{\cdot,\cdot\}_{LP}$ and $\{\cdot,\cdot\}_0$ are compatible for every “freezing” point $\mu_0 \in \mathfrak{g}^*$.

At this point, we can introduce the bihamiltonian structure. The notion of integrability can be understood from this structure. The standard way to understand bihamiltonian vector fields on the dual of the Lie algebra is associated to Lie–Poisson structures.

Definition 2.3. A vector field $X$ on $\mathfrak{g}^*$ is called bi-Hamiltonian if there are two functions $H_1$ and $H_2$ such that $X$ is a Hamiltonian vector field of $H_1$ with respect to the Poisson structure $\{\,,\}_{LP}$ and is a Hamiltonian vector field of $H_2$ with respect to $\{\,,\}_0$.

3. Loop Virasoro Algebra and (2 + 1)-Dimensional KdV Flows

We wish to extend the Virasoro algebra to the case of two space variables. A natural way to do this is to consider the loops on it. One defines the loop group on $\mathrm{Diff}(S^1)$ as follows:
\[ L(\mathrm{Diff}(S^1)) = \{ \phi : S^1 \to \mathrm{Diff}(S^1) \mid \phi \text{ is differentiable} \}, \]
the group law being given by
\[ (\phi \circ \psi)(y) = \phi(y) \circ \psi(y), \qquad y \in S^1. \]
In a similar manner, we construct the Lie algebra $L(\mathrm{Vect}(S^1))$ consisting of vector fields on $S^1$ depending on one more independent variable $y \in S^1$. The loop variable is thus denoted by $y$ and the variable on the “target” copy of $S^1$ by $x$. The elements of $L(\mathrm{Vect}(S^1))$ are of the form $f(x, y)\frac{\partial}{\partial x}$, where $f \in C^\infty(S^1 \times S^1)$, and the Lie bracket reads as follows [54]:
\[ \left[ f(x, y)\frac{\partial}{\partial x},\ g(x, y)\frac{\partial}{\partial x} \right] = \big( f(x, y)\, g_x(x, y) - f_x(x, y)\, g(x, y) \big)\frac{\partial}{\partial x}. \]
It is easy to convince oneself that $L(\mathrm{Vect}(S^1))$ is the Lie algebra of $L(\mathrm{Diff}(S^1))$ in the usual weak sense for the infinite-dimensional case; a one-parameter group
argument gives an identification between the tangent space to $L(\mathrm{Diff}(S^1))$ at the identity and $L(\mathrm{Vect}(S^1))$, equipped with its Lie bracket. In what follows, we denote $L(\mathrm{Vect}(S^1))$ by $\tilde{\mathfrak{g}}$. The natural pairing between the loop Virasoro algebra and its dual is given by
\[ \left\langle f(x, y)\frac{\partial}{\partial x},\ v(x, y)\,dx^2 \right\rangle = \int_{S^1 \times S^1} f v \,dx\,dy. \tag{9} \]

3.1. Cocycle and extension of loop Virasoro algebra

Consider the following “modified” Gelfand–Fuchs cocycle on $\mathrm{Vect}(S^1)$:
\[ \omega_{mGF}\left( f(x)\frac{d}{dx},\ g(x)\frac{d}{dx} \right) = \int_{S^1} (a f' g'' + b f' g)\,dx, \tag{10} \]
where the first term is the original Gelfand–Fuchs cocycle. This cocycle is cohomologous to the Gelfand–Fuchs cocycle; hence the corresponding central extension is isomorphic to the Virasoro algebra. The additional term is a coboundary term. It is easy to check that the functional
\[ \int_{S^1} f' g \,dx = \frac{1}{2}\int_{S^1} (f' g - f g')\,dx \]
depends on the commutator of $f\frac{d}{dx}$ and $g\frac{d}{dx}$.

Let us give the explicit formulae for the non-trivial 2-cocycles [31] on $\tilde{\mathfrak{g}}$. A distribution $\lambda$ on $C^\infty(S^1)$ corresponds to a 2-cocycle of the first class given by [67]
\[ \mu_\lambda(f, g) = \lambda\left( \int_{S^1} f g_{xxx}\,dx \right); \]
these are the Virasoro type extensions. For the particular case where $\lambda(a(y)) = \int_{S^1} a(y)\,dy$, such a 2-cocycle will be denoted by $\omega_1$, so that one has
\[ \omega_1(f, g) = \int_{S^1 \times S^1} f g_{xxx}\,dx\,dy. \tag{11} \]
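We define the Lie algebra $\hat{\mathfrak{g}}$ as the one-dimensional central extension of $\tilde{\mathfrak{g}}$ given by the cocycle $\omega_1$. As a vector space,
\[ \hat{\mathfrak{g}} = \tilde{\mathfrak{g}} \oplus \mathbb{R}, \]
where the summand $\mathbb{R}$ is the center of $\hat{\mathfrak{g}}$. The commutator in $\hat{\mathfrak{g}}$ is given by the following explicit expression, which readily follows from the above formulae:
\[ \left[ \left( f\frac{\partial}{\partial x}, a \right), \left( g\frac{\partial}{\partial x}, b \right) \right] = \left( (f g_x - f_x g)\frac{\partial}{\partial x},\ \int_{S^1 \times S^1} f g_{xxx}\,dx\,dy \right), \tag{12} \]
where the last term is an element of the center of $\hat{\mathfrak{g}}$.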
4. EP Formulation of Calogero–Bogoyavlenskii–Schiff Type (2 + 1)-Dimensional KdV Equation

In this section, we give an Euler–Poincaré derivation of the Calogero–Bogoyavlenskii–Schiff equation (1) or (2). It is one of the most important members of the (2 + 1)-dimensional KdV family. We recall the Kirillov–Segal result and generalize it to the case of the Lie algebra $\hat{\mathfrak{g}}$.

Proposition 4.1. The coadjoint action of the Lie algebra $\hat{\mathfrak{g}}$ is given by
\[ \mathrm{ad}^*_{f(x,y)\frac{\partial}{\partial x}} \big( v(x, y)\,dx^2 \big) = (f v_x + 2 f_x v + c_1 f_{xxx} + c_2 f_x)\,dx^2, \tag{13} \]
while the center acts trivially.

Corollary 4.2. The Hamiltonian operator $O_{LV}$ corresponding to the coadjoint action of the loop Virasoro algebra is given by
\[ O_{LV} = \partial_x v + v \partial_x + c_1 \partial_x^3 + c_2 \partial_x. \tag{14} \]
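As a quick consistency check (a sketch, not part of the original argument), the operator (14) can be read off from (13) by regarding the right-hand side of (13) as a linear operator acting on $f$. Here $\partial_x v$ is understood as the composition $f \mapsto (v f)_x$, which is an assumption about the intended operator ordering:

```latex
\begin{align*}
O_{LV} f &= (\partial_x v + v\,\partial_x + c_1 \partial_x^3 + c_2 \partial_x) f \\
         &= (v f)_x + v f_x + c_1 f_{xxx} + c_2 f_x \\
         &= f v_x + 2 f_x v + c_1 f_{xxx} + c_2 f_x .
\end{align*}
```

The last line agrees term by term with the coefficient of $dx^2$ in (13).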
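Consider a functional $H$ on $\hat{\mathfrak{g}}^*$ which is a (pseudo)differential polynomial:
\[ H(g, v) = \int_{S^1 \times S^1} h\big( g, v, g_x, v_x, g_y, v_y, \partial_x^{-1} g, \partial_x^{-1} v, \partial_y^{-1} g, \partial_y^{-1} v, g_{xy}, v_{xy}, \ldots \big)\,dx\,dy, \]
where $h$ is a polynomial in an infinite set of variables. For instance,
\[ \frac{\delta H}{\delta v} = h_v - \frac{\partial}{\partial x}(h_{v_x}) - \frac{\partial}{\partial y}(h_{v_y}) - \partial_x^{-1}(h_{\partial_x^{-1} v}) - \partial_y^{-1}(h_{\partial_y^{-1} v}) + \frac{\partial^2}{\partial x^2}(h_{v_{xx}}) + \frac{\partial^2}{\partial x \partial y}(h_{v_{xy}}) + \frac{\partial^2}{\partial y^2}(h_{v_{yy}}) \pm \cdots, \]
where, as usual, $h_v$ means the partial derivative $\frac{\partial h}{\partial v}$, and similarly $h_{v_x} = \frac{\partial h}{\partial v_x}$.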
Proposition 4.3. The Euler–Poincaré flow restricted to the hyperplane $c_1 = 1$, $c_2 = 0$ at $(0, v\,dx^2)$ yields the Calogero–Bogoyavlenskii–Schiff equation (or (2 + 1)-dimensional KdV equation)
\[ v_t = v_{xxy} + 2 v v_y + v_x \partial_x^{-1} v_y \tag{15} \]
for the Hamiltonian $H = \frac{1}{2}\int_{S^1 \times S^1} v\,\partial_x^{-1} v_y \,dx\,dy$. We use the expression [58]
\[ (\partial_x^{-1} v)(x, y) = \int_0^x v(\xi, y)\,d\xi - \int_0^{2\pi} v(x, y)\,dx. \]
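The computation behind Proposition 4.3 can be sketched as follows. This is a hedged outline rather than the author's proof: it assumes the sign convention $v_t = O_{LV}\,\delta H/\delta v$, and it assumes that $\partial_x^{-1}\partial_y$ is symmetric with respect to the $L^2$ pairing on $S^1 \times S^1$ (boundary contributions are dropped), so that the variational derivative of the quadratic Hamiltonian is $\partial_x^{-1} v_y$.

```latex
\begin{align*}
\frac{\delta H}{\delta v} &= \partial_x^{-1} v_y
  \qquad \text{for } H = \tfrac12 \int_{S^1\times S^1} v\,\partial_x^{-1} v_y \,dx\,dy, \\
O_{LV}\,\partial_x^{-1} v_y
  &= \partial_x\!\left(v\,\partial_x^{-1} v_y\right) + v\,v_y + c_1 v_{xxy} + c_2 v_y \\
  &= v_x\,\partial_x^{-1} v_y + 2 v v_y + c_1 v_{xxy} + c_2 v_y .
\end{align*}
```

Setting $c_1 = 1$ and $c_2 = 0$ in the last line gives the right-hand side of (15).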
4.1. The Bogoyavlenskii–Konopelchenko equation

In this section, we derive several other (2 + 1)-dimensional KdV type equations. The Euler–Poincaré formalism of the Bogoyavlenskii–Konopelchenko equation
\[ v_t + \beta v_{xxy} + 3\alpha + v_{xxx} + 3 v v_x + 2\beta v v_y + \beta v_x \partial_x^{-1} v_y = 0, \tag{16} \]
with $\partial_x^{-1} v(x, y) = \int_0^x v(\xi, y)\,d\xi - \int_0^{2\pi} v(x, y)\,dx$, is closely related to the Calogero–Bogoyavlenskii–Schiff equation. In fact, it is a combination of the KdV and Calogero–Bogoyavlenskii–Schiff flows. Equation (16) models the (2 + 1)-dimensional interaction of a Riemann wave propagating along the $y$-axis with a long wave along the $x$-axis. Using (14), we obtain the following result.

Proposition 4.4. The Euler–Poincaré flow associated to the loop Virasoro algebra $\hat{\mathfrak{g}}$ yields the Bogoyavlenskii–Konopelchenko equation (for $\alpha = 0$) for the Hamiltonian
\[ H = \frac{1}{2}\int_{S^1 \times S^1} (v^2 + \beta v\,\partial_x^{-1} v_y)\,dx\,dy, \]
when restricted to the hyperplane $c_1 = 1$, $c_2 = 0$.

Outline of Proof. We use the EP equation
\[ v_t = -O_{LV}\frac{\delta H}{\delta v} \]
to obtain our result.
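The outline above can be expanded into the following sketch. It is offered under the same assumption as before, namely that $\partial_x^{-1}\partial_y$ is symmetric for the $L^2$ pairing so that the quadratic term contributes $\beta\,\partial_x^{-1} v_y$ to the variational derivative; the operator $O_{LV}$ is taken from (14) with $c_1 = 1$, $c_2 = 0$.

```latex
\begin{align*}
\frac{\delta H}{\delta v} &= v + \beta\,\partial_x^{-1} v_y, \\
O_{LV}\, v &= (v^2)_x + v v_x + v_{xxx} = 3 v v_x + v_{xxx}, \\
O_{LV}\,\partial_x^{-1} v_y &= v_x\,\partial_x^{-1} v_y + 2 v v_y + v_{xxy}, \\
v_t = -O_{LV}\frac{\delta H}{\delta v}
  \;&\Longrightarrow\;
  v_t + \beta v_{xxy} + v_{xxx} + 3 v v_x + 2\beta v v_y + \beta v_x\,\partial_x^{-1} v_y = 0 .
\end{align*}
```

The resulting equation is (16) with $\alpha = 0$.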
Another class of (2 + 1)-dimensional KdV equations was proposed by Lou and his collaborators [47, 49, 50] to study the rich dromion structures. It is defined as
\[ v_t + v_{xxx} = 3 (v\,\partial_y^{-1} v_x)_x, \tag{17} \]
where $\partial_y^{-1}$ is defined similarly to $\partial_x^{-1}$ [58]. This equation reduces to the usual (1 + 1)-dimensional KdV equation. We use the “frozen” Lie–Poisson structure to compute the Hamiltonian operator at $(v(x)\,dx^2, c_1, c_2) = (0, 0, 1)$. We also assume that the only cocycle term is $\int_{S^1} f' g$, induced by the coboundary term. Freezing at the point $(0, 0, 1)$ yields the truncated Hamiltonian operator
\[ \tilde{O}_1 = \partial_x. \]
We also compute the Hamiltonian operator at $(v(x)\,dx^2, c_1, c_2) = (0, 1, 0)$, given by
\[ \tilde{O}_2 = \partial_x^3. \]

Proposition 4.5. The second class of (2 + 1)-dimensional KdV equations follows from the following combination of flows on $\hat{\mathfrak{g}}^*$:
\[ v_t = \mu\,\tilde{O}_1 \frac{\delta H_1}{\delta v} + \lambda\,\tilde{O}_2 \frac{\delta H_2}{\delta v}, \]
with $\frac{\delta H_1}{\delta v} = v\,\partial_y^{-1} v_x$ and $\frac{\delta H_2}{\delta v} = v$, respectively. Here $\mu = 3$ and $\lambda = -1$.
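For the reader's convenience, the direct computation can be written out in one step. It uses only the two frozen operators $\tilde{O}_1 = \partial_x$ and $\tilde{O}_2 = \partial_x^3$ and the variational derivatives stated in the proposition:

```latex
\begin{align*}
v_t &= 3\,\tilde O_1\!\left(v\,\partial_y^{-1} v_x\right) - \tilde O_2\, v \\
    &= 3\left(v\,\partial_y^{-1} v_x\right)_x - v_{xxx},
\end{align*}
```

which is Eq. (17) after moving $v_{xxx}$ to the left-hand side.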
Proof. By direct computation.

5. $H^1$ Metric and (2 + 1)-Dimensional Camassa–Holm and Hunter–Saxton Systems

In this section, we study the Camassa–Holm analogue of the (2 + 1)-dimensional KdV equations from the integrable point of view. The hydrodynamical analogue
was derived in [41]. We start with the explicit expression for the coadjoint action of $\hat{\mathfrak{g}}$ with respect to the right-invariant $H^1$-metric. Let us introduce the $H^1$ norm on the algebra $\tilde{\mathfrak{g}}$.

Definition 5.1. The $H^1$-Sobolev norm on the loop Virasoro algebra is defined as
\[ \left\langle f(x, y)\frac{\partial}{\partial x},\ u(x, y)\,dx^2 \right\rangle_{H^1} = \int_{S^1} f u \,dx + \nu \int_{S^1} \partial_x f\,\partial_x u \,dx. \tag{18} \]
Now we compute the coadjoint action.

Proposition 5.2. The coadjoint action with respect to the $H^1$ metric of the loop Virasoro algebra $\hat{\mathfrak{g}}$ is given by
\[ \mathrm{ad}^*_{f(x,y)\frac{\partial}{\partial x}} (v\,dx^2) = (f \tilde{v}_x + 2 f_x \tilde{v} + c_1 f_{xxx} + c_2 f_x)\,dx^2, \]
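where $\tilde{v} = (1 - \nu\partial_x^2) v$.

Proof. We know that
\[ \left\langle \mathrm{ad}^*_{f\frac{\partial}{\partial x}} v\,dx^2,\ h\frac{\partial}{\partial x} \right\rangle_{H^1} = \left\langle v\,dx^2,\ \left[ f\frac{\partial}{\partial x}, h\frac{\partial}{\partial x} \right] \right\rangle_{H^1} = \left\langle v\,dx^2,\ \left( (f h_x - f_x h)\frac{\partial}{\partial x},\ \int_{S^1 \times S^1} f h_{xxx}\,dx\,dy,\ \int_{S^1 \times S^1} f h_x \,dx\,dy \right) \right\rangle_{H^1}. \]
Thus from the right-hand side we obtain the matrix expression. We now compute the left-hand side of the above equation. Let us denote
\[ \hat{f} = \left( f\frac{\partial}{\partial x}, c \right), \qquad \hat{g} = \left( g\frac{\partial}{\partial x}, d \right), \qquad \hat{h} = \left( h\frac{\partial}{\partial x}, e \right), \]
where $c = (c_1, c_2)$, $d = (d_1, d_2)$ and $e = (e_1, e_2)$. Now we compute the left-hand side:
\[ \mathrm{LHS} = \int_{S^1 \times S^1} (\mathrm{ad}^*_{\hat{f}}\,\hat{g})\,\hat{h}\,dx\,dy + \nu \int_{S^1 \times S^1} \partial_x(\mathrm{ad}^*_{\hat{f}}\,\hat{g})\,\partial_x \hat{h}\,dx\,dy = \int_{S^1 \times S^1} \big[ (1 - \nu\partial_x^2)\,\mathrm{ad}^*_{\hat{f}}\,\hat{g} \big]\,\hat{h}\,dx\,dy. \]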
Thus by equating the right-hand side and the left-hand side, we obtain the above formula.

Lemma 5.3. The Hamiltonian operator corresponding to the coadjoint action of the loop Virasoro algebra with respect to the $H^1$ metric is given by
\[ O_{H^1} = (1 - \nu\partial_x^2)^{-1} \big( \partial_x \tilde{v} + \tilde{v}\partial_x + c_1 \partial_x^3 + c_2 \partial_x \big), \tag{19} \]
where $\tilde{v} = (1 - \nu\partial_x^2) v$.

Let us study the Euler–Poincaré flow associated to the $H^1$ metric on the coadjoint orbit of the cotangent loop Virasoro algebra $\hat{\mathfrak{g}}$.
Proposition 5.4. The Euler–Poincaré flow with respect to the $H^1$-norm on the dual space of the loop Virasoro algebra becomes
\[ v_t = O_{H^1}\frac{\delta H}{\delta v}, \tag{20} \]
where $O_{H^1}$ is defined by (19). Suppose the quadratic Hamiltonian on $\tilde{\mathfrak{g}}^*$ is defined as
\[ H = \frac{1}{2}\int_{S^1 \times S^1} v\,\partial_x^{-1} v_y \,dx\,dy; \]
then the Euler–Poincaré flow yields
\[ \tilde{v}_t = \tilde{v}_x \partial_x^{-1} v_y + 2\tilde{v} v_y + c_1 v_{xxy} + c_2 v_y. \tag{21} \]
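The step from (20) to (21) can be sketched as follows. As a working assumption, $\delta H/\delta v = \partial_x^{-1} v_y$ (symmetry of $\partial_x^{-1}\partial_y$ under the $L^2$ pairing), and $(1 - \nu\partial_x^2)$ is taken to commute with $\partial_t$, so that $(1 - \nu\partial_x^2) v_t = \tilde{v}_t$:

```latex
\begin{align*}
(1-\nu\partial_x^2)\, v_t
  &= \left(\partial_x \tilde v + \tilde v\,\partial_x + c_1\partial_x^3 + c_2\partial_x\right)\partial_x^{-1} v_y \\
  &= \left(\tilde v\,\partial_x^{-1} v_y\right)_x + \tilde v\, v_y + c_1 v_{xxy} + c_2 v_y \\
  &= \tilde v_x\,\partial_x^{-1} v_y + 2\tilde v\, v_y + c_1 v_{xxy} + c_2 v_y .
\end{align*}
```

This has the same structure as the computation for the operator $O_{LV}$, with $v$ replaced by $\tilde{v}$ in the non-central terms.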
Corollary 5.5. The Euler–Poincaré flow restricted to the hyperplane $c_2 = 0$ yields the Camassa–Holm analogue of the Calogero–Bogoyavlenskii–Schiff equation
\[ v_t - \nu v_{xxt} + c_1 v_{xxy} + (v_x - \nu v_{xxx})\partial_x^{-1} v_y + 2 (v - \nu v_{xx}) v_y = 0 \tag{22} \]
for the Hamiltonian $H = \int_{S^1 \times S^1} g\,\partial_x^{-1} v_y \,dx\,dy$.
Corollary 5.6. The potential form of the (2 + 1)-dimensional Camassa–Holm equation is
\[ u_{xt} - \nu u_{xxxt} + c_1 u_{xxxy} + u_{xx} u_y + 2 u_x u_{xy} - \nu (u_{xxxx} u_y + 2 u_{xxx} u_{xy}) = 0 \tag{23} \]
for $v = u_x$.

Remark. In the special (1 + 1)-dimensional case ($y = x$), Eq. (23) reduces to the potential Camassa–Holm equation. If we further assume $\nu = 0$, then Eq. (23) reduces to the potential KdV equation.

Corollary 5.7. The Euler–Poincaré flow restricted to the hyperplane $c_1 = 1$, $c_2 = 1$ yields the modified Calogero–Bogoyavlenskii–Schiff equation
\[ v_t - \nu v_{xxt} + v_y + v_{xxy} + (v_x - \nu v_{xxx})\partial_x^{-1} v_y + 2 (v - \nu v_{xx}) v_y = 0, \]
and its potential form is
\[ u_{xt} - \nu u_{xxxt} + u_{xy} + u_{xxxy} + (u_{xx} u_y + 2 u_x u_{xy}) - \nu (u_{xxxx} u_y + 2 u_{xxx} u_{xy}) = 0. \]

Corollary 5.8. If we assume $\frac{1}{\nu} \to 0$, then Eq. (22) takes the form
\[ v_{xxt} + v_{xxx}\partial_x^{-1} v_y + 2 v_{xx} v_y = 0, \tag{24} \]
which is known as the (2 + 1)-dimensional Hunter–Saxton equation. In the (1 + 1)-dimensional case ($y = x$), this reduces to the Hunter–Saxton equation
\[ v_{xxt} + v_{xxx} v + 2 v_{xx} v_x = 0. \tag{25} \]
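The limit in Corollary 5.8 can be made explicit by a short rescaling argument. The sketch below assumes that $c_1$ is held fixed while $1/\nu \to 0$, and, for the reduction $y = x$, that $\partial_x^{-1} v_x$ may be identified with $v$ (the constant of integration is dropped); neither assumption is spelled out in the text. Multiplying (22) by $-1/\nu$ gives

```latex
\begin{align*}
0 &= -\frac{1}{\nu}\Big[ v_t - \nu v_{xxt} + c_1 v_{xxy}
      + (v_x - \nu v_{xxx})\,\partial_x^{-1} v_y + 2 (v - \nu v_{xx})\, v_y \Big] \\
  &= v_{xxt} + v_{xxx}\,\partial_x^{-1} v_y + 2 v_{xx} v_y
     - \frac{1}{\nu}\Big[ v_t + c_1 v_{xxy} + v_x\,\partial_x^{-1} v_y + 2 v v_y \Big].
\end{align*}
```

As $1/\nu \to 0$ the bracketed terms drop out, leaving (24). Replacing $\partial_x^{-1} v_y$ by $v$ and $v_y$ by $v_x$ then gives (25).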
The potential form of Eq. (24) takes the form
\[ u_{xxxt} + u_{xxxx} u_y + 2 u_{xxx} u_{xy} = 0. \tag{26} \]
6. Euler–Poincaré Framework of (1 + 1)-Dimensional b-Field Equation

Denote by $F_\mu(S^1)$ the space of tensor densities of degree $\mu$ on $S^1$,
\[ F_\mu = \{ a(x)\,dx^\mu \mid a(x) \in C^\infty(S^1) \}, \]
where $\mu$ is the degree and $x$ is a local coordinate on $S^1$. As a vector space, $F_\mu(S^1)$ is isomorphic to $C^\infty(S^1)$ [53]. Geometrically we say
\[ F_\lambda \in \Gamma(\Omega^{\otimes\lambda}), \qquad \text{where } \Omega^{\otimes\lambda} = (T^* S^1)^{\otimes\lambda}, \]
and $\Omega = T^* S^1$ is the cotangent bundle of $S^1$. Here $F_0(M) = C^\infty(M)$, and the spaces $F_1(M)$ and $F_{-1}(M)$ coincide with the spaces of differential forms and vector fields, respectively.

Definition 6.1. The b-bracket between $v(x)\frac{d}{dx}$ and $w(x)\frac{d}{dx}$ is defined as
\[ [v, w]_b = v w_x - (b - 1) v_x w. \tag{27} \]
This b-bracket can also be expressed as
\[ [v, w]_b = \frac{b}{2}[v, w] - \frac{b - 2}{2}[v, w]_{sym}, \tag{28} \]
where $[v, w] = v w_x - v_x w$ and $[v, w]_{sym} = v w_x + v_x w$.

Remark. The b-bracket can be interpreted as an action of $\mathrm{Vect}(S^1)$ on $F_{-(b-1)}(S^1)$, the tensor densities on $S^1$ of degree $-(b - 1)$. For $b = 2$ this is just the action of vector fields on themselves, corresponding to a Lie algebra. Moreover, because of the $[v, w]_{sym}$ term, the b-bracket is not a skew-symmetric bracket; it is a deformation of the bracket of vector fields.

There is a pairing $\langle\,,\rangle : F_\mu \otimes F_{1-\mu} \to \mathbb{R}$ given by
\[ \langle a(x)(dx)^\mu,\ b(x)(dx)^{1-\mu} \rangle = \int_{S^1} a(x) b(x)\,dx, \tag{29} \]
which is $\mathrm{Diff}(S^1)$-invariant. A vector field $f(x)\frac{d}{dx}$ acts on the space of tensor densities $F_\mu$ by the Lie derivative
\[ L^\mu_{f(x)\frac{d}{dx}}(a(x)) = \big( f(x) a'(x) + \mu f'(x) a(x) \big)(dx)^\mu. \tag{30} \]
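The b-bracket (27), its splitting into a skew part with coefficient $b/2$ and a symmetric part with coefficient $-(b - 2)/2$ as in (28), and the Lie derivative (30) with $\mu = -(b - 1)$ can be cross-checked mechanically. The following is a minimal sketch, not part of the original paper: it uses polynomials with exact rational coefficients as stand-ins for smooth functions (the identities are pointwise algebraic, so periodicity plays no role), and all helper names (`deriv`, `mul`, `b_bracket`, `b_bracket_split`, `lie_derivative`) are hypothetical.

```python
from fractions import Fraction


def deriv(p):
    """Derivative of a polynomial given as a coefficient list [c0, c1, ...]."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]


def add(p, q):
    """Sum of two polynomials (coefficient lists are padded to equal length)."""
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def scale(c, p):
    """Multiply a polynomial by the scalar c."""
    return [c * a for a in p]


def mul(p, q):
    """Product of two polynomials."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def lie_derivative(f, a, mu):
    """Eq. (30): f a' + mu f' a, the action of f d/dx on a density of degree mu."""
    return trim(add(mul(f, deriv(a)), scale(mu, mul(deriv(f), a))))


def b_bracket(v, w, b):
    """Eq. (27): [v, w]_b = v w_x - (b - 1) v_x w."""
    return trim(add(mul(v, deriv(w)), scale(-(b - 1), mul(deriv(v), w))))


def b_bracket_split(v, w, b):
    """Eq. (28): (b/2) [v, w] - ((b - 2)/2) [v, w]_sym."""
    comm = add(mul(v, deriv(w)), scale(Fraction(-1), mul(deriv(v), w)))
    sym = add(mul(v, deriv(w)), mul(deriv(v), w))
    return trim(add(scale(Fraction(b) / 2, comm), scale(-Fraction(b - 2) / 2, sym)))


if __name__ == "__main__":
    v = [1, 2, 3]      # 1 + 2x + 3x^2
    w = [0, 1, 0, 4]   # x + 4x^3
    for b in (2, 3):
        same_split = b_bracket(v, w, b) == b_bracket_split(v, w, b)
        same_lie = b_bracket(v, w, b) == lie_derivative(v, w, -(b - 1))
        print(b, same_split, same_lie)
```

For $b = 2$ the symmetric part drops out of `b_bracket_split`, which is the sense in which the b-bracket deforms the usual bracket of vector fields.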
We denote the b-algebra by $F_{-(b-1)}$ and its dual by $F_b$. Thus we can define a pairing according to (29):
\[ \langle a(x)(dx)^{-(b-1)},\ b(x)(dx)^b \rangle = \int_{S^1} a(x) b(x)\,dx. \]
It is clear that the b-algebra is not a Lie algebra, and under these circumstances we cannot define a proper “coadjoint action”. We generalize the concept of the coadjoint action to the b-algebra, defined with respect to the pairing (29).

Lemma 6.2.
\[ (\mathrm{ad}^{H^1})^*_f (u) = (1 - \nu\partial_x^2)^{-1} \big[ f (1 - \nu\partial_x^2) u_x + b f_x (1 - \nu\partial_x^2) u \big]. \tag{31} \]
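Proof. We know that
\[ \langle \mathrm{ad}^*_f(u), g \rangle_{H^1} = -\langle u, [f, g]_b \rangle_{H^1} \equiv -\langle u\,dx^b,\ (f g' - (b - 1) f' g)(dx)^{1-b} \rangle_{H^1}, \]
hence the pairing is well-defined. Let us compute
\[ \mathrm{RHS} = -\left[ \int_{S^1} (u f g' - (b - 1) u f' g)\,dx + \nu \int_{S^1} u' (f g' - (b - 1) f' g)'\,dx \right] = \int_{S^1} \big[ f (1 - \nu\partial_x^2) u' + b f' (1 - \nu\partial_x^2) u \big]\, g \,dx, \]
\[ \mathrm{LHS} = \int_{S^1} \big( (\mathrm{ad}^{H^1})^*_f u \big)\, g \,dx + \nu \int_{S^1} \big( (\mathrm{ad}^{H^1})^*_f u \big)'\, g' \,dx = \int_{S^1} \big[ (1 - \nu\partial_x^2)(\mathrm{ad}^{H^1})^*_f u \big]\, g \,dx. \]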
Thus by equating the right-hand side and the left-hand side we obtain the above formula.

Using the Helmholtz operator we write $m = (1 - \nu\partial_x^2) u$. Thus the Hamiltonian operator corresponding to (31) is
\[ O_1 = -(1 - \nu\partial^2)^{-1}(m_x + b m \partial). \]
The Euler–Poincaré equation
\[ u_t = O_1 \frac{\delta H}{\delta u} \qquad \text{for } H = \frac{1}{2}\int_{S^1} u^2 \,dx \tag{32} \]
can be rewritten as
\[ m_t = O \frac{\delta H}{\delta u}, \tag{33} \]
where $O = (m_x + b m \partial)$. Using the EP formula (33), we construct the b-field equation.
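The construction of the b-field equation from (32) can be sketched explicitly. The sketch applies $(1 - \nu\partial_x^2)$ directly to (32), using $\delta H/\delta u = u$ for $H = \frac{1}{2}\int_{S^1} u^2\,dx$ and assuming that $(1 - \nu\partial_x^2)$ commutes with $\partial_t$; the comparison with (4) is made for $\nu = 1$.

```latex
\begin{align*}
m_t = (1-\nu\partial_x^2)\, u_t
  &= -(m_x + b\, m\,\partial_x)\, u = -(u\, m_x + b\, m\, u_x), \\
\text{so}\qquad m_t + u\, m_x + b\, m\, u_x &= 0 . \\
\intertext{For $\nu = 1$, i.e.\ $m = u - u_{xx}$:}
u_t - u_{xxt} + u\,(u_x - u_{xxx}) + b\,(u - u_{xx})\, u_x &= 0
  \;\Longleftrightarrow\;
  u_t - u_{xxt} + (b+1)\, u u_x = b\, u_x u_{xx} + u\, u_{xxx} .
\end{align*}
```

The second line is the compact form (5), and the last line recovers (4).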
Proposition 6.3. The Euler–Poincaré flow on the dual space of the b-algebra yields the b-field equation
\[ m_t + m_x u + b m u_x = 0. \]
This is a new derivation of the b-field equation.

6.1. Hamiltonian structure of the Degasperis–Procesi equation and EP framework

Degasperis et al. studied Hamiltonian structures for the $b = 3$ case of the b-field (DHH) equation; in other words, they exhibited the bihamiltonian features of the Degasperis–Procesi system. They expressed the Degasperis–Procesi equation as
\[ m_t = B_i \frac{\delta H_i}{\delta m}, \qquad i = 0, 1, \tag{34} \]
where $m = u - u_{xx}$ (we assume $\nu = 1$). Thus they studied the flow of the Helmholtz function. They showed that there is only one local Hamiltonian structure,
\[ B_0 = \partial_x (1 - \partial_x^2)(4 - \partial_x^2), \tag{35} \]
and that the second Hamiltonian structure is given by
\[ B_1 = m^{2/3} \partial_x m^{1/3} (\partial_x - \partial_x^3)^{-1} m^{1/3} \partial_x m^{2/3}, \tag{36} \]
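which can be simplified to
\[ B_1 \equiv \hat{B} = \frac{2}{9}(3m\partial + m_x)(\partial - \partial^3)^{-1}(3m\partial + 2m_x). \]

Proposition 6.4. The Degasperis–Procesi equation
\[ m_t = \hat{B}\frac{\delta H_1}{\delta m}, \qquad \hat{B} = \frac{2}{9}(3m\partial + m_x)(\partial - \partial^3)^{-1}(3m\partial + 2m_x), \qquad H_1 = \frac{9}{4}\int_{S^1} m \,dx, \tag{37} \]
is equivalent to
\[ m_t = O \frac{\delta H}{\delta u} \qquad \text{for } H = \frac{1}{2}\int_{S^1} u^2 \,dx, \]
where $O = (m_x + b m \partial)$.

Proof. Our goal is to show
\[ \frac{2}{9}(\partial - \partial^3)^{-1}(3m\partial + 2m_x)\frac{\delta H_1}{\delta m} = \frac{\delta H}{\delta u}, \]
where $H_1 = \frac{9}{4}\int_{S^1} m \,dx$. If we insert $\frac{\delta H_1}{\delta m} = \frac{9}{4}$ into the left-hand side of the above equation, we obtain
\[ (\partial - \partial^3)^{-1} m_x = u, \]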
where we use $u = (1 - \partial^2)^{-1} m$. Thus we obtain
\[ m_t = (3m\partial + m_x)\frac{\delta H}{\delta u}, \]
where $H = \frac{1}{2}\int_{S^1} u^2 \,dx$. Therefore the Degasperis–Holm–Hone form of the Hamiltonian structure coincides with our Hamiltonian structure.

6.1.1. First Hamiltonian structure of b-field equation

Let us compute the Hamiltonian operator at a frozen point $m(x) = m_0$. Since $m_0$ is constant, the Hamiltonian operator at the frozen point becomes
\[ O_0 = 3 m_0 \partial. \tag{38} \]
Actually, freezing at the point $m_0$ yields a Poisson structure induced by a coboundary, which is always a trivial Poisson structure. For all practical purposes we can normalize the operator $O_0$ by taking $m_0 = \frac{1}{3}$. We show that this leads us to the first Hamiltonian operator of the Degasperis–Procesi equation.

Proposition 6.5. The Degasperis–Procesi equation with respect to the first Hamiltonian structure of Degasperis–Holm–Hone exactly coincides with $\hat{O}_0 = \partial$, where the corresponding Hamiltonian $\hat{H}$ satisfies
\[ \frac{\delta \hat{H}}{\delta u} = 2u^2 - u_x^2 - u u_{xx}. \tag{39} \]

Proof. It is easy to check that
\[ \frac{\delta \hat{H}}{\delta u} = (4 - \partial^2)\frac{\delta H_0}{\delta u}, \]
where the first DHH Hamiltonian is given by $H_0 = \frac{1}{6}\int_{S^1} u^3 \,dx$. Thus we obtain
\[ m_t = \partial (4 - \partial^2)\frac{\delta H_0}{\delta u}. \]
Using the chain rule formula for variational derivatives,
\[ \frac{\delta H_0}{\delta u} = (1 - \partial^2)\frac{\delta H_0}{\delta m}, \]
we obtain
\[ m_t = \partial (4 - \partial^2)(1 - \partial^2)\frac{\delta H_0}{\delta m}. \]
Hence we obtain the first Hamiltonian structure $B_0$ of Degasperis, Holm and Hone from our method.
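The identity behind (39) can be verified in a few lines. The sketch below only uses $H_0 = \frac{1}{6}\int_{S^1} u^3\,dx$ and $m = u - u_{xx}$; the comparison with the nonlinear terms of (5) for $b = 3$ is added here as a consistency check and holds up to the overall sign convention of the flow.

```latex
\begin{align*}
\frac{\delta H_0}{\delta u} &= \tfrac12 u^2
  \qquad \text{for } H_0 = \tfrac16 \int_{S^1} u^3\,dx, \\
(4-\partial^2)\,\tfrac12 u^2 &= 2u^2 - (u u_x)_x = 2u^2 - u_x^2 - u u_{xx}, \\
\partial\left(2u^2 - u_x^2 - u u_{xx}\right) &= 4 u u_x - 3 u_x u_{xx} - u u_{xxx}, \\
u\, m_x + 3\, m\, u_x &= 4 u u_x - 3 u_x u_{xx} - u u_{xxx}
  \qquad (m = u - u_{xx}).
\end{align*}
```

The last two lines coincide, so applying $\partial$ to (39) reproduces the convection and stretching terms of the Degasperis–Procesi equation.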
7. EP Formalism for (2 + 1)-Dimensional b-Field Equation

Consider $\tilde{G}_1 = LG_1$, the loop group associated to $G_1$, whose algebra is given by
\[ \tilde{\mathfrak{g}}_1 = L(F_{-(b-1)}). \]
Consider the action of $L(\mathrm{Vect}(S^1))$ on $L(F_{-(b-1)})$,
\[ L_{f\frac{\partial}{\partial x}} \big( g (dx)^{-(b-1)} \big) = (f g_x - (b - 1) f_x g)(dx)^{-(b-1)}; \tag{40} \]
this yields a new bracket. Let us introduce the $H^1$ norm on the algebra $\tilde{\mathfrak{g}}_1$.

Definition 7.1. The $H^1$-Sobolev norm on the loop tensor density algebra is defined as
\[ \langle f(x, y)(dx)^{-(b-1)},\ u(x, y)(dx)^b \rangle_{H^1} = \int_{S^1} f u \,dx + \nu \int_{S^1} \partial_x f\,\partial_x u \,dx. \tag{41} \]

Proposition 7.2. The action of $\mathrm{Vect}(S^1)$ with respect to the $H^1$ metric on the tensor product algebra $F_b$ is given by
\[ \mathrm{ad}^*_{(f\frac{d}{dx})} \big( v(x, y)(dx)^b \big) = (f \tilde{v}_x + b f_x \tilde{v})\,dx^b, \]
where $\tilde{v} = (1 - \nu\partial^2) v$.

Corollary 7.3. The Hamiltonian operator corresponding to the action of $\mathrm{Vect}(S^1)$ on $F_b$ with respect to the $H^1$ metric yields
\[ \hat{O} = -(1 - \nu\partial_x^2)^{-1} \big( \partial_x \tilde{v} + (b - 1)\tilde{v}\partial_x \big). \tag{42} \]

Proposition 7.4. The Euler–Poincaré flow on the $\tilde{\mathfrak{g}}_1$ orbit yields the (2 + 1)-dimensional b-field equation
\[ v_t - \nu v_{xxt} + \lambda \partial_x^{-1} v_{yy} + (v_x - \nu v_{xxx})\partial_x^{-1} v_y + b (v - \nu v_{xx}) v_y = 0, \tag{43} \]
where the Hamiltonian is given by
\[ H = \frac{1}{2}\int_{S^1 \times S^1} v\,\partial_x^{-1} v_y \,dx\,dy. \]
The potential form of (43) is
\[ u_{xt} - \nu u_{xxxt} + \lambda u_{yy} + (u_{xx} u_y + b u_x u_{xy}) - \nu (u_{xxxx} u_y + b u_{xxx} u_{xy}) = 0. \]
Introducing the quantity
\[ m = v - \nu v_{xx}, \tag{44} \]
which is just the Helmholtz operator acting on $v$, the one-parameter family of (2 + 1)-dimensional peakon-type PDEs (or (2 + 1)-dimensional b-field equations) may be written in the following form:
\[ m_t + m_x \partial_x^{-1} v_y + b v_y m + \lambda \partial_x^{-1} v_{yy} = 0, \tag{45} \]
which reduces to the (1 + 1)-dimensional b-field equation for $y = x$ and $\lambda = 0$. It is clear that Eq. (45) becomes a (2 + 1)-dimensional Camassa–Holm equation and a (2 + 1)-dimensional Degasperis–Procesi equation for $b = 2$ and $b = 3$, respectively.

8. Conclusion and Outlook

We have examined various extensions of (2 + 1)-dimensional KdV equations and (2 + 1)-dimensional generalized Camassa–Holm type systems. In particular, we have shown that all these equations constitute geodesic flows on the loop Bott–Virasoro group. In fact, three famous (2 + 1)-dimensional partial differential equations, namely the Calogero–Bogoyavlenskii–Schiff (CBS), the (2 + 1)-dimensional Camassa–Holm (CH2) and the (2 + 1)-dimensional Hunter–Saxton (HS2) equations, can be described as Euler–Poincaré flows on the dual space of the loop Virasoro orbit.

We have also given the Euler–Poincaré formalism of the new (1 + 1)-dimensional b-field equation, proposed by Degasperis et al., on the space of tensor algebra. We have further extended the EP framework to the (2 + 1)-dimensional b-field equation, which includes the (2 + 1)-dimensional Degasperis–Procesi equation. Therefore, this paper has further strengthened the programme of Euler–Poincaré and integrable geodesic flows on extended groups of diffeomorphisms. We hope to consider the singular solutions of the (2 + 1)-dimensional equations in forthcoming work.

Acknowledgment

The author is profoundly grateful to Professors Jerry Marsden, Tudor Ratiu, Valentin Ovsienko and Chand Devchand for stimulating discussions and various constructive suggestions. He is also grateful to Professor Thanasis Fokas for his interest and encouragement. In particular, he is immensely grateful to Chand Devchand for the b-bracket discussion. Finally, the author wants to thank the anonymous referee for many helpful comments and suggestions. He expresses grateful thanks to Professor Jürgen Jost for gracious hospitality at the Max Planck Institute for Mathematics in the Sciences.

References

[1] M. J. Ablowitz and P. A. Clarkson, Solitons, Nonlinear Evolution Equations and Inverse Scattering, London Mathematical Society Lecture Note Series, Vol. 149 (Cambridge University Press, 1991).
[2] V. I. Arnold, Sur la géométrie différentielle des groupes de Lie de dimension infinie et ses applications à l'hydrodynamique des fluides parfaits, Ann. Inst. Fourier Grenoble 16 (1966) 319–361.
June 2, 2010 14:55 WSPC/S0129-055X
148-RMP
J070-00398
Euler–Poincar´ e Flows on the Loop Bott–Virasoro Group
503
[3] A. Bressan and A. Constantin, Global solutions of the Hunter–Saxton equation, SIAM J. Math. Anal. 37 (2005) 996–1002. [4] Yu. Billig, An extension of the KdV hierarchy arising from a representation of a toroidal Lie algebra, J. Algebra 217 (1999) 40–64. [5] O. I. Bogoyavlensky, Breaking solitons in (2 + 1)-dimensional integrable equations, Russian Math. Surveys 45(4) (1990) 1–86. [6] O. I. Bogoyavlensky, Breaking solitons III, Izv. Akad. Nauk SSSR Ser. Matem. 54 (1990) 123–131; Math. USSR Izv. 36 (1991) 129–137 (English translation). [7] F. Calogero, A method to generate solvable nonlinear evolution equations, Lett. Nuovo Cimento 14 (1975) 443–447. [8] R. Camassa and D. Holm, An integrable shallow water equation with peaked solitons, Phys. Rev. Lett. 71(11) (1993) 1661–1664. [9] R. Camassa, D. Holm and J. M. Hyman, A new integrable shallow water equation, Adv. Appl. Mech. 31 (1994) 1–33. [10] A. Constantin and B. Kolev, Geodesic flow on the diffeomorphism group of the circle, Comment. Math. Helv. 78 (2003) 787–804. [11] A. Constantin, T. Kappeler, B. Kolev and P. Topalev, On geodesic exponential maps of the Virasoro group, Ann. Global Anal. Geom. 31 (2007) 155–180. [12] A. Constantin and B. Kolev, Integrability of invariant metrics on the Virasoro group, Phys. Lett. A 350(1–2) (2006) 75–80. [13] A. Constantin and B. Kolev, Integrability of invariant metrics on the diffeomorphism group of the circle, J. Nonlinear Sci. 16(2) (2006) 109–122. [14] A. Constantin and D. Lannes, The hydrodynamical relevance of the Camassa–Holm and Degasperis–Procesi equations, Arch. Ration. Mech. Anal. 192 (2009) 165–186. [15] M. Chen, S.-Q. Liu and Y. Zhang, A 2-component generalization of the Camassa– Holm equation and its solutions, nlin.SI/0501028. [16] A. Constantin, The trajectories of particles in Stokes waves, Invent. Math. 166 (2006) 523–535. [17] A. Constantin and W. Strauss, Stability of peakons, Comm. Pure Appl. Math. 53 (2000) 603–610. [18] A. Degasperis, D. D. 
Holm and A. N. W. Hone, A new integrable equation with peakon solutions, NEEDS 2001 Proceedings, Theoret. and Math. Phys. 133 (2002) 170–183. [19] A. Degasperis, D. D. Holm and A. N. W. Hone, Integrable and non-integrable equations with peakons, in Nonlinear Physics: Theory and Experiment, II (Gallipoli, 2002) (World Sci. Publishing, River Edge, NJ, 2003), pp. 37–43. [20] A. Degasperis and M. Procesi, Asymptotic integrability, in Symmetry and Perturbation Theory (Rome, 1998) (World Sci. Publishing, River Edge, NJ, 1999), pp. 23–37. [21] C. Devchand and J. Schiff, The supersymmetric Camassa–Holm equation and geodesic flow on the superconformal group, J. Math. Phys. 42(1) (2001) 260–273. [22] D. Ebin and J. Marsden, Groups of diffeomorphisms and themotion of an incompressible fluid, Ann. Math. 92 (1970) 102–163. [23] J. Escher and Z. Yin, Well-posedness, blow-up phenomena, and global solutions for the b-equation, J. Reine Angew. Math. 624 (2008) 51–80. [24] G. Falqui, On a Camassa–Holm type equation with two dependent variables, nlin.SI/0505059. [25] A. S. Fokas and B. Fuchssteiner, B¨ acklund transformations for hereditary symmetries, Nonlinear Anal. 5(4) (1981) 423–432. [26] A. S. Fokas, P. J. Olver, P. Rosenau, A plethora of integrable bi-Hamiltonian equations, in Algebraic Aspects of Integrable Systems, Progr. Nonlinear Differential Equations Appl., Vol. 26 (Birkh¨ auser Boston, Boston, MA, 1997), pp. 93–101.
June 2, 2010 14:55 WSPC/S0129-055X
504
148-RMP
J070-00398
P. Guha
[27] A. S. Fokas and P. M. Santini, Dromions and a boundary value problem for the Davey–Stewartson I equation, Physica D 44 (1990) 99–130. [28] A. S. Fokas and P. M. Santini, The recursion operator of the Kadomtsev–Petviashvili equation and the squared eigenfunction of the Schr¨ odinger operators, Stud. Appl. Math. 75 (1986) 179–186. [29] A. S. Fokas and P. M. Santini, Recursion operators and bi-Hamiltonian structures in multidimensions I, Comm. Math. Phys. 115 (1988) 375–419. [30] A. S. Fokas and P. M. Santini, Recursion operators and bi-Hamiltonian structures in multidimensions II, Comm. Math. Phys. 116 (1988) 449–474. [31] I. M. Gelfand and D. B. Fuks, Cohomologies of the Lie algebra of vector fields on the circle, Funct. Anal. Appl. 2(4) (1968) 92–93. [32] P. Guha, Integrable geodesic flows on the (super)extension of the Bott–Virasoro group, Lett. Math. Phys. 52(4) (2000) 311–328. [33] P. Guha, Geodesic flows, bi-Hamiltonian structure and coupled KdV type systems, J. Math. Anal. Appl. 310 (2005) 45–56. [34] P. Guha and P. Olver, Geodesic flow and two (super) component analog of the Camassa–Holm equation, SIGMA 2 (2006) 054, 9 pp. [35] P. Guha, Euler–Poincar´e formalism of (two component) Degasperis–Procesi and Holm–Staley type systems, J. Nonlinear Math. Phys. 14(3) (2007) 390–421. [36] P. Guha, Euler–Poincar´e flows on the space of tensor densities and integrable systems, Oberwolfach Report 5(3) (2008) 1875–1880. [37] I. M. Gelfand, I. M. Graev and A. M. Vershik, Models of representations of current groups, Representations of Lie Groups and Lie Algebras (Budapest, 1971) (Akad. Kiad, Budapest, 1985), pp. 121–179. [38] D. Henry, Persistence properties for a family of nonlinear partial differential equations, Nonlinear Anal. 70 (2009) 2049–2064. [39] A. N. W. Hone, Reciprocal link for (2 + 1)-dimensional extensions of shallow water equations, Appl. Math. Lett. 13(3) (2000) 37–42. [40] J. K. Hunter and R. Saxton, Dynamics of director fields, SIAM J. Appl. Math. 
51 (1991) 1498–1521. [41] R. S. Johnson, Camassa–Holm, Korteweg–de Vries and related models for water waves, J. Fluid Mech. 455 (2002) 63–82. [42] A. Kirillov, Infinite-dimensional Lie groups: Their orbits, invariants and representations. The geometry of moments, in Twistor Geometry and Nonlinear Systems, Lecture Notes in Math., Vol. 970 (Springer, Berlin, 1982), pp. 101–123. [43] B. Khesin and G. Misiolek, Euler equations on homogeneous spaces and Virasoro orbits, Adv. Math. 176(1) (2003) 116–144. [44] B. G. Konopelchenko, Solitons in Multidimensions (World Scientific, 1993). [45] J. Lenells, The Hunter–Saxton equation describes the geodesic flow on a sphere, J. Geom. Phys. 57 (2007) 2049–2064. [46] J. Lenells, Weak geodesic flow and global solutions of the Hunter–Saxton equation, Discrete Contin. Dyn. Syst. 18 (2007) 643–656. [47] S.-Y. Lou, Searching for higher dimensional integrable models from lower ones via Painlev´e analysis, Phys. Rev. Lett. 80 (1998) 5027–5031. [48] Z. Lin and Y. Liu, Stability of peakons for the Degasperis–Procesi equation, Comm. Pure Appl. Math. 62 (2009) 125–146. [49] J. Lin, S.-Y. Lou and K. Wang, High-dimensional Virasoro integrable models and exact solutions, Phys. Lett. A 287(3–4) (2001) 257–267. [50] Y.-S. Li and Y.-J. Zhang, Symmetries of a (2 + 1)-dimensional breaking soliton equation, J. Phys. A 26(24) (1993) 7487–7494.
June 2, 2010 14:55 WSPC/S0129-055X
148-RMP
J070-00398
Euler–Poincar´ e Flows on the Loop Bott–Virasoro Group
505
[51] G. Misiolek, A shallow water equation as a geodesic flow on the Bott–Virasoro group, J. Geom. Phys. 24 (1998) 203–208. [52] J. E. Marsden and T. Ratiu, Introduction to Mechanics and Symmetry (SpringerVerlag, New York, 1994). [53] V. Ovsienko and C. Roger, Generalizations of Virasoro group and Virasoro algebra through extensions by modules of tensor-densities on S 1 , Indag. Math. (N.S.) 9(2) (1998) 277–288. [54] V. Ovsienko and C. Roger, Looped cotangent Virasoro algebra and nonlinear integrable systems in dimension 2 + 1, Comm. Math. Phys., 273 (2007) 357–378; mathph/0602043. [55] V. Ovsienko, Coadjoint representation of Virasoro-type Lie algebras and differential operators on tensor-densities, in Infinite Dimensional K¨ ahler Manifolds (Oberwolfach, 1995), DMV Sem., Vol. 31 (Birkh¨ auser, Basel, 2001), pp. 231–255. [56] P. Olver and P. Rosenau, Tri-Hamiltonian duality between solitons and solitary-wave solutions having compact support, Phys. Rev. E (3) 53(2) (1996) 1900–1906. [57] V. Yu. Ovsienko and B. A. Khesin, KdV super equation as an Euler equation, Funct. Anal. Appl. 21 (1987) 329–331. [58] V. Yu. Ovsienko, Bi-Hamiltonian nature of the equation utx = uxy uy − uyy ux , arXiv:0802.1818v1 [math-ph]. [59] R. Radha and M. Lakshmanan, Dromion-like structures in the (2 + 1)-dimensional breaking soliton equation, Phys. Lett. A 197(1) (1995) 7–12. [60] P. Rosenau, Nonlinear dispersion and compact structures, Phys. Rev. Lett. 73(13) (1994) 1737–1741. [61] E. Ramos, C.-H. Sah and R Shrock, Algebras of diffeomorphisms of the N -torus, J. Math. Phys. 31(8) (1990) 1805–1816. [62] A. Reiman and M. Semenov-Tyan-Shanskii, Hamiltonian structure of equations of Kadomtsev–Petviashvili type, in Differential Geometry, Lie Groups and Mechanics, VI. Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI) 133 (1984) 212–227. [63] H.-Y. Ruan and Y.-X. Chen, Dromion interactions of (2 + 1)-dimensional KdV-type equations, J. Phys. Soc. Japan 72(3) (2003) 491–495. [64] J. 
Schiff, Integrability of Chern–Simons–Higgs vortex equations and a reduction of the self-dual Yang–Mills equations to three dimensions, in Painlev´e Transcendents, eds. D. Levi and P. Winternitz, NATO ASI Series B, Vol. 278 (Plenum Press, New York, 1992). [65] G. Segal, Unitary representations of some infinite-dimensional groups, Comm. Math. Phys. 80(3) (1981) 301–342. [66] I. A. B. Strachan, Some integrable hierarchies in (2 + 1)-dimensions and their twistor description, J. Math. Phys. 34(1) (1993) 243–259. [67] P. Zusmanovich, The second homology group of current Lie algebras, in K-Theory (Strasbourg, 1992), Ast´erisque 226(11) (1994) 435–452.
June 2, 2010 14:55 WSPC/S0129-055X
148-RMP
J070-00402
Reviews in Mathematical Physics Vol. 22, No. 5 (2010) 507–531 © World Scientific Publishing Company DOI: 10.1142/S0129055X10004028
PROJECTIVE MODULE DESCRIPTION OF EMBEDDED NONCOMMUTATIVE SPACES
R. B. ZHANG
School of Mathematics and Statistics, University of Sydney, Sydney, Australia
[email protected]

XIAO ZHANG
Institute of Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, P. R. China
[email protected]

Received 20 May 2009
Revised 5 February 2010
An algebraic formulation is given for the embedded noncommutative spaces over the Moyal algebra developed in a geometric framework in [8]. We explicitly construct the projective modules corresponding to the tangent bundles of the embedded noncommutative spaces, and recover from this algebraic formulation the metric, Levi–Civita connection and related curvatures, which were introduced geometrically in [8]. Transformation rules for connections and curvatures under general coordinate changes are given. A bar involution on the Moyal algebra is discovered, and its consequences on the noncommutative differential geometry are described. Keywords: Noncommutative space; projective module; isometric embedding. Mathematics Subject Classification 2010: 51P05, 81R60, 83C65
1. Introduction

It is a long-held belief in physics that the notion of spacetime as a pseudo-Riemannian manifold requires modification at the Planck scale [34, 38]. Theoretical investigations in recent times strongly supported this view. In particular, the seminal paper [16] by Doplicher, Fredenhagen and Roberts demonstrated mathematically that coordinates of spacetime became noncommutative at the Planck scale, thus some form of noncommutative geometry [13] appeared to be necessary in order to describe the structure of spacetime. This prompted intensive activities in mathematical physics studying various noncommutative generalisations of Einstein’s theory of general relativity [1, 3, 5–11, 29, 30]. For reviews on earlier works, we refer
to [31, 35] and references therein. For more recent developments, particularly on the study of noncommutative black holes, see [2, 4, 7, 9, 15, 26, 27, 33, 37].

In joint work with Chaichian and Tureanu [8], we investigated the noncommutative geometry [13, 22] of noncommutative spaces embedded in higher dimensions. We first quantized a space by deforming [21, 28] the algebra of functions to a noncommutative associative algebra known as the Moyal algebra. Such an algebra naturally incorporates the generalized spacetime uncertainty relations of [16], capturing key features expected of spacetime at the Planck scale. We then systematically investigated the noncommutative geometry of embedded noncommutative spaces. This was partially motivated by Nash’s isometric embedding theorem [32] and its generalization to pseudo-Riemannian manifolds [12, 19, 23], which state that any (pseudo-) Riemannian manifold can be isometrically embedded in Euclidean or Minkowski spaces. Therefore, in order to study the geometry of spacetime, it suffices to investigate (pseudo-) Riemannian manifolds embedded in higher dimensions. Embedded noncommutative spaces also play a role in the study of branes embedded in $\mathbb{R}^D$ in the context of Yang–Mills matrix models [36].

The theory of [8] was developed within a geometric framework analogous to the classical theory of embedded surfaces (see, e.g., [14]). The present paper further develops the differential geometry of embedded noncommutative spaces by constructing an algebraic formulation in terms of projective modules, a language commonly adopted in noncommutative geometry [13, 22]. We shall first describe the finitely generated projective modules over a Moyal algebra, which will be regarded as noncommutative vector bundles on a quantized spacetime. We then construct a differential geometry of the noncommutative vector bundles, developing a theory of connections and curvatures on such bundles.
In doing this, we make crucial use of a unique property of the Moyal algebra, namely, it has a set of mutually commutative derivations related to the usual partial derivations of functions. Then we apply the noncommutative differential geometry developed to study the embedded noncommutative spaces introduced in [8]. We explicitly construct the projective modules corresponding to the tangent bundles of the noncommutative spaces, and recover from this algebraic formulation the geometric Levi–Civita connections and related curvatures introduced in [8]. This way, the embedded noncommutative spaces of [8] acquire a natural interpretation in the algebraic formalism presented here.

Morally, one may regard the very definition of a projective module (a direct summand of a free module) as the geometric equivalent of embedding a low-dimensional manifold isometrically in a higher-dimensional one. In the commutative setting of classical (pseudo-) Riemannian geometry, we make this connection more precise and explicit by showing that the projective module description of tangent bundles studied here is a natural consequence of the isometric embedding theorems [12, 19, 23, 32]. This is briefly discussed in Theorem 7.1.
As a concrete example of noncommutative differential geometries over the Moyal algebra, we study in detail a quantum deformation of a time slice of the Schwarzschild spacetime. The projection operator yielding the tangent bundle is given explicitly, and the corresponding metric is also worked out.

As is well known, one of the fundamental principles of general relativity is general covariance. It is important to find a noncommutative version of this principle. By analyzing the structure of the Moyal algebra, we show that the noncommutative geometry developed here (initiated in [8]) retains some notion of “general covariance”. Properties of the connection and curvature under general coordinate transformations are described explicitly (see Theorem 5.1).

The Moyal algebra (over the real numbers) admits an involution similar to the bar involution in the context of quantum groups. We introduce a particularly nice class of noncommutative vector bundles over the Moyal algebra, which are associated to bar invariant idempotents and endowed with bar hermitian connections (see Sec. 6). In this case, the bar involution takes the left tangent bundles to right tangent bundles. We show that the tangent bundles of embedded noncommutative spaces under a mild condition belong to this class.

The organization of the paper is as follows. In Sec. 2, we describe the Moyal algebras and finitely generated projective modules over them. In Sec. 3, we discuss the differential geometry of noncommutative vector bundles on quantum spaces corresponding to Moyal algebras. In Sec. 4, we develop the differential geometry of embedded noncommutative spaces using the language of projective modules. As an explicit example, we study in detail the quantum deformation of a time slice of the Schwarzschild spacetime in Sec. 4.2. In Sec. 5, we study the effect of general coordinate transformations. In Sec. 6, we investigate properties of noncommutative vector bundles under the bar involution of the Moyal algebra. Finally, Sec. 7 concludes the paper with some general comments and a discussion of the natural relationship between projective modules and isometric embeddings in classical (pseudo-) Riemannian geometry.

Before closing this section, we mention that the theory of [8] has the advantage of being explicit and easy to use for computations. Using this theory, we constructed noncommutative Schwarzschild and Schwarzschild–de Sitter spacetimes in joint work with Wang [37]. Our long term aim is to develop a theoretical framework for studying noncommutative general relativity. A variety of physically motivated methods and techniques were used in the literature to study corrections to general relativity arising from the noncommutativity of the Moyal algebra. In particular, references [1, 3] studied deformations of the diffeomorphism algebra as a means for incorporating noncommutative effects of spacetime, while in [6, 7, 9] a gauge theoretical approach was taken. These approaches differ considerably from the theory of [8, 37] at the mathematical level.
2. Moyal Algebra and Projective Modules

We describe the Moyal algebra of smooth functions on an open region of $\mathbb{R}^n$, and the finitely generated projective modules over the Moyal algebra. This provides the background material needed in later sections, and also serves to fix notations.

We take an open region $U$ in $\mathbb{R}^n$ for a fixed $n$, and write the coordinate of a point $t \in U$ as $(t^1, t^2, \ldots, t^n)$. Let $\bar{h}$ be a real indeterminate, and denote by $\mathbb{R}[[\bar{h}]]$ the ring of formal power series in $\bar{h}$. Let $\mathcal{A}$ be the set of formal power series in $\bar{h}$ with coefficients being real smooth functions on $U$. Namely, every element of $\mathcal{A}$ is of the form $\sum_{i \ge 0} f_i \bar{h}^i$, where the $f_i$ are smooth functions on $U$. Then $\mathcal{A}$ is an $\mathbb{R}[[\bar{h}]]$-module in the obvious way.

Fix a constant skew symmetric $n \times n$ matrix $\theta = (\theta_{ij})$. The Moyal product on $\mathcal{A}$ corresponding to $\theta$ is a map
\[
\mu : \mathcal{A} \otimes_{\mathbb{R}[[\bar{h}]]} \mathcal{A} \to \mathcal{A}, \qquad f \otimes g \mapsto \mu(f, g),
\]
defined by
\[
\mu(f, g)(t) = \lim_{t' \to t} \exp\Bigl( \bar{h} \sum_{ij} \theta_{ij} \frac{\partial}{\partial t^i} \frac{\partial}{\partial t'^j} \Bigr) f(t)\, g(t'). \tag{2.1}
\]
On the right-hand side, $f(t) g(t')$ means the usual product of the numerical values of the functions $f$ and $g$ at $t$ and $t'$, respectively.

It has been known since the early days of quantum mechanics that the Moyal product is associative (see, e.g., [28] for reference). Thus the $\mathbb{R}[[\bar{h}]]$-module $\mathcal{A}$ equipped with the Moyal product forms an associative algebra over $\mathbb{R}[[\bar{h}]]$, which is a deformation of the algebra of smooth functions on $U$ in the sense of [21]. We shall usually denote this associative algebra by $\mathcal{A}$, but when it is necessary to make explicit the multiplication, we shall write it as $(\mathcal{A}, \mu)$.

The partial derivations $\partial_i := \frac{\partial}{\partial t^i}$ with respect to the coordinates $t^i$ for $U$ are $\mathbb{R}[[\bar{h}]]$-linear maps on $\mathcal{A}$. Since $\theta$ is a constant matrix, the Leibniz rule is valid. Namely, for any elements $f$ and $g$ of $\mathcal{A}$, we have
\[
\partial_i \mu(f, g) = \mu(\partial_i f, g) + \mu(f, \partial_i g). \tag{2.2}
\]
Therefore, the $\partial_i$ ($i = 1, 2, \ldots, n$) are mutually commutative derivations of the Moyal algebra $(\mathcal{A}, \mu)$ on $U$.

Remark 2.1. The usual notation in the literature for $\mu(f, g)$ is $f * g$. This is referred to as the star-product of $f$ and $g$. Hereafter, we shall replace $\mu$ by $*$ and simply write $\mu(f, g)$ as $f * g$.

Following the general philosophy of noncommutative geometry [13], we regard the associative algebra $(\mathcal{A}, \mu)$ as defining some quantum deformation of the region $U$, and finitely generated projective modules over $\mathcal{A}$ as (spaces of sections of) noncommutative vector bundles on the quantum deformation of $U$ defined by the noncommutative algebra $\mathcal{A}$. Let us now briefly describe finitely generated projective modules over $\mathcal{A}$.
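Before turning to modules, a quick check of definition (2.1) may be helpful. The computation below is a minimal worked example that is not taken from the text: it evaluates the Moyal product of two coordinate functions, assuming only (2.1) and the skew symmetry of $\theta$, and it illustrates how a constant commutator of coordinates (the kind of uncertainty relation mentioned in the Introduction) arises.

```latex
% Illustrative example: Moyal product of the coordinate functions t^i and t^j.
% For f(t) = t^i and g(t') = t'^j, only the zeroth- and first-order terms
% of the exponential in (2.1) contribute; all higher derivatives vanish.
\begin{align*}
t^i * t^j
  &= \lim_{t' \to t} \Bigl( 1 + \bar{h} \sum_{k,l} \theta_{kl}
       \frac{\partial}{\partial t^k} \frac{\partial}{\partial t'^l} + \cdots \Bigr)
       t^i \, t'^j \\
  &= t^i t^j + \bar{h}\, \theta_{ij}, \\
[t^i, t^j]_* := t^i * t^j - t^j * t^i
  &= \bar{h}\, (\theta_{ij} - \theta_{ji}) = 2 \bar{h}\, \theta_{ij}.
\end{align*}
```

Since the commutator is a constant, applying $\partial_k$ to either side gives zero, in line with the Leibniz rule (2.2).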
Given an integer $m > n$, we let ${}_l\mathcal{A}^m$ (respectively, $\mathcal{A}^m_r$) be the set of $m$-tuples with entries in $\mathcal{A}$ written as rows (respectively, columns). We shall regard ${}_l\mathcal{A}^m$ (respectively, $\mathcal{A}^m_r$) as a left (respectively, right) $\mathcal{A}$-module with the action defined by multiplication from the left (respectively, right). More explicitly, for $v = (a_1 \; a_2 \; \cdots \; a_m) \in {}_l\mathcal{A}^m$ and $b \in \mathcal{A}$, we have $b * v = (b * a_1 \;\; b * a_2 \;\; \cdots \;\; b * a_m)$. Similarly, for
\[
w = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_m \end{pmatrix} \in \mathcal{A}^m_r,
\qquad\text{we have}\qquad
w * b = \begin{pmatrix} a_1 * b \\ a_2 * b \\ \vdots \\ a_m * b \end{pmatrix}.
\]
Let $M_m(\mathcal{A})$ be the set of $(m \times m)$-matrices with entries in $\mathcal{A}$. We define matrix multiplication in the usual way but by using the Moyal product for products of matrix entries, and still denote the corresponding matrix multiplication by $*$. Now for $A = (a_{ij})$ and $B = (b_{ij})$, we have $A * B = (c_{ij})$ with $c_{ij} = \sum_k a_{ik} * b_{kj}$. Then $M_m(\mathcal{A})$ is an $\mathbb{R}[[\bar{h}]]$-algebra, which has a natural left (respectively, right) action on $\mathcal{A}^m_r$ (respectively, ${}_l\mathcal{A}^m$).

A finitely generated projective left (respectively, right) $\mathcal{A}$-module is isomorphic to some direct summand of ${}_l\mathcal{A}^m$ (respectively, $\mathcal{A}^m_r$) for some $m < \infty$. If $e \in M_m(\mathcal{A})$ satisfies the condition $e * e = e$, that is, it is an idempotent, then
\[
\mathcal{M} = {}_l\mathcal{A}^m * e := \{ v * e \mid v \in {}_l\mathcal{A}^m \},
\qquad
\tilde{\mathcal{M}} = e * \mathcal{A}^m_r := \{ e * w \mid w \in \mathcal{A}^m_r \}
\]
are, respectively, projective left and right $\mathcal{A}$-modules. Furthermore, every projective left (right) $\mathcal{A}$-module is isomorphic to an $\mathcal{M}$ (respectively, $\tilde{\mathcal{M}}$) constructed this way by using some idempotent $e$.

In Sec. 4, we shall give a systematic method for constructing idempotents (see (4.1)). The corresponding noncommutative vector bundles include the tangent bundles of embedded noncommutative spaces introduced in [8], which we shall investigate in depth. An explicit example of embedded noncommutative spaces will be analyzed in detail in Sec. 4.2. To do this, we need to develop some generalities of the differential geometry of noncommutative vector bundles using the language of projective modules over the Moyal algebra.

3. Differential Geometry of Noncommutative Vector Bundles

In this section, we investigate general aspects of the noncommutative differential geometry over the Moyal algebra. We shall focus on the abstract theory here. A large class of examples will be given in Sec. 4, including one which will be worked out in detail. As we shall see, the set of mutually commutative derivations $\partial_i$ ($i = 1, 2, \ldots, n$) of the Moyal algebra $\mathcal{A}$ will play a crucial role in developing the noncommutative differential geometry.

3.1. Connections and curvatures

We start by considering the action of the partial derivations $\partial_i$ on $\mathcal{M}$ and $\tilde{\mathcal{M}}$. We only treat the left module in detail, and present the pertinent results for the right module at the end, since the two cases are similar.
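Let us first specify that $\partial_i$ acts on rectangular matrices with entries in $\mathcal{A}$ by componentwise differentiation. More explicitly,
\[
\partial_i B =
\begin{pmatrix}
\partial_i b_{11} & \partial_i b_{12} & \cdots & \partial_i b_{1l} \\
\partial_i b_{21} & \partial_i b_{22} & \cdots & \partial_i b_{2l} \\
\cdots & \cdots & \cdots & \cdots \\
\partial_i b_{k1} & \partial_i b_{k2} & \cdots & \partial_i b_{kl}
\end{pmatrix}
\quad\text{for}\quad
B =
\begin{pmatrix}
b_{11} & b_{12} & \cdots & b_{1l} \\
b_{21} & b_{22} & \cdots & b_{2l} \\
\cdots & \cdots & \cdots & \cdots \\
b_{k1} & b_{k2} & \cdots & b_{kl}
\end{pmatrix}.
\]
In particular, given any $\zeta = v * e \in \mathcal{M}$, where $v \in {}_l\mathcal{A}^m$ is regarded as a row matrix, we have $\partial_i \zeta = (\partial_i v) * e + v * \partial_i(e)$ by the Leibniz rule. While the first term belongs to $\mathcal{M}$, the second term does not in general. Therefore, the $\partial_i$ ($i = 1, 2, \ldots, n$) send $\mathcal{M}$ to some subspace of ${}_l\mathcal{A}^m$ different from $\mathcal{M}$.

Let $\omega_i \in M_m(\mathcal{A})$ ($i = 1, 2, \ldots, n$) be $(m \times m)$-matrices with entries in $\mathcal{A}$ satisfying the following condition:
\[
e * \omega_i * (1 - e) = -e * \partial_i e, \qquad \forall i. \tag{3.1}
\]
Define the $\mathbb{R}[[\bar{h}]]$-linear maps $\nabla_i$ ($i = 1, 2, \ldots, n$) from $\mathcal{M}$ to ${}_l\mathcal{A}^m$ by
\[
\nabla_i \zeta = \partial_i \zeta + \zeta * \omega_i, \qquad \forall \zeta \in \mathcal{M}.
\]
Then each $\nabla_i$ is a covariant derivative on the noncommutative bundle $\mathcal{M}$ in the sense of Theorem 3.1 below. They together define a connection on $\mathcal{M}$.

Theorem 3.1. The maps $\nabla_i$ ($i = 1, 2, \ldots, n$) have the following properties. For all $\zeta \in \mathcal{M}$ and $a \in \mathcal{A}$,
\[
\nabla_i \zeta \in \mathcal{M} \qquad\text{and}\qquad \nabla_i(a * \zeta) = \partial_i(a) * \zeta + a * \nabla_i \zeta.
\]

Proof. For any $\zeta \in \mathcal{M}$, we have
\[
\nabla_i(\zeta) * e = \partial_i(\zeta) * e + \zeta * \omega_i * e = \partial_i \zeta + \zeta * (\omega_i * e - \partial_i e),
\]
where we have used the Leibniz rule and also the fact that $\zeta * e = \zeta$. Using this latter fact again, we have $\zeta * (\omega_i * e - \partial_i e) = \zeta * (e * \omega_i * e - e * \partial_i e)$, and by the defining property (3.1) of $\omega_i$, we obtain $\zeta * (e * \omega_i * e - e * \partial_i e) = \zeta * \omega_i$. Hence $\nabla_i(\zeta) * e = \partial_i \zeta + \zeta * \omega_i = \nabla_i \zeta$, proving that $\nabla_i \zeta \in \mathcal{M}$. The second part of the theorem immediately follows from the Leibniz rule.

We shall also say that the set of $\omega_i$ ($i = 1, 2, \ldots, n$) is a connection on $\mathcal{M}$. Since $e * \partial_i e = \partial_i(e) * (1 - e)$, one obvious choice for $\omega_i$ is $\omega_i = -\partial_i e$, which we shall refer to as the canonical connection on $\mathcal{M}$. By inspecting the defining property (3.1) for a connection, we easily see the following result.

Lemma 3.2. If $\omega_i$ ($i = 1, 2, \ldots, n$) define a connection on $\mathcal{M}$, then so do also $\omega_i + \phi_i * e$ ($i = 1, 2, \ldots, n$) for any $(m \times m)$-matrices $\phi_i$ with entries in $\mathcal{A}$.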
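The identity $e * \partial_i e = \partial_i(e) * (1 - e)$ used above, and the fact that the canonical choice $\omega_i = -\partial_i e$ satisfies (3.1), both follow from differentiating $e * e = e$. The short derivation below spells this out as a sketch for the reader's convenience; it uses only the Leibniz rule (2.2) applied entrywise to matrices.

```latex
% Sketch: consequences of differentiating the idempotent condition e * e = e.
\begin{align*}
\partial_i e &= \partial_i(e * e) = \partial_i(e) * e + e * \partial_i e
  && \text{(Leibniz rule)} \\
\Longrightarrow \quad e * \partial_i e &= \partial_i(e) * (1 - e), \\
e * \partial_i(e) * e &= \partial_i(e) * (1 - e) * e = 0
  && \text{(since } e * e = e \text{)}, \\
e * (-\partial_i e) * (1 - e) &= -e * \partial_i e + e * \partial_i(e) * e = -e * \partial_i e
  && \text{(so } \omega_i = -\partial_i e \text{ satisfies (3.1))}.
\end{align*}
```

The same kind of check gives Lemma 3.2, because $e * (\phi_i * e) * (1 - e) = e * \phi_i * (e - e * e) = 0$.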
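For a given connection $\omega_i$ ($i = 1, 2, \ldots, n$), we consider $[\nabla_i, \nabla_j] = \nabla_i \nabla_j - \nabla_j \nabla_i$ with the right-hand side understood as composition of maps on $\mathcal{M}$. By simple calculations we can show that for all $\zeta \in \mathcal{M}$,
\[
[\nabla_i, \nabla_j] \zeta = \zeta * R_{ij}
\qquad\text{with}\qquad
R_{ij} := \partial_i \omega_j - \partial_j \omega_i - [\omega_i, \omega_j]_*,
\]
where $[\omega_i, \omega_j]_* = \omega_i * \omega_j - \omega_j * \omega_i$ is the commutator. We call $R_{ij}$ the curvature of $\mathcal{M}$ associated with the connection $\omega_i$.

For all $\zeta \in \mathcal{M}$,
\begin{align*}
[\nabla_i, \nabla_j] \nabla_k \zeta &= \partial_k(\zeta) * R_{ij} + \zeta * \omega_k * R_{ij}, \\
\nabla_k [\nabla_i, \nabla_j] \zeta &= \partial_k(\zeta) * R_{ij} + \zeta * (\partial_k R_{ij} + R_{ij} * \omega_k).
\end{align*}
Define the following covariant derivatives of the curvature:
\[
\nabla_k R_{ij} := \partial_k R_{ij} + R_{ij} * \omega_k - \omega_k * R_{ij}. \tag{3.2}
\]
Then we have
\[
[\nabla_k, [\nabla_i, \nabla_j]] \zeta = \zeta * \nabla_k R_{ij}, \qquad \forall \zeta \in \mathcal{M}.
\]
The Jacobi identity $[\nabla_k, [\nabla_i, \nabla_j]] + [\nabla_j, [\nabla_k, \nabla_i]] + [\nabla_i, [\nabla_j, \nabla_k]] = 0$ leads to
\[
\zeta * (\nabla_k R_{ij} + \nabla_j R_{ki} + \nabla_i R_{jk}) = 0, \qquad \forall \zeta \in \mathcal{M}.
\]
From this, we immediately see that $e * (\nabla_k R_{ij} + \nabla_j R_{ki} + \nabla_i R_{jk}) = 0$. In fact, the following stronger result holds.

Theorem 3.3. The curvature satisfies the following Bianchi identity:
\[
\nabla_k R_{ij} + \nabla_j R_{ki} + \nabla_i R_{jk} = 0.
\]

Proof. The proof is entirely combinatorial. Let
\[
A_{ijk} = \partial_k \partial_i \omega_j - \partial_k \partial_j \omega_i,
\qquad
B_{ijk} = [\partial_i \omega_j, \omega_k]_* - [\partial_j \omega_i, \omega_k]_*.
\]
Then we can express $\nabla_k R_{ij}$ as
\[
\nabla_k R_{ij} = A_{ijk} + B_{ijk} - \partial_k [\omega_i, \omega_j]_* - [[\omega_i, \omega_j]_*, \omega_k]_*.
\]
Note that
\begin{align*}
A_{ijk} + A_{jki} + A_{kij} &= 0, \\
B_{ijk} + B_{jki} + B_{kij} &= \partial_k [\omega_i, \omega_j]_* + \partial_i [\omega_j, \omega_k]_* + \partial_j [\omega_k, \omega_i]_*.
\end{align*}
Using these relations together with the Jacobi identity
\[
[[\omega_i, \omega_j]_*, \omega_k]_* + [[\omega_j, \omega_k]_*, \omega_i]_* + [[\omega_k, \omega_i]_*, \omega_j]_* = 0,
\]
we easily prove the Bianchi identity.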
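For completeness, the "simple calculations" behind the curvature formula $[\nabla_i, \nabla_j] \zeta = \zeta * R_{ij}$ can be sketched as follows. The expansion is not part of the original text; it uses only the definition $\nabla_i \zeta = \partial_i \zeta + \zeta * \omega_i$, the Leibniz rule, and the commutativity of the derivations $\partial_i$.

```latex
% Sketch: expanding the composition of two covariant derivatives on the left module.
\begin{align*}
\nabla_i \nabla_j \zeta
  &= \partial_i(\partial_j \zeta + \zeta * \omega_j)
     + (\partial_j \zeta + \zeta * \omega_j) * \omega_i \\
  &= \partial_i \partial_j \zeta
     + \partial_i(\zeta) * \omega_j + \partial_j(\zeta) * \omega_i
     + \zeta * \partial_i \omega_j + \zeta * \omega_j * \omega_i .
\end{align*}
% The first three terms are symmetric in (i, j) and cancel in the commutator.
\begin{align*}
[\nabla_i, \nabla_j] \zeta
  &= \zeta * (\partial_i \omega_j - \partial_j \omega_i)
     + \zeta * (\omega_j * \omega_i - \omega_i * \omega_j) \\
  &= \zeta * \bigl( \partial_i \omega_j - \partial_j \omega_i - [\omega_i, \omega_j]_* \bigr)
   = \zeta * R_{ij}.
\end{align*}
```

The sign in front of the commutator comes from the connection acting by right multiplication, which reverses the order of $\omega_i$ and $\omega_j$ in the composition.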
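3.2. Gauge transformations

Let $GL_m(\mathcal{A})$ be the group of invertible $m \times m$-matrices with entries in $\mathcal{A}$. Let $\mathcal{G}$ be the subgroup defined by
\[
\mathcal{G} = \{ g \in GL_m(\mathcal{A}) \mid e * g = g * e \}, \tag{3.3}
\]
which will be referred to as the gauge group. There is a right action of $\mathcal{G}$ on $\mathcal{M}$ defined, for any $\zeta \in \mathcal{M}$ and $g \in \mathcal{G}$, by $\zeta \times g \mapsto \zeta \cdot g := \zeta * g$, where the right side is defined by matrix multiplication. Clearly, $\zeta * g * e = \zeta * g$. Hence $\zeta * g \in \mathcal{M}$, and we indeed have a $\mathcal{G}$ action on $\mathcal{M}$.

For a given $g \in \mathcal{G}$, let
\[
\omega_i^g = g^{-1} * \omega_i * g - g^{-1} * \partial_i g. \tag{3.4}
\]
Then
\[
e * \omega_i^g * (1 - e) = g^{-1} * e * \omega_i * (1 - e) * g - g^{-1} * e * \partial_i(g) * (1 - e).
\]
By (3.1),
\begin{align*}
g^{-1} * e * \omega_i * (1 - e) * g
  &= -g^{-1} * e * \partial_i(e) * g \\
  &= -g^{-1} * e * \partial_i(e * g) + g^{-1} * e * \partial_i g \\
  &= -g^{-1} * e * \partial_i(g) * e - e * \partial_i e + g^{-1} * e * \partial_i g \\
  &= -e * \partial_i e + g^{-1} * e * \partial_i(g) * (1 - e).
\end{align*}
Therefore, $e * \omega_i^g * (1 - e) = -e * \partial_i e$. This shows that the $\omega_i^g$ satisfy the condition (3.1), thus form a connection on $\mathcal{M}$.

Now for any given $g \in \mathcal{G}$, define the maps $\nabla_i^g$ on $\mathcal{M}$ by
\[
\nabla_i^g \zeta = \partial_i \zeta + \zeta * \omega_i^g, \qquad \forall \zeta.
\]
Also, let $R_{ij}^g = \partial_i \omega_j^g - \partial_j \omega_i^g - [\omega_i^g, \omega_j^g]_*$ be the curvature corresponding to the connection $\omega_i^g$. Then we have the following result.

Lemma 3.4. Under a gauge transformation procured by $g \in \mathcal{G}$,
\begin{align*}
\nabla_i^g(\zeta * g) &= \nabla_i(\zeta) * g, \qquad \forall \zeta \in \mathcal{M}; \\
R_{ij}^g &= g^{-1} * R_{ij} * g.
\end{align*}

Proof. Note that
\[
\nabla_i^g(\zeta * g) = \partial_i(\zeta) * g + \zeta * \partial_i g + \zeta * g * \omega_i^g = (\partial_i \zeta + \zeta * \omega_i) * g.
\]
This proves the first formula.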
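To prove the second claim, we use the following formulae:
\begin{align*}
\partial_i \omega_j^g - \partial_j \omega_i^g
  &= g^{-1} * (\partial_i \omega_j - \partial_j \omega_i) * g
     - \partial_i(g^{-1}) * \partial_j g + \partial_j(g^{-1}) * \partial_i g \\
  &\quad + [\partial_i(g^{-1}) * g,\; g^{-1} * \omega_j * g]_*
     - [\partial_j(g^{-1}) * g,\; g^{-1} * \omega_i * g]_*; \\
[\omega_i^g, \omega_j^g]_*
  &= g^{-1} * [\omega_i, \omega_j]_* * g
     - \partial_i(g^{-1}) * \partial_j g + \partial_j(g^{-1}) * \partial_i g \\
  &\quad + [\partial_i(g^{-1}) * g,\; g^{-1} * \omega_j * g]_*
     - [\partial_j(g^{-1}) * g,\; g^{-1} * \omega_i * g]_*.
\end{align*}
Combining these formulae together we obtain $R_{ij}^g = g^{-1} * R_{ij} * g$. This completes the proof of the lemma.

3.3. Vector bundles associated to right projective modules

Connections and curvatures can be introduced for the right bundle $\tilde{\mathcal{M}} = e * \mathcal{A}^m_r$ in much the same way. Let $\tilde{\omega}_i \in M_m(\mathcal{A})$ ($i = 1, 2, \ldots, n$) be matrices satisfying the condition that
\[
(1 - e) * \tilde{\omega}_i * e = \partial_i(e) * e. \tag{3.5}
\]
Then we can introduce a connection consisting of the right covariant derivatives $\tilde{\nabla}_i$ ($i = 1, 2, \ldots, n$) on $\tilde{\mathcal{M}}$ defined by
\[
\tilde{\nabla}_i : \tilde{\mathcal{M}} \to \tilde{\mathcal{M}}, \qquad \xi \mapsto \tilde{\nabla}_i \xi = \partial_i \xi - \tilde{\omega}_i * \xi.
\]
It is easy to show that $\tilde{\nabla}_i(\xi * a) = \tilde{\nabla}_i(\xi) * a + \xi * \partial_i a$ for all $a \in \mathcal{A}$.

Note that if $\tilde{\omega}_i$ is equal to $\partial_i e$ for each $i$, the condition (3.5) is satisfied. We call them the canonical connection on $\tilde{\mathcal{M}}$.

Returning to a general connection $\tilde{\omega}_i$, we define the associated curvature by
\[
\tilde{R}_{ij} = \partial_i \tilde{\omega}_j - \partial_j \tilde{\omega}_i - [\tilde{\omega}_i, \tilde{\omega}_j]_*.
\]
Then for all $\xi \in \tilde{\mathcal{M}}$, we have
\[
[\tilde{\nabla}_i, \tilde{\nabla}_j] \xi = -\tilde{R}_{ij} * \xi.
\]
We further define the covariant derivatives of $\tilde{R}_{ij}$ by
\[
\tilde{\nabla}_k \tilde{R}_{ij} = \partial_k \tilde{R}_{ij} + \tilde{R}_{ij} * \tilde{\omega}_k - \tilde{\omega}_k * \tilde{R}_{ij}.
\]
Then we have the following result.

Lemma 3.5. The curvature on the right bundle $\tilde{\mathcal{M}}$ satisfies the Bianchi identity
\[
\tilde{\nabla}_i \tilde{R}_{jk} + \tilde{\nabla}_j \tilde{R}_{ki} + \tilde{\nabla}_k \tilde{R}_{ij} = 0.
\]
By direct calculations we can also prove the following result:
\[
[\tilde{\nabla}_k, [\tilde{\nabla}_i, \tilde{\nabla}_j]] \xi = -\tilde{\nabla}_k(\tilde{R}_{ij}) * \xi, \qquad \forall \xi \in \tilde{\mathcal{M}}.
\]
Consider the gauge group $\mathcal{G}$ defined by (3.3), which has a right action on $\tilde{\mathcal{M}}$:
\[
\tilde{\mathcal{M}} \times \mathcal{G} \to \tilde{\mathcal{M}}, \qquad \xi \times g \mapsto \xi \cdot g := g^{-1} * \xi.
\]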
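Under a gauge transformation procured by $g \in \mathcal{G}$,
\[
\tilde{\omega}_i \mapsto \tilde{\omega}_i^g := g^{-1} * \tilde{\omega}_i * g + \partial_i(g^{-1}) * g.
\]
The connection $\tilde{\nabla}_i^g$ on $\tilde{\mathcal{M}}$ defined by
\[
\tilde{\nabla}_i^g \xi = \partial_i \xi - \tilde{\omega}_i^g * \xi
\]
satisfies the following relation for all $\xi \in \tilde{\mathcal{M}}$:
\[
\tilde{\nabla}_i^g(g^{-1} * \xi) = g^{-1} * \tilde{\nabla}_i \xi.
\]
Furthermore, the gauge transformed curvature
\[
\tilde{R}_{ij}^g := \partial_i \tilde{\omega}_j^g - \partial_j \tilde{\omega}_i^g - [\tilde{\omega}_i^g, \tilde{\omega}_j^g]_*
\]
is related to $\tilde{R}_{ij}$ by
\[
\tilde{R}_{ij}^g = g^{-1} * \tilde{R}_{ij} * g.
\]
Given any $\Lambda \in M_m(\mathcal{A})$, we can define the $\mathcal{A}$-bimodule map
\[
\langle\ ,\ \rangle : \mathcal{M} \otimes_{\mathbb{R}[[\bar{h}]]} \tilde{\mathcal{M}} \to \mathcal{A}, \qquad \zeta \otimes \xi \mapsto \langle \zeta, \xi \rangle = \zeta * \Lambda * \xi, \tag{3.6}
\]
where $\zeta * \Lambda * \xi$ is defined by matrix multiplication. We shall say that the bimodule homomorphism is gauge invariant if for any element $g$ of the gauge group $\mathcal{G}$,
\[
\langle \zeta \cdot g, \xi \cdot g \rangle = \langle \zeta, \xi \rangle, \qquad \forall \zeta \in \mathcal{M}, \ \xi \in \tilde{\mathcal{M}}.
\]
Also, the bimodule homomorphism is said to be compatible with the connections $\omega_i$ on $\mathcal{M}$ and $\tilde{\omega}_i$ on $\tilde{\mathcal{M}}$ if for all $i = 1, 2, \ldots, n$,
\[
\partial_i \langle \zeta, \xi \rangle = \langle \nabla_i \zeta, \xi \rangle + \langle \zeta, \tilde{\nabla}_i \xi \rangle, \qquad \forall \zeta \in \mathcal{M}, \ \xi \in \tilde{\mathcal{M}}.
\]

Lemma 3.6. Let $\langle\ ,\ \rangle : \mathcal{M} \otimes_{\mathbb{R}[[\bar{h}]]} \tilde{\mathcal{M}} \to \mathcal{A}$ be an $\mathcal{A}$-bimodule homomorphism defined by (3.6) with a given $m \times m$-matrix $\Lambda$ with entries in $\mathcal{A}$. Then
(1) $\langle\ ,\ \rangle$ is gauge invariant if $g * \Lambda * g^{-1} = \Lambda$ for all $g \in \mathcal{G}$;
(2) $\langle\ ,\ \rangle$ is compatible with the connections $\omega_i$ on $\mathcal{M}$ and $\tilde{\omega}_i$ on $\tilde{\mathcal{M}}$ if for all $i$,
\[
e * (\partial_i \Lambda - \omega_i * \Lambda + \Lambda * \tilde{\omega}_i) * e = 0.
\]

Proof. Note that $\langle \zeta \cdot g, \xi \cdot g \rangle = \zeta * g * \Lambda * g^{-1} * \xi$ for any $g \in \mathcal{G}$, $\zeta \in \mathcal{M}$ and $\xi \in \tilde{\mathcal{M}}$. Therefore $\langle \zeta \cdot g, \xi \cdot g \rangle = \langle \zeta, \xi \rangle$ if $g * \Lambda * g^{-1} = \Lambda$. This proves part (1).

Now $\partial_i \langle \zeta, \xi \rangle = \langle \nabla_i \zeta, \xi \rangle + \langle \zeta, \tilde{\nabla}_i \xi \rangle + \zeta * (\partial_i \Lambda - \omega_i * \Lambda + \Lambda * \tilde{\omega}_i) * \xi$. Thus if $\Lambda$ satisfies the condition of part (2), then $\langle\ ,\ \rangle$ is compatible with the connections.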
June 2, 2010 14:55 WSPC/S0129-055X
148-RMP
J070-00402
Projective Module Description of Embedded Noncommutative Spaces
517
3.4. Canonical connections and fiber metric

Let us consider in detail the canonical connections on $\mathcal{M}$ and $\widetilde{\mathcal{M}}$ given by
\[
\omega_i = -\partial_i e, \qquad \tilde\omega_i = \partial_i e.
\]
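As a brief check that this choice is admissible, the following derivation (a sketch that assumes only $e * e = e$ and the Leibniz rule for $\partial_i$ with respect to the Moyal product) recovers the identity $e * \partial_i(e) * e = 0$ and, from it, condition (3.5) for $\tilde\omega_i = \partial_i e$.

```latex
\begin{align*}
\partial_i e &= \partial_i(e * e) = \partial_i(e) * e + e * \partial_i(e), \\
e * \partial_i(e) * e &= e * \partial_i(e) * e * e + e * e * \partial_i(e) * e
   = 2\, e * \partial_i(e) * e
   \quad\Longrightarrow\quad e * \partial_i(e) * e = 0, \\
(1 - e) * \partial_i(e) * e &= \partial_i(e) * e - e * \partial_i(e) * e = \partial_i(e) * e .
\end{align*}
```

The same identity gives $e * (-\partial_i e) * (1 - e) = -e * \partial_i e$, which is the corresponding statement for $\omega_i = -\partial_i e$.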
A particularly nice feature in this case is that the corresponding curvatures on the left and right bundles coincide. We have the following formula:
\[
\mathcal{R}_{ij} = \widetilde{\mathcal{R}}_{ij} = -[\partial_i e, \partial_j e]_*.
\]
Now we consider a special case of the $\mathcal{A}$-bimodule map defined by Eq. (3.6).

Definition 3.7. Denote by $\mathbf{g} : \mathcal{M} \otimes_{\mathbb{R}[[\bar h]]} \widetilde{\mathcal{M}} \to \mathcal{A}$ the map defined by (3.6) with $\Lambda$ being the identity matrix. We shall call $\mathbf{g}$ the fiber metric on $\mathcal{M}$.

Lemma 3.8. The fiber metric $\mathbf{g}$ is gauge invariant and is compatible with the standard connections.

Proof. Since $\Lambda$ is the identity matrix in the present case, it immediately follows from Lemma 3.6(1) that $\mathbf{g}$ is gauge invariant. Note that $e * \partial_i(e) * e = 0$ for all $i$. Using this fact in Lemma 3.6(2), we easily see that $\mathbf{g}$ is compatible with the standard connections.

4. Embedded Noncommutative Spaces

In this section, we study explicit examples of idempotents and related projective modules. They correspond to the noncommutative spaces introduced in [8]. The main result here is a reformulation of the theory of embedded noncommutative spaces [8] in the framework of Sec. 3 in terms of projective modules.

4.1. Embedded noncommutative spaces

We shall consider only embedded spaces with Euclidean signature. The Minkowski case is similar, and we shall briefly allude to it in Remark 4.6 at the end of this section. Given $X = (X^1\ X^2\ \cdots\ X^m)$ in ${}_l\mathcal{A}^m$, we define an $(n \times n)$-matrix $(g_{ij})_{i,j=1,2,\dots,n}$ with entries given by
\[
g_{ij} = \sum_{\alpha=1}^{m} \partial_i X^\alpha * \partial_j X^\alpha.
\]
Following [8], we shall call $X$ a noncommutative space embedded in $\mathcal{A}^m$ if the matrix $(g_{ij})$ is invertible. For a given noncommutative space $X$, we denote by $(g^{ij})$ the inverse matrix of $(g_{ij})$ with $g_{ij} * g^{jk} = g^{kj} * g_{ji} = \delta_i^k$ for all $i$ and $k$. Here Einstein's summation convention is used, and we shall continue to use this convention throughout the paper. Let
\[
E_i = \partial_i X, \qquad E^i = g^{ij} * E_j, \qquad \tilde E^i = (E_j)^t * g^{ji},
\]
for $i = 1, 2, \dots, n$, where
\[
(E_i)^t = \begin{pmatrix} \partial_i X^1 \\ \partial_i X^2 \\ \vdots \\ \partial_i X^m \end{pmatrix}
\]
denotes the transpose of $E_i$. Define $e \in M_m(\mathcal{A})$ by
\[
e := \tilde E^j * E_j =
\begin{pmatrix}
\partial_i X^1 * g^{ij} * \partial_j X^1 & \partial_i X^1 * g^{ij} * \partial_j X^2 & \cdots & \partial_i X^1 * g^{ij} * \partial_j X^m \\
\partial_i X^2 * g^{ij} * \partial_j X^1 & \partial_i X^2 * g^{ij} * \partial_j X^2 & \cdots & \partial_i X^2 * g^{ij} * \partial_j X^m \\
\cdots & \cdots & \cdots & \cdots \\
\partial_i X^m * g^{ij} * \partial_j X^1 & \partial_i X^m * g^{ij} * \partial_j X^2 & \cdots & \partial_i X^m * g^{ij} * \partial_j X^m
\end{pmatrix}. \tag{4.1}
\]
We have the following results.

Proposition 4.1. (1) Under matrix multiplication, $E_i * \tilde E^j = \delta_i^j$ for all $i$ and $j$.
(2) The $m \times m$ matrix $e$ satisfies $e * e = e$, that is, it is an idempotent in $M_m(\mathcal{A})$.
(3) The left and right projective $\mathcal{A}$-modules $\mathcal{M} = {}_l\mathcal{A}^m * e$ and $\widetilde{\mathcal{M}} = e * \mathcal{A}^m_r$ are respectively spanned by $E_i$ and $\tilde E^i$. More precisely, we have
\[
\mathcal{M} = \{ a^i * E_i \mid a^i \in \mathcal{A} \}, \qquad \widetilde{\mathcal{M}} = \{ \tilde E^i * b_i \mid b_i \in \mathcal{A} \}.
\]
Proof. Note that $g_{ij} = E_i * (E_j)^t$. Thus $E_i * \tilde E^j = E_i * (E_k)^t * g^{kj} = \delta_i^j$. It then immediately follows that
\[
e * e = \tilde E^i * (E_i * \tilde E^j) * E_j = \tilde E^i * \delta_i^j * E_j = e.
\]
Obviously, $\mathcal{M} \subset \{ a^i * E_i \mid a^i \in \mathcal{A} \}$ and $\widetilde{\mathcal{M}} \subset \{ \tilde E^i * b_i \mid b_i \in \mathcal{A} \}$. By the first part of the proposition, we have
\[
a^i * E_i * e = a^i * (E_i * \tilde E^j) * E_j = a^j * E_j, \qquad
e * \tilde E^j * b_j = \tilde E^i * (E_i * \tilde E^j) * b_j = \tilde E^i * b_i.
\]
This proves the last claim of the proposition.

It is also useful to observe that $\widetilde{\mathcal{M}} = \{ (E_i)^t * b^i \mid b^i \in \mathcal{A} \}$ since $(g_{ij})$ is invertible.

We shall denote $\mathcal{M}$ and $\widetilde{\mathcal{M}}$ respectively by $TX$ and $\tilde T X$, and refer to them as the left and right tangent bundles of the noncommutative space $X$. Note that the definition of the tangent bundles coincides with that in [8].

Definition 4.2. Call the fiber metric $\mathbf{g} : TX \otimes_{\mathbb{R}[[\bar h]]} \tilde T X \to \mathcal{A}$ defined in Definition 3.7 the metric of the noncommutative space $X$.

The proposition below in particular shows that $\mathbf{g}$ agrees with the metric of the embedded noncommutative space defined in [8] in a geometric setting.
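The algebra behind Proposition 4.1 can be tried out in the commutative limit, where the Moyal product reduces to ordinary multiplication. The sketch below is only an illustration under that assumption: it takes two hypothetical tangent vectors at a single point (so $n = 2$, $m = 3$), builds $e = (E_i)^t g^{ij} E_j$ with exact rational arithmetic, and leaves the noncommutative corrections aside. The function names are made up for this example.

```python
from fractions import Fraction


def matmul(A, B):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def idempotent_from_frame(frame):
    """Build e = E^t g^{-1} E from two linearly independent row vectors.

    Commutative-limit stand-in for (4.1); only n = 2 is handled because the
    2x2 inverse is written out by hand.
    """
    E = [[Fraction(x) for x in row] for row in frame]
    n, m = len(E), len(E[0])
    # g_ij = E_i (E_j)^t
    g = [[sum(E[i][a] * E[j][a] for a in range(m)) for j in range(n)]
         for i in range(n)]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    g_inv = [[g[1][1] / det, -g[0][1] / det],
             [-g[1][0] / det, g[0][0] / det]]
    # e_ab = sum_{i,j} E_i^a g^{ij} E_j^b
    e = [[sum(E[i][a] * g_inv[i][j] * E[j][b]
              for i in range(n) for j in range(n))
          for b in range(m)] for a in range(m)]
    return E, g_inv, e


E, g_inv, e = idempotent_from_frame([[1, 2, 0], [0, 1, 3]])
print(matmul(e, e) == e)   # idempotency, Proposition 4.1(2)
print(matmul(E, e) == E)   # E_i e = E_i, so the E_i span the left module
```

In this limit $e$ is simply the orthogonal projection onto the span of the $E_i$, which is why its trace equals $n$.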
Proposition 4.3. For any $\zeta = a^i * E_i \in TX$ and $\xi = (E_j)^t * b^j \in \tilde T X$ with $a^i, b^j \in \mathcal{A}$,
\[
\mathbf{g} : \zeta \otimes \xi \mapsto \mathbf{g}(\zeta, \xi) = a^i * g_{ij} * b^j.
\]
In particular, $\mathbf{g}(E_i, (E_j)^t) = g_{ij}$.

Proof. Recall from Definition 3.7 that $\mathbf{g}$ is defined by (3.6) with $\Lambda$ being the identity matrix. Thus for any $\zeta = a^i * E_i \in TX$ and $\xi = (E_j)^t * b^j \in \tilde T X$ with $a^i, b^j \in \mathcal{A}$,
\[
\mathbf{g}(\zeta, \xi) = a^i * E_i * (E_j)^t * b^j = a^i * g_{ij} * b^j.
\]
This completes the proof.

Let us now equip the left and right tangent bundles with the canonical connections given by $\omega_i = -\tilde\omega_i = -\partial_i e$, and denote the corresponding covariant derivatives by
\[
\nabla_i : TX \to TX, \qquad \tilde\nabla_i : \tilde T X \to \tilde T X.
\]
In principle, one can take arbitrary connections for the tangent bundles, but we shall not allow this option in this paper.

The following elements of $\mathcal{A}$ are defined in [8]:
\[
\begin{aligned}
{}_c\Gamma_{ijl} &= \frac{1}{2}\left( \partial_i g_{jl} + \partial_j g_{li} - \partial_l g_{ji} \right), &
\Upsilon_{ijl} &= \frac{1}{2}\left( \partial_i(E_j) * (E_l)^t - E_l * \partial_i(E_j)^t \right), \\
\Gamma_{ijl} &= {}_c\Gamma_{ijl} + \Upsilon_{ijl}, &
\tilde\Gamma_{ijl} &= {}_c\Gamma_{ijl} - \Upsilon_{ijl},
\end{aligned}
\]
where $\Upsilon_{ijk}$ was referred to as the noncommutative torsion. Set [8]
\[
\Gamma^k_{ij} = \Gamma_{ijl} * g^{lk}, \qquad \tilde\Gamma^k_{ij} = g^{kl} * \tilde\Gamma_{ijl}. \tag{4.2}
\]
Then we have the following result.

Lemma 4.4.
\[
\nabla_i E_j = \Gamma^k_{ij} * E_k, \qquad \tilde\nabla_i \tilde E^j = -\tilde E^k * \Gamma^j_{ki}. \tag{4.3}
\]
Proof. Consider the first formula. Write $\partial_i e = \partial_i(\tilde E^k) * E_k + \tilde E^k * \partial_i E_k$. We have
\[
\nabla_i E_j = \partial_i E_j - E_j * \partial_i e = \partial_i E_j - \left( \partial_i(E_j * e) - \partial_i(E_j) * e \right) = \partial_i(E_j) * \tilde E^k * E_k.
\]
It was shown in [8] that $\Gamma^k_{ij} = \partial_i(E_j) * \tilde E^k$. This immediately leads to the first formula. The proof for the second formula is essentially the same.

Note that Lemma 4.4 can be re-stated as
\[
\nabla_i E^j = -\tilde\Gamma^j_{ik} * E^k, \qquad \tilde\nabla_i (E_j)^t = (E_k)^t * \tilde\Gamma^k_{ij}.
\]
By using Lemmas 3.8 and 4.4, we can easily prove the following result, which is equivalent to [8, Proposition 2.7].

Proposition 4.5. The connections are metric compatible in the sense that
\[
\partial_i \mathbf{g}(\zeta, \xi) = \mathbf{g}(\nabla_i \zeta, \xi) + \mathbf{g}(\zeta, \tilde\nabla_i \xi), \qquad \forall \zeta \in TX,\ \xi \in \tilde T X. \tag{4.4}
\]
For $\zeta = E_j$ and $\xi = (E_k)^t$, we obtain from (4.4) the following result for all $i, j, k$:
\[
\partial_i g_{jk} - \Gamma_{ijk} - \tilde\Gamma_{ikj} = 0. \tag{4.5}
\]
This formula is in fact equivalent to Proposition 4.5.

Define
\[
R^l_{kij} = E_k * \mathcal{R}_{ij} * \tilde E^l, \qquad \tilde R^l_{kij} = -g^{lq} * E_q * \mathcal{R}_{ij} * \tilde E^p * g_{pk}. \tag{4.6}
\]
Using $\mathcal{R}_{ij} = \widetilde{\mathcal{R}}_{ij} = -[\partial_i e, \partial_j e]_*$, we can show by some lengthy calculations that
\[
\begin{aligned}
R^l_{kij} &= -\partial_j \Gamma^l_{ik} - \Gamma^p_{ik} * \Gamma^l_{jp} + \partial_i \Gamma^l_{jk} + \Gamma^p_{jk} * \Gamma^l_{ip}, \\
\tilde R^l_{kij} &= -\partial_j \tilde\Gamma^l_{ik} - \tilde\Gamma^l_{jp} * \tilde\Gamma^p_{ik} + \partial_i \tilde\Gamma^l_{jk} + \tilde\Gamma^l_{ip} * \tilde\Gamma^p_{jk},
\end{aligned} \tag{4.7}
\]
which are the Riemannian curvatures of the left and right tangent bundles of the noncommutative space $X$ given in [8, Lemma 2.12 and §4]. Therefore,
\[
[\nabla_i, \nabla_j] E_k = R^l_{kij} * E_l, \qquad [\tilde\nabla_i, \tilde\nabla_j] (E_k)^t = (E_l)^t * \tilde R^l_{kij}, \tag{4.8}
\]
recovering the relations [8, (2.13)] and their generalizations [8, §4] to arbitrary $m \ge n$.

Remark 4.6. We comment briefly on noncommutative spaces with Minkowski signatures embedded in higher dimensions [8]. Let $\eta = \mathrm{diag}(-1, \dots, -1, 1, \dots, 1)$ be a diagonal $(m \times m)$-matrix with $p$ of the diagonal entries being $-1$, and $q = m - p$ of them being $1$. Given $X = (X^1\ X^2\ \cdots\ X^m)$ in ${}_l\mathcal{A}^m$, we define an $(n \times n)$-matrix $(g_{ij})_{i,j=1,2,\dots,n}$ with entries
\[
g_{ij} = \sum_{\alpha=1}^{m} \partial_i X^\alpha * \eta_{\alpha\beta} * \partial_j X^\beta.
\]
We call $X$ a noncommutative space embedded in $\mathcal{A}^m$ if the matrix $(g_{ij})$ is invertible. Denote its inverse matrix by $(g^{ij})$. Now the idempotent which gives rise to the left and right tangent bundles of $X$ is given by
\[
e = \eta (E_i)^t * g^{ij} * E_j,
\]
which obviously satisfies $E_i * e = E_i$ for all $i$. The fiber metric of Definition 3.7 yields a metric on the embedded noncommutative surface $X$.
4.2. Example

We analyze an embedded noncommutative surface of Euclidean signature arising from the quantisation of a time slice of the Schwarzschild spacetime. While the main purpose here is to illustrate how the general theory developed in previous sections works, the example is interesting in its own right.

Let us first specify the notation to be used in this section. Let $t^1 = r$, $t^2 = \theta$ and $t^3 = \phi$, with $r > 2m$, $\theta \in (0, \pi)$, and $\phi \in (0, 2\pi)$. We deform the algebra of functions in these variables by imposing the Moyal product defined by (2.1) with the following anti-symmetric matrix
\[
(\theta_{ij})_{i,j=1}^{3} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}.
\]
Note that the functions depending only on the variable $r$ are central in the Moyal algebra $\mathcal{A}$. We shall write the usual pointwise product of two functions $f$ and $g$ as $fg$, but write their Moyal product as $f * g$.

Consider $X = (X^1\ X^2\ X^3\ X^4)$ given by
\[
\begin{aligned}
X^1 &= f(r) \quad \text{with} \quad (f')^2 + 1 = \left( 1 - \frac{2m}{r} \right)^{-1}, \\
X^2 &= r \sin\theta \cos\phi, \qquad X^3 = r \sin\theta \sin\phi, \qquad X^4 = r \cos\theta.
\end{aligned} \tag{4.9}
\]
Simple calculations yield
\[
\begin{aligned}
E_1 &= \partial_r X = (\, f' \quad \sin\theta\cos\phi \quad \sin\theta\sin\phi \quad \cos\theta \,), \\
E_2 &= \partial_\theta X = (\, 0 \quad r\cos\theta\cos\phi \quad r\cos\theta\sin\phi \quad -r\sin\theta \,), \\
E_3 &= \partial_\phi X = (\, 0 \quad -r\sin\theta\sin\phi \quad r\sin\theta\cos\phi \quad 0 \,).
\end{aligned}
\]
Using these formulae, we obtain the following expressions for the components of the metric of the noncommutative surface $X$:
\[
\begin{aligned}
g_{11} &= \left( 1 - \frac{2m}{r} \right)^{-1} \left[ 1 - \left( 1 - \frac{2m}{r} \right) \cos(2\theta) \sinh^2 \bar h \right], \\
g_{12} &= g_{21} = r \sin(2\theta) \sinh^2 \bar h, \\
g_{22} &= r^2 \left[ 1 + \cos(2\theta) \sinh^2 \bar h \right], \\
g_{23} &= -g_{32} = -r^2 \cos(2\theta) \sinh \bar h \cosh \bar h, \\
g_{13} &= -g_{31} = -r \sin(2\theta) \sinh \bar h \cosh \bar h, \\
g_{33} &= r^2 \left[ \sin^2\theta - \cos(2\theta) \sinh^2 \bar h \right].
\end{aligned} \tag{4.10}
\]
In the limit $\bar h \to 0$, we recover the spatial components of the Schwarzschild metric. Observe that the noncommutative surface still reflects the characteristics of the Schwarzschild spacetime in that there is a time slice of the Schwarzschild black hole with the event horizon at $r = 2m$.
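The commutative limit just mentioned can be checked numerically. The following sketch is an illustration only: it evaluates the tangent vectors of the embedding (4.9) with ordinary products, forms their Gram matrix, and compares it with the formulae (4.10) evaluated at $\bar h = 0$. The function names and the sample point are choices made for this example, not part of the original text.

```python
import math


def tangent_vectors(r, theta, phi, m):
    """Commutative-limit E_1, E_2, E_3 of the embedding (4.9); needs r > 2m."""
    # f'(r) is fixed (up to sign) by (f')^2 + 1 = (1 - 2m/r)^(-1).
    f_prime = math.sqrt(1.0 / (1.0 - 2.0 * m / r) - 1.0)
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    e1 = [f_prime, st * cp, st * sp, ct]
    e2 = [0.0, r * ct * cp, r * ct * sp, -r * st]
    e3 = [0.0, -r * st * sp, r * st * cp, 0.0]
    return [e1, e2, e3]


def gram_matrix(vectors):
    """g_ij = E_i . E_j with the ordinary (pointwise) product."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]


def metric_4_10(r, theta, m, hbar):
    """Metric components transcribed from (4.10)."""
    s, c = math.sinh(hbar), math.cosh(hbar)
    c2, s2 = math.cos(2 * theta), math.sin(2 * theta)
    k = 1.0 - 2.0 * m / r
    g11 = (1.0 - k * c2 * s * s) / k
    g12 = r * s2 * s * s
    g13 = -r * s2 * s * c
    g22 = r * r * (1.0 + c2 * s * s)
    g23 = -r * r * c2 * s * c
    g33 = r * r * (math.sin(theta) ** 2 - c2 * s * s)
    # g21 = g12, g31 = -g13, g32 = -g23
    return [[g11, g12, g13], [g12, g22, g23], [-g13, -g23, g33]]


r, theta, phi, m = 5.0, 0.7, 1.3, 1.0
gram = gram_matrix(tangent_vectors(r, theta, phi, m))
limit = metric_4_10(r, theta, m, 0.0)
print(all(math.isclose(gram[i][j], limit[i][j], abs_tol=1e-9)
          for i in range(3) for j in range(3)))
```

Both matrices reduce to $\mathrm{diag}\big((1 - 2m/r)^{-1},\ r^2,\ r^2\sin^2\theta\big)$, the spatial part of the Schwarzschild metric.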
Since the metric $(g_{ij})$ depends on $\theta$ and $r$ only, and the two variables commute, the inverse $(g^{ij})$ of the metric can be calculated in the usual way as in the commutative case. Now the components of the idempotent $e = (e_{ij}) = (E_i)^t * g^{ij} * E_j$ are given by the following formulae:
\[
\begin{aligned}
e_{11} &= \frac{2m}{r} + \frac{2m(2m - r)(2 + \cos 2\theta)}{r^2}\,\bar h^2 + O(\bar h^3), \\
e_{12} &= \frac{m\cos\phi\sin\theta}{r\sqrt{\frac{m}{-4m+2r}}} - \frac{2m\cos\theta\sin\phi}{r\sqrt{\frac{m}{-4m+2r}}}\,\bar h + \frac{m(4m + r + 2m\cos 2\theta)\cos\phi\sin\theta}{r^2\sqrt{\frac{m}{-4m+2r}}}\,\bar h^2 + O(\bar h^3), \\
e_{13} &= \frac{m\sin\theta\sin\phi}{r\sqrt{\frac{m}{-4m+2r}}} + \frac{2m\cos\theta\cos\phi}{r\sqrt{\frac{m}{-4m+2r}}}\,\bar h + \frac{m(4m + r + 2m\cos 2\theta)\sin\theta\sin\phi}{r^2\sqrt{\frac{m}{-4m+2r}}}\,\bar h^2 + O(\bar h^3), \\
e_{14} &= \frac{m\cos\theta}{r\sqrt{\frac{m}{-4m+2r}}} + \frac{m\cos\theta\,(4m - r + 2m\cos 2\theta)}{r^2\sqrt{\frac{m}{-4m+2r}}}\,\bar h^2 + O(\bar h^3), \\
e_{21} &= \frac{m\cos\phi\sin\theta}{r\sqrt{\frac{m}{-4m+2r}}} + \frac{2m\cos\theta\sin\phi}{r\sqrt{\frac{m}{-4m+2r}}}\,\bar h + \frac{m(4m + r + 2m\cos 2\theta)\cos\phi\sin\theta}{r^2\sqrt{\frac{m}{-4m+2r}}}\,\bar h^2 + O(\bar h^3), \\
e_{22} &= 1 - \frac{2m\sin^2\theta\cos^2\phi}{r} + \frac{m}{2r^2}\big[ 2r + 2m\cos 4\theta\cos^2\phi - 6m\cos^2\phi \\
&\qquad + 2\cos 2\theta\,(m + 8r + (m - r)\cos 2\phi) \big]\,\bar h^2 + O(\bar h^3), \\
e_{23} &= -\frac{m\sin^2\theta\sin 2\phi}{r} - \frac{3m\sin 2\theta}{r}\,\bar h + \frac{m\,(2(m - r)\cos 2\theta + m(-3 + \cos 4\theta))\sin 2\phi}{2r^2}\,\bar h^2 + O(\bar h^3), \\
e_{24} &= \frac{-2m\cos\theta\cos\phi\sin\theta}{r} - \frac{m(1 + 3\cos 2\theta)\sin\phi}{r}\,\bar h - \frac{m(8m + 5r + 4m\cos 2\theta)\cos\phi\sin 2\theta}{2r^2}\,\bar h^2 + O(\bar h^3), \\
e_{31} &= \frac{m\sin\theta\sin\phi}{r\sqrt{\frac{m}{-4m+2r}}} - \frac{2m\cos\theta\cos\phi}{r\sqrt{\frac{m}{-4m+2r}}}\,\bar h + \frac{m(4m + r + 2m\cos 2\theta)\sin\theta\sin\phi}{r^2\sqrt{\frac{m}{-4m+2r}}}\,\bar h^2 + O(\bar h^3), \\
e_{32} &= -\frac{m\sin^2\theta\sin 2\phi}{r} + \frac{3m\sin 2\theta}{r}\,\bar h + \frac{m\,(2(m - r)\cos 2\theta + m(-3 + \cos 4\theta))\sin 2\phi}{2r^2}\,\bar h^2 + O(\bar h^3), \\
e_{33} &= 1 - \frac{2m\sin^2\theta\sin^2\phi}{r} + \frac{m}{2r^2}\big[ 2r + 2m\cos 4\theta\sin^2\phi - 6m\sin^2\phi \\
&\qquad + 2\cos 2\theta\,(m + 8r - (m - r)\cos 2\phi) \big]\,\bar h^2 + O(\bar h^3), \\
e_{34} &= \frac{-2m\cos\theta\sin\theta\sin\phi}{r} + \frac{m(1 + 3\cos 2\theta)\cos\phi}{r}\,\bar h - \frac{m(8m + 5r + 4m\cos 2\theta)\sin 2\theta\sin\phi}{2r^2}\,\bar h^2 + O(\bar h^3), \\
e_{41} &= \frac{m\cos\theta}{r\sqrt{\frac{m}{-4m+2r}}} + \frac{m\cos\theta\,(4m - r + 2m\cos 2\theta)}{r^2\sqrt{\frac{m}{-4m+2r}}}\,\bar h^2 + O(\bar h^3), \\
e_{42} &= \frac{-2m\cos\theta\cos\phi\sin\theta}{r} + \frac{m(1 + 3\cos 2\theta)\sin\phi}{r}\,\bar h - \frac{m(8m + 5r + 4m\cos 2\theta)\cos\phi\sin 2\theta}{2r^2}\,\bar h^2 + O(\bar h^3), \\
e_{43} &= \frac{-2m\cos\theta\sin\theta\sin\phi}{r} - \frac{m(1 + 3\cos 2\theta)\cos\phi}{r}\,\bar h - \frac{m(8m + 5r + 4m\cos 2\theta)\sin 2\theta\sin\phi}{2r^2}\,\bar h^2 + O(\bar h^3), \\
e_{44} &= 1 - \frac{2m\cos^2\theta}{r} + \frac{4m\cos^2\theta\,(-2m + r - m\cos 2\theta)}{r^2}\,\bar h^2 + O(\bar h^3).
\end{aligned}
\]
Let us write $e = e_0 + \bar h e_1 + \bar h^2 e_2 + \cdots$. Then inspecting the formulae we see that the matrices $e_0$ and $e_2$ are symmetric, while $e_1$ is skew symmetric. This is no coincidence; rather it is a consequence of properties of $X$ under the bar involution, which will be discussed in Sec. 6.

Here we refrain from presenting the result of the Mathematica computation for the curvature $\mathcal{R}_{ij} = -[\partial_i e, \partial_j e]_*$, which is very complicated and not terribly illuminating. However, we mention that in [37] a quantisation of the Schwarzschild spacetime was carried out (for a particular choice of $\Theta$), and the resulting noncommutative differential geometry was studied in detail. In particular, the metric, Christoffel symbols, Riemannian and Ricci curvatures were explicitly worked out. We refer to that paper for details.
5. General Coordinate Transformations

We now return to the general setting of Sec. 3 to investigate "general coordinate transformations". Our treatment follows closely [8, §V] and makes use of general ideas of [17, 21, 28]. We should point out that the material presented is part of an attempt of ours to develop a notion of "general covariance" in the noncommutative setting. This is an important matter which deserves a thorough investigation. We hope that the work presented here will prompt further studies.

Let $(\mathcal{A}, \mu)$ be a Moyal algebra of smooth functions on the open region $U$ of $\mathbb{R}^n$ with coordinate $t$. This algebra is defined with respect to a constant skew symmetric matrix $\theta = (\theta_{ij})$. Let $\Phi : U \to U$ be a diffeomorphism of $U$ in the classical sense. We denote $u^i = \Phi^i(t)$, and refer to this as a general coordinate transformation of $U$.

Denote by $\mathcal{A}_u$ the set of smooth functions of $u = (u^1, u^2, \dots, u^n)$. The map $\Phi$ induces an $\mathbb{R}[[\bar h]]$-module isomorphism $\phi = \Phi^* : \mathcal{A}_u \to \mathcal{A}$ defined for any function $f \in \mathcal{A}_u$ by
\[
\phi(f)(t) = f(\Phi(t)).
\]
We define the $\mathbb{R}[[\bar h]]$-bilinear map
\[
\mu_u : \mathcal{A}_u \otimes \mathcal{A}_u \to \mathcal{A}_u, \qquad \mu_u(f, g) = \phi^{-1} \mu_t(\phi(f), \phi(g)).
\]
Then it is well known [21] that $\mu_u$ is associative. Therefore, we have the associative algebra isomorphism
\[
\phi : (\mathcal{A}_u, \mu_u) \xrightarrow{\ \sim\ } (\mathcal{A}_t, \mu_t).
\]
We say that the two associative algebras are gauge equivalent by adopting the terminology of [17].

Following [8], we define $\mathbb{R}[[\bar h]]$-linear operators
\[
\partial_i^\phi := \phi^{-1} \circ \partial_i \circ \phi : \mathcal{A}_u \to \mathcal{A}_u, \tag{5.1}
\]
which have the following properties [8, Lemma 5.5]:
\[
\begin{aligned}
&\partial_i^\phi \circ \partial_j^\phi - \partial_j^\phi \circ \partial_i^\phi = 0, \\
&\partial_i^\phi \mu_u(f, g) = \mu_u(\partial_i^\phi(f), g) + \mu_u(f, \partial_i^\phi(g)), \qquad \forall f, g \in \mathcal{A}_u,
\end{aligned}
\]
where the second relation is the Leibniz rule for $\partial_i^\phi$. Recall that this Leibniz rule played a crucial role in the construction of noncommutative spaces over $(\mathcal{A}_u, \mu_u)$ in [8].

We shall denote by $M_m(\mathcal{A}_u)$ the set of $(m \times m)$-matrices with entries in $\mathcal{A}_u$. The product of two such matrices will be defined with respect to the multiplication $\mu_u$ of the algebra $(\mathcal{A}_u, \mu_u)$. Then $\phi^{-1}$ acting componentwise gives rise to an algebra isomorphism from $M_m(\mathcal{A})$ to $M_m(\mathcal{A}_u)$, where matrix multiplication in $M_m(\mathcal{A})$ is defined with respect to $\mu$.
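The definition of $\mu_u$ is an instance of transporting a product along a bijection, and associativity is inherited automatically. The toy sketch below illustrates only this mechanism and is not the Moyal construction itself: the algebra of $2 \times 2$ integer matrices stands in for $(\mathcal{A}_t, \mu_t)$, and `phi` is a hypothetical invertible linear map that is deliberately not an algebra homomorphism for the ordinary matrix product.

```python
def mu_t(x, y):
    """Reference product: 2x2 integer matrices stored as tuples (a, b, c, d)."""
    a, b, c, d = x
    p, q, r, s = y
    return (a * p + b * r, a * q + b * s, c * p + d * r, c * q + d * s)


def phi(x):
    """Hypothetical invertible linear map playing the role of phi."""
    a, b, c, d = x
    return (a, b + a, c, d)


def phi_inv(x):
    """Inverse of phi."""
    a, b, c, d = x
    return (a, b - a, c, d)


def mu_u(f, g):
    """Transported product mu_u(f, g) = phi^{-1}(mu_t(phi(f), phi(g)))."""
    return phi_inv(mu_t(phi(f), phi(g)))


f, g, h = (1, 2, 3, 4), (0, -1, 5, 2), (7, 1, -3, 6)
print(mu_u(mu_u(f, g), h) == mu_u(f, mu_u(g, h)))   # associativity of mu_u
print(phi(mu_u(f, g)) == mu_t(phi(f), phi(g)))      # phi is an algebra isomorphism
```

By construction `phi` intertwines the two products, which is the finite-dimensional analogue of the gauge equivalence between $(\mathcal{A}_u, \mu_u)$ and $(\mathcal{A}_t, \mu_t)$.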
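Since we need to deal with two different algebras $(\mathcal{A}, \mu)$ and $(\mathcal{A}_u, \mu_u)$ simultaneously in this section, we write $\mu$ and the matrix multiplication defined with respect to it by $*$ as before, and use $*_u$ to denote $\mu_u$ and the matrix multiplication defined with respect to it.

Let $e \in M_m(\mathcal{A})$ be an idempotent. There exists the corresponding finitely generated projective left (respectively, right) $\mathcal{A}$-module $\mathcal{M}$ (respectively, $\widetilde{\mathcal{M}}$). Now $e_u := \phi^{-1}(e)$ is an idempotent in $M_m(\mathcal{A}_u)$, that is, $\phi^{-1}(e) *_u \phi^{-1}(e) = \phi^{-1}(e)$. Write $e_u = (E^\alpha_\beta)_{\alpha,\beta=1,\dots,m}$. This idempotent gives rise to the left projective $\mathcal{A}_u$-module $\mathcal{M}_u$ and right projective $\mathcal{A}_u$-module $\widetilde{\mathcal{M}}_u$, respectively defined by
\[
\mathcal{M}_u = \left\{ \left( a_\alpha *_u E^\alpha_1 \quad a_\alpha *_u E^\alpha_2 \quad \cdots \quad a_\alpha *_u E^\alpha_m \right) \,\middle|\, a_\alpha \in \mathcal{A}_u \right\},
\qquad
\widetilde{\mathcal{M}}_u = \left\{ \begin{pmatrix} E^1_\beta *_u b^\beta \\ E^2_\beta *_u b^\beta \\ \vdots \\ E^m_\beta *_u b^\beta \end{pmatrix} \,\middle|\, b^\beta \in \mathcal{A}_u \right\},
\]
where $a_\alpha *_u E^\alpha_\beta = \sum_\alpha \mu_u(a_\alpha, E^\alpha_\beta)$ and $E^\alpha_\beta *_u b^\beta = \sum_\beta \mu_u(E^\alpha_\beta, b^\beta)$. Below we consider the left projective module only, as the right projective module may be treated similarly.

Assume that we have the left connection
\[
\nabla_i : \mathcal{M} \to \mathcal{M}, \qquad \nabla_i \zeta = \frac{\partial \zeta}{\partial t^i} + \zeta * \omega_i.
\]
Let $\omega_i^u := \phi^{-1}(\omega_i)$. We have the following result.

Theorem 5.1. (1) The matrices $\omega_i^u$ satisfy the following relations in $M_m(\mathcal{A}_u)$:
\[
e_u *_u \omega_i^u *_u (1 - e_u) = -e_u *_u \partial_i^\phi e_u.
\]
(2) The operators $\nabla_i^\phi$ ($i = 1, 2, \dots, n$) defined for all $\eta \in \mathcal{M}_u$ by
\[
\nabla_i^\phi \eta = \partial_i^\phi \eta + \eta *_u \omega_i^u
\]
give rise to a connection on $\mathcal{M}_u$.
(3) The curvature of the connection $\nabla_i^\phi$ is given by
\[
\mathcal{R}^u_{ij} = \partial_i^\phi \omega_j^u - \partial_j^\phi \omega_i^u - \omega_i^u *_u \omega_j^u + \omega_j^u *_u \omega_i^u,
\]
which is related to the curvature $\mathcal{R}_{ij}$ of $\mathcal{M}$ by $\mathcal{R}^u_{ij} = \phi^{-1}(\mathcal{R}_{ij})$.

Proof. Note that $e_u *_u \omega_i^u *_u (1 - e_u) = \phi^{-1}(e * \omega_i * (1 - e))$. We also have $\partial_i^\phi e_u = \phi^{-1}\!\left( \frac{\partial e}{\partial t^i} \right)$, which leads to $e_u *_u \partial_i^\phi e_u = \phi^{-1}(e * \phi(\partial_i^\phi e_u)) = \phi^{-1}(e * \partial_i e)$. This proves part (1). Part (2) follows from part (1) and the Leibniz rule for $\partial_i^\phi$. Straightforward calculations show that the curvature of the connection $\nabla_i^\phi$ is given by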
\[
\mathcal{R}^u_{ij} = \partial_i^\phi \omega_j^u - \partial_j^\phi \omega_i^u - \omega_i^u *_u \omega_j^u + \omega_j^u *_u \omega_i^u.
\]
Now $\partial_i^\phi \omega_j^u = \phi^{-1}\!\left( \frac{\partial \omega_j}{\partial t^i} \right)$, and $\omega_i^u *_u \omega_j^u - \omega_j^u *_u \omega_i^u = \phi^{-1}(\omega_i * \omega_j) - \phi^{-1}(\omega_j * \omega_i)$. Hence $\mathcal{R}^u_{ij} = \phi^{-1}(\mathcal{R}_{ij})$.

Remark 5.2. One can recover the usual transformation rules of tensors under the diffeomorphism group from the commutative limit of Theorem 5.1 in a way similar to that in [8, §5.C].

6. Bar Involution and Generalized Hermitian Structure

In this section, we study a Moyal algebra analogue of the bar map of quantum groups, and investigate its implications on noncommutative geometry. Note that the ring $\mathbb{R}[[\bar h]]$ admits an involution that maps an arbitrary power series $a = \sum_i a_i \bar h^i$ in $\mathbb{R}[[\bar h]]$ to $\bar a = \sum_i (-1)^i a_i \bar h^i$. We shall call $\bar a$ the conjugate of $a$. Note that $a \bar a$ contains only even powers of $\bar h$. We can extend this map to a conjugate linear anti-involution on the Moyal algebra $\mathcal{A}$.

Lemma 6.1. Let $\bar{\ } : \mathcal{A} \to \mathcal{A}$ be the map defined for any $f = \sum_i f_i \bar h^i \in \mathcal{A}$, where $f_i$ are real functions on $U$, by $\bar f = \sum_i (-1)^i f_i \bar h^i$. Then for all $f, g \in \mathcal{A}$,
\[
\overline{f * g} = \bar g * \bar f.
\]
We refer to the map as the bar involution of the Moyal algebra. It is an analogue of the well known bar map, sending $q = \exp(\bar h)$ to $q^{-1}$, in the theory of quantum groups, which plays an important role in the study of canonical (crystal) bases. The lemma can be easily proven by inspecting (2.1).

Given any rectangular matrix $A = (a_{rs})$ with entries in $\mathcal{A}$, we let $A^\dagger$ be the matrix obtained from $A$ by first taking its transpose and then sending every matrix element to its conjugate. For example,
\[
\begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \end{pmatrix}^\dagger =
\begin{pmatrix} \bar a_1 & \bar a_2 \\ \bar b_1 & \bar b_2 \\ \bar c_1 & \bar c_2 \end{pmatrix}.
\]
It is clear that if the product $A * B$ of two matrices is defined, then
\[
(A * B)^\dagger = B^\dagger * A^\dagger.
\]
Let $\mathcal{A}^m = {}_l\mathcal{A}^m$ be the $\mathbb{R}[[\bar h]]$-module consisting of row matrices of length $m$ with entries in $\mathcal{A}$. We define the form
\[
(\ ,\ ) : \mathcal{A}^m \times \mathcal{A}^m \to \mathcal{A}, \qquad \zeta \times \xi \mapsto (\zeta, \xi) := \zeta * \xi^\dagger. \tag{6.1}
\]
Lemma 6.2. (1) For all $\zeta, \xi \in \mathcal{M}$ and $a, b \in \mathcal{A}$,
\[
\overline{(\zeta, \xi)} = (\xi, \zeta), \qquad (a * \zeta, b * \xi) = a * (\zeta, \xi) * \bar b.
\]
Thus in this sense the form (6.1) is sesquilinear.
(2) $(\zeta, \zeta) = 0$ if and only if $\zeta = 0$.
(3) For all $\zeta, \xi \in \mathcal{M}$ and $A \in M_m(\mathcal{A})$, we have
\[
(\zeta * A, \xi) = (\zeta, \xi * A^\dagger).
\]
(4) Let the bar-unitary group $U_m(\mathcal{A})$ over $\mathcal{A}$ be the subgroup of $GL_m(\mathcal{A})$ defined by
\[
U_m(\mathcal{A}) = \{ g \in GL_m(\mathcal{A}) \mid g^\dagger = g^{-1} \}.
\]
Then the form (6.1) is invariant under $U_m(\mathcal{A})$ in the sense that for all $g \in U_m(\mathcal{A})$ and $\zeta, \xi \in \mathcal{M}$,
\[
(\zeta * g, \xi * g) = (\zeta, \xi).
\]
It is straightforward to prove the lemma. Note that part (2) of the lemma makes the form (6.1) as nice as a positive definite hermitian form in the commutative case.

We shall call an idempotent $e \in M_m(\mathcal{A})$ self-adjoint (with respect to the sesquilinear form (6.1)) if $e = e^\dagger$. In this case, the corresponding left and right projective modules $\mathcal{M} = {}_l\mathcal{A}^m * e$ and $\widetilde{\mathcal{M}} = e * \mathcal{A}^m_r$ are related by
\[
\widetilde{\mathcal{M}} = \{ \zeta^\dagger \mid \zeta \in \mathcal{M} \}.
\]
Furthermore, the form (6.1) restricts to a sesquilinear form on $\mathcal{M}$, which is invariant under $\mathcal{G} \cap U_m(\mathcal{A})$.

Lemma 6.3. Let $\mathcal{M} = {}_l\mathcal{A}^m * e$ and $\widetilde{\mathcal{M}} = e * \mathcal{A}^m_r$ be the left and right bundles associated with a self-adjoint idempotent $e$. Assume that the left connection $\omega_i$ on $\mathcal{M}$ and the right connection $\tilde\omega_i$ on $\widetilde{\mathcal{M}}$ satisfy the condition
\[
\tilde\omega_i = -\omega_i^\dagger, \qquad \forall i. \tag{6.2}
\]
Then for any $\zeta$ in $\mathcal{M}$,
\[
(\nabla_i \zeta)^\dagger = \tilde\nabla_i(\zeta^\dagger).
\]
Furthermore, the curvatures on the left and right bundles are related by
\[
\widetilde{\mathcal{R}}_{ij} = -\mathcal{R}^\dagger_{ij}.
\]
Proof. Let $\xi = \zeta^\dagger$. We have
\[
(\nabla_i \zeta)^\dagger = (\partial_i \zeta + \zeta * \omega_i)^\dagger = \partial_i \xi + \omega_i^\dagger * \xi = \tilde\nabla_i \xi.
\]
This proves the first part of the lemma. Now
\[
\mathcal{R}^\dagger_{ij} = (\partial_i \omega_j - \partial_j \omega_i - [\omega_i, \omega_j]_*)^\dagger = \partial_i \omega_j^\dagger - \partial_j \omega_i^\dagger + [\omega_i^\dagger, \omega_j^\dagger]_* = -\widetilde{\mathcal{R}}_{ij}.
\]
This proves the second part.
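The conjugation $a \mapsto \bar a$ on $\mathbb{R}[[\bar h]]$ introduced at the start of this section is easy to experiment with on truncated series. The sketch below is illustrative only: a power series is represented by a finite list of integer coefficients (an assumption of this example), and the code checks that the product $a \bar a$ has no odd powers of $\bar h$.

```python
def bar(a):
    """Conjugate a series given by its coefficient list [a_0, a_1, ...]."""
    return [(-1) ** i * c for i, c in enumerate(a)]


def multiply(a, b):
    """Cauchy product of two coefficient lists (exact polynomial product)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


a = [3, -2, 5, 7]                 # 3 - 2h + 5h^2 + 7h^3
product = multiply(a, bar(a))     # coefficients of a * bar(a)
print(product)
print(all(c == 0 for c in product[1::2]))   # odd-order coefficients vanish
```

The odd coefficients cancel in pairs because the terms $a_i a_j \bar h^{i+j}$ with $i + j$ odd appear once with sign $(-1)^i$ and once with sign $(-1)^j$.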
Hereafter, we shall assume that condition (6.2) is satisfied by the left and right connections. Let $\mathcal{M}$ be the left bundle corresponding to a self-adjoint idempotent $e$. We shall say that a connection $\omega_i$ on $\mathcal{M}$ is hermitian with respect to the bar map (or bar-hermitian) if $\omega_i^\dagger = \omega_i$ for all $i$. In this case, we shall also say that the bundle $\mathcal{M}$ is bar-hermitian.

Note that the canonical connections $\omega_i = -\partial_i e$ on $\mathcal{M}$ and $\tilde\omega_i = \partial_i e$ on $\widetilde{\mathcal{M}}$ satisfy $\tilde\omega_i = -\omega_i^\dagger$ and $\omega_i^\dagger = \omega_i$ provided that $e$ is self-adjoint. Therefore, in this case the canonical connection is bar-hermitian. Since the left and right curvatures associated to the canonical connections are equal, it follows from Lemma 6.3 that $\mathcal{R}^\dagger_{ij} = -\mathcal{R}_{ij}$.

We have the following result.

Theorem 6.4. Let $X = (X^1\ X^2\ \cdots\ X^m)$ in ${}_l\mathcal{A}^m$ be an embedded noncommutative surface satisfying the condition $\bar X := (\bar X^1\ \bar X^2\ \cdots\ \bar X^m) = X$. Then $X$ has the following properties:
(1) The metric has the property $\overline{g_{ij}} = g_{ji}$ for all $i, j$.
(2) The idempotent $e = (E_i)^t * g^{ij} * E_j$ is self-adjoint.
(3) Equipped with the canonical connection $\omega_i = -\partial_i e$, the tangent bundle of $X$ is bar-hermitian.
(4) The curvature satisfies $\mathcal{R}^\dagger_{ij} = -\mathcal{R}_{ij}$.

Proof. The given condition on $X$ implies that all the $E_i$ satisfy $E_i^\dagger = (E_i)^t$. Thus
\[
g_{ij} = E_i * (E_j)^t = E_i * (E_j)^\dagger, \qquad e = (E_i)^t * g^{ij} * E_j = (E_i)^\dagger * g^{ij} * E_j.
\]
Hence we have $\overline{g_{ij}} = (E_i * (E_j)^\dagger)^\dagger = E_j * (E_i)^\dagger = g_{ji}$. It then follows that $\overline{g^{ij}} = g^{ji}$. Now the idempotent $e$ satisfies
\[
e^\dagger = ((E_i)^\dagger * g^{ij} * E_j)^\dagger = (E_j)^\dagger * \overline{g^{ij}} * E_i = (E_j)^t * g^{ji} * E_i = e.
\]
Parts (3) and (4) follow from part (2) and the discussion preceding the theorem.
Note that the quantum spacetimes studied in [37] and the example in Sec. 4.2 all satisfy the conditions of Theorem 6.4.

7. Concluding Remarks

We wish to point out that in the classical commutative setting, we can recover (pseudo-) Riemannian geometry from the theory developed here by using the isometric embedding theorems of [12, 19, 23, 32]. The simplification in this case is that there is no need to distinguish the left and the right tangent bundles.

To describe the situation, we let $(N, g)$ be a smooth $n$-dimensional (pseudo-) Riemannian manifold with metric $g$. Denote by $C^\infty(N)$ the set of smooth functions on $N$ endowed with the usual pointwise multiplication. Let $C^\infty(N)^m$ be the space consisting of row vectors of length $m$ with entries in $C^\infty(N)$. By results of [12, 19, 23, 32], there exist positive integers $p, q$ (with $p + q = m$) and a set of smooth functions $X^1, \dots, X^p, X^{p+1}, \dots, X^m$ on $N$ such that
\[
g = \sum_{\alpha,\beta=1}^{m} dX^\alpha\, \eta_{\alpha\beta}\, dX^\beta,
\qquad \text{where} \quad \eta = \mathrm{diag}(\underbrace{-1, \dots, -1}_{p}, \underbrace{1, \dots, 1}_{q}),
\]
with $p = 0$ if $N$ is Riemannian. Let $U$ be a coordinate chart of $N$ with local coordinate $(t^1, \dots, t^n)$. We set
\[
E_i = \left( \frac{\partial X^1}{\partial t^i} \quad \frac{\partial X^2}{\partial t^i} \quad \cdots \quad \frac{\partial X^m}{\partial t^i} \right)
\]
and define $e = \eta (E_i)^t g^{ij} E_j$ on each coordinate chart $U$. Then we have the following result.

Theorem 7.1. (1) The idempotent $e$ is globally defined on $N$.
(2) The space $\Gamma(TN)$ of sections of the tangent bundle of $N$ is given by $C^\infty(N)^m e$.
(3) For all $\zeta, \xi \in \Gamma(TN)$, we have $g(\zeta, \xi) = \zeta \eta (\xi)^t$.
(4) The standard connection (with $\omega_i = -\partial_i e$) on $C^\infty(N)^m e$ is the usual Levi-Civita connection on $TN$ with the Christoffel symbol $\Gamma^k_{ij}$ defined by (4.2) and $\Upsilon_{ijk} = 0$.
(5) The Riemannian curvature tensor is given by (4.6).

Returning to the noncommutative case, we recall that one can quantise any Poisson manifold following the prescription of [28]. Then one obtains a collection of noncommutative associative algebras (analogous to the Moyal algebra), one on each coordinate patch. The algebras relative to different local coordinates are gauge equivalent [28, Theorem 2.3] as discussed in Sec. 5. This way, one obtains a sheaf of noncommutative algebras over the Poisson manifold. The algebraic geometry of such a quantized Poisson manifold has been extensively developed by Kashiwara and Schapira [24, 25]. In principle one may extend the local theory developed in this paper to a "global" differential geometry over the quantized Poisson manifold. Work in this direction is currently under way.

Results in this paper should be directly applicable to the development of a theory of noncommutative general relativity, which is of considerable current interest in theoretical physics. We hope that the theory presented here will provide a consistent mathematical basis for this purpose.

We should also mention that one may use this theory to clarify, conceptually, aspects of the many noncommutative geometries introduced in physics in recent years based on physical intuitions. For example, general features of the noncommutative geometries in [3, 10, 11] have considerable similarity with those of [8]. These works also have the advantage of being explicit and amenable to calculations, and thus have the chance to be physically tested.
Therefore, it will be useful to further develop the mathematical bases of these theories by casting them into the framework of this paper.

Finally, we note that a noncommutative analogue of spin geometry over the Moyal algebra within the C*-algebraic framework in terms of noncompact spectral triples was studied in [20]. Our treatment is complementary to that of [20].

Acknowledgments

We wish to thank Masud Chaichian and Anca Tureanu for discussions at various stages of this work. X. Zhang thanks the School of Mathematics and Statistics, the University of Sydney for the hospitality extended to him during a visit when this work was completed. Partial financial support from the Australian Research Council, National Science Foundation of China (grants 10421001, 10725105 and 10731080), NKBRPC (2006CB805905) and the Chinese Academy of Sciences is gratefully acknowledged.

References

[1] L. Álvarez-Gaumé, F. Meyer and M. A. Vazquez-Mozo, Comments on noncommutative gravity, Nucl. Phys. B 753 (2006) 92–117.
[2] S. Ansoldi, P. Nicolini, A. Smailagic and E. Spallucci, Non-commutative geometry inspired charged black holes, Phys. Lett. B 645 (2007) 261–266.
[3] P. Aschieri, M. Dimitrijevic, F. Meyer and J. Wess, Noncommutative geometry and gravity, Class. Quant. Grav. 23 (2006) 1883–1911.
[4] R. Banerjee, B. R. Majhi and S. K. Modak, Noncommutative Schwarzschild black hole and area law, Class. Quant. Grav. 26 (2009) 085010, 11 pp.
[5] M. Buric, T. Grammatikopoulos, J. Madore and G. Zoupanos, Gravity and the structure of noncommutative algebras, JHEP 0604 (2006) 054.
[6] M. Chaichian, M. Oksanen, A. Tureanu and G. Zet, Gauging the twisted Poincare symmetry as noncommutative theory of gravitation, Phys. Rev. D 79 (2009) 044016, 8 pp.
[7] M. Chaichian, M. R. Setare, A. Tureanu and G. Zet, On black holes and cosmological constant in noncommutative gauge theory of gravity, JHEP 0804 (2008) 064.
[8] M. Chaichian, A. Tureanu, R. B. Zhang and Xiao Zhang, Riemannian geometry of noncommutative surfaces, J. Math. Phys. 49 (2008) 073511, 26 pp.
[9] M. Chaichian, A. Tureanu and G. Zet, Corrections to Schwarzschild solution in noncommutative gauge theory of gravity, Phys. Lett. B 660 (2008) 573–578.
[10] A. H. Chamseddine, Complexified gravity in noncommutative spaces, Comm. Math. Phys. 218 (2001) 283–292.
[11] A. H. Chamseddine, SL(2, C) gravity with a complex vierbein and its noncommutative extension, Phys. Rev. D 69 (2004) 024015, 8 pp.
[12] C. J. S. Clarke, On the global isometric embedding of pseudo-Riemannian manifolds, Proc. Roy. Soc. Lond. A 314 (1970) 417–428.
[13] A. Connes, Noncommutative Geometry (Academic Press, 1994).
[14] M. P. do Carmo, Differential Geometry of Curves and Surfaces (Prentice-Hall, Englewood Cliffs, NJ, 1976).
[15] B. P. Dolan, K. S. Gupta and A. Stern, Noncommutative BTZ black hole and discrete time, Class. Quant. Grav. 24 (2007) 1647–1656.
[16] S. Doplicher, K. Fredenhagen and J. E. Roberts, The quantum structure of spacetime at the Planck scale and quantum fields, Comm. Math. Phys. 172 (1995) 187–220.
[17] V. Drinfeld, Quasi-Hopf algebras, Leningrad Math. J. 1 (1990) 1419–1457.
[18] S. Estrada-Jimenez, H. Garcia-Compean, O. Obregon and C. Ramirez, Twisted covariant noncommutative self-dual gravity, Phys. Rev. D 78 (2008) 124008, 11 pp.
[19] A. Friedman, Local isometric embedding of Riemannian manifolds with indefinite metric, J. Math. Mech. 10 (1961) 625–650.
[20] V. Gayral, J. M. Gracia-Bondía, B. Iochum, T. Schücker and J. C. Várilly, Moyal planes are spectral triples, Comm. Math. Phys. 246 (2004) 569–623.
[21] M. Gerstenhaber, On the deformation of rings and algebras, Ann. Math. 79 (1964) 59–103.
[22] J. M. Gracia-Bondía, J. C. Várilly and H. Figueroa, Elements of Noncommutative Geometry, Birkhäuser Advanced Texts: Basler Lehrbücher (Birkhäuser Boston, Inc., Boston, MA, 2001).
[23] R. E. Greene, Isometric Embedding of Riemannian and Pseudo Riemannian Manifolds, Mem. Amer. Math. Soc., No. 97 (Amer. Math. Soc., 1970).
[24] M. Kashiwara and P. Schapira, Deformation quantization modules I: Finiteness and duality, arXiv:0802.1245 [math.QA].
[25] M. Kashiwara and P. Schapira, Deformation quantization modules II. Hochschild class, arXiv:0809.4309 [math.AG].
[26] H. C. Kim, M. I. Park, C. Rim and J. H. Yee, Smeared BTZ black hole from space noncommutativity, JHEP 10 (2008) 060.
[27] A. Kobakhidze, Noncommutative corrections to classical black holes, Phys. Rev. D 79 (2009) 047701, 3 pp.
[28] M. Kontsevich, Deformation quantization of Poisson manifolds, Lett. Math. Phys. 66 (2003) 157–216.
[29] J. Madore and J. Mourad, Quantum space-time and classical gravity, J. Math. Phys. 39 (1998) 423–442.
[30] S. Majid, Noncommutative Riemannian and spin geometry of the standard q-sphere, Comm. Math. Phys. 256 (2005) 255–285.
[31] F. Muller-Hoissen, Noncommutative geometries and gravity, in Recent Developments in Gravitation and Cosmology, AIP Conf. Proc., Vol. 977 (Amer. Inst. Phys., Melville, NY, 2008), pp. 12–29.
[32] J. Nash, The imbedding problem for Riemannian manifolds, Ann. Math. 63 (1956) 20–63.
[33] P. Nicolini, A. Smailagic and E. Spallucci, Noncommutative geometry inspired Schwarzschild black hole, Phys. Lett. B 632 (2006) 547–551.
[34] H. S. Snyder, Quantized space-time, Phys. Rev. 71 (1947) 38–41.
[35] R. J. Szabo, Symmetry, gravity and noncommutativity, Class. Quant. Grav. 23 (2006) R199–R242.
[36] H. Steinacker, Emergent gravity and noncommutative branes from Yang–Mills matrix models, Nucl. Phys. B 810 (2009) 1–39.
[37] D. Wang, R. B. Zhang and X. Zhang, Quantum deformations of Schwarzschild and Schwarzschild-de Sitter spacetimes, Class. Quant. Grav. 26 (2009) 085014, 14 pp.
[38] C. N. Yang, On quantized space-time, Phys. Rev. 72 (1947) 874.
June 2, 2010 14:55 WSPC/S0129-055X
148-RMP
J070-00401
Reviews in Mathematical Physics Vol. 22, No. 5 (2010) 533–548
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004016
CONSTRUCTION OF CERTAIN FUZZY FLAG MANIFOLDS
MAJDI BEN HALIMA
Faculté des Sciences de Sfax, Département de Mathématiques, Route de Soukra, 3038 Sfax, Tunisia
[email protected]

Received 14 May 2009
Revised 16 February 2010

Approximating the algebra of complex-valued smooth functions on a space-time manifold by a sequence of matrix algebras $A_N \cong \mathrm{Mat}(d_N, \mathbb{C})$, with $d_N \nearrow \infty$, is the basic idea of fuzzy manifolds. In this paper, we explicitly construct fuzzy versions of the homogeneous spaces $SO(2n+1)/U(n)$ and $Sp(n)/(U(1) \times Sp(n-1))$ for $n \ge 2$. This allows us to extend a result of Zhang giving a construction of fuzzy irreducible compact Hermitian symmetric spaces to a class of flag manifolds.

Keywords: Fuzzy flag manifolds; Berezin–Toeplitz quantization; representations of compact Lie groups.

Mathematics Subject Classification: 81T08, 81S10, 22E47
1. Introduction

Let $(M, \omega)$ be a quantizable compact Kähler manifold. Let $(L, h, \nabla)$ be an associated quantum line bundle. Here $L$ is a holomorphic line bundle, $h$ a Hermitian metric and $\nabla$ the unique connection in $L$ which is compatible with the complex structure and the metric, such that the curvature form $R$ of the line bundle and the Kähler form $\omega$ of the manifold are related by

$$R(X, Y) := \nabla_X \nabla_Y - \nabla_Y \nabla_X - \nabla_{[X,Y]} = -i\,\omega(X, Y),$$

where $X, Y$ are smooth vector fields on $M$. Let us fix a positive integer $N$ and set $L^N := L^{\otimes N}$, the $N$th tensor power of $L$. On the space $\Gamma^\infty(M, L^N)$ of smooth sections of $L^N$, we have the scalar product

$$\langle \varphi, \psi \rangle = \int_M h_N(\varphi(x), \psi(x))\, d\Omega(x),$$

where $h_N := h^{\otimes N}$ is the induced metric on $L^N$ and $d\Omega(x)$ is the normalized Liouville measure on $M$. Let $L^2(M, L^N)$ be the $L^2$-completion of the space $\Gamma^\infty(M, L^N)$ and $\Gamma_{hol}(M, L^N)$ be its closed subspace of holomorphic sections. By compactness of $M$, the Hilbert space $H_N := \Gamma_{hol}(M, L^N)$ is finite-dimensional. The algebra
$A_N := \mathrm{End}_{\mathbb{C}}(H_N)$ can evidently be identified with the matrix algebra $\mathrm{Mat}(\dim_{\mathbb{C}} H_N, \mathbb{C})$. Letting $C^\infty(M)$ be the algebra of complex-valued smooth functions on $M$, the Berezin–Toeplitz quantization map $T_N : C^\infty(M) \to A_N$ is defined by associating to a function $f$ the multiplication of holomorphic sections of $L^N$ by $f$, followed by projection onto the space of holomorphic sections. In this way, one obtains a sequence of matrix algebras $(A_N)_{N \ge 1}$ and a sequence of linear maps $(T_N)_{N \ge 1}$. By a result of Bordemann, Meinrenken and Schlichenmaier [7], the sequence $(A_N)_{N \ge 1}$ "approximates", in a suitable sense, the commutative algebra $C^\infty(M)$.

Such an approximation scheme is reminiscent of fuzzy manifolds, where finite-dimensional matrix algebras are used to approximate the algebra of complex-valued smooth functions on a space-time manifold. More precisely, a fuzzy version of a compact manifold $D$ is given by a sequence of linear subspaces $(E_N)_{N \ge 1}$ in the function algebra $C^\infty(D)$ such that $E_N \subset E_{N+1}$ and $\bigcup_{N \ge 1} E_N$ is dense in $C^\infty(D)$, and such that $E_N$ is isomorphic to a matrix algebra $A_N \cong \mathrm{Mat}(d_N, \mathbb{C})$ with $d_N \nearrow \infty$. Furthermore, it is required that this truncation retains all symmetries of the manifold $D$.

The prototypical example of a fuzzy compact manifold is the fuzzy two-sphere $S^2$. Identify $S^2$ with the homogeneous space $SU(2)/S(U(1) \times U(1))$ and recall that

$$L^2(S^2) \cong \bigoplus_{k \in \mathbb{N}_0} V_{2k},$$

where $\mathbb{N}_0 := \mathbb{Z}_{\ge 0}$ and $V_l$ is the space of homogeneous complex polynomials of degree $l$ in two variables. Then, since

$$V_N^* \otimes V_N \cong \bigoplus_{k=0}^{N} V_{2k}$$

by self-duality of the $V_l$ and the usual Clebsch–Gordan rule, the algebra $A_N := \mathrm{End}_{\mathbb{C}}(V_N) \cong \mathrm{Mat}(N + 1, \mathbb{C})$ appears not only as a natural $SU(2)$-equivariant truncation of $L^2(S^2)$ (or $C^\infty(S^2)$) but carries a non-commutative multiplication as well (see, e.g., [24, 25] for details).

A number of fuzzy compact manifolds have been constructed by now. For reviews of some of these constructions, we refer to [5, 6, 11, 16]. As suggested by Madore in [24], fuzzy compact manifolds have found several applications in physics. In quantum field theory, they can provide a finite-mode approximation to commutative continuum field theories, giving an alternative to lattice gauge theories. Compared to a lattice regularization procedure, the fuzzy approach has the advantage of preserving the space-time symmetries. It also has further advantages in situations where fermions are included. Due to these and other potential advantages, the fuzzy approach appears as a promising new tool in quantum field theory (see, e.g., [4, 12, 15] for more details). There are other reasons to investigate fuzzy compact manifolds in theoretical physics. They lead to matrix models, which receive a lot of interest in string theory, especially in the theory of D-branes (see, e.g., [2, 17]).
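The Clebsch–Gordan truncation of the fuzzy two-sphere recalled above can be sanity-checked by a dimension count. The following is a minimal sketch, not part of the original construction: it only uses the fact that $V_l$ has dimension $l + 1$, and the function names are illustrative choices.

```python
def dim_V(l: int) -> int:
    """Dimension of V_l, the homogeneous polynomials of degree l in two variables."""
    return l + 1


def sphere_truncation_dim(N: int) -> int:
    """Total dimension of V_0 + V_2 + ... + V_{2N}, the truncation of L^2(S^2)."""
    return sum(dim_V(2 * k) for k in range(N + 1))


def matrix_algebra_dim(N: int) -> int:
    """Dimension of End(V_N), identified with Mat(N + 1, C)."""
    return dim_V(N) ** 2


if __name__ == "__main__":
    # Both columns should agree for every N if the decomposition is consistent.
    for N in range(6):
        print(N, sphere_truncation_dim(N), matrix_algebra_dim(N))
```

Equality of the two counts is only a necessary condition for the isomorphism, but it makes the growth $d_N = N + 1$ of the matrix size explicit.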
From a more mathematical point of view, fuzzy compact manifolds have an interesting connection with noncommutative geometry. Following an idea of Fröhlich and Gawędzki [13], a fuzzy version of a compact manifold $D$ can be specified by a sequence of triples

$$(\mathrm{Mat}(d_N, \mathbb{C}),\ H_N,\ \Delta_N),$$

where the Hilbert space $H_N = \mathbb{C}^{d_N^2}$ is equipped with the inner product

$$\langle A, B \rangle = \frac{1}{d_N} \mathrm{Tr}(A B^*),$$

and $\Delta_N$ is a matrix analog of the Laplace–Beltrami operator. The "fuzzy Laplacian" $\Delta_N$ comes with a cutoff and encodes mathematical information about the manifold $D$ (see, e.g., [10] for details). This important fact motivates the study of fuzzy compact manifolds within the framework of noncommutative geometry.

The main goal of the present work is to construct explicit fuzzy versions of the homogeneous spaces $SO(2n + 1)/U(n)$ and $Sp(n)/(U(1) \times Sp(n - 1))$ ($n \ge 2$) by means of elementary representation-theoretic methods. This allows us to establish the following result, which describes a class of fuzzy flag manifolds.

Theorem. Let $G$ be a compact, connected, simply connected Lie group with Lie algebra $\mathfrak{g}$, and let $\mathfrak{p}$ be a standard maximal parabolic subalgebra of the complexified Lie algebra $\mathfrak{g}^{\mathbb{C}}$. Let $K \subset G$ be the connected Lie group with Lie algebra $\mathfrak{k} := \mathfrak{p} \cap \mathfrak{g}$. Assume that $(G, K)$ is a Gelfand pair. Then there exists a sequence $(E_N)_{N \ge 1}$ of $G$-invariant subspaces of $L^2(G/K)$ such that $E_N \subset E_{N+1}$ and $\bigcup_{N \ge 1} E_N$ is dense in $C^\infty(G/K)$, and such that $E_N$ is $G$-equivariantly isomorphic to a matrix algebra $A_N \cong \mathrm{Mat}(d_N, \mathbb{C})$ with $d_N \nearrow \infty$.

This theorem extends a result of Zhang (see [31, Proposition 3.1 and Theorem 4.2]), which gives a construction of fuzzy irreducible compact Hermitian symmetric spaces. In the proof of the above theorem, we shall make direct use of the standard Berezin–Toeplitz quantization procedure for compact Kähler manifolds.

In connection with our work, let us mention that Lazaroiu, McNamee and Sämann (see [22]) have recently proved that a particular version of generalized Berezin quantization, which they call "Berezin–Bergmann quantization", provides a general framework for approaching the construction of fuzzy compact Kähler manifolds. Using this framework, the authors have proposed a general definition of fuzzy scalar field theory on compact Kähler manifolds.

The present paper is organized as follows. In Sec. 2, we first fix our notation and terminology. Then we recall some useful facts about a special class of Gelfand pairs. In Sec. 3, we provide explicit formulas concerning the decomposition into irreducibles of some tensor product representations of the groups $SO(2n + 1)$ and $Sp(n)$ for $n \ge 2$ (see Corollary 1 and Proposition 2 below). These formulas play an important role in Sec. 4, which is essentially devoted to the proof of our main result.
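2. Preliminaries

2.1. Basic notions

Let $G$ be a compact connected semisimple Lie group with Lie algebra $\mathfrak{g}$. We denote by $\mathfrak{g}^{\mathbb{C}}$ the complexification of $\mathfrak{g}$ and by $G^{\mathbb{C}}$ the simply connected Lie group with Lie algebra $\mathfrak{g}^{\mathbb{C}}$. Let $T$ be a maximal torus in $G$ with Lie algebra $\mathfrak{h}$. The complexification $\mathfrak{h}^{\mathbb{C}}$ of $\mathfrak{h}$ is a Cartan subalgebra of $\mathfrak{g}^{\mathbb{C}}$. We denote by $\Delta$ the root system of $\mathfrak{g}^{\mathbb{C}}$ with respect to $\mathfrak{h}^{\mathbb{C}}$. We fix a lexicographic ordering on the dual $\mathfrak{h}_{\mathbb{R}}^* := (i\mathfrak{h})^*$ and we write $\Delta^+$ for the corresponding system of positive roots.

The Killing form $B$ on $\mathfrak{g}$ extends complex bilinearly to $\mathfrak{g}^{\mathbb{C}}$. It is easy to see that $B$ is positive definite on $\mathfrak{h}_{\mathbb{R}}$. For $\lambda \in \mathfrak{h}_{\mathbb{R}}^*$, let $H_\lambda$ be the element of $\mathfrak{h}_{\mathbb{R}}$ such that $\lambda(H) = B(H, H_\lambda)$ for all $H \in \mathfrak{h}_{\mathbb{R}}$. Thus we obtain a scalar product on $\mathfrak{h}_{\mathbb{R}}^*$ given by $\langle \lambda, \mu \rangle = B(H_\lambda, H_\mu)$.

Let $\Pi = \{\alpha_1, \ldots, \alpha_l\}$ be the system of simple roots corresponding to $\Delta^+$. The elements $\varpi_{\alpha_j}$, $1 \le j \le l$, defined by

$$\frac{2 \langle \varpi_{\alpha_j}, \alpha_k \rangle}{\langle \alpha_k, \alpha_k \rangle} = \delta_{j,k} \quad \text{for } 1 \le k \le l,$$

are called the fundamental weights attached to $\Pi$. To simplify notation, we set $\varpi_j := \varpi_{\alpha_j}$. The weight lattice is then given by

$$\Lambda = \Big\{ \lambda \in (\mathfrak{h}^{\mathbb{C}})^*;\ \lambda = \sum_{j=1}^{l} n_j \varpi_j,\ n_j \in \mathbb{Z} \Big\}.$$

The set of dominant weights is the cone

$$\Lambda_+ = \Big\{ \lambda \in (\mathfrak{h}^{\mathbb{C}})^*;\ \lambda = \sum_{j=1}^{l} n_j \varpi_j,\ n_j \in \mathbb{N}_0 \Big\} \subset \Lambda.$$

For each $\lambda \in \Lambda_+$, we denote by $\rho_\lambda$ the unique (up to equivalence) irreducible representation of $G$ with highest weight $\lambda$, acting in $V(\lambda)$. Let $\alpha_j \in \Pi$ be a simple root. The irreducible representation $\rho_{\varpi_j}$ is called the fundamental representation attached to $\alpha_j$.

Let now $K$ be a closed connected subgroup of $G$ with Lie algebra $\mathfrak{k}$. A dominant weight $\lambda \in \Lambda_+$ is called $K$-spherical if the subspace of $K$-fixed vectors in $V(\lambda)$ is one-dimensional. The corresponding representation $\rho_\lambda$ is then called $K$-spherical. We write $\Lambda_+^K$ for the subset of $K$-spherical dominant weights. If for every $\lambda \in \Lambda_+$ the subspace of $K$-fixed vectors in $V(\lambda)$ is at most one-dimensional, then the pair $(G, K)$ is called a Gelfand pair. In this case, the harmonic analysis of the square integrable functions on the homogeneous space $M = G/K$, endowed with the Haar measure, is given by

$$L^2(M) \cong \bigoplus_{\lambda \in \Lambda_+^K} V(\lambda).$$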
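The defining relation of the fundamental weights can be checked mechanically in concrete cases. The sketch below is an illustration under stated assumptions: it works in the orthonormal coordinates $e_1, \ldots, e_n$ used in Sec. 3 for the root systems of $so(2n + 1, \mathbb{C})$ (labelled "B" here) and $sp(n, \mathbb{C})$ (labelled "C"), and it replaces the Killing-form scalar product by the standard dot product, which is harmless because the ratio $2\langle \varpi, \alpha \rangle / \langle \alpha, \alpha \rangle$ does not depend on the overall scale. The function names and the "B"/"C" labels are illustrative choices.

```python
from fractions import Fraction


def dot(u, v):
    """Standard scalar product in the coordinates e_1, ..., e_n."""
    return sum(a * b for a, b in zip(u, v))


def simple_roots(kind, n):
    """Simple roots e_k - e_{k+1} (k < n) plus e_n (kind 'B') or 2 e_n (kind 'C')."""
    roots = []
    for k in range(n - 1):
        r = [Fraction(0)] * n
        r[k], r[k + 1] = Fraction(1), Fraction(-1)
        roots.append(r)
    last = [Fraction(0)] * n
    last[n - 1] = Fraction(1) if kind == "B" else Fraction(2)
    roots.append(last)
    return roots


def fundamental_weights(kind, n):
    """Weights e_1 + ... + e_j; for kind 'B' the last one is (e_1 + ... + e_n)/2."""
    weights = [[Fraction(int(i <= j)) for i in range(n)] for j in range(n)]
    if kind == "B":
        weights[n - 1] = [Fraction(1, 2)] * n
    return weights


def pairing_matrix(kind, n):
    """Matrix with entries 2 <w_j, a_k> / <a_k, a_k>."""
    roots = simple_roots(kind, n)
    return [[2 * dot(w, a) / dot(a, a) for a in roots]
            for w in fundamental_weights(kind, n)]


def is_identity(m):
    """True if the square matrix m is the identity matrix."""
    size = len(m)
    return all(m[j][k] == (1 if j == k else 0)
               for j in range(size) for k in range(size))


if __name__ == "__main__":
    print(is_identity(pairing_matrix("B", 3)), is_identity(pairing_matrix("C", 3)))
```

Exact rational arithmetic is used so that the half-integral spin weight of type B is handled without rounding.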
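2.2. A special class of Gelfand pairs

Let us keep the notation of the previous subsection. Let

$$\mathfrak{g}^{\mathbb{C}} = \mathfrak{h}^{\mathbb{C}} \oplus \bigoplus_{\alpha \in \Delta} \mathfrak{g}^{\mathbb{C}}_\alpha = \mathfrak{h}^{\mathbb{C}} \oplus \bigoplus_{\alpha \in \Delta} \mathbb{C} E_\alpha$$

be the standard root decomposition of $\mathfrak{g}^{\mathbb{C}}$. For a given subset $S \subset \Pi$, define the parabolic subalgebra

$$\mathfrak{p}_S := \mathfrak{h}^{\mathbb{C}} \oplus \bigoplus_{\alpha \in \Gamma_S} \mathfrak{g}^{\mathbb{C}}_\alpha,$$

where $\Gamma_S := \Delta^+ \cup \{\alpha \in \Delta;\ \alpha \in \mathrm{span}(S)\}$, and denote by $P_S$ the corresponding parabolic subgroup of $G^{\mathbb{C}}$. Let $\mathfrak{l}_S$ be the Levi factor of $\mathfrak{p}_S$,

$$\mathfrak{l}_S = \mathfrak{h}^{\mathbb{C}} \oplus \bigoplus_{\alpha \in \Gamma_S \cap (-\Gamma_S)} \mathfrak{g}^{\mathbb{C}}_\alpha,$$

and set $\mathfrak{k}_S := \mathfrak{p}_S \cap \mathfrak{g} = \mathfrak{l}_S \cap \mathfrak{g}$. Then $\mathfrak{k}_S$ is a compact real form of $\mathfrak{l}_S$. Setting $K_S := G \cap P_S$, we see that $K_S$ is a Lie subgroup of $G$ with Lie algebra $\mathfrak{k}_S$.

Assume furthermore that $(G, K)$ is a Gelfand pair and that there exists a subset $S \subset \Pi$ such that $\# S^c := \#(\Pi \setminus S) = 1$ and $\mathfrak{k} = \mathfrak{k}_S$. Note that the corresponding $P_S \subset G^{\mathbb{C}}$ is maximal parabolic and that the Dynkin diagram of $K$ can be obtained from the Dynkin diagram of $G$ by deleting one node. The simple root $\beta \in \Pi$ with $\Pi \setminus S = \{\beta\}$ is called the Gelfand node associated to the pair $(G, K)$. The following important proposition characterizes a special class of compact Gelfand pairs.

Proposition 1 ([30, Proposition 4.7]). Let $G$ be a compact, connected, simply connected Lie group with Lie algebra $\mathfrak{g}$, and let $\mathfrak{p}$ be a standard maximal parabolic subalgebra of the complexified Lie algebra $\mathfrak{g}^{\mathbb{C}}$. Let $K \subset G$ be the connected Lie group with Lie algebra $\mathfrak{k} := \mathfrak{p} \cap \mathfrak{g}$. Then $(G, K)$ is a Gelfand pair if and only if one of the following three conditions is satisfied:
(i) $(G, K)$ is an irreducible compact Hermitian symmetric pair;
(ii) $(G, K) \simeq (SO(2n + 1), U(n))$ ($n \ge 2$);
(iii) $(G, K) \simeq (Sp(n), U(1) \times Sp(n - 1))$ ($n \ge 2$).

Let $(G, K)$ be a pair from the list (i)–(iii) above, and let $(\mathfrak{g}, \mathfrak{k})$ be the associated pair of Lie algebras. Then $\mathfrak{k} = \mathfrak{k}_S$ for some subset $S \subset \Pi$ with $\# S^c = 1$. Let $\beta \in \Pi$ be the associated Gelfand node with corresponding fundamental weight $\varpi := \varpi_\beta$. One can extend $\varpi$ complex linearly to $\mathfrak{g}^{\mathbb{C}}$ by setting $\varpi(E_\alpha) = 0$ for all $\alpha \in \Delta$. The following fact is worth mentioning. Denote by $L$ the isotropy group of $\varpi$ under the coadjoint action of $G$, i.e.

$$L = \{g \in G;\ \mathrm{Ad}^*(g)\varpi = \varpi\}.$$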
Using the Killing form of $\mathfrak{g}$, we identify $\varpi$ with an element $Z \in \mathfrak{h}_{\mathbb{R}} = i\mathfrak{h}$. Thus we get $L = \{g \in G;\ \mathrm{Ad}(g)Z = Z\}$, and then

$$\mathfrak{l} := \mathrm{Lie}(L) = \{X \in \mathfrak{g};\ [X, Z] = 0\}.$$

In the standard root decomposition of $\mathfrak{g}^{\mathbb{C}}$, $E_\alpha$ commutes with $Z$ if and only if the root $\alpha$ is orthogonal to $\varpi$. Observe now that $\alpha$ is orthogonal to $\varpi$ if and only if $\alpha$ belongs to the set $\Gamma_S \cap (-\Gamma_S)$. This means that $\mathfrak{l}^{\mathbb{C}}$ is spanned by $\mathfrak{h}^{\mathbb{C}}$ and the $E_\alpha$'s with $\alpha \in \Gamma_S \cap (-\Gamma_S)$, and hence we get $\mathfrak{l}^{\mathbb{C}} = \mathfrak{k}^{\mathbb{C}}$. We conclude that $K = L$, which proves that the flag manifold $M = G/K$ can be identified with the $G$-orbit through $\varpi$ under the coadjoint representation.

3. Decomposition of Tensor Product Representations of the Groups SO(2n + 1) and Sp(n)

The goal of this section is to describe the decomposition into irreducibles of some particular tensor product representations of the special orthogonal Lie group $SO(2n + 1)$ and the symplectic Lie group $Sp(n)$ for $n \ge 2$. We provide here explicit formulas that will be used in the proof of our main result in the next section. For a detailed exposition of the representation theory of $SO(2n + 1)$ and $Sp(n)$, we refer to [19].

3.1. The case of the group SO(2n + 1)

The material of this subsection is not new but is worth summarizing in preparation for our main result. Let $E_{i,j} \in \mathrm{Mat}(2n + 1, \mathbb{C})$ be the elementary matrix having 1 at the $(i, j)$-entry and 0 elsewhere. We take the standard Cartan subalgebra $\mathfrak{h}$ of the Lie algebra $so(2n + 1)$ spanned by the matrices $(E_{2j-1,2j} - E_{2j,2j-1})$ for $1 \le j \le n$. Let $e_k$ be the linear form on the complexified Lie algebra $\mathfrak{h}^{\mathbb{C}}$ given by

$$e_k \begin{pmatrix} 0 & ih_1 & & & & \\ -ih_1 & 0 & & & & \\ & & \ddots & & & \\ & & & 0 & ih_n & \\ & & & -ih_n & 0 & \\ & & & & & 0 \end{pmatrix} = h_k$$

for $1 \le k \le n$. In the usual ordering on $\mathfrak{h}_{\mathbb{R}}^* := (i\mathfrak{h})^*$ and for $n \ge 2$, we have the following system of positive roots of the pair $(so(2n + 1, \mathbb{C}), \mathfrak{h}^{\mathbb{C}})$:

$$\Delta^+ = \Delta^+(so(2n + 1, \mathbb{C}), \mathfrak{h}^{\mathbb{C}}) = \{e_k \pm e_l,\ 1 \le k < l \le n\} \cup \{e_k,\ 1 \le k \le n\}.$$
The associated system of simple roots is then

$$\Pi = \{\alpha_k = e_k - e_{k+1},\ 1 \le k < n\} \cup \{\alpha_n = e_n\}.$$

Let us recall that:
(a) the fundamental weights are $\varpi_j = \sum_{k=1}^{j} e_k$ for $1 \le j \le n - 1$ and $\varpi_n = \frac{1}{2} \sum_{k=1}^{n} e_k$;
(b) the weight lattice is $\Lambda = \{\sum_{k=1}^{n} \lambda_k e_k;\ \lambda_k \in \mathbb{Z}\ \forall k \text{ or } \lambda_k \in \mathbb{Z} + \frac{1}{2}\ \forall k\}$;
(c) a weight $\lambda = \sum_{k=1}^{n} \lambda_k e_k \in \Lambda$ is dominant if and only if $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0$;
(d) the fundamental representation attached to the simple root $\alpha_n = e_n$ is the so-called spin representation.

Given a dominant weight $\lambda = \sum_{k=1}^{n} \lambda_k e_k$ (or simply $\lambda = (\lambda_1, \ldots, \lambda_n)$), we denote, as before, by $V(\lambda)$ the associated $SO(2n + 1)$-irreducible module with highest weight $\lambda$. Let now $\lambda = s(e_1 + \cdots + e_n)$ and $\mu = t(e_1 + \cdots + e_n)$ be two "constant" dominant weights of $SO(2n + 1)$ with $s, t \in \frac{1}{2}\mathbb{N}_0$ and $s \le t$. In [26, Theorem 2.5], Okada has proven the following multiplicity free decomposition formula:

$$V(\lambda) \otimes V(\mu) \cong \bigoplus_{\nu \in P_{s,t}} V(\nu),$$

where

$$P_{s,t} = \{\nu = (\nu_1 + t - s, \ldots, \nu_n + t - s);\ (\nu_1, \ldots, \nu_n) \in \mathbb{N}_0^n,\ 2s \ge \nu_1 \ge \cdots \ge \nu_n \ge 0\}.$$

Since all representations of the group $SO(2n + 1)$ are self-dual, we can deduce

Corollary 1. Let $\lambda = s(e_1 + \cdots + e_n)$ with $s \in \frac{1}{2}\mathbb{N}_0$. As $SO(2n + 1)$-modules, we have

$$V(\lambda)^* \otimes V(\lambda) \cong \bigoplus_{\nu \in P_s} V(\nu),$$

where $P_s = \{\nu = (\nu_1, \ldots, \nu_n) \in \mathbb{N}_0^n;\ 2s \ge \nu_1 \ge \cdots \ge \nu_n \ge 0\}$.

3.2. The case of the group Sp(n)

We begin this subsection by recalling some well-known facts about the representations of the compact Lie group $Sp(n)$. Let

$$\mathfrak{h} = \left\{ H = \begin{pmatrix} ih_1 & & & & & \\ & \ddots & & & & \\ & & ih_n & & & \\ & & & -ih_1 & & \\ & & & & \ddots & \\ & & & & & -ih_n \end{pmatrix};\ h_j \in \mathbb{R}\ \forall\, 1 \le j \le n \right\}$$
be the standard Cartan subalgebra of the Lie algebra $sp(n)$. Given an element $H \in \mathfrak{h}$ as above, we can simply write $H = \mathrm{diag}(ih_1, \ldots, ih_n, -ih_1, \ldots, -ih_n)$. Let $e_k$ be the linear form on $\mathfrak{h}^{\mathbb{C}}$ defined by

$$e_k(\mathrm{diag}(h_1, \ldots, h_n, -h_1, \ldots, -h_n)) = h_k,$$

where $1 \le k \le n$. For $n \ge 2$, we fix the following system of positive roots of the pair $(sp(n, \mathbb{C}), \mathfrak{h}^{\mathbb{C}})$:

$$\Delta^+ = \Delta^+(sp(n, \mathbb{C}), \mathfrak{h}^{\mathbb{C}}) = \{e_k \pm e_l,\ 1 \le k < l \le n\} \cup \{2e_k,\ 1 \le k \le n\}.$$

The associated system of simple roots is

$$\Pi = \{\alpha_k = e_k - e_{k+1},\ 1 \le k < n\} \cup \{\alpha_n = 2e_n\}.$$

Recall that:
(a) the fundamental weights are $\varpi_j = \sum_{k=1}^{j} e_k$ for $1 \le j \le n$;
(b) the weight lattice is $\Lambda = \{\sum_{k=1}^{n} \lambda_k e_k;\ \lambda_k \in \mathbb{Z}\ \forall k\}$;
(c) a weight $\lambda = \sum_{k=1}^{n} \lambda_k e_k \in \Lambda$ is dominant if and only if $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0$.

Next we state a rule of Littelmann which describes the decomposition into irreducibles of the tensor product of two general $Sp(n)$-irreducible modules. To this end, we first briefly recall some basic terminology. As usual, a partition is a non-increasing sequence $\lambda = (\lambda_1, \lambda_2, \ldots)$ of non-negative integers. The depth $d(\lambda)$ of a partition $\lambda$ is the number of non-zero terms of $\lambda$. A partition $\lambda$ with depth $\le n$ is regarded as an element of $\mathbb{N}_0^n$. Let $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_d)$ be a partition of depth $d$. The Young diagram of $\lambda$ is a collection of left-justified rows of boxes with $\lambda_i$ boxes in the $i$th row for $1 \le i \le d$. A filling of the Young diagram of $\lambda$ with elements of the set $\{1, 2, \ldots, n\}$ which is nondecreasing in rows and strictly increasing in columns is called an $n$-semistandard (Young) tableau (or tableau for short) of shape $\lambda$. Given a tableau $T$, the filling of the box $(i, j)$ is denoted by $T_{i,j}$. Let again $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_d)$ be a partition of depth $d \le n$. A tableau $T$ of shape $\lambda$ is called a $(2n)$-symplectic tableau if its entries are elements of $\{1, \ldots, 2n\}$ and if it obeys the additional constraint

$$T_{i,j} \ge 2i - 1.$$

These tableaux were introduced by King and El-Sharkaway [18]. Consider a $(2n)$-symplectic tableau $T$. The vector

$$\mathrm{con}(T) := (\#\{1\text{'s in } T\} - \#\{2\text{'s in } T\}, \ldots, \#\{(2n - 1)\text{'s in } T\} - \#\{(2n)\text{'s in } T\})$$

is called the content of $T$. We denote by $T(l)$ the tableau that consists of the last $l$ columns of $T$. Given a weight $\nu = \sum_{j=1}^{n} \nu_j e_j \in \Lambda$, we shall identify $\nu$ with the
element $(\nu_1, \ldots, \nu_n) \in \mathbb{Z}^n$. Now we arrive at

Theorem 1 (Littelmann [23, Theorem (a), p. 346]). Let $\Lambda_+$ be the set of dominant weights of $Sp(n)$ with $n \ge 1$. For $\lambda, \mu \in \Lambda_+$, we have

$$V(\lambda) \otimes V(\mu) \cong \bigoplus_{T} V(\lambda + \mathrm{con}(T)),$$

where the sum is over all $(2n)$-symplectic tableaux $T$ of shape $\mu$ such that the weight $\lambda + \mathrm{con}(T(l))$ is dominant for all $l$.

Remark. In the formulation of Littelmann's rule stated above, we basically reproduced Krattenthaler's description (see [21, Appendix A6]) with a slight modification in the description of $(2n)$-symplectic tableaux, where we followed [14]. This formulation is more elementary and is convenient for our calculation.

Applying the above theorem in the case where $\lambda = \mu = (N, 0, \ldots, 0)$, we obtain

Proposition 2. For $N \in \mathbb{N}_0$ and $n \ge 2$, we have

$$V((N, 0, \ldots, 0)) \otimes V((N, 0, \ldots, 0)) \cong \bigoplus_{\substack{k, l \in \mathbb{N}_0 \\ 0 \le k + l \le N}} V((2k + l, l, 0, \ldots, 0)).$$
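Before turning to the proof, the statement of Proposition 2 can be cross-checked by brute force for small $N$ and $n$. The sketch below is only an illustration, not the method of the paper: it enumerates one-row tableaux of shape $(N, 0, \ldots, 0)$ with entries in $\{1, \ldots, 2n\}$ (for a single row the constraint $T_{1,j} \ge 1$ and column strictness are automatic), applies the dominance test of Theorem 1 to every tail $T(l)$, and compares the resulting highest weights with the right-hand side of Proposition 2. All function names are illustrative.

```python
from itertools import combinations_with_replacement


def content(entries, n):
    """con(T): (#1's - #2's, #3's - #4's, ..., #(2n-1)'s - #(2n)'s)."""
    return tuple(entries.count(2 * i + 1) - entries.count(2 * i + 2) for i in range(n))


def is_dominant(weight):
    """Dominance for Sp(n): weight_1 >= weight_2 >= ... >= weight_n >= 0."""
    return all(a >= b for a, b in zip(weight, weight[1:])) and weight[-1] >= 0


def add(u, v):
    """Componentwise sum of two weights."""
    return tuple(a + b for a, b in zip(u, v))


def littelmann_one_row(N, n):
    """Highest weights (with multiplicity) produced by Theorem 1 for lambda = mu = (N, 0, ..., 0)."""
    lam = (N,) + (0,) * (n - 1)
    weights = []
    # Weakly increasing rows of length N with entries in {1, ..., 2n}.
    for T in combinations_with_replacement(range(1, 2 * n + 1), N):
        # T[N - l:] is T(l), the last l columns of the one-row tableau.
        if all(is_dominant(add(lam, content(T[N - l:], n))) for l in range(1, N + 1)):
            weights.append(add(lam, content(T, n)))
    return sorted(weights)


def proposition_2(N, n):
    """Right-hand side of Proposition 2: weights (2k + l, l, 0, ..., 0) with k + l <= N."""
    return sorted((2 * k + l, l) + (0,) * (n - 2)
                  for k in range(N + 1) for l in range(N + 1 - k))


if __name__ == "__main__":
    for n in (2, 3):
        for N in range(4):
            print(n, N, littelmann_one_row(N, n) == proposition_2(N, n))
```

Comparing sorted lists rather than sets also checks that every summand occurs with multiplicity one.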
Proof. If $N = 0$, then the proposition is obvious. Let us consider a $(2n)$-symplectic tableau $T$ of shape $\lambda = (N, 0, \ldots, 0)$ with $N \in \mathbb{N}$. For $1 \le i \le 2n$, we set $k_i := \#\{i\text{'s in } T\}$. By definition of the $k_i$'s, we have $k_1 + k_2 + \cdots + k_{2n} = N$. Note that the content of $T$ is given by

$$\mathrm{con}(T) = (k_1 - k_2, k_3 - k_4, \ldots, k_{2n-1} - k_{2n}).$$

Assume that $T$ satisfies the following property: $\lambda + \mathrm{con}(T(l)) \in \Lambda_+$ for all $l$. For $l = k_{2n}$, the content of the tableau $T(l)$ is

$$\mathrm{con}(T(l)) = (\underbrace{0, \ldots, 0}_{n-1}, -k_{2n})$$

and so

$$\lambda + \mathrm{con}(T(l)) = (N, \underbrace{0, \ldots, 0}_{n-2}, -k_{2n}).$$

Since $\lambda + \mathrm{con}(T(l)) \in \Lambda_+$, it follows that $k_{2n} = 0$. Next, we are going to prove that $k_i = 0$ for all $4 \le i \le 2n$. The case $n = 2$ is already proven. Assume $n \ge 3$, fix $4 \le i \le 2n$ and suppose that $k_j = 0$ for all $i + 1 \le j \le 2n$. We will prove that $k_i = 0$. For this we consider the following cases:

Case 1. If $i$ is even, then we have for $l = k_i$

$$\mathrm{con}(T(l)) = (\underbrace{0, \ldots, 0}_{\frac{i-2}{2}}, -k_i, 0, \ldots, 0).$$

The fact that $\lambda + \mathrm{con}(T(l)) \in \Lambda_+$ clearly forces $k_i = 0$.
Case 2. If $i$ is odd, then we have for $l = k_i$

$$\mathrm{con}(T(l)) = (\underbrace{0, \ldots, 0}_{\frac{i-1}{2}}, k_i, 0, \ldots, 0).$$

Since $\lambda + \mathrm{con}(T(l)) \in \Lambda_+$, we easily get $k_i = 0$.

We conclude that $k_i = 0$ for the fixed integer $i$. An induction on $i$ allows us to derive the equality $k_i = 0$ for all $4 \le i \le 2n$ with $n \ge 3$. Hence the claim is proven for $n \ge 2$. Consequently, we can write

$$\mathrm{con}(T) = (k_1 - k_2, k_3, 0, \ldots, 0),$$

where, of course, $k_1 + k_2 + k_3 = N$. Conversely, if $T$ is a $(2n)$-symplectic tableau of shape $\lambda$ such that $\mathrm{con}(T) = (k_1 - k_2, k_3, 0, \ldots, 0)$ with the $k_i$'s being defined as above, then one easily verifies that $\lambda + \mathrm{con}(T(l))$ is a dominant weight for all $l$. We deduce that

$$V((N, 0, \ldots, 0)) \otimes V((N, 0, \ldots, 0)) \cong \bigoplus_{\substack{k_1, k_2, k_3 \in \mathbb{N}_0 \\ k_1 + k_2 + k_3 = N}} V((N + k_1 - k_2, k_3, 0, \ldots, 0)) \cong \bigoplus_{\substack{k, l \in \mathbb{N}_0 \\ 0 \le k + l \le N}} V((2k + l, l, 0, \ldots, 0)).$$

This completes the proof of the proposition.

4. Fuzzy Versions of Certain Flag Manifolds

We shall freely use the notation introduced earlier. Let $(G, K)$ be a pair from the list (i)–(iii) in Proposition 1. The aim of this section is to construct a fuzzy version of the flag manifold $M = G/K$. As we mentioned before, our construction is based on the Berezin–Toeplitz quantization of such a manifold.

4.1. Quantum line bundle over M

Fix again a maximal torus $T$ in $G$ and let $\Delta$, $\Delta^+$ and $\Pi$ be as in Sec. 2. Let $\beta$ be the Gelfand node associated with $(G, K)$. If $\mathfrak{k}$ is the Lie algebra of $K$, then $\mathfrak{k} = \mathfrak{k}_S$ with $S = \Pi \setminus \{\beta\}$. We denote by $\Delta_1^+$ the set of positive roots corresponding to $S$. Then

$$\mathfrak{k}^{\mathbb{C}} = \mathfrak{h}^{\mathbb{C}} \oplus \bigoplus_{\alpha \in \Delta_1^+} (\mathfrak{g}^{\mathbb{C}}_\alpha \oplus \mathfrak{g}^{\mathbb{C}}_{-\alpha}).$$
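Setting

$$\mathfrak{n}^+ = \bigoplus_{\alpha \in \Delta^+ \setminus \Delta_1^+} \mathfrak{g}^{\mathbb{C}}_\alpha \quad \text{and} \quad \mathfrak{n}^- = \bigoplus_{\alpha \in \Delta^+ \setminus \Delta_1^+} \mathfrak{g}^{\mathbb{C}}_{-\alpha},$$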
we get

$$\mathfrak{g}^{\mathbb{C}} = \mathfrak{h}^{\mathbb{C}} \oplus \bigoplus_{\alpha \in \Delta^+} (\mathfrak{g}^{\mathbb{C}}_\alpha \oplus \mathfrak{g}^{\mathbb{C}}_{-\alpha}) = \mathfrak{k}^{\mathbb{C}} \oplus \mathfrak{n}^+ \oplus \mathfrak{n}^-.$$

Define $N^+$ (respectively $N^-$) to be the connected subgroup of $G^{\mathbb{C}}$ with Lie algebra $\mathfrak{n}^+$ (respectively $\mathfrak{n}^-$). Note that

$$G/K \simeq G^{\mathbb{C}}/K^{\mathbb{C}} N^+ \simeq G^{\mathbb{C}}/K^{\mathbb{C}} N^-.$$

This shows that $M = G/K$ can be regarded as a complex manifold. Let $\psi_\varpi \in V(\varpi)$ be a normalized highest weight vector, with weight $\varpi = \varpi_\beta$. Denote by $\chi_\varpi$ the unique holomorphic extension to $K^{\mathbb{C}} N^+$ of the character $e^{-\varpi}$. With this notation, we have for all $k \in K$

$$\rho_\varpi(k) \psi_\varpi = \chi_\varpi(k)^{-1} \psi_\varpi.$$

The line bundle $L = G \times_{e^{-\varpi}} \mathbb{C}$ over $M = G/K = G^{\mathbb{C}}/K^{\mathbb{C}} N^+$ is identified with $G^{\mathbb{C}} \times_{\chi_\varpi} \mathbb{C}$, and is thus seen to be a holomorphic line bundle. Note that every holomorphic line bundle over $M$ is of the form $L^m$ for some $m \in \mathbb{Z}$.

Let $H_N$ be the space of holomorphic sections of the line bundle $L^N := L^{\otimes N}$, $N \in \mathbb{N}$. By the Borel–Weil theorem (see, e.g., [1]), $H_N$ is an irreducible $G$-module with highest weight $N\varpi$. It follows that $H_N$ is isomorphic, as a $G$-module, to the space $V(N\varpi)$. The algebra $A_N := \mathrm{End}_{\mathbb{C}}(H_N)$ admits a natural $G$-action and can be identified with the matrix algebra $\mathrm{Mat}(d_N, \mathbb{C})$, where $d_N := \dim_{\mathbb{C}} V(N\varpi)$.

Let $h$ be the Hermitian structure of the bundle $L \to M$ defined by

$$h([g, z], [g, z']) = \bar{z} z' \quad \text{for all } g \in G.$$

We know that there exists a unique connection $\nabla$ on $L$ leaving $h$ invariant and satisfying $\nabla_X \psi = 0$ for each vector field $X$ of type $(0, 1)$ and for each local holomorphic section $\psi$. The curvature of $(L, \nabla)$ is the complex 2-form on $M$ given by

$$R(X, Y) := \nabla_X \nabla_Y - \nabla_Y \nabla_X - \nabla_{[X,Y]} = -i\,\omega(X, Y),$$

where $X, Y$ are smooth vector fields and $\omega$ is the $G$-invariant Kähler form on $M$ (see, e.g., [3]). This shows that $(L, h, \nabla)$ is a quantum line bundle over $M$.

4.2. Berezin–Toeplitz quantization of M

Fix $N \in \mathbb{N}$. On the space $\Gamma^\infty(M, L^N)$ of smooth sections of $L^N$, we have the scalar product

$$\langle \varphi, \psi \rangle = \int_M h_N(\varphi(x), \psi(x))\, d\Omega(x),$$

where $d\Omega(x)$ is the normalized $G$-invariant measure associated to the form $\omega$ on $M$. Let $L^2(M, L^N)$ be the $L^2$-completion of the space $\Gamma^\infty(M, L^N)$. We denote by $\Pi_N$
the orthogonal projection onto the subspace $H_N \subset L^2(M, L^N)$. Given a function $f$ in $C^\infty(M)$, one can define an operator on the space $H_N$ by

$$T_N(f) := \Pi_N \circ M_f,$$

where $M_f$ is the multiplication operator associated to $f$. The corresponding map $T_N : C^\infty(M) \to \mathrm{End}_{\mathbb{C}}(H_N) = A_N$ is called the Berezin–Toeplitz quantization map.

Let $P_N$ be the orthogonal projector onto the highest weight subspace of $V(N\varpi)$. One easily verifies that $\rho_{N\varpi}(g) P_N \rho_{N\varpi}(g)^{-1}$ is the projector onto the "coherent state" associated to $x = gK \in M$ (see [9]). Thus the coherent state map used in the Berezin–Toeplitz quantization of Kähler manifolds (see [8]) is here equal to

$$P_N : M = G/K \to \mathrm{End}_{\mathbb{C}}(H_N), \qquad gK \mapsto \rho_{N\varpi}(g) P_N \rho_{N\varpi}(g)^{-1},$$

and we get (see [29, Proposition 3.1]) the following expression for the Berezin–Toeplitz quantization map:

$$T_N(f) = (\dim_{\mathbb{C}} H_N) \int_M f(x) P_N(x)\, d\Omega(x).$$

From this expression, it is obvious that $T_N$ is $G$-equivariant. Using the fact that the map $T_N : C^\infty(M) \to A_N$ is surjective (see [7, Proposition 4.2]), one can deduce that the algebra $A_N$ is $G$-equivariantly isomorphic to a submodule of $L^2(M)$. As shown by Bordemann, Meinrenken and Schlichenmaier (see [7]), the maps $T_N$ have the correct semi-classical behavior for $N \to \infty$. In particular, the following results hold.

Theorem 2. For $f, h \in C^\infty(M)$, we have
(1) $\|T_N(f)\|_{op} \to \|f\|_\infty$ as $N \to \infty$;
(2) $\|T_N(fh) - T_N(f) T_N(h)\|_{op} \to 0$ as $N \to \infty$.
Here $\|\cdot\|_{op}$ is the operator norm on $A_N$ and $\|\cdot\|_\infty$ is the sup-norm on $C^\infty(M)$.

Remark. Let $l$ be a continuous length function on $G$ satisfying the condition $l(xyx^{-1}) = l(y)$ for all $x, y \in G$. Let $\delta$ be the action of $G$ on $A_N$ by conjugation by $\rho_{N\varpi}$. Then $l$ and $\delta$ determine a Lipschitz seminorm $L_N$ on $A_N$ by

$$L_N(A) = \sup \left\{ \frac{\|\delta_x(A) - A\|_{op}}{l(x)};\ x \ne e \right\},$$

where $e$ is the identity element of $G$. Let $C(G/K)$ be the $C^*$-algebra of continuous complex-valued functions on $G/K$. We denote by $\xi$ the action of $G$ on $G/K$ and on $C(G/K)$ by left translation. We can define a Lipschitz seminorm on $C(G/K)$ by

$$L_\infty(f) = \sup \left\{ \frac{\|\xi_x(f) - f\|_\infty}{l(x)};\ x \ne e \right\}.$$

Let us underline that the pairs $(A_N, L_N)$ and $(C(G/K), L_\infty)$ are "compact quantum metric spaces" in the sense defined by Rieffel in [27].
Motivated by the notion of Gromov–Hausdorff convergence of classical compact metric spaces, Rieffel gave in [27] a definition of a "quantum Gromov–Hausdorff distance" between two compact quantum metric spaces. Furthermore, he proved in [28] that the sequence $\{(A_N, L_N)\}_{N \ge 1}$ converges to $(C(G/K), L_\infty)$ for this distance as $N \to \infty$.

4.3. Fuzzy version of M

Now we are in a position to prove our main result.

Theorem 3. Let $(G, K)$, $M$ and $A_N$ be as above. Then there exists a sequence $(E_N)_{N \ge 1}$ of $G$-invariant subspaces of $L^2(M)$ such that $E_N \subset E_{N+1}$ and $\bigcup_{N \ge 1} E_N$ is dense in $C^\infty(M)$, and such that $E_N$ is $G$-equivariantly isomorphic to the matrix algebra $A_N$.

Proof. If $(G, K)$ is an irreducible compact Hermitian symmetric pair, then the result follows immediately by comparing Proposition 3.1 and Theorem 4.2 in the paper of Zhang mentioned in the introduction ([31]). Thus it suffices to prove the theorem in the following two cases:

Case 1. Assume that $(G, K) \simeq (SO(2n + 1), U(n))$ with $n \ge 2$. Let the notation for roots and weights be as in Sec. 3.1. The Gelfand node associated to the pair $(SO(2n + 1), U(n))$ is $\beta = \alpha_n = e_n$ and the fundamental weight attached to this simple root is $\varpi = \frac{1}{2}(e_1 + \cdots + e_n)$. Consider the holomorphic line bundle $L = SO(2n + 1) \times_{e^{-\varpi}} \mathbb{C}$ over the homogeneous space $SO(2n + 1)/U(n)$. As $SO(2n + 1)$-modules, $H_N = \Gamma_{hol}(L^N) \cong V(N\varpi)$ and $A_N \cong V(N\varpi)^* \otimes V(N\varpi)$ for $N \in \mathbb{N}$. Using the result of Corollary 1, one immediately has

$$A_N \cong \bigoplus_{\substack{\lambda = (\lambda_1, \ldots, \lambda_n) \in \mathbb{N}_0^n \\ N \ge \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0}} V(\lambda).$$

On the other hand, an important result of Krämer (see [20, Table 1]) says that the $SO(2n + 1)$-module $L^2(SO(2n + 1)/U(n))$ decomposes into irreducibles as

$$L^2(SO(2n + 1)/U(n)) \cong \bigoplus_{\substack{\lambda = (\lambda_1, \ldots, \lambda_n) \in \mathbb{N}_0^n \\ \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0}} V(\lambda).$$

Denote by $E_N$ the unique submodule of $L^2(SO(2n + 1)/U(n))$ such that

$$E_N \cong \bigoplus_{\substack{\lambda = (\lambda_1, \ldots, \lambda_n) \in \mathbb{N}_0^n \\ N \ge \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0}} V(\lambda)$$

as $SO(2n + 1)$-module. The sequence $(E_N)_{N \ge 1}$ satisfies the assertions of the theorem.
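The decomposition of $A_N$ used in Case 1 can be cross-checked numerically by comparing dimensions. The sketch below is an illustration under stated assumptions, not part of the proof: it evaluates the Weyl dimension formula for type $B_n$ in the coordinates $e_1, \ldots, e_n$ of Sec. 3.1, with $\rho = (n - \frac{1}{2}, n - \frac{3}{2}, \ldots, \frac{1}{2})$ the half-sum of positive roots, and tests whether $d_N^2 = (\dim V(N\varpi))^2$ equals the sum of $\dim V(\lambda)$ over $N \ge \lambda_1 \ge \cdots \ge \lambda_n \ge 0$. The function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def weyl_dim_so_odd(lam):
    """Weyl dimension formula for SO(2n+1), highest weight lam in e_k coordinates."""
    n = len(lam)
    # Half-sum of positive roots: (n - 1/2, n - 3/2, ..., 1/2).
    rho = [Fraction(2 * (n - i) - 1, 2) for i in range(n)]
    shifted = [Fraction(x) + r for x, r in zip(lam, rho)]
    dim = Fraction(1)
    for i in range(n):
        # Contribution of the short root e_i.
        dim *= shifted[i] / rho[i]
        for j in range(i + 1, n):
            # Combined contribution of the roots e_i - e_j and e_i + e_j.
            dim *= (shifted[i] ** 2 - shifted[j] ** 2) / (rho[i] ** 2 - rho[j] ** 2)
    assert dim.denominator == 1, "dimension should be an integer"
    return int(dim)


def d_N(N, n):
    """dim V(N * varpi) with varpi = (1/2, ..., 1/2), the spin weight."""
    return weyl_dim_so_odd([Fraction(N, 2)] * n)


def so_truncation_dim(N, n):
    """Sum of dim V(lam) over N >= lam_1 >= ... >= lam_n >= 0."""
    return sum(weyl_dim_so_odd(lam)
               for lam in combinations_with_replacement(range(N, -1, -1), n))


if __name__ == "__main__":
    for n in (2, 3):
        for N in range(1, 4):
            print(n, N, d_N(N, n) ** 2, so_truncation_dim(N, n))
```

Agreement of the two numbers is a consistency check on the index set only; the equivariant isomorphism itself rests on Corollary 1.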
Case 2. Assume that $(G, K) \simeq (Sp(n), U(1) \times Sp(n - 1))$ with $n \ge 2$. In the notation of Sec. 3.2, the Gelfand node associated to the pair $(Sp(n), U(1) \times Sp(n - 1))$ is $\beta = \alpha_1 = e_1 - e_2$ and the fundamental weight attached to this simple root is $\varpi = e_1$. Consider the holomorphic line bundle $L = Sp(n) \times_{e^{-\varpi}} \mathbb{C}$ over the homogeneous space $Sp(n)/(U(1) \times Sp(n - 1))$ and take $H_N = \Gamma_{hol}(L^N)$ for $N \in \mathbb{N}$. As $Sp(n)$-modules, $H_N \cong V(N\varpi)$ and $A_N \cong V(N\varpi)^* \otimes V(N\varpi)$. Since the module $V(N\varpi)$ is self-dual, the result of Proposition 2 shows that

$$A_N \cong \bigoplus_{\substack{k, l \in \mathbb{N}_0 \\ 0 \le k + l \le N}} V((2k + l, l, 0, \ldots, 0)).$$

As in the previous case, the decomposition into irreducibles of the $Sp(n)$-module $L^2(Sp(n)/(U(1) \times Sp(n - 1)))$ is given by Krämer in [20, Table 1]. One has

$$L^2(Sp(n)/(U(1) \times Sp(n - 1))) \cong \bigoplus_{k, l \in \mathbb{N}_0} V((2k + l, l, 0, \ldots, 0)).$$

Denote by $E_N$ the unique submodule of $L^2(Sp(n)/(U(1) \times Sp(n - 1)))$ such that

$$E_N \cong \bigoplus_{\substack{k, l \in \mathbb{N}_0 \\ 0 \le k + l \le N}} V((2k + l, l, 0, \ldots, 0))$$

as $Sp(n)$-module. The sequence $(E_N)_{N \ge 1}$ satisfies the assertions of the theorem.

Finally, we observe that the analysis used in the proof of Theorem 3 directly implies the following result.

Proposition 3 (Compare [30, Proposition 4.8]). Let $(G, K)$ be a pair from the list (i)–(iii) in Proposition 1, and let $\beta \in \Pi$ be the associated Gelfand node with corresponding fundamental weight $\varpi := \varpi_\beta$. Then we have a multiplicity free decomposition of $G$-modules of the form

$$V(\varpi)^* \otimes V(\varpi) \cong \bigoplus_{i=0}^{r} V(\mu_i)$$

for certain $r \in \mathbb{N}$, where $\mu_0 := 0 \in \Lambda_+$ and $\{\mu_i\}_{1 \le i \le r}$ is a subset of the $K$-spherical dominant weights $\Lambda_+^K$. Furthermore, every $\lambda \in \Lambda_+^K$ can uniquely be written as an $\mathbb{N}_0$-linear combination of the $\mu_i$'s ($1 \le i \le r$).

Acknowledgments

I would like to express my gratitude to Tilmann Wurzbacher for suggesting the problem and for helpful discussions. I would also like to thank the anonymous referee for pointing out to me references [22, 28], and for remarks improving the article.
References

[1] D. N. Akhiezer, Lie Group Actions in Complex Analysis (Vieweg, Braunschweig, 1995).
[2] A. Y. Alekseev, A. Recknagel and V. Schomerus, Non-commutative world-volume geometries: Branes on SU(2) and fuzzy spheres, JHEP 09 (1999) 023.
[3] D. Arnal, M. Cahen and S. Gutt, Representations of compact Lie groups and quantization by deformation, Acad. Roy. Belg. Bull. Cl. Sci. (5) 74 (1988) 123–141.
[4] A. P. Balachandran, T. R. Govindarajan and B. Ydri, The Fermion doubling problem and noncommutative geometry, Mod. Phys. Lett. A 15 (2000) 1279–1286.
[5] A. P. Balachandran, B. P. Dolan, J. Lee, X. Martin and D. O'Connor, Fuzzy complex projective spaces and their star-products, J. Geom. Phys. 43 (2002) 184–204.
[6] M. Ben Halima and T. Wurzbacher, Fuzzy complex Grassmannians and quantization of line bundles, to appear in Abh. Math. Semin. Hamb. Univ.
[7] M. Bordemann, E. Meinrenken and M. Schlichenmaier, Toeplitz quantization of Kähler manifolds and gl(N), N → ∞ limits, Comm. Math. Phys. 165 (1994) 269–281.
[8] M. Cahen, S. Gutt and J. Rawnsley, Quantization of Kähler manifolds. I. Geometric interpretation of Berezin's quantization, J. Geom. Phys. 7 (1990) 45–62.
[9] M. Cahen, S. Gutt and J. Rawnsley, Quantization of Kähler manifolds. II, Trans. Amer. Math. Soc. 337 (1993) 73–98.
[10] B. P. Dolan and D. O'Connor, A fuzzy three sphere and fuzzy tori, JHEP 10 (2003) 060.
[11] B. P. Dolan and J. Olivier, Fuzzy complex Grassmannian spaces and their star products, Internat. J. Modern Phys. A 18 (2003) 1935–1958.
[12] M. R. Douglas and N. A. Nekrasov, Noncommutative field theory, Rev. Mod. Phys. 73 (2001) 977–1029.
[13] J. Fröhlich and K. Gawędzki, Conformal field theory and geometry of strings, in Mathematical Quantum Theory (Vancouver, 1993), Proceedings of the Conference on Mathematical Quantum Theory, Vancouver, Canada (Amer. Math. Soc. 1993), pp. 57–97.
[14] M. Fulmek and C. Krattenthaler, Lattice path proofs for determinantal formulas for symplectic and orthogonal characters, J. Combin. Theory Ser. A 77 (1997) 3–50.
[15] H. Grosse, C. Klimcik and P. Presnajder, Simple field theoretical models on noncommutative manifolds, in Lie Theory and Its Applications in Physics (Clausthal, 1995) (World Sci. Publishing, River Edge, NJ, 1996), pp. 117–131.
[16] H. Grosse and A. Strohmaier, Noncommutative geometry and the regularization problem of 4D quantum field theory, Lett. Math. Phys. 48 (1999) 163–179.
[17] Y. Hikida, M. Nozaki and Y. Sugawara, Formation of spherical 2D-brane from multiple D0-branes, Nucl. Phys. B 617 (2001) 117–150.
[18] R. C. King and N. G. I. El-Sharkaway, Standard Young tableaux and weight multiplicities of the classical Lie groups, J. Phys. A 16 (1983) 3153–3178.
[19] A. W. Knapp, Lie Groups Beyond an Introduction, 2nd edn. (Birkhäuser, Boston, 2002).
[20] M. Krämer, Sphärische Untergruppen in Kompakten Zusammenhängenden Liegruppen, Compositio Math. 38 (1979) 129–153.
[21] C. Krattenthaler, Identities for classical group characters of nearly rectangular shape, J. Algebra 209 (1998) 1–64.
[22] C. L. Lazaroiu, D. McNamee and C. Sämann, Generalized Berezin quantization, Bergmann metrics and fuzzy Laplacians, JHEP 09 (2008) 059.
June 2, 2010 14:55 WSPC/S0129-055X
548
148-RMP
J070-00401
M. Ben Halima
[23] P. Littelmann, A generalization of the Littlewood–Richardson rule, J. Algebra 130 (1990) 328–368.
[24] J. Madore, The fuzzy sphere, Class. Quantum Grav. 9 (1992) 69–87.
[25] J. Madore, An Introduction to Noncommutative Differential Geometry and Its Physical Applications, 2nd edn. (Cambridge University Press, Cambridge, 1999).
[26] S. Okada, Applications of minor summation formulas to rectangular-shaped representations of classical groups, J. Algebra 205 (1998) 337–367.
[27] M. A. Rieffel, Gromov–Hausdorff distance for quantum metric spaces, Mem. Amer. Math. Soc. 168 (2004) 1–65.
[28] M. A. Rieffel, Matrix algebras converge to the sphere for quantum Gromov–Hausdorff distance, Mem. Amer. Math. Soc. 168 (2004) 67–91.
[29] M. Schlichenmaier, Berezin–Toeplitz quantization and Berezin symbols for arbitrary compact Kähler manifolds, in Coherent States, Quantization and Gravity (Bialowieza, 1998), Proc. XVII Workshop on Geometric Methods in Physics (Warsaw Univ. Press, 2001), pp. 45–56.
[30] J. V. Stokman and M. S. Dijkhuizen, Quantized flag manifolds and irreducible ∗-representations, Comm. Math. Phys. 203 (1999) 297–324.
[31] G. Zhang, Berezin transform on compact Hermitian symmetric spaces, Manuscripta Math. 97 (1998) 371–388.
June 2, 2010 14:55 WSPC/S0129-055X
148-RMP
J070-00403
Reviews in Mathematical Physics Vol. 22, No. 5 (2010) 549–596 c World Scientific Publishing Company DOI: 10.1142/S0129055X1000403X
ON THE FEYNMAN PATH INTEGRAL FOR NONRELATIVISTIC QUANTUM ELECTRODYNAMICS
WATARU ICHINOSE Department of Mathematical Science, Shinshu University, Matsumoto 390-8621, Japan
[email protected]

Received 17 March 2008
Revised 26 March 2010

The Feynman path integral for regularized nonrelativistic quantum electrodynamics is studied rigorously. We begin with the Lagrangian function of the corresponding classical mechanics and construct the Feynman path integral. In the present paper, the electromagnetic potentials are assumed to be periodic with respect to a large box and quantized through their Fourier coefficients with large wave numbers cut off. Firstly, the Feynman path integral with respect to paths on the space of particles and vector potentials is defined rigorously by means of broken line paths under the constraints. Secondly, the Feynman path integral with respect to paths on the space of particles and electromagnetic potentials is also defined rigorously by means of broken line paths and piecewise constant paths without the constraints. This Feynman path integral is stated heuristically in Feynman and Hibbs' book. Thirdly, the vacuum and the state of photons of given momenta and polarizations are expressed concretely as functions of variables consisting of the Fourier coefficients of vector potentials. It is also proved rigorously in terms of distribution theory that the Coulomb potentials between charged particles naturally appear in the above Feynman path integral approach. This shows that the photons give rise to the Coulomb force.

Keywords: Feynman path integral; quantum electrodynamics.
Mathematics Subject Classification 2010: 81S40, 58D30
1. Introduction
A number of mathematical results on the Feynman path integrals for quantum mechanics have been obtained. On the other hand, the author is not aware of any mathematical results on the Feynman path integrals for quantum electrodynamics (cf. [2, 23]), written as QED from now on. The Feynman path integral for the free relativistic scalar boson field was defined rigorously in terms of the infinite dimensional Fresnel integral in [2]. The Chern–Simons functional integral was also defined rigorously, associated with a principal
fiber bundle over $\mathbb{R}^3$ with structure group a compact connected Lie group, as an infinite dimensional distribution in terms of white noise analysis, and the applications of its functional integral to the topological quantum field theory were given in [1]. In [27], the interaction of nonrelativistic particles with a scalar boson field was studied. There, the functional integral with respect to paths on the space of particles and the boson field was defined in terms of Markov processes under the assumption that the mass divided by the imaginary unit and a coupling constant divided by the imaginary unit are positive. As will be seen in the present paper, particles interact with the boson field through the quantized vector potential in QED. On the other hand, in [27], particles interact with the boson field through the quantized scalar potential, where the vector potential disappears. This is the main difference between our result and Nelson's. The spectra of Hamiltonian operators for nonrelativistic QED models have also been studied (cf. [12, 14, 32]). The Hamiltonian operators in these QED models are defined by means of the Coulomb potentials, and creation operators and annihilation operators acting on the bosonic Fock space $\bigoplus_{n=0}^{\infty} \otimes_s^n (L^2(\mathbb{R}^3) \oplus L^2(\mathbb{R}^3))$, defined in terms of an infrared and ultraviolet cut-off function in momentum space $\mathbb{R}^3$, where $L^2(\mathbb{R}^3)$ is the space of all square integrable functions in $\mathbb{R}^3$ and $L^2(\mathbb{R}^3) \oplus L^2(\mathbb{R}^3)$ expresses the space of all amplitudes of momentum of a single photon with polarizations. These QED models are simplified versions of those which are primarily intended in physics (cf. [10, 11, 29, 33]). A functional integral representation for the above nonrelativistic QED model with imaginary time was also obtained by Hiroshima [16] by means of the probabilistic method. We can see from Theorem 3.1 in the present paper that the Hamiltonian operators in [12, 14, 16, 32] are formally like (3.10) in the present paper.
However, our expression (3.10) is presented as a partial differential operator. In addition, as will be seen in Sec. 5, creation operators and annihilation operators with given momenta and polarizations acting on $S'(\mathbb{R}^{4N})$ are defined and the Hamiltonian operator (3.10) can be written by means of these creation operators and annihilation operators, where N is a positive integer determined from the regularization of QED, $S(\mathbb{R}^{4N})$ denotes the Schwartz space of all rapidly decreasing functions in $\mathbb{R}^{4N}$ and $S'(\mathbb{R}^{4N})$ is the dual space of $S(\mathbb{R}^{4N})$. This description of the Hamiltonian operator is the one familiar in the heuristic presentations in physics (cf. [10, 11, 29, 33]). It is well known that the only translation invariant σ-additive regular measure on a separable infinite dimensional Banach space is the identically zero measure (cf. [13, Chap. 4, Sec. 5, Theorem 4]). The measure defining heuristically the Feynman path integral is meant to be translation invariant (cf. [11, (7-29)]), so it cannot be realized as a σ-additive regular nontrivial measure. As is known (see, e.g., [2, 15, 23]), the Feynman path integral itself can be realized as a linear functional satisfying certain suitable continuity conditions. Our aim in the present paper is to define rigorously the Feynman path integral for a regularized nonrelativistic QED (for a physical discussion of QED and its nonrelativistic version, see, e.g., [7, 8, 10, 11, 29]). We begin with the Lagrangian
function of the corresponding classical mechanics, differently from the models in [12, 14, 16, 32], and construct rigorously the Feynman path integral. Usually in physics, the Feynman path integral for nonrelativistic QED is only heuristically defined. In the present paper, electromagnetic potentials are assumed to be periodic with respect to a large box in $\mathbb{R}^3$ and quantized through their Fourier coefficients. We note that in the present paper, regrettably, the Fourier coefficients with large wave numbers need to be arbitrarily cut off (ultraviolet cut-off) and we do not take the limit of the box to $\mathbb{R}^3$. In this double sense, our model is regularized. First, the mathematical definition of the Feynman path integral with respect to paths on the space of particles and vector potentials is given by means of broken line paths under the constraints, i.e. (2.20) in the present paper. These constraints are necessarily introduced in physics (cf., e.g., [11, (9-17)], [29, (A-7)], [32, (13.10)] and [33, (7.38)]) when electrodynamics is quantized from the classical mechanics. The reason for introducing the constraints is that a momentum canonically conjugate to the scalar potential is absent; see (2.3) in the present paper. Secondly, without the constraints we give the mathematical definition of the Feynman path integral with respect to paths on the space of particles and electromagnetic potentials by means of broken line paths and piecewise constant paths. This Feynman path integral has been given heuristically by [11, (9-98)]. Our method of defining the Feynman path integral without the constraints is similar to the one we used in [20] for defining the phase space Feynman path integral. That is, paths considered on the space of all scalar potentials are determined so that the derivatives of the Lagrangian function with respect to the variables of the scalar potential are piecewise constant (Remark 3.4).
The author emphasizes again that no rigorous definition of [11, (9-98)] has been given so far, so our result may be completely new. We note that our Feynman path integral with respect to paths on the space of particles and electromagnetic potentials can be proved to be equal to the Feynman path integral with respect to paths on the space of particles and vector potentials. Thirdly, the vacuum and the states of photons with given momenta and polarizations are expressed concretely as functions of variables consisting of the Fourier coefficients of vector potentials. In [11], only the vacuum and the state of a photon with a momentum and a polarization are expressed concretely as functions. Generally, in physics the vacuum and the states of photons with given momenta and polarizations are not considered concretely but rather abstractly (cf. [29, 33]). To write down the state of photons concretely, we introduce creation operators and annihilation operators, which can be written concretely as first order partial differential operators, similarly to what is done in white noise analysis in [15]. The results stated above should have many applications, as heuristically suggested in [11, Chap. 9]. Fourthly, we show in terms of distribution theory that the Coulomb potentials between charged particles appear when the periods of the Fourier series tend to
infinity and the cut-off of the Fourier coefficients is removed. This result, which shows that photons yield the Coulomb force, is well known in physics (cf. [8, 11]). In the present paper, we give a rigorous proof of this fact in the frame of our model of regularized nonrelativistic QED. The mathematical definition of the Feynman path integral for nonrelativistic QED with regularization is justified by means of a somewhat delicate study of oscillatory integral operators, the abstract Ascoli–Arzelà theorem on the weighted Sobolev spaces and the uniqueness for the initial value problem for the Schrödinger type equations as in [18–21]. The vacuum and the states of photons with given momenta and polarizations are expressed concretely as follows. We first define annihilation operators of photons with given momenta and polarizations by first order differential operators having the Fourier coefficients of vector potentials as variables. Creation operators of photons are defined as the adjoint operators of the annihilation operators. The vacuum is determined from the annihilation operators and the states of photons with given momenta and polarizations are determined from the vacuum by means of the creation operators. For the mathematics related to this see, e.g., [6]. This relies on formal considerations going back to [7]. The appearance of the Coulomb potentials between charged particles is proved by means of a convergence theorem for the Riemann sum of an unbounded function as the discretization parameter in space tends to zero, which will be stated in Proposition 4.3 in the present paper. Our plan in the present paper is as follows. Section 2 is devoted to preliminaries. In Sec. 3, the main results on the Feynman path integral for regularized nonrelativistic QED are stated. In Sec. 4, the appearance of the Coulomb potentials between charged particles is proved rigorously in our model. In Sec.
5, the vacuum and the states of photons with given momenta and polarizations are given concretely. Sections 6–9 are devoted to the proofs of the main results stated in Sec. 3.

2. Preliminaries
For a multi-index $\alpha = (\alpha_1, \dots, \alpha_d)$ and $z = (z_1, \dots, z_d) \in \mathbb{R}^d$, we write $|\alpha| = \sum_{j=1}^{d} \alpha_j$, $z^\alpha = z_1^{\alpha_1} \cdots z_d^{\alpha_d}$, $\partial_z^\alpha = (\partial/\partial z_1)^{\alpha_1} \cdots (\partial/\partial z_d)^{\alpha_d}$ and $\langle z \rangle = \sqrt{1 + |z|^2}$. Let $L^2 = L^2(\mathbb{R}^d)$ be the space of all square integrable functions in $\mathbb{R}^d$ with inner product $(\cdot, \cdot)$ and norm $\|\cdot\|$. Let $T > 0$ be an arbitrary constant, $t \in [0, T]$ and $x \in \mathbb{R}^3$. We consider n charged nonrelativistic particles $x^{(j)}(t) \in \mathbb{R}^3$ ($j = 1, 2, \dots, n$) with mass $m_j > 0$ and charge $e_j \in \mathbb{R}$. Let $E(t, x) = (E_1(t, x), E_2(t, x), E_3(t, x)) \in \mathbb{R}^3$ be the electric strength and $B(t, x) = (B_1(t, x), B_2(t, x), B_3(t, x)) \in \mathbb{R}^3$ the magnetic strength. Then the classical equations of motion of $x^{(j)}(t)$ are given by
$$ m_j \frac{d}{dt} \dot{x}^{(j)}(t) = e_j E(t, x^{(j)}(t)) + e_j \dot{x}^{(j)}(t) \times B(t, x^{(j)}(t)), \qquad \dot{x}^{(j)}(t) = \frac{d}{dt} x^{(j)}(t). $$
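Let $\phi(t, x) \in \mathbb{R}$ be a scalar potential and $A(t, x) \in \mathbb{R}^3$ a vector potential. We set
$$ \vec{x}(t) := (x^{(1)}(t), \dots, x^{(n)}(t)) \in \mathbb{R}^{3n}, \qquad \dot{\vec{x}}(t) := (\dot{x}^{(1)}(t), \dots, \dot{x}^{(n)}(t)) \in \mathbb{R}^{3n}. $$
Then the Lagrangian function for particles and the electromagnetic field with the distributional charge density
$$ \rho(t, x) = \sum_{j=1}^{n} e_j \delta(x - x^{(j)}(t)) \tag{2.1} $$
and the distributional current density
$$ j(t, x) = \sum_{j=1}^{n} e_j \dot{x}^{(j)}(t) \delta(x - x^{(j)}(t)) \in \mathbb{R}^3 \tag{2.2} $$
is given in distributional sense by
$$ \begin{aligned} L\Big(t, \vec{x}, \dot{\vec{x}}, A, \dot{A}, \frac{\partial A}{\partial x}, \phi, \frac{\partial \phi}{\partial x}\Big) &= \sum_{j=1}^{n} \frac{m_j}{2} |\dot{x}^{(j)}|^2 - \int \rho(t, x) \phi(t, x)\,dx + \frac{1}{c} \int j(t, x) \cdot A(t, x)\,dx \\ &\quad + \frac{1}{8\pi} \int_{\mathbb{R}^3} (|E(t, x)|^2 - |B(t, x)|^2)\,dx + C \\ &= \sum_{j=1}^{n} \Big\{ \frac{m_j}{2} |\dot{x}^{(j)}|^2 - e_j \phi(t, x^{(j)}) + \frac{1}{c} e_j \dot{x}^{(j)} \cdot A(t, x^{(j)}) \Big\} \\ &\quad + \frac{1}{8\pi} \int_{\mathbb{R}^3} (|E(t, x)|^2 - |B(t, x)|^2)\,dx + C \end{aligned} \tag{2.3} $$
(cf. [11, 32]), where
$$ E = -\frac{1}{c} \frac{\partial A}{\partial t} - \frac{\partial \phi}{\partial x}, \qquad B = \nabla \times A, \tag{2.4} $$
$\partial\phi/\partial x = (\partial\phi/\partial x_1, \partial\phi/\partial x_2, \partial\phi/\partial x_3)$ and C is an indefinite constant. It seems that a nontrivial indefinite constant in (2.3) has not been explicitly discussed by anyone before (cf. [11, 29, 32]). As in [8, 10, 29] we consider a sufficiently large box
$$ V = \Big[-\frac{L_1}{2}, \frac{L_1}{2}\Big] \times \Big[-\frac{L_2}{2}, \frac{L_2}{2}\Big] \times \Big[-\frac{L_3}{2}, \frac{L_3}{2}\Big] \subset \mathbb{R}^3. $$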
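In the present paper, as variables we consider all periodic potentials $\phi(t, x)$ and $A(t, x)$ in $x \in \mathbb{R}^3$ with periods $L_1$, $L_2$ and $L_3$ satisfying
$$ \nabla \cdot A(t, x) = 0 \quad \text{in } [0, T] \times \mathbb{R}^3 \quad \text{(the Coulomb gauge)} \tag{2.5} $$
and also
$$ \int_V \phi(t, x)\,dx = 0, \qquad \int_V A(t, x)\,dx = 0. \tag{2.6} $$
Let $|V| = L_1 L_2 L_3$. We set
$$ k := \Big( \frac{2\pi}{L_1} s_1, \frac{2\pi}{L_2} s_2, \frac{2\pi}{L_3} s_3 \Big) \quad (s_1, s_2, s_3 = 0, \pm 1, \pm 2, \dots). \tag{2.7} $$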
Then, using the Gram–Schmidt method, we can easily determine $\vec{e}_j(k) \in \mathbb{R}^3$ ($j = 1, 2$) such that $(\vec{e}_1(k), \vec{e}_2(k), k/|k|)$ for all $k \neq 0$ form a set of mutually orthogonal unit vectors in $\mathbb{R}^3$ and
$$ \vec{e}_j(-k) = -\vec{e}_j(k) \quad (j = 1, 2) \tag{2.8} $$
(cf. [3, p. 448]). We fix these $\vec{e}_j(k)$ hereafter. Noting (2.5) and (2.6), we can expand $\phi(t, x)$ and $A(t, x)$ formally into the Fourier series
$$ A(x, \{a_{lk}(t)\}) = \frac{\sqrt{4\pi}}{|V|}\, c \sum_{k \neq 0} \{ a_{1k}(t) e^{ik \cdot x} \vec{e}_1(k) + a_{2k}(t) e^{ik \cdot x} \vec{e}_2(k) \}, \tag{2.9} $$
$$ \phi(x, \{\phi_k(t)\}) = \frac{1}{|V|} \sum_{k \neq 0} \phi_k(t) e^{ik \cdot x}. \tag{2.10} $$
Remark 2.1. Usually in the physical literature (cf. [11, 29]) the condition (2.6) is not stated clearly.

We write
$$ a_{lk} =: \frac{a_{lk}^{(1)} - i a_{lk}^{(2)}}{\sqrt{2}} \quad (l = 1, 2), \tag{2.11} $$
$$ \phi_k =: \phi_k^{(1)} - i \phi_k^{(2)}, \tag{2.12} $$
where $a_{lk}^{(i)} \in \mathbb{R}$ and $\phi_k^{(i)} \in \mathbb{R}$, and also the complex conjugate of $a_{lk}$ as $a_{lk}^*$. Since A and φ are real valued, the relations
$$ a_{l\,-k}^{(1)} = -a_{lk}^{(1)}, \quad a_{l\,-k}^{(2)} = a_{lk}^{(2)}, \quad \phi_{-k}^{(1)} = \phi_k^{(1)}, \quad \phi_{-k}^{(2)} = -\phi_k^{(2)} \tag{2.13} $$
hold from (2.8). So, from (2.9) and (2.10), we have
$$ A(x, \{a_{lk}\}) = \frac{\sqrt{4\pi}}{|V|}\, c \sum_{k \neq 0} \sum_{l=1}^{2} \frac{1}{\sqrt{2}} (a_{lk}^{(1)} \cos k \cdot x + a_{lk}^{(2)} \sin k \cdot x)\, \vec{e}_l(k), \tag{2.14} $$
$$ \phi(x, \{\phi_k\}) = \frac{1}{|V|} \sum_{k \neq 0} (\phi_k^{(1)} \cos k \cdot x + \phi_k^{(2)} \sin k \cdot x). \tag{2.15} $$
We also write
$$ \rho_k^{(1)}(\vec{x}) := \sum_{j=1}^{n} e_j \cos k \cdot x^{(j)}, \tag{2.16} $$
$$ \rho_k^{(2)}(\vec{x}) := \sum_{j=1}^{n} e_j \sin k \cdot x^{(j)}. \tag{2.17} $$
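Determining the constant C in the Lagrangian function (2.3) formally as the infinite constant
$$ \frac{2\pi}{|V|} \sum_{k \neq 0} \frac{1}{|k|^2} \sum_{j=1}^{n} e_j^2 + 2 \sum_{k \neq 0} \frac{\hbar c |k|}{2}, \tag{2.18} $$
we can write L from (2.3) by means of (2.4), (2.9), (2.10) and (2.15) as
$$ \begin{aligned} L(\vec{x}, \dot{\vec{x}}, \{a_{lk}\}, \{\dot{a}_{lk}\}, \{\phi_k\}) &= \sum_{j=1}^{n} \frac{m_j}{2} |\dot{x}^{(j)}|^2 + \frac{1}{8\pi |V|} \sum_{k \neq 0} \Big\{ \sum_{i=1}^{2} \big( |k|^2 (\phi_k^{(i)})^2 - 8\pi \rho_k^{(i)}(\vec{x}) \phi_k^{(i)} \big) + 16\pi^2 \sum_{j=1}^{n} \frac{e_j^2}{|k|^2} \Big\} \\ &\quad + \frac{1}{c} \sum_{j=1}^{n} e_j \dot{x}^{(j)} \cdot A(x^{(j)}, \{a_{lk}\}) + \frac{1}{2} \sum_{k \neq 0, i, l} \Big\{ \frac{(\dot{a}_{lk}^{(i)})^2}{2|V|} - \frac{(c|k|)^2 (a_{lk}^{(i)})^2}{2|V|} + \frac{\hbar c |k|}{2} \Big\}. \end{aligned} \tag{2.19} $$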
The reason why we have chosen the indefinite constant C in (2.3) in the way given by (2.18) will be explained in Remark 5.1.

Remark 2.2. If we do not assume (2.6), we must add $(-1/|V|)\big(\sum_{j=1}^{n} e_j\big)\phi_0^{(1)}$ and $\sum_{i,l=1,2} (\dot{a}_{l0}^{(i)})^2/(4|V|)$ to (2.19).

If we take into account the constraints $\nabla \cdot E = 4\pi\rho$ as in [11, (9-17)] and [33, (7.38)], we have
$$ |k|^2 \phi_k^{(i)} = 4\pi \rho_k^{(i)}(\vec{x}) \quad (i = 1, 2,\ k \neq 0) \tag{2.20} $$
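and $\sum_{j=1}^{n} e_j = 0$ formally from (2.1), (2.4) and (2.5). But, in the present paper, we adopt only (2.20) as constraints. Then from (2.16) and (2.17), we have
$$ \begin{aligned} &\sum_{i=1}^{2} \big( |k|^2 (\phi_k^{(i)})^2 - 8\pi \rho_k^{(i)}(\vec{x}) \phi_k^{(i)} \big) + 16\pi^2 \sum_{j=1}^{n} \frac{e_j^2}{|k|^2} \\ &\qquad = -\frac{16\pi^2}{|k|^2} \Big\{ (\rho_k^{(1)})^2 + (\rho_k^{(2)})^2 - \sum_{j=1}^{n} e_j^2 \Big\} \\ &\qquad = -\frac{16\pi^2}{|k|^2} \sum_{j,l=1,\, j \neq l}^{n} e_j e_l\, e^{ik \cdot x^{(j)}} e^{-ik \cdot x^{(l)}} \\ &\qquad = -\frac{16\pi^2}{|k|^2} \sum_{j,l=1,\, j \neq l}^{n} e_j e_l \cos k \cdot (x^{(j)} - x^{(l)}). \end{aligned} \tag{2.21} $$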
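The elimination of the scalar potential in (2.21) can be checked numerically for a single wave vector. The following sketch evaluates the left-hand side with $\phi_k^{(i)}$ fixed by the constraint (2.20) and compares it with the pair sum on the right-hand side; the function names and the sample charges and positions used to call them are illustrative assumptions.

```python
import math

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def charge_modes(k, charges, positions):
    """rho_k^(1), rho_k^(2) as in (2.16) and (2.17)."""
    rho1 = sum(e * math.cos(_dot(k, x)) for e, x in zip(charges, positions))
    rho2 = sum(e * math.sin(_dot(k, x)) for e, x in zip(charges, positions))
    return rho1, rho2

def constrained_scalar_term(k, charges, positions):
    """Left-hand side of (2.21) with phi_k^(i) = 4*pi*rho_k^(i)/|k|^2 from (2.20)."""
    k2 = _dot(k, k)
    total = 0.0
    for rho in charge_modes(k, charges, positions):
        phi = 4.0 * math.pi * rho / k2
        total += k2 * phi * phi - 8.0 * math.pi * rho * phi
    total += 16.0 * math.pi ** 2 * sum(e * e for e in charges) / k2
    return total

def pair_interaction_term(k, charges, positions):
    """Last line of (2.21): the sum over ordered pairs j != l."""
    k2 = _dot(k, k)
    pair_sum = 0.0
    for j, (ej, xj) in enumerate(zip(charges, positions)):
        for l, (el, xl) in enumerate(zip(charges, positions)):
            if j != l:
                diff = tuple(a - b for a, b in zip(xj, xl))
                pair_sum += ej * el * math.cos(_dot(k, diff))
    return -16.0 * math.pi ** 2 * pair_sum / k2
```

The self-interaction terms $e_j^2$ cancel exactly against the constant chosen in (2.18), which is why only pairs with $j \neq l$ survive.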
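So we get
$$ \begin{aligned} L_c(\vec{x}, \dot{\vec{x}}, \{a_{lk}\}, \{\dot{a}_{lk}\}) &= \sum_{j=1}^{n} \frac{m_j}{2} |\dot{x}^{(j)}|^2 - \frac{2\pi}{|V|} \sum_{k \neq 0} \sum_{j,l=1,\, j \neq l}^{n} \frac{e_j e_l \cos k \cdot (x^{(j)} - x^{(l)})}{|k|^2} \\ &\quad + \frac{1}{c} \sum_{j=1}^{n} e_j \dot{x}^{(j)} \cdot A(x^{(j)}, \{a_{lk}\}) + \frac{1}{2} \sum_{k \neq 0, i, l} \Big\{ \frac{(\dot{a}_{lk}^{(i)})^2}{2|V|} - \frac{(c|k|)^2 (a_{lk}^{(i)})^2}{2|V|} + \frac{\hbar c |k|}{2} \Big\}. \end{aligned} \tag{2.22} $$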
We introduce the weighted Sobolev spaces $B^a(\mathbb{R}^d) := \{ f \in L^2;\ \|f\|_{B^a} := \|f\| + \sum_{|\alpha| = a} (\|z^\alpha f\| + \|\partial_z^\alpha f\|) < \infty \}$ ($a = 1, 2, \dots$). Let $B^{-a}(\mathbb{R}^d)$ denote their dual spaces. We set $B^0 := L^2$. Let $\chi \in C^\infty(\mathbb{R}^d)$ with compact support such that $\chi(0) = 1$. We define the oscillatory integral $\text{Os-}\!\int g(\cdot, z')\,dz'$ by $\lim_{\epsilon \to 0} \int \chi(\epsilon z') g(\cdot, z')\,dz'$ independently of the choice of χ pointwise, in the topology of $B^a(\mathbb{R}^d)$ or in the topology in $S(\mathbb{R}^d)$ (cf. [24]) for a function $g(z, z')$ in $\mathbb{R}^d \times \mathbb{R}^d$, provided the integral involving χ exists in Lebesgue sense for any $\epsilon > 0$.

3. Main Results
We arbitrarily cut off the terms of large wave numbers k in (2.22). That is, let $M_j$ ($j = 1, 2, 3$) be arbitrary positive integers such that $M_2 \leq M_3$. We consider
$$ \Lambda_j := \Big\{ k = \Big( \frac{2\pi}{L_1} s_1, \frac{2\pi}{L_2} s_2, \frac{2\pi}{L_3} s_3 \Big);\ s_1^2 + s_2^2 + s_3^2 \neq 0,\ |s_1|, |s_2|, |s_3| \leq M_j \Big\}. \tag{3.1} $$
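Then we can determine $\Lambda'_j$ ($j = 1, 2, 3$) such that
$$ \Lambda_j =: \Lambda'_j \cup (-\Lambda'_j), \qquad \Lambda'_j \cap (-\Lambda'_j) = \text{empty set}, \qquad \Lambda'_2 \subseteq \Lambda'_3 \tag{3.2} $$
and fix $\Lambda'_j$ hereafter. Let $N_j$ denote the number of elements of the set $\Lambda'_j$. It follows from (2.13) that $a_{\Lambda'_j} := \{a_{lk}^{(i)}\}_{k \in \Lambda'_j, i, l} \in \mathbb{R}^{4N_j}$ are independent variables (cf. [32, p. 154]). We also introduce cut-off functions $g(x) \in C^\infty(\mathbb{R}^3)$ and $\psi(\theta) \in C^\infty(\mathbb{R})$. We consider
$$ \begin{aligned} \tilde{L}_c(\vec{x}, \dot{\vec{x}}, \{a_{lk}\}, \{\dot{a}_{lk}\}) &:= \sum_{j=1}^{n} \frac{m_j}{2} |\dot{x}^{(j)}|^2 - \frac{2\pi}{|V|} \sum_{k \in \Lambda_1} \sum_{j,l=1,\, j \neq l}^{n} \frac{e_j e_l \cos k \cdot (x^{(j)} - x^{(l)})}{|k|^2} \\ &\quad + \frac{1}{c} \sum_{j=1}^{n} e_j \dot{x}^{(j)} \cdot \tilde{A}(x^{(j)}, a_{\Lambda'_2}) + \frac{1}{2} \sum_{k \in \Lambda_3, i, l} \Big\{ \frac{(\dot{a}_{lk}^{(i)})^2}{2|V|} - \frac{(c|k|)^2 (a_{lk}^{(i)})^2}{2|V|} + \frac{\hbar c |k|}{2} \Big\} \end{aligned} \tag{3.3} $$
in place of $L_c$ given by (2.22), where A given by (2.14) is replaced with
$$ \tilde{A}(x, a_{\Lambda'_2}) = \frac{\sqrt{4\pi}}{|V|}\, c\, g(x) \sum_{k \in \Lambda_2} \sum_{l=1}^{2} \frac{1}{\sqrt{2}} \big( \psi(a_{lk}^{(1)}) \cos k \cdot x + \psi(a_{lk}^{(2)}) \sin k \cdot x \big)\, \vec{e}_l(k). \tag{3.4} $$
We assume $\psi(-\theta) = -\psi(\theta)$ ($\theta \in \mathbb{R}$). For the sake of simplicity we write $\Lambda := \Lambda_3$ and $N := N_3$ (and correspondingly $\Lambda' := \Lambda'_3$). We consider a subdivision
$$ \Delta : 0 = \tau_0 < \tau_1 < \cdots < \tau_\nu = T, \qquad |\Delta| := \max_{1 \leq l \leq \nu} (\tau_l - \tau_{l-1}) $$
of $[0, T]$. Let $\vec{x} \in \mathbb{R}^{3n}$ and $a_{\Lambda'} \in \mathbb{R}^{4N}$ be fixed. We take arbitrarily $\vec{x}^{(0)}, \dots, \vec{x}^{(\nu-1)} \in \mathbb{R}^{3n}$ and $a_{\Lambda'}^{(0)}, \dots, a_{\Lambda'}^{(\nu-1)} \in \mathbb{R}^{4N}$. Then, we write the oriented broken line path on $[0, T]$ connecting $\vec{x}^{(l)}$ at $\theta = \tau_l$ ($l = 0, 1, \dots, \nu$, $\vec{x}^{(\nu)} = \vec{x}$) by $q_\Delta(\theta) \in \mathbb{R}^{3n}$. Of course, $dq_\Delta(\theta)/d\theta =: \dot{q}_\Delta(\theta)$ in distributional sense is in $L^2([0, T])$. In the same way we define the broken line path $a_{\Lambda'\Delta}(\theta) \in \mathbb{R}^{4N}$ on $[0, T]$ for $a_{\Lambda'}^{(0)}, \dots, a_{\Lambda'}^{(\nu-1)}$ and $a_{\Lambda'}$. We define $a_{\Lambda\Delta}(\theta) \in \mathbb{R}^{8N}$ by means of (2.13). We write the classical action
$$ S_c(T, 0; q_\Delta, a_{\Lambda\Delta}) = \int_0^T \tilde{L}_c(q_\Delta(\theta), \dot{q}_\Delta(\theta), a_{\Lambda\Delta}(\theta), \dot{a}_{\Lambda\Delta}(\theta))\,d\theta. \tag{3.5} $$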
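To make the notion of a broken line path concrete, the following sketch builds the piecewise linear path through given nodes and evaluates only the kinetic part $\int_0^T (m/2)|\dot{q}_\Delta|^2\,d\theta$ of the action for a single particle. It is an illustration under simplifying assumptions (one particle, no potentials, no field variables); the function names are hypothetical.

```python
def broken_line_path(taus, points):
    """Return q(theta): the piecewise linear path with q(taus[l]) = points[l]."""
    def q(theta):
        for l in range(1, len(taus)):
            if theta <= taus[l]:
                w = (theta - taus[l - 1]) / (taus[l] - taus[l - 1])
                return tuple((1.0 - w) * a + w * b
                             for a, b in zip(points[l - 1], points[l]))
        raise ValueError("theta lies outside the subdivision")
    return q

def kinetic_action(mass, taus, points):
    """Integral of (m/2)|dq/dtheta|^2 over [taus[0], taus[-1]] for the broken line path.

    The velocity is constant on each subinterval, so the integral is an exact finite sum.
    """
    total = 0.0
    for l in range(1, len(taus)):
        dt = taus[l] - taus[l - 1]
        dist2 = sum((b - a) ** 2 for a, b in zip(points[l - 1], points[l]))
        total += 0.5 * mass * dist2 / dt
    return total
```

When the intermediate nodes lie on the straight line from the initial to the final point, traversed at constant speed, the sum reduces to $m|x - x^{(0)}|^2/(2T)$ independently of the subdivision; any other choice of nodes gives a larger value.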
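Let $\rho^* > 0$ be the constant which will be defined from $\Lambda_1$, $\Lambda_2$ and $\Lambda_3$ in Proposition 7.2 of the present paper. See also Remark 7.1. Then we have

Theorem 3.1. We assume for cut-off functions $g(x)$ and $\psi(\theta)$ in (3.4) that for any $l = 1, 2, \dots$ and any multi-index α there exist constants $\delta_l > 0$ and $\delta_\alpha > 0$ satisfying
$$ |\partial_\theta^l \psi(\theta)| \leq C_l \langle \theta \rangle^{-(1 + \delta_l)}, \quad \theta \in \mathbb{R} \tag{3.6} $$
and
$$ |\partial_x^\alpha g(x)| \leq C_\alpha \langle x \rangle^{-(1 + \delta_\alpha)}, \quad x \in \mathbb{R}^3. \tag{3.7} $$
Let $|\Delta| \leq \rho^*$ and $f(\vec{x}, a_{\Lambda'}) \in B^a(\mathbb{R}^{3n+4N})$ ($a = 0, 1, 2, \dots$). Then,
$$ \begin{aligned} &\prod_{l=1}^{\nu} \Bigg\{ \prod_{j=1}^{n} \sqrt{\frac{m_j}{2\pi i \hbar (\tau_l - \tau_{l-1})}}^{\,3} \sqrt{\frac{1}{2\pi i \hbar |V| (\tau_l - \tau_{l-1})}}^{\,4N} \Bigg\} \\ &\quad \times \text{Os-}\!\int \cdots \int \big( \exp i\hbar^{-1} S_c(T, 0; q_\Delta, a_{\Lambda\Delta}) \big) f(q_\Delta(0), a_{\Lambda'\Delta}(0))\, d\vec{x}^{(0)} \cdots d\vec{x}^{(\nu-1)}\, da_{\Lambda'}^{(0)} \cdots da_{\Lambda'}^{(\nu-1)} \end{aligned} \tag{3.8} $$
is well defined in $B^a(\mathbb{R}^{3n+4N})$, which we write as $(C_\Delta(T, 0)f)(\vec{x}, a_{\Lambda'})$ or $\iint (\exp i\hbar^{-1} S_c(T, 0; q_\Delta, a_{\Lambda\Delta})) f(q_\Delta(0), a_{\Lambda'\Delta}(0))\, Dq_\Delta\, Da_{\Lambda'\Delta}$. In addition, as $|\Delta|$ tends to 0, the function $(C_\Delta(T, 0)f)(\vec{x}, a_{\Lambda'})$ converges to a limit, which we call the Feynman path integral $\iint (\exp i\hbar^{-1} S_c(T, 0; q, a_\Lambda)) f(q(0), a_{\Lambda'}(0))\, Dq\, Da_{\Lambda'}$, in $B^a(\mathbb{R}^{3n+4N})$. We can also see that this limit is $B^a$-valued continuous and $B^{a-2}$-valued continuously differentiable in $T \in (0, \infty)$, and satisfies the Schrödinger type equation
$$ i\hbar \frac{\partial}{\partial t} u(t) = H(t) u(t) \tag{3.9} $$
with $u(0) = f$, where
$$ \begin{aligned} H(t) &= \sum_{j=1}^{n} \frac{1}{2m_j} \Big| \frac{\hbar}{i} \frac{\partial}{\partial x^{(j)}} - \frac{e_j}{c} \tilde{A}(x^{(j)}, a_{\Lambda'_2}) \Big|^2 + \frac{2\pi}{|V|} \sum_{k \in \Lambda_1} \sum_{j,l=1,\, j \neq l}^{n} \frac{e_j e_l \cos k \cdot (x^{(j)} - x^{(l)})}{|k|^2} \\ &\quad + \sum_{k \in \Lambda', i, l} \Big\{ \frac{|V|}{2} \Big( \frac{\hbar}{i} \frac{\partial}{\partial a_{lk}^{(i)}} \Big)^2 + \frac{(c|k|)^2}{2|V|} (a_{lk}^{(i)})^2 - \frac{\hbar c |k|}{2} \Big\}. \end{aligned} \tag{3.10} $$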
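Each summand of the field part of (3.10) has the form of a harmonic oscillator in one variable $a$ with "mass" $1/|V|$ and frequency $\omega = c|k|$, shifted by $-\hbar\omega/2$. As a hedged consistency check (not part of the proof), the sketch below applies one such summand to the Gaussian $\exp(-\omega a^2/(2\hbar|V|))$ by finite differences; the result should vanish up to discretization error. The function name, the parameter values used to call it and the step size are assumptions for illustration.

```python
import math

def field_mode_on_gaussian(a, omega, volume, hbar=1.0, step=1e-3):
    """Apply (|V|/2)(hbar/i d/da)^2 + omega^2 a^2/(2|V|) - hbar*omega/2 to a Gaussian.

    The Gaussian is psi(a) = exp(-omega a^2 / (2 hbar |V|)); the second derivative
    is approximated by a central finite difference with the given step.
    """
    beta = omega / (hbar * volume)

    def psi(s):
        return math.exp(-0.5 * beta * s * s)

    second_derivative = (psi(a + step) - 2.0 * psi(a) + psi(a - step)) / (step * step)
    kinetic = -0.5 * volume * hbar ** 2 * second_derivative   # (hbar/i)^2 = -hbar^2
    potential = omega ** 2 / (2.0 * volume) * a * a * psi(a)
    zero_point = -0.5 * hbar * omega * psi(a)
    return kinetic + potential + zero_point
```

The vanishing of this expression reflects the role of the constant $\hbar c|k|/2$ in (3.3): it removes the zero-point energy of each field mode from the Hamiltonian.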
Remark 3.1. Let us determine the indefinite constant C in (2.3) by
$$ \frac{2\pi}{|V|} \sum_{k \in \Lambda_1} \frac{1}{|k|^2} \sum_{j=1}^{n} e_j^2 + 2 \sum_{k \in \Lambda_3} \frac{\hbar c |k|}{2} $$
and cut off the terms of large wave numbers k in (2.19) by introducing $\Lambda_j$ ($j = 1, 2, 3$). Then we get (3.3) again, taking into account the constraints (2.20).

Remark 3.2. Let $0 < \epsilon \leq 1$ and $g_\epsilon(x) \in C^\infty(\mathbb{R}^3)$ satisfy (3.7) for all α. Let $U_\epsilon(t, 0)f$ ($0 \leq t \leq T$) denote the Feynman path integral defined in Theorem 3.1 for $f \in B^a(\mathbb{R}^{3n+4N})$. Suppose that $\partial_x^\alpha g_\epsilon(x)$ are uniformly bounded with respect to $0 < \epsilon \leq 1$ in $\mathbb{R}^{3n}$ for all α and that $\partial_x^\alpha g_\epsilon(x)$ converges to $\partial_x^\alpha 1$ pointwise in $\mathbb{R}^{3n}$ for all α as ε tends to zero. Then we can prove that as ε tends to zero, $U_\epsilon(t, 0)f$ converges to the solution of (3.9) with $u(0) = f$, where $g(x)$ in (3.4) is replaced with 1, in $B^a$ uniformly in $t \in [0, T]$. In this way we can remove the cut-off function $g(x)$ in (3.4). This result will be published in [22].

Remark 3.3. Let $0 \leq t_0 \leq t \leq T$. For $f \in B^a(\mathbb{R}^{3n+4N})$ ($a = 0, 1, 2, \dots$) we define $C_\Delta(t, t_0)f$ with $C_\Delta(t_0, t_0)f = f$ as in (3.8). See (9.3) in the present paper for the precise definition. As will be seen from the proof of Theorem 3.1 of the present paper, under the assumptions of Theorem 3.1 $(C_\Delta(t, t_0)f)(\vec{x}, a_{\Lambda'})$ is well defined in $B^a$ and $\lim_{|\Delta| \to 0} C_\Delta(t, t_0)f$ exists in $B^a$ uniformly in $0 \leq t_0 \leq t \leq T$, which satisfies the Schrödinger type equation (3.9) with $u(t_0) = f$.

In place of L expressed by (2.19) we consider
$$ \begin{aligned} \tilde{L}(\vec{x}, \dot{\vec{x}}, \{a_{lk}\}, \{\dot{a}_{lk}\}, \{\phi_k\}) &:= \sum_{j=1}^{n} \frac{m_j}{2} |\dot{x}^{(j)}|^2 + \frac{1}{8\pi |V|} \sum_{k \in \Lambda_1} \Big\{ \sum_{i=1}^{2} \big( |k|^2 (\phi_k^{(i)})^2 - 8\pi \rho_k^{(i)}(\vec{x}) \phi_k^{(i)} \big) + 16\pi^2 \sum_{j=1}^{n} \frac{e_j^2}{|k|^2} \Big\} \\ &\quad + \frac{1}{c} \sum_{j=1}^{n} e_j \dot{x}^{(j)} \cdot \tilde{A}(x^{(j)}, a_{\Lambda'_2}) + \frac{1}{2} \sum_{k \in \Lambda_3, i, l} \Big\{ \frac{(\dot{a}_{lk}^{(i)})^2}{2|V|} - \frac{(c|k|)^2 (a_{lk}^{(i)})^2}{2|V|} + \frac{\hbar c |k|}{2} \Big\} \end{aligned} \tag{3.11} $$
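by means of (3.4) as in $\tilde{L}_c$. Let $q_\Delta(\theta) \in \mathbb{R}^{3n}$, $a_{\Lambda'\Delta}(\theta) \in \mathbb{R}^{4N}$ and $a_{\Lambda\Delta}(\theta) \in \mathbb{R}^{8N}$ be the broken line paths defined before. Let $\xi_k := (\xi_k^{(1)}, \xi_k^{(2)}) \in \mathbb{R}^2$ for $k \in \Lambda'_1$. Take $\xi_k^{(0)}, \xi_k^{(1)}, \dots$ and $\xi_k^{(\nu-1)}$ in $\mathbb{R}^2$ arbitrarily. Set $\rho_k(\vec{x}) := (\rho_k^{(1)}(\vec{x}), \rho_k^{(2)}(\vec{x}))$ by means of (2.16) and (2.17). Then, we define the path
$$ \phi_{k\Delta}(\theta) := \xi_k^{(l)} + \frac{4\pi \rho_k(q_\Delta(\theta))}{|k|^2} \in \mathbb{R}^2, \quad \tau_{l-1} < \theta \leq \tau_l \tag{3.12} $$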
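($l = 1, 2, \dots, \nu$), where $\phi_{k\Delta}(0) := \lim_{\theta \to 0+0} \phi_{k\Delta}(\theta)$. We set $\phi_{\Lambda'_1\Delta}(\theta) := \{\phi_{k\Delta}(\theta)\}_{k \in \Lambda'_1} \in \mathbb{R}^{2N_1}$. We define $\phi_{\Lambda_1\Delta}(\theta) \in \mathbb{R}^{4N_1}$ by means of (2.13). Let $S(T, 0; q_\Delta, a_{\Lambda\Delta}, \phi_{\Lambda_1\Delta})$ be the classical action for $\tilde{L}(\vec{x}, \dot{\vec{x}}, \{a_{lk}\}, \{\dot{a}_{lk}\}, \{\phi_k\})$ given by (3.11).

Theorem 3.2. Let $|\Delta| \leq \rho^*$ and $f(\vec{x}, a_{\Lambda'}) \in B^a(\mathbb{R}^{3n+4N})$ ($a = 0, 1, 2, \dots$). Then, under the assumptions of Theorem 3.1 the function
$$ \begin{aligned} &\prod_{l=1}^{\nu} \Bigg\{ \prod_{j=1}^{n} \sqrt{\frac{m_j}{2\pi i \hbar (\tau_l - \tau_{l-1})}}^{\,3} \sqrt{\frac{1}{2\pi i \hbar |V| (\tau_l - \tau_{l-1})}}^{\,4N} \times \prod_{k \in \Lambda'_1} \frac{|k|^2 (\tau_l - \tau_{l-1})}{4 i \pi^2 \hbar |V|} \Bigg\}\ \text{Os-}\!\int \cdots \int \big( \exp i\hbar^{-1} S(T, 0; q_\Delta, a_{\Lambda\Delta}, \phi_{\Lambda_1\Delta}) \big) \\ &\quad \times f(q_\Delta(0), a_{\Lambda'\Delta}(0))\, d\vec{x}^{(0)} \cdots d\vec{x}^{(\nu-1)}\, da_{\Lambda'}^{(0)} \cdots da_{\Lambda'}^{(\nu-1)} \prod_{k \in \Lambda'_1} d\xi_k^{(0)}\, d\xi_k^{(1)} \cdots d\xi_k^{(\nu-1)} \end{aligned} \tag{3.13} $$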
is well defined in $B^a(\mathbb{R}^{3n+4N})$ and is equal to $\iint (\exp i\hbar^{-1} S_c(T, 0; q_\Delta, a_{\Lambda\Delta})) f(q_\Delta(0), a_{\Lambda'\Delta}(0))\, Dq_\Delta\, Da_{\Lambda'\Delta}$ defined by (3.8) in Theorem 3.1. So it follows from Theorem 3.1 that as $|\Delta| \to 0$, (3.13) converges to the Feynman path integral
$$ \iiint \big( \exp i\hbar^{-1} S(T, 0; q, a_\Lambda, \phi_{\Lambda_1}) \big) f(q(0), a_{\Lambda'}(0))\, Dq\, Da_{\Lambda'}\, D\phi_{\Lambda'_1} \tag{3.14} $$
in $B^a(\mathbb{R}^{3n+4N})$, which satisfies the Schrödinger type equation (3.9) with $u(0) = f$. The Feynman path integral (3.14) is given heuristically in [11, §9-8].

Remark 3.4. As was noted in the introduction, the constraints (2.20) are not needed in Theorem 3.2 above. The path $\phi_{k\Delta}(\theta)$ defined by (3.12) is determined so that $\partial \tilde{L}(q_\Delta(\theta), \dot{q}_\Delta(\theta), a_{\Lambda\Delta}(\theta), \dot{a}_{\Lambda\Delta}(\theta), \phi_{\Lambda_1\Delta}(\theta))/\partial \phi_k^{(i)}$ ($i = 1, 2$) are piecewise constant.

Remark 3.5. We take $f \in S(\mathbb{R}^{3n+4N})$ and set $M_0 = [(3n + 4N)/2] + 1$, where $[\cdot]$ denotes Gauss' symbol. Let $\zeta = (x, X)$, and α and β multi-indices. Then, the Sobolev inequality shows
$$ \sup_{\zeta \in \mathbb{R}^{3n+4N}} |\zeta^\alpha \partial_\zeta^\beta f(\zeta)| \leq \|\zeta^\alpha \partial_\zeta^\beta f\| + \sum_{|\kappa| = M_0} \|\partial_\zeta^\kappa (\zeta^\alpha \partial_\zeta^\beta f)\|. $$
It follows from Lemma 2.4 with $a = b = 1$ in [17] or as in the proof of (7.14) in the present paper that the right-hand side of the above is bounded by $C_{\alpha,\beta} \|f\|_{B^{|\alpha + \beta| + M_0}}$
with a constant $C_{\alpha,\beta}$. Hence, for $|\Delta| \leq \rho^*$ the functions (3.8), (3.13), the limit of (3.8) as $|\Delta| \to 0$ and the limit of (3.13) as $|\Delta| \to 0$ are well defined in S, so pointwise.

Remark 3.6. We write (3.13) as $G_\Delta(T, 0)f$. Let $0 \leq t_0 \leq t \leq T$. For $f \in B^a(\mathbb{R}^{3n+4N})$ ($a = 0, 1, 2, \dots$) we can define $G_\Delta(t, t_0)f$ as in (3.13) in the same way that $C_\Delta(t, t_0)f$ is defined in Remark 3.3. See also (9.20) in the present paper for the precise definition. As will be seen in the proof of Theorem 3.2, under the assumptions of Theorem 3.1, $G_\Delta(t, t_0)f$ is well defined in $B^a$ and is equal to $C_\Delta(t, t_0)f$.

We consider an external electromagnetic field $E_{ex}(t, x) = (E_{ex1}(t, x), E_{ex2}(t, x), E_{ex3}(t, x)) \in \mathbb{R}^3$ and $B_{ex}(t, x) = (B_{ex1}(t, x), B_{ex2}(t, x), B_{ex3}(t, x)) \in \mathbb{R}^3$ such that $\partial_x^\alpha E_{exj}(t, x)$, $\partial_x^\alpha B_{exj}(t, x)$ and $\partial_t B_{exj}(t, x)$ ($j = 1, 2, 3$) are continuous in $[0, T] \times \mathbb{R}^n$ for all α. Let $\phi_{ex}(t, x) \in \mathbb{R}$ and $A_{ex}(t, x) \in \mathbb{R}^3$ be the electromagnetic potentials for $E_{ex}$ and $B_{ex}$. Then we obtain Theorem 3.3 below. Though Theorem 3.3 gives the generalization of Theorems 3.1 and 3.2, the results are stated separately from Theorems 3.1 and 3.2 to avoid confusion. We replace $\tilde{A}(x^{(j)}, a_{\Lambda'_2})$ in (3.3), (3.10) and (3.11) by $\tilde{A}(x^{(j)}, a_{\Lambda'_2}) + A_{ex}(t, x^{(j)})$. Moreover we add $-\sum_{j=1}^{n} e_j \phi_{ex}(t, x^{(j)})$ to (3.3) and (3.11), and $\sum_{j=1}^{n} e_j \phi_{ex}(t, x^{(j)})$ to (3.10), respectively. Then we have

Theorem 3.3. Besides the assumptions of Theorem 3.1 we suppose as in [19–21] that for any $\alpha \neq 0$ there exist constants $C_\alpha$ and $\delta_\alpha > 0$ satisfying
$$ |\partial_x^\alpha E_{exj}(t, x)| \leq C_\alpha, \qquad |\partial_x^\alpha B_{exj}(t, x)| \leq C_\alpha \langle x \rangle^{-(1 + \delta_\alpha)} \tag{3.15} $$
and
$$ |\partial_x^\alpha A_{exj}(t, x)| \leq C_\alpha, \qquad |\partial_x^\alpha \phi_{ex}(t, x)| \leq C_\alpha \langle x \rangle \tag{3.16} $$
for $j = 1, 2$ and 3 in $[0, T] \times \mathbb{R}^n$. Then, the same assertions as in Theorems 3.1 and 3.2 hold.

Remark 3.7. It follows from [19, Lemma 6.1] that under the assumptions (3.15) there exist $A_{ex}$ and $\phi_{ex}$ satisfying (3.16).

4. The Appearance of the Coulomb Potentials
We will show rigorously that the Coulomb potentials appear as the limit of the second term on the right-hand side of (3.3) and the limit of the second term on the right-hand side of (3.10). This result is well known as a heuristic result in physics (cf. [8, 11]). We will give a rigorous proof in our model. In the Hamiltonian operators of QED models in [12, 14, 16, 32], the Coulomb potentials are assumed from the beginning. Our proof is somewhat delicate.
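Theorem 4.1. Let $L_j$ ($j = 1, 2, 3$) tend to ∞ under the condition
$$ \frac{1}{m_0} \leq \frac{L_i}{L_j} \leq m_0, \quad i, j = 1, 2, 3 \tag{4.1} $$
for a constant $m_0 \geq 1$. Then we have
$$ \begin{aligned} &\lim_{L_1, L_2, L_3 \to \infty}\ \lim_{M_1 \to \infty} \frac{2\pi}{|V|} \sum_{k \in \Lambda_1} \sum_{j,l=1,\, j \neq l}^{n} \frac{e_j e_l \cos k \cdot (x^{(j)} - x^{(l)})}{|k|^2} \\ &\qquad = \lim_{M_1 \to \infty}\ \lim_{L_1, L_2, L_3 \to \infty} \frac{2\pi}{|V|} \sum_{k \in \Lambda_1} \sum_{j,l=1,\, j \neq l}^{n} \frac{e_j e_l \cos k \cdot (x^{(j)} - x^{(l)})}{|k|^2} \\ &\qquad = \frac{1}{2} \sum_{j,l=1,\, j \neq l}^{n} \frac{e_j e_l}{|x^{(j)} - x^{(l)}|} \quad \text{in } S'(\mathbb{R}^{3n}). \end{aligned} \tag{4.2} $$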
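The mechanism behind Theorem 4.1 is that $(2\pi)^3/|V|$ times a sum over the lattice of wave vectors behaves like a Riemann sum, even for an integrand with a $1/|k|^2$ singularity at the origin (this is made precise in Proposition 4.3). The sketch below illustrates this for a cubic box and the test integrand $\Phi(k) = e^{-|k|^2}/|k|^2$, whose integral over $\mathbb{R}^3$ equals $2\pi^{3/2}$. The choice of integrand, the cubic box and the truncation of the sum are assumptions made only for this illustration.

```python
import math

EXACT_INTEGRAL = 2.0 * math.pi ** 1.5   # integral of exp(-|k|^2)/|k|^2 over R^3

def lattice_sum(box_length, k_max=6.0):
    """(2 pi)^3/|V| times the sum of Phi(k) over k != 0 for a cubic box of side box_length.

    Phi(k) = exp(-|k|^2)/|k|^2; components with |k_i| > k_max are dropped, which is
    harmless because of the Gaussian factor. Only the octant s_i >= 0 is visited and
    each point is weighted by the number of its mirror images.
    """
    h = 2.0 * math.pi / box_length
    s_max = int(k_max / h) + 1
    total = 0.0
    for s1 in range(s_max + 1):
        for s2 in range(s_max + 1):
            for s3 in range(s_max + 1):
                if s1 == 0 and s2 == 0 and s3 == 0:
                    continue
                k2 = h * h * (s1 * s1 + s2 * s2 + s3 * s3)
                multiplicity = (2 if s1 else 1) * (2 if s2 else 1) * (2 if s3 else 1)
                total += multiplicity * math.exp(-k2) / k2
    return h ** 3 * total
```

In this example the difference between the lattice sum and the integral shrinks as the box grows, but only slowly, which is one way to see why the singular integrand requires the careful estimates given in the proof of Proposition 4.3.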
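Let $\chi_0(k)$ be the function in $\mathbb{R}^3$ defined by
$$ \chi_0(k) := \begin{cases} 1, & |k| \leq 1, \\ 0, & |k| > 1. \end{cases} \tag{4.3} $$
We first prove

Lemma 4.2. Let $\epsilon > 0$. Then we have
$$ \lim_{\epsilon \to 0} \frac{1}{(2\pi)^2} \sum_{j,l=1,\, j \neq l}^{n} e_j e_l \int \frac{\cos k \cdot (x^{(j)} - x^{(l)})}{|k|^2} \chi_0(\epsilon k)\,dk = \frac{1}{2} \sum_{j,l=1,\, j \neq l}^{n} \frac{e_j e_l}{|x^{(j)} - x^{(l)}|} \quad \text{in } S'(\mathbb{R}^{3n}). \tag{4.4} $$

Proof. Let x and k be in $\mathbb{R}^3$. Then, it is well known that
$$ \frac{1}{(2\pi)^2} \int \frac{e^{ik \cdot x}}{|k|^2}\,dk = \frac{1}{2|x|} \quad \text{in } S'(\mathbb{R}^3) \tag{4.5} $$
(cf. [25, §5.9]). For the sake of simplicity, we consider the case $n = 2$. Let $x = x^{(1)}$ and $y = x^{(2)}$. We will prove
$$ \lim_{\epsilon \to 0} \frac{1}{(2\pi)^2} \int \frac{e^{ik \cdot (x - y)}}{|k|^2} \chi_0(\epsilon k)\,dk = \frac{1}{2|x - y|} \quad \text{in } S'(\mathbb{R}^6). \tag{4.6} $$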
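For a fixed point $x \neq 0$ the cut-off integral can be reduced, in polar coordinates, to $(2\pi)^{-2} \int e^{ik \cdot x} |k|^{-2} \chi_0(\epsilon k)\,dk = (\pi |x|)^{-1} \int_0^{|x|/\epsilon} (\sin t)/t\,dt$, which tends to $1/(2|x|)$ as $\epsilon \to 0$. This pointwise statement is weaker than the distributional convergence claimed in (4.6), but it is easy to check numerically. The sketch below does so with a composite Simpson rule; the function names and the number of integration steps are assumptions for illustration.

```python
import math

def sine_integral(upper, steps=20000):
    """Composite Simpson approximation of the integral of sin(t)/t over [0, upper]."""
    if steps % 2:
        steps += 1

    def sinc(t):
        return 1.0 if t == 0.0 else math.sin(t) / t

    h = upper / steps
    total = sinc(0.0) + sinc(upper)
    for m in range(1, steps):
        total += (4.0 if m % 2 else 2.0) * sinc(m * h)
    return total * h / 3.0

def cutoff_kernel(r, eps):
    """Value at |x| = r of (2 pi)^-2 times the integral of exp(ik.x)/|k|^2 over |k| <= 1/eps."""
    return sine_integral(r / eps) / (math.pi * r)
```

For example, at $|x| = 2$ the values for decreasing ε oscillate around and approach $1/(2|x|) = 0.25$.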
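Let $\varphi(x, y) \in S(\mathbb{R}^6)$. Then, with $\langle \cdot, \cdot \rangle$ understood as distributional pairing, from (4.6) we have
$$ \begin{aligned} &\lim_{\epsilon \to 0} \Big\langle \frac{1}{(2\pi)^2} \int \frac{e^{ik \cdot (x - y)}}{|k|^2} \chi_0(\epsilon k)\,dk,\ \varphi(x, y) \Big\rangle \\ &\quad = \lim_{\epsilon \to 0} \Big\{ \Big\langle \frac{1}{(2\pi)^2} \int \frac{\cos k \cdot (x - y)}{|k|^2} \chi_0(\epsilon k)\,dk,\ \varphi(x, y) \Big\rangle + i \Big\langle \frac{1}{(2\pi)^2} \int \frac{\sin k \cdot (x - y)}{|k|^2} \chi_0(\epsilon k)\,dk,\ \varphi(x, y) \Big\rangle \Big\} \\ &\quad = \lim_{\epsilon \to 0} \Big\langle \frac{1}{(2\pi)^2} \int \frac{\cos k \cdot (x - y)}{|k|^2} \chi_0(\epsilon k)\,dk,\ \varphi(x, y) \Big\rangle = \Big\langle \frac{1}{2|x - y|},\ \varphi(x, y) \Big\rangle. \end{aligned} $$
Consequently we obtain (4.4). Equation (4.6) is equivalent to
$$ \lim_{\epsilon \to 0} \frac{1}{2\pi^2} \Big\langle \int \frac{e^{ik \cdot (x - y)}}{|k|^2} \chi_0(\epsilon k)\,dk,\ \varphi(x, y) \Big\rangle = \iint \frac{\varphi(x, y)}{|x - y|}\,dx\,dy \tag{4.7} $$
for all $\varphi(x, y) \in S(\mathbb{R}^6)$. We set $x' = (x - y)/\sqrt{2}$ and $y' = (x + y)/\sqrt{2}$. Let $\psi_1(x')$ and $\psi_2(y')$ be in $S(\mathbb{R}^3)$. We take $\varphi(x, y) = \tilde{\varphi}(x', y') := \psi_1(x') \psi_2(y')$ in the left-hand side of (4.7). Then the left-hand side of (4.7) is equal to
$$ \begin{aligned} &\lim_{\epsilon \to 0} \frac{1}{2\pi^2} \iiint \frac{e^{ik \cdot (x - y)}}{|k|^2} \chi_0(\epsilon k) \psi_1(x') \psi_2(y')\,dk\,dx'\,dy' \\ &\quad = \lim_{\epsilon \to 0} \frac{1}{2\pi^2} \int \psi_2(y')\,dy' \int \Big( \int \frac{e^{ik \cdot \sqrt{2} x'}}{|k|^2} \chi_0(\epsilon k)\,dk \Big) \psi_1(x')\,dx', \end{aligned} $$
which is also equal to
$$ \int \frac{1}{\sqrt{2}|x'|} \psi_1(x')\,dx' \int \psi_2(y')\,dy' = \iint \frac{\tilde{\varphi}(x', y')}{\sqrt{2}|x'|}\,dx'\,dy' = \iint \frac{\varphi(x, y)}{|x - y|}\,dx\,dy $$
from (4.5). So, (4.7) holds for $\varphi(x, y) = \psi_1(x') \psi_2(y')$. Since the set of all linear combinations of $\psi_1(x') \psi_2(y')$ for all $\psi_1$ and $\psi_2$ in $S(\mathbb{R}^3)$ is dense in $S(\mathbb{R}^6_{x', y'})$, (4.7) holds for all $\varphi(x, y) \in S(\mathbb{R}^6)$. Hence we get (4.6).

Proposition 4.3. Let $c \geq 0$ be a constant. Let $\Phi(k)$ be continuous in $\mathbb{R}^3 \setminus (\{0\} \cup \{|k| = c\})$. We suppose $|\Phi(k)| \leq \phi(|k|)$ ($k \in \mathbb{R}^3$). We assume that $\phi(r)$ is non-increasing in $(0, \infty)$ and that $r^2 \phi(r)$ is in $L^1([0, \infty))$ and is bounded in $(0, \infty)$. Then, $((2\pi)^3/|V|) \sum_{k \neq 0} \Phi(k)$ is absolutely convergent, where the sum of k is taken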
over (2πs1 /L1 , 2πs2 /L2 , 2πs3 /L3 ) (s1 , s2 , s3 = 0, ±1, ±2, . . .). We also get (2π)3 Φ(k) = Φ(k)dk L1 ,L2 ,L3 →∞ |V | lim
(4.8)
k=0
under the condition (4.1). Proof. We write L = (L1 , L2 , L3 ). Let us define the step function ΦL (k) by
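\[
\Phi_L(k) = \Phi\Bigl(\frac{2\pi s_1}{L_1},\frac{2\pi s_2}{L_2},\frac{2\pi s_3}{L_3}\Bigr),\quad
k\in\Bigl(\frac{2\pi(s_1-1)}{L_1},\frac{2\pi s_1}{L_1}\Bigr]\times\Bigl(\frac{2\pi(s_2-1)}{L_2},\frac{2\pi s_2}{L_2}\Bigr]\times\Bigl(\frac{2\pi(s_3-1)}{L_3},\frac{2\pi s_3}{L_3}\Bigr],
\]
\[
\Phi_L(k) = \Phi\Bigl(\frac{2\pi s_1}{L_1},-\frac{2\pi s_2}{L_2},\frac{2\pi s_3}{L_3}\Bigr),\quad
k\in\Bigl(\frac{2\pi(s_1-1)}{L_1},\frac{2\pi s_1}{L_1}\Bigr]\times\Bigl[-\frac{2\pi s_2}{L_2},-\frac{2\pi(s_2-1)}{L_2}\Bigr)\times\Bigl(\frac{2\pi(s_3-1)}{L_3},\frac{2\pi s_3}{L_3}\Bigr]
\]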
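for $s_1, s_2, s_3 = 1, 2, \ldots$. Then, for $k \in (2\pi(s_1-1)/L_1, 2\pi s_1/L_1] \times (2\pi(s_2-1)/L_2, 2\pi s_2/L_2] \times (2\pi(s_3-1)/L_3, 2\pi s_3/L_3]$ we have
\[
|\Phi_L(k)| = \Bigl|\Phi\Bigl(\frac{2\pi s_1}{L_1},\frac{2\pi s_2}{L_2},\frac{2\pi s_3}{L_3}\Bigr)\Bigr|
\le \varphi\Bigl(\Bigl|\Bigl(\frac{2\pi s_1}{L_1},\frac{2\pi s_2}{L_2},\frac{2\pi s_3}{L_3}\Bigr)\Bigr|\Bigr) \le \varphi(|k|)
\]
since $\varphi(r)$ is non-increasing. In the same way, for $k \in (2\pi(s_1-1)/L_1, 2\pi s_1/L_1] \times [-2\pi s_2/L_2, -2\pi(s_2-1)/L_2) \times (2\pi(s_3-1)/L_3, 2\pi s_3/L_3]$ we get
\[
|\Phi_L(k)| \le \varphi(|k|). \tag{4.9}
\]
In the same way as above, we can define the step function $\Phi_L(k)$ for all $k \in \mathbb{R}^3\setminus\{0\}$ such that (4.9) and
\[
\frac{(2\pi)^3}{|V|}\sum_{k\ne0}\Phi(k) = \int_{\mathbb{R}^3}\Phi_L(k)\,dk + \frac{(2\pi)^3}{|V|}\sum_{k\ne0,\ s_1s_2s_3=0}\Phi(k) \tag{4.10}
\]
hold.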
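For a short while we suppose $L_1 \le L_2 \le L_3$. Since $\varphi(r)$ is non-increasing, for $s_1 \ge 2$ we have
\[
\varphi\Bigl(\Bigl|\Bigl(\frac{2\pi s_1}{L_1},0,0\Bigr)\Bigr|\Bigr)
\le \varphi\Bigl(\Bigl|\Bigl(\frac{2\pi(s_1-1)}{L_1},\frac{2\pi}{L_2},\frac{2\pi}{L_3}\Bigr)\Bigr|\Bigr) \le \varphi(|k|),\quad
k\in\Bigl[\frac{2\pi(s_1-2)}{L_1},\frac{2\pi(s_1-1)}{L_1}\Bigr]\times\Bigl[0,\frac{2\pi}{L_2}\Bigr]\times\Bigl[0,\frac{2\pi}{L_3}\Bigr]
\]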
June 2, 2010 14:55 WSPC/S0129-055X
148-RMP
J070-00403
Feynman Path Integral for Nonrelativistic Quantum Electrodynamics
565
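and also for $s_1 \ge 2$ and $s_2 \ge 1$
\[
\varphi\Bigl(\Bigl|\Bigl(\frac{2\pi s_1}{L_1},\frac{2\pi s_2}{L_2},0\Bigr)\Bigr|\Bigr)
\le \varphi\Bigl(\Bigl|\Bigl(\frac{2\pi(s_1-1)}{L_1},\frac{2\pi s_2}{L_2},\frac{2\pi}{L_3}\Bigr)\Bigr|\Bigr) \le \varphi(|k|),\quad
k\in\Bigl[\frac{2\pi(s_1-2)}{L_1},\frac{2\pi(s_1-1)}{L_1}\Bigr]\times\Bigl[\frac{2\pi(s_2-1)}{L_2},\frac{2\pi s_2}{L_2}\Bigr]\times\Bigl[0,\frac{2\pi}{L_3}\Bigr].
\]
For $s_2 \ge 2$, we also have
\[
\varphi\Bigl(\Bigl|\Bigl(\frac{2\pi}{L_1},\frac{2\pi s_2}{L_2},0\Bigr)\Bigr|\Bigr)
\le \varphi\Bigl(\Bigl|\Bigl(\frac{2\pi}{L_1},\frac{2\pi(s_2-1)}{L_2},\frac{2\pi}{L_3}\Bigr)\Bigr|\Bigr) \le \varphi(|k|),\quad
k\in\Bigl[0,\frac{2\pi}{L_1}\Bigr]\times\Bigl[\frac{2\pi(s_2-2)}{L_2},\frac{2\pi(s_2-1)}{L_2}\Bigr]\times\Bigl[0,\frac{2\pi}{L_3}\Bigr].
\]
Thus we get
\[
\begin{aligned}
\frac{(2\pi)^3}{|V|}\sum_{k\ne0,\ s_3=0}|\Phi(k)|
&\le \frac{(2\pi)^3}{|V|}\sum_{k\ne0,\ s_3=0}\varphi(|k|)\\
&\le \frac{(2\pi)^3}{|V|}\sum_{k\ne0,\ s_3=0,\ s_1,s_2=0,\pm1}\varphi(|k|)
+ \frac{(2\pi)^3}{|V|}\sum_{k\ne0,\ s_3=s_1=0,\ |s_2|\ge2}\varphi(|k|)
+ 10\int_{0\le k_3\le(2\pi)/L_3}\varphi(|k|)\,dk.
\end{aligned} \tag{4.11}
\]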
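We can take a constant $1 \le m \le m_0$ from (4.1) such that $L_2 \le mL_1 \le L_3$. We add the refinement $\{((2\pi)/(mL_1), (2\pi s_2)/L_2, (2\pi s_3)/L_3);\ s_2, s_3 = 0, \pm1, \pm2, \ldots\}$ to $\{((2\pi s_1)/L_1, (2\pi s_2)/L_2, (2\pi s_3)/L_3);\ s_1, s_2, s_3 = 0, \pm1, \pm2, \ldots\}$. Then, for $s_2 \ge 2$, noting
\[
\varphi\Bigl(\Bigl|\Bigl(0,\frac{2\pi s_2}{L_2},0\Bigr)\Bigr|\Bigr)
\le \varphi\Bigl(\Bigl|\Bigl(\frac{2\pi}{mL_1},\frac{2\pi(s_2-1)}{L_2},\frac{2\pi}{L_3}\Bigr)\Bigr|\Bigr) \le \varphi(|k|),\quad
k\in\Bigl[0,\frac{2\pi}{mL_1}\Bigr]\times\Bigl[\frac{2\pi(s_2-2)}{L_2},\frac{2\pi(s_2-1)}{L_2}\Bigr]\times\Bigl[0,\frac{2\pi}{L_3}\Bigr],
\]
we have
\[
\frac{(2\pi)^3}{m|V|}\sum_{k\ne0,\ s_3=s_1=0,\ |s_2|\ge2}\varphi(|k|)
\le 2\int_{0\le k_1\le(2\pi)/(mL_1),\ 0\le k_3\le(2\pi)/L_3}\varphi(|k|)\,dk
\le 2\int_{0\le k_3\le(2\pi)/L_3}\varphi(|k|)\,dk.
\]
Consequently, from (4.11), we get
\[
\frac{(2\pi)^3}{|V|}\sum_{k\ne0,\ s_3=0}|\Phi(k)|
\le \frac{(2\pi)^3}{|V|}\sum_{k\ne0,\ s_3=0,\ s_1,s_2=0,\pm1}\varphi(|k|)
+ 2(5+m_0)\int_{0\le k_3\le(2\pi)/L_3}\varphi(|k|)\,dk. \tag{4.12}
\]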
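Let us consider the case of general $L_1$, $L_2$ and $L_3$. We may suppose $L_1 \le L_2$. Noting $L_2 \le m_0L_3$ from (4.1), we add the refinement $\{((2\pi s_1)/L_1, (2\pi s_2)/L_2, (2\pi)/(m_0L_3));\ s_1, s_2 = 0, \pm1, \pm2, \ldots\}$ to $\{((2\pi s_1)/L_1, (2\pi s_2)/L_2, (2\pi s_3)/L_3);\ s_1, s_2, s_3 = 0, \pm1, \pm2, \ldots\}$. Then, as in the proof of (4.11), for $s_1 \ge 2$ we have
\[
\varphi\Bigl(\Bigl|\Bigl(\frac{2\pi s_1}{L_1},0,0\Bigr)\Bigr|\Bigr)
\le \varphi\Bigl(\Bigl|\Bigl(\frac{2\pi(s_1-1)}{L_1},\frac{2\pi}{L_2},\frac{2\pi}{m_0L_3}\Bigr)\Bigr|\Bigr) \le \varphi(|k|),\quad
k\in\Bigl[\frac{2\pi(s_1-2)}{L_1},\frac{2\pi(s_1-1)}{L_1}\Bigr]\times\Bigl[0,\frac{2\pi}{L_2}\Bigr]\times\Bigl[0,\frac{2\pi}{m_0L_3}\Bigr]
\]
and also for $s_1 \ge 2$ and $s_2 \ge 1$,
\[
\varphi\Bigl(\Bigl|\Bigl(\frac{2\pi s_1}{L_1},\frac{2\pi s_2}{L_2},0\Bigr)\Bigr|\Bigr)
\le \varphi\Bigl(\Bigl|\Bigl(\frac{2\pi(s_1-1)}{L_1},\frac{2\pi s_2}{L_2},\frac{2\pi}{m_0L_3}\Bigr)\Bigr|\Bigr) \le \varphi(|k|),\quad
k\in\Bigl[\frac{2\pi(s_1-2)}{L_1},\frac{2\pi(s_1-1)}{L_1}\Bigr]\times\Bigl[\frac{2\pi(s_2-1)}{L_2},\frac{2\pi s_2}{L_2}\Bigr]\times\Bigl[0,\frac{2\pi}{m_0L_3}\Bigr].
\]
For $s_2 \ge 2$ we also have
\[
\varphi\Bigl(\Bigl|\Bigl(\frac{2\pi}{L_1},\frac{2\pi s_2}{L_2},0\Bigr)\Bigr|\Bigr)
\le \varphi\Bigl(\Bigl|\Bigl(\frac{2\pi}{L_1},\frac{2\pi(s_2-1)}{L_2},\frac{2\pi}{m_0L_3}\Bigr)\Bigr|\Bigr) \le \varphi(|k|),\quad
k\in\Bigl[0,\frac{2\pi}{L_1}\Bigr]\times\Bigl[\frac{2\pi(s_2-2)}{L_2},\frac{2\pi(s_2-1)}{L_2}\Bigr]\times\Bigl[0,\frac{2\pi}{m_0L_3}\Bigr].
\]
Hence we can prove
\[
\begin{aligned}
\frac{(2\pi)^3}{|V|}\sum_{k\ne0,\ s_3=0}|\Phi(k)|
&\le m_0\,\frac{(2\pi)^3}{m_0|V|}\sum_{k\ne0,\ s_3=0}\varphi(|k|)\\
&\le \frac{(2\pi)^3}{|V|}\sum_{k\ne0,\ s_3=0,\ s_1,s_2=0,\pm1}\varphi(|k|)
+ \frac{(2\pi)^3}{|V|}\sum_{k\ne0,\ s_3=s_1=0,\ |s_2|\ge2}\varphi(|k|)
+ 10m_0\int_{0\le k_3\le(2\pi)/(m_0L_3)}\varphi(|k|)\,dk
\end{aligned}
\]
as in the proof of (4.11) and so
\[
\frac{(2\pi)^3}{|V|}\sum_{k\ne0,\ s_3=0}|\Phi(k)|
\le \frac{(2\pi)^3}{|V|}\sum_{k\ne0,\ s_3=0,\ s_1,s_2=0,\pm1}\varphi(|k|)
+ 2m_0(5+m_0)\int_{0\le k_3\le(2\pi)/(m_0L_3)}\varphi(|k|)\,dk
\]
as in the proof of (4.12). Thus, for general $L_1$, $L_2$ and $L_3$ we obtain
\[
\frac{(2\pi)^3}{|V|}\sum_{k\ne0,\ s_j=0}|\Phi(k)|
\le \frac{(2\pi)^3}{|V|}\sum_{k\ne0,\ s_1,s_2,s_3=0,\pm1}\varphi(|k|)
+ 2m_0(5+m_0)\int_{0\le k_j\le(2\pi)/(m_0L_j)}\varphi(|k|)\,dk\quad (j = 1, 2, 3). \tag{4.13}
\]
We assumed that $r^2\varphi(r)$ is in $L^1([0,\infty))$. So, from (4.9), (4.10) and (4.13) we can prove that $\sum_{k\ne0}|\Phi(k)|$ is convergent. In addition, since $r^2\varphi(r)$ is assumed to be bounded in $(0, \infty)$,
\[
0 \le \varphi(|k|) \le \mathrm{Const.}\,\frac{1}{|k|^2},\quad k \ne 0
\]
holds. So we see that $((2\pi)^3/|V|)\sum_{k\ne0,\ s_1,s_2,s_3=0,\pm1}\varphi(|k|)$ tends to zero as $L_1$, $L_2$ and $L_3$ tend to infinity under the condition (4.1). Consequently, from (4.13), we have
\[
\lim_{L_1,L_2,L_3\to\infty}\frac{(2\pi)^3}{|V|}\sum_{k\ne0,\ s_j=0}\Phi(k) = 0,\quad j = 1, 2, 3
\]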
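under (4.1). Hence, noting (4.9), from (4.10) we obtain (4.8) by means of the Lebesgue dominated convergence theorem.

Now we will prove Theorem 4.1. For the sake of simplicity, let $n = 2$. Let $\chi_0(k)$ be the function defined by (4.3). We write $x = x^{(1)}$ and $y = x^{(2)}$. We take $\phi(x, y) \in \mathcal{S}(\mathbb{R}^6)$. Then, we have
\[
\begin{aligned}
\Bigl\langle\frac{(2\pi)^3}{|V|}\sum_{k\ne0}\frac{\cos k\cdot(x-y)}{|k|^2}\chi_0(\epsilon k),\ \phi(x,y)\Bigr\rangle
&= \frac{(2\pi)^3}{|V|}\sum_{k\ne0}\iint\frac{\cos k\cdot(x-y)}{|k|^2}\chi_0(\epsilon k)\phi(x,y)\,dx\,dy\\
&= \frac{(2\pi)^3}{|V|}\sum_{k\ne0}\iint\frac{\cos k\cdot(x-y)}{|k|^2\langle k\rangle^2}\chi_0(\epsilon k)\langle D_x\rangle^2\phi(x,y)\,dx\,dy,
\end{aligned} \tag{4.14}
\]
where we define $\langle D_x\rangle^2 := (1 - \sum_{j=1}^{n}\partial_{x_j}^2)$. Let $\Phi(k) = |k|^{-2}\langle k\rangle^{-2}\iint\cos k\cdot(x-y)\,\langle D_x\rangle^2\phi(x,y)\,dx\,dy$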
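and
\[
\varphi(|k|) := \frac{1}{|k|^2\langle k\rangle^2}\iint|\langle D_x\rangle^2\phi(x,y)|\,dx\,dy.
\]
Then from (4.14), Proposition 4.3 shows
\[
\begin{aligned}
&\lim_{L_1,L_2,L_3\to\infty}\lim_{\epsilon\to0}\Bigl\langle\frac{(2\pi)^3}{|V|}\sum_{k\ne0}\frac{\cos k\cdot(x-y)}{|k|^2}\chi_0(\epsilon k),\ \phi(x,y)\Bigr\rangle\\
&\quad= \lim_{L_1,L_2,L_3\to\infty}\frac{(2\pi)^3}{|V|}\sum_{k\ne0}\iint\frac{\cos k\cdot(x-y)}{|k|^2\langle k\rangle^2}\langle D_x\rangle^2\phi(x,y)\,dx\,dy\\
&\quad= \int\frac{dk}{|k|^2\langle k\rangle^2}\iint(\cos k\cdot(x-y))\langle D_x\rangle^2\phi(x,y)\,dx\,dy.
\end{aligned} \tag{4.15}
\]
In the same way from (4.14), we also have
\[
\lim_{\epsilon\to0}\lim_{L_1,L_2,L_3\to\infty}\Bigl\langle\frac{(2\pi)^3}{|V|}\sum_{k\ne0}\frac{\cos k\cdot(x-y)}{|k|^2}\chi_0(\epsilon k),\ \phi(x,y)\Bigr\rangle
= \int\frac{dk}{|k|^2\langle k\rangle^2}\iint(\cos k\cdot(x-y))\langle D_x\rangle^2\phi(x,y)\,dx\,dy. \tag{4.16}
\]
On the other hand, Lemma 4.2 and Proposition 4.3 indicate
\[
\begin{aligned}
\lim_{\epsilon\to0}\lim_{L_1,L_2,L_3\to\infty}\Bigl\langle\frac{(2\pi)^3}{|V|}\sum_{k\ne0}\frac{\cos k\cdot(x-y)}{|k|^2}\chi_0(\epsilon k),\ \phi(x,y)\Bigr\rangle
&= \lim_{\epsilon\to0}\iiint\frac{\cos k\cdot(x-y)}{|k|^2}\chi_0(\epsilon k)\phi(x,y)\,dx\,dy\,dk\\
&= 2\pi^2\iint\frac{\phi(x,y)}{|x-y|}\,dx\,dy.
\end{aligned} \tag{4.17}
\]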
Hence we obtain (4.2) together with (4.15) and (4.16).

Remark 4.1. Let $\chi(k) \in \mathcal{S}(\mathbb{R}^3)$ such that $\chi(0) = 1$ and $\chi(-k) = \chi(k)$. We take the limit of $L_j$ $(j = 1, 2, 3)$ under the condition (4.1). Then it holds that
\[
\lim_{\epsilon\to0}\lim_{L_1,L_2,L_3\to\infty}\frac{2\pi}{|V|}\sum_{k\ne0}\ \sum_{j,l=1,\ j\ne l}^{n}\chi(\epsilon k)\frac{e_je_l\cos k\cdot(x^{(j)}-x^{(l)})}{|k|^2}
= \frac12\sum_{j,l=1,\ j\ne l}^{n}\frac{e_je_l}{|x^{(j)}-x^{(l)}|} \tag{4.18}
\]
pointwise for $x \in \mathbb{R}^{3n}$ such that $x^{(j)} - x^{(l)} \ne 0$ $(j, l = 1, 2, \ldots, n,\ j \ne l)$. The proof is easy. Consider the case $n = 2$ and $e_1 = e_2 = 1$. Let us write $x = x^{(1)}$ and $y = x^{(2)}$.
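Both Theorem 4.1 and this remark rest on the Riemann-sum limit (4.8) of Proposition 4.3. The following is only a numerical illustration of that limit, not part of the proof. It assumes equal side lengths $L_1 = L_2 = L_3 = L$ (so $|V| = L^3$), a hypothetical test function $\Phi(k) = |k|^{-2}(1+|k|^2)^{-2}$ whose integral over $\mathbb{R}^3$ is $\pi^2$, and a cutoff that truncates the absolutely convergent sum; none of these choices come from the paper.

```python
import math

def phi(k1, k2, k3):
    """Hypothetical test function Phi(k) = 1 / (|k|^2 (1 + |k|^2)^2)."""
    r2 = k1 * k1 + k2 * k2 + k3 * k3
    return 1.0 / (r2 * (1.0 + r2) ** 2)

def lattice_sum(L, cutoff=8.0):
    """(2*pi)^3/|V| times the sum of Phi over k != 0 in (2*pi/L) Z^3, |k_j| <= cutoff."""
    h = 2.0 * math.pi / L      # lattice spacing in k-space
    S = int(cutoff / h)        # truncation of the absolutely convergent sum
    total = 0.0
    for s1 in range(-S, S + 1):
        for s2 in range(-S, S + 1):
            for s3 in range(-S, S + 1):
                if s1 == 0 and s2 == 0 and s3 == 0:
                    continue   # the term k = 0 is excluded, as in (4.8)
                total += phi(h * s1, h * s2, h * s3)
    return h ** 3 * total      # (2*pi)^3 / |V| equals h^3 when |V| = L^3

# Integral of Phi over R^3: 4*pi * int_0^inf dr / (1 + r^2)^2 = pi^2.
exact = math.pi ** 2
errors = [abs(lattice_sum(L) - exact) for L in (4 * math.pi, 8 * math.pi)]
```

In this sketch the error shrinks when $L$ is doubled, but only slowly, which is what one would expect from the $|k|^{-2}$ singularity at the excluded point $k = 0$.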
We take $\chi_1(k) \in C^\infty(\mathbb{R}^3)$ such that $\chi_1(k) = 1$ $(|k| \le 1)$ and $\chi_1(k) = 0$ $(|k| \ge 2)$. Then, Proposition 4.3 says for $x \ne y$ that the left-hand side of (4.18) is equal to
\[
\begin{aligned}
&\frac{1}{(2\pi)^2}\lim_{\epsilon\to0}\int\chi(\epsilon k)\frac{\cos k\cdot(x-y)}{|k|^2}\,dk\\
&\quad= \frac{1}{(2\pi)^2}\lim_{\epsilon\to0}\Bigl[\int\chi_1(k)\chi(\epsilon k)\frac{\cos k\cdot(x-y)}{|k|^2}\,dk
- \frac{1}{|x-y|^2}\int(\cos k\cdot(x-y))\,\Delta_k\{(1-\chi_1(k))\chi(\epsilon k)|k|^{-2}\}\,dk\Bigr]\\
&\quad= \frac{1}{(2\pi)^2}\Bigl[\int\chi_1(k)\frac{\cos k\cdot(x-y)}{|k|^2}\,dk
- \frac{1}{|x-y|^2}\int(\cos k\cdot(x-y))\,\Delta_k\{(1-\chi_1(k))|k|^{-2}\}\,dk\Bigr]
\end{aligned} \tag{4.19}
\]
pointwise, where $\Delta_k$ denotes the Laplacian operator with respect to $k \in \mathbb{R}^3$ and we used, for a first derivative $\chi'$ of $\chi$, $|\epsilon\chi'(\epsilon k)| = \epsilon^{1/3}|k|^{-2/3}(\epsilon|k|)^{2/3}|\chi'(\epsilon k)| \le \mathrm{Const.}\,\epsilon^{1/3}|k|^{-2/3}$. Since we have $|\Delta_k\{(1-\chi_1(k))\chi(\epsilon k)|k|^{-2}\}| \le C\langle k\rangle^{-3-1/3}$ with a constant $C$ independent of $\epsilon$, we can prove that Eq. (4.19) is also true in the distribution sense $\mathcal{S}'(\mathbb{R}^6)$. On the other hand, we see as in the proof of Lemma 4.2 that the left-hand side of (4.19) is equal to $1/(2|x-y|)$ in $\mathcal{S}'(\mathbb{R}^6)$. Consequently we can prove that (4.19) is equal to $1/(2|x-y|)$. Hence (4.18) holds pointwise.

5. The Expression for the Vacuum and the States of Photons

In this section, we express the vacuum and the states of photons with given momenta and polarizations concretely as functions of variables $a_\Lambda$ consisting of the Fourier coefficients of vector potentials. In [11, Problem 9-8] only the vacuum and the state of a photon of momentum $k$ and polarization state $l$ are expressed concretely. In this section, we generalize this result in [11] to general states of photons. In physics, the vacuum and the states of photons are usually treated abstractly rather than concretely (cf. [29, 33]). We also note that the states of photons of given momenta and polarizations are not discussed in the study of QED models defined by means of the functional method (cf. [12, 14, 16, 32]), because in the functional method each photon with polarizations is expressed by an amplitude of momentum in $L^2(\mathbb{R}^3) \oplus L^2(\mathbb{R}^3)$ as stated in the introduction.
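To write down the vacuum and the state of photons concretely, we will introduce creation operators and annihilation operators acting on the space $\mathcal{S}'(\mathbb{R}^{4N}_{a_\Lambda})$. Let us define
\[
\hat a^{(i)}_{lk} := i\sqrt{\frac{|V|}{2\hbar c|k|}}\Bigl(\frac{\hbar}{i}\frac{\partial}{\partial a^{(i)}_{lk}} - i\frac{c|k|}{|V|}a^{(i)}_{lk}\Bigr)
= \sqrt{\frac{|V|}{2\hbar c|k|}}\Bigl(\hbar\frac{\partial}{\partial a^{(i)}_{lk}} + \frac{c|k|}{|V|}a^{(i)}_{lk}\Bigr) \tag{5.1}
\]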
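acting on the space $\mathcal{S}'(\mathbb{R}^{4N}_{a_\Lambda})$ for $k \in \Lambda$ and $i, l = 1, 2$. From (2.13) we have
\[
\hat a^{(1)}_{l,-k} = -\hat a^{(1)}_{lk},\qquad \hat a^{(2)}_{l,-k} = \hat a^{(2)}_{lk}.
\]
Let $\hat a^{(i)\dagger}_{lk}$ denote the formal adjoint operator $\sqrt{|V|/(2\hbar c|k|)}\,(-\hbar\,\partial/\partial a^{(i)}_{lk} + c|k|a^{(i)}_{lk}/|V|)$ of $\hat a^{(i)}_{lk}$ acting on the space $\mathcal{S}'(\mathbb{R}^{4N}_{a_\Lambda})$. For $f \in \mathcal{S}'(\mathbb{R}^{4N}_{a_\Lambda})$ and $g \in \mathcal{S}(\mathbb{R}^{4N}_{a_\Lambda})$ we have
\[
(\hat a^{(i)}_{lk}f, g) = (f, \hat a^{(i)\dagger}_{lk}g)
\]
from the definition of the distribution. So, $\hat a^{(i)}_{lk}$ is continuous from $\mathcal{S}'(\mathbb{R}^{4N}_{a_\Lambda})$ into $\mathcal{S}'(\mathbb{R}^{4N}_{a_\Lambda})$ in weak topology. In the same way $\hat a^{(i)\dagger}_{lk}$ is continuous from $\mathcal{S}'(\mathbb{R}^{4N}_{a_\Lambda})$ into $\mathcal{S}'(\mathbb{R}^{4N}_{a_\Lambda})$ in weak topology. We can easily see from (5.1) that the commutator relations
\[
[\hat a^{(i)}_{lk}, \hat a^{(i')\dagger}_{l'k'}] = \delta_{ii'}\delta_{ll'}\delta_{kk'},\qquad
[\hat a^{(i)}_{lk}, \hat a^{(i')}_{l'k'}] = 0
\]
on $\mathcal{S}(\mathbb{R}^{4N}_{a_\Lambda})$ and so on $\mathcal{S}'(\mathbb{R}^{4N}_{a_\Lambda})$ hold for $k$ and $k'$ in the bounded domain $\Lambda$ (cf. [7, §34] and [6, 30]), because $\mathcal{S}(\mathbb{R}^{4N}_{a_\Lambda})$ is dense in $\mathcal{S}'(\mathbb{R}^{4N}_{a_\Lambda})$ in weak topology (cf. [26]) and the operators on both sides above are continuous in $\mathcal{S}'(\mathbb{R}^{4N}_{a_\Lambda})$. We define the operator $\hat a_{lk}$ acting on $\mathcal{S}'(\mathbb{R}^{4N}_{a_\Lambda})$ for $k \in \Lambda$ and $l = 1, 2$ by
\[
\hat a_{lk} := \frac{\hat a^{(1)}_{lk} - i\hat a^{(2)}_{lk}}{\sqrt2} \tag{5.2}
\]
(cf. (2.11)). We call $\hat a_{lk}$ the annihilation operator and $\hat a^\dagger_{lk}$ the creation operator. We can easily see from the commutator relations for $\hat a^{(i)}_{lk}$ that the operators $\hat a_{lk}$ and $\hat a^\dagger_{lk}$ also satisfy the commutator relations
\[
[\hat a_{lk}, \hat a^\dagger_{l'k'}] = \delta_{ll'}\delta_{kk'},\qquad [\hat a_{lk}, \hat a_{l'k'}] = 0 \tag{5.3}
\]
on $\mathcal{S}'(\mathbb{R}^{4N}_{a_\Lambda})$ for $k$ and $k'$ in $\Lambda$ (cf. [29, (2.26)]). It follows from the commutator relations (5.3) that we have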
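\[
\hat a_{lk}(\hat a^\dagger_{lk})^{n'} - (\hat a^\dagger_{lk})^{n'}\hat a_{lk} = n'(\hat a^\dagger_{lk})^{n'-1} \tag{5.4}
\]
on $\mathcal{S}'(\mathbb{R}^{4N}_{a_\Lambda})$ (cf. [7, §34]). Then we get the following expression as in physics (cf., e.g., [29, (2.60) and (2.64)], and [33, (6.165) and (6.172)]).

Proposition 5.1. We can write the last term of $H(t)$ defined by (3.10) as
\[
H_{\mathrm{rad}} := \sum_{k\in\Lambda',\,l}\ \sum_{i=1}^{2}\Bigl\{\frac{|V|}{2}\Bigl(\frac{\hbar}{i}\frac{\partial}{\partial a^{(i)}_{lk}}\Bigr)^2 + \frac{(c|k|)^2}{2|V|}(a^{(i)}_{lk})^2 - \frac{\hbar c|k|}{2}\Bigr\}
= \sum_{k\in\Lambda,\,l}\hbar c|k|\,\hat a^\dagger_{lk}\hat a_{lk} \tag{5.5}
\]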
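on $\mathcal{S}'(\mathbb{R}^{4N}_{a_\Lambda})$. The vector potential $A(x, a_{\Lambda_2})$ defined by (2.14), where the sum of $k$ is taken over $\Lambda_2$, is given for each $x \in \mathbb{R}^3$ by the expression
\[
A(x, a_{\Lambda_2}) = \sqrt{\frac{4\pi\hbar}{|V|}}\;c\sum_{k\in\Lambda_2}\sum_{l=1}^{2}\frac{1}{\sqrt{2c|k|}}\bigl(\hat a_{lk}e^{ik\cdot x} + \hat a^\dagger_{lk}e^{-ik\cdot x}\bigr)\vec e_l(k) \tag{5.6}
\]
acting on $\mathcal{S}'(\mathbb{R}^{4N}_{a_\Lambda})$.

Proof. Since from (5.1) and (5.2) we have
\[
\begin{aligned}
\hbar c|k|\bigl(\hat a^\dagger_{lk}\hat a_{lk} + \hat a^\dagger_{l,-k}\hat a_{l,-k}\bigr)
&= \frac{\hbar c|k|}{2}\bigl\{(\hat a^{(1)\dagger}_{lk} + i\hat a^{(2)\dagger}_{lk})(\hat a^{(1)}_{lk} - i\hat a^{(2)}_{lk}) + (-\hat a^{(1)\dagger}_{lk} + i\hat a^{(2)\dagger}_{lk})(-\hat a^{(1)}_{lk} - i\hat a^{(2)}_{lk})\bigr\}\\
&= \hbar c|k|\bigl(\hat a^{(1)\dagger}_{lk}\hat a^{(1)}_{lk} + \hat a^{(2)\dagger}_{lk}\hat a^{(2)}_{lk}\bigr)\\
&= \sum_{i=1}^{2}\Bigl\{\frac{|V|}{2}\Bigl(\frac{\hbar}{i}\frac{\partial}{\partial a^{(i)}_{lk}}\Bigr)^2 + \frac{(c|k|)^2}{2|V|}(a^{(i)}_{lk})^2 - \frac{\hbar c|k|}{2}\Bigr\}
\end{aligned}
\]
on $\mathcal{S}(\mathbb{R}^{4N}_{a_\Lambda})$ for $k \in \Lambda$, we get (5.5) on $\mathcal{S}'(\mathbb{R}^{4N}_{a_\Lambda})$ in the same way as before. From (5.1) and (5.2), we have
\[
\begin{aligned}
\hat a_{lk}e^{ik\cdot x} + \hat a^\dagger_{lk}e^{-ik\cdot x}
&= \frac{1}{\sqrt2}\bigl\{(\hat a^{(1)}_{lk} + \hat a^{(1)\dagger}_{lk})\cos k\cdot x - i(\hat a^{(2)}_{lk} - \hat a^{(2)\dagger}_{lk})\cos k\cdot x\\
&\qquad\quad + i(\hat a^{(1)}_{lk} - \hat a^{(1)\dagger}_{lk})\sin k\cdot x + (\hat a^{(2)}_{lk} + \hat a^{(2)\dagger}_{lk})\sin k\cdot x\bigr\}\\
&= \sqrt{\frac{|V|}{\hbar c|k|}}\Bigl\{\frac{c|k|}{|V|}a^{(1)}_{lk}\cos k\cdot x + \frac{c|k|}{|V|}a^{(2)}_{lk}\sin k\cdot x
- i\hbar(\cos k\cdot x)\frac{\partial}{\partial a^{(2)}_{lk}} + i\hbar(\sin k\cdot x)\frac{\partial}{\partial a^{(1)}_{lk}}\Bigr\}
\end{aligned}
\]
on $\mathcal{S}(\mathbb{R}^{4N}_{a_\Lambda})$. So, it is shown from (2.8) and (2.13) that
\[
\sum_{k\in\Lambda_2}\frac{1}{\sqrt{2c|k|}}\bigl(\hat a_{lk}e^{ik\cdot x} + \hat a^\dagger_{lk}e^{-ik\cdot x}\bigr)\vec e_l(k)
= \sum_{k\in\Lambda_2}\frac{1}{\sqrt{2\hbar|V|}}\bigl(a^{(1)}_{lk}\cos k\cdot x + a^{(2)}_{lk}\sin k\cdot x\bigr)\vec e_l(k)
\]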
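on $\mathcal{S}(\mathbb{R}^{4N}_{a_\Lambda})$. Hence, we see that the right-hand side of (5.6) is equal to
\[
\frac{\sqrt{4\pi}}{|V|}\,c\sum_{k\in\Lambda_2}\sum_{l=1}^{2}\frac{1}{\sqrt2}\bigl(a^{(1)}_{lk}\cos k\cdot x + a^{(2)}_{lk}\sin k\cdot x\bigr)\vec e_l(k)
\]
on $\mathcal{S}'(\mathbb{R}^{4N}_{a_\Lambda})$, which is equal to the left-hand side of (5.6) from (2.14).

We know
\[
\int_{-\infty}^{\infty}e^{-a\theta^2}\,d\theta = \sqrt{\frac{\pi}{a}}
\]
for a constant $a > 0$. So, we can easily see from (5.2) and (5.5) that
\[
\Psi_0(a_\Lambda) := \prod_{k\in\Lambda',\,l}\sqrt{\frac{c|k|}{\pi\hbar|V|}}\exp\Bigl\{-\frac{c|k|}{2\hbar|V|}\bigl(a^{(1)2}_{lk} + a^{(2)2}_{lk}\bigr)\Bigr\} \tag{5.7}
\]
is the normalized ground state of $H_{\mathrm{rad}}$, called the vacuum, whose energy is 0, i.e.
\[
H_{\mathrm{rad}}\Psi_0 = 0 \tag{5.8}
\]
and that we have
\[
\hat a^\dagger_{lk}\Psi_0 = \sqrt{\frac{2c|k|}{\hbar|V|}}\,a^*_{lk}\Psi_0,\qquad \hat a_{lk}\Psi_0 = 0\qquad (k \in \Lambda) \tag{5.9}
\]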
(cf. [11, §8-1, (9-43) and Problem 9-8]). We know that the eigenvalue 0 of (5.8) is simple in $L^2(\mathbb{R}^{4N})$ (cf. [4, Chap. 3, Theorem 3.4]).

The function $\Psi_{n'lk}(a_\Lambda) := (\hat a^\dagger_{lk})^{n'}\Psi_0(a_\Lambda)$ $(k \in \Lambda,\ n' = 0, 1, 2, \ldots)$, which can be written concretely from (5.1), (5.2) and (5.7), expresses the state of $n'$ photons of momentum $k$ and polarization state $l$ (cf. [11, §9-2] and [29, §2-2]) and satisfies
\[
\sum_{k\in\Lambda,\,l}\hat a^\dagger_{lk}\hat a_{lk}\Psi_{n'l'k'} = n'\Psi_{n'l'k'},\qquad
\sum_{k\in\Lambda,\,l}\hbar k\,\hat a^\dagger_{lk}\hat a_{lk}\Psi_{n'l'k'} = n'(\hbar k')\Psi_{n'l'k'}
\]
and $H_{\mathrm{rad}}\Psi_{n'l'k'} = n'(\hbar c|k'|)\Psi_{n'l'k'}$ from (5.4), (5.5) and (5.9). The operators $\sum_{k\in\Lambda,\,l}\hat a^\dagger_{lk}\hat a_{lk}$ and $\sum_{k\in\Lambda,\,l}\hbar k\,\hat a^\dagger_{lk}\hat a_{lk}$ are called the total number operator and the momentum operator, respectively (cf. [6], and [29, (2.68) and (2.80)]). Let $n'(l, k) \ge 0$ be integers. Then $\prod_{k\in\Lambda,\,l}(\hat a^\dagger_{lk})^{n'(l,k)}\Psi_0(a_\Lambda)$ denotes the state of $n'(l, k)$ photons of momentum $k$ and
polarization state $l$ in the same way. Setting $\Psi(a_\Lambda) = \prod_{k\in\Lambda,\,l}(\hat a^\dagger_{lk})^{n'(l,k)}\Psi_0(a_\Lambda)$, we get
\[
\sum_{k\in\Lambda,\,l}\hat a^\dagger_{lk}\hat a_{lk}\Psi = \sum_{k\in\Lambda,\,l}n'(l,k)\Psi, \tag{5.10}
\]
\[
\sum_{k\in\Lambda,\,l}\hbar k\,\hat a^\dagger_{lk}\hat a_{lk}\Psi = \sum_{k\in\Lambda,\,l}n'(l,k)\hbar k\,\Psi \tag{5.11}
\]
and
\[
H_{\mathrm{rad}}\Psi = \sum_{k\in\Lambda,\,l}n'(l,k)\hbar c|k|\,\Psi. \tag{5.12}
\]
The family
\[
\Bigl\{\prod_{k\in\Lambda',\,l,\,i}(\hat a^{(i)\dagger}_{lk})^{n'(l,k,i)}\Psi_0\Bigr\}_{n'(l,k,i)=0}^{\infty}
\]
makes a complete orthogonal system in $L^2(\mathbb{R}^{4N})$ (cf. [4, Chap. 3, Theorem 3.1] and [7, §34]). We have
\[
\hat a^{(1)}_{lk} = \frac{\hat a_{lk} - \hat a_{l,-k}}{\sqrt2},\qquad \hat a^{(2)}_{lk} = \frac{i(\hat a_{lk} + \hat a_{l,-k})}{\sqrt2}
\]
from (2.13) and (5.2). So we see together with (5.4) and the second equation in (5.9) that the family
\[
\Bigl\{\prod_{k\in\Lambda,\,l}\frac{1}{\sqrt{n'(l,k)!}}(\hat a^\dagger_{lk})^{n'(l,k)}\Psi_0\Bigr\}_{n'(l,k)=0}^{\infty} \tag{5.13}
\]
also makes a complete orthonormal system in $L^2(\mathbb{R}^{4N})$ (cf. [7, §34] and [29, (2.46)]). For example, we have
\[
(\hat a^\dagger_{lk}\Psi_0, (\hat a^\dagger_{lk})^2\Psi_0) = (\Psi_0, \hat a_{lk}(\hat a^\dagger_{lk})^2\Psi_0)
= (\Psi_0, (\hat a^\dagger_{lk})^2\hat a_{lk}\Psi_0) + 2(\Psi_0, \hat a^\dagger_{lk}\Psi_0) = 2(\hat a_{lk}\Psi_0, \Psi_0) = 0.
\]

Remark 5.1. We considered the Lagrangian function (3.3) and the Hamiltonian operator (3.10), determining the indefinite constant in (2.3) by (2.18) or in Remark 3.1. On the other hand, in many references (cf. [11, 29, 32]) the indefinite constant is chosen to be 0. Consequently, the term $\infty = (1/2)\sum_{j=1}^{n}e_j^2/|x^{(j)} - x^{(j)}|$ appears in (4.2) from (2.21) and the ground state energy of $H_{\mathrm{rad}}$ is $\sum_{k\in\Lambda}\hbar c|k|/2$, which tends to infinity as $M_3$ tends to infinity. Arguments are made about these
infinities in [11, §9-3 and §9-5]. In the present paper, we could see that the infinity arising from the term $(1/2)\sum_{j=1}^{n}e_j^2/|x^{(j)} - x^{(j)}|$ disappears in (4.2) and that the ground state energy of $H_{\mathrm{rad}}$ is 0.
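The algebra behind (5.3), (5.4) and (5.9) can be checked mechanically for a single mode. The sketch below is an illustration under stated assumptions, not the construction used in this section: it works in one rescaled variable $q$ (a hypothetical dimensionless version of one $a^{(i)}_{lk}$), represents states as $p(q)e^{-q^2/2}$ by the coefficient list of the polynomial $p$, and uses the rescaled operators $A = \sqrt2\,\hat a = d/dq + q$ and $A^\dagger = \sqrt2\,\hat a^\dagger = -d/dq + q$ so that all arithmetic stays rational. All function names are introduced here for illustration only.

```python
from fractions import Fraction

def deriv(p):
    """Derivative of a polynomial given by its coefficient list [c0, c1, ...]."""
    return [Fraction(n) * c for n, c in enumerate(p)][1:] or [Fraction(0)]

def times_q(p):
    """Multiply a polynomial by q."""
    return [Fraction(0)] + list(p)

def add(p, r, scale=1):
    """Return p + scale * r, padding the shorter coefficient list with zeros."""
    n = max(len(p), len(r))
    p = list(p) + [Fraction(0)] * (n - len(p))
    r = list(r) + [Fraction(0)] * (n - len(r))
    return [a + scale * b for a, b in zip(p, r)]

def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def A(p):
    """sqrt(2) * annihilation operator: (d/dq + q) acting on p(q) exp(-q^2/2)."""
    return trim(deriv(p))

def Adag(p):
    """sqrt(2) * creation operator: (-d/dq + q) acting on p(q) exp(-q^2/2)."""
    return trim(add([2 * c for c in times_q(p)], deriv(p), scale=-1))

def number(p):
    """Number operator a^dagger a = Adag A / 2."""
    return [c / 2 for c in Adag(A(p))]

vacuum = [Fraction(1)]   # p(q) = 1, the Gaussian ground state
```

With these definitions, `A(vacuum)` is the zero polynomial (the second relation in (5.9)), `A(Adag(p)) - Adag(A(p))` equals `2 * p` for any polynomial `p` (the rescaled form of (5.3)), and `number` returns $n'$ times the state obtained by applying `Adag` $n'$ times to `vacuum`.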
6. Preliminaries for the Proofs of Main Results

From Secs. 6–9 we often write $\vec x$ and $\vec y$ in $\mathbb{R}^{3n}$ as $x$ and $y$, respectively, for the sake of simplicity when no confusion arises. Let $0 \le s < t \le T$. For $x$ and $y$ in $\mathbb{R}^{3n}$, we define
\[
q^{t,s}_{x,y}(\theta) := x - \frac{t-\theta}{t-s}(x - y),\quad s \le \theta \le t. \tag{6.1}
\]
For $X$ and $Y$ in $\mathbb{R}^{4N}$, we also define
\[
a^{t,s}_{\Lambda' X,Y}(\theta) := X - \frac{t-\theta}{t-s}(X - Y),\quad s \le \theta \le t. \tag{6.2}
\]
Then $a^{t,s}_{\Lambda X,Y}(\theta) \in \mathbb{R}^{8N}$ is defined by means of (2.13). We set
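\[
V_1(x) := \frac{2\pi}{|V|}\sum_{k\in\Lambda_1}\ \sum_{j,l=1,\ j\ne l}^{n}\frac{e_je_l\cos k\cdot(x^{(j)} - x^{(l)})}{|k|^2} \tag{6.3}
\]
and
\[
V_2(a_\Lambda) := \sum_{k\in\Lambda',\,i,\,l}\Bigl\{\frac{(c|k|)^2}{2|V|}(a^{(i)}_{lk})^2 - \frac{\hbar c|k|}{2}\Bigr\}. \tag{6.4}
\]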
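For the sake of simplicity we suppose $\Lambda_2 = \Lambda_3\ (= \Lambda')$ from Secs. 6–9. We write $\mathbf{x} = (x, X) \in \mathbb{R}^{3n+4N}$ and
\[
\mathbf{q}^{t,s}_{\mathbf{x},\mathbf{y}}(\theta) = \bigl(\theta,\ q^{t,s}_{x,y}(\theta),\ a^{t,s}_{\Lambda' X,Y}(\theta)\bigr) \in \mathbb{R}^{1+3n+4N},\quad s \le \theta \le t \tag{6.5}
\]
for $s < t$. Then, from (3.3) and (3.5), we have
\[
\begin{aligned}
&S_c(t, s; q^{t,s}_{x,y}, a^{t,s}_{\Lambda X,Y})\\
&\quad= \frac{1}{2(t-s)}\sum_{j=1}^{n}m_j|x^{(j)} - y^{(j)}|^2
+ \int_{\mathbf{q}^{t,s}_{\mathbf{x},\mathbf{y}}}\Bigl(-V_1(x)\,dt + \frac1c\sum_{j=1}^{n}e_j\tilde A(x^{(j)}, a_\Lambda)\cdot dx^{(j)} - V_2(a_\Lambda)\,dt\Bigr)
+ \frac{|X - Y|^2}{2|V|(t-s)}\\
&\quad= \frac{1}{2(t-s)}\sum_{j=1}^{n}m_j|x^{(j)} - y^{(j)}|^2 - \int_s^t V_1\Bigl(x - \frac{t-\theta}{t-s}(x - y)\Bigr)d\theta\\
&\qquad+ \frac1c\sum_{j=1}^{n}e_j(x^{(j)} - y^{(j)})\cdot\int_0^1\tilde A\bigl(x^{(j)} - \theta(x^{(j)} - y^{(j)}),\ X - \theta(X - Y)\bigr)d\theta\\
&\qquad+ \frac{|X - Y|^2}{2|V|(t-s)} - \int_s^t V_2\Bigl(X - \frac{t-\theta}{t-s}(X - Y)\Bigr)d\theta\\
&\quad= \frac{1}{2(t-s)}\sum_{j=1}^{n}m_j|x^{(j)} - y^{(j)}|^2 - (t-s)\int_0^1 V_1(x - \theta(x - y))\,d\theta\\
&\qquad+ \frac1c\sum_{j=1}^{n}e_j(x^{(j)} - y^{(j)})\cdot\int_0^1\tilde A\bigl(x^{(j)} - \theta(x^{(j)} - y^{(j)}),\ X - \theta(X - Y)\bigr)d\theta\\
&\qquad+ \frac{|X - Y|^2}{2|V|(t-s)} - (t-s)\int_0^1 V_2(X - \theta(X - Y))\,d\theta.
\end{aligned} \tag{6.6}
\]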
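The first term of (6.6) is the kinetic action of the straight-line path (6.1). As a small sanity check, the following sketch evaluates $\int_s^t (m/2)|\dot q(\theta)|^2\,d\theta$ along that path for a single particle in $\mathbb{R}^3$ by finite differences and compares it with $m|x-y|^2/(2(t-s))$. The function names, the single-particle restriction and the sample values are assumptions made for illustration.

```python
def straight_path(x, y, t, s, theta):
    """q(theta) = x - (t - theta)/(t - s) * (x - y), the path of (6.1)."""
    lam = (t - theta) / (t - s)
    return [xi - lam * (xi - yi) for xi, yi in zip(x, y)]

def kinetic_action(x, y, t, s, mass, steps=1000):
    """Finite-difference value of the integral of (m/2)|dq/dtheta|^2 over [s, t]."""
    d = (t - s) / steps
    total = 0.0
    for i in range(steps):
        a = straight_path(x, y, t, s, s + i * d)
        b = straight_path(x, y, t, s, s + (i + 1) * d)
        speed2 = sum(((bi - ai) / d) ** 2 for ai, bi in zip(a, b))
        total += 0.5 * mass * speed2 * d
    return total

def closed_form(x, y, t, s, mass):
    """m |x - y|^2 / (2 (t - s)), the first term of (6.6) for one particle."""
    dist2 = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return mass * dist2 / (2.0 * (t - s))

# Hypothetical sample data.
x, y, t, s, mass = [1.0, 2.0, -1.0], [0.0, 0.5, 3.0], 2.0, 0.5, 1.5
```

The path starts at $y$ when $\theta = s$ and ends at $x$ when $\theta = t$; since its velocity is constant, the finite-difference value agrees with the closed form up to rounding.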
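Let $M \ge 0$ and $p(x, w, X, W)$ a $C^\infty$ function in $\mathbb{R}^{6n} \times \mathbb{R}^{8N}$ such that
\[
|\partial_w^\alpha\partial_x^\beta\partial_W^{\alpha'}\partial_X^{\beta'}p(x, w, X, W)| \le C_{\alpha,\beta,\alpha',\beta'}\bigl(\langle x; w\rangle\langle X; W\rangle\bigr)^M \tag{6.7}
\]
for all multi-indices $\alpha$, $\beta$, $\alpha'$ and $\beta'$ with constants $C_{\alpha,\beta,\alpha',\beta'}$, where $\langle x; w\rangle := \sqrt{1 + |x|^2 + |w|^2}$. For $f(x, X) \in \mathcal{S}(\mathbb{R}^{3n+4N})$ we define the operator $P(t, s)$ by
\[
P(t,s)f := \begin{cases}
\displaystyle\prod_{j=1}^{n}\Bigl(\sqrt{\frac{m_j}{2\pi i\hbar(t-s)}}\Bigr)^{3}\Bigl(\sqrt{\frac{1}{2\pi i\hbar|V|(t-s)}}\Bigr)^{4N}\iint\bigl(\exp i\hbar^{-1}S_c(t, s; q^{t,s}_{x,y}, a^{t,s}_{\Lambda X,Y})\bigr)\\
\displaystyle\qquad\times\, p\Bigl(x, \frac{x-y}{\sqrt{t-s}}, X, \frac{X-Y}{\sqrt{t-s}}\Bigr)f(y, Y)\,dy\,dY, & s < t,\\[2ex]
\displaystyle\prod_{j=1}^{n}\Bigl(\sqrt{\frac{m_j}{2\pi i\hbar}}\Bigr)^{3}\Bigl(\sqrt{\frac{1}{2\pi i\hbar|V|}}\Bigr)^{4N}\mathrm{Os\text{-}}\!\iint\Bigl(\exp i\hbar^{-1}\Bigl\{\sum_{j=1}^{n}\frac{m_j|w^{(j)}|^2}{2} + \frac{|W|^2}{2|V|}\Bigr\}\Bigr)\\
\displaystyle\qquad\times\, p(x, w, X, W)\,dw\,dW\,f(x, X), & s = t.
\end{cases} \tag{6.8}
\]
When $p(x, w, X, W) = 1$, $P(t, s)$ is called the fundamental operator and denoted by $C(t, s)$.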
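Lemma 6.1. Let $M_1$ and $M_2$ be non-negative constants. Suppose that $g(x)$ $(x \in \mathbb{R}^3)$ and $\psi(\theta)$ $(\theta \in \mathbb{R})$ in (3.4) satisfy
\[
|\partial_x^\alpha g(x)| \le C_\alpha\langle x\rangle^{M_1},\quad x \in \mathbb{R}^3
\]
for all $\alpha$ and
\[
\Bigl|\frac{d^k}{d\theta^k}\psi(\theta)\Bigr| \le C_k\langle\theta\rangle^{M_2},\quad \theta \in \mathbb{R}
\]
for all $k = 0, 1, \ldots$. Let $f \in \mathcal{S}(\mathbb{R}^{3n+4N})$. Then, $\partial_x^\alpha\partial_X^{\alpha'}(P(t, s)f)(x, X)$ are continuous in $0 \le s \le t \le T$ and $(x, X) \in \mathbb{R}^{3n+4N}$ for all $\alpha$ and $\alpha'$.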
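Proof. Let $s < t$ and make the change of variables $y \to w = (x - y)/\sqrt{t-s}$ and $Y \to W = (X - Y)/\sqrt{t-s}$ in (6.8). Then from (6.6) we have
\[
P(t, s)f = \prod_{j=1}^{n}\Bigl(\sqrt{\frac{m_j}{2\pi i\hbar}}\Bigr)^{3}\Bigl(\sqrt{\frac{1}{2\pi i\hbar|V|}}\Bigr)^{4N}\mathrm{Os\text{-}}\!\iint\bigl(\exp i\hbar^{-1}\varphi(t, s; x, w, X, W)\bigr)\,p(x, w, X, W)\,f(x - \sqrt\rho\,w, X - \sqrt\rho\,W)\,dw\,dW,\quad \rho = t - s, \tag{6.9}
\]
where
\[
\begin{aligned}
\varphi(t, s; x, w, X, W) &:= \sum_{j=1}^{n}\frac{m_j}{2}|w^{(j)}|^2 + \frac{1}{2|V|}|W|^2 + \psi(t, s; x, \sqrt\rho\,w, X, \sqrt\rho\,W)\\
&:= \sum_{j=1}^{n}\frac{m_j}{2}|w^{(j)}|^2 + \frac{1}{2|V|}|W|^2 - \rho\int_0^1 V_1(x - \theta\sqrt\rho\,w)\,d\theta\\
&\qquad+ \frac1c\sum_{j=1}^{n}e_j\sqrt\rho\,w^{(j)}\cdot\int_0^1\tilde A\bigl(x^{(j)} - \theta\sqrt\rho\,w^{(j)},\ X - \theta\sqrt\rho\,W\bigr)d\theta
- \rho\int_0^1 V_2(X - \theta\sqrt\rho\,W)\,d\theta.
\end{aligned} \tag{6.10}
\]
We note from (6.8) that (6.9) is also true for $t = s$.

Let $L^{(j)} := \langle w^{(j)}\rangle^{-2}\bigl(1 - i\hbar m_j^{-1}\sum_{k=1}^{3}w^{(j)}_k\,\partial/\partial w^{(j)}_k\bigr)$ $(j = 1, 2, \ldots, n)$ and $L_1 := \langle W\rangle^{-2}\bigl(1 - i\hbar|V|\sum_{k=1}^{4N}W_k\partial_{W_k}\bigr)$. Then, integrating by parts with respect to $w$ and $W$ in (6.9) by means of $L^{(j)}$ and $L_1$, we see that the integrand is bounded by $\mathrm{Const.}\,\langle x; X\rangle^{l}\langle w\rangle^{-(3n+1)}\langle W\rangle^{-(4N+1)}$ for some real constant $l$. See the proof of [19, Lemma 2.1] for further details. Consequently, we see that $(P(t, s)f)(x, X)$ is continuous in $0 \le s \le t \le T$ and $(x, X) \in \mathbb{R}^{3n+4N}$. We note (6.9) and (6.10). Then, in the same way as above we can prove that $\partial_x^\alpha\partial_X^{\alpha'}(P(t, s)f)(x, X)$ are continuous in $0 \le s \le t \le T$ and $(x, X) \in \mathbb{R}^{3n+4N}$ for all $\alpha$ and $\alpha'$.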
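For $0 \le \sigma_1, \sigma_2 \le 1$ we set $\sigma := (\sigma_1, \sigma_2)$ and
\[
\begin{aligned}
\tau(\sigma) &:= t - \sigma_1(t - s) \in \mathbb{R},\\
\zeta^{(j)}(\sigma) &:= z^{(j)} + \sigma_1(x^{(j)} - z^{(j)}) + \sigma_1\sigma_2(y^{(j)} - x^{(j)}) \in \mathbb{R}^3,\quad j = 1, 2, \ldots, n,\\
\zeta(\sigma) &:= z + \sigma_1(x - z) + \sigma_1\sigma_2(y - x) \in \mathbb{R}^{3n},\\
\tilde\zeta(\sigma) &:= Z + \sigma_1(X - Z) + \sigma_1\sigma_2(Y - X) \in \mathbb{R}^{4N}.
\end{aligned} \tag{6.11}
\]
We also set
\[
B_{ml}(x^{(j)}, a_\Lambda) = \frac{\partial\tilde A_l}{\partial x_m}(x^{(j)}, a_\Lambda) - \frac{\partial\tilde A_m}{\partial x_l}(x^{(j)}, a_\Lambda) \tag{6.12}
\]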
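for $l, m = 1, 2, 3$ and $j = 1, 2, \ldots, n$. Then, from (6.6), we have

Lemma 6.2. We can write for $s < t$
\[
\begin{aligned}
&S_c(t, s; q^{t,s}_{z,y}, a^{t,s}_{\Lambda Z,Y}) - S_c(t, s; q^{t,s}_{z,x}, a^{t,s}_{\Lambda Z,X})\\
&\quad= \frac{1}{t-s}\sum_{j=1}^{n}m_j(x^{(j)} - y^{(j)})\cdot\Bigl(z^{(j)} - \frac{x^{(j)} + y^{(j)}}{2}\Bigr)
+ (t-s)(x - y)\cdot\int_0^1\!\!\int_0^1\sigma_1\frac{\partial V_1}{\partial x}(\zeta(\sigma))\,d\sigma_1d\sigma_2\\
&\qquad+ \frac1c\sum_{j=1}^{n}e_j(x^{(j)} - y^{(j)})\cdot\int_0^1\tilde A\bigl(x^{(j)} - \theta(x^{(j)} - y^{(j)}),\ X - \theta(X - Y)\bigr)d\theta\\
&\qquad+ \frac1c\sum_{j=1}^{n}e_j\sum_{l,m=1}^{3}(x^{(j)}_m - y^{(j)}_m)(x^{(j)}_l - z^{(j)}_l)\int_0^1\!\!\int_0^1\sigma_1B_{ml}(\zeta^{(j)}(\sigma), \tilde\zeta(\sigma))\,d\sigma_1d\sigma_2\\
&\qquad+ \frac1c\sum_{j=1}^{n}e_j(x^{(j)} - y^{(j)})\cdot\Bigl\{(Z - X)\cdot\int_0^1\!\!\int_0^1\sigma_1\frac{\partial\tilde A}{\partial a_\Lambda}(\zeta^{(j)}(\sigma), \tilde\zeta(\sigma))\,d\sigma_1d\sigma_2\Bigr\}\\
&\qquad+ (X - Y)\cdot\Bigl\{\frac1c\sum_{j=1}^{n}e_j\sum_{m=1}^{3}(x^{(j)}_m - z^{(j)}_m)\int_0^1\!\!\int_0^1\sigma_1\frac{\partial\tilde A_m}{\partial a_\Lambda}(\zeta^{(j)}(\sigma), \tilde\zeta(\sigma))\,d\sigma_1d\sigma_2\Bigr\}\\
&\qquad+ \frac{1}{(t-s)|V|}(X - Y)\cdot\Bigl(Z - \frac{X + Y}{2}\Bigr)
+ (t-s)(X - Y)\cdot\int_0^1\!\!\int_0^1\sigma_1\frac{\partial V_2}{\partial a_\Lambda}(\tilde\zeta(\sigma))\,d\sigma_1d\sigma_2.
\end{aligned} \tag{6.13}
\]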
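Proof. We use (6.6). From (6.5) and (6.11), we see
\[
\begin{aligned}
\int_{\mathbf{q}^{t,s}_{\mathbf{z},\mathbf{y}}}(-V_1(x))\,dt - \int_{\mathbf{q}^{t,s}_{\mathbf{z},\mathbf{x}}}(-V_1(x))\,dt
&= \sum_{j=1}^{n}\sum_{l=1}^{3}\iint_{\Delta}\frac{\partial V_1}{\partial x^{(j)}_l}\,dt\wedge dx^{(j)}_l\\
&= \sum_{j=1}^{n}\sum_{l=1}^{3}\int_0^1\!\!\int_0^1\frac{\partial V_1}{\partial x^{(j)}_l}(\zeta(\sigma))\,\det\frac{\partial(\tau(\sigma), \zeta^{(j)}_l(\sigma))}{\partial(\sigma_1, \sigma_2)}\,d\sigma_1d\sigma_2\\
&= \sum_{j=1}^{n}\sum_{l=1}^{3}(t-s)(x^{(j)}_l - y^{(j)}_l)\int_0^1\!\!\int_0^1\sigma_1\frac{\partial V_1}{\partial x^{(j)}_l}(\zeta(\sigma))\,d\sigma_1d\sigma_2\\
&= (t-s)(x - y)\cdot\int_0^1\!\!\int_0^1\sigma_1\frac{\partial V_1}{\partial x}(\zeta(\sigma))\,d\sigma_1d\sigma_2,
\end{aligned} \tag{6.14}
\]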
where ∆ = ∆(t, s, x, y, z) is the 2-dimensional plane with oriented boundary conq t,s q s,s sisting of (θ, q t,s z,y (θ)), −(θ, z,x (θ)) and (θ, y,x (θ)) (s ≤ θ ≤ t), and σ in (6.11) gives the positive orientation of ∆. So the second term on the right-hand side of (6.13) appears. In the same way the last term appears. It is easy to show that the first and the 7th terms appear. As in the proof of (6.14), we have
q zt,s y ,y
˜ (j) , aΛ ) · dx(j) − A(x = s,s qx y ,y
= (x
˜ (j) , aΛ ) · dx(j) A(x
˜ (j) , aΛ ) · dx(j) + A(x
(j)
q zt,s x ,x
−y
(j)
1
)·
∆
˜ (j) , aΛ ) · dx(j) ) d(A(x
˜ (j) − θ(x(j) − y (j) ), X − θ(X − Y ))dθ A(x
0
+
∆
1≤m
Bml dx(j) m
= (x(j) − y (j) ) ·
+
1
∧
(j) dxl
−
k∈Λ ,i,l m=1
1 0
0
1
∆
(i) (i) (∂ A˜m /∂alk )dx(j) m ∧ dalk
˜ (j) − θ(x(j) − y (j) ), X − θ(X − Y ))dθ A(x
0 (j)
(j) {(x(j) m − ym )(xl
(j)
(j)
− zl ) − (xl
1≤m
×
3
˜ σ1 Bml (ζ (j) (σ), ζ(σ))dσ 1 dσ2
(j)
(j) − yl )(x(j) m − zm )}
−
3
(j) (j) (j) {(x(j) m − ym )(X − Z) − (X − Y )(xm − zm )}
m=1
·
1
1
σ1 0
0
∂ A˜m (j) ˜ (ζ (σ), ζ(σ))dσ 1 dσ2 . ∂aΛ
(6.15)
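So we can complete the proof of (6.13) from (6.6).

Let us define $\Phi^{(j)}_m(t, s; x^{(j)}, y^{(j)}, z^{(j)}, X, Y, Z) \in \mathbb{R}$ $(m = 1, 2, 3,\ j = 1, 2, \ldots, n)$ and $\Phi_1(t, s; x, y, z, X, Y, Z) \in \mathbb{R}^{4N}$ by
\[
\begin{aligned}
\Phi^{(j)}_m &= z^{(j)}_m - \frac{x^{(j)}_m + y^{(j)}_m}{2}
+ \frac{e_j(t-s)}{m_jc}\sum_{l=1}^{3}(x^{(j)}_l - z^{(j)}_l)\int_0^1\!\!\int_0^1\sigma_1B_{ml}(\zeta^{(j)}(\sigma), \tilde\zeta(\sigma))\,d\sigma_1d\sigma_2\\
&\quad- \frac{e_j(t-s)}{m_jc}(X - Z)\cdot\int_0^1\!\!\int_0^1\sigma_1\frac{\partial\tilde A_m}{\partial a_\Lambda}(\zeta^{(j)}(\sigma), \tilde\zeta(\sigma))\,d\sigma_1d\sigma_2\\
&\quad+ \frac{e_j(t-s)}{m_jc}\int_0^1\tilde A_m\bigl(x^{(j)} - \theta(x^{(j)} - y^{(j)}),\ X - \theta(X - Y)\bigr)d\theta
+ \frac{(t-s)^2}{m_j}\int_0^1\!\!\int_0^1\sigma_1\frac{\partial V_1}{\partial x^{(j)}_m}(\zeta(\sigma))\,d\sigma_1d\sigma_2
\end{aligned} \tag{6.16}
\]
and
\[
\begin{aligned}
\Phi_1 &= Z - \frac{X + Y}{2}
+ \frac{(t-s)|V|}{c}\sum_{j=1}^{n}\sum_{m=1}^{3}e_j(x^{(j)}_m - z^{(j)}_m)\int_0^1\!\!\int_0^1\sigma_1\frac{\partial\tilde A_m}{\partial a_\Lambda}(\zeta^{(j)}(\sigma), \tilde\zeta(\sigma))\,d\sigma_1d\sigma_2\\
&\quad+ (t-s)^2|V|\int_0^1\!\!\int_0^1\sigma_1\frac{\partial V_2}{\partial a_\Lambda}(\tilde\zeta(\sigma))\,d\sigma_1d\sigma_2,
\end{aligned} \tag{6.17}
\]
respectively. Let $\Phi^{(j)} := (\Phi^{(j)}_1, \Phi^{(j)}_2, \Phi^{(j)}_3) \in \mathbb{R}^3$. Then it follows from (6.13), (6.16) and (6.17) that
\[
\begin{aligned}
&S_c(t, s; q^{t,s}_{z,y}, a^{t,s}_{\Lambda Z,Y}) - S_c(t, s; q^{t,s}_{z,x}, a^{t,s}_{\Lambda Z,X})\\
&\quad= \frac{1}{t-s}\sum_{j=1}^{n}m_j(x^{(j)} - y^{(j)})\cdot\Phi^{(j)}(t, s; x^{(j)}, y^{(j)}, z^{(j)}, X, Y, Z)
+ \frac{1}{(t-s)|V|}(X - Y)\cdot\Phi_1(t, s; x, y, z, X, Y, Z).
\end{aligned} \tag{6.18}
\]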
7. The Stability of the Fundamental Operator Lemma 7.1. Let f ∈ C 1 (Rd ) and |∂xα f | ≤ Cα < x >−(1+δα ) for all x ∈ Rd and all |α| = 1, where δα > 0 are constants. Then we have: (1) f is a bounded function in Rd . (2) We have α β γ 1 1 |x − z| ∂x ∂y ∂z σ1 f (z + σ1 (x − z) + σ1 σ2 (y − x))dσ1 dσ2 0
≤ Cα,β,γ ,
0
|α + β + γ| = 1,
x, y, z ∈ Rd .
The proof is easy, following the proof of [18, Lemma 3.5]. We note (3.4) and (6.11). Then, it follows from Lemma 7.1 that under the assumptions of Theorem 3.1 we have 1 1 ∂ A˜m (j) α β γ α β γ ˜ σ1 (ζ (σ), ζ(σ))dσ ∂x(j) ∂y(j) ∂z(j) ∂X ∂Y ∂Z (Z − X) · 1 dσ2 ∂a Λ 0 0 |α + β + γ + α + β + γ | ≥ 0
≤ Cα,β,γ,α ,β ,γ ,
(7.1)
for x(j) , y (j) , z (j) ∈ R3 and X, Y, Z ∈ R4N . In the same way we have the same 1 1 (j) (j) (j) ˜ estimates as the above for (xl − zl ) 0 0 σ1 Bml (ζ (j) (σ), ζ(σ))dσ 1 dσ2 and (xm − ˜ 1 1 (j) ˜ zm ) σ1 ∂ Am (ζ (j) (σ), ζ(σ))dσ 1 dσ2 . To obtain these estimates we assumed (3.6) 0 0
∂aΛ
and (3.7). Consequently, letting Θ be a component of Φ(j) and Φ1 , and |α + β + γ + α + β + γ | ≥ 1, then from (6.16) and (6.17) we obtain
α β γ ∂Y ∂Z Θ| ≤ Cα,β,γ,α ,β ,γ |∂xα ∂yβ ∂zγ ∂X
(7.2)
together with (6.3) and (6.4) for 0 ≤ s ≤ t ≤ T, x, y, z ∈ R3n and X, Y, Z ∈ R4N . Proposition 7.2. Under the assumptions of Theorem 3.1 we have: (1) There exists a constant ρ∗ > 0 such that the mapping: R3n+4N (z, Z) → (ξ, Ξ) = (Φ, Φ1 ) := (Φ(1) , Φ(2) , . . . , Φ(n) , Φ1 ) ∈ R3n+4N is homeomorphic and det ∂(ξ, Ξ)/∂(z, Z) ≥ 1/2 for each fixed 0 ≤ t − s ≤ ρ∗ , x, y, X and Y . We write its inverse mapping as R3n+4N (ξ, Ξ) → (z, Z) = (z(t, s; x, ξ, y, X, Ξ, Y ), Z(t, s; x, ξ, y, X, Ξ, Y )) ∈ R3n+4N . (2) Let η(t, s; x, ξ, y, X, Ξ, Y ) be a component of z and Z. Then, letting |α + β + γ + α + β + γ | ≥ 1, we have
β γ ∂Y η(t, s; x, ξ, y, X, Ξ, Y )| ≤ Cα,β,γ,α ,β ,γ |∂ξα ∂xβ ∂yγ ∂Ξα ∂X
(7.3)
for 0 ≤ t − s ≤ ρ∗ , x, ξ, y ∈ R3n and X, Ξ, Y ∈ R4N . Proof. (1) From (6.16) and (6.17), we can write ∂(Φ, Φ1 )/∂(z, Z) = I + (t − s)d(t, s; x, y, z, X, Y, Z),
(7.4)
where $I$ is the identity matrix of degree $3n + 4N$. We can see as in the proof of (7.2) that each component of $d(t, s; x, y, z, X, Y, Z)$ satisfies (7.2) for all $\alpha$, $\beta$, $\gamma$, $\alpha'$, $\beta'$ and $\gamma'$. Hence, applying [31, Theorem 1.22] to the mapping $(z, Z) \to (\Phi, \Phi_1)$, we prove (1). (2) We see $(\xi, \Xi) = (\Phi(t, s; x, y, z, X, Y, Z), \Phi_1(t, s; x, y, z, X, Y, Z))$ with $z = z(t, s; x, \xi, y, X, \Xi, Y)$ and $Z = Z(t, s; x, \xi, y, X, \Xi, Y)$. So, (7.3) follows from (7.2) and $\det\partial(\xi, \Xi)/\partial(z, Z) \ge 1/2$.

Remark 7.1. Let us consider the general case $\Lambda_2 \subseteq \Lambda_3$ in Proposition 7.2. Then from (3.4) and (6.12), we consider $\tilde A(x, a_{\Lambda_2})$ and $B_{ml}(x, a_{\Lambda_2})$ in (6.16) and (6.17). Let $\Lambda_1$ and $\Lambda_2$ be fixed. When $\Lambda_3 = \Lambda_2$, we could determine $\rho^* > 0$ from (7.4) such that we get $\det\partial(\Phi, \Phi_1)/\partial(z, Z) \ge 1/2$ for $0 \le t - s \le \rho^*$, $x, y, z \in \mathbb{R}^{3n}$ and $X, Y, Z \in \mathbb{R}^{4N_3}$. Let $\Lambda_3 \supseteq \Lambda_2$. Then, direct calculations show $\det\partial(\Phi, \Phi_1)/\partial(z, Z) \ge 1/2$ for $0 \le t - s \le \rho^*$, $x, y, z \in \mathbb{R}^{3n}$ and $X, Y, Z \in \mathbb{R}^{4N_3}$ from (6.16) and (6.17), since $(t-s)^2|V|\,\partial^2V_2(a_\Lambda)/\partial(a^{(i)}_{lk})^2 = (t-s)^2(c|k|)^2$ are non-negative. Consequently, we can see that when $\Lambda_1$ and $\Lambda_2$ are fixed, the constant $\rho^* > 0$ is taken independently of $\Lambda_3\ (\supseteq \Lambda_2)$.

Theorem 7.3. Let $\rho^* > 0$ be the constant determined in Proposition 7.2. Then under the assumptions of Theorem 3.1 we can find constants $K_a \ge 0$ $(a = 0, 1, 2, \ldots)$ such that
\[
\|C(t, s)f\|_{B^a} \le e^{K_a(t-s)}\|f\|_{B^a},\quad 0 \le t - s \le \rho^* \tag{7.5}
\]
for all $f(x, a_\Lambda) \in B^a(\mathbb{R}^{3n+4N})$.

Proof. The definition (6.8) says
\[
C(s, s) = \mathrm{Identity}. \tag{7.6}
\]
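So (7.5) holds for $t = s$. Let $0 < t - s \le \rho^*$. We take $\chi \in C^\infty(\mathbb{R}^{3n+4N})$ with compact support such that $\chi(0) = 1$. Let $\epsilon > 0$ and $f \in \mathcal{S}(\mathbb{R}^{3n+4N})$. Then from (6.8) and (6.18), we can write
\[
\begin{aligned}
&C(t, s)^*\chi(\epsilon\,\cdot)^2C(t, s)f\\
&\quad= \prod_{j=1}^{n}\Bigl(\frac{m_j}{2\pi\hbar(t-s)}\Bigr)^{3}\Bigl(\frac{1}{2\pi\hbar|V|(t-s)}\Bigr)^{4N}\iint f(y, Y)\,dy\,dY\\
&\qquad\times\iint\chi(\epsilon z, \epsilon Z)^2\exp\bigl\{i\hbar^{-1}S_c(t, s; q^{t,s}_{z,y}, a^{t,s}_{\Lambda Z,Y}) - i\hbar^{-1}S_c(t, s; q^{t,s}_{z,x}, a^{t,s}_{\Lambda Z,X})\bigr\}\,dz\,dZ\\
&\quad= \prod_{j=1}^{n}\Bigl(\frac{m_j}{2\pi\hbar(t-s)}\Bigr)^{3}\Bigl(\frac{1}{2\pi\hbar|V|(t-s)}\Bigr)^{4N}\iint f(y, Y)\,dy\,dY\\
&\qquad\times\iint\chi(\epsilon z, \epsilon Z)^2\exp\Bigl\{i\sum_{j=1}^{n}(x^{(j)} - y^{(j)})\cdot\frac{m_j\Phi^{(j)}}{\hbar(t-s)} + i(X - Y)\cdot\frac{\Phi_1}{\hbar|V|(t-s)}\Bigr\}\,dz\,dZ.
\end{aligned} \tag{7.7}
\]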
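We can make the change of variables $(z, Z) \to (\xi, \Xi) = (\Phi, \Phi_1)$ in (7.7) from Proposition 7.2. Then
\[
\begin{aligned}
C(t, s)^*\chi(\epsilon\,\cdot)^2C(t, s)f
&= \prod_{j=1}^{n}\Bigl(\frac{m_j}{2\pi\hbar(t-s)}\Bigr)^{3}\Bigl(\frac{1}{2\pi\hbar|V|(t-s)}\Bigr)^{4N}\iint f(y, Y)\,dy\,dY\\
&\quad\times\iint\chi(\epsilon z, \epsilon Z)^2\exp\Bigl\{i\sum_{j=1}^{n}(x^{(j)} - y^{(j)})\cdot\frac{m_j\xi^{(j)}}{\hbar(t-s)} + i(X - Y)\cdot\frac{\Xi}{\hbar|V|(t-s)}\Bigr\}\det\frac{\partial(z, Z)}{\partial(\xi, \Xi)}\,d\xi\,d\Xi.
\end{aligned}
\]
Equation (7.4) and Proposition 7.2(2) show
\[
\det\frac{\partial(z, Z)}{\partial(\xi, \Xi)} = 1 + (t-s)h(t, s; x, \xi, y, X, \Xi, Y), \tag{7.8}
\]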
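where $h(t, s; x, \xi, y, X, \Xi, Y)$ satisfies (7.3) for all $\alpha$, $\beta$, $\gamma$, $\alpha'$, $\beta'$ and $\gamma'$. Consequently, from Proposition 7.2(2), we have
\[
\begin{aligned}
\lim_{\epsilon\to0}C(t, s)^*\chi(\epsilon\,\cdot)^2C(t, s)f
&= \Bigl(\frac{1}{2\pi}\Bigr)^{3n+4N}\lim_{\epsilon\to0}\iint f(y, Y)\,dy\,dY\iint\chi(\epsilon z, \epsilon Z)^2\\
&\qquad\times\{\exp(i(x - y)\cdot\gamma + i(X - Y)\cdot\Gamma)\}\det\frac{\partial(z, Z)}{\partial(\xi, \Xi)}\,d\gamma\,d\Gamma\\
&= f(x, X) + (t-s)\Bigl(\frac{1}{2\pi}\Bigr)^{3n+4N}\mathrm{Os\text{-}}\!\iiiint\{\exp(i(x - y)\cdot\gamma + i(X - Y)\cdot\Gamma)\}\\
&\qquad\times h(t, s; x, \xi, y, X, \Xi, Y)f(y, Y)\,dy\,dY\,d\gamma\,d\Gamma,
\end{aligned} \tag{7.9}
\]
where $\xi^{(j)} = \hbar(t-s)\gamma^{(j)}/m_j$ $(j = 1, 2, \ldots, n)$ and $\Xi = \hbar|V|(t-s)\Gamma$. We note that the second term on the right-hand side of (7.9) is a pseudo-differential operator.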
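So, applying the Calderón–Vaillancourt theorem ([5]), we obtain
\[
\begin{aligned}
\lim_{\epsilon\to0}\|\chi(\epsilon\,\cdot)C(t, s)f\|^2
&= \lim_{\epsilon\to0}\bigl(C(t, s)^*\chi(\epsilon\,\cdot)^2C(t, s)f, f\bigr)
= \Bigl(\lim_{\epsilon\to0}C(t, s)^*\chi(\epsilon\,\cdot)^2C(t, s)f, f\Bigr)\\
&\le (1 + 2K_0(t-s))\|f\|^2 \le e^{2K_0(t-s)}\|f\|^2
\end{aligned}
\]
with a constant $K_0 \ge 0$. Hence we get (7.5) with $a = 0$ by Fatou's lemma.

Let $p(x, w, X, W)$ be a $C^\infty$ function satisfying (6.7) with an integer $M \ge 0$. Then we obtain
\[
\|P(t, s)f\| \le \mathrm{Const.}\,\|f\|_{B^M} \tag{7.10}
\]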
as in the proof of (7.5) with a = 0. See the proof of [19, Proposition 4.3] for further details. Let us recall the expression (6.9) of C(t, s)f . Set ζ := (x, X) and let κ = (κ1 , κ2 , . . . , κ3n+4N ) be an arbitrary multi-index. Then we can see that ∂ζκ (C(t, s)f ) − C(t, s)(∂ζκ f ) and ζ κ (C(t, s)f ) − C(t, s)(ζ κ f ) are written in the form P˜γ (t, s)(∂ζγ f ) (t − s) |γ|≤|κ|
:= (t − s)
|γ|≤|κ|
× Os-
n
j=1
3 mj 1 2πi 2πi|V |
4N
(exp i−1 φ(t, s; x, w, X, W ))pγ (t, s; x,
× (∂ζγ f )(x −
√
√ ρw, X, ρW )
√ √ ρw, X − ρW )dwdW
(7.11)
respectively, where pγ (t, s; x, w, X, W ) satisfies (6.7) with M = |κ| − |γ| for all α, β, α and β . We can prove these results by induction with respect to −1 (j) 2 −1 (j) 2 −1 2 |κ|, using ∂w(j) eimj |w | /2 = imj w(j) eimj |w | /2 , ∂W ei |W | /(2|V |) = −1 2 (iW/|V |)ei |W | /(2|V |) and the integration by parts in (6.9). See the proof of [21, Lemma 3.2] for further details. Let |κ| = a (a = 0, 1, 2, . . .). Then we have P˜γ (t, s)(∂ζγ f ). ∂ζκ (C(t, s)f ) ≤ C(t, s)(∂ζκ f ) + (t − s) |γ|≤a
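Applying (7.5) with $a = 0$ and (7.10) to the right-hand side above, we get $\|\partial_\zeta^\kappa(C(t, s)f)\| \le e^{K_0(t-s)}\|\partial_\zeta^\kappa f\| + \mathrm{Const.}\,(t-s)\sum_{|\gamma|\le a}\|\partial_\zeta^\gamma f\|_{B^{a-|\gamma|}}$. We know from Lemma 2.3 with $s = 1$ and $a = b$ in [17] that there exist a constant $\mu_a \ge 0$ and $\lambda_a(\zeta, \eta)$ satisfying
\[
|\partial_\eta^\alpha\partial_\zeta^\beta\lambda_a(\zeta, \eta)| \le C_{\alpha,\beta}\langle\zeta; \eta\rangle^{-a} \tag{7.12}
\]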
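for all $\alpha$ and $\beta$, and
\[
\Lambda_a(\zeta, D_\zeta) = (\mu_a + \langle\zeta\rangle^a + \langle D_\zeta\rangle^a)^{-1} \tag{7.13}
\]
on $\mathcal{S}$, where $\Lambda_a(\zeta, D_\zeta)$ is the pseudo-differential operator with symbol $\lambda_a(\zeta, \eta)$. So, using [17, Lemma 2.4] and the Calderón–Vaillancourt theorem, we have
\[
\begin{aligned}
\|\partial_\zeta^\gamma f\|_{B^{a-|\gamma|}}
&\le \mathrm{Const.}\,\|(\mu_{a-|\gamma|} + \langle\zeta\rangle^{a-|\gamma|} + \langle D_\zeta\rangle^{a-|\gamma|})\partial_\zeta^\gamma f\|\\
&= \mathrm{Const.}\,\|\{(\mu_{a-|\gamma|} + \langle\zeta\rangle^{a-|\gamma|} + \langle D_\zeta\rangle^{a-|\gamma|})\partial_\zeta^\gamma\Lambda_a\}(\mu_a + \langle\zeta\rangle^a + \langle D_\zeta\rangle^a)f\|\\
&\le \mathrm{Const.}\,\|f\|_{B^a}.
\end{aligned} \tag{7.14}
\]
Hence we get
\[
\|\partial_\zeta^\kappa(C(t, s)f)\| \le e^{K_0(t-s)}\|\partial_\zeta^\kappa f\| + \mathrm{Const.}\,(t-s)\|f\|_{B^a}. \tag{7.15}
\]
In the same way, we get
\[
\|\zeta^\kappa(C(t, s)f)\| \le e^{K_0(t-s)}\|\zeta^\kappa f\| + \mathrm{Const.}\,(t-s)\|f\|_{B^a}. \tag{7.16}
\]
Thus we obtain $\|C(t, s)f\|_{B^a} \le e^{K_0(t-s)}\|f\|_{B^a} + \mathrm{Const.}\,(t-s)\|f\|_{B^a} \le e^{K_a(t-s)}\|f\|_{B^a}$. This completes the proof of Theorem 7.3.

Proposition 7.4. Let $0 \le t - s \le \rho^*$ and $p(x, w, X, W)$ satisfy (6.7) with an integer $M \ge 0$. Then $P(t, s)$ is a continuous operator from $B^{a+M}$ $(a = 0, 1, 2, \ldots)$ into $B^a$.

Proof. Let $\zeta = (x, X)$ and $f \in \mathcal{S}(\mathbb{R}^{3n+4N})$. We also use (6.9) as in the proof of Theorem 7.3. Then we have
\[
\partial_\zeta^\kappa P(t, s)f = \sum_{\gamma\le\kappa}P_\gamma(t, s)\partial_\zeta^\gamma f,
\]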
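where $\gamma \le \kappa$ denotes $\gamma_j \le \kappa_j$ for all $j$ and $p_\gamma(t, s; x, w, X, W)$ satisfy (6.7) with $M + |\kappa| - |\gamma|$ as $M$. Using $\zeta = (x, X) = (x - \sqrt\rho\,w, X - \sqrt\rho\,W) + \sqrt\rho\,(w, W)$, we also have
\[
\zeta^\kappa P(t, s)f = \sum_{\gamma\le\kappa}Q_\gamma(t, s)\zeta^\gamma f,
\]
where $q_\gamma(t, s; x, w, X, W)$ satisfy (6.7) with $M + |\kappa| - |\gamma|$ as $M$. Hence from (7.10) and (7.14) we see
\[
\|P(t, s)f\|_{B^a} = \|P(t, s)f\| + \sum_{|\kappa|=a}\bigl(\|\zeta^\kappa P(t, s)f\| + \|\partial_\zeta^\kappa P(t, s)f\|\bigr) \le \mathrm{Const.}\,\|f\|_{B^{a+M}}. \tag{7.17}
\]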
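The exponential form of the bound (7.5) is convenient because it composes over subdivisions of $[s, t]$: if each short step contributes at most a factor $1 + K\Delta \le e^{K\Delta}$, the product over any subdivision stays below $e^{K(t-s)}$, uniformly in the number of steps. The sketch below illustrates only this elementary inequality with a hypothetical constant `K` and a uniform subdivision; it does not model the operator $C(t, s)$ itself.

```python
import math

def composed_bound(K, s, t, n_steps):
    """Product of the per-step factors (1 + K*dt) over a uniform subdivision of [s, t]."""
    dt = (t - s) / n_steps
    bound = 1.0
    for _ in range(n_steps):
        bound *= 1.0 + K * dt
    return bound

def exponential_bound(K, s, t):
    """The subdivision-independent bound exp(K (t - s))."""
    return math.exp(K * (t - s))

# Hypothetical values: K = 3, time interval [0, 2].
bounds = [composed_bound(3.0, 0.0, 2.0, n) for n in (1, 10, 1000)]
```

Each entry of `bounds` stays below `exponential_bound(3.0, 0.0, 2.0)`, and the entries increase towards it as the subdivision is refined.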
8. The Consistency of the Fundamental Operator Let C(t, s) and H(t) be the fundamental operator defined in Sec. 6 and the operator defined by (3.10) with variables aΛ = aΛ2 = X, respectively. Theorem 8.1. Under the assumptions of Theorem 3.1 there exist integers M ≥ 0, M ≥ 0, C ∞ functions r(t, s; x, w, X, W ) and r (t, s; x, w, X, W ) in 0 ≤ s ≤ t ≤ T, (x, w) ∈ R6n and (X, W ) ∈ R8N satisfying (6.7) for all α, β, α and β , respectively such that √ ∂ (8.1) i − H(t) C(t, s)f = t − sR(t, s)f ∂t and
√ ∂ C(t, s)f + C(t, s)H(s)f = t − sR (t, s)f (8.2) ∂s ), where R(t, s) and R (t, s) are operators defined by (6.8). for f ∈ S(Rx3n × Ra4N Λ i
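Proof. In this proof, we write $\vec{x}$ and $\vec{y}$ as $x$ and $y$, respectively. Let $\mathbf{x}$ denote variables in $\mathbb{R}^3$. We may assume $s < t$ from Lemma 6.1. Direct calculations using (3.10), (6.6) and (6.8) show
$$\left(i\hbar\frac{\partial}{\partial t} - H(t)\right)C(t,s)f = -\prod_{j=1}^{n}\sqrt{\frac{m_j}{2\pi i\hbar(t-s)}}^{\,3}\,\sqrt{\frac{1}{2\pi i\hbar|V|(t-s)}}^{\,4N}\iint\left(\exp i\hbar^{-1}S_c(t,s;q^{t,s}_{x,y},a^{t,s}_{\Lambda X,Y})\right)\left\{r_1(t,s;x,y,X,Y) + \frac{i\hbar}{2}\,r_2(t,s;x,y,X,Y)\right\}f(y,Y)\,dy\,dY \tag{8.3}$$
by means of (6.3) and (6.4), where
$$r_1(t,s;x,y,X,Y) = \partial_t S_c(t,s;q^{t,s}_{x,y},a^{t,s}_{\Lambda X,Y}) + \sum_{j=1}^{n}\frac{1}{2m_j}\left|\partial_{x^{(j)}}S_c - \frac{e_j}{c}\tilde{A}(x^{(j)},X)\right|^2 + V_1(x) + \frac{|V|}{2}\,|\partial_X S_c|^2 + V_2(X) \tag{8.4}$$
and
$$r_2 = \frac{3n+4N}{t-s} - \sum_{j=1}^{n}\frac{1}{m_j}\Delta_{x^{(j)}}S_c + \frac{1}{c}\sum_{j=1}^{n}\frac{e_j}{m_j}(\nabla_{\mathbf{x}}\cdot\tilde{A})(x^{(j)},X) - |V|\,\Delta_X S_c, \qquad \mathbf{x}\in\mathbb{R}^3 \tag{8.5}$$
(cf. the proof of [18, Proposition 2.3]).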
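The structure of (8.3)–(8.5) can be sanity-checked on the simplest case. The sketch below is not part of the paper: it assumes a single free particle in one dimension (no potentials, no photon variables, so that $3n+4N$ is replaced by 1 and $S_c = m(x-y)^2/(2\rho)$), in units with $\hbar = m = 1$. Under these assumptions both $r_1$ and $r_2$ vanish identically, and the kernel satisfies the free Schrödinger equation, which the code confirms by central differences. All function names are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (assumption for this demo)
MASS = 1.0  # particle mass (assumption for this demo)


def free_kernel(t, x, y=0.0, s=0.0):
    """sqrt(m / (2 pi i hbar rho)) * exp(i S_c / hbar), S_c = m (x - y)^2 / (2 rho)."""
    rho = t - s
    amplitude = cmath.sqrt(MASS / (2j * math.pi * HBAR * rho))
    action = MASS * (x - y) ** 2 / (2.0 * rho)
    return amplitude * cmath.exp(1j * action / HBAR)


def r1_free(t, x, y=0.0, s=0.0):
    """Analogue of (8.4): d_t S_c + (d_x S_c)^2 / (2 m), with all potentials set to zero."""
    rho = t - s
    dt_action = -MASS * (x - y) ** 2 / (2.0 * rho ** 2)
    dx_action = MASS * (x - y) / rho
    return dt_action + dx_action ** 2 / (2.0 * MASS)


def r2_free(t, s=0.0):
    """Analogue of (8.5): d / rho - (1 / m) d_x^2 S_c, with dimension d = 1."""
    rho = t - s
    return 1.0 / rho - (1.0 / MASS) * (MASS / rho)


def schroedinger_residual(t, x, h=1e-3):
    """Central-difference estimate of (i hbar d_t - H) K for H = -(hbar^2 / 2m) d_x^2."""
    dt = (free_kernel(t + h, x) - free_kernel(t - h, x)) / (2.0 * h)
    dxx = (free_kernel(t, x + h) - 2.0 * free_kernel(t, x)
           + free_kernel(t, x - h)) / h ** 2
    return 1j * HBAR * dt + (HBAR ** 2 / (2.0 * MASS)) * dxx


if __name__ == "__main__":
    print("r1 =", r1_free(1.0, 0.7))
    print("r2 =", r2_free(1.0))
    print("|residual| =", abs(schroedinger_residual(1.0, 0.7)))
```

In the general case treated in the proof, $r_1$ and $r_2$ no longer vanish but are of order $\sqrt{\rho}$, which is the content of (8.11) and (8.13).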
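Set $\rho = t - s$. From (6.6), we can write
$$\begin{aligned}
\partial_{x^{(j)}}S_c - \frac{e_j}{c}\tilde{A}(x^{(j)},X)
&= \frac{m_j(x^{(j)}-y^{(j)})}{\rho} + \frac{e_j}{c}\int_0^1\left\{\tilde{A}\big(x^{(j)}-\theta(x^{(j)}-y^{(j)}),\,X-\theta(X-Y)\big) - \tilde{A}(x^{(j)},X)\right\}d\theta \\
&\quad + \frac{e_j}{c}\sum_{l=1}^{3}(x_l^{(j)}-y_l^{(j)})\int_0^1(1-\theta)\,\frac{\partial\tilde{A}_l}{\partial x}\big(x^{(j)}-\theta(x^{(j)}-y^{(j)}),\,X-\theta(X-Y)\big)\,d\theta \\
&\quad - \rho\int_0^1(1-\theta)\,\frac{\partial V_1}{\partial x^{(j)}}\big(x-\theta(x-y)\big)\,d\theta \\
&= \frac{m_j(x^{(j)}-y^{(j)})}{\rho} - \frac{e_j}{2c}\sum_{m=1}^{3}(x_m^{(j)}-y_m^{(j)})\,\frac{\partial\tilde{A}}{\partial x_m}(x^{(j)},X) - \frac{e_j}{2c}\sum_{m=1}^{4N}(X_m-Y_m)\,\frac{\partial\tilde{A}}{\partial X_m}(x^{(j)},X) \\
&\quad + \frac{e_j}{2c}\sum_{l=1}^{3}(x_l^{(j)}-y_l^{(j)})\,\frac{\partial\tilde{A}_l}{\partial x}(x^{(j)},X) + \rho\,q_1\!\left(t,s;x,\frac{x-y}{\sqrt{\rho}},X,\frac{X-Y}{\sqrt{\rho}}\right)
\end{aligned} \tag{8.6}$$
and
$$\begin{aligned}
\partial_X S_c
&= \frac{X-Y}{|V|\rho} - \rho\int_0^1(1-\theta)\,\frac{\partial V_2}{\partial X}\big(X-\theta(X-Y)\big)\,d\theta \\
&\quad + \frac{1}{c}\sum_{j=1}^{n}e_j\sum_{l=1}^{3}(x_l^{(j)}-y_l^{(j)})\int_0^1(1-\theta)\,\frac{\partial\tilde{A}_l}{\partial X}\big(x^{(j)}-\theta(x^{(j)}-y^{(j)}),\,X-\theta(X-Y)\big)\,d\theta \\
&= \frac{X-Y}{|V|\rho} + \frac{1}{2c}\sum_{j=1}^{n}e_j\sum_{l=1}^{3}(x_l^{(j)}-y_l^{(j)})\,\frac{\partial\tilde{A}_l}{\partial X}(x^{(j)},X) + \rho\,q_2\!\left(t,s;x,\frac{x-y}{\sqrt{\rho}},X,\frac{X-Y}{\sqrt{\rho}}\right).
\end{aligned} \tag{8.7}$$
We can easily see
$$-\sum_{k,m=1}^{3}(x_k^{(j)}-y_k^{(j)})(x_m^{(j)}-y_m^{(j)})\,\frac{\partial\tilde{A}_k}{\partial x_m}(x^{(j)},X) + \sum_{k,l=1}^{3}(x_k^{(j)}-y_k^{(j)})(x_l^{(j)}-y_l^{(j)})\,\frac{\partial\tilde{A}_l}{\partial x_k}(x^{(j)},X) = 0. \tag{8.8}$$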
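Equations (8.6)–(8.8) show
$$\sum_{j=1}^{n}\frac{1}{2m_j}\left|\partial_{x^{(j)}}S_c - \frac{e_j}{c}\tilde{A}(x^{(j)},X)\right|^2 + \frac{|V|}{2}\,|\partial_X S_c|^2 = \frac{1}{2\rho^2}\sum_{j=1}^{n}m_j|x^{(j)}-y^{(j)}|^2 + \frac{|X-Y|^2}{2|V|\rho^2} + \sqrt{\rho}\,q_3\!\left(t,s;x,\frac{x-y}{\sqrt{\rho}},X,\frac{X-Y}{\sqrt{\rho}}\right). \tag{8.9}$$
From (6.6), we also have
$$\partial_t S_c(t,s;q^{t,s}_{x,y},a^{t,s}_{\Lambda X,Y}) = -\frac{1}{2\rho^2}\sum_{j=1}^{n}m_j|x^{(j)}-y^{(j)}|^2 - V_1(x) - \frac{|X-Y|^2}{2|V|\rho^2} - V_2(X) + \sqrt{\rho}\,q_4\!\left(t,s;x,\frac{x-y}{\sqrt{\rho}},X,\frac{X-Y}{\sqrt{\rho}}\right). \tag{8.10}$$
Hence together with (8.4), we obtain
$$r_1(t,s;x,y,X,Y) = \sqrt{\rho}\,q_5\!\left(t,s;x,\frac{x-y}{\sqrt{\rho}},X,\frac{X-Y}{\sqrt{\rho}}\right). \tag{8.11}$$
From (6.6) or (8.6) and (8.7), the same arguments as for (8.11) show
$$\begin{aligned}
\sum_{j=1}^{n}\frac{1}{m_j}\Delta_{x^{(j)}}S_c + |V|\,\Delta_X S_c
&= \frac{3n+4N}{\rho} + \frac{2}{c}\sum_{j=1}^{n}\frac{e_j}{m_j}\int_0^1(1-\theta)\,(\nabla_{\mathbf{x}}\cdot\tilde{A})\big(x^{(j)}-\theta(x^{(j)}-y^{(j)}),\,X-\theta(X-Y)\big)\,d\theta + \sqrt{\rho}\,q_6\!\left(t,s;x,\frac{x-y}{\sqrt{\rho}},X,\frac{X-Y}{\sqrt{\rho}}\right) \\
&= \frac{3n+4N}{\rho} + \frac{1}{c}\sum_{j=1}^{n}\frac{e_j}{m_j}(\nabla_{\mathbf{x}}\cdot\tilde{A})(x^{(j)},X) + \sqrt{\rho}\,q_7\!\left(t,s;x,\frac{x-y}{\sqrt{\rho}},X,\frac{X-Y}{\sqrt{\rho}}\right).
\end{aligned} \tag{8.12}$$
Hence together with (8.5), we get
$$r_2(t,s;x,y,X,Y) = -\sqrt{\rho}\,q_7\!\left(t,s;x,\frac{x-y}{\sqrt{\rho}},X,\frac{X-Y}{\sqrt{\rho}}\right). \tag{8.13}$$
Thus the proof of (8.1) is complete from (8.3), (8.11) and (8.13).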
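Let us consider (8.2). By direct calculations we see that the left-hand side of (8.2) is equal to
$$-\prod_{j=1}^{n}\sqrt{\frac{m_j}{2\pi i\hbar(t-s)}}^{\,3}\,\sqrt{\frac{1}{2\pi i\hbar|V|(t-s)}}^{\,4N}\iint\left(\exp i\hbar^{-1}S_c(t,s;q^{t,s}_{x,y},a^{t,s}_{\Lambda X,Y})\right)\left\{r_1'(t,s;x,y,X,Y) + \frac{i\hbar}{2}\,r_2'(t,s;x,y,X,Y)\right\}f(y,Y)\,dy\,dY, \tag{8.14}$$
where
$$r_1'(t,s;x,y,X,Y) = \partial_s S_c(t,s;q^{t,s}_{x,y},a^{t,s}_{\Lambda X,Y}) - \sum_{j=1}^{n}\frac{1}{2m_j}\left|\partial_{y^{(j)}}S_c + \frac{e_j}{c}\tilde{A}(y^{(j)},Y)\right|^2 - V_1(y) - \frac{|V|}{2}\,|\partial_Y S_c|^2 - V_2(Y) \tag{8.15}$$
and
$$r_2' = -\frac{3n+4N}{t-s} + \sum_{j=1}^{n}\frac{1}{m_j}\Delta_{y^{(j)}}S_c + \frac{1}{c}\sum_{j=1}^{n}\frac{e_j}{m_j}(\nabla_{\mathbf{x}}\cdot\tilde{A})(y^{(j)},Y) + |V|\,\Delta_Y S_c. \tag{8.16}$$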
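Consequently, we can prove (8.2) as in the proof of (8.1).

9. The Proofs of the Main Results

We first prove Theorem 3.1. Let $\rho^* > 0$ be the constant determined in Proposition 7.2 and $\chi \in C^\infty(\mathbb{R}^{3n+4N})$ with compact support such that $\chi(0) = 1$. We consider bounded operators $K_j$ and $K_j'$ ($j = 1,2,\ldots,\nu$) on $B^a(\mathbb{R}^{3n+4N})$. Then, it holds for $f \in B^a(\mathbb{R}^{3n+4N})$ that
$$\begin{aligned}
&K_\nu\chi(\epsilon\,\cdot)K_{\nu-1}\chi(\epsilon\,\cdot)\cdots\chi(\epsilon\,\cdot)K_1\chi(\epsilon\,\cdot)f - K_\nu'K_{\nu-1}'\cdots K_1'f \\
&\quad= \sum_{j=1}^{\nu}K_\nu\chi(\epsilon\,\cdot)\cdots\chi(\epsilon\,\cdot)K_{j+1}\chi(\epsilon\,\cdot)(K_j-K_j')K_{j-1}'\cdots K_1'f \\
&\qquad+ \sum_{j=0}^{\nu-1}K_\nu\chi(\epsilon\,\cdot)\cdots\chi(\epsilon\,\cdot)K_{j+1}(\chi(\epsilon\,\cdot)-1)K_j'\cdots K_1'f.
\end{aligned} \tag{9.1}$$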
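The telescoping identity (9.1) is purely algebraic, so it can be checked in finite dimensions. The following sketch is an illustration only, not the paper's setting: it assumes that the operators $K_j$, $K_j'$ are random $3\times 3$ matrices and that multiplication by $\chi(\epsilon\,\cdot)$ is modelled by a diagonal matrix. The helper names are hypothetical.

```python
import random


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matadd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matsub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def product(factors, n):
    """Left-to-right product of a list of n x n matrices (identity if empty)."""
    out = identity(n)
    for m in factors:
        out = matmul(out, m)
    return out


def random_matrix(n, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]


def telescoping_sides(K, Kp, chi):
    """Return (lhs, rhs) of (9.1); K[j-1] plays K_j and Kp[j-1] plays K'_j."""
    nu, n = len(K), len(chi)

    def damped(j):
        # K_nu chi K_{nu-1} chi ... K_{j+1} chi  (identity if j == nu)
        factors = []
        for i in range(nu, j, -1):
            factors += [K[i - 1], chi]
        return product(factors, n)

    def primed(j):
        # K'_j K'_{j-1} ... K'_1  (identity if j == 0)
        return product([Kp[i - 1] for i in range(j, 0, -1)], n)

    lhs = matsub(damped(0), primed(nu))
    rhs = [[0.0] * n for _ in range(n)]
    for j in range(1, nu + 1):  # first sum in (9.1)
        diff = matsub(K[j - 1], Kp[j - 1])
        rhs = matadd(rhs, product([damped(j), diff, primed(j - 1)], n))
    for j in range(0, nu):  # second sum in (9.1)
        cut = matsub(chi, identity(n))
        rhs = matadd(rhs, product([damped(j + 1), K[j], cut, primed(j)], n))
    return lhs, rhs


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


if __name__ == "__main__":
    rng = random.Random(0)
    K = [random_matrix(3, rng) for _ in range(4)]
    Kp = [random_matrix(3, rng) for _ in range(4)]
    chi = [[rng.uniform(0.0, 1.0) if i == j else 0.0 for j in range(3)]
           for i in range(3)]
    lhs, rhs = telescoping_sides(K, Kp, chi)
    print("max |lhs - rhs| =", max_abs_diff(lhs, rhs))
```

The two sums separate the error caused by replacing $K_j$ with $K_j'$ from the error caused by removing one cutoff factor, which is how the identity is used in the convergence arguments that follow.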
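Noting (6.1) and (6.2), from (3.5) we have
$$S_c(T,0;q_\Delta,a_{\Lambda\Delta}) = \sum_{l=1}^{\nu}S_c\!\left(\tau_l,\tau_{l-1};\,q^{\tau_l,\tau_{l-1}}_{x^{(l)},x^{(l-1)}},\,a^{\tau_l,\tau_{l-1}}_{\Lambda X^{(l)},X^{(l-1)}}\right),$$
where $X^{(l)} = a_\Lambda^{(l)}$ ($l = 1,2,\ldots,\nu-1$) and $X^{(\nu)} = a_\Lambda$. So, (3.8) is written as
$$\lim_{\epsilon\to 0}C(T,\tau_{\nu-1})\chi(\epsilon\,\cdot)C(\tau_{\nu-1},\tau_{\nu-2})\chi(\epsilon\,\cdot)\cdots C(\tau_2,\tau_1)\chi(\epsilon\,\cdot)C(\tau_1,0)\chi(\epsilon\,\cdot)f$$
for $f \in B^a(\mathbb{R}^{3n+4N})$. Let $f \in B^a(\mathbb{R}^{3n+4N})$ and $|\Delta| \le \rho^*$. We can easily see
$$\sup_{0<\epsilon\le 1}\|\chi(\epsilon\,\cdot)f\|_{B^a} \le \mathrm{Const.}\,\|f\|_{B^a}$$
and
$$\lim_{\epsilon\to 0}\|(\chi(\epsilon\,\cdot)-1)f\|_{B^a} = 0.$$
Consequently, using Theorem 7.3 and (9.1), we can see that (3.8) is well defined in $B^a$, which is written as
$$C(T,\tau_{\nu-1})C(\tau_{\nu-1},\tau_{\nu-2})\cdots C(\tau_2,\tau_1)C(\tau_1,0)f \quad (= C_\Delta(T,0)f). \tag{9.2}$$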
We also see from Remark 3.5 that (3.8) exists in $\mathcal{S}$ for $f \in \mathcal{S}$. Let $0 \le t_0 \le t \le T$. For a subdivision $\Delta$ of $[0,T]$ we can find $j$ and $l$ such that $j \le l$, $\tau_{j-1} < t_0 \le \tau_j$ and $\tau_{l-1} < t \le \tau_l$, where we take $j = 1$ for $t_0 = 0$. Then we define
$$C_\Delta(t,t_0)f := \lim_{\epsilon\to 0}C(t,\tau_{l-1})\chi(\epsilon\,\cdot)C(\tau_{l-1},\tau_{l-2})\chi(\epsilon\,\cdot)\cdots\chi(\epsilon\,\cdot)C(\tau_{j+1},\tau_j)\chi(\epsilon\,\cdot)C(\tau_j,t_0)\chi(\epsilon\,\cdot)f \tag{9.3}$$
for $f \in B^a$ as was stated in Remark 3.3. Let $|\Delta| \le \rho^*$. Then we have
$$C_\Delta(t,t_0)f = C(t,\tau_{l-1})C(\tau_{l-1},\tau_{l-2})\cdots C(\tau_{j+1},\tau_j)C(\tau_j,t_0)f$$
as in the proof of (9.2). Consequently, from (7.5) we have
$$\|C_\Delta(t,t_0)f\|_{B^a} \le e^{K_a(t-t_0)}\|f\|_{B^a} \qquad (a = 0,1,2,\ldots) \tag{9.4}$$
under the assumptions of Theorem 3.1.

Proposition 9.1. Let $|\Delta| \le \rho^*$. Then, under the assumptions of Theorem 3.1 we can find an integer $M \ge 2$ such that
$$\|C_\Delta(t,t_0)f - C_\Delta(t',t_0')f\|_{B^a} \le C_a\,(|t-t'| + |t_0-t_0'|)\,\|f\|_{B^{a+M}} \tag{9.5}$$
for $0 \le t_0 \le t \le T$, $0 \le t_0' \le t' \le T$ and $a = 0,1,2,\ldots$.
Proof. Let $R(t,s)$ and $R'(t,s)$ be the operators defined by (8.1) and (8.2), respectively. We determine $M$ in Proposition 9.1 by $\max(M, M', 2)$ for $M$ and $M'$ in Theorem 8.1. We can easily see
$$i\hbar\,(C(t,s)f - C(t',s)f) = \int_{t'}^{t}\left(H(\theta)C(\theta,s)f + \sqrt{\theta-s}\,R(\theta,s)f\right)d\theta \tag{9.6}$$
from (8.1) for $s \le t' \le t \le T$. Let $\tau_j < t \le \tau_{j+1}$ and $\tau_k < t' \le \tau_{k+1}$. So $j \ge k$ holds. Using the equation just after (9.3) and (9.6), we get
$$\begin{aligned}
i\hbar\,(C_\Delta(t,t_0)f - C_\Delta(t',t_0)f)
&= \int_{t'}^{t}H(\theta)C_\Delta(\theta,t_0)f\,d\theta + \int_{\tau_j}^{t}\sqrt{\theta-\tau_j}\,R(\theta,\tau_j)\,d\theta\,C_\Delta(\tau_j,t_0)f \\
&\quad + \sum_{l=1}^{j-k-1}\int_{\tau_{j-l}}^{\tau_{j-l+1}}\sqrt{\theta-\tau_{j-l}}\,R(\theta,\tau_{j-l})\,d\theta\,C_\Delta(\tau_{j-l},t_0)f \\
&\quad + \int_{t'}^{\tau_{k+1}}\sqrt{\theta-\tau_k}\,R(\theta,\tau_k)\,d\theta\,C_\Delta(\tau_k,t_0)f.
\end{aligned} \tag{9.7}$$
See the proof of [21, Theorem 4.2] for further details. As in the proof of (7.14), we see
$$\|H(t)f\|_{B^a} \le \mathrm{Const.}\,\|f\|_{B^{a+M}} \tag{9.8}$$
from (3.10) because of $M \ge 2$. We also see
$$\|R(t,s)f\|_{B^a} \le \mathrm{Const.}\,\|f\|_{B^{a+M}} \tag{9.9}$$
from Proposition 7.4 for $0 \le t-s \le \rho^*$. Consequently, (9.4) and (9.7) show
$$\|C_\Delta(t,t_0)f - C_\Delta(t',t_0)f\|_{B^a} \le \mathrm{Const.}\,e^{K_{a+M}T}(1+\sqrt{\rho^*})\,|t-t'|\,\|f\|_{B^{a+M}}$$
for $0 \le t_0 \le t' \le t \le T$. The inequality above holds for $0 \le t_0 \le t', t \le T$. In the same way we get
$$\|C_\Delta(t,t_0)f - C_\Delta(t,t_0')f\|_{B^a} \le \mathrm{Const.}\,e^{K_{a+M}T}(1+\sqrt{\rho^*})\,|t_0-t_0'|\,\|f\|_{B^{a+M}}$$
for $0 \le t_0, t_0' \le t \le T$. Hence, the proof of Proposition 9.1 is complete.

Let $M \ge 2$ be the integer determined in Proposition 9.1. We consider a solution $u(t)$, which is $B^M$-valued continuous and $L^2$-valued continuously differentiable in $[t_0,T]$, to (3.9) with $u(t_0) = 0$ for a $t_0 \in [0,T)$. Then, noting $M \ge 2$, from (3.9) and (3.10) we can easily see
$$\frac{d}{dt}(u(t),u(t)) = 2\,\Re\left(\frac{du}{dt}(t),u(t)\right) = -2\hbar^{-1}\,\Re\,i\,(H(t)u(t),u(t)) = 0$$
and so $u(t) = 0$ in $[t_0,T]$, where $\Re a$ for a complex number $a$ denotes the real part of $a$. Consequently, we can see for a given $f \in B^{a+M}$ that the solution to (3.9) with $u(t_0) = f$ is determined uniquely in the space of all $B^M$-valued continuous and $L^2$-valued continuously differentiable functions in $[t_0,T]$.

Let $\{\Delta_j\}_{j=1}^{\infty}$ be a family of subdivisions of $[0,T]$ such that $|\Delta_j| \le \rho^*$ and $\lim_{j\to\infty}|\Delta_j| = 0$. Take an arbitrary $f \in B^{a+2M}$ ($a = 0,1,2,\ldots$). Then we see from (9.4) and (9.5) that $\{C_{\Delta_j}(t,t_0)f\}_{j=1}^{\infty}$ is uniformly bounded as a family of $B^{a+2M}$-valued continuous functions and equicontinuous as a family of $B^{a+M}$-valued functions in $0 \le t_0 \le t \le T$, respectively. It follows from the Rellich criterion (cf. [28, Theorem XIII.65]) that the embedding map from $B^M$ into $L^2$ is compact. So is the embedding map from $B^{a+2M}$ into $B^{a+M}$ from (7.12), (7.13) and [17, Lemma 2.5] with $a = b = 1$. Consequently, from the Ascoli–Arzelà theorem we can find a subsequence $\{\Delta_{j_k}\}_{k=1}^{\infty}$, which may depend on $f$, such that $C_{\Delta_{j_k}}(t,t_0)f$ converges in $B^{a+M}$ uniformly in $0 \le t_0 \le t \le T$ as $k \to \infty$. Since $C_{\Delta_j}(t_0,t_0)f = f$ follows from Lemma 6.1, (9.7)–(9.9) show that $\lim_{k\to\infty}C_{\Delta_{j_k}}(t,t_0)f =: U(t,t_0)f$, where $U(t,t_0)f$ is a $B^{a+M}$-valued continuous and $B^a$-valued continuously differentiable function in $0 \le t_0 \le t \le T$ satisfying (3.9) with $u(t_0) = f$. Hence, it follows from the uniqueness of solutions to (3.9) proved above that $C_\Delta(t,t_0)f$ converges to $U(t,t_0)f$ in $B^{a+M}$ uniformly in $0 \le t_0 \le t \le T$ as $|\Delta| \to 0$.

Take an arbitrary $f \in B^a$. Let $\Delta$ and $\Delta'$ be subdivisions such that $|\Delta| \le \rho^*$ and $|\Delta'| \le \rho^*$. For any $\epsilon > 0$ we can take a $g \in B^{a+2M}$ such that $\|g-f\|_{B^a} < \epsilon$. Then from (9.4) we have
$$\begin{aligned}
\|C_\Delta(t,t_0)f - C_{\Delta'}(t,t_0)f\|_{B^a}
&\le \|C_\Delta(t,t_0)g - C_{\Delta'}(t,t_0)g\|_{B^a} + \|C_\Delta(t,t_0)(f-g)\|_{B^a} + \|C_{\Delta'}(t,t_0)(f-g)\|_{B^a} \\
&\le \|C_\Delta(t,t_0)g - C_{\Delta'}(t,t_0)g\|_{B^{a+M}} + 2e^{K_aT}\epsilon.
\end{aligned}$$
So,
$$\lim_{|\Delta|,|\Delta'|\to 0}\ \max_{0\le t_0\le t\le T}\|C_\Delta(t,t_0)f - C_{\Delta'}(t,t_0)f\|_{B^a} \le 2e^{K_aT}\epsilon. \tag{9.10}$$
Hence, we can see that $C_\Delta(t,t_0)f$ converges in $B^a$ uniformly in $0 \le t_0 \le t \le T$ as $|\Delta| \to 0$. We write this limit as $W(t,t_0)f$. Let $f \in B^a$. Take $f_j \in B^{a+M}$ such that $\lim_{j\to\infty}f_j = f$ in $B^a$. From (9.7) we have
$$i\hbar\,(W(t,t_0)f_j - f_j) = \int_{t_0}^{t}H(\theta)W(\theta,t_0)f_j\,d\theta$$
in $B^a$. The inequality $\|W(t,t_0)f\|_{B^a} \le e^{K_a(t-t_0)}\|f\|_{B^a}$ holds from (9.4). So, from [17, Lemma 2.5] with $a = b = 1$ we can see
$$i\hbar\,(W(t,t_0)f - f) = \int_{t_0}^{t}H(\theta)W(\theta,t_0)f\,d\theta$$
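in $B^{a-2}$ and that $W(t,t_0)f$ is $B^a$-valued continuous and $B^{a-2}$-valued continuously differentiable in $0 \le t_0 \le t \le T$. Hence $\lim_{|\Delta|\to 0}C_\Delta(t,t_0)f$ $(= W(t,t_0)f)$ satisfies (3.9) with $u(t_0) = f$. Thus the proof of Theorem 3.1 is complete.

We shall consider the proof of Theorem 3.2. Let $q^{t,s}_{x,y}(\theta)$ and $a^{t,s}_{\Lambda X,Y}(\theta)$ be the paths defined by (6.1) and (6.2) for $s < t$, respectively. For $\xi_k \in \mathbb{R}^2$ ($k \in \Lambda_1'$) we define the path by
$$\phi^{t,s}_{\xi_k}(\theta) := \xi_k + \frac{4\pi\rho_k(q^{t,s}_{x,y}(\theta))}{|k|^2} \in \mathbb{R}^2, \qquad s \le \theta \le t \tag{9.11}$$
as in (3.12). The path $\phi^{t,s}_{\xi_k}(\theta) \in \mathbb{R}^2$ ($k \in \Lambda_1$) is defined by (2.13). So from (2.16) and (2.17) we have
$$\xi^{(1)}_{-k} = \xi^{(1)}_{k}, \qquad \xi^{(2)}_{-k} = -\xi^{(2)}_{k}.$$
For $k \in \Lambda_1$, we can easily see
$$\begin{aligned}
|k|^2\left|\phi^{t,s}_{\xi_k}(\theta)\right|^2 - 8\pi\rho_k(q^{t,s}_{x,y}(\theta))\cdot\phi^{t,s}_{\xi_k}(\theta)
&= |k|^2\left|\phi^{t,s}_{\xi_k} - \frac{4\pi\rho_k}{|k|^2}\right|^2 - \frac{16\pi^2}{|k|^2}|\rho_k|^2 \\
&= |k|^2|\xi_k|^2 - \frac{16\pi^2}{|k|^2}\left|\rho_k(q^{t,s}_{x,y}(\theta))\right|^2.
\end{aligned} \tag{9.12}$$
So, the classical action for $\tilde{L}$ defined by (3.11) is written as
$$S(t,s;q^{t,s}_{x,y},a^{t,s}_{\Lambda X,Y},\{\phi^{t,s}_{\xi_k}\}_{k\in\Lambda_1}) = S_c(t,s;q^{t,s}_{x,y},a^{t,s}_{\Lambda X,Y}) + \frac{t-s}{4\pi|V|}\sum_{k\in\Lambda_1}|k|^2|\xi_k|^2 \tag{9.13}$$
from (2.21) and (3.3). Let $\chi_1 \in C^\infty(\mathbb{R}^{2N_1})$ with compact support such that $\chi_1(0) = 1$. Let $\epsilon > 0$ and $\xi := \{\xi_k\}_{k\in\Lambda_1} \in \mathbb{R}^{2N_1}$. For $f \in \mathcal{S}(\mathbb{R}^{3n+4N})$ we define $G_\epsilon(t,s)f$ ($0 \le s \le t \le T$) by
$$G_\epsilon(t,s)f := \begin{cases}
\displaystyle\prod_{j=1}^{n}\sqrt{\frac{m_j}{2\pi i\hbar(t-s)}}^{\,3}\,\sqrt{\frac{1}{2\pi i\hbar|V|(t-s)}}^{\,4N}\prod_{k\in\Lambda_1}\frac{|k|^2(t-s)}{4i\pi^2\hbar|V|}\int\!\cdots\!\int e^{i\hbar^{-1}S}\,\chi_1(\epsilon\xi)\,f(y,Y)\,dy\,dY\prod_{k\in\Lambda_1}d\xi_k, & s < t, \\[2ex]
f, & s = t,
\end{cases} \tag{9.14}$$
where $S = S(t,s;q^{t,s}_{x,y},a^{t,s}_{\Lambda X,Y},\{\phi^{t,s}_{\xi_k}\}_{k\in\Lambda_1})$.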
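Proposition 9.2. Let $f \in B^a(\mathbb{R}^{3n+4N})$ ($a = 0,1,2,\ldots$). Then, under the assumptions of Theorem 3.1 we have
$$\lim_{\epsilon\to 0}G_\epsilon(t,s)f = C(t,s)f \tag{9.15}$$
in $B^a$ for $0 \le t-s \le \rho^*$.

Proof. In the case $t = s$, Eq. (9.15) is clear from (7.6). Let $0 < t-s \le \rho^*$ and $f \in \mathcal{S}(\mathbb{R}^{3n+4N})$. From (9.13) we have
$$\begin{aligned}
G_\epsilon(t,s)f &= \prod_{j=1}^{n}\sqrt{\frac{m_j}{2\pi i\hbar(t-s)}}^{\,3}\,\sqrt{\frac{1}{2\pi i\hbar|V|(t-s)}}^{\,4N}\iint\left(\exp i\hbar^{-1}S_c(t,s;q^{t,s}_{x,y},a^{t,s}_{\Lambda X,Y})\right)f(y,Y)\,dy\,dY \\
&\quad\times\prod_{k\in\Lambda_1}\frac{|k|^2(t-s)}{4i\pi^2\hbar|V|}\int\!\cdots\!\int\exp\left(\frac{i(t-s)}{4\pi\hbar|V|}\sum_{k\in\Lambda_1}|k|^2|\xi_k|^2\right)\chi_1(\epsilon\xi)\prod_{k\in\Lambda_1}d\xi_k.
\end{aligned}$$
Let $\eta_k := (\eta_k^{(1)},\eta_k^{(2)}) \in \mathbb{R}^2$ and $\eta := \{\eta_k\}_{k\in\Lambda_1}$. We know
$$\int_{-\infty}^{\infty}e^{ia\theta^2}\,d\theta = \sqrt{\frac{i\pi}{a}} \tag{9.16}$$
for a constant $a > 0$. So we can write
$$G_\epsilon(t,s)f = P_\epsilon(t,s)f, \tag{9.17}$$
where
$$p_\epsilon(t,s) = \prod_{k\in\Lambda_1}\frac{|k|^2}{i\pi}\int\!\cdots\!\int\exp\left(i\sum_{k\in\Lambda_1}|k|^2|\eta_k|^2\right)\chi_1\!\left(\epsilon\sqrt{4\pi\hbar|V|/(t-s)}\,\eta\right)\prod_{k\in\Lambda_1}d\eta_k. \tag{9.18}$$
We see that
$$\lim_{\epsilon\to 0}p_\epsilon(t,s) = 1$$
pointwise. Letting $q_\epsilon(t,s) = p_\epsilon(t,s) - 1$, we have
$$P_\epsilon(t,s)f - C(t,s)f = Q_\epsilon(t,s)f. \tag{9.19}$$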
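The role of the cutoff in (9.14)–(9.18) is to make the oscillatory integral (9.16) absolutely convergent before the limit $\epsilon \to 0$ is taken. A minimal numerical sketch of this mechanism follows. It is an illustration under a simplifying assumption: the compactly supported cutoff $\chi_1(\epsilon\xi)$ is replaced by a Gaussian damping factor $e^{-\epsilon\theta^2}$, for which the integral has the closed form $\sqrt{\pi/(\epsilon - ia)}$. The function names are hypothetical.

```python
import cmath
import math


def damped_fresnel_closed_form(a, eps):
    """Integral of exp((i a - eps) theta^2) over the real line, for eps > 0."""
    return cmath.sqrt(math.pi / (eps - 1j * a))


def damped_fresnel_quadrature(a, eps, half_width=20.0, step=0.002):
    """Trapezoidal approximation of the same integral on [-half_width, half_width]."""
    n = int(round(2.0 * half_width / step))
    total = 0.0 + 0.0j
    for i in range(n + 1):
        theta = -half_width + i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * cmath.exp((1j * a - eps) * theta * theta)
    return total * step


def fresnel_limit(a):
    """Right-hand side of (9.16): sqrt(i pi / a) for a > 0."""
    return cmath.sqrt(1j * math.pi / a)


if __name__ == "__main__":
    a = 1.0
    # The quadrature and the closed form agree for a fixed damping ...
    print(damped_fresnel_quadrature(a, 0.1), damped_fresnel_closed_form(a, 0.1))
    # ... and the closed form approaches sqrt(i pi / a) as the damping is removed.
    for eps in (1e-1, 1e-3, 1e-6):
        print(eps, abs(damped_fresnel_closed_form(a, eps) - fresnel_limit(a)))
```

The same limit underlies $\lim_{\epsilon\to 0}p_\epsilon(t,s) = 1$: each two-dimensional $\eta_k$ integral contributes $i\pi/|k|^2$, which the prefactor $|k|^2/(i\pi)$ in (9.18) cancels.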
We consider
$$\|G_\epsilon(t,s)f - C(t,s)f\|^2 = \|P_\epsilon(t,s)f - C(t,s)f\|^2 = \left((P_\epsilon(t,s)-C(t,s))^\dagger(P_\epsilon(t,s)-C(t,s))f,\,f\right) = \left(Q_\epsilon(t,s)^\dagger Q_\epsilon(t,s)f,\,f\right).$$
Hence, we obtain (9.15) as in the proof of Theorem 7.3 in the present paper together with [17, Lemma 2.2]. See the proof of [20, Lemma 4.1] for further details.

We can write (3.13) as
$$\lim_{\epsilon\to 0}G_\epsilon(T,\tau_{\nu-1})\chi(\epsilon\,\cdot)G_\epsilon(\tau_{\nu-1},\tau_{\nu-2})\chi(\epsilon\,\cdot)\cdots G_\epsilon(\tau_2,\tau_1)\chi(\epsilon\,\cdot)G_\epsilon(\tau_1,0)\chi(\epsilon\,\cdot)f \tag{9.20}$$
in the same way that (3.8) is written just before (9.2). Integrating by parts in (9.18), we see that $\sup_{0<\epsilon\le 1}|p_\epsilon(t,s)|$ is finite. So the same proof as for (7.5) shows
$$\sup_{0<\epsilon\le 1}\|G_\epsilon(t,s)f\|_{B^a} \le C_a\|f\|_{B^a}, \qquad a = 0,1,2,\ldots$$
with constants Ca from (9.17). Hence, using (9.1), we can prove Theorem 3.2 as in the proof of the convergence of (3.8) to (9.2) together with (9.15). Finally, we will prove Theorem 3.3. As in the proof of (6.15) we get 1 (j) (j) (j) Aex (t, x ) · dx − φex (t, x )dt − c q zt,s q zt,s y x ,y ,x 1 = (x(j) − y (j) ) · c − (t − s)(x
(j)
1
Aex (s, x(j) − θ(x(j) − y (j) ))dθ
0
−y
(j)
)·
1 0
1
σ1 Eex (τ (σ), ζ (j) (σ))dσ1 dσ2
0
1 1 3 3 1 (j) (j) (j) (j) − (x − ym ) (zl − xl ) σ1 Bml (τ (σ), ζ (j) (σ))dσ1 dσ2 c m=1 m 0 0 l=1
(9.21)
for $s < t$, where $(B_{23}(t,x), B_{31}(t,x), B_{12}(t,x)) = B_{\mathrm{ex}}(t,x)$, $B_{lm} = -B_{ml}$, and $\tau(\sigma)$ and $\zeta^{(j)}(\sigma)$ were defined by (6.11). See the proof of [18, Proposition 3.3] for further details. So, we get Eq. (6.18) with the sum over $j = 1,2,\ldots,n$ of (9.21) multiplied by $m_j e_j/(t-s)$ added to it. Hence, under the assumptions of Theorem 3.3 we obtain the same assertion as in Theorem 3.1 in the same way that Theorem 3.1 is proved. In the same way as in the proof of Theorem 3.2 we also get the same assertion as in Theorem 3.2 under the assumptions of Theorem 3.3. Thus the proof of the main results is complete.
Acknowledgements The author thanks the referee for many useful suggestions. This research was partially supported by Grant-in-Aid for Scientific Research No. 16540145 and No. 19540175, Ministry of Education, Culture, Sports, Science and Technology, Japanese Government.
References

[1] S. Albeverio, A. Hahn and A. N. Sengupta, Chern–Simons theory, Hida distribution, and state models, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 6 (2003) 65–81.
[2] S. Albeverio, R. J. Høegh-Krohn and S. Mazzucchi, Mathematical Theory of Feynman Path Integrals, Lecture Notes in Math. Vol. 523, 2nd edn. (Springer-Verlag, Berlin and Heidelberg, 2008).
[3] A. Arai, Fock Space and Quantum Field (Nihon Hyoron Co., Tokyo, 2000) (in Japanese).
[4] F. A. Berezin and M. A. Shubin, The Schrödinger Equation (Kluwer Academic Publishers, Dordrecht, 1983).
[5] A. P. Calderón and R. Vaillancourt, On the boundedness of pseudo-differential operators, J. Math. Soc. Japan 23 (1971) 374–378.
[6] J. M. Cook, The mathematics of second quantization, Trans. Amer. Math. Soc. 74 (1953) 222–245.
[7] P. A. M. Dirac, The Principles of Quantum Mechanics, 4th edn. (Oxford Univ. Press, London, 1958).
[8] E. Fermi, Quantum theory of radiation, Rev. Mod. Phys. 4 (1932) 87–132.
[9] R. P. Feynman, Space-time approach to nonrelativistic quantum mechanics, Rev. Mod. Phys. 20 (1948) 367–387.
[10] R. P. Feynman, Mathematical formulation of the quantum theory of electrodynamic interaction, Phys. Rev. 80 (1950) 440–457.
[11] R. P. Feynman and A. R. Hibbs, Quantum Mechanics and Path Integrals (McGraw-Hill, New York, 1965).
[12] J. Fröhlich, M. Griesemer and I. M. Sigal, Spectral theory for the standard model of non-relativistic QED, Comm. Math. Phys. 283 (2008) 613–646.
[13] I. M. Gel'fand and N. Y. Vilenkin, Generalized Functions. Vol. IV, Applications of Harmonic Analysis (Academic Press, New York-London, 1964).
[14] S. J. Gustafson and I. M. Sigal, Mathematical Concepts of Quantum Mechanics (Springer, Berlin, 2003).
[15] T. Hida, H.-H. Kuo, J. Potthoff and L. Streit, White Noise (Kluwer Academic Publishers, Dordrecht, 1993).
[16] F. Hiroshima, Functional integral representation of a model in quantum electrodynamics, Rev. Math. Phys. 9 (1997) 489–530.
[17] W. Ichinose, A note on the existence and ℏ-dependency of the solution of equations in quantum mechanics, Osaka J. Math. 32 (1995) 327–345.
[18] W. Ichinose, On the formulation of the Feynman path integral through broken line paths, Comm. Math. Phys. 189 (1997) 17–33.
[19] W. Ichinose, On convergence of the Feynman path integral formulated through broken line paths, Rev. Math. Phys. 11 (1999) 1001–1025.
[20] W. Ichinose, The phase space Feynman path integral with gauge invariance and its convergence, Rev. Math. Phys. 12 (2000) 1451–1463.
[21] W. Ichinose, Convergence of the Feynman path integral in the weighted Sobolev spaces and the representation of correlation functions, J. Math. Soc. Japan 55 (2003) 957–983.
[22] W. Ichinose, The continuity of the solution with respect to an electromagnetic potential to the Schrödinger equation and the Dirac equation, preprint (2009).
[23] G. W. Johnson and M. L. Lapidus, The Feynman Integral and Feynman's Operational Calculus (Oxford Univ. Press, Oxford, 2000).
[24] H. Kumano-go, Pseudo-Differential Operators (MIT Press, Cambridge, 1981).
[25] E. H. Lieb and M. Loss, Analysis (Amer. Math. Soc., Providence, 1997).
[26] S. Mizohata, The Theory of Partial Differential Equations (Cambridge Univ. Press, New York, 1973).
[27] E. Nelson, Schrödinger particles interacting with a quantized scalar field, in Proceedings of a Conference on the Theory and Applications of Analysis in Function Space (M.I.T. Press, Cambridge, 1964), pp. 88–120.
[28] M. Reed and B. Simon, Methods of Modern Mathematical Physics IV: Analysis of Operators (Academic Press, New York, 1978).
[29] J. J. Sakurai, Advanced Quantum Mechanics (Addison-Wesley, Massachusetts, 1967).
[30] F. E. Schroeck, Jr., Generalization of the Cook formalism for Fock space, J. Math. Phys. 12 (1971) 1849–1857.
[31] J. T. Schwartz, Nonlinear Functional Analysis (Gordon and Breach Science Publishers, New York, 1969).
[32] H. Spohn, Dynamics of Charged Particles and Their Radiation Field (Cambridge University Press, Cambridge, 2004).
[33] M. S. Swanson, Path Integrals and Quantum Processes (Academic Press, San Diego, 1992).
July 12, 2010 12:0 WSPC/S0129-055X J070-S0129055X10004053
148-RMP
Reviews in Mathematical Physics Vol. 22, No. 6 (2010) 597–667 c World Scientific Publishing Company DOI: 10.1142/S0129055X10004053
GRADIENT FLOWS FOR OPTIMIZATION IN QUANTUM INFORMATION AND QUANTUM DYNAMICS: FOUNDATIONS AND APPLICATIONS
THOMAS SCHULTE-HERBRÜGGEN∗ and STEFFEN J. GLASER
Department of Chemistry, Technical University of Munich (TUM), Lichtenbergstrasse 4, D-85747 Garching, Germany
∗[email protected]

GUNTHER DIRR† and UWE HELMKE
Institute of Mathematics, University of Würzburg, Am Hubland, D-97074 Würzburg, Germany
†[email protected]

Received 14 December 2008
Revised 26 February 2010

Many challenges in quantum information and quantum control are rooted in constrained optimization problems on finite-dimensional quantum systems. The constraints often arise from two facts: (i) quantum dynamic state spaces are naturally smooth manifolds (orbits of the respective initial states) rather than being Hilbert spaces; (ii) the dynamics of the respective quantum system may be restricted to a proper subset of the entire state space. Mathematically, either case can be treated by constrained optimization over the reachable set of an underlying control system. Thus, whenever the reachable set takes the form of a smooth manifold, Riemannian optimization methods apply. Here, we give a comprehensive account of the foundations of gradient flows on Riemannian manifolds including new applications in quantum information and quantum dynamics. Yet, we do not pursue the problem of designing explicit controls for the underlying control systems. The framework is sufficiently general for setting up gradient flows on (sub)manifolds, Lie (sub)groups, and (reductive) homogeneous spaces. Relevant convergence conditions are discussed, in particular for gradient flows on compact and analytic manifolds. This is meant to serve as foundation for new achievements and further research. Illustrative examples and new applications are given: we extend former results on unitary groups to closed subgroups with tensor-product structure, where the finest product partitioning relates to $SU_{\mathrm{loc}}(2^n) := SU(2)\otimes\cdots\otimes SU(2)$, known as (qubit-wise) local unitary operations. Such applications include, e.g., optimizing figures of merit on $SU_{\mathrm{loc}}(2^n)$ relating to distance measures of pure-state entanglement as well as to best rank-1 approximations of higher-order tensors. In quantum information, our gradient flows provide a numerically favorable alternative to standard tensor-SVD techniques.

Keywords: Constrained optimization in quantum systems; Riemannian optimization; Riemannian gradient flows and algorithms; double-bracket flows; quantum control; low-rank approximation of tensors; tensor SVD.

Mathematics Subject Classification 2010: 49-02, 49R50, 53-02, 53Cxx, 65Kxx, 81V70, 90C30, 15A18, 15A69
Contents

1. Introduction
2. Overview
2.1. Flows and dynamical systems
2.2. Gradient flows for optimization
2.3. Discretized gradient flows
2.4. Reachability and controllability
2.5. Settings of interest
3. Theory: Gradient Flows
3.1. Gradient flows on Riemannian manifolds
3.2. Gradient flows on Lie groups
3.3. Gradient flows on homogeneous spaces
3.4. Examples
4. Applications to Quantum Information and Quantum Control
4.1. A geometric measure of pure-state entanglement
4.2. Generalized local subgroups
4.3. Locally reversible interaction Hamiltonians
4.4. Intrinsic versus penalty approach: An example
5. Conclusions
1. Introduction

Controlling quantum systems offers a great potential for performing computational tasks or for simulating the behavior of other quantum systems (which are difficult to handle experimentally) or classical systems [1,2], when the complexity of a problem reduces upon going from a classical to a quantum setting [3]. Important examples are known in quantum computation, quantum search and quantum simulation. Most prominently, there is the exponential speed-up by Shor's quantum algorithm of prime factorisation [4,5], which on a general level relates to the class of quantum algorithms [6,7] solving hidden subgroup problems in an efficient way [8]. In Grover's quantum search algorithm [9,10] one still finds a quadratic acceleration compared to classical approaches [11]. Recently, the simulation of quantum phase-transitions [12] has shifted into focus [13,14].

Among the generic tools needed for advances in quantum simulation and quantum technology, quantum control plays a major role. For a survey see, e.g., [15,16]. Its key concern is to develop (optimal) control strategies and constructive ways for implementing them under realistic experimental settings such that a certain performance index is maximized. In a wider sense, such figures of merit depend on terminal conditions as well as on running cost like time or energy. Yet in quantum control important classes of performance indices are completely determined by
their value at the final state, typical examples being quantum gate fidelities, efficiencies of state transfer or coherence transfer, as well as distance functions related to Euclidean entanglement measures. Since realistic quantum systems are mostly beyond analytical tractability, numerical methods are often indispensable. A good strategy is to proceed in two steps: (a) firstly, by exploring the possible gains on an abstract and computationally cheap level, i.e. by maximizing the quality function either over the entire state space or over the set of possible states — the so-called reachable set; (b) secondly, by going into optimizing the experimental controls (“pulse shapes”) in a concrete setting. However, (b) is often computationally expensive and highly sensitive, as it actually approximates the solution of a constrained variational problem in an infinite dimensional function space. Here we almost entirely focus on task (a) and no longer pursue issues of optimal control (b). In particular, we do not address the problem of designing controls that steer concrete experimental setups to achieve optimal figures of merit. By merely depending on the geometry of the underlying state-space manifold the first instance (a) allows for analyzing in advance and on an abstract level the limits of what one can achieve in step (b). We therefore refer to (a) as the abstract optimization task. The second instance, in contrast, hinges on introducing the specific time scales and control parameters of an experimental setting for finding steerings of the quantum dynamical system such that the optima determined in (a) are actually assumed. This is why we term (b) the dynamic optimal control task. Certainly, one can approach the entire problem only in terms of (b) and sometimes one is even forced to do so, e.g., if nothing is known about the geometric structure of the reachable set. 
Yet, the above two-fold strategy serves to yield (strict) upper bounds (independent of the concrete experimental setting) in (a) which provide benchmarks for the reliability of the numerical methods applied in (b).

In a pioneering paper [17], Brockett introduced the idea of exploiting gradient flows on the orthogonal group for diagonalizing real symmetric matrices and for sorting lists. In a series of subsequent papers he extended the concept to intrinsic gradient methods for (constrained) optimization [18,19]. Soon these techniques were generalized to Riemannian manifolds, their mathematical and numerical details were worked out [20–22] and thus they turned out to be applicable to a broad range of optimization tasks including eigenvalue and singular-value problems, principal component analysis, matrix least-squares matching problems, balanced matrix factorisations, and combinatorial optimization; for an overview see, e.g., [22,23]. Implementing a gradient method for optimization on a smooth (constrained) manifold, such as a unitary orbit, via the Riemannian exponential map, inherently ensures that the (discretized) flow remains within the manifold. Alternatively, formulating the optimization problem on some embedding Euclidean space comes at the expense of additional constraints (e.g., enforcing unitarity) to be taken care of by penalty-type or augmented Lagrange-type techniques. In this sense, gradient flows on manifolds are intrinsic optimization methods [24], whereas extrinsic
optimizations on an embedding space require in general nonlinear projective techniques in order to stay on the (constrained) manifold. In particular, using the differential geometry of matrix manifolds has thus become a field of active research. For new developments (however without exploiting the Lie structure to the full extent) see, e.g., [25]. Even beyond manifolds, gradient flows have recently been described for metric spaces with applications of probability theory [26].

For optimization in quantum dynamics, gradient flows and their discrete numerical integration schemes have also proven powerful tools. This holds in both types of tasks: (a) for exploring the maxima of pertinent quality functions on the reachable set of a quantum system, e.g., on the unitary group and its orbits [27–29] and (b) beyond the current focus also for arriving at concrete experimental steerings (i.e. “pulse shapes”). These steerings actually achieve the quality limits established in (a) under given experimental conditions for closed systems [30–32], whereas they give (best) approximations in open systems [33–35]. Note that in task (b) gradient flows on the set of admissible control amplitudes can be viewed as another instance of flows on Riemannian manifolds. However, this instance will not be pursued here.

Moreover, in view of unifying variational approaches to ground-state calculations [36,37], a common framework of gradient flows on Riemannian manifolds as well as projective techniques on their tangent spaces will be useful. The latter allow for restricting the flows, e.g., from Lie groups G to any closed subgroup H, in particular to any closed subgroup of SU(N). Consecutive partitionings into different subgroups of SU(N) are exploited in unitary networks addressing large-scale quantum systems by neglecting long-range entangling correlations [38–40]. Related techniques for truncating the Hilbert space to pertinent parametrized subsets include matrix-product states (MPS) [41,42] of density-matrix renormalization groups (DMRG) [43,44], quantum cellular automata with Margolus partitionings [45], projected entangled pair states (PEPS) [46], weighted graph states (WGS) [47], multi-scale entanglement renormalization approaches (MERA) [48], string-bond states (SBS) [49] as well as combined methods [36,50]. It is noteworthy that in many-particle physics gradient flows for diagonalizing Hamiltonians were reintroduced independently of Brockett's work [17] by Wegner [51] and were further elaborated on, again independently of the monograph by Helmke and Moore [22] or the one by Bloch [23], in the tract by Kehrein [52]. This may suffice to illustrate the need for making the mathematical methods available to the physics community in a comprehensive way.

Another field of applications of restricting flows to closed subgroups of SU(N) is entanglement of multi-partite quantum systems [53,54]: we present a connection from gradient flows on the subgroup of local unitary operations to best rank-1 approximations of higher-order tensors as well as a relation to tensor-SVDs. These methods are of importance, e.g., in view of optimization of entanglement witnesses [55]. Applying the same approach to other subgroups of SU(N) with tensor product structure is anticipated to be of use also for classifying multi-partite
systems by partial separability, an example being three-tangles of GHZ-type and W-type states [56–58]. Here the goal is to give a comprehensive account of the foundations of gradient flows — and thus the justification for some recent developments — as well as to present new applications to quantum information. Terms are kept general enough to trigger future developments, since we elucidate the necessary requirements for implementing gradient-based optimization methods in different geometric settings: Riemannian manifolds and submanifolds, Lie groups and homogeneous spaces. We will also show how they can be carried over to homogeneous spaces that do no longer form Lie groups themselves. Standard examples are coset spaces G/H, i.e. the quotient of a Lie group G by a closed (yet not necessarily normal) subgroup H. Here naturally reductive homogeneous space are of particular interest and the wellknown double-bracket flows will be demonstrated to form a special case precisely of this kind. A separate paper on open quantum systems [59] sets up a formal approach within the framework of Lie semigroups accounting for Markovian quantum evolutions (or Markovian channels). There we also show the current limits of abstract optimization over reachable sets specifically arising in open systems. The differential geometry of the set of all completely positive, trace-preserving invertible maps is analyzed in the framework of Lie semigroups. In particular, the set of all Kossakowski–Lindblad generators is retrieved as its tangent cone (Lie wedge). Moreover, it shows how the Lie-semigroup structure corresponds to the Markov properties recently studied in terms of divisibility [60]. It illustrates why abstract optimization tasks for open systems are much more intricate than in the case of closed system, while dynamic optimal control tasks for open systems can be handled completely analogously. 
It specifies algebraic conditions for time-optimal controls to be the method of choice in open systems. Finally it draws perspectives to a new algorithmic approach for optimization on semigroup orbits combining (a priori) knowledge of the respective Lie wedge with well-known techniques from optimal control.

Outline

To begin with, we recall some basic terminology on dynamical systems and Riemannian geometry. Then the aim is to provide the differential geometric tools for setting up Riemannian optimization methods, primarily focussing on gradient flows, in different scenarios ranging from optimization over the entire unitary group to closed subgroups or homogeneous spaces. Finally we give a number of applications including worked examples. More precisely, the paper is organized as follows: Section 2 draws a general sketch of dynamical systems and flows on manifolds including issues of reachability and controllability. It provides the manifold setting for gradient-flow-based algorithms like steepest ascent, conjugate gradients, Jacobi-type, and Newton methods. A detailed analysis is then given in Sec. 3, where (1) we summarize the general preconditions for gradient flows on smooth manifolds. In particular, we recall the role
July 12, J070-S0129055X10004053
T. Schulte-Herbrüggen et al.
of a Riemannian metric that allows for identifying the cotangent bundle $T^*M$ with the tangent bundle $TM$. Though major parts of the foundations can be found scattered in [22,25,61], here we add a comprehensive overview of the interplay between Riemannian geometry, Lie groups, and (reductive) homogeneous spaces. (2) We give examples of gradient flows on compact Lie groups as well as their closed subgroups. (3) In view of further developments, we address gradient flows on reductive homogeneous spaces including specializations to Cartan-like cases as well as naturally reductive homogeneous spaces. In particular, double-bracket flows turn out to be gradient flows on naturally reductive homogeneous spaces. (4) Examples interspersed in the main text illustrate the relevance in a plethora of different settings.

Section 4 is dedicated to specific applications in quantum information and quantum control. (1) We show how gradient flows on the subgroup of local unitaries $SU_{\rm loc}(2^n)$ in $n$ qubits not only provide a valuable tool in witness optimization, but also relate to generalized singular-value decompositions (SVD), namely the tensor-SVD. Here, our gradient flows yield an alternative to common algorithms for best rank-1 approximations of higher-order tensors, e.g., higher-order power methods (HOPM) or higher-order orthogonal iteration (HOOI). (2) Flows on $SU_{\rm loc}(2^n)$ also serve as a convenient tool to decide whether Hamiltonian interactions can be time-reversed solely by local unitary manipulations, thus complementing the algebraic assessment given in [29]. (3) Optimization tasks with additional extrinsic constraints are addressed by tailored gradient flows on the respective subgroups or by auxiliary penalty methods. By including practical applications and worked examples we illustrate the ample range of problems to which gradient flows on manifolds provide valuable solutions.
To this end, we start out with an extended overview of Riemannian optimization techniques on manifolds with particular emphasis on gradient techniques.

2. Overview

2.1. Flows and dynamical systems

In this paper, we treat various optimization tasks for quantum dynamical systems in a common framework, namely by gradient flows on smooth manifolds. Let $M$ denote a smooth manifold, e.g., the unitary orbit of all quantum states relating to an initial state $X_0$. By a continuous-time dynamical system or a flow one means a smooth map
\[
\Phi : \mathbb{R} \times M \to M \tag{2.1}
\]
such that for all states $X \in M$ and times $t, \tau \in \mathbb{R}$ one has
\[
\Phi(0, X) = X, \qquad \Phi(\tau, \Phi(t, X)) = \Phi(t + \tau, X). \tag{2.2}
\]
Since these equations hold for any $X \in M$, one gets the operator identity
\[
\Phi_\tau \circ \Phi_t = \Phi_{t+\tau} \tag{2.3}
\]
for all $t, \tau \in \mathbb{R}$, thus showing that the flow acts as a one-parameter group and, for positive times $t, \tau \ge 0$, as a one-parameter semigroup of diffeomorphisms on $M$.

2.2. Gradient flows for optimization

Now, the general idea for optimizing a scalar quality function on a smooth manifold $M$ (which might either arise naturally or from including smooth equality constraints, vide infra) by dynamical systems is as follows: Let $f : M \to \mathbb{R}$ be a smooth quality function on $M$. The differential of $f$ is a mapping $Df : M \to T^*M$ of the manifold to its cotangent bundle $T^*M$, while the gradient vector field is a mapping $\operatorname{grad} f : M \to TM$ to its tangent bundle $TM$. So the gradient of $f$ at $X \in M$, denoted $\operatorname{grad} f(X)$, is the vector in $T_X M$ uniquely determined by
\[
Df(X) \cdot \xi = \langle \operatorname{grad} f(X) \mid \xi \rangle_X \quad \text{for all } \xi \in T_X M. \tag{2.4}
\]
Here, the scalar product $\langle \cdot \mid \cdot \rangle_X$ plays a central role: it allows for identifying $T_X^* M$ with $T_X M$. The pair $(M, \langle \cdot \mid \cdot \rangle)$ is called a Riemannian manifold with Riemannian metric $\langle \cdot \mid \cdot \rangle$. In view of gradient flows, the convenience of Riemannian manifolds lies in the fact that, by duality, the differential $Df(X)$ of $f$ at $X$ can be identified with a tangent vector in $T_X M$. Then, the flow $\Phi : \mathbb{R} \times M \to M$ determined by the ordinary differential equation
\[
\dot{X} = \operatorname{grad} f(X) \tag{2.5}
\]
is termed gradient flow. Formally, it is obtained by integrating Eq. (2.5), i.e.
\[
\Phi(t, X) = \Phi(t, \Phi(0, X)) = X(t), \tag{2.6}
\]
where $X(t)$ denotes the unique solution of Eq. (2.5) with initial value $X(0) = X$. Observe that this ensures that $f$ increases along trajectories $\Phi$ of the system by virtue of following the gradient direction of $f$. In generic problems, gradient flows typically run into some local extremum as sketched in Fig. 2. Therefore a sufficiently large set of independent initial conditions may be needed to provide confidence in numerical results. However, in some pertinent applications, local extrema can be ruled out; prominent examples of this type will be discussed in detail in the context of Brockett's double bracket flow [17, 22].

2.3. Discretized gradient flows

Gradient flows may be envisaged as natural continuous versions of the steepest ascent method for optimizing a real-valued function $f : \mathbb{R}^m \to \mathbb{R}$ by moving along its gradient $\operatorname{grad} f \in \mathbb{R}^m$, i.e.

Steepest ascent method
\[
X_{k+1} = X_k + \alpha_k \operatorname{grad} f(X_k), \tag{2.7}
\]
where $\alpha_k \ge 0$ is an appropriate step size.
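As a minimal sketch of the iteration (2.7), the following Python fragment runs steepest ascent in $\mathbb{R}^2$. The concave quadratic quality function, the fixed step size `alpha` and the number of steps are illustrative assumptions and are not taken from the text.

```python
# Steepest ascent X_{k+1} = X_k + alpha_k * grad f(X_k) in R^2, cf. Eq. (2.7).
# The quality function, the fixed step size and the iteration count are
# illustrative assumptions.

def f(x):
    """Concave quadratic with its unique maximum at (1.0, -0.5)."""
    return -(x[0] - 1.0) ** 2 - 2.0 * (x[1] + 0.5) ** 2

def grad_f(x):
    """Euclidean gradient of f."""
    return [-2.0 * (x[0] - 1.0), -4.0 * (x[1] + 0.5)]

def steepest_ascent(x0, alpha=0.1, steps=200):
    """Iterate Eq. (2.7) with constant step size and record f along the way."""
    x = list(x0)
    values = [f(x)]
    for _ in range(steps):
        g = grad_f(x)
        x = [xi + alpha * gi for xi, gi in zip(x, g)]
        values.append(f(x))
    return x, values

x_final, values = steepest_ascent([3.0, 2.0])
```

With this choice of step size the recorded values of `f` do not decrease from one iterate to the next, which mirrors the monotonicity of $f$ along the continuous gradient flow.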
Here, the right-hand side of Eq. (2.7) does make sense, as the manifold $M = \mathbb{R}^m$ coincides with its tangent space $T_X M = \mathbb{R}^m$ containing $\operatorname{grad} f(X)$. Clearly, a generalization is required as soon as $M$ and $T_X M$ are no longer identifiable. This gap is filled by the Riemannian exponential map defined by
\[
\exp_X : T_X M \to M, \qquad \xi \mapsto \exp_X(\xi) \tag{2.8}
\]
so that $t \mapsto \exp_X(t\xi)$ describes the unique geodesic with initial value $X \in M$ and "initial velocity" $\xi \in T_X M$ as illustrated in Fig. 1. If the manifold $M$ carries the structure of a matrix Lie group $\mathbf{G}$, we may identify the tangent space element $\xi \in T_X \mathbf{G}$ with $\Omega X$, where $\Omega$ is itself an element of the Lie algebra $\mathfrak{g}$, i.e. of the tangent space at the unity element, $\mathfrak{g} = T_1 \mathbf{G}$. Moreover, if the Lie-group structure matches with the Riemannian framework in the sense that the metric is bi-invariant (as will be explained in more detail later), then the Riemannian exponential of $\xi = \Omega X$ can readily be calculated explicitly. This is done in three steps by (i) right translation with the inverse group element $X^{-1}$, (ii) taking the conventional exponential map of the Lie algebra element $\Omega$, (iii) right translation with the group element $X$, as summarized in the following diagram
\[
\begin{array}{ccc}
\xi = \Omega X \in T_X \mathbf{G} & \xrightarrow{\;\exp_X\;} & e^{\Omega} X \in \mathbf{G} \\
\Big\downarrow {\scriptstyle R_{X^{-1}}} & & \Big\uparrow {\scriptstyle R_X} \\
\Omega \in \mathfrak{g} & \xrightarrow{\;\exp\;} & e^{\Omega} \in \mathbf{G}.
\end{array}
\tag{2.9}
\]
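The three steps of diagram (2.9) can be traced numerically on the rotation group $SO(2)$. The sketch below is only an illustration under stated assumptions: matrices are plain nested lists, the helper `expm` is a truncated Taylor series (adequate for small matrices of moderate norm, not a general-purpose routine), and the angles `theta` and `w` are arbitrary test values.

```python
import math

# Three-step recipe of diagram (2.9) on the matrix Lie group SO(2).
# expm is a truncated Taylor series; this is an illustrative choice.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def expm(a, terms=30):
    """Matrix exponential via the partial sum of sum_k a^k / k!."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = matmul(term, a)
        term = [[entry / k for entry in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def riemannian_exp(x, xi):
    """exp_X(xi) for xi = Omega X, following steps (i) to (iii)."""
    omega = matmul(xi, transpose(x))  # (i) right translation by X^{-1} = X^T
    step = expm(omega)                # (ii) conventional exponential of Omega
    return matmul(step, x)            # (iii) right translation by X

theta, w = 0.3, 0.5
X = rotation(theta)
Omega = [[0.0, -w], [w, 0.0]]         # element of the Lie algebra so(2)
xi = matmul(Omega, X)                 # tangent vector at X
Y = riemannian_exp(X, xi)
expected = rotation(theta + w)        # geodesic endpoint in closed form
```

For $SO(2)$ the result can be checked against the closed form: moving from the rotation by `theta` along $\xi = \Omega X$ ends at the rotation by `theta + w`.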
Next, the gradient system (2.5) will be integrated (to sufficient approximation) by a discrete scheme that can be seen as an intrinsic Euler step method. This can be performed by way of the Riemannian exponential map, which is to say that the straight line segments used in the standard method are replaced by geodesics on $M$. This leads to the following integration scheme, which is well-defined on any Riemannian manifold $M$.

(1) Riemannian gradient method
\[
X_{k+1} := \exp_{X_k}\bigl(\alpha_k \operatorname{grad} f(X_k)\bigr) \tag{2.10}
\]
Fig. 1. The Riemannian exponential expX is a smooth map taking the tangent vector tξ ∈ TX M at X ∈ M to expX (tξ) ∈ M . By varying t ∈ R, it yields the unique geodesic with initial value X ∈ M and “initial velocity” ξ ∈ TX M . (Color online)
where $\alpha_k \ge 0$ denotes a step size appropriately selected to guarantee convergence, cf. Sec. 3. For matrix Lie groups $\mathbf{G}$ with bi-invariant metric, Eq. (2.10) simplifies to

(1′) Gradient method on a Lie group
\[
X_{k+1} := \exp\bigl(\alpha_k \operatorname{grad} f(X_k)\, X_k^{-1}\bigr) X_k, \tag{2.11}
\]
where $\exp : \mathfrak{g} \to \mathbf{G}$ denotes the conventional exponential map. In either case, the iterative procedure can be pictured as follows: at each point $X_k \in M$ one evaluates $\operatorname{grad} f(X_k)$ in the tangent space $T_{X_k} M$. Then one moves via the Riemannian exponential map in direction $\operatorname{grad} f(X_k)$ to the next point $X_{k+1}$ on the manifold so that the quality function $f$ improves, $f(X_{k+1}) \ge f(X_k)$, as shown in Fig. 2.

Higher-order Riemannian optimization methods

The steepest ascent approach just outlined is the most basic one for addressing abstract optimization tasks intrinsically. Other intrinsic iterative schemes exploiting the underlying Riemannian geometry like conjugate gradients, Jacobi-type methods or Newton's method can be obtained similarly. For an introduction to these more advanced topics beyond the subsequent sketch see, e.g., [20, 62, 63].
Fig. 2. Abstract optimization task: The quality function f : M → R, X → f (X) (top trace) is driven into a (local) maximum by following the gradient flow X˙ = grad f (X) on the manifold M (lower trace). (Color online)
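The Lie-group iteration (2.11) can be tried out on the simplest compact matrix Lie group, $U(1)$, whose elements are unit complex numbers (that is, $1 \times 1$ unitary matrices). The following sketch is an illustration only: the quality function $f(X) = \operatorname{Re}(\bar{c}X)$, the target `c`, the step size and the start point are assumptions, and the metric is taken as $\langle i a X \mid i b X \rangle = ab$.

```python
import cmath

# Lie-group gradient iteration (2.11) on U(1), the unit complex numbers.
# The quality function f(X) = Re(conj(c) * X), the target c and the step
# size are illustrative assumptions.

c = 2.0 * cmath.exp(1j * 0.7)

def f(x):
    return (c.conjugate() * x).real

def grad_f(x):
    """Riemannian gradient i*w*X w.r.t. the metric <i a X | i b X> = a*b."""
    w = -(c.conjugate() * x).imag   # derivative of f along the direction i*X
    return 1j * w * x

def lie_group_ascent(x0, alpha=0.2, steps=100):
    x = x0
    values = [f(x)]
    for _ in range(steps):
        omega = grad_f(x) / x       # right translation into the Lie algebra iR
        x = cmath.exp(alpha * omega) * x
        values.append(f(x))
    return x, values

x_final, values = lie_group_ascent(cmath.exp(1j * 2.5))
```

Since every update multiplies by a phase factor, the iterates stay on the group by construction. For this example they approach the maximizer $c/|c|$, where $f$ attains the value $|c| = 2$.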
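(2) Conjugate gradient method
\[
X_k^{l+1} := \operatorname*{argmax}_{t \ge 0} f\bigl(\exp_{X_k^l}(t\,\Omega_k^l)\bigr), \qquad X_{k+1}^{0} := X_k^{n}, \tag{2.12}
\]
\[
\Omega_k^l :=
\begin{cases}
\operatorname{grad} f(X_k^l) & \text{for } l = 0, \\
\operatorname{grad} f(X_k^l) + \alpha_k^l\, \Pi_{X_k^{l-1}, X_k^l}(\Omega_k^{l-1}) & \text{for } l = 1, \dots, n-1,
\end{cases}
\]
where $\alpha_k^l$ is a real parameter and $\Pi_{X,Y}(\Omega)$ denotes the parallel transport of $\Omega$ along the geodesic from $X$ to $Y$.

(3) Jacobi-type method
\[
X_k^{l+1} := \operatorname*{argmax}_{t \in \mathbb{R}} f\bigl(\exp_{X_k^l}(t\,\Omega_l(X_k^l))\bigr), \qquad X_{k+1}^{0} := X_k^{s}, \tag{2.13}
\]
where $\Omega_0, \Omega_1, \dots, \Omega_{s-1}$ are vector fields such that $\Omega_0(X), \Omega_1(X), \dots, \Omega_{s-1}(X)$ span $T_X M$ for all $X \in M$. The integer $s$ is called the sweep length.

(4) Newton's method
\[
X_{k+1} := \exp_{X_k}\bigl(-(\operatorname{Hess} f(X_k))^{-1} \operatorname{grad} f(X_k)\bigr), \tag{2.14}
\]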
where $\operatorname{Hess} f(X)$ denotes the Hessian of $f$ at $X$. Since inverting the Hessian or, more precisely, solving the equation $\operatorname{Hess} f(X_k) Z = \operatorname{grad} f(X_k)$ is numerically costly, in higher-dimensional problems it is customary to use approximate methods with partial updates, e.g., the limited-memory variant of the Broyden–Fletcher–Goldfarb–Shanno algorithm (LBFGS) [64–66].

2.4. Reachability and controllability

Up to now we have addressed optimization tasks over abstract state spaces forming Riemannian manifolds $M$, hence the term abstract optimization task (AOT). However, often there may be restrictions of the original manifold $M$ to a (proper) submanifold $N \subsetneq M$. In this paragraph, we sketch how restrictions arising from an underlying control system can be accounted for. To this end, some general remarks on reachable sets and controllability are in order. The abstract optimization task (AOT) then amounts to optimizing over reachable sets, which is the topic we focus on here. In contrast, the dynamic optimal control task (OCT) would give explicit steerings, which will not be discussed here. Instead, we refer the reader to [67, 68], or, for the quantum case, to [69,70] and for numerical methods and applications to [30,71–74].

For simplicity, let $(\Sigma)$ denote a smooth control system on the state manifold $M$, i.e. a family of (ordinary) differential equations
\[
(\Sigma) \qquad \dot{X} = F(X, u), \quad u \in U \subset \mathbb{R}^m \tag{2.15}
\]
with control parameters $u \in U$ and smooth vector fields $F_u := F(\cdot, u)$ on $M$. While the vector fields $F_u$ are assumed to be time-independent, the controls are allowed
to vary in time. For convenience, the resulting control function $t \mapsto u(t) \in U$ is denoted again by $u$. Moreover, the set of all admissible control functions is supposed to contain at least all piecewise constant ones. For an admissible control function $u$, we refer to $X(t, X_0, u)$ as the unique solution of (2.15) with initial value $X_0$. Thereby the reachable set of $X_0$ is defined as
\[
\operatorname{Reach}(X_0) := \bigcup_{0 \le T} \operatorname{Reach}(X_0, T). \tag{2.16}
\]
Here $\operatorname{Reach}(X_0, T)$ denotes the set of all states which can be reached in time $T$, i.e.
\[
\operatorname{Reach}(X_0, T) := \{ X(T, X_0, u) \in M \mid u \text{ an admissible control function} \}. \tag{2.17}
\]
The system $(\Sigma)$ is said to be controllable if $\operatorname{Reach}(X_0) = M$ for all $X_0 \in M$, i.e. if for each pair $(X_0, Y_0)$ of states there exist an admissible control $u$ and a time $T \ge 0$ such that $X(T, X_0, u) = Y_0$. In general, it is hard to decide whether a given system $(\Sigma)$ is controllable or not. However, for dynamics evolving on some Lie group $\mathbf{G}$, the situation is much easier. Let $(\Sigma_{\mathbf{G}})$ be a bilinear or, equivalently, a right invariant, control-affine system on a matrix Lie group $\mathbf{G}$ with Lie algebra $\mathfrak{g}$, i.e.
\[
(\Sigma_{\mathbf{G}}) \qquad \dot{X} = \Bigl( A_0 + \sum_{j=1}^{m} u_j A_j \Bigr) X, \quad u \in U \subset \mathbb{R}^m \tag{2.18}
\]
with drift $A_0 \in \mathfrak{g}$ and control directions $A_j \in \mathfrak{g}$. For compact Lie groups $\mathbf{G}$, a simple algebraic test for controllability is known: if the system Lie algebra
\[
\mathfrak{s} := \langle A_0, \dots, A_m \rangle_{\rm Lie} \tag{2.19}
\]
generated by $A_0, \dots, A_m$ via nested commutators coincides with $\mathfrak{g}$, then the corresponding system $(\Sigma_{\mathbf{G}})$ is controllable, cf. [75, 76]. In particular, there exists a (minimal) finite time $T > 0$ such that the entire group $\mathbf{G}$ can be reached from any initial point $X_0 \in \mathbf{G}$ within this time, i.e.
\[
\mathbf{G} = \operatorname{Reach}(X_0, \le T) := \bigcup_{0 \le t \le T} \operatorname{Reach}(X_0, t) \tag{2.20}
\]
for all $X_0 \in \mathbf{G}$. Estimates on $T$ which lead to upper and lower bounds for the optimal time of state-to-state transfer in controlled quantum systems can be found in [77]. If $\mathfrak{s} \subsetneq \mathfrak{g}$, but $\mathfrak{s}$ generates a closed subgroup of $\mathbf{G}$, one can still conclude what the reachable set of (2.18) looks like:
\[
\operatorname{Reach}(X_0) = \mathbf{S} \cdot X_0, \tag{2.21}
\]
where $\mathbf{S}$ denotes the closed subgroup generated by $\mathfrak{s}$. In contrast, for non-compact groups $\mathbf{G}$, which are indispensable for describing open quantum systems, the situation gets more involved. Here, $\mathfrak{s} = \mathfrak{g}$ implies only accessibility of $(\Sigma_{\mathbf{G}})$, i.e. that all reachable sets $\operatorname{Reach}(X_0)$ have non-empty interior.
This follows from a more general result on smooth non-linear control systems, which says that the so-called Lie algebra rank condition (LARC)
\[
\{ F(X) \mid F \in \langle F_u \mid u \in U \rangle_{\rm Lie} \} = T_X M \tag{2.22}
\]
for all $X \in M$ implies accessibility. Here, $\langle F_u \mid u \in U \rangle_{\rm Lie}$ denotes the Lie subalgebra of vector fields generated by $F_u$, $u \in U$, via the Lie bracket operation, cf. [67]. Note that for right invariant vector fields on $\mathbf{G}$, the Lie bracket coincides (up to sign) with the commutator such that (2.22) boils down to $\mathfrak{s} = \mathfrak{g}$. Moreover, by exploiting the identity
\[
\operatorname{Reach}(1, T_1) \cdot \operatorname{Reach}(1, T_2) = \operatorname{Reach}(1, T_1 + T_2), \tag{2.23}
\]
one can show that $\operatorname{Reach}(1)$ is always a Lie subsemigroup of $\mathbf{G}$. A subsemigroup is a subset $\mathbf{S} \subset \mathbf{G}$ which contains the unity and is closed under multiplication, i.e. $1 \in \mathbf{S}$ and $\mathbf{S} \cdot \mathbf{S} \subseteq \mathbf{S}$. However, the geometry of subsemigroups is rather subtle compared to Lie subgroups and therefore at present not amenable to intrinsic optimization methods, as shown in more detail in a paper dwelling on open quantum systems in terms of Lie semigroups [59].

2.5. Settings of interest

In terms of reachability, there are different scenarios that structure the subsequent line of thought: we start out with fully controllable or operator controllable quantum systems [28, 75, 76, 78–81] represented as spin- or pseudo-spin systems. Then, neglecting decoherence, from any initial state represented by its density operator $A$, the entire unitary orbit $\mathcal{O}(A) := \{ U A U^{-1} \mid U \text{ unitary} \}$ can be reached [81]. In systems of $n$ qubits (e.g., spin-$\tfrac{1}{2}$ particles), this is the case under the following mild conditions [82]: (1) the qubits have to be inequivalent, i.e. distinguishable and selectively addressable, and (2) they have to be pairwise coupled (e.g., by Ising or Heisenberg-type interactions), where the coupling topology may take the form of any graph as long as it is connected, (3) the Hamiltonians must not show any symmetry (so the system algebra has to be given in irreducible representation), and finally (4) the Hamiltonians must not (simultaneously) allow for an orthogonal or a symplectic representation.

In other instances not the entire (unitary) group, but just a subgroup $\mathbf{K}$ can be reached. This is the case if the system Lie algebra is a proper subalgebra of the full unitary algebra, so $\mathfrak{k} \subsetneq \mathfrak{su}(N)$ or equivalently $\exp \mathfrak{k} = \mathbf{K} \subsetneq SU(N)$. Such restrictions may arise from symmetry constraints, which can be conveniently characterized by the centralizer$^{a}$ of $\mathfrak{k}$ in $\mathfrak{su}(N)$, see [82,83].
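Whether a set of generators produces all of $\mathfrak{su}(N)$ or only a proper subalgebra can be checked numerically in small cases by closing the generators under nested commutators, in the spirit of Eq. (2.19). The following Python sketch is a hedged illustration: the function `lie_closure_dimension`, the tolerance `tol` and the single-qubit generators $i\sigma_z$, $i\sigma_x$ are assumptions chosen for the example, and the simple Gram-Schmidt rank test is not meant for ill-conditioned or large systems.

```python
import math

# Dimension of the real Lie algebra generated by a set of matrices via
# nested commutators, cf. the system Lie algebra in Eq. (2.19). The
# generators i*sigma_z and i*sigma_x are an illustrative single-qubit choice.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def as_real_vector(a):
    """Flatten a complex matrix into a real coordinate vector."""
    return [part for row in a for z in row for part in (z.real, z.imag)]

def lie_closure_dimension(generators, tol=1e-10):
    basis = []      # orthonormal real vectors spanning the algebra so far
    elements = []   # linearly independent matrices accepted so far
    queue = list(generators)
    while queue:
        m = queue.pop()
        v = as_real_vector(m)
        for b in basis:                  # Gram-Schmidt against current span
            coeff = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - coeff * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > tol:                   # m contributes a new direction
            basis.append([vi / norm for vi in v])
            queue.extend(commutator(m, e) for e in elements)
            elements.append(m)
    return len(basis)

i_sigma_z = [[1j, 0j], [0j, -1j]]
i_sigma_x = [[0j, 1j], [1j, 0j]]

dim_full = lie_closure_dimension([i_sigma_z, i_sigma_x])
dim_drift_only = lie_closure_dimension([i_sigma_z])
```

In this example the two generators close to a three-dimensional algebra, which equals $\dim \mathfrak{su}(2)$, whereas $i\sigma_z$ alone only generates a one-dimensional subalgebra.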
Otherwise, the system itself can be fully controllable, but the focus of interest may be reduced: e.g., the subgroup $\mathbf{K} = SU_{\rm loc}(2^n) := SU(2) \otimes SU(2) \otimes \cdots \otimes SU(2)$ of (possibly fast) local actions on each qubit is of interest to study local reachability, or whether an effective multi-qubit interaction Hamiltonian is locally reversible in the sense of Hahn's spin echo [29]. Or, one may ask what the Euclidean distance of some pure state to the nearest point on the local unitary orbit of a pure product state is. This may be useful when optimizing entanglement witnesses [55]. Likewise, one may address other than the finest partitioning of the entire unitary group, e.g., $\mathbf{K} = SU(2^{n_1}) \otimes \cdots \otimes SU(2^{n_j}) \otimes \cdots \otimes SU(2^{n_r}) \subset SU(2^n)$, where $\sum_{j=1}^{r} n_j = n$.

Another type of reduction arises not by restriction to a subgroup $\mathbf{H}$, but by the fact that the quality function of interest $f$ is equivariant, i.e. constant on cosets $\mathbf{H}G$. Consider, for instance, a fully controllable system where $f$ is equivariant with respect to the closed subgroup $\mathbf{H} \subset \mathbf{G}$. Then, it may be favorable to transfer the optimization problem from the original Lie group $\mathbf{G}$ to the homogeneous space $\mathbf{G}/\mathbf{H}$.

$^{a}$ I.e. by $\mathfrak{k}' := \{ s \in \mathfrak{su}(N) \mid [s, k] = 0 \ \ \forall\, k \in \mathfrak{k} \}$.

3. Theory: Gradient Flows

Gradient systems are a standard tool of Riemannian optimization for maximizing smooth quality functions on a manifold $M$. The manifold structure of $M$ arises either naturally from the problem itself or from smooth equality constraints imposed on a previously unconstrained problem. Note that in general inequality constraints would entail manifolds with a boundary and thus are a much more subtle issue, not to be developed any further here.

The case $M = \mathbb{R}^m$, sometimes referred to as the unconstrained case, is well-known and can be found in many texts on ordinary differential equations or nonlinear programming, cf. [84–87]. However, gradient systems on abstract Riemannian manifolds provide a rather new approach to constrained optimization problems. Although the resulting numerical algorithms are in general only linearly convergent, their global behavior is often much better than the global behavior of locally quadratic methods. Textbooks combining the different areas of Riemannian geometry, gradient systems and constrained optimization are quite rare. The best choices to our knowledge are [22,61].
For further reading we also suggest the papers [19–21,62]. Nevertheless, most of the material which is necessary to understand the intrinsic optimization approach applied in Sec. 4 is scattered in many different references. For the reader's convenience, we therefore review the basic ideas on these topics. First, we discuss the general setting on Riemannian manifolds, then we proceed with Lie groups and finally summarize some more advanced results on homogeneous spaces. For standard definitions and terminology from Riemannian geometry we refer to any modern text on this subject such as [88–91].

3.1. Gradient flows on Riemannian manifolds

In the following, let $M$ denote a finite dimensional smooth manifold with tangent and cotangent bundles $TM$ and $T^*M$, respectively. Moreover, let $M$ be equipped with a Riemannian metric $\langle \cdot \mid \cdot \rangle$, i.e. with a scalar product $\langle \cdot \mid \cdot \rangle_X$ on each tangent space $T_X M$ varying smoothly with $X \in M$. More precisely, $\langle \cdot \mid \cdot \rangle$ has to be a
smooth, positive definite section in the bundle of all symmetric bilinear forms over $M$. Such sections always exist for finite dimensional smooth manifolds, cf. [89, 92]. The pair $(M, \langle \cdot \mid \cdot \rangle)$ is called a Riemannian manifold. Let $f : M \to \mathbb{R}$ be a smooth quality function on $M$ with differential $Df : M \to T^*M$. Then the gradient of $f$ at $X \in M$, denoted by $\operatorname{grad} f(X)$, is the vector in $T_X M$ uniquely determined by the equation
\[
Df(X) \cdot \xi = \langle \operatorname{grad} f(X) \mid \xi \rangle_X \tag{3.1}
\]
for all $\xi \in T_X M$. Equation (3.1) naturally defines a vector field on $M$ via
\[
\operatorname{grad} f : M \to TM, \qquad X \mapsto \operatorname{grad} f(X) \tag{3.2}
\]
called the gradient vector field of $f$. The corresponding ordinary differential equation
\[
\dot{X} = \operatorname{grad} f(X) \tag{3.3}
\]
and its flow are referred to as the gradient system and the gradient flow of $f$, respectively. Obviously, the critical points of $f : M \to \mathbb{R}$ coincide with the equilibria of the gradient flow. Moreover, the quality function $f$ is monotonically increasing along solutions $X(t)$ of (3.3), i.e. the real-valued function $t \mapsto f(X(t))$ is monotonically increasing in $t$, as
\[
\frac{d}{dt} f(X(t)) = \langle \operatorname{grad} f(X(t)) \mid \dot{X}(t) \rangle_{X(t)} = \| \operatorname{grad} f(X(t)) \|_{X(t)}^{2} \ge 0.
\]
Here $\| \cdot \|_X$ denotes the norm on $T_X M$ induced by $\langle \cdot \mid \cdot \rangle_X$, i.e. $\| \xi \|_X := \sqrt{\langle \xi \mid \xi \rangle_X}$ for all $\xi \in T_X M$.

3.1.1. Convergence of gradient flows

Recall that the asymptotic behavior (for $t \to +\infty$) of a solution of (3.3) is characterized by its $\omega$-limit set
\[
\omega(X_0) := \bigcap_{0 \le t < t^{+}(X_0)} \overline{\{ X(\tau, X_0) \mid t \le \tau < t^{+}(X_0) \}},
\]
where $\overline{\{\cdots\}}$ denotes the closure of the set $\{\cdots\}$ and $X(t, X_0)$ the unique solution of (3.3) with initial value $X(0) = X_0$ and positive escape time $t^{+}(X_0) > 0$. The following result gives a sufficient condition for solutions of Eq. (3.3) to converge to the set of critical points of $f$.

Proposition 3.1. If $f$ has compact superlevel sets, i.e. if the sets $\{ X \in M \mid f(X) \ge C \}$ are compact for all $C \in \mathbb{R}$, then any solution of Eq. (3.3) exists for all $t \ge 0$ and its $\omega$-limit set is a non-empty compact and connected subset of the set of critical points of $f$.

Proof. Since $f$ is monotonically increasing along solutions of Eq. (3.3), the compact sets $\{ X \in M \mid f(X) \ge C \}$ are positively invariant, i.e. invariant for $t \ge 0$ under
the gradient flow of Eq. (3.3). Thus the assertion follows from standard results on $\omega$-limit sets and Lyapunov theory, cf. [22, 85].

Although Proposition 3.1 guarantees that $\omega(X_0)$ is contained in the set of critical points of $f$, this does not imply convergence to a critical point. Indeed, there are smooth gradient systems which exhibit solutions converging only to the set of critical points, cf. [93]. The next two results provide sufficient conditions for convergence to a single critical point under different settings. In particular, Theorem 3.1 yields a powerful tool for analyzing real analytic gradient systems.

Corollary 3.1. If $f$ has compact superlevel sets and if all critical points are isolated, then any solution of (3.3) converges to a critical point of $f$ for $t \to +\infty$.

Proof. This is an immediate consequence of Proposition 3.1.

Theorem 3.1 ([94]). If $(M, \langle \cdot \mid \cdot \rangle)$ and $f$ are real analytic, then all non-empty $\omega$-limit sets $\omega(X_0)$ of Eq. (3.3) are singletons, i.e. $\omega(X_0) \ne \emptyset$ implies that $X(t, X_0)$ converges to a single critical point $X^*$ of $f$ for $t \to +\infty$.

Proof. The main argument is based on Łojasiewicz's inequality, which says that in a neighborhood of $X^*$ an estimate of the type $|f(X)|^p \le C \, \| \operatorname{grad} f(X) \|$ holds for some $p < 1$ and $C > 0$, where $f$ is normalized such that $f(X^*) = 0$. A complete proof can be found in [94, 95].

3.1.2. Restriction to submanifolds

Now, consider the restriction of $f$ to a smooth submanifold $N \subset M$. Obviously, the Riemannian metric $\langle \cdot \mid \cdot \rangle$ on $M$ restricts to a Riemannian structure on $N$. Thus $(N, \langle \cdot \mid \cdot \rangle|_{TN})$ constitutes a Riemannian manifold in a canonical way. Moreover the equality $Df|_N(X) = Df(X)|_{T_X N}$ immediately implies that the gradient of the restriction $f|_N$ at $X \in N$ is given by the orthogonal projection of $\operatorname{grad} f(X)$ onto $T_X N$, i.e.
\[
\operatorname{grad} f|_N (X) = P_X(\operatorname{grad} f(X)), \tag{3.4}
\]
where $P_X$ denotes the orthogonal projector onto $T_X N$. Hence the gradient system of $f|_N$ on an arbitrary submanifold $N$ is well-defined and reads
\[
\dot{X} = P_X(\operatorname{grad} f(X)). \tag{3.5}
\]
3.1.3. Analyzing critical points by the Hessian

Subsequently, we address the problem of how to define and compute the Hessian of $f$, as its knowledge is essential for a deeper insight into (3.3). For instance, the stability of critical points is determined by its eigenvalues, and the computation of explicit
discretization schemes, preserving the convergence behavior of (3.3), can be based on it, cf. [17, 22]. At critical points $X^* \in M$ of $f$, the Hessian is given by the symmetric bilinear form
\[
\operatorname{Hess} f(X^*) : T_{X^*} M \times T_{X^*} M \to \mathbb{R}, \qquad
\operatorname{Hess} f(X^*)(\xi, \eta) := (D\varphi(X^*)\xi)^{\top} \operatorname{Hess}(f \circ \varphi^{-1})(\varphi(X^*)) \, D\varphi(X^*)\eta, \tag{3.6}
\]
where $\varphi$ is any chart around $X^*$ and $\operatorname{Hess}(f \circ \varphi^{-1})$ denotes the ordinary Hessian matrix of $f \circ \varphi^{-1}$. It is straightforward to show that (3.6) is independent of $\varphi$. Equivalently, $\operatorname{Hess} f(X^*)$ is uniquely determined by
\[
\operatorname{Hess} f(X^*)(\xi, \xi) := \left. \frac{d^2 (f \circ \alpha)}{dt^2} \right|_{t=0}, \tag{3.7}
\]
where $\alpha$ is any smooth curve with $X^* = \alpha(0)$ and $\dot{\alpha}(0) = \xi$, while the remaining values of $\operatorname{Hess} f(X^*)$ can be obtained by a standard polarization argument,$^{b}$ i.e. via the formula
\[
2 \operatorname{Hess} f(X^*)(\xi, \eta) = \operatorname{Hess} f(X^*)(\xi + \eta, \xi + \eta) - \operatorname{Hess} f(X^*)(\xi, \xi) - \operatorname{Hess} f(X^*)(\eta, \eta). \tag{3.8}
\]
However, the previous definition does not apply to regular points of $f$. In general, one has to establish the concept of geodesics, cf. Remark 3.1. More precisely, the Hessian of $f$ at an arbitrary point $X \in M$ is given by
\[
\operatorname{Hess} f(X)(\xi, \xi) := \left. \frac{d^2 (f \circ \gamma)}{dt^2} \right|_{t=0}, \tag{3.9}
\]
where $\gamma$ is the unique geodesic with $X = \gamma(0)$ and $\dot{\gamma}(0) = \xi$. Again, the remaining values can be computed by (3.8). As usual, we associate to $\operatorname{Hess} f(X)$ a unique self-adjoint linear operator $\mathbf{Hess} f(X) : T_X M \to T_X M$ such that
\[
\langle \xi \mid \mathbf{Hess} f(X) \eta \rangle_X = \operatorname{Hess} f(X)(\xi, \eta) \tag{3.10}
\]
holds for all $\xi, \eta \in T_X M$. It is called the Hessian operator of $f$ at $X \in M$.

Remark 3.1. In modern textbooks on differential geometry, the concept of geodesics as well as the notion of (higher) covariant derivatives are defined via linear connections, cf. [90, 96]. Therefore, Eq. (3.9) is usually derived as a consequence and not introduced as a definition of the Hessian. For Riemannian manifolds, however, it is also possible to establish (Riemannian) geodesics as curves of minimal arc length. Both approaches coincide if one picks the so-called Riemannian or Levi–Civita connection as linear connection on $M$.

$^{b}$ More precisely, the polarization procedure is defined as follows: Let $H$ be a real Hilbert space and $\beta : H \to \mathbb{R}$ a bounded quadratic form, i.e. there exists a bounded symmetric bilinear form $B : H \times H \to \mathbb{R}$ such that $\beta(v) = B(v, v)$ for all $v \in H$. By the symmetry and bilinearity of $B$ we have $B(v + w, v + w) = B(v, v) + B(w, w) + 2B(v, w)$ and hence $B(v, w) = \tfrac{1}{2}\bigl(B(v + w, v + w) - B(v, v) - B(w, w)\bigr) = \tfrac{1}{2}\bigl(\beta(v + w) - \beta(v) - \beta(w)\bigr)$ for all $v, w \in H$. Therefore, $B$ is uniquely determined by the quadratic form $\beta$, and the latter identity is known as the law of polarization.
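On the unit sphere, where geodesics are great circles, the definition (3.9) can be evaluated directly. The following sketch approximates the second derivative along a geodesic by a central difference and compares it with the closed form $2(\xi^\top A \xi - (x^\top A x)\,|\xi|^2)$ valid for $f(x) = x^\top A x$ on $S^2$. The matrix `A`, the point `x`, the tangent vector `xi` and the finite-difference step `h` are assumptions chosen for illustration.

```python
import math

# Eq. (3.9) on the unit sphere S^2: the Hessian form as second derivative
# of f along a geodesic (a great circle). The finite-difference step h and
# the test data are illustrative assumptions.

A = [[1.0, 0.5, 0.0],
     [0.5, 2.0, 0.0],
     [0.0, 0.0, 3.0]]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(a, x):
    return [dot(row, x) for row in a]

def f(x):
    return dot(x, matvec(A, x))

def geodesic(x, xi, t):
    """Great circle through x with initial velocity xi (xi tangent at x)."""
    speed = math.sqrt(dot(xi, xi))
    return [math.cos(speed * t) * a + math.sin(speed * t) * b / speed
            for a, b in zip(x, xi)]

def hessian_form(x, xi, h=1e-4):
    """Central second difference of t -> f(geodesic(x, xi, t)) at t = 0."""
    return (f(geodesic(x, xi, h)) - 2.0 * f(x)
            + f(geodesic(x, xi, -h))) / (h * h)

x = [0.6, 0.0, 0.8]        # point on the sphere
xi = [0.0, 1.5, 0.0]       # tangent vector at x, since dot(x, xi) = 0
numeric = hessian_form(x, xi)
exact = 2.0 * (dot(xi, matvec(A, xi)) - f(x) * dot(xi, xi))
```

The two numbers agree up to the discretization error of the difference quotient. Off-diagonal values $\operatorname{Hess} f(X)(\xi, \eta)$ could be obtained from three such evaluations via the polarization formula (3.8).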
Unfortunately, the computation of geodesics is in general a non-trivial problem, as one has to solve (in local charts) a second order differential equation. However, on compact Lie groups their calculation is rather simple, as we will see at the end of Sec. 3.2. The above concepts yield the following generalization of a familiar result from elementary calculus for characterizing local extreme points.

Theorem 3.2. Let $M$ be a Riemannian manifold and let $X^*$ be a critical point of the quality function $f : M \to \mathbb{R}$. If the Hessian form $\operatorname{Hess} f(X^*)$ or, equivalently, the Hessian operator $\mathbf{Hess} f(X^*)$ is negative definite, then $X^*$ is a strict local maximum of $f$.

Proof. In local coordinates the result follows straightforwardly from Eq. (3.6).

In general, (asymptotic) stability of an equilibrium $X^* \in M$ of (3.3) may depend on the Riemannian metric $\langle \cdot \mid \cdot \rangle$. However, the property of being a strict local maximum or an isolated critical point of a smooth function $f$ obviously does not depend on the choice of a Riemannian metric. Therefore, the following result shows that in fact certain (asymptotically) stable equilibria $X^* \in M$ of (3.3) are independent of the Riemannian metric.

Theorem 3.3. (a) If $X^* \in M$ is a strict local maximum of $f$, then $X^*$ is a stable equilibrium of (3.3). In particular, for any neighborhood $U$ of $X^*$ there exists a neighborhood $V$ of $X^*$ such that the $\omega$-limit sets $\omega(X_0)$ are non-empty and contained in $U$ for all $X_0 \in V$. (b) If $X^* \in M$ is a strict local maximum and an isolated critical point of $f$, then $X^*$ is an asymptotically stable equilibrium of (3.3). In particular, there is a neighborhood $V$ of $X^*$ such that $\omega(X_0) = \{X^*\}$ for all $X_0 \in V$, i.e. all solutions $X(t, X_0)$ with initial value $X_0 \in V$ converge to $X^*$ for $t \to +\infty$.

Proof. Both assertions follow immediately from classical stability theory by taking $f$ as Lyapunov function, cf. [22, 85].

Note that the convergence analysis near arbitrary equilibria, i.e. near arbitrary critical points of $f$, is quite subtle and may depend on the Riemannian metric, cf. [97].

3.1.4. Discretized gradient flows

Finally, we approach the problem of finding discretizations of (3.3) which lead to convergent gradient ascent methods. The ideas presented below can be traced back to Brockett, cf. [17]. Let
\[
\exp_X : T_X M \to M \tag{3.11}
\]
be the Riemannian exponential map at $X \in M$, i.e. $t \mapsto \exp_X(t\xi)$ denotes the unique geodesic with initial value $X \in M$ and initial velocity $\xi \in T_X M$. Moreover,
we assume that $M$ is (geodesically) complete, i.e. any geodesic is defined for all $t \in \mathbb{R}$. Hence, (3.11) is well-defined on the entire tangent bundle $TM$. The simplest discretization approach, a scheme that can be seen as an intrinsic Euler step method, leads to the

Riemannian gradient method
\[
X_{k+1} := \exp_{X_k}\bigl(\alpha_k \operatorname{grad} f(X_k)\bigr), \tag{3.12}
\]
where $\alpha_k$ denotes an appropriate step size. In order to guarantee convergence of (3.12) to the set of critical points, it is sufficient to apply the Armijo rule [87]. An alternative to Armijo's rule is provided by the step size selection suggested by Brockett in [19], see also [22]. Convergence to a single critical point is a more subtle issue. If $(M, \langle \cdot \mid \cdot \rangle)$ and $f$ are analytic, and the step sizes are chosen according to a version of the first Wolfe–Powell condition for Riemannian manifolds, then pointwise convergence holds. A detailed proof can be found in [98].

3.2. Gradient flows on Lie groups

In the following, we apply the previous results to Lie groups and Lie subgroups. However, to fully exploit Lie-theoretic tools, the Riemannian structure and the group structure have to match, i.e. the metric $\langle \cdot \mid \cdot \rangle$ has to be invariant under the group action. For basic concepts and results on Lie groups and their Riemannian geometry we refer to [88,89,99–101]. In particular, we recommend the AMS-booklet of Arvanitoyeorgos [102] for a rather comprehensive, but condensed overview including many references for further reading. Sometimes we refer to [102] although it does not contain a full proof of the corresponding statement. Nevertheless, the details given therein will help the reader to get a better understanding of the subject. In any case, we always add a second reference containing a complete proof.

Let $\mathbf{G}$ denote a finite dimensional Lie group, i.e. a group which carries a smooth manifold structure such that the group operations are smooth mappings.$^{c}$ For notational convenience we will assume that $\mathbf{G}$ can be represented as a (closed) matrix Lie group, i.e. as an (embedded) Lie subgroup of some general linear group $GL(N, \mathbb{K})$ of invertible $N \times N$-matrices over $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$.

Remark 3.2. According to a well-known result by Cartan, a subgroup $\mathbf{G}$ of $GL(N, \mathbb{K})$ is an (embedded) Lie subgroup, i.e. a smooth submanifold of $GL(N, \mathbb{K})$, if and only if it is closed in $GL(N, \mathbb{K})$, cf. [103]. Note, however, that there is a subtle difference between embedded and immersed Lie subgroups. Moreover, not every abstract Lie group admits a faithful representation as a matrix Lie group. Nevertheless, the class of matrix Lie groups is rich enough for all of our subsequent applications. For more details on these topics we also refer to [100, 103].

$^{c}$ Actually, any Lie group $\mathbf{G}$ exhibits a real analytic substructure (induced by the exponential map), i.e. $\mathbf{G}$ can also be regarded as a real analytic manifold [101, 103].
July 12, J070-S0129055X10004053
2010 12:0 WSPC/S0129-055X
148-RMP
Gradient Flows for Optimization in Quantum Information and Quantum Dynamics
615
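3.2.1. Invariant metrics

A Lie group $\mathbf{G}$ can be endowed in a canonical way with a Riemannian metric $\langle\cdot\mid\cdot\rangle$. Let $\mathfrak{g} := T_1\mathbf{G}$ be the Lie algebra of $\mathbf{G}$, i.e. the tangent space of $\mathbf{G}$ at the unity $1$. From the fact that the right multiplication $r_H : \mathbf{G} \to \mathbf{G}$ and left multiplication $l_H : \mathbf{G} \to \mathbf{G}$ are diffeomorphisms of $\mathbf{G}$ for all $H \in \mathbf{G}$, it follows
$$T_H\mathbf{G} = \mathfrak{g}H = H\mathfrak{g} \tag{3.13}$$
for all $H \in \mathbf{G}$. Now, let $(\cdot\mid\cdot)$ be any scalar product on $\mathfrak{g}$. Then
$$\langle g \mid h\rangle_G := (gG^{-1} \mid hG^{-1}) \tag{3.14}$$
for all $G \in \mathbf{G}$ and $g, h \in T_G\mathbf{G}$ yields a right invariant metric on $\mathbf{G}$, where right invariance stands for
$$\langle g \mid h\rangle_G = \langle gH \mid hH\rangle_{GH} \tag{3.15}$$
for all $G, H \in \mathbf{G}$ and $g, h \in T_G\mathbf{G}$. Thus right multiplication $r_H$ represents an isometry of $\mathbf{G}$. In the same way, one could obtain left invariant metrics on $\mathbf{G}$.

Remark 3.3. In an abstract setting, one has to replace (3.13) by
$$T_H\mathbf{G} = Dr_H(1)\,\mathfrak{g} = Dl_H(1)\,\mathfrak{g} \tag{3.16}$$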
for all $H \in \mathbf{G}$, where $Dr_H$ and $Dl_H$ denote the tangent maps of $r_H$ and $l_H$, respectively. For a matrix Lie group, however, the respective tangent maps are given by $Dr_H(G)\xi = \xi H$ and $Dl_H(G)\xi = H\xi$ for all $G \in \mathbf{G}$ and $\xi \in T_G\mathbf{G}$. Hence (3.16) reduces to (3.13).

The construction of bi-invariant, i.e. right and left invariant, metrics is much more subtle and in general even impossible. To summarize the basic results on this topic we need some further terminology. The adjoint maps $\mathrm{Ad} : \mathbf{G} \to GL(\mathfrak{g})$ and $\mathrm{ad} : \mathfrak{g} \to \mathfrak{gl}(\mathfrak{g})$ are defined by
$$\mathrm{Ad}_G h := GhG^{-1} \quad\text{and}\quad \mathrm{ad}_g h := [g, h] := gh - hg$$
for all $G \in \mathbf{G}$ and all $g, h \in \mathfrak{g}$, where $GL(\mathfrak{g})$ and $\mathfrak{gl}(\mathfrak{g})$ denote the set of all automorphisms and, respectively, endomorphisms of $\mathfrak{g}$. Note that both notations $\mathrm{ad}_g h$ and $[g, h]$ are used interchangeably in the literature. A bilinear form $(\cdot\mid\cdot)$ on $\mathfrak{g}$ is called

(a) $\mathrm{Ad}_{\mathbf{G}}$-invariant if the identity
$$(g \mid h) = (\mathrm{Ad}_G g \mid \mathrm{Ad}_G h) \tag{3.17}$$
is satisfied for all $g, h \in \mathfrak{g}$ and $G \in \mathbf{G}$.

(b) $\mathrm{ad}_{\mathfrak{g}}$-invariant if the identity
$$(\mathrm{ad}_g h \mid k) = -(h \mid \mathrm{ad}_g k) \tag{3.18}$$
is satisfied for all $g, h, k \in \mathfrak{g}$.
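The two invariance conditions (3.17) and (3.18) can be checked numerically in a simple case. The following sketch is not taken from the text: it assumes the group SO(3) with Lie algebra so(3) and the trace form $(g \mid h) := \operatorname{tr}(g^\top h)$, and all function names (`random_skew`, `expm_skew`, `form`) are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def random_skew(n, rng):
    """Random element of so(n): a real skew-symmetric n x n matrix."""
    a = rng.standard_normal((n, n))
    return a - a.T

def expm_skew(a):
    """exp(a) for real skew-symmetric a, via the Hermitian matrix i*a."""
    lam, v = np.linalg.eigh(1j * a)          # i*a = v diag(lam) v^H
    return (v @ np.diag(np.exp(-1j * lam)) @ v.conj().T).real

def form(g, h):
    """Assumed scalar product (g | h) := tr(g^T h) on so(n)."""
    return float(np.trace(g.T @ h))

def Ad(G, h):
    """Adjoint action Ad_G h = G h G^{-1}."""
    return G @ h @ np.linalg.inv(G)

def ad(g, h):
    """Commutator ad_g h = [g, h] = gh - hg."""
    return g @ h - h @ g

rng = np.random.default_rng(0)
g, h, k = (random_skew(3, rng) for _ in range(3))
G = expm_skew(random_skew(3, rng))           # an element of SO(3)

# Deviation from Ad-invariance (3.17) and from ad-invariance (3.18).
Ad_gap = abs(form(g, h) - form(Ad(G, g), Ad(G, h)))
ad_gap = abs(form(ad(g, h), k) + form(h, ad(g, k)))
print(Ad_gap, ad_gap)
```

Both gaps should vanish up to rounding, which is what one expects for a compact group such as SO(3); for a non-compact group the same check with this form would in general fail.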
T. Schulte-Herbrüggen et al.
Proposition 3.2. The following statements are equivalent:
(a) There exists a bi-invariant Riemannian metric $\langle\cdot\mid\cdot\rangle$ on $\mathbf{G}$.
(b) There exists an $\mathrm{Ad}_{\mathbf{G}}$-invariant scalar product $(\cdot\mid\cdot)$ on $\mathfrak{g}$.
Moreover, each of the statements (a) and (b) implies
(c) There exists an $\mathrm{ad}_{\mathfrak{g}}$-invariant scalar product $(\cdot\mid\cdot)$ on $\mathfrak{g}$.
If $\mathbf{G}$ is also connected, then (c) is equivalent to (a) and (b), respectively.

Proof. The equivalence (a) ⇔ (b) follows easily by exploiting Eq. (3.15) at $G = 1$. Moreover, applying (b) to a one-parameter subgroup $t \mapsto \exp(tg)$ and taking the derivative at $t = 0$ yields (c). The implication (c) ⇒ (b) is obtained in the same way, i.e. by differentiating $t \mapsto (\mathrm{Ad}_{e^{tg}} h \mid \mathrm{Ad}_{e^{tg}} k)$ with respect to $t$, cf. [102]. Note, however, that this implies $\mathrm{Ad}_{\mathbf{G}}$-invariance only on the connected component of the unity. Therefore, connectedness is necessary for the implication (c) ⇒ (b) as counter-examples show.

Now, the main result on the existence of bi-invariant metrics reads as follows.

Theorem 3.4. A connected Lie group $\mathbf{G}$ admits a bi-invariant Riemannian metric if and only if $\mathbf{G}$ is the direct product of a compact Lie group $\mathbf{G}_0$ and an abelian one, which is isomorphic to some $(\mathbb{R}^m, +)$, i.e. $\mathbf{G} \cong \mathbf{G}_0 \times \mathbb{R}^m$.

Proof. Cf. [89, 104].

Finally, we focus on a special class of Lie groups. A connected Lie group $\mathbf{G}$ is called semisimple if the Killing form, i.e. the bilinear form $(g, h) \mapsto \kappa(g, h) := \operatorname{tr}(\mathrm{ad}_g \mathrm{ad}_h)$, is non-degenerate on $\mathfrak{g}$. The most prominent representatives of this class are $SL(N, \mathbb{R})$, $SL(N, \mathbb{C})$, $SO(N, \mathbb{R})$ and $SU(N)$. More on semisimple Lie groups and their algebras can be found in [99, 103].

Theorem 3.5. (a) If $\mathbf{G}$ is semisimple then the Killing form $\kappa$ defines an $\mathrm{ad}_{\mathfrak{g}}$-invariant bilinear form on $\mathfrak{g}$. (b) If $\mathbf{G}$ is semisimple and compact then $-\kappa$ defines an $\mathrm{ad}_{\mathfrak{g}}$-invariant scalar product on $\mathfrak{g}$. Thus $-\kappa$ induces a bi-invariant Riemannian metric on $\mathbf{G}$.

Proof. Cf. [102, 103].

3.2.2. Gradient flows with respect to an invariant metric

Next, we study gradient flows on $\mathbf{G}$ or on a closed subgroup $\mathbf{H} \subset \mathbf{G}$ with respect to an invariant metric $\langle\cdot\mid\cdot\rangle$. To this end, let $f : \mathbf{G} \to \mathbb{R}$ be a smooth quality function
and let $\varphi : \mathbf{G} \to \mathbf{G}$ be any diffeomorphism. Using the identity
$$\operatorname{grad}(f \circ \varphi)(G) = (D\varphi(G))^{*} \operatorname{grad} f(\varphi(G)) \tag{3.19}$$
for all $G \in \mathbf{G}$, where $(\cdot)^{*}$ denotes the adjoint operator, we obtain by the right invariance of the metric
$$\operatorname{grad} f(G) = \operatorname{grad}(f \circ r_G)(1)\,G \tag{3.20}$$
for all $G \in \mathbf{G}$. Hence
$$\dot G = \operatorname{grad} f(G) \tag{3.21}$$
can be rewritten as
$$\dot G = \operatorname{grad}(f \circ r_G)(1)\,G. \tag{3.22}$$
Thus the gradient flow of $f$ is determined by the map $G \mapsto \operatorname{grad}(f \circ r_G)(1) \in \mathfrak{g}$. To study the asymptotic behavior of Eq. (3.21) we can apply the results of the previous section. For instance, for compact Lie groups we have:

Corollary 3.2. Let $\mathbf{G}$ be a compact Lie group with a right invariant Riemannian metric $\langle\cdot\mid\cdot\rangle$ and let $f : \mathbf{G} \to \mathbb{R}$ be a real analytic quality function. Then any solution of Eq. (3.21) converges to a critical point of $f$ for $t \to +\infty$.

Proof. This follows immediately from Proposition 3.1 and Theorem 3.1, as the pair $(\mathbf{G}, \langle\cdot\mid\cdot\rangle)$ constitutes a real analytic Riemannian manifold whenever the metric $\langle\cdot\mid\cdot\rangle$ is invariant, cf. footnote [146].

Now, let $\mathbf{H}$ be a closed subgroup of $\mathbf{G}$. By Remark 3.2, we know that $\mathbf{H}$ is actually an (embedded) submanifold of $\mathbf{G}$. Therefore, the gradient flow of $f|_{\mathbf{H}}$ with respect to $\langle\cdot\mid\cdot\rangle|_{\mathbf{H}}$ is well-defined and can be given explicitly via the orthogonal projectors $P_H$, cf. (3.5). However, for an invariant metric the computation of $P_H$ simplifies considerably, as all calculations can be carried out on the Lie algebra $\mathfrak{g}$ of $\mathbf{G}$.

Lemma 3.1. Let $\mathbf{G}$ be a Lie group with a right invariant Riemannian metric $\langle\cdot\mid\cdot\rangle$ and let $\mathbf{H}$ be a closed subgroup of $\mathbf{G}$. Furthermore, let $\mathfrak{g}$ and $\mathfrak{h}$ be their corresponding Lie algebras and denote by $P_{\mathfrak{h}}$ the orthogonal projection of $\mathfrak{g}$ onto $\mathfrak{h}$. Then the orthogonal projection $P_H$ in (3.4) is given by
$$P_H(gH) := P_{\mathfrak{h}}(g)\,H \tag{3.23}$$
for all $gH \in T_H\mathbf{G}$.

Proof. This is a straightforward consequence of the identity $T_H\mathbf{H} = \mathfrak{h}H$ and the right invariance of $\langle\cdot\mid\cdot\rangle$.

According to (3.5), (3.20) and (3.23), the gradient flow of $f|_{\mathbf{H}}$ finally reads
$$\dot H = P_{\mathfrak{h}}\bigl(\operatorname{grad}(f \circ r_H)(1)\bigr)\,H. \tag{3.24}$$
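As an illustration of (3.24), the following sketch evaluates the right-hand side for a concrete choice that is not part of the text: the ambient group is assumed to be $GL(n, \mathbb{R})$ with the right invariant metric (3.14) built from the Frobenius product on $\mathfrak{gl}(n)$, the subgroup is $SO(n)$, and the quality function is the hypothetical example $f(G) = \operatorname{tr}(C^\top G)$. For this $f$ one finds $\operatorname{grad}(f \circ r_G)(1) = C G^\top$, and $P_{\mathfrak{h}}$ is the skew-symmetric part.

```python
import numpy as np

def metric(G, xi, eta):
    """Right invariant metric (3.14) with (a | b) := tr(a^T b) on gl(n)."""
    Gi = np.linalg.inv(G)
    return float(np.trace((xi @ Gi).T @ (eta @ Gi)))

def grad_at_identity(G, C):
    """grad(f o r_G)(1) in gl(n) for the assumed f(G) = tr(C^T G)."""
    return C @ G.T

def proj_so(omega):
    """Orthogonal projection of gl(n) onto so(n): the skew-symmetric part."""
    return 0.5 * (omega - omega.T)

def flow_rhs(H, C):
    """Right-hand side of (3.24): P_h(grad(f o r_H)(1)) H."""
    return proj_so(grad_at_identity(H, C)) @ H

rng = np.random.default_rng(1)
n = 4
C = rng.standard_normal((n, n))
H, _ = np.linalg.qr(rng.standard_normal((n, n)))    # orthogonal matrix
if np.linalg.det(H) < 0:
    H[:, 0] = -H[:, 0]                              # move into SO(n)

xi = flow_rhs(H, C)                                 # candidate gradient of f|SO(n) at H
omega = proj_so(rng.standard_normal((n, n)))
eta = omega @ H                                     # arbitrary tangent vector in T_H SO(n)

directional_derivative = float(np.trace(C.T @ eta)) # Df(H) eta
pairing = metric(H, xi, eta)                        # <xi | eta>_H
print(directional_derivative, pairing)
```

The two printed numbers should agree up to rounding, which is the defining property of the gradient of $f|_{\mathbf{H}}$; in addition, `xi @ H.T` is skew-symmetric, so `xi` is indeed tangent to $SO(n)$ at `H`.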
3.2.3. Geodesics with respect to an invariant metric

The remainder of this subsection is devoted to the question of how to compute geodesics and the Hessian of a smooth quality function with respect to an invariant metric. The main results for the forthcoming applications are summarized in Theorem 3.6(b) and Proposition 3.3. For readers with a basic differential geometric background we provide some details of the proof, which however can be skipped without losing the thread.

First, we need some further notation. Let $X_g^r : G \mapsto gG$ and $X_g^l : G \mapsto Gg$ be the right and left invariant vector fields on $\mathbf{G}$ which are uniquely determined by $X_g^r(1) = g$ and $X_g^l(1) = g$, respectively. Moreover, let $L_{\mathcal{X}}(\cdot)$ denote the Lie derivative with respect to the vector field $\mathcal{X}$, i.e. for a smooth function $f : \mathbf{G} \to \mathbb{R}$ one has $L_{\mathcal{X}}(f)(G) := Df(G) \cdot \mathcal{X}(G)$. On vector fields $\mathcal{Y}$, the action of $L_{\mathcal{X}}(\cdot)$ is given by
$$L_{\mathcal{X}}(\mathcal{Y})(G) := \lim_{t \to 0} \frac{(D\Phi_{\mathcal{X}}(t, G))^{-1} \cdot \mathcal{Y}(\Phi_{\mathcal{X}}(t, G)) - \mathcal{Y}(G)}{t},$$
where $\Phi_{\mathcal{X}}(t, \cdot)$ denotes the corresponding flow of $\mathcal{X}$. Next, we recall two basic facts from differential geometry which play a key role for the proof of Theorem 3.6. The first one shows that the set of right/left invariant vector fields is invariant under Lie derivation, cf. [99, 102]. The second one relates a Riemannian metric of a manifold $M$ with a particular linear connection on $M$. For more details see e.g., [89].

Fact 1. The Lie derivative of a right/left invariant vector field is again right/left invariant and satisfies
$$L_{X_g^r} X_h^r = -X_{[g,h]}^r \quad\text{and}\quad L_{X_g^l} X_h^l = X_{[g,h]}^l. \tag{3.25}$$

Fact 2. On any Riemannian manifold $M$ there exists a unique Riemannian connection $\nabla$ determined by the properties
$$L_{\mathcal{X}} \mathcal{Y} = \nabla_{\mathcal{X}} \mathcal{Y} - \nabla_{\mathcal{Y}} \mathcal{X} \tag{3.26}$$
and
$$\nabla_{\mathcal{X}} \langle \mathcal{Y} \mid \mathcal{Z}\rangle = \langle \nabla_{\mathcal{X}} \mathcal{Y} \mid \mathcal{Z}\rangle + \langle \mathcal{Y} \mid \nabla_{\mathcal{X}} \mathcal{Z}\rangle. \tag{3.27}$$

Now, combining both facts yields the main result about geodesics on Lie groups.

Theorem 3.6. Let $\mathbf{G}$ be a Lie group with a bi-invariant metric $\langle\cdot\mid\cdot\rangle$ and let $\nabla$ denote the unique Riemannian connection on $\mathbf{G}$ induced by $\langle\cdot\mid\cdot\rangle$.
(a) For right/left invariant vector fields the Riemannian connection $\nabla$ is given by
$$\nabla_{X_g^r} X_h^r = -\tfrac{1}{2} X_{[g,h]}^r \quad\text{and}\quad \nabla_{X_g^l} X_h^l = \tfrac{1}{2} X_{[g,h]}^l. \tag{3.28}$$
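(b) The geodesics through any $G \in \mathbf{G}$ are of the form $t \mapsto G \exp(tg)$ or $t \mapsto \exp(tg) G$ with $g \in \mathfrak{g}$. In particular, the geodesics through the unity $1$ are precisely the one-parameter subgroups of $\mathbf{G}$.

Proof. (a) Applying Koszul's identity, cf. [89, 91],
$$2\langle \nabla_{\mathcal{X}} \mathcal{Y} \mid \mathcal{Z}\rangle = L_{\mathcal{X}} \langle \mathcal{Y} \mid \mathcal{Z}\rangle + L_{\mathcal{Y}} \langle \mathcal{Z} \mid \mathcal{X}\rangle - L_{\mathcal{Z}} \langle \mathcal{X} \mid \mathcal{Y}\rangle - \langle \mathcal{X} \mid L_{\mathcal{Y}} \mathcal{Z}\rangle + \langle \mathcal{Y} \mid L_{\mathcal{Z}} \mathcal{X}\rangle + \langle \mathcal{Z} \mid L_{\mathcal{X}} \mathcal{Y}\rangle,$$
to $X_g^r$, $X_h^r$ and $X_k^r$ we obtain
$$2\langle \nabla_{X_g^r} X_h^r \mid X_k^r\rangle = +\langle X_g^r \mid X_{[h,k]}^r\rangle - \langle X_h^r \mid X_{[k,g]}^r\rangle - \langle X_k^r \mid X_{[g,h]}^r\rangle.$$
Now Proposition 3.2 and Fact 1 imply
$$2\langle \nabla_{X_g^r} X_h^r \mid X_k^r\rangle = -\langle X_k^r \mid X_{[g,h]}^r\rangle$$
and hence
$$\nabla_{X_g^r} X_h^r = -\tfrac{1}{2} X_{[g,h]}^r.$$
Obviously, for left invariant vector fields the same arguments apply.
(b) Let $\gamma(t) := \exp(tg) G$. Part (a) implies that the covariant derivative $\nabla_{\dot\gamma(t)} \dot\gamma(t) = \nabla_{X_g^r} X_g^r(\gamma(t))$ of $\dot\gamma$ vanishes and thus $\gamma$ represents the unique geodesic through $G$ with "initial velocity" $\xi = gG$. The same holds for $\tilde\gamma(t) := G \exp(tg)$, cf. [99, 102].

Observe that the bi-invariance of the metric and the invariance of the vector fields are essential for the above result. For example, Eq. (3.28) fails if the Riemannian metric is just right invariant. More details on this topic can be found in [99, 105]. Finally, by Theorem 3.6, the Hessian of the restriction $f|_{\mathbf{H}}$ can easily be obtained by restricting the Hessian of $f$ to $T\mathbf{H}$. More precisely, we have:

Proposition 3.3. Let $f : \mathbf{G} \to \mathbb{R}$ be a smooth quality function on a Lie group with bi-invariant metric $\langle\cdot\mid\cdot\rangle$ and let $\mathbf{H}$ be a closed subgroup. Then the Hessian of $f|_{\mathbf{H}}$ at $H$ is given by
$$\operatorname{Hess} f|_{\mathbf{H}}(H) = \operatorname{Hess} f(H)\big|_{T_H\mathbf{H} \times T_H\mathbf{H}} \tag{3.29}$$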
Note that in general Eq. (3.29) is sheer nonsense unless $\mathbf{H}$ is a Lie subgroup. Counterexamples can be obtained easily for $\mathbf{G} = \mathbb{R}^m$.

3.3. Gradient flows on homogeneous spaces

The subsequent section on homogeneous spaces is motivated by the following observation, cf. Sec. 3.4. As before, let $f : \mathbf{G} \to \mathbb{R}$ be a smooth quality function. In many applications $f$ can be decomposed into a function $F$ defined on a smooth manifold $M$ and a (right) group action $\alpha : (X, G) \mapsto X \cdot G$ on $M$ such that
$$f(G) := F(X \cdot G) \tag{3.30}$$
for some fixed $X \in M$. Then we can think of $f$ as defined on the orbit of $X$. More precisely, let $\tilde f := F|_{O(X)}$, where $O(X) := \{X \cdot G \mid G \in \mathbf{G}\}$ denotes the orbit of $X$.
Thus
$$\tilde f(Y) = f(G) \tag{3.31}$$
for $Y = X \cdot G$ with $G \in \mathbf{G}$. Such quality functions $f$ are called induced by $F$, cf. Sec. 3.4. By construction, we have
$$\max_{G \in \mathbf{G}} f(G) = \max_{Y \in O(X)} \tilde f(Y). \tag{3.32}$$
Moreover, let $\mathbf{H}_X := \{G \in \mathbf{G} \mid X \cdot G = X\}$ denote the stabilizer or, equivalently, isotropy subgroup of $X$. Then $f$ can also be viewed as a function on the right coset space,$^{d}$
$$\mathbf{G}/\mathbf{H}_X := \{\mathbf{H}_X G \mid G \in \mathbf{G}\}, \tag{3.33}$$
which is equivalent to saying that $f$ is equivariant with respect to $\mathbf{H}_X$, i.e.
$$f(G) = f(HG) \tag{3.34}$$
for all $H \in \mathbf{H}_X$. Therefore, coset spaces show up quite naturally in optimizing equivariant quality functions. Note that passing from $\mathbf{G}$ to $\mathbf{G}/\mathbf{H}_X$ can be rather useful in order to avoid certain degeneracies such as continua of critical points.

$^{d}$ Note that the coset terminology in the group literature is not consistent, i.e. right cosets are sometimes called left cosets and vice versa. Here, we stick to the term right coset if the group element is on the right side, i.e. $[G] = \mathbf{H}G$.

3.3.1. Coset spaces

We first collect the fundamental facts on the differential structure of $\mathbf{G}/\mathbf{H}$, where $\mathbf{H}$ is any closed subgroup of $\mathbf{G}$. Detailed expositions can be found in [91, 99, 101, 102, 106].

Theorem 3.7. Let $\mathbf{G}$ be a Lie group with Lie algebra $\mathfrak{g}$ and let $\mathbf{H} \subset \mathbf{G}$ be a closed subgroup with Lie algebra $\mathfrak{h}$. Moreover, let $\mathfrak{p}$ be any complementary subspace to $\mathfrak{h}$, i.e. $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$. Then the following holds:
(a) The quotient topology turns the set of right cosets $\mathbf{G}/\mathbf{H} := \{[G] := \mathbf{H}G \mid G \in \mathbf{G}\}$ into a locally compact Hausdorff space.
(b) There exists a unique manifold structure on $\mathbf{G}/\mathbf{H}$ such that the canonical projection $\Pi : \mathbf{G} \to \mathbf{G}/\mathbf{H}$, $G \mapsto [G]$ is a submersion. In particular, the tangent space of $\mathbf{G}/\mathbf{H}$ at $[1]$ is isomorphic to $\mathfrak{p}$ via the canonical identification $p \mapsto \frac{d}{dt}[\exp tp]\big|_{t=0}$ and thus $\dim \mathbf{G}/\mathbf{H} = \dim \mathbf{G} - \dim \mathbf{H}$.
The following statements refer to the unique manifold structure on $\mathbf{G}/\mathbf{H}$ given in part (b).
(c) The Lie group $\mathbf{G}$ acts smoothly from the right on $\mathbf{G}/\mathbf{H}$ via
$$([G'], G) \mapsto [G'G] \tag{3.35}$$
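such that
$$\hat r_G : \mathbf{G}/\mathbf{H} \to \mathbf{G}/\mathbf{H}, \quad [G'] \mapsto [G'G] \tag{3.36}$$
are diffeomorphisms for all $G \in \mathbf{G}$. Moreover,
$$\Pi \circ l_G : \mathbf{G} \to \mathbf{G}/\mathbf{H}, \quad G' \mapsto [GG'] \tag{3.37}$$
are submersions for all $G \in \mathbf{G}$. Thus the tangent space $T_{[G]}\mathbf{G}/\mathbf{H}$ is given by
$$T_{[G]}\mathbf{G}/\mathbf{H} = D\hat r_G([1])\, T_{[1]}\mathbf{G}/\mathbf{H} = D(\Pi \circ l_G)(1)\,\mathfrak{g} = D(\Pi \circ l_G)(1)(\mathrm{Ad}_{G^{-1}} \mathfrak{p}). \tag{3.38}$$
(d) Moreover, if $\mathbf{H}$ is a normal subgroup, i.e. $G\mathbf{H}G^{-1} = \mathbf{H}$ for all $G \in \mathbf{G}$, then the multiplication $[G] \cdot [G'] := [GG']$ is well-defined and yields a Lie group structure on $\mathbf{G}/\mathbf{H}$.

Proof. Cf. [99, 101, 106].

The Lie group $\mathbf{G}/\mathbf{H}$ given by Theorem 3.7(d) is called the quotient Lie group of $\mathbf{G}$ by $\mathbf{H}$. Moreover, the result provides the possibility to extend the well-known First Isomorphism Law to the category of Lie groups.

Theorem 3.8. Let $\Phi : \mathbf{G} \to \mathbf{G}'$ be a smooth surjective Lie group homomorphism. Then there exists a well-defined Lie group isomorphism $\hat\Phi : \mathbf{G}/\mathbf{H} \to \mathbf{G}'$ with $\mathbf{H} := \ker \Phi$ such that the diagram
$$\begin{array}{ccc} \mathbf{G} & \xrightarrow{\ \Phi\ } & \mathbf{G}' \\ {\scriptstyle \Pi}\downarrow & \nearrow {\scriptstyle \hat\Phi} & \\ \mathbf{G}/\mathbf{H} & & \end{array} \tag{3.39}$$
commutes. Moreover, let $\mathfrak{g}$, $\mathfrak{g}'$ and $\mathfrak{h}$ denote the corresponding Lie algebras and let $\mathfrak{p}$ be any complementary space to $\mathfrak{h}$. Then $D\Phi(1)$ is a surjective Lie algebra homomorphism with $\ker D\Phi(1) = \mathfrak{h}$ and commutative diagram
$$\begin{array}{ccc} \mathfrak{g} & \xrightarrow{\ D\Phi(1)\ } & \mathfrak{g}' \\ {\scriptstyle D\Pi(1)}\downarrow & \nearrow {\scriptstyle D\hat\Phi(1)} & \\ \mathfrak{p} \cong \mathfrak{g}/\mathfrak{h}. & & \end{array} \tag{3.40}$$

Proof. Note that $\mathbf{H} = \ker \Phi$ is a closed normal subgroup of $\mathbf{G}$. Thus by the First Isomorphism Law $\hat\Phi([G]) := \Phi(G)$ for $[G] \in \mathbf{G}/\mathbf{H}$ is a well-defined group isomorphism. Moreover, $\hat\Phi$ is smooth, since $\Pi$ is a smooth submersion by Theorem 3.7. The assertion that $D\Phi(1)$ is a surjective Lie algebra homomorphism follows easily from the properties of the exponential map. Finally, a straightforward application of the chain rule yields Eq. (3.40).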
3.3.2. Orbit theorems and homogeneous spaces

Next, we analyze the relation between group actions and coset spaces. A smooth right Lie group action is a smooth map $\alpha : M \times \mathbf{G} \to M$, $(X, G) \mapsto X \cdot G$ with
$$(X \cdot G) \cdot H = X \cdot (GH) \quad\text{and}\quad X \cdot 1 = X$$
for all $X \in M$ and $G, H \in \mathbf{G}$. The orbit of $X \in M$ under the group action $\alpha$ is defined by $O(X) := \{X \cdot G \mid G \in \mathbf{G}\}$. The action is called transitive if $M = O(X)$ for some and hence for all $X \in M$. Equivalently, one can say that for all $X, Y \in M$ there exists an element $G \in \mathbf{G}$ with $Y = X \cdot G$. Moreover, for $X \in M$ let $\mathbf{H}_X := \{G \in \mathbf{G} \mid X \cdot G = X\}$ denote the stabilizer of $X$ and $\alpha_X : \mathbf{G} \to M$ the map $G \mapsto X \cdot G$. Then the canonical map $\hat\alpha_X : \mathbf{G}/\mathbf{H}_X \to M$ is defined by $[G] \mapsto X \cdot G$.

Theorem 3.9 (Orbit Theorem). Let $\mathbf{G}$ be a Lie group with Lie algebra $\mathfrak{g}$ and let $\alpha : M \times \mathbf{G} \to M$ be a smooth right action of $\mathbf{G}$ on a smooth manifold $M$. Moreover, let $X$ be any point in $M$. Then the following statements are satisfied:
(a) The stabilizer subgroup $\mathbf{H}_X$ is a closed subgroup of $\mathbf{G}$.
(b) Let $\mathfrak{h}_X$ be the Lie algebra of $\mathbf{H}_X$. Then
$$\ker D\alpha_X(1) = \mathfrak{h}_X. \tag{3.41}$$
In particular, the canonical map $\hat\alpha_X : \mathbf{G}/\mathbf{H}_X \to M$ is an injective immersion.
(c) The canonical map $\hat\alpha_X$ is an embedding, i.e. $O(X)$ is a submanifold of $M$ diffeomorphic to $\mathbf{G}/\mathbf{H}_X$, if and only if $\hat\alpha_X$ is proper.$^{e}$ In this case, the tangent space of $O(X)$ at $Y = X \cdot G$ is given by
$$T_Y O(X) = D\alpha_X(G)\, T_G\mathbf{G} = D\alpha_Y(1)\,\mathfrak{g} = D\alpha_Y(1)\, \mathrm{Ad}_{G^{-1}} \mathfrak{p}_X, \tag{3.42}$$
where $\mathfrak{p}_X$ is any complementary subspace of $\mathfrak{h}_X$, i.e. $\mathfrak{g} = \mathfrak{h}_X \oplus \mathfrak{p}_X$.

Proof. (a) The continuity of $\alpha_X$ implies that $\mathbf{H}_X = \alpha_X^{-1}(X)$ is closed.
(b) In order to see that $\hat\alpha_X$ is an injective immersion, consider the identity $\alpha_X \circ r_G = \alpha(\alpha_X(\cdot), G)$ and thus $D\alpha_X(G) \cdot gG = D_1\alpha(X, G) \circ D\alpha_X(1)\, g$. Therefore, $D\alpha_X(1)\, g = 0$ implies
$$\frac{d}{dt} \alpha_X(\exp(tg)) = 0$$
for all $t \in \mathbb{R}$ and hence $\ker D\alpha_X(1) \subset \mathfrak{h}_X$. As the inclusion $\mathfrak{h}_X \subset \ker D\alpha_X(1)$ is obvious, we obtain $\ker D\alpha_X(1) = \mathfrak{h}_X$. Moreover, let $\mathfrak{p}_X$ be any complementary subspace of $\mathfrak{h}_X$. Then, identifying $\mathfrak{p}_X$ with $T_{[1]}\mathbf{G}/\mathbf{H}_X$ yields $D\hat\alpha_X([1]) = D\alpha_X(1)|_{\mathfrak{p}_X}$, cf. Theorem 3.7. Thus $D\hat\alpha_X([1])$ is injective and the same holds for any other $[G] \in \mathbf{G}/\mathbf{H}_X$ by right multiplication $\hat r_G$.

$^{e}$ A map $\varphi$ is called proper if the pre-image $\varphi^{-1}(K)$ of any compact set $K$ is also compact.
(c) The first part follows from a standard embedding criterion on immersed manifolds, cf. [92]. The first equality of Eq. (3.42) is a straightforward consequence of the identity $\alpha_X = \hat\alpha_X \circ \Pi_X$, where $\Pi_X : \mathbf{G} \to \mathbf{G}/\mathbf{H}_X$ denotes the canonical projection. The second one is obtained by $\alpha_Y = \alpha_X \circ l_G = \alpha_X \circ r_G \circ \mathrm{Ad}_G$, while the third one follows from $\mathbf{H}_Y = \mathrm{Ad}_{G^{-1}} \mathbf{H}_X$. For further details see also.

Corollary 3.3. Let $\alpha : M \times \mathbf{G} \to M$ be as in Theorem 3.9 and let $X \in M$ be any point.
(a) If $\mathbf{G}$ is compact then $\mathbf{G}/\mathbf{H}_X$ is diffeomorphic to $O(X)$.
(b) If $\alpha$ is transitive then $\mathbf{G}/\mathbf{H}_X$ is diffeomorphic to $M$.

Proof. (a) This follows readily from Theorem 3.9(c) and the compactness of $\mathbf{G}$. (b) Observe that transitivity of $\alpha$ implies surjectivity of $D\alpha_X(G)$ and $D\hat\alpha_X([G])$. Thus Theorem 3.9(b) yields the desired result, cf. [106].

This gives rise to the following definition. A manifold $M$ is called a homogeneous $\mathbf{G}$-space or for short a homogeneous space, if there exists a transitive smooth Lie group action of $\mathbf{G}$ on $M$. In particular, any coset space $\mathbf{G}/\mathbf{H}$ can be regarded as a homogeneous space via the canonical action $([G'], G) \mapsto [G'G]$ for $[G'] \in \mathbf{G}/\mathbf{H}$ and $G \in \mathbf{G}$. Further results on homogeneous spaces, orbit spaces and principal $\mathbf{G}$-bundles can be found in [96, 101, 106].

Remark 3.4. Note that by Theorem 3.9 the orbit $O(X)$ always carries a manifold structure the topology of which is equal to or finer than the topology induced by $M$.

3.3.3. Reductive homogeneous spaces

Let $M$ be a homogeneous space with transitive Lie group action $\alpha : M \times \mathbf{G} \to M$ and let $\mathbf{H} := \mathbf{H}_X$ be the stabilizer subgroup of a fixed element $X \in M$. Next, we are interested in carrying over the Riemannian structure of $\mathbf{G}$ to $M$ or, equivalently, to $\mathbf{G}/\mathbf{H}$. First, we need some further terminology. As most of the following terms are conveniently expressed via algebraic properties of the pair $(\mathbf{G}, \mathbf{H})$, we focus on the case $M = \mathbf{G}/\mathbf{H}$. Yet one could restate all results in terms of an abstract group action $\alpha$ on $M$.

A homogeneous space $\mathbf{G}/\mathbf{H}$ is reductive, if the Lie algebra $\mathfrak{h}$ of $\mathbf{H}$ has a complementary subspace $\mathfrak{p}$ in $\mathfrak{g}$ such that $\mathfrak{p}$ is $\mathrm{Ad}_{\mathbf{H}}$-invariant, i.e. $H\mathfrak{p}H^{-1} \subset \mathfrak{p}$ for all $H \in \mathbf{H}$. A Riemannian metric $\langle\cdot\mid\cdot\rangle$ on $\mathbf{G}/\mathbf{H}$ is called $\mathbf{G}$-invariant if the mappings $\hat r_G$ are isometries, i.e. if for all $\xi, \eta \in T_{[G']}\mathbf{G}/\mathbf{H}$ and $G, G' \in \mathbf{G}$ the identity
$$\langle \xi \mid \eta\rangle_{[G']} = \langle D\hat r_G([G'])\xi \mid D\hat r_G([G'])\eta\rangle_{[G'G]} \tag{3.43}$$
holds. Moreover, a bilinear form $(\cdot\mid\cdot)$ on $\mathfrak{p}$ is called
(a) $\mathrm{Ad}_{\mathbf{H}}$-invariant if the identity
$$(p \mid p') = (\mathrm{Ad}_H p \mid \mathrm{Ad}_H p') \tag{3.44}$$
is satisfied for all $p, p' \in \mathfrak{p}$ and $H \in \mathbf{H}$.
(b) $\mathrm{ad}_{\mathfrak{h}}$-invariant if the identity
$$(\mathrm{ad}_h p \mid p') = -(p \mid \mathrm{ad}_h p') \tag{3.45}$$
is satisfied for all $p, p' \in \mathfrak{p}$ and $h \in \mathfrak{h}$.

Note that $\mathbf{G}/\mathbf{H}$ is reductive if $\mathbf{G}$ has a bi-invariant metric, as one can choose $\mathfrak{p} := \mathfrak{h}^{\perp}$. Next, we give a generalization of Proposition 3.2 and Theorem 3.4 to homogeneous spaces.

Proposition 3.4. Let $\mathbf{G}/\mathbf{H}$ be a homogeneous space with reductive decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$. The following statements are equivalent:
(a) There exists a $\mathbf{G}$-invariant metric $\langle\cdot\mid\cdot\rangle$ on $\mathbf{G}/\mathbf{H}$.
(b) There exists an $\mathrm{Ad}_{\mathbf{H}}$-invariant scalar product $(\cdot\mid\cdot)$ on $\mathfrak{p}$.
In addition, if $\mathbf{H}$ is connected then (a) and (b) are equivalent to
(c) There exists an $\mathrm{ad}_{\mathfrak{h}}$-invariant scalar product $(\cdot\mid\cdot)$ on $\mathfrak{p}$.

Proof. Cf. [102] and Proposition 3.2.

Theorem 3.10. Let $\mathbf{G}/\mathbf{H}$ be a homogeneous space with reductive decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$. Then $\mathbf{G}/\mathbf{H}$ admits a $\mathbf{G}$-invariant metric if and only if the closure of $\mathrm{Ad}_{\mathbf{H}}|_{\mathfrak{p}} := \{\mathrm{Ad}_H : \mathfrak{p} \to \mathfrak{p} \mid H \in \mathbf{H}\}$ is compact in $GL(\mathfrak{p})$.

Proof. Cf. [89].

Remark 3.5. (a) As a special case, Theorem 3.10 implies the existence of bi-invariant metrics on compact Lie groups, cf. Theorem 3.4 and [89]. (b) Replacing $\mathfrak{p}$ by the quotient space $\mathfrak{g}/\mathfrak{h}$ allows one to state Theorem 3.10 without referring to any reductive decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$ of $\mathfrak{g}$, cf. [89]. Moreover, it can be shown that any homogeneous space $\mathbf{G}/\mathbf{H}$ which admits a $\mathbf{G}$-invariant metric is reductive, cf. [107].

Theorem 3.10 can easily be rephrased for an arbitrary homogeneous $\mathbf{G}$-space $M$ with transitive group action $\alpha : M \times \mathbf{G} \to M$, by choosing $\mathbf{H} := \mathbf{H}_X$ with $X \in M$. Note, however, that for orbits $M := O(X)$ embedded in some larger Riemannian manifold $N$, the invariant metric given by Theorem 3.10 does in general not coincide with the induced metric. This gives rise to the following definition.

A manifold $M$ is called a Riemannian homogeneous $\mathbf{G}$-space or for short Riemannian homogeneous space, if $M$ is a homogeneous $\mathbf{G}$-space with $\alpha$-invariant
metric, which is to say that the mappings $\alpha_G : M \to M$, $\alpha_G(X) := X \cdot G$ are isometries of $M$ for all $G \in \mathbf{G}$, i.e. for all $\xi, \eta \in T_X M$ and $G \in \mathbf{G}$ one has
$$\langle \xi \mid \eta\rangle_X = \langle D\alpha_G(X)\xi \mid D\alpha_G(X)\eta\rangle_{X \cdot G}. \tag{3.46}$$

Proposition 3.5. (a) Any homogeneous space of the form $\mathbf{G}/\mathbf{H}$ with a $\mathbf{G}$-invariant metric is a Riemannian homogeneous space. (b) Any Riemannian homogeneous space is isometric to a homogeneous space of the form $\mathbf{G}/\mathbf{H}$ with a $\mathbf{G}$-invariant metric.

Proof. Follows readily from the previous definitions and Corollary 3.3(b).

3.3.4. Naturally reductive homogeneous spaces and geodesics

Characterizing the Riemannian connection of a homogeneous space and its geodesics are in general advanced issues which we do not want to address here, cf. [102] and the references therein e.g., [108]. However, there are two cases, see (a) and (b) below, which are easy to handle. A homogeneous space $\mathbf{G}/\mathbf{H}$ is called
(a) Naturally reductive if it is reductive with complementary space $\mathfrak{p}$ and $\mathrm{Ad}_{\mathbf{H}}$-invariant scalar product $(\cdot\mid\cdot)$ on $\mathfrak{p}$ such that the identity
$$(P\,\mathrm{ad}_g h \mid k) = -(h \mid P\,\mathrm{ad}_g k) \tag{3.47}$$
is satisfied for all $g, h, k \in \mathfrak{p}$, where $P$ denotes the projection onto $\mathfrak{p}$ along $\mathfrak{h}$.
(b) Cartan-like if it is reductive with complementary space $\mathfrak{p}$ and $\mathrm{Ad}_{\mathbf{H}}$-invariant scalar product $(\cdot\mid\cdot)$ on $\mathfrak{p}$ such that the commutator relations
$$[\mathfrak{h}, \mathfrak{h}] \subset \mathfrak{h}, \quad [\mathfrak{h}, \mathfrak{p}] \subset \mathfrak{p} \quad\text{and}\quad [\mathfrak{p}, \mathfrak{p}] \subset \mathfrak{h} \tag{3.48}$$
are satisfied.

Remark 3.6. If, in definition (a), the complementary space $\mathfrak{p}$ can be chosen as the orthogonal complement of $\mathfrak{h}$ with respect to some $\mathrm{Ad}_{\mathbf{G}}$-invariant scalar product $(\cdot\mid\cdot)$ on $\mathfrak{g}$, then condition (3.47) reduces to
$$(\mathrm{ad}_g h \mid k) = -(h \mid \mathrm{ad}_g k) \tag{3.49}$$
for all $g, h, k \in \mathfrak{p}$.

Lemma 3.2. (a) Every Cartan-like homogeneous space $\mathbf{G}/\mathbf{H}$ is naturally reductive. (b) Every naturally reductive homogeneous space $\mathbf{G}/\mathbf{H}$ is a Riemannian homogeneous space.

Proof. (a) By the commutator relation $[\mathfrak{p}, \mathfrak{p}] \subset \mathfrak{h}$, we have $P\,\mathrm{ad}_g h = 0$ for all $g, h \in \mathfrak{p}$. Thus Eq. (3.47) is satisfied. (b) The assertion follows immediately from Proposition 3.4.

Theorem 3.11 (Coset Version). Let $\mathbf{G}/\mathbf{H}$ be naturally reductive. Then $\mathbf{G}/\mathbf{H}$ is a Riemannian homogeneous space such that all geodesics through $[G] \in \mathbf{G}/\mathbf{H}$ are of
the form
$$t \mapsto [G \exp(t\, \mathrm{Ad}_{G^{-1}} p)] = [\exp(tp)\, G] \tag{3.50}$$
with $p \in \mathfrak{p}$.

Proof. By Lemma 3.2(b), the quotient space $\mathbf{G}/\mathbf{H}$ is also a Riemannian homogeneous space. For a proof of Eq. (3.50) we refer to [91, 102, 105].

The above result can be restated for an arbitrary naturally reductive Riemannian homogeneous $\mathbf{G}$-space.

Theorem 3.12 (Orbit Version). Let $M$ be a homogeneous $\mathbf{G}$-space with transitive group action $\alpha : M \times \mathbf{G} \to M$. Assume that $\mathbf{G}/\mathbf{H}_X$ is naturally reductive with decomposition $\mathfrak{g} = \mathfrak{h}_X \oplus \mathfrak{p}_X$. Then $M$ is a Riemannian homogeneous $\mathbf{G}$-space such that all geodesics through $Y = X \cdot G \in M$ are of the form
$$t \mapsto Y \cdot \exp(t\, \mathrm{Ad}_{G^{-1}} p) \tag{3.51}$$
with $p \in \mathfrak{p}_X$.

Proof. The result is a straightforward consequence of Theorem 3.11.

Thus naturally reductive homogeneous spaces are Riemannian spaces where the exponential map is particularly simple to express. By taking the basic picture of [91] further to discuss geodesics, Fig. 3 illustrates that only in naturally reductive homogeneous spaces the geodesics on $\mathbf{G}$ project to geodesics on $\mathbf{G}/\mathbf{H}$. In this sense, projection and exponentiation of tangent vectors commute in naturally reductive homogeneous spaces. However, on reductive homogeneous spaces that are not naturally reductive, the problem is considerably more involved. A necessary and sufficient condition for $t \mapsto [G \exp(tg)]$ being a geodesic in $\mathbf{G}/\mathbf{H}$ can be found in [102, 109]. On the other hand, for numerical purposes it is often enough and even advisable to approximate the Riemannian exponential map by another computationally more efficient local parametrisation. Here, the map
$$\mathfrak{p} \ni p \mapsto \Pi \circ l_G \circ \exp(\mathrm{Ad}_{G^{-1}} p) \tag{3.52}$$
might be a natural candidate, even if it fails to give the exact Riemannian exponential map. These issues are subject to current research, and recent details can be found in [25, 110]. Figure 3 also shows how in reductive homogeneous spaces that are no longer naturally reductive, the projected geodesic still provides a first-order approximation to the geodesic generated by the projection of the tangent vector.

3.3.5. Adjoint orbits

A prime example for naturally reductive homogeneous spaces is provided by the adjoint action of a compact Lie group, a scenario which is of major interest in
Fig. 3. Geodesics in reductive homogeneous spaces $\mathbf{G}/\mathbf{H}$. The tangent vector $p \in \mathfrak{p}$ projects to the tangent vector $\xi$ at the coset $[1] = \mathbf{H}$. Note that only in naturally reductive homogeneous spaces the geodesic in $\mathbf{G}$ generated by $p$ projects onto $\mathbf{G}/\mathbf{H}$ such that it coincides with the geodesic of the projected tangent vector in the sense $\Pi(e^{tp}) = \exp_{[1]}(t\xi)$. In reductive homogeneous spaces that are not naturally reductive, the projection yields in general only a first-order approximation at $[1] = \mathbf{H}$ as shown in the lower part, where $\Pi(e^{tp}) \neq \exp_{[1]}(t\xi)$. (Color online)
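the forthcoming applications. Therefore, we summarize the previous results for the particular case of adjoint orbits. Note that the adjoint action given by $(X, G) \mapsto \mathrm{Ad}_G X := GXG^{-1}$ is a left action. However, all previous statements and formulas remain valid mutatis mutandis, e.g., right cosets have to be replaced by left cosets, etc.

Corollary 3.4. Let $\mathbf{G}$ be a Lie group with Lie algebra $\mathfrak{g}$ and let $\mathbf{K} \subset \mathbf{G}$ be a compact subgroup with Lie algebra $\mathfrak{k}$ and bi-invariant metric $\langle\cdot\mid\cdot\rangle$. Moreover, let $\alpha : \mathfrak{g} \times \mathbf{K} \to \mathfrak{g}$, $(X, K) \mapsto \mathrm{Ad}_K X := KXK^{-1}$ be the adjoint action of $\mathbf{K}$ on $\mathfrak{g}$ and denote by $\alpha_X : \mathbf{K} \to \mathfrak{g}$ the map $K \mapsto \mathrm{Ad}_K X$. Then the following assertions hold:
(a) The stabilizer group $\mathbf{H} := \mathbf{H}_X$ of $X$ is a closed subgroup of $\mathbf{K}$.
(b) The coset space $\mathbf{K}/\mathbf{H}$ is diffeomorphic to the adjoint orbit $O(X) := \{\mathrm{Ad}_K X \mid K \in \mathbf{K}\}$ of $X$. In particular, the map $\hat\alpha_X : \mathbf{K}/\mathbf{H} \to O(X)$, $[K] \mapsto \mathrm{Ad}_K X$ is a well-defined diffeomorphism satisfying the commutative diagram
$$\begin{array}{ccc} \mathbf{K} & \xrightarrow{\ \alpha_X\ } & O(X) \subset \mathfrak{g} \\ {\scriptstyle \Pi}\downarrow & \nearrow {\scriptstyle \hat\alpha_X} & \\ \mathbf{K}/\mathbf{H} & & \end{array} \tag{3.53}$$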
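(c) Let $\mathfrak{h} := \mathfrak{h}_X$ denote the Lie algebra of $\mathbf{H}$ and $\mathfrak{p}$ be any complementary space to $\mathfrak{h}$ in $\mathfrak{k}$. Then $D\alpha_X(1) = -\mathrm{ad}_X$ is a surjective homomorphism with $\ker \mathrm{ad}_X = \mathfrak{h}$ and commutative diagram
$$\begin{array}{ccc} \mathfrak{k} & \xrightarrow{\ D\alpha_X(1)\ } & T_X O(X) \subset \mathfrak{g} \\ {\scriptstyle D\Pi(1)}\downarrow & \nearrow {\scriptstyle D\hat\alpha_X([1])} & \\ \mathfrak{p} \cong \mathfrak{k}/\mathfrak{h}. & & \end{array} \tag{3.54}$$
Moreover, the tangent space of $O(X)$ at $Y = \mathrm{Ad}_K X$ is given by $T_Y O(X) = \mathrm{ad}_Y \mathfrak{k} = \mathrm{ad}_Y(\mathrm{Ad}_K \mathfrak{p})$.
(d) $O(X) \cong \mathbf{K}/\mathbf{H}$ is naturally reductive. More precisely, $\mathfrak{p} := \mathfrak{h}^{\perp}$ yields a naturally reductive decomposition of $\mathfrak{k}$, where the $\mathrm{Ad}_{\mathbf{H}}$-invariant scalar product on $\mathfrak{p}$ is given by the restriction of $\langle\cdot\mid\cdot\rangle$.
(e) There is a well-defined $\alpha$-invariant metric on $O(X)$ given by
$$\langle \xi \mid \eta\rangle_{\mathrm{Ad}_K X} := \langle p_\xi \mid p_\eta\rangle \tag{3.55}$$
with $\xi = \mathrm{ad}_Y(\mathrm{Ad}_K p_\xi)$, $\eta = \mathrm{ad}_Y(\mathrm{Ad}_K p_\eta)$ and $p_\xi, p_\eta \in \mathfrak{p}$.
(f) All geodesics through $Y = \mathrm{Ad}_K X \in O(X)$ with respect to the metric given in part (e) are of the form
$$t \mapsto \mathrm{Ad}_{\exp(t\, \mathrm{Ad}_K p)} Y \tag{3.56}$$
with $p \in \mathfrak{p}$.

Proof. Parts (a) and (b) follow immediately from Theorem 3.9 and Corollary 3.3.
(c) For $k \in \mathfrak{k}$ we have
$$\frac{d}{dt} \mathrm{Ad}_{\exp(tk)} X \Big|_{t=0} = -\mathrm{ad}_X k$$
and thus $D\alpha_X(1) = -\mathrm{ad}_X$. All other statements are again consequences of Theorem 3.9.
(d) First, observe that the bi-invariance of $\langle\cdot\mid\cdot\rangle$ implies that $\mathfrak{k} = \mathfrak{h} \oplus \mathfrak{p}$ with $\mathfrak{p} := \mathfrak{h}^{\perp}$ is reductive. Now, let $P$ denote the orthogonal projection onto $\mathfrak{p}$. In turn, the bi-invariance of $\langle\cdot\mid\cdot\rangle$ yields
$$\langle P\,\mathrm{ad}_g h \mid k\rangle = \langle \mathrm{ad}_g h \mid k\rangle = -\langle h \mid \mathrm{ad}_g k\rangle = -\langle h \mid P\,\mathrm{ad}_g k\rangle$$
for all $g, h, k \in \mathfrak{p}$, cf. Proposition 3.2. Therefore, $O(X) \cong \mathbf{K}/\mathbf{H}$ is naturally reductive.
(e) Let $Y \in O(X)$ and $\tilde K \in \mathbf{K}$. A straightforward calculation using the identities $D\alpha_{\tilde K}(Y)\xi = \mathrm{Ad}_{\tilde K} \xi$ for $\xi \in T_Y O(X)$ and $\mathrm{Ad}_{\tilde K}(\mathrm{ad}_Y k) = \mathrm{ad}_{\mathrm{Ad}_{\tilde K} Y}(\mathrm{Ad}_{\tilde K} k)$ for all $k \in \mathfrak{k}$ yields the required invariance.
Part (f) follows immediately from Theorem 3.12 and the identity $\mathfrak{h}_Y = \mathrm{Ad}_K \mathfrak{h}_X$ for $Y = \mathrm{Ad}_K X$, which implies $\mathfrak{h}_Y^{\perp} = \mathrm{Ad}_K \mathfrak{p}$.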
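Part (c) of Corollary 3.4 can be made concrete in the smallest non-trivial case. The sketch below is an illustration under assumptions not made in the text: it takes $\mathbf{K} = SO(3)$, uses a standard basis $L_1, L_2, L_3$ of $\mathfrak{so}(3)$, and picks $X = L_3$. The helper names `so3_basis` and `ad_matrix` are hypothetical. The matrix of $\mathrm{ad}_X$ then has rank 2 and kernel spanned by $X$, so the orbit is two-dimensional (a sphere), in line with $\dim \mathbf{K}/\mathbf{H} = \dim \mathbf{K} - \dim \mathbf{H}$.

```python
import numpy as np

def so3_basis():
    """Standard basis of so(3), satisfying [L_i, L_j] = eps_ijk L_k."""
    L1 = np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]], dtype=float)
    L2 = np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]], dtype=float)
    L3 = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], dtype=float)
    return [L1, L2, L3]

def ad_matrix(X, basis):
    """Matrix of ad_X = [X, .] in the given basis.

    The basis is orthogonal for (a | b) = tr(a^T b), so coordinates are
    obtained by projecting onto each basis element."""
    n = len(basis)
    m = np.zeros((n, n))
    for j, b in enumerate(basis):
        bracket = X @ b - b @ X
        for i, a in enumerate(basis):
            m[i, j] = np.trace(a.T @ bracket) / np.trace(a.T @ a)
    return m

basis = so3_basis()
X = basis[2]                                  # generator of rotations about the z-axis
A = ad_matrix(X, basis)

orbit_dim = int(np.linalg.matrix_rank(A))     # dim T_X O(X) = rank of ad_X
stabilizer_dim = len(basis) - orbit_dim       # dim h_X = dim ker ad_X
print(orbit_dim, stabilizer_dim)
```

Replacing `X` by any other non-zero element of $\mathfrak{so}(3)$ gives the same dimensions, whereas `X = 0` yields a zero-dimensional orbit with the whole group as stabilizer.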
3.3.6. Gradient flows on Riemannian homogeneous spaces

Applying the previous results on gradient flows to quality functions $\tilde f$ on Riemannian homogeneous spaces $\mathbf{G}/\mathbf{H}$, we obtain by the $\mathbf{G}$-invariance of the Riemannian metric, similar to (3.20), the gradient equality
$$\operatorname{grad} \tilde f([G]) = D\hat r_G([1]) \operatorname{grad}(\tilde f \circ \hat r_G)([1]) \tag{3.57}$$
for all $G \in \mathbf{G}$, where $\hat r_G$ denotes the mapping $[G'] \mapsto [G'G]$. Similar to Eq. (3.20), the gradient of $\tilde f$ is therefore completely determined by
$$G \mapsto \operatorname{grad}(\tilde f \circ \hat r_G)([1]) \in \mathfrak{p}. \tag{3.58}$$
However, Eq. (3.58) does not induce a mapping from $\mathbf{G}/\mathbf{H}$ to $\mathfrak{p}$, as in general $\operatorname{grad}(\tilde f \circ \hat r_G)([1]) \neq \operatorname{grad}(\tilde f \circ \hat r_{HG})([1])$ for $H \in \mathbf{H} \setminus \{1\}$. Now, for analyzing the asymptotic behavior of
$$\dot{[G]} = \operatorname{grad} \tilde f([G]) \tag{3.59}$$
Sec. 3.1 provides again the appropriate tools. For instance, if $\mathbf{G}/\mathbf{H}$ is compact we have:

Corollary 3.5. Let $\mathbf{G}/\mathbf{H}$ be a compact Riemannian homogeneous space and let $\tilde f : \mathbf{G}/\mathbf{H} \to \mathbb{R}$ be real analytic. Then any solution of Eq. (3.59) converges to a critical point of $\tilde f$ for $t \to +\infty$.

Proof. This follows immediately from Proposition 3.1 and Theorem 3.1, as a Riemannian homogeneous space always constitutes a real analytic Riemannian manifold, cf. [99, 101].

Finally, we return to our starting point and ask for the relation between (3.59) and (3.21) in the case of an $\mathbf{H}$-equivariant quality function $f$. Then $f$ induces a quality function $\tilde f$ on $\mathbf{G}/\mathbf{H}$ via
$$\tilde f([G]) := f(G) \tag{3.60}$$
for all $G \in \mathbf{G}$. Moreover, assume $\mathbf{G}$ carries a bi-invariant metric $\langle\cdot\mid\cdot\rangle$ and $\mathbf{G}/\mathbf{H}$ is a homogeneous space with reductive decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$ and $\mathfrak{p} := \mathfrak{h}^{\perp}$. This implies that the restriction of $\langle\cdot\mid\cdot\rangle$ to $\mathfrak{p} \times \mathfrak{p}$ is $\mathrm{Ad}_{\mathbf{H}}$-invariant. Now, the identity $\tilde f \circ \Pi = f$ yields
$$D\tilde f([G]) \cdot D\Pi(G) = Df(G) \quad\text{for all } G \in \mathbf{G} \tag{3.61}$$
and hence
$$(D\Pi(G))^{*} \operatorname{grad} \tilde f([G]) = \operatorname{grad} f(G) \tag{3.62}$$
for all $G \in \mathbf{G}$, where $\Pi$ denotes the canonical projection and $(\cdot)^{*}$ the adjoint mapping. By identifying $\mathfrak{p}$ with the tangent space of $\mathbf{G}/\mathbf{H}$ at $[1]$, the map $D\Pi(1)$ represents the orthogonal projector $h + p \mapsto p$ for $h \in \mathfrak{h}$ and $p \in \mathfrak{p}$. Thus we obtain $D\Pi(1)(D\Pi(1))^{*} = \mathrm{id}_{\mathfrak{p}}$.
In the same way, using the identity Π ◦ rG = rG ◦ Π, one shows DΠ(G)(DΠ(G))∗ = idT[G] G/H for all G ∈ G. Consequently, (3.62) yields grad f([G]) = DΠ(G) grad f (G)
(3.63)
for all G ∈ G. Therefore, we have proven the following result:

Theorem 3.13. Suppose G/H satisfies the above assumptions and f : G → R is an H-equivariant quality function with induced quality function f : G/H → R. Then the canonical projection of the gradient flow of Eq. (3.21) onto G/H yields the gradient flow of Eq. (3.59), i.e. if G(t) is a solution of Eq. (3.21) then Π(G(t)) is one of Eq. (3.59).

3.3.7. Discretized gradient flows on naturally reductive homogeneous spaces

As before, let G be a Lie group with bi-invariant metric and let f be an equivariant quality function with respect to the closed subgroup H, i.e. for all H ∈ H one has f(G) = f(HG), so f|_{HG} = constant for every G ∈ G. Moreover, assume that G/H is a naturally reductive coset space. Implementing a gradient algorithm for the induced quality function f on G/H finally yields the following recursion scheme

\[ [G_{k+1}] := [\exp(\alpha_k \operatorname{grad} f(G_k)\, G_k^{-1})\, G_k], \tag{3.64} \]

where α_k > 0 denotes a suitable step size. This, however, is not surprising, which can be seen as follows. With G/H being naturally reductive, there is the reductive decomposition g = h ⊕ p with p := h^⊥, such that any Ω ∈ g decomposes uniquely into Ω = Ω^h + Ω^p. Then the equivariance of f guarantees that its gradient at G ∈ G is orthogonal to the coset HG. Thus one finds ⟨grad f(G) | Ω^h G⟩ = Df(G) Ω^h G = 0 for all Ω^h ∈ h. Therefore, the “pullback” of the gradient of f to g satisfies grad f(G) G^{-1} ∈ p. Furthermore, combining Eqs. (3.38) and (3.63) with the identity D(Π ∘ l_G)(1)Ω = DΠ(G)GΩ for all Ω ∈ g (cf. Remark 3.3) yields grad f([G]) = D(Π ∘ l_G)(1)(G^{-1} grad f(G)). Thus from Eq. (3.50) we finally obtain exp_{[G]}(t grad f([G])) = [exp(t grad f(G) G^{-1}) G] for all t ∈ R, where exp_{[G]} denotes the Riemannian exponential map at [G], cf. Eqs. (2.10) and (2.11). This precisely explains why recursion scheme (3.64) resembles the corresponding one on the group level.
3.4. Examples Often practically relevant quality functions take the form of a linear functional restricted to an adjoint orbit O(X). For instance, in quantum dynamics the unitary orbit O(A) := {UAU † | U ∈ SU (N )}
(3.65)
of an initial state A plays a central role, because it defines the largest reachability set under closed Hamiltonian dynamics. Then the set of feasible expectation values is such a linear map, since it is the projection onto an observable C in the sense of a Hilbert–Schmidt scalar product. These expectation values can be generalized to arbitrary complex square matrices A, C ∈ CN ×N such as to coincide with elements of the C-numerical range W (C, A) := {tr(C † UAU † ) | U ∈ SU (N )}.
(3.66)
As C-numerical ranges are well established in the mathematical literature [111,112], in the sequel we will adopt the notation. Note that finding the maximum absolute value, i.e. the C-numerical radius

\[ r(C, A) := \max_{U \in SU(N)} |\mathrm{tr}\{C^\dagger UAU^\dagger\}| \tag{3.67} \]
is straightforward for Hermitian A, C (it amounts to sorting the respective eigenvalues, cf. Corollary 3.8), while for arbitrary complex A, C there is no general analytical solution. Moreover, when restricting to local unitary operations K ∈ SUloc(2^n) := SU(2)^{⊗n}, the maximization task becomes non-trivial even for Hermitian A, C [113, 114]. Having set the frame, we now illustrate the previous theory by gradient flows on the entire unitary group SU(2^n), on the local unitary group SU(2)^{⊗n}, as well as on their adjoint orbits.

3.4.1. Geometric optimization by gradient flows on SU(N)

Consider a fully controllable system (Σ) on SU(N) in the sense that the entire group SU(N) can be generated by evolutions under the Hamiltonian of the system plus the available controls. If A is an initial density operator or a matrix collecting its signal-relevant terms, then the reachable set of A coincides with the orbit of the canonical (semi)group action of (Σ) on A, which is the entire unitary orbit O(A), cf. Eq. (3.65). Recall that its “projection” on some observable C (or its signal-relevant terms) forms the C-numerical range of A, cf. Eq. (3.66). In this setting, there are two geometric optimization tasks of particular practical relevance as they determine maximal signal intensity in coherent spectroscopy [27].

(a) Find all points on the unitary orbit of A that minimize the Euclidean distance to C.
(b) Find all points on the unitary orbit of A that minimize the angle to the 1-dimensional complex subspace spanned by C.

Clearly, the distance

\[ \|UAU^\dagger - C\|_2^2 = \|A\|_2^2 + \|C\|_2^2 - 2\,\mathrm{Re}\{\mathrm{tr}(C^\dagger UAU^\dagger)\} \tag{3.68} \]

is minimal if the overlap Re{tr(C†UAU†)} is maximal. Moreover, making use of the definition of the angle between 1-dimensional complex subspaces

\[ \cos^2(\angle\{UAU^\dagger, C\}) := \frac{|\mathrm{tr}\{C^\dagger UAU^\dagger\}|^2}{\|A\|_2^2 \cdot \|C\|_2^2}, \tag{3.69} \]
problem (b) is equivalent to maximizing the function |tr(C†UAU†)|. Its maximal value is the C-numerical radius of A (see Eq. (3.67)). Obviously, r_C(A) ≤ ‖A‖_2 · ‖C‖_2 with equality if and only if UAU† and C are complex collinear for some U ∈ SU(N). Note that the two tasks (a) and (b) are equivalent whenever the C-numerical range forms a circular disk in the complex plane (centred at the origin); conditions for circular symmetry have been characterized in [115]. Extending concepts of Brockett [17] from the orthogonal to the special unitary group [27, 28, 116], the above optimization problems (a) and (b) can be treated by the previously presented gradient-flow methods, cf. also [22, 23]. For fixed matrices A, C ∈ C^{N×N} define

\[ f_1 : SU(N) \to \mathbb{R}, \quad f_1(U) := \mathrm{Re}\,\mathrm{tr}(C^\dagger UAU^\dagger) \tag{3.70} \]

and

\[ f_2 : SU(N) \to \mathbb{R}, \quad f_2(U) := |\mathrm{tr}(C^\dagger UAU^\dagger)|^2. \tag{3.71} \]
Observe that the distance problem (a) is solved by maximizing f_1, while the angle problem (b) is solved by maximizing f_2. Now, the differential and the gradient of f_1 with respect to the bi-invariant Riemannian metric Eq. (3.77) are given by

\[ Df_1(U)(\Omega U) = \mathrm{Re}\,\mathrm{tr}([UAU^\dagger, C^\dagger]\Omega), \qquad \operatorname{grad} f_1(U) = [UAU^\dagger, C^\dagger]_S^\dagger\, U, \]

as will be illustrated in the worked example below. The differential and the gradient of f_2 can be obtained in the same manner as

\[ Df_2(U)(\Omega U) = \mathrm{tr}(C^\dagger UAU^\dagger)^* \cdot \mathrm{tr}([UAU^\dagger, C^\dagger]\Omega) - \mathrm{tr}(C^\dagger UAU^\dagger) \cdot \mathrm{tr}([UAU^\dagger, C^\dagger]^\dagger \Omega), \]

\[ \operatorname{grad} f_2(U) = 2\,\big(\mathrm{tr}(C^\dagger UAU^\dagger)^* \cdot [UAU^\dagger, C^\dagger]\big)_S^\dagger\, U. \]

This yields the following result.

Theorem 3.14. The gradient systems of f_ν, ν = 1, 2 with respect to the bi-invariant Riemannian metric (3.77) are given by

\[ \dot U = \Omega_\nu(U)\, U \tag{3.72} \]

with

\[ \Omega_1(U) := [UAU^\dagger, C^\dagger]_S^\dagger \quad \text{and} \quad \Omega_2(U) := 2\,\big(\mathrm{tr}(C^\dagger UAU^\dagger)^* \cdot [UAU^\dagger, C^\dagger]\big)_S^\dagger, \tag{3.73} \]
respectively. Each solution of (3.72) converges to a respective critical point for t → +∞. Thereby, the critical points of f_ν are characterized by Ω_ν(U) = 0, ν = 1, 2.

Proof. The above computations immediately yield Eq. (3.72). As f_ν, ν = 1, 2 are real analytic, the convergence of each solution to a critical point is guaranteed by Proposition 3.1 and Theorem 3.1, cf. [116].

An implementable numerical integration scheme for the above gradient systems making use of the Riemannian exponential, see Eqs. (2.9) and (2.11), is given by

\[ U_{k+1}^{(\nu)} = \exp\big(\alpha_k^{(\nu)}\, \Omega_\nu(U_k^{(\nu)})\big)\, U_k^{(\nu)}, \qquad U_0^{(\nu)} = 1_N. \tag{3.74} \]

A suitable choice of step sizes α_k^{(ν)} > 0 ensuring convergence can be found in [27, 28, 116]. Generically, it drives U_k^{(ν)} into final states attaining the maxima of the quality functions f_ν, ν = 1, 2. However, there is no guarantee that the gradient flows always reach the global maxima. Standard numerical integration procedures such as, e.g., the Euler method are not applicable here as they would not preserve unitarity.

3.4.2. Worked example

We now derive the discretized integration scheme maximizing the quality function f_1 in all detail. To this end, recall that SU(N) is a compact connected Lie group of real dimension N² − 1. Its Lie algebra, i.e. its tangent space at the identity, is given by the set su(N) of all skew-Hermitian matrices Ω with tr Ω = 0, i.e.

\[ \mathfrak{su}(N) := \{\Omega \in \mathbb{C}^{N \times N} \mid \Omega^\dagger = -\Omega,\ \mathrm{tr}\,\Omega = 0\}. \tag{3.75} \]
So elements Ω ∈ su(N ) relate to Hamiltonians H via Ω = iH. The tangent space at an arbitrary element U ∈ SU (N ) is TU SU (N ) = su(N )U = {ΩU | Ω ∈ su(N )},
(3.76)
cf. Eq. (3.13). Moreover, let SU(N) be endowed with the bi-invariant Riemannian metric

\[ \langle \Omega U \mid \Xi U \rangle_U := \mathrm{tr}(\Omega^\dagger \Xi), \tag{3.77} \]

defined on the tangent spaces T_U SU(N), cf. Eq. (3.15). Now set

\[ F : SU(N) \to \mathbb{C}^{N \times N}, \quad F(U) := C^\dagger UAU^\dagger, \]
\[ f : SU(N) \to \mathbb{R}, \quad f(U) := \mathrm{Re}\,\mathrm{tr}\{C^\dagger UAU^\dagger\}. \]

For computing the tangent map of F, we exploit the fact that SU(N) is an embedded submanifold of C^{N×N}. Therefore, the tangent map is obtained by restricting the ordinary Fréchet derivative DF(U) to the tangent space T_U SU(N), cf. Eqs. (3.4) and (3.5). Thus, by applying the product rule, one easily finds

\[ DF(U)(\Omega U) = C^\dagger \Omega UAU^\dagger + C^\dagger UA(\Omega U)^\dagger = C^\dagger \Omega UAU^\dagger - C^\dagger UAU^\dagger \Omega. \]

Now, the chain rule as well as the short-hand notations Ã := UAU† and [·, ·]_S to denote the skew-Hermitian part of the commutator [·, ·] give

\begin{align*}
Df(U)(\Omega U) &= D(\mathrm{Re}\,\mathrm{tr})(F(U)) \circ DF(U)(\Omega U) \\
&= \mathrm{Re}\,\mathrm{tr}\{C^\dagger \Omega \tilde A - C^\dagger \tilde A \Omega\} = \mathrm{Re}\,\mathrm{tr}\{[\tilde A, C^\dagger]\Omega\} = \mathrm{tr}\{[\tilde A, C^\dagger]_S\, \Omega\} \\
&= \langle [\tilde A, C^\dagger]_S^\dagger \mid \Omega \rangle = \langle [\tilde A, C^\dagger]_S^\dagger U \mid \Omega U \rangle,
\end{align*}

where the last identity explicitly invokes the right-invariance of the Riemannian metric on SU(N), cf. Eq. (3.77). Next, identifying the above expression with

\[ Df(U)(\Omega U) = \langle \operatorname{grad} f(U) \mid \Omega U \rangle \tag{3.78} \]

one gets the gradient vector field

\[ \operatorname{grad} f(U) = [\tilde A, C^\dagger]_S^\dagger\, U \tag{3.79} \]

and thus the gradient system

\[ \dot U = \operatorname{grad} f(U) = -[\tilde A, C^\dagger]_S\, U. \tag{3.80} \]

By the Riemannian exponential, see Eqs. (2.9) and (2.11), and with α_k ≥ 0 as an appropriate step size we finally arrive at the discretization

\[ U_{k+1} = e^{-\alpha_k [U_k A U_k^\dagger,\, C^\dagger]_S}\, U_k. \tag{3.81} \]
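The iteration (3.81) can be tried out numerically. The following is a minimal sketch in Python with NumPy, not the authors' implementation. The helper names (`skew_part`, `exp_skew`, `gradient_ascent`) are hypothetical, the matrix exponential is computed from an eigendecomposition, and the constant step size α = 1/(4‖A‖‖C‖) (Frobenius norms) is an assumption made here: a second-order Taylor bound shows that f cannot decrease with this choice, but it is not the step-size rule of [27, 28, 116].

```python
import numpy as np

def skew_part(M):
    """Skew-Hermitian part [.]_S of a square matrix."""
    return 0.5 * (M - M.conj().T)

def exp_skew(Omega):
    """exp(Omega) for skew-Hermitian Omega, via the Hermitian matrix H = -i*Omega."""
    lam, V = np.linalg.eigh(-1j * Omega)
    return (V * np.exp(1j * lam)) @ V.conj().T

def f1(U, A, C):
    """Quality function f(U) = Re tr(C^dagger U A U^dagger)."""
    return float(np.real(np.trace(C.conj().T @ U @ A @ U.conj().T)))

def gradient_ascent(A, C, U0, steps=200):
    """Iterate U_{k+1} = exp(-alpha [U_k A U_k^dagger, C^dagger]_S) U_k."""
    # Assumed constant step size (not taken from the source).
    alpha = 1.0 / (4.0 * np.linalg.norm(A) * np.linalg.norm(C))
    U = U0.astype(complex)
    Cd = C.conj().T
    values = [f1(U, A, C)]
    for _ in range(steps):
        A_t = U @ A @ U.conj().T              # current point on the orbit
        Omega = -skew_part(A_t @ Cd - Cd @ A_t)  # grad f(U) U^{-1}
        U = exp_skew(alpha * Omega) @ U        # stays unitary by construction
        values.append(f1(U, A, C))
    return U, values
```

Because every update multiplies by the exponential of a traceless skew-Hermitian matrix, the iterates remain in SU(N) up to rounding, which is the property an Euler step would lose.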
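3.4.3. Gradient flows on the local subgroup SUloc(2^n)

The quality functions introduced in the previous subsection may be restricted to the subgroup of local action, i.e. to

\[ SU_{\mathrm{loc}}(2^n) := \underbrace{SU(2) \otimes \cdots \otimes SU(2)}_{n\text{-times}} \subset SU(2^n). \tag{3.82} \]

Let the Pauli matrices be defined as

\[ \sigma_x := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y := \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{3.83} \]

Moreover, the σ_{k,α}, α ∈ {x, y, z} are defined by

\[ \sigma_{k,\alpha} := 1_2 \otimes \cdots \otimes 1_2 \otimes \sigma_\alpha \otimes 1_2 \otimes \cdots \otimes 1_2, \tag{3.84} \]

where the term σ_α appears in the kth position of the Kronecker product and 1_2 denotes the 2×2 identity matrix.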
The Lie subalgebra to SUloc(2^n) ⊂ SU(2^n) can be specified by

\[ \mathfrak{su}_{\mathrm{loc}}(2^n) := \Big\{ \sum_{k=1}^{n} 1_2 \otimes \cdots \otimes 1_2 \otimes \Omega_k \otimes 1_2 \otimes \cdots \otimes 1_2 \;\Big|\; \Omega_k \in \mathfrak{su}(2) \Big\}, \]

with the term Ω_k ∈ su(2) appearing at the kth position, cf. Eq. (3.84). Then the tangent space of SUloc(2^n) at an arbitrary element U is given by

\[ T_U SU_{\mathrm{loc}}(2^n) = \{\Omega U \mid \Omega \in \mathfrak{su}_{\mathrm{loc}}(2^n)\}. \tag{3.85} \]

Finally, SUloc(2^n) is endowed with the bi-invariant Riemannian metric induced by SU(2^n), i.e.

\[ \langle \Omega U \mid \Xi U \rangle_U := \mathrm{tr}(\Omega^\dagger \Xi) \tag{3.86} \]

for ΩU, ΞU ∈ T_U SUloc(2^n).

Lemma 3.3. Let H ⊂ GL(N, C) be any closed subgroup with Lie algebra h ⊂ gl(N, C) := C^{N×N}. Moreover let h_1, ..., h_m be a real orthonormal basis of h with respect to the real scalar product

\[ (g_1 \mid g_2) := \mathrm{Re}\,\mathrm{tr}(g_1^\dagger g_2), \qquad g_1, g_2 \in \mathbb{C}^{N \times N}, \tag{3.87} \]

i.e. span_R{h_1, ..., h_m} = h and (h_i | h_j) = δ_ij.

(a) Then the orthogonal projection P : C^{N×N} → C^{N×N} onto h is given by

\[ g \mapsto P g := \sum_{j=1}^{m} \mathrm{Re}\,\mathrm{tr}\{h_j^\dagger g\}\, h_j. \tag{3.88} \]

(b) The orthogonal projection P^⊥ : C^{N×N} → C^{N×N} onto the orthogonal complement h^⊥ is given by g ↦ P^⊥ g = g − P g.

Proof. Both (a) and (b) are basic and well-known facts from linear algebra.

Remark 3.7. For the unitary case, i.e. for h ⊂ su(N), the real part in Eq. (3.88) can be neglected and the projector P can be rewritten in the more convenient matrix form

\[ P := \sum_{j=1}^{m} \mathrm{vec}(h_j)\, \mathrm{vec}(h_j)^\dagger, \tag{3.89} \]

where the terms vec(h_j) vec(h_j)† represent the rank-1 projectors P_j = |h_j⟩⟨h_j| in vec-notation.

Corollary 3.6. The orthogonal projection P : C^{N×N} → C^{N×N} onto su_loc(2^n) with respect to (3.87) is given by

\[ P g := \frac{1}{2^n} \sum_{k=1}^{n} \big( \mathrm{Re}(\mathrm{tr}(g^\dagger X_k))\, X_k + \mathrm{Re}(\mathrm{tr}(g^\dagger Y_k))\, Y_k + \mathrm{Re}(\mathrm{tr}(g^\dagger Z_k))\, Z_k \big), \]

where X_k, Y_k and Z_k are defined by, cf. Eq. (3.84),

\[ X_k := i\sigma_{k,x}, \qquad Y_k := i\sigma_{k,y}, \qquad Z_k := i\sigma_{k,z}. \]
Proof. This follows straightforwardly from the orthogonality of the set {X_k, Y_k, Z_k | k = 1, ..., n} and Lemma 3.3.

Theorem 3.15. Let f_loc be the restriction of (3.70) to SUloc(2^n).

(a) The gradient of f_loc with respect to (3.86) and the corresponding gradient system are given by

\[ \operatorname{grad} f_{\mathrm{loc}}(U) = P([C^\dagger, UAU^\dagger])\, U \tag{3.90} \]

and

\[ \dot U = P([C^\dagger, UAU^\dagger])\, U, \tag{3.91} \]

where P denotes the orthogonal projection P : gl(2^n, C) → gl(2^n, C) onto su_loc(2^n). More explicitly, (3.91) is equivalent to a system of n coupled equations

\[ \dot U_k = \Omega_k U_k, \qquad k = 1, \dots, n \tag{3.92} \]

on SU(2), where

\[ \Omega_k = \frac{1}{2^n} \big( \mathrm{Re}(\mathrm{tr}([C^\dagger, UAU^\dagger]^\dagger X_k))\, X + \mathrm{Re}(\mathrm{tr}([C^\dagger, UAU^\dagger]^\dagger Y_k))\, Y + \mathrm{Re}(\mathrm{tr}([C^\dagger, UAU^\dagger]^\dagger Z_k))\, Z \big). \]

Each solution of (3.91) converges for t → ±∞ to a critical point of f_loc characterized by

\[ P([C^\dagger, UAU^\dagger]) = 0. \tag{3.93} \]

(b) The Hessian form Hess f_loc(U) and the Hessian operator Hess f_loc(U) of f_loc at U are given by

\[ \operatorname{Hess} f_{\mathrm{loc}}(U)(\Omega U, \Xi U) = \tfrac{1}{2} \big( \mathrm{Re}(\mathrm{tr}(\Omega^\dagger [C^\dagger, [\Xi, UAU^\dagger]])) + \mathrm{Re}(\mathrm{tr}(\Omega^\dagger [UAU^\dagger, [\Xi, C^\dagger]])) \big) \tag{3.94} \]

and

\[ \operatorname{Hess} f_{\mathrm{loc}}(U)\, \Omega U = (S(U)\Omega)\, U, \tag{3.95} \]

respectively, with Ω ∈ su_loc(2^n) and

\[ S(U)\Omega := \tfrac{1}{2} P\big([C^\dagger, [\Omega, UAU^\dagger]] + [UAU^\dagger, [\Omega, C^\dagger]]\big). \]

(c) For all initial points U_0 ∈ SUloc(2^n) the discretization scheme

\[ U_{k+1} := \exp\big(\alpha_k P([C^\dagger, U_k A U_k^\dagger])\big)\, U_k \tag{3.96} \]

with step size

\[ \alpha_k = \frac{\|P([C^\dagger, U_k A U_k^\dagger])\|^2}{\|[C^\dagger, P([C^\dagger, U_k A U_k^\dagger])]\| \cdot \|[P([C^\dagger, U_k A U_k^\dagger]), U_k A U_k^\dagger]\|} \tag{3.97} \]
converges to the set of critical points of f_loc.

Proof. The subsequent arguments follow our conference report [117], which also contains a complete proof for the flow on the entire groups such as SU(2^n).

(a) Since SUloc(2^n) is a closed subgroup of SU(2^n), it is also an embedded Lie subgroup and thus a submanifold of SU(2^n), cf. Remark 3.2. Therefore, the gradient of f_loc is well-defined by (3.4). Furthermore, by (3.23) and (3.73) we obtain

\[ \operatorname{grad} f_{\mathrm{loc}}(U) = P(\operatorname{grad} f_1(U)) = P([UAU^\dagger, C^\dagger]^\dagger)\, U = P([C^\dagger, UAU^\dagger])\, U, \]

where the last equality follows from P([UAU†, C†]†) = −P([UAU†, C†]) and the skew-symmetry of the commutator. Moreover, Eq. (3.92) is derived by Corollary 3.6 and the identity

\[ \frac{d}{dt}\big(U_1(t) \otimes \cdots \otimes U_n(t)\big) = \Big( \sum_{k=1}^{n} 1_2 \otimes \cdots \otimes \dot U_k(t) U_k^{-1}(t) \otimes \cdots \otimes 1_2 \Big) \times \big(U_1(t) \otimes \cdots \otimes U_n(t)\big). \]

Compactness of SUloc(2^n) and real analyticity of f_loc imply that each solution converges to critical points for t → +∞, cf. Proposition 3.1 and Theorem 3.1.

(b) By (3.9), the Hessian of f_loc at U is determined by evaluating the second derivative of φ := f ∘ γ at t = 0, where γ is any geodesic. This yields

\[ \operatorname{Hess} f_{\mathrm{loc}}(U)(\Omega U, \Omega U) := \varphi''(0) = \mathrm{Re}(\mathrm{tr}(C^\dagger [\Omega, [\Omega, UAU^\dagger]])), \tag{3.98} \]

for Ω ∈ su_loc(2^n). The Hessian then is obtained from the quadratic form (3.98) by a standard polarisation argument, Eq. (3.8), i.e.

\[ \operatorname{Hess} f_{\mathrm{loc}}(U)(\Omega U, \Xi U) = \tfrac{1}{2} \big( \mathrm{Re}(\mathrm{tr}(C^\dagger [\Omega, [\Xi, UAU^\dagger]])) + \mathrm{Re}(\mathrm{tr}(C^\dagger [\Xi, [\Omega, UAU^\dagger]])) \big). \]

Finally, by the identity tr([X, Y]Z) = −tr(Y[X, Z]) we conclude

\[ \operatorname{Hess} f_{\mathrm{loc}}(U)(\Omega U, \Xi U) = \tfrac{1}{2} \big( \mathrm{Re}(\mathrm{tr}(\Omega^\dagger [C^\dagger, [\Xi, UAU^\dagger]])) + \mathrm{Re}(\mathrm{tr}(\Omega^\dagger [UAU^\dagger, [\Xi, C^\dagger]])) \big). \]

Therefore, the Hessian operator of f_loc at U is given by Hess f_loc(U)ΩU = (S(U)Ω)U with Ω ∈ su_loc(2^n) and

\[ S(U)\Omega := \tfrac{1}{2} P\big([C^\dagger, [\Omega, UAU^\dagger]] + [UAU^\dagger, [\Omega, C^\dagger]]\big). \]
(c) Estimating the second derivative φ''(t) = Re(tr([C†, Ω][Ω, e^{tΩ}UAU†e^{−tΩ}])) for Ω := grad f_loc(U) = P([C†, UAU†]) and U ∈ SUloc(2^n) yields

\[ |\varphi''(t)| \le \|[C^\dagger, \Omega]\| \cdot \|[\Omega, e^{t\Omega} UAU^\dagger e^{-t\Omega}]\| = \|[C^\dagger, \Omega]\| \cdot \|[\Omega, UAU^\dagger]\|. \]

Therefore, we get the estimate

\[ \max_{t \ge 0} \Big| \frac{d^2}{dt^2} f_{\mathrm{loc}}(\exp_U(\Omega t)) \Big| \le \|[C^\dagger, \Omega]\| \cdot \|[\Omega, UAU^\dagger]\| \]

for Ω := grad f_loc(U). Now, a standard Lyapunov-type argument, similar to the proof of Theorem 3.3 in [22], yields the desired result.

For similar discretization schemes in different contexts or other intrinsic Riemannian methods see also [19, 22, 27, 118].

3.4.4. Double-bracket flows as gradient flows on naturally reductive homogeneous spaces

The well-known double-bracket flows have established themselves as useful tools for diagonalizing matrices (usually real symmetric ones) as well as for sorting lists [17, 19, 22, 23]. Moreover, they relate to Hamiltonian integrable systems [119, 120]. (Note again that in many-particle physics gradient flows were later introduced independently for diagonalizing Hamiltonians [51,52].) In summarizing the most important results we show that double-bracket flows can be viewed as special cases of gradient flows on naturally reductive homogeneous spaces G/H in terms of Sec. 3.3, where H is a stabilizer group, which is typically not normal. Then the homogeneous space G/H does not constitute a group itself.

Let O(A) as in Eq. (3.65) denote the unitary orbit of some A ∈ C^{N×N}. Note that the adjoint action (U, A) ↦ Ad_U A := UAU† of SU(N) constitutes a left action on the Lie algebra g := C^{N×N}. However, this should not cause any confusion for the reader since the key result we refer to, Corollary 3.4, was presented for left actions. Let C ∈ C^{N×N} be another complex matrix. For minimizing the (squared) Euclidean distance ‖X − C‖_2^2 between C and the unitary orbit of A we derive a gradient flow maximizing the target function

\[ f(X) := \mathrm{Re}\,\mathrm{tr}\{C^\dagger X\} \tag{3.99} \]

over X ∈ O(A). Clearly, this is but an alternative to tackling the problem by a gradient flow on the unitary group, since as in Sec. 3.3, we have the equivalence

\[ \max_{X \in O(A)} f(X) = \max_{U \in SU(N)} f(U) \tag{3.100} \]

for f(U) := Re tr{C†UAU†}.
Building upon Corollary 3.4, we have the following facts: O(A) constitutes a compact and connected naturally reductive homogeneous space isomorphic to SU (N )/H. Here, H := {U ∈ SU (N ) | AdU A = A}
(3.101)
denotes the stabilizer group of A. Recalling that the Lie algebra of SU (N ) is su(N ), we further obtain for the tangent space of O(A) at X = AdU A the form TX O(A) = {adX Ω | Ω ∈ su(N )}
(3.102)
with adX Ω := [X, Ω]. Moreover, the kernel of adA : su(N ) → g reads h = {Ω ∈ su(N ) | [A, Ω] = 0}
(3.103)
and forms the Lie subalgebra to H. Now, by the standard Hilbert–Schmidt scalar product (Ω1 , Ω2 ) → tr{Ω†1 Ω2 } on su(N ) one can define the ortho-complement to the above kernel as p := h⊥ .
(3.104)
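This induces a unique decomposition of any skew-Hermitian matrix Ω = Ω^h + Ω^p with Ω^h ∈ h and Ω^p ∈ p. Finally, we obtain an Ad_{SU(N)}-invariant Riemannian metric on O(A) via

\[ \langle \mathrm{ad}_X(\mathrm{Ad}_U \Omega_1) \mid \mathrm{ad}_X(\mathrm{Ad}_U \Omega_2) \rangle_X := \mathrm{tr}\{(\Omega_1^{\mathfrak{p}})^\dagger\, \Omega_2^{\mathfrak{p}}\} \tag{3.105} \]

for X := Ad_U A, which is equivalent to saying

\[ \langle \mathrm{ad}_X(\Omega_1) \mid \mathrm{ad}_X(\Omega_2) \rangle_X := \mathrm{tr}\{(\Omega_1^{\mathfrak{p}_X})^\dagger\, \Omega_2^{\mathfrak{p}_X}\} \tag{3.106} \]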
with pX := AdU p. Now, the main results on double-bracket flows read as follows: Theorem 3.16. Set f : O(A) → R, f(X) := Re tr{C † X}. Then one finds (a) The gradient of f with respect to the Riemannian metric defined by Eq. (3.105) is given by grad f(X) = [X, [X, C † ]S ],
(3.107)
where [X, C † ]S denotes the skew-Hermitian part of [X, C † ]. (b) The gradient flow X˙ = grad f(X) = [X, [X, C † ]S ]
(3.108)
defines an isospectral flow on O(A) ⊂ g. The solutions exist for all t ≥ 0 and converge to a critical point X∞ of f(X) characterized by [X∞ , C † ]S = 0. Proof. (A detailed proof for the real case can be found in [22]; for an abstract Lie algebraic version see also [19].)
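(a) For X = Ad_U A and ξ = ad_X Ω ∈ T_X O(A) we obtain

\[ Df(X)\, \mathrm{ad}_X \Omega = \frac{d}{dt}\Big|_{t=0} \mathrm{Re}\,\mathrm{tr}\{C^\dagger e^{-t\Omega} X e^{t\Omega}\} = \mathrm{Re}\,\mathrm{tr}\{C^\dagger\, \mathrm{ad}_X \Omega\}. \]

Therefore, the gradient of f has to satisfy

\[ \mathrm{Re}\,\mathrm{tr}\{C^\dagger\, \mathrm{ad}_X \Omega^{\mathfrak{p}_X}\} = \langle \operatorname{grad} f(X) \mid \mathrm{ad}_X \Omega^{\mathfrak{p}_X} \rangle_X \]

for all Ω^{p_X} ∈ p_X. Applying Eq. (3.105) to X = A gives

\[ \mathrm{Re}\,\mathrm{tr}\{C^\dagger\, \mathrm{ad}_A \Omega^{\mathfrak{p}}\} = \mathrm{tr}\{(\Gamma^{\mathfrak{p}})^\dagger\, \Omega^{\mathfrak{p}}\} \]

for all Ω^p ∈ p, where Γ^p is defined by grad f(A) = ad_A Γ^p with Γ^p ∈ p. Thus we finally arrive at

\[ \mathrm{tr}\{(\mathrm{ad}_{A^\dagger} C)_S^\dagger\, \Omega^{\mathfrak{p}}\} = \mathrm{tr}\{(\Gamma^{\mathfrak{p}})^\dagger\, \Omega^{\mathfrak{p}}\} \]

for all Ω^p ∈ p, where (ad_{A†} C)_S denotes the skew-Hermitian part of ad_{A†} C. Hence, Γ^p = (ad_{A†} C)_S^p. Moreover, for Ω^h ∈ h, we have

\[ \mathrm{tr}\{(\mathrm{ad}_{A^\dagger} C)^\dagger\, \Omega^{\mathfrak{h}}\} = -\mathrm{tr}\{(\mathrm{ad}_A C^\dagger)\, \Omega^{\mathfrak{h}}\} = \mathrm{tr}\{C^\dagger\, \mathrm{ad}_A \Omega^{\mathfrak{h}}\} = 0. \]

Hence, (ad_{A†} C)_S ∈ p and therefore grad f(A) = ad_A (ad_{A†} C)_S = [A, [A, C†]_S]. The same arguments apply to X = Ad_U A and thus grad f(X) = [X, [X, C†]_S].

(b) Since Eq. (3.107) evolves on the unitary orbit of A, the associated flow is isospectral by construction. The compactness of O(A) then implies that each solution X(t) of Eq. (3.107) exists for all t ≥ 0 and converges to the set of critical points, cf. Proposition 3.1. Moreover, from Theorem 3.1 we derive that X(t) converges actually to a single critical point X_∞ of f, i.e. to a point X_∞ which satisfies

\[ [X_\infty, [X_\infty, C^\dagger]_S] = 0. \tag{3.109} \]

Since [X_∞, C†]_S ∈ p_{X_∞}, Eq. (3.109) is equivalent to [X_∞, C†]_S = 0.

In order to obtain a numerical algorithm for maximizing f one can discretize the continuous-time gradient flow (3.107) as in the previous examples via

\[ X_{k+1} = e^{-\alpha_k [X_k, C^\dagger]_S}\, X_k\, e^{\alpha_k [X_k, C^\dagger]_S} \tag{3.110} \]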
with appropriate step sizes αk > 0. Note that Eq. (3.110) heavily exploits the fact that the adjoint orbit O(A) constitutes a naturally reductive homogeneous space and thus the knowledge on its geodesics, cf. Corollary 3.4.
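The recursion (3.110) is easy to experiment with. The following Python/NumPy sketch is an illustration only: the function names are hypothetical, and the constant step size α = 1/(4‖A‖‖C‖) (Frobenius norms) is an assumption made here so that a simple Taylor estimate rules out a decrease of f; the source only asks for "appropriate" step sizes.

```python
import numpy as np

def skew_part(M):
    """Skew-Hermitian part [.]_S of a square matrix."""
    return 0.5 * (M - M.conj().T)

def exp_skew(Omega):
    """exp(Omega) for skew-Hermitian Omega, via the Hermitian matrix H = -i*Omega."""
    lam, V = np.linalg.eigh(-1j * Omega)
    return (V * np.exp(1j * lam)) @ V.conj().T

def double_bracket_step(X, C, alpha):
    """One step X -> exp(-alpha*Om) X exp(alpha*Om) with Om = [X, C^dagger]_S."""
    Cd = C.conj().T
    Omega = skew_part(X @ Cd - Cd @ X)
    E = exp_skew(alpha * Omega)
    return E.conj().T @ X @ E      # E^dagger = exp(-alpha*Omega)

def double_bracket_flow(A, C, steps=200):
    """Iterate the discretized double-bracket flow starting at X_0 = A."""
    # Assumed constant step size (not taken from the source).
    alpha = 1.0 / (4.0 * np.linalg.norm(A) * np.linalg.norm(C))
    X = A.astype(complex)
    values = [float(np.real(np.trace(C.conj().T @ X)))]
    for _ in range(steps):
        X = double_bracket_step(X, C, alpha)
        values.append(float(np.real(np.trace(C.conj().T @ X))))
    return X, values
```

Each step is a unitary conjugation, so the spectrum of X_k equals that of A up to rounding, in contrast to an additive Euler-type update.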
Remark 3.8. As an alternative to Eq. (3.110), taking the standard Euler-type iteration Xk+1 = Xk + αk [Xk , [Xk , C † ]S ]
(3.111)
does not retain the isospectral nature of the flow. Therefore, it should only be used as a computationally inexpensive, rough scheme in the neighborhood of equilibrium points, if at all.

For A, C complex Hermitian (real symmetric) and the full unitary (or orthogonal) group or its respective orbit the gradient flow (3.107) is well understood, cf. Corollary 3.8. However, for non-Hermitian A and C, the nature of the flow and in particular the critical points have not been analyzed in depth, because the Hessian at critical points is difficult to come by. Even for A, C Hermitian, a full critical point analysis becomes non-trivial as soon as the flow is restricted to a closed and connected subgroup K ⊂ SU(N). Nevertheless, the techniques from Theorem 3.16 can be taken over to establish a gradient flow and a respective gradient algorithm on the orbit O_K in a straightforward manner.

Corollary 3.7. The gradient flow of Eq. (3.107) restricts to the subgroup orbit O_K(A) := {KAK† | K ∈ K ⊂ SU(N)} by taking the respective orthogonal projection P_𝔨 onto the subalgebra 𝔨 ⊂ su(N) of K instead of projecting onto the skew-Hermitian part, i.e. Ẋ = [X, P_𝔨[X, C†]]. With step sizes α_k > 0 the corresponding discrete integration scheme reads

\[ X_{k+1} = e^{-\alpha_k P_{\mathfrak{k}}[X_k, C^\dagger]}\, X_k\, e^{\alpha_k P_{\mathfrak{k}}[X_k, C^\dagger]}. \tag{3.112} \]
In view of unifying the interpretation of unitary networks, e.g., for the task of computing ground states of quantum mechanical Hamiltonians H ≡ A, the double-bracket flows for complex Hermitian A, C on the full unitary orbit O_u(A) as well as on the subgroup orbits O_K(A) for different partitionings brought about by K := {K ∈ SU(N_1) ⊗ SU(N_2) ⊗ ··· ⊗ SU(N_r) | ∏_{j=1}^{r} N_j = 2^n} have shifted into focus [36]. Therefore, we have given the foundations for the recursive schemes of Eqs. (3.110) and (3.112), which are listed in Table 2 as U1P and U1KP.

Finally, we summarize what is known about the nature of critical points for the real symmetric or complex Hermitian case. For a detailed discussion of the real symmetric case and the orthogonal group see e.g., [22].

Corollary 3.8. Let C and A be real symmetric or complex Hermitian and assume for simplicity that they show distinct eigenvalues in either case. Then one finds:

(a) For A, C real symmetric, define with respect to the special orthogonal group SO(N) and Y ∈ O_o(A) := {OAO^⊤ | O ∈ SO(N)} a pair of target functions on the group and on the respective orbit by

\[ g(O) := \mathrm{tr}\{C^\top O A O^\top\} \tag{3.113} \]

\[ g(Y) := \mathrm{tr}\{C^\top Y\}. \tag{3.114} \]
Then the gradient flow

\[ \dot O := \operatorname{grad} g(O) = [OAO^\top, C]^\top O \tag{3.115} \]

shows 2^{N−1} N! critical points, while the double-bracket flow

\[ \dot Y := \operatorname{grad} g(Y) = [Y, [Y, C]] \tag{3.116} \]

only shows N! equilibrium points.

(b) For A, C complex Hermitian, X ∈ O_u(A) := {UAU† | U ∈ SU(N)}, and

\[ f(U) := \mathrm{tr}\{C^\dagger UAU^\dagger\} \tag{3.117} \]

\[ f(X) := \mathrm{tr}\{C^\dagger X\} \tag{3.118} \]

the gradient flow on the special unitary group SU(N)

\[ \dot U := \operatorname{grad} f(U) = [UAU^\dagger, C]^\dagger U \tag{3.119} \]

shows a continuum of critical points, while the double-bracket flow on the unitary orbit

\[ \dot X := \operatorname{grad} f(X) = [X, [X, C]] \tag{3.120} \]

again shows only N! equilibrium points.

(c) On the orbit, the respective target function has a unique global maximum which is given by the diagonalization diag(λ_1, ..., λ_N), λ_1 > ··· > λ_N of A, if C is assumed to be diagonal of the form C = diag(µ_1, ..., µ_N), µ_1 > ··· > µ_N. Moreover, the respective gradient flow converges to the unique global maximum for almost all initial values with an exponential bound on the rate.

Proof. (a) and (b) The counting arguments follow immediately from the fact that in either case for C diagonal with distinct eigenvalues, the set of critical points C_∞ := {X_∞ ∈ O(A) | [X_∞, C] = 0} on the orthogonal or unitary orbit is given by N! different diagonalizations of A and remains therefore invariant under conjugation by any permutation matrix. Moreover, on the orthogonal group O(N), the stabilizer group of A is given by {diag(±1, ±1, ..., ±1)}, which adds 2^N independent further degrees of freedom. Finally, restricting to SO(N) we obtain 2^{N−1} N! critical points on the group level. In contrast, for the unitary case SU(N), the stabilizer group of A reads

\[ \Big\{ \mathrm{diag}(e^{i\varphi_1}, \dots, e^{i\varphi_\nu}, \dots, e^{i\varphi_N}) \;\Big|\; \sum_{\nu=1}^{N} \varphi_\nu \in 2\pi\mathbb{Z},\ \varphi_\nu \in \mathbb{R} \Big\}, \]

which is always continuous.
(c) Since C is symmetric or Hermitian, we can assume without loss of generality that C is diagonal. Then, the critical point condition [X_∞, C] = 0 yields that the critical points of g and, respectively, f are given by the diagonalizations of A. Moreover, analyzing the Hessian at critical points shows that there is only one global maximum in both cases and no local ones [22]. The exponential convergence of the gradient flows Eqs. (3.116) and (3.120) to the respective unique global maximum for almost all initial values is also established via the Hessian, i.e. by linearizing the respective gradient flows at critical points [22].
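The counting in Corollary 3.8 can be made concrete for small N. The sketch below, in plain Python, enumerates the N! diagonal critical points for diagonal C and evaluates the target function tr{CX} at each of them; the function names and the sample eigenvalues are hypothetical choices made for illustration.

```python
from itertools import permutations

def critical_values(lams, mus):
    """Map each diagonal critical point X = diag(perm of lams) to tr(C X), C = diag(mus)."""
    return {perm: sum(l * m for l, m in zip(perm, mus))
            for perm in permutations(lams)}

def sorted_maximum(lams, mus):
    """Value obtained by pairing both eigenvalue lists in the same (descending) order."""
    pairs = zip(sorted(lams, reverse=True), sorted(mus, reverse=True))
    return sum(l * m for l, m in pairs)

# Example data (assumed): eigenvalues of A and of the diagonal matrix C.
lams = [5, 2, -1]
mus = [3, 1, 0]
values = critical_values(lams, mus)
best = max(values, key=values.get)   # arrangement with the largest target value
```

For distinct eigenvalues the dictionary has N! entries, and the largest value is attained only at the arrangement that orders the eigenvalues of A like those of C.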
3.4.5. Some final remarks on the naturally reductive case Let f : SU (2n ) → R be an arbitrary smooth function that is equivariant under local unitary operations of the n-fold tensor product SUloc (2n ) := SU (2) ⊗ · · · ⊗ SU (2). This includes, e.g., any measure of entanglement µE (U ) that varies smoothly with U . By construction grad f |SUloc (2n ) = 0, so we may consider then the induced flow to [U˙ ] = grad f([U ]) on the homogeneous space G/K = SU (2n )/SUloc (2n ), which is naturally reductive for all n and even Cartan-like for n = 2. This can be seen, because (i) SU (2n ) carries a bi-invariant metric induced by the Killing form allowing to define p := k⊥ , which gives the reductive decomposition g = k ⊕ p, yet only for n = 2 one recovers the commutator inclusions [k, k] ⊆ k, [p, p] ⊆ k, and [k, p] ⊆ p; (ii) in any case, by Proposition 3.4 there is an AdK -invariant scalar product on p; and (iii) Eq. (3.47) is fulfilled for all {a, b, c} ⊆ p, as tr{[a, b]† c} = − tr{b† [a, c]}, cf. Remark 3.6. Therefore, one finally arrives at a discretized gradient algorithm of the form [Uk+1 ] := [exp(αk grad f (Uk ) Uk−1 )Uk ],
(3.121)
cf. Eq. (3.64). Clearly, this example extends analogously to functions that are equivariant under the action of generalized local subgroups SU(N_1) ⊗ ··· ⊗ SU(N_r) with ∏_{j=1}^{r} N_j = N, cf. (4.8), giving flows on the corresponding reductive homogeneous spaces G/K = SU(N)/(SU(N_1) ⊗ SU(N_2) ⊗ ··· ⊗ SU(N_r)). Comparing Eq. (3.121) with the results of the previous subsection on double bracket flows shows the following: having a “model” of the coset space G/K, i.e. having a smooth group action of G (e.g. on some vector space) such that one of its orbits is diffeomorphic to G/K, makes Eq. (3.121) easier to implement than working on the abstract coset level.
4. Applications to Quantum Information and Quantum Control

4.1. A geometric measure of pure-state entanglement

The Euclidean distance of a pure state to the set S_pp of all pure product states may be seen as a geometric measure of entanglement [55, 121, 122]. Since S_pp coincides with the local unitary orbit

\[ O_{\mathrm{loc}}(yy^\dagger) := \{U yy^\dagger U^\dagger \mid U \in SU_{\mathrm{loc}}(2^n)\} \tag{4.1} \]

of any pure product state y ∈ S_pp, it relates to the following optimization task

\[ \Delta(x) := \min_{U \in SU_{\mathrm{loc}}(2^n)} \|xx^\dagger - U yy^\dagger U^\dagger\|_2, \tag{4.2} \]

where x ∈ C^{2^n} denotes a normalized pure state and y ∈ C^{2^n} a pure product state, e.g., y = (1, 0, ..., 0)^⊤ = (e_1 ⊗ ··· ⊗ e_1). This notation replaces |x⟩ by x and |x⟩⟨x| by xx† for the sake of convenient generalization to higher-order tensor products. Obviously, minimizing (4.2) is equivalent to maximizing the so-called local transfer

\[ \max_{U \in SU_{\mathrm{loc}}(2^n)} \mathrm{Re}(\mathrm{tr}(xx^\dagger U yy^\dagger U^\dagger)), \tag{4.3} \]
between xx† and yy † . Further, since tr(xx† U yy † U † ) = | tr(x† U y)|2 taking the real part in (4.3) is redundant. Now, the techniques developed in Sec. 3.4.3 match perfectly to tackle problem (4.3). Let C := xx† , A := diag(1, 0, . . . , 0) and define the so-called local unitary transfer between C and A by the real-valued function floc (U ) := tr (CUAU † ).
(4.4)
Then the gradient flow (3.91) or more precisely its discretization (3.96) will generically solve (4.3). For explicit numerical results see Sec. 4.2.3 and [117, 123]. In general, neither an algebraic characterization of the maximal value of floc nor the structure of its critical points is known, the major difficulty arising from the fact that U is restricted to SUloc (2n ). As soon as U may be taken from the entire special unitary group, the solution is well-known: it is simply obtained by arranging the (real) eigenvalues of both A and C magnitude-wise in the same order [17, 22, 124, 125]. 4.2. Generalized local subgroups 4.2.1. Bipartite systems and relations to singular-value decompositions An exceptional case, where the restricted problem (4.3) can be solved are bipartite pure systems. These systems are particularly simple in as much as the maxima of floc can be linked to the singular-value decomposition (SVD) of the matrices X and Y associated to x and y by x := vec X and y := vec Y . Since these ideas
July 12, J070-S0129055X10004053
2010 12:0 WSPC/S0129-055X
148-RMP
Gradient Flows for Optimization in Quantum Information and Quantum Dynamics
645
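readily extend to arbitrary finite dimensional bipartite systems, we generalize the formulation of problem (4.3), thus leading to Eq. (4.5), before going into multipartite systems.

Proposition 4.1. Let $X = V_X \Sigma_X W_X^\dagger$, $Y = V_Y \Sigma_Y W_Y^\dagger$ be singular value decompositions with $V_X, V_Y \in U(N_1)$, $W_X, W_Y \in U(N_2)$ and $\Sigma_X, \Sigma_Y$ sorted by magnitude. Moreover, let x := vec X and y := vec Y. Then the maximum value of the local transfer between $xx^\dagger$ and $yy^\dagger$ is bounded by

$$\max_{U \in SU(N_2) \otimes SU(N_1)} \operatorname{Re}\bigl(\operatorname{tr}(xx^\dagger U yy^\dagger U^\dagger)\bigr) \le (\operatorname{tr} \Sigma_X^\dagger \Sigma_Y)^2. \tag{4.5}$$

Equality is actually achieved for $V_X, V_Y \in SU(N_1)$, while $W_X, W_Y \in SU(N_2)$ and $U_* := (W_X^* \otimes V_X) \cdot (W_Y^\top \otimes V_Y^\dagger)$.

Proof. For $U := W \otimes V \in SU(N_2) \otimes SU(N_1)$ we obtain

$$\operatorname{tr}(xx^\dagger U yy^\dagger U^\dagger) = \operatorname{tr}\bigl(xx^\dagger (W \otimes V) yy^\dagger (W^\dagger \otimes V^\dagger)\bigr) = \operatorname{tr}\bigl(xx^\dagger \operatorname{vec}(V Y W^\top) \operatorname{vec}(V Y W^\top)^\dagger\bigr) = |x^\dagger \operatorname{vec}(V Y W^\top)|^2 = |\operatorname{tr}(X^\dagger V Y W^\top)|^2. \tag{4.6}$$

Here, we have used the identities $\operatorname{vec}(V Y W^\top) = (W \otimes V) \operatorname{vec} Y$ and $(\operatorname{vec} X)^\dagger \operatorname{vec} Y = \operatorname{tr} X^\dagger Y$ for all $X, Y \in \mathbb{C}^{N_1 \times N_2}$. Now, (4.6) implies

$$\max_{U \in SU(N_2) \otimes SU(N_1)} \operatorname{Re} \operatorname{tr}(xx^\dagger U yy^\dagger U^\dagger) = \max_{V \in SU(N_1),\, W \in SU(N_2)} |\operatorname{tr}(X^\dagger V Y W^\top)|^2 \le (\operatorname{tr} \Sigma_X^\dagger \Sigma_Y)^2, \tag{4.7}$$

where the last inequality is due to von Neumann, cf. [111, 124]. If $V_X, V_Y \in SU(N_1)$ and $W_X, W_Y \in SU(N_2)$, equality is assumed in Eq. (4.7) for $U_* := (W_Y W_X^\dagger)^\top \otimes V_X V_Y^\dagger = (W_X^* \otimes V_X) \cdot (W_Y^\top \otimes V_Y^\dagger)$.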
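The equality case of Proposition 4.1 can be checked numerically. The following NumPy sketch is illustrative only. It assumes the column-stacking convention for vec, so that vec(V Y W^T) = (W ⊗ V) vec Y, and it uses unitary rather than special unitary factors; rescaling V and W by phases to reach determinant one changes U only by a global phase and hence leaves the transfer unchanged. The helper names `vec`, `random_unitary`, `local_transfer` and `optimal_local_unitary` are hypothetical and not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def vec(M):
    """Column-stacking vec, so that vec(V @ Y @ W.T) == np.kron(W, V) @ vec(Y)."""
    return M.reshape(-1, order="F")

def random_unitary(n):
    """Random unitary from the QR decomposition of a complex Gaussian matrix."""
    Z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    Q, _ = np.linalg.qr(Z)
    return Q

def local_transfer(X, Y, V, W):
    """|x^dagger U y|^2 = tr(x x^dagger U y y^dagger U^dagger) for U = W kron V."""
    U = np.kron(W, V)
    return abs(np.vdot(vec(X), U @ vec(Y))) ** 2

def optimal_local_unitary(X, Y):
    """Factors (V, W) built from the SVDs of X and Y as in the proof above."""
    VX, _, WXh = np.linalg.svd(X)   # X = VX @ Sigma_X @ WXh, i.e. WXh = W_X^dagger
    VY, _, WYh = np.linalg.svd(Y)
    V = VX @ VY.conj().T            # V = V_X V_Y^dagger
    W = (WYh.conj().T @ WXh).T      # W^T = W_Y W_X^dagger
    return V, W

N1, N2 = 3, 4
X = rng.normal(size=(N1, N2)) + 1j * rng.normal(size=(N1, N2))
Y = rng.normal(size=(N1, N2)) + 1j * rng.normal(size=(N1, N2))

sX = np.linalg.svd(X, compute_uv=False)
sY = np.linalg.svd(Y, compute_uv=False)
bound = np.sum(sX * sY) ** 2        # (tr Sigma_X^dagger Sigma_Y)^2

V_opt, W_opt = optimal_local_unitary(X, Y)
best = local_transfer(X, Y, V_opt, W_opt)

# Random local unitaries should never exceed the bound.
sampled = max(local_transfer(X, Y, random_unitary(N1), random_unitary(N2))
              for _ in range(200))
print(bound, best, sampled)
```

Here `best` should agree with `bound` up to rounding, while `sampled` stays below it, which mirrors the von Neumann inequality used in (4.7).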
Corollary 4.1. Set x := vec A and y := vec C. Then the maximum local transfer between $xx^\dagger$ and $yy^\dagger$ in the sense of Proposition 4.1 is bounded by

$$\|A\|_C^2 := \max_{V \in U(N_1),\, W \in U(N_2)} |\operatorname{tr}(C^\dagger V A W^\dagger)|^2,$$

which is known as the C-spectral norm of A, cf. [112]. Note that in the context of finding maximal distances between global unitary orbits for the purpose of geometric discrimination of generic non-pure quantum states [126], results similar to [125, 127] show up, while here we treat local unitary orbits of pure bipartite states as made explicit in Eq. (4.5).
4.2.2. Multipartite systems and relations to best rank-1 approximations of higher-order tensors

Proposition 4.1 has a straightforward generalization to multipartite systems, which relates to best rank-1 approximations of higher-order tensors. To outline this relation, we define the concept of a generalized local subgroup

$$SU_{\mathrm{loc}}(N_1, \dots, N_r) := SU(N_1) \otimes \cdots \otimes SU(N_r) \tag{4.8}$$

of type $(N_1, \dots, N_r)$ with $N_k \in \mathbb{N}$, $k = 1, \dots, r$. Thus the associated generalized local subgroup optimization problem can be stated as follows.

Generalized Local Subgroup Problem (GLSP). For $C, A \in \mathbb{C}^{N \times N}$ with $N := N_1 \cdot N_2 \cdots N_r$ find

$$\max_{U \in SU_{\mathrm{loc}}(N_1, \dots, N_r)} \operatorname{Re}\bigl(\operatorname{tr}(C U A U^\dagger)\bigr). \tag{4.9}$$

To our knowledge, the GLSP seems to be unsolved so far. To introduce higher-order tensors, we have to fix some further notation. For simplicity, we regard a tensor of order $r \in \mathbb{N}$ as an array $X = (X_{i_1 \cdots i_r})_{1 \le i_1 \le N_1, \dots, 1 \le i_r \le N_r}$ of size $N_1 \times \cdots \times N_r$. The space of all $N_1 \times \cdots \times N_r$-tensors is denoted by $\mathbb{C}^{N_1 \times \cdots \times N_r}$. A natural scalar product for tensors of the same size is given by

$$\langle Y \mid X \rangle := \sum_{i_1 \cdots i_r} Y^*_{i_1 \cdots i_r} X_{i_1 \cdots i_r}. \tag{4.10}$$

Moreover, a tensor X is called a rank-1 tensor if there exist $x^k \in \mathbb{C}^{N_k}$, $k = 1, \dots, r$ such that

$$X = x^1 \circ x^2 \circ \cdots \circ x^r, \tag{4.11}$$

where the $(i_1 \cdots i_r)$-entry of the outer product is defined by $(x^1 \circ x^2 \circ \cdots \circ x^r)_{i_1 \cdots i_r} := x^1_{i_1} \cdot x^2_{i_2} \cdots x^r_{i_r}$. Thus the question of decomposing a given tensor by tensors of lower rank leads to the following fundamental approximation problem:

Best Rank-1 Approximation Problem (BRAP). Let $\|\cdot\|$ denote the norm induced by scalar product (4.10). For $X \in \mathbb{C}^{N_1 \times \cdots \times N_r}$ solve

$$\min_{C \in \mathbb{C},\ \|x^k\| = 1,\ k = 1, \dots, r} \|X - C \cdot x^1 \circ \cdots \circ x^r\|^2. \tag{4.12}$$
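For fixed unit vectors, the minimization over C in (4.12) is an ordinary least-squares problem, so only the overlap of X with the rank-1 tensor matters. A minimal NumPy sketch of this observation follows; the three-qubit W state serves as an assumed test tensor, and the helper names `outer` and `rank1_residual` are hypothetical.

```python
import numpy as np

def outer(*vectors):
    """Abstract outer product x1 o x2 o ... o xr as an r-way array."""
    T = np.asarray(vectors[0])
    for v in vectors[1:]:
        T = np.multiply.outer(T, v)
    return T

def rank1_residual(X, vectors):
    """Best coefficient C for the given unit vectors and the residual ||X - C z||^2."""
    Z = outer(*vectors)
    C = np.vdot(Z, X)               # least-squares coefficient; real in the example below
    return C, np.linalg.norm(X - C * Z) ** 2

# Three-qubit W state (|001> + |010> + |100>)/sqrt(3) as a 2 x 2 x 2 tensor.
W = np.zeros((2, 2, 2))
W[0, 0, 1] = W[0, 1, 0] = W[1, 0, 0] = 1 / np.sqrt(3)

e0 = np.array([1.0, 0.0])
e1 = np.array([0.0, 1.0])
v = np.array([np.sqrt(2 / 3), np.sqrt(1 / 3)])

C_basis, res_basis = rank1_residual(W, (e0, e0, e1))   # product state |001>
C_sym, res_sym = rank1_residual(W, (v, v, v))          # symmetric product state
print(res_basis, res_sym)   # approximately 2/3 and 5/9
```

In both cases the residual equals the squared norm of the tensor minus the squared modulus of the overlap, here 1 - 1/3 and 1 - 4/9.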
Note that the above notation is necessary to distinguish between two different types of outer products: the Kronecker product $\otimes$ (of column-vectors), which maps r-tuples of column-vectors to a column-vector of larger size, and the “abstract” outer product $\circ$, which maps r-tuples of column-vectors to arrays (= tensors) of
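order r. The relation between both is given by the canonical isomorphism $\operatorname{vec} : \mathbb{C}^{N_1 \times \cdots \times N_r} \to \mathbb{C}^N$ with $N := N_1 \cdot N_2 \cdots N_r$, which is uniquely determined by

$$x^1 \circ x^2 \circ \cdots \circ x^r \mapsto x^1 \otimes x^2 \otimes \cdots \otimes x^r, \tag{4.13}$$

i.e. vec assigns to each array $X \in \mathbb{C}^{N_1 \times \cdots \times N_r}$ a column-vector in $\mathbb{C}^N$ by arranging the entries of X in a lexicographical order. With these notations at hand, the relation between GLSP and BRAP can be stated as follows.

Theorem 4.1. Let $X \in \mathbb{C}^{N_1 \times \cdots \times N_r}$ be a tensor of order r and let $x := \operatorname{vec}(X) \in \mathbb{C}^N$ with $N := N_1 \cdot N_2 \cdots N_r$. Then the BRAP is equivalent to the GLSP

$$\max_{U \in SU_{\mathrm{loc}}(N_1, \dots, N_r)} \operatorname{Re}\bigl(\operatorname{tr}(xx^\dagger U yy^\dagger U^\dagger)\bigr), \tag{4.14}$$

where $y \in \mathbb{C}^N$ can be any pure product state, e.g., $y = (1, 0, \dots, 0)^\top = e_1 \otimes \cdots \otimes e_1$. More precisely,

(a) If $U_1 \otimes \cdots \otimes U_r$ is a solution of (4.14) then $x^k := U_k e_1$, $k = 1, \dots, r$ and $C := \langle X \mid x^1 \circ \cdots \circ x^r \rangle$ solve (4.12).
(b) If $C \in \mathbb{C}$ and $x^k$, $k = 1, \dots, r$ solve (4.12) then any $U_1 \otimes \cdots \otimes U_r$ with $x^k = U_k e_1$, $k = 1, \dots, r$ yields a solution of (4.14).

For proving Theorem 4.1 we need the following technical lemma.

Lemma 4.1. The pair $(x^1 \circ \cdots \circ x^r, C)$ solves (4.12) if and only if $x^1 \circ \cdots \circ x^r$ is a maximum of

$$\max_{\|z^k\| = 1,\ k = 1, \dots, r} |\langle X \mid z^1 \circ \cdots \circ z^r \rangle| \tag{4.15}$$

and $C = \langle X \mid x^1 \circ \cdots \circ x^r \rangle$.

Proof. Consider the following identity

$$\|X - C \cdot z^1 \circ \cdots \circ z^r\|^2 = \|X\|^2 + |C|^2 - 2 \operatorname{Re}\bigl(C^* \langle X \mid z^1 \circ \cdots \circ z^r \rangle\bigr) = \|X\|^2 + |C - \langle X \mid z^1 \circ \cdots \circ z^r \rangle|^2 - |\langle X \mid z^1 \circ \cdots \circ z^r \rangle|^2.$$

Thus we obtain

$$\min_{C \in \mathbb{C},\ \|z^k\| = 1,\ k = 1, \dots, r} \|X - C \cdot z^1 \circ \cdots \circ z^r\|^2 = \|X\|^2 - \max_{\|z^k\| = 1,\ k = 1, \dots, r} |\langle X \mid z^1 \circ \cdots \circ z^r \rangle|^2.$$

This yields the desired result.

Proof of Theorem 4.1. Let $y = e_1 \otimes \cdots \otimes e_1$. Then $(U_1 \otimes \cdots \otimes U_r) y = (U_1 e_1) \otimes \cdots \otimes (U_r e_1)$ and thus

$$\operatorname{tr}(xx^\dagger U yy^\dagger U^\dagger) = \operatorname{tr}(x^\dagger U yy^\dagger U^\dagger x) = |x^\dagger U y|^2 = |\langle X \mid (U_1 e_1) \circ \cdots \circ (U_r e_1) \rangle|^2.$$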
Therefore, we obtain

$$\max_{U \in SU_{\mathrm{loc}}(N_1, \dots, N_r)} \operatorname{Re}\bigl(\operatorname{tr}(xx^\dagger U yy^\dagger U^\dagger)\bigr) = \max_{U \in SU_{\mathrm{loc}}(N_1, \dots, N_r)} |\langle X \mid (U_1 e_1) \circ \cdots \circ (U_r e_1) \rangle|^2 = \max_{\|z^k\| = 1,\ k = 1, \dots, r} |\langle X \mid z^1 \circ \cdots \circ z^r \rangle|^2,$$

and hence Lemma 4.1 implies (a) and (b).

Remark 4.1.
(1) The isomorphism vec coincides “almost” with the standard vec-operation on matrices for r = 2; more precisely, $\operatorname{vec}(X) = \operatorname{vec}(X^\top)$, where the right-hand side refers to the standard vec-operation.
(2) Since any phase factor can readily be absorbed into $x^1 \circ \cdots \circ x^r$, it is easy to show that

$$\max_{\|x^k\| = 1,\ k = 1, \dots, r} |\langle X \mid x^1 \circ \cdots \circ x^r \rangle| = \max_{\|x^k\| = 1,\ k = 1, \dots, r} \operatorname{Re}\bigl(\langle X \mid x^1 \circ \cdots \circ x^r \rangle\bigr).$$

Therefore, maxima of the “real-part-expression” on the right-hand side are always maxima of the “absolute-value-term” on the left.
(3) By replacing $yy^\dagger$ in (4.14) with an appropriate sum $\sum_{i=1}^{l} y_i y_i^\dagger$, the above ideas can be extended to best approximations of higher rank, i.e. to best approximations of the form

$$\min_{C_i \in \mathbb{C},\ \|x^{i,k}\| = 1} \Bigl\| X - \sum_{i=1}^{l} C_i \cdot x^{i,1} \circ \cdots \circ x^{i,r} \Bigr\|^2,$$

with $l \le \min\{N_1, \dots, N_r\}$ and all $x^{i,1} \circ \cdots \circ x^{i,r}$ mutually orthogonal, cf. [128, 129].
(4) Unfortunately, an analogue of Proposition 4.1 involving the tensor SVD as defined in [130] does not hold for higher-order tensors. Even the classical Eckart–Young Theorem, which asserts that the best rank-k approximation of a matrix is given by its truncated SVD, is false for higher-order tensors, cf. [131].
(5) Higher-order methods, like Newton, BFGS or conjugate gradient methods for computing best approximations of higher-order tensors, can be found in [132–135]. Near local maxima these methods are in general faster than gradient algorithms: although a single iteration is more time-consuming than a gradient step, the number of iterations needed to guarantee a certain error threshold is considerably lower due to the local higher-order convergence rate. However, their global convergence behavior is a rather delicate issue. In practice, therefore, one often applies a combined strategy: (i) first, run a gradient algorithm to reach the region of attraction of a higher-order method; (ii) then switch to a higher-order method.

4.2.3. Numerical results

For comparing our gradient-flow approach to tensor-SVD techniques, here we focus on two examples that are well-established in the literature, since analytical solutions
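[136] as well as numerical results from semidefinite programming are known [55]. First, consider a pure 3-qubit state depending on a real parameter $s \in [0, 1]$

$$|X(s)\rangle := \sqrt{s}\,|W\rangle + \sqrt{1 - s}\,|V\rangle, \tag{4.16}$$

where one defines

$$|W\rangle := \tfrac{1}{\sqrt{3}} (|001\rangle + |010\rangle + |100\rangle) \quad\text{and}\quad |V\rangle := \tfrac{1}{\sqrt{3}} (|110\rangle + |101\rangle + |011\rangle)$$

with the usual shorthand notation of quantum information $|0\rangle := \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, $|1\rangle := \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ and $|001\rangle := \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, etc. With these stipulations one finds the corresponding 2 × 2 × 2 tensor representations for $|W\rangle$ and $|V\rangle$ to take the form

$$W_{(1,:,:)} = \tfrac{1}{\sqrt{3}} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad W_{(2,:,:)} = \tfrac{1}{\sqrt{3}} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \tag{4.17}$$

and

$$V_{(1,:,:)} = \tfrac{1}{\sqrt{3}} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad V_{(2,:,:)} = \tfrac{1}{\sqrt{3}} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{4.18}$$

Likewise, observe the pure 4-qubit-state

$$|X(s)\rangle := \sqrt{s}\,|GHZ\rangle - \sqrt{1 - s}\,|X^+\rangle \otimes |X^+\rangle, \tag{4.19}$$

with the definitions

$$|GHZ\rangle := \tfrac{1}{\sqrt{2}} (|0011\rangle + |1100\rangle) \quad\text{and}\quad |X^+\rangle := \tfrac{1}{\sqrt{2}} (|10\rangle + |01\rangle).$$

Consider the target function $f(K) = \operatorname{tr}\{C^\dagger K A K^\dagger\}$ with $C = \operatorname{diag}(1, 0, 0, \dots, 0)$ and $A := |X(s)\rangle\langle X(s)|$. As shown in Fig. 4 with the gradient flow restricted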
to the local unitaries $K \in SU_{\mathrm{loc}}(2^n)$ one obtains results perfectly matching the analytical solutions of [136] as well as the numerical ones from semidefinite programming ensuring global optimality, yet in drastically less CPU time as compared to [55], see Table 1. Gradient flows are some 30 to 150 times faster in CPU time than semidefinite programming methods for the 3-qubit and 4-qubit example, respectively. In the tensor-SVD algorithms [131] such as the higher-order power method (HOPM) or the higher-order orthogonal iteration (HOOI) as implemented in the MATLAB package [137], N = 50 to N = 60 iterations are required for quantitative agreement with the algebraically established results. In the 3-qubit example, all minimal distances are also reproduced correctly with N = 5 iterations, except for the limiting values s near 0 and near 1, for which the minimal distances of $\Delta(|X(0)\rangle) = \Delta(|X(1)\rangle) = 2/3$ are obtained by either tensor method instead of the correct analytical value of 5/9, which requires N = 60 iterations as shown in Fig. 4(c). In the 4-qubit example, however, for N = 5 iterations, both tensor methods suffer from apparently random numerical instabilities, which only vanish when allowing for N = 50 iterations in either method. It is the considerably high number of iterations that makes the tensor methods substantially slower than our gradient-flow algorithm as shown in Table 1. Therefore, at least for lower order tensors, gradient flows provide an appealing alternative to standard tensor-SVD methods for best rank-1 approximations. Moreover, one should take into account that the above gradient methods are developed to solve the GLSP and thus a considerable speed-up can be expected by adjusting them to the local orbit $O_{\mathrm{loc}}(yy^\dagger)$ of a pure product state. For similar results obtained by an intrinsic Newton and conjugate gradient method see also [118, 123]. Generalizations of such higher-order methods to Grassmann manifolds, which perfectly fit in the previous theory of Riemannian homogeneous spaces [110], are provided in [132–135]. As also discussed therein, the applications to tensor approximation in signal processing and data compression or subspace reconstruction in image processing are numerous. Moreover, we anticipate that these numerical approaches will also prove useful tools in tensor and rank aspects of entanglement and kinematics of qubit pairs as addressed, e.g., in [138, 139].

[Figure 4: panels (a), (b) and (c) plot 1 − max. local transfer against s; panel (c) shows N = 5 and N = 60 iterations for s between 0.998 and 1.]

Fig. 4. Numerical results by gradient flows on the local unitary group $K = SU_{\mathrm{loc}}(2^n)$ determining (a) the Euclidean distance of the 3-qubit state $|X(s)\rangle = \sqrt{s}\,|W\rangle + \sqrt{1 - s}\,|V\rangle$ (see Eq. (4.16)) to the nearest product state as a function of s; (b) the distance of the 4-qubit state $|X(s)\rangle = \sqrt{s}\,|GHZ\rangle - \sqrt{1 - s}\,|X^+\rangle \otimes |X^+\rangle$ (see Eq. (4.19)) to the nearest product state. (c) Tensor-SVD results for Euclidean distance of the 3-qubit state $|X(s)\rangle$ to the nearest product state as in part (a). With the standard of N = 5 iterations, both methods (here shown for HOPM) give systematic errors as indicated by the arrow. N = 60 iterations are needed for quantitatively matching the well-established distance values. The high number of iterations required slows down the method as indicated in Table 1. (Color online)

Table 1. CPU times for determining Euclidean distance to orbit of separable pure states in Fig. 4.

Qubits | Semidefinite programming CPU-time [sec] (a) | By gradient flow CPU-time [sec] (b) | Speed-up
3 | 10.92 | 0.30 | 36.4
4 | 103.97 | 0.71 | 147.0

Qubits | Higher-ord. tensor-SVD (HOPM) CPU-time [sec] (b) | H.O. tensor-SVD (HOOI) CPU-time [sec] (b) | Speed-ups
3 | 2.39 | 5.37 | 4.6 (2.0)
4 | 3.93 | 7.03 | 26.5 (14.8)

(a) Eisert et al. (processor with 2.2 GHz, 1 GB RAM) [55].
(b) Average of 50 runs, Athlon XP1800+ (1.1 GHz, 512 MB RAM).

4.3. Locally reversible interaction Hamiltonians

4.3.1. Joint local reversibility

In a recent study [29], we have addressed the decision problem whether a time-independent (self-adjoint) Hamiltonian H normalized to $\|H\|_2 = 1$ generates a one-parameter unitary group $U(t) = \{e^{-itH} \mid t \in \mathbb{R}\}$ that is jointly invertible for all t by local unitary operations $K \in SU_{\mathrm{loc}}(2^n) = SU(2)^{\otimes n}$ in the sense $K H K^\dagger = -H$.
(4.20)
Apart from complete algebraic classification, in [29] we used that the question obviously finds an affirmative answer, if there is an element K ∈ SUloc (2n ) such that ||KHK † + H||2 = 0,
(4.21)
which amounts to minimizing the transfer function f (K) = Re tr{HKHK † }.
(4.22)
With P denoting the projector onto k, i.e. the Lie algebra of $K = SU_{\mathrm{loc}}(2^n)$, we therefore used the gradient flow $\dot{K} = -\operatorname{grad} f(K) = -P([K H K^\dagger, H]) K$
(4.23)
as another application of Theorem 3.15. If (due to normalization) $\operatorname{Re} \operatorname{tr}\{H K H K^\dagger\} = -1$ can be reached, the interaction Hamiltonian is locally reversible.
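As a hedged illustration of such a flow, the sketch below discretizes it for two qubits. Everything beyond the formulas above is an assumption of this example: the helper names, the constant step size, the starting point, and the orthonormal basis used for the projection P. The sign of the step is chosen here so that f decreases; this is checked numerically by the monotonicity of the recorded values rather than taken from (4.23).

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

# Orthonormal basis (w.r.t. Re tr(B^dagger Omega)) of the local algebra su(2) + su(2).
LOCAL_BASIS = [0.5j * np.kron(s, I2) for s in (SX, SY, SZ)] + \
              [0.5j * np.kron(I2, s) for s in (SX, SY, SZ)]

def project_local(Omega):
    """Orthogonal projection P of a skew-Hermitian matrix onto the local algebra."""
    return sum(np.real(np.trace(B.conj().T @ Omega)) * B for B in LOCAL_BASIS)

def exp_skew(Omega):
    """exp(Omega) for skew-Hermitian Omega via the eigendecomposition of -i*Omega."""
    w, V = np.linalg.eigh(-1j * Omega)
    return (V * np.exp(1j * w)) @ V.conj().T

def f(K, H):
    """Transfer function f(K) = Re tr{H K H K^dagger}."""
    return np.real(np.trace(H @ K @ H @ K.conj().T))

def reversal_flow(H, K0, step=0.2, iterations=300):
    """Discretized descent flow for f on SU(2) x SU(2); returns final K and all f values."""
    K, values = K0, [f(K0, H)]
    for _ in range(iterations):
        A = K @ H @ K.conj().T
        K = exp_skew(step * project_local(A @ H - H @ A)) @ K
        values.append(f(K, H))
    return K, values

def ry(theta):
    """Single-qubit rotation exp(-i*theta*sigma_y/2)."""
    return np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * SY

H_zz = np.kron(SZ, SZ) / 2                                   # Frobenius norm 1
H_xxx = (np.kron(SX, SX) + np.kron(SY, SY) + np.kron(SZ, SZ)) / np.sqrt(12)
K0 = np.kron(ry(1.0), ry(0.4))                               # arbitrary local starting point

_, values_zz = reversal_flow(H_zz, K0)
_, values_xxx = reversal_flow(H_xxx, K0)
print(values_zz[-1], values_xxx[-1])
```

For the two-qubit ZZ coupling the value −1 is reachable (for instance K = σx ⊗ 1 reverses σz ⊗ σz), whereas for the two-qubit Heisenberg-XXX coupling f(K) equals one third of the trace of a rotation matrix and therefore cannot drop below −1/3.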
Remark 4.2. There is an interesting relation to local C-numerical ranges as described in detail in [113, 114]: if the local C-numerical range Wloc (H, H) := {tr(HKHK −1 )|K ∈ K} = [−1; +1] then the interaction Hamiltonian H is locally reversible. The references also establish the interconnection to local C-numerical ranges of circular symmetry and multi-quantum interaction components transforming like irreducible spherical spin tensors. In Fig. 5, we give some examples: e.g., the Ising-ZZ interaction in a cyclic four-qubit coupling topology is locally reversible, while in the cyclic three-qubit topology it is not, and also for two qubits coupled by an isotropic Heisenberg-XXX interaction it is not. Thus numerical tests provide convenient answers particularly in problems where an algebraic assessment becomes more tedious than in the examples presented here, which are fully understood on algebraic grounds [29]. 4.3.2. Pointwise local reversibility In [29] we also generalized the above problem to the question, whether for a fixed τ ∈ R there is a pair K1 , K2 ∈ K = SUloc (2n ) so that K1 e−iτ H K2 = e+iτ H
(4.24)
which upon setting A := e−iτ H and C := e+iτ H is equivalent to ||K1 AK2 − C||2 = 0.
(4.25)
[Figure 5: $\operatorname{tr}\{K H K^{-1} H\}$ [normalised] plotted against the iteration number (0 to 150) for the cases (a), (b) and (c).]

Fig. 5. Gradient-flow driven local reversion of different Heisenberg interaction Hamiltonians: (a) the Ising-ZZ interaction on a cyclic four-qubit topology $C_4$ can in fact be locally reversed, whereas (b) neither the ZZ interaction on a cyclic three-qubit topology $C_3$ can be reversed locally, (c) nor the Heisenberg-XXX interaction between two qubits. (Color online)

[Figure 6: $\operatorname{Re} \operatorname{tr}\{K_1 e^{-itH} K_2 (-e^{-itH})\}$ [normalised] plotted against the iteration number (0 to 50) for the cases (a) and (b).]

Fig. 6. Gradient-flow driven local inversion of the exponential of Hamiltonian $H = \tfrac{1}{2}(\sigma_z \otimes 1 + 1 \otimes \sigma_z + \sigma_z \otimes \sigma_z)$ and $U(\tau) := e^{-i \frac{\pi}{4} H}$ (a) by a gradient flow with independent $K_1$ and $K_2$, (b) by a gradient flow with $K_1 = K_2^\dagger =: K$. (Color online)

Thus one may choose a gradient flow to minimize

$$f(K_1, K_2) := -\frac{1}{2^n} \operatorname{Re} \operatorname{tr}\{C^\dagger K_1 A K_2\} \tag{4.26}$$

by the coupled system

$$\dot{K}_1 = \operatorname{grad} f(K_1) = P(K_1 A K_2 C^\dagger) K_1, \qquad \dot{K}_2 = \operatorname{grad} f(K_2) = P(K_2 C^\dagger K_1 A) K_2. \tag{4.27}$$
So if $f(K_1, K_2) = -1$ can be reached, then $U(\tau) = e^{-i\tau H}$ is locally reversible at time t = τ. See Fig. 6 for examples comparing pointwise and universal local reversibility.

4.4. Intrinsic versus penalty approach: An example

So far, we have demonstrated that constrained optimization tasks arise in quantum information and control that lend themselves to Riemannian, i.e. intrinsic, optimization methods. This is because the differential geometry of their constraint sets is well understood; in particular, many of their Riemannian quantities, like the exponential map, are given explicitly by well-known formulas. In other cases, however, the use of sophisticated tools from differential geometry may be too time-consuming. Therefore, it is sometimes advisable to combine intrinsic techniques with extrinsic methods, like a penalty term or an augmented Lagrange multiplier approach. Here, we only sketch how to incorporate a basic penalty term. For instance, one may face the problem of maximizing a quality function f on the reachable set of a quantum system under additional state space constraints. An example amounts to finding the maximal unitary transfer from matrix (state) A to C subject to leaving another state E invariant (provided A and E do not share
the same stabilizer group). Another variant amounts to optimizing the contrast between the transfer from A to C and the transfer from A to D; so the task is to maximize the transfer from A to C subject to suppressing the transfer from A to D. For tackling those types of problems, we address two basically different approaches: a purely intrinsic one and a combined method joining intrinsic and penalty-type techniques. Both methods will be briefly illustrated for the problem of maximizing the transfer from A to C while leaving E invariant, i.e.

$$\max_{U \in U(N)} |\operatorname{tr}\{U A U^\dagger C^\dagger\}| \quad\text{subject to}\quad U E U^\dagger = E. \tag{4.28}$$
It is straightforward to see that the stabilizer group KE := {K ∈ U (N ) | KEK † = E}
(4.29)
of E forms a compact connected Lie subgroup of U(N). Differentiating the identity $e^{tk} E e^{-tk} = E$ at t = 0 yields its Lie algebra $\mathfrak{k}_E := \{k \in \mathfrak{u}(N) \mid \operatorname{ad}_k(E) \equiv [k, E] = 0\}$.
(4.30)
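Because (4.30) is a linear condition on k, a basis of the stabilizer algebra can be computed numerically as a null space. The sketch below is one possible way to do this, assuming a Hermitian test matrix E and treating u(N) as a real vector space; `u_basis` and `stabilizer_algebra` are hypothetical helper names.

```python
import numpy as np

def u_basis(N):
    """Real basis of u(N), the skew-Hermitian N x N matrices (N**2 elements)."""
    basis = []
    for j in range(N):
        D = np.zeros((N, N), dtype=complex)
        D[j, j] = 1j
        basis.append(D)
        for k in range(j + 1, N):
            S = np.zeros((N, N), dtype=complex)
            S[j, k], S[k, j] = 1, -1
            T = np.zeros((N, N), dtype=complex)
            T[j, k], T[k, j] = 1j, 1j
            basis += [S, T]
    return basis

def stabilizer_algebra(E, tol=1e-10):
    """Generators of {k in u(N) : [k, E] = 0}, found as a real null space."""
    B = u_basis(E.shape[0])
    columns = []
    for b in B:
        c = (b @ E - E @ b).ravel()
        columns.append(np.concatenate([c.real, c.imag]))
    M = np.array(columns).T                 # real matrix of the map k -> [k, E]
    _, s, Vh = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    # Rows of Vh beyond the rank span the null space (real coefficients w.r.t. B).
    return [sum(c * b for c, b in zip(row, B)) for row in Vh[rank:]]

E = np.diag([1.0, 1.0, 2.0]).astype(complex)
generators = stabilizer_algebra(E)
print(len(generators))   # 5, the dimension of u(2) + u(1)
```

The returned generators are orthonormal with respect to the coefficient vectors only; a further orthonormalization in the trace inner product would be needed before using them as a projector basis.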
By the Jacobi identity $[[\Omega_1, \Omega_2], E] + [[\Omega_2, E], \Omega_1] + [[E, \Omega_1], \Omega_2] = 0$ one can easily verify that $\mathfrak{k}_E$ is indeed a Lie subalgebra of $\mathfrak{u}(N)$. Moreover, from the compactness of $K_E$ we conclude that the exponential map $\exp : \mathfrak{k}_E \to K_E$ is not only locally, but globally onto. Note, however, this fact is not exploited in what follows. A set of generators of $\mathfrak{k}_E$ may constructively be found by solving a system of homogeneous linear equations, i.e. $\mathfrak{k}_E = \ker \operatorname{ad}_E \cap \mathfrak{u}(N) = \{k \in \mathfrak{u}(N) \mid (1 \otimes E - E^\top \otimes 1) \operatorname{vec}(k) = 0\}$. In particular, if E is of the form $E = \mu 1 + \Omega$ with $\mu \in \mathbb{C}$ and $\Omega \in \mathfrak{u}(N)$, then $\mathfrak{k}_E$ is identical to the centralizer of Ω in $\mathfrak{u}(N)$. By ortho-normalizing the elements $k_j \in \mathfrak{k}_E$ of the generating set with $j = 1, 2, \dots, n_E$, one obtains the projectors $P_j := |k_j\rangle\langle k_j|$ (see also Eq. (3.88)) to give the total projection operator $P := \sum_j P_j$. With this definition, the gradient flow U2K of the summarizing Table 2 applies and solves Eq. (4.28). Therefore, the constraint of leaving a neutral state E invariant during the transfer from A to C can be approached intrinsically by restricting the flow from the full unitary group to a compact connected Lie subgroup, the stabilizer group $K_E$ of E. However, it may be tedious to check for the stabilizer group $K_E$ in each and every practical instance and then project the gradients onto the corresponding subalgebra $\mathfrak{k}_E$. In [28], we therefore presented a combined approach based on the penalty function $L(U) = f_2(U) - \lambda(\operatorname{tr}\{E^\dagger U E U^\dagger\} - \|E\|_2^2)$
(4.31)
with $f_2(U) := |\operatorname{tr}\{C^\dagger U A U^\dagger\}|^2$ and penalty term $\lambda(\operatorname{tr}\{E^\dagger U E U^\dagger\} - \|E\|_2^2)$. Here, the constraint $U E U^\dagger - E = 0$ was rewritten in the more convenient form $\operatorname{tr}\{E^\dagger U E U^\dagger\} - \|E\|_2^2 = 0$. The algorithm given in Table 2 as U2C implements a discretized gradient
flow of L obtained from the identity

$$DL(U)(\Omega U) = \operatorname{tr}\bigl\{\bigl(2 (f_2^*(U) [U A U^\dagger, C^\dagger])_S - \lambda [U E U^\dagger, E^\dagger]\bigr) \Omega\bigr\}.$$

Note that the penalty parameter λ is increased within the recursion to guarantee that the constraint is (at least approximately) satisfied in the limit. Thus, for the constrained optimization task of maximizing the transfer from A to C subject to leaving the state E invariant, one has the choice of taking either the intrinsic approach U2K or the combined approach of U2C. Note, however, that the intrinsic approach restricts the flow to the stabilizer group $K_E$ at any time, whereas the combined method is designed to start arbitrarily on U(N) but finally to give an equilibrium point on $K_E$. Therefore, the intrinsic approach has the advantage that the constraint is (at least in principle) properly satisfied for the entire iteration. However, there are situations where an intrinsic method is impractical as the computational costs are too high. The combined method, in contrast, does not suffer from this shortcoming and thus has a wider range of applications. On the other hand, it is well known that simple penalty methods as presented above become ill-conditioned for large values of λ. Therefore, an augmented Lagrange multiplier approach may be a good alternative if numerical difficulties arise, cf. [86, 87]. Note that the intrinsic approach paves the way to perform (or approximate) a transfer from A to C robustly by taking $K_E$ as the stabilizer group resistant against a certain error class in the sense familiar from stabilizer codes [142–144]. The extrinsic approach, on the other hand, could be taken to transfer one protected state A to another one C via intermediate states that are no longer necessarily protected against errors as in the intrinsic case. Finally, in [28, 113], we devised a penalty-type gradient flow algorithm for solving the constrained optimization $\max_U |\operatorname{tr}\{C^\dagger U A U^\dagger\}|$ subject to $\operatorname{tr}\{D^\dagger U A U^\dagger\} = \min$.
(4.32)
To this end, we introduced the penalty function L(U ) := |tr{C † UAU † }|2 − λ |tr{D† UAU † }|2 ,
(4.33)
to maximize the transfer from A to C while suppressing the transfer from A to D. This leads to the recursive scheme U3C in Table 2. For the relation of unconstrained and constrained gradient flows to the topic of C-numerical ranges and relative C-numerical ranges, see [113, 114, 145], where the latter explicitly compares gradient results with those of quadratic programming with quadratic constraints.

Table 2. Examples of optimization tasks and related gradient flows.

No. | Target function | Discretized gradient flow | Ref.

I. Unconstrained optimization

Maximization over the orthogonal group: $O \in SO(N, \mathbb{R})$ and $A, \Delta \in \mathbb{R}^{N \times N}$ with Δ diagonal, $\alpha_k > 0$ stepsize
O1 | $f(O) = \operatorname{tr}\{\Delta^\top O A O^\top\}$ | $O_{k+1} = \exp\{-\alpha_k [O_k A O_k^\top, \Delta^\top]\} O_k$ | [17, 22, 23]

Maximization over the unitary group: $U, V \in SU(N)$ and $A, C \in \mathbb{C}^{N \times N}$; $[\cdot, \cdot]_S$ and $(\cdot)_S$ denote skew-Hermitian parts
U1 | $f(U) = \operatorname{Re} \operatorname{tr}\{C^\dagger U A U^\dagger\}$ | $U_{k+1} = \exp\{-\alpha_k [U_k A U_k^\dagger, C^\dagger]_S\} U_k$ | [27, 28]
U2 | $f(U) = |\operatorname{tr}\{C^\dagger U A U^\dagger\}|^2$ | $U_{k+1} = \exp\{-\alpha_k ([A_k, C^\dagger] f^*(U_k) - [A_k, C^\dagger]^\dagger f(U_k))\} U_k$, where $A_k := U_k A U_k^\dagger$ | [23, 29]
U3 | $f(U, V) = \operatorname{Re} \operatorname{tr}\{C^\dagger U A V\}$ | $U_{k+1} = \exp\{-\alpha_k (U_k A V_k C^\dagger)_S\} U_k$, $V_{k+1} = \exp\{-\beta_k (V_k C^\dagger U_k A)_S\} V_k$ | [27, 28]

Maximization restricted to subgroups $K \subset U(N)$ of the unitary group with $K \in K$ and $P_k$ as projection from $\mathfrak{gl}(N, \mathbb{C})$ onto k, i.e. the Lie algebra to K
U1K | $f(K) = \operatorname{Re} \operatorname{tr}\{C^\dagger K A K^\dagger\}$ | $K_{k+1} = \exp\{-\alpha_k P_k [K_k A K_k^\dagger, C^\dagger]\} K_k$ | [here (a)]
U2K | $f(K) = |\operatorname{tr}\{C^\dagger K A K^\dagger\}|^2$ | $K_{k+1} = \exp\{-\alpha_k (P_k [A_k, C^\dagger] f^*(K_k) - P_k [A_k, C^\dagger]^\dagger f(K_k))\} K_k$, where $A_k := K_k A K_k^\dagger$ | [here (a)]
U3K | $f(K_1, K_2) = \operatorname{Re} \operatorname{tr}\{C^\dagger K_1 A K_2\}$ | $K^{(1)}_{k+1} = \exp\{-\alpha_k P_k (K^{(1)}_k A K^{(2)}_k C^\dagger)\} K^{(1)}_k$, $K^{(2)}_{k+1} = \exp\{-\beta_k P_k (K^{(2)}_k C^\dagger K^{(1)}_k A)\} K^{(2)}_k$ | [29]
U4K | $\min_{U \in SU(n)} \bigl\| \sum_{j=1}^{N} U_j A_j U_j^* - A_0 \bigr\|$ | $U^{(j)}_{k+1} = \exp\{-\alpha^{(j)}_k [A^{(j)}_k, A^*_{0jk}]_s\} U^{(j)}_k$, where $A^{(j)}_k := U^{(j)}_k A_j U^{(j)*}_k$ and $A_{0jk} := A_0 - \sum_{\nu = 1, \nu \ne j}^{N} A^{(\nu)}_k$ | [141]
U5K | $\min_{U, V \in SU(n)} \bigl\| \sum_{j=1}^{N} U_j A_j V_j - A_0 \bigr\|$ | $U^{(j)}_{k+1} = \exp\{-\alpha^{(j)}_k (U^{(j)}_k A_j V^{(j)}_k A^*_{0jk})_s\} U^{(j)}_k$, $V^{(j)}_{k+1} = \exp\{-\beta^{(j)}_k (V^{(j)}_k A^*_{0jk} U^{(j)}_k A_j)_s\} V^{(j)}_k$, where $A_{0jk} := A_0 - \sum_{\nu = 1, \nu \ne j}^{N} U^{(\nu)}_k A_\nu V^{(\nu)}_k$ | [141]

Maximization restricted to homogeneous spaces G/H of the orthogonal group with $X \in G/H$ and A, C real symmetric
O1P | $f(X) = \operatorname{tr}\{C X\}$ with $X_k := \operatorname{Ad}_{O_k}(A)$ | $X_{k+1} = e^{-\alpha_k [X_k, C]} X_k e^{+\alpha_k [X_k, C]}$ | [22, 119]

Maximization restricted to homogeneous spaces G/H of the unitary group with $X \in G/H$ and A, C arbitrary complex square and $P_k$ as projection from $\mathfrak{gl}(N, \mathbb{C})$ onto k
U1P | $f(X) = \operatorname{Re} \operatorname{tr}\{C^\dagger X\}$ with $X_k := \operatorname{Ad}_{U_k}(A)$ | $X_{k+1} = e^{-\alpha_k [X_k, C^\dagger]_S} X_k e^{+\alpha_k [X_k, C^\dagger]_S}$ | [here]
U1KP | $f(X) = \operatorname{Re} \operatorname{tr}\{C^\dagger X\}$ with $X_k := \operatorname{Ad}_{K_k}(A)$ | $X_{k+1} = e^{-\alpha_k P_k [X_k, C^\dagger]} X_k e^{+\alpha_k P_k [X_k, C^\dagger]}$ | [here]

II. Constrained optimization

Maximizing L(U) with penalty parameter $\lambda \in \mathbb{R}$ over the unitary group: $U \in SU(N)$; $A, C, D, E \in \mathbb{C}^{N \times N}$
U1C | $L(U) = \operatorname{Re} f_C(U) - \lambda \operatorname{Im}^2 f_C(U)$ with $f_C(U) := \operatorname{tr}\{C^\dagger U A U^\dagger\}$ | $U_{k+1} = \exp\{-\alpha_k ([A_k, C^\dagger]_S + 2 i \lambda \operatorname{Im} f_C(U_k) [A_k, C^\dagger]_H)\} U_k$, where $A_k := U_k A U_k^\dagger$ and $X_{H,S} := \tfrac{1}{2}(X \pm X^\dagger)$ | [28]
U2C | $L(U) = |f_C(U)|^2 - \lambda (f_E(U) - \|E\|_2^2)$ with $f_C(U)$ (s.a.) and $f_E(U) := \operatorname{tr}\{E^\dagger U E U^\dagger\}$ | $U_{k+1} = \exp\{-\alpha_k ((2 f_C^*(U_k) [A_k, C^\dagger])_S - \lambda [E_k, E^\dagger])\} U_k$, where $A_k := U_k A U_k^\dagger$ and $E_k := U_k E U_k^\dagger$ | [28]
U3C | $L(U) = |f_C(U)|^2 - \lambda |f_D(U)|^2$ with $f_C(U)$ (s.a.) and $f_D(U) := \operatorname{tr}\{D^\dagger U A U^\dagger\}$ | $U_{k+1} = \exp\{-2 \alpha_k ((f_C^*(U_k) [A_k, C^\dagger])_S - \lambda (f_D^*(U_k) [A_k, D^\dagger])_S)\} U_k$, where $A_k := U_k A U_k^\dagger$ | [28]

(a) Work presented in part at the MTNS 2006 [117].

Summary: General Gradient Algorithm for Steepest Ascent on Riemannian Manifolds

Requirements: Riemannian manifold M, e.g., Lie group G with (bi-invariant) metric $\langle \cdot \mid \cdot \rangle$ or its group orbits; smooth target function $f : M \to \mathbb{R}$; associated gradient system $\dot{X} = \operatorname{grad} f(X)$.
Input: initial state $X(0) \in M$, parameters for target function.
Output: sequence of iterative pairs $\{(X_k, f(X_k))\}$ approximating critical points $X_*$ and their critical values $f(X_*)$.
Initialization: If possible, generate generic initial state $X_0$, e.g., for compact Lie groups pick random $G_0 \in G$ according to Haar measure (for SU(N) see [140]) and set $X_0 := G_0 \cdot X(0)$, otherwise identify $X_0 := X(0)$; calculate $f(X_0)$, $\operatorname{grad} f(X_0)$, and step size $\alpha_0$ according to Sec. 3.
Recursion: while $k = 0, 1, 2, \dots, k_{\mathrm{limit}}$ and $\alpha_k > \alpha_{\mathrm{threshold}} > 0$ do
  1: iterate $X_{k+1} = \exp_{X_k}(\alpha_k \operatorname{grad} f(X_k))$ according to examples in Table 2.
  2: calculate $f(X_{k+1})$.
  3: update step size $\alpha_{k+1}$ according to Sec. 3.
  4: go to step 1.
end

Fig. 7. Summarizing scheme for steepest-ascent gradient flows on Riemannian manifolds. For related methods, like conjugate gradients, Jacobi- or Newton-type schemes, step (1) has to be modified in a straightforward way according to Sec. 2, for details see [20, 62, 63]. If the dynamic stepsize selection of Sec. 3 is too costly CPU-timewise, one may start out with constant stepsizes, and halve them whenever $(f(X_{k+1}) - f(X_k)) \le 0$, cf. Armijo’s rule. In cases where local extrema exist (see Sec. 3), make sure to run with a sufficient number of generic initial conditions.

5. Conclusions

The ability to calculate optima of quality functions for quantum dynamical processes and to determine steerings in concrete experimental settings that actually achieve these optima is tantamount to exploiting and manipulating quantum effects in future technology. To this end, we have presented a comprehensive account of gradient flows on Riemannian manifolds (see general scheme of Fig. 7) allowing for generically convergent quantum optimization algorithms, with an ample array of explicit examples given in Table 2. Since the state spaces of quantum dynamical systems can often be represented by smooth manifolds, the unified foundations given here are also illustrated by many applications for numerically addressing optimization tasks in quantum information and quantum control. In the present work, a variety of applications are addressed by relating the dynamics to Lie group actions of the unitary group and its closed subgroups, which also includes recent least-squares approximations by a sum of several elements on independent matrix orbits [141] given as instances U4K and U5K in Table 2. Since symmetries give rise to stabilizer groups, particular attention has been paid to gradient flows on homogeneous spaces.
Theory and algorithms have been structured and tailored for the following scenarios:
(i) for Lie groups with bi-invariant metric,
(ii) for closed subgroups,
(iii) for compact Riemannian symmetric spaces, or, more generally,
(iv) for naturally reductive homogeneous spaces.
As soon as the homogeneous spaces are no longer naturally reductive, the “standard” way of representing geodesics on quotient spaces (by projecting geodesics from the group level to the quotient) fails. Alternatives of local approximations have been sketched in these cases in order to structure future developments. Techniques based on the Riemannian exponential are easy to implement on Lie groups (with bi-invariant metric) and their closed subgroups. In particular, gradient flows on subgroups of the unitary group with tensor product structure allow one to address different partitionings of m-party quantum systems, the finest one being the group of purely local operations SU(2) ⊗ SU(2) ⊗ · · · ⊗ SU(2). The corresponding gradient flows have several applications in quantum dynamics: for instance, they prove useful tools to decide whether effective multi-qubit interaction Hamiltonians generate time evolutions that can be reversed in the sense of Hahn’s spin echo solely by local operations. As a new application, gradient flows on $SU(N_1) \otimes SU(N_2) \otimes \cdots \otimes SU(N_m)$ turned out to be a valuable and reliable alternative to conventional tensor-SVD methods for determining best rank-1 tensor approximations to higher-order tensors. In the case of m-party multipartite pure quantum states, they can readily be applied to optimizing entanglement witnesses. Double-bracket flows have been characterized as a special case of gradient flows on naturally reductive homogeneous orbit spaces. Here, in view of using gradient techniques for ground-state calculations [36], it is important to note that double-bracket flows can also be established for any closed subgroup of SU(N): by allowing for different partitionings $SU(N_1) \otimes SU(N_2) \otimes \cdots \otimes SU(N_m)$, one may set up a common frame to compare different types of unitary networks [36, 50] for calculating and simulating large-scale quantum systems.
Moreover, we have shown how techniques of restricting a gradient flow to subgroups also prove a useful tool for addressing constrained optimization tasks by ensuring the constraints are fulfilled intrinsically. As an alternative, we have sketched gradient flows that respect constraints extrinsically, e.g., by way of penalty-type parameters. These methods await application, e.g., in error-correction and robust state transfer. Finally, in a follow-up study, we discussed the dynamics of open quantum systems in terms of Lie semigroups [59]. There, we discuss relations between the theory of Lie semigroups and completely positive semigroups. In open systems in particular, an easy characterization of reachable sets arises only in very simple cases. This poses a current limit to an abstract optimization approach on reachable sets. However, in these cases, gradient-assisted optimal control methods again prove valuable.
Therefore, not only does the current work justify some recent developments, it also provides new techniques to the field of quantum dynamics. It shows how to exploit differential geometry in Lie-theoretical terms for optimization on quantum-state manifolds. The comprehensive theoretical treatment, illustrated by known examples and new practical applications, has thus been given to fill a gap. We anticipate that the ample array of methods and their exemplifications will find broad application, in particular since tensor approximations are beginning to play a key role in tensor-network approaches. They are used to approximate ground-state energies (Rayleigh-coefficients) of large-system Hamiltonians exceeding the memory capacity of any (classical) computer hardware [36, 41–50]. The account of theoretical foundations is also meant to structure and trigger further basic research, thus widening the set of useful tools.

Acknowledgments

Fruitful discussion with Jens Eisert on [36] is gratefully acknowledged. We wish to thank Otfried Gühne for drawing our attention to witness optimization and Ref. [55]. This work was supported in part by the integrated EU programmes QAP, Q-ESSENCE and the exchange with COQUIT, as well as by Deutsche Forschungsgemeinschaft, DFG, in the incentives SPP 1078 and SFB 631. Support and exchange enabled by the two Bavarian PhD programmes of excellence Quantum Computing, Control, and Communication (QCCC) as well as Identification, Optimization and Control with Applications in Modern Technologies is gratefully acknowledged.

References

[1] R. P. Feynman, Simulating physics with computers, Int. J. Theoret. Phys. 21 (1982) 467–488. [2] R. P. Feynman, Feynman Lectures on Computation (Perseus Books, Reading, MA, 1996). [3] A. Y. Kitaev, A. H. Shen and M. N. Vyalyi, Classical and Quantum Computation (American Mathematical Society, Providence, 2002). [4] P. W.
Shor, Algorithms for quantum computation: Discrete logarithms and factoring, in Proceedings of the Symposium on the Foundations of Computer Science (1994 ), Los Alamitos, California, USA (IEEE Computer Society Press, New York, 1994), pp. 124–134. [5] P. W. Shor, Polynomial-time algorithms for prime factorisation and discrete logarithm on a quantum computer, SIAM J. Comput. 26 (1997) 1484–1509. [6] R. Jozsa, Quantum algorithms and the Fourier transform, Proc. R. Soc. A 454 (1998) 323–337. [7] R. Cleve, A. Ekert, C. Macchiavello and M. Mosca, Quantum algorithms revisited, R. Soc. Lond. Proc. Ser. A Math. Phys. Eng. 454 (1998) 339–354. [8] M. Ettinger, P. Høyer and E. Knill, The quantum query complexity of the hidden subgroup problem is polynomial, Inf. Process. Lett. 91 (2004) 43–48. [9] L. K. Grover, A fast quantum mechanical algorithm for database search, in Proceedings of the 28th Annual Symposium on the Theory of Computing (1996 ), Philadelphia, Pennsylvania, USA (ACM Press, New York, 1996), pp. 212–219.
[10] L. K. Grover, Quantum mechanics helps in searching for a needle in a haystack, Phys. Rev. Lett. 79 (1997) 325–328. [11] C. H. Papadimitriou, Computational Complexity (Addison Wesley, Reading MA, 1995). [12] S. Sachdev, Quantum Phase Transitions (Cambridge University Press, Cambridge, 1999). [13] E. Jané, G. Vidal, W. Dür, P. Zoller and J. Cirac, Simulation of quantum dynamics with quantum optical systems, Quant. Inf. Computation 3 (2003) 15–37. [14] D. Porras and J. Cirac, Effective quantum spin systems with trapped ions, Phys. Rev. Lett. 92 (2004) 207901. [15] J. Dowling and G. Milburn, Quantum technology: The second quantum revolution, Phil. Trans. R. Soc. Lond. A 361 (2003) 1655–1674. [16] H. M. Wiseman and G. J. Milburn, Quantum Measurement and Control (Cambridge University Press, Cambridge, 2009). [17] R. W. Brockett, Dynamical systems that sort lists, diagonalise matrices, and solve linear programming problems, in Proc. IEEE Decision Control (1988), Austin, Texas, USA (1988), pp. 779–803; reproduced in: Lin. Alg. Appl. 146 (1991) 79–91. [18] R. W. Brockett, Least-squares matching problems, Lin. Alg. Appl. 122(4) (1989) 761–777. [19] R. W. Brockett, Differential geometry and the design of gradient algorithms, Proc. Symp. Pure Math. 54 (1993) 69–91. [20] S. T. Smith, Geometric optimization methods for adaptive filtering, PhD Thesis, Harvard University, Cambridge MA (1993). [21] S. T. Smith, Hamiltonian and Gradient Flows, Algorithms and Control, Fields Institute Communications (American Mathematical Society, Providence, 1994), pp. 113–136, chap. Optimization techniques on Riemannian manifolds. [22] U. Helmke and J. B. Moore, Optimization and Dynamical Systems (Springer, Berlin, 1994). [23] A. Bloch (ed.), Hamiltonian and Gradient Flows, Algorithms and Control, Fields Institute Communications (American Mathematical Society, Providence, 1994). [24] M. T. Chou and K. R.
Driessel, The projected gradient method for least-squares matrix approximations with spectral constraints, SIAM J. Numer. Anal. 27 (1990) 1050–1060. [25] P. A. Absil, R. Mahony and R. Sepulchre, Optimization Algorithms on Matrix Manifolds (Princeton University Press, Princeton, 2008). [26] L. Ambrosio, N. Gigli and G. Savaré, Gradient Flows in Metric Spaces and in the Space of Probability Measures, Lectures in Mathematics, 2nd edn. (ETH-Zürich, Birkhäuser, Basel, 2008). [27] S. J. Glaser, T. Schulte-Herbrüggen, M. Sieveking, O. Schedletzky, N. C. Nielsen, O. W. Sørensen and C. Griesinger, Unitary control in quantum ensembles: Maximising signal intensity in coherent spectroscopy, Science 280 (1998) 421–424. [28] T. Schulte-Herbrüggen, Aspects and prospects of high-resolution NMR, PhD Thesis, Diss-ETH 12752, Zürich (1998). [29] T. Schulte-Herbrüggen and A. Spörl, Which quantum evolutions can be reversed by local unitary operations? Algebraic classification and gradient-flow based numerical checks (2006); http://arXiv.org/pdf/quant-ph/0610061. [30] N. Khaneja, T. Reiss, C. Kehlet, T. Schulte-Herbrüggen and S. J. Glaser, Optimal control of coupled spin dynamics: Design of NMR pulse sequences by gradient ascent algorithms, J. Magn. Reson. 172 (2005) 296–305.
[31] T. Schulte-Herbrüggen, A. K. Spörl, N. Khaneja and S. J. Glaser, Optimal control-based efficient synthesis of building blocks of quantum algorithms: A perspective from network complexity towards time complexity, Phys. Rev. A 72 (2005) 042331. [32] A. K. Spörl, T. Schulte-Herbrüggen, S. J. Glaser, V. Bergholm, M. J. Storcz, J. Ferber and F. K. Wilhelm, Optimal control of coupled Josephson qubits, Phys. Rev. A 75 (2007) 012302. [33] T. Schulte-Herbrüggen, A. Spörl, N. Khaneja and S. Glaser, Optimal control for generating quantum gates in open dissipative systems (2006); http://arXiv.org/pdf/quant-ph/0609037. [34] P. Rebentrost, I. Serban, T. Schulte-Herbrüggen and F. Wilhelm, Optimal control of a qubit coupled to a non-Markovian environment, Phys. Rev. Lett. 102 (2009) 090401. [35] M. Grace, C. Brif, H. Rabitz, I. Walmsley, R. Kosut and D. Lidar, Optimal control of quantum gates and suppression of decoherence in a system of interacting two-level particles, J. Phys. B.: At. Mol. Opt. Phys. 40 (2007) S103–S125. [36] C. Dawson, J. Eisert and T. J. Osborne, Unifying variational methods for simulating quantum many-body systems, Phys. Rev. Lett. 100 (2008) 130501. [37] T. Huckle, K. Waldherr and T. Schulte-Herbrüggen, Unifying large-scale tensor approximations – Concepts and algorithms (2010); to be submitted. [38] M. Plenio, J. Eisert, J. Dreissig and M. Cramer, Entropy, entanglement, and area: Analytical results for harmonic lattice systems, Phys. Rev. Lett. 94 (2003) 060503. [39] M. Cramer, J. Eisert, M. Plenio and J. Dreissig, Entanglement-area law for general bosonic harmonic lattice systems, Phys. Rev. A 73 (2006) 012309. [40] M. Wolf, F. Verstraete, M. B. Hastings and I. Cirac, Area laws in quantum systems: Mutual information and correlations, Phys. Rev. Lett. 100 (2008) 070502. [41] M. Fannes, B. Nachtergaele and R. Werner, Abundance of translation invariant pure states on quantum spin chains, Lett. Math. Phys. 25 (1992) 249–258. [42] M.
Fannes, B. Nachtergaele and R. F. Werner, Finitely correlated states on quantum spin chains, Comm. Math. Phys. 144 (1992) 443–490. [43] I. Peschel, X. Wang, M. Kaulke and K. Hallberg (eds), Density-Matrix Renormalization: A New Numerical Method in Physics, Lecture Notes in Physics, Vol. 528 (Springer, Berlin, 1999). [44] U. Schollwöck, The density-matrix renormalization group, Rev. Mod. Phys. 77 (2005) 259–315. [45] B. Schumacher and R. Werner, Reversible quantum cellular automata (2004); http://arXiv.org/pdf/quant-ph/0405174. [46] F. Verstraete, D. Porras and I. Cirac, DMRG and periodic boundary conditions: A quantum information perspective, Phys. Rev. Lett. 93 (2004) 227205. [47] S. Anders, M. B. Plenio, W. Dür, F. Verstraete and H. J. Briegel, Ground-state approximation for strongly interacting spin systems in arbitrary spatial dimension, Phys. Rev. Lett. 97 (2006) 107206. [48] G. Vidal, Entanglement renormalization, Phys. Rev. Lett. 99 (2007) 220405. [49] N. Schuch, M. Wolf, F. Verstraete and I. Cirac, Strings, projected entangled pair states, and variational Monte Carlo methods, Phys. Rev. Lett. 100 (2008) 040501. [50] R. Hübner, C. Kruszynska, L. Hartmann, W. Dür, F. Verstraete, J. Eisert and M. Plenio, Renormalization algorithm with graph enhancement, Phys. Rev. A 79 (2009) 022317. [51] F. Wegner, Flow-equations for Hamiltonians, Ann. Phys. (Leipzig) 3 (1994) 77–91. [52] S. Kehrein, The Flow-Equation Approach to Many-Particle Systems, Springer Tracts in Physics, Vol. 217 (Springer, Berlin, 2006).
[53] M. B. Plenio and S. Virmani, An introduction to entanglement measures, Quant. Comp. Inf. 7 (2007) 1–51. [54] R. Horodecki, P. Horodecki, M. Horodecki and K. Horodecki, Quantum entanglement, Rev. Mod. Phys. 81 (2009) 865–942. [55] J. Eisert, P. Hyllus, O. Gühne and M. Curty, Complete hierarchies of efficient approximations to problems in entanglement theory, Phys. Rev. A 70 (2004) 062317. [56] R. Lohmayer, A. Osterloh, J. Siewert and A. Uhlmann, Entangled three-qubit states without concurrence and three-tangle, Phys. Rev. Lett. 97 (2006) 260502. [57] A. Osterloh, J. Siewert and A. Uhlmann, Tangles of superpositions and the convex-roof extension, Phys. Rev. A 77 (2008) 032210. [58] C. Eltschka, A. Osterloh, J. Siewert and A. Uhlmann, Three-tangle for mixtures of generalized GHZ and generalized W states, New J. Phys. 10 (2008) 043014. [59] G. Dirr, U. Helmke, I. Kurniawan and T. Schulte-Herbrüggen, Lie semigroup structures for reachability and control of open quantum systems, Rep. Math. Phys. 64 (2009) 93–121; http://arXiv.org/pdf/0811.3906. [60] M. M. Wolf and J. I. Cirac, Dividing quantum channels, Comm. Math. Phys. 279 (2008) 147–168. [61] C. Udrişte, Convex Functions and Optimization Methods on Riemannian Manifolds (Kluwer, Dordrecht, 1994). [62] D. Gabay, Minimizing a differential function over a differential manifold, J. Optim. Theory Appl. 37 (1982) 177–219. [63] M. Kleinsteuber, Jacobi-type methods on semisimple Lie algebras – A Lie algebraic approach to the symmetric eigenvalue problem, PhD Thesis, Universität Würzburg (2006). [64] J. Nocedal, Updating quasi-Newton matrices with limited storage, Math. Comp. 35 (1980) 773–782. [65] R. H. Byrd, P. Lu and R. B. Schnabel, Representation of quasi-Newton matrices and their use in limited memory methods, Math. Program. 63 (1994) 129–156. [66] J. Nocedal and S. J. Wright, Numerical Optimization, 2nd edn. (Springer, New York, 2006). [67] V.
Jurdjevic, Geometric Control Theory (Cambridge University Press, Cambridge, 1997). [68] Y. L. Sachkov, Controllability of invariant systems on Lie groups and homogeneous spaces, J. Math. Sci. 100 (2000) 2355–2427. [69] G. Dirr and U. Helmke, Lie theory for quantum control, GAMM-Mitteilungen 31 (2008) 59–93. [70] D. D’Alessandro, Introduction to Quantum Control and Dynamics (Chapman & Hall/CRC, Boca Raton, 2008). [71] V. F. Krotov, Global Methods in Optimal Control (Marcel Dekker, New York, 1996). [72] A. Peirce, M. Dahleh and H. Rabitz, Optimal control of quantum mechanical systems: Existence, numerical approximations and applications, Phys. Rev. A 37 (1987) 4950–4962. [73] K. L. Teo, C. J. Goh and K. H. Wong, A Unified Computational Approach to Optimal Control Problems (Chapman & Hall/CRC, Boca Raton, 1991). [74] Y. Maday and G. Turinici, New formulation of monotonically convergent quantum control algorithms, J. Chem. Phys. 118 (2003) 8191–8196. [75] H. Sussmann and V. Jurdjevic, Controllability of nonlinear systems, J. Differential Equations 12 (1972) 95–116. [76] V. Jurdjevic and H. Sussmann, Control systems on Lie groups, J. Differential Equations 12 (1972) 313–329.
[77] A. Agrachev and T. Chambrion, An estimation of the controllability time for single-input systems on compact Lie groups, ESAIM Control Optim. Calc. Var. 12 (2006) 409–441. [78] R. W. Brockett, System theory on group manifolds and coset spaces, SIAM J. Control 10 (1972) 265–284. [79] R. W. Brockett, Lie theory and control systems defined on spheres, SIAM J. Appl. Math. 25 (1973) 213–225. [80] W. M. Boothby and E. N. Wilson, Determination of the transitivity of bilinear systems, SIAM J. Control Optim. 17 (1979) 212–221. [81] F. Albertini and D. D’Alessandro, Notions of controllability for bilinear multilevel quantum systems, IEEE Trans. Automat. Control 48 (2003) 1399–1403. [82] R. Zeier, U. Sander and T. Schulte-Herbrüggen, Symmetry in quantum system theory of multi-qubit systems, in Proc. 19th MTNS, Budapest, Hungary (2010), in press. [83] U. Sander and T. Schulte-Herbrüggen, Symmetry in quantum system theory of multi-qubit systems: Rules for quantum architecture design (2009); http://arXiv.org/pdf/0904.4654. [84] M. W. Hirsch and S. Smale, Differential Equations, Dynamical Systems, and Linear Algebra (Academic Press, San Diego, 1974). [85] M. C. Irwin, Smooth Dynamical Systems (Academic Press, New York, 1980). [86] R. Fletcher, Practical Methods of Optimization, 2nd edn. (Wiley & Sons, Chichester, 1987). [87] D. G. Luenberger and Y. Ye, Linear and Nonlinear Programming, 3rd edn. (Springer, Berlin, 2008). [88] W. Boothby, An Introduction to Differential Manifolds and Riemannian Geometry (Academic Press, New York, 1975). [89] S. Gallot, D. Hulin and J. Lafontaine, Riemannian Geometry, 3rd edn. (Universitext, Springer, Berlin, 2004). [90] M. Spivak, A Comprehensive Introduction to Differential Geometry, Vols. I–II, 3rd edn. (Publish or Perish, Houston, 1999). [91] B. O’Neill, Semi-Riemannian Geometry (Academic Press, San Diego, 1983). [92] R. Abraham, J. E. Marsden and T. Ratiu, Manifolds, Tensor Analysis and Applications, 2nd edn. (Springer, New York, 1988).
[93] J. Palis and W. de Melo, Geometric Theory of Dynamical Systems (Springer, New York, 1982). [94] S. Łojasiewicz, Sur les Trajectoires du Gradient d’une Fonction Analytique. Seminari di Geometria 1982–1983, Università di Bologna, Istituto di Geometria, Dipartimento di Matematica (1984). [95] K. Kurdyka, On gradients of functions definable in O-minimal structures, Ann. Inst. Fourier 48 (1998) 769–783. [96] S. Kobayashi and K. Nomizu, Foundations of Differential Geometry, Vols. I–II (Wiley Interscience, New York, 1996). [97] F. Takens, A solution, in Manifolds – Amsterdam 1970, ed. N. Kuiper, Lecture Notes in Math., Vol. 197 (Springer, New York, 1971), p. 231. [98] C. Lageman, Convergence of gradient-like dynamical systems and optimization algorithms, PhD Thesis, Universität Würzburg (2007). [99] S. Helgason, Differential Geometry, Lie Groups, and Symmetric Spaces (Academic Press, New York, 1978). [100] B. C. Hall, Lie Groups, Lie Algebras, and Representations (Springer, New York, 2003).
[101] J. J. Duistermaat and J. A. C. Kolk, Lie Groups (Springer, New York, 2000). [102] A. Arvanitoyeorgos, An Introduction to Lie Groups and the Geometry of Homogeneous Spaces (American Mathematical Society, Providence, 2003). [103] A. W. Knapp, Lie Groups beyond an Introduction, 2nd edn. (Birkhäuser, Boston, 2002). [104] J. Milnor, Curvatures of left invariant metrics on Lie groups, Adv. Math. 21 (1976) 293–329. [105] J. Cheeger and D. G. Ebin, Comparison Theorems in Riemannian Geometry (North-Holland, Amsterdam, 1975). [106] T. Bröcker and T. tom Dieck, Representation of Compact Lie Groups (Springer, New York, 1985). [107] O. Kowalski and J. Szenthe, On the existence of homogeneous geodesics in homogeneous Riemannian manifolds, Geom. Dedicata 81 (2000) 209–214; Erratum, ibid. 84 (2001) 331. [108] A. Besse, Einstein Manifolds (Springer, Berlin, 1986). [109] B. Kostant, Holonomy and Lie algebra of motions in Riemannian manifolds, Trans. Amer. Math. Soc. 80 (1955) 520–542. [110] U. Helmke, K. Hüper and J. Trumpf, Newton’s method on Grassmann manifolds (2007); http://arXiv.org/pdf/0709.2205. [111] M. Goldberg and E. Straus, Elementary inclusion relations for generalized numerical ranges, Linear Algebra Appl. 18 (1977) 1–24. [112] C.-K. Li, C-numerical ranges and C-numerical radii, Lin. Multilin. Alg. 37 (1994) 51–82. [113] T. Schulte-Herbrüggen, G. Dirr, U. Helmke, M. Kleinsteuber and S. Glaser, The significance of the C-numerical range and the local C-numerical range in quantum control and quantum information, Lin. Multilin. Alg. 56 (2008) 3–26. [114] G. Dirr, U. Helmke, M. Kleinsteuber and T. Schulte-Herbrüggen, Relative C-numerical ranges for applications in quantum control and quantum information, Lin. Multilin. Alg. 56 (2008) 27–51. [115] C.-K. Li and N. K. Tsing, Matrices with circular symmetry on their unitary orbits and C-numerical ranges, Proc. Amer. Math. Soc. 111 (1991) 19–28. [116] U. Helmke, K. Hüper, J. B. Moore and T.
Schulte-Herbrüggen, Gradient flows computing the C-numerical range with applications in NMR spectroscopy, J. Global Optim. 23 (2002) 283–308. [117] G. Dirr, U. Helmke, M. Kleinsteuber, S. Glaser and T. Schulte-Herbrüggen, The local C-numerical range: Examples, conjectures and numerical algorithms, in Proc. MTNS (2006), Kyoto, Japan (2006), pp. 1419–1426. [118] G. Dirr, U. Helmke, M. Kleinsteuber and T. Schulte-Herbrüggen, A new type of C-numerical range arising in quantum computing, PAMM 6 (2006) 711–712; Special issue on 80th Annual Meeting GAMM. [119] A. Bloch, R. W. Brockett and T. Ratiu, A new formulation of the generalized Toda lattice equations and their fix-point analysis via the moment map, Bull. Am. Math. Soc. 56 (1990) 447–451. [120] A. Bloch, R. W. Brockett and T. Ratiu, Completely integrable gradient flows, Comm. Math. Phys. 147 (1992) 57–74. [121] R. Bertlmann, H. Narnhofer and W. Thirring, A geometric picture of entanglement and Bell inequalities, Phys. Rev. A 66 (2002) 032319. [122] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information (Cambridge University Press, Cambridge, UK, 2000).
[123] O. Curtef, G. Dirr and U. Helmke, Conjugate gradient algorithms for best rank-1 approximation of tensors, PAMM 7 (2008) 1062201–1062202; Proceedings of the ICIAM (2007), Zürich, Switzerland. [124] J. von Neumann, Some matrix-inequalities and metrization of matrix-space, Tomsk Univ. Rev. 1 (1937) 286–300; reproduced in John von Neumann: Collected Works, Vol. IV: Continuous geometry and other topics, ed. A. H. Taub (Pergamon Press, Oxford, 1962), pp. 205–219. [125] O. Sørensen, Polarization transfer experiments in high-resolution NMR spectroscopy, Prog. NMR Spectrosc. 21 (1989) 503–569. [126] D. Markham, J. A. Miszczak, Z. Puchała and K. Życzkowski, Quantum state discrimination: A geometric approach, Phys. Rev. A 77 (2008) 042111. [127] J. Stoustrup, O. Schedletzky, S. J. Glaser, C. Griesinger, N. C. Nielsen and O. W. Sørensen, Generalized bound on quantum dynamics: Efficiency of unitary transformations between non-Hermitian states, Phys. Rev. Lett. 74 (1995) 2921–2924. [128] T. Kolda, Orthogonal tensor decompositions, SIAM J. Matrix Anal. Appl. 23 (2001) 243–255. [129] T. Zhang and G. H. Golub, Rank-one approximation to higher-order tensors, SIAM J. Matrix Anal. Appl. 23 (2001) 534–550. [130] L. de Lathauwer, B. de Moor and J. Vandewalle, A multilinear singular value decomposition, SIAM J. Matrix Anal. Appl. 21 (2000) 1253–1278. [131] L. de Lathauwer, B. de Moor and J. Vandewalle, On the best rank-1 and rank-(R1, R2, ..., Rn) approximation of higher-order tensors, SIAM J. Matrix Anal. Appl. 21 (2000) 1324–1342. [132] L. Eldén and B. Savas, A Newton–Grassmann method for computing the best multilinear rank-(R1, R2, R3) approximation of a tensor, SIAM J. Matrix Anal. Appl. 31 (2009) 248–271. [133] B. Savas and L. H. Lim, Quasi-Newton methods on Grassmannians and multilinear approximations of tensors, Optimization Online 2009 (2009) 2362; arXiv:0907.2214. [134] O. Curtef, G. Dirr and U.
Helmke, Riemannian optimization on tensor manifolds: Applications to generalized Rayleigh quotients (2010); arXiv:1005.4854. [135] M. Ishteva, L. D. Lathauwer, P. A. Absil and S. V. Huffel, Differential-geometric Newton method for the best rank-(R1, R2, R3) approximation of tensors, Numer. Algorithms 51 (2009) 179–194; Tributes to Gene H. Golub, Part II. [136] T. Wei and P. Goldbart, Geometric measure of entanglement and applications to bipartite and multipartite quantum states, Phys. Rev. A 68 (2003) 022307. [137] T. G. Kolda and B. W. Bader, Tutorial on MATLAB for tensors and the Tucker decomposition, Talk at workshop on tensor decomposition and applications, CIRM, Marseille (2005). [138] J. L. Brylinski, Mathematics of Quantum Computation, Computational Mathematics Series (Chapman & Hall/CRC, Boca Raton, 2002), pp. 3–23, chap. on Algebraic Measures of Entanglement. [139] B. G. Englert and N. Metwally, Mathematics of Quantum Computation, Computational Mathematics Series (Chapman & Hall/CRC, Boca Raton, 2002), pp. 24–75, chap. on Kinematics of Qubit Pairs. [140] F. Mezzadri, How to generate random matrices from the classical compact groups, Notices Amer. Math. Soc. 54 (2007) 592–604. [141] C. K. Li, Y. T. Poon and T. Schulte-Herbrüggen, Least-squares approximation by elements from matrix orbits achieved by gradient flows on compact Lie groups, Math. Comp., in press (2010); arXiv:0812.1817.
[142] M. Grassl, Lectures on Quantum Information (Wiley-VCH, Weinheim, 2007), pp. 105–120, chap. on Quantum Error Correction. [143] A. R. Calderbank, E. M. Rains, P. W. Shor and N. J. A. Sloane, Quantum error correction via codes over GF(4), IEEE Trans. Inf. Theory 44 (1998) 1369–1387. [144] A. R. Calderbank and P. W. Shor, Good quantum error-correcting codes exist, Phys. Rev. A 54 (1998) 1089–1105. [145] B. Tibken, Y. Fan, S. J. Glaser and T. Schulte-Herbrüggen, Semidefinite programming relaxations applied to determining upper bounds of C-numerical ranges, in Proc. IEEE Intl. Conference on Control Applications (CCA) (2004), Munich, Germany (2004); published as CD-ROM Proceedings (2006).
July 12, J070-S0129055X10004041
2010 11:50 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 22, No. 6 (2010) 669–697 © World Scientific Publishing Company DOI: 10.1142/S0129055X10004041
DENSITY DEPENDENT STOCHASTIC NAVIER–STOKES EQUATIONS WITH NON-LIPSCHITZ RANDOM FORCING
MAMADOU SANGO School of Mathematics, Institute for Advanced Study, 1, Einstein Drive, Princeton, NJ 08540, USA and Department of Mathematics and Applied Mathematics, University of Pretoria, Pretoria 0002, South Africa
[email protected] [email protected]

Received 24 September 2009
Revised 17 March 2010

In this work, we investigate the question of existence of weak solutions to the density dependent stochastic Navier–Stokes equations. The noise considered contains functions which depend nonlinearly on the velocity and which do not satisfy the Lipschitz condition. Furthermore, the initial density is allowed to vanish. We introduce a suitable notion of probabilistic weak solution for the problem and prove its existence.

Keywords: Density dependent stochastic Navier–Stokes equations; weak solutions; Galerkin scheme; tightness of probability measures; Prokhorov and Skorokhod’s theorems.

Mathematics Subject Classification 2010: 35R60, 35D05, 35Q35
1. Introduction

The mathematical study of incompressible Navier–Stokes equations goes back to the pioneering work of Leray [29–31]. Since then a considerable wealth of work and ground-breaking results have been obtained by some of the brightest minds in Mathematics and Applied Mathematics. For an in-depth historical overview of the body of work done in this direction, we refer to the monographs [1, 27, 32, 35, 57]. One of the greatest challenges in the field of fluid dynamics is understanding the complex phenomenon of turbulence. With the development of stochastic processes, models of Navier–Stokes equations perturbed by white noise were proposed and investigated in the quest for a better understanding of turbulence in fluids (see [3–5, 10, 11, 15–17, 19, 40, 41, 43, 46, 47, 60–62], just to cite a few). The main feature of these equations is the decomposition of the force acting on the fluid into a regular (deterministic) part and a very irregular (turbulent) part driven by white noise. The mathematical theory of stochastic (mainly incompressible)
Navier–Stokes equations is a very rich and broad area covering deep results on existence of solutions, dynamical-systems features, ergodicity, and many more; see [11, 14, 17, 42], for instance. However, while research in the density independent case has seen relatively sustained growth over the years, very little is known about the density dependent case, which even in the deterministic setting has a relatively recent history. On the deterministic front, results on global existence and some uniqueness results have been obtained by Antontsev and Kazhikhov [1, 24] in the case of nonvanishing initial density; see also [28]. These results were subsequently extended to the vanishing initial density case in [20, 22, 23, 33, 35, 52–54] (the magnetohydrodynamic version). The most difficult case of compressible fluids has been a very active area since the work of Lions [37–39], where the notion of renormalized solution introduced earlier by him and DiPerna led to a breakthrough in the field; we refer to his monograph [36] and those of Feireisl [18] and Novotny [44] for a greater wealth of information.

In the present work, we provide a detailed investigation of a large class of stochastic density dependent Navier–Stokes equations. We consider a sufficiently general forcing consisting of a regular part and a stochastic part, both depending nonlinearly on the velocity of the fluid. We do not require the functions involved in the forcing to satisfy the Lipschitz condition, and we allow the initial density of the fluid to be merely nonnegative, so that it may vanish. The main result is the construction of a probabilistic weak solution for the problem considered. The result is achieved thanks to a delicate blending of the semi-Galerkin approximation and deep compactness theorems of both probabilistic and analytic nature, a combination which has proved very efficient in establishing existence of solutions in other problems; we refer to [3–5, 15, 16, 19, 46–50, 62, 64].
Securing the strong convergence of several sequences of the approximating solutions through the tightness of the corresponding probability distributions and fine measure-theoretic results presented far more challenges than in the deterministic and the density independent stochastic Navier–Stokes cases. Our results extend most of the known deterministic existence results referred to above to the stochastic case.

Yashima was the first to study stochastic density dependent equations in his thesis [64]. He considered additive noise and the case of positive initial density. One of his main contributions is the extension of some results of Bensoussan and Temam [5] to the density dependent case and the extension of some results of Antontsev and Kazhikhov [1] to the stochastic case. The next work known to us in this direction is that of Cutland and Enright [12], who treat the case of positive initial density with nonlinear noise depending on the velocity. Their approach is based on nonstandard analysis and Loeb space techniques. It is worth noting that some existence results in the one-dimensional and two-dimensional compressible cases were obtained in the work of Tornatore and Yashima [58, 59, 63]. Since the forces are not Lipschitz continuous, uniqueness is out of reach for the problem we study. The genuine uniqueness question is similar to the still unsolved deterministic case.
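Let $D$ be a bounded domain in $\mathbb{R}^3$ with a sufficiently smooth boundary $\partial D$ (at least Lipschitz). We fix a final time $T > 0$ and denote by $Q_T$ the cylindrical domain $(0, T) \times D$. We consider the initial boundary value problem for the density dependent stochastic Navier–Stokes equations
\begin{align}
\rho\,du + \bigl(\rho(u\cdot\nabla)u - \mu\Delta u + \nabla P\bigr)\,dt &= \rho f(t,u)\,dt + \rho g(t,u)\,dW && \text{in } Q_T, \tag{1}\\
\frac{\partial\rho}{\partial t} + (u\cdot\nabla)\rho &= 0 && \text{in } Q_T, \tag{2}\\
\operatorname{div} u &= 0 && \text{in } Q_T, \tag{3}\\
u &= 0 && \text{on } (0,T)\times\partial D, \tag{4}\\
u(0) = u_0, \quad \rho(0) &= \rho_0 && \text{in } D; \tag{5}
\end{align}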
Here $u$ is the velocity of the particles of fluid, $P$ the pressure, $\rho$ the density, and $W$ an $l$-dimensional Wiener process. The right-hand side of (1) represents the force acting on the fluid; it consists of a regular part involving the function $f$ and a chaotic part involving the function $g$ and $W$.

As a closing remark in this introduction, we note that the framework elaborated in the present paper opens some opportunities for attacking ergodic problems related to density dependent turbulent Navier–Stokes fluids. The density independent case was considered in [7–9, 14, 17, 25, 26], just to cite a few; see also the references therein. The Galerkin approximation plays an important role in these works.

The plan of the paper is as follows. In Sec. 2, we gather some preliminary results that will be needed in the work, introduce the definition of a probabilistic weak solution for problem (1)–(5), and formulate our main result. In Sec. 3, we introduce a semi-Galerkin approximation scheme for problem (1)–(5) and obtain a priori estimates for the approximating solutions needed for the application of several compactness results. In Sec. 4, we prove the crucial result of tightness of the Galerkin solutions and apply Prokhorov’s and Skorokhod’s compactness results. Finally, in Sec. 5, we prove our main result.
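To indicate why a priori estimates of the kind announced for Sec. 3 are plausible, the following formal computation may help. It is only a sketch under assumptions that are not made in the text: $(\rho, u, P)$ is taken smooth enough for Itô's formula to apply pointwise, $\rho > 0$, $W = (W_1, \dots, W_l)$ has independent standard components, $g\,dW$ stands for $\sum_k g_k\,dW_k$, and $|g|^2 = \sum_k |g_k|^2$. The rigorous estimates are carried out on the approximating solutions, not on (1)–(5) directly.

```latex
% Formal energy balance for (1)-(5); assumes smooth solutions and \rho > 0.
\begin{align*}
d\bigl(\rho|u|^2\bigr)
  &= 2u\cdot(\rho\,du) + \rho|g|^2\,dt + |u|^2\,d\rho
  && \text{(It\^o's formula; $\rho$ has no martingale part)}\\
\int_D |u|^2\,d\rho\,dx
  &= -\int_D |u|^2 (u\cdot\nabla)\rho\,dx\,dt
  && \text{(by (2))}\\
-2\int_D \rho\,u\cdot(u\cdot\nabla)u\,dx
  &= \int_D |u|^2 (u\cdot\nabla)\rho\,dx
  && \text{(integration by parts, (3), (4))}\\
d\int_D \rho|u|^2\,dx + 2\mu\int_D |\nabla u|^2\,dx\,dt
  &= 2\int_D \rho f\cdot u\,dx\,dt + \int_D \rho|g|^2\,dx\,dt
     + 2\int_D \rho\,u\cdot g\,dx\,dW .
\end{align*}
```

In this formal picture the transport and convection terms cancel, the pressure term drops out because $\operatorname{div} u = 0$ and $u = 0$ on the boundary, and only the forcing terms remain on the right-hand side. Growth conditions on $f$ and $g$ combined with a Gronwall-type argument would then be a natural route to bounds on $\sqrt{\rho}\,u$ and $\nabla u$.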
2. Preliminaries and Main Result

We introduce some function spaces. Let $\mathcal{D}(D)$ be the space of $C^\infty$ functions compactly supported in $D$ and let $\mathcal{D}'(D)$ be the space of distributions on $D$. For $1 \le r \le \infty$ and $l$ a nonnegative integer, we define the Sobolev spaces
\[ W_r^l(D) = \{ v \in L^r(D) : D^\alpha v \in (L^r(D))^3 \text{ for } |\alpha| \le l \}, \]
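where $D^\alpha = D_1^{\alpha_1} \cdots D_3^{\alpha_3}$, $\alpha = (\alpha_1, \alpha_2, \alpha_3)$, $|\alpha| = \alpha_1 + \alpha_2 + \alpha_3$, $D_i = \partial/\partial x_i$. $W_{r,0}^l(D)$ is the closure of $\mathcal{D}(D)$ in $W_r^l(D)$,
\[ W_r^{-1}(D) = \Bigl\{ v = v_0 + \sum_{i=1}^{3} \frac{\partial v_i}{\partial x_i} : v_i \in L^r(D),\ i = 0, 1, 2, 3 \Bigr\}, \]
\[ H^l(D) = W_2^l(D), \qquad H_0^l(D) = W_{2,0}^l(D), \qquad H^{-1}(D) = W_2^{-1}(D); \]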
these spaces are endowed with their respective usual norms. Next let $\mathcal{V} = \{v \in \mathcal{D}(D) : \operatorname{div} v = 0\}$. Denote by V the closure of $\mathcal{V}$ in $(H^1(D))^3$ and by H the closure of $\mathcal{V}$ in $(L^2(D))^3$. V and H are Hilbert spaces with norms $\|\cdot\|_V$ and $\|\cdot\|_H$, respectively. We denote the Euclidean norm by |·|. Since the boundary of D is Lipschitz, the following characterizations of V and H hold:
\[ V = \{v \in (H^1(D))^3 : \operatorname{div} v = 0 \text{ in } D \text{ and } v|_{\partial D} = 0\}, \]
\[ H = \{v \in (L^2(D))^3 : \operatorname{div} v = 0 \text{ in } D \text{ and } v|_{\partial D}\cdot n = 0\}, \]
where $v|_{\partial D}$ denotes the trace of v on ∂D and n is a vector normal to ∂D. The inner product in H is induced by the inner product (·, ·) in $L^2(D)$. We denote by ⟨·, ·⟩ the duality pairing between V and its dual V'. We denote by $(\cdot,\cdot)_D$ the duality product in all function spaces on D. In particular,
\[ (v,w)_D = \int_D v(x)w(x)\,dx, \]
if $v \in L^p(D)$ and $w \in L^{p'}(D)$, $p^{-1} + (p')^{-1} = 1$.
We recall some properties of products in Sobolev spaces $W_p^1(D)$, p ≥ 1; the Sobolev conjugate $p_*$ is given by $p_*^{-1} = p^{-1} - 3^{-1}$ if p < 3; $p_*$ is any finite nonnegative real if p = 3, and $p_* = \infty$ if p > 3.
Lemma 1. (i) For 1 ≤ p ≤ q ≤ ∞, the product $W_p^1(D)\times W_q^1(D)\to W_r^1(D)$ is continuous if r ≥ 1 and $r^{-1} = p^{-1} + q_*^{-1}$. (ii) For 1 ≤ p ≤ ∞, 1 ≤ q ≤ ∞, the product
\[ W_p^1(D)\times W_q^{-1}(D)\to W_r^{-1}(D) \tag{6} \]
is continuous if $p^{-1}+q^{-1}\le 1$ and $r^{-1} = p_*^{-1}+q^{-1}$.
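The exponent bookkeeping in Lemma 1(ii) is easy to get wrong, so a small calculator may help. The following is only a sketch under the definitions stated above (dimension 3, Sobolev conjugate as defined before the lemma); the function names `inv`, `inv_sobolev_conjugate` and `inv_r_lemma1_ii` are illustrative and not part of the paper.

```python
from fractions import Fraction

INF = float("inf")


def inv(p):
    """Return 1/p as an exact fraction, with the convention 1/infinity = 0."""
    return Fraction(0) if p == INF else Fraction(1) / Fraction(p)


def inv_sobolev_conjugate(p):
    """Return 1/p_* in dimension 3: 1/p - 1/3 for p < 3, and 0 (p_* = infinity) for p > 3."""
    if p == 3:
        # For p = 3 any finite p_* is admissible, so no single value is returned.
        raise ValueError("p = 3: p_* may be any finite exponent")
    return inv(p) - Fraction(1, 3) if p < 3 else Fraction(0)


def inv_r_lemma1_ii(p, q):
    """Return 1/r for the product W_p^1 x W_q^{-1} -> W_r^{-1} of Lemma 1(ii)."""
    if inv(p) + inv(q) > 1:
        raise ValueError("Lemma 1(ii) requires 1/p + 1/q <= 1")
    return inv_sobolev_conjugate(p) + inv(q)


# H^1 x H^{-1}: 1/r = 1/6 + 1/2 = 2/3, i.e. r = 3/2.
print(inv_r_lemma1_ii(2, 2))
# H^1 x W_infinity^{-1}: 1/r = 1/6, i.e. r = 6.
print(inv_r_lemma1_ii(2, INF))
```

The two sample calls correspond to the products $W_2^1 \times W_2^{-1} \to W_{3/2}^{-1}$ and $H^1 \times W_\infty^{-1} \to W_6^{-1}$ that are used later in the paper.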
July 12, J070-S0129055X10004041
2010 11:50 WSPC/S0129-055X
148-RMP
Density Dependent Stochastic Navier–Stokes Equations
673
For a probability space $(\Omega,\mathcal F,P)$ and a Banach space X we introduce the space $L^p(\Omega,\mathcal F,P,L^q(0,T,X))$ (1 ≤ p, q ≤ ∞) of random functions defined on Ω with values in $L^q(0,T,X)$. We endow $L^p(\Omega,\mathcal F,P,L^q(0,T,X))$ with the norm
\[ \|\varphi\|_{L^p(\Omega,\mathcal F,P,L^q(0,T,X))} = \big(E\|\varphi(\omega,\cdot,\cdot)\|^p_{L^q(0,T,X)}\big)^{1/p}. \]
We shall need in the sequel some important compactness results that we formulate now. The proofs of these results can be found in the given references. We have [32, Chap. 1, Lemma 1.3].
Lemma 2. Let $(g_\kappa)_{\kappa=1,2,\dots}$ and g be some functions in $L^q(0,T,L^q(D))$ with q ∈ (1, ∞) such that
\[ \|g_\kappa\|_{L^q(0,T,L^q(D))} \le C, \quad \forall\kappa \]
and as κ → ∞
\[ g_\kappa \to g \quad \text{for almost all } (x,t)\in Q_T. \]
Then $g_\kappa$ converges weakly to g in $L^q(0,T,L^q(D))$.
Remark 3. The results of the lemma hold for the space $L^q(\Omega,\mathcal F,P,L^q(0,T,L^q(D)))$ in Ω × $Q_T$.
The next result is a sharper version of a theorem of Aubin (cf. [32, Chap. 1, Par. 5]) due to Simon [51, Sec. 8, Theorem 5].
Lemma 4. Let X, B and Y be some Banach spaces such that X is compactly embedded into B and let B be a subset of Y. For any 1 ≤ p, q ≤ ∞, and 0 < s ≤ 1 let E be a set bounded in $L^q(0,T,X)\cap N^{s,p}(0,T,Y)$, where
\[ N^{s,p}(0,T,Y) = \Big\{ v\in L^p(0,T,Y) : \sup_{h>0} h^{-s}\|v(t+h)-v(t)\|_{L^p(0,T-h,Y)} < \infty \Big\}. \]
Then E is relatively compact in $L^p(0,T,B)$.
We shall need in the sequel two deep results due to Prokhorov and Skorokhod. We begin by introducing the concept of tightness of probability measures. Let E be a separable Banach space and let $\mathcal B(E)$ be its Borel σ-field.
Definition 5. A family of probability measures $\mathcal P$ on $(E,\mathcal B(E))$ is tight if for any ε > 0, there exists a compact set $K_\varepsilon \subset E$ such that
\[ \mu(K_\varepsilon) \ge 1-\varepsilon, \quad \text{for all } \mu\in\mathcal P. \]
A sequence of measures $\{\mu_n\}$ on $(E,\mathcal B(E))$ is weakly convergent to a measure µ if for all continuous and bounded functions φ on E
\[ \lim_{n\to\infty}\int_E \varphi(x)\,\mu_n(dx) = \int_E \varphi(x)\,\mu(dx). \]
The following result due to Prokhorov [45] shows that the tightness property is a compactness criterion.
Lemma 6. A sequence of measures $\{\mu_n\}$ on $(E,\mathcal B(E))$ is tight if and only if it is relatively compact, that is, there exists a subsequence $\{\mu_{n_k}\}$ which weakly converges to a probability measure µ.
Skorokhod proves in [55] the next result, which relates the weak convergence of probability measures to the almost sure convergence of random variables.
Lemma 7. For an arbitrary sequence of probability measures $\{\mu_n\}$ on $(E,\mathcal B(E))$ weakly convergent to a probability measure µ, there exist a probability space $(\Omega,\mathcal F,P)$ and random variables $\xi, \xi_1, \dots, \xi_n, \dots$ with values in E such that the probability law of $\xi_n$,
\[ \mathcal L(\xi_n)(A) = P\{\omega\in\Omega : \xi_n(\omega)\in A\}, \quad \text{for all } A\in\mathcal B(E), \]
is $\mu_n$, the probability law of ξ is µ, and
\[ \lim_{n\to\infty}\xi_n = \xi, \quad P\text{-a.s.} \]
We borrowed the presentation of these lemmas from [13].
We now formulate the conditions on f and g. We assume that f : (0, T) × H → V is a nonlinear mapping:
(i) continuous in both its variables,
(ii) there exists a positive constant C such that
\[ \|f(t,v)\|_V \le C(1+\|v\|_H). \tag{7} \]
We assume that $g : (0,T)\times H \to H^{\times l}$ is a nonlinear mapping:
(i) continuous in both its variables,
(ii) there exists a positive constant C such that
\[ \|g(t,v)\|_{H^{\times l}} \le C(1+\|v\|_H); \tag{8} \]
$H^{\times l}$ is the product of l copies of the space H. We state the following:
Definition 8. A weak solution of (1)–(5) is a probabilistic system $(\Omega,\mathcal F,\mathcal F^t,P,W,u,\rho)$ where
(i) $(\Omega,\mathcal F,P)$ is a probability space, $\mathcal F^t$ is a filtration on $(\Omega,\mathcal F,P)$,
(ii) W(t) is an l-dimensional $\mathcal F^t$ standard Wiener process,
(iii) for almost every t, u(t) and ρ(t) are $\mathcal F^t$-measurable,
(iv) $u \in L^4(\Omega,\mathcal F,P,L^\infty(0,T,H)) \cap L^2(\Omega,\mathcal F,P,L^2(0,T,V))$, $\rho \in L^\infty(\Omega,\mathcal F,P,L^\infty(0,T,L^\infty(D)))$,
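(v) for any $\varphi\in V$, $\psi\in H^1(D)$
\[ \int_D (\rho u)(t)\varphi\,dx - \int_0^t\!\!\int_D \rho uu\nabla\varphi\,dx\,dt + \mu\int_0^t\!\!\int_D \nabla u\cdot\nabla\varphi\,dx\,dt = \int_D \rho_0u_0\varphi\,dx + \int_0^t\!\!\int_D \rho f(t,u)\varphi\,dx\,dt + \int_0^t\!\!\int_D \rho g(t,u)\varphi\,dx\,dW, \tag{9} \]
\[ \int_D \rho(t)\psi\,dx - \int_D \rho_0\psi\,dx - \int_0^t\!\!\int_D \rho u\nabla\psi\,dx\,dt = 0, \tag{10} \]
\[ \rho(0) = \rho_0, \]
and
\[ \int_D \rho(0)u(0)\varphi\,dx = \int_D \rho_0u_0\varphi\,dx \tag{11} \]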
almost surely and for all t ∈ [0, T].
Our main result is
Theorem 9. Let the above conditions on f and g be satisfied and assume that $u_0\in H$, $\rho_0\in L^\infty(D)$, $\rho_0\ge 0$. Then there exists a solution of problem (1)–(5) in the sense of Definition 8.
Remark 10. We emphasize that the initial conditions (5) are understood in the sense of (11). Under the above estimates satisfied by u and ρ, and the integral identities (9) and (10), it can be shown as in the deterministic case ([35, 54, Chap. 2]) that (11) holds almost surely. [54, Proposition 13, p. 1110] shows that conditions (11) are equivalent to $\Pi(\rho_0(u(0)-u_0)) = \nabla q$, where Π is the Leray projector and $q\in H^1(D)$. Therefore, unless ρ(0) is constant, this condition is weaker than the one usually required, $\rho_0(u(0)-u_0) = 0$. This means the velocity fields u(0) and $u_0$ are equal outside the vacuum.
3. Semi-Galerkin Approximation and A Priori Estimates
3.1. The semi-Galerkin scheme
In this section, we introduce a semi-Galerkin approximation following [1, 24, 33, 54]. We obtain key a priori estimates for the approximating sequences of the presumed solutions of our problem. Let A be the Stokes operator with domain $D(A) = H^2\cap V$. We consider an orthonormal basis of D(A) consisting of the eigenvectors $w^1,\dots,w^m,\dots$ of A. We denote the span of $w^1,\dots,w^m$ by $V^m$. On the probability space $(\bar\Omega,\bar{\mathcal F},\bar P)$ with a
given l-dimensional Wiener process $\bar W$, we look for the pair of sequences $(\rho^m,u^m)$ ($u^m$ is sought as a linear combination of $w^1,\dots,w^m$, as will be made precise below) satisfying the integral equation
\[ \int_D (\rho^m du^m)(t)v\,dx + \int_D \rho^m u^m\nabla u^m v\,dx\,dt + \mu\int_D \nabla u^m\cdot\nabla v\,dx\,dt = \int_D \rho^m f(t,u^m)v\,dx\,dt + \int_D \rho^m g(t,u^m)v\,dx\,d\bar W, \tag{12} \]
for all $v\in V^m$ and
\[ \frac{\partial\rho^m}{\partial t} + (u^m\cdot\nabla)\rho^m = 0 \quad \text{in } Q_T, \tag{13} \]
\[ u^m(0) = u_0^m, \quad \rho^m(0) = \rho_0^m \quad \text{in } D. \tag{14} \]
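We assume that
\[ u_0^m\in V^m, \quad u_0^m\to u_0 \quad \text{in } (L^2(D))^3, \tag{15} \]
\[ \rho_0^m\in C^1(\bar D), \quad \rho_0^m\to\rho_0 \quad \text{in } L^\infty(D) \text{ weakly-star}, \tag{16} \]
\[ \alpha_m = \frac1m + \inf_D\rho_0 \le \rho_0^m \le \frac1m + \sup_D\rho_0 = \beta_m. \tag{17} \]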
In solving (13) with the second initial condition in (14), we assume that $u^m$ exists and let $y^m(\tau,t,x)$ be the flow of $u^m(\cdot,\cdot)$; that is, $y^m$ is the solution of the Cauchy problem
\[ \frac{dy(\tau,t,x)}{d\tau} = u^m(\tau,y(\tau,t,x)), \quad y(\tau,t,x)|_{\tau=t} = x. \tag{18} \]
By the method of characteristics, we have the representation
\[ \rho^m(t,x) = \rho_0^m(y^m(0,t,x)) \tag{19} \]
for the requested solution. This implies that
\[ 0 < \alpha_m \le \rho^m(t,x) \le \beta_m. \tag{20} \]
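The representation (19) and the bounds (20) can be illustrated numerically. The following is a minimal sketch and not the construction used in the proof: it assumes a hypothetical divergence-free velocity field in the plane (a rigid rotation, which ignores the boundary condition (4)) and a hypothetical initial density. The names `velocity`, `flow`, `rho0`, `rho`, `ALPHA` and `BETA` are illustrative only.

```python
import math


def velocity(tau, y):
    """Hypothetical divergence-free field: rigid rotation u(y) = (-y2, y1)."""
    return (-y[1], y[0])


def flow(tau_end, t, x, steps=200):
    """Integrate dy/dtau = u(tau, y) from tau = t (where y = x) to tau = tau_end with RK4."""
    h = (tau_end - t) / steps
    y, tau = x, t
    for _ in range(steps):
        k1 = velocity(tau, y)
        k2 = velocity(tau + h / 2, (y[0] + h / 2 * k1[0], y[1] + h / 2 * k1[1]))
        k3 = velocity(tau + h / 2, (y[0] + h / 2 * k2[0], y[1] + h / 2 * k2[1]))
        k4 = velocity(tau + h, (y[0] + h * k3[0], y[1] + h * k3[1]))
        y = (y[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             y[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
        tau += h
    return y


ALPHA, BETA = 0.5, 1.5  # hypothetical lower and upper bounds of the initial density


def rho0(x):
    """Hypothetical initial density with ALPHA <= rho0 <= BETA."""
    return 1.0 + 0.5 * math.tanh(x[0])


def rho(t, x):
    """Transported density rho(t, x) = rho0(y(0, t, x)), as in the representation (19)."""
    return rho0(flow(0.0, t, x))


t, x = 0.7, (0.3, -1.2)
print(rho(t, x), ALPHA <= rho(t, x) <= BETA)
```

Because the density is only evaluated at the foot of the characteristic, its values never leave the range of the initial density, which is the content of (20).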
We note that $\rho^m$ is a random function through the relations (18) and (19), which is bounded above and below by deterministic values in (20). For the existence of a solution $u^m$ to (12), we substitute the function $\rho^m$ from (19) in (12) and look for $u^m$ in the form of the expansion
\[ u^m = \sum_{k=1}^m \varphi_k^m(t)w^k(x). \tag{21} \]
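Substituting $v = w^1,\dots,w^m$ successively into (12), we obtain a system of ordinary stochastic differential equations for the coefficients $\varphi_k^m(t)$:
\[ \sum_{k=1}^m \Big(\int_D \rho^m w^k w^l\,dx\Big)d\varphi_k^m(t) + \sum_{j,k=1}^m \int_D \rho^m (w^j\varphi_j^m\cdot\nabla)w^k\varphi_k^m\, w^l\,dx\,dt + \mu\sum_{j=1}^m \int_D \varphi_j^m(t)\nabla w^j\nabla w^l\,dx\,dt - \int_D \rho^m f\Big(t,\sum_{j=1}^m w^j\varphi_j^m\Big)w^l\,dx\,dt = \int_D \rho^m g\Big(t,\sum_{j=1}^m w^j\varphi_j^m(t)\Big)w^l\,dx\,d\bar W, \quad l = 1,\dots,m. \tag{22} \]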
The matrix
\[ \Big(\int_D \rho^m w^k w^l\,dx\Big)_{k,l=1}^m \]
is non-degenerate since the family $\{\sqrt{\rho^m}\,w^k\}_{k=1,\dots,m}$ is free; in view of (20), $\rho^m > 0$. Thus (22) can be reduced to the canonical form
\[ d\varphi_l^m(t) + F_l^m(t,\varphi_1^m,\dots,\varphi_m^m)\,dt = G_l^m(t,\varphi_1^m,\dots,\varphi_m^m)\,d\bar W, \tag{23} \]
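The non-degeneracy of the weighted mass matrix can be checked on a toy example. The sketch below is only an illustration under stated assumptions: it replaces the Stokes eigenfunctions by a hypothetical one-dimensional sine basis on (0, 1) and uses a hypothetical density bounded between 0.5 and 1.5; `gram_matrix` and `cholesky` are illustrative helper names. A successful Cholesky factorization certifies that the matrix is symmetric positive definite, hence invertible.

```python
import math


def gram_matrix(rho, basis, n_quad=2000):
    """Approximate M[k][l] = integral over (0, 1) of rho * w_k * w_l by the midpoint rule."""
    h = 1.0 / n_quad
    xs = [(i + 0.5) * h for i in range(n_quad)]
    m = len(basis)
    return [[h * sum(rho(x) * basis[k](x) * basis[l](x) for x in xs)
             for l in range(m)] for k in range(m)]


def cholesky(a):
    """Return the lower Cholesky factor of a; raise ValueError if a is not positive definite."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][i] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


# Hypothetical data: orthonormal sine basis and a density with 0.5 <= rho <= 1.5.
basis = [lambda x, k=k: math.sqrt(2.0) * math.sin(k * math.pi * x) for k in (1, 2, 3)]
rho = lambda x: 1.0 + 0.5 * math.cos(2.0 * math.pi * x)

M = gram_matrix(rho, basis)
L = cholesky(M)  # succeeds because rho > 0 and the basis functions are linearly independent
print([round(L[i][i], 6) for i in range(3)])
```

Once the mass matrix is invertible, multiplying the system by its inverse isolates each $d\varphi_l^m$, which is the reduction to the canonical form described above.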
with the initial conditions
\[ \varphi_l^m(0) = \varphi_{l0}^m, \tag{24} \]
where $\varphi_{l0}^m$ are the coefficients in the expansion
\[ u_0^m = \sum_{k=1}^m \varphi_{k0}^m w^k. \]
In view of the conditions on f and g, the functions $F_l^m$ and $G_l^m$ are continuous in their variables $t, \varphi_1^m, \dots, \varphi_m^m$. Thus, thanks to an existence result for systems of stochastic ordinary differential equations due to Skorokhod [56, Theorem 2, Chap. 5], a local solution of (23) exists on an interval $[0, T_m]$. Therefore, for any $t\in[0,T_m]$, the representation (21) holds. The existence over the whole interval [0, T] will follow from uniform a priori estimates in the next subsection.
3.2. The a priori estimates
We now proceed to derive the needed a priori estimates. Substituting $v = w^k$ into (12), multiplying the resulting relation by $\varphi_k^m(t)$ and summing over k = 1, ..., m, we get
\[ \int_D (u^m\rho^m du^m)(t)\,dx + \int_D \rho^m u^m u^m\nabla u^m\,dx\,dt + \mu\int_D \nabla u^m\cdot\nabla u^m\,dx\,dt = \int_D \rho^m f(t,u^m)u^m\,dx\,dt + \int_D \rho^m g(t,u^m)u^m\,dx\,d\bar W. \tag{25} \]
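We introduce the stopping times
\[ \tau_N = \begin{cases} \inf\{t>0 : \|\sqrt{\rho^m(t)}\,u^m(t)\|_{L^2(D)} \ge N\}, & \text{if } \{\bar\omega\in\bar\Omega : \|\sqrt{\rho^m(t)}\,u^m(t)\|_{L^2(D)} \ge N\} \ne \emptyset, \\ \infty, & \text{if } \{\bar\omega\in\bar\Omega : \|\sqrt{\rho^m(t)}\,u^m(t)\|_{L^2(D)} \ge N\} = \emptyset. \end{cases} \]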
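Applying Itô's formula to
\[ \int_D \rho^m u^m u^m\,dx, \]
we deduce from (25) that
\[ d\int_D |\sqrt{\rho^m}\,u^m|^2\,dx = \int_D u^m u^m\frac{\partial\rho^m(s)}{\partial s}\,dx\,ds - 2\mu\int_D u^m Au^m\,dx\,ds - 2\int_D \rho^m u^m(u^m\cdot\nabla)u^m\,dx\,ds + 2\int_D \rho^m u^m[f(s,u^m)\,ds + g(s,u^m)\,d\bar W]\,dx + \int_D |\sqrt{\rho^m}\,g(s,u^m)|^2\,dx\,ds, \tag{26} \]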
where $s\in[0,t\wedge\tau_N]$, $t\in[0,T_m]$, $t\wedge\tau_N = \min\{t,\tau_N\}$. We have
\[ \int_D \operatorname{div}(u^m u^m\rho^m u^m)\,dx = \int_D [u^m u^m\operatorname{div}(\rho^m u^m) + \rho^m u^m\nabla(u^m u^m)]\,dx = \int_D [u^m u^m(u^m\cdot\nabla)\rho^m + 2\rho^m u^m(u^m\cdot\nabla)u^m]\,dx, \]
where in the last step we made use of the divergence freeness of $u^m$. The left-hand side is equal to zero in view of the vanishing of $u^m$ on (0, T) × ∂D. Hence from (13), we have
\[ \int_D u^m u^m\frac{\partial\rho^m(s)}{\partial s}\,dx = -\int_D u^m u^m\,u^m\nabla\rho^m\,dx = 2\int_D u^m\rho^m(u^m\cdot\nabla)u^m\,dx. \tag{27} \]
Thus substituting the right-hand side of (27) into (26), we get for all $s\in[0,t\wedge\tau_N]$
\[ \|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)} + 2\mu\int_0^s \|u^m(r)\|_V^2\,dr \le \|\sqrt{\rho_0^m}\,u_0^m\|^2_{L^2(D)} + \int_0^s 2|\langle u^m,\rho^m f(r,u^m)\rangle|\,dr + \int_0^s \|\sqrt{\rho^m(r)}\,g(r,u^m)\|^2_{L^2(D)}\,dr + 2\int_0^s (u^m,\rho^m g(r,u^m))\,d\bar W. \tag{28} \]
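Taking the supremum on both sides of (28) over the interval $[0,t\wedge\tau_N]$, followed by the expectation, we have
\[ \bar E\sup_{0\le s\le t\wedge\tau_N}\|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)} + 2\mu\bar E\int_0^{t\wedge\tau_N}\|u^m(s)\|_V^2\,ds \le \bar E\|\sqrt{\rho_0^m}\,u_0^m\|^2_{L^2(D)} + \bar E\int_0^{t\wedge\tau_N} 2|\langle u^m,\rho^m f(s,u^m)\rangle|\,ds + \bar E\int_0^{t\wedge\tau_N}\|\sqrt{\rho^m(s)}\,g(s,u^m)\|^2_{L^2(D)}\,ds + 2\bar E\int_0^{t\wedge\tau_N}(u^m,\rho^m g(s,u^m))\,d\bar W. \tag{29} \]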
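We estimate the terms on the right-hand side of this inequality. By Young's inequality and the conditions on f, we have for any ε > 0
\[ \int_0^{t\wedge\tau_N} 2\langle u^m,\rho^m f(s,u^m)\rangle\,ds \le \varepsilon\int_0^{t\wedge\tau_N}\|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)}\,ds + C_\varepsilon\int_0^{t\wedge\tau_N}\|\sqrt{\rho^m(s)}\,f(s,u^m)\|^2_{L^2(D)}\,ds \le C\int_0^{t\wedge\tau_N}\|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)}\,ds + C. \tag{30} \]
Similarly, in view of the conditions on g, we have
\[ \int_0^{t\wedge\tau_N}\|\sqrt{\rho^m(s)}\,g(s,u^m)\|^2_{L^2(D)}\,ds \le C\int_0^{t\wedge\tau_N}\|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)}\,ds + C. \tag{31} \]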
We now estimate the stochastic integral in (28). We have for any ε > 0, s m m m ¯ ¯ E sup 2(ρ (s)g(s, u (s)), u (s))dW 0≤s≤t∧τN
0
≤ C E¯
t∧τN
t∧τN
0
2
1/2
ρ (s)L∞ (D) (1 + u (s)L2 (D) ) ρm (s)um (s)2L2 (D) ds
sup
m
0≤s≤t∧τN
¯ ≤ εE
m
(ρ (s)g(s, u (s)), u (s)) ds
×
m
0
≤ C E¯ ≤ C E¯
m
t∧τN 0
0≤s≤t∧τN
t∧τN 0
1/2
ρm (s)um (s)L2 (D) 2
1/2
ρ (s)L∞ (D) (1 + u (s)L2 (D) ) ds
sup
¯ + CE
2
m
m
m
ρm (s)um (s)2L2 (D) (1 + ρm (s)um (s)2L2 (D) )ds.
(32)
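Substituting the inequalities (30)–(32) into (29), we get for sufficiently small ε > 0
\[ \bar E\sup_{0\le s\le t\wedge\tau_N}\|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)} + 2\mu\bar E\int_0^{t\wedge\tau_N}\|u^m(s)\|_V^2\,ds \le \bar E\|\sqrt{\rho_0^m}\,u_0^m\|^2_{L^2(D)} + C\bar E\int_0^{t\wedge\tau_N}(1+\|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)})\,ds. \]
In view of Gronwall's inequality, it follows that
\[ \bar E\sup_{0\le s\le t\wedge\tau_N}\|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)} + 2\mu\bar E\int_0^{t\wedge\tau_N}\|u^m(s)\|_V^2\,ds \le C. \]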
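The Gronwall step can be made concrete with a scalar model. This is a sketch under simplifying assumptions, not the argument of the paper: if a nonnegative function satisfies $y(t) \le a + c\int_0^t (1+y(s))\,ds$, then $y(t) \le (1+a)e^{ct} - 1$. The code checks this bound on the extremal case $y' = c(1+y)$, discretized by explicit Euler; the constants `a` and `c` and the function names are hypothetical.

```python
import math


def euler_extremal(a, c, t_end, steps):
    """Explicit Euler for y' = c * (1 + y), y(0) = a: the extremal case of the integral inequality."""
    h = t_end / steps
    ys = [a]
    for _ in range(steps):
        ys.append(ys[-1] + h * c * (1.0 + ys[-1]))
    return ys


def gronwall_bound(a, c, t):
    """Gronwall bound (1 + a) * exp(c * t) - 1, valid when y(t) <= a + c * int_0^t (1 + y(s)) ds."""
    return (1.0 + a) * math.exp(c * t) - 1.0


a, c, t_end, steps = 2.0, 3.0, 1.0, 1000
ys = euler_extremal(a, c, t_end, steps)
h = t_end / steps
print(all(ys[n] <= gronwall_bound(a, c, n * h) + 1e-9 for n in range(steps + 1)))
```

The bound depends only on the initial value, the constant c and the time horizon, which is why the resulting constant in the estimate above does not depend on N or m.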
As N → ∞, $t\wedge\tau_N\to t$. Thus passing to the limit in this inequality, we find that
\[ \bar E\sup_{0\le s\le t}\|\sqrt{\rho^m(s)}\,u^m(s)\|^2_{L^2(D)} + 2\mu\bar E\int_0^t \|u^m(s)\|_V^2\,ds \le C, \quad \forall t\in[0,T_m]. \tag{33} \]
Since the constant C is independent of m, we have Tm = T . Applying Ito’s formula to Eq. (26) with p ≥ 1, we get p−2 m d ρm (t)um (t)pL2 (D) + pµ ρm (t)um (t)L (t)2V dt 2 (D) u =
p m p−2 m m ρ (t)um (t)L , ρ f (t, um ) + ρm (t)g(t, um )2L2 (D) }dt 2 (D) {2u 2 p−2 m m ¯ + p ρm (t)um (t)L , ρ g(t, um ))dW 2 (D) (u p p p−4 m m + − 1 ρm (t)um (t)L , ρ g(t, um ))2 dt, t ∈ [0, T ]. 2 (D) (u 2 2
Integrating this equation over [0, t] and squaring the resulting equation we get 2 ρm (t)um (t)2p L2 (D) + (pµ)
0
t
p−2 m ρm (s)um (s)L (s)2V ds 2 (D) u
m 2p ≤ C{ ρm 0 u0 L2 (D) + I1 + I2 + I3 + I4 },
(34)
where I1 =
0
I2 =
t
0
I4 =
t
0
I3 =
t
0
t
p−2 m ρm (s)um (s)L (s), ρm (s)f (s, um (s))ds 2 (D) u
2
2 ,
p−2 ρm (s)um (s)L ρm (s)g(s, um (s))2L2 (D) ds 2 (D)
p−2 m ¯ ρm (s)um (s)L (s), ρm (s)g(s, um (s)))dW 2 (D) (u
p−4 m ρm (s)um (s)L (s), ρm (s)g(s, um (s)))2 ds 2 (D) (u
2 , 2 ,
2 .
The following inequalities readily follow t 2 p−2 I1 + I2 ≤ ρm (s)um (s)L ρm (s)um (s)2L2 (D) )ds 2 (D) (1 + 0
≤C I4 =
t
0 t
0
≤C
ρm (s)um (s)2p L2 (D) )ds
p−4 ρm (s)um (s)L ρm (s)um (s)4L2 (D) )ds 2 (D) (1 +
t
0
(1 +
(1 +
2
ρm (s)um (s)2p L2 (D) )ds.
For the estimation of I4 , we use the Martingale inequality t 2 p−2 m m m E¯ sup ρm (s)um (s)L (u (s), ρ (s)g(s, u (s)))dW 2 (D) 0≤t≤T 0
¯ ≤ E
T
2p−4 ρm (s)um (s)L ρm (s)um (s)4L2 (D) )ds 2 (D) (1 +
0
¯ ≤ E
2p−4 m ρm (s)um (s)L (s), ρm (s)g(s, um (s)))2 ds 2 (D) (u
0
¯ ≤ E
T
T
0
(1 +
ρm (s)um (s)2p L2 (D) )ds.
In view of these estimates and (34), making use of Gronwall's inequality, we obtain
\[ \bar E\sup_{0\le t\le T}\|\sqrt{\rho^m(t)}\,u^m(t)\|^{2p}_{L^2(D)} \le C, \quad \forall p\ge 1. \tag{35} \]
Raising both sides of (28) to the power p ≥ 1, and using the above inequality (35), we also get along the previous lines
\[ \bar E\Big(\int_0^T \|u^m(s)\|_V^2\,ds\Big)^p \le C. \tag{36} \]
Our next task is to estimate some increments in time of $u^m$ and $\rho^m$ in the space V. But before that let us make a few remarks. In view of estimate (35), for any p ≥ 1, and the fact that $\rho^m\in L^\infty(0,T,L^\infty(D))$,
\[ \rho^m u^m \in L^{2p}(\bar\Omega,\bar{\mathcal F},\bar P,L^\infty(0,T,(L^2(D))^3)). \tag{37} \]
Thus
\[ \nabla(\rho^m u^m) \in L^{2p}(\bar\Omega,\bar{\mathcal F},\bar P,L^\infty(0,T,H^{-1}(D))) \]
and by (13), it follows that
\[ \frac{\partial\rho^m}{\partial t} \in L^{2p}(\bar\Omega,\bar{\mathcal F},\bar P,L^\infty(0,T,H^{-1}(D))). \tag{38} \]
Also by (36), for all p ≥ 1
\[ u^m \in L^p(\bar\Omega,\bar{\mathcal F},\bar P,L^2(0,T,V)). \]
Thus in view of the Sobolev embedding $V\to(L^6(D))^3$ we have
\[ u^m \in L^p(\bar\Omega,\bar{\mathcal F},\bar P,L^2(0,T,(L^6(D))^3)), \tag{39} \]
and
\[ \rho^m u^m \in L^p(\bar\Omega,\bar{\mathcal F},\bar P,L^2(0,T,(L^6(D))^3)). \tag{40} \]
Recall the following result due to Riesz and Thorin (cf. [6, Theorem 1.1.1]).
Lemma 11. Let $\mathcal T$ be a linear operator from $L^{p_1}(0,T)$ into $L^{p_2}(D)$ and from $L^{q_1}(0,T)$ into $L^{q_2}(D)$ with $q_1\ge p_1$ and $q_2\le p_2$. Then for any s ∈ (0, 1), $\mathcal T$ maps $L^{r_1}(0,T)$ into $L^{r_2}(D)$ with
\[ r_1 = \frac{1}{s/p_1 + (1-s)/q_1}, \quad r_2 = \frac{1}{s/p_2 + (1-s)/q_2}. \]
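As a quick check of the interpolation exponents, the sketch below evaluates the convex-combination formulas $1/r_1 = s/p_1 + (1-s)/q_1$ and $1/r_2 = s/p_2 + (1-s)/q_2$ of the Riesz–Thorin theorem for the endpoint pairs (2, 6) and (∞, 2) with s = 3/4, the values applied to (37) and (40). The helper names `inv` and `interpolate` are illustrative.

```python
from fractions import Fraction

INF = float("inf")


def inv(p):
    """Return 1/p as an exact fraction, with the convention 1/infinity = 0."""
    return Fraction(0) if p == INF else Fraction(1) / Fraction(p)


def interpolate(p, q, s):
    """Return the exponent r with 1/r = s/p + (1 - s)/q (INF when 1/r = 0)."""
    inv_r = s * inv(p) + (1 - s) * inv(q)
    return INF if inv_r == 0 else 1 / inv_r


s = Fraction(3, 4)
r1 = interpolate(2, INF, s)  # time exponent: between L^2(0,T) and L^infinity(0,T)
r2 = interpolate(6, 2, s)    # space exponent: between L^6(D) and L^2(D)
print(r1, r2)
```

The result is the pair of exponents 8/3 in time and 4 in space.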
Applying this lemma with $p_1 = 2$, $p_2 = 6$, $q_1 = \infty$, $q_2 = 2$ and s = 3/4, we get from (37) and (40) that
\[ u^m,\ \rho^m u^m \in L^p(\bar\Omega,\bar{\mathcal F},\bar P,L^{8/3}(0,T,(L^4(D))^3)); \tag{41} \]
where we have also used the lemma with respect to $L^{2p}(\bar\Omega,\bar{\mathcal F},\bar P)\to X$ and $L^p(\bar\Omega,\bar{\mathcal F},\bar P)\to X$. Next we have
\[ \rho^m u^m u^m \in L^p(\bar\Omega,\bar{\mathcal F},\bar P,L^{4/3}(0,T,(L^2(D))^9)), \tag{42} \]
and thus
\[ \nabla(\rho^m u^m u^m) \in L^p(\bar\Omega,\bar{\mathcal F},\bar P,L^{4/3}(0,T,(H^{-1}(D))^3)). \tag{43} \]
Indeed applying Holder’s inequality we have for k = 1, 2, 3 2 4/6 T m m m (ρ uk uk ) dx dt 0
D
≤
T
0
≤
D
T
0
≤ C
4/12 4/12 4 m 4 (ρm um ) dx (u ) dx dt k k
0
D T
D
8/12 1/2 m m 4 (ρ uk ) dx dt
T
0
8/12
D
4 (ρm um k ) dx
D T
4 (um k ) dx
dt + 0
1/2
8/12
D
dt
8/12
4 (um k ) dx
dt
.
The integrals in the right-hand side are bounded for a.e. ω in view of the estimates (41). The sought estimates thus follow. Recalling the definition of the norm of V , we have (ρm um )(t + θ) − (ρm um )(t)2V = sup [(ρm um )(t + θ) − (ρm um )(t)]v dx. v∈V :vV =1
D
Thus owing to the integral identity (12), we have
T −θ
¯ E 0
(ρm um )(t + θ) − (ρm um )(t)2V dt
¯ = E
T −θ
0
¯ ≤ E
2 t+θ d(ρm um )ds dt t V
T −θ
[R1 (t) + R2 (t) + R3 (t) + R4 (t)]dt
0
(44)
where 2 t+θ m m m ∇(ρ u u )ds , R1 (t) = t 2 t+θ m m R3 (t) = ρ f (s, u )ds , t
V
V
2 t+θ m R2 (t) = µ∆u ds , t V
2 t+θ m m ¯ R4 (t) = ρ g(s, u )dW . t V
We have
1/2 R1
= sup D
≤C
t+θ
t+θ
∇(ρm um um )ds ϕ(x)dx : ϕ ∈ V, ϕV = 1
t
ρm um um L2 (D) ds.
t
Then in view of (42) ¯ E 0
T −θ
¯ R1 (t)dt ≤ Cθ1/2 E
0
≤ Cθ1/2 .
T −θ
t
t+θ
3/4 4/3 ρm um um L2 (D) dsdt
Next using (33), we get
T −θ
¯ E 0
R2 (t)dt ≤ E¯
T −θ
0
∇u L2 (D) ds m
dt
t
≤ θE¯
2
t+θ
T −θ
0
t+θ
t
∇um 2L2 (D) dsdt
≤ Cθ. Using the conditions on f and estimate (35), we have 2 T −θ T −θ t+θ m m ¯ ¯ E R3 (t)dt ≤ C E ρ (s)L∞ (D) (1 + u (s)L2 (D) )ds dt 0
0
t
¯ m 2 ∞ ≤ CθEρ L (0,T,L∞ (D))
T −θ
0
t
t+θ
(1 + um (s)2L2 (D) )dsdt
≤ Cθ. For the stochastic integral we use the martingale inequality. We have 2 T −θ t+θ m m ¯ ¯ E ρ g(s, u )dW dt t 0 ≤
0
≤
T −θ
T −θ
V
¯ E
t+θ
t
T −θ ¯ E ≤
t
≤ C E¯
0
T −θ
D t+θ
2 ¯ dt ρ g(s, u )ϕ(x)dx dW m
sup
T −θ ¯ E ≤
0
t
ϕ∈V :ϕV =1
0
sup ϕ∈V :ϕV =1
¯ E
0
t+θ
m
2 ρm g(s, um )ϕ(x)dx ds dt
t
D
m
m 2
[ρ g(s, u )] dx ds dt
D t+θ
ρm 2L∞ (D) g(s, um )2(L2 (D))×l ds dt
θ sup ρm 2L∞ (D) (1 + um 2H ) 0≤t≤T
≤ Cθ; at some steps we made use of Fubini’s theorem and the estimate (35). Combining the estimates that we’ve just derived with (44) we get the crucial estimate T −θ ¯ E (ρm um )(t + θ) − (ρm um )(t)2V dt ≤ Cθ1/2 . (45) 0
We also need to show that T −θ ¯ E um (t + θ) − um (t)2W −1 (D) dt ≤ Cθ1/2 . 3/2
0
(46)
We note that ρm (t + θ)(um (t + θ) − um (t)) = (ρm um )(t + θ) − (ρm um )(t)um (t)(ρm (t + θ) − ρm (t)). Let us estimate
T −θ
¯ E
(47)
um (t)(ρm (t + θ) − ρm (t))2W −1 (D) dt. 3/2
0
−1 We have, by (38), (33) and (6), × → W3/2 (D)) that T −θ um (t)(ρm (t + θ) − ρm (t))2W −1 (D) dt
(W21 (D)
W2−1 (D)
3/2
0
≤
T −θ 0
t+θ m 2 ∂ρ (s) m dsdt u (t)V ∂s t V
m 2 ∂ρ (s) ≤ Cθ2 ∂s ∞ L
(0,T,V
)
0
T
um (t)2V dt.
Taking mathematical expectation in this inequality and using (36) and (38), we get T −θ um (t)(ρm (t + θ) − ρm (t))2W −1 (D) dt ≤ Cθ2 . (48) 3/2
0
Combining (45), (48) and (47), we get T −θ ¯ E ρm (t + θ)(um (t + θ) − um (t))2W −1 (D) dt ≤ Cθ1/2 . 3/2
0
This implies (46). We are left with another key estimate on the function Ψm (t) = ρm (t)um (t)v dx D
for v ∈ V . We claim that ¯ m (t + h) − Ψm (t)C([0,T ]) ≤ ch1/4 . EΨ
(49)
We have from (12), ¯ m (t + h) − Ψm (t)| E|Ψ t+h t+h ¯ ≤ E¯ ρm um um ∇v dxds + µE ∇um ∇v dxds t t D D t+h t+h m m m m ¯ ¯ . ¯ +E ρ f (s, u )v dxds + E ρ g(s, u )v dxdW t t D D
In view of (42), we have t+h m m m ¯ sup E ρ u u ∇v dxds t∈[0,T −h] t D ¯ m um um L4/3 (0,T,(L2 (D))9 ) ≤ Ch1/4 . ≤ Ch1/4 ∇vH Eρ Next, using (36), we have 1/2 t+h t+h ¯ sup ∇um ∇v dxds ≤ Ch1/2 vV E um 2V E t∈[0,T −h] t t D ≤ Ch1/2 . By similar arguments, we have by (35) t+h ¯ ¯ sup (1 + um (s)H ) E ρm f (s, um )v dxds ≤ Ch1/2 ρm L∞ (Q) vH E t s∈[t,t+h] D ≤ Ch1/2 . Finally using Martingale inequality we have t+h m m ¯ ¯ sup ρ g(s, u )v dxdW E t∈[0,T −h] t D 2 1/2 t+h ¯ ≤ E ρm g(s, um )v dx dt t D m ¯ ≤ Ehv H ρ L∞ (Q)
sup (1 + um (s)H ) s∈[t,t+h]
≤ Ch. Hence summarizing these estimates we arrive at (49). Furthermore, in view of (35), we have for any p ≥ 1 ¯ sup |Ψm (t)|p ≤ Cvp , E H t∈[0,T ]
¯ sup ρm um p 2 E L (D) ≤ C.
(50)
t∈[0,T ]
We now summarize our key estimates in this section. For that we introduce the spaces $X^k_{p,\mu_n,\nu_n}$ (1 ≤ p < ∞) (k = 1, 2) of random variables y such that
(i) For $X^1_{p,\mu_n,\nu_n}$
\[ \bar E\sup_{0\le t\le T}\|y(t)\|^{2p}_{L^2(D)} \le C, \quad \bar E\Big(\int_0^T \|y(s)\|_V^2\,ds\Big)^p \le C, \]
\[ \bar E\sup_n\frac{1}{\nu_n}\sup_{|\theta|\le\mu_n}\int_0^{T-\theta}\|y(t+\theta)-y(t)\|^2_{W^{-1}_{3/2}(D)}\,dt \le C; \]
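endowed with the norm
\[ \|y\|_{X^1_{p,\mu_n,\nu_n}} = \Big(\bar E\sup_{0\le t\le T}\|y(t)\|^{2p}_{L^2(D)}\Big)^{1/2p} + \Big(\bar E\Big(\int_0^T \|y(s)\|_V^2\,ds\Big)^{p/2}\Big)^{2/p} + \bar E\sup_n\frac{1}{\nu_n}\Big(\sup_{|\theta|\le\mu_n}\int_0^{T-\theta}\|y(t+\theta)-y(t)\|^2_{W^{-1}_{3/2}(D)}\,dt\Big)^{1/2}, \]
$X^1_{p,\mu_n,\nu_n}$ is a Banach space.
(ii) For $X^2_{p,\mu_n,\nu_n}$
\[ \bar E\|y\|^p_{L^{8/3}(0,T,(L^4(D))^3)} \le C, \quad \bar E\sup_n\frac{1}{\nu_n}\sup_{|\theta|\le\mu_n}\int_0^{T-\theta}\|y(t+\theta)-y(t)\|^2_{V'}\,dt \le C; \]
endowed with the norm
\[ \|y\|_{X^2_{p,\mu_n,\nu_n}} = \big(\bar E\|y\|^p_{L^{8/3}(0,T,(L^4(D))^3)}\big)^{1/p} + \bar E\sup_n\frac{1}{\nu_n}\Big(\sup_{|\theta|\le\mu_n}\int_0^{T-\theta}\|y(t+\theta)-y(t)\|^2_{V'}\,dt\Big)^{1/2}, \]
$X^2_{p,\mu_n,\nu_n}$ is a Banach space.
We define $X^3_q$ (q is any positive number) as the space of random variables y such that
\[ \bar E\|y\|^q_{L^\infty(0,T,L^\infty(D))} \le C, \quad \bar E\Big\|\frac{\partial y}{\partial t}\Big\|^q_{L^\infty(0,T,H^{-1}(D))} \le C; \]
endowed with the norm
\[ \|y\|_{X^3_q} = \big(\bar E\|y\|^q_{L^\infty(0,T,L^\infty(D))}\big)^{1/q} + \Big(\bar E\Big\|\frac{\partial y}{\partial t}\Big\|^q_{L^\infty(0,T,H^{-1}(D))}\Big)^{1/q}, \]
$X^3_q$ is a Banach space.
Finally we have the space $X^4_{p,\mu_n,\nu_n}$ of random variables y such that
\[ \bar E\|y\|^p_{L^\infty(0,T)} \le C, \quad \sup_n\frac{1}{\nu_n}\sup_{|\theta|\le\mu_n}\bar E\|y(t+\theta)-y(t)\|_{C[0,T]} \le C, \]
which endowed with the norm
\[ \|y\|_{X^4_{p,\mu_n,\nu_n}} = \big(\bar E\|y\|^p_{L^\infty(0,T)}\big)^{1/p} + \sup_n\frac{1}{\nu_n}\sup_{|\theta|\le\mu_n}\bar E\|y(t+\theta)-y(t)\|_{C[0,T]} \]
is a Banach space.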
Combining the estimates (20), (35), (36), (38), (41), (45), (46), (49) and (50), we have
Theorem 12. For any p ≥ 1 and for $\mu_n$, $\nu_n$ such that the series
\[ \sum_{n=1}^\infty \frac{\mu_n^{1/4}}{\nu_n} \]
converges, the sequences $u^m$, $\rho^m u^m$, $\rho^m$ and $\Psi^m$ are bounded in $X^1_{p,\mu_n,\nu_n}$, $X^2_{p,\mu_n,\nu_n}$, $X^3_q$ and $X^4_{p,\mu_n,\nu_n}$ for any n, respectively.
4. Tightness Property of Probability Measures Induced by Galerkin Solutions
We may rewrite Lemma 4 in the following more convenient form adapted to our situation as in [4]. For any sequences $\mu_n$, $\nu_n$ which converge to zero as n → ∞, and any $1\le p_k, q_k\le\infty$ (k = 1, 2, 3, 4), the set $Y^k_{\mu_n,\nu_n}$ of functions $y\in L^{q_k}(0,T,X_k)\cap N^{p_k}_{\mu_n,\nu_n}(0,T,Y_k)$, where $N^{p_k}_{\mu_n,\nu_n}(0,T,Y_k)$ is the set
\[ \Big\{ v\in L^{p_k}(0,T,Y_k) : \sup_n\frac{1}{\nu_n}\sup_{|\theta|\le\mu_n}\|v(t+\theta)-v(t)\|_{L^{p_k}(0,T-\theta,Y_k)} < \infty \Big\}, \]
is relatively compact in $L^{p_k}(0,T,B_k)$; $X_k$, $B_k$ and $Y_k$ play respectively the role of X, B and Y in Lemma 4.
Let $Y^1_{\mu_n,\nu_n}$ be the space with $X_1 = V$, $Y_1 = W^{-1}_{3/2}(D)$, $q_1 = 2$, $p_1 = 2$ and let $B_1 = L^2(D)$. Let $Y^2_{\mu_n,\nu_n}$ be the space with $X_2 = L^4(D)$, $Y_2 = V'$, $q_2 = 8/3$, $p_2 = 8/3$ and $B_2 = W_2^{-\theta}(D)$ (0 < θ < 1), $W_2^{-\theta}(D)$ being the interpolation space $[L^2(D) = W_2^0(D), H^{-1}(D)]_\theta$; we refer to [34] for the needed information. Also by [34, Theorem 16.1, Chap. 1], we have that $W_2^{-\theta}(D)$ is compactly embedded into $H^{-1}(D)$. Let $Y^3_{\mu_n,\nu_n}$ be the space with $X_3 = L^\infty(D)$, $Y_3 = H^{-1}(D)$, $q_3 = \infty$, $p_3 = \infty$ and let $B_3 = W_\infty^{-1}(D)$. Let $Y^4_{\mu_n,\nu_n}$ be the space with $X_4 = B_4 = Y_4 = \mathbb{R}$, $p_4 = q_4 = \infty$.
Now we consider the set
\[ S = C(0,T,\mathbb{R}^l)\times\prod_{k=1}^4 L^{p_k}(0,T,B_k) \]
and $\mathcal B(S)$ the σ-algebra of the Borel sets of S. For each m, let Φ be the map
\[ \Phi : \bar\Omega\to S : \bar\omega\mapsto(\bar W(\bar\omega,\cdot), u^m(\bar\omega,\cdot), \rho^m u^m(\bar\omega,\cdot), \rho^m(\bar\omega,\cdot), \Psi^m(\bar\omega,\cdot)). \]
Since the solution is not unique in general, this map is multivalued. However, a selection can be made to suit our needs. Precise arguments can be found in [5]. So we make use of the map modulo a selection. For each m, we introduce a probability measure $\pi_m$ on $(S,\mathcal B(S))$ by
\[ \pi_m(A) = \bar P(\Phi^{-1}(A)) \quad \text{for all } A\in\mathcal B(S). \]
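The main result of this section is
Theorem 13. The family of probability measures $\{\pi_m : m\in\mathbb N\}$ is tight.
Proof. For ε > 0 we should find the compact subsets
\[ \Sigma_\varepsilon\subset C(0,T,\mathbb{R}^l), \quad Y_\varepsilon\subset\prod_{k=1}^4 L^{p_k}(0,T,B_k) \]
such that
\[ \bar P\{\bar\omega : \bar W(\bar\omega,\cdot)\notin\Sigma_\varepsilon\} \le \varepsilon/2, \tag{51} \]
\[ \bar P\{\bar\omega : (u^m(\bar\omega,\cdot), \rho^m u^m(\bar\omega,\cdot), \rho^m(\bar\omega,\cdot), \Psi^m(\bar\omega,\cdot))\notin Y_\varepsilon\} \le \varepsilon/2. \tag{52} \]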
The quest for $\Sigma_\varepsilon$ is made by taking account of some facts about the Wiener process, such as the formula
\[ E|B(t_2)-B(t_1)|^{2j} = (2j-1)!!\,(t_2-t_1)^j, \quad j = 1, 2, \dots. \tag{53} \]
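The moment formula for Brownian increments can be sanity-checked numerically. The sketch below assumes a scalar standard Brownian motion, for which the increment over a time span $t_2 - t_1$ is Gaussian with mean zero and variance $t_2 - t_1$, so that its 2j-th absolute moment is $(2j-1)!!\,(t_2-t_1)^j$ with $(2j-1)!! = 1\cdot3\cdots(2j-1)$. It compares this closed form with a midpoint-rule quadrature of the Gaussian integral; the function names and quadrature parameters are illustrative choices.

```python
import math


def double_factorial_odd(j):
    """Return (2j - 1)!! = 1 * 3 * 5 * ... * (2j - 1)."""
    result = 1
    for k in range(1, 2 * j, 2):
        result *= k
    return result


def gaussian_even_moment(j, variance, n_quad=20000, cutoff=12.0):
    """Midpoint-rule approximation of E|X|^(2j) for X ~ N(0, variance)."""
    sigma = math.sqrt(variance)
    a = cutoff * sigma                      # integrate over [-a, a]; the tails are negligible
    h = 2.0 * a / n_quad
    total = 0.0
    for i in range(n_quad):
        x = -a + (i + 0.5) * h
        total += x ** (2 * j) * math.exp(-x * x / (2.0 * variance))
    return total * h / math.sqrt(2.0 * math.pi * variance)


dt = 0.25  # hypothetical time increment t2 - t1
for j in (1, 2, 3):
    print(j, gaussian_even_moment(j, dt), double_factorial_odd(j) * dt ** j)
```

The case j = 2, where the fourth moment is $3(t_2-t_1)^2$, is the one that enters the tightness estimate for the Wiener process.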
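For a constant $L_\varepsilon$ depending on ε to be chosen later and $n\in\mathbb N$, we consider the set
\[ \Sigma_\varepsilon = \Big\{ w(\cdot)\in C(0,T,\mathbb{R}^l) : \sup\{n|w(t_2)-w(t_1)| : t_1,t_2\in[0,T],\ |t_2-t_1| < n^{-6}\} \le L_\varepsilon \Big\}. \]
The set $\Sigma_\varepsilon$ is relatively compact in $C(0,T,\mathbb{R}^l)$ by the Arzelà–Ascoli theorem. Making use of Markov's inequality
\[ P\{\bar\omega : \xi(\bar\omega)\ge\alpha\} \le \frac{1}{\alpha^k}E[|\xi(\bar\omega)|^k] \]
for a random variable ξ on $(\bar\Omega,\bar{\mathcal F},\bar P)$ and positive numbers α and k, we get
\[ \bar P\{\bar\omega : \bar W(\bar\omega,\cdot)\notin\Sigma_\varepsilon\} \le \bar P\Big[\bigcup_n\Big\{\bar\omega : \sup_{t_1,t_2\in[0,T],\,|t_2-t_1|<n^{-6}}|\bar W(t_2)-\bar W(t_1)| > L_\varepsilon/n\Big\}\Big] \]
\[ \le \sum_{n=1}^\infty\sum_{i=0}^{n^6-1}\Big(\frac{n}{L_\varepsilon}\Big)^4 E\sup_{iTn^{-6}\le t\le(i+1)Tn^{-6}}|\bar W(t)-\bar W(iTn^{-6})|^4 \]
\[ \le C\sum_{n=1}^\infty\Big(\frac{n}{L_\varepsilon}\Big)^4(Tn^{-6})^2n^6 = \frac{C}{L_\varepsilon^4}\sum_{n=1}^\infty\frac{1}{n^2}. \]
We choose
\[ L_\varepsilon^{4} = \frac{2C}{\varepsilon}\sum_{n=1}^\infty\frac{1}{n^2}, \]
so that the right-hand side equals ε/2, to get (51).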
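Next we choose $Y_\varepsilon$ as a ball of radius $M_\varepsilon$ in $Y^1_{\mu_n,\nu_n}\times Y^2_{\mu_n,\nu_n}\times Y^3_{\mu_n,\nu_n}\times Y^4_{\mu_n,\nu_n}$ centered at zero and with $\mu_n$, $\nu_n$ independent of ε, converging to zero and such that $\sum_n \nu_n^{-1}\mu_n^{1/4}$ converges. As remarked above, $Y_\varepsilon$ is a compact subset of
\[ \prod_{k=1}^4 L^{p_k}(0,T,B_k). \]
We have further
\[ \bar P\{\bar\omega : (u^m(\bar\omega,\cdot), \rho^m u^m(\bar\omega,\cdot), \rho^m(\bar\omega,\cdot), \Psi^m(\bar\omega,\cdot))\notin Y_\varepsilon\} \]
\[ \le \bar P\{\bar\omega : \|u^m\|_{Y^1_{\mu_n,\nu_n}} > M_\varepsilon\} + \bar P\{\bar\omega : \|\rho^m u^m\|_{Y^2_{\mu_n,\nu_n}} > M_\varepsilon\} + \bar P\{\bar\omega : \|\rho^m\|_{Y^3_{\mu_n,\nu_n}} > M_\varepsilon\} + \bar P\{\bar\omega : \|\Psi^m\|_{Y^4_{\mu_n,\nu_n}} > M_\varepsilon\} \]
\[ \le \frac{1}{M_\varepsilon}\big(\bar E\|u^m\|_{Y^1_{\mu_n,\nu_n}} + \bar E\|\rho^m u^m\|_{Y^2_{\mu_n,\nu_n}} + \bar E\|\rho^m\|_{Y^3_{\mu_n,\nu_n}} + \bar E\|\Psi^m\|_{Y^4_{\mu_n,\nu_n}}\big) \le \frac{C}{M_\varepsilon}. \]
Choosing $M_\varepsilon = 2C\varepsilon^{-1}$ we get (52). From (51) and (52), we have
\[ \bar P\{\bar\omega : \bar W(\bar\omega,\cdot)\in\Sigma_\varepsilon;\ (u^m(\bar\omega,\cdot), \rho^m u^m(\bar\omega,\cdot), \rho^m(\bar\omega,\cdot), \Psi^m(\bar\omega,\cdot))\in Y_\varepsilon\} \ge 1-\varepsilon. \]
This proves that
\[ \pi_m(\Sigma_\varepsilon\times Y_\varepsilon) \ge 1-\varepsilon, \quad \forall\varepsilon>0 \]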
and hence the theorem. In view of the just proven tightness of {πm } we have from Lemma 6 that there exists a subsequence {πmj } and a measure π such that πmj → π weakly. By Skorokhod’s Lemma 7, there exist a probability space (Ω, F, P ) and random variables (Wmj , umj , ρmj umj , ρmj , Ψmj ), (W, u, g, ρ, Ψ) on (Ω, F, P ) with values in S such that the probability law of (Wmj , umj , ρmj umj , ρmj , Ψmj ) is πmj ; hence {Wmj } is a sequence of l-dimensional Wiener processes. Furthermore (Wmj , umj , ρmj umj , ρmj , Ψmj ) → (W, u, g, ρ, Ψ) in S,
P -a.s.
(54)
and the probability law of (W, u, g, ρ, Ψ) is π. Set F t = σ{W (s), u(s), ρ(s)}s∈[0,t] . We show that W (t) is a F t -standard Wiener process. For this we use the following characterization of Wiener processes through their characteristic functions (see [21])
which stipulates that for any m ∈ N, 0 = t0 < t1 < · · · < tm and v0 , v1 , . . . , vm m E exp ivk [W (tk ) − W (tk−1 )] − iz0 W (t0 ) k=1
Nj 1 = exp − vk2 (tk − tk−1 ) . 2
(55)
k=1
(55) will follow if we can prove that for the conditional characteristic function we have 2 v h E[exp{iv[W (t + h) − W (t)]}/F t ] = exp − (56) 2 for all h > 0 and any v. Note that for any given σ-algebra F and random variables ˜ F˜ , P˜ ) on which the mathematical expectation X and Y on a probability space (Ω, is denoted by E, if X is F -measurable and E|Y |, E|XY | < ∞, E(XY /F ) = XE(Y /F ),
EE(Y /F ) = E(Y ),
that is E(XY ) = E(XE(Y /F )). Using this fact we see that (56) will be proved if for any continuous bounded functional Λt (W (·), u(·), ρ(·)) on S depending only on the values of W, u and ρ on the interval (0, t), we have E[exp{v[W (t + h) − W (t)]}Λt (W (·), u(·), ρ(·))] 2 z h = exp − EΛt (W (·), u(·), ρ(·)). 2
(57)
Since [Wmj (t + h) − Wmj (t)] are independent of Λt (Wmj , umj , ρmj ) and Wmj is a Wiener process E[exp{iz[Wmj (t + h) − Wmj (t)]}Λt (Wmj , umj , ρmj )] = E exp{iz[Wmj (t + h) − Wmj (t)]}EΛt (Wmj , umj , ρmj ) 2 z h = exp − EΛt (Wmj , umj , ρmj ). 2 In view of (54) and the continuity of Λt , we can pass to the limit in this equality and get (57). It can be shown that Wmj , umj , ρmj satisfy the approximating equations (12) and (25) with m replaced by mj . In particular div umj = 0,
(58)
\[ \frac{\partial\rho^{m_j}}{\partial t} + (u^{m_j}\cdot\nabla)\rho^{m_j} = 0, \tag{59} \]
\[ u^{m_j}(0) = u_0^{m_j}, \quad \rho^{m_j}(0) = \rho_0^{m_j}, \tag{60} \]
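\[ \int_D (\rho^{m_j}u^{m_j})(t)v\,dx - \int_0^t\!\!\int_D \rho^{m_j}u^{m_j}u^{m_j}\nabla v\,dx\,dt + \mu\int_0^t\!\!\int_D \nabla u^{m_j}\cdot\nabla v\,dx\,dt = \int_D \rho_0^{m_j}u_0^{m_j}v\,dx + \int_0^t\!\!\int_D \rho^{m_j}f(t,u^{m_j})v\,dx\,dt + \int_0^t\!\!\int_D \rho^{m_j}g(t,u^{m_j})v\,dx\,dW_{m_j}. \tag{61} \]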
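5. Passage to the Limit
In Theorem 12 let us take p = 2. We have that
\[ u^{m_j}\to u \quad \text{weakly-star in } L^4(\Omega,\mathcal F,P,L^\infty(0,T,H)), \tag{62} \]
\[ u^{m_j}\to u \quad \text{weakly in } L^2(\Omega,\mathcal F,P,L^2(0,T,V)). \tag{63} \]
By (35), (54) and Vitali's theorem, we have
\[ u^{m_j}\to u \quad \text{strongly in } L^2(\Omega,\mathcal F,P,L^2(0,T,H)). \tag{64} \]
Thus for fixed x,
\[ u^{m_j}\to u \quad \text{a.e. } (t,\omega) \text{ with respect to the measure } dt\otimes dP. \tag{65} \]
Next we have
\[ \rho^{m_j}u^{m_j}\to g \quad \text{weakly-star in } L^2(\Omega,\mathcal F,P,L^\infty(0,T,L^2(D))). \tag{66} \]
By (35) and the uniform boundedness of $\rho^{m_j}$,
\[ E\|\rho^{m_j}u^{m_j}\|^4_{L^\infty(0,T,L^2(D))} \le C. \]
This implies that
\[ E\|\rho^{m_j}u^{m_j}\|^4_{L^\infty(0,T,H^{-1/2}(D))} \le C. \]
This together with (54) and Vitali's theorem gives
\[ \rho^{m_j}u^{m_j}\to g \quad \text{strongly in } L^2(\Omega,\mathcal F,P,L^\infty(0,T,W_2^{-\theta}(D))). \tag{67} \]
We have that $\rho^{m_j}$ is bounded in $X^3_q$ for any q > 0. Taking q = 4 we get
\[ \rho^{m_j}\to\rho \quad \text{weakly-star in } L^4(\Omega,\mathcal F,P,L^\infty(0,T,L^\infty(D))) \tag{68} \]
and
\[ E\|\rho^{m_j}\|^4_{L^\infty(0,T,W_\infty^{-1}(D))} \le C. \]
This estimate combined with (54) and Vitali's theorem implies that
\[ \rho^{m_j}\to\rho \quad \text{strongly in } L^2(\Omega,\mathcal F,P,C(0,T,W_\infty^{-1}(D))). \tag{69} \]
(42) gives
\[ \rho^{m_j}u^{m_j}u^{m_j}\to h \quad \text{weakly in } L^2(\bar\Omega,\bar{\mathcal F},\bar P,L^{4/3}(0,T,(L^2(D))^9)). \]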
The product $W_\infty^{-1}(D)\times H^1(D)\to W_6^{-1}(D)$ is continuous. Thus (69) and (63) give
\[ \rho^{m_j}u^{m_j}\to\rho u \quad \text{weakly in } L^2(\Omega,\mathcal F,P,L^2(0,T,W_6^{-1}(D))). \tag{70} \]
And taking account of (67) we get
\[ g = \rho u. \tag{71} \]
Similarly, since the product $W_6^{-1}(D)\times H^1(D)\to W^{-1}_{3/2}(D)$ is continuous, we have from (67), (70) and (63) that
\[ \rho^{m_j}u^{m_j}u^{m_j}\to gu = \rho uu \quad \text{weakly in } L^2(\Omega,\mathcal F,P,L^2(0,T,W^{-1}_{3/2}(D))). \tag{72} \]
Next, in view of (67), (35), the conditions on f and Vitali's theorem we have
\[ f(\cdot,u^{m_j}(\cdot))\to f(\cdot,u(\cdot)) \quad \text{in } L^2(\Omega,\mathcal F,P,L^2(0,T,H)). \tag{73} \]
Similarly, owing to the conditions on g,
\[ g(\cdot,u^{m_j}(\cdot))\to g(\cdot,u(\cdot)) \quad \text{in } L^2(\Omega,\mathcal F,P,L^2(0,T,H^{\times l})). \tag{74} \]
Using this convergence with (69) and (54), we can show that
\[ \int_0^t \rho^{m_j}g(s,u^{m_j}(s))\,dW_{m_j}(s)\to\int_0^t \rho g(s,u(s))\,dW(s) \tag{75} \]
weakly in $L^2(\Omega,\mathcal F,P,L^2(D))$. We skip the details and instead refer to [50], where a similar situation is dealt with thoroughly. A key role is played by Lemma 2.
Next, in view of (54), $\Psi^{m_j}(\omega,\cdot)\to\Psi(\omega,\cdot)$ uniformly in C([0, T]), P-a.s. Hence, owing to (50) and Vitali's theorem, we get
\[ \Psi^{m_j}\to\Psi \quad \text{strongly in } L^1(\Omega,\mathcal F,P,C(0,T,\mathbb{R})). \]
Hence
\[ \Psi^{m_j}(0) = \int_D \rho^{m_j}(0)u^{m_j}(0)v\,dx\to\int_D \rho(0)u(0)v\,dx. \]
But
\[ \int_D \rho^{m_j}(0)u^{m_j}(0)v\,dx = \int_D \rho_0u_0v\,dx. \]
Thus
\[ \int_D \rho(0)u(0)v\,dx = \int_D \rho_0u_0v\,dx. \tag{76} \]
Also passing to the limit in (17), we get
\[ \inf_D\rho_0 \le \rho \le \sup_D\rho_0. \tag{77} \]
Combining all these convergences, we can pass to the limit in the weak formulation of problem (58)–(61) and obtain the claim of our main result.
Acknowledgments
This work is supported by the National Science Foundation under the agreement No. DMS-0635607 and by the National Research Foundation of South Africa. The results were obtained during my stay at the Institute for Advanced Study in fall of 2009. I thank the institute for providing excellent conditions of work. I thank Professor Ya. G. Sinai for stimulating discussions on the results of the paper and encouragement. Until the paper was completed, I was not aware of the work of Professor F. H. Yashima who informed me during the Panafrican Congress of Mathematicians in Yamoussoukro (Côte d'Ivoire) in August 2009. My sincere gratitude is due to him for sending his thesis [64]. I thank one of the reviewers for interesting comments that improved the paper.
References
[1] S. N. Antontsev, A. V. Kazhikhov and V. N. Monakhov, Boundary Value Problems in Mechanics of Nonhomogeneous Fluids, Studies in Mathematics and Its Applications, Vol. 22 (North-Holland Publishing Co., Amsterdam, 1990).
[2] A. Bensoussan, Some existence results for stochastic partial differential equations, in Stochastic Partial Differential Equations and Applications (Trento, 1990), Pitman Res. Notes Math. Ser., Vol. 268 (Longman Scientific and Technical, Harlow, UK, 1992), pp. 37–53.
[3] A. Bensoussan, Results on stochastic Navier–Stokes equations, in Control of Partial Differential Equations (Trento, 1993), Lecture Notes in Pure and Appl. Math., Vol. 165 (Dekker, New York, 1994), pp. 11–21.
[4] A. Bensoussan, Stochastic Navier–Stokes equations, Acta Appl. Math. 38 (1995) 267–304.
[5] A. Bensoussan and R. Temam, Équations stochastiques du type Navier–Stokes, J. Funct. Anal. 13 (1973) 195–222.
[6] J. Bergh and J. Löfström, Interpolation Spaces. An Introduction, Grundlehren der Mathematischen Wissenschaften, No. 223 (Springer-Verlag, Berlin-New York, 1976).
[7] J. Bricmont, A. Kupiainen and R. Lefevere, Exponential mixing of the 2D stochastic Navier–Stokes dynamics, Comm. Math. Phys. 230(1) (2002) 87–132.
[8] J. Bricmont, A. Kupiainen and R. Lefevere, Ergodicity of the 2D Navier–Stokes equations with random forcing. Dedicated to Joel L. Lebowitz, Comm. Math. Phys. 224(1) (2001) 65–81.
[9] J. Bricmont, A. Kupiainen and R. Lefevere, Probabilistic estimates for the two-dimensional stochastic Navier–Stokes equations, J. Statist. Phys. 100(3–4) (2000) 743–756.
[10] Z. Brzeźniak, M. Capinski and F. Flandoli, Stochastic Navier–Stokes equations with multiplicative noise, Stochastic Anal. Appl. 10(5) (1992) 523–532.
[11] Z. Brzeźniak and Y. Li, Asymptotic compactness and absorbing sets for 2D stochastic Navier–Stokes equations on some unbounded domains, Trans. Amer. Math. Soc. 358(12) (2006) 5587–5629.
[12] N. J. Cutland and B. Enright, Stochastic nonhomogeneous incompressible Navier–Stokes equations, J. Differential Equations 228(1) (2006) 140–170.
[13] G. Da Prato and J. Zabczyk, Stochastic Equations in Infinite Dimensions, Encyclopedia of Mathematics and Its Applications, Vol. 44 (Cambridge University Press, Cambridge, 1992).
[14] G. Da Prato and J. Zabczyk, Ergodicity for Infinite-Dimensional Systems, London Mathematical Society Lecture Note Series, Vol. 229 (Cambridge University Press, Cambridge, 1996).
[15] G. Deugoue and M. Sango, On the stochastic 3D Navier–Stokes-alpha model of fluids turbulence, Abstr. Appl. Anal. 2009 (2009), Article ID 723236, 27 pp.
[16] G. Deugoue and M. Sango, On the strong solution for the 3D stochastic Leray-Alpha Model, Boundary Value Problems 2010 (2010), Article ID 723018, 31 pp.
[17] E. Weinan and Ya. G. Sinai, New results in mathematical and statistical hydrodynamics, Russian Math. Surveys 55(4) (2000) 635–666.
[18] E. Feireisl, Dynamics of Viscous Compressible Fluids, Oxford Lecture Series in Mathematics and Its Applications, Vol. 26 (Oxford University Press, Oxford, 2004).
[19] F. Flandoli and D. Gatarek, Martingale solutions and stationary solutions for stochastic Navier–Stokes equations, Probab. Theory Relat. Fields 102 (1995) 367–391.
[20] J.-F. Gerbeau and C. Le Bris, Existence of solution for a density-dependent magnetohydrodynamic equation, Adv. Differential Equations 2(3) (1997) 427–452.
[21] I. I. Gikhman and A. V. Skorohod, Stochastic Differential Equations, Ergebnisse der Mathematik und ihrer Grenzgebiete, Band 72 (Springer-Verlag, New York-Heidelberg, 1972).
[22] Y. Cho and H. Kim, Unique solvability for the density-dependent Navier–Stokes equations, Nonlinear Anal. 59(4) (2004) 465–489.
[23] H. J. Choe and H. Kim, Strong solutions of the Navier–Stokes equations for nonhomogeneous incompressible fluids, Comm. Partial Differential Equations 28(5–6) (2003) 1183–1201.
[24] A. V. Kazhikhov, Solvability of the initial-boundary value problem for the equations of the motion of an inhomogeneous viscous incompressible fluid, Dokl. Akad. Nauk SSSR 216 (1974) 1008–1010 (in Russian).
[25] S. B. Kuksin, Randomly Forced Nonlinear PDEs and Statistical Hydrodynamics in 2 Space Dimensions, Zurich Lectures in Advanced Mathematics (European Mathematical Society (EMS), Zürich, 2006), x+93 pp.
[26] S. Kuksin and A. Shirikyan, Ergodicity for the randomly forced 2D Navier–Stokes equations, Math. Phys. Anal. Geom. 4(2) (2001) 147–195.
[27] O. A. Ladyzhenskaya, The Mathematical Theory of Viscous Incompressible Flow, 2nd edn., revised and enlarged (Gordon and Breach, Science Publishers, New York-London-Paris, 1969).
[28] O. A. Ladyzhenskaja and V. A. Solonnikov, The unique solvability of an initial-boundary value problem for viscous incompressible inhomogeneous fluids. Boundary value problems of mathematical physics, and related questions of the theory of functions, 8, Zap. Nauch. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI) 52 (1975) 52–109, 218–219 (in Russian).
[29] J. Leray, Sur le système d’équations aux dérivées partielles qui régit l’écoulement permanent des fluides visqueux, C. R. Acad. Sci. Paris 192 (1931) 1180–1182.
[30] J. Leray, Sur le mouvement d’un liquide visqueux emplissant l’espace, Acta Math. 63(1) (1934) 193–248.
[31] J. Leray, Etude de diverses équations intégrales non lineaires et de quelques problèmes que pose l’hydrodynamique, J. Math. Pure Appl. (9) 12 (1933) 1–82.
[32] J. L. Lions, Quelques méthodes de résolution des problèmes aux limites non linéaires (Dunod, Gauthier-Villars, Paris, 1969; Russian translation by Mir).
[33] J.-L. Lions, On some problems connected with Navier–Stokes equations, in Nonlinear Evolution Equations (Proc. Sympos., Univ. Wisconsin, Madison, Wis., 1977), Publ. Math. Res. Center Univ. Wisconsin, Vol. 40 (Academic Press, New York-London, 1978), pp. 59–84.
[34] J.-L. Lions and E. Magenes, Non-Homogeneous Boundary Value Problems and Applications, Vol. I, Die Grundlehren der Mathematischen Wissenschaften, Band 181 (Springer-Verlag, New York-Heidelberg, 1972).
[35] P.-L. Lions, Mathematical Topics in Fluid Mechanics, Vol. 1, Incompressible Models (The Clarendon Press, Oxford University Press, New York, 1996).
[36] P.-L. Lions, Mathematical Topics in Fluid Mechanics, Vol. 2, Compressible Models (The Clarendon Press, Oxford University Press, New York, 1998).
[37] P.-L. Lions, Limites incompressible et acoustique pour des fluides visqueux, compressibles et isentropiques, C. R. Acad. Sci. Paris Sér. I Math. 317(12) (1993) 1197–1202.
[38] P.-L. Lions, Compacité des solutions des équations de Navier–Stokes compressibles isentropiques, C. R. Acad. Sci. Paris Sér. I Math. 317(1) (1993) 115–120.
[39] P.-L. Lions, Existence globale de solutions pour les équations de Navier–Stokes compressibles isentropiques, C. R. Acad. Sci. Paris Sér. I Math. 316(12) (1993) 1335–1340.
[40] R. Mikulevicius and B. L. Rozovskii, Global L^2-solutions of stochastic Navier–Stokes equations, Ann. Probab. 33(1) (2005) 137–176.
[41] R. Mikulevicius and B. L. Rozovskii, Stochastic Navier–Stokes equations for turbulent flows, SIAM J. Math. Anal. 35(5) (2004) 1250–1310.
[42] S.-E. Mohammed and T. S. Zhang, Dynamics of Stochastic 2D Navier–Stokes, to appear in J. Funct. Anal.
[43] A. S. Monin and A. M. Yaglom, Statistical Fluid Mechanics: Mechanics of Turbulence, Vols. I, II (Dover Publications, Dover Ed Edition, 2007).
[44] A. Novotny and I. Stravskraba, Introduction to the Mathematical Theory of Compressible Flow, Oxford Lecture Series in Mathematics and Its Applications, Vol. 27 (Oxford University Press, Oxford, 2004), xx+506 pp.
[45] Yu. V. Prohorov, Convergence of random processes and limit theorems in probability theory, Teor. Veroyatnost. i Primenen. 1 (1956) 177–238 (in Russian).
[46] P. A. Razafimandimby and M. Sango, Weak solutions of a stochastic model for two-dimensional second grade fluids, Boundary Value Problems 2010 (2010), Article ID 636140, 47 pp.
[47] P. A. Razafimandimby and M. Sango, Asymptotic behavior of solutions of stochastic evolution equations for second grade fluids, to appear in C. R. Math. Acad. Sci. Paris.
[48] M. Sango, Existence result for a doubly degenerate quasilinear stochastic parabolic equation, Proc. Japan Acad. Ser. A Math. Sci. 81(5) (2005) 89–94.
[49] M. Sango, Weak solutions for a doubly degenerate quasilinear parabolic equation with random forcing, Discrete Contin. Dyn. Syst. Ser. B 7(4) (2007) 885–905.
[50] M. Sango, Magnetohydrodynamic turbulent flows: Existence results, Phys. D 239 (2010) 912–923.
[51] J. Simon, Compact sets in the space L^p(0, T; B), Ann. Mat. Pura Appl. 146(4) (1987) 65–96.
[52] J. Simon, Sur les fluides visqueux incompressibles et non homogènes, C. R. Acad. Sci. Paris Sér. I Math. 309(7) (1989) 447–452.
[53] J. Simon, Écoulement d’un fluide non homogène avec une densité initiale s’annulant, C. R. Acad. Sci. Paris Sér. A-B 287(15) (1978) A1009–A1012.
[54] J. Simon, Nonhomogeneous viscous incompressible fluids: Existence of velocity, density, and pressure, SIAM J. Math. Anal. 21(5) (1990) 1093–1117.
[55] A. V. Skorokhod, Limit theorems for stochastic processes, Teor. Veroyatnost. i Primenen. 1 (1956) 289–319.
[56] A. V. Skorokhod, Studies in the Theory of Random Processes (Scripta Technica, Inc. Addison-Wesley Publishing Co., Inc., Reading, Mass., 1965); Translated from the Russian.
[57] R. Temam, Navier–Stokes Equations. Theory and Numerical Analysis, Studies in Mathematics and Its Applications, Vol. 2 (North-Holland Publishing Co., Amsterdam-New York-Oxford, 1977).
[58] E. Tornatore and H. Fujita Yashima, One-dimensional stochastic equations for a viscous barotropic gas, Ricerche Mat. 46(2) (1997) 255–283 (in Italian).
[59] E. Tornatore, Global solution of bi-dimensional stochastic equation for a viscous gas, NoDEA Nonlinear Differential Equations Appl. 7(4) (2000) 343–360.
[60] M. I. Vishik, A. I. Komech and A. V. Fursikov, Some mathematical problems of statistical hydromechanics, Uspekhi Mat. Nauk 34(5)(209) (1979) 135–210 (in Russian).
[61] M. I. Vishik and A. V. Fursikov, Mathematical Problems of Statistical Hydromechanics, Mathematics and Its Applications (Kluwer, Dordrecht, 1988).
[62] M. Viot, Solutions faibles d’equations aux derivees partielles stochastiques non lineaires, Doctor of Sciences thesis, Paris 6 (1973).
[63] H. F. Yashima, Equations stochastiques d’un gaz visqueux isotherme dans un domaine monodimensionnel infini, Acta Math. Vietnam. 26(2) (2001) 147–168.
[64] H. F. Yashima, Equations de Navier–Stokes stochastiques non homogènes et applications, Tesi di Perfezionamento, Scuola Normale Superiore, Pisa (1992), 169 pp.
July 12, J070-S0129055X10004065
2010 12:1 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 22, No. 6 (2010) 699–732 © World Scientific Publishing Company DOI: 10.1142/S0129055X10004065
DERIVATIONS OF THE TRIGONOMETRIC BCn SUTHERLAND MODEL BY QUANTUM HAMILTONIAN REDUCTION
L. FEHÉR∗,‡ and B. G. PUSZTAI†,§
∗Department of Theoretical Physics, MTA KFKI RMKI, H-1525 Budapest, P.O.B. 49, Hungary
and
Department of Theoretical Physics, University of Szeged, Tisza Lajos krt 84-86, H-6720 Szeged, Hungary
†Bolyai Institute, University of Szeged, Aradi vértanúk tere 1, H-6720 Szeged, Hungary
‡[email protected]
§[email protected]
Received 7 October 2009

The $BC_n$ Sutherland Hamiltonian with coupling constants parametrized by three arbitrary integers is derived by reductions of the Laplace operator of the group $U(N)$. The reductions are obtained by applying the Laplace operator on spaces of certain vector valued functions equivariant under suitable symmetric subgroups of $U(N) \times U(N)$. Three different reduction schemes are considered, the simplest one being the compact real form of the reduction of the Laplacian of $GL(2n, \mathbb{C})$ to the complex $BC_n$ Sutherland Hamiltonian previously studied by Oblomkov.

Keywords: Integrable many-body systems; quantum Hamiltonian reduction; polar action.

Mathematics Subject Classification 2010: 22E70, 53C80, 81R12
1. Introduction

The family of Calogero–Sutherland type many-body models is very important both in physics and mathematics, as is amply demonstrated in the reviews [1–6]. In this paper, we focus on the group theoretic derivation of the trigonometric Sutherland models introduced by Olshanetsky and Perelomov [7] in correspondence with the crystallographic root systems. The Hamiltonian of the model associated with the root system $R$ is given by

$$H_R = -\frac{1}{2}\Delta + \frac{1}{4}\sum_{\alpha \in R} \frac{|\alpha|^2 \mu_\alpha(\mu_\alpha + 2\mu_{2\alpha} - 1)}{\sin^2(\alpha \cdot q)}, \tag{1.1}$$

where $\Delta$ is the Laplacian on the Euclidean space of the roots and the $\mu_\alpha$ are arbitrary real constants depending only on the lengths of the roots, with $\mu_{2\alpha} := 0$
if $2\alpha \notin R$. In the original $A_{n-1}$ case, the model was solved by Sutherland [8]. An interesting general observation [9] is that the radial part of the Laplace operator of any compact Riemannian symmetric space is always conjugate to a Sutherland operator (1.1) built on the root system of the symmetric space, with coupling constants determined by the multiplicities of the roots. This observation showed the algebraic integrability of the resulting Hamiltonians $H_R$ at (small) finite sets of coupling constants and inspired later developments. The integrability, and exact solvability in terms of a triangular structure, was first established for the models (1.1) in full generality by Heckman and Opdam [10, 11]. Their technique is based on differential-reflection operators belonging to the Hecke algebraic generalization of harmonic analysis [2, 12].

The Hecke algebraic approach is very powerful, but it is still desirable to treat as many cases of the models (1.1) in group theoretic terms as possible. Important progress in this direction was achieved by Etingof, Frenkel and Kirillov [13] who worked out the quantum mechanical version of the classical Hamiltonian reduction due to Kazhdan, Kostant and Sternberg [14] and thereby showed that the $A_{n-1}$ Sutherland Hamiltonian arises as the restriction of the Laplace operator of $SU(n)$ to certain vector valued spherical functions. A spherical function $F$ on $SU(n)$ with values in the $SU(n)$ module $V$ satisfies the equivariance condition $F(gxg^{-1}) = g \cdot F(x)$ and thus it is uniquely determined by its restriction to the maximal torus $T < SU(n)$. It is easily seen that the restricted function $f = F|_T$ must vary in the zero-weight subspace $V^T$ and the action of the Laplace operator of $SU(n)$ on $F$ can be expressed by the action of a scalar differential operator on $f$ whenever $\dim(V^T) = 1$. This latter condition singles out the symmetric tensorial powers $V = S^{kn}(\mathbb{C}^n)$ ($k \in \mathbb{Z}_{\geq 0}$) and their duals among the irreducible highest weight representations of $SU(n)$, and the resulting scalar differential operator turns out to be the Sutherland operator $H_{A_{n-1}}$ with coupling parameter $\mu_\alpha = k + 1$.

The above arguments cannot be extended to the simple Lie groups beyond $SU(n)$, since in general they do not admit non-trivial highest weight representations with multiplicity one for the zero weight.^a However, taking any compact connected Lie group $Y$, there exist other nice actions of certain subgroups of $Y \times Y$ on $Y$ for which one can try to generalize the above arguments. Indeed [17], if $G$ is the fixed point set of an involution of $Y \times Y$, then every orbit of the natural action of $G$ on $Y$ can be intersected by a toral subgroup $A < Y$. Therefore the $G$-equivariant functions on $Y$ with values in a representation $V$ of $G$ give rise to $V^K$-valued functions on $A$, where $K$ is the isotropy group of the generic elements of $A$. Moreover, if $\dim(V^K) = 1$, then the application of the Laplace operator of $Y$ on $C^\infty(Y, V)^G$ may induce a scalar Sutherland operator. The group actions just alluded to are called Hermann actions. They received a lot of attention in differential geometry (see, e.g., [17, 18] and references therein), but their use for the construction of integrable systems still has not been explored systematically.

^a The only exceptions [15, 16] are the defining representation of $SO(2n + 1)$ and the 7-dimensional representation of $G_2$. In the former case, we have checked that the reduced Laplacian gives a decoupled system.

The goal of this paper is to explain that certain Hermann actions on $Y = U(N)$ permit derivations of the $BC_n$ Sutherland Hamiltonian from the Laplacian of $U(N)$. The derivations that we present are partly motivated by an earlier derivation found in the complex holomorphic setting in [19], and by our previous paper [20] where we discussed how the classical mechanical version of the trigonometric $BC_n$ model with three arbitrary coupling constants can be obtained by reducing the free particle moving on the group $U(N)$. Taking for $R$ the root system

$$BC_n = \{\epsilon_i \pm \epsilon_j,\ \pm\epsilon_k,\ \pm 2\epsilon_k \mid i, j, k \in \{1, \ldots, n\},\ i \neq j\}, \tag{1.2}$$
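with orthonormal vectors $\{\epsilon_i\}$, and introducing new coupling parameters $a$, $b$, $c$ by the definition

$$\mu_{\epsilon_i \pm \epsilon_j} := a + 1, \qquad \mu_{\epsilon_k} := b - c, \qquad \mu_{2\epsilon_k} := c + \frac{1}{2}, \tag{1.3}$$

the Hamiltonian (1.1) reads

$$H_{BC_n} = -\frac{1}{2}\sum_{j=1}^{n} \frac{\partial^2}{\partial q_j^2} + \sum_{1 \leq k < l \leq n} \left( \frac{a(a+1)}{\sin^2(q_k - q_l)} + \frac{a(a+1)}{\sin^2(q_k + q_l)} \right) + \frac{1}{2}\sum_{j=1}^{n} \frac{b^2 - \frac{1}{4}}{\sin^2(q_j)} + \frac{1}{2}\sum_{j=1}^{n} \frac{c^2 - \frac{1}{4}}{\cos^2(q_j)}. \tag{1.4}$$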
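The passage from (1.1) and (1.3) to (1.4) can be checked numerically. The following is only a minimal sketch, not part of the original derivation. It assumes that the sum in (1.1) runs over the full $BC_n$ root system, that is, over $\pm\epsilon_k \pm \epsilon_l$ ($k < l$), $\pm\epsilon_k$ and $\pm 2\epsilon_k$; the function names are hypothetical and introduced only for this illustration.

```python
import math
from itertools import combinations, product


def bcn_roots(n, a, b, c):
    """Yield (alpha, mu_alpha, mu_2alpha) for every root of BC_n, using (1.3).

    Assumption of this sketch: both signs of every root are included.
    """
    for k, l in combinations(range(n), 2):
        for sk, sl in product((1, -1), repeat=2):
            alpha = [0] * n
            alpha[k], alpha[l] = sk, sl
            yield alpha, a + 1, 0.0          # 2*alpha is not a root
    for k in range(n):
        for s in (1, -1):
            short = [0] * n
            short[k] = s
            yield short, b - c, c + 0.5      # 2*alpha = +-2 eps_k is a root
            long_ = [0] * n
            long_[k] = 2 * s
            yield long_, c + 0.5, 0.0        # 2*alpha is not a root


def potential_from_roots(q, a, b, c):
    """Potential term of (1.1) evaluated on the BC_n root system."""
    total = 0.0
    for alpha, mu, mu2 in bcn_roots(len(q), a, b, c):
        norm2 = sum(x * x for x in alpha)
        dot = sum(x * y for x, y in zip(alpha, q))
        total += norm2 * mu * (mu + 2 * mu2 - 1) / math.sin(dot) ** 2
    return total / 4


def potential_bcn(q, a, b, c):
    """Potential term of (1.4), written out explicitly."""
    pair = sum(
        a * (a + 1) * (1 / math.sin(qk - ql) ** 2 + 1 / math.sin(qk + ql) ** 2)
        for qk, ql in combinations(q, 2)
    )
    single = sum(
        0.5 * (b * b - 0.25) / math.sin(qj) ** 2
        + 0.5 * (c * c - 0.25) / math.cos(qj) ** 2
        for qj in q
    )
    return pair + single
```

At a generic point such as q = [0.3, 0.7, 1.1] the two functions should agree up to rounding. The agreement rests on the identity $1/\sin^2(2x) = \frac{1}{4}(1/\sin^2 x + 1/\cos^2 x)$, which splits the contribution of the long roots $\pm 2\epsilon_k$ between the last two sums of (1.4).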
In fact, we shall obtain this Hamiltonian with arbitrary non-negative integers $a$, $b$ and $c$ as a reduction of the Laplace operator of $U(N)$. More precisely, we shall present three different derivations, for which $N = 2n$, $N = 2n + 1$ or $N = 2n + 2$.

There is considerable conceptual overlap between this paper and the above-mentioned work [19] of Oblomkov, who related the eigenfunctions of the holomorphic $BC_n$ Sutherland operator to vector valued spherical functions on the group $GL(N, \mathbb{C})$. If we replace $GL(N, \mathbb{C})$ by $U(N)$, then Oblomkov’s construction leads to our construction in the most important $N = 2n$ case. However, there are also different cases considered in [19] and in this paper even after such replacement, and the language and the techniques used are rather different. In fact, we shall obtain the results by applying a recently developed general framework of quantum Hamiltonian reduction under polar group actions [21]. We shall raise interesting open questions, too, and to facilitate their future investigation we describe our analysis in a self-contained manner.

The organization of the article is as follows. In the next section, we recall the necessary notions and results concerning quantum Hamiltonian reductions of the Laplace operator on a Riemannian manifold that admits generalized polar coordinates adapted to the symmetry group in the sense of [22]. In Sec. 3, we specialize to Hermann actions on a compact Lie group $Y$, and describe those Hermann actions on $Y = U(N)$ that are expected to lead to $BC_n$ Sutherland models if the representation of the symmetry group $G < Y \times Y$ is chosen appropriately. The key part of the paper is Sec. 4, where we confirm the above expectation for three infinite families of cases. In Sec. 5, we summarize the results, further discuss the comparison with [19] and formulate open questions. There is also an appendix containing background material.

2. Quantum Hamiltonian Reduction Under Polar Actions

We here collect general definitions and results that will be used subsequently. Our main purpose is to explain that formula (2.14) characterizes the reductions of the Laplace operator of a Riemannian manifold under so-called polar actions [22] of compact symmetry groups. The exposition is restricted to the necessary minimum, for more details see [21] and references therein.

Let $Y$ be a smooth, connected, complete Riemannian manifold with metric $\eta$. Consider the Laplace operator $\Delta_Y$ corresponding to $\eta$. For a smooth function $F$, in local coordinates $\{y^\mu\}$ on $Y$ one has $\Delta_Y F = |\eta|^{-\frac{1}{2}} \partial_\mu (|\eta|^{\frac{1}{2}} \partial^\mu F)$ with $|\eta| := \det(\eta_{\mu,\nu})$. The restriction of $\Delta_Y$ onto the space of the complex-valued compactly supported smooth functions,

$$\Delta_Y^0 := \Delta_Y|_{C_c^\infty(Y)} : C_c^\infty(Y) \to C_c^\infty(Y), \tag{2.1}$$
is an essentially self-adjoint linear operator of the Hilbert space $L^2(Y, d\mu_Y)$, where $\mu_Y$ denotes the measure generated by the Riemannian volume form, locally defined by $|\eta|^{\frac{1}{2}} \bigwedge_\mu dy^\mu$.

Suppose that a compact Lie group $G$ acts on $(Y, \eta)$ by isometries. The action is given by a smooth map

$$\phi : G \times Y \to Y, \qquad (g, y) \mapsto \phi(g, y) = \phi_g(y) = g.y \tag{2.2}$$

such that $\phi_g^* \eta = \eta$ for every $g \in G$. The measure $\mu_Y$ inherits the $G$-invariance and therefore the Hilbert space $L^2(Y, d\mu_Y)$ naturally carries a continuous unitary representation of $G$. This in turn is unitarily equivalent to an orthogonal direct sum, $L^2(Y, d\mu_Y) \cong \oplus_\rho M_\rho \otimes V_{\bar\rho}$, where $(\rho, V_\rho)$ runs over a complete set of pairwise inequivalent irreducible unitary representations of $G$, $\bar\rho$ denotes the contragredient of the representation $\rho$, and $M_\rho$ is a “multiplicity space” on which $G$ acts trivially. Correspondingly, the self-adjoint scalar Laplace operator, $\bar\Delta_Y^0$, which by definition is the closure of $\Delta_Y^0$ (2.1), can be decomposed as $\bar\Delta_Y^0 \cong \oplus_\rho \hat\Delta_\rho \otimes \mathrm{id}_{V_{\bar\rho}}$, where $\hat\Delta_\rho$ is a self-adjoint operator on the Hilbert space $M_\rho$. The system $(M_\rho, \hat\Delta_\rho)$ is called the reduction of the system $(L^2(Y, d\mu_Y), \bar\Delta_Y^0)$ having the symmetry type $\bar\rho$.

In order to present a convenient model of $(M_\rho, \hat\Delta_\rho)$, consider now an irreducible unitary representation $(\rho, V)$ of $G$, where $V$ is a finite dimensional complex vector space with inner product $(\ ,\ )_V$. By simply acting componentwise, the differential operator $\Delta_Y^0$ extends onto the complex vector space of the $V$-valued compactly supported smooth functions, $C_c^\infty(Y, V)$. This gives the essentially self-adjoint operator

$$\Delta_Y^0 : C_c^\infty(Y, V) \to C_c^\infty(Y, V) \tag{2.3}$$
of the Hilbert space $L^2(Y, V, d\mu_Y)$. Because of the $G$-symmetry of the metric $\eta$, the set

$$C_c^\infty(Y, V)^G := \{F \mid F \in C_c^\infty(Y, V),\ F \circ \phi_g = \rho(g) \circ F\ (\forall g \in G)\} \tag{2.4}$$

of the $V$-valued, compactly supported $G$-equivariant smooth functions is an invariant linear subspace of $\Delta_Y^0$. Moreover, the restriction of $\Delta_Y^0$ (2.3) onto $C_c^\infty(Y, V)^G$,

$$\Delta_\rho := \Delta_Y^0|_{C_c^\infty(Y, V)^G} : C_c^\infty(Y, V)^G \to C_c^\infty(Y, V)^G, \tag{2.5}$$

is a densely defined, symmetric, essentially self-adjoint linear operator on the Hilbert space $L^2(Y, V, d\mu_Y)^G$ of the square-integrable $G$-equivariant functions. It is not difficult to demonstrate the unitary equivalence

$$(M_\rho, \hat\Delta_\rho) \cong (L^2(Y, V, d\mu_Y)^G, \bar\Delta_\rho) \quad \text{with} \quad V := V_\rho, \tag{2.6}$$

where $\bar\Delta_\rho$ denotes the closure of $\Delta_\rho$ in (2.5). It is convenient for many purposes to use the realization of the reduced quantum system furnished by $L^2(Y, V, d\mu_Y)^G$.

Particularly simple cases of the reduction arise if the reduced configuration space $Y_{\mathrm{red}} := Y/G$ is a smooth manifold, although this happens very rarely. However, restricting to the principal orbit type, $\check{Y} \subset Y$, one always obtains a smooth fiber bundle $\pi : \check{Y} \to \check{Y}/G$. Note that $\check{Y}$ consists of the points of $Y$ having the smallest isotropy subgroups for the $G$-action [23]. The “big cell” of the reduced configuration space, given by $\check{Y}_{\mathrm{red}} := \check{Y}/G$, is naturally endowed with a Riemannian metric, $\eta_{\mathrm{red}}$, making $\pi$ a Riemannian submersion. From a quantum mechanical point of view, neglecting the non-principal orbits is harmless, in some sense, since $\check{Y}$ is not only open and dense in $Y$, but it is also of full measure.

In many applications polar group actions are important, whose characteristic property is that the $G$-orbits possess representatives that form sections in the sense of Palais and Terng [22]. By definition, a section $\Sigma \subset Y$ is a connected, closed, regularly embedded smooth submanifold of $Y$ that meets every $G$-orbit and it does so orthogonally at every intersection point of $\Sigma$ with an orbit. If a section exists, then any two sections are $G$-related. The induced metric on $\Sigma$ is denoted by $\eta_\Sigma$, and for the measure generated by $\eta_\Sigma$, we introduce the notation $\mu_\Sigma$. For a section $\Sigma$, denote by $\check\Sigma$ a connected component of the manifold $\hat\Sigma := \check{Y} \cap \Sigma$. The isotropy subgroups of all elements of $\hat\Sigma$ are the same and for a fixed section we define $K := G_y$ for $y \in \hat\Sigma$. The group $K$ is called the centralizer of the section $\Sigma$. By restricting $\pi : \check{Y} \to \check{Y}/G$ onto $\check\Sigma$, $(\check{Y}_{\mathrm{red}}, \eta_{\mathrm{red}})$ becomes identified with $(\check\Sigma, \eta_{\check\Sigma})$, where $\eta_{\check\Sigma}$ is the induced metric on $\check\Sigma$. We let $\Delta_{\check\Sigma}$ stand for the Laplace operator of the Riemannian manifold $(\check\Sigma, \eta_{\check\Sigma})$. The $G$-equivariant diffeomorphism

$$\check\Sigma \times (G/K) \ni (Q, gK) \mapsto \phi_g(Q) \in \check{Y} \tag{2.7}$$

provides a trivialization of the fiber bundle $\pi : \check{Y} \to \check{Y}/G$. Generalized polar coordinates on $\check{Y}$ consist of “radial” coordinates on $\check\Sigma$ and “angular” coordinates on $G/K$.
To concretize the reduced system (2.6) for polar actions, we introduce the space

$$\mathrm{Fun}(\check\Sigma, V^K) := \{f \mid f \in C_c^\infty(\check\Sigma, V^K),\ f = F|_{\check\Sigma} \text{ for some } F \in C_c^\infty(Y, V)^G\}, \tag{2.8}$$

where $V^K$ is spanned by the $K$-invariant vectors in the representation space $V$. We assume that the representation $(\rho, V)$ of the symmetry group $G$ is admissible in the sense that

$$\dim(V^K) > 0. \tag{2.9}$$

The restriction of functions appearing in the definition (2.8) gives rise to a linear isomorphism $\mathrm{Fun}(\check\Sigma, V^K) \cong C_c^\infty(Y, V)^G \to L^2(Y, V, d\mu_Y)^G$. This induces a scalar product on $\mathrm{Fun}(\check\Sigma, V^K)$ making it a pre-Hilbert space whose closure satisfies the Hilbert space isomorphism $\overline{\mathrm{Fun}(\check\Sigma, V^K)} \cong L^2(Y, V, d\mu_Y)^G$.

Next, consider the Lie algebra $\mathcal{G} := \mathrm{Lie}(G)$ and its subalgebra $\mathcal{K} := \mathrm{Lie}(K)$. Fix a $G$-invariant positive definite scalar product, $B_G$, on $\mathcal{G}$ and thereby determine the orthogonal complement $\mathcal{K}^\perp$ of $\mathcal{K}$ in $\mathcal{G}$. For any $\xi \in \mathcal{G}$ denote by $\xi^\sharp$ the associated vector field on $Y$. Then at each point $Q \in \check\Sigma$ the linear map $\mathcal{K}^\perp \ni \xi \mapsto \xi^\sharp_Q \in T_Q Y$ is injective, and the inertia operator $J(Q) \in \mathrm{End}(\mathcal{K}^\perp)$ can be defined by the requirement

$$\eta_Q(\xi^\sharp_Q, \zeta^\sharp_Q) = B_G(\xi, J(Q)\zeta), \qquad \forall \xi, \zeta \in \mathcal{K}^\perp. \tag{2.10}$$

Note that $J(Q)$ is symmetric and positive definite with respect to $B_G|_{\mathcal{K}^\perp \times \mathcal{K}^\perp}$. By choosing dual bases $\{T_\alpha\}, \{T^\alpha\} \subset \mathcal{K}^\perp$, that is, $B_G(T^\alpha, T_\beta) = \delta^\alpha_\beta$, we let

$$b_{\alpha,\beta}(Q) := B_G(T_\alpha, J(Q) T_\beta), \qquad b^{\alpha,\beta}(Q) := B_G(T^\alpha, J(Q)^{-1} T^\beta). \tag{2.11}$$

The $G$-orbit $G.Q \subset Y$ through any point $Q \in \check\Sigma$ is an embedded submanifold of $Y$ and by its embedding it inherits a Riemannian metric, $\eta_{G.Q}$. Thus we can define the smooth density function $\delta : \check\Sigma \to (0, \infty)$ by

$$\delta(Q) := \text{volume of the Riemannian manifold } (G.Q, \eta_{G.Q}), \tag{2.12}$$

where the volume is understood with respect to the measure, $\mu_{G.Q}$, belonging to $\eta_{G.Q}$. It is easy to see that

$$\delta(Q) = C|\det(J(Q))|^{\frac{1}{2}} = C|\det(b_{\alpha,\beta}(Q))|^{\frac{1}{2}} \tag{2.13}$$

with some constant $C > 0$. In the following proposition, quoted from [21], $\rho'$ denotes the representation of $\mathcal{G}$ corresponding to the representation $\rho$ of $G$.

Proposition 2.1. Let us consider a polar $G$-action using the above notations. Then the reduced system (2.6) associated with an admissible irreducible unitary representation $(\rho, V)$ of $G$ can be identified with the pair $(L^2(\check\Sigma, V^K, d\mu_{\check\Sigma}), \Delta_{\mathrm{red}})$, where

$$\Delta_{\mathrm{red}} = \Delta_{\check\Sigma} - \delta^{-\frac{1}{2}} \Delta_{\check\Sigma}(\delta^{\frac{1}{2}}) + b^{\alpha,\beta} \rho'(T_\alpha) \rho'(T_\beta) \tag{2.14}$$

with domain $D(\Delta_{\mathrm{red}}) = \delta^{\frac{1}{2}} \mathrm{Fun}(\check\Sigma, V^K)$ is a densely defined, symmetric, essentially self-adjoint operator on the Hilbert space $L^2(\check\Sigma, V^K, d\mu_{\check\Sigma})$.
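The structure of (2.14) can be illustrated on a toy case that is not treated in the paper and is assumed here only for illustration: the rotation group $SO(2)$ acting on the flat plane, with section a ray $r > 0$ and orbit volume $\delta(r) = 2\pi r$. For equivariant functions $e^{im\theta} f(r)$ the three terms of (2.14) should then become $d^2/dr^2$, $+1/(4r^2)$ and $-m^2/r^2$. The sketch below compares both sides by finite differences; all names are hypothetical.

```python
import math

H = 1e-3  # finite-difference step; an ad hoc choice for this sketch


def d1(f, r):
    """Central first derivative."""
    return (f(r + H) - f(r - H)) / (2.0 * H)


def d2(f, r):
    """Central second derivative."""
    return (f(r + H) - 2.0 * f(r) + f(r - H)) / (H * H)


def sqrt_delta(r):
    """Square root of the orbit volume delta(r) = 2*pi*r of a circle of radius r."""
    return math.sqrt(2.0 * math.pi * r)


def radial_laplacian(f, r, m):
    """Flat Laplacian of e^{i m theta} f(r), divided by e^{i m theta}."""
    return d2(f, r) + d1(f, r) / r - m * m * f(r) / (r * r)


def reduced_operator(g, r, m):
    """Toy version of (2.14): kinetic term, measure factor, representation term."""
    measure_factor = -d2(sqrt_delta, r) / sqrt_delta(r)  # close to 1 / (4 r^2)
    return d2(g, r) + measure_factor * g(r) - m * m * g(r) / (r * r)


def f_test(r):
    """Arbitrary smooth radial profile used for the comparison."""
    return r * r * math.exp(-r * r)


def g_test(r):
    """Image of f_test under the map f -> delta^{1/2} f."""
    return sqrt_delta(r) * f_test(r)
```

With these definitions, sqrt_delta(r) * radial_laplacian(f_test, r, m) and reduced_operator(g_test, r, m) should coincide up to discretization error, which is the content of conjugating the radial Laplacian by $\delta^{1/2}$.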
The above statement results by calculating the action of $\Delta_Y$ on the $V$-valued equivariant functions in (2.8) with the aid of polar coordinates, using also the Hilbert space identifications

$$\overline{\mathrm{Fun}(\check\Sigma, V^K)} \cong L^2(Y, V, d\mu_Y)^G \cong L^2(\check\Sigma, V^K, \delta\, d\mu_{\check\Sigma}). \tag{2.15}$$

The last equality follows by integrating out the “angular” coordinates in the scalar product of equivariant functions. One also uses the unitary map $U : L^2(\check\Sigma, V^K, \delta\, d\mu_{\check\Sigma}) \to L^2(\check\Sigma, V^K, d\mu_{\check\Sigma})$ defined by $U : f \mapsto \delta^{\frac{1}{2}} f$.

The first term in (2.14) corresponds to the kinetic energy of a particle moving on $(\check{Y}_{\mathrm{red}}, \eta_{\mathrm{red}}) \cong (\check\Sigma, \eta_{\check\Sigma})$ and the rest represents potential energy if $\dim(V^K) = 1$. The second term of (2.14) is always potential energy, which is constant in some cases. We refer to this term as the “measure factor”. It represents a significant difference between the outcomes of the corresponding classical and quantum Hamiltonian reductions [21]. If $\dim(V^K) > 1$, then one says that the reduced system contains internal “spin” degrees of freedom and then the third term of (2.14) encodes “spin-dependent potential energy”.

3. Examples of Polar Actions on Compact Lie Groups

From now we take the “unreduced configuration space” $Y$ to be a compact, connected, real Lie group endowed with a bi-invariant metric $\eta$, induced by a positive definite, $Y$-invariant bilinear form $B_Y$ of the Lie algebra $\mathcal{Y} := \mathrm{Lie}(Y)$. For the reduction group $G$ one may choose any symmetric subgroup of the direct product group $Y \times Y$, that is,

$$(Y \times Y)^\sigma_0 \leq G \leq (Y \times Y)^\sigma, \tag{3.1}$$

where $(Y \times Y)^\sigma$ stands for the fixed-point set of some involutive automorphism $\sigma \in \mathrm{Inv}(Y \times Y)$, and $(Y \times Y)^\sigma_0$ is the connected component of the identity in $(Y \times Y)^\sigma$. The group $G$ acts on $Y$ by the map

$$\phi : G \times Y \to Y, \qquad ((g_L, g_R), y) \mapsto \phi_{(g_L, g_R)}(y) := g_L y g_R^{-1}. \tag{3.2}$$

The group actions of this form are often called Hermann actions. Under mild conditions, which hold in the examples below, these are polar actions in the sense of [22]. In fact, the sections are provided by certain toral subgroups^b $A < Y$. Thus the sections are flat in the induced metric, which is the characteristic property of the so-called hyperpolar actions [17]. In the simplest special case $\sigma(y_1, y_2) = (y_2, y_1)$, $G = Y_{\mathrm{diag}} = \{(y, y) \mid y \in Y\} \cong Y$ and (3.2) is just the adjoint action of $Y$ on itself, for which the sections are the maximal tori of $Y$.

^b A toral subgroup $A < Y$ is a connected and closed Abelian subgroup. It is the closedness of the relevant subgroups that requires some conditions. If $Y$ is semi-simple, then a sufficient condition is to take $B_Y$ as a multiple of the Killing form [17].
3.1. Hermann actions associated with pairs of involutions

The reductions that we study later arise from the following construction. Let $\sigma_L, \sigma_R \in \mathrm{Inv}(Y)$ be two involutions of $Y$, and let $Y_L, Y_R \leq Y$ be corresponding symmetric subgroups of $Y$,

$$(Y^{\sigma_I})_0 \leq Y_I \leq Y^{\sigma_I} \qquad (I \in \{L, R\}). \tag{3.3}$$

We suppose that the scalar product $B_Y$ is invariant under both $\sigma_L$ and $\sigma_R$ and introduce $\sigma \in \mathrm{Inv}(Y \times Y)$ by $\sigma(y_1, y_2) := (\sigma_L(y_1), \sigma_R(y_2))$. Then

$$G := Y_L \times Y_R \tag{3.4}$$

is a symmetric subgroup of $Y \times Y$ and Eq. (3.2) defines a hyperpolar Hermann action of $G$ on $Y$. The classification of the inequivalent pairs of involutions $(\sigma_L, \sigma_R)$ has been worked out by Matsuki [24]. We assume for simplicity that the two involutions $\sigma_L$ and $\sigma_R$ commute with each other, which holds for the large majority of cases in the classification. Subsequently, the induced Lie algebra involutions are denoted by the same letters $\sigma_L$ and $\sigma_R$. Now, with the aid of the subspaces

$$\mathcal{Y}^{\sigma_I, \pm} := \ker(\sigma_I \mp \mathrm{Id}_{\mathcal{Y}}) \subset \mathcal{Y} \quad (I \in \{L, R\}) \quad \text{and} \quad \mathcal{Y}^{\pm\pm} := \mathcal{Y}^{\sigma_L, \pm} \cap \mathcal{Y}^{\sigma_R, \pm} \subset \mathcal{Y} \tag{3.5}$$

we obtain the orthogonal decomposition

$$\mathcal{Y} = \mathcal{Y}^{++} \oplus \mathcal{Y}^{+-} \oplus \mathcal{Y}^{-+} \oplus \mathcal{Y}^{--}, \tag{3.6}$$

which gives also a $\mathbb{Z}_2 \times \mathbb{Z}_2$-gradation of $\mathcal{Y}$. The Lie algebra of the symmetric subgroup $Y_I \leq Y$ is $\mathrm{Lie}(Y_I) \cong \mathcal{Y}^{\sigma_I, +}$ ($I \in \{L, R\}$). Then, we choose a maximal Abelian subalgebra $\mathcal{A}$ in $\mathcal{Y}^{--}$ and also define $A := \exp(\mathcal{A})$, which is a toral subgroup of $Y$. According to an important theorem proved in [25, 26], the Lie group $Y$ admits the generalized Cartan decomposition

$$Y = Y_L A Y_R. \tag{3.7}$$

This means that every element of $Y$ can be written as a product of the elements of the subgroups in (3.7). Recalling the definition of the Hermann action (3.2) for $G = Y_L \times Y_R$, Eq. (3.7) says that the subgroup $A$ intersects every $G$-orbit. Moreover, it does so orthogonally at every intersection point, and thus $A$ provides a section for the $G$-action in the sense of [22]. Below $\check{A}$ denotes a connected component of the regular part of the section $A$.

Let us introduce the subgroups

$$Y_{LR} := Y_L \cap Y_R \leq Y \quad \text{and} \quad M := \{g \mid g \in Y_{LR},\ g a g^{-1} = a\ (\forall a \in A)\} \leq Y_{LR}. \tag{3.8}$$
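Their Lie algebras are

$$\mathrm{Lie}(Y_{LR}) \cong \mathrm{Lie}(Y_L) \cap \mathrm{Lie}(Y_R) \cong \mathcal{Y}^{\sigma_L, +} \cap \mathcal{Y}^{\sigma_R, +} = \mathcal{Y}^{++}, \tag{3.9}$$

$$\mathcal{M} := \mathrm{Lie}(M) = \{X \mid X \in \mathcal{Y}^{++},\ \mathrm{ad}_X(q) = 0\ (\forall q \in \mathcal{A})\}, \tag{3.10}$$

where $\mathrm{ad}_X$ is defined by the Lie bracket on $\mathcal{Y}$. It can be shown that the centralizer of the section $A = \exp(\mathcal{A})$ (the isotropy subgroup of the elements of $\check{A}$) is now furnished by

$$K = M_{\mathrm{diag}} = \{(g, g) \mid g \in M\} \leq G. \tag{3.11}$$

To specialize the inertia operator $J$ defined in (2.10), we introduce a $G$-invariant scalar product on the Lie algebra

$$\mathcal{G} = \mathrm{Lie}(G) = \mathrm{Lie}(Y_L \times Y_R) \cong \mathrm{Lie}(Y_L) \oplus \mathrm{Lie}(Y_R) \cong \mathcal{Y}^{\sigma_L, +} \oplus \mathcal{Y}^{\sigma_R, +} \tag{3.12}$$

by the formula

$$B_G((\xi_L, \xi_R), (\zeta_L, \zeta_R)) := B_Y(\xi_L, \zeta_L) + B_Y(\xi_R, \zeta_R), \qquad \forall (\xi_L, \xi_R), (\zeta_L, \zeta_R) \in \mathcal{G}. \tag{3.13}$$

This induces the decomposition $\mathcal{G} = \mathcal{K} \oplus \mathcal{K}^\perp$, where $\mathcal{K} = \mathrm{Lie}(K)$. By using the decomposition $\mathcal{Y} = \mathcal{M} \oplus \mathcal{M}^\perp$ defined by $B_Y$, we also introduce the subspaces

$$\mathcal{K}_a^\perp := \{(X, -X) \mid X \in \mathcal{M}\} \subset \mathcal{K}^\perp, \tag{3.14}$$

$$\mathcal{K}_e^\perp := \{(\xi_L, \xi_R) \mid \xi_L, \xi_R \in \mathcal{M}^\perp \cap \mathcal{Y}^{++}\} \subset \mathcal{K}^\perp, \tag{3.15}$$

$$\mathcal{K}_o^\perp := \{(\zeta_L, \zeta_R) \mid \zeta_L \in \mathcal{Y}^{+-},\ \zeta_R \in \mathcal{Y}^{-+}\} \subset \mathcal{K}^\perp, \tag{3.16}$$

which yield the orthogonal decomposition

$$\mathcal{K}^\perp = \mathcal{K}_a^\perp \oplus \mathcal{K}_e^\perp \oplus \mathcal{K}_o^\perp. \tag{3.17}$$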
Now consider the vector field ξ = (ξL , ξR ) on Y associated with ξ = (ξL , ξR ) ∈ G by means of the G-action. At an arbitrary point eq ∈ A (q ∈ A) of the section A we find ξeq = (ξL , ξR )eq = (dLeq )e ξR − e−adq (ξL ) ∈ Teq Y,
(3.18)
where Ly denotes the left-translation on Y by group element y ∈ Y . Simply by plugging (3.18) into the definition (2.10), routine algebraic manipulations lead to the following result: subspaces Lemma 3.1. Equation (3.17) is a decomposition of K⊥ into invariant ˇ One has J(eq ) ⊥ = 2 IdK⊥ and, of the inertia operator J(eq ) at any point eq ∈ A. K a a
July 12, J070-S0129055X10004065
708
2010 12:1 WSPC/S0129-055X
148-RMP
L. Feh´ er & B. G. Pusztai
writing ξ = (ξL , ξR ) ∈ G as a 2-component column vector with components ξL and ξR , the action of J(eq ) on Ke⊥ and Ko⊥ is encoded by the matrices 1 − cosh(ad ) q J(eq )K⊥ = , e ⊥ 1 − cosh(adq ) Ke (3.19) ) 1 − sinh(ad q J(eq )K⊥ = . o ⊥ 1 sinh(adq ) K q
o
J(eq )−1 K⊥ a
1 2
For the inverse of J(e ) one has = IdK⊥ together with a cosh(adq ) sinh−2 (adq ) sinh−2 (adq ) q −1 J(e ) K⊥ = − , e ⊥ sinh−2 (adq ) cosh(adq ) sinh−2 (adq ) Ke −2 −2 sinh(adq ) cosh (adq ) cosh (adq ) J(eq )−1 K⊥ = . −2 o ⊥ cosh−2 (adq ) −sinh(adq ) cosh (adq ) K
(3.20)
(3.21)
o
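3.2. A family of two involutions on U(N)

For our later purpose, we now focus on the unitary group
\[
Y := U(N) = \{ y \mid y \in GL(N, \mathbb{C}),\ y^\dagger y = 1_N \}. \tag{3.22}
\]
We equip the Lie algebra
\[
\mathcal{Y} := u(N) = \{ X \mid X \in gl(N, \mathbb{C}),\ X^\dagger + X = 0 \} \tag{3.23}
\]
with the scalar product
\[
B_{\mathcal{Y}}(X, Z) := -\mathrm{tr}(XZ), \quad \forall X, Z \in u(N). \tag{3.24}
\]
To any pair $(m, n) \in \mathbb{Z}^2_{\ge 0}$ with $m \ge n$ and $m + n = N$ we associate the block-matrix
\[
I_{m,n} := \mathrm{diag}(1_m, -1_n) = \begin{bmatrix} 1_m & 0 \\ 0 & -1_n \end{bmatrix} \in U(N), \tag{3.25}
\]
and the involutive inner automorphism
\[
\theta_{m,n} : U(N) \to U(N), \quad y \mapsto \theta_{m,n}(y) := I_{m,n}\, y\, I_{m,n}^{-1}. \tag{3.26}
\]
The fixed-point set of $\theta_{m,n}$ is
\[
U(N)^{\theta_{m,n}} = \left\{ \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} \,\middle|\, a \in U(m),\ b \in U(n) \right\} \cong U(m) \times U(n). \tag{3.27}
\]
Note that $U(N)^{\theta_{m,n}}$ is connected. The induced Lie algebra involution operates as
\[
\theta_{m,n}(X) = I_{m,n}\, X\, I_{m,n}^{-1}, \quad \forall X \in u(N). \tag{3.28}
\]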
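The properties of $\theta_{m,n}$ stated in (3.25) to (3.28) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `I_mn`, `theta` and `random_unitary` are illustrative and do not come from the paper.

```python
import numpy as np

def I_mn(m, n):
    """Block matrix I_{m,n} = diag(1_m, -1_n) of (3.25)."""
    return np.diag(np.concatenate([np.ones(m), -np.ones(n)]))

def theta(y, m, n):
    """theta_{m,n}(y) = I_{m,n} y I_{m,n}^{-1}; I_{m,n} is its own inverse."""
    I = I_mn(m, n)
    return I @ y @ I

def random_unitary(N, rng):
    """Hypothetical helper: unitary factor of the QR factorization of a complex Gaussian matrix."""
    z = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
    unitary, _ = np.linalg.qr(z)
    return unitary

rng = np.random.default_rng(0)
m, n = 3, 2
y = random_unitary(m + n, rng)
ty = theta(y, m, n)

# theta_{m,n} is an involution and maps U(N) to itself.
involution_ok = np.allclose(theta(ty, m, n), y)
unitary_ok = np.allclose(ty.conj().T @ ty, np.eye(m + n))

# theta_{m,n} keeps the diagonal blocks and flips the sign of the off-diagonal
# blocks, so its fixed points are the block-diagonal matrices of (3.27).
blocks_ok = np.allclose(ty[:m, :m], y[:m, :m]) and np.allclose(ty[:m, m:], -y[:m, m:])

fixed = np.zeros((m + n, m + n), dtype=complex)
fixed[:m, :m] = random_unitary(m, rng)
fixed[m:, m:] = random_unitary(n, rng)
fixed_ok = np.allclose(theta(fixed, m, n), fixed)
```

The sign flip of the off-diagonal blocks is also what produces the eigenspace splitting of the induced Lie algebra involution.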
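Using the block-matrix realization
\[
u(N) = \left\{ \begin{bmatrix} A & C \\ -C^\dagger & B \end{bmatrix} \,\middle|\, A \in u(m),\ B \in u(n),\ C \in \mathbb{C}^{m \times n} \right\}, \tag{3.29}
\]
the eigenspaces $u(N)^{\theta_{m,n},\pm}$ are
\[
u(N)^{\theta_{m,n},+} = \left\{ \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix} \,\middle|\, A \in u(m),\ B \in u(n) \right\}, \qquad
u(N)^{\theta_{m,n},-} = \left\{ \begin{bmatrix} 0 & C \\ -C^\dagger & 0 \end{bmatrix} \,\middle|\, C \in \mathbb{C}^{m \times n} \right\}. \tag{3.30}
\]
Now we take two pairs $(m, n), (r, s) \in \mathbb{Z}^2_{\ge 0}$ with the additional requirements $m \ge r \ge s \ge n$ and $m + n = r + s = N$, and consider the commuting involutions
\[
\sigma_L := \theta_{r,s} \quad\text{and}\quad \sigma_R := \theta_{m,n}. \tag{3.31}
\]
The corresponding symmetric subgroups $Y_L, Y_R \le Y$ are
\[
U(N)_L := U(N)^{\sigma_L} \cong U(r) \times U(s) \quad\text{and}\quad U(N)_R := U(N)^{\sigma_R} \cong U(m) \times U(n). \tag{3.32}
\]
The partition $N = n + (r - n) + (s - n) + n$ leads to a $4 \times 4$ block-matrix decomposition of any $N \times N$ matrix. (Of course, if $r = n$ or $s = n$, then the block-matrix decomposition contains fewer blocks.) That is, any matrix $X \in \mathbb{C}^{N \times N}$ can be written as
\[
X = \begin{bmatrix} X_{1,1} & X_{1,2} & X_{1,3} & X_{1,4} \\ X_{2,1} & X_{2,2} & X_{2,3} & X_{2,4} \\ X_{3,1} & X_{3,2} & X_{3,3} & X_{3,4} \\ X_{4,1} & X_{4,2} & X_{4,3} & X_{4,4} \end{bmatrix}, \tag{3.33}
\]
where the entries $X_{i,j}$ are themselves matrices, $X_{1,1} \in \mathbb{C}^{n \times n}$, $X_{1,2} \in \mathbb{C}^{n \times (r-n)}$, $X_{1,3} \in \mathbb{C}^{n \times (s-n)}$, $X_{1,4} \in \mathbb{C}^{n \times n}$, etc. Then for the Lie group $Y_{LR} = Y_L \cap Y_R$ we have
\[
U(N)_{LR} = \left\{ \begin{bmatrix} a_{1,1} & a_{1,2} & 0 & 0 \\ a_{2,1} & a_{2,2} & 0 & 0 \\ 0 & 0 & a_{3,3} & 0 \\ 0 & 0 & 0 & a_{4,4} \end{bmatrix} \,\middle|\, \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix} \in U(r),\ a_{3,3} \in U(s - n),\ a_{4,4} \in U(n) \right\}. \tag{3.34}
\]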
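Therefore $U(N)_{LR} \cong U(r) \times U(s - n) \times U(n)$ and the Lie algebra $\mathrm{Lie}(U(N)_{LR}) = u(N)^{++}$ is isomorphic to $u(r) \oplus u(s - n) \oplus u(n)$. In our case the subspace $\mathcal{Y}^{--}$ in (3.5) reads
\[
u(N)^{--} = \left\{ \begin{bmatrix} 0 & 0 & 0 & A_{1,4} \\ 0 & 0 & 0 & A_{2,4} \\ 0 & 0 & 0 & 0 \\ -A_{1,4}^\dagger & -A_{2,4}^\dagger & 0 & 0 \end{bmatrix} \,\middle|\, A_{1,4} \in \mathbb{C}^{n \times n},\ A_{2,4} \in \mathbb{C}^{(r-n) \times n} \right\}. \tag{3.35}
\]
To proceed, we define the diagonal $n \times n$ matrix
\[
\mathbf{q} := \mathrm{diag}(q_1, q_2, \ldots, q_n) \in \mathbb{R}^{n \times n} \tag{3.36}
\]
for any real $n$-tuple $(q_1, q_2, \ldots, q_n) \in \mathbb{R}^n$, and we also set
\[
q := \begin{bmatrix} 0 & 0 & 0 & \mathbf{q} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ -\mathbf{q} & 0 & 0 & 0 \end{bmatrix} \in u(N)^{--}. \tag{3.37}
\]
Then the set of matrices
\[
\mathcal{A} := \{ q \mid (q_1, q_2, \ldots, q_n) \in \mathbb{R}^n \} \subset u(N)^{--} \tag{3.38}
\]
is a maximal Abelian subalgebra in $u(N)^{--}$. A basis of the dual space $\mathcal{A}^*$ is given by the functionals
\[
\varepsilon_k : \mathcal{A} \to \mathbb{R}, \quad q \mapsto \varepsilon_k(q) := q_k. \tag{3.39}
\]
The corresponding subgroup $A = \exp(\mathcal{A})$ has the form
\[
A = \left\{ e^q = \begin{bmatrix} \cos(\mathbf{q}) & 0 & 0 & \sin(\mathbf{q}) \\ 0 & 1_{r-n} & 0 & 0 \\ 0 & 0 & 1_{s-n} & 0 \\ -\sin(\mathbf{q}) & 0 & 0 & \cos(\mathbf{q}) \end{bmatrix} \,\middle|\, (q_1, q_2, \ldots, q_n) \in \mathbb{R}^n \right\}. \tag{3.40}
\]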
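The closed form (3.40) of $e^q$ can be compared with a direct matrix exponential of the element $q$ in (3.37). The sketch below assumes NumPy; `q_matrix`, `expm_series` and `section_element` are hypothetical helper names, and the exponential is approximated by a truncated Taylor series, which is adequate only for matrices of modest norm such as those with $|q_k| \le \pi/2$.

```python
import numpy as np

def q_matrix(qs, r, s):
    """Element q of (3.37) for block sizes n, r-n, s-n, n with n = len(qs)."""
    n, N = len(qs), r + s
    Q = np.zeros((N, N))
    Q[:n, N - n:] = np.diag(qs)
    Q[N - n:, :n] = -np.diag(qs)
    return Q

def expm_series(X, terms=40):
    """Truncated Taylor series of the matrix exponential (illustrative, not production code)."""
    result = np.eye(X.shape[0])
    term = np.eye(X.shape[0])
    for k in range(1, terms):
        term = term @ X / k
        result = result + term
    return result

def section_element(qs, r, s):
    """Closed form of e^q as given in (3.40)."""
    n, N = len(qs), r + s
    A = np.eye(N)
    A[:n, :n] = np.diag(np.cos(qs))
    A[:n, N - n:] = np.diag(np.sin(qs))
    A[N - n:, :n] = -np.diag(np.sin(qs))
    A[N - n:, N - n:] = np.diag(np.cos(qs))
    return A

qs = np.array([0.3, 0.7])
r, s = 3, 2  # illustrative sizes with r >= s >= n = 2
matches = np.allclose(expm_series(q_matrix(qs, r, s)), section_element(qs, r, s))
```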
If $\mathbb{T}(n)$ denotes the diagonally embedded standard torus in $U(n)$, then it is straightforward to show that the subgroup $M$ (3.8) is now furnished by
\[
M = \left\{ \begin{bmatrix} a & 0 & 0 & 0 \\ 0 & b & 0 & 0 \\ 0 & 0 & c & 0 \\ 0 & 0 & 0 & a \end{bmatrix} \,\middle|\, a \in \mathbb{T}(n),\ b \in U(r - n),\ c \in U(s - n) \right\}. \tag{3.41}
\]
Note that $M$ is connected, and therefore so is the centralizer $K = M_{\mathrm{diag}}$ of the section $A$. Moreover, we have the identifications
\[
K \cong M_{\mathrm{diag}} \cong M \cong \mathbb{T}(n) \times U(r - n) \times U(s - n) \cong U(1)^{\times n} \times U(r - n) \times U(s - n). \tag{3.42}
\]
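It is shown in [26, p. 63] that the closed, connected subset
\[
A_+ := \left\{ e^q \,\middle|\, 0 \le q_1 \le q_2 \le \cdots \le q_n \le \frac{\pi}{2} \right\} \subset A \tag{3.43}
\]
intersects each orbit of $G = U(N)^{\sigma_L} \times U(N)^{\sigma_R}$ under the action (3.2) precisely once. Note also that matrix exponentiation provides a bijection from
\[
\mathcal{A}_+ := \left\{ q \,\middle|\, 0 \le q_1 \le q_2 \le \cdots \le q_n \le \frac{\pi}{2} \right\} \subset \mathcal{A} \tag{3.44}
\]
onto $A_+$. By inspecting the isotropy subgroup $G_{e^q} \le G$ for $e^q \in A_+$, we find that $G_{e^q} = K$ if and only if $q \in \check{\mathcal{A}}_+$, where $\check{\mathcal{A}}_+$ denotes the connected open subset
\[
\check{\mathcal{A}}_+ := \left\{ q \,\middle|\, 0 < q_1 < q_2 < \cdots < q_n < \frac{\pi}{2} \right\} \subset \mathcal{A}_+. \tag{3.45}
\]
We can conclude from the above that the subset $\check{A} := \exp(\check{\mathcal{A}}_+)$ provides a connected component of the regular part of the section $A$. Regarding the components $q_k$ in (3.45) as global coordinates on $\check{A}$, for the Laplace operator $\Delta_{\check{A}}$ defined by the induced metric we obtain
\[
\Delta_{\check{A}} = \frac{1}{2} \sum_{k=1}^{n} \frac{\partial^2}{\partial q_k^2}. \tag{3.46}
\]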
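3.3. Diagonalization of the inertia operator

We continue the study of the examples (3.31) by presenting a basis of $\mathcal{K}^\perp$ that diagonalizes $J(e^q)$ (3.19) for any $q \in \check{\mathcal{A}}_+$ in (3.45). We then use this basis to compute the density $\delta^{1/2}$ that enters the second term of the reduced Laplacian (2.14). Note that $\delta^{1/2}$ could also be found by specializing the general formulae available for two commuting involutions [25, 2], but we need to fix a basis for the evaluation of the third term of (2.14), which will be performed later.

We start by defining an orthonormal basis (ONB) in the space $\mathcal{M}^\perp \cap u(N)^{++}$, which (due to (3.34) and (3.41)) has the form
\[
\mathcal{M}^\perp \cap u(N)^{++} = \left\{ \begin{bmatrix} X_{1,1} & X_{1,2} & 0 & 0 \\ -X_{1,2}^\dagger & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & X_{4,4} \end{bmatrix} \,\middle|\, X_{1,1}, X_{4,4} \in u(n),\ (X_{1,1} + X_{4,4})_{\mathrm{diag}} = 0,\ X_{1,2} \in \mathbb{C}^{n \times (r-n)} \right\}. \tag{3.47}
\]
If $r = n$, then there are no off-diagonal blocks, and in general $\dim(\mathcal{M}^\perp \cap u(N)^{++}) = n(2r - 1)$. For all $1 \le j \le n$ we let
\[
E^{i}_{2\varepsilon_j} := \frac{i}{\sqrt{2}} \begin{bmatrix} E_{jj} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -E_{jj} \end{bmatrix}, \tag{3.48}
\]
and for all $1 \le k < l \le n$ we define
\[
\begin{aligned}
E^{r}_{\varepsilon_k + \varepsilon_l} &:= \frac{1}{2} \begin{bmatrix} E_{kl} - E_{lk} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & E_{lk} - E_{kl} \end{bmatrix}, &
E^{i}_{\varepsilon_k + \varepsilon_l} &:= \frac{i}{2} \begin{bmatrix} E_{kl} + E_{lk} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -E_{kl} - E_{lk} \end{bmatrix}, \\
E^{r}_{\varepsilon_k - \varepsilon_l} &:= \frac{1}{2} \begin{bmatrix} E_{kl} - E_{lk} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & E_{kl} - E_{lk} \end{bmatrix}, &
E^{i}_{\varepsilon_k - \varepsilon_l} &:= \frac{i}{2} \begin{bmatrix} E_{kl} + E_{lk} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & E_{kl} + E_{lk} \end{bmatrix}.
\end{aligned} \tag{3.49}
\]
For all $1 \le j \le n$ and $1 \le d \le r - n$ we set
\[
E^{r,d}_{\varepsilon_j} := \frac{1}{\sqrt{2}} \begin{bmatrix} 0 & E_{jd} & 0 & 0 \\ -E_{dj} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad
E^{i,d}_{\varepsilon_j} := \frac{i}{\sqrt{2}} \begin{bmatrix} 0 & E_{jd} & 0 & 0 \\ E_{dj} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}. \tag{3.50}
\]
The superscripts $i$ and $r$ refer to purely imaginary and to real matrices, respectively, and the elementary matrices $E_{ab}$ are always understood to be of the correct size as dictated by (3.33). The set of matrices
\[
\{ E^D_\alpha \}_{\alpha, D} := \{ E^{r}_{\varepsilon_k \pm \varepsilon_l}, E^{i}_{\varepsilon_k \pm \varepsilon_l} \}_{1 \le k < l \le n} \cup \{ E^{i}_{2\varepsilon_j} \}_{j=1}^{n} \cup \{ E^{r,d}_{\varepsilon_j}, E^{i,d}_{\varepsilon_j} \}_{1 \le j \le n,\ 1 \le d \le r-n} \tag{3.51}
\]
forms an ONB in $\mathcal{M}^\perp \cap u(N)^{++}$. Here $D$ is an "index of degeneration" and $\alpha$ runs over the positive roots $R_+$ for the root system $C_n$ or $BC_n$. More precisely,
\[
R_+ = \begin{cases} R_+(C_n) & \text{if } r = n, \\ R_+(BC_n) & \text{if } r > n. \end{cases} \tag{3.52}
\]
One can easily verify the relations
\[
(\mathrm{ad}_q)^2 E^D_\alpha = -\alpha(q)^2 E^D_\alpha. \tag{3.53}
\]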
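The relations (3.53) and the normalization with respect to $B_{\mathcal{Y}}(X, Z) = -\mathrm{tr}(XZ)$ can be verified on sample basis elements. The following sketch assumes NumPy and the block layout of (3.33); the sizes and the helpers `unit`, `embed` and `ad` are illustrative choices, not notation from the paper.

```python
import numpy as np

n, r, s = 2, 3, 2                     # illustrative sizes with r >= s >= n
N = r + s
sizes = [n, r - n, s - n, n]          # block sizes of the partition in (3.33)
starts = [0, n, r, N - n]             # first index of each block

def unit(a, b, rows, cols):
    """Elementary matrix E_{ab} of the given shape (0-based indices)."""
    M = np.zeros((rows, cols), dtype=complex)
    M[a, b] = 1
    return M

def embed(blocks):
    """Assemble an N x N matrix from a dict {(I, J): block} of block entries."""
    X = np.zeros((N, N), dtype=complex)
    for (I, J), B in blocks.items():
        X[starts[I]:starts[I] + sizes[I], starts[J]:starts[J] + sizes[J]] = B
    return X

def ad(q, X):
    return q @ X - X @ q

qs = np.array([0.4, 1.1])
q = embed({(0, 3): np.diag(qs), (3, 0): -np.diag(qs)})   # element q of (3.37)

j, k, l, d = 0, 0, 1, 0
Ekl, Elk, Ejj = unit(k, l, n, n), unit(l, k, n, n), unit(j, j, n, n)
samples = [
    # (basis element, value alpha(q) of the associated root)
    (1j / np.sqrt(2) * embed({(0, 0): Ejj, (3, 3): -Ejj}), 2 * qs[j]),
    (0.5 * embed({(0, 0): Ekl - Elk, (3, 3): Elk - Ekl}), qs[k] + qs[l]),
    (0.5j * embed({(0, 0): Ekl + Elk, (3, 3): Ekl + Elk}), qs[k] - qs[l]),
    (1 / np.sqrt(2) * embed({(0, 1): unit(j, d, n, r - n),
                             (1, 0): -unit(d, j, r - n, n)}), qs[j]),
]

eigen_ok = all(np.allclose(ad(q, ad(q, X)), -alpha ** 2 * X) for X, alpha in samples)
norm_ok = all(np.isclose(-np.trace(X @ X).real, 1.0) for X, _ in samples)
anti_hermitian_ok = all(np.allclose(X.conj().T, -X) for X, _ in samples)
```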
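Next, we deal with the subspaces $u(N)^{+-}$ and $u(N)^{-+}$ given by
\[
u(N)^{+-} = \left\{ \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & X_{3,4} \\ 0 & 0 & -X_{3,4}^\dagger & 0 \end{bmatrix} \,\middle|\, X_{3,4} \in \mathbb{C}^{(s-n) \times n} \right\}, \tag{3.54}
\]
\[
u(N)^{-+} = \left\{ \begin{bmatrix} 0 & 0 & X_{1,3} & 0 \\ 0 & 0 & X_{2,3} & 0 \\ -X_{1,3}^\dagger & -X_{2,3}^\dagger & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \,\middle|\, X_{1,3} \in \mathbb{C}^{n \times (s-n)},\ X_{2,3} \in \mathbb{C}^{(r-n) \times (s-n)} \right\}. \tag{3.55}
\]
Note that both $u(N)^{+-}$ and $u(N)^{-+}$ are trivial if $s = n$. In general, $\dim(u(N)^{+-}) = 2n(s - n)$ and $\dim(u(N)^{-+}) = 2r(s - n)$. For all $1 \le j \le n$ and $1 \le d \le s - n$ we define
\[
\tilde{E}^{r,d}_j := \frac{1}{\sqrt{2}} \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & E_{dj} \\ 0 & 0 & -E_{jd} & 0 \end{bmatrix}, \qquad
\tilde{E}^{i,d}_j := \frac{i}{\sqrt{2}} \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & E_{dj} \\ 0 & 0 & E_{jd} & 0 \end{bmatrix}, \tag{3.56}
\]
\[
\tilde{F}^{r,d}_j := \frac{1}{\sqrt{2}} \begin{bmatrix} 0 & 0 & -E_{jd} & 0 \\ 0 & 0 & 0 & 0 \\ E_{dj} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad
\tilde{F}^{i,d}_j := \frac{i}{\sqrt{2}} \begin{bmatrix} 0 & 0 & E_{jd} & 0 \\ 0 & 0 & 0 & 0 \\ E_{dj} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}. \tag{3.57}
\]
For all $1 \le c \le r - n$ and $1 \le d \le s - n$ we introduce
\[
\tilde{F}^{r,c,d}_0 := \frac{1}{\sqrt{2}} \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & E_{cd} & 0 \\ 0 & -E_{dc} & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad
\tilde{F}^{i,c,d}_0 := \frac{i}{\sqrt{2}} \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & E_{cd} & 0 \\ 0 & E_{dc} & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}. \tag{3.58}
\]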
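The set of matrices
\[
\{ \tilde{E}^D_j \}_{j,D} := \{ \tilde{E}^{r,d}_j, \tilde{E}^{i,d}_j \}_{1 \le j \le n,\ 1 \le d \le s-n} \tag{3.59}
\]
forms an ONB in $u(N)^{+-}$. The set of matrices
\[
\{ \tilde{F}^D_j \}_{j,D} := \{ \tilde{F}^{r,d}_j, \tilde{F}^{i,d}_j \}_{1 \le j \le n,\ 1 \le d \le s-n} \tag{3.60}
\]
together with the set
\[
\{ \tilde{F}^D_0 \}_{D} := \{ \tilde{F}^{r,c,d}_0, \tilde{F}^{i,c,d}_0 \}_{1 \le c \le r-n,\ 1 \le d \le s-n} \tag{3.61}
\]
form an ONB in $u(N)^{-+}$. They verify the relations
\[
\mathrm{ad}_q(\tilde{E}^D_j) = q_j \tilde{F}^D_j, \qquad \mathrm{ad}_q(\tilde{F}^D_j) = -q_j \tilde{E}^D_j, \qquad \mathrm{ad}_q(\tilde{F}^D_0) = 0. \tag{3.62}
\]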
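The relations (3.62) can be checked in the same way as (3.53). The sketch below assumes NumPy and uses the block positions of (3.54) to (3.58); the sizes and helper names (`unit`, `embed`, `ad`) are illustrative assumptions.

```python
import numpy as np

n, r, s = 2, 3, 3                     # illustrative sizes with s > n so the blocks exist
N = r + s
sizes = [n, r - n, s - n, n]
starts = [0, n, r, N - n]

def unit(a, b, rows, cols):
    """Elementary matrix E_{ab} of the given shape (0-based indices)."""
    M = np.zeros((rows, cols), dtype=complex)
    M[a, b] = 1
    return M

def embed(blocks):
    """Assemble an N x N matrix from a dict {(I, J): block} of block entries."""
    X = np.zeros((N, N), dtype=complex)
    for (I, J), B in blocks.items():
        X[starts[I]:starts[I] + sizes[I], starts[J]:starts[J] + sizes[J]] = B
    return X

def ad(q, X):
    return q @ X - X @ q

qs = np.array([0.4, 1.1])
q = embed({(0, 3): np.diag(qs), (3, 0): -np.diag(qs)})

j, c, d = 1, 0, 0
Edj, Ejd = unit(d, j, s - n, n), unit(j, d, n, s - n)
c2 = 1 / np.sqrt(2)
E_r = c2 * embed({(2, 3): Edj, (3, 2): -Ejd})            # (3.56), real type
E_i = 1j * c2 * embed({(2, 3): Edj, (3, 2): Ejd})        # (3.56), imaginary type
F_r = c2 * embed({(0, 2): -Ejd, (2, 0): Edj})            # (3.57), real type
F_i = 1j * c2 * embed({(0, 2): Ejd, (2, 0): Edj})        # (3.57), imaginary type
F0_r = c2 * embed({(1, 2): unit(c, d, r - n, s - n),
                   (2, 1): -unit(d, c, s - n, r - n)})   # (3.58), real type

relations_ok = (
    np.allclose(ad(q, E_r), qs[j] * F_r)
    and np.allclose(ad(q, E_i), qs[j] * F_i)
    and np.allclose(ad(q, F_r), -qs[j] * E_r)
    and np.allclose(ad(q, F_i), -qs[j] * E_i)
    and np.allclose(ad(q, F0_r), 0)
)
```

Since $(\mathrm{ad}_q)^2 = -q_j^2$ on the span of $\tilde{E}^D_j$ and $\tilde{F}^D_j$, the operator $\sinh(\mathrm{ad}_q)$ acts there as $\tilde{E}^D_j \mapsto \sin(q_j)\tilde{F}^D_j$ and $\tilde{F}^D_j \mapsto -\sin(q_j)\tilde{E}^D_j$.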
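Now we compute the matrix of $J$ and of $J^{-1}$ on the invariant subspaces in (3.17). First, choose an arbitrary ONB $\{ L_j \}_{j=1}^{\dim(\mathcal{M})}$ in $\mathcal{M}$. Then the vectors
\[
\hat{L}_j := \frac{1}{\sqrt{2}} (L_j, -L_j) \equiv \frac{1}{\sqrt{2}} \begin{bmatrix} L_j \\ -L_j \end{bmatrix} \tag{3.63}
\]
yield an ONB in $\mathcal{K}^\perp_a$. The matrix entries of $J(e^q)|_{\mathcal{K}^\perp_a}$ and $J(e^q)^{-1}|_{\mathcal{K}^\perp_a}$ read
\[
B_{\mathcal{G}}(\hat{L}_k, J(e^q) \hat{L}_l) = 2\delta_{k,l}, \qquad B_{\mathcal{G}}(\hat{L}_k, J(e^q)^{-1} \hat{L}_l) = \frac{1}{2}\delta_{k,l}. \tag{3.64}
\]
Second, upon introducing the vectors
\[
V^D_\alpha := \frac{1}{\sqrt{2}} \begin{bmatrix} E^D_\alpha \\ E^D_\alpha \end{bmatrix}, \qquad W^D_\alpha := \frac{1}{\sqrt{2}} \begin{bmatrix} E^D_\alpha \\ -E^D_\alpha \end{bmatrix}, \tag{3.65}
\]
we obtain an ONB in $\mathcal{K}^\perp_e$, and by applying (3.19) on these vectors we get
\[
J(e^q) V^D_\alpha = \frac{1}{\sqrt{2}} \begin{bmatrix} (1 - \cosh(\mathrm{ad}_q)) E^D_\alpha \\ (1 - \cosh(\mathrm{ad}_q)) E^D_\alpha \end{bmatrix}, \qquad
J(e^q) W^D_\alpha = \frac{1}{\sqrt{2}} \begin{bmatrix} (1 + \cosh(\mathrm{ad}_q)) E^D_\alpha \\ -(1 + \cosh(\mathrm{ad}_q)) E^D_\alpha \end{bmatrix}. \tag{3.66}
\]
We find from the relations (3.53) that $\cosh(\mathrm{ad}_q) E^D_\alpha = \cos(\alpha(q)) E^D_\alpha$, and then elementary trigonometric identities yield
\[
J(e^q) V^D_\alpha = 2 \sin^2\!\left( \frac{\alpha(q)}{2} \right) V^D_\alpha, \qquad J(e^q) W^D_\alpha = 2 \cos^2\!\left( \frac{\alpha(q)}{2} \right) W^D_\alpha. \tag{3.67}
\]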
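Therefore the only non-trivial matrix entries of $J(e^q)|_{\mathcal{K}^\perp_e}$ and $J(e^q)^{-1}|_{\mathcal{K}^\perp_e}$ are the following ones:
\[
\begin{aligned}
B_{\mathcal{G}}(V^D_\alpha, J(e^q) V^D_\alpha) &= 2 \sin^2\!\left( \frac{\alpha(q)}{2} \right), &
B_{\mathcal{G}}(W^D_\alpha, J(e^q) W^D_\alpha) &= 2 \cos^2\!\left( \frac{\alpha(q)}{2} \right), \\
B_{\mathcal{G}}(V^D_\alpha, J(e^q)^{-1} V^D_\alpha) &= \frac{1}{2 \sin^2\!\left( \frac{\alpha(q)}{2} \right)}, &
B_{\mathcal{G}}(W^D_\alpha, J(e^q)^{-1} W^D_\alpha) &= \frac{1}{2 \cos^2\!\left( \frac{\alpha(q)}{2} \right)}.
\end{aligned} \tag{3.68}
\]
Third, by introducing
\[
\tilde{V}^D_j := \frac{1}{\sqrt{2}} \begin{bmatrix} \tilde{E}^D_j \\ \tilde{F}^D_j \end{bmatrix}, \qquad
\tilde{W}^D_j := \frac{1}{\sqrt{2}} \begin{bmatrix} \tilde{E}^D_j \\ -\tilde{F}^D_j \end{bmatrix}, \qquad
\tilde{Z}^D_0 := \begin{bmatrix} 0 \\ \tilde{F}^D_0 \end{bmatrix}, \tag{3.69}
\]
we obtain an ONB in $\mathcal{K}^\perp_o$, and the application of (3.19) on these basis vectors gives
\[
J(e^q) \tilde{V}^D_j = \frac{1}{\sqrt{2}} \begin{bmatrix} \tilde{E}^D_j - \sinh(\mathrm{ad}_q) \tilde{F}^D_j \\ \sinh(\mathrm{ad}_q) \tilde{E}^D_j + \tilde{F}^D_j \end{bmatrix}, \qquad
J(e^q) \tilde{W}^D_j = \frac{1}{\sqrt{2}} \begin{bmatrix} \tilde{E}^D_j + \sinh(\mathrm{ad}_q) \tilde{F}^D_j \\ \sinh(\mathrm{ad}_q) \tilde{E}^D_j - \tilde{F}^D_j \end{bmatrix}. \tag{3.70}
\]
By using the relations (3.62) we see that
\[
J(e^q) \tilde{V}^D_j = (1 + \sin(q_j)) \tilde{V}^D_j, \qquad J(e^q) \tilde{W}^D_j = (1 - \sin(q_j)) \tilde{W}^D_j. \tag{3.71}
\]
Since $J(e^q) \tilde{Z}^D_0 = \tilde{Z}^D_0$, we conclude that the only non-trivial matrix entries of $J(e^q)|_{\mathcal{K}^\perp_o}$ and its inverse $J(e^q)^{-1}|_{\mathcal{K}^\perp_o}$ are the following ones:
\[
\begin{aligned}
B_{\mathcal{G}}(\tilde{V}^D_j, J(e^q) \tilde{V}^D_j) &= 1 + \sin(q_j), &
B_{\mathcal{G}}(\tilde{W}^D_j, J(e^q) \tilde{W}^D_j) &= 1 - \sin(q_j), \\
B_{\mathcal{G}}(\tilde{V}^D_j, J(e^q)^{-1} \tilde{V}^D_j) &= \frac{1}{1 + \sin(q_j)}, &
B_{\mathcal{G}}(\tilde{W}^D_j, J(e^q)^{-1} \tilde{W}^D_j) &= \frac{1}{1 - \sin(q_j)}, \\
B_{\mathcal{G}}(\tilde{Z}^D_0, J(e^q) \tilde{Z}^D_0) &= 1, &
B_{\mathcal{G}}(\tilde{Z}^D_0, J(e^q)^{-1} \tilde{Z}^D_0) &= 1.
\end{aligned} \tag{3.72}
\]

Lemma 3.2. By using the identification $\check{\Sigma} := \check{A} = \exp(\check{\mathcal{A}}_+)$ with $\check{\mathcal{A}}_+$ in (3.45), the second term of the reduced Laplacian (2.14) is given by
\[
\delta^{-\frac{1}{2}} \Delta_{\check{A}}(\delta^{\frac{1}{2}}) = \frac{1}{2} \sum_{j=1}^{n} \frac{(m - n)(r - s)}{\sin^2(q_j)} + \frac{1}{2} \sum_{j=1}^{n} \frac{4(s - n)^2 - 1}{\sin^2(2q_j)} - \frac{n(3m^2 + n^2 - 1)}{6}. \tag{3.73}
\]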
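Proof. Consider the function
\[
\mathcal{J} := \prod_{1 \le k < l \le n} [\sin(q_k - q_l) \sin(q_k + q_l)]^{\nu} \prod_{j=1}^{n} [\sin(q_j)]^{\nu_1} \prod_{j=1}^{n} [\sin(2q_j)]^{\nu_2}, \tag{3.74}
\]
where the domain of the variables $q_1, q_2, \ldots, q_n$ is such that all sine functions are positive and $\nu, \nu_1, \nu_2 \in \mathbb{R}$ are arbitrary parameters. Recall from [9] the identity
\[
\begin{aligned}
\mathcal{J}^{-1} \sum_{a=1}^{n} \frac{\partial^2 \mathcal{J}}{\partial q_a^2} ={}& 2\nu(\nu - 1) \sum_{1 \le k < l \le n} \left( \frac{1}{\sin^2(q_k - q_l)} + \frac{1}{\sin^2(q_k + q_l)} \right) \\
&+ \nu_1(\nu_1 + 2\nu_2 - 1) \sum_{j=1}^{n} \frac{1}{\sin^2(q_j)} + 4\nu_2(\nu_2 - 1) \sum_{j=1}^{n} \frac{1}{\sin^2(2q_j)} \\
&- n \left( (\nu_1 + 2\nu_2)^2 + 2\nu(\nu_1 + 2\nu_2)(n - 1) + \frac{2}{3} \nu^2 (n - 1)(2n - 1) \right).
\end{aligned} \tag{3.75}
\]
By calculating $\det(J(e^q))$ using the above basis of $\mathcal{K}^\perp$, it is easily obtained from (2.13) that $\delta^{\frac{1}{2}}(e^q) \propto \mathcal{J}(q_1, q_2, \ldots, q_n)$ with
\[
\nu = 1, \qquad \nu_1 = r - s, \qquad \nu_2 = s - n + \frac{1}{2}. \tag{3.76}
\]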
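As a sanity check of this step, the identity (3.75) can be compared with a finite-difference Laplacian, and the substitution (3.76) can be carried out in exact arithmetic. The sketch below uses only the Python standard library; the function names are illustrative, the pair-term coefficient is taken as $2\nu(\nu - 1)$ (the value obtained by differentiating (3.74) directly), and the sample point is chosen so that all sine factors in (3.74) are positive.

```python
import math
from fractions import Fraction

def weight_function(q, nu, nu1, nu2):
    """The function of (3.74) at a point where all sine factors are positive."""
    n = len(q)
    value = 1.0
    for k in range(n):
        for l in range(k + 1, n):
            value *= (math.sin(q[k] - q[l]) * math.sin(q[k] + q[l])) ** nu
    for x in q:
        value *= math.sin(x) ** nu1 * math.sin(2 * x) ** nu2
    return value

def lhs_finite_difference(q, nu, nu1, nu2, h=1e-4):
    """Left-hand side of (3.75) via central second differences."""
    base = weight_function(q, nu, nu1, nu2)
    total = 0.0
    for a in range(len(q)):
        up, down = list(q), list(q)
        up[a] += h
        down[a] -= h
        total += (weight_function(up, nu, nu1, nu2) - 2 * base
                  + weight_function(down, nu, nu1, nu2)) / h ** 2
    return total / base

def rhs_closed_form(q, nu, nu1, nu2):
    """Right-hand side of (3.75)."""
    n = len(q)
    pair = sum(1 / math.sin(q[k] - q[l]) ** 2 + 1 / math.sin(q[k] + q[l]) ** 2
               for k in range(n) for l in range(k + 1, n))
    single = sum(1 / math.sin(x) ** 2 for x in q)
    double = sum(1 / math.sin(2 * x) ** 2 for x in q)
    t = nu1 + 2 * nu2
    constant = n * (t ** 2 + 2 * nu * t * (n - 1) + (2 / 3) * nu ** 2 * (n - 1) * (2 * n - 1))
    return (2 * nu * (nu - 1) * pair + nu1 * (t - 1) * single
            + 4 * nu2 * (nu2 - 1) * double - constant)

def lemma_3_2_from_identity(n, r, s):
    """Half of (3.75) at the parameters (3.76): coefficients of 1/sin^2(q_j), 1/sin^2(2q_j), and the constant."""
    nu, nu1, nu2 = Fraction(1), Fraction(r - s), Fraction(s - n) + Fraction(1, 2)
    t = nu1 + 2 * nu2
    constant = n * (t ** 2 + 2 * nu * t * (n - 1) + Fraction(2, 3) * nu ** 2 * (n - 1) * (2 * n - 1))
    return nu1 * (t - 1) / 2, 4 * nu2 * (nu2 - 1) / 2, constant / 2

def lemma_3_2_stated(n, r, s):
    """The same three quantities as they appear in (3.73), with m = r + s - n."""
    m = r + s - n
    return (Fraction((m - n) * (r - s), 2), Fraction(4 * (s - n) ** 2 - 1, 2),
            Fraction(n * (3 * m * m + n * n - 1), 6))

point = [1.2, 0.8, 0.3]
identity_gap = abs(lhs_finite_difference(point, 1.5, 0.7, 2.5) - rhs_closed_form(point, 1.5, 0.7, 2.5))
```

The factor $\frac{1}{2}$ in `lemma_3_2_from_identity` comes from the normalization of the Laplace operator in (3.46).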
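Taking into account (3.46), the required statement follows immediately.

The subsequent formula is obtained by direct substitution since we have determined the matrix elements of $J(e^q)^{-1}$ (cf. (2.11)). It will be used in Sec. 4, when we shall further inspect the reduced Laplace operator (2.14) in interesting cases.

Lemma 3.3. In terms of the above notations, the third term of the reduced Laplacian (2.14) takes the following form:
\[
\begin{aligned}
b_{\alpha,\beta}\, \rho'(T^\alpha) \rho'(T^\beta) ={}& \frac{1}{2} \sum_{1 \le j \le \dim(\mathcal{M})} \rho'(\hat{L}_j)^2 + \frac{1}{2} \sum_{j=1}^{n} \left( \frac{\rho'(V^{i}_{2\varepsilon_j})^2}{\sin^2(q_j)} + \frac{\rho'(W^{i}_{2\varepsilon_j})^2}{\cos^2(q_j)} \right) \\
&+ \frac{1}{2} \sum_{1 \le k < l \le n} \left( \frac{\rho'(V^{r}_{\varepsilon_k - \varepsilon_l})^2 + \rho'(V^{i}_{\varepsilon_k - \varepsilon_l})^2}{\sin^2\!\left( \frac{q_k - q_l}{2} \right)} + \frac{\rho'(W^{r}_{\varepsilon_k - \varepsilon_l})^2 + \rho'(W^{i}_{\varepsilon_k - \varepsilon_l})^2}{\cos^2\!\left( \frac{q_k - q_l}{2} \right)} \right) \\
&+ \frac{1}{2} \sum_{1 \le k < l \le n} \left( \frac{\rho'(V^{r}_{\varepsilon_k + \varepsilon_l})^2 + \rho'(V^{i}_{\varepsilon_k + \varepsilon_l})^2}{\sin^2\!\left( \frac{q_k + q_l}{2} \right)} + \frac{\rho'(W^{r}_{\varepsilon_k + \varepsilon_l})^2 + \rho'(W^{i}_{\varepsilon_k + \varepsilon_l})^2}{\cos^2\!\left( \frac{q_k + q_l}{2} \right)} \right) \\
&+ \frac{1}{2} \sum_{j=1}^{n} \sum_{d=1}^{r-n} \left( \frac{\rho'(V^{r,d}_{\varepsilon_j})^2 + \rho'(V^{i,d}_{\varepsilon_j})^2}{\sin^2\!\left( \frac{q_j}{2} \right)} + \frac{\rho'(W^{r,d}_{\varepsilon_j})^2 + \rho'(W^{i,d}_{\varepsilon_j})^2}{\cos^2\!\left( \frac{q_j}{2} \right)} \right) \\
&+ \sum_{j=1}^{n} \sum_{d=1}^{s-n} \left( \frac{\rho'(\tilde{V}^{r,d}_j)^2 + \rho'(\tilde{V}^{i,d}_j)^2}{1 + \sin(q_j)} + \frac{\rho'(\tilde{W}^{r,d}_j)^2 + \rho'(\tilde{W}^{i,d}_j)^2}{1 - \sin(q_j)} \right) \\
&+ \sum_{c=1}^{r-n} \sum_{d=1}^{s-n} \left( \rho'(\tilde{Z}^{r,c,d}_0)^2 + \rho'(\tilde{Z}^{i,c,d}_0)^2 \right).
\end{aligned} \tag{3.77}
\]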
4. BCn Sutherland Models from the KKS Ansatz

In this section, we study interesting examples of the quantum Hamiltonian reduction based on the Hermann action (3.2) on $Y = U(N)$ associated with the involutions (3.31). The reductions correspond to certain UIRREPS $\rho$ of the symmetry group
\[
G = U(N)_L \times U(N)_R = (U(r) \times U(s)) \times (U(m) \times U(n)). \tag{4.1}
\]
To describe them, we now briefly summarize our notations for the UIRREPS of $U(n)$, for arbitrary $n$. (See also the Appendix.) First, we have the UIRREP $(\Pi_\lambda, V_\lambda)$ of $SU(n)$ in correspondence to any highest weight $\lambda \in P_+(SU(n))$, which can be written as $\lambda = \sum_{i=1}^{n-1} a_i \varpi_i$ using the fundamental weights $\varpi_i$ and integers $a_i \in \mathbb{Z}_{\ge 0}$. A label $\mu_n(\lambda) \in \{0, 1, \ldots, n - 1\}$ is attached to the highest weight $\lambda$ by the congruence relation
\[
\mu_n(\lambda) \equiv \sum_{k=1}^{n-1} k a_k \pmod{n} \quad\text{for}\quad \lambda = \sum_{i=1}^{n-1} a_i \varpi_i. \tag{4.2}
\]
It enters the equality $\Pi_\lambda(e^{i\frac{2\pi}{n}} 1_n) = e^{i\frac{2\pi}{n}\mu_n(\lambda)}\, \mathrm{Id}_{V_\lambda}$. Then, for any $k \in \mathbb{Z}$, the representation $\Pi_\lambda$ of $SU(n)$ extends to the representation $\rho_{(k,\lambda)}$ of $U(n)$ defined by
\[
\rho_{(k,\lambda)}(\xi g) = \xi^{nk + \mu_n(\lambda)}\, \Pi_\lambda(g), \quad \forall \xi \in U(1), \quad \forall g \in SU(n). \tag{4.3}
\]
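For concreteness, the label $\mu_n(\lambda)$ of (4.2) and the $U(1)$ exponent in (4.3) are easy to tabulate. This is a small sketch in plain Python; the function names and the convention of passing the Dynkin labels $(a_1, \ldots, a_{n-1})$ as a list are assumptions made for illustration.

```python
def mu_n(a):
    """Label mu_n(lambda) of (4.2) for lambda = sum_i a[i-1] * varpi_i, where n = len(a) + 1."""
    n = len(a) + 1
    return sum(k * a_k for k, a_k in enumerate(a, start=1)) % n

def u1_exponent(k, a):
    """Exponent n*k + mu_n(lambda) with which xi in U(1) acts in (4.3)."""
    n = len(a) + 1
    return n * k + mu_n(a)

# For lambda = a_1 * varpi_1 the label is a_1 mod n, so it vanishes exactly when n divides a_1.
labels_for_su4 = [mu_n([a1, 0, 0]) for a1 in range(6)]
```

With $a = [\,]$ the function covers the degenerate case $n = 1$, where the only weight is $0$.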
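Up to equivalence, all UIRREPS of $U(n)$ are obtained in this way. The notation makes sense even for $n = 1$, by putting $P_+(SU(1)) := \{0\}$, and we have $\rho_{(k,0)}(g) = (\det g)^k$ ($\forall g \in U(n)$). By letting $\rho'_{(k,\lambda)}$ and $\pi_\lambda$ stand for the infinitesimal version of the representations $\rho_{(k,\lambda)}$ and $\Pi_\lambda$, respectively, we have
\[
\rho'_{(k,\lambda)}(Z) = \pi_\lambda\!\left( Z - \frac{\mathrm{tr}(Z)}{n} 1_n \right) + (\mu_n(\lambda) + nk)\, \frac{\mathrm{tr}(Z)}{n}\, \mathrm{Id}_{V_\lambda}, \quad \forall Z \in u(n). \tag{4.4}
\]
We use the notations $\pi^{(n)}_\lambda$, $V^{(n)}_\lambda$, $\rho^{(n)}_{(k,\lambda)}$ etc. when considering various values of $n$ simultaneously. The UIRREPS of the direct product group $G$ (4.1) have the form
\[
\rho = \left( \rho^{(r)}_{(k^1_L, \lambda^1_L)} \boxtimes \rho^{(s)}_{(k^2_L, \lambda^2_L)} \right) \boxtimes \left( \rho^{(m)}_{(k^1_R, \lambda^1_R)} \boxtimes \rho^{(n)}_{(k^2_R, \lambda^2_R)} \right), \tag{4.5}
\]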
where $\lambda^1_L, \lambda^2_L, \lambda^1_R, \lambda^2_R$ are the highest weights and $k^1_L, k^2_L, k^1_R, k^2_R \in \mathbb{Z}$ according to (4.3). The main problem is to find the UIRREPS $(\rho, V)$ for which
\[
\dim(V^K) = 1, \tag{4.6}
\]
where $K = M_{\mathrm{diag}} < G$ is given by (3.42). We investigate this problem by adopting the ansatz that one of the four constituent representations in (4.5) has the form $\rho^{(l)}_{(k, a_1 \varpi_1)}$ ($l \in \{r, s, m, n\}$) and the other three constituent representations are 1-dimensional. More exactly, $\rho^{(l)}_{(k, a_1 \varpi_1)}$ will be used for a factor of the maximal size, $l = \max\{r, s, m, n\}$. We call this assumption the KKS ansatz, since it eventually originates from the seminal paper by Kazhdan, Kostant and Sternberg [14]. The usefulness of this assumption is also supported by results in [13, 19, 20]. The key property is that all weight-multiplicities of $\rho^{(l)}_{(k, a_1 \varpi_1)}$ are equal to one. The analysis of the condition (4.6) is the easiest if the group $K$ (3.42) is Abelian, which happens in the following cases:

• Case I: $m = r = s = n$, $N = 2n$,
• Case II: $m = r = n + 1$, $s = n$, $N = 2n + 1$,
• Case III: $m = n + 2$, $r = s = n + 1$, $N = 2n + 2$.

Next we describe the simplest Case I in detail, then present the essential points for the other two cases. The complex holomorphic analogue of Case I was studied in [19], and the results are consistent. The other two cases of our KKS ansatz have not been investigated before.

Remark. The reader may wonder why we take $l = \max\{r, s, m, n\}$ in our KKS ansatz in Cases II and III. In fact, we previously studied ([20] and unpublished work) the classical Hamiltonian reductions of the free particle on $U(N)$ based on the symmetry group (4.1) by using a minimal coadjoint orbit of positive dimension for any one of the four factors and one-point orbits for the other three factors. We found that this leads to the classical $BC_n$ Sutherland model with three independent coupling constants only in the three cases mentioned above, and only if the minimal coadjoint orbit of positive dimension, $2(l - 1)$ for $U(l)$, is associated with a factor of maximal size. The connection to quantum Hamiltonian reduction is clear from the relation between the coadjoint orbits of $U(l)$ of dimension $2(l - 1)$ and the representations $\rho^{(l)}_{(k, a_1 \varpi_1)}$ (and their contragredients), which follows for example from geometric quantization.

4.1. Case I: m = r = s = n, N = 2n

Now $\sigma_L = \sigma_R = \theta_{n,n}$ and $U(N)_L = U(N)_R \cong U(n) \times U(n)$. The decomposition (3.33) of any matrix in $\mathbb{C}^{N \times N}$ simplifies to a two by two block form with all four blocks having size $n \times n$. We look for admissible UIRREPS $\rho$ of $G$ (4.1) by adopting the KKS ansatz
\[
\rho := \left( \rho^{(n)}_{(k^1_L, a_1 \varpi_1)} \boxtimes \rho^{(n)}_{(k^2_L, 0)} \right) \boxtimes \left( \rho^{(n)}_{(k^1_R, 0)} \boxtimes \rho^{(n)}_{(k^2_R, 0)} \right), \tag{4.7}
\]
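where $a_1 \in \mathbb{Z}_{\ge 0}$, $k^1_L, k^2_L, k^1_R, k^2_R \in \mathbb{Z}$ and the representation space is identified as
\[
V \equiv V^{(n)}_{a_1 \varpi_1}. \tag{4.8}
\]
Note that any element $X \in \mathcal{G} \cong u(N)^{\sigma_L,+} \oplus u(N)^{\sigma_R,+}$ of the symmetry algebra $\mathcal{G}$ can be realized as a pair $X = (X_L, X_R)$ with $X_L, X_R \in u(N)^{\sigma_L,+} = u(N)^{\sigma_R,+} \cong u(n) \oplus u(n)$. So, for any $X \in \mathcal{G}$ we have the refined decomposition
\[
X = (X_L, X_R) = ((X^1_L, X^2_L), (X^1_R, X^2_R)), \tag{4.9}
\]
where $X^1_L, X^2_L, X^1_R, X^2_R \in u(n)$ and as block-matrices
\[
(X^1_L, X^2_L) := \begin{bmatrix} X^1_L & 0 \\ 0 & X^2_L \end{bmatrix}, \qquad (X^1_R, X^2_R) := \begin{bmatrix} X^1_R & 0 \\ 0 & X^2_R \end{bmatrix}. \tag{4.10}
\]
With these notations, the formula of the Lie algebra representation corresponding to (4.7) reads
\[
\rho'(X) = \pi^{(n)}_{a_1 \varpi_1}\!\left( X^1_L - \frac{\mathrm{tr}(X^1_L)}{n} 1_n \right) + \left[ \left( k^1_L + \frac{\mu_n(a_1 \varpi_1)}{n} \right) \mathrm{tr}(X^1_L) + \mathrm{tr}(k^2_L X^2_L + k^1_R X^1_R + k^2_R X^2_R) \right] \mathrm{Id}_V. \tag{4.11}
\]

Lemma 4.1. The KKS ansatz (4.7) defines admissible UIRREPS of $G$ satisfying $\dim(V^K) \ne 0$ if and only if $k^1_L + k^2_L + k^1_R + k^2_R = 0$ and $a_1 = \gamma n$ with some $\gamma \in \mathbb{Z}_{\ge 0}$. In these cases $\dim(V^K) = 1$. Using the bosonic oscillator realization of $V$ (4.8) described in the Appendix, $V^K$ has the form
\[
V^K \cong V^{(n)}_{\gamma n \varpi_1}[0] = \mathrm{span}_{\mathbb{C}}\{ |\gamma, \gamma, \ldots, \gamma\rangle \}. \tag{4.12}
\]

Proof. The isotropy subalgebra is $\mathcal{K} = \mathcal{M}_{\mathrm{diag}} = \{ X = (X_0, X_0) \mid X_0 \in \mathcal{M} \}$, where $\mathcal{M}$ can be parametrized as
\[
\mathcal{M} = \left\{ X_0 = \begin{bmatrix} H + ix 1_n & 0 \\ 0 & H + ix 1_n \end{bmatrix} \,\middle|\, H \in i\mathcal{H}^{(n)}_{\mathbb{R}},\ x \in \mathbb{R} \right\}. \tag{4.13}
\]
That is, for the components of any $X \in \mathcal{K}$ we have the parametrization
\[
X^1_L = X^2_L = X^1_R = X^2_R = H + ix 1_n. \tag{4.14}
\]
Thus, using Eq. (4.11), for any $v \in V^{(n)}_{a_1 \varpi_1}$ and $X \in \mathcal{K}$ we can write
\[
\rho'(X) v = \pi^{(n)}_{a_1 \varpi_1}(H) v + ix \left( \mu_n(a_1 \varpi_1) + n(k^1_L + k^2_L + k^1_R + k^2_R) \right) v. \tag{4.15}
\]
Clearly $\rho'(X) v = 0$ ($\forall X \in \mathcal{K}$) if and only if
\[
\pi^{(n)}_{a_1 \varpi_1}(H) v = 0 \ (\forall H \in i\mathcal{H}^{(n)}_{\mathbb{R}}) \quad\text{and}\quad \mu_n(a_1 \varpi_1) + n(k^1_L + k^2_L + k^1_R + k^2_R) = 0. \tag{4.16}
\]
Therefore $V^K = V^{\mathcal{K}} \cong V^{(n)}_{a_1 \varpi_1}[0]$, provided that $\mu_n(a_1 \varpi_1) + n(k^1_L + k^2_L + k^1_R + k^2_R) = 0$. It is easy to see that $V^{(n)}_{a_1 \varpi_1}[0] \ne \{0\}$ if and only if $a_1 = \gamma n$ for some $\gamma \in \mathbb{Z}_{\ge 0}$.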
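Since $\mu_n(\gamma n \varpi_1) = 0$ by (4.2), the requirement $k^1_L + k^2_L + k^1_R + k^2_R = 0$ then also follows from (4.16). Finally, note that by using the oscillator realization of $V^{(n)}_{\gamma n \varpi_1}$ one has the second equality in (4.12).

In what follows we make use of the basis of $\mathcal{K}^\perp$ constructed in Sec. 3.3. In the present case this is given by the basis $\{ V^a_\alpha, W^a_\alpha \}_{a \in \{r, i\},\ \alpha \in R_+(C_n)}$ of $\mathcal{K}^\perp_e$ together with the basis $\{ \hat{L}_j \}$ of $\mathcal{K}^\perp_a$ defined according to (3.63) by using the following orthonormal basis $\{ L_j \}_{j=1}^{n}$ of $\mathcal{M}$:
\[
L_j := \frac{i}{\sqrt{2}} \begin{bmatrix} E_{jj} & 0 \\ 0 & E_{jj} \end{bmatrix} \in \mathcal{M} \quad (1 \le j \le n). \tag{4.17}
\]

Lemma 4.2. In the case of the KKS ansatz (4.7) subject to the conditions of Lemma 4.1 the third term in the reduced Laplacian (2.14) gives
\[
\begin{aligned}
b_{\alpha,\beta}\, \rho'(T^\alpha) \rho'(T^\beta) ={}& -\frac{1}{2} n (k^1_L + k^2_L)^2 - \gamma(\gamma + 1) \sum_{1 \le k < l \le n} \left( \frac{1}{\sin^2(q_k - q_l)} + \frac{1}{\sin^2(q_k + q_l)} \right) \\
&- \frac{1}{2} \left[ (k^1_L + k^1_R)^2 - (k^2_L + k^1_R)^2 \right] \sum_{j=1}^{n} \frac{1}{\sin^2(q_j)} - 2 (k^2_L + k^1_R)^2 \sum_{j=1}^{n} \frac{1}{\sin^2(2q_j)}.
\end{aligned} \tag{4.18}
\]

Proof. Note that in the present case only the first four sums occur in the formula (3.77). Recalling that $\mu_n(\gamma n \varpi_1) = 0$ and utilizing formula (4.11) for $\rho'$, we can calculate the action of the various terms. For example, since
\[
\hat{L}_j = \frac{1}{\sqrt{2}} (L_j, -L_j) = \frac{i}{2} \left( (E_{jj}, E_{jj}), (-E_{jj}, -E_{jj}) \right), \tag{4.19}
\]
we get
\[
\rho'(\hat{L}_j) = \frac{i}{2} \left[ \pi^{(n)}_{\gamma n \varpi_1}\!\left( E_{jj} - \frac{1}{n} 1_n \right) + (k^1_L + k^2_L - k^1_R - k^2_R)\, \mathrm{Id}_V \right]. \tag{4.20}
\]
The action of $\rho'(\hat{L}_j)$ on $V^K$ can be easily calculated in the bosonic oscillator picture. Since $\pi^{(n)}_{\gamma n \varpi_1}\!\left( E_{jj} - \frac{1}{n} 1_n \right) |\gamma, \gamma, \ldots, \gamma\rangle = 0$, and since $k^2_R = -k^1_L - k^2_L - k^1_R$, it follows that on the subspace $V^K \cong \mathrm{span}_{\mathbb{C}}\{ |\gamma, \gamma, \ldots, \gamma\rangle \}$ the operator $\rho'(\hat{L}_j)$ acts as the scalar $\rho'(\hat{L}_j) = i(k^1_L + k^2_L)$. In the same manner, the equalities $\rho'(V^{i}_{2\varepsilon_j}) = i(k^1_L + k^1_R)$ and $\rho'(W^{i}_{2\varepsilon_j}) = -i(k^2_L + k^1_R)$ hold on $V^K$. Furthermore, we have on $V$
\[
\rho'(V^{r}_{\varepsilon_k - \varepsilon_l}) = \rho'(W^{r}_{\varepsilon_k - \varepsilon_l}) = \rho'(V^{r}_{\varepsilon_k + \varepsilon_l}) = \rho'(W^{r}_{\varepsilon_k + \varepsilon_l}) = \frac{1}{2\sqrt{2}} \left( \pi^{(n)}_{\gamma n \varpi_1}(E_{kl}) - \pi^{(n)}_{\gamma n \varpi_1}(E_{lk}) \right), \tag{4.21}
\]
\[
\rho'(V^{i}_{\varepsilon_k - \varepsilon_l}) = \rho'(W^{i}_{\varepsilon_k - \varepsilon_l}) = \rho'(V^{i}_{\varepsilon_k + \varepsilon_l}) = \rho'(W^{i}_{\varepsilon_k + \varepsilon_l}) = \frac{i}{2\sqrt{2}} \left( \pi^{(n)}_{\gamma n \varpi_1}(E_{kl}) + \pi^{(n)}_{\gamma n \varpi_1}(E_{lk}) \right). \tag{4.22}
\]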
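Next, $\forall k, l \in \{1, 2, \ldots, n\}$, $k \ne l$, we obtain
\[
\pi^{(n)}_{\gamma n \varpi_1}(E_{kl})\, \pi^{(n)}_{\gamma n \varpi_1}(E_{lk}) |\gamma, \gamma, \ldots, \gamma\rangle = b^\dagger_k b_l b^\dagger_l b_k |\gamma, \gamma, \ldots, \gamma\rangle = \gamma(\gamma + 1) |\gamma, \gamma, \ldots, \gamma\rangle. \tag{4.23}
\]
The above equations imply that on $V^K$
\[
\rho'(\hat{L}_j)^2 = -(k^1_L + k^2_L)^2, \qquad \rho'(V^{i}_{2\varepsilon_j})^2 = -(k^1_L + k^1_R)^2, \qquad \rho'(W^{i}_{2\varepsilon_j})^2 = -(k^2_L + k^1_R)^2, \tag{4.24}
\]
\[
\rho'(V^{r}_\alpha)^2 + \rho'(V^{i}_\alpha)^2 = \rho'(W^{r}_\alpha)^2 + \rho'(W^{i}_\alpha)^2 = -\frac{1}{2} \gamma(\gamma + 1) \quad\text{for}\quad \alpha = \varepsilon_k \pm \varepsilon_l,\ k \ne l. \tag{4.25}
\]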
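The substitution of (4.24) and (4.25) into (3.77) relies on two elementary trigonometric identities. As a worked step, written out here for convenience and not taken from the original text, they combine the half-angle denominators of (3.77) into the denominators appearing in (4.18):

```latex
% Identities used, valid wherever the denominators are nonzero:
\frac{1}{\sin^2(x/2)} + \frac{1}{\cos^2(x/2)} = \frac{4}{\sin^2 x},
\qquad
\frac{1}{\cos^2 x} = \frac{4}{\sin^2(2x)} - \frac{1}{\sin^2 x}.

% Pair terms of (3.77) with (4.25), for x = q_k \mp q_l:
\frac{1}{2}\left(-\frac{1}{2}\gamma(\gamma+1)\right)
\left(\frac{1}{\sin^2(x/2)} + \frac{1}{\cos^2(x/2)}\right)
= -\frac{\gamma(\gamma+1)}{\sin^2 x}.

% Terms of (3.77) labelled by 2\varepsilon_j with (4.24):
-\frac{1}{2}\left(\frac{(k_L^1+k_R^1)^2}{\sin^2 q_j} + \frac{(k_L^2+k_R^1)^2}{\cos^2 q_j}\right)
= -\frac{(k_L^1+k_R^1)^2 - (k_L^2+k_R^1)^2}{2\sin^2 q_j}
  - \frac{2\,(k_L^2+k_R^1)^2}{\sin^2(2q_j)}.
```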
Now (4.18) results by substitution into (3.77), using obvious trigonometric identities.

The following proposition is obtained by putting together the statements of Eq. (3.46), Lemmas 3.2 and 4.2.

Proposition 4.3. Under the KKS ansatz (4.7) the general formula (2.14) gives the following result for the reduction of the Laplace operator of $U(N)$:
\[
-\Delta_{\mathrm{red}} = H_{BC_n} + \frac{1}{2} n (k^1_L + k^2_L)^2 - \frac{1}{6} n (2n - 1)(2n + 1), \tag{4.26}
\]
where $H_{BC_n}$ is the Sutherland Hamiltonian (1.4) with the coupling parameters defined by
\[
a \equiv \gamma, \qquad b \equiv |k^1_L + k^1_R|, \qquad c \equiv |k^2_L + k^1_R| \tag{4.27}
\]
in terms of the free parameters $k^1_L, k^2_L, k^1_R \in \mathbb{Z}$ and $\gamma \in \mathbb{Z}_{\ge 0}$ determined by Lemma 4.1.

Remark. By varying $\gamma, k^1_L, k^2_L, k^1_R$, the coupling parameters $a, b, c$ in (1.4) can take arbitrary non-negative integer values. As further discussed in Sec. 5, Proposition 4.3 follows also from the results of Oblomkov [19].
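4.2. Case II: m = r = n + 1, s = n, N = 2n + 1

In this case $\sigma_L = \sigma_R = \theta_{n+1,n}$ and correspondingly $U(N)_L = U(N)_R \cong U(n + 1) \times U(n)$. We consider the following ansatz for the UIRREP $(\rho, V)$ of the symmetry group $G$ (4.1),
\[
\rho := \left( \rho^{(n+1)}_{(k^1_L, a_1 \varpi_1)} \boxtimes \rho^{(n)}_{(k^2_L, 0)} \right) \boxtimes \left( \rho^{(n+1)}_{(k^1_R, 0)} \boxtimes \rho^{(n)}_{(k^2_R, 0)} \right), \tag{4.28}
\]
where $a_1 \in \mathbb{Z}_{\ge 0}$, $k^1_L, k^2_L, k^1_R, k^2_R \in \mathbb{Z}$ and the carrier space is identified as $V \equiv V^{(n+1)}_{a_1 \varpi_1}$. Similarly to (4.9), any $X \in \mathcal{G} \cong u(N)^{\sigma_L,+} \oplus u(N)^{\sigma_R,+}$ can be realized as a pair $X = (X_L, X_R)$ with $X_L, X_R \in u(N)^{\sigma_L,+} = u(N)^{\sigma_R,+} \cong u(n + 1) \oplus u(n)$. So, we write $X \in \mathcal{G}$ as $X = (X_L, X_R) = ((X^1_L, X^2_L), (X^1_R, X^2_R))$ with $X^1_L, X^1_R \in u(n + 1)$, $X^2_L, X^2_R \in u(n)$. Then (4.28) implies the formula
\[
\rho'(X) = \pi^{(n+1)}_{a_1 \varpi_1}\!\left( X^1_L - \frac{\mathrm{tr}(X^1_L)}{n + 1} 1_{n+1} \right) + \left[ \left( k^1_L + \frac{\mu_{n+1}(a_1 \varpi_1)}{n + 1} \right) \mathrm{tr}(X^1_L) + k^2_L\, \mathrm{tr}(X^2_L) + k^1_R\, \mathrm{tr}(X^1_R) + k^2_R\, \mathrm{tr}(X^2_R) \right] \mathrm{Id}_V. \tag{4.29}
\]

Lemma 4.4. The KKS ansatz (4.28) yields admissible UIRREPS of $G$ if and only if $\exists \gamma, \tilde{\gamma} \in \mathbb{Z}_{\ge 0}$ such that the parameters $k^1_L, k^2_L, k^1_R, k^2_R \in \mathbb{Z}$, and $a_1 \in \mathbb{Z}_{\ge 0}$ satisfy the conditions
\[
a_1 = \gamma n + \tilde{\gamma}, \qquad k^2_L + k^2_R = \tilde{\gamma} - \gamma, \qquad k^1_L + k^1_R = R - (\tilde{\gamma} - \gamma), \tag{4.30}
\]
where $\tilde{\gamma} - \gamma = Q + (n + 1)R$ with uniquely determined $Q = Q(\gamma, \tilde{\gamma}) \in \{0, 1, \ldots, n\}$ and $R = R(\gamma, \tilde{\gamma}) \in \mathbb{Z}$. If these conditions hold, then $\dim(V^K) = 1$ and $V^K$ is given by
\[
V^K \cong V^{(n+1)}_{a_1 \varpi_1}[\gamma e_1 + \gamma e_2 + \cdots + \gamma e_n + \tilde{\gamma} e_{n+1}] = \mathrm{span}_{\mathbb{C}}\{ |\gamma, \gamma, \ldots, \gamma, \tilde{\gamma}\rangle \}, \tag{4.31}
\]
where the last equality refers to the bosonic oscillator realization of $V^{(n+1)}_{a_1 \varpi_1}$.

Proof. For the isotropy subalgebra we have $\mathcal{K} = \mathcal{M}_{\mathrm{diag}} = \{ X = (X_0, X_0) \mid X_0 \in \mathcal{M} \}$, where
\[
\mathcal{M} = \left\{ X_0 = i \begin{bmatrix} D & 0 & 0 \\ 0 & \omega & 0 \\ 0 & 0 & D \end{bmatrix} \,\middle|\, D = \mathrm{diag}(d_1, d_2, \ldots, d_n) \in \mathbb{R}^{n \times n},\ \omega \in \mathbb{R} \right\}. \tag{4.32}
\]
So, for any $X \in \mathcal{K}$ we have $X_L = X_R = X_0$, and
\[
X^1_L = X^1_R = i \begin{bmatrix} D & 0 \\ 0 & \omega \end{bmatrix}, \qquad X^2_L = X^2_R = iD. \tag{4.33}
\]
Now, for each $\varphi = (\varphi_1, \varphi_2, \ldots, \varphi_n) \in \mathbb{R}^n$ we let $\bar{\varphi} := \sum_{j=1}^{n} \varphi_j$, and consider the traceless Cartan elements
\[
H_\varphi := \mathrm{diag}(\varphi_1, \varphi_2, \ldots, \varphi_n, -\bar{\varphi}) \in \mathcal{H}^{(n+1)}_{\mathbb{R}}, \qquad
\tilde{H}_\varphi := \mathrm{diag}(\varphi_1, \varphi_2, \ldots, \varphi_n) - \frac{1}{n} \bar{\varphi}\, 1_n \in \mathcal{H}^{(n)}_{\mathbb{R}}. \tag{4.34}
\]
Then the components of $X \in \mathcal{K}$ can be parametrized as
\[
X^1_L = X^1_R = i H_\varphi + ix 1_{n+1}, \qquad X^2_L = X^2_R = i \tilde{H}_\varphi + i \left( x + \frac{1}{n} \bar{\varphi} \right) 1_n, \tag{4.35}
\]
where $\varphi \in \mathbb{R}^n$ and $x \in \mathbb{R}$. From (4.29), it follows that $\forall v \in V^{(n+1)}_{a_1 \varpi_1}$ we have
\[
\rho'(X) v = \pi^{(n+1)}_{a_1 \varpi_1}(i H_\varphi) v + i (k^2_L + k^2_R) \bar{\varphi}\, v + ix \left( \mu_{n+1}(a_1 \varpi_1) + (n + 1)(k^1_L + k^1_R) + n(k^2_L + k^2_R) \right) v. \tag{4.36}
\]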
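Clearly $\rho'(X) v = 0$ ($\forall X \in \mathcal{K}$) if and only if
\[
\pi^{(n+1)}_{a_1 \varpi_1}(H_\varphi) v = -(k^2_L + k^2_R) \bar{\varphi}\, v \quad (\forall \varphi \in \mathbb{R}^n), \tag{4.37}
\]
and $\mu_{n+1}(a_1 \varpi_1) + (n + 1)(k^1_L + k^1_R) + n(k^2_L + k^2_R) = 0$. Note that $\bar{\varphi} = \sum_{j=1}^{n} \varphi_j = \sum_{j=1}^{n} e_j(H_\varphi)$, so after introducing the shorthand notations
\[
\kappa_1 := k^1_L + k^1_R \in \mathbb{Z} \quad\text{and}\quad \kappa_2 := k^2_L + k^2_R \in \mathbb{Z}, \tag{4.38}
\]
we conclude that
\[
V^K = V^{\mathcal{K}} \cong V^{(n+1)}_{a_1 \varpi_1}\!\left[ -\kappa_2 \sum_{j=1}^{n} e_j \right], \tag{4.39}
\]
provided that $\mu_{n+1}(a_1 \varpi_1) + (n + 1)\kappa_1 + n\kappa_2 = 0$. Our next goal is to identify the weight space $V^{(n+1)}_{a_1 \varpi_1}[-\kappa_2(e_1 + e_2 + \cdots + e_n)]$. Recall that $-\kappa_2(e_1 + e_2 + \cdots + e_n) \in W^{(n+1)}_{a_1 \varpi_1}$ if and only if $\exists (l_1, l_2, \ldots, l_{n+1}) \in \mathbb{Z}^{n+1}_{\ge 0}$ with $l_1 + l_2 + \cdots + l_{n+1} = a_1$, such that
\[
-\kappa_2(e_1 + e_2 + \cdots + e_n) = \sum_{j=1}^{n+1} l_j e_j = \sum_{j=1}^{n} (l_j - l_{n+1}) e_j. \tag{4.40}
\]
Since the functionals $e_1, e_2, \ldots, e_n$ are linearly independent, we end up with the requirement $l_1 = l_2 = \cdots = l_n = l_{n+1} - \kappa_2$. For the free parameters we choose $\gamma := l_1$ and $\tilde{\gamma} := l_{n+1}$, then the parameters $\kappa_2 = k^2_L + k^2_R$ and $a_1$ have to obey the equations $\kappa_2 = \tilde{\gamma} - \gamma$ and $a_1 = \gamma n + \tilde{\gamma}$. Note that under these assumptions we have
\[
V^{(n+1)}_{a_1 \varpi_1}[-\kappa_2(e_1 + e_2 + \cdots + e_n)] = V^{(n+1)}_{a_1 \varpi_1}[\gamma e_1 + \gamma e_2 + \cdots + \gamma e_n + \tilde{\gamma} e_{n+1}] = \mathrm{span}_{\mathbb{C}}\{ |\gamma, \gamma, \ldots, \gamma, \tilde{\gamma}\rangle \}. \tag{4.41}
\]
Now let us express the value of the label $\mu_{n+1}(a_1 \varpi_1) \in \{0, 1, \ldots, n\}$ in terms of $\gamma$ and $\tilde{\gamma}$. Recalling (4.2), we can write
\[
\mu_{n+1}(a_1 \varpi_1) = \mu_{n+1}((\gamma n + \tilde{\gamma}) \varpi_1) \equiv \gamma n + \tilde{\gamma} \equiv \tilde{\gamma} - \gamma \pmod{n + 1}. \tag{4.42}
\]
Notice that $\exists!\, Q = Q(\gamma, \tilde{\gamma}) \in \{0, 1, \ldots, n\}$ and $\exists!\, R = R(\gamma, \tilde{\gamma}) \in \mathbb{Z}$ such that $\tilde{\gamma} - \gamma = Q + (n + 1)R$, thereby the previous congruence relation translates into the equation $\mu_{n+1}(a_1 \varpi_1) = Q$. Plugging this equation into the requirement $\mu_{n+1}(a_1 \varpi_1) + (n + 1)\kappa_1 + n\kappa_2 = 0$, we get
\[
0 = Q + (n + 1)\kappa_1 + n(Q + (n + 1)R) = (n + 1)(\tilde{\gamma} - \gamma - R + \kappa_1), \tag{4.43}
\]
therefore we end up with the additional constraint $k^1_L + k^1_R = \kappa_1 = R - (\tilde{\gamma} - \gamma)$.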
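The bookkeeping of Lemma 4.4 is simple enough to automate. The following sketch, in plain Python, takes $\gamma$, $\tilde{\gamma}$, $k^1_R$, $k^2_R$ as free inputs, solves (4.30) for the remaining parameters, and exposes the quantity that must vanish in the proof; the function name and the dictionary layout are illustrative assumptions. Python's `divmod` with a positive divisor returns a remainder in $\{0, \ldots, n\}$, which matches the definition of $Q$ and $R$.

```python
def case_ii_parameters(n, gamma, gamma_t, kR1, kR2):
    """Solve the conditions (4.30) of Lemma 4.4 for given free parameters."""
    a1 = gamma * n + gamma_t
    R, Q = divmod(gamma_t - gamma, n + 1)   # gamma_t - gamma = Q + (n + 1) * R
    kL2 = (gamma_t - gamma) - kR2
    kL1 = R - (gamma_t - gamma) - kR1
    mu = a1 % (n + 1)                       # mu_{n+1}(a1 * varpi_1), see (4.2)
    # Scalar condition from the proof: mu + (n + 1) * kappa_1 + n * kappa_2 = 0.
    residual = mu + (n + 1) * (kL1 + kR1) + n * (kL2 + kR2)
    return {"a1": a1, "Q": Q, "R": R, "kL1": kL1, "kL2": kL2, "mu": mu, "residual": residual}

example = case_ii_parameters(n=3, gamma=2, gamma_t=7, kR1=1, kR2=-4)
```

For the example above, $\tilde{\gamma} - \gamma = 5 = 1 + 4 \cdot 1$, so $Q = R = 1$ and $a_1 = 13$.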
Observe from Lemma 4.4 that $k^1_R, k^2_R \in \mathbb{Z}$ and $\gamma, \tilde{\gamma} \in \mathbb{Z}_{\ge 0}$ can be taken as free parameters that label the admissible cases of the KKS ansatz (4.28). By proceeding as in Sec. 4.1, it is a matter of straightforward substitutions to specialize the reduced Laplacian (2.14) to our case. In this way we found the following result.

Proposition 4.5. Under the KKS ansatz (4.28) with parameters satisfying (4.30) the Laplace operator of $U(N)$ reduces to
\[
-\Delta_{\mathrm{red}} = H_{BC_n} + \frac{1}{2} n (k^1_R + k^2_R)^2 + (k^1_R)^2 - \frac{1}{3} n (n + 1)(2n + 1), \tag{4.44}
\]
where $H_{BC_n}$ is given by (1.4) with the coupling parameters determined in terms of the arbitrary parameters $k^1_R, k^2_R \in \mathbb{Z}$ and $\gamma, \tilde{\gamma} \in \mathbb{Z}_{\ge 0}$ according to
\[
a \equiv \gamma, \qquad b \equiv \gamma + \tilde{\gamma} + 1, \qquad c \equiv |\tilde{\gamma} - \gamma + k^1_R - k^2_R|. \tag{4.45}
\]
Remark. The non-negative integer coupling parameters a, b, c that arise in this case satisfy the condition b ≥ a + 1. 4.3. Case III: m = n + 2, r = s = n + 1, N = 2n + 2 Now the fixpoint subgroups of the two different involutions σL = θn+1,n+1 and σR = θn+2,n are U (N )L ∼ = U (n + 1) × U (n + 1) and U (N )R ∼ = U (n + 2) × U (n). We consider the reductions associated with UIRREPS (ρ, V ) of G (4.1) having the form ' & ' & (n+1) (n+1) (n+2) (n) (4.46) ρ := ρ(k1 ,0) ρ(k2 ,0) ρ(k1 ,a1 1 ) ρ(k2 ,0) , L
L
R
R
where a1 ∈ Z≥0 and ∈ Z, and the representation space is identified (n+2) as V ≡ Va1 1 . Any X ∈ G is a pair X = (XL , XR ) with XL ∈ u(n + 1) ⊕ u(n + 1) and XR ∈ u(n + 2) ⊕ u(n), and we may further write XL = (XL1 , XL2 ) and XR = 1 2 1 2 , XR ), where now XL1 , XL2 ∈ u(n + 1), XR ∈ u(n + 2) and XR ∈ u(n). Then the (XR G-representation can be written as 1 ) tr(XR 1 1 ρ (X) = πa(n+2) − X n+2 R 1 1 n+2 µn+2 (a1 1 ) 1 2 1 1 2 2 + kL tr(XL1 ) + kL tr(XL2 )+ kR + ) + kR tr(XR ) IdV . tr(XR n+2 (4.47) 1 2 1 2 kL , kL , kR , kR
Lemma 4.6. The KKS ansatz (4.46) yields admissible UIRREPS if and only if 1 2 1 2 ∃ γ, γ˜ , γˆ ∈ Z≥0 and k ∈ Z such that the parameters kL , kL , kR , kR ∈ Z and a1 ∈ Z≥0 satisfy the conditions a1 = γn + γ˜ + γˆ , 1 kR = R − γ˜ − k,
1 kL = k,
2 kL = γ˜ − γˆ + k,
2 kR = γˆ − γ − k,
(4.48)
where a1 = Q+(n+2)R with uniquely determined Q = Q(γ, γ˜ , γˆ ) ∈ {0, 1, . . . , n+1} and R = R(γ, γ˜ , γˆ ) ∈ Z. If the above conditions are met, then dim(V K ) = 1 and
July 12, J070-S0129055X10004065
2010 12:1 WSPC/S0129-055X
148-RMP
Derivations of Trigonometric BCn Sutherland Model
725
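concretely
\[
V^K = V^{(n+2)}_{a_1\varpi_1}[\gamma e_1 + \gamma e_2 + \cdots + \gamma e_n + \tilde\gamma e_{n+1} + \hat\gamma e_{n+2}] = \operatorname{span}_{\mathbb{C}}\{|\gamma, \gamma, \ldots, \gamma, \tilde\gamma, \hat\gamma\rangle\}, \tag{4.49}
\]
where the last equality refers to the bosonic oscillator realization of $V^{(n+2)}_{a_1\varpi_1}$.

Proof. For the isotropy subalgebra we have $\mathcal{K} = \mathcal{M}_{\mathrm{diag}} = \{X = (X_0, X_0) \mid X_0 \in \mathcal{M}\}$, where
\[
\mathcal{M} = \left\{ X_0 = i\begin{pmatrix} D & 0 & 0 & 0 \\ 0 & \omega & 0 & 0 \\ 0 & 0 & \tilde\omega & 0 \\ 0 & 0 & 0 & D \end{pmatrix} \;\middle|\; D = \operatorname{diag}(d_1, d_2, \ldots, d_n) \in \mathbb{R}^{n\times n},\ \omega, \tilde\omega \in \mathbb{R} \right\}. \tag{4.50}
\]
Any $X = (X_L, X_R) \in \mathcal{K}$ satisfies $X_L = X_R = X_0$, and therefore it has the components
\[
X_L^1 = i\begin{pmatrix} D & 0 \\ 0 & \omega \end{pmatrix}, \quad X_L^2 = i\begin{pmatrix} \tilde\omega & 0 \\ 0 & D \end{pmatrix}, \quad X_R^1 = i\begin{pmatrix} D & 0 & 0 \\ 0 & \omega & 0 \\ 0 & 0 & \tilde\omega \end{pmatrix}, \quad X_R^2 = iD. \tag{4.51}
\]
For any real $(n+1)$-tuple $\phi = (\phi_1, \phi_2, \ldots, \phi_{n+1}) \in \mathbb{R}^{n+1}$ we let $\bar\phi := \sum_{j=1}^{n+1}\phi_j$, $\tilde\phi := \sum_{j=1}^{n}\phi_j$, and introduce the traceless matrices
\[
H_\phi := \operatorname{diag}(\phi_1, \phi_2, \ldots, \phi_{n+1}, -\bar\phi), \qquad H_R^2 := \operatorname{diag}(\phi_1, \phi_2, \ldots, \phi_n) - \frac{\tilde\phi}{n}\,\mathbf{1}_n, \tag{4.52}
\]
\[
H_L^1 := \operatorname{diag}(\phi_1, \phi_2, \ldots, \phi_{n+1}) - \frac{\bar\phi}{n+1}\,\mathbf{1}_{n+1}, \qquad H_L^2 := \operatorname{diag}(-\bar\phi, \phi_1, \ldots, \phi_n) + \frac{\phi_{n+1}}{n+1}\,\mathbf{1}_{n+1}. \tag{4.53}
\]
We then write the components of $X \in \mathcal{K}$ in the form
\[
X_R^1 = iH_\phi + ix\,\mathbf{1}_{n+2}, \qquad X_R^2 = iH_R^2 + i\Bigl(x + \frac{\tilde\phi}{n}\Bigr)\mathbf{1}_n, \tag{4.54}
\]
\[
X_L^1 = iH_L^1 + i\Bigl(x + \frac{\bar\phi}{n+1}\Bigr)\mathbf{1}_{n+1}, \qquad X_L^2 = iH_L^2 + i\Bigl(x - \frac{\phi_{n+1}}{n+1}\Bigr)\mathbf{1}_{n+1}. \tag{4.55}
\]
From (4.47), it follows that for any $v \in V^{(n+2)}_{a_1\varpi_1}$ and $X \in \mathcal{K}$ we have
\[
\rho'(X)v = \pi^{(n+2)}_{a_1\varpi_1}(iH_\phi)v + i\bigl(k_L^1\bar\phi - k_L^2\phi_{n+1} + k_R^2\tilde\phi\bigr)v + ix\bigl[\mu_{n+2}(a_1\varpi_1) + (n+2)k_R^1 + (n+1)(k_L^1 + k_L^2) + nk_R^2\bigr]v. \tag{4.56}
\]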
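The passage from (4.47) to (4.56) is a trace computation on the diagonal blocks (4.52)–(4.55). The following sketch checks the scalar part of that computation in exact arithmetic. It is only an illustration: the function names are hypothetical, all matrices are represented by their diagonals divided by $i$, and the term containing $\mu_{n+2}(a_1\varpi_1)$ is left out because its definition is not restated in this section.

```python
from fractions import Fraction


def blocks(n, phi, x):
    """Diagonals (divided by i) of X_L^1, X_L^2, X_R^1, X_R^2 as in (4.52)-(4.55)."""
    phi = [Fraction(p) for p in phi]      # phi_1, ..., phi_{n+1}
    x = Fraction(x)
    phi_bar = sum(phi)                    # sum over j = 1..n+1
    phi_tilde = sum(phi[:n])              # sum over j = 1..n
    # Traceless parts, (4.52) and (4.53); phi[n] is phi_{n+1}.
    H_phi = phi + [-phi_bar]
    H_R2 = [p - phi_tilde / n for p in phi[:n]]
    H_L1 = [p - phi_bar / (n + 1) for p in phi]
    H_L2 = [q + phi[n] / (n + 1) for q in [-phi_bar] + phi[:n]]
    # Full blocks, (4.54) and (4.55).
    X_R1 = [h + x for h in H_phi]
    X_R2 = [h + x + phi_tilde / n for h in H_R2]
    X_L1 = [h + x + phi_bar / (n + 1) for h in H_L1]
    X_L2 = [h + x - phi[n] / (n + 1) for h in H_L2]
    return X_L1, X_L2, X_R1, X_R2


def scalar_part_from_traces(n, phi, x, kL1, kL2, kR1, kR2):
    """Trace terms of (4.47), divided by i, without the mu term."""
    X_L1, X_L2, X_R1, X_R2 = blocks(n, phi, x)
    return kL1 * sum(X_L1) + kL2 * sum(X_L2) + kR1 * sum(X_R1) + kR2 * sum(X_R2)


def scalar_part_from_456(n, phi, x, kL1, kL2, kR1, kR2):
    """Scalar coefficient read off from (4.56), divided by i, without the mu term."""
    phi = [Fraction(p) for p in phi]
    phi_bar, phi_tilde = sum(phi), sum(phi[:n])
    return (kL1 * phi_bar - kL2 * phi[n] + kR2 * phi_tilde
            + Fraction(x) * ((n + 2) * kR1 + (n + 1) * (kL1 + kL2) + n * kR2))
```

The same helper also confirms that the two block decompositions describe one and the same diagonal matrix $X_0$, since concatenating the diagonals of $(X_L^1, X_L^2)$ gives the same list as concatenating those of $(X_R^1, X_R^2)$.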
July 12, J070-S0129055X10004065
726
2010 12:1 WSPC/S0129-055X
148-RMP
L. Feh´ er & B. G. Pusztai
Clearly $\rho'(X)v = 0$ ($\forall X \in \mathcal{K}$) if and only if
\[
\pi^{(n+2)}_{a_1\varpi_1}(H_\phi)v = \bigl(k_L^2\phi_{n+1} - k_L^1\bar\phi - k_R^2\tilde\phi\bigr)v \qquad (\forall \phi \in \mathbb{R}^{n+1}), \tag{4.57}
\]
and
\[
\mu_{n+2}(a_1\varpi_1) + (n+2)k_R^1 + (n+1)(k_L^1 + k_L^2) + nk_R^2 = 0. \tag{4.58}
\]
Since
\[
k_L^2\phi_{n+1} - k_L^1\bar\phi - k_R^2\tilde\phi = -(k_L^1 + k_R^2)(e_1 + e_2 + \cdots + e_n)(H_\phi) + (k_L^2 - k_L^1)e_{n+1}(H_\phi), \tag{4.59}
\]
we obtain from (4.57) that we must have
\[
V^K = V^{(n+2)}_{a_1\varpi_1}\bigl[-(k_L^1 + k_R^2)(e_1 + e_2 + \cdots + e_n) + (k_L^2 - k_L^1)e_{n+1}\bigr]. \tag{4.60}
\]
It is easy to see (cf. the Appendix) that the weight space in (4.60) is non-trivial if and only if $\exists\,(l_1, l_2, \ldots, l_{n+2}) \in \mathbb{Z}^{n+2}_{\geq 0}$ with $l_1 + l_2 + \cdots + l_{n+2} = a_1$, such that
\[
-(k_L^1 + k_R^2)(e_1 + e_2 + \cdots + e_n) + (k_L^2 - k_L^1)e_{n+1} = \sum_{j=1}^{n+1}(l_j - l_{n+2})e_j. \tag{4.61}
\]
We set
\[
\gamma := l_1, \qquad \tilde\gamma := l_{n+1}, \qquad \hat\gamma := l_{n+2}, \qquad k := k_L^1. \tag{4.62}
\]
Then (4.61) requires $l_1 = l_2 = \cdots = l_n = \gamma$ and $\hat\gamma - \gamma = k + k_R^2$ with $\tilde\gamma - \hat\gamma = k_L^2 - k$. So, regarding $\gamma, \tilde\gamma, \hat\gamma \in \mathbb{Z}_{\geq 0}$ and $k \in \mathbb{Z}$ as free parameters, we see that the other parameters have to obey the relations
\[
k_L^2 = \tilde\gamma - \hat\gamma + k, \qquad k_R^2 = \hat\gamma - \gamma - k, \qquad a_1 = \gamma n + \tilde\gamma + \hat\gamma. \tag{4.63}
\]
To satisfy the remaining condition (4.58), we now define $Q = Q(\gamma, \tilde\gamma, \hat\gamma) \in \{0, 1, \ldots, n+1\}$ and $R = R(\gamma, \tilde\gamma, \hat\gamma) \in \mathbb{Z}$ by the equality
\[
a_1 = \gamma n + \tilde\gamma + \hat\gamma = Q + (n+2)R. \tag{4.64}
\]
Then (4.58) translates into the condition $k_R^1 = R - \tilde\gamma - k$, which completes the proof.
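The parametrization of Lemma 4.6 is easy to tabulate. The sketch below computes the dependent parameters from the free ones and evaluates the integer part of (4.58). The function names are hypothetical. The check shows that, with $k_R^1 = R - \tilde\gamma - k$, the combination $(n+2)k_R^1 + (n+1)(k_L^1 + k_L^2) + nk_R^2$ equals $-Q$, so that (4.58) holds provided $\mu_{n+2}(a_1\varpi_1) = Q$. That last identification is an assumption made here, since the definition of $\mu_{n+2}$ is given elsewhere in the paper.

```python
def kks_parameters(n, gamma, gamma_t, gamma_h, k):
    """Dependent parameters of Lemma 4.6 from the free ones (gamma, gamma~, gamma^, k)."""
    a1 = gamma * n + gamma_t + gamma_h
    R, Q = divmod(a1, n + 2)          # a1 = Q + (n+2) R with 0 <= Q <= n+1, as in (4.64)
    return {
        "a1": a1, "Q": Q, "R": R,
        "kL1": k,
        "kL2": gamma_t - gamma_h + k,
        "kR1": R - gamma_t - k,
        "kR2": gamma_h - gamma - k,
    }


def integer_part_of_458(n, p):
    """Left-hand side of (4.58) with the mu term removed."""
    return (n + 2) * p["kR1"] + (n + 1) * (p["kL1"] + p["kL2"]) + n * p["kR2"]


def weight_coefficients(p):
    """Coefficients of (e_1 + ... + e_n) and of e_{n+1} in the weight of (4.60)."""
    return -(p["kL1"] + p["kR2"]), p["kL2"] - p["kL1"]
```

For example, `kks_parameters(2, 1, 0, 3, -1)` gives $a_1 = 5$, $Q = 1$, $R = 1$, and the weight coefficients reduce to $(\gamma - \hat\gamma, \tilde\gamma - \hat\gamma)$, in agreement with (4.61) and (4.62).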
Further direct calculations yield the explicit form of the reduced Laplacian (2.14).

Proposition 4.7. Under the KKS ansatz (4.46) parametrized by arbitrary γ, γ̃, γ̂ ∈ Z≥0 and k ∈ Z according to Lemma 4.6, the reduced Laplacian of U(N) satisfies
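$-\Delta_{\mathrm{red}} = H_{BC_n} + C$ with the constant
\[
C = -\frac{1}{6}n(4n^2 + 12n + 11) + \frac{1}{2}n(2k + \tilde\gamma - \hat\gamma)^2 + (\tilde\gamma + k)(\tilde\gamma + k + 1) + (\hat\gamma - k)(\hat\gamma - k + 1) \tag{4.65}
\]
and coupling parameters given in the notation (1.4) by
\[
a \equiv \gamma, \qquad b \equiv \gamma + \tilde\gamma + 1, \qquad c \equiv \gamma + \hat\gamma + 1. \tag{4.66}
\]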
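As a small worked example, the constant (4.65) and the couplings (4.66) can be evaluated in exact arithmetic. The sketch below reads (4.65) as $C = -\tfrac{1}{6}n(4n^2+12n+11) + \tfrac{1}{2}n(2k+\tilde\gamma-\hat\gamma)^2 + (\tilde\gamma+k)(\tilde\gamma+k+1) + (\hat\gamma-k)(\hat\gamma-k+1)$. The function name is hypothetical and the code is an illustration of the formulas only, not a derivation of them.

```python
from fractions import Fraction


def reduced_constants(n, gamma, gamma_t, gamma_h, k):
    """Return (C, (a, b, c)) according to (4.65) and (4.66)."""
    C = (-Fraction(n * (4 * n * n + 12 * n + 11), 6)
         + Fraction(n * (2 * k + gamma_t - gamma_h) ** 2, 2)
         + (gamma_t + k) * (gamma_t + k + 1)
         + (gamma_h - k) * (gamma_h - k + 1))
    a = gamma
    b = gamma + gamma_t + 1
    c = gamma + gamma_h + 1
    return C, (a, b, c)
```

For $n = 1$ and $\gamma = \tilde\gamma = \hat\gamma = k = 0$ this gives $C = -9/2$ and $(a, b, c) = (0, 1, 1)$. Since $\tilde\gamma, \hat\gamma \geq 0$, the returned couplings always satisfy $b, c \geq a + 1$.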
Remark. The integer coupling parameters a, b, c arising in this case satisfy b, c ≥ a + 1.

5. Discussion

We here summarize the results, discuss the related work [19] and point out open problems. In this paper, we applied the formalism of quantum Hamiltonian reduction under polar group actions to study the reductions of the Laplace operator of U(N) by means of the Hermann action (3.2) of the symmetry group G = (U(r) × U(s)) × (U(m) × U(n)) with N = m + n = r + s. We concentrated on the three series of cases for which the centralizer of the corresponding section, the group K = Mdiag (3.42), is Abelian. We built the representation (ρ, V) of the symmetry group that enters the definition of the reduction by using as building blocks in (4.5) 1-dimensional representations and a symmetric power of the defining representation of the “largest” factor of G. In the framework of this “KKS ansatz” we determined all cases for which the reduction is consistent (that is, dim(V^K) ≠ 0), and saw also that in these admissible cases dim(V^K) = 1. We then calculated the explicit formula of the reduced Laplacian by specializing Eq. (2.14), and found that up to an additive constant it yields the BCn Sutherland Hamiltonian (1.4) with coupling parameters given as follows:

• Case I: a, b, c ∈ Z≥0,
• Case II: a, b, c ∈ Z≥0 with b ≥ a + 1,
• Case III: a, b, c ∈ Z≥0 with b, c ≥ a + 1.

The dependence of the additive constant and of the coupling parameters a, b, c on the parameters of the respective representation (ρ, V) is given by the three propositions formulated in Sec. 4. The above results show that Case I, which is the simplest case, covers all integral values of the coupling parameters a, b, c, and the other two cases allow for alternative group theoretic descriptions of the BCn model at proper subsets of the integral coupling parameters. This state of affairs could not be foreseen before performing the analysis of the different reduction schemes.
Observe also that if b = c, then the Hamiltonian (1.4) becomes of type Cn , but the Bn and Dn type Sutherland models do not arise from (1.4) at any values of the integers a, b, c. This is in contrast with
the corresponding classical Hamiltonian reduction [20], which covers all coupling constants of the classical BCn model, and is due to the never vanishing second term of the “measure factor” given by (3.73). The measure factor represents a kind of quantum anomaly since it gives the difference between the naive quantization of the reduced classical Hamiltonian and the outcome of the corresponding quantum Hamiltonian reduction [21]. In Case I, our analysis is consistent with the results of Oblomkov [19], who studied reductions of the Laplace operator of GL(m + n, C) using the symmetry group
\[
G^{\mathbb{C}} := (GL(m, \mathbb{C}) \times GL(n, \mathbb{C})) \times (GL(m, \mathbb{C}) \times GL(n, \mathbb{C})), \qquad m \geq n. \tag{5.1}
\]
In fact, in Case I, our reduction is nothing but the compact real form of the reduction studied in [19] for m = n. For the m > n cases of the symmetry group (5.1) a generalization of the KKS ansatz was employed in [19], which was found to yield the complex version of the BCn Sutherland Hamiltonian (1.4) with integer coupling parameters subject to the restriction c ≥ b − (m − n) ≥ 0. Thus the coupling parameters obtained for m > n form a proper subset of those obtained for m = n, and this proper subset is different from those that we derived in our Cases II and III. For clarity we note that the KKS ansatz (4.28) that we adopted in Case II was motivated by the corresponding classical reduction [20], and it does not correspond to the ansatz used in [19] for m − n = 1. It is not clear to us how the classical analogues of the m > n reductions of [19] work. Of course, the reductions can be applied also to the differential operators associated with the higher Casimirs. This can be used to explain the complete integrability of the BCn Sutherland model and to derive the spectra as well as the form of the joint eigenfunctions of the corresponding commuting Hamiltonians at the pertinent values of the coupling constants from representation theory [19]. We stress that the general method that we applied in our analysis can be used also to study other problems in the future. For example, one may try to determine all possible values of the coupling constants of the Sutherland models (1.1) that may result as reductions of the Laplacian of a compact Lie group in general. This is closely related to the open problem concerning the classification of the Hermann actions and representations (ρ, V ) of symmetric subgroups G (3.1) such that the condition dim(V K ) = 1 holds for the centralizer K < G of the section. In all such cases the reduced Laplace operator (2.14) is expected to provide a many-body model that can be solved by the group theoretic method because of its very origin. 
Besides the trigonometric real form that we considered, the complex BCn Sutherland model admits the well-known hyperbolic real form and other physically very different real forms associated with two types of particles [27, 28]. The derivation of the hyperbolic model by quantum Hamiltonian reduction can be done similarly to the present work, but starting from U (n, n) instead of U (2n) (in Case I) taking the Cartan involution both for σL and for σR (see also [20]). The models with two types of particles pose a more difficult problem. At the classical level, it
can be seen from [28] that to derive them one needs to take the Cartan involution of U(n, n) for σL and a different involution for σR that has a non-compact fixpoint subgroup. Therefore the corresponding quantum Hamiltonian reduction would require some modifications of the method used in this paper, which need further investigation.

Acknowledgments

We thank J. Balog for useful comments on the manuscript. This work was supported by the Hungarian Scientific Research Fund (OTKA) under the grant K 77400.

Appendix. Some Representation Theoretic Facts

In this Appendix, we gather some basic facts in order to fix the notations used in Sec. 4.

A.1. On the UIRREPS of SU(n) and U(n)

Since the Lie group SU(n) is compact, connected and simply-connected, there is a one-to-one correspondence between the UIRREPS $(\Pi, V)$ of $SU(n)$ and the finite dimensional complex IRREPS $(\pi, V)$ of $sl(n, \mathbb{C}) = su(n)^{\mathbb{C}}$. In the complex simple Lie algebra $sl(n, \mathbb{C})$ we have the Cartan subalgebra $\mathcal{H}$ consisting of diagonal matrices, and use also the real Cartan subalgebra
\[
\mathcal{H}_{\mathbb{R}} := \{H \mid H \in sl(n, \mathbb{C}),\ H \text{ is diagonal with real entries}\} \subset \mathcal{H}. \tag{A.1}
\]
The functionals $\{e_i\}_{i=1}^n \subset \mathcal{H}^*$ are defined by the formula $e_i(H) := H_{ii}$ ($H \in \mathcal{H}$). The roots with respect to $\mathcal{H}$ form the set $\mathcal{R} := \{e_i - e_j \mid 1 \leq i, j \leq n,\ i \neq j\} \subset \mathcal{H}^*$ and we fix the root vectors $E_{e_i - e_j} := E_{ij}$. The set of positive roots is $\mathcal{R}_+ := \{e_i - e_j \mid 1 \leq i < j \leq n\}$ and the simple roots are $\alpha_i := e_i - e_{i+1}$ ($1 \leq i \leq n-1$). Let $\varpi_i = \sum_{k=1}^{i} e_k \in \mathcal{H}^*$ ($1 \leq i \leq n-1$) denote the fundamental weights. The equivalence classes of the IRREPS of $sl(n, \mathbb{C})$ can be uniquely labeled by the highest (dominant integral) weights, which are the elements of
\[
P_+(SU(n)) = \{a_1\varpi_1 + a_2\varpi_2 + \cdots + a_{n-1}\varpi_{n-1} \mid a_1, a_2, \ldots, a_{n-1} \in \mathbb{Z}_{\geq 0}\} \cong \mathbb{Z}^{n-1}_{\geq 0}. \tag{A.2}
\]
Now take an $sl(n, \mathbb{C})$ IRREP $(\pi_\lambda, V_\lambda)$ of highest weight $\lambda \in P_+(SU(n))$. To any linear functional $\nu \in \mathcal{H}^*$ we associate the weight space
\[
V_\lambda[\nu] := \bigcap_{H \in \mathcal{H}} \ker\bigl(\pi_\lambda(H) - \nu(H)\,\mathrm{Id}_{V_\lambda}\bigr) \subset V_\lambda, \tag{A.3}
\]
and we also define the set of weights $\mathcal{W}_\lambda := \{\nu \mid \nu \in \mathcal{H}^*,\ V_\lambda[\nu] \neq \{0\}\}$. Then we have the weight space decomposition $V_\lambda = \bigoplus_{\nu \in \mathcal{W}_\lambda} V_\lambda[\nu]$. Note that $\lambda \in \mathcal{W}_\lambda$ and $\dim(V_\lambda[\lambda]) = 1$, so we can write $V_\lambda[\lambda] = \mathbb{C}v_\lambda$ with some highest weight vector $v_\lambda$. The characteristic property of the non-zero vector $v_\lambda$ is that $\pi_\lambda(E_\alpha)v_\lambda = 0$ holds
for all $\alpha \in \mathcal{R}_+$. The IRREP $(\pi_\lambda, V_\lambda)$ of $sl(n, \mathbb{C})$ induces the UIRREP $(\Pi_\lambda, V_\lambda)$ of $SU(n)$ by the requirement $\Pi_\lambda(e^X) = e^{\pi_\lambda(X)}$ for all $X \in su(n)$. The corresponding scalar product on $V_\lambda$ can be defined by fixing the norm of $v_\lambda$ and requiring the anti-hermiticity of $\pi_\lambda(X)$ for all $X \in su(n)$. The UIRREPS of $U(n)$ are usually parametrized by the set
\[
P_+(U(n)) = \{m = (m_1, m_2, \ldots, m_n) \in \mathbb{Z}^n \mid m_1 \geq m_2 \geq \cdots \geq m_n\}. \tag{A.4}
\]
The representation $\rho_m$ of $U(n)$ may be defined as the extension of the representation $\Pi_\lambda$ of $SU(n) < U(n)$ characterized by the properties
\[
\lambda = \sum_{i=1}^{n-1}(m_i - m_{i+1})\varpi_i \quad \text{and} \quad \rho_m(\xi\mathbf{1}_n) = \xi^{m_1 + \cdots + m_n}\,\mathrm{Id}_{V_\lambda} \quad \forall\, \xi \in U(1). \tag{A.5}
\]
In the main text we use a slightly different parametrization by pairs $(k, \lambda) \in \mathbb{Z} \times P_+(SU(n))$. The correspondence is given by the relation $m_1 + \cdots + m_n = \mu_n(\lambda) + kn$, as is seen from the comparison between (A.5) and (4.2) and (4.3).

A.2. On the bosonic oscillator realization of $(\pi_{m\varpi_1}, V_{m\varpi_1})$

Fix an integer $n \geq 2$ and to each $n$-tuple $(l_1, l_2, \ldots, l_n) \in \mathbb{Z}^n_{\geq 0}$ associate a “symbol” $|l_1, l_2, \ldots, l_n\rangle$. Let $F$ denote the complex vector space generated by these symbols,
\[
F := \bigoplus_{(l_1, l_2, \ldots, l_n) \in \mathbb{Z}^n_{\geq 0}} \mathbb{C}\,|l_1, l_2, \ldots, l_n\rangle. \tag{A.6}
\]
Endow $F$ with the scalar product $(\ ,\ )$ for which the vectors $\{|l_1, l_2, \ldots, l_n\rangle\}_{(l_1, l_2, \ldots, l_n) \in \mathbb{Z}^n_{\geq 0}}$ satisfy
\[
(|l'_1, l'_2, \ldots, l'_n\rangle, |l_1, l_2, \ldots, l_n\rangle) = \delta_{l'_1, l_1}\delta_{l'_2, l_2}\cdots\delta_{l'_n, l_n}, \tag{A.7}
\]
and introduce the annihilation and creation operators $b_i$ and $b_i^\dagger$ ($1 \leq i \leq n$) on $F$ by
\[
b_i|l_1, l_2, \ldots, l_n\rangle := \begin{cases} \sqrt{l_i}\,|l_1, l_2, \ldots, l_i - 1, \ldots, l_n\rangle & \text{if } l_i \geq 1, \\ 0 & \text{if } l_i = 0, \end{cases} \tag{A.8}
\]
\[
b_i^\dagger|l_1, l_2, \ldots, l_n\rangle := \sqrt{l_i + 1}\,|l_1, l_2, \ldots, l_i + 1, \ldots, l_n\rangle. \tag{A.9}
\]
Then $b_i^\dagger$ is the adjoint of $b_i$, and one has the commutation relations
\[
[b_i, b_j] = 0, \qquad [b_i^\dagger, b_j^\dagger] = 0, \qquad [b_i, b_j^\dagger] = \delta_{i,j}\,\mathrm{Id}_F. \tag{A.10}
\]
The “bosonic Fock space” $F$ decomposes as the orthogonal direct sum $F = \bigoplus_{m \in \mathbb{Z}_{\geq 0}} F_m$ with
\[
F_m := \operatorname{span}_{\mathbb{C}}\{|l_1, l_2, \ldots, l_n\rangle \mid (l_1, l_2, \ldots, l_n) \in \mathbb{Z}^n_{\geq 0},\ l_1 + l_2 + \cdots + l_n = m\}. \tag{A.11}
\]
Now consider the linear map $\psi: gl(n, \mathbb{C}) \to \mathrm{End}(F)$ defined on the standard basis $\{E_{ij}\}_{1 \leq i, j \leq n}$ of $gl(n, \mathbb{C})$ by
\[
\psi(E_{ij}) := b_i^\dagger b_j. \tag{A.12}
\]
Then $(\psi, F)$ is a representation of $gl(n, \mathbb{C})$ and the subspace $F_m$ is invariant under $\psi$. The map
\[
\psi_m: gl(n, \mathbb{C}) \to \mathrm{End}(F_m), \qquad X \mapsto \psi_m(X) := \psi(X)|_{F_m} \tag{A.13}
\]
provides a finite dimensional representation of the Lie algebra gl(n, C). By restricting ψm to the subalgebra sl(n, C) < gl(n, C), we end up with a finite dimensional representation (ψm , Fm ) of sl(n, C). The set of weights of the representation (ψm , Fm ) is
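The claim that $\psi(E_{ij}) = b_i^\dagger b_j$ defines a representation of $gl(n, \mathbb{C})$ on $F_m$ can be checked numerically on basis states. The sketch below implements (A.8), (A.9) and (A.12) on vectors stored as dictionaries from occupation tuples to coefficients, and tests the relations $[E_{ij}, E_{kl}] = \delta_{jk}E_{il} - \delta_{li}E_{kj}$. All function names are hypothetical, indices are 0-based, and floating point arithmetic with a small tolerance is used, so this is a sanity check rather than a proof.

```python
import math
from itertools import product


def annihilate(i, state):
    """b_i on a basis state, (A.8): returns (coefficient, new_state) or None if zero."""
    if state[i] == 0:
        return None
    new = list(state)
    new[i] -= 1
    return math.sqrt(state[i]), tuple(new)


def create(i, state):
    """b_i^dagger on a basis state, (A.9)."""
    new = list(state)
    new[i] += 1
    return math.sqrt(state[i] + 1), tuple(new)


def psi_E(i, j, vec):
    """Apply psi(E_ij) = b_i^dagger b_j, (A.12), to a vector {state: coefficient}."""
    out = {}
    for state, coeff in vec.items():
        lowered = annihilate(j, state)
        if lowered is None:
            continue
        c1, s1 = lowered
        c2, s2 = create(i, s1)
        out[s2] = out.get(s2, 0.0) + coeff * c1 * c2
    return out


def combine(u, v, sign=1.0):
    """Return u + sign * v for dictionary vectors."""
    out = dict(u)
    for state, coeff in v.items():
        out[state] = out.get(state, 0.0) + sign * coeff
    return out


def is_zero(vec, tol=1e-9):
    return all(abs(c) < tol for c in vec.values())


def check_gl_relations(n, m):
    """Check [E_ij, E_kl] = delta_jk E_il - delta_li E_kj on every basis state of F_m."""
    states = [s for s in product(range(m + 1), repeat=n) if sum(s) == m]
    for s in states:
        v = {s: 1.0}
        for i, j, k, l in product(range(n), repeat=4):
            lhs = combine(psi_E(i, j, psi_E(k, l, v)), psi_E(k, l, psi_E(i, j, v)), -1.0)
            rhs = {}
            if j == k:
                rhs = combine(rhs, psi_E(i, l, v))
            if l == i:
                rhs = combine(rhs, psi_E(k, j, v), -1.0)
            if not is_zero(combine(lhs, rhs, -1.0)):
                return False
    return True
```

Because $b_i^\dagger b_j$ removes one quantum from slot $j$ and adds one to slot $i$, every output state has the same total occupation number as the input, which is the invariance of $F_m$ used in (A.13).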
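\[
\mathcal{W}_m := \Bigl\{\sum_{i=1}^{n} l_i e_i \Bigm| (l_1, l_2, \ldots, l_n) \in \mathbb{Z}^n_{\geq 0},\ l_1 + l_2 + \cdots + l_n = m\Bigr\}, \tag{A.14}
\]
and the weight space $F_m[\nu] \subset F_m$ corresponding to weight $\nu = \sum_{i=1}^{n} l_i e_i \in \mathcal{W}_m$ takes the form
\[
F_m[l_1e_1 + l_2e_2 + \cdots + l_ne_n] = \mathbb{C}\,|l_1, l_2, \ldots, l_n\rangle. \tag{A.15}
\]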
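The weight set (A.14) is small enough to enumerate directly. The following sketch lists the occupation tuples labelling the weights of $(\psi_m, F_m)$ and compares their number with $\binom{m+n-1}{n-1}$, the dimension of the $m$-th symmetric power of $\mathbb{C}^n$. The function name is hypothetical.

```python
from itertools import product
from math import comb


def fock_weights(n, m):
    """All (l_1, ..., l_n) in Z_{>=0}^n with l_1 + ... + l_n = m, as in (A.14)."""
    return [l for l in product(range(m + 1), repeat=n) if sum(l) == m]
```

Two tuples with the same total $m$ define the same functional on traceless diagonal matrices only if they differ by a multiple of $(1, \ldots, 1)$, hence only if they coincide. Distinct tuples therefore give distinct weights, which is consistent with each weight space in (A.15) being spanned by a single basis vector.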
Note that each weight space is 1-dimensional. The representation $(\psi_m, F_m)$ contains the (up to rescaling) unique highest weight vector $v_m := |m, 0, \ldots, 0\rangle$, with weight $m\varpi_1 = me_1 \in \mathcal{W}_m$. This shows that $(\psi_m, F_m)$ is equivalent to the IRREP $(\pi_{m\varpi_1}, V_{m\varpi_1})$. We identify these $sl(n, \mathbb{C})$ (and the naturally corresponding $su(n)$) representations in the proofs presented in Sec. 4.

References

[1] M. A. Olshanetsky and A. M. Perelomov, Quantum integrable systems related to Lie algebras, Phys. Rept. 94 (1983) 313–404.
[2] G. Heckman, Hypergeometric and spherical functions, Harmonic Analysis and Special Functions on Symmetric Spaces, eds. G. Heckman and H. Schlichtkrull, Perspectives in Mathematics, Vol. 16 (Academic Press, 1994), pp. 1–89.
[3] B. Sutherland, Beautiful Models (World Scientific, 2004).
[4] R. Sasaki, Quantum Calogero–Moser systems, in Encyclopaedia of Mathematical Physics (Academic Press, 2006), pp. 123–129.
[5] P. Etingof, Calogero–Moser Systems and Representation Theory (European Mathematical Society, 2007).
[6] A. P. Polychronakos, Physics and mathematics of Calogero particles, J. Phys. A Math. Gen. 39 (2006) 12793–12845; arXiv:hep-th/0607033.
[7] M. A. Olshanetsky and A. M. Perelomov, Completely integrable Hamiltonian systems connected with semisimple Lie algebras, Invent. Math. 37 (1976) 93–108.
[8] B. Sutherland, Exact results for a quantum many-body problem in one dimension II, Phys. Rev. A 5 (1972) 1372–1376.
[9] M. A. Olshanetsky and A. M. Perelomov, Quantum systems related to root systems, and radial parts of Laplace operators, Funct. Anal. Appl. 12 (1978) 121–128; arXiv:math-ph/0203031.
[10] G. J. Heckman and E. M. Opdam, Root systems and hypergeometric functions I, Compositio Math. 64 (1987) 329–352.
[11] E. M. Opdam, Root systems and hypergeometric functions IV, Compositio Math. 67 (1988) 191–209.
[12] I. Cherednik, Double Affine Hecke Algebras, London Mathematical Society Lecture Notes Series, Vol. 319 (Cambridge University Press, 2005).
[13] P. I. Etingof, I. B. Frenkel and A. A. Kirillov, Jr., Spherical functions on affine Lie groups, Duke Math. J. 80 (1995) 59–90; arXiv:hep-th/9407047.
[14] D. Kazhdan, B. Kostant and S. Sternberg, Hamiltonian group actions and dynamical systems of Calogero type, Commun. Pure Appl. Math. 31 (1978) 481–507.
[15] A. D. Berenstein and A. V. Zelevinsky, When is the multiplicity of a weight equal to 1?, Funct. Anal. Appl. 24 (1990) 259–269.
[16] R. Howe, Perspectives on invariant theory: Schur duality, multiplicity-free actions and beyond, in The Schur Lectures (1992), Israel Math. Conf. Proc., Vol. 8 (Bar-Ilan Univ., 1995), pp. 1–182.
[17] E. Heintze, R. Palais, C.-L. Terng and G. Thorbergsson, Hyperpolar actions on symmetric spaces, in Geometry, Topology, and Physics for Raoul Bott, ed. S.-T. Yau (International Press, 1995), pp. 214–245.
[18] A. Kollross, Polar actions on symmetric spaces, J. Differential Geom. 77 (2007) 425–482; arXiv:math/0506312 [math.DG].
[19] A. Oblomkov, Heckman–Opdam’s Jacobi polynomials for the BCn root system and generalized spherical functions, Adv. Math. 186 (2004) 153–180; arXiv:math/0202076 [math.RT].
[20] L. Fehér and B. G. Pusztai, A class of Calogero type reductions of free motion on a simple Lie group, Lett. Math. Phys. 79 (2007) 263–277; arXiv:math-ph/0609085.
[21] L. Fehér and B. G. Pusztai, Hamiltonian reductions of free particles under polar actions of compact Lie groups, Theoret. Math. Phys. 155 (2008) 646–658; arXiv:0705.1998 [math-ph].
[22] R. Palais and C.-L. Terng, A general theory of canonical forms, Trans. Amer. Math. Soc. 300 (1987) 771–789.
[23] V. V. Gorbatsevich, A. L. Onishchik and E. B. Vinberg, Foundations of Lie Theory and Lie Transformation Groups (Springer, 1997).
[24] T. Matsuki, Classification of two involutions on compact semisimple Lie groups and root systems, J. Lie Theory 12 (2002) 41–68.
[25] B. Hoogenboom, Intertwining Functions on Compact Lie Groups, CWI Tract, Vol. 5 (Centrum Wisk. Inform., Amsterdam, 1984).
[26] T. Matsuki, Double coset decomposition of reductive Lie groups arising from two involutions, J. Algebra 197 (1997) 49–91.
[27] F. Calogero, Exactly solvable one-dimensional many-body problems, Lett. Nuovo Cim. 13 (1975) 411–416.
[28] M. Hashizume, Geometric approach to the completely integrable Hamiltonian systems attached to the root systems with signature, Adv. Stud. Pure Math. 4 (1984) 291–330.
August 10, J070-S0129055X10004077
2010 15:0 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 22, No. 7 (2010) 733–838 c World Scientific Publishing Company DOI: 10.1142/S0129055X10004077
A CLASSICAL MECHANICAL MODEL OF BROWNIAN MOTION WITH PLURAL PARTICLES
SHIGEO KUSUOKA∗ and SONG LIANG† ∗Graduate School of Mathematical Sciences, The University of Tokyo, Komaba 3-8-1, Meguro-ku Tokyo 153-8914, Japan
[email protected] †Institute of Mathematics, Tsukuba University, Tennoudai 1-1-1 Tsukuba 305-8571, Japan
[email protected]
Received 15 December 2008 Revised 26 May 2010
We give a connection between diffusion processes and classical mechanical systems in this paper. Precisely, we consider a system of plural massive particles interacting with an ideal gas, evolved according to classical mechanical principles, via interaction potentials. We prove the almost sure existence and uniqueness of the solution of the considered dynamics, prove the convergence of the solution under a certain scaling limit, and give the precise expression of the limiting process, a diffusion process.

Keywords: Infinite particle systems; classical mechanics; Markov processes; diffusions; convergence; Brownian motion.

Mathematics Subject Classification 2010: 70F45, 34F05, 60B10, 60J60
1. Introduction

Brownian motion is a well-known physical phenomenon concerning the dynamics of a small particle put into a fluid in equilibrium, e.g., a grain of pollen in a glass of water [10]. It is an interesting problem in mathematical physics to describe the Brownian motion phenomenology by classical mechanical models. Brownian motion was first observed accidentally by Brown in 1827. The first physical explanation of it was given by Einstein: the motion comes about as a result of the repeated collisions of the particle with the numerous much smaller fluid atoms. In more mathematical terms, the explanation is often presented in the following rough way: since a large number of water atoms collide with the massive particle randomly, and each atom is light enough, if we assume that the interactions from each atom at each time are independent, the motion
of the massive particle could be considered as a sum of many i.i.d. random variables; by the central limit theorem, this will give in a suitable limit the Brownian motion. However, we have to notice that, even in a model where only interactions through collisions are considered, there exists the possibility of re-collisions, so the states (i.e. positions and velocities) of light particles at each time are not independent of each other, and one is not really in the presence of a sum of i.i.d. variables. This becomes a more evident and significant drawback when considering the model of interactions caused by potentials, or a model with more than one massive particle. Indeed, the actual motion of the massive particles cannot be the result of a sum of i.i.d. random variables; it is not even a Markov process. So to study this phenomenon more precisely, we need to construct some new model, which takes the mentioned re-interactions into account. In such a model, a finite number of massive particles (molecules) interact with a gas of infinitely many light atoms, which have a random initial state. The dynamics is fully deterministic and Newtonian once the initial condition is given, and the only source of randomness is the initial configuration of the light atoms. The problem is to prove that in an appropriate limit, as the mass m of the gas atoms goes to 0 while their velocities and their density increase in an appropriate manner such that the variance of momentum transfer stays of order 1, the motion of the molecules converges to a Markov process, in particular a diffusion process. This type of model, called a mechanical model of Brownian motion, was first introduced and studied by Holley [6], for the case of only one molecule, with the whole system in dimension d = 1, and the interactions only given by collisions.
This model was later extended by, e.g., Dürr–Goldstein–Lebowitz [3–5], Calderoni–Dürr–Kusuoka [2], and others, to the case of higher dimensional spaces. But in all papers, only collisional interactions of one molecule with light atoms are considered. In the present paper, we consider the case of plural molecules interacting via smooth compact support potentials with an ideal gas of atoms. This increases the difficulties in many aspects, for example,

(1) strong non-Markovian character of the dynamics (for every positive mass m of the atoms), due to possibly multiple, or even extended in time, interactions between a particular atom and the molecules;
(2) the appearance of an interaction (mediated by the gas atoms) between the molecules in the limiting process; and
(3) irregular behaviour of the above interaction when the interaction ranges of the molecules overlap, i.e. an atom can interact with more than one molecule at a time.

Let us note that one expects that the non-Markovian character of the dynamics mentioned above disappears when the gas atoms become infinitely fast, in the limit m → 0.
We show the existence and the uniqueness of the solution of an infinite system of ordinary differential equations (ODEs) describing the model, for almost every initial condition of the ideal gas, and study the motion of the molecules in the Brownian limit, where $m \to 0$, and the intensity function of the initial configuration of the atoms is given by
\[
\lambda_m(dx, dv) = m^{\frac{d-1}{2}}\rho\Bigl(\frac{m}{2}|v|^2 + \sum_{i=1}^{N}U_i(x - X_{i,0})\Bigr)dx\,dv,
\]
where $\rho$ is a function giving the initial distribution, $x$ and $v$ denote the position and the velocity of an atom, respectively, $N$ is the number of molecules, $X_{i,0}$ are the initial positions of the molecules and $U_i$ are the potentials. (See Sec. 2 for more details and the notations.)

A heuristic central limit theorem type argument for this scaling, which assumes all of the necessary independency in the limit, may be given as follows: when $m \to 0$, the average energy $mv^2$ of the atoms remains of order 1 due to the velocity scaling. For the momentum transfer, notice that since the potentials are compactly supported by our assumption, we have that (as long as the molecules stay in a bounded region) interactions occur only if the atoms are in a certain area $A$ which does not depend on time. So roughly speaking, for a fixed time $t$, $\frac{d}{dt}V_i(t)$ is a result of the sum of the effects from those atoms whose positions $x$ and velocities $v$ satisfy $x + tv \in A$. Also, the effect from each such atom is of order 1. Since an atom has velocity of order $m^{-1/2}$, the length of the time that it stays in the area $A$ has order $m^{1/2}$. In summary, in a fixed time interval $[0, T]$, $T > 0$, the momentum transfer for a molecule from one atom is of order $m^{1/2}$, hence the momentum will remain constant if the total number of interacting atoms has mean $\int_0^T ds \int_{x + sv \in A}\lambda_m(dx, dv)$ behaving as $m^{-1/2}$, for $m \to 0$.

Let us explain further main ideas and provide a sketch of the content of this paper before closing this section. First, we introduce a system $\varphi(t, x, v; \vec{X})$ (or $\psi(t, x, v; \vec{X})$) (see (2.2) and (3.4)), describing the motion of the light atoms when the molecules are “frozen”, and consider the classical scattering theory (including a ray representation) for it. As a result of our cut-off in the potentials and the initial distribution of the atoms, as long as the velocities of the molecules are not too fast, each light atom interacts with the molecules for a time length of order $m^{1/2}$ only, so the interaction given by $\psi(t, x, v; \vec{X})$ (instead of $x(t, x, v)$) gives the 0-order approximation of the momentum variance of the molecules (see Proposition 3.6.3). We use this fact to get the tightness of the states of the molecules as long as their velocities are of order $O(1)$ (Lemma 3.5.1 and Sec. 4). Next, with the help of this tightness, by adding the effect given by the error caused by the described “freezing approximation” as a 1-order term (see Proposition 3.6.4), we prove the desired convergence for $m \to 0$. This is done by characterizing the possible accumulation points by martingale problem theory (Sec. 5). Finally, we show that when there is only one molecule, or when there are two molecules but the potential functions satisfy certain conditions (see Theorem 2.0.1(4)), the velocity(s) of the molecule(s) do(es) not go beyond order $O(1)$, so the stopping time $\tau_n$ (to keep the velocity(s) $O(1)$)
converges to ∞, which means that our convergence is valid for all time intervals (Sec. 6).
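The scaling used in the heuristic argument can be illustrated numerically. Away from the molecules (where all $U_i$ vanish) the spatial density of atoms is $m^{(d-1)/2}\int_{\mathbb{R}^d}\rho(\tfrac{m}{2}|v|^2)\,dv$, and the substitution $u = \sqrt{m}\,v$ shows that it equals $m^{-1/2}\int_{\mathbb{R}^d}\rho(\tfrac12|u|^2)\,du$ in every dimension. The sketch below checks this in $d = 1$ with a midpoint rule. The profile `rho`, the cutoff `E0` and the function names are hypothetical choices made only for this illustration.

```python
import math

E0 = 1.0  # hypothetical energy cutoff, playing the role of e_0 in condition (A1)


def rho(s):
    """Hypothetical profile: zero for s <= E0, continuous, rapidly decaying."""
    return (s - E0) * math.exp(-s) if s > E0 else 0.0


def atom_density(m, s_max=60.0, steps=200_000):
    """Midpoint-rule value of the velocity integral of rho(m v^2 / 2) in d = 1.

    In d = 1 the prefactor m^{(d-1)/2} equals 1. Energies above s_max are
    neglected, which is harmless because rho decays exponentially.
    """
    v_max = math.sqrt(2.0 * s_max / m)
    h = 2.0 * v_max / steps
    total = 0.0
    for k in range(steps):
        v = -v_max + (k + 0.5) * h
        total += rho(0.5 * m * v * v)
    return total * h
```

Dividing the mass by 4 doubles the density, so `atom_density(0.01) / atom_density(0.04)` is close to 2, in line with the $m^{-1/2}$ behaviour quoted above.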
2. Description of the Model and Statement of the Result

Let us describe our model and results precisely in this section. Let $N \geq 1$ and $d \geq 1$ be integers, and let $M_1, \ldots, M_N, m > 0$. Here $N$ stands for the number of molecules, $d$ for the dimension of the space $\mathbb{R}^d$ in which the whole system is considered, $M_1, \ldots, M_N$ are the masses of the molecules, and $m$ stands for the mass of the light particles (the environmental ideal gas atoms; later on the limit $m \to 0$ will be taken). We use $U_i \in C_0^\infty(\mathbb{R}^d)$, $i = 1, \ldots, N$, to denote the (cut-off) potential functions, which, as (2.1) shows, are assumed to provide potentials that only depend on the relative positions of the molecules and the atoms. Also, let $X_{i,0}, V_{i,0} \in \mathbb{R}^d$, $i = 1, \ldots, N$, be given, which stand for the initial positions and the initial velocities of the molecules.

Assume that the initial condition of the environment, i.e. the positions and the velocities of the ideal gas atoms at time 0, is given by $\tilde\omega \in \tilde\Omega = \mathrm{Conf}(\mathbb{R}^d \times \mathbb{R}^d)$. The distribution of $\tilde\omega$ will be specified later. (We ask for the reader’s tolerance for using “$\sim$” for a while. We do so because we will soon convert the problem to some new probability space (see Sec. 3.1) by using ray representation, and we believe that it is better to keep the notations without “$\sim$” until then.) Here $\mathrm{Conf}(\mathbb{R}^d \times \mathbb{R}^d)$ stands for the set of all non-empty closed subsets of $\mathbb{R}^d \times \mathbb{R}^d$ which have no cluster point. $\mathrm{Conf}(\mathbb{R}^d \times \mathbb{R}^d)$ is equipped with the $\sigma$-algebra $\mathcal{E}_0$, the $\sigma$-algebra generated by $\{\{C \subset \mathbb{R}^d \times \mathbb{R}^d;\ C \neq \emptyset, \text{ closed},\ C \cap G = \emptyset\};\ G \text{ is open in } \mathbb{R}^d \times \mathbb{R}^d\}$. Each $\tilde\omega$ is a subset of $\mathbb{R}^d \times \mathbb{R}^d$, and $(x, v) \in \tilde\omega$ means that there exists an atom at position $x$ with velocity $v$ at time 0.

As claimed before, we assume that as long as the initial conditions $\tilde\omega \in \mathrm{Conf}(\mathbb{R}^d \times \mathbb{R}^d)$ and $X_{i,0}, V_{i,0} \in \mathbb{R}^d$, $i = 1, \ldots, N$, are given, the whole system evolves according to Newton mechanical laws via interaction potentials depending only on the relative positions. We use $X_i^{(m)}(t) = X_i^{(m)}(t, \tilde\omega)$ and $V_i^{(m)}(t) = V_i^{(m)}(t, \tilde\omega) \in \mathbb{R}^d$ to denote the position and the velocity of the $i$th molecule at time $t$ with initial environmental condition $\tilde\omega$, and for each $(x, v) \in \tilde\omega$, we use $x^{(m)}(t, x, v, \tilde\omega), v^{(m)}(t, x, v, \tilde\omega) \in \mathbb{R}^d$ to denote the position and the velocity at time $t$ of the atom which had state $(x, v)$ at time 0. Also, for the sake of simplicity, we assume that there is no direct interaction between molecules or between atoms. Actually, adding the effect of interactions between molecules causes no mathematical difficulty at all, while making the formulas more complicated. We would rather say that one of the most interesting points of our results in this paper is that, even for the case with no direct interactions between molecules, after taking the limit $m \to 0$, we get a diffusion in which interactions between molecules appear. (See Theorem 2.0.1, especially the definition of the generator $L$ below.)
In conclusion, for each initial environmental condition $\tilde\omega$, we assume that the motion of the system is described by the following infinite system of ODEs:
\[
\begin{cases}
\dfrac{d}{dt}X_i^{(m)}(t, \tilde\omega) = V_i^{(m)}(t, \tilde\omega), \\[1ex]
M_i\dfrac{d}{dt}V_i^{(m)}(t, \tilde\omega) = -\displaystyle\int_{\mathbb{R}^d \times \mathbb{R}^d}\nabla U_i\bigl(X_i^{(m)}(t, \tilde\omega) - x^{(m)}(t, x, v, \tilde\omega)\bigr)\,\mu_{\tilde\omega}(dx, dv), \\[1ex]
(X_i^{(m)}(0, \tilde\omega), V_i^{(m)}(0, \tilde\omega)) = (X_{i,0}, V_{i,0}), \qquad i = 1, \ldots, N, \\[1ex]
\dfrac{d}{dt}x^{(m)}(t, x, v, \tilde\omega) = v^{(m)}(t, x, v, \tilde\omega), \\[1ex]
m\dfrac{d}{dt}v^{(m)}(t, x, v, \tilde\omega) = -\displaystyle\sum_{i=1}^{N}\nabla U_i\bigl(x^{(m)}(t, x, v, \tilde\omega) - X_i^{(m)}(t, \tilde\omega)\bigr), \\[1ex]
(x^{(m)}(0, x, v, \tilde\omega), v^{(m)}(0, x, v, \tilde\omega)) = (x, v), \qquad (x, v) \in \tilde\omega.
\end{cases} \tag{2.1}
\]
Here $\mu_{\tilde\omega}(\,\cdot\,)$ is the counting measure determined by $\tilde\omega$: $\mu_{\tilde\omega}(A) = \sharp(\tilde\omega \cap A)$ for any $A \in \mathcal{B}(\mathbb{R}^d \times \mathbb{R}^d)$, with $\sharp(\,\cdot\,)$ denoting the number of points in the argument.

Since we are only interested in the motion of the molecules, from now on, when talking about the solution of (2.1), we always mean the value of $(\vec{X}^{(m)}(t, \tilde\omega), \vec{V}^{(m)}(t, \tilde\omega)) = ((X_1^{(m)}(t, \tilde\omega), \ldots, X_N^{(m)}(t, \tilde\omega)), (V_1^{(m)}(t, \tilde\omega), \ldots, V_N^{(m)}(t, \tilde\omega)))$.

Finally, let us give the distribution of the environmental initial condition $\tilde\omega$. Let $\rho: \mathbb{R} \to [0, \infty)$ be a continuous function such that $\rho(s) \to 0$ rapidly as $s \to \infty$ (see conditions (A1) and (A2) below for details). Let $\lambda_m$ be the non-atomic Radon measure on $\mathbb{R}^d \times \mathbb{R}^d$ given by
\[
\lambda_m(dx, dv) = m^{\frac{d-1}{2}}\rho\Bigl(\frac{m}{2}|v|^2 + \sum_{i=1}^{N}U_i(x - X_{i,0})\Bigr)dx\,dv,
\]
and let $P_m(d\tilde\omega)$ be the Poisson point process with the intensity measure $\lambda_m$. So $P_m$ is a probability measure on $\tilde\Omega$ ($= \mathrm{Conf}(\mathbb{R}^d \times \mathbb{R}^d)$). We assume that the distribution of $\tilde\omega$ is given by $P_m$. (See, e.g., [7] for more details about Poisson point processes.)

In this paper, we consider the following questions:

(Q1) Does the dynamics have a unique solution for $P_m$-almost every initial condition?
(Q2) What is the limit behavior of the solution $(\vec{X}^{(m)}(t, \tilde\omega), \vec{V}^{(m)}(t, \tilde\omega))$ as $m \to 0$?

Throughout this paper, we assume that $U_i \in C_0^\infty(\mathbb{R}^d)$ satisfy $U_i(-x) = U_i(x)$, $x \in \mathbb{R}^d$, $i = 1, \ldots, N$. Let $R_i$ be constants such that $U_i(x) = 0$ if $|x| \geq R_i$. Define the constants $C_0 = \bigl(2\sum_{i=1}^{N}R_i\|\nabla U_i\|_\infty\bigr)^{1/2}$, $e_0 = \frac{1}{2}(2C_0 + 1)^2 + \sum_{i=1}^{N}\|U_i\|_\infty$. Assume that $\rho: \mathbb{R} \to [0, \infty)$ is a measurable function satisfying the
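following:

(A1) $\rho(s) = 0$ if $s \leq e_0$,
(A2) for any $c > 0$, there exists a $\rho_c: \mathbb{R} \to [0, \infty)$ such that
\[
\sup_{|a| \leq c}\rho(s + a) \leq \rho_c(s) \quad \text{for any } s \in \mathbb{R}, \qquad \text{and} \qquad \int_{\mathbb{R}^d}(1 + |v|^3)\,\rho_c\Bigl(\frac{1}{2}|v|^2\Bigr)dv < \infty.
\]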
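To make the role of the intensity measure and of the cut-off (A1) concrete, here is a minimal sketch that samples a Poisson configuration of atoms in $d = 1$, restricted to a bounded box in $(x, v)$, by thinning a homogeneous Poisson process. Several things are assumptions made only for the illustration: the profile `rho`, the cutoff `E0`, the restriction to a box (the actual configuration is infinite), the default zero potential, and all function names.

```python
import math
import random

E0 = 1.0  # hypothetical value standing in for e_0 of condition (A1)


def rho(s):
    """Hypothetical profile: vanishes for s <= E0 and decays exponentially above it."""
    return math.exp(-s) if s > E0 else 0.0


def intensity(x, v, m, potential):
    """Density of lambda_m in d = 1, where the prefactor m^{(d-1)/2} equals 1."""
    return rho(0.5 * m * v * v + potential(x))


def sample_poisson_count(mean, rng):
    """Poisson(mean) variate: number of unit-rate arrivals in the interval [0, mean]."""
    count, t = 0, rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def sample_configuration(m, x_range, v_range, rng, potential=lambda x: 0.0):
    """Thinning: propose points at the constant rate lam_max, keep each with
    probability intensity / lam_max."""
    (x0, x1), (v0, v1) = x_range, v_range
    lam_max = math.exp(-E0)                       # upper bound of rho
    mean_proposals = lam_max * (x1 - x0) * (v1 - v0)
    points = []
    for _ in range(sample_poisson_count(mean_proposals, rng)):
        x, v = rng.uniform(x0, x1), rng.uniform(v0, v1)
        if rng.random() * lam_max < intensity(x, v, m, potential):
            points.append((x, v))
    return points
```

Every sampled atom has energy $\tfrac{m}{2}v^2 + U(x) > e_0$, which is exactly the effect of (A1): slow atoms are absent from the initial configuration. For small $m$ the surviving velocities are of order $m^{-1/2}$, so the velocity window of the box has to be chosen accordingly.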
The meaning of assumption (A1) is that atoms whose initial momenta are below a certain value are ignored. The point is that, under this condition (just as in the case with the molecules “frozen”, which we call the “classical case”), the initial velocities of the atoms are fast enough that the interactions are not strong enough to “stop” the atoms. The atoms therefore keep their velocities at a certain level for all time, and hence leave the valid region for interaction very quickly (see Proposition 3.2.2 and Corollary 3.2.3 for the classical case, and Propositions 3.6.1 and 3.6.5 for our case). This helps us to avoid the problem of “too many collisions in a short period of time”. (A2) is an assumption on how rapidly $\rho$ decreases.

Also, assume that the initial position $(X_{1,0}, \dots, X_{N,0})$ satisfies $|X_{i,0} - X_{j,0}| > R_i + R_j$ for any $i \ne j$, i.e. the molecules are originally separated enough that their potential ranges do not overlap.

We answer in this paper the two questions (Q1), (Q2) described above under our present assumptions. For (Q1), we will show that there exists a unique solution of (2.1) for $\tilde P_m$-almost every initial condition for every $m > 0$ (Theorem 2.0.1(1) below).

In order to answer (Q2), let us first define some notations to describe the limit process. For any $\vec X = (X_1, \dots, X_N) \in \mathbb{R}^{dN}$, let
\[ \tilde\varphi(t, x, v; \vec X) = (\tilde\varphi^0(t, x, v; \vec X), \tilde\varphi^1(t, x, v; \vec X)) = (\tilde x(t, x, v; \vec X), \tilde v(t, x, v; \vec X)) \]
denote the solution of Newton's equation
\[ \begin{cases} \dfrac{d}{dt}\tilde x(t, x, v; \vec X) = \tilde v(t, x, v; \vec X), \\ \dfrac{d}{dt}\tilde v(t, x, v; \vec X) = -\sum_{i=1}^N \nabla U_i(\tilde x(t, x, v; \vec X) - X_i), \\ (\tilde x(0, x, v; \vec X), \tilde v(0, x, v; \vec X)) = (x, v). \end{cases} \qquad (2.2) \]
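To make (2.2) concrete, the following is a minimal numerical sketch, not part of the original argument. It assumes $d = 2$, a single frozen molecule, and a hypothetical smooth bump potential `bump` with range $R = 1$; the names `bump`, `grad_bump`, `frozen_flow` and `energy`, as well as the velocity-Verlet scheme, are illustrative choices of ours. The sketch integrates Newton's equation with the molecule fixed and compares the total energy $\frac12|v|^2 + U(x - X)$ before and after the atom has crossed the potential range.

```python
import math

def bump(x, y, h=1.0, R=1.0):
    """Hypothetical potential U(x) = h*exp(-1/(1 - |x|^2/R^2)) for |x| < R, else 0."""
    q = (x * x + y * y) / (R * R)
    return h * math.exp(-1.0 / (1.0 - q)) if q < 1.0 else 0.0

def grad_bump(x, y, h=1.0, R=1.0):
    """Gradient of bump(); it vanishes identically for |x| >= R."""
    q = (x * x + y * y) / (R * R)
    if q >= 1.0:
        return 0.0, 0.0
    c = -2.0 * h * math.exp(-1.0 / (1.0 - q)) / (R * R * (1.0 - q) ** 2)
    return c * x, c * y

def frozen_flow(x, v, X, t, dt=1e-3):
    """Velocity-Verlet approximation of the solution of (2.2) at time t >= 0
    for one atom started at (x, v), with a single molecule frozen at X."""
    (px, py), (vx, vy) = x, v
    gx, gy = grad_bump(px - X[0], py - X[1])
    for _ in range(int(round(t / dt))):
        vx -= 0.5 * dt * gx          # half kick: dv/dt = -grad U(x - X)
        vy -= 0.5 * dt * gy
        px += dt * vx                # drift: dx/dt = v
        py += dt * vy
        gx, gy = grad_bump(px - X[0], py - X[1])
        vx -= 0.5 * dt * gx          # second half kick
        vy -= 0.5 * dt * gy
    return (px, py), (vx, vy)

def energy(x, v, X):
    """Total energy (1/2)|v|^2 + U(x - X) of the atom."""
    return 0.5 * (v[0] ** 2 + v[1] ** 2) + bump(x[0] - X[0], x[1] - X[1])

X = (0.0, 0.0)                        # frozen molecule
x0, v0 = (-3.0, 0.3), (2.0, 0.0)      # atom starts outside the potential range
x1, v1 = frozen_flow(x0, v0, X, t=3.0)
print(energy(x0, v0, X), energy(x1, v1, X))
```

In this sketch the atom enters and leaves the potential range, and the two printed energies agree up to the discretization error of the integrator.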
Comparing (2.2) with the second half of (2.1) with $m = 1$, one finds that the only difference is that in (2.2) the molecules are fixed, whereas in (2.1) the molecules are also moving. We will use this $\tilde\varphi(t, x_0, v_0; \vec X)$ (with proper $\vec X$) as an
approximation of $(x(t, x_0, v_0), v(t, x_0, v_0))$. As mentioned in Sec. 1, this is actually one of our main ideas of the present paper.

Also, we will use the so-called ray representation $\Psi$: let
\[ E = \{(x, v) \in \mathbb{R}^d \times (\mathbb{R}^d \setminus \{0\});\ x \cdot v = 0\}, \qquad E_v = \{x \in \mathbb{R}^d;\ x \cdot v = 0\}, \quad v \in \mathbb{R}^d \setminus \{0\}, \]
and let $\nu(dx, dv)$ be the measure on $E$ given by $\nu(dx, dv) = |v|\,\nu(dx; v)\,dv$, where $\nu(dx; v)$ is the Lebesgue measure on $E_v$. Define
\[ \Psi: \mathbb{R} \times E \to \mathbb{R}^d \times (\mathbb{R}^d \setminus \{0\}), \qquad (s, (x, v)) \mapsto \Psi(s, (x, v)) = (\Psi^0(s, (x, v)), \Psi^1(s, (x, v))) = (x - sv, v), \]
in other words, we decompose the position of each atom into two parts: one parallel to its velocity and the other orthogonal to its velocity.

Then by Lemma 3.2.1, we have that $\lim_{s \to \infty} \tilde\varphi^0(t + s, \Psi(s, x, v); \vec X)$ is well defined for any $t \in \mathbb{R}$ and $(x, v) \in E$. Denote it by $\psi^0(t, x, v; \vec X)$, i.e. let
\[ \psi^0(t, x, v; \vec X) = \lim_{s \to \infty} \tilde\varphi^0(t + s, \Psi(s, x, v); \vec X). \]
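The decomposition behind the ray representation can be sketched in a few lines. This is an illustration only: the function names `psi_map` and `ray_coordinates` are hypothetical, and vectors are plain tuples. Given a position $y$ and a velocity $v \neq 0$, the sketch recovers the ray coordinates $(s, x)$ with $x \cdot v = 0$ and $y = x - sv$, i.e. it inverts $\Psi$.

```python
def psi_map(s, x, v):
    """Ray map Psi(s, (x, v)) = (x - s*v, v) for (x, v) with x . v = 0."""
    return tuple(xi - s * vi for xi, vi in zip(x, v)), tuple(v)

def ray_coordinates(y, v):
    """Inverse of psi_map: split a position y into a time s and the part x
    orthogonal to v, so that y = x - s*v and x . v = 0.  Requires v != 0."""
    vv = sum(vi * vi for vi in v)
    if vv == 0.0:
        raise ValueError("the ray representation needs v != 0")
    s = -sum(yi * vi for yi, vi in zip(y, v)) / vv   # component of y along v
    x = tuple(yi + s * vi for yi, vi in zip(y, v))   # component orthogonal to v
    return s, x

y, v = (1.0, 2.0, -0.5), (0.0, 3.0, 4.0)
s, x = ray_coordinates(y, v)
print(s, x, psi_map(s, x, v)[0])
```

Here `s` is the time coordinate along the ray and `x` is the foot of the ray on the hyperplane $E_v$; applying `psi_map` to them returns the original position.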
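Now we are ready to give the quadratic term of the diffusion generator of the limit process: Let
\[ a_{ik;jl}(\vec X) = \frac{1}{M_i M_j} \int_E \Big( \int_{-\infty}^{\infty} \nabla_k U_i(\psi^0(t, x, v; \vec X) - X_i)\,dt \Big) \times \Big( \int_{-\infty}^{\infty} \nabla_l U_j(\psi^0(t, x, v; \vec X) - X_j)\,dt \Big) \rho\Big(\frac12 |v|^2\Big)\,\nu(dx, dv). \]
Notice that the integral above, although it might look like infinite at a glance, is actually finite by Corollary 3.2.3 and assumptions (A1) and (A2).

We next give the definition of the drift term of the limit process. For any $(x, v) \in E$, $\vec X, \vec V \in \mathbb{R}^{dN}$ and $a \in \mathbb{R}$, let $z(t; x, v, \vec X, \vec V, a) \in \mathbb{R}^d$ denote the solution of
\[ \begin{cases} \dfrac{d^2}{dt^2} z(t) = -\sum_{i=1}^N \nabla^2 U_i(\psi^0(t, x, v, \vec X) - X_i)(z(t) - (t + a)V_i), \\ \lim_{t \to -\infty} z(t) = \lim_{t \to -\infty} \dfrac{d}{dt} z(t) = 0. \end{cases} \qquad (2.3) \]
Then $z(t; x, v, \vec X, \vec V, a)$ is a linear function of $\vec V$. Let $b_{ik;jl}: \mathbb{R}^{dN} \to \mathbb{R}$ be the $C^\infty$-functions determined by the following:
\[ -\int_E \Big( \int_{-\infty}^{\infty} \nabla^2 U_i(\psi^0(t, x, v, \vec X) - X_i)\, z(t, x, v, \vec X, \vec V, -t)\,dt \Big) \rho\Big(\frac12 |v|^2\Big)\,\nu(dx, dv) = \sum_{l=1}^d \sum_{j=1}^N b_{i\cdot;jl}(\vec X) V_{jl}, \]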
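or equivalently,
\[ -\int_E \Big( \int_{-\infty}^{\infty} \sum_{p=1}^d \nabla_k \nabla_p U_i(\psi^0(t, x, v, \vec X) - X_i)\, z_p(t, x, v, \vec X, \vec V, -t)\,dt \Big) \rho\Big(\frac12 |v|^2\Big)\,\nu(dx, dv) = \sum_{l=1}^d \sum_{j=1}^N b_{ik;jl}(\vec X) V_{jl}, \qquad k = 1, \dots, d, \]
where $z_p$ means the $p$th element of the vector $z$ for $p = 1, \dots, d$. By the same reason as that for the quadratic term, the integral on the left-hand side above is finite.

Now we are in a position to give the definition of the limit diffusion generator $L$ on $\mathbb{R}^{2dN}$:
\[ L = \frac12 \sum_{i,j=1}^N \sum_{k,l=1}^d a_{ik,jl}(\vec X) \frac{\partial^2}{\partial V_{ik} \partial V_{jl}} + \sum_{i,j=1}^N \sum_{k,l=1}^d b_{ik,jl}(\vec X) V_{jl} \frac{\partial}{\partial V_{ik}} + \sum_{i=1}^N \sum_{k=1}^d V_{ik} \frac{\partial}{\partial X_{ik}}. \]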
The coefficients $a_\cdot$ and $b_\cdot$ correspond to the 0-order and the 1-order approximations, respectively, given by the “freezing approximation” of the molecules (see also Sec. 1).

Our main results in the present paper are formulated in Theorem 2.0.1 below. (1) ensures the existence of a unique dynamics for $\tilde P_m$-almost every initial condition. (2)–(4) of Theorem 2.0.1 are to be understood with respect to the convergence of the distribution of $\{(\vec X^{(m)}(t, \tilde\omega), \vec V^{(m)}(t, \tilde\omega)), t \ge 0\}$ under $\tilde P_m(d\tilde\omega)$ as $m \to 0$: for the case of only one molecule, we have the convergence with no further assumption (assertion (2)); when there is more than one molecule, in the general case, the convergence is valid until the stopping time given as the first time at which the potential ranges of any pair of molecules overlap (assertion (3)); finally, for the special case of exactly two molecules with spherically symmetric potentials, we strengthen the result by allowing the process to run until an arbitrary time (assertion (4)). The precise description is as follows.

Theorem 2.0.1. Under our present setting, we have the following.
(1) For any $m > 0$, there exists a unique solution to (2.1) for $\tilde P_m$-almost every $\tilde\omega$.
(2) Assume $N = 1$. Then as $m \to 0$, the distribution of $\{(X_1^{(m)}(t), V_1^{(m)}(t)), t \ge 0\}$ under $\tilde P_m$ converges weakly to the diffusion process with generator $L$ in $C([0, \infty); \mathbb{R}^{2d})$ equipped with the Skorohod metric.
(3) Assume $N \ge 2$. Let
\[ \sigma_0(\tilde\omega) = \inf\Big\{ t > 0;\ \min_{i \ne j} \{|X_i(t; \tilde\omega) - X_j(t; \tilde\omega)| - (R_i + R_j)\} \le 0 \Big\} \]
be the first time at which the distance between the molecules in some pair is less than the sum of the radii of their potentials. Then as $m \to 0$, the distribution of $\{(\vec X^{(m)}(t \wedge \sigma_0), \vec V^{(m)}(t \wedge \sigma_0)), t \ge 0\}$ under $\tilde P_m$ converges weakly to the diffusion with generator $L$ stopped at $\sigma_0$ in $C([0, \infty); \mathbb{R}^{2dN})$ equipped with the Skorohod metric.
(4) Let $N = 2$ and $d \ge 3$. Assume that there exist functions $h_1, h_2$ such that
\[ U_i(x) = h_i(|x|), \qquad i = 1, 2, \]
and there exists a constant ε0 > 0 such that (−1)i−1 hi (s) > 0,
(−1)i−1 hi (s) > 0,
s ∈ (Ri − ε0 , Ri ),
i = 1, 2.
Then we have that when $m \to 0$, $\{(\vec X^{(m)}(t), \vec V^{(m)}(t)), t \ge 0\}$ under $\tilde P_m$ converges weakly to the Markov process given by the following: it acts as the diffusion with generator $L$ as long as the potential ranges of the two molecules do not overlap, and the two molecules collide whenever their potential ranges touch each other. (See Theorem 6.3.2 for the precise definition of the limiting process.)

Let us comment a little more on the conditions in Theorem 2.0.1. We do so for (4) first. The first half of the conditions requires that the potential functions for the two molecules depend only on the distances from the atoms. The condition $d \ge 3$ is used in the proof of (4), and it is not strange to have it here since, as remarked at the end of Sec. 3, our cut-off fits reality (in the sense described there) only if $d \ge 3$. Finally, the second half of the assumptions above implies that, at least near the edges of the potential ranges, one molecule experiences repulsive forces with the atoms, and the other molecule experiences attractive forces with the atoms. We use this condition to keep the velocities of the two molecules bounded (for $m \to 0$). This is also the reason why we need to stop the process at $\sigma_0$ in (3): our decomposition of $V_i(t)$ (see (3.30)) is valid only when the velocities of the molecules are $O(1)$, which holds until $\sigma_0$ without further assumption (see (3.31)), while this is not always true after $\sigma_0$ (to see this, notice that the “resulting direct interactions” $-m^{-1/2} \int_0^{t \wedge \sigma} \nabla_i U(\vec X(s))\,ds$ between molecules in (3.30) become $\infty$ when $m \to 0$ if $\nabla_i U(\vec X(s)) \ne 0$).

We succeeded in extending the result to arbitrary times for the special case described in (4), by showing that in that case the “resulting direct interactions” turn out to be “colliding forces”, which do not change the total momenta of the molecules; in the general case, by contrast, these might accelerate the molecules to $\infty$ immediately (to see this, just consider the case of two molecules of the same type), making the decomposition itself no longer valid. (See also Lemma 3.5.1 and the paragraphs following it.)

Remark 1. We can also get the unique existence of the solution to (2.1) for $\tilde P_m$-almost every $\tilde\omega$ under some simpler-looking assumptions (see Proposition 3.3.9).

Remark 2. We emphasize again that, as explained in Sec. 1, in our present problem the forces at any fixed time are not independent of the history. Therefore, since both the molecules and the light “environmental” atoms are moving, the system is very complicated and difficult to handle. Our basic idea for the proof is that, although all of the particles are moving all the time, since the molecules are very heavy compared with the atoms, when considering the scattering of the atoms we can use the approximation that the molecules are frozen (see (2.2)), which gives us the 0-order approximation of the momentum variance of the molecules. The 1-order error appears in our result as $z(t, x, v; \vec X, \vec V)$. (See Secs. 3–5 for more details.)
Remark 3. For any fixed $m > 0$, although $V_i(t)$ is continuous with respect to $t$ (since it is described by the ODEs (2.1)), our martingale part $M_i(t)$ in the decomposition of $V_i(t)$ (see Lemma 3.5.1) need not be continuous. The only thing we can say is that its jumps are dominated by some constant multiple of $m^{1/2}$ (see Lemma 3.5.1). This is also one of our ideas, namely to use the martingale theorem only for the terms for which it is applicable. For the remaining terms, instead of trying to deal with them in detail, we show that they are negligible as $m \to 0$.

The rest of this paper is organized as follows: In Sec. 3, we prove the unique existence of the solution, and present some preparation for the proof of convergence. In particular, we formulate the decomposition of $V_i(t)$ (see Lemma 3.5.1) and deduce from it some properties. The “freezing approximation” is also discussed in this section. Section 4 gives the proof of the lemmas formulated in Sec. 3. In Sec. 5, we use these lemmas to prove the first two convergence results (Theorems 2.0.1(2) and 2.0.1(3)), with the help of “martingale theory”. The proof of the last part of Theorem 2.0.1 is given in Sec. 6.

3. Preparations

In this section, we formulate the ray representation, prove the unique existence of the solution of the dynamics for each fixed $m > 0$, and give some preparations for the proof of our convergence results. For the sake of simplicity, from now on, we will omit the superscript $(m)$ when there is no risk of confusion.

We present related results of classical mechanics, especially Newton's equation and the ray representation, in Sec. 3.1; some results with respect to $\psi(t, x, v; \vec X)$ are prepared in Sec. 3.2; Sec. 3.3 is devoted to the almost sure unique existence of the solution of (2.1) with the help of the ray representation; in Sec. 3.4, we recall some basic facts about the Skorohod spaces $(D([0, T]; \mathbb{R}^d), d_0)$ and $(D([0, \infty); \mathbb{R}^d), \mathrm{dis})$, which will be used later (as described in Remark 3, although both $(\vec X(t, \omega), \vec V(t, \omega))$ and the limit processes are continuous with respect to $t$, this new space is necessary in our proof); in Sec. 3.5, we state several basic lemmas, especially the decomposition of $V_i(t)$, the proof of which will be given in Sec. 4; finally, in Sec. 3.6, we prepare some basic calculations for later use.

Since we are considering the Skorohod metric, it suffices for us to prove our assertions for $t \in [0, T]$ for any $T > 0$, instead of $t \in [0, \infty)$. (See [1].) So from now on, we choose an arbitrary $T > 0$ and fix it. Also, as mentioned in Sec. 1, we use the stopping time at which the velocity of some molecule becomes larger than or equal to $n$: choose any $n \ge 1$ and fix it for a while (we will take $n \to \infty$ at the end).

Now, we are ready to define the following notations: Let
\[ \sigma(\omega) = \sigma_n(\omega) = \inf\Big\{ t \ge 0;\ \max_{i=1,\dots,N} |V_i(t, \omega)| \ge n \Big\}, \]
\[ R_0 = R_0(n, T) = \max_{i=1,\dots,N} (R_i + |X_{i,0}| + nT) + 1, \]
\[ C_0 = \Big( 2 \sum_{i=1}^N R_i \|\nabla U_i\|_\infty \Big)^{1/2}, \]
\[ \tau = \tau(n, T) = C_0^{-1} R_0. \]
Also, for reader's convenience, we give a brief list of the other main notations and their meanings used in this paper:
$(\vec X(t), \vec V(t), x(t, x, v), v(t, x, v))$: Solution of the dynamics,
$\tilde\varphi(t, x, v; \vec X)$: Solution of (2.2), the motion of the atoms with the molecules “frozen”, which is shown to be the 0-order term in our approximation of $(x(t, x, v), v(t, x, v))$,
$\psi(t, x, v; \vec X)$: The corresponding scattering,
$\Psi(t, x, v) := (x - tv, v)$, used in the ray representation,
$z(t, x, v; \vec X, \vec V)$: Solution of (2.3), the 1-order term in our approximation of $(x(t, x, v), v(t, x, v))$.

3.1. Classical mechanics

In this and the next subsection, we prepare some results with respect to the solution of Newton's equation (2.2). As in Sec. 2, for any $\vec X \in (\mathbb{R}^d)^N$, let $\tilde\varphi(t, x, v; \vec X)$ be the solution of (2.2). First, let us recall the following well-known result about Newton's equation.

Proposition 3.1.1. For any $f: \mathbb{R}^{2d} \to [0, \infty)$, we have
\[ \int_{\mathbb{R}^{2d}} f(\tilde\varphi(t, x, v; \vec X))\, \rho\Big( \frac12 |v|^2 + \sum_{i=1}^N U_i(x - X_i) \Big)\,dx\,dv = \int_{\mathbb{R}^{2d}} f(x, v)\, \rho\Big( \frac12 |v|^2 + \sum_{i=1}^N U_i(x - X_i) \Big)\,dx\,dv. \qquad (3.1) \]
Proof. As the proof is fundamental and well-known, we give a sketch only. First, since the total energy is constant, we have that
\[ \frac12 |v|^2 + \sum_{i=1}^N U_i(x - X_i) = \frac12 |\tilde\varphi^1(t, x, v; \vec X)|^2 + \sum_{i=1}^N U_i(\tilde\varphi^0(t, x, v; \vec X) - X_i), \]
so the left-hand side of (3.1) is equal to
\[ \int_{\mathbb{R}^{2d}} f(\tilde\varphi(t, x, v; \vec X))\, \rho\Big( \frac12 |\tilde\varphi^1(t, x, v; \vec X)|^2 + \sum_{i=1}^N U_i(\tilde\varphi^0(t, x, v; \vec X) - X_i) \Big)\,dx\,dv. \]
Therefore, in order to show the assertion, it is sufficient to show that $\big| \frac{\partial(\tilde\varphi^0, \tilde\varphi^1)}{\partial(x, v)} \big| = 1$ for any $t > 0$. On the other hand, by a straightforward calculation, we get by the definition of $\tilde\varphi(t, x, v; \vec X)$ that $\frac{d}{dt} \big( \big| \frac{\partial(\tilde\varphi^0, \tilde\varphi^1)}{\partial(x, v)} \big| \big) = 0$; also, we have by definition $(\tilde\varphi^0, \tilde\varphi^1)(0, x, v; \vec X) = (x, v)$. This completes the proof of our assertion.

The rest of this subsection is dedicated to a discussion of the ray representation. Let $E$, $E_v$, $\Psi$, etc., be as given in Sec. 2. Note that for any measurable
$f: \mathbb{R}^{2d} \to [0, \infty)$, we have by definition and a simple variable change that
\[ \int_{\mathbb{R}^{2d}} f(x, v)\,dx\,dv = \int_{\mathbb{R} \times E} f(\Psi(t, x, v))\,dt\,\nu(dx, dv). \qquad (3.2) \]
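In order to derive our new intensity function $\lambda_m$, for the sake of simplicity, we introduce the following notations. Let
\[ \Psi_m: \mathbb{R} \times E \to \mathbb{R}^d \times (\mathbb{R}^d \setminus \{0\}), \qquad (s, x, v) \mapsto \Psi_m(s, x, v) = \Psi(s, x, m^{-1/2} v), \]
and let
\[ f_m(x, v) = f(x, m^{-1/2} v), \qquad \rho_0(x, v) = \rho\Big( \frac12 |v|^2 + \sum_{i=1}^N U_i(x - X_{i,0}) \Big). \]
Then we have
\begin{align*}
\int_{\mathbb{R}^{2d}} f(x, v)\, \tilde\lambda_m(dx, dv)
&= m^{\frac{d-1}{2}} \int_{\mathbb{R}^{2d}} f(x, v)\, \rho\Big( \frac{m}{2} |v|^2 + \sum_{i=1}^N U_i(x - X_{i,0}) \Big)\,dx\,dv \\
&= m^{-\frac12} \int_{\mathbb{R}^{2d}} f(x, m^{-\frac12} v)\, \rho\Big( \frac12 |v|^2 + \sum_{i=1}^N U_i(x - X_{i,0}) \Big)\,dx\,dv \\
&= m^{-\frac12} \int_{\mathbb{R} \times E} f_m(\Psi(s, x, v))\, \rho_0(\Psi(s, x, v))\,ds\,\nu(dx, dv) \\
&= m^{-1} \int_{\mathbb{R} \times E} f_m(\Psi(m^{-\frac12} s, x, v))\, \rho_0(\Psi(m^{-\frac12} s, x, v))\,ds\,\nu(dx, dv),
\end{align*}
where we used (3.2) when passing to the fourth line. On the other hand,
\[ f_m(\Psi(m^{-\frac12} s, x, v)) = f(x - m^{-\frac12} s v, m^{-\frac12} v) = f(\Psi_m(s, x, v)). \]
Therefore,
\[ \int_{\mathbb{R}^{2d}} f(x, v)\, \tilde\lambda_m(dx, dv) = \int_{\mathbb{R} \times E} f(\Psi_m(s, x, v))\, \lambda_m(ds, dx, dv), \]
where $\lambda_m(ds, dx, dv)$ is the measure on $\mathbb{R} \times E$ defined by
\[ \lambda_m(ds, dx, dv) = m^{-1} \rho_0(\Psi(m^{-\frac12} s, x, v))\,ds\,\nu(dx, dv) = m^{-1} \rho\Big( \frac12 |v|^2 + \sum_{i=1}^N U_i(x - m^{-1/2} s v - X_{i,0}) \Big)\,ds\,\nu(dx, dv). \]
Also, with a little abuse of notation, we use $\Psi_m$ to denote the natural map from $\Omega = \mathrm{Conf}(\mathbb{R} \times E)$ to $\mathrm{Conf}(\mathbb{R}^d \times (\mathbb{R}^d \setminus \{0\}))$, i.e. $\Psi_m(A) = \{\Psi_m(a) \mid a \in A\}$.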
Let $P_m(d\omega) = P_{\lambda_m}(d\omega)$ be the Poisson point process on $\mathrm{Conf}(\mathbb{R} \times E)$ with intensity function $\lambda_m(ds, dx, dv)$. Then since $\lambda_m(B) = \tilde\lambda_m(\Psi_m(B))$ for any $B \in \mathcal{B}(\mathbb{R} \times E)$, we have that
\[ P_m(A) = \tilde P_m(\Psi_m(A)), \qquad \text{for all } A \in \mathcal{E}_0. \]
Therefore, we can convert our problem with respect to $\mathrm{Conf}(\mathbb{R}^d \times \mathbb{R}^d)$ to a problem with respect to $\mathrm{Conf}(\mathbb{R} \times E)$. In summary, we let
\[ \Omega = \mathrm{Conf}(\mathbb{R} \times E), \qquad \lambda(ds, dx, dv) = \lambda_m(ds, dx, dv) = m^{-1} \rho\Big( \frac12 |v|^2 + \sum_{i=1}^N U_i(x - m^{-1/2} s v - X_{i,0}) \Big)\,ds\,\nu(dx, dv), \]
and let $P_m = P_{\lambda_m}$ be the Poisson point process on $\Omega$ with intensity $\lambda_m(ds, dx, dv)$. $\omega \in \Omega$ has distribution $P_m$, and for each initial condition $\omega$, we are considering the following system of infinite ODEs (we omit the superscript $(m)$ for the sake of simplicity):
\[ \begin{cases} \dfrac{d}{dt} X_i(t, \omega) = V_i(t, \omega), \\ M_i \dfrac{d}{dt} V_i(t, \omega) = -\displaystyle\int_{\mathbb{R} \times E} \nabla U_i(X_i(t, \omega) - x(t, \Psi(s, x, m^{-\frac12} v)))\, \mu_\omega(ds, dx, dv), \\ (X_i(0, \omega), V_i(0, \omega)) = (X_{i,0}, V_{i,0}), \qquad i = 1, \dots, N, \\ \dfrac{d}{dt} x(t, x, v, \omega) = v(t, x, v, \omega), \\ m \dfrac{d}{dt} v(t, x, v, \omega) = -\sum_{i=1}^N \nabla U_i(x(t, x, v, \omega) - X_i(t, \omega)), \\ (x(0, x, v, \omega), v(0, x, v, \omega)) = (x, v), \qquad (x, v) \in \Psi(\omega). \end{cases} \qquad (3.3) \]

3.2. Classical scattering

Continuing as in Sec. 3.1, let $\vec X = (X_1, \dots, X_N) \in (\mathbb{R}^d)^N$, and let $\tilde\varphi$ be the solution of (2.2). In this subsection, we prove some results with respect to $\psi(t, x, v; \vec X)$ (see (3.4) below for its definition). We call it “classical” scattering since, as opposed to $(x(t, x, v, \omega), v(t, x, v, \omega))$, the massive particles are not moving when considering $\tilde\varphi(t, x, v; \vec X)$.

Lemma 3.2.1. Let $R(\vec X) = \max\{R_i + |X_i|;\ i = 1, \dots, N\}$, and let $s_0 = \frac{R(\vec X)}{|v|}$. Then for any $(x, v) \in E$ and $t \in \mathbb{R}$, we have that $\tilde\varphi(t + s, \Psi(s, x, v); \vec X)$ is independent of $s$ as long as $s \ge s_0$.
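Proof. For any $(x, v) \in E$, notice that $x \cdot v = 0$ by definition of $E$, so
\[ \inf_{0 \le r \le u} |x - (s_0 + u)v + rv| = \inf_{0 \le r \le u} |x - s_0 v - (u - r)v| \ge s_0 |v| = R(\vec X), \qquad u \ge 0, \]
which implies that the derivative of $\tilde\varphi^1$ with respect to time $t$ (the right-hand side of (2.2)) is 0, hence
\[ \tilde\varphi(u, \Psi(s_0 + u, x, v); \vec X) = (x - s_0 v, v) = \Psi(s_0, x, v), \qquad u \ge 0. \]
Therefore, by the Markovian property of $\tilde\varphi$, we have
\[ \tilde\varphi(t + s_0 + u, \Psi(s_0 + u, x, v); \vec X) = \tilde\varphi(t + s_0, \tilde\varphi(u, \Psi(s_0 + u, x, v); \vec X); \vec X) = \tilde\varphi(t + s_0, \Psi(s_0, x, v); \vec X), \qquad t \in \mathbb{R},\ u \ge 0. \]
That is,
\[ \tilde\varphi(t + s, \Psi(s, x, v); \vec X) = \tilde\varphi(t + s_0, \Psi(s_0, x, v); \vec X), \qquad \text{for any } s \ge s_0, \]
or equivalently, $\tilde\varphi(t + s, \Psi(s, x, v); \vec X)$ is independent of $s$ as long as $s \ge s_0$.

By Lemma 3.2.1, we get that $\lim_{s \to \infty} \tilde\varphi(t + s, \Psi(s, x, v); \vec X)$ is well-defined, and is equal to $\tilde\varphi(t + s_0, \Psi(s_0, x, v); \vec X)$. Write it as $\psi(t, x, v; \vec X) = (\psi^0(t, x, v; \vec X), \psi^1(t, x, v; \vec X))$, i.e.
\[ \psi(t, x, v; \vec X) = (\psi^0(t, x, v; \vec X), \psi^1(t, x, v; \vec X)) = \lim_{s \to \infty} \tilde\varphi(t + s, \Psi(s, x, v); \vec X) = \tilde\varphi(t + s_0, \Psi(s_0, x, v); \vec X). \qquad (3.4) \]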
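The statement of Lemma 3.2.1 can be checked numerically in a simple case. The sketch below is an illustration only, under assumptions that are ours and not the paper's: $d = 2$, one frozen molecule at a hypothetical position `X`, a hypothetical bump potential of range $R = 1$, and a velocity-Verlet discretization of (2.2). It evaluates $\tilde\varphi(t + s, \Psi(s, x, v); \vec X)$ for two values of $s \ge s_0$ and compares the results.

```python
import math

R = 1.0                      # hypothetical potential range
X = (0.3, -0.2)              # hypothetical position of the frozen molecule

def grad_bump(x, y, h=1.0):
    """Gradient of U(x) = h*exp(-1/(1 - |x|^2/R^2)) for |x| < R, zero otherwise."""
    q = (x * x + y * y) / (R * R)
    if q >= 1.0:
        return 0.0, 0.0
    c = -2.0 * h * math.exp(-1.0 / (1.0 - q)) / (R * R * (1.0 - q) ** 2)
    return c * x, c * y

def frozen_flow(x, v, t, dt=1e-3):
    """Velocity-Verlet approximation of the solution of (2.2) at time t >= 0."""
    (px, py), (vx, vy) = x, v
    gx, gy = grad_bump(px - X[0], py - X[1])
    for _ in range(int(round(t / dt))):
        vx -= 0.5 * dt * gx
        vy -= 0.5 * dt * gy
        px += dt * vx
        py += dt * vy
        gx, gy = grad_bump(px - X[0], py - X[1])
        vx -= 0.5 * dt * gx
        vy -= 0.5 * dt * gy
    return (px, py), (vx, vy)

def scattering(t, x, v, s):
    """phi(t + s, Psi(s, x, v); X), which equals psi(t, x, v; X) once s >= s0."""
    start = (x[0] - s * v[0], x[1] - s * v[1])     # Psi^0(s, x, v) = x - s*v
    return frozen_flow(start, v, t + s)

x, v = (0.0, 0.4), (2.0, 0.0)                      # (x, v) in E since x . v = 0
s0 = (R + math.hypot(X[0], X[1])) / math.hypot(v[0], v[1])
a = scattering(1.5, x, v, 1.0)                     # s = 1.0 >= s0
b = scattering(1.5, x, v, 2.0)                     # s = 2.0 >= s0
print(s0, a, b)
```

The two runs differ only by an initial stretch of free flight outside the potential range, so their outputs coincide up to rounding, which is the content of the lemma in this toy setting.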
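With the same notations as in Sec. 2, we shall present one more result concerning $\tilde\varphi(t, x, v; \vec X)$.

Proposition 3.2.2. Suppose that $|v| > 2C_0$. Then
\[ \tilde\varphi^1(t, x, v; \vec X) \cdot (|v|^{-1} v) > C_0, \qquad \text{for any } t \in \mathbb{R},\ x \in E_v. \]
Proof. Notice that $\tilde\varphi^1(0, x, v; \vec X) = v$. Write $\eta = |v|^{-1} v$. Then by assumption, $v \cdot \eta = |v| > 2C_0$. Let
\[ \tau_1 = \inf\{t \ge 0;\ \tilde\varphi^1(t, x, v; \vec X) \cdot \eta \le C_0\}. \]
We show that $\tau_1 = +\infty$. Suppose $\tau_1 < +\infty$. Then $\tilde\varphi^1(\tau_1, x, v; \vec X) \cdot \eta = C_0$. By definition, we have
\[ (\tilde\varphi^0(t, x, v; \vec X) - \tilde\varphi^0(s, x, v; \vec X)) \cdot \eta = \int_s^t \tilde\varphi^1(u, x, v; \vec X) \cdot \eta\,du > C_0 |t - s|, \qquad \text{for any } 0 \le s < t \le \tau_1, \]
which implies that
\[ \frac{d}{dt} (\tilde\varphi^0(t, x, v; \vec X) \cdot \eta) \ge C_0, \qquad 0 \le t \le \tau_1. \qquad (3.5) \]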
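In particular, $\frac{d}{dt} (\tilde\varphi^0(t, x, v; \vec X) \cdot \eta) > 0$ for $0 \le t \le \tau_1$. Also, since
\[ \tilde\varphi^1(\tau_1, x, v; \vec X) - v = -\sum_{i=1}^N \int_0^{\tau_1} \nabla U_i(\tilde\varphi^0(t, x, v; \vec X) - X_i)\,dt, \]
we have by definition that
\[ -\sum_{i=1}^N \int_0^{\tau_1} \nabla U_i(\tilde\varphi^0(t, x, v; \vec X) - X_i) \cdot \eta\,dt = \tilde\varphi^1(\tau_1, x, v; \vec X) \cdot \eta - v \cdot \eta < C_0 - 2C_0 = -C_0. \]
Therefore, with the help of (3.5), we have
\begin{align*}
C_0 &< \sum_{i=1}^N \int_0^{\tau_1} \nabla U_i(\tilde\varphi^0(t, x, v; \vec X) - X_i) \cdot \eta\,dt \\
&\le \frac{1}{C_0} \sum_{i=1}^N \int_0^{\tau_1} |\nabla U_i(\tilde\varphi^0(t, x, v; \vec X) - X_i) \cdot \eta| \cdot \frac{d}{dt} (\tilde\varphi^0(t, x, v; \vec X) \cdot \eta)\,dt \\
&= \frac{1}{C_0} \sum_{i=1}^N \int_{\{t \in [0, \tau_1],\ |\tilde\varphi^0(t, x, v; \vec X) \cdot \eta - X_i \cdot \eta| < R_i\}} |\nabla U_i(\tilde\varphi^0(t, x, v; \vec X) - X_i) \cdot \eta|\, \frac{d}{dt} (\tilde\varphi^0(t, x, v; \vec X) \cdot \eta)\,dt \\
&\le \frac{1}{C_0} \sum_{i=1}^N \|\nabla U_i\|_\infty \int_{\{t \in [0, \tau_1],\ |\tilde\varphi^0(t, x, v; \vec X) \cdot \eta - X_i \cdot \eta| < R_i\}} \frac{d}{dt} (\tilde\varphi^0(t, x, v; \vec X) \cdot \eta - X_i \cdot \eta)\,dt \\
&\le \frac{1}{C_0} \sum_{i=1}^N \|\nabla U_i\|_\infty\, 2R_i \\
&= C_0,
\end{align*}
which is a contradiction. Therefore, $\tau_1 = +\infty$, i.e. $\tilde\varphi^1(t, x, v; \vec X) \cdot (|v|^{-1} v) > C_0$ for any $t \ge 0$. The assertion for $t < 0$ can be shown in the same way by considering $\tau_2 = \sup\{t < 0;\ \tilde\varphi^1(t, x, v; \vec X) \cdot \eta \le C_0\}$.

By Proposition 3.2.2, we get the following important result with respect to $\psi^0(t, x, v; \vec X)$.

Corollary 3.2.3. For any $(x, v) \in E$ with $|v| > 2C_0$, we have that
\[ |\psi^0(t, x, v; \vec X) - X_i| > R_i, \qquad i = 1, \dots, N, \]
if $t \ge 2C_0^{-1} R(\vec X)$ or $t \le -C_0^{-1} R(\vec X)$.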
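As a sanity check of Proposition 3.2.2 in a toy case, the following sketch (an illustration under our own assumptions, not the paper's construction) takes $d = 2$, $N = 1$, a hypothetical bump potential with range `R`, estimates $\|\nabla U\|_\infty$ on a radial grid, forms $C_0 = (2R\|\nabla U\|_\infty)^{1/2}$, and tracks $\tilde\varphi^1 \cdot \eta$ along a velocity-Verlet trajectory for forward times only. The names `grad_bump`, `grad_sup` and `min_forward_speed` are hypothetical.

```python
import math

H, R = 1.0, 1.0              # hypothetical bump height and potential range
X = (0.3, -0.2)              # hypothetical position of the frozen molecule

def grad_bump(x, y):
    """Gradient of U(x) = H*exp(-1/(1 - |x|^2/R^2)) for |x| < R, zero otherwise."""
    q = (x * x + y * y) / (R * R)
    if q >= 1.0:
        return 0.0, 0.0
    c = -2.0 * H * math.exp(-1.0 / (1.0 - q)) / (R * R * (1.0 - q) ** 2)
    return c * x, c * y

# Grid estimate of the sup norm of grad U (U is radial, so one ray suffices).
grad_sup = max(math.hypot(*grad_bump(k * R / 10000.0, 0.0)) for k in range(10000))
C0 = math.sqrt(2.0 * R * grad_sup)

def min_forward_speed(x, v, t, dt=1e-3):
    """Minimum over the time grid of phi^1 . eta, eta = v/|v|, along (2.2) on [0, t]."""
    speed = math.hypot(v[0], v[1])
    ex, ey = v[0] / speed, v[1] / speed
    (px, py), (vx, vy) = x, v
    gx, gy = grad_bump(px - X[0], py - X[1])
    lowest = vx * ex + vy * ey
    for _ in range(int(round(t / dt))):
        vx -= 0.5 * dt * gx
        vy -= 0.5 * dt * gy
        px += dt * vx
        py += dt * vy
        gx, gy = grad_bump(px - X[0], py - X[1])
        vx -= 0.5 * dt * gx
        vy -= 0.5 * dt * gy
        lowest = min(lowest, vx * ex + vy * ey)
    return lowest

x, v = (0.0, 0.4), (3.0, 0.0)      # x . v = 0 and |v| = 3, chosen so that |v| > 2*C0
lowest = min_forward_speed(x, v, t=5.0)
print(C0, lowest)
```

In this example the velocity component along the initial direction stays above $C_0$ throughout, so the atom keeps moving forward and leaves the potential range, in line with Corollary 3.2.3.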
Proof. Choose and fix any $(x, v) \in E$ with $|v| > 2C_0$, and let $\eta = |v|^{-1} v$. Then since $x \cdot v = 0$, we have that
\[ \Psi^0(s, x, v) \cdot \eta = (x - sv) \cdot \eta = -s v \cdot \eta = -s|v|, \qquad \text{for any } s > 0. \]
Let $s_0 = \frac{R(\vec X)}{|v|}$ as before. Then $s_0 < C_0^{-1} R(\vec X)$, and
\[ \tilde\varphi^0(0, \Psi(s_0, x, v); \vec X) \cdot \eta = \Psi^0(s_0, x, v) \cdot \eta = -s_0 |v| = -R(\vec X). \qquad (3.6) \]
Also, $|\Psi^1(s_0, x, v)| = |v| > 2C_0$ by assumption. Combining (3.4), Proposition 3.2.2 and (3.6), we get
\begin{align*}
\psi^0(t, x, v; \vec X) \cdot \eta &= \tilde\varphi^0(t + s_0, \Psi(s_0, x, v); \vec X) \cdot \eta \\
&= \int_0^{t + s_0} \tilde\varphi^1(u, \Psi(s_0, x, v); \vec X) \cdot \eta\,du + \tilde\varphi^0(0, \Psi(s_0, x, v); \vec X) \cdot \eta \\
&> (t + s_0) C_0 - R(\vec X), \qquad \text{for any } t > -s_0.
\end{align*}
In particular, if $t > 2C_0^{-1} R(\vec X)$, then
\[ \psi^0(t, x, v; \vec X) \cdot \eta > (t + s_0) C_0 - R(\vec X) \ge R(\vec X). \]
In the same way, if $t < -C_0^{-1} R(\vec X)$, then $t + s_0 < 0$, so
\begin{align*}
\psi^0(t, x, v; \vec X) \cdot \eta &= \tilde\varphi^0(t + s_0, \Psi(s_0, x, v); \vec X) \cdot \eta \\
&= -\int_{t + s_0}^0 \tilde\varphi^1(u, \Psi(s_0, x, v); \vec X) \cdot \eta\,du + \tilde\varphi^0(0, \Psi(s_0, x, v); \vec X) \cdot \eta \\
&< -C_0 \cdot (-(t + s_0)) - R(\vec X) < -R(\vec X).
\end{align*}
This completes the proof of our assertion.

Proposition 3.2.4. For any measurable $f: \mathbb{R}^{2d} \to [0, \infty)$ such that the integrand below is integrable, we have
\[ \int_{\mathbb{R}^{2d}} f(x, v)\, \rho\Big( \frac12 |v|^2 + \sum_{i=1}^N U_i(x - X_i) \Big)\,dx\,dv = \int_E \Big( \int_{-\infty}^{\infty} f(\psi(t, x, v; \vec X))\,dt \Big) \rho\Big( \frac12 |v|^2 \Big)\,\nu(dx, dv). \qquad (3.7) \]
Remark 4. The integral $\int_{-\infty}^{\infty} f(\psi(t, x, v; \vec X))\,dt$ on the right-hand side of (3.7), although it might look as being an infinite integral, is actually a finite one by Corollary 3.2.3.

Proof. By using approximation and taking limit with the help of convergence theorem, we may and do assume, without loss of generality, that there exists a
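constant $\tilde R > 0$ such that
\[ \mathrm{supp}(f) \subset \{(x, v);\ |x| + |v| \le \tilde R\}. \]
Let
\[ T = 2C_0^{-1} (\tilde R + R(\vec X)), \]
and let
\[ \tilde E(x, v) = \frac12 |v|^2 + \sum_{i=1}^N U_i(x - X_i). \]
Then by Proposition 3.1.1 and a simple change of variables, we have
\begin{align*}
\int_{\mathbb{R}^{2d}} f(x, v)\, \rho(\tilde E(x, v))\,dx\,dv
&= \int_{\mathbb{R}^{2d}} f(\tilde\varphi(T, x, v; \vec X))\, \rho(\tilde E(x, v))\,dx\,dv \\
&= \int_{\mathbb{R} \times E} f(\tilde\varphi(T, \Psi(t, x, v); \vec X))\, \rho(\tilde E(\Psi(t, x, v)))\,dt\,\nu(dx, dv). \qquad (3.8)
\end{align*}
Therefore, it suffices for us to show that the right-hand side of (3.8) is equal to
\[ \int_{\mathbb{R} \times E} f(\psi(T - t, x, v; \vec X))\, \rho\Big( \frac12 |v|^2 \Big)\,dt\,\nu(dx, dv). \]
We only need to show that the integrands are equal, i.e. it suffices to show that
\[ f(\tilde\varphi(T, \Psi(t, x, v); \vec X))\, \rho(\tilde E(\Psi(t, x, v))) = f(\psi(T - t, x, v; \vec X))\, \rho\Big( \frac12 |v|^2 \Big). \qquad (3.9) \]
Let us prove this in what follows. We first show that if the left-hand side of (3.9) is not 0, then it is equal to the right-hand side. Assume that $f(\tilde\varphi(T, \Psi(t, x, v); \vec X))\, \rho(\tilde E(\Psi(t, x, v))) \ne 0$. Then $\rho(\tilde E(\Psi(t, x, v))) > 0$ implies by our assumption that $\tilde E(\Psi(t, x, v)) > e_0$, so $|v| > 2C_0$, hence by Proposition 3.2.2,
\[ \tilde\varphi^1(s, \Psi(t, x, v); \vec X) \cdot \eta > C_0 \]
for any $s \in \mathbb{R}$, where $\eta = |v|^{-1} v$. Therefore, since $(\tilde\varphi^0, \tilde\varphi^1)$ is the solution of (2.2), we have by definition that
\begin{align*}
(\tilde\varphi^0(T, \Psi(t, x, v); \vec X) - \Psi^0(t, x, v)) \cdot \eta
&= \int_0^T \frac{d}{ds} \tilde\varphi^0(s, \Psi(t, x, v); \vec X)\,ds \cdot \eta \\
&= \int_0^T \tilde\varphi^1(s, \Psi(t, x, v); \vec X) \cdot \eta\,ds \\
&> T \cdot C_0 = 2(\tilde R + R(\vec X)), \qquad (3.10)
\end{align*}
where in the latter step we used the definition of $T$.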
We also have $f(\tilde\varphi(T, \Psi(t, x, v); \vec X)) > 0$ in addition, which gives us
\[ |\tilde\varphi^0(T, \Psi(t, x, v); \vec X) \cdot \eta| \le |\tilde\varphi^0(T, \Psi(t, x, v); \vec X)| + |\tilde\varphi^1(T, \Psi(t, x, v); \vec X)| \le \tilde R. \qquad (3.11) \]
Combining (3.10) with (3.11), and noticing that $x \cdot v = 0$ since $(x, v) \in E$, we get by the definition of $\eta$ that
\[ \tilde R + 2R(\vec X) < -\Psi^0(t, x, v) \cdot \eta = -(x - tv) \cdot \eta = t|v|, \qquad (3.12) \]
hence $t \ge \frac{\tilde R + 2R(\vec X)}{|v|} \ge s_0$. So by the definition of $\psi$, we get
\[ \psi(T - t, x, v; \vec X) = \tilde\varphi(T - t + t, \Psi(t, x, v); \vec X) = \tilde\varphi(T, \Psi(t, x, v); \vec X). \]
Also, (3.12) gives us that $|\Psi^0(t, x, v)| = |x - tv| \ge t|v| \ge R(\vec X)$, so by the definition of $\tilde E$, we also get
\[ \tilde E(\Psi(t, x, v)) = \frac12 |v|^2. \]
This completes the proof of the fact that if the left-hand side of (3.9) is not 0, then it is equal to the right-hand side.

We next show the opposite, i.e. we assume that the right-hand side of (3.9), $f(\psi(T - t, x, v; \vec X))\, \rho(\frac12 |v|^2)$, is not 0, and show that it is equal to the left-hand side, $f(\tilde\varphi(T, \Psi(t, x, v); \vec X))\, \rho(\tilde E(\Psi(t, x, v)))$. It is sufficient to show that $t \ge s_0$ ($= \frac{R(\vec X)}{|v|}$). (Indeed, if $t \ge s_0$, then by using $x \cdot v = 0$, we get $|x - tv| \ge t|v| \ge R(\vec X)$, hence $\tilde E(\Psi(t, x, v)) = \frac12 |v|^2$ by definition. Also, since $t \ge s_0$, we have by the definition of $\psi$ that $\psi(T - t, x, v; \vec X) = \tilde\varphi(T - t + t, \Psi(t, x, v); \vec X) = \tilde\varphi(T, \Psi(t, x, v); \vec X)$, which will complete our proof.) Since $\rho(\frac12 |v|^2) > 0$, we have $\frac12 |v|^2 > 2C_0^2$, hence $|v| > 2C_0$, which in turn by Proposition 3.2.2 gives us that
\[ \tilde\varphi^1(u, \Psi(s, x, v); \vec X) \cdot \eta > C_0 \qquad (3.13) \]
for any $u, s \in \mathbb{R}$ and $x \in E_v$. If $t \ge T$, then by the definition of $T$, since $|v| > 2C_0$, we have
\[ t \ge T = \frac{2}{C_0} (\tilde R + R(\vec X)) > \frac{4}{|v|} (\tilde R + R(\vec X)) > \frac{R(\vec X)}{|v|} = s_0. \]
If $t < T$, then we have by (3.13) and the definition of $T$ that for any $r > 0$
\begin{align*}
(\tilde\varphi^0(T - t + r, \Psi(r, x, v); \vec X) - \Psi^0(r, x, v)) \cdot \eta
&= \int_0^{T - t + r} \tilde\varphi^1(u, \Psi(r, x, v); \vec X) \cdot \eta\,du \\
&> (T - t + r) \cdot C_0 = 2\tilde R + 2R(\vec X) + (r - t) C_0. \qquad (3.14)
\end{align*}
Also, since $f(\psi(T - t, x, v; \vec X)) > 0$, we have
\[ |\psi^0(T - t, x, v; \vec X)| + |\psi^1(T - t, x, v; \vec X)| \le \tilde R. \]
Therefore, we have for any $r \ge s_0$
\[ |\tilde\varphi^0(T - t + r, \Psi(r, x, v); \vec X)| = |\psi^0(T - t, x, v; \vec X)| \le \tilde R. \qquad (3.15) \]
Combining (3.14) and (3.15), we get
\[ (r|v| =)\ -\Psi^0(r, x, v) \cdot \eta > \tilde R + 2R(\vec X) + (r - t) C_0, \qquad \text{for any } r \ge s_0. \]
Applying the above to $r = s_0 = \frac{R(\vec X)}{|v|}$, we get
\[ R(\vec X) > \tilde R + 2R(\vec X) + (s_0 - t) C_0. \]
Therefore, $t > s_0$. This completes our proof.

3.3. Existence and uniqueness of the solution

In this subsection, we prove the first assertion of Theorem 2.0.1, the almost sure unique existence of the solution of the considered infinite system of ODEs for any fixed $m > 0$. Recall that by Sec. 3.1, we have already converted the problem into (3.3), which uses the ray representation. In the following we shall use $C$, $\tilde C$, $C_1$, etc., to denote constants which may be different in different places.

For any open subset $G \subset \mathbb{R} \times E$, let $\theta_G: \mathrm{Conf}(\mathbb{R} \times E) \to \mathrm{Conf}(\mathbb{R} \times E)$, $\omega \mapsto \theta_G(\omega) = \omega \cap G$. Then $\theta_G$ is $\mathcal{E}_0/\mathcal{E}_0$-measurable. Here $\mathcal{E}_0$ is the $\sigma$-algebra on $\mathcal{O}(\mathbb{R} \times E) = \{A \subset \mathbb{R} \times E \mid A \ne \emptyset,\ A \text{ is closed}\}$, generated by $\{\{C \in \mathcal{O}(\mathbb{R} \times E);\ C \cap A \ne \emptyset\};\ A \text{ is open in } \mathbb{R} \times E\}$. Also, let $\mathcal{F}_G = \sigma\{X_K;\ K \subset G,\ K \text{ is compact}\} \vee \aleph$. Here $X_K$ is the random variable defined by $X_K(\omega) = \mu_\omega(K)$ ($= \#(\omega \cap K)$), $\omega \in \Omega$, and $\aleph$ stands for the set of null sets. Then it is trivial that $\{\mathcal{F}_G \mid G \text{ is open}\}$ is an increasing family of $\sigma$-algebras.

Let $\mathrm{Fin}(\mathbb{R} \times E)$ denote the set of non-empty finite subsets of $\mathbb{R} \times E$. It is easy to see that if $\omega \in \mathrm{Fin}(\mathbb{R} \times E)$, then (3.3) has a unique solution. In the following, we extend this unique existence of a solution for (3.3) to $P_m$-almost every $\omega$.

Fix any $T > 0$ as before. Let $R_0$ and $\tau$ be as given at the end of Sec. 2, set $G_n = \{(t, x, v) \in \mathbb{R} \times E;\ |x| < R_0,\ |t| < T + m^{1/2} \tau\}$, and let $\theta_n = \theta_{G_n}$.

Lemma 3.3.1. $\theta_n \omega \in \mathrm{Fin}(\mathbb{R} \times E)$ for $P_m$-a.e. $\omega$.

Proof. Let c =
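$\sum_{i=1}^N \|U_i\|_\infty$. Then by definition and assumption,
\begin{align*}
\lambda_m(G_n) &= \int_{\mathbb{R} \times E} 1_{G_n}(t, x, v)\, m^{-1} \rho\Big( \frac12 |v|^2 + \sum_{i=1}^N U_i(x - m^{-1/2} t v - X_{i,0}) \Big)\,dt\,\nu(dx, dv) \\
&\le (2R_0)^{d-1}\, 2(T + m^{1/2} \tau)\, m^{-1} \int_{\mathbb{R}^d} |v|\, \rho_c\Big( \frac12 |v|^2 \Big)\,dv < \infty. \qquad (3.16)
\end{align*}
So $E^{P_m}[\#(\theta_n \omega)] = E^{P_m}[\mu_\omega(G_n)] = \lambda_m(G_n) < \infty$.

By Lemma 3.3.1, we have that $(\vec X(t, \theta_n \omega), \vec V(t, \theta_n \omega))$ is well-defined for $P_m$-almost every $\omega$. Next, for any $t \in [0, T]$, we define $S_t: \mathrm{Fin}(\mathbb{R} \times E) \to \mathcal{O}(\mathbb{R} \times E)$ as
\[ S_t(\omega) = \Big\{ (u, x, v) \in \mathbb{R} \times E;\ \min_{i=1,\dots,N} \min_{0 \le s \le t} \Big\{ |X_i(s, \omega) - (x - (u - s) m^{-1/2} v)| - \Big( R_i + \frac12 \Big) \Big\} \le 0 \Big\} \]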
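for any $\omega \in \mathrm{Fin}(\mathbb{R} \times E)$. Then we have the following:

Lemma 3.3.2. For any open set $G$ and $\omega \in \mathrm{Fin}(\mathbb{R} \times E)$, we have that $\{S_t(\omega) \subset G\} = \{S_t(\theta_G \omega) \subset G\}$.

Proof. Choose and fix any $\omega \in \mathrm{Fin}(\mathbb{R} \times E)$. We give the proof of the part $\{S_t(\omega) \subset G\} \subset \{S_t(\theta_G \omega) \subset G\}$; the opposite one can be proven in exactly the same way. Notice that by definition,
\begin{align*}
(u, x, v) \notin S_t(\omega) &\Rightarrow |X_i(s, \omega) - (x - u m^{-1/2} v + s m^{-1/2} v)| \ge R_i + \frac12, \qquad \forall s \in [0, t],\ i = 1, \dots, N \\
&\Rightarrow \begin{cases} x(s, x - u m^{-1/2} v, m^{-1/2} v; \omega) = x - u m^{-1/2} v + s m^{-1/2} v, \\ v(s, x - u m^{-1/2} v, m^{-1/2} v; \omega) = m^{-1/2} v, \end{cases} \qquad \text{for any } s \in [0, t].
\end{align*}
So
\begin{align*}
(u, x, v) \notin S_t(\omega) &\Rightarrow |X_i(s; \omega) - x(s, x - u m^{-1/2} v, m^{-1/2} v; \omega)| \ge R_i + \frac12, \qquad \text{for any } s \in [0, t], \\
&\Rightarrow \nabla U_i(X_i(s; \omega) - x(s, x - u m^{-1/2} v, m^{-1/2} v; \omega)) = 0, \qquad \text{for any } s \in [0, t], \qquad (3.17)
\end{align*}
for $i = 1, \dots, N$. Moreover, it is trivial to see that
\[ (u, x, v) \in G \Rightarrow \mu_\omega(du, dx, dv) = \mu_{\theta_G \omega}(du, dx, dv). \qquad (3.18) \]
(3.17) and (3.18) combined with the definition (3.3) imply
\[ S_t(\omega) \subset G \Rightarrow (\vec X(s, \omega), \vec V(s, \omega)) = (\vec X(s, \theta_G \omega), \vec V(s, \theta_G \omega)), \qquad \text{for any } s \in [0, t], \qquad (3.19) \]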
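(as long as $\omega \in \mathrm{Fin}(\mathbb{R} \times E)$). Therefore, $S_t(\omega) \subset G \Rightarrow S_t(\theta_G \omega) \subset G$.

We next deal with general $\omega \in \mathrm{Conf}(\mathbb{R} \times E)$. As a special case of Lemma 3.3.2, we have the following.

Corollary 3.3.3.
\[ \{S_t(\theta_k \omega) \subset G_n\} = \{S_t(\theta_n \omega) \subset G_n\}, \qquad P_m\text{-a.e. } \omega, \quad \text{for any } k > n. \]
By Lemmas 3.3.1 and 3.3.2, we have that
\[ \{S_t(\theta_n \omega) \subset G\} = \{S_t(\theta_{G_n \cap G}\, \omega) \subset G\}, \qquad P_m\text{-a.e. } \omega, \]
so by the definition of $\mathcal{F}_\cdot$, we have $\{S_t(\theta_n \omega) \subset G\} \in \mathcal{F}_{G_n \cap G} \subset \mathcal{F}_G$ for any open $G \subset \mathbb{R} \times E$, i.e. $S_t(\theta_n \cdot)$ is a $\{\mathcal{F}_G\}$-stopping time. Here, a map $T: \Omega \to \mathcal{O}(\mathbb{R} \times E)$ is, by definition, called a $\mathcal{F}$-stopping time if $T$ is $\mathcal{B}(\Omega)/\mathcal{E}_0$-measurable and $\{\omega \in \Omega;\ T(\omega) \subset G\} \in \mathcal{F}_G$ for any $\mathcal{F}$-regular open set $G$.

For any $\omega \in \Omega$, let
\[ \tau_n(\omega) = \inf\Big\{ t \ge 0;\ \max_{i=1,\dots,N} |V_i(t, \theta_n \omega)| > n \Big\} \wedge T. \]
Lemma 3.3.4. For any $n \in \mathbb{N}$, there exists a unique solution to (3.3) for $P_m$-a.e. $\omega$ satisfying $\tau_n(\omega) = T$.

Proof. We first notice that
\[ \tau_n(\omega) = T \Rightarrow S_T(\theta_n \omega) \subset G_n. \qquad (3.20) \]
Indeed, if $\tau_n(\omega) = T$, then $|V_i(t, \theta_n \omega)| \le n$ for any $t \in [0, T]$ and $i = 1, \dots, N$, hence $|X_i(t, \theta_n \omega)| \le nT + |X_{i,0}|$ for any $t \in [0, T]$ and $i = 1, \dots, N$. Assume $(u, x, v) \notin G_n$. Then either $|x| \ge R_0 + nT$ or $|u| \ge m^{1/2} C_0^{-1} (R_0 + nT) + T$. If $|x| \ge R_0 + nT$, then $|x + rv| \ge |x| \ge R_0 + nT$ for any $r \in \mathbb{R}$, so $|X_i(s, \theta_n \omega) - (x - u m^{-1/2} v + s m^{-1/2} v)| \ge R_i + \frac12$ for any $s \in [0, T]$, which implies that $(u, x, v) \notin S_T(\theta_n \omega)$. If $|u| \ge m^{1/2} C_0^{-1} (R_0 + nT) + T$, then since $|v| > C_0$ $P_m$-almost surely, for any $s \in [0, T]$, we have $|x - u m^{-1/2} v + s m^{-1/2} v| \ge C_0^{-1} (R_0 + nT) |v| \ge R_0 + nT$, so in this case we also have $|X_i(s, \theta_n \omega) - (x - u m^{-1/2} v + s m^{-1/2} v)| \ge R_i + \frac12$ for any $s \in [0, T]$, which implies that $(u, x, v) \notin S_T(\theta_n \omega)$. In conclusion, we have in either case that $(u, x, v) \notin S_T(\theta_n \omega)$. This completes the proof of (3.20).

Now, we are ready to show that the desired solution is well-defined almost surely on the set $\tau_n(\omega) = T$ for any $n \in \mathbb{N}$. Indeed, if $\tau_n(\omega) = T$, then we have by (3.20), Corollary 3.3.3 and (3.19) that
\[ (\vec X(t, \theta_k \omega), \vec V(t, \theta_k \omega)) = (\vec X(t, \theta_n \omega), \vec V(t, \theta_n \omega)), \qquad \text{for any } t \in [0, T] \text{ and } k \ge n, \]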
∞
{τn = T }
= 1.
n=1
We divide the proof of Lemma 3.3.5 into several steps. Lemma 3.3.6. There exist constants C1 , C2 > 0 such that N
1 i=1
2
2
Mi |Vi (t, θn ω)| ≤ C1 + C2
St (θn ω)
1Gn (u, x, v)(1 + |v|2 )µω (du, dx, dv),
for any θn ω ∈ Fin(R × E). Proof. For any θn ω ∈ Fin(R × E), we have by the invariance of the energy N
1 i=1
2
Mi |Vi (t, θn ω)|2
+
+
m 2
N
i=1
+
|v(t, x − um−1/2 v, m−1/2 v; θn ω)|2 µθn ω (du, dx, dv)
N
i=1
=
R×E
R×E
Ui (Xi (t, θn ω) − x(t, x − um−1/2 v, m−1/2 v; θn ω))µθn ω (du, dx, dv)
1 m Mi |Vi,0 |2 + 2 2
N
i=1
R×E
R×E
|m−1/2 v|2 µθn ω (du, dx, dv)
Ui (Xi,0 − (x − um−1/2 v))µθn ω (du, dx, dv).
(3.21)
If (u, x, v) ∈ / St (θn ω), then |Xi (s, θn ω) − (x − (u − s)m−1/2 v)| > Ri + 12 for any s ∈ [0, t] and i = 1, . . . , N , so by (3.3), v(t, x − um−1/2 v, m−1/2 v; θn ω) = m−1/2 v
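and
\[ U_i(X_i(t, \theta_n \omega) - x(t, x - u m^{-1/2} v, m^{-1/2} v; \theta_n \omega)) = 0. \]
Therefore, (3.21) implies
\begin{align*}
&\sum_{i=1}^N \frac12 M_i |V_i(t, \theta_n \omega)|^2 + \frac{m}{2} \int_{S_t(\theta_n \omega)} |v(t, x - u m^{-1/2} v, m^{-1/2} v; \theta_n \omega)|^2\, \mu_{\theta_n \omega}(du, dx, dv) \\
&\qquad + \sum_{i=1}^N \int_{S_t(\theta_n \omega)} U_i(X_i(t, \theta_n \omega) - x(t, x - u m^{-1/2} v, m^{-1/2} v; \theta_n \omega))\, \mu_{\theta_n \omega}(du, dx, dv) \\
&= \sum_{i=1}^N \frac12 M_i |V_{i,0}|^2 + \frac{m}{2} \int_{S_t(\theta_n \omega)} |m^{-1/2} v|^2\, \mu_{\theta_n \omega}(du, dx, dv) \\
&\qquad + \sum_{i=1}^N \int_{S_t(\theta_n \omega)} U_i(X_{i,0} - (x - u m^{-1/2} v))\, \mu_{\theta_n \omega}(du, dx, dv).
\end{align*}
So with $C_1 := \sum_{i=1}^N \frac12 M_i |V_{i,0}|^2$ and $C_2 := 2 \sum_{i=1}^N \|U_i\|_\infty + \frac{m}{2}$, we get
\begin{align*}
\sum_{i=1}^N \frac12 M_i |V_i(t, \theta_n \omega)|^2 &\le C_1 + C_2 \int_{S_t(\theta_n \omega)} (1 + |v|^2)\, \mu_{\theta_n \omega}(du, dx, dv) \\
&= C_1 + C_2 \int_{S_t(\theta_n \omega)} 1_{G_n}(u, x, v)(1 + |v|^2)\, \mu_\omega(du, dx, dv). \qquad (3.22)
\end{align*}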
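Let us prepare for later use the following general result with respect to stopping times and Poisson point process.

Lemma 3.3.7. (1) Let $f: \mathbb{R}\times E\to[0,\infty)$ be measurable and let $S$ be a stopping time. Then
$$E^{P_m}\Big[\int_{S(\omega)} f\,d\mu_\omega\Big] = E^{P_m}\Big[\int_{S(\omega)} f\,d\lambda_m\Big].$$
(2) Let $f: \mathbb{R}\times E\to[0,\infty)$ be measurable and $S$, $T$ be two stopping times satisfying (i) $T(\omega)\subset S(\omega)$ for any $\omega\in\Omega$, (ii) $E^{P_m}[\int_{S(\omega)}|f|\,d\lambda_m] < \infty$. Then
$$E\Big[\int_{S(\omega)} f\,(d\mu_\omega-d\nu)\,\Big|\,\mathcal F_T\Big] = \int_{T(\omega)} f\,(d\mu_\omega-d\nu).$$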
Proof. As the result is already known, we give a sketch only. (See, e.g., [8, 12] for related results.)
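We first have
$$E^{P_m}[\mu_\omega(A\setminus S)\,|\,\mathcal F_S] = \lambda_m(A\setminus S),\qquad \forall A\in\mathcal B(\mathbb{R}\times E).$$
This is heuristically based on the definition of Poisson point process and the independence of $\mathcal F_S$ and $\mathcal F_{A\setminus S}$, and can be proved rigorously, for example, first for non-random $S$, and then be extended to stopping times in a routine way. So for positive simple functions $f$ we have
$$E^{P_m}\Big[\int_{(\mathbb{R}\times E)\setminus S} f\,d\mu_\omega\,\Big|\,\mathcal F_S\Big] = \int_{(\mathbb{R}\times E)\setminus S} f\,d\lambda_m.\tag{3.23}$$
With the help of the monotone convergence theorem, this can be extended to any positive measurable function $f$ in a routine way. Therefore,
$$\begin{aligned}
E^{P_m}\Big[\int_{S(\omega)} f\,d\mu_\omega\Big]
&= E^{P_m}\Big[\int_{\mathbb{R}\times E} f\,d\mu_\omega\Big] - E^{P_m}\Big[\int_{(\mathbb{R}\times E)\setminus S(\omega)} f\,d\mu_\omega\Big]\\
&= \int_{\mathbb{R}\times E} f\,d\lambda_m - E^{P_m}\Big[\int_{(\mathbb{R}\times E)\setminus S(\omega)} f\,d\lambda_m\Big]\\
&= E^{P_m}\Big[\int_{S(\omega)} f\,d\lambda_m\Big].
\end{aligned}$$
For the second assertion, (3.23) implies that $E[\int_{\mathbb{R}\times E} f\,(d\mu_\omega-d\lambda_m)\,|\,\mathcal F_S] = \int_{S(\omega)} f\,(d\mu_\omega-d\lambda_m)$, hence
$$\begin{aligned}
E\Big[\int_{S(\omega)} f\,(d\mu_\omega-d\lambda_m)\,\Big|\,\mathcal F_T\Big]
&= E\Big[E\Big[\int_{\mathbb{R}\times E} f\,(d\mu_\omega-d\lambda_m)\,\Big|\,\mathcal F_S\Big]\,\Big|\,\mathcal F_T\Big]\\
&= E\Big[\int_{\mathbb{R}\times E} f\,(d\mu_\omega-d\lambda_m)\,\Big|\,\mathcal F_T\Big]\\
&= \int_{T(\omega)} f\,(d\mu_\omega-d\lambda_m).
\end{aligned}$$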
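As a quick numerical sanity check of Lemma 3.3.7(1) in its simplest special case (a deterministic set and a homogeneous Poisson point process on $[0,1]$, where the identity reduces to the classical Campbell formula), one may run the following minimal Python sketch. The intensity `rate`, the test function $f(x)=x^2$ and all helper names are illustrative assumptions and are not part of the model considered in this paper.

```python
import random


def poisson_points(rate, rng):
    """Sample a homogeneous Poisson point process on [0, 1].

    Points are generated through i.i.d. exponential gaps with the given rate
    (an illustrative stand-in for the measure mu_omega of the text).
    """
    points = []
    t = rng.expovariate(rate)
    while t <= 1.0:
        points.append(t)
        t += rng.expovariate(rate)
    return points


def mean_integral(f, rate, n_samples, seed=0):
    """Monte Carlo estimate of E[ integral of f d(mu_omega) ] over [0, 1]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += sum(f(x) for x in poisson_points(rate, rng))
    return total / n_samples


rate = 5.0
estimate = mean_integral(lambda x: x * x, rate, 20000)
exact = rate / 3.0  # integral of x^2 against the intensity measure rate*dx on [0, 1]
print(estimate, exact)
```

The two printed numbers should agree up to Monte Carlo error; the random stopping sets $S(\omega)$ of the lemma are not simulated here.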
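Since $S_t(\theta_n\,\cdot)$ is a $\{\mathcal F_G\}$-stopping time, $\mathcal F_{S_{t+\varepsilon}(\theta_n\cdot)}$ is well-defined for any $\varepsilon>0$ small enough. Let $\mathcal F_t^{(n)} = \bigcap_{\varepsilon>0}\mathcal F_{S_{t+\varepsilon}(\theta_n\cdot)}$, $0\le t<T$. Then $\tau_n$ is a stopping time with respect to the filtration $\{\mathcal F_t^{(n)}\}_{t\in[0,T)}$. Let
$$M_t^{(n)} = \int_{S_t(\theta_n\omega)} 1_{G_n}(u,x,v)(1+|v|^2)\big(\mu_\omega(du,dx,dv)-\lambda(du,dx,dv)\big).$$

Lemma 3.3.8. $\{M_t^{(n)}\}_{t\in[0,T]}$ is a $\{\mathcal F_t^{(n)}\}_{t\in[0,T]}$-martingale with mean 0.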
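Proof. Notice that $S_t(\theta_n\omega)$ is monotone non-decreasing with respect to $t$, and by assumption, with $c = \sum_{i=1}^N\|U_i\|_\infty$, we have
$$\int_{\mathbb{R}\times E} 1_{G_n}(u,x,v)(1+|v|^2)\,\lambda(du,dx,dv) \le 2\big(T+m^{1/2}C_0^{-1}(R_0+nT)\big)(R_0+nT)^{d-1}m^{-1}\int_{\mathbb{R}^d}(1+|v|^2)\rho_c\Big(\frac12|v|^2\Big)|v|\,dv,$$
which is finite (but may depend on $n$) by assumption. This combined with Lemma 3.3.7 gives us our assertion.

Proof of Lemma 3.3.5. We have by Lemma 3.3.8 that $E[M^{(n)}_{\tau_n}] = 0$. So by Lemma 3.3.6,
$$\begin{aligned}
\sum_{i=1}^N\frac12 M_iE[|V_i(\tau_n,\theta_n\omega)|^2]
&\le C_1 + C_2E\Big[\int_{S_{\tau_n}(\theta_n\omega)}1_{G_n}(u,x,v)(1+|v|^2)\,\lambda(du,dx,dv)\Big]\\
&\le C_1 + C_2E\Big[\int_{S_{\tau_n}(\theta_n\omega)}(1+|v|^2)\,\lambda(du,dx,dv)\Big].
\end{aligned}$$
So with $C_3 := (\min_i\frac{M_i}{2})^{-1}$, we have
$$\begin{aligned}
P[\tau_n<T] &= P\Big[\max_{i=1,\dots,N}|V_i(\tau_n,\theta_n\omega)|\ge n\Big]
\le \frac{C_3}{n^2}E\Big[\sum_{i=1}^N\frac12 M_i|V_i(\tau_n,\theta_n\omega)|^2\Big]\\
&\le \frac1{n^2}C_1C_3 + \frac1{n^2}C_2C_3E\Big[\int_{S_{\tau_n}(\theta_n\omega)}(1+|v|^2)\,\lambda(du,dx,dv)\Big].
\end{aligned}\tag{3.24}$$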
Let us estimate the expectation on the right-hand side of (3.24). Let $S_d(r)$ denote the volume of the ball in $\mathbb{R}^d$ with radius $r$, and let $C_1' = \sum_{i=1}^N m^{1/2}S_d(R_i+\frac12)$, $C_2' = \sum_{i=1}^N m^{1/2}TS_{d-1}(R_i+\frac12)$. Then
$$\begin{aligned}
&|\{(u,x)\in\mathbb{R}\times E_v;\ (u,x,v)\in S_t(\theta_n\omega)\}|\\
&= \Big|\Big\{(u,x)\in\mathbb{R}\times E_v;\ \exists i=1,\dots,N,\ \text{s.t.}\ \min_{0\le s\le t}\Big|(x-um^{-1/2}v)+\int_0^s(m^{-1/2}v-V_i(r,\theta_n\omega))\,dr\Big|\le R_i+\frac12\Big\}\Big|\\
&\le \sum_{i=1}^N m^{1/2}|v|^{-1}\Big|\Big\{y\in\mathbb{R}^d;\ \min_{0\le s\le t}\Big|y+\int_0^s(m^{-1/2}v-V_i(r,\theta_n\omega))\,dr\Big|\le R_i+\frac12\Big\}\Big|\\
&\le \sum_{i=1}^N m^{1/2}|v|^{-1}\Big(T\Big(m^{-1/2}|v|+\max_{0\le s\le t}|V_i(s,\theta_n\omega)|\Big)S_{d-1}\Big(R_i+\frac12\Big)+S_d\Big(R_i+\frac12\Big)\Big)\\
&= |v|^{-1}\Big(C_1'+C_2'\Big(m^{-1/2}|v|+\max_{0\le s\le t}|V_i(s,\theta_n\omega)|\Big)\Big).
\end{aligned}$$
Also, notice that $|V_i(s,\theta_n\omega)|\le n$ for any $s\in[0,\tau_n]$. Therefore, with $C_1'' = m^{-1}\int_{\mathbb{R}^d}(1+|v|^2)(C_1'+C_2'm^{-1/2}|v|)\rho_c(\frac12|v|^2)\,dv$ and $C_2'' = m^{-1}C_2'\int_{\mathbb{R}^d}(1+|v|^2)\rho_c(\frac12|v|^2)\,dv$, which are finite by assumption, we have
$$\begin{aligned}
\int_{S_{\tau_n}(\theta_n\omega)}(1+|v|^2)\,\lambda(du,dx,dv)
&\le \int_{\mathbb{R}^d}(1+|v|^2)m^{-1}\rho_c\Big(\frac12|v|^2\Big)|v|\,dv\,|\{(u,x)\in\mathbb{R}\times E_v;\ (u,x,v)\in S_t(\theta_n\omega)\}|\\
&\le \int_{\mathbb{R}^d}(1+|v|^2)m^{-1}\rho_c\Big(\frac12|v|^2\Big)\big(C_1'+C_2'(m^{-1/2}|v|+n)\big)\,dv\\
&= C_1''+C_2''n.
\end{aligned}$$
This combined with (3.24) implies
$$P(\tau_n<T)\le\frac1{n^2}C_1C_3+\frac1{n^2}C_2C_3(C_1''+C_2''n)\to0,\qquad\text{as } n\to\infty,$$
which completes the proof.

As mentioned in Sec. 2, we can also get the unique existence of the solution of (2.1) under the following condition (and without any further assumption such as (A1) or (A2)): $d\ge2$ and $\int_{-\infty}^{\infty}(1+|s|)^d\rho(s)\,ds<\infty$. (See Proposition 3.3.9.) This result is not necessary for the rest of this paper, but we include it here since the condition is very simple: the intensity function $\rho$ decreases rapidly enough at infinity.

Proposition 3.3.9. Assume that $d\ge2$ and
$$\int_{-\infty}^{\infty}(1+|s|)^d\rho(s)\,ds<\infty,\tag{3.25}$$
then there exists a unique solution to (2.1) for $\tilde P_m$-almost every $\tilde\omega$.

Notice that neither does Theorem 2.0.1(1) include Proposition 3.3.9 nor vice versa.
Proof. The proof is almost the same as the one we just used for Theorem 2.0.1(1), although we do not use the ray representation this time. The point is that in the proof of Theorem 2.0.1(1), the assumption (A2) was only used to estimate several integrals with respect to $\rho(\frac12|v|^2+\sum_{i=1}^NU_i(x-m^{-1/2}tv-X_{i,0}))$ (e.g., (3.16)), while if we do not use the ray representation, then the corresponding term $\rho(\frac12|v|^2+\sum_{i=1}^NU_i(x-X_{i,0}))$ does not depend on $v$, so by the variable change $r=|v|$ and a suitable shift we can get similar estimates without the help of (A2). We give a brief sketch of the proof in the following. Unless otherwise specified, the notations have the same meanings as in the proof of Theorem 2.0.1(1).

First notice that for any $\alpha\ge0$, we have $\alpha+\frac d2-1\ge0$ since $d\ge2$, so for any $c\in\mathbb{R}$, we have by assumption and a simple calculation that
$$\int_{\mathbb{R}^d}|v|^{2\alpha}\rho\Big(\frac m2|v|^2+c\Big)\,dv\le C_{d,m}\int_{-\infty}^{\infty}(|c|+|s|)^{\alpha+\frac d2-1}\rho(s)\,ds\tag{3.26}$$
for some constants $C_{d,m}>0$ independent of $c$. So
$$\int_{\mathbb{R}^d}|v|^{2\alpha}\rho\Big(\frac m2|v|^2+c\Big)\,dv<\infty,\qquad\text{if } 0\le\alpha\le\frac d2+1.\tag{3.27}$$
Let $G_n=\{(x,v)\in\mathbb{R}^{2d};\ |x|<R_0+nT+|v|T\}$, and let $\theta_n=\theta_{G_n}$. Then since $|\{x;(x,v)\in G_n\}| = 2^d(R_0+nT+T|v|)^d\le4^d(R_0+nT)^d+4^dT^d|v|^d$, with $C=\sum_{i=1}^N\|U_i\|_\infty$, we have the following by (3.26) and our assumption:
$$\begin{aligned}
m^{-\frac{d-1}{2}}\tilde\lambda_m(G_n)
&= \int_{G_n}\rho\Big(\frac m2|v|^2+\sum_{i=1}^NU_i(x-X_{i,0})\Big)\,dx\,dv\\
&\le \int_{|x|\le R_0}\rho\Big(\frac m2|v|^2+\sum_{i=1}^NU_i(x-X_{i,0})\Big)\,dx\,dv+\int_{G_n\cap\{|x|>R_0\}}\rho\Big(\frac m2|v|^2\Big)\,dx\,dv\\
&\le (2R_0)^dC_{d,m}\int_{-\infty}^{\infty}(C+|s|)^{\frac d2-1}\rho(s)\,ds\\
&\quad+4^d(R_0+nT)^d\int_{\mathbb{R}^d}\rho\Big(\frac m2|v|^2\Big)\,dv+4^dT^d\int_{\mathbb{R}^d}|v|^d\rho\Big(\frac m2|v|^2\Big)\,dv\\
&\le (2R_0)^dC_{d,m}\int_{-\infty}^{\infty}(C+|s|)^{\frac d2-1}\rho(s)\,ds\\
&\quad+4^d(R_0+nT)^dC_{d,m}\int_{-\infty}^{\infty}|s|^{\frac d2-1}\rho(s)\,ds+4^dT^dC_{d,m}\int_{-\infty}^{\infty}|s|^{d-1}\rho(s)\,ds\\
&<\infty.
\end{aligned}$$
So the conclusion of Lemma 3.3.1 still holds in our case. The proof of Theorem 2.0.1(1) until Lemma 3.3.8 is valid in the present case, just with the trivial modifications such as $\mathbb{R}\times E$ replaced by $\mathbb{R}^{2d}$, and with the definition of $S_t$ modified as $S_t:\mathrm{Fin}(\mathbb{R}^{2d})\to\mathcal O(\mathbb{R}^{2d})$, given by
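$$S_t(\tilde\omega)=\Big\{(x,v)\in\mathbb{R}^{2d};\ \min_{i=1,\dots,N}\Big(\min_{0\le s\le t}|X_i(s,\tilde\omega)-(x+sv)|-\Big(R_i+\frac12\Big)\Big)\le0\Big\}$$
for any $\tilde\omega\in\mathrm{Fin}(\mathbb{R}^{2d})$. The fact that $\int_{\mathbb{R}^{2d}}1_{G_n}(x,v)(1+|v|^2)\,\tilde\lambda_m(dx,dv)<\infty$ in the proof of Lemma 3.3.8 is now proven as follows: since $|\{x;(x,v)\in G_n\}|=2^d(R_0+nT+T|v|)^d$ and there exists a constant $C_2>0$ (depending on $R_0,n,T,d$) such that $2^d(R_0+nT+T|v|)^d(1+|v|^2)\le C_2(1+|v|^{d+2})$, we get by (3.26) and our assumption that
$$\begin{aligned}
&m^{\frac{1-d}{2}}\int_{\mathbb{R}^{2d}}1_{G_n}(x,v)(1+|v|^2)\,\tilde\lambda_m(dx,dv)\\
&\le\int_{|x|\le R_0}(1+|v|^2)\rho\Big(\frac m2|v|^2+\sum_{i=1}^NU_i(x-X_{i,0})\Big)\,dx\,dv+\int_{G_n\cap\{|x|>R_0\}}(1+|v|^2)\rho\Big(\frac m2|v|^2\Big)\,dx\,dv\\
&\le\int_{|x|\le R_0}dx\int_{\mathbb{R}^d}(1+|v|^2)\rho\Big(\frac m2|v|^2+\sum_{i=1}^NU_i(x-X_{i,0})\Big)\,dv+\int_{\mathbb{R}^d}C_2(1+|v|^{d+2})\rho\Big(\frac m2|v|^2\Big)\,dv\\
&\le(2R_0)^dC_{d,m}\int_{-\infty}^{\infty}\big[(C+|s|)^{\frac d2-1}+(C+|s|)^{\frac d2}\big]\rho(s)\,ds+C_2C_{d,m}\int_{-\infty}^{\infty}\big[|s|^{\frac d2-1}+|s|^d\big]\rho(s)\,ds\\
&<\infty,
\end{aligned}$$
where $C=\sum_{i=1}^N\|U_i\|_\infty$.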
The part under the title "Proof of Lemma 3.3.5" is most changed, and we give it as follows:
$$\sum_{i=1}^N\frac12M_iE[|V_i(\tau_n,\theta_n\tilde\omega)|^2]\le C_1+C_2E\Big[\int_{S_{\tau_n}(\theta_n\tilde\omega)}1_{G_n}(x,v)(1+|v|^2)\,\tilde\lambda_m(dx,dv)\Big].\tag{3.28}$$
Therefore, with $C_3:=(\min_i\frac{M_i}{2})^{-1}$, we have
$$\begin{aligned}
P[\tau_n<T]&=P\Big[\max_{i=1,\dots,N}|V_i(\tau_n,\theta_n\tilde\omega)|\ge n\Big]\le\frac{C_3}{n^2}E\Big[\sum_{i=1}^N\frac12M_i|V_i(\tau_n,\theta_n\tilde\omega)|^2\Big]\\
&\le\frac1{n^2}C_3C_1+\frac1{n^2}C_3C_2E\Big[\int_{S_{\tau_n}(\theta_n\tilde\omega)}1_{G_n}(x,v)(1+|v|^2)\,\tilde\lambda_m(dx,dv)\Big].
\end{aligned}\tag{3.29}$$
Notice that by definition, $\tilde\lambda_m(dx,dv)=\rho(\frac m2|v|^2)\,dx\,dv$ if $|x|>R_0$. Also, there exist constants $C_0',C_1'>0$ (depending on $T,N,d$ and $R_i$) such that
$$\begin{aligned}
&|\{x\in\mathbb{R}^d;\ (x,v)\in S_t(\theta_n\tilde\omega)\}|\\
&=\Big|\Big\{x\in\mathbb{R}^d;\ \exists i\in\{1,\dots,N\},\ \text{s.t.}\ \min_{0\le s\le t}|x+sv-X_i(s,\theta_n\tilde\omega)|\le R_i+\frac12\Big\}\Big|\\
&=\Big|\Big\{x\in\mathbb{R}^d;\ \exists i\in\{1,\dots,N\},\ \text{s.t.}\ \min_{0\le s\le t}\Big|x+\int_0^s(v-V_i(r,\theta_n\tilde\omega))\,dr\Big|\le R_i+\frac12\Big\}\Big|\\
&\le C_0'+C_1'\sum_{i=1}^N\Big(|v|+\max_{0\le s\le t}|V_i(s,\theta_n\tilde\omega)|\Big).
\end{aligned}$$
Moreover, $|V_i(t,\theta_n\tilde\omega)|\le n$ if $t\in[0,\tau_n]$. Therefore, by assumption and (3.27), there exist constants $C_0'',C_1''>0$ such that
$$\begin{aligned}
&\int_{S_{\tau_n}(\theta_n\tilde\omega)}1_{G_n}(x,v)(1+|v|^2)\,\tilde\lambda_m(dx,dv)\\
&\le\int_{|x|\le R_0}(1+|v|^2)\,\tilde\lambda_m(dx,dv)+m^{\frac{d-1}{2}}\int_{\mathbb{R}^d}\rho\Big(\frac m2|v|^2\Big)\,dv\,(1+|v|^2)\,|\{x\in\mathbb{R}^d;\ (x,v)\in S_{\tau_n}(\theta_n\tilde\omega)\}|\\
&\le m^{\frac{d-1}{2}}\int_{|x|\le R_0}dx\int_{\mathbb{R}^d}(1+|v|^2)\rho\Big(\frac m2|v|^2+\sum_{i=1}^NU_i(x-X_{i,0})\Big)\,dv\\
&\quad+m^{\frac{d-1}{2}}\int_{\mathbb{R}^d}\big(C_0'+C_1'(|v|+Nn)\big)(1+|v|^2)\rho\Big(\frac m2|v|^2\Big)\,dv\\
&\le(2R_0)^dm^{\frac{d-1}{2}}C_{d,m}\int_{-\infty}^{\infty}\big[(C+|s|)^{\frac d2-1}+(C+|s|)^{\frac d2}\big]\rho(s)\,ds\\
&\quad+m^{\frac{d-1}{2}}(C_0'+C_1'nN)C_{d,m}\int_{-\infty}^{\infty}\big[|s|^{\frac d2-1}+|s|^{\frac d2}\big]\rho(s)\,ds\\
&\quad+m^{\frac{d-1}{2}}C_1'C_{d,m}\int_{-\infty}^{\infty}\big[|s|^{\frac{d-1}{2}}+|s|^{\frac{d+1}{2}}\big]\rho(s)\,ds\\
&\le C_0''+C_1''n.
\end{aligned}$$
This combined with (3.29) implies $P(\tau_n<T)\to0$, as $n\to\infty$.

3.4. Some basic facts about Skorohod spaces

In this subsection, we recall some basic facts about the Skorohod spaces $(D([0,T];\mathbb{R}^d),d^0)$ and $(D([0,\infty);\mathbb{R}^d),\mathrm{dis})$, and the tightness of the probability measures on them. As mentioned in Remark 3 of Sec. 2, these spaces will be needed in order to carry out our proof. (See [1] for more details.)

For any $T>0$, let $D([0,T];\mathbb{R}^d)$ be the Skorohod space:
$$D([0,T];\mathbb{R}^d)=\Big\{w:[0,T]\to\mathbb{R}^d;\ w(t)=w(t+):=\lim_{s\downarrow t}w(s),\ t\in[0,T),\ \text{and}\ w(t-):=\lim_{s\uparrow t}w(s)\ \text{exists},\ t\in(0,T]\Big\},$$
with the metric $d^0=d^0_T$ given by
$$d^0(w,\tilde w)=\inf_{\lambda\in\Lambda}\{\|\lambda\|^0\vee\|w-\tilde w\circ\lambda\|_\infty\}$$
for any $w,\tilde w\in D([0,T];\mathbb{R}^d)$, where $\Lambda=\{\lambda:[0,T]\to[0,T];\ \text{continuous, non-decreasing},\ \lambda(0)=0,\ \lambda(T)=T\}$, $\|w\|_\infty=\sup_{0\le t\le T}|w(t)|$, and
$$\|\lambda\|^0=\sup_{0\le s<t\le T}\Big|\log\frac{\lambda(t)-\lambda(s)}{t-s}\Big|$$
for any $\lambda\in\Lambda$. It is well known that $(D([0,T];\mathbb{R}^d),d^0)$ is a complete metric space. Also, $C([0,T];\mathbb{R}^d)=\{w:[0,T]\to\mathbb{R}^d;\ \text{continuous}\}$ is closed in $(D([0,T];\mathbb{R}^d),d^0)$, and the Skorohod topology relativized to $C([0,T];\mathbb{R}^d)$ coincides with the uniform topology there. (See, e.g., [1].)

We have the following result about the tightness in $\wp(D([0,T];\mathbb{R}^d))$, the space of all probabilities on $D([0,T];\mathbb{R}^d)$: let $(\Omega_n,\mathcal F_n,P_n)$, $n=1,2,\dots$, be probability spaces, and let $X_n:\Omega_n\to D([0,T];\mathbb{R}^d)$, $n\in\mathbb{N}$, be measurable. Let $\mu_{X_n}=P_n\circ X_n^{-1}$.
Then we have the following:

Theorem 3.4.1. Suppose that there exist constants $\varepsilon,\beta,\gamma,C>0$ such that
(1) $E^{P_n}[\|X_n(\cdot)\|_\infty^\varepsilon]\le C$,
(2) $E^{P_n}[|X_n(r)-X_n(s)|^\beta|X_n(s)-X_n(t)|^\beta]\le C|t-r|^{1+\varepsilon}$ for any $0\le r\le s\le t\le1$,
(3) $E^{P_n}[|X_n(s)-X_n(t)|^\varepsilon]\le C|t-s|^\gamma$ for any $0\le s\le t\le1$,
for any $n\in\mathbb{N}$. Then $\{\mu_{X_n}\}_{n=1}^\infty$ is tight in $\wp(D([0,T];\mathbb{R}^d))$.

Proof. This is a corollary of results of [1]. Indeed, by [1, Theorem 13.2] and the paragraph between pp. 140–141 there, we have that $\{\mu_{X_n}\}_{n=1}^\infty$ is tight if the following four conditions are satisfied (see [1] for the notations).
(1) $\lim_{a\to\infty}\limsup_{n\to\infty}P_n(\|X_n\|_\infty\ge a)=0$,
(2) $\lim_{\delta\to0}\sup_{n\in\mathbb{N}}P_n(|w''_{X_n}(\delta)|\ge a)=0$ for any $a>0$,
(3) $\lim_{\delta\to0}\sup_{n\in\mathbb{N}}P_n(|X_n(\delta)-X_n(0)|\ge a)=0$ for any $a>0$,
(4) $\lim_{\delta\to0}\sup_{n\in\mathbb{N}}P_n(|X_n(1-)-X_n(1-\delta)|\ge a)=0$ for any $a>0$.

The fact that our conditions (1) and (3) imply (1) and (3) here, respectively, is trivial by Chebyshev's inequality. The condition (4) here is also gotten in the same way, with the help of our (1) and the dominated convergence theorem. So the only thing left is to confirm that the (2) here is also satisfied. We do it in the following. We use [1, Theorem 10.4] (the quantities $\gamma$, $\mu((s,t])$, $\beta$ and $P$ there are $X_n$, $C^{\frac1{1+\varepsilon}}(t-s)$, $\beta/2$ and $P_n$ in our case, respectively, and the quantity $L(\gamma,\delta)$ there is now replaced by $w''_{X_n}(\delta)$). Our condition (2) implies that
$$\begin{aligned}
P_n(|X_n(s)-X_n(r)|\wedge|X_n(t)-X_n(s)|\ge\lambda)
&\le\frac1{\lambda^{2\beta}}E^{P_n}[|X_n(r)-X_n(s)|^\beta|X_n(s)-X_n(t)|^\beta]\\
&\le\frac1{\lambda^{2\beta}}C|t-r|^{1+\varepsilon}=\frac1{\lambda^{2\beta}}\mu((r,t])^{1+\varepsilon},
\end{aligned}$$
i.e. [1, (10.20)] is satisfied. So by [1, Theorem 10.4], [1, (10.21)] holds, i.e.
$$P_n(|w''_{X_n}(\delta)|\ge a)\le\frac{2K}{a^{2\beta}}\big(C^{\frac1{1+\varepsilon}}T\big)\big(C^{\frac1{1+\varepsilon}}2\delta\big)^\varepsilon.$$
The right-hand side above certainly converges to 0 as $\delta\to0$ for any $a>0$.

Finally, let $D([0,\infty);\mathbb{R}^d)$ be the set of functions on $[0,\infty)$ that are right continuous and have left limits at every point, and let
$$\mathrm{dis}(w_1,w_2)=\sum_{n=1}^\infty2^{-n}\big(1\wedge d^0_n(g_nw_1,g_nw_2)\big),$$
where $d^0_n$ is the Skorohod metric on $D([0,n];\mathbb{R}^d)$ as just defined, and $g_n$ is the function given by $g_n(t)=1_{\{t\in[0,n-1]\}}+(n-t)1_{\{t\in(n-1,n]\}}$. Then the convergence to a continuous process in $(D([0,\infty);\mathbb{R}^d),\mathrm{dis})$ is equivalent to the convergence to it in $(C([0,T];\mathbb{R}^d),\|\cdot\|_{[0,T]})$ for all $T>0$. By [1, Theorem 16.7], we have that in order to prove the weak convergence of the distribution of a process with $t\in[0,\infty)$ in $(D([0,\infty);\mathbb{R}^d),\mathrm{dis})$, it is sufficient to show it for $t\in[0,T]$, for all $T>0$.

3.5. Basic lemmas

In this subsection, we state several key lemmas which are used for the proof of our results. The proofs of these lemmas will be given in Secs. 4 and 5. Let
$$\mathcal F_t=\mathcal F_t^{(T,n)}=\mathcal F_{(-\infty,t+2m^{1/2}\tau)\times E}\vee\aleph=\sigma\{\omega\cap(-\infty,t+2m^{1/2}\tau)\times E\}\vee\aleph.$$
Proposition 3.6.5 below ensures that $(X_i(t\wedge\sigma),V_i(t\wedge\sigma))$, $i=1,\dots,N$, are $\mathcal F_t$-measurable. Also, we define a new potential in the following way. Let
$$\bar\rho(t)=-\int_t^\infty\rho(s)\,ds,\quad t\in\mathbb{R},\qquad p(s)=\int_{\mathbb{R}^d}\bar\rho\Big(\frac12|v|^2+s\Big)\,dv,$$
and let
$$\tilde U(\vec X)=\int_{\mathbb{R}^d}\Big(p\Big(\sum_{i=1}^NU_i(X_i-x)\Big)-p(0)\Big)\,dx.$$
Some more discussion concerning $\tilde U$ will be given after Lemma 3.5.1.

Our key decomposition is given in Lemma 3.5.1. Its result suffices for the proof of the tightness, but in order to find the limit, concrete expressions for $M_i(t)$ and $P_i^{*1}(t)$ are necessary, and will be given later (see (4.22)). In order to keep the line of our proof sharp we shall first avoid presenting such concrete expressions.

Lemma 3.5.1. For any $i=1,\dots,N$, there exist an $\mathbb{R}^d$-valued $(\mathcal F_t)_t$-martingale $M_i(t)$, an $\mathbb{R}^d$-valued $(\mathcal F_t)_t$-adapted process $\eta_i(t)$ and an $\mathbb{R}^d$-valued $(\mathcal F_t)_t$-adapted $C^1$-class (in $t$) process $P_i^{*1}(t)$, such that
(1)
$$M_i(V_i(t\wedge\sigma)-V_i(0))=M_i(t)+\eta_i(t)+P_i^{*1}(t)-m^{-1/2}\int_0^{t\wedge\sigma}\nabla_i\tilde U(\vec X(s))\,ds,\tag{3.30}$$
(2)
$$\sup_{m\in(0,1]}E^{P_m}\Big[\sup_{t\in[0,T]}\Big|\frac d{dt}P_i^{*1}(t)\Big|^2\Big]<\infty,$$
(3) there exists a constant $C$ independent of $m$ such that for any $i=1,\dots,N$, $0\le s\le t\le T$ and $m\in(0,1]$, we have $E^{P_m}[|M_i(t)-M_i(s)|^2\,|\,\mathcal F_s]\le C|t-s|$, and the jumps of $M_i(\cdot)$ satisfy $|\Delta M_i(t)|\le Cm^{1/2}$,
(4)
$$E^{P_m}\Big[\sup_{t\in[0,T]}|\eta_i(t)|^2\Big]\to0,\qquad\text{as } m\to0$$
for any $i=1,\dots,N$. In particular, the distributions of $\{M_i(t)+\eta_i(t);\ t\in[0,T]\}$ and $\{P_i^{*1}(t);\ t\in[0,T]\}$ under $P_m$ are tight in $\wp(D([0,T];\mathbb{R}^d))$ as $m\to0$, and any of their cluster points have continuous canonical processes.

Let us explain a little bit before going further. As claimed in Sec. 1, in our model, the molecules feel each other through the mediation of the gas atoms, and the molecules do not interact with each other directly. In Lemma 3.5.1, we reexpress the interactions in such a way that the light atoms do not appear explicitly this time. In this new expression, the function $\tilde U(\vec X)$ appears as a new potential. As will be shown later (Lemma 4.3.3), it is approximately the expected total force given by the "freezing approximations" $\psi(t,x,v,\vec X)$.

By the definition of $\tilde U$, it is easy to see that if $|X_i-X_j|>R_i+R_j$ for any $i\ne j$, then $\tilde U(\vec X)=\sum_{i=1}^N\int_{\mathbb{R}^d}(p(U_i(x))-p(0))\,dx$, therefore,
$$\nabla\tilde U(\vec X)=0,\qquad\text{if } |X_i-X_j|>R_i+R_j\ \text{for any } i\ne j.\tag{3.31}$$
So in this case, the value of $\tilde U$ at $\vec X$ is a constant. Write this constant as $\tilde U_0$. So the gradient of our "new potential" $\tilde U(\vec X(t))$ stays 0 until some pair of molecules comes so close that their (original) potentials overlap. This is heuristic because when the molecules are far enough from each other, as a result of our cut-off, they feel the influence of different atoms, so by the symmetry of the potentials and the initial distribution $\lambda_m$, we get our assertion. Also notice that as soon as this term becomes non-zero, since $m^{-1/2}\to\infty$, it gives us an "infinitely strong force". This is why we needed to stop the process in Theorem 2.0.1(2) (see also the paragraphs following it).

Also, we will use the following lemmas to prove Theorem 2.0.1(4):

Lemma 3.5.2. Let $D$ be any open subset of $\mathbb{R}^{dN}$, and assume that for any $i=1,\dots,N$, there exists a $C_b^1$-class function $g_i:\bar D\to\mathbb{R}^d$ satisfying
$$g_i(\vec X)\cdot\nabla_i\tilde U(\vec X)=|\nabla_i\tilde U(\vec X)|,\qquad\text{for any } \vec X\in\bar D,\ i=1,\dots,N.$$
Let
$$\tilde\sigma_D=\inf\{t\ge0;\ \vec X(t)\in D^C\}.$$
Then
(1)
$$\sup_{m\in(0,1]}E^{P_m}\Big[\int_0^{T\wedge\sigma\wedge\tilde\sigma_D}m^{-1/2}|\nabla_i\tilde U(\vec X(t))|\,dt\Big]<\infty$$
for any $i=1,\dots,N$,
(2) the distributions of $m^{-1/2}(\tilde U(\vec X(t\wedge\sigma\wedge\tilde\sigma_D))-\tilde U_0)+\sum_{i=1}^N\frac{M_i}2|V_i(t\wedge\sigma\wedge\tilde\sigma_D)|^2$ under $P_m$ are tight in $\wp(C([0,T];\mathbb{R}))$ as $m\to0$.
Let $L$ be the operator defined in Sec. 2. By looking into the concrete expressions of the decomposition (3.30), we can get the following lemma. In particular, this implies Theorems 2.0.1(2) and 2.0.1(3). The proof will be given in Sec. 5.

Lemma 3.5.3. Let $D_0=(\mathrm{supp}(\tilde U-\tilde U_0))^C\subset\mathbb{R}^{dN}$, and assume that $f\in C_0^\infty(D_0\times\mathbb{R}^{dN})$. Then we have that the distribution of $f(\vec X(t\wedge\sigma),\vec V(t\wedge\sigma))$ under $P_m$ is tight in $\wp(C([0,T];\mathbb{R}))$ as $m\to0$, and its limit is the solution of the $L$-martingale problem stopped at $\sigma$.

3.6. Some basic calculation

In this subsection, we prepare some estimates, especially some properties of $x(t,x,v,\omega)$ (see Propositions 3.6.3–3.6.5), for later use. First notice that it is trivial by definition that
$$|X_i(t,\omega)|\le|X_{i,0}|+nT,\qquad\text{for any } t\in[0,\sigma(\omega)\wedge T].\tag{3.32}$$
Proposition 3.6.1. Suppose that $(x,v)\in E$, $|v|>(2C_0+1)m^{-1/2}$ and $n\le m^{-1/2}$. Then
$$(|v|^{-1}v)\cdot v(t,x,v;\omega)\ge m^{-1/2}(C_0+1),\qquad\text{for any } t\in[0,\sigma(\omega)].$$

Proof. Let $\eta=|v|^{-1}v$ and let $\xi=\inf\{t>0;\ v(t,x,v,\omega)\cdot\eta<m^{-1/2}(C_0+1)\}$. We only need to show that $\xi\ge\sigma(\omega)$. Suppose that the contrary holds. Notice that by definition,
$$(v(\xi,x,v,\omega)-v)\cdot\eta=-m^{-1}\sum_{i=1}^N\int_0^\xi\big(\nabla U_i(x(t,x,v,\omega)-X_i(t,\omega))\cdot\eta\big)\,dt.$$
Also, for any $t\in[0,\xi\wedge\sigma(\omega)]$, we have by assumption
$$\begin{aligned}
\frac d{dt}\big(x(t,x,v,\omega)-X_i(t,\omega)\big)\cdot\eta&=v(t,x,v,\omega)\cdot\eta-V_i(t,\omega)\cdot\eta\\
&\ge m^{-1/2}(C_0+1)-n\ge m^{-1/2}(C_0+1)-m^{-1/2}=m^{-1/2}C_0,
\end{aligned}$$
in particular, $(x(t,x,v,\omega)-X_i(t,\omega))\cdot\eta$ is monotone increasing with respect to $t$. So since $v\cdot\eta=|v|>(2C_0+1)m^{-1/2}$ by assumption, we have
$$\begin{aligned}
m^{-1/2}C_0&<-(v(\xi,x,v,\omega)-v)\cdot\eta\\
&=m^{-1}\sum_{i=1}^N\int_0^\xi\big(\nabla U_i(x(t,x,v,\omega)-X_i(t,\omega))\cdot\eta\big)\,dt\\
&\le m^{-1}\sum_{i=1}^N\int_0^\xi|\nabla U_i(x(t,x,v,\omega)-X_i(t,\omega))\cdot\eta|\,(m^{-1/2}C_0)^{-1}\,d[(x(t,x,v,\omega)-X_i(t,\omega))\cdot\eta]\\
&\le m^{-1}\sum_{i=1}^N(m^{-1/2}C_0)^{-1}\|\nabla U_i\|_\infty\int_{|(x(t,x,v,\omega)-X_i(t,\omega))\cdot\eta|\le R_i}d[(x(t,x,v,\omega)-X_i(t,\omega))\cdot\eta]\\
&\le m^{-1}\sum_{i=1}^N(m^{-1/2}C_0)^{-1}\|\nabla U_i\|_\infty2R_i\\
&=m^{-1/2}C_0,
\end{aligned}$$
which yields a contradiction. Therefore, $\xi\ge\sigma(\omega)$.

Since we are considering the limit behavior as $m\to0$, without loss of generality, we assume $n<m^{-1/2}$ from now on. Also, for the sake of simplicity, from now on, we omit the notation $\omega$ when there is no risk of confusion.

Note that in our setting, since
$$\frac d{dt}x(t,\Psi(s,x,m^{-1/2}v))=v(t,\Psi(s,x,m^{-1/2}v)),\qquad m\frac d{dt}v(t,\Psi(s,x,m^{-1/2}v))=-\sum_{i=1}^N\nabla U_i\big(x(t,\Psi(s,x,m^{-1/2}v))-X_i(t)\big),$$
we have
$$\frac{d^2}{dt^2}x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))=-\sum_{i=1}^N\nabla U_i\big(x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))-X_i(m^{1/2}t+s,\omega)\big).$$
Also, for any $s>0$ and $t\in[0,T\wedge\sigma(\omega)]$, we have by definition and (3.32) that
$$\big(x(t,\Psi(s,x,m^{-1/2}v)),v(t,\Psi(s,x,m^{-1/2}v))\big)=\Psi(s-t,x,m^{-1/2}v)\tag{3.33}$$
if $t<s-(m^{-1/2}C_0)^{-1}R_0$, $(x,v)\in E$ and $|v|\ge2C_0+1$. Indeed, since $0\le t<s-(m^{-1/2}C_0)^{-1}R_0$, for any $u\in[0,t]$, we have that $|s-u|>(m^{-1/2}C_0)^{-1}R_0$. This combined with $(x,v)\in E$ and $|v|\ge2C_0+1$ gives us that $|x-(s-u)m^{-1/2}v|\ge|(s-u)m^{-1/2}v|>R_0\ge R_i+|X_{i,0}|+nT$, which in turn combined with $|X_i(u,\omega)|\le|X_{i,0}|+nT$ implies that $|x-(s-u)m^{-1/2}v-X_i(u,\omega)|\ge R_i$ for any $u\in[0,t]$ and $i=1,\dots,N$. Therefore, until $t$, the velocity of this atom remains unchanged, hence its position at time $t$ is equal to $x-(s-t)m^{-1/2}v$. Therefore,
$$\begin{aligned}
\Big(x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v)),\frac d{dt}x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))\Big)
&=\big(\Psi_0(-m^{1/2}t,x,m^{-1/2}v),m^{1/2}\Psi_1(-m^{1/2}t,x,m^{-1/2}v)\big)\\
&=(x+tv,v)=\Psi(-t,x,v)
\end{aligned}\tag{3.34}$$
if $t<-C_0^{-1}R_0$, $(x,v)\in E$, $|v|\ge2C_0+1$, and $0\le m^{1/2}t+s\le T\wedge\sigma(\omega)$.

We recall the following well-known Gronwall's Lemma, for later use.

Lemma 3.6.2 (Gronwall's Lemma). Suppose that a continuous function $g(\cdot)$ satisfies
$$0\le g(t)\le\alpha(t)+\beta\int_0^tg(s)\,ds,\qquad0\le t\le T,$$
with $\beta\ge0$ and $\alpha:[0,T]\to\mathbb{R}$ integrable. Then
$$g(t)\le\alpha(t)+\beta\int_0^t\alpha(s)e^{\beta(t-s)}\,ds,\qquad0\le t\le T.$$
In particular, if $\alpha(t)=\alpha$ is a constant, then
$$g(t)\le\alpha e^{\beta t},\qquad0\le t\le T.$$
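The constant-$\alpha$ case of Gronwall's Lemma can be illustrated numerically. The following minimal Python sketch takes, as an assumed example that does not come from the paper, the function $g$ solving $g'(t)=\beta\cos^2(t)\,g(t)$, $g(0)=\alpha$, which satisfies $0\le g(t)\le\alpha+\beta\int_0^tg(s)\,ds$, and compares an explicit Euler approximation of $g$ with the bound $\alpha e^{\beta t}$. The function name and the parameter values are hypothetical.

```python
import math


def gronwall_gap(alpha, beta, T, steps=1000):
    """Return the largest value of g(t) - alpha*exp(beta*t) on an Euler grid.

    Here g is the explicit Euler approximation of g' = beta*cos(t)^2*g with
    g(0) = alpha, an illustrative function satisfying the hypothesis of the
    lemma with constant alpha. A non-positive result means the bound holds
    at every grid point.
    """
    h = T / steps
    g = alpha
    worst = -math.inf
    for k in range(steps + 1):
        t = k * h
        worst = max(worst, g - alpha * math.exp(beta * t))
        g *= 1.0 + beta * h * math.cos(t) ** 2  # explicit Euler step
    return worst


gap = gronwall_gap(alpha=1.0, beta=2.0, T=1.0)
print(gap)
```

The bound is attained only at $t=0$ in this example, so the reported gap should not be positive.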
As claimed in Sec. 2, we will use $\psi^0(t,x,v,\vec X(s-am^{1/2},\omega))$ as an approximation of $x(m^{1/2}t+s;\Psi(s,x,m^{-1/2}v))$. In the following two propositions, with the help of Gronwall's Lemma, we show that this is a good approximation by giving some estimate for the error (see Proposition 3.6.3(3)), which is necessary when showing the tightness, and by giving the coefficient of the next term in its expansion (see Proposition 3.6.4), which is necessary when showing the convergence to the limit.
Proposition 3.6.3. Fix any $a\in\mathbb{R}$. Suppose that $0\le s-am^{1/2}\le T\wedge\sigma(\omega)$ and $0\le s-m^{1/2}\tau\le T\wedge\sigma(\omega)$. Let
$$y(t)=x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))-\psi^0(t,x,v;\vec X(s-am^{1/2},\omega)).$$
Also, suppose that $(x,v)\in E$ and $|v|>2C_0+1$. Then
(1) $y(t)=0$ if $0\le m^{1/2}t+s\le T\wedge\sigma(\omega)$ and $t\le-\tau$,
(2)
$$\begin{aligned}
\frac{d^2}{dt^2}y(t)=-\sum_{i=1}^N\big\{&\nabla U_i\big(y(t)+\psi^0(t,x,v;\vec X(s-am^{1/2},\omega))-X_i(m^{1/2}t+s,\omega)\big)\\
&-\nabla U_i\big(\psi^0(t,x,v;\vec X(s-am^{1/2},\omega))-X_i(s-am^{1/2},\omega)\big)\big\},
\end{aligned}$$
(3) there exists a constant $\tilde C$, depending only on $n$, $\tau$ and $\sum_{i=1}^N\|\nabla^2U_i\|_\infty$, such that
$$|y(t)|+\Big|\frac d{dt}y(t)\Big|\le m^{1/2}\tilde C(2\tau+|a|),\tag{3.35}$$
if $0\le m^{1/2}t+s\le T\wedge\sigma(\omega)$ and $|t|\le2\tau$.

Proof. We first show the first assertion. We have by (3.34) that $x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))=x+tv$ in our setting. We next look at the term $\psi^0(t,x,v;\vec X(s-am^{1/2},\omega))$. It is trivial that $|X_i(s-am^{1/2},\omega)|\le|X_{i,0}|+nT$ under our assumption. Also, since $t\le-\tau$ and $|v|\ge2C_0+1$, we have for any $\tilde s$ big enough that $u\in[0,t+\tilde s]\Rightarrow u-\tilde s\in[-\tilde s,t]\subset[-\tilde s,-\tau]$, hence
$$\inf_{u\in[0,t+\tilde s]}|x-\tilde sv+uv|\ge|t||v|\ge C_0^{-1}R_0(2C_0+1)\ge R_0$$
(this might look incorrect if one forgets the fact that $t$ is now taken to be negative). Therefore, $\psi^0(t,x,v;\vec X(s-am^{1/2},\omega))=\lim_{\tilde s\to\infty}\varphi^0(t+\tilde s,x-\tilde sv,v;\vec X(s-am^{1/2},\omega))=x+tv$. This proves our first assertion.

The second assertion is trivial by definition.

Let us prove the third assertion. Notice that for any $|t|\le2\tau$ satisfying $0\le m^{1/2}t+s\le T\wedge\sigma(\omega)$, we have $|X_i(m^{1/2}t+s,\omega)-X_i(s-am^{1/2},\omega)|\le n|(m^{1/2}t+s)-(s-am^{1/2})|\le nm^{1/2}(2\tau+|a|)$, so by (2),
$$\begin{aligned}
\Big|\frac{d^2}{dt^2}y(t)\Big|&\le\sum_{i=1}^N\|\nabla^2U_i\|_\infty\big|y(t)-[X_i(m^{1/2}t+s,\omega)-X_i(s-am^{1/2},\omega)]\big|\\
&\le\sum_{i=1}^N\|\nabla^2U_i\|_\infty m^{1/2}n(2\tau+|a|)+\sum_{i=1}^N\|\nabla^2U_i\|_\infty|y(t)|.
\end{aligned}$$
Therefore,
$$\begin{aligned}
\Big|\frac d{dt}\Big(y(t),\frac d{dt}y(t)\Big)\Big|&\le\Big|\frac d{dt}y(t)\Big|+\Big|\frac{d^2}{dt^2}y(t)\Big|\\
&\le m^{1/2}\Big(\sum_{i=1}^N\|\nabla^2U_i\|_\infty\Big)n(2\tau+|a|)+\Big(1+\sum_{i=1}^N\|\nabla^2U_i\|_\infty\Big)\Big|\Big(y(t),\frac d{dt}y(t)\Big)\Big|,
\end{aligned}$$
if $|t|\le2\tau$ and $0\le m^{1/2}t+s\le T\wedge\sigma(\omega)$. Also, by (1), $y(-\tau)=\frac d{dt}y(-\tau)=0$. Let $g(t)=|(y(t-\tau),\frac d{dt}y(t-\tau))|$, then we have $g(0)=0$ and
$$\Big|\frac d{dt}g(t)\Big|\le\Big|\frac d{dt}\Big(y(t-\tau),\frac d{dt}y(t-\tau)\Big)\Big|\le m^{1/2}\Big(\sum_{i=1}^N\|\nabla^2U_i\|_\infty\Big)n(2\tau+|a|)+\Big(1+\sum_{i=1}^N\|\nabla^2U_i\|_\infty\Big)g(t),$$
if $-\tau\le t\le3\tau$ and $0\le m^{1/2}(t-\tau)+s\le T\wedge\sigma(\omega)$. (Notice that $t=0$ satisfies these conditions since $0\le s-m^{1/2}\tau\le T\wedge\sigma(\omega)$ under our assumption.) Therefore, if $0\le t\le3\tau$ and $0\le m^{1/2}(t-\tau)+s\le T\wedge\sigma(\omega)$, then
$$g(t)\le m^{1/2}\Big(\sum_{i=1}^N\|\nabla^2U_i\|_\infty\Big)n(2\tau+|a|)3\tau+\Big(1+\sum_{i=1}^N\|\nabla^2U_i\|_\infty\Big)\int_0^tg(s)\,ds,$$
so by Gronwall's inequality, we get
$$g(t)\le m^{1/2}\Big(\sum_{i=1}^N\|\nabla^2U_i\|_\infty\Big)n(2\tau+|a|)3\tau\,e^{(1+\sum_{i=1}^N\|\nabla^2U_i\|_\infty)t}.$$
The assertion for $t\in[-\tau,0]$ satisfying $0\le m^{1/2}(t-\tau)+s\le T\wedge\sigma(\omega)$ is proved in the same way, and we omit the proof here. This completes the proof.

Let $z(t;x,v,\vec X,\vec V,a)$ be the solution of (2.3). In the following, we show that this $z(t)$ gives the next term in the approximation of $x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))$.

Proposition 3.6.4. Let $a\in\mathbb{R}$. Suppose that $t\ge-a$, $0\le s-m^{1/2}\tau\le T\wedge\sigma(\omega)$, $-\tau\le t\le2\tau$ and $0\le s-am^{1/2}\le s+m^{1/2}t\le T\wedge\sigma(\omega)$. Also, let $(x,v)\in E$ and $|v|>2C_0+1$. Then
$$\begin{aligned}
&\big|x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))-\big(\psi^0(t,x,v,\vec X(s-am^{1/2}))+m^{1/2}z(t;x,v,\vec X(s-am^{1/2}),\vec V(s-am^{1/2}),a)\big)\big|\\
&\le Cm^{1/2}\Big((1+|a|)^2m^{1/2}+m^{-1/2}\int_{s-am^{1/2}}^{s+m^{1/2}t}|\vec V(r)-\vec V(s-am^{1/2})|\,dr\Big).
\end{aligned}$$
Here $C$ is a constant depending only on $\tau$, $n$, $\sum_{i=1}^N\|\nabla^3U_i\|_\infty$ and $\sum_{i=1}^N\|\nabla^2U_i\|_\infty$.
Proof. The main tool is again Gronwall's Lemma. Let
$$y(t)=x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))-\psi^0(t,x,v,\vec X(s-am^{1/2},\omega))$$
as in Proposition 3.6.3, and let
$$\xi(t)=y(t)-m^{1/2}z(t;x,v,\vec X(s-am^{1/2}),\vec V(s-am^{1/2}),a).$$
We need to estimate $|\xi(t)|$. By a simple calculation,
$$\begin{aligned}
\frac{d^2}{dt^2}y(t)&=-\sum_{i=1}^N\big\{\nabla U_i\big(y(t)+\psi^0(t,x,v;\vec X(s-am^{1/2}))-X_i(m^{1/2}t+s)\big)-\nabla U_i\big(\psi^0(t,x,v;\vec X(s-am^{1/2}))-X_i(s-am^{1/2})\big)\big\}\\
&=-\sum_{i=1}^N\int_0^1\nabla^2U_i\big(\eta[y(t)-\{X_i(m^{1/2}t+s)-X_i(s-am^{1/2})\}]+\psi^0(t,x,v,\vec X(s-am^{1/2}))-X_i(s-am^{1/2})\big)\\
&\qquad\qquad\times[y(t)-\{X_i(m^{1/2}t+s)-X_i(s-m^{1/2}a)\}]\,d\eta,
\end{aligned}$$
so
$$\begin{aligned}
\frac{d^2}{dt^2}\xi(t)&=-\sum_{i=1}^N\int_0^1d\eta\,\big\{\nabla^2U_i\big(\eta[y(t)-\{X_i(m^{1/2}t+s)-X_i(s-m^{1/2}a)\}]+\psi^0(t,x,v;\vec X(s-am^{1/2}))-X_i(s-am^{1/2})\big)\\
&\qquad\qquad-\nabla^2U_i\big(\psi^0(t,x,v;\vec X(s-am^{1/2}))-X_i(s-am^{1/2})\big)\big\}\cdot\big(y(t)-\{X_i(m^{1/2}t+s)-X_i(s-m^{1/2}a)\}\big)\\
&\quad-\sum_{i=1}^N\nabla^2U_i\big(\psi^0(t,x,v,\vec X(s-am^{1/2}))-X_i(s-am^{1/2})\big)\\
&\qquad\qquad\times\big(\xi(t)-\{X_i(m^{1/2}t+s)-X_i(s-m^{1/2}a)-m^{1/2}(t+a)V_i(s-m^{1/2}a)\}\big).
\end{aligned}$$
Therefore, since $|X_i(m^{1/2}t+s)-X_i(s-m^{1/2}a)|\le n(t+|a|)m^{1/2}$ in our domain, and $X_i(m^{1/2}t+s)-X_i(s-m^{1/2}a)=\int_{s-am^{1/2}}^{s+m^{1/2}t}V_i(r)\,dr$, we get
$$\begin{aligned}
\Big|\frac{d^2}{dt^2}\xi(t)\Big|&\le\sum_{i=1}^N\|\nabla^3U_i\|_\infty\big(|y(t)|+n(t+|a|)m^{1/2}\big)^2+\sum_{i=1}^N\|\nabla^2U_i\|_\infty|\xi(t)|\\
&\quad+\sum_{i=1}^N\|\nabla^2U_i\|_\infty\int_{s-am^{1/2}}^{s+m^{1/2}t}|V_i(r)-V_i(s-m^{1/2}a)|\,dr.
\end{aligned}\tag{3.36}$$
Let $\tilde C$ be the constant in Proposition 3.6.3(3), and let
$$C_1=\sum_{i=1}^N\|\nabla^3U_i\|_\infty(\tilde C+n)^2(2\tau+1)^2,\qquad C_2=\sum_{i=1}^N\|\nabla^2U_i\|_\infty.$$
Then (3.36) combined with Proposition 3.6.3(3) gives us
$$\Big|\frac{d^2}{dt^2}\xi(t)\Big|\le C_1m(1+|a|)^2+C_2\int_{s-am^{1/2}}^{s+m^{1/2}t}|V_i(r)-V_i(s-m^{1/2}a)|\,dr+C_2|\xi(t)|,$$
if $0\le m^{1/2}t+s\le T\wedge\sigma(\omega)$, $|t|\le2\tau$ and $t\ge-a$. Let
$$g(t)=\Big|\Big(\xi(t-\tau),\frac d{dt}\xi(t-\tau)\Big)\Big|.$$
Then the estimate above gives us
$$\begin{aligned}
\Big|\frac d{dt}g(t)\Big|&\le\Big|\frac d{dt}\xi(t-\tau)\Big|+\Big|\frac{d^2}{dt^2}\xi(t-\tau)\Big|\\
&\le C_1m(1+|a|)^2+C_2\int_{s-am^{1/2}}^{s+m^{1/2}(t-\tau)}|V_i(r)-V_i(s-m^{1/2}a)|\,dr+(C_2+1)g(t),
\end{aligned}$$
if $t-\tau\ge-a$, $|t-\tau|\le2\tau$ and $0\le m^{1/2}(t-\tau)+s\le T\wedge\sigma(\omega)$. Since $\xi(-\tau)=\frac d{dt}\xi(-\tau)=0$, we have $g(0)=0$. Also, $\int_{s-am^{1/2}}^{s+m^{1/2}(t-\tau)}|V_i(r)-V_i(s-m^{1/2}a)|\,dr$ is monotone non-decreasing with respect to $t$. So if $t-\tau\ge-a$ and $0\le t\le3\tau$, then
$$g(t)\le3\tau\Big(C_1m(1+|a|)^2+C_2\int_{s-am^{1/2}}^{s+m^{1/2}(t-\tau)}|V_i(r)-V_i(s-m^{1/2}a)|\,dr\Big)+(C_2+1)\int_0^tg(u)\,du.$$
Therefore, by Gronwall's inequality and the monotonicity of $\int_{s-am^{1/2}}^{s+m^{1/2}t}|V_i(r)-V_i(s-m^{1/2}a)|\,dr$ again, the above implies
$$g(t)\le3\tau e^{(C_2+1)3\tau}\Big(C_1m(1+|a|)^2+C_2\int_{s-am^{1/2}}^{s+m^{1/2}(t-\tau)}|V_i(r)-V_i(s-m^{1/2}a)|\,dr\Big),$$
if $t-\tau\ge-a$, $-\tau\le t-\tau\le2\tau$ and $0\le m^{1/2}(t-\tau)+s\le T\wedge\sigma(\omega)$. This completes the proof of our assertion.

In the following proposition, we show that, similarly to the solution of Newton's equation (see Corollary 3.2.3), $x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))$ does not interact with $X_i(m^{1/2}t+s,\omega)$ if $|t|$ is big.

Proposition 3.6.5. Let $(x,v)\in E$ and $|v|>2C_0+1$. Suppose that $0\le m^{1/2}t+s\le T\wedge\sigma(\omega)$ and that either $t<-\tau$ or $t>2\tau$. Then
$$\nabla U_i\big(x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))-X_i(m^{1/2}t+s,\omega)\big)=0.$$
Proof. Let $\eta=|v|^{-1}v$. Notice that $|X_i(m^{1/2}t+s,\omega)|\le|X_{0,i}|+nT$ if $0\le m^{1/2}t+s\le T\wedge\sigma(\omega)$. So it suffices to show that $|x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))|\ge R_0$ for $t$ satisfying our condition. We show it in the following.

First notice that by (3.34), if $t<-\tau=-C_0^{-1}R_0$, then $|x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))|=|x+tv|\ge|t||v|\ge C_0^{-1}R_0(2C_0+1)>R_0$.

We next prove the assertion for $t>2\tau$. Let us divide it into two cases, according to whether $s<0$ or not. We first deal with the case $s<0$. Notice that by Proposition 3.6.1, we have that $\eta\cdot v(u,\Psi(s,x,m^{-1/2}v))\ge m^{-1/2}(C_0+1)$ for any $u\in(0,T\wedge\sigma)$. Also, $x(0,\Psi(s,x,m^{-1/2}v))=x-sm^{-1/2}v$, $x\cdot\eta=0$ and $v\cdot\eta=|v|$. Therefore,
$$\begin{aligned}
\eta\cdot x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))&=\int_0^{m^{1/2}t+s}\eta\cdot v(u,\Psi(s,x,m^{-1/2}v))\,du+\eta\cdot(x-sm^{-1/2}v)\\
&\ge m^{-1/2}(C_0+1)(m^{1/2}t+s)-sm^{-1/2}|v|\\
&=t(C_0+1)+m^{-1/2}s(C_0+1-|v|)\\
&\ge t(C_0+1)>2C_0^{-1}R_0(C_0+1)>R_0,
\end{aligned}$$
where when passing to the last line, we used the fact that $s<0$ and $C_0+1-|v|<0$.

Let us now prove the assertion for $t>2\tau$ and $s>0$. Notice that $s<T\wedge\sigma$ in this case since we have by assumption $0\le m^{1/2}t+s\le T\wedge\sigma(\omega)$. We first show that
$$\eta\cdot x(s,\Psi(s,x,m^{-1/2}v))\ge-R_0,\qquad\text{for all } s\in[0,T\wedge\sigma).\tag{3.37}$$
In the following, again, we use the fact that $\eta\cdot v(u,\Psi(s,x,m^{-1/2}v))\ge m^{-1/2}(C_0+1)>0$ for any $u\in(0,T\wedge\sigma)$, which is guaranteed by Proposition 3.6.1. We also use the fact that $x(0,\Psi(s,x,m^{-1/2}v))=x-sm^{-1/2}v$, $x\cdot\eta=0$ and $v\cdot\eta=|v|$. Let $s_0=\frac{R_0}{|v|}m^{1/2}$. If $s\in[0,s_0]$, then we have that
$$\begin{aligned}
\eta\cdot x(s,\Psi(s,x,m^{-1/2}v))&=\int_0^s\eta\cdot v(u,\Psi(s,x,m^{-1/2}v))\,du+\eta\cdot(x-sm^{-1/2}v)\\
&\ge0-m^{-1/2}|v|s\ge-m^{-1/2}|v|\cdot\frac{R_0}{|v|}m^{1/2}=-R_0.
\end{aligned}$$
If $s\in[s_0,T\wedge\sigma]$, then by using a similar argument as in the proof of (3.33), it is easy to see by definition that $x(s-s_0,\Psi(s,x,m^{-1/2}v))=x-s_0m^{-1/2}v$, $v(s-s_0,\Psi(s,x,m^{-1/2}v))=m^{-1/2}v$,
therefore,
$$\begin{aligned}
\eta\cdot x(s,\Psi(s,x,m^{-1/2}v))&=\int_{s-s_0}^s\eta\cdot v(u,\Psi(s,x,m^{-1/2}v))\,du+\eta\cdot x(s-s_0,\Psi(s,x,m^{-1/2}v))\\
&\ge0+\eta\cdot(x-s_0m^{-1/2}v)=-s_0m^{-1/2}|v|=-\frac{R_0}{|v|}m^{1/2}\cdot m^{-1/2}|v|=-R_0.
\end{aligned}$$
This completes the proof of (3.37). Since
$$\frac d{dt}x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))=m^{1/2}v(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v)),$$
and $0\le m^{1/2}t+s\le\sigma(\omega)$ by assumption, we have by Proposition 3.6.1 that
$$\frac d{dt}\big(\eta\cdot x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))\big)>C_0.\tag{3.38}$$
This combined with (3.37) implies that
$$\begin{aligned}
\eta\cdot x(m^{1/2}t+s,\Psi(s,x,m^{-1/2}v))&=\int_0^t\frac d{du}\big(\eta\cdot x(m^{1/2}u+s,\Psi(s,x,m^{-1/2}v))\big)\,du+\eta\cdot x(s,\Psi(s,x,m^{-1/2}v))\\
&\ge C_0t-R_0\ge C_0\cdot2C_0^{-1}R_0-R_0=R_0.
\end{aligned}$$
This completes the proof of our assertion, hence the lemma is proven.

Before closing this section, let us discuss a little bit more about the new potential $\tilde U$ and the function $p$ defined in Sec. 3.5. The following equation will be used later:
$$\nabla_i\tilde U(\vec X)=\int_{\mathbb{R}^{2d}}\nabla U_i(X_i-x)\,\rho\Big(\frac12|v|^2+\sum_{j=1}^NU_j(x-X_j)\Big)\,dx\,dv.\tag{3.39}$$
Also, by a simple calculation, there exists a global constant $C_d$ such that
$$p(s)=C_d\int_0^\infty\bar\rho(r+s)r^{\frac d2-1}\,dr,$$
hence
$$p'(s)=C_d\int_0^\infty\rho(r+s)r^{\frac d2-1}\,dr=C_d\int_s^\infty\rho(t)(t-s)^{\frac d2-1}\,dt.$$
So if s < e0 , then
p (s) = Cd
∞
0
ρ(t)(t − s) 2 −1 dt, d
∞ d d p (s) = Cd 1 − ρ(t)(t − s) 2 −2 dt, 2 0
∞ d d d p (s) = Cd 1 − ρ(t)(t − s) 2 −3 dt. 2− 2 2 0
(3.40) (3.41)
Also notice that under the condition s < e0 , if 0 ≤ t < s, then t < e0 , hence ρ(t) = 0. Therefore, we get that p (s) < 0 if d ≥ 3, p (s) = 0 if d = 2, and p (s) > 0 if d = 1. (3.42)

We remark that in reality, we have ρ(t) = e−t , so ρ(t) = −e−t and p(s) = −Ce−s , for some constant C > 0, so p (s) < 0.

4. Proof of Basic Lemmas

We give the proofs of Lemmas 3.5.1 and 3.5.2 in this section. The proof of Lemma 3.5.3 will be given in Sec. 5.

4.1. First decomposition

Let $\sigma(\omega) = \sigma_n(\omega) = \inf\{t\ge0;\ \max_{i=1,\dots,N}|V_i(t,\omega)|\ge n\}$, $R_0 = \max_{i=1,\dots,N}\{R_i+|X_{i,0}|\}+nT+1$, and $\tau = C_0^{-1}R_0$ as before. Also, we always assume that $(x,v)\in E$, i.e. $x\cdot v = 0$. First, for any $t\le T$, we have by (3.3) that
\[
M_i(V_i(t)-V_i(0)) = -\int_0^t ds\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(s,\omega)-x(s,\Psi(r,x,m^{-1/2}v))\bigr)\,\mu_\omega(dr,dx,dv),
\]
so we have the following decomposition:
\[
-M_i(V_i(t\wedge\sigma_n)-V_i(0)) = V_i^0(t)+V_i^1(t),
\]
with
\[
V_i^0(t) = \int_0^{t\wedge\sigma_n}\mathbf{1}_{[4m^{1/2}\tau,\infty)}(s)\,ds\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(s,\omega)-x(s,\Psi(r,x,m^{-1/2}v))\bigr)\,\mu_\omega(dr,dx,dv),
\]
\[
V_i^1(t) = \int_0^{t\wedge\sigma_n}\mathbf{1}_{[0,4m^{1/2}\tau)}(s)\,ds\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(s,\omega)-x(s,\Psi(r,x,m^{-1/2}v))\bigr)\,\mu_\omega(dr,dx,dv).
\]
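4.2. The term $V_i^1(t)$

Let us deal with $V_i^1(t)$ in this subsection. We will show that it is negligible as $m\to0$. Let us decompose $V_i^1(t)$ as follows: $V_i^1(t) = V_i^{10}(t)+V_i^{11}(t)$, with
\[
\begin{aligned}
V_i^{10}(t) = \int_0^{t\wedge\sigma_n}\mathbf{1}_{[0,4m^{1/2}\tau)}(s)\,ds\int_{\mathbb{R}\times E}\bigl\{&\nabla U_i\bigl(X_i(s,\omega)-x(s,\Psi(r,x,m^{-1/2}v))\bigr)\\
&-\nabla U_i\bigl(X_i(0)-\varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);X(0))\bigr)\bigr\}\,\mu_\omega(dr,dx,dv),
\end{aligned}
\]
\[
V_i^{11}(t) = \int_0^{t\wedge\sigma_n}\mathbf{1}_{[0,4m^{1/2}\tau)}(s)\,ds\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(0)-\varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);X(0))\bigr)\,\mu_\omega(dr,dx,dv).
\]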
Before discussing the behavior of $V_i^{10}(t)$, let us prepare the following result. Fix any $t_0>0$. Then we have the following:

Lemma 4.2.1. For any $s\in[0,t_0]$ satisfying $0\le m^{1/2}s\le T\wedge\sigma_n(\omega)$, we have that
\[
\bigl|x(m^{1/2}s,\Psi(r,x,m^{-1/2}v))-\varphi^0(s,\Psi(m^{-1/2}r,x,v);X(0))\bigr| \le nm^{1/2}s\sum_{i=1}^N\|\nabla^2U_i\|_\infty\,t_0\,e^{(\sum_{i=1}^N\|\nabla^2U_i\|_\infty+1)t_0}.
\]
Proof. The main tool is again Gronwall's lemma. First notice that under our condition, $|X_i(m^{1/2}s)-X_i(0)|\le nm^{1/2}s$. Let $\xi(s) = x(m^{1/2}s,\Psi(r,x,m^{-1/2}v))-\varphi^0(s,\Psi(m^{-1/2}r,x,v);X(0))$. Then we have
\[
\frac{d^2}{ds^2}\xi(s) = \sum_{i=1}^N\bigl\{-\nabla U_i\bigl(x(m^{1/2}s,\Psi(r,x,m^{-1/2}v))-X_i(m^{1/2}s)\bigr)+\nabla U_i\bigl(\varphi^0(s,\Psi(m^{-1/2}r,x,v);X(0))-X_i(0)\bigr)\bigr\}.
\]
Therefore, since $\nabla^2U_i$, $i=1,\dots,N$, are bounded, we have that
\[
\Bigl|\frac{d^2}{ds^2}\xi(s)\Bigr| \le \sum_{i=1}^N\|\nabla^2U_i\|_\infty\bigl(|\xi(s)|+|X_i(m^{1/2}s)-X_i(0)|\bigr) \le \sum_{i=1}^N\|\nabla^2U_i\|_\infty\bigl(|\xi(s)|+nm^{1/2}s\bigr).
\]
Let $g(s) = |(\xi(s),\frac{d}{ds}\xi(s))|$. Then the above implies that
\[
\frac{d}{ds}g(s) \le \Bigl|\frac{d^2}{ds^2}\xi(s)\Bigr|+\Bigl|\frac{d}{ds}\xi(s)\Bigr| \le nm^{1/2}s\sum_{i=1}^N\|\nabla^2U_i\|_\infty+\Bigl(\sum_{i=1}^N\|\nabla^2U_i\|_\infty+1\Bigr)g(s).
\]
Also, $g(0)=0$. So for any $0\le s\le t_0$, we get that
\[
g(s) \le nm^{1/2}s\sum_{i=1}^N\|\nabla^2U_i\|_\infty\,t_0+\Bigl(\sum_{i=1}^N\|\nabla^2U_i\|_\infty+1\Bigr)\int_0^s g(u)\,du.
\]
Therefore, by Gronwall's Lemma, we have
\[
g(s) \le nm^{1/2}s\sum_{i=1}^N\|\nabla^2U_i\|_\infty\,t_0\,e^{(\sum_{i=1}^N\|\nabla^2U_i\|_\infty+1)s}.
\]
This gives us our assertion.

In particular, applying Lemma 4.2.1 to $t_0=4\tau$, we get that
\[
\bigl|x(s,\Psi(r,x,m^{-1/2}v))-\varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);X(0))\bigr| \le ns\sum_{i=1}^N\|\nabla^2U_i\|_\infty\,4\tau\,e^{(\sum_{i=1}^N\|\nabla^2U_i\|_\infty+1)4\tau}, \tag{4.1}
\]
\[
|X_i(s)-X_i(0)| \le ns,
\]
for any $s\in[0,4m^{1/2}\tau\wedge T\wedge\sigma(\omega))$.
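The Gronwall step used in the proof of Lemma 4.2.1 can be illustrated numerically. The following is a minimal sketch, not part of the original argument: it integrates the equality case $g' = a + bg$, $g(0)=0$, of the integral inequality $g(s)\le as+b\int_0^sg(u)\,du$ and compares the result with the bound $as\,e^{bs}$. The function names and the sample values of `a_rate` (standing for $nm^{1/2}\sum_i\|\nabla^2U_i\|_\infty t_0$) and `b` (standing for $\sum_i\|\nabla^2U_i\|_\infty+1$) are assumptions made only for illustration.

```python
import math


def euler_extremal(a_rate, b, t_end, steps=10_000):
    """Explicit Euler for g' = a_rate + b*g, g(0) = 0.

    This ODE is the equality case of g(s) <= a_rate*s + b*int_0^s g(u) du.
    Returns the list of grid points (s, g(s)).
    """
    h = t_end / steps
    s, g = 0.0, 0.0
    path = [(s, g)]
    for _ in range(steps):
        g += h * (a_rate + b * g)
        s += h
        path.append((s, g))
    return path


def gronwall_bound(a_rate, b, s):
    """Right-hand side a(s)*exp(b*s) of Gronwall's lemma with a(s) = a_rate*s."""
    return a_rate * s * math.exp(b * s)


def bound_holds(a_rate, b, t_end):
    """Check g(s) <= a_rate*s*exp(b*s) on the whole Euler grid."""
    return all(g <= gronwall_bound(a_rate, b, s) + 1e-12
               for s, g in euler_extremal(a_rate, b, t_end))
```

For instance, `bound_holds(0.3, 2.0, 1.5)` checks the bound on $[0,1.5]$; the closed-form solution $(a/b)(e^{bs}-1)$ of the equality case stays below $as\,e^{bs}$ because $e^{bs}-1\le bs\,e^{bs}$.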
We use (4.1) to prove the following. The key point here is that the domain of $s$ is now close to 0 and narrow enough.

Lemma 4.2.2. $E^{P_m}[\sup_{0\le t\le T}|V_i^{10}(t)|^2]\to0$ as $m\to0$.

Proof. First notice that in the definition of $V_i^{10}$, we are taking an integral for $s\in[0,4m^{1/2}\tau\wedge T\wedge\sigma(\omega))$, so if $r>6m^{1/2}\tau$ or $r<-2m^{1/2}\tau$, then we have $|u|>2m^{1/2}\tau$ for any $u\in[r-s,r]$, so since $x\cdot v=0$, we get by definition
\[
\bigl|\varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);X(0))\bigr| = |x-m^{-1/2}(r-s)v| \ge m^{-1/2}|r-s||v| \ge 2\tau|v| \ge R_0.
\]
Therefore, for any $s\in[0,4m^{1/2}\tau\wedge T\wedge\sigma(\omega))$, we have
\[
\nabla U_i\bigl(X_i(0)-\varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);X(0))\bigr) = 0 \tag{4.2}
\]
if $r>6m^{1/2}\tau$ or $r<-2m^{1/2}\tau$. Also, (4.2) holds if $|x|\ge R_0+1$. Similarly, the same holds with $X(0)$ substituted by $X(s)$ (since $0\le s\le\sigma$). Let
\[
C_1 = \|\nabla^2U_i\|_\infty\Bigl(\sum_{j=1}^N\|\nabla^2U_j\|_\infty\,4\tau\,e^{(\sum_{j=1}^N\|\nabla^2U_j\|_\infty+1)4\tau}+1\Bigr).
\]
Then by combining these facts with (4.1), we get that for any $s\in[0,4m^{1/2}\tau\wedge T\wedge\sigma(\omega))$,
\[
\begin{aligned}
&\bigl|\nabla U_i\bigl(X_i(s,\omega)-x(s,\Psi(r,x,m^{-1/2}v))\bigr)-\nabla U_i\bigl(X_i(0)-\varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);X(0))\bigr)\bigr|\\
&\qquad\le \mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,nsC_1.
\end{aligned}
\]
Therefore, by the definition of $V_i^{10}(t)$, we get that
\[
\begin{aligned}
|V_i^{10}(t)| &\le \int_0^{t\wedge\sigma_n}\mathbf{1}_{[0,4m^{1/2}\tau)}(s)\,ds\int_{\mathbb{R}\times E}C_1ns\,\mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,\mu_\omega(dr,dx,dv)\\
&\le \frac{C_1}{2}n(4m^{1/2}\tau)^2\int_{\mathbb{R}\times E}\mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,\mu_\omega(dr,dx,dv).
\end{aligned} \tag{4.3}
\]
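We need to discuss the $L^2(P_m)$-norm of the integral on the right-hand side above. Notice that in general, it is easy to see by the definition of a Poisson point process that $E^{P_m}[(\int g\,d\mu_\omega)^2] = \int g^2\,d\lambda_m+(\int g\,d\lambda_m)^2$ for any $g\in L^2(\lambda_m)$.

Let $c=\sum_{j=1}^N\|U_j\|_\infty$, and set $C_2 = 8\tau(2(R_0+1))^{d-1}\int_{\mathbb{R}^d}\rho_c(\frac12|v|^2)|v|\,dv$, which is finite by our assumption. Then we have by definition that
\[
\begin{aligned}
&\int_{\mathbb{R}\times E}\mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,\lambda(dr,dx,dv)\\
&\quad= \int_{\mathbb{R}\times E}\mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,m^{-1}\rho_0(x-m^{-1/2}rv,v)\,dr\,\nu(dx,dv)\\
&\quad\le \int_{\mathbb{R}\times E}\mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,m^{-1}\rho_c\Bigl(\frac12|v|^2\Bigr)\,dr\,\nu(dx,dv)\\
&\quad\le 8m^{1/2}\tau\,m^{-1}(2(R_0+1))^{d-1}\int_{\mathbb{R}^d}\rho_c\Bigl(\frac12|v|^2\Bigr)|v|\,dv = C_2m^{-1/2}.
\end{aligned}
\]
Therefore,
\[
\begin{aligned}
&E^{P_m}\Bigl[\Bigl(\int_{\mathbb{R}\times E}\mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,\mu_\omega(dr,dx,dv)\Bigr)^2\Bigr]\\
&\quad= \int_{\mathbb{R}\times E}\mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,\lambda(dr,dx,dv)
+\Bigl(\int_{\mathbb{R}\times E}\mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,\lambda(dr,dx,dv)\Bigr)^2\\
&\quad\le C_2m^{-1/2}+C_2^2m^{-1}.
\end{aligned} \tag{4.4}
\]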
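The second-moment identity $E[(\int g\,d\mu_\omega)^2]=\int g^2\,d\lambda+(\int g\,d\lambda)^2$ behind (4.4) can be checked in a toy setting. The sketch below is an illustration under simplifying assumptions, not the setting of the paper: the Poisson point process lives on finitely many atoms, so that $\int g\,d\mu$ becomes $\sum_jg_jN_j$ with independent $N_j\sim\mathrm{Poisson}(\lambda_j)$. The function names, the truncation level `kmax` and the sample values are hypothetical.

```python
import math
from itertools import product


def poisson_pmf(k, mean):
    """P(N = k) for N ~ Poisson(mean), mean > 0, computed in log space."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def second_moment_direct(g, lam, kmax=80):
    """E[(sum_j g_j N_j)^2] by summing the (truncated) joint pmf."""
    total = 0.0
    for counts in product(range(kmax), repeat=len(g)):
        prob = math.prod(poisson_pmf(k, l) for k, l in zip(counts, lam))
        total += prob * sum(gj * k for gj, k in zip(g, counts)) ** 2
    return total


def second_moment_formula(g, lam):
    """int g^2 d(lambda) + (int g d(lambda))^2 for the atomic intensity lam."""
    int_g2 = sum(gj * gj * l for gj, l in zip(g, lam))
    int_g = sum(gj * l for gj, l in zip(g, lam))
    return int_g2 + int_g ** 2
```

With `g = [1.0, -0.5]` and `lam = [2.0, 3.5]` both functions should agree up to the truncation error of the series, which is negligible for `kmax=80` at these intensities.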
This combined with (4.3) gives us that
\[
E^{P_m}\Bigl[\sup_{0\le t\le T}|V_i^{10}(t)|^2\Bigr] \le \Bigl(\frac12C_1n(4m^{1/2}\tau)^2\Bigr)^2\bigl(C_2m^{-1/2}+C_2^2m^{-1}\bigr).
\]
The right-hand side above converges to 0 as $m\to0$. This completes the proof of our assertion.

For the term $V_i^{11}(t)$, we show in the following that it is also negligible when $m\to0$. The main idea is to use the fact that the expectation (of the integral with respect to the counting measure) is 0 (see (4.5) below), which means that we only need to calculate its variance.

Lemma 4.2.3.
\[
E^{P_m}\Bigl[\sup_{0\le t\le T}|V_i^{11}(t)|^2\Bigr]\to0\quad\text{as } m\to0.
\]
Proof. We first notice that
\[
\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(0)-\varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);X(0))\bigr)\,\lambda(dr,dx,dv) = 0 \tag{4.5}
\]
for any $s\in[0,4m^{1/2}\tau\wedge T\wedge\sigma)$ and $|v|\ge C_0$. Indeed, since $|X_i(0)-X_j(0)|>R_i+R_j$ for any $i\ne j$, we have by (3.31) that $\nabla_i\widetilde U(X(0))=0$. Combining this with (3.39), we get that
\[
\int_{\mathbb{R}^{2d}}\nabla U_i(X_i(0)-x)\,\rho\Bigl(\frac12|v|^2+\sum_{j=1}^NU_j(x-X_j(0))\Bigr)\,dx\,dv = 0.
\]
Applying Proposition 3.1.1 to this with $t=m^{-1/2}s$ and $f(x,v)=\nabla U_i(X_i(0)-x)$, we get
\[
\int_{\mathbb{R}^{2d}}\nabla U_i\bigl(X_i(0)-\varphi^0(m^{-1/2}s,x,v;X(0))\bigr)\,\rho\Bigl(\frac12|v|^2+\sum_{j=1}^NU_j(x-X_j(0))\Bigr)\,dx\,dv = 0.
\]
Reformulating this using the ray representation yields
\[
\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(0)-\varphi^0(m^{-1/2}s,\Psi(r,x,v);X(0))\bigr)\,\rho\Bigl(\frac12|v|^2+\sum_{j=1}^NU_j(\Psi^0(r,x,v)-X_j(0))\Bigr)\,dr\,\nu(dx,dv) = 0.
\]
By changing variable $r=m^{-1/2}r'$, we obtain (4.5).

By (4.5), we get that
\[
V_i^{11}(t) = \int_0^{t\wedge\sigma_n}\mathbf{1}_{[0,4m^{1/2}\tau)}(s)\,ds\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(0)-\varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);X(0))\bigr)\,(\mu_\omega(dr,dx,dv)-\lambda(dr,dx,dv)). \tag{4.6}
\]
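As in the proof of Lemma 4.2.2, (4.2) holds if $r>6m^{1/2}\tau$ or $r<-2m^{1/2}\tau$, or if $|x|\ge R_0+1$. Let
\[
C_3 = 8\tau(2(R_0+1))^{d-1}\|\nabla U_i\|_\infty^2\int_{\mathbb{R}^d}\rho_c\Bigl(\frac12|v|^2\Bigr)|v|\,dv,
\]
which is finite by our assumption. Then we have that
\[
\begin{aligned}
&E^{P_m}\Bigl[\Bigl|\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(0)-\varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);X(0))\bigr)\,(\mu_\omega(dr,dx,dv)-\lambda(dr,dx,dv))\Bigr|^2\Bigr]\\
&\quad= \int_{\mathbb{R}\times E}\bigl|\nabla U_i\bigl(X_i(0)-\varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);X(0))\bigr)\bigr|^2\,\lambda(dr,dx,dv)\\
&\quad\le \int_{\mathbb{R}\times E}\mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,\|\nabla U_i\|_\infty^2\,\lambda(dr,dx,dv)\\
&\quad= \|\nabla U_i\|_\infty^2\int_{\mathbb{R}\times E}\mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,m^{-1}\rho\Bigl(\frac12|v|^2+\sum_{j=1}^NU_j(X_{j,0}-(x-m^{-1/2}rv))\Bigr)\,dr\,\nu(dx,dv)\\
&\quad\le m^{-1}\,8m^{1/2}\tau\,(2(R_0+1))^{d-1}\|\nabla U_i\|_\infty^2\int_{\mathbb{R}^d}\rho_c\Bigl(\frac12|v|^2\Bigr)|v|\,dv = C_3m^{-1/2}.
\end{aligned} \tag{4.7}
\]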
Therefore,
\[
\begin{aligned}
&E^{P_m}\Bigl[\sup_{t\in[0,T]}|V_i^{11}(t)|^2\Bigr]\\
&\quad\le E^{P_m}\Bigl[\Bigl(\int_0^{T\wedge\sigma}\mathbf{1}_{[0,4m^{1/2}\tau)}(s)\Bigl|\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(0)-\varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);X(0))\bigr)(\mu_\omega(dr,dx,dv)-\lambda(dr,dx,dv))\Bigr|\,ds\Bigr)^2\Bigr]\\
&\quad\le E^{P_m}\Bigl[4m^{1/2}\tau\int_0^{T\wedge\sigma}\mathbf{1}_{[0,4m^{1/2}\tau)}(s)\Bigl|\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(0)-\varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);X(0))\bigr)(\mu_\omega(dr,dx,dv)-\lambda(dr,dx,dv))\Bigr|^2\,ds\Bigr]\\
&\quad\le (4m^{1/2}\tau)\int_0^{4m^{1/2}\tau}ds\,E^{P_m}\Bigl[\Bigl|\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(0)-\varphi^0(m^{-1/2}s,\Psi(m^{-1/2}r,x,v);X(0))\bigr)(\mu_\omega(dr,dx,dv)-\lambda(dr,dx,dv))\Bigr|^2\Bigr]\\
&\quad\le (4m^{1/2}\tau)^2C_3m^{-1/2},
\end{aligned}
\]
which converges to 0 as $m\to0$. This completes the proof of our assertion.

Combining Lemmas 4.2.2 and 4.2.3, we get the following main result of this subsection.

Lemma 4.2.4.
\[
E^{P_m}\Bigl[\sup_{0\le t\le T}|V_i^1(t)|^2\Bigr]\to0\quad\text{as } m\to0.
\]
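The two bounds behind Lemma 4.2.4 decay at explicit rates in $m$: the bound for $V_i^{10}$ behaves like $m$ and the bound for $V_i^{11}$ like $m^{1/2}$. The following short sketch simply evaluates the two right-hand sides; the function names are hypothetical and the default constants ($\tau$, $n$, $C_1$, $C_2$, $C_3$ all set to 1) are placeholder assumptions, since their actual values depend on the potentials.

```python
def bound_v10(m, tau=1.0, n=1.0, c1=1.0, c2=1.0):
    """(C1*n*(4 m^{1/2} tau)^2 / 2)^2 * (C2 m^{-1/2} + C2^2 m^{-1})."""
    prefactor = 0.5 * c1 * n * (4.0 * m ** 0.5 * tau) ** 2
    return prefactor ** 2 * (c2 * m ** -0.5 + c2 ** 2 * m ** -1.0)


def bound_v11(m, tau=1.0, c3=1.0):
    """(4 m^{1/2} tau)^2 * C3 * m^{-1/2}."""
    return (4.0 * m ** 0.5 * tau) ** 2 * c3 * m ** -0.5


def bound_v1(m, **constants):
    """Bound for E[sup |V^1|^2] via |a + b|^2 <= 2|a|^2 + 2|b|^2."""
    v10_args = {k: v for k, v in constants.items() if k in ("tau", "n", "c1", "c2")}
    v11_args = {k: v for k, v in constants.items() if k in ("tau", "c3")}
    return 2.0 * (bound_v10(m, **v10_args) + bound_v11(m, **v11_args))
```

Evaluating `bound_v1` along a sequence such as `m = 1e-2, 1e-4, 1e-6` shows the combined bound shrinking, dominated by the $m^{1/2}$ term coming from $V_i^{11}$.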
4.3. The term $V_i^0(t)$

Let us discuss the term $V_i^0(t)$ in this subsection. For any $r\in\mathbb{R}$, let $\tilde r=\tilde r(\omega)=((r-2m^{1/2}\tau)\vee0)\wedge T\wedge\sigma(\omega)$. Notice that by Corollary 3.2.3,
\[
\nabla U_i\bigl(X_i(\tilde r)-\psi^0(m^{-1/2}(s-r),x,v;X(\tilde r))\bigr)\ne0 \;\Rightarrow\; |m^{-1/2}(s-r)|\le2\tau.
\]
So for $s\in[4m^{1/2}\tau,\infty)$,
\[
r<2m^{1/2}\tau \;\Rightarrow\; \nabla U_i\bigl(X_i(\tilde r)-\psi^0(m^{-1/2}(s-r),x,v;X(\tilde r))\bigr)=0.
\]
Therefore, we have the following decomposition:
\[
V_i^0(t) = V_i^{01}(t)+V_i^{02}(t)+V_i^{03}(t)-V_i^{04}(t)+V_i^{05}(t),
\]
with
\[
V_i^{01}(t) = \int_0^{t\wedge\sigma_n}\mathbf{1}_{[4m^{1/2}\tau,\infty)}(s)\,ds\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(s)-\psi^0(m^{-1/2}(s-r),x,v;X(s))\bigr)\,\lambda(dr,dx,dv),
\]
\[
V_i^{02}(t) = \int_0^{t\wedge\sigma_n}\mathbf{1}_{[4m^{1/2}\tau,\infty)}(s)\,ds\int_{\mathbb{R}\times E}f_i(s,r,x,v)\,\mu_\omega(dr,dx,dv),
\]
\[
V_i^{03}(t) = \int_0^{t\wedge\sigma_n}ds\int_{(2m^{1/2}\tau,\infty)\times E}\nabla U_i\bigl(X_i(\tilde r)-\psi^0(m^{-1/2}(s-r),x,v;X(\tilde r))\bigr)\,(\mu_\omega(dr,dx,dv)-\lambda(dr,dx,dv)),
\]
\[
V_i^{04}(t) = \int_0^{t\wedge\sigma_n}\mathbf{1}_{[0,4m^{1/2}\tau)}(s)\,ds\int_{[2m^{1/2}\tau,\infty)\times E}\nabla U_i\bigl(X_i(\tilde r)-\psi^0(m^{-1/2}(s-r),x,v;X(\tilde r))\bigr)\,(\mu_\omega(dr,dx,dv)-\lambda(dr,dx,dv)),
\]
\[
V_i^{05}(t) = \int_0^{t\wedge\sigma_n}\mathbf{1}_{[4m^{1/2}\tau,\infty)}(s)\,ds\int_{\mathbb{R}\times E}\widetilde F_i^{05}(s,r,x,v)\,\lambda(dr,dx,dv),
\]
where
\[
f_i(s,r,x,v) = \nabla U_i\bigl(X_i(s)-x(s,\Psi(r,x,m^{-1/2}v))\bigr)-\nabla U_i\bigl(X_i(\tilde r)-\psi^0(m^{-1/2}(s-r),x,v;X(\tilde r))\bigr),
\]
\[
\widetilde F_i^{05}(s,r,x,v) = -\bigl\{\nabla U_i\bigl(X_i(s)-\psi^0(m^{-1/2}(s-r),x,v;X(s))\bigr)-\nabla U_i\bigl(X_i(\tilde r)-\psi^0(m^{-1/2}(s-r),x,v;X(\tilde r))\bigr)\bigr\}.
\]
We discuss each term in the above decomposition in the following. We will show that $V_i^{02}(t)$ and $V_i^{05}(t)$ give us the "smooth" term in (3.30), and the martingale part of $V_i^{03}(t)$ gives us the "martingale" term there (see the end of Sec. 4).

For the term $V_i^{02}$, we have by definition
\[
\frac{d}{dt}V_i^{02}(t) = \mathbf{1}_{(4m^{1/2}\tau,\sigma)}(t)\int_{\mathbb{R}\times E}f_i(t,r,x,v;\omega)\,\mu_\omega(dr,dx,dv).
\]
By definition and assumption, we have that $\lambda_m(dr,dx,dv)=0$ if $|v|\le2C_0+1$. Also, by Proposition 3.6.5 and Corollary 3.2.3, $f_i(t,r,x,v)=0$ if $|r-t|\ge2m^{1/2}\tau$. So we only need to consider the case where $t\in[4m^{1/2}\tau,T\wedge\sigma)$, $r\in[2m^{1/2}\tau,T\wedge\sigma(\omega)+2m^{1/2}\tau]$ and $|v|\ge2C_0+1$. Before going further, we first show the following, with the help of Proposition 3.6.5, Corollary 3.2.3 (which claimed that both of the two interactions exist
only for a certain range of $t-r$), and Proposition 3.6.3 (which gave an estimate for the error of our approximation of $x(t,\Psi(r,x,m^{-1/2}v))$).

Lemma 4.3.1. There exists a constant $C>0$ such that
\[
|f_i(t,r,x,v)| \le \mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[-m^{1/2}\tau,2m^{1/2}\tau)}(t-r)\cdot Cm^{1/2},
\]
if $t\in[4m^{1/2}\tau,T\wedge\sigma]$, $|r-2m^{1/2}\tau|\le T\wedge\sigma(\omega)$ and $|v|\ge2C_0+1$.

Proof. First, since $t\in[0,T\wedge\sigma_n)$, we have by Proposition 3.6.5 that $\nabla U_i(X_i(t)-x(t,\Psi(r,x,m^{-1/2}v)))=0$ if $t-r>2m^{1/2}\tau$ or $t-r<-m^{1/2}\tau$. Also, since $\tilde r\in[0,T\wedge\sigma_n)$ by definition, we have $|X_i(\tilde r)|\le|X_{i,0}|+nT$, so by Corollary 3.2.3, $\nabla U_i(X_i(\tilde r)-\psi^0(m^{-1/2}(t-r),x,v;X(\tilde r)))=0$ if $t-r\ge2m^{1/2}\tau$ or $t-r\le-m^{1/2}\tau$. Combining the above, we get that $f_i(t,r,x,v)=0$ if $r\notin[t-2m^{1/2}\tau,t+m^{1/2}\tau]$.

Next, for $r\in[t-2m^{1/2}\tau,t+m^{1/2}\tau]$, if $|x|\ge R_0+1$, since $x\cdot v=0$, we get easily that $|x(t,\Psi(r,x,m^{-1/2}v))|=|x-(r-t)m^{1/2}v|\ge|x|\ge R_0+1$, hence both of the terms of $f_i(t,r,x,v)$ are equal to 0.

Finally, we show, for $|x|<R_0+1$ and $r\in[t-2m^{1/2}\tau,t+m^{1/2}\tau]$, that $|f_i(t,r,x,v)|\le Cm^{1/2}$. For this kind of $x$ and $r$, since $t\in[4m^{1/2}\tau,T\wedge\sigma(\omega)]$, we have by definition $2m^{1/2}\tau\le r\le T\wedge\sigma+m^{1/2}\tau$, so $\tilde r=r-2m^{1/2}\tau$. We have
\[
|f_i(t,r,x,v)| \le \|\nabla^2U_i\|_\infty\bigl(|X_i(t)-X_i(\tilde r)|+|x(t,\Psi(r,x,m^{-1/2}v))-\psi^0(m^{-1/2}(t-r),x,v;X(\tilde r))|\bigr).
\]
The term involving $X$ is easy. Indeed, since $t,\tilde r\in[0,T\wedge\sigma(\omega)]$, we have by definition
\[
|X_i(t)-X_i(\tilde r)| \le n|t-\tilde r| = n|t-(r-2m^{1/2}\tau)| \le n(|t-r|+2m^{1/2}\tau) \le n\,4m^{1/2}\tau.
\]
We next deal with the second absolute value above. Notice that by assumption, $0\le r-2m^{1/2}\tau\le T\wedge\sigma(\omega)$, $0\le r-m^{1/2}\tau\le T\wedge\sigma(\omega)$ and $0\le t\le T\wedge\sigma(\omega)$. Therefore, by Proposition 3.6.3 (3) (with $(t,s,a)$ there given by $(m^{-1/2}(t-r),r,2\tau)$), there exists a constant $\widetilde C$ such that
\[
\bigl|x(t,\Psi(r,x,m^{-1/2}v))-\psi^0(m^{-1/2}(t-r),x,v;X(r-2m^{1/2}\tau))\bigr| \le m^{1/2}\widetilde C(2\tau+2\tau).
\]
Combining the above, we get our assertion.
Now we are ready to prove the following result concerning the term $V_i^{02}(t)$.

Lemma 4.3.2. We have that
\[
\sup_{m\in(0,1]}\sup_{0\le t\le T}E^{P_m}\Bigl[\Bigl|\frac{d}{dt}V_i^{02}(t)\Bigr|^2\Bigr]<\infty.
\]
In particular, $\{$the distribution of $\{V_i^{02}(t)\}_{t\in[0,T]}$ under $P_m\}_{m\in(0,1]}$ is tight in $\wp(D([0,T];\mathbb{R}^d))$.
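Before giving the proof, we remark that this result is natural since by Lemma 4.3.1, $f_i(s,r,x,v)$ is not 0 only if $r$ is very near to $s$, which implies by Proposition 3.6.3(3) and 3.6.4 that $\psi^0(m^{-1/2}(s-r),x,v;X(r-2m^{1/2}\tau))$ is a good approximation of $x(s,\Psi(r,x,m^{-1/2}v))$.

Proof. By Lemma 4.3.1, we have
\[
\Bigl|\frac{d}{dt}V_i^{02}(t)\Bigr| \le Cm^{1/2}\int_{\mathbb{R}\times E}\mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[-m^{1/2}\tau,2m^{1/2}\tau)}(t-r)\,\mu_\omega(dr,dx,dv).
\]
Therefore,
\[
\begin{aligned}
E^{P_m}\Bigl[\Bigl|\frac{d}{dt}V_i^{02}(t)\Bigr|^2\Bigr]
&\le E^{P_m}\Bigl[\Bigl(Cm^{1/2}\int_{\mathbb{R}\times E}\mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[-m^{1/2}\tau,2m^{1/2}\tau)}(t-r)\,\mu_\omega(dr,dx,dv)\Bigr)^2\Bigr]\\
&\le C^2m\int_{\mathbb{R}\times E}\mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[-m^{1/2}\tau,2m^{1/2}\tau)}(t-r)\,\lambda_m(dr,dx,dv)\\
&\quad+\Bigl(Cm^{1/2}\int_{\mathbb{R}\times E}\mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[-m^{1/2}\tau,2m^{1/2}\tau)}(t-r)\,\lambda_m(dr,dx,dv)\Bigr)^2.
\end{aligned} \tag{4.8}
\]
Let $c=\sum_{i=1}^N\|U_i\|_\infty$, and $\widetilde C=3\tau[2(R_0+1)]^{d-1}\int_{\mathbb{R}^d}\rho_c(\frac12|v|^2)|v|\,dv$, which is finite by assumption. Then
\[
\begin{aligned}
&\int_{\mathbb{R}\times E}\mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[-m^{1/2}\tau,2m^{1/2}\tau)}(t-r)\,\lambda_m(dr,dx,dv)\\
&\quad= \int_{\mathbb{R}\times E}\mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[-m^{1/2}\tau,2m^{1/2}\tau)}(t-r)\,m^{-1}\rho\Bigl(\frac12|v|^2+\sum_{i=1}^NU_i(x-m^{-1/2}rv-X_{i,0})\Bigr)\,dr\,|v|\,\tilde\nu(dx;v)\,dv\\
&\quad\le m^{-1}\,3m^{1/2}\tau\int_E\mathbf{1}_{[0,R_0+1)}(|x|)\,\rho_c\Bigl(\frac12|v|^2\Bigr)|v|\,\tilde\nu(dx;v)\,dv\\
&\quad\le 3m^{-1/2}\tau[2(R_0+1)]^{d-1}\int_{\mathbb{R}^d}\rho_c\Bigl(\frac12|v|^2\Bigr)|v|\,dv = \widetilde Cm^{-1/2}.
\end{aligned} \tag{4.9}
\]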
Combining (4.8) and (4.9), we get that
\[
\widehat C := \sup_{m\in(0,1]}\sup_{0\le t\le T}E^{P_m}\Bigl[\Bigl|\frac{d}{dt}V_i^{02}(t)\Bigr|^2\Bigr]<\infty,
\]
which is exactly the first half of our assertion. Therefore, $E[|V_i^{02}(t)-V_i^{02}(t')|^2]\le\widehat C|t-t'|^2$, hence by Theorem 3.4.1 (with $\beta=\varepsilon=\gamma=1$), $\{\{V_i^{02}(t)\}_{t\in[0,T]}$ under $P_m\}_{m\in(0,1]}$ is tight in $\wp(D([0,T];\mathbb{R}^d))$.

We next deal with $V_i^{01}(t)$. By using Proposition 3.2.4, we show that it is equal to $m^{-1/2}\int_0^{t\wedge\sigma}\nabla_i\widetilde U(X(s))\,ds$, which gives us the "colliding" term in Theorem 2.0.1(4).

Lemma 4.3.3. There exists an $m_0>0$ (depending on $X_0$, $n$, $T$ and $U_i$, $i=1,\dots,N$) such that for any $m\le m_0$,
\[
V_i^{01}(t) = m^{-1/2}\int_0^{t\wedge\sigma}\nabla_i\widetilde U(X(s))\,ds,
\]
where $\widetilde U$ is as defined in Sec. 3.5.

Proof. Suppose that $\nabla U_i(X_i(s)-\psi^0(m^{-1/2}(s-r),x,v;X(s)))\ne0$. Then $s-r<2m^{1/2}\tau$ by Proposition 3.6.5; this combined with $s\ge4m^{1/2}\tau$ implies that $r>2m^{1/2}\tau=2m^{1/2}C_0^{-1}R_0$. Since $|v|\ge2C_0+1$ and $x\cdot v=0$ for $\lambda_m$-almost every $(r,x,v)$, this implies $|x-m^{-1/2}rv|\ge m^{-1/2}r|v|\ge R_0$, hence $U_i(X_{i,0}-(x-m^{-1/2}rv))=0$. Therefore, by definition, Proposition 3.2.4 and (3.39),
\[
\begin{aligned}
V_i^{01}(t) &= \int_0^{t\wedge\sigma_n}\mathbf{1}_{[4m^{1/2}\tau,\infty)}(s)\,ds\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(s)-\psi^0(m^{-1/2}(s-r),x,v;X(s))\bigr)\\
&\qquad\times m^{-1}\rho\Bigl(\frac12|v|^2+\sum_{j=1}^NU_j(x-m^{-1/2}rv-X_{j,0})\Bigr)\,dr\,\nu(dx,dv)\\
&= \int_0^{t\wedge\sigma_n}\mathbf{1}_{[4m^{1/2}\tau,\infty)}(s)\,ds\int_{\mathbb{R}\times E}\nabla U_i\bigl(X_i(s)-\psi^0(m^{-1/2}(s-r),x,v;X(s))\bigr)\,m^{-1}\rho\Bigl(\frac12|v|^2\Bigr)\,dr\,\nu(dx,dv)\\
&= \int_0^{t\wedge\sigma_n}\mathbf{1}_{[4m^{1/2}\tau,\infty)}(s)\,ds\;m^{-1/2}\int_{\mathbb{R}^{2d}}\nabla U_i(X_i(s)-x)\,\rho\Bigl(\frac12|v|^2+\sum_{k=1}^NU_k(x-X_{k,0})\Bigr)\,dx\,dv\\
&= \int_0^{t\wedge\sigma_n}\mathbf{1}_{[4m^{1/2}\tau,\infty)}(s)\,m^{-1/2}\nabla_i\widetilde U(X(s))\,ds,
\end{aligned}
\]
where we used Proposition 3.2.4 in passing to the third equality, and used (3.39) in passing to the last equality. So in order to complete the proof of our assertion, it suffices to show that $\nabla_i\widetilde U(X(s))=0$ for any $s\in[0,4m^{1/2}\tau\wedge\sigma]$, if $m$ is small enough. We show it from
now on. Notice that since $|X_{i,0}-X_{j,0}|>R_i+R_j$ for any $i,j=1,\dots,N$ with $i\ne j$ by assumption, there exists an $m_0>0$ (small enough) such that for any $m\le m_0$, we have $|X_{i,0}-X_{j,0}|>R_i+R_j+8m^{1/2}\tau n$ for any $i\ne j$. Also, by definition, we have $|X_i(s)-X_{i,0}|\le sn\le4m^{1/2}\tau n$ for any $s\in[0,4m^{1/2}\tau\wedge\sigma]$ and $i=1,\dots,N$. Therefore,
\[
\begin{aligned}
|X_i(s)-X_j(s)| &\ge |X_{i,0}-X_{j,0}|-|X_i(s)-X_{i,0}|-|X_j(s)-X_{j,0}|\\
&> R_i+R_j+8m^{1/2}\tau n-4m^{1/2}\tau n-4m^{1/2}\tau n = R_i+R_j,
\end{aligned}
\]
so by (3.31), $\nabla_i\widetilde U(X(s))=0$ for any $s\in[0,4m^{1/2}\tau\wedge\sigma]$. This completes the proof of our assertion.

Before discussing the term $V_i^{05}(t)$, let us first prepare, by using Gronwall's Lemma, the continuity of $\psi^0(t,x,v;X)$ with respect to $X$:

Lemma 4.3.4. For any $Y>0$, there exists a constant $C$ (depending on $\max_{i=1}^NR_i+Y$, $\tau$, $C_0$ and $\sum_{i=1}^N\|\nabla^2U_i\|_\infty$) such that
\[
|\psi^0(t,x,v;X^1)-\psi^0(t,x,v;X^2)| \le C\|X^1-X^2\|_{\mathbb{R}^d},
\]
for any $(x,v)\in E$, $|v|\ge2C_0+1$, $|t|\le2\tau$ and $|X^1|,|X^2|\le Y$.

Proof. Choose and fix any $v\in\mathbb{R}^d$ with $|v|\ge2C_0+1$, and let $s_0=\frac{\max_{i=1}^NR_i+Y}{|v|}\vee2\tau$. Let $g(t)=\psi^0(t,x,v;X^1)-\psi^0(t,x,v;X^2)$. Then by definition,
\[
g(t) = \varphi^0(t+s_0,x-s_0v,v;X^1)-\varphi^0(t+s_0,x-s_0v,v;X^2),
\]
so
\[
\frac{d^2}{dt^2}g(t) = -\sum_{i=1}^N\nabla U_i\bigl(\varphi^0(t+s_0,x-s_0v,v;X^1)-X_i^1\bigr)+\sum_{i=1}^N\nabla U_i\bigl(\varphi^0(t+s_0,x-s_0v,v;X^2)-X_i^2\bigr).
\]
Let $C=\sum_{i=1}^N\|\nabla^2U_i\|_\infty$, then
\[
\Bigl|\frac{d^2}{dt^2}g(t)\Bigr| \le \sum_{i=1}^N\|\nabla^2U_i\|_\infty\bigl(|g(t)|+|X_i^1-X_i^2|\bigr) \le C\bigl(|g(t)|+\|X^1-X^2\|_{\mathbb{R}^d}\bigr),
\]
therefore,
\[
\frac{d}{dt}\Bigl|\Bigl(g(t),\frac{d}{dt}g(t)\Bigr)\Bigr| \le \Bigl|\frac{d^2}{dt^2}g(t)\Bigr|+\Bigl|\frac{d}{dt}g(t)\Bigr| \le C\|X^1-X^2\|_{\mathbb{R}^d}+(1+C)\Bigl|\Bigl(g(t),\frac{d}{dt}g(t)\Bigr)\Bigr|.
\]
Also, $g(-s_0)=\frac{d}{dt}g(-s_0)=0$. Let $h(t)=|(g(t-s_0),\frac{d}{dt}g(t-s_0))|$. Then $h(0)=0$, and for any $t\in[0,s_0+2\tau]$,
\[
h(t) \le C\|X^1-X^2\|_{\mathbb{R}^d}(s_0+2\tau)+(1+C)\int_0^th(s)\,ds,
\]
so by Gronwall's Lemma,
\[
h(t) \le C\|X^1-X^2\|_{\mathbb{R}^d}(s_0+2\tau)e^{(1+C)(s_0+2\tau)},\qquad t\in[0,s_0+2\tau].
\]
Notice that since $|v|\ge2C_0+1$, we have $2\tau\le s_0\le\frac{\max_{i=1}^NR_i+Y}{2C_0+1}\vee2\tau$. Therefore,
\[
|g(t)| \le h(t+s_0) \le C\Bigl(\frac{\max_{i=1}^NR_i+Y}{2C_0+1}\vee2\tau+2\tau\Bigr)e^{(1+C)\bigl(\frac{\max_{i=1}^NR_i+Y}{2C_0+1}\vee2\tau+2\tau\bigr)}\|X^1-X^2\|_{\mathbb{R}^d},
\]
for any $t\in[-2\tau,2\tau]$. This completes the proof of our assertion.

We use Lemma 4.3.4 to prove the following:

Lemma 4.3.5. There exists a constant $C>0$ such that
\[
|\widetilde F_i^{05}(s,r,x,v)| \le Cm^{1/2}\,\mathbf{1}_{[0,2m^{1/2}\tau]}(|s-r|)\,\mathbf{1}_{[0,R_0+1)}(|x|)
\]
for $s\in[4m^{1/2}\tau,T\wedge\sigma_n]$.

Proof. First, since $s,r\in[0,T\wedge\sigma(\omega)]$ in our domain, it is easy to see that $|\widetilde F_i^{05}(s,r,x,v)|=0$ if $|x|\ge R_0+1$. Also, by Corollary 3.2.3, $|\widetilde F_i^{05}(s,r,x,v)|=0$ if $|m^{-1/2}(s-r)|\ge2\tau$. Finally, for $|x|\le R_0+1$ and $|s-r|\le2m^{1/2}\tau$, by definition and Lemma 4.3.4, we only need to show the following:
\[
|X_i(s)-X_i(\tilde r)| \le Cm^{1/2},\qquad s\ge4m^{1/2}\tau. \tag{4.10}
\]
To show (4.10), again, notice that in the present setting, $0\le r-2m^{1/2}\tau\le T\wedge\sigma$, so $\tilde r=r-2m^{1/2}\tau$. So the left-hand side of (4.10) equals $|X_i(s)-X_i(r-2m^{1/2}\tau)|\le n|s-(r-2m^{1/2}\tau)|\le n(|s-r|+2m^{1/2}\tau)\le n\,4m^{1/2}\tau$. This completes the proof of our assertion.

By Lemma 4.3.5, we get the following lemma in the same way as we derived Lemma 4.3.2 from Lemma 4.3.1.

Lemma 4.3.6. (1) $\sup_{m\in(0,1]}\sup_{0\le t\le T}E^{P_m}[|\frac{d}{dt}V_i^{05}(t)|^2]<\infty$,
(2) $\{$the distribution of $\{V_i^{05}(t)\}_{t\in[0,T]}$ under $P_m\}_{m\in(0,1]}$ is tight in $\wp(D([0,T];\mathbb{R}^d))$.

We show that the term $V_i^{04}$ is negligible. Precisely, we show the following:

Lemma 4.3.7.
\[
E^{P_m}\Bigl[\sup_{0\le t\le T}|V_i^{04}(t)|^2\Bigr]\to0\quad\text{as } m\to0.
\]
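Proof. The proof is similar to the previous ones; it is easier than that of Lemma 4.2.3, where we had to show first that the expectation is 0 (see (4.5)), whereas now we are considering only the variance from the very beginning. We have for any $s\in[0,4m^{1/2}\tau]$ that
\[
\begin{aligned}
\bigl|\nabla U_i\bigl(X_i(\tilde r)-\psi^0(m^{-1/2}(s-r),x,v;X(\tilde r))\bigr)\bigr| &\le \|\nabla U_i\|_\infty\,\mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[0,2m^{1/2}\tau)}(|s-r|)\\
&\le \|\nabla U_i\|_\infty\,\mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r).
\end{aligned}
\]
Let $C_4=8\|\nabla U_i\|_\infty^2\tau(2(R_0+1))^{d-1}\int_{\mathbb{R}^d}\rho_c(\frac12|v|^2)|v|\,dv$, which is finite. Then we have by the definition of $\lambda$ and the assumption A2 that
\[
\begin{aligned}
&E^{P_m}\Bigl[\Bigl|\int_{[2m^{1/2}\tau,\infty)\times E}\nabla U_i\bigl(X_i(\tilde r)-\psi^0(m^{-1/2}(s-r),x,v;X(\tilde r))\bigr)\,(\mu_\omega(dr,dx,dv)-\lambda(dr,dx,dv))\Bigr|^2\Bigr]\\
&\quad= E\Bigl[\int_{[2m^{1/2}\tau,\infty)\times E}\bigl|\nabla U_i\bigl(X_i(\tilde r)-\psi^0(m^{-1/2}(s-r),x,v;X(\tilde r))\bigr)\bigr|^2\,\lambda(dr,dx,dv)\Bigr]\\
&\quad\le \int_{[2m^{1/2}\tau,\infty)\times E}\|\nabla U_i\|_\infty^2\,\mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[-2m^{1/2}\tau,6m^{1/2}\tau]}(r)\,m^{-1}\rho\Bigl(\frac12|v|^2+\sum_{j=1}^NU_j(x-m^{-1/2}rv-X_{j,0})\Bigr)\,dr\,\nu(dx,dv)\\
&\quad\le \|\nabla U_i\|_\infty^2\,8m^{1/2}\tau(2(R_0+1))^{d-1}m^{-1}\int_{\mathbb{R}^d}\rho_c\Bigl(\frac12|v|^2\Bigr)|v|\,dv = C_4m^{-1/2}.
\end{aligned} \tag{4.11}
\]
Therefore,
\[
\begin{aligned}
E^{P_m}\Bigl[\sup_{0\le t\le T}|V_i^{04}(t)|^2\Bigr] &\le E^{P_m}\Bigl[4m^{1/2}\tau\int_0^{4m^{1/2}\tau}\Bigl|\int_{[2m^{1/2}\tau,\infty)\times E}\nabla U_i\bigl(X_i(\tilde r)-\psi^0(m^{-1/2}(s-r),x,v;X(\tilde r))\bigr)\,(\mu_\omega(dr,dx,dv)-\lambda(dr,dx,dv))\Bigr|^2ds\Bigr]\\
&\le (4m^{1/2}\tau)^2C_4m^{-1/2},
\end{aligned}
\]
which converges to 0 as $m\to0$. This completes the proof of our assertion.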
Now, the only term left to be discussed is $V_i^{03}$. We deal with it in the next subsection.

4.4. The term $V_i^{03}$

We deal with the term $V_i^{03}$ in this subsection. More precisely, we show that it is equal to a martingale plus a negligible term.

We first prepare some notations. Let $\mathcal{F}_t=\mathcal{F}_t^{(m,n)}=\mathcal{F}_{(-\infty,2m^{1/2}\tau+t)\times E}\vee\aleph$ as in Sec. 3.5. Then $\mathcal{F}_t$ is increasing and right continuous. Let
\[
N((0,t]\times A) := \mu_\omega((2m^{1/2}\tau,2m^{1/2}\tau+t]\times A)
\]
for any $A\in\mathcal{B}(E)$. Notice that if $\rho(\frac12|v|^2+\sum_{j=1}^NU_j(X_{j,0}-(x-m^{-1/2}rv)))>0$, then $|v|\ge2C_0+1$, hence if $r\ge m^{1/2}\tau$ in addition, then $|x-m^{-1/2}rv|\ge\tau|v|>R_0$, so $\rho(\frac12|v|^2+\sum_{j=1}^NU_j(X_{j,0}-(x-m^{-1/2}rv)))=\rho(\frac12|v|^2)$. Therefore, if we let
\[
\tilde\nu(dx,dv) = \rho\Bigl(\frac12|v|^2\Bigr)\nu(dx,dv),
\]
then $N$ is the $\mathcal{F}_t$-adapted Poisson point process with intensity measure $\tilde\lambda(dt,dx,dv)=m^{-1}dt\,\tilde\nu(dx,dv)=m^{-1}dt\,\rho(\frac12|v|^2)\nu(dx,dv)$. Notice that $N((s,t]\times A)$ is independent of $\mathcal{F}_s$ for any $s<t$ and $A\in\mathcal{B}(E)$. Let
\[
\bar N(dt,dx,dv) = N(dt,dx,dv)-m^{-1}dt\,\tilde\nu(dx,dv).
\]
Notice that $X_i(t\wedge\sigma)$ and $V_i(t\wedge\sigma)$ are $\mathcal{F}_t$-measurable. Also, since $\nabla U_i(X_i(\tilde r)-\psi^0(m^{-1/2}(s-r),x,v;X(\tilde r)))\ne0$ only if $|m^{-1/2}(s-r)|\le2\tau$, which combined with $r\ge2m^{1/2}\tau$ and $s\le T\wedge\sigma$ implies $\tilde r=r-2m^{1/2}\tau$, we get by definition that
\[
\begin{aligned}
V_i^{03}(t) &= \int_0^{t\wedge\sigma}ds\int_{[2m^{1/2}\tau,2m^{1/2}\tau+(T\wedge\sigma))\times E}\nabla U_i\bigl(X_i(r-2m^{1/2}\tau)-\psi^0(m^{-1/2}(s-r),x,v;X(r-2m^{1/2}\tau))\bigr)\\
&\qquad\times(\mu_\omega(dr,dx,dv)-\lambda(dr,dx,dv))\\
&= \int_0^{t\wedge\sigma}ds\int_{[0,T\wedge\sigma)\times E}\nabla U_i\bigl(X_i(r)-\psi^0(m^{-1/2}(s-r)-2\tau,x,v;X(r))\bigr)\,\bar N(dr,dx,dv).
\end{aligned}
\]
In the last expression above, if $r>t\wedge\sigma$, then since $s\le t\wedge\sigma$, we get $m^{-1/2}(s-r)-2\tau<-\tau$, hence $\nabla U_i(X_i(r)-\psi^0(m^{-1/2}(s-r)-2\tau,x,v;X(r)))=0$. Therefore,
\[
V_i^{03}(t) = \int_0^{t\wedge\sigma}ds\int_{[0,t\wedge\sigma)\times E}\nabla U_i\bigl(X_i(r)-\psi^0(m^{-1/2}(s-r)-2\tau,x,v;X(r))\bigr)\,\bar N(dr,dx,dv).
\]
Let
\[
\widetilde V_i^{03}(t) = \int_0^tds\int_{(0,t]\times E}\bar N(dr,dx,dv)\,\nabla U_i\bigl(X_i(r\wedge\sigma)-\psi^0(m^{-1/2}(s-r)-2\tau,x,v;X(r\wedge\sigma))\bigr).
\]
Then $V_i^{03}(t)=\widetilde V_i^{03}(t\wedge\sigma)$. By Corollary 3.2.3, $\nabla U_i(X_i(r\wedge\sigma)-\psi^0(u,x,v;X(r\wedge\sigma)))=0$ if $|u|\ge2\tau$. So the integral domain $s\in[0,t]$ in the definition of $\widetilde V_i^{03}(t)$, which is equivalent to $s-r\in[-r,t-r]$, can be substituted by $s-r\in[0,(t-r)\wedge4m^{1/2}\tau]=[0,4m^{1/2}\tau]\setminus[(t-r)\wedge(4m^{1/2}\tau),4m^{1/2}\tau]$. Therefore, $\widetilde V_i^{03}(t)$ can be decomposed into
\[
\widetilde V_i^{03}(t) = \widetilde M_i(t)+\eta_i(t),
\]
where
\[
\widetilde M_i(t) = \int_0^{4m^{1/2}\tau}ds\int_{(0,t]\times E}\bar N(dr,dx,dv)\,\nabla U_i\bigl(X_i(r\wedge\sigma)-\psi^0(m^{-1/2}s-2\tau,x,v;X(r\wedge\sigma))\bigr),
\]
\[
\eta_i(t) = -\int_{(0,t]\times E}\bar N(dr,dx,dv)\int_{(t-r)\wedge(4m^{1/2}\tau)}^{4m^{1/2}\tau}ds\,\nabla U_i\bigl(X_i(r\wedge\sigma)-\psi^0(m^{-1/2}s-2\tau,x,v;X(r\wedge\sigma))\bigr).
\]
By definition (notice that the integral domain $(0,t]\times E$ in the definition of $\widetilde V_i^{03}(t)$ can always be converted into $(0,T]\times E$ whenever necessary, and vice versa),
\[
\frac{d}{dt}\widetilde V_i^{03}(t) = \int_{(0,t]\times E}\bar N(dr,dx,dv)\,\nabla U_i\bigl(X_i(r\wedge\sigma)-\psi^0(m^{-1/2}(t-r)-2\tau,x,v;X(r\wedge\sigma))\bigr),
\]
so with $C_1=4\tau\|\nabla U_i\|_\infty^2(2(R_0+1))^{d-1}\int_{\mathbb{R}^d}\rho(\frac12|v|^2)|v|\,dv$, we have
\[
\begin{aligned}
E^{P_m}\Bigl[\Bigl|\frac{d}{dt}\widetilde V_i^{03}(t)\Bigr|^2\Bigr] &= E\Bigl[\int_{(0,t]\times E}\bigl|\nabla U_i\bigl(X_i(r\wedge\sigma)-\psi^0(m^{-1/2}(t-r)-2\tau,x,v;X(r\wedge\sigma))\bigr)\bigr|^2\,\tilde\lambda(dr,dx,dv)\Bigr]\\
&\le \int_{(0,t]\times E}\|\nabla U_i\|_\infty^2\,\mathbf{1}_{[0,R_0+1)}(|x|)\,\mathbf{1}_{[0,2\tau]}(|m^{-1/2}(t-r)-2\tau|)\,m^{-1}dr\,\rho\Bigl(\frac12|v|^2\Bigr)\nu(dx,dv)\\
&\le 4m^{1/2}\tau\|\nabla U_i\|_\infty^2(2(R_0+1))^{d-1}m^{-1}\int_{\mathbb{R}^d}\rho\Bigl(\frac12|v|^2\Bigr)|v|\,dv = C_1m^{-1/2}.
\end{aligned} \tag{4.12}
\]
This fact will be used later.

Let us study the term $\widetilde M_i(t)$. First, it is easy to see by definition that $\widetilde M_i(t)$ is a $\mathcal{F}_t$-martingale, with its jumps satisfying $|\Delta\widetilde M_i|\le4m^{1/2}\tau\|\nabla U_i\|_\infty$. Also, with $C=(4\tau)^2\|\nabla U_i\|_\infty^2(2(R_0+1))^{d-1}\int_{\mathbb{R}^d}\rho(\frac12|v|^2)|v|\,dv$, we have that, for any $0\le s\le t\le T$,
\[
\begin{aligned}
&E^{P_m}\bigl[|\widetilde M_i(t)-\widetilde M_i(s)|^2\,\big|\,\mathcal{F}_s\bigr]\\
&\quad= E^{P_m}\Bigl[\int_{(s,t)\times E}\Bigl|\int_0^{4m^{1/2}\tau}\nabla U_i\bigl(X_i(r)-\psi^0(m^{-1/2}u-\tau,x,v;X(r))\bigr)\,du\Bigr|^2\,\mathbf{1}_{[0,R_0+1)}(|x|)\,m^{-1}dr\,\rho\Bigl(\frac12|v|^2\Bigr)\nu(dx,dv)\,\Big|\,\mathcal{F}_s\Bigr]\\
&\quad\le C|t-s|,
\end{aligned} \tag{4.13}
\]
hence for any $0\le r\le s\le t\le T$,
\[
E^{P_m}\bigl[|\widetilde M_i(t)-\widetilde M_i(s)|^2\,|\widetilde M_i(s)-\widetilde M_i(r)|^2\bigr] \le C^2|t-s||s-r|. \tag{4.14}
\]
Also, by Doob's inequality and (4.13), we get
\[
E^{P_m}\Bigl[\sup_{t\in[0,T]}|\widetilde M_i(t)|\Bigr] \le E^{P_m}\Bigl[\sup_{t\in[0,T]}|\widetilde M_i(t)|^2\Bigr]^{1/2} \le 2\sup_{t\in[0,T]}E^{P_m}\bigl[|\widetilde M_i(t)|^2\bigr]^{1/2} \le 2\sup_{t\in[0,T]}\sqrt{Ct} = 2\sqrt{CT}<\infty. \tag{4.15}
\]
By Theorem 3.4.1 (with $\varepsilon=1$, $\beta=2$ and $\gamma=1/2$), (4.13)–(4.15) imply the following:

Lemma 4.4.1. $\{$The distribution of $\{\widetilde M_i(t)\}_{t\in[0,T]}$ under $P_m\}_{m\in(0,1]}$ is tight in $\wp(D([0,T];\mathbb{R}^d))$.

We next show that under any of its cluster points as $m\to0$, the canonical process is continuous with probability 1. We first make the following preparation.
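Lemma 4.4.2. For any $\varepsilon\in(0,1]$, let
\[
A = \bigcap_{\delta>0}\Bigl\{\omega\in D([0,T];\mathbb{R}^d):\ \sup_{|t-s|\le\delta}|\omega(t)-\omega(s)|>\varepsilon\Bigr\},
\]
\[
B = \bigcap_{\delta>0}\Bigl\{\omega\in D([0,T];\mathbb{R}^d):\ \sup_{|t-s|\le e\delta}|\omega(t)-\omega(s)|>\frac{\varepsilon}{2}\Bigr\}.
\]
Then $A\subset\bar A\subset B^o\subset B$. Here $\bar A$ and $B^o$ mean the closure of $A$ and the interior of $B$ in $(D([0,T];\mathbb{R}^d),d_0)$, respectively.

Proof. For any $\omega_0\in A$ and $\omega\in D([0,T];\mathbb{R}^d)$ with $d_0(\omega,\omega_0)<\frac{\varepsilon}{5}$, we have that $\omega\in B$. Indeed, by definition, there exists a continuous non-decreasing function $\lambda:[0,T]\to[0,T]$ such that $\lambda(0)=0$, $\lambda(T)=T$, and
\[
|\lambda(t)-\lambda(s)| \le e^{\varepsilon/4}|t-s| \le e|t-s|\quad\text{for any } 0\le s<t\le T,\qquad \sup_{0\le t\le T}|\omega_0(t)-\omega(\lambda(t))|\le\varepsilon/4.
\]
Therefore,
\[
\begin{aligned}
\sup_{|t-s|\le e\delta}|\omega(t)-\omega(s)| &= \sup_{|\lambda(t)-\lambda(s)|\le e\delta}|\omega(\lambda(t))-\omega(\lambda(s))| \ge \sup_{|t-s|\le\delta}|\omega(\lambda(t))-\omega(\lambda(s))|\\
&\ge \sup_{|t-s|\le\delta}|\omega_0(t)-\omega_0(s)|-\sup_{0\le t\le T}|\omega_0(t)-\omega(\lambda(t))|-\sup_{0\le s\le T}|\omega_0(s)-\omega(\lambda(s))|\\
&> \varepsilon-\frac{\varepsilon}{4}-\frac{\varepsilon}{4} = \varepsilon/2,
\end{aligned}
\]
which means that $\omega\in B$. This completes the proof of our assertion.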
Now, we are ready to prove the continuity of canonical processes of cluster points of $\{\{\widetilde M_i(t)\}_{t\in[0,T]}$ under $P_m\}_{m\to0}$.

Lemma 4.4.3. Any cluster point of $\{\{\widetilde M_i(t)\}_{t\in[0,T]}$ under $P_m\}_{m\to0}$ in $\wp(D([0,T];\mathbb{R}^d))$ must have continuous canonical processes.

Proof. Suppose there exists a sequence $m_k\to0$ (as $k\to\infty$) such that $P_{m_k}\circ(\widetilde M_i)^{-1}$ (which we write as $Q_k$ for the sake of simplicity) converges to some $Q_\infty\in\wp(D([0,T];\mathbb{R}^d))$ as $k\to\infty$. We show that the canonical process under $Q_\infty$ is continuous with probability 1. Suppose not. Then there exists a constant $\varepsilon>0$ such that
\[
Q_\infty\Bigl(\bigcap_{\delta>0}\Bigl\{\omega\in D([0,T];\mathbb{R}^d):\ \sup_{|t-s|\le\delta}|\omega(t)-\omega(s)|>\varepsilon\Bigr\}\Bigr) = a>0.
\]
Without loss of generality, we assume that $\varepsilon\le1$. Let $A$ and $B$ be the sets defined in Lemma 4.4.2. Then $Q_\infty(A)=a>0$, so by Lemma 4.4.2, $Q_\infty(B^o)\ge a>0$. Also, $B^o$ is an open set, and $Q_k\to Q_\infty$ weakly in $\wp(D([0,T];\mathbb{R}^d))$, so we have $\liminf_{k\to\infty}Q_k(B^o)\ge Q_\infty(B^o)$. Therefore, there exists an $N\in\mathbb{N}$ such that for any $k\ge N$, $Q_k(B^o)\ge\frac a2$, hence $Q_k(B)\ge\frac a2$, which means that $P_{m_k}(\widetilde M_i$ has a jump greater than $\varepsilon/2)\ge\frac a2$. Since $m_k\to0$ as $k\to\infty$, this yields a contradiction with the fact that all of the jumps of $\widetilde M_i$ under $P_{m_k}$ are smaller than $4m_k^{1/2}\tau\|\nabla U_i\|_\infty$. This completes the proof of our assertion.

We next use Lemma 4.4.3 to show the following, which will be used later.

Lemma 4.4.4. For any $\varepsilon>0$, we have that
\[
\limsup_{m\to0}\limsup_{\delta\to0}P_m\Bigl(\sup_{0\le s\le t\le T,\,|s-t|\le\delta}|\widetilde M_i(t)-\widetilde M_i(s)|>\varepsilon\Bigr) = 0. \tag{4.16}
\]
Proof. Let $a(m,\delta)=P_m(\sup_{0\le s\le t\le T,\,|s-t|\le\delta}|\widetilde M_i(t)-\widetilde M_i(s)|>\varepsilon)$. If
\[
\limsup_{m\to0}\limsup_{\delta\to0}a(m,\delta)>0,
\]
then there exists a constant $a>0$ and sequences $\delta_k\to0$, $m_k\to0$ (as $k\to\infty$) such that
\[
P_{m_k}\Bigl(\sup_{0\le s\le t\le T,\,|s-t|\le\delta_k}|\widetilde M_i(t)-\widetilde M_i(s)|>\varepsilon\Bigr)\ge a \tag{4.17}
\]
for any $k\in\mathbb{N}$. As before, let $Q_k=P_{m_k}\circ(\widetilde M_i)^{-1}$, $k\in\mathbb{N}$. Also, let
\[
A_k = \Bigl\{\omega\in D([0,T];\mathbb{R}^d):\ \sup_{0\le s\le t\le T,\,|t-s|\le\delta_k}|\omega(t)-\omega(s)|>\varepsilon\Bigr\},
\]
\[
B_k = \Bigl\{\omega\in D([0,T];\mathbb{R}^d):\ \sup_{0\le s\le t\le T,\,|t-s|\le e\delta_k}|\omega(t)-\omega(s)|>\frac{\varepsilon}{2}\Bigr\}.
\]
Then $Q_k(A_k)>a$ by assumption, and by the same argument as in the proof of Lemma 4.4.2, we get that $A_k\subset\bar A_k\subset B_k^o\subset B_k$ for any $k\in\mathbb{N}$. Also, $A_k$ is monotone decreasing with respect to $k$, hence for any $\ell\ge k$, we have that $Q_\ell(A_k)\ge Q_\ell(A_\ell)>a$. Therefore, since $\bar A_k$ is a closed set, we get that
\[
Q_\infty(B_k) \ge Q_\infty(\bar A_k) \ge \limsup_{\ell\to\infty}Q_\ell(\bar A_k) \ge a.
\]
This is true for any $k\in\mathbb{N}$, so since $B_k$ is monotone decreasing with respect to $k$, we get that
\[
Q_\infty\Bigl(\bigcap_{k=1}^\infty B_k\Bigr)\ge a,
\]
which means that $Q_\infty(\{$canonical process has jump $\ge\varepsilon/2\})\ge a$, which contradicts Lemma 4.4.3. This completes the proof of our assertion.

Before dealing with $\eta_i(t)$, we prepare one more result about $\widetilde M_i(t)$, for later use.

Lemma 4.4.5. There exists a constant $C>0$ (not depending on $m$) such that
\[
\sup_{m\in(0,1]}E^{P_m}\Bigl[\sup_{t\in[0,T]}|\widetilde M_i(t)|^4\Bigr]\le C.
\]
¯ |4 ] ≤ Proof. fact of Poisson point process that E[| f dN 2By 2 the general 4 E[3( f dλ) + f dλ], we get with the help of Doob’s inequality that i (t)|4 E Pm sup |M t∈[0,T ]
i (T )|4 ] ≤ (4/3)4 E Pm [|M 4 λ(dr, dx, dv) = (4/3) E 3
4m1/2 τ
ds
0
(0,T ]×E
2 2 ∧ σ))) × ∇Ui (Xi (r ∧ σ) − ψ(m−1/2 s − 2τ, x, v; X(r
+
4m1/2 τ
λ(dr, dx, dv)
ds 0
(0,T ]×E
4 ∧ σ))) × ∇Ui (Xi (r ∧ σ) − ψ(m−1/2 s − 2τ, x, v; X(r 4 ≤ (4/3) 3
m
−1
ρ
(0,T ]×E
1 2 |v| drν(dx, dv) 2 2
× (4m
1/2
2
τ ∇Ui ∞ 1[0,R0 +1) (|x|))
+ (0,T ]×E
m−1 ρ
1 2 |v| drν(dx, dv)(4m1/2 τ ∇Ui ∞ 1[0,R0 +1) (|x|))4 2
August 10, J070-S0129055X10004077
2010 15:0 WSPC/S0129-055X
148-RMP
Classical Mechanical Model of Brownian Motion with Plural Particles
≤ (4/3)4 3(4τ ∇Ui ∞ )4
795
2 1 2 ρ |v |v|dv 2 Rd
T (2(R0 + 1))d−1
+ (4τ ∇Ui ∞ )4 mT (2(R0 + 1))d−1
1 2 ρ |v |v|dv . 2 Rd
The right-hand side above is dominated by a finite global constant for m ∈ (0, 1]. We next deal with ηi (t). First, we use some basic properties of Poisson point process to show that there exists a constant C such that E Pm [|ηi (t)|6 ] ≤ Cm3/2 ,
t ∈ [0, T ], m ∈ (0, 1].
In fact, notice that ηi (t) can be expressed as ¯ N (dr, dx, dv) ηi (t) = − [(t−4m1/2 τ )∨0,t]×E
(4.18)
4m1/2 τ
ds (t−r)∧(4m1/2 τ )
∧ σ))). × ∇Ui (Xi (r ∧ σ) − ψ 0 (m−1/2 s − 2τ, x, v; X(r Also, in general, if Z is a Poisson random variable with mean a, then we have E[Z −a] = 0, E[(Z −a)2 ] = E[(Z −a)3 ] = a, E[(Z −a)4 ] = 3a2 +a, and E[(Z −a)6 ] = 15a3 + 25a2 + a. Therefore, by definition of Poisson point process and a simple calculation, there exists a global constant C such that 6 ¯ E f dN ≤ CE
3 2 3 2 4 6 f dλ + f dλ + f dλ f dλ + f dλ , 2
for any measurable function f . We use this to prove (4.18). 4m1/2 τ ∧ σ)))ds|. Let A = | (t−r)∧(4m1/2 τ ) ∇Ui (Xi (r ∧ σ) − ψ 0 (m−1/2 s − 2τ, x, v; X(r
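Then since $t-r\ge0$, we get that $A\le4m^{1/2}\tau\|\nabla U_i\|_\infty$. Therefore,
\[
\begin{aligned}
E^{P_m}[|\eta_i(t)|^6]
&\le C\,E\bigg[\bigg(\int_{[(t-4m^{1/2}\tau)\vee0,\,t]\times E}A^2m^{-1}\rho\big(\tfrac12|v|^2\big)\,dr\,\nu(dx,dv)\bigg)^3\\
&\qquad+\bigg(\int_{[(t-4m^{1/2}\tau)\vee0,\,t]\times E}A^3m^{-1}\rho\big(\tfrac12|v|^2\big)\,dr\,\nu(dx,dv)\bigg)^2\\
&\qquad+\int_{[(t-4m^{1/2}\tau)\vee0,\,t]\times E}A^2m^{-1}\rho\big(\tfrac12|v|^2\big)\,dr\,\nu(dx,dv)\times\int_{[(t-4m^{1/2}\tau)\vee0,\,t]\times E}A^4m^{-1}\rho\big(\tfrac12|v|^2\big)\,dr\,\nu(dx,dv)\\
&\qquad+\int_{[(t-4m^{1/2}\tau)\vee0,\,t]\times E}A^6m^{-1}\rho\big(\tfrac12|v|^2\big)\,dr\,\nu(dx,dv)\bigg]\\
&\le C\bigg[\bigg(4m^{1/2}\tau\,(4m^{1/2}\tau\|\nabla U_i\|_\infty)^2m^{-1}(2(R_0+1))^d\int_{\mathbb R^d}\rho\big(\tfrac12|v|^2\big)|v|\,dv\bigg)^3\\
&\qquad+\bigg(4m^{1/2}\tau\,(4m^{1/2}\tau\|\nabla U_i\|_\infty)^3m^{-1}(2(R_0+1))^d\int_{\mathbb R^d}\rho\big(\tfrac12|v|^2\big)|v|\,dv\bigg)^2\\
&\qquad+4m^{1/2}\tau\,(4m^{1/2}\tau\|\nabla U_i\|_\infty)^2m^{-1}(2(R_0+1))^d\int_{\mathbb R^d}\rho\big(\tfrac12|v|^2\big)|v|\,dv\\
&\qquad\quad\times4m^{1/2}\tau\,(4m^{1/2}\tau\|\nabla U_i\|_\infty)^4m^{-1}(2(R_0+1))^d\int_{\mathbb R^d}\rho\big(\tfrac12|v|^2\big)|v|\,dv\\
&\qquad+4m^{1/2}\tau\,(4m^{1/2}\tau\|\nabla U_i\|_\infty)^6m^{-1}(2(R_0+1))^d\int_{\mathbb R^d}\rho\big(\tfrac12|v|^2\big)|v|\,dv\bigg],
\end{aligned}
\]
which gives us our assertion.

We use (4.18) to show the following, with the help of (4.12) (the estimate for the derivative of $\widetilde{V}_i^{03}$), Lemma 4.4.4 (the "continuity" of the limit of $M^i(t)$), and Lemma 4.4.5 (the estimate with respect to $|M^i(t)|^4$).

Lemma 4.4.6.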
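\[
\lim_{m\to0}E^{P_m}\Big[\sup_{0\le t\le T}|\eta_i(t)|^2\Big]=0.
\]
Proof. By (4.18),
\[
\sum_{k=0}^{[m^{-4/3}T]}E^{P_m}\big[|\eta_i(km^{4/3})|^6\big]\le Cm^{3/2}m^{-4/3}T\to0\quad\text{as } m\to0.
\]
In particular we have
\[
E^{P_m}\Big[\max_{0\le k\le[m^{-4/3}T]}|\eta_i(km^{4/3})|^6\Big]\to0\quad\text{as } m\to0. \tag{4.19}
\]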
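The discretization above balances the number of grid points, of order $m^{-4/3}T$, against the sixth-moment bound $Cm^{3/2}$ from (4.18), leaving a factor of order $m^{1/6}$. A minimal sketch of this bookkeeping follows; the function name and the default values $C=T=1$ are placeholders chosen only for illustration.

```python
import math


def grid_sixth_moment_bound(m, C=1.0, T=1.0):
    """Union-type bound: (number of grid points of mesh m^(4/3) in [0, T]) * C * m^(3/2)."""
    num_points = math.floor(m ** (-4.0 / 3.0) * T) + 1  # k = 0, ..., [m^(-4/3) T]
    return num_points * C * m ** 1.5


if __name__ == "__main__":
    for m in (1e-2, 1e-4, 1e-5):
        # the bound behaves like T * m^(1/6), up to the one extra grid point
        print(m, grid_sixth_moment_bound(m), m ** (1.0 / 6.0))
```

The printed values decrease with $m$, although only at the slow rate $m^{1/6}$, which is why the mesh exponent $4/3$ has to stay strictly below $3/2$.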
Since $\eta_i(t)$ is a càdlàg process, there exists a measurable $\xi_m:\Omega\to[0,T]$ such that
\[
|\eta_i(\xi_m)|\vee|\eta_i(\xi_m-)|=\sup_{0\le t\le T}|\eta_i(t)|. \tag{4.20}
\]
Also, the jumps of $\eta_i$ satisfy $|\Delta\eta_i|\le4m^{1/2}\tau\|\nabla U_i\|_\infty$, so $|\eta_i(\xi_m-)|\le|\eta_i(\xi_m)|+4m^{1/2}\tau\|\nabla U_i\|_\infty$. Let $\widetilde\xi_m=m^{4/3}[m^{-4/3}\xi_m]$. Then $0\le\xi_m-\widetilde\xi_m\le m^{4/3}$. Combining the above, we get that
\[
\begin{aligned}
E^{P_m}\Big[\sup_{0\le t\le T}|\eta_i(t)|^2\Big]
&=E^{P_m}\big[|\eta_i(\xi_m)|^2\vee|\eta_i(\xi_m-)|^2\big]\\
&\le2(4m^{1/2}\tau\|\nabla U_i\|_\infty)^2+2E^{P_m}[|\eta_i(\xi_m)|^2]\\
&\le2(4m^{1/2}\tau\|\nabla U_i\|_\infty)^2+4E^{P_m}\Big[\max_{0\le k\le[m^{-4/3}T]}|\eta_i(km^{4/3})|^2\Big]+4E^{P_m}\big[|\eta_i(\xi_m)-\eta_i(\widetilde\xi_m)|^2\big].
\end{aligned}
\]
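The first term on the right-hand side above evidently converges to 0 as $m\to0$. By (4.19), the second term also converges to 0 as $m\to0$. So in order to show that $E^{P_m}[\sup_{0\le t\le T}|\eta_i(t)|^2]\to0$, it suffices to prove that the third term $E^{P_m}[|\eta_i(\xi_m)-\eta_i(\widetilde\xi_m)|^2]$ converges to 0. We show this in the following. Notice that
\[
E^{P_m}\big[|\eta_i(\xi_m)-\eta_i(\widetilde\xi_m)|^2\big]\le2E^{P_m}\big[|\widetilde{V}_i^{03}(\xi_m)-\widetilde{V}_i^{03}(\widetilde\xi_m)|^2\big]+2E^{P_m}\big[|M^i(\xi_m)-M^i(\widetilde\xi_m)|^2\big].
\]
Since $0\le\xi_m-\widetilde\xi_m\le m^{4/3}$, we get by (4.12) that
\[
\begin{aligned}
E^{P_m}\big[|\widetilde{V}_i^{03}(\xi_m)-\widetilde{V}_i^{03}(\widetilde\xi_m)|^2\big]
&\le E^{P_m}\bigg[\bigg(\int_0^T1_{[\widetilde\xi_m,\xi_m]}(t)\Big|\frac{d}{dt}\widetilde{V}_i^{03}(t)\Big|\,dt\bigg)^2\bigg]\\
&\le E^{P_m}\bigg[\int_0^T1_{[\widetilde\xi_m,\xi_m]}(t)\,dt\cdot\int_0^T\Big|\frac{d}{dt}\widetilde{V}_i^{03}(t)\Big|^2dt\bigg]\\
&\le m^{4/3}\int_0^TE^{P_m}\bigg[\Big|\frac{d}{dt}\widetilde{V}_i^{03}(t)\Big|^2\bigg]dt\\
&\le m^{4/3}TC_1m^{-1/2}\to0\quad\text{as } m\to0.
\end{aligned}
\]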
For the term $E^{P_m}[|M^i(\xi_m)-M^i(\widetilde\xi_m)|^2]$, we first notice that since $0\le\xi_m-\widetilde\xi_m\le m^{4/3}$ by definition, (4.16) gives us that
\[
\lim_{m\to0}P_m\big(|M^i(\xi_m)-M^i(\widetilde\xi_m)|>\varepsilon\big)=0. \tag{4.21}
\]
This is true for any $\varepsilon>0$. Also, we have by Lemma 4.4.5 that for any $\varepsilon>0$,
\[
\begin{aligned}
E^{P_m}\big[|M^i(\xi_m)-M^i(\widetilde\xi_m)|^2\big]
&\le E^{P_m}\big[|M^i(\xi_m)-M^i(\widetilde\xi_m)|^2,\ |M^i(\xi_m)-M^i(\widetilde\xi_m)|>\varepsilon\big]+\varepsilon^2\\
&\le E^{P_m}\big[|M^i(\xi_m)-M^i(\widetilde\xi_m)|^4\big]^{1/2}P_m\big(|M^i(\xi_m)-M^i(\widetilde\xi_m)|>\varepsilon\big)^{1/2}+\varepsilon^2\\
&\le4E^{P_m}\Big[\sup_{t\in[0,T]}|M^i(t)|^4\Big]^{1/2}P_m\big(|M^i(\xi_m)-M^i(\widetilde\xi_m)|>\varepsilon\big)^{1/2}+\varepsilon^2\\
&\le4C^{1/2}P_m\big(|M^i(\xi_m)-M^i(\widetilde\xi_m)|>\varepsilon\big)^{1/2}+\varepsilon^2.
\end{aligned}
\]
This combined with (4.21) gives us that
\[
\lim_{m\to0}E^{P_m}\big[|M^i(\xi_m)-M^i(\widetilde\xi_m)|^2\big]=0,
\]
and completes the proof of the fact that
\[
\lim_{m\to0}E^{P_m}\big[|\eta_i(\xi_m)-\eta_i(\widetilde\xi_m)|^2\big]=0,
\]
thereby completing the proof of our assertion.

Combining all of the results in Secs. 4.1–4.3, we get Lemma 3.5.1, with
\[
M_i(t)=-M^i(t\wedge\sigma),\qquad P_i^{*1}(t)=-V_i^{02}(t)-V_i^{05}(t),\qquad \eta_i(t)=-V_i^{1}(t)+V_i^{04}(t)-\eta_i(t\wedge\sigma). \tag{4.22}
\]
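The truncation step used in the proof of Lemma 4.4.6 rests on the elementary bound $E[|X|^2]\le E[|X|^4]^{1/2}P(|X|>\varepsilon)^{1/2}+\varepsilon^2$, obtained by splitting on the event $\{|X|>\varepsilon\}$ and applying the Cauchy–Schwarz inequality. The sketch below checks it on a small discrete distribution; the distribution and the function name are made up purely for illustration.

```python
import math


def truncation_bound(values, probs, eps):
    """Return (E[X^2], E[X^4]^(1/2) * P(|X| > eps)^(1/2) + eps^2) for a discrete X."""
    second = sum(p * x * x for x, p in zip(values, probs))
    fourth = sum(p * x ** 4 for x, p in zip(values, probs))
    tail = sum(p for x, p in zip(values, probs) if abs(x) > eps)
    return second, math.sqrt(fourth) * math.sqrt(tail) + eps ** 2


if __name__ == "__main__":
    lhs, rhs = truncation_bound([0.0, 0.1, 2.0], [0.5, 0.4, 0.1], eps=0.5)
    print(lhs, "<=", rhs)
```

When the fourth moment is bounded uniformly in $m$ and the tail probability tends to 0, the right-hand side tends to $\varepsilon^2$, and $\varepsilon$ is arbitrary.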
Before closing this subsection, we state the following result with respect to the quadratic variation of the martingale $M_i(\cdot)$. The proof is easy and we omit it. For $i=1,\dots,N$ and $k=1,\dots,d$, let
\[
A_{ik}(r)=A_{ik}(r,x,v)=\int_{-2\tau}^{2\tau}\nabla_kU_i\big(X_i(r)-\psi^0(u,x,v;\vec X(r))\big)\,du.
\]
Then we have:

Lemma 4.4.7. For any $l_1,l_2=1,\dots,N$ and $k_1,k_2=1,\dots,d$, the following equality holds:
\[
[M_{l_1}^{k_1},M_{l_2}^{k_2}]_s=m\int_{[0,s\wedge\sigma]\times E}A_{l_1k_1}(r,x,v)A_{l_2k_2}(r,x,v)\,N(dr,dx,dv).
\]
4.5. Proof of Lemma 3.5.2 In this subsection, we present the proof of Lemma 3.5.2. The first assertion is just an easy consequence of Lemma 3.5.1 and the formula of integration by parts. Indeed, for any t ≥ 0, we have by assumption and the
formula of integration by parts that t∧σfD t∧σfD −1/2 (X(s)))ds m |∇i U (X(s))|ds = g(X(s)) · (m−1/2 ∇i U 0
0
∧ σ = g(X(t D )) −
t∧σf D
t∧σf D 0
(X(s))ds m−1/2 ∇i U
(s)) ds(∇g(X(s)) V 0
0
s
(X(r))dr. m1/2 ∇i U
Therefore, by Lemma 3.5.1(1), we get T ∧σ∧σfD X(s))|ds m−1/2 |∇i U( 0
= g(X(T ∧ σ ∧ σ D ))(−Mi (Vi (T ∧ σ ∧ σ D ) − Vi (0)) + Mi (T ∧ σ D) T ∧σ∧σfD ∗1 (t)) (∇g(X(t)) V + ηi (T ∧ σ D ) + Pi (T ∧ σ D )) − 0
× {−Mi (Vi (t ∧ σ ∧ σ D ) − Vi (0)) + Mi (T ∧ σ D ) + ηi (T ∧ σ D) + Pi∗1 (t ∧ σ D )}dt
≤ (g∞ + ∇g∞ · N nT ) 2Mi n + sup |Mi (t) + ηi (t)| + sup |Pi∗1 (t)| . 0≤t≤T
0≤t≤T
Therefore, we get our first assertion by Lemmas 3.5.1(2), 3.5.1(4) and (4.15).

Before giving the proof of the second assertion, let us make some preparation. With the help of Lemma 4.4.7, we have the following.

Lemma 4.5.1.
\[
\lim_{m\to0}E^{P_m}\bigg[\sup_{t\in[0,T]}\Big|\int_0^t\eta_i(s)\,dM_i(s)\Big|^2\bigg]=0.
\]
Proof. Since $M_i(\cdot)$ is a martingale, Lemma 3.5.1(4) implies that $\int_0^t\eta_i(s)\,dM_i(s)$ is also a martingale. Therefore, with the help of Lemma 4.4.7 and Doob's inequality, we get that
\[
\begin{aligned}
E^{P_m}\bigg[\sup_{t\in[0,T]}\Big|\int_0^t\eta_i(s)\,dM_i(s)\Big|^2\bigg]
&\le4E^{P_m}\bigg[\Big|\int_0^T\eta_i(s)\,dM_i(s)\Big|^2\bigg]\\
&=2\sum_{k,\ell=1}^dE^{P_m}\bigg[\int_0^T\eta_i^k(s)\eta_i^\ell(s)\,d[M_i^k,M_i^\ell]_s\bigg]\\
&=2m\sum_{k,\ell=1}^dE^{P_m}\bigg[\int_0^{T\wedge\sigma}\int_E\eta_i^k(s)\eta_i^\ell(s)A_{ik}(s,x,v)A_{i\ell}(s,x,v)\,N(ds,dx,dv)\bigg]\\
&\le2(4\tau\|\nabla U_i\|_\infty)^2\int_{[0,T]\times E}E^{P_m}[|\eta_i(s)|^2]\,1_{[0,R_0+1)}(|x|)\,\rho\big(\tfrac12|v|^2\big)\,\nu(dx,dv)\,ds\\
&\le2(4\tau\|\nabla U_i\|_\infty)^2\,T(2(R_0+1))^{d-1}\int_{\mathbb R^d}\rho\big(\tfrac12|v|^2\big)|v|\,dv\;E^{P_m}[|\eta_i(s)|^2].
\end{aligned}
\]
This combined with Lemma 3.5.1(4) completes the proof of our assertion. We next show the second assertion of Lemma 3.5.2. The basic idea is to add an d t∧σ extra term i=1 M1i 0 ηi (s)dMi (s) first, use the decomposition and the estimates of Lemma 3.5.1 to show that the resulting quantity is tight, and finally delete the added term by Lemma 4.5.1. First, by Lemma 3.5.1, we have (X(t ∧ σ)) − U (X(0))) m−1/2 (U +
N
Mi
1 |Vi (t ∧ σ)| + 2 Mi
i=1
=
N
Mi
2
i=1
2
|Vi (0)| +
Mi
0 N
Mi
2
i=1
+ 0
t∧σ
+
=
2
N
i=1
t∧σ
0
t∧σ
0
ηi (s)dMi (s)
(X(s)) · Vi (s)ds m−1/2 ∇i U
t∧σ d 1 Vi (s) · Vi (s)ds + ηi (s)dMi (s) ds Mi 0
|Vi (0)|2 +
N
i=1
t∧σ
Vi (s)dηi (s) +
t∧σ
Vi (s)
0
1 Mi
0
t∧σ
d ∗1 P (s)ds + ds i
t∧σ
0
Vi (s)dMi (s)
ηi (s)dMi (s) .
Since |Vi (t ∧ σ)| ≤ n by the definition of σ, we have by Lemma 3.5.1(2) that 2 d sup sup E Pm Vi (t ∧ σ) Pi∗1 (t) < ∞. dt 0≤t≤T
m∈(0,1]
t∧σ d Therefore, by Theorem 3.4.1, we get that 0 Vi (s) ds Pi∗1 (s)ds under Pm is tight for m ∈ (0, 1]. t For the term 0 1[0,σ] (s)Vi (s)dMi (s), we recall that σ = inf{t > 0; maxi=1,...,N × |Vi (t)| = n}, so σ is a Ft -stopping time. Therefore, since {Mi (s)}s is a martingale,
we get that
\[
N_i(t):=\int_0^t1_{[0,\sigma]}(s)V_i(s)\,dM_i(s)
\]
is also a Ft -martingale. Notice that E Pm [|Ni (t) − Ni (s)|2 |Fs ] ≤ n2 d2 E Pm [|Mi (t) − Mi (s)|2 |Fs ]. So by Lemma 3.5.1(3), we get that E Pm [|Ni (t) − Ni (s)|2 |Fs ] ≤ n2 d2 C|t − s|,
|∆N (t)| ≤ dnCm1/2 .
Therefore, similarly as in the proof of Lemmas 4.4.1 and 4.4.3, we get that {Ni (t)}t under Pm is tight for m → 0, and the canonical process under any of its cluster points is continuous withprobability 1. t∧σ t∧σ Finally, we show that 0 Vi (s)dηi (s)+ M1i 0 ηi (s)dMi (s) is negligible. Notice that by Lemma 3.5.1(3), t∧σ t∧σ 1 Vi (s)dηi (s) + ηi (s)dMi (s) Mi 0 0 t∧σ t∧σ 1 = Vi (t ∧ σ)ηi (t) − ηi (s)dVi (s) + ηi (s)dMi (s) Mi 0 0
t∧σ t∧σ 1 1 d ∗1 = Vi (t ∧ σ)ηi (t) − ηi (s) ηi (s)dηi (s) Pi (s) ds − Mi 0 ds Mi 0 t∧σ 1 (X(s))ds + ηi (s)m−1/2 ∇i U Mi 0 1 1 ηi (t)2 + [ηi , ηi ]t 2Mi 2Mi
t∧σ d 1 (X(s)) − Pi∗1 (s) ds. + ηi (s) m−1/2 ∇i U Mi 0 ds
= Vi (t ∧ σ)ηi (t) −
Since |Vi (t ∧ σ)| ≤ n, Lemma 3.5.1(4) gives us that 1 1 Pm 2 lim E ηi (t) + [ηi , ηi ]t = 0. sup Vi (t ∧ σ)ηi (t) − m→0 2Mi 2Mi t∈[0,T ∧σ] Also, for any ε > 0, we have for any A > 0, t∧σ
d ∗1 −1/2 Pm ηi (s) m ∇i U (X(s)) − Pi (s) ds > ε sup ds t∈[0,T ∧σ] 0 ≤ Pm
sup s∈[0,T ∧σ]
|ηi (s)| > A
+ Pm
sup s∈[0,T ∧σ]
0
t∧σ
d ∗1 ε −1/2 (X(s))| + Pi (s) ds > ∇i U |m ds A
1 ≤ E Pm A
sup s∈[0,T ∧σ]
A + E Pm ε
T ∧σ
0
|ηi (s)|
|m
−1/2
d ∗1 ∇i U(X(s))| + Pi (s) ds . ds
Combining this with Lemmas 3.5.1(2), 3.5.1(4) and 3.5.2(1), by taking first A > 0 small enough and then m > 0 small enough, we get that t∧σ
d −1/2 ∗1 (X(s)) − Pi (s) ds > ε = 0 ηi (s) m ∇i U sup lim Pm m→0 ds t∈[0,T ∧σ∧σf D]
0
(X(t ∧ σ ∧ σ for any ε > 0. This completes the proof of the fact that m−1/2 (U D )) − t∧σ∧σfD N M 1 2 i (X(0))) U + |Vi (t ∧ σ ∧ σ )| + η (s)dM (s) under P is tight D i i m i=1 2
Mi
0
as m → 0, and the canonical process under any of its cluster points is continuous with probability 1. This combined with Lemma 4.5.1 gives us our second assertion of Lemma 3.5.2. 5. Convergence until “Near” As mentioned at the end of Sec. 3.4, weak convergence of the distribution of a process with t ∈ [0, T ] for any T > 0 implies the weak convergence of the distribution of the process with t ∈ [0, ∞). So in order to prove Theorems 2.0.1(2)–2.0.1(4), it suffices to prove the assertions for t ∈ [0, T ] for any T > 0. Fix a T > 0 from now on. t∧σ × By Lemma 3.5.1, we have that {{Mi Vi (t ∧ σn ) + m−1/2 0 n ∇i U s )ds}t∈[0,T ] under Pm } is tight in ℘(D([0, T ]; Rd)) as m → 0, and the canon(X ical process under any of its cluster points is continuous with probability 1. Let σ0 (ω) = inf{t > 0; mini=j {|Xi (t) − Xj (t)| − (Ri + Rj )} ≤ 0}. Then by (X s ) = 0 for any s ≤ σ0 . Therefore, there exists (at least) one sequence (3.31), ∇i U ∧ σn ∧ σ0 ), V (t ∧ σn ∧ mk → 0 (as k → ∞) such that {distribution of {(X(t d σ0 ))}t∈(0,T ] under Pmk } converges in ℘(D([0, T ]; R )). In this section, we give the proof of the fact that any cluster point gotten above is the stopped diffusion process with generator L as given in Sec. 2, by proving that it is the solution of the martingale problem L. This certainly implies Theorem 2.0.1(2) and 2.0.1(3). For the sake of simplicity, in this section, we let σ = σn ∧ σ0 . We use the same )C ⊂ RdN . notations as in Sec. 4. Also, we use the notation D0 = (supp U 5.1. Decomposition ∧ As claimed, we show from now on that any cluster point of {distribution of {(X(t σ), V (t ∧ σ))}t∈[0,T ] under Pm } is a solution of the martingale problem L, i.e. for
any $f\in C_0^\infty(D_0\times\mathbb R^{dN})$,
\[
f(\vec X(t\wedge\sigma),\vec V(t\wedge\sigma))-f(X_0,V_0)-\int_0^{t\wedge\sigma}Lf(X_s,V_s)\,ds, \tag{5.1}
\]
after taking the limit $m\to0$, is a martingale.

First, since we do not have enough information about the term $\eta_i(t)$, we use the following to convert the problem to one without $\eta_i(t)$. Let
t∧σn −1 ∗1 −1/2 ∇i U (Xs )ds Yi (t) = Vi (0) + Mi Mi (t) + Pi (t) − m 0
= Vi (t) −
Mi−1 ηi (t),
i = 1, . . . , N,
and let $Y_t=Y(t)=(Y_1(t),\dots,Y_N(t))$. Then we have the following. (We use the notations $X_t=\vec X(t)$ and $V_t=\vec V(t)$.)

Lemma 5.1.1. For any $f\in C_0^\infty(D_0\times\mathbb R^{dN})$, we have that $\{f(X_{t\wedge\sigma_n},V_{t\wedge\sigma_n})\}_t$ and $\{f(X_{t\wedge\sigma_n},Y_{t\wedge\sigma_n})\}_t$ converge or do not converge for $m\to0$ at the same time, and when they converge, they have the same limit.

Proof. Just notice that if we let $f_V$ denote the partial differential of $f$ with respect to $V$, then $\|f_V\|_\infty<\infty$ and
\[
|f(X_{t\wedge\sigma_n},V_{t\wedge\sigma_n})-f(X_{t\wedge\sigma_n},Y_{t\wedge\sigma_n})|\le\|f_V\|_\infty\max_{i=1,\dots,N}\frac1{M_i}\sup_{s\in[0,T]}|\eta_i(s)|,
\]
hence
\[
E^{P_m}\Big[\sup_{0\le t\le T}|f(X_{t\wedge\sigma_n},V_{t\wedge\sigma_n})-f(X_{t\wedge\sigma_n},Y_{t\wedge\sigma_n})|\Big]\le\|f_V\|_\infty\max_{i=1,\dots,N}\frac1{M_i}E^{P_m}\Big[\sup_{s\in[0,T]}|\eta_i(s)|\Big],
\]
which, by Lemma 3.5.1(4), converges to 0 as $m\to0$.

By Lemma 5.1.1, in order to prove that any cluster point of (5.1) is a martingale, it suffices to prove that any cluster point of
\[
f(X_{t\wedge\sigma},Y_{t\wedge\sigma})-f(X_0,Y_0)-\int_0^{t\wedge\sigma}Lf(X_s,V_s)\,ds
\]
is a martingale. Since f ∈ C0∞ (D0 × RdN ) (notice that all the terms involved except Mi (t) are continuous with respect to t), we have t∧σ (X s )ds = 0, fV (Xs , Ys ) · ∇U 0
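so we obtain by Itô's formula and the definition of $Y_i(t)$ that
\[
f(X_{t\wedge\sigma},Y_{t\wedge\sigma})-f(X_0,Y_0)=\int_0^{t\wedge\sigma}f_X(X_s,Y_s)\cdot V_s\,ds+\sum_{i=1}^N\frac1{M_i}\int_0^{t\wedge\sigma}f_{V_i}(X_s,Y_s)\cdot dM_i(s)+(\mathrm{II})+(\mathrm{III})+(\mathrm{IV}),
\]
with
\[
\begin{aligned}
(\mathrm{II})&=\sum_{i=1}^N\frac1{M_i}\int_0^{t\wedge\sigma}f_{V_i}(X_s,Y_s)\cdot dP_i^{*1}(s),\\
(\mathrm{III})&=\sum_{l_1,l_2=1}^N\sum_{k_1,k_2=1}^d\frac1{2M_{l_1}M_{l_2}}\int_{0+}^{t\wedge\sigma}f_{V_{l_1}^{k_1}V_{l_2}^{k_2}}(X_s,Y_s)\,d[M_{l_1}^{k_1},M_{l_2}^{k_2}]_s,\\
(\mathrm{IV})&=\sum_{0<s\le t\wedge\sigma}\bigg\{f(X_s,Y_s)-f(X_s,Y_{s-})-\sum_{l=1}^N\frac1{M_l}f_{V_l}(X_s,Y_{s-})\cdot\Delta M_l(s)\\
&\qquad\qquad-\frac12\sum_{l_1,l_2=1}^N\frac1{M_{l_1}M_{l_2}}f_{V_{l_1}V_{l_2}}(X_s,Y_{s-})(\Delta M_{l_1}(s))(\Delta M_{l_2}(s))\bigg\}.
\end{aligned}
\]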
N 1 t∧σ The term fVi (Xs , Ys ) · dMi (s) is already a martingale since i=1 Mi 0 {Mi (t)}t , i = 1, . . . , N , are martingales and fV is bounded, hence it remains a martingale when taking m → 0. So it suffices to show that 0
t∧σ
fX (Xs , Ys ) · Vs ds + (II) + (III) + (IV) −
0
t∧σ
Lf (Xs , Vs )ds → 0.
t∧σ t∧σ The fact that the difference between 0 fX (Xs , Ys ) · Vs ds and 0 fX (Xs , Vs ) · t∧σ Vs ds, its corresponding term in 0 Lf (Xs , Vs )ds, converges to 0 is a direct consequence of Lemma 5.1.1. In the following sections, we show the convergences of the other terms. Precisely, we show that when m → 0, (II) −
N d
0 i,j=1 k,l=1
(III) −
t
t
N d
0 i,j=1 k,l=1
(IV) → 0.
Vj (s)bik,jl (Xs )
aik,jl (Xs )
∂ f (Xs , Vs )ds → 0, ∂Vik
∂2 f (Xs , Vs )ds → 0, ∂Vik ∂Vjl
(5.2)
(5.3) (5.4)
5.2. The term (IV)

By using the fact that any jump of $M_i(t)$ is dominated by $Cm^{1/2}$ (Lemma 3.5.1(3)) and the definition of $M_i(t)$, we show with the help of the properties of a Poisson point process that (IV) is negligible. Precisely, we show the following.

Lemma 5.2.1.
\[
\lim_{m\to0}E^{P_m}\Big[\sup_{0\le t\le T}|(\mathrm{IV})|\Big]=0.
\]
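Proof. Since $f\in C_0^\infty(D_0\times\mathbb R^{dN})$, we have that the third partial derivatives $f_{V_{l_1}^{k_1}V_{l_2}^{k_2}V_{l_3}^{k_3}}$, $l_1,l_2,l_3=1,\dots,N$, $k_1,k_2,k_3=1,\dots,d$, are bounded. Also, by Lemma 3.5.1(3), the jumps of $\{M_i(t)\}$ satisfy $|\Delta M_i(s)|\le Cm^{1/2}$. Therefore, by Taylor's expansion, with $C_1=\sum_{l_1,l_2,l_3=1}^N\sum_{k_1,k_2,k_3=1}^d\|f_{V_{l_1}^{k_1}V_{l_2}^{k_2}V_{l_3}^{k_3}}\|_\infty C$, we have
\[
\begin{aligned}
|(\mathrm{IV})|&\le\sum_{0<s\le t\wedge\sigma}\ \sum_{l_1,l_2,l_3=1}^N\ \sum_{k_1,k_2,k_3=1}^d\|f_{V_{l_1}^{k_1}V_{l_2}^{k_2}V_{l_3}^{k_3}}\|_\infty|\Delta M_{l_1}^{k_1}(s)||\Delta M_{l_2}^{k_2}(s)||\Delta M_{l_3}^{k_3}(s)|\\
&\le C_1m^{1/2}\sum_{0<s\le t\wedge\sigma}\ \sum_{l=1}^N|\Delta M_l(s)|^2.
\end{aligned}
\]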
Therefore, to complete the proof of this lemma, it suffices to show that $E^{P_m}[\sum_{0<s\le T\wedge\sigma}|\Delta M_i(s)|^2]$ is bounded for $m>0$, which we are now going to show. We have by the definition of $\{M_i(t)\}$ that
\[
M_i(t)=-\int_{(0,t\wedge\sigma]\times E}\bar N(dr,dx,dv)\int_0^{4m^{1/2}\tau}du\,\nabla U_i\big(X_i(r\wedge\sigma)-\psi^0(m^{-1/2}u-2\tau,x,v;\vec X(r\wedge\sigma))\big),
\]
so
\[
\sum_{0<s\le t\wedge\sigma}|\Delta M_i(s)|^2=\int_{(0,t\wedge\sigma]\times E}N(dr,dx,dv)\,\bigg|\int_0^{4m^{1/2}\tau}du\,\nabla U_i\big(X_i(r\wedge\sigma)-\psi^0(m^{-1/2}u-2\tau,x,v;\vec X(r\wedge\sigma))\big)\bigg|^2.
\]
Recall that $N$ is the Poisson point process with intensity $m^{-1}\rho(\frac12|v|^2)\,dr\,\nu(dx,dv)$. Therefore, since
\[
\big|\nabla U_i\big(X_i(r\wedge\sigma)-\psi^0(m^{-1/2}u-2\tau,x,v;\vec X(r\wedge\sigma))\big)\big|\le\|\nabla U_i\|_\infty1_{[0,R_0+1)}(|x|),
\]
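we get that
\[
\begin{aligned}
E^{P_m}\bigg[\sum_{0<s\le T\wedge\sigma}|\Delta M_i(s)|^2\bigg]
&=E^{P_m}\bigg[\int_{(0,T\wedge\sigma]\times E}N(dr,dx,dv)\,\bigg|\int_0^{4m^{1/2}\tau}du\,\nabla U_i\big(X_i(r\wedge\sigma)-\psi^0(m^{-1/2}u-2\tau,x,v;\vec X(r\wedge\sigma))\big)\bigg|^2\bigg]\\
&\le\int_{[0,T]\times E}m^{-1}\rho\big(\tfrac12|v|^2\big)1_{[0,R_0+1)}(|x|)(4m^{1/2}\tau)^2\|\nabla U_i\|_\infty^2\,dr\,\nu(dx,dv)\\
&\le16\tau^2\|\nabla U_i\|_\infty^2\,T(2(R_0+1))^{d-1}\int_{\mathbb R^d}\rho\big(\tfrac12|v|^2\big)|v|\,dv,
\end{aligned}
\]
which is finite by our assumption. This completes the proof of our assertion.

5.3. The term (III)

For the term (III), we show in this subsection that (5.3) holds, i.e. when $m\to0$, (III) corresponds to the quadratic term of the generator $L$. By Lemma 4.4.7, we have that
\[
\int_{0+}^{t\wedge\sigma}f_{V_{l_1}^{k_1}V_{l_2}^{k_2}}(X_s,Y_s)\,d[M_{l_1}^{k_1},M_{l_2}^{k_2}]_s
=m\int_{0+}^{t\wedge\sigma}\int_Ef_{V_{l_1}^{k_1}V_{l_2}^{k_2}}(X_s,Y_s)A_{l_1k_1}(s,x,v)A_{l_2k_2}(s,x,v)\,N(ds,dx,dv).
\]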
Let
\[
(\mathrm{III}')=\sum_{l_1,l_2=1}^N\sum_{k_1,k_2=1}^d\frac1{2M_{l_1}M_{l_2}}\int_{0+}^{t\wedge\sigma}f_{V_{l_1}^{k_1}V_{l_2}^{k_2}}(X_s,Y_s)\bigg(\int_EA_{l_1k_1}(s,x,v)A_{l_2k_2}(s,x,v)\,\rho\big(\tfrac12|v|^2\big)\,\nu(dx,dv)\bigg)ds.
\]
Then we have the following. The reason is intuitively as follows: when subtracting $(\mathrm{III}')$, we are subtracting the corresponding expectation, so the resulting quantity is its variance, which converges to 0.

Lemma 5.3.1.
\[
\lim_{m\to0}E^{P_m}\Big[\sup_{0\le t\le T}|(\mathrm{III})-(\mathrm{III}')|\Big]=0.
\]
Proof. By definition, N (ds, dx, dv) is the Poisson point process with intensity λ(ds, dx, dv) = m−1 ρ( 12 |v|2 )dsν(dx, dv). Also, notice that there exists a constant C > 0 such that |Alk (s∧σ, x, v)| ≤ C1[0,R0 +1] (|x|). Let C1 = 2C 2 fV V ∞ (T ((2R0 + 1))d−1 Rd ρ( 12 |v|2 )|v|dv)1/2 , which is finite by assumption. Then by Doob’s inequality, for any l1 , l2 = 1, . . . , N and k1 , k2 = 1, . . . , d, we have t∧σ mfV k1 V k2 (Xs , Ys )Al1 k1 (s)Al2 k2 (s)(N − λ)(ds, dx, dv) sup l l 1 2 0≤t≤T 0 E t∧σ ≤ E Pm sup mfV k1 V k2 (Xs , Ys )Al1 k1 (s)Al2 k2 (s) l1 l2 0≤t≤T
E
Pm
0
E
2 1/2 × (N − λ)(ds, dx, dv) ≤ 2E Pm
T ∧σ
0
E
mfV k1 V k2 (Xs , Ys )Al1 k1 (s)Al2 k2 (s) l1
l2
2 1/2 × (N − λ)(ds, dx, dv)
1/2
T ∧σ
= 2E Pm 0 2
≤ 2C fV V ∞ m
E
(mfV k1 V k2 (Xs , Ys )Al1 k1 (s)Al2 k2 (s))2 λ(ds, dx, dv) l1
l2
T 0
E
1[0,R0 +1] (|x|)m
−1
ρ
1/2 1 2 |v| dsν(dx, dv) 2
≤ C1 m1/2 . This completes the proof of our assertion. Lemma 5.3.1 combined with Corollary 3.2.3 implies that (5.3) holds, i.e. N d after taking the limit m → 0, (III) corresponds to the term i,j=1 k,l=1 × ∂2 aik,jl (X) k l . ∂Vi ∂Vj
5.4. The term (II) In this subsection, we deal with the term (II). The most basic idea is the same as up to now: use the benefit that the variance of the corresponding Poisson point process is small (see Lemmas 5.4.1 and 5.4.7). Proposition 3.6.4 is also used, to derive the V , a). limit, which gives us z(·; x, v, X,
Recall that Pi∗1 is given by Pi∗1 (t) = −Vi02 (t) − Vi05 (t). So we have the decomposition −
t∧σ
0
fV (Xs , Ys ) · dPi∗1 (s)
t∧σ
=
fV (Xs , Ys )1[4m1/2 τ,σ) (s)
0
t∧σ
+
fV (Xs , Ys )1[4m1/2 τ,σ) (s)
0
R×E
fi (s, r, x, v)µω (dr, dx, dv) ds
R×E
$ 05 (s, r, x, v)λ(dr, dx, dv) ds F i
t∧σ
=
fV (Xs , Ys )1[4m1/2 τ,σ) (s)
0
×
R×E
$ 05 (fi (s, r, x, v) + Fi (s, r, x, v))µω (dr, dx, dv) ds
t∧σ
+
fV (Xs , Ys )1[4m1/2 τ,σ) (s)
0
×
R×E
$ 05 Fi (s, r, x, v)(λ(dr, dx, dv) − µω (dr, dx, dv)) ds.
(5.5)
We first show in the following lemma that the second term on the right-hand side above is negligible. Lemma 5.4.1. lim E
Pm
m→0
sup 0≤t≤T
0
×
R×E
t∧σ
fV (Xs , Ys )1[4m1/2 τ,σ) (s)
$ 05 (s, r, x, v)(λ(dr, dx, dv) − µ (dr, dx, dv)) ds = 0. F ω i
Proof. As mentioned, this result intuitively lies in the fact that only the variance of the Poisson point process is involved. We prove it by first performing a proper decomposition (see (5.6)) and then show that each of these terms are small enough (see Lemmas 5.4.2–5.4.4). Let 2 $ 05 r ))) R(s, r, x, v) = −F r ) − ψ 0 (m−1/2 (s − r), x, v; X( i (s, r, x, v) − ∇ Ui (Xi (
r ) − ψ 0 (m−1/2 (s − r), x, v; X(s)) × {Xi (s) − Xi ( r ))}. + ψ 0 (m−1/2 (s − r), x, v; X(
Then we have the decomposition t∧σ fV (Xs , Ys )1[4m1/2 τ,σ) (s) 0
×
R×E
$ 05 (s, r, x, v)(λ(dr, dx, dv) − µ (dr, dx, dv)) ds F ω i
= (5I) + (5II) + (5III), where
(5.6)
t∧σ
(5I) =
fV (Xs , Ys )1[4m1/2 τ,σ) (s)
0
×
R×E
R(s, r, x, v)(µω (dr, dx, dv) − λ(dr, dx, dv)) ds,
t∧σ
(5II) =
fV (Xs , Ys )1[4m1/2 τ,σ) (s)
0
×
R×E
r ))) ∇2 Ui (Xi ( r ) − ψ 0 (m−1/2 (s − r), x, v; X(
×{(Xi (s) − Xi ( r ) − (s − r)Vi ( r )) − (ψ 0 (m−1/2 (s − r), x, v; X(s)) r ) + (s − r)V ( − ψ 0 (m−1/2 (s − r), x, v; X( r )))} × (µω (dr, dx, dv) − λ(dr, dx, dv)) ds,
t∧σ
(5III) = 0
fV (Xs , Ys )1[4m1/2 τ,σ) (s)
×
R×E
g(r, s, x, v)(µω (dr, dx, dv) − λ(dr, dx, dv)) ds,
with r ))) r ) − ψ 0 (m−1/2 (s − r), x, v; X( g(r, s, x, v) = ∇2 Ui (Xi ( r ) + (s − r)V ( r ) − ψ 0 (m−1/2 (s − r), x, v; X( r )) × {(s − r)Vi ( r ))}. + ψ 0 (m−1/2 (s − r), x, v; X( So Lemma 5.4.1 follows from Lemmas 5.4.2–5.4.4 in the following: Lemma 5.4.2.
lim E
m→0
Pm
sup |(5III)| = 0.
0≤t≤T
( Proof. First notice that 0 ≤ r ≤ σ, hence |V r )| ≤ N n. Let C1 = 2 N n). Then by Corollary 3.2.3 and Lemma 4.3.4, we ∇ Ui ∞ (2τ n + 4Cτ
have that |g(r, s, x, v)| − r||V ( ≤ ∇2 Ui ∞ 1[0,2m1/2 τ ) (|s − r|)(|s − r||Vi ( r )| + C|s r )|)1[0,R0 +1) (|x|) ≤ m1/2 C1 1[0,2m1/2 τ ) (|s − r|)1[0,R0 +1) (|x|). Also, it is easy to see that g(r, s, x, v) is Fr -measurable. Therefore by N and C2 = fV ∞ T C1 (2(R0 + assumption, with c = i=1 Ui ∞ 1))(d−1)/2 (4τ Rd ρc ( 12 |v|2 )|v|dv)1/2 , we have Pm E sup |(5III)| 0≤t≤T
≤ fV ∞ E
Pm
≤ fV ∞
0 T
0
= fV ∞
R×E
R×E
T
ds
≤ fV ∞
ρ
E
Pm
g(r, s, x, v)(µω (dr, dx, dv) − λ(dr, dx, dv))
2 1/2 g(r, s, x, v)(µω (dr, dx, dv) − λ(dr, dx, dv)) 1/2 [|g(r, s, x, v)| ]λ(dr, dx, dv) 2
R×E
T
ds 0
×m
ds
Pm dsE
0
−1
T ∧σ
R×E
(m1/2 C1 1[0,2m1/2 τ ) (|s − r|)1[0,R0 +1) (|x|))2
1/2 N 1 2
−1/2 |v| + Ui (x − m rv − Xi,0 ) drν(dx, dv) 2 i=1
≤ C2 m1/4 , which converges to 0 as m → 0. Lemma 5.4.3.
lim E Pm
m→0
sup |(5I)| = 0.
0≤t≤T
Proof. By the definition of R(s, r, x, v), a Taylor expansion, Corollary 3.2.3 and Lemma 4.3.4, we get that r )) |R(s, r, x, v)| ≤ ∇3 Ui ∞ |(Xi (s) − Xi ( r )))|2 − (ψ 0 (m−1/2 (s − r), x, v; X(s)) − ψ 0 (m−1/2 (s − r), x, v; X( 2 |Xi (s) − Xi ( ≤ (1 + C) r )|2 1[0,2m1/2 τ ) (|s − r|)1[0,R0 +1) (|x|). r )| ≤ Notice that when |s − r| ≤ 2m1/2 τ , since s, r ∈ [0, σ], we get that |Xi (s) − Xi ( n|s − r| ≤ n4m1/2 τ . So the above gives us that 2 m1[0,2m1/2 τ ) (|s − r|)1[0,R +1) (|x|). |R(s, r, x, v)| ≤ (4nτ (1 + C)) 0
2 4τ (2(R0 + 1))d−1 Therefore, with C1 = 2fV ∞ T (4nτ (1 + C)) we have E Pm sup |(5I)| 0≤t≤T
≤ 2fV ∞
0
×m
R×E
T
ds 0
−1
ρ
Rd
ρc ( 12 |v|2 )|v|dv,
T
ds
≤ 2fV ∞
R×E
E Pm [1[0,σ] (s)|R(s, r, x, v)|]λ(dr, dx, dv) 2 m1[0,2m1/2 τ ) (|s − r|)1[0,R +1) (|x|) (4nτ (1 + C)) 0
N 1 2
−1/2 |v| + Ui (x − m rv − Xi,0 ) drν(dx, dv) 2 i=1
≤ C1 m1/2 , which converges to 0 as m → 0. Lemma 5.4.4.
lim E
Pm
m→0
sup |(5II)| = 0.
0≤t≤T
Proof. First, by Lemma 4.3.4, r ) + (s − r)V ( |ψ 0 (m−1/2 (s − r), x, v; X(s)) − ψ 0 (m−1/2 (s − r), x, v; X( r ))| X(s) r ) − (s − r)V ( ≤ C| − X( r ))|. Notice that if s ≥ 4m1/2 τ and |s − r| ≤ 2m1/2 τ , then by the definition of r, we always have that r ≤ s. Therefore, s r ) − (s − r)V ( (u) − V ( X(s) − X( r) = (V r ))du. (5.7) r e
For any l ≤ s (≤ σ), we have that |Xi (l) − Xj (l)| ≥ Ri + Rj , i = j, which implies X(l)) = 0, i = 1, . . . , N . Therefore we have by Lemma 3.5.1 that that ∇i U(
u d ∗1 1 Pi (l)dl + ηi (u) − ηi ( Vi (u) − Vi ( r) = r) Mi r e dl u −1/2 + Mi (u) − Mi ( r) − m ∇i U(X(l))dl =
1 Mi
r e
u
r e
d ∗1 Pi (l)dl + ηi (u) − ηi ( r ) + Mi (u) − Mi ( r) . dl
Let am = 4m1/2 τ + 2 max E Pm i=1,...,N
(5.8)
sup |ηi (u)| + (4m1/2 τ )1/2 .
0≤u≤T
Then by Lemma 3.5.1(4), we have am → 0 as m → 0. Notice that for s ∈ [0, T ∧ σ], |s − r| ≤ 2m1/2 τ implies |s − r| ≤ 4m1/2 τ . Let C be the constant in (4.13),
N d ∗1 and let C1 = 4 i=1 M1i (supm∈(0,1] supt∈[0,T ] E Pm [| dl Pi (l)|2 ]1/2 + 1 + C), which is finite by Lemma 3.5.1. Then we get by (5.7), (5.8) and (4.13) that r ) − (s − r)V ( − X( r ))|] E Pm [|X(s) N u
s 1 d ∗1 Pm ≤ Pi (l)dl du E M dl i r e r e i=1 s s + du(ηi (u) − ηi ( r )) + du(Mi (u) − Mi ( r )) re
≤
N
i=1
r e
2 1/2 d 1 (4m1/2 τ )2 sup sup E Pm Pi∗1 (l) Mi dl m∈(0,1] t∈[0,T ]
+ 4m1/2 τ 2 sup E Pm [|ηi (u)|] + 0≤u≤T
s
r e
duE Pm [|Mi (u) − Mi ( r )|2 ]1/2
≤ C1 (4m1/2 τ )am .
(5.9)
Also,by properties of Poisson point processes, we have in general E Pm [ f dµω ] = E Pm [ f dλm ] = E[f ]dλm for any measurable f : R × E × Conf(R × E) → R. Let be as in Lemma 4.3.4, and let C2 = 2(1 + C)f V ∞ ∇2 Ui ∞ T C1 4τ (2(R0 + C 1))d−1 Rd ρc ( 12 |v|2 )|v|dv. Then by Corollary 3.2.3, Lemma 4.3.4 and (5.9), we get that T ds(1 + C) E Pm sup |(5II)| ≤ fV ∞ 0≤t≤T
0
× E Pm 1[0,σ) (s)
R×E
r ) − (s − r)V ( ∇2 Ui ∞ |X(s) − X( r ))|
× 1[0,2m1/2 τ ) (|s − r|)1[0,R0 +1) (|x|)
× (µω (dr, dx, dv) + λm (dr, dx, dv)) V ∞ ∇2 Ui ∞ = 2(1 + C)f ×
R×E
T
ds 0
r ) − (s − r)V ( E Pm [1[0,σ) (s)|X(s) − X( r ))|]
× 1[0,2m1/2 τ ) (|s − r|)1[0,R0 +1) (|x|)λm (dr, dx, dv)) ≤ C2 am , which converges to 0 as m → 0. This completes the proof of Lemma 5.4.4. Lemmas 5.4.2–5.4.4 complete the proof of Lemma 5.4.1.
We next deal with the first term on the right-hand side of (5.5). We first make the decomposition −1/2 $ 05 v))) fi (s, r, x, v) + F i (s, r, x, v) = ∇Ui (Xi (s) − x(s, Ψ(r, x, m
− ∇Ui (Xi (s) − ψ 0 (m−1/2 (s − r), x, v; X(s))) 2 3 1 = f i (s, r, x, v) + fi (s, r, x, v) + fi (s, r, x, v), with −1/2 1 f v))) i (s, r, x, v) = ∇Ui (Xi (s) − x(s, Ψ(r, x, m
− ∇Ui (Xi (s) − ψ 0 (m−1/2 (s − r), x, v; X(s))) − ∇2 Ui (Xi (s) − ψ 0 (m−1/2 (s − r), x, v; X(s))) · (x(s, Ψ(r, x, m−1/2 v)) − ψ 0 (m−1/2 (s − r), x, v; X(s))), 2 0 −1/2 2 f (s − r), x, v; X(s))) i (s, r, x, v) = ∇ Ui (Xi (s) − ψ (m
· (x(s, Ψ(r, x, m−1/2 v)) − ψ 0 (m−1/2 (s − r), x, v; X(s)) − m1/2 z(m−1/2 (s − r); x, v, X(s), V (s), −m−1/2 (s − r))), 2 0 −1/2 3 f (s − r), x, v; X(s))) i (s, r, x, v) = ∇ Ui (Xi (s) − ψ (m
· m1/2 z(m−1/2 (s − r); x, v, X(s), V (s), −m−1/2 (s − r)), where z is as defined in (2.3). Although we are not subtracting the expectations of the corresponding terms, 2 1 1 the terms involving f i (s, r, x, v) and fi (s, r, x, v) are negligible, as fi (s, r, x, v) and 2 f i (s, r, x, v) themselves are small enough. Indeed, it is easy to see by Taylor expan−1/2 1 v)) − ψ 0 (m−1/2 (s − sion that f i (s, r, x, v) is of higher order than x(s, Ψ(r, x, m r), x, v; X(s)), so it is somehow trivial that the term corresponding to it is negli2 gible. The fact that the term involving f i (s, r, x, v) is also negligible comes from Proposition 3.6.4. We formulate the result in the following. Lemma 5.4.5. We have that t∧σ lim E Pm sup fV (Xs , Ys )1[4m1/ τ,σ] (s) m→0
×
0≤t≤T
R×E
0
k (s, r, x, v)µ (dr, dx, dv) ds = 0, f ω i
k = 1, 2.
Proof. We first show the assertion for k = 1. First notice that by Corollary 3.2.3 1 and Proposition 3.6.5, we have that for s ∈ [0, T ∧ σ], f i (s, r, x, v) = 0 only if 1/2 1/2 1/2 |x| ≤ R0 + 1 and s − r ∈ [−m τ, 2m τ ]. Since s ∈ [4m τ, T ∧ σ], this implies that r − m1/2 τ ∈ [0, T ∧ σ]. So in this region, we have by Proposition 3.6.3 that
> 0 such that there exists a constant C |x(s, Ψ(r, x, m−1/2 v)) − ψ 0 (m−1/2 (s − r), x, v; X(s))| m1/2 . + |m−1/2 (r − s)|) ≤ 4Cτ ≤ m1/2 C(2τ )2 , we have So with C1 = ∇3 Ui ∞ (4Cτ 3 −1/2 1 v)) |f i (s, r, x, v)| ≤ ∇ Ui ∞ |x(s, Ψ(r, x, m 2 − ψ 0 (m−1/2 (s − r), x, v; X(s))|
≤ C1 m1[0,2m1/2 τ ] (|s − r|)1[0,R0 +1] (|x|). Let C2 = fV ∞ T C12 (2(R0 + 1))d−1 4τ Rd ρc ( 12 |v|2 )|v|dv. Then by the definition of λ, we get t∧σ
Pm 1 fV (Xs , Ys )1[4m1/ τ,σ] (s) sup fi (s, r, x, v)µω (dr, dx, dv) ds E 0≤t≤T
≤
T
0
dsfV ∞ C1 m
0
×m
−1
R×E
ρ
R×E
1[0,2m1/2 τ ] (|s − r|)1[0,R0 +1] (|x|)
N 1 2
−1/2 |v| + Ui (x − m rv − Xi,0 ) drν(dx, dv) 2 i=1
≤ C2 m1/2 , which converges to 0 as m → 0. The assertion for k = 2 is similar. Again, for s ∈ [0, T ∧ σ], we have by Corol1/2 2 τ, 2m1/2 τ ]. lary 3.2.3 that f i (s, r, x, v) = 0 only if |x| ≤ R0 + 1 and s − r ∈ [−m 1/2 For any s and r satisfying |s − r| ≤ 2m τ , we have by Proposition 3.6.4 that |x(s, Ψ(r, x, m−1/2 v)) − ψ 0 (m−1/2 (s − r), x, v; X(s)) − m1/2 z(m−1/2 (s − r); x, v, X(s), V (s), −m−1/2 (s − r))| ≤ Cm1/2 (1 + 2τ )2 m1/2 .
Let C3 = 4fV ∞ ∇2 Ui ∞ C(1 + 2τ )2 τ T (2(R0 + 1))d−1 Rd ρc ( 12 |v|2 )|v|dv. Then t∧σ
Pm 2 E fV (Xs , Ys )1[4m1/ τ,σ] (s) sup fi (s, r, x, v)µω (dr, dx, dv) ds 0≤t≤T
≤
0
T
0
dsfV ∞ ∇2 Ui ∞ C(1 + 2τ )2 m
×m
−1
ρ
R×E
R×E
1[0,2m1/2 τ ] (|s − r|)1[0,R0 +1] (|x|)
N 1 2
−1/2 |v| + Ui (x − m rv − Xi,0 ) drν(dx, dv) 2 i=1
≤ C3 m1/2 , which converges to 0 as m → 0. This completes the proof of our lemma.
Before dealing with the main term, namely the one involving $f_i^3(s,r,x,v)$, let us prove the following continuity of $z(t;x,v,\vec X,\vec V,a)$ with respect to $\vec X$ and $\vec V$, by again using Gronwall's Lemma.

Lemma 5.4.6. For any $T_1>0$ and $b,A,B>0$, there exists a constant $C=C(T_1,b,A,B)$ such that
\[
|z(t;x,v,\vec X^1,\vec V^1,a)-z(t;x,v,\vec X^2,\vec V^2,a)|\le C(\|\vec X^1-\vec X^2\|+\|\vec V^1-\vec V^2\|)
\]
for any $t\in[-\tau,T_1]$, $|a|\le A$, $\|\vec X^1\|,\|\vec X^2\|\le B$, $\|\vec V^1\|,\|\vec V^2\|\le b$.

Proof. First notice that for any $a,x,v,\vec X,\vec V$, by using the same method as in the proofs of Lemmas 3.6.3, 4.3.4, etc., with the help of Gronwall's Lemma, we easily get that for any $T_0>0$,
\[
|z(t)|\vee|z'(t)|\le(T_0+|a|)\|\vec V\|\,T_0\,e^{(1+\sum_{i=1}^N\|\nabla^2U_i\|_\infty)T_0},\qquad|t|\le T_0. \tag{5.10}
\]
k, V k , a), k = 1, 2, and let For the sake of simplicity, we write z k (t) = z(t; x, v, X 1 2 ξ(t) = z (t) − z (t). Then we have that in our domain, there exists a constant C0 = > 0 be the constant in Lemma 4.3.4 C0 (T, b, A, B) > 0 such that |z 1 (t)| ≤ C0 . Let C N 3 + 1)(C0 + (T + A)b) + ∇2 Ui ∞ (1 + T + A)}. Then and let C = i=1 {∇ Ui ∞ (C by definition and Lemma 4.3.4, 2 N d 2 1 ) − Xi1 )(z 1 (t) − (t + a)V 1) ∇ Ui (ψ 0 (t, x, v; X dt2 ξ(t) = − i=1 N
2 ) − Xi2 )(z 2 (t) − (t + a)V 2 ) + ∇2 Ui (ψ 0 (t, x, v; X i=1 N
1 ) − X 1 ) − ∇2 Ui (ψ 0 (t, x, v; X 2 ) − X 2 )} = − {∇2 Ui (ψ 0 (t, x, v; X i i i=1
1) × (z 1 (t) − (t + a)V −
N
2
0
2
∇ Ui (ψ (t, x, v; X ) −
Xi2 )(z 1 (t)
i=1
≤
N
−V )) − z (t) − (t + a)(V 2
+ 1)X 1−X 2 (|z 1 (t)| + (T + |a|)V 1 ) ∇3 Ui ∞ (C
i=1
+
N
1−V 2 ) ∇2 Ui ∞ (|z 1 (t) − z 2 (t)| + (T + |a|)V
i=1
2 + V 1−V 2 ) + C|z 1 (t) − z 2 (t)| 1−X ≤ C(X 1−X 2 + V 1−V 2 ) + C|ξ(t)|. = C(X
1
2
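Let $g(t)=|(\xi(t),\frac{d}{dt}\xi(t))|$. Then
\[
\frac{d}{dt}g(t)\le\Big|\frac{d}{dt}\xi(t)\Big|+\Big|\frac{d^2}{dt^2}\xi(t)\Big|\le C(\|\vec X^1-\vec X^2\|+\|\vec V^1-\vec V^2\|)+(C+1)g(t).
\]
Hence if we set $\widetilde g(t)=g(t-\tau)$, then $\widetilde g(0)=\frac{d}{dt}\widetilde g(0)=0$ by definition, and the discussion above gives us that
\[
\frac{d}{dt}\widetilde g(t)\le C(\|\vec X^1-\vec X^2\|+\|\vec V^1-\vec V^2\|)+(C+1)\widetilde g(t).
\]
So we have for any $t\in[0,T_1+\tau]$ that
\[
\widetilde g(t)\le Ct(\|\vec X^1-\vec X^2\|+\|\vec V^1-\vec V^2\|)+(C+1)\int_0^t\widetilde g(s)\,ds.
\]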
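An integral inequality of the form $g(t)\le ct+K\int_0^tg(s)\,ds$ yields, by Gronwall's lemma, the bound $g(t)\le ct\,e^{Kt}$. The sketch below compares this bound with the extremal case, in which the inequality is an equality and $g$ solves $g'=c+Kg$, $g(0)=0$. The symbols `c` and `K` are stand-ins for $C(\|\vec X^1-\vec X^2\|+\|\vec V^1-\vec V^2\|)$ and $C+1$, and the numerical values are arbitrary.

```python
import math


def extremal_solution(t, c, K):
    """Solution of g(t) = c*t + K * integral_0^t g(s) ds, i.e. g' = c + K*g with g(0) = 0."""
    return (c / K) * math.expm1(K * t)


def gronwall_bound(t, c, K):
    """Gronwall bound c * t * exp(K * t) for the same integral inequality."""
    return c * t * math.exp(K * t)


if __name__ == "__main__":
    c, K = 0.3, 2.5  # arbitrary illustrative constants
    for t in (0.0, 0.5, 1.0, 2.0):
        print(t, extremal_solution(t, c, K), gronwall_bound(t, c, K))
```

The bound is not sharp, but it is linear in `c`, which is all that the Lipschitz estimate of Lemma 5.4.6 requires.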
This combined with Gronwall's Lemma implies that
\[
\widetilde g(t)\le C(T_1+\tau)(\|\vec X^1-\vec X^2\|+\|\vec V^1-\vec V^2\|)\,e^{(C+1)(T_1+\tau)},\qquad t\in[0,T_1+\tau],
\]
which completes the proof of our assertion. Now, let us come back to deal with the term corresponding to fi3 (s, r, x, v). We make once more a decomposition of the form
t∧σ 3 (s, r, x, v)µ (dr, dx, dv) ds = (V 1) + (V 2), fV (Xs , Ys )1[4m1/ τ,σ] (s) f ω i 0
with
R×E
(V1) =
fV (Xs , Ys )1[4m1/ τ,σ] (s)
0
(V2) =
0
t∧σ
R×E
t∧σ
fV (Xs , Ys )1[4m1/ τ,σ] (s)
R×E
3 fi (s, r, x, v)λ(dr, dx, dv) ds, 3 fi (s, r, x, v)(µω − λ)(dr, dx, dv) ds.
The term (V1) (after a slight modification to get rid of the restriction that s ≥ 4m1/ τ ), is actually our goal term. The term (V2), being the variance with respect to the corresponding Poisson point process, is expected to be negligible. We show the second assertion in Lemma 5.4.7. (t) and X(t) are bounded. Also, m−1/2 |s − r| ≤ 2τ and Notice that up to σn , V 2 0 −1/2 (s − r), x, v; X(s))) = 0. So by (5.10), in |x| ≤ R0 + 1 if ∇ Ui (Xi (s) − ψ (m −1/2 −1/2 (s − r); x, v, X(s), V (s), −m (s − r)) is bounded. So by the this case, z(m 2 3 definition of f i and the boundedness of ∇ Ui , we get that there exists a constant C > 0 such that (|s − r|)1 (|x|). |f3 (s, r, x, v)| ≤ Cm1/2 1 1/2 [0,2m
i
Lemma 5.4.7.
lim E
m→0
Pm
τ]
sup |(V 2)| = 0.
0≤t≤T
[0,R0 +1]
August 10, J070-S0129055X10004077
2010 15:0 WSPC/S0129-055X
148-RMP
Classical Mechanical Model of Brownian Motion with Plural Particles
817
Proof. Let 2 3 r ))) R3 (s, r, x, v) = f r ) − ψ 0 (m−1/2 (s − r), x, v; X( i (s, r, x, v) − ∇ Ui (Xi (
r ), V ( r ); −m−1/2 (s − r)). · m1/2 z(m−1/2 (s − r); x, v, X( Then (V2) = (V21) + (V22), with
t∧σ
(V21) =
fV (Xs , Ys )1[4m1/ τ,σ] (s)
0
×
R×E
t∧σ
(V22) = 0
fV (Xs , Ys )1[4m1/ τ,σ] (s)
×
R3 (s, r, x, v)(µω − λ)(dr, dx, dv) ds,
R×E
r ))) ∇2 Ui (Xi ( r ) − ψ 0 (m−1/2 (s − r), x, v; X(
r ), V ( · m1/2 z(m−1/2 (s − r); x, v, X( r ), −m−1/2 (s − r)) × (µω − λ)(dr, dx, dv) ds. We first deal with (V21). We have by Corollary 3.2.3 and Proposition 3.6.5 that R3 (s, r, x, v) = 0 if |x| ≥ R0 + 1 or if |s − r| ≥ 2m1/2 τ . For s ∈ [0, T ∧ σ] and |s − r| ≤ 4m1/2 τ . Let C1 = ∇2 Ui ∞ C + |s − r| ≤ 2m1/2 τ , we have by definition PN 2 3 (1+ 0 + 2τ )nN 2τ e i=1 ∇ Ui ∞ 2τ ) , where C is the constant in ∇ Ui ∞ (1 + C)(T is the one in Lemma 4.3.4. Then by (5.10), Lemmas 5.4.6 Lemma 5.4.6, and C and 4.3.4, we have |R3 (s, r, x, v)| = ∇2 Ui (Xi (s) − ψ 0 (m−1/2 (s − r), x, v; X(s))) (s), −m−1/2 (s − r)) V · m1/2 {z(m−1/2 (s − r); x, v, X(s), r ), V ( r ), −m−1/2 (s − r))} − z(m−1/2 (s − r); x, v, X( + {∇2 Ui (Xi (s) − ψ 0 (m−1/2 (s − r), x, v; X(s))) r )))} − ∇2 Ui (Xi ( r ) − ψ 0 (m−1/2 (s − r), x, v; X( r ), V ( r ), −m−1/2 (s − r)) × m1/2 z(m−1/2 (s − r); x, v, X( (s), −m−1/2 (s − r)) ≤ ∇2 Ui ∞ m1/2 |z(m−1/2 (s − r); x, v, X(s), V r ), V ( r ), −m−1/2 (s − r))| − z(m−1/2 (s − r); x, v, X(
+ ∇3 Ui ∞ (|Xi (s) − Xi ( r )| r ))|) + |ψ 0 (m−1/2 (s − r), x, v; X(s)) − ψ 0 (m−1/2 (s − r), x, v; X( r ), V ( × m1/2 |z(m−1/2 (s − r); x, v, X( r ), −m−1/2 (s − r))| r ) + V (s) − V ( − X( r )). ≤ C1 m1/2 (X(s) r )| ≤ N n|s − r| ≤ 4m1/2 τ N n. To Since |Vi (t)| ≤ n until σn , we have |X(s) − X( estimate the term with respect to V (·) in the equation above, let am = 4m1/2 τ + 2 max E Pm sup |ηi (u)| + (4m1/2 τ )1/2 i=1,...,N
0≤u≤T
as before. Then by Lemma 3.5.1(4), am → 0 as m → 0. √ N d Pi∗1 (u)|] + 1 + C), where C is Let C2 = i=1 M1i (supm∈(0,1] sup0≤u≤t E Pm [| du the constant in (4.13). Then we have that (s) − V ( E Pm [|V r )|] ≤ C2 am ,
|s − r| ≤ 2m1/2 τ.
Indeed, since s, r ∈ [0, σ0 ∧ σn ], we have by Lemma 3.5.1(1) and (3.31) that
s d ∗1 1 Pi (l)dl + ηi (s) − ηi ( r) = r ) + Mi (s) − Mi ( r) , Vi (s) − Vi ( Mi r e dl hence by Lemma 3.5.1(2) and (4.13),
d
1 Pm Pm d ∗1 Pi (u) E [|V (s) − V ( r )|] ≤ |s − r| sup E Mi du 0≤u≤T i=1 + 2E Pm sup |ηi (u)| + E Pm [|Mi (s) − Mi ( r )|] 0≤u≤T
≤
d
d 1 |s − r| sup E Pm Pi∗1 (u) Mi du 0≤u≤t i=1 √ Pm 1/2 + 2E sup |ηi (u)| + C|s − r| 0≤u≤T
≤ C2 am , which gives us our assertion. = Combining the above and the definition of λ, we get that with C 1 d−1 2 8τ T fV ∞ (2(R0 + 1)) C1 (4τ N n + C2 ) Rd ρc ( 2 |v| )|v|dv, Pm sup |(V 21)| E 0≤t≤T
≤
0
T
dsfV ∞ E Pm 1[0,T ∧σ] (s)
R×E
(s) − V ( C1 m1/2 (4m1/2 τ N n + V r ))
× 1[0,2m1/2 τ ] (|s − r|)1[0,R0 +1] (|x|)(µω + λ)(dr, dx, dv)
≤2
T
dsfV ∞
0
R×E
E Pm [1[0,T ∧σ] (s)C1 m1/2 (4m1/2 τ N n + C2 am )]
× 1[0,2m1/2τ ] (|s − r|)1[0,R0 +1] (|x|)λ(dr, dx, dv) 1/2 + am ) → 0, ≤ C(m
as m → 0.
r ) − ψ 0 (m−1/2 (s − r), To handle the term (V22) is easier. We have |∇2 Ui (Xi ( 2 x, v; X( r )))| ≤ ∇ Ui ∞ 1[0,2m1/2 τ ] (|s − r|)1[0,R0 +1] (|x|). Also, for s ∈ [0, T ] and r ), V ( |s − r| ≤ 2m1/2 τ , we have by (5.10) that z(m−1/2 (s − r); x, v, X( r ), −1/2 (s−r)) is bounded. Let C be a bound of it, and let C = T f C ((2(R −m 3 V ∞ 3 0+ r ) is Fr -measurable, by the 1))d−1 4τ ∇2 Ui ∞ Rd ρc ( 12 |v|2 )|v|dv)1/2 . Then since X( definition of Poisson point processes and the definition of λ, we have E Pm sup |(V 22)| 0≤t≤T
≤
T
0
dsfV ∞ E Pm
R×E
r )) ∇2 Ui (Xi ( r ) − ψ 0 (m−1/2 (s − r), x, v; X(
r ), V ( × m1/2 z(m−1/2 (s − r); x, v, X( r ), −m−1/2 (s − r)) 2 1/2 × (µω (dr, dx, dv) − λ(dr, dx, dv))
T
= 0
dsfV ∞
R×E
+ r )) E Pm (∇2 Ui (Xi ( r ) − ψ 0 (m−1/2 (s − r), x, v; X(
r ), V ( × m1/2 z(m−1/2 (s − r); x, v, X( r ), −m−1/2 (s − r)))2
,
1/2 × λ(dr, dx, dv) ≤
0
T
dsfV ∞
R×E
(∇2 Ui ∞ m1/2 C3 )2 1[0,R0 +1) (|x|)
1/2 × 1[0,2m1/2 τ ) (|s − r|)λ(dr, dx, dv) 1/4 → 0, ≤ Cm
as m → 0.
This completes the proof of Lemma 5.4.7.
Up to now, we have shown that all of the terms of −(II) except $\sum_{i=1}^N \frac{1}{M_i}$(V1) are negligible. We are almost done with our discussion with respect to (II), except for getting rid of the term $1_{[4m^{1/2}\tau,\sigma]}(s)$ in the definition of (V1). We do it now.
Notice that in the integral domain of (V1), we have s ≥ 4m1/2 τ . So if ∇ Ui (Xi (s) − ψ 0 (m−1/2 (s − r), x, v; X(s)) = 0, then r ≥ 2m1/2 τ . If ρ( 12 |v|2 + N −1/2 rv − Xi,0 )) = 0 in addition, then |v| ≥ 2C0 + 1. Therefore, in this i=1 Ui (x− m −1/2 rv| ≥ 2τ (2C0 +1) ≥ R0 , hence since x·v = 0, we get |x−m−1/2 rv| ≥ R0 , case, |m −1/2 rv − Xi,0 )) = ρ( 12 |v|2 ). which in turn gives us that ρ( 12 |v|2 + N i=1 Ui (x − m Therefore, by definition, t∧σ dsfV (Xs , Ys )1[4m1/ τ,σ] (s) (V1) = 2
0
×
∇2 Ui (Xi (s) − ψ 0 (m−1/2 (s − r), x, v; X(s)))
R×E
(s), −m−1/2 (s − r)) · m1/2 z(m−1/2 (s − r); x, v, X(s), V
1 2 |v| drν(dx, dv) · m−1 ρ 2 t∧σ = dsfV (Xs , Ys )1[4m1/τ ,σ] (s) 0
×
+∞
−∞
E
du∇2 Ui (Xi (s) − ψ 0 (u, x, v; X(s)))
(s), −u) ρ 1 |v|2 ν(dx, dv) , × z(u; x, v, X(s), V 2 where when passing to the last equality, we used the change of variable u = m−1/2 (s − r) for every s fixed. With the help of this re-expression, we make a decomposition once more, (V1) = (V11) + (V12), with
t∧σ
dsfV (Xs , Ys )
(V11) = 0
× E
+∞
−∞
du∇2 Ui (Xi (s) − ψ 0 (u, x, v; X(s)))
1 2 × z(u; x, v, X(s), V (s), −u) ρ |v| ν(dx, dv) , 2 t∧σ dsfV (Xs , Ys )1[0,4m1/2 τ ] (s) (V12) = − 0
× E
+∞
−∞
du∇2 Ui (Xi (s) − ψ 0 (u, x, v; X(s)))
1 2 |v| ν(dx, dv) . × z(u; x, v, X(s), V (s), −u) ρ 2
Notice that for s ∈ [0, T ∧ σ], ∇2 Ui (Xi (s) − ψ 0 (u, x, v; X(s))) = 0 only if |u| ≤ 2τ + 1, and z(u; x, v, X(s), V (s), −u) is bounded in this domain. So and |x| ≤ R 0 +∞ 2 0 ( du∇ U (X (s) − ψ (u, x, v; X(s)))z(u; x, v, X(s), V (s), −u))ρ( 12 |v|2 )ν(dx, i i E −∞ dv) is bounded. Let C be a bound of it. Then |(V12)| ≤ 4Cτ fV ∞ m1/2 . This completes the proof of (5.2), i.e. the fact that the term (II) is converging N to − i=1 M1i (V11) as m → 0.
5.5. Conclusion Combining the results of Secs. 5.1–5.4, and taking the limit n → ∞ at last (notice that σn → ∞ a.s.), we get Theorems 2.0.1(2) and 2.0.1(3). Notice that this also gives us Lemma 3.5.3, by considering each time interval [ηn−1 , ξn ], with ηn , ξn given by the following: η0 = 0,
, εf ∈ B supp U ξn = inf t ≥ ηn−1 ; X(t) , 2 , εf )}, ∈ / B(supp U ηn = inf{t ≥ ξn ; X(t)
n ≥ 1.
, 2εf ) × RdN )C . Here εf > 0 is chosen such that supp f ⊂ (B(supp U
6. Case of Two Molecules
In this section, we consider the special case of two molecules with $d \ge 3$ and spherically symmetric potential functions $U_1, U_2$, as described in Theorem 2.0.1(4). Precisely, in addition to all of the assumptions in Secs. 3–5, we assume from now on that $d \ge 3$ and there exist functions $h_1, h_2 : [0,\infty) \to \mathbb{R}$ such that $U_i(x) = h_i(|x|)$, $i = 1, 2$, and, moreover, there exists a constant $\varepsilon_0 > 0$ such that $(-1)^{i-1} h_i(s) > 0$, $(-1)^{i-1} h_i(s) > 0$, $s \in (R_i - \varepsilon_0, R_i)$, $i = 1, 2$. See Sec. 2 for the explanation of these assumptions. Without loss of generality, we assume that $\varepsilon_0 < R_1 \wedge R_2$.
In the following, we show that in this special case, as announced in Sec. 2, as $m \to 0$, $\{(\vec X(t), \vec V(t))\}_t$ under $P_m$ converges to the reflecting diffusion process which has generator $L$ and acts as “colliding” when the potential ranges of the two molecules overlap. (See Theorem 6.3.2 for the precise definition of the limiting process.)
We first discuss the new potential $\widetilde U$ a little further. We then show that in our present setting, the condition of Lemma 3.5.2 is satisfied, and that, as $m \to 0$, {the distribution of $\{(\vec X(t\wedge\sigma_n), \vec V(t\wedge\sigma_n))\}_t$ under $P_m\}_m$ is tight in $\wp(\widetilde W^d)$ with the metric function $\widetilde{\mathrm{dis}}$ of $\widetilde W^d = C([0,\infty);\mathbb{R}^d) \times D([0,\infty);\mathbb{R}^d)$ given by
\[
\widetilde{\mathrm{dis}}(\omega_1, \omega_2) = \sum_{n=1}^{\infty} 2^{-n}\Big(1 \wedge \max_{t\in[0,n]} |x_1(t) - x_2(t)|\Big) + \mathrm{dis}(v_1, v_2)
\]
for $\omega_i = (x_i(\cdot), v_i(\cdot)) \in \widetilde W^d$, $i = 1, 2$. Here dis is the Skorohod metric on $D([0,\infty), \mathbb{R}^d)$ defined in Sec. 3.4. Finally, we use these to show the desired convergence.
6.1. The new potential $\widetilde U$
Let $p$ and $\widetilde U$ be as defined in Sec. 3.6, and let $\widetilde U_0$ be the constant
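\[
\widetilde U_0 = \sum_{i=1}^{2} \int_{\mathbb{R}^d} \big(p(U_i(X_i - x)) - p(0)\big)\,dx,
\]
which, as claimed in Sec. 3.6, is the value of $\widetilde U(X_1, X_2)$ when $X_1$ and $X_2$ are far enough apart, precisely, when $|X_1 - X_2| \ge R_1 + R_2$. Then
\[
\begin{aligned}
\widetilde U(X_1, X_2) - \widetilde U_0
&= \int_{\mathbb{R}^d} \Big\{ \big[p(U_1(X_1-x) + U_2(X_2-x)) - p(0)\big] - \big[(p(U_1(X_1-x)) - p(0)) + (p(U_2(X_2-x)) - p(0))\big] \Big\}\,dx \\
&= \int_{\mathbb{R}^d} dx \left( \int_0^{U_1(X_1-x)+U_2(X_2-x)} p'(s)\,ds - \int_0^{U_1(X_1-x)} p'(s)\,ds - \int_0^{U_2(X_2-x)} p'(s)\,ds \right) \\
&= \int_{\mathbb{R}^d} dx \left( \int_{U_2(X_2-x)}^{U_1(X_1-x)+U_2(X_2-x)} p'(s)\,ds - \int_0^{U_1(X_1-x)} p'(s)\,ds \right) \\
&= \int_{\mathbb{R}^d} dx \int_0^{U_1(X_1-x)} \big(p'(s + U_2(X_2-x)) - p'(s)\big)\,ds \\
&= \int_{\mathbb{R}^d} dx \int_0^{U_1(X_1-x)} ds \int_0^{U_2(X_2-x)} p''(s+u)\,du.
\end{aligned}
\]
Therefore,
\[
\nabla_1 \widetilde U(X_1, X_2) = \int_{\mathbb{R}^d} dx \int_0^{U_2(X_2-x)} p''(U_1(X_1-x) + u)\,du\; \nabla U_1(X_1 - x). \tag{6.1}
\]
Notice that the integrand in (6.1) is 0 outside of $B_2 = B_{X_1,X_2} = \{x \in \mathbb{R}^d;\ |x - X_1| \le R_1,\ |x - X_2| \le R_2\}$. Therefore,
\[
\nabla_1 \widetilde U(X_1, X_2) = \int_{B_2} dx \int_0^{U_2(X_2-x)} p''(U_1(X_1-x) + u)\,du\; \nabla U_1(X_1 - x). \tag{6.2}
\]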
We will use this expression in the following calculations.
In this subsection, we show, by using the spherical symmetry of the potentials, that $\nabla_1 \widetilde U(X_1, X_2)$ has the same direction as $X_2 - X_1$. Therefore, the term $-m^{-1/2}\int_0^{t\wedge\sigma} \nabla_i \widetilde U(\vec X(s))\,ds$ in the decomposition (3.30) of $M_i V_i(t)$ gives us the reflecting force. First, we have the following:
Lemma 6.1.1. Let $\varepsilon \in (0, \varepsilon_0]$. Then there exists a $C_\varepsilon > 0$ such that for any $X_1, X_2 \in \mathbb{R}^d$ satisfying $|X_1 - X_2| \in [R_1 + R_2 - \varepsilon, R_1 + R_2 - \frac{\varepsilon}{2})$, we have that $\nabla_i \widetilde U(X_1, X_2)$ is parallel to $X_2 - X_1$ in $\mathbb{R}^d$, and
\[
(X_1 - X_2)\cdot\nabla_1 \widetilde U(X_1, X_2) \le -C_\varepsilon, \qquad (X_1 - X_2)\cdot\nabla_2 \widetilde U(X_1, X_2) \ge C_\varepsilon.
\]
Proof. First notice that by assumption and (3.39), we have for any $\vec X = (X_1, X_2) \in \mathbb{R}^{2d}$
\[
\begin{aligned}
\nabla_i \widetilde U(\vec X)
&= \int_{\mathbb{R}^{2d}} \nabla U_i(X_i - x)\,\rho\Big(\tfrac12 |v|^2 + U_1(X_1 - x) + U_2(X_2 - x)\Big)\,dx\,dv \\
&= \int_{\mathbb{R}^{2d}} h_i'(|X_i - x|)\,\frac{X_i - x}{|X_i - x|}\,\rho\Big(\tfrac12 |v|^2 + h_1(|X_1 - x|) + h_2(|X_2 - x|)\Big)\,dx\,dv.
\end{aligned}
\]
From this, it is easy to see that $\nabla_i \widetilde U(\vec X)$ is parallel to $X_1 - X_2$ in $\mathbb{R}^d$.
For the second half of the lemma, since the proofs are similar, we only prove the first assertion. Notice that for any $x \in B_2$, since $|X_1 - X_2| \ge R_1 + R_2 - \varepsilon$, we have that $|X_1 - x| \ge R_1 - \varepsilon$, $|X_2 - x| \ge R_2 - \varepsilon$. By our assumption, $U_1(X_1 - x) = h_1(|X_1 - x|)$, $U_2(X_2 - x) = h_2(|X_2 - x|)$. Therefore, by (6.2),
\[
\nabla_1 \widetilde U(X_1, X_2) = \int_{B_2} dx \int_0^{h_2(|X_2 - x|)} p''(h_1(|X_1 - x|) + u)\,du\; h_1'(|X_1 - x|)\,\frac{X_1 - x}{|X_1 - x|}.
\]
Notice that in this integral domain, since $\varepsilon \le \varepsilon_0 < R_1 \wedge R_2$, we have $(X_1 - X_2)\cdot\frac{X_1 - x}{|X_1 - x|} > 0$. By assumption,
\[
h_1(|X_1 - x|) > 0, \quad h_2(|X_2 - x|) < 0, \quad h_1'(|X_1 - x|) < 0, \quad h_2'(|X_2 - x|) > 0.
\]
Also, since $d \ge 3$, we have by (3.42) that $p''(s) < 0$ for any $s < e_0$. Therefore, if we set
\[
\widetilde B_2 = \Big\{x;\ |X_1 - x| \le R_1 - \frac{\varepsilon}{6},\ |X_2 - x| \le R_2 - \frac{\varepsilon}{6}\Big\} \subset B_2,
\]
then
\[
-(X_1 - X_2)\cdot\nabla_1 \widetilde U(X_1, X_2) \ge -\int_{\widetilde B_2} dx \int_0^{h_2(|X_2 - x|)} p''(h_1(|X_1 - x|) + u)\,du \times h_1'(|X_1 - x|)\,(X_1 - X_2)\cdot\frac{X_1 - x}{|X_1 - x|}.
\]
We have by (3.42) that $-p''(s) > 0$ for any $|s| \le \|U_1\|_\infty + \|U_2\|_\infty$; also, $-p''(\cdot)$ is continuous in this closed interval. Therefore, there exists a constant $C_1 > 0$ such that
\[
\inf\{-p''(s);\ |s| \le \|U_1\|_\infty + \|U_2\|_\infty\} \ge C_1.
\]
Moreover, for any $x \in \widetilde B_2$, we have that
\[
|X_1 - x| \ge |X_1 - X_2| - |X_2 - x| \ge (R_1 + R_2 - \varepsilon) - \Big(R_2 - \frac{\varepsilon}{6}\Big) = R_1 - \frac{5}{6}\varepsilon,
\]
i.e. $|X_1 - x| \in [R_1 - \frac56\varepsilon, R_1 - \frac{\varepsilon}{6}]$. In the same way, $|X_2 - x| \in [R_2 - \frac56\varepsilon, R_2 - \frac{\varepsilon}{6}]$. So by assumption, there exists a constant $C_{\varepsilon,1} > 0$ (which does not depend on $x$) such that
\[
h_1(|X_1 - x|) \ge C_{\varepsilon,1}, \quad h_2(|X_2 - x|) \le -C_{\varepsilon,1}, \quad h_1'(|X_1 - x|) \le -C_{\varepsilon,1}, \quad h_2'(|X_2 - x|) \ge C_{\varepsilon,1}.
\]
Also, we have that
\[
(X_1 - X_2)\cdot\frac{X_1 - x}{|X_1 - x|} \ge \frac{(R_1 + R_2 - \varepsilon)(R_1 - \varepsilon)}{R_1}.
\]
Indeed, if we decompose $X_1 - x$ into $X_1 - x = (X_1 - \bar x) + (\bar x - x)$ with $X_1 - \bar x \parallel X_1 - X_2$ and $\bar x - x \perp X_1 - X_2$, then $X_2 - x = (X_2 - \bar x) + (\bar x - x)$ is also an orthogonal decomposition. So $R_2^2 \ge |X_2 - x|^2 = |X_2 - \bar x|^2 + |\bar x - x|^2$, hence $|X_2 - \bar x| \le R_2$. Also, $|X_1 - X_2| \ge R_1 + R_2 - \varepsilon$. So $|X_1 - \bar x| \ge |X_1 - X_2| - |X_2 - \bar x| \ge (R_1 + R_2 - \varepsilon) - R_2 = R_1 - \varepsilon$. Therefore,
\[
(X_1 - X_2)\cdot\frac{X_1 - x}{|X_1 - x|} = \frac{|X_1 - X_2|\,|X_1 - \bar x|}{|X_1 - x|} \ge \frac{(R_1 + R_2 - \varepsilon)(R_1 - \varepsilon)}{R_1}.
\]
Combining these, we get that
\[
\begin{aligned}
-(X_1 - X_2)\cdot\nabla_1 \widetilde U(X_1, X_2)
&\ge -\int_{\widetilde B_2} dx \int_0^{h_2(|X_2 - x|)} p''(h_1(|X_1 - x|) + u)\,du \times h_1'(|X_1 - x|)\,(X_1 - X_2)\cdot\frac{X_1 - x}{|X_1 - x|} \\
&\ge C_{\varepsilon,1}\,C_1\,C_{\varepsilon,1}\,\frac{(R_1 + R_2 - \varepsilon)(R_1 - \varepsilon)}{R_1} \int_{\widetilde B_2} dx,
\end{aligned}
\]
which gives us our first assertion.
As a direct corollary of Lemma 6.1.1, we have the following.
Lemma 6.1.2. Let $\varepsilon \in (0, \varepsilon_0]$, and let $X_1, X_2 \in \mathbb{R}^d$ satisfy $|X_1 - X_2| \in [R_1 + R_2 - \varepsilon, R_1 + R_2)$. Then we have that
\[
(X_1 - X_2)\cdot\nabla_1 \widetilde U(X_1, X_2) < 0, \qquad (X_1 - X_2)\cdot\nabla_2 \widetilde U(X_1, X_2) > 0.
\]
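We also have the following as an easy corollary of Lemma 6.1.1.
Corollary 6.1.3. Assume that $t_1, t_2 \in [0, \sigma_n]$ satisfy $|t_1 - t_2| \le \frac{\varepsilon}{4n}$, and $|X_1(t_1) - X_2(t_1)| \in [R_1 + R_2 - \varepsilon, R_1 + R_2 - \frac{\varepsilon}{2})$. Then
\[
-(X_1(t_2) - X_2(t_2))\cdot\nabla_1 \widetilde U(X_1(t_1), X_2(t_1)) \ge C_\varepsilon\Big(1 - \frac{\varepsilon/2}{R_1 + R_2 - \varepsilon}\Big).
\]
Proof. By using the general fact that $\frac{(a,b)}{|b|^2} \ge 1 - \frac{|a-b|}{|b|}$ for any $a, b \in \mathbb{R}^d$, we get by Lemma 6.1.1 that for any $(\widehat X_1, \widehat X_2)$ with $|(\widehat X_1 - \widehat X_2) - (X_1 - X_2)| < |X_1 - X_2|$, we have
\[
\begin{aligned}
-(\widehat X_1 - \widehat X_2)\cdot\nabla_1 \widetilde U(X_1, X_2)
&= -(X_1 - X_2)\cdot\nabla_1 \widetilde U(X_1, X_2)\,\frac{(\widehat X_1 - \widehat X_2,\ X_1 - X_2)}{|X_1 - X_2|^2} \\
&\ge C_\varepsilon\Big(1 - \frac{|(\widehat X_1 - \widehat X_2) - (X_1 - X_2)|}{|X_1 - X_2|}\Big).
\end{aligned}
\]
Under our assumption, we have $|X_1(t_1) - X_1(t_2)| \le n|t_1 - t_2| \le \frac{\varepsilon}{4}$, and similarly $|X_2(t_1) - X_2(t_2)| \le \frac{\varepsilon}{4}$. Therefore, by the argument above,
\[
\begin{aligned}
-(X_1(t_2) - X_2(t_2))\cdot\nabla_1 \widetilde U(X_1(t_1), X_2(t_1))
&\ge C_\varepsilon\Big(1 - \frac{|(X_1(t_2) - X_2(t_2)) - (X_1(t_1) - X_2(t_1))|}{|X_1(t_1) - X_2(t_1)|}\Big) \\
&\ge C_\varepsilon\Big(1 - \frac{\varepsilon/2}{R_1 + R_2 - \varepsilon}\Big).
\end{aligned}
\]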
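The "general fact" used in the proof above follows from the Cauchy–Schwarz inequality, since $(a,b) = |b|^2 + (a-b, b) \ge |b|^2 - |a-b|\,|b|$. As a minimal numerical sanity check (not part of the original argument), the following Python sketch tests the inequality on random vectors; the function names `inner_ratio`, `lower_bound` and `check_random` are hypothetical helpers introduced only for this illustration.

```python
import math
import random


def inner_ratio(a, b):
    """Return (a, b) / |b|^2 for vectors a, b given as sequences of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / sum(y * y for y in b)


def lower_bound(a, b):
    """Return 1 - |a - b| / |b|, the claimed lower bound for inner_ratio(a, b)."""
    diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 - diff / math.sqrt(sum(y * y for y in b))


def check_random(trials=1000, dim=3, seed=0):
    """Test the inequality on random pairs; return True if it never fails."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = [rng.uniform(-5, 5) for _ in range(dim)]
        b = [rng.uniform(-5, 5) for _ in range(dim)]
        # Small tolerance to absorb floating-point rounding.
        if inner_ratio(a, b) < lower_bound(a, b) - 1e-12:
            return False
    return True
```

Equality is attained, for instance, when $a = \frac12 b$, which is one of the cases a quick test can confirm.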
6.2. Tightness
Same as before, we only need to discuss under the condition $|V_i| \le n$, i.e. use $t \wedge \sigma_n$ instead of $t$, and finally take $n \to \infty$.
We first show that the condition of Lemma 3.5.2 is satisfied. For any $\vec X = (X_1, X_2) \in \mathbb{R}^{2d}$ with $|X_1 - X_2| < R_1 + R_2$ big enough (that is, close enough to $R_1 + R_2$), by Lemma 6.1.1, we have that $\nabla_i \widetilde U(\vec X)$ is parallel to $X_1 - X_2$ in $\mathbb{R}^d$, and by Lemma 6.1.2, $\nabla_1 \widetilde U(\vec X)$ has the opposite direction as $X_1 - X_2$, and $\nabla_2 \widetilde U(\vec X)$ has the same direction as $X_1 - X_2$. Therefore, let $g_1(\vec X) = \frac{X_2 - X_1}{|X_2 - X_1|}$, $g_2(\vec X) = \frac{X_1 - X_2}{|X_1 - X_2|}$, and let $\bar D = \{(X_1, X_2)\ |\ |X_i| \le |X_{i,0}| + nT,\ |X_1 - X_2| \ge R_1 + R_2 - \varepsilon_0\}$. Then, since $R_1 + R_2 - \varepsilon_0 > 0$, we have that $g_1, g_2 \in C_b^1(\bar D)$ and $g_i(\vec X)\cdot\nabla_i \widetilde U(\vec X) = |\nabla_i \widetilde U(\vec X)|$ for any $\vec X \in \bar D$, i.e. the condition of Lemma 3.5.2 is satisfied.
We next give a brief proof of the tightness of {the distribution of $\{(\vec X(t\wedge\sigma_n), \vec V(t\wedge\sigma_n))\}_t$ under $P_m\}_m$ as $m \to 0$. The only difficulty is the assertion with respect to $\vec V(\cdot)$. We deal with it from now on. Let $A_k = \{Y_t : \int_0^T |dY_t| \le k\}$, $k \in \mathbb{N}$. Then we have by Kusuoka [9, Corollary 8] that $A_k$ is compact in $L^p([0,T]; \mathbb{R}^d)$ with cluster points in $D([0,T]; \mathbb{R}^d)$ for any $k \in \mathbb{N}$. Also, by Lemma 3.5.2(1), there exists a constant $C > 0$ such that
\[
\begin{aligned}
P_m \circ \Big(m^{-1/2}\int_0^{\cdot\wedge\sigma_n} \nabla_i \widetilde U(\vec X(s))\,ds\Big)^{-1}(A_k)
&= 1 - P_m\Big(\int_0^{T\wedge\sigma} m^{-1/2}\,|\nabla_i \widetilde U(\vec X(s))|\,ds > k\Big) \\
&\ge 1 - \frac{1}{k}\,E^{P_m}\Big[\int_0^{T\wedge\sigma} m^{-1/2}\,|\nabla_i \widetilde U(\vec X(s))|\,ds\Big] \\
&\ge 1 - \frac{C}{k},
\end{aligned}
\]
which converges to 1 as $k \to \infty$. Therefore, $\{\{m^{-1/2}\int_0^{t\wedge\sigma_n} \nabla_i \widetilde U(\vec X(s))\,ds\}_t$ under $P_m\}_{m\in(0,1]}$ is tight in $\wp(D([0,T]; \mathbb{R}^d))$. Therefore, since by Lemma 3.5.1,
\[
M_i(V_i(t\wedge\sigma) - V_i(0)) = M_i(t) + \eta_i(t) + P_i^{*1} - m^{-1/2}\int_0^{t\wedge\sigma} \nabla_i \widetilde U(\vec X(s))\,ds,
\]
and the distributions of $M_i(t) + \eta_i(t)$ and $P_i^{*1}$ under $P_m$ are tight in $\wp(D([0,T]; \mathbb{R}^d))$, we get the conclusion that $\{\{V_i(t\wedge\sigma_n)\}_t$ under $P_m\}_{m\to0}$ is tight in $\wp(D([0,T]; \mathbb{R}^d))$.
6.3. Convergence to a Markov process
The idea is similar to that presented by Kusuoka in [9]. Let us first recall the following existence and uniqueness theorem of Kusuoka [9, Theorem 1].
Let $D$ be a bounded domain in $\mathbb{R}^d$ with a smooth boundary $\partial D$ and let $n(x)$, $x \in \partial D$, be the outer normal vector at $x \in \partial D$. Let
\[
L_0 = \sum_{i=1}^{d} v^i \frac{\partial}{\partial x^i} + \frac12 \sum_{i,j=1}^{d} a^{ij}(x) \frac{\partial^2}{\partial v^i \partial v^j} + \sum_{i=1}^{d} b^i(x, v) \frac{\partial}{\partial v^i},
\]
where $a^{ij}: \mathbb{R}^d \to \mathbb{R}$, $i, j = 1, \ldots, d$, are smooth functions, symmetric with respect to $i, j$ and uniformly elliptic with respect to $x$, and $b^i: \mathbb{R}^{2d} \to \mathbb{R}$, $i = 1, \ldots, d$, are bounded measurable functions. Let $\Phi: \mathbb{R}^d \times \partial D \to \mathbb{R}^d$ be a smooth map satisfying the following:
(1) $\Phi(\cdot, x): \mathbb{R}^d \to \mathbb{R}^d$ is linear for all $x \in \partial D$,
(2) $\Phi(v, x) = v$ for any $x \in \partial D$ and $v \in T_x(\partial D)$, i.e. $\Phi(v, x) = v$ if $x \in \partial D$, $v \in \mathbb{R}^d$ and $v \cdot n(x) = 0$,
(3) $\Phi(\Phi(v, x), x) = v$ for all $v \in \mathbb{R}^d$ and $x \in \partial D$,
(4) $\Phi(n(x), x) \ne n(x)$ for any $x \in \partial D$.
Then Kusuoka [9, Theorem 1] proved the following:
Theorem 6.3.1. Let $(x_0, v_0) \in (\bar D)^C \times \mathbb{R}^d$. Then there exists a unique probability measure $\mu$ over $\widetilde W^d$ satisfying the following:
(1) $\mu(\omega(0) = (x_0, v_0)) = 1$,
(2) $\mu(\omega(t) \in D^C \times \mathbb{R}^d,\ t \in [0,\infty)) = 1$,
(3) For any $f \in C_0^\infty((\bar D)^C \times \mathbb{R}^d)$, $\{f(\omega(t)) - \int_0^t L_0 f(\omega(s))\,ds;\ t \ge 0\}$ is a martingale under $\mu(d\omega)$,
(4) $\mu(1_{\partial D}(x(t))(v(t) - \Phi(v(t-), x(t))) = 0$ for all $t \in [0,\infty)) = 1$.
Here $\omega(\cdot) = (x(\cdot), v(\cdot)) \in \widetilde W^d$.
By using this, we get the following slight variation. Recall that $D_0 = \{(X_1, X_2) \in \mathbb{R}^{2d};\ |X_1 - X_2| > R_1 + R_2\}$ in our present setting.
Theorem 6.3.2. There exists a unique probability measure $P_{\infty,0}$ over $D([0,\infty); \mathbb{R}^{4d})$ satisfying the following.
(1) $P_{\infty,0}(\omega(0) = (x_0, v_0)) = 1$,
(2) $P_{\infty,0}(\vec X(t) \in \bar D_0,\ t \in [0,\infty)) = 1$,
(3) For any $f \in C_0^\infty(D_0 \times \mathbb{R}^{2d})$, $\{f(\vec X(t), \vec V(t)) - \int_0^t (Lf)(\vec X(s), \vec V(s))\,ds;\ t \ge 0\}$ is a martingale under $P_{\infty,0}$,
(4) If $f \in C_0^\infty(\mathbb{R}^{4d})$ satisfies
\[
M_1^{-1}(\nabla_{v_1} f)(x, v)\cdot(x_1 - x_2) + M_2^{-1}(\nabla_{v_2} f)(x, v)\cdot(x_2 - x_1) = 0 \tag{6.3}
\]
for any $(x, v) \in \partial D_0 \times \mathbb{R}^{2d}$, then $f(\vec X(t), \vec V(t))$ is continuous in $t$, $P_{\infty,0}$-a.s.,
(5) $M_1 |V_1(t)|^2 + M_2 |V_2(t)|^2$ is continuous in $t$, $P_{\infty,0}$-a.s.
Proof. We define $\Phi(v, x)$, $(v, x) = (v_1, v_2, x_1, x_2) \in \mathbb{R}^{4d}$, in the following way: for any such $v_1, v_2, x_1, x_2 \in \mathbb{R}^d$, decompose $v_1$ and $v_2$ into $v_i = u_i + w_i$ with $u_i \perp x_1 - x_2$ and $w_i \parallel x_1 - x_2$, $i = 1, 2$, and let $\Phi(v, x) = (\Phi_1(v, x), \Phi_2(v, x))$ with
\[
\Phi_1(v, x) = u_1 + \frac{M_1 - M_2}{M_1 + M_2}\,w_1 + \frac{2M_2}{M_1 + M_2}\,w_2, \qquad
\Phi_2(v, x) = u_2 + \frac{2M_1}{M_1 + M_2}\,w_1 + \frac{M_2 - M_1}{M_1 + M_2}\,w_2.
\]
Then $\Phi$ satisfies the conditions before Theorem 6.3.1.
We first check the fact that a probability $\mu$ satisfying the conditions (1)–(4) of Theorem 6.3.1 with $\Phi$ given above also satisfies conditions (1)–(5) of Theorem 6.3.2. All except (4) are trivial. For (4), it is sufficient to show that
\[
f(x, \Phi(v, x)) = f(x, v)
\]
for any $x \in \partial D_0$ if it satisfies (6.3). We show it in the following. Since
\[
\Phi_1(v, x) - v_1 = \frac{2M_2}{M_1 + M_2}(w_2 - w_1), \qquad \Phi_2(v, x) - v_2 = \frac{2M_1}{M_1 + M_2}(w_1 - w_2),
\]
we have
\[
\begin{aligned}
f(x, \Phi(v, x)) - f(x, v)
&= \int_0^1 \big[\nabla_{v_1} f(x, v + t(\Phi(v, x) - v))\cdot(\Phi_1(v, x) - v_1) + \nabla_{v_2} f(x, v + t(\Phi(v, x) - v))\cdot(\Phi_2(v, x) - v_2)\big]\,dt \\
&= -\frac{2M_1 M_2}{M_1 + M_2} \int_0^1 \big[-M_1^{-1}\nabla_{v_1} f + M_2^{-1}\nabla_{v_2} f\big](x, v + t(\Phi(v, x) - v))\cdot(w_2 - w_1)\,dt \\
&= 0,
\end{aligned}
\]
where in the last line we used (6.3) and the fact that $w_2 - w_1 \parallel x_2 - x_1$.
For the opposite direction, i.e. the fact that a probability $\mu$ satisfying the conditions (1)–(5) of Theorem 6.3.2 also satisfies conditions (1)–(4) of Theorem 6.3.1 with $\Phi$ given above, we only need to check that (4) of Theorem 6.3.1 is satisfied, or equivalently, show that $\vec V(\sigma) = \Phi(\vec V(\sigma-), \vec X(\sigma-))$ if $\vec X(\sigma) \in \partial D_0$. Choose any $w \in \mathbb{R}^d$ and fix it for a while. Let $f(x, v) = M_1 v_1\cdot w + M_2 v_2\cdot w$. Then $f$ satisfies (6.3), so by (4) of Theorem 6.3.2, $f(\vec X(t), \vec V(t))$ is continuous in $t$. We write it down together with (5) of Theorem 6.3.2:
\[
M_1 V_1(t) + M_2 V_2(t) \text{ is continuous in } t, \qquad M_1 V_1^2(t) + M_2 V_2^2(t) \text{ is continuous in } t.
\]
Solving these two equations, we get that either
\[
\Phi_1(\vec V(\sigma-), \vec X(\sigma))\cdot w = V_1(\sigma)\cdot w \quad\text{and}\quad \Phi_2(\vec V(\sigma-), \vec X(\sigma))\cdot w = V_2(\sigma)\cdot w \tag{6.4}
\]
or
\[
V_1(\sigma-)\cdot w = V_1(\sigma)\cdot w \quad\text{and}\quad V_2(\sigma-)\cdot w = V_2(\sigma)\cdot w. \tag{6.5}
\]
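The alternative (6.4)/(6.5) reflects the familiar fact that conservation of momentum and of kinetic energy along the line of centres leaves exactly two outcomes: no jump, or the elastic-collision map $\Phi$. The following Python sketch is only an illustration under that reading; the helper names `split` and `phi` are hypothetical and simply transcribe the formulas for $\Phi_1$, $\Phi_2$ given in the proof.

```python
import math


def split(v, e):
    """Split v into (u, w): w is the part along the unit vector e, u the rest."""
    c = sum(x * y for x, y in zip(v, e))
    w = [c * y for y in e]
    u = [x - y for x, y in zip(v, w)]
    return u, w


def phi(v1, v2, x1, x2, m1, m2):
    """Apply the map (v1, v2) -> (Phi_1, Phi_2) for positions x1 != x2."""
    d = [a - b for a, b in zip(x1, x2)]
    norm = math.sqrt(sum(t * t for t in d))
    e = [t / norm for t in d]          # unit vector along x1 - x2
    u1, w1 = split(v1, e)
    u2, w2 = split(v2, e)
    s = m1 + m2
    p1 = [a + (m1 - m2) / s * b + 2 * m2 / s * c for a, b, c in zip(u1, w1, w2)]
    p2 = [a + 2 * m1 / s * b + (m2 - m1) / s * c for a, b, c in zip(u2, w1, w2)]
    return p1, p2


def momentum(v1, v2, m1, m2):
    """Total momentum m1*v1 + m2*v2, componentwise."""
    return [m1 * a + m2 * b for a, b in zip(v1, v2)]


def energy(v1, v2, m1, m2):
    """The quantity M1|V1|^2 + M2|V2|^2 from condition (5)."""
    return m1 * sum(a * a for a in v1) + m2 * sum(b * b for b in v2)
```

Applying `phi` twice returns the original velocities, which corresponds to condition (3) before Theorem 6.3.1, and both `momentum` and `energy` are unchanged by `phi`.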
If $w$ is orthogonal to $X_1(\sigma) - X_2(\sigma)$, then these two conditions are equivalent, so both of them hold, which means that there is no jump at time $\sigma$ in any of these directions. Now, the only thing left to be checked is that (6.4) also holds for any $w \parallel X_1(\sigma) - X_2(\sigma)$. If not, then (6.5) holds, so $V_i(\sigma) = V_i(\sigma-)$ for $i = 1, 2$. Since
\[
\frac{d}{dt}(X_1(t) - X_2(t))^2 = 2\,(X_1(t) - X_2(t))\cdot(V_1(t) - V_2(t)),
\]
this implies that
\[
\frac{d}{dt}(X_1(t) - X_2(t))^2\Big|_{t=\sigma-} = \frac{d}{dt}(X_1(t) - X_2(t))^2\Big|_{t=\sigma}. \tag{6.6}
\]
If (6.6) is equal to 0, then $V_1(\sigma-) - V_2(\sigma-)$ is orthogonal to $X_1(\sigma) - X_2(\sigma)$, so by the definition of $\Phi$ this implies that $\Phi(\vec V(\sigma-), \vec X(\sigma)) = \vec V(\sigma-)$, which combined with our assumption implies that $\Phi(\vec V(\sigma-), \vec X(\sigma)) = \vec V(\sigma)$, the very equation that we need. If (6.6) is not equal to 0, write it as $C \in \mathbb{R}$; then by the continuity of $\frac{d}{dt}(X_1(t) - X_2(t))^2|_{t=\sigma}$, there exists an $\varepsilon > 0$ small enough such that either $(X_1(\sigma - \varepsilon) - X_2(\sigma - \varepsilon))^2$ or $(X_1(\sigma + \varepsilon) - X_2(\sigma + \varepsilon))^2$ is less than $(X_1(\sigma) - X_2(\sigma))^2 - \frac{|C|}{2}\cdot\varepsilon = (R_1 + R_2)^2 - \frac{|C|}{2}\cdot\varepsilon$. This contradicts the condition (2). Therefore, (6.4), i.e. (4) of Theorem 6.3.1, holds.
We have already shown in Sec. 6.2 that $\{\{(\vec X(t\wedge\sigma_n), \vec V(t\wedge\sigma_n))\}_t$ under $P_m\}_{m\to0}$ is tight. We show from now on that any cluster point of it satisfies all of the conditions of Theorem 6.3.2. The fact that any of its cluster points satisfies (1) is trivial. The fact that it satisfies (3) is nothing but Lemma 3.5.3. So we only need to show that (2), (4) and (5) are also satisfied.
We show (2) first. Choose an arbitrary $\varepsilon > 0$ and fix it for a while. Let
\[
\xi = \xi_\varepsilon = \inf\Big\{t > 0;\ |X_1(t) - X_2(t)| \le R_1 + R_2 - \frac34\varepsilon\Big\} \wedge \sigma_n \wedge T.
\]
Then (2) is implied by the following.
Lemma 6.3.3. Let $\varepsilon \in (0, \varepsilon_0]$ and let $\xi$ be as defined above. Then
\[
\lim_{m\to0} P_m(\xi < T \wedge \sigma_n) = 0.
\]
This result is easy to imagine, since as $m \to 0$, $m^{-1/2} \to \infty$, so by Corollary 6.1.3, the term $-m^{-1/2}\int_0^{t\wedge\sigma} \nabla_i \widetilde U(\vec X(s))\,ds$ in the decomposition of $M_i V_i(t)$ gives us a very strong force as soon as the distance between the two molecules is less than $R_1 + R_2$.
Proof. Notice that if $\xi < T \wedge \sigma_n$, then $|X_1(\xi) - X_2(\xi)| = R_1 + R_2 - \frac34\varepsilon$, hence
\[
|X_1(t) - X_2(t)| \in \Big[R_1 + R_2 - \varepsilon,\ R_1 + R_2 - \frac{\varepsilon}{2}\Big), \qquad \text{for any } t \in \Big[\xi - \frac{\varepsilon}{8n},\ \xi\Big].
\]
2
|X1 (t) − X2 (t)| = |X1 (0) − X2 (0)| + 2
0
t
(X1 (s) − X2 (s))
· M1 (s) − M2 (s) + η1 (s) − η2 (s) + P1∗1 (s) − P2∗1 (s) −m
−1/2
0
s
(∇1 U (X1 (u), X2 (u)) − ∇2 U (X1 (u), X2 (u)))du ds,
August 10, J070-S0129055X10004077
830
so
2010 15:0 WSPC/S0129-055X
148-RMP
S. Kusuoka & S. Liang
3 R1 + R2 − ε 4
2
− (R1 + R2 − ε)2
2 ε ε ≥ |X1 (ξ) − X2 (ξ)|2 − X1 ξ − − X2 ξ − 8n 8n ξ
(X1 (s) − X2 (s)) · M1 (s) − M2 (s) + η1 (s) − η2 (s)
=2 ε ξ− 8n
+ P1∗1 (s) − P2∗1 (s)
− m−1/2 −m
0
−1/2
1 (u), X2 (u)) − ∇2 U (X1 (u), X2 (u)))du (∇1 U(X
s
ε ξ− 8n
≥ −2
ε ξ− 8n
1 (u), X2 (u)) − ∇2 U (X1 (u), X2 (u)))du ds (∇1 U(X
ξ
ε R1 + R2 − 2
ε ξ− 8n
|M1 (s)| + |M2 (s)| + |η1 (s)| + |η2 (s)|
+ |P1∗1 (s)| + |P2∗1 (s)| +m
−1/2
T ∧σn
0
+ 2m−1/2
(X1 (u), X2 (u))| + |∇2 U (X1 (u), X2 (u)))| du (|∇1 U
ξ
s
ds ε ξ− 8n
ε ξ− 8n
ds
[−(X1 (s) − X2 (s))
(X1 (u), X2 (u)) − ∇2 U (X1 (u), X2 (u)))]du. · (∇1 U
(6.7)
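Let $C_1 = (R_1 + R_2 - \varepsilon)^2 - (R_1 + R_2 - \frac34\varepsilon)^2$ and $C_2 = (\frac{\varepsilon}{8n})^2\,C_\varepsilon\big(1 - \frac{\varepsilon/2}{R_1 + R_2 - \varepsilon}\big) > 0$, where $C_\varepsilon$ is the constant given in Lemma 6.1.1 and Corollary 6.1.3. Notice that $C_1$ and $C_2$ depend only on $R_1 + R_2$, $\varepsilon$ and $n$, and do not depend on $m$. Also, write $Y_s = |M_1(s)| + |M_2(s)| + |\eta_1(s)| + |\eta_2(s)| + |P_1^{*1}(s)| + |P_2^{*1}(s)|$. Then with the help of Corollary 6.1.3, (6.7) implies that
\[
\begin{aligned}
\xi < T \wedge \sigma_n \ \Rightarrow\ 
& 2\Big(R_1 + R_2 - \frac{\varepsilon}{2}\Big)\int_{\xi - \frac{\varepsilon}{8n}}^{\xi} Y_s\,ds + \frac{\varepsilon}{4n}\Big(R_1 + R_2 - \frac{\varepsilon}{2}\Big) \int_0^{T\wedge\sigma_n} m^{-1/2}\big(|\nabla_1 \widetilde U(X_1(u), X_2(u))| + |\nabla_2 \widetilde U(X_1(u), X_2(u))|\big)\,du \\
&\ge (R_1 + R_2 - \varepsilon)^2 - \Big(R_1 + R_2 - \frac34\varepsilon\Big)^2 \\
&\quad + 2m^{-1/2}\int_{\xi - \frac{\varepsilon}{8n}}^{\xi} ds \int_{\xi - \frac{\varepsilon}{8n}}^{s} \big[-(X_1(s) - X_2(s))\cdot\big(\nabla_1 \widetilde U(X_1(u), X_2(u)) - \nabla_2 \widetilde U(X_1(u), X_2(u))\big)\big]\,du \\
&\ge C_1 + 2m^{-1/2}\,C_\varepsilon\Big(1 - \frac{\varepsilon/2}{R_1 + R_2 - \varepsilon}\Big)\int_{\xi - \frac{\varepsilon}{8n}}^{\xi} ds \int_{\xi - \frac{\varepsilon}{8n}}^{s} du \\
&= C_1 + 2m^{-1/2}\,C_\varepsilon\Big(1 - \frac{\varepsilon/2}{R_1 + R_2 - \varepsilon}\Big)\,\frac12\Big(\frac{\varepsilon}{8n}\Big)^2 \\
&= C_1 + m^{-1/2} C_2.
\end{aligned}
\]
Let $C_3 = \sup_{m\in(0,1]}\big\{2\int_0^T E^{P_m}[Y_s]\,ds + \frac{\varepsilon}{4n}\sum_{i=1}^{2} E^{P_m}\big[\int_0^{T\wedge\sigma_n} m^{-1/2}|\nabla_i \widetilde U(X_1(u), X_2(u))|\,du\big]\big\}$, which is finite by Lemmas 3.5.1 and 3.5.2. Then the above implies that
\[
\begin{aligned}
P_m(\xi < T \wedge \sigma_n)
&\le P_m\Big(2\Big(R_1 + R_2 - \frac{\varepsilon}{2}\Big)\int_{\xi - \frac{\varepsilon}{8n}}^{\xi} Y_s\,ds + \frac{\varepsilon}{4n}\Big(R_1 + R_2 - \frac{\varepsilon}{2}\Big) \\
&\qquad\quad \times \int_0^{T\wedge\sigma_n} m^{-1/2}\big(|\nabla_1 \widetilde U(X_1(u), X_2(u))| + |\nabla_2 \widetilde U(X_1(u), X_2(u))|\big)\,du \ge C_1 + m^{-1/2} C_2\Big) \\
&\le \frac{1}{C_1 + m^{-1/2} C_2}\,E^{P_m}\Big[2\Big(R_1 + R_2 - \frac{\varepsilon}{2}\Big)\int_{\xi - \frac{\varepsilon}{8n}}^{\xi\wedge\sigma_n} Y_s\,ds \\
&\qquad\quad + \frac{\varepsilon}{4n}\Big(R_1 + R_2 - \frac{\varepsilon}{2}\Big)\int_0^{T\wedge\sigma_n} m^{-1/2}\big(|\nabla_1 \widetilde U(X_1(u), X_2(u))| + |\nabla_2 \widetilde U(X_1(u), X_2(u))|\big)\,du\Big] \\
&\le \frac{1}{C_1 + m^{-1/2} C_2}\Big(R_1 + R_2 - \frac{\varepsilon}{2}\Big) C_3,
\end{aligned}
\]
which converges to 0 as $m \to 0$. This completes the proof of our assertion.
We next show that the condition (5) of Theorem 6.3.2 is satisfied, i.e. $M_1|V_1(t)|^2 + M_2|V_2(t)|^2$ is continuous in $t$ almost surely, under any limit probability.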
We first prepare the following:
Lemma 6.3.4. $-\nabla_1 \widetilde U(Y_1, Y_2)\cdot\frac{Y_1 - Y_2}{|Y_1 - Y_2|}$ is monotone non-increasing with respect to $|Y_1 - Y_2|$ for $|Y_1 - Y_2| \in [R_1 + R_2 - \varepsilon_0, R_1 + R_2]$.
Proof. As in the proof of Lemma 6.1.1, by (6.2), we have that in our present setting,
\[
-\nabla_1 \widetilde U(Y_1, Y_2)\cdot\frac{Y_1 - Y_2}{|Y_1 - Y_2|} = -\int_{B_{Y_1,Y_2}} dx \int_0^{h_2(|Y_2 - x|)} p''(h_1(|Y_1 - x|) + u)\,du \times h_1'(|Y_1 - x|)\,\frac{Y_1 - x}{|Y_1 - x|}\cdot\frac{Y_1 - Y_2}{|Y_1 - Y_2|}.
\]
Let $\overline B_{Y_1,Y_2} = \{(s, t)\ |\ \exists x \in B_{Y_1,Y_2},\ s = |Y_1 - x|,\ t = |Y_2 - x|\}$, and for any $(s, t) \in \overline B_{Y_1,Y_2}$, let $\alpha, \beta, \theta$ be the angles between $Y_1Y_2$ and $Y_1x$, $Y_2Y_1$ and $Y_2x$, $xY_1$ and $xY_2$, respectively. Write $A = |Y_1 - Y_2|$. Finally, let $l(s, t)$ denote the length of the hypercircle $\{x \in \mathbb{R}^d;\ |Y_1 - x| = s,\ |Y_2 - x| = t\}$ in $\mathbb{R}^{d-2}$. Then by using a change of variables,
\[
-\nabla_1 \widetilde U(Y_1, Y_2)\cdot\frac{Y_1 - Y_2}{|Y_1 - Y_2|} = \int_{\overline B_{Y_1,Y_2}} ds\,dt \int_{h_2(t)}^{0} \big(-p''(h_1(s) + u)\big)\,du\,\big(-h_1'(s)\big)\,l(s, t)\cos\alpha\,\sin\theta.
\]
Notice that all of the terms above are positive. The integration domain $\overline B_{Y_1,Y_2}$ is decreasing with respect to $|Y_1 - Y_2|$. Also, for any fixed $s$ and $t$, the term $l(s, t)$ is also decreasing with respect to $|Y_1 - Y_2|$. Therefore, it is sufficient to show that for any $s, t$ fixed, $\cos\alpha\,\sin\theta$ is decreasing with respect to $A = |Y_1 - Y_2|$. We shall show it from now on. By the sine formula, $\cos\alpha\,\sin\theta = \frac{A}{t}\sin\alpha\,\cos\alpha$. So it suffices to show that $A\sin\alpha\,\cos\alpha$ is monotone decreasing with respect to $A$, or equivalently, is monotone increasing with respect to $\alpha$, for $\alpha > 0$ small enough. It is easy to see that $A = s\cos\alpha + \sqrt{t^2 - s^2\sin^2\alpha}$. So
\[
\begin{aligned}
A\sin\alpha\,\cos\alpha &= s\sin\alpha\,\cos^2\alpha + \sqrt{t^2 - s^2\sin^2\alpha}\,\sin\alpha\,\cos\alpha \\
&= s\sin\alpha\,(1 - \sin^2\alpha) + \sqrt{(t^2 - s^2\sin^2\alpha)(1 - \sin^2\alpha)\sin^2\alpha}.
\end{aligned}
\]
Since $\alpha > 0$ is small enough, $\sin^2\alpha > 0$ is small and monotone increasing with respect to $\alpha$. Also, since $s/t$ is near to $\frac{R_1}{R_2}$ ($> 0$), there exists an $\varepsilon_1 > 0$ such that the functions $f_1(x) = sx(1 - x^2)$ and $f_2(x) = (t^2 - s^2x)(1 - x)x = s^2x(x - 1)(x - \frac{t^2}{s^2})$ are monotone increasing in $x \in [0, \varepsilon_1]$. Combining these, we get the desired property that $A\sin\alpha\,\cos\alpha$ is increasing with respect to $\alpha$ for $\alpha > 0$ small enough, or equivalently, decreasing with respect to $A$. This completes the proof of our assertion.
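The triangle identity used in the proof, $A = s\cos\alpha + \sqrt{t^2 - s^2\sin^2\alpha}$, is the law of cosines $t^2 = s^2 + A^2 - 2sA\cos\alpha$ solved for $A$. The Python sketch below is a numerical illustration only, with hypothetical helper names and sample values of $s$, $t$ chosen for the example; it evaluates $A\sin\alpha\cos\alpha$ on a grid and checks that it increases for small $\alpha$.

```python
import math


def side_a(s, t, alpha):
    """Length A = |Y1 - Y2| of the triangle with sides s, t and angle alpha at Y1."""
    return s * math.cos(alpha) + math.sqrt(t * t - (s * math.sin(alpha)) ** 2)


def product(s, t, alpha):
    """The quantity A * sin(alpha) * cos(alpha) studied in the proof."""
    return side_a(s, t, alpha) * math.sin(alpha) * math.cos(alpha)


def is_increasing(s, t, alpha_max, steps=200):
    """Check on a uniform grid that product(s, t, .) increases on [0, alpha_max]."""
    vals = [product(s, t, alpha_max * k / steps) for k in range(steps + 1)]
    return all(x < y for x, y in zip(vals, vals[1:]))
```

For example, `is_increasing(1.0, 1.5, 0.2)` examines the range $\alpha \in [0, 0.2]$ for the sample sides $s = 1$, $t = 1.5$; a grid check of this kind is of course no substitute for the monotonicity argument via $f_1$ and $f_2$.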
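Let $\xi_0 = \xi_{\varepsilon_0} = \inf\{t > 0;\ |X_1(t) - X_2(t)| \le R_1 + R_2 - \frac34\varepsilon_0\} \wedge \sigma_n \wedge T$. Then by Lemma 6.3.3, $P_m(\xi_0 < T \wedge \sigma_n) \to 0$ as $m \to 0$. We next use Lemma 6.3.4 to prove the following:
Lemma 6.3.5. $\lim_{m\to0} P_m\big(\int_0^{T\wedge\sigma_n\wedge\xi_0} m^{-1/2}(\widetilde U(\vec X(s)) - \widetilde U_0)\,ds > \delta\big) = 0$ for any $\delta > 0$.
Proof. By Lemma 6.1.2, we have that $-\nabla_1 \widetilde U(Y_1, Y_2)\cdot\frac{Y_1 - Y_2}{|Y_1 - Y_2|}$ is positive for $|Y_1 - Y_2| \in [R_1 + R_2 - \varepsilon_0, R_1 + R_2)$. Also, by Lemma 6.3.4, the same quantity is monotone non-increasing with respect to $|Y_1 - Y_2|$. Notice that $\widetilde U(X_1, X_2) = \widetilde U(X_1 - X_2, 0)$. So with a slight abuse of notation, we can write $\widetilde U(X_1, X_2) = \widetilde U(X_1 - X_2)$. We have that $\widetilde U(X_1, X_2) - \widetilde U_0 = 0$ if $|X_1 - X_2| \ge R_1 + R_2$. Also, for any $|X_1 - X_2| < R_1 + R_2$, we have $\widetilde U\big(\frac{R_1 + R_2}{|X_1 - X_2|}(X_1 - X_2)\big) = \widetilde U_0$, and $\frac{R_1 + R_2}{|X_1 - X_2|} + t\big(1 - \frac{R_1 + R_2}{|X_1 - X_2|}\big) \ge 1$ for $t \in [0, 1]$, hence
\[
\begin{aligned}
\widetilde U(X_1, X_2) - \widetilde U_0
&= \widetilde U(X_1 - X_2) - \widetilde U\Big(\frac{R_1 + R_2}{|X_1 - X_2|}(X_1 - X_2)\Big) \\
&= \int_0^1 -\nabla_1 \widetilde U\Big(\frac{R_1 + R_2}{|X_1 - X_2|}(X_1 - X_2) + t\Big(1 - \frac{R_1 + R_2}{|X_1 - X_2|}\Big)(X_1 - X_2)\Big)\cdot\Big(-1 + \frac{R_1 + R_2}{|X_1 - X_2|}\Big)(X_1 - X_2)\,dt \\
&\le \int_0^1 -\nabla_1 \widetilde U(X_1 - X_2)\cdot(X_1 - X_2)\Big(-1 + \frac{R_1 + R_2}{|X_1 - X_2|}\Big)\,dt \\
&= -\nabla_1 \widetilde U(X_1 - X_2)\cdot(X_1 - X_2)\Big(-1 + \frac{R_1 + R_2}{|X_1 - X_2|}\Big) \\
&\le |\nabla_1 \widetilde U(X_1 - X_2)|\,|X_1 - X_2|\,\frac{R_1 + R_2 - |X_1 - X_2|}{|X_1 - X_2|} \\
&= |\nabla_1 \widetilde U(X_1 - X_2)|\,(R_1 + R_2 - |X_1 - X_2|).
\end{aligned}
\]
The first equation in the calculation above also gives us that $\widetilde U(X_1, X_2) - \widetilde U_0$ is non-negative. Also, by (3.31), $\widetilde U(\vec X(s)) - \widetilde U_0 = 0$ if $|X_1(s) - X_2(s)| \ge R_1 + R_2$.
Let $C = \sup_{m\in(0,1]} E^{P_m}\big[\int_0^{T\wedge\sigma_n\wedge\xi_0} m^{-1/2}|\nabla_1 \widetilde U(X_1(s) - X_2(s))|\,ds\big]$, which is finite by Lemma 3.5.2. Then for any $\varepsilon \in (0, \frac34\varepsilon_0)$, we have
\[
\begin{aligned}
&P_m\Big(\int_0^{T\wedge\sigma_n\wedge\xi_0} m^{-1/2}(\widetilde U(\vec X(s)) - \widetilde U_0)\,ds > \delta\Big) \\
&\quad\le P_m\Big(\int_0^{T\wedge\sigma_n\wedge\xi_0} m^{-1/2}|\nabla_1 \widetilde U(X_1(s) - X_2(s))|\,(R_1 + R_2 - |X_1(s) - X_2(s)|)\,1_{\{|X_1(s) - X_2(s)| < R_1 + R_2\}}\,ds > \delta\Big) \\
&\quad\le P_m\Big(\inf_{s\in[0,T\wedge\sigma_n]} |X_1(s) - X_2(s)| \le R_1 + R_2 - \varepsilon\Big) + P_m\Big(\int_0^{T\wedge\sigma_n\wedge\xi_0} m^{-1/2}|\nabla_1 \widetilde U(X_1(s) - X_2(s))|\,ds > \frac{\delta}{\varepsilon}\Big) \\
&\quad\le P_m(\xi_{\frac43\varepsilon} < T \wedge \sigma_n) + \frac{\varepsilon}{\delta}\,E^{P_m}\Big[\int_0^{T\wedge\sigma_n\wedge\xi_0} m^{-1/2}|\nabla_1 \widetilde U(X_1(s) - X_2(s))|\,ds\Big] \\
&\quad\le P_m(\xi_{\frac43\varepsilon} < T \wedge \sigma_n) + \frac{\varepsilon}{\delta}\,C.
\end{aligned}
\]
By Lemma 6.3.3, $P_m(\xi_{\frac43\varepsilon} < T \wedge \sigma_n) \to 0$ as $m \to 0$ for any $\varepsilon > 0$. Therefore, by taking first $\varepsilon > 0$ small enough and then $m > 0$ small enough, we get our assertion.
We are now ready to show that the condition (5) of Theorem 6.3.2 is satisfied.
Lemma 6.3.6. $M_1|V_1(t)|^2 + M_2|V_2(t)|^2$ is continuous in $t$ almost surely, under any cluster point of $\{(\vec X(t), \vec V(t))_t$ under $P_m\}$ as $m \to 0$.
Proof. Let $m_k$ be a sequence and $P_\infty$ be a probability such that $\lim_{k\to\infty} m_k = 0$ and $\{(\vec X(t), \vec V(t))_t$ under $P_{m_k}\}$ converges to $P_\infty$ as $k \to \infty$. (This is possible by Sec. 6.2.) Then $(V_i^2(s))_s$ under $P_m \to (V_i^2(s))_s$ under $P_\infty$ in $\wp(D([0,T]; \mathbb{R}^d))$, as $m \to 0$. Also, let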
(X(s)) 0 ) + 1 Hsm = m−1/2 (U −U Mi |Vi (s)|2 . 2 i=1 m Then we have by Lemma 3.5.2(2) that under our present setting, {(Ht∧σ ) under n ∧ξ0 t d Pm }m→0 is tight in ℘(C([0, T ]; R )). That is, there exists a Hs ∈ C([0, T ]; Rd) such that
(Hsm )s under Pm → (Hs )s under P∞ in ℘(C([0, T ]; Rd )), as m → 0. Combining the above, we get 2 1
m 2 Hs − Mi Vi (s) under Pm 2 i=1
→
s∈[0,T ∧σn ∧ξ0 )
2
1
Hs − Mi Vi (s)2 2 i=1
under P∞ s∈[0,T ∧σn ∧ξ0 )
in ℘(D([0, T ]; Rd)), as m → 0. However, for any δ > 0, we have by Lemma 6.3.5 that 2 T ∧σn ∧ξ0 m 1
2 Mi Vi (s) ds > δ → 0, as m → 0. Pm Hs − 2 0 i=1
So
P∞
T ∧σn ∧ξ0
0
2 1
2 Mi Vi (s) ds > δ = 0 Hs − 2 i=1
for any δ > 0. Also, ξ0 → T ∧ σn as m → 0. Therefore, T ∧σn 2 1
2 Mi Vi (s) ds = 0, P∞ -a.s. Hs − 2 0 i=1
This combined with the continuity of Hs and the fact that σn → ∞ a.e. gives us that M1 |V1 (t)|2 + M2 |V2 (t)|2 is continuous in t, P∞ -almost surely. We finally show that the condition (4) of Theorem 6.3.2 is satisfied. The method is similar to the one of the proof of (5). As in Sec. 5.1, let Yi (t) = Vi (t) − Mi−1 ηi (t), i = 1, 2, where ηi (t) is as given in (t) = (Y1 (t), Y2 (t)), and let Lemma 3.5.1. Let Y t (X(s)) {M −1 fV1 (X(s), Y (s)) · ∇1 U Gt = m−1/2 0
1
(X(s))}ds (s)) · ∇2 U + M2−1 fV2 (X(s), + f (X(t), V (t)). Y We first show the following. Lemma 6.3.7. {(Gt∧σn )t under Pm }m→0 is tight in ℘(C([0, T ]; Rd )). Proof. Let t = Gt − f (X(t), (t)) + f (X(t), G V Y (t)). Then t | ≤ fV1 ∞ M −1 |η1 (t)| + fV2 ∞ M −1 |η2 (t)|. |Gt − G 1 2 Therefore, by Lemma 3.5.1(4), we have that the tightness of {(Gt∧σn )t t∧σn )t under Pm }m→0 in ℘(C([0, T ]; Rd)) is equivalent to the tightness of {(G under Pm }m→0 in ℘(C([0, T ]; Rd )). On the other hand, we have by Lemma 3.5.1 and Ito’s formula that t = fX1 (X(t), dG Y (t)) · V1 (t)dt + fX2 (X(t), Y (t)) · V2 (t)dt + M1−1 fV1 (X(t), Y (t)) · (dM1 (t) + dP1∗1 (t)) + M2−1 fV2 (X(t), Y (t)) · (dM2 (t) + dP2∗1 (t)). t∧σn )t under Pm }m→0 is tight So by Lemma 3.5.1(2), (4.13) and Theorem 3.4.1, {(G d in ℘(C([0, T ]; R )). This completes the proof of our assertion.
S. Kusuoka & S. Liang
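Lemma 6.3.8. Suppose that $f\in C_0^\infty(\mathbb{R}^{4d})$ satisfies the condition in (4) of Theorem 6.3.2. Then for any $\delta>0$, we have that
$$\lim_{m\to 0} P_m\Big(\int_0^{T\wedge\sigma_n\wedge\xi_0} \Big| m^{-1/2}\int_0^t \big\{M_1^{-1} f_{V_1}(X(s),Y(s))\cdot\nabla_1 U(X(s)) + M_2^{-1} f_{V_2}(X(s),Y(s))\cdot\nabla_2 U(X(s))\big\}\,ds \Big|\,dt > \delta\Big) = 0.$$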
Proof. First notice that $\nabla_i U(X_1,X_2) = 0$ if $|X_1-X_2| > R_1+R_2$. For any $X_1,X_2\in\mathbb{R}^d$ with $|X_1-X_2|\le R_1+R_2$, let $\tilde X_i = \frac{R_1+R_2}{|X_1-X_2|}X_i$, $i=1,2$. Then $|\tilde X_1-\tilde X_2| = R_1+R_2$. Since $D_0 = \{(X_1,X_2)\in\mathbb{R}^{2d};\ |X_1-X_2| > R_1+R_2\}$, this means $\tilde X = (\tilde X_1,\tilde X_2)\in\partial D_0$. Also, as in the proof of Corollary 6.1.3, $-\nabla_1 U(X_1,X_2) = \nabla_2 U(X_1,X_2)$ is parallel to $X_1-X_2$ with the same direction, so
$$\nabla_1 U(X_1,X_2) = -\frac{|\nabla_1 U(X_1,X_2)|}{|X_1-X_2|}(X_1-X_2) = -\frac{|\nabla_1 U(X_1,X_2)|}{R_1+R_2}(\tilde X_1-\tilde X_2),$$
$$\nabla_2 U(X_1,X_2) = +\frac{|\nabla_2 U(X_1,X_2)|}{|X_1-X_2|}(X_1-X_2) = +\frac{|\nabla_1 U(X_1,X_2)|}{R_1+R_2}(\tilde X_1-\tilde X_2).$$
So by assumption, for any $Y\in\mathbb{R}^{2d}$,
$$M_1^{-1} f_{V_1}(\tilde X,Y)\cdot\nabla_1 U(X_1,X_2) + M_2^{-1} f_{V_2}(\tilde X,Y)\cdot\nabla_2 U(X_1,X_2) = \frac{|\nabla_1 U(X_1,X_2)|}{R_1+R_2}\big(-M_1^{-1} f_{V_1}(\tilde X,Y)\cdot(\tilde X_1-\tilde X_2) + M_2^{-1} f_{V_2}(\tilde X,Y)\cdot(\tilde X_1-\tilde X_2)\big) = 0,$$
hence if we set $C_1 \equiv M_1^{-1}\|f_{XV_1}\|_\infty \vee M_2^{-1}\|f_{XV_2}\|_\infty$, then
$$\big|M_1^{-1} f_{V_1}(X,Y)\cdot\nabla_1 U(X_1,X_2) + M_2^{-1} f_{V_2}(X,Y)\cdot\nabla_2 U(X_1,X_2)\big|$$
$$= \big|M_1^{-1}\big(f_{V_1}(X,Y) - f_{V_1}(\tilde X,Y)\big)\cdot\nabla_1 U(X_1,X_2) + M_2^{-1}\big(f_{V_2}(X,Y) - f_{V_2}(\tilde X,Y)\big)\cdot\nabla_2 U(X_1,X_2)\big|$$
$$\le M_1^{-1}\|f_{XV_1}\|_\infty |X-\tilde X|\,|\nabla_1 U(X_1,X_2)| + M_2^{-1}\|f_{XV_2}\|_\infty |X-\tilde X|\,|\nabla_2 U(X_1,X_2)|$$
$$\le C_1\big(|\nabla_1 U(X_1,X_2)| + |\nabla_2 U(X_1,X_2)|\big)\Big(\frac{R_1+R_2}{|X_1-X_2|} - 1\Big)|X|.$$
Let $C_2 = 2(|X_0| + 2nT)(R_1+R_2)^{-1}$, and let
$$C_3 = C_1 C_2 \sup_{m\in(0,1]} E^{P_m}\Big[\int_0^{T\wedge\sigma_n} m^{-1/2}\big(|\nabla_1 U(X(s))| + |\nabla_2 U(X(s))|\big)\,ds\Big],$$
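which is finite by Lemma 3.5.2. Then by the calculation above, we have for any $\varepsilon\in[0,\frac{3}{4}\varepsilon_0\wedge\frac{1}{2}(R_1+R_2))$ (hence $R_1+R_2-\varepsilon > \frac{1}{2}(R_1+R_2)$),
$$P_m\Big(\int_0^{T\wedge\sigma_n\wedge\xi_0} \Big| m^{-1/2}\int_0^t \big\{M_1^{-1} f_{V_1}(X(s),Y(s))\cdot\nabla_1 U(X(s)) + M_2^{-1} f_{V_2}(X(s),Y(s))\cdot\nabla_2 U(X(s))\big\}\,ds \Big|\,dt > \delta\Big)$$
$$\le P_m\Big(\int_0^{T\wedge\sigma_n\wedge\xi_0} m^{-1/2} C_1\big(|\nabla_1 U(X(s))| + |\nabla_2 U(X(s))|\big)\Big(\frac{R_1+R_2}{|X_1(s)-X_2(s)|} - 1\Big) 1_{\{|X_1(s)-X_2(s)| < R_1+R_2\}}\,ds \times (|X_0| + 2nT) > \delta\Big)$$
$$\le P_m\Big(\inf_{s\in[0,T\wedge\sigma_n]} |X_1(s)-X_2(s)| \le R_1+R_2-\varepsilon\Big) + P_m\Big(\int_0^{T\wedge\sigma_n\wedge\xi_0} m^{-1/2} C_1\big(|\nabla_1 U(X(s))| + |\nabla_2 U(X(s))|\big)\,ds \cdot \varepsilon\Big(\frac{R_1+R_2}{2}\Big)^{-1}(|X_0| + 2nT) > \delta\Big)$$
$$\le P_m\big(\xi_{\frac{4}{3}\varepsilon} < T\wedge\sigma_n\big) + C_1 C_2\,\frac{\varepsilon}{\delta}\, E^{P_m}\Big[\int_0^{T\wedge\sigma_n\wedge\xi_0} m^{-1/2}\sum_{i=1}^{2} |\nabla_i U(X(s))|\,ds\Big]$$
$$\le P_m\big(\xi_{\frac{4}{3}\varepsilon} < T\wedge\sigma_n\big) + C_3\,\frac{\varepsilon}{\delta}.$$
Since $P_m(\xi_{\frac{4}{3}\varepsilon} < T\wedge\sigma_n)\to 0$ as $m\to 0$ for any $\varepsilon\in(0,\frac{3}{4}\varepsilon_0]$ by Lemma 6.3.3, we get our assertion by taking first $\varepsilon>0$ small enough and then $m>0$ small enough.

By using the same argument as in deriving Lemma 6.3.6 from Lemmas 3.5.2 and 6.3.5, with the help of Lemmas 6.3.7 and 6.3.8, we get the following, which means that the condition (4) of Theorem 6.3.2 is also satisfied.

Lemma 6.3.9. Assume that $f\in C_0^\infty(\mathbb{R}^{4d})$ satisfies
$$M_1^{-1}(\nabla_{V_1} f)(X,V)\cdot(X_1-X_2) + M_2^{-1}(\nabla_{V_2} f)(X,V)\cdot(X_2-X_1) = 0$$
for any $(X,V)\in\partial D_0\times\mathbb{R}^{2d}$. Then $f(X(t),V(t))$ is continuous in $t$ almost surely, under any cluster point of $\{(X(t),V(t))_t \text{ under } P_m\}$, as $m\to 0$.

This completes the proof of the fact that in our setting any cluster point of the distribution of $\{(X_t,V_t)\}_t$ under $P_m$ as $m\to 0$ satisfies all of the conditions of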
Theorem 6.3.2. Therefore, by the uniqueness of Theorem 6.3.2, the distribution of $\{(X_t,V_t)\}_t$ under $P_m$ converges to $P_{\infty,0}$ as $m\to 0$.

Acknowledgment

We would like to thank the referees for their valuable comments which helped to improve the manuscript in many ways. Also we would like to thank Professor Sergio Albeverio for reading the manuscript carefully. This work was financially supported by Grant-in-Aid for the Encouragement of Young Scientists (No. 21740063), Japan Society for the Promotion of Science.

References

[1] P. Billingsley, Convergence of Probability Measures (John Wiley & Sons, Inc., 1968).
[2] P. Calderoni, D. Dürr and S. Kusuoka, A mechanical model of Brownian motion in half-space, J. Statist. Phys. 55(3–4) (1989) 649–693.
[3] D. Dürr, S. Goldstein and J. L. Lebowitz, A mechanical model of Brownian motion, Comm. Math. Phys. 78(4) (1980/81) 507–530.
[4] D. Dürr, S. Goldstein and J. L. Lebowitz, A mechanical model for the Brownian motion of a convex body, Z. Wahrsch. Verw. Gebiete 62(4) (1983) 427–448.
[5] D. Dürr, S. Goldstein and J. L. Lebowitz, Stochastic processes originating in deterministic microscopic dynamics, J. Statist. Phys. 30(2) (1983) 519–526.
[6] R. Holley, The motion of a heavy particle in an infinite one dimensional gas of hard spheres, Z. Wahrsch. Verw. Gebiete 17 (1971) 181–219.
[7] N. Ikeda and S. Watanabe, Stochastic Differential Equations and Diffusion Processes, North-Holland Mathematical Library, Vol. 24 (North-Holland Publishing Co., Kodansha, Ltd., 1981).
[8] O. Kallenberg, Foundations of Modern Probability, Probability and Its Applications, 2nd edn. (Springer-Verlag, New York, 2002).
[9] S. Kusuoka, Stochastic Newton equation with reflecting boundary condition, in Stochastic Analysis and Related Topics in Kyoto, Adv. Stud. Pure Math., Vol. 41 (Math. Soc. Japan, 2004), pp. 233–246.
[10] E. Nelson, Dynamical Theories of Brownian Motion (Princeton University Press, Princeton, 1967).
[11] M. Reed and B. Simon, Methods of Modern Mathematical Physics. III. Scattering Theory (Academic Press, 1979).
[12] J. A. M. van der Weide, Stochastic Processes and Point Processes of Excursions, CWI Tract, Vol. 102 (Stichting Mathematisch Centrum, Centrum voor Wiskunde en Informatica, Amsterdam, 1994).
August 10, J070-S0129055X10004089
2010 15:1 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics
Vol. 22, No. 7 (2010) 839–858
© World Scientific Publishing Company
DOI: 10.1142/S0129055X10004089

A NOTE ON THE NON-COMMUTATIVE LAPLACE–VARADHAN INTEGRAL LEMMA

W. DE ROECK
Institut für Theoretische Physik, Universität Heidelberg, Germany
[email protected]

CHRISTIAN MAES
Instituut voor Theoretische Fysica, K. U. Leuven, Belgium
[email protected]

KAREL NETOČNÝ
Institute of Physics, Academy of Sciences of the Czech Republic, Prague, Czech Republic
[email protected]

LUC REY-BELLET
Department of Mathematics and Statistics, University of Massachusetts, Amherst, USA
[email protected]

Received 10 September 2009
Revised 21 May 2010

We continue the study of the free energy of quantum lattice spin systems where to the local Hamiltonian H an arbitrary mean field term is added, a polynomial function of the arithmetic mean of some local observables X and Y that do not necessarily commute. By slightly extending a recent paper by Hiai, Mosonyi, Ohno and Petz [10], we prove in general that the free energy is given by a variational principle over the range of the operators X and Y. As in [10], the result is a non-commutative extension of the Laplace–Varadhan asymptotic formula.

Keywords: Quantum large deviations; quantum lattice systems; Laplace–Varadhan lemma.

Mathematics Subject Classification 2010: 82B10
1. Introduction

1.1. Large deviations

One of the highlights in the combination of analysis and probability theory is the asymptotic evaluation of certain integrals. We have here in mind integrals of the form, for some real-valued function $G$,
$$\int d\mu_n(x)\,\exp\{v_n G(x)\}, \qquad v_n \nearrow +\infty \text{ as } n \nearrow +\infty \tag{1.1}$$
for which the measures $\mu_n$ satisfy a law of large numbers. Such integrals can be evaluated depending on the asymptotics of the $\mu_n$. The latter is the subject of the theory of large deviations, characterizing the rate of convergence in the law of large numbers. In a typical scenario, the $\mu_n$ are the probabilities of some macroscopic variable, such as the average magnetization or the particle density in ever growing volumes $v_n$ and as distributed in a given equilibrium Gibbs ensemble. Then, depending on the case, thermodynamic potentials $J$ make the rate function
$$d\mu_n(x) \sim dx\,\exp\{-v_n J(x)\}$$
in the sense of large deviations for Gibbs measures, see [8, 9, 16, 22, 23]. That theory of large deviations is however broader than the applications in equilibrium statistical mechanics. Essentially, when the rate function for $\mu_n$ is given by $J$, then the integral (1.1) is computed as
$$\frac{1}{v_n}\log\int d\mu_n(x)\,\exp\{v_n G(x)\} \xrightarrow[n\nearrow+\infty]{} \sup_x\{G(x) - J(x)\}. \tag{1.2}$$
This is a typical application of Laplace's asymptotic formula for the evaluation of real-valued integrals. The systematic combination with the theory of large deviations gives the so-called Laplace–Varadhan integral lemma.

We first recall the large deviation principle (LDP). Let $(M,d)$ be some complete separable metric space.

Definition 1.1. The sequence of measures $\mu_n$ on $M$ satisfies a LDP with rate function $J : M \to \mathbb{R}^+\cup\{+\infty\}$ and speed $v_n\in\mathbb{R}^+$ if

(1) $J$ is convex and has closed level sets, i.e.
$$\{J^{-1}(x),\ x\le c\} \tag{1.3}$$
is closed in $(M,d)$ for all $c\in\mathbb{R}^+$;

(2) for all Borel sets $U\subset M$ with interior $\operatorname{int} U$ and closure $\operatorname{cl} U$, one has
$$\liminf_{n\nearrow+\infty}\frac{1}{v_n}\log\mu_n(U) \ge -\inf_{u\in\operatorname{int} U} J(u),$$
$$\limsup_{n\nearrow+\infty}\frac{1}{v_n}\log\mu_n(U) \le -\inf_{u\in\operatorname{cl} U} J(u).$$
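As a concrete illustration of the two bounds in Definition 1.1, the following minimal sketch checks them numerically in the simplest classical case. Everything in it is an assumption made for illustration and is not taken from the paper: $\mu_n$ is the law of the mean of $n$ fair $\pm 1$ coins, the speed is $v_n = n$, the set is $U = [a, 1]$, and the rate function is the standard relative entropy $J(a) = \frac{1+a}{2}\log(1+a) + \frac{1-a}{2}\log(1-a)$.

```python
import math


def log_binom(n, k):
    """Natural log of the binomial coefficient C(n, k), via lgamma."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def empirical_rate(n, a):
    """-(1/n) log P(S_n / n >= a) for S_n a sum of n fair +/-1 coins."""
    k_min = math.ceil(n * (1 + a) / 2)  # S_n = 2k - n >= a*n
    logs = [log_binom(n, k) - n * math.log(2) for k in range(k_min, n + 1)]
    return -log_sum_exp(logs) / n


def rate_function(a):
    """Illustrative rate function J(a) for fair +/-1 coins, |a| < 1."""
    return 0.5 * (1 + a) * math.log(1 + a) + 0.5 * (1 - a) * math.log(1 - a)


if __name__ == "__main__":
    for n in (100, 1000, 4000):
        print(n, empirical_rate(n, 0.5), rate_function(0.5))
```

The finite-$n$ values approach $J(a)$ from above with a correction of order $(\log n)/n$, which is the sense in which the LDP only captures the exponential scale.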
We say that the rate function $J$ is good whenever the level sets (1.3) are compact. For the transfer of LDP, one considers a pair $(\mu_n,\nu_n)$, $n\nearrow\infty$, of sequences of absolutely continuous measures on $(M,d)$ such that
$$\frac{d\nu_n}{d\mu_n}(x) = \exp\{v_n G(x)\}, \qquad \mu_n\text{-almost everywhere},$$
for some measurable mapping $G : M\to\mathbb{R}$. We now state an instance of the Laplace–Varadhan lemma.

Lemma 1.1 (Laplace–Varadhan Integral Lemma). Assume that $G$ is bounded and continuous and that the sequence $(\mu_n)$ satisfies a large deviation principle with good rate function $J$ and speed $v_n$. Then $(\nu_n)$ satisfies a large deviation principle with good rate function $J - G$ and speed $v_n$.

For more general versions and proofs we refer to the literature, see e.g. [5–7, 22, 23]; it remains an important subject of analytic probability theory to extend the validity of the variational formulation (1.2) and to deal with its applications.

1.2. Mean-field interactions

From the point of view of equilibrium statistical mechanics, one can also think of the formula (1.1) as giving (the exponential of) the pressure or free energy when adding a mean field type term to a Hamiltonian which is a sum of local interactions. The choice of the function $G$ is then typically monomial with a power decided by the number of particles or spins that are in direct interaction. For example, the free energy of an Ising-like model with such an extra mean field interaction would be given by the limit
$$\lim_{\Lambda\nearrow\mathbb{Z}^d}\frac{1}{|\Lambda|}\log\sum_{\eta\in\{+,-\}^\Lambda}\exp\Big\{-\beta H_\Lambda(\eta) + \lambda_p|\Lambda|\Big(\frac{1}{|\Lambda|}\sum_{i\in\Lambda}\eta_i\Big)^p\Big\} \tag{1.4}$$
for $p = 1,2,\ldots$, where $H_\Lambda(\eta)$ is the (local) energy of the spin configuration $\eta$ and the limit takes a sequence of regularly expanding boxes $\Lambda$ to cover some given lattice. The case $p=1$ corresponds to the addition of a magnetic field $\lambda_1$; $p=2$ is most standard and adds effectively a very small but long range two-spin interaction. Higher $p$-values are also not uncommon in the study of Ising interactions on hypergraphs, and even very large $p$ has been found relevant, e.g., in models of spin glasses and in information theory [4]. The form (1.1) is easily recognized in (1.4), with
$$v_n = |\Lambda|, \qquad \mu_n(x) \sim \sum_{\eta\in\{+,-\}^\Lambda,\ \sum_{i\in\Lambda}\eta_i = x|\Lambda|}\exp\{-\beta H_\Lambda(\eta)\},$$
and the function $G(x) = \lambda_p x^p$. The Laplace–Varadhan lemma applies to (1.4) since we know that the sequence of Gibbs states with density $\sim\exp\{-\beta H_\Lambda(\cdot)\}$ satisfies a LDP with a good rate function $J_{cl}$ and speed $|\Lambda|$. The result reads that (1.4) is given by the variational formula
$$\sup_{u\in[-1,1]}\{\lambda_p u^p - J_{cl}(u)\}. \tag{1.5}$$
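The agreement between (1.4) and (1.5) can be checked numerically in the simplest case. The sketch below is only an illustration under an assumption that is not made in the text: the local energy is switched off, $H_\Lambda = 0$, so that the configurations with a given magnetization are counted by a binomial coefficient and $-J_{cl}(u)$ reduces to the entropy per spin. The function names are hypothetical.

```python
import math


def log_binom(n, k):
    """Natural log of the binomial coefficient C(n, k), via lgamma."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def finite_volume_pressure(n, lam, p=2):
    """(1/n) log sum_eta exp(lam * n * m(eta)^p), i.e. (1.4) with H = 0."""
    logs = []
    for k in range(n + 1):  # k = number of + spins
        m = (2 * k - n) / n  # mean spin of such a configuration
        logs.append(log_binom(n, k) + lam * n * m ** p)
    top = max(logs)
    return (top + math.log(sum(math.exp(v - top) for v in logs))) / n


def entropy(u):
    """Entropy per spin at magnetization u (equals -J_cl(u) when H = 0)."""
    total = 0.0
    for q in ((1 + u) / 2, (1 - u) / 2):
        if q > 0:
            total -= q * math.log(q)
    return total


def variational_value(lam, p=2, grid=20001):
    """Grid approximation of sup over u in [-1, 1] of lam*u^p + entropy(u)."""
    us = [-1 + 2 * i / (grid - 1) for i in range(grid)]
    return max(lam * u ** p + entropy(u) for u in us)


if __name__ == "__main__":
    for n in (50, 500, 2000):
        print(n, finite_volume_pressure(n, 1.0), variational_value(1.0))
```

For small coupling the supremum sits at $u = 0$ and equals $\log 2$; for larger coupling it moves to a nonzero magnetization, which is the usual mean-field phase transition.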
In non-commutative versions the local Hamiltonian H and the additional mean field term are allowed not to commute with each other. That is natural within the
statistical mechanics of quantum spin systems and this is also the context of the present paper.

1.3. Non-commutative extensions

Although it has proven very useful to think of integrals (1.1) within the framework of probability and large deviation theory, it is fundamentally a problem of analysis. However, without such a probabilistic context, the question of a non-commutative extension of the Laplace–Varadhan Lemma 1.1 becomes ambiguous and it in fact allows for different formulations, each possibly having a physical interpretation on its own. One approach is to ask for the asymptotic evaluation of the expectations
$$\lim_{\Lambda\nearrow\mathbb{Z}^d}\frac{1}{|\Lambda|}\log\omega_\Lambda\big(e^{|\Lambda| G(\bar X_\Lambda)}\big) \tag{1.6}$$
under a family of quantum states $\omega_\Lambda$, where $\bar X_\Lambda$ would now be the arithmetic mean of some quantum observable in volume $\Lambda$. To be specific, one can take $\omega_\Lambda$ a quantum Gibbs state for a Hamiltonian $H_\Lambda$ at inverse temperature $\beta$, with density matrix $\sigma_\Lambda\sim\exp\{-\beta H_\Lambda\}$, and $\bar X_\Lambda = (\sum_{i\in\Lambda}X_i)/|\Lambda|$ the mean magnetization in some fixed direction. Arguably, this formulation is closely related to the asymptotic statistics of outcomes in von Neumann measurements of $\bar X_\Lambda$. Indeed, let $\nu_\Lambda$ be the measure on $[-\|X\|,\|X\|]$ defined by
$$\nu_\Lambda(f) := \omega_\Lambda(f(\bar X_\Lambda)) \qquad \text{for } f\in C([-\|X\|,\|X\|]). \tag{1.7}$$
Then, (1.6) can be evaluated with the help of Lemma 1.1 (the commutative Laplace–Varadhan integral lemma) if the family $\nu_\Lambda$ satisfies a LDP with speed $|\Lambda|$. In recent years, this LDP has been established for $\sigma_\Lambda\sim\exp\{-\beta H_\Lambda\}$ in the regime of small $\beta$ (high temperature) or $d=1$, see [11, 13–15].

A more general class of possible extensions is obtained by considering the limits of
$$\frac{1}{|\Lambda|}\log\mathrm{Tr}_\Lambda\Big(\sigma_\Lambda^{\frac{1}{K}}\, e^{\frac{|\Lambda|}{K} G(\bar X_\Lambda)}\Big)^K, \qquad \Lambda\nearrow\mathbb{Z}^d \tag{1.8}$$
for different $K>0$, where $\sigma_\Lambda$ is the density matrix of a quantum state in box $\Lambda$. For the canonical form $\sigma_\Lambda = \exp(-\beta H_\Lambda)/Z_\Lambda^\beta$ with local Hamiltonian $H_\Lambda$ at inverse temperature $\beta$, (1.8) becomes
$$\frac{1}{|\Lambda|}\log\frac{1}{Z_\Lambda^\beta}\mathrm{Tr}_\Lambda\Big(e^{-\frac{\beta}{K}H_\Lambda}\, e^{\frac{|\Lambda|}{K} G(\bar X_\Lambda)}\Big)^K, \qquad \Lambda\nearrow\mathbb{Z}^d. \tag{1.9}$$
There is no a priori reason to exclude any particular value of $K$ from consideration. Two standard options are: $K=1$, which corresponds to the expression (1.6) above, and $K\nearrow+\infty$, which, by the Trotter product formula, boils down to
$$\frac{1}{|\Lambda|}\log\frac{1}{Z_\Lambda^\beta}\mathrm{Tr}_\Lambda\big(e^{-\beta H_\Lambda + |\Lambda| G(\bar X_\Lambda)}\big), \qquad \Lambda\nearrow\mathbb{Z}^d \tag{1.10}$$
which is the free energy of a corresponding quantum spin model, cf. (1.4). In the present paper, we study the case $K\nearrow+\infty$ (without touching the question of interchangeability of both limits). One of our results, Theorem 3.1 with $Y = \bar Y_\Lambda = 0$, is of the form
$$\lim_{\Lambda\nearrow\mathbb{Z}^d}\frac{1}{|\Lambda|}\log\mathrm{Tr}_\Lambda\big(e^{-\beta H_\Lambda + |\Lambda| G(\bar X_\Lambda)}\big) = \sup_{-\|X\|\le u\le\|X\|}\{G(u) - J(u)\}. \tag{1.11}$$
Note that we omitted the normalization factor $1/Z_\Lambda^\beta$ since it merely adds a constant (independent of $G$) to (1.10). In the usual context of the theory of large deviations, formula (1.11) arises as a change of rate function. However, while our result (1.11) very much looks like Varadhan's formula in Lemma 1.1, there is a big difference in interpretation: The function $J$ is not as such the rate function of large deviations for $\bar X_\Lambda$. Instead, it is given as the Legendre transform
$$J(u) = \sup_{t\in\mathbb{R}}\{tu - q(t)\}, \qquad u\in\mathbb{R} \tag{1.12}$$
of a function $q(\cdot)$ which is the pressure corresponding to a linearized interaction, i.e.
$$q(t) = \lim_{\Lambda\nearrow\mathbb{Z}^d}\frac{1}{|\Lambda|}\log\mathrm{Tr}_\Lambda\big(e^{-\beta H_\Lambda + t|\Lambda|\bar X_\Lambda}\big). \tag{1.13}$$
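To make the Legendre transform (1.12) concrete, here is a minimal numerical sketch in the one case where $q$ is explicit. The setting is an assumption chosen for illustration, not the paper's general one: non-interacting spins ($H_\Lambda = 0$) with $X$ a single-site Pauli matrix, so that the trace factorizes and $q(t) = \log(2\cosh t)$. The supremum over $t$ is approximated on a finite grid and compared with the closed form $J(u) = u\,\mathrm{artanh}(u) + \frac{1}{2}\log(1-u^2) - \log 2$ for $|u|<1$.

```python
import math


def pressure(t):
    """q(t) = log(2 cosh t), written in a form that does not overflow."""
    return abs(t) + math.log1p(math.exp(-2 * abs(t)))


def legendre(u, t_max=20.0, steps=40001):
    """Grid approximation of J(u) = sup_t (t*u - q(t)) over |t| <= t_max."""
    best = -math.inf
    for i in range(steps):
        t = -t_max + 2 * t_max * i / (steps - 1)
        best = max(best, t * u - pressure(t))
    return best


def closed_form(u):
    """u*atanh(u) + 0.5*log(1 - u^2) - log 2, valid for |u| < 1."""
    return u * math.atanh(u) + 0.5 * math.log(1 - u * u) - math.log(2)


if __name__ == "__main__":
    for u in (0.0, 0.5, 0.9):
        print(u, legendre(u), closed_form(u))
```

For $|u| > 1$ the true supremum is $+\infty$; on a finite grid it shows up as a value growing linearly with `t_max`, which matches the fact that $J$ is infinite outside $[-\|X\|,\|X\|]$.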
1.4. Several non-commuting observables: Towards joint large deviations?

In the previous Sec. 1.3, we made the tacit assumption that there is a single observable $\bar X_\Lambda$ corresponding to some Hermitian operator on Hilbert space. However, in formula (1.4), the observable $\frac{1}{|\Lambda|}\sum_{i\in\Lambda}\eta_i$ could equally well represent a vector-valued magnetization which, upon quantization, would correspond to several non-commuting observables $\bar X_\Lambda, \bar Y_\Lambda$, say, the magnetization along the x-axis and y-axis, respectively. In the commutative theory, this case does not require special attention; the framework of large deviations applies equally regardless of whether the observable takes values in $\mathbb{R}$ or $\mathbb{R}^2$. Obviously, this is not true in the non-commutative setting and in fact, we do not even know a natural analogue of the generating function (1.6), since we do not have at our disposal a simultaneous von Neumann measurement of $\bar X_\Lambda$ and $\bar Y_\Lambda$. One can take the point of view that this is inevitable in quantum mechanics, and insisting is pointless. Yet, as $\Lambda\nearrow\mathbb{Z}^d$, the commutator
$$[\bar X_\Lambda,\bar Y_\Lambda] = O\Big(\frac{1}{|\Lambda|}\Big) \tag{1.14}$$
vanishes and hence the joint measurability of $\bar X_\Lambda,\bar Y_\Lambda$ is restored on the macroscopic scale. We refer the reader to [19] where this issue is discussed and studied in more depth. The advantage of the approach via the Laplace–Varadhan Lemma is that one can set aside these conceptual questions and study joint large deviations of $\bar X_\Lambda$ and $\bar Y_\Lambda$ by choosing $G$ to be a joint function of $\bar X_\Lambda$ and $\bar Y_\Lambda$, for example a symmetrized monomial
$$G(\bar X_\Lambda,\bar Y_\Lambda) = (\bar X_\Lambda)^k(\bar Y_\Lambda)^l + (\bar Y_\Lambda)^l(\bar X_\Lambda)^k, \qquad \text{for some } k,l\in\mathbb{N}, \tag{1.15}$$
and check whether the formula (1.11) remains valid with some obvious adjustments. This turns out to be the case and it is our main result: Theorem 3.1.

1.5. Comparison with previous results

The asymptotics of the expression (1.10) was first studied and the result (1.11) was first obtained by Petz et al. [17], in the case where the Hamiltonian $H_\Lambda$ is made solely from a one-body interaction. The corresponding equilibrium state is then a product state. In [10], Hiai et al. generalized this result to the case of locally interacting spins but the lattice dimension was restricted to $d=1$. However, the authors of [10] argue that the restriction to $d=1$ can be lifted in the high-temperature regime. The main reason is that their work relies heavily on an asymptotic decoupling condition which is proven in that regime, [1]. One should observe here that this asymptotic decoupling condition in fact implies a large deviation principle for $\bar X_\Lambda$, as follows from the work of Pfister [18]. Hence, in the language of Sec. 1.3, [10] evaluates (1.10) (the case $K=\infty$) in those regimes where (1.6) (the case $K=1$) can be evaluated as well.

The present paper elaborates on the result of [10] in two ways. First, we remark that, in our setup, the decoupling condition is actually not necessary for (1.11) to hold, and therefore one can do away with the restriction to $d=1$ or high temperature. Hence, again referring to Sec. 1.3, the case $K=\infty$ can be controlled even when we know little about the case $K=1$. To drop the decoupling condition, it is absolutely essential that we start from finite-volume Gibbs states, and not from finite-volume restrictions of infinite-volume Gibbs states, as it is done in [10]. Second, we show that by the same formalism, one can treat the case of several noncommuting observables, as explained in Sec. 1.4. The most serious step in this generalization is actually an extension of the result of [17] to noncommuting observables. This extension is stated in Lemma 6.1 and proven in Sec. 7.

Note. While we were finishing this paper, we learnt of a similar project by J.-B. Bru and W. de Siqueira Pedra. Their result [3] is nothing less than a full-fledged theory of equilibrium states with mean-field terms in the Hamiltonian, describing not only the mean-field free energy (as we do here), but also the states themselves. Also, their results hold for fermions, while ours are restricted to spin systems, and they provide interesting examples. Yet, the focus of our paper differs from theirs and our main result is not contained in their paper.

1.6. Outline

In Sec. 2, we sketch the setup. We introduce spin systems on the lattice, noncommutative polynomials and ergodic states. Section 3 describes the result of the paper. The remaining Secs. 4–7 contain the proofs.
2. Setup

2.1. Hamiltonian and observables

We consider a quantum spin system on the regular lattice $\mathbb{Z}^d$, $d=1,2,\ldots$. We briefly introduce the essential setup below, and we refer to [12, 20] for more expanded, standard introductions.

The single site Hilbert space $\mathcal{H}$ is finite-dimensional (isomorphic to $\mathbb{C}^n$) and for any finite volume $\Lambda\subset\mathbb{Z}^d$, we set $\mathcal{H}_\Lambda = \otimes_\Lambda\mathcal{H}$. The $C^*$-algebra of bounded operators on $\mathcal{H}_\Lambda$ is denoted by $\mathcal{B}_\Lambda\equiv\mathcal{B}(\mathcal{H}_\Lambda)$. The standard embedding $\mathcal{B}_\Lambda\subset\mathcal{B}_{\Lambda'}$ for $\Lambda\subset\Lambda'$ is assumed throughout. The quasi-local algebra $\mathcal{U}$ is defined as the norm closure of the finite-volume algebras
$$\mathcal{U} := \overline{\bigcup_{\Lambda\ \mathrm{finite}}\mathcal{B}_\Lambda}. \tag{2.1}$$
Denote by $\tau_i$, $i\in\mathbb{Z}^d$, the translation which shifts all observables over a lattice vector $i$, i.e. $\tau_i$ is a homomorphism from $\mathcal{B}_\Lambda$ onto $\mathcal{B}_{i+\Lambda}$.

We introduce an interaction potential $\Phi$, that is a collection $(\Phi_A)$ of Hermitian elements of $\mathcal{B}_A$, labeled by finite subsets $A\subset\mathbb{Z}^d$. We assume translation invariance (i) and a finite range (ii):

(i) $\tau_i(\Phi_A) = \Phi_{i+A}$ for all finite $A\subset\mathbb{Z}^d$;
(ii) there is a $d_{\max}<\infty$ such that, if $\mathrm{diam}(A) > d_{\max}$, then $\Phi_A = 0$.

In estimates, we will frequently use the number
$$r(\Phi) := \sum_{A\ni 0}\|\Phi_A\| < \infty. \tag{2.2}$$
The local Hamiltonian in a finite volume $\Lambda$ is
$$H_\Lambda\equiv H_\Lambda^\Phi = \sum_{A\subset\Lambda}\Phi_A \tag{2.3}$$
which corresponds to free or open boundary conditions. Boundary conditions will however turn out to be irrelevant for our results. We will drop the superscript $\Phi$ since we will keep the interaction potential fixed.

Let $X, Y,\ldots$ denote local observables on the lattice, located at the origin, i.e. $\mathrm{Supp}\,X$ (which is defined as the smallest set $A$ such that $X\in\mathcal{B}_A$) is a finite set which includes $0\in\mathbb{Z}^d$. We write
$$X_\Lambda := \sum_{j\in\mathbb{Z}^d,\ \mathrm{Supp}\,\tau_j X\subset\Lambda}\tau_j X \tag{2.4}$$
and
$$\bar X_\Lambda := \frac{1}{|\Lambda|}X_\Lambda \tag{2.5}$$
for the corresponding intensive observable (the "empirical average" of $X$).
All of these operators are naturally embedded into the quasi-local algebra $\mathcal{U}$. At some point, we will also require the intensive infinite volume observable
$$\bar X\sim\bar X_{\Lambda\nearrow\infty}.$$
Some care is required in dealing with $\bar X$ since it does not belong to the quasi-local algebra $\mathcal{U}$. We will further comment on this in Sec. 2.3.

2.2. Non-commutative polynomials

We will perturb the Hamiltonian $H_\Lambda^\Phi$ by a mean field term of the form $|\Lambda| G(\bar X_\Lambda,\bar Y_\Lambda)$ where $G$ is a "non-commutative polynomial" of the operators $\bar X_\Lambda,\bar Y_\Lambda$, e.g., as in (1.15). In this section, we introduce these non-commutative polynomials $G$ as quantizations of polynomial functions $g$. First, we define
$$\mathrm{Ran}(X,Y) := [-\|X\|,\|X\|]\times[-\|Y\|,\|Y\|]. \tag{2.6}$$
This definition is motivated by the fact that ("sp" stands for spectrum)
$$\mathrm{sp}\,\bar X_\Lambda\times\mathrm{sp}\,\bar Y_\Lambda\subset\mathrm{Ran}(X,Y), \qquad \text{for all } \Lambda. \tag{2.7}$$
Let $g$ be a real polynomial function on the rectangular set $\mathrm{Ran}(X,Y)$. Using the symbol $I$ for the collection of all finite sequences from the binary set $\{1,2\}$, any map $\tilde G : I\to\mathbb{C}$ is called a quantization of $g$ whenever
$$\sum_{n=0}^{N}\ \sum_{\alpha=(\alpha(1),\ldots,\alpha(n))\in I}\tilde G(\alpha)\,x_{\alpha(1)}\cdots x_{\alpha(n)} = g(x_1,x_2) \tag{2.8}$$
for all $(x_1,x_2)\in\mathrm{Ran}(X,Y)$ and for some $N\in\mathbb{N}$. A quantization $\tilde G$ is called symmetric whenever
$$\tilde G(\alpha(1),\ldots,\alpha(n)) = \tilde G(\alpha(n),\ldots,\alpha(1)). \tag{2.9}$$
Any such symmetric quantization $\tilde G$ defines a self-adjoint operator
$$G(X,Y) = \sum_{n=0}^{N}\ \sum_{\alpha=(\alpha(1),\ldots,\alpha(n))\in I}\tilde G(\alpha)\,X_{\alpha(1)}\cdots X_{\alpha(n)} \tag{2.10}$$
taking $X_1\equiv X$ and $X_2\equiv Y$. In the thermodynamic limit, one expects different quantizations of $g$ to be equivalent:

Lemma 2.1. Let $\tilde G$ and $\tilde G'$ be any two quantizations of $g : \mathrm{Ran}(X,Y)\to\mathbb{R}$. Then
$$\|G(\bar X_\Lambda,\bar Y_\Lambda) - G'(\bar X_\Lambda,\bar Y_\Lambda)\| \le \frac{C_g(X,Y)}{|\Lambda|} \tag{2.11}$$
for some $C_g(X,Y)<\infty$, and for all finite volumes $\Lambda$.
Proof. This is a simple consequence of the fact that the commutator of macroscopic observables vanishes in the thermodynamic limit, more precisely,
$$\|[\bar X_\Lambda,\bar Y_\Lambda]\| \le \frac{1}{|\Lambda|}\|X\|\,|\mathrm{Supp}\,X|\times\|Y\|\,|\mathrm{Supp}\,Y|. \tag{2.12}$$
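The notion of quantization in (2.8)–(2.10) can be made tangible with a small sketch. The representation below is an assumption made for illustration only: a quantization $\tilde G$ is stored as a dictionary mapping index sequences in $\{1,2\}$ to coefficients, and the operators are plain nested lists. Two quantizations of $g(x_1,x_2) = x_1x_2$ are compared, the unsymmetrized word $XY$ and the symmetrized $(XY+YX)/2$, using two anticommuting real $2\times 2$ matrices as stand-ins for single-site $X$ and $Y$.

```python
def evaluate_commutative(quant, x1, x2):
    """Left-hand side of (2.8): plug commuting numbers into the quantization."""
    total = 0.0
    for alpha, coeff in quant.items():
        term = coeff
        for index in alpha:
            term *= x1 if index == 1 else x2
        total += term
    return total


def is_symmetric(quant):
    """Condition (2.9): coefficients are unchanged when the word is reversed."""
    return all(quant.get(alpha[::-1], 0.0) == coeff for alpha, coeff in quant.items())


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def evaluate_operator(quant, x, y):
    """Operator G(X, Y) of (2.10) for square matrices x, y (nested lists)."""
    n = len(x)
    result = [[0.0] * n for _ in range(n)]
    for alpha, coeff in quant.items():
        # Start from the identity so that the empty word contributes coeff * 1.
        word = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
        for index in alpha:
            word = matmul(word, x if index == 1 else y)
        for i in range(n):
            for j in range(n):
                result[i][j] += coeff * word[i][j]
    return result


# Two quantizations of g(x1, x2) = x1 * x2 (hypothetical example data).
PLAIN = {(1, 2): 1.0}
SYMMETRIZED = {(1, 2): 0.5, (2, 1): 0.5}
SIGMA_X = [[0.0, 1.0], [1.0, 0.0]]
SIGMA_Z = [[1.0, 0.0], [0.0, -1.0]]

if __name__ == "__main__":
    print(evaluate_operator(PLAIN, SIGMA_X, SIGMA_Z))        # not symmetric as a matrix
    print(evaluate_operator(SYMMETRIZED, SIGMA_X, SIGMA_Z))  # the zero matrix
```

On commuting numbers both dictionaries reproduce $g$, while on single-site matrices they differ by half a commutator. Lemma 2.1 says that this difference becomes negligible once $X$ and $Y$ are replaced by the empirical averages $\bar X_\Lambda$ and $\bar Y_\Lambda$.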
Indeed, our results, Theorems 3.1 and 3.2, do not depend on the choice of quantization. This can also be checked a priori using the above lemma and the log-trace inequality in (3.11).

2.3. Infinite-volume states

A state $\omega_\Lambda$ is a positive linear functional on $\mathcal{B}_\Lambda$, normalized by $\|\omega_\Lambda\| = \omega_\Lambda(1) = 1$. An example is the tracial state, $\omega_\Lambda(\cdot)\sim\mathrm{Tr}_\Lambda(\cdot)$. In general we consider states $\omega_\Lambda$ as characterized by their density matrix $\sigma_\Lambda$, $\omega_\Lambda(\cdot) = \mathrm{Tr}_\Lambda(\sigma_\Lambda\,\cdot)$. An infinite volume state $\omega$ is a positive normalized functional on the $C^*$-algebra $\mathcal{U}$ (the quasi-local algebra). It is translation invariant when $\omega(A) = \omega(\tau_j A)$ for all $j\in\mathbb{Z}^d$ and $A\in\mathcal{U}$. A translation-invariant state $\omega$ is ergodic whenever it is an extremal point in the convex set of translation invariant states. A state is called symmetric whenever it is invariant under a permutation of the lattice sites, that is, for any sequence of one-site observables $A_1,\ldots,A_n\in\mathcal{B}_{\{0\}}\subset\mathcal{U}$ and $i_1,\ldots,i_n\in\mathbb{Z}^d$
$$\omega(\tau_{i_1}(A_1)\tau_{i_2}(A_2)\cdots\tau_{i_n}(A_n)) = \omega(\tau_{i_{\pi(1)}}(A_1)\tau_{i_{\pi(2)}}(A_2)\cdots\tau_{i_{\pi(n)}}(A_n)) \tag{2.13}$$
for any permutation $\pi$ of the set $\{1,\ldots,n\}$. The set of ergodic/symmetric states on $\mathcal{U}$ is denoted by $S_{\mathrm{erg}}$, $S_{\mathrm{sym}}$, respectively. At some point we will need the theorem by Størmer [21] that states that any $\omega\in S_{\mathrm{sym}}$ can be decomposed as
$$\omega = \int_{\mathrm{prod.}} d\nu_\omega(\phi)\,\phi \tag{2.14}$$
for some regular probability measure $\nu_\omega$ whose support consists of product states. Of course, the set of product states can be identified with the (finite-dimensional) set of states on the one-site algebra $\mathcal{B}_{\{0\}} = \mathcal{B}(\mathcal{H})$.

For a finite-volume state $\omega_\Lambda$ on $\mathcal{B}_\Lambda$, we consider the entropy functional
$$S(\omega_\Lambda)\equiv S_\Lambda(\omega_\Lambda) = -\mathrm{Tr}\,\sigma_\Lambda\log\sigma_\Lambda. \tag{2.15}$$
The mean entropy of a translation-invariant infinite-volume state $\omega$ is defined as
$$s(\omega) := \lim_{\Lambda\nearrow\mathbb{Z}^d}\frac{1}{|\Lambda|}S(\omega_\Lambda), \qquad \text{with } \omega_\Lambda := \omega|_{\mathcal{B}_\Lambda} \text{ (restriction to } \Lambda\text{)}. \tag{2.16}$$
In this formula and in the rest of the paper, the limit $\lim_{\Lambda\nearrow\mathbb{Z}^d}$ is meant in the sense of Van Hove, see, e.g., [12, 20]. Standard properties of the functional $s$ are its affinity and upper semicontinuity (with respect to the weak$^*$-topology on states).
In Sec. 2.1, we mentioned the "observables at infinity" $\bar X$ and $\bar Y$, postponing their definition to the present section. Expressions like $\omega(\bar X^l\bar Y^k)$ (for some positive numbers $l,k$) can be defined as
$$\omega(\bar X^l\bar Y^k) := \lim_{\Lambda,\Lambda'\nearrow\mathbb{Z}^d}\omega(\bar X_\Lambda^l\,\bar Y_{\Lambda'}^k), \tag{2.17}$$
provided that the limit exists. We use the following standard result that can be viewed as a non-commutative law of large numbers.

Lemma 2.2. For $\omega\in S_{\mathrm{erg}}$, the limit (2.17) exists and
$$\omega(\bar X^l\bar Y^k) = [\omega(X)]^l[\omega(Y)]^k. \tag{2.18}$$
Note that $\omega(X) = \omega(\bar X)$ and $\omega(Y) = \omega(\bar Y)$ by translation invariance. An immediate corollary is that for a non-commutative polynomial $G$ which is a quantization of $g$ (see Sec. 2.2), and $\omega\in S_{\mathrm{erg}}$:
$$\omega(G(\bar X,\bar Y)) = g(\omega(X),\omega(Y)). \tag{2.19}$$
For the convenience of the reader, we sketch the proof of Lemma 2.2 in the Appendix. Finally, we note that Lemma 2.2 does not require the state $\omega$ to be trivial at infinity. Triviality at infinity is a stronger notion which is not used in the present paper. In particular, the state $\bar\mu$ constructed in Sec. 4 is ergodic, but not trivial at infinity, since it fails to be ergodic with respect to a subgroup of lattice translations.

3. Result

Choose $X, Y$ to be local operators and let $H_\Lambda^\Phi$ be the Hamiltonian corresponding to a finite-range, translation invariant interaction $\Phi$, as in Sec. 2.1. Let $\tilde G$ be a symmetric quantization of a polynomial $g$ on the rectangle $\mathrm{Ran}(X,Y)$ and $G(\cdot,\cdot)$ the corresponding self-adjoint operator, as defined in Sec. 2.2. We define the "G-mean field partition function"
$$Z_\Lambda^G(\Phi) := \mathrm{Tr}_\Lambda\big(e^{-H_\Lambda + |\Lambda| G(\bar X_\Lambda,\bar Y_\Lambda)}\big) \tag{3.1}$$
with $\bar X_\Lambda,\bar Y_\Lambda$ empirical averages of $X, Y$. The following theorem is our main result:

Theorem 3.1. Define the pressure
$$p(u,v) = \lim_{\Lambda\nearrow\mathbb{Z}^d}\frac{1}{|\Lambda|}\log\mathrm{Tr}_\Lambda\, e^{-H_\Lambda^\Phi + uX_\Lambda + vY_\Lambda} \tag{3.2}$$
and its Legendre transform
$$I(x,y) = \sup_{(u,v)\in\mathbb{R}^2}(ux + vy - p(u,v)). \tag{3.3}$$
Then
$$\lim_{\Lambda\nearrow\mathbb{Z}^d}\frac{1}{|\Lambda|}\log Z_\Lambda^G(\Phi) = \sup_{(x,y)\in\mathbb{R}^2}(g(x,y) - I(x,y)) \tag{3.4}$$
where the limit $\Lambda\nearrow\mathbb{Z}^d$ is in the sense of Van Hove, as in (3.2). In particular, the left-hand side of (3.4) does not depend on the particular form of quantization taken.

As discussed in Sec. 1, our result expresses the pressure of the mean field Hamiltonian through a variational principle. To derive this result, it is helpful to represent this pressure first as a variational problem on a larger space, namely that of ergodic states, as in Theorem 3.2. Theorem 3.1 follows then by parametrizing these states by their values on $X$ and $Y$. We also need the "local energy operator" associated to the interaction $\Phi$ as
$$E_\Phi := \sum_{A\ni 0}\frac{1}{|A|}\Phi_A. \tag{3.5}$$
ΛZd
1 log ZΛG (Φ) = sup (g(ω(X), ω(Y )) + s(ω) − ω(EΦ )). |Λ| ω∈Serg
(3.6)
To understand how the first term on the right-hand side of (3.6) originates from (3.1), we recall the equality (2.19) for ergodic states ω. The proof of Theorem 3.2 is postponed to Secs. 5 and 6. Here we prove that Theorem 3.1 is a rather immediate consequence of Theorem 3.2. Proof of Theorem 3.1. We write the right-hand side of (3.6) in the form ˜ y)) sup (g(x, y) − I(x,
(3.7)
(x,y)∈R2
where ˜ y) = I(x,
inf
ω∈Serg ω(X)=x, ω(Y )=y
(−s(ω) + ω(EΦ ))
(3.8)
is a convex function on R2 , infinite on the complement of Ran(X, Y ). To establish ˜ y) is lower semi-continuous (l.s.c.), we proceed as in the proof of the that I(x, contraction principle in large deviation theory, see, e.g., [5]: The function ω → (−s(ω) + ω(EΦ )) is l.s.c. and the set {ω ∈ Serg , ω(X) = x, ω(Y ) = y} is compact by the continuity of ω → (ω(X), ω(Y )) (compactness and continuity with respect to the weak∗ -topology). Therefore, the infimum is attained and we can deduce that ˜ y) ≤ a} = F ({ω ∈ Serg |−s(ω) + ω(EΦ ) ≤ a}) {x, y | I(x,
(3.9)
where F : ω → (ω(X), ω(Y )). The level set on the left-hand side is closed and hence I˜ is l.s.c.
By using the infinite-volume Gibbs variational principle [12, 20], the Legendre–Fenchel transform of $\tilde I$ reads
$$\sup_{(x,y)\in\mathbb{R}^2}(ux + vy - \tilde I(x,y)) = \sup_{\omega\in S_{\mathrm{erg}}}(s(\omega) - \omega(E_\Phi) + u\,\omega(X) + v\,\omega(Y)) = p(u,v). \tag{3.10}$$
The equality $I = \tilde I$ then follows by the involution property of the Legendre–Fenchel transform on the set of convex lower-semicontinuous functions, see, e.g., [20].

Independence of boundary conditions. Observe that both Theorems 3.1 and 3.2 have been formulated for the finite volume Gibbs states with open boundary conditions. It is however easy to check that this choice is not essential and other equivalent formulations can be obtained. Indeed, by the standard log-trace inequality,
$$\big|\log\mathrm{Tr}_\Lambda\big(e^{-\beta H_\Lambda + W_\Lambda + |\Lambda| G(\bar X_\Lambda,\bar Y_\Lambda)}\big) - \log\mathrm{Tr}_\Lambda\big(e^{-\beta H_\Lambda + |\Lambda| G(\bar X_\Lambda,\bar Y_\Lambda)}\big)\big| \le \|W_\Lambda\| \tag{3.11}$$
and hence if one chooses $W_\Lambda$ such that $\lim_{\Lambda\nearrow\mathbb{Z}^d}\|W_\Lambda\|/|\Lambda| = 0$, then we can replace $-\beta H_\Lambda$ by $-\beta H_\Lambda + W_\Lambda$ in Theorems 3.1 and 3.2.

Finite-range restrictions. It is obvious that our paper contains some restrictions that are not essential. In particular, by standard estimates (in particular, those used to prove the existence of the pressure, see, e.g., [20]) one can relax the finite-range conditions on the interaction $\Phi$ to the condition that
$$\sum_{A\ni 0}\frac{\|\Phi_A\|}{|A|} < \infty, \tag{3.12}$$
and similarly for the local observables $X, Y$. Moreover, it is not necessary that $G$ is a non-commutative polynomial. Starting from (3.11), one checks that it suffices that $G$ can be approximated in operator norm by non-commutative polynomials.

4. Approximation by Ergodic States

In this section, we describe a construction that is the main ingredient of our proofs, as well as of those in [10, 17]. This construction will be used in Secs. 6 and 7.

Let $V$ be a hypercube centered at the origin, i.e. $V = [-L,L]^d$ for some $L>1$ and let
$$\partial V := \{i\in V \mid \exists\, i'\in\mathbb{Z}^d\setminus V \text{ such that } i, i' \text{ are nearest neighbors}\}. \tag{4.1}$$
We write
$$\mathbb{Z}^d/V = ((2L+1)\mathbb{Z})^d \tag{4.2}$$
to denote the "block lattice" whose points can be thought of as translates of $V$. In other words, $\mathbb{Z}^d = \cup_{i\in\mathbb{Z}^d/V}(V+i)$. Consider a state $\mu_V$ on $\mathcal{B}_V$.
August 10, J070-S0129055X10004089
2010 15:1 WSPC/S0129-055X
148-RMP
Note on Non-Commutative Laplace–Varadhan Integral Lemma
851
We aim to build an infinite-volume ergodic state out of $\mu_V$. First, we define the block product state
\[
\tilde\mu := \bigotimes_{\mathbb{Z}^d/V} \mu_V. \tag{4.3}
\]
We define also the translation-average of $\tilde\mu$,
\[
\bar\mu := \frac{1}{|V|} \sum_{j\in V} \tilde\mu \circ \tau_j. \tag{4.4}
\]
We can now check the following properties:

• We have the exact equality of entropies
\[
s(\bar\mu) = s(\tilde\mu) = \frac{1}{|V|} S(\mu_V). \tag{4.5}
\]
This follows from the affinity of the entropy in infinite volume. A remark is in order: A priori, the infinite-volume entropy is defined for translation-invariant states, whereas $\tilde\mu$ is only periodic. However, one easily sees that the entropy can still be defined, e.g. by viewing $\tilde\mu$ as a translation-invariant state on the block lattice $\mathbb{Z}^d/V$, and correcting the definition by dividing by $|V|$.

• The state $\bar\mu$ is ergodic. This follows, for example, from an explicit calculation that is presented in [10]. Note however that $\bar\mu$ is in general not ergodic with respect to the translations over the sublattice $\mathbb{Z}^d/V = ((2L+1)\mathbb{Z})^d$. This phenomenon (though in a slightly different setting) is commented upon in [20] (the end of Sec. III.5).

• The state $\bar\mu$ is a good approximation of $\mu_V$ for observables which are empirical averages, provided $V$ is large. Consider the local observable $X$ as in Sec. 2.1. A translate $\tau_j X$ can lie inside a translate of $V$, i.e. $\mathrm{Supp}\,\tau_j X \subset V + i$ for some $i \in \mathbb{Z}^d/V$, or it can lie on the boundary between two translates of $V$. The difference between $\bar\mu(\bar X) = \bar\mu(X)$ and $\mu_V(\bar X_V)$ clearly stems from those translates where $X$ lies on a boundary, and the fraction of such translates is bounded by
\[
|\mathrm{Supp}\,X| \times \frac{|\partial V|}{|V|}. \tag{4.6}
\]
Hence
\[
|\bar\mu(\bar X) - \mu_V(\bar X_V)| \le \|X\|\, |\mathrm{Supp}\,X| \times \frac{|\partial V|}{|V|}. \tag{4.7}
\]
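The bounds (4.6) and (4.7) are useful because the ratio $|\partial V|/|V|$ vanishes as $L$ grows. The following Python sketch illustrates this for the hypercube $V = [-L, L]^d$, assuming (as an interpretation of (4.1), not stated explicitly in the text) that $V$ means the lattice points of the cube, so that $\partial V$ consists of the sites having at least one coordinate equal to $\pm L$. The function names are illustrative only.

```python
def block_volume(L, d):
    """Number of lattice sites in the hypercube V = [-L, L]^d."""
    return (2 * L + 1) ** d


def boundary_size(L, d):
    """Number of sites of V with a nearest neighbor outside V.

    These are the sites with at least one coordinate equal to +L or -L,
    i.e. all sites of V minus the interior cube [-(L-1), L-1]^d (L >= 1).
    """
    return (2 * L + 1) ** d - (2 * L - 1) ** d


def boundary_fraction(L, d):
    """The ratio |dV| / |V| appearing in (4.6) and (4.7)."""
    return boundary_size(L, d) / block_volume(L, d)


if __name__ == "__main__":
    # The ratio decays roughly like d / L as the block grows.
    for L in (2, 10, 100, 1000):
        print(L, boundary_fraction(L, d=3))
```

For fixed dimension the ratio behaves like $d/L$, which is the sense in which the error term in (4.7) becomes negligible for large blocks.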
5. The Lower Bound

In this section, we prove the following lower bound.

Lemma 5.1. Recall $Z_\Lambda^G(\Phi)$ as defined in (3.1). Then
\[
\liminf_{\Lambda\nearrow\mathbb{Z}^d} \frac{1}{|\Lambda|} \log Z_\Lambda^G(\Phi) \ge \sup_{\omega\in S_{\mathrm{erg}}} \bigl(g(\omega(X), \omega(Y)) + s(\omega) - \omega(E_\Phi)\bigr) \tag{5.1}
\]
where all symbols have the same meaning as in Sec. 3.
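Proof. Consider a state $\omega \in S_{\mathrm{erg}}$. We show that
\[
\liminf_{\Lambda\nearrow\mathbb{Z}^d} \frac{1}{|\Lambda|} \log Z_\Lambda^G(\Phi) \ge g(\omega(X), \omega(Y)) + s(\omega) - \omega(E_\Phi). \tag{5.2}
\]
Consider, for each volume $\Lambda$, the restriction $\omega_\Lambda := \omega|_{B_\Lambda}$. By the finite-volume variational principle (see, e.g., [2, Proposition 6.2.22]),
\[
\frac{1}{|\Lambda|} \log Z_\Lambda^G(\Phi) \ge \omega_\Lambda(G(\bar X_\Lambda, \bar Y_\Lambda)) + \frac{1}{|\Lambda|} S(\omega_\Lambda) - \frac{1}{|\Lambda|}\omega_\Lambda(H_\Lambda). \tag{5.3}
\]
The following convergence properties apply with $\Lambda \nearrow \mathbb{Z}^d$ in the sense of Van Hove:
\[
\text{(1)}\quad \omega_\Lambda(G(\bar X_\Lambda, \bar Y_\Lambda)) = \omega(G(\bar X_\Lambda, \bar Y_\Lambda)) \to g(\omega(X), \omega(Y)), \tag{5.4}
\]
\[
\text{(2)}\quad \frac{1}{|\Lambda|} S(\omega_\Lambda) \to s(\omega), \tag{5.5}
\]
\[
\text{(3)}\quad \frac{1}{|\Lambda|}\omega(H_\Lambda) \to \omega(E_\Phi). \tag{5.6}
\]
The relation (5.6) is obvious from the finite-range condition on $\Phi$, see Sec. 2.1. The convergence (5.5) is the definition of the mean entropy $s$. Finally, (5.4) follows from the ergodicity of $\omega$ as explained in Sec. 2.3. The relation (5.2) now follows immediately, and (5.1) follows since one can repeat the above construction for any ergodic state $\omega$.

6. The Upper Bound

6.1. Reduction to product states

In this section, we outline how to approximate
\[
\frac{1}{|\Lambda|} \log Z_\Lambda^G(\Phi) \tag{6.1}
\]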
by a similar expression involving the partition function of a block-product state. Fix a hypercube $V = [-L, L]^d$ and cover the lattice with its translates, as explained in Sec. 4. From now on, $\Lambda$ is chosen such that it is a multiple of $V$. One can easily adapt the arguments so as to cover the case where $\Lambda$ tends to infinity in the sense of Van Hove (as one has to do as well in the proof of the existence of the pressure for local interactions, see [12]). Define the observables
\[
H_\Lambda^V \equiv H_\Lambda^{\Phi,V}, \qquad \bar X_\Lambda^V, \qquad \bar Y_\Lambda^V \tag{6.2}
\]
by cutting all terms that connect any two translates of $V$, i.e.
\[
\bar X_\Lambda^V := \frac{1}{|\Lambda|} \sum_{\substack{j\in\Lambda \\ \exists\, i\in\mathbb{Z}^d/V:\ \mathrm{Supp}\,\tau_j X \subset V+i}} \tau_j X, \tag{6.3}
\]
and analogously for $H_\Lambda^V$ and $\bar Y_\Lambda^V$. One can say that these observables with superscript $V$ are one-block observables with the blocks being translates of $V$. One easily
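derives that
\[
\|\bar X_\Lambda^V - \bar X_\Lambda\| \le \|X\|\, |\mathrm{Supp}\,X|\, \frac{|\partial V|}{|V|}, \qquad \|H_\Lambda^V - H_\Lambda\| \le r(\Phi)\,|\Lambda|\, \frac{|\partial V|}{|V|} \tag{6.4}
\]
with the number $r(\Phi)$ as defined in Sec. 2.1. Using the log-trace inequality, we bound
\[
\left| \frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda\bigl(e^{-H_\Lambda + |\Lambda| G(\bar X_\Lambda, \bar Y_\Lambda)}\bigr) - \frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda\bigl(e^{-H_\Lambda^V + |\Lambda| G(\bar X_\Lambda^V, \bar Y_\Lambda^V)}\bigr) \right| \tag{6.5}
\]
as follows:
\[
\begin{aligned}
(6.5) &\le \frac{1}{|\Lambda|}\|H_\Lambda - H_\Lambda^V\| + \|G(\bar X_\Lambda, \bar Y_\Lambda) - G(\bar X_\Lambda^V, \bar Y_\Lambda^V)\| \\
&\le \bigl(r(\Phi) + C_g(\|X\|\,|\mathrm{Supp}\,X| + \|Y\|\,|\mathrm{Supp}\,Y|)\bigr) \frac{|\partial V|}{|V|}
\end{aligned}
\]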
where $C_g$ is a constant depending on the function $G$. The second term of (6.5) is clearly the pressure of a product state with mean-field interaction. We will find an upper bound for this pressure by slightly extending the treatment of Petz et al. in [17]. We prove an “extended PRV” lemma, Lemma 6.1, in the next section.

6.2. The extended Petz–Raggio–Verbeure upper bound

In this section, we outline the bound from above on the quantity
\[
\frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda\bigl(e^{-H_\Lambda^V + |\Lambda| G(\bar X_\Lambda^V, \bar Y_\Lambda^V)}\bigr) \tag{6.6}
\]
that appeared in (6.5). To do this, let us make the setting slightly more abstract. Consider the lattice $\mathbb{Z}^d$ with the one-site Hilbert space $\mathcal{G}$ given by
\[
\mathcal{G} := \bigotimes_{V} \mathcal{H}. \tag{6.7}
\]
In words, $\mathbb{Z}^d$ should be thought of as the block lattice $\mathbb{Z}^d/V$. Let $D, A, B$ be one-site observables on the new lattice, i.e. $D, A, B$ are Hermitian operators on $\mathcal{G}$. The extended PRV (Petz–Raggio–Verbeure) lemma states the following.

Lemma 6.1 (Extended PRV). Let all symbols have the same meaning as in Secs. 2.1–2.3, except that the one-site Hilbert space is changed from $\mathcal{H}$ to $\mathcal{G}$. Then
\[
\limsup_{\Lambda\nearrow\mathbb{Z}^d} \frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda\bigl(e^{-D_\Lambda + |\Lambda| G(\bar A_\Lambda, \bar B_\Lambda)}\bigr) \le \sup_{\omega\in S_{\mathrm{sym}}} \bigl(\omega(G(\bar A, \bar B)) + s(\omega) - \omega(D)\bigr). \tag{6.8}
\]
In particular, $\omega(G(\bar A, \bar B))$, defined as in (2.17), exists.

To appreciate the similarity between (6.8) and (3.6), one should realize that $D$ is a local energy operator, as EΨ in (3.6). The proof of this lemma in the case that $A = B$ is in the original paper [17]. The proof for the more general case is presented
in Sec. 7. Of course, one can prove that the right-hand side of (6.8) is also a lower bound: it suffices to copy Sec. 5. By the Størmer theorem, see (2.14), each symmetric state $\omega$ on $U$ can be written as the barycenter of a regular probability measure on the product states, and since all terms on the right-hand side of (6.8) are affine and upper semicontinuous functions of $\omega$, it follows that the sup can be restricted to product states (see [17] for the fine details of this argument). Since, moreover, all product states are ergodic, we can replace $\omega(G(\bar A, \bar B))$ by $g(\omega(A), \omega(B))$. Hence, Lemma 6.1 implies that
\[
\limsup_{\Lambda\nearrow\mathbb{Z}^d} \frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda\bigl(e^{-D_\Lambda + |\Lambda| G(\bar A_\Lambda, \bar B_\Lambda)}\bigr) \le \sup_{\omega\ \mathrm{prod.}} \bigl(g(\omega(A), \omega(B)) + s(\omega) - \omega(D)\bigr). \tag{6.9}
\]
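6.2.1. From the extended PRV to the upper bound

Next, we use (6.9) to formulate an upper bound on the quantity
\[
\frac{1}{|\Lambda|} \log \mathrm{Tr}_\Lambda\bigl(e^{-H_\Lambda^V + |\Lambda| G(\bar X_\Lambda^V, \bar Y_\Lambda^V)}\bigr) \tag{6.10}
\]
for $\Lambda$ a multiple of $V$. This means that we have to recall that the lattice sites in (6.9) are in fact blocks. We write $\Lambda^* := \Lambda/V$ and choose
\[
D := H_V, \qquad A := \bar X_V, \qquad B := \bar Y_V.
\]
Then, by the extended PRV,
\[
\begin{aligned}
(6.10) &\le \sup_{\omega\ \mathrm{prod.\ on}\ B(\Lambda^*)} \Bigl( g(\omega(A), \omega(B)) + \frac{1}{|V|} s^*(\omega) - \frac{1}{|V|}\omega(D) \Bigr) \\
&= \sup_{\omega_V\ \mathrm{on}\ B_V} \Bigl( G(\omega_V(\bar X_V), \omega_V(\bar Y_V)) + \frac{1}{|V|} S(\omega_V) - \frac{1}{|V|}\omega_V(H_V) \Bigr)
\end{aligned}
\]
where $s^*$ indicates that this is the entropy density on the block lattice $\Lambda^*$, hence it should be divided by $|V|$ to obtain the density on $\Lambda$. Now, let $\tilde\omega$ be the infinite-volume state obtained by taking a block-product over states $\omega_V$ and let $\bar\omega$ be its “translation-average”, as in Sec. 4. By the conclusions of Sec. 4, it follows that $\bar\omega$ is ergodic and $s(\bar\omega) = \frac{1}{|V|} S(\omega_V)$. Also, we see that
\[
|\omega_V(\bar X_V) - \bar\omega(X)| \le \|X\|\, |\mathrm{Supp}\,X|\, \frac{|\partial V|}{|V|}, \qquad \left| \frac{1}{|V|}\omega_V(H_V) - \bar\omega(E_\Phi) \right| \le r(\Phi) \frac{|\partial V|}{|V|},
\]
and analogously for $\bar Y_V$. Consequently, we obtain
\[
(6.10) \le \sup_{\omega\in S_{\mathrm{erg}}} \bigl(g(\omega(X), \omega(Y)) + s(\omega) - \omega(E_\Phi)\bigr) + O\!\left(\frac{|\partial V|}{|V|}\right), \qquad V \nearrow \mathbb{Z}^d,
\]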
which proves the upper bound for Theorem 3.2, since the $O(|\partial V|/|V|)$-term can be made arbitrarily small by increasing $V$.
7. Proof of Lemma 6.1

Let the state $\mu_\Lambda$ on $B_\Lambda$ be given by
\[
\mu_\Lambda(\,\cdot\,) = \frac{1}{Z_\Lambda^G(D)} \mathrm{Tr}_\Lambda\bigl(e^{-D_\Lambda + |\Lambda| G(\bar A_\Lambda, \bar B_\Lambda)}\ \cdot\ \bigr)
\]
with
\[
Z_\Lambda^G(D) := \mathrm{Tr}_\Lambda\bigl(e^{-D_\Lambda + |\Lambda| G(\bar A_\Lambda, \bar B_\Lambda)}\bigr).
\]
Naturally, $\mu_\Lambda$ is the finite-volume Gibbs state that saturates the variational principle, i.e.
\[
\begin{aligned}
\frac{1}{|\Lambda|} \log Z_\Lambda^G(D) &= \sup_{\omega_\Lambda\ \mathrm{on}\ B_\Lambda} \Bigl( \omega_\Lambda(G(\bar A_\Lambda, \bar B_\Lambda)) + \frac{1}{|\Lambda|} S(\omega_\Lambda) - \omega_\Lambda(D) \Bigr) \\
&= \mu_\Lambda(G(\bar A_\Lambda, \bar B_\Lambda)) + \frac{1}{|\Lambda|} S(\mu_\Lambda) - \mu_\Lambda(D).
\end{aligned} \tag{7.1}
\]
Our strategy is to attain the “entropy” and “energy” of the state $\mu_\Lambda$ via ergodic states. For definiteness, we assume that $G$ is of the form
\[
G(\bar A_\Lambda, \bar B_\Lambda) := [\bar A_\Lambda]^k [\bar B_\Lambda]^l \quad \text{for some integers } k, l
\]
(which, strictly speaking, is not allowed since $G(\bar A_\Lambda, \bar B_\Lambda)$ has to be a self-adjoint operator, but this does not matter for the argument in this section). The general case follows by the same argument.

We apply the construction in Sec. 4 to $\mu_\Lambda$, thus obtaining infinite-volume states $\tilde\mu$ and $\bar\mu$. Since we will repeat the construction for different $\Lambda$, we indicate the $\Lambda$-dependence in $\tilde\mu^{\{\Lambda\}}$ and $\bar\mu^{\{\Lambda\}}$, remembering that these are states on the infinite lattice. They satisfy
\[
s(\bar\mu^{\{\Lambda\}}) = \frac{1}{|\Lambda|} S(\mu_\Lambda). \tag{7.2}
\]
We have also established in Sec. 4 that $\bar\mu^{\{\Lambda\}}$ is ergodic and that the states $\bar\mu^{\{\Lambda\}}$ and $\tilde\mu^{\{\Lambda\}}$ approximate $\mu_\Lambda$ for observables which are empirical averages. However, we cannot conclude yet that they have comparable values for $G(\bar A, \bar B)$, except in the case where $G$ is linear. Essentially, such a comparison is achieved next by using the fact that $\mu_\Lambda$ is symmetric.

Choose a sequence of volumes $\Lambda_n$ such that along that sequence the right-hand side of (7.1) converges. We assume that $\bar\mu^{\{\Lambda_n\}}$ has a weak∗-limit as $n \nearrow \infty$, which can always be achieved (by weak∗-compactness) by restricting to a subsequence of $\Lambda_n$. We call this limit $\mu$. By construction, it is a symmetric state.
Energy estimate. Since $\bar\mu^{\{\Lambda_n\}} \to \mu$ in the weak∗-topology, and $\bar\mu^{\{\Lambda_n\}}(D) = \mu_{\Lambda_n}(D)$, we have
\[
\mu_{\Lambda_n}(D) \to \mu(D). \tag{7.3}
\]

G-estimates. Using the symmetry of the state $\mu_\Lambda$, we estimate
\[
\bigl|\mu_\Lambda(G(\bar A_\Lambda, \bar B_\Lambda)) - \mu_\Lambda(\otimes^k A \otimes^l B)\bigr| \le \frac{c(k,l)}{|\Lambda|} \max(\|A\|, \|B\|)^{k+l} + O\!\left(\frac{(k+l)^2}{|\Lambda|^2}\right), \qquad |\Lambda| \nearrow \infty, \tag{7.4}
\]
where the tensor products
\[
\otimes^k A \otimes^l B := \underbrace{A \otimes \cdots \otimes A}_{k\ \text{copies}} \otimes \underbrace{B \otimes \cdots \otimes B}_{l\ \text{copies}} \tag{7.5}
\]
denote that all one-site operators are placed on different sites. Since $\mu_\Lambda$ is symmetric, we need not specify on which sites. The error term of order $1/|\Lambda|$ comes from those terms in the expansion of the monomial containing a product of $k + l$ one-site operators but only involving $k + l - 1$ sites. Since $\mu$ is symmetric, we obtain analogously that
\[
\mu(G(\bar A, \bar B)) = \mu(\otimes^k A \otimes^l B). \tag{7.6}
\]
In particular, the left-hand side is well-defined. Hence, by combining (7.4) and (7.6), we obtain
\[
\mu_{\Lambda_n}(G(\bar A_{\Lambda_n}, \bar B_{\Lambda_n})) \to \mu(G(\bar A, \bar B)). \tag{7.7}
\]
For a more general non-commutative polynomial $G$ as defined in Sec. 2.2 (not necessarily a monomial), the convergence (7.7) follows easily since $G(\bar A_{\Lambda_n}, \bar B_{\Lambda_n})$ can be approximated in operator norm by polynomials.

Entropy estimates. As established in Sec. 4, we have
\[
\frac{1}{|\Lambda|} S(\mu_\Lambda) = s(\bar\mu^{\{\Lambda\}}) \quad \text{for all } \Lambda. \tag{7.8}
\]
By the upper semi-continuity of the infinite-volume entropy and the convergence $\bar\mu^{\{\Lambda_n\}} \to \mu$, we get that
\[
\limsup_{n\nearrow\infty} s(\bar\mu^{\{\Lambda_n\}}) \le s(\mu). \tag{7.9}
\]
Hence
\[
\lim_{n\nearrow\infty} \frac{1}{|\Lambda_n|} S(\mu_{\Lambda_n}) \le s(\mu). \tag{7.10}
\]
By combining the convergence results (7.3), (7.7) and (7.10), we have proven that there is a symmetric state $\mu$ such that the right-hand side of (6.8) with $\omega \equiv \mu$ is larger than a given limit point of the right-hand side of (7.1). Since the construction can be repeated for any limit point, this concludes the proof of Lemma 6.1.
Acknowledgment

The authors thank M. Fannes, M. Mosonyi, Y. Ogata, D. Petz and A. Verbeure for fruitful discussions. K. N. is also grateful to the Instituut voor Theoretische Fysica, K. U. Leuven, and to Budapest University of Technology and Economics for kind hospitality, and acknowledges the support from the Grant Agency of the Czech Republic (Grant no. 202/07/J051). W. D. R. was a postdoctoral fellow of the FWO-Flanders at the time when the paper was written and he acknowledges the financial support. L. R. B. acknowledges the support of the NSF (DMS-0605058).

Appendix. Proof of Lemma 2.2

To prove Lemma 2.2, it is convenient to introduce an extended framework: Let $\pi_\omega$ be the cyclic GNS-representation associated to the state $\omega$, $H_\omega$ the associated Hilbert space and $\psi \in H_\omega$ the representative of the state $\omega$, i.e.
\[
\omega(A) = \langle \psi, \pi_\omega(A)\psi \rangle_{H_\omega}, \qquad A \in U. \tag{A.1}
\]
The set $\pi_\omega(U)$ is a subalgebra of $B(H_\omega)$. Let $U_j$, $j \in \mathbb{Z}^d$, be the unitary representation of the translation group induced on $\pi_\omega(U)$, i.e.
\[
U_j \pi_\omega(A) U_j^* = \pi_\omega(\tau_j A). \tag{A.2}
\]
Ergodicity of $\omega$ implies (see, e.g., the proof of [20, Theorem III.1.8]) that
\[
\frac{1}{|\Lambda|} \sum_{j\in\Lambda} U_j \xrightarrow[\Lambda\nearrow\mathbb{Z}^d]{\text{strongly}} P_\psi \tag{A.3}
\]
where $P_\psi$ is the one-dimensional orthogonal projector associated to the vector $\psi$, and $\Lambda \nearrow \mathbb{Z}^d$ in the sense of Van Hove. Using (A.3) and the translation-invariance $U_j\psi = \psi$, one calculates
\[
\pi(\bar X_\Lambda)\pi(\bar Y_\Lambda)\psi = \frac{1}{|\Lambda|^2} \sum_{j,j'\in\Lambda} U_j \pi(X) U_{j'-j} \pi(Y) U_{-j'}\psi \xrightarrow[\Lambda\nearrow\mathbb{Z}^d]{} P_\psi \pi(X) P_\psi \pi(Y)\psi = \omega(X)\omega(Y)\psi
\]
for local observables $X, Y \in U$. Taking the scalar product with $\psi$, we conclude that $\omega(\bar X_\Lambda \bar Y_\Lambda) \to \omega(X)\omega(Y)$. The same argument works for all polynomials in $\bar X_\Lambda, \bar Y_\Lambda$, thus proving Lemma 2.2. Finally, we remark that one can also construct the operators $\bar X, \bar Y$ as weak∗-limits of $\bar X_\Lambda, \bar Y_\Lambda$, as $\Lambda \nearrow \mathbb{Z}^d$ (these weak∗-limits are simply multiples of the identity: $\omega(X)1$, $\omega(Y)1$). This is however not necessary for our results.

References

[1] H. Araki and P. D. F. Ion, On the equivalence of KMS and Gibbs conditions for states of quantum lattice systems, Comm. Math. Phys. 35 (1974) 1–12.
[2] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics: 2, 2nd edn. (Springer-Verlag, Berlin, 1996).
[3] J.-B. Bru and W. de Siqueira Pedra, Equilibrium states of Fermi systems with long range interactions, in preparation.
[4] R. Heylen, D. Bollé and N. S. Skantzos, Thermodynamics of spin systems on small-world hypergraphs, Phys. Rev. E 74 (2006) 056111.
[5] A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications (Springer, Berlin, 1993).
[6] F. den Hollander, Large Deviations, Fields Institute Monographs, Vol. 14 (Amer. Math. Soc., 2000).
[7] J. D. Deuschel and D. W. Stroock, Large Deviations, Pure and Applied Mathematics, Vol. 137 (Academic Press, Boston, 1989).
[8] R. S. Ellis, Entropy, Large Deviations, and Statistical Mechanics (Springer, 2005).
[9] H.-O. Georgii, Gibbs Measures and Phase Transitions, De Gruyter Studies in Mathematics, Vol. 9 (De Gruyter, 1988).
[10] F. Hiai, M. Mosonyi, H. Ohno and D. Petz, Free energy density for mean field perturbation of states of a one-dimensional spin chain, Rev. Math. Phys. 20 (2008) 335–365.
[11] F. Hiai, M. Mosonyi and O. Tomohiro, Large deviations and Chernoff bound for certain correlated states on the spin chain, J. Math. Phys. 48(12) (2007) 123301–123319.
[12] R. B. Israel, Convexity in the Theory of Lattice Gases, Princeton Series in Physics (Princeton University Press, 1979).
[13] M. Lenci and L. Rey-Bellet, Large deviations in quantum lattice systems: One-phase region, J. Stat. Phys. 119 (2005) 715–746.
[14] K. Netočný and F. Redig, Large deviations for quantum spin systems, J. Stat. Phys. 117 (2004) 521–547.
[15] Y. Ogata, Large deviations in quantum spin chain, arXiv:0803.0113.
[16] S. Olla, Large deviations for Gibbs random fields, Probab. Theory Related Fields 77 (1988) 343–357.
[17] D. Petz, G. A. Raggio and A. Verbeure, Asymptotics of Varadhan-type and the Gibbs variational principle, Comm. Math. Phys. 121 (1989) 271–282.
[18] C.-E. Pfister, Thermodynamical aspects of classical lattice systems, in In and Out of Equilibrium, Probability with a Physics Flavor, Vol. 1, ed. V. Sidoravicius (Birkhäuser, 2002).
[19] W. De Roeck, C. Maes and K. Netočný, Quantum macrostates, equivalence of ensembles and an H theorem, J. Math. Phys. 47 (2006) 073303.
[20] B. Simon, The Statistical Mechanics of Lattice Gases (Princeton University Press, Princeton, 1993).
[21] E. Størmer, Symmetric states on infinite tensor products of C∗-algebras, J. Funct. Anal. 3 (1969) 48–68.
[22] S. R. S. Varadhan, Asymptotic probabilities and differential equations, Comm. Pure Appl. Math. 19 (1966) 261–286.
[23] S. R. S. Varadhan, Large Deviations and Applications (Society for Industrial and Applied Mathematics, 1984).
September 14, J070-S0129055X10004090
2010 13:28 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 22, No. 8 (2010) 859–879 © World Scientific Publishing Company DOI: 10.1142/S0129055X10004090

DYNAMICAL BOUNDS FOR STURMIAN SCHRÖDINGER OPERATORS

L. MARIN
UMR 6628-MAPMO, Université d’Orléans, B.P. 6759, 45067 Orléans cedex, France
[email protected]

Received 3 November 2009

The Fibonacci Hamiltonian, that is, a Schrödinger operator associated to a quasiperiodic Sturmian potential with respect to the golden mean, has been investigated intensively in recent years. Damanik and Tcheremchantsev developed a method in [10] and used it to exhibit a nontrivial dynamical upper bound for this model. In this paper, we use this method to extend dynamical upper bounds to a large family of Sturmian operators and show, at sufficiently large coupling, anomalous transport for operators associated to irrational numbers satisfying a generic Diophantine condition. As a counterexample, we exhibit a pathological irrational number which does not satisfy this condition and show that its associated dynamical exponent only has the ballistic bound. Moreover, we establish a global lower bound for the lower box counting dimension of the spectrum that is used to obtain a dynamical lower bound for bounded density irrational numbers.

Keywords: Sturmian Schrödinger operators; quasiperiodical potential; dynamical bounds.

Mathematics Subject Classification 2010: 81Q10, 47B36

1. Introduction

If $H$ is a self-adjoint operator on a separable Hilbert space $\mathcal{H}$, the time-dependent Schrödinger equation of quantum mechanics, $i\partial_t\psi = H\psi$, leads to a unitary dynamical evolution in $\mathcal{H}$, $\psi(t) = e^{-itH}\psi(0)$. Under the time evolution, $\psi(t)$ will generally spread out with time. Quantifying this spreading in concrete cases can be a complicated question. One of the most studied cases is where $\mathcal{H}$ is given by $L^2(\mathbb{R}^d)$ or $l^2(\mathbb{Z}^d)$, $H$ is a Schrödinger operator of the form $-\Delta + V$, and $\psi(0)$ is a localized wavepacket. The form of the potential $V$ depends on the physical model one studies. One of the most studied is the Sturmian potential and its particular subcase, the Fibonacci Hamiltonian, describing a standard one-dimensional quasicrystal.

The first approach to study quantum dynamics is the spectral theorem. Recall that each initial vector $\psi(0) = \psi$ has a spectral measure, defined as the unique
Borel measure verifying
\[
\langle \psi, f(H)\psi \rangle = \int_{\sigma(H)} f(E)\, d\mu_\psi(E)
\]
for every measurable function $f$, where $\langle\cdot,\cdot\rangle$ denotes the scalar product of $\mathcal{H}$. A major step in the theory, discovered by Guarneri ([14, 15]), was that suitable continuity properties of the spectral measure $d\mu_\psi$ imply lower bounds on the spreading of the wavepacket. This was then extended by many authors in [3, 16, 25, 23]. Continuity properties of the spectral measure follow from upper bounds on the measure of intervals, $\mu_\psi([E - \varepsilon, E + \varepsilon])$, $E \in \sigma(H)$, $\varepsilon \to 0$. Later on, many authors refined Guarneri’s method ([2, 17, 30]), making it possible to take into account the whole statistics of $\mu_\psi([E - \varepsilon, E + \varepsilon])$, $E \in \mathbb{R}$. One can find better lower bounds with information about both the measure of intervals and the growth of the generalized eigenfunctions $u_\psi(n, E)$ ([23, 30]). In the case of Schrödinger operators in one space dimension, the information on the spectral measure and on generalized eigenfunctions is linked to the properties of solutions to the difference (sometimes also called free) equation $Hu = Eu$ ([6, 11, 13, 19, 20, 31]). Explicit lower bounds on the spreading rate for numerous concrete cases come from an analysis of these solutions ([5, 6, 13, 19, 20, 23]).

The second approach to dynamical lower bounds in one dimension is based on the Parseval formula,
\[
2\pi \int_0^\infty e^{-2t/T}\, |\langle e^{-itH}\delta_1, \delta_n \rangle|^2\, dt = \int_{-\infty}^{\infty} \left| \left\langle \left(H - E - \frac{i}{T}\right)^{-1} \delta_1, \delta_n \right\rangle \right|^2 dE.
\]
This method, developed in [8, 9, 31], is the basis for the results in [7, 21]. It has the advantage that it gives dynamical bounds directly, without any knowledge of the properties of the spectral measure. What is required is upper bounds for solutions corresponding to some set of energies, which can be very small (nonempty is sufficient). Moreover, additional information makes it possible to improve the results. A combination of both approaches leads to optimal dynamical bounds for growing sparse potentials (see [31]). As mentioned before, there is a fairly good understanding of how to prove dynamical lower bounds, especially in one space dimension.

Results on dynamical upper bounds are fewer and more recent. Proving upper bounds is hard because one needs to control the entire wavepacket. In fact, the dynamical lower bounds that are typically established only bound some (fast) part of the wavepacket from below, and this is sufficient for the desired growth of the standard dynamical quantities. In the same way, it is of course much easier to prove upper bounds only for a (slow) portion of the wavepacket. Killip, Kiselev and Last developed this idea with success in [24]. Their work provides explicit criteria for upper bounds on the slow part of the wavepacket in terms of lower bounds on solutions. Applied to the Fibonacci operator, their general method supports the conjecture that this model exhibits anomalous transport (i.e. neither localized, nor diffusive, nor ballistic).
The conjecture for the Fibonacci model was finally proved at sufficiently large coupling by Damanik and Tcheremchantsev in [10]. They developed a general method establishing a connection between properties of solutions and dynamical upper bounds. Based on the Parseval formula, this method makes it possible to bound the entire wavepacket from above provided that suitable lower bounds for solution (or rather transfer matrix) growth at complex energies are available. The main purpose of this paper is to extend the application of this general method, used for the concrete Fibonacci model, to almost every Sturmian potential. We will show that one has anomalous transport for Sturmian models associated to irrational numbers far enough from rational numbers, in a sense we develop further. On the other hand, we construct an irrational number close enough to rational numbers that leads to ballistic motion.

In this paper, we also use tools that give a new lower bound for the box counting dimension of the spectrum, which is better for almost every irrational number. Since the spectrum is a Cantor set with Lebesgue measure zero, it is natural to investigate its fractal dimension. It is well known that this Cantor set is the limit of band spectra of approximant operators [29, 1]. To find the bound, we use band spectra at rank $n$ as a sequence of $\varepsilon_n$-covers of the spectrum. Using the information given in [28] about the number of bands in periodic band spectra and in [27] about the length of the bands, we estimate $\varepsilon_n$ and give a bound for the number of bands of this diameter. This leads to a bound from below on the minimal number of balls of diameter $\varepsilon_n$ one needs to cover the spectrum. This bound also has a direct dynamical application and allows us to state a dynamical lower bound using the method in ([30]). This lower bound requires the transfer matrix norms to be polynomially bounded.

This property is shown to be true for bounded density irrational numbers in [18], hence more is not expected. This limits the dynamical implication of this lower bound to a set of irrational numbers of Lebesgue measure 0.

We will give precise statements of the model we study and our results in the next section. Section 3 will be devoted to the proof of our main result. We give a pathological example in Sec. 4 and a new lower bound for the box counting dimension of the spectrum in Sec. 5.

2. Model and Statements

We limit our study to the one-dimensional discrete Schrödinger operator $H_\beta$,
\[
[H_\beta\psi](n) = \psi(n+1) + \psi(n-1) + V(n)\psi(n) \tag{1}
\]
acting on $l^2(\mathbb{Z})$, associated to a Sturmian potential $V(n)$ given by
\[
V(n) = (\lfloor (n+1)\beta \rfloor - \lfloor n\beta \rfloor)\, V
\]
with $\beta$ an irrational number in $[0, 1]$ and $V$ a positive constant. We denote the continued fraction expansion of $\beta$ by
\[
\beta = \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}} = [0, a_1, a_2, \ldots].
\]
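As an illustration of the definition above, the following Python sketch generates the Sturmian potential in the golden-mean case $\beta = (\sqrt{5}-1)/2$. It assumes that the potential is read as $V(n) = V(\lfloor (n+1)\beta \rfloor - \lfloor n\beta \rfloor)$; the floors are computed exactly with integer square roots, so no floating-point rounding enters. The helper names are illustrative and not taken from the paper.

```python
from math import isqrt


def floor_n_beta(n):
    """Exact floor(n * beta) for beta = (sqrt(5) - 1) / 2 and any integer n.

    For n >= 0 we use floor(n * sqrt(5)) = isqrt(5 * n * n). For n < 0 we use
    floor(-y) = -floor(y) - 1, valid because n * beta is never an integer
    when n != 0.
    """
    if n < 0:
        return -floor_n_beta(-n) - 1
    return (isqrt(5 * n * n) - n) // 2


def sturmian_potential(n, V):
    """V(n) = V * (floor((n + 1) * beta) - floor(n * beta)), taking values 0 or V."""
    return V * (floor_n_beta(n + 1) - floor_n_beta(n))


if __name__ == "__main__":
    # First sites n = 1, ..., 13 of the potential, in units of V.
    print([sturmian_potential(n, 1) for n in range(1, 14)])
```

The fraction of sites where the potential equals $V$ approaches $\beta$, since the sum of $V(n)/V$ over $1 \le n \le N$ telescopes to $\lfloor (N+1)\beta \rfloor$.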
The Fibonacci Hamiltonian, $H_\beta$ with $\beta = \frac{\sqrt{5}-1}{2} = [0, 1, 1, \ldots]$, is the simplest example in the Sturmian model because of its particular continued fraction expansion. Since we are interested in dynamical bounds, let us recall some quantities we want to bound. We denote the time-averaged outside probabilities by
\[
P(N, T) = \sum_{|n|>N} a(n, T),
\]
with
\[
a(n, T) = \frac{2}{T} \int_0^\infty e^{-2t/T}\, |\langle e^{-itH}\delta_1, \delta_n \rangle|^2\, dt.
\]
For all $\alpha \in [0, +\infty]$, set (see [13])
\[
S^-(\alpha) = -\liminf_{T\to\infty} \frac{\log P(T^\alpha - 2, T)}{\log T}
\]
and
\[
S^+(\alpha) = -\limsup_{T\to\infty} \frac{\log P(T^\alpha - 2, T)}{\log T}.
\]
The following critical exponents are of particular interest:
\[
\alpha_l^\pm = \sup\{\alpha \ge 0 : S^\pm(\alpha) = 0\}, \qquad \alpha_u^\pm = \sup\{\alpha \ge 0 : S^\pm(\alpha) < \infty\}.
\]
They verify $0 \le \alpha_l^\pm \le \alpha_u^\pm$. In particular, if $\gamma > \alpha_u^+$ then $P(T^\gamma, T)$ goes to 0 fast. The exponents $\alpha_l^\pm$ can be interpreted as the (lower and upper) rates of propagation of the essential part of the wavepacket and $\alpha_u^\pm$ as the rates of propagation of the fastest part of the wavepacket. Moreover, we always have $\alpha_u^+ \le 1$ for this kind of model. This upper bound, called ballistic, is the fastest rate of spreading of the wavepacket.

Sturmian potentials (quasiperiodic structure) are the intermediate situation between random potentials (no structure in the potential), which imply dynamical localization ($\alpha_u^\pm = 0$), and periodic potentials, which imply ballistic spreading, that is, $\alpha_u^\pm = 1$. More precisely, one has a nontrivial strictly positive bound for almost all irrational numbers. In a sense we will make more precise later, these irrational numbers are far enough from rational numbers. On the other hand, we show that for irrational numbers close enough to rational numbers, one has ballistic motion.

The first objective of this paper is to give a non-ballistic upper bound for a large set of irrational numbers. Recall the sequences associated to $\beta$:
\[
p_{-1} = 1, \qquad p_0 = 0, \qquad q_{-1} = 0, \qquad q_0 = 1,
\]
\[
p_{k+1} = a_{k+1} p_k + p_{k-1}, \qquad q_{k+1} = a_{k+1} q_k + q_{k-1}. \tag{2}
\]
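The recurrence (2) is straightforward to iterate. The Python sketch below computes the sequences $p_k$, $q_k$ for a finite list of partial quotients and, in the golden-mean case, compares the growth rate $(\log q_k)/k$ with $\log\frac{1+\sqrt{5}}{2}$. The function name and the truncation to finitely many partial quotients are choices made here for illustration.

```python
from math import log, sqrt


def convergents(partial_quotients):
    """Return ([p_0, p_1, ...], [q_0, q_1, ...]) for beta = [0, a_1, a_2, ...].

    Uses p_{-1} = 1, p_0 = 0, q_{-1} = 0, q_0 = 1 and the recurrence
    p_{k+1} = a_{k+1} p_k + p_{k-1},  q_{k+1} = a_{k+1} q_k + q_{k-1}.
    """
    p_prev, p = 1, 0
    q_prev, q = 0, 1
    ps, qs = [p], [q]
    for a in partial_quotients:
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        ps.append(p)
        qs.append(q)
    return ps, qs


if __name__ == "__main__":
    k = 400
    ps, qs = convergents([1] * k)          # golden mean: all a_j = 1
    print(ps[:6], qs[:6])                   # Fibonacci numbers
    print(log(qs[k]) / k, log((1 + sqrt(5)) / 2))
```

For the golden mean the $q_k$ are Fibonacci numbers, so $(\log q_k)/k$ converges to the logarithm of the golden ratio.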
We can now state our main result:

Theorem 1. Let $\beta$ be an irrational number and $H_\beta$ be defined as in (1) with a Sturmian potential associated to $\beta$. Assume that $V > 20$. If $D = \limsup_k \frac{\log q_k}{k}$ is finite then
\[
\alpha_u^+ \le \frac{2D}{\log\frac{V-8}{3}}.
\]
Moreover, for an irrational number with continued fraction expansion containing no 1, the dynamical upper bound becomes
\[
\alpha_u^+ \le \frac{D}{\log\frac{V-8}{3}}.
\]

Remark 1. It is clear that, taking $V$ large enough, one can obtain a nontrivial bound that is smaller than 1. It is well known that the set of irrational numbers with finite $D$ has full Lebesgue measure. In fact, for any quadratic irrational, that is, a number with an eventually periodic continued fraction expansion, one can easily compute $D$. Moreover, the explicit value of $D$ is known for almost all $\beta$ by the result of Khinchin discussed next.

Lemma 1 ([22]). For almost all $\beta$ with respect to Lebesgue measure,
\[
D = \limsup_k \frac{\log q_k}{k} = D_K = \frac{\pi^2}{12 \log 2},
\]
where $q_k$ is the sequence defined as in (2), and
\[
M = \liminf_k (a_1 \cdots a_k)^{1/k} = C_K = 2.685\ldots
\]
$C_K$ is called the Khinchin constant.

Corollary 1. For Lebesgue almost every irrational number $\beta$, we have
\[
\alpha_u^+ \le \frac{2D_K}{\log\frac{V-8}{3}}.
\]
Proof. It follows directly from Theorem 1 and Khinchin’s lemma.

Corollary 2. For a precious number, that is, $\omega = [0, a, a, a, a, \ldots]$ with $a \neq 1$, the bound becomes
\[
\alpha_u^+ \le \frac{\log(a + \omega)}{\log\frac{V-8}{3}}.
\]
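To get a feeling for the size of these bounds, the following Python sketch evaluates them numerically, assuming the reading $\alpha_u^+ \le cD/\log\frac{V-8}{3}$ with $c = 2$ in general and $c = 1$ when the continued fraction expansion contains no 1. The function names and the sample couplings are illustrative only.

```python
from math import log, pi, sqrt, exp


def upper_bound(D, V, factor=2.0):
    """Evaluate factor * D / log((V - 8) / 3); meaningful for V > 20."""
    return factor * D / log((V - 8.0) / 3.0)


# Value of D for Lebesgue-almost every beta (Lemma 1).
D_K = pi ** 2 / (12 * log(2))


def precious_D(a):
    """D = log(a + omega) for omega = [0, a, a, a, ...], i.e. omega = 1 / (a + omega)."""
    omega = (sqrt(a * a + 4) - a) / 2
    return log(a + omega)


if __name__ == "__main__":
    print("D_K =", D_K)
    # Coupling above which the almost-sure bound drops below the ballistic value 1.
    print("threshold V =", 3 * exp(2 * D_K) + 8)
    for V in (21, 41, 100, 1000):
        print(V, upper_bound(D_K, V), upper_bound(precious_D(2), V, factor=1.0))
```

Under this reading, the almost-sure bound of Corollary 1 is smaller than 1 only for $V$ somewhat above 40, which makes the "large coupling" requirement of Remark 1 concrete.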
Proof. One can compute $q_k$ easily for such numbers.

On the contrary, if $D$ is infinite, one can have ballistic motion at all large coupling:

Theorem 2. There exists an irrational number $\omega$ with $D = +\infty$ such that for any $V > 20$ the dynamics of $H_\omega$ is ballistic.

We also prove a new lower bound for the fractal dimension of the spectrum:

Theorem 3. Set $C_k = \frac{3}{k} \sum_{j=1}^{k} \log(a_j + 2)$. We have, for any irrational number $\beta$ verifying $C = \limsup C_k < +\infty$ and $V > 20$:
\[
\dim_B^+(\sigma) \ge \frac{\log 2}{\frac{1}{2}C + \log(V + 5)} \tag{3}
\]
where $\sigma$ is the spectrum of $H_\beta$.

3. Proof of Theorem 1

When one wants to bound all these dynamical quantities for specific models, it is useful to connect them to the qualitative behavior of the solutions of the difference equation
\[
\psi(n+1) + \psi(n-1) + V(n)\psi(n) = z\psi(n) \tag{4}
\]
with $z \in \mathbb{C}$ and $\psi$ a nonzero vector. One can reformulate this equation in terms of transfer matrices:
\[
\begin{pmatrix} \psi(n+1) \\ \psi(n) \end{pmatrix} = F(n, z) \begin{pmatrix} \psi(1) \\ \psi(0) \end{pmatrix}
\]
with
\[
F(n, z) = \begin{cases} T(n, z) \cdots T(1, z) & n \ge 1, \\ \mathrm{Id} & n = 0, \\ [T(n, z)]^{-1} \cdots [T(0, z)]^{-1} & n \le -1, \end{cases}
\]
and
\[
T(m, z) = \begin{pmatrix} z - V(m) & -1 \\ 1 & 0 \end{pmatrix}.
\]
We set
\[
M_k(z) = F(q_k, z) = \begin{cases} T(q_k, z) \cdots T(1, z) & n \ge 1, \\ \mathrm{Id} & k = 0, \\ [T(q_k, z)]^{-1} \cdots [T(0, z)]^{-1} & n \le -1. \end{cases}
\]
The following statement allows us to connect transfer matrix norms with dynamical exponents (see [10] for details). Here and in what follows, $f \lesssim g$ means that $f \le Cg$ for some positive constant $C$ that we leave implicit.

Theorem 4. Let $H_\beta$ be defined as in (1) and $K \ge 4$ be such that $\sigma(H_\beta) \subseteq [-K + 1, K - 1]$. Then, the outside probabilities can be bounded from above in terms of transfer matrix norms as follows:
\[
P_r(N, T) \lesssim \exp(-cN) + T^3 \int_{-K}^{K} \left( \max_{1 \le q_k \le N} \left\| M_k\!\left(E + \frac{i}{T}\right) \right\|^2 \right)^{-1} dE,
\]
\[
P_l(N, T) \lesssim \exp(-cN) + T^3 \int_{-K}^{K} \left( \max_{-N \le q_k \le -1} \left\| M_k\!\left(E + \frac{i}{T}\right) \right\|^2 \right)^{-1} dE,
\]
where the implicit constants depend only on $K$ and $c$ is a universal positive constant.

This theorem connects transfer matrix behavior with a dynamical upper bound in the following way. Choosing $N = N(T) = CT^\alpha$ such that both integrals decay faster than any inverse power of $T$ implies that $P(N(T), T)$ goes to 0 faster than any inverse power of $T$. By definition of $\alpha_u^+$, it follows that $\alpha_u^+ \le \alpha$. To exhibit such a condition, we have to prove that the considered energy is not in the spectrum; the transfer matrix norm is then shown to grow super-exponentially. We shall now recall a few properties of the transfer matrices and their traces. The transfer matrix sequence satisfies the following evolution in $k$ (see, e.g., [1, 28]):
\[
M_{k+1}(z) = M_{k-1}(z)\, M_k(z)^{a_{k+1}}. \tag{5}
\]
(6) (7)
If one denotes by $x_k = t_{k+1,0}$ the trace of $M_k$ and by $z_k = t_{k,1}$ the trace of $M_{k-1} M_k$, this can be reduced to the usual trace map relation (6)
\[
x_{k+1} = z_k S_{a_{k+1}-1}(x_k) - x_{k-1} S_{a_{k+1}-2}(x_k), \qquad z_{k+1} = z_k S_{a_{k+1}}(x_k) - x_{k-1} S_{a_{k+1}-1}(x_k),
\]
with initial conditions $x_{-1} = 2$, $x_0 = z$ and $z_0 = z - V$.

Remark 2. These two sequences depend on $z$, but we will omit this dependence in order to simplify notation.

Here, $S_l$ denotes the $l$th Chebyshev polynomial of the second kind:
\[
S_{-1}(x) = 0, \qquad S_0(x) = 1, \qquad S_{l+1}(x) = x S_l(x) - S_{l-1}(x), \quad \forall\, l \ge 0.
\]
The sequence $\{x_k(z)\}_k$ can have two different behaviors depending on $z$: it is bounded if and only if $z$ lies in the spectrum of $H_\beta$. A criterion was first stated by Sütő in [29] for the Fibonacci Hamiltonian and extended by Bellissard et al. in [1] to other irrational numbers. The appearance of $\delta$ in the next lemma is purely technical and does not change the proof.

Lemma 2. A necessary and sufficient condition for $\{x_k(z)\}_k$ to be unbounded is that
\[ |x_{N-1}(z)| \le 2 + \delta, \qquad |x_N(z)| > 2 + \delta, \qquad |z_N(z)| > 2 + \delta \]
for some $N \ge 0$. This $N$ is unique. Set
\[ G_k = G_{k-1} + a_k G_{k-2}, \qquad G_0 = 1, \qquad G_{-1} = 1. \]
We have
\[ |x_{k+1}| \ge |z_k| \ge e^{cG_{k-N}} + 1 \qquad \forall\, k > N, \]
with $c = \log(1+\delta) > 0$ constant.

Proof. We start by stating the following inequality for Chebyshev polynomials:
\[ |S_l(x)| - |S_{l-1}(x)| \ge (|x| - 1)|S_{l-1}(x)| - |S_{l-2}(x)| \ge (|x| - 1)\big[ |S_{l-1}(x)| - |S_{l-2}(x)| \big]. \]
Iterating this, one obtains
\[ |S_l(x)| - |S_{l-1}(x)| \ge (|x| - 1)^l \big[ |S_0(x)| - |S_{-1}(x)| \big] = (|x| - 1)^l. \]
The proof proceeds by induction. Hypothesis $H_N$ is the following: one has $|x_N| > 2 + \delta$ and $|z_N| > 2 + \delta$; moreover, $|x_{N-1}| \le |z_N|$. It is clear that the hypothesis of the lemma implies $H_N$. We now show the induction step, namely that $H_N$ implies $|z_{N+1}| > |z_N|$, $|x_{N+1}| > |z_N|$,
and $|x_N| \le |z_{N+1}|$. It is easy to see that these three relations together with $H_N$ imply $H_{N+1}$. Suppose $H_N$ to be true; then one has
\[ |z_{N+1}| \ge |z_N S_{a_{N+1}}(x_N)| - |x_{N-1} S_{a_{N+1}-1}(x_N)| \ge |z_N| \big[ |S_{a_{N+1}}(x_N)| - |S_{a_{N+1}-1}(x_N)| \big] \ge |z_N| (|x_N| - 1)^{a_{N+1}}. \tag{8} \]
This shows that $|z_{N+1}| > |z_N|$, since $|x_N| \ge 2 + \delta$. One also has $|z_{N+1}| > |x_N|$. Indeed, one can write
\[ |z_{N+1}| \ge |z_N|(|x_N| - 1) \ge |x_N| + (|z_N| - 1)|x_N| - |z_N| \ge |x_N| + 2(|z_N| - 1) - |z_N| \ge |x_N| + |z_N| - 2 \ge |x_N|. \]
Only the last relation remains to be shown. One shows in the same way as before that
\[ |x_{N+1}| \ge |z_N S_{a_{N+1}-1}(x_N)| - |x_{N-1} S_{a_{N+1}-2}(x_N)| \ge |z_N| \big[ |S_{a_{N+1}-1}(x_N)| - |S_{a_{N+1}-2}(x_N)| \big] \ge |z_N| (|x_N| - 1)^{a_{N+1}-1}, \]
which yields $|x_{N+1}| > |z_N|$. Taking logarithms in (8), one obtains
\[ \log|z_{k+1}| \ge \log|z_k| + a_{k+1} \log(|x_k| - 1). \]
Using $|z_{k+1}| > |z_k|$ and $|z_{k-1}| < |x_k|$ yields
\[ \log(|z_{k+1}| - 1) \ge \log(|z_k| - 1) + a_{k+1} \log(|z_{k-1}| - 1). \]
The sequence $\{\log(|z_k| - 1)\}_{k>N}$ therefore grows faster than the exponentially growing sequence $G_k$, which is defined in the following way:
\[ G_k = G_{k-1} + a_{k+N} G_{k-2}, \qquad G_0 = 1, \qquad G_{-1} = 1. \]
One has
\[ |x_{k+1}| \ge |z_k| \ge e^{cG_{k-N}} + 1 \qquad \forall\, k > N, \]
with $c = \log(1+\delta) > 0$ a fixed constant. This constant $c$ comes from the difference in the initial conditions between the sequence $\{G_k\}_k$ and the sequence $\{\log(|z_k| - 1)\}_{k>N}$.

This criterion motivates the following definition: set
\[ \sigma_{k,p} = \{ E \in \mathbb{R} : |t_{k,p}(E)| \le 2 \}. \]
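The escape criterion of Lemma 2 and the comparison sequence $G_k$ can be sketched in a few lines. The code below assumes the traces have already been computed and stored with the convention `xs[i]` $= x_{i-1}$, `zs[i]` $= z_i$; all function names are hypothetical. To obtain the shifted sequence used in the proof (with $a_{k+N}$), the caller would pass the correspondingly shifted list of partial quotients.

```python
import math


def growth_sequence(a, length):
    """Return [G_0, ..., G_length] with G_k = G_{k-1} + a_k G_{k-2}, G_0 = G_{-1} = 1.

    The list a is assumed to hold a_1, a_2, ... in a[0], a[1], ...
    """
    g_prev, g_cur = 1, 1  # G_{-1}, G_0
    out = [g_cur]
    for k in range(1, length + 1):
        g_prev, g_cur = g_cur, g_cur + a[k - 1] * g_prev
        out.append(g_cur)
    return out


def escape_index(xs, zs, delta):
    """Smallest N >= 0 with |x_{N-1}| <= 2+delta, |x_N| > 2+delta, |z_N| > 2+delta.

    xs[i] stores x_{i-1} and zs[i] stores z_i; returns None if no such N is found
    within the supplied data.
    """
    bound = 2 + delta
    for n in range(min(len(zs), len(xs) - 1)):
        if abs(xs[n]) <= bound and abs(xs[n + 1]) > bound and abs(zs[n]) > bound:
            return n
    return None


def lemma2_lower_bounds(a, delta, steps):
    """Values exp(c * G_j) + 1 for j = 0..steps, with c = log(1 + delta)."""
    c = math.log(1 + delta)
    return [math.exp(c * g) + 1 for g in growth_sequence(a, steps)]
```

Because `abs` works on complex numbers, the same check applies to energies off the real axis.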
Denote by $\beta_n = p_n/q_n$ the rational approximants of $\beta$. It is well known that the spectrum of the operator $H_{\beta_n}$, where $\beta_n$ replaces $\beta$ in the definition of $H_\beta$, coincides with the set $\sigma_{k,0}$. The sequence of operators $\{H_{\beta_n}\}$ is called the sequence of periodic approximants of $H_\beta$ and converges strongly to $H_\beta$. It is well known that the spectrum of $H_\beta$ is a Cantor set that can be approximated by the band spectra of the periodic approximants. The following proposition recalls this statement precisely ([29, 1, 32]):

Proposition 2. The sequence of spectra of periodic approximants of $H_\beta$ satisfies
(i) the set $\sigma_{k,p}$ is made of $p q_k + q_{k-1}$ distinct intervals,
(ii) $\sigma \subset \sigma_{k+1,0} \cup \sigma_{k,0}$ and $\sigma_{k,p+1} \subset \sigma_{k+1,0} \cup (\sigma_{k+1,0}^{c} \cap \sigma_{k,p})$, $\forall\, k \in \mathbb{N}$,
(iii) $\sigma_{k+1,0} \cap \sigma_{k,p} \cap \sigma_{k,p-1} = \emptyset$, $\forall\, V > 4$ and $\forall\, k \in \mathbb{N}$, $p \ge 0$.

We now recall an important result about the structure of the spectra of periodic approximants. It describes the way the intervals of $\sigma_{k,p}$ are included in $\sigma_{k-1,p}$. It requires some definitions:

Definition 1. For a given $k$, we call
• Type I gap: a band of $\sigma_{k,1}$ included in a band of $\sigma_{k,0}$ and therefore in a gap of $\sigma_{k+1,0}$,
• Type II band: a band of $\sigma_{k+1,0}$ included in a band of $\sigma_{k,-1}$ and in a gap of $\sigma_{k,0}$,
• Type III band: a band of $\sigma_{k+1,0}$ included in a band of $\sigma_{k,0}$ and in a gap of $\sigma_{k,1}$.

As proved in [28], these definitions exhaust all the possible configurations, as described by the following lemma.

Lemma 3 ([28]). At a given level $k$,
(i) a type I gap contains a unique type II band of $\sigma_{k+2,0}$;
(ii) a type II band contains $(a_{k+1}+1)$ bands of type I of $\sigma_{k+1,1}$, which alternate with $a_{k+1}$ type III bands of $\sigma_{k+2,0}$;
(iii) a type III band contains $a_{k+1}$ bands of type I of $\sigma_{k+1,1}$, which alternate with $(a_{k+1}-1)$ type III bands of $\sigma_{k+2,0}$.

As stated above, the spectrum of $H_{\beta_n}$ consists of a growing number of intervals of decreasing length as $n$ increases. We now recall a result obtained in [27] that allows one to control the length of the bands of $\sigma_{k,p}$ at any level $k$.

We again need some notation to summarize it. Let $\mathcal{A} = \{\mathrm{I}, \mathrm{II}, \mathrm{III}\}$ be an alphabet. To each band $B$ of the spectrum at level $k$ corresponds a unique word $i_0 i_1 \cdots i_k \in \mathcal{A}^{k+1}$ such that $B$ is a band of type $i_k$ included in a band of type $i_{k-1}$ at level $k-1$, ..., included in a band of type $i_0$ at level 0. This word will be called the index of $B$. More than one band can have the same index. Let $T_n = (t_{i,j}(n))_{3\times 3}$ be a sequence of matrices and $\tau = i_0 i_1 \cdots i_k$ an index; we define
\[ L_\tau(T) = t_{i_0,i_1}(1)\, t_{i_1,i_2}(2) \cdots t_{i_{k-1},i_k}(k). \]
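We can now recall the result in [27]:

Theorem 5 ([27]). If $\beta = [a_1, a_2, \ldots]$ is an irrational number in $[0,1]$ and $H_\beta$ is defined as above with $V > 20$, then any band $B$ of index $\tau$ satisfies
\[ 4 L_\tau(Q) \le |B| \le 4 L_\tau(P), \]
where $P = (P_n)_{n>0}$ and $Q = (Q_n)_{n>0}$ are given by
\[ P_n = \begin{pmatrix} 0 & c_1^{a_n-1} & 0 \\ c_1/a_n & 0 & c_1/a_n \\ c_1/a_n & 0 & c_1/a_n \end{pmatrix}, \qquad Q_n = \begin{pmatrix} 0 & c_2^{a_n-1} & 0 \\ c_2(a_n+2)^{-3} & 0 & c_2(a_n+2)^{-3} \\ c_2(a_n+2)^{-3} & 0 & c_2(a_n+2)^{-3} \end{pmatrix}, \]
with $c_1 = \frac{3}{V-8}$ and $c_2 = \frac{1}{V+5}$.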
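As an illustration of how Theorem 5 is used, the following sketch evaluates the two bounds $4L_\tau(Q)$ and $4L_\tau(P)$ for a given index word. It assumes the matrices are read with rows and columns ordered (I, II, III), with entries $c_1^{a_n-1}$ (I to II) and $c_1/a_n$ (II or III to I or III) for $P_n$, and the analogous entries with $c_2$ for $Q_n$; the function names and the dictionary encoding are choices made here, not notation from [27].

```python
def p_entries(a_n, V):
    """Nonzero entries of P_n, keyed by (parent type, child type)."""
    c1 = 3.0 / (V - 8.0)
    small = c1 / a_n
    return {("I", "II"): c1 ** (a_n - 1),
            ("II", "I"): small, ("II", "III"): small,
            ("III", "I"): small, ("III", "III"): small}


def q_entries(a_n, V):
    """Nonzero entries of Q_n, keyed by (parent type, child type)."""
    c2 = 1.0 / (V + 5.0)
    small = c2 * (a_n + 2) ** -3
    return {("I", "II"): c2 ** (a_n - 1),
            ("II", "I"): small, ("II", "III"): small,
            ("III", "I"): small, ("III", "III"): small}


def band_length_bounds(index, a, V):
    """Return (4 L_tau(Q), 4 L_tau(P)) for index = [i_0, ..., i_k].

    a[0], a[1], ... hold a_1, a_2, ...; transitions absent from the matrices
    contribute a factor 0, i.e. no band carries such an index.
    """
    if V <= 20:
        raise ValueError("Theorem 5 is stated for V > 20")
    lower = upper = 4.0
    for level in range(1, len(index)):
        step = (index[level - 1], index[level])
        a_n = a[level - 1]
        lower *= q_entries(a_n, V).get(step, 0.0)
        upper *= p_entries(a_n, V).get(step, 0.0)
    return lower, upper
```

For example, `band_length_bounds(["III", "I", "II"], [1, 1], 24)` multiplies the III to I entry at level 1 by the I to II entry at level 2.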
From now on, we define the spectra of the periodic approximants not only in $\mathbb{R}$ but in $\mathbb{C}$:
\[ \sigma_{k,0}^{\delta} = \{ z \in \mathbb{C} : |x_k(z)| \le 2 + \delta \}. \]
The statements of the preceding propositions remain true if one replaces $\sigma_{k,p}$ by $\sigma_{k,p}^{\delta}$ for some small enough fixed $\delta$. A condition on $V$ should be added to keep the invariant formula, $V > V_\delta = [16 + 24\delta + 9\delta^2 + 4]^{1/2}$ (see [10]). Since the invariant remains valid, all the structure of the set $\sigma_{k,0}^{\delta}$ remains the same. The proof is the very same, see [28, 24]. The following proposition states that, owing to the classical Koebe distortion theorem, the height of this set is almost the same as its length.

Proposition 3. If $k \ge 3$, $\delta > 0$ and $V > 20$, then there exist constants $c_\delta, d_\delta > 0$ such that
\[ \bigcup_{j=1}^{q_{k-1}} B(x_k^{(j)}, r_k) \subseteq \sigma_{k,0}^{\delta} \subseteq \bigcup_{j=1}^{q_{k-1}} B(x_k^{(j)}, R_k), \]
where $\{x_k^{(j)}\}_{1 \le j \le q_{k-1}}$ are the zeros of $x_k$, $r_k = c_\delta \inf_{\tau \in \mathcal{A}^k} L_\tau(Q)$ and $R_k = d_\delta \sup_{\tau \in \mathcal{A}^k} L_\tau(P)$.

Proof. The proof follows the same steps as in [10]. Let $C_j$ be a connected component of $\sigma_{k,0}^{2\delta}$. With $V > \max\{20, \lambda(2\delta)\}$, $C_j$ contains exactly one of the $q_{k-1}$ zeros of $x_k$, namely $x_k^{(j)}$. Moreover, $C_j$ contains one connected component of $\sigma_{k,0}^{\delta}$, denoted by $\tilde{C}_j$. To obtain the result, it suffices to show that
\[ B(x_k^{(j)}, r_k) \subseteq \tilde{C}_j \subseteq B(x_k^{(j)}, R_k). \tag{9} \]
As $x_k$ is a proper function (as a polynomial in $z$) and $C_j$ contains a unique zero, its degree is 1. Hence $x_k : \operatorname{int}(C_j) \to B(0, 2+2\delta)$ is univalent (as a proper function of degree one), and so $x_k^{-1} : B(0, 2+2\delta) \to \operatorname{int}(C_j)$ is well defined and univalent too. Consequently, the function
\[ F : B(0,1) \to \mathbb{C}, \qquad F(z) = \frac{x_k^{-1}((2+2\delta)z) - x_k^{(j)}}{(2+2\delta)\,(x_k^{-1})'(0)} \]
is univalent on $B(0,1)$. We have $F(0) = 0$ and $F'(0) = 1$. Applying the Koebe distortion theorem, we get
\[ \frac{|z|}{(1+|z|)^2} \le |F(z)| \le \frac{|z|}{(1-|z|)^2}, \qquad |z| < 1. \]
Evaluating this for $|z| = \frac{2+\delta}{2+2\delta}$, one has
\[ \frac{(2+\delta)(2+2\delta)}{(4+3\delta)^2} \le |F(z)| \le \frac{(2+\delta)(2+2\delta)}{\delta^2}. \]
By definition of $F$, this implies
\[ |x_k^{-1}((2+2\delta)z) - x_k^{(j)}| \le \frac{(2+\delta)(2+2\delta)^2}{\delta^2}\, |(x_k^{-1})'(0)|, \]
\[ |x_k^{-1}((2+2\delta)z) - x_k^{(j)}| \ge \frac{(2+\delta)(2+2\delta)^2}{(4+3\delta)^2}\, |(x_k^{-1})'(0)|. \]
And then, for $|z| = 2 + \delta$,
\[ |x_k^{-1}(z) - x_k^{(j)}| \le \frac{(2+\delta)(2+2\delta)^2}{\delta^2}\, |(x_k^{-1})'(0)|, \]
\[ |x_k^{-1}(z) - x_k^{(j)}| \ge \frac{(2+\delta)(2+2\delta)^2}{(4+3\delta)^2}\, |(x_k^{-1})'(0)|. \]
To conclude, it suffices to remark that, with $|(x_k^{-1})'(0)| = |x_k'(x_k^{(j)})|^{-1}$, one has
\[ r_k \le |(x_k^{-1})'(0)| \le R_k \]
up to the constants $c_\delta$, $d_\delta$, and that, with $|z| = 2+\delta$, $x_k^{-1}(z)$ runs through the entire boundary of $\tilde{C}_j$.
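The only arithmetic in the Koebe step is the evaluation of the two distortion bounds at $|z| = (2+\delta)/(2+2\delta)$. The short sketch below compares the general bounds $r/(1+r)^2$ and $r/(1-r)^2$ with the closed forms $(2+\delta)(2+2\delta)/(4+3\delta)^2$ and $(2+\delta)(2+2\delta)/\delta^2$; the function names are illustrative only.

```python
def koebe_bounds(r):
    """Koebe distortion bounds r/(1+r)^2 <= |F(z)| <= r/(1-r)^2 for |z| = r < 1."""
    if not 0 <= r < 1:
        raise ValueError("the distortion theorem is stated on the open unit disc")
    return r / (1 + r) ** 2, r / (1 - r) ** 2


def koebe_bounds_at_delta(delta):
    """Bounds evaluated at the radius r = (2 + delta) / (2 + 2 delta)."""
    return koebe_bounds((2 + delta) / (2 + 2 * delta))


def closed_form_bounds(delta):
    """Closed forms of the same two bounds in terms of delta."""
    numerator = (2 + delta) * (2 + 2 * delta)
    return numerator / (4 + 3 * delta) ** 2, numerator / delta ** 2
```

Note that the upper bound blows up like $\delta^{-2}$ as $\delta \to 0$, which is why $\delta$ has to stay fixed in the argument.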
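Proof of Theorem 1. We now have all the tools required to finish the proof of Theorem 1. As the $x_k^{(j)}$ are real, we have
\[ \sigma_{k,0}^{\delta} \subseteq \{ z \in \mathbb{C} : |\operatorname{Im} z| < R_k \} \subseteq \{ z \in \mathbb{C} : |\operatorname{Im} z| < d\, q_k^{-\gamma(V)} \}, \]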
for a suitable $\gamma(V)$. Together with Proposition 2, this implies
\[ \sigma_{k,0}^{\delta} \cup \sigma_{k,1}^{\delta} \subseteq \{ z \in \mathbb{C} : |\operatorname{Im} z| < d\, q_k^{-\gamma(V)} \}. \tag{10} \]
Let us be more precise about how to choose $\gamma(V)$. We need to bound all $R_k$ from above. $R_k$ is the supremum of products of $k$ entries of the matrices $P_n$. All the coefficients in $P_n$ are maximal for $a_n = 1$. The worst possible case happens when the index history of a band contains a type I band containing a band of type II; in that case the coefficient can be trivial, equal to 1 (if $a_n = 1$). But because of the combinatorial behavior of bands described by Lemma 3, this situation cannot occur more than half of the time. Consequently,
\[ R_k \le c_1^{k/2}. \]
We should have $R_k < d\, q_k^{-\gamma(V)}$, so a suitable $\gamma$ can be chosen by taking
\[ \gamma(V) \le \limsup_k\, -\frac{k \log c_1}{2 \log q_k}. \]
For $\varepsilon = \operatorname{Im} z > 0$, we get a uniform lower bound for $|x_n(E + i\varepsilon)|$ with $E \in [-K, K] \subset \mathbb{R}$. For a fixed $\varepsilon > 0$, we choose $k$ such that $d\, q_k^{-\gamma(V)} < \varepsilon$. With (10), this shows $|x_k(E + i\varepsilon)| > 2 + \delta$ and $|z_k(E + i\varepsilon)| > 2 + \delta$. As $|x_{-1}(E + i\varepsilon)| = 2 \le 2 + \delta$, we are in the situation of Lemma 2 and we have the bound
\[ |x_j| \ge e^{\log(1+\delta)\, G_{j-k}} + 1, \qquad \forall\, j > k. \tag{11} \]
All this motivates the following definitions: for $\delta > 0$ and $T > 1$, denote by $k(T)$ the unique integer with
\[ \frac{q_{k(T)-1}^{\gamma(V)}}{d_\delta} \le T \le \frac{q_{k(T)}^{\gamma(V)}}{d_\delta} \]
and let $N(T) = q_{k(T)+\sqrt{k(T)}}$. It is then easy to see that, for $T$ large enough and for every $\nu > 0$, there is a constant $C_\nu > 0$ such that
\[ N(T) \le C_\nu\, T^{\frac{1}{\gamma(V)}}\, T^{\nu}. \]
Let us give an explicit argument for this statement:
\[ \frac{\log N(T)}{\log T} = \frac{\log q_{k(T)+\sqrt{k(T)}}}{k(T)+\sqrt{k(T)}} \cdot \frac{k(T)+\sqrt{k(T)}}{\log T} \le \frac{\log q_{k(T)+\sqrt{k(T)}}}{k(T)+\sqrt{k(T)}} \cdot \frac{k(T)+\sqrt{k(T)}}{\frac{-k(T)+1}{2} \log c_1} \le 2D\, \frac{k(T)+\sqrt{k(T)}}{(-k(T)+1) \log c_1}. \tag{12} \]
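For $k(T)$ large enough, the last expression is close to $\frac{1}{\gamma(V)} = \frac{2D}{-\log c_1}$. So for $T$ large enough, one gets
\[ N(T) \le C_\nu\, T^{\frac{2D}{-\log c_1}}\, T^{\nu} \]
with $\nu$ arbitrarily small. Applying (11) to Theorem 4, we get
\[ P_r(N(T), T) \lesssim \exp(-cN(T)) + T^3 \int_{-K}^{K} \Big( \max_{1 \le q_n \le N(T)} \big\| M_n\big(E + \tfrac{i}{T}\big) \big\|^2 \Big)^{-1} dE \lesssim \exp(-cN(T)) + T^3 e^{-2\log(1+\delta)\, G_{\sqrt{k(T)}}}. \]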
From this bound, it is clear that $P_r(N(T), T)$ goes to zero faster than any inverse power of $T$, since the sequence $G$ has exponential growth. One gets the same bound for $P_l(N(T), T)$ because of the symmetry of the potential. Finally, one can conclude with (12) that $\alpha_u^+ \le \alpha$ with
\[ \alpha = \frac{1}{\gamma(V)} + \nu \]
and $\nu$ arbitrarily small. For the second part of the theorem, notice that the constant 2 comes from the choice of $\gamma(V)$, which takes into account the worst coefficient in the matrix $P_n$. But assuming that no 1 appears in the continued fraction expansion, one gets $R_k \le c_1^{k}$ and
\[ \gamma(V) \le \limsup_k\, -\frac{k \log c_1}{\log q_k}. \]
4. A Pathological Counterexample

The statements above hold if $D < +\infty$. In the case $D = +\infty$, we exhibit a counterexample in the next statement. It is still an open question whether $D = +\infty$ implies ballistic motion.

Theorem 6. There exists an irrational number $\omega$ with $D = +\infty$ such that, for any $V > 20$, $\alpha_u^+ = 1$.

The proof, made by induction, follows the lines of the pathological example in [25]. The main idea is that, when one chooses an irrational number close to rational numbers (with large values for the sequence $\{a_k\}_k$), the potentials of $H_\beta$ and $H_{\beta_n}$ coincide on a large scale, large enough to ensure that $H_\beta$ and $H_{\beta_n}$ have the same dynamical behavior over long times. It is well known that the periodic operator $H_{\beta_n}$ has ballistic motion.
We now make these ideas more precise and first prove the following lemma. Let $\beta_n = [a_1, \ldots, a_n]$ be fixed and let $\beta$ be any irrational number satisfying $\beta = [a_1, \ldots, a_n, \ldots]$.

Lemma 4. The Sturmian potentials of the operators $H_\beta$ and $H_{\beta_n}$ have the same first $q_{n+1}$ values.

Proof. To prove this, we recall the iterative construction of the Sturmian words that coincide with our potential. For details and proofs, see, e.g., [26]. Set $W_0 = 0$ and $W_1 = 0^{a_1-1}V$ and define the sequence of Sturmian words by
\[ W_{k+1} = W_k^{a_{k+1}} W_{k-1}, \qquad k \ge 1. \]
Each word $W_k$ has length $q_k$. As $H_\beta$ and $H_{\beta_n}$ have the same first $n$ terms in their continued fraction expansions, the words $W_0, W_1, \ldots, W_n$ are the same for $H_\beta$ and $H_{\beta_n}$. For $H_{\beta_n}$, the limit word $W_\infty$ is periodic with period $q_n$ and repeats the word $W_n$ endlessly. As $W_n = W_{n-1}^{a_n} W_{n-2}$, one has
\[ W_n^{\infty} = W_n^{a_{n+1}}\, W_{n-1}^{a_n} W_{n-2}\, W_n^{\infty}. \]
This shows that the potential of $H_{\omega_n}$ begins with the word $W_n^{a_{n+1}} W_{n-1}$, which is the word $W_{n+1}$ for $H_\omega$. As $W_{n+1}$ has length $q_{n+1}$, this ends the proof.

We need another lemma, which can be found in [25]. It states that two operators have close dynamics (on some time scale $T$) if their potentials are close enough. We make this idea more precise by recalling this lemma:

Lemma 5. Let $H_1 = \Delta + V_1$ and $H_2 = \Delta + V_2$ act on $\ell^2(\mathbb{Z})$, with $|V_1(k)|, |V_2(k)| < C$ for all $k \in \mathbb{Z}$ and some constant $C$. Let $T > 0$ and $\varepsilon > 0$ be fixed constants. Then there exist $L(T, \varepsilon), \delta > 0$ such that, if $|V_1(k) - V_2(k)| < \delta$ for all $|k| < L$, then
\[ \big| \langle |X|^2_{H_1} \rangle_T - \langle |X|^2_{H_2} \rangle_T \big| < \varepsilon. \]
We now get back to the construction.

Proof of Theorem 6. As $H_{\omega_n}$ is an operator with periodic potential, one has $\langle |X|^2_{H_{\omega_n}} \rangle_T > C_n T^2$; choose $T_n$ big enough such that
\[ C_n > \frac{1}{\log T_n}. \]
One can then choose $a_{n+1}$ such that $L(T_n, 1) \le q_{n+1}$.
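Inductively, we have a sequence $T_n$ going to infinity and an irrational number $\omega$ with
\[ \langle |X|^2 \rangle_{T_n} > \frac{T_n^2}{\log T_n} - 1 > T_n^{2-\varepsilon}, \qquad \forall\, \varepsilon > 0. \]
Now, since $\omega$ is fully constructed, one can compare $H_\omega$ with $H_{\omega_n}$. Then Lemma 5 implies
\[ \langle |X|^2_{H_\omega} \rangle_{T_n} > \frac{T_n^2}{\log T_n} - 1, \tag{13} \]
which yields
\[ \alpha_u^+ \ge \beta_{\delta_1}^{-}(2) > 1 - \varepsilon, \qquad \forall\, \varepsilon > 0. \]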
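Lemma 4, on which the construction above rests, can be checked directly on small examples. The sketch below builds the words $W_k$ from the recursion $W_0 = 0$, $W_1 = 0^{a_1-1}V$, $W_{k+1} = W_k^{a_{k+1}} W_{k-1}$, using the characters `"0"` and `"V"` as letters, and compares the periodic word $W_n^{\infty}$ with $W_{n+1}$ on the first $q_{n+1}$ letters. The function names are hypothetical, and the comparison is only claimed here for $n \ge 2$, where $W_n$ begins with $W_{n-1}$.

```python
def sturmian_words(a, zero="0", high="V"):
    """Return [W_0, ..., W_len(a)] for partial quotients a[0] = a_1, a[1] = a_2, ..."""
    words = [zero, zero * (a[0] - 1) + high]  # W_0 and W_1 = 0^{a_1 - 1} V
    for k in range(1, len(a)):
        # W_{k+1} = W_k^{a_{k+1}} W_{k-1}, with a_{k+1} stored in a[k]
        words.append(words[k] * a[k] + words[k - 1])
    return words


def denominators(a):
    """Return [q_0, ..., q_len(a)] with q_k = a_k q_{k-1} + q_{k-2}, q_0 = 1, q_{-1} = 0."""
    q_prev, q_cur = 0, 1
    out = [q_cur]
    for a_k in a:
        q_prev, q_cur = q_cur, a_k * q_cur + q_prev
        out.append(q_cur)
    return out


def agree_on_first_block(a, n):
    """True if W_n repeated periodically matches W_{n+1} on its first q_{n+1} letters."""
    words = sturmian_words(a)
    target = words[n + 1]
    repeats = len(target) // len(words[n]) + 1
    periodic_prefix = (words[n] * repeats)[: len(target)]
    return periodic_prefix == target
```

A larger $a_{n+1}$ makes $W_{n+1}$, and hence the window on which the two potentials agree, longer; this is the freedom used in the proof of Theorem 6.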
5. Lower Bound for the Box Counting Dimension of the Spectrum

We now give a lower bound for the fractal box counting dimension of the spectrum of the operator $H_\beta$. We first recall the definition. If one denotes by $N(\varepsilon)$ the number of balls of diameter at most $\varepsilon$ needed to cover $\sigma$, then the upper box counting dimension is defined by
\[ \dim_B^+ = \limsup_{\varepsilon \to 0} \frac{\log N(\varepsilon)}{-\log \varepsilon}. \]
The spectrum is approached by the band spectra of the periodic operators $H_{\beta_n}$. Moreover, [28, 27] provide precise information on the number of bands and their lengths. This allows us to give a lower bound on the minimal number of sets of some decreasing scale needed to cover the spectrum, and then a lower bound for the box dimension of the limit set.

A first idea to cover the spectrum would be to take into account all the bands and to take the smallest length as the scale, but this fails because this minimal length decreases faster than the number of intervals grows. A second idea is to count the number of bands that have the maximal length, in terms of inverse powers of $V$. This yields a better lower bound for the box dimension of the spectrum for almost every irrational number.

For a fixed irrational number, one can improve this method by counting precisely the number of bands that have a particular length. This has been done for the Fibonacci number in [12], where the full fractal spectrum has been investigated. The length of a band depends on its history, in that case on the number of I's in the index history. Hence, one obtains in this way all the contributions at any scale to the box dimension. It is shown that their result is optimal as $V$ increases, and one has for $\beta = [0, 1, 1, \ldots]$
\[ \dim_B(\sigma(H_\beta)) \approx \frac{\log(1+\sqrt{2})}{\log V}. \]
Another example, simpler than the golden mean, is the silver ratio. Fix $\beta = [0, 2, 2, \ldots]$; then all the bands have the same length up to a constant independent of $V$. Namely,
all bands at level $k$ have length $c_k V^{-k}$, where $c_k$ is a constant depending on the history of the band but not on $V$. At a given rank $k$, the number of bands of length $c_k V^{-k}$ needed to cover the spectrum is bounded from below by $q_k$. This implies that
\[ \dim_B(\sigma(H_\beta)) \ge -\liminf_k \frac{\log q_k}{\log c_k V^{-k}} \approx \frac{\log(1+\sqrt{2})}{\log V}. \]
It is easy to show the reverse inequality by direct computation, and hence we obtain the same estimate in this case:
\[ \dim_B(\sigma(H_\beta)) \approx \frac{\log(1+\sqrt{2})}{\log V}. \]
It is quite astonishing that both the golden mean and the silver ratio yield the same fractal dimension estimate.

Going back to the general case, we apply the same method as for the silver mean, that is, we count the number of bands at level $k$ that have length equal to $c_k V^{-k}$. We obtain:

Theorem 7. Set $C_k = \frac{3}{k} \sum_{j=1}^{k} \log(a_j + 2)$. For any irrational number $\beta$ satisfying $C = \limsup C_k < +\infty$ and any $V > 20$, we have
\[ \dim_B^+(\sigma) \ge \frac{1}{2}\, \frac{\log 2}{C + \log(V+5)}, \tag{14} \]
where $\sigma$ is the spectrum of $H_\beta$.

Remark 3. As in Lemma 1, $C$ is finite for a set of full Lebesgue measure.

The following lemma gives a precise statement of the counting idea.

Lemma 6. Denote by $n_{k,\mathrm{I}}$, $n_{k,\mathrm{II}}$ and $n_{k,\mathrm{III}}$ the numbers of bands of type I, II and III in $\sigma_{k,1}$, $\sigma_{k+1,0}$ and $\sigma_{k+1,0}$, respectively, with length greater than $\varepsilon_k = 4 \prod_{j=1}^{k} (V+5)^{-1} (a_j+2)^{-3}$. For all $k$, we have the following induction relations:
\[ n_{k+1,\mathrm{I}} = (a_{k+1}+1)\, n_{k,\mathrm{II}} + a_{k+1}\, n_{k,\mathrm{III}}, \]
\[ n_{k+1,\mathrm{II}} = \mathbf{1}_{\{a_{k+1} \le 2\}}\, n_{k,\mathrm{I}}, \]
\[ n_{k+1,\mathrm{III}} = a_{k+1}\, n_{k,\mathrm{II}} + (a_{k+1}-1)\, n_{k,\mathrm{III}}. \]
Here, the initial conditions are $n_{0,\mathrm{I}} = 1$, $n_{0,\mathrm{II}} = 0$, $n_{0,\mathrm{III}} = 1$. Moreover, these three sequences satisfy the following properties:
\[ n_{k,\mathrm{II}} \ne 0 \ \lor\ n_{k,\mathrm{III}} \ne 0, \qquad n_{k,\mathrm{I}} \ne 0, \qquad n_{k,\mathrm{I}} > n_{k,\mathrm{III}}, \]
and
\[ n_{k,\mathrm{II}} + n_{k,\mathrm{III}} > 2^{k/2}. \]
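The induction relations of Lemma 6 are easy to iterate. The sketch below reads them as $n_{k+1,\mathrm{I}} = (a_{k+1}+1)n_{k,\mathrm{II}} + a_{k+1}n_{k,\mathrm{III}}$, $n_{k+1,\mathrm{II}} = \mathbf{1}_{\{a_{k+1}\le 2\}} n_{k,\mathrm{I}}$, $n_{k+1,\mathrm{III}} = a_{k+1}n_{k,\mathrm{II}} + (a_{k+1}-1)n_{k,\mathrm{III}}$ with initial counts $(1, 0, 1)$; the function names are illustrative, and the doubling check mirrors the inequality $n_{k,\mathrm{II}} + n_{k,\mathrm{III}} \ge 2(n_{k-2,\mathrm{II}} + n_{k-2,\mathrm{III}})$ established in the proof.

```python
def band_counts(a):
    """Iterate the Lemma 6 recursion; returns [(n_I, n_II, n_III)] for levels 0..len(a).

    The list a is assumed to hold a_1, a_2, ... in a[0], a[1], ...
    """
    counts = [(1, 0, 1)]  # level 0
    for a_next in a:  # a_next = a_{k+1}
        n1, n2, n3 = counts[-1]
        counts.append((
            (a_next + 1) * n2 + a_next * n3,   # type I
            n1 if a_next <= 2 else 0,          # type II, kept only when a_{k+1} <= 2
            a_next * n2 + (a_next - 1) * n3,   # type III
        ))
    return counts


def long_band_totals(a):
    """n_{k,II} + n_{k,III} for k = 0..len(a)."""
    return [n2 + n3 for _, n2, n3 in band_counts(a)]


def doubles_every_two_levels(a):
    """Check n_k >= 2 n_{k-2} for the totals n_k = n_{k,II} + n_{k,III}."""
    totals = long_band_totals(a)
    return all(totals[k] >= 2 * totals[k - 2] for k in range(2, len(totals)))
```

In the Fibonacci case `a = [1, 1, 1, 1]` the indicator is always 1, and the totals follow the Fibonacci numbers.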
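Proof. The induction relations follow from (5). The first two properties are proved by induction. The initial conditions give level 0. Assume they hold at level $k$; then, as $a_{k+1} > 0$, $n_{k,\mathrm{II}} \ne 0 \lor n_{k,\mathrm{III}} \ne 0$ implies $n_{k+1,\mathrm{I}} \ne 0$. For the second part, if $a_{k+1} \le 2$ then $n_{k+1,\mathrm{II}} \ne 0$; otherwise $a_{k+1} > 2$ implies $n_{k+1,\mathrm{III}} \ne 0$. To prove $n_{k,\mathrm{I}} > n_{k,\mathrm{III}}$, it suffices to see that $n_{k,\mathrm{I}} = n_{k,\mathrm{III}} + n_{k-1,\mathrm{II}} + n_{k-1,\mathrm{III}}$. For the last property, it suffices to show that $n_{k,\mathrm{II}} + n_{k,\mathrm{III}} \ge 2(n_{k-2,\mathrm{II}} + n_{k-2,\mathrm{III}})$. Using the induction relations, we get
\[ n_{k,\mathrm{II}} = \big[ (a_{k-1}+1)\, n_{k-2,\mathrm{II}} + a_{k-1}\, n_{k-2,\mathrm{III}} \big] \mathbf{1}_{\{a_k \le 2\}}, \]
\[ n_{k,\mathrm{III}} = (a_k-1)\big( a_{k-1}\, n_{k-2,\mathrm{II}} + (a_{k-1}-1)\, n_{k-2,\mathrm{III}} \big) + a_k\, n_{k-2,\mathrm{I}}\, \mathbf{1}_{\{a_{k-1} \le 2\}}. \]
We distinguish four cases depending on the values of $a_k$ and $a_{k-1}$.
• If $a_k > 2$ and $a_{k-1} > 2$, then we simply get
\[ n_{k,\mathrm{II}} + n_{k,\mathrm{III}} = (a_k-1)\big( a_{k-1}\, n_{k-2,\mathrm{II}} + (a_{k-1}-1)\, n_{k-2,\mathrm{III}} \big) \ge (a_k-1)(a_{k-1}-1)(n_{k-2,\mathrm{II}} + n_{k-2,\mathrm{III}}) \ge 4(n_{k-2,\mathrm{II}} + n_{k-2,\mathrm{III}}). \]
• If $a_k \le 2$ and $a_{k-1} > 2$, then one has
\[ n_{k,\mathrm{II}} + n_{k,\mathrm{III}} = (a_k-1)\big( a_{k-1}\, n_{k-2,\mathrm{II}} + (a_{k-1}-1)\, n_{k-2,\mathrm{III}} \big) + (a_{k-1}+1)\, n_{k-2,\mathrm{II}} + a_{k-1}\, n_{k-2,\mathrm{III}} \ge a_k a_{k-1} (n_{k-2,\mathrm{II}} + n_{k-2,\mathrm{III}}) \ge 3(n_{k-2,\mathrm{II}} + n_{k-2,\mathrm{III}}). \]
• If $a_k > 2$ and $a_{k-1} \le 2$, then one has
\[ n_{k,\mathrm{II}} + n_{k,\mathrm{III}} = (a_k-1)\big( a_{k-1}\, n_{k-2,\mathrm{II}} + (a_{k-1}-1)\, n_{k-2,\mathrm{III}} \big) + a_k\, n_{k-2,\mathrm{I}} \ge (a_k-1)\big( a_{k-1}\, n_{k-2,\mathrm{II}} + (a_{k-1}-1)\, n_{k-2,\mathrm{III}} \big) + a_k\, n_{k-2,\mathrm{III}} \ge (a_k-1) a_{k-1} (n_{k-2,\mathrm{II}} + n_{k-2,\mathrm{III}}) \ge 2(n_{k-2,\mathrm{II}} + n_{k-2,\mathrm{III}}). \]
• If $a_k \le 2$ and $a_{k-1} \le 2$, then one gets
\[ n_{k,\mathrm{II}} + n_{k,\mathrm{III}} = (a_k-1)\big( a_{k-1}\, n_{k-2,\mathrm{II}} + (a_{k-1}-1)\, n_{k-2,\mathrm{III}} \big) + a_k\, n_{k-2,\mathrm{I}} + (a_{k-1}+1)\, n_{k-2,\mathrm{II}} + a_{k-1}\, n_{k-2,\mathrm{III}}. \]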
And one obtains
\[ n_{k,\mathrm{II}} + n_{k,\mathrm{III}} \ge \big( (a_k-1) a_{k-1} + a_{k-1} + 1 \big) n_{k-2,\mathrm{II}} + \big( (a_k-1)(a_{k-1}-1) + a_{k-1} + a_k \big) n_{k-2,\mathrm{III}} \ge (a_k a_{k-1} + 1)\, n_{k-2,\mathrm{II}} + (a_{k-1} + a_k)\, n_{k-2,\mathrm{III}} \ge 2(n_{k-2,\mathrm{II}} + n_{k-2,\mathrm{III}}). \]

Proof of Theorem 7. With the previous lemma, we find a bound for $n_{k,\mathrm{II}} + n_{k,\mathrm{III}}$, that is, the number of bands of length at least $\varepsilon_k$. To make sure we have a disjoint cover, we consider only half of the bands. Each counted band is then separated from the next by a band that we do not count. Then, by definition of the box dimension, we have
\[ \dim_B^+(\sigma) \ge \liminf_k \frac{\log \tfrac{1}{2}(n_{k,\mathrm{II}} + n_{k,\mathrm{III}})}{-\log \varepsilon_k}, \]
and the stated result follows.

Remark 4. The former bound for the box dimension provided in [27] was
\[ \dim_B^+(\sigma) \ge \dim_H(\sigma) \ge \max\left\{ \frac{\log 2}{10 \log 2 - 3 \log t_2},\ \frac{\log M - \log 3}{\log M - \log (t_2/3)} \right\}, \]
where $M = \liminf_{k\to\infty} (a_1 a_2 \cdots a_k)^{1/k}$ and $t_2 = \frac{1}{4(V+8)}$. For almost all irrational numbers, that is, with $M$ equal to the Khinchin constant (2.685...), our bound is better than the one above for any $V > 20$. On the other hand, for any fixed $V$, there is no improvement for some specific numbers. Fixing $\beta = [0, c, c, \ldots]$, the bound above goes to 1 and (14) goes to 0 as $c$ goes to infinity.

A lower bound for the box dimension can be relevant to obtain a bound for the lower dynamical exponent $\alpha_u^-$.

Definition 2. An irrational number is said to be a bounded density irrational number if it fulfills the following condition:
\[ \limsup_n \frac{1}{n} \sum_{i=1}^{n} a_i < +\infty. \]

Theorem 8. For any bounded density irrational number, we have
\[ \alpha_u^- \ge \frac{1}{2}\, \frac{\log 2}{C + \log(V+5)} \]
with $C = \limsup_k \frac{3}{k} \sum_{j=1}^{k} \log(a_j + 2)$.
Proof. It is shown in [30, 12] that if the norms of the transfer matrices are polynomially bounded on the spectrum, then we have $\alpha_u^- \ge \dim_B^+(\sigma)$. This property of the norms of the transfer matrices is shown for irrational numbers with bounded density in [18].
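To get a feeling for the size of the lower bound in Theorems 7 and 8, one can evaluate it for a finite list of partial quotients. The sketch below assumes the bound is read as $\frac{1}{2}\log 2 / (C + \log(V+5))$ with $C_k = \frac{3}{k}\sum_{j \le k}\log(a_j+2)$, and uses the finite average $C_k$ as a stand-in for the limit superior $C$; both function names are hypothetical.

```python
import math


def c_average(a):
    """C_k = (3 / k) * sum_{j=1}^{k} log(a_j + 2) for the finite list a = [a_1, ..., a_k]."""
    if not a:
        raise ValueError("at least one partial quotient is required")
    return 3.0 * sum(math.log(a_j + 2) for a_j in a) / len(a)


def dimension_lower_bound(C, V):
    """Value of (1/2) * log 2 / (C + log(V + 5)), stated for V > 20."""
    if V <= 20:
        raise ValueError("the bound is stated for V > 20")
    return 0.5 * math.log(2) / (C + math.log(V + 5))
```

For the golden mean (all $a_j = 1$) one has $C = 3\log 3$, and `dimension_lower_bound(c_average([1] * 10), 25)` gives the corresponding value; the bound decreases as either $V$ or the partial quotients grow.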
Acknowledgments

It is a pleasure to thank Dominique Vieugué for useful conversations about number theory.
References

[1] J. Bellissard, B. Iochum, E. Scoppola and D. Testard, Spectral properties of one dimensional quasi-crystals, Comm. Math. Phys. 125 (1989) 527–543.
[2] J.-M. Barbaroux, F. Germinet and S. Tcheremchantsev, Fractal dimensions and the phenomenon of intermittency in quantum dynamics, Duke Math. J. 110 (2001) 161–193.
[3] J. M. Combes, Connections between quantum dynamics and spectral properties of time-evolution operators, in Differential Equations with Applications to Mathematical Physics (Academic Press, Boston, 1993), pp. 59–68.
[4] D. Damanik, Dynamical upper bounds for one-dimensional quasicrystals, J. Math. Anal. Appl. 303 (2005) 327–341.
[5] D. Damanik, α-continuity properties of one-dimensional quasicrystals, Comm. Math. Phys. 192 (1998) 169–182.
[6] D. Damanik, R. Killip and D. Lenz, Uniform spectral properties of one-dimensional quasicrystals. III. α-continuity, Comm. Math. Phys. 212 (2000) 191–204.
[7] D. Damanik, D. Lenz and G. Stolz, Lower transport bounds for one-dimensional continuum Schrödinger operators, Math. Ann. 336 (2006) 361–389.
[8] D. Damanik, A. Sütő and S. Tcheremchantsev, Power-law bounds on transfer matrices and quantum dynamics in one dimension II, J. Funct. Anal. 216 (2004) 362–387.
[9] D. Damanik and S. Tcheremchantsev, Power-law bounds on transfer matrices and quantum dynamics in one dimension, Comm. Math. Phys. 236 (2003) 513–534.
[10] D. Damanik and S. Tcheremchantsev, Upper bounds in quantum dynamics, J. Amer. Math. Soc. 20 (2007) 799–827.
[11] D. Damanik and S. Tcheremchantsev, Scaling estimates for solutions and dynamical lower bounds on wavepacket spreading, J. Anal. Math. 97 (2005) 103–131.
[12] D. Damanik, M. Embree, A. Gorodetski and S. Tcheremchantsev, The fractal dimension of the spectrum of the Fibonacci Hamiltonian, Comm. Math. Phys. 280 (2008) 499–516.
[13] F. Germinet, A. Kiselev and S. Tcheremchantsev, Transfer matrices and transport for Schrödinger operators, Ann. Inst. Fourier (Grenoble) 54 (2004) 787–830.
[14] I. Guarneri, Spectral properties of quantum diffusion on discrete lattices, Europhys. Lett. 10 (1989) 95–100.
[15] I. Guarneri, On an estimate concerning quantum diffusion in the presence of a fractal spectrum, Europhys. Lett. 21 (1993) 729–733.
[16] I. Guarneri and H. Schulz-Baldes, Lower bounds on wave packet propagation by packing dimensions of spectral measures, Math. Phys. Electron. J. 5 (1999), Paper 1, 16 pp.
[17] I. Guarneri and H. Schulz-Baldes, Intermittent lower bounds on quantum diffusion, Lett. Math. Phys. 49 (1999) 317–324.
[18] B. Iochum, L. Raymond and D. Testard, Resistance of one-dimensional quasicrystals, Phys. A 187 (1992) 353–368.
[19] S. Jitomirskaya and Y. Last, Power-law subordinacy and singular spectra. I. Half-line operators, Acta Math. 183 (1999) 171–189.
[20] S. Jitomirskaya and Y. Last, Power-law subordinacy and singular spectra. II. Line operators, Comm. Math. Phys. 211 (2000) 643–658.
[21] S. Jitomirskaya, H. Schulz-Baldes and G. Stolz, Delocalization in random polymer models, Comm. Math. Phys. 233 (2003) 27–48.
[22] A. Ya. Khinchin, Continued Fractions (University of Chicago Press, 1964).
[23] A. Kiselev and Y. Last, Solutions, spectrum and dynamics for Schrödinger operators on infinite domains, Duke Math. J. 102 (2000) 125–150.
[24] R. Killip, A. Kiselev and Y. Last, Dynamical upper bounds on wavepacket spreading, Amer. J. Math. 125 (2003) 1165–1198.
[25] Y. Last, Quantum dynamics and decompositions of singular continuous spectra, J. Funct. Anal. 142 (1996) 406–445.
[26] M. Lothaire, Algebraic Combinatorics on Words (Cambridge Univ. Press, 2002), Chap. 2, pp. 40–97.
[27] Q. Liu and Z. Wen, Hausdorff dimension of spectrum of one-dimensional Schrödinger operator with Sturmian potentials, Potential Anal. 20 (2004) 33–59.
[28] L. Raymond, A constructive gap labelling for the discrete Schrödinger operator on a quasiperiodic chain, preprint (1997).
[29] A. Sütő, The spectrum of a quasiperiodic Schrödinger operator, Comm. Math. Phys. 111 (1987) 409–415.
[30] S. Tcheremchantsev, Mixed lower bound in quantum transport, J. Funct. Anal. 197 (2003) 247–282.
[31] S. Tcheremchantsev, Dynamical analysis of Schrödinger operators with growing sparse potentials, Comm. Math. Phys. 253 (2005) 221–252.
[32] G. Teschl, Jacobi Operators and Completely Integrable Nonlinear Lattices, Mathematical Surveys and Monographs, Vol. 72 (Amer. Math. Soc., 2000).
September 14, J070-S0129055X10004107
2010 13:29 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 22, No. 8 (2010) 881–961 © World Scientific Publishing Company DOI: 10.1142/S0129055X10004107
ASYMPTOTICS FOR FERMI CURVES: SMALL MAGNETIC POTENTIAL
GUSTAVO DE OLIVEIRA
Department of Mathematics, University of British Columbia, Canada
[email protected]

Received 9 March 2010

We consider complex Fermi curves of electric and magnetic periodic fields. These are analytic curves in $\mathbb{C}^2$ that arise from the study of the eigenvalue problem for periodic Schrödinger operators. We characterize a certain class of these curves in the region of $\mathbb{C}^2$ where at least one of the coordinates has "large" imaginary part. The new results in this work extend previous results in the absence of magnetic field to the case of "small" magnetic field. Our theorems can be used to show that generically these Fermi curves belong to a class of Riemann surfaces of infinite genus.

Keywords: Fermi curves; Bloch variety; Fermi surfaces; periodic Schrödinger operators.

Mathematics Subject Classification 2010: 47B99, 81Q99, 14H55
1. Introduction

In [1], the authors introduced a class of Riemann surfaces of infinite genus that are "asymptotic to" a finite number of complex lines joined by infinitely many handles. These surfaces are constructed by pasting together a compact submanifold of finite genus, plane domains, and handles. All these components satisfy a number of geometric/analytic hypotheses stated in [1] that specify the asymptotic holomorphic structure of the surface. The class of surfaces obtained in this way yields an extension of the classical theory of compact Riemann surfaces that has analogues of many theorems of the classical theory. It was proven in [1] that this new class includes quite general hyperelliptic surfaces, heat curves (which are spectral curves associated to a certain "heat-equation"), and Fermi curves with zero magnetic potential. In order to verify the geometric/analytic hypotheses for the latter, the authors proved two "asymptotic" theorems similar to the ones we prove below. This is the main step needed to verify these hypotheses. In this work, we extend their results to Fermi curves with "small" magnetic potential.

There are two immediate applications of our results. First, as we have already mentioned, one can use our theorems for verifying the geometric/analytic hypotheses of [1] for Fermi curves with small magnetic potential. This would show that
these curves belong to the class of Riemann surfaces mentioned above. Secondly, one can prove that a class of these curves are irreducible (in the usual algebraic-geometrical sense). Both these applications were done in [1] for Fermi curves with zero magnetic potential.

Complex Fermi curves (and other similar spectral curves) have been studied, from different perspectives, in the absence of magnetic field [1–5], and in the presence of magnetic field [6]. Some results on the real Fermi curve in the high-energy region were obtained in [7]. There one also finds a short description of the existing results on periodic magnetic Schrödinger operators. An even more general review is presented in [8]. To our knowledge, our work provides new results on complex Fermi curves with magnetic field. At this moment, we are only able to handle the case of "small" magnetic potential. The asymptotic characterization of Fermi curves with arbitrarily large magnetic potential remains an open problem.

In order to prove our theorems, we follow the same strategy as [1]. The presence of magnetic field makes the analysis considerably harder and requires new estimates. As was pointed out in [7, 8], the study of an operator with magnetic potential is essentially more complicated than the study of the operator with just an electric potential. This seems to be the case in this problem as well.

Before we outline our results let us introduce some definitions. Let $\Gamma$ be a lattice in $\mathbb{R}^2$ and let $A_1$, $A_2$ and $V$ be real-valued functions in $L^2(\mathbb{R}^2)$ that are periodic with respect to $\Gamma$. Set $A := (A_1, A_2)$ and define the operator
\[ H(A, V) := (i\nabla + A)^2 + V \]
acting on $L^2(\mathbb{R}^2)$, where $\nabla$ is the gradient operator in $\mathbb{R}^2$. For $k \in \mathbb{R}^2$ consider the following eigenvalue–eigenvector problem in $L^2(\mathbb{R}^2)$ with boundary conditions,
\[ H(A, V)\varphi = \lambda\varphi, \qquad \varphi(x + \gamma) = e^{ik\cdot\gamma}\varphi(x) \]
for all $x \in \mathbb{R}^2$ and all $\gamma \in \Gamma$. Under suitable hypotheses on the potentials $A$ and $V$ this problem is self-adjoint and its spectrum is discrete. It consists of a sequence of real eigenvalues
\[ E_1(k, A, V) \le E_2(k, A, V) \le \cdots \le E_n(k, A, V) \le \cdots. \]
For each integer $n \ge 1$, the eigenvalue $E_n(k, A, V)$ defines a continuous function of $k$. From the above boundary condition, it is easy to see that this function is periodic with respect to the dual lattice
\[ \Gamma^{\#} := \{ b \in \mathbb{R}^2 \mid b \cdot \gamma \in 2\pi\mathbb{Z} \text{ for all } \gamma \in \Gamma \}, \]
where $b \cdot \gamma$ is the usual scalar product on $\mathbb{R}^2$. It is customary to refer to $k$ as the crystal momentum and to $E_n(k, A, V)$ as the $n$th band function. The corresponding normalized eigenfunctions $\varphi_{n,k}$ are called Bloch eigenfunctions.

The operator $H(A, V)$ (and its three-dimensional counterpart) is important in solid state physics. It is the Hamiltonian of a single electron under the influence of
magnetic field with vector potential $A$, and electric field with scalar potential $V$, in the independent electron model of a two-dimensional solid [9].

The classical framework for studying the spectrum of a differential operator with periodic coefficients is the Floquet (or Bloch) theory [9–11]. Roughly speaking, the main idea of this theory is to "decompose" the original eigenvalue problem, which usually has continuous spectrum, into a family of boundary value problems, each one having discrete spectrum. In our context this leads to decomposing the problem $H(A, V)\varphi = \lambda\varphi$ (without boundary conditions) into the above $k$-family of boundary value problems.

Let $U_k$ be the unitary transformation on $L^2(\mathbb{R}^2)$ that acts as
\[ U_k : \varphi(x) \mapsto e^{ik\cdot x}\varphi(x). \]
By applying this transformation, we can rewrite the above problem and put the boundary conditions into the operator. Indeed, if we define
\[ H_k(A, V) := U_k^{-1} H(A, V)\, U_k \quad \text{and} \quad \psi := U_k^{-1}\varphi, \]
then the above problem is unitarily equivalent to
\[ H_k(A, V)\psi = \lambda\psi \quad \text{for } \psi \in L^2(\mathbb{R}^2/\Gamma). \]
Furthermore, a simple (formal) calculation shows that
\[ H_k(A, V) = (i\nabla + A - k)^2 + V. \]
The real "lifted" Fermi curve of $(A, V)$ with energy $\lambda \in \mathbb{R}$ is defined as
\[ \hat{F}_{\lambda,\mathbb{R}}(A, V) := \{ k \in \mathbb{R}^2 \mid (H_k(A, V) - \lambda)\varphi = 0 \text{ for some } \varphi \in \mathcal{D}_{H_k(A,V)} \setminus \{0\} \}, \]
where $\mathcal{D}_{H_k(A,V)} \subset L^2(\mathbb{R}^2/\Gamma)$ denotes the (dense) domain of $H_k(A, V)$. The adjective "lifted" indicates that $\hat{F}_{\lambda,\mathbb{R}}(A, V)$ is a subset of $\mathbb{R}^2$ rather than $\mathbb{R}^2/\Gamma^{\#}$. As we may replace $V$ by $V - \lambda$, we only discuss the case $\lambda = 0$ and write $\hat{F}_{\mathbb{R}}(A, V)$ in place of $\hat{F}_{0,\mathbb{R}}(A, V)$ to simplify the notation. Let
\[ |\Gamma| := \int_{\mathbb{R}^2/\Gamma} dx \quad \text{and} \quad \hat{A}(0) := |\Gamma|^{-1} \int_{\mathbb{R}^2/\Gamma} A(x)\, dx. \]
Since $H_k(A, V)$ is equal to $H_{k-\hat{A}(0)}(A - \hat{A}(0), V)$, if we perform the change of coordinates $k \to k + \hat{A}(0)$ and redefine $A - \hat{A}(0) \to A$ we may assume, without loss of generality, that $\hat{A}(0) = 0$.

The dual lattice $\Gamma^{\#}$ acts on $\mathbb{R}^2$ by translating $k \mapsto k + b$ for $b \in \Gamma^{\#}$. This action maps $\hat{F}_{\mathbb{R}}(A, V)$ to itself because for each $n \ge 1$ the function $k \mapsto E_n(k, A, V)$ is periodic with respect to $\Gamma^{\#}$. In other words, the real lifted Fermi curve "is periodic" with respect to $\Gamma^{\#}$. Define
\[ F_{\mathbb{R}}(A, V) := \hat{F}_{\mathbb{R}}(A, V)/\Gamma^{\#}. \]
We call $F_{\mathbb{R}}(A, V)$ the real Fermi curve of $(A, V)$. It is a curve in the torus $\mathbb{R}^2/\Gamma^{\#}$.

The above definitions and the real Fermi curve have physical meaning. It is useful and interesting, however, to study the "complexification" of these curves. Knowledge about the complexified curves may provide information about the real counterparts.

For complex-valued functions $A_1$, $A_2$ and $V$ in $L^2(\mathbb{R}^2)$ and for $k \in \mathbb{C}^2$ the above problem is no longer self-adjoint. Its spectrum, however, remains discrete. It is a sequence of eigenvalues in the complex plane. From the boundary condition
G. de Oliveira
in the original problem it is easy to see that the family of functions $k \mapsto E_n(k,A,V)$ remains periodic with respect to $\Gamma^\#$. Moreover, the transformation $U_k$ is no longer unitary but it is still bounded and invertible and it still preserves the spectrum, that is, we can still rewrite the original problem in the form $H_k(A,V)\psi = \lambda\psi$ for $\psi \in L^2(\mathbb{R}^2/\Gamma)$ without modifying the eigenvalues. Thus, it makes sense to define
\[ \hat F(A,V) := \{ k \in \mathbb{C}^2 \mid H_k(A,V)\varphi = 0 \text{ for some } \varphi \in D_{H_k(A,V)} \setminus \{0\} \}, \qquad F(A,V) := \hat F(A,V)/\Gamma^\#. \]
We call $\hat F(A,V)$ and $F(A,V)$ the complex "lifted" Fermi curve and the complex Fermi curve, respectively. When there is no risk of confusion we refer to either simply as the Fermi curve.

We are now ready to outline our results. When $A$ and $V$ are zero the (free) Fermi curve can be found explicitly. It consists of two copies of $\mathbb{C}$ with the points $-b_2 + ib_1$ (in the first copy) and $b_2 + ib_1$ (in the second copy) identified for all $(b_1, b_2) \in \Gamma^\#$ with $b_2 \ne 0$. In this work, we prove that in the region of $\mathbb{C}^2$ where $k \in \mathbb{C}^2$ has "large" imaginary part the Fermi curve (for nonzero $A$ and $V$) is "close to" the free Fermi curve. In a compact form, our main result (that will be stated precisely in Theorems 1 and 2) is essentially the following.

Main result. Suppose that $A$ and $V$ have some regularity and assume that (in a suitable norm) $A$ is smaller than a constant given by the parameters of the problem. Write $k$ in $\mathbb{C}^2$ as $k = u + iv$ with $u$ and $v$ in $\mathbb{R}^2$ and suppose that $|v|$ is larger than a constant given by the parameters of the problem. (Recall that the free Fermi curve is two copies of $\mathbb{C}$ with certain points in one copy identified with points in the other one.) Then, in this region of $\mathbb{C}^2$, the Fermi curve of $A$ and $V$ is very close to the free Fermi curve, except that instead of two planes we may have two deformed planes, and identifications between points can open up to handles that look like $\{(z_1, z_2) \in \mathbb{C}^2 \mid z_1 z_2 = \text{constant}\}$ in suitable local coordinates.
The proof of our results has basically three steps:

• We first derive very detailed information about the free Fermi curve (which is explicitly known). Then, to compute the interacting Fermi curve we have to find the kernel of $H$ in $L^2(\mathbb{R}^2)$ with the above boundary conditions.

• In the second step of the proof, we derive a number of estimates for showing that this kernel has finite dimension for small $A$ and $k \in \mathbb{C}^2$ with large imaginary part. Our strategy here is similar to the Feshbach method in perturbation theory [12]. Indeed, we prove that in the complement of the kernel of $H$ in $L^2(\mathbb{R}^2)$, after a suitable invertible change of variables in $L^2(\mathbb{R}^2)$, the operator $H$ multiplied by the inverse of the operator that implements this change of variables is a compact perturbation of the identity that is invertible for such $A$ and $k$. This reduces the problem of finding the kernel to finite dimension and thus we can write local defining equations for the Fermi curve.
• In the third step of the proof, we use these equations to study the Fermi curve. A few more estimates and the implicit function theorem give us the deformed planes. The handles are obtained using a quantitative Morse lemma from [13] that is available in Appendix A.

Steps two and three contain most of the novelties in this work. The critical part of the proof is the second step. The main difficulty arises due to the presence of the term $A\cdot i\nabla$ in the Hamiltonian $H(A,V)$. When $A$ is large, taking the imaginary part of $k \in \mathbb{C}^2$ arbitrarily large is not enough to control this term: it does not make its contribution small, and hence does not exhibit the interacting Fermi curve as a perturbation of the free Fermi curve. (The term $V$ in $H(A,V)$ is easily controlled by this method.) However, the proof can be implemented by assuming that $A$ is small.

This work is organized as follows. In Sec. 2, we collect some properties of the free Fermi curve and in Sec. 3, we define ε-tubes about it. In Sec. 4, we state our main results and in Sec. 5, we describe the general strategy of analysis used to prove them. Subsequently, we implement this strategy by proving a number of lemmas and propositions in Secs. 6–10, which we put together later in Secs. 11 and 12 to prove our main theorems. The proofs of the estimates of Secs. 9 and 10 are left to Appendices B and C.

2. The Free Fermi Curve

When the potentials $A$ and $V$ are zero the curve $\hat F(A,V)$ can be found explicitly. In this section we collect some properties of this curve. For $\nu \in \{1,2\}$ and $b \in \Gamma^\#$ set
\[ N_{b,\nu}(k) := (k_1 + b_1) + i(-1)^\nu (k_2 + b_2), \qquad N_\nu(b) := \{ k \in \mathbb{C}^2 \mid N_{b,\nu}(k) = 0 \}, \]
\[ N_b(k) := N_{b,1}(k)\,N_{b,2}(k), \qquad N_b := N_1(b) \cup N_2(b), \qquad \theta_\nu(b) := \tfrac{1}{2}\big((-1)^\nu b_2 + i b_1\big). \]
Observe that $N_\nu(b)$ is a line in $\mathbb{C}^2$. The free lifted Fermi curve is a union of these lines. Here is the precise statement.

Proposition 1 (The Free Fermi Curve). The curve $\hat F(0,0)$ is the locally finite union
\[ \bigcup_{b \in \Gamma^\#} \bigcup_{\nu \in \{1,2\}} N_\nu(b). \]
In particular, the curve $F(0,0)$ is a complex analytic curve in $\mathbb{C}^2/\Gamma^\#$.
The proof of this proposition is straightforward. It can be found in [13]. Here we only give its first part.

Proof of Proposition 1 (First Part). For all $k \in \mathbb{C}^2$ the functions $\{ e^{ib\cdot x} \mid b \in \Gamma^\# \}$ form a complete set of eigenfunctions for $H_k(0,0)$ in $L^2(\mathbb{R}^2/\Gamma)$ satisfying
\[ H_k(0,0)\, e^{ib\cdot x} = (i\nabla - k)^2 e^{ib\cdot x} = (b + k)^2 e^{ib\cdot x} = N_b(k)\, e^{ib\cdot x}. \]
Hence,
\[ \hat F(0,0) = \{ k \in \mathbb{C}^2 \mid N_b(k) = 0 \text{ for some } b \in \Gamma^\# \} = \bigcup_{b \in \Gamma^\#} N_b = \bigcup_{b \in \Gamma^\#} \bigcup_{\nu \in \{1,2\}} N_\nu(b). \]
This is the desired expression for $\hat F(0,0)$.

The lines $N_\nu(b)$ have the following properties (see [13] for a proof).

Proposition 2 (Properties of $N_\nu(b)$). Let $\nu \in \{1,2\}$ and let $b, c, d \in \Gamma^\#$. Then:
(a) $N_\nu(b) \cap N_\nu(c) = \emptyset$ if $b \ne c$;
(b) $\operatorname{dist}(N_\nu(b), N_\nu(c)) = \tfrac{1}{\sqrt{2}}\,|b - c|$;
(c) $N_1(b) \cap N_2(c) = \{ (i\theta_1(c) + i\theta_2(b),\ \theta_1(c) - \theta_2(b)) \}$;
(d) the map $k \mapsto k + d$ maps $N_\nu(b)$ to $N_\nu(b - d)$;
(e) the map $k \mapsto k + d$ maps $N_1(b) \cap N_2(c)$ to $N_1(b - d) \cap N_2(c - d)$.
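The algebra behind Propositions 1 and 2 can be spot-checked numerically. The following is a minimal sketch, not part of the original argument: the function names `N`, `theta` and `intersection` are illustrative choices, and the test vectors are arbitrary real pairs rather than points of a specific dual lattice (the identities being checked are purely algebraic).

```python
def N(b, nu, k):
    """N_{b,nu}(k) = (k1 + b1) + i (-1)^nu (k2 + b2) for complex k = (k1, k2)."""
    return (k[0] + b[0]) + 1j * (-1) ** nu * (k[1] + b[1])


def theta(nu, b):
    """theta_nu(b) = ((-1)^nu b2 + i b1) / 2."""
    return 0.5 * ((-1) ** nu * b[1] + 1j * b[0])


def intersection(b, c):
    """Point of N_1(b) and N_2(c) predicted by Proposition 2(c)."""
    return (1j * theta(1, c) + 1j * theta(2, b), theta(1, c) - theta(2, b))


if __name__ == "__main__":
    b, c = (1.0, 2.0), (-3.0, 0.5)  # arbitrary illustrative vectors
    k = intersection(b, c)
    # Both values should vanish up to rounding if k lies on both lines.
    print(abs(N(b, 1, k)), abs(N(c, 2, k)))
```

Evaluating `N(b, 1, k) * N(b, 2, k)` at any complex `k` and comparing with `(k1 + b1)**2 + (k2 + b2)**2` likewise reproduces the factorization $N_b(k) = (b + k)^2$ used in the proof of Proposition 1.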
Fig. 1. Sketch of $\hat F(0,0)$ and $F(0,0)$ when both $ik_1$ and $k_2$ are real. (Axes: $ik_1$ and $k_2$; the lines shown are $N_\nu(0)$, $N_\nu(b)$ and $N_\nu(-b)$ for $\nu = 1, 2$.)

Let us briefly describe what the free Fermi curve looks like. In Fig. 1, there is a sketch of the set of $(k_1, k_2) \in \hat F(0,0)$ for which both $ik_1$ and $k_2$ are real, for the case where the lattice $\Gamma^\#$ has points over the coordinate axes, that is, it has points
of the form $(b_1, 0)$ and $(0, b_2)$. Observe that, in particular, Proposition 2 yields
\[ N_1(0) \cap N_2(b) = \{ (i\theta_1(b), \theta_1(b)) \}, \qquad N_1(-b) \cap N_2(0) = \{ (i\theta_2(-b), \theta_2(b)) \}, \]
and the map $k \mapsto k + b$ maps $N_1(0) \cap N_2(b)$ to $N_1(-b) \cap N_2(0)$.

Recall that points in $\hat F(0,0)$ that differ by elements of $\Gamma^\#$ correspond to the same point in $F(0,0)$. Thus, in the sketch on the left, we should identify the lines $k_2 = -b_2/2$ and $k_2 = b_2/2$ for all $b \in \Gamma^\#$ with $b_2 \ne 0$, to get a pair of helices climbing up the outside of a cylinder, as illustrated by the figure on the right. The helices intersect each other twice on each cycle of the cylinder: once on the front half of the cylinder and once on the back half. Hence, viewed as a "manifold" (with singularities), the pair of helices are just two copies of $\mathbb{R}$ with points that correspond to intersections identified. We can use $k_2$ as a coordinate in each copy of $\mathbb{R}$ and then the pairs of identified points are $k_2 = b_2/2$ and $k_2 = -b_2/2$ for all $b \in \Gamma^\#$ with $b_2 \ne 0$.

So far we have only considered $k_2$ real. The full $\hat F(0,0)$ is just two copies of $\mathbb{C}$ with $k_2$ as a coordinate in each copy, provided we identify the points $\theta_1(b) = \tfrac{1}{2}(-b_2 + ib_1)$ (in the first copy) and $\theta_2(b) = \tfrac{1}{2}(b_2 + ib_1)$ (in the second copy) for all $b \in \Gamma^\#$ with $b_2 \ne 0$.

3. The ε-Tubes about the Free Fermi Curve

We now introduce real and imaginary coordinates in $\mathbb{C}^2$ and define ε-tubes about the free Fermi curve. We derive some properties of the ε-tubes as well. For $k \in \mathbb{C}^2$ write
\[ k_1 = u_1 + iv_1 \quad\text{and}\quad k_2 = u_2 + iv_2, \]
where $u_1$, $u_2$, $v_1$ and $v_2$ are real numbers. Then,
\[ N_{b,\nu}(k) = (k_1 + b_1) + i(-1)^\nu (k_2 + b_2) = i\big(v_1 + (-1)^\nu (u_2 + b_2)\big) - (-1)^\nu \big(v_2 - (-1)^\nu (u_1 + b_1)\big), \]
so that
\[ |N_{b,\nu}(k)| = |v + (-1)^\nu (u + b)^\perp|, \]
where $(y_1, y_2)^\perp := (y_2, -y_1)$. Since $N_b(k) = N_{b,1}(k)\,N_{b,2}(k)$, we have $N_b(k) = 0$ if and only if
\[ v - (u + b)^\perp = 0 \quad\text{or}\quad v + (u + b)^\perp = 0. \]
Let $2\Lambda$ be the length of the shortest nonzero "vector" in $\Gamma^\#$. Then there is at most one $b \in \Gamma^\#$ with $|v + (u + b)^\perp| < \Lambda$ and at most one $b \in \Gamma^\#$ with $|v - (u + b)^\perp| < \Lambda$ (see [13] for the proof).
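The identity $|N_{b,\nu}(k)| = |v + (-1)^\nu (u+b)^\perp|$ is easy to confirm numerically. The sketch below is only an illustration under our own naming (`N`, `perp`, `tube_distance` are not from the paper), with $k = u + iv$ stored as a pair of Python complex numbers.

```python
import math


def N(b, nu, k):
    """N_{b,nu}(k) = (k1 + b1) + i (-1)^nu (k2 + b2)."""
    return (k[0] + b[0]) + 1j * (-1) ** nu * (k[1] + b[1])


def perp(y):
    """(y1, y2)^perp := (y2, -y1)."""
    return (y[1], -y[0])


def tube_distance(b, nu, k):
    """|v + (-1)^nu (u + b)^perp| for k = u + i v with u, v real 2-vectors."""
    u = (k[0].real, k[1].real)
    v = (k[0].imag, k[1].imag)
    p = perp((u[0] + b[0], u[1] + b[1]))
    s = (-1) ** nu
    return math.hypot(v[0] + s * p[0], v[1] + s * p[1])


if __name__ == "__main__":
    k = (0.3 + 0.7j, -1.1 + 0.2j)  # arbitrary sample point
    b = (2.0, -1.0)                # arbitrary sample vector
    for nu in (1, 2):
        print(nu, abs(N(b, nu, k)), tube_distance(b, nu, k))
```

For each `nu` the two printed numbers should agree up to rounding, which is the statement that the tube $T_\nu(b)$ can be described either through $N_{b,\nu}$ or through the real vectors $u$ and $v$.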
Let $\varepsilon$ be a constant satisfying $0 < \varepsilon < \Lambda/6$. For $\nu \in \{1,2\}$ and $b \in \Gamma^\#$, define the ε-tube about $N_\nu(b)$ as
\[ T_\nu(b) := \{ k \in \mathbb{C}^2 \mid |N_{b,\nu}(k)| = |v + (-1)^\nu (u + b)^\perp| < \varepsilon \}, \]
and the ε-tube about $N_b = N_1(b) \cup N_2(b)$ as
\[ T_b := T_1(b) \cup T_2(b). \]
Since $(v + (u + b)^\perp) + (v - (u + b)^\perp) = 2v$, at least one of the factors $|v + (u + b)^\perp|$ or $|v - (u + b)^\perp|$ in $|N_b(k)|$ must always be greater than or equal to $|v|$. If $k \notin T_b$ both factors are also greater than or equal to $\varepsilon$. If $k \in T_b$ one factor is bounded by $\varepsilon$ and the other must lie within $\varepsilon$ of $|2v|$. Thus,
\[ k \notin T_b \ \Rightarrow\ |N_b(k)| \ge \varepsilon |v|, \tag{1} \]
\[ k \in T_b \ \Rightarrow\ |N_b(k)| \le \varepsilon (2|v| + \varepsilon). \tag{2} \]
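Bounds (1) and (2) can be probed by random sampling. The following sketch is an illustration only: the sampling ranges, the tolerance and the function names are our assumptions, and the vectors `b` are arbitrary real pairs since the two bounds do not use the lattice structure.

```python
import math
import random


def factors(b, k):
    """Return (|v + (u+b)^perp|, |v - (u+b)^perp|, |v|) for k = u + i v."""
    u = (k[0].real, k[1].real)
    v = (k[0].imag, k[1].imag)
    p = (u[1] + b[1], -(u[0] + b[0]))  # (u + b)^perp
    plus = math.hypot(v[0] + p[0], v[1] + p[1])
    minus = math.hypot(v[0] - p[0], v[1] - p[1])
    return plus, minus, math.hypot(v[0], v[1])


def check_tube_bounds(eps, samples=10000, seed=0, tol=1e-9):
    """Return True if every sampled (k, b) satisfies (1) or (2), as applicable."""
    rng = random.Random(seed)
    for _ in range(samples):
        k = (complex(rng.uniform(-5, 5), rng.uniform(-5, 5)),
             complex(rng.uniform(-5, 5), rng.uniform(-5, 5)))
        b = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        plus, minus, vnorm = factors(b, k)
        abs_nb = plus * minus  # |N_b(k)|
        if min(plus, minus) < eps:
            # k lies in the tube T_b: upper bound (2).
            if abs_nb > eps * (2 * vnorm + eps) + tol:
                return False
        else:
            # k lies outside T_b: lower bound (1).
            if abs_nb < eps * vnorm - tol:
                return False
    return True


if __name__ == "__main__":
    print(check_tube_bounds(0.5))
```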
Finally, denote by $\bar T_b$ the closure of $T_b$. The intersection $\bar T_b \cap \bar T_{b'}$ is compact whenever $b \ne b'$, and $\bar T_b \cap \bar T_{b'} \cap \bar T_{b''}$ is empty for all distinct elements $b, b', b'' \in \Gamma^\#$ (see [13] for details).

If a point $k$ belongs to the free Fermi curve the function $N_b(k)$ vanishes for some $b \in \Gamma^\#$. We now give a lower bound for this function when $(b, k)$ is not in the zero set.

Proposition 3 (Lower Bound for $|N_b(k)|$).
(a) If $|b + u + v^\perp| \ge \Lambda$ and $|b + u - v^\perp| \ge \Lambda$, then $|N_b(k)| \ge \frac{\Lambda}{2}(|v| + |u + b|)$.
(b) If $|v| > 2\Lambda$ and $k \in T_0$, then $|N_b(k)| \ge \frac{\Lambda}{2}(|v| + |u + b|)$ for all $b \ne 0$ but at most one $\tilde b \ne 0$. This exceptional $\tilde b$ obeys $|\tilde b| > |v|$ and $\big|\, |u + \tilde b| - |v| \,\big| < \Lambda$.
(c) If $|v| > 2\Lambda$ and $k \in T_0 \cap T_d$ with $d \ne 0$, then $|N_b(k)| \ge \frac{\Lambda}{2}(|v| + |u + b|)$ for all $b \notin \{0, d\}$. Furthermore we have $|d| > |v|$ and $\big|\, |u + d| - |v| \,\big| < \Lambda$.

Proof. (a) By hypothesis, both factors in $|N_b(k)| = |v + (u + b)^\perp|\,|v - (u + b)^\perp|$ are greater than or equal to $\Lambda$. We now prove that at least one of the factors must also be greater than or equal to $\frac{1}{2}(|v| + |u + b|)$. Suppose that $|v| \ge |u + b|$. Then, since $(v + (u + b)^\perp) + (v - (u + b)^\perp) = 2v$, at least one of the factors must also be greater than or equal to $|v| = \frac{1}{2}(|v| + |v|) \ge \frac{1}{2}(|v| + |u + b|)$. Now suppose that $|v| < |u + b|$. Then similarly, since the difference of the two vectors is $2(u + b)^\perp$, at least one of the factors is greater than or equal to $|u + b| > \frac{1}{2}(|v| + |u + b|)$. All this together implies that $|N_b(k)| \ge \frac{\Lambda}{2}(|v| + |u + b|)$, which proves part (a).

(b) By hypothesis $\varepsilon < \Lambda/6 < |v|$. Let $k \in T_0$. Then, by (2),
\[ |N_0(k)| \le \varepsilon(2|v| + \varepsilon) < 3\varepsilon|v| < \frac{\Lambda}{2}|v|. \tag{3} \]
Thus we have either $|u + v^\perp| < \Lambda$ or $|u - v^\perp| < \Lambda$ (otherwise we apply part (a) to get a contradiction). Suppose that $|u + v^\perp| < \Lambda$. Then there is no $b \in \Gamma^\# \setminus \{0\}$ with $|b + u + v^\perp| < \Lambda$ and there is at most one $\tilde b \in \Gamma^\# \setminus \{0\}$ satisfying
$|\tilde b + u - v^\perp| < \Lambda$. This inequality implies $\big|\, |u + \tilde b| - |v| \,\big| < \Lambda$. Furthermore, for this $\tilde b$,
\[ |\tilde b| = |2v^\perp - (u + v^\perp) + (\tilde b + u - v^\perp)| > 2|v| - 2\Lambda > |v|, \]
since $-2\Lambda > -|v|$. Now suppose that $|u - v^\perp| < \Lambda$. Then similarly we prove that $|\tilde b| > |v|$. Finally observe that, if $b \notin \{0, \tilde b\}$ then $|b + u + v^\perp| \ge \Lambda$ and $|b + u - v^\perp| \ge \Lambda$. Hence, by applying part (a) it follows that $|N_b(k)| \ge \frac{\Lambda}{2}(|v| + |u + b|)$. This proves part (b).

(c) As in the proof of part (b), if $k \in T_0 \cap T_d$ then in addition to (3), we have $|N_d(k)| < \frac{\Lambda}{2}|v|$. Thus, applying part (b) we conclude that $d$ must be the exceptional $\tilde b$ of part (b). The statement of part (c) follows then from part (b). This completes the proof.

4. Main Results

The Riemann surfaces introduced in [1] can be decomposed into $X^{\rm com} \cup X^{\rm reg} \cup X^{\rm han}$, where $X^{\rm com}$ is a compact submanifold with smooth boundary and finite genus, $X^{\rm reg}$ is a finite union of open "regular pieces", and $X^{\rm han}$ is an infinite union of closed "handles". All these components satisfy a number of geometric/analytic hypotheses stated in [1] that specify the asymptotic holomorphic structure of the surface. Below we state two "asymptotic" theorems that essentially characterize the $X^{\rm reg}$ and $X^{\rm han}$ components of Fermi curves with small magnetic potential.

Before we move to the theorems let us introduce some definitions. For any $\varphi \in L^2(\mathbb{R}^2/\Gamma)$ define $\hat\varphi : \Gamma^\# \to \mathbb{C}$ as
\[ \hat\varphi(b) := (\mathcal{F}\varphi)(b) := \frac{1}{|\Gamma|} \int_{\mathbb{R}^2/\Gamma} \varphi(x)\, e^{-ib\cdot x}\, dx, \]
where $|\Gamma| := \int_{\mathbb{R}^2/\Gamma} dx$. Then,
\[ \varphi(x) = (\mathcal{F}^{-1}\hat\varphi)(x) = \sum_{b \in \Gamma^\#} \hat\varphi(b)\, e^{ib\cdot x}, \qquad \|\varphi\|_{L^2(\mathbb{R}^2/\Gamma)} = |\Gamma|^{1/2}\, \|\hat\varphi\|_{l^2(\Gamma^\#)}. \]
Recall that $k = u + iv$ with $u, v \in \mathbb{R}^2$, let $\rho$ be a positive constant, and set
\[ K_\rho := \{ k \in \mathbb{C}^2 \mid |v| \le \rho \}. \]
Finally, consider the projection
\[ \mathrm{pr} : \mathbb{C}^2 \to \mathbb{C}, \qquad (k_1, k_2) \mapsto k_2, \]
and define
\[ q := (i\nabla\cdot A) + A^2 + V. \]
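It is easy to construct a holomorphic map $E : \hat F(A,V) \to F(A,V)$ [13]. The precise form of this map is irrelevant here. For our purposes it is enough to think of it simply as a "projection" (or "exponential map").

We are ready to state our results. Clearly, the set $K_\rho$ is invariant under the action of $\Gamma^\#$ and $K_\rho/\Gamma^\#$ is compact. Hence, the image of $\hat F(A,V) \cap K_\rho$ under the holomorphic map $E$ is compact in $F(A,V)$. This image set will essentially play the role of $X^{\rm com}$ in the decomposition of $F(A,V)$. Our first theorem characterizes the regular piece $X^{\rm reg}$ of $F(A,V)$.

Theorem 1 (The Regular Piece). Let $0 < \varepsilon < \Lambda/6$ and suppose that $A_1$, $A_2$ and $V$ are functions in $L^2(\mathbb{R}^2/\Gamma)$ with $\|b^2 \hat q(b)\|_{l^1(\Gamma^\#)} < \infty$ and $\|(1 + b^2)\hat A(b)\|_{l^1(\Gamma^\# \setminus \{0\})} < 2\varepsilon/63$. Then there is a constant $\rho = \rho_{\Lambda,\varepsilon,q,A}$ such that, for $\nu \in \{1,2\}$, the projection $\mathrm{pr}$ induces a biholomorphic map between
\[ \big( \hat F(A,V) \cap T_\nu(0) \big) \setminus \Big( K_\rho \cup \bigcup_{b \in \Gamma^\# \setminus \{0\}} T_b \Big) \]
and its image in $\mathbb{C}$. This image component contains
\[ \{ z \in \mathbb{C} \mid 8|z| > \rho \text{ and } |z + (-1)^\nu \theta_\nu(b)| > \varepsilon \text{ for all } b \in \Gamma^\# \setminus \{0\} \} \]
and is contained in
\[ \Big\{ z \in \mathbb{C} \ \Big|\ |z + (-1)^\nu \theta_\nu(b)| > \frac{\varepsilon}{2} - \frac{\varepsilon^2}{40\Lambda} \text{ for all } b \in \Gamma^\# \setminus \{0\} \Big\}, \]
where $\theta_\nu(b) = \frac{1}{2}((-1)^\nu b_2 + ib_1)$. Furthermore,
\[ \mathrm{pr}^{-1} : \mathrm{Image}(\mathrm{pr}) \to T_\nu(0), \qquad y \mapsto \big( -\beta_2^{(1,0)} - i(-1)^\nu y - r(y),\ y \big), \]
where $\beta_2^{(1,0)}$ is a constant given by (24) that depends only on $\rho$ and $A$,
\[ |\beta_2^{(1,0)}| < \frac{\varepsilon^2}{100\Lambda} \quad\text{and}\quad |r(y)| \le \frac{\varepsilon^3}{50\Lambda^2} + \frac{C}{\rho}, \]
where $C = C_{\Lambda,\varepsilon,q,A}$ is a constant.

Now observe that, since $T_b + c = T_{b+c}$ for all $b, c \in \Gamma^\#$, the complement of $E(\hat F(A,V) \cap K_\rho)$ in $F(A,V)$ is the disjoint union of
\[ E\Big( \big( \hat F(A,V) \cap T_0 \big) \setminus \Big( K_\rho \cup \bigcup_{b \in \Gamma^\#,\ b_2 \ne 0} T_b \Big) \Big) \quad\text{and}\quad \bigcup_{b \in \Gamma^\#,\ b_2 \ne 0} E\big( \hat F(A,V) \cap T_0 \cap T_b \big). \]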
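Basically, the first of the two sets will be the regular piece of $F(A,V)$, while the second set will be the handles. The map $\Phi$ parametrizing the regular part will be the composition of the map $E$ with the inverse of the map discussed in the above theorem. The detailed information about the handles $X^{\rm han}$ in $F(A,V)$ comes from our second main theorem.

Theorem 2 (The Handles). Let $0 < \varepsilon < \Lambda/6$ and suppose that $A_1$, $A_2$ and $V$ are functions in $L^2(\mathbb{R}^2/\Gamma)$ with $\|b^2 \hat q(b)\|_{l^1(\Gamma^\#)} < \infty$ and $\|(1 + b^2)\hat A(b)\|_{l^1(\Gamma^\# \setminus \{0\})} < 2\varepsilon/63$. Then, for every sufficiently large constant $\rho$ and for every $d \in \Gamma^\# \setminus \{0\}$ with $2|d| > \rho$, there are maps
\[ \phi_{d,1} : \Big\{ (z_1, z_2) \in \mathbb{C}^2 \ \Big|\ |z_1| \le \frac{\varepsilon}{2} \text{ and } |z_2| \le \frac{\varepsilon}{2} \Big\} \to T_1(0) \cap T_2(d), \]
\[ \phi_{d,2} : \Big\{ (z_1, z_2) \in \mathbb{C}^2 \ \Big|\ |z_1| \le \frac{\varepsilon}{2} \text{ and } |z_2| \le \frac{\varepsilon}{2} \Big\} \to T_1(-d) \cap T_2(0), \]
and a complex number $t_d$ with
\[ |t_d| \le \frac{C}{|d|^4} \]
such that:

(i) For $\nu \in \{1,2\}$ the domain of the map $\phi_{d,\nu}$ is biholomorphic to its image, and the image contains
\[ \Big\{ k \in \mathbb{C}^2 \ \Big|\ |k_1 + i(-1)^\nu k_2| \le \frac{\varepsilon}{8} \text{ and } |k_1 + (-1)^{\nu+1} d_1 - i(-1)^\nu (k_2 + (-1)^{\nu+1} d_2)| \le \frac{\varepsilon}{8} \Big\}. \]
Furthermore,
\[ D\hat\phi_{d,\nu} = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ -i(-1)^\nu & i(-1)^\nu \end{pmatrix} \Big( I + O\Big(\frac{1}{|d|^2}\Big) \Big) \]
and
\[ \phi_{d,\nu}(0) = \big( i\theta_\nu(d),\ (-1)^{\nu+1}\theta_\nu(d) \big) + O\Big(\frac{\varepsilon}{900}\Big) + O\Big(\frac{1}{\rho}\Big). \]

(ii)
\[ \phi_{d,1}^{-1}\big( T_1(0) \cap T_2(d) \cap \hat F(A,V) \big) = \Big\{ (z_1, z_2) \in \mathbb{C}^2 \ \Big|\ z_1 z_2 = t_d,\ |z_1| \le \frac{\varepsilon}{2} \text{ and } |z_2| \le \frac{\varepsilon}{2} \Big\}, \]
\[ \phi_{d,2}^{-1}\big( T_1(-d) \cap T_2(0) \cap \hat F(A,V) \big) = \Big\{ (z_1, z_2) \in \mathbb{C}^2 \ \Big|\ z_1 z_2 = t_d,\ |z_1| \le \frac{\varepsilon}{2} \text{ and } |z_2| \le \frac{\varepsilon}{2} \Big\}. \]

(iii) $\phi_{d,1}(z_1, z_2) = \phi_{d,2}(z_2, z_1) - d$.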
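These are the main results in this paper. In the next section, we outline the strategy for proving them. The proofs are presented in the subsequent sections, divided into many steps.

5. Strategy Outline

Below we briefly describe the general strategy of analysis used to prove our results. We first introduce some notation and definitions. Observe that
\[ H_k(A,V)\varphi = \big( (i\nabla + A - k)^2 + V \big)\varphi = \big( (i\nabla - k)^2 + 2A\cdot(i\nabla - k) + (i\nabla\cdot A) + A^2 + V \big)\varphi, \]
and write
\[ H_k(A,V) = \Delta_k + h(k,A) + q(A,V) \]
with
\[ \Delta_k := (i\nabla - k)^2, \qquad h(k,A) := 2A\cdot(i\nabla - k) \qquad\text{and}\qquad q(A,V) := (i\nabla\cdot A) + A^2 + V. \]
For each finite subset $G$ of $\Gamma^\#$ set
\[ G' := \Gamma^\# \setminus G \qquad\text{and}\qquad \mathbb{C}^2_G := \mathbb{C}^2 \setminus \bigcup_{b \in G'} N_b, \]
\[ L^2_G := \operatorname{span}\{ e^{ib\cdot x} \mid b \in G \} \qquad\text{and}\qquad L^2_{G'} := \operatorname{span}\{ e^{ib\cdot x} \mid b \in G' \}. \]
To simplify the notation write $L^2$ in place of $L^2(\mathbb{R}^2/\Gamma)$. Let $I$ be the identity operator on $L^2$, and let $\pi_G$ and $\pi_{G'}$ be the orthogonal projections from $L^2$ onto $L^2_G$ and $L^2_{G'}$, respectively. Then,
\[ L^2 = L^2_G \oplus L^2_{G'} \qquad\text{and}\qquad I = \pi_G + \pi_{G'}. \]
For $k \in \mathbb{C}^2_G$ define the partial inverse $(\Delta_k)_G^{-1}$ on $L^2$ as
\[ (\Delta_k)_G^{-1} := \pi_G + \Delta_k^{-1}\pi_{G'}. \]
Its matrix elements are
\[ \big( (\Delta_k)_G^{-1} \big)_{b,c} := \Big\langle \frac{e^{ib\cdot x}}{|\Gamma|^{1/2}},\ (\Delta_k)_G^{-1} \frac{e^{ic\cdot x}}{|\Gamma|^{1/2}} \Big\rangle_{L^2} = \begin{cases} \delta_{b,c} & \text{if } c \in G, \\[4pt] \dfrac{\delta_{b,c}}{N_c(k)} & \text{if } c \in G', \end{cases} \]
where $b, c \in \Gamma^\#$.

Here is the main idea. By definition, a point $k$ is in $\hat F(A,V)$ if $H_k(A,V)$ has a nontrivial kernel in $L^2$. Hence, to study the part of the curve in the intersection of the tubes $T_{d'}$, $d' \in G$, with $\mathbb{C}^2 \setminus \bigcup_{b \in G'} T_b$ for some finite subset $G$ of $\Gamma^\#$, it is natural to look for a nontrivial solution of
\[ (\Delta_k + h + q)(\psi_G + \psi_{G'}) = 0, \]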
where $\psi_G \in L^2_G$ and $\psi_{G'} \in L^2_{G'}$. Equivalently, if we make the following (invertible) change of variables in $L^2$,
\[ (\psi_G + \psi_{G'}) = (\Delta_k)_G^{-1}(\varphi_G + \varphi_{G'}), \]
where $\varphi_G \in L^2_G$ and $\varphi_{G'} \in L^2_{G'}$, we may consider the equation
\[ (\Delta_k + h + q)\varphi_G + (I + (h + q)\Delta_k^{-1})\varphi_{G'} = 0. \tag{4} \]
The projections of this equation onto $L^2_{G'}$ and $L^2_G$ are, respectively,
\[ \pi_{G'}(h + q)\varphi_G + \pi_{G'}(I + (h + q)\Delta_k^{-1})\varphi_{G'} = 0, \tag{5} \]
\[ \pi_G(\Delta_k + h + q)\varphi_G + \pi_G(h + q)\Delta_k^{-1}\varphi_{G'} = 0. \tag{6} \]
Now define $R_{G'G'}$ on $L^2$ as
\[ R_{G'G'} := \pi_{G'}(I + (h + q)\Delta_k^{-1})\pi_{G'}. \]
Observe that $R_{G'G'}$ is the zero operator on $L^2_G$. Then, if $R_{G'G'}$ has a bounded inverse on $L^2_{G'}$, Eq. (5) is equivalent to
\[ \varphi_{G'} = -R_{G'G'}^{-1}\pi_{G'}(h + q)\varphi_G. \]
Substituting this into (6) yields
\[ \pi_G\big( \Delta_k + h + q - (h + q)\Delta_k^{-1}R_{G'G'}^{-1}\pi_{G'}(h + q) \big)\varphi_G = 0. \]
This equation has a nontrivial solution if and only if the (finite) $|G| \times |G|$ determinant
\[ \det\big[ \pi_G\big( \Delta_k + h + q - (h + q)\Delta_k^{-1}R_{G'G'}^{-1}\pi_{G'}(h + q) \big)\pi_G \big] = 0 \]
or, equivalently, expressing all operators as matrices in the basis $\{ |\Gamma|^{-1/2}e^{ib\cdot x} \mid b \in \Gamma^\# \}$,
\[ \det\Big[ N_{d'}(k)\,\delta_{d',d''} + w_{d',d''} - \sum_{b,c \in G'} \frac{w_{d',b}}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\, w_{c,d''} \Big]_{d',d'' \in G} = 0, \tag{7} \]
where
\[ w_{b,c} := h_{b,c} + \hat q(b - c) = -2(c + k)\cdot\hat A(b - c) + \hat q(b - c). \]
Therefore, if $R_{G'G'}$ has a bounded inverse on $L^2_{G'}$ (which is in fact the case under suitable conditions), in the region under consideration we can study the Fermi curve in detail using the (local) defining Eq. (7).
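The passage from the full kernel problem to the finite determinant (7) is, in finite dimensions, the familiar Schur complement (Feshbach) reduction. The sketch below is only a toy analogue under our own assumptions: a $3 \times 3$ complex matrix stands in for the operator, the first index plays the role of $G$ and the remaining two play the role of $G'$; none of the names come from the paper.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def reduce_to_first_index(m):
    """Eliminate indices 1, 2 (the analogue of G') from the 3x3 matrix m.

    Returns (effective scalar on index 0, determinant of the eliminated block).
    The eliminated block is assumed to be invertible.
    """
    a = m[0][0]
    row = m[0][1:]                 # coupling from the eliminated block into index 0
    col = [m[1][0], m[2][0]]       # coupling from index 0 into the eliminated block
    block = [m[1][1:], m[2][1:]]   # analogue of the operator restricted to G'
    det_block = det2(block)
    inv = [[block[1][1] / det_block, -block[0][1] / det_block],
           [-block[1][0] / det_block, block[0][0] / det_block]]
    correction = sum(row[i] * inv[i][j] * col[j] for i in range(2) for j in range(2))
    return a - correction, det_block


if __name__ == "__main__":
    M = [[1 + 2j, 0.5, -1j],
         [0.3, 2.0, 0.1j],
         [0.2j, -0.4, 3.0 - 1j]]
    effective, det_block = reduce_to_first_index(M)
    # det(M) factors as det(block) times the effective scalar.
    print(det3(M), effective * det_block)
```

Since the eliminated block is invertible, the full matrix has a nontrivial kernel exactly when the effective scalar vanishes, which mirrors how (7) detects points of the Fermi curve once $R_{G'G'}$ is known to be invertible.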
6. Invertibility of $R_{G'G'}$

The following notation will be used whenever we consider vector-valued quantities. Let $X$ be a Banach space and let $A, B \in X^2$, where $A = (A_1, A_2)$ and $B = (B_1, B_2)$. Then,
\[ \|A\|_X := \big( \|A_1\|_X^2 + \|A_2\|_X^2 \big)^{1/2} \qquad\text{and}\qquad A\cdot B := A_1B_1 + A_2B_2. \]
Furthermore, we will denote by $\|\cdot\|$ the operator norm on $L^2(\mathbb{R}^2/\Gamma)$.

In general, for any $B, C \subset \Gamma^\#$ ($C$ such that $\Delta_k^{-1}\pi_C$ exists) define the operator $R_{BC}$ as
\[ R_{BC} := \pi_B(I + (h + q)\Delta_k^{-1})\pi_C = \pi_B\pi_C + \pi_B\, q\, \Delta_k^{-1}\pi_C + \pi_B(2A\cdot i\nabla)\Delta_k^{-1}\pi_C - \pi_B(2k\cdot A)\Delta_k^{-1}\pi_C. \tag{8} \]
Its matrix elements are
\[ (R_{BC})_{b,c} = \delta_{b,c} + \frac{\hat q(b - c)}{N_c(k)} - \frac{2c\cdot\hat A(b - c)}{N_c(k)} - \frac{2k\cdot\hat A(b - c)}{N_c(k)}, \tag{9} \]
where $b \in B$ and $c \in C$. We first estimate the norm of the last three terms on the right-hand side of (8). We begin with the following proposition.

Proposition 4. Let $k \in \mathbb{C}^2$ and let $B, C \subset \Gamma^\#$ with $C \subset \{ b \in \Gamma^\# \mid N_b(k) \ne 0 \}$. Then,
\[ \|\pi_B\, q\, \Delta_k^{-1}\pi_C\| \le \|\hat q\|_{l^1} \sup_{c \in C} \frac{1}{|N_c(k)|}, \]
\[ \|\pi_B(A\cdot i\nabla)\Delta_k^{-1}\pi_C\| \le \|\hat A\|_{l^1} \sup_{c \in C} \frac{|c|}{|N_c(k)|}, \]
\[ \|\pi_B(k\cdot A)\Delta_k^{-1}\pi_C\| \le \|\hat A\|_{l^1}\, |k| \sup_{c \in C} \frac{1}{|N_c(k)|}. \]
To prove this proposition we apply the following well-known inequality (see [13]).

Proposition 5. Consider a linear operator $T : L^2_C \to L^2_B$ with matrix elements $T_{b,c}$. Then,
\[ \|T\| \le \max\Big\{ \sup_{c \in C} \sum_{b \in B} |T_{b,c}|,\ \sup_{b \in B} \sum_{c \in C} |T_{b,c}| \Big\}. \]

Proof of Proposition 4. We only prove the first inequality. The proof of the other ones is similar. Write $T := \pi_B\, q\, \Delta_k^{-1}\pi_C$. Then, in view of (8) and (9),
\[ \sup_{c \in C} \sum_{b \in B} |T_{b,c}| \le \sup_{c \in C} \sum_{b \in B} \frac{|\hat q(b - c)|}{|N_c(k)|} \le \sup_{c \in C} \frac{1}{|N_c(k)|}\, \|\hat q\|_{l^1}, \]
\[ \sup_{b \in B} \sum_{c \in C} |T_{b,c}| \le \sup_{b \in B} \sum_{c \in C} \frac{|\hat q(b - c)|}{|N_c(k)|} \le \sup_{c \in C} \frac{1}{|N_c(k)|}\, \|\hat q\|_{l^1}. \]
By Proposition 5, these estimates yield the desired inequality.
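The inequality of Proposition 5 (a form of the Schur test) can be illustrated on a small real matrix. The sketch below is an assumption-laden toy: the matrix is arbitrary, and the operator norm is only estimated from below by power iteration on $T^{\mathsf T}T$, so the comparison shows consistency with the bound rather than proving it.

```python
import math


def schur_bound(T):
    """max(largest absolute row sum, largest absolute column sum) of T."""
    rows = max(sum(abs(x) for x in row) for row in T)
    cols = max(sum(abs(T[i][j]) for i in range(len(T))) for j in range(len(T[0])))
    return max(rows, cols)


def op_norm_estimate(T, iters=200):
    """Lower estimate of the l2 operator norm of T via power iteration on T^T T."""
    m, n = len(T), len(T[0])
    x = [1.0 / math.sqrt(n)] * n  # unit starting vector
    for _ in range(iters):
        y = [sum(T[i][j] * x[j] for j in range(n)) for i in range(m)]   # y = T x
        z = [sum(T[i][j] * y[i] for i in range(m)) for j in range(n)]   # z = T^T y
        norm_z = math.sqrt(sum(v * v for v in z))
        if norm_z == 0.0:
            return 0.0
        x = [v / norm_z for v in z]
    y = [sum(T[i][j] * x[j] for j in range(n)) for i in range(m)]
    return math.sqrt(sum(v * v for v in y))  # |T x| with |x| = 1


if __name__ == "__main__":
    T = [[0.5, -1.0, 0.25],
         [2.0, 0.0, -0.5],
         [0.1, 0.3, 1.5],
         [-0.7, 0.2, 0.0]]
    print(op_norm_estimate(T), schur_bound(T))
```

The first printed number should not exceed the second; in Proposition 4 the same comparison is applied to the infinite matrices indexed by $B$ and $C$.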
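The key estimate for the existence of $R_{G'G'}^{-1}$ is given below.

Proposition 6 (Estimate of $\|R_{SS} - \pi_S\|$). Let $k \in \mathbb{C}^2$ with $|u| \le 2|v|$ and $|v| > 2\Lambda$. Suppose that $S \subset \{ b \in \Gamma^\# \mid |N_b(k)| \ge \varepsilon|v| \}$. Then,
\[ \|R_{SS} - \pi_S\| \le \|\hat q\|_{l^1} \frac{1}{\varepsilon|v|} + \|\hat A\|_{l^1} \frac{14}{\varepsilon}. \tag{10} \]
If $A = 0$, the right-hand side of (10) can be made arbitrarily small for any $V$ by taking $|v|$ sufficiently large (recall that $q(0,V) = V$). If $A \ne 0$, however, we need to take $\|\hat A\|_{l^1}$ small to make that quantity less than 1. The term $\frac{14}{\varepsilon}\|\hat A\|_{l^1}$ in (10) comes from the estimate we have for $\|\pi_{G'}\, h\, \Delta_k^{-1}\pi_{G'}\|$.

Proof of Proposition 6. By hypothesis, for all $b \in S$,
\[ \frac{1}{|N_b(k)|} \le \frac{1}{\varepsilon|v|}. \tag{11} \]
We now show that, for all $b \in S$,
\[ \frac{|b|}{|N_b(k)|} \le \frac{4}{\varepsilon}. \tag{12} \]
First suppose that $|b| \le 4|v|$. Then,
\[ \frac{|b|}{|N_b(k)|} \le \frac{4|v|}{\varepsilon|v|} = \frac{4}{\varepsilon}. \]
Now suppose that $|b| \ge 4|v|$. Again, by hypothesis we have $|u| \le 2|v|$ and $|v| > 2\Lambda > \varepsilon$. Hence,
\[ |v \pm (u + b)^\perp| \ge |b| - |u| - |v| \ge |b| - 3|v| \ge |b| - \frac{3}{4}|b| = \frac{|b|}{4}. \]
Consequently,
\[ \frac{|b|}{|N_b(k)|} = \frac{|b|}{|v + (u + b)^\perp|\,|v - (u + b)^\perp|} \le |b|\,\frac{4}{|b|}\,\frac{4}{|b|} = \frac{16}{|b|} \le \frac{4}{|v|} \le \frac{4}{\varepsilon}. \]
This proves (12). The expression for $R_{SS} - \pi_S$ is given by (8). Observe that $|k| \le |u| + |v| \le 3|v|$. Then, applying Proposition 4 and using (11) and (12) we obtain
\[ \|R_{SS} - \pi_S\| \le \big( 6|v|\,\|\hat A\|_{l^1} + \|\hat q\|_{l^1} \big) \sup_{b \in S} \frac{1}{|N_b(k)|} + 2\|\hat A\|_{l^1} \sup_{b \in S} \frac{|b|}{|N_b(k)|} \]
\[ \le \big( 6|v|\,\|\hat A\|_{l^1} + \|\hat q\|_{l^1} \big) \frac{1}{\varepsilon|v|} + \frac{8}{\varepsilon}\|\hat A\|_{l^1} = \|\hat q\|_{l^1} \frac{1}{\varepsilon|v|} + \frac{14}{\varepsilon}\|\hat A\|_{l^1}. \]
This is the desired inequality.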
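Inequality (12) can be probed numerically. The sketch below makes several assumptions purely for illustration: the dual lattice is taken to be $2\Lambda\mathbb{Z}^2$ with $\Lambda = \pi$ (so that the shortest nonzero vector has length $2\Lambda$), the sampling window is arbitrary, and all names are ours.

```python
import math
import random


def abs_N(b, u, v):
    """|N_b(k)| = |v + (u+b)^perp| * |v - (u+b)^perp| for k = u + i v."""
    p = (u[1] + b[1], -(u[0] + b[0]))  # (u + b)^perp
    return (math.hypot(v[0] + p[0], v[1] + p[1])
            * math.hypot(v[0] - p[0], v[1] - p[1]))


def max_scaled_ratio(eps, lam=math.pi, samples=5000, seed=0):
    """Largest sampled value of eps * |b| / |N_b(k)| under the hypotheses of Proposition 6."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        r = rng.uniform(2.0001 * lam, 2 * lam + 20.0)  # |v| > 2*Lambda
        t = rng.uniform(0.0, 2 * math.pi)
        v = (r * math.cos(t), r * math.sin(t))
        ru = rng.uniform(0.0, 2 * r)                   # |u| <= 2|v|
        tu = rng.uniform(0.0, 2 * math.pi)
        u = (ru * math.cos(tu), ru * math.sin(tu))
        b = (2 * lam * rng.randint(-10, 10), 2 * lam * rng.randint(-10, 10))
        n = abs_N(b, u, v)
        if n >= eps * r:                               # b belongs to the set S
            worst = max(worst, eps * math.hypot(b[0], b[1]) / n)
    return worst


if __name__ == "__main__":
    # Inequality (12) predicts a value of at most 4.
    print(max_scaled_ratio(0.5))
```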
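From the last proposition it follows easily that $R_{SS}$ has a bounded inverse for large $|v|$ and weak magnetic potential.

Lemma 1 (Invertibility of $R_{SS}$). Let $k \in \mathbb{C}^2$,
\[ |u| \le 2|v|, \qquad |v| > \max\Big\{ 2\Lambda,\ \frac{2}{\varepsilon}\|\hat q\|_{l^1} \Big\}, \qquad \|\hat q\|_{l^1} < \infty \qquad\text{and}\qquad \|\hat A\|_{l^1} < \frac{2}{63}\varepsilon. \]
Suppose that $S \subset \{ b \in \Gamma^\# \mid |N_b(k)| \ge \varepsilon|v| \}$. Then the operator $R_{SS}$ has a bounded inverse with
\[ \|R_{SS} - \pi_S\| < \|\hat q\|_{l^1} \frac{1}{\varepsilon|v|} + \|\hat A\|_{l^1} \frac{14}{\varepsilon} < \frac{17}{18}, \qquad \|R_{SS}^{-1} - \pi_S\| < 18\,\|R_{SS} - \pi_S\|. \]

Proof. Write $R_{SS} = \pi_S + T$ with $T = R_{SS} - \pi_S$. Then, by Proposition 6,
\[ \|T\| = \|R_{SS} - \pi_S\| \le \|\hat q\|_{l^1} \frac{1}{\varepsilon|v|} + \|\hat A\|_{l^1} \frac{14}{\varepsilon} < \frac{1}{2} + \frac{4}{9} = \frac{17}{18} < 1. \]
Hence, the Neumann series for $R_{SS}^{-1} = (\pi_S + T)^{-1}$ converges (and is a bounded operator). Furthermore,
\[ \|R_{SS}^{-1} - \pi_S\| = \|(\pi_S + T)^{-1} - \pi_S\| = \|(\pi_S + T)^{-1} - (\pi_S + T)^{-1}(\pi_S + T)\| = \|(\pi_S + T)^{-1}T\| \le (1 - \|T\|)^{-1}\|T\| < 18\,\|R_{SS} - \pi_S\|, \]
as was to be shown.

Lemma 1 says that if $G'$ is such that $G' \subset \{ b \in \Gamma^\# \mid |N_b(k)| \ge \varepsilon|v| \}$ the operator $R_{G'G'}$ has a bounded inverse on $L^2_{G'}$ for $|u| \le 2|v|$, large $|v|$, and weak magnetic potential. We are now able to write local defining equations for $\hat F(A,V)$ under such conditions.

7. Local Defining Equations

In this section we derive local defining equations for the Fermi curve. We begin with a simple proposition.

Proposition 7. Suppose either (i) or (ii) or (iii) where:
(i) $G = \{0\}$ and $k \in T_0 \setminus \bigcup_{b \in \Gamma^\# \setminus \{0\}} T_b$;
(ii) $G = \{0, d\}$ and $k \in T_0 \cap T_d$;
(iii) $G = \emptyset$ and $k \in \mathbb{C}^2 \setminus \bigcup_{b \in \Gamma^\#} T_b$.
Then $G' = \Gamma^\# \setminus G = \{ b \in \Gamma^\# \mid |N_b(k)| \ge \varepsilon|v| \}$.

Proof. The proposition follows easily if we observe that $G' = \Gamma^\# \setminus G$ and recall from (1) that $k \notin T_b \Rightarrow |N_b(k)| \ge \varepsilon|v|$.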
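The numerical constants in Lemma 1 come from a short piece of exact arithmetic, reproduced here as a sketch with Python's `fractions` module. The function name and its default arguments are ours; the inputs are the extreme values allowed by the hypotheses, namely $\|\hat q\|_{l^1}/(\varepsilon|v|) < 1/2$ and $\|\hat A\|_{l^1}/\varepsilon < 2/63$.

```python
from fractions import Fraction


def lemma1_bound(q_term=Fraction(1, 2), a_ratio=Fraction(2, 63)):
    """Upper bound for ||R_SS - pi_S|| from (10).

    q_term  -- bound on ||q^||_{l1} / (eps |v|)
    a_ratio -- bound on ||A^||_{l1} / eps
    """
    return q_term + 14 * a_ratio


bound = lemma1_bound()            # 1/2 + 14 * 2/63 = 1/2 + 4/9
neumann_factor = 1 / (1 - bound)  # factor (1 - ||T||)^{-1} in the Neumann series estimate

if __name__ == "__main__":
    print(bound, neumann_factor)
```

With these inputs `bound` equals 17/18 and `neumann_factor` equals 18, matching the constants in the statement of Lemma 1.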
We now introduce some notation. Let $\mathcal{B}$ be a fundamental cell for $\Gamma^\# \subset \mathbb{R}^2$ (see [9, p. 310]). Then any vector $u \in \mathbb{R}^2$ can be written as $u = \xi + u'$ for some $\xi \in \Gamma^\#$ and $u' \in \mathcal{B}$. Define
\[ \alpha := \sup\{ |u'| \mid u' \in \mathcal{B} \}, \qquad R := \max\Big\{ \alpha,\ 2\Lambda,\ \frac{2}{\varepsilon}\|\hat q\|_{l^1} \Big\}, \qquad K_R := \{ k \in \mathbb{C}^2 \mid |v| \le R \}. \]
We first show that in $\mathbb{C}^2 \setminus K_R$ the Fermi curve is contained in the union of ε-tubes about the free Fermi curve.

Proposition 8 ($\hat F(A,V) \setminus K_R$ is Contained in the Union of ε-Tubes).
\[ \hat F(A,V) \setminus K_R \subset \bigcup_{b \in \Gamma^\#} T_b. \]

Proof. Without loss of generality, we may consider $k \in \mathbb{C}^2$ with real part in $\mathcal{B}$. We now prove that any point outside the region $K_R$ and outside the union of ε-tubes does not belong to $\hat F(A,V)$. Suppose that $k \in \mathbb{C}^2 \setminus (K_R \cup \bigcup_{b \in \Gamma^\#} T_b)$ and recall that $k$ is in $\hat F(A,V)$ if and only if (4) has a nontrivial solution. If we choose $G = \emptyset$ then $G' = \Gamma^\#$ and this equation reads
\[ R_{G'G'}\,\varphi_{G'} = 0. \]
By Proposition 7(iii), we have $G' = \Gamma^\# = \{ b \in \Gamma^\# \mid |N_b(k)| \ge \varepsilon|v| \}$. Furthermore, since $u \in \mathcal{B}$ and $|v| > R \ge \alpha$, it follows that $|u| \le \alpha < |v| < 2|v|$. Consequently, the operator $R_{G'G'}$ has a bounded inverse by Lemma 1. Thus, the only solution of the above equation is $\varphi_{G'} = 0$. That is, there is no nontrivial solution of this equation and therefore $k \notin \hat F(A,V)$.

We are left to study the Fermi curve inside the ε-tubes. There are two types of regions to consider: intersections and non-intersections of tubes. To study non-intersections we choose $G = \{0\}$ and consider the region $(T_0 \setminus \bigcup_{b \in \Gamma^\# \setminus \{0\}} T_b) \setminus K_R$. For intersections we take $G = \{0, d\}$ for some $d \in \Gamma^\# \setminus \{0\}$ and consider $(T_0 \cap T_d) \setminus K_R$. Observe that, since the tubes $T_b$ have the translational property $T_b + c = T_{b+c}$ for all $b, c \in \Gamma^\#$, and the curve $\hat F(A,V)$ is invariant under the action of $\Gamma^\#$, there is no loss of generality in considering only the two regions above. Any other part of the curve can be reached by translation.

Recall that $G' = \Gamma^\# \setminus G$ and for $d', d'' \in G$ and $i, j \in \{1, 2\}$ set
\[ B_{ij}^{d'd''}(k; G) := -4 \sum_{b,c \in G'} \frac{\hat A_i(d' - b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,\hat A_j(c - d''), \]
\[ C_i^{d'd''}(k; G) := -2\hat A_i(d' - d'') + 2 \sum_{b,c \in G'} \frac{\hat q(d' - b) - 2b\cdot\hat A(d' - b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,\hat A_i(c - d'') + 2 \sum_{b,c \in G'} \frac{\hat A_i(d' - b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,\big( \hat q(c - d'') - 2d''\cdot\hat A(c - d'') \big), \]
\[ C_0^{d'd''}(k; G) := \hat q(d' - d'') - 2d''\cdot\hat A(d' - d'') - \sum_{b,c \in G'} \frac{\hat q(d' - b) - 2b\cdot\hat A(d' - b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,\big( \hat q(c - d'') - 2d''\cdot\hat A(c - d'') \big). \tag{13} \]
Then,
\[ D_{d',d''}(k; G) := w_{d',d''} - \sum_{b,c \in G'} \frac{w_{d',b}}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\, w_{c,d''} = B_{11}^{d'd''}k_1^2 + B_{22}^{d'd''}k_2^2 + \big( B_{12}^{d'd''} + B_{21}^{d'd''} \big)k_1k_2 + C_1^{d'd''}k_1 + C_2^{d'd''}k_2 + C_0^{d'd''}. \]
These functions have the following property.

Proposition 9. For $d', d'' \in G$ and $i, j \in \{1, 2\}$, the functions $B_{ij}^{d'd''}$, $C_i^{d'd''}$, $C_0^{d'd''}$ (and consequently $D_{d',d''}$) are analytic on $(T_0 \setminus \bigcup_{b \in \Gamma^\# \setminus \{0\}} T_b) \setminus K_R$ and $(T_0 \cap T_d) \setminus K_R$ for $G = \{0\}$ and $G = \{0, d\}$, respectively.

Sketch of the proof. It suffices to show that $B_{ij}^{d'd''}$, $C_i^{d'd''}$ and $C_0^{d'd''}$ are analytic functions. This property follows from the fact that all the series involved in the definition of these functions are uniformly convergent sums of analytic functions. The argument is similar for all cases. See [13] for details.

Using the above functions we can write (local) defining equations for the Fermi curve.

Lemma 2 (Local Defining Equations for $\hat F(A,V)$).
(i) Let $G = \{0\}$ and $k \in (T_0 \setminus \bigcup_{b \in \Gamma^\# \setminus \{0\}} T_b) \setminus K_R$. Then $k \in \hat F(A,V)$ if and only if
\[ N_0(k) + D_{0,0}(k) = 0. \]
(ii) Let $G = \{0, d\}$ and $k \in (T_0 \cap T_d) \setminus K_R$. Then $k \in \hat F(A,V)$ if and only if
\[ (N_0(k) + D_{0,0}(k))(N_d(k) + D_{d,d}(k)) - D_{0,d}(k)\,D_{d,0}(k) = 0. \]

Proof. We only prove part (i). The proof of part (ii) is similar. First, by Proposition 7(i) we have $G' = \Gamma^\# \setminus \{0\} = \{ b \in \Gamma^\# \mid |N_b(k)| \ge \varepsilon|v| \}$. Furthermore, since $k \in T_0$, we have either $|v - u^\perp| < \varepsilon$ or $|v + u^\perp| < \varepsilon$. In either case this implies
$|u| < \varepsilon + |v| < 2\Lambda + |v| < 2|v|$. Hence, the operator $R_{G'G'}$ has a bounded inverse by Lemma 1. Thus, in the region under consideration $\hat F(A,V)$ is given by (7):
\[ 0 = N_0(k) + w_{0,0} - \sum_{b,c \in G'} \frac{w_{0,b}}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\, w_{c,0} = N_0(k) + D_{0,0}(k). \]
This is the desired expression.

To study in detail the defining equations above we shall estimate the asymptotic behavior of the functions $B_{ij}^{d'd''}$, $C_i^{d'd''}$, $C_0^{d'd''}$ and $D_{d',d''}$ for large $|v|$. (We sometimes refer to these functions as coefficients.) Since all these functions have a similar form it is convenient to prove these estimates in a general setting and specialize them later. This is the content of Secs. 9 and 10. We next introduce a change of variables in $\mathbb{C}^2$ that will be useful for proving these bounds.

8. Change of Coordinates

Define the (complementary) index $\nu'$ as
\[ \nu' := \nu - (-1)^\nu. \]
Observe that $\nu' = 2$ if $\nu = 1$, $\nu' = 1$ if $\nu = 2$, and $(-1)^{\nu'} = -(-1)^\nu$. The following change of coordinates in $\mathbb{C}^2$ will be useful for our analysis. For $\nu \in \{1, 2\}$ and $d', d'' \in G$ define the functions $w_{\nu,d'}, z_{\nu,d'} : \mathbb{C}^2 \to \mathbb{C}$ as
\[ w_{\nu,d'}(k) := k_1 + d_1' + i(-1)^\nu (k_2 + d_2'), \qquad z_{\nu,d'}(k) := k_1 + d_1' - i(-1)^\nu (k_2 + d_2'). \tag{14} \]
Observe that the transformation $(k_1, k_2) \mapsto (w_{\nu,d'}, z_{\nu,d'})$ is just a translation composed with a rotation. Furthermore, if $k \in T_\nu(d') \setminus K_R$ then $|w_{\nu,d'}(k)|$ is "small" and $|z_{\nu,d'}(k)|$ is "large". Indeed,
\[ |w_{\nu,d'}(k)| = |N_{d',\nu}(k)| < \varepsilon \qquad\text{and}\qquad |z_{\nu,d'}(k)| = |N_{d',\nu'}(k)| \ge |v| > R. \]
Define also
\[ J_\nu^{d'd''} := \tfrac{1}{4}\big( B_{11}^{d'd''} - B_{22}^{d'd''} + i(-1)^\nu (B_{12}^{d'd''} + B_{21}^{d'd''}) \big), \]
\[ K^{d'd''} := \tfrac{1}{2}\big( B_{11}^{d'd''} + B_{22}^{d'd''} \big), \]
\[ L_\nu^{d'd''} := -d_1' B_{11}^{d'd''} - i(-1)^\nu d_2' B_{22}^{d'd''} - \tfrac{1}{2}\big( d_2' + i(-1)^\nu d_1' \big)\big( B_{12}^{d'd''} + B_{21}^{d'd''} \big) + \tfrac{1}{2}\big( C_1^{d'd''} + i(-1)^\nu C_2^{d'd''} \big), \]
\[ M^{d'd''} := d_1'^2 B_{11}^{d'd''} + d_2'^2 B_{22}^{d'd''} + d_1' d_2' \big( B_{12}^{d'd''} + B_{21}^{d'd''} \big) - d_1' C_1^{d'd''} - d_2' C_2^{d'd''} + C_0^{d'd''}, \]
where $J_\nu^{d'd''}$, $K^{d'd''}$, $L_\nu^{d'd''}$ and $M^{d'd''}$ are functions of $k \in \mathbb{C}^2$ that also depend on the choice of $G \subset \Gamma^\#$. Using these functions we can express $N_{d'}(k) + D_{d',d'}(k)$ and $D_{d',d''}(k)$ as follows.
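Proposition 10. Let $\nu \in \{1, 2\}$ and let $d', d'' \in G$. Then,
\[
\begin{aligned}
N_{d'} + D_{d',d''} &= J_{\nu'}^{d'd''} w_{\nu,d'}^2 + J_{\nu}^{d'd''} z_{\nu,d'}^2 + (1 + K^{d'd''})\, w_{\nu,d'} z_{\nu,d'} + L_{\nu'}^{d'd''} w_{\nu,d'} + L_{\nu}^{d'd''} z_{\nu,d'} + M^{d'd''},\\
D_{d',d''} &= J_{\nu'}^{d'd''} w_{\nu,d'}^2 + J_{\nu}^{d'd''} z_{\nu,d'}^2 + K^{d'd''} w_{\nu,d'} z_{\nu,d'} + L_{\nu'}^{d'd''} w_{\nu,d'} + L_{\nu}^{d'd''} z_{\nu,d'} + M^{d'd''}.
\end{aligned}
\]
Furthermore,
\[
\begin{aligned}
J_\nu^{d'd''}(k) &= -\sum_{b,c\in G'} \frac{(1, -i(-1)^\nu)\cdot \hat A(d'-b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,(1, -i(-1)^\nu)\cdot \hat A(c-d''),\\
K^{d'd''}(k) &= -2\sum_{b,c\in G'} \frac{\hat A(d'-b)\cdot \hat A(c-d'')}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c},\\
L_\nu^{d'd''}(k) &= \sum_{b,c\in G'} \frac{\hat q(d'-b) + 2(d'-b)\cdot \hat A(d'-b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,(1, i(-1)^\nu)\cdot \hat A(c-d'')\\
&\quad + \sum_{b,c\in G'} \frac{(1, i(-1)^\nu)\cdot \hat A(d'-b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,\big(\hat q(c-d'') + 2(d'-d'')\cdot \hat A(c-d'')\big) - (1, i(-1)^\nu)\cdot \hat A(d'-d''),\\
M^{d'd''}(k) &= -\sum_{b,c\in G'} \frac{\hat q(d'-b) + 2(d'-b)\cdot \hat A(d'-b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,\hat q(c-d'') + \hat q(d'-d'') + 2(d'-d'')\cdot \hat A(d'-d'').
\end{aligned}
\]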
Proof. To simplify the notation write $w = w_{\nu,d'}$, $z = z_{\nu,d'}$, $B_{ij} = B_{ij}^{d'd''}$ and $C_i = C_i^{d'd''}$. First observe that, in view of (14),
\[
N_{d'} = \big(k_1 + d'_1 + i(-1)^\nu (k_2 + d'_2)\big)\big(k_1 + d'_1 - i(-1)^\nu (k_2 + d'_2)\big) = wz.
\]
Furthermore,
\[
\begin{aligned}
k_1 &= \tfrac12 (w + z) - d'_1,\\
k_2 &= \frac{(-1)^\nu}{2i}(w - z) - d'_2,\\
k_1^2 &= \tfrac14 (w^2 + z^2) + \tfrac12 wz - d'_1 (w + z) + d_1'^2,\\
k_2^2 &= -\tfrac14 (w^2 + z^2) + \tfrac12 wz + i(-1)^\nu d'_2 (w - z) + d_2'^2,
\end{aligned}
\]
and
\[
k_1 k_2 = \frac{i(-1)^\nu}{4}(z^2 - w^2) - \tfrac12 (d'_2 - i(-1)^\nu d'_1)\,w - \tfrac12 (d'_2 + i(-1)^\nu d'_1)\,z + d'_1 d'_2.
\]
September 14, J070-S0129055X10004107
2010 13:29 WSPC/S0129-055X
148-RMP
Asymptotics for Fermi Curves: Small Magnetic Potential
901
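Hence,
\[
\begin{aligned}
D_{d',d''} &= B_{11} k_1^2 + B_{22} k_2^2 + (B_{12} + B_{21}) k_1 k_2 + C_1 k_1 + C_2 k_2 + C_0\\
&= \tfrac14\big(B_{11} - B_{22} - i(-1)^\nu (B_{12} + B_{21})\big) w^2 + \tfrac14\big(B_{11} - B_{22} + i(-1)^\nu (B_{12} + B_{21})\big) z^2\\
&\quad + \Big(-d'_1 B_{11} + i(-1)^\nu d'_2 B_{22} - \tfrac12 (d'_2 - i(-1)^\nu d'_1)(B_{12} + B_{21}) + \tfrac12 (C_1 - i(-1)^\nu C_2)\Big) w\\
&\quad + \Big(-d'_1 B_{11} - i(-1)^\nu d'_2 B_{22} - \tfrac12 (d'_2 + i(-1)^\nu d'_1)(B_{12} + B_{21}) + \tfrac12 (C_1 + i(-1)^\nu C_2)\Big) z\\
&\quad + d_1'^2 B_{11} + d_2'^2 B_{22} + d'_1 d'_2 (B_{12} + B_{21}) - d'_1 C_1 - d'_2 C_2 + C_0 + \tfrac12 (B_{11} + B_{22})\, wz\\
&= J_{\nu'}^{d'd''} w^2 + J_{\nu}^{d'd''} z^2 + K^{d'd''} wz + L_{\nu'}^{d'd''} w + L_{\nu}^{d'd''} z + M^{d'd''}.
\end{aligned}
\]
This proves the formula for $D_{d',d''}$. Consequently, since $N_{d'} = wz$,
\[
N_{d'} + D_{d',d''} = J_{\nu'}^{d'd''} w^2 + J_{\nu}^{d'd''} z^2 + (1 + K^{d'd''}) wz + L_{\nu'}^{d'd''} w + L_{\nu}^{d'd''} z + M^{d'd''},
\]
which proves the formula for $N_{d'} + D_{d',d''}$. Now, again to simplify the notation write
\[
f g = \sum_{b,c\in G'} \frac{\hat f(b, d')}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,\hat g(c, d''),
\]
that is, to represent sums of this form suppress the summation and the other factors. Note that $fg \ne gf$ in general according to this notation. Then, substituting (13) into the definition of $J_\nu^{d'd''}$ we have
\[
\begin{aligned}
J_\nu^{d'd''} &= \tfrac14\big(B_{11} - B_{22} + i(-1)^\nu (B_{12} + B_{21})\big)\\
&= -A_1 A_1 + A_2 A_2 - i(-1)^\nu (A_1 A_2 + A_2 A_1)\\
&= (A_1 - i(-1)^\nu A_2)(-A_1 + i(-1)^\nu A_2)\\
&= -\big((1, -i(-1)^\nu)\cdot A\big)\big((1, -i(-1)^\nu)\cdot A\big)\\
&= -\sum_{b,c\in G'} \frac{(1, -i(-1)^\nu)\cdot \hat A(d'-b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,(1, -i(-1)^\nu)\cdot \hat A(c-d'').
\end{aligned}
\]
Similarly, substituting (13) into the definitions of $K^{d'd''}$, $L_\nu^{d'd''}$ and $M^{d'd''}$ we derive the other expressions.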
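The algebraic identities used in this proof can be spot-checked numerically. The following is a minimal sketch, not part of the original argument: the helper names `wz` and `max_identity_error` are hypothetical, and the sketch assumes only the definitions in (14) together with the factorization $N_{d'} = wz$, written out as $(k_1 + d'_1)^2 + (k_2 + d'_2)^2$.

```python
import random

def wz(k1, k2, d1, d2, nu):
    """Return (w, z) from (14) for a shift d' = (d1, d2) and index nu in {1, 2}."""
    s = (-1) ** nu
    w = k1 + d1 + 1j * s * (k2 + d2)
    z = k1 + d1 - 1j * s * (k2 + d2)
    return w, z

def max_identity_error(nu, trials=200, seed=0):
    """Largest residual of the identities for N, k1, k2, k1^2, k2^2 and k1*k2."""
    rng = random.Random(seed)
    s = (-1) ** nu
    worst = 0.0
    for _ in range(trials):
        # k is a point of C^2; d' is taken real, as for a lattice vector.
        k1 = complex(rng.uniform(-3, 3), rng.uniform(-3, 3))
        k2 = complex(rng.uniform(-3, 3), rng.uniform(-3, 3))
        d1, d2 = rng.uniform(-5, 5), rng.uniform(-5, 5)
        w, z = wz(k1, k2, d1, d2, nu)
        residuals = [
            w * z - ((k1 + d1) ** 2 + (k2 + d2) ** 2),
            k1 - (0.5 * (w + z) - d1),
            k2 - (s * (w - z) / 2j - d2),
            k1 ** 2 - (0.25 * (w * w + z * z) + 0.5 * w * z - d1 * (w + z) + d1 ** 2),
            k2 ** 2 - (-0.25 * (w * w + z * z) + 0.5 * w * z
                       + 1j * s * d2 * (w - z) + d2 ** 2),
            k1 * k2 - (0.25j * s * (z * z - w * w)
                       - 0.5 * (d2 - 1j * s * d1) * w
                       - 0.5 * (d2 + 1j * s * d1) * z + d1 * d2),
        ]
        worst = max(worst, max(abs(r) for r in residuals))
    return worst
```

Calling `max_identity_error(1)` and `max_identity_error(2)` should return values at the level of floating-point rounding, which is consistent with the expansions of $k_1^2$, $k_2^2$ and $k_1 k_2$ in terms of $w$ and $z$.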
9. Asymptotics for the Coefficients

Let f and g be functions on $\Gamma^\#$ and for $k \in \mathbb{C}^2$ and $d', d'' \in G$ set
\[
\Phi_{d',d''}(k; G) := \sum_{b,c\in G'} \frac{f(d'-b)}{N_b(k)}\,(R_{G'G'}^{-1})_{b,c}\,g(c-d'').
\tag{15}
\]
In this section we study the asymptotic behaviour of the function $\Phi_{d',d''}(k)$ for k in the union of ε-tubes with large |v|. Here we only give the statements. See Appendix B for the proofs. Reset the constant R as
\[
R := \max\Big\{1,\ \alpha,\ 2\Lambda,\ 140\,\|\hat A\|_{l^1},\ \frac{4}{\varepsilon}\,\|(1 + b^2)\hat q(b)\|_{l^1}\Big\},
\tag{16}
\]
and make the following hypothesis.

Hypothesis 1.
\[
\|b^2 \hat q(b)\|_{l^1} < \infty \quad\text{and}\quad \|(1 + b^2)\hat A(b)\|_{l^1} < \frac{2}{63}\,\varepsilon.
\]
Our first lemma provides an expansion for $\Phi_{d',d''}(k)$ "in powers of $1/|z_{\nu,d'}(k)|$".
Lemma 3 (Asymptotics for $\Phi_{d',d''}(k)$). Under Hypothesis 1, let $\nu \in \{1, 2\}$ and let f and g be functions on $\Gamma^\#$ with $\|b^2 f(b)\|_{l^1} < \infty$ and $\|b^2 g(b)\|_{l^1} < \infty$. Suppose either (i) or (ii) where:
(i) $G = \{0\}$ and $k \in (T_\nu(0) \setminus \bigcup_{b\in G'} T_b) \setminus K_R$;
(ii) $G = \{0, d\}$ and $k \in (T_\nu(0) \cap T_{\nu'}(d)) \setminus K_R$.
Then, for $(\mu, d') = (\nu, 0)$ if (i) or $(\mu, d') \in \{(\nu, 0), (\nu', d)\}$ if (ii),
\[
\Phi_{d',d''}(k) = \alpha_{\mu,d'}^{(1)}(k) + \alpha_{\mu,d'}^{(2)}(k) + \alpha_{\mu,d'}^{(3)}(k),
\]
where for $1 \le j \le 2$,
\[
|\alpha_{\mu,d'}^{(j)}(k)| \le \frac{C_j}{(2|z_{\mu,d'}(k)| - R)^j} \quad\text{and}\quad |\alpha_{\mu,d'}^{(3)}(k)| \le \frac{C_3}{|z_{\mu,d'}(k)|\,R^2},
\]
where $C_j = C_{j;\Lambda,A,q,f,g}$ and $C_3 = C_{3;\varepsilon,\Lambda,A,q,f,g}$ are constants. Furthermore, the functions $\alpha_{\mu,d'}^{(j)}(k)$ are given by (66) and (69) and are analytic in the region under consideration.
Below we have more information about the function $\alpha_{\mu,d'}^{(1)}(k)$.

Lemma 4 (Asymptotics for $\alpha_{\mu,d'}^{(1)}(k)$). Consider the same hypotheses of Lemma 3. Then, for $(\mu, d') = (\nu, 0)$ if (i) or $(\mu, d') \in \{(\nu, 0), (\nu', d)\}$ if (ii),
\[
z_{\mu,d'}(k)\,\alpha_{\mu,d'}^{(1)}(k) = \alpha_{\mu,d'}^{(1,0)} + \alpha_{\mu,d'}^{(1,1)}(w(k)) + \alpha_{\mu,d'}^{(1,2)}(k) + \alpha_{\mu,d'}^{(1,3)}(k),
\]
where $\alpha_{\mu,d'}^{(1,0)}$ is a constant given by (80), and the remaining functions $\alpha_{\mu,d'}^{(1,j)}$ are given by (79). Furthermore, for $0 \le j \le 2$,
\[
|\alpha_{\mu,d'}^{(1,j)}| \le C_j \quad\text{and}\quad |\alpha_{\mu,d'}^{(1,3)}| \le \frac{C_3}{2|z_{\mu,d'}(k)| - R},
\]
where $C_j = C_{j;\Lambda,A,f,g}$ and $C_3 = C_{3;\Lambda,A,f,g}$ are constants given by (81).
The next lemma estimates the decay of $\Phi_{d',d''}(k)$ with respect to $z_{\nu',d}(k)$ for $d' \ne d''$.

Lemma 5 (Decay of $\Phi_{d',d''}(k)$ for $d' \ne d''$). Under Hypothesis 1, let $\nu \in \{1, 2\}$ and let f and g be functions on $\Gamma^\#$ with $\|b^2 f(b)\|_{l^1} < \infty$ and $\|b^2 g(b)\|_{l^1} < \infty$. Suppose further that $G = \{0, d\}$ and $k \in (T_\nu(0) \cap T_{\nu'}(d)) \setminus K_R$. Then, for $d', d'' \in G$ with $d' \ne d''$,
\[
|\Phi_{d',d''}(k)| \le \frac{C_{\Gamma^\#,\varepsilon,f,g}}{|z_{\nu',d}(k)|^{3 - 10^{-1}}},
\]
where $C_{\Gamma^\#,\varepsilon,f,g}$ is a constant.
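The next proposition relates the quantities |v|, $|k_2|$, $|z_{\nu,d'}(k)|$ and |d| for k in the ε-tubes with large |v|.

Proposition 11. For $\nu \in \{1, 2\}$ we have:
(i) Let $k \in T_\nu(0) \setminus K_R$. Then,
\[
\frac{1}{|z_{\nu,0}(k)|} \le \frac{1}{|v|} \le \frac{3}{|z_{\nu,0}(k)|} \quad\text{and}\quad \frac{1}{4|v|} \le \frac{1}{|k_2|} \le \frac{8}{|v|}.
\]
(ii) Let $k \in (T_\nu(0) \cap T_{\nu'}(d)) \setminus K_R$. Then,
\[
\frac{1}{|z_{\nu,0}(k)|} \le \frac{1}{|v|} \le \frac{3}{|z_{\nu,0}(k)|}, \qquad
\frac{1}{|z_{\nu',d}(k)|} \le \frac{1}{|v|} \le \frac{3}{|z_{\nu',d}(k)|}, \qquad
\frac{1}{2|z_{\nu',d}(k)|} \le \frac{1}{|d|} \le \frac{2}{|z_{\nu',d}(k)|}.
\]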
10. Bounds on the Derivatives

In the last section, we expressed $\Phi_{d',d''}(k)$ as a sum of certain functions $\alpha_{\mu,d'}^{(j)}(k)$ for k in the ε-tubes with large |v|. In this section we provide bounds for the derivatives of all these functions. Here we only give the statements. See Appendix C for the proofs. Our first lemma concerns the derivatives of $\Phi_{d',d''}(k)$.

Lemma 6 (Derivatives of $\Phi_{d',d''}(k)$). Under Hypothesis 1, let f and g be functions in $l^1(\Gamma^\#)$ and suppose either (i) or (ii) where:
(i) $G = \{0\}$ and $k \in (T_0 \setminus \bigcup_{b\in G'} T_b) \setminus K_R$;
(ii) $G = \{0, d\}$ and $k \in (T_0 \cap T_d) \setminus K_R$.
Then, for any integers n and m with $n + m \ge 1$ and for any $d', d'' \in G$,
\[
\left| \frac{\partial^{n+m}}{\partial k_1^n\, \partial k_2^m}\, \Phi_{d',d''}(k) \right| \le \frac{C}{|v|},
\]
where C is a constant with $C = C_{\varepsilon,\Lambda,A,f,g,m,n}$ if (i) or $C = C_{\Lambda,A,f,g,m,n}$ if (ii).

We now improve the estimate of Lemma 6(ii) for $d' \ne d''$.
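Lemma 7 (Derivatives of $\Phi_{d',d''}(k)$ for $d' \ne d''$). Consider a constant $\beta \ge 2$ and suppose that $\||b|^\beta \hat q(b)\|_{l^1} < \infty$ and $\|(1 + |b|^\beta)\hat A(b)\|_{l^1} < 2\varepsilon/63$. Let $\nu \in \{1, 2\}$ and let f and g be functions on $\Gamma^\#$ obeying $\||b|^\beta f(b)\|_{l^1} < \infty$ and $\||b|^\beta g(b)\|_{l^1} < \infty$. Suppose further that $G = \{0, d\}$ and $k \in T_0 \cap T_d$ with $|v| > \frac{2}{\varepsilon}\||b|^\beta \hat q(b)\|_{l^1}$. Then, for any integers n and m with $n + m \ge 0$ and for any $d', d'' \in G$ with $d' \ne d''$,
\[
\left| \frac{\partial^{n+m}}{\partial k_1^n\, \partial k_2^m}\, \Phi_{d',d''}(k) \right| \le \frac{C}{|d|^{1+\beta}},
\]
where $C = C_{\varepsilon,\Lambda,A,f,g,m,n}$ is a constant.

Observe that, in particular, this lemma with m = n = 0 generalizes Lemma 5. We next have bounds for the derivatives of $\alpha_{\mu,d'}^{(j)}(k)$.

Lemma 8 (Derivatives of $\alpha_{\mu,d'}^{(j)}(k)$). Under Hypothesis 1, let $\nu \in \{1, 2\}$ and let f and g be functions in $l^1(\Gamma^\#)$. Suppose either (i) or (ii) where:
(i) $G = \{0\}$ and $k \in (T_\nu(0) \setminus \bigcup_{b\in G'} T_b) \setminus K_R$;
(ii) $G = \{0, d\}$ and $k \in (T_\nu(0) \cap T_{\nu'}(d)) \setminus K_R$.
Then, there is a constant $\rho = \rho_{\varepsilon,A,q,m,n}$ with $\rho \ge R$ such that, for $|v| \ge \rho$ and for $(\mu, d') = (\nu, 0)$ if (i) or $(\mu, d') \in \{(\nu, 0), (\nu', d)\}$ if (ii), for any integers n and m with $n + m \ge 1$ and for $1 \le j \le 2$,
\[
\left| \frac{\partial^{n+m}}{\partial k_1^n\, \partial k_2^m}\, \alpha_{\mu,d'}^{(j)}(k) \right| \le \frac{C_j}{(2|z_{\mu,d'}(k)| - R)^j}
\quad\text{and}\quad
\left| \frac{\partial^{n+m}}{\partial k_1^n\, \partial k_2^m}\, \alpha_{\mu,d'}^{(3)}(k) \right| \le \frac{C_3}{|z_{\mu,d'}(k)|\,R^2},
\]
where $C_l = C_{l;f,g,\Lambda,A,q,n,m}$ for $1 \le l \le 3$ are constants. Furthermore,
\[
C_{1;f,g,\Lambda,A,1,0},\ C_{1;f,g,\Lambda,A,0,1} \le \frac{13}{\Lambda^2}\,\|f\|_{l^1}\|g\|_{l^1}
\quad\text{and}\quad
C_{1;f,g,\Lambda,A,1,1} \le \frac{65}{\Lambda^3}\,\|f\|_{l^1}\|g\|_{l^1}.
\]

11. The Regular Piece

Proof of Theorem 1. Step 1 (Defining Equation). We first derive a defining equation for the Fermi curve. Without loss of generality we may assume that $\hat A(0) = 0$. Let $G = \{0\}$, recall that $G' = \Gamma^\# \setminus \{0\}$, and consider the region $(T_\nu(0) \setminus \bigcup_{b\in G'} T_b) \setminus K_\rho$, where ρ is a constant to be chosen sufficiently large obeying $\rho \ge R$. By Proposition 7(i) we have $G' = \{b \in \Gamma^\# \mid |N_b(k)| \ge \varepsilon|v|\}$. To simplify the notation write
\[
M_\nu := \hat F(A, V) \cap \Big( T_\nu(0) \setminus \Big( K_\rho \cup \bigcup_{b\in\Gamma^\#\setminus\{0\}} T_b \Big) \Big).
\]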
By Lemma 2(i), a point k is in Mν if and only if N0 (k) + D0,0 (k) = 0.
By Proposition 10, if we set
\[
w(k) := w_{\nu,0}(k) = k_1 + i(-1)^\nu k_2 \quad\text{and}\quad z(k) := z_{\nu,0}(k) = k_1 - i(-1)^\nu k_2,
\]
this equation becomes
\[
\beta_1 w^2 + \beta_2 z^2 + (1 + \beta_3) wz + \beta_4 w + \beta_5 z + \beta_6 + \hat q(0) = 0,
\tag{17}
\]
where
\[
\beta_1 := J_{\nu'}^{00},\quad \beta_2 := J_{\nu}^{00},\quad \beta_3 := K^{00},\quad \beta_4 := L_{\nu'}^{00},\quad \beta_5 := L_{\nu}^{00},\quad \beta_6 := M^{00} - \hat q(0),
\]
with $J_\nu^{00}$, $K^{00}$, $L_\nu^{00}$ and $M^{00}$ given by Proposition 10. Observe that all the coefficients $\beta_1, \dots, \beta_6$ have exactly the same form as the function $\Phi_{0,0}(k)$ of Lemma 3(i) (see (15)). Thus, by this lemma, for $1 \le i \le 6$ we have
\[
\beta_i = \beta_i^{(1)} + \beta_i^{(2)} + \beta_i^{(3)},
\tag{18}
\]
where the function $\beta_i^{(j)}$ is analytic in the region under consideration with
\[
|\beta_i^{(j)}(k)| \le \frac{C}{(2|z(k)| - \rho)^j} \le \frac{C}{|z(k)|^j} \ \text{ for } 1 \le j \le 2 \quad\text{and}\quad |\beta_i^{(3)}(k)| \le \frac{C}{|z(k)|\,\rho^2},
\]
where $C = C_{\varepsilon,\Lambda,q,A}$ is a constant. The exact expression for $\beta_i^{(j)}$ can be easily obtained from the definitions and from Lemma 3(i). Substituting (18) into (17) and dividing both sides of the equation by z yields
\[
w + \beta_2^{(1)} z + g = 0,
\tag{19}
\]
where
\[
g := \frac{\beta_1 w^2}{z} + (\beta_2^{(2)} + \beta_2^{(3)}) z + \beta_3 w + \frac{\beta_4 w}{z} + \beta_5 + \frac{\beta_6}{z} + \frac{\hat q(0)}{z}
\tag{20}
\]
obeys
\[
|g(k)| \le \frac{C}{\rho},
\tag{21}
\]
with a constant $C = C_{\varepsilon,\Lambda,q,A}$. Therefore, a point k is in $M_\nu$ if and only if F(k) = 0, where
\[
F(k) := w(k) + \beta_2^{(1)}(k)\, z(k) + g(k)
\]
is an analytic function (in the region under consideration).

Step 2 (Candidates for a Solution). Let us now identify which points are candidates to solve the equation F(k) = 0. First observe that, by Proposition 2(c) the lines
$N_\nu(0)$ and $N_{\nu'}(d)$ intersect at
\[
N_\nu(0) \cap N_{\nu'}(d) = \{(i\theta_\nu(d),\ (-1)^{\nu'}\theta_\nu(d))\}.
\]
Hence, the second coordinate of this point and the second coordinate of a point k differ by
\[
\mathrm{pr}(k) - \mathrm{pr}(N_\nu(0) \cap N_{\nu'}(d)) = k_2 - (-1)^{\nu'}\theta_\nu(d) = k_2 + (-1)^\nu \theta_\nu(d).
\]
Now observe that, if $k \in T_\nu(0) \cap T_{\nu'}(d)$ then $|k_1 + i(-1)^\nu k_2| < \varepsilon$ and
\[
|k_2 + (-1)^\nu \theta_\nu(d)| = \Big| \tfrac12 (k_1 + i(-1)^\nu k_2) - \tfrac12 \big(k_1 + d_1 - i(-1)^\nu (k_2 + d_2)\big) \Big| = \tfrac12 |N_{0,\nu}(k) - N_{d,\nu'}(k)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\]
That is, the second coordinate of k and the second coordinate of $N_\nu(0) \cap N_{\nu'}(d)$ must be apart from each other by at most ε. This gives a necessary condition on the second coordinate of a point k for being in $M_\nu$. Conversely, if a point k is in the (ε/4)-tube inside $T_\nu(0)$, that is, $|k_1 + i(-1)^\nu k_2| < \varepsilon/4$, and its second coordinate differs from the second coordinate of $N_\nu(0) \cap N_{\nu'}(d)$ by at most ε/4, that is, $|k_2 + (-1)^\nu \theta_\nu(d)| < \varepsilon/4$, then
\[
|N_{d,\nu'}(k)| = |N_{0,\nu}(k) - 2(k_2 + (-1)^\nu \theta_\nu(d))| \le \frac{\varepsilon}{4} + 2\,\frac{\varepsilon}{4} < \varepsilon,
\]
that is, the point k is also in $T_{\nu'}(d)$ and hence lies in the intersection $T_\nu(0) \cap T_{\nu'}(d)$. This gives a sufficient condition on the first and second coordinates of a point k for being in $T_\nu(0) \cap T_{\nu'}(d)$. For $y \in \mathbb{C}$ define the set of candidates for a solution of F(k) = 0 as
\[
M_\nu(y) := \mathrm{pr}^{-1}(y) \cap \Big( T_\nu(0) \setminus \bigcup_{b\in\Gamma^\#\setminus\{0\}} T_b \Big) = \mathrm{pr}^{-1}(y) \cap \Big( T_\nu(0) \setminus \bigcup_{b\in\Gamma^\#\setminus\{0\}} T_{\nu'}(b) \Big).
\]
Observe that, if $|y + (-1)^\nu \theta_\nu(b)| \ge \varepsilon$ for all $b \in \Gamma^\# \setminus \{0\}$ then
\[
M_\nu(y) = \mathrm{pr}^{-1}(y) \cap T_\nu(0) = \{(k_1, y) \in \mathbb{C}^2 \mid |k_1 + i(-1)^\nu y| < \varepsilon\}.
\tag{22}
\]
On the other hand, if $|y + (-1)^\nu \theta_\nu(d)| < \varepsilon$ for some $d \in \Gamma^\# \setminus \{0\}$, then there is at most one such d and consequently
\[
M_\nu(y) = \mathrm{pr}^{-1}(y) \cap (T_\nu(0) \setminus T_{\nu'}(d)) = \{(k_1, y) \in \mathbb{C}^2 \mid |k_1 + i(-1)^\nu y| < \varepsilon \ \text{and}\ |k_1 + d_1 + i(-1)^{\nu'}(y + d_2)| \ge \varepsilon\}.
\tag{23}
\]
Indeed, suppose there is another $d' \ne 0$ such that $|y + (-1)^\nu \theta_\nu(d')| < \varepsilon$. Then,
\[
|d - d'| = |2(-1)^\nu \theta_\nu(d - d')| = |y + (-1)^\nu \theta_\nu(d) - (y + (-1)^\nu \theta_\nu(d'))| \le 2\varepsilon < 2\Lambda,
\]
which contradicts the definition of Λ. Thus, there is no such $d' \ne 0$.
Step 3 (Uniqueness). We now prove that, given $k_2$, if there exists a solution $k_1(k_2)$ of $F(k_1, k_2) = 0$, then this solution is unique and it depends analytically on $k_2$. This follows easily using the implicit function theorem and the estimates below, which we prove later.

Proposition 12. Under the hypotheses of Theorem 1 we have
\[
|F(k) - w(k)| \le \frac{\varepsilon}{900} + \frac{C_1}{\rho},
\tag{a}
\]
\[
\left| \frac{\partial F}{\partial k_1}(k) - 1 \right| \le \frac{1}{7\cdot 3^4} + \frac{C_2}{\rho},
\tag{b}
\]
where the constants $C_1$ and $C_2$ depend only on ε, Λ, q and A.

Now suppose that $(k_1, y) \in M_\nu(y)$. Then,
\[
\left| \frac{\partial F}{\partial k_1}(k_1, y) - 1 \right| \le \frac{1}{7\cdot 3^4} + \frac{C_2}{\rho}.
\]
Hence, by the implicit function theorem, by choosing the constant $\rho \ge R$ sufficiently large, if $F(k_1^*, y) = 0$ for some $(k_1^*, y) \in M_\nu(y)$, then there is a neighborhood $U \times V \subset \mathbb{C}^2$ which contains $(k_1^*, y)$, and an analytic function $\eta : V \to U$ such that $F(k_1, k_2) = 0$ for all $(k_1, k_2) \in U \times V$ if and only if $k_1 = \eta(k_2)$. In particular this implies that the equation $F(k_1, k_2) = 0$ has at most one solution $(\eta(y), y)$ in $M_\nu(y)$ for each $y \in \mathbb{C}$. We next look for conditions on y to have a solution or have no solution in $M_\nu(y)$.

Step 4 (Existence). We first state an improved version of Proposition 12(a).

Proposition 13. Under the hypotheses of Theorem 1 we have
\[
F(k) - w(k) = \beta_2^{(1,0)} + \beta_2^{(1,1)}(w(k)) + \beta_2^{(1,2)}(k) + h(k),
\]
where
\[
\beta_2^{(1,0)} = -2i \sum_{b,c\in G_1} \frac{\theta_{\nu'}(\hat A(-b))}{\theta_{\nu'}(b)} \left[ \delta_{b,c} + \frac{\theta_{\nu'}(\hat A(b-c))}{\theta_{\nu'}(c)} \right] \theta_\nu(\hat A(c))
\tag{24}
\]
is a constant that depends only on ρ and A and
\[
h := \beta_2^{(1,3)} + g.
\]
Furthermore,
\[
|\beta_2^{(1,0)}| < \frac{1}{100\Lambda}\,\varepsilon^2, \qquad
|\beta_2^{(1,1)}(k)| < \frac{1}{40\Lambda^2}\,\varepsilon^3, \qquad
|\beta_2^{(1,2)}(k)| < \frac{1}{7^4\Lambda^3}\,\varepsilon^4, \qquad
|h(k)| \le C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}.
\]
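We now derive conditions for the existence of solutions. Suppose that $F(\eta(y), y) = 0$. Then, since $\eta(y) + i(-1)^\nu y = w(\eta(y), y)$ and $\varepsilon < \Lambda/6$, using the above proposition we obtain
\[
|\eta(y) + i(-1)^\nu y| = |w(\eta(y), y)| = |F(\eta(y), y) - w(\eta(y), y)| \le \frac{\varepsilon^2}{100\Lambda} + \frac{\varepsilon^3}{40\Lambda^2} + \frac{\varepsilon^4}{7^4\Lambda^3} + \frac{C}{\rho} \le \frac{\varepsilon^2}{50\Lambda} + \frac{C}{\rho}.
\]
Hence, by choosing the constant ρ sufficiently large we find that
\[
|\eta(y) + i(-1)^\nu y| < \frac{\varepsilon^2}{40\Lambda}.
\]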
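The arithmetic behind the last two inequalities can be made explicit. The sketch below is only an illustration under the standing assumption $\varepsilon < \Lambda/6$; the function names are hypothetical. After dividing by $\varepsilon^2/\Lambda$ and writing $t = \varepsilon/\Lambda$, the three explicit terms become $1/100 + t/40 + t^2/7^4$, which is increasing in $t$, so it is enough to test $t = 1/6$.

```python
from fractions import Fraction

def scaled_step4_sum(t):
    """(Lambda / eps^2) times the three explicit terms, with t = eps / Lambda."""
    return Fraction(1, 100) + t / 40 + t * t / 7**4

def room_left_for_rho_term():
    """Gap between eps^2/(50 Lambda) and eps^2/(40 Lambda), in units of eps^2/Lambda."""
    return Fraction(1, 40) - Fraction(1, 50)
```

With exact rational arithmetic, `scaled_step4_sum(Fraction(1, 6))` stays below `Fraction(1, 50)`, and the gap returned by `room_left_for_rho_term()` is what the term $C/\rho$ has to fit into once ρ is chosen large.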
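In view of (23), there is no solution in $M_\nu(y)$ if for some $d \in \Gamma^\# \setminus \{0\}$ we have
\[
|y + (-1)^\nu \theta_\nu(d)| < \varepsilon \quad\text{and}\quad |\eta(y) + d_1 + i(-1)^{\nu'}(y + d_2)| < \varepsilon.
\]
This happens if
\[
|y + (-1)^\nu \theta_\nu(d)| \le \frac12 \left( \varepsilon - \frac{\varepsilon^2}{40\Lambda} \right)
\]
because in this case
\[
|\eta(y) + d_1 + i(-1)^{\nu'}(y + d_2)| = |\eta(y) + i(-1)^\nu y - 2i(-1)^\nu y + d_1 - i(-1)^\nu d_2| \le |\eta(y) + i(-1)^\nu y| + 2|y + (-1)^\nu \theta_\nu(d)| < \varepsilon.
\]
Therefore, the image set of pr is contained in
\[
\Omega_1 := \left\{ z \in \mathbb{C} \ \Big|\ |z + (-1)^\nu \theta_\nu(b)| > \frac12 \left( \varepsilon - \frac{\varepsilon^2}{40\Lambda} \right) \text{ for all } b \in \Gamma^\# \setminus \{0\} \right\}.
\]
On the other hand, in view of (22), there is a solution in $M_\nu(y)$ if $|y + (-1)^\nu \theta_\nu(b)| > \varepsilon$ for all $b \in \Gamma^\# \setminus \{0\}$. Recall from Proposition 11(i) that $\rho < |v| < 8|k_2|$. Thus, the image set of pr contains the set
\[
\Omega_2 := \{ z \in \mathbb{C} \mid 8|z| > \rho \ \text{and}\ |z + (-1)^\nu \theta_\nu(b)| > \varepsilon \ \text{for all } b \in \Gamma^\# \setminus \{0\} \}.
\]

Step 5. Summarizing, we have the following biholomorphic correspondence:
\[
M_\nu \ni k \xrightarrow{\ \mathrm{pr}\ } k_2 \in \Omega, \qquad
M_\nu \ni (\eta(y), y) \xleftarrow{\ \mathrm{pr}^{-1}\ } y \in \Omega,
\]
where
\[
\Omega_2 \subset \Omega \subset \Omega_1 \quad\text{and}\quad \eta(y) = -\beta_2^{(1,0)} - i(-1)^\nu y - r(y),
\]
with the constant $\beta_2^{(1,0)}$ given by (24),
\[
|\beta_2^{(1,0)}| < \frac{\varepsilon^2}{100\Lambda} \quad\text{and}\quad |r(y)| \le \frac{\varepsilon^3}{50\Lambda^2} + \frac{C}{\rho}.
\]
This completes the proof of the theorem.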
Proof of Proposition 12. (a) Recall that $\beta_2 = J_\nu^{00}$. First observe that, by Proposition 10, Lemma 3, and (66), we have
\[
\beta_2^{(1)}(k) = (J_\nu^{00})^{(1)}(k) = \sum_{b,c\in G_1} \frac{(1, i(-1)^\nu)\cdot \hat A(-b)}{N_b(k)}\, S_{b,c}\, (1, -i(-1)^\nu)\cdot \hat A(c).
\tag{25}
\]
Thus, by (94) and (99), √
2 45 √ ˆ 2 A l1 Λ(2|z(k)| − R) 44 2 44 2ε 1 4 ε . ≤ ≤ Λ|z(k)| 45 63 900 |z(k)|
(1)
|β2 (k)| ≤
ˆ l1 2 A
(26)
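Now recall that $|g(k)| \le C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}$. Hence,
\[
|F(k) - w(k)| = |\beta_2^{(1)}(k) z(k) + g(k)| \le \frac{\varepsilon}{900} + C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}.
\]
This proves part (a).

(b) We first compute
\[
\begin{aligned}
\frac{\partial g}{\partial k_1} &= \frac{\partial \beta_1}{\partial k_1}\,\frac{w^2}{z} + \beta_1\,\frac{2wz - w^2}{z^2} + \left( \frac{\partial \beta_2^{(2)}}{\partial k_1} + \frac{\partial \beta_2^{(3)}}{\partial k_1} \right) z + \beta_2^{(2)} + \beta_2^{(3)} + \frac{\partial \beta_3}{\partial k_1}\, w + \beta_3\\
&\quad + \frac{\partial \beta_4}{\partial k_1}\,\frac{w}{z} + \beta_4\,\frac{z - w}{z^2} + \frac{\partial \beta_5}{\partial k_1} + \frac{\partial \beta_6}{\partial k_1}\,\frac{1}{z} - \frac{\beta_6}{z^2} - \frac{\hat q(0)}{z^2}.
\end{aligned}
\tag{27}
\]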
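Now observe that, since $k \in T_\nu(0) \setminus K_\rho$ we have $|w(k)| < \varepsilon$, $3|v| \ge |z|$ and $\rho < |v| \le |z|$. Furthermore, by Lemmas 3(i), 6(i) and 8(i), for $1 \le i \le 6$ and $1 \le j \le 2$,
\[
\begin{aligned}
|\beta_i(k)| &\le \frac{C}{|z(k)|}, & |\beta_i^{(j)}(k)| &\le \frac{C}{|z(k)|^j}, & |\beta_i^{(3)}(k)| &\le \frac{C}{|z(k)|\,\rho^2},\\
\left| \frac{\partial \beta_i(k)}{\partial k_1} \right| &\le \frac{C}{|z(k)|}, & \left| \frac{\partial \beta_i^{(j)}(k)}{\partial k_1} \right| &\le \frac{C}{|z(k)|^j}, & \left| \frac{\partial \beta_i^{(3)}(k)}{\partial k_1} \right| &\le \frac{C}{|z(k)|\,\rho^2},
\end{aligned}
\tag{28}
\]
where $C = C_{\varepsilon,\Lambda,q,A}$ in all cases. Hence,
\[
\left| \frac{\partial g(k)}{\partial k_1} \right| \le C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}.
\tag{29}
\]
By Lemma 8(i) with $f = g = (1, -i(-1)^\nu)\cdot \hat A$, we obtain
\[
\left| z(k)\,\frac{\partial \beta_2^{(1)}(k)}{\partial k_1} \right| \le |z(k)|\,\frac{13}{\Lambda^2}\,\|(1, -i(-1)^\nu)\cdot \hat A\|_{l^1}^2\,\frac{1}{|z(k)|} \le \frac{26}{\Lambda^2}\,\|\hat A\|_{l^1}^2 < \frac{1}{7\cdot 3^4}.
\tag{30}
\]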
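Therefore,
\[
\left| \frac{\partial F}{\partial k_1}(k) - 1 \right| = \left| \frac{\partial}{\partial k_1}(F(k) - w(k)) \right| = \left| \frac{\partial}{\partial k_1}\big(\beta_2^{(1)}(k) z(k) + g(k)\big) \right| = \left| \frac{\partial \beta_2^{(1)}}{\partial k_1}(k)\, z(k) + \beta_2^{(1)}(k) + \frac{\partial g}{\partial k_1}(k) \right| \le \frac{1}{7\cdot 3^4} + C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}.
\]
This proves part (b) and completes the proof of the proposition.

Proof of Proposition 13. First observe that
\[
(1, i(-1)^\nu)\cdot A = A_1 + i(-1)^\nu A_2 = A_1 - i(-1)^{\nu'} A_2 = -2i\theta_{\nu'}(A).
\]
Thus, recalling (25),
\[
\beta_2^{(1)}(k) = (J_\nu^{00})^{(1)}(k) = \sum_{b,c\in G_1} \frac{2i\theta_{\nu'}(\hat A(-b))}{N_b(k)}\, S_{b,c}\, 2i\theta_\nu(\hat A(c)).
\]
Now, by Lemma 4, we have
\[
z(k)\,\beta_2^{(1)}(k) = \beta_2^{(1,0)} + \beta_2^{(1,1)}(w(k)) + \beta_2^{(1,2)}(k) + \beta_2^{(1,3)}(k),
\]
where
\[
\beta_2^{(1,0)} = -2i \sum_{b,c\in G_1} \frac{\theta_{\nu'}(\hat A(-b))}{\theta_{\nu'}(b)} \left[ \delta_{b,c} + \frac{\theta_{\nu'}(\hat A(b-c))}{\theta_{\nu'}(c)} \right] \theta_\nu(\hat A(c))
\]
and
\[
|\beta_2^{(1,3)}(k)| \le C_{\Lambda,A}\,\frac{1}{|z(k)|} < C_{\Lambda,A}\,\frac{1}{\rho}.
\]
Hence,
\[
F(k) - w(k) = z(k)\,\beta_2^{(1)}(k) + g(k) = \beta_2^{(1,0)} + \beta_2^{(1,1)}(w(k)) + \beta_2^{(1,2)}(k) + h(k)
\]
with $h := \beta_2^{(1,3)} + g$. Furthermore, in view of (21),
\[
|h(k)| \le |\beta_2^{(1,3)}(k)| + |g(k)| < C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}.
\]
This proves the first part of the proposition. Finally, by (81), since $\|\hat A\|_{l^1} < 2\varepsilon/63$ and $\varepsilon < \Lambda/6$, we find that
\[
\begin{aligned}
|\beta_2^{(1,0)}| &\le \frac{1}{2\Lambda}\,\|2i\theta_\nu(\hat A)\|_{l^1} \left( 1 + \frac{1}{2\Lambda}\,\|\theta_\nu(\hat A)\|_{l^1} \right) \|2i\theta_\nu(\hat A)\|_{l^1} \le \frac{4}{\Lambda}\,\|\hat A\|_{l^1}^2 < \frac{1}{100\Lambda}\,\varepsilon^2,\\
|\beta_2^{(1,1)}| &\le \frac{\varepsilon}{\Lambda^2} \left( 1 + \frac{7}{6\Lambda}\,\|\theta_\nu(\hat A)\|_{l^1} \right) \|2i\theta_\nu(\hat A)\|_{l^1}\,\|2i\theta_\nu(\hat A)\|_{l^1} \le \frac{8}{\Lambda^2}\,\varepsilon\,\|\hat A\|_{l^1}^2 < \frac{1}{40\Lambda^2}\,\varepsilon^3
\end{aligned}
\]
and
\[
|\beta_2^{(1,2)}| \le \frac{64}{\Lambda^3}\,\|\theta_\nu(\hat A)\|_{l^1}^2\,\|2i\theta_\nu(\hat A)\|_{l^1}\,\|2i\theta_\nu(\hat A)\|_{l^1} \le \frac{256}{\Lambda^3}\,\|\hat A\|_{l^1}^4 < \frac{1}{7^4\Lambda^3}\,\varepsilon^4.
\]
This completes the proof.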
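The final strict inequalities in the three bounds above reduce to comparisons of explicit rational numbers once the hypothesis $\|\hat A\|_{l^1} < 2\varepsilon/63$ is inserted. The following sketch (hypothetical names, exact arithmetic) factors out the powers $\varepsilon^2/\Lambda$, $\varepsilon^3/\Lambda^2$ and $\varepsilon^4/\Lambda^3$ and compares the remaining constants; it illustrates only this last step, not the estimates coming from (81).

```python
from fractions import Fraction

# Hypothesis 1 gives ||A||_{l1} < (2/63) * eps; keep only the numerical factor.
A_OVER_EPS = Fraction(2, 63)

def prop13_margins():
    """Map each coefficient to (constant obtained, constant claimed)."""
    a2 = A_OVER_EPS ** 2
    return {
        "beta_2^(1,0)": (4 * a2, Fraction(1, 100)),         # units eps^2 / Lambda
        "beta_2^(1,1)": (8 * a2, Fraction(1, 40)),          # units eps^3 / Lambda^2
        "beta_2^(1,2)": (256 * a2 * a2, Fraction(1, 7**4)),  # units eps^4 / Lambda^3
    }
```

In each pair returned by `prop13_margins()` the first entry is smaller than the second, so the stated bounds hold with room to spare.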
12. The Handles

Proof of Theorem 2. Step 1 (Defining Equation). Let $G = \{0, d\}$ and consider the region $(T_\nu(0) \cap T_{\nu'}(d)) \setminus K_\rho$, where ρ is a constant to be chosen sufficiently large obeying $\rho \ge R$. Observe that this requires d to be sufficiently large for $(T_\nu(0) \cap T_{\nu'}(d)) \setminus K_\rho$ to be nonempty. In fact, by Proposition 11(ii), for k in this region we have $\rho < |v| \le 2|d|$. Now, recall from Proposition 7(ii) that $G' = \{b \in \Gamma^\# \mid |N_b(k)| \ge \varepsilon|v|\}$, and to simplify the notation write
\[
H_\nu := \hat F(A, V) \cap (T_\nu(0) \cap T_{\nu'}(d)) \setminus K_\rho.
\]
By Lemma 2(ii), a point k is in $H_\nu$ if and only if
\[
(N_0(k) + D_{0,0}(k))(N_d(k) + D_{d,d}(k)) - D_{0,d}(k) D_{d,0}(k) = 0.
\tag{31}
\]
Define
\[
\begin{aligned}
w_1(k) &:= w_{\nu,0} = k_1 + i(-1)^\nu k_2, & z_1(k) &:= z_{\nu,0} = k_1 - i(-1)^\nu k_2,\\
w_2(k) &:= w_{\nu',d} = k_1 + d_1 + i(-1)^{\nu'}(k_2 + d_2), & z_2(k) &:= z_{\nu',d} = k_1 + d_1 - i(-1)^{\nu'}(k_2 + d_2).
\end{aligned}
\tag{32}
\]
Note that, by Proposition 11(ii),
\[
|v| \le |z_1| \le 3|v|, \qquad |v| \le |z_2| \le 3|v| \quad\text{and}\quad |d| \le |z_2| \le 2|d|.
\]
By Proposition 10,
\[
\begin{aligned}
N_0 + D_{0,0} &= \beta_1 w_1^2 + \beta_2 z_1^2 + (1 + \beta_3) w_1 z_1 + \beta_4 w_1 + \beta_5 z_1 + \beta_6 + \hat q(0),\\
N_d + D_{d,d} &= \eta_1 w_2^2 + \eta_2 z_2^2 + (1 + \eta_3) w_2 z_2 + \eta_4 w_2 + \eta_5 z_2 + \eta_6 + \hat q(0),
\end{aligned}
\tag{33}
\]
where
\[
\beta_1 := J_{\nu'}^{00},\quad \beta_2 := J_{\nu}^{00},\quad \beta_3 := K^{00},\quad \beta_4 := L_{\nu'}^{00},\quad \beta_5 := L_{\nu}^{00},\quad \beta_6 := M^{00} - \hat q(0),
\]
and
\[
\eta_1 := J_{\nu}^{dd},\quad \eta_2 := J_{\nu'}^{dd},\quad \eta_3 := K^{dd},\quad \eta_4 := L_{\nu}^{dd},\quad \eta_5 := L_{\nu'}^{dd},\quad \eta_6 := M^{dd} - \hat q(0),
\]
with $J_\nu^{d'd''}$, $K^{d'd''}$, $L_\nu^{d'd''}$ and $M^{d'd''}$ given by Proposition 10. Observe that all the coefficients $\beta_1, \dots, \beta_6$ and $\eta_1, \dots, \eta_6$ have exactly the same form as the function $\Phi_{d',d''}(k)$ of Lemma 3(ii) (see (15)). Thus, by this lemma, for $1 \le i \le 6$ we have
\[
\beta_i = \beta_i^{(1)} + \beta_i^{(2)} + \beta_i^{(3)} \quad\text{and}\quad \eta_i = \eta_i^{(1)} + \eta_i^{(2)} + \eta_i^{(3)},
\tag{34}
\]
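where the functions $\beta_i^{(j)}$ and $\eta_i^{(j)}$ are analytic in the region under consideration with
\[
\begin{aligned}
|\beta_i^{(j)}(k)| &\le \frac{C}{(2|z_1(k)| - \rho)^j} \le \frac{C}{|z_1(k)|^j} \ \text{ for } 1 \le j \le 2 \quad\text{and}\quad |\beta_i^{(3)}(k)| \le \frac{C}{|z_1(k)|\,\rho^2},\\
|\eta_i^{(j)}(k)| &\le \frac{C}{(2|z_2(k)| - \rho)^j} \le \frac{C}{|z_2(k)|^j} \ \text{ for } 1 \le j \le 2 \quad\text{and}\quad |\eta_i^{(3)}(k)| \le \frac{C}{|z_2(k)|\,\rho^2},
\end{aligned}
\]
where $C = C_{\varepsilon,\Lambda,q,A}$ is a constant. The exact expressions for $\beta_i^{(j)}$ and $\eta_i^{(j)}$ can be easily obtained from the definitions and from Lemma 3(ii). Substituting (34) into (33) yields
\[
\begin{aligned}
\frac{1}{z_1}(N_0 + D_{0,0}) &= w_1 + \beta_2^{(1)} z_1 + g_1,\\
\frac{1}{z_2}(N_d + D_{d,d}) &= w_2 + \eta_2^{(1)} z_2 + g_2,
\end{aligned}
\tag{35}
\]
where
\[
\begin{aligned}
g_1 &:= \frac{\beta_1 w_1^2}{z_1} + (\beta_2^{(2)} + \beta_2^{(3)}) z_1 + \beta_3 w_1 + \frac{\beta_4 w_1}{z_1} + \beta_5 + \frac{\beta_6}{z_1} + \frac{\hat q(0)}{z_1},\\
g_2 &:= \frac{\eta_1 w_2^2}{z_2} + (\eta_2^{(2)} + \eta_2^{(3)}) z_2 + \eta_3 w_2 + \frac{\eta_4 w_2}{z_2} + \eta_5 + \frac{\eta_6}{z_2} + \frac{\hat q(0)}{z_2}
\end{aligned}
\tag{36}
\]
obey
\[
|g_1(k)| \le \frac{C}{\rho} \quad\text{and}\quad |g_2(k)| \le \frac{C}{\rho},
\tag{37}
\]
with a constant $C = C_{\varepsilon,\Lambda,q,A}$. This gives us more information about the first term in (31). We next consider the second term in that equation. Write
\[
D_{0,d} = c_1(d) + p_1 \quad\text{and}\quad D_{d,0} = c_2(d) + p_2
\tag{38}
\]
with
\[
\begin{aligned}
c_1(d) &:= \hat q(-d) - 2d\cdot \hat A(-d), & p_1 &:= D_{0,d} - \hat q(-d) + 2d\cdot \hat A(-d),\\
c_2(d) &:= \hat q(d) + 2d\cdot \hat A(d), & p_2 &:= D_{d,0} - \hat q(d) - 2d\cdot \hat A(d).
\end{aligned}
\]
We have the following estimates.

Proposition 14. Under the hypotheses of Theorem 2 we have, for any integers n and m with $n + m \ge 0$ and for $1 \le j \le 2$,
\[
\left| \frac{\partial^{n+m}}{\partial k_1^n\, \partial k_2^m}\, p_j(k) \right| \le \frac{C_1}{|d|} \quad\text{and}\quad |c_j(d)| \le \frac{C_2}{|d|},
\]
where the constants $C_1$ and $C_2$ depend only on ε, Λ, q and A.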
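Thus, by dividing both sides of (31) by $z_1 z_2$ and substituting (35) and (38) we find that
\[
\begin{aligned}
0 &= \frac{1}{z_1 z_2}\big[(N_0 + D_{0,0})(N_d + D_{d,d}) - D_{0,d} D_{d,0}\big]\\
&= (w_1 + \beta_2^{(1)} z_1 + g_1)(w_2 + \eta_2^{(1)} z_2 + g_2) - \frac{1}{z_1 z_2}(c_1(d) + p_1)(c_2(d) + p_2).
\end{aligned}
\tag{39}
\]
We now introduce a (nonlinear) change of variables in $\mathbb{C}^2$. Set
\[
\begin{aligned}
x_1(k) &:= w_1(k) + \beta_2^{(1)}(k)\, z_1(k) + g_1(k),\\
x_2(k) &:= w_2(k) + \eta_2^{(1)}(k)\, z_2(k) + g_2(k).
\end{aligned}
\tag{40}
\]
This transformation obeys the following estimates.

Proposition 15. Under the hypotheses of Theorem 2 we have:
(i) For $1 \le j \le 2$ and for ρ sufficiently large,
\[
|x_j(k) - w_j(k)| \le \frac{\varepsilon}{900} + \frac{C}{\rho} < \frac{\varepsilon}{8}.
\]
(ii)
\[
\begin{pmatrix} \dfrac{\partial x_1}{\partial k_1} & \dfrac{\partial x_1}{\partial k_2}\\[2mm] \dfrac{\partial x_2}{\partial k_1} & \dfrac{\partial x_2}{\partial k_2} \end{pmatrix}
= \begin{pmatrix} 1 & i(-1)^\nu\\ 1 & i(-1)^{\nu'} \end{pmatrix} (I + M)
\quad\text{and}\quad
\begin{pmatrix} \dfrac{\partial k_1}{\partial x_1} & \dfrac{\partial k_1}{\partial x_2}\\[2mm] \dfrac{\partial k_2}{\partial x_1} & \dfrac{\partial k_2}{\partial x_2} \end{pmatrix}
= \frac12 \begin{pmatrix} 1 & 1\\ i(-1)^{\nu'} & i(-1)^{\nu} \end{pmatrix} (I + N)
\]
with
\[
\|M\| \le \frac{4}{7\cdot 3^4} + \frac{C}{\rho} < \frac12 \quad\text{and}\quad \|N\| \le 4\|M\|.
\]
Furthermore, for all $m, i, j \in \{1, 2\}$,
\[
\left| \frac{\partial^2 k_m}{\partial x_i\, \partial x_j} \right| \le \frac{3}{\Lambda^3}\,\varepsilon^2 + \frac{C}{\rho}.
\]
Here, all the constants C depend only on ε, Λ, q and A.

By the inverse function theorem, these estimates imply that the above transformation is invertible. Therefore, by rewriting Eq. (39) in terms of these new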
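variables, we conclude that a point k is in $H_\nu$ if and only if $x_1(k)$ and $x_2(k)$ satisfy the equation
\[
x_1 x_2 + r(x_1, x_2) = 0,
\tag{41}
\]
where
\[
r(x_1, x_2) := -\frac{1}{z_1 z_2}(c_1(d) + p_1)(c_2(d) + p_2).
\]
In order to study this defining equation we need some estimates.

Step 2 (Estimates). Using the above inequalities we have, for $i, j, l \in \{1, 2\}$,
\[
\left| \frac{\partial}{\partial x_i}\, p_j(k(x)) \right| \le \sum_{m=1}^2 \left| \frac{\partial p_j}{\partial k_m}\,\frac{\partial k_m}{\partial x_i} \right| \le \frac{C}{|d|}
\]
and
\[
\left| \frac{\partial^2}{\partial x_i\, \partial x_l}\, p_j(k(x)) \right| \le \sum_{m,n=1}^2 \left| \frac{\partial^2 p_j}{\partial k_m\, \partial k_n}\,\frac{\partial k_m}{\partial x_i}\,\frac{\partial k_n}{\partial x_l} \right| + \sum_{m=1}^2 \left| \frac{\partial p_j}{\partial k_m}\,\frac{\partial^2 k_m}{\partial x_i\, \partial x_l} \right| \le \frac{C}{|d|},
\]
so that
\[
|r(x)| \le C\,\frac{1}{|d|^2}\,\frac{1}{|d|}\,\frac{1}{|d|} \le \frac{C}{|d|^4},
\qquad
\left| \frac{\partial}{\partial x_i}\, r(x) \right| \le C\,\frac{1}{|d|^3}\,\frac{1}{|d|}\,\frac{1}{|d|} + C\,\frac{1}{|d|^2}\,\frac{1}{|d|}\,\frac{1}{|d|} \le \frac{C}{|d|^4}
\]
and
\[
\left| \frac{\partial^2}{\partial x_i\, \partial x_j}\, r(x) \right| \le \frac{C}{|d|^4}.
\]
Here, all the constants depend only on ε, Λ, q and A.

Step 3 (Morse Lemma). We now apply the quantitative Morse lemma in Appendix A for studying Eq. (41). We consider this lemma with $a = b = C/|d|^4$, $\delta = \varepsilon$, and d sufficiently large so that $b < \max\{\frac{2}{3}\,\frac{1}{55},\ \frac{\varepsilon}{4}\}$. Observe that, under this condition we have
\[
(\delta - a)(1 - 19b) > \frac{\varepsilon}{2} \quad\text{and}\quad (\delta - a)(1 - 55b) > \frac{\varepsilon}{4}.
\]
According to this lemma, there is a biholomorphism $\Phi_\nu$ defined on
\[
\Omega_1 := \left\{ (z_1, z_2) \in \mathbb{C}^2 \ \Big|\ |z_1| < \frac{\varepsilon}{2} \ \text{and}\ |z_2| < \frac{\varepsilon}{2} \right\}
\]
with range containing
\[
\left\{ (x_1, x_2) \in \mathbb{C}^2 \ \Big|\ |x_1| < \frac{\varepsilon}{4} \ \text{and}\ |x_2| < \frac{\varepsilon}{4} \right\}
\tag{42}
\]
such that
\[
\|D\Phi_\nu - I\| \le \frac{C}{|d|^2}, \qquad
((x_1 x_2 + r) \circ \Phi_\nu)(z_1, z_2) = z_1 z_2 + t_d, \qquad
|t_d| \le \frac{C}{|d|^4}, \qquad
|\Phi_\nu(0)| \le \frac{C}{|d|^4},
\tag{43}
\]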
where $D\Phi_\nu$ is the derivative of $\Phi_\nu$ and $t_d$ is a constant that depends on d. Hence, if for ν = 1 we define $\phi_{d,1} : \Omega_1 \to T_1(0) \cap T_2(d)$ as
\[
\phi_{d,1}(z_1, z_2) := (k_1(\Phi_1(z_1, z_2)),\ k_2(\Phi_1(z_1, z_2))),
\]
where k(x) is the inverse of the transformation (40), we obtain the desired map. Note that the conclusion (ii) of the theorem is immediate. We next prove (i) and (iii).

Step 4 (Proof of (i)). By Proposition 15(i), for $1 \le j \le 2$ we have $|x_j(k) - w_j(k)| \le \varepsilon/8$. Now, recall from (32) the definition of $w_1(k)$ and $w_2(k)$. Then, since
\[
|x_j(k)| \le |x_j(k) - w_j(k)| + |w_j(k)| < \frac{\varepsilon}{8} + |w_j(k)|,
\]
the set
\[
\left\{ (k_1, k_2) \in \mathbb{C}^2 \ \Big|\ |w_1(k)| < \frac{\varepsilon}{8} \ \text{and}\ |w_2(k)| < \frac{\varepsilon}{8} \right\}
\]
is contained in the set (42). This proves the first part of (i). To prove the second part we use Proposition 15 and (43). First observe that
\[
D\phi_{d,1} = \frac{\partial k}{\partial x}\, D\Phi_1 = \frac12 \begin{pmatrix} 1 & 1\\ i & -i \end{pmatrix} (I + N)(I + D\Phi_1 - I) = \frac12 \begin{pmatrix} 1 & 1\\ i & -i \end{pmatrix} (I + N + R),
\]
where
\[
\|N\| \le \frac{1}{3^3} + \frac{C}{\rho} \quad\text{and}\quad \|R\| \le \frac{C}{|d|^2}.
\]
Furthermore, from (32) and (40) we have
\[
k_1 = i\theta_\nu(d) + \frac12 (w_1 + w_2) = i\theta_\nu(d) + \frac12 \big(x_1 + x_2 - \beta_2^{(1)} z_1 - \eta_2^{(1)} z_2 - g_1 - g_2\big)
\]
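and similarly
\[
k_2 = -(-1)^\nu \theta_\nu(d) + \frac{(-1)^\nu}{2i}\big(x_1 - x_2 - \beta_2^{(1)} z_1 + \eta_2^{(1)} z_2 - g_1 + g_2\big),
\]
so that
\[
\phi_{d,1}(0) = k(\Phi_1(0)) = k\!\left( O\!\left( \frac{1}{|d|^4} \right) \right) = (i\theta_\nu(d),\ -(-1)^\nu \theta_\nu(d)) + O\!\left( \frac{\varepsilon}{900} \right) + O\!\left( \frac{1}{\rho} \right).
\]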
Step 5 (Proof of (iii)). To prove part (iii) it suffices to note that $T_1(0) \cap T_2(d) \cap \hat F(A, V)$ is mapped to $T_1(-d) \cap T_2(0) \cap \hat F(A, V)$ by translation by d and define $\phi_{d,2}$ by
\[
\phi_{d,2}(z_1, z_2) := \phi_{d,1}(z_2, z_1) + d.
\]
This completes the proof of the theorem.

Proof of Proposition 14. It suffices to estimate
\[
c_{d',d''} := \hat q(d' - d'') - 2(d'' - d')\cdot \hat A(d' - d'') \quad\text{and}\quad p_{d',d''} := D_{d',d''} - c_{d',d''}
\]
for $d', d'' \in \{0, d\}$ with $d' \ne d''$. Define $l_\nu^{d'd''} := (1, i(-1)^\nu)\cdot \hat A(d' - d'')$. Observe that, since
\[
|\hat q(d' - d'')| = \frac{1}{|d' - d''|^2}\,|d' - d''|^2\,|\hat q(d' - d'')| \le \frac{1}{|d' - d''|^2} \sum_{b\in\Gamma^\#} |b|^2 |\hat q(b)| \le \|b^2 \hat q(b)\|_{l^1}\,\frac{1}{|d|^2},
\]
and similarly
\[
|\hat A(d' - d'')| \le \|b^2 \hat A(b)\|_{l^1}\,\frac{1}{|d|^2},
\]
it follows that
\[
|c_{d',d''}| \le \frac{C_{A,q}}{|d|} \quad\text{and}\quad |l_\nu^{d'd''}| \le \frac{C_A}{|d|^2}.
\]
This gives the desired bounds for $c_1$ and $c_2$. Now, by Proposition 10, we have
\[
p = J_{\nu'}^{d'd''} w_{\nu,d'}^2 + J_{\nu}^{d'd''} z_{\nu,d'}^2 + K^{d'd''} w_{\nu,d'} z_{\nu,d'} + (\tilde L_{\nu'}^{d'd''} - l_{\nu'}^{d'd''})\, w_{\nu,d'} + (\tilde L_{\nu}^{d'd''} - l_{\nu}^{d'd''})\, z_{\nu,d'} + \tilde M^{d'd''}
\]
with $\tilde L_\nu^{d'd''} := L_\nu^{d'd''} + l_\nu^{d'd''}$ and $\tilde M^{d'd''} := M^{d'd''} - c$. Observe that all the coefficients $J_\nu^{d'd''}$, $K^{d'd''}$, $\tilde L_\nu^{d'd''}$ and $\tilde M^{d'd''}$ have exactly the same form as the function $\Phi_{d',d''}(k)$ of Lemma 7 (see Proposition 10 and (15)). Thus, by this lemma with β = 2, for any integers n and m with $n + m \ge 0$, the absolute value of the $\frac{\partial^{n+m}}{\partial k_1^n \partial k_2^m}$-derivative of each of these functions is bounded above by $C_{\varepsilon,\Lambda,A,q,m,n}\,\frac{1}{|d|^3}$. Hence, if we recall from Proposition 11(ii) that $|z_1(k)| \le 6|d|$ and $|z_2(k)| \le 2|d|$, and apply the Leibniz rule we find that
\[
\left| \frac{\partial^{n+m}}{\partial k_1^n\, \partial k_2^m}\, p_{d',d''}(k) \right| \le C_{m,n}\,\frac{C}{|d|}.
\]
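This yields the desired bounds for $p_1$ and $p_2$ and completes the proof.

Proof of Proposition 15. (i) Similarly as in (26) we have
\[
|\beta_2^{(1)}(k)| \le \frac{1}{900}\,\frac{\varepsilon}{|z_1(k)|} \quad\text{and}\quad |\eta_2^{(1)}(k)| \le \frac{1}{900}\,\frac{\varepsilon}{|z_2(k)|}.
\]
Thus, in view of (37), and by choosing ρ sufficiently large,
\[
|x_1(k) - w_1(k)| \le |\beta_2^{(1)}(k)\, z_1(k) + g_1(k)| \le \frac{\varepsilon}{900} + \frac{C}{\rho} < \frac{\varepsilon}{8},
\]
and similarly $|x_2(k) - w_2(k)| < \varepsilon/8$. This proves part (i).

(ii) Recall (32) and (40). Then, for $1 \le j \le 2$,
\[
\begin{aligned}
\frac{\partial x_1}{\partial k_j} &= \frac{\partial}{\partial k_j}\big(w_1 + z_1 \beta_2^{(1)} + g_1\big) = \frac{\partial w_1}{\partial k_j} + z_1\,\frac{\partial \beta_2^{(1)}}{\partial k_j} + \frac{\partial z_1}{\partial k_j}\,\beta_2^{(1)} + \frac{\partial g_1}{\partial k_j},\\
\frac{\partial x_2}{\partial k_j} &= \frac{\partial}{\partial k_j}\big(w_2 + z_2 \eta_2^{(1)} + g_2\big) = \frac{\partial w_2}{\partial k_j} + z_2\,\frac{\partial \eta_2^{(1)}}{\partial k_j} + \frac{\partial z_2}{\partial k_j}\,\eta_2^{(1)} + \frac{\partial g_2}{\partial k_j}.
\end{aligned}
\]
First observe that the functions $g_1$ and $g_2$ are similar to the function g (see (36) and (20)). Thus, it is easy to see that $\frac{\partial g_1}{\partial k_j}$ and $\frac{\partial g_2}{\partial k_j}$ are given by expressions similar to (27). Since $k \in T_\nu(0) \cap T_{\nu'}(d)$ we have $|w_1(k)| < \varepsilon$ and $|w_2(k)| < \varepsilon$. Recall also the inequalities in Proposition 11(ii). Hence, by Lemmas 3(ii), 6(ii) and 8(ii), we obtain (28) with $k_1$ and z(k) replaced by $k_j$ and $z_1(k)$, respectively, and with $k_1$, z(k) and β replaced by $k_j$, $z_2(k)$ and η, respectively. Consequently, similarly as in (29) and using again Lemma 3(ii), for $1 \le j \le 2$ we have
\[
\left| \frac{\partial z_1}{\partial k_j}\,\beta_2^{(1)} + \frac{\partial g_1}{\partial k_j} \right| \le C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho} \quad\text{and}\quad \left| \frac{\partial z_2}{\partial k_j}\,\eta_2^{(1)} + \frac{\partial g_2}{\partial k_j} \right| \le C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}.
\]
Now recall that $\beta_2 = J_\nu^{00}$ and $\eta_2 = J_{\nu'}^{dd}$. Then, by Proposition 10, Lemma 3(ii), and (66), it follows that
\[
\begin{aligned}
\beta_2^{(1)}(k) &= (J_\nu^{00})^{(1)}(k) = \sum_{b,c\in G_1} \frac{(1, i(-1)^\nu)\cdot \hat A(-b)}{N_b(k)}\, S_{b,c}\, (1, -i(-1)^\nu)\cdot \hat A(c),\\
\eta_2^{(1)}(k) &= (J_{\nu'}^{dd})^{(1)}(k) = \sum_{b,c\in G_1} \frac{(1, i(-1)^{\nu'})\cdot \hat A(d-b)}{N_b(k)}\, S_{b,c}\, (1, -i(-1)^{\nu'})\cdot \hat A(c-d).
\end{aligned}
\]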
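Hence, by Lemma 8(ii), similarly as in (30), for $1 \le j \le 2$,
\[
\left| z_1(k)\,\frac{\partial \beta_2^{(1)}(k)}{\partial k_j} \right| \le \frac{13}{\Lambda^2}\,\|(1, -i(-1)^\nu)\cdot \hat A\|_{l^1}^2 < \frac{1}{7\cdot 3^4}
\quad\text{and}\quad
\left| z_2(k)\,\frac{\partial \eta_2^{(1)}(k)}{\partial k_j} \right| < \frac{1}{7\cdot 3^4}.
\]
Therefore,
\[
\begin{aligned}
\begin{pmatrix} \dfrac{\partial x_1}{\partial k_1} & \dfrac{\partial x_1}{\partial k_2}\\[2mm] \dfrac{\partial x_2}{\partial k_1} & \dfrac{\partial x_2}{\partial k_2} \end{pmatrix}
&= \begin{pmatrix} 1 & i(-1)^\nu\\ 1 & i(-1)^{\nu'} \end{pmatrix}
+ \begin{pmatrix} z_1(k)\,\dfrac{\partial \beta_2^{(1)}(k)}{\partial k_1} & z_1(k)\,\dfrac{\partial \beta_2^{(1)}(k)}{\partial k_2}\\[2mm] z_2(k)\,\dfrac{\partial \eta_2^{(1)}(k)}{\partial k_1} & z_2(k)\,\dfrac{\partial \eta_2^{(1)}(k)}{\partial k_2} \end{pmatrix}\\
&\quad + \begin{pmatrix} \beta_2^{(1)} & -i(-1)^\nu \beta_2^{(1)}\\ \eta_2^{(1)} & -i(-1)^{\nu'} \eta_2^{(1)} \end{pmatrix}
+ \begin{pmatrix} \dfrac{\partial g_1}{\partial k_1} & \dfrac{\partial g_1}{\partial k_2}\\[2mm] \dfrac{\partial g_2}{\partial k_1} & \dfrac{\partial g_2}{\partial k_2} \end{pmatrix}\\
&=: \begin{pmatrix} 1 & i(-1)^\nu\\ 1 & i(-1)^{\nu'} \end{pmatrix} (I + M_1 + M_2 + M_3),
\end{aligned}
\]
where
\[
\|M_1\| \le 2\,\frac{2}{7\cdot 3^4} \quad\text{and}\quad \|M_2 + M_3\| \le C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}.
\]
Set $M := M_1 + M_2 + M_3$. This proves the first claim. Now, by choosing ρ sufficiently large we can make $\|M\| < \frac12$. Write
\[
P := \begin{pmatrix} 1 & i(-1)^\nu\\ 1 & i(-1)^{\nu'} \end{pmatrix}.
\]
Then, by the inverse function theorem and using the Neumann series,
\[
\begin{aligned}
\begin{pmatrix} \dfrac{\partial k_1}{\partial x_1} & \dfrac{\partial k_1}{\partial x_2}\\[2mm] \dfrac{\partial k_2}{\partial x_1} & \dfrac{\partial k_2}{\partial x_2} \end{pmatrix}
&= \begin{pmatrix} \dfrac{\partial x_1}{\partial k_1} & \dfrac{\partial x_1}{\partial k_2}\\[2mm] \dfrac{\partial x_2}{\partial k_1} & \dfrac{\partial x_2}{\partial k_2} \end{pmatrix}^{-1}
= (I + M)^{-1} P^{-1} = (I + \tilde M) P^{-1}\\
&=: P^{-1}(I + P \tilde M P^{-1}) = \frac12 \begin{pmatrix} 1 & 1\\ i(-1)^{\nu'} & i(-1)^{\nu} \end{pmatrix} (I + P \tilde M P^{-1}),
\end{aligned}
\]
with
\[
\|P \tilde M P^{-1}\| \le 2\|\tilde M\| \le \frac{2\|M\|}{1 - \|M\|} \le 4\|M\|.
\]
Set $N := P \tilde M P^{-1}$. This proves the second claim.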
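Two ingredients of this step lend themselves to a quick numerical check: the explicit inverse of P, and the Neumann-series bound $\|\tilde M\| \le \|M\|/(1 - \|M\|)$ for $\tilde M = (I + M)^{-1} - I$. The sketch below is illustrative only. All function names are hypothetical, and it uses the Frobenius norm as an assumption, since the excerpt does not say which matrix norm is meant; the Neumann bound holds for any submultiplicative norm.

```python
import math
import random

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(a):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def frobenius(a):
    return math.sqrt(sum(abs(x) ** 2 for row in a for x in row))

def P(nu):
    s = (-1) ** nu  # (-1)^{nu'} equals -s
    return [[1, 1j * s], [1, -1j * s]]

def P_inverse_claimed(nu):
    s = (-1) ** nu
    # one half of [[1, 1], [i(-1)^{nu'}, i(-1)^{nu}]]
    return [[0.5, 0.5], [-0.5j * s, 0.5j * s]]

def identity_defect(nu):
    """Largest entry of P * P_inverse_claimed - I."""
    prod = matmul(P(nu), P_inverse_claimed(nu))
    eye = [[1, 0], [0, 1]]
    return max(abs(prod[i][j] - eye[i][j]) for i in range(2) for j in range(2))

def random_matrix(norm, seed):
    """Random complex 2x2 matrix rescaled to a prescribed Frobenius norm."""
    rng = random.Random(seed)
    m = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
         for _ in range(2)]
    scale = norm / frobenius(m)
    return [[x * scale for x in row] for row in m]

def neumann_slack(m):
    """Return ||M||/(1-||M||) - ||(I+M)^{-1} - I||, nonnegative when ||M|| < 1."""
    n = frobenius(m)
    inv = inverse([[1 + m[0][0], m[0][1]], [m[1][0], 1 + m[1][1]]])
    m_tilde = [[inv[0][0] - 1, inv[0][1]], [inv[1][0], inv[1][1] - 1]]
    return n / (1 - n) - frobenius(m_tilde)
```

For $\|M\| < 1/2$ the bound gives $\|\tilde M\| \le 2\|M\|$, which is the step used above to pass from $2\|M\|/(1 - \|M\|)$ to $4\|M\|$.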
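Differentiating the matrix identity $T T^{-1} = I$ and applying the chain rule we find that
\[
\frac{\partial^2 k_m}{\partial x_i\, \partial x_j} = -\sum_{l,p=1}^2 \frac{\partial k_m}{\partial x_l} \left( \frac{\partial}{\partial x_i}\,\frac{\partial x_l}{\partial k_p} \right) \frac{\partial k_p}{\partial x_j} = -\sum_{l,p,r=1}^2 \frac{\partial k_m}{\partial x_l}\,\frac{\partial^2 x_l}{\partial k_r\, \partial k_p}\,\frac{\partial k_r}{\partial x_i}\,\frac{\partial k_p}{\partial x_j}.
\]
Furthermore, in view of the above calculations we have
\[
\left| \frac{\partial k_i}{\partial x_j} \right| \le \frac12 (1 + \|N\|) \le \frac12 (1 + 4\|M\|) < \frac12 \left( 1 + 4\cdot\frac12 \right) = \frac32.
\]
Thus,
\[
\left| \frac{\partial^2 k_m}{\partial x_i\, \partial x_j} \right| \le 4 \left( \frac32 \right)^3 \sup_{l,r,p} \left| \frac{\partial^2 x_l}{\partial k_r\, \partial k_p} \right|.
\]
We now estimate
\[
\frac{\partial^2 x_1}{\partial k_i\, \partial k_j} = \frac{\partial z_1}{\partial k_i}\,\frac{\partial \beta_2^{(1)}}{\partial k_j} + z_1\,\frac{\partial^2 \beta_2^{(1)}}{\partial k_i\, \partial k_j} + \frac{\partial z_1}{\partial k_j}\,\frac{\partial \beta_2^{(1)}}{\partial k_i} + \frac{\partial^2 g_1}{\partial k_i\, \partial k_j}
\quad\text{and}\quad \frac{\partial^2 x_2}{\partial k_i\, \partial k_j}.
\]
From (27) with g, w and z replaced by $g_1$, $w_1$ and $z_1$, respectively, we obtain
\[
\begin{aligned}
\frac{\partial^2 g_1}{\partial k_1^2} &= \frac{\partial^2 \beta_1}{\partial k_1^2}\,\frac{w_1^2}{z_1} + 2\,\frac{\partial \beta_1}{\partial k_1}\,\frac{2 w_1 z_1 - w_1^2}{z_1^2} + \beta_1\,\frac{2 z_1^2 - 4 w_1 z_1 + 2 w_1^2}{z_1^3}\\
&\quad + \left( \frac{\partial^2 \beta_2^{(2)}}{\partial k_1^2} + \frac{\partial^2 \beta_2^{(3)}}{\partial k_1^2} \right) z_1 + 2 \left( \frac{\partial \beta_2^{(2)}}{\partial k_1} + \frac{\partial \beta_2^{(3)}}{\partial k_1} \right) + \frac{\partial^2 \beta_3}{\partial k_1^2}\, w_1 + 2\,\frac{\partial \beta_3}{\partial k_1}\\
&\quad + \frac{\partial^2 \beta_4}{\partial k_1^2}\,\frac{w_1}{z_1} + 2\,\frac{\partial \beta_4}{\partial k_1}\,\frac{z_1 - w_1}{z_1^2} + \beta_4\,\frac{2(w_1 - z_1)}{z_1^3}\\
&\quad + \frac{\partial^2 \beta_5}{\partial k_1^2} + \frac{\partial^2 \beta_6}{\partial k_1^2}\,\frac{1}{z_1} - 2\,\frac{\partial \beta_6}{\partial k_1}\,\frac{1}{z_1^2} + 2\,\frac{\beta_6}{z_1^3} + \frac{2\hat q(0)}{z_1^3}.
\end{aligned}
\]
Hence, by Lemmas 3(ii), 6(ii) and 8(ii),
\[
\left| \frac{\partial^2 g_1}{\partial k_1^2} \right| \le C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}.
\]
Similarly we prove that
\[
\left| \frac{\partial^2 g_l}{\partial k_i\, \partial k_j} \right| \le C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}
\]
for all $l, i, j \in \{1, 2\}$ because all the derivatives acting on $g_l$ are essentially the same up to constant factors (see [13]). Furthermore, again by Lemma 8(ii),
\[
\left| \frac{\partial \beta_2^{(1)}}{\partial k_j} \right| \le C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}, \qquad \left| \frac{\partial \eta_2^{(1)}}{\partial k_j} \right| \le C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho},
\]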
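and
\[
\left| z_1(k)\,\frac{\partial^2 \beta_2^{(1)}(k)}{\partial k_i\, \partial k_j} \right| \le \frac{65}{\Lambda^3}\,\|(1, -i(-1)^\nu)\cdot \hat A\|_{l^1}^2 < \frac{1}{5\Lambda^3}\,\varepsilon^2,
\qquad
\left| z_2(k)\,\frac{\partial^2 \eta_2^{(1)}(k)}{\partial k_i\, \partial k_j} \right| < \frac{1}{5\Lambda^3}\,\varepsilon^2.
\]
Hence,
\[
\left| \frac{\partial^2 x_l}{\partial k_i\, \partial k_j} \right| \le \frac{1}{5\Lambda^3}\,\varepsilon^2 + C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}.
\]
Therefore,
\[
\left| \frac{\partial^2 k_m}{\partial x_i\, \partial x_j} \right| \le 4 \left( \frac32 \right)^3 \sup_{l,r,p} \left| \frac{\partial^2 x_l}{\partial k_r\, \partial k_p} \right| \le \frac{3}{\Lambda^3}\,\varepsilon^2 + C_{\varepsilon,\Lambda,q,A}\,\frac{1}{\rho}.
\]
This completes the proof of the proposition.

Acknowledgments

I would like to thank Professor Joel Feldman for suggesting this problem and for the many discussions I have had with him. I am also grateful to Alessandro Michelangeli for useful comments about the manuscript. This work is part of the author's Ph.D. thesis [13] defended at the University of British Columbia in Vancouver, Canada.

Appendix A. Quantitative Morse Lemma

Lemma 9 (Quantitative Morse Lemma [13]). Let δ be a constant with 0 < δ < 1 and assume that $f(x_1, x_2) = x_1 x_2 + r(x_1, x_2)$ is a holomorphic function on $D_\delta = \{(x_1, x_2) \in \mathbb{C}^2 \mid |x_1| \le \delta \text{ and } |x_2| \le \delta\}$. Suppose further that, for all $x \in D_\delta$ and $1 \le i \le 2$, the function r satisfies
\[
\left| \frac{\partial r}{\partial x_i}(x) \right| \le a < \delta \quad\text{and}\quad \left\| \left[ \frac{\partial^2 r}{\partial x_i\, \partial x_j}(x) \right]_{i,j\in\{1,2\}} \right\| \le b < \frac{1}{55},
\]
where a and b are constants. Then f has a unique critical point $\xi = (\xi_1, \xi_2) \in D_\delta$ with $|\xi_1| \le a$ and $|\xi_2| \le a$. Furthermore, let $s = \max\{|\xi_1|, |\xi_2|\}$. Then there is a biholomorphic map Φ from the domain $D_{(\delta - s)(1 - 19b)}$ to a neighbourhood of $\xi \in D_\delta$ that contains $\{(z_1, z_2) \in \mathbb{C}^2 \mid |z_i - \xi_i| < (\delta - s)(1 - 55b) \text{ for } 1 \le i \le 2\}$ such that
\[
(f \circ \Phi)(z_1, z_2) = z_1 z_2 + c,
\]
where $c \in \mathbb{C}$ is a constant fulfilling $|c - r(0, 0)| \le a^2$. The differential DΦ obeys $\|D\Phi - I\| \le 18b$. If $\frac{\partial r}{\partial x_1}(0, 0) = 0$ and $\frac{\partial r}{\partial x_2}(0, 0) = 0$, then ξ = 0 and s = 0.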
Appendix B. Asymptotics for the Coefficients: Proofs

Proof of Proposition 11. We first derive a more general inequality and then we prove parts (i) and (ii). First observe that, if $k \in T_\mu(d') \setminus K_R$ then
\[ |v + (-1)^\mu (u + d')^\perp| = |N_{d',\mu}(k)| < \varepsilon < |v|. \]
Hence,
\[ |v| \le |2v - (v + (-1)^\mu (u + d')^\perp)| \le 3|v|. \]
But
\[ |2v - (v + (-1)^\mu (u + d')^\perp)| = |v - (-1)^\mu (u + d')^\perp| = |k_1 + d'_1 - i(-1)^\mu (k_2 + d'_2)| = |z_{\mu,d'}(k)|. \]
Therefore,
\[ \frac{1}{|z_{\mu,d'}(k)|} \le \frac{1}{|v|} \le \frac{3}{|z_{\mu,d'}(k)|}. \tag{44} \]
We now prove parts (i) and (ii).

(i) The first inequality of part (i) follows from the above estimate setting $(\mu,d') = (\nu,0)$. To prove the second inequality observe that, since $|v| > R \ge 2\Lambda > 12\varepsilon$ by hypothesis and $|v| \le |z_{\nu,0}(k)|$ by (44), on the one hand we have
\[ \tfrac{1}{4}|v| \le \tfrac{11}{12}|v| = |v| - \tfrac{1}{12}|v| \le |v| - \tfrac{1}{6}\Lambda \le |v| - \varepsilon \le |z_{\nu,0}(k)| - |k_1 + i(-1)^\nu k_2| \le |z_{\nu,0}(k) - k_1 - i(-1)^\nu k_2| = 2|k_2|. \]
On the other hand, since $|z_{\nu,0}(k)| < 3|v|$ by (44),
\[ 2|k_2| = |2i(-1)^\nu k_2| = |k_1 + i(-1)^\nu k_2 - (k_1 - i(-1)^\nu k_2)| = |k_1 + i(-1)^\nu k_2 - z_{\nu,0}(k)| \le \varepsilon + 3|v| \le 4|v|. \]
Combining these estimates we obtain the second inequality of part (i).

(ii) Similarly, in view of (44), if $k \in T_\mu(d') \setminus K_R$ for $(\mu,d') \in \{(\nu,0),(\nu',d)\}$ then
\[ \frac{1}{|z_{\nu,0}(k)|} \le \frac{1}{|v|} \le \frac{3}{|z_{\nu,0}(k)|} \qquad \text{and} \qquad \frac{1}{|z_{\nu',d}(k)|} \le \frac{1}{|v|} \le \frac{3}{|z_{\nu',d}(k)|}. \]
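These are the first two inequalities of part (ii). Now, since
\[ z_{\nu',d}(k) = k_1 - i(-1)^{\nu'} k_2 + d_1 - i(-1)^{\nu'} d_2 = z_{\nu',0}(k) + d_1 - i(-1)^{\nu'} d_2 = w_{\nu,0}(k) + d_1 - i(-1)^{\nu'} d_2, \]
$|w_{\nu,0}(k)| < \varepsilon$, and $|d_1 - i(-1)^{\nu'} d_2| = |d|$, it follows that
\[ |z_{\nu',d}(k)| - \varepsilon \le |d| \le |z_{\nu',d}(k)| + \varepsilon. \]
Furthermore, by the hypothesis $|v| > R \ge 2\Lambda > 12\varepsilon$ and (44),
\[ \varepsilon < \frac{\Lambda}{6} \le \frac{|v|}{12} \le \frac{|z_{\nu',d}(k)|}{12}. \tag{45} \]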
Thus,
\[ \tfrac{1}{2} |z_{\nu',d}(k)| \le |d| \le 2 |z_{\nu',d}(k)|. \]
This yields the third inequality of part (ii) and completes the proof.

Proof of Lemma 3. We consider all cases at the same time. Therefore, we have either hypothesis (i) with $(\mu,d') = (\nu,0)$ or hypothesis (ii) with $(\mu,d') \in \{(\nu,0),(\nu',d)\}$. Observe that either $(\nu,\nu') = (1,2)$ or $(\nu,\nu') = (2,1)$.

Step 1. Recall the change of variables (14) and set
\[ G_1 := \{ b \in G' \mid |b - d'| < \tfrac{1}{4} R \}, \qquad G_2 := \{ b \in G' \mid |b - d'| \ge \tfrac{1}{4} R \}. \]
Then $G' = G_1 \cup G_2$ and $G_1, G_2 \subset \{ b \in \Gamma^\# \mid |N_b(k)| \ge \varepsilon |v| \}$ by Proposition 7. Furthermore, by Proposition 11, for $(\mu,d') = (\nu,0)$ if (i) or $(\mu,d') \in \{(\nu,0),(\nu',d)\}$ if (ii) we have $|z_{\mu,d'}| \le 3|v|$. Thus, observing the definition of $G_2$,
\[ \begin{aligned}
|R_1(k)| &:= \Big| \sum_{b \in G_1} \sum_{c \in G_2} \frac{f(d'-b)}{N_b(k)} (R^{-1}_{G'G'})_{b,c}\, g(c-d') \Big| \\
&\le \frac{1}{\varepsilon |v|} \| R^{-1}_{G'G'} \| \sum_{b \in G_1} |f(d'-b)| \sum_{c \in G_2} \frac{|c-d'|^2}{|c-d'|^2}\, |g(c-d')| \\
&\le \frac{1}{\varepsilon |v|} \| R^{-1}_{G'G'} \| \, \| f \|_{l^1} \frac{16}{R^2} \| c^2 g(c) \|_{l^1} \le \frac{C_{\varepsilon,f,g}}{|z_{\mu,d'}| R^2},
\end{aligned} \tag{46} \]
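and similarly
\[ |R_2(k)| \le \frac{C_{\varepsilon,f,g}}{|z_{\mu,d'}| R^2}. \tag{47} \]
Hence,
\[ \begin{aligned}
\Phi_{d',d'}(k) &= \Big( \sum_{b,c \in G_1} + \sum_{b \in G_1,\, c \in G_2} + \sum_{b \in G_2,\, c \in G'} \Big) \frac{f(d'-b)}{N_b(k)} (R^{-1}_{G'G'})_{b,c}\, g(c-d') \\
&= \sum_{b,c \in G_1} \frac{f(d'-b)}{N_b(k)} (R^{-1}_{G'G'})_{b,c}\, g(c-d') + R_1(k) + R_2(k)
\end{aligned} \tag{48} \]
with
\[ |R_1(k) + R_2(k)| \le \frac{C_{\varepsilon,f,g}}{|z_{\mu,d'}| R^2}. \tag{49} \]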
Now, if we set $T_{G'G'} := \pi_{G'} - R_{G'G'}$ and recall the convergent series expansion
\[ R^{-1}_{G'G'} = (\pi_{G'} - T_{G'G'})^{-1} = \sum_{j=0}^{\infty} T^j_{G'G'}, \]
we can write
\[ \sum_{b,c \in G_1} \frac{f(d'-b)}{N_b(k)} (R^{-1}_{G'G'})_{b,c}\, g(c-d') = \sum_{b,c \in G_1} \sum_{j=0}^{\infty} \frac{f(d'-b)}{N_b(k)} (T^j_{G'G'})_{b,c}\, g(c-d'). \tag{50} \]
Note that the above equality is justified because $G_1$ is a finite set. Let
\[ G_3 := \{ b \in G' \mid |b - d'| < \tfrac{1}{2} R \}, \qquad G_4 := \{ b \in G' \mid |b - d'| \ge \tfrac{1}{2} R \}. \]
Again, observe that $G' = G_3 \cup G_4$. Thus, we can break $T_{G'G'}$ into
\[ T_{G'G'} = \pi_{G'} T \pi_{G'} = (\pi_{G_3} + \pi_{G_4}) T (\pi_{G_3} + \pi_{G_4}) = T_{33} + T_{43} + T_{34} + T_{44}, \]
where $T_{ij} := \pi_{G_i} T \pi_{G_j}$ for $i,j \in \{3,4\}$. Using this decomposition we prove the following.

Proposition 16. Under the hypotheses of Lemma 3 we have
\[ \sum_{b,c \in G_1} \sum_{j=0}^{\infty} \frac{f(d'-b)}{N_b(k)} (T^j_{G'G'})_{b,c}\, g(c-d') = \sum_{b,c \in G_1} \sum_{j=0}^{\infty} \frac{f(d'-b)}{N_b(k)} (T^j_{33})_{b,c}\, g(c-d') + R_3(k) \]
with $R_3(k)$ given by (75) and
\[ |R_3(k)| \le \frac{C_{\Lambda,f,g}}{|z_{\mu,d'}| R^2}. \tag{51} \]
This proposition will be proved below. Combining this with (48) and (50) we obtain
\[ \Phi_{d',d'}(k) = \sum_{b,c \in G_1} \sum_{j=0}^{\infty} \frac{f(d'-b)}{N_b(k)} (T^j_{33})_{b,c}\, g(c-d') + \sum_{j=1}^{3} R_j(k). \tag{52} \]
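The Neumann series expansion used in Step 1 can be checked numerically on a small example. The following Python sketch is only an illustration: the 3×3 matrix `T` is a hypothetical stand-in for $T_{G'G'}$ (chosen so that all absolute row and column sums are at most 0.4, hence below 17/18), and the helper names are assumptions that do not appear in the paper.

```python
def identity(n):
    """Return the n x n identity matrix as nested lists."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_mul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def neumann_partial_sum(t, terms):
    """Return the partial sum I + T + T^2 + ... + T^(terms - 1)."""
    n = len(t)
    total = identity(n)
    power = identity(n)
    for _ in range(terms - 1):
        power = mat_mul(power, t)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total


# Hypothetical matrix: every absolute row and column sum is at most 0.4.
T = [[0.1, 0.2, 0.0],
     [0.3, 0.0, 0.1],
     [0.0, 0.2, 0.2]]

S = neumann_partial_sum(T, 200)
I3 = identity(3)
I_minus_T = [[I3[i][j] - T[i][j] for j in range(3)] for i in range(3)]
check = mat_mul(I_minus_T, S)

# Largest deviation of (I - T) * S from the identity matrix.
residual = max(abs(check[i][j] - I3[i][j]) for i in range(3) for j in range(3))
# Largest absolute row sum of S, to compare with the geometric bound 1 / (1 - 0.4).
max_row_sum = max(sum(abs(x) for x in row) for row in S)
print(residual, max_row_sum)
```

The residual is at the level of rounding error, and the row-sum norm of the partial sum stays below the geometric-series bound, which is the mechanism behind the bound $\|R^{-1}_{G'G'}\| \le 18$ when $\|T_{G'G'}\| < 17/18$.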
Step 2. We now look in detail at the operator $T_{33}$ and its powers $T_{33}^j$. Recall that $\theta_\mu(b) = \frac{1}{2} ((-1)^\mu b_2 + i b_1)$ and set $\mu' := \mu - (-1)^\mu$ so that $(-1)^{\mu'} = -(-1)^\mu$. Then,
\[ N_b(k) = N_{b,\mu}(k) N_{b,\mu'}(k) = (w_{\mu,d'} - 2i\theta_{\mu'}(b-d'))(z_{\mu,d'} - 2i\theta_\mu(b-d')). \]
Extend the definition of $\theta_\mu(y)$ to any $y \in \mathbb{C}^2$. Thus,
\[ 2(k+d') \cdot \hat A(b-c) = -2i\theta_\mu(\hat A(b-c))\, w_{\mu,d'} - 2i\theta_{\mu'}(\hat A(b-c))\, z_{\mu,d'}. \]
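Hence,
\[ \begin{aligned}
T_{b,c} &= \frac{1}{N_c(k)} \big( 2(c+k) \cdot \hat A(b-c) - \hat q(b-c) \big) \\
&= \frac{2(c-d') \cdot \hat A(b-c) - \hat q(b-c) + 2(k+d') \cdot \hat A(b-c)}{(w_{\mu,d'} - 2i\theta_{\mu'}(c-d'))(z_{\mu,d'} - 2i\theta_\mu(c-d'))} \\
&= X_{b,c} + Y_{b,c},
\end{aligned} \tag{53} \]
where
\[ X_{b,c} := \frac{2(c-d') \cdot \hat A(b-c) - \hat q(b-c) - 2i\theta_\mu(\hat A(b-c))\, w_{\mu,d'}}{(w_{\mu,d'} - 2i\theta_{\mu'}(c-d'))(z_{\mu,d'} - 2i\theta_\mu(c-d'))}, \tag{54} \]
\[ Y_{b,c} := \frac{-2i\theta_{\mu'}(\hat A(b-c))\, z_{\mu,d'}}{(w_{\mu,d'} - 2i\theta_{\mu'}(c-d'))(z_{\mu,d'} - 2i\theta_\mu(c-d'))}. \tag{55} \]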
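Let $X$ and $Y$ be the operators whose matrix elements are, respectively, $X_{b,c}$ and $Y_{b,c}$. Set
\[ X_{33} := \pi_{G_3} X \pi_{G_3} \qquad \text{and} \qquad Y_{33} := \pi_{G_3} Y \pi_{G_3}. \]
We next prove the following estimates,
\[ \| X_{33} \| \le \Big( 20 \| \hat A \|_{l^1} + \frac{4}{\Lambda} \| \hat q \|_{l^1} \Big) \frac{1}{|z_{\mu,d'}|_R} < \frac{1}{3}, \qquad \| Y_{33} \| \le \frac{8}{\Lambda} \| \theta_{\mu'}(\hat A) \|_{l^1} < \frac{1}{14}, \tag{56} \]
where $|z_{\mu,d'}|_R := 2|z_{\mu,d'}| - R$. First observe that the "vector" $b \in \Gamma^\#$ has the same length as the complex number $2i\theta_\mu(b)$:
\[ |b| = |(b_1,b_2)| = |b_1 + i(-1)^\mu b_2| = |2i\theta_\mu(b)|. \tag{57} \]
Thus, for $b \in G_3$,
\[ \frac{1}{R} |2i\theta_\mu(b-d')| = \frac{|b-d'|}{R} < \frac{1}{2}. \]
Consequently,
\[ \frac{1}{|z_{\mu,d'} - 2i\theta_\mu(b-d')|} \le \frac{1}{|z_{\mu,d'}| - |2i\theta_\mu(b-d')|} < \frac{1}{|z_{\mu,d'}| - \frac{1}{2} R} = \frac{2}{|z_{\mu,d'}|_R}. \tag{58} \]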
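Furthermore, for $b \in G'$,
\[ \frac{1}{|w_{\mu,d'} - 2i\theta_{\mu'}(b-d')|} \le \frac{1}{|b-d'| - |w_{\mu,d'}|} \le \frac{1}{|b-d'| - \varepsilon} \tag{59} \]
\[ \le \frac{1}{2\Lambda - \Lambda} = \frac{1}{\Lambda}. \tag{60} \]
Here we have used that $|w_{\mu,d'}| < \varepsilon < \Lambda$ and $|b-d'| \ge 2\Lambda$ for all $b \in G'$. Using again that $\varepsilon < \Lambda \le |c-d'|/2$ for all $c \in G'$ we have
\[ \frac{|c-d'|}{|c-d'| - \varepsilon} < 2. \tag{61} \]
Finally recall that
\[ \frac{\varepsilon}{\Lambda} < \frac{1}{6} \qquad \text{and} \qquad \frac{1}{|z_{\mu,d'}|} \le \frac{1}{|v|} < \frac{1}{R}, \tag{62} \]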
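where the last inequality follows from Proposition 11 since $|v| > R$ by hypothesis. Then, using the above inequalities and Proposition 5, the bounds (56) for $\| X_{33} \|$ and $\| Y_{33} \|$ follow from the estimates
\[ \begin{aligned}
\Big[ \sup_{c \in G_3} \sum_{b \in G_3} + \sup_{b \in G_3} \sum_{c \in G_3} \Big] |X_{b,c}|
&\le \Big[ \sup_{c} \sum_{b} + \sup_{b} \sum_{c} \Big] \frac{2|c-d'|\, |\hat A(b-c)| + |\hat q(b-c)| + |2i\theta_\mu(\hat A(b-c))|\, |w_{\mu,d'}|}{|w_{\mu,d'} - 2i\theta_{\mu'}(c-d')|\, |z_{\mu,d'} - 2i\theta_\mu(c-d')|} \\
&\le \frac{2}{|z_{\mu,d'}|_R} \Big[ \sup_{c} \sum_{b} + \sup_{b} \sum_{c} \Big] \Big( \frac{2|c-d'|\, |\hat A(b-c)|}{|w_{\mu,d'} - 2i\theta_{\mu'}(c-d')|} + \frac{|\hat q(b-c)| + \varepsilon \sqrt{2}\, |\hat A(b-c)|}{|w_{\mu,d'} - 2i\theta_{\mu'}(c-d')|} \Big) \\
&\le \frac{2}{|z_{\mu,d'}|_R} \Big[ \sup_{c} \sum_{b} + \sup_{b} \sum_{c} \Big] \Big( \frac{2|c-d'|\, |\hat A(b-c)|}{|c-d'| - \varepsilon} + \frac{|\hat q(b-c)| + \varepsilon \sqrt{2}\, |\hat A(b-c)|}{\Lambda} \Big) \\
&\le \frac{2}{|z_{\mu,d'}|_R}\, 2 \Big( \Big( 4 + \frac{\varepsilon \sqrt{2}}{\Lambda} \Big) \| \hat A \|_{l^1} + \frac{1}{\Lambda} \| \hat q \|_{l^1} \Big) \\
&\le \Big( 20 \| \hat A \|_{l^1} + \frac{4}{\Lambda} \| \hat q \|_{l^1} \Big) \frac{1}{|z_{\mu,d'}|_R} \\
&\le \Big( 20 \| \hat A \|_{l^1} + \frac{4}{\Lambda} \| \hat q \|_{l^1} \Big) \frac{1}{R} < \frac{1}{3}
\end{aligned} \]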
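and similarly
\[ \Big[ \sup_{c \in G_3} \sum_{b \in G_3} + \sup_{b \in G_3} \sum_{c \in G_3} \Big] |Y_{b,c}| \le \frac{8}{\Lambda} \| \theta_{\mu'}(\hat A) \|_{l^1} < \frac{1}{14}. \]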
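Step 3. We now look in detail at $T_{33}^j$. For each integer $j \ge 1$ write
\[ T_{33}^j = (X_{33} + Y_{33})^j = Z_j + W_j + Y_{33}^j, \tag{63} \]
where $W_j$ is the sum of the $j$ terms containing only one factor $X_{33}$ and $j-1$ factors $Y_{33}$,
\[ W_j := \sum_{m=1}^{j} (Y_{33})^{m-1} X_{33} (Y_{33})^{j-m}, \qquad Z_j := (X_{33} + Y_{33})^j - W_j - Y_{33}^j. \]
In view of (56) we have
\[ \| Y_{33}^j \| \le \Big( \frac{1}{14} \Big)^j, \qquad \| W_j \| \le j \| X_{33} \| \, \| Y_{33} \|^{j-1} \le j\, \frac{C_{\Lambda,A,q}}{|z_{\mu,d'}|_R} \Big( \frac{1}{14} \Big)^{j-1}, \]
\[ \| Z_j \| \le (2^j - j - 1) \| X_{33} \|^2 \Big( \frac{1}{3} \Big)^{j-2} \le \frac{C_{\Lambda,A,q}}{|z_{\mu,d'}|_R^2} \Big( \frac{2}{3} \Big)^j. \]
Hence, the series
\[ S := \sum_{j=0}^{\infty} Y_{33}^j = (I - Y_{33})^{-1}, \qquad W := \sum_{j=1}^{\infty} W_j \qquad \text{and} \qquad Z := \sum_{j=2}^{\infty} Z_j \tag{64} \]
converge, and the operator norms of $W$ and $Z$ decay with respect to $|z_{\mu,d'}|$. Indeed,
\[ \| S \| \le \sum_{j=0}^{\infty} \| Y_{33} \|^j \le \sum_{j=0}^{\infty} \Big( \frac{1}{14} \Big)^j < C, \]
\[ \| W \| \le \sum_{j=1}^{\infty} \| W_j \| \le \frac{C_{\Lambda,A,q}}{2|z_{\mu,d'}| - R} \sum_{j=1}^{\infty} j \Big( \frac{1}{14} \Big)^{j-1} < \frac{C_{\Lambda,A,q}}{|z_{\mu,d'}|_R}, \]
\[ \| Z \| \le \sum_{j=2}^{\infty} \| Z_j \| \le \frac{C_{\Lambda,A,q}}{|z_{\mu,d'}|_R^2} \sum_{j=2}^{\infty} \Big( \frac{2}{3} \Big)^j \le \frac{C_{\Lambda,A,q}}{|z_{\mu,d'}|_R^2}. \]
Thus, we have the expansion
\[ \sum_{j=0}^{\infty} T_{33}^j = S + W + Z. \]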
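The combinatorial factor $2^j - j - 1$ in the bound for $\| Z_j \|$ counts the words in $(X_{33} + Y_{33})^j$ that contain at least two factors $X_{33}$. As an illustrative check (not part of the paper's argument), the following Python sketch enumerates all words of length $j$ in two formal letters and classifies them by the number of `X` factors; the function name is hypothetical.

```python
from itertools import product


def count_words_by_x_factors(j):
    """Classify the 2**j words of length j in the letters X, Y
    by how many X factors they contain."""
    no_x = one_x = two_or_more_x = 0
    for word in product("XY", repeat=j):
        n_x = word.count("X")
        if n_x == 0:
            no_x += 1          # contributes to Y_33^j
        elif n_x == 1:
            one_x += 1         # contributes to W_j
        else:
            two_or_more_x += 1  # contributes to Z_j
    return no_x, one_x, two_or_more_x


for j in range(1, 7):
    print(j, count_words_by_x_factors(j))
```

Each word with at least two `X` factors has norm at most $\| X_{33} \|^2$ times a product of $j - 2$ factors that are each smaller than $1/3$, which is how the estimate for $\| Z_j \|$ arises.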
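Step 4. Consequently,
\[ \begin{aligned}
\sum_{b,c \in G_1} \sum_{j=0}^{\infty} \frac{f(d'-b)}{N_b(k)} (T_{33}^j)_{b,c}\, g(c-d')
&= \sum_{b,c \in G_1} \frac{f(d'-b)\, (S + W + Z)_{b,c}\, g(c-d')}{(w_{\mu,d'} - 2i\theta_{\mu'}(b-d'))(z_{\mu,d'} - 2i\theta_\mu(b-d'))} \\
&= \alpha^{(1)}_{\mu,d'} + \alpha^{(2)}_{\mu,d'} + R_4,
\end{aligned} \tag{65} \]
where
\[ \alpha^{(1)}_{\mu,d'}(k) := \sum_{b,c \in G_1} \frac{f(d'-b)\, S_{b,c}(k)\, g(c-d')}{(w_{\mu,d'}(k) - 2i\theta_{\mu'}(b-d'))(z_{\mu,d'}(k) - 2i\theta_\mu(b-d'))}, \tag{66} \]
\[ \alpha^{(2)}_{\mu,d'}(k) := \sum_{b,c \in G_1} \frac{f(d'-b)\, W_{b,c}(k)\, g(c-d')}{(w_{\mu,d'}(k) - 2i\theta_{\mu'}(b-d'))(z_{\mu,d'}(k) - 2i\theta_\mu(b-d'))} \tag{67} \]
and
\[ R_4(k) := \sum_{b,c \in G_1} \frac{f(d'-b)\, Z_{b,c}(k)\, g(c-d')}{(w_{\mu,d'}(k) - 2i\theta_{\mu'}(b-d'))(z_{\mu,d'}(k) - 2i\theta_\mu(b-d'))}. \]
By a short calculation as in (74), using (58) and (60) we find that
\[ \begin{aligned}
|\alpha^{(1)}_{\mu,d'}(k)| &\le \frac{1}{\Lambda} \frac{2}{2|z_{\mu,d'}| - R} \| f \|_{l^1} \| g \|_{l^1} \| S \| \le \frac{C_{\Lambda,f,g}}{|z_{\mu,d'}|_R}, \\
|\alpha^{(2)}_{\mu,d'}(k)| &\le \frac{1}{\Lambda} \frac{2}{2|z_{\mu,d'}| - R} \| f \|_{l^1} \| g \|_{l^1} \| W \| \le \frac{C_{\Lambda,A,q,f,g}}{|z_{\mu,d'}|_R^2}, \\
|R_4(k)| &\le \frac{1}{\Lambda} \frac{2}{2|z_{\mu,d'}| - R} \| f \|_{l^1} \| g \|_{l^1} \| Z \| \le \frac{C_{\Lambda,A,q,f,g}}{|z_{\mu,d'}|_R^3}.
\end{aligned} \tag{68} \]
Hence, recalling (52) we conclude that
\[ \Phi_{d',d'} = \alpha^{(1)}_{\mu,d'} + \alpha^{(2)}_{\mu,d'} + \alpha^{(3)}_{\mu,d'}, \]
where
\[ \alpha^{(3)}_{\mu,d'}(k) := \sum_{j=1}^{4} R_j(k). \tag{69} \]
Furthermore, in view of (49), (51) and (68), since
\[ \frac{1}{|z_{\mu,d'}|_R^3} = \frac{1}{(2|z_{\mu,d'}| - R)^3} < \frac{1}{|z_{\mu,d'}| R^2}, \]
for $1 \le j \le 2$ we have
\[ |\alpha^{(j)}_{\mu,d'}(k)| \le \frac{C_j}{|z_{\mu,d'}(k)|_R^j} \qquad \text{and} \qquad |\alpha^{(3)}_{\mu,d'}(k)| \le \frac{C_3}{|z_{\mu,d'}(k)| R^2}, \]
where $C_j = C_{j;\Lambda,A,q,f,g}$ and $C_3 = C_{3;\varepsilon,\Lambda,A,q,f,g}$ are constants. This proves the main statement of the lemma. Finally observe that, since $G_3$ is a finite set, the matrices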
$X_{33}$ and $Y_{33}$ are analytic in $k$ because their matrix elements are analytic functions of $k$. (Note, the functions $w_{\mu,d'}(k)$ and $z_{\mu,d'}(k)$ are analytic.) Consequently, the matrices $W_j$ and $Z_j$ are also analytic and so are $S_{b,c}$, $W_{b,c}$ and $Z_{b,c}$ because the series (64) converge uniformly with respect to $k$. Thus, all the functions $\alpha^{(j)}_{\mu,d'}(k)$ are analytic in the region under consideration. This completes the proof of the lemma.

Proof of Proposition 16. Step 1. Recall that $T_{G'G'} = T_{33} + T_{34} + T_{43} + T_{44}$ with $T_{ij} = \pi_{G_i} T \pi_{G_j}$ and set $X_{33}^{(0)} := 0$, $Y_{34}^{(0)} := T_{34}$, $W_{43}^{(0)} := T_{43}$, and $Z_{44}^{(0)} := T_{44}$. It is straightforward to verify that, for any integer $j \ge 0$,
\[ T_{G'G'}^{j+1} = T_{33}^{j+1} + X_{33}^{(j)} + Y_{34}^{(j)} + W_{43}^{(j)} + Z_{44}^{(j)}, \tag{70} \]
where
\[ \begin{aligned}
X_{33}^{(j)} &:= T_{33} X_{33}^{(j-1)} + T_{34} W_{43}^{(j-1)} &&: L^2_{G_3} \to L^2_{G_3}, \\
Y_{34}^{(j)} &:= T_{33} Y_{34}^{(j-1)} + T_{34} Z_{44}^{(j-1)} &&: L^2_{G_3} \to L^2_{G_4}, \\
W_{43}^{(j)} &:= T_{43} T_{33}^{j} + T_{43} X_{33}^{(j-1)} + T_{44} W_{43}^{(j-1)} &&: L^2_{G_4} \to L^2_{G_3}, \\
Z_{44}^{(j)} &:= T_{43} Y_{34}^{(j-1)} + T_{44} Z_{44}^{(j-1)} &&: L^2_{G_4} \to L^2_{G_4}.
\end{aligned} \tag{71} \]
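Step 2. Since $\pi_{G_1} \pi_{G_4} = \pi_{G_4} \pi_{G_1} = 0$ and $\pi_{G_1} \pi_{G_3} = \pi_{G_3} \pi_{G_1} = \pi_{G_1}$, substituting (70) into the sum below for the terms where $j \ge 1$ we have, recalling that $X_{33}^{(0)} = 0$,
\[ \begin{aligned}
\sum_{b,c \in G_1} \sum_{j=0}^{\infty} \frac{f(d'-b)}{N_b(k)} (T_{G'G'}^j)_{b,c}\, g(c-d')
&= \sum_{b,c \in G_1} \sum_{j=0}^{\infty} \frac{f(d'-b)}{N_b(k)} (T_{33}^j)_{b,c}\, g(c-d') \\
&\quad + \sum_{b,c \in G_1} \sum_{j=1}^{\infty} \frac{f(d'-b)}{N_b(k)} (X_{33}^{(j)})_{b,c}\, g(c-d').
\end{aligned} \tag{72} \]
Now recall from (58) and (60) that, for all $b \in G_3$,
\[ \frac{1}{|N_b(k)|} \le \frac{1}{\Lambda} \frac{2}{|z_{\mu,d'}|_R}, \tag{73} \]
and observe that $G_1 \subset G_3$. Let $M$ be either $T_{G'G'}$ or $T_{33}$. Then, the estimate
\[ \begin{aligned}
\Big| \sum_{b,c \in G_1} \frac{f(d'-b)}{N_b(k)} (M^j)_{b,c}\, g(c-d') \Big|
&= \Big| \Big\langle \sum_{b \in G_1} \overline{\frac{f(d'-b)}{N_b(k)}}\, \frac{e^{ib \cdot x}}{|\Gamma|^{1/2}},\; M^j \sum_{c \in G_1} g(c-d')\, \frac{e^{ic \cdot x}}{|\Gamma|^{1/2}} \Big\rangle \Big| \\
&\le \frac{1}{\Lambda} \frac{2}{|z_{\mu,d'}|_R} \| f \|_{l^1} \| g \|_{l^1} \| M \|^j
\end{aligned} \tag{74} \]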
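implies that the left-hand side and the first term on the right-hand side of (72) converge because $\| M \| < 17/18$. Thus, the last term in (72) also converges. Hence, we are left to show that
\[ R_3(k) := \sum_{b,c \in G_1} \sum_{j=1}^{\infty} \frac{f(d'-b)}{N_b(k)} (X_{33}^{(j)})_{b,c}\, g(c-d') \tag{75} \]
obeys
\[ |R_3(k)| \le \frac{C_{\Lambda,f,g}}{|z_{\mu,d'}| R^2}. \]
In order to do this we need the following inequality, which we prove later.

Proposition 17. Consider a constant $\beta \ge 0$ and suppose that $\| (1 + |b|^\beta) \hat q(b) \|_{l^1} < \infty$ and $\| (1 + |b|^\beta) \hat A(b) \|_{l^1} < 2\varepsilon/63$. Suppose further that $|v| > \frac{2}{\varepsilon} \| (1 + |b|^\beta) \hat A(b) \|_{l^1}$. Then, for any $B, C \subset G'$ and $m \ge 1$,
\[ \| \pi_B T_{G'G'}^m \pi_C \| \le \big( 1 + (2\Lambda)^{\beta - \lceil \beta \rceil} \lceil \beta \rceil\, m^{\lceil \beta \rceil - 1} \big) \Big( \frac{17}{18} \Big)^m \sup_{b \in B,\, c \in C} \frac{1}{1 + |b-c|^\beta}, \]
where $\lceil \beta \rceil$ is the smallest integer greater than or equal to $\beta$.

Step 3. Now observe that, if $b \in G_1$ and $c \in G_4$ then
\[ |b - c| = |b - d' - (c - d')| \ge |c - d'| - |b - d'| \ge \frac{R}{2} - \frac{R}{4} = \frac{R}{4}. \]
Thus, applying the last proposition with $\beta = 2$ and recalling that $G_3 \subset G'$, for $m \ge 0$ we have
\[ \| \pi_{G_1} T_{33}^m T_{34} \| \le \| \pi_{G_1} T_{G'G'}^m T_{G'G_4} \| = \| \pi_{G_1} T_{G'G'}^{m+1} \pi_{G_4} \| \le \frac{3(m+1)}{1 + \frac{1}{16} R^2} \Big( \frac{17}{18} \Big)^{m+1}. \]
Furthermore, since $\pi_{G_4} \pi_{G_3} = \pi_{G_4} \pi_{G_1} = 0$ and $\pi_{G_3} \pi_{G_1} = \pi_{G_1}$, from (70) we obtain
\[ W_{43}^{(j)} \pi_{G_1} = \pi_{G_4} T_{G'G'}^{j+1} \pi_{G_3} \pi_{G_1} = \pi_{G_4} T_{G'G'}^{j+1} \pi_{G_1}. \]
Hence,
\[ \| W_{43}^{(j)} \pi_{G_1} \| = \| \pi_{G_4} T_{G'G'}^{j+1} \pi_{G_1} \| \le \| T_{G'G'} \|^{j+1} < \Big( \frac{17}{18} \Big)^{j+1}. \]
Therefore, for $0 \le m < j$,
\[ \| \pi_{G_1} T_{33}^m T_{34} W_{43}^{(j-m-1)} \pi_{G_1} \| \le \| \pi_{G_1} T_{33}^m T_{34} \| \, \| W_{43}^{(j-m-1)} \pi_{G_1} \| \le \frac{3(m+1)}{1 + \frac{1}{16} R^2} \Big( \frac{17}{18} \Big)^{j+1}. \]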
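Iterating the first expression in (71) we find that
\[ \begin{aligned}
X_{33}^{(j)} &= T_{34} W_{43}^{(j-1)} + T_{33} X_{33}^{(j-1)} \\
&= T_{34} W_{43}^{(j-1)} + T_{33} T_{34} W_{43}^{(j-2)} + T_{33}^2 X_{33}^{(j-2)} \\
&\;\;\vdots \\
&= T_{34} W_{43}^{(j-1)} + T_{33} T_{34} W_{43}^{(j-2)} + \cdots + T_{33}^{j-2} T_{34} W_{43}^{(1)} + T_{33}^{j-1} T_{34} W_{43}^{(0)} \\
&= \sum_{m=0}^{j-1} T_{33}^m T_{34} W_{43}^{(j-m-1)}.
\end{aligned} \tag{76} \]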
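Thus, using the above inequality,
\[ \begin{aligned}
\| \pi_{G_1} X_{33}^{(j)} \pi_{G_1} \| &= \Big\| \sum_{m=0}^{j-1} \pi_{G_1} T_{33}^m T_{34} W_{43}^{(j-m-1)} \pi_{G_1} \Big\| \le \sum_{m=0}^{j-1} \| \pi_{G_1} T_{33}^m T_{34} W_{43}^{(j-m-1)} \pi_{G_1} \| \\
&\le \frac{3}{1 + \frac{1}{16} R^2} \Big( \frac{17}{18} \Big)^{j+1} \sum_{m=0}^{j-1} (m+1) = \frac{3}{2 + \frac{1}{8} R^2}\, (j^2 + j) \Big( \frac{17}{18} \Big)^{j+1}.
\end{aligned} \]
Consequently,
\[ \Big\| \pi_{G_1} \sum_{j=1}^{\infty} X_{33}^{(j)} \pi_{G_1} \Big\| \le \sum_{j=1}^{\infty} \| \pi_{G_1} X_{33}^{(j)} \pi_{G_1} \| \le \frac{3}{2 + \frac{1}{8} R^2} \sum_{j=1}^{\infty} (j^2 + j) \Big( \frac{17}{18} \Big)^{j+1} \le \frac{C}{R^2}, \]
where $C$ is a universal constant. Finally, using this and (73), since $|z_{\mu,d'}| \le 3|v|$ we have
\[ |R_3(k)| = \Big| \sum_{b,c \in G_1} \frac{f(d'-b)}{N_b(k)} \Big[ \sum_{j=1}^{\infty} X_{33}^{(j)} \Big]_{b,c} g(c-d') \Big| \le \frac{6C}{\Lambda} \| f \|_{l^1} \| g \|_{l^1} \frac{1}{|z_{\mu,d'}| R^2}. \]
In view of (72) and (75) this completes the proof.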
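The series that produces the universal constant in the last estimate has a closed form: differentiating the geometric series twice gives $\sum_{j \ge 1} (j^2 + j) r^{j+1} = 2r^2/(1-r)^3$ for $|r| < 1$. The Python sketch below is an illustrative check of this value for $r = 17/18$; the variable names are assumptions made for the example, not notation from the paper.

```python
from fractions import Fraction

r = Fraction(17, 18)

# Closed form of sum_{j >= 1} (j^2 + j) * r^(j + 1), evaluated exactly.
closed_form = 2 * r ** 2 / (1 - r) ** 3

# Floating-point partial sum for comparison; the tail beyond j = 5000 is negligible.
rf = float(r)
partial_sum = sum((j * j + j) * rf ** (j + 1) for j in range(1, 5000))

print(closed_form, partial_sum)
```

Since $3/(2 + R^2/8) \le 24/R^2$, one admissible (not necessarily optimal) choice under this computation is $C = 24 \cdot 10404$.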
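Proof of Proposition 17. For any $b, c \in \Gamma^\#$ set $Q_{b,c} := (1 + |b-c|^\beta) T_{b,c}$. We first claim that, for any $B, C \subset G'$,
\[ \sup_{b \in B} \sum_{c \in C} |Q_{b,c}| < \frac{17}{18} \qquad \text{and} \qquad \sup_{c \in C} \sum_{b \in B} |Q_{b,c}| < \frac{17}{18}. \tag{77} \]
In fact, using the bounds (11), (12) and $|k| \le 3|v|$, it follows that
\[ \begin{aligned}
\sup_{b \in B} \sum_{c \in C} |Q_{b,c}| &= \sup_{b \in B} \sum_{c \in C} (1 + |b-c|^\beta) \Big| \frac{\hat q(b-c)}{N_c(k)} - \frac{2c \cdot \hat A(b-c)}{N_c(k)} - \frac{2k \cdot \hat A(b-c)}{N_c(k)} \Big| \\
&\le \| (1 + |b|^\beta) \hat q(b) \|_{l^1} \frac{1}{\varepsilon |v|} + \| (1 + |b|^\beta) \hat A(b) \|_{l^1} \frac{14}{\varepsilon} < \frac{1}{2} + \frac{4}{9} = \frac{17}{18},
\end{aligned} \]
and similarly we prove the second bound in (77). Furthermore, since $|T_{b,c}| \le |Q_{b,c}|$ for all $b, c \in \Gamma^\#$, for any integer $m \ge 1$ we have
\[ \sup_{b \in B} \sum_{c \in C} |(T_{BC}^m)_{b,c}| < \Big( \frac{17}{18} \Big)^m \qquad \text{and} \qquad \sup_{c \in C} \sum_{b \in B} |(T_{BC}^m)_{b,c}| < \Big( \frac{17}{18} \Big)^m. \]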
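Now, let $p$ be the smallest integer greater than or equal to $\beta$, and for any integer $m \ge 1$ and any $\xi_0, \xi_1, \ldots, \xi_m \in \Gamma^\#$, let $b = \xi_0$ and $c = \xi_m$. Then,
\[ \begin{aligned}
|b-c|^\beta &= (2\Lambda)^\beta \Big( \frac{|b-c|}{2\Lambda} \Big)^\beta \le (2\Lambda)^\beta \Big( \frac{|b-c|}{2\Lambda} \Big)^p \\
&\le \frac{(2\Lambda)^\beta}{(2\Lambda)^p} \sum_{i_1,\ldots,i_p=1}^{m} |\xi_{i_1-1} - \xi_{i_1}| \cdots |\xi_{i_p-1} - \xi_{i_p}| \\
&\le (2\Lambda)^{\beta-p} \sum_{i_1,\ldots,i_p=1}^{m} \big( |\xi_{i_1-1} - \xi_{i_1}|^p + \cdots + |\xi_{i_p-1} - \xi_{i_p}|^p \big) \\
&= (2\Lambda)^{\beta-p}\, p\, m^{p-1} \sum_{i=1}^{m} |\xi_{i-1} - \xi_i|^p \\
&\le (2\Lambda)^{\beta-p}\, p\, m^{p-1} \prod_{i=1}^{m} \big( 1 + |\xi_{i-1} - \xi_i|^p \big).
\end{aligned} \tag{78} \]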
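The last three comparisons in (78) are elementary inequalities for nonnegative step lengths, and they can be tested on a concrete path. The Python sketch below is an illustration only: the path is a hypothetical sequence of points in $\mathbb{Z}^2$ (not the dual lattice of the paper), the common prefactor $(2\Lambda)^{\beta-p}$ is dropped from all three quantities, and the function name is an assumption.

```python
import math


def chain_of_bounds(path, p):
    """Return (lhs, middle, rhs) where
    lhs    = |xi_0 - xi_m|^p,
    middle = p * m^(p-1) * sum_i |xi_{i-1} - xi_i|^p,
    rhs    = p * m^(p-1) * prod_i (1 + |xi_{i-1} - xi_i|^p)."""
    steps = [math.dist(path[i - 1], path[i]) for i in range(1, len(path))]
    m = len(steps)
    lhs = math.dist(path[0], path[-1]) ** p
    middle = p * m ** (p - 1) * sum(s ** p for s in steps)
    rhs = p * m ** (p - 1) * math.prod(1.0 + s ** p for s in steps)
    return lhs, middle, rhs


# Hypothetical path xi_0, ..., xi_4 with m = 4 steps.
example_path = [(0, 0), (1, 2), (3, 1), (3, 4), (-1, 2)]
for p in (1, 2, 3):
    print(p, chain_of_bounds(example_path, p))
```

For every integer $p \ge 1$ the three returned values are nondecreasing, in line with the triangle inequality, the bound of a product by a sum of $p$-th powers, and the bound of a sum by the corresponding product.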
To simplify the notation write s := supb∈B, c∈C sup
b∈B c∈C
1 . 1+|b−c|β
|(TGm G )b,c |
≤ sup b∈B c∈C
1 sup (1 + |b − c|β )|(TGm G )b,c | 1 + |b − c|β b∈B c∈C
Hence,
≤ s sup
b∈B c∈C
×
|(TGm G )b,c | + (2Λ)β−p p mp−1 sup
b∈B ξ ∈G 1
(1 + |ξ1 − ξ2 |2 )|Tξ1 ,ξ2 | · · ·
ξ2 ∈G
(1 + |b − ξ1 |β )|Tb,ξ1 |
(1 + |ξm−1 − c|2 )|Tξm−1 ,c |
c∈C
m 17 + (2Λ)β−p p mp−1 sup (1 + |b − ξ1 |2 )|Tb,ξ1 | ≤ s 18 b∈B ξ1 ∈G
× sup
ξ1 ∈G ξ ∈G 2
(1+|ξ1 − ξ2 |2 )|Tξ1 ,ξ2 | · · ·
≤ s (1 + (2Λ)β−p p mp−1 )
17 18
sup
ξm−1 ∈G c∈C
(1 + |ξm−1 − c|2 )|Tξm−1 ,c |
m ,
and similarly we prove the other inequality. Therefore, by Proposition 5, πB TGm G πC
≤ (1 + (2Λ)
β−β
βm
β −1
)
17 18
m sup b∈B c∈C
1 , 1 + |b − c|β
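where $\lceil \beta \rceil$ is the smallest integer greater than or equal to $\beta$. This is the desired estimate.

Proof of Lemma 4. To simplify the notation write $w = w_{\mu,d'}$, $z = z_{\mu,d'}$, and $|z|_R = 2|z| - R$. First observe that
\[ \frac{1}{w - 2i\theta_{\mu'}(c-d')} = \frac{-1}{2i\theta_{\mu'}(c-d')} + \frac{w}{2i\theta_{\mu'}(c-d')\,(w - 2i\theta_{\mu'}(c-d'))}, \]
so that
\[ \begin{aligned}
\frac{z}{N_c(k)} &= \frac{-1}{2i\theta_{\mu'}(c-d')} + \frac{w}{2i\theta_{\mu'}(c-d')\,(w - 2i\theta_{\mu'}(c-d'))} + \frac{1}{w - 2i\theta_{\mu'}(c-d')} \, \frac{2i\theta_\mu(c-d')}{z - 2i\theta_\mu(c-d')} \\
&=: \eta_c^{(0)} + \eta_c^{(w)} + \eta_c^{(z)},
\end{aligned} \]
where, in view of (58) to (61), since $|w| < \varepsilon$,
\[ |\eta_c^{(0)}| \le \frac{1}{2\Lambda}, \qquad |\eta_c^{(w)}| \le \frac{\varepsilon}{2\Lambda^2} \qquad \text{and} \qquad |\eta_c^{(z)}| \le \frac{4}{|z|_R}. \]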
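Hence,
\[ \begin{aligned}
Y_{b,c} &= \frac{-2i\theta_{\mu'}(\hat A(b-c))\, z}{N_c(k)} \\
&= -2i\theta_{\mu'}(\hat A(b-c))\, \eta_c^{(0)} - 2i\theta_{\mu'}(\hat A(b-c))\, \eta_c^{(w)} - 2i\theta_{\mu'}(\hat A(b-c))\, \eta_c^{(z)} \\
&=: Y_{b,c}^{(0)} + Y_{b,c}^{(w)} + Y_{b,c}^{(z)}.
\end{aligned} \]
Let $Y^{(\cdot)}$ be the operator whose matrix elements are $Y_{b,c}^{(\cdot)}$ and set $Y_{33}^{(\cdot)} := \pi_{G_3} Y^{(\cdot)} \pi_{G_3}$. Then, similarly as we estimated $\| Y_{33} \|$, using (58) to (61) and Proposition 5, it follows easily that
\[ \| Y_{33}^{(0)} \| \le \frac{1}{2\Lambda} \| \theta_{\mu'}(\hat A) \|_{l^1}, \qquad \| Y_{33}^{(w)} \| \le \frac{\varepsilon}{2\Lambda^2} \| \theta_{\mu'}(\hat A) \|_{l^1}, \qquad \| Y_{33}^{(z)} \| \le \frac{4}{|z|_R} \| \theta_{\mu'}(\hat A) \|_{l^1}. \]
Furthermore,
\[ \begin{aligned}
S = (I - Y_{33})^{-1} &= 1 + (1 - Y_{33})^{-1} Y_{33} = 1 + S Y_{33} \\
&= 1 + (1 + S Y_{33}) Y_{33} = 1 + Y_{33}^{(0)} + Y_{33}^{(w)} + Y_{33}^{(z)} + S Y_{33}^2,
\end{aligned} \]
where, recalling (56),
\[ \| S Y_{33}^2 \| \le \| (1 - Y_{33})^{-1} \| \, \| Y_{33} \|^2 \le \frac{\| Y_{33} \|^2}{1 - \| Y_{33} \|} < \frac{14}{13} \Big( \frac{8}{\Lambda} \Big)^2 \| \theta_{\mu'}(\hat A) \|_{l^1}^2. \]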
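Combining all this we have
\[ \begin{aligned}
\frac{z\, S_{b,c}}{N_b(k)} &= (\eta_b^{(0)} + \eta_b^{(w)}) \big( \delta_{b,c} + Y_{b,c}^{(0)} + Y_{b,c}^{(w)} + Y_{b,c}^{(z)} + (S Y_{33}^2)_{b,c} \big) + \eta_b^{(z)} S_{b,c} \\
&= \big[ \eta_b^{(0)} (\delta_{b,c} + Y_{b,c}^{(0)}) \big] + \big[ \eta_b^{(0)} Y_{b,c}^{(w)} + \eta_b^{(w)} (\delta_{b,c} + Y_{b,c}^{(0)} + Y_{b,c}^{(w)}) \big] \\
&\quad + \big[ (\eta_b^{(0)} + \eta_b^{(w)}) (S Y_{33}^2)_{b,c} \big] + \big[ (\eta_b^{(0)} + \eta_b^{(w)}) Y_{b,c}^{(z)} + \eta_b^{(z)} S_{b,c} \big] \\
&=: K_{b,c}^{(0)} + K_{b,c}^{(1)} + K_{b,c}^{(2)} + K_{b,c}^{(3)}
\end{aligned} \]
with
\[ \begin{aligned}
|K_{b,c}^{(0)}| &\le \frac{1}{2\Lambda} \Big( 1 + \frac{1}{2\Lambda} \| \theta_{\mu'}(\hat A) \|_{l^1} \Big), \\
|K_{b,c}^{(1)}| &\le \frac{\varepsilon}{4\Lambda^3} \| \theta_{\mu'}(\hat A) \|_{l^1} + \frac{\varepsilon}{2\Lambda^2} \Big( 1 + \frac{1}{2\Lambda} \| \theta_{\mu'}(\hat A) \|_{l^1} + \frac{\varepsilon}{\Lambda^2} \| \theta_{\mu'}(\hat A) \|_{l^1} \Big) < \frac{\varepsilon}{2\Lambda^2} \Big( 1 + \frac{7}{6\Lambda} \| \theta_{\mu'}(\hat A) \|_{l^1} \Big), \\
|K_{b,c}^{(2)}| &\le \frac{1}{\Lambda} \Big( \frac{8}{\Lambda} \Big)^2 \| \theta_{\mu'}(\hat A) \|_{l^1}^2 = \frac{64}{\Lambda^3} \| \theta_{\mu'}(\hat A) \|_{l^1}^2, \\
|K_{b,c}^{(3)}| &\le \frac{3}{2\Lambda} \| \theta_{\mu'}(\hat A) \|_{l^1} \frac{4}{|z|_R} + \frac{14}{13} \frac{4}{|z|_R} < \frac{C_{\Lambda,A}}{|z|_R}
\end{aligned} \]
for all $b, c \in G_3$. Here, to estimate $|K_{b,c}^{(1)}|$ we have used that $\varepsilon < \Lambda/6$.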
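Finally, recalling (66) and using the above estimates we find that
\[ \begin{aligned}
z_{\mu,d'}(k)\, \alpha^{(1)}_{\mu,d'}(k) &= \sum_{b,c \in G_1} f(d'-b)\, \frac{z\, S_{b,c}}{N_b(k)}\, g(c-d') = \sum_{b,c \in G_1} f(d'-b) \Big[ \sum_{j=0}^{3} K_{b,c}^{(j)} \Big] g(c-d') \\
&=: \alpha^{(1,0)}_{\mu,d'} + \alpha^{(1,1)}_{\mu,d'}(w(k)) + \alpha^{(1,2)}_{\mu,d'}(k) + \alpha^{(1,3)}_{\mu,d'}(k),
\end{aligned} \tag{79} \]
where, in particular,
\[ \alpha^{(1,0)}_{\mu,d'} = - \sum_{b,c \in G_1} \frac{f(d'-b)}{2i\theta_{\mu'}(b-d')} \Big[ \delta_{b,c} + \frac{\theta_{\mu'}(\hat A(b-c))}{\theta_{\mu'}(c-d')} \Big] g(c-d'). \tag{80} \]
Furthermore, for $0 \le j \le 2$, it follows easily from (79) that $|\alpha^{(1,j)}_{\mu,d'}| \le C_j$ with
\[ \begin{aligned}
C_0 &:= \frac{1}{2\Lambda} \Big( 1 + \frac{1}{2\Lambda} \| \theta_{\mu'}(\hat A) \|_{l^1} \Big) \| f \|_{l^1} \| g \|_{l^1}, \\
C_1 &:= \frac{\varepsilon}{2\Lambda^2} \Big( 1 + \frac{7}{6\Lambda} \| \theta_{\mu'}(\hat A) \|_{l^1} \Big) \| f \|_{l^1} \| g \|_{l^1}, \\
C_2 &:= \frac{64}{\Lambda^3} \| \theta_{\mu'}(\hat A) \|_{l^1}^2 \| f \|_{l^1} \| g \|_{l^1},
\end{aligned} \tag{81} \]
while for $j = 3$,
\[ |\alpha^{(1,3)}_{\mu,d'}| \le C_{\Lambda,A,f,g} \frac{1}{|z|_R}. \]
This completes the proof of the lemma.

Proof of Lemma 5. To prove this lemma we apply the following (well-known) inequality (see [13] for a proof).

Proposition 18. Let $\alpha$ and $\delta$ be constants with $1 < \alpha \le 2$ and $1 < \delta \le 2$. Suppose that $f$ is a function on $\Gamma^\#$ obeying $\| |b|^\alpha f(b) \|_{l^1} < \infty$. Then, for any $\xi_1, \xi_2 \in \Gamma^\#$ with $\xi_1 \ne \xi_2$,
\[ \sum_{b \in \Gamma^\# \setminus \{\xi_1,\xi_2\}} \frac{|f(b-\xi_1)|}{|b-\xi_2|^\delta} \le \frac{C}{|\xi_1 - \xi_2|^{\alpha+\delta-2}} \times \begin{cases} 1 & \text{if } \alpha, \delta < 2, \\ \ln|\xi_1 - \xi_2| & \text{if } \alpha = 2 \text{ or } \delta = 2, \end{cases} \]
where $C = C_{\Gamma^\#,\alpha,\delta,f}$ is a constant.

First observe that $\| \pi_{\{b\}} T_{G'G'}^m \pi_{\{c\}} \| = |(T_{G'G'}^m)_{b,c}|$. Hence, by Proposition 17 with $\beta = 2$, for all $b, c \in G'$ and $m \ge 1$,
\[ |(T_{G'G'}^m)_{b,c}| = \| \pi_{\{b\}} T_{G'G'}^m \pi_{\{c\}} \| \le (1 + 2m) \Big( \frac{17}{18} \Big)^m \frac{1}{1 + |b-c|^2}. \]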
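Note that this inequality is also valid for $m = 0$. Thus,
\[ \begin{aligned}
|\Phi_{d',d''}(k)| &= \Big| \sum_{m=0}^{\infty} \sum_{b,c \in G'} \frac{f(d'-b)}{N_b(k)} (T_{G'G'}^m)_{b,c}\, g(c-d'') \Big| \\
&\le \frac{1}{\varepsilon |v|} \Big[ \sum_{m=0}^{\infty} (1 + 2m) \Big( \frac{17}{18} \Big)^m \Big] \sum_{b \in G'} |f(d'-b)| \sum_{c \in G'} \frac{|g(c-d'')|}{1 + |b-c|^2} \\
&\le \frac{C}{\varepsilon |v|} \sum_{b \in G'} |f(d'-b)| \Big( |g(b-d'')| + \sum_{c \in G' \setminus \{b\}} \frac{|g(c-d'')|}{|b-c|^2} \Big),
\end{aligned} \tag{82} \]
where $C$ is a universal constant. Now, by the triangle inequality, Hölder's inequality, and since $\| \cdot \|_{l^2} \le \| \cdot \|_{l^1}$,
\[ \begin{aligned}
\sum_{b \in G'} |f(d'-b)|\, |g(b-d'')| &= \sum_{b \in G'} \frac{|d'-d''|^2}{|d'-d''|^2}\, |f(d'-b)|\, |g(b-d'')| \\
&\le \frac{4}{|d'-d''|^2} \sum_{b \in G'} \big( |d'-b|^2 + |b-d''|^2 \big) |f(d'-b)|\, |g(b-d'')| \\
&\le \frac{4}{|d'-d''|^2} \big( \| b^2 f(b) \|_{l^2} \| g \|_{l^2} + \| f \|_{l^2} \| b^2 g(b) \|_{l^2} \big) \\
&\le \frac{4}{|d'-d''|^2} \big( \| b^2 f(b) \|_{l^1} \| g \|_{l^1} + \| f \|_{l^1} \| b^2 g(b) \|_{l^1} \big) \le \frac{C_{f,g}}{|d'-d''|^2}.
\end{aligned} \tag{83} \]
Furthermore, by Proposition 18 with $\alpha = \delta = 2$, for any $0 < \epsilon_1 < 2$,
\[ \sum_{c \in G' \setminus \{b\}} \frac{|g(c-d'')|}{|b-c|^2} \le C_{\Gamma^\#,g} \frac{\ln|b-d''|}{|b-d''|^2} \le \frac{C_{\Gamma^\#,g,\epsilon_1}}{|b-d''|^{2-\epsilon_1}}. \]
Applying this inequality and (83) to (82) we obtain
\[ |\Phi_{d',d''}(k)| \le \frac{C}{\varepsilon |v|} \Big( \frac{C_{f,g}}{|d'-d''|^2} + C_{\Gamma^\#,g,\epsilon_1} \sum_{b \in G'} \frac{|f(d'-b)|}{|b-d''|^{2-\epsilon_1}} \Big). \]
Again, by Proposition 18 with $\alpha = 2$ and $\delta = 2 - \epsilon_1$ we conclude that, for any $0 < \epsilon_2 < 2 - \epsilon_1$,
\[ |\Phi_{d',d''}(k)| \le \frac{C}{\varepsilon |v|} \Big( \frac{C_{f,g}}{|d'-d''|^2} + C_{\Gamma^\#,f,g,\epsilon_1} \frac{\ln|d'-d''|}{|d'-d''|^{2-\epsilon_1}} \Big) \le \frac{C_{\varepsilon,\Gamma^\#,f,g,\epsilon_1,\epsilon_2}}{|v|\, |d'-d''|^{2-\epsilon_1-\epsilon_2}}. \]
Finally, recall from Proposition 11(ii) that $|z_{\nu',d}| < 3|d|$ and $|z_{\nu',d}| < 3|v|$, observe that $|d'-d''| = |d|$, and set $\epsilon = \epsilon_1 + \epsilon_2$. Then, for any $0 < \epsilon < 2$,
\[ |\Phi_{d',d''}(k)| \le \frac{C_{\varepsilon,\Gamma^\#,f,g,\epsilon_1,\epsilon_2}}{|d|\, |d|^{2-\epsilon_1-\epsilon_2}} \le \frac{C_{\varepsilon,\Gamma^\#,f,g,\epsilon}}{|z_{\nu',d}|^{3-\epsilon}}. \]
Choosing $\epsilon = 10^{-1}$ we obtain the desired inequality.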
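The series in $m$ that is absorbed into the universal constant of (82) can be summed explicitly, since $\sum_{m \ge 0} r^m = 1/(1-r)$ and $\sum_{m \ge 0} m r^m = r/(1-r)^2$ for $|r| < 1$. The short Python sketch below evaluates it for $r = 17/18$; it is an illustrative computation with hypothetical variable names, not a statement from the paper.

```python
from fractions import Fraction

r = Fraction(17, 18)

# Exact value of sum_{m >= 0} (1 + 2m) * r^m via the two closed forms above.
series_value = 1 / (1 - r) + 2 * r / (1 - r) ** 2

# Floating-point partial sum for comparison.
rf = float(r)
partial = sum((1 + 2 * m) * rf ** m for m in range(5000))

print(series_value, partial)
```

Under this computation the series equals 630, so any $C \ge 630$ serves in the last line of (82).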
Appendix C. Bounds on the Derivatives: Proofs

Proof of Lemma 6. Step 0. When there is no risk of confusion we shall use the same notation to denote an operator or its matrix. Define
\[ F_{BC} := [f(b-c)]_{b \in B,\, c \in C}, \qquad G_{BC} := [g(b-c)]_{b \in B,\, c \in C}, \qquad \Phi_G(k) := [\Phi_{d',d''}(k;G)]_{d',d'' \in G}. \]
Here $F_{BC}$ and $G_{BC}$ are $|B| \times |C|$ matrices and $\Phi_G(k)$ is a $|G| \times |G|$ matrix. First observe that
\[ \Phi_G(k) = \Big[ \sum_{b,c \in G'} \frac{f(d'-b)}{N_b(k)} (R^{-1}_{G'G'})_{b,c}\, g(c-d'') \Big]_{d',d'' \in G} \]
can be written as the product of matrices $F_{GG'} \Delta_k^{-1} R^{-1}_{G'G'} G_{G'G}$. Furthermore, since on $L^2_{G'}$ we have $\Delta_k^{-1} R^{-1}_{G'G'} = (R_{G'G'} \Delta_k)^{-1} = H_k^{-1}$, we can write $\Phi_G(k)$ as $F_{GG'} H_k^{-1} G_{G'G}$. Hence,
\[ \frac{\partial^{n+m}}{\partial k_1^n \partial k_2^m} \Phi_G(k) = F_{GG'}\, \frac{\partial^{n+m} H_k^{-1}}{\partial k_1^n \partial k_2^m}\, G_{G'G}. \tag{84} \]
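This is the quantity we want to estimate.

Step 1. Let $T = T(k)$ be an invertible matrix. Then applying $\frac{\partial^{m_0}}{\partial k_i^{m_0}}$ to the identity $T T^{-1} = I$ and using the Leibniz rule for $\frac{\partial^{m_0}}{\partial k_i^{m_0}} (T T^{-1})$ we find that
\[ \frac{\partial^{m_0} T^{-1}}{\partial k_i^{m_0}} = -T^{-1} \sum_{m_1=0}^{m_0-1} \binom{m_0}{m_1} \frac{\partial^{m_0-m_1} T}{\partial k_i^{m_0-m_1}} \frac{\partial^{m_1} T^{-1}}{\partial k_i^{m_1}}. \]
Iterating this formula $m_0 - 1$ times we obtain
\[ \begin{aligned}
\frac{\partial^{m_0} T^{-1}}{\partial k_i^{m_0}} &= \prod_{j=1}^{m_0} \Big[ (-T^{-1}) \sum_{m_j=0}^{m_{j-1}-1} \binom{m_{j-1}}{m_j} \frac{\partial^{m_{j-1}-m_j} T}{\partial k_i^{m_{j-1}-m_j}} \Big] \frac{\partial^{m_{m_0}} T^{-1}}{\partial k_i^{m_{m_0}}} \\
&= \prod_{j=1}^{m_0-1} \Big[ (-T^{-1}) \sum_{m_j=0}^{m_{j-1}-1} \binom{m_{j-1}}{m_j} \frac{\partial^{m_{j-1}-m_j} T}{\partial k_i^{m_{j-1}-m_j}} \Big] \\
&\quad \times (-T^{-1}) \sum_{m_{m_0}=0}^{m_{m_0-1}-1} \binom{m_{m_0-1}}{m_{m_0}} \frac{\partial^{m_{m_0-1}-m_{m_0}} T}{\partial k_i^{m_{m_0-1}-m_{m_0}}} \frac{\partial^{m_{m_0}} T^{-1}}{\partial k_i^{m_{m_0}}} \\
&= (-1)^{m_0}\, T^{-1} \prod_{j=1}^{m_0-1} \Big[ \sum_{m_j=0}^{m_{j-1}-1} \binom{m_{j-1}}{m_j} \frac{\partial^{m_{j-1}-m_j} T}{\partial k_i^{m_{j-1}-m_j}}\, T^{-1} \Big] \frac{\partial^{m_{m_0-1}} T}{\partial k_i^{m_{m_0-1}}}\, T^{-1}.
\end{aligned} \tag{85} \]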
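The simplest instance of the first formula in Step 1 is the case $m_0 = 1$, which reads $\partial T^{-1}/\partial k_i = -T^{-1} (\partial T/\partial k_i) T^{-1}$. The Python sketch below compares this expression with a central finite difference for a hypothetical smooth family of invertible 2×2 matrices depending on one real parameter; the family `T`, its derivative `dT`, and all helper names are assumptions made for the illustration.

```python
import math


def T(k):
    """Hypothetical smooth family of invertible 2x2 matrices."""
    return [[2.0 + k, k * k], [math.sin(k), 3.0 + k]]


def dT(k):
    """Entrywise derivative of T with respect to k."""
    return [[1.0, 2.0 * k], [math.cos(k), 1.0]]


def inv2(a):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def mul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][l] * b[l][j] for l in range(2)) for j in range(2)]
            for i in range(2)]


def derivative_of_inverse_formula(k):
    """Right-hand side -T^{-1} T' T^{-1} (the case m_0 = 1)."""
    t_inv = inv2(T(k))
    prod = mul2(mul2(t_inv, dT(k)), t_inv)
    return [[-prod[i][j] for j in range(2)] for i in range(2)]


def derivative_of_inverse_numeric(k, h=1e-6):
    """Central finite-difference approximation of d/dk T(k)^{-1}."""
    plus, minus = inv2(T(k + h)), inv2(T(k - h))
    return [[(plus[i][j] - minus[i][j]) / (2.0 * h) for j in range(2)]
            for i in range(2)]


k0 = 0.3
exact = derivative_of_inverse_formula(k0)
approx = derivative_of_inverse_numeric(k0)
max_error = max(abs(exact[i][j] - approx[i][j]) for i in range(2) for j in range(2))
print(max_error)
```

Higher derivatives follow by applying the same identity repeatedly, which is what the iteration leading to (85) records.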
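Step 2. In view of (85), it is not difficult to see that $\frac{\partial^m H_k^{-1}}{\partial k_2^m}$ is given by a finite linear combination of terms of the form
\[ \Big[ \prod_{j=1}^{m} H_k^{-1} \frac{\partial^{n_j} H_k}{\partial k_2^{n_j}} \Big] H_k^{-1}, \tag{86} \]
where $\sum_{j=1}^{m} n_j = m$. Thus, when we compute $\frac{\partial^n}{\partial k_1^n} \frac{\partial^m H_k^{-1}}{\partial k_2^m}$, the derivative $\frac{\partial^n}{\partial k_1^n}$ acts either on $H_k^{-1}$ or on $\frac{\partial^{n_j} H_k}{\partial k_2^{n_j}}$. However, since $\big( \frac{\partial H_k}{\partial k_2} \big)_{b,c} = 2(k_2 + b_2) \delta_{b,c} - 2 \hat A_2(b-c)$, we have
\[ \frac{\partial^n}{\partial k_1^n} \frac{\partial^{n_j} H_k}{\partial k_2^{n_j}} = 0 \quad \text{if } n_j \ge 1 \qquad \text{and} \qquad \frac{\partial^n}{\partial k_1^n} \frac{\partial^{n_j} H_k}{\partial k_2^{n_j}} = \frac{\partial^n H_k}{\partial k_1^n} \quad \text{if } n_j = 0. \]
Similarly, using again (85), one can see that $\frac{\partial^n H_k^{-1}}{\partial k_1^n}$ is given by a finite linear combination of terms of the form (86), with $m$ and $k_2$ replaced by $n$ and $k_1$, respectively, and $\sum_{j=1}^{n} n_j = n$. Therefore, combining all this we conclude that $\frac{\partial^{n+m} H_k^{-1}}{\partial k_1^n \partial k_2^m}$ is given by a finite linear combination of terms of the form
\[ \Big[ \prod_{j=1}^{n+m} \Delta_k^{-1} R^{-1}_{G'G'} \frac{\partial^{n_j} H_k}{\partial k_{i_j}^{n_j}} \Big] \Delta_k^{-1} R^{-1}_{G'G'}, \tag{87} \]
where $\sum_{j=1}^{n+m} n_j \delta_{2,i_j} = m$ and $\sum_{j=1}^{n+m} n_j \delta_{1,i_j} = n$, that is, where the sum of the $n_j$ for which $i_j = 2$ is equal to $m$, and the sum of the $n_j$ for which $i_j = 1$ is equal to $n$.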
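Step 3. The first step in bounding (87) is to estimate $\big\| \frac{\partial^{n_j} H_k}{\partial k_{i_j}^{n_j}} \Delta_k^{-1} \pi_{G'} \big\|$. A simple calculation shows that
\[ \Big[ \frac{\partial^{n_j} H_k}{\partial k_{i_j}^{n_j}} \Delta_k^{-1} \Big]_{b,c} = \frac{1}{N_c(k)} \times \begin{cases} 2(k_{i_j} + b_{i_j}) \delta_{b,c} + 2 \hat A_{i_j}(b-c) & \text{if } n_j = 1, \\ 2 \delta_{b,c} & \text{if } n_j = 2, \\ 0 & \text{if } n_j \ge 3. \end{cases} \]
Furthermore, by Proposition 7,
\[ \frac{1}{|N_b(k)|} \le \frac{1}{\varepsilon |v|} \]
for all $b \in G'$, while by Proposition 3 we have
\[ \frac{1}{|N_b(k)|} \le \frac{2}{\Lambda |v|} \qquad \text{and} \qquad |k_i + b_i| \le |u_i + b_i| + |v_i| \le |v| + |u + b| \le \frac{2}{\Lambda} |N_b(k)| \tag{88} \]
for all $b \in G'$ if $G = \{0,d\}$, and for all $b \in G' \setminus \{\tilde b\}$ if $G = \{0\}$. Furthermore,
\[ |\tilde b| \le \Lambda + |u| + |v| < \Lambda + 3|v|, \tag{89} \]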
since |u| < 2|v| because k ∈ T0 . Now, let 1B (x) be the characteristic function of the set B. Then, using the above estimates we have # $ ∂ nj Hk −1 sup nj ∆k πG ∂k c∈G i j b∈G
!
b,c
2|kij + bij |δnj ,1 + 2δnj ,2 2|Aˆij (b − c)| ≤ sup δb,c + δnj ,1 |Nb (k)| |Nb (k)| c∈G b∈G " ! 2|kij + ˜bij | + 2 2|Aˆij (˜b − c)| ≤ sup δ˜b,c + 1G (˜b) |N˜b (k)| |N˜b (k)| c∈G " ! 2|kij + bij | + 2 2|Aˆij (b − c)| δb,c + + sup |Nb (k)| |Nb (k)| c∈G
"
b∈G \{˜ b}
≤
ˆ l1 2|kij + ˜bij | + 2 + 2 A 1G (˜b) ε|v| " !& ' 2|Aˆij (b − c)| 4 2 + sup + δb,c + Λ |Nb (k)| |Nb (k)| c∈G b∈G \{˜ b}
≤
2 ˆ l1 )1G (˜b) + 4 + 4 + 4 A ˆ l1 (2(|u| + |v| + |˜b|) + 2 + 2 A ε|v| Λ Λ|v| Λ|v|
≤
2 ˆ l1 )1G (˜b) + 4 + 4 + 4 A ˆ l1 (12|v| + 2Λ + 2 + 2 A ε|v| Λ Λ|v| Λ|v|
≤ 1G (˜b) ε−1 CΛ,A + CΛ,A . Similarly, # $ ∂ nj Hk −1 sup ∂k nj ∆k πG b∈G c∈G ij
≤ 1G (˜b) ε−1 CΛ,A + CΛ,A . b,c
Hence, by Proposition 5, % % % ∂ nj H % % % k −1 ˜ −1 CΛ,A + CΛ,A . ∆ π % G % ≤ 1G (b) ε % ∂kinjj k % Step 4. By a similar (and much simpler) calculation (using Proposition 5) we get FGG ≤ f l1 , GGG ≤ g l1 , 1 2 ˜ ∆−1 + (1 − 1G (˜b)) . k πG ≤ 1G (b) ε|v| Λ|v|
(90)
From Lemma 1 we have (RG G )−1 ≤ 18. Thus, the operator norm of (87) is bounded by % % % % n+m % % * −1 −1 ∂ nj Hk −1 −1 % % ∆k RG G % ∆k RG G nj % ∂k % % j=1 ij % % n+m % nj * % % ∂ Hk −1 % −1 −1 ≤ ∆−1 ∆ πG % RG % G , k RG G % ∂kinjj k % j=1 which is bounded either by n+m * 1 1 18 (ε−1 CΛ,A + CΛ,A ) 18 ≤ ε−(n+m+1) CΛ,A,n,m ε|v| |v| j=1 if G = {0}, or by
n+m * 1 1 18 CΛ,A 18 g l1 ≤ CΛ,A,n,m Λ|v| |v| j=1
if G = {0, d}. Therefore, % n+m −1 % %∂ Hk % % % % ∂k n ∂k m % ≤ 1
2
finite sum where # of terms depend on n and m
with C = Cε,Λ,A,n,m if G = {0} (84) and (90) we have n+m ∂ Φ (k) ∂k n ∂k m G = 1 2
C C C ≤ Cn,m ≤ , |v| |v| |v|
(91)
or C = CΛ,A,n,m if G = {0, d}. Finally, recalling
n+m −1 Hk FGG ∂ G G G n m ∂k1 ∂k2 % n+m −1 % %∂ Hk % C % ≤ FGG % % ∂k n ∂k m % GG G ≤ |v| , 1 2
where $C = C_{\varepsilon,\Lambda,A,n,m,f,g}$ if $G = \{0\}$ or $C = C_{\Lambda,A,n,m,f,g}$ if $G = \{0,d\}$. This is the desired inequality. The proof of the lemma is complete.

Proof of Lemma 7. Let $\mathbb{R}_+$ be the set of non-negative real numbers and let $\sigma$ be a real-valued function on $\mathbb{R}_+$ such that: (i) $\sigma(t) \ge 1$ for all $t \in \mathbb{R}_+$ with $\sigma(0) = 1$; (ii) $\sigma(s)\sigma(t) \ge \sigma(s+t)$ for all $s, t \in \mathbb{R}_+$; (iii) $\sigma$ increases monotonically. For example, for any $\beta \ge 0$ the functions $t \mapsto e^{\beta t}$ and $t \mapsto (1+t)^\beta$ satisfy these properties. Now, let $T$ be a linear operator from $L^2_C$ to $L^2_B$ with $B, C \subset \Gamma^\#$ (or a matrix $T = [T_{b,c}]$ with $b \in B$ and $c \in C$) and consider the $\sigma$-norm
\[ \| T \|_\sigma := \max \Big\{ \sup_{c \in C} \sum_{b \in B} |T_{b,c}|\, \sigma(|b-c|),\; \sup_{b \in B} \sum_{c \in C} |T_{b,c}|\, \sigma(|b-c|) \Big\}. \]
In [13] we prove that this norm has the following properties.

Proposition 19 (Properties of $\| \cdot \|_\sigma$). Let $S$ and $T$ be linear operators from $L^2_C$ to $L^2_B$ with $B, C \subset \Gamma^\#$. Then:
(a) $\| T \| \le \| T \|_{\sigma \equiv 1} \le \| T \|_\sigma$;
(b) If $B = C$, then $\| S T \|_\sigma \le \| S \|_\sigma \| T \|_\sigma$;
(c) If $B = C$, then $\| (I + T)^{-1} \|_\sigma \le (1 - \| T \|_\sigma)^{-1}$ if $\| T \|_\sigma < 1$;
(d) $|T_{b,c}| \le \frac{1}{\sigma(|b-c|)} \| T \|_\sigma$ for all $b \in B$ and all $c \in C$.
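Properties (b) and (d) of the $\sigma$-norm can be observed on a small finite example. The Python sketch below is an illustration under stated assumptions: the index set is a hypothetical four-point subset of $\mathbb{Z}^2$ (with $B = C$), the weight is $\sigma(t) = (1+t)^\beta$ with $\beta = 2$, the two matrices are arbitrary test data, and the function names do not come from the paper.

```python
import math


def sigma(t, beta=2.0):
    """Weight sigma(t) = (1 + t)^beta, one of the admissible examples."""
    return (1.0 + t) ** beta


def sigma_norm(mat, points, beta=2.0):
    """Maximum of the weighted row-sum and column-sum suprema of mat,
    where mat[i][j] is the entry indexed by (points[i], points[j])."""
    n = len(points)
    weighted = [[abs(mat[i][j]) * sigma(math.dist(points[i], points[j]), beta)
                 for j in range(n)] for i in range(n)]
    row_sup = max(sum(row) for row in weighted)
    col_sup = max(sum(weighted[i][j] for i in range(n)) for j in range(n))
    return max(row_sup, col_sup)


def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical index set B = C and two arbitrary test matrices.
points = [(0, 0), (1, 0), (0, 1), (2, 1)]
S = [[0.1 * (((i + 2 * j) % 3) - 1) for j in range(4)] for i in range(4)]
T = [[0.05 * (i - j) for j in range(4)] for i in range(4)]

norm_S = sigma_norm(S, points)
norm_T = sigma_norm(T, points)
norm_ST = sigma_norm(mat_mul(S, T), points)
print(norm_ST, norm_S * norm_T)
```

The submultiplicativity in (b) rests on $\sigma(|b-c|) \le \sigma(|b-\xi|)\, \sigma(|\xi-c|)$, which follows from hypotheses (ii) and (iii) together with the triangle inequality.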
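Now, by using these properties we prove Lemma 7. We follow the same notation as above. First observe that, similarly as in the last proof we can write
\[ \Phi_{d',d''}(k) = F_{\{d'\}G'}\, \Delta_k^{-1} R^{-1}_{G'G'}\, G_{G'\{d''\}} = F_{\{d'\}G'}\, H_k^{-1}\, G_{G'\{d''\}}. \]
Now, let $\sigma(|b|) = (1 + |b|)^\beta$, and observe that there is a positive constant $C_\beta$ such that $\sigma(|b|) \le C_\beta (1 + |b|^\beta)$ for all $b \in \Gamma^\#$. Then, it is easy to see that
\[ \| F_{\{d'\}G'} \|_\sigma = \| f \|_\sigma \le C_\beta \| (1 + |b|^\beta) f(b) \|_{l^1}, \qquad \| G_{G'\{d''\}} \|_\sigma = \| g \|_\sigma \le C_\beta \| (1 + |b|^\beta) g(b) \|_{l^1}. \]
Furthermore, by (77) and Proposition 5,
\[ \| R^{-1}_{G'G'} \|_\sigma = \| (I + T_{G'G'})^{-1} \|_\sigma \le \sum_{j=0}^{\infty} \| T_{G'G'} \|_\sigma^j < 18, \tag{92} \]
and since for diagonal operators the $\sigma$-norm and the operator norm agree, from (90) we have
\[ \| \Delta_k^{-1} \pi_{G'} \|_\sigma \le \frac{2}{\Lambda |v|}. \]
Hence, in view of Propositions 19(b) and 11(ii),
\[ |\Phi_{d',d''}(k)| \le \| F_{\{d'\}G'}\, \Delta_k^{-1} R^{-1}_{G'G'}\, G_{G'\{d''\}} \| \le C_{\beta,f,g,\Lambda,A,m,n} \frac{1}{|d|}, \]
and by repeating the proof of Lemma 6 with the operator norm replaced by the $\sigma$-norm we obtain
\[ \Big\| \frac{\partial^{n+m}}{\partial k_1^n \partial k_2^m} \Phi_{d',d''}(k) \Big\|_\sigma \le C_{\beta,f,g,\Lambda,A,m,n} \frac{1}{|d|}. \]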
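Therefore, by Proposition 19(d), for any integers $n$ and $m$ with $n + m \ge 0$,
\[ \Big| \frac{\partial^{n+m}}{\partial k_1^n \partial k_2^m} \Phi_{d',d''}(k) \Big| \le \frac{1}{1 + |d'-d''|^\beta} \Big\| \frac{\partial^{n+m}}{\partial k_1^n \partial k_2^m} \Phi_{d',d''}(k) \Big\|_\sigma \le C_{\beta,f,g,\Lambda,A,m,n} \frac{1}{|d|^{1+\beta}}. \]
This is the desired inequality.

Proof of Lemma 8. Define the operator $M^{(j)} : L^2_{G_3} \to L^2_{G_3}$ as
\[ M^{(j)} := \begin{cases} S & \text{if } j = 1, \\ W & \text{if } j = 2, \\ Z & \text{if } j = 3, \end{cases} \]
where $S$, $W$ and $Z$ are given by (64). In order to prove Lemma 8, we first prove the following proposition.

Proposition 20. Assume the same hypotheses of Lemma 8. Then, for any integers $n$ and $m$ with $n + m \ge 1$ and for $1 \le j \le 3$,
\[ \Big\| \frac{\partial^{n+m}}{\partial k_1^n \partial k_2^m} \Delta_k^{-1} M^{(j)} \Big\| \le \frac{C_j}{(2|z_{\mu,d'}(k)| - R)^j}, \]
where $C_1 = C_{1;\Lambda,A,n,m}$ and $C_j = C_{j;\Lambda,A,q,n,m}$ for $2 \le j \le 3$ are constants. Furthermore,
\[ C_{1;\Lambda,A,1,0} \le \frac{13}{\Lambda^2}, \qquad C_{1;\Lambda,A,0,1} \le \frac{13}{\Lambda^2} \qquad \text{and} \qquad C_{1;\Lambda,A,1,1} \le \frac{65}{\Lambda^3}. \]

Proof. Step 0. To simplify the notation write $w = w_{\mu,d'}$, $z = z_{\mu,d'}$ and $|z|_R = 2|z| - R$. First observe that, for any analytic function of the form $h(k) = \tilde h(w(k), z(k))$ we have
\[ \frac{\partial}{\partial k_1} h = \Big( \frac{\partial}{\partial w} + \frac{\partial}{\partial z} \Big) \tilde h, \qquad \frac{\partial}{\partial k_2} h = i(-1)^\nu \Big( \frac{\partial}{\partial w} - \frac{\partial}{\partial z} \Big) \tilde h. \]
Thus,
\[ \begin{aligned}
\Big\| \frac{\partial^{n+m}}{\partial k_1^n \partial k_2^m} \Delta_k^{-1} M^{(j)} \Big\| &= \Big\| (i(-1)^\nu)^m \sum_{p=0}^{m} \sum_{r=0}^{n} \binom{m}{p} \binom{n}{r} (-1)^{m-p} \frac{\partial^{n-r+m-p}}{\partial z^{n-r+m-p}} \frac{\partial^{r+p}}{\partial w^{r+p}} \Delta_k^{-1} M^{(j)} \Big\| \\
&\le 2^{n+m} \sup_{p \le m} \sup_{r \le n} \Big\| \frac{\partial^{n-r+m-p}}{\partial z^{n-r+m-p}} \frac{\partial^{r+p}}{\partial w^{r+p}} \Delta_k^{-1} M^{(j)} \Big\|.
\end{aligned} \]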
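Now, by the Leibniz rule,
\[ \begin{aligned}
\Big\| \frac{\partial^n}{\partial z^n} \frac{\partial^m}{\partial w^m} \Delta_k^{-1} M^{(j)} \Big\| &= \Big\| \sum_{p=0}^{m} \sum_{r=0}^{n} \binom{m}{p} \binom{n}{r} \frac{\partial^{n-r+m-p} \Delta_k^{-1}}{\partial z^{n-r} \partial w^{m-p}} \frac{\partial^{r+p} M^{(j)}}{\partial z^r \partial w^p} \Big\| \\
&\le 2^{n+m} \sup_{p \le m} \sup_{r \le n} \Big\| \frac{\partial^{n-r+m-p} \Delta_k^{-1}}{\partial z^{n-r} \partial w^{m-p}} \Big\| \, \Big\| \frac{\partial^{r+p} M^{(j)}}{\partial z^r \partial w^p} \Big\|.
\end{aligned} \]
Furthermore, we shall prove below that
\[ \sup_{p \le m} \sup_{r \le n} \Big\| \frac{\partial^{n-r+m-p} \Delta_k^{-1}}{\partial z^{n-r} \partial w^{m-p}} \Big\| \, \Big\| \frac{\partial^{r+p} M^{(j)}}{\partial z^r \partial w^p} \Big\| \le \frac{C_{j,n,m}}{|z|_R^{n+j}}, \tag{93} \]
with constants $C_{1,n,m} = C_{1,n,m;\Lambda,A}$ and $C_{j,n,m} = C_{j,n,m;\Lambda,A,q}$ for $2 \le j \le 3$. Hence,
\[ \Big\| \frac{\partial^n}{\partial z^n} \frac{\partial^m}{\partial w^m} \Delta_k^{-1} M^{(j)} \Big\| \le 2^{n+m} \frac{C_{j,n,m}}{|z|_R^{n+j}}. \]
Therefore, being careful with the indices,
\[ \Big\| \frac{\partial^{n+m}}{\partial k_1^n \partial k_2^m} \Delta_k^{-1} M^{(j)} \Big\| \le 2^{n+m} \sup_{p \le m} \sup_{r \le n} 2^{n-r+m-p+r+p}\, \frac{C_{j,n-r+m-p,r+p}}{|z|_R^{n-r+m-p+j}} \le \frac{C_j}{|z|_R^j}, \]
where $C_1 = C_{1;\Lambda,A,n,m}$ and $C_j = C_{j;\Lambda,A,q,n,m}$ for $2 \le j \le 3$. This is the desired inequality. We are left to prove (93) and estimate the constants $C_{1;\Lambda,A,i,j}$ for $i, j \in \{0,1\}$ to finish the proof of the proposition.

Step 1. The first step for obtaining (93) is to estimate $\big\| \frac{\partial^{r+p} \Delta_k^{-1}}{\partial z^r \partial w^p} \pi_{G_3} \big\|$. Observe that
\[ \begin{aligned}
\Big| \Big[ \frac{\partial^{r+p} \Delta_k^{-1}}{\partial z^r \partial w^p} \Big]_{b,c} \Big| &= \Big| \frac{\partial^{r+p} (\Delta_k^{-1})_{b,c}}{\partial z^r \partial w^p} \Big| = \Big| \frac{\partial^p}{\partial w^p} \frac{1}{w - 2i\theta_{\mu'}(b-d')} \, \frac{\partial^r}{\partial z^r} \frac{1}{z - 2i\theta_\mu(b-d')} \Big| \delta_{b,c} \\
&= \Big| \frac{(-1)^p\, p!}{(w - 2i\theta_{\mu'}(b-d'))^{p+1}} \, \frac{(-1)^r\, r!}{(z - 2i\theta_\mu(b-d'))^{r+1}} \Big| \delta_{b,c} \\
&\le \frac{p!\, r!\, \delta_{b,c}}{|w - 2i\theta_{\mu'}(b-d')|^{p+1}\, |z - 2i\theta_\mu(b-d')|^{r+1}},
\end{aligned} \]
and recall from (58) and (59) that, for all $b \in G_3$,
\[ \frac{1}{|z - 2i\theta_\mu(b-d')|} \le \frac{2}{|z|_R} \qquad \text{and} \qquad \frac{1}{|w - 2i\theta_{\mu'}(b-d')|} \le \frac{1}{\Lambda}. \tag{94} \]
Then,
\[ \Big| \Big[ \frac{\partial^{r+p} \Delta_k^{-1}}{\partial z^r \partial w^p} \Big]_{b,c} \Big| \le \frac{p!\, r!\, 2^{r+1}\, \delta_{b,c}}{\Lambda^{p+1} |z|_R^{r+1}}, \]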
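and consequently,
\[ \Big[ \sup_{b \in G_3} \sum_{c \in G_3} + \sup_{c \in G_3} \sum_{b \in G_3} \Big] \Big| \Big[ \frac{\partial^{r+p} \Delta_k^{-1}}{\partial z^r \partial w^p} \Big]_{b,c} \Big| \le \frac{p!\, r!\, 2^{r+1}}{\Lambda^{p+1} |z|_R^{r+1}} \Big[ \sup_{b \in G_3} \sum_{c \in G_3} + \sup_{c \in G_3} \sum_{b \in G_3} \Big] \delta_{b,c} = \frac{p!\, r!\, 2^{r+2}}{\Lambda^{p+1} |z|_R^{r+1}}. \]
Therefore, by Proposition 5,
\[ \Big\| \frac{\partial^{r+p} \Delta_k^{-1}}{\partial z^r \partial w^p} \pi_{G_3} \Big\| \le \frac{p!\, r!\, 2^{r+2}}{\Lambda^{p+1}} \frac{1}{|z|_R^{r+1}}. \tag{95} \]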
Step 2. We now estimate the second factor in (93). Let us first consider the case $j = 1$, that is, $M^{(1)} = S$. Since $S = (I - Y_{33})^{-1}$, the operator $S$ is clearly invertible. Thus, by applying (85) with $T = S^{-1}$, one can see that $\frac{\partial^p S}{\partial w^p}$ is given by a finite linear combination of terms of the form
\[ S \prod_{j=1}^{p} \left( \frac{\partial^{n_j} S^{-1}}{\partial w^{n_j}} \, S \right), \tag{96} \]
where $\sum_{j=1}^{p} n_j = p$. Hence, when we compute $\frac{\partial^r}{\partial z^r} \frac{\partial^p S}{\partial w^p}$, the derivative $\frac{\partial^r}{\partial z^r}$ acts either on $S$ or on $\frac{\partial^{n_j} S^{-1}}{\partial w^{n_j}}$. Similarly, using again (85) with $T = S^{-1}$, one can see that $\frac{\partial^r S}{\partial z^r}$ is given by a finite linear combination of terms of the form (96), with $p$ and $w$ replaced by $r$ and $z$, respectively, and $\sum_{j=1}^{r} m_j = r$. Thus, we conclude that $\frac{\partial^{r+p} S}{\partial z^r \partial w^p}$ is given by a finite linear combination of terms of the form
\[ S \prod_{j=1}^{r+p} \left( \frac{\partial^{m_j + n_j} S^{-1}}{\partial z^{m_j} \partial w^{n_j}} \, S \right), \tag{97} \]
where $\sum_{j=1}^{r+p} m_j = r$ and $\sum_{j=1}^{r+p} n_j = p$. Indeed, observe that the general form of the terms (97) follows directly from (85) because that identity is also valid for mixed derivatives. Since $S = (I - Y_{33})^{-1}$ with $\|Y_{33}\| < 1/14$ and
\[ Y_{b,c} = \frac{-2i\theta_\mu(\hat A(b-c))\, z}{(w - 2i\theta_\mu(c-d))(z - 2i\theta_\mu(c-d))}, \tag{98} \]
we have
\[ \|S\| = \|(I - Y_{33})^{-1}\| \le \frac{1}{1 - \|Y_{33}\|} \le \frac{14}{13} \tag{99} \]
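and
\[ \left| \left( \frac{\partial^{j+l}}{\partial z^j \partial w^l} S^{-1} \right)_{b,c} \right| = \left| \frac{\partial^{j+l}}{\partial z^j \partial w^l} Y_{b,c} \right| = \left| \frac{\partial^j}{\partial z^j} \frac{-2i\theta_\mu(\hat A(b-c))\, z}{z - 2i\theta_\mu(c-d)} \; \frac{\partial^l}{\partial w^l} \frac{1}{w - 2i\theta_\mu(c-d)} \right|. \]
Furthermore,
\[ \frac{\partial^j}{\partial z^j} \frac{-2i\theta_\mu(\hat A(b-c))\, z}{z - 2i\theta_\mu(c-d)} = \frac{(-1)^{j-1} j!\, 2i\theta_\mu(\hat A(b-c))\, 2i\theta_\mu(c-d)}{(z - 2i\theta_\mu(c-d))^{j+1}} \quad \text{for } j \ge 1, \]
\[ \frac{\partial^l}{\partial w^l} \frac{1}{w - 2i\theta_\mu(c-d)} = \frac{(-1)^l l!}{(w - 2i\theta_\mu(c-d))^{l+1}} \quad \text{for } l \ge 0. \]
Recall from (59) and (61) that, for all $c \in G$,
\[ \frac{|c-d|}{|w - 2i\theta_\mu(c-d)|} \le \frac{|c-d|}{|c-d| - \varepsilon} \le 2. \tag{100} \]
Then, using this and (94), for $j \ge 1$ and $l \ge 0$,
\[ \left| \left( \frac{\partial^{j+l}}{\partial z^j \partial w^l} S^{-1} \right)_{b,c} \right| \le \frac{j!\, l!\, |\hat A(b-c)|}{|z - 2i\theta_\mu(c-d)|^{j+1} \, |w - 2i\theta_\mu(c-d)|^{l}} \, \frac{|c-d|}{|w - 2i\theta_\mu(c-d)|} \le \frac{2^{j+2} j!\, l!\, |\hat A(b-c)|}{\Lambda^l |z|_R^{j+1}}, \tag{101} \]
while for $j = 0$ and $l \ge 0$,
\[ \left| \left( \frac{\partial^{j+l}}{\partial z^j \partial w^l} S^{-1} \right)_{b,c} \right| \le \frac{l!\, |\hat A(b-c)|\, |z|}{|z - 2i\theta_\mu(c-d)| \, |w - 2i\theta_\mu(c-d)|^{l+1}} \le \frac{2\, l!\, |\hat A(b-c)|}{\Lambda^{l+1}}. \tag{102} \]
Consequently,
\[ \sup_{b \in G_3} \sum_{c \in G_3} \left| \left( \frac{\partial^{j+l}}{\partial z^j \partial w^l} S^{-1} \right)_{b,c} \right| + \sup_{c \in G_3} \sum_{b \in G_3} \left| \left( \frac{\partial^{j+l}}{\partial z^j \partial w^l} S^{-1} \right)_{b,c} \right| \]
\[ \le \left( 1 - \delta_{0,j} + \frac{|z|_R}{2\Lambda} \delta_{0,j} \right) \frac{2^{j+2} j!\, l!}{\Lambda^l |z|_R^{j+1}} \left[ \sup_{b \in G_3} \sum_{c \in G_3} |\hat A(b-c)| + \sup_{c \in G_3} \sum_{b \in G_3} |\hat A(b-c)| \right] \]
\[ \le \left( 1 - \delta_{0,j} + \frac{|z|_R}{2\Lambda} \delta_{0,j} \right) \frac{2^{j+3} j!\, l!}{\Lambda^l |z|_R^{j+1}} \|\hat A\|_{l^1}. \]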
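Therefore, by Proposition 5,
\[ \left\| \frac{\partial^{j+l}}{\partial z^j \partial w^l} S^{-1} \right\| \le \left( 1 - \delta_{0,j} + \frac{|z|_R}{2\Lambda} \delta_{0,j} \right) \frac{2^{j+3} j!\, l!}{\Lambda^l |z|_R^{j+1}} \|\hat A\|_{l^1}. \tag{103} \]
Thus, for $r \ge 1$, in view of (97) where $\sum_{j=1}^{r+p} m_j = r$,
\[ \left\| \frac{\partial^{r+p}}{\partial z^r \partial w^p} S \right\| \le C_{r,p} \|S\| \prod_{j=1}^{r+p} \left\| \frac{\partial^{m_j+n_j}}{\partial z^{m_j} \partial w^{n_j}} S^{-1} \right\| \|S\| \]
\[ \le C_{r,p} \prod_{j=1}^{r+p} C_{\Lambda,A} \times C_{\Lambda,A} \prod_{j=1}^{r+p} \frac{2^{m_j+3} m_j!\, n_j!}{\Lambda^{n_j}} \|\hat A\|_{l^1} \left( 1 - \delta_{0,m_j} + \frac{|z|_R}{2\Lambda} \delta_{0,m_j} \right) \frac{1}{|z|_R^{m_j+1}} \]
\[ \le C_{\Lambda,A,r,p} \frac{1}{|z|_R^{r+1}}, \]
since $m_j \ge 1$ for at least one $1 \le j \le r + p$. Similarly, if $r = 0$ then
\[ \left\| \frac{\partial^{r+p}}{\partial z^r \partial w^p} S \right\| \le C_{\Lambda,A,r,p}. \]
Hence, in view of (95),
\[ \sup_{p \le m} \sup_{r \le n} \left\| \frac{\partial^{n-r+m-p} \Delta_k^{-1}}{\partial z^{n-r} \partial w^{m-p}} \right\| \left\| \frac{\partial^{r+p} M^{(1)}}{\partial z^r \partial w^p} \right\| \]
\[ \le \sup_{p \le m} \sup_{r \le n} \frac{(m-p)!\, (n-r)!\, 2^{n-r+2}}{\Lambda^{m-p+1} |z|_R^{n-r+1}} \, C_{\Lambda,A,r,p} \|\hat A\|_{l^1} \left( 1 - \delta_{0,r} + \frac{|z|_R}{2\Lambda} \delta_{0,r} \right) \frac{1}{|z|_R^{r+1}} \]
\[ \le C_{\Lambda,A,n,m} \frac{1}{|z|_R^{n+1}}. \]
This proves (93) for $j = 1$.

Step 3. We now estimate the constant $C_{1;\Lambda,A,i,j}$ for $i, j \in \{0, 1\}$. First observe that
\[ \left| \frac{\partial w}{\partial k_j} \right| = |\delta_{1,j} + i(-1)^\nu \delta_{2,j}| = 1 \quad \text{and} \quad \left| \frac{\partial z}{\partial k_j} \right| = |\delta_{1,j} - i(-1)^\nu \delta_{2,j}| = 1. \]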
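Thus, in view of (99) and (103), since $|z| \ge |v| > R \ge 2\Lambda$,
\[ \left\| \frac{\partial S}{\partial k_j} \right\| = \left\| -S \frac{\partial S^{-1}}{\partial k_j} S \right\| = \left\| -S \left( \frac{\partial w}{\partial k_j} \frac{\partial S^{-1}}{\partial w} + \frac{\partial z}{\partial k_j} \frac{\partial S^{-1}}{\partial z} \right) S \right\| \]
\[ \le \|S\|^2 \left( \left\| \frac{\partial S^{-1}}{\partial w} \right\| + \left\| \frac{\partial S^{-1}}{\partial z} \right\| \right) \le \left( \frac{3}{2} \right)^2 \left( \frac{2^2 \|\hat A\|_{l^1}}{\Lambda^2} + \frac{2^4 \|\hat A\|_{l^1}}{|z|_R^2} \right) \le \frac{18 \|\hat A\|_{l^1}}{\Lambda^2}. \]
Similarly,
\[ \frac{\partial^2 S}{\partial k_i \partial k_j} = -\frac{\partial S}{\partial k_i} \left( \frac{\partial w}{\partial k_j} \frac{\partial S^{-1}}{\partial w} + \frac{\partial z}{\partial k_j} \frac{\partial S^{-1}}{\partial z} \right) S \]
\[ \quad - S \left[ \frac{\partial w}{\partial k_j} \left( \frac{\partial w}{\partial k_i} \frac{\partial^2 S^{-1}}{\partial w^2} + \frac{\partial z}{\partial k_i} \frac{\partial^2 S^{-1}}{\partial z \partial w} \right) + \frac{\partial z}{\partial k_j} \left( \frac{\partial w}{\partial k_i} \frac{\partial^2 S^{-1}}{\partial w \partial z} + \frac{\partial z}{\partial k_i} \frac{\partial^2 S^{-1}}{\partial z^2} \right) \right] S \]
\[ \quad - S \left( \frac{\partial w}{\partial k_j} \frac{\partial S^{-1}}{\partial w} + \frac{\partial z}{\partial k_j} \frac{\partial S^{-1}}{\partial z} \right) \frac{\partial S}{\partial k_i}, \]
so that, using the above inequality as well,
\[ \left\| \frac{\partial^2 S}{\partial k_i \partial k_j} \right\| \le 2 \|S\| \left\| \frac{\partial S}{\partial k_i} \right\| \left( \left\| \frac{\partial S^{-1}}{\partial w} \right\| + \left\| \frac{\partial S^{-1}}{\partial z} \right\| \right) + \|S\|^2 \left( \left\| \frac{\partial^2 S^{-1}}{\partial w^2} \right\| + 2 \left\| \frac{\partial^2 S^{-1}}{\partial z \partial w} \right\| + \left\| \frac{\partial^2 S^{-1}}{\partial z^2} \right\| \right) \]
\[ \le 2 \cdot \frac{3}{2} \cdot \frac{18 \|\hat A\|_{l^1}}{\Lambda^2} \cdot \frac{8 \|\hat A\|_{l^1}}{\Lambda^2} + \left( \frac{3}{2} \right)^2 \left( \frac{2^3 \|\hat A\|_{l^1}}{\Lambda^3} + \frac{2^5 \|\hat A\|_{l^1}}{\Lambda |z|_R^2} + \frac{2^6 \|\hat A\|_{l^1}}{|z|_R^3} \right) \]
\[ \le \frac{432}{\Lambda^4} \|\hat A\|_{l^1}^2 + \frac{54}{\Lambda^3} \|\hat A\|_{l^1} \le \frac{55 \|\hat A\|_{l^1}}{\Lambda^3} \left( \frac{8 \|\hat A\|_{l^1}}{\Lambda} + 1 \right). \]
Furthermore, by (95),
\[ \left\| \frac{\partial \Delta_k^{-1}}{\partial k_j} \right\| \le \left\| \frac{\partial \Delta_k^{-1}}{\partial w} \right\| + \left\| \frac{\partial \Delta_k^{-1}}{\partial z} \right\| \le \frac{2^2}{\Lambda^2 |z|_R} + \frac{2^3}{\Lambda |z|_R^2} \le \frac{8}{\Lambda^2 |z|_R} \]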
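and
\[ \left\| \frac{\partial^2 \Delta_k^{-1}}{\partial k_i \partial k_j} \right\| \le \left\| \frac{\partial^2 \Delta_k^{-1}}{\partial w^2} \right\| + 2 \left\| \frac{\partial^2 \Delta_k^{-1}}{\partial z \partial w} \right\| + \left\| \frac{\partial^2 \Delta_k^{-1}}{\partial z^2} \right\| \le \frac{2^3}{\Lambda^3 |z|_R} + \frac{2^4}{\Lambda^2 |z|_R^2} + \frac{2^6}{\Lambda |z|_R^3} < \frac{5 \cdot 2^3}{\Lambda^3} \frac{1}{|z|_R}. \]
Hence, since $\|\hat A\|_{l^1} < 2\varepsilon/63$ and $\varepsilon < \Lambda/6$,
\[ \left\| \frac{\partial}{\partial k_j} \Delta_k^{-1} S \right\| \le \left\| \frac{\partial \Delta_k^{-1}}{\partial k_j} \right\| \|S\| + \|\Delta_k^{-1}\| \left\| \frac{\partial S}{\partial k_j} \right\| \le \frac{8}{\Lambda^2 |z|_R} \cdot \frac{3}{2} + \frac{2}{\Lambda |z|_R} \cdot \frac{18 \|\hat A\|_{l^1}}{\Lambda^2} \le \frac{13}{\Lambda^2} \frac{1}{|z|_R} \]
and
\[ \left\| \frac{\partial^2}{\partial k_i \partial k_j} \Delta_k^{-1} S \right\| \le \left\| \frac{\partial^2 \Delta_k^{-1}}{\partial k_i \partial k_j} \right\| \|S\| + \left\| \frac{\partial \Delta_k^{-1}}{\partial k_j} \right\| \left\| \frac{\partial S}{\partial k_i} \right\| + \left\| \frac{\partial \Delta_k^{-1}}{\partial k_i} \right\| \left\| \frac{\partial S}{\partial k_j} \right\| + \|\Delta_k^{-1}\| \left\| \frac{\partial^2 S}{\partial k_i \partial k_j} \right\| \]
\[ \le \frac{1}{|z|_R} \left( \frac{5 \cdot 2^3}{\Lambda^3} \cdot \frac{3}{2} + 2 \cdot \frac{8}{\Lambda^2} \cdot \frac{18 \|\hat A\|_{l^1}}{\Lambda^2} + \frac{2}{\Lambda} \cdot \frac{55 \|\hat A\|_{l^1}}{\Lambda^3} \left( \frac{8 \|\hat A\|_{l^1}}{\Lambda} + 1 \right) \right) < \frac{65}{\Lambda^3} \frac{1}{|z|_R}. \]
Therefore,
\[ C_{1;\Lambda,A,1,0} \le \frac{13}{\Lambda^2}, \qquad C_{1;\Lambda,A,0,1} \le \frac{13}{\Lambda^2} \qquad \text{and} \qquad C_{1;\Lambda,A,1,1} \le \frac{65}{\Lambda^3}, \]
as was to be shown.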
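The numerical constants 18, 55, 13 and 65 obtained in Step 3 can be re-derived mechanically. The sketch below is an illustration only: it assumes units in which $\Lambda = 1$, evaluates the bounds (95), (99) and (103) at the extreme values permitted by the hypotheses ($|z|_R = 2\Lambda$ and $\|\hat A\|_{l^1}$ equal to the supremum $2\varepsilon/63$ with $\varepsilon = \Lambda/6$), and uses exact rational arithmetic. Variable names are hypothetical.

```python
from fractions import Fraction as F

# Assumption: units with Lambda = 1, and the extreme values allowed by
# |z|_R >= 2*Lambda, ||A||_{l^1} < 2*eps/63, eps < Lambda/6.
LAM = F(1)
ZR = 2 * LAM                  # smallest admissible |z|_R
A = F(2, 63) * LAM / 6        # supremum of the admissible ||A||_{l^1}
S = F(3, 2)                   # ||S|| <= 14/13 <= 3/2, see (99)

# (103): sums of norms of first and second derivatives of S^{-1}
dS_inv = 4 * A / LAM**2 + 16 * A / ZR**2
d2S_inv = 8 * A / LAM**3 + 32 * A / (LAM * ZR**2) + 64 * A / ZR**3

dS = S**2 * dS_inv                            # bound for ||dS/dk_j||
d2S = 2 * S * dS * dS_inv + S**2 * d2S_inv    # bound for ||d^2 S/dk_i dk_j||

# (95): derivatives of Delta_k^{-1}, multiplied by |z|_R so that only the
# coefficient of 1/|z|_R remains
dD = (4 / (LAM**2 * ZR) + 8 / (LAM * ZR**2)) * ZR
d2D = (8 / (LAM**3 * ZR) + 16 / (LAM**2 * ZR**2) + 64 / (LAM * ZR**3)) * ZR
D0 = 2 / LAM                  # ||Delta_k^{-1}|| * |z|_R <= 2/Lambda

c_first = dD * S + D0 * dS                     # coefficient compared with 13
c_second = d2D * S + 2 * dD * dS + D0 * d2S    # coefficient compared with 65
```

With these inputs the intermediate bound for $\|\partial S/\partial k_j\|$ equals $18\|\hat A\|_{l^1}/\Lambda^2$ exactly, and the two final coefficients stay below 13 and 65, in agreement with the stated constants.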
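Step 4. To prove (93) for $j = 2$ we need to bound $\frac{\partial^{r+p} M^{(2)}}{\partial z^r \partial w^p} = \frac{\partial^{r+p} W}{\partial z^r \partial w^p}$. Recall from (64) that
\[ W = \sum_{j=1}^{\infty} W_j = \sum_{j=1}^{\infty} \sum_{m=1}^{j} (Y_{33})^{m-1} X_{33} (Y_{33})^{j-m}, \]
where $Y_{b,c}$ is given above by (98) and $\|X_{33}\| \le C/|z| < 1/3$ with
\[ X_{b,c} = \frac{(c-d) \cdot \hat A(b-c) - \hat q(b-c) - 2i\theta_\mu(\hat A(b-c))\, w}{(w - 2i\theta_\mu(c-d))(z - 2i\theta_\mu(c-d))}. \]
First observe that
\[ \frac{\partial^{r+p}}{\partial z^r \partial w^p} (Y_{33})^{m-1} X_{33} (Y_{33})^{j-m} \]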
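is given by a sum of $j^{r+p}$ terms of the form
\[ \frac{\partial^{l_1+n_1} Y_{33}}{\partial z^{l_1} \partial w^{n_1}} \cdots \frac{\partial^{l_{m-1}+n_{m-1}} Y_{33}}{\partial z^{l_{m-1}} \partial w^{n_{m-1}}} \, \frac{\partial^{l_m+n_m} X_{33}}{\partial z^{l_m} \partial w^{n_m}} \, \frac{\partial^{l_{m+1}+n_{m+1}} Y_{33}}{\partial z^{l_{m+1}} \partial w^{n_{m+1}}} \cdots \frac{\partial^{l_j+n_j} Y_{33}}{\partial z^{l_j} \partial w^{n_j}}, \]
where there are $j$ factors ordered as in the product $(Y_{33})^{m-1} X_{33} (Y_{33})^{j-m}$. Furthermore, for each term in the sum we have $\sum_{i=1}^{j} l_i = r$ and $\sum_{i=1}^{j} n_i = p$. Thus,
\[ \left\| \frac{\partial^{r+p}}{\partial z^r \partial w^p} W \right\| = \left\| \sum_{j=1}^{\infty} \frac{\partial^{r+p}}{\partial z^r \partial w^p} W_j \right\| \tag{104} \]
\[ = \left\| \sum_{j=1}^{\infty} \sum_{m=1}^{j} \frac{\partial^{r+p}}{\partial z^r \partial w^p} (Y_{33})^{m-1} X_{33} (Y_{33})^{j-m} \right\| \le \sum_{j=1}^{\infty} \sum_{m=1}^{j} \left\| \frac{\partial^{r+p}}{\partial z^r \partial w^p} (Y_{33})^{m-1} X_{33} (Y_{33})^{j-m} \right\| \]
\[ \le \sum_{j=1}^{\infty} j^{r+p} \sum_{m=1}^{j} \sup_{I} \left\| \frac{\partial^{l_1+n_1} Y_{33}}{\partial z^{l_1} \partial w^{n_1}} \cdots \frac{\partial^{l_m+n_m} X_{33}}{\partial z^{l_m} \partial w^{n_m}} \cdots \frac{\partial^{l_j+n_j} Y_{33}}{\partial z^{l_j} \partial w^{n_j}} \right\| \]
\[ \le \sum_{j=1}^{\infty} j^{r+p} \sum_{m=1}^{j} \sup_{I} \left\| \frac{\partial^{l_1+n_1} Y_{33}}{\partial z^{l_1} \partial w^{n_1}} \right\| \cdots \left\| \frac{\partial^{l_m+n_m} X_{33}}{\partial z^{l_m} \partial w^{n_m}} \right\| \cdots \left\| \frac{\partial^{l_j+n_j} Y_{33}}{\partial z^{l_j} \partial w^{n_j}} \right\|, \tag{105} \]
where
\[ I := \left\{ (l_i, n_i) \;\middle|\; l_i \le r \text{ and } n_i \le p \text{ for } 1 \le i \le j \text{ with } \sum_{i=1}^{j} l_i = r \text{ and } \sum_{i=1}^{j} n_i = p \right\}. \tag{106} \]
Note, we can differentiate the series (104) term-by-term because the sum $\sum_{j=1}^{\infty} W_j$ converges uniformly and the sum $\sum_{m=1}^{j}$ is finite. We next estimate the factors in (105). Combining (101) and (102) we have
\[ \left| \frac{\partial^{l_i+n_i}}{\partial z^{l_i} \partial w^{n_i}} Y_{b,c} \right| \le \left( 1 - \delta_{0,l_i} + \frac{|z|_R}{2\Lambda} \delta_{0,l_i} \right) \frac{2^{l_i+2}\, l_i!\, n_i!}{\Lambda^{n_i} |z|_R^{l_i+1}} |\hat A(b-c)|. \tag{107} \]
Furthermore, using (94) and (100),
\[ \left| \frac{\partial^{l_i+n_i}}{\partial z^{l_i} \partial w^{n_i}} X_{b,c} \right| = \left| \frac{\partial^{l_i}}{\partial z^{l_i}} \frac{(c-d) \cdot \hat A(b-c) - \hat q(b-c) - 2i\theta_\mu(\hat A(b-c))\, w}{z - 2i\theta_\mu(c-d)} \; \frac{\partial^{n_i}}{\partial w^{n_i}} \frac{1}{w - 2i\theta_\mu(c-d)} \right| \]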
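\[ = \left| \frac{(-1)^{l_i} l_i!\, (-1)^{n_i} n_i!\, \left( 2\theta_\mu(\hat A(b-c))\, 2\theta_\mu(c-d) - (c-d) \cdot \hat A(b-c) - \hat q(b-c) \right)}{(z - 2i\theta_\mu(c-d))^{l_i+1} (w - 2i\theta_\mu(c-d))^{n_i+1}} \right| \]
\[ \le \frac{l_i!\, n_i!\, \left( 2 |\hat A(b-c)|\, |c-d| + |\hat q(b-c)| \right)}{|z - 2i\theta_\mu(c-d)|^{l_i+1} \, |w - 2i\theta_\mu(c-d)|^{n_i+1}} \le \frac{2^{l_i+1}\, l_i!\, n_i!}{\Lambda^{n_i} |z|_R^{l_i+1}} \, \frac{2 |\hat A(b-c)|\, |c-d| + |\hat q(b-c)|}{|w - 2i\theta_\mu(c-d)|} \]
\[ \le \frac{2^{l_i+1}\, l_i!\, n_i!}{\Lambda^{n_i} |z|_R^{l_i+1}} \left( 4 |\hat A(b-c)| + \frac{1}{\Lambda} |\hat q(b-c)| \right). \tag{108} \]
Hence,
\[ \sup_{b \in G_3} \sum_{c \in G_3} \left| \frac{\partial^{l_i+n_i}}{\partial z^{l_i} \partial w^{n_i}} Y_{b,c} \right| + \sup_{c \in G_3} \sum_{b \in G_3} \left| \frac{\partial^{l_i+n_i}}{\partial z^{l_i} \partial w^{n_i}} Y_{b,c} \right| \]
\[ \le \left( 1 - \delta_{0,l_i} + \frac{|z|_R}{2\Lambda} \delta_{0,l_i} \right) \frac{2^{l_i+2}\, l_i!\, n_i!}{\Lambda^{n_i} |z|_R^{l_i+1}} \left[ \sup_{b \in G_3} \sum_{c \in G_3} |\hat A(b-c)| + \sup_{c \in G_3} \sum_{b \in G_3} |\hat A(b-c)| \right] \]
\[ \le \left( 1 - \delta_{0,l_i} + \frac{|z|_R}{2\Lambda} \delta_{0,l_i} \right) \frac{2^{l_i+3}\, l_i!\, n_i!}{\Lambda^{n_i} |z|_R^{l_i+1}} \|\hat A\|_{l^1} \]
and similarly
\[ \sup_{b \in G_3} \sum_{c \in G_3} \left| \frac{\partial^{l_i+n_i}}{\partial z^{l_i} \partial w^{n_i}} X_{b,c} \right| + \sup_{c \in G_3} \sum_{b \in G_3} \left| \frac{\partial^{l_i+n_i}}{\partial z^{l_i} \partial w^{n_i}} X_{b,c} \right| \le \frac{2^{l_i+2}\, l_i!\, n_i!}{\Lambda^{n_i} |z|_R^{l_i+1}} \left( 4 \|\hat A\|_{l^1} + \frac{\|\hat q\|_{l^1}}{\Lambda} \right). \]
Thus, by Proposition 5, since $|z| \ge |v| > R \ge 2\Lambda$,
\[ \left\| \frac{\partial^{l_i+n_i}}{\partial z^{l_i} \partial w^{n_i}} Y_{33} \right\| \le \left( 1 - \delta_{0,l_i} + \frac{|z|_R}{2\Lambda} \delta_{0,l_i} \right) \frac{2^{l_i+3}\, l_i!\, n_i!}{\Lambda^{n_i} |z|_R^{l_i+1}} \|\hat A\|_{l^1} \]
\[ \le \left( \frac{1}{|z|_R} + \frac{1}{2\Lambda} \right) \frac{2^{l_i+3}\, l_i!\, n_i!}{\Lambda^{n_i} |z|_R^{l_i}} \|\hat A\|_{l^1} \le \frac{2^{l_i+3}\, l_i!\, n_i!}{\Lambda^{n_i+1} |z|_R^{l_i}} \|\hat A\|_{l^1} \tag{109} \]
and
\[ \left\| \frac{\partial^{l_i+n_i}}{\partial z^{l_i} \partial w^{n_i}} X_{33} \right\| \le \frac{2^{l_i+2}\, l_i!\, n_i!}{\Lambda^{n_i} |z|_R^{l_i+1}} \left( 4 \|\hat A\|_{l^1} + \frac{1}{\Lambda} \|\hat q\|_{l^1} \right) = \frac{1}{|z|_R} \left( 2\Lambda + \frac{\|\hat q\|_{l^1}}{2 \|\hat A\|_{l^1}} \right) \frac{2^{l_i+3}\, l_i!\, n_i!}{\Lambda^{n_i+1} |z|_R^{l_i}} \|\hat A\|_{l^1}. \tag{110} \]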
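Applying these estimates to (105) and recalling that $\sum_{i=1}^{j} l_i = r$ and $\sum_{i=1}^{j} n_i = p$ we have
\[ \left\| \frac{\partial^{r+p}}{\partial z^r \partial w^p} W \right\| \le \sum_{j=1}^{\infty} j^{r+p} \sum_{m=1}^{j} \sup_{I} \left\| \frac{\partial^{l_1+n_1} Y_{33}}{\partial z^{l_1} \partial w^{n_1}} \right\| \cdots \left\| \frac{\partial^{l_m+n_m} X_{33}}{\partial z^{l_m} \partial w^{n_m}} \right\| \cdots \left\| \frac{\partial^{l_j+n_j} Y_{33}}{\partial z^{l_j} \partial w^{n_j}} \right\| \]
\[ \le \sum_{j=1}^{\infty} j^{r+p} \sum_{m=1}^{j} \sup_{I} \left( 2\Lambda + \frac{\|\hat q\|_{l^1}}{2 \|\hat A\|_{l^1}} \right) \frac{1}{|z|_R} \prod_{i=1}^{j} \frac{2^{l_i+3}\, l_i!\, n_i!}{\Lambda^{n_i+1} |z|_R^{l_i}} \|\hat A\|_{l^1} \]
\[ = \left( 2\Lambda + \frac{\|\hat q\|_{l^1}}{2 \|\hat A\|_{l^1}} \right) \frac{1}{|z|_R} \frac{2^r}{\Lambda^p |z|_R^r} \sum_{j=1}^{\infty} j^{r+p} \left( \frac{8 \|\hat A\|_{l^1}}{\Lambda} \right)^j \sum_{m=1}^{j} \sup_{I} \prod_{i=1}^{j} l_i! \prod_{m=1}^{j} n_m! \]
\[ \le \left( 2\Lambda + \frac{\|\hat q\|_{l^1}}{2 \|\hat A\|_{l^1}} \right) \frac{2^r\, r!\, p!}{\Lambda^p |z|_R^{r+1}} \sum_{j=1}^{\infty} j^{r+p+1} \left( \frac{1}{21} \right)^j \le C_{\Lambda,A,q,r,p} \frac{1}{|z|_R^{r+1}}. \]
This is the inequality we needed to prove (93) for $j = 2$. In fact, using (95) we obtain
\[ \sup_{p \le m} \sup_{r \le n} \left\| \frac{\partial^{n-r+m-p} \Delta_k^{-1}}{\partial z^{n-r} \partial w^{m-p}} \right\| \left\| \frac{\partial^{r+p} M^{(2)}}{\partial z^r \partial w^p} \right\| \le \sup_{p \le m} \sup_{r \le n} \frac{(m-p)!\, (n-r)!\, 2^{n-r+2}}{\Lambda^{m-p+1} |z|_R^{n-r+1}} \frac{C_{\Lambda,A,q,r,p}}{|z|_R^{r+1}} \le C_{\Lambda,A,q,m,n} \frac{1}{|z|_R^{n+2}}. \]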
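The last step of the estimate for $W$ relies on the convergence of $\sum_{j \ge 1} j^{r+p+1} (1/21)^j$ and on the ratio $8\|\hat A\|_{l^1}/\Lambda$ being below $1/21$. The following Python sketch, offered only as an illustration, checks the ratio under the hypotheses $\|\hat A\|_{l^1} < 2\varepsilon/63$ and $\varepsilon < \Lambda/6$, and evaluates partial sums exactly. The function name and the truncation length are assumptions made for this example.

```python
from fractions import Fraction

def weighted_geometric_sum(power, ratio, terms=200):
    """Partial sum of sum_{j>=1} j**power * ratio**j in exact arithmetic."""
    return sum(Fraction(j) ** power * ratio ** j for j in range(1, terms + 1))

# Supremum of 8 * ||A||_{l^1} / Lambda under the hypotheses
# ||A||_{l^1} < 2*eps/63 and eps < Lambda/6 (units with Lambda = 1).
ratio_sup = 8 * Fraction(2, 63) / 6
```

For `power = 0` and `power = 1` the partial sums can be compared with the closed forms $x/(1-x)$ and $x/(1-x)^2$ at $x = 1/21$, that is, $1/20$ and $21/400$.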
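Step 5. To prove (93) for $j = 3$ we need to estimate $\frac{\partial^{r+p} M^{(3)}}{\partial z^r \partial w^p} = \frac{\partial^{r+p} Z}{\partial z^r \partial w^p}$, where
\[ Z = \sum_{j=2}^{\infty} Z_j = \sum_{j=2}^{\infty} \left( (X_{33} + Y_{33})^j - W_j - Y_{33}^j \right). \]
First observe that
\[ \frac{\partial^{r+p}}{\partial z^r \partial w^p} Z_j = \frac{\partial^{r+p}}{\partial z^r \partial w^p} \left( (X_{33} + Y_{33})^j - W_j - Y_{33}^j \right) \]
is given by a sum of $(2^j - j - 1) \cdot j^{r+p}$ terms of the form
\[ \underbrace{\frac{\partial^{l_1+n_1} Y_{33}}{\partial z^{l_1} \partial w^{n_1}} \cdots \frac{\partial^{l_m+n_m} X_{33}}{\partial z^{l_m} \partial w^{n_m}} \cdots \frac{\partial^{l_j+n_j} Y_{33}}{\partial z^{l_j} \partial w^{n_j}}}_{j \text{ factors}}, \tag{111} \]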
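where there are $j - 2$ factors involving $X_{33}$ or $Y_{33}$ and two factors containing $X_{33}$. Furthermore, for each term in the sum we have $\sum_{i=1}^{j} l_i = r$ and $\sum_{i=1}^{j} n_i = p$. Thus,
\[ \left\| \frac{\partial^{r+p}}{\partial z^r \partial w^p} Z_j \right\| \le (2^j - j - 1)\, j^{r+p} \sup_{I} \left\| \frac{\partial^{l_1+n_1} Y_{33}}{\partial z^{l_1} \partial w^{n_1}} \right\| \cdots \left\| \frac{\partial^{l_m+n_m} X_{33}}{\partial z^{l_m} \partial w^{n_m}} \right\| \cdots \left\| \frac{\partial^{l_j+n_j} Y_{33}}{\partial z^{l_j} \partial w^{n_j}} \right\|, \]
where the set $I$ is given above by (106). Now observe that the estimate for the derivatives of $X_{33}$ in (110) is better than the estimate for the derivatives of $Y_{33}$ in (109) because the former has an extra factor $C_{\Lambda,A,q}/|z|_R < 1$. Since the product (111) has at least two factors containing $X_{33}$, we can estimate any of these products by considering the worst case. This happens when there are exactly two factors involving $X_{33}$. Hence, by proceeding in this way, for each $j \ge 2$ we have
\[ \left\| \frac{\partial^{r+p}}{\partial z^r \partial w^p} Z_j \right\| \le (2^j - j - 1)\, j^{r+p} \sup_{I} \left( 2\Lambda + \frac{\|\hat q\|_{l^1}}{2 \|\hat A\|_{l^1}} \right)^2 \frac{1}{|z|_R^2} \prod_{i=1}^{j} \frac{2^{l_i+3}\, l_i!\, n_i!}{\Lambda^{n_i+1} |z|_R^{l_i}} \|\hat A\|_{l^1} \]
\[ \le 2^j j^{r+p} \left( 2\Lambda + \frac{\|\hat q\|_{l^1}}{2 \|\hat A\|_{l^1}} \right)^2 \frac{1}{|z|_R^2} \frac{2^r\, r!\, p!}{\Lambda^p |z|_R^r} \left( \frac{8 \|\hat A\|_{l^1}}{\Lambda} \right)^j \le C_{\Lambda,A,q,r,p}\, j^{r+p} \left( \frac{2}{21} \right)^j \frac{1}{|z|_R^{r+2}}, \]
since $\|\hat A\|_{l^1} \le 2\varepsilon/63$ and $\varepsilon < \Lambda/6$. Thus,
\[ \left\| \frac{\partial^{r+p}}{\partial z^r \partial w^p} Z \right\| \le \sum_{j=2}^{\infty} \left\| \frac{\partial^{r+p}}{\partial z^r \partial w^p} Z_j \right\| \le \frac{C_{\Lambda,A,q,r,p}}{|z|_R^{r+2}} \sum_{j=2}^{\infty} j^{r+p} \left( \frac{2}{21} \right)^j \le \frac{C_{\Lambda,A,q,r,p}}{|z|_R^{r+2}}. \]
Therefore, recalling (95),
\[ \sup_{p \le m} \sup_{r \le n} \left\| \frac{\partial^{n-r+m-p} \Delta_k^{-1}}{\partial z^{n-r} \partial w^{m-p}} \right\| \left\| \frac{\partial^{r+p} M^{(3)}}{\partial z^r \partial w^p} \right\| \le \sup_{p \le m} \sup_{r \le n} \frac{(m-p)!\, (n-r)!\, 2^{n-r+2}}{\Lambda^{m-p+1} |z|_R^{n-r+1}} \frac{C_{\Lambda,A,q,r,p}}{|z|_R^{r+2}} \le C_{\Lambda,A,q,m,n} \frac{1}{|z|_R^{n+3}}. \]
This is the desired inequality for $j = 3$. The proof of the proposition is complete.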
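The count $2^j - j - 1$ used in Step 5 is the number of words of length $j$ in the two letters $X_{33}$, $Y_{33}$ that contain at least two copies of $X_{33}$: expanding $(X_{33} + Y_{33})^j$ gives $2^j$ words, removing $Y_{33}^j$ discards the word without $X_{33}$, and removing $W_j$ discards the $j$ words with exactly one $X_{33}$. A brute-force check in Python, given purely as an illustrative sketch with a hypothetical function name:

```python
from itertools import product

def count_words_with_two_x(j):
    """Count length-j words in the letters X, Y containing at least two X's."""
    return sum(1 for word in product("XY", repeat=j) if word.count("X") >= 2)
```

For instance, `count_words_with_two_x(3)` counts the words XXY, XYX, YXX and XXX.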
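We can now prove Lemma 8. We first prove it for $1 \le j \le 2$ and then for $j = 3$ separately.

Proof of Lemma 8 for $1 \le j \le 2$. Define the $|B| \times |C|$ matrices
\[ F_{BC} := [f(b-c)]_{b \in B,\, c \in C} \quad \text{and} \quad G_{BC} := [g(b-c)]_{b \in B,\, c \in C}, \]
and write $w = w_{\mu,d}$, $z = z_{\mu,d}$ and $|z|_R = 2|z| - R$. First observe that, for $1 \le j \le 2$, the functions
\[ [\alpha^{(j)}_{\mu,d}(k)]_{d \in G} = \left[ \sum_{b,c \in G_1} \frac{f(d-b)\, M^{(j)}_{b,c}\, g(c-d)}{(w - 2i\theta_\mu(b-d))(z - 2i\theta_\mu(b-d))} \right]_{d \in G} \]
are the diagonal entries of the matrix $F_{G G_1} \Delta_k^{-1} M^{(j)} G_{G_1 G}$. Thus, similarly as in the proof of Lemma 6, by Proposition 20, for $1 \le j \le 2$,
\[ \left| \frac{\partial^{n+m}}{\partial k_1^n \partial k_2^m} \alpha^{(j)}_{\mu,d}(k) \right| \le \|F_{G G_1}\| \left\| \frac{\partial^{n+m}}{\partial k_1^n \partial k_2^m} \Delta_k^{-1} M^{(j)} \right\| \|G_{G_1 G}\| \le C_j \frac{1}{|z|_R^j}, \]
where $C_1 = C_{1;\Lambda,A,n,m,f,g}$ and $C_2 = C_{2;\Lambda,A,q,n,m,f,g}$ are constants. Furthermore,
\[ C_{1;\Lambda,A,1,0,f,g} \le \frac{13}{\Lambda^2} \|f\|_{l^1} \|g\|_{l^1}, \qquad C_{1;\Lambda,A,0,1,f,g} \le \frac{13}{\Lambda^2} \|f\|_{l^1} \|g\|_{l^1} \qquad \text{and} \qquad C_{1;\Lambda,A,1,1,f,g} \le \frac{65}{\Lambda^3} \|f\|_{l^1} \|g\|_{l^1}. \]
This proves the lemma for $1 \le j \le 2$.

Proof of Lemma 8 for $j = 3$. We need to estimate
\[ \frac{\partial^{n+m}}{\partial k_1^n \partial k_2^m} \alpha^{(3)}_{\mu,d}(k) = \sum_{j=1}^{4} \frac{\partial^{n+m}}{\partial k_1^n \partial k_2^m} R_j(k), \]
where $R_1, \ldots, R_4$ are given by (46), (47), (75) and (67), respectively.

Step 1. We begin with the terms involving $R_1$ and $R_2$, which are easier. We follow the same notation as above. First observe that, similarly as in the proof of Lemma 6, since $\Delta_k^{-1} R_{GG}^{-1} = H_k^{-1}$ on $L^2_G$, we have
\[ \left| \frac{\partial^{n+m}}{\partial k_1^n \partial k_2^m} R_1(k) \right| = \left| F_{\{d\} G_1} \frac{\partial^{n+m} H_k^{-1}}{\partial k_1^n \partial k_2^m} G_{G_2 \{d\}} \right| \le \|F_{\{d\} G_1}\| \left\| \frac{\partial^{n+m} H_k^{-1}}{\partial k_1^n \partial k_2^m} \right\| \|G_{G_2 \{d\}}\|, \]
\[ \left| \frac{\partial^{n+m}}{\partial k_1^n \partial k_2^m} R_2(k) \right| = \left| F_{\{d\} G_2} \frac{\partial^{n+m} H_k^{-1}}{\partial k_1^n \partial k_2^m} G_{G \{d\}} \right| \le \|F_{\{d\} G_2}\| \left\| \frac{\partial^{n+m} H_k^{-1}}{\partial k_1^n \partial k_2^m} \right\| \|G_{G \{d\}}\|. \]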
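Furthermore, we have already proved that $\|F_{\{d\} G_1}\| \le \|f\|_{l^1}$ and $\|G_{G \{d\}}\| \le \|g\|_{l^1}$ (see (90) and (91)), and since $|z| \le 3|v|$, by Proposition 11,
\[ \left\| \frac{\partial^{n+m} H_k^{-1}}{\partial k_1^n \partial k_2^m} \right\| \le \varepsilon^{-(n+m+1)} C_{\Lambda,A,n,m} \frac{1}{|z|}. \]
Now recall that $G_2 = \{ b \in G \mid |b - d| > \tfrac{1}{4} R \}$. Then,
\[ \sup_{b \in \{d\}} \sum_{c \in G_2} |f(b-c)| \le \sum_{c \in G_2} \frac{|d-c|^2}{|d-c|^2} |f(d-c)| \le \|b^2 f(b)\|_{l^1} \sup_{c \in G_2} \frac{1}{|d-c|^2} \le \frac{16}{R^2} \|b^2 f(b)\|_{l^1}, \]
\[ \sup_{c \in G_2} \sum_{b \in \{d\}} |f(b-c)| \le \sup_{c \in G_2} \frac{|d-c|^2}{|d-c|^2} |f(d-c)| \le \|b^2 f(b)\|_{l^1} \sup_{c \in G_2} \frac{1}{|d-c|^2} \le \frac{16}{R^2} \|b^2 f(b)\|_{l^1}. \]
Hence, by Proposition 5,
\[ \|F_{\{d\} G_2}\| \le 16 \|b^2 f(b)\|_{l^1} \frac{1}{R^2}. \]
Similarly, with $g$ in place of $f$,
\[ \|G_{G_2 \{d\}}\| \le 16 \|b^2 g(b)\|_{l^1} \frac{1}{R^2}. \]
Therefore, combining all this, for $1 \le j \le 2$ we obtain
\[ \left| \frac{\partial^{n+m}}{\partial k_1^n \partial k_2^m} R_j(k) \right| \le \varepsilon^{-(n+m+1)} C_{\Lambda,A,n,m,f,g} \frac{1}{|z| R^2}. \]

Step 2. Recall from (67) the expression for $R_4$. Then, similarly as above, by applying Proposition 20 for $j = 3$ we find that
\[ \left| \frac{\partial^{n+m}}{\partial k_1^n \partial k_2^m} R_4(k) \right| \le \|F_{\{d\} G_1}\| \left\| \frac{\partial^{n+m}}{\partial k_1^n \partial k_2^m} \Delta_k^{-1} Z \right\| \|G_{G_1 \{d\}}\| \le \|f\|_{l^1} \|g\|_{l^1} C_{\Lambda,A,q,n,m} \frac{1}{|z|_R^3}. \]

Step 3. To bound the derivatives of $R_3$ (which is given by (75)) we need a few more estimates. Recall from (70) that $W_{43}^{(j)} = \pi_{G_4} T_{GG}^{j+1} \pi_{G_3}$. First observe that
\[ \frac{\partial^{r+p}}{\partial k_1^r \partial k_2^p} \Delta_k^{-1} \pi_{G_1} T_{33}^m T_{34} W_{43}^{(j-m-1)} = \frac{\partial^{r+p}}{\partial k_1^r \partial k_2^p} \Delta_k^{-1} \pi_{G_1} T_{33}^m T_{34} T_{GG}^{j-m} \pi_{G_3} \]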
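is given by a sum of $(j+2)^{r+p}$ terms of the form
\[ \frac{\partial^{l_1+n_1} \Delta_k^{-1}}{\partial k_1^{l_1} \partial k_2^{n_1}} \pi_{G_1} \frac{\partial^{l_2+n_2} T_{33}}{\partial k_1^{l_2} \partial k_2^{n_2}} \cdots \frac{\partial^{l_{m+2}+n_{m+2}} T_{34}}{\partial k_1^{l_{m+2}} \partial k_2^{n_{m+2}}} \times \frac{\partial^{l_{m+3}+n_{m+3}} T_{GG}}{\partial k_1^{l_{m+3}} \partial k_2^{n_{m+3}}} \cdots \frac{\partial^{l_{j+2}+n_{j+2}} T_{GG}}{\partial k_1^{l_{j+2}} \partial k_2^{n_{j+2}}} \pi_{G_3}. \]
Moreover, for each term in the sum we have $\sum_{i=1}^{j+2} l_i = r$ and $\sum_{i=1}^{j+2} n_i = p$. Thus,
\[ \left\| \frac{\partial^{r+p}}{\partial k_1^r \partial k_2^p} \pi_{G_1} \Delta_k^{-1} T_{33}^m T_{34} W_{43}^{(j-m-1)} \right\| \le (j+2)^{r+p} \sup_{I} \left\| \left( \prod_{i=1}^{j+2} \frac{\partial^{l_i+n_i} T_{(i)}}{\partial k_1^{l_i} \partial k_2^{n_i}} \right) \pi_{G_3} \right\|, \tag{112} \]
where the set $I$ is given by (106) with $j$ replaced by $j + 2$ and
\[ T_{(i)} := \begin{cases} \Delta_k^{-1} \pi_{G_1} & \text{for } i = 1, \\ T_{33} & \text{for } 2 \le i \le m+1, \\ T_{34} & \text{for } i = m+2, \\ T_{GG} & \text{for } m+3 \le i \le j+2. \end{cases} \tag{113} \]

Step 3a. The first step in bounding (112) is to estimate $\frac{\partial^{r+p} \Delta_k^{-1}}{\partial k_1^r \partial k_2^p} \pi_{G_1}$. We follow the same argument that we have used in the proof of Lemma 6 to bound $\frac{\partial^{n+m} H_k^{-1}}{\partial k_1^n \partial k_2^m}$. In fact, in view of (85) one can see that
\[ \frac{\partial^p \Delta_k^{-1}}{\partial k_2^p} = \sum_{\substack{\text{finite sum where \# of} \\ \text{terms depends on } p}} \Delta_k^{-1} \prod_{j=1}^{p} \left( \frac{\partial^{n_j} \Delta_k}{\partial k_2^{n_j}} \Delta_k^{-1} \right), \tag{114} \]
where $\sum_{j=1}^{p} n_j = p$. Hence, when we compute $\frac{\partial^r}{\partial k_1^r} \frac{\partial^p \Delta_k^{-1}}{\partial k_2^p}$, the derivative $\frac{\partial^r}{\partial k_1^r}$ acts either on $\Delta_k^{-1}$ or on $\frac{\partial^{n_j} \Delta_k}{\partial k_2^{n_j}}$. However, since $\left( \frac{\partial \Delta_k}{\partial k_2} \right)_{b,c} = 2(k_2 + c_2) \delta_{b,c}$, we have $\frac{\partial^r}{\partial k_1^r} \frac{\partial^{n_j}}{\partial k_2^{n_j}} \Delta_k = 0$ if $n_j \ge 1$ and $\frac{\partial^r}{\partial k_1^r} \frac{\partial^{n_j}}{\partial k_2^{n_j}} \Delta_k = \frac{\partial^r}{\partial k_1^r} \Delta_k$ if $n_j = 0$. Similarly, using again (85) one can see that $\frac{\partial^r \Delta_k^{-1}}{\partial k_1^r}$ is given by a finite sum as in (114), with $p$ and $k_2$ replaced by $r$ and $k_1$, respectively, and $\sum_{j=1}^{r} n_j = r$. Thus, combining all this we conclude that
\[ \frac{\partial^{r+p} \Delta_k^{-1}}{\partial k_1^r \partial k_2^p} = \sum_{\substack{\text{finite sum where \# of terms} \\ \text{depends on } r \text{ and } p}} \Delta_k^{-1} \prod_{j=1}^{r+p} \left( \frac{\partial^{n_j} \Delta_k}{\partial k_{i_j}^{n_j}} \Delta_k^{-1} \right), \tag{115} \]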
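where $\sum_{j=1}^{r+p} n_j \delta_{2,i_j} = p$ and $\sum_{j=1}^{r+p} n_j \delta_{1,i_j} = r$. If we observe that
\[ \left( \frac{\partial^{n_j} \Delta_k}{\partial k_{i_j}^{n_j}} \right)_{b,c} = \begin{cases} 2(k_{i_j} + c_{i_j}) \delta_{b,c} & \text{if } n_j = 1, \\ 2 \delta_{b,c} & \text{if } n_j = 2, \\ 0 & \text{if } n_j \ge 3, \end{cases} \]
and extract the “leading term” from the summation in (115), in a sense that will be clear below, we can rewrite (115) in terms of matrix elements as
\[ \frac{\partial^{r+p}}{\partial k_1^r \partial k_2^p} \frac{1}{N_c(k)} = \frac{(-1)^{r+p} (r+p)!}{N_c(k)} \left( \frac{2(k_1 + c_1)}{N_c(k)} \right)^r \left( \frac{2(k_2 + c_2)}{N_c(k)} \right)^p + \sum_{\substack{\text{finite sum where \# of terms} \\ \text{depends on } r \text{ and } p}} \frac{(2(k_1 + c_1))^{\alpha_j} (2(k_2 + c_2))^{\beta_j}}{N_c(k)^{r+p+1}}, \]
where $\alpha_j + \beta_j < r + p$ for every $j$ in the summation. Recall from (88) and (89) that, for all $c \in G \setminus \{\tilde c\}$,
\[ \frac{|k_i + c_i|}{|N_c(k)|} \le \frac{2}{\Lambda} < \frac{1}{3\varepsilon} < \frac{7}{2\varepsilon} \quad \text{and} \quad \frac{|k_i + \tilde c_i|}{|N_{\tilde c}(k)|} \le \frac{\Lambda + 3|v|}{\varepsilon |v|} \le \frac{7}{2\varepsilon}. \]
Hence,
\[ \left| \frac{\partial^{r+p}}{\partial k_1^r \partial k_2^p} \frac{1}{N_c(k)} \right| \le \frac{(r+p)!}{|N_c(k)|} \left( \frac{7}{\varepsilon} \right)^{r+p} + \sum_{\substack{\text{finite sum where \# of terms} \\ \text{depends on } r \text{ and } p}} \left( \frac{7}{\varepsilon} \right)^{\alpha_j + \beta_j} \frac{1}{|N_c(k)|^2} \tag{116} \]
\[ \le \frac{(r+p)!}{|N_c(k)|} \left( \frac{7}{\varepsilon} \right)^{r+p} + C_{\varepsilon,r,p} \frac{1}{|N_c(k)|^2}. \tag{117} \]
Thus, by Proposition 5, since $|N_c(k)| \ge \varepsilon |v| \ge \varepsilon |z| / 3$ for all $c \in G$, we have
\[ \left\| \frac{\partial^{r+p} \Delta_k^{-1}}{\partial k_1^r \partial k_2^p} \pi_{G_1} \right\| \le \frac{7^{r+p} (r+p)!}{\varepsilon^{r+p+1}} \frac{3}{|z|} + \frac{C_{\varepsilon,r,p}}{|z|^2}. \tag{118} \]
Now, let $\rho_1 = \rho_{1;\varepsilon,r,p}$ be the constant
\[ \rho_{1;\varepsilon,r,p} := \max_{\substack{l_1 \le r \\ n_1 \le p}} \frac{\varepsilon^{l_1+n_1+1} C_{\varepsilon,l_1,n_1}}{4 (l_1+n_1)!\, 7^{l_1+n_1}}, \]
where $C_{\varepsilon,l_1,n_1}$ is the constant in (118). Then, for $|z| > \rho_1$ and for any $l_1 \le r$ and any $n_1 \le p$,
\[ \left\| \frac{\partial^{l_1+n_1} \Delta_k^{-1}}{\partial k_1^{l_1} \partial k_2^{n_1}} \pi_{G_1} \right\| \le \frac{7^{l_1+n_1} (l_1+n_1)!}{\varepsilon^{l_1+n_1+1}} \frac{3}{|z|} + \frac{7^{l_1+n_1} (l_1+n_1)!}{\varepsilon^{l_1+n_1+1}} \frac{4}{|z|} = (l_1+n_1)! \left( \frac{7}{\varepsilon} \right)^{l_1+n_1+1} \frac{1}{|z|}. \tag{119} \]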
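The role of the threshold $\rho_1$ is to absorb the $1/|z|^2$ term of (118) into a multiple of the $1/|z|$ term, which yields the single-term bound (119). The Python sketch below illustrates this with exact rationals. The constants $C_{\varepsilon,l,n}$ are not specified in the text, so the dictionary `C` holds arbitrary stand-in values, and the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def rho1(eps, C, r, p):
    """Threshold rho_1 as defined above; C[(l, n)] are hypothetical stand-ins
    for the unspecified constants C_{eps,l,n} appearing in (118)."""
    return max(
        eps ** (l + n + 1) * C[(l, n)] / (4 * factorial(l + n) * 7 ** (l + n))
        for l in range(r + 1) for n in range(p + 1)
    )

def bound_118(eps, C, l, n, z):
    """Right-hand side of (118) with (r, p) replaced by (l, n)."""
    lead = Fraction(7 ** (l + n) * factorial(l + n) * 3) / (eps ** (l + n + 1) * z)
    return lead + C[(l, n)] / z ** 2

def bound_119(eps, l, n, z):
    """Right-hand side of (119): (l + n)! (7/eps)^(l+n+1) / |z|."""
    return factorial(l + n) * (Fraction(7) / eps) ** (l + n + 1) / z
```

For any positive choice of the stand-in constants and any $|z| > \rho_1$, `bound_118` does not exceed `bound_119`, which is the content of (119).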
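This is the first inequality we need to bound (112). We next estimate the other factors in that expression.

Step 3b. Recall from (53) that
\[ T_{b,c} = \frac{1}{N_c(k)} \left( 2(c+k) \cdot \hat A(b-c) - \hat q(b-c) \right). \]
By direct calculation we have
\[ \frac{\partial^{r+p} T_{b,c}}{\partial k_1^r \partial k_2^p} = \left( \frac{\partial^{r+p}}{\partial k_1^r \partial k_2^p} \frac{1}{N_c(k)} \right) \left( 2(c+k) \cdot \hat A(b-c) - \hat q(b-c) \right) + r \left( \frac{\partial^{r-1+p}}{\partial k_1^{r-1} \partial k_2^p} \frac{1}{N_c(k)} \right) 2 \hat A_1(b-c) + p \left( \frac{\partial^{r+p-1}}{\partial k_1^r \partial k_2^{p-1}} \frac{1}{N_c(k)} \right) 2 \hat A_2(b-c). \]
Hence, using (116) and (117), since $|N_c(k)| \ge \varepsilon |v| \ge \varepsilon |z| / 3$ for all $c \in G$ and $|v| > 1$,
\[ \left| \frac{\partial^{r+p} T_{b,c}}{\partial k_1^r \partial k_2^p} \right| \le \left( (r+p)! \left( \frac{7}{\varepsilon} \right)^{r+p} + \frac{C_{\varepsilon,r,p}}{\varepsilon |v|} \right) \left( \frac{7}{\varepsilon} |\hat A(b-c)| + \frac{|\hat q(b-c)|}{\varepsilon |v|} \right) + \frac{C_{\varepsilon,r,p}}{|v|} |\hat A(b-c)| \]
\[ \le (r+p)! \left( \frac{7}{\varepsilon} \right)^{r+p+1} |\hat A(b-c)| + \frac{C_{\varepsilon,r,p}}{|z|} \left( |\hat A(b-c)| + |\hat q(b-c)| \right). \tag{120} \]
Therefore, by Proposition 5,
\[ \left\| \frac{\partial^{r+p} T_{GG}}{\partial k_1^r \partial k_2^p} \right\| \le \Theta_{r,p}, \tag{121} \]
where
\[ \Theta_{r,p} := (r+p)! \left( \frac{7}{\varepsilon} \right)^{r+p+1} \|\hat A\|_{l^1} + C_{\varepsilon,A,q,r,p} \frac{1}{|z|}. \tag{122} \]
This is the second estimate we need to bound (112). We next derive one more inequality.

Step 3c. Set
\[ Q^{r,p}_{b,c} := (1 + |b-c|^2) \frac{\partial^{r+p} T_{b,c}}{\partial k_1^r \partial k_2^p}. \]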
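We first prove that, for any $B, C \subset G$,
\[ \sup_{b \in B} \sum_{c \in C} |Q^{r,p}_{b,c}| \le \Omega_{r,p} \quad \text{and} \quad \sup_{c \in C} \sum_{b \in B} |Q^{r,p}_{b,c}| \le \Omega_{r,p}, \]
where
\[ \Omega_{r,p} := (r+p)! \left( \frac{7}{\varepsilon} \right)^{r+p+1} \|(1+b^2) \hat A(b)\|_{l^1} + C_{\varepsilon,A,q,r,p} \frac{1}{|z|}. \tag{123} \]
In fact, in view of (120) we have
\[ \sup_{b \in B} \sum_{c \in C} |Q^{r,p}_{b,c}| = \sup_{b \in B} \sum_{c \in C} (1 + |b-c|^2) \left| \frac{\partial^{r+p} T_{b,c}}{\partial k_1^r \partial k_2^p} \right| \]
\[ \le \sup_{b \in B} \sum_{c \in C} (1 + |b-c|^2) \left[ (r+p)! \left( \frac{7}{\varepsilon} \right)^{r+p+1} |\hat A(b-c)| + \frac{C_{\varepsilon,r,p}}{|z|} \left( |\hat A(b-c)| + |\hat q(b-c)| \right) \right] \]
\[ \le (r+p)! \left( \frac{7}{\varepsilon} \right)^{r+p+1} \|(1+b^2) \hat A(b)\|_{l^1} + C_{\varepsilon,A,q,r,p} \frac{1}{|z|}, \]
and similarly we estimate $\sup_{c \in C} \sum_{b \in B} |Q^{r,p}_{b,c}|$. Now observe that, as in (78), for any integer $m \ge 0$ and for any $\xi_0, \xi_1, \ldots, \xi_{m+2} \in \Gamma^\#$, let $b = \xi_0$ and $c = \xi_{m+2}$. Then,
\[ |b-c|^2 \le 2(m+2) \sum_{i=1}^{m+2} |\xi_{i-1} - \xi_i|^2. \]
To simplify the notation write $\partial^{l_i,n_i} = \frac{\partial^{l_i+n_i}}{\partial k_1^{l_i} \partial k_2^{n_i}}$, and recall from (113) and (123) the definition of $T_{(i)}$ and $\Omega_{r,p}$. Hence, similarly as in the proof of Proposition 17, since $|b-c| \ge R/4$ for all $b \in G_1$ and $c \in G_4$,
\[ \sup_{b \in G_1} \sum_{c \in G_4} \left| \left( \prod_{i=2}^{m+2} \partial^{l_i,n_i} T_{(i)} \right)_{b,c} \right| \le \sup_{\substack{b \in G_1 \\ c \in G_4}} \frac{1}{1 + |b-c|^2} \; \sup_{b \in G_1} \sum_{c \in G_4} (1 + |b-c|^2) \left| \left( \prod_{i=2}^{m+2} \partial^{l_i,n_i} T_{(i)} \right)_{b,c} \right| \]
\[ \le \frac{2(m+2)}{1 + \frac{1}{16} R^2} \sup_{b \in G_1} \sum_{\xi_1 \in G_3} (1 + |b - \xi_1|^2) |\partial^{l_2,n_2} T_{b,\xi_1}| \times \sum_{\xi_2 \in G_3} (1 + |\xi_1 - \xi_2|^2) |\partial^{l_3,n_3} T_{\xi_1,\xi_2}| \cdots \times \sum_{c \in G_4} (1 + |\xi_{m+1} - c|^2) |\partial^{l_{m+2},n_{m+2}} T_{\xi_{m+1},c}| \]
\[ \le \frac{2(m+2)}{1 + \frac{1}{16} R^2} \sup_{b \in G_1} \sum_{\xi_1 \in G_3} (1 + |b - \xi_1|^2) |\partial^{l_2,n_2} T_{b,\xi_1}| \times \sup_{\xi_1 \in G_3} \sum_{\xi_2 \in G_3} (1 + |\xi_1 - \xi_2|^2) |\partial^{l_3,n_3} T_{\xi_1,\xi_2}| \cdots \times \sup_{\xi_{m+1} \in G_3} \sum_{c \in G_4} (1 + |\xi_{m+1} - c|^2) |\partial^{l_{m+2},n_{m+2}} T_{\xi_{m+1},c}| \]
\[ = \frac{2(m+2)}{1 + \frac{1}{16} R^2} \sup_{b \in G_1} \sum_{\xi_1 \in G_3} |Q^{l_2,n_2}_{b,\xi_1}| \cdots \sup_{\xi_{m+1} \in G_3} \sum_{c \in G_4} |Q^{l_{m+2},n_{m+2}}_{\xi_{m+1},c}| \le \frac{2(m+2)}{1 + \frac{1}{16} R^2} \prod_{i=2}^{m+2} \Omega_{l_i,n_i} \]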
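and similarly
\[ \sup_{c \in G_4} \sum_{b \in G_1} \left| \left( \prod_{i=2}^{m+2} \partial^{l_i,n_i} T_{(i)} \right)_{b,c} \right| \le \frac{2(m+2)}{1 + \frac{1}{16} R^2} \prod_{i=2}^{m+2} \Omega_{l_i,n_i}. \]
Therefore, by Proposition 5,
\[ \left\| \pi_{G_1} \prod_{i=2}^{m+2} \frac{\partial^{l_i+n_i} T_{(i)}}{\partial k_1^{l_i} \partial k_2^{n_i}} \right\| \le \frac{2(m+2)}{1 + \frac{1}{16} R^2} \prod_{i=2}^{m+2} \Omega_{l_i,n_i}. \]
We have all we need to bound (112).

Step 3d. From (121) and (119) it follows that
\[ \left\| \prod_{i=m+3}^{j+2} \frac{\partial^{l_i+n_i} T_{(i)}}{\partial k_1^{l_i} \partial k_2^{n_i}} \right\| \le \prod_{i=m+3}^{j+2} \Theta_{l_i,n_i} \quad \text{and} \quad \left\| \frac{\partial^{l_1+n_1} T_{(1)}}{\partial k_1^{l_1} \partial k_2^{n_1}} \right\| \le (r+p)! \left( \frac{7}{\varepsilon} \right)^{r+p+1} \frac{1}{|z|}. \]
Thus, recalling (112) we get
\[ \left\| \frac{\partial^{r+p}}{\partial k_1^r \partial k_2^p} \Delta_k^{-1} \pi_{G_1} T_{33}^m T_{34} W_{43}^{(j-m-1)} \right\| \le (j+2)^{r+p} \sup_{I} \left\| \left( \prod_{i=1}^{j+2} \frac{\partial^{l_i+n_i} T_{(i)}}{\partial k_1^{l_i} \partial k_2^{n_i}} \right) \pi_{G_3} \right\| \]
\[ \le (j+2)^{r+p} \sup_{I} (r+p)! \left( \frac{7}{\varepsilon} \right)^{r+p+1} \frac{1}{|z|} \frac{2(m+2)}{1 + \frac{1}{16} R^2} \left[ \prod_{i=2}^{m+2} \Omega_{l_i,n_i} \right] \left[ \prod_{i=m+3}^{j+2} \Theta_{l_i,n_i} \right] \]
\[ \le (j+2)^{r+p} (m+2) \frac{C}{|z| R^2} \times \sup_{I} (l_1+n_1)! \left( \frac{7}{\varepsilon} \right)^{l_1+n_1+1} \left[ \prod_{i=2}^{m+2} \Omega_{l_i,n_i} \right] \left[ \prod_{i=m+3}^{j+2} \Theta_{l_i,n_i} \right], \]
where $C$ is a universal constant. Now, recall the definition of $\Theta_{r,p}$ and $\Omega_{r,p}$ in (122) and (123), observe that $\|\hat A\|_{l^1} < \|(1+b^2) \hat A\|_{l^1}$, and let $\rho_2 = \rho_{2;\varepsilon,A,q,r,p}$ be a sufficiently large constant such that, for $|z| > \rho_2$ and for any $l_i \le r$ and any $n_i \le p$,
\[ \Theta_{l_i,n_i},\ \Omega_{l_i,n_i} \le 2 (l_i+n_i)! \left( \frac{7}{\varepsilon} \right)^{l_i+n_i+1} \|(1+b^2) \hat A(b)\|_{l^1}. \]
Then,
\[ \left\| \frac{\partial^{r+p}}{\partial k_1^r \partial k_2^p} \Delta_k^{-1} \pi_{G_1} T_{33}^m T_{34} W_{43}^{(j-m-1)} \right\| \le (j+2)^{r+p} (m+2) \frac{C}{|z| R^2} \times \sup_{I} (l_1+n_1)! \left( \frac{7}{\varepsilon} \right)^{l_1+n_1+1} \left[ \prod_{i=2}^{m+2} \Omega_{l_i,n_i} \right] \left[ \prod_{i=m+3}^{j+2} \Theta_{l_i,n_i} \right] \]
\[ \le (j+2)^{r+p} \frac{(m+2) C}{|z| R^2} \left( 2 \|(1+b^2) \hat A(b)\|_{l^1} \right)^{j+1} \times \sup_{I} \left( \frac{7}{\varepsilon} \right)^{j+2} \left( \frac{7}{\varepsilon} \right)^{\sum_{i=1}^{j+2} (l_i+n_i)} \prod_{i=1}^{j+2} (l_i+n_i)! \]
(since $\sum_{i=1}^{j+2} l_i = r$, $\sum_{i=1}^{j+2} n_i = p$ and $\prod_{i=1}^{j+2} (l_i+n_i)! < (r+p)!$)
\[ \le C (r+p)! \left( \frac{7}{\varepsilon} \right)^{r+p+1} (m+2) (j+2)^{r+p} \left( \frac{14}{\varepsilon} \|(1+b^2) \hat A(b)\|_{l^1} \right)^{j+1} \frac{1}{|z| R^2} \le \frac{C_{\varepsilon,r,p}}{|z| R^2} (m+2) (j+2)^{r+p} \left( \frac{4}{9} \right)^{j+1}, \]
since $\|(1+b^2) \hat A(b)\|_{l^1} < 2\varepsilon/63$. This establishes a bound for (112).

Step 4. We now apply the last inequality for deriving an estimate for the derivatives of $R_3$ and complete the proof of the lemma for $j = 3$. Recall from (76) that
\[ X_{33}^{(j)} = \sum_{m=0}^{j-1} T_{33}^m T_{34} W_{43}^{(j-m-1)}. \]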
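Then,
\[ \left\| \frac{\partial^{r+p}}{\partial k_1^r \partial k_2^p} \pi_{G_1} \Delta_k^{-1} X_{33}^{(j)} \right\| \le \sum_{m=0}^{j-1} \left\| \frac{\partial^{r+p}}{\partial k_1^r \partial k_2^p} \Delta_k^{-1} \pi_{G_1} T_{33}^m T_{34} W_{43}^{(j-m-1)} \right\| \le \sum_{m=0}^{j-1} \frac{C_{\varepsilon,r,p}}{|z| R^2} (m+2) (j+2)^{r+p} \left( \frac{4}{9} \right)^{j+1} \]
\[ \le \frac{C_{\varepsilon,r,p}}{|z| R^2} (j+2)^{r+p} \left( \frac{4}{9} \right)^{j+1} \sum_{m=0}^{j-1} (m+2) = \frac{C_{\varepsilon,r,p}}{|z| R^2} (j+2)^{r+p} \left( \frac{4}{9} \right)^{j+1} \frac{1}{2} (j^2 + 3j). \]
Thus, since $G_1 \subset G_3$,
\[ \left\| \pi_{G_1} \frac{\partial^{r+p}}{\partial k_1^r \partial k_2^p} \sum_{j=1}^{\infty} \Delta_k^{-1} X_{33}^{(j)} \pi_{G_1} \right\| \le \sum_{j=1}^{\infty} \left\| \frac{\partial^{r+p}}{\partial k_1^r \partial k_2^p} \pi_{G_1} \Delta_k^{-1} X_{33}^{(j)} \right\| \le \frac{C_{\varepsilon,r,p}}{|z| R^2} \sum_{j=1}^{\infty} (j+2)^{r+p} \left( \frac{4}{9} \right)^{j+1} \frac{1}{2} (j^2 + 3j) \le C\, C_{\varepsilon,r,p} \frac{1}{|z| R^2}, \]
where $C$ is a universal constant. Therefore,
\[ \left| \frac{\partial^{r+p}}{\partial k_1^r \partial k_2^p} R_3(k) \right| = \left| F_{\{d\} G_1} \frac{\partial^{r+p}}{\partial k_1^r \partial k_2^p} \sum_{j=1}^{\infty} \Delta_k^{-1} X_{33}^{(j)} \, G_{G_1 \{d\}} \right| \le \|F_{\{d\} G_1}\| \left\| \pi_{G_1} \frac{\partial^{r+p}}{\partial k_1^r \partial k_2^p} \sum_{j=1}^{\infty} \Delta_k^{-1} X_{33}^{(j)} \pi_{G_1} \right\| \|G_{G_1 \{d\}}\| \le C\, C_{\varepsilon,r,p} \|f\|_{l^1} \|g\|_{l^1} \frac{1}{|z| R^2}. \]
Finally, combining all the estimates we have
\[ \left| \frac{\partial^{n+m}}{\partial k_1^n \partial k_2^m} \alpha^{(3)}_{\mu,d}(k) \right| \le \sum_{j=1}^{4} \left| \frac{\partial^{n+m}}{\partial k_1^n \partial k_2^m} R_j(k) \right| \le 3 \frac{C}{|z| R^2} + \frac{C}{|z|_R^3} \le \frac{4C}{|z| R^2}, \]
where $C = C_{\varepsilon,\Lambda,A,q,f,g,m,n}$ is a constant. Set $\rho_{\varepsilon,A,q,m,n} := \max\{\rho_{1;\varepsilon,m,n}, \rho_{2;\varepsilon,A,q,m,n}\}$. The proof of the lemma for $j = 3$ is complete.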
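Two elementary facts enter the summation over $m$ and $j$ in Step 4: the identity $\sum_{m=0}^{j-1} (m+2) = \tfrac{1}{2}(j^2 + 3j)$ and the convergence of $\sum_{j \ge 1} (j+2)^{r+p} (4/9)^{j+1} \tfrac{1}{2}(j^2+3j)$. A short Python sketch, included only as an illustration with hypothetical function names and an arbitrary truncation length, checks both with exact rationals.

```python
from fractions import Fraction

def inner_sum(j):
    """sum_{m=0}^{j-1} (m + 2), the factor produced by summing over m."""
    return sum(m + 2 for m in range(j))

def tail_series(power, terms=400):
    """Partial sum of sum_{j>=1} (j+2)^power (4/9)^(j+1) (j^2+3j)/2."""
    q = Fraction(4, 9)
    return sum(
        Fraction(j + 2) ** power * q ** (j + 1) * Fraction(j * j + 3 * j, 2)
        for j in range(1, terms + 1)
    )
```

For `power = 0` the series has the closed-form value $224/125$, obtained from the standard sums $\sum j x^j$ and $\sum j^2 x^j$ at $x = 4/9$; for larger powers the partial sums stabilise quickly because of the geometric factor.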
References
[1] J. Feldman, H. Knörrer and E. Trubowitz, Riemann Surfaces of Infinite Genus, CRM Monograph Series (Amer. Math. Soc., 2003).
[2] D. Gieseker, H. Knörrer and E. Trubowitz, The Geometry of Algebraic Fermi Curves, Perspectives in Mathematics, Vol. 14 (Academic Press, Inc., 1993).
[3] H. Knörrer and E. Trubowitz, A directional compactification of the complex Bloch variety, Comment. Math. Hel. 65 (1990) 114–149.
[4] I. Krichever, Spectral theory of two-dimensional periodic operators and its applications, Russian Math. Surveys 44(2) (1989) 145–225.
[5] H. McKean, Integrable systems and algebraic curves, in Global Analysis (Proc. Biennial Sem. Canad. Math. Congr. Univ. Calgary, 1978), Lecture Notes in Math., Vol. 755 (Springer, 1979), pp. 83–200.
[6] J. Feldman, H. Knörrer and E. Trubowitz, Asymmetric Fermi surfaces for magnetic Schrödinger operators, Comm. Partial Differential Equations 26 (2000) 319–336.
[7] Y. Karpeshina, Spectral properties of the periodic magnetic Schrödinger operator in the high-energy region. Two-dimensional case, Comm. Math. Phys. 251 (2004) 473–514.
[8] L. Erdős, Recent developments in quantum mechanics with magnetic fields, in Spectral Theory and Mathematical Physics: A Festschrift in Honor of Barry Simon’s 60th Birthday, Proc. Sympos. Pure Math., Vol. 76, Part 1 (Amer. Math. Soc., 2007), pp. 401–428.
[9] M. Reed and B. Simon, Methods of Modern Mathematical Physics IV: Analysis of Operators (Academic Press, 1978).
[10] P. Kuchment, Floquet Theory for Partial Differential Equations (Birkhäuser, 1993).
[11] W. Magnus and S. Winkler, Hill’s Equation (Dover, 2004).
[12] S. Gustafson and I. Sigal, Mathematical Concepts of Quantum Mechanics (Springer, 2006).
[13] G. de Oliveira, Asymptotics for Fermi curves of electric and magnetic periodic fields, Ph.D. thesis, The University of British Columbia (2009); http://hdl.handle.net/2429/11114.
September 14, 2010 13:30 WSPC/S0129-055X 148-RMP
J070-S0129055X10004119
Reviews in Mathematical Physics Vol. 22, No. 8 (2010) 963–993 © World Scientific Publishing Company DOI: 10.1142/S0129055X10004119
THE 3D SPIN GEOMETRY OF THE QUANTUM TWO-SPHERE
SIMON BRAIN∗,‡ and GIOVANNI LANDI∗,†,§
∗Dipartimento di Matematica e Informatica, Università di Trieste, Via A. Valerio 12/1, 34127 Trieste, Italy
†INFN, Sezione di Trieste, Trieste, Italy
‡[email protected]
§[email protected]
Received 23 March 2010
We study a three-dimensional differential calculus $\Omega^1 S_q^2$ on the standard Podleś quantum two-sphere $S_q^2$, coming from the Woronowicz $4D_+$ differential calculus on the quantum group $SU_q(2)$. We use a frame bundle approach to give an explicit description of $\Omega^1 S_q^2$ and its associated spin geometry in terms of a natural spectral triple over $S_q^2$. We equip this spectral triple with a real structure for which the commutant property and the first order condition are satisfied up to infinitesimals of arbitrary order.
Keywords: Noncommutative geometry; spectral triples; quantum groups; quantum spheres.
Mathematics Subject Classification 2010: 58B34, 17B37
1. Introduction

The standard quantum two-sphere $S_q^2$ has proven to be one of the most important and useful examples in trying to understand the relationship between the geometric/analytic world of noncommutative geometry and the algebraic setting of quantum group theory. At the algebraic level, it is known that $S_q^2$ has a unique left-covariant two-dimensional differential calculus [17, 18]. On the other hand, it is known that this same calculus is recovered via analytic techniques by means of a noncommutative spin geometry [4, 20]. This compatibility has led to the discovery of other noncommutative two-dimensional geometries on $S_q^2$ with a range of interesting properties [7]. In this paper, we extend the investigation to the noncommutative spin geometry of a differential calculus on $S_q^2$ whose dimension is equal to three.

Quantum two-spheres were constructed and classified by Podleś in [16]. The standard sphere $S_q^2$ is unique amongst the Podleś family in that it also appears as the base space of the noncommutative Hopf fibration $SU_q(2) \to S_q^2$ constructed in [1] as a basic example of a quantum principal bundle. By equipping the total space $SU_q(2)$ with the 3D differential calculus of [22], one finds that the two-dimensional differential calculus on $S_q^2$ appears as an associated vector bundle. This “quantum frame bundle” approach to noncommutative geometry, developed in [13, 14], has been applied successfully to study a host of examples, not least the two-dimensional geometry of the quantum sphere $S_q^2$ itself.

The present paper also uses the frame bundle approach to study the geometry of $S_q^2$, but this time starting with the $4D_+$ differential calculus on $SU_q(2)$ of [22]. This calculus has the advantage of being bicovariant under both left and right translations, in contrast with the 3D calculus, which is only left-covariant. Using the framing theory we recover the three-dimensional differential calculus $\Omega^1 S_q^2$ of [9, 10, 17] on $S_q^2$. The methods we use are well-adapted to the principal bundle structure and as a consequence we immediately find an explicit description of the bimodule relations in $\Omega^1 S_q^2$, including a decomposition into irreducible components. We do not discuss the deeper aspects of the Riemannian geometry such as Hodge structure and connection theory: these will be developed elsewhere [12].

Our main results concern the spin geometry of the three-dimensional calculus $\Omega^1 S_q^2$. Remarkably, we find that the spinor bundle of $S_q^2$ is unchanged from the one used in [4, 14, 20] for the two-dimensional calculus. We construct a Dirac operator $D$ which implements the exterior derivative in $\Omega^1 S_q^2$, finding that the eigenvalues of $|D|$ grow not faster than $q^{-2j}$ for large $j$ and hence that the associated spectral triple has metric dimension zero. Moreover, we equip this spectral triple with a $\mathbb{Z}_2$-grading operator and a real structure which is defined “up to compact operators”, in the sense that the “commutant property” and the “first order condition” for a real spectral triple [3] are satisfied up to infinitesimals of arbitrary order. As we shall see, this is in contrast with [4], where a “true” real structure for the “two-dimensional” calculus on $S_q^2$ was given (cf. also [20]), but is parallel to the results of [7] for the sphere $S_q^2$. We also find that the “KO-theoretic” dimension of this real spectral triple is equal to the classical value, just two.

The paper is organized as follows. In Sec. 2, we give a brief overview of the construction of quantum differential calculi on quantum groups and their homogeneous spaces, followed by the general quantum frame bundle construction itself. Following this, Sec. 3 recalls the elementary geometry of the Hopf fibration $SU_q(2) \to S_q^2$ and the Hopf algebra $U_q(su(2))$ which describes its symmetries. In Sec. 4, we describe the differential structure of the Hopf fibration. We start from the 4D quantum differential calculus on the total space $SU_q(2)$ from which we derive the calculus on the bundle fiber U(1). The structure of the calculus $\Omega^1 S_q^2$ is then obtained as a “framed quantum manifold” in the sense of [14]. Finally, in Sec. 5 we construct our spectral triple $(A[S_q^2], H, D)$ over $S_q^2$, which in addition we equip with a $\mathbb{Z}_2$-grading $\Gamma$ of the spinor bundle $H$ and a real structure $J: H \to H$.

Notation. In this paper, we make frequent use of the “q-numbers” defined by
\[ [x] := \frac{q^x - q^{-x}}{q - q^{-1}} \tag{1.1} \]
for each x ∈ R and q = 1. Furthermore, for the sake of brevity we introduce the constants µ := q + q −1 ,
ν := q − q −1
(1.2)
to be used throughout the paper. Our convention is that $\mathbb{N} = \{0, 1, 2, \ldots\}$.

2. Preliminaries on Quantum Principal Bundles

We start with some generalities on differential calculi and quantum principal bundles. These will be endowed both with universal and non-universal compatible calculi.

2.1. Differential structures

Let P be a complex ∗-algebra with unit. A first order differential calculus over P is a pair $(\Omega^1 P, d)$ where $\Omega^1 P$ is a P-P-bimodule (the one-forms) and $d: P \to \Omega^1 P$ is a linear map obeying the Leibniz rule
\[ d(ab) = a(db) + (da)b, \qquad a, b \in P, \]
and such that the map $P \otimes P \to \Omega^1 P$ defined by $a \otimes b \mapsto a\,db$ is surjective.

The universal differential calculus over P is the pair $(\tilde\Omega^1 P, \tilde d)$, where $\tilde\Omega^1 P := \ker m$ is the kernel of the product map $m: P \otimes P \to P$ on P, with obvious bimodule structure
\[ p \cdot (a \otimes b) = pa \otimes b, \qquad (a \otimes b) \cdot p = a \otimes bp, \qquad a, b, p \in P, \]
and $\tilde d$ is defined by $\tilde d p := 1 \otimes p - p \otimes 1$, for each $p \in P$. It is so-called because any other differential calculus $(\Omega^1 P, d)$ over P arises as a quotient $\Omega^1 P = \tilde\Omega^1 P / N_P$, where $N_P$ is some P-P-sub-bimodule of $\tilde\Omega^1 P$. With the projection $\pi_P: \tilde\Omega^1 P \to \Omega^1 P$ one has $d = \pi_P \circ \tilde d$.
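The universal calculus can be tested numerically in a toy case. The following is a minimal sketch, assuming P is the algebra of 3×3 real matrices and realising the tensor product by `np.kron`; the helper names `d_univ`, `left_mul`, `right_mul` and `product_map` are hypothetical and introduced only for this illustration. It checks that the universal differential obeys the Leibniz rule and takes values in the kernel of the product map.

```python
import numpy as np

def d_univ(p):
    """Universal differential d~(p) = 1 (x) p - p (x) 1, with (x) realised by np.kron."""
    one = np.eye(p.shape[0])
    return np.kron(one, p) - np.kron(p, one)

def left_mul(a, w):
    """Left module action a . (x (x) y) = ax (x) y, extended linearly."""
    return np.kron(a, np.eye(a.shape[0])) @ w

def right_mul(w, b):
    """Right module action (x (x) y) . b = x (x) yb, extended linearly."""
    return w @ np.kron(np.eye(b.shape[0]), b)

def product_map(w, n):
    """Product map m(x (x) y) = xy, extended linearly to n^2 x n^2 matrices."""
    w4 = w.reshape(n, n, n, n)        # w4[i, k, j, l] = x[i, j] * y[k, l]
    return np.einsum("ijjl->il", w4)  # sum over j of x[i, j] * y[j, l]

# Toy example (assumption: P = 3x3 real matrices with random test elements).
rng = np.random.default_rng(0)
a = rng.standard_normal((3, 3))
b = rng.standard_normal((3, 3))
leibniz_defect = d_univ(a @ b) - (left_mul(a, d_univ(b)) + right_mul(d_univ(a), b))
print("Leibniz defect:", np.max(np.abs(leibniz_defect)))
print("m(d~a) norm:", np.max(np.abs(product_map(d_univ(a), 3))))
```

Both printed numbers should vanish up to rounding. The check is purely algebraic and does not depend on the choice of matrices.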
If H is a Hopf algebra, we write $m_H: H \otimes H \to H$ and $1_H$ for its product and unit, $\Delta_H: H \to H \otimes H$ and $\epsilon_H: H \to \mathbb{C}$ for its coproduct and counit and $S_H: H \to H$ for its antipode (when there is no possibility of confusion, we omit the subscript H). We use Sweedler notation $\Delta(h) = h_{(1)} \otimes h_{(2)}$ for the coproduct. A differential calculus $\Omega^1 H$ over a Hopf algebra H is said to be left-covariant if the coproduct Δ, viewed as a left coaction of H on itself, extends to a left coaction $\Delta_L: \Omega^1 H \to H \otimes \Omega^1 H$ such that d is an intertwiner and $\Delta_L$ is a bimodule map:
\[ \Delta_L(dh) = (\mathrm{id} \otimes d)\Delta_L(h), \qquad \Delta_L(h\omega) = \Delta(h) \cdot \Delta_L(\omega), \qquad \Delta_L(\omega h) = \Delta_L(\omega) \cdot \Delta(h) \]
for all $h \in H$, $\omega \in \Omega^1 H$. A similar definition holds for a right-covariant calculus, now with a right coaction $\Delta_R: \Omega^1 H \to \Omega^1 H \otimes H$. A calculus is said to be bicovariant if it is both left and right covariant with commuting coactions. The universal
S. Brain & G. Landi
calculus $\tilde\Omega^1 H$ is bicovariant when equipped with the left and right tensor product coactions on $H \otimes H$.

Left-covariant differential calculi on a Hopf algebra H are classified as follows after [22]. First, it may be shown that the linear map
\[ r: H \otimes H \to H \otimes H, \qquad r(a \otimes b) := ab_{(1)} \otimes b_{(2)}, \tag{2.1} \]
is an isomorphism with inverse
\[ r^{-1}: H \otimes H \to H \otimes H, \qquad r^{-1}(a \otimes b) = aS(b_{(1)}) \otimes b_{(2)}. \tag{2.2} \]
Upon restricting r to the universal calculus $\tilde\Omega^1 H$ we obtain an isomorphism $r: \tilde\Omega^1 H \to H \otimes H^+$, where $H^+ := \ker \epsilon_H$ denotes the augmentation ideal of H. This is in fact an isomorphism of H-H-bimodules if we equip $H \otimes H^+$ with the bimodule structure
\[ a \cdot (b \otimes \omega) = ab \otimes \omega, \qquad (a \otimes \omega) \cdot b = ab_{(1)} \otimes \omega b_{(2)}, \qquad a, b \in H,\ \omega \in H^+ \tag{2.3} \]
and an isomorphism of H-H-bicomodules if we equip $H \otimes H^+$ with the bicomodule structure
\[ \Delta_L(a \otimes \omega) = a_{(1)} \otimes (a_{(2)} \otimes \omega), \qquad \Delta_R(a \otimes \omega) = (a_{(1)} \otimes \omega_{(1)}) \otimes a_{(2)}\omega_{(2)}, \qquad a \in H,\ \omega \in H^+. \]
Any left-covariant sub-bimodule $N_H$ of $\tilde\Omega^1 H$ is carried to a right ideal $I_H$ of $H^+$ by the map r in (2.1). Conversely, any right ideal $I_H$ arises in this way from a left-covariant sub-bimodule of $\tilde\Omega^1 H$. It follows that the left-covariant differential calculi on H are in one-to-one correspondence with right ideals $I_H \subset H^+$; indeed, given such an $I_H$, one has $\Omega^1 H \cong H \otimes \Lambda^1$, where $\Lambda^1 \cong H^+/I_H$ are the left-invariant one-forms. We also write $\Omega^1_{\mathrm{inv}} H := r^{-1}(\Lambda^1)$.

A left-covariant sub-bimodule $N_H$ is also right-covariant if and only if the corresponding ideal $I_H$ is stable under the right adjoint coaction
\[ \mathrm{Ad}_R: H \to H \otimes H, \qquad \mathrm{Ad}_R(a) = a_{(2)} \otimes S(a_{(1)})a_{(3)}, \]
in the sense that $\mathrm{Ad}_R(I_H) \subset I_H \otimes H$. It follows that bicovariant calculi on H are in one-to-one correspondence with right ideals $I_H$ of $H^+$ which are $\mathrm{Ad}_R$-stable [22].

Given a left-covariant differential calculus $\Omega^1 H$ over H, the quantum tangent space of $\Omega^1 H$ is the vector space
\[ T_H := \{X \in H' \mid X(1) = 0 \text{ and } X(a) = 0 \text{ for all } a \in I_H\}, \tag{2.4} \]
where the vector space $H'$ is the linear dual of H. This tangent space admits many properties analogous to the classical case, in particular there exists a unique bilinear
form $\langle \cdot \mid \cdot \rangle: T_H \times \Omega^1 H \to \mathbb{C}$ such that
\[ \langle X \mid a\,db \rangle = \epsilon_H(a)X(b), \qquad a, b \in H,\ X \in T_H. \tag{2.5} \]
With respect to this bilinear form, the vector spaces $\Omega^1_{\mathrm{inv}} H$ and $T_H$ are non-degenerately paired, so that $\dim \Omega^1_{\mathrm{inv}} H = \dim T_H = \dim \Lambda^1$. This number is said to be the dimension of the left-covariant differential calculus $\Omega^1 H$.

2.2. Quantum principal bundles

The general set-up for a principal fibration of noncommutative spaces is an algebra P (playing the role of the algebra of functions on the total space) which is a right comodule algebra for a Hopf algebra H with coaction $\delta_R: P \to P \otimes H$. The algebra of functions on the base space of the fibration is the subalgebra M of P consisting of coinvariant elements under $\delta_R$,
\[ M := P^H = \{p \in P : \delta_R(p) = p \otimes 1\}. \]
For a well-defined bundle structure at the level of universal differential calculi, one requires exactness of the following sequence [1],
\[ 0 \to P(\tilde\Omega^1 M)P \xrightarrow{\ j\ } \tilde\Omega^1 P \xrightarrow{\ \mathrm{ver}\ } P \otimes H^+ \to 0, \tag{2.6} \]
with $H^+$ the augmentation ideal, as before. The algebra inclusion $M \hookrightarrow P$ extends to an inclusion $\tilde\Omega^1 M \to \tilde\Omega^1 P$ of universal differential calculi, hence $P(\tilde\Omega^1 M)P$ are the analogues of the horizontal one-forms (classically this corresponds to the space of one-forms which have been pulled back from the base of the fibration). The map ver is defined by $\mathrm{ver}(p \otimes p') = p\,\delta_R(p')$; it is the generator of the vertical one-forms. We say that the inclusion $M \hookrightarrow P$ is a quantum principal bundle with universal calculi and structure quantum group H.

Requiring exactness of the sequence (2.6) is equivalent to requiring that the induced canonical map
\[ \chi: P \otimes_M P \to P \otimes H, \qquad p \otimes_M p' \mapsto p\,\delta_R(p') \tag{2.7} \]
be bijective. If this is the case, one also says that the triple (P, H, M) is an H-Hopf–Galois extension. This bijection condition is enough for a principal bundle structure at the level of universal differential calculi.

For a principal bundle with non-universal calculi extra conditions are required that we briefly recall. Assume then that P and M are equipped with differential calculi $\Omega^1 P = \tilde\Omega^1 P/N_P$ and $\Omega^1 M = \tilde\Omega^1 M/N_M$, where $N_P$ and $N_M$ are sub-bimodules
of $\tilde\Omega^1 P$ and $\tilde\Omega^1 M$, respectively. Assume further that H is equipped with a left-covariant calculus $\Omega^1 H$ corresponding to a right ideal $I_H$. Compatibility of the differential structures means that the calculi satisfy the conditions
\[ N_M = N_P \cap \tilde\Omega^1 M \qquad \text{and} \qquad \delta_R(N_P) \subset N_P \otimes H. \tag{2.8} \]
The role of the first condition is to ensure that $\Omega^1 M$ is spanned by elements of the form $m\,dn$ with $m, n \in M$ and is hence obtained by restricting the calculus on P. The second condition in (2.8) is sufficient to ensure covariance of $\Omega^1 P$. Finally, we need the sequence
\[ 0 \to P(\Omega^1 M)P \to \Omega^1 P \xrightarrow{\ \mathrm{ver}\ } P \otimes \Lambda^1 \to 0 \tag{2.9} \]
to be exact. This sequence is the analogue of the sequence (2.6) but now at the level of non-universal calculi. The P-P-bimodule $P(\Omega^1 M)P$ once again makes up the horizontal one-forms and $\mathrm{ver}(p \otimes p') = p\,\delta_R(p')$ is the canonical map which generates the vertical one-forms. The condition
\[ \mathrm{ver}(N_P) = P \otimes I_H \tag{2.10} \]
ensures that the map
\[ \mathrm{ver}: \Omega^1 P \to P \otimes \Lambda^1, \qquad \Lambda^1 \cong H^+/I_H \]
is well-defined and yields that the sequence (2.9) is indeed exact.

2.3. Framed quantum manifolds

Suppose that the total space P of the bundle is itself a Hopf algebra equipped with a Hopf algebra surjection $\pi: P \to H$. Here we have a coaction of H on P by coproduct and projection to H,
\[ \delta_R: P \to P \otimes H, \qquad \delta_R = (\mathrm{id} \otimes \pi)\Delta. \]
The base is then the quantum homogeneous space $M = P^H$ of coinvariants and the algebra inclusion $M \hookrightarrow P$ is automatically an H-Hopf–Galois extension, i.e. a quantum principal bundle with universal calculi. To impose non-universal differential structure, we suppose that $\Omega^1 P$ is left-covariant for P and $\Omega^1 H$ is left-covariant for H, so that they are defined by right ideals $I_P$ and $I_H$ of $P^+$ and $H^+$, respectively. We ensure the first of (2.8) by taking it as a definition of $\Omega^1 M$; in the case at hand, the remaining compatibility conditions in (2.8)–(2.10) reduce to
\[ (\mathrm{id} \otimes \pi)\mathrm{Ad}_R(I_P) \subset I_P \otimes H, \qquad \pi(I_P) = I_H. \tag{2.11} \]
Thus a choice of left-covariant calculus on P satisfying these conditions automatically gives a principal bundle with non-universal calculi [14].

We say that an algebra M is a framed quantum manifold if it is the base of a quantum principal bundle, $M = P^H$, to which $\Omega^1 M$ is an associated vector bundle.
To give M as a framed quantum manifold we therefore require not only a quantum principal bundle $\delta_R: P \to P \otimes H$ as above but also a right H-comodule V, so that $E := (P \otimes V)^H$ plays the role of the sections of the corresponding associated vector bundle (the space $P \otimes V$ is equipped with the tensor product coaction). Moreover, we require a “soldering form” $\theta: V \to P\,\Omega^1 M$ such that the map
\[ s_\theta: E \to \Omega^1 M, \qquad p \otimes v \mapsto p\,\theta(v) \]
is an isomorphism.

For a general M, it is usually not obvious how to go about looking for a framing. However, in the case of a quantum homogeneous space with compatible calculi one has a “standard” framing in the following way [14]. If the conditions in (2.11) are satisfied then the algebra $M = P^H$ is automatically framed by the bundle (P, H, M). The H-comodule V and soldering form θ are given explicitly by the formulæ
\[ V = (P^+ \cap M)/(I_P \cap M), \qquad \Delta_R v = \tilde v_{(2)} \otimes S\pi(\tilde v_{(1)}), \qquad \theta(v) = S\tilde v_{(1)}\,d\tilde v_{(2)}, \tag{2.12} \]
with $\tilde v$ any representative of v in $P^+ \cap M$, where $\Delta(\tilde v) = \tilde v_{(1)} \otimes \tilde v_{(2)}$ is the coproduct on P.

3. The Standard Podleś Sphere

We recall here some of the basic geometry of the so-called standard Podleś quantum two-sphere $S^2_q$ of [16]. We begin with the quantum group $A[SU_q(2)]$ and its symmetries $U_q(su(2))$, from which we obtain the quantum sphere $S^2_q$ as the base space of the quantum Hopf fibration $SU_q(2) \to S^2_q$. Finally we sketch the construction of a family of quantum line bundles over $S^2_q$ which shall prove useful in what is to follow.

3.1. The quantum group $SU_q(2)$

Recall that the coordinate algebra $A[M_q(2)]$ of functions on the quantum matrices $M_q(2)$ is the associative unital algebra generated by the entries of the matrix
\[ x = (x^i{}_j) = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \]
obeying the relations
\[ ab = qba, \quad ac = qca, \quad bd = qdb, \quad cd = qdc, \quad bc = cb, \quad ad - da = (q - q^{-1})bc, \tag{3.1} \]
with $0 \neq q \in \mathbb{C}$ a deformation parameter. The algebra $A[M_q(2)]$ has a coalgebra structure given by $\Delta(x^i{}_j) = x^i{}_\mu \otimes x^\mu{}_j$ and $\epsilon(x^i{}_j) = \delta^i{}_j$. From $A[M_q(2)]$ we obtain a
Hopf algebra $A[SL_q(2)]$ upon quotienting by the determinant relation $ad = 1 + qbc$ (equivalently $da = 1 + q^{-1}bc$) and defining an antipode by
\[ S\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} d & -q^{-1}b \\ -qc & a \end{pmatrix}. \]
When the deformation parameter q is taken to be real, $A[M_q(2)]$ is made into a ∗-algebra by defining the anti-linear involution
\[ x^* = \begin{pmatrix} a^* & b^* \\ c^* & d^* \end{pmatrix} := \begin{pmatrix} d & -qc \\ -q^{-1}b & a \end{pmatrix}. \tag{3.2} \]
It is not difficult to see that $A[SL_q(2)]$ inherits this ∗-structure. Without loss of generality, we take $0 < q < 1$. The compact quantum group $A[SU_q(2)]$ is defined to be the quotient of $A[SL_q(2)]$ by the additional relations $S(x^k{}_l) = (x^l{}_k)^*$. Thus in $A[SU_q(2)]$ we have
\[ x = \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a & -qc^* \\ c & a^* \end{pmatrix}. \tag{3.3} \]
The algebra relations become
\[ ac = qca, \quad ac^* = qc^*a, \quad cc^* = c^*c, \quad aa^* + q^2cc^* = 1, \quad a^*a + c^*c = 1, \tag{3.4} \]
together with their conjugates. On generators, the counit is $\epsilon(a) = \epsilon(a^*) = 1$, $\epsilon(c) = \epsilon(c^*) = 0$ and the antipode is now $S(a) = a^*$, $S(a^*) = a$, $S(c) = -qc$, $S(c^*) = -q^{-1}c^*$, while the coproduct now reads $\Delta(a) = a \otimes a - qc^* \otimes c$, $\Delta(c) = c \otimes a + a^* \otimes c$ and $\Delta(a^*) = a^* \otimes a^* - qc \otimes c^*$, $\Delta(c^*) = c^* \otimes a^* + a \otimes c^*$.

3.2. The quantum universal enveloping algebra $U_q(su(2))$

The quantum universal enveloping algebra $U_q(su(2))$ is the unital ∗-algebra generated by the four elements $K, K^{-1}, E, F$, with $KK^{-1} = K^{-1}K = 1$, subject to the relations
\[ K^{\pm 1}E = q^{\pm 1}EK^{\pm 1}, \qquad K^{\pm 1}F = q^{\mp 1}FK^{\pm 1}, \qquad [E, F] = (q - q^{-1})^{-1}(K^2 - K^{-2}) \tag{3.5} \]
and the ∗-structure
\[ K^* = K, \qquad E^* = F, \qquad F^* = E. \]
It becomes a Hopf ∗-algebra when equipped with the coproduct Δ and counit ε defined on generators by
\[ \Delta(K^{\pm 1}) = K^{\pm 1} \otimes K^{\pm 1}, \qquad \Delta(E) = E \otimes K + K^{-1} \otimes E, \qquad \Delta(F) = F \otimes K + K^{-1} \otimes F, \]
\[ \epsilon(K) = 1, \qquad \epsilon(E) = 0, \qquad \epsilon(F) = 0, \]
and with antipode S defined by $S(K) = K^{-1}$, $S(E) = -qE$, $S(F) = -q^{-1}F$ on generators. The maps Δ, ε are extended as ∗-algebra maps, whereas S extends as a ∗-anti-algebra map. From the relations (3.5), one finds that the quadratic Casimir element
\[ C_q := FE + (q - q^{-1})^{-2}(qK^2 - 2 + q^{-1}K^{-2}) - \tfrac{1}{4} \tag{3.6} \]
generates the center of the algebra $U_q(su(2))$. The finite-dimensional irreducible ∗-representations $\pi_j$ of $U_q(su(2))$ are indexed by a half-integer $j = 0, 1/2, 1, 3/2, \ldots$ called the spin of the representation. Explicitly, these representations are given by
\[ \begin{aligned} \pi_j(K)|j, m\rangle &= q^m|j, m\rangle, \\ \pi_j(E)|j, m\rangle &= ([j - m][j + m + 1])^{1/2}|j, m + 1\rangle, \\ \pi_j(F)|j, m\rangle &= ([j - m + 1][j + m])^{1/2}|j, m - 1\rangle, \end{aligned} \tag{3.7} \]
where the vectors $|j, m\rangle$ for $m = -j, -j + 1, \ldots, j - 1, j$ form an orthonormal basis of the $(2j + 1)$-dimensional irreducible $U_q(su(2))$-module $V^j$. Moreover, $\pi_j$ is a ∗-representation with respect to the Hermitian inner product on $V^j$ for which the vectors $|j, m\rangle$ are orthonormal. In each representation, the Casimir $C_q$ of (3.6) acts as a multiple of the identity, with constant given by
\[ \pi_j(C_q) = \left[j + \tfrac{1}{2}\right]^2 - \tfrac{1}{4} \tag{3.8} \]
as one may easily verify by direct computation.

The Hopf ∗-algebras $A[SU_q(2)]$ and $U_q(su(2))$ are dually paired via a bilinear pairing
\[ (\,\cdot\,, \cdot\,): U_q(su(2)) \times A[SU_q(2)] \to \mathbb{C} \tag{3.9} \]
which is non-degenerate. It is defined on generators by
\[ (K, a) = q^{-1/2}, \quad (K^{-1}, a) = q^{1/2}, \quad (K, d) = q^{1/2}, \quad (K^{-1}, d) = q^{-1/2}, \quad (E, c) = 1, \quad (F, b) = 1, \]
with all other combinations of generators pairing to give zero. The pairing is extended to products of generators via the requirements
\[ \begin{aligned} (\Delta(X), p_1 \otimes p_2) &= (X, p_1p_2), & (X_1X_2, p) &= (X_1 \otimes X_2, \Delta(p)), \\ (X, 1) &= \epsilon(X), & (1, p) &= \epsilon(p), \end{aligned} \tag{3.10} \]
for all $X, X_1, X_2 \in U_q(su(2))$ and all $p, p_1, p_2 \in A[SU_q(2)]$. It is compatible with the antipode and the ∗-structures in the sense that, for all $X \in U_q(su(2))$,
$p \in A[SU_q(2)]$,
\[ (S(X), p) = (X, S(p)), \qquad (X^*, p) = \overline{(X, (S(p))^*)}, \qquad (X, p^*) = \overline{((S(X))^*, p)}. \tag{3.11} \]
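The statements about the spin-j representations (3.7) and the Casimir value (3.8) lend themselves to a quick numerical test. The following is a minimal sketch, with the hypothetical helper names `qnum`, `spin_rep` and `casimir`; it assumes that E raises the label m and F lowers it, which is the assignment for which the relations (3.5) hold when K acts as $q^m$.

```python
import numpy as np

def qnum(x, q):
    """q-number [x] = (q^x - q^-x) / (q - q^-1), as in (1.1)."""
    return (q**x - q**(-x)) / (q - 1.0 / q)

def spin_rep(j, q):
    """Matrices of K, E, F on the basis |j,m>, ordered m = -j, ..., j.

    Assumption of this sketch: E raises m by one and F lowers it by one.
    """
    dim = int(round(2 * j)) + 1
    ms = -j + np.arange(dim)
    K = np.diag(q**ms)
    E = np.zeros((dim, dim))
    for i in range(dim - 1):
        m = ms[i]
        E[i + 1, i] = np.sqrt(qnum(j - m, q) * qnum(j + m + 1, q))
    F = E.T.copy()  # F = E^* for these real matrices
    return K, E, F

def casimir(j, q):
    """Matrix of the Casimir C_q of (3.6) in the spin-j representation."""
    K, E, F = spin_rep(j, q)
    Kinv = np.linalg.inv(K)
    nu = q - 1.0 / q
    eye = np.eye(K.shape[0])
    return F @ E + (q * K @ K - 2 * eye + Kinv @ Kinv / q) / nu**2 - 0.25 * eye

q = 0.7
for j in (0.5, 1.0, 1.5):
    expected = qnum(j + 0.5, q) ** 2 - 0.25
    print(j, np.allclose(casimir(j, q), expected * np.eye(int(2 * j) + 1)))
```

The value of q is arbitrary within 0 < q < 1; the check is only a finite-dimensional consistency test, not a proof of centrality.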
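Using the pairing, there is a canonical left action of $U_q(su(2))$ on $A[SU_q(2)]$ defined by
\[ \triangleright: U_q(su(2)) \times A[SU_q(2)] \to A[SU_q(2)], \qquad X \triangleright p := p_{(1)}(X, p_{(2)}) \tag{3.12} \]
where $X \in U_q(su(2))$, $p \in A[SU_q(2)]$ and $\Delta(p) = p_{(1)} \otimes p_{(2)}$ denotes the coproduct on $A[SU_q(2)]$. In particular, this action works out on generators to be
\[ \begin{aligned} K^{\pm 1} \triangleright a &= q^{\mp 1/2}a, & E \triangleright a &= b, & F \triangleright a &= 0, \\ K^{\pm 1} \triangleright b &= q^{\pm 1/2}b, & E \triangleright b &= 0, & F \triangleright b &= a, \\ K^{\pm 1} \triangleright c &= q^{\mp 1/2}c, & E \triangleright c &= d, & F \triangleright c &= 0, \\ K^{\pm 1} \triangleright d &= q^{\pm 1/2}d, & E \triangleright d &= 0, & F \triangleright d &= c. \end{aligned} \tag{3.13} \]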
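The action on generators follows mechanically from the definition (3.12), the coproduct $\Delta(x^i{}_j) = x^i{}_\mu \otimes x^\mu{}_j$ and the pairing on generators listed after (3.9). The following is a minimal sketch of that bookkeeping; the dictionaries `COPRODUCT` and `PAIRING` and the function `act` are hypothetical names for this illustration, the value q = 0.5 is an arbitrary choice, and only the action on single generators is covered.

```python
import math

q = 0.5  # arbitrary test value with 0 < q < 1

# Coproduct of each generator as a list of (first leg, second leg) pairs.
COPRODUCT = {
    "a": [("a", "a"), ("b", "c")],
    "b": [("a", "b"), ("b", "d")],
    "c": [("c", "a"), ("d", "c")],
    "d": [("c", "b"), ("d", "d")],
}

# Nonzero pairings on generators, as listed after (3.9).
PAIRING = {
    ("K", "a"): q**-0.5, ("K", "d"): q**0.5,
    ("Kinv", "a"): q**0.5, ("Kinv", "d"): q**-0.5,
    ("E", "c"): 1.0, ("F", "b"): 1.0,
}

def act(X, gen):
    """Left action X |> gen = gen_(1) (X, gen_(2)) as a dict {generator: coefficient}."""
    out = {}
    for first, second in COPRODUCT[gen]:
        coeff = PAIRING.get((X, second), 0.0)
        if coeff != 0.0:
            out[first] = out.get(first, 0.0) + coeff
    return out

for X in ("K", "Kinv", "E", "F"):
    for gen in "abcd":
        print(f"{X} |> {gen} =", act(X, gen))
```

An empty dictionary stands for zero. Products of generators would additionally need the module algebra property stated next in the text, which this sketch does not implement.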
This action makes $A[SU_q(2)]$ into a left $U_q(su(2))$-module ∗-algebra, in the sense that
\[ X \triangleright (p_1p_2) = (X_{(1)} \triangleright p_1)(X_{(2)} \triangleright p_2), \qquad X \triangleright 1 = \epsilon(X)1, \qquad X \triangleright p^* = ((S(X))^* \triangleright p)^* \]
for all $p, p_1, p_2 \in A[SU_q(2)]$, $X \in U_q(su(2))$. There is also a canonical right action of $U_q(su(2))$ on $A[SU_q(2)]$, defined by
\[ \triangleleft: A[SU_q(2)] \times U_q(su(2)) \to A[SU_q(2)], \qquad p \triangleleft X := (X, p_{(1)})p_{(2)} \tag{3.14} \]
for $X \in U_q(su(2))$ and $p \in A[SU_q(2)]$, with properties similar to those for the left action. These two canonical actions commute with one another.

3.3. Line bundles on the quantum sphere $S^2_q$

The coordinate algebra $H := A[U(1)]$ of the group U(1) is the commutative unital ∗-algebra generated by $t, t^*$, subject to the relations $tt^* = t^*t = 1$. It is a Hopf algebra when equipped with the coproduct, counit and antipode
\[ \Delta(t) = t \otimes t, \qquad \epsilon(t) = 1, \qquad S(t) = t^*, \]
extended as ∗-algebra maps. There is a canonical Hopf algebra projection given on generators by
\[ \pi: A[SU_q(2)] \to A[U(1)], \qquad \pi\begin{pmatrix} a & b \\ c & d \end{pmatrix} := \begin{pmatrix} t & 0 \\ 0 & t^* \end{pmatrix}. \tag{3.15} \]
Using this projection a right coaction of $H = A[U(1)]$ on $P := A[SU_q(2)]$ is defined by
\[ \delta_R: A[SU_q(2)] \to A[SU_q(2)] \otimes A[U(1)], \qquad \delta_R(x^i{}_j) := x^i{}_\mu \otimes \pi(x^\mu{}_j). \tag{3.16} \]
In fact, this coaction is the same thing as a $\mathbb{Z}$-grading on $A[SU_q(2)]$ for which the generators have degrees
\[ \deg(a) = \deg(c) = 1, \qquad \deg(b) = \deg(d) = -1. \tag{3.17} \]
The subalgebra of coinvariants under this coaction is denoted $A[S^2_q]$,
\[ A[S^2_q] := \{m \in A[SU_q(2)] \mid \delta_R(m) = m \otimes 1\}. \]
We shall frequently write $M := A[S^2_q]$. This algebra is precisely the subalgebra generated by elements of degree zero: it is the unital ∗-algebra generated by the elements
\[ b_+ := cd, \qquad b_- := ab, \qquad b_0 := bc \tag{3.18} \]
subject to the relations
\[ b_0b_\pm = q^{\pm 2}b_\pm b_0, \qquad q^{-2}b_-b_+ = q^2b_+b_- + (1 - q^2)b_0, \]
\[ b_+b_- = b_0(1 + q^{-1}b_0) \]
inherited from those of $A[SU_q(2)]$. In the classical limit $q \to 1$, the first line of relations becomes the statement that the algebra is commutative, whereas the second line becomes the sphere relation for the classical two-sphere $S^2$. The quantum sphere $S^2_q$ is precisely the standard Podleś sphere of [16].

The canonical algebra inclusion $M \hookrightarrow P$ is well known to be a Hopf–Galois extension [1] and hence a quantum principal bundle with universal differential calculi whose typical fiber is determined by $H := A[U(1)]$. The coaction (3.16) of H on $A[SU_q(2)]$ is also used to define a family of line bundles over the quantum sphere $S^2_q$, indexed by $n \in \mathbb{Z}$:
\[ L_n := \{x \in A[SU_q(2)] \mid \delta_R(x) = x \otimes t^{-n}\}. \]
One has the decomposition [15]
\[ A[SU_q(2)] = \bigoplus_{n \in \mathbb{Z}} L_n. \]
In particular $L_0 = A[S^2_q]$ and one finds that $L_n^* \cong L_{-n}$ and $L_n \otimes_{A[S^2_q]} L_m \cong L_{n+m}$ for each $n, m \in \mathbb{Z}$. Moreover,
\[ E \triangleright L_n \subset L_{n+2}, \qquad F \triangleright L_n \subset L_{n-2}, \qquad K^{\pm 1} \triangleright L_n \subset L_n \]
for all $n \in \mathbb{Z}$, as can be checked directly using (3.13) and (3.10).

It is known that each $L_n$ is a finitely generated projective (say) left $A[S^2_q]$-module of rank one [21]. In this way, we think of the module $L_n$ as the space of sections of a line bundle over $S^2_q$ with winding number $-n$.
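The grading (3.17) makes membership in the line bundles $L_n$ a matter of counting generators. The following is a minimal sketch of that bookkeeping for monomials written as strings in the letters a, b, c, d; the names `DEGREE`, `degree`, `line_bundle_index`, `E_ACTION` and `F_ACTION` are hypothetical, and the two action tables record only the nonzero entries of E and F on generators given in (3.13).

```python
# Degrees of the generators under the U(1) coaction, as in (3.17).
DEGREE = {"a": 1, "c": 1, "b": -1, "d": -1}

def degree(word):
    """Total degree of a monomial such as 'cd' or 'ab'."""
    return sum(DEGREE[g] for g in word)

def line_bundle_index(word):
    """Index n with word in L_n; a monomial of degree k satisfies delta_R(x) = x (x) t^k, so n = -k."""
    return -degree(word)

# Nonzero actions of E and F on generators: E |> a = b, E |> c = d, F |> b = a, F |> d = c.
E_ACTION = {"a": "b", "c": "d"}
F_ACTION = {"b": "a", "d": "c"}

for name, word in (("b_+", "cd"), ("b_-", "ab"), ("b_0", "bc")):
    print(name, "has degree", degree(word))
for src, dst in E_ACTION.items():
    print(f"E: L_{line_bundle_index(src)} -> L_{line_bundle_index(dst)}")
for src, dst in F_ACTION.items():
    print(f"F: L_{line_bundle_index(src)} -> L_{line_bundle_index(dst)}")
```

On generators this reproduces the shifts $E \triangleright L_n \subset L_{n+2}$ and $F \triangleright L_n \subset L_{n-2}$; it says nothing about linear combinations of monomials with different degrees, which are not homogeneous.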
4. Differential Structure of the Quantum Hopf Fibration

In this section we equip the quantum group $SU_q(2)$ with a four-dimensional bicovariant differential calculus, originally described in [22]. Using this, the base space $S^2_q$ of the Hopf fibration inherits a three-dimensional differential calculus which was originally described in [17], although we describe it here in terms which are more compatible with the principal bundle structure. Finally, we show that $S^2_q$ is a framed quantum manifold, in the sense that its cotangent bundle is a vector bundle associated to the Hopf fibration $SU_q(2) \to S^2_q$.

4.1. Differential structure on $SU_q(2)$

In the following we write $\epsilon_P$ for the counit of the Hopf algebra $P := A[SU_q(2)]$. In terms of the matrix elements in (3.3), we define $I_P$ to be the right ideal of $P^+ := \mathrm{Ker}\,\epsilon_P$ generated by the nine elements
\[ \begin{gathered} b^2, \quad c^2, \quad b(a - d), \quad c(a - d), \quad a^2 + q^2d^2 - (1 + q^2)(ad + q^{-1}bc), \\ zb, \quad zc, \quad z(a - d), \quad z(q^2a + d - (q^2 + 1)), \end{gathered} \tag{4.1} \]
where $z := q^2a + d - (q^3 + q^{-1})$. As discussed in Sec. 2.1, this ideal defines a left-covariant first order differential calculus on $SU_q(2)$, which we denote by $\Omega^1 P$. In fact, one checks that $I_P$ is stable under the right adjoint coaction $\mathrm{Ad}_R$ and so this calculus is bicovariant under left and right coactions of $A[SU_q(2)]$. It is precisely the $4D_+$ calculus on $SU_q(2)$ introduced in [22]: indeed, one may check that the space $\Lambda^1 \cong P^+/I_P$ of left-invariant one-forms is a four-dimensional vector space.

Following [11], we define elements $L_-, L_0, L_+, L_z$ of $U_q(su(2))$ by
\[ L_- := q^{1/2}FK^{-1}, \qquad L_+ := q^{-1/2}EK^{-1}, \qquad L_0 := K^2 + \nu^2q^{-1}FE - 1, \qquad L_z := K^{-2} - 1. \]
The vectors $L_0$ and $L_z$ are related to the quantum Casimir (3.6) by
\[ (q - q^{-1})^2\left(C_q + \tfrac{1}{4} - \left[\tfrac{1}{2}\right]^2\right) = qL_0 + q^{-1}L_z. \tag{4.2} \]
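The relation (4.2), read here as $\nu^2\big(C_q + \tfrac14 - [\tfrac12]^2\big) = qL_0 + q^{-1}L_z$, can be checked in a few lines directly from the definitions of $L_0$, $L_z$ and $C_q$. The following worked computation is a sketch that uses only (1.1), (1.2), (3.6) and the definitions above; the only auxiliary fact it needs is $\nu^2[\tfrac12]^2 = (q^{1/2} - q^{-1/2})^2 = q - 2 + q^{-1}$.

```latex
\begin{align*}
qL_0 + q^{-1}L_z
  &= q\bigl(K^2 + \nu^2 q^{-1}FE - 1\bigr) + q^{-1}\bigl(K^{-2} - 1\bigr) \\
  &= \nu^2 FE + \bigl(qK^2 - 2 + q^{-1}K^{-2}\bigr) - \bigl(q - 2 + q^{-1}\bigr) \\
  &= \nu^2\Bigl(FE + \nu^{-2}\bigl(qK^2 - 2 + q^{-1}K^{-2}\bigr)\Bigr) - \nu^2\bigl[\tfrac12\bigr]^2 \\
  &= \nu^2\Bigl(C_q + \tfrac14\Bigr) - \nu^2\bigl[\tfrac12\bigr]^2 .
\end{align*}
```

In the spin-j representation this gives the scalar $\nu^2\big([j + \tfrac12]^2 - [\tfrac12]^2\big)$, which vanishes for j = 0, as it must for an element annihilating the unit.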
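The elements $L_-, L_0, L_+, L_z$ act upon $A[SU_q(2)]$ via the formula (3.12) and together provide a basis for the quantum tangent space $T_P$ of the calculus. Note in particular that the element $C_q - \epsilon_P(C_q)1$ is also an element of $T_P$. Let $\{\omega_-, \omega_0, \omega_+, \omega_z\}$ be a basis of the space of left-invariant one-forms $\Lambda^1$ such that $(L_j, \omega_k) = \delta_{jk}$ for $j, k = -, 0, +, z$. As given in [19], the bimodule relations in the calculus $\Omega^1 P$ with respect to these one-forms are:
\[ \begin{aligned} \omega_-\begin{pmatrix} a & b \\ c & d \end{pmatrix} &= \begin{pmatrix} a & b \\ c & d \end{pmatrix}\omega_- + \nu^2q^{-1}\begin{pmatrix} b & 0 \\ d & 0 \end{pmatrix}\omega_0; \\ \omega_+\begin{pmatrix} a & b \\ c & d \end{pmatrix} &= \begin{pmatrix} a & b \\ c & d \end{pmatrix}\omega_+ + \nu^2q^{-1}\begin{pmatrix} 0 & a \\ 0 & c \end{pmatrix}\omega_0; \\ \omega_0\begin{pmatrix} a & b \\ c & d \end{pmatrix} &= \begin{pmatrix} q^{-1}a & qb \\ q^{-1}c & qd \end{pmatrix}\omega_0; \\ \omega_z\begin{pmatrix} a & b \\ c & d \end{pmatrix} &= \begin{pmatrix} 0 & a \\ 0 & c \end{pmatrix}\omega_- + \nu^2q^{-1}\begin{pmatrix} a & 0 \\ c & 0 \end{pmatrix}\omega_0 + \begin{pmatrix} b & 0 \\ d & 0 \end{pmatrix}\omega_+ + \begin{pmatrix} qa & q^{-1}b \\ qc & q^{-1}d \end{pmatrix}\omega_z. \end{aligned} \tag{4.3} \]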
In these terms, the exterior derivative $d: A[SU_q(2)] \to \Omega^1 P$ has the form
\[ dp = (L_- \triangleright p)\omega_- + (L_0 \triangleright p)\omega_0 + (L_+ \triangleright p)\omega_+ + (L_z \triangleright p)\omega_z, \qquad p \in A[SU_q(2)], \tag{4.4} \]
where $\triangleright$ is the left action of $U_q(su(2))$ on $A[SU_q(2)]$ defined in (3.12). By using the formulæ (3.13) to compute the action of $L_0, L_z, L_+, L_-$ on the generators of $A[SU_q(2)]$ and then substituting into (4.4), one obtains the explicit expressions
\[ \begin{aligned} da &= (q^{-1} - 1 + \nu^2q^{-1})a\omega_0 + b\omega_+ + (q - 1)a\omega_z, \\ db &= a\omega_- + (q - 1)b\omega_0 + (q^{-1} - 1)b\omega_z, \\ dc &= (q^{-1} - 1 + \nu^2q^{-1})c\omega_0 + d\omega_+ + (q - 1)c\omega_z, \\ dd &= c\omega_- + (q - 1)d\omega_0 + (q^{-1} - 1)d\omega_z \end{aligned} \tag{4.5} \]
for the differentials of the matrix generators of $A[SU_q(2)]$ in terms of these left-invariant one-forms.

4.2. Framed manifold structure of $S^2_q$

Next, we use Sec. 2.3 to compute the cotangent bundle $\Omega^1 S^2_q$ of the base space $S^2_q$ of the Hopf fibration as an associated vector bundle. As before, we write $P = A[SU_q(2)]$ for the algebra of functions on the total space of the Hopf fibration, $M = A[S^2_q]$ for the algebra of functions on the base and $H = A[U(1)]$ for the structure quantum group. Recall the right coaction $\delta_R: P \to P \otimes H$ defined in (3.16) and the canonical projection $\pi: P \to H$ defined in (3.15).

The differential calculus on P is taken to be the four-dimensional bicovariant calculus $\Omega^1 P$ defined in the previous section; it is defined in terms of the $\mathrm{Ad}_R$-invariant ideal $I_P$ generated by the elements in (4.1). Now writing $\epsilon_H$ for the counit of H, we obtain a bicovariant differential calculus $\Omega^1 H$ on $H = A[U(1)]$ by projecting the ideal $I_P$ to obtain an ideal $I_H := \pi(I_P)$ of $\mathrm{Ker}\,\epsilon_H$. As such, $I_H$ is generated by the three elements
\[ t^2 + q^2t^{*2} - (1 + q^2), \qquad z(t - t^*), \qquad z(q^2t + t^* - (q^2 + 1)), \tag{4.6} \]
again with $z = q^2t + t^* - (q^3 + q^{-1})$, where $t, t^*$ are the generators of H.
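The differentials (4.5) of Sec. 4.1 can be cross-checked numerically, since the left action (3.12) restricts to a representation of $U_q(su(2))$ on the span of the generators a, b, c, d. The following is a minimal sketch under that observation: it builds 4×4 matrices for K, E, F from the coproduct and the pairing on generators, forms $L_-, L_0, L_+, L_z$ from their definitions and reads off the coefficients in (4.4). The names `GENS`, `COPRODUCT`, `PAIRING` and `action_matrix` are hypothetical, and q = 0.6 is an arbitrary test value.

```python
import numpy as np

q = 0.6            # arbitrary test value with 0 < q < 1
nu = q - 1.0 / q
GENS = ["a", "b", "c", "d"]

# Coproduct of each generator as (first leg, second leg) pairs.
COPRODUCT = {
    "a": [("a", "a"), ("b", "c")],
    "b": [("a", "b"), ("b", "d")],
    "c": [("c", "a"), ("d", "c")],
    "d": [("c", "b"), ("d", "d")],
}
# Nonzero pairings of K, E, F with generators.
PAIRING = {("K", "a"): q**-0.5, ("K", "d"): q**0.5, ("E", "c"): 1.0, ("F", "b"): 1.0}

def action_matrix(X):
    """Matrix of p -> X |> p on span{a, b, c, d}; column k holds the image of GENS[k]."""
    M = np.zeros((4, 4))
    for col, gen in enumerate(GENS):
        for first, second in COPRODUCT[gen]:
            M[GENS.index(first), col] += PAIRING.get((X, second), 0.0)
    return M

K, E, F = (action_matrix(X) for X in "KEF")
Kinv = np.linalg.inv(K)
I4 = np.eye(4)

# Tangent vectors from Sec. 4.1; products act as (XY) |> p = X |> (Y |> p).
L = {
    "-": q**0.5 * F @ Kinv,
    "0": K @ K + nu**2 / q * F @ E - I4,
    "+": q**-0.5 * E @ Kinv,
    "z": Kinv @ Kinv - I4,
}

for col, gen in enumerate(GENS):
    terms = []
    for label, mat in L.items():
        for row, target in enumerate(GENS):
            if abs(mat[row, col]) > 1e-12:
                terms.append(f"({mat[row, col]:+.4f}) {target} w_{label}")
    print(f"d{gen} =", " ".join(terms))
```

For example, the coefficient of $a\omega_0$ in da should equal $q^{-1} - 1 + \nu^2q^{-1}$ and the coefficient of $b\omega_+$ should equal 1.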
Lemma 4.1. The calculus $\Omega^1 H$ is one-dimensional. It is spanned as a left module by the left-invariant one-form $\omega_t := t^*dt$ and has bimodule relations
\[ \omega_t t = qt\omega_t, \qquad \omega_t t^* = q^{-1}t^*\omega_t, \]
where $t, t^*$ are the generators of $H = A[U(1)]$.

Proof. We define an equivalence relation ∼ on $H^+$ by $x \sim y$ if and only if $x - y \in I_H$. By taking a linear combination of the generators in (4.6), one finds in particular that $(t - 1) + q(t^* - 1) \sim 0$, which is our key equivalence. Using it, one deduces that
\[ \begin{aligned} t^2 &= (t + 1)(t - 1) + 1 \sim -q(t + 1)(t^* - 1) + 1 = -q(t^* - t) + 1 \sim (q + 1)(t - 1) + 1, \\ t^{*2} &= (t^* + 1)(t^* - 1) + 1 \sim -q^{-1}(t^* + 1)(t - 1) + 1 = -q^{-1}(t - t^*) + 1 \sim -q^{-1}(1 + q^{-1})(t - 1) + 1, \end{aligned} \]
so that every quadratic polynomial in $t$, $t^*$ and 1 is equivalent to a linear combination of $t - 1$ and $t^* - 1$. By induction any polynomial in t is equivalent to such a linear combination. Applying the key equivalence once more tells us that we can always eliminate $t^* - 1$. Thus we take $t - 1$ as a representative of the quotient space $H^+/I_H$ and $\omega_t := r^{-1}(1 \otimes (t - 1))$ as the corresponding left-invariant one-form, which spans the calculus $\Omega^1 H$ as a left H-module. To obtain the bimodule relations, we compute for example that
\[ \omega_t t = ((t^* - 1) \otimes \overline{t - 1})\,t = (1 - t) \otimes \overline{t^2 - t} = qt(t^* - 1) \otimes \overline{t - 1} = qt\omega_t, \]
where the overline denotes an equivalence class modulo $I_H$. The first and last equalities use the definition of the map r and the middle equality uses the bimodule structure (2.3).

The differential calculus $\Omega^1 M$ on the base of the fibration is defined by restricting the calculus $\Omega^1 P$ to M. This means that it is defined as the quotient $\Omega^1 M := \tilde\Omega^1 M/N_M$, where $N_M$ is the M-M-bimodule $N_M := N_P \cap \tilde\Omega^1 M$. We postpone the computation of generators and relations for $\Omega^1 M$ and observe that for now we have the following expressions for the exterior derivative on M in terms of the left-invariant one-forms $\omega_\pm, \omega_0$.

Lemma 4.2. The exterior derivative d acts on $M = A[S^2_q]$ as
\[ \begin{pmatrix} db_+ \\ db_0 \\ db_- \end{pmatrix} = \begin{pmatrix} d^2 & \mu\nu^2q^{-1}cd & qc^2 \\ db & \nu^2q^{-1}(1 + \mu bc) & ac \\ b^2 & \mu\nu^2q^{-1}ab & qa^2 \end{pmatrix}\begin{pmatrix} \omega_+ \\ \omega_0 \\ \omega_- \end{pmatrix} \tag{4.7} \]
in terms of the generators $b_\pm, b_0$ of M given in (3.18).
Proof. This follows from direct computation. For example, to compute $db_+$ the Leibniz rule yields $db_+ = d(cd) = (dc)d + c(dd)$. One uses the expressions (4.5) to rewrite dc, dd in terms of $\omega_\pm$ and $\omega_0$, then the bimodule relations in Eqs. (4.3) to collect all coefficients to the left. Collecting like terms yields the expression as stated. The same method works for computing $db_0$ and $db_-$.

Lemma 4.3. With P, H and M as above, the differential calculi $\Omega^1 P$, $\Omega^1 H$ and $\Omega^1 M$ satisfy the compatibility conditions of (2.11).

Proof. The relation $\pi(I_P) = I_H$ holds by definition of the calculus on H. It is sufficient to verify the $\mathrm{Ad}_R$-condition in (2.11) on generators: one finds that
\[ \begin{aligned} (\mathrm{id} \otimes \pi)\mathrm{Ad}_R(c^2) &= c^2 \otimes t^4, & (\mathrm{id} \otimes \pi)\mathrm{Ad}_R(c(a - d)) &= c(a - d) \otimes t^2, \\ (\mathrm{id} \otimes \pi)\mathrm{Ad}_R(b^2) &= b^2 \otimes t^{*4}, & (\mathrm{id} \otimes \pi)\mathrm{Ad}_R(b(a - d)) &= b(a - d) \otimes t^{*2}, \\ (\mathrm{id} \otimes \pi)\mathrm{Ad}_R(zc) &= zc \otimes t^2, & (\mathrm{id} \otimes \pi)\mathrm{Ad}_R(zb) &= zb \otimes t^{-2}, \end{aligned} \]
with all other generators coinvariant under the map $(\mathrm{id} \otimes \pi)\mathrm{Ad}_R$.

This means that we may apply Sec. 2.3 to express $S^2_q$ as a framed quantum manifold. The framing comodule V is computed as follows. Clearly $P^+ \cap M$ is equal to $M^+ = \mathrm{Ker}\,\epsilon_M$, where $\epsilon_M$ is the restriction of the counit $\epsilon_P$ to the subalgebra M. In our case, with $M = A[S^2_q]$ being generated by $b_\pm, b_0$, we have that $M^+ = \langle b_0, b_\pm \rangle$ as a right ideal. To compute $I_P \cap M$ we note that, since the generators
\[ b(a - d), \quad c(a - d), \quad a^2 + q^2d^2 - (1 + q^2)(ad + q^{-1}bc), \quad zb, \quad zc, \quad z(a - d), \quad z(q^2a + d - (q^2 + 1)) \]
are not of homogeneous degree, the ideal that each of them generates has no intersection with M. Thus we concentrate on the generators $b^2, c^2$ of $I_P$. The elements of degree zero in $\langle b^2 \rangle$ include $b^2\{a^2, ac, c^2\}$ and so we see that $b_-^2$, $b_-b_0$, $b_0^2$ all lie in $I_P \cap M$. Similarly, from the ideal $\langle c^2 \rangle$ we see that $b_+^2$ and $b_+b_0$ are also in $I_P \cap M$. From this discussion we obtain
\[ V = \langle b_0, b_\pm \rangle / \langle b_\pm^2, b_0^2, b_\pm b_0 \rangle. \tag{4.8} \]
Hence V is three-dimensional with representatives $b_\pm$ and $b_0$. We compute the right coaction of H on V from (2.12) as
\[ \Delta_R(b_+) = cd \otimes S\pi(d^2) = b_+ \otimes t^2, \qquad \Delta_R(b_-) = ab \otimes S\pi(a^2) = b_- \otimes t^{*2}, \qquad \Delta_R(b_0) = bc \otimes 1 = b_0 \otimes 1. \]
Hence $V = \mathbb{C} \oplus \mathbb{C} \oplus \mathbb{C}$ and the associated bundle
\[ E = L_{-2} \oplus L_0 \oplus L_{+2} = A[SU_q(2)]_2 \oplus A[SU_q(2)]_0 \oplus A[SU_q(2)]_{-2} \]
is the direct sum of the line bundles over $S^2_q$ with winding numbers −2, 0 and 2. This yields the following theorem.

Theorem 4.4. The homogeneous space $S^2_q$ is a framed quantum manifold with cotangent bundle $\Omega^1 S^2_q \cong L_{-2} \oplus L_0 \oplus L_{+2}$. The isomorphism is given by the soldering form
\[ \begin{aligned} \theta(b_+) &= q^2c^2\,db_- - q\mu\,ac\,db_0 + a^2\,db_+ = \omega_+, \\ \theta(b_0) &= -q\,dc\,db_- + (1 + \mu bc)\,db_0 - q^{-1}ba\,db_+ = \nu^2q^{-1}\omega_0, \\ \theta(b_-) &= d^2\,db_- - q^{-1}\mu\,bd\,db_0 + q^{-2}b^2\,db_+ = q\omega_- \end{aligned} \]
and makes $\Omega^1 S^2_q$ projective as a left $A[S^2_q]$-module.

Proof. The only remaining part is to compute the soldering form $\theta(b_\pm)$, $\theta(b_0)$. We find the left coaction on $M = A[S^2_q]$ inherited from the coproduct on $A[SU_q(2)]$ to be
\[ \begin{aligned} \Delta_L(b_+) &= \Delta_L(cd) = c^2 \otimes b_- + cd \otimes (1 + \mu b_0) + d^2 \otimes b_+, \\ \Delta_L(b_0) &= \Delta_L(bc) = ca \otimes b_- + 1 \otimes b_0 + bc \otimes (1 + \mu b_0) + db \otimes b_+, \\ \Delta_L(b_-) &= \Delta_L(ab) = a^2 \otimes b_- + ab \otimes (1 + \mu b_0) + b^2 \otimes b_+. \end{aligned} \]
In fact these coproducts were already used in computing $\Delta_R$ above. This time we apply the antipode S to the first tensor factor to obtain
\[ \theta(b_+) = S(b_{+(1)})\,d(b_{+(2)}) = q^2c^2\,db_- - q\mu\,ac\,db_0 + a^2\,db_+, \]
and similarly for $\theta(b_-)$ and $\theta(b_0)$. This yields the middle expressions as stated. We then insert the expressions from Lemma 4.2 to obtain $\{\omega_+, \nu^2q^{-1}\omega_0, q\omega_-\}$ for the values of the map θ. According to Sec. 2.3, the map $\theta: V \to P\,\Omega^1 M$ is well-defined on V. In order to get one-forms on $A[S^2_q]$, one must multiply $\theta(b_-)$ by an element of degree 2, $\theta(b_+)$ by an element of degree −2 and $\theta(b_0)$ by an element of degree zero. Moreover, every one-form is obtained in this way. This yields the isomorphism as stated. Since all line bundles $L_n$ are projective, so is $\Omega^1 S^2_q$.

The above also shows that the exterior derivative d in the calculus $\Omega^1 S^2_q$ is given by restriction of the expression in (4.4), namely
\[ dm = (L_- \triangleright m)\omega_- + (L_0 \triangleright m)\omega_0 + (L_+ \triangleright m)\omega_+, \qquad m \in A[S^2_q]. \tag{4.9} \]
We stress that $L_\mp \triangleright m \in L_{\pm 2}$ rather than being an element of $A[S^2_q]$. Of course, from (4.4) combined with the fact that the vertical vector field $L_z$ obeys $L_z \triangleright m = 0$
for all $m \in A[S^2_q]$, we already expected this to be the case. From Theorem 4.4 we know that $\Omega^1 S^2_q$ is spanned as a left module by
\[ \begin{aligned} \{d^2, db, b^2\}\,\omega_+ &:= \{\partial_+b_+, \partial_+b_0, \partial_+b_-\}, \\ \nu^2q^{-1}\{\mu cd, 1 + \mu bc, \mu ab\}\,\omega_0 &:= \{\partial_0b_+, \partial_0b_0, \partial_0b_-\}, \\ \{qc^2, ac, qa^2\}\,\omega_- &:= \{\partial_-b_+, \partial_-b_0, \partial_-b_-\}. \end{aligned} \tag{4.10} \]
The bimodule relations in the calculus $\Omega^1 S^2_q$ are in general quite complicated to compute directly, but we can use the expressions in Eqs. (4.10) to break them into smaller pieces which are much easier to work with.

Corollary 4.5. The cotangent bundle $\Omega^1 S^2_q$ has first order differential sub-calculi
\[ \Omega^1_+ \cong L_{-2} \oplus L_0, \qquad \Omega^1_0 \cong L_0, \qquad \Omega^1_- \cong L_0 \oplus L_{+2} \]
with differentials given by d+ := ∂+ + ∂0 , d0 := ∂0 and d− := ∂0 + ∂− respectively. These calculi obey the bimodule relations −2 q b+ (∂+ b+ ) + q −3 µ−1 b+ (∂0 b+ ) b + q −4 b (∂ b ) + µ−1 q −2 (1 + q −3 b )(∂ b ) 0 + + 0 0 + ∂+ b+ b0 = q −2 b− (∂+ b+ ) − (q 2 − q −2 )b+ (∂+ b− ) + ∂0 b0 b− + (q 2 − q −2 )−1 (q −2 b− (∂0 b+ ) − b+ (∂0 b− )) − q −1 νb+ (∂0 b− ), −3 −1 b+ (∂+ b0 ) + q µ b0 (∂0 b+ ) b + ∂+ b0 b0 = q −2 b0 (∂+ b0 ) + q −2 µ−1 b+ (∂0 b− ) −2 b− q b− (∂+ b0 ) − q −1 νb0 (∂+ b− ) + q −2 (1 + q −1 b0 )(∂0 b− ), 2 2 −2 −1 2 b + q b+ (∂+ b− ) + (q − q ) (q b− (∂0 b+ ) − b+ (∂0 b− )) ∂+ b− b0 = b0 (∂+ b− ) + q −1 µ−1 b0 (∂0 b− ) −2 b− q b− (∂+ b− ) + q −3 µ−1 b− (∂0 b− ), 2 3 −1 q b+ (∂− b+ ) + q µ b+ (∂0 b+ ) b + ∂− b+ b0 = b0 (∂− b+ ) + qµ−1 b0 (∂0 b+ ) −2 q b− (∂− b+ ) + (q 2 − q −2 )−1 (b− (∂0 b+ ) − q 2 b+ (∂0 b− )), b− 2 −1 q b+ (∂− b0 ) + qνb0 (∂− b+ ) + µ (1 + qb0 )(∂0 b+ ) b + ∂− b0 b0 = q 2 b0 (∂− b0 ) + µ−1 b− (∂0 b+ ) b− b− (∂− b0 ) + q 3 µ−1 b0 (∂0 b− ), 2 q b+ (∂− b− ) + (q 2 − q −2 )b− (∂− b+ ) + q 2 ∂0 b0 b + + (q 2 − q −2 )−1 (b− (∂0 b+ ) − q 2 b+ (∂0 b− )) + qνb− (∂0 b+ ) ∂− b− b0 = 4 −1 3 q b0 (∂− b− ) + µ (1 + q b0 )(∂0 b− ) b− 2 q b− (∂− b− ) + q 3 µ−1 b− (∂0 b− ).
Proof. Using the expressions in Eqs. (4.10) the bimodule relations in Ω1 Sq2 are easily determined from straightforward but laborious computation along the following lines. From the bimodule relations in Eqs. (4.3) one finds that 2 −1 2 b + ω + + ν q c ω 0 b + ω+ b0 = b0 ω+ + ν 2 q −1 caω0 b− b− ω+ + ν 2 q −1 a2 ω0 , 2 2 b + ω − + ν d ω 0 b + ω− b0 = b0 ω− + ν 2 dbω0 b− b− ω− + ν 2 b2 ω0 , with ω0 commuting with each of b± , b0 . Combining these with the algebra relations in A[SUq (2)] yields the bimodule relations as stated, together with b + b+ (∂0 b+ ) ∂0 b+ b0 = q −2 b0 (∂0 b+ ) −2 b− q b− (∂0 b+ ) − q −2 b− (∂0 b+ ) + b+ (∂0 b− ), 2 −1 q b+ (∂0 b0 ) − qµ ν(∂0 b+ ) b + ∂0 b0 b0 = b0 (∂0 b0 ) −2 b− q b− (∂0 b0 ) + q −1 µ−1 ν(∂0 b− ), 2 −2 q b+ (∂0 b− ) + b− (∂0 b+ ) − q b+ (∂0 b− ) b + ∂0 b− b0 = q 2 b0 (∂0 b− ) b− b− (∂0 b− ). The fact that Ω1+ = L−2 ⊕ L0 , Ω10 = L0 and Ω1− = L0 ⊕ L+2 close as sub-bimodules is now clear by inspection. The Leibniz rules for the differentials d+ , d0 and d− follow from the Leibniz rule for d and the direct sum decomposition of Ω1 Sq2 . Corollary 4.6. The one-forms in the calculus Ω1 Sq2 enjoy the relations ∂+ b0 = q −2 b− (∂+ b+ ) − q 2 b+ (∂+ b− ), b0 b− (∂+ b+ ) = q 3 (1 + qb0 )b+ (∂+ b− ), ∂− b0 = b+ (∂− b− ) − q −4 b− (∂− b+ ), b0 b+ (∂− b− ) = q −3 (1 + q −1 b0 )b− (∂− b+ ), b0 ∂0 b0 = −qµν −1 b− (∂0 b+ ) + q −1 µν −1 b+ (∂0 b− ), b+ (∂0 b0 ) = (µ−1 + q −2 b0 )∂0 b+ , b− (∂0 b0 ) = (µ−1 + q 2 b0 )∂0 b+ .
Proof. These are obtained in analogy with the proof of Corollary 4.5, from the relations in $\mathcal{A}[SU_q(2)]$ acting on $\omega_\pm$ and $\omega_0$. One finds the relations as stated, together with
$$b_+(\partial_+ b_-) = q^{-1} b_0(\partial_+ b_0), \qquad b_-(\partial_+ b_+) = q^{2}(1 + q b_0)(\partial_+ b_0),$$
$$b_-(\partial_- b_+) = q^{2} b_0(\partial_- b_0), \qquad b_+(\partial_- b_-) = q^{-1}(1 + q^{-1} b_0)(\partial_- b_0).$$
There are other relations involving the differential $\partial_0$, but they are quite complicated (since the sphere relation in $\mathcal{A}[S_q^2]$ does not explicitly involve the unit) and are not particularly illuminating, so we shall not give them here.

Finally, we use Theorem 4.4 to compute the differentials $\partial_\pm$ and $\partial_0$ in terms of the exterior derivative $d$. Using the algebra relations in $\mathcal{A}[SU_q(2)]$ and the expressions in Eqs. (4.10) we find that
$$\begin{aligned}
\partial_+ b_+ &= q^{-1} b_+^2\, db_- - \mu b_+(1 + q^{-1} b_0)\, db_0 + (1 + q^{-1} b_0)^2\, db_+ + q^{-2}\nu b_+ b_-\, db_+,\\
\partial_+ b_0 &= q b_+ b_0\, db_- - \mu b_+ b_-\, db_0 + q^{-2}(1 + q^{-1} b_0) b_-\, db_+,\\
\partial_+ b_- &= q^{2} b_0^2\, db_- - q^{-1}\mu b_- b_0\, db_0 + q^{-3} b_-^2\, db_+,\\
\partial_0 b_+ &= -\mu b_+^2\, db_- + \mu b_+(1 + \mu b_0)\, db_0 - q^{-2}\mu b_+ b_-\, db_+,\\
\partial_0 b_0 &= (1 + \mu b_0)\big(-b_+\, db_- + (1 + \mu b_0)\, db_0 - q^{-2} b_-\, db_+\big),\\
\partial_0 b_- &= -\mu b_- b_+\, db_- + \mu b_-(1 + \mu b_0)\, db_0 - q^{-2}\mu b_-^2\, db_+,\\
\partial_- b_+ &= q b_+^2\, db_- - q^{-1}\mu b_0 b_+\, db_0 + q^{-2} b_0^2\, db_+,\\
\partial_- b_0 &= (1 + q b_0) b_+\, db_- - q\mu b_0(1 + q b_0)\, db_0 + q^{-2} b_- b_0\, db_+,\\
\partial_- b_- &= \big((1 + q b_0)^2 + \nu b_- b_+\big)\, db_- - \mu b_-(1 + q b_0)\, db_0 + q^{-1} b_-^2\, db_+.
\end{aligned}$$
These expressions may now be used to compute the full bimodule structure of the calculus $\Omega^1 S_q^2$ in terms of the differential $d$, as well as the deeper structure of the noncommutative Riemannian geometry of this calculus, along similar lines to [14]. However, since our objective is to study the spin geometry of the calculus, we have all we need and so we shall not pursue these directions here.

5. The Spectral Geometry of $S_q^2$

In this section, we realize the “three-dimensional” differential calculus $\Omega^1 S_q^2$ by a spectral triple on $S_q^2$. This means equipping $S_q^2$ with a spinor bundle $S$ and a Dirac operator $D$ which together implement the exterior derivative $d$ for $\Omega^1 S_q^2$. We then equip this spectral triple with a real structure for which the commutant property and the first order condition for the Dirac operator are satisfied up to infinitesimals of arbitrary order, in parallel with the results of [7] for the “two-dimensional” calculus on $S_q^2$.
5.1. Background on spectral triples

We recall briefly the notion of a spectral triple [2].

Definition 5.1. A unital spectral triple $(\mathcal{A}, \mathcal{H}, D)$ consists of a complex unital $*$-algebra $\mathcal{A}$, faithfully $*$-represented by bounded operators on a (separable) Hilbert space $\mathcal{H}$, and a self-adjoint operator $D: \mathcal{H} \to \mathcal{H}$ (the Dirac operator) with the following properties:
(i) the resolvent $(D - \lambda)^{-1}$, $\lambda \notin \mathbb{R}$, is a compact operator on $\mathcal{H}$;
(ii) for all $a \in \mathcal{A}$ the commutator $[D, \pi(a)]$ is a bounded operator on $\mathcal{H}$.

A spectral triple $(\mathcal{A}, \mathcal{H}, D)$ is called even if there exists a $\mathbb{Z}_2$-grading of $\mathcal{H}$, i.e. an operator $\Gamma: \mathcal{H} \to \mathcal{H}$ with $\Gamma = \Gamma^*$ and $\Gamma^2 = 1$, such that $\Gamma D + D\Gamma = 0$ and $\Gamma a = a\Gamma$ for all $a \in \mathcal{A}$. Otherwise the spectral triple is said to be odd.

With $0 < n < \infty$, the Dirac operator $D$ is said to be $n^+$-summable if $(D^2 + 1)^{-1/2}$ is in the Dixmier ideal $\mathcal{L}^{n+}(\mathcal{H})$. The metric dimension of the spectral triple $(\mathcal{A}, \mathcal{H}, D)$ is defined to be the infimum of the set of all $n$ such that $D$ is $n^+$-summable.

Given a spectral triple $(\mathcal{A}, \mathcal{H}, D)$, one associates to it a canonical first order differential calculus $(\Omega^1_D\mathcal{A}, d_D)$. In particular, the $\mathcal{A}$-$\mathcal{A}$-bimodule $\Omega^1_D\mathcal{A}$ is defined to be
$$\Omega^1_D\mathcal{A} := \Big\{ \omega = \sum_j a_0^j\,[D, a_1^j] \;\Big|\; a_0^j, a_1^j \in \mathcal{A} \Big\}, \tag{5.1}$$
with the differential $d_D$ given by $d_D a = [D, a]$ for $a \in \mathcal{A}$.

The original definition [3] of a real structure on a spectral triple $(\mathcal{A}, \mathcal{H}, D)$ was given by an anti-unitary operator $J: \mathcal{H} \to \mathcal{H}$ with the properties $J^2 = \pm 1$, $JD = \pm DJ$ and
$$[\pi(a), J\pi(b)J^{-1}] = 0, \qquad [[D, \pi(a)], J\pi(b)J^{-1}] = 0, \qquad a, b \in \mathcal{A}. \tag{5.2}$$
These are called the commutant property and the first order condition respectively. However, in many examples involving quantum spaces, one needs to modify these conditions in order to obtain non-trivial spin geometries [5–8]. Following the approach there, we impose the weaker assumption that (5.2) holds only up to infinitesimals of arbitrary order (i.e. up to compact operators $T$ with the property that the singular values $s_k(T)$ satisfy $\lim_{k\to\infty} k^p s_k(T) = 0$ for all $p > 0$).

Definition 5.2. A real structure on a spectral triple $(\mathcal{A}, \mathcal{H}, D)$ is an anti-unitary operator $J: \mathcal{H} \to \mathcal{H}$ such that
$$J^2 = \pm 1, \qquad JD = \pm DJ, \qquad [\pi(a), J\pi(b)J^{-1}] \in \mathcal{I}, \qquad [[D, \pi(a)], J\pi(b)J^{-1}] \in \mathcal{I}, \qquad a, b \in \mathcal{A}, \tag{5.3}$$
where $\mathcal{I}$ is an operator ideal of infinitesimals of arbitrary order. We say that the datum $(\mathcal{A}, \mathcal{H}, D, J)$ is a real spectral triple (up to infinitesimals). If $(\mathcal{A}, \mathcal{H}, D, \Gamma)$ is even and $J\Gamma = \pm\Gamma J$, we call the datum $(\mathcal{A}, \mathcal{H}, D, \Gamma, J)$ an even real spectral triple (up to infinitesimals). The signs above depend on the so-called KO-dimension of the triple. We shall only need the case where the KO-dimension is two; then $J^2 = -1$, $JD = DJ$ and $J\Gamma = -\Gamma J$.

5.2. A Dirac operator on $S_q^2$

In order to define a spectral triple on $S_q^2$, we need a spinor bundle over $S_q^2$ and an associated Dirac operator, which we require should recover the differential calculus $\Omega^1 S_q^2$ via the commutator representation defined in (5.1). Since the differential calculus $\Omega^1 S_q^2$ constructed in Theorem 4.4 is equivariant under a left coaction of $\mathcal{A}[SU_q(2)]$ and hence a right action of $U_q(su(2))$, we are led to consider spinor bundles and Dirac operators which are right $U_q(su(2))$-equivariant.

Guided by this principle, as well as by the spin structure of the classical two-sphere $S^2$, for the $\mathcal{A}[S_q^2]$-module of spinors we take $S = S_+ \oplus S_- := L_{-1} \oplus L_{+1}$. As right $U_q(su(2))$-modules, the vector spaces $S_\pm$ are both isomorphic to the direct sum
$$V := \bigoplus_{j \in \mathbb{N} + \frac12} V^j \tag{5.4}$$
over all irreducible $U_q(su(2))$-modules $V^j$ with spin $j \in \mathbb{N} + \tfrac12$ a half-odd integer. A corresponding basis for $V$ is then given by
$$\big\{\, |j, m\rangle \;\big|\; j \in \mathbb{N} + \tfrac12,\ m = -j, \dots, j \,\big\},$$
where the vectors $|j, m\rangle$ span the irreducible $U_q(su(2))$-module $V^j$ in Eqs. (3.7). We denote the orthonormal bases of the two different copies $S_\pm$ of $V$ respectively by
$$|j, m\rangle_\pm, \qquad j \in \mathbb{N} + \tfrac12, \qquad m = -j, \dots, j. \tag{5.5}$$
We equip S with the inner product which makes this basis orthonormal and write H for the corresponding Hilbert space completion of S. As A[Sq2 ]-modules, the vector spaces S± each carry one of two inequivalent Uq (su(2))-equivariant representations of A[Sq2 ], π± : A[Sq2 ] → End(S± ). Recall that S± are just the subspaces of A[SUq (2)] with overall degrees ∓1 with respect to the Z-grading (3.17), so the representations π± on S± are simply given
by restricting the multiplication in $\mathcal{A}[SU_q(2)]$ to the appropriate degrees. However, it is possible to describe these representations explicitly in terms of the basis (5.5) in the following way. Indeed, the $U_q(su(2))$-equivariant representations of $\mathcal{A}[S_q^2]$ on $V$ were already described in [7, 21]. To be able to simply quote them we make a change of generators, now writing
$$x_1 = -q^{1/2}\mu b_+, \qquad x_0 - 1 = \mu b_0, \qquad x_{-1} = -q^{-3/2}\mu b_-, \tag{5.6}$$
where $b_\pm$, $b_0$ are the generators of $\mathcal{A}[S_q^2]$ defined in (3.18), and $\mu = q + q^{-1}$. With respect to these new generators, the algebra relations of $\mathcal{A}[S_q^2]$ now read
$$x_{-1}(x_0 - 1) = q^{2}(x_0 - 1)x_{-1}, \qquad x_1(x_0 - 1) = q^{-2}(x_0 - 1)x_1,$$
$$(q^{2}x_0 + 1)(x_0 - 1) = (q + q^{-1})\,x_{-1}x_1, \qquad (q^{-2}x_0 + 1)(x_0 - 1) = (q + q^{-1})\,x_1x_{-1}.$$
Then, with $N = \pm 1/2$, the two representations $\pi_\pm = \pi_{\pm 1/2}$ of $\mathcal{A}[S_q^2]$ on $S_\pm$ have the form
$$\pi_N(x_i)|j, m\rangle_\pm = \alpha_i^-(j, m; N)\,|j - 1, m + i\rangle_\pm + \alpha_i^0(j, m; N)\,|j, m + i\rangle_\pm + \alpha_i^+(j, m; N)\,|j + 1, m + i\rangle_\pm, \tag{5.7}$$
where the coefficients are determined by
$$\begin{aligned}
\alpha_1^+(j, m; N) &= q^{-j+m}\left(\frac{[j+m+1][j+m+2]}{[2j+1][2j+2]}\right)^{1/2}\alpha_N(j+1),\\
\alpha_1^0(j, m; N) &= -q^{m+2}\,\big([2][j-m][j+m+1]\big)^{1/2}\,[2j]^{-1}\,\beta_N(j),\\
\alpha_1^-(j, m; N) &= -q^{j+m+1}\left(\frac{[j-m-1][j-m]}{[2j-1][2j]}\right)^{1/2}\alpha_N(j),\\
\alpha_0^+(j, m; N) &= q^{m}\left(\frac{[2][j-m+1][j+m+1]}{[2j+1][2j+2]}\right)^{1/2}\alpha_N(j+1),\\
\alpha_0^0(j, m; N) &= [2j]^{-1}\big([j-m+1][j+m] - q^{-2}[j-m][j+m+1]\big)\,\beta_N(j),\\
\alpha_0^-(j, m; N) &= q^{m}\left(\frac{[2][j-m][j+m]}{[2j-1][2j]}\right)^{1/2}\alpha_N(j),\\
\alpha_{-1}^+(j, m; N) &= q^{j+m}\left(\frac{[j-m+1][j-m+2]}{[2j+1][2j+2]}\right)^{1/2}\alpha_N(j+1),\\
\alpha_{-1}^0(j, m; N) &= q^{m}\,\big([2][j-m+1][j+m]\big)^{1/2}\,[2j]^{-1}\,\beta_N(j),\\
\alpha_{-1}^-(j, m; N) &= -q^{-j+m-1}\left(\frac{[j+m-1][j+m]}{[2j-1][2j]}\right)^{1/2}\alpha_N(j)
\end{aligned}$$
(with the convention that $\alpha_i^-(\tfrac12, \pm\tfrac12; N) = 0$) and the real numbers $\alpha_N(j)$, $\beta_N(j)$ are
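$$\alpha_N(j) = q^{N}\left(\frac{[2][j+N][j-N]}{[2j+1][2j]}\right)^{1/2}, \qquad \beta_N(j) = q^{-1}[2j+2]^{-1}\Big(\varepsilon q^{-\varepsilon} - (q - q^{-1})\big([j][j+1] - [\tfrac12][\tfrac32]\big)\Big),$$
with $\varepsilon = \mathrm{sign}(N)$.

Next we come to the Dirac operator. With the $2 \times 2$ Pauli matrices
$$\sigma_+ := \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad \sigma_0 := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \sigma_- := \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},$$
one has the relations
$$\begin{gathered}
\sigma_+\sigma_- = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad \sigma_-\sigma_+ = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \sigma_0^2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \sigma_+^2 = \sigma_-^2 = 0,\\
\sigma_0\sigma_+ = \sigma_+, \qquad \sigma_+\sigma_0 = -\sigma_+, \qquad \sigma_-\sigma_0 = \sigma_-, \qquad \sigma_0\sigma_- = -\sigma_-.
\end{gathered} \tag{5.8}$$
Further, we use the differential operators $D_\pm$, $D_0$,
$$D_\pm := L_\pm, \qquad D_0 := L_0 + q^{-2}L_z = q^{-1}(q - q^{-1})^2\left(C_q + \tfrac14 - [\tfrac12]^2\right), \tag{5.9}$$
having used the expression (4.2) for the last equality. As will become clear momentarily, the use of $D_0$ instead of $L_0$ (the extra $L_z$ vanishing identically on $\mathcal{A}[S_q^2]$) will lead to a Dirac operator whose square is diagonal. We define a Dirac operator $D: S \to S$ by
$$D = D_+\sigma_+ + D_0\sigma_0 + D_-\sigma_-, \tag{5.10}$$
where the $2 \times 2$ Pauli matrices $\sigma_\pm$, $\sigma_0$ act upon the column vector of $S$ by left multiplication and the vector fields $D_\pm$, $D_0$ operate via the left action of $U_q(su(2))$ (we omit the symbol for this action from now on). As mentioned above, elements $a \in \mathcal{A}[S_q^2]$ act as multiplicative operators on $S$ via the representations $\pi_\pm$:
$$\pi: \mathcal{A}[S_q^2] \to \mathrm{End}(S), \qquad \pi(a) := \begin{pmatrix} \pi_+(a) & 0 \\ 0 & \pi_-(a) \end{pmatrix},$$
although we will not always explicitly denote the representation $\pi$.

Proposition 5.3. The Dirac operator $D: S \to S$ obeys
$$[D, a] = (L_+ a)\sigma_+ + (L_0 a)\sigma_0 + (L_- a)\sigma_-$$
for each $a \in \mathcal{A}[S_q^2]$.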
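The matrix relations (5.8) that enter the definition of $D$ can be confirmed by direct multiplication. The following is a minimal sketch in plain Python; the helper names `matmul`, `neg` and `pauli_relations_hold` are illustrative and not part of the paper.

```python
# Plain-Python 2x2 matrices (nested tuples), enough to check the relations (5.8).
SIGMA_PLUS = ((0, 1), (0, 0))
SIGMA_ZERO = ((1, 0), (0, -1))
SIGMA_MINUS = ((0, 0), (1, 0))
IDENTITY = ((1, 0), (0, 1))
ZERO = ((0, 0), (0, 0))


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def neg(a):
    """Entrywise negation of a 2x2 matrix."""
    return tuple(tuple(-x for x in row) for row in a)


def pauli_relations_hold():
    """Return True when every relation listed in (5.8) holds."""
    return all([
        matmul(SIGMA_PLUS, SIGMA_MINUS) == ((1, 0), (0, 0)),
        matmul(SIGMA_MINUS, SIGMA_PLUS) == ((0, 0), (0, 1)),
        matmul(SIGMA_ZERO, SIGMA_ZERO) == IDENTITY,
        matmul(SIGMA_PLUS, SIGMA_PLUS) == ZERO,
        matmul(SIGMA_MINUS, SIGMA_MINUS) == ZERO,
        matmul(SIGMA_ZERO, SIGMA_PLUS) == SIGMA_PLUS,
        matmul(SIGMA_PLUS, SIGMA_ZERO) == neg(SIGMA_PLUS),
        matmul(SIGMA_MINUS, SIGMA_ZERO) == SIGMA_MINUS,
        matmul(SIGMA_ZERO, SIGMA_MINUS) == neg(SIGMA_MINUS),
    ])
```

Because $\sigma_0\sigma_\pm = -\sigma_\pm\sigma_0$, the mixed terms in $D^2$ cancel as soon as $D_0$ commutes with $D_\pm$, which is the mechanism behind the diagonal square of $D$.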
Proof. For $\psi = (\psi_+\ \psi_-)^{\mathrm{tr}} \in S_+ \oplus S_-$, using the derivation property of the vector fields $D_\pm$, $D_0$, the commutator $[D, a]$ works out to be
$$[D, a]\psi = \begin{pmatrix} 0 \\ (D_- a)\psi_+ \end{pmatrix} + \begin{pmatrix} (D_+ a)\psi_- \\ 0 \end{pmatrix} + \begin{pmatrix} (D_0 a)\psi_+ \\ -(D_0 a)\psi_- \end{pmatrix} = \big((D_+ a)\sigma_+ + (D_0 a)\sigma_0 + (D_- a)\sigma_-\big)\psi.$$
To obtain the desired result, one simply substitutes $D_\pm = L_\pm$ and $D_0 = L_0 + q^{-2}L_z$, observing that $L_z a = 0$ for all $a \in \mathcal{A}[S_q^2]$.

This also shows that for all $a \in \mathcal{A}[S_q^2]$ the commutator $[D, a]$ recovers the one-form $da$, acting on the spinors $S$ by “Clifford multiplication”. The summand $D_+\sigma_+ + D_-\sigma_-$ in the operator (5.10) is precisely the Dirac operator of [4], corresponding [20] to the “two-dimensional” differential calculus on the sphere $S_q^2$. The extra term $D_0$ in our Dirac operator is the origin of the extra ‘direction’ in the calculus $\Omega^1 S_q^2$. It is clear from (4.2) that $D_0$ vanishes when $q \to 1$, whence the classical limit of our construction is just the canonical spectral triple on the classical two-sphere $S^2$.

Next, we compute the spectrum of the Dirac operator. We shall use the identities
$$\begin{aligned}
L_+L_- &= qEFK^{-2} = q\left(C_q + \frac14 - \frac{q^{-1}K^{2} - 2 + qK^{-2}}{(q - q^{-1})^2}\right)K^{-2},\\
L_-L_+ &= q^{-1}FEK^{-2} = q^{-1}\left(C_q + \frac14 - \frac{qK^{2} - 2 + q^{-1}K^{-2}}{(q - q^{-1})^2}\right)K^{-2},
\end{aligned} \tag{5.11}$$
each obtained using the expression (3.6) for the quantum Casimir $C_q$. Moreover, we know from (3.13) that for all $\psi_\pm \in S_\pm$ we have
$$K^{2}\psi_\pm = q^{\pm 1}\psi_\pm, \qquad K^{-2}\psi_\pm = q^{\mp 1}\psi_\pm. \tag{5.12}$$
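A quick numerical sanity check of how (5.11) and (5.12) combine is sketched below. It assumes that $C_q$ and $K^2$ act as scalars on the vector in question, so both are replaced by plain numbers `c` and `k2`; the function names are illustrative only.

```python
def l_plus_l_minus(c, k2, q):
    """Scalar form of L_+ L_- in (5.11), with C_q -> c and K^2 -> k2."""
    nu = q - 1.0 / q
    return q * (c + 0.25 - (k2 / q - 2.0 + q / k2) / nu**2) / k2


def l_minus_l_plus(c, k2, q):
    """Scalar form of L_- L_+ in (5.11), with C_q -> c and K^2 -> k2."""
    nu = q - 1.0 / q
    return (1.0 / q) * (c + 0.25 - (q * k2 - 2.0 + 1.0 / (q * k2)) / nu**2) / k2


def reduces_to_casimir(c, q, tol=1e-12):
    """Check that both products collapse to c + 1/4 on S_+ and S_- respectively."""
    on_s_plus = l_plus_l_minus(c, q, q)          # K^2 = q on S_+
    on_s_minus = l_minus_l_plus(c, 1.0 / q, q)   # K^2 = 1/q on S_-
    return abs(on_s_plus - (c + 0.25)) < tol and abs(on_s_minus - (c + 0.25)) < tol
```

On $S_+$ the bracket $q^{-1}K^2 - 2 + qK^{-2}$ vanishes, and likewise $qK^2 - 2 + q^{-1}K^{-2}$ on $S_-$, so only $C_q + \tfrac14$ survives in each case.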
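These facts lead to the following result.

Proposition 5.4. The Dirac operator $D$ obeys
$$D^2 = q^{-2}\nu^4\left(C_q + \tfrac14 - [\tfrac12]^2\right)^2 + C_q + \tfrac14,$$
where $C_q$ is the quantum Casimir.

Proof. Using the Pauli relations (5.8) one computes that, for $\psi = (\psi_+\ \psi_-)^{\mathrm{tr}} \in S$,
$$D^2\psi = D_0^2\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\psi + D_+D_-\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\psi + D_-D_+\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}\psi. \tag{5.13}$$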
The crucial fact in this calculation is that $D_0$ is a function of the Casimir $C_q$ and therefore commutes with $D_\pm$. Next, using the relations (5.11) and (5.12) we find
$$D_\pm D_\mp\psi_\pm = \left(C_q + \tfrac14\right)\psi_\pm$$
for each $\psi_\pm \in S_\pm$. Furthermore, we have that
$$D_0^2 = q^{-2}\nu^4\left(C_q + \tfrac14 - [\tfrac12]^2\right)^2.$$
Substituting these expressions into (5.13) yields the formula as claimed.

As an immediate consequence we obtain the spectrum of our Dirac operator $D$.

Corollary 5.5. The Dirac operator $D$ defined in (5.10) has spectrum
$$\mathrm{Spec}(D) = \left\{ \pm\left(q^{-2}\nu^4[j]^2[j+1]^2 + [j + \tfrac12]^2\right)^{1/2} \;\middle|\; j \in \mathbb{N} + \tfrac12 \right\}$$
with multiplicities $2j + 1$.

Proof. The eigenvalues of $C_q$ are given in (3.8): each $|j, m\rangle_\pm$ is an eigenvector with eigenvalue $[j + \tfrac12]^2 - \tfrac14$, whence the multiplicity of the $j$th eigenvalue is $2(2j + 1)$. From the expression for $D^2$ in Proposition 5.4, we read off its eigenvalues using those for $C_q$, yielding
$$\mathrm{Spec}(D^2) = \left\{ \lambda_j := q^{-2}\nu^4[j]^2[j+1]^2 + [j + \tfrac12]^2 \;\middle|\; j \in \mathbb{N} + \tfrac12 \right\}, \tag{5.14}$$
each having multiplicity $2(2j + 1)$. Here we have used the identity $[j + \tfrac12]^2 - [\tfrac12]^2 = [j][j + 1]$. The eigenvalues of $D$ are therefore just $\pm\lambda_j^{1/2}$ with multiplicities $2j + 1$.

By inspection, we see that the eigenvalues of $|D|$ grow not faster than $q^{-2j}$ for large $j$, in contrast with the Dirac operator of [4], whose eigenvalues diverge not faster than $q^{-j}$. It is the extra term $D_0$ which accounts for this behavior. This result immediately gives us an expression for $D$ in terms of an orthonormal basis of eigenspinors $|j, m; \uparrow\rangle$, $|j, m; \downarrow\rangle$ defined by
$$D|j, m; \uparrow\rangle = \mu_j|j, m; \uparrow\rangle, \qquad D|j, m; \downarrow\rangle = -\mu_j|j, m; \downarrow\rangle$$
with eigenvalues
$$\mu_j := \left(q^{-2}\nu^4[j]^2[j+1]^2 + [j + \tfrac12]^2\right)^{1/2}. \tag{5.15}$$
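The spectrum can be tabulated numerically. The sketch below assumes $0 < q < 1$, the usual q-number $[x] = (q^x - q^{-x})/(q - q^{-1})$ and $\nu = q - q^{-1}$ (consistent with the factor $(q - q^{-1})^2$ in (5.9)); the function names are illustrative.

```python
import math


def qnum(x, q):
    """q-number [x] = (q^x - q^-x) / (q - q^-1)."""
    return (q**x - q**(-x)) / (q - 1.0 / q)


def mu(j, q):
    """Positive eigenvalue mu_j of D, read off from (5.15)."""
    nu = q - 1.0 / q
    diagonal = nu**2 * qnum(j, q) * qnum(j + 1, q) / q   # q^-1 nu^2 [j][j+1]
    off_diagonal = qnum(j + 0.5, q)                      # [j + 1/2]
    return math.hypot(diagonal, off_diagonal)


def spectrum(q, n_levels):
    """List (j, mu_j, multiplicity of +mu_j) for j = 1/2, 3/2, ..."""
    return [(k + 0.5, mu(k + 0.5, q), 2 * k + 2) for k in range(n_levels)]


def casimir_identity_gap(j, q):
    """Deviation from the identity [j+1/2]^2 - [1/2]^2 = [j][j+1]."""
    return qnum(j + 0.5, q) ** 2 - qnum(0.5, q) ** 2 - qnum(j, q) * qnum(j + 1, q)
```

For large $j$ the product $\mu_j q^{2j}$ settles near $q^{-2}$, which is the $q^{-2j}$ growth mentioned above.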
To proceed further, it will be necessary to have an explicit description of these eigenspinors in terms of the basic spinors $|j, m\rangle_\pm$. By evaluating the actions of $D_\pm$, $D_0$ on $S$ one finds that the Dirac operator is
$$D|j, m\rangle_\pm = \pm q^{-1}\nu^2[j][j+1]\,|j, m\rangle_\pm + [j + \tfrac12]\,|j, m\rangle_\mp, \tag{5.16}$$
the first term corresponding to the action of $D_0\sigma_0$, the second to the action of $D_\pm\sigma_\pm$. Knowing the eigenvalues of $D$, we find the corresponding eigenspinors to be
$$\begin{aligned}
|j, m; \uparrow\rangle &:= \frac{1}{\sqrt{2\mu_j}}\big(-\zeta_j^+|j, m\rangle_+ - \zeta_j^-|j, m\rangle_-\big),\\
|j, m; \downarrow\rangle &:= \frac{1}{\sqrt{2\mu_j}}\big(-\zeta_j^-|j, m\rangle_+ + \zeta_j^+|j, m\rangle_-\big),
\end{aligned} \tag{5.17}$$
for $m = -j, -j + 1, \dots, j - 1, j$ and $j \in \mathbb{N} + \tfrac12$, where we have written
$$\zeta_j^+ = \big(\mu_j + q^{-1}\nu^2[j][j+1]\big)^{1/2}, \qquad \zeta_j^- = \big(\mu_j - q^{-1}\nu^2[j][j+1]\big)^{1/2}. \tag{5.18}$$
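The eigenspinors can be checked numerically on each two-dimensional block. The sketch below assumes the normalised form in which the coefficients on $|j, m\rangle_\pm$ are $\sqrt{(\mu_j \pm a_j)/(2\mu_j)}$ up to sign, with $a_j = q^{-1}\nu^2[j][j+1]$, together with $0 < q < 1$ and $\nu = q - q^{-1}$; all function names are illustrative.

```python
import math


def qnum(x, q):
    """q-number [x] = (q^x - q^-x) / (q - q^-1)."""
    return (q**x - q**(-x)) / (q - 1.0 / q)


def dirac_block(j, q):
    """Entries (a, b) of the 2x2 block [[a, b], [b, -a]] of D from (5.16)."""
    nu = q - 1.0 / q
    a = nu**2 * qnum(j, q) * qnum(j + 1, q) / q
    b = qnum(j + 0.5, q)
    return a, b


def eigenspinors(j, q):
    """Return (mu_j, up, down); up/down are coefficient pairs on (|+>, |->)."""
    a, b = dirac_block(j, q)
    mu = math.hypot(a, b)
    root_plus = math.sqrt(mu + a)
    root_minus = b / root_plus        # equals sqrt(mu - a), avoids cancellation
    norm = math.sqrt(2.0 * mu)
    up = (-root_plus / norm, -root_minus / norm)
    down = (-root_minus / norm, root_plus / norm)
    return mu, up, down


def residual(j, q):
    """Largest deviation from D up = mu up, D down = -mu down and orthonormality."""
    a, b = dirac_block(j, q)
    mu, up, down = eigenspinors(j, q)

    def apply_d(v):
        return (a * v[0] + b * v[1], b * v[0] - a * v[1])

    d_up, d_down = apply_d(up), apply_d(down)
    return max(
        abs(d_up[0] - mu * up[0]), abs(d_up[1] - mu * up[1]),
        abs(d_down[0] + mu * down[0]), abs(d_down[1] + mu * down[1]),
        abs(up[0] ** 2 + up[1] ** 2 - 1.0), abs(down[0] ** 2 + down[1] ** 2 - 1.0),
        abs(up[0] * down[0] + up[1] * down[1]),
    )
```

The two coefficient pairs are the rows of a symmetric orthogonal matrix, so the same check covers the change of basis on each block.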
On the two-dimensional subspace $V_{j,m}$ spanned by $|j, m\rangle_+$, $|j, m\rangle_-$ for fixed values of $j$, $m$, the operator which diagonalizes $D$ is just the orthogonal matrix
$$W_j := \frac{1}{\sqrt{2\mu_j}}\begin{pmatrix} -\zeta_j^+ & -\zeta_j^- \\ -\zeta_j^- & \zeta_j^+ \end{pmatrix}. \tag{5.19}$$
We write $W: \mathcal{H} \to \mathcal{H}$ for the closure of the operator defined by the matrices $W_j$, $j \in \mathbb{N} + \tfrac12$.

5.3. Spectral properties of $S_q^2$

We now show that the datum $(\mathcal{A}[S_q^2], \mathcal{H}, D)$ fulfils the conditions required of a spectral triple, which we then equip with a real structure in the sense of Definition 5.2.

Theorem 5.6. The datum $(\mathcal{A}[S_q^2], \mathcal{H}, D)$ constitutes a unital spectral triple over the sphere $S_q^2$ with metric dimension zero.

Proof. For each $a \in \mathcal{A}[S_q^2]$ the commutator $[D, a]$ acts on $S$ by multiplication operators and is therefore itself a bounded operator. In fact, for the summand $D_+\sigma_+ + D_-\sigma_-$ this goes as in [4], whereas for the term $D_0$ one gets multiplication by $L_0 a$, which belongs to $\mathcal{A}[S_q^2]$ itself. The operator $D$ clearly satisfies $D = D^*$ on the dense domain $S$ of $\mathcal{H}$. From Corollary 5.5 it is clear that the only accumulation points of the spectrum of $D$ are at infinity, so the resolvent of $D$ is compact. Since the eigenvalues of $D$ grow exponentially with $j \in \mathbb{N} + \tfrac12$, the metric dimension is just zero.
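The exponential growth of the eigenvalues can be illustrated with a heuristic proxy for summability: the sums $\sum_j 2(2j+1)\,\mu_j^{-s}$ stabilise for a fixed $s > 0$. The sketch below is not the Dixmier-ideal criterion of Definition 5.1; it assumes $0 < q < 1$ and $\nu = q - q^{-1}$, and the name `zeta_partial_sum` is illustrative.

```python
import math


def qnum(x, q):
    """q-number [x] = (q^x - q^-x) / (q - q^-1)."""
    return (q**x - q**(-x)) / (q - 1.0 / q)


def mu(j, q):
    """Positive eigenvalue mu_j of D, as in Corollary 5.5."""
    nu = q - 1.0 / q
    return math.hypot(nu**2 * qnum(j, q) * qnum(j + 1, q) / q, qnum(j + 0.5, q))


def zeta_partial_sum(s, q, n_levels):
    """Sum of 2(2j+1) * mu_j^(-s) over j = 1/2, 3/2, ... (n_levels terms)."""
    total = 0.0
    for k in range(n_levels):
        j = k + 0.5
        total += 2.0 * (2.0 * j + 1.0) * mu(j, q) ** (-s)
    return total
```

Since $\mu_j$ behaves like $q^{-2j-2}$, the terms decay geometrically for every $s > 0$, although more slowly as $s$ decreases.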
Proposition 5.7. With the $\mathbb{Z}_2$-grading $\Gamma: \mathcal{H} \to \mathcal{H}$ defined by
$$\Gamma|j, m; \uparrow\rangle := |j, m; \downarrow\rangle, \qquad \Gamma|j, m; \downarrow\rangle := |j, m; \uparrow\rangle$$
on the orthonormal basis (5.15) and extended by $\mathcal{A}[S_q^2]$-linearity, the datum $(\mathcal{A}[S_q^2], \mathcal{H}, D, \Gamma)$ constitutes an even spectral triple.

Proof. It is obvious that $\Gamma^2 = 1$ and $\Gamma = \Gamma^*$. The property $\Gamma D + D\Gamma = 0$ follows from the fact that $\Gamma$ interchanges the $+\mu_j$ and $-\mu_j$ eigenspaces of $D$, as may be verified directly on the basis vectors (5.15).

Next we turn to a real structure. Since we have made the same choice for the spinors as in [4], it is tempting to take the same real structure as well. However, one quickly finds that this choice is unsuitable, since it neither commutes nor anti-commutes with our Dirac operator $D$. The reason for this lies mainly in the fact that the term $D_0$ in our Dirac operator (5.10) is proportional to the Casimir operator, which is rather a “second order differential operator”, if anything. Instead, we define an anti-unitary operator $J: \mathcal{H} \to \mathcal{H}$ in terms of its action on the orthonormal basis (5.15) by
$$J|j, m; \uparrow\rangle = (-1)^{m+1/2}|j, -m; \uparrow\rangle, \qquad J|j, m; \downarrow\rangle = (-1)^{m+1/2}|j, -m; \downarrow\rangle$$
and seek to show that this $J$ equips the datum $(\mathcal{A}[S_q^2], \mathcal{H}, D, \Gamma)$ with a real structure. It is not difficult to check that the $J$ above is equivariant under the right action of $U_q(su(2))$ on $\mathcal{H}$, making it a particularly natural choice.

Proposition 5.8. The operator $J$ satisfies $J^2 = -1$, $DJ = JD$ and $\Gamma J = -J\Gamma$.

Proof. The fact that $J^2 = -1$ is immediate. We find that
$$\begin{aligned}
(DJ - JD)|j, m; \uparrow\rangle &= (-1)^{m+1/2}D|j, -m; \uparrow\rangle - \mu_j J|j, m; \uparrow\rangle\\
&= (-1)^{m+1/2}\mu_j|j, -m; \uparrow\rangle - (-1)^{m+1/2}\mu_j|j, -m; \uparrow\rangle = 0,\\
(J\Gamma + \Gamma J)|j, m; \uparrow\rangle &= J|j, m; \downarrow\rangle - (-1)^{m+1/2}\Gamma|j, -m; \uparrow\rangle\\
&= (-1)^{m+1/2}|j, -m; \downarrow\rangle - (-1)^{m+1/2}|j, -m; \downarrow\rangle = 0,
\end{aligned}$$
where we have used anti-linearity of $J$. Similar computations hold on $|j, m; \downarrow\rangle$.

Aiming at (modified) commutant and first order conditions as in Definition 5.2, and having in mind the strategy of [7], we denote by $L_q$ the positive trace-class operator defined by
$$L_q|j, m\rangle_\pm := q^{j}|j, m\rangle_\pm, \qquad j \in \mathbb{N} + \tfrac12,$$
on $\mathcal{H}$ and let $\mathcal{K}_q$ be the two-sided ideal of $\mathcal{B}(\mathcal{H})$ generated by the operators $L_q$. The ideal $\mathcal{K}_q$ is an ideal of infinitesimals of arbitrarily high order and so we take
$\mathcal{I} = \mathcal{K}_q$ as our operator ideal in Definition 5.2. Thus, to prove that $J$ defines a real structure, it remains to check that the commutant property and first order condition in (5.3) are satisfied.

The strategy of [7] is based on the fact that the operators $\pi(x_i)$, $i = -1, 0, 1$, can be “approximated” by operators acting diagonally on the Hilbert space of spinors. Specifically, these operators $z_i$, $i = -1, 0, 1$, on $\mathcal{H}$ are defined by
$$z_i|j, m\rangle_\pm = \alpha_i^-(j, m; 0)\,|j - 1, m + i\rangle_\pm + \alpha_i^0(j, m; 0)\,|j, m + i\rangle_\pm + \alpha_i^+(j, m; 0)\,|j + 1, m + i\rangle_\pm. \tag{5.20}$$
The coefficients are exactly the ones used in (5.7), unless $|m + i| > j + \nu$ for $\nu = -1, 0, 1$, in which case we set $\alpha_i^\nu(j, m; 0) = 0$. We shall show momentarily that the operators $z_i$ approximate the operators $\pi(x_i)$ modulo the ideal $\mathcal{K}_q$, but to do this we first need the following technical lemma.

Lemma 5.9. With $W_j$, $j \in \mathbb{N} + \tfrac12$, the operators in (5.19), there exists a constant $C$ (independent of $j$) such that
$$\|W_j W_{j+1}^* - 1\| < Cq^{j}$$
for all $j \in \mathbb{N} + \tfrac12$.

Proof. One evaluates the norm $\|W_j W_{j+1}^* - 1\|$ by computing the eigenvalues of the $2 \times 2$ matrix $W_j W_{j+1}^* - 1$ and choosing the larger of the two, finding it to be
$$\|W_j W_{j+1}^* - 1\| = \frac{\zeta_j^+\zeta_{j+1}^+ + \zeta_j^-\zeta_{j+1}^- - \zeta_j^-\zeta_{j+1}^+ + \zeta_j^+\zeta_{j+1}^- - 2\sqrt{\mu_j\mu_{j+1}}}{2\sqrt{\mu_j\mu_{j+1}}}.$$
Using the inequalities $[j] < (q^{-1} - q)^{-1}q^{-j}$ and $[j]^{-1} < q^{j-1}$, elementary estimates for each of the terms in this expression yield that $\zeta_j^\pm < C'q^{-j}$ and $\sqrt{\mu_j\mu_{j+1}} < C''q^{-2j}$ for real constants $C'$, $C''$, so it appears at first glance that the above norm has an $O(1)$ behavior. However, a more detailed analysis shows that the coefficient of $q^{-2j}$ in the numerator is in fact zero; the behavior of the numerator is therefore $O(q^{-j})$ and we have our result.

Proposition 5.10. There exist bounded operators $A_i$, $B_i$, $i = -1, 0, 1$, such that
$$\pi(x_i) - z_i = A_i L_q = L_q B_i$$
when acting upon the basis vectors $|j, m; \uparrow\downarrow\rangle$. In particular, $\pi(x_i) - z_i \in \mathcal{K}_q$ for $i = -1, 0, 1$.

Proof. From [7, Lemma 4.4], there exist bounded operators $A_i$, $B_i$, $i = -1, 0, 1$, such that
$$\pi(x_i) - z_i = A_i L_q = L_q B_i$$
with respect to the basis $|j, m\rangle_\pm$ of $\mathcal{H}$, and so the operators $\pi(x_i)$ are approximated by the operators $z_i$ modulo the ideal $\mathcal{K}_q$ of infinitesimals. We need to check that using the operator $W$ to change the basis vectors from $|j, m\rangle_\pm$ to $|j, m; \uparrow\downarrow\rangle$ does not spoil this approximation property. Evaluating $W_j z_i W_j^* - z_i$ on $|j, m; \uparrow\downarrow\rangle$ gives
$$(W_j z_i W_j^* - z_i)|j, m; \uparrow\downarrow\rangle = \alpha_i^-(j, m; 0)\,(W_{j-1}W_j^* - 1)\,|j - 1, m + i; \uparrow\downarrow\rangle + \alpha_i^+(j, m; 0)\,(W_j W_{j+1}^* - 1)\,|j + 1, m + i; \uparrow\downarrow\rangle.$$
This and Lemma 5.9 yield that $W_j z_i W_j^* - z_i \in \mathcal{K}_q$ for all $i = -1, 0, 1$ and all $j \in \mathbb{N} + \tfrac12$.

As a consequence, we immediately get the commutant property, the first of the two conditions in (5.3).

Proposition 5.11. For all $a, b \in \mathcal{A}[S_q^2]$ we have $[\pi(a), J\pi(b)J^{-1}] \in \mathcal{K}_q$.

Proof. From the derivation property of commutators, it suffices to check this only for the generators $x_{-1}$, $x_0$, $x_1$ of $\mathcal{A}[S_q^2]$. With the operators $z_{-1}$, $z_0$, $z_1$ defined in (5.20), we have
$$J z_k J^{-1}|j, m\rangle_\pm = (-1)^k\big(\alpha_k^-(j, -m; 0)\,|j - 1, m - k\rangle_\pm + \alpha_k^0(j, -m; 0)\,|j, m - k\rangle_\pm + \alpha_k^+(j, -m; 0)\,|j + 1, m - k\rangle_\pm\big). \tag{5.21}$$
Using this, one computes as in [7, Lemma 6.2] that
$$[z_i, J z_k J^{-1}] = 0, \qquad i, k = -1, 0, 1. \tag{5.22}$$
It is straightforward to check that
$$[\pi(x_i), J\pi(x_k)J^{-1}] = [\pi(x_i) - z_i, J\pi(x_k)J^{-1}] + [z_i, J(\pi(x_k) - z_k)J^{-1}] + [z_i, J z_k J^{-1}],$$
whence the assertion follows from Proposition 5.10.

We are now ready for our main theorem regarding the differential structure of $S_q^2$.

Theorem 5.12. The datum $(\mathcal{A}[S_q^2], \mathcal{H}, D, \Gamma, J)$ constitutes a real even unital spectral triple (up to infinitesimals) with KO-dimension equal to two.

Proof. Having already established Propositions 5.8 and 5.11, it remains to verify the first order condition for $D$, namely that $[[D, a], JaJ^{-1}] \in \mathcal{K}_q$ for all $a \in \mathcal{A}[S_q^2]$. For this, we split the Dirac operator into two pieces, $D = D_\Delta + D_\Omega$, where $D_\Delta = D_0\sigma_0$ and $D_\Omega = D_-\sigma_- + D_+\sigma_+$. By linearity it suffices to check the first order condition for $D_\Delta$ and $D_\Omega$ individually.
Since $D_0$ is a function of the Casimir, each $a \in \mathcal{A}[S_q^2]$ is an eigenfunction for the derivation $[D_\Delta, \cdot\,]$, whence the first order condition for $D_\Delta$ follows immediately from the commutant property in Proposition 5.11. On the other hand, the component $D_\Omega$ has eigenvalues $\pm\gamma_j$, $\gamma_j := [j + \tfrac12]$, whose growth with $j$ obeys $\gamma_j < Cq^{-j}$ for $C$ a real constant (as already mentioned, $D_\Omega$ is precisely the Dirac operator considered in [4]). It is easy to compute that
$$[D_\Omega, z_i]|j, m\rangle_\pm = (\gamma_{j-1} - \gamma_j)\,\alpha_i^-(j, m; 0)\,|j - 1, m + i\rangle_\mp + (\gamma_{j+1} - \gamma_j)\,\alpha_i^+(j, m; 0)\,|j + 1, m + i\rangle_\mp.$$
Using this expression, together with (5.21), one calculates the action of the commutators $[[D_\Omega, z_i], J z_k J^{-1}]$ for $i, k = -1, 0, 1$ and finds them to be a sum of five independent weighted shift operators with weights $S_{i,k}^\nu(j, m)$, $\nu = -2, \dots, 2$, i.e.
$$[[D_\Omega, z_i], J z_k J^{-1}]|j, m\rangle_\pm = \sum_{\nu=-2}^{2} S_{i,k}^\nu(j, m)\,|j + \nu, m + i - k\rangle_\mp.$$
These weights $S_{i,k}^\nu(j, m)$ are estimated using exactly the same method as in [7, Proposition 6.5]. In our case, the growth condition for $\gamma_j$ is sufficient to guarantee that $|S_{i,k}^\nu(j, m)| < C'q^{j}$ for some real constant $C'$. We conclude that $[[D_\Omega, z_i], J z_k J^{-1}] \in \mathcal{K}_q$ for all $i, k = -1, 0, 1$. Since the $z_i$ approximate the operators $\pi(x_i)$ modulo $\mathcal{K}_q$, the proof is complete.

Acknowledgments

Both authors were partially supported by the Italian Project “Cofin08–Noncommutative Geometry, Quantum Groups and Applications”. SB is grateful to INdAM–GNSAGA for support and the Department of Mathematics at the University of Trieste for its hospitality. We thank Francesco D’Andrea for very useful comments.

References

[1] T. Brzeziński and S. Majid, Quantum group gauge theory on quantum spaces, Comm. Math. Phys. 157 (1993) 591–638; Erratum, ibid. 167 (1995) 235.
[2] A. Connes, Noncommutative Geometry (Academic Press, 1994).
[3] A. Connes, Gravity coupled with matter and the foundation of noncommutative geometry, Comm. Math. Phys. 182 (1996) 155–176.
[4] L. Dąbrowski and A. Sitarz, Dirac operator on the standard Podleś quantum sphere, in Noncommutative Geometry and Quantum Groups (Warsaw, 2001), Banach Center Publ., Vol. 61 (Polish Acad. Sci., Warsaw, 2003), pp. 49–58.
[5] L. Dąbrowski, G. Landi, M. Paschke and A. Sitarz, The spectral geometry of the equatorial Podleś sphere, C. R. Math. Acad. Sci. Paris 340 (2005) 819–822.
[6] L. Dąbrowski, G. Landi, A. Sitarz, W. D. van Suijlekom and J. C. Várilly, The Dirac operator on SUq(2), Comm. Math. Phys. 259 (2005) 729–759.
[7] L. Dąbrowski, F. D’Andrea, G. Landi and E. Wagner, Dirac operators on all Podleś spheres, J. Noncommut. Geom. 1 (2007) 213–239.
[8] F. D’Andrea, L. Dąbrowski and G. Landi, The isospectral Dirac operator on the 4-dimensional orthogonal quantum sphere, Comm. Math. Phys. 279 (2008) 77–116.
[9] M. Đurđević, Geometry of quantum principal bundles. I, Comm. Math. Phys. 175 (1996) 457–520.
[10] M. Đurđević, Geometry of quantum principal bundles. II, Rev. Math. Phys. 9 (1997) 531–607.
[11] A. Klimyk and K. Schmüdgen, Quantum Groups and Their Representations (Springer Verlag, Berlin Heidelberg, 1997).
[12] G. Landi and A. Zampini, in preparation.
[13] S. Majid, Quantum and braided group Riemannian geometry, J. Geom. Phys. 30 (1999) 113–146.
[14] S. Majid, Noncommutative Riemannian and spin geometry of the standard q-sphere, Comm. Math. Phys. 256 (2005) 255–285.
[15] T. Masuda, K. Mimachi, Y. Nakagami, M. Noumi and K. Ueno, Representations of the quantum group SUq(2) and the little q-Jacobi polynomials, J. Funct. Anal. 99 (1991) 357–387.
[16] P. Podleś, Quantum spheres, Lett. Math. Phys. 14 (1987) 193–202.
[17] P. Podleś, Differential calculus on quantum spheres, Lett. Math. Phys. 18 (1989) 107–119.
[18] P. Podleś, The classification of differential structures on quantum two-spheres, Comm. Math. Phys. 150 (1992) 167–179.
[19] K. Schmüdgen, Commutator representations of differential calculi on the quantum group SUq(2), J. Geom. Phys. 31 (1999) 241–264.
[20] K. Schmüdgen and E. Wagner, Dirac operator and a twisted cyclic cocycle on the standard Podleś quantum sphere, J. Reine Angew. Math. 574 (2004) 219–235.
[21] K. Schmüdgen and E. Wagner, Representations of crossed product algebras of Podleś quantum spheres, J. Lie Theory 17 (2007) 751–790.
[22] S. L. Woronowicz, Differential calculus on compact matrix pseudogroups (quantum groups), Comm. Math. Phys. 122 (1989) 125–170.
October 12, J070-S0129055X10004120
2010 10:1 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 22, No. 9 (2010) 995–1032 © World Scientific Publishing Company DOI: 10.1142/S0129055X10004120
BOREL SUMMABILITY OF $\varphi^4_4$ PLANAR THEORY VIA MULTISCALE ANALYSIS

MARCELLO PORTA∗ and SERGIO SIMONELLA†
∗Dipartimento di Fisica, Università di Roma “Sapienza”, Piazzale Aldo Moro 5, 00185 Roma, Italy
†Dipartimento di Matematica, Università di Roma “Sapienza”, Piazzale Aldo Moro 5, 00185 Roma, Italy
∗[email protected] †[email protected]
Received 23 March 2010

We review the issue of Borel summability in the framework of multiscale analysis and renormalization group, by discussing a proof of Borel summability of the $\varphi^4_4$ massive Euclidean planar theory; this result is not new, since it was obtained by Rivasseau and ’t Hooft. However, the techniques that we use have already proved effective in the analysis of various models of condensed matter and field theory; therefore, we take the $\varphi^4_4$ planar theory as a toy model for future applications.

Keywords: Borel summability; $\varphi^4_4$ theory; renormalization group.
Mathematics Subject Classification 2010: 81T08, 81T17, 40G10
1. Introduction

The problem of giving a meaning to the formal perturbative series defining the scalar $\varphi^4_4$ theory, the simplest four-dimensional interacting field theory, has been much debated (see [7] for a critical introduction to the problem) and is still wide open, although several triviality conjectures have been proposed since the work of Landau [1]. Here we focus on the planar restriction of the full perturbative series; that is, we consider only the graphs that can be drawn on a sheet of paper without ever crossing lines in points where no interacting vertices are present. This problem is much easier than the complete case, since the number of topological Feynman graphs contributing to a given order $n$ is much smaller than the original $n!$. In fact, in the planar theory this number is bounded by $(\mathrm{const.})^n$, see [10, 11]. Still, the problem is far from being trivial, since the theory needs to be renormalized; this can be done using renormalization group, see [6, 12, 13, 25], for instance.
It is well known that the full $\varphi^4_4$ with the “wrong” sign of the renormalized coupling constant, that is the one corresponding to an unstable self-interaction potential, is perturbatively asymptotically free, in the sense that truncating the beta function to a finite order the running coupling constant describing the interaction of the fields at energy scale $\mu$ flows to zero in the ultraviolet as $(\log\mu)^{-1}$. This fact does not have any direct physical interpretation in the full $\varphi^4_4$, since the theory is not defined for the considered value of the renormalized coupling constant. Moreover, the beta function itself is not defined, because of the factorial growth of the number of topological Feynman graphs in the order of the series. However, these problems do not affect the planar theory, since it is only defined perturbatively and the number of graphs at a given order is far smaller than the original $n!$. Therefore, one can hope in this case to exploit asymptotic freedom to rigorously construct the theory. This has been done independently by Rivasseau and ’t Hooft using quite different methods, see [2–5]; indeed, they proved that the renormalized perturbative series defining the Schwinger functions, which are the result of various resummations, are absolutely convergent. In particular, they proved that the result is the Borel sum of the perturbative series in the renormalized coupling constant. This last fact means in particular that the Schwinger functions can be expressed to an arbitrary accuracy starting from their perturbative series in the renormalized coupling constant, following a well-defined prescription; moreover, the result is unique within a certain class of functions, the Borel summable ones. Clearly, this does not exclude the existence of other less regular solutions with the same formal perturbative expansion.
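The notion of a Borel sum can be illustrated on a textbook example unrelated to the field theory itself: the divergent series $\sum_n (-1)^n n!\,\lambda^n$, whose Borel sum is $\int_0^\infty e^{-t}/(1 + \lambda t)\,dt$ for $\lambda > 0$. The sketch below is a toy model under that assumption; the integral is approximated by a composite Simpson rule on a truncated interval, and the names `borel_sum` and `partial_sum` are illustrative.

```python
import math


def borel_sum(lam, t_max=60.0, steps=60000):
    """Simpson approximation of the Borel sum  int_0^inf exp(-t) / (1 + lam t) dt.

    The integrand is cut off at t_max, where exp(-t) is already negligible.
    `steps` must be even.
    """
    h = t_max / steps

    def integrand(t):
        return math.exp(-t) / (1.0 + lam * t)

    total = integrand(0.0) + integrand(t_max)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return total * h / 3.0


def partial_sum(lam, order):
    """Sum of the first `order` terms of the formal series sum_n (-1)^n n! lam^n."""
    return sum((-1) ** n * math.factorial(n) * lam**n for n in range(order))


def remainder_within_bound(lam, order):
    """Check the factorial remainder bound |f - partial sum| <= order! * lam^order."""
    gap = abs(borel_sum(lam) - partial_sum(lam, order))
    return gap <= math.factorial(order) * lam**order + 1e-9
```

The partial sums never converge, yet each one approximates the integral with an error controlled by the first omitted term, which is the kind of factorial remainder estimate that Borel summability criteria are built on.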
At the time of those works, besides the possibility of giving a mathematically rigorous meaning to a simple quantum field theory, the physical motivation of the study was that the $\varphi^4_4$ planar theory is formally equal to the limit $N \to \infty$ of a massive $SU(N)$ theory in four dimensions, with interaction $\lambda\,\mathrm{Tr}\,\varphi^4$ where $\varphi$ is an $N \times N$ matrix, see [3, 11]. In particular, in ’t Hooft’s work the planar approximation was seen as a first step towards the more ambitious study of QCD with a large number of colors.

In this paper, we review the issue of Borel summability of the $\varphi^4_4$ planar theory using the rigorous renormalization group techniques introduced in [6, 12, 13] (in [6, 13] the flow of the running coupling constants of the planar theory was heuristically discussed), which make possible a transparent proof of the ultraviolet stability of the massive Euclidean $\varphi^4_4$ theory, through the so-called “$n!$ bounds”. One of the motivations of our work lies in the fact that very few proofs of Borel summability based on renormalization group methods are present in the literature [8, 9]. Moreover, we take the $\varphi^4_4$ planar theory as a first step towards the study of physically more interesting models, which can be analyzed by similar techniques. As mentioned before, the great gain that one has in the planar restriction of the full $\varphi^4_4$ theory is that the topological Feynman graphs of a given order $n$ are far fewer (their number is bounded as $(\mathrm{const.})^n$, against the $n!$ of the full case). This is in a sense reminiscent of what happens in fermionic field theories, where it is possible
October 12, J070-S0129055X10004120
2010 10:1 WSPC/S0129-055X
148-RMP
Borel Summability of ϕ44 Planar Theory via Multiscale Analysis
to control the factorial growth of the number of Feynman graphs by exploiting the $-1$ arising in the anticommutation of the fields, showing that the $n$th order of the series, which is given by $n!$ addends, reconstructs the determinant of an $n \times n$ matrix, which is estimated by $(\mathrm{const.})^n$. For instance, we think that the methods described in this paper could be useful to prove Borel summability for the one-dimensional Hubbard model, where one sector of the theory is asymptotically free, while to control the flow of the other running coupling constants one has to prove that the beta function is vanishing, [21]. This model has been rigorously constructed in [15] using renormalization group methods similar to those used here, but a proof of Borel summability has not been given yet.
Informally, our main result can be stated as follows; we refer the reader to Sec. 3, Theorem 1, for a precise formulation.
Main result. The Schwinger functions of the Euclidean massive planar $\varphi^4_4$ theory are Borel summable in the renormalized coupling constant; in particular, they satisfy the hypotheses of the Nevanlinna–Sokal theorem [14], which are sufficient conditions for Borel summability.
Roughly speaking, our proof goes as follows. First, by choosing the renormalized coupling constant in a suitable complex domain, we prove that the flow equation defining recursively the running coupling constants at all energy scales admits a bounded solution which falls into the radius of convergence of the Schwinger functions, and verifies some special regularity properties. To do that, we use a fixed point argument, similar to the one introduced by ’t Hooft in [2].
Then, to conclude the check of the hypotheses of the Nevanlinna–Sokal theorem on Borel summability, we show that it is possible to “undo” the resummation that allowed us to write the Schwinger functions as power series in the running coupling constants, so that the $n$th order Taylor remainder in the renormalized coupling constant $\lambda$ can be bounded proportionally to $n!\,|\lambda|^{n+1}$ uniformly in the analyticity domain. To prove this second statement, we rely in a crucial way on the Gallavotti–Nicolò tree representation of the beta function; the “undoing” of the resummations, corresponding to rather involved analytical operations, is made clear by a graphical manipulation of these trees. This procedure is quite similar in spirit to what has been done by Rivasseau in [4]. Therefore, we feel that our proof lies halfway between those of Rivasseau and ’t Hooft. As mentioned above, in ’t Hooft's approach, which is based on renormalization group ideas, the flow of the beta function is studied in a way analogous to the one we follow. However, instead of deriving bounds on the remainder of the resummed perturbative series, ’t Hooft, see [2], concludes the proof of Borel summability by checking the analyticity properties of the Borel transform using a totally independent argument, which we have not been able to rigorously reproduce in our framework. As for the comparison with Rivasseau’s work, see [4], the main difference is that in his approach the beta function is not introduced: to construct
M. Porta & S. Simonella
the planar theory Rivasseau uses a “minimal” resummation procedure, involving only a certain class of Feynman graphs with four external legs, the parquet ones. This defines an asymptotically free “running coupling constant”, and it turns out to be enough to prove the finiteness of the planar theory. To conclude, Rivasseau shows that the result of these operations is the Borel sum of the nonrenormalized series, by proving an $n!$ bound on the Taylor remainder; this bound is obtained by undoing the resummation of the parquet subgraphs in a suitable way.
The paper is organized as follows. In Sec. 2, we define the model, we set the notations, we briefly review the ideas behind multiscale integration and we introduce the beta function and the flow of the running coupling constants; we refer the interested reader to [6, 12, 13] for a detailed introduction to these techniques. In Sec. 3, we state our main result and we discuss the strategy of the proof. Finally, in Sec. 4 and in the appendices cited therein, we prove the theorem.
2. Renormalization Group Analysis
In this section we describe the iterative procedure that allows one to express the Schwinger functions of the full $\varphi^4_4$ theory as power series, finite order by order in the ultraviolet limit, graphically represented in terms of renormalized Feynman graphs; at the same time, we define the planar $\varphi^4_4$ theory by considering at each step only the planar graphs. Our discussion will be quite short; we refer the reader to [6] for a detailed proof of the renormalizability of the $\varphi^4_4$ theory.
The full $\varphi^4_4$ theory. Let $\varphi^{(\le N)}_x$ be a massive gaussian free field with ultraviolet cut-off at length $\gamma^{-N}$, where $\gamma > 1$ is a fixed scale parameter, and $x \in \Lambda$ where $\Lambda$ is a four-dimensional box of side size $L$ with periodic boundary conditions; for simplicity, we set to 1 the value of the mass. We rewrite the field as:
\[
\varphi^{(\le N)}_x = \sum_{j=0}^{N} \varphi^{(j)}_x, \qquad x \in \Lambda, \tag{2.1}
\]
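where $\{\varphi^{(j)}\}_{j=0}^{N}$ are independent gaussian fields with propagators
\[
C^{(j)}_{x,y} := \int \frac{dp}{(2\pi)^4}\, \frac{f_j(p)}{p^2+1}\, e^{ip\cdot(x-y)}, \qquad
f_j(p) := \begin{cases} e^{-p^2/\gamma^{2j}} - e^{-p^2/\gamma^{2(j-1)}} & \text{if } j > 0, \\ e^{-p^2} & \text{if } j = 0, \end{cases} \tag{2.2}
\]
and $\int \frac{dp}{(2\pi)^4}$ is a shorthand for $|\Lambda|^{-1} \sum_{p = 2\pi n/L}$ with $n \in \mathbb{Z}^4$; notice that
\[
\lim_{N\to+\infty} \sum_{j=0}^{N} f_j(p) = 1. \tag{2.3}
\]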
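The identity (2.3) follows from the telescoping structure of the cutoff functions in (2.2): the partial sum up to scale $N$ equals $e^{-p^2/\gamma^{2N}}$. The following Python sketch checks this numerically; it is only an illustration, and the function names and the choice $\gamma = 2$ are assumptions made here, not part of the original construction.

```python
import math

def f(j, p2, gamma=2.0):
    """Cutoff function f_j of (2.2), evaluated at squared momentum p2 = p^2."""
    if j == 0:
        return math.exp(-p2)
    # Difference of two Gaussians supported around momenta of order gamma^j.
    return math.exp(-p2 / gamma ** (2 * j)) - math.exp(-p2 / gamma ** (2 * (j - 1)))

def partial_sum(N, p2, gamma=2.0):
    """Sum of f_j(p) for j = 0..N; by telescoping it equals exp(-p2 / gamma^(2N))."""
    return sum(f(j, p2, gamma) for j in range(N + 1))

if __name__ == "__main__":
    for N in (0, 5, 10, 40):
        print(N, partial_sum(N, 3.0), math.exp(-3.0 / 2.0 ** (2 * N)))
```

The two printed columns agree up to rounding, and both approach 1 as $N$ grows, which is the content of (2.3).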
The generating functional of the Schwinger functions of the $\varphi^4_4$ theory is given by:
\[
e^{W_N(\zeta f)} := \int \exp\Big\{ \zeta \int dx\, \varphi^{(\le N)}_x f_x \Big\}\, e^{V^{(N)}(\varphi^{(\le N)})}\, P(d\varphi^{(\le N)}), \tag{2.4}
\]
where $f_x$ is a Schwartz test function, $\zeta \in \mathbb{R}$, $P(d\varphi^{(\le N)}) := \prod_{j=0}^{N} P(d\varphi^{(j)})$ with $P(d\varphi^{(j)})$ the gaussian distribution of the field $\varphi^{(j)}$ with covariance given by (2.2), and the interaction $V^{(N)}$ is defined as
\[
V^{(N)}(\varphi^{(\le N)}) := \int_{\Lambda} dx \Big( \lambda_N :(\varphi^{(\le N)}_x)^4: + \alpha_N :(\partial\varphi^{(\le N)}_x)^2: + \mu_N :(\varphi^{(\le N)}_x)^2: + \nu_N \Big), \tag{2.5}
\]
where $\lambda_N$, $\alpha_N$, $\mu_N$, $\nu_N$ are called bare coupling constants, and the dots denote the Wick product of the fields (see [6, Appendix C]); notice that in our convention the “wrong” sign of $\lambda_N$ is the positive one. The generic $q$-point Schwinger function of the full $\varphi^4_4$ theory is obtained by differentiating the generating functional $q$ times with respect to $\zeta$ and setting $\zeta = 0$. Now, let $W_N(\zeta f) =: W^{p}_N(\zeta f) + W^{np}_N(\zeta f)$, where $W^{p}_N$, $W^{np}_N$ are respectively the planar/non planar part of $W_N$, to be defined recursively in the following; the $q$-point Schwinger function of the planar theory is defined as:
\[
S^{T}_{(N)}(f; q) := \frac{\partial^q}{\partial \zeta^q} W^{p}_N(\zeta f)\Big|_{\zeta=0}. \tag{2.6}
\]
We shall denote by $S^T(f; q)$ the limit for $N \to +\infty$ of (2.6).
Multiscale analysis. As explained in [6], we can try to evaluate (2.4) by proceeding in an iterative fashion, integrating the independent fields $\varphi^{(j)}$ starting from the ultraviolet scale $j = N$ going down to the infrared scale $j = 0$. This iterative integration gives rise to an expansion in Feynman graphs; the restriction to the planar theory will be enforced by considering at each integration step only the planar ones. For simplicity, in what follows, we shall explicitly discuss only the case $f = 0$, which corresponds to the integration of the “partition function”. The case $f \neq 0$ is a straightforward extension of our argument, and it will be discussed later. After the integration of $\varphi^{(N)}, \varphi^{(N-1)}, \dots, \varphi^{(k+1)}$, we rewrite the integral (2.4) as
\[
e^{W_N(0)} = \int e^{V^{(k)}(\varphi^{(\le k)})}\, P(d\varphi^{(\le k)}) = \int e^{V^{(k)}_p(\varphi^{(\le k)}) + V^{(k)}_{np}(\varphi^{(\le k)})}\, P(d\varphi^{(\le k)}), \tag{2.7}
\]
where $P(d\varphi^{(\le k)}) := \prod_{j=0}^{k} P(d\varphi^{(j)})$, the field $\varphi^{(\le k)} = \sum_{j=0}^{k} \varphi^{(j)}$ has a propagator given by, in momentum space,
\[
C^{(\le k)}_p := \sum_{j=0}^{k} C^{(j)}_p, \qquad C^{(j)}_p := \frac{f_j(p)}{p^2+1}, \tag{2.8}
\]
and the effective potential $V^{(k)}$ together with its planar/non planar parts $V^{(k)}_p$, $V^{(k)}_{np}$ will be defined recursively. At the beginning, $V^{(N)}(\varphi^{(\le N)}) = V^{(N)}_p(\varphi^{(\le N)})$; on scale
$k$ we will show that, if $\# =$ “p”, “np”:
\[
V^{(k)}_{\#}(\varphi^{(\le k)}) = \sum_{m\ge 0} \int \frac{dp_1}{(2\pi)^4} \cdots \frac{dp_m}{(2\pi)^4}\, V^{(k)}_{\#}(p_1,\dots,p_m; m)\, :\prod_{i=1}^{m} \varphi^{(\le k)}_{p_i}:\ \delta\Big(\sum_i p_i\Big), \tag{2.9}
\]
where $V^{(k)}_{\#}(p_1,\dots,p_m; m)$ are suitable coefficients to be recursively defined, and the product with $m = 0$ is interpreted as 1. Let us perform the single scale integration. First, we split $V^{(k)}$ as $LV^{(k)} + RV^{(k)}$, where $R = 1 - L$ and $L$, the localization operator, is a linear operator acting on functions of the form (2.9), defined by its action on the kernels $V^{(k)}_{\#}(p_1,\dots,p_m; m)$ in the following way (with a slight abuse of notation, due to the presence of the delta function in (2.9) we only write the independent values of the momenta in the arguments of the kernels):
\[
\begin{aligned}
LV^{(k)}_{\#}(p_1,p_2,p_3; 4) &:= V^{(k)}_{\#}(0,0,0; 4), \\
LV^{(k)}_{\#}(p; 2) &:= V^{(k)}_{\#}(0; 2) + p\,\partial_p V^{(k)}_{\#}(0; 2) + \frac{1}{2}\, p_i p_j\, \partial_{p_i}\partial_{p_j} V^{(k)}_{\#}(0; 2),
\end{aligned} \tag{2.10}
\]
and $LV^{(k)}_{\#}(p_1,\dots,p_m; m) = 0$ otherwise. By symmetry, it follows that
\[
\begin{aligned}
&\partial_{p_i} V^{(k)}_{\#}(0; 2) = 0, \qquad \partial_{p_i}\partial_{p_j} V^{(k)}_{\#}(0; 2) = 0 \quad \text{for } i \neq j, \\
&\partial_{p_i p_i} V^{(k)}_{\#}(0; 2) = \partial_{p_j p_j} V^{(k)}_{\#}(0; 2) \quad \text{for all } i, j;
\end{aligned} \tag{2.11}
\]
finally, we define the running coupling constants of the planar theory on scale $k$ as:
\[
\lambda_k := V^{(k)}_p(0,0,0; 4), \qquad \alpha_k := \frac{1}{2}\,\partial_{p_1 p_1} V^{(k)}_p(0; 2), \qquad \gamma^{2k}\mu_k := V^{(k)}_p(0; 2), \qquad \gamma^{4k}\nu_k := V^{(k)}_p(0); \tag{2.12}
\]
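the corresponding objects in the full theory are obtained by replacing the $V^{(k)}_p$ in (2.12) with $V^{(k)}$. Therefore, setting $\varphi^{(\le k)} =: \varphi^{(\le k-1)} + \varphi^{(k)}$, we can rewrite (2.7) with $k$ replaced by $k - 1$, and $V^{(k-1)}$ given by
\[
V^{(k-1)}(\varphi^{(\le k-1)}) = \log \int P(d\varphi^{(k)})\, e^{V^{(k)}(\varphi^{(\le k-1)} + \varphi^{(k)})} := \sum_{n\ge 0} \frac{1}{n!}\, E^{T}_k\big(V^{(k)}(\varphi^{(\le k)}); n\big), \tag{2.13}
\]
where $E^{T}_k$ is called truncated expectation on scale $k$, and it is defined as:
\[
E^{T}_h\big(X(\varphi^{(h)}); n\big) := \frac{\partial^n}{\partial\zeta^n} \log \int P(d\varphi^{(h)})\, e^{\zeta X(\varphi^{(h)})}\Big|_{\zeta=0}. \tag{2.14}
\]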
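To make the definition (2.14) concrete, it may help to look at a toy case: a single centred gaussian variable $\varphi$ with variance $c$ and $X = \varphi^2$ (not Wick ordered). Then $\log \int P(d\varphi)\, e^{\zeta\varphi^2} = -\frac{1}{2}\log(1 - 2c\zeta)$, so the truncated expectations are $2^{n-1}(n-1)!\,c^n$. The Python sketch below recovers them from the moments through the standard moment-to-cumulant recursion; the toy model and all function names are illustrative assumptions, not objects used in the paper.

```python
from fractions import Fraction
from math import comb, factorial

def double_factorial_odd(n):
    """Return (2n-1)!! = 1*3*...*(2n-1), with the empty product equal to 1 for n = 0."""
    result = 1
    for k in range(1, 2 * n, 2):
        result *= k
    return result

def moments_of_phi_squared(order, c):
    """Moments E[(phi^2)^n], n = 0..order, for a centred gaussian phi of variance c."""
    return [Fraction(double_factorial_odd(n)) * c ** n for n in range(order + 1)]

def truncated_expectations(moments):
    """Truncated expectations (cumulants) kappa_1..kappa_n from moments m_0..m_n,
    using kappa_n = m_n - sum_{k=1}^{n-1} C(n-1, k-1) * kappa_k * m_{n-k}."""
    kappa = [Fraction(0)]  # index 0 is a placeholder and is never used
    for n in range(1, len(moments)):
        value = moments[n]
        for k in range(1, n):
            value -= comb(n - 1, k - 1) * kappa[k] * moments[n - k]
        kappa.append(value)
    return kappa[1:]

if __name__ == "__main__":
    c = Fraction(1, 2)
    for n, k_n in enumerate(truncated_expectations(moments_of_phi_squared(6, c)), start=1):
        print(n, k_n, 2 ** (n - 1) * factorial(n - 1) * c ** n)
```

Exact rational arithmetic is used so that the comparison with the closed formula is an equality rather than a floating point approximation.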
It is convenient to define also $V^{(-1)}$; for this purpose one thinks of $\varphi^{(\le N)}$ as being given by, see formula [6, Eq. (6.9)],
\[
\varphi^{(\le N)} = \varphi^{(-1)} + \varphi^{(0)} + \cdots + \varphi^{(N)}, \tag{2.15}
\]
where the field $\varphi^{(-1)}$ is distributed independently relative to the other $\varphi^{(j)}$, $j \ge 0$, and it has its own covariance $C^{(-1)}_{x,y}$, which need not be specified (because it will eventually be taken to be identically zero whenever it appears in some interesting formulas). The introduction of $V^{(-1)}$ allows one to treat the case $k = 0$ on the same grounds as the cases $k > 0$.
Tree expansion and Feynman graphs. The iterative integration described above leads to a representation of the effective potential on scale $k - 1$ as a power series in the running coupling constants $\lambda_h$, $\alpha_h$, $\mu_h$, $\nu_h$ with $h \ge k$, where the coefficients of the series can be represented in terms of connected Feynman graphs, as briefly explained in the following. The key formula which we start from is (2.13); iterating this formula as suggested by Fig. 1, we end up with a representation of the effective potentials in terms of a sum over Gallavotti–Nicolò trees [6, 12, 13], see Fig. 2:
\[
\begin{aligned}
V^{(k-1)}(\varphi^{(\le k-1)}) &= \sum_{n\ge 1} \sum_{\gamma\in T_{k-1,n}} V^{(k-1)}(\gamma), \\
V^{(k-1)}(\gamma) &= \sum_{m\ge 0} \int \frac{dp_1}{(2\pi)^4} \cdots \frac{dp_m}{(2\pi)^4}\, V^{(k-1)}(p_1,\dots,p_m; \gamma, m)\, :\prod_{i=1}^{m} \varphi^{(\le k-1)}_{p_i}:\ \delta\Big(\sum_i p_i\Big),
\end{aligned} \tag{2.16}
\]
where $T_{k-1,n}$ is the set of trees with root $r$ on scale $h_r = k - 1$ and $n$ endpoints, with value $V^{(k-1)}(\gamma)$. The trees involved in the sum are distinct; two trees are considered identical if it is possible to superpose them together with the labels appended to their vertices by stretching or shortening the branches. Proceeding in a way analogous to [6, Sec. XVI and Appendix C], it follows that the kernels $V^{(k)}(p_1,\dots,p_m; \gamma, m)$ satisfy the following recursion relation:
\[
V^{(k-1)}(p_1,\dots,p_m; \gamma, m) = \frac{1}{s!} \sum_{m_1,\dots,m_s} \int \prod_{j=1}^{s} \tilde V^{(k)}(p_1,\dots,p_{m_j}; \gamma_j, m_j) \cdot \sum_{\pi\in G_m}\ \sum_{\substack{\vartheta\subset\pi \\ \text{connected}}}\ \prod_{\lambda\in\vartheta} C^{(k)}_{p(\lambda)} \cdot \prod_{\lambda\in\pi/\vartheta} C^{(\le k-1)}_{p(\lambda)}, \tag{2.17}
\]
where $\gamma_1,\dots,\gamma_s$ are the $s$ subtrees of $\gamma$ with root corresponding to the first nontrivial vertex of $\gamma$, $\tilde V^{(k)}(p_1,\dots,p_{m_j}; \gamma_j, m_j)$ is equal to $RV^{(k)}(p_1,\dots,p_{m_j}; \gamma_j, m_j)$ if $\gamma_j$ is nontrivial and to $LV^{(k)}(p_1,\dots,p_{m_j}; m_j)$ otherwise, $G_m$ is a suitable set of Feynman graphs defined below, and the integral is over their loop momenta. This relation is a consequence of the rules of evaluation of the truncated expectations of Wick monomials, see [6, Appendix C]. Formula (2.17) is iterated by replacing each $\tilde V^{(k)}(p_1,\dots,p_{m_j}; \gamma_j, m_j)$ corresponding to nontrivial $\gamma_j$’s with (2.17) with $k - 1$
Fig. 1. Graphical interpretation of (2.13). The graphical equations for $LV^{(k-1)}$, $RV^{(k-1)}$ are obtained from the equation in the second line by putting an $L$, $R$ label, respectively, over the vertices on scale $k$.
Fig. 2. The effective potential $V^{(h)}$ can be represented as a sum over Gallavotti–Nicolò trees. The small black dots will be called vertices of the tree. All the vertices except the first (i.e. the one on scale $k$) have an $R$ label attached, which means that they correspond to the action of $R E^{T}_{h_v}$, while the first represents $E^{T}_k$. The generic endpoint $e$, represented by a fat endpoint, corresponds to $LV^{(h_e - 1)}$. The sum is over distinct trees; two trees are considered identical if it is possible to superpose them together with the labels appended to their vertices by stretching or shortening the branches.
replaced by $k$. Analogously, the planar part of the effective potential is defined as:
\[
V^{(k-1)}_p(p_1,\dots,p_m; \gamma, m) = \frac{1}{s!} \sum_{m_1,\dots,m_s} \int \prod_{j=1}^{s} \tilde V^{(k)}_p(p_1,\dots,p_{m_j}; \gamma_j, m_j) \cdot \sum_{\substack{\pi\in G_m \\ \pi\ \text{planar}}}\ \sum_{\substack{\vartheta\subset\pi \\ \text{connected}}}\ \prod_{\lambda\in\vartheta} C^{(k)}_{p(\lambda)} \cdot \prod_{\lambda\in\pi/\vartheta} C^{(\le k-1)}_{p(\lambda)}. \tag{2.18}
\]
Represent a generic Wick monomial $M_j$ containing the product of $m_j$ fields as a point or as a cluster with $m_j$ emerging lines, depending on whether the corresponding $\gamma_j$ is trivial or not; we shall consider the points as (trivial) clusters, too. Given the Wick monomials $M_1,\dots,M_s$ the symbol $G_m$ denotes the set of connected graphs that can be made by joining pairwise some of the lines associated with the clusters $M_1,\dots,M_s$ in such a way that: (i) two lines emerging from the same cluster cannot be contracted together, (ii) there should be enough lines so that, regarding the clusters as points, the resulting graph is connected, (iii) after the contraction
there should be still $m$ uncontracted lines, representing the Wick monomial $M$. The resulting graph is enclosed in a new cluster, labeled by $k$. Furthermore, the condition $\vartheta \subset \pi$ with the subscript “connected” means that the subgraph $\vartheta$ still keeps the connection between the boxes. We graphically represent the propagators $C^{(k)}$ by a solid line, while $C^{(\le k-1)}$ correspond to wavy lines. Finally, the restriction to planarity means that we discard all the graphs that show lines crossing in points where no interaction vertices are present. We refer the reader to [6, Sec. XVI], for a more extensive discussion and for examples. Clearly, the iteration stops when only trivial subtrees appear in (2.17), (2.18); at this point, the resulting graph looks like a “usual” one, but enclosed in a hierarchical cluster structure, where each cluster has a scale label; and given two clusters $G_v$, $G_{v'}$ then $G_v \subset G_{v'}$ if and only if $h_v > h_{v'}$. After the iteration, the effective potential on scale $k$ is expressed as a power series in the running coupling constants $\lambda_h$, $\alpha_h$, $\mu_h$, $\nu_h$ with $h > k$. From the analysis of [6, 12, 13], it follows that the contribution of a given tree $\gamma \in T_{k,n}$ to a kernel of the planar theory can be bounded in the following way, setting $\delta := \max_h \{|\lambda_h|, |\alpha_h|, |\mu_h|, |\nu_h|\}$, for some positive $C_m$, $\rho$:
\[
\big| V^{(k)}_p(p_1,\dots,p_m; \gamma, m) \big| \le C_m\, (\mathrm{const.})^n\, \delta^n\, \gamma^{k(4-m)} \prod_{\substack{v > r \\ v\ \text{not e.p.}}} \gamma^{-\rho(h_v - h_{v'})}, \tag{2.19}
\]
where the product runs over the vertices of the tree $\gamma$ and $v'$ is the vertex immediately preceding $v$; since the number of distinct trees is bounded as $(\mathrm{const.})^n$ it follows that, see [6, Sec. XIX]:
\[
\sum_{\gamma\in T_{k,n}} \big| V^{(k)}_p(p_1,\dots,p_m; \gamma, m) \big| \le C_m\, C^n\, \delta^n\, \gamma^{k(4-m)}, \tag{2.20}
\]
which means that the planar part of the effective potential can be expressed as a convergent power series in the running coupling constants, provided their absolute values are small enough. This is not the case in the full theory; in the analogue of (2.19), due to the combinatorics of the Feynman graphs, one has to take into account an extra $n!$ factor. Formula (2.19) implies in particular the so-called short memory property of the Gallavotti–Nicolò trees, which states that if two scales of a given tree are constrained to have fixed values, say $h$, $k$ with $h < k$, then the bound on the sum over all the remaining scales is improved by a factor $\gamma^{-(\rho/2)(k-h)}$ with respect to (2.20); in other words, long trees are exponentially suppressed.
The expansion of the Schwinger functions. The generating functional of the Schwinger functions can be evaluated by repeating a procedure completely analogous to the one described for the effective potentials; after the integration of the scales $N, N-1, \dots, k+1$ it turns out that:
\[
e^{W_N(\zeta f)} = \int P(d\varphi^{(\le k)})\, e^{S^{(k)}(\varphi^{(\le k)}; \zeta f)} = \int P(d\varphi^{(\le k)})\, e^{S^{(k)}_p(\varphi^{(\le k)}; \zeta f) + S^{(k)}_{np}(\varphi^{(\le k)}; \zeta f)}, \tag{2.21}
\]
where the effective potentials $S^{(k)}_{\#}(\varphi^{(\le k)}; \zeta f)$ have the form:
\[
S^{(k)}_{\#}(\varphi^{(\le k)}; \zeta f) = \sum_{m\ge 0} \sum_{t\ge 0} \int \frac{dp_1}{(2\pi)^4} \cdots \frac{dp_{m+t}}{(2\pi)^4}\, S^{(k)}_{\#}(p_1,\dots,p_{m+t}; m, t) \cdot :\prod_{i=1}^{m} \varphi^{(\le k)}_{p_i}: \prod_{j=m+1}^{m+t} \zeta f_{p_j}\ \delta\Big(\sum_i p_i\Big), \tag{2.22}
\]
and can be represented as sums over trees very similar to the ones introduced for the effective potentials, up to the following differences, see [12, Sec. 7.5] and [13]: (i) special vertices may appear, from which dotted lines representing the “external fields” $\zeta f$ emerge (that do not contribute to the total number of endpoints), and (ii) no $R$ operation is defined on the path from a given dotted line to the root. We call $T_{k,n,t}$ the set of such trees having root scale $k$, $n$ endpoints and $t$ dotted lines. See Fig. 3 for an example.
Fig. 3. A generic tree belonging to $T_{k,6,2}$.
Setting
\[
S^{(k)}_{\#}(p_1,\dots,p_{m+t}; m, t) = \sum_{n\ge 1} \sum_{\gamma\in T_{k,n,t}} S^{(k)}_{\#}(p_1,\dots,p_{m+t}; \gamma, m, t), \tag{2.23}
\]
the planar parts of the kernels of the effective potentials are related by the following recursive equation:
\[
S^{(k-1)}_p(p_1,\dots,p_{m+t}; \gamma, m, t) = \frac{1}{s!} \sum_{\substack{m_1,\dots,m_s \\ t_1,\dots,t_s}} \int \prod_{j=1}^{s} \tilde S^{(k)}_p(p_1,\dots,p_{m_j+t_j}; \gamma_j, m_j, t_j) \cdot \sum_{\substack{\pi\in G_m \\ \pi\ \text{planar}}}\ \sum_{\substack{\vartheta\subset\pi \\ \text{connected}}}\ \prod_{\lambda\in\vartheta} C^{(k)}_{p(\lambda)} \cdot \prod_{\lambda\in\pi/\vartheta} C^{(\le k-1)}_{p(\lambda)}, \tag{2.24}
\]
where $\gamma_1,\dots,\gamma_s$ are the $s$ subtrees of $\gamma$ with root coinciding with the first vertex of $\gamma$ following the root. If $\gamma_j$ is trivial and corresponds to a dotted line then $\tilde S^{(k)}_p(p_1,\dots,p_{m_j+t_j}; \gamma_j, m_j, t_j) = \delta_{m_j,1}\delta_{t_j,1}$, while if it corresponds to a solid line $\tilde S^{(k)}_p(p_1,\dots,p_{m_j+t_j}; \gamma_j, m_j, t_j) = \delta_{t_j,0}\, LV^{(k)}(p_1,\dots,p_{m_j}; m_j)$; if $\gamma_j$ is a nontrivial subtree with $t_j > 0$ then $\tilde S^{(k)}_p(p_1,\dots,p_{m_j+t_j}; \gamma_j, m_j, t_j) = S^{(k)}_p(p_1,\dots,p_{m_j+t_j}; \gamma_j, m_j, t_j)$, while if $\gamma_j$ is nontrivial and $t_j = 0$ then $\tilde S^{(k)}_p(p_1,\dots,p_{m_j}; \gamma_j, m_j, 0) = RV^{(k)}_p(p_1,\dots,p_{m_j}; \gamma_j, m_j)$, with $R = 1 - L$ defined as in (2.10). Clearly, $m_1,\dots,m_s$ and $t_1,\dots,t_s$ are subject to the constraints $\sum_j m_j = m$, $\sum_j t_j = t$. Formula (2.24) is iterated by replacing each $S^{(k)}_p(p_1,\dots,p_{m_j}; \gamma_j, m_j, t_j)$ corresponding to any nontrivial $\gamma_j$ with $t_j > 0$. Therefore, the generic planar Schwinger function $S^{T}_{(N)}(f; q)$ can be written as:
\[
\begin{aligned}
S^{T}_{(N)}(f; q) &= \sum_{n\ge 1} \sum_{\gamma\in T_{-1,n,q}} S_p(\gamma), \\
S_p(\gamma) &:= \int \frac{dp_1}{(2\pi)^4} \cdots \frac{dp_q}{(2\pi)^4}\, f_{p_1} \cdots f_{p_q}\, S^{(-1)}_p(p_1,\dots,p_q; \gamma, 0, q), \qquad \gamma \in T_{-1,n,q},
\end{aligned} \tag{2.25}
\]
where $S^{(-1)}_p$ is given by (2.24) with $k = 0$. Finally, from the theory of [12, Sec. 7.5], it follows that
\[
\sum_{\gamma\in T_{-1,n,q}} |S_p(\gamma)| \le \|f\|_1^q\, C_q\, C^n\, \delta^n, \tag{2.26}
\]
which implies that in the planar theory the Schwinger functions can be expressed as absolutely convergent power series in the running coupling constants, provided their absolute values are small enough. As is well known, this is not the case in the full theory, since the bound (2.26) has to be multiplied by $n!$; see [6, 12, 13, 22].
The beta function and its tree expansion. From now on, we shall focus only on the planar theory. The running coupling constants obey recursive equations induced by the iterative integration; it follows that, setting $v^{(4)}_k := \lambda_k$, $v^{(2')}_k := \alpha_k$, $v^{(2)}_k := \mu_k$, $v^{(0)}_k := \nu_k$:
\[
v^{(a)}_k = \gamma^{-2\delta_{a,2} - 4\delta_{a,0}}\, v^{(a)}_{k-1} - (Bv)^{(a)}_k, \qquad k \ge 0, \tag{2.27}
\]
where the operator $B$, the beta function of the theory, has the form, see formula [6, Eq. (9.15)]:
\[
(Bv)^{(a)}_k := \sum_{r=2}^{\infty}\ \sum_{h_1,\dots,h_r \ge k}^{N}\ \sum_{a_1,\dots,a_r} \beta^{(a)}_{a_1,\dots,a_r}(k; h_1,\dots,h_r) \prod_{i=1}^{r} v^{(a_i)}_{h_i}. \tag{2.28}
\]
The quantities $\{v^{(a)}_{-1}\}$ are called the renormalized coupling constants. As the iterative procedure described before suggests, the beta function $(Bv)^{(a)}_k$ can be represented as a sum over trees; the only difference with respect to the trees which
have been introduced previously is that we attach an $L_a$ over the first vertex, where $L_a$ is defined in the following way: $L_a V^{(k)}_p(p_1,p_2,p_3; 4) := V^{(k)}_p(0,0,0; 4)$ if $a = 4$ and zero otherwise, $L_a V^{(k)}_p(p; 2) := \gamma^{-2k} V^{(k)}_p(0; 2)$ if $a = 2$ and zero otherwise, $L_a V^{(k)}_p(p; 2) := (1/2)\,\partial_{p_1 p_1} V^{(k)}_p(0; 2)$ if $a = 2'$ and zero otherwise and finally $L_a V^{(k)}_p(0) := \gamma^{-4k} V^{(k)}_p(0)$ if $a = 0$ and zero otherwise. From the theory of [6], it follows that in the planar theory:
\[
\sum_{h_1,\dots,h_r \ge k} \big| \beta^{(a)}_{a_1,\dots,a_r}(k; h_1,\dots,h_r) \big| \le (\mathrm{const.})^r, \tag{2.29}
\]
which means that the beta function is defined as an absolutely convergent power series provided the absolute values of the running coupling constants are small enough; this is not the case in the full theory, since in that case the bound (2.29) has to be multiplied by $r!$.
Remarks. (1) From the representation of the coefficients of the beta function in terms of Feynman graphs, induced by the iterative integration previously described (see also [6, Secs. IX, XVI–XIX]), it follows that for $k > 0$, calling $\bar r$ the number of indices $i$ such that $a_i = 4$ (corresponding to the number of vertices with four external lines),
\[
\beta^{(4)}_{a_1,\dots,a_r}(k; h_1,\dots,h_r) = 0 \quad \text{unless } \bar r \ge 2, \tag{2.30}
\]
\[
\beta^{(2')}_{a_1,\dots,a_r}(k; h_1,\dots,h_r) = 0 \quad \text{unless } \bar r \ge 2, \tag{2.31}
\]
\[
\beta^{(2)}_{a_1,\dots,a_r}(k; h_1,\dots,h_r) = 0 \quad \text{unless } \bar r \ge 1. \tag{2.32}
\]
These properties can be understood in the following way. The graphs contributing to (2.30)–(2.32) are all computed at vanishing external momenta, and the momenta flowing on the propagators must have absolute values bigger than 0; in fact, the quantity $(Bv)^{(a)}_k$ arises from the integration of the fields $\varphi^{(h)}$ with $h \ge k$, which if $k > 0$ have support for momenta $p$ such that $|p| > 0$. Then, to see property (2.30), simply try to draw on a sheet of paper any graph with four external lines evaluated at vanishing external momenta; as the reader may check, the condition $\bar r < 2$ is not compatible with the fact that the momenta flowing on the propagators have absolute values $> 0$. Property (2.32) can be seen in an analogous way. To understand (2.31), notice that the graphs contributing to $\beta^{(2')}_{a_1,\dots,a_r}(k; h_1,\dots,h_r)$ have two external lines, and are differentiated twice with respect to the external momentum. Then, proceed as for (2.32), and notice that the only two-legged graphs with $\bar r = 1$ compatible with the request on the modulus of the inner momenta are “tadpole” graphs, which do not depend on the value of the external momentum; therefore, their derivatives are vanishing.
(2) Note that the flow of $\nu_k$ is decoupled from the others, since $\nu_k$ does not appear in the recursive equations defining $\lambda_k$, $\alpha_k$, $\mu_k$ (it is graphically represented by
a vertex with no external lines); moreover, the sequence $\nu_{-1},\dots,\nu_N$ solves the following equation:
\[
\nu_k = \gamma^{-4}\nu_{k-1} - (Bv)^{(0)}_k, \tag{2.33}
\]
which implies
\[
\nu_k = \gamma^{-4(k+1)}\nu_{-1} - \sum_{j=0}^{k} \gamma^{4(j-k)} (Bv)^{(0)}_j, \tag{2.34}
\]
where $(Bv)^{(0)}_j$ is analytic in its arguments for $\max_k\{|\lambda_k|, |\alpha_k|, |\mu_k|\}$ small enough. For these reasons, in what follows we shall focus only on the flows of $\lambda_k$, $\mu_k$, $\alpha_k$. We can rewrite Eq. (2.27) as:
\[
v^{(a)}_k = \gamma^{-(k+1)(2\delta_{a,2} + 4\delta_{a,0})}\, v^{(a)}_{-1} - \sum_{j=0}^{k} \gamma^{(j-k)(2\delta_{a,2} + 4\delta_{a,0})} (Bv)^{(a)}_j, \tag{2.35}
\]
and this equation can be iterated in order to obtain the formal power series of $v^{(a)}_k$ in the renormalized coupling constants. Again, Eq. (2.35) can be represented graphically. The second term in (2.35) corresponds to the sum of all the possible trees with root scale $k$ enclosed in a frame labeled by a type label $a$. The correspondence between the framed trees and the trees discussed after (2.28) is made explicit by the example in Fig. 4. In general, the fat endpoint $e$ labeled by $a_e$ and attached to a vertex on scale $h_e - 1$ corresponds to the running coupling constant $v^{(a_e)}_{h_e - 1}$, while the first term in (2.35) is represented as a trivial tree with a thin endpoint labeled by $a$ and root scale $k$. See Fig. 5 for a graphical representation of (2.35). The iteration of (2.35) produces trees showing thin endpoints, and in general more than one frame; see Fig. 6 for a picture of the situation. Therefore, the $n$th order contribution in the renormalized coupling constant to $v^{(a)}_k$ is defined graphically as the sum of all the possible framed trees with root scale $k$ enclosed in a frame labeled by $a$, with $n$ thin endpoints, and where the generic vertex $v$ has an $R$ label attached; otherwise, the corresponding subtree is enclosed in a frame. We stress that trees with different type
Fig. 4. Example of framed tree.
Fig. 5. Graphical interpretation of formula (2.35); a sum over the $a_i$’s is understood.
Fig. 6. Graphical interpretation of the iteration of Eq. (2.35).
labels attached to their frames and endpoints are considered different. The same graphical procedure allows one to find the perturbative expansion of the Schwinger functions (or equivalently of the effective potentials) in the renormalized coupling constants, starting from their definition as trees with only “fat” endpoints.
Remark. Given a generic framed tree showing any number of inner frames, we define the maximally pruned framed tree as the tree obtained by replacing the maximal inner frames (i.e. the ones enclosed only by the outermost frame) with fat endpoints of the corresponding type; by properties (2.30)–(2.32) the sum over the scale of the first vertex of a framed tree, see Fig. 4, involves only the term with $j = 0$ if:
• the type label of the frame is $2$ and the maximally pruned framed tree has no endpoints of type $4$;
• the type label of the frame is $2'$ or $4$ and the maximally pruned framed tree has at most one endpoint of type $4$.
We shall say that a frame is trivial if the enclosed tree verifies one of the above properties; all the other frames will be called nontrivial. Call $\tilde T_{-1,m,q}$ the set of trees with root scale $-1$, any number of frames, $m$ endpoints fat or thin, and $q$ dotted lines; given a generic tree $\gamma \in \tilde T_{-1,m,q}$ we call $n_{2',4}(\gamma)$ the number of nontrivial frames (see previous remark) labeled by $a = 2', 4$ and we denote by $m_a(\gamma)$ the number of endpoints of type $a$. In the planar theory the following remarkable result is true.
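Theorem [$n!$ Bound]. Let $q > 0$; there exist two positive constants $C$, $C_q$ such that, if $m = m_4 + m_2 + m_{2'}$:
\[
\sum_{\substack{\gamma\in\tilde T_{-1,m,q} \\ n_{2',4}(\gamma) = n \\ m_a(\gamma) = m_a}} |S(\gamma)| \le C^m\, C_q\, \|f\|_1^q\, n! \prod_a \Big( \max_{k\ge -1} |v^{(a)}_k| \Big)^{m_a}; \tag{2.36}
\]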
for $q = 0$ the bound (2.36) has to be multiplied by $|\Lambda|$. We refer the reader to [6, 12, 13], and to Appendix E (see item (2) in the Remark below), for a proof of this result.
Remarks. (1) The “$n!$ bound” (2.36) only applies to the planar theory; in the full theory $n$ is replaced by the number of endpoints of the tree. This proves the ultraviolet stability of the full $\varphi^4_4$ theory; see [6, 12, 13, 22–25]. (2) In Refs. [6, 13], it was noticed that in the planar case the bound grows factorially in the number of frames; as we show in Appendix E, it is possible to improve the bound by considering only the nontrivial frames labeled by $4$, $2'$. Roughly speaking, the factorial is “produced” by the sums appearing in the definitions of the frames; the frames labeled by $2$, $0$ do not contribute to the factorial because their sums can be controlled thanks to the exponential factor appearing in (2.35) and Fig. 4, and if a frame is trivial the sum is missing.
Notations. From now on we shall set
\[
\lambda := \lambda_{-1}, \qquad \alpha := \alpha_{-1}, \qquad \mu := \gamma^{-2}\mu_{-1}; \tag{2.37}
\]
moreover, we define
\[
\underline{\lambda} := \{\lambda_k\}_{k\ge 1}, \qquad \underline{\alpha} := \{\alpha_k\}_{k\ge 1}, \qquad \underline{\mu} := \{\mu_k\}_{k\ge 1}. \tag{2.38}
\]
Notice that the definition (2.38) does not involve the running coupling constants on scale zero. In fact, for purely technical reasons, the running coupling constants on scale zero have to be treated separately from those on scales $> 0$. In particular, we first determine the running coupling constants on scales $> 0$ as functions of those on scale 0, and then we express the running coupling constants on scale 0 as functions of the renormalized ones. The motivation of this procedure is connected with the fact that the properties of the beta function (2.30)–(2.32), that will play a key role in our analysis, are true only for scales $k > 0$. It is also convenient to introduce
\[
\xi_k := (\xi_{2',k}, \xi_{2,k}) := (\alpha_k, \mu_k), \qquad \xi := (\alpha, \mu), \qquad \underline{\xi} := \{\xi_k\}_{k\ge 1}. \tag{2.39}
\]
Finally, we define the sets $B_\delta$, $C_\delta$, $W_{\delta,\vartheta}$ with $\delta > 0$, $\vartheta \in \big(0, \frac{\pi}{2}\big]$ in the following way, see Fig. 7:
\[
B_\delta := \{z \in \mathbb{C} : |z| < \delta\}, \qquad C_\delta := \{z \in \mathbb{C} : \mathrm{Re}\, z^{-1} > \delta^{-1}\}, \qquad W_{\delta,\vartheta} := \{z \in \mathbb{C} : |z| < \delta,\ |\arg z| < \pi - \vartheta\}. \tag{2.40}
\]
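The geometry of these domains can be made explicit: the condition $\mathrm{Re}\, z^{-1} > \delta^{-1}$ is equivalent to $|z - \delta/2| < \delta/2$, so $C_\delta$ is the open disc of centre $\delta/2$ and radius $\delta/2$, which lies in the right half plane and inside $B_\delta$; hence $C_\delta \subset W_{\delta,\vartheta}$ for $\vartheta$ between $0$ and $\pi/2$. The Python sketch below checks this inclusion on random samples; the function names, the sampling scheme (which stays slightly away from the boundary of the disc to avoid rounding issues) and the numerical values are assumptions made only for illustration.

```python
import cmath
import math
import random

def in_B(z, delta):
    """Membership in B_delta = {|z| < delta}."""
    return abs(z) < delta

def in_C(z, delta):
    """Membership in C_delta = {Re(1/z) > 1/delta}."""
    return z != 0 and (1 / z).real > 1 / delta

def in_W(z, delta, theta):
    """Membership in W_{delta,theta} = {|z| < delta, |arg z| < pi - theta}."""
    return z != 0 and abs(z) < delta and abs(cmath.phase(z)) < math.pi - theta

def sample_C(delta, rng, shrink=0.999):
    """Random point of the disc of centre delta/2 and radius shrink*delta/2,
    which is contained in C_delta."""
    r = shrink * (delta / 2) * math.sqrt(rng.random())
    phi = 2 * math.pi * rng.random()
    return complex(delta / 2 + r * math.cos(phi), r * math.sin(phi))

if __name__ == "__main__":
    rng = random.Random(0)
    delta, theta = 0.3, 1.0
    points = [sample_C(delta, rng) for _ in range(1000)]
    print(all(in_C(z, delta) and in_W(z, delta, theta) for z in points))
```

The converse inclusion fails: for instance $z = 0.2\,i$ belongs to $W_{0.3,\,1}$ but not to $C_{0.3}$, since $\mathrm{Re}\, z^{-1} = 0$.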
Fig. 7. The domains $B_\delta$, $C_\delta$, $W_{\delta,\vartheta}$.
3. Borel Summability of $\varphi^4_4$ Planar Theory
In this section, we state our main result in a mathematically precise form, we recall what Borel summability is and we outline the ideas of the proof. The technical details are contained in Sec. 4 and in the appendices.
Theorem 1 (’t Hooft–Rivasseau). For any $\vartheta \in \big(0, \frac{\pi}{2}\big]$ there exist $\bar\eta > 0$, $\bar\varepsilon > 0$ such that the Schwinger functions $S^T(f; q) = \lim_{N\to+\infty} S^{T}_{(N)}(f; q)$ of the planar $\varphi^4_4$ theory are analytic for $(\lambda, \alpha, \mu) \in W_{\bar\varepsilon,\vartheta} \times B_{\bar\eta} \times B_{\bar\eta}$, and Borel summable in $\lambda$ at the origin.
Remark. Not surprisingly, $\bar\varepsilon \to 0$ if $\vartheta \to 0$.
Before discussing a sketch of the proof, let us briefly recall what Borel summability is (see [14, 16]). A formal power series $\sum_n a_n z^n$, $z \in \mathbb{C}$, is said to be Borel summable if the following properties are true:
• the Borel transform $B(t) := \sum_n \frac{a_n}{n!} t^n$ converges for every $t$ in some circle $B_\delta$;
• $B(t)$ admits an analytic continuation in a neighbourhood of the positive real axis;
• the integral
\[
f(z) = \frac{1}{z} \int_0^{+\infty} e^{-t/z} B(t)\, dt \tag{3.1}
\]
is convergent for $z \in C_{\bar\delta}$ for some $\bar\delta > 0$.
Notice that $f(z) \sim \sum_n a_n z^n$ for $z \to 0$. The function $f(z)$ is called the Borel sum of the formal power series, and if $f(z)$ exists it is unique. Therefore, Borel summability is nothing else than a one-to-one mapping between a certain space of functions and a certain space of power series: all the information on the function is enclosed in the list of its Taylor coefficients. For these reasons, Borel summability is, [17], the perfect substitute for ordinary analyticity when a function is expanded on the boundary of its analyticity domain. By the Nevanlinna–Sokal theorem, [14], to establish whether $f(z)$ is the Borel sum of $\sum_n a_n z^n$ it is sufficient to check the following two properties:
• $f(z)$ is analytic in $C_\delta$ for some $\delta > 0$;
• for every $z \in C_\delta$ and for all $M > 0$ the following estimate holds:
\[ \Big| f(z) - \sum_{n=0}^{M-1} a_n z^n \Big| \le C^M M!\, |z|^M, \qquad C > 0. \tag{3.2} \]
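As an illustration of the definition, the following Python sketch (not part of the original argument) applies (3.1) to the textbook series with $a_n = (-1)^n n!$, whose Borel transform is $B(t) = 1/(1+t)$. The function names, the truncation `s_max` and the Simpson quadrature are choices made here for illustration only; for real $z > 0$ the remainder of this particular series is bounded by $M!\, z^M$, which is (3.2) with $C = 1$.

```python
import math

def borel_sum(borel_transform, z, s_max=60.0, steps=6000):
    """Approximate f(z) = (1/z) * int_0^inf exp(-t/z) B(t) dt for real z > 0.

    With t = z*s this equals int_0^inf exp(-s) B(z*s) ds, which is
    evaluated by composite Simpson's rule on [0, s_max] (steps must be even).
    """
    h = s_max / steps
    total = borel_transform(0.0) + math.exp(-s_max) * borel_transform(z * s_max)
    for i in range(1, steps):
        s = i * h
        weight = 4.0 if i % 2 else 2.0
        total += weight * math.exp(-s) * borel_transform(z * s)
    return total * h / 3.0

def partial_sum(z, M):
    """Sum_{n=0}^{M-1} a_n z^n with a_n = (-1)^n n! (illustrative choice)."""
    return sum((-1) ** n * math.factorial(n) * z ** n for n in range(M))

def euler_borel_transform(t):
    # B(t) = sum_n a_n t^n / n! = sum_n (-t)^n = 1/(1+t) for |t| < 1,
    # continued analytically along the positive real axis.
    return 1.0 / (1.0 + t)

z = 0.1
f = borel_sum(euler_borel_transform, z)
for M in range(1, 9):
    remainder = abs(f - partial_sum(z, M))
    bound = math.factorial(M) * z ** M
    print(M, remainder, bound)
```

Each printed remainder should stay below the corresponding bound, although the partial sums themselves eventually diverge as $M$ grows.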
Sketch of the Proof. Our proof consists in a check of the two hypotheses of the Nevanlinna–Sokal theorem, and it goes as follows. First, we prove that for any fixed ultraviolet cutoff $N > 0$ and any $\vartheta \in (0, \pi/2]$ the running coupling constants are analytic for $(\lambda, \alpha, \mu) \in W_{\bar\varepsilon,\vartheta} \times B_{\bar\eta} \times B_{\bar\eta}$; analyticity of the Schwinger functions $S^T_{(N)}(f; q)$ in the same domain is straightforward, since $S^T_{(N)}(f; q)$ is given by an absolutely convergent power series in the running coupling constants, see [6, 12, 13]. Then, we prove that $S^T(f; q) = \lim_{N \to +\infty} S^T_{(N)}(f; q)$ exists, and that the limit is reached uniformly in the analyticity domain. Therefore, $S^T(f; q)$ is analytic in the same analyticity domain of $S^T_{(N)}(f; q)$. To conclude, we show that $S^T(f; q)$, as a function of $\lambda$ in $W_{\bar\varepsilon,\vartheta}$, verifies the bound (3.2). These two properties imply Borel summability, since $C_{\bar\varepsilon} \subset W_{\bar\varepsilon,\vartheta}$.

Analyticity. To solve the flow equations (2.27) and determine the analyticity properties of the running coupling constants we use a fixed point argument. More precisely, we show that the Eqs. (2.27) are solved by sequences parametrized by the renormalized coupling constants $(\lambda, \alpha, \mu)$ which, for finite $N$, are the fixed points of some operators acting on suitable finite dimensional spaces; all the technical work is reduced to showing that in the considered spaces the operators are contractions. After this, the sequences of running coupling constants are determined through an exponentially convergent procedure. In particular, in the limit $N \to +\infty$, for $(\lambda, \alpha, \mu) \in W_{\bar\varepsilon,\vartheta} \times B_{\bar\eta} \times B_{\bar\eta}$, we find that the Eqs. (2.27) admit a solution of the form, for some positive $C$, $c$:
\[
\lambda_k = \frac{1}{\tilde\lambda^{-1} + \sum_{j=0}^{k} \tilde\beta_j}, \qquad
|\alpha_k - \alpha| \le c\,(|\lambda| + |\mu|^2), \qquad
|\mu_k - \gamma^{-2k} \mu| \le c\,\big[\gamma^{-2k} |\mu|^2 + (|\lambda| + |\xi|)\,|\lambda_k|\big],
\tag{3.3}
\]
where $\tilde\lambda = \lambda(1 + O(\mu))$, $|\tilde\beta_k - \beta_k| \le C(|\lambda| + |\xi|)$, $\beta_k := \beta^{(4)}_{4,4}(k; k, k) > 0$. To begin, we rewrite the flow equation for $\lambda_k$ as, see (2.27) with $a = 4$:
\[ \lambda_k =: \lambda_{k+1} + \beta_{4,k+1}(\lambda, \xi), \qquad k \ge 0, \tag{3.4} \]
\[ \lambda =: \lambda_0 + f_{4,0}(\lambda_0, \mu_0) + \beta_{4,0}(\lambda_0, \lambda, \xi_0, \xi), \tag{3.5} \]
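where $f_{4,0}$ is linear in $\lambda_0$, and $\beta_{4,h}$ is given by a sum of terms proportional to at least two among $\lambda_0, \ldots, \lambda_N$. Then, iterating (2.27) up to the scale 0 we get that, for $a = 2, 2'$:
\[
\alpha_k =: \alpha_0 - \sum_{j=1}^{k} \beta_{2',j}(\lambda, \xi), \qquad
\mu_k =: \gamma^{-2k} \mu_0 - \sum_{j=1}^{k} \gamma^{2(j-k)} \beta_{2,j}(\lambda, \xi), \qquad k \ge 1,
\tag{3.6}
\]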
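\[
\alpha_0 =: \alpha - f_{2',0}(\lambda_0, \mu_0) - \beta_{2',0}(\lambda_0, \lambda, \xi_0, \xi), \qquad
\mu_0 =: \mu - f_{2,0}(\mu_0) - \beta_{2,0}(\lambda_0, \lambda, \xi_0, \xi),
\tag{3.7}
\]
where $f_{2',0}$ collects terms at most linear in $\lambda_0$, while $\beta_{2',h}$, $\beta_{2,h}$ are given by sums of terms proportional to at least two or one among $\lambda_0, \ldots, \lambda_N$, respectively. Setting $\beta_{4,h}(\lambda, \xi) =: \beta_h \lambda_h^2 + \bar\beta_{4,h}(\lambda, \xi)$ where $\beta_h > 0$ and $\bar\beta_{4,h}$ is of order $\ge 3$, Eq. (3.4) can be rewritten as
\[
\lambda_k^{-1} = \lambda_{k+1}^{-1} - \beta_{k+1} + R_{k+1}(\lambda, \xi)
\;\Longrightarrow\;
\lambda_k^{-1} = \lambda_0^{-1} + \sum_{j=1}^{k} \beta_j - \sum_{j=1}^{k} R_j(\lambda, \xi),
\tag{3.8}
\]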
where $R_j$ is given by a sum of terms bounded proportionally to one among $\alpha_j$, $\mu_j$, $\lambda_j$, and it depends only on running coupling constants on scales $\ge j$, see Appendix B; the key remark is that, formally, Eq. (3.8) can be seen as defining the fixed point of the map
\[
(T_{\lambda_0,\xi}\, x)_k = \frac{1}{\lambda_0^{-1} + \sum_{j=1}^{k} \beta_j - \sum_{j=1}^{k} R_j(x, \xi)}, \qquad k \ge 1,
\tag{3.9}
\]
where $x = (x_1, \ldots, x_N)$ with $x_i \in \mathbb{C}$ and $\alpha$, $\mu$ satisfy (3.6), which again can be formally seen as the fixed point of the map
\[
(\tilde T_{\xi_0,\lambda}\, y)_k =
\begin{pmatrix}
\alpha_0 - \sum_{j=1}^{k} \beta_{2',j}(\lambda, y) \\[4pt]
\gamma^{-2k} \mu_0 - \sum_{j=1}^{k} \gamma^{2(j-k)} \beta_{2,j}(\lambda, y)
\end{pmatrix},
\qquad k \ge 1,
\tag{3.10}
\]
where $y = (y_1, y_2, \ldots, y_N)$ and $y_k = (y_{k,2'}, y_{k,2})$ with $y_{k,i} \in \mathbb{C}$. Therefore, we can in principle determine the running coupling constants on scale $> 0$ as functions of $(\lambda_0, \alpha_0, \mu_0)$ by solving the equations:
\[ \lambda = T_{\lambda_0,\xi}\, \lambda, \qquad \xi = \tilde T_{\xi_0,\lambda}\, \xi; \tag{3.11} \]
after this, the dependence of the running coupling constants on the renormalized ones can be deduced from Eqs. (3.5) and (3.7). To solve (3.11), in Sec. 4.1 and in Appendices A and B we prove that if $S \subset \mathbb{C}^N$ is the set of sequences “close enough” to the solution of the flow of $\lambda_k$ truncated to second order and if $\tilde S \subset \mathbb{C}^{2N}$ is a $2N$-dimensional ball centered in zero and of suitably small radius, then: (i) if $x \in S$ and $|\alpha_0|$, $|\mu_0|$ are small enough the map $\tilde T_{\xi_0,x}$ leaves $\tilde S$ invariant and is a contraction therein; (ii) the fixed point $y(x)$ of $\tilde T_{\xi_0,x}$ in $\tilde S$ is Hölder continuous in $x$ with exponent $0 < \rho < 1$; (iii) given $\vartheta \in (0, \pi/2]$, for all $\lambda_0 \in W_{\varepsilon,\vartheta}$ with $\varepsilon$ small enough, the map $T_{\lambda_0,y(\cdot)}$ leaves $S$ invariant and is a contraction therein. To be specific, the distances $d$, $\tilde d$ that we shall adopt
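in $S$, $\tilde S$ are defined as $d(x, x') := \max_k |x_k - x'_k|$, $\tilde d(y, y') := \max_{k,i} |y_{k,i} - y'_{k,i}|$, respectively. Then, we can construct the sequences solving (3.11) in the following way: take
\[
\lambda^{(0)} \in S, \qquad
\xi^{(0)} = \begin{pmatrix} \alpha^{(0)} \\ \mu^{(0)} \end{pmatrix} \in \tilde S,
\tag{3.12}
\]
and define, for $m \ge 0$,
\[
\xi^{(m+1)} := \lim_{n \to \infty} \tilde T^{\,n}_{\xi_0,\lambda^{(m)}}\, \xi^{(m)}, \qquad
\lambda^{(m+1)} := T_{\lambda_0,\xi^{(m+1)}}\, \lambda^{(m)}.
\tag{3.13}
\]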
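The structure of the scheme (3.13) can be sketched with a toy model in Python. The scalar maps `inner_map` and `outer_map` below are hypothetical stand-ins for $\tilde T_{\xi_0,\lambda}$ and $T_{\lambda_0,\xi}$, chosen only because they are contractions on the region visited; they are not the maps of the paper, and all names and numerical values are assumptions made for illustration.

```python
def inner_map(y, x):
    # Hypothetical stand-in for the map acting on xi at fixed lambda.
    return 0.1 + 0.2 * x * y

def outer_map(x, y):
    # Hypothetical stand-in for the map acting on lambda at fixed xi.
    return 1.0 / (3.0 + y + 0.5 * x)

def inner_fixed_point(y, x, tol=1e-15, max_iter=1000):
    """Iterate y -> inner_map(y, x) until successive iterates agree to tol."""
    for _ in range(max_iter):
        y_next = inner_map(y, x)
        if abs(y_next - y) < tol:
            return y_next
        y = y_next
    return y

def solve(x0, y0, sweeps=60):
    """Alternate the two updates in the same order as in (3.13)."""
    x, y = x0, y0
    for _ in range(sweeps):
        y = inner_fixed_point(y, x)  # xi^(m+1): inner map iterated to its fixed point
        x = outer_map(x, y)          # lambda^(m+1): a single application of the outer map
    return x, y

x_star, y_star = solve(0.5, 0.0)
print(x_star, y_star)
```

In this toy setting the pair `(x_star, y_star)` is, up to rounding, a simultaneous fixed point of both maps, which mirrors the role of (3.11).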
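Assume inductively that for all $0 \le m' \le m$ the sequences $\xi^{(m')}$, $\lambda^{(m')}$ belong respectively to $\tilde S$, $S$, which is true for $m = 0$. Property (i) above implies that $\xi^{(m+1)}$ belongs to $\tilde S$, while property (iii) implies that $\lambda^{(m+1)}$ belongs to $S$. Then, our procedure (3.13) converges exponentially to a limit; in fact, for $m \ge 1$, for some $0 < \rho < 1$, $C_\rho > 0$ and $0 < \epsilon < 1$:
\[
\begin{aligned}
\max_{k,i} \big|\xi^{(m+1)}_{k,i} - \xi^{(m)}_{k,i}\big|
&\le C_\rho \max_k \big|\lambda^{(m)}_k - \lambda^{(m-1)}_k\big|^{\rho}
\le C_\rho\, \epsilon^{(m-1)\rho} \max_k \big|\lambda^{(1)}_k - \lambda^{(0)}_k\big|^{\rho}, \\
\max_k \big|\lambda^{(m+1)}_k - \lambda^{(m)}_k\big|
&\le \epsilon^{m} \max_k \big|\lambda^{(1)}_k - \lambda^{(0)}_k\big|,
\end{aligned}
\tag{3.14}
\]
where we used property (ii) to get the first inequality in the first line, and property (iii) for the remaining ones. Since $\lambda^{(1)}$, $\lambda^{(0)}$ are bounded, Eqs. (3.14) prove that the limits
\[ \lambda^* = \lim_{m \to \infty} \lambda^{(m)}, \qquad \xi^* = \lim_{m \to \infty} \xi^{(m)} \tag{3.15} \]
exist in $S$, $\tilde S$ respectively, and by construction
\[ \lambda^* = T_{\lambda_0,\xi^*}\, \lambda^*, \qquad \xi^* = \tilde T_{\xi_0,\lambda^*}\, \xi^*, \tag{3.16} \]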
i.e. $\lambda^*$, $\xi^*$ are the sequences of running coupling constants from scale 1 to $N$ of the planar $\phi^4_4$ theory, parametrized by $\lambda_0$, $\alpha_0$, $\mu_0$. The proof of analyticity of the limits for $(\lambda_0, \alpha_0, \mu_0) \in W_{\varepsilon,\vartheta} \times B_\eta \times B_\eta$ with $\varepsilon$, $\eta$ small enough is straightforward; it is a consequence of the analyticity properties of the initial data and of the maps $T$, $\tilde T$, and of the fact that convergence is uniform for $(\lambda_0, \alpha_0, \mu_0) \in W_{\varepsilon,\vartheta} \times B_\eta \times B_\eta$. After this, from Eqs. (3.5) and (3.7) we show that $\lambda_0$, $\alpha_0$, $\mu_0$ are analytic for $(\lambda, \alpha, \mu) \in W_{\varepsilon',\vartheta'} \times B_{\eta'} \times B_{\eta'}$ with $\vartheta' > \vartheta$, $\varepsilon' < \varepsilon$, $\eta' < \eta$, and this concludes the proof of analyticity of the running coupling constants in the renormalized ones. Finally, to prove analyticity of the Schwinger functions we use that $S^T_{(N)}(f; q)$ is given by an absolutely convergent power series in the running coupling constants, see Sec. 2, and we prove that the limit for $N \to \infty$ exists and is reached uniformly for $(\lambda, \alpha, \mu) \in W_{\bar\varepsilon,\vartheta} \times B_{\bar\eta} \times B_{\bar\eta}$ with $\bar\varepsilon < \varepsilon'$, $\bar\eta < \eta'$.
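The identity behind (3.8) can be checked numerically in the simplest setting. The Python sketch below assumes a constant $\beta_j = \beta$ and drops the higher-order part $\bar\beta_{4,j}$ of the beta function, so that $R_j$ reduces to $\lambda_j \beta^2 / (1 + \beta \lambda_j)$ (cf. (B.2)); the value of `lam0` is an arbitrary complex point with $|\arg \lambda_0| < \pi$ and is not taken from the paper.

```python
import cmath

def next_coupling(lam_k, beta):
    """Root of lam_k = lam_next + beta * lam_next**2 that is close to lam_k.

    Written as 2*lam_k / (1 + sqrt(1 + 4*beta*lam_k)) to avoid cancellation.
    """
    return 2.0 * lam_k / (1.0 + cmath.sqrt(1.0 + 4.0 * beta * lam_k))

def remainder(lam_j, beta):
    """R_j when the higher-order part of the beta function is set to zero."""
    return lam_j * beta ** 2 / (1.0 + beta * lam_j)

beta = 1.0                         # assumed constant second-order coefficient
lam0 = 0.05 * cmath.exp(2.5j)      # arbitrary starting point, |arg lam0| < pi
lam, inv_sum, max_defect = lam0, 1.0 / lam0, 0.0
for k in range(1, 201):
    lam = next_coupling(lam, beta)
    inv_sum += beta - remainder(lam, beta)   # lam0^-1 + sum beta_j - sum R_j
    max_defect = max(max_defect, abs(1.0 / lam - inv_sum))

print(max_defect)   # defect of the identity, at rounding level
print(abs(lam))     # the coupling decays roughly like 1/(k*beta)
```

In this truncated model the identity holds exactly, so the defect only measures rounding errors; the nontrivial part of the paper is the control of $R_j$ when $\bar\beta_{4,j}$ is kept.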
Bound on the remainder. In Sec. 4.2, we show that relying on the tree representation of the beta function described in Sec. 2, it is possible to rewrite the $q$-point Schwinger function as:
\[ S^T(f; q) = S^{T,(\le n)}(f; q) + r^{(n)}(f; q), \tag{3.17} \]
where $S^{T,(\le n)}(f; q)$ is the Taylor expansion of $S^T(f; q)$ up to order $n$ in $\lambda = 0$, and $r^{(n)}(f; q)$ is a quantity bounded by $(\mathrm{const.})^{n+1} C_q \|f\|_1^q\, (n+1)!\, |\lambda|^{n+1}$ uniformly in the analyticity domain. The idea is to use the graphical representation of the beta function depicted in Fig. 5 to “extract” in the tree expansion of the Schwinger function all the possible trees with fewer than $n + 1$ thin endpoints corresponding to $\lambda$, as suggested by Fig. 6; the main difficulty in this procedure is to check that after having reproduced the Taylor series up to the order $n$ the “unwanted” trees, i.e. the ones showing more than $n$ endpoints of type 4, have fewer than $n + 1$ nontrivial frames labeled by $a = 2', 4$, see remark after Fig. 6. After having checked this, the desired bound is a straightforward consequence of the $n!$ bound (2.36).

4. Proof of Theorem 1

4.1. Analyticity of the flow of the running coupling constants

In this section we present in a mathematically precise form the properties (i)–(iii) mentioned in the previous section after Eq. (3.11), which, as we already discussed, are the key ingredients in the construction of the sequences of the running coupling constants on scale $\ge 1$ as functions of the ones on scale 0. After this, we express the running coupling constants on scale 0 in terms of the renormalized ones, and we prove the analyticity properties required for Borel summability. The spaces of sequences that we shall consider are the following ones:
\[
S_{\lambda_0,\delta} := \Big\{ x \in \mathbb{C}^N : x_k = \frac{1}{\lambda_0^{-1} + \sum_{j=1}^{k} \beta_j + t_k},\ |t_k| \le \sqrt{\delta}\, k \Big\},
\tag{4.1}
\]
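\[ \tilde S_\eta := \{ y \in \mathbb{C}^{2N} : |y_{k,i}| \le \eta \}. \]
The following two lemmas imply, respectively, properties (i), (ii) and property (iii) stated in Sec. 3.

Lemma 1. For any $\vartheta \in (0, \pi/2]$ there exist $\bar\varepsilon > 0$, $\bar\eta > 0$ such that if $(\lambda_0, \alpha_0, \mu_0) \in W_{2\bar\varepsilon,\vartheta/2} \times B_{2\bar\eta} \times B_{2\bar\eta}$ and $x \in S_{\lambda_0,\bar\varepsilon+\bar\eta}$:
(1) $\tilde T_{\xi_0,x}$ is a map from $\tilde S_{4\bar\eta}$ to $\tilde S_{4\bar\eta}$;
(2) $\tilde T_{\xi_0,x}$ is a contraction in $\tilde S_{4\bar\eta}$, i.e. if $y \in \tilde S_{4\bar\eta}$, $y' \in \tilde S_{4\bar\eta}$,
\[
\max_{k,i} \big| (\tilde T_{\xi_0,x}\, y)_{k,i} - (\tilde T_{\xi_0,x}\, y')_{k,i} \big| \le \epsilon \max_{k,i} |y_{k,i} - y'_{k,i}|, \qquad 0 < \epsilon < 1;
\tag{4.2}
\]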
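(3) given two sequences $x$, $x'$ belonging to $S_{\lambda_0,\bar\varepsilon+\bar\eta}$, the fixed points $y(x)$, $y(x')$ of the maps $\tilde T_{\xi_0,x}$, $\tilde T_{\xi_0,x'}$ in $\tilde S_{4\bar\eta}$ verify the following inequalities:
\[ |y_{k,i}(x) - y_{k,i}(x')| \le C\,[\log(1 + \bar\varepsilon k) + 1] \max_k |x_k - x'_k|, \tag{4.3} \]
\[ |y_{k,i}(x) - y_{k,i}(x')| \le C_\rho \max_k |x_k - x'_k|^{\rho}, \tag{4.4} \]
for some positive $C$, $C_\rho$ and $0 < \rho < 1$.

Lemma 2. For any $\vartheta \in (0, \pi/2]$ there exist $\bar\varepsilon > 0$, $\bar\eta > 0$ such that if $(\lambda_0, \alpha_0, \mu_0) \in W_{2\bar\varepsilon,\vartheta/2} \times B_{2\bar\eta} \times B_{2\bar\eta}$ the fixed point $y(x)$ of $\tilde T_{\xi_0,x}$ in $\tilde S_{4\bar\eta}$ for $x \in S_{\lambda_0,\bar\varepsilon+\bar\eta}$ exists and:
(1) $T_{\lambda_0,y(\cdot)}$ is a map from $S_{\lambda_0,\bar\varepsilon+\bar\eta}$ to $S_{\lambda_0,\bar\varepsilon+\bar\eta}$;
(2) $T_{\lambda_0,y(\cdot)}$ is a contraction in $S_{\lambda_0,\bar\varepsilon+\bar\eta}$, i.e. if $x \in S_{\lambda_0,\bar\varepsilon+\bar\eta}$, $x' \in S_{\lambda_0,\bar\varepsilon+\bar\eta}$,
\[
\max_k \big| (T_{\lambda_0,y(x)}\, x)_k - (T_{\lambda_0,y(x')}\, x')_k \big| \le \epsilon \max_k |x_k - x'_k|, \qquad 0 < \epsilon < 1.
\tag{4.5}
\]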
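We refer the reader to Appendices A and B for the proofs of these lemmas. As explained in Sec. 3, these two results allow us to construct the sequences of the running coupling constants as functions of those on scale 0, and to determine their analyticity properties. We take
\[
\xi^{(0)} = \begin{pmatrix} \alpha^{(0)} \\ \mu^{(0)} \end{pmatrix} \in \tilde S_{4\bar\eta}, \qquad
\lambda^{(0)} \in S_{\lambda_0,\bar\varepsilon+\bar\eta}
\tag{4.6}
\]
analytic for $(\lambda_0, \alpha_0, \mu_0) \in W_{2\bar\varepsilon,\vartheta/2} \times B_{2\bar\eta} \times B_{2\bar\eta}$; to be concrete, we can choose
\[
\alpha^{(0)}_k = \mu^{(0)}_k = \bar\eta, \qquad
\lambda^{(0)}_k = \frac{1}{\lambda_0^{-1} + \sum_{j=1}^{k} \beta_j}, \qquad
\lambda_0 \in W_{2\bar\varepsilon,\vartheta/2}.
\tag{4.7}
\]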
Then, we can construct the sequences of running coupling constants by proceeding as explained after (3.12); analyticity for $(\lambda_0, \alpha_0, \mu_0) \in W_{2\bar\varepsilon,\vartheta/2} \times B_{2\bar\eta} \times B_{2\bar\eta}$ is a straightforward consequence of the analyticity properties of the maps and of the initial data, and of the fact that convergence is uniform for $(\lambda_0, \alpha_0, \mu_0) \in W_{2\bar\varepsilon,\vartheta/2} \times B_{2\bar\eta} \times B_{2\bar\eta}$. Now we turn to the flow Eqs. (3.5) and (3.7) for the running coupling constants on scale 0. Notice that these equations are different from the ones corresponding to higher scales, because of the presence of the functions $f_{a,0}$. The main consequence of this fact is that choosing $\lambda$ inside $C_\varepsilon$ does not imply that $\lambda_0 \in C_{\varepsilon'}$ for some $\varepsilon'$; this is the reason why we considered $\lambda_0 \in W_{\varepsilon,\vartheta}$ so far. The strategy that we shall adopt is very similar to, but technically much simpler than, the one we followed for the scales $1, \ldots, N$, see Appendix C for details: first, we determine with a fixed point argument $\alpha_0$, $\mu_0$ as analytic functions of $\lambda_0$, $\alpha$, $\mu$ in $W_{2\varepsilon,\vartheta/2} \times B_\eta \times B_\eta$ for $\varepsilon$, $\eta$ small enough; then, we plug $\alpha_0$, $\mu_0$ into Eq. (3.5) for $\lambda_0$, and we solve it using again a fixed point argument; finally, we show that the solution has the required analyticity
properties in $\lambda$, $\alpha$, $\mu$. In particular, it follows that $\lambda_0^{-1} = \lambda^{-1}(1 + O(\mu)) + \beta_0$, up to corrections bounded by $(\mathrm{const.})(|\lambda| + |\xi|)$.
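Asymptotic behavior of the running coupling constants. So far, our construction allowed us to conclude that, if $(\lambda, \alpha, \mu) \in W_{\varepsilon,\vartheta} \times B_\eta \times B_\eta$ with $\varepsilon$, $\eta$ small enough:
\[
\lambda_k = \frac{1}{\lambda^{-1}(1 + O(\mu)) + \sum_{j=0}^{k} \beta_j + t_k}, \qquad
|\alpha_k| \le \eta, \qquad |\mu_k| \le \eta,
\tag{4.8}
\]
with $|t_k| \le (k+1)\sqrt{\varepsilon + \eta}$; however, these results can be improved to get (3.3). In fact, the flows of $\alpha_k$, $\mu_k$ are given by, for $k \ge 1$:
\[
\alpha_k = \alpha_0 - \sum_{j=1}^{k} \beta_{2',j}(\lambda, \xi), \qquad
\mu_k = \gamma^{-2k} \mu_0 - \sum_{j=1}^{k} \gamma^{2(j-k)} \beta_{2,j}(\lambda, \xi),
\tag{4.9}
\]
where:
\[
|\beta_{2',j}(\lambda, \xi)| \le c\, |\lambda_j|^2, \qquad
|\beta_{2,j}(\lambda, \xi)| \le c\, (|\lambda| + |\xi|)\, |\lambda_j|.
\tag{4.10}
\]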
Therefore it follows that, using the expression for $\lambda_k$ in (4.8), for some $c > 0$:
\[
|\alpha_k - \alpha| \le c\,(|\lambda| + |\mu|^2), \qquad
|\mu_k - \gamma^{-2k} \mu| \le c\,\big[\gamma^{-2k} |\mu|^2 + (|\lambda| + |\xi|)\,|\lambda_k|\big],
\tag{4.11}
\]
which give the last two of (3.3). To prove the first of (3.3), simply use (4.11) and the first of (4.8) to replace the running coupling constants appearing in $R_j$, see (3.9) and (B.2).

Analyticity of the Schwinger functions. As we have discussed in Sec. 2, the Schwinger functions $S^T_{(N)}(f; q)$ are given by absolutely convergent power series in the running coupling constants on scales $\le N$; therefore, taking $\bar\varepsilon$, $\bar\eta$ smaller than the radius of convergence of the series, $S^T_{(N)}(f; q)$ is analytic for $(\lambda, \alpha, \mu) \in W_{\bar\varepsilon,\vartheta} \times B_{\bar\eta} \times B_{\bar\eta}$. To prove analyticity in the limit $N \to +\infty$ we show that the sequence $\{S^T_{(N)}(f; q)\}_{N \ge 1}$ is uniformly Cauchy in the analyticity domain. In fact, consider two positive integers $N$, $N'$ such that $N' > N$; then,
\[
S^T_{(N')}(f; q) - S^T_{(N)}(f; q) := \delta S^T_{1,(N,N')}(f; q) + \delta S^T_{2,(N,N')}(f; q),
\tag{4.12}
\]
where $\delta S^T_{1,(N,N')}(f; q)$ is given by a sum of trees with at least one endpoint on scale $k \le N$ corresponding to the difference of running coupling constants $v^{(a),N'}_k - v^{(a),N}_k$ of theories with cutoffs on scales $N$, $N'$, and $\delta S^T_{2,(N,N')}(f; q)$ is given by a sum of GN trees having root scale $-1$ and at least one endpoint on scale $\ge N + 1$. The first term can be bounded using the results of Appendix D as:
\[ \big| \delta S^T_{1,(N,N')}(f; q) \big| \le (\mathrm{const.})\, N^{-1}, \tag{4.13} \]
while the second can be estimated using the short memory property of the GN trees (see discussion after (2.20)) as, for some $\rho > 0$,
\[ \big| \delta S^T_{2,(N,N')}(f; q) \big| \le (\mathrm{const.})\, \gamma^{-\rho N}, \qquad \gamma > 1; \tag{4.14} \]
all the bounds are uniform in $(\lambda, \alpha, \mu) \in W_{\bar\varepsilon,\vartheta} \times B_{\bar\eta} \times B_{\bar\eta}$. Therefore the limit exists, and it is analytic in $W_{\bar\varepsilon,\vartheta} \times B_{\bar\eta} \times B_{\bar\eta}$.

4.2. Bounds on the Taylor remainder of the Schwinger functions

In this section we show that for all $n > 0$, $(\lambda, \alpha, \mu) \in W_{\bar\varepsilon,\vartheta} \times B_{\bar\eta} \times B_{\bar\eta}$, the $q$-point Schwinger function $S^T(f; q)$ verifies
\[ S^T(f; q) = S^{T,(\le n)}(f; q) + r_n(f; q), \tag{4.15} \]
where $S^{T,(\le n)}(f; q)$ is the Taylor expansion of $S^T(f; q)$ up to the order $n$ in $\lambda = 0$ and $r_n(f; q)$ is a remainder bounded by $C^{n+1} (n+1)!\, |\lambda|^{n+1}$ for some $C > 0$. Result (4.15) concludes the proof of Borel summability of the Schwinger functions of the planar theory. One can try to prove decomposition (4.15) by iterating the graphical definition of the running coupling constants, see discussion after (2.35) and, in particular, Fig. 6 to get an idea of the graphical meaning of the iteration, to “extract” all the possible trees with only thin endpoints and at most $n$ of them labeled by 4; to conclude the proof one has to check at the end that the sum of the values of the trees not belonging to this category is bounded by $C^{n+1} (n+1)!\, |\lambda|^{n+1}$. For simplicity, in the following we shall call “$a$-endpoint” an endpoint labeled by $a$, and “$a$-frame” a frame labeled by $a$; $a$-frames with $a$ equal to $2'$ or 4 will be called “$(2', 4)$-frames”.

Empty and square endpoints. We can rewrite (3.4), (3.7) in the more compact form:
\[
v^{(a)}_k = \gamma^{-2\delta_{a,2}(k+1)}\, v^{(a)}_{-1} - \gamma^{-2\delta_{a,2} k} f_{a,0}(\lambda_0, \mu_0) - \sum_{j=0}^{k} \gamma^{2(j-k)\delta_{a,2}}\, \beta_{a,j}(\lambda_0, \lambda, \xi_0, \xi).
\tag{4.16}
\]
We graphically represent $-\gamma^{-2\delta_{a,2} k} f_{a,0}$ as an empty $a$-endpoint and $-\sum_{j=0}^{k} \gamma^{2(j-k)\delta_{a,2}} \beta_{a,j}$ as a square $a$-endpoint. Therefore, in general, the fat $a$-endpoint can be written as the sum of thin, empty and square $a$-endpoints; see Fig. 8. In turn, the empty and the square endpoints can be represented as sums of framed trees with root scale $k$, no inner frames and only fat endpoints, see discussion after (2.35). It is important to notice that the frames appearing in the tree representation of $-\gamma^{-2\delta_{a,2} k} f_{a,0}$ are trivial, see Remark after Fig. 5.
Fig. 8. Fat endpoints are equal to thin plus empty plus square endpoints.
We define the order and the 4-order of fat, thin, empty, and square endpoints as the order of their values in all the renormalized coupling constants and in $\lambda$ only, respectively. Therefore:
• Thin and fat endpoints have order 1; empty endpoints have order 2; square $a$-endpoints have order 1 or 2 depending on whether $a = 2', 4$ or $a = 2$.
• Thin, fat and empty $a$-endpoints have 4-order 0 or 1 depending on whether $a = 2', 2$ or $a = 4$; square endpoints have 4-order 1.
Notice that the reason why we set to 1 the order and the 4-order of the square $a$-endpoints with $a = 2', 4$, which are given by sums of trees with two 4-endpoints, is that we have to exploit asymptotic freedom to control the sum in (4.16); the result can be bounded uniformly in $k$ by $|\lambda|$ but not by $|\lambda|^2$.

Notations. We shall use the following notations:
• $n_{2',4}(\gamma)$ is the number of nontrivial $(2', 4)$-frames appearing in a tree $\gamma$;
• $n^{(a)}_{sq}(\gamma)$ is the number of square $a$-endpoints appearing in a tree $\gamma$, and $n_{sq}(\gamma) := n^{(4)}_{sq}(\gamma) + n^{(2')}_{sq}(\gamma) + n^{(2)}_{sq}(\gamma)$;
• the order $O(\gamma)$ and the 4-order $O_4(\gamma)$ of a tree $\gamma$ are respectively equal to the sums of the orders, 4-orders of the endpoints of $\gamma$;
• the “expansion” of square and empty endpoints consists in replacing them with their tree expansions in terms of framed trees with no inner frames and only fat endpoints, see discussion after (2.35).

Proof of (3.17). We will proceed by induction. Assume that, at the step $r$ of the induction, for every $n > 0$, $M > 0$ with $M \ge n$ the Schwinger function $S^T(f; q)$ can be written as
\[ S^T(f; q) = F^{(r)}_{n,M} + R^{(r),1}_{n,M} + R^{(r),2}_{n,M}, \tag{4.17} \]
where both $F^{(r)}_{n,M}$, $R^{(r),i}_{n,M}$ can be represented as sums over distinct trees such that $n_{2',4}(\gamma) \le n$. Moreover, we assume that:
• the trees $\gamma$ contributing to $F^{(r)}_{n,M}$ are such that $O_4(\gamma) \le n$, $O(\gamma) \le M$ and show fat and thin endpoints;
• the trees $\gamma$ contributing to $R^{(r),i}_{n,M}$ are such that $O_4(\gamma) > n$ or $O(\gamma) > M$, depending on whether $i = 1, 2$, and may have empty and square endpoints.
These assumptions are trivially true at the beginning of the induction, see Sec. 2. As a consequence of result (2.36), and since the number of topologically distinct trees with $m$ endpoints is estimated by $(\mathrm{const.})^m$, $R^{(r),1}_{n,M}$, $R^{(r),2}_{n,M}$ are bounded
respectively by $C^n C_q \|f\|_1^q\, n!\, |\lambda|^{n+1}$, $C^M C_q \|f\|_1^q\, n!\, \delta^{M+1}$ for some positive $C$ and $\delta := \max_k \{|\lambda_k|, |\alpha_k|, |\mu_k|\}$. Now do the following.

(1) Substitute every fat 2-endpoint appearing in $F^{(r)}_{n,M}$ with the sum of a thin plus an empty plus a square 2-endpoint: in this way the fat 2-endpoints disappear, generating new trees such that $n_{2',4}(\gamma) \le n$ that we organize by writing
\[ F^{(r)}_{n,M} = A^{(r)}_1 + A^{(r),1}_2 + A^{(r),2}_2, \tag{4.18} \]
where
\[
\begin{aligned}
A^{(r)}_1 &:= \text{“sum of trees $\gamma$ such that $O_4(\gamma) \le n$ and $O(\gamma) \le M$”}, \\
A^{(r),1}_2 &:= \text{“sum of trees $\gamma$ such that $O_4(\gamma) > n$”}, \\
A^{(r),2}_2 &:= \text{“sum of trees $\gamma$ such that $O_4(\gamma) \le n$ and $O(\gamma) > M$”}.
\end{aligned}
\]
(2) Substitute every fat $2'$-endpoint appearing in $A^{(r)}_1$ with the sum of a thin plus an empty plus a square $2'$-endpoint: in this way the fat $2'$-endpoints disappear, generating new trees such that $n_{2',4}(\gamma) \le n$ that we organize by writing
\[ A^{(r)}_1 = A^{(r)}_3 + A^{(r),1}_4 + A^{(r),2}_4, \tag{4.19} \]
where
\[
\begin{aligned}
A^{(r)}_3 &:= \text{“sum of trees $\gamma$ s.t. $n_{2',4}(\gamma) + n^{(2')}_{sq}(\gamma) \le n - 1 + \delta_{n_{2',4},0}$ and $O(\gamma) \le M$”}, \\
A^{(r),1}_4 &:= \text{“sum of trees $\gamma$ s.t. $n_{2',4}(\gamma) + n^{(2')}_{sq}(\gamma) > n - 1 + \delta_{n_{2',4},0}$”}, \\
A^{(r),2}_4 &:= \text{“sum of trees $\gamma$ s.t. $n_{2',4}(\gamma) + n^{(2')}_{sq}(\gamma) \le n - 1 + \delta_{n_{2',4},0}$ and $O(\gamma) > M$”}.
\end{aligned}
\]
Notice that the trees appearing in $A^{(r),1}_4$ are such that $O_4(\gamma) > n$; in fact, for these trees,
\[ O_4(\gamma) \ge n_{2',4}(\gamma) + 1 + n^{(2')}_{sq}(\gamma) - \delta_{n_{2',4},0} > n, \tag{4.20} \]
where we used that each nontrivial $2'$-frame contains trees of 4-order $\ge 2$, that the square $2'$-endpoints are of 4-order strictly bigger than their corresponding thin and empty endpoints, and the definition of $A^{(r),1}_4$.

(3) Expand each square $a$-endpoint with $a = 2', 2$ appearing in $A^{(r)}_3$, and write
\[ A^{(r)}_3 = A^{(r)}_5 + A^{(r),1}_6 + A^{(r),2}_6, \tag{4.21} \]
where
\[
\begin{aligned}
A^{(r)}_5 &:= \text{“sum of the trees $\gamma$ s.t. $O_4(\gamma) \le n$ and $O(\gamma) \le M$”}, \\
A^{(r),1}_6 &:= \text{“sum of the trees $\gamma$ s.t. $O_4(\gamma) > n$”}, \\
A^{(r),2}_6 &:= \text{“sum of the trees $\gamma$ s.t. $O_4(\gamma) \le n$ and $O(\gamma) > M$”}.
\end{aligned}
\]
Notice that the trees generated at this step are such that $n_{2',4}(\gamma) \le n$; in fact, for a generic tree $\gamma'$ generated by $\gamma \in A^{(r)}_3$ it follows that $n_{2',4}(\gamma') = n_{2',4}(\gamma) + n^{(2')}_{sq}(\gamma) \le n$, where the last inequality holds by definition of $A^{(r)}_3$.
(4) Substitute every fat 4-endpoint appearing in $A^{(r)}_5$ with the sum of a thin plus an empty plus a square 4-endpoint: in this way the fat 4-endpoints disappear, generating new trees such that $n_{2',4}(\gamma) \le n$ that we organize by writing
\[ A^{(r)}_5 = A^{(r)}_7 + A^{(r),1}_8 + A^{(r),2}_8, \tag{4.22} \]
where
\[
\begin{aligned}
A^{(r)}_7 &:= \text{“sum of trees $\gamma$ s.t. $n_{2',4}(\gamma) + n^{(4)}_{sq}(\gamma) \le n - 1 + \delta_{n_{2',4},0}$ and $O(\gamma) \le M$”}, \\
A^{(r),1}_8 &:= \text{“sum of trees $\gamma$ s.t. $n_{2',4}(\gamma) + n^{(4)}_{sq}(\gamma) > n - 1 + \delta_{n_{2',4},0}$”}, \\
A^{(r),2}_8 &:= \text{“sum of trees $\gamma$ s.t. $n_{2',4}(\gamma) + n^{(4)}_{sq}(\gamma) \le n - 1 + \delta_{n_{2',4},0}$ and $O(\gamma) > M$”}.
\end{aligned}
\]
Now, we show that $A^{(r),1}_8$ can be rewritten as a sum of trees such that $O_4(\gamma) > n$ and $n_{2',4}(\gamma) \le n$. Notice that since the 4-order of the square 4-endpoint is equal to the 4-order of its corresponding fat, thin and empty endpoints, we cannot use a bound like the one in (4.20). To “raise the 4-order” of a tree $\gamma$ up to $n + 1$ we have to expand a suitable number $\tilde n^{(4)}_{sq}(\gamma) \le n^{(4)}_{sq}(\gamma)$ of square 4-endpoints (which are given by sums of trees of 4-order $\ge 2$). If $n_{2',4}(\gamma) = 0$ we choose $\tilde n^{(4)}_{sq}(\gamma) = 0$, because in this case by definition of $S^{(r)}_4$ the 4-order of $\gamma$ is already $> n$; if $n_{2',4}(\gamma) > 0$ we choose
\[ \tilde n^{(4)}_{sq}(\gamma) := n - n_{2',4}(\gamma); \tag{4.23} \]
with this choice it follows that ($n_{2',4}(\gamma)$ refers to the tree $\gamma$ before this last expansion),
\[ O_4(\gamma) \ge n_{2',4}(\gamma) + 1 + \tilde n^{(4)}_{sq}(\gamma) = n + 1. \tag{4.24} \]
Finally, a generic tree $\gamma'$ produced by this last expansion verifies
\[ n_{2',4}(\gamma') = n_{2',4}(\gamma) + \tilde n^{(4)}_{sq}(\gamma) = n. \tag{4.25} \]
(5) Expand each square 4-endpoint appearing in $A^{(r)}_7$, and write
\[ A^{(r)}_7 = A^{(r)}_9 + A^{(r),1}_{10} + A^{(r),2}_{10}, \tag{4.26} \]
where
\[
\begin{aligned}
A^{(r)}_9 &:= \text{“sum of the trees $\gamma$ s.t. $O_4(\gamma) \le n$ and $O(\gamma) \le M$”}, \\
A^{(r),1}_{10} &:= \text{“sum of the trees $\gamma$ s.t. $O_4(\gamma) > n$”}, \\
A^{(r),2}_{10} &:= \text{“sum of the trees $\gamma$ s.t. $O_4(\gamma) \le n$ and $O(\gamma) > M$”}.
\end{aligned}
\tag{4.27}
\]
Notice that the trees generated at this step are such that $n_{2',4}(\gamma) \le n$; in fact, for a generic tree $\gamma'$ generated by $\gamma \in A^{(r)}_7$ it follows that $n_{2',4}(\gamma') = n_{2',4}(\gamma) + n^{(4)}_{sq}(\gamma) \le n$, where the last inequality holds by definition of $A^{(r)}_7$.
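(6) Expand each empty $a$-endpoint appearing in $A^{(r)}_9$, and write
\[ A^{(r)}_9 = F^{(r+1)}_{n,M} + A^{(r),1}_{12} + A^{(r),2}_{12}, \tag{4.28} \]
where
\[
\begin{aligned}
F^{(r+1)}_{n,M} &:= \text{“sum of the trees $\gamma$ s.t. $O_4(\gamma) \le n$ and $O(\gamma) \le M$”}, \\
A^{(r),1}_{12} &:= \text{“sum of the trees $\gamma$ s.t. $O_4(\gamma) > n$”}, \\
A^{(r),2}_{12} &:= \text{“sum of the trees $\gamma$ s.t. $O_4(\gamma) \le n$ and $O(\gamma) > M$”}.
\end{aligned}
\tag{4.29}
\]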
(7) We are now able to express the generic Schwinger function $S^T(f; q)$ as
\[
\begin{aligned}
S^T(f; q) &= F^{(r+1)}_{n,M} + R^{(r),1}_{n,M} + R^{(r),2}_{n,M} + \sum_{j=1,2} \sum_{i=1}^{6} A^{(r),j}_{2i} \\
&=: F^{(r+1)}_{n,M} + R^{(r+1),1}_{n,M} + R^{(r+1),2}_{n,M},
\end{aligned}
\tag{4.30}
\]
where, by construction, all the trees are such that $n_{2',4}(\gamma) \le n$, the remainder $R^{(r+1),1}_{n,M}$ contains distinct trees such that $O_4(\gamma) > n$, while $R^{(r+1),2}_{n,M}$ is given by a sum of distinct trees such that $O(\gamma) > M$. If $F^{(r+1)}_{n,M}$ still contains trees with fat endpoints repeat the process starting from step (1), otherwise we have finished: calling $r^*$ the final step (which is finite, see Remark below), i.e. the integer such that $F^{(r^*)}_{n,M}$ contains trees with only thin endpoints, the $n!$ bound (2.36) implies that, if $\delta = \max_h \{|\lambda_h|, |\alpha_h|, |\mu_h|\}$:
\[
\big| R^{(r^*),1}_{n,M} \big| \le C^n C_q \|f\|_1^q\, n!\, |\lambda|^{n+1}, \qquad
\big| R^{(r^*),2}_{n,M} \big| \le C^M C_q \|f\|_1^q\, n!\, \delta^{M+1}.
\tag{4.31}
\]
Moreover, $F^{(r^*)}_{n,M}$ differs from $F^{(r^*)}_{n,+\infty}$, the Taylor expansion in $\lambda$ to the order $n$, by a quantity bounded by $C^M C_q \|f\|_1^q\, n!\, \delta^{M+1}$; therefore, for each $\lambda$ in the analyticity domain and for each $n \ge 0$ there exists a finite integer $M(\lambda, n) \ge n$ such that for all $M \ge M(\lambda, n)$ it follows that:
\[
\big| S^T(f; q) - F^{(r^*)}_{n,+\infty} \big| \le 4\, C^n C_q \|f\|_1^q\, n!\, |\lambda|^{n+1},
\tag{4.32}
\]
and this bound concludes the proof of Borel summability of the $\phi^4_4$ planar theory.

Remark. The iteration ends in less than $M + 1$ steps (where each step is formed by the seven substeps described above); this means that no trees with fat endpoints are present in $F^{(M+1)}_{n,M}$. We can prove this fact with a simple induction. At the step $r = 0$, the trees with fat endpoints are of order $\ge 0$. Assume inductively that at the $r$th step, the trees belonging to $F^{(r)}_{n,M}$ with at least one fat endpoint are of order $\ge r$. If this is true, by repeating the seven substeps described above, we find that the new trees with at least one fat endpoint appearing in $F^{(r+1)}_{n,M}$ must be of order $\ge r + 1$, since at the $r$th step the fat endpoints are replaced by thin plus empty plus square endpoints, and the empty endpoints are of order 2 while the squares are given by sums of trees of order $\ge 2$. Hence, after at most $r^* = M + 1$ iterations no more trees showing fat endpoints will be present in $F^{(r^*)}_{n,M}$.
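The bookkeeping shared by the decompositions (4.18), (4.21), (4.26) and (4.28) is a three-way partition of a list of trees according to their 4-order and order. The Python sketch below only illustrates that partition; the `Tree` record and the function `split` are hypothetical names introduced here, and a tree is reduced to the two integers $O_4(\gamma)$ and $O(\gamma)$, ignoring its actual structure and value.

```python
from typing import List, NamedTuple, Tuple

class Tree(NamedTuple):
    order4: int  # O_4(gamma): order in lambda only
    order: int   # O(gamma): order in all renormalized coupling constants

def split(trees: List[Tree], n: int, M: int) -> Tuple[List[Tree], List[Tree], List[Tree]]:
    """Partition trees into (kept, remainder_1, remainder_2).

    kept:        O_4 <= n and O <= M
    remainder_1: O_4 > n
    remainder_2: O_4 <= n and O > M
    """
    kept, rem1, rem2 = [], [], []
    for tree in trees:
        if tree.order4 > n:
            rem1.append(tree)
        elif tree.order > M:
            rem2.append(tree)
        else:
            kept.append(tree)
    return kept, rem1, rem2

sample = [Tree(1, 2), Tree(3, 3), Tree(2, 7), Tree(0, 1), Tree(4, 9)]
kept, rem1, rem2 = split(sample, n=2, M=5)
print(kept, rem1, rem2)
```

Every tree lands in exactly one of the three classes, which is what allows the remainders to be accumulated step by step as in (4.30).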
5. Conclusions

In this paper, we discussed the issue of Borel summability in the framework of multiscale analysis and renormalization group, by providing a proof of Borel summability for the $\phi^4_4$ planar theory using the techniques of [6]. This result is not new, since it has been proven independently by ’t Hooft and Rivasseau, [2–5]. The proof given by ’t Hooft is based on renormalization group methods, and it does not rely on the Nevanlinna–Sokal theorem; we have not been able to fully reproduce ’t Hooft’s argument in our rigorous framework. The proof given by Rivasseau, instead, consists in a check of the two hypotheses of the Nevanlinna–Sokal theorem. However, his methods are quite different from the ones that we use, since in his approach the beta function was not introduced. Moreover, in his work a particular choice of the wave function renormalization and of the renormalized mass was made. One of the motivations of our work is that very few proofs of Borel summability of interacting field theories based on renormalization group methods are present in the literature, [8, 9]. Moreover, our framework has already been proved effective in the analysis of various models of condensed matter and field theory. Therefore, we consider our work as a first step towards the analysis of more interesting models. For instance, we think that the ideas of this paper can be applied to the one-dimensional Hubbard model, which has been rigorously constructed through renormalization group methods in [15], but where a proof of Borel summability has not been given yet. In fact, due to the anticommutativity of the fermionic fields the factorial growth of the Feynman graphs can be controlled using the so-called Gram bounds. Moreover, one sector of the theory is asymptotically free, while to control the flow of the other running coupling constants one has to exploit the vanishing of the beta function.

Regarding our work, the first part of this paper consists essentially in a rigorous study of the beta function of an asymptotically free field theory. In particular, we have shown that the theory is analytic for values of the renormalized coupling constant $\lambda$ belonging to a “Watson domain”, see [18] and definition (2.40), and for values of the wave function renormalization and of the renormalized mass close to 1 in absolute value. In the second part of our work, to prove Borel summability we have shown that it is possible to “undo” the resummation that allowed us to write the Schwinger functions as a convergent power series in the running coupling constants, in such a way that the difference between the generic Schwinger function and its Taylor expansion to the order $n$ in $\lambda$ is bounded by $C^n n!\, |\lambda|^{n+1}$ for some positive $C$. Thanks to the Nevanlinna–Sokal theorem, see [14], this last fact along with the above mentioned analyticity properties implies Borel summability.

Acknowledgments

It is a pleasure to thank Prof. G. Gallavotti for having introduced us to the theory of renormalization, for having proposed the problem and for many very useful discussions, from which all the ideas of this paper emerged. We are also grateful to Dr. A. Giuliani, for constant encouragement and constructive criticism.
Appendix A. Proof of Lemma 1

In this appendix, we present the proof of Lemma 1. Recall that $\bar r$ is the number of running coupling constants of type 4 appearing at a given order $r$ of the perturbative series defining the beta function, see (2.28); moreover, we define $\tilde r := r - \bar r$. We also recall that the notation $y(x)$ denotes the fixed point of the map $\tilde T_{\xi_0,x}$ in $\tilde S_{2\eta}$. All the estimates that we shall derive here and in the next appendix are consequences of the fact that, as it can be checked in a straightforward way, if $x, x' \in S_{\lambda_0,\varepsilon+\eta}$, $\lambda_0 \in W_{\varepsilon,\vartheta}$ and $\varepsilon$, $\eta$ are small enough there exists a constant $C_\vartheta > 0$ such that
\[
|x_k| \le \frac{C_\vartheta}{|\lambda_0|^{-1} + k}, \qquad
\Big| \frac{x_k}{x_h} \Big| \le C_\vartheta \quad \text{if } k \ge h;
\tag{A.1}
\]
the constant $C_\vartheta$ grows as $\sim \vartheta^{-1}$ for $\vartheta \to 0$.
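The first bound in (A.1) can be explored numerically in a simplified case. The Python sketch below assumes $\beta_j = 1$ and $t_k = 0$, so that $x_k = 1/(\lambda_0^{-1} + k)$; these simplifications, the function name `worst_ratio` and the sampled angles are assumptions made here for illustration. In this simplified case an elementary computation gives $|x_k|\,(|\lambda_0|^{-1} + k) \le 1/\sin(\vartheta/2)$ whenever $|\arg \lambda_0| \le \pi - \vartheta$, which grows like $2/\vartheta$ as $\vartheta \to 0$.

```python
import cmath
import math

def worst_ratio(theta, eps, k_max=2000, n_angles=41):
    """Largest value of |x_k| * (|lam0|^-1 + k) over sampled lam0 and k.

    lam0 = eps * exp(i*phi) with |phi| <= pi - theta, and
    x_k = 1 / (1/lam0 + k)  (simplified sequence: beta_j = 1, t_k = 0).
    """
    worst = 0.0
    for a in range(n_angles):
        phi = (math.pi - theta) * (2.0 * a / (n_angles - 1) - 1.0)
        lam0 = eps * cmath.exp(1j * phi)
        for k in range(1, k_max + 1):
            x_k = 1.0 / (1.0 / lam0 + k)
            worst = max(worst, abs(x_k) * (1.0 / eps + k))
    return worst

for theta in (0.2, 0.5, 1.0, math.pi / 2):
    print(theta, worst_ratio(theta, eps=0.05), 1.0 / math.sin(theta / 2.0))
```

The two printed columns should agree closely, since for `eps = 0.05` the extremal case $k = |\lambda_0|^{-1} = 20$ at the boundary angle is among the sampled points.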
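Proof of Lemma 1(1). First, we have to prove that if $(\lambda_0, \alpha_0, \mu_0) \in W_{\varepsilon,\vartheta} \times B_\eta \times B_\eta$ and $x \in S_{\lambda_0,\varepsilon+\eta}$ the map $\tilde T_{\xi_0,x}$ leaves invariant $\tilde S_{2\eta}$, for $\vartheta \in (0, \pi/2]$, and $\varepsilon$, $\eta$ small enough; in fact, setting $a = (a_1, \ldots, a_r)$, $h = (h_1, \ldots, h_r)$:
\[
\begin{aligned}
\big| (\tilde T_{\xi_0,x}\, y)_{k,2'} \big|
&\le |\alpha_0| + \sum_{j=1}^{k} \sum_{r \ge 2} \sum_{\substack{h_i \ge j \\ i = 1, \ldots, r}} \sum_{\{a_i\}_{i=1}^{r}} \big| \beta^{(2')}_{a}(j; h) \big| \prod_{\substack{i=1 \\ a_i = 4}}^{r} |x_{h_i}| \prod_{\substack{i=1 \\ a_i \ne 4}}^{r} |y_{h_i,a_i}| \\
&\le |\alpha_0| + \sum_{j=1}^{k} \sum_{r \ge 2} \sum_{\substack{h_i \ge j \\ i = 1, \ldots, r}} \sum_{\{a_i\}_{i=1}^{r}} \big| \beta^{(2')}_{a}(j; h) \big|\, |x_j|^2\, C_\vartheta^{\bar r}\, \varepsilon^{\bar r - 2} (2\eta)^{\tilde r} \\
&\le |\alpha_0| + C_\vartheta\, \varepsilon
\end{aligned}
\tag{A.2}
\]
for some $C_\vartheta > 0$. Similarly,
\[
\begin{aligned}
\big| (\tilde T_{\xi_0,x}\, y)_{k,2} \big|
&\le |\mu_0| + \sum_{j=1}^{k} \gamma^{2(j-k)} \sum_{r \ge 2} \sum_{\substack{h_i \ge j \\ i = 1, \ldots, r}} \sum_{\{a_i\}_{i=1}^{r}} \big| \beta^{(2)}_{a}(j; h) \big|\, C_\vartheta^{\bar r}\, \varepsilon^{\bar r} (2\eta)^{\tilde r} \\
&\le |\mu_0| + C_\vartheta\, (\varepsilon + \eta)^2,
\end{aligned}
\tag{A.3}
\]
for $C_\vartheta$ large enough. Hence, if $(\alpha_0, \mu_0) \in B_\eta \times B_\eta$, then both (A.2), (A.3) can be made smaller than $2\eta$ taking $\varepsilon$ small enough.

Proof of Lemma 1(2). Under the same assumption of Lemma 1(1), we show now that $\tilde T_{\xi_0,x}$ is a contraction in $\tilde S_{2\eta}$; in fact,
\[
\big| (\tilde T_{\xi_0,x}\, y)_{k,2'} - (\tilde T_{\xi_0,x}\, y')_{k,2'} \big|
\le \sum_{j=1}^{k} \sum_{r \ge 3} \sum_{\substack{h_i \ge j \\ i = 1, \ldots, r}} \sum_{\{a_i\}_{i=1}^{r}} \big| \beta^{(2')}_{a}(j; h) \big| \prod_{\substack{i=1 \\ a_i = 4}}^{r} |x_{h_i}| \cdot (6\eta)^{\tilde r - 1}\, \tilde r\, \max_{k,i} |y_{k,i} - y'_{k,i}|,
\tag{A.4}
\]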
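where we used that the second order of the beta function depends only on $x_j$; therefore, we can exploit two of the $x_{h_i}$’s to perform the sum, and it follows that, for $\varepsilon$, $\eta$ small enough:
\[
\big| (\tilde T_{\xi_0,x}\, y)_{k,2'} - (\tilde T_{\xi_0,x}\, y')_{k,2'} \big| \le \epsilon \max_{k,i} |y_{k,i} - y'_{k,i}|, \qquad 0 < \epsilon < 1.
\tag{A.5}
\]
The same result can be proved for the difference of the 2-components, using the $\gamma^{2(j-k)}$ factor to perform the sum over the $j$’s; this concludes the proof of the contractivity of $\tilde T_{\xi_0,x}$.

Proof of Lemma 1(3). We prove here the last item of Lemma 1. Given $y \in \tilde S_{2\eta}$, set
\[
y_{k,i,n} := \big( \tilde T^{\,n}_{\xi_0,x}\, y \big)_{k,i}, \qquad
y'_{k,i,n} := \big( \tilde T^{\,n}_{\xi_0,x'}\, y \big)_{k,i},
\tag{A.6}
\]
and assume inductively that for all $0 \le m \le n$ the following bound is true:
\[
|y_{k,i,m} - y'_{k,i,m}| \le C\,(\log(1 + \varepsilon k) + 1) \max_k |x_k - x'_k|;
\tag{A.7}
\]
therefore, from (A.7) it follows that:
\[
\begin{aligned}
|y_{k,2',n+1} - y'_{k,2',n+1}| \le{}& \sum_{j=1}^{k} \sum_{r \ge 2} \sum_{\substack{h_i \ge j \\ i = 1, \ldots, r}} \sum_{\{a_i\}_{i=1}^{r}} \big| \beta^{(2')}_{a}(j; h) \big| \Big[ C_\vartheta^{\bar r - 1}\, |x_j|\, \bar r\, (3\varepsilon)^{\bar r - 2} (6\eta)^{\tilde r} \\
&+ \sum_{\substack{\ell = 1 \\ a_\ell \ne 4}}^{r} C\, C_\vartheta^{\bar r}\, (\log(1 + \varepsilon h_\ell) + 1)\, |x_j|^2\, (3\varepsilon)^{\bar r - 2} (6\eta)^{\tilde r - 1} \Big] \times \max_k |x_k - x'_k|,
\end{aligned}
\tag{A.8}
\]
and
\[
\begin{aligned}
|y_{k,2,n+1} - y'_{k,2,n+1}| \le{}& \sum_{j=1}^{k} \gamma^{2(j-k)} \sum_{r \ge 2} \sum_{\substack{h_i \ge j \\ i = 1, \ldots, r}} \sum_{\{a_i\}_{i=1}^{r}} \big| \beta^{(2)}_{a}(j; h) \big| \Big[ C_\vartheta^{\bar r - 1}\, \bar r\, (3\varepsilon)^{\bar r - 1} (6\eta)^{\tilde r} \\
&+ \sum_{\substack{\ell = 1 \\ a_\ell \ne 4}}^{r} C\, C_\vartheta^{\bar r}\, (\log(1 + \varepsilon h_\ell) + 1)\, (3\varepsilon)^{\bar r} (6\eta)^{\tilde r - 1} \Big] \times \max_k |x_k - x'_k|.
\end{aligned}
\tag{A.9}
\]
Using the short memory property of the GN trees, see discussion after (2.20), it follows that:
\[
\sum_{\substack{h_1, \ldots, h_r \\ h_i \ge j}} \big| \beta^{(i)}_{a}(j; h) \big| \log(1 + \varepsilon h_\ell) \le (\mathrm{const.})^r \log(1 + \varepsilon j);
\tag{A.10}
\]
plugging this bound into (A.8), (A.9) we can reproduce our inductive assumption (A.7) for $m = n + 1$, choosing $\varepsilon$, $\eta$ small enough. This concludes the proof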
of (4.3). The "Hölder continuity bound" (4.4) can be proved again by induction, replacing (A.7) with
\[ |y_{k,i,m}-y'_{k,i,m}| \le C_\rho \max_k|x_k-x'_k|^{\rho}, \qquad 0<\rho<1, \tag{A.11} \]
and using in (A.8) the bound, if $h_i \ge j$ for all $i = 1,\dots,r$ and $0<\rho<1$:
\[ \Bigl|\prod_{\substack{i=1\\ a_i=4}}^{r} x_{h_i} - \prod_{\substack{i=1\\ a_i=4}}^{r} x'_{h_i}\Bigr| \le 2\bar r \max_k|x_k-x'_k|^{\rho}\,|x_j|^{2-\rho}\,C_\vartheta^{\bar r}(3\varepsilon)^{\bar r-2}. \tag{A.12} \]
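Appendix B. Proof of Lemma 2

In this appendix we present a proof of Lemma 2.

Proof of Lemma 2(1). First, we have to prove that $T_{\lambda_0,y(\cdot)}$ leaves invariant $S_{\lambda_0,\varepsilon+\eta}$ for $(\lambda_0,\alpha_0,\mu_0)\in W_{\varepsilon,\vartheta}\times B_\eta\times B_\eta$ and $\varepsilon,\eta$ small enough. We have that
\[ (T_{\lambda_0,y(x)}\,x)_k = \frac{1}{\lambda_0^{-1} + \sum_{j=1}^{k}\beta_j - \sum_{j=1}^{k} R_j(x,y(x))}, \tag{B.1} \]
where $y(x) \in \tilde S_{2\eta}$ and
\[ R_j(x,y(x)) = \frac{x_j\beta_j^{2} + x_j^{-1}\beta_j\,\bar\beta_{4,j}(x,y(x)) - x_j^{-2}\,\bar\beta_{4,j}(x,y(x))}{1+\beta_j x_j + x_j^{-1}\,\bar\beta_{4,j}(x,y(x))}, \tag{B.2} \]
with
\[ \bar\beta_{4,j}(x,y(x)) = \sum_{r\ge 3}\sum_{\substack{h_i\ge j\\ i=1,\dots,r}}\sum_{\{a_i\}_{i=1}^{r}} \beta_a^{(4)}(j;h) \prod_{\substack{i=1\\ a_i=4}}^{r} x_{h_i} \prod_{\substack{i=1\\ a_i\neq 4}}^{r} y_{h_i,a_i}(x), \tag{B.3} \]
where $\beta^{(4)}_{a_1,\dots,a_r}(j;h_1,\dots,h_r) = 0$ unless there are at least two $a_i$ equal to 4. The final statement follows from the fact that, for $\varepsilon,\eta$ small enough,
\[ |R_j(x,y(x))| \le (\mathrm{const.})(\varepsilon+\eta). \tag{B.4} \]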
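To get a feeling for the first bound in (A.1), one can look at the map (B.1) with every $R_j$ dropped. The sketch below is only illustrative and rests on assumptions that are not taken from the text: it sets $\beta_j = 1$ for all $j$, and it assumes the sector is $W_{\varepsilon,\vartheta} = \{|\lambda| < \varepsilon,\ |\arg\lambda| < \pi-\vartheta\}$. Under these assumptions the elementary inequality $|z+t| \ge \sin(\vartheta/2)\,(|z|+t)$ for $t \ge 0$ gives $|x_k| \le C/(|\lambda_0|^{-1}+k)$ with $C = 1/\sin(\vartheta/2)$, which indeed grows like $\vartheta^{-1}$ as $\vartheta \to 0$.

```python
import cmath
import math


def leading_order_flow(lam0, n_scales, beta=1.0):
    """x_k = 1 / (1/lam0 + beta*k): the map (B.1) with all R_j set to zero.

    beta is a hypothetical constant value for the coefficients beta_j.
    """
    return [1.0 / (1.0 / lam0 + beta * k) for k in range(1, n_scales + 1)]


def sector_constant(theta):
    """C = 1/sin(theta/2), valid when |arg lam0| <= pi - theta (assumed sector)."""
    return 1.0 / math.sin(theta / 2.0)


def check_bound(lam0, theta, n_scales=200, beta=1.0):
    """Check |x_k| <= C / (1/|lam0| + beta*k) for k = 1..n_scales."""
    c = sector_constant(theta)
    xs = leading_order_flow(lam0, n_scales, beta)
    return all(
        abs(x) <= c / (1.0 / abs(lam0) + beta * k) * (1.0 + 1e-12)
        for k, x in enumerate(xs, start=1)
    )


theta = 0.3
lam0 = cmath.rect(0.05, math.pi - theta)  # a point on the edge of the assumed sector
print(check_bound(lam0, theta))
```

The worst case is a coupling on the edge of the sector, which is why the example places `lam0` there; for real positive `lam0` the bound holds with a large margin.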
Proof of Lemma 2(2). To conclude, we have to show that, under the same assumptions as in the previous item, $T_{\lambda_0,y(x)}$ is a contraction in $S_{\lambda_0,\varepsilon+\eta}$. Setting $y(x) =: y$, $y(x') =: y'$, from (B.1) we have that
\[
(T_{\lambda_0,y(x)}\,x)_k - (T_{\lambda_0,y(x')}\,x')_k = \frac{\sum_{j=1}^{k}\bigl(R_j(x,y)-R_j(x',y')\bigr)}{\Bigl(\lambda_0^{-1}+\sum_{j=1}^{k}\beta_j-\sum_{j=1}^{k}R_j(x,y)\Bigr)\Bigl(\lambda_0^{-1}+\sum_{j=1}^{k}\beta_j-\sum_{j=1}^{k}R_j(x',y')\Bigr)},
\tag{B.5}
\]
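where $R_j$ is given by (B.2); therefore, to bound the difference of the $R_j$'s calculated at different $x$ we have to estimate (the other terms can be worked out in a similar way)
\[
\bigl|x_j^{-2}\,\bar\beta_{4,j}(x,y) - {x'_j}^{-2}\,\bar\beta_{4,j}(x',y')\bigr| \le \sum_{r\ge 3}\sum_{\{h_i\}\ge j}\sum_{\{a_i\}} \bigl|\beta_a^{(4)}(j;h)\bigr| \cdot \Bigl| x_j^{-2}\prod_{\substack{i=1\\ a_i=4}}^{r} x_{h_i}\prod_{\substack{i=1\\ a_i\neq 4}}^{r} y_{h_i,a_i} - {x'_j}^{-2}\prod_{\substack{i=1\\ a_i=4}}^{r} x'_{h_i}\prod_{\substack{i=1\\ a_i\neq 4}}^{r} y'_{h_i,a_i}\Bigr|;
\tag{B.6}
\]
we have that
\[
\Bigl| x_j^{-2}\prod_{\substack{i=1\\ a_i=4}}^{r} x_{h_i}\prod_{\substack{i=1\\ a_i\neq 4}}^{r} y_{h_i,a_i} - {x'_j}^{-2}\prod_{\substack{i=1\\ a_i=4}}^{r} x'_{h_i}\prod_{\substack{i=1\\ a_i\neq 4}}^{r} y'_{h_i,a_i}\Bigr|
\]
\[
\le |x_j|^{-2}\,\Bigl|\prod_{\substack{i=1\\ a_i=4}}^{r} x_{h_i}\prod_{\substack{i=1\\ a_i\neq 4}}^{r} y_{h_i,a_i} - \prod_{\substack{i=1\\ a_i=4}}^{r} x'_{h_i}\prod_{\substack{i=1\\ a_i\neq 4}}^{r} y'_{h_i,a_i}\Bigr|
\tag{B.7}
\]
\[
\quad + \max_k|x_k-x'_k|\,\frac{|x_j+x'_j|}{|x_j|^{2}\,|x'_j|^{2}}\,\Bigl|\prod_{\substack{i=1\\ a_i=4}}^{r} x'_{h_i}\prod_{\substack{i=1\\ a_i\neq 4}}^{r} y'_{h_i,a_i}\Bigr|
\tag{B.8}
\]
and:
\[
\text{(B.7)} \le |x_j|^{-1}\Bigl[ C_\vartheta^{\bar r-1}\,r\,(3\varepsilon)^{\bar r-2}(6\eta)^{\tilde r}\max_k|x_k-x'_k| + C_\vartheta^{\bar r}(3\varepsilon)^{\bar r-2}(6\eta)^{\tilde r-1}\sum_{\substack{\ell=1\\ a_\ell\neq 4}}^{r}|y_{h_\ell,a_\ell}-y'_{h_\ell,a_\ell}|\Bigr],
\tag{B.9}
\]
\[
\text{(B.8)} \le \max_k|x_k-x'_k|\,|x_j|^{-1}\,2\,C_\vartheta^{\bar r}(3\varepsilon)^{\bar r-2}(6\eta)^{\tilde r}.
\tag{B.10}
\]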
Using (4.3) and the short memory property it follows that:
\[
\sum_{\substack{h_1,\dots,h_r\\ h_i\ge j}} \bigl|\beta_a^{(4)}(j;h)\bigr|\,|y_{h_\ell,a_\ell}-y'_{h_\ell,a_\ell}| \le (\mathrm{const.})^{r}\,[\log(1+\varepsilon j)+1]\max_k|x_k-x'_k|;
\tag{B.11}
\]
therefore, since the other terms arising in the difference (B.5) can be treated in exactly the same way, from (B.9)–(B.11) we find that:
\[
\sum_{j=1}^{k}|R_j(x',y')-R_j(x,y)| \le (\mathrm{const.})\Bigl[\sum_{j=1}^{k}|x_j|^{-1}(\varepsilon+\eta) + k(\log(1+\varepsilon k)+1)\Bigr] \times \max_k|x_k-x'_k|,
\tag{B.12}
\]
which gives statement (4.5) for $\varepsilon,\eta$ small enough. In fact, the denominator of (B.5) is bounded from below as
\[
\Bigl|\lambda_0^{-1}+\sum_{j=1}^{k}\bigl(\beta_j-R_j(x,y)\bigr)\Bigr|\,\Bigl|\lambda_0^{-1}+\sum_{j=1}^{k}\bigl(\beta_j-R_j(x',y')\bigr)\Bigr| \ge (\mathrm{const.})\,|x_k|^{-2};
\tag{B.13}
\]
using the second of (A.1) our claim (4.5) follows.

Appendix C. The Running Coupling Constants on Scale 0

In this appendix, we discuss how to express the running coupling constants on scale zero as functions of the renormalized ones. First, a straightforward computation shows that the second equation in (3.7) can be rewritten as:
\[ \mu_0 = \frac{\mu}{1+\mu} - \frac{1-\mu_0}{1+\mu}\,\beta_{2,0}(\lambda_0,\lambda,\xi_0,\xi); \tag{C.1} \]
this is a consequence of the fact that in (2.28) $\beta^{(2)}_{a_1,\dots,a_r}(0;0,\dots,0) = 1$ if $a_i = 2$ for all $i\in[1,r]$. Since the running coupling constants on scale $> 0$ are parametrized by the ones on scale 0, we can rewrite (C.1) as
\[ \mu_0 =: \frac{\mu}{1+\mu} + g_2(\lambda_0,\xi_0,\mu) =: \mu + \tilde f_2(\mu) + g_2(\lambda_0,\xi_0,\mu), \tag{C.2} \]
and plugging (C.2) in the first equation of (3.7) we get
\[ \alpha_0 =: \alpha + \tilde f_{2'}(\mu) + g_{2'}(\lambda_0,\xi_0,\mu), \tag{C.3} \]
where the $\tilde f_i(\mu)$ are analytic functions of $\mu\in B_{\bar\eta}$, and the $g_i(\lambda_0,\xi_0,\mu)$ are analytic for $(\lambda_0,\xi_0,\mu)\in W_{2\bar\varepsilon,\vartheta/2}\times B_{2\bar\eta}\times B_{2\bar\eta}\times B_{\bar\eta}$. Formulas (C.2), (C.3) can be regarded as a fixed point equation:
\[ \xi_{i,0} = \bigl(\tilde M_{\xi,\lambda_0}\,\xi_0\bigr)_i; \tag{C.4} \]
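all we have to do is to check that: (i) for $|\lambda_0|, |\xi|$ small enough $\tilde M_{\xi,\lambda_0}$ leaves invariant the set $B_{\frac32\bar\eta}\times B_{\frac32\bar\eta}$, and (ii) $\tilde M_{\xi,\lambda_0}$ is a contraction therein. Property (i) is a straightforward consequence of the fact that
\[ |\tilde f_i(\mu)| \le C|\mu|^{2}, \qquad |g_i(\lambda_0,\xi_0,\mu)| \le C|\lambda_0|\,(|\lambda_0|+|\xi|), \tag{C.5} \]
where in the second inequality we used that $|\lambda_k| \le c|\lambda_0|$, $|\xi_{i,k}| \le c\,(|\lambda_0|+|\xi_0|)$ and that, from (3.7), $|\xi_{i,0}| \le c\,(|\lambda_0|+|\xi|)$; if we choose $(\lambda_0,\xi)\in W_{2\bar\varepsilon,\vartheta/2}\times B_{\bar\eta}\times B_{\bar\eta}$ with $\bar\varepsilon,\bar\eta$ small enough then the set $B_{\frac32\bar\eta}\times B_{\frac32\bar\eta}\subset B_{2\bar\eta}\times B_{2\bar\eta}$ is left invariant by (C.4). To prove property (ii), we use a Cauchy estimate. In fact, the Cauchy bound tells us that if $y, y'\in B_{\frac32\bar\eta}\times B_{\frac32\bar\eta}$ then, since $g_i(\lambda_0,y,\mu)$ is analytic for $y\in B_{2\bar\eta}\times B_{2\bar\eta}$ and bounded as in (C.5), for $(\lambda_0,\mu)\in W_{2\bar\varepsilon,\vartheta/2}\times B_{\bar\eta}$ with $\bar\varepsilon,\bar\eta$ small enough:
\[ |g_i(\lambda_0,y,\mu)-g_i(\lambda_0,y',\mu)| \le 2C\,\frac{\bar\varepsilon}{\bar\eta}\,(\bar\varepsilon+\bar\eta)\max_i|y_i-y'_i| \le \epsilon\max_i|y_i-y'_i| \tag{C.6} \]
with $0<\epsilon<1$. Therefore, we can construct explicitly the solution $\xi_{i,0}(\lambda_0,\xi)$, and the above properties allow us to conclude that it is analytic for $(\lambda_0,\xi)\in W_{2\bar\varepsilon,\vartheta/2}\times B_{\bar\eta}\times B_{\bar\eta}$.

After this, we are left with Eq. (3.5) for $\lambda_0$; since all the couplings on scale $\ge 1$ are functions of $\lambda_0, \xi_0$ and, as we know from our previous analysis, $\xi_{i,0} = \xi_{i,0}(\lambda_0,\xi)$, we can rewrite (3.5) as:
\[ \lambda_0 =: \lambda - \lambda_0\tilde f_4(\mu) - \beta_0\lambda_0^{2} + h(\lambda_0,\xi), \qquad \tilde f_4(\mu) = O(\mu), \qquad |h(\lambda_0,\xi)| \le C|\lambda_0|^{2}(|\lambda_0|+|\xi|), \tag{C.7} \]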
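where we used that $\mu_0$ satisfies (C.4) with $i = 2$, and $h(\lambda_0,\xi)$ is analytic for $(\lambda_0,\xi)\in W_{2\bar\varepsilon,\vartheta/2}\times B_{\bar\eta}\times B_{\bar\eta}$. Therefore, we can rewrite (C.7) as:
\[ \lambda_0 = M_{\tilde\lambda,\xi}\,\lambda_0, \qquad \tilde\lambda := \frac{\lambda}{1+\tilde f_4(\mu)}, \qquad M_{\tilde\lambda,\xi}\,x := \tilde\lambda + \frac{1}{1+\tilde f_4(\mu)}\bigl(-\beta_0x^{2}+h(x,\xi)\bigr). \tag{C.8} \]
All we have to do is to check that: (i) if $(\lambda,\xi)\in W_{\bar\varepsilon,\vartheta}\times B_{\bar\eta}\times B_{\bar\eta}$ then $M_{\tilde\lambda,\xi}$ leaves invariant the set $W_{\frac32\bar\varepsilon,\frac23\vartheta}\subset W_{2\bar\varepsilon,\vartheta/2}$, and (ii) $M_{\tilde\lambda,\xi}$ is a contraction therein. Let us prove property (i); for $\bar\varepsilon,\bar\eta$ small enough, it is easy to see that if $\lambda\in W_{\bar\varepsilon,\vartheta}$ then $\tilde\lambda\in W_{\frac43\bar\varepsilon,\frac34\vartheta}$ and $x\in W_{\frac32\bar\varepsilon,\frac23\vartheta} \Rightarrow M_{\tilde\lambda,\xi}\,x\in W_{\frac32\bar\varepsilon,\frac23\vartheta}$.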
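The fixed point equation (C.8) can be explored numerically by plain iteration. The following is a minimal sketch under simplifying assumptions that are not part of the text: the values `beta0 = 1`, `f4 = 0` and `h = 0` are hypothetical placeholders chosen so that the fixed point is known in closed form, namely the root of $x^2 + x - \lambda = 0$ close to $\lambda$.

```python
import cmath


def make_map(lam, beta0=1.0, f4=0.0, h=lambda x: 0.0):
    """Return x -> lam_tilde + (-beta0*x**2 + h(x)) / (1 + f4), as in (C.8).

    beta0, f4 and h are hypothetical stand-ins for the quantities in the text.
    """
    lam_tilde = lam / (1.0 + f4)

    def m(x):
        return lam_tilde + (-beta0 * x * x + h(x)) / (1.0 + f4)

    return m


def iterate(m, x0, tol=1e-14, max_iter=200):
    """Iterate x -> m(x) until successive values differ by less than tol."""
    x = x0
    for n in range(1, max_iter + 1):
        x_new = m(x)
        if abs(x_new - x) < tol:
            return x_new, n
        x = x_new
    raise RuntimeError("fixed point iteration did not converge")


lam = cmath.rect(0.05, 2.5)        # small complex renormalized coupling
m = make_map(lam)
solution, steps = iterate(m, lam)  # start the iteration from lam itself
print(solution, steps)
```

For small $|\lambda|$ the map has derivative of size about $2\beta_0|x|$ near the fixed point, so the iteration contracts quickly; this mirrors, in a toy setting, the role played by the smallness of $\bar\varepsilon$ in the contraction estimate.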
We now turn to property (ii). From the analyticity of $h(x,\xi)$ in $x\in W_{2\bar\varepsilon,\vartheta/2}$, using that the distance from a point $x\in W_{\frac32\bar\varepsilon,\frac23\vartheta}$ to the boundary of $W_{2\bar\varepsilon,\vartheta/2}$ is bounded from below by $\frac{|x|}{3}\sin\frac{\vartheta}{6}$, if $x, x'\in W_{\frac32\bar\varepsilon,\frac23\vartheta}$ a Cauchy estimate tells us that:
\[ |M_{\tilde\lambda,\xi}\,x - M_{\tilde\lambda,\xi}\,x'| \le \Bigl(8\beta_0\bar\varepsilon + \frac{3C\bar\varepsilon(\bar\varepsilon+\bar\eta)}{\sin(\vartheta/6)}\Bigr)|x-x'| \le \epsilon\,|x-x'| \tag{C.9} \]
with $\epsilon<1$; the first inequality follows from the bound on $h$ in (C.7), while the second holds taking $\bar\varepsilon$ small enough (remember that $\vartheta\in(0,\tfrac{\pi}{2}]$). In conclusion, we can explicitly construct the solution of (C.8), and by a simple inductive argument it follows that it is analytic for $(\lambda,\alpha,\mu)\in W_{\bar\varepsilon,\vartheta}\times B_{\bar\eta}\times B_{\bar\eta}$.

Appendix D. Dependence of the Running Coupling Constants on the Ultraviolet Cutoff

In this appendix we show that the running coupling constants depend weakly on the location of the ultraviolet cutoff; in particular, denoting with a superscript $N$ the quantities corresponding to a theory with cutoff on scale $N$, if $(\lambda,\alpha,\mu)\in W_{\varepsilon,\vartheta}\times B_\eta\times B_\eta$ with $\varepsilon,\eta$ small enough we show that there exist two positive constants $C,\rho$ such that for any $k\le N$ and $N<N'$ the following bounds hold:
\[
\begin{aligned}
|\alpha_k^{N}-\alpha_k^{N'}| &\le \frac{C}{\varepsilon^{-1}+N} + C\eta\gamma^{-\rho N},\\
|\mu_k^{N}-\mu_k^{N'}| &\le \frac{C}{\varepsilon^{-1}+N} + C\eta\gamma^{-\rho N},\\
|\lambda_k^{N}-\lambda_k^{N'}| &\le \frac{C}{\varepsilon^{-1}+N}.
\end{aligned}
\tag{D.1}
\]
In the proof, we shall use in a crucial way the short memory property of the GN trees, see the discussion after (2.20). Consider first the difference of $\alpha_k^{N}$ and $\alpha_k^{N'}$. Denoting by a prime the running coupling constants corresponding to a theory with cutoff $N'$ and neglecting the $N$ label in the others we have that
\[ |\alpha_k-\alpha'_k| \le \sum_{j=1}^{k}\bigl|\beta^{N}_{2',j}(\lambda,\alpha,\mu)-\beta^{N'}_{2',j}(\lambda',\alpha',\mu')\bigr| + |\alpha_0-\alpha'_0|, \tag{D.2} \]
where $\beta^{N}_{2',j}$ is the beta function of the theory with an ultraviolet cutoff on scale $N$. Let $\|a\| := \max_{k\in[0,N]}|a_k|$; using property (2.31) and the bounds in (A.1) it follows that, for some $C_1>0$, $\rho>0$ (neglecting for simplicity the arguments of the beta function):
\[
\bigl|\beta^{N}_{2',j}-\beta^{N'}_{2',j}\bigr| \le \frac{C_1}{(\varepsilon^{-1}+j)^{2}}\bigl(\|\alpha-\alpha'\|+\|\mu-\mu'\|\bigr) + \frac{C_1}{\varepsilon^{-1}+j}\sum_{h\ge j}|\lambda_h-\lambda'_h|\,\gamma^{\rho(j-h)} + C_1(\varepsilon+\eta)\frac{\gamma^{\rho(j-N)}}{\varepsilon^{-1}+j},
\tag{D.3}
\]
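\[ |\alpha_0-\alpha'_0| \le C_1(\varepsilon+\eta)\bigl(\|\alpha-\alpha'\|+\|\mu-\mu'\|+\|\lambda-\lambda'\|\bigr) + C_1(\varepsilon+\eta)^{2}\gamma^{-\rho N}, \tag{D.4} \]
where the last terms in (D.3) and (D.4) take into account the contribution of GN trees with at least one endpoint on scale $> N$, and all the others bound the differences of trees with all endpoints on scale $< N$. Therefore, plugging (D.3) and (D.4) in (D.2) we have that, for some $\tilde C_1>0$:
\[
\|\alpha-\alpha'\| \le \tilde C_1(\varepsilon+\eta)\bigl(\|\mu-\mu'\|+\|\lambda-\lambda'\|\bigr) + \sum_{j=1}^{N}\frac{\tilde C_1}{\varepsilon^{-1}+j}\sum_{h\ge j}|\lambda_h-\lambda'_h|\,\gamma^{\rho(j-h)} + \frac{\tilde C_1(\varepsilon+\eta)}{\varepsilon^{-1}+N} + \tilde C_1(\varepsilon+\eta)^{2}\gamma^{-\rho N}.
\tag{D.5}
\]
By what has been discussed in Secs. 3 and 4 and in Appendices A and B, it follows that $|\lambda_k-\lambda'_k|$ for $k\ge 1$ can be estimated in the following way, for some $C_2>0$:
\[ |\lambda_k-\lambda'_k| \le \frac{C_2\bigl(\|\lambda-\lambda'\|+\|\alpha-\alpha'\|+\|\mu-\mu'\|\bigr)}{\varepsilon^{-1}+k} + \frac{C_2(\varepsilon+\eta)\gamma^{\rho(k-N)}}{(\varepsilon^{-1}+k)^{2}}, \tag{D.6} \]
where the first term takes into account the difference of running coupling constants on scale $\le N$, while the last term takes into account trees with root scale $\le k$ having at least one endpoint on scale $> N$. Plugging (D.6) in (D.5) it is straightforward to see that, for some $C_3>0$,
\[ \|\alpha-\alpha'\| \le C_3(\varepsilon+\eta)\bigl(\|\lambda-\lambda'\|+\|\mu-\mu'\|\bigr) + \frac{C_3(\varepsilon+\eta)}{\varepsilon^{-1}+N} + C_3(\varepsilon+\eta)^{2}\gamma^{-\rho N}, \tag{D.7} \]
which if inserted in (D.6) implies, for some positive $C_4, C_5$:
\[
\begin{aligned}
\|\lambda-\lambda'\| &\le \varepsilon C_4\|\mu-\mu'\| + C_4\varepsilon(\varepsilon+\eta)^{2}\gamma^{-\rho N} + \frac{C_4\varepsilon(\varepsilon+\eta)}{\varepsilon^{-1}+N},\\
\|\alpha-\alpha'\| &\le C_5(\varepsilon+\eta)\|\mu-\mu'\| + \frac{C_5(\varepsilon+\eta)}{\varepsilon^{-1}+N} + C_5(\varepsilon+\eta)^{2}\gamma^{-\rho N}.
\end{aligned}
\tag{D.8}
\]
The difference $\mu_k-\mu'_k$ can be bounded in a way analogous to $\alpha_k-\alpha'_k$, and using (D.8) it follows that
\[ \|\mu-\mu'\| \le \frac{C_6}{\varepsilon^{-1}+N} + C_6\eta\gamma^{-\rho N}, \qquad C_6>0, \tag{D.9} \]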
which together with (D.8) proves (D.1).

Appendix E. An Improvement of the n! Bounds in the Planar Theory

In this appendix, we discuss an improvement, valid in the planar case, of the $n!$ bounds proved in [6, Sec. XIX], see formulas (19.5) and (20.2). Here we shall follow the notations of that work: we recall that the "form factor" $r^{(a)}(\sigma;k)$ of [6] corresponds to the contribution of the tree $\sigma$ with thin endpoints to the formal expansion of $v_k^{(a)}\gamma^{(2\delta_{a,2}+4\delta_{a,0})k}$ in $\lambda,\alpha,\mu$, which is obtained by iteration of the equation graphically represented in Fig. 5. We claim that [6, Eq. (20.2)] is still valid if $f$ is replaced by an $\bar f$ denoting just the number of nontrivial frames (see the remark after Fig. 5 for the definition of trivial frame) labeled by $a = 2', 4$. To prove the claim, observe that one can repeat the proof of Sec. XIX in [6] with the new inductive assumption
\[ |r^{(a)}(\sigma;k)| \le \bar\varepsilon^{\,n}\tilde D^{\,n-1}\,\bar f!\,\sum_{j=0}^{\bar f}\frac{(bk)^{j}}{j!}\,\gamma^{(2\delta_{a,2}+4\delta_{a,0})k} \tag{E.1} \]
instead of [6, Eq. (19.5)], the only difference being that the number of topological Feynman graphs with $m$ vertices is bounded proportionally to $N_0^{m}$ where $N_0$ is a suitable constant, because of the restriction to the planar theory. Then if $\tilde f$ is the number of nontrivial $(2',4)$-frames of $\sigma$ excluding the external one, [6, Eq. (19.13)] is replaced by, depending on whether the frame enclosing $\sigma$ is trivial or not:
\[
\begin{aligned}
|r^{(a)}(\sigma;k)| &\le D_7N_0^{m}D_4^{m}\,\bar\varepsilon^{\,n}\tilde D^{\,n-m}D_6^{m}\,\tilde f!\,\times\sum_{h=0}^{k}\sum_{r=0}^{\tilde f}\gamma^{(2\delta_{a,2}+4\delta_{a,0})h}\frac{(bh)^{r}}{r!}, && \text{(nontrivial frame)},\\
|r^{(a)}(\sigma;k)| &\le D_7N_0^{m}D_4^{m}\,\bar\varepsilon^{\,n}\tilde D^{\,n-m}D_6^{m}\,\tilde f!, && \text{(trivial frame)};
\end{aligned}
\tag{E.2}
\]
with respect to [6], we have kept the factor $\gamma^{(2\delta_{a,2}+4\delta_{a,0})h}$ inside the sum, instead of estimating it by replacing $h$ with $k$. If the frame enclosing $\sigma$ is trivial the claim follows from the second of (E.2), taking $\tilde D$ large enough (as in [6], here $m\ge 2$). If the frame is nontrivial and $a = 2', 4$, proceed as in [6, Eq. (19.15)], while if $a = 2$ substitute that bound with
\[ \sum_{h=0}^{k}\sum_{r=0}^{\tilde f}\gamma^{2h}\frac{(bh)^{r}}{r!} \le \frac{\gamma^{2k}}{1-\gamma^{-2}}\sum_{r=0}^{\bar f}\frac{(bk)^{r}}{r!}, \tag{E.3} \]
and do the same for $a = 0$ ($\gamma^{2k}$ will be replaced by $\gamma^{4k}$). From this the claim follows choosing $\tilde D$ sufficiently large, as explained in [6].

References

[1] L. D. Landau, Collected Papers of L. D. Landau (Gordon and Breach, 1965).
[2] G. 't Hooft, Borel summability of a four-dimensional field theory, Phys. Lett. B 119 (1982) 369–371.
[3] G. 't Hooft, Rigorous construction of planar diagram field theories in four dimensional Euclidean space, Comm. Math. Phys. 88 (1983) 1–25.
[4] V. Rivasseau, Construction and Borel summability of planar 4-dimensional Euclidean field theory, Comm. Math. Phys. 95 (1984) 445–486.
[5] V. Rivasseau, Rigorous construction and Borel summability for a planar four-dimensional field theory, Phys. Lett. B 137 (1983) 98–102.
[6] G. Gallavotti, Renormalization theory and ultraviolet stability for scalar fields via renormalization group methods, Rev. Mod. Phys. 57 (1985) 471–562.
[7] G. Gallavotti and V. Rivasseau, $\varphi^4$-Field theory in dimension four: A modern introduction to its open problems, Ann. Inst. H. Poincaré 40 (1984) 185–220.
[8] F. Feldman, J. Magnen, V. Rivasseau and R. Sénéor, Construction and Borel summability of infrared $\Phi^4_4$ by a phase space expansion, Comm. Math. Phys. 109 (1987) 437–480.
[9] F. Feldman, J. Magnen, V. Rivasseau and R. Sénéor, A renormalizable field theory: The massive Gross–Neveu model in two-dimensions, Comm. Math. Phys. 103 (1986) 67–103.
[10] J. Koplik, A. Neveu and S. Nussinov, Some aspects of the planar perturbation series, Nucl. Phys. B 123 (1977) 109–131.
[11] E. Brézin, C. Itzykson, G. Parisi and J. B. Zuber, Planar diagrams, Comm. Math. Phys. 59 (1978) 35–51.
[12] G. Gallavotti and F. Nicolò, Renormalization theory for four-dimensional scalar fields, I, Comm. Math. Phys. 100 (1985) 545–590.
[13] G. Gallavotti and F. Nicolò, Renormalization theory for four-dimensional scalar fields, II, Comm. Math. Phys. 101 (1985) 247–282.
[14] A. Sokal, An improvement of Watson's theorem on Borel summability, J. Math. Phys. 21 (1980) 261–263.
[15] V. Mastropietro, Rigorous proof of Luttinger liquid behaviour in the 1d Hubbard model, J. Stat. Phys. 121 (2005) 373–432.
[16] G. H. Hardy, Divergent Series (Oxford University Press, 1949).
[17] V. Rivasseau, Constructive field theory in zero dimension, Adv. Math. Phys. 2009 (2009) article ID 180159, 12 pp.
[18] G. N. Watson, A theory of asymptotic series, Philos. Trans. R. Soc. Lond. Ser. A 211 (1912) 279–313.
[19] G. Benfatto and G. Gallavotti, Perturbation theory of the Fermi surface in a quantum liquid. A general quasi-particle formalism and one-dimensional systems, Comm. Math. Phys. 258 (2005) 609–655.
[20] G. Gentile and V. Mastropietro, Renormalization group for one-dimensional fermions. A review on mathematical results, Phys. Rep. 352 (2001) 273–437.
[21] G. Benfatto and V. Mastropietro, Ward identities and chiral anomaly in the Luttinger liquid, Comm. Math. Phys. 258 (2005) 609–655.
[22] C. De Calan and V. Rivasseau, Local existence of the Borel transform in Euclidean $\varphi^4_4$, Comm. Math. Phys. 82 (1982) 69–100.
[23] K. Hepp, Proof of the Bogoliubov–Parasiuk theorem on renormalization, Comm. Math. Phys. 2 (1966) 301–326.
[24] W. Zimmermann, Convergence of Bogoliubov's method of renormalization in momentum space, Comm. Math. Phys. 15 (1969) 208–234.
[25] J. Polchinski, Renormalization and effective Lagrangians, Nucl. Phys. B 231 (1984) 269–295.
October 12, J070-S0129055X10004156
2010 10:3 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 22, No. 9 (2010) 1033–1059 © World Scientific Publishing Company DOI: 10.1142/S0129055X10004156
PARALLEL TRANSPORT OVER PATH SPACES
SAIKAT CHATTERJEE∗ and AMITABHA LAHIRI† S. N. Bose National Centre for Basic Sciences, Block JD, Sector III, Salt Lake, Kolkata 700098, West Bengal, India ∗[email protected] †[email protected] AMBAR N. SENGUPTA Department of Mathematics, Louisiana State University, Baton Rouge, Louisiana 70803, USA [email protected] Received 10 October 2009 Revised 14 June 2010 We develop a differential geometric framework for parallel transport over path spaces and a corresponding discrete theory, an integrated version of the continuum theory, using a category-theoretic framework. Keywords: Gauge theory; path spaces; double categories. Mathematics Subject Classification 2010: 81T13, 58Z05, 16E45
∗Current address: School of Mathematics, Tata Institute of Fundamental Research, Homi Bhabha Road, Mumbai 400005, India.

1. Introduction

A considerable body of literature has grown up around the notion of "surface holonomy", or parallel transport on surfaces, motivated by the need to have a gauge theory of interaction between charged string-like objects. Approaches include direct geometric exploration of the space of paths of a manifold (Cattaneo et al. [5], for instance), and a very different, category-theory flavored development (Baez and Schreiber [2], for instance). In the present work, we develop both a path-space geometric theory and a category-theoretic approach to surface holonomy, and describe some of the relationships between the two. As is well known [1] from a group-theoretic argument and also from the fact that there is no canonical ordering of points on a surface, attempts to construct a group-valued parallel transport operator for surfaces lead to inconsistencies unless the
group is abelian (or an abelian representation is used). So in our setting, there are two interconnected gauge groups $G$ and $H$. We work with a fixed principal $G$-bundle $\pi : P \to M$ and connection $\bar A$; then, viewing the space of $\bar A$-horizontal paths itself as a bundle over the path space of $M$, we study a particular type of connection on this path-space bundle which is specified by means of a second connection $A$ and a field $B$ whose values are in the Lie algebra $LH$ of $H$. We derive explicit formulas describing parallel-transport with respect to this connection. As far as we are aware, this is the first time an explicit description for the parallel transport operator has been obtained for a surface swept out by a path whose endpoints are not pinned. We obtain, in Theorem 2.1, conditions for the parallel-transport of a given point in path-space to be independent of the parametrization of that point, viewed as a path. We also discuss $H$-valued connections on the path space of $M$, constructed from the field $B$. In Sec. 3, we show how the geometrical data, including the field $B$, lead to two categories. We prove several results for these categories and discuss how these categories may be viewed as "integrated" versions of the differential geometric theory developed in Sec. 2.

In working with spaces of paths, one is confronted with the problem of specifying a differential structure on such spaces. It appears best to proceed within a simpler formalism. Essentially, one continues to use terms such as "tangent space" and "differential form", except that in each case the specific notion is defined directly (for example, a tangent vector to a space of paths at a particular path $\gamma$ is a vector field along $\gamma$) rather than by appeal to a general theory. Indeed, there is a good variety of choices for general frameworks in this philosophy (see, for instance, [16, 17]). For this reason, we shall make no attempt to build a manifold structure on any space of paths.

1.1. Background and motivation

Let us briefly discuss the physical background and motivation for this study. Traditional gauge fields govern interaction between point particles. Such a gauge field is, mathematically, a connection $A$ on a bundle over spacetime, with the structure group of the bundle being the relevant internal symmetry group of the particle species. The amplitude of the interaction, along some path $\gamma$ connecting the point particles, is often obtained from the particle wave functions $\psi$ coupled together using quantities involving the path-ordered exponential integral $\mathcal{P}\exp\bigl(-\int_\gamma \bar A\bigr)$, which is the same as the parallel-transport along the path $\gamma$ by the connection $\bar A$. If we now change our point of view concerning particles, and assume that they are extended
Fig. 1. Point particles interacting via a gauge field.
string-like entities, then each particle should be viewed not as a point entity but rather a path (segment) in spacetime. Thus, instead of the two particles located at two points, we now have two paths γ1 and γ2 ; in place of a path connecting the two point particles we now have a parametrized path of paths, in other words a surface Γ, connecting γ1 with γ2 . The interaction amplitudes would, one may expect, involve both the gauge field A, as expressed through the parallel transports along γ1 and γ2 , and an interaction between these two parallel transport fields. This higher order, or higher dimensional interaction, could be described by means of a gauge field at the higher level: it would be a gauge field over the space of paths in spacetime.
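The path-ordered exponential mentioned above can be approximated by an ordered product of short-step exponentials. The sketch below is purely illustrative and is not taken from the paper: it works with $2\times 2$ real matrices, and it assumes that the connection has already been pulled back along the path to a matrix-valued function `a_of_t` on $[0,1]$ (a hypothetical name). The convention $h'h^{-1} = -A$ means that later factors multiply on the left.

```python
import math


def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_exp(a, terms=30):
    """Matrix exponential of a 2x2 matrix via a truncated Taylor series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = mat_mul(term, a)
        term = [[entry / n for entry in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result


def path_ordered_exp(a_of_t, n_steps=2000):
    """Approximate P exp(-int_0^1 A dt) as a product of exp(-A(t_i) dt).

    Later times act on the left, matching h'(t) h(t)^{-1} = -A(t).
    """
    dt = 1.0 / n_steps
    h = [[1.0, 0.0], [0.0, 1.0]]
    for i in range(n_steps):
        a = a_of_t((i + 0.5) * dt)  # midpoint of the i-th subinterval
        step = mat_exp([[-entry * dt for entry in row] for row in a])
        h = mat_mul(step, h)
    return h


# Abelian example: a constant so(2)-valued connection gives a rotation.
rotation = path_ordered_exp(lambda t: [[0.0, -1.0], [1.0, 0.0]])

# Non-abelian example: two non-commuting pieces, so the ordering matters.
def piecewise(t):
    return [[0.0, 1.0], [0.0, 0.0]] if t < 0.5 else [[0.0, 0.0], [1.0, 0.0]]

ordered = path_ordered_exp(piecewise)
print(rotation, ordered)
```

In the second example, reversing the order of the two pieces would exchange the diagonal entries of the result, which is a small-scale picture of why ordering along a path is essential and why no analogous canonical ordering exists on a surface.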
1.2. Comparison with other works

The approach to higher gauge theory developed and explored by Baez [1], Baez and Schreiber [2, 3], Lahiri [13], and others cited in these papers, involves an abstract category theoretic framework of 2-connections and 2-bundles, which are higher-dimensional analogs of bundles and connections. There is also the framework of gerbes [6, 4, 14]. We develop both a differential geometric framework and category-theoretic structures. We prove in Theorem 2.1 that a requirement of parametrization invariance imposes a constraint on a quantity called the "fake curvature", which has been observed in a related but more abstract context by Baez and Schreiber [2, Theorem 23]. Our differential geometric approach is close to the works of Cattaneo et al. [5], Pfeiffer [15], and Girelli and Pfeiffer [11]. However, we develop, in addition to the differential geometric aspects, the integrated version in terms of categories of diagrams, an aspect not addressed in [5]; also, it should be noted that our connection form is different from the one used in [5]. To link up with the integrated theory it is essential to explore the effect of the $LH$-valued field $B$. To this end we determine a "bi-holonomy" associated to a path of paths (Theorem 2.2) in terms of the field $B$; this aspect of the theory is not studied in [5] or other works.

Our approach has the following special features:

• we develop the theory with two connections $A$ and $\bar A$ as well as a 2-form $B$ (with the connection $\bar A$ used for parallel-transport along any given string-like object, and the forms $A$ and $B$ used to construct parallel-transports between different strings);
• we determine, in Theorem 2.2, the "bi-holonomy" associated to a path of paths using the $B$-field;
• we allow "quadrilaterals" rather than simply bigons in the category theoretic formulation, corresponding to having strings with endpoints free to move rather than fixed-endpoint strings.
Our category theoretic considerations are related to notions about double categories introduced by Ehresmann [9, 10] and explored further by Kelly and Street [12].
Fig. 2. Gauge fields along paths $c_1$ and $c_2$ interacting across a surface.
2. Connections on Path-Space Bundles

In this section we will construct connections and parallel-transport for a pair of intertwined structures: path-space bundles with structure groups $G$ and $H$, which are Lie groups intertwined as explained below in (2.1). For the physical motivation, it should be kept in mind that $G$ denotes the gauge group for the gauge field along each path, or string, while $H$ governs, along with $G$, the interaction between the gauge fields along different paths.

An important distinction between existing differential geometric approaches (such as Cattaneo et al. [5]) and the "integrated theory" encoded in the category-theoretic framework is that the latter necessarily involves two gauge groups: a group $G$ for parallel transport along paths, and another group $H$ for parallel transport between paths (in path space). We shall develop the differential geometric framework using a pair of groups $(G, H)$ so as to be consistent with the "integrated" theory. Along with the groups $G$ and $H$, we use a fixed smooth homomorphism $\tau : H \to G$ and a smooth map $G \times H \to H : (g,h) \mapsto \alpha(g)h$ such that each $\alpha(g)$ is an automorphism of $H$, and such that the identities
\[ \tau(\alpha(g)h) = g\,\tau(h)\,g^{-1}, \qquad \alpha(\tau(h))h' = h\,h'\,h^{-1}, \tag{2.1} \]
hold for all $g \in G$ and $h, h' \in H$. The derivatives $\tau'(e)$ and $\alpha'(e)$ will be denoted simply as $\tau : LH \to LG$ and $\alpha : LG \to LH$. (This structure is called a Lie 2-group in [1, 2].)

To summarize very rapidly, anticipating some of the notions explained below, we work with a principal $G$-bundle $\pi : P \to M$ over a manifold $M$, equipped with connections $A$ and $\bar A$, and an $\alpha$-equivariant vertical 2-form $B$ on $P$ with values in the Lie algebra $LH$. We then consider the space $\mathcal{P}_{\bar A}P$ of $\bar A$-horizontal paths in $P$, which forms a principal $G$-bundle over the path-space $\mathcal{P}M$ in $M$. Then there is an associated vector bundle $E$ over $\mathcal{P}M$ with fiber $LH$; using the 2-form $B$ and the connection form $\bar A$ we construct, for any section $\sigma$ of the bundle $P \to M$, an $LH$-valued 1-form $\theta_\sigma$ on $\mathcal{P}M$. This being a connection over the path-space in $M$ with structure group $H$, parallel-transport by this connection associates elements of $H$ to parametrized surfaces in $M$. Most of our work is devoted to studying a second connection form $\omega_{(A,B)}$, which is a connection on the bundle $\mathcal{P}_{\bar A}P$ which we construct using a second connection $A$ on $P$. Parallel-transport by $\omega_{(A,B)}$ is related to parallel-transport by the $LH$-valued connection form $\theta_\sigma$.

2.1. Principal bundle and the connection $\bar A$

Consider a principal $G$-bundle
\[ \pi : P \to M \]
with the right-action of the Lie group $G$ on $P$ denoted
\[ P \times G \to P : (p,g) \mapsto pg = R_g p. \]
Let $\bar A$ be a connection on this bundle. The space $\mathcal{P}_{\bar A}P$ of $\bar A$-horizontal paths in $P$ may be viewed as a principal $G$-bundle over $\mathcal{P}M$, the space of smooth paths in $M$. We will use the notation $pK \in T_pP$, for any point $p \in P$ and Lie-algebra element $K \in LG$, defined by
\[ pK = \frac{d}{dt}\Big|_{t=0}\, p\cdot\exp(tK). \]
It will be convenient to keep in mind that we always use $t$ to denote the parameter for a path on the base manifold $M$ or in the bundle space $P$; we use the letter $s$ to parametrize a path in path-space.

2.2. The tangent space to $\mathcal{P}_{\bar A}P$

The points of the space $\mathcal{P}_{\bar A}P$ are $\bar A$-horizontal paths in $P$. Although we call $\mathcal{P}_{\bar A}P$ a "space" we do not discuss any topology or manifold structure on it. However, it is useful to introduce certain differential geometric notions such as tangent spaces on $\mathcal{P}_{\bar A}P$. It is intuitively clear that a tangent vector at a "point" $\tilde\gamma \in \mathcal{P}_{\bar A}P$ ought to be a vector field on the path $\tilde\gamma$. We formalize this idea here (as has been done elsewhere as well, such as in Cattaneo et al. [5]). If $\mathcal{P}X$ is a space of paths on a manifold $X$, we denote by $\mathrm{ev}_t$ the evaluation map
\[ \mathrm{ev}_t : \mathcal{P}X \to X : \gamma \mapsto \mathrm{ev}_t(\gamma) = \gamma(t). \tag{2.2} \]
Our first step is to understand the tangent spaces to the bundle $\mathcal{P}_{\bar A}P$. The following result is preparation for the definition (see also [5, Theorem 2.1]).

Proposition 2.1. Let $\bar A$ be a connection on a principal $G$-bundle $\pi : P \to M$, and
\[ \tilde\Gamma : [0,1]\times[0,1] \to P : (t,s) \mapsto \tilde\Gamma(t,s) = \tilde\Gamma_s(t) \]
a smooth map, and
\[ \tilde v_s(t) = \partial_s\tilde\Gamma(t,s). \]
Then the following are equivalent:

(i) Each transverse path
\[ \tilde\Gamma_s : [0,1] \to P : t \mapsto \tilde\Gamma(t,s) \]
is $\bar A$-horizontal.

(ii) The initial path $\tilde\Gamma_0$ is $\bar A$-horizontal, and the "tangency condition"
\[ \frac{\partial \bar A(\tilde v_s(t))}{\partial t} = F^{\bar A}\bigl(\partial_t\tilde\Gamma(t,s), \tilde v_s(t)\bigr) \tag{2.3} \]
holds, and thus also
\[ \bar A(\tilde v_s(T)) - \bar A(\tilde v_s(0)) = \int_0^T F^{\bar A}\bigl(\partial_t\tilde\Gamma(t,s), \tilde v_s(t)\bigr)\,dt, \tag{2.4} \]
for every $T, s \in [0,1]$.

Equation (2.3), and variations on it, is sometimes referred to as the Duhamel formula and sometimes as a "non-abelian Stokes formula". We can write it more compactly by using the notion of a Chen integral. With suitable regularity assumptions, a 2-form $\Theta$ on a space $X$ yields a 1-form, denoted $\int\Theta$, on the space $\mathcal{P}X$ of smooth paths in $X$; if $c$ is such a path, a "tangent vector" $v \in T_c(\mathcal{P}X)$ is a vector field $t \mapsto v(t)$ along $c$, and the evaluation of the 1-form $\int\Theta$ on $v$ is defined to be
\[ \Bigl(\int\Theta\Bigr)_c v = \Bigl(\int_c\Theta\Bigr)(v) = \int_0^1 \Theta\bigl(c'(t), v(t)\bigr)\,dt. \tag{2.5} \]
The 1-form $\int\Theta$, or its localization to the tangent space $T_c(\mathcal{P}X)$, is called the Chen integral of $\Theta$. Returning to our context, we then have
\[ \mathrm{ev}_T^*\bar A - \mathrm{ev}_0^*\bar A = \int_0^T F^{\bar A}, \tag{2.6} \]
where the integral on the right is a Chen integral; here it is, by definition, the 1-form on $\mathcal{P}_{\bar A}P$ whose value on a vector $\tilde v_s \in T_{\tilde\Gamma_s}\mathcal{P}_{\bar A}P$ is given by the right-hand side of (2.4). The pullback $\mathrm{ev}_t^*\bar A$ has the obvious meaning.
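The definition (2.5) is easy to evaluate numerically in a simple setting. The following sketch is only an illustration under assumptions that are not part of the text: it takes $X = \mathbb{R}^2$ with the area form $\Theta = dx\wedge dy$, approximates $c'(t)$ by a central difference, and uses the midpoint rule for the integral; the names `chen_integral` and `area_form` are hypothetical.

```python
def area_form(point, u, w):
    """Theta = dx ^ dy on R^2 evaluated on the tangent vectors u, w at a point."""
    return u[0] * w[1] - u[1] * w[0]


def chen_integral(theta, c, v, n_steps=1000, eps=1e-6):
    """Midpoint-rule approximation of int_0^1 Theta(c'(t), v(t)) dt, cf. (2.5).

    theta(point, u, w) is a 2-form, c(t) a path, v(t) a vector field along c.
    The velocity c'(t) is approximated by a central difference with step eps.
    """
    dt = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        t = (i + 0.5) * dt
        p_plus, p_minus = c(t + eps), c(t - eps)
        velocity = tuple((a - b) / (2.0 * eps) for a, b in zip(p_plus, p_minus))
        total += theta(c(t), velocity, v(t)) * dt
    return total


def parabola(t):
    """The path c(t) = (t, t^2), with velocity c'(t) = (1, 2t)."""
    return (t, t * t)


# v(t) = (0, 1): Theta(c', v) = 1, so the Chen integral is 1.
vertical = chen_integral(area_form, parabola, lambda t: (0.0, 1.0))

# v(t) = (t, 0): Theta(c', v) = -2 t^2, so the Chen integral is -2/3.
horizontal = chen_integral(area_form, parabola, lambda t: (t, 0.0))
print(vertical, horizontal)
```

The two vector fields along the same path give different values, which reflects the fact that $\int\Theta$ is a 1-form on path space: it is evaluated on a variation of the path, not on the path alone.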
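Proof. From the definition of the curvature form $F^{\bar A}$, we have
\[ F^{\bar A}(\partial_t\tilde\Gamma, \partial_s\tilde\Gamma) = \partial_t\bigl(\bar A(\partial_s\tilde\Gamma)\bigr) - \partial_s\bigl(\bar A(\partial_t\tilde\Gamma)\bigr) - \bar A\bigl(\underbrace{[\partial_t\tilde\Gamma, \partial_s\tilde\Gamma]}_{0}\bigr) + \bigl[\bar A(\partial_t\tilde\Gamma), \bar A(\partial_s\tilde\Gamma)\bigr]. \]
So
\[
\begin{aligned}
\partial_t\bigl(\bar A(\partial_s\tilde\Gamma)\bigr) - F^{\bar A}(\partial_t\tilde\Gamma, \partial_s\tilde\Gamma) &= \partial_s\bigl(\bar A(\partial_t\tilde\Gamma)\bigr) - \bigl[\bar A(\partial_t\tilde\Gamma), \bar A(\partial_s\tilde\Gamma)\bigr]\\
&= 0 \quad \text{if } \bar A(\partial_t\tilde\Gamma) = 0,
\end{aligned}
\tag{2.7}
\]
thus proving (2.3) if (i) holds. Equation (2.4) then follows by integration.

Next suppose (ii) holds. Then, from the first line in (2.7), we have
\[ \partial_s\bigl(\bar A(\partial_t\tilde\Gamma)\bigr) - \bigl[\bar A(\partial_t\tilde\Gamma), \bar A(\partial_s\tilde\Gamma)\bigr] = 0. \tag{2.8} \]
Now let $s \mapsto h(s) \in G$ describe parallel-transport along $s \mapsto \tilde\Gamma(t,s)$; then
\[ h'(s)h(s)^{-1} = -\bar A\bigl(\partial_s\tilde\Gamma(t,s)\bigr), \quad \text{and } h(0) = e. \]
Then
\[ \partial_s\Bigl(h(s)^{-1}\bar A\bigl(\partial_t\tilde\Gamma(t,s)\bigr)h(s)\Bigr) = \mathrm{Ad}\bigl(h(s)^{-1}\bigr)\Bigl[\partial_s\bigl(\bar A(\partial_t\tilde\Gamma)\bigr) - \bigl[\bar A(\partial_t\tilde\Gamma), \bar A(\partial_s\tilde\Gamma)\bigr]\Bigr] \tag{2.9} \]
and the right-hand side here is 0, as seen in (2.8). Therefore,
\[ h(s)^{-1}\bar A\bigl(\partial_t\tilde\Gamma(t,s)\bigr)h(s) \]
is independent of $s$, and hence is equal to its value at $s = 0$. Thus, if $\bar A$ vanishes on $\partial_t\tilde\Gamma(t,0)$ then it also vanishes on $\partial_t\tilde\Gamma(t,s)$ for all $s \in [0,1]$. In conclusion, if the initial path $\tilde\Gamma_0$ is $\bar A$-horizontal, and the tangency condition (2.3) holds, then each transverse path $\tilde\Gamma_s$ is $\bar A$-horizontal.

In view of the preceding result, it is natural to define the tangent spaces to $\mathcal{P}_{\bar A}P$ as follows:

Definition 2.1. The tangent space to $\mathcal{P}_{\bar A}P$ at $\tilde\gamma$ is the linear space of all vector fields $t \mapsto \tilde v(t) \in T_{\tilde\gamma(t)}P$ along $\tilde\gamma$ for which
\[ \frac{\partial\bar A(\tilde v(t))}{\partial t} - F^{\bar A}\bigl(\tilde\gamma'(t), \tilde v(t)\bigr) = 0 \tag{2.10} \]
holds for all $t \in [0,1]$. The vertical subspace in $T_{\tilde\gamma}\mathcal{P}_{\bar A}P$ consists of all vectors $\tilde v(\cdot)$ for which $\tilde v(t)$ is vertical in $T_{\tilde\gamma(t)}P$ for every $t \in [0,1]$.

Let us note one consequence:

Lemma 2.1. Suppose $\gamma : [0,1] \to M$ is a smooth path, and $\tilde\gamma$ an $\bar A$-horizontal lift. Let $v : [0,1] \to TM$ be a vector field along $\gamma$, and $\tilde v(0)$ any vector in $T_{\tilde\gamma(0)}P$ with $\pi_*\tilde v(0) = v(0)$. Then there is a unique vector field $\tilde v \in T_{\tilde\gamma}\mathcal{P}_{\bar A}P$ whose projection down to $M$ is the vector field $v$, and whose initial value is $\tilde v(0)$.

Proof. The first-order differential equation (2.10) determines the vertical part of $\tilde v(t)$ from the initial value. Thus $\tilde v(t)$ is this vertical part plus the $\bar A$-horizontal lift of $v(t)$ to $T_{\tilde\gamma(t)}P$.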
October 12, J070-S0129055X10004156
1040
2010 10:3 WSPC/S0129-055X
148-RMP
S. Chatterjee, A. Lahiri & A. N. Sengupta
2.3. Connections induced from B

All through our work, $B$ will denote a vertical $\alpha$-equivariant 2-form on $P$ with values in $LH$. In more detail, this means that $B$ is an $LH$-valued 2-form on $P$ which is vertical in the sense that
\[
B(u,v) = 0 \quad\text{if $u$ or $v$ is vertical,}
\]
and $\alpha$-equivariant in the sense that
\[
R_g^*B = \alpha(g^{-1})B \quad\text{for all } g \in G,
\]
wherein $R_g : P \to P : p \mapsto pg$ is the right action of $G$ on the principal bundle space $P$, and $\alpha(g^{-1})B = d\alpha(g^{-1})|_e B$, recalling that $\alpha(g^{-1})$ is an automorphism $H \to H$.

Consider an $\bar A$-horizontal $\tilde\gamma \in P_{\bar A}P$, and a smooth vector field $X$ along $\gamma = \pi\circ\tilde\gamma$; take any lift $\tilde X_{\tilde\gamma}$ of $X$ along $\tilde\gamma$, and set
\[
\theta_{\tilde\gamma}(X) \stackrel{\mathrm{def}}{=} \Bigl(\int_{\tilde\gamma} B\Bigr)(\tilde X_{\tilde\gamma}) = \int_0^1 B\bigl(\tilde\gamma'(u),\tilde X_{\tilde\gamma}(u)\bigr)\,du. \tag{2.11}
\]
This is independent of the choice of $\tilde X_{\tilde\gamma}$ (as any two choices differ by a vertical vector on which $B$ vanishes) and specifies a linear form $\theta_{\tilde\gamma}$ on $T_\gamma(PM)$ with values in $LH$. If we choose a different horizontal lift of $\gamma$, a path $\tilde\gamma g$, with $g \in G$, then
\[
\theta_{\tilde\gamma g}(X) = \alpha(g^{-1})\theta_{\tilde\gamma}(X). \tag{2.12}
\]
Thus, one may view $\tilde\theta$ to be a 1-form on $PM$ with values in the vector bundle $E \to PM$ associated to $P_{\bar A}P \to PM$ by the action $\alpha$ of $G$ on $LH$.

Now fix a section $\sigma : M \to P$, and for any path $\gamma \in PM$ let $\tilde\sigma(\gamma) \in P_{\bar A}P$ be the $\bar A$-horizontal lift with initial point $\sigma(\gamma(0))$. Thus, $\tilde\sigma : PM \to P_{\bar A}P$ is a section of the bundle $P_{\bar A}P \to PM$. Then we have the 1-form $\theta^\sigma$ on $PM$ with values in $LH$ given as follows: for any $X \in T_\gamma(PM)$,
\[
(\theta^\sigma)(X) = \theta_{\tilde\sigma(\gamma)}(X). \tag{2.13}
\]
We shall view $\theta^\sigma$ as a connection form for the trivial $H$-bundle over $PM$. Of course, it depends on the section $\tilde\sigma$ of $P_{\bar A}P \to PM$, but in a “controlled” manner, i.e. the behavior of $\theta^\sigma$ under change of $\sigma$ is obtained using (2.12).

2.4. Constructing the connection $\omega_{(A,B)}$

Our next objective is to construct connection forms on $P_{\bar A}P$. To this end, fix a connection $A$ on $P$, in addition to the connection $\bar A$ and the $\alpha$-equivariant vertical $LH$-valued 2-form $B$ on $P$.
The evaluation map at any time $t \in [0,1]$, given by
\[
\mathrm{ev}_t : P_{\bar A}P \to P : \tilde\gamma \mapsto \tilde\gamma(t),
\]
commutes with the projections $P_{\bar A}P \to PM$ and $P \to M$, and the evaluation map $PM \to M$. We can pull back any connection $A$ on the bundle $P$ to a connection $\mathrm{ev}_t^*A$ on $P_{\bar A}P$.

Given a 2-form $B$ as discussed above, consider the $LH$-valued 1-form $Z$ on $P_{\bar A}P$ specified as follows. Its value on a vector $\tilde v \in T_{\tilde\gamma}P_{\bar A}P$ is defined to be
\[
Z(\tilde v) = \int_0^1 B\bigl(\tilde\gamma'(t),\tilde v(t)\bigr)\,dt. \tag{2.14}
\]
Thus
\[
Z = \int_0^1 B, \tag{2.15}
\]
where on the right we have the Chen integral (discussed earlier in (2.5)) of the 2-form $B$ on $P$, lifting it to an $LH$-valued 1-form on the space of ($\bar A$-horizontal) smooth paths $[0,1] \to P$. The Chen integral here is, by definition, the 1-form on $P_{\bar A}P$ given by
\[
\tilde v \in T_{\tilde\gamma}P_{\bar A}P \mapsto \int_0^1 B\bigl(\tilde\gamma'(t),\tilde v(t)\bigr)\,dt.
\]
Note that $Z$ and the form $\theta$ are closely related:
\[
Z(\tilde v) = \theta_{\tilde\gamma}(\pi_*\tilde v). \tag{2.16}
\]
Now define the 1-form $\omega_{(A,B)}$ by
\[
\omega_{(A,B)} = \mathrm{ev}_1^*A + \tau(Z). \tag{2.17}
\]
Recall that $\tau : H \to G$ is a homomorphism, and, for any $X \in LH$, we are writing $\tau(X)$ to mean $\tau'(e)X$; here $\tau'(e) : LH \to LG$ is the derivative of $\tau$ at the identity. The utility of bringing in $\tau$ becomes clear only when connecting these developments to the category theoretic formulation of Sec. 3.

A similar construction, but using only one algebra $LG$, is described by Cattaneo et al. [5]. However, as we pointed out earlier, a parallel transport operator for a surface cannot be constructed using a single group unless the group is abelian. To allow non-abelian groups, we need to have two groups intertwined in the structure described in (2.1), and thus we need $\tau$.

Note that $\omega_{(A,B)}$ is simply the connection $\mathrm{ev}_1^*A$ on the bundle $P_{\bar A}P$, shifted by the 1-form $\tau(Z)$. In the finite-dimensional setting it is a standard fact that such a shift, by an equivariant form which vanishes on verticals, produces another connection; however, given that our setting is, technically, not identical to the finite-dimensional one, we shall prove this below in Proposition 2.2.
Thus,
\[
\omega_{(A,B)}(\tilde v) = A(\tilde v(1)) + \int_0^1 \tau B\bigl(\tilde\gamma'(t),\tilde v(t)\bigr)\,dt. \tag{2.18}
\]
We can rewrite this as
\[
\omega_{(A,B)} = \mathrm{ev}_0^*A + \bigl[\mathrm{ev}_1^*(A - \bar A) - \mathrm{ev}_0^*(A - \bar A)\bigr] + \int_0^1 \bigl(F^{\bar A} + \tau B\bigr). \tag{2.19}
\]
To obtain this we have simply used the relation (2.4). The advantage in (2.19) is that it separates off the endpoint terms and expresses $\omega_{(A,B)}$ as a perturbation of the simple connection $\mathrm{ev}_0^*A$ by a vector in the tangent space $T_{\mathrm{ev}_0^*A}\mathcal{A}$, where $\mathcal{A}$ is the space of connections on the bundle $P_{\bar A}P$. Here note that the “tangent vectors” to the affine space $\mathcal{A}$ at a connection $\omega$ are the 1-forms $\omega_1 - \omega$, with $\omega_1$ running over $\mathcal{A}$. A difference such as $\omega_1 - \omega$ is precisely an equivariant $LG$-valued 1-form which vanishes on vertical vectors.

Recall that the group $G$ acts on $P$ on the right
\[
P \times G \to P : (p,g) \mapsto R_g p = pg
\]
and this induces a natural right action of $G$ on $P_{\bar A}P$:
\[
P_{\bar A}P \times G \to P_{\bar A}P : (\tilde\gamma,g) \mapsto R_g\tilde\gamma = \tilde\gamma g.
\]
Then for any vector $X$ in the Lie algebra $LG$, we have a vertical vector
\[
\tilde X(\tilde\gamma) \in T_{\tilde\gamma}P_{\bar A}P
\]
given by
\[
\tilde X(\tilde\gamma)(t) = \frac{d}{du}\Big|_{u=0}\tilde\gamma(t)\exp(uX).
\]

Proposition 2.2. The form $\omega_{(A,B)}$ is a connection form on the principal $G$-bundle $P_{\bar A}P \to PM$. More precisely,
\[
\omega_{(A,B)}((R_g)_*v) = \mathrm{Ad}(g^{-1})\,\omega_{(A,B)}(v)
\]
for every $g \in G$, $v \in T_{\tilde\gamma}(P_{\bar A}P)$, and
\[
\omega_{(A,B)}(\tilde X) = X
\]
for every $X \in LG$.

Proof. It will suffice to show that for every $g \in G$ and every vector $v$ tangent to $P_{\bar A}P$,
\[
Z((R_g)_*v) = \mathrm{Ad}(g^{-1})Z(v),
\]
and
\[
Z(\tilde X) = 0
\]
for every $X \in LG$.
From (2.15) and the fact that $B$ vanishes on verticals it is clear that $Z(\tilde X)$ is 0. The equivariance under the $G$-action follows also from (2.15), on using the $G$-equivariance of the connection form $A$ and of the 2-form $B$, and the fact that the right action of $G$ carries $\bar A$-horizontal paths into $\bar A$-horizontal paths.

2.5. Parallel transport by $\omega_{(A,B)}$

Let us examine how a path is parallel-transported by $\omega_{(A,B)}$. At the infinitesimal level, all we need is to be able to lift a given vector field $v : [0,1] \to TM$, along $\gamma \in PM$, to a vector field $\tilde v$ along $\tilde\gamma$ such that:

(i) $\tilde v$ is a vector in $T_{\tilde\gamma}(P_{\bar A}P)$, which means that it satisfies Eq. (2.10):
\[
\frac{\partial \bar A(\tilde v(t))}{\partial t} = F^{\bar A}\bigl(\tilde\gamma'(t),\tilde v(t)\bigr); \tag{2.20}
\]
(ii) $\tilde v$ is $\omega_{(A,B)}$-horizontal, i.e. satisfies the equation
\[
A(\tilde v(1)) + \int_0^1 \tau B\bigl(\tilde\gamma'(t),\tilde v(t)\bigr)\,dt = 0. \tag{2.21}
\]
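The following result gives a constructive description of $\tilde v$.

Proposition 2.3. Assume that $A$, $\bar A$, $B$, and $\omega_{(A,B)}$ are as specified before. Let $\tilde\gamma \in P_{\bar A}P$, and $\gamma = \pi\circ\tilde\gamma \in PM$ its projection to a path on $M$, and consider any $v \in T_\gamma PM$. Then the $\omega_{(A,B)}$-horizontal lift $\tilde v \in T_{\tilde\gamma}P_{\bar A}P$ is given by
\[
\tilde v(t) = \tilde v^h_{\bar A}(t) + \tilde v^v(t),
\]
where $\tilde v^h_{\bar A}(t) \in T_{\tilde\gamma(t)}P$ is the $\bar A$-horizontal lift of $v(t) \in T_{\gamma(t)}M$, and
\[
\tilde v^v(t) = \tilde\gamma(t)\Bigl[\bar A(\tilde v(1)) - \int_t^1 F^{\bar A}\bigl(\tilde\gamma'(u),\tilde v^h_{\bar A}(u)\bigr)\,du\Bigr] \tag{2.22}
\]
wherein
\[
\tilde v(1) = \tilde v^h_A(1) + \tilde\gamma(1)X, \tag{2.23}
\]
$\tilde v^h_A(1)$ being the $A$-horizontal lift of $v(1)$ in $T_{\tilde\gamma(1)}P$, and with
\[
X = -\int_0^1 \tau B\bigl(\tilde\gamma'(t),\tilde v^h_{\bar A}(t)\bigr)\,dt. \tag{2.24}
\]
Note that $X$ in (2.24) is $A(\tilde v(1))$. Note also that since $\tilde v$ is tangent to $P_{\bar A}P$, the vector $\tilde v^v(t)$ is also given by
\[
\tilde v^v(t) = \tilde\gamma(t)\Bigl[\bar A(\tilde v(0)) + \int_0^t F^{\bar A}\bigl(\tilde\gamma'(u),\tilde v^h_{\bar A}(u)\bigr)\,du\Bigr]. \tag{2.25}
\]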
Proof. The $\omega_{(A,B)}$-horizontal lift $\tilde v$ of $v$ in $T_{\tilde\gamma}(P_{\bar A}P)$ is the vector field $\tilde v$ along $\tilde\gamma$ which projects by $\pi_*$ to $v$ and satisfies the condition (2.21):
\[
A(\tilde v(1)) + \int_0^1 \tau B\bigl(\tilde\gamma'(t),\tilde v(t)\bigr)\,dt = 0. \tag{2.26}
\]
Now for each $t \in [0,1]$, we can split the vector $\tilde v(t)$ into an $\bar A$-horizontal part and a vertical part $\tilde v^v(t)$ which is essentially the element $\bar A(\tilde v^v(t)) \in LG$ viewed as a vector in the vertical subspace in $T_{\tilde\gamma(t)}P$:
\[
\tilde v(t) = \tilde v^h_{\bar A}(t) + \tilde v^v(t)
\]
and the vertical part here is given by
\[
\tilde v^v(t) = \tilde\gamma(t)\bar A(\tilde v(t)).
\]
Since the vector field $\tilde v$ is actually a vector in $T_{\tilde\gamma}(P_{\bar A}P)$, we have, from (2.20), the relation
\[
\bar A(\tilde v(t)) = \bar A(\tilde v(1)) - \int_t^1 F^{\bar A}\bigl(\tilde\gamma'(u),\tilde v^h_{\bar A}(u)\bigr)\,du.
\]
We need now only verify the expression (2.23) for $\tilde v(1)$. To this end, we first split this into an $A$-horizontal and a corresponding vertical part:
\[
\tilde v(1) = \tilde v^h_A(1) + \tilde\gamma(1)A(\tilde v(1)).
\]
The vector $A(\tilde v(1))$ is obtained from (2.26), and this proves (2.23).

There is an observation to be made from Proposition 2.3. Equation (2.24) has, on the right-hand side, the integral over the entire curve $\tilde\gamma$. Thus, if we were to consider parallel-transport of only, say, the “left half” of $\tilde\gamma$, we would, in general, end up with a different path of paths!

2.6. Reparametrization invariance

If a path is reparametrized, then, technically, it is a different point in path space. Does parallel-transport along a path of paths depend on the specific parametrization of the paths? We shall obtain conditions to ensure that there is no such dependence. Moreover, in this case, we shall also show that parallel transport by $\omega_{(A,B)}$ along a path of paths depends essentially on the surface swept out by this path of paths, rather than the specific parametrization of this surface.

For the following result, recall that we are working with Lie groups $G$, $H$, smooth homomorphism $\tau : H \to G$, smooth map $\alpha : G \times H \to H : (g,h) \mapsto \alpha(g)h$, where each $\alpha(g)$ is an automorphism of $H$, and the maps $\tau$ and $\alpha$ satisfy (2.1). Let $\pi : P \to M$ be a principal $G$-bundle, with connections $A$ and $\bar A$, and $B$ an $LH$-valued $\alpha$-equivariant 2-form on $P$ vanishing on vertical vectors. As before, on
the space $P_{\bar A}P$ of $\bar A$-horizontal paths, viewed as a principal $G$-bundle over the space $PM$ of smooth paths in $M$, there is the connection form $\omega_{(A,B)}$ given by
\[
\omega_{(A,B)} = \mathrm{ev}_1^*A + \int_0^1 \tau B.
\]
By a “smooth path” $s \mapsto \Gamma_s$ in $PM$, we mean a smooth map
\[
[0,1]^2 \to M : (t,s) \mapsto \Gamma(t,s) = \Gamma_s(t),
\]
viewed as a path of paths $\Gamma_s \in PM$. With this notation and framework, we have:

Theorem 2.1. Let
\[
\Phi : [0,1]^2 \to [0,1]^2 : (t,s) \mapsto (\Phi_s(t),\Phi^t(s))
\]
be a smooth diffeomorphism which fixes each vertex of $[0,1]^2$. Assume that

(i) either
\[
F^{\bar A} + \tau(B) = 0 \tag{2.27}
\]
and $\Phi$ carries each $s$-fixed section $[0,1]\times\{s\}$ into an $s$-fixed section $[0,1]\times\{\Phi^0(s)\}$;

(ii) or
\[
\bigl[\mathrm{ev}_1^*(A - \bar A) - \mathrm{ev}_0^*(A - \bar A)\bigr] + \int_0^1 \bigl(F^{\bar A} + \tau B\bigr) = 0, \tag{2.28}
\]
$\Phi$ maps each boundary edge of $[0,1]^2$ into itself, and $\Phi^0(s) = \Phi^1(s)$ for all $s \in [0,1]$.

Then the $\omega_{(A,B)}$-parallel-translate of the point $\tilde\Gamma_0\circ\Phi_0$ along the path $s \mapsto (\Gamma\circ\Phi)_s$ is $\tilde\Gamma_1\circ\Phi_1$, where $\tilde\Gamma_1$ is the $\omega_{(A,B)}$-parallel-translate of $\tilde\Gamma_0$ along $s \mapsto \Gamma_s$.

As a special case, if the path $s \mapsto \Gamma_s$ is constant and $\Phi_0$ the identity map on $[0,1]$, so that $\Gamma_1$ is simply a reparametrization of $\Gamma_0$, then, under conditions (i) or (ii) above, the $\omega_{(A,B)}$-parallel-translate of the point $\tilde\Gamma_0$ along the path $s \mapsto (\Gamma\circ\Phi)_s$ is $\tilde\Gamma_0\circ\Phi_1$, i.e. the appropriate reparametrization of the original path $\tilde\Gamma_0$.

Note that the path $(\tilde\Gamma\circ\Phi)_0$ projects down to $(\Gamma\circ\Phi)_0$, which, by the boundary behavior of $\Phi$, is actually the path $\Gamma_0\circ\Phi_0$, in other words $\Gamma_0$ reparametrized. Similarly, $(\tilde\Gamma\circ\Phi)_1$ is an $\bar A$-horizontal lift of the path $\Gamma_1$, reparametrized by $\Phi_1$.

If $A = \bar A$ then conditions (2.28) and (2.27) are the same, and so in this case the weaker condition on $\Phi$ in (ii) suffices.

Proof. Suppose (2.27) holds. Then the connection $\omega_{(A,B)}$ has the form
\[
\mathrm{ev}_0^*A + \bigl[\mathrm{ev}_1^*(A - \bar A) - \mathrm{ev}_0^*(A - \bar A)\bigr].
\]
The crucial point is that this depends only on the endpoints, i.e. if $\tilde\gamma \in P_{\bar A}P$ and $\tilde V \in T_{\tilde\gamma}P_{\bar A}P$ then $\omega_{(A,B)}(\tilde V)$ depends only on $\tilde V(0)$ and $\tilde V(1)$. If the conditions
on $\Phi$ in (i) hold then reparametrization has the effect of replacing each $\tilde\Gamma_s$ with $\tilde\Gamma_{\Phi^0(s)}\circ\Phi_s$, which is in $P_{\bar A}P$, and the vector field $t \mapsto \partial_s(\tilde\Gamma_{\Phi^0(s)}\circ\Phi_s(t))$ is an $\omega_{(A,B)}$-horizontal vector, because its endpoint values are those of $t \mapsto \partial_s(\tilde\Gamma_{\Phi^0(s)}(t))$, since $\Phi_s(t)$ equals $t$ if $t$ is 0 or 1.

Now suppose (2.28) holds. Then $\omega_{(A,B)}$ becomes simply $\mathrm{ev}_0^*A$. In this case $\omega_{(A,B)}(\tilde V)$ depends on $\tilde V$ only through the initial value $\tilde V(0)$. Thus, the $\omega_{(A,B)}$-parallel-transport of $\tilde\gamma \in P_{\bar A}P$, along a path $s \mapsto \Gamma_s \in PM$, is obtained by $A$-parallel-transporting the initial point $\tilde\gamma(0)$ along the path $s \mapsto \Gamma^0(s)$, and shooting off $\bar A$-horizontal paths lying above the paths $\Gamma_s$. (Since the paths $\Gamma_s$ do not necessarily have the second component fixed, their horizontal lifts need not be of the form $\tilde\Gamma_s\circ\Phi_s$, except at $s = 0$ and $s = 1$, when the composition $\tilde\Gamma_{\Phi_s}\circ\Phi_s$ is guaranteed to be meaningful.) From this it is clear that parallel translating $\tilde\Gamma_0\circ\Phi_0$, by $\omega_{(A,B)}$ along the path $s \mapsto \Gamma_s$, results, at $s = 1$, in the path $\tilde\Gamma_1\circ\Phi_1$.

2.7. The curvature of $\omega_{(A,B)}$

We can compute the curvature of the connection $\omega_{(A,B)}$. This is, by definition,
\[
\Omega^{\omega_{(A,B)}} = d\omega_{(A,B)} + \tfrac{1}{2}[\omega_{(A,B)}\wedge\omega_{(A,B)}],
\]
where the exterior differential $d$ is understood in a natural sense that will become clearer in the proof below. More technically, we are using here notions of calculus on smooth spaces; see, for instance, [16] for a survey, and [17] for another approach.

First we describe some notation about Chen integrals in the present context. If $B$ is a 2-form on $P$, with values in a Lie algebra, then its Chen integral $\int_0^1 B$, restricted to $P_{\bar A}P$, is a 1-form on $P_{\bar A}P$ given on the vector $\tilde V \in T_{\tilde\gamma}(P_{\bar A}P)$ by
\[
\Bigl(\int_0^1 B\Bigr)(\tilde V) = \int_0^1 B\bigl(\tilde\gamma'(t),\tilde V(t)\bigr)\,dt.
\]
If $C$ is also a 2-form on $P$ with values in the same Lie algebra, we have a product 2-form on the path space $P_{\bar A}P$ given on $\tilde X, \tilde Y \in T_{\tilde\gamma}(P_{\bar A}P)$ by
\[
\begin{aligned}
\Bigl(\int_0^1\Bigr)^2 [B\wedge C](\tilde X,\tilde Y)
&= \iint_{0\le u\,\cdots} \bigl[B(\tilde\gamma'(u),\tilde X(u)),\,C(\tilde\gamma'(v),\tilde Y(v))\bigr]\,du\,dv \\
&\quad - \iint_{0\le u\,\cdots} \bigl[C(\tilde\gamma'(u),\tilde X(u)),\,B(\tilde\gamma'(v),\tilde Y(v))\bigr]\,du\,dv \\
&= \int_0^1\!\!\int_0^1 \bigl[B(\tilde\gamma'(u),\tilde X(u)),\,C(\tilde\gamma'(v),\tilde Y(v))\bigr]\,du\,dv.
\end{aligned}
\tag{2.29}
\]
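Proposition 2.4. The curvature of $\omega_{(A,B)}$ is
\[
\Omega^{\omega_{(A,B)}} = \mathrm{ev}_1^*F^A + d\Bigl(\int_0^1 \tau B\Bigr) + \Bigl[\mathrm{ev}_1^*A \wedge \int_0^1 \tau B\Bigr] + \Bigl(\int_0^1\Bigr)^2 [\tau B\wedge\tau B], \tag{2.30}
\]
where the integrals are Chen integrals.

Proof. From
\[
\omega_{(A,B)} = \mathrm{ev}_1^*A + \int_0^1 \tau B,
\]
we have
\[
\begin{aligned}
\Omega^{\omega_{(A,B)}} &= d\omega_{(A,B)} + \tfrac{1}{2}[\omega_{(A,B)}\wedge\omega_{(A,B)}] \\
&= \mathrm{ev}_1^*\,dA + d\int_0^1 \tau B + W,
\end{aligned}
\tag{2.31}
\]
where
\[
\begin{aligned}
W(\tilde X,\tilde Y) &= [\omega_{(A,B)}(\tilde X),\,\omega_{(A,B)}(\tilde Y)] \\
&= [\mathrm{ev}_1^*A(\tilde X),\,\mathrm{ev}_1^*A(\tilde Y)] + \Bigl[\mathrm{ev}_1^*A(\tilde X),\,\int_0^1 \tau B(\tilde\gamma'(t),\tilde Y(t))\,dt\Bigr] \\
&\quad + \Bigl[\int_0^1 \tau B(\tilde\gamma'(t),\tilde X(t))\,dt,\,\mathrm{ev}_1^*A(\tilde Y)\Bigr] \\
&\quad + \int_0^1\!\!\int_0^1 \bigl[\tau B(\tilde\gamma'(u),\tilde X(u)),\,\tau B(\tilde\gamma'(v),\tilde Y(v))\bigr]\,du\,dv \\
&= [\mathrm{ev}_1^*A,\,\mathrm{ev}_1^*A](\tilde X,\tilde Y) + \Bigl[\mathrm{ev}_1^*A\wedge\int_0^1 \tau B\Bigr](\tilde X,\tilde Y) + \Bigl(\int_0^1\Bigr)^2[\tau B\wedge\tau B](\tilde X,\tilde Y).
\end{aligned}
\tag{2.32}
\]

In the case $A = \bar A$, and without $\tau$, the expression for the curvature can be expressed in terms of the “fake curvature” $F^{\bar A} + B$. For a result of this type, for a related connection form, see Cattaneo et al. [5, Theorem 2.6]. A more detailed exploration of the fake curvature would be of interest.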
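As a quick illustration of (2.30), consider the special case, assumed here only for illustration, in which the Lie algebra $LG$ is abelian. All Lie brackets then vanish, so $W = 0$ in (2.31) and $F^A = dA$, and the curvature formula collapses to two terms:

```latex
\[
\Omega^{\omega_{(A,B)}}
  = \mathrm{ev}_1^*F^A + d\Bigl(\int_0^1 \tau B\Bigr),
  \qquad F^A = dA \quad (LG \text{ abelian}).
\]
```

In this abelian setting the bracket terms in (2.30), which encode the non-commutativity of $G$, play no role.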
2.8. Parallel-transport of horizontal paths

As before, $A$ and $\bar A$ are connections on a principal $G$-bundle $\pi : P \to M$, and $B$ is an $LH$-valued $\alpha$-equivariant 2-form on $P$ vanishing on vertical vectors. Also $PX$ is the space of smooth paths $[0,1] \to X$ in a space $X$, and $P_{\bar A}P$ is the space of smooth $\bar A$-horizontal paths in $P$.

Our objective now is to express parallel-transport along paths in $PM$ in terms of a smooth local section of the bundle $P \to M$:
\[
\sigma : U \to P
\]
where $U$ is an open set in $M$. We will focus only on paths lying entirely inside $U$.

The section $\sigma$ determines a section $\tilde\sigma$ for the bundle $P_{\bar A}P \to PM$: if $\gamma \in PM$ then $\tilde\sigma(\gamma)$ is the unique $\bar A$-horizontal path in $P$, with initial point $\sigma(\gamma(0))$, which projects down to $\gamma$. Thus,
\[
\tilde\sigma(\gamma)(t) = \sigma(\gamma(t))\bar a(t), \tag{2.33}
\]
for all $t \in [0,1]$, where $\bar a(t) \in G$ satisfies the differential equation
\[
\bar a(t)^{-1}\bar a'(t) = -\mathrm{Ad}(\bar a(t)^{-1})\bar A\bigl((\sigma\circ\gamma)'(t)\bigr) \tag{2.34}
\]
for $t \in [0,1]$, and the initial value $\bar a(0)$ is $e$.

Recall that a tangent vector $V \in T_\gamma(PM)$ is a smooth vector field along the path $\gamma$. Let us denote $\tilde\sigma(\gamma)$ by $\tilde\gamma$:
\[
\tilde\gamma \stackrel{\mathrm{def}}{=} \tilde\sigma(\gamma).
\]
Note, for later use, that
\[
\tilde\gamma'(t) = \sigma_*(\gamma'(t))\bar a(t) + \underbrace{\tilde\gamma(t)\bar a(t)^{-1}\bar a'(t)}_{\text{vertical}}. \tag{2.35}
\]

Fig. 3. The section $\tilde\sigma$ applied to a path $c$.

Now define the vector
\[
\tilde V = \tilde\sigma_*(V) \in T_{\tilde\gamma}(P_{\bar A}P) \tag{2.36}
\]
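to be the vector $\tilde V$ in $T_{\tilde\gamma}(P_{\bar A}P)$ whose initial value $\tilde V(0)$ is
\[
\tilde V(0) = \sigma_*(V(0)).
\]
The existence and uniqueness of $\tilde V$ was proved in Lemma 2.1.

Note that $\tilde V(t) \in T_{\tilde\gamma(t)}P$ and $(\sigma_*V)(t) \in T_{\sigma(\gamma(t))}P$ are generally different vectors. However, $(\sigma_*V)(t)\bar a(t)$ and $\tilde V(t)$ are both in $T_{\tilde\gamma(t)}P$ and differ by a vertical vector because they have the same projection $V(t)$ under $\pi_*$:
\[
\tilde V(t) = (\sigma_*V)(t)\bar a(t) + \text{vertical vector}. \tag{2.37}
\]
Our objective now is to determine the $LG$-valued 1-form
\[
\omega_{(A,\bar A,B)} = \tilde\sigma^*\omega_{(A,B)} \tag{2.38}
\]
on $PM$, defined on any vector $V \in T_\gamma(PM)$ by
\[
\omega_{(A,\bar A,B)}(V) = \omega_{(A,B)}(\tilde\sigma_*V). \tag{2.39}
\]
We can now work out an explicit expression for this 1-form.

Proposition 2.5. With notation as above, and $V \in T_\gamma(PM)$,
\[
\omega_{(A,\bar A,B)}(V) = \mathrm{Ad}(\bar a(1)^{-1})A_\sigma(V(1)) + \int_0^1 \mathrm{Ad}(\bar a(t)^{-1})\,\tau B_\sigma\bigl(\gamma'(t),V(t)\bigr)\,dt, \tag{2.40}
\]
where $C_\sigma$ denotes the pullback $\sigma^*C$ on $M$ of a form $C$ on $P$, and $\bar a : [0,1] \to G$ describes parallel-transport along $\gamma$, i.e. satisfies
\[
\bar a(t)^{-1}\bar a'(t) = -\mathrm{Ad}(\bar a(t)^{-1})\bar A_\sigma(\gamma'(t))
\]
with initial condition $\bar a(0) = e$. The formula for $\omega_{(A,\bar A,B)}(V)$ can also be expressed as
\[
\begin{aligned}
\omega_{(A,\bar A,B)}(V) &= A_\sigma(V(0)) + \bigl[\mathrm{Ad}(\bar a(1)^{-1})(A_\sigma - \bar A_\sigma)(V(1)) - (A_\sigma - \bar A_\sigma)(V(0))\bigr] \\
&\quad + \int_0^1 \mathrm{Ad}(\bar a(t)^{-1})\bigl(F^{\bar A}_\sigma + \tau B_\sigma\bigr)\bigl(\gamma'(t),V(t)\bigr)\,dt.
\end{aligned}
\tag{2.41}
\]
Note that in (2.41), the terms involving $\bar A_\sigma$ and $F^{\bar A}_\sigma$ cancel each other out.

Proof. From the definition of $\omega_{(A,B)}$ in (2.17) and (2.14), we see that we need only focus on the $B$ term. To this end we have, from (2.35) and (2.37):
\[
\begin{aligned}
B\bigl(\tilde\gamma'(t),\tilde V(t)\bigr) &= B\bigl(\sigma_*(\gamma'(t))\bar a(t) + \text{vertical},\ (\sigma_*V)(t)\bar a(t) + \text{vertical}\bigr) \\
&= B\bigl(\sigma_*(\gamma'(t))\bar a(t),\ (\sigma_*V)(t)\bar a(t)\bigr) \\
&= \alpha(\bar a(t)^{-1})B_\sigma\bigl(\gamma'(t),V(t)\bigr).
\end{aligned}
\tag{2.42}
\]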
Now recall the relation (2.1)
\[
\tau(\alpha(g)h) = g\tau(h)g^{-1}, \quad\text{for all } g \in G \text{ and } h \in H,
\]
which implies
\[
\tau(\alpha(g)K) = \mathrm{Ad}(g)\tau(K) \quad\text{for all } g \in G \text{ and } K \in LH.
\]
As usual, we are denoting the derivatives of $\tau$ and $\alpha$ by $\tau$ and $\alpha$ again. Applying this to (2.42) we have
\[
\tau B\bigl(\tilde\gamma'(t),\tilde V(t)\bigr) = \mathrm{Ad}(\bar a(t)^{-1})\,\tau B_\sigma\bigl(\gamma'(t),V(t)\bigr),
\]
and this yields the result.

Suppose
\[
\tilde\Gamma : [0,1]^2 \to P : (t,s) \mapsto \tilde\Gamma(t,s) = \tilde\Gamma_s(t) = \tilde\Gamma^t(s)
\]
is smooth, with each $\tilde\Gamma_s$ being $\bar A$-horizontal, and the path $s \mapsto \tilde\Gamma(0,s)$ being $A$-horizontal. Let $\Gamma = \pi\circ\tilde\Gamma$. We will need to use the bi-holonomy $g(t,s)$ which is specified as follows: parallel translate $\tilde\Gamma(0,0)$ along $\Gamma_0|[0,t]$ by $\bar A$, then up the path $\Gamma^t|[0,s]$ by $A$, back along $\Gamma_s$-reversed by $\bar A$ and then down $\Gamma^0|[0,s]$ by $A$; then the resulting point is
\[
\tilde\Gamma(0,0)g(t,s). \tag{2.43}
\]
The path
\[
s \mapsto \tilde\Gamma_s
\]
describes parallel transport of the initial path $\tilde\Gamma_0$ using the connection $\mathrm{ev}_0^*A$. In what follows we will compare this with the path
\[
s \mapsto \hat\Gamma_s
\]
which is the parallel transport of $\hat\Gamma_0 = \tilde\Gamma_0$ using the connection $\mathrm{ev}_1^*A$. The following result describes the “difference” between these two connections.

Proposition 2.6. Suppose
\[
\tilde\Gamma : [0,1]^2 \to P : (t,s) \mapsto \tilde\Gamma(t,s) = \tilde\Gamma_s(t) = \tilde\Gamma^t(s)
\]
is smooth, with each $\tilde\Gamma_s$ being $\bar A$-horizontal, and the path $s \mapsto \tilde\Gamma(0,s)$ being $A$-horizontal. Then the parallel translate of $\tilde\Gamma_0$ by the connection $\mathrm{ev}_1^*A$ along the path $[0,s] \to PM : u \mapsto \Gamma_u$, where $\Gamma = \pi\circ\tilde\Gamma$, results in $\tilde\Gamma_s g(1,s)$, with $g(1,s)$ being the “bi-holonomy” specified as in (2.43).

Proof. Let $\hat\Gamma_s$ be the parallel translate of $\tilde\Gamma_0$ by $\mathrm{ev}_1^*A$ along the path $[0,s] \to PM : u \mapsto \Gamma_u$. Then the right endpoint $\hat\Gamma_s(1)$ traces out an $A$-horizontal path, starting at $\tilde\Gamma_0(1)$. Thus, $\hat\Gamma_s(1)$ is the result of parallel transporting $\tilde\Gamma(0,0)$ by $\bar A$ along $\Gamma_0$ then up the path $\Gamma^1|[0,s]$ by $A$. If we then parallel transport $\hat\Gamma_s(1)$ back by $\bar A$ along $\Gamma_s|[0,1]$-reversed then we obtain the initial point $\hat\Gamma_s(0)$. This point is of the form $\tilde\Gamma_s(0)b$, for some $b \in G$, and so
\[
\hat\Gamma_s = \tilde\Gamma_s b.
\]
Then, parallel-transporting $\hat\Gamma_s(0)$ back down $\Gamma^0|[0,s]$-reversed, by $A$, produces the point $\tilde\Gamma(0,0)b$. This shows that $b$ is the bi-holonomy $g(1,s)$.

Now we can turn to determining the parallel-transport process by the connection $\omega_{(A,B)}$. With $\tilde\Gamma$ as above, let now $\check\Gamma_s$ be the $\omega_{(A,B)}$-parallel-translate of $\tilde\Gamma_0$ along $[0,s] \to PM : u \mapsto \Gamma_u$. Since $\check\Gamma_s$ and $\tilde\Gamma_s$ are both $\bar A$-horizontal and project by $\pi_*$ down to $\Gamma_s$, we have
\[
\check\Gamma_s = \hat\Gamma_s b_s,
\]
for some $b_s \in G$. Since $\omega_{(A,B)} = \mathrm{ev}_1^*A + \tau(Z)$ applied to the $s$-derivative of $\check\Gamma_s$ is 0, and $\mathrm{ev}_1^*A$ applied to the $s$-derivative of $\hat\Gamma_s$ is 0, we have
\[
b_s^{-1}\partial_s b_s + \mathrm{Ad}(b_s^{-1})\,\tau Z(\partial_s\hat\Gamma_s) = 0. \tag{2.44}
\]
Thus, $s \mapsto b_s$ describes parallel transport by $\theta^\sigma$ where the section $\sigma$ satisfies $\sigma\circ\Gamma = \hat\Gamma$. Since $\hat\Gamma_s = \tilde\Gamma_s g(1,s)$, we then have
\[
\begin{aligned}
\frac{db_s}{ds}b_s^{-1} &= -\mathrm{Ad}(g(1,s)^{-1})\,\tau Z(\partial_s\tilde\Gamma_s) \\
&= -\mathrm{Ad}(g(1,s)^{-1})\int_0^1 \tau B\bigl(\partial_t\tilde\Gamma(t,s),\partial_s\tilde\Gamma(t,s)\bigr)\,dt.
\end{aligned}
\tag{2.45}
\]
To summarize:

Theorem 2.2. Suppose
\[
\tilde\Gamma : [0,1]^2 \to P : (t,s) \mapsto \tilde\Gamma(t,s) = \tilde\Gamma_s(t) = \tilde\Gamma^t(s)
\]
is smooth, with each $\tilde\Gamma_s$ being $\bar A$-horizontal, and the path $s \mapsto \tilde\Gamma(0,s)$ being $A$-horizontal. Then the parallel translate of $\tilde\Gamma_0$ by the connection $\omega_{(A,B)}$ along the path $[0,s] \to PM : u \mapsto \Gamma_u$, where $\Gamma = \pi\circ\tilde\Gamma$, results in
\[
\tilde\Gamma_s\,g(1,s)\,\tau(h_0(s)), \tag{2.46}
\]
with $g(1,s)$ being the “bi-holonomy” specified as in (2.43), and $s \mapsto h_0(s) \in H$ solving the differential equation
\[
\frac{dh_0(s)}{ds}h_0(s)^{-1} = -\alpha(g(1,s)^{-1})\int_0^1 B\bigl(\partial_t\tilde\Gamma(t,s),\partial_s\tilde\Gamma(t,s)\bigr)\,dt \tag{2.47}
\]
with initial condition $h_0(0)$ being the identity in $H$.

Let $\sigma$ be a smooth section of the bundle $P \to M$ in a neighborhood of $\Gamma([0,1]^2)$. Let $a^t(s) \in G$ specify parallel transport by $A$ up the path $[0,s] \to M : v \mapsto \Gamma(t,v)$, i.e. the $A$-parallel-translate of $\sigma\Gamma(t,0)$ up the path $[0,s] \to M : v \mapsto \Gamma(t,v)$ results in $\sigma(\Gamma(t,s))a^t(s)$. On the other hand, $\bar a_s(t)$ will specify parallel transport by $\bar A$ along $[0,t] \to M : u \mapsto \Gamma(u,s)$. Thus,
\[
\tilde\Gamma(t,s) = \sigma(\Gamma(t,s))\,a^0(s)\,\bar a_s(t). \tag{2.48}
\]
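Theorem 2.2 can be sanity-checked numerically in the simplest abelian situation. The sketch below assumes, purely for illustration, that $G = H = U(1)$ with $\tau$ the identity and $\alpha$ trivial, that $A = \bar A = i\,a$ for a real 1-form $a$ on $\mathbb{R}^2$, and that $B = -F^A$, so the fake curvature vanishes. Under these assumptions $\omega_{(A,B)}$ reduces to $\mathrm{ev}_0^*A$, and (2.46) then forces $g(1,1)\,\tau(h_0(1)) = 1$, which amounts to a Stokes-type identity. The specific form `a_form` and surface `gamma` are hypothetical choices, not taken from the text.

```python
import cmath

def a_form(x, y):
    """Hypothetical real 1-form a = a_x dx + a_y dy on R^2; the connection is i*a."""
    return (-y, x * y)

def f_curv(x, y):
    """Curvature coefficient f = d(a_y)/dx - d(a_x)/dy of the form above."""
    return y + 1.0

def gamma(t, s):
    """Hypothetical path of paths Gamma(t, s) in M = R^2."""
    return (t + 0.5 * s * t, s)

def d_gamma(t, s):
    """Partial derivatives (d/dt Gamma, d/ds Gamma), computed by hand."""
    return (1.0 + 0.5 * s, 0.0), (0.5 * t, 1.0)

def midpoint(fn, n=200):
    """Midpoint-rule approximation of the integral of fn over [0, 1]."""
    return sum(fn((k + 0.5) / n) for k in range(n)) / n

def pulled_back(t, s):
    """Components (a_t, a_s) of the pullback of a along Gamma."""
    ax, ay = a_form(*gamma(t, s))
    (xt, yt), (xs, ys) = d_gamma(t, s)
    return ax * xt + ay * yt, ax * xs + ay * ys

def loop_integral():
    """Integral of a around the boundary: along s=0, up t=1, back along s=1, down t=0."""
    bottom = midpoint(lambda t: pulled_back(t, 0.0)[0])
    right = midpoint(lambda s: pulled_back(1.0, s)[1])
    top = midpoint(lambda t: pulled_back(t, 1.0)[0])
    left = midpoint(lambda s: pulled_back(0.0, s)[1])
    return bottom + right - top - left

def surface_integral():
    """Double integral of F(d_t Gamma, d_s Gamma) over [0, 1]^2."""
    def inner(s):
        def integrand(t):
            (xt, yt), (xs, ys) = d_gamma(t, s)
            return f_curv(*gamma(t, s)) * (xt * ys - yt * xs)
        return midpoint(integrand)
    return midpoint(inner)

# Bi-holonomy g(1,1) = exp(-i * loop) and h_0(1) = exp(-i * integral of B) with B = -F.
g = cmath.exp(-1j * loop_integral())
h0 = cmath.exp(1j * surface_integral())

if __name__ == "__main__":
    print("g(1,1) * tau(h0(1)) =", g * h0)
```

In the non-abelian case the same comparison would require solving the ordered differential equations (2.34) and (2.47) rather than exponentiating integrals, which this sketch does not attempt.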
The bi-holonomy is given by
\[
g(1,s) = a^0(s)^{-1}\,\bar a_s(1)^{-1}\,a^1(s)\,\bar a_0(1).
\]
Let us look at parallel-transport along the path $s \mapsto \Gamma_s$, by the connection $\omega_{(A,B)}$, in terms of the trivialization $\sigma$. Let $\hat\Gamma_s \in P_{\bar A}P$ be obtained by parallel transporting $\tilde\Gamma_0 = \tilde\sigma(\Gamma_0) \in P_{\bar A}P$ along the path
\[
[0,s] \to M : u \mapsto \Gamma^0(u) = \Gamma(0,u).
\]
This transport is described through a map
\[
[0,1] \to G : s \mapsto c(s),
\]
specified through
\[
\hat\Gamma_s = \tilde\sigma(\Gamma_s)c(s) = \tilde\Gamma_s\,a^0(s)^{-1}c(s). \tag{2.49}
\]
Then $c(0) = e$ and
\[
c(s)^{-1}c'(s) = -\mathrm{Ad}(c(s)^{-1})\,\omega_{(A,\bar A,B)}(V(s)), \tag{2.50}
\]
where $V_s \in T_{\Gamma_s}PM$ is the vector field along $\Gamma_s$ given by
\[
V_s(t) = V(s,t) = \partial_s\Gamma(t,s) \quad\text{for all } t \in [0,1].
\]
Equation (2.50), written out in more detail, is
\[
c(s)^{-1}c'(s) = -\mathrm{Ad}(c(s)^{-1})\Bigl[\mathrm{Ad}(\bar a_s(1)^{-1})A_\sigma(V_s(1)) + \int_0^1 \mathrm{Ad}(\bar a_s(t)^{-1})\,\tau B_\sigma\bigl(\Gamma_s'(t),V_s(t)\bigr)\,dt\Bigr], \tag{2.51}
\]
where $\bar a_s(t) \in G$ describes $\bar A_\sigma$-parallel-transport along $\Gamma_s|[0,t]$. By (2.46), $c(s)$ is given by
\[
c(s) = a^0(s)\,g(1,s)\,\tau(h_0(s)),
\]
where $s \mapsto h_0(s)$ solves
\[
\frac{dh_0(s)}{ds}h_0(s)^{-1} = -\int_0^1 \alpha\bigl(\bar a_s(t)\,a^0(s)\,g(1,s)\bigr)^{-1} B_\sigma\bigl(\partial_t\Gamma(t,s),\partial_s\Gamma(t,s)\bigr)\,dt, \tag{2.52}
\]
with initial condition $h_0(0)$ being the identity in $H$. The geometric meaning of $\bar a_s(t)a^0(s)$ is that it describes parallel-transport first by $A_\sigma$ up from $(0,0)$ to $(0,s)$ and then to the right by $\bar A_\sigma$ from $(0,s)$ to $(t,s)$.

3. Two Categories from Plaquettes

In this section we introduce two categories motivated by the differential geometric framework we have discussed in the preceding sections. We show that the geometric framework naturally connects with certain category theoretic structures introduced by Ehresmann [9, 10] and developed further by Kelly and Street [12].
We work with the pair of Lie groups $G$ and $H$, along with maps $\tau$ and $\alpha$ satisfying (2.1), and construct two categories. These categories will have the same set of objects, and also the same set of morphisms.

The set of objects is simply the group $G$:
\[
\mathrm{Obj} = G.
\]
The set of morphisms is
\[
\mathrm{Mor} = G^4 \times H,
\]
with a typical element denoted
\[
(a,b,c,d;h).
\]
It is convenient to visualize a morphism as a plaquette labeled with elements of $G$:

Fig. 4. Plaquette (edges labeled $a$, $b$, $c$, $d$; interior labeled $h$).

To connect with the theory of the preceding sections, we should think of $a$ and $c$ as giving $\bar A$-parallel-transports, $d$ and $b$ as $A$-parallel-transports, and $h$ should be thought of as corresponding to $h_0(1)$ of Theorem 2.2. However, this is only a rough guide; we shall return to this matter later in this section.

For the category Vert, the source (domain) and target (co-domain) of a morphism are:
\[
s_{\mathrm{Vert}}(a,b,c,d;h) = a, \qquad t_{\mathrm{Vert}}(a,b,c,d;h) = c.
\]
For the category Horz, they are
\[
s_{\mathrm{Horz}}(a,b,c,d;h) = d, \qquad t_{\mathrm{Horz}}(a,b,c,d;h) = b.
\]
We define vertical composition, that is composition in Vert, using Fig. 5. In this figure, the upper morphism is being applied first and then the lower. Horizontal composition is specified through Fig. 6. In this figure, we have used the notation $\circ_{\mathrm{opp}}$ to stress that, as morphisms, it is the one to the left which is applied first and then the one to the right.

Our first observation is:

Proposition 3.1. Both Vert and Horz are categories, under the specified composition laws. In both categories, all morphisms are invertible.
Fig. 5. Vertical composition (for $c = a'$): $(a',b',c',d';h') \circ_V (a,b,c,d;h) = (a,\ b'b,\ c',\ d'd;\ h(\alpha(d^{-1})h'))$.

Fig. 6. Horizontal composition (for $b = d'$): $(a',b',c',d';h') \circ_H (a,b,c,d;h) = (a'a,\ b',\ c'c,\ d;\ (\alpha(a^{-1})h')h)$.

Fig. 7. Identity maps: the identity for Vert is $(a,e,a,e;e)$ and the identity for Horz is $(e,a,e,a;e)$.
Proof. It is straightforward to verify that the composition laws are associative. The identity map $a \to a$ in Vert is $(a,e,a,e;e)$, and in Horz it is $(e,a,e,a;e)$. These are displayed in Fig. 7. The inverse of the morphism $(a,b,c,d;h)$ in Vert is $(c,b^{-1},a,d^{-1};\alpha(d)h^{-1})$; the inverse in Horz is $(a^{-1},d,c^{-1},b;\alpha(a)h^{-1})$.

The two categories are isomorphic, but it is best not to identify them. We use $\circ_H$ to denote horizontal composition, and $\circ_V$ to denote vertical composition.

We have seen earlier that if $A$, $\bar A$ and $B$ are such that $\omega_{(A,B)}$ reduces to $\mathrm{ev}_0^*A$ (for example, if $A = \bar A$ and $F^{\bar A} + \tau(B)$ is 0) then all plaquettes $(a,b,c,d;h)$ arising from the connections $A$ and $\omega_{(A,B)}$ satisfy
\[
\tau(h) = a^{-1}b^{-1}cd.
\]
Motivated by this observation, we could consider those morphisms $(a,b,c,d;h)$ which satisfy
\[
\tau(h) = a^{-1}b^{-1}cd. \tag{3.1}
\]
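The composition laws and the inverse formulas in the proof of Proposition 3.1 can be checked mechanically on a small example. The sketch below assumes a toy crossed module chosen only for illustration: $G = H = S_3$, with $\tau$ the identity and $\alpha(g)h = ghg^{-1}$, which satisfies (2.1). The function names are hypothetical, and the composition rules are transcribed from the formulas used in the proof of Proposition 3.2.

```python
from itertools import permutations, product

# Toy data, assumed only for illustration: G = H = S_3, tau = identity,
# alpha(g)h = g h g^{-1}. A permutation is a tuple p with p[i] the image of i.
S3 = list(permutations(range(3)))
E = (0, 1, 2)

def mul(p, q):
    """Group product p*q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    """Inverse permutation."""
    r = [0, 0, 0]
    for i, image in enumerate(p):
        r[image] = i
    return tuple(r)

def alpha(g, h):
    """Action of G on H by conjugation."""
    return mul(mul(g, h), inv(g))

def comp_v(second, first):
    """second o_V first; needs target c of first equal to source a' of second."""
    a, b, c, d, h = first
    a2, b2, c2, d2, h2 = second
    if c != a2:
        raise ValueError("not composable in Vert")
    return (a, mul(b2, b), c2, mul(d2, d), mul(h, alpha(inv(d), h2)))

def comp_h(second, first):
    """second o_H first; needs target b of first equal to source d' of second."""
    a, b, c, d, h = first
    a2, b2, c2, d2, h2 = second
    if b != d2:
        raise ValueError("not composable in Horz")
    return (mul(a2, a), b2, mul(c2, c), d, mul(alpha(inv(a), h2), h))

def id_v(a):
    return (a, E, a, E, E)

def id_h(a):
    return (E, a, E, a, E)

def inv_v(m):
    a, b, c, d, h = m
    return (c, inv(b), a, inv(d), alpha(d, inv(h)))

def inv_h(m):
    a, b, c, d, h = m
    return (inv(a), d, inv(c), b, alpha(a, inv(h)))

def check_inverses():
    """Check the stated inverse formulas on every morphism in G^4 x H."""
    for m in product(S3, repeat=5):
        a, b, c, d, _ = m
        if comp_v(inv_v(m), m) != id_v(a) or comp_v(m, inv_v(m)) != id_v(c):
            return False
        if comp_h(inv_h(m), m) != id_h(d) or comp_h(m, inv_h(m)) != id_h(b):
            return False
    return True

if __name__ == "__main__":
    print("inverse formulas hold:", check_inverses())
```

Because the check only uses the group axioms and the automorphism property of $\alpha$, the same script should work for any finite crossed module once `mul`, `inv` and `alpha` are replaced accordingly.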
However, we can look at a broader class of morphisms as well. Suppose
\[
\mathbf{h} \mapsto z(\mathbf{h}) \in Z(G)
\]
is a mapping of the morphisms in the category Horz or in Vert into the center $Z(G)$ of $G$, which carries composition of morphisms to products in $Z(G)$:
\[
z(\mathbf{h}\circ\mathbf{h}') = z(\mathbf{h})z(\mathbf{h}').
\]
Then we say that a morphism $\mathbf{h} = (a,b,c,d;h)$ is quasi-flat with respect to $z$ if
\[
\tau(h) = (a^{-1}b^{-1}cd)\,z(\mathbf{h}). \tag{3.2}
\]
A larger class of morphisms could also be considered, by replacing $Z(G)$ by an abelian normal subgroup, but we shall not explore this here.

Proposition 3.2. Composition of quasi-flat morphisms is quasi-flat. Thus, the quasi-flat morphisms form a subcategory in both Horz and Vert.

Proof. Let $\mathbf{h} = (a,b,c,d;h)$ and $\mathbf{h}' = (a',b',c',d';h')$ be quasi-flat morphisms in Horz, such that the horizontal composition $\mathbf{h}'\circ_H\mathbf{h}$ is defined, i.e. $b = d'$. Then
\[
\mathbf{h}'\circ_H\mathbf{h} = (a'a,\ b',\ c'c,\ d;\ \{\alpha(a^{-1})h'\}h).
\]
Applying $\tau$ to the last component in this, we have
\[
\begin{aligned}
a^{-1}\tau(h')a\,\tau(h) &= a^{-1}(a'^{-1}b'^{-1}c'd')a\,(a^{-1}b^{-1}cd)\,z(\mathbf{h})z(\mathbf{h}') \\
&= \bigl((a'a)^{-1}b'^{-1}(c'c)d\bigr)\,z(\mathbf{h}'\circ_H\mathbf{h}),
\end{aligned}
\tag{3.3}
\]
which says that $\mathbf{h}'\circ_H\mathbf{h}$ is quasi-flat.
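Now suppose $\mathbf{h} = (a,b,c,d;h)$ and $\mathbf{h}' = (a',b',c',d';h')$ are quasi-flat morphisms in Vert, such that the vertical composition $\mathbf{h}'\circ_V\mathbf{h}$ is defined, i.e. $c = a'$. Then
\[
\mathbf{h}'\circ_V\mathbf{h} = (a,\ b'b,\ c',\ d'd;\ h\{\alpha(d^{-1})h'\}).
\]
Applying $\tau$ to the last component in this, we have
\[
\begin{aligned}
\tau(h)d^{-1}\tau(h')d &= (a^{-1}b^{-1}cd)\,d^{-1}(a'^{-1}b'^{-1}c'd')d\,z(\mathbf{h})z(\mathbf{h}') \\
&= \bigl(a^{-1}(b'b)^{-1}c'd'd\bigr)\,z(\mathbf{h}'\circ_V\mathbf{h}),
\end{aligned}
\tag{3.4}
\]
which says that $\mathbf{h}'\circ_V\mathbf{h}$ is quasi-flat.

For a morphism $\mathbf{h} = (a,b,c,d;h)$ we set
\[
\tau(\mathbf{h}) = \tau(h).
\]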
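The closure statement of Proposition 3.2 can be spot-checked by brute force. The sketch below makes the same illustrative assumptions as a toy model: $G = H = S_3$, $\tau$ the identity, $\alpha$ conjugation, and $z$ trivial (the center of $S_3$ is trivial), so quasi-flatness is condition (3.1). The helper names `flat`, `is_flat` and `check_closure` are hypothetical.

```python
import random
from itertools import permutations

# Toy data, assumed only for illustration: G = H = S_3, tau = identity,
# alpha = conjugation, z trivial, so "quasi-flat" means condition (3.1).
S3 = list(permutations(range(3)))

def mul(p, q):
    """Group product p*q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    r = [0, 0, 0]
    for i, image in enumerate(p):
        r[image] = i
    return tuple(r)

def alpha(g, h):
    return mul(mul(g, h), inv(g))

def tau(h):
    return h

def boundary(a, b, c, d):
    """The product a^{-1} b^{-1} c d appearing in (3.1)."""
    return mul(mul(inv(a), inv(b)), mul(c, d))

def flat(a, b, c, d):
    """The morphism with these edge labels satisfying (3.1) (unique as tau is injective here)."""
    return (a, b, c, d, boundary(a, b, c, d))

def is_flat(m):
    a, b, c, d, h = m
    return tau(h) == boundary(a, b, c, d)

def comp_v(second, first):
    """second o_V first, assuming target c of first equals source a' of second."""
    a, b, c, d, h = first
    a2, b2, c2, d2, h2 = second
    assert c == a2
    return (a, mul(b2, b), c2, mul(d2, d), mul(h, alpha(inv(d), h2)))

def comp_h(second, first):
    """second o_H first, assuming target b of first equals source d' of second."""
    a, b, c, d, h = first
    a2, b2, c2, d2, h2 = second
    assert b == d2
    return (mul(a2, a), b2, mul(c2, c), d, mul(alpha(inv(a), h2), h))

def check_closure(trials=2000, seed=0):
    """Compose randomly chosen composable flat morphisms and test flatness of the result."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c, d, a2, b2, c2, d2 = (rng.choice(S3) for _ in range(8))
        first = flat(a, b, c, d)
        if not is_flat(comp_v(flat(c, b2, c2, d2), first)):
            return False
        if not is_flat(comp_h(flat(a2, b2, c2, b), first)):
            return False
    return True

if __name__ == "__main__":
    print("flat morphisms closed under composition:", check_closure())
```

A nontrivial central map $z$ would need a group with nontrivial center; the structure of the check would stay the same, with `boundary` multiplied by the value of $z$.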
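If $\mathbf{h} = (a,b,c,d;h)$ and $\mathbf{h}' = (a',b',c',d';h')$ are morphisms then we say that they are $\tau$-equivalent,
\[
\mathbf{h} =_\tau \mathbf{h}'
\]
if $a = a'$, $b = b'$, $c = c'$, $d = d'$, and $\tau(h) = \tau(h')$.

Proposition 3.3. If $\mathbf{h}$, $\mathbf{h}'$, $\mathbf{h}''$, $\mathbf{h}'''$ are quasi-flat morphisms for which the compositions on both sides of (3.5) are meaningful, then
\[
(\mathbf{h}'''\circ_H\mathbf{h}'')\circ_V(\mathbf{h}'\circ_H\mathbf{h}) =_\tau (\mathbf{h}'''\circ_V\mathbf{h}')\circ_H(\mathbf{h}''\circ_V\mathbf{h}). \tag{3.5}
\]
Thus, the structures we are using here correspond to double categories as described by Kelly and Street [12, Sec. 1.1].

Proof. This is a lengthy but straightforward verification. We refer to Fig. 8. For a morphism $\mathbf{h} = (a,b,c,d;h)$, let us write
\[
\tau_\partial(\mathbf{h}) = a^{-1}b^{-1}cd.
\]
For the left-hand side of (3.5), we have
\[
\begin{aligned}
(\mathbf{h}'\circ_H\mathbf{h}) &= (a'a,\ b',\ c'c,\ d;\ \{\alpha(a^{-1})h'\}h), \\
(\mathbf{h}'''\circ_H\mathbf{h}'') &= (c'c,\ b''',\ f'f,\ d'';\ \{\alpha(c^{-1})h'''\}h''),
\end{aligned}
\tag{3.6}
\]
\[
\mathbf{h}^* \stackrel{\mathrm{def}}{=} (\mathbf{h}'''\circ_H\mathbf{h}'')\circ_V(\mathbf{h}'\circ_H\mathbf{h}) = (a'a,\ b'''b',\ f'f,\ d''d;\ h^*),
\]
where
\[
h^* = \{\alpha(a^{-1})h'\}\,h\,\{\alpha(d^{-1}c^{-1})h'''\}\{\alpha(d^{-1})h''\}. \tag{3.7}
\]
Applying $\tau$ gives
\[
\begin{aligned}
\tau(h^*) &= a^{-1}\tau(h')z(\mathbf{h}')a \cdot \tau(h)z(\mathbf{h})\,d^{-1}c^{-1}\tau(h''')cd \cdot z(\mathbf{h}''') \cdot d^{-1}\tau(h'')d\,z(\mathbf{h}'') \\
&= (a'a)^{-1}(b'''b')^{-1}(f'f)(d''d)\,z(\mathbf{h}^*),
\end{aligned}
\tag{3.8}
\]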
Fig. 8. Consistency of horizontal and vertical compositions.
where we have used the fact, from (2.1), that $\alpha$ is converted to a conjugation on applying $\tau$, and the last line follows after algebraic simplification. Thus,
\[
\tau(h^*) = \tau_\partial(\mathbf{h}^*)\,z(\mathbf{h}^*). \tag{3.9}
\]
On the other hand, by an entirely similar computation, we obtain
\[
\mathbf{h}_* \stackrel{\mathrm{def}}{=} (\mathbf{h}'''\circ_V\mathbf{h}')\circ_H(\mathbf{h}''\circ_V\mathbf{h}) = (a'a,\ b'''b',\ f'f,\ d''d;\ h_*), \tag{3.10}
\]
where
\[
h_* = \{\alpha(a^{-1})h'\}\{\alpha(a^{-1}b^{-1})h'''\}\,h\,\{\alpha(d^{-1})h''\}. \tag{3.11}
\]
Applying $\tau$ to this yields, after using (2.1) and computation,
\[
\tau(h_*) = \tau_\partial(\mathbf{h}_*)\,z(\mathbf{h}_*).
\]
Since $\tau(h_*)$ is equal to $\tau(h^*)$, the result (3.5) follows.

Ideally, a discrete model would be the exact “integrated” version of the differential geometric connection $\omega_{(A,B)}$. However, it is not clear if such an ideal transcription is feasible for any such connection $\omega_{(A,B)}$ on the path-space bundle. To make contact with the differential picture we have developed in earlier sections, we should compare quasi-flat morphisms with parallel translation by $\omega_{(A,B)}$ in the case where $B$ is such that $\omega_{(A,B)}$ reduces to $\mathrm{ev}_0^*A$ (for instance, if $A = \bar A$ and the fake curvature $F^{\bar A} + \tau(B)$ vanishes); more precisely, the $h$ for quasi-flat morphisms (taking all $z(\mathbf{h})$ to be the identity) corresponds to the quantity $h_0(1)$ specified through the differential equation (2.47).

It would be desirable to have a more thorough relationship between the discrete structures and the differential geometric constructions, even in the case when $z(\cdot)$ is not the identity. We hope to address this in future work.
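The interchange law (3.5) can also be tested on a toy example. The sketch below again assumes, for illustration only, $G = H = S_3$ with $\tau$ the identity, $\alpha$ conjugation and $z$ trivial, so that $\tau$-equivalence is plain equality. It builds four flat morphisms sharing edge labels as in (3.6) and compares both sides of (3.5); it also exhibits a choice of non-flat morphisms for which the two sides differ, showing that some flatness hypothesis is needed in this model. All function names are hypothetical.

```python
import random
from itertools import permutations

# Toy data, assumed only for illustration: G = H = S_3, tau = identity, alpha = conjugation.
S3 = list(permutations(range(3)))
E = (0, 1, 2)

def mul(p, q):
    """Group product p*q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    r = [0, 0, 0]
    for i, image in enumerate(p):
        r[image] = i
    return tuple(r)

def alpha(g, h):
    return mul(mul(g, h), inv(g))

def flat(a, b, c, d):
    """Morphism with h = a^{-1} b^{-1} c d, i.e. satisfying (3.1) for tau = identity."""
    return (a, b, c, d, mul(mul(inv(a), inv(b)), mul(c, d)))

def comp_v(second, first):
    a, b, c, d, h = first
    a2, b2, c2, d2, h2 = second
    assert c == a2
    return (a, mul(b2, b), c2, mul(d2, d), mul(h, alpha(inv(d), h2)))

def comp_h(second, first):
    a, b, c, d, h = first
    a2, b2, c2, d2, h2 = second
    assert b == d2
    return (mul(a2, a), b2, mul(c2, c), d, mul(alpha(inv(a), h2), h))

def both_sides(m0, m1, m2, m3):
    """Left and right sides of (3.5) for h = m0, h' = m1, h'' = m2, h''' = m3."""
    lhs = comp_v(comp_h(m3, m2), comp_h(m1, m0))
    rhs = comp_h(comp_v(m3, m1), comp_v(m2, m0))
    return lhs, rhs

def check_interchange(trials=2000, seed=0):
    """Test (3.5) on random 2x2 arrangements of flat morphisms."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, a1, c, c1, f, f1, d, d2, b, b2, b1, b3 = (rng.choice(S3) for _ in range(12))
        m0 = flat(a, b, c, d)       # h
        m1 = flat(a1, b1, c1, b)    # h'   (d' = b)
        m2 = flat(c, b2, f, d2)     # h''  (source c)
        m3 = flat(c1, b3, f1, b2)   # h''' (source c', d''' = b'')
        lhs, rhs = both_sides(m0, m1, m2, m3)
        if lhs != rhs:
            return False
    return True

# Non-flat morphisms with trivial edges and non-commuting H-labels.
s, t = (1, 0, 2), (0, 2, 1)
non_flat_lhs, non_flat_rhs = both_sides((E, E, E, E, s), (E, E, E, E, E),
                                        (E, E, E, E, E), (E, E, E, E, t))

if __name__ == "__main__":
    print("interchange law on flat morphisms:", check_interchange())
    print("fails for the non-flat example:", non_flat_lhs != non_flat_rhs)
```

Since $\tau$ is injective in this toy model, the equality tested here is stronger than $\tau$-equivalence; with a non-injective $\tau$ one would compare edge labels and $\tau(h)$ instead.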
4. Concluding Remarks

We have constructed in (2.17) a connection $\omega_{(A,B)}$ from a connection $A$ on a principal $G$-bundle $P$ over $M$, and a 2-form $B$ taking values in the Lie algebra of a second structure group $H$. The connection $\omega_{(A,B)}$ lives on a bundle of $\bar A$-horizontal paths, where $\bar A$ is another connection on $P$ which may be viewed as governing the gauge theoretic interaction along each curve. Associated to each path $s \mapsto \Gamma_s$ of paths, beginning with an initial path $\Gamma_0$ and ending in a final path $\Gamma_1$ in $M$, is a parallel transport process by the connection $\omega_{(A,B)}$. We have studied conditions (in Theorem 2.1) under which this transport is “surface-determined”, that is, depends more on the surface $\Gamma$ swept out by the path of paths than on the specific parametrization, given by $\Gamma$, of this surface.

We also described connections over the path space of $M$ with values in the Lie algebra $LH$ obtained from $\bar A$ and $B$. We developed an “integrated” version, or a discrete version, of this theory, which is most conveniently formulated in terms of categories of quadrilateral diagrams. These diagrams, or morphisms, arise from parallel transport by $\omega_{(A,B)}$ when $B$ has a special form which makes the parallel transports surface-determined.

Our results and constructions extend a body of literature ranging from differential geometric investigations to category theoretic ones. We have developed both aspects, clarifying their relationship.

Acknowledgments

We are grateful to the anonymous referee for useful comments and pointing us to the reference [12]. Our thanks to Urs Schreiber for the reference [16]. We also thank Swarnamoyee Priyajee Gupta for preparing some of the figures. ANS acknowledges research support from US NSF grant DMS-0601141. AL acknowledges research support from Department of Science and Technology, India under Project No. SR/S2/HEP-0006/2008.

References

[1] J. Baez, Higher Yang–Mills theory, http://arxiv.org/abs/hep-th/0206130.
[2] J. Baez and U. Schreiber, Higher gauge theory, http://arXiv:hep-th/0511710v2.
[3] J. Baez and U. Schreiber, Higher gauge theory II: 2-connections on 2-bundles, http://arxiv.org/abs/hep-th/0412325.
[4] L. Breen and W. Messing, Differential geometry of gerbes, http://arxiv.org/abs/math/0106083.
[5] A. S. Cattaneo, P. Cotta-Ramusino and M. Rinaldi, Loop and path spaces and four-dimensional BF theories: Connections, holonomies and observables, Comm. Math. Phys. 204 (1999) 493–524.
[6] D. Chatterjee, On gerbs, Ph.D. thesis, University of Cambridge (1998).
[7] K.-T. Chen, Algebras of iterated path integrals and fundamental groups, Trans. Amer. Math. Soc. 156 (1971) 359–379.
[8] K.-T. Chen, Iterated integrals of differential forms and loop space homology, Ann. of Math. 97(2) (1973) 217–246.
[9] C. Ehresmann, Catégories structurées, Ann. Sci. École Norm. Sup. 80 (1963) 349–425.
[10] C. Ehresmann, Catégories et structures (Dunod, Paris, 1965).
[11] F. Girelli and H. Pfeiffer, Higher gauge theory: Differential versus integral formulation, J. Math. Phys. 45 (2004) 3949–3971; http://arxiv.org/abs/hep-th/0309173.
[12] G. M. Kelly and R. Street, Review of the elements of 2-categories, in Category Seminar (Proc. Sem., Sydney, 1972/1973), Lecture Notes in Math., Vol. 420 (Springer, Berlin, 1974), pp. 75–103.
[13] A. Lahiri, Surface holonomy and gauge 2-group, Int. J. Geom. Methods Mod. Phys. 1 (2004) 299–309.
[14] M. Murray, Bundle gerbes, J. London Math. Soc. 54 (1996) 403–416.
[15] H. Pfeiffer, Higher gauge theory and a non-abelian generalization of 2-form electrodynamics, Ann. Phys. 308 (2003) 447–477; http://arxiv.org/abs/hep-th/0304074.
[16] A. Stacey, Comparative smootheology; http://arxiv.org/abs/0802.2225.
[17] O. Viro, http://www.pdmi.ras.ru/~olegviro/talks.html.
October 12, J070-S0129055X10004132
2010 10:1 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 22, No. 9 (2010) 1061–1097 © World Scientific Publishing Company DOI: 10.1142/S0129055X10004132
MODULI SPACES OF G2 MANIFOLDS
SERGEY GRIGORIAN
Max-Planck-Institut für Gravitationsphysik (Albert-Einstein-Institut), Am Mühlenberg 1, D-14476 Golm, Germany
and
Simons Center for Geometry and Physics, Stony Brook University, Stony Brook, NY 11794, USA
[email protected]
Received 27 January 2010
Revised 14 June 2010
This paper is a review of current developments in the study of moduli spaces of G2 manifolds. G2 manifolds are seven-dimensional manifolds with the exceptional holonomy group G2. Although they are odd-dimensional, in many ways they can be considered as an analogue of Calabi–Yau manifolds in seven dimensions. They play an important role in physics as natural candidates for supersymmetric vacuum solutions of M-theory compactifications. Despite the physical motivation, many of the results are of purely mathematical interest. Here we cover the basics of G2 manifolds, local deformation theory of G2 structures and the local geometry of the moduli spaces of G2 structures.
Keywords: Special holonomy; moduli space; M-theory.
Mathematics Subject Classification 2010: 53C25, 53C29, 53Z05
1. Introduction
Ever since antiquity there has been a very close relationship between physics and geometry. Originally, in Timaeus, Plato related four of the five Platonic solids (the tetrahedron, hexahedron, octahedron and icosahedron) to the elements fire, earth, air and water, respectively, while the fifth solid, the dodecahedron, was the quintessence of which the cosmos itself is made. Later, Isaac Newton’s Laws of Motion and Theory of Gravitation gave a precise mathematical framework in which the motion of objects can be calculated. However, Albert Einstein’s General Relativity made it very explicit that the physics of spacetime is determined by its geometry. More recently, this fundamental relationship has been taken to a new level with the development of String and M-theory. Over the past 25 years, superstring theory has emerged as a successful candidate for the role of a theory that would unify gravity with other interactions. It was later discovered that all five superstring theories can be obtained as special limits of a more general 11-dimensional theory known as M-theory, whose low energy limit is 11-dimensional supergravity [44, 46]. The complete formulation of M-theory is, however, not known yet.
One of the key features of String and M-theory is that these theories are formulated in 10- and 11-dimensional spacetimes, respectively. One of the techniques to relate this to the visible four-dimensional world is to assume that the remaining six or seven dimensions are curled up as a small, compact, so-called internal space. This is known as compactification. Such a procedure also leads to a remarkable interrelationship between physics and geometry, since the effective physical content of the resulting four-dimensional theory is determined by the geometry of the internal space. Usually the full multidimensional spacetime is regarded as a direct product M4 × X, where M4 is a four-dimensional non-compact manifold with Lorentzian signature (− + ++) and X is a compact six- or seven-dimensional Riemannian manifold. In general, the parameters that define the geometry of the internal space give rise to massless scalar fields known as moduli, and the properties of the moduli space are determined by the class of spaces used in the compactification.
The properties of the internal space in String and M-theory compactifications are governed by physical considerations. A key ingredient of these theories is supersymmetry [45]. Supersymmetry is a physical symmetry between particles whose spins differ by 1/2, that is, between integer spin bosons and half-integer spin fermions. Mathematically, bosons are represented as functions or tensors and fermions as spinors. When looking for a supersymmetric vacuum for which the metric is the only non-zero field, that is, a Ricci-flat solution that is invariant under supersymmetry transformations, it turns out that a necessary requirement is the existence of a covariantly constant, or parallel, spinor. That is, there must exist a non-trivial spinor η on the Riemannian manifold X that satisfies
$$ \nabla \eta = 0 \tag{1.1} $$
where ∇ is the relevant spinor covariant derivative [8]. This condition implies that η is invariant under parallel transport. Properties of parallel transport on a Riemannian manifold are closely related to the concept of holonomy. Consider a vector v at some point x on X. Using the natural Levi–Civita connection that comes from the Riemannian metric, we can parallel transport v along paths in X. In particular, consider a closed contractible path γ based at x. As shown in Fig. 1, if we parallel transport v along γ, then the new vector v′ which we get will necessarily have the same magnitude as the original vector v, but otherwise it does not have to be the same. This gives the notion of holonomy group. Below we give the precise definition.
Definition 1. Let (X, g) be a Riemannian manifold of dimension n with metric g and corresponding Levi–Civita connection ∇, and fix a point x ∈ X. Let γ : [0, 1] → X be a loop based at x, that is, a piecewise-smooth path such that γ(0) = γ(1) = x. The parallel transport map Pγ : Tx X → Tx X is then an invertible linear map which lies in O(n) (and in SO(n) when X is orientable). Define the Riemannian holonomy group Holx (X, g) of ∇ based at x to be Holx (X, g) = {Pγ : γ is a loop based at x} ⊂ O(n).
Fig. 1. Parallel transport of a vector.
If the manifold X is connected, then it is trivial to see that the holonomy group is independent of the base point, and can hence be defined for the whole manifold. Parallel transport is initially defined for vectors, but can then be naturally extended to other objects like tensors and spinors, with the holonomy group acting on these objects via relevant representations. Now going back to the covariantly constant spinor η, (1.1) implies that η is invariant under the action of the holonomy group. This shows that the spinor representation of Hol(X, g) must contain the trivial representation. For Hol(X, g) = SO(n), this is not possible since the spinor representation then contains no trivial summand, so Hol(X, g) must be a proper subgroup of SO(n). Hence the condition (1.1) implies a reduced holonomy group. Thus, Ricci-flat special holonomy manifolds occur very naturally in string and M-theory.
As shown by Berger [9], the list of possible special holonomy groups is very limited. In particular, if X is simply-connected, and neither locally a product nor symmetric, the only possibilities are given in Fig. 2. In this list manifolds with holonomy SU(k), Sp(k), G2 and Spin(7) are Ricci-flat. Moreover, these groups are subgroups of SO(n) and are simply-connected. This implies that manifolds with these holonomy groups always admit a spin structure ([30, Proposition 3.6.2]). These are also precisely the manifolds that admit a parallel spinor. Kähler manifolds only admit parallel projective spinors, that is, a parallel line subbundle of the spinor bundle.

Geometry       Holonomy   Dimension
Kähler         U(k)       2k
Calabi–Yau     SU(k)      2k
HyperKähler    Sp(k)      4k
Exceptional    G2         7
Exceptional    Spin(7)    8

Fig. 2. List of special holonomy groups.
Thus, for a Ricci-flat supersymmetric vacuum in a 10-dimensional theory, X has to be six-dimensional in order to reduce to four dimensions, and hence necessarily a Calabi–Yau manifold. Similarly, for an 11-dimensional theory, seven-dimensional manifolds with G2 holonomy arise naturally. We have thus seen that even rather simple physical requirements restrict the geometry of the manifold X to rather special classes. In particular, the study of Calabi–Yau manifolds has been crucial in the development of String Theory, and in fact some very important discoveries in the theory of Calabi–Yau manifolds have been made thanks to advances in the physics. One such major discovery is Mirror Symmetry [41, 27]. This symmetry first appeared in String Theory where evidence was found that conformal field theories (CFTs) related to compactifications on a Calabi–Yau manifold with Hodge numbers $(h^{1,1}, h^{2,1})$ are equivalent to CFTs on a Calabi–Yau manifold with Hodge numbers $(h^{2,1}, h^{1,1})$. Mirror symmetry is currently a powerful tool both for calculations in String Theory and in the study of the Calabi–Yau manifolds and their moduli spaces.
In the mathematical literature G2 holonomy first appeared in Berger’s list of special holonomy groups in 1955 [9]. In 1966, Bonan showed that manifolds with G2 holonomy are Ricci-flat. It was known from general theory that having a holonomy group G is equivalent to having a torsion-free G-structure. So it was natural to study G2 structures on manifolds to get a better understanding of G2 holonomy. The different classes of G2 structures were explored by Fernández and Gray in their 1982 paper [18]. In particular they showed that a torsion-free G2 structure is equivalent to the G2-invariant three-form ϕ being closed and co-closed.
It was not known whether the group G2 (or indeed Spin(7) for that matter) does actually appear as a non-symmetric holonomy group until in 1987 Bryant [12] proved the existence of metrics with G2 and Spin(7) holonomy. In a later paper, Bryant and Salamon [11] constructed complete metrics with G2 holonomy. However, the first compact examples of G2 holonomy manifolds were constructed by Joyce in 1996 [28, 29]. These examples are based on quotients $T^7/\Gamma$ where Γ is a finite group. Such quotient spaces usually exhibit singularities, and Joyce showed that it is possible to resolve these singularities in such a way as to get a smooth, compact manifold with G2 holonomy. Since then, a number of other types of constructions have been found, in particular the construction by Kovalev [35] where a compact G2 manifold is obtained by gluing together two non-compact asymptotically cylindrical Riemannian manifolds with holonomy SU(3).
In the G2 holonomy compactification approach to M-theory, the physical content of the four-dimensional theory is given by the moduli of G2 holonomy manifolds. A review of the role of G2 manifolds in M-theory is given by Acharya and Gukov [2] and by Duff [17]. Such a compactification of M-theory is in many ways analogous to Calabi–Yau compactifications in String Theory, where much progress has been made through the study of the Calabi–Yau moduli spaces. In particular, as it was shown in [14, 40], the moduli space of complex structures and the
complexified moduli space of Kähler structures are both, in fact, Kähler manifolds. Moreover, both have a special geometry: that is, both have a line bundle whose first Chern class coincides with the Kähler class. However, until recently, the structure of the moduli space of G2 holonomy manifolds has not been studied in that much detail. Generally, it turns out that the study of G2 manifolds is quite difficult. Unlike the study of Calabi–Yau manifolds where the machinery of algebraic geometry has been used with great success, in the case of G2 manifolds there is no analogue, so analytical rather than algebraic study is needed.
In this review, we aim to give an overview of what is currently known about G2 moduli spaces and corresponding deformations of G2 structures. We first give an introduction to the properties of the group G2, namely its definitions and representations. Then we look at general properties of G2 structures. Finally we move on to properties of G2 moduli spaces. Note that here we will only be looking at smooth compact G2 manifolds. Properties of the non-compact asymptotically cylindrical G2 manifolds have recently been studied by Kovalev and Nordström [36] and by Nordström [38], while the properties of G2 manifolds with conical singularities have been studied by Karigiannis [33].
2. The Group G2
2.1. Automorphisms of octonions
The group G2 is the smallest of the five exceptional Lie groups, the others being F4, E6, E7 and E8. Surprisingly, all of these Lie groups are related to the octonions, but G2 is especially close. So let us first give a few facts about the octonions. The eight-dimensional algebra of octonions, denoted by O, is the largest possible normed division algebra. The others of course are the real numbers R, complex numbers C and the quaternions H. Following Baez [6], it turns out that division algebras can be defined using the notion of triality.
Given three real vector spaces U, V, W, a triality is a non-degenerate trilinear map t : U × V × W → R. Non-degenerate here means that for any fixed non-zero elements of U and V, the induced functional on W is non-zero. Hence, t also defines a bilinear map m : U × V → W∗. For each fixed non-zero element of U, this map defines an isomorphism between V and W∗, and for each fixed non-zero element of V, an isomorphism between U and W∗. Hence these three spaces are isomorphic to each other, and if we choose to identify non-zero elements e1 ∈ U, e2 ∈ V, and e1 e2 ∈ W∗, we can identify the spaces U, V, W with each other, and we can say that m now defines multiplication on U with identity element e = e1 = e2 = e1 e2. Note that in particular, the existence of a non-degenerate trilinear map implies that the original vector spaces U, V, W are all of the same dimension.
Due to the non-degeneracy of the original triality, multiplication by a fixed element is an isomorphism, so in fact, U is a division algebra. Assuming further that U, V, W are inner product spaces, if the triality map satisfies $|t(u, v, w)| \le \|u\|\,\|v\|\,\|w\|$ and is such that for all u, v there exists a non-zero w such that the bound is attained (and similarly for cyclic permutations of u, v, w) then we get a normed division algebra. The converse is also true: any division algebra defines a triality. As discussed in detail by Baez [6], on $\mathbb{R}^n$ it is possible to construct bilinear maps $m_n$ involving the vector and spinor representations of Spin(n)
$$ m_n : V_n \times S_n^{\pm} \to S_n^{\mp} \quad \text{for } n = 0, 4 \ (\mathrm{mod}\ 8), \tag{2.1a} $$
$$ m_n : V_n \times S_n \to S_n \quad \text{otherwise}, \tag{2.1b} $$
where $V_n$ is the vector representation of SO(n) and $S_n^{(\pm)}$ are the (left- and right-handed) spinor representations. The spinor representations in (2.1) are self-dual, so in principle, by dualizing the maps in (2.1), we could obtain trilinear maps into R. However, in order to obtain trialities, these maps have to be non-degenerate, and hence the dimensions of the relevant representations must agree. This happens only for n = 1, 2, 4, 8, and each of these trialities gives a normed division algebra of the corresponding dimension:
$$ \begin{aligned} t_1 &: V_1 \times S_1 \times S_1 \to \mathbb{R} \;\Rightarrow\; \mathbb{R}, \\ t_2 &: V_2 \times S_2 \times S_2 \to \mathbb{R} \;\Rightarrow\; \mathbb{C}, \\ t_4 &: V_4 \times S_4^{+} \times S_4^{-} \to \mathbb{R} \;\Rightarrow\; \mathbb{H}, \\ t_8 &: V_8 \times S_8^{+} \times S_8^{-} \to \mathbb{R} \;\Rightarrow\; \mathbb{O}. \end{aligned} \tag{2.2} $$
This way, via the trialities we obtain all of the normed division algebras. In general, suppose we have a triality $t : U_1 \times U_2 \times U_3 \to \mathbb{R}$. Then to define a normed division algebra from t, we fix two vectors in two of the three spaces. Hence the automorphism group of the division algebra is the subgroup of the automorphism group of the triality that fixes these two vectors. For $t_8$ the automorphism group of the triality turns out to be Spin(8), while G2 is defined as the automorphism group of the corresponding octonion algebra. Thus we have
Definition 2. The group G2 is the automorphism group of the octonion algebra.
Since G2 is the automorphism group of octonions, it is the subgroup of Spin(8) (the automorphism group of the triality $t_8$) that preserves unit vectors in $V_8$ and $S_8^{+}$. As explained by Baez in [6], the subgroup of Spin(8) that fixes a unit vector in $V_8$ is Spin(7). Moreover, if the representation $S_8^{+}$ is restricted to Spin(7), we get the spinor representation $S_7$. Therefore, G2 is the subgroup of Spin(7) that fixes a unit vector in $S_7$. In this representation, Spin(7) acts transitively on the unit sphere $S^7$, so we have
$$ \mathrm{Spin}(7)/G_2 = S^7. \tag{2.3} $$
Hence we have the following result.
Proposition 3. The group G2 has dimension 14.
Proof. From (2.3), $\dim G_2 = \dim(\mathrm{Spin}(7)) - \dim S^7 = 21 - 7 = 14$.
The automorphism group fixes the identity, so in fact G2 acts non-trivially on octonions that are orthogonal to the identity (the imaginary octonions, denoted by Im(O)), and thus we get a natural seven-dimensional representation of G2. A closer look at this representation reveals another description of G2. Using octonion multiplication, we can define a cross product on Im(O) by
$$ a \times b = \mathrm{Im}(ab) = \frac{1}{2}(ab - ba). \tag{2.4} $$
But G2 preserves octonion multiplication, hence any element of G2 preserves the seven-dimensional cross product. Alternatively, (2.4) can be written as
$$ a \times b = ab + \langle a, b \rangle \tag{2.5} $$
where $\langle \cdot, \cdot \rangle$ is the octonionic inner product, in general defined by
$$ \langle a, b \rangle = \frac{1}{2}(a b^{*} + b a^{*}). $$
Also, it can be shown that
$$ \langle a, b \rangle = -\frac{1}{6} \mathrm{Tr}(a \times (b \times \cdot)). \tag{2.6} $$
Therefore, from (2.5), multiplication of imaginary octonions can be defined in terms of the cross product, hence any transformation preserving the cross product preserves multiplication on Im(O), and is thus in G2. So, G2 is precisely the group that preserves the seven-dimensional cross product. Moreover, from the cross product we can form a “scalar triple product” on Im(O) given by
$$ \varphi_0(a, b, c) = \langle a, b \times c \rangle = \langle a, bc \rangle. \tag{2.7} $$
This defines $\varphi_0$ as an anti-symmetric trilinear functional, that is, a three-form on $\mathbb{R}^7$. Equivalently, for a basis $e_i$ of Im(O),
$$ e_i \times e_j = \varphi_0{}^{k}{}_{ij}\, e_k. \tag{2.8} $$
So in this description, the components of ϕ0 are essentially the structure constants of the algebra of imaginary octonions. A well known way to encode the multiplication rules for the octonions is the Fano plane [6]. It is shown in Fig. 3. In the diagram, the vertices e1 , . . . , e7 are the seven square roots of −1. Multiplication follows along the six straight lines (sides of the triangle and the altitudes) and along the central circle in the direction of the arrows. So if ei , ej , ek are in this order on a straight line, then ei ej = ek and ej ei = −ek .
Fig. 3. Fano plane.
However, from (2.8) we see that $\varphi_0$ encodes precisely the same information as the Fano plane. Suppose $x^1, \dots, x^7$ are coordinates on $\mathbb{R}^7$ and let $e^{ijk} = dx^i \wedge dx^j \wedge dx^k$; then, just reading off from the Fano plane, $\varphi_0$ can be written as
$$ \varphi_0 = e^{123} + e^{145} + e^{167} + e^{246} - e^{257} - e^{347} - e^{356}. \tag{2.9} $$
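The relations (2.6), (2.8) and (2.9) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `perm_sign`, `antisym_tensor` and `cross` are illustrative and not part of the original text. It builds the components of the three-form from (2.9), defines the cross product through (2.8), and evaluates the gaps in antisymmetry, in the norm identity |a × b|² = |a|²|b|² − ⟨a, b⟩², and in the trace formula (2.6) for random vectors.

```python
import itertools
import numpy as np

def perm_sign(p):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def antisym_tensor(terms, rank, dim=7):
    """Totally antisymmetric tensor from {1-based index tuple: coefficient}."""
    t = np.zeros((dim,) * rank)
    for idx, coeff in terms.items():
        for perm in itertools.permutations(range(rank)):
            t[tuple(idx[k] - 1 for k in perm)] = coeff * perm_sign(perm)
    return t

# Components of phi_0 read off from (2.9).
PHI0_TERMS = {(1, 2, 3): 1, (1, 4, 5): 1, (1, 6, 7): 1, (2, 4, 6): 1,
              (2, 5, 7): -1, (3, 4, 7): -1, (3, 5, 6): -1}
phi0 = antisym_tensor(PHI0_TERMS, 3)

def cross(a, b):
    """Cross product (a x b)_k = phi0_{kij} a_i b_j, following (2.8)."""
    return np.einsum("kij,i,j->k", phi0, a, b)

e = np.eye(7)  # e[0], ..., e[6] stand for e_1, ..., e_7
rng = np.random.default_rng(0)
a, b = rng.normal(size=7), rng.normal(size=7)

e1_cross_e2 = cross(e[0], e[1])  # compared with e_3 below
antisymmetry_gap = np.max(np.abs(cross(a, b) + cross(b, a)))
# Norm identity: |a x b|^2 = |a|^2 |b|^2 - <a, b>^2
norm_gap = abs(cross(a, b) @ cross(a, b)
               - ((a @ a) * (b @ b) - (a @ b) ** 2))
# Trace formula (2.6): <a, b> = -(1/6) Tr(a x (b x .))
M = np.einsum("kij,i,jml,m->kl", phi0, a, phi0, b)
trace_gap = abs(-np.trace(M) / 6.0 - a @ b)

print(np.allclose(e1_cross_e2, e[2]), antisymmetry_gap, norm_gap, trace_gap)
```

Indices are zero-based in the code, so `e[0]` plays the role of $e_1$. The three gaps should be at the level of floating-point rounding.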
Note that in order to keep the same convention for $\varphi_0$ as Joyce [30], in the Fano plane we have a different numbering for the octonions compared to Baez [6]. With this choice of coordinates, the inner product on $\mathrm{Im}(\mathbb{O}) \cong \mathbb{R}^7$ is given by the standard Euclidean metric
$$ g_0 = (dx^1)^2 + \cdots + (dx^7)^2. \tag{2.10} $$
As seen from (2.6), G2 preserves the inner product on Im(O), so it clearly preserves $g_0$ and is hence a subgroup of SO(7). Since $\varphi_0$ defines the seven-dimensional cross product, and G2 is the symmetry group of this cross product, G2 is the stabilizer of $\varphi_0$ in GL(7, R). So we can state:
Theorem 4 ([12]). The subgroup of GL(7, R) that preserves the three-form $\varphi_0$ is G2.
From the metric $g_0$ we can define the Hodge star $*_0$ on $\mathbb{R}^7$, and using this, the dual four-form $\psi_0 = *_0 \varphi_0$ which is given by
$$ \psi_0 = e^{4567} + e^{2367} + e^{2345} + e^{1357} - e^{1346} - e^{1256} - e^{1247}. \tag{2.11} $$
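As a cross-check of (2.11), the Hodge dual of the three-form in (2.9) can be computed with the seven-dimensional Levi-Civita symbol. The sketch below assumes NumPy and the component convention $(*\alpha)_{abcd} = \frac{1}{3!}\alpha_{ijk}\epsilon_{ijkabcd}$ for the flat metric (2.10); the helper names are illustrative. It also tests the first contraction identity (2.14a) stated in Proposition 5.

```python
import itertools
import numpy as np

def perm_sign(p):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def antisym_tensor(terms, rank, dim=7):
    """Totally antisymmetric tensor from {1-based index tuple: coefficient}."""
    t = np.zeros((dim,) * rank)
    for idx, coeff in terms.items():
        for perm in itertools.permutations(range(rank)):
            t[tuple(idx[k] - 1 for k in perm)] = coeff * perm_sign(perm)
    return t

# Components read off from (2.9) and (2.11).
PHI0_TERMS = {(1, 2, 3): 1, (1, 4, 5): 1, (1, 6, 7): 1, (2, 4, 6): 1,
              (2, 5, 7): -1, (3, 4, 7): -1, (3, 5, 6): -1}
PSI0_TERMS = {(4, 5, 6, 7): 1, (2, 3, 6, 7): 1, (2, 3, 4, 5): 1,
              (1, 3, 5, 7): 1, (1, 3, 4, 6): -1, (1, 2, 5, 6): -1,
              (1, 2, 4, 7): -1}
phi0 = antisym_tensor(PHI0_TERMS, 3)
psi0 = antisym_tensor(PSI0_TERMS, 4)

# Levi-Civita symbol in seven dimensions (a 7^7 array, about 6.6 MB).
eps = np.zeros((7,) * 7)
for perm in itertools.permutations(range(7)):
    eps[perm] = perm_sign(perm)

# Hodge dual: contract the three indices of phi0 with the first three of eps.
hodge_phi0 = np.tensordot(phi0, eps, axes=3) / 6.0
hodge_matches = np.allclose(hodge_phi0, psi0)

# Contraction identity (2.14a) with the flat metric g0 = identity.
g0 = np.eye(7)
lhs = np.einsum("abc,mnc->abmn", phi0, phi0)
rhs = (np.einsum("am,bn->abmn", g0, g0)
       - np.einsum("an,bm->abmn", g0, g0) + psi0)
contraction_matches = np.allclose(lhs, rhs)

print(hodge_matches, contraction_matches)
```

Both comparisons depend on the orientation and sign conventions of (2.9) and (2.11); with a different numbering of the octonions the sign in front of the four-form in (2.14a) can change.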
The characterization of G2 in Theorem 4 is a key property, and as such it is often taken as the definition of the group G2, in particular in [30]. As we have seen, G2 preserves both ϕ0 and g0,
so it also preserves $\psi_0$. In particular, $\varphi_0$ and $\psi_0$ give alternate descriptions of the trivial one-dimensional representation of G2. It also turns out that $\psi_0$ is closely related to the associator on Im(O). As the octonions are non-associative, we can define a non-trivial associator map $[\cdot, \cdot, \cdot] : \mathrm{Im}(\mathbb{O}) \times \mathrm{Im}(\mathbb{O}) \times \mathrm{Im}(\mathbb{O}) \to \mathrm{Im}(\mathbb{O})$ given by
$$ [a, b, c] = a(bc) - (ab)c. \tag{2.12} $$
Just as $\varphi_0$ is defined as a dualization of the cross product using the inner product to obtain the map $\varphi_0 : \mathrm{Im}(\mathbb{O}) \times \mathrm{Im}(\mathbb{O}) \times \mathrm{Im}(\mathbb{O}) \to \mathbb{R}$, so it turns out that up to a constant multiple the map $\psi_0 : \mathrm{Im}(\mathbb{O}) \times \mathrm{Im}(\mathbb{O}) \times \mathrm{Im}(\mathbb{O}) \times \mathrm{Im}(\mathbb{O}) \to \mathbb{R}$ is a dualization of the associator, given by
$$ \psi_0(a, b, c, d) = \frac{1}{2} \langle [a, b, c], d \rangle. \tag{2.13} $$
It is possible to show that $\varphi_0$ and $\psi_0$ satisfy various contraction identities. In particular, from [13, 21, 32], we have
Proposition 5. The three-form $\varphi_0$ and the corresponding four-form $\psi_0$ satisfy the following identities:
$$ \varphi_{0abc}\, \varphi_{0mn}{}^{c} = g_{0am} g_{0bn} - g_{0an} g_{0bm} + \psi_{0abmn}, \tag{2.14a} $$
$$ \varphi_{0abc}\, \psi_{0mnp}{}^{c} = 3\left( g_{0a[m} \varphi_{0np]b} - g_{0b[m} \varphi_{0np]a} \right), \tag{2.14b} $$
$$ \psi_{0abcd}\, \psi_0{}^{mnpq} = 24\, \delta_a^{[m} \delta_b^{n} \delta_c^{p} \delta_d^{q]} + 72\, \psi_{0[ab}{}^{[mn} \delta_c^{p} \delta_{d]}^{q]} - 16\, \varphi_{0[abc}\, \varphi_0{}^{[mnp} \delta_{d]}^{q]}, \tag{2.14c} $$
where [m n p] denotes antisymmetrization of indices and $\delta_a^b$ is the Kronecker delta, with $\delta_a^b = 1$ if a = b and 0 otherwise.
The above identities can of course be further contracted; the details can be found in [21, 32]. These identities and their contractions are crucial whenever any calculations involving $\varphi_0$ and $\psi_0$ have to be done. In particular, these are very useful when studying G2 manifolds.
2.2. Representations of G2
As we will see in Sec. 3, a crucial role in the study of G2 structures is played by the representations of G2. Since G2 is a subgroup of SO(7), it has a fundamental vector representation on $\mathbb{R}^7$. In the study of G2 manifolds, it is very important to understand the representations of G2 on p-forms. So let us consider first the
representations of G2 on antisymmetric tensors in $\mathbb{R}^7$. For brevity let $V = \mathbb{R}^7$. Following Bryant [13], we first look at the Lie algebra so(7), which is the space of antisymmetric 7 × 7 matrices on V. For a vector ω ∈ V, define the map $\rho_\varphi : V \to \mathfrak{so}(7)$ given by
$$ \rho_\varphi(\omega) = \omega \lrcorner \varphi_0, \tag{2.15} $$
which is clearly injective. Conversely, define the map $\tau_\varphi : \mathfrak{so}(7) \to V$ given by
$$ \tau_\varphi(\alpha_{ab})^{c} = \frac{1}{6} \varphi_0{}^{c}{}_{ab}\, \alpha^{ab}. \tag{2.16} $$
From (2.14), we get that $\tau_\varphi(\rho_\varphi(\omega)) = \omega$, so that $\tau_\varphi$ is a partial inverse of $\rho_\varphi$. Thus we get a decomposition
$$ \mathfrak{so}(7) = \ker \tau_\varphi \oplus \rho_\varphi(V) \tag{2.17} $$
where $\dim \rho_\varphi(V) = 7$ and $\dim \ker \tau_\varphi = 14$. It turns out that $\ker \tau_\varphi$ is in fact a Lie algebra with respect to the matrix commutator. This is the Lie algebra bracket on so(7) and satisfies the Jacobi identity. It is hence only necessary to show that for $\alpha, \beta \in \ker \tau_\varphi$, we have $[\alpha, \beta] \in \ker \tau_\varphi$. This is an exercise in applying the contractions for ϕ. Thus we get a 14-dimensional Lie subalgebra of so(7). However, this is precisely the Lie algebra $\mathfrak{g}_2$ [32], that is
$$ \mathfrak{g}_2 = \ker \tau_\varphi = \{ \alpha \in \mathfrak{so}(7) : \varphi_{0abc}\, \alpha^{bc} = 0 \}. \tag{2.18} $$
This further implies that we get the following decomposition of so(7):
$$ \mathfrak{so}(7) = \mathfrak{g}_2 \oplus \rho_\varphi(V). \tag{2.19} $$
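The decomposition (2.17) to (2.19) lends itself to a direct numerical check. The sketch below, which assumes NumPy and uses illustrative helper names, computes the kernel of the map (2.16) on a basis of antisymmetric 7 × 7 matrices via a singular value decomposition, and then tests three things: that the kernel is 14-dimensional, that it is closed under the matrix commutator, and that its elements annihilate the three-form (2.9) under the infinitesimal action on three-tensors.

```python
import itertools
import numpy as np

def perm_sign(p):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def antisym_tensor(terms, rank, dim=7):
    """Totally antisymmetric tensor from {1-based index tuple: coefficient}."""
    t = np.zeros((dim,) * rank)
    for idx, coeff in terms.items():
        for perm in itertools.permutations(range(rank)):
            t[tuple(idx[k] - 1 for k in perm)] = coeff * perm_sign(perm)
    return t

PHI0_TERMS = {(1, 2, 3): 1, (1, 4, 5): 1, (1, 6, 7): 1, (2, 4, 6): 1,
              (2, 5, 7): -1, (3, 4, 7): -1, (3, 5, 6): -1}
phi0 = antisym_tensor(PHI0_TERMS, 3)

def tau(alpha):
    """Map (2.16): tau(alpha)^c = (1/6) phi0_{cab} alpha_{ab}."""
    return np.einsum("cab,ab->c", phi0, alpha) / 6.0

def rho(omega):
    """Map (2.15): rho(omega)_{ab} = omega_d phi0_{dab}."""
    return np.einsum("d,dab->ab", omega, phi0)

# Basis of so(7): E_ab - E_ba for a < b (21 matrices).
so7_basis = []
for a, b in itertools.combinations(range(7), 2):
    m = np.zeros((7, 7))
    m[a, b], m[b, a] = 1.0, -1.0
    so7_basis.append(m)
so7_basis = np.array(so7_basis)

# Matrix of tau in this basis (7 x 21) and its null space via SVD.
T = np.array([tau(m) for m in so7_basis]).T
_, s, vt = np.linalg.svd(T)
rank = int(np.sum(s > 1e-10))
g2_basis = np.einsum("ni,iab->nab", vt[rank:], so7_basis)
dim_g2 = g2_basis.shape[0]

# Closure under the commutator: tau([alpha_n, alpha_m]) should vanish.
comm = (np.einsum("nab,mbc->nmac", g2_basis, g2_basis)
        - np.einsum("mab,nbc->nmac", g2_basis, g2_basis))
closure_gap = np.max(np.abs(np.einsum("cab,nmab->nmc", phi0, comm)))

# Infinitesimal action of each kernel element on phi0.
action = (np.einsum("nda,dbc->nabc", g2_basis, phi0)
          + np.einsum("ndb,adc->nabc", g2_basis, phi0)
          + np.einsum("ndc,abd->nabc", g2_basis, phi0))
stabilizer_gap = np.max(np.abs(action))

# tau is a left inverse of rho.
omega = np.arange(1.0, 8.0)
roundtrip_gap = np.max(np.abs(tau(rho(omega)) - omega))

print(dim_g2, closure_gap, stabilizer_gap, roundtrip_gap)
```

The vanishing of `stabilizer_gap` is the infinitesimal version of Theorem 4 restricted to so(7): the kernel of the map (2.16) consists of the antisymmetric matrices that preserve the three-form.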
The group G2 acts via the adjoint representation on the 14-dimensional vector space $\mathfrak{g}_2$ and via the fundamental vector representation on the seven-dimensional space $\rho_\varphi(V)$. This is a G2-invariant irreducible decomposition of so(7) into the representations 7 and 14. Hence we get the following result:
Theorem 6 ([12]). The space $\Lambda^2$ of two-forms on V decomposes as
$$ \Lambda^2 = \Lambda^2_7 \oplus \Lambda^2_{14}, \tag{2.20} $$
with the components $\Lambda^2_7$ and $\Lambda^2_{14}$ given by:
$$ \Lambda^2_7 = \{ \omega \lrcorner \varphi_0 : \omega \text{ a vector} \}, \tag{2.21a} $$
$$ \Lambda^2_{14} = \left\{ \alpha = \tfrac{1}{2} \alpha_{ab}\, e^a \wedge e^b : (\alpha_{ab}) \in \mathfrak{g}_2 \right\}. \tag{2.21b} $$
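An alternative, but fully equivalent, description of $\Lambda^2_7$ and $\Lambda^2_{14}$ presents them as eigenspaces of the operator $T_\psi : \Lambda^2 \to \Lambda^2$ given by
$$ T_\psi(\alpha)_{ab} = \psi_{0abcd}\, \alpha^{cd}. \tag{2.22} $$
With this description, we have [32]:
$$ \Lambda^2_7 = \{ \alpha \in \Lambda^2 : T_\psi \alpha = 4\alpha \}, \tag{2.23a} $$
$$ \Lambda^2_{14} = \{ \alpha \in \Lambda^2 : T_\psi \alpha = -2\alpha \}. \tag{2.23b} $$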
Correspondingly, the description of the 7 and 14 pieces of $\Lambda^5$ is obtained from (2.21a) and (2.21b) via Hodge duality.
Let us now look at three-forms in more detail. Consider $\mathrm{Sym}^2(V^*)$, the space of symmetric two-tensors on V, and define a map $i_\varphi : \mathrm{Sym}^2(V^*) \to \Lambda^3$ given by
$$ i_\varphi(h)_{abc} = h^{d}{}_{[a}\, \varphi_{0bc]d}. \tag{2.24} $$
We can decompose $\mathrm{Sym}^2(V^*) = \mathbb{R} g_0 \oplus \mathrm{Sym}^2_0(V^*)$ where $\mathbb{R} g_0$ is the set of symmetric tensors proportional to the metric $g_0$ and $\mathrm{Sym}^2_0(V^*)$ is the set of traceless symmetric tensors. This is a G2-invariant irreducible decomposition of $\mathrm{Sym}^2(V^*)$ into one-dimensional and 27-dimensional representations. We clearly have $i_\varphi(g_0)_{abc} = \varphi_{0abc}$, so the map $i_\varphi$ is also G2-invariant and is injective on each summand of this decomposition. Looking at the first summand, we get that $i_\varphi(\mathbb{R} g_0) = \Lambda^3_1$, the one-dimensional singlet representation of G2. Now look at the second summand and consider $i_\varphi(\mathrm{Sym}^2_0(V^*))$. This is 27-dimensional and irreducible, so it gives a 27-dimensional representation of G2 on three-forms: $i_\varphi(\mathrm{Sym}^2_0(V^*)) = \Lambda^3_{27}(V^*)$. Now, $\Lambda^3$ is 35-dimensional, and we have accounted for 1 + 27 = 28 dimensions. Thus we still have seven dimensions left unaccounted for in $\Lambda^3$. So let us extend the map $i_\varphi$ to $\Lambda^2$, the antisymmetric two-tensors on $\mathbb{R}^7$. Suppose $\beta \in \Lambda^2_7$. Then $\beta = \omega \lrcorner \varphi_0$ for some vector ω ∈ V, so
$$ i_\varphi(\beta)_{abc} = \varphi_0{}^{d}{}_{[a|e|}\, \varphi_{0bc]d}\, \omega^{e} = \psi_{0abcd}\, \omega^{d} \tag{2.25} $$
where we have used (2.14). This defines a G2-invariant map from V to $\Lambda^3$ and hence gives $\Lambda^3_7$. So overall we thus have a decomposition of three-forms into irreducible representations of G2:
Theorem 7 ([13]). The space $\Lambda^3$ of three-forms on V decomposes as
$$ \Lambda^3 = \Lambda^3_1 \oplus \Lambda^3_7 \oplus \Lambda^3_{27} \tag{2.26} $$
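where
$$ \Lambda^3_1 = \{ \chi \in \Lambda^3 : \chi_{abc} = f \varphi_{0abc} \text{ for scalar } f \}, \tag{2.27a} $$
$$ \Lambda^3_7 = \{ \omega \lrcorner \psi_0 : \omega \text{ a vector} \}, \tag{2.27b} $$
$$ \Lambda^3_{27} = \{ \chi \in \Lambda^3 : \chi_{abc} = h^{d}{}_{[a}\, \varphi_{0bc]d} \text{ for } h_{ab} \text{ traceless, symmetric} \}. \tag{2.27c} $$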
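The dimension count 1 + 7 + 27 = 35 behind Theorem 7 can be reproduced numerically. The following sketch assumes NumPy, uses illustrative helper names, and takes the antisymmetrization in (2.24) with unit weight, so that the image of the metric is the three-form itself. It checks that the map (2.24) has rank 28 on symmetric tensors, that adding the seven forms of type (2.27b) brings the rank to 35, and that the norms of the 7 and 27 components scale as $4|\omega|^2$ and $\frac{2}{9}|h|^2$ for the norm $|\chi|^2 = \frac{1}{6}\chi_{abc}\chi_{abc}$.

```python
import itertools
import numpy as np

def perm_sign(p):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def antisym_tensor(terms, rank, dim=7):
    """Totally antisymmetric tensor from {1-based index tuple: coefficient}."""
    t = np.zeros((dim,) * rank)
    for idx, coeff in terms.items():
        for perm in itertools.permutations(range(rank)):
            t[tuple(idx[k] - 1 for k in perm)] = coeff * perm_sign(perm)
    return t

PHI0_TERMS = {(1, 2, 3): 1, (1, 4, 5): 1, (1, 6, 7): 1, (2, 4, 6): 1,
              (2, 5, 7): -1, (3, 4, 7): -1, (3, 5, 6): -1}
PSI0_TERMS = {(4, 5, 6, 7): 1, (2, 3, 6, 7): 1, (2, 3, 4, 5): 1,
              (1, 3, 5, 7): 1, (1, 3, 4, 6): -1, (1, 2, 5, 6): -1,
              (1, 2, 4, 7): -1}
phi0 = antisym_tensor(PHI0_TERMS, 3)
psi0 = antisym_tensor(PSI0_TERMS, 4)

def i_phi(h):
    """Map (2.24): antisymmetrize h_{da} phi0_{bcd} over a, b, c."""
    t = np.einsum("da,bcd->abc", h, phi0)  # already antisymmetric in b, c
    return (t + np.einsum("abc->bca", t) + np.einsum("abc->cab", t)) / 3.0

def contract_psi(w):
    """Three-form w_d psi0_{dabc}, as in (2.27b)."""
    return np.einsum("d,dabc->abc", w, psi0)

def norm_sq(chi):
    """Norm of a three-form: (1/3!) chi_{abc} chi_{abc}."""
    return np.sum(chi ** 2) / 6.0

# Basis of symmetric 7 x 7 matrices (28 elements).
sym_basis = []
for a in range(7):
    for b in range(a, 7):
        h = np.zeros((7, 7))
        h[a, b] = h[b, a] = 1.0
        sym_basis.append(h)

images_sym = np.array([i_phi(h).ravel() for h in sym_basis])
images_vec = np.array([contract_psi(w).ravel() for w in np.eye(7)])
rank_sym = np.linalg.matrix_rank(images_sym)
rank_total = np.linalg.matrix_rank(np.vstack([images_sym, images_vec]))
metric_maps_to_phi0 = np.allclose(i_phi(np.eye(7)), phi0)

# Norm relations for random elements of the 7 and 27 components.
rng = np.random.default_rng(1)
w = rng.normal(size=7)
h = rng.normal(size=(7, 7))
h = (h + h.T) / 2.0
h -= np.trace(h) / 7.0 * np.eye(7)  # remove the trace part
norm_gap_7 = abs(norm_sq(contract_psi(w)) - 4.0 * (w @ w))
norm_gap_27 = abs(norm_sq(i_phi(h)) - (2.0 / 9.0) * np.sum(h ** 2))

print(rank_sym, rank_total, metric_maps_to_phi0, norm_gap_7, norm_gap_27)
```

A rank of 35 for the combined family shows that the three pieces together span all three-forms on $\mathbb{R}^7$, which is the content of (2.26) at the level of dimensions.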
From the identities for contraction of $\varphi_0$ and $\psi_0$, it is possible to see that an equivalent description of $\Lambda^3_{27}$ is
$$ \Lambda^3_{27} = \{ \chi \in \Lambda^3 : \chi \wedge \varphi_0 = 0 \text{ and } \chi \wedge \psi_0 = 0 \}. $$
A similar decomposition of four-forms is again obtained via Hodge duality. Suppose we have $\chi \in \Lambda^3$; then define $\pi_1$, $\pi_7$ and $\pi_{27}$ to be projections of χ onto $\Lambda^3_1$, $\Lambda^3_7$ and $\Lambda^3_{27}$, respectively. Using contraction identities for ϕ and ψ, we get the following relations [21]:
Proposition 8. Given a three-form $\chi \in \Lambda^3$, the projections of χ onto the components (2.26) of $\Lambda^3$ are given by:
$$ \pi_1(\chi) = a \varphi_0 \quad \text{where } a = \frac{1}{42} \left( \chi_{abc}\, \varphi_0^{abc} \right) = \frac{1}{7} \langle \chi, \varphi_0 \rangle, \quad \text{with } |\pi_1(\chi)|^2 = 7a^2, \tag{2.28a} $$
$$ \pi_7(\chi) = \omega \lrcorner \psi_0 \quad \text{where } \omega^{a} = -\frac{1}{24} \chi_{mnp}\, \psi_0^{mnpa}, \quad \text{with } |\pi_7(\chi)|^2 = 4|\omega|^2, \tag{2.28b} $$
$$ \pi_{27}(\chi) = i_\varphi(h) \quad \text{where } h_{ab} = \frac{3}{4} \chi_{mn\{a}\, \varphi_{0b\}}{}^{mn}, \quad \text{with } |\pi_{27}(\chi)|^2 = \frac{2}{9} |h|^2. \tag{2.28c} $$
Here {a b} denotes the traceless symmetric part. Note that similar projections can be defined for four-forms as well.
3. G2 Structures
3.1. Definition
As we shall see, the notion of holonomy is closely related to G-structures on manifolds. Let us give the necessary definitions.
Definition 9. Let X be a manifold of dimension n. Suppose TX is the tangent bundle over X. Define the manifold F by
$$ F = \{ (x, e_1, \dots, e_n) : x \in X \text{ and } (e_1, \dots, e_n) \text{ is a basis for } T_x X \}. $$
This then has a projection $\pi : (x, e_1, \dots, e_n) \mapsto x$ onto X and a natural left action by GL(n, R) on the fibers. F is thus a principal bundle over X with fiber GL(n, R), called the frame bundle of X.
Definition 10. Let X be a manifold of dimension n. Let G be a Lie subgroup of GL(n, R). Then a G-structure on X is a principal subbundle P of F with fiber G.
The framework of G-structures is very powerful, and a number of geometrical structures can be reformulated in this language. In particular, a Riemannian metric on a manifold is equivalent to an O(n) structure. We are in particular interested in torsion-free G-structures. A G-structure is torsion-free if and only if there exists a compatible torsion-free connection on TX. A connection ∇ on TX is equivalent to a connection D on the frame bundle F, and we say ∇ is compatible with the G-structure P if D reduces to a connection on P. For example, given a Riemannian metric, a unique torsion-free Levi–Civita connection can always be defined, hence all O(n) structures are torsion-free. On a manifold of real dimension 2m, an integrable complex structure is equivalent to a torsion-free GL(m, C) structure. A Kähler structure is then equivalent to a torsion-free U(m)-structure. From [30], we have a key result that relates torsion-free structures and holonomy:
Proposition 11. Let (X, g) be a Riemannian manifold of dimension n, with O(n)-structure P corresponding to g. Let G be a Lie subgroup of O(n). Then Hol(g) ⊆ G if and only if X admits a torsion-free G-structure Q that is a subbundle of P.
As Proposition 11 shows, the study of Riemannian holonomy is equivalent to studying torsion-free G-structures. Hence in order to study G2 holonomy manifolds we will first consider G2 structures.
Now suppose X is a smooth, oriented 7-dimensional manifold. Following Joyce [30], define a three-form ϕ to be positive if locally we can choose a frame such that ϕ is written in the form (2.9), that is, for every p ∈ X there is an oriented isomorphism $q_p$ between $T_p X$ and $\mathbb{R}^7$ such that $\varphi|_p = \varphi_0$. For each p ∈ X define $P^3_p X$ to be the set of such three-forms. To each positive ϕ we can associate a metric g and a Hodge dual $*\varphi$, which are identified with $g_0$ and $\psi_0$ under the $q_p$, so that the associated metric takes the form (2.10).
Since $\varphi_0$ is preserved by G2 and $GL(7, \mathbb{R})_+$ acts transitively on $P^3_p X$ it follows that $P^3_p X \cong GL(7, \mathbb{R})_+ / G_2$ and hence $\dim P^3_p X = \dim GL(7, \mathbb{R})_+ - \dim G_2 = 49 - 14 = 35$. This is equal to the dimension of $\Lambda^3 T^*_p X$, hence $P^3_p X$ is an open subset of $\Lambda^3 T^*_p X$. Moreover if we consider the bundle $P^3 X$ over X with fiber $P^3_p X$, it will be an open subbundle of $\Lambda^3 T^* X$.
Given a positive three-form ϕ on X, consider at each point p the set $Q_p$ of isomorphisms $q_p$ between $T_p X$ and $\mathbb{R}^7$ such that $\varphi|_p = \varphi_0$. It is then easy to see that $Q_p \cong G_2$ and that the bundle Q over X with fiber $Q_p$ is in fact a principal subbundle of the frame bundle F. So in fact, Q is a G2 structure. The converse is also true: given an oriented G2 structure Q, we can uniquely define a positive three-form ϕ and associated metric g and four-form ψ that correspond to $\varphi_0$, $g_0$ and $\psi_0$, respectively. We thus have a key result:
Theorem 12 ([30]). Let X be an oriented seven-dimensional manifold. There exists a one-to-one correspondence between positive three-forms on X and oriented
$G_2$-structures Q on X. Moreover, to each positive three-form ϕ we can associate a Riemannian metric g and a corresponding four-form $\psi = \ast_\varphi \varphi$ such that for each p ∈ X, under the isomorphism $q_p : T_p X \to \mathbb{R}^7$, these quantities are identified with $\varphi_0$, $g_0$ and $\psi_0$, respectively.

So given a positive three-form ϕ on X, it is possible to define a metric g associated to ϕ. This metric then defines the Hodge star, which we denote by $\ast_\varphi$ to emphasize the dependence on ϕ. Given the Hodge star, we can in turn define the four-form $\psi = \ast_\varphi \varphi$. Thus in fact both the metric g and the four-form ψ are functions of ϕ. By definition, at each point p ∈ X there is an isomorphism that identifies ϕ with $\varphi_0$, ψ with $\psi_0$ and g with $g_0$. Therefore, properties of $\varphi_0$ and $\psi_0$ such as the contraction identities (2.14) that we encountered in Sec. 2.1 also hold for the differential forms ϕ and ψ.

In general, any G-structure on a manifold X induces a splitting of bundles of p-forms into subbundles corresponding to irreducible representations of G. The same is of course true for $G_2$ structures. The decomposition of p-forms on $\mathbb{R}^7$ carries over to any manifold with a $G_2$ structure, so from the previous section we have the following decomposition of the spaces of p-forms $\Lambda^p$:

$$\Lambda^1 = \Lambda^1_7, \tag{3.1a}$$
$$\Lambda^2 = \Lambda^2_7 \oplus \Lambda^2_{14}, \tag{3.1b}$$
$$\Lambda^3 = \Lambda^3_1 \oplus \Lambda^3_7 \oplus \Lambda^3_{27}, \tag{3.1c}$$
$$\Lambda^4 = \Lambda^4_1 \oplus \Lambda^4_7 \oplus \Lambda^4_{27}, \tag{3.1d}$$
$$\Lambda^5 = \Lambda^5_7 \oplus \Lambda^5_{14}, \tag{3.1e}$$
$$\Lambda^6 = \Lambda^6_7. \tag{3.1f}$$

Here each $\Lambda^p_k$ corresponds to the k-dimensional irreducible representation of $G_2$. Moreover, for each k and p, $\Lambda^p_k$ and $\Lambda^{7-p}_k$ are isomorphic to each other via Hodge duality, and also the $\Lambda^p_7$ are isomorphic to each other for p = 1, 2, . . . , 6. Define the standard inner product on $\Lambda^p$, so that for p-forms α and β,

$$\langle \alpha, \beta \rangle = \frac{1}{p!}\, \alpha_{a_1 \cdots a_p} \beta^{a_1 \cdots a_p}. \tag{3.2}$$

This is related to the Hodge star, since

$$\alpha \wedge \ast\beta = \langle \alpha, \beta \rangle\, \mathrm{vol} \tag{3.3}$$

where vol is the invariant volume form given locally by

$$\mathrm{vol} = \sqrt{\det g}\; dx^1 \wedge \cdots \wedge dx^7. \tag{3.4}$$

Then the decompositions (3.1) are orthogonal with respect to (3.2). Note that $\langle \varphi, \varphi \rangle = 7$, so in fact we have

$$V = \frac{1}{7} \int_X \varphi \wedge \ast\varphi \tag{3.5}$$

where V is the volume of the manifold X.
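As a quick consistency check on (3.1), the dimensions of the irreducible pieces must add up to the dimension of $\Lambda^p$ on a seven-dimensional space, which is the binomial coefficient $\binom{7}{p}$, and Hodge duality must pair the decompositions of $\Lambda^p$ and $\Lambda^{7-p}$. The following is only an illustrative sketch: the dictionary `DECOMPOSITION` is a transcription of (3.1a)-(3.1f), and the function names are hypothetical, not taken from the text.

```python
from math import comb

# Dimensions of the irreducible G2 pieces of Lambda^p, transcribed from (3.1).
DECOMPOSITION = {
    1: [7],
    2: [7, 14],
    3: [1, 7, 27],
    4: [1, 7, 27],
    5: [7, 14],
    6: [7],
}


def dimension_mismatches(decomposition, n=7):
    """Degrees p whose pieces do not sum to dim Lambda^p = C(n, p)."""
    return [p for p, dims in decomposition.items() if sum(dims) != comb(n, p)]


def hodge_dual_mismatches(decomposition, n=7):
    """Degrees p whose pieces differ from those of the Hodge-dual degree n - p."""
    return [
        p
        for p, dims in decomposition.items()
        if sorted(dims) != sorted(decomposition[n - p])
    ]


# Both lists are empty when the decomposition is dimensionally consistent.
print(dimension_mismatches(DECOMPOSITION), hodge_dual_mismatches(DECOMPOSITION))
# The same count gives dim GL(7, R) - dim G2 = 49 - 14 = dim Lambda^3.
print(49 - 14 == comb(7, 3))
```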
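We know that the metric g is defined by the three-form ϕ and we can use some of the results from Sec. 2.1 to find a direct relationship between the two quantities.

Proposition 13. Given a positive three-form ϕ on a seven-manifold X, the associated metric g is given by

$$g_{ab} = (\det s)^{-\frac{1}{9}}\, s_{ab} \tag{3.6}$$

with

$$s_{ab} = \frac{1}{144}\, \varphi_{amn} \varphi_{bpq} \varphi_{rst}\, \hat\varepsilon^{mnpqrst} \tag{3.7}$$

where $\hat\varepsilon^{mnpqrst}$ is the alternating symbol with $\hat\varepsilon^{12\ldots7} = +1$. Alternatively, for u, v vector fields on X,

$$\langle u, v \rangle\, \mathrm{vol} = \frac{1}{6}\, (u \lrcorner \varphi) \wedge (v \lrcorner \varphi) \wedge \varphi \tag{3.8}$$

where $\lrcorner$ denotes interior multiplication: $(u \lrcorner \varphi)_{bc} = u^a \varphi_{abc}$.

Proof. Consider the quantity $P_{ab}$ given by

$$P_{ab} = \varphi_{amn} \varphi_{bpq} \psi^{mnpq}.$$

Using identities (2.14) to contract ϕ and ψ, this gives $P_{ab} = 24 g_{ab}$. Expanding $\psi^{mnpq}$ in terms of ϕ and the Levi–Civita tensor we get

$$P_{ab} = \frac{1}{6}\, \varphi_{amn} \varphi_{bpq} \varphi_{rst}\, \varepsilon^{mnpqrst}.$$

If we write $\hat\varepsilon^{mnpqrst}$ for the alternating symbol with $\hat\varepsilon^{12\ldots7} = +1$, so that $\varepsilon^{mnpqrst} = \hat\varepsilon^{mnpqrst} / \sqrt{\det g}$, then we get

$$g_{ab} \sqrt{\det g} = \frac{1}{144}\, \varphi_{amn} \varphi_{bpq} \varphi_{rst}\, \hat\varepsilon^{mnpqrst}. \tag{3.9}$$

Alternatively, let u and v be vector fields on X. Then

$$\langle u, v \rangle \sqrt{\det g} = \frac{1}{144}\, (u^a \varphi_{amn})(v^b \varphi_{bpq}) \varphi_{rst}\, \hat\varepsilon^{mnpqrst}.$$

Hence we get (3.8). Now define

$$s_{ab} = \frac{1}{144}\, \varphi_{amn} \varphi_{bpq} \varphi_{rst}\, \hat\varepsilon^{mnpqrst}$$

so that, after taking the determinant of (3.9), we get $\det s = (\det g)^{9/2}$ and hence (3.6).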
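Proposition 13 can be checked numerically in the flat model. The sketch below builds $s_{ab}$ from (3.7) and $g_{ab}$ from (3.6) and confirms that a model three-form on $\mathbb{R}^7$ returns the Euclidean metric. It assumes the convention $\varphi_0 = e^{123} + e^{145} + e^{167} + e^{246} - e^{257} - e^{347} - e^{356}$, which may differ from the form (2.9) used in the text by a change of basis; all function names are hypothetical.

```python
import itertools

import numpy as np


def perm_sign(p):
    """Sign of a permutation given as a sequence of distinct integers."""
    sign = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                sign = -sign
    return sign


def levi_civita(n=7):
    """Alternating symbol with eps[0, 1, ..., n-1] = +1."""
    eps = np.zeros((n,) * n)
    for p in itertools.permutations(range(n)):
        eps[p] = perm_sign(p)
    return eps


# ASSUMPTION: one common convention for the model three-form (1-based indices).
PHI0_TERMS = {
    (1, 2, 3): 1, (1, 4, 5): 1, (1, 6, 7): 1, (2, 4, 6): 1,
    (2, 5, 7): -1, (3, 4, 7): -1, (3, 5, 6): -1,
}


def three_form(terms, n=7):
    """Totally antisymmetric component array of a three-form."""
    phi = np.zeros((n, n, n))
    for idx, coeff in terms.items():
        for p in itertools.permutations(range(3)):
            i, j, k = (idx[q] - 1 for q in p)
            phi[i, j, k] = coeff * perm_sign(p)
    return phi


def metric_from_three_form(phi, eps):
    """Metric of a positive three-form via (3.6) and (3.7)."""
    w = np.einsum('mnpqrst,rst->mnpq', eps, phi)
    s = np.einsum('amn,bpq,mnpq->ab', phi, phi, w) / 144.0
    return np.linalg.det(s) ** (-1.0 / 9.0) * s


eps = levi_civita()
phi0 = three_form(PHI0_TERMS)
g0 = metric_from_three_form(phi0, eps)
print(np.allclose(g0, np.eye(7)))
```

Because (3.7) is cubic in ϕ while (3.6) divides by the ninth root of a degree-21 determinant, rescaling ϕ by f rescales the metric by $f^{2/3}$; with f = 8 the sketch should therefore return four times the identity.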
Thus we see that even though the three-form ϕ determines the metric g, this relationship is rather complicated and nonlinear. In particular, this also shows that $\psi = \ast_\varphi \varphi$ depends on ϕ in an even more non-trivial fashion, since the Hodge star itself depends on the metric.

Here we need to say a few words about the notation used for the $G_2$ three-form ϕ and the associated four-form ψ. The notation that we use here is due to Karigiannis, where the Hodge dual of ϕ is denoted by ψ, and was first introduced in [31]. In Fig. 4, we summarize the different notations used by other authors, where $e^{ijk} = e^i \wedge e^j \wedge e^k$ and $e^{ijkl} = e^i \wedge e^j \wedge e^k \wedge e^l$ for basis covectors $e^i$.

Authors | Three-form | Dual four-form | References
Beasley and Witten; Gukov, Yau and Zaslow | $\Phi$ | $\ast\Phi$ | [7, 22]
Bryant | $\phi = \frac{1}{6} \varepsilon_{ijk} e^{ijk}$ | $\ast_\phi \phi = \frac{1}{24} \varepsilon_{ijkl} e^{ijkl}$ | [12, 13]
Hitchin; Lee and Leung | $\Omega$ | $\Theta = \ast\Omega$ | [26, 37]
Joyce | $\varphi$ | $\Theta(\varphi) = \ast\varphi$ | [28–30]
Karigiannis; Karigiannis and Leung; Grigorian and Yau | $\varphi$ | $\psi = \ast_\varphi \varphi$ | [21, 31, 32, 34]

Fig. 4. Notation that is used by different authors.

3.2. Torsion-free structures

The definition of a $G_2$ structure only defines the algebraic properties of ϕ, and in general does not address the analytical properties of ϕ. Using the associated metric g we can define the Levi–Civita connection ∇ on X. Then it is natural to ask what the properties of ∇ϕ are. This quantity is known as the torsion of the $G_2$ structure. Originally the torsion of $G_2$ structures was studied by Fernández and Gray [18], and their analysis revealed that there are in fact a total of 16 torsion classes of $G_2$ structures. Later on, Karigiannis reproduced their results using simple computational arguments [32]. Following [32], consider the three-form $\nabla_X \varphi$ for some vector field X. We know that three-forms split as $\Lambda^3_1 \oplus \Lambda^3_7 \oplus \Lambda^3_{27}$, so consider the projections $\pi_1$, $\pi_7$ and $\pi_{27}$ of $\nabla_X \varphi$ onto these components. Using (2.28), we have $\pi_1(\nabla_X \varphi) = a\varphi$ where

$$a = X^a (\nabla_a \varphi_{bcd}) \varphi^{bcd} = X^a \nabla_a (\varphi_{bcd} \varphi^{bcd}) - \varphi_{bcd} X^a \nabla_a \varphi^{bcd} = -X^a (\nabla_a \varphi_{bcd}) \varphi^{bcd} = 0.$$

Hence we see that the $\Lambda^3_1$ component vanishes. Similarly, for $\Lambda^3_{27}$ we have $\pi_{27}(\nabla_X \varphi) = i_\varphi(h)$ where

$$h_{ab} = \frac{3}{4} (X^c \nabla_c \varphi_{mn\{a}) \varphi_{b\}}{}^{mn} = \frac{3}{4} X^c \nabla_c (\varphi_{mn\{a} \varphi_{b\}}{}^{mn}) - \frac{3}{4} \varphi_{mn\{a} X^c \nabla_c \varphi_{b\}}{}^{mn} = -\frac{3}{4} (X^c \nabla_c \varphi_{mn\{a}) \varphi_{b\}}{}^{mn} = 0.$$
Here we have used the fact that $\varphi_{mna} \varphi_b{}^{mn} = 6 g_{ab}$, the traceless part of which vanishes. Therefore, the $\Lambda^3_{27}$ part of $\nabla_X \varphi$ also vanishes. Now consider the $\Lambda^3_7$ component. In this case, $\pi_7(\nabla_X \varphi) = \omega \lrcorner \psi$ where

$$\omega_a = -\frac{1}{24} X^c (\nabla_c \varphi_{mnp}) \psi^{mnp}{}_a = \frac{1}{24} X^c (\nabla_c \psi^{mnp}{}_a) \varphi_{mnp}.$$

This quantity does not vanish in general, so we can conclude that

$$\nabla_X \varphi \in \Lambda^3_7 \tag{3.10}$$

and thus overall,

$$\nabla \varphi \in W = \Lambda^1_7 \otimes \Lambda^3_7. \tag{3.11}$$

Further classification of torsion classes depends on the decomposition of W into components according to irreducible representations of $G_2$. Given (3.11), we can write

$$\nabla_a \varphi_{bcd} = T_a{}^e \psi_{ebcd} \tag{3.12}$$

where $T_{ab}$ is the full torsion tensor. This two-tensor fully defines ∇ϕ since, pointwise, it has 49 components and the space W is also 49-dimensional (pointwise). In general we can split $T_{ab}$ as

$$T = \tau_1 g + \tau_7 + \tau_{14} + \tau_{27} \tag{3.13}$$

where $\tau_1$ is a function and gives the 1 component of T, $\tau_7 \in \Lambda^2_7$ gives the 7 component, $\tau_{14} \in \Lambda^2_{14}$ gives the 14 component, and $\tau_{27}$ is traceless symmetric, giving the 27 component. Note that the normalization of these components is different from [32]. Hence we can split W as

$$W = W_1 \oplus W_7 \oplus W_{14} \oplus W_{27}. \tag{3.14}$$

The 16 torsion classes arise as the subsets of W to which ∇ϕ belongs. Moreover, as shown in [32], the torsion components $\tau_i$ relate directly to the expressions for dϕ and dψ. In fact, in our notation,

$$d\varphi = 4\tau_1 \psi + 3\tau_7 \wedge \varphi - \ast\tau_{27}, \tag{3.15a}$$
$$d\psi = 4\tau_7 \wedge \psi - 2 \ast\tau_{14}. \tag{3.15b}$$

Now suppose dϕ = dψ = 0. Then all four torsion components vanish and hence T = 0, and as a consequence ∇ϕ = 0. The converse is trivially true, since d and $d^*$ can both be expressed in terms of the covariant derivative. This result is
due to Fernández and Gray [18]. If we add the fact that Hol(g) is a subgroup of G if and only if X admits a torsion-free G-structure from Proposition 11, then we get the following important result.

Theorem 14 ([30, Proposition 10.1.3]). Let X be a seven-manifold with a $G_2$ structure defined by the three-form ϕ and equipped with the associated Riemannian metric g. Then the following are equivalent:

(1) The $G_2$-structure is torsion-free;
(2) Hol(g) ⊆ $G_2$ and ϕ is the induced three-form;
(3) ∇ϕ = 0 on X, where ∇ is the Levi–Civita connection of g;
(4) dϕ = dψ = 0, where ψ = ∗ϕ with the Hodge star defined by g.

Different torsion classes of the $G_2$ structure also restrict the curvature of the manifold. Consider the curvature tensor $R_{abcd}$. Then for fixed a, b, we have $(R_{ab})_{cd} \in \Lambda^2$, so we can decompose it as

$$(R_{ab})_{cd} = (\pi_7 R_{ab})_{cd} + (\pi_{14} R_{ab})_{cd}. \tag{3.16}$$

Following Karigiannis [32], consider the operator $T_\psi$ (2.22) acting on $R_{abcd}$. Then we have

$$g^{ad}\, T_\psi R_{abcd} = R_{abef} \psi^{ef}{}_{cd}\, g^{ad} = -(R_{beaf} + R_{eabf}) \psi^{ef}{}_{cd}\, g^{ad} = -R_{beaf} \psi^{e}{}_{c}{}^{af} + R_{fbea} \psi^{eaf}{}_{c} = -2 g^{ad}\, T_\psi R_{abcd},$$

where we have used the cyclic identity for $R_{abef}$, and therefore $g^{ad}\, T_\psi R_{abcd} = 0$. Hence, from (2.23) we get

$$\mathrm{Ric}_{bd} = 3 (\pi_7 R_{ab})_{cd}\, g^{ac} = \frac{3}{2} (\pi_{14} R_{ab})_{cd}\, g^{ac} \tag{3.17}$$

where $\mathrm{Ric}_{bd}$ is the Ricci tensor. However, in general, by the Ambrose–Singer holonomy theorem [5], if Hol(g) ⊆ G, then $R_{abcd} \in \mathrm{Sym}^2(\mathfrak{g})$ where $\mathfrak{g}$ is the Lie algebra of G. Therefore, in the $G_2$ case, if the $G_2$ structure is torsion-free and hence Hol(g) ⊆ $G_2$, then $R_{abcd} \in \mathrm{Sym}^2(\mathfrak{g}_2)$. This however implies that in (3.16), the $\pi_7$ component vanishes, and thus from (3.17), we have the following result:

Theorem 15 ([10]). Let X be a Riemannian seven-manifold with metric g. If Hol(g) ⊆ $G_2$, then X is Ricci-flat.

In fact, this result can also be derived without invoking the general Ambrose–Singer theorem. In [32], Karigiannis expressed the $\Lambda^2_7$ component of the curvature tensor in terms of the torsion tensor $T_{ab}$, so that when the torsion vanishes, the
curvature tensor is fully contained in $\Lambda^2_{14}$, thus directly confirming the Ambrose–Singer theorem in the $G_2$ case. The original proof of Theorem 15 due to Bonan [10] relied on the fact that the Lie algebra structure of $\mathfrak{g}_2$ imposes strong conditions on the Riemann tensor, and that these imply that the Ricci tensor cannot be nonvanishing.

Given a compact manifold with a torsion-free $G_2$ structure, the decompositions (3.1) carry over to de Rham cohomology [30], so that we have

$$H^1(X, \mathbb{R}) = H^1_7, \tag{3.18a}$$
$$H^2(X, \mathbb{R}) = H^2_7 \oplus H^2_{14}, \tag{3.18b}$$
$$H^3(X, \mathbb{R}) = H^3_1 \oplus H^3_7 \oplus H^3_{27}, \tag{3.18c}$$
$$H^4(X, \mathbb{R}) = H^4_1 \oplus H^4_7 \oplus H^4_{27}, \tag{3.18d}$$
$$H^5(X, \mathbb{R}) = H^5_7 \oplus H^5_{14}, \tag{3.18e}$$
$$H^6(X, \mathbb{R}) = H^6_7. \tag{3.18f}$$

Define the refined Betti numbers $b^p_k = \dim(H^p_k)$. Clearly, $b^3_1 = b^4_1 = 1$ and we also have $b^1 = b^k_7$ for k = 1, . . . , 6. Moreover, it turns out that if Hol(X, g) = $G_2$ then $b^1 = 0$. Therefore, in this case the $H^k_7$ component vanishes in (3.18). To see this, recall that on a Ricci-flat manifold any harmonic one-form must be parallel. A non-zero parallel one-form exists if and only if Hol(g) has an invariant one-form. However, the only $G_2$-invariant forms are ϕ and ψ. Therefore there are no non-trivial harmonic one-forms when Hol(g) = $G_2$ and thus $b^1 = 0$.

An example of a construction of a manifold with a torsion-free $G_2$ structure is to consider $X = Y \times S^1$ where Y is a Calabi–Yau three-fold. Define the metric and a three-form on X as

$$g_X = d\theta^2 \times g_Y, \tag{3.19}$$
$$\varphi = d\theta \wedge \omega + \mathrm{Re}\, \Omega, \tag{3.20}$$

where θ is the coordinate on $S^1$. This then defines a torsion-free $G_2$ structure, with

$$\ast\varphi = \frac{1}{2}\, \omega \wedge \omega - d\theta \wedge \mathrm{Im}\, \Omega. \tag{3.21}$$

However, the holonomy of X in this case is $SU(3) \subset G_2$. From the Künneth formula we get the following relations between the refined Betti numbers of X and the Hodge numbers of Y:

$$b^k_7 = 1 \quad \text{for } k = 1, \ldots, 6, \tag{3.22}$$
$$b^k_{14} = h^{1,1} - 1 \quad \text{for } k = 2, 5, \tag{3.23}$$
$$b^k_{27} = h^{1,1} + 2h^{2,1} \quad \text{for } k = 3, 4. \tag{3.24}$$
In [28–30], Joyce describes a possible construction of a smooth manifold with holonomy equal to G2 from a Calabi–Yau manifold Y . So suppose Y is a Calabi–Yau
three-fold as above. Then suppose σ : Y → Y is an antiholomorphic isometric involution on Y, that is, σ preserves the metric on Y and satisfies

$$\sigma^2 = 1, \tag{3.25a}$$
$$\sigma^*(\omega) = -\omega, \tag{3.25b}$$
$$\sigma^*(\Omega) = \bar\Omega. \tag{3.25c}$$

Such an involution σ is known as a real structure on Y. Define now a quotient given by

$$Z = (Y \times S^1)/\hat\sigma \tag{3.26}$$

where $\hat\sigma : Y \times S^1 \to Y \times S^1$ is defined by $\hat\sigma(y, \theta) = (\sigma(y), -\theta)$. The three-form ϕ defined on $Y \times S^1$ by (3.20) is invariant under the action of $\hat\sigma$ and hence provides Z with a $G_2$ structure. Similarly, the dual four-form ∗ϕ given by (3.21) is also invariant. Generically, the action of σ on Y will have a non-empty fixed point set N, which is in fact a special Lagrangian submanifold of Y [30]. This gives rise to orbifold singularities on Z. The singular set is two copies of N. It is conjectured that it is possible to resolve each singular point using an ALE four-manifold with holonomy SU(2) in order to obtain a smooth manifold with holonomy $G_2$; however, the precise details of the resolution of these singularities are not known yet. We will therefore consider only freely acting involutions, that is, those without fixed points. Manifolds defined by (3.26) with a freely acting involution were called barely $G_2$ manifolds by Harvey and Moore in [25]. The cohomology of barely $G_2$ manifolds is expressed in terms of the cohomology of the underlying Calabi–Yau manifold Y:

$$H^2(Z) = H^2(Y)^+, \tag{3.27a}$$
$$H^3(Z) = H^2(Y)^- \oplus H^3(Y)^+. \tag{3.27b}$$

Here the superscripts ± refer to the ± eigenspaces of $\sigma^*$. Thus $H^2(Y)^+$ refers to two-forms on Y which are invariant under the action of the involution σ, and correspondingly $H^2(Y)^-$ refers to two-forms which are odd under σ. Wedging an odd two-form on Y with dθ gives an invariant three-form on $Y \times S^1$, and hence these forms, together with the invariant three-forms $H^3(Y)^+$ on Y, give the three-forms on the quotient space Z. Also note that $H^1(Z)$ vanishes, since the one-form on $S^1$ is odd under $\hat\sigma$. Now, given a three-form on Y, its real part will be invariant under σ, hence $H^3(Y)^+$ is essentially the real part of $H^3(Y)$. Therefore the Betti numbers of Z in terms of Hodge numbers of Y are

$$b^1 = 0, \tag{3.28a}$$
$$b^2 = h^+_{1,1}, \tag{3.28b}$$
$$b^3 = h^-_{1,1} + h_{2,1} + 1. \tag{3.28c}$$
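For bookkeeping, (3.28) can be wrapped in a small helper that returns all Betti numbers of a barely $G_2$ manifold, using Poincaré duality $b^{7-k} = b^k$ on the compact oriented seven-manifold Z to fill in the upper degrees. This is only a sketch: the function name and the sample Hodge numbers are hypothetical and do not refer to a specific Calabi–Yau manifold with a freely acting involution.

```python
def barely_g2_betti(h11_plus, h11_minus, h21):
    """Betti numbers b^0..b^7 of Z = (Y x S^1)/sigma-hat from (3.28).

    h11_plus and h11_minus are the dimensions of the even and odd parts of
    H^{1,1}(Y) under the involution; Poincare duality supplies b^4..b^7.
    """
    if min(h11_plus, h11_minus, h21) < 0:
        raise ValueError("Hodge numbers must be non-negative")
    b1 = 0
    b2 = h11_plus
    b3 = h11_minus + h21 + 1
    return [1, b1, b2, b3, b3, b2, b1, 1]


# Hypothetical sample values, chosen only to exercise the formula.
sample = barely_g2_betti(h11_plus=2, h11_minus=3, h21=50)
print(sample)
# The Euler characteristic of a compact odd-dimensional manifold vanishes.
print(sum((-1) ** k * b for k, b in enumerate(sample)))
```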
A class of barely G2 manifolds that are constructed from complete intersection Calabi–Yau manifolds has recently been considered in [20], where the Betti numbers of all such manifolds have been calculated explicitly.
Note that barely $G_2$ manifolds have holonomy $SU(3) \ltimes \mathbb{Z}_2$ while the first Betti number still vanishes. This shows that a vanishing first Betti number is necessary but not sufficient for Hol(g) = $G_2$. In fact, as shown by Joyce in [28, 29], for a compact manifold with a torsion-free $G_2$ structure, Hol(g) = $G_2$ if and only if the fundamental group $\pi_1(X)$ is finite.

Let us briefly describe Joyce's construction of compact torsion-free manifolds with Hol(g) = $G_2$. Here we follow [30]. On $T^7$ we can define a flat $G_2$ structure $(\varphi_0, g_0)$, similarly as on $\mathbb{R}^7$. Now suppose that Γ is a finite group acting on $T^7$ that preserves the $G_2$ structure. Then we can define the orbifold $T^7/\Gamma$. The key to resolving the orbifold singularities is to consider appropriate Quasi Asymptotically Locally Euclidean (QALE) $G_2$ manifolds. These are seven-manifolds with a torsion-free $G_2$ structure that is asymptotic to the $G_2$ structure on $\mathbb{R}^7/G$ where G is a finite subgroup of $G_2$. The orbifold $T^7/\Gamma$ is then resolved to obtain a smooth compact manifold. However, on the resolution the resulting $G_2$-structure is not necessarily torsion-free, so it is shown that it can be deformed to a torsion-free $G_2$ structure (ϕ, g). Further, the fundamental group is calculated, and if it is finite, then Hol(g) = $G_2$. Using this method, Joyce found 252 topologically distinct $G_2$ holonomy manifolds with unique pairs of Betti numbers $(b^2, b^3)$.

4. Moduli Space

4.1. Deformations of G2 structures

One of the interesting directions in the study of $G_2$ holonomy manifolds is the structure of the moduli space. Essentially, the idea is to consider the space of all torsion-free $G_2$ structures modulo diffeomorphisms on a manifold with fixed topology. The moduli space itself has an interesting geometry that may give further information about $G_2$ manifolds. Currently, we can only say something about the very local structure of the $G_2$ moduli space. For this, we take a fixed $G_2$ structure and deform it slightly. The space of these deformations is the local moduli space.
To study it, we thus need to understand the deformations of $G_2$ structures. Although we are mostly interested in deformations of torsion-free $G_2$ structures, many of the results are valid for any $G_2$ structure. Our aim is to consider infinitesimal deformations of ϕ of the form

$$\varphi \to \varphi + \varepsilon\chi \tag{4.1}$$
for some three-form χ. As we already know, the G2 structure on X and the corresponding metric g are all determined by the invariant three-form ϕ. Hence, deformations of ϕ will induce deformations of the metric. These deformations of metric will then also affect the deformation of ψ = ∗ϕ. Theoretically, “large” deformations could also be considered, and in fact, as we shall see below in some cases closed expressions can be obtained for large deformations. However in that case, it is difficult to determine the resulting torsion class of the new G2 structure [31]. In order for the deformed ϕ to define a new G2 structure, the new ϕ must also be a positive
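form (as per the definition of a $G_2$ structure). However it is known [30] that the bundle of positive three-forms on X is an open subbundle of $\Lambda^3 T^* X$, so we can always find ε small enough in order for the deformed ϕ to be positive.

Using the decomposition of three-forms (3.1c), we can split χ into $\Lambda^3_1$, $\Lambda^3_7$ and $\Lambda^3_{27}$ parts, and at first let us consider each one separately. As shown by Karigiannis in [31], metric deformations can be made explicit when the three-form deformations are either in $\Lambda^3_1$ or $\Lambda^3_7$. Let us first review some of these results. First suppose

$$\tilde\varphi = f\varphi. \tag{4.2}$$

We will also use the notation $\tilde\psi = \tilde\ast\tilde\varphi$ where $\tilde\ast$ is the Hodge star derived from the metric $\tilde g$ corresponding to $\tilde\varphi$. Then from (3.9) we get

$$\tilde g_{ab} \sqrt{\det \tilde g} = \frac{1}{144}\, \tilde\varphi_{amn} \tilde\varphi_{bpq} \tilde\varphi_{rst}\, \hat\varepsilon^{mnpqrst} = f^3 g_{ab} \sqrt{\det g}. \tag{4.3}$$

After taking the determinant on both sides, we obtain

$$\det \tilde g = f^{\frac{14}{3}} \det g. \tag{4.4}$$

Substituting (4.4) into (4.3), we finally get

$$\tilde g_{ab} = f^{\frac{2}{3}} g_{ab}, \tag{4.5}$$

and hence

$$\tilde\psi = f^{\frac{4}{3}} \psi. \tag{4.6}$$

So, a scaling of ϕ gives a conformal transformation of the metric. Hence deformations of ϕ in the direction $\Lambda^3_1$ also give an infinitesimal conformal transformation. Suppose f = 1 + εa; then to third order in ε, we can write

$$\tilde\psi = \left(1 + \frac{4}{3} a\varepsilon + \frac{2}{9} a^2 \varepsilon^2 - \frac{4}{81} a^3 \varepsilon^3 + O(\varepsilon^4)\right) \psi. \tag{4.7}$$

Given a torsion-free $G_2$ structure, dϕ = dψ = 0, so if we want the deformed structure to be also torsion-free, f must be constant.

Now, suppose in general that $\tilde\varphi = \varphi + \varepsilon\chi$ for some $\chi \in \Lambda^3$. Then using (3.8) for the definition of the metric associated with $\tilde\varphi$, after some manipulations, we get:

$$\widetilde{\langle u, v \rangle}\, \widetilde{\mathrm{vol}} = \frac{1}{6} (u \lrcorner \varphi) \wedge (v \lrcorner \varphi) \wedge \varphi + \frac{1}{2} \varepsilon \left[ (u \lrcorner \chi) \wedge \ast(v \lrcorner \varphi) + (v \lrcorner \chi) \wedge \ast(u \lrcorner \varphi) \right] + \frac{1}{2} \varepsilon^2 (u \lrcorner \chi) \wedge (v \lrcorner \chi) \wedge \varphi + \frac{1}{6} \varepsilon^3 (u \lrcorner \chi) \wedge (v \lrcorner \chi) \wedge \chi. \tag{4.8}$$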
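Rewriting (4.8) in local coordinates, we get

$$\tilde g_{ab} \frac{\sqrt{\det \tilde g}}{\sqrt{\det g}} = g_{ab} + \frac{1}{2} \varepsilon \chi_{mn(a} \varphi_{b)}{}^{mn} + \frac{1}{8} \varepsilon^2 \chi_{amn} \chi_{bpq} \psi^{mnpq} + \frac{1}{24} \varepsilon^3 \chi_{amn} \chi_{bpq} (\ast\chi)^{mnpq}. \tag{4.9}$$

Now suppose the deformation is in the $\Lambda^3_7$ direction. This implies that

$$\chi = \omega \lrcorner \psi \tag{4.10}$$

for some vector field ω. Look at the first order term in (4.9). From (2.28) we see that this is essentially a projection onto $\Lambda^3_1 \oplus \Lambda^3_{27}$: the traceless part gives the $\Lambda^3_{27}$ component and the trace gives the $\Lambda^3_1$ component. Hence this term vanishes for $\chi \in \Lambda^3_7$. For the third order term, it is more convenient to study it in (4.8). By looking at

$$\omega \lrcorner \big( (u \lrcorner \omega \lrcorner \psi) \wedge (v \lrcorner \omega \lrcorner \psi) \wedge \psi \big) = 0$$

we immediately see that the third order term vanishes. So now we are left with

$$\tilde g_{ab} \sqrt{\det \tilde g} = \left( g_{ab} + \frac{1}{8} \varepsilon^2 \omega^c \omega^d \psi_{camn} \psi_{dbpq} \psi^{mnpq} \right) \sqrt{\det g} = \left( g_{ab} (1 + \varepsilon^2 |\omega|^2) - \varepsilon^2 \omega_a \omega_b \right) \sqrt{\det g} \tag{4.11}$$

where we have used a contraction identity for ψ twice. Taking the determinant of (4.11) gives

$$\sqrt{\det \tilde g} = (1 + \varepsilon^2 |\omega|^2)^{\frac{2}{3}} \sqrt{\det g}. \tag{4.12}$$

Eventually we have the following result:

Theorem 16 ([31]). Given a deformation of a $G_2$ structure (4.1) with $\chi = \omega \lrcorner \psi \in \Lambda^3_7$, the new metric $\tilde g_{ab}$ is given by

$$\tilde g_{ab} = (1 + \varepsilon^2 |\omega|^2)^{-\frac{2}{3}} \left( g_{ab} (1 + \varepsilon^2 |\omega|^2) - \varepsilon^2 \omega_a \omega_b \right) \tag{4.13}$$

and the deformed four-form $\tilde\psi$ is given by

$$\tilde\psi = (1 + \varepsilon^2 |\omega|^2)^{-\frac{1}{3}} \left( \psi + \varepsilon \ast(\omega \lrcorner \psi) + \varepsilon^2\, \omega \lrcorner \ast(\omega \lrcorner \varphi) \right). \tag{4.14}$$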
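The closed form (4.13) can be compared with a direct evaluation of Proposition 13 on the deformed three-form in the flat model. The sketch below is illustrative only and rests on two assumptions that are not taken from the text: the model three-form is $\varphi_0 = e^{123} + e^{145} + e^{167} + e^{246} - e^{257} - e^{347} - e^{356}$, and the interior product contracts the first index, $(\omega \lrcorner \psi)_{abc} = \omega^d \psi_{dabc}$. All function names are hypothetical.

```python
import itertools

import numpy as np


def perm_sign(p):
    """Sign of a permutation given as a sequence of distinct integers."""
    sign = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                sign = -sign
    return sign


def levi_civita(n=7):
    """Alternating symbol with eps[0, 1, ..., n-1] = +1."""
    eps = np.zeros((n,) * n)
    for p in itertools.permutations(range(n)):
        eps[p] = perm_sign(p)
    return eps


def model_three_form():
    """ASSUMED convention for phi_0 on R^7 (1-based index triples)."""
    terms = {(1, 2, 3): 1, (1, 4, 5): 1, (1, 6, 7): 1, (2, 4, 6): 1,
             (2, 5, 7): -1, (3, 4, 7): -1, (3, 5, 6): -1}
    phi = np.zeros((7, 7, 7))
    for idx, coeff in terms.items():
        for p in itertools.permutations(range(3)):
            i, j, k = (idx[q] - 1 for q in p)
            phi[i, j, k] = coeff * perm_sign(p)
    return phi


def metric_from_three_form(phi, eps):
    """Metric of a positive three-form via (3.6) and (3.7)."""
    w = np.einsum('mnpqrst,rst->mnpq', eps, phi)
    s = np.einsum('amn,bpq,mnpq->ab', phi, phi, w) / 144.0
    return np.linalg.det(s) ** (-1.0 / 9.0) * s


def deformed_metric_numeric(omega, e, phi, eps):
    """Metric of phi + e * (omega _| psi), with psi = *phi in the flat metric."""
    psi = np.einsum('mnpqrst,rst->mnpq', eps, phi) / 6.0
    chi = np.einsum('d,dabc->abc', omega, psi)
    return metric_from_three_form(phi + e * chi, eps)


def deformed_metric_closed_form(omega, e):
    """Right-hand side of (4.13) for the flat metric g = identity."""
    n2 = e ** 2 * (omega @ omega)
    return (1 + n2) ** (-2.0 / 3.0) * (
        (1 + n2) * np.eye(7) - e ** 2 * np.outer(omega, omega)
    )


eps = levi_civita()
phi0 = model_three_form()
omega = np.array([0.3, -1.2, 0.5, 0.0, 2.0, -0.7, 0.1])
numeric = deformed_metric_numeric(omega, 0.4, phi0, eps)
closed = deformed_metric_closed_form(omega, 0.4)
print(np.allclose(numeric, closed))
```

Note that ε = 0.4 with |ω| of order one is not a small deformation, which reflects the fact that (4.13) is exact rather than perturbative.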
One of the key reasons why it is possible to get these closed form expressions for the modified g and ψ is that, as shown by Karigiannis in [31], the determinant of (4.11) can be calculated in a closed form. Notice that to first order in ε, both $\sqrt{\det g}$ and $g_{ab}$ remain unchanged under this deformation. Now let us examine the last term in (4.14) in more detail. Firstly, we have

$$\omega \lrcorner \ast(\omega \lrcorner \varphi) = \ast(\omega \wedge (\omega \lrcorner \varphi))$$
and

$$(\omega \wedge (\omega \lrcorner \varphi))_{mnp} = 3 \omega_{[m} \omega^a \varphi_{|a|np]} = 3\, i_\varphi(\omega \circ \omega) \tag{4.15}$$

where $(\omega \circ \omega)_{ab} = \omega_a \omega_b$. Therefore, in (4.14), this term gives $\Lambda^4_1$ and $\Lambda^4_{27}$ components. So, we can write (4.14) as

$$\tilde\psi = (1 + \varepsilon^2 |\omega|^2)^{-\frac{1}{3}} \left( \left( 1 + \frac{3}{7} \varepsilon^2 |\omega|^2 \right) \psi + \varepsilon \ast(\omega \lrcorner \psi) + \varepsilon^2 \ast i_\varphi((\omega \circ \omega)_0) \right). \tag{4.16}$$

Here $(\omega \circ \omega)_0$ denotes the traceless part of $\omega \circ \omega$, so that $i_\varphi((\omega \circ \omega)_0) \in \Lambda^3_{27}$ and thus, in (4.16), the components in different representations are now explicitly shown. To first order, we thus have the deformations

$$\tilde\varphi = \varphi + \varepsilon (\omega \lrcorner \psi), \qquad \tilde\psi = \psi + \varepsilon \ast(\omega \lrcorner \psi).$$

If originally dϕ = dψ = 0, that is, the $G_2$ structure is torsion-free, then for the deformed structure to be torsion-free to first order we need

$$d(\omega \lrcorner \psi) = d \ast(\omega \lrcorner \psi) = 0. \tag{4.17}$$

By expanding $d(\omega \lrcorner \psi)$ in terms of the decomposition of $\Lambda^4$, and setting each term individually to 0, we find that the symmetric part of $\nabla_a \omega_b$ and the $\Lambda^2_7$ part of dω must vanish. Furthermore, by expanding $\ast d \ast(\omega \lrcorner \psi)$ in terms of the decomposition of $\Lambda^2$ we find that the $\Lambda^2_{14}$ part of dω must also vanish. Hence we get that ∇ω = 0. If Hol(g) = $G_2$, then we know that in this case ω = 0, so there are no interesting small $\Lambda^3_7$ deformations of manifolds with holonomy equal to $G_2$.

As we have seen above, in the cases when the deformations were in $\Lambda^3_1$ or $\Lambda^3_7$ directions, there were some simplifications, which make it possible to write down all results in a closed form. In the case of deformations in $\Lambda^3_{27}$ the only known way to get results for deformations of the metric and the four-form ψ is to consider the deformations order by order in ε. This analysis has been carried out in [21], and here we will review those results. So suppose we have a deformation $\tilde\varphi = \varphi + \varepsilon\chi$ where $\chi \in \Lambda^3_{27}$. Now let us set up some notation. Define

$$\tilde s_{ab} = \frac{1}{144} \frac{1}{\sqrt{\det g}}\, \tilde\varphi_{amn} \tilde\varphi_{bpq} \tilde\varphi_{rst}\, \hat\varepsilon^{mnpqrst} \tag{4.18}$$
$$\phantom{\tilde s_{ab}} = \tilde g_{ab} \sqrt{\frac{\det \tilde g}{\det g}}. \tag{4.19}$$

From (3.9), the untilded $s_{ab}$ is then just equal to $g_{ab}$. We can rewrite (4.19) as

$$\tilde g_{ab} = \sqrt{\frac{\det g}{\det \tilde g}}\, (g_{ab} + \delta s_{ab}) \tag{4.20}$$
where, writing $\tilde g_{ab} = g_{ab} + \delta g_{ab}$, $\delta g_{ab}$ is the deformation of the metric and $\delta s_{ab}$ is the deformation of $s_{ab}$, which from (4.9) is given by

$$\delta s_{ab} = \frac{1}{2} \varepsilon \chi_{mn(a} \varphi_{b)}{}^{mn} + \frac{1}{8} \varepsilon^2 \chi_{amn} \chi_{bpq} \psi^{mnpq} + \frac{1}{24} \varepsilon^3 \chi_{amn} \chi_{bpq} (\ast\chi)^{mnpq}. \tag{4.21}$$

Also introduce the following short-hand notation

$$s_k = \mathrm{Tr}((\delta s)^k) \tag{4.22}$$

where the trace is taken using the original metric g. From (4.21), note that since $\chi \in \Lambda^3_{27}$, when taking the trace the first order term vanishes, and hence $s_1$ is at least second-order in ε. Clearly, for k > 1, the $s_k$ are at least of order k in ε. Similarly as before, take the determinant of (4.18):

$$\left( \frac{\det \tilde g}{\det g} \right)^{\frac{9}{2}} = \frac{\det(g + \delta s)}{\det(g)}. \tag{4.23}$$

Unlike in the case of $\Lambda^3_7$ deformations, we cannot compute det(g + δs) in closed form, so we have to calculate it order by order in ε. From the standard expansion of det(I + X), we find

$$\frac{\det(g + \delta s)}{\det g} = 1 + s_1 + \frac{1}{2} (s_1^2 - s_2) + \frac{1}{6} (s_1^3 - 3 s_1 s_2 + 2 s_3) + O(\varepsilon^4). \tag{4.24}$$
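The expansion (4.24) is the standard expression of the first three elementary symmetric polynomials of the eigenvalues of X through the power sums $s_k = \mathrm{Tr}(X^k)$. As an illustrative check (the function name and the random test matrices are assumptions, not part of the text), the truncation is exact for a 3 × 3 matrix, where no higher terms exist, and differs from the true determinant by a fourth-order quantity for a small 7 × 7 matrix.

```python
import numpy as np


def det_third_order(x):
    """Truncation of det(I + X) after third order, as in (4.24)."""
    s1 = np.trace(x)
    s2 = np.trace(x @ x)
    s3 = np.trace(x @ x @ x)
    return 1 + s1 + (s1 ** 2 - s2) / 2 + (s1 ** 3 - 3 * s1 * s2 + 2 * s3) / 6


rng = np.random.default_rng(0)

# 3 x 3 case: the expansion terminates, so the two sides agree exactly.
x3 = rng.normal(size=(3, 3))
exact3 = np.linalg.det(np.eye(3) + x3)
approx3 = det_third_order(x3)

# 7 x 7 case with a small symmetric perturbation: agreement up to fourth order.
a = rng.normal(size=(7, 7))
x7 = 1e-3 * (a + a.T) / 2
error7 = abs(np.linalg.det(np.eye(7) + x7) - det_third_order(x7))

print(np.isclose(exact3, approx3), error7)
```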
However, as we noted above, $s_1$ is second-order in ε, so this expression actually simplifies:

$$\frac{\det(g + \delta s)}{\det g} = 1 + s_1 - \frac{1}{2} s_2 + \frac{1}{3} s_3 + O(\varepsilon^4). \tag{4.25}$$

Raising this to the power of $-\frac{1}{9}$, and expanding again to third order in ε, we get

$$\left( \frac{\det g}{\det \tilde g} \right)^{\frac{1}{2}} = 1 + \left( \frac{1}{18} s_2 - \frac{1}{9} s_1 \right) - \frac{1}{27} s_3 + O(\varepsilon^4). \tag{4.26}$$

Using this and (4.20), we can immediately get the deformed metric, but the expressions using the current form of $\delta s_{ab}$ are not very useful. So far, the only property of $\Lambda^3_{27}$ that we have used is that it is orthogonal to ϕ, thus in fact, up to this point everything applies to $\Lambda^3_7$ as well. Now however, let χ be of the form

$$\chi_{abc} = h^d{}_{[a} \varphi_{bc]d} \tag{4.27}$$

where $h_{ab}$ is traceless and symmetric, so that $\chi \in \Lambda^3_{27}$. Let us first introduce some further notation. Let $h_1$, $h_2$, $h_3$, $h_4$ be traceless, symmetric matrices, and introduce
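the following shorthand notation

$$(\varphi h_1 h_2 \varphi)_{mn} = \varphi_{abm} h_1^{ad} h_2^{be} \varphi_{den}, \tag{4.28a}$$
$$\varphi h_1 h_2 h_3 \varphi = \varphi_{abc} h_1^{ad} h_2^{be} h_3^{cf} \varphi_{def}, \tag{4.28b}$$
$$(\psi h_1 h_2 h_3 \psi)_{mn} = \psi_{abcm} \psi_{defn} h_1^{ad} h_2^{be} h_3^{cf}, \tag{4.28c}$$
$$\psi h_1 h_2 h_3 h_4 \psi = \psi_{abcm} \psi_{defn} h_1^{ad} h_2^{be} h_3^{cf} h_4^{mn}. \tag{4.28d}$$

It is clear that all of these quantities are symmetric in the $h_i$ and moreover $(\varphi h_1 h_2 \varphi)_{mn}$ and $(\psi h_1 h_2 h_3 \psi)_{mn}$ are both symmetric in indices m and n. Then, it can be shown that

$$\chi_{(a|mn|} \varphi_{b)}{}^{mn} = \frac{4}{3} h_{ab},$$
$$\chi_{amn} \chi_{bpq} \psi^{mnpq} = -\frac{4}{7} |\chi|^2 g_{ab} + \frac{16}{9} (h^2)_{\{ab\}} - \frac{4}{9} (\varphi h h \varphi)_{\{ab\}},$$
$$\chi_{amn} \chi_{bpq} (\ast\chi)^{mnpq} = \frac{32}{189} \mathrm{Tr}(h^3)\, g_{ab} - \frac{8}{9} (\varphi h h^2 \varphi)_{\{ab\}},$$

where as before {a b} denotes the traceless symmetric part. Using this and (4.21), we can now express $\delta s_{ab}$ in terms of h:

$$\delta s_{ab} = \frac{2}{3} \varepsilon h_{ab} + g_{ab} \left( -\frac{1}{63} \varepsilon^2 \mathrm{Tr}(h^2) + \frac{4}{567} \varepsilon^3 \mathrm{Tr}(h^3) \right) + \varepsilon^2 \left( \frac{2}{9} (h^2)_{\{ab\}} - \frac{1}{18} (\varphi h h \varphi)_{\{ab\}} \right) - \frac{\varepsilon^3}{27} (\varphi h h^2 \varphi)_{\{ab\}} \tag{4.29}$$

and hence

$$s_1 = \mathrm{Tr}(\delta s) = -\frac{1}{9} \varepsilon^2 \mathrm{Tr}(h^2) + \frac{4}{81} \varepsilon^3 \mathrm{Tr}(h^3), \tag{4.30a}$$
$$s_2 = \mathrm{Tr}(\delta s^2) = \frac{4}{9} \varepsilon^2 \mathrm{Tr}(h^2) + \varepsilon^3 \left( \frac{8}{27} \mathrm{Tr}(h^3) - \frac{2}{27} (\varphi h h h \varphi) \right), \tag{4.30b}$$
$$s_3 = \mathrm{Tr}(\delta s^3) = \frac{8}{27} \varepsilon^3 \mathrm{Tr}(h^3). \tag{4.30c}$$

Substituting these expressions into (4.26) and (4.20), we can get the full expression for the deformed metric (up to third order in ε) and correspondingly the expression for the deformed four-form ψ: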
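Theorem 17 ([21]). Given a deformation of a $G_2$ structure (4.1) with $\chi_{abc} = h^d{}_{[a} \varphi_{bc]d} \in \Lambda^3_{27}$, the new metric $\tilde g_{ab}$ is given to third order in ε by

$$\tilde g_{ab} = \left( 1 + \frac{1}{18} \varepsilon^2 \mathrm{Tr}(h^2) + \frac{1}{81} \varepsilon^3 \mathrm{Tr}(h^3) - \frac{1}{243} \varepsilon^3 (\varphi h h h \varphi) \right) g_{ab} + \frac{2}{3} \varepsilon h_{ab} + \varepsilon^2 \left( \frac{2}{9} (h^2)_{(ab)} - \frac{1}{18} (\varphi h h \varphi)_{ab} \right) + \frac{2}{81} \varepsilon^3 h_{ab} \mathrm{Tr}(h^2) - \frac{\varepsilon^3}{27} (\varphi h h^2 \varphi)_{ab} + O(\varepsilon^4) \tag{4.31}$$

and correspondingly, the deformed four-form $\tilde\psi$ is given by

$$\tilde\psi = \psi - \varepsilon \ast\chi + \varepsilon^2 \left( -\frac{1}{189} \mathrm{Tr}(h^2) \psi + \frac{1}{6} \ast i_\varphi((\varphi h h \varphi)_0) \right) + \varepsilon^3 \left( -\frac{2}{1701} (\varphi h h h \varphi) \psi - \frac{1}{108} \mathrm{Tr}(h^2) \ast\chi + \frac{5}{18} \ast i_\varphi(h^3_0) - \frac{1}{36} \ast i_\varphi((\psi h h h \psi)_0) + \frac{1}{324}\, \alpha \wedge \varphi \right) + O(\varepsilon^4) \tag{4.32}$$

where $(\varphi h h \varphi)_0$, $h^3_0$ and $(\psi h h h \psi)_0$ denote the traceless parts of $(\varphi h h \varphi)_{ab}$, $(h^3)_{ab}$ and $(\psi h h h \psi)_{ab}$, respectively, and

$$\alpha_a = \psi_{amnp} \varphi_{rst} h^{mr} h^{ns} h^{pt}. \tag{4.33}$$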
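The linear term of (4.31) and the first-order contraction identity used to derive it can be checked numerically in the flat model. The sketch below is a hedged illustration: it assumes the model three-form $\varphi_0 = e^{123} + e^{145} + e^{167} + e^{246} - e^{257} - e^{347} - e^{356}$, a randomly generated traceless symmetric h, and hypothetical function names. It verifies $\chi_{amn} \varphi_b{}^{mn} = \frac{4}{3} h_{ab}$ and that the metric of $\varphi + \varepsilon\chi$ is $g_{ab} + \frac{2}{3} \varepsilon h_{ab}$ up to terms of order $\varepsilon^2$; it does not test the higher-order coefficients.

```python
import itertools

import numpy as np


def perm_sign(p):
    """Sign of a permutation given as a sequence of distinct integers."""
    sign = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                sign = -sign
    return sign


def levi_civita(n=7):
    """Alternating symbol with eps[0, 1, ..., n-1] = +1."""
    eps = np.zeros((n,) * n)
    for p in itertools.permutations(range(n)):
        eps[p] = perm_sign(p)
    return eps


def model_three_form():
    """ASSUMED convention for phi_0 on R^7 (1-based index triples)."""
    terms = {(1, 2, 3): 1, (1, 4, 5): 1, (1, 6, 7): 1, (2, 4, 6): 1,
             (2, 5, 7): -1, (3, 4, 7): -1, (3, 5, 6): -1}
    phi = np.zeros((7, 7, 7))
    for idx, coeff in terms.items():
        for p in itertools.permutations(range(3)):
            i, j, k = (idx[q] - 1 for q in p)
            phi[i, j, k] = coeff * perm_sign(p)
    return phi


def metric_from_three_form(phi, eps):
    """Metric of a positive three-form via (3.6) and (3.7)."""
    w = np.einsum('mnpqrst,rst->mnpq', eps, phi)
    s = np.einsum('amn,bpq,mnpq->ab', phi, phi, w) / 144.0
    return np.linalg.det(s) ** (-1.0 / 9.0) * s


def chi_from_h(h, phi):
    """chi_abc = h^d_[a phi_bc]d as in (4.27), for symmetric h and flat metric."""
    t = np.einsum('da,bcd->abc', h, phi)  # t_abc = h_da phi_bcd
    return (t + np.einsum('bca->abc', t) + np.einsum('cab->abc', t)) / 3.0


rng = np.random.default_rng(1)
a = rng.normal(size=(7, 7))
h = (a + a.T) / 2
h -= np.trace(h) / 7 * np.eye(7)  # make h traceless

eps = levi_civita()
phi0 = model_three_form()
chi = chi_from_h(h, phi0)

# First-order contraction identity: chi_amn phi_b^mn = (4/3) h_ab.
contraction = np.einsum('amn,bmn->ab', chi, phi0)

# Finite-difference slope of the metric at epsilon = 0, expected to be (2/3) h.
step = 1e-6
slope = (metric_from_three_form(phi0 + step * chi, eps) - np.eye(7)) / step

print(np.allclose(contraction, 4.0 / 3.0 * h), np.allclose(slope, 2.0 / 3.0 * h, atol=1e-4))
```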
In general, if such a deformation is performed on a torsion-free $G_2$ structure, then it is not known what conditions h must satisfy in order for the torsion class to be preserved. If we restrict our analysis only to first order deformations, then it is easier to see these conditions. Suppose we have dϕ = dψ = 0 and we apply a deformation (4.1) with $\chi = i_\varphi(h)$ for h traceless and symmetric. Then to first order the conditions for $d\tilde\varphi = d\tilde\psi = 0$ are

$$d\chi = d \ast\chi = 0.$$

Hence the deformation must be a form that is closed and co-closed. For a compact manifold this is thus equivalent to χ being harmonic. We can also find what this condition means in terms of h. By decomposing dχ into $\Lambda^4_1$, $\Lambda^4_7$ and $\Lambda^4_{27}$ components, we find that we must have

$$\nabla_r h^r{}_a = 0, \tag{4.34a}$$
$$\nabla_m h_{a(b} \varphi^{ma}{}_{c)} = 0. \tag{4.34b}$$

Further, if we decompose $\ast d \ast\chi$ into $\Lambda^2_7$ and $\Lambda^2_{14}$ components, we again get (4.34a) and moreover get a new constraint

$$\nabla_m h_{a[b} \varphi^{ma}{}_{c]} = 0. \tag{4.35a}$$

Thus overall, for h traceless and symmetric, $\chi = i_\varphi(h)$ being closed and co-closed is equivalent to

$$\nabla_r h^r{}_a = 0 \quad \text{and} \quad \nabla_m h_{ab} \varphi^{ma}{}_c = 0.$$

It also turns out [2] that, if χ is defined as above, then

$$\Delta\chi = 0 \Leftrightarrow \Delta_L h = 0$$
where $\Delta_L$ is the Lichnerowicz operator given by

$$\Delta_L h_{ab} = \nabla^2 h_{ab} + 2 R_{acbd} h^{cd}. \tag{4.36}$$

Therefore to preserve the torsion-free $G_2$ structure, we have to limit our attention to zero modes of the Lichnerowicz operator. Note that, to linear order, traceless deformations of the metric which preserve the Ricci tensor are also precisely the Lichnerowicz zero modes, and this is consistent with (4.31), where the linear term in the metric deformation is proportional to h.

Let us compare what happens here to what happens on Calabi–Yau manifolds [14]. In that case, deformations of the metric $\delta g_{mn}$ split into deformations of mixed type $\delta g_{\mu\bar\nu}$ and deformations of pure type $\delta g_{\mu\nu}$ and $\delta g_{\bar\mu\bar\nu}$. From the mixed type deformations we can define a real (1, 1)-form

$$i\, \delta g_{\mu\bar\nu}\, dx^\mu \wedge dx^{\bar\nu} \tag{4.37}$$

and given the holomorphic 3-form Ω, we can use the pure type deformation to define a (2, 1)-form

$$\Omega_{\kappa\lambda}{}^{\bar\nu}\, \delta g_{\bar\mu\bar\nu}\, dx^\kappa \wedge dx^\lambda \wedge dx^{\bar\mu}. \tag{4.38}$$

In order to preserve the Calabi–Yau structure, the metric deformation must preserve the vanishing Ricci curvature, and hence $\delta g_{mn}$ must satisfy the Lichnerowicz equation:

$$\Delta_L \delta g_{mn} = 0.$$

However, the Lichnerowicz equation for $\delta g_{mn}$ becomes equivalent to both the (1, 1)-form (4.37) and the (2, 1)-form (4.38) being harmonic. Note that the definition (4.38) is very similar to $\chi_{abc} = h^d{}_{[a} \varphi_{bc]d}$ in the $G_2$ case with ϕ playing the role of Ω and h the role of $\delta g_{\bar\mu\bar\nu}$.

4.2. Geometry of the moduli space

In the theory of Calabi–Yau moduli spaces, one of the key results is that the local moduli space of complex structure deformations is isomorphic to an open set in $H^{m-1,1}(X)$ where X is a Calabi–Yau m-fold. Moreover, as has been shown by Tian and Todorov [42, 43], any infinitesimal deformation can in fact be lifted to a full deformation. For the moduli spaces of $G_2$ manifolds however, we can only replicate the results about the local moduli space. First let us define the moduli space of torsion-free $G_2$ structures. Let $\mathcal{X}$ be the set of positive three-forms $\varphi \in P^3 X$ such that $d\varphi = d \ast_\varphi \varphi = 0$. Here we use $\ast_\varphi$ to emphasize that the Hodge star is defined using the $G_2$ holonomy metric that is defined by ϕ itself. Then $\mathcal{X}$ gives the set of all three-forms that correspond to oriented, torsion-free $G_2$ structures. However we do not want to distinguish between three-forms that are related by a diffeomorphism. Hence, let $\mathcal{D}$ be the group of all diffeomorphisms of X isotopic to the identity. This group then acts naturally on three-forms. The
moduli space of torsion-free G2 structures is then defined as the quotient M = X /D. The key result by Joyce is that M is locally diffeomorphic to an open set of H 3 (X, R): Theorem 18 ([28, 29]). Define a map Ξ : X →H 3 (X, R) by Ξ(ϕ) = [ϕ]. Then Ξ is invariant under the action of D on X . Moreover, Ξ induces a diffeomorphism between neighborhoods of ϕD ∈ M and [ϕ] ∈ H 3 (X, R). Since the dimension of H 3 (X, R) is b3 (X), this result implies that dim M = b3 (X). The full proof of this result can be found either in [28, 29] or [30]. This result covers the basic local properties of the G2 moduli space, but we do not yet know anything about the global structure of M. So anything we can say about the moduli space only holds in a small neighborhood. Looking back at the study of Calabi–Yau moduli spaces, we know that the complex structure moduli space admits a K¨ ahler structure, and the K¨ ahler structure moduli space admits a Hessian structure [14]. It turns out that on the G2 moduli space we can also define a Hessian structure. First let us define the notion of a Hessian manifold [39]: Definition 19. Let M be a smooth manifold and suppose D is a flat, torsionfree connection on M . A Riemannian metric G on a flat manifold (M, D) is called Hessian if G can be locally expressed as G = D2 H
(4.39)
that is, ∂ 2H (4.40) ∂xi ∂xj where {x1 , . . . , xn } is an affine coordinate system with respect to D. Then H is called the Hessian potential. Gij =
Note that this is the closest analogue to a Kähler structure that can be defined on a real manifold. In fact, as shown by Shima [39], if we define a complex structure on the manifold $TM$, then the straightforward extension of $G$ onto $TM$ is Kähler if and only if $G$ is a Hessian metric on $(M, D)$. Thus the complexification of a Hessian manifold is Kähler.

In the case of the $G_2$ moduli space $\mathcal{M}$, we know that $\mathcal{M}$ is locally diffeomorphic to an open set in $H^3(X, \mathbb{R})$. Suppose we choose a basis $[\phi_0], \dots, [\phi_n]$ on $H^3(X, \mathbb{R})$ where $n = b^3(X) - 1$. Taking the unique harmonic representatives of the basis elements, we can expand $\varphi \in \mathcal{M}$ as
$$\varphi = \sum_{N=0}^{n} s^N \phi_N. \tag{4.41}$$
Since H 3 (X, R) is a vector space, s0 , . . . , sn give an affine coordinate system, which in turn defines a flat connection D = d on M. It is trivial to check that this connection is well-defined [34].
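In order to define a metric on $\mathcal{M}$, we have to choose a Hessian potential function on $\mathcal{M}$. The only natural function on $\mathcal{M}$ is the volume function $V(\varphi)$ given by (3.5):
$$V(\varphi) = \frac{1}{7}\int_X \varphi \wedge \psi.$$
Note that as before, $\psi = *_\varphi \varphi$ is itself a function of $\varphi$. So we can consider $V$ or some function of $V$ as potential candidates for a Hessian potential. Let us calculate the Hessian of $V$. Note that under a scaling $s^M \to \lambda s^M$, $\varphi$ scales as $\varphi \to \lambda\varphi$ and from (4.6), $*\varphi$ scales as $*\varphi \to \lambda^{4/3} *\varphi$, and so $V$ scales as
$$V \to \lambda^{7/3} V.$$
So $V$ is homogeneous of order $\frac{7}{3}$ in the $s^M$, and hence
$$s^M \frac{\partial V}{\partial s^M} = \frac{7}{3} V = \frac{1}{3}\, s^M \int \phi_M \wedge *\varphi$$
and thus,
$$\frac{\partial V}{\partial s^M} = \frac{1}{3} \int \phi_M \wedge *\varphi. \tag{4.42}$$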
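The step from the scaling law to the relation $s^M \partial V/\partial s^M = \frac{7}{3}V$ is Euler's theorem for homogeneous functions. The following sketch checks that relation numerically on a toy function. Note that `V_toy` is a hypothetical stand-in, chosen here only because it is homogeneous of degree 7/3; it is not the $G_2$ volume functional, and the test point and scale factor are arbitrary.

```python
# Toy check of Euler's theorem for a function homogeneous of degree 7/3.
# V_toy is an illustrative stand-in, NOT the G2 volume functional.

def V_toy(s):
    s0, s1 = s
    return (s0 ** 2 + 2.0 * s1 ** 2) ** (7.0 / 6.0)

def gradient(f, s, h=1e-6):
    """Central finite-difference gradient of f at the point s."""
    grad = []
    for i in range(len(s)):
        up = list(s)
        dn = list(s)
        up[i] += h
        dn[i] -= h
        grad.append((f(up) - f(dn)) / (2.0 * h))
    return grad

s = [0.7, 1.3]   # arbitrary test point
lam = 1.9        # arbitrary scale factor

# Homogeneity: V(lam * s) = lam**(7/3) * V(s)
scaled = V_toy([lam * x for x in s])
homogeneity_gap = abs(scaled - lam ** (7.0 / 3.0) * V_toy(s))

# Euler relation: sum_M s^M dV/ds^M = (7/3) V
euler_lhs = sum(x * g for x, g in zip(s, gradient(V_toy, s)))
euler_rhs = (7.0 / 3.0) * V_toy(s)
```

Both `homogeneity_gap` and `euler_lhs - euler_rhs` should vanish up to rounding and finite-difference error.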
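Using our results on deformations of $G_2$ structures from Sec. 4.1, we can deduce that
$$\partial_N(*\varphi) = \frac{4}{3} *\pi_1(\phi_N) + *\pi_7(\phi_N) - *\pi_{27}(\phi_N). \tag{4.43}$$
Hence differentiating (4.42) again, we find that
$$\frac{\partial^2 V}{\partial s^M \partial s^N} = \frac{4}{9}\int \pi_1(\phi_M) \wedge *\pi_1(\phi_N) + \frac{1}{3}\int \pi_7(\phi_M) \wedge *\pi_7(\phi_N) - \frac{1}{3}\int \pi_{27}(\phi_M) \wedge *\pi_{27}(\phi_N). \tag{4.44}$$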
Note that in the case when $b^1(X) = 0$ (which in particular is true when $\mathrm{Hol}(g) = G_2$), since $H^3_7 = H^1$, the $H^3_7$ component of $H^3(X, \mathbb{R})$ is empty. Therefore, the second term in (4.44) vanishes, and we find that the signature of this metric is Lorentzian, namely $(1, b^3 - 1)$. Up to a constant factor, this definition of the moduli space metric has been used in the mathematical literature, in particular by Hitchin in [26] and by Karigiannis and Leung in [34]. However, in the physics literature, in particular by Beasley and Witten in [7] and by Gutowski and Papadopoulos in [23], the potential $K$ given by
$$K = -3 \log V \tag{4.45}$$
has been used instead.
The motivation for using this modified potential is two-fold. Firstly, this is more in line with the logarithmic Kähler potentials on Calabi–Yau moduli spaces. Secondly, and perhaps most importantly, the metric that arises from this potential appears as the target space metric of the effective theory in four dimensions when the action for the 11-dimensional supergravity is reduced to four dimensions on a $G_2$ manifold. We will hence define the moduli space metric $G_{MN}$ as
$$G_{MN} = \frac{\partial^2 K}{\partial s^M \partial s^N}.$$
Using the definition of $K$ and (4.44), we get
$$\frac{\partial^2 K}{\partial s^M \partial s^N} = \frac{1}{V}\left[\int \pi_1(\phi_M) \wedge *\pi_1(\phi_N) - \int \pi_7(\phi_M) \wedge *\pi_7(\phi_N) + \int \pi_{27}(\phi_M) \wedge *\pi_{27}(\phi_N)\right]. \tag{4.46}$$
In this case, if $b^1(X) = 0$, we get
$$G_{MN} = \frac{1}{V}\int_X \phi_M \wedge *\phi_N. \tag{4.47}$$
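The passage from (4.44) to (4.46) can be checked with exact rational arithmetic. The sketch below is only a bookkeeping aid under stated assumptions: it writes $\pi_1(\phi_M) = c_M\varphi$ (the symbol $c_M$ is introduced here and does not appear in the text), uses $\int \varphi \wedge *\varphi = 7V$, which follows from the definition of $V$, and applies the chain rule $\partial_M\partial_N K = -3V_{MN}/V + 3V_MV_N/V^2$ for $K = -3\log V$.

```python
from fractions import Fraction as F

# Coefficients of the three projections in the Hessian of V, read off (4.44).
hess_V = {"pi1": F(4, 9), "pi7": F(1, 3), "pi27": F(-1, 3)}

# Assumption: pi_1(phi_M) = c_M * phi, with c_M notation introduced here.
# Then (4.42) gives V_M = (7/3) c_M V, and
# int pi_1(phi_M) ^ *pi_1(phi_N) = 7 c_M c_N V.
first_derivative = F(7, 3)   # V_M / (c_M V)
pi1_norm = F(7)              # [int pi1 ^ *pi1] / (c_M c_N V)

# Coefficients of (1/V) int pi_k(phi_M) ^ *pi_k(phi_N) in d_M d_N K.
coeff_K = {
    # -3 V_MN / V contributes -3 * (4/9) * 7 c_M c_N,
    # +3 V_M V_N / V^2 contributes 3 * (7/3)^2 c_M c_N;
    # dividing by 7 re-expresses the sum in units of (1/V) int pi1 ^ *pi1.
    "pi1": (-3 * hess_V["pi1"] * pi1_norm + 3 * first_derivative ** 2) / pi1_norm,
    # V_M has no pi_7 or pi_27 part, so only -3 V_MN / V contributes.
    "pi7": -3 * hess_V["pi7"],
    "pi27": -3 * hess_V["pi27"],
}
```

Under these assumptions the three coefficients come out as $+1$, $-1$, $+1$, matching the signs in (4.46).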
This metric is then in fact Riemannian. In the physics setting, apart from the $G_2$ three-form, there is another three-form $C$, and when the 11-dimensional supergravity action is reduced to four dimensions, the parameters of $\varphi$ and $C$ naturally combine to give a complexification of the $G_2$ moduli space. The extension of the metric $G_{MN}$ to this complex space is then Kähler [7, 21, 23]. However, since the metric on the complexified space does not depend on $C$, there is not much difference in treating the moduli space as a complexified Kähler manifold or a real Hessian manifold. Here we will treat $\mathcal{M}$ as a real Hessian manifold.

Now that we have fixed a metric on $\mathcal{M}$, we can proceed to various other geometrical quantities. For this we will need to use higher derivatives of $\psi$. In what follows, we will assume that $b^1(X) = 0$, so that there are no harmonic forms in $H^3_7$. Let us introduce local special coordinates on $\mathcal{M}$. Let $\phi_0 = a\varphi$ and $\phi_\mu \in \Lambda^3_{27}$ for $\mu = 1, \dots, b^3_{27}$, so that $s^0$ defines directions parallel to $\varphi$ and the $s^\mu$ define directions in $H^3_{27}$. Then, from the deformations of $\psi$ in Sec. 4.1, we can extract the higher derivatives of $\psi$ in these directions:
$$\partial_0\partial_0\psi = \frac{4}{9}a^2\psi, \qquad \partial_0\partial_0\partial_0\psi = -\frac{8}{27}a^3\psi, \tag{4.48a}$$
$$\partial_0\partial_\mu\psi = -\frac{1}{3}a *\phi_\mu, \qquad \partial_0\partial_0\partial_\mu\psi = \frac{2}{9}a^2 *\phi_\mu, \tag{4.48b}$$
$$\partial_\mu\partial_\nu\psi = -\frac{2}{189}\mathrm{Tr}(h_\mu h_\nu)\psi + \frac{1}{3} * i_\varphi((\varphi h_\mu h_\nu \varphi)_0), \tag{4.48c}$$
$$\partial_0\partial_\mu\partial_\nu\psi = \frac{4}{567}a\,\mathrm{Tr}(h_\mu h_\nu)\psi - \frac{2}{9}a * i_\varphi((\varphi h_\mu h_\nu \varphi)_0), \tag{4.48d}$$
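$$\partial_\mu\partial_\nu\partial_\kappa\psi = -\frac{5}{18}\mathrm{Tr}(h_\mu h_\nu) *\phi_\kappa + \frac{1}{3} * i_\varphi((h_\mu h_\nu h_\kappa)_0) - \frac{1}{6} * i_\varphi((\psi h_\mu h_\nu h_\kappa \psi)_0) - \frac{4}{567}(\varphi h_\mu h_\nu h_\kappa \varphi)\psi, \tag{4.48e}$$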
where $h_\mu$, $h_\nu$ and $h_\kappa$ are traceless symmetric matrices corresponding to the three-forms $\phi_\mu$, $\phi_\nu$ and $\phi_\kappa$, respectively.

On a Hessian manifold, there is a natural symmetric three-tensor given by the derivative of the metric, or equivalently the third derivative of the Hessian potential. We will denote this tensor $A_{MNP}$. By analogy with similar quantities on Calabi–Yau moduli spaces, this tensor is called the Yukawa coupling. Using these expressions, following [21] we can now write down all the components of $A_{MNP}$:
$$A_{000} = -14a^3, \tag{4.49a}$$
$$A_{00\mu} = 0, \tag{4.49b}$$
$$A_{0\mu\nu} = -\frac{2a}{V}\int \phi_\mu \wedge *\phi_\nu = -2aG_{\mu\nu}, \tag{4.49c}$$
$$A_{\mu\nu\rho} = -\frac{2}{27V}\int (\varphi h_\mu h_\nu h_\rho \varphi)\,dV. \tag{4.49d}$$
The full Riemann curvature on a Hessian manifold is then defined by
$$R^{M}{}_{NPQ} = \frac{1}{4}\left(A^{M}{}_{QR}A^{R}{}_{NP} - A^{M}{}_{PR}A^{R}{}_{NQ}\right). \tag{4.50}$$
Note that since the fourth derivative of $K$ is fully symmetric, the fourth derivative terms vanish here. However, we can also define the Hessian curvature tensor by
$$Q_{KLMN} = \partial_M\partial_N\partial_L\partial_K K - A_{KMR}A^{R}{}_{LN}. \tag{4.51}$$
This tensor is the equivalent of the Kähler curvature, and carries more information than the actual Riemann tensor (4.50). The Riemann curvature tensor is obtained from $Q$ by
$$R_{MNPQ} = \frac{1}{2}\left(Q_{MNPQ} - Q_{NMPQ}\right). \tag{4.52}$$
From (4.48), we can calculate the fourth derivatives of $K$, and hence get all the components of $Q$:

Theorem 20 ([21]). The components of the Hessian curvature tensor $Q$ corresponding to the metric (4.47) on the local moduli space of torsion-free $G_2$ structures are given by:
$$Q_{0000} = 14a^4, \tag{4.53a}$$
$$Q_{000\mu} = 0, \tag{4.53b}$$
$$Q_{00\mu\nu} = 2a^2 G_{\mu\nu}, \tag{4.53c}$$
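$$Q_{0\mu\nu\rho} = -aA_{\mu\nu\rho}, \tag{4.53d}$$
$$Q_{\kappa\mu\nu\rho} = \frac{1}{3}\left(G_{\mu\nu}G_{\kappa\rho} + G_{\mu\kappa}G_{\nu\rho} - \frac{5}{7}G_{\mu\rho}G_{\kappa\nu}\right) - G^{\tau\sigma}A_{\mu\tau\rho}A_{\kappa\nu\sigma} + \frac{1}{V}\int\left(-\frac{2}{27}\mathrm{Tr}(h_\kappa h_\mu h_\nu h_\rho) + \frac{1}{27}(\psi h_\kappa h_\mu h_\nu h_\rho \psi) + \frac{5}{81}\mathrm{Tr}(h_{(\kappa}h_\mu)\mathrm{Tr}(h_\nu h_{\rho)})\right)\mathrm{vol}. \tag{4.53e}$$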
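The purely "radial" components (4.49a) and (4.53a) can be cross-checked in a few lines. The sketch below restricts to the $s^0$ direction, where the scaling $V \propto (s^0)^{7/3}$ gives $K = -7\log s^0 + \text{const}$, and evaluates (4.51) with exact fractions. It assumes that the metric is block diagonal ($G_{0\mu} = 0$), so that $G^{00} = 1/G_{00}$, and that only $R = 0$ contributes to the contraction because $A_{00\mu} = 0$; the helper name `dK` and the test value of $a$ are illustrative choices.

```python
from fractions import Fraction as F
from math import factorial

def dK(n, s):
    """n-th derivative (n >= 1) of K(s) = -7 log s + const."""
    return F(-7) * (-1) ** (n - 1) * factorial(n - 1) / s ** n

a = F(3)       # arbitrary positive test value; phi_0 = a * phi
s0 = 1 / a     # the point where phi = s0 * phi_0

G00 = dK(2, s0)                          # metric component G_00
A000 = dK(3, s0)                         # Yukawa component A_000
Q0000 = dK(4, s0) - A000 * A000 / G00    # (4.51) with only R = 0 contributing
```

With these assumptions the values reduce to $G_{00} = 7a^2$, $A_{000} = -14a^3$ and $Q_{0000} = 14a^4$.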
Let us look in more detail at the expression for $A_{\mu\nu\rho}$. If we define $h^a_\mu = h^a_{\mu\,m}\,dx^m$, then we get
$$A_{\mu\nu\rho} = -\frac{4}{9V}\int \varphi_{abc}\, h^a_\mu \wedge h^b_\nu \wedge h^c_\rho \wedge \psi. \tag{4.54}$$
Expressions for the $G_2$ Yukawa coupling have been derived by different authors, in particular by Lee and Leung [37], de Boer, Naqvi and Shomer [16], and Karigiannis [32]. Similarly, we can rewrite (4.53e) as
$$Q_{\kappa\mu\nu\rho} = \frac{1}{3}\left(G_{\mu\nu}G_{\kappa\rho} + G_{\mu\kappa}G_{\nu\rho} - \frac{5}{7}G_{\mu\rho}G_{\kappa\nu}\right) - G^{\tau\sigma}A_{\mu\tau\rho}A_{\kappa\nu\sigma} + \frac{8}{9}\frac{1}{V}\int \psi_{abcd}\, h^a_\kappa \wedge h^b_\mu \wedge h^c_\nu \wedge h^d_\rho \wedge \varphi + \frac{1}{81}\frac{1}{V}\int \left(5\,\mathrm{Tr}(h_{(\kappa}h_\mu)\mathrm{Tr}(h_\nu h_{\rho)}) - 6\,\mathrm{Tr}(h_\kappa h_\mu h_\nu h_\rho)\right)\mathrm{vol}. \tag{4.55}$$
As we have mentioned previously, by complexifying the $G_2$ moduli space, it is possible to turn the Hessian structure into a Kähler structure. Similarly, the Hessian curvature $Q$ becomes Kähler curvature. On Calabi–Yau manifolds, the complex structure moduli space is naturally a complex manifold, and admits a Kähler structure, while the Kähler structure moduli space is naturally a Hessian manifold, but can be complexified to become Kähler itself.

We compare the various quantities on the $G_2$ moduli space and on the Calabi–Yau complex structure moduli space in Fig. 5. We can see that there are a number of similarities. This leads to the speculation that perhaps the $G_2$ moduli space possesses more structure than is currently known. One of the key features of Calabi–Yau moduli spaces is the special geometry, that is, both have a line bundle whose first Chern class coincides with the Kähler class [19, 40]. From the physics point of view, special geometry relates to the effective theory having $N = 2$ supersymmetry. M-theory compactified on $G_2$ manifolds only gives $N = 1$ supersymmetry, so from this point of view it is perhaps unlikely that the (complexified) $G_2$ moduli space would admit precisely this structure. Moreover, it was shown by Alekseevsky and Cortés in [4] that a so-called special real structure on a Hessian manifold corresponds to a special Kähler structure on the tangent bundle.
A special real manifold is a Hessian manifold on which the cubic form DG (with D being the flat connection, and G the Hessian metric) is parallel with respect
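Quantity | $G_2$ moduli in $\Lambda^3_{27}$ | Complex structure moduli
Form | $\varphi$, $\psi$ | $\Omega$
Deformation space | $H^3_{27}$ | $H^{(2,1)}$
Metric deformation | $\frac{2}{3}h_{\mu\nu}$ | $\delta g_{\bar\mu\bar\nu}$
Form deformation | $\chi_{abc} = h^{d}{}_{[a}\varphi_{bc]d}$ | $\chi_{\alpha\beta\bar\gamma} = -\frac{1}{2}\Omega_{\alpha\beta}{}^{\bar\delta}\,\delta g_{\bar\gamma\bar\delta}$
Kähler potential | $K = -3\log\left(\int \varphi \wedge \psi\right)$ | $K = -\log\left(i\int \Omega \wedge \bar\Omega\right)$
Moduli space metric | $G_{\mu\nu} = \frac{1}{V}\int \phi_\mu \wedge *\phi_\nu$ | $G_{\mu\bar\nu} = -\dfrac{\int \chi_\mu \wedge \bar\chi_{\bar\nu}}{\int \Omega \wedge \bar\Omega}$
Yukawa coupling | $A_{\mu\nu\rho} = -\frac{4}{9V}\int \varphi_{abc}\, h^a_\mu \wedge h^b_\nu \wedge h^c_\rho \wedge \psi$ | $\kappa_{\mu\nu\rho} = -\int \Omega_{\alpha\beta\gamma}\, \chi^\alpha_\mu \wedge \chi^\beta_\nu \wedge \chi^\gamma_\rho \wedge \Omega$
Curvature | $Q_{\kappa\mu\nu\rho}$ as in (4.55) | $R_{\mu\bar\nu\rho\bar\sigma} = G_{\mu\bar\nu}G_{\rho\bar\sigma} + G_{\mu\bar\sigma}G_{\rho\bar\nu} - e^{2K_C}\kappa_{\mu\rho}{}^{\bar\tau}\bar\kappa_{\bar\nu\bar\sigma\bar\tau}$

Fig. 5. Comparison of $G_2$ moduli space and Calabi–Yau complex structure moduli space.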
to D. In our terms, this would mean that the derivative of the Yukawa coupling A vanishes. This is a rather strong condition which is not necessarily fulfilled in our case. So perhaps instead there is some intermediate structure that could be defined on the G2 moduli space or its complexification.
5. Concluding Remarks

In this paper, we have reviewed the developments in the study of $G_2$ moduli spaces. Currently only the local picture of the moduli space is known, so in the future it is natural to try to obtain at least some information on the global structure of the $G_2$ moduli space. On Calabi–Yau manifolds, the extension to the global moduli space was originally done by Tian and Todorov [42, 43]. We have seen that there are a number of similarities in the local structure of Calabi–Yau moduli spaces and $G_2$ moduli spaces, so it may also be possible to derive similar global properties of $G_2$ moduli spaces. However, torsion-free $G_2$ structures are very nonlinear in some aspects: in particular, the metric depends nonlinearly on $\varphi$, and hence the differential equation $\nabla\varphi = 0$ for a torsion-free structure is also nonlinear. Therefore, it is not clear how to extend infinitesimal deformations of a $G_2$ structure to large deformations, apart from considering deformations order by order. However, even such expansions quickly get very complicated.

Another possible topic for study would be to further develop approaches to mirror symmetry on $G_2$ holonomy manifolds [22]. One possible direction for further research is to look at $G_2$ manifolds in a slightly different way. Suppose we have type IIA superstrings on a non-compact Calabi–Yau three-fold with a special Lagrangian submanifold which is wrapped by a D6 brane which also fills $M_4$.
Then, as explained in [3], from the M-theory perspective this looks like an $S^1$ bundle over the Calabi–Yau which is degenerate over the special Lagrangian submanifold, but this seven-manifold is still a $G_2$ manifold. The moduli space of this manifold will then be determined by the Calabi–Yau moduli and the special Lagrangian moduli. This could possibly provide more information about mirror symmetry on Calabi–Yau manifolds [41].

One more direction is to look at $G_2$ manifolds with singularities. So far in this work we have considered only smooth $G_2$ manifolds; however, from a physical point of view, $G_2$ manifolds with singularities are even more interesting, as they yield more realistic matter content [1]. Also, the moduli spaces which we studied are for manifolds with fixed topology. By allowing topological transitions through singularities [15], it may be possible to find some relations between the different moduli spaces. Understanding these questions would improve our grasp of both the geometry and physics of $G_2$ moduli spaces and the interplay between them.

References

[1] B. Acharya and E. Witten, Chiral fermions from manifolds of G(2) holonomy, hep-th/0109152.
[2] B. S. Acharya and S. Gukov, M theory and singularities of exceptional holonomy manifolds, Phys. Rept. 392 (2004) 121–189; hep-th/0409191.
[3] M. Aganagic, A. Klemm and C. Vafa, Disk instantons, mirror symmetry and the duality web, Z. Naturforsch. A 57 (2002) 1–28; hep-th/0105045.
[4] D. V. Alekseevsky and V. Cortes, Geometric construction of the r-map: From affine special real to special Kähler manifolds, arXiv:0811.1658.
[5] W. Ambrose and I. M. Singer, A theorem on holonomy, Trans. Amer. Math. Soc. 75 (1953) 428–443.
[6] J. Baez, The Octon, Bull. Amer. Math. Soc. (N.S.) 39 (2002) 145–205.
[7] C. Beasley and E. Witten, A note on fluxes and superpotentials in M-theory compactifications on manifolds of G(2) holonomy, JHEP 07 (2002); hep-th/0203061.
[8] K. Becker, M. Becker and J. H. Schwarz, String Theory and M-Theory: A Modern Introduction (Cambridge University Press, 2007).
[9] M. Berger, Sur les groupes d’holonomie homogène des variétés à connexion affine et des variétés riemanniennes, Bull. Soc. Math. France 83 (1955) 279–330.
[10] E. Bonan, Sur les variétés riemanniennes à groupe d’holonomie g2 ou spin(7), C. R. Acad. Sci. Paris 262 (1966) 127–129.
[11] R. Bryant and S. Salamon, On construction of some complete metrics with exceptional holonomy, Duke Math. J. 58 (1989) 829–850.
[12] R. L. Bryant, Metrics with exceptional holonomy, Ann. of Math. (2) 126(3) (1987) 525–576.
[13] R. L. Bryant, Some remarks on G2-structures, math/0305124.
[14] P. Candelas and X. de la Ossa, Moduli space of Calabi–Yau manifolds, Nucl. Phys. B 355 (1991) 455–481.
[15] M. Cvetic, G. W. Gibbons, H. Lu and C. N. Pope, M-theory conifolds, Phys. Rev. Lett. 88 (2002) 121602, pp. 4; hep-th/0112098.
[16] J. de Boer, A. Naqvi and A. Shomer, The topological, hep-th/0506211.
[17] M. J. Duff, M-theory on manifolds of G(2) holonomy: The first twenty years, hep-th/0201062.
[18] M. Fernández and A. Gray, Riemannian manifolds with structure group G2, Ann. Mat. Pura Appl. (4) 132 (1982) 19–45.
[19] D. S. Freed, Special Kaehler manifolds, Comm. Math. Phys. 203 (1999) 31–52; hep-th/9712042.
[20] S. Grigorian, Betti numbers of a class of barely G2 manifolds, arXiv:0909.4681.
[21] S. Grigorian and S.-T. Yau, Local geometry of the G2 moduli space, Comm. Math. Phys. 287 (2009) 459–488; arXiv:0802.0723.
[22] S. Gukov, S.-T. Yau and E. Zaslow, Duality and fibrations on G(2) manifolds, hep-th/0203217.
[23] J. Gutowski and G. Papadopoulos, Moduli spaces and brane solitons for M theory compactifications on holonomy G(2) manifolds, Nucl. Phys. B 615 (2001) 237–265; hep-th/0104105.
[24] F. R. Harvey, Spinors and Calibrations (Academic Press, 1990).
[25] J. A. Harvey and G. W. Moore, Superpotentials and membrane instantons, hep-th/9907026.
[26] N. J. Hitchin, The geometry of three-forms in six and seven dimensions, math/0010054.
[27] K. Hori et al., Mirror Symmetry (Amer. Math. Soc., 2003).
[28] D. D. Joyce, Compact Riemannian 7-manifolds with holonomy G2. I, J. Differential Geom. 43 (1996) 291–328.
[29] D. D. Joyce, Compact Riemannian 7-manifolds with holonomy G2. II, J. Differential Geom. 43 (1996) 329–375.
[30] D. D. Joyce, Compact Manifolds with Special Holonomy, Oxford Mathematical Monographs (Oxford University Press, 2000).
[31] S. Karigiannis, Deformations of G2 and Spin(7) structures on manifolds, Canad. J. Math. 57 (2005) 1012–1055; math/0301218.
[32] S. Karigiannis, Geometric flows on manifolds with G2 structure, I, math/0702077.
[33] S. Karigiannis, Desingularization of G2 manifolds with isolated conical singularities, Geom. Topol. 13(3) (2009) 1583–1655; arXiv:0807.3346.
[34] S. Karigiannis and N. C. Leung, Hodge theory for G2-manifolds: Intermediate Jacobians and Abel–Jacobi maps, arXiv:0709.2987.
[35] A. Kovalev, Twisted connected sums and special Riemannian holonomy, math/0012189.
[36] A. Kovalev and J. Nordström, Asymptotically cylindrical 7-manifolds of holonomy G2 with applications to compact irreducible G2-manifolds, arXiv:0907.0497.
[37] J.-H. Lee and N. C. Leung, Geometric structures on G(2) and Spin(7)-manifolds, math/0202045.
[38] J. Nordström, Deformations of asymptotically cylindrical G2-manifolds, Math. Proc. Cambridge Philos. Soc. 145(2) (2008) 311–348; arXiv:0705.4444.
[39] H. Shima, The Geometry of Hessian Structures (World Scientific Publishing, 2007).
[40] A. Strominger, Special geometry, Comm. Math. Phys. 133 (1990) 163–180.
[41] A. Strominger, S.-T. Yau and E. Zaslow, Mirror symmetry is T-duality, Nucl. Phys. B 479 (1996) 243–259; hep-th/9606040.
[42] G. Tian, Smoothness of the universal deformation space of compact Calabi–Yau manifolds and its Petersson–Weil metric, in Mathematical Aspects of String Theory (San Diego, Calif., 1986), Adv. Ser. Math. Phys., Vol. 1 (World Sci. Publishing, 1987), pp. 629–646.
[43] A. Todorov, The Weil–Petersson geometry of the moduli space of SU(n ≥ 3) (Calabi–Yau) manifolds I, Comm. Math. Phys. 126 (1989) 325–346.
[44] P. K. Townsend, The eleven-dimensional supermembrane revisited, Phys. Lett. B 350 (1995) 184–187; hep-th/9501068.
[45] P. C. West, Introduction to Supersymmetry and Supergravity (World Scientific Publishing, Singapore, 1990).
[46] E. Witten, String theory dynamics in various dimensions, Nucl. Phys. B 443 (1995) 85–126; hep-th/9503124.
October 12, J070-S0129055X10004144
2010 10:2 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 22, No. 9 (2010) 1099–1121 © World Scientific Publishing Company DOI: 10.1142/S0129055X10004144
A UNIFIED TREATMENT OF CONVEXITY OF RELATIVE ENTROPY AND RELATED TRACE FUNCTIONS, WITH CONDITIONS FOR EQUALITY
ANNA JENČOVÁ
Mathematical Institute, Slovak Academy of Sciences, Štefánikova 49, 814 73 Bratislava, Slovakia
[email protected]

MARY BETH RUSKAI
Department of Mathematics, Tufts University, Medford, MA 02155, USA
[email protected]

Received 14 August 2009
Revised 19 June 2010

We consider a generalization of relative entropy derived from the Wigner–Yanase–Dyson entropy and give a simple, self-contained proof that it is convex. Moreover, special cases yield the joint convexity of relative entropy, and for $\mathrm{Tr}\, K^* A^p K B^{1-p}$ Lieb’s joint concavity in $(A, B)$ for $0 < p < 1$ and Ando’s joint convexity for $1 < p \le 2$. This approach allows us to obtain conditions for equality in these cases, as well as conditions for equality in a number of inequalities which follow from them. These include the monotonicity under partial traces, and some Minkowski type matrix inequalities proved by Carlen and Lieb for $\mathrm{Tr}_1 (\mathrm{Tr}_2 A_{12}^p)^{1/p}$. In all cases, the equality conditions are independent of $p$; for extensions to three spaces they are identical to the conditions for equality in the strong subadditivity of relative entropy.

Keywords: Relative entropy; convex trace functions; Wigner–Yanase–Dyson entropy.

Mathematics Subject Classification 2010: 47A63, 15A45, 94A17
1. Introduction

1.1. Background

For matrices $A_{12} > 0$ acting on a tensor product of two Hilbert spaces, Carlen and Lieb ([7, 8]) considered the trace function $[\mathrm{Tr}_1 (\mathrm{Tr}_2 A_{12}^p)^{q/p}]^{1/q}$ and proved that it is concave when $0 \le p \le q \le 1$ and convex when $1 \le q$ and $1 \le p \le 2$. They showed that this implies that these functions and the norms they generate satisfy Minkowski type inequalities, including a natural generalization to matrices $A_{123}$ acting on a tensor product of three Hilbert spaces. They also raised the question of the conditions for equality in their inequalities. When $q = 1$, we show that this can
be treated using methods developed to treat equality in the strong subadditivity of quantum entropy. Moreover, we obtain conditions for equality in a large class of related convexity inequalities, show that they are independent of $p$ in the range $0 < p < 2$, and show that for inequalities involving $A_{123}$ they are identical to the equality conditions for strong subadditivity (SSA) of quantum entropy given in [13]. These equality conditions are non-trivial and have found many applications in quantum information theory. For example, they play an important role in some recent “no broadcasting” results; see [19] and references therein. They also play a key role in Devetak and Yard’s ([9]) “quantum state redistribution” protocol which gives an operational interpretation to the quantum conditional mutual information.

Our approach to proving joint convexity of relative entropy is motivated by Araki’s relative modular operator ([5]), introduced to generalize relative entropy to more general situations including type III von Neumann algebras. It was subsequently used by Narnhofer and Thirring ([29]) to give a new proof of SSA. The argument given here is similar to that in [18, 31, 37]; however, the unified treatment for $0 < p < 2$ leading to equality conditions, is new. Moreover, a dual treatment can be given for $-1 < p < 1$ allowing extension to the full range $(-1, 2)$.

Wigner and Yanase ([42, 43]) introduced the notion of skew information of a density matrix $\gamma$ with respect to a self-adjoint observable $K$,
$$-\frac{1}{2}\mathrm{Tr}\,[K, \gamma^p][K, \gamma^{1-p}] \tag{1}$$
for $p = \frac{1}{2}$ and Dyson suggested extending this to $p \in (0, 1)$. Wigner and Yanase [43] proved that (1) is convex in $\gamma$ for $p = \frac{1}{2}$ and, in his seminal paper [20] on convex trace functions, Lieb proved joint concavity for $p \in (0, 1)$ for the more general function
$$(A, B) \mapsto \mathrm{Tr}\, K^* A^p K B^{1-p} \tag{2}$$
for $K$ fixed and $A, B > 0$ positive semi-definite. This implies convexity of (1) and was a key step in the original proof ([23]) of the strong subadditivity (SSA) inequality of quantum entropy. Moreover, it leads to a proof of joint convexity of relative entropy (see footnote a) as well.

Footnote a: In [23], only concavity of the conditional entropy was proved explicitly, but the same argument [36, Sec. V.B] yields joint convexity of the relative entropy. Independently, Lindblad ([26]) observed that this follows directly from (2) by differentiating at $p = 1$.

It is less well known that Ando ([3, 4]) gave another proof which also showed that for $1 \le p \le 2$, the function (2) is jointly convex in $A, B$. The case $p = 2$ was considered earlier by Lieb and Ruskai ([24]). We modify what one might describe as Lieb’s extension of the Wigner–Yanase–Dyson (WYD) entropy to a type of relative entropy in a way that allows a unified treatment of the convexity and concavity of $\mathrm{Tr}\, K^* A^p K B^{1-p}$ in the range $p \in (0, 2]$ and includes the usual relative entropy as a special case. Our modification retains a linear term,
even for $A = B$. Although this might seem unnecessary for convexity and concavity questions, it is crucial to a unified treatment. Lieb also considered $\mathrm{Tr}\, K^* A^p K B^q$ with $p, q > 0$ and $0 \le p + q \le 1$ and Ando considered $1 < q \le p \le 2$. In Sec. 2.2, we extend our results to this situation. However, we also show that for $q \ne 1 - p$, equality holds only under trivial conditions. Therefore, we concentrate on the case $q = 1 - p$.

Next, we introduce our notation and conventions. In Sec. 2, we first describe our generalization of relative entropy and prove its convexity; then consider the extension to $q \ne 1 - p$ mentioned above; and finally prove monotonicity under partial traces including a generalization of strong subadditivity to $p \ne 1$. In Sec. 3, we consider several formulations of equality conditions. In Sec. 4, we show how to use these results to obtain equality conditions in the results of Lieb and Carlen ([7, 8]). For completeness, we include an appendix which contains the proof of a basic convexity result from [37] that is key to our results.

1.2. Notation and conventions

We introduce two linear maps on the space $M_d$ of $d \times d$ matrices. Left multiplication by $A$ is denoted $L_A$ and defined as $L_A(X) = AX$; right multiplication by $B$ is denoted $R_B$ and defined as $R_B(X) = XB$. These maps are associated with the relative modular operator $\Delta_{AB} = L_A R_B^{-1}$ introduced by Araki ([5]) in a far more general context. They have the following properties:

(a) The operators $L_A$ and $R_B$ commute since
$$L_A[R_B(X)] = AXB = R_B[L_A(X)] \tag{3}$$
even when $A$ and $B$ do not commute.
(b) $L_A$ and $R_A$ are invertible if and only if $A$ is non-singular, in which case $L_A^{-1} = L_{A^{-1}}$ and $R_A^{-1} = R_{A^{-1}}$.
(c) When $A$ is self-adjoint, $L_A$ and $R_A$ are both self-adjoint with respect to the Hilbert–Schmidt inner product $\langle A, B \rangle = \mathrm{Tr}\, A^* B$.
(d) When $A \ge 0$, the operators $L_A$ and $R_A$ are positive semi-definite, i.e. $\mathrm{Tr}\, X^* L_A(X) = \mathrm{Tr}\, X^* A X \ge 0$ and $\mathrm{Tr}\, X^* R_A(X) = \mathrm{Tr}\, X^* X A = \mathrm{Tr}\, X A X^* \ge 0$.
(e) When $A > 0$, then $(L_A)^p = L_{A^p}$ and $(R_A)^p = R_{A^p}$ for all $p \ge 0$. If $A$ is also non-singular, this extends to all $p \in \mathbb{R}$. More generally, $f(L_A) = L_{f(A)}$ for $f : (0, \infty) \to \mathbb{R}$.

To see why (e) holds, it suffices to observe that $A > 0$ implies $L_A$ and $R_A$ are linear operators for which $f(A)$ can be defined by the spectral theorem for any function $f$ with domain in $(0, \infty)$. It is easy to verify that $A|\phi_j\rangle = \alpha_j|\phi_j\rangle$ implies $L_A|\phi_j\rangle\langle\phi_k| = \alpha_j|\phi_j\rangle\langle\phi_k|$ for $k = 1, \dots, d$, so that the spectral decomposition of $A$ induces one on $L_A$ with degeneracy $d$ and $f(L_A)|\phi_j\rangle\langle\phi_k| = f(\alpha_j)|\phi_j\rangle\langle\phi_k|$.
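Properties (a), (d) and the case $p = 2$ of (e) are easy to confirm on a small example. The following sketch does so for $2 \times 2$ integer matrices stored as nested lists, so that all comparisons are exact. The helper names (`matmul`, `L`, `R`, and so on) and the test matrices are illustrative choices and not notation from the text; the matrices are real, so the adjoint reduces to the transpose.

```python
def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def L(A):
    """Left multiplication X -> A X."""
    return lambda X: matmul(A, X)

def R(B):
    """Right multiplication X -> X B."""
    return lambda X: matmul(X, B)

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def transpose(X):
    return [list(row) for row in zip(*X)]

# Real symmetric positive definite test matrices that do not commute.
A = [[2, 1], [1, 3]]
B = [[5, -2], [-2, 4]]
X = [[1, 7], [-3, 2]]

commute_AB = matmul(A, B) == matmul(B, A)          # A and B themselves
lr = L(A)(R(B)(X))                                 # L_A[R_B(X)] = A X B
rl = R(B)(L(A)(X))                                 # R_B[L_A(X)] = A X B
square_ok = L(A)(L(A)(X)) == L(matmul(A, A))(X)    # (L_A)^2 = L_{A^2}
pos_L = trace(matmul(transpose(X), L(A)(X)))       # Tr X^* A X
pos_R = trace(matmul(transpose(X), R(A)(X)))       # Tr X^* X A
```

Here `lr` and `rl` agree even though `commute_AB` is false, and both `pos_L` and `pos_R` are positive, as (d) requires.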
For $R_B$ a similar argument goes through starting with left eigenvectors of $B$, i.e. $\langle\phi_j|B = \beta_j\langle\phi_j|$.

If a function is homogeneous of degree 1, then convexity is equivalent to subadditivity. Thus, if $F(\lambda A) = \lambda F(A)$, then $F$ is convex if and only if $F(A) \le \sum_j F(A_j)$ with $A = \sum_j A_j$. We will use this equivalence without further ado.

For $B$ positive semi-definite, we denote the projection onto $(\ker B)^\perp$ by $P_{(\ker B)^\perp}$. We will encounter expressions involving commuting positive semi-definite matrices $A, D$ with $\ker D \subseteq \ker A$. We will simply write $AD^{-1}$ for
$$\lim_{\epsilon \to 0} \sqrt{A}\,(D + \epsilon I)^{-1}\sqrt{A} = AD^{-1}P_{(\ker D)^\perp} = AD^{-1}P_{(\ker A)^\perp} \tag{4}$$
with $D^{-1}$ the generalized inverse.
2. WYD Entropy Revisited and Extended

2.1. Generalization of relative entropy

We now introduce the family of functions
$$g_p(x) = \begin{cases} \dfrac{1}{p(1-p)}(x - x^p) & p \ne 1 \\ x \log x & p = 1, \end{cases} \tag{5}$$
which are well-defined for $x > 0$ and $p \ne 0$. We will consider $p \in (0, 2]$ although it would suffice to consider $p \in [\frac{1}{2}, 2]$. For $A, B$ strictly positive we define
$$J_p(K, A, B) \equiv \mathrm{Tr}\, \sqrt{B} K^* g_p\!\left(L_A R_B^{-1}\right)(K\sqrt{B}) \tag{6}$$
$$= \begin{cases} \dfrac{1}{p(1-p)}\left(\mathrm{Tr}\, K^* A K - \mathrm{Tr}\, K^* A^p K B^{1-p}\right) & p \in (0, 1) \cup (1, 2), \\ \mathrm{Tr}\, K K^* A \log A - \mathrm{Tr}\, K^* A K \log B & p = 1, \\ -\dfrac{1}{2}\left(\mathrm{Tr}\, K^* A K - \mathrm{Tr}\, A K B^{-1} K^* A\right) & p = 2. \end{cases} \tag{7}$$
When $p = 1$ and $K = I$, (6) reduces to the usual relative entropy, i.e.
$$J_1(I, A, B) = H(A, B) = \mathrm{Tr}\, A(\log A - \log B). \tag{8}$$
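For commuting matrices and $K = I$ the definition (6) collapses to a classical sum, which makes the continuity at $p = 1$ easy to see numerically. The sketch below assumes $A = \mathrm{diag}(a)$ and $B = \mathrm{diag}(b)$ with strictly positive entries, in which case $J_p(I, A, B) = \sum_i b_i\, g_p(a_i/b_i)$. The function names and the two probability vectors are illustrative choices.

```python
import math

def g(p, x):
    """The function g_p of (5) for x > 0 and p in (0, 2]."""
    if p == 1:
        return x * math.log(x)
    return (x - x ** p) / (p * (1.0 - p))

def J_classical(p, a, b):
    """J_p(I, A, B) for commuting A = diag(a), B = diag(b)."""
    return sum(bi * g(p, ai / bi) for ai, bi in zip(a, b))

# Two strictly positive probability vectors (illustrative values).
a = [0.5, 0.3, 0.2]
b = [0.2, 0.5, 0.3]

# Classical relative entropy, the commuting case of (8).
relative_entropy = sum(ai * math.log(ai / bi) for ai, bi in zip(a, b))

# J_p on both sides of p = 1 and at a few other points of (0, 2].
values = {p: J_classical(p, a, b)
          for p in (0.25, 0.5, 0.999999, 1, 1.000001, 1.5, 2)}
```

The entries of `values` for $p$ just below and just above 1 approach `relative_entropy`, and every entry is positive for these two distinct distributions.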
For $p \ne 1$, the function $J_p(K, A, B)$ differs from that considered by Lieb ([20]) and Ando ([3, 4]) by the seemingly irrelevant linear term $\mathrm{Tr}\, K^* A K$ and the factor $\frac{1}{p(1-p)}$. However, this minor difference allows us to give a unified treatment of $p \in (0, 2]$ because of the extension by continuity to $p = 1$ and the sign change there.

One might expect to associate the exchange $A \leftrightarrow B$ with the symmetry $p \leftrightarrow (1 - p)$ around $p = \frac{1}{2}$. However, there are several subtleties due to the linear term, the exchange $K \leftrightarrow K^*$, and the case $p = 1$. Therefore, we use instead the observation
that
$$\tilde J_p(K^*, B, A) = \mathrm{Tr}\, \sqrt{A} K\, \tilde g_p\!\left(L_B R_A^{-1}\right)(K^*\sqrt{A}) = \mathrm{Tr}\, \sqrt{B} K^* g_{1-p}\!\left(L_A R_B^{-1}\right)(K\sqrt{B}) = J_{1-p}(K, A, B) \tag{9}$$
where, for $-1 \le p < 1$, we define
$$\tilde g_p(x) = x\, g_{1-p}(x^{-1}) = \begin{cases} \dfrac{1}{p(1-p)}(1 - x^p) & p \ne 0 \\ -\log x & p = 0 \end{cases} \tag{10}$$
and $\tilde J_p(K, A, B) = \mathrm{Tr}\, \sqrt{B} K^* \tilde g_p\!\left(L_A R_B^{-1}\right)(K\sqrt{B})$. The functions $J_p(K, A, B)$ and $\tilde J_p(K, A, B)$ have been considered before, usually with $K = I$, in the context of information geometry ([2, Sec. 7.2] and references therein) and by Petz ([31]) who used the term “quasi-entropy”. What is novel here is that we present a simple unified proof of joint convexity in $A, B$ that easily yields equality conditions, shows that they are independent of $p$, and can be extended to other functions. The special case $J_p(I, A, I)$ is equivalent (see footnote b) to the Tsallis ([40]) entropy. When $K = K^*$, the relation
$$J_p(K, A, A) = -\frac{1}{2p(1-p)}\mathrm{Tr}\,[K, A^p][K, A^{1-p}] \tag{11}$$
yields the original WYD information (up to a constant) and extends it to the range $(0, 2]$. Moreover, $K = K^*$ implies that $J_p(K, A, A) = \tilde J_{1-p}(K, A, A)$. Although neither $g_p(w)$ nor $\tilde g_{1-p}(w)$ is positive, their average (see footnote c) $G_p(w) \equiv \frac{1}{2}[g_p(w) + w g_p(w^{-1})] \ge 0$ on $(0, \infty)$. Therefore, when $K = K^*$,
$$J_p(K, A, A) = \mathrm{Tr}\,(K\sqrt{A})^* G_p\!\left(L_A R_A^{-1}\right)(K\sqrt{A}) \ge 0. \tag{12}$$
The function $J_p(I, A, B)$ is a more appealing generalization of relative entropy than $\mathrm{Tr}\, A^p B^{1-p}$ because of Proposition 1, which one can consider to be a generalization of Klein’s inequality ([17]). It allows one to use $J_p(I, A, B)$ as a pseudo-metric, as is commonly done with the relative entropy.

Proposition 1. When $U$ is unitary and $A, B > 0$ with $\mathrm{Tr}\, A = \mathrm{Tr}\, B = 1$, then $J_p(U, A, B) \ge 0$ with equality if and only if $A = U B U^*$.

Proof. When $U$ is unitary,
$$J_p(U, A, B) = J_p(I, U^* A U, B) = J_p(I, A, U B U^*). \tag{13}$$

Footnote b: This was pointed out by Karol Życzkowski.

Footnote c: The definition of $\tilde g_p$ in (10) differs from that in [18] by the exchange $\tilde g_p \leftrightarrow \tilde g_{1-p}$ so that in [18] $G(w) = \frac{1}{2}[g(w) + \tilde g(w)]$ for any $g$. In the convention used here, $G_p(w) = \frac{1}{2}[g_p(w) + \tilde g_{1-p}(w)]$.
Therefore, it suffices to consider the case $U = I$. For $p \in (0, 1)$ Hölder’s inequality implies $\mathrm{Tr}\, A^p B^{1-p} \le (\mathrm{Tr}\, A)^p (\mathrm{Tr}\, B)^{1-p} = 1$ with equality if and only if $A = B$. It immediately follows that
$$J_p(I, A, B) \ge \frac{1}{p(1-p)}(\mathrm{Tr}\, A - 1) = 0 \quad \text{and} \quad J_p(I, A, B) = 0 \Leftrightarrow A = B. \tag{14}$$
For $p = 1$, the result is well-known [38, Sec. 2.5.2] and originally due to Klein ([17]). For $p \in (1, 2)$ we write $p = 1 + r$ and again use Hölder’s inequality
$$\begin{aligned} 1 = \mathrm{Tr}\, A &= \mathrm{Tr}\, B^{-\frac{r}{2(r+1)}} A B^{-\frac{r}{2(r+1)}} B^{\frac{r}{r+1}} \\ &\le \left[\mathrm{Tr}\left(B^{-\frac{r}{2(r+1)}} A B^{-\frac{r}{2(r+1)}}\right)^{1+r}\right]^{\frac{1}{1+r}} (\mathrm{Tr}\, B)^{\frac{r}{1+r}} \\ &\le \left[\mathrm{Tr}\, B^{-\frac{r}{2}} A^{1+r} B^{-\frac{r}{2}}\right]^{\frac{1}{1+r}} = \left[\mathrm{Tr}\, A^{1+r} B^{-r}\right]^{\frac{1}{1+r}} \end{aligned} \tag{15}$$
where we used $\mathrm{Tr}\, B = 1$ and the second inequality follows from a classic result of Lieb–Thirring [25, Appendix B, Theorem 9].

Because the denominator $p(1 - p)$ changes sign at $p = 0$ and $p = 1$, both $g_p$ and $\tilde g_p$ are convex. In fact, they satisfy the much stronger condition of operator convexity for $p \in (0, 2]$ and $p \in [-1, 1)$ respectively. Since $g(0) = 0$ and
$$\frac{g_p(x)}{x} = \begin{cases} \dfrac{1}{p(1-p)}(1 - x^{p-1}) & p \ne 1 \\ \log x & p = 1, \end{cases} \tag{16}$$
it follows that $g_p(x)/x$ is operator monotone [3, 10, 27] for $p \in (0, 2]$, i.e. it can be analytically continued to the upper half plane, which it maps into itself. By applying Nevanlinna’s theorem [1, Sec. 59, Theorem 2] to $g_p(x)/x$, one finds that $g_p(x)$ has an integral representation of the form
$$g_p(x) = ax + \int_0^\infty \frac{x^2 t - x}{x + t}\, d\nu(t) = ax + \int_0^\infty \left[\frac{x^2}{x + t} - \frac{1}{t} + \frac{1}{x + t}\right] t\, d\nu(t) \tag{17}$$
with $\nu(t) \ge 0$. Integral representations are not unique, and making a suitable change of variable in the classic formula
$$\int_0^\infty \frac{x^{p-1}}{x + 1}\, dx = \frac{\pi}{\sin p\pi} \equiv \frac{1}{c_p}, \qquad p \in (0, 1) \tag{18}$$
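allows us to give the following explicit representations
$$g_p(x) = \begin{cases}
\dfrac{1}{p(1-p)} \Bigl[\, x + c_p \displaystyle\int_0^\infty \Bigl(\frac{t}{x+t} - 1\Bigr) t^{p-1}\, dt \Bigr] & p \in (0,1), \\[2ex]
\displaystyle\int_0^\infty \Bigl[\frac{x^2}{x+t} - 1 + \frac{t}{x+t}\Bigr] \frac{1}{1+t}\, dt & p = 1, \\[2ex]
\dfrac{1}{p(1-p)} \Bigl[\, x - c_{p-1} \displaystyle\int_0^\infty \frac{x^2}{x+t}\, t^{p-2}\, dt \Bigr] & p \in (1,2), \\[2ex]
\dfrac{1}{2}\,(-x + x^2) & p = 2.
\end{cases} \tag{19}$$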
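The representations in (19) can be sanity-checked against the closed forms $g_p(x) = \frac{1}{p(1-p)}(x - x^p)$ for $p \ne 1$ and $g_1(x) = x\log x$, which follow from (16). The sketch below is only an illustration: the quadrature routine (a trapezoid rule after the substitution $t = e^u$) and all function names are assumptions of this example, and the integrands are written in algebraically simplified form to avoid cancellation at large $t$.

```python
import math


def integrate_half_line(f, lo=-200.0, hi=200.0, n=20001):
    """Approximate the integral of f(t) dt over (0, inf) using t = exp(u) and the trapezoid rule in u."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        t = math.exp(lo + i * h)
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * f(t) * t  # dt = t du
    return total * h


def g_closed(p, x):
    """Closed form of g_p(x) implied by (16)."""
    return x * math.log(x) if p == 1 else (x - x**p) / (p * (1 - p))


def g_integral(p, x):
    """g_p(x) evaluated through the integral representations (19)."""
    if 0 < p < 1:
        c_p = math.sin(p * math.pi) / math.pi
        # t/(x+t) - 1 is written as -x/(x+t) to avoid cancellation at large t
        integral = integrate_half_line(lambda t: -x / (x + t) * t ** (p - 1))
        return (x + c_p * integral) / (p * (1 - p))
    if p == 1:
        # x^2/(x+t) - 1 + t/(x+t) simplifies to (x^2 - x)/(x+t)
        return integrate_half_line(lambda t: (x * x - x) / ((x + t) * (1.0 + t)))
    if 1 < p < 2:
        c = math.sin((p - 1) * math.pi) / math.pi
        integral = integrate_half_line(lambda t: x * x / (x + t) * t ** (p - 2))
        return (x - c * integral) / (p * (1 - p))
    raise ValueError("p must lie in (0, 2)")


for p in (0.5, 1, 1.5):
    print(p, g_integral(p, 2.5), g_closed(p, 2.5))
```

The truncation of the $u$-range limits the accuracy when $p$ is very close to 0 or 2, where the integrands decay slowly.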
Note that for $p \in (0,2)$ the integrand is supported on $(0,\infty)$. This plays a key role in the equality conditions; therefore, we will henceforth concentrate on $p \in (0,2)$.

Theorem 2. The function $J_p(K,A,B)$ defined in (6) is jointly convex in $A, B$.

Proof. It follows from (17) that
$$J_p(K,A,B) = a\, \mathrm{Tr}\, K^*AK + \int_0^\infty \Bigl[\mathrm{Tr}\, K^*A\, \frac{1}{L_A + tR_B}(AK) - \frac{1}{t}\, \mathrm{Tr}\, KBK^* + \mathrm{Tr}\, BK^* \frac{1}{L_A + tR_B}(KB)\Bigr]\, t\,\nu(t)\, dt. \tag{20}$$
The joint convexity then follows immediately from that of the map $(X,A,B) \mapsto \mathrm{Tr}\, X^* \frac{1}{L_A + tR_B}(X)$, which was proved in [37] following the strategy in [24]. The proof is also given in the Appendix. For other approaches, see [30, 31, 11]. The advantage of the argument used here is that it immediately implies that equality holds in joint convexity if and only if it holds for each term in the integrand.

Corollary 3. The relative entropy $H(A,B) = J_1(I,A,B)$ is jointly convex in $A, B$.

2.2. Extensions with $r \ne 1 - p$

We now consider extensions of Theorem 2 to situations considered by Ando ([4]) and Lieb ([20]) in which $B^{1-p}$ is replaced by $B^r$ with $r \ne 1 - p$. Our approach uses an idea from Bekjan ([6]) and Effros ([11]). We will also show that equality holds in these extensions only under trivial conditions. For this we first need an elementary lemma, which we prove for the concave case.

Lemma 4. Let $f(\lambda) : [0,\infty) \to \mathbb{R}$ be a nonlinear convex or concave operator function, let $A_1, A_2$ be density matrices and $A = \lambda A_1 + (1-\lambda)A_2$ with $\lambda \in (0,1)$. Then $f(A) = \lambda f(A_1) + (1-\lambda) f(A_2)$ if and only if $A_1 = A_2$.
Proof. Since any operator concave function is analytic, nonlinearity implies that $f$ is strictly concave. If $f(A) = \lambda f(A_1) + (1-\lambda) f(A_2)$, then
$$\langle v, f(A) v\rangle = \lambda \langle v, f(A_1) v\rangle + (1-\lambda) \langle v, f(A_2) v\rangle \tag{21}$$
for any vector $v$. Now choose $v$ to be a normalized eigenvector of $A$. Then inserting this on the left above and applying Jensen's inequality to each term on the right, one finds
$$f(\langle v, Av\rangle) \le \lambda f(\langle v, A_1 v\rangle) + (1-\lambda) f(\langle v, A_2 v\rangle). \tag{22}$$
But this contradicts concavity unless equality holds, which implies that $v$ is also an eigenvector of $A_1$ and $A_2$. But then the strict concavity of $f$ also implies that $\langle v, A_1 v\rangle = \langle v, A_2 v\rangle$. Since this holds for an orthonormal basis of eigenvectors of $A$, $A_1$ and $A_2$, we must have $A_1 = A_2$.

Corollary 5. The function $(A,B) \mapsto \mathrm{Tr}\, K^*A^pKB^r$ is jointly concave on the set of positive definite matrices when $p, r \ge 0$ and $p + r \le 1$. Moreover, when $p + r < 1$ and $K$ is invertible, the concavity is strict unless $B_1 = B_2$ and $A_1 = A_2$.

Proof. It is an immediate consequence of Theorem 2 that $(A,B) \mapsto \mathrm{Tr}\, K^*A^pKB^{1-p}$ is jointly concave in $A, B$. Now write $\mathrm{Tr}\, K^*A^pKB^r = \mathrm{Tr}\, K^*A^pK(B^s)^{1-p}$ with $s = r/(1-p)$. First, observe that for $0 < s < 1$ the function $f(x) = x^s$ satisfies the hypotheses of Lemma 4. Therefore,
$$(\lambda B_1 + (1-\lambda)B_2)^s > \lambda B_1^s + (1-\lambda)B_2^s \tag{23}$$
with $0 < \lambda < 1$ and $B_1 \ne B_2$. The operator monotonicity of $x \mapsto x^{1-p}$ for $0 < p < 1$ then implies
$$(\lambda B_1 + (1-\lambda)B_2)^r > (\lambda B_1^s + (1-\lambda)B_2^s)^{1-p}, \tag{24}$$
and the joint concavity of $\mathrm{Tr}\, K^*A^pKB^{1-p}$ implies
$$\begin{aligned}
\mathrm{Tr}\, K^*A^pK(B^s)^{1-p} &\ge \mathrm{Tr}\, K^*(\lambda A_1 + (1-\lambda)A_2)^p K (\lambda B_1^s + (1-\lambda)B_2^s)^{1-p} \\
&\ge \lambda\, \mathrm{Tr}\, K^*A_1^pKB_1^{s(1-p)} + (1-\lambda)\, \mathrm{Tr}\, K^*A_2^pKB_2^{s(1-p)}
\end{aligned} \tag{25}$$
where $A = \lambda A_1 + (1-\lambda)A_2$, $B = \lambda B_1 + (1-\lambda)B_2$, which is precisely the joint concavity of $\mathrm{Tr}\, K^*A^pKB^r$. Moreover, equality in joint concavity implies equality in (25) and, since $K^*A^pK$ is strictly positive, this implies equality in (23). Therefore, equality in (25) gives a contradiction unless $B_1 = B_2$. In that case, the joint concavity reduces to concavity in $A$ for which, by a similar argument, equality holds if and only if $A_1 = A_2$.

Corollary 6. The function $(A,B) \mapsto \mathrm{Tr}\, K^*A^pKB^{1-r}$ is jointly convex on the set of positive definite matrices when $1 < r \le p \le 2$. Moreover, when $r < p$ and $K$ is invertible, the convexity is strict unless $B_1 = B_2$ and $A_1 = A_2$.
Proof. The argument is similar to that for Corollary 5. Write $\mathrm{Tr}\, K^*A^pKB^{1-r} = \mathrm{Tr}\, K^*A^pK(B^s)^{1-p}$ with $s = \frac{1-r}{1-p}$. Since $s \in (0,1)$ and $1 - p \in (-1,0)$ when $1 < r < p < 2$, it follows that $x^s$ is operator concave and $x^{1-p}$ is operator monotone decreasing.

2.3. Monotonicity under partial traces

Let $X$ and $Z$ denote the generalized Pauli operators whose action on the standard basis is $X|e_k\rangle = |e_{k+1}\rangle$ (with subscript addition mod $d$) and $Z|e_k\rangle = e^{i2\pi k/d}|e_k\rangle$. It is well known and easy to verify that $\frac{1}{d}\sum_k Z^kAZ^{-k}$ is the projection of a matrix onto its diagonal. If $D$ is a diagonal matrix, then $\sum_k X^kDX^{-k} = (\mathrm{Tr}\, D)I$. Now let $\{W_n\}_{n=1,2,\ldots,d^2}$ denote some ordering of the generalized Pauli operators, e.g., $W_{j+k(d-1)} = X^jZ^k$ with $j, k = 1, 2, \ldots, d$. Then $\frac{1}{d}\sum_n W_nAW_n^* = (\mathrm{Tr}\, A)I$ and
$$\frac{1}{d}\sum_n (W_n \otimes I_2) A_{12} (W_n \otimes I_2)^* = I_1 \otimes (\mathrm{Tr}_1 A) = I_1 \otimes A_2. \tag{26}$$
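The averaging identity (26) is easy to confirm numerically. The sketch below builds the $d^2$ operators $X^jZ^k$ and checks both $\frac{1}{d}\sum_n W_nAW_n^* = (\mathrm{Tr}\,A)I$ and (26); the function names and the particular ordering of the $W_n$ are choices made for this illustration only.

```python
import numpy as np


def generalized_paulis(d):
    """All d^2 products X^j Z^k of the shift and clock operators."""
    shift = np.roll(np.eye(d), 1, axis=0)                     # X|e_k> = |e_{k+1 mod d}>
    clock = np.diag(np.exp(2j * np.pi * np.arange(d) / d))    # Z|e_k> = e^{2 pi i k/d}|e_k>
    return [np.linalg.matrix_power(shift, j) @ np.linalg.matrix_power(clock, k)
            for j in range(d) for k in range(d)]


def twirl_first_factor(a12, d1, d2):
    """(1/d1) * sum_n (W_n x I_2) A_12 (W_n x I_2)^*."""
    total = np.zeros_like(a12, dtype=complex)
    for w in generalized_paulis(d1):
        big = np.kron(w, np.eye(d2))
        total += big @ a12 @ big.conj().T
    return total / d1


def trace_out_first(a12, d1, d2):
    """Partial trace Tr_1 of a matrix on C^d1 (x) C^d2."""
    return np.einsum('ijik->jk', a12.reshape(d1, d2, d1, d2))


rng = np.random.default_rng(0)
d1, d2 = 3, 2
a12 = rng.normal(size=(d1 * d2, d1 * d2)) + 1j * rng.normal(size=(d1 * d2, d1 * d2))
lhs = twirl_first_factor(a12, d1, d2)
rhs = np.kron(np.eye(d1), trace_out_first(a12, d1, d2))
print(np.allclose(lhs, rhs))
```

Taking `d2 = 1` reduces the check to the single-system statement $\frac{1}{d}\sum_n W_nAW_n^* = (\mathrm{Tr}\,A)I$.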
Using the fact that replacing $W_n$ by $UW_nU^*$ with $U$ unitary simply corresponds to a change of basis, which does not affect (26), and then multiplying both sides by $U^* \otimes I_2$ on the left and $U \otimes I_2$ on the right gives the equivalent expression
$$\frac{1}{d}\sum_n (W_nU^* \otimes I_2) A_{12} (W_nU^* \otimes I_2)^* = I_1 \otimes A_2. \tag{27}$$
Combining this with joint convexity yields a slight generalization of the well-known monotonicity of $J_p(K,A,B)$ under partial traces (MPT), first proved by Lieb in [20] for the case $K_{12} = I_1 \otimes K_2$ when $p \in (0,1)$.

Theorem 7. Let $J_p$ be as in (7), $A_{12}, B_{12}$ strictly positive in $M_{d_1} \otimes M_{d_2}$ and $K_{12} = V_1 \otimes K_2$ with $V_1$ unitary in $M_{d_1}$. Then
$$J_p(K_2, A_2, B_2) \le J_p(K_{12}, A_{12}, B_{12}). \tag{28}$$

Proof. Writing $W_n$ for $W_n \otimes I_2$ and $V$ for $V_1 \otimes I_2$ and using (27) gives
$$\begin{aligned}
J_p(K_2, A_2, B_2) &= \frac{1}{d_1}\, J_p(I_1 \otimes K_2,\ I_1 \otimes A_2,\ I_1 \otimes B_2) \\
&= \frac{1}{d_1}\, J_p\Bigl(I_1 \otimes K_2,\ \frac{1}{d_1}\sum_n W_nV^*A_{12}VW_n^*,\ \frac{1}{d_1}\sum_n W_nB_{12}W_n^*\Bigr) \\
&\le \frac{1}{d_1^2}\sum_n J_p\bigl(I_1 \otimes K_2,\ W_n(V_1^* \otimes I_2)A_{12}(V_1 \otimes I_2)W_n^*,\ W_nB_{12}W_n^*\bigr) \\
&= J_p(V_1 \otimes K_2, A_{12}, B_{12})
\end{aligned}$$
where the final equality follows from the unitary invariance of the trace.
Because $\mathrm{Tr}_{12}\, (V_1 \otimes K_2)A_{12}(V_1 \otimes K_2)^* = \mathrm{Tr}_2\, K_2A_2K_2^*$, (28) is equivalent to
$$\mathrm{Tr}\, K_2^*A_2^pK_2B_2^{1-p} - \mathrm{Tr}\, (V_1 \otimes K_2)^*A_{12}^p(V_1 \otimes K_2)B_{12}^{1-p} \ \begin{cases} \ge 0 & p \in (0,1), \\ \le 0 & p \in (1,2). \end{cases} \tag{29}$$
We can obtain a weak reversal of this for $p \in (0,1)$. The argument in the Appendix shows that for any $p$ and fixed $A, B \ge 0$ both $\mathrm{Tr}\, K^*A^pKB^{1-p}$ and $\mathrm{Tr}\, K^*AK$ are convex in $K$. This was observed earlier by Lieb ([20]) and also follows from the results in [24]. One can then apply the argument above in the special case $A_{12} = I_1 \otimes A_2$, $B_{12} = I_1 \otimes B_2$ to conclude that
$$\begin{aligned}
\mathrm{Tr}\, K_2^*A_2^pK_2B_2^{1-p} &\le \frac{1}{d_1}\, \mathrm{Tr}\, K_{12}^*(I_1 \otimes A_2)^pK_{12}(I_1 \otimes B_2)^{1-p} &\quad (30) \\
&\le \mathrm{Tr}\, K_{12}^*(I_1 \otimes A_2)^pK_{12}(I_1 \otimes B_2)^{1-p} &\quad (31)
\end{aligned}$$
independent of whether $p < 1$ or $p > 1$. However, because the term $\mathrm{Tr}\, K^*AK$ is convex rather than linear in $K$, (30) does not allow us to draw any conclusions about the monotonicity of $J_p(K_{12}, I_1 \otimes A_2, I_1 \otimes B_2)$.

To prove Theorem 7 we showed that joint convexity implies monotonicity; the reverse implication also holds. Let $A_1, \ldots, A_m, B_1, \ldots, B_m$ be positive definite matrices in $M_d$, $A = \sum_j A_j$, $B = \sum_j B_j$, and put
$$\tilde A_{12} = \sum_j |e_j\rangle\langle e_j| \otimes A_j, \qquad \tilde B_{12} = \sum_j |e_j\rangle\langle e_j| \otimes B_j, \tag{32}$$
for $e_1, \ldots, e_m$ the standard basis of $\mathbb{C}^m$. Then $\tilde A_{12}$ and $\tilde B_{12}$ are block diagonal, and $\tilde A_2 = \mathrm{Tr}_1 \tilde A_{12} = \sum_k A_k = A$ and similarly for $B$. Then if monotonicity under partial traces holds, one can conclude that
$$J_p(K, A, B) = J_p(K, \tilde A_2, \tilde B_2) \le J_p(I_1 \otimes K, \tilde A_{12}, \tilde B_{12}) = \sum_j J_p(K, A_j, B_j). \tag{33}$$
Thus, monotonicity under partial traces also directly implies joint convexity of $J_p$. Applying (28) in the case $K = I$, and $A_{12} \to A_{123}$ and $B_{12} \to A_{12} \otimes I_3$ gives
$$J_p(I_{23}, A_{23}, A_2 \otimes I_3) \le J_p(I_{123}, A_{123}, A_{12} \otimes I_3). \tag{34}$$
When $p = 1$, it follows from (7) that $J_1(I_{23}, A_{23}, A_2 \otimes I_3) = H(A_{23}, A_2 \otimes I_3) = -S(A_{23}) + S(A_2)$ where $S(A) = -\mathrm{Tr}\, A \log A$. Thus, (34) becomes $-S(A_{23}) + S(A_2) \le -S(A_{123}) + S(A_{12})$ or, equivalently,
$$S(A_2) + S(A_{123}) \le S(A_{12}) + S(A_{23}) \tag{35}$$
which is the standard form of SSA.
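Strong subadditivity in the form (35) can be illustrated on random states. The following sketch computes the required marginals of a tripartite density matrix and evaluates the gap $S(A_{12}) + S(A_{23}) - S(A_2) - S(A_{123})$, which (35) asserts is nonnegative; the helper names and the way random states are sampled are assumptions of this example.

```python
import numpy as np


def random_density_matrix(dim, rng):
    """Full-rank density matrix built from a complex Gaussian sample (illustrative helper)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real


def entropy(rho):
    """von Neumann entropy S(A) = -Tr A log A, ignoring numerically zero eigenvalues."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-14]
    return float(-(w * np.log(w)).sum())


def marginals(a123, d1, d2, d3):
    """Return A_12, A_23 and A_2 for a matrix on C^d1 (x) C^d2 (x) C^d3."""
    t = a123.reshape(d1, d2, d3, d1, d2, d3)
    a12 = np.einsum('abcdec->abde', t).reshape(d1 * d2, d1 * d2)
    a23 = np.einsum('abcaef->bcef', t).reshape(d2 * d3, d2 * d3)
    a2 = np.einsum('abcaec->be', t)
    return a12, a23, a2


def ssa_gap(a123, dims):
    """S(A_12) + S(A_23) - S(A_2) - S(A_123), nonnegative by (35)."""
    a12, a23, a2 = marginals(a123, *dims)
    return entropy(a12) + entropy(a23) - entropy(a2) - entropy(a123)


rng = np.random.default_rng(0)
dims = (2, 3, 2)
a123 = random_density_matrix(12, rng)
print("generic state gap:", ssa_gap(a123, dims))

# A product state A_1 (x) A_23 saturates the inequality.
product = np.kron(random_density_matrix(2, rng), random_density_matrix(6, rng))
print("product state gap:", ssa_gap(product, dims))
```

The product state is the simplest case of equality; the general equality conditions are the subject of Sec. 3.4.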
3. Equality for Joint Convexity of $J_p(K,A,B)$

3.1. Origin of necessary and sufficient conditions

Looking back at the proof of Theorem 2, we see that for $p \in (0,2)$, equality holds in the joint convexity of $J_p(K,A,B)$ if and only if equality holds in the joint convexity for each term in the integrand in (17). It should be clear from the argument given in the Appendix that this requires $M_j = 0$ for all $j$, with $M_j$ given by (70). This is easily seen to be equivalent to
$$(L_{A_j} + tR_{B_j})^{-1}(X_j) = (L_A + tR_B)^{-1}(X) \quad \text{for all } j, \tag{36}$$
with $A = \sum_j A_j$, $B = \sum_j B_j$, and $X = \sum_j X_j$ with $X_j = A_jK$ and/or $X_j = KB_j$. By writing $AK = L_A(K)$ in the former case and $KB = R_B(K)$ in the latter we obtain the conditions
$$(I + t\Delta_{A_jB_j}^{-1})^{-1}(K) = (I + t\Delta_{AB}^{-1})^{-1}(K) \quad \forall\, j \quad \forall\, t > 0, \tag{37a}$$
$$(\Delta_{A_jB_j} + tI)^{-1}(K) = (\Delta_{AB} + tI)^{-1}(K) \quad \forall\, j \quad \forall\, t > 0. \tag{37b}$$
From the integral representations (19), one might expect it to be necessary for either or both of (37a) and (37b) to hold depending on $p$. In fact, either will suffice because (37a) holds if and only if (37b) holds. Because $\Delta_{AB}$ is positive definite, by analytic continuation (37b) extends from $t > 0$ to the entire complex plane, except points $-t$ on the negative real axis for which $t \in \mathrm{spectrum}(\Delta_{AB})$. Therefore, by using the Cauchy integral formula, one finds that for any function $G$ analytic on $\mathbb{C} \setminus (-\infty, 0]$,
$$G(\Delta_{A_jB_j})(K) = G(\Delta_{AB})(K).$$

Theorem 8. For fixed $K$, and $A = \sum_j A_j$, $B = \sum_j B_j$, the following are equivalent:
(a) $J_p(K,A,B) = \sum_j J_p(K,A_j,B_j)$ for all $p \in (0,2)$.
(b) $J_p(K,A,B) = \sum_j J_p(K,A_j,B_j)$ for some $p \in (0,2)$.
(c) $(\Delta_{A_jB_j} + tI)^{-1}(K) = (\Delta_{AB} + tI)^{-1}(K)$ for all $j$ and for all $t > 0$.
(d) $A_j^{it}KB_j^{-it} = A^{it}KB^{-it}$ for all $j$ and for all $t > 0$.
(e) $(\log A - \log A_j)K = K(\log B - \log B_j)$ for all $j$.

Proof. Clearly (a) $\Rightarrow$ (b). The implications (b) $\Rightarrow$ (c) $\Rightarrow$ (d), as well as (b) $\Rightarrow$ (a), follow from the discussion above. Differentiation of (d) at $t = 0$ gives (d) $\Rightarrow$ (e), and it is straightforward to verify that (e) $\Rightarrow$ (b) with $p = 1$. Moreover, (d) implies $\sum_j \mathrm{Tr}\, K^*A_j^{it}KB_j^{1-it} = \mathrm{Tr}\, K^*A^{it}KB^{1-it}$ for all $t$, which implies (a) by analytic continuation.

3.2. Sufficient subalgebras

When $K = I$, we can obtain a more useful reformulation of the equality conditions by using results about sufficient subalgebras obtained in [14, 15, 33]. Since the definition and convexity properties of $J_p(I,A,B)$ extend by continuity to positive
semidefinite matrices, with $\ker B \subseteq \ker A$, we will formulate the conditions in this more general situation, using the conventions in Sec. 1.2.

Let $N \subseteq M_d$ be a subalgebra. Then there is a trace preserving conditional expectation $E_N$ from $M_d$ onto $N$, such that $\mathrm{Tr}\, AX = \mathrm{Tr}\, E_N(A)X$ for all $X \in N$. In particular, if $N = M_{d_1} \otimes I \subseteq M_{d_1} \otimes M_{d_2}$, then we have $E_N(A_{12}) = \mathrm{Tr}_2 A \otimes \frac{1}{d_2} I$. Let $Q_1, \ldots, Q_m \in M_d^+$ and assume that $\ker Q_m \subseteq \ker Q_j$ for all $j$. The subalgebra $N$ is said to be sufficient for $\{Q_1, \ldots, Q_m\}$ if there is a completely positive trace preserving map $T : N \to M_d$, such that $T(E_N(Q_j)) = Q_j$ for all $j = 1, \ldots, m$. This definition is due to Petz ([33, 32]) and it is a quantum generalization of the well known notion of sufficiency from classical statistics. In [33], it was shown that sufficient subalgebras can be characterized by the condition
$$H(Q_j, Q_m) = H(E_N(Q_j), E_N(Q_m)), \quad \text{for all } j.$$
We combine this with the results of the previous section to obtain other useful characterizations of sufficiency.

Theorem 9. Let $Q_1, \ldots, Q_m \in M_d^+$ be such that $\ker Q_m \subseteq \ker Q_j$ for all $j$. Let $N \subseteq M_d$ be a subalgebra. The following are equivalent.
(i) $N$ is sufficient for $\{Q_1, \ldots, Q_m\}$.
(ii) $E_N(Q_j)^{it} E_N(Q_m)^{-it} P_{(\ker Q_m)^\perp} = Q_j^{it} Q_m^{-it}$, for all $j$, $t \in \mathbb{R}$.
(iii) There exist $Q_{j,0} \in N^+$, and $D \in M_d^+$, such that $\ker D = \ker Q_m$, and $Q_j = Q_{j,0} D$ for $j = 1, \ldots, m$.
(iv) $J_p(I, Q_j, Q_m) = J_p(I, E_N(Q_j), E_N(Q_m))$ for all $j$ and some $p \in (0,1)$.

The proof of the conditions (i)–(iii) can be found in [14], see also [28]. The condition (iv) was proved in [15].

3.3. Equality conditions with $K = I$

Theorem 10. Let $A_1, \ldots, A_m$ and $B_1, \ldots, B_m$ be positive semi-definite matrices with $\ker B_j \subseteq \ker A_j$, and let $A = \sum_j A_j$, $B = \sum_j B_j$. Then the following are equivalent.
(a) $J_p(I,A,B) = \sum_j J_p(I,A_j,B_j)$ for all $p \in (0,2)$.
(b) $J_p(I,A,B) = \sum_j J_p(I,A_j,B_j)$ for some $p \in (0,2)$.
(c) $A_j^{it}B_j^{-it} = A^{it}B^{-it} P_{(\ker B_j)^\perp}$ for all $j$ and $t \in \mathbb{R}$.
(d) There are positive matrices $D_1, \ldots, D_m$, with $\ker D_j = \ker B_j$, such that $[A_j, D_j] = [B_j, D_j] = 0$, and with $D = \sum_j D_j$,
$$A_j = AD^{-1}D_j, \qquad B_j = BD^{-1}D_j. \tag{38}$$
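Condition (d) of Theorem 10 can be illustrated with a simple family satisfying (38). In the sketch below, which is only one convenient special case chosen for this example, we take $D_j = d_j \otimes I$, $A_j = d_j \otimes a_0$ and $B_j = d_j \otimes b_0$ with all matrices invertible, so that $A_j = AD^{-1}D_j$, $B_j = BD^{-1}D_j$ and the commutators vanish. The form $J_p(I,A,B) = \frac{1}{p(1-p)}(\mathrm{Tr}\,A - \mathrm{Tr}\,A^pB^{1-p})$ for $p \ne 1$ is assumed, and all names are illustrative.

```python
import numpy as np


def random_positive(dim, rng):
    """Well-conditioned positive definite matrix (illustrative helper)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return g @ g.conj().T + 0.1 * np.eye(dim)


def mpow(a, s):
    """Real power of a positive definite Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * w**s) @ v.conj().T


def j_p(a, b, p):
    """Assumed form of J_p(I, A, B) for p != 1."""
    return (np.trace(a).real - np.trace(mpow(a, p) @ mpow(b, 1 - p)).real) / (p * (1 - p))


def convexity_gap(a_list, b_list, p):
    """sum_j J_p(I, A_j, B_j) - J_p(I, sum_j A_j, sum_j B_j), nonnegative by joint convexity."""
    pieces = sum(j_p(a, b, p) for a, b in zip(a_list, b_list))
    return pieces - j_p(sum(a_list), sum(b_list), p)


rng = np.random.default_rng(1)
a0, b0 = random_positive(3, rng), random_positive(3, rng)
small = [random_positive(2, rng) for _ in range(3)]       # the factors d_j
structured_a = [np.kron(d, a0) for d in small]            # A_j = d_j (x) a0
structured_b = [np.kron(d, b0) for d in small]            # B_j = d_j (x) b0
generic_a = [random_positive(6, rng) for _ in range(3)]
generic_b = [random_positive(6, rng) for _ in range(3)]

for p in (0.5, 1.5):
    print(p, convexity_gap(structured_a, structured_b, p), convexity_gap(generic_a, generic_b, p))
```

For the structured family the gap vanishes up to rounding for every $p$, consistent with (d) $\Rightarrow$ (a), while generic families give a strictly positive gap.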
Proof. As in Sec. 3.1, (b) implies (36) on $(\ker B_j)^\perp$, with $X_j = B_j$, $X = B$. This gives
$$(\Delta_{A_jB_j} + tI)^{-1}(I) = (\Delta_{AB} + tI)^{-1}(I) \quad \text{on } (\ker B_j)^\perp. \tag{39}$$
Then (c) follows from the Cauchy integral formula as in Sec. 3.1.
To show (c) implies (d), we will use Theorem 9. First let $N = I \otimes M_d \subseteq M_m \otimes M_d$ and let $\tilde A_{12}, \tilde B_{12}$ be the block-diagonal matrices in $M_m \otimes M_d$ defined by (32). Clearly, we have $\ker \tilde A_{12} \supseteq \ker \tilde B_{12} = \sum_j |e_j\rangle\langle e_j| \otimes \ker B_j$ and $E_N(\tilde A_{12}) = \frac{1}{m} I \otimes A$, $E_N(\tilde B_{12}) = \frac{1}{m} I \otimes B$. Then (c) implies
$$E_N(\tilde A_{12})^{it} E_N(\tilde B_{12})^{-it} P_{(\ker \tilde B_{12})^\perp} = \tilde A_{12}^{it} \tilde B_{12}^{-it}$$
for all $t$. Then by using Theorem 9 with $Q_1 = \tilde A_{12}$, $Q_m = Q_2 = \tilde B_{12}$, we can conclude that there are positive matrices $A_0, B_0 \in M_d$ and $D_{12} \in (M_m \otimes M_d)^+$, such that $\ker D_{12} = \ker \tilde B_{12}$, $[I \otimes A_0, D_{12}] = [I \otimes B_0, D_{12}] = 0$ and
$$\tilde A_{12} = (I \otimes A_0)D_{12}, \qquad \tilde B_{12} = (I \otimes B_0)D_{12}. \tag{40}$$
Since $\tilde A_{12}, \tilde B_{12}$ are block diagonal, $D_{12} = \sum_j |e_j\rangle\langle e_j| \otimes D_j$ must also be block diagonal with $D_j \in M_d^+$, $\ker D_j = \ker B_j$, $[A_0, D_j] = [B_0, D_j] = 0$ for all $j$ and
$$A_j = A_0D_j, \qquad B_j = B_0D_j. \tag{41}$$
Taking $\mathrm{Tr}_1$ in (40) gives $A = A_0D$ and $B = B_0D$. Using this in (41) gives (38), which proves (d). The implications (d) $\Rightarrow$ (a) $\Rightarrow$ (b) are straightforward.

We return briefly to the case of arbitrary $K$. Note that if the condition (d) holds and $[D_j, K] = 0$ for all $j$, then $J_p(K,A,B) = \sum_j J_p(K,A_j,B_j)$ for all $p \in (0,2)$; this gives a sufficient, but not necessary, condition for equality if $K \ne I$. The next result reduces the case of $K$ unitary to $K = I$. Then, we can apply the conditions of Theorem 10 to $A_j$ and $KB_jK^*$.

Theorem 11. If $K$ is unitary, then $J_p(K,A,B) = \sum_j J_p(K,A_j,B_j)$ if and only if $J_p(I,A,KBK^*) = \sum_j J_p(I,A_j,KB_jK^*)$.
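Proof. When $K$ is unitary, then $KB^pK^* = (KBK^*)^p$, which implies $J_p(K,A,B) = J_p(I,A,KBK^*)$.

One can try to extend the results of this section to the case $\|K\| \le 1$, and hence to all $K$, by using the unitary dilation
$$\mathcal{U} = \begin{pmatrix} K & L \\ -L & K \end{pmatrix}$$
where $L = U(1 - |K|^2)^{1/2}$ and $K = U|K|$ is the polar decomposition. Then, with
$$\mathcal{A} = \begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix}, \qquad \mathcal{B} = \begin{pmatrix} B & 0 \\ 0 & 0 \end{pmatrix}$$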
we have $J_p(K,A,B) = J_p(\mathcal{U}, \mathcal{A}, \mathcal{B})$, so that we may use Theorem 11 to get conditions for equality. But note that the conditions of Theorem 10 require that $\ker \mathcal{U}\mathcal{B}_j\mathcal{U}^* \subseteq \ker \mathcal{A}_j$ and it can be shown that this implies $P_{(\ker A_j)^\perp} K P_{(\ker B_j)^\perp} K^* = P_{(\ker A_j)^\perp}$, where $P_N$ denotes a projection onto the subscripted space. In particular, if all $A_j$ and $B_j$ are invertible, this restricts us to unitary $K$.

3.4. Equality in monotonicity under partial trace

It is easy to see that when $A_{12} = A_1 \otimes A_2$ and $B_{12} = B_1 \otimes B_2$, then $J_p(I, A_{12}, B_{12}) = J_p(I, A_2, B_2)$ if and only if $A_1 = B_1$ with $\mathrm{Tr}\, A_1 = 1$. However, it is not necessary that $A_{12} = A_1 \otimes A_2$. The equality conditions are given by the following theorem.

Theorem 12. Let $K_{12} = I_{12}$ and $A_{12}, B_{12} \in B(H_1 \otimes H_2)^+$, with $\ker B_{12} \subseteq \ker A_{12}$. Equality holds in (28) if and only if
(i) $H_2 = \bigoplus_n H_n^L \otimes H_n^R$,
(ii) $A_{12} = \bigoplus_n A_n^L \otimes A_n^R$ with $A_n^L \in B(H_1 \otimes H_n^L)^+$ and $A_n^R \in B(H_n^R)^+$,
(iii) $B_{12} = \bigoplus_n B_n^L \otimes B_n^R$ with $B_n^L \in B(H_1 \otimes H_n^L)^+$ and $B_n^R \in B(H_n^R)^+$,
(iv) $A_n^L = B_n^L$ for all $n$.

Proof. Let us denote $A_j = \frac{1}{d_1} W_jA_{12}W_j^*$, $B_j = \frac{1}{d_1} W_jB_{12}W_j^*$, with $W_j$ defined as in the proof of Theorem 7. Then we get that equality in (28) is equivalent to
$$J_p\Bigl(I_{12}, \sum_j A_j, \sum_j B_j\Bigr) = \sum_j J_p(I_{12}, A_j, B_j).$$
By Theorem 10, equality for some $p$ implies equality for all $p$, so that $J_p(I_{12}, A_{12}, B_{12}) = J_p(I_2, \mathrm{Tr}_1 A, \mathrm{Tr}_1 B) = J_p(I_{12}, E_N(A_{12}), E_N(B_{12}))$ for $p \in (0,1)$, where $N$ is the subalgebra $I_1 \otimes B(H_2) \subseteq B(H_1 \otimes H_2)$. Hence $N$ is sufficient for $\{A_{12}, B_{12}\}$ and, by Theorem 9, there are some $A_R, B_R \in B(H_2)^+$ and $D \in B(H_1 \otimes H_2)^+$, $\ker D = \ker B_{12}$, such that $[(I_1 \otimes A_R), D] = [(I_1 \otimes B_R), D] = 0$ and
$$A_{12} = D(I_1 \otimes A_R), \qquad B_{12} = D(I_1 \otimes B_R). \tag{42}$$
Now let $M_1$ be the subalgebra in $B(H_2)$ generated by $A_R, B_R$. Then $D \in (I_1 \otimes M_1)' = B(H_1) \otimes M_1'$, where $M'$ denotes the commutant of $M$. There is a decomposition $H_2 = \bigoplus_n H_n^L \otimes H_n^R$, such that
$$M_1 = \bigoplus_n 1_n^L \otimes B(H_n^R), \qquad M_1' = \bigoplus_n B(H_n^L) \otimes 1_n^R,$$
and $D = \bigoplus_n D_n \otimes 1_n^R$, where $D_n \in B(H_1 \otimes H_n^L)$. Since $A_R, B_R \in M_1$, we get the result, with $A_n^L = B_n^L = D_n$. The converse can be verified directly.
Applying this result in the case A12 → A123 and B12 → A12 ⊗ I3 gives equality conditions in (34). Since these are independent of p, they are identical to the conditions, first given in [13], for equality in SSA (35) which corresponds to p = 1.
Corollary 13. Equality holds in (34) if and only if
(i) $H_2 = \bigoplus_n H_n^L \otimes H_n^R$.
(ii) $A_{123} = \bigoplus_n A_n^L \otimes A_n^R$ with $A_n^L \in B(H_1 \otimes H_n^L)$ and $A_n^R \in B(H_n^R \otimes H_3)$.

Proof. It suffices to let $A_{12} \to A_{123}$ and $B_{12} \to A_{12} \otimes I_3$ in Theorem 12.

To apply these results in Sec. 4, it is useful to observe that condition (ii) in Corollary 13 above can be written as
$$A_{123} = (F_L \otimes I_3)(I_1 \otimes F_R) \tag{43}$$
with $F_L \in B(H_1 \otimes H_2)^+$, $F_R \in B(H_2 \otimes H_3)^+$, $[F_L \otimes I_3, I_1 \otimes F_R] = 0$. Combining this with part (d) of Theorem 10 gives the following useful result, which essentially allows us to bypass the need to apply Theorem 10 to $J_p(I, A_j, W_nA_jW_n)$.

Corollary 14. Let $A_j \in M_{d_1} \otimes M_{d_2}$, $A = \sum_j A_j$. Then
$$J_p(I_{12}, A, (\mathrm{Tr}_2 A) \otimes I_2) = \sum_j J_p(I_{12}, A_j, (\mathrm{Tr}_2 A_j) \otimes I_2) \tag{44}$$
if and only if there are $D_j \in M_{d_1}^+$, such that $\ker D_j = \ker \mathrm{Tr}_2 A_j$, $[A_j, D_j \otimes I] = 0$ and $A_j = A(D^{-1}D_j \otimes I)$ with $D = \sum_j D_j$.

Proof. Let $\tilde A_{123} = \sum_j |e_j\rangle\langle e_j| \otimes A_j \in M_m \otimes M_{d_1} \otimes M_{d_2}$, then $A = \tilde A_{23} \in M_{d_1} \otimes M_{d_2}$ and (44) can be written as
$$J_p(I_{23}, \tilde A_{23}, \tilde A_2 \otimes I_3) = J_p(I_{123}, \tilde A_{123}, \tilde A_{12} \otimes I_3).$$
By (43), this is equivalent to the existence of $F_L$ and $F_R$, $[(F_L \otimes I_3), (I_1 \otimes F_R)] = 0$, such that $\tilde A_{123} = (F_L \otimes I_3)(I_1 \otimes F_R)$. Since $\tilde A_{(1)(23)}$ is block-diagonal, $F_L$ must be of the form $F_L = \sum_j |e_j\rangle\langle e_j| \otimes D_j$, so that $A_j = F_R(D_j \otimes I)$. Then $\mathrm{Tr}_2 A_j = D_j\, \mathrm{Tr}_2 F_R$, which implies that $\ker D_j \subseteq \ker \mathrm{Tr}_2 A_j$. If we let $P_j = P_{(\ker \mathrm{Tr}_2 A_j)^\perp}$, then $P_j$ commutes with $D_j$ and $A_j = (P_j \otimes I)A_j = (P_jD_j \otimes I)F_R$, so that we can assume that $\ker D_j = \ker \mathrm{Tr}_2 A_j$, by taking $P_jD_j$ instead of $D_j$. Taking $\mathrm{Tr}_1$ of (43) gives $A = (D \otimes I_3)F_R = F_R(D \otimes I_3)$ so that $A_j = A(D^{-1}D_j \otimes I)$.

4. Equality in Joint Convexity of Carlen–Lieb

Carlen and Lieb [8] obtained several convexity inequalities from those of the map
$$\Upsilon_{p,q}(K,A) \equiv \mathrm{Tr}\, (K^*A^pK)^{q/p} \tag{45}$$
using an identity which we write only for $q = 1$ and $p > 1$ in our notation as
$$\Upsilon_{p,1}(K,A) = (p-1) \inf\Bigl\{ J_p(K,A,X) + \frac{1}{p}\, \mathrm{Tr}\, X + \frac{1}{p(p-1)}\, \mathrm{Tr}\, K^*AK : X > 0 \Bigr\}. \tag{46}$$
We introduce the closely related quantity
$$\begin{aligned}
\hat\Upsilon_{p,1}(K,A) &= \inf\Bigl\{ J_p(K,A,X) + \frac{1}{p}\, \mathrm{Tr}\, X : X > 0 \Bigr\} &\quad (47) \\
&= \frac{1}{(p-1)} \Bigl[\Upsilon_{p,1}(K,A) - \frac{1}{p}\, \mathrm{Tr}\, K^*AK\Bigr] &\quad (48)
\end{aligned}$$
which is well-defined for all $p \in (0,2)$ and allows us to continue to treat the cases $p < 1$ and $p > 1$ simultaneously, as well as include the special case $p = 1$ for which
$$\begin{aligned}
\hat\Upsilon_{1,1}(K,A) &= -\mathrm{Tr}\, K^*AK \log(K^*AK) + \mathrm{Tr}\, K^*(A \log A)K + \mathrm{Tr}\, K^*AK \\
&= S(K^*AK) + \mathrm{Tr}\, KK^*A \log A + \mathrm{Tr}\, K^*AK.
\end{aligned} \tag{49}$$
Since we are dealing with finite dimensional spaces, the infimum in (46) has a minimizer which satisfies
$$X_{\min} = (K^*A^pK)^{1/p}. \tag{50}$$
For fixed $K$, let $X_j$ denote the minimizer associated with $A_j$. Then
$$\begin{aligned}
\hat\Upsilon_{p,1}(K,A_1) + \hat\Upsilon_{p,1}(K,A_2) &= J_p(K,A_1,X_1) + \frac{1}{p}\, \mathrm{Tr}\, X_1 + J_p(K,A_2,X_2) + \frac{1}{p}\, \mathrm{Tr}\, X_2 \\
&\ge J_p(K, A_1 + A_2, X_1 + X_2) + \frac{1}{p}\, \mathrm{Tr}\, (X_1 + X_2) &\quad (51) \\
&\ge \inf\Bigl\{ J_p(K, A_1 + A_2, X) + \frac{1}{p}\, \mathrm{Tr}\, X : X > 0 \Bigr\} \\
&= \hat\Upsilon_{p,1}(K, A_1 + A_2) &\quad (52)
\end{aligned}$$
which proves convexity of $\hat\Upsilon_{p,1}$. Note that equality above requires both $X = \sum_j X_j$ and $J_p(K,A,X) = \sum_j J_p(K,A_j,X_j)$, where $X$ is the minimizer associated with $A$.

Now we introduce some notation following the strategy in the published version of [8]. Let $|\mathbf{1}\rangle$ denote the vector $(1, 1, \ldots, 1)$ with all components 1 and $|e_1\rangle$ the
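vector $(1, 0, \ldots, 0)$. Define
$$\mathcal{K} = I \otimes |\mathbf{1}\rangle\langle e_1| = \begin{pmatrix} I & 0 & \cdots & 0 \\ I & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ I & 0 & \cdots & 0 \end{pmatrix} \tag{53}$$
and
$$\mathcal{A}_j = \sum_k A_{jk} \otimes |e_k\rangle\langle e_k| = \begin{pmatrix} A_{j1} & 0 & 0 & \cdots \\ 0 & A_{j2} & 0 & \cdots \\ 0 & 0 & A_{j3} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}, \tag{54}$$
and $\mathcal{A} = \sum_j \mathcal{A}_j = \sum_k A_k \otimes |e_k\rangle\langle e_k|$ with $A_k = \sum_j A_{jk}$. Then
$$\mathcal{K}^*\mathcal{A}^p\mathcal{K} = \sum_k A_k^p \otimes |e_1\rangle\langle e_1|.$$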
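With this notation, we make some definitions following Carlen and Lieb but modified to allow a unified treatment of $p \in (0,2)$.
$$\Phi_{(p,1)}(\mathcal{A}) = \Phi_{(p,1)}(A_1, A_2, A_3, \ldots) \equiv \Upsilon_{p,1}(\mathcal{K}, \mathcal{A}) = \mathrm{Tr}\, (A_1^p + A_2^p + A_3^p + \cdots)^{1/p}, \tag{55}$$
$$\hat\Phi_{(p,1)}(\mathcal{A}) = \hat\Phi_{(p,1)}(A_1, A_2, A_3, \ldots) \equiv \hat\Upsilon_{p,1}(\mathcal{K}, \mathcal{A}) = \frac{1}{(p-1)} \Bigl[\Phi_{(p,1)}(A_1, A_2, A_3, \ldots) - \frac{1}{p}\, \mathrm{Tr} \sum_k A_k\Bigr]. \tag{56}$$
The definitions of $\Phi$ and $\hat\Phi$ apply only when $\mathcal{A}$ is a block diagonal matrix in $M_{d_1} \otimes M_{d_2}$. We now extend this to arbitrary matrices $A_{12} \in M_{d_1} \otimes M_{d_2}$:
$$\Psi_{(p,1)}(A_{12}) \equiv \mathrm{Tr}_1\, (\mathrm{Tr}_2 A_{12}^p)^{1/p}, \tag{57}$$
$$\hat\Psi_{(p,1)}(A_{12}) \equiv \frac{1}{(p-1)} \Bigl[\Psi_{(p,1)}(A_{12}) - \frac{1}{p}\, \mathrm{Tr}\, A_{12}\Bigr]. \tag{58}$$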
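For $p = 1$, the formulas with hats are related to the conditional entropy, from which they differ by a constant:
$$\begin{aligned}
\hat\Phi_{(1,1)}(A_1, A_2, A_3, \ldots) - \mathrm{Tr}\, A_{12} &= -\mathrm{Tr}\, \Bigl(\sum_k A_k\Bigr) \log \Bigl(\sum_k A_k\Bigr) + \sum_k \mathrm{Tr}\, A_k \log A_k \\
&= S\Bigl(\sum_k A_k\Bigr) - S(A_{12}) \\
&= J_1(I, A_{12}, \mathrm{Tr}_2 A_{12} \otimes I_2),
\end{aligned} \tag{59}$$
$$\hat\Psi_{(1,1)}(A_{12}) - \mathrm{Tr}\, A_{12} = S(A_1) - S(A_{12}) = H(A_{12}, A_1 \otimes I_2). \tag{60}$$
When $A_{12}$ is block diagonal, $\Psi_{(p,1)}(A_{12}) = \Phi_{(p,1)}(A_{12})$ with the understanding that $\mathrm{Tr}_2 A_{12} = \sum_k A_k$. Now let $W_n$ denote the generalized Pauli matrices as in Sec. 2.3, $\mathcal{W}_n = I_1 \otimes W_n$, and define
$$\mathcal{A}_{123} = \sum_n \mathcal{W}_nA_{12}\mathcal{W}_n^* \otimes |e_n\rangle\langle e_n| = \bigoplus_n \mathcal{W}_nA_{12}\mathcal{W}_n^* \tag{61}$$
so that $\mathcal{A}_{123}$ is block diagonal with blocks $\mathcal{W}_nA_{12}\mathcal{W}_n^*$. Then
$$d_2^{\frac{1+p}{p}}\, \Psi_{(p,1)}(A_{12}) = \Phi(\mathcal{A}_{(12)(3)}) = \Phi(\mathcal{W}_1A_{12}\mathcal{W}_1^*, \mathcal{W}_2A_{12}\mathcal{W}_2^*, \ldots). \tag{62}$$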
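The scaling relation (62) follows from the fact that averaging over the operators $I_1 \otimes W_n$ replaces the second factor by a multiple of the identity, and it can be checked directly. The sketch below evaluates both sides using the definitions (55) and (57); the helper names and the sampling of the test matrix are assumptions made for this illustration.

```python
import numpy as np


def random_positive(dim, rng):
    """Well-conditioned positive definite matrix (illustrative helper)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return g @ g.conj().T + 0.1 * np.eye(dim)


def mpow(a, s):
    """Real power of a positive definite Hermitian matrix via its eigendecomposition."""
    a = (a + a.conj().T) / 2  # remove rounding-level asymmetry
    w, v = np.linalg.eigh(a)
    return (v * w**s) @ v.conj().T


def generalized_paulis(d):
    """All d^2 products X^j Z^k of the shift and clock operators."""
    shift = np.roll(np.eye(d), 1, axis=0)
    clock = np.diag(np.exp(2j * np.pi * np.arange(d) / d))
    return [np.linalg.matrix_power(shift, j) @ np.linalg.matrix_power(clock, k)
            for j in range(d) for k in range(d)]


def psi(a12, d1, d2, p):
    """Psi_(p,1)(A_12) = Tr_1 (Tr_2 A_12^p)^(1/p), as in (57)."""
    ap = mpow(a12, p).reshape(d1, d2, d1, d2)
    reduced = np.einsum('ijkj->ik', ap)  # partial trace over the second factor
    return np.trace(mpow(reduced, 1.0 / p)).real


def phi(blocks, p):
    """Phi_(p,1)(A_1, A_2, ...) = Tr (sum_k A_k^p)^(1/p), as in (55)."""
    return np.trace(mpow(sum(mpow(b, p) for b in blocks), 1.0 / p)).real


def pauli_blocks(a12, d1, d2):
    """The blocks (I_1 x W_n) A_12 (I_1 x W_n)^* appearing in (61)."""
    blocks = []
    for w in generalized_paulis(d2):
        big = np.kron(np.eye(d1), w)
        blocks.append(big @ a12 @ big.conj().T)
    return blocks


rng = np.random.default_rng(2)
d1, d2 = 2, 3
a12 = random_positive(d1 * d2, rng)
for p in (0.5, 1.5):
    lhs = d2 ** ((1 + p) / p) * psi(a12, d1, d2, p)
    rhs = phi(pauli_blocks(a12, d1, d2), p)
    print(p, lhs, rhs)
```

The two printed values agree for each $p$ up to rounding.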
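It is straightforward to show that for $p \in (0,2)$ the functions $\hat\Phi_{(p,1)}(\mathcal{A})$ and $\hat\Psi_{(p,1)}(A)$ are all convex in $A$, inheriting this property from the quantities from which they are defined. In view of (59) and (60), the conditions for equality in the next two theorems are not surprising.

Theorem 15. The function $\hat\Phi_{(p,1)}(\mathcal{A})$ is convex in $\mathcal{A}$ for $p \in (0,2)$. Moreover, the following are equivalent:
(i) $J_p(I, \mathcal{A}, (\mathrm{Tr}_2 \mathcal{A}) \otimes I_2) = \sum_j J_p(I, \mathcal{A}_j, (\mathrm{Tr}_2 \mathcal{A}_j) \otimes I_2)$,
(ii) There are matrices $D_j > 0$, $D = \sum_j D_j$, such that $[A_{jk}, D_j] = 0$, $\ker D_j = \ker(\sum_k A_{jk})$ and $A_{jk} = A_kD^{-1}D_j$,
(iii) $\hat\Phi_{(p,1)}(A_1, A_2, A_3, \ldots) = \sum_j \hat\Phi_{(p,1)}(A_{j1}, A_{j2}, A_{j3}, \ldots)$.

Proof. It follows from Corollary 14 and the fact that $\mathcal{A}_j$ are block-diagonal that (i) $\Leftrightarrow$ (ii) and it is straightforward to verify that (ii) $\Rightarrow$ (iii). Moreover, (iii) implies (i) for $p = 1$, by (59). To show that (iii) implies (ii) for $p \ne 1$, observe that (iii) implies $\hat\Upsilon_{p,1}(\mathcal{K}, \mathcal{A}) = \sum_j \hat\Upsilon_{p,1}(\mathcal{K}, \mathcal{A}_j)$, and this implies
$$J_p(\mathcal{K}, \mathcal{A}, \mathcal{X}) = \sum_j J_p(\mathcal{K}, \mathcal{A}_j, \mathcal{X}_j) \tag{63}$$
where $\mathcal{X}_j = (\mathcal{K}^*\mathcal{A}_j^p\mathcal{K})^{1/p} = X_j \otimes |e_1\rangle\langle e_1|$ and $\mathcal{X} = \sum_j \mathcal{X}_j = (\mathcal{K}^*\mathcal{A}^p\mathcal{K})^{1/p} = X \otimes |e_1\rangle\langle e_1|$, with $X_j = (\sum_k A_{jk}^p)^{1/p}$ and $X = (\sum_k A_k^p)^{1/p}$. Since
$$\mathcal{K}^*\mathcal{A}_j^p\mathcal{K}\mathcal{X}_j^{1-p} = \sum_k A_{jk}^pX_j^{1-p} \otimes |e_1\rangle\langle e_1|,$$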
with a similar expression for $\mathcal{K}^*\mathcal{A}^p\mathcal{K}\mathcal{X}^{1-p}$, we find
$$\sum_k J_p(I, A_k, X) = J_p(\mathcal{K}, \mathcal{A}, \mathcal{X}) = \sum_j J_p(\mathcal{K}, \mathcal{A}_j, \mathcal{X}_j) = \sum_{k,j} J_p(I, A_{jk}, X_j).$$
Convexity then implies that we must have
$$J_p(I, A_k, X) = \sum_j J_p(I, A_{jk}, X_j) \quad \forall\, k. \tag{64}$$
Since $\ker X_j \subseteq \ker A_{jk}$, Theorem 10 implies that
$$A_k^{it}X^{-it} P_{(\ker X_j)^\perp} = A_{jk}^{it}X_j^{-it}, \quad \text{for all } k, j, t.$$
After writing $\tilde A_k = \sum_j |e_j\rangle\langle e_j| \otimes A_{jk}$, $\tilde X = \sum_j |e_j\rangle\langle e_j| \otimes X_j$, this reads
$$\tilde A_k^{it} \tilde X^{-it} = \Bigl(\frac{1}{m} I \otimes \mathrm{Tr}_1 \tilde A_k\Bigr)^{it} \Bigl(\frac{1}{m} I \otimes \mathrm{Tr}_1 \tilde X\Bigr)^{-it} P_{(\ker \tilde X)^\perp}, \tag{65}$$
so that, by Theorem 9, there are elements $B_k \in M_d^+$ and $D \in (M_m \otimes M_d)^+$, such that $\ker D = \ker \tilde X$, $[(I \otimes B_k), D] = 0$ and $\tilde A_k = (I \otimes B_k)D$. As before, one finds $D = \sum_j |e_j\rangle\langle e_j| \otimes D_j$ for some $D_j \in M_d^+$, which implies (ii).

Theorem 16. The function $\hat\Psi_{(p,1)}(A_{12})$ is convex in $A_{12}$ for $p \in (0,2)$. Moreover, if we let $\mathcal{A}_{123}$ denote the block diagonal matrix with blocks $\mathcal{W}_nA\mathcal{W}_n^*$, the following are equivalent:
(i) $J_p(I, \mathcal{A}_{123}, \mathcal{A}_1 \otimes I_{23}) = \sum_j J_p(I, (\mathcal{A}_{123})_j, (\mathcal{A}_1)_j \otimes I_{23})$ with $\mathcal{A}_{123}$ defined by (61),
(ii) There are matrices $D_j \in M_{d_1}^+$, $D = \sum_j D_j$, such that $\ker D_j = \ker (A_1)_j$, $[A_j, D_j \otimes I] = 0$ and $A_j = A(D^{-1}D_j \otimes I)$.
(iii) $\hat\Psi_{(p,1)}(A) = \sum_j \hat\Psi_{(p,1)}(A_j)$.

Proof. It follows from the definition of $\mathcal{A}_{123}$ that $d_2^{\frac{1+p}{p}}\, \Psi_{(p,1)}(A) = \Phi(\mathcal{A}_{123})$. The equivalence (i) $\Leftrightarrow$ (iii) follows immediately from Theorem 15, and (i) $\Leftrightarrow$ (ii) can be shown to follow from Corollary 14.

Theorem 17. The following monotonicity inequalities hold,
$$\hat\Psi_{(p,1)}(A_{23}) \le \hat\Psi_{(p,1)}(A_{123}), \qquad p \in (0,2), \tag{66a}$$
$$\Psi_{(p,1)}(A_{23}) \ge \Psi_{(p,1)}(A_{123}), \qquad p \in (0,1), \tag{66b}$$
$$\Psi_{(p,1)}(A_{23}) \le \Psi_{(p,1)}(A_{123}), \qquad p \in [1,2). \tag{66c}$$
Moreover, equality holds if and only if the conditions of Corollary 13 are satisfied.

Proof. It suffices to give the proof for $\hat\Psi$ since the other inequalities follow immediately. The argument is similar to that for Theorem 7. Let $W_n$ denote the generalized Pauli matrices of Sec. 2.3, but now let $\mathcal{W}_n = W_n \otimes I_{23}$. Then the convexity
of $\hat\Psi_{(p,1)}(A_{23})$ implies
$$\begin{aligned}
\hat\Psi_{(p,1)}(A_{23}) &= \frac{1}{d_1}\, \hat\Psi_{(p,1)}(I_1 \otimes A_{23}) = \frac{1}{d_1}\, \hat\Psi_{(p,1)}\Bigl(\frac{1}{d_1}\sum_n \mathcal{W}_nA_{123}\mathcal{W}_n^*\Bigr) \\
&\le \frac{1}{d_1^2}\sum_n \hat\Psi_{(p,1)}(\mathcal{W}_nA_{123}\mathcal{W}_n^*) = \hat\Psi_{(p,1)}(A_{123})
\end{aligned}$$
where we used the invariance of $\hat\Psi$ under unitaries of the form $U_1 \otimes I_{23}$. In the case $p = 1$, it follows from (60) that $\hat\Psi_{(1,1)}(A_{23}) \le \hat\Psi_{(1,1)}(A_{123})$ becomes
$$S(A_2) - S(A_{23}) \le S(A_{12}) - S(A_{123}) \tag{67}$$
which is SSA. Because the equality conditions in Theorem 16 are independent of $p$, they are identical to those for SSA, which are given in Corollary 13.

The Carlen–Lieb triple Minkowski inequality for the case $q = 1$ is an immediate corollary of Theorem 17. Observe that
$$\mathrm{Tr}_3 \mathrm{Tr}_1\, (\mathrm{Tr}_2 A_{123}^p)^{1/p} = \Psi_{(p,1)}(A_{(13),(2)}), \tag{68a}$$
$$\mathrm{Tr}_3\, [\mathrm{Tr}_2\, (\mathrm{Tr}_1 A_{123})^p]^{1/p} = \Psi_{(p,1)}(A_{32}), \tag{68b}$$
so that it follows immediately from (66c) that
$$\mathrm{Tr}_3\, [\mathrm{Tr}_2\, (\mathrm{Tr}_1 A_{123})^p]^{1/p} = \Psi_{(p,1)}(A_{32}) \le \Psi_{(p,1)}(A_{132}) = \mathrm{Tr}_3 \mathrm{Tr}_1\, (\mathrm{Tr}_2 A_{123}^p)^{1/p} \tag{69}$$
for $1 < p \le 2$ and from (66b) that the inequality reverses for $0 < p < 1$. Moreover, the conditions for equality are again independent of $p$ and identical to those for equality in SSA, given in Corollary 13.

5. Final Remarks

It should be clear that the results in Sec. 2 are not restricted to $J_p(K,A,B)$. The function $g_p(x)$ given in (6) can be replaced by any operator convex function of the form $g(x) = xf(x)$ with $f$ operator monotone on $(0,\infty)$. Moreover, if the measure $\nu(t)$ in (17) is supported on $(0,\infty)$, then the conditions for equality are identical to those in Sec. 3. In particular, our results go through with $g_p$ replaced by $\tilde g_p$ and $J_p(I,A,B)$ replaced by $\tilde J_p(I,A,B)$, which is well-defined for $p \in [-1,1)$ with $\tilde J_0(I,A,B) = H(B,A)$. Thus our results can be extended to all $p \in (-1,2)$.

The case $p = 2$ reduces to the convexity of $(A,X) \mapsto \mathrm{Tr}\, X^*A^{-1}X$ with $A > 0$ proved in [24]. One can show that equality holds if and only if $X_j = A_jT$ for all $j$ with $T = A^{-1}X$. We recently learned that Kiefer ([16]) proved the $p = 2$ convexity, by a different method, much earlier and also found these equality conditions.
There have been various attempts, e.g., the Rényi ([35]) and Tsallis ([40]) entropies, to generalize quantum entropy in a way that gives the usual von Neumann entropy at $p = 1$. In this paper we have considered two extensions of the conditional entropy involving an exponent $p \in (0,2)$, namely,
• $J_p(I, A_{12}, A_1)$, which gives $\mathrm{Tr}\, A_{23}^pA_2^{1-p} \ge \mathrm{Tr}\, A_{123}^pA_{12}^{1-p}$ for $p \in (0,1)$ and $\mathrm{Tr}\, A_{23}^pA_2^{1-p} \le \mathrm{Tr}\, A_{123}^pA_{12}^{1-p}$ for $p \in (1,2)$, in agreement with (29), and can be thought of as a pseudo-metric; and
• $\hat\Psi_{(p,1)}(A_{12})$, which gives $\mathrm{Tr}_2\, (\mathrm{Tr}_3 A_{23}^p)^{1/p} \ge \mathrm{Tr}_{12}\, (\mathrm{Tr}_3 A_{123}^p)^{1/p}$ for $p \in (0,1)$ and $\mathrm{Tr}_2\, (\mathrm{Tr}_3 A_{23}^p)^{1/p} \le \mathrm{Tr}_{12}\, (\mathrm{Tr}_3 A_{123}^p)^{1/p}$ for $p \in (1,2)$, and can be thought of as a pseudo-norm.

These expressions are quite different for $p \ne 1$, but arise from quantities with the same convexity and monotonicity properties, as well as the same equality conditions, which are independent of $p$. Moreover, both yield SSA at $p = 1$ and the equality conditions for $p \ne 1$ are identical to those for SSA. This independence of non-trivial equality conditions on the precise form of the function seems remarkable. If one uses $\tilde g_p$ and $\tilde J_p(I,A,B)$ from (10), then the inequalities above hold with $p \in (1,2)$ replaced by $p \in (-1,0)$ and SSA corresponds to $p = 0$.

Acknowledgments

The first-named author was supported by the grants VEGA 2/0032/09, APVV-0071-06, Center of Excellence SAS - Quantum Technologies and ERDF OP R&D Project CE QUTE ITMS 26240120009. The second-named author was partially supported by National Science Foundation under Grant DMS-0604900.

Appendix. Proof of the Key Schwarz Inequality

For completeness, we include the proof of the joint convexity of $(A,B,X) \mapsto \mathrm{Tr}\, X^*(L_A + tR_B)^{-1}(X)$ when $A, B > 0$ and $t > 0$. Since this function is homogeneous of degree one, it suffices to prove subadditivity. Now let
$$M_j = (L_{A_j} + tR_{B_j})^{-1/2}(X_j) - (L_{A_j} + tR_{B_j})^{1/2}(\Lambda). \tag{70}$$
Then one can verify that
$$\begin{aligned}
0 \le \sum_j \mathrm{Tr}\, M_j^*M_j &= \sum_j \langle M_j, M_j\rangle \\
&= \sum_j \mathrm{Tr}\, X_j^*(L_{A_j} + tR_{B_j})^{-1}(X_j) - \mathrm{Tr}\, \Bigl(\sum_j X_j^*\Bigr)\Lambda - \mathrm{Tr}\, \Lambda^* \sum_j X_j + \mathrm{Tr}\, \Lambda^* \sum_j (L_{A_j} + tR_{B_j})\Lambda.
\end{aligned} \tag{71}$$
1120
2010 10:2 WSPC/S0129-055X
148-RMP
A. Jenˇ cov´ a & M. B. Ruskai
Next, observe that for any matrix $W$,
$$\sum_j (L_{A_j} + tR_{B_j})(W) = \sum_j (A_jW + tWB_j) = L_{\sum_j A_j}(W) + tR_{\sum_j B_j}(W).$$
Therefore, inserting the choice $\Lambda = (L_{\sum_j A_j} + tR_{\sum_j B_j})^{-1}(\sum_j X_j)$ in (71) yields
$$\mathrm{Tr}\, \Bigl(\sum_j X_j\Bigr)^* \frac{1}{L_{\sum_j A_j} + tR_{\sum_j B_j}} \Bigl(\sum_j X_j\Bigr) \le \sum_j \mathrm{Tr}\, X_j^* \frac{1}{L_{A_j} + tR_{B_j}}(X_j) \tag{72}$$
for any $t \ge 0$.

References

[1] N. I. Akhiezer and I. M. Glazman, Theory of Operators in Hilbert Space, Vol. II (Frederick Ungar Publishing, NY, 1963).
[2] S. Amari and H. Nagaoka, Methods of Information Geometry, Translations of Mathematical Monographs, Vol. 191 (American Mathematical Society and Oxford University Press, 2000).
[3] T. Ando, Topics on Operator Inequalities, Lecture Notes (Hokkaido University, 1978).
[4] T. Ando, Concavity of certain maps on positive definite matrices and applications to Hadamard products, Lin. Alg. Appl. 26 (1979) 203–241.
[5] H. Araki, Relative entropy of states of von Neumann algebras, Publ. RIMS Kyoto Univ. 9 (1976) 809–833.
[6] T. Bekjan, On joint convexity of trace functions, Lin. Alg. Appl. 390 (2004) 321–327.
[7] E. Carlen and E. Lieb, A Minkowski type trace inequality and strong subadditivity of quantum entropy, Amer. Math. Soc. Trans. 189(2) (1999) 59–62; Reprinted in [21].
[8] E. A. Carlen and E. H. Lieb, A Minkowski type trace inequality and strong subadditivity of quantum entropy II: Convexity and concavity, Lett. Math. Phys. 83 (2008) 107–126; arXiv:0710.4167.
[9] I. Devetak and J. Yard, Exact cost of redistributing multipartite quantum states, Phys. Rev. Lett. 100 (2008) 230501, 4 pp.
[10] W. F. Donoghue Jr., Monotone Matrix Functions and Analytic Continuation (Springer, 1974).
[11] E. G. Effros, A matrix convexity approach to some celebrated quantum inequalities, Proc. Natl. Acad. Sci. 106 (2009) 1006–1008; arXiv:0802.1234.
[12] H. Epstein, Remarks on two theorems of E. Lieb, Comm. Math. Phys. 31 (1973) 317–325.
[13] P. Hayden, R. Jozsa, D. Petz and A. Winter, Structure of states which satisfy strong subadditivity of quantum entropy with equality, Comm. Math. Phys. 246 (2004) 359–374; arXiv:quant-ph/0304007.
[14] A. Jenčová and D. Petz, Sufficiency in quantum statistical inference, Comm. Math. Phys. 263 (2006) 259–276; arXiv:math-ph/0412093.
[15] A. Jenčová and D. Petz, Sufficiency in quantum statistical inference. A survey with examples, J. Infin. Dimens. Anal. Quantum Prob. Relat. Top. 9 (2006) 331–352; arXiv:quant-ph/0604091.
[16] J. Kiefer, Optimum experimental designs, J. Roy. Statist. Soc. Ser. B 21 (1959) 272–310.
[17] O. Klein, Zur quantenmechanischen Begründung des zweiten Hauptsatzes der Wärmelehre, Z. Phys. 72 (1931) 767–775.
[18] A. Lesniewski and M. B. Ruskai, Relative entropy and monotone Riemannian metrics on non-commutative probability space, J. Math. Phys. 40 (1999) 5702–5724.
[19] S. Luo, N. Li and X. Cao, Relation between “no broadcasting” for noncommuting states and “no local broadcasting” for quantum correlations, Phys. Rev. A 79 (2009) 054305, 3 pp.
[20] E. H. Lieb, Convex trace functions and the Wigner–Yanase–Dyson conjecture, Adv. Math. 11 (1973) 267–288; Reprinted in [21].
[21] M. Loss and M. B. Ruskai (eds.), Inequalities: Selecta of E. Lieb (Springer, 2002).
[22] E. H. Lieb and M. B. Ruskai, A fundamental property of the quantum-mechanical entropy, Phys. Rev. Lett. 30 (1973) 434–436; Reprinted in [21].
[23] E. H. Lieb and M. B. Ruskai, Proof of the strong subadditivity of quantum mechanical entropy, J. Math. Phys. 14 (1973) 1938–1941; Reprinted in [21].
[24] E. H. Lieb and M. B. Ruskai, Some operator inequalities of the Schwarz type, Adv. Math. 12 (1974) 269–273; Reprinted in [21].
[25] E. H. Lieb and W. Thirring, Inequalities for the moments of the eigenvalues of the Schrödinger Hamiltonian and their relation to Sobolev inequalities, in Studies in Mathematical Physics, eds. E. Lieb, B. Simon and A. Wightman (Princeton University Press, 1976), pp. 269–303; Reprinted in [21].
[26] G. Lindblad, Expectations and entropy inequalities, Comm. Math. Phys. 39 (1974) 111–119.
[27] K. Löwner, Über monotone Matrixfunktionen, Math. Z. 38 (1934) 177–216.
[28] M. Mosonyi and D. Petz, Structure of sufficient quantum coarse-grainings, Lett. Math. Phys. 68 (2004) 19–30.
[29] H. Narnhofer and W. Thirring, From relative entropy to entropy, Fizika 17 (1985) 257–265.
[30] M. Ohya and D. Petz, Quantum Entropy and Its Use, 2nd edn. (Springer-Verlag, 2004).
[31] D. Petz, Quasi-entropies for finite quantum systems, Rep. Math. Phys. 23 (1986) 57–65.
[32] D. Petz, Sufficiency of channels over von Neumann algebras, Quart. J. Math. 39 (1988) 907–1008.
[33] D. Petz, Sufficient subalgebras and the relative entropy of states of a von Neumann algebra, Comm. Math. Phys. 105 (1986) 123–131.
[34] D. Petz, Monotone Metrics on Matrix Spaces, Lin. Alg. Appl. 244 (1996) 81–96.
[35] A. Rényi, On measures of entropy and information, in Proc. 4th Berkeley Sympos. Math. Statist. and Prob., Vol. I (Univ. California Press, Berkeley, 1961), pp. 547–561.
[36] M. B. Ruskai, Inequalities for quantum entropy: A review with conditions for equality, J. Math. Phys. 43 (2002) 4358–4375; Erratum ibid., 46 (2005) 019901, quant-ph/0205064.
[37] M. B. Ruskai, Another short and elementary proof of strong subadditivity of quantum entropy, Rep. Math. Phys. 60 (2007) 1–12; arXiv:quant-ph/0604206.
[38] D. Ruelle, Statistical Mechanics (Benjamin, 1969).
[39] B. Simon, The Statistical Mechanics of Lattice Gases (Princeton Univ. Press, 1993).
[40] C. Tsallis, Possible generalization of Boltzmann–Gibbs statistics, J. Stat. Phys. 52 (1988) 479–487.
[41] A. Wehrl, General properties of entropy, Rev. Mod. Phys. 50 (1978) 221–260.
[42] E. P. Wigner and M. M. Yanase, Information content of distributions, Proc. Nat. Acad. Sci. 49 (1963) 910–918.
[43] E. P. Wigner and M. M. Yanase, On the positive semi-definite nature of certain matrix expressions, Canad. J. Math. 16 (1964) 397–406.
November 16, J070-S0129055X1000417X
2010 15:27 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 22, No. 10 (2010) 1123–1145 © World Scientific Publishing Company DOI: 10.1142/S0129055X1000417X
ON THE HERMAN–KLUK SEMICLASSICAL APPROXIMATION
DIDIER ROBERT
Département de Mathématiques, Laboratoire Jean Leray, CNRS-UMR 6629, Université de Nantes, 2 rue de la Houssinière, F-44322 Nantes Cedex 03, France
[email protected]
Received 19 November 2009
For a subquadratic symbol $H$ on $\mathbb{R}^d\times\mathbb{R}^d = T^*(\mathbb{R}^d)$, the quantum propagator of the time dependent Schrödinger equation $i\hbar\frac{\partial\psi}{\partial t} = \hat H\psi$ is a Semiclassical Fourier-Integral Operator when $\hat H = H(x, \hbar D_x)$ ($\hbar$-Weyl quantization of $H$). Its Schwartz kernel is described by a quadratic phase and an amplitude. At every time $t$, when $\hbar$ is small, it is “essentially supported” in a neighborhood of the graph of the classical flow generated by $H$, with a full uniform asymptotic expansion in $\hbar$ for the amplitude. In this paper, our goal is to revisit this well-known and fundamental result with emphasis on the flexibility for the choice of a quadratic complex phase function and on global $L^2$ estimates when $\hbar$ is small and time $t$ is large. One of the simplest choices of the phase is known in chemical physics as the Herman–Kluk formula. Moreover, we prove that the semiclassical expansion for the propagator is valid for $|t| \ll \frac{1}{4\delta}|\log\hbar|$ where $\delta > 0$ is a stability parameter for the classical system.
Keywords: Coherent states; time dependent Schrödinger equations; Semiclassical Fourier-Integral Operator; Ehrenfest time.
Mathematics Subject Classification 2010: 35Q41, 81Q05, 81S30, 35S30
1. Introduction and Results
Let us consider the time-dependent Schrödinger equation
\[ i\hbar\frac{\partial\psi(t)}{\partial t} = \hat H(t)\psi(t), \qquad \psi(t = t_0) = \psi_0, \tag{1.1} \]
where $\psi_0$ is an initial state, $\hat H(t)$ is a quantum Hamiltonian defined as a continuous family of self-adjoint operators in the Hilbert space $L^2(\mathbb{R}^d)$, depending on time $t$ and on the Planck constant $\hbar > 0$, which plays the role of a small parameter in the system of units considered in this paper. $\hat H(t)$ is supposed to be the $\hbar$-Weyl quantization of a classical smooth observable $H(t, X)$, $X = (x, \xi) \in \mathbb{R}^d\times\mathbb{R}^d$ (see [27] for more details concerning semiclassical Weyl quantization).
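Our main results concern subquadratic Hamiltonians $H$; that means here that $H(t, X)$ is continuous in $t \in \mathbb{R}$, $C^\infty$ smooth in $X \in \mathbb{R}^{2d}$ and satisfies, for every $\gamma \in \mathbb{N}^{2d}$, $|\gamma| \ge 2$,
\[ |\partial_X^\gamma H(t, X)| \le C_{T,\gamma}, \qquad \forall t,\ |t - t_0| \le T,\ \forall X \in \mathbb{R}^{2d}, \tag{1.2} \]
where $\partial_X = \frac{\partial}{\partial X}$ and $C_{T,\gamma} > 0$.
Let us introduce some classes of symbols (“classical observables”) defined as follows. Let $m, n \in \mathbb{N}$.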
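Definition 1.1. We say that a symbol $s$ is in $O_m(n)$ if $s$ is a smooth function on the Euclidean space $\mathbb{R}^n$ such that for every $\gamma \in \mathbb{N}^n$, $|\gamma| \ge m$, we have
\[ |s|_{\infty,\gamma} := \sup_{X\in\mathbb{R}^n} |\partial_X^\gamma s(X)| < +\infty. \tag{1.3} \]
If $s(\varepsilon)$ depends on a parameter $\varepsilon \in P$ we say that $s(\varepsilon)$ is bounded in $O_m(n)$ if for every $\gamma$, we have $\sup_{\varepsilon\in P} |s(\varepsilon)|_{\infty,\gamma} < +\infty$.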
It is well known that the subquadratic assumption entails that Eq. (1.1) is solved by a unique quantum unitary propagator in $L^2(\mathbb{R}^d)$ such that $\psi_t = U(t, t_0)\psi_0$, $\forall t \in \mathbb{R}$. For the same reason, the classical dynamics is also well defined $\forall t \in \mathbb{R}$. $z_t = (q_t, p_t)$ is the classical path in the phase space $\mathbb{R}^{2d}$ such that $z_{t_0} = z$ and satisfying
\[ \dot q_t = \partial_p H(t, q_t, p_t), \qquad \dot p_t = -\partial_q H(t, q_t, p_t), \qquad q_{t_0} = q,\ p_{t_0} = p. \tag{1.4} \]
It defines a Hamiltonian flow: $\phi^t(z) = z_t$ ($\phi^{t_0}(z) = z$). Let us introduce the stability Jacobi matrix of this Hamiltonian flow: $F(t) = \partial_z\phi^t(z)$. $F(t)$ is a $2d\times 2d$ symplectic matrix with four $d\times d$ blocks, $F(t) = \begin{pmatrix} A_t & B_t \\ C_t & D_t \end{pmatrix}$, where
\[ A_t = \frac{\partial q_t}{\partial q}, \quad B_t = \frac{\partial q_t}{\partial p}, \quad C_t = \frac{\partial p_t}{\partial q}, \quad D_t = \frac{\partial p_t}{\partial p}. \tag{1.5} \]
We also introduce the classical action
\[ S(t, z) = \int_{t_0}^{t} (p_s\cdot\dot q_s - H(s, z_s))\,ds \tag{1.6} \]
where $u\cdot v$ denotes the usual scalar product for $u, v \in \mathbb{R}^d$, and the phase function
\[ \Phi(t, z; x, y) = S(t, z) + p_t\cdot(x - q_t) - p\cdot(y - q) + \frac{i}{2}\big(|x - q_t|^2 + |y - q|^2\big). \tag{1.7} \]
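As a concrete illustration of (1.4) and (1.5), the following sketch integrates Hamilton's equations numerically for a hypothetical one-dimensional subquadratic Hamiltonian $H(q, p) = p^2/2 + \cos q$ (an example chosen here, not one taken from the paper) and approximates the blocks $A_t, B_t, C_t, D_t$ by finite differences. For $d = 1$, symplecticity of $F(t)$ reduces to $\det F(t) = 1$, which the script checks. The function names, step count and finite-difference step are assumptions of this sketch.

```python
import math

def hamiltonian_vector_field(q, p):
    # Hypothetical example: H(q, p) = p^2/2 + cos(q), with d = 1.
    # Hamilton's equations (1.4): dq/dt = dH/dp = p, dp/dt = -dH/dq = sin(q).
    return p, math.sin(q)

def flow(q, p, t, steps=2000):
    """Approximate the flow phi^t(q, p) with the classical RK4 scheme."""
    h = t / steps
    for _ in range(steps):
        k1q, k1p = hamiltonian_vector_field(q, p)
        k2q, k2p = hamiltonian_vector_field(q + 0.5 * h * k1q, p + 0.5 * h * k1p)
        k3q, k3p = hamiltonian_vector_field(q + 0.5 * h * k2q, p + 0.5 * h * k2p)
        k4q, k4p = hamiltonian_vector_field(q + h * k3q, p + h * k3p)
        q += h * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
        p += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return q, p

def stability_matrix(q, p, t, eps=1e-5):
    """Central finite differences for the blocks A, B, C, D of (1.5)."""
    q_plus, p_plus = flow(q + eps, p, t)
    q_minus, p_minus = flow(q - eps, p, t)
    A = (q_plus - q_minus) / (2 * eps)   # dq_t/dq
    C = (p_plus - p_minus) / (2 * eps)   # dp_t/dq
    q_plus, p_plus = flow(q, p + eps, t)
    q_minus, p_minus = flow(q, p - eps, t)
    B = (q_plus - q_minus) / (2 * eps)   # dq_t/dp
    D = (p_plus - p_minus) / (2 * eps)   # dp_t/dp
    return A, B, C, D

if __name__ == "__main__":
    A, B, C, D = stability_matrix(0.3, 0.7, 1.0)
    print("det F(t) =", A * D - B * C)
```

The same finite-difference blocks can be fed into $\det{}^{1/2}(A_t + D_t + i(B_t - C_t))$ to explore the leading amplitude numerically, keeping in mind that the square root must be followed by continuity in $t$.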
For applications, it is useful to introduce semi-classical subquadratic symbols. These symbols have an asymptotic expansion in the semiclassical parameter $\hbar > 0$, $H^\hbar(t, X) \asymp \sum_{j\ge 0}\hbar^j H_j(t, X)$, such that the following conditions are satisfied.
\[ \forall j \ge 0,\ H_j(t, \bullet) \in O_{(2-j)_+}(2d) \text{ and are bounded in } O_{(2-j)_+}(2d) \text{ for } t \in \mathbb{R}, \tag{1.8} \]
\[ \forall N \ge 1,\ \hbar^{-N-1}\Big(H^\hbar(t, X) - \sum_{0\le j\le N}\hbar^j H_j(t, X)\Big) \text{ is bounded in } O_0 \text{ for } t \in \mathbb{R} \text{ and } \hbar \in\, ]0, 1]. \tag{1.9} \]
Let us recall the definition of Weyl quantization. For any symbol $s$ in $O_m(2d)$, and for any $\psi \in \mathcal{S}(\mathbb{R}^d)$, we have
\[ \mathrm{Op}^w_\hbar[s]\psi(x) = (2\pi\hbar)^{-d}\int_{\mathbb{R}^{2d}} e^{\frac{i}{\hbar}(x-y)\cdot\xi}\, s\Big(\frac{x+y}{2}, \xi\Big)\psi(y)\,dy\,d\xi. \tag{1.10} \]
We shall also use the notation $\hat s = \mathrm{Op}^w_\hbar[s]$.
The Herman–Kluk formula is included in the following asymptotic result, which will be discussed in detail in this paper. This formula was discovered by several authors in the chemical-physics literature in the eighties. We refer to the introductions of [22, 29] for interesting historical expositions. It is rather surprising that until the recent paper [29] and the Ph.D. thesis [33] there was no explicit connection in the mathematical literature between the Herman–Kluk formula and Fourier-Integral Operators with complex phases.
Theorem 1.2. Let $H^\hbar(t)$ be a time dependent semiclassical subquadratic Hamiltonian and $K^\hbar(t; x, y)$ be the Schwartz kernel of its propagator $U^\hbar(t, t_0)$. Then there exists a semi-classical symbol of order 0, $a^\hbar(t; z) = \sum_{0\le j<+\infty} a_j(t; z)\hbar^j$, where $a_j$ is continuous in $t$, such that
\[ K^\hbar(t; x, y) \asymp (2\pi\hbar)^{-3d/2}\int_{\mathbb{R}^{2d}} e^{\frac{i}{\hbar}\Phi(t,z;x,y)}\, a(\hbar; t; z)\,dz \tag{1.11} \]
in the $L^2$ uniform norm. More precisely, if we denote
\[ K^{(\hbar,N)}(t; x, y) = (2\pi\hbar)^{-3d/2}\int_{\mathbb{R}^{2d}} e^{\frac{i}{\hbar}\Phi(t,z;x,y)}\Big(\sum_{0\le j\le N} a_j(t; z)\hbar^j\Big)dz \tag{1.12} \]
and $U^{(\hbar,N)}(t, t_0)$ the operator, in $L^2(\mathbb{R}^d)$, with the Schwartz kernel $K^{(\hbar,N)}(t; x, y)$, then, for every $T > 0$ and every $N \ge 1$, there exists $C(T, N) > 0$ such that for the $L^2$ operator norm we have
\[ \|U^\hbar(t, t_0) - U^{(\hbar,N)}(t, t_0)\| \le C(T, N)\hbar^{N+1}, \qquad \forall t,\ |t - t_0| \le T,\ \hbar \in\, ]0, 1]. \tag{1.13} \]
The leading term is
\[ a_0(t; z) = \det{}^{1/2}\big(A_t + D_t + i(B_t - C_t)\big)\exp\Big(-i\int_{t_0}^{t} H_1(z_s)\,ds\Big) \tag{1.14} \]
where the square root is defined by continuity starting from $t = t_0$ ($a_0(t_0; z) = 2^{d/2}$). Moreover, the amplitudes $a_j$ are smooth functions defined by transport equations (see the proof below) and, for every $T > 0$, they are bounded in $O_0$ for $|t| \le T$.
In [29], the authors give a rigorous proof of this result with an additional hypothesis: they assume that $H(x, \xi)$ is a polynomial in $\xi$. Here we consider more general subquadratic symbols. In particular our result applies to relativistic Hamiltonians like $\sqrt{1 + |\xi|^2} + V(x)$. Using a global diagonalization (see [28, Sec. 3]), the result can be extended to Dirac systems. Similar results are true with more general quadratic phases and for systems with diagonalizable leading symbols (see [4, 28]).
Let us define the quadratic phase
\[ \Phi^{(\Theta_t,\Gamma)}(t, z; x, y) = S(t, z) + p_t\cdot(x - q_t) - p\cdot(y - q) + \frac{1}{2}\big(\Theta_t(x - q_t)\cdot(x - q_t) - \bar\Gamma(y - q)\cdot(y - q)\big) \tag{1.15} \]
where $\Gamma$, $\Theta_t$ are complex symmetric matrices with a definite-positive imaginary part, $\Theta_t$ is $C^1$ in $t$. $\Gamma$ is constant, $\Theta_t$ may depend smoothly on $t$ and $z$ such that the following conditions are satisfied:
\[ \exists c_T > 0, \quad \Im\Theta_t v\cdot v \ge \frac{1}{c_T}|v|^2, \qquad \forall t,\ |t| \le T,\ \forall z \in \mathbb{R}^{2d}, \tag{1.16} \]
\[ \forall\gamma,\ |\gamma| \ge 1,\ \exists C_{T,\gamma}, \quad |\partial_z^\gamma\Theta_t| \le C_{T,\gamma}, \qquad \forall z \in \mathbb{R}^{2d},\ \forall |t| \le T. \tag{1.17} \]
So we have
Theorem 1.3. Under the assumptions of Theorem 1.2 and (1.16), (1.17), we have
\[ K(t; x, y) \asymp (2\pi\hbar)^{-3d/2}\int_{\mathbb{R}^{2d}} e^{\frac{i}{\hbar}\Phi^{(\Theta_t,\Gamma)}(t,z;x,y)} f(\hbar; t; z)\,dz \tag{1.18} \]
where $f(\hbar; t; z) = \sum_{0\le j<+\infty} f_j(t; z)\hbar^j$ with the same meaning as in Theorem 1.2. In particular
\[ f_0(t, z) = 2^{d/2}\det{}^{1/2}[M(\Theta_t, \Gamma)] \tag{1.19} \]
where $M(\Theta_t, \Gamma) = i\big(C + D\bar\Gamma - \Theta(A + B\bar\Gamma)\big)$.
There exist several methods to prove this theorem. In [29], the authors prove it as a consequence of a symbolic calculus for FIO with complex quadratic phases. In [5], the authors proved a weaker result for $\Gamma = iI$ and $\Theta_t = \Gamma_t$, where $\Gamma_t$ is determined by the propagation of Gaussian coherent states: $\Gamma_t = (C + D\Gamma)(A + B\Gamma)^{-1}$ (see Sec. 2 of this paper). Laptev–Sigal in [23] have also considered a similar formula for the propagator (see Sec. 5 of this paper) but assume that the initial data has a compact support in momenta. Kay ([22]) explains how to compute all the semiclassical corrections $a_j$ but did not give estimates on the error term, so his expansion is not rigorously established. Here we choose another approach, perhaps more explicit and simpler. We shall prove the general Theorem 1.3 as a consequence of the particular case of Theorem 1.2 by using a real deformation of the phase $\Phi^{(\Theta_t,\Gamma)}$ onto the simpler one $\Phi^{(iI,iI)}$. Moreover, we give a direct proof of Theorem 1.2, proving the necessary properties for Fourier integrals with complex quadratic phases. This way we can easily get explicit estimates for the error terms for large times.
Let us assume that the conditions on $\hat H(t)$ are satisfied for $T = +\infty$. Moreover assume that there exists a positive real function $\mu(T) \ge 1$, $T > 0$, such that the classical flow $\phi^t$ satisfies, for every multiindex $\gamma$, $|\gamma| \ge 1$, and for some $C_\gamma > 0$,
\[ |\partial_z^\gamma\phi^{t,t'}(z)| \le C_\gamma\,\mu(T)^{|\gamma|}, \qquad \text{for } |t| + |t'| \le T,\ \forall z \in \mathbb{R}^{2d}. \tag{1.20} \]
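We have discussed in [5] the condition (1.20). In particular this condition is fulfilled with $\mu(T) = e^{\delta T}$ for $\delta = \sup_{X\in\mathbb{R}^{2d},\,t\in\mathbb{R}}\|J\partial^2_{X,X}H(t, X)\|$.
Theorem 1.4. Choosing the phase as in Theorem 1.2, for $j \ge 0$ the amplitudes $a_j(t, z)$ satisfy the following estimates: for every multiindex $\gamma$ there exists a constant $C_{j,\gamma}$ such that
\[ |\partial_z^\gamma a_j(t, z)| \le C_{j,\gamma}\,|\det{}^{1/2} M_t|\,\mu(t)^{4j+|\gamma|}, \qquad \forall t \in \mathbb{R},\ \forall z \in \mathbb{R}^{2d}. \tag{1.21} \]
Hence we have the following Ehrenfest type estimate. For every $N \ge 1$ and every $\varepsilon > 0$ there exists $C_{N,\varepsilon}$ such that we have
\[ \|U^\hbar(t, t_0) - U^{(N)}_\hbar(t, t_0)\| \le C_{N,\varepsilon}\,\hbar^{\varepsilon(N+1)}, \qquad \forall t,\ |t| \le \frac{1-\varepsilon}{4\delta}|\log\hbar|,\ \forall\hbar \in\, ]0, 1]. \tag{1.22} \]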
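The role of the time restriction in (1.22) can be illustrated by simple arithmetic, assuming the model growth $\mu(T) = e^{\delta T}$ mentioned above. By (1.21) the $j$-th term of the expansion is of size $\mu(t)^{4j}\hbar^j$, and at $|t| = \frac{1-\varepsilon}{4\delta}|\log\hbar|$ this equals $\hbar^{\varepsilon j}$, so successive terms still decrease. The sketch below only checks this bookkeeping; it is a heuristic, not a substitute for the proof, and the function names are hypothetical.

```python
import math

def ehrenfest_time(hbar, delta, eps):
    """Largest time allowed in (1.22): (1 - eps) / (4 delta) * |log hbar|."""
    return (1.0 - eps) / (4.0 * delta) * abs(math.log(hbar))

def term_size(hbar, delta, eps, j):
    """mu(T)^(4j) * hbar^j at T = ehrenfest_time, assuming mu(T) = exp(delta * T)."""
    T = ehrenfest_time(hbar, delta, eps)
    return math.exp(4 * j * delta * T) * hbar ** j

if __name__ == "__main__":
    hbar, delta, eps = 1e-3, 0.5, 0.25
    for j in range(4):
        # The two printed columns should agree: mu^(4j) hbar^j = hbar^(eps j).
        print(j, term_size(hbar, delta, eps, j), hbar ** (eps * j))
```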
In previous works, an Ehrenfest time $T_E = c|\log\hbar|$, $c > 0$, was estimated for propagation of Gaussians in [9] and propagation of observables in [6]. For Gaussians we got $c = \frac{1}{6\delta}$, for observables $c = \frac{1}{2\delta}$. In [29], the authors gave an Ehrenfest time without explicit estimate on $c$.
2. Gaussian Coherent States and Quadratic Hamiltonians
The phase functions $\Phi^{(\Theta,\Gamma)}$ in (1.7) and (1.15) are closely related to Gaussian coherent states. This can be seen by proving a particular case of Theorem 1.2 for quadratic time-dependent Hamiltonians:
\[ H_t(q, p) = \frac{1}{2}\big(G_t q\cdot q + 2L_t q\cdot p + K_t p\cdot p\big) \]
where $q, p \in \mathbb{R}^d$, $K_t, L_t, G_t$ are real $d\times d$ matrices, continuous in time $t \in \mathbb{R}$, and $G_t$, $K_t$ are symmetric. The classical motion in the phase space is given by the linear differential equation
\[ \begin{pmatrix}\dot q \\ \dot p\end{pmatrix} = J\cdot\begin{pmatrix} G_t & L_t^T \\ L_t & K_t\end{pmatrix}\begin{pmatrix} q \\ p\end{pmatrix}, \qquad J = \begin{pmatrix} 0 & I \\ -I & 0\end{pmatrix}, \tag{2.1} \]
where $L^T$ is the transposed matrix of $L$, and $J$ defines the symplectic form $\sigma(X, X') := JX\cdot X'$, $X = (x, \xi)$, $X' = (x', \xi')$. This equation defines a linear symplectic transformation, $F_t$, such that $F_0 = I$ (we take here $t_0 = 0$). It can be represented as a $2d\times 2d$ matrix which can be written as four $d\times d$ blocks:
\[ F_t = \begin{pmatrix} A_t & B_t \\ C_t & D_t\end{pmatrix}. \tag{2.2} \]
The quantum evolution for the Hamiltonian $\hat H(t)$ is denoted by $U(t)$ ($U(0) = I$). We can compute the matrix elements of $U(t)$ on the coherent states basis $\varphi_z$. This has been done in [24, p. 249 (6.36)] and [3, 12, 10]. We follow here the presentation given in [10].
Let us introduce some notations which will be used later. $g$ denotes the Gaussian function $g(x) = \pi^{-d/4}e^{-|x|^2/2}$ and $\Lambda_\hbar$ is the dilation operator $\Lambda_\hbar\psi(x) = \hbar^{-d/4}\psi(\hbar^{-1/2}x)$. So $\varphi_0 = \Lambda_\hbar g$, and the general Gaussian coherent states are defined as follows:
\[ \varphi_z^{(\Gamma)} = \hat T(z)\varphi^{(\Gamma)}, \tag{2.3} \]
where $\hat T(z)$ is the Weyl translation operator, $z = (q, p)$,
\[ \hat T(z) = \exp\Big(\frac{i}{\hbar}(p\cdot x - q\cdot\hbar D_x)\Big) \tag{2.4} \]
where $D_x = -i\frac{\partial}{\partial x}$ and $z = (q, p) \in \mathbb{R}^d\times\mathbb{R}^d$. $\varphi^{(\Gamma)}$ is the Gaussian state:
\[ \varphi^{(\Gamma)}(x) = (\pi\hbar)^{-d/4}a_\Gamma\exp\Big(\frac{i}{2\hbar}\Gamma x\cdot x\Big) \tag{2.5} \]
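where $\Gamma$ is a complex symmetric matrix such that $\Im\Gamma$ is definite-positive and $a_\Gamma$ is a normalization constant ($a_\Gamma = \det^{1/4}\Im\Gamma$). It is convenient to introduce here the Siegel space $\Sigma_+(d)$ of $d\times d$ complex matrices $\Gamma$ such that $\Im\Gamma$ is definite-positive. (See [13] for properties of $\Sigma_+(d)$.)
Let us define the Fourier–Bargmann transform $F_B^{(\Gamma)}$ as follows: for $\psi \in L^2(\mathbb{R}^d)$,
\[ F_B^{(\Gamma)}[\psi](z) = (2\pi\hbar)^{-d/2}\langle\psi, \varphi_z^{(\Gamma)}\rangle, \qquad z \in \mathbb{R}^{2d}. \tag{2.6} \]
$\varphi_z^{(\Gamma)}$ is the following coherent state living at $z$, $z = (q, p) \in \mathbb{R}^d\times\mathbb{R}^d$, $x \in \mathbb{R}^d$,
\[ \varphi_z^{(\Gamma)}(x) = (\pi\hbar)^{-d/4}a_\Gamma\exp\Big(\frac{i}{\hbar}\Big(p\cdot x - \frac{p\cdot q}{2}\Big) + \frac{i\,\Gamma(x - q)\cdot(x - q)}{2\hbar}\Big). \tag{2.7} \]
$F_B^{(\Gamma)}$ is an isometry from $L^2(\mathbb{R}^d)$ into $L^2(\mathbb{R}^{2d})$ (with the Lebesgue measures). If $\Gamma = iI$ we denote $F_B = F_B^{iI}$; its range consists of $F \in L^2(\mathbb{R}^{2d})$ such that $\exp\big(\frac{p^2}{2\hbar} + i\frac{q\cdot p}{2\hbar}\big)F(q, p)$ is holomorphic in $\mathbb{C}^d$ in the variable $q - ip$. In other words,
\[ F_B\psi(z) = E_\psi(q - ip)\exp\Big(-\frac{p^2}{2\hbar} - i\frac{q\cdot p}{2\hbar}\Big) \tag{2.8} \]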
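where $E_\psi$ is entire in $\mathbb{C}^d$ (see [25]). Moreover we have the inversion formula
\[ \psi(x) = \int_{\mathbb{R}^{2d}} F_B^{(\Gamma)}[\psi](z)\,\varphi_z^{(\Gamma)}(x)\,dz, \qquad \text{in the } L^2\text{-sense}. \tag{2.9} \]
These properties are well known (see [25, 5]). Sometimes we shall use the shorter notation $\tilde\psi_\Gamma = F_B^{(\Gamma)}\psi$ and $\tilde\psi_\Gamma = \tilde\psi$.
Let us denote by $\hat R[F_t]$ the quantum propagator for the Hamiltonian $\hat H(t)$ (this is the metaplectic representation of $F_t$) and $K^{(F_t)}$ its Schwartz kernel. We know that $\Lambda_\hbar\hat R[F_t]g$ is the following Gaussian state [10, 13],
\[ \Lambda_\hbar\hat R[F_t]g(x) = (\pi\hbar)^{-d/4}a_\Gamma(t)\exp\Big(\frac{i}{2\hbar}\Gamma_t x\cdot x\Big) \tag{2.10} \]
where $a_\Gamma(t) = [\det(A_t + \Gamma B_t)]^{-1/2}a_\Gamma$, the complex square root is computed by continuity$^a$ from $t = t_0 = 0$, and
\[ \Gamma_t = (C_t + \Gamma D_t)(A_t + \Gamma B_t)^{-1}, \qquad \Gamma_{t_0} = \Gamma. \tag{2.11} \]
Proposition 2.1. We have the following exact formula
\[ K^{(F_t)}(x, y) = 2^{d/2}(2\pi\hbar)^{-3d/2}\det{}^{1/2}\Big(\frac{M(\Theta_t, \Gamma)}{i}\Big)\int_{\mathbb{R}^{2d}} e^{\frac{i}{\hbar}\Phi^{(\Theta,\Gamma)}(t,z;x,y)}\,dz \tag{2.12} \]
where $\Gamma, \Theta_t \in \Sigma_+(d)$, $\Theta_t$ is $C^1$ in $t$; $M(\Theta_t, \Gamma) = C + D\bar\Gamma - \Theta_t(A + B\bar\Gamma)$ and
\[ \Phi^{(\Theta_t,\Gamma)}(t, z; x, y) = \frac{1}{2}(q_t\cdot p_t - q\cdot p) + p_t\cdot(x - q_t) - p\cdot(y - q) + \frac{1}{2}\big(\Theta_t(x - q_t)\cdot(x - q_t) - \bar\Gamma(y - q)\cdot(y - q)\big). \]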
Let us remark that here the action is $S(t, z) = \frac{1}{2}(q_t\cdot p_t - q\cdot p)$. First of all let us remark that the integral (2.12) is an oscillating integral and is defined, as usual, by integrations by parts. We shall give two proofs of this formula.
Proof I. We start with any $\Gamma_0$ in the Siegel space $\Sigma_+(d)$. Using the formula
\[ \psi(x) = (2\pi\hbar)^{-d}\int_{\mathbb{R}^{2d}}\langle\psi, \varphi_z^{\Gamma_0}\rangle\,\varphi_z^{\Gamma_0}\,dz \]
we get the formula
\[ K^{(F_t)}(x, y) = (2\pi\hbar)^{-d}\int_{\mathbb{R}^{2d}}\overline{\varphi_z^{(\Gamma_0)}(y)}\,\varphi_{z_t}^{(\Gamma_t)}(x)\,dz. \tag{2.13} \]
$^a$ This definition of $\det^{1/2}$ is different from the $\det^{1/2}$ function on $\Sigma_+(d)$; this is explained in [10] to compute the Maslov index.
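So, we get
\[ K^{(F_t)}(x, y) = (2\pi\hbar)^{-3d/2}k_0(t)\int_{\mathbb{R}^{2d}} e^{\frac{i}{\hbar}\Phi^{(\Gamma_t,\Gamma_0)}(t,z;x,y)}\,dz, \tag{2.14} \]
where
\[ k_0(t) = 2^{d/2}\,\frac{\det^{1/2}(\Im\Gamma_0)}{\det^{1/2}(A + B\Gamma_0)}. \]
Now we shall transform the phase $\Phi^{(\Gamma_t,\Gamma_0)}$ into the phase $\Phi^{(\Theta_t,\Gamma_0)}$. Let us introduce $\Theta(s) = s\Theta_t + (1 - s)\Gamma_t$, $0 \le s \le 1$. We have $\Theta(s) \in \Sigma_+(d)$. We want to find $k(t, s)$ such that $k(t, 0) = k_0(t)$ and
\[ \frac{\partial}{\partial s}\Big(k(t, s)\int_{\mathbb{R}^{2d}} e^{\frac{i}{\hbar}\Phi^{(\Theta(s),\Gamma_0)}(t,z;x,y)}\,dz\Big) = 0, \qquad \forall s \in [0, 1]. \tag{2.15} \]
We have
\[ \frac{\partial}{\partial s}e^{\frac{i}{\hbar}\Phi^{(\Theta(s),\Gamma_0)}} = \frac{i}{2\hbar}(\Theta_t - \Gamma_t)(x - q_t)\cdot(x - q_t)\,e^{\frac{i}{\hbar}\Phi^{(\Theta(s),\Gamma_0)}}. \]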
The main trick used here and later in this paper, and also in all the previous papers on this subject ([23, 22, 29]), is to integrate by parts to convert each factor $(x - q_t)$ into $\hbar$, using the following equality
\[ (\partial_q + \bar\Gamma\partial_p)\Phi^{\Theta,\Gamma} = \big(C^\tau + \bar\Gamma D^\tau - (A^\tau + \bar\Gamma B^\tau)\Theta\big)(x - q_t) \tag{2.16} \]
where $A^\tau$ denotes the transposed matrix of $A$. Let us introduce the matrix $M = M(\Theta, \Gamma) = C + D\bar\Gamma - \Theta(A + B\bar\Gamma)$. So we have
\[ M^\tau(x - q_t)\,e^{\frac{i}{\hbar}\Phi^{(\Theta,\Gamma)}} = \frac{\hbar}{i}\big(\partial_q + \bar\Gamma\partial_p\big)e^{\frac{i}{\hbar}\Phi^{\Theta,\Gamma}}. \tag{2.17} \]
Let us remark that $M$ is invertible. This is a consequence of the following lemma (see [11, 13] or [28, Appendix A], for proofs).
Lemma 2.2. For every linear symplectic map $F : T^*(\mathbb{R}^d) \to T^*(\mathbb{R}^d)$, $F = \begin{pmatrix} A & B \\ C & D\end{pmatrix}$, and every $\Gamma \in \Sigma_+(d)$, $(A + B\Gamma)$ and $(C + D\Gamma)$ are invertible in $\mathbb{C}^d$ and $(C + D\Gamma)(A + B\Gamma)^{-1} \in \Sigma_+(d)$.
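In dimension $d = 1$ the content of Lemma 2.2 can be checked by hand: for real $A, B, C, D$ with $AD - BC = 1$ one finds $\Im\big((C + D\Gamma)/(A + B\Gamma)\big) = \Im\Gamma/|A + B\Gamma|^2 > 0$. The following sketch verifies this numerically for one hypothetical symplectic matrix; the numbers and the function name are illustrative choices, not taken from the paper.

```python
def moebius_image(A, B, C, D, gamma):
    """(C + D*gamma) / (A + B*gamma), the d = 1 case of the map in Lemma 2.2."""
    return (C + D * gamma) / (A + B * gamma)

# Hypothetical 2x2 symplectic matrix: A*D - B*C = 1.
A, B, C, D = 2.0, 3.0, 1.0, 2.0
gamma = 0.5 + 1.5j                      # Im(gamma) > 0, i.e. gamma lies in the Siegel space for d = 1

image = moebius_image(A, B, C, D, gamma)
expected_imaginary_part = gamma.imag / abs(A + B * gamma) ** 2

if __name__ == "__main__":
    print("Im of image:", image.imag)
    print("Im(gamma) / |A + B gamma|^2:", expected_imaginary_part)
```

Since $\Im\Gamma > 0$ forces $A + B\Gamma \neq 0$ whenever $(A, B) \neq (0, 0)$, the invertibility statement is immediate in this scalar case.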
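So we have
\[ \bar M = C + D\Gamma - \bar\Theta(A + B\Gamma) = \big((C + D\Gamma)(A + B\Gamma)^{-1} - \bar\Theta\big)(A + B\Gamma). \]
But $(C + D\Gamma)(A + B\Gamma)^{-1} - \bar\Theta \in \Sigma_+(d)$, so it is invertible. Denote $M(t, s) = M(\Theta_s, \Gamma_t)$. Let us recall the Liouville formula
\[ \partial_s\det(M(t, s)) = \det(M(t, s))\,\mathrm{Tr}\big(\partial_s M(t, s)\,M(t, s)^{-1}\big). \tag{2.18} \]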
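The Liouville formula (2.18) is easy to test numerically. The sketch below uses a hypothetical affine family $M(s) = M_0 + sM_1$ of complex $2\times 2$ matrices (arbitrary illustrative entries) and compares a central finite difference of $\det M(s)$ with $\det M(s)\,\mathrm{Tr}(M_1 M(s)^{-1})$. The helper names are assumptions of this sketch.

```python
def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def inv2(M):
    d = det2(M)
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]

def matmul2(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Hypothetical data: M(s) = M0 + s * M1, so that dM/ds = M1.
M0 = [[1 + 2j, 0.5], [0.3j, 2 - 1j]]
M1 = [[0.2, 1j], [-0.4, 0.1 + 0.3j]]

def family(s):
    return [[M0[i][j] + s * M1[i][j] for j in range(2)] for i in range(2)]

def liouville_sides(s, h=1e-6):
    """Return (finite-difference d/ds det M, det M * Tr(dM/ds M^{-1}))."""
    lhs = (det2(family(s + h)) - det2(family(s - h))) / (2 * h)
    P = matmul2(M1, inv2(family(s)))
    rhs = det2(family(s)) * (P[0][0] + P[1][1])
    return lhs, rhs

if __name__ == "__main__":
    print(liouville_sides(0.3))
```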
So, integrating by parts in $(q, p)$ we get
\[ k(t, s) = k(t, 0)\,\frac{\det^{1/2}M(t, s)}{\det^{1/2}M(t, 0)}. \tag{2.19} \]
Now we have to compute $\frac{k(t,0)}{\det^{1/2}M(t,0)}$. A simple computation gives $M(t, 0) = (D - \Gamma_t B)(\bar\Gamma_0 - \Gamma_0)$. The proof of (2.12) follows from the formula
\[ \det(D - \Gamma_t B) = \det(A + B\Gamma_0)^{-1}. \tag{2.20} \]
This equality follows from the symplecticity of $F$ ($D^\tau B = B^\tau D$). We have $B^\tau\Gamma_t B - D^\tau B = -(A + B\Gamma_0)^{-1}B$. So we get (2.20) if $\det B \neq 0$. The general case follows by a density argument. Let us remark that we can exchange the role of $\Theta$ and $\Gamma$ by considering the adjoint $U(t)^*$ of $U(t)$.
Proof II. We solve directly the Schrödinger equation
\[ \Big(i\hbar\frac{\partial}{\partial t} - \hat H(t)\Big)\psi(t, x) = 0 \tag{2.21} \]
for any initial data $\psi(x) := \psi(0, x)$, $\psi \in \mathcal{S}(\mathbb{R}^d)$, using the ansatz
\[ \psi(t, x) = (2\pi\hbar)^{-3d/2}k(t)\int_{\mathbb{R}^{2d}\times\mathbb{R}^d} e^{\frac{i}{\hbar}\Phi^{(\Theta,\Gamma)}(t,z;x,y)}\psi(y)\,dz\,dy. \tag{2.22} \]
We have to compute $k(t)$ such that $k(0) = 2^{d/2}$. Let us remark that if we integrate first in $y$ then the integral (2.22) in $z$ converges because the Fourier–Bargmann transform of $\psi$, $F_B\psi$, is in the Schwartz space $\mathcal{S}(\mathbb{R}^{2d})$. For simplicity, we assume here that $\Theta = \Gamma = iI$. The general case can be reached by the same method or by using the deformation argument of Proof I, as we shall see later for more general Hamiltonians.
Here the Hamiltonian $\hat H(t)$ is a quadratic form. So using dilations we can assume that $\hbar = 1$. A simple computation, left to the reader, gives the following:
Lemma 2.3.
\[ (g^{-1}\hat H(t)g)(x) = Gx\cdot x + i(L + L^\tau)x\cdot x - Kx\cdot x + \mathrm{Tr}(K - iL) \tag{2.23} \]
where $g(x) = e^{-|x|^2/2}$.
So we get
\[ (i\partial_t - \hat H(t))\psi(t) = (2\pi)^{-3d/2}\int_{\mathbb{R}^{2d}\times\mathbb{R}^d} e^{i\Phi^{(\Theta,\Gamma)}(t,z;x,y)}\,b(t, x, z)\,\psi(y)\,dz\,dy \tag{2.24} \]
where $b(t, x, z) = i\partial_t k(t) - k(t)\big(E(x - q_t)\cdot(x - q_t) + \mathrm{Tr}(K - iL)\big)$.
As in Proof I, we integrate by parts in the variable $z \in \mathbb{R}^{2d}$, using $(\partial_q - i\partial_p)\Phi = M^\tau(x - q_t)$ with $M = C - B - i(A + D)$, which is invertible (see Lemma 3.2 below). Using the Hamilton equation of motion we get
\[ \dot M = -E(A - iB) - i(K - iL)M. \tag{2.25} \]
So, we find the following differential equation for $k(t)$,
\[ \dot k = \frac{1}{2}\mathrm{Tr}(M^{-1}\dot M)\,k. \tag{2.26} \]
Using the Liouville formula, we get again (2.12) for this particular phase.
3. Proof of Theorems 1.2 and 1.4
As usual for this kind of problem there are two steps:
(1) Determine the amplitudes $a_j$ by solving transport differential equations by induction;
(2) Estimate the error between the approximate propagator and the exact one.
3.1. Transport equations
It is convenient to write
\[ e^{\frac{i}{\hbar}\Phi} = (\pi\hbar)^{d/2}\,\varphi_{z_t}(x)\,\bar\varphi_z(y)\,e^{\frac{i}{\hbar}(S(t,z) + (p\cdot q - p_t\cdot q_t)/2)}. \tag{3.1} \]
Then we have to compute $\hat H^\hbar(t)\varphi_{z_t}$. It is not difficult to add contributions of the lower order terms of the Hamiltonian, so we shall assume for simplicity that $H^\hbar(t) = H_0(t) := H(t)$.
Lemma 3.1. For every $N \ge 2$ we have
\[ \hat H(t)\varphi_{z_t}(x) = \sum_{|\gamma|\le N}\frac{\hbar^{|\gamma|/2}}{\gamma!}\,\partial_X^\gamma H(t, z_t)\,\Pi_\gamma\Big(\frac{x - q_t}{\sqrt\hbar}\Big)\varphi_{z_t}(x) + \hbar^{(N+1)/2}\,\hat T(z_t)\Lambda_\hbar\mathrm{Op}^w_1[R_N(t, z_t)]g(x) \tag{3.2} \]
where
\[ R_N(t, z_t, X) = \int_0^1\frac{(1 - s)^N}{N!}\sum_{|\gamma|=N+1}\partial_X^\gamma H(t, z_t + s\sqrt\hbar X)\,X^\gamma\,ds \tag{3.3} \]
and $\Pi_\gamma$ is a universal polynomial of degree $\le |\gamma|$ which is even or odd according to whether $|\gamma|$ is even or odd.
Proof. Let us recall that $\varphi_z = \hat T(z)\Lambda_\hbar g$. In this proof we put $z_t = z$. An easy property of Weyl quantization gives
\[ \Lambda_\hbar^{-1}\hat T(z)^{-1}\hat H(t)\hat T(z)\Lambda_\hbar = \mathrm{Op}^w_1[H(\sqrt\hbar\,\bullet + z)]. \tag{3.4} \]
So the lemma follows easily from the Taylor formula with integral remainder.
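In this first step, we do not take care of remainder estimates; this will be done in the next step. Let us denote by $I(a, \Phi)$ the formal operator having the Schwartz kernel
\[ K_a(x, y) = (2\pi\hbar)^{-3d/2}\int_{\mathbb{R}^{2d}} e^{\frac{i}{\hbar}\Phi(t,z;x,y)}\,a(t, z)\,dz. \tag{3.5} \]
From Lemma 3.1, we can write
\[ \hat H(t)I(a, \Phi) \sim I(b, \Phi), \qquad \text{where } b \sim \sum_\gamma\frac{\hbar^{|\gamma|/2}}{\gamma!}\,\partial_X^\gamma H(t, z_t)\,\Pi_\gamma\Big(\frac{x - q_t}{\sqrt\hbar}\Big)a. \tag{3.6} \]
We have
\[ \Pi_\gamma(x) = \sum_{\beta\le\gamma}h_{\gamma,\beta}\,x^\beta. \tag{3.7} \]
The quadratic part can be computed as for quadratic Hamiltonians and the linear part disappears with the classical motion. So we have
\[ b \sim H(t, z_t)a + \big(\partial_q H(t, z_t) + i\partial_p H(t, z_t)\big)\cdot(x - q_t)\,a + \hbar\Big(E\,\frac{x - q_t}{\sqrt\hbar}\cdot\frac{x - q_t}{\sqrt\hbar} + \mathrm{Tr}(K - iL)\Big)a \tag{3.8} \]
where we denote by $\partial^2_{X,X}H(t, X)$ the Hessian matrix of $H(t)$. We have
\[ \partial^2_{X,X}H(t, z_t) = \begin{pmatrix} G & L \\ L & K\end{pmatrix}, \qquad E = G + 2iL - K, \tag{3.9} \]
with $G := \partial^2_{q,q}H(t, z_t)$, $L := \partial^2_{q,p}H(t, z_t)$, $K := \partial^2_{p,p}H(t, z_t)$.
Here the stability matrix $F_t = \begin{pmatrix} A_t & B_t \\ C_t & D_t\end{pmatrix}$ satisfies $\dot F_t = J\partial^2_{X,X}H(t, z_t)F_t$, $F_{t=0} = I$. As in the quadratic case we want to transform the powers of $(x - q_t)$ into powers of $\hbar$.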
Lemma 3.2. Let us denote $M_t = (C_t - B_t) - i(A_t + D_t)$. We have
\[ |\det M_t| \ge 2^{-d}, \tag{3.10} \]
and
\[ (\partial_q - i\partial_p)\,e^{\frac{i}{\hbar}\Phi} = \frac{i}{\hbar}M_t^\tau(x - q_t)\,e^{\frac{i}{\hbar}\Phi}. \tag{3.11} \]
Proof. For simplicity, let us forget the lower index $t$. Let us consider the $2d\times 2d$ matrix
\[ I + F + iJ(I - F) = \begin{pmatrix} I + A - iC & B + i(I - D) \\ C - i(I - A) & I + D + iB\end{pmatrix} = \begin{pmatrix} I + A - iC & -i(D + iB) + iI \\ i(A - iC) - iI & I + D + iB\end{pmatrix}. \tag{3.12} \]
Using [13, Lemma 4, Appendix A], we get
\[ \det(I + F + iJ(I - F)) = \det\big((I + A - iC)(I + D + iB) - (A - iC - I)(D + iB - I)\big) = 2^d\det(A + D + i(B - C)). \tag{3.13} \]
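Using that $F$ is symplectic, we get
\[ (I + F + iJ(I - F))^*(I + F + iJ(I - F)) = (I + F^\tau)(I + F) + (I - F^\tau)(I - F) \ge I_{2d} \tag{3.14} \]
hence (3.10) follows. Let us recall classical computations for the derivatives of the action:
\[ \partial_q S = (\partial_q q_t)^\tau p_t - p, \tag{3.15} \]
\[ \partial_p S = (\partial_p q_t)^\tau p_t. \tag{3.16} \]
Then we can compute $\partial_q\Phi$, $\partial_p\Phi$ and we get (3.11).
Integrating by parts as in the quadratic case, we get
\[ (i\hbar\partial_t - \hat H(t))I(a, \Phi) \asymp I(f, \Phi) \tag{3.17} \]
where
\[ f \sim i\hbar\Big(\partial_t a - \frac{1}{2}\mathrm{Tr}(M^{-1}\dot M)a\Big) + \sum_{|\gamma|\ge 3}\frac{\hbar^{|\gamma|/2}}{\gamma!}\,\partial_X^\gamma H(t, z_t)\,\Pi_\gamma\Big(\frac{x - q_t}{\sqrt\hbar}\Big)a. \tag{3.18} \]
Hence using the Liouville formula, we get the first term
\[ a_0(t, z) = 2^{d/2}\det{}^{1/2}(iM). \tag{3.19} \]
We shall obtain the next terms $a_j$ by successive integrations by parts. This is solved more explicitly with the following lemma.
Lemma 3.3. For any symbol $b \in O_0(2d)$, and every multiindex $\alpha \in \mathbb{N}^{2d}$ we have
\[ \int_{\mathbb{R}^{2d}}(x - q_t)^\alpha\,e^{\frac{i}{\hbar}\Phi}\,b(z)\,dz = \sum_{\frac{|\alpha|}{2}\le|\beta|\le|\alpha|}\hbar^{|\beta|}\int_{\mathbb{R}^{2d}}f_{\alpha,\beta}(t, z)\,e^{\frac{i}{\hbar}\Phi}\,\partial_z^\beta b(z)\,dz \tag{3.20} \]
where $f_{\alpha,\beta}(t, z)$ are symbols of order 0, uniformly bounded in $O_0(2d)$ on bounded time intervals. They only depend on the classical flow $\phi^t(z)$ and its derivatives. More precisely, let us assume that there exists a positive function $\mu(T)$ such that for every $\gamma \in \mathbb{N}^{2d}$ we have
\[ \sup_{|t|\le T}|\partial_z^\gamma\phi^t(z)| \le C_\gamma\,\mu(T)^{|\gamma|}. \tag{3.21} \]
Then we have
\[ |\partial_z^\nu f_{\alpha,\beta}(z)| \le C_{\alpha,\beta;\nu}\,\mu(T)^{|\alpha|-|\beta|+|\nu|}. \tag{3.22} \]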
Proof. The lemma is easily obtained by induction on $|\alpha|$ using Lemma 3.2.
Now, to determine the transport equation, we solve inductively on $j \ge 0$ the equation
\[ (i\hbar\partial_t - \hat H(t))\,I\Big(\sum_{0\le k\le j+1}\hbar^k a_k(t), \Phi\Big) = O(\hbar^{j+2}). \tag{3.23} \]
Reasoning by induction on $j \ge 0$, we get the transport equation for $a_{j+1}(t)$ by cancellation of the coefficient of $\hbar^{j+1}$ in (3.23):
\[ \partial_t a_{j+1}(t, z) = \frac{1}{2}\mathrm{Tr}(\dot M M^{-1})\,a_{j+1}(t, z) + b_j(t, z), \qquad a_{j+1}(0, z) = 0, \tag{3.24} \]
where
\[ b_j(t, z) = \sum_{|\alpha|+2k\le 2(j+2)}F_{j,k,\alpha}(t, z)\,\partial_z^\alpha a_k(t, z). \tag{3.25} \]
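The transport equation (3.24) is a linear inhomogeneous ODE in $t$, and since $\mathrm{Tr}(\dot M M^{-1}) = \partial_t\log\det M$ it can be integrated with the factor $\det^{1/2}M$. The following scalar sketch ($d = 1$) illustrates this: it replaces $\det M(t, z)$ by a hypothetical positive function $m(t)$ and the source $b_j$ by a hypothetical $b(t)$, solves $a' = \frac{m'}{2m}a + b$, $a(0) = 0$, by RK4, and compares with $\int_0^t\sqrt{m(t)/m(s)}\,b(s)\,ds$. All names and the choices of $m$ and $b$ are assumptions made for illustration.

```python
import math

def m(t):
    return 1.0 + t * t          # hypothetical scalar stand-in for det M(t, z)

def dm(t):
    return 2.0 * t              # derivative of m

def b(t):
    return math.cos(t)          # hypothetical source term b_j(t, z)

def solve_transport_ode(t_end, steps=2000):
    """RK4 for a'(t) = (m'(t) / (2 m(t))) a(t) + b(t), a(0) = 0."""
    def rhs(t, a):
        return 0.5 * dm(t) / m(t) * a + b(t)
    h = t_end / steps
    t, a = 0.0, 0.0
    for _ in range(steps):
        k1 = rhs(t, a)
        k2 = rhs(t + h / 2, a + h * k1 / 2)
        k3 = rhs(t + h / 2, a + h * k2 / 2)
        k4 = rhs(t + h, a + h * k3)
        a += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return a

def integrating_factor_formula(t_end, steps=2000):
    """Composite Simpson rule for int_0^t sqrt(m(t) / m(s)) b(s) ds (steps must be even)."""
    h = t_end / steps
    def integrand(s):
        return math.sqrt(m(t_end) / m(s)) * b(s)
    total = integrand(0.0) + integrand(t_end)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3

if __name__ == "__main__":
    print(solve_transport_ode(1.5), integrating_factor_formula(1.5))
```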
Moreover, $F_{j,k,\alpha}(t, z)$ depends only on the classical flow $\phi^t(z)$ and its derivatives and satisfies
\[ |\partial_z^\gamma F_{j,k,\alpha}(t, z)| \le C_{j,k,\alpha,\gamma}\,\mu(T)^{2(j-k+2)+|\gamma|-|\alpha|} \tag{3.26} \]
where $C_{j,k,\alpha,\gamma}$ only depends on $\sup_{|t|\le T}|H(t)|_{\infty,\gamma}$, $2 \le |\gamma| \le j + 2$. So we get, for every $j \ge 0$,
\[ a_{j+1}(t, z) = \int_0^t\det{}^{1/2}\big(M(t, z)M(s, z)^{-1}\big)\,b_j(s, z)\,ds. \tag{3.27} \]
Moreover, from (3.25) and (3.26), we get the following estimate, for every $j \ge 0$, $|t| \le T$, $z \in \mathbb{R}^{2d}$,
\[ |\partial_z^\gamma a_j(t, z)| \le C_{j,\gamma}\,|\det{}^{1/2}M(t, z)|\,\mu(T)^{4j+|\gamma|} \tag{3.28} \]
with the same remark as in (3.26) for the constant $C_{j,\gamma}$.
3.2. Error estimates
Let us denote
\[ R_N^\hbar(t) = (i\hbar\partial_t - \hat H(t))\,I(a^{(N)}(t), \Phi) \tag{3.29} \]
where $a^{(N)}(t) = \sum_{0\le k\le N}\hbar^k a_k$. Using the Duhamel formula, we have
\[ \|U(t) - U^{N,\hbar}(t)\| \le \hbar^{-1}\int_0^t\|R_N^\hbar(s)\|\,ds \tag{3.30} \]
where $t_0 = 0$, $U(t) = U(t, 0)$, $U^{N,\hbar}(t) = I(a^{(N)}(t), \Phi)$. So we have to estimate $R_N^\hbar(t)$. Let us denote by $K^{(N)}(x, y)$ the Schwartz kernel of $R_N^\hbar(t)$ and by $\tilde K^{(N)}(X, Y)$ the Schwartz kernel of $R_N^\hbar(t)$ in the Fourier–Bargmann representation:
\[ \tilde K^{(N)}(X, Y) = \int_{\mathbb{R}^d\times\mathbb{R}^d}K^{(N)}(x, y)\,\varphi_X(y)\,\overline{\varphi_Y(x)}\,dx\,dy. \tag{3.31} \]
Let $\tilde R_N(t)$ be the operator with Schwartz kernel $\tilde K^{(N)}(X, Y)$. The following lemma is well known. Here we forget $N$ and $t$ for simplicity.
Lemma 3.4. We have the $L^2$ norm estimate
\[ \|R\|_{L^2(\mathbb{R}^d)} \le (2\pi\hbar)^{-d}\,\|\tilde R\|_{L^2(\mathbb{R}^{2d})}. \tag{3.32} \]
In particular, we have
\[ \|R\|_{L^2(\mathbb{R}^d)} \le (2\pi\hbar)^{-d}\max\Big\{\sup_Y\int|\tilde K(X, Y)|\,dX,\ \sup_X\int|\tilde K(X, Y)|\,dY\Big\}. \tag{3.33} \]
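The bound (3.33) is a Schur-type estimate: the operator norm of an integral operator is controlled by the larger of the two one-sided suprema of integrals of the kernel. A finite-dimensional analogue can be tested directly. In the sketch below, a hypothetical Gaussian-like matrix plays the role of $\tilde K$, the operator norm is estimated from below by power iteration, and the result is compared with the maximum of the largest absolute row sum and the largest absolute column sum. This is only a discrete illustration with assumed names and data, not the estimate used in the paper.

```python
import math

def schur_bound(K):
    """max(largest absolute row sum, largest absolute column sum)."""
    rows = max(sum(abs(x) for x in row) for row in K)
    cols = max(sum(abs(K[i][j]) for i in range(len(K))) for j in range(len(K[0])))
    return max(rows, cols)

def apply(K, v):
    return [sum(K[i][j] * v[j] for j in range(len(v))) for i in range(len(K))]

def apply_transpose(K, w):
    return [sum(K[i][j] * w[i] for i in range(len(K))) for j in range(len(K[0]))]

def operator_norm(K, iterations=500):
    """Estimate the largest singular value by power iteration on K^T K (real K)."""
    v = [1.0] * len(K[0])
    for _ in range(iterations):
        u = apply_transpose(K, apply(K, v))
        norm = math.sqrt(sum(x * x for x in u))
        v = [x / norm for x in u]
    return math.sqrt(sum(x * x for x in apply(K, v)))

# Hypothetical kernel: entries decay like a Gaussian away from the diagonal.
K = [[math.exp(-((i - j) ** 2) / 4.0) for j in range(8)] for i in range(8)]

if __name__ == "__main__":
    print("operator norm ~", operator_norm(K), "  Schur bound:", schur_bound(K))
```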
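Proof. For inequality (3.32) we use that the Fourier–Bargmann transform is an isometry. Inequality (3.33) is known as the Carleman (or Schur) $L^2$ estimate.
Using Lemma 3.1, we get
\[ \tilde K^{(N)}(X, Y) = 2^{-3d/2}(\pi\hbar)^{-d}\int_{\mathbb{R}^{2d}}\langle\hat T(z_t)\Lambda_\hbar\mathrm{Op}^w_1[R_N(t)]g, \varphi_Y\rangle\,\langle\varphi_X, \varphi_z\rangle\,a^{(N)}(t, z)\,e^{\frac{i}{\hbar}\delta(t,z)}\,dz \tag{3.34} \]
where $\delta(t, z) = S(t, z) + \frac{p\cdot q - p_t\cdot q_t}{2}$. Using the Weyl commutation formula, we have
\[ \langle\varphi_X, \varphi_z\rangle = \exp\Big(-\frac{|X - z|^2}{4\hbar} + \frac{i}{2\hbar}\sigma(X, z)\Big), \tag{3.35} \]
\[ \langle\hat T(z_t)\Lambda_\hbar\mathrm{Op}^w_1[R_N(t)]g, \varphi_Y\rangle = \langle\mathrm{Op}^w_1[R_N(t)]g, g_{\frac{Y - z_t}{\sqrt\hbar}}\rangle. \tag{3.36} \]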
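The overlap formula (3.35) can be verified numerically in dimension $d = 1$ for the standard coherent states ($\Gamma = iI$ in (2.7)). The sketch below evaluates $\langle\varphi_X, \varphi_z\rangle = \int\varphi_X(x)\overline{\varphi_z(x)}\,dx$ by the trapezoid rule and compares it with $\exp\big(-|X - z|^2/(4\hbar) + i\sigma(X, z)/(2\hbar)\big)$, where $\sigma(X, z) = JX\cdot z$. The convention that the inner product is linear in its first argument, as well as the quadrature parameters and the sample points, are assumptions of this sketch.

```python
import cmath
import math

def coherent_state(x, q, p, hbar):
    """phi_z(x) for z = (q, p), Gamma = iI and d = 1, following (2.7)."""
    prefactor = (math.pi * hbar) ** (-0.25)
    return prefactor * cmath.exp(1j * (p * x - p * q / 2) / hbar - (x - q) ** 2 / (2 * hbar))

def overlap_numeric(X, z, hbar, n=4000, half_width=12.0):
    """Trapezoid rule for the integral of phi_X(x) * conj(phi_z(x)) over x."""
    center = 0.5 * (X[0] + z[0])
    L = half_width * math.sqrt(hbar)
    h = 2 * L / n
    total = 0.0
    for k in range(n + 1):
        x = center - L + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * coherent_state(x, X[0], X[1], hbar) \
                        * coherent_state(x, z[0], z[1], hbar).conjugate()
    return total * h

def overlap_formula(X, z, hbar):
    """Right-hand side of (3.35) with sigma(X, z) = JX . z = X_p * z_q - X_q * z_p."""
    sigma = X[1] * z[0] - X[0] * z[1]
    dist2 = (X[0] - z[0]) ** 2 + (X[1] - z[1]) ** 2
    return cmath.exp(-dist2 / (4 * hbar) + 1j * sigma / (2 * hbar))

if __name__ == "__main__":
    X, z, hbar = (0.3, -0.2), (-0.4, 0.5), 0.5
    print(overlap_numeric(X, z, hbar), overlap_formula(X, z, hbar))
```

The Gaussian decay in $|X - z|/\sqrt\hbar$ visible here is what makes the kernel $\tilde K^{(N)}(X, Y)$ integrable in the Schur-type estimate.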
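We know the Wigner function $W_{0,Z}$ of the pair $(g, g_Z)$, $Z \in \mathbb{R}^{2d}$ ([28]):
\[ W_{0,Z}(X) = 2^{2d}\exp\Big(-\Big|X - \frac{Z}{2}\Big|^2 - i\sigma(X, Z)\Big). \tag{3.37} \]
By a well-known property of Weyl quantization ([13]), for any symbol $s$, we have
\[ \langle\mathrm{Op}^w_1[s]g, g_Z\rangle = (2\pi)^{-d}\int_{\mathbb{R}^{2d}}s(X)\,W_{0,Z}(X)\,dX. \tag{3.38} \]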
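We shall use the following lemma.
Lemma 3.5. Let $f \in O_0(2d)$. For every $\gamma \in \mathbb{N}^{2d}$ and $m > 0$ there exists $C_{\gamma,m}$ such that
\[ \Big|\int_{\mathbb{R}^{2d}}X^\gamma f(X)\,e^{-|X - Z|^2 - iJZ\cdot X}\,dX\Big| \le C_{\gamma,m}(1 + |Z|)^{-m}\sup_{|\alpha|\le m+|\gamma|;\ Y\in\mathbb{R}^{2d}}|\partial_Y^\alpha f(Y)|. \tag{3.39} \]
Proof. It is enough to assume $|Z| \ge 1$. We integrate $m$ times by parts with the differential operator
\[ L = \frac{2(X - Z) - iJZ}{4|X - Z|^2 + |JZ|^2}\cdot\partial_X \tag{3.40} \]
using that $(L^\tau)^m = \sum_{|\alpha|\le m}l_{m,\alpha}\partial_X^\alpha$, with $|l_{m,\alpha}| \le C_{m,\alpha}(|Z| + |X - Z|)^{-m}$, where $\theta(X) = -|X - Z|^2 - iJZ\cdot X$ is the exponent appearing in (3.39).
So using Lemma 3.5 we get the following estimate: for every $N$, $N'$ there exists $C_{N,N'}$ (depending only on the semi-norms $|H(t)|_{\infty,\gamma}$, $2 \le |\gamma| \le N + N'$) such that for $X, Y \in \mathbb{R}^{2d}$ and $|t| \le T$ we have
\[ |\tilde K^{(N)}(X, Y)| \le C_{N,N'}\,(\mu(T))^{N+N'}\,\hbar^{\frac{N+1}{2}-d}\int_{\mathbb{R}^{2d}}e^{-\frac{|X - z|^2}{4\hbar}}\Big(1 + \frac{|Y - z_t|}{\sqrt\hbar}\Big)^{-N'}|a^{(N)}(t, z)|\,dz. \tag{3.41} \]
Let us denote $\phi^t_* = \phi^{0,t} = (\phi^t)^{-1}$. We have the Lipschitz estimate, for $|t| \le T$,
\[ |\phi^t_* Y - z| \le \mu(T)\,|Y - z_t|. \tag{3.42} \]
So we get
\[ \int_{\mathbb{R}^{2d}}e^{-\frac{|X - z|^2}{4\hbar}}\Big(1 + \frac{|Y - z_t|}{\sqrt\hbar}\Big)^{-N'}dz \le C_{N'}\,\hbar^{d}\Big(1 + \frac{|\phi^t_* Y - X|}{\sqrt\hbar\,\mu(T)}\Big)^{-N'} \tag{3.43} \]
and
\[ |\tilde K^{(N)}(X, Y)| \le C_{N,N'}\,(\mu(T))^{N+N'}\,\hbar^{\frac{N+1}{2}}\Big(1 + \frac{|\phi^t_* Y - X|}{\sqrt\hbar\,\mu(T)}\Big)^{-N'}\sup_{z\in\mathbb{R}^{2d},\,|t|\le T}|a^{(N)}(t, z)|. \tag{3.44} \]
Then using Lemma 3.4 and choosing $N' > 2d$, we get the following uniform $L^2$ estimate for the remainder term, for $|t| \le T$,
\[ \|R_N(t)\| \le C_N\,(\mu(T))^{N+1}\,\hbar^{(N+1)/2}\sup_{z\in\mathbb{R}^{2d},\,|t|\le T}|a^{(N)}(t, z)|. \tag{3.45} \]
If $T$ is fixed, pushing the expansion up to $2N$ instead of $N$, we easily get Theorem 1.2 using the Duhamel formula.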
Using global estimates on $a_j(t, z)$ obtained from the transport equation (3.28) and pushing the asymptotic expansion up to $2N$, we get the proof of Theorem 1.4 using again the Duhamel formula.
4. Varying Phase. Proof of Theorem 1.3
To avoid technicalities, we fix the time $t$. It would not be difficult to follow a time parameter $t$ if necessary for applications. So in this section, $\phi$ is a symplectic diffeomorphism in $\mathbb{R}^{2d}$, such that $\phi$, $\phi^{-1}$ are Lipschitz continuous and $\phi \in O_1(2d)$. We denote $z = (q, p) \in \mathbb{R}^{2d}$, $\phi(z) = (Q(z), P(z)) \in \mathbb{R}^d\times\mathbb{R}^d$ and $S$ an action for $\phi$, i.e. a primitive on $\mathbb{R}^{2d}$ of the closed 1-form $P\,dQ - p\,dq$. We consider the following phases
\[ \Phi^{(\phi,\Theta,\Gamma)}(z; x, y) = S(z) + P\cdot(x - Q) - p\cdot(y - q) + \frac{1}{2}\big(\Theta(x - Q)\cdot(x - Q) - \bar\Gamma(y - q)\cdot(y - q)\big). \tag{4.1} \]
This class of Fourier-Integral Operators with complex quadratic phase was already analyzed in [29]. We want to show here how to vary the choice of the matrices $\Theta$, $\Gamma$ for a given canonical transformation $\phi$ of $\mathbb{R}^{2d}$. As in Sec. 3, let us denote by $I(a, \Phi)$ the operator with the Schwartz kernel
\[ K_a(x, y) = (2\pi\hbar)^{-3d/2}\int_{\mathbb{R}^{2d}}e^{\frac{i}{\hbar}\Phi^{(\phi,\Theta,\Gamma)}(z;x,y)}\,a(z)\,dz \tag{4.2} \]
where $a \in O_0(2d)$, $\Phi = \Phi^{(\phi,\Theta,\Gamma)}$. Using a Fourier–Bargmann transform and the following estimate: there exist $C > 0$, $c > 0$ such that for all $X \in \mathbb{R}^{2d}$, we have
\[ |\langle\varphi^\Gamma, \varphi_X\rangle| \le C\exp\Big(-\frac{c|X|^2}{\hbar}\Big), \tag{4.3} \]
we can estimate the Fourier–Bargmann transform $\tilde K_a(X, Y)$ of $K_a$ and prove that $I(a, \Phi)$ is bounded in $L^2(\mathbb{R}^d)$ (see Sec. 3, Lemma 3.5 and Sec. 5 below). Our goal in this section is to prove the following result, which gives Theorem 1.3 as a particular case.
Proposition 4.1. Let $\Theta, \Theta', \Gamma, \Gamma'$ be four matrices in $\Sigma_+(d)$ and $a \in O_0(2d)$. $\Theta$, $\Theta'$ may be $z$ dependent such that
\[ \exists c > 0, \quad \Im\Theta^{(\prime)}v\cdot v \ge c|v|^2, \qquad \forall z \in \mathbb{R}^{2d}, \tag{4.4} \]
\[ \forall\gamma,\ |\gamma| \ge 1,\ \exists C_\gamma, \quad |\partial_z^\gamma\Theta^{(\prime)}| \le C_\gamma, \qquad \forall z \in \mathbb{R}^{2d}. \tag{4.5} \]
Then there exists a semi-classical symbol $a' \sim \sum_j\hbar^j a'_j$ of order 0 such that we have, for the $L^2$ operator norm,
\[ I(a, \Phi^{(\phi,\Theta,\Gamma)}) = I(a', \Phi^{(\phi,\Theta',\Gamma')}) + O(\hbar^\infty). \tag{4.6} \]
Moreover we have for the principal symbol $a'_0$ the formula

$$a'_0(z) = a_0(z)\,\frac{\det^{1/2}(M(1))}{\det^{1/2}(M(0))} \tag{4.7}$$

where $M(s) := C + D\bar\Gamma - ((1 - s)\Theta + s\Theta')(A + B\bar\Gamma)$.

Proof. The method is rather simple and is an extension of what we have already done for quadratic Hamiltonians (Proof I), except that here we have to solve transport equations in the deformation parameter $s$ to get the lower order correction terms. Let us remark that this class of Fourier-Integral Operators is closed under adjointness:

$$I(a, \Phi^{(\Theta,\Gamma)})^* = I(a^*, \Phi^*), \tag{4.8}$$

where $a^*(Z) = \bar a(\phi^{-1}Z)$, $Z = (Q,P)$, $Z = \phi(z)$ and

$$\Phi^*(Z;x,y) = -S(\phi^{-1}Z) + p\cdot(x - q) - P\cdot(y - Q) + \frac{1}{2}\left(\Gamma(x - q)\cdot(x - q) - \bar\Theta(y - Q)\cdot(y - Q)\right). \tag{4.9}$$

So by transitivity we can assume that $\Gamma = \Gamma'$. As in the quadratic Hamiltonian case, let us introduce $\Theta_s = (1 - s)\Theta + s\Theta'$, $\Phi^{(s)} = \Phi^{(\Theta_s,\Gamma)}$, $0 \le s \le 1$, and look for a semiclassical symbol $a^{(s)} = \sum_j \hbar^j a_j^{(s)}$ such that

$$\frac{\partial}{\partial s} \int_{\mathbb{R}^{2d}} e^{\frac{i}{\hbar}\Phi^{(s)}(z;x,y)}\,a^{(s)}(z)\,dz = O(\hbar^\infty), \quad \forall s \in [0,1]. \tag{4.10}$$

However, we have

$$\frac{\partial}{\partial s}\Phi^{(s)}(z;x,y) = \frac{1}{2}(\Theta' - \Theta)(x - Q)\cdot(x - Q) \tag{4.11}$$

and we have to find a $C^1$ family of symbols $a^{(s)}$, $0 \le s \le 1$, such that

$$I\left(\partial_s a^{(s)} + \frac{i}{2\hbar}(\Theta' - \Theta)(x - Q)\cdot(x - Q)\,a^{(s)},\ \Phi^{(s)}\right) = O(\hbar^\infty). \tag{4.12}$$

The principal term $a'_0 = a_0^{(1)}$ is computed as in the quadratic case. Let us suppose for a moment that $\Theta$, $\Theta'$ are constant. Then as in the quadratic case we have

$$(\partial_q + \bar\Gamma\partial_p)\Phi^{(s)} = \left(C^\tau + \bar\Gamma D^\tau - (A^\tau + \bar\Gamma B^\tau)\Theta_s\right)(x - Q) \tag{4.13}$$

where $A = \partial_q Q$, $B = \partial_p Q$, $C = \partial_q P$, $D = \partial_p P$ and $F = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$ is a symplectic matrix. We know that $M(s) := C + D\bar\Gamma - \Theta_s(A + B\bar\Gamma)$ is invertible, so we can integrate by parts as in Sec. 3, and as above we can achieve the proof of Proposition 4.1.
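When $\Theta$, $\Theta'$ are $z$ dependent, the integrations by parts are more tricky. We have to use

$$(\partial_q + \bar\Gamma\partial_p)\Phi^{(s)} = M^\tau(s,z)(x - Q) + N(s,z)(x - Q, x - Q) \tag{4.14}$$

where $N(s,z)(x,y)$ is a bilinear application in $(x,y) \in \mathbb{R}^d \times \mathbb{R}^d$ into $d \times d$ matrices, with coefficients in $\mathcal{O}_0$ in $z$, $C^1$ in $s$. Hence we have

$$(x - Q)\,e^{\frac{i}{\hbar}\Phi^{(\Theta,\Gamma)}} = \frac{\hbar}{i}(M^\tau)^{-1}(s,z)\left(\partial_q + \bar\Gamma\partial_p\right)e^{\frac{i}{\hbar}\Phi^{(\Theta,\Gamma)}} - (M^\tau)^{-1}(s,z)\,N(s,z)(x - Q, x - Q)\,e^{\frac{i}{\hbar}\Phi^{(\Theta,\Gamma)}}. \tag{4.15}$$

So we apply (4.15) and the following lemmas to proceed as in Sec. 3.

Lemma 4.2. For any symbol $b \in \mathcal{O}_0(2d)$, for every multiindex $\alpha \in \mathbb{N}^{2d}$ and every $N \ge |\alpha|/2$ we have

$$\int_{\mathbb{R}^{2d}} (x - Q)^\alpha e^{\frac{i}{\hbar}\Phi^{(s)}} b(z)\,dz = \sum_{\frac{|\alpha|}{2} \le |\beta| \le N} \hbar^{|\beta|} \int_{\mathbb{R}^{2d}} f_{\alpha,\beta}(s,z)\,e^{\frac{i}{\hbar}\Phi^{(s)}}\,\partial^\beta b(z)\,dz + \sum_{|\beta|+|\gamma|=N+1,\ |\beta|\ge 1} \hbar^{|\gamma|} \int_{\mathbb{R}^{2d}} g_{\alpha,\beta}(s,z)\,(x - Q)^\beta\,e^{\frac{i}{\hbar}\Phi^{(s)}}\,g_{\beta,\gamma}\,\partial^\gamma b(z)\,dz \tag{4.16}$$

where $f_{\alpha,\beta}(s,z)$, $g_{\alpha,\beta}(s,z)$ are symbols of order 0, uniformly bounded in $\mathcal{O}_0(2d)$ for $s \in [0,1]$.

Lemma 4.3. For every $b \in \mathcal{O}_0(2d)$ and $\beta \in \mathbb{N}^d$ we have the crude $L^2$ estimate, uniform in $s \in [0,1]$,

$$\|I((x - Q)^\beta b, \Phi^{(s)})\| = O(\hbar^{|\beta|/2}). \tag{4.17}$$

Using these two lemmas we get the full semiclassical symbol $a' \sim \sum_j \hbar^j a'_j$, where

$$a'_0(z) = a_0\,\frac{\det^{1/2}(M(s))}{\det^{1/2}(M(0))} \tag{4.18}$$

and for $j \ge 1$, $a'_j$ is computed by induction as the solution at $s = 1$ of the differential equation

$$\partial_s a_j(s) = \operatorname{Tr}\left(\dot M(s)M^{-1}(s)\right)a_j(s) + b_j(s), \quad a_j(0) = a_j, \tag{4.19}$$

where $b_j(s)$ depends on the $a_k(s)$, $k \le j - 1$.

Remark 4.4. Considering the adjoint operator, it is possible to exchange the roles of the matrices $\Theta$ and $\Gamma$. If the symbol $a$ depends smoothly on some parameter $\lambda$, it is not difficult to show that $a'$ also depends smoothly on $\lambda$.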
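The homogeneous part of a transport equation of the type (4.19) can be integrated in closed form through Jacobi's formula, $\frac{d}{ds}\log\det M(s) = \operatorname{Tr}(\dot M(s)M^{-1}(s))$. The following sketch checks numerically that the equation $a'(s) = \frac{1}{2}\operatorname{Tr}(\dot M M^{-1})\,a(s)$, $a(0) = 1$, yields $a(1) = \det^{1/2}M(1)/\det^{1/2}M(0)$, which is the ratio appearing in (4.7). The factor $\frac{1}{2}$, the real $2\times 2$ family `M(s)` and all function names are illustrative assumptions and are not taken from the paper; the factor $\frac{1}{2}$ is simply what produces the square root of the determinant ratio.

```python
import math

def M(s):
    # Hypothetical affine family M(s) = M0 + s*M1 of real 2x2 matrices,
    # chosen so that det M(s) = s^2 + 4s + 6 stays positive on [0, 1].
    return [[2.0 + s, 1.0], [s, 3.0 + s]]

M_DOT = [[1.0, 0.0], [1.0, 1.0]]  # dM/ds, constant because M is affine in s

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def trace_term(s):
    # Tr(M'(s) M(s)^{-1}) = sum_{i,j} M'_{ij} (M^{-1})_{ji}
    inv = inv2(M(s))
    return sum(M_DOT[i][j] * inv[j][i] for i in range(2) for j in range(2))

def solve_transport(steps=200):
    # Classical RK4 for a'(s) = (1/2) Tr(M' M^{-1}) a(s), a(0) = 1, on [0, 1].
    def rhs(s, a):
        return 0.5 * trace_term(s) * a
    a, h = 1.0, 1.0 / steps
    for k in range(steps):
        s = k * h
        k1 = rhs(s, a)
        k2 = rhs(s + h / 2, a + h * k1 / 2)
        k3 = rhs(s + h / 2, a + h * k2 / 2)
        k4 = rhs(s + h, a + h * k3)
        a += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return a

numeric = solve_transport()
closed_form = math.sqrt(det2(M(1.0)) / det2(M(0.0)))
print(numeric, closed_form)
```

With an inhomogeneous term $b_j(s)$ the same integrating factor applies, by variation of constants.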
Proof of Lemma 4.2. This is done by an induction on $N$ such that $|\alpha| \le N$.

Proof of Lemma 4.3. Let us begin by giving a simple proof of (4.3) when $\Theta$ is $z$ dependent satisfying the assumptions (4.4) and (4.5) of Proposition 4.1. We shall prove the more general estimate: for every $\alpha \in \mathbb{N}^d$ there exist $C > 0$, $c > 0$ such that

$$|\langle x^\alpha g^\Theta, g_Y\rangle| \le C e^{-c|Y|^2}, \quad \forall Y \in \mathbb{R}^{2d}. \tag{4.20}$$

Let us denote $Y = (y,\eta) \in \mathbb{R}^d \times \mathbb{R}^d$. By a direct estimate we easily get

$$|\langle x^\alpha g^\Theta, g_Y\rangle| \le C e^{-2c|y|^2}, \quad \forall (y,\eta) \in \mathbb{R}^{2d}. \tag{4.21}$$

Using the Fourier transform and the Plancherel formula, we exchange $y$ and $\eta$ and we get (4.20). Now we can follow the method of Sec. 3 to estimate the $L^2$ norm of operators using a Fourier–Bargmann transformation.

Let $\tilde K(X,Y)$ be the Fourier–Bargmann kernel of $I((x - Q)^\beta b, \Phi^{(s)})$. We have

$$\tilde K(X,Y) = 2^{-3d/2}(\pi\hbar)^{-d}\,\hbar^{|\alpha|/2} \int_{\mathbb{R}^{2d}} \langle \hat T(Z)\Lambda_\hbar(x^\beta g^\Theta), \varphi_Y\rangle\,\langle \varphi_X, \varphi_z\rangle\,b(z)\,e^{\frac{i}{\hbar}\delta(t,z)}\,dz \tag{4.22}$$

where $Z = (Q,P) = \phi(z)$ and

$$|\langle \hat T(Z)\Lambda_\hbar(x^\beta g^\Theta), \varphi_Y\rangle| = |\langle x^\beta g^\Theta, g_{\frac{Y - Z}{\sqrt{\hbar}}}\rangle|. \tag{4.23}$$

So we get

$$|\tilde K(X,Y)| \le C\hbar^{|\alpha|/2} \int_{\mathbb{R}^{2d}} \exp\left(-\frac{c}{\hbar}\left(|Y - \phi(z)|^2 + |X - z|^2\right)\right)dz. \tag{4.24}$$

Using that $\phi$ is a Lipschitz canonical transformation, we have, for $C_0$ large enough and $c_0 > 0$ small enough,

$$|\tilde K(X,Y)| \le C_0\hbar^{|\alpha|/2} \exp\left(-\frac{c_0}{\hbar}|Y - \phi(X)|^2\right). \tag{4.25}$$

Hence we get the proof of Lemma 4.3 using Lemma 3.4. We have proved Proposition 4.1 and Theorem 1.3.

5. Semiclassical Fourier Integral Operators

In [23, 8] and in the recent preprint [30], the authors have considered Fourier-Integral Operators defined by the following simpler phase

$$\Psi^{(\phi,\Theta)}(p;x,y) = S(y,p) + P\cdot(x - Q) + \frac{1}{2}\Theta(x - Q)\cdot(x - Q) \tag{5.1}$$

where $(Q,P) = \phi(y,p)$, $\phi$ is a bilipschitz canonical transformation like above, $\Theta \in$ Σ+(d).
In [23, 8] the authors have proved semiclassical expansions for the propagator of the Schrödinger equation for initial data with compact support. This result is extended in [30], for the Schrödinger Hamiltonian $-\hbar^2\Delta + V$, to general data in $L^2$ with uniform norm estimates. We shall give here some extensions of results of [30] using the same techniques as in Secs. 3 and 4, so we shall not repeat the details. Let us denote $J(a, \Psi^{\phi,\Theta})$ the operator whose Schwartz kernel is

$$K(x,y) = (2\pi\hbar)^{-d} \int_{\mathbb{R}^d} e^{\frac{i}{\hbar}\Psi^{(\phi,\Theta)}(p;x,y)}\,a(y,p)\,dp. \tag{5.2}$$

A natural question discussed in this section is to compare the Fourier-Integral Operators $I(a, \Phi^{(\phi,\Theta,\Gamma)})$ defined with $2d$ “frequency variables” and $J(a, \Psi^{(\phi,\Theta)})$ defined with $d$ “frequency variables”. A Fourier integral operator in $L^2(\mathbb{R}^d)$ is always a quantization of a canonical transformation $\phi$ in the cotangent space $T^*(\mathbb{R}^d)$. A nice way to make this relationship clear is to use a Fourier–Bargmann transform (see [7, 31]). This can be easily done in the same way for Semiclassical Fourier-Integral Operators as we shall see now.

Definition 5.1. A family of operators, depending on a small parameter $\hbar \in\, ]0,1]$, $U_\hbar : \mathcal{S}(\mathbb{R}^d) \to \mathcal{S}'(\mathbb{R}^d)$ is a Semiclassical Fourier-Integral Operator of order $m \in \mathbb{R}$ associated to the canonical bilipschitz transformation $\phi : T^*(\mathbb{R}^d) \to T^*(\mathbb{R}^d)$, if for every $N$ we have $U_\hbar = U_N + R_N$ where $U_N : \mathcal{S}(\mathbb{R}^d) \to \mathcal{S}'(\mathbb{R}^d)$ and $\|R_N\| = O(\hbar^N)$ and for every $N' \ge 0$ there exists $C_{N'}$ such that

$$|\tilde K_N(X,Y)| \le C_{N'}\,\hbar^{m - 3d/2}\left(1 + \frac{|Y - \phi(X)|}{\sqrt{\hbar}}\right)^{-N'}, \quad \forall X, Y \in \mathbb{R}^{2d},\ \hbar \in\, ]0,1], \tag{5.3}$$

where $\tilde K_N(X,Y)$ is the Schwartz kernel of $F_B U_N F_B^*$.

Remark 5.2.
(1) In this definition, which coincides with a definition given in [31] for $\hbar = 1$, a Semiclassical Fourier-Integral Operator has, up to a negligible operator in $\hbar$, a kernel living in a neighborhood of the graph of a canonical transformation $\phi$. But this definition says nothing concerning the asymptotic expansion of $\tilde K(X,Y)$ in a neighborhood of the graph of $\phi$ when $\hbar$ is small. So this definition is certainly too permissive. But for fixed $\hbar$ it is suitable as proven in [31].
(2) Using the Carleman–Schur estimate, a Semiclassical Fourier-Integral Operator of order 0 is uniformly bounded in $L^2(\mathbb{R}^d)$. This is a straightforward consequence of the definition. This class of Semiclassical Fourier-Integral Operators of order 0 is clearly closed under composition.
(3) In Definition 5.1, it is equivalent to use any Fourier–Bargmann transformation $F_B^{(\Gamma)}$, $\Gamma \in$ Σ+(d).
(4) There are other definitions of Semiclassical Fourier-Integral Operators using Lagrangian analysis and real phase functions. For this point of view, see for example [1].
(5) Fourier-Integral Operators with complex phase were used to study propagation of singularities of P.D.E. Many papers and books have been published on this subject; among them let us point out [2, 26, 32].

Now we shall see that the operators already considered in this paper are Semiclassical Fourier-Integral Operators.

Proposition 5.3. Let $a = a(x,z)$, $a \in \mathcal{O}_0(3d)$, and $u = u(x,y,p)$, $u \in \mathcal{O}_0(3d)$, be amplitudes and $\Theta, \Gamma \in$ Σ+(d), where $\Theta$ may depend on $z$ or $(y,p)$, such that (1.16), (1.17) are satisfied. Then $I(a, \Phi^{(\phi,\Theta,\Gamma)})$ and $J(u, \Psi^{\phi,\Theta})$ are Semiclassical Fourier-Integral Operators of order 0.

Proof. Concerning $I(a, \Phi^{(\phi,\Theta,\Gamma)})$, we get the result following Sec. 3.2, estimate (3.44). The proof for $J(u, \Psi^{\phi,\Theta})$ is almost the same. For simplicity we assume $\Theta$ constant. For $\Theta$ depending on $(y,p)$ we could proceed as in Sec. 4.

Let us denote $X = (\tilde x, \tilde\xi)$, $Y = (\tilde y, \tilde\eta)$. We want to estimate

$$\tilde K(X,Y) = (2\pi\hbar)^{-d} \int_{\mathbb{R}^{3d}} e^{\frac{i}{\hbar}\tilde\Phi}\,u(x,y,p)\,dp\,dx\,dy \tag{5.4}$$

where

$$\tilde\Phi = S(y,p) + P\cdot(x - Q) + \frac{\Theta}{2}(x - Q)\cdot(x - Q) + \frac{i}{2}(\tilde x - y)\cdot(\tilde x - y) + \tilde\xi\cdot(\tilde x - y) + \frac{i}{2}(\tilde y - x)\cdot(\tilde y - x) + \tilde\eta\cdot(\tilde y - x). \tag{5.5}$$

Let us remark that we have $F^{-1} = \begin{pmatrix} D^\tau & -B^\tau \\ -C^\tau & A^\tau \end{pmatrix}$ if $F = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$ is symplectic. So, because $F$ is symplectic, we know that $D^\tau - B^\tau\Theta$ is invertible. Hence we have

$$\partial_y\tilde\Phi = (C^\tau - A^\tau\Theta)(x - Q) + (\tilde\xi - p) + i(\tilde x - y), \tag{5.6}$$

$$\partial_p\tilde\Phi = (D^\tau - B^\tau\Theta)(x - Q), \tag{5.7}$$

$$\partial_x\tilde\Phi = \Theta(x - Q) + (P - \tilde\eta) + i(\tilde y - x). \tag{5.8}$$

So we get the necessary estimates on $\tilde K$ by integrations by parts using

$$\partial_y\tilde\Phi - (-A^\tau\Theta + C^\tau)(D^\tau - B^\tau\Theta)^{-1}\partial_p\Psi = (\tilde\xi - p) + i(\tilde x - y), \tag{5.9}$$

$$\partial_x\tilde\Phi - \Theta(D^\tau - B^\tau\Theta)^{-1}\partial_p\Psi = (P - \tilde\eta) + i(\tilde y - x). \tag{5.10}$$
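The block formula for the inverse of a symplectic matrix used in the proof above can be checked on a concrete example. The sketch below builds a $4\times 4$ symplectic matrix $F$ (so $d = 2$) as a product of three elementary symplectic factors, verifies that $\begin{pmatrix} D^\tau & -B^\tau \\ -C^\tau & A^\tau \end{pmatrix}$ is indeed $F^{-1}$, and evaluates $\det(D^\tau - B^\tau\Theta)$ for the choice $\Theta = iI$. The specific matrices `S1`, `S2`, `G`, the choice $\Theta = iI$ and all helper names are hypothetical and serve only as an illustration.

```python
def matmul(a, b):
    # Plain matrix product for lists of rows.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def neg(a):
    return [[-x for x in row] for row in a]

def block(a, b, c, d):
    # Assemble the block matrix [[a, b], [c, d]].
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

I2 = [[1.0, 0.0], [0.0, 1.0]]
Z2 = [[0.0, 0.0], [0.0, 0.0]]
S1 = [[1.0, 2.0], [2.0, -1.0]]        # symmetric (hypothetical)
S2 = [[0.5, 0.0], [0.0, 1.5]]         # symmetric (hypothetical)
G = [[2.0, 1.0], [1.0, 1.0]]          # invertible, det G = 1 (hypothetical)
G_INV_T = [[1.0, -1.0], [-1.0, 2.0]]  # transpose of the inverse of G

# Product of elementary symplectic matrices:
# [[I, S1], [0, I]] * [[I, 0], [S2, I]] * [[G, 0], [0, G^{-T}]]
F = matmul(matmul(block(I2, S1, Z2, I2), block(I2, Z2, S2, I2)),
           block(G, Z2, Z2, G_INV_T))

A = [row[:2] for row in F[:2]]
B = [row[2:] for row in F[:2]]
C = [row[:2] for row in F[2:]]
D = [row[2:] for row in F[2:]]

# Candidate inverse from the block formula.
F_inv = block(transpose(D), neg(transpose(B)), neg(transpose(C)), transpose(A))
product = matmul(F, F_inv)
residual = max(abs(product[i][j] - (1.0 if i == j else 0.0))
               for i in range(4) for j in range(4))

# D^T - B^T * Theta with Theta = i * I.
Dt, Bt = transpose(D), transpose(B)
m = [[Dt[i][j] - 1j * Bt[i][j] for j in range(2)] for i in range(2)]
det_m = m[0][0] * m[1][1] - m[0][1] * m[1][0]
print(residual, det_m)
```

For this example $D^\tau - iB^\tau = G^{-1}(I - iS_1)$, whose determinant equals $1 + 5 = 6$, so the matrix is invertible as claimed.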
The following result is a slight generalization of [23, 8, 30].

Theorem 5.4. Under the assumptions of Theorem 1.2 and (1.16), (1.17), we have

$$K(t;x,y) = (2\pi\hbar)^{-d} \int_{\mathbb{R}^d} e^{\frac{i}{\hbar}\psi^{(\phi_t,\Theta_t)}(t,y,p,x)}\,u(\hbar;t,y,p)\,dp \tag{5.11}$$

where $u(\hbar;t,y,p) \asymp \sum_{0\le j<+\infty} u_j(t;y,p)\,\hbar^j$ has the same meaning as in Theorem 1.2.
In particular

$$u_0(t,y,p) = \det{}^{1/2}(D - \Theta B). \tag{5.12}$$

Sketch of Proof. This result can be proved following the same strategy as for proving Theorem 1.3. We first prove the theorem for some $\Theta$ ($\Theta = iI$), following the proof of Theorem 1.2. Then we can get the theorem for any $\Theta$ by the variation argument as in the proof of Theorem 1.3. The $L^2$ estimate for the operator norm of Fourier-Integral Operators is used to control the remainder terms.

Remark 5.5. It is not difficult to adapt the proof of Theorem 1.4 concerning an Ehrenfest time estimate to the setting of Theorem 5.4.

References

[1] I. Alexandrova, Semi-classical wave front set and Fourier-Integral Operators, Canad. J. Math. 60 (2008) 241–263.
[2] V. M. Babich and V. S. Buldreyev, Asymptotic Methods in Short Waves Diffraction Problem (Moscow Nauka, 1972, in Russian); (Springer, 1991, English translation).
[3] V. Bargmann, On the Hilbert space of analytic functions and associated integral transform, Comm. Pure Appl. Math. 14 (1961) 187–214.
[4] J. M. Bily, Propagation d'états cohérents et applications, Ph.D. thesis, Université de Nantes (2001).
[5] J. M. Bily and D. Robert, The semi-classical Van–Vleck formula. Application to the Aharonov–Bohm effect, in Long Time Behaviour of Classical and Quantum Systems, Proceedings of the Bologna APTEX International Conference, Bologna, Italy, September 13–17, 1999 (World Scientific, 2001), pp. 89–106.
[6] A. Bouzouina and D. Robert, Uniform semiclassical estimates for the propagation of quantum observables, Duke Math. J. 111(2) (2002) 223–252.
[7] J. M. Bony, Evolution equations and microlocal analysis, in Hyperbolic Problems and Related Topics, Grad. Ser. Anal. (International Press, 2003), pp. 17–40.
[8] J. Butler, Global h Fourier integral operators with complex-valued phase functions, Bull. London Math. Soc. 34(4) (2002) 479–489.
[9] M. Combescure and D. Robert, Semiclassical spreading of quantum wave packets and applications near unstable fixed points of the classical flow, Asymptot. Anal. 14 (1997) 377–404.
[10] M. Combescure and D. Robert, Quadratic quantum Hamiltonians revisited, Cubo 8(1) (2006) 61–86.
[11] A. Córdoba and C. Fefferman, Wave packets and Fourier Integral Operators, Comm. Partial Differential Equations 3(11) (1978) 979–1005.
[12] B. Fedosov, Deformation Quantization and Index Theory, Mathematical Topics, Vol. 9 (Akademie Verlag, 1996).
[13] G. B. Folland, Harmonic Analysis in Phase Space, Annals of Mathematics Studies, Vol. 122 (Princeton University Press, Princeton, NJ, 1989).
[14] D. Fujiwara, A construction of the fundamental solution for the Schrödinger equation, J. Anal. Math. 35 (1979) 41–96.
[15] G. Hagedorn, Semiclassical quantum mechanics I: The limit for coherent states, Comm. Math. Phys. 71 (1980) 77–93.
[16] G. Hagedorn and A. Joye, Exponentially accurate semi-classical dynamics: Propagation, localization, Ehrenfest times, scattering and more general states, Ann. Henri Poincaré 1(5) (2000) 837–883.
[17] E. J. Heller, Time-dependent approach to semiclassical dynamics, J. Chem. Phys. 62(4) (1975) 1544–1555.
[18] E. J. Heller, Frozen Gaussians: A very simple semiclassical approximation, J. Chem. Phys. 75(6) (1981) 2923–2931.
[19] M. F. Herman and E. Kluk, A semiclassical justification for the use of non-spreading wavepackets in dynamics calculations, Chem. Phys. 91(1) (1984) 27–34.
[20] L. Hörmander, The Analysis of Linear Partial Differential Operators I (Springer-Verlag, 1983).
[21] K. Kay, Integral expressions for the semi-classical time-dependent propagator, J. Chem. Phys. 100(6) (1994) 4377–4392.
[22] K. Kay, The Herman–Kluk approximation: Derivation and semiclassical corrections, Chem. Phys. 322 (2006) 3–12.
[23] A. Laptev and I. M. Sigal, Global Fourier Integral Operators and semiclassical asymptotics, Rev. Math. Phys. 12(5) (2000) 749–766.
[24] R. Littlejohn, The semiclassical evolution of wave packets, Phys. Rep. 138(4–5) (1986) 193–291.
[25] A. Martinez, An Introduction to Semiclassical and Microlocal Analysis, Universitext (Springer-Verlag, 2002).
[26] J. Ralston, Gaussian beams and propagation of singularities, in Studies in Partial Differential Equations, MAA Stud. Math., Vol. 23 (Math. Assoc. America, 1982), pp. 246–248.
[27] D. Robert, Autour de l'approximation Semi-Classique, Progress in Mathematics, No. 68 (Birkhäuser, 1987).
[28] D. Robert, Propagation of coherent states in quantum mechanics and applications, in Partial Differential Equations and Applications, Sémin. Congr., Vol. 15 (Soc. Math. France, 2007), pp. 181–250.
[29] V. Rousse and T. Swart, A mathematical justification for the Herman–Kluk propagator, Comm. Math. Phys. 286 (2009) 725–750.
[30] V. Rousse, Semiclassical simple initial value representations, Université Paris 12 (2009), arXiv:0904.0387.
[31] D. Tataru, Phase space transforms and microlocal analysis, in Phase Space Analysis of Partial Differential Equations, Publ. Cent. Ri. Mat Ennio Giorgi, Vol. II (Scuola Norm. Sup. Pisa, 2004), pp. 505–524.
[32] J. Sjöstrand, Singularités analytiques microlocales, Astérisque 95 (1982) 1–166.
[33] T. Swart, Initial value representation, Ph.D. thesis, Freie Universität Berlin (2008).
November 16, J070-S0129055X10004168
2010 15:27 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 22, No. 10 (2010) 1147–1179 © World Scientific Publishing Company DOI: 10.1142/S0129055X10004168
ALMOST ADDITIVE THERMODYNAMIC FORMALISM: SOME RECENT DEVELOPMENTS
LUIS BARREIRA
Departamento de Matemática, Instituto Superior Técnico, 1049-001 Lisboa, Portugal
[email protected]

Received 9 November 2009
Revised 13 July 2010

This is a survey on recent developments concerning a thermodynamic formalism for almost additive sequences of functions. While the nonadditive thermodynamic formalism applies to much more general sequences, at the present stage of the theory there are no general results concerning, for example, a variational principle for the topological pressure or the existence of equilibrium or Gibbs measures (at least without further restrictive assumptions). On the other hand, in the case of almost additive sequences, it is possible to establish a variational principle and to discuss the existence and uniqueness of equilibrium and Gibbs measures, among several other results. After presenting in a self-contained manner the foundations of the theory, the survey includes the description of three applications of the almost additive thermodynamic formalism: a multifractal analysis of Lyapunov exponents for a class of nonconformal repellers; a conditional variational principle for limits of almost additive sequences; and the study of dimension spectra that consider simultaneously limits into the future and into the past.

Keywords: Almost additive sequences; thermodynamic formalism.

Mathematics Subject Classification 2010: 37C45, 37D20, 37D35
1. Introduction

The point of departure for this survey is the nonadditive thermodynamic formalism developed in [1], having in mind certain applications to the dimension theory of dynamical systems, as detailed below. Our main aim is to survey some recent developments in the particular case of almost additive sequences of functions.

During the last two decades, the dimension theory of dynamical systems progressively developed into an independent field of research, roughly speaking with the objective of measuring the complexity from the dimensional point of view of the objects that remain invariant under the dynamics, such as the invariant sets and measures. The first monograph that clearly took this point of view was Pesin’s book ([36]), which describes the state-of-the-art up to 1997. We refer to our book ([4]) for a detailed description of many of the more recent results in the area.
The nonadditive thermodynamic formalism is a generalization of the classical thermodynamic formalism, in which the topological pressure P (ϕ) of a continuous function ϕ (with respect to a given dynamics on a compact metric space), is replaced by the topological pressure P (Φ) of a sequence of continuous functions Φ = (ϕn )n . The classical pressure P (ϕ) was introduced by Ruelle in [39] for expansive maps (see also his book [40]), and by Walters in [46] in the general case. For arbitrary sets (not necessarily compact), the nonadditive topological pressure also generalizes (and imitates) the notion of topological pressure introduced by Pesin and Pitskel in [37], which is equivalent to the notion introduced earlier by Bowen in [13] (see [37]). The nonadditive thermodynamic formalism contains as a particular case a new formulation of the subadditive thermodynamic formalism earlier introduced by Falconer in [19]. The main motivation behind the nonadditive thermodynamic formalism is to allow certain applications to a more general class of invariant sets in the context of the dimension theory of dynamical systems. We first recall that the unique solution s of the equation P (sϕ) = 0,
(1)
where $\varphi$ is a certain function associated to a given invariant set, is often related to the Hausdorff dimension of the set. Equation (1) was introduced by Bowen in [15] (in his study of quasi-circles) and is usually called Bowen’s equation. It is also appropriate to call it Bowen–Ruelle’s equation, taking into account the fundamental role of the thermodynamic formalism developed by Ruelle, and of his article [41]. Virtually all known equations used to compute or to estimate the dimension of invariant sets are particular cases of Eq. (1) or of an appropriate generalization. We recommend [42] for a quite detailed and informative related discussion.

On the other hand, in certain applications of dimension theory (we refer to the examples in [1, 4]), one is naturally led to consider sequences $\Phi = (\varphi_n)_n$ that may satisfy no additivity between the functions $\varphi_n$. The nonadditive topological pressure and its associated thermodynamic formalism allow us to consider these generalizations in a unified framework. In particular, this made it possible to establish in [1] sharp lower and upper dimension estimates for repellers and hyperbolic sets, including for a class of nondifferentiable maps, without further effort. The dimension estimates are obtained as solutions of appropriate generalizations of Eq. (1) now involving the nonadditive topological pressure.

Given a continuous function $\varphi : X \to \mathbb{R}$ in a compact metric space $X$, the classical topological pressure of $\varphi$, with respect to a continuous map $f : X \to X$, satisfies the variational principle

$$P(\varphi) = \sup_\mu \left( h_\mu(f) + \int_X \varphi\,d\mu \right),$$

where $h_\mu(f)$ is the Kolmogorov–Sinai entropy of $f$ with respect to the measure $\mu$, and where the supremum is taken over all $f$-invariant probability measures on $X$.
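To make Bowen's equation (1) concrete, the following sketch solves $P(s\varphi) = 0$ by bisection in a classical special case that is not spelled out in the survey and is assumed here: a piecewise-linear expanding map with full branches and contraction ratios $r_i \in (0,1)$, for which the potential $\varphi = \log r_i$ on the $i$-th branch has pressure $P(s\varphi) = \log \sum_i r_i^s$. The function names and the chosen ratios are hypothetical. For the middle-third Cantor set (two branches of ratio $1/3$) the root should be $\log 2/\log 3$.

```python
import math

def pressure(s, ratios):
    # Assumed closed form P(s*phi) = log(sum_i r_i^s) for a locally constant
    # potential phi = log r_i on a full shift (classical special case).
    return math.log(sum(r ** s for r in ratios))

def solve_bowen(ratios, lo=0.0, hi=1.0, tol=1e-12):
    # s -> P(s*phi) is strictly decreasing when all ratios lie in (0, 1),
    # so the root of P(s*phi) = 0 can be bracketed and bisected.
    # Requires pressure(lo) >= 0 >= pressure(hi).
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if pressure(mid, ratios) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

cantor_dim = solve_bowen([1 / 3, 1 / 3])
mixed_dim = solve_bowen([1 / 2, 1 / 4])
print(cantor_dim, mixed_dim)
```

For the ratios $1/2$ and $1/4$, the substitution $t = 2^{-s}$ reduces the equation to $t + t^2 = 1$, so the root is $\log_2\frac{1+\sqrt{5}}{2}$.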
The thermodynamic formalism developed in [1] also includes a variational principle for the topological pressure, although with a restrictive assumption on the sequence $\Phi$. Namely, if there exists a continuous function $\varphi : X \to \mathbb{R}$ such that

$$\varphi_{n+1} - \varphi_n \circ f \to \varphi \quad \text{uniformly when } n \to \infty, \tag{2}$$

then

$$P(\Phi) = \sup_\mu \left( h_\mu(f) + \int_X \varphi\,d\mu \right),$$
again with the supremum taken over all $f$-invariant probability measures on $X$.

The restrictive assumption in (2) meant that until recently there was no available discussion of equilibrium and Gibbs measures in the general context of the nonadditive thermodynamic formalism. But it is well known that equilibrium and Gibbs measures play a prominent role in dimension theory and in particular in the multifractal analysis of dynamical systems, in which the spectra are often obtained by providing equilibrium measures with the appropriate local entropy or the appropriate pointwise dimension. Equilibrium and Gibbs measures can also be, for example, measures of full topological entropy or full Hausdorff dimension. It is sometimes possible to develop the theory without a variational principle for the topological pressure, and thus without these measures, but the corresponding proofs tend to be more technical. Clearly, from the points of view of dimension theory and multifractal analysis, it is desirable to continue using equilibrium and Gibbs measures even when the classical thermodynamical formalism cannot be used.

The discussion above justifies the interest in looking for more general classes of sequences of functions, although perhaps not arbitrary sequences, for which it is still possible to establish a corresponding variational principle for the topological pressure, and to study the associated equilibrium and Gibbs measures, among several other results. This is precisely what happens with the so-called almost additive sequences, for which it is possible not only to establish a variational principle, but also to discuss the existence and uniqueness of equilibrium and Gibbs measures. We recall that a sequence $\Phi = (\varphi_n)_n$ is said to be almost additive if there is a constant $C > 0$ such that

$$-C + \varphi_n + \varphi_m \circ f^n \le \varphi_{n+m} \le C + \varphi_n + \varphi_m \circ f^n \tag{3}$$

for every $n, m \in \mathbb{N}$. Clearly, for any function $\varphi$ the sequence

$$\varphi_n = \sum_{k=0}^{n-1} \varphi \circ f^k$$

is almost additive, since in this case $\varphi_{n+m} = \varphi_n + \varphi_m \circ f^n$ for every $n, m \in \mathbb{N}$. Nontrivial examples of almost additive sequences occur for example in the study of Lyapunov exponents for nonconformal maps by Barreira
and Gelfert in [7] (see Sec. 7). Following [3], we consider in particular repellers and hyperbolic sets of $C^1$ transformations, and for an almost additive sequence $\Phi$ of continuous functions we describe several results towards the foundations of an almost additive thermodynamic formalism. This includes the formula

$$P(\Phi) = \lim_{n\to\infty} \frac{1}{n} \log \sum_{x : f^n(x) = x} \exp \varphi_n(x)$$

for the topological pressure, for the class of almost additive sequences $\Phi$ with tempered variation. We also describe a variational principle for the topological pressure of an almost additive sequence, namely

$$P(\Phi) = \sup_\mu \left( h_\mu(f) + \lim_{n\to\infty} \frac{1}{n} \int_X \varphi_n\,d\mu \right), \tag{4}$$

and we discuss the existence and uniqueness of equilibrium and invariant Gibbs measures, among several other results, for example concerning characterizations of unique equilibrium measures. Mummert ([34]) established independently identity (4), although under an additional assumption on the sequence $\Phi$ that can be removed by repeating verbatim arguments in [3]. Cao, Feng and Huang considered more recently in [16] the general class of subadditive sequences, and they also obtained the variational principle in (4), but they do not discuss the existence of equilibrium or Gibbs measures. Earlier results in this direction were obtained by Käenmäki in [30] for a particular class of subadditive sequences, while also discussing the existence of an equilibrium measure.

After presenting the foundations of the almost additive thermodynamic formalism, we describe three applications of the formalism.

The first application, following Barreira and Gelfert in [7], considers nonconformal repellers in $\mathbb{R}^2$ satisfying a cone condition. The main objective is to obtain a multifractal analysis for the level sets of the Lyapunov exponents. In particular, we consider certain almost additive sequences related to the Lyapunov exponents to which one can apply the almost additive thermodynamic formalism. However, we emphasize that the results in [7] were obtained independently of the theory described in the survey. We also point out that the proofs of some results in Secs. 4–6 can be considered a distillation of arguments in that paper. We recall that a differentiable map $f$ is said to be conformal on a given set provided that the differential $d_x f$ is a multiple of an isometry at every point $x$ of the set.
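As a concrete illustration of the almost additivity condition (3), one may consider a sketch with products of positive matrices, a standard toy model for nonconformal dynamics that is assumed here and is not taken from the survey. On the full shift on two symbols, put $\varphi_n(x) = \log \|A_{x_{n-1}} \cdots A_{x_0}\|$, where $\|\cdot\|$ is the sum of the entries. If all entries of the $d \times d$ matrices lie in $[a, b]$ with $a > 0$, an elementary estimate on row and column sums gives $\frac{a^2}{d\,b^2}\|Q\|\|P\| \le \|QP\| \le \|Q\|\|P\|$ for products $Q$, $P$ of such matrices, so (3) holds with $C = \log(d\,b^2/a^2)$. The matrices, the norm and all names below are hypothetical choices.

```python
import math
import random

# Hypothetical positive 2x2 matrices with entries in [A_MIN, B_MAX].
MATS = ([[1.0, 2.0], [3.0, 1.0]], [[2.0, 1.0], [1.0, 4.0]])
A_MIN, B_MAX, DIM = 1.0, 4.0, 2
C = math.log(DIM * B_MAX ** 2 / A_MIN ** 2)  # almost additivity constant

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(DIM)) for j in range(DIM)]
            for i in range(DIM)]

def phi(word):
    # phi_n(x) = log of the entry-sum norm of A_{x_{n-1}} ... A_{x_0},
    # where word = (x_0, ..., x_{n-1}) and n = len(word) >= 1.
    prod = MATS[word[0]]
    for sym in word[1:]:
        prod = matmul(MATS[sym], prod)
    return math.log(sum(sum(row) for row in prod))

def worst_defect(trials=2000, max_len=12, seed=0):
    # Largest observed |phi_{n+m}(x) - phi_n(x) - phi_m(sigma^n x)|
    # over random words x and random n, m >= 1.
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        n = rng.randint(1, max_len)
        m = rng.randint(1, max_len)
        x = [rng.randrange(2) for _ in range(n + m)]
        defect = phi(x) - phi(x[:n]) - phi(x[n:])
        worst = max(worst, abs(defect))
    return worst

worst = worst_defect()
print(worst, C)
```

The observed defect is strictly positive, so the sequence is not additive, but it stays below the constant $C$, as condition (3) requires.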
We emphasize that the dimension theory and the multifractal analysis of dynamical systems are only completely understood in the case of conformal uniformly hyperbolic dynamics, either invertible or noninvertible. This includes saddle-type hyperbolic diffeomorphisms on surfaces, and holomorphic maps in the complex plane with a hyperbolic Julia set. The study of the dimension of invariant sets of nonconformal transformations has proven to be much more delicate. The main difficulty is related with the possibility of existence of distinct Lyapunov exponents in different directions, which may change from point to point. Another difficulty is that certain number-theoretical
properties may play an important role. Nevertheless, there exist several noteworthy results concerning the dimension theory of certain classes of invariant sets of nonconformal transformations, namely due to Falconer ([18, 20]), Bothe ([12]), Simon ([44]), and Simon and Solomyak ([45]). We refer to [4] for a related discussion.

The second application, following Barreira and Doutor in [5], has the objective of establishing a conditional variational principle for the multifractal spectra obtained from limits of almost additive sequences. This means that we consider the level sets

$$K_\alpha = \left\{ x \in X : \lim_{n\to\infty} \frac{\varphi_n(x)}{\psi_n(x)} = \alpha \right\},$$

where $(\varphi_n)_n$ and $(\psi_n)_n$ are almost additive sequences, and we give a description of their topological entropy or Hausdorff dimension in terms of a conditional variational principle. For example, in the case of the topological entropy the conditional variational principle takes the form

$$h(f \mid K_\alpha) = \max \left\{ h_\mu(f) : \lim_{n\to\infty} \frac{\int_X \varphi_n\,d\mu}{\int_X \psi_n\,d\mu} = \alpha \right\},$$

where $h(f \mid K_\alpha)$ denotes the topological entropy on $K_\alpha$. It is also shown that the spectra, such as $\alpha \mapsto h(f \mid K_\alpha)$, are continuous, and that the associated irregular sets have full dimension. The approach in [5] builds on related arguments in former work of Barreira et al. in [9], although now for almost additive sequences.

The multifractal analysis of dynamical systems can be considered a subfield of the dimension theory of dynamical systems, and it studies the complexity of the level sets of invariant local quantities obtained from a dynamical system. The concept of multifractal analysis was suggested by Halsey et al. in [27]. The first rigorous approach is due to Collet, Lebowitz and Porzio in [17] for a class of measures invariant under 1-dimensional Markov maps. In [32], Lopes considered the measure of maximal entropy for hyperbolic Julia sets, and in [38], Rand studied Gibbs measures for a class of repellers. We refer the reader to the books [4, 36] for details and further references.

The third application, following Barreira and Doutor in [6], is a complete description of the dimension spectra of limits of almost additive sequences on a hyperbolic set of a surface diffeomorphism. The main novelty is that we consider simultaneously limits into the future and into the past. More precisely, the spectra are obtained by computing the Hausdorff dimension of the level sets of limits of almost additive sequences both for positive and negative time. We emphasize that the description of the spectra is not a consequence of the results considering simply limits into the future (or into the past). The main difficulty is that although the local product structure provided by the intersection of stable and unstable manifolds is
bi-Lipschitz equivalent to a product, the level sets are never compact (this causes their box dimension to be strictly larger than their Hausdorff dimension), and thus the product of level sets may have a dimension that need not be the sum of the dimensions of the sets. Instead we construct explicitly noninvariant measures concentrated on each product of level sets having the appropriate pointwise dimension. This approach builds on former work of Barreira and Valls in [11] in the additive case.

2. Nonadditive Topological Pressure

2.1. General theory

We recall in this section the notion of nonadditive topological pressure introduced by Barreira in [1]. The main idea is to replace each sequence of functions $\sum_{k=0}^{n-1} \varphi \circ f^k$ in the definition of topological pressure by an arbitrary sequence $\varphi_n$.

Let $f : X \to X$ be a continuous transformation of a compact metric space $X$. Given a finite open cover $\mathcal{U}$ of $X$, we denote by $W_n(\mathcal{U})$ the collection of vectors $U = (U_0, \ldots, U_n)$ with $U_0, \ldots, U_n \in \mathcal{U}$. For each $U \in W_n(\mathcal{U})$, we write $m(U) = n$, and we consider the open set

$$X(U) = \bigcap_{k=0}^{n} f^{-k}U_k.$$
These sets can be thought of as cylinder sets. Now let $\Phi$ be a sequence of continuous functions $\varphi_n : X \to \mathbb{R}$ for each $n \in \mathbb{N}$. We define

$$\gamma_n(\Phi, \mathcal{U}) = \sup\{|\varphi_n(x) - \varphi_n(y)| : x, y \in X(U) \text{ for some } U \in W_n(\mathcal{U})\} \tag{5}$$

for each $n \in \mathbb{N}$, and we always assume that

$$\limsup_{\operatorname{diam}\mathcal{U}\to 0}\ \limsup_{n\to\infty} \frac{\gamma_n(\Phi, \mathcal{U})}{n} = 0. \tag{6}$$

We observe that condition (6) holds automatically when $\Phi$ is an additive sequence, that is, when

$$\varphi_n = \sum_{k=0}^{n-1} \varphi \circ f^k \tag{7}$$

for a given continuous function $\varphi : X \to \mathbb{R}$ and each $n \in \mathbb{N}$ (this is an immediate consequence of the uniform continuity of any continuous function in the compact metric space $X$).

Now we proceed with the construction of the nonadditive topological pressure. For each $U \in W_n(\mathcal{U})$ we write

$$\varphi(U) = \begin{cases} \sup_{X(U)} \varphi_n & \text{if } X(U) \ne \emptyset, \\ -\infty & \text{if } X(U) = \emptyset. \end{cases} \tag{8}$$
November 16, J070-S0129055X10004168
2010 15:27 WSPC/S0129-055X
148-RMP
Almost Additive Thermodynamic Formalism
1153
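Given a set $Z \subset X$ and a number $\alpha \in \mathbb{R}$, we define the function
\[ M(Z, \alpha, \Phi, \mathcal{U}) = \lim_{n \to \infty} \inf_{\Gamma} \sum_{U \in \Gamma} \exp(-\alpha m(U) + \varphi(U)), \]
where the infimum is taken over all finite or countable collections $\Gamma \subset \bigcup_{k \ge n} W_k(\mathcal{U})$ such that $\bigcup_{U \in \Gamma} X(U) \supset Z$ (in other words, such that the cylinder sets $X(U)$ cover the set $Z$). One can show that the function $\alpha \mapsto M(Z, \alpha, \Phi, \mathcal{U})$ jumps from $+\infty$ to $0$ at a unique value of $\alpha$, and thus we can define
\[ P_Z(\Phi, \mathcal{U}) = \inf\{\alpha \in \mathbb{R} : M(Z, \alpha, \Phi, \mathcal{U}) = 0\}. \]

Theorem 2.1 ([1]). The following properties hold:
(1) The limit
\[ P_Z(\Phi) := \lim_{\operatorname{diam} \mathcal{U} \to 0} P_Z(\Phi, \mathcal{U}) \]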
exists;
(2) If there exist constants $c_1, c_2 < 0$ such that $c_1 n \le \varphi_n \le c_2 n$ for every $n \in \mathbb{N}$, and the topological entropy $h(f \mid X)$ is finite, then there exists a unique number $s \in \mathbb{R}$ such that $P_Z(s\Phi) = 0$.

The number $P_Z(\Phi)$ is called the nonadditive topological pressure of the sequence of functions $\Phi$ (with respect to $f$ on $Z$). We note that the set $Z$ need be neither compact nor $f$-invariant. For simplicity, when there is no danger of confusion, we simply refer to $P_Z(\Phi)$ as the topological pressure of $\Phi$ (with respect to $f$ on $Z$). We also write $P(\Phi) = P_X(\Phi)$. One can easily verify that if $\Phi$ is the (additive) sequence of functions in (7), then $P(\Phi)$ coincides with the classical topological pressure of the function $\varphi$.

The number $h(f \mid Z) = P_Z(0)$ is called the topological entropy of $f$ on $Z$. It coincides with the notion of topological entropy for noncompact sets introduced in [37], and is equivalent to the notion of topological entropy introduced earlier by Bowen in [13]. It can be described as follows. Given a set $Z \subset X$ and a number $\alpha \in \mathbb{R}$, we define the function
\[ N(Z, \alpha, \mathcal{U}) = \lim_{n \to \infty} \inf_{\Gamma} \sum_{U \in \Gamma} \exp(-\alpha m(U)), \]
where the infimum is taken over all finite or countable collections $\Gamma \subset \bigcup_{k \ge n} W_k(\mathcal{U})$ such that $\bigcup_{U \in \Gamma} X(U) \supset Z$. Then
\[ h(f \mid Z) = \lim_{\operatorname{diam} \mathcal{U} \to 0} \inf\{\alpha \in \mathbb{R} : N(Z, \alpha, \mathcal{U}) = 0\}. \]
2.2. Equilibrium measures for subadditive sequences

As described in the introduction, the nonadditive thermodynamic formalism developed in [1] also includes a variational principle for the topological pressure, although
with a restrictive assumption on the sequence $\Phi$ (see (2)). Nevertheless, it is still meaningful to consider some particular classes of dynamics and potentials, and to look for equilibrium and Gibbs measures. With this in mind we describe in this section results by Käenmäki [30] and by Feng and Käenmäki [25] concerning the construction of equilibrium measures for a class of subadditive sequences in the particular case of symbolic dynamics. These sequences are well adapted to the study of the dimension of a class of limit sets of iterated function systems (see [30]) and of the multifractal analysis of the top Lyapunov exponent of products of matrices (see [21, 23, 26]). We refer to the following sections for related results concerning the existence of equilibrium and Gibbs measures for other classes of dynamics and potentials.

We first introduce some notation to consider the particular case of symbolic dynamics. Given $p \in \mathbb{N}$, we write $\Sigma_n = \{1, \ldots, p\}^n$ for each $n \in \mathbb{N}$ and $|\omega| = n$ for each $\omega \in \Sigma_n$. We also write
\[ \Sigma = \{1, \ldots, p\}^{\mathbb{N}} \quad \text{and} \quad \Sigma^* = \bigcup_{n \in \mathbb{N}} \Sigma_n, \]
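and we consider the shift map $\sigma \colon \Sigma \to \Sigma$ defined by $\sigma(i_1 i_2 \cdots) = (i_2 i_3 \cdots)$. Given $t \ge 0$ and $\omega \in \Sigma^*$, let $\mathcal{C}$ be the class of all (parametrized) functions $\psi^t_\omega \colon \Sigma \to \mathbb{R}^+$ with $\psi^0_\omega = 1$ satisfying the following properties:
(1) there exists $K_t > 0$ such that $\psi^t_\omega(\omega_1) \le K_t \psi^t_\omega(\omega_2)$ for any $\omega_1, \omega_2 \in \Sigma$;
(2) for every $\omega' \in \Sigma$ and $j \in [1, |\omega|] \cap \mathbb{N}$ we have
\[ \psi^t_\omega(\omega') \le \psi^t_{\omega | j}(\sigma^j(\omega)\omega') \, \psi^t_{\sigma^j(\omega)}(\omega'), \]
where $\omega | j$ denotes the first $j$ elements of $\omega$, and where $\sigma^j(\omega)\omega'$ denotes the juxtaposition of the two sequences;
(3) for each $\delta > 0$ there exist $a = a(\delta)$, $b = b(\delta) \in (0, 1)$ depending only on $\delta$, with $a(\delta) \to 1$ and $b(\delta) \to 1$ when $\delta \to 0$, such that
\[ \psi^t_\omega(\omega') a^{|\omega|} \le \psi^{t+\delta}_\omega(\omega') \le \psi^t_\omega(\omega') b^{|\omega|} \]
for every $\omega' \in \Sigma$.

We note that this class of functions contains as particular examples several classes earlier considered by Falconer [18, 20] and by Barreira [2], in connection with the study of the dimension of repellers of nonconformal transformations. For any function in the class $\mathcal{C}$, using the subadditivity it is shown in [30] that given $\omega' \in \Sigma$ and a $\sigma$-invariant probability measure $\mu$ in $\Sigma$, the limits
\[ p(t) = \lim_{n \to \infty} \frac{1}{n} \log \sum_{\omega \in \Sigma_n} \psi^t_\omega(\omega') \tag{9} \]
and
\[ s_\mu(t) = \lim_{n \to \infty} \frac{1}{n} \sum_{\omega \in \Sigma_n} \mu(C_\omega) \log \psi^t_\omega(\omega') \]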
exist (where $C_\omega \subset \Sigma$ is the set of sequences whose first $n$ elements are equal to those of $\omega$). Moreover, they are independent of $\omega'$. To verify that $p(t)$ is indeed a particular case of the nonadditive topological pressure, given $\omega \in \Sigma$ and $n \in \mathbb{N}$ we define a sequence $\varphi^t_n \colon \Sigma \to \mathbb{R}$ by
\[ \varphi^t_n(\omega) = \sup_{\omega' \in C_\omega} \log \psi^t_\omega(\omega'). \tag{10} \]
Then the first condition on the class $\mathcal{C}$ ensures that (6) holds, and we can show that $p(t)$ coincides with the nonadditive topological pressure of the sequence $\Phi^t = (\varphi^t_n)_n$ for any $\omega'$. This follows readily from results in [1] using the second condition on $\mathcal{C}$. Moreover, by the third condition we can readily apply Theorem 2.1 to conclude that there exists a unique $t \ge 0$ such that $p(t) = 0$ (the proof of this statement in [30] follows the same argument). This zero is often related to the dimension of certain classes of limit sets of iterated function systems and repellers (see for example [1, 2, 4, 18, 20]). In addition, the following property holds.

Theorem 2.2 ([30]). We have
\[ p(t) \ge h_\mu(\sigma) + s_\mu(t). \tag{11} \]
By Kingman’s subadditive ergodic theorem, we have
\[ s_\mu(t) = \lim_{n \to \infty} \frac{1}{n} \int_\Sigma \varphi^t_n \, d\mu, \]
and thus, inequality (11) can be written in the form
\[ P(\Phi^t) \ge h_\mu(\sigma) + \lim_{n \to \infty} \frac{1}{n} \int_\Sigma \varphi^t_n \, d\mu. \]
This inequality is due to Falconer [19] in the general case of arbitrary subadditive sequences (and not only for the sequences $\Phi^t$) with a bounded distortion condition (which in the present context is given by the first condition on $\mathcal{C}$). Assuming a certain Lipschitz property for the elements of the sequence (more generally for topological Markov chains), he also obtained the variational principle
\[ P(\Phi^t) = \sup_\mu \left( h_\mu(\sigma) + \lim_{n \to \infty} \frac{1}{n} \int_\Sigma \varphi^t_n \, d\mu \right). \tag{12} \]
In an analogous manner to that in the classical additive theory, we say that a $\sigma$-invariant probability measure $\mu$ in $\Sigma$ is an equilibrium measure for the sequence $\Phi^t$ if it attains the supremum in (12). In the present context the existence of equilibrium measures was established by Käenmäki.

Theorem 2.3 ([30]). For each $t \ge 0$ there exists an equilibrium measure for the sequence $\Phi^t$.

The existence of these equilibrium measures is used in [30] to study the dimension of a class of limit sets of iterated function systems.
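Now we consider a particular class of functions in $\mathcal{C}$ that are obtained from products of matrices. Given $p, m \in \mathbb{N}$, let $M_1, \ldots, M_p$ be $m \times m$ matrices. For each $t > 0$, $n \in \mathbb{N}$ and $\omega \in \Sigma_n$, we consider the constant function
\[ \bar\psi^t_\omega = \|M_{i_1} \cdots M_{i_n}\|^t, \]
where $\omega = (i_1 \cdots i_n)$, and again we define a sequence $\bar\Phi^t$ as in (10), that is,
\[ \bar\varphi^t_n(\omega) = \sup_{\omega' \in C_\omega} \log \bar\psi^t_\omega(\omega') = \sup_{\omega' \in C_\omega} \log \|M_{i_1} \cdots M_{i_n}\|^t, \]
where $\omega = (i_1 \cdots i_n)$. One can easily verify that the functions $\bar\psi^t_\omega$ belong to the class $\mathcal{C}$, and that $p(t)$ in (9) is given by
\[ p(t) = \lim_{n \to \infty} \frac{1}{n} \log \sum_{\omega \in \Sigma_n} \|M_{i_1} \cdots M_{i_n}\|^t. \]
Moreover, given a $\sigma$-invariant probability measure $\mu$ in $\Sigma$, we have
\[ s_\mu(t) = t \lim_{n \to \infty} \frac{1}{n} \sum_{\omega \in \Sigma_n} \mu(C_\omega) \log \|M_{i_1} \cdots M_{i_n}\|, \]
and it follows from (12) (see also [16]) that
\[ p(t) = \sup_\mu \bigl( h_\mu(\sigma) + s_\mu(t) \bigr). \]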
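The finite-$n$ quantities behind the formula for $p(t)$ can be computed directly for small examples. The following Python sketch is only an illustration under stated assumptions: the function names, the sample matrices and the choice of the Frobenius norm are ours and do not come from [25] or [30]. The choice of norm does not affect the limit, since all norms on a finite-dimensional space are equivalent.

```python
import itertools
import math


def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def frobenius_norm(a):
    """Frobenius norm of a matrix (an assumed choice of norm)."""
    return math.sqrt(sum(x * x for row in a for x in row))


def pressure_approximation(matrices, t, n):
    """Return (1/n) log of the sum over words of length n of ||M_{i1}...M_{in}||^t."""
    total = 0.0
    for word in itertools.product(range(len(matrices)), repeat=n):
        product = matrices[word[0]]
        for index in word[1:]:
            product = matmul(product, matrices[index])
        total += frobenius_norm(product) ** t
    return math.log(total) / n


if __name__ == "__main__":
    # Hypothetical sample data: two positive 2 x 2 matrices.
    sample = [[[2.0, 1.0], [1.0, 1.0]], [[1.0, 3.0], [1.0, 2.0]]]
    for n in (2, 4, 8):
        print(n, pressure_approximation(sample, 1.0, n))
```

For $1 \times 1$ matrices $M_i = (c_i)$ the sum factorizes, so the approximation equals $\log \sum_i c_i^t$ for every $n$; this gives a simple sanity check of the sketch.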
The following result is due to Feng and Käenmäki.

Theorem 2.4 ([25]). If for each $n \in \mathbb{N}$ there exist $i_1, \ldots, i_n \in \{1, \ldots, p\}$ such that $M_{i_1} \cdots M_{i_n} \ne 0$, then for each $t \ge 0$ there exist at most $m$ ergodic equilibrium measures for the sequence $\bar\Phi^t$. If in addition the only proper vector space $V$ such that $M_i V \subset V$ for $i = 1, \ldots, p$ is the origin, then for each $t \ge 0$ there exists a unique equilibrium measure for the sequence $\bar\Phi^t$.

The irreducibility condition in Theorem 2.4 concerning the subspaces $V$ is used in [23] to show that there exist $c > 0$ and $k \in \mathbb{N}$ such that for each $\omega, \omega' \in \Sigma^*$ there exists $\bar\omega \in \bigcup_{j=1}^{k} \Sigma_j$ for which
\[ \|M_{\omega \bar\omega \omega'}\| \ge c \, \|M_\omega\| \cdot \|M_{\omega'}\|. \tag{13} \]
It is essentially this property that allows one to establish the existence of a unique equilibrium measure in [25]. We note that property (13) ensures that the sequence $\bar\Phi^t$ is almost additive (see (3)), and thus the existence of a unique ergodic measure in Theorem 2.4 as well as its Gibbs property (also obtained in [25]) follow from general results in [3] for the class of almost additive sequences (compare with the results in Secs. 4 and 5).

3. Topological Pressure for Almost Additive Sequences

We introduce in this section the class of almost additive sequences, and we present formulas for the nonadditive topological pressure. For definiteness we consider only
the case of functions defined on a repeller. We refer to the remaining sections for further developments.

3.1. Repellers and Markov partitions

We recall in this section the notion of repeller and the notion of Markov partition. Let $f \colon M \to M$ be a $C^1$ map, and let $\Lambda \subset M$ be a compact $f$-invariant set (this means that $f^{-1}\Lambda = \Lambda$). We say that $f$ is expanding on $\Lambda$, and that $\Lambda$ is a repeller of $f$ if there exist constants $c > 0$ and $\beta > 1$ such that
\[ \|d_x f^n v\| \ge c \beta^n \|v\| \]
for every $x \in \Lambda$, $n \in \mathbb{N}$, and $v \in T_x M$. In addition, we always assume in this presentation that there is an open set $U \supset \Lambda$ such that $\Lambda = \bigcap_{n \in \mathbb{N}} f^{-n} U$, and that $f$ is topologically mixing on $\Lambda$.

We recall that a collection of closed sets $R_1, \ldots, R_p \subset \Lambda$ is said to be a Markov partition of the repeller $\Lambda$ if:
(1) $\Lambda = \bigcup_{i=1}^{p} R_i$, and $\overline{\operatorname{int} R_i} = R_i$ for $i = 1, \ldots, p$;
(2) $\operatorname{int} R_i \cap \operatorname{int} R_j = \emptyset$ whenever $i \ne j$;
(3) $f(R_i) \supset R_j$ whenever $f(\operatorname{int} R_i) \cap \operatorname{int} R_j \ne \emptyset$.
We note that here the interior of each set $R_i$ is computed with respect to the induced topology on $\Lambda$. Any repeller has Markov partitions with arbitrarily small diameter
\[ \max\{\operatorname{diam} R_i : i = 1, \ldots, p\} \tag{14} \]
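(see [41]). Given a Markov partition $R_1, \ldots, R_p$ of $\Lambda$, we define a $p \times p$ matrix $A = (a_{ij})$ with entries
\[ a_{ij} = \begin{cases} 1 & \text{if } f(\operatorname{int} R_i) \cap \operatorname{int} R_j \ne \emptyset, \\ 0 & \text{if } f(\operatorname{int} R_i) \cap \operatorname{int} R_j = \emptyset, \end{cases} \tag{15} \]
and we consider the corresponding topological Markov chain $\sigma \colon \Sigma_A \to \Sigma_A$ defined by the shift map $\sigma(i_1 i_2 \cdots) = (i_2 i_3 \cdots)$ in the set
\[ \Sigma_A = \{(i_1 i_2 \cdots) \in \{1, \ldots, p\}^{\mathbb{N}} : a_{i_k i_{k+1}} = 1 \text{ for every } k \in \mathbb{N}\}. \tag{16} \]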
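We denote by $\Sigma_{A,n}$ the set of $n$-tuples $(i_1 \cdots i_n)$ for which there is a sequence $(j_1 j_2 \cdots) \in \Sigma_A$ such that $i_\ell = j_\ell$ for $\ell = 1, \ldots, n$. For each $(i_1 \cdots i_n) \in \Sigma_{A,n}$ we define
\[ \Delta_{i_1 \cdots i_n} = \bigcap_{\ell=0}^{n-1} f^{-\ell} R_{i_{\ell+1}}, \tag{17} \]
and setting
\[ \chi(i_1 i_2 \cdots) = \bigcap_{\ell=0}^{\infty} f^{-\ell} R_{i_{\ell+1}} = \bigcap_{n=1}^{\infty} \Delta_{i_1 \cdots i_n}, \]
we obtain a coding map $\chi \colon \Sigma_A \to \Lambda$ for the repeller.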
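The sets $\Sigma_{A,n}$ of admissible words can be enumerated mechanically from the transition matrix. The sketch below is a minimal illustration, assuming that every row of the matrix contains at least one entry equal to 1, so that each admissible finite word extends to an infinite sequence in $\Sigma_A$; the function name and the 0-based labelling of states are our own conventions.

```python
def admissible_words(transition, n):
    """List all words (i_1, ..., i_n) with transition[i_k][i_{k+1}] == 1.

    Assumes every row of `transition` contains a 1, so that each finite
    admissible word is the beginning of some infinite admissible sequence.
    States are labelled 0, ..., p - 1.
    """
    states = range(len(transition))
    words = [(i,) for i in states]
    for _ in range(n - 1):
        words = [word + (j,) for word in words for j in states
                 if transition[word[-1]][j] == 1]
    return words


if __name__ == "__main__":
    # Hypothetical example: state 1 may not be followed by state 1.
    golden_mean = [[1, 1], [1, 0]]
    for n in range(1, 6):
        print(n, len(admissible_words(golden_mean, n)))
```

For the two-state matrix in the usage example, the admissible words are the binary words without two consecutive 1s, so the counts follow the Fibonacci recursion.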
3.2. Formulas for the topological pressure

Now we introduce the class of almost additive sequences, and we describe corresponding formulas for the nonadditive topological pressure both using and avoiding Markov partitions. We say that the sequence of functions $\Phi = (\varphi_n)_n$ with $\varphi_n \colon \Lambda \to \mathbb{R}$ for each $n \in \mathbb{N}$ is almost additive (with respect to $f$ on $\Lambda$) if there exists a constant $C > 0$ such that for every $n, m \in \mathbb{N}$ and $x \in \Lambda$ we have
\[ -C + \varphi_n(x) + \varphi_m(f^n(x)) \le \varphi_{n+m}(x) \le C + \varphi_n(x) + \varphi_m(f^n(x)). \tag{18} \]
Clearly, any additive sequence of functions $\varphi_n = \sum_{k=0}^{n-1} \varphi \circ f^k$ is almost additive. Nontrivial examples of almost additive sequences occur naturally for example in the study of nonconformal repellers (see Sec. 7 for a detailed description).

Now let $\Lambda$ be a repeller of $f$, and let $\Delta_{i_1 \cdots i_n}$ be the sets in (17) obtained from a given Markov partition. We write
\[ \gamma_n(\Phi) = \sup\{|\varphi_n(x) - \varphi_n(y)| : x, y \in \Delta_{i_1 \cdots i_n} \text{ and } (i_1 \cdots i_n) \in \Sigma_{A,n}\}. \tag{19} \]
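To see inequality (18) at work in a nonadditive example, one can take the symbolic model of a full shift and the functions $\varphi_n(\omega) = \log \|M_{i_1} \cdots M_{i_n}\|$ for positive matrices. The Python sketch below is an illustration under our own assumptions: the norm is the sum of the entries, the sample matrices are hypothetical, and the constant $\log(bd/a)$ mentioned in the comments (with $a$ and $b$ the smallest and largest entries and $d$ the dimension) is an elementary estimate of ours, not a statement taken from the references.

```python
import itertools
import math


def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def entry_sum(a):
    """Sum of the entries; a submultiplicative norm on nonnegative matrices."""
    return sum(sum(row) for row in a)


def log_norm(matrices, word):
    """phi_n for the word (i_1, ..., i_n): log of the norm of M_{i_1} ... M_{i_n}."""
    product = matrices[word[0]]
    for index in word[1:]:
        product = matmul(product, matrices[index])
    return math.log(entry_sum(product))


def additivity_defects(matrices, n, m):
    """Return (min, max) of phi_n(w) + phi_m(sigma^n w) - phi_{n+m}(w) over words w.

    The minimum is >= 0 by submultiplicativity of the norm. For positive
    matrices with entries in [a, b] and dimension d, the maximum is at most
    log(b * d / a) (our own elementary estimate).
    """
    defects = []
    for word in itertools.product(range(len(matrices)), repeat=n + m):
        defect = (log_norm(matrices, word[:n]) + log_norm(matrices, word[n:])
                  - log_norm(matrices, word))
        defects.append(defect)
    return min(defects), max(defects)
```

A bounded defect for all $n$ and $m$ is exactly the content of (18) with $C$ equal to the bound; the sketch only samples finitely many pairs $(n, m)$ and therefore illustrates, rather than proves, almost additivity.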
One can easily verify that $\gamma_n(\Phi)$ coincides with $\gamma_n(\Phi, \mathcal{U})$ in (5) for the open cover $\mathcal{U}$ of $\Lambda$ formed by the elements $R_1, \ldots, R_p$ of the Markov partition (with respect to the induced topology on $\Lambda$). We say that $\Phi$ has tempered variation if $\gamma_n(\Phi)/n \to 0$ as $n \to \infty$. Clearly, any sequence with tempered variation satisfies condition (6). The following result provides a formula for the topological pressure of an almost additive sequence with tempered variation.

Theorem 3.1 ([7, Proposition 3]). Let $\Lambda$ be a repeller of a $C^1$ map, and let $\Phi = (\varphi_n)_n$ be an almost additive sequence of continuous functions on $\Lambda$ with tempered variation. Then
\[ P(\Phi) = \lim_{n \to \infty} \frac{1}{n} \log \sum_{i_1 \cdots i_n} \exp \varphi_n(x_{i_1 \cdots i_n}) \tag{20} \]
for any points $x_{i_1 \cdots i_n} \in \Delta_{i_1 \cdots i_n}$, for each $(i_1 \cdots i_n) \in \Sigma_{A,n}$ and $n \in \mathbb{N}$.

The statement in Theorem 3.1 was first established by Barreira and Gelfert in [7], and was then extended by Barreira in [3] to other classes of transformations (see Secs. 5 and 6). We emphasize that identity (20) ensures not only that the nonadditive topological pressure of an almost additive sequence is a limit, but also that the limit is independent of the particular Markov partition used to define it.

For a continuous function $\varphi \colon \Lambda \to \mathbb{R}$, we recall that the (classical) topological pressure of $\varphi$ (with respect to $f$ on $\Lambda$) is given by
\[ P(\varphi) = \lim_{n \to \infty} \frac{1}{n} \log \sum_{i_1 \cdots i_n} \exp \max_{x \in \Delta_{i_1 \cdots i_n}} \sum_{k=0}^{n-1} \varphi(f^k(x)), \tag{21} \]
where $\Delta_{i_1 \cdots i_n}$ are the sets in (17) obtained from any given Markov partition. One can easily verify that the limit in (21) exists (by showing that the first sum defines a
submultiplicative sequence). Furthermore, the limit is independent of the particular Markov partition used to define it (see [36, 47] for details). We note that identity (20) includes identity (21) (which is often taken as the definition of topological pressure) as a particular case.

We also have the following alternative characterization of the topological pressure. It has the advantage of avoiding Markov partitions and the associated symbolic dynamics. Let $\operatorname{Fix}(f) = \{x \in \Lambda : f(x) = x\}$ be the set of fixed points of $f$ in $\Lambda$.

Theorem 3.2 ([3]). Let $\Lambda$ be a repeller of a $C^1$ map, and let $\Phi = (\varphi_n)_n$ be an almost additive sequence of continuous functions on $\Lambda$ with tempered variation. Then
\[ P(\Phi) = \lim_{n \to \infty} \frac{1}{n} \log \sum_{x \in \operatorname{Fix}(f^n)} \exp \varphi_n(x). \tag{22} \]
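Formula (22) lends itself to direct computation when the periodic points are known explicitly. As a hedged illustration, the sketch below takes the doubling map $x \mapsto 2x \bmod 1$ on the circle (so that the repeller is the whole circle and $\operatorname{Fix}(f^n)$ consists of the points $k/(2^n - 1)$) together with an additive sequence $\varphi_n = \sum_{k=0}^{n-1} \varphi \circ f^k$. The choice of map, the function names and the use of exact fractions to avoid rounding in the orbit are our own assumptions.

```python
from fractions import Fraction
import math


def doubling(x):
    """The doubling map x -> 2x mod 1, evaluated exactly on fractions."""
    return (2 * x) % 1


def pressure_from_periodic_points(potential, n):
    """(1/n) log of the sum over Fix(f^n) of exp(phi_n(x)) for the doubling map.

    `potential` is a function of a Fraction in [0, 1) returning a float,
    and phi_n is its Birkhoff sum along the orbit of length n.
    """
    total = 0.0
    for k in range(2 ** n - 1):
        x = Fraction(k, 2 ** n - 1)
        birkhoff_sum = 0.0
        for _ in range(n):
            birkhoff_sum += potential(x)
            x = doubling(x)
        total += math.exp(birkhoff_sum)
    return math.log(total) / n


if __name__ == "__main__":
    # Hypothetical potential chosen only for illustration.
    def sample_potential(x):
        return math.cos(2 * math.pi * float(x))

    for n in (4, 8, 12):
        print(n, pressure_from_periodic_points(sample_potential, n))
```

For a constant potential $\varphi = c$ the finite-$n$ value is $c + \frac{1}{n} \log(2^n - 1)$, which tends to $c + \log 2$, the classical pressure of a constant function for this map.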
4. Results for Repellers

We describe in this section several results of the almost additive thermodynamic formalism, again for definiteness in the particular case of functions defined in a repeller. In particular, we describe a variational principle for the topological pressure. We also introduce, for almost additive sequences, the notions of equilibrium measure and of Gibbs measure, and we consider the problem of existence and uniqueness of these measures.

4.1. Variational principle for the topological pressure

To formulate the variational principle for the topological pressure, we first recall the notion of Kolmogorov–Sinai entropy. Given a measurable transformation $f \colon \Lambda \to \Lambda$, we denote by $\mathcal{M}$ the family of $f$-invariant probability measures in $\Lambda$. We recall that a measure $\mu$ in $\Lambda$ is said to be $f$-invariant if $\mu(f^{-1}A) = \mu(A)$ for every measurable set $A \subset \Lambda$. Given a measure $\mu \in \mathcal{M}$ and a partition $\xi$ of $\Lambda$ into measurable subsets, we define
\[ H_\mu(\xi) = -\sum_{C \in \xi} \mu(C) \log \mu(C), \]
with the convention that $0 \log 0 = 0$. The Kolmogorov–Sinai entropy of $f$ with respect to $\mu$ is given by
\[ h_\mu(f) = \sup\{h_\mu(f, \xi) : H_\mu(\xi) < \infty\}, \]
where
\[ h_\mu(f, \xi) = \inf_{n \in \mathbb{N}} \frac{1}{n} H_\mu(\xi_n), \]
for the partition $\xi_n$ of $\Lambda$ into the sets $\bigcap_{k=0}^{n-1} f^{-k} C_{k+1}$ with $C_1, \ldots, C_n \in \xi$. In the case of invariant measures in repellers, the entropy can be obtained as follows. Given a Markov partition of the repeller $\Lambda$, we consider the partition $\xi_n = \{\Delta_{i_1 \cdots i_n} : (i_1 \cdots i_n) \in \Sigma_{A,n}\}$ of $\Lambda$. Its entropy is given by
\[ H_\mu(\xi_n) = -\sum_{i_1 \cdots i_n} \mu(\Delta_{i_1 \cdots i_n}) \log \mu(\Delta_{i_1 \cdots i_n}), \]
and
\[ h_\mu(f) = \lim_{n \to \infty} \frac{1}{n} H_\mu(\xi_n) = \inf_{n \in \mathbb{N}} \frac{1}{n} H_\mu(\xi_n). \]
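The entropy formulas above can be checked by hand in the simplest symbolic situation. The following sketch assumes a full shift and a Bernoulli (product) measure, for which the mass of a cylinder is the product of the weights of its symbols; under this assumption $\frac{1}{n} H_\mu(\xi_n)$ does not depend on $n$. The function names and the sample weights are hypothetical.

```python
import itertools
import math


def partition_entropy(masses):
    """H = -sum m log m over the elements of a partition, with 0 log 0 = 0."""
    return -sum(m * math.log(m) for m in masses if m > 0)


def cylinder_masses(weights, n):
    """Masses of all cylinders of length n for the Bernoulli measure with these weights."""
    return [math.prod(weights[i] for i in word)
            for word in itertools.product(range(len(weights)), repeat=n)]


if __name__ == "__main__":
    weights = [0.2, 0.3, 0.5]  # hypothetical probability vector
    for n in (1, 2, 4):
        print(n, partition_entropy(cylinder_masses(weights, n)) / n)
```

In this product situation the infimum and the limit in the formula for $h_\mu(f)$ are attained already at $n = 1$; for general invariant measures the sequence $\frac{1}{n} H_\mu(\xi_n)$ is only known to converge to its infimum.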
The following is a variational principle for the topological pressure.

Theorem 4.1 ([3]). Let $\Lambda$ be a repeller of a $C^1$ map $f$, and let $\Phi = (\varphi_n)_n$ be an almost additive sequence of continuous functions on $\Lambda$ with tempered variation. Then
\[ \begin{aligned} P(\Phi) &= \max_{\mu \in \mathcal{M}} \left( h_\mu(f) + \int_\Lambda \lim_{n \to \infty} \frac{\varphi_n(x)}{n} \, d\mu(x) \right) \\ &= \max_{\mu \in \mathcal{M}} \left( h_\mu(f) + \lim_{n \to \infty} \frac{1}{n} \int_\Lambda \varphi_n \, d\mu \right), \end{aligned} \tag{23} \]
including the existence in $L^1(\Lambda, \mu)$ of the first limit, and the existence of the second limit.

In a similar manner to that in the classical theory, it is easier to show that
\[ P(\Phi) \ge \max_{\mu \in \mathcal{M}} \left( h_\mu(f) + \lim_{n \to \infty} \frac{1}{n} \int_\Lambda \varphi_n \, d\mu \right) \]
than to show the reverse inequality. The argument uses the subadditivity of the sequence $\psi_n = \varphi_n + C$ (see (18)), that is, the property
\[ \psi_{n+m} \le \psi_n + \psi_m \circ f^n, \quad n, m \in \mathbb{N}, \]
together with Kingman’s subadditive ergodic theorem. The proof of the reverse inequality uses analogous arguments to those in the proof of [1, Theorem 1.7], which in turn were inspired by arguments of Bowen in [14]. The fact that the supremum can be replaced by a maximum in (23) follows from the upper semicontinuity of the map
\[ \mathcal{M} \ni \mu \mapsto h_\mu(f) + \lim_{n \to \infty} \frac{1}{n} \int_\Lambda \varphi_n \, d\mu, \tag{24} \]
since $\mu \mapsto h_\mu(f)$ is upper semicontinuous in this setting, and since the limit in (24) is continuous in $\mu$.
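The variational principle can be tested numerically in a classical special case. The sketch below assumes a full shift on finitely many symbols and an additive sequence generated by a potential depending only on the first symbol, with values $a_1, \ldots, a_p$. In that case the pressure is $\log \sum_i e^{a_i}$, and for a Bernoulli measure with weights $q_i$ the expression in (23) reduces to $-\sum_i q_i \log q_i + \sum_i q_i a_i$. All names and sample values are ours, and only Bernoulli measures are scanned, which suffices here because the maximizing measure is itself Bernoulli.

```python
import math


def pressure(potential):
    """Classical pressure log(sum_i exp(a_i)) of a one-symbol potential on a full shift."""
    return math.log(sum(math.exp(value) for value in potential))


def free_energy(weights, potential):
    """Entropy plus integral of the potential for the Bernoulli measure with these weights."""
    entropy = -sum(q * math.log(q) for q in weights if q > 0)
    energy = sum(q * value for q, value in zip(weights, potential))
    return entropy + energy


def gibbs_weights(potential):
    """Weights q_i = exp(a_i) / sum_j exp(a_j) of the maximizing Bernoulli measure."""
    normalizer = sum(math.exp(value) for value in potential)
    return [math.exp(value) / normalizer for value in potential]


if __name__ == "__main__":
    potential = [0.5, -1.0]  # hypothetical values a_1, a_2
    best = max(free_energy([k / 1000, 1 - k / 1000], potential) for k in range(1001))
    print(best, pressure(potential))
```

The grid maximum approaches the pressure from below, and equality holds exactly at the weights returned by `gibbs_weights`, in agreement with the existence of a maximizing measure in (23).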
4.2. Equilibrium and Gibbs measures

We continue to consider a repeller $\Lambda$ of a $C^1$ map $f$. In an analogous manner to that in the classical additive theory, we say that a measure $\mu \in \mathcal{M}$ is an equilibrium measure for the almost additive sequence $\Phi$ (with respect to $f$ on $\Lambda$) if it attains any of the maxima in (23) (and thus both maxima), that is, if
\[ P(\Phi) = h_\mu(f) + \lim_{n \to \infty} \frac{1}{n} \int_\Lambda \varphi_n \, d\mu. \]
The existence of equilibrium measures is thus an immediate consequence of Theorem 4.1.

Theorem 4.2 ([3]). Let $\Lambda$ be a repeller of a $C^1$ map. Then any almost additive sequence of continuous functions on $\Lambda$ with tempered variation has at least one equilibrium measure.

We also say that a probability measure $\mu$ in $\Lambda$ (which need not be $f$-invariant) is a Gibbs measure for the sequence $\Phi$ (with respect to $f$ on $\Lambda$, and to a given Markov partition of $\Lambda$) if there exists a constant $K > 0$ such that
\[ K^{-1} \le \frac{\mu(\Delta_{i_1 \cdots i_n})}{\exp[-nP(\Phi) + \varphi_n(x)]} \le K \]
for every $n \in \mathbb{N}$, $(i_1 \cdots i_n) \in \Sigma_{A,n}$, and $x \in \Delta_{i_1 \cdots i_n}$. It turns out, as in the classical additive theory, that invariant Gibbs measures are always equilibrium measures. The argument is simple. We first note that if $\mu$ is an $f$-invariant Gibbs measure, then the limit
\[ h_\mu(x) := \lim_{n \to \infty} -\frac{1}{n} \log \mu(\Delta_{i_1 \cdots i_n}) = P(\Phi) - \lim_{n \to \infty} \frac{\varphi_n(x)}{n} \tag{25} \]
exists for $\mu$-almost every $x \in \Lambda$ (by Theorem 4.1 the second limit in (25) exists in $L^1(\Lambda, \mu)$, and thus it also exists for $\mu$-almost every $x \in \Lambda$). By the Shannon–McMillan–Breiman theorem we obtain
\[ h_\mu(f) = \int_\Lambda h_\mu(x) \, d\mu(x) = P(\Phi) - \int_\Lambda \lim_{n \to \infty} \frac{\varphi_n(x)}{n} \, d\mu(x), \]
and hence $\mu$ is an equilibrium measure.

To formulate the following result we need to consider the stronger notion of bounded variation. We say that the sequence of functions $\Phi = (\varphi_n)_n$ has bounded variation if $\sup_{n \in \mathbb{N}} \gamma_n(\Phi) < \infty$ (see (19) for the definition of $\gamma_n(\Phi)$). For example, one can easily verify that if $\Phi$ is the additive sequence $\varphi_n = \sum_{k=0}^{n-1} \varphi \circ f^k$ for some Hölder continuous $\varphi$ in a repeller, then $\Phi$ has bounded variation. Clearly, if $\Phi$ has bounded variation, then it has tempered variation. The following statement says in particular that for each almost additive sequence with bounded variation there exists a unique equilibrium measure.
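Returning to the definition of a Gibbs measure, the simplest case can be worked out by hand. The following computation is only an illustrative sketch under assumptions of ours: the symbolic model is a full shift on $p$ symbols, the sequence is additive with $\varphi(x) = a_{i_1}$ depending only on the first symbol of the coding of $x$, and $\mu$ is the Bernoulli measure with weights $q_i = e^{a_i - P}$, where $P = \log \sum_{i=1}^{p} e^{a_i}$ is the classical pressure in this situation.

```latex
% Assumptions (ours): full shift on p symbols, \varphi(x) = a_{i_1},
% \varphi_n = \sum_{k=0}^{n-1} \varphi \circ f^k, P = \log \sum_{i=1}^{p} e^{a_i},
% and \mu the Bernoulli measure with weights q_i = e^{a_i - P}.
\[
  \mu(\Delta_{i_1 \cdots i_n})
  = \prod_{k=1}^{n} q_{i_k}
  = \prod_{k=1}^{n} e^{a_{i_k} - P}
  = \exp\Bigl[-nP + \sum_{k=1}^{n} a_{i_k}\Bigr]
  = \exp[-nP + \varphi_n(x)]
  \quad \text{for every } x \in \Delta_{i_1 \cdots i_n}.
\]
```

In this model the Gibbs inequalities therefore hold with $K = 1$, and the measure is invariant, so by the argument above it is an equilibrium measure.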
Theorem 4.3 ([3]). Let $\Lambda$ be a repeller of a $C^1$ map, and let $\Phi$ be an almost additive sequence of continuous functions on $\Lambda$ with bounded variation. Then:
(1) there is a unique equilibrium measure for $\Phi$;
(2) there is a unique invariant Gibbs measure for $\Phi$;
(3) the two measures coincide and are mixing.

In particular, the unique equilibrium measure for an almost additive sequence with bounded variation is an invariant Gibbs measure. We refer to [34] for some results related to those in this section, although using a different notion of equilibrium measure.

4.3. Characterizations of unique equilibrium measures

The unique equilibrium measure in Theorem 4.3 can be characterized as follows. We denote by $\delta_x$ the probability measure with $\delta_x(\{x\}) = 1$.

Theorem 4.4 ([3]). Let $\Lambda$ be a repeller of a $C^1$ map, and let $\Phi = (\varphi_n)_n$ be an almost additive sequence of continuous functions on $\Lambda$ with bounded variation. Then the unique equilibrium measure for $\Phi$ is the weak limit of the sequence of invariant probability measures
\[ \mu_n = \frac{\sum_{x \in \operatorname{Fix}(f^n)} e^{\varphi_n(x)} \delta_x}{\sum_{x \in \operatorname{Fix}(f^n)} e^{\varphi_n(x)}}. \tag{26} \]
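The measures $\mu_n$ in (26) are finite weighted sums of point masses and can be built explicitly when the periodic points are known. The sketch below again uses the doubling map $x \mapsto 2x \bmod 1$ on the circle with an additive sequence as a stand-in example; this choice, the function names and the representation of a measure as a list of (point, weight) pairs are assumptions of ours.

```python
from fractions import Fraction
import math


def periodic_point_measure(potential, n):
    """Return mu_n of (26) for the doubling map as a list of (point, weight) pairs.

    The points are the solutions k / (2^n - 1) of f^n(x) = x, and the weight of x
    is exp(phi_n(x)) normalized to total mass 1, where phi_n is the Birkhoff sum
    of `potential` along the orbit of x.
    """
    points = [Fraction(k, 2 ** n - 1) for k in range(2 ** n - 1)]
    raw_weights = []
    for x in points:
        y, birkhoff_sum = x, 0.0
        for _ in range(n):
            birkhoff_sum += potential(y)
            y = (2 * y) % 1
        raw_weights.append(math.exp(birkhoff_sum))
    total = sum(raw_weights)
    return [(x, w / total) for x, w in zip(points, raw_weights)]


def integrate(observable, measure):
    """Integral of an observable with respect to a finite atomic measure."""
    return sum(weight * observable(x) for x, weight in measure)


if __name__ == "__main__":
    # With the zero potential, mu_n is uniform on the points of period n.
    measure = periodic_point_measure(lambda x: 0.0, 10)
    print(integrate(lambda x: float(x), measure))
```

With the zero potential the integrals of continuous observables against $\mu_n$ approach their Lebesgue integrals as $n$ grows, consistent with Lebesgue measure being the measure of maximal entropy of the doubling map.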
Now we present another characterization of the unique equilibrium measures. Given a sequence of continuous functions $\Phi = (\varphi_n)_n$ with bounded variation, we set
\[ a_{i_1 \cdots i_n} = \max\{\exp \varphi_n(y) : y \in \Delta_{i_1 \cdots i_n}\}, \]
with the convention that $a_{i_1 \cdots i_n} = 0$ if $\Delta_{i_1 \cdots i_n} = \emptyset$. We also set
\[ \alpha_n = \sum_{i_1 \cdots i_n} a_{i_1 \cdots i_n}. \]
We define a probability measure $\nu_n$ in the algebra generated by the sets $\Delta_{i_1 \cdots i_n}$ by
\[ \nu_n(\Delta_{i_1 \cdots i_n}) = a_{i_1 \cdots i_n} / \alpha_n \]
for each $(i_1 \cdots i_n) \in \Sigma_{A,n}$, and we extend it arbitrarily to the Borel $\sigma$-algebra of $\Lambda$. Since $\Lambda$ is compact, the family of probability measures in $\Lambda$ is compact in the weak* topology, and hence, there exists a subsequence $(\nu_{n_k})_k$ converging to some probability measure $\nu$ in the weak* topology. A priori the accumulation point $\nu$ need not be unique. We denote the set of all accumulation points of the sequence $(\nu_n)_n$ by $\mathcal{M}(\Phi)$. As explained above, $\mathcal{M}(\Phi) \ne \emptyset$. The following statement shows that all accumulation points are Gibbs measures.

Theorem 4.5 ([3]). Let $\Lambda$ be a repeller of a $C^1$ map, and let $\Phi$ be an almost additive sequence of continuous functions on $\Lambda$ with bounded variation. Then each measure in $\mathcal{M}(\Phi)$ is an ergodic Gibbs measure for $\Phi$.
Moreover, the following is a characterization of the unique invariant Gibbs measure.

Theorem 4.6 ([3]). Let $\Lambda$ be a repeller of a $C^1$ map, and let $\Phi$ be an almost additive sequence of continuous functions on $\Lambda$ with bounded variation. Then the unique invariant Gibbs measure for $\Phi$ is the unique invariant measure in $\mathcal{M}(\Phi)$.

When $\Phi$ is an almost additive sequence of continuous functions in $\Lambda$ with tempered variation (but not necessarily with bounded variation), we can still show that there exist an ergodic probability measure $\nu$ in $\Lambda$, a constant $K > 0$, and a positive sequence $(\rho_n)_n$ decreasing to 0, such that
\[ K^{-1} e^{-n\rho_n} \le \frac{\nu(\Delta_{i_1 \cdots i_n})}{\exp[-nP(\Phi) + \varphi_n(x)]} \le K e^{n\rho_n} \tag{27} \]
for every $n \in \mathbb{N}$, $(i_1 \cdots i_n) \in \Sigma_{A,n}$, and $x \in \Delta_{i_1 \cdots i_n}$. We emphasize that the measure $\nu$ need not be invariant. Furthermore, in general it may not be possible to obtain an invariant measure through an averaging procedure, due to the extra small exponentials in (27). On the other hand, it is still reasonable to call the measure $\nu$ in (27) a weak Gibbs measure for $\Phi$, as proposed by Yuri in [48].

5. Results for Hyperbolic Sets

We consider in this section the case of functions defined in a hyperbolic set, and we formulate corresponding results to those in Sec. 4 for functions defined in a repeller.

5.1. Hyperbolic sets and Markov partitions

Let $f \colon M \to M$ be a diffeomorphism of a smooth manifold $M$, and let $\Lambda \subset M$ be a compact $f$-invariant set. We say that $\Lambda$ is a hyperbolic set for $f$ if for every point $x \in \Lambda$ there exists a decomposition of the tangent space $T_x M = E^s(x) \oplus E^u(x)$ such that
\[ d_x f E^s(x) = E^s(f(x)) \quad \text{and} \quad d_x f E^u(x) = E^u(f(x)), \]
and there exist constants $\lambda \in (0, 1)$ and $c > 0$ such that
\[ \|d_x f^n | E^s(x)\| \le c\lambda^n \quad \text{and} \quad \|d_x f^{-n} | E^u(x)\| \le c\lambda^n \]
for every $x \in \Lambda$ and $n \in \mathbb{N}$. In addition, we always assume in this presentation that there is an open set $U \supset \Lambda$ such that
\[ \Lambda = \bigcap_{n \in \mathbb{Z}} f^n U, \tag{28} \]
and that $f$ is topologically mixing on $\Lambda$. Given $\varepsilon > 0$ sufficiently small, for each $x \in \Lambda$ the local stable and unstable manifolds (of size $\varepsilon$) are given by
\[ V^s(x) = \{y \in M : d(f^n(y), f^n(x)) < \varepsilon \text{ for every } n \ge 0\} \]
and
\[ V^u(x) = \{y \in M : d(f^n(y), f^n(x)) < \varepsilon \text{ for every } n \le 0\}, \]
where $d$ is the distance on $M$.

Now we briefly recall the notion of Markov partition for a hyperbolic set. A collection of closed sets $R_1, \ldots, R_p \subset \Lambda$ with sufficiently small diameter (given by (14)) is called a Markov partition of $\Lambda$ if:
(1) $\Lambda = \bigcup_{i=1}^{p} R_i$, and $\overline{\operatorname{int} R_i} = R_i$ for $i = 1, \ldots, p$;
(2) $V^s(x) \cap V^u(y) \subset R_i$ and $\operatorname{card}(V^s(x) \cap V^u(y)) = 1$ for $x, y \in R_i$;
(3) $\operatorname{int} R_i \cap \operatorname{int} R_j = \emptyset$ whenever $i \ne j$;
(4) if $x \in f(\operatorname{int} R_i) \cap \operatorname{int} R_j$, then $f^{-1}(V^u(f(x)) \cap R_j) \subset V^u(x) \cap R_i$ and $f(V^s(x) \cap R_i) \subset V^s(f(x)) \cap R_j$.
The interior of each set $R_i$ is computed with respect to the induced topology on $\Lambda$. Any hyperbolic set satisfying (28) has Markov partitions with arbitrarily small diameter (see, for example, [14]).

Given a Markov partition $R_1, \ldots, R_p$ of a hyperbolic set $\Lambda$, we define as in the case of repellers a $p \times p$ matrix $A = (a_{ij})$ with entries given by (15), and we consider the corresponding two-sided topological Markov chain defined by the shift map on the set
\[ \Sigma_A = \{(\cdots i_0 i_1 i_2 \cdots) \in \{1, \ldots, p\}^{\mathbb{Z}} : a_{i_k i_{k+1}} = 1 \text{ for every } k \in \mathbb{Z}\}. \tag{29} \]
We continue to denote by $\Sigma_{A,n}$ the set of $n$-tuples $(i_1 \cdots i_n)$ for which there is a sequence $(\cdots j_0 j_1 j_2 \cdots) \in \Sigma_A$ such that $i_\ell = j_\ell$ for $\ell = 1, \ldots, n$. For each $(i_1 \cdots i_n) \in \Sigma_{A,n}$ we consider again the sets $\Delta_{i_1 \cdots i_n}$ defined by (17).

5.2. Formulation of the results

Repeating arguments in the proofs of Theorems 3.1 and 3.2 we obtain the following statement, thus providing formulas for the topological pressure of an almost additive sequence.

Theorem 5.1 ([3]). Let $\Lambda$ be a hyperbolic set of a $C^1$ map, and let $\Phi$ be an almost additive sequence of continuous functions on $\Lambda$ with tempered variation. Then identities (20) and (22) hold for any points $x_{i_1 \cdots i_n} \in \Delta_{i_1 \cdots i_n}$, for each $(i_1 \cdots i_n) \in \Sigma_{A,n}$ and $n \in \mathbb{N}$.

We also formulate corresponding versions of Theorems 4.1 and 4.3.

Theorem 5.2 ([3]). Let $\Lambda$ be a hyperbolic set of a $C^1$ map, and let $\Phi$ be an almost additive sequence of continuous functions on $\Lambda$ with tempered variation. Then (23)
holds, including the existence in $L^1(\Lambda, \mu)$ of the first limit, and the existence of the second limit. In particular, this shows that the sequence $\Phi$ has at least one equilibrium measure.

Theorem 5.3 ([3]). Let $\Lambda$ be a hyperbolic set of a $C^1$ map, and let $\Phi$ be an almost additive sequence of continuous functions on $\Lambda$ with bounded variation. Then:
(1) there is a unique equilibrium measure for $\Phi$;
(2) there is a unique invariant Gibbs measure for $\Phi$;
(3) the two measures are equal, are mixing, and coincide with the weak limit of the sequence of invariant probability measures $\mu_n$ in (26).

6. Further Generalizations

Some of the former results for repellers and hyperbolic sets can be extended to more general classes of dynamics. We first present a variational principle for the topological pressure.

Theorem 6.1 ([3]). Let $f$ be a continuous map in a compact metric space $\Lambda$, and let $\Phi$ be an almost additive sequence of continuous functions in $\Lambda$ satisfying (6). Then
\[ \begin{aligned} P(\Phi) &= \sup_{\mu \in \mathcal{M}} \left( h_\mu(f) + \int_\Lambda \lim_{n \to \infty} \frac{\varphi_n(x)}{n} \, d\mu(x) \right) \\ &= \sup_{\mu \in \mathcal{M}} \left( h_\mu(f) + \lim_{n \to \infty} \frac{1}{n} \int_\Lambda \varphi_n \, d\mu \right), \end{aligned} \]
including the existence in $L^1(\Lambda, \mu)$ of the first limit, and the existence of the second limit.

We also formulate a criterion for the existence of equilibrium measures.

Theorem 6.2 ([3]). Let $f$ be a continuous map in a compact metric space $\Lambda$ such that $\mathcal{M} \ni \mu \mapsto h_\mu(f)$ is upper semi-continuous, and let $\Phi$ be an almost additive sequence of continuous functions on $\Lambda$ satisfying (6). Then there exists an equilibrium measure for $\Phi$.

For example, if $f$ is an expansive continuous map in $\Lambda$, then the entropy is upper semi-continuous, and hence each almost additive sequence has an equilibrium measure. We recall that $f$ is said to be expansive if there exists $\delta > 0$ such that if
\[ d(f^n(x), f^n(y)) < \delta \quad \text{for every } n \in \mathbb{N}, \]
then $x = y$ (when $f$ is invertible we replace $\mathbb{N}$ by $\mathbb{Z}$). For example, when $f$ is a one-sided or two-sided topologically mixing topological Markov chain, the entropy is
upper semi-continuous. Incidentally, all these transformations satisfy specification. On the other hand, there are plenty of transformations not satisfying specification for which the entropy is still upper semi-continuous. For example, all $\beta$-shifts are expansive, and thus the entropy is upper semi-continuous (see [31] for details), but for $\beta$ in a residual set of full Lebesgue measure (although the complement has full Hausdorff dimension) the corresponding $\beta$-shift does not satisfy specification (see [43]).

Finally, we describe some regularity properties of the topological pressure. We denote by $A(\Lambda)$ the family of almost additive sequences of continuous functions satisfying (6). Let also $E(\Lambda) \subset A(\Lambda)$ be the family of sequences with a unique equilibrium measure.

Theorem 6.3 ([5]). Let $f$ be a continuous map in a compact metric space $\Lambda$ such that $\mathcal{M} \ni \mu \mapsto h_\mu(f)$ is upper semi-continuous. Then:
(1) given $\Phi \in A(\Lambda)$, the function $t \mapsto P(\Phi + t\Psi)$ is differentiable at $t = 0$ for every $\Psi \in A(\Lambda)$ if and only if $\Phi \in E(\Lambda)$; in this case the unique equilibrium measure $\mu$ of $\Phi$ is ergodic, and
\[ \frac{d}{dt} P(\Phi + t\Psi)\Big|_{t=0} = \lim_{n \to \infty} \int_\Lambda \frac{\psi_n}{n} \, d\mu; \tag{30} \]
(2) for each open set $U \subset \mathbb{R}$, if $\Phi + t\Psi \in E(\Lambda)$ for every $t \in U$, then the function $t \mapsto P(\Phi + t\Psi)$ is of class $C^1$ in $U$.

The proof of Theorem 6.3 partially follows arguments in [31].

7. Application I: Nonconformal Repellers

We describe in this section a class of nonconformal repellers considered by Barreira and Gelfert in [7] to which one can apply the results in Sec. 4, in connection with the study of Lyapunov exponents of nonconformal transformations.

7.1. Cone condition and bounded distortion

To describe the class of repellers under consideration, we first introduce what we call a cone condition. Given a number $\gamma \le 1$ and a 1-dimensional subspace $E(x) \subset \mathbb{R}^2$, we consider the cone
\[ C_\gamma(x) = \{(u, v) \in E(x) \oplus E(x)^\perp : \|v\| \le \gamma \|u\|\}. \]
We say that a differentiable map $f \colon \mathbb{R}^2 \to \mathbb{R}^2$ satisfies a cone condition on a set $\Lambda \subset \mathbb{R}^2$ if there exist $\gamma \le 1$ and for each $x \in \Lambda$ a 1-dimensional subspace $E(x) \subset \mathbb{R}^2$ varying continuously with $x$ such that
\[ (d_x f) C_\gamma(x) \subset \{0\} \cup \operatorname{int} C_\gamma(f x). \tag{31} \]
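Condition (31) can be probed numerically in the simplest linear situation. The sketch below assumes that the derivative is a single constant $2 \times 2$ matrix, that $E(x)$ is the line spanned by $(1, 1)$ for every $x$, and that $\gamma = 1$; with these choices a vector $(x, y)$ lies in the cone exactly when $|y - x| \le \gamma |x + y|$. The function names and the sample vectors are our own, and a finite sample check is of course not a proof.

```python
def in_cone(vector, gamma=1.0):
    """Membership in C_gamma when E is spanned by (1, 1).

    Up to the common factor 1/sqrt(2), the components of (x, y) along E and
    along its orthogonal complement are x + y and y - x.
    """
    x, y = vector
    return abs(y - x) <= gamma * abs(x + y)


def in_open_cone(vector, gamma=1.0):
    """Membership in the interior of C_gamma (the origin is excluded)."""
    x, y = vector
    return abs(y - x) < gamma * abs(x + y)


def apply(matrix, vector):
    """Apply a 2 x 2 matrix, given as nested lists, to a vector."""
    x, y = vector
    return (matrix[0][0] * x + matrix[0][1] * y,
            matrix[1][0] * x + matrix[1][1] * y)


def maps_cone_into_interior(matrix, samples, gamma=1.0):
    """Check on sample vectors that nonzero cone vectors are mapped into the open cone."""
    return all(in_open_cone(apply(matrix, v), gamma)
               for v in samples if v != (0, 0) and in_cone(v, gamma))


if __name__ == "__main__":
    samples = [(1, 0), (0, 1), (1, 1), (-1, 0), (0, -2), (-3, -1)]
    print(maps_cone_into_interior([[2, 1], [1, 3]], samples))   # positive matrix
    print(maps_cone_into_interior([[0, -1], [1, 0]], samples))  # rotation by a right angle
```

For $\gamma = 1$ the cone is the union of the closed first and third quadrants, so a matrix with positive entries sends each of its nonzero vectors into the interior, while a rotation by a right angle does not.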
Following [7], we present several examples of maps satisfying a cone condition.

Example 7.1. Assume that for each $x \in \Lambda$ the derivative $d_x f$ is represented by a positive $2 \times 2$ matrix. Then the first quadrant $Q$ is invariant under these linear transformations, that is, $(d_x f) Q \subset Q$ for each $x \in \Lambda$. Therefore, the map $f$ satisfies the cone condition in (31) with $\gamma = 1$, taking for $E(x)$ the 1-dimensional subspace making an angle of $\pi/4$ with the horizontal direction. This example is related to work in [26] (see also [22]).

Another class of examples corresponds to the existence of a strongly unstable foliation.

Example 7.2. Let $\Lambda$ be a locally maximal repeller in the sense that in some open neighborhood $U$ the repeller $\Lambda$ is the only invariant set. In this case $f^{-1}\Lambda \cap U = \Lambda$. Assume that there exists a strongly unstable foliation of the set $U$, that is, a foliation by 1-dimensional $C^2$ leaves $V(x)$ such that:
(1) $f(V(x)) \supset V(f x)$ for every $x \in U \cap f^{-1} U$;
(2) there exist constants $c > 0$ and $\lambda \in (0, 1)$ such that
\[ |\det d_x f^n| \le c\lambda^n \|d_x f^n | T_x V(x)\|^2 \quad \text{for all } x \in \bigcap_{i=0}^{n} f^{-i} U \text{ and } n \in \mathbb{N}. \]
It is shown by Hu in [29] that this assumption is equivalent to:
(1) for some choice of subspaces $E(x)$ varying continuously with $x$, the cone condition in (31) holds for every $x \in U \cap f^{-1} U$;
(2) there exist 1-dimensional subspaces $F(x) \subset \{0\} \cup \operatorname{int} C_\gamma(x)$ for each $x \in U \cap f^{-1} U$ such that $d_x f F(x) = F(f x)$.
Thus, repellers with a strongly unstable foliation satisfy a cone condition.

Notice that the cone condition in (31) is weaker than assuming the existence of a strongly unstable foliation. In particular, (31) does not ensure the existence of an invariant distribution $F(x)$ as in Example 7.2. On the other hand, when there exists a strongly unstable foliation, the invariant distribution $F(x)$ is given by (see [29])
\[ F(x) = \bigcap_{n \in \mathbb{N}} \bigcup_{y \in f^{-n} x} d_y f^n C_\gamma(y). \]
It is thus independent of the particular preimages $x_n \in f^{-n} x$, that is,
\[ F(x) = \bigcap_{n \in \mathbb{N}} d_{x_n} f^n C_\gamma(x_n). \]
We can also consider repellers with a dominated splitting.
Example 7.3. We say that the repeller Λ possesses a dominated splitting if there exists a decomposition $T_\Lambda \mathbb{R}^2 = E \oplus F$ such that:
(1) $d_x f\,E(x) = E(f x)$ and $d_x f\,F(x) = F(f x)$ for every x ∈ Λ;
(2) there exist constants c > 0 and λ ∈ (0, 1) such that
$$\|d_x f^n | E\| \cdot \|(d_x f)^{-n} | F\| \le c\lambda^n \quad \text{for all } x \in \Lambda \text{ and } n \in \mathbb{N}.$$
It follows easily from the definition that the subspaces E(x) and F(x) vary continuously with x. Furthermore, one can verify that when there exists a dominated splitting of Λ, the map f satisfies a cone condition on Λ. We note that the existence of a strongly unstable foliation does not ensure the existence of a dominated splitting, due to the requirement of a df-invariant decomposition E ⊕ F (more precisely, the existence of a strongly unstable foliation only ensures the existence of the invariant distribution F in Example 7.2).

Now we consider certain almost additive sequences of functions obtained from the singular values of a 2 × 2 matrix A, namely $\sigma_1(A) = \|A\|$ and $\sigma_2(A) = \|A^{-1}\|^{-1}$ (with respect to the 2-norm in $\mathbb{R}^2$). Given a $C^1$ map $f : \mathbb{R}^2 \to \mathbb{R}^2$, we define sequences of functions $\Phi_i = (\varphi_{i,n})_n$ for i = 1, 2 by
$$\varphi_{i,n}(x) = \log \sigma_i(d_x f^n) \tag{32}$$
for each n ∈ N. Clearly, the functions $\varphi_{i,n}$ are continuous. These sequences are related to the Lyapunov exponents of the map f (see Sec. 7.2). We first present a criterion for almost additivity.

Proposition 7.4 ([7]). Let Λ be a repeller of a $C^1$ map $f : \mathbb{R}^2 \to \mathbb{R}^2$. If f satisfies a cone condition on Λ, then $\Phi_i$ is almost additive for i = 1, 2.

For a map f as in Proposition 7.4, we consider a number δ > 0 such that for every x ∈ Λ the map is invertible on the ball B(x, δ) (simply take a Lebesgue number of a cover by balls with the property that f is invertible on each of them). For each x ∈ Λ and n ∈ N we define
$$B_n(x, \delta) = \bigcap_{\ell=0}^{n-1} f^{-\ell} B(f^{\ell} x, \delta).$$
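As a rough numerical illustration of Proposition 7.4 in the spirit of Example 7.1, the following Python sketch replaces the derivative cocycle $d_x f^n$ by products $A_n \cdots A_1$ of positive 2 × 2 matrices and measures the almost additivity defect $|\varphi_{i,n+m} - \varphi_{i,n} - \varphi_{i,m} \circ f^n|$ of the two sequences in (32). The matrices, their entry ranges, and all function names are assumptions made only for this illustration; they do not come from [7]. For positive matrices with entries in [1, 2], a crude elementary estimate bounds the defect by log 32 for both i = 1 and i = 2.

```python
import numpy as np

def log_singular_values(matrix):
    """Return (log sigma_1, log sigma_2) of a 2x2 matrix, sigma_1 >= sigma_2."""
    sigma = np.linalg.svd(matrix, compute_uv=False)  # sorted in decreasing order
    return np.log(sigma)

def product(matrices):
    """Return A_n ... A_1, mimicking d_x f^n via the chain rule."""
    result = np.eye(2)
    for a in matrices:
        result = a @ result
    return result

def max_defect(matrices, i):
    """Largest |phi_{n+m} - phi_n - phi_m o f^n| over all splittings (i in {0, 1})."""
    worst = 0.0
    total = len(matrices)
    for n in range(1, total):
        for m in range(1, total - n + 1):
            whole = log_singular_values(product(matrices[: n + m]))[i]
            first = log_singular_values(product(matrices[:n]))[i]
            second = log_singular_values(product(matrices[n : n + m]))[i]
            worst = max(worst, abs(whole - first - second))
    return worst

def random_positive_matrices(count, seed=0):
    """Hypothetical test data: positive matrices with all entries in [1, 2]."""
    rng = np.random.default_rng(seed)
    diag = rng.uniform(1.5, 2.0, size=(count, 2))
    off = rng.uniform(1.0, 1.2, size=(count, 2))
    return [np.array([[d[0], o[0]], [o[1], d[1]]]) for d, o in zip(diag, off)]

matrices = random_positive_matrices(8)
defects = [max_defect(matrices, i) for i in (0, 1)]
print(defects)  # both values stay below log(32) for this family
```

The bound is uniform in n and m, which is exactly what almost additivity asks for; without positivity (or a cone condition) the defect of log σ_i can grow without bound along a product.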
We always assume that the diameter of the Markov partition used to define the sets $\Delta_{i_1 \cdots i_n}$ in (17) is at most δ/2 (we recall that any repeller has Markov partitions of arbitrarily small diameter). This ensures that $\Delta_{i_1 \cdots i_n} \subset B_n(x, \delta)$ for every $x = \chi(i_1 i_2 \cdots) \in \Lambda$ and n ∈ N. We say that f has bounded distortion on Λ if there exists δ > 0 such that
$$\sup\{\|d_y f^n (d_z f^n)^{-1}\| : x \in \Lambda \text{ and } y, z \in B_n(x, \delta)\} < \infty.$$
Now we give a condition for bounded distortion in the case of $C^{1+\alpha}$ transformations. Given α > 0, we say that f is α-bunched on Λ if
$$\|(d_x f)^{-1}\|^{1+\alpha}\, \|d_x f\| < 1 \quad \text{for every } x \in \Lambda$$
(this notion was introduced in [1] in the context of dimension theory of nonconformal transformations). The following statement is an immediate consequence of the proof of [2, Theorem 4].

Proposition 7.5. Let Λ be a repeller of a $C^{1+\alpha}$ map f : M → M. If f is α-bunched on Λ, then f has bounded distortion on Λ.

Now we consider the sequences $\Phi_i$ for i = 1, 2 introduced in (32) and we present a criterion for bounded variation.

Proposition 7.6 ([7, Proposition 1]). Let Λ be a repeller of a $C^1$ transformation $f : \mathbb{R}^2 \to \mathbb{R}^2$. If f has bounded distortion on Λ, then $\Phi_i$ has bounded variation for i = 1, 2.

7.2. Variational principle and Gibbs measures

It follows from Propositions 7.4 and 7.6 that if a $C^1$ map $f : \mathbb{R}^2 \to \mathbb{R}^2$ satisfies a cone condition on Λ and has bounded distortion on Λ, then $\Phi_i$ is an almost additive sequence with bounded variation for i = 1, 2. This allows us to apply the results in Sec. 4 to recover the corresponding statements of Barreira and Gelfert in [7].

To explain the relation between the sequences $\Phi_i$ and the theory of Lyapunov exponents, we first recall some basic notions. Given a differentiable transformation f : M → M (which is not necessarily invertible), for each x ∈ M and $v \in T_x M$ we define the Lyapunov exponent of (x, v) by
$$\chi(x, v) = \limsup_{n \to +\infty} \frac{1}{n} \log \|d_x f^n v\|, \tag{33}$$
with the convention that log 0 = −∞. It follows from the abstract theory of Lyapunov exponents (see [8] for full details) that for each x ∈ M there exist a positive integer s(x) ≤ dim M, numbers $\chi_1(x) < \cdots < \chi_{s(x)}(x)$, and linear subspaces
$$\{0\} = E_0(x) \subset E_1(x) \subset \cdots \subset E_{s(x)}(x) = T_x M$$
such that for i = 1, ..., s(x) we have
$$E_i(x) = \{v \in T_x M : \chi(x, v) \le \chi_i(x)\},$$
and $\chi(x, v) = \chi_i(x)$ whenever $v \in E_i(x) \setminus E_{i-1}(x)$. It follows from Oseledets' multiplicative ergodic theorem (see, for example, [8]), or more precisely from its version for noninvertible transformations, that for each finite f-invariant measure in M there is a set X ⊂ M of full measure such that if x ∈ X, then
$$\lim_{n \to +\infty} \frac{1}{n} \log \|d_x f^n v\| = \chi_i(x)$$
for every $v \in E_i(x) \setminus E_{i-1}(x)$ and i = 1, ..., s(x), with uniform convergence in v on each subspace $F \subset E_i(x)$ such that $F \cap E_{i-1}(x) = \{0\}$ (in particular, the lim sup in (33) is now a limit). For $M = \mathbb{R}^2$ and each $x \in \mathbb{R}^2$, when s(x) = 1 we set
$$\lambda_1(x) = \chi_1(x) \quad \text{and} \quad \lambda_2(x) = \chi_1(x),$$
and when s(x) = 2 we set
$$\lambda_1(x) = \chi_1(x) \quad \text{and} \quad \lambda_2(x) = \chi_2(x).$$
The numbers $\lambda_1(x)$ and $\lambda_2(x)$ are the values of the Lyapunov exponent $v \mapsto \chi(x, v)$ counted with multiplicities. It follows again from Oseledets' multiplicative ergodic theorem that for each finite f-invariant measure in $\mathbb{R}^2$ there is a set $X \subset \mathbb{R}^2$ of full measure such that
$$\lim_{n \to +\infty} \frac{\varphi_{i,n}(x)}{n} = \lim_{n \to +\infty} \frac{1}{n} \log \sigma_i(d_x f^n) = \lambda_i(x)$$
for each x ∈ X and i = 1, 2 (see (32)). Combining these observations with the criteria in Propositions 7.4 and 7.6, we readily obtain the following statement of Barreira and Gelfert by applying the results in Sec. 4.

Theorem 7.7 ([7]). Let Λ be a repeller of a $C^1$ map $f : \mathbb{R}^2 \to \mathbb{R}^2$. If f satisfies a cone condition on Λ, and f has bounded distortion on Λ, then for i = 1, 2 the following properties hold:
(1) the topological pressure satisfies the variational principle
$$P(\Phi_i) = \max_{\mu \in \mathcal{M}} \left\{ h_\mu(f) + \int_\Lambda \lambda_i(x)\, d\mu(x) \right\} = \max_{\mu \in \mathcal{M}} \left\{ h_\mu(f) + \lim_{n \to \infty} \frac{1}{n} \int_\Lambda \log \sigma_i(d_x f^n)\, d\mu(x) \right\};$$
(2) there is a unique equilibrium measure $\mu_i$ for $\Phi_i$, and this is the unique invariant Gibbs measure for $\Phi_i$;
(3) there is a constant K > 0 such that
$$K^{-1} \le \frac{\mu_i(\Delta_{i_1 \cdots i_n})}{\exp[-nP(\Phi_i)]\, \sigma_i(d_x f^n)} \le K$$
for every n ∈ N, $(i_1 \cdots i_n) \in \Sigma_{A,n}$, and $x \in \Delta_{i_1 \cdots i_n}$;
(4) the measure $\mu_i$ is mixing, and
$$\frac{\sum_{x \in \operatorname{Fix}(f^n)} \sigma_i(d_x f^n)\, \delta_x}{\sum_{x \in \operatorname{Fix}(f^n)} \sigma_i(d_x f^n)} \to \mu_i \quad \text{as } n \to \infty.$$
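The relation between the sequences in (32) and the Lyapunov exponents can be checked by hand in the simplest possible situation. The sketch below assumes a linear map f(x) = Ax with a fixed upper triangular matrix A (a hypothetical choice, not an example from [7]), so that $d_x f^n = A^n$ for every x; the normalized logarithms of the two singular values of $A^n$ should then approach the logarithms of the moduli of the eigenvalues of A, up to the ordering convention used for $\lambda_1$ and $\lambda_2$.

```python
import numpy as np

def finite_time_exponents(a, n):
    """Return ((1/n) log sigma_1(A^n), (1/n) log sigma_2(A^n)), sigma_1 >= sigma_2."""
    power = np.linalg.matrix_power(a, n)
    sigma = np.linalg.svd(power, compute_uv=False)
    return np.log(sigma) / n

# Hypothetical expanding linear map with eigenvalues 3 and 2.
a = np.array([[3.0, 1.0], [0.0, 2.0]])
approx = finite_time_exponents(a, 40)
exact = np.log(np.array([3.0, 2.0]))  # logarithms of the eigenvalue moduli
print(approx, exact)
```

Since A is not normal, the singular values of $A^n$ differ from the n-th powers of the eigenvalues by a bounded factor, so the error decays like 1/n rather than vanishing identically.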
8. Application II: Multifractal Analysis

We describe in this section a conditional variational principle for the u-dimension spectrum established by Barreira and Doutor in [5]. This contains as a particular case a conditional variational principle for the entropy spectrum (see Theorem 8.3 below). For simplicity of the exposition, we do not consider the multidimensional case in [5] but only the case of a single ratio of almost additive functions. We emphasize that this is already a nontrivial result when compared to the existing results in the classical case of additive sequences.

8.1. Notion of u-dimension

We recall in this section the notion of u-dimension introduced by Barreira and Schmeling in [10]. Let f : X → X be a continuous transformation of a compact metric space, and let $\mathcal{U}$ be a finite open cover of X. Let also $u : X \to \mathbb{R}^+$ be a continuous function. Given a set Z ⊂ X and a number α ∈ R, we define the function
$$N(Z, \alpha, u, \mathcal{U}) = \lim_{n \to \infty} \inf_{\Gamma} \sum_{U \in \Gamma} \exp(-\alpha u(U)),$$
where u(U) is defined as in (8), and where the infimum is taken over all finite or countable collections $\Gamma \subset \bigcup_{k \ge n} W_k(\mathcal{U})$ such that $\bigcup_{U \in \Gamma} X(U) \supset Z$. Setting
$$\dim_{u,\mathcal{U}} Z = \inf\{\alpha \in \mathbb{R} : N(Z, \alpha, u, \mathcal{U}) = 0\},$$
one can show that the limit
$$\dim_u Z = \lim_{\operatorname{diam} \mathcal{U} \to 0} \dim_{u,\mathcal{U}} Z$$
exists. The number $\dim_u Z$ is called the u-dimension of the set Z (with respect to f). For example, if u = 1, then $\dim_u Z$ is equal to the topological entropy h(f | Z) of f on Z (see Sec. 2). The following result is an easy consequence of the definitions.

Proposition 8.1. The number $\dim_u Z$ is the unique root α of the equation $P_Z(-\alpha U) = 0$, where $U = (u_n)_n$ with $u_n = \sum_{k=0}^{n-1} u \circ f^k$ for each n ∈ N.

Furthermore, given a probability measure µ in X, we set
$$\dim_{u,\mathcal{U}} \mu = \inf\{\dim_{u,\mathcal{U}} Z : \mu(Z) = 1\}.$$
One can show that the limit
$$\dim_u \mu = \lim_{\operatorname{diam} \mathcal{U} \to 0} \dim_{u,\mathcal{U}} \mu$$
exists, and we call it the u-dimension of µ. Moreover, the lower and upper u-pointwise dimensions of µ at the point x ∈ X are defined by
$$\underline{d}_{\mu,u}(x) = \lim_{\operatorname{diam} \mathcal{U} \to 0} \liminf_{n \to \infty} \inf_{U} \left( -\frac{\log \mu(X(U))}{u(U)} \right)$$
and
$$\overline{d}_{\mu,u}(x) = \lim_{\operatorname{diam} \mathcal{U} \to 0} \limsup_{n \to \infty} \sup_{U} \left( -\frac{\log \mu(X(U))}{u(U)} \right),$$
where the infimum and supremum are taken over all vectors $U \in W_n(\mathcal{U})$ such that x ∈ X(U). If $\mu \in \mathcal{M}$ is ergodic, then
$$\dim_u \mu = \underline{d}_{\mu,u}(x) = \overline{d}_{\mu,u}(x) = \frac{h_\mu(f)}{\int_X u\, d\mu}$$
for µ-almost every x ∈ X (see [10]).

8.2. Conditional variational principle

We formulate in this section a conditional variational principle for the u-dimension of sets defined in terms of ratios of almost additive sequences. This corresponds to a multifractal analysis of the level sets of limits of ratios of almost additive sequences.

We continue to consider a continuous map f : X → X of a compact metric space. Let $\Phi = (\varphi_n)_n$ and $\Psi = (\psi_n)_n$ be almost additive sequences of functions in X. We assume that
$$\liminf_{m \to \infty} \frac{\psi_m(x)}{m} > 0 \quad \text{and} \quad \psi_n(x) > 0$$
for every x ∈ X and n ∈ N. Given α ∈ R we define
$$K_\alpha = \left\{ x \in X : \lim_{n \to \infty} \frac{\varphi_n(x)}{\psi_n(x)} = \alpha \right\}. \tag{34}$$
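Before turning to the spectrum of these level sets, the characterization of the u-dimension in Proposition 8.1 can be made concrete in a toy case. The sketch below assumes that Z = X is the full shift on finitely many symbols and that u depends only on the first symbol, so that the pressure of $-\alpha U$ reduces to the classical formula $\log \sum_i e^{-\alpha u_i}$; the function names and the bisection solver are illustrative choices, not part of [10].

```python
import math

def pressure(alpha, u_values):
    """Classical pressure of -alpha*u on the full shift, u depending on the first symbol."""
    return math.log(sum(math.exp(-alpha * u) for u in u_values))

def u_dimension(u_values, tol=1e-12):
    """Solve pressure(alpha) = 0 by bisection; the pressure is strictly decreasing in alpha."""
    lo, hi = 0.0, 1.0
    while pressure(hi, u_values) > 0:  # enlarge the bracket until the sign changes
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pressure(mid, u_values) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# With u = 1 the root is the topological entropy log 2 of the full 2-shift.
entropy_case = u_dimension([1.0, 1.0])
# With u = log 3 on both symbols the root is log 2 / log 3.
cantor_case = u_dimension([math.log(3.0), math.log(3.0)])
print(entropy_case, cantor_case)
```

The first call recovers the remark that for u = 1 the u-dimension is the topological entropy; the second shows how a nonconstant scale u turns the same equation into a dimension-type quantity.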
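The function $F_u : \mathbb{R} \to \mathbb{R}$ defined by
$$F_u(\alpha) = \dim_u K_\alpha$$
is called the u-dimension spectrum of the pair (Φ, Ψ) (with respect to f). We also consider the function $\mathcal{P} : \mathcal{M} \to \mathbb{R}$ defined by
$$\mathcal{P}(\mu) = \lim_{n \to \infty} \frac{\int_X \varphi_n\, d\mu}{\int_X \psi_n\, d\mu}.$$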
The following is a conditional variational principle for the spectrum $F_u$. We consider the (additive) sequence of functions $U = (u_n)_n$ with $u_n = \sum_{k=0}^{n-1} u \circ f^k$ for each n ∈ N. We recall that E(X) denotes the family of almost additive sequences satisfying (6) with a unique equilibrium measure.

Theorem 8.2 ([5]). Let f be a continuous map of a compact metric space X such that $\mu \mapsto h_\mu(f)$ is upper semi-continuous, and assume that span{Φ, Ψ, U} ⊂ E(X).
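If $\alpha \notin \mathcal{P}(\mathcal{M})$, then $K_\alpha = \emptyset$. Otherwise, if $\alpha \in \operatorname{int} \mathcal{P}(\mathcal{M})$, then $K_\alpha \ne \emptyset$, and the following properties hold:
(1) $F_u$ satisfies the variational principle
$$F_u(\alpha) = \max\left\{ \frac{h_\mu(f)}{\int_X u\, d\mu} : \mu \in \mathcal{M} \text{ and } \mathcal{P}(\mu) = \alpha \right\};$$
(2) we have $F_u(\alpha) = \min\{T_u(\alpha, q) : q \in \mathbb{R}\}$, where $T_u(\alpha, q)$ is the unique real number satisfying
$$P(q(\Phi - \alpha\Psi) - T_u(\alpha, q)U) = 0; \tag{35}$$
(3) there is an ergodic measure $\mu_\alpha \in \mathcal{M}$ such that $\mathcal{P}(\mu_\alpha) = \alpha$, $\mu_\alpha(K_\alpha) = 1$, and
$$\dim_u \mu_\alpha = \frac{h_{\mu_\alpha}(f)}{\int_X u\, d\mu_\alpha} = F_u(\alpha).$$
In addition, the spectrum $F_u$ is continuous in $\operatorname{int} \mathcal{P}(\mathcal{M})$.

The proof of Theorem 8.2 builds on earlier work of Barreira et al. in [9]. We note that the number $T_u(\alpha, q)$ is defined implicitly by (35). By Theorem 6.3, the function $(p, \alpha, q) \mapsto P(q(\Phi - \alpha\Psi) - pU)$ is of class $C^1$. By the implicit function theorem, we conclude that $(\alpha, q) \mapsto T_u(\alpha, q)$ is also of class $C^1$ in $\mathbb{R}^2$, since by (30),
$$\frac{\partial}{\partial p} P(q(\Phi - \alpha\Psi) - pU)\Big|_{(p,q) = (T_u(\alpha,q),\, q)} = -\int_X u\, d\mu_q < 0,$$
where $\mu_q$ is the unique equilibrium measure of $q(\Phi - \alpha\Psi) - T_u(\alpha, q)U$.

Now we formulate explicitly a particular case of Theorem 8.2. Let $\Phi = (\varphi_n)_n$ be an almost additive sequence of functions $\varphi_n : X \to \mathbb{R}$. Given α ∈ R, we consider the level set
$$K_\alpha = \left\{ x \in X : \lim_{n \to \infty} \frac{\varphi_n(x)}{n} = \alpha \right\}.$$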
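The entropy spectrum $E : \mathbb{R} \to \mathbb{R}$ (of the sequence Φ) is defined by
$$E(\alpha) = h(f | K_\alpha),$$
where $h(f | K_\alpha)$ denotes the topological entropy of f on $K_\alpha$ (see Secs. 2 and 8.1). We also consider the function $\mathcal{P} : \mathcal{M} \to \mathbb{R}$ defined by
$$\mathcal{P}(\mu) = \lim_{n \to \infty} \frac{1}{n} \int_X \varphi_n\, d\mu.$$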
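To see what the entropy spectrum looks like in a case that can be computed by hand, the following sketch anticipates the Legendre-type formula in item (2) of Theorem 8.3 below. It assumes the classical additive situation on the full shift on two symbols, with $\varphi_n$ the Birkhoff sum of a function taking the value $a_i$ on sequences starting with the symbol i, so that $P(q\Phi) = \log \sum_i e^{q a_i}$. For the values a = (0, 1) the measures with $\mathcal{P}(\mu) = \alpha$ are those giving mass α to the symbol 1, and the maximal entropy among them is the Bernoulli entropy of (1 − α, α). All names and the ternary search are illustrative assumptions.

```python
import math

def pressure(q, a_values):
    """P(q*Phi) = log sum_i exp(q*a_i), evaluated in a numerically stable way."""
    m = max(q * a for a in a_values)
    return m + math.log(sum(math.exp(q * a - m) for a in a_values))

def entropy_spectrum(alpha, a_values, q_bound=60.0, iterations=200):
    """Minimise the convex map q -> P(q*Phi) - q*alpha by ternary search on [-q_bound, q_bound]."""
    lo, hi = -q_bound, q_bound

    def objective(q):
        return pressure(q, a_values) - q * alpha

    for _ in range(iterations):
        left = lo + (hi - lo) / 3.0
        right = hi - (hi - lo) / 3.0
        if objective(left) < objective(right):
            hi = right
        else:
            lo = left
    return objective(0.5 * (lo + hi))

def bernoulli_entropy(p):
    """Entropy of the Bernoulli measure with weights (1 - p, p)."""
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)

legendre_value = entropy_spectrum(0.3, [0.0, 1.0])
variational_value = bernoulli_entropy(0.3)
print(legendre_value, variational_value)
```

The two numbers agree, which is the content of items (1) and (2) of Theorem 8.3 in this toy case; the search interval for q is only adequate for α in the interior of the range of averages.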
The following statement is a conditional variational principle for the entropy spectrum E. It is an immediate consequence of Theorem 8.2.

Theorem 8.3. Let f be a continuous map of a compact metric space X such that $\mu \mapsto h_\mu(f)$ is upper semi-continuous, and assume that the almost additive sequence Φ has a unique equilibrium measure. If $\alpha \notin \mathcal{P}(\mathcal{M})$, then $K_\alpha = \emptyset$. Otherwise, if $\alpha \in \operatorname{int} \mathcal{P}(\mathcal{M})$, then $K_\alpha \ne \emptyset$, and the following properties hold:
(1) E satisfies the variational principle
$$E(\alpha) = \max\{h_\mu(f) : \mu \in \mathcal{M} \text{ and } \mathcal{P}(\mu) = \alpha\};$$
(2) $E(\alpha) = \min\{P(q\Phi) - q\alpha : q \in \mathbb{R}\}$;
(3) there is an ergodic measure $\mu_\alpha \in \mathcal{M}$ such that $\mathcal{P}(\mu_\alpha) = \alpha$, $\mu_\alpha(K_\alpha) = 1$, and $h_{\mu_\alpha}(f) = E(\alpha)$.
In addition, the spectrum E is continuous in $\operatorname{int} \mathcal{P}(\mathcal{M})$.

Now we consider the associated irregular sets, on which the limits in (34) do not exist. We consider only the particular case of topological Markov chains. Namely, let Φ and Ψ be almost additive sequences in $\Sigma_A$, either as in (16) or as in (29). The irregular set of the pair (Φ, Ψ) is defined by
$$I = \left\{ x \in \Sigma_A : \liminf_{n \to \infty} \frac{\varphi_n(x)}{\psi_n(x)} < \limsup_{n \to \infty} \frac{\varphi_n(x)}{\psi_n(x)} \right\},$$
and we denote by $m_u$ the equilibrium measure of u, when it is unique.

Theorem 8.4 ([5]). Let $\sigma | \Sigma_A$ be a topologically mixing topological Markov chain. If span{Φ, Ψ, U} ⊂ E($\Sigma_A$), and $\mathcal{P}(m_u) \in \operatorname{int} \mathcal{P}(\mathcal{M}_\sigma)$, then
$$\dim_u I = \dim_u \Sigma_A.$$

Theorem 8.4 follows from the application of results in [10] combined with Theorem 8.2.

9. Application III: Dimension Spectra

Our last application of the almost additive thermodynamic formalism considers dimension spectra of level sets associated to the limits of ratios of almost additive sequences. Moreover, we take into account simultaneously limits of ratios of sequences into the future and into the past.
Let f : M → M be a $C^{1+\varepsilon}$ surface diffeomorphism with a hyperbolic set Λ satisfying the same hypotheses as in Sec. 5.1. We always assume that $\dim E^s(x) = \dim E^u(x) = 1$ for every x ∈ Λ. Let $t_s$ and $t_u$ be the unique real numbers such that
$$P(t_s \log \|df | E^s\|) = P(t_u \log \|df^{-1} | E^u\|) = 0,$$
where P denotes the (classical) topological pressure with respect to f on Λ. It was shown by McCluskey and Manning in [33] that
$$\dim_H(\Lambda \cap V^s(x)) = t_s \quad \text{and} \quad \dim_H(\Lambda \cap V^u(x)) = t_u$$
for every x ∈ Λ, where $\dim_H$ denotes the Hausdorff dimension. Moreover, it was shown by Palis and Viana in [35] that
$$\dim_H(\Lambda \cap V^s(x)) = \overline{\dim}_B(\Lambda \cap V^s(x)), \quad \dim_H(\Lambda \cap V^u(x)) = \overline{\dim}_B(\Lambda \cap V^u(x))$$
for every x ∈ Λ, where $\overline{\dim}_B$ denotes the upper box dimension. Since the stable and unstable distributions have codimension 1, it follows from results of Hasselblatt in [28] that the maps $x \mapsto E^s(x)$ and $x \mapsto E^u(x)$ are Lipschitz. This implies that
$$\dim_H \Lambda = \dim_H[(\Lambda \cap V^s(x)) \times (\Lambda \cap V^u(x))] = \dim_H(\Lambda \cap V^s(x)) + \dim_H(\Lambda \cap V^u(x)) = t_s + t_u. \tag{36}$$
Indeed, if $\dim_H A = \overline{\dim}_B A$, then for any set B we have $\dim_H(A \times B) = \dim_H A + \dim_H B$.

Now we proceed with the description of the dimension spectra. We denote by $L^+$ (respectively, $L^-$) the family of almost additive sequences of continuous functions with respect to f (respectively, $f^{-1}$) that have bounded variation with respect to f (respectively, $f^{-1}$). We only consider almost additive sequences
$$\Phi^+ = (\varphi_n^+)_n, \quad \Phi^- = (\varphi_n^-)_n, \quad \Psi^+ = (\psi_n^+)_n, \quad \text{and} \quad \Psi^- = (\psi_n^-)_n$$
such that
$$\liminf_{m \to \infty} \frac{\psi_m^\pm(x)}{m} > 0 \quad \text{and} \quad \psi_n^\pm(x) > 0$$
for every n ∈ N and x ∈ Λ. Given $(\Phi^+, \Psi^+) \in L^+ \times L^+$ and α ∈ R we define
$$K_\alpha^+ = \left\{ x \in \Lambda : \lim_{n \to \infty} \frac{\varphi_n^+(x)}{\psi_n^+(x)} = \alpha \right\},$$
and given $(\Phi^-, \Psi^-) \in L^- \times L^-$ and α ∈ R we define
$$K_\alpha^- = \left\{ x \in \Lambda : \lim_{n \to \infty} \frac{\varphi_n^-(x)}{\psi_n^-(x)} = \alpha \right\}.$$
We also consider the dimension spectrum $D : \mathbb{R}^2 \to \mathbb{R}$ defined by
$$D(\alpha, \beta) = \dim_H(K_\alpha^+ \cap K_\beta^-).$$
The following is a conditional variational principle for the spectrum D.

Theorem 9.1 ([6]). If $\alpha \in \operatorname{int} \mathcal{P}^+(\mathcal{M})$ and $\beta \in \operatorname{int} \mathcal{P}^-(\mathcal{M})$, then
$$\begin{aligned} D(\alpha, \beta) &= \dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H \Lambda \\ &= \max\left\{ \frac{h_\mu(f)}{-\int_\Lambda \log \|df | E^s\|\, d\mu} : \mu \in \mathcal{M} \text{ and } \mathcal{P}^+(\mu) = \alpha \right\} \\ &\quad + \max\left\{ \frac{h_\mu(f)}{\int_\Lambda \log \|df | E^u\|\, d\mu} : \mu \in \mathcal{M} \text{ and } \mathcal{P}^-(\mu) = \beta \right\}. \end{aligned} \tag{37}$$
Moreover, the spectrum D is analytic in $\operatorname{int} \mathcal{P}^+(\mathcal{M}) \times \operatorname{int} \mathcal{P}^-(\mathcal{M})$.

The proof of Theorem 9.1 follows to some extent arguments of Barreira and Valls ([11]) in the additive case. In particular, it involves constructing a measure $\nu = \nu_{\alpha\beta}$ sitting on the set $K_\alpha^+ \cap K_\beta^-$, that is, such that $\nu(K_\alpha^+ \cap K_\beta^-) = 1$, having the "right" pointwise dimension. This means that
$$\liminf_{r \to 0} \frac{\log \nu(B(x, r))}{\log r} \ge \dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H \Lambda$$
for ν-almost every x ∈ Λ, and
$$\limsup_{r \to 0} \frac{\log \nu(B(x, r))}{\log r} \le \dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H \Lambda$$
for every $x \in K_\alpha^+ \cap K_\beta^-$. These properties, together with general results in dimension theory (see, for example, [4]), readily yield the first identity in (37). The second identity follows from Theorem 8.2. The measure ν, although never invariant, is constructed essentially as a product of (invariant) equilibrium measures along the stable and unstable directions, for which the results in Sec. 4 are essential. More precisely, set
$$U = q^+(\Phi - \alpha\Psi) - (\dim_H K_\alpha^+ - t_s) D_u$$
and
$$S = q^-(\Phi - \beta\Psi) - (\dim_H K_\beta^- - t_u) D_s,$$
where $D_u$ and $D_s$ are the additive sequences
$$\sum_{k=0}^{n-1} \log \|df | E^u\| \circ f^k \quad \text{and} \quad \sum_{k=0}^{n-1} \log \|df^{-1} | E^s\| \circ f^k,$$
and where $q^+, q^- \in \mathbb{R}$ are such that
$$P(U) = P(S) = 0.$$
By the almost additive thermodynamic formalism there exist unique equilibrium measures $\nu^u$ and $\nu^s$ respectively of U and S. Roughly speaking, the measure $\nu_{\alpha\beta}$ is given by the product $\nu^u \times \nu^s$ at the level of symbolic dynamics.

It is also shown in [6] that
$$\dim_H K_\alpha^+ = \dim_H(K_\alpha^+ \cap V^u(x)) + t_s \quad \text{and} \quad \dim_H K_\beta^- = \dim_H(K_\beta^- \cap V^s(y)) + t_u$$
for every $x \in K_\alpha^+$ and $y \in K_\beta^-$. Together with (36) and (37), this shows that
$$D(\alpha, \beta) = \dim_H(K_\alpha^+ \cap V^u(x)) + \dim_H(K_\beta^- \cap V^s(y))$$
for every $x \in K_\alpha^+$ and $y \in K_\beta^-$.

Note Added in Proof. In the meantime, I became aware of the interesting paper [24] by Feng and Huang. Their work considers the more general case of asymptotically subadditive sequences and is a quite substantial advance towards a general theory.

Acknowledgment

The author was partially supported by FCT through CAMGSD, Lisbon.

References

[1] L. Barreira, A non-additive thermodynamic formalism and applications to dimension theory of hyperbolic dynamical systems, Ergodic Theory Dynam. Systems 16 (1996) 871–927.
[2] L. Barreira, Dimension estimates in nonconformal hyperbolic dynamics, Nonlinearity 16 (2003) 1657–1672.
[3] L. Barreira, Nonadditive thermodynamic formalism: Equilibrium and Gibbs measures, Discrete Contin. Dyn. Syst. 16 (2006) 279–305.
[4] L. Barreira, Dimension and Recurrence in Hyperbolic Dynamics, Progress in Mathematics, Vol. 272 (Birkhäuser, 2008).
[5] L. Barreira and P. Doutor, Almost additive multifractal analysis, J. Math. Pures Appl. 92 (2009) 1–17.
[6] L. Barreira and P. Doutor, Dimension spectra of almost additive sequences, Nonlinearity 22 (2009) 2761–2773.
[7] L. Barreira and K. Gelfert, Multifractal analysis for Lyapunov exponents on nonconformal repellers, Comm. Math. Phys. 267 (2006) 393–418.
[8] L. Barreira and Ya. Pesin, Lyapunov Exponents and Smooth Ergodic Theory, Univ. Lect. Ser., Vol. 23 (Amer. Math. Soc., 2002).
[9] L. Barreira, B. Saussol and J. Schmeling, Higher-dimensional multifractal analysis, J. Math. Pures Appl. 81 (2002) 67–91.
[10] L. Barreira and J. Schmeling, Sets of "non-typical" points have full topological entropy and full Hausdorff dimension, Israel J. Math. 116 (2000) 29–70.
[11] L. Barreira and C. Valls, Multifractal structure of two-dimensional horseshoes, Comm. Math. Phys. 266 (2006) 455–470.
[12] H. Bothe, The Hausdorff dimension of certain solenoids, Ergodic Theory Dynam. Systems 15 (1995) 449–474.
[13] R. Bowen, Topological entropy for noncompact sets, Trans. Amer. Math. Soc. 184 (1973) 125–136.
[14] R. Bowen, Equilibrium States and the Ergodic Theory of Anosov Diffeomorphisms, Lect. Notes in Math., Vol. 470 (Springer, 1975).
[15] R. Bowen, Hausdorff dimension of quasi-circles, Inst. Hautes Études Sci. Publ. Math. 50 (1979) 259–273.
[16] Y.-L. Cao, D.-J. Feng and W. Huang, The thermodynamic formalism for sub-additive potentials, Discrete Contin. Dyn. Syst. 20 (2008) 639–657.
[17] P. Collet, J. Lebowitz and A. Porzio, The dimension spectrum of some dynamical systems, J. Stat. Phys. 47 (1987) 609–644.
[18] K. Falconer, The Hausdorff dimension of self-affine fractals, Math. Proc. Cambridge Philos. Soc. 103 (1988) 339–350.
[19] K. Falconer, A subadditive thermodynamic formalism for mixing repellers, J. Phys. A 21 (1988) 1737–1742.
[20] K. Falconer, Bounded distortion and dimension for non-conformal repellers, Math. Proc. Cambridge Philos. Soc. 115 (1994) 315–334.
[21] D.-J. Feng, Lyapunov exponents for products of matrices and multifractal analysis. I. Positive matrices, Israel J. Math. 138 (2003) 353–376.
[22] D.-J. Feng, The variational principle for products of non-negative matrices, Nonlinearity 17 (2004) 447–457.
[23] D.-J. Feng, Lyapunov exponents for products of matrices and multifractal analysis. II. General matrices, Israel J. Math. 170 (2009) 355–394.
[24] D.-J. Feng and W. Huang, Lyapunov spectrum of asymptotically sub-additive potentials, Comm. Math. Phys. 297 (2010) 1–43.
[25] D.-J. Feng and A. Käenmäki, Equilibrium states for the pressure function for products of matrices, preprint (2009).
[26] D.-J. Feng and K. Lau, The pressure function for products of non-negative matrices, Math. Res. Lett. 9 (2002) 363–378.
[27] T. Halsey, M. Jensen, L. Kadanoff, I. Procaccia and B. Shraiman, Fractal measures and their singularities: The characterization of strange sets, Phys. Rev. A 34 (1986) 1141–1151; Errata, ibid. 34 (1986) 1601.
[28] B. Hasselblatt, Regularity of the Anosov splitting and of horospheric foliations, Ergodic Theory Dynam. Systems 14 (1994) 645–666.
[29] H. Hu, Box dimensions and topological pressure for some expanding maps, Comm. Math. Phys. 191 (1998) 397–407.
[30] A. Käenmäki, On natural invariant measures on generalised iterated function systems, Ann. Acad. Sci. Fenn. Math. 29 (2004) 419–458.
[31] G. Keller, Equilibrium States in Ergodic Theory, London Mathematical Society Student Texts, Vol. 42 (Cambridge University Press, 1998).
[32] A. Lopes, The dimension spectrum of the maximal measure, SIAM J. Math. Anal. 20 (1989) 1243–1254.
[33] H. McCluskey and A. Manning, Hausdorff dimension of horseshoes, Ergodic Theory Dynam. Systems 3 (1983) 251–260.
[34] A. Mummert, The thermodynamic formalism for almost-additive sequences, Discrete Contin. Dyn. Syst. 16 (2006) 435–454.
[35] J. Palis and M. Viana, On the continuity of the Hausdorff dimension and limit capacity for horseshoes, in Dynamical Systems (Valparaiso, 1986), eds. R. Bamón, R. Labarca and J. Palis, Lect. Notes in Math., Vol. 1331 (Springer, 1988), pp. 150–160.
[36] Ya. Pesin, Dimension Theory in Dynamical Systems: Contemporary Views and Applications, Chicago Lectures in Mathematics (Chicago University Press, 1997).
[37] Ya. Pesin and B. Pitskel', Topological pressure and the variational principle for noncompact sets, Funct. Anal. Appl. 18 (1984) 307–318.
[38] D. Rand, The singularity spectrum f(α) for cookie-cutters, Ergodic Theory Dynam. Systems 9 (1989) 527–541.
[39] D. Ruelle, Statistical mechanics on a compact set with $\mathbb{Z}^\nu$ action satisfying expansiveness and specification, Trans. Amer. Math. Soc. 185 (1973) 237–251.
[40] D. Ruelle, Thermodynamic Formalism, Encyclopedia of Mathematics and Its Applications, Vol. 5 (Addison-Wesley, 1978).
[41] D. Ruelle, Repellers for real analytic maps, Ergodic Theory Dynam. Systems 2 (1982) 99–107.
[42] H. Rugh, On the dimensions of conformal repellers. Randomness and parameter dependency, Ann. of Math. (2) 168 (2008) 695–748.
[43] J. Schmeling, Symbolic dynamics for β-shifts and self-normal numbers, Ergodic Theory Dynam. Systems 17 (1997) 675–694.
[44] K. Simon, The Hausdorff dimension of the Smale–Williams solenoid with different contraction coefficients, Proc. Amer. Math. Soc. 125 (1997) 1221–1228.
[45] K. Simon and B. Solomyak, Hausdorff dimension for horseshoes in $\mathbb{R}^3$, Ergodic Theory Dynam. Systems 19 (1999) 1343–1363.
[46] P. Walters, A variational principle for the pressure of continuous transformations, Amer. J. Math. 97 (1976) 937–971.
[47] P. Walters, An Introduction to Ergodic Theory, Graduate Texts in Mathematics, Vol. 79 (Springer, 1982).
[48] M. Yuri, Zeta functions for certain non-hyperbolic systems and topological Markov approximations, Ergodic Theory Dynam. Systems 18 (1998) 1589–1612.
November 16, J070-S0129055X10004181
2010 15:28 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 22, No. 10 (2010) 1181–1208 c World Scientific Publishing Company DOI: 10.1142/S0129055X10004181
PAULI–FIERZ MODEL WITH KATO-CLASS POTENTIALS AND EXPONENTIAL DECAYS
TAKERU HIDAKA and FUMIO HIROSHIMA∗
Faculty of Mathematics, Kyushu University, Fukuoka 819-0385, Japan
∗[email protected]

Received 29 March 2010
Revised 25 October 2010

A generalized Pauli–Fierz Hamiltonian with Kato-class potential, $K_{PF}$, in nonrelativistic quantum electrodynamics is defined and studied by a path measure. $K_{PF}$ is defined as the self-adjoint generator of a strongly continuous one-parameter symmetric semigroup, and it is shown that its bound states decay spatially exponentially pointwise and that the ground state is unique.

Keywords: Pauli–Fierz model; exponential decay; ground states; functional integrations.

Mathematics Subject Classification 2010: 81Q10, 46N50
1. Introduction

In this paper, we investigate generalized Pauli–Fierz Hamiltonians with Kato-class potentials in nonrelativistic quantum electrodynamics by a path measure. The class covers not only Kato-class potentials but also general cutoff functions of quantized radiation fields. Basic ingredients in this paper are path measures and functional integral representations of semigroups. It has been shown that functional integral representations are useful tools to investigate the spectrum of models in quantum field theory. See, e.g., [4, 9, 15, 18, 20, 22, 23, 28, 29].

The strongly continuous one-parameter semigroup $(e^{-tH_p})_{t \ge 0}$ generated by the Schrödinger operator $H_p = \frac{1}{2}(p - a)^2 + V$ on $L^2(\mathbb{R}^d)$ with some external potential V and vector potential $a = (a_1, \ldots, a_d)$ is expressed by a path measure, which is known as the Feynman–Kac–Itô formula ([25]):
$$(f, e^{-tH_p} g) = \int dx\, \bar{f}(x)\, E^x\!\left[ e^{-\int_0^t V(B_s)\, ds - i \int_0^t a(B_s) \circ dB_s}\, g(B_t) \right], \tag{1.1}$$
where $E^x$ denotes the expectation value with respect to the Wiener measure $P^x$, $(B_t)_{t \ge 0}$ the d-dimensional Brownian motion and $\int_0^t a(B_s) \circ dB_s$ a Stratonovich integral.
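As a purely illustrative check of the representation (1.1), the following Python sketch estimates the expectation on the right-hand side by Monte Carlo simulation in the simplest case d = 1, a = 0, g = 1. The choice of the harmonic potential $V(x) = x^2/2$ is an assumption made here because the expectation then has the classical closed form $(\cosh t)^{-1/2}$ at x = 0; the function name, path count, and step count are likewise hypothetical and not taken from the paper.

```python
import numpy as np

def feynman_kac_estimate(x0, t, potential, g, n_paths=10000, n_steps=200, seed=0):
    """Monte Carlo estimate of E^x[exp(-int_0^t V(B_s) ds) g(B_t)] for d = 1 and a = 0."""
    rng = np.random.default_rng(seed)
    dt = t / n_steps
    # Brownian increments with variance dt, matching the generator (1/2) d^2/dx^2.
    increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    paths = x0 + np.concatenate(
        [np.zeros((n_paths, 1)), np.cumsum(increments, axis=1)], axis=1
    )
    values = potential(paths)
    # Trapezoidal rule for the time integral of V along each path.
    integral = dt * (values[:, 1:-1].sum(axis=1) + 0.5 * (values[:, 0] + values[:, -1]))
    return float(np.mean(np.exp(-integral) * g(paths[:, -1])))

estimate = feynman_kac_estimate(
    x0=0.0, t=1.0, potential=lambda x: 0.5 * x**2, g=lambda x: np.ones_like(x)
)
exact = 1.0 / np.sqrt(np.cosh(1.0))  # closed form for V(x) = x^2/2, g = 1, x = 0
print(estimate, exact)
```

The estimate carries both a statistical error of order $n_{\text{paths}}^{-1/2}$ and a time discretization error, so agreement is only expected to a couple of decimal places. A nonzero vector potential a would require discretizing the Stratonovich integral as well, which is not attempted here.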
Conversely, since a Kato-class potential V satisfies
$$\sup_x E^x\!\left[ e^{-\int_0^t V(B_s)\, ds} \right] < \infty, \quad t \ge 0, \tag{1.2}$$
the family of mappings $S_t$ defined by
$$S_t g(x) = E^x\!\left[ e^{-\int_0^t V(B_s)\, ds - i \int_0^t a(B_s) \circ dB_s}\, g(B_t) \right], \quad t \ge 0, \tag{1.3}$$
turns out to be a strongly continuous one-parameter symmetric semigroup for a Kato-class potential V. The Schrödinger operator with a Kato-class potential V is then defined as the self-adjoint generator of $(S_t)_{t \ge 0}$. See, e.g., [3, 26, 27, 19]. The three-dimensional Kato class includes singular external potentials such as $V(x) = -|x|^{-a}$, 0 ≤ a < 2. We extend this to the Pauli–Fierz Hamiltonian.

The Pauli–Fierz Hamiltonian $H_{PF}$ is a self-adjoint operator defined on the tensor product of Hilbert spaces
$$\mathscr{H} = L^2(\mathbb{R}^d) \otimes L^2(Q), \tag{1.4}$$
where $L^2(Q)$ is an $L^2$-space over a probability space (Q, B, µ) with a Gaussian measure µ, and it describes the Schrödinger representation of the standard Boson Fock space. The Pauli–Fierz Hamiltonian $H_{PF}$ is given by
$$H_{PF} = \frac{1}{2}(p \otimes 1 + \sqrt{\alpha} A)^2 + V \otimes 1 + 1 \otimes H_f(m), \tag{1.5}$$
where α ≥ 0 is a coupling constant, $H_f(m)$ the free field Hamiltonian with a field mass m ≥ 0 and $A = (A_1, \ldots, A_d)$ a quantized radiation field with a cutoff function. See Sec. 2 for further details of notations. Under some conditions on cutoff functions and V it is proven that (1.5) is self-adjoint and $e^{-tH_{PF}}$ is then defined by the spectral resolution. In [14], $(F, e^{-tH_{PF}} G)$ is also presented by a path measure:
$$(F, e^{-tH_{PF}} G) = \int dx\, (F(x), (T_t G)(x))_{L^2(Q)}, \tag{1.6}$$
where $T_t$ is of the form
$$T_t G(x) = E^x\!\left[ e^{-\int_0^t V(B_s)\, ds}\, J_0^* e^{i\sqrt{\alpha} A_E(K_t)} J_t G(B_t) \right] \in L^2(Q) \tag{1.7}$$
for each $x \in \mathbb{R}^d$. Compare with (1.3) and see (2.47) for details. Our construction of generalized Pauli–Fierz Hamiltonians is close to the procedure used to define the Schrödinger operator with Kato-class potentials. We believe, however, that it is worthwhile extending it to the Pauli–Fierz Hamiltonian from the mathematical point of view. It will be shown that the family of operators $T_t : \mathscr{H} \to \mathscr{H}$, t ≥ 0, can also be defined for Kato-class potentials V and general cutoff functions in A, and the generalized Pauli–Fierz Hamiltonian $K_{PF}$ is defined as the self-adjoint generator of $(T_t)_{t \ge 0}$. Of course, under some conditions $K_{PF}$ coincides with $H_{PF}$, but $K_{PF}$ allows more singular potentials V and more general cutoff functions in A.
Cutoff functions of $A_\mu(x)$, µ = 1, 2, 3, of the standard Pauli–Fierz Hamiltonian in three dimensions are of the form
$$e^{-ikx} e_\mu(k, j)\, \hat{\varphi}(k) / \sqrt{|k|} \tag{1.8}$$
with some function $\hat{\varphi}$ and polarization vectors $e(k, j) = (e_1(k, j), e_2(k, j), e_3(k, j))$, j = 1, 2. In [8], the so-called Nelson model on a pseudo-Riemannian manifold is studied by a path measure. Generalized Pauli–Fierz Hamiltonians include a mathematical analogue of the Nelson model on a pseudo-Riemannian manifold, which is unitarily transformed to the Pauli–Fierz Hamiltonian with a variable mass. The cutoff function of the Pauli–Fierz Hamiltonian with a variable mass v is (1.8) with $e^{ikx}$ and $e_\mu(k, j)\hat{\varphi}(k)$ replaced by Ψ(k, x) and $\hat{\phi}_\mu^j(k)$, respectively:
$$\Psi(k, x)\, \hat{\phi}_\mu^j(k) / \sqrt{|k|}. \tag{1.9}$$
Here $\hat{\phi}_\mu^j(k)$ is some function and Ψ(k, x), k ≠ 0, is the unique solution of the Lippmann–Schwinger equation ([21]):
$$\Psi(k, x) = e^{ikx} - \frac{1}{4\pi} \int \frac{e^{i|k||x-y|}\, v(y)}{|x - y|}\, \Psi(k, y)\, dy. \tag{1.10}$$

The main results of the present paper are as follows:
(1) we define the generalized Pauli–Fierz Hamiltonian $K_{PF}$ with Kato-class potentials and generalized cutoff functions, i.e. we prove that $(T_t)_{t \ge 0}$ is a strongly continuous one-parameter symmetric semigroup;
(2) $K_{PF}$ is an extension of $H_{PF}$;
(3) bound states of $K_{PF}$ decay spatially exponentially pointwise and the ground state is unique if it exists.

We explain an outline of (1)–(3) above. First we define the strongly continuous one-parameter symmetric semigroup $(T_t)_{t \ge 0}$ with Kato-class potentials and general cutoff functions by functional integral representations. Then $K_{PF}$ is defined by $T_t = e^{-tK_{PF}}$ for t ≥ 0. We introduce two assumptions, Assumptions 2.1 and 2.12, on cutoff functions of A. The former is stronger than the latter. One advantage of defining the generalized Pauli–Fierz Hamiltonian by a path measure is that we need only a weak condition on cutoff functions (Assumption 2.12) and external potentials. Then for arbitrary α ∈ R, Kato-class potential V and cutoff function $\hat{\rho}_\mu^j(x, k)$ satisfying $\hat{\rho}_\mu^j(x, k) \in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$, we can define $K_{PF}$ as a self-adjoint operator.

Secondly, we can show that
$$\frac{1}{2}(p \otimes 1 + \sqrt{\alpha} A)^2 \,\dot{+}\, V_+ \otimes 1 \,\dot{-}\, V_- \otimes 1 + 1 \otimes H_f(m) \tag{1.11}$$
is well defined for $V_\pm$ such that $0 \le V_+ \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and $0 \le V_-$ is relatively form bounded with respect to $p^2/2$ with a relative bound strictly smaller than one. It is shown that $K_{PF}$ = (1.11) under Assumption 2.1 on cutoff functions.
Finally it is shown that bound states of $K_{\rm PF}$ decay exponentially in space, pointwise. Showing the spatial exponential decay of bound states is an important step in studying the spectrum of Pauli–Fierz type models. In [2, 11, 10] the spatial exponential decay of bound states is shown, but our method is completely different. Since $\varphi_b = e^{tE}e^{-tK_{\rm PF}}\varphi_b$ for $\varphi_b$ such that $K_{\rm PF}\varphi_b = E\varphi_b$, exponential decay of $\varphi_b(x)$ is proven by showing $\sup_x \|\varphi_b(x)\|_{L^2(Q)} < \infty$ and estimating $e^{tE}\mathbb{E}^x[e^{-\int_0^t V(B_s)ds}]$. We conclude that

$$\|\varphi_b(x)\|_{L^2(Q)} \le D e^{-C|x|^\beta} \tag{1.12}$$
for almost every $x \in \mathbb{R}^d$, where the constants $D$ and $C$ are independent of the field mass $m$. Here the exponent $\beta \ge 1$ is determined by the behavior of the external potential $V$. When $\liminf_{|x|\to\infty} V(x) > E$, we can take $\beta = 1$, and when $V(x) = |x|^{2n}$, $\beta = n+1$ is obtained. See Theorem 3.3 for the details. Furthermore, from a standard argument [15] it follows that the transformed operator $e^{i(\pi/2)N} T_t e^{-i(\pi/2)N}$ is a positivity improving semigroup, where $N$ denotes the number operator in $L^2(Q)$. Then we conclude that the ground state of $K_{\rm PF}$ is unique if it exists. This paper is organized as follows: Section 2 is devoted to constructing a strongly continuous symmetric semigroup $(T_t)_{t\ge 0}$ and defining the self-adjoint operator $K_{\rm PF}$. In Sec. 3, we show the pointwise spatial exponential decay of bound states of $K_{\rm PF}$. The paper closes with an Appendix.

2. Generalized Pauli–Fierz Hamiltonian

2.1. Definitions

Let us begin by defining a generalized Pauli–Fierz Hamiltonian by a path measure. We use the notation $\mathbb{E}_P$ for the expectation with respect to a probability measure $P$, i.e. $\int \cdots\, dP = \mathbb{E}_P[\cdots]$. Let $\mathscr{S}_{\rm real} = \mathscr{S}_{\rm real}(\mathbb{R}^d)$ be the set of real-valued Schwartz test functions on $\mathbb{R}^d$. We set $Q = \bigoplus_{j=1}^{d-1}\mathscr{S}_{\rm real}$. There exist a $\sigma$-field $\mathscr{B}$, a probability measure $\mu$ on the measurable space $(Q, \mathscr{B})$ and a Gaussian random variable $\mathscr{A}(\Phi)$ indexed by $\Phi = (\Phi_1, \ldots, \Phi_{d-1}) \in \bigoplus_{j=1}^{d-1} L^2_{\rm real}(\mathbb{R}^d)$ such that

$$\mathbb{E}_\mu[\mathscr{A}(\Phi)] = 0 \tag{2.1}$$

and the covariance is given by

$$\mathbb{E}_\mu[\mathscr{A}(\Phi)\mathscr{A}(\Psi)] = \frac12\sum_{j=1}^{d-1}(\Phi_j, \Psi_j)_{L^2(\mathbb{R}^d)}. \tag{2.2}$$
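Throughout, the scalar product on a Hilbert space $\mathscr{L}$ is denoted by $(F, G)_{\mathscr{L}}$; it is antilinear in $F$ and linear in $G$. We omit $\mathscr{L}$ when no confusion arises. For general $\Phi \in \bigoplus^{d-1} L^2(\mathbb{R}^d)$, $\mathscr{A}(\Phi)$ is defined by

$$\mathscr{A}(\Phi) = \mathscr{A}(\Re\Phi) + i\mathscr{A}(\Im\Phi). \tag{2.3}$$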
Thus $\mathscr{A}(\Phi)$ is linear in $\Phi$ over $\mathbb{C}$. The boson Fock space is defined by $L^2(Q, d\mu) = L^2(Q)$. It is known that the linear hull of

$$\Big\{ {:}\mathscr{A}(\phi_1)\cdots\mathscr{A}(\phi_n){:} \;\Big|\; \phi_j \in \bigoplus^{d-1} L^2(\mathbb{R}^d),\ j = 1, \ldots, n,\ n \ge 0 \Big\} \tag{2.4}$$

is dense in $L^2(Q)$, where ${:}X{:}$ denotes the Wick product of $X$. See the Appendix for the definition of the Wick product. Let us define the free field Hamiltonian $H_f(m)$ on $L^2(Q)$. Define the map $\Gamma(T)\colon L^2(Q) \to L^2(Q)$ by $\Gamma(T)1 = 1$ and

$$\Gamma(T)\,{:}\mathscr{A}(\phi_1)\cdots\mathscr{A}(\phi_n){:} = {:}\mathscr{A}(T\phi_1)\cdots\mathscr{A}(T\phi_n){:} \tag{2.5}$$

for a contraction operator $T$ on $\bigoplus^{d-1} L^2(\mathbb{R}^d)$. Then $\Gamma(T)$ is also a contraction on (2.4) and can be uniquely extended to a contraction operator on the whole space $L^2(Q)$, which is denoted by the same symbol $\Gamma(T)$. We can check that $\Gamma(T)\Gamma(S) = \Gamma(TS)$. Then $\{\Gamma(e^{-ith})\}_{t\in\mathbb{R}}$ for a self-adjoint operator $h$ defines a strongly continuous one-parameter unitary group on $L^2(Q)$. The self-adjoint generator of $\{\Gamma(e^{-ith})\}_{t\in\mathbb{R}}$ is denoted by $d\Gamma(h)$, i.e.

$$\Gamma(e^{-ith}) = e^{-it\,d\Gamma(h)}, \quad t \in \mathbb{R}. \tag{2.6}$$
Let

$$h = \bigoplus^{d-1}\omega(-i\partial), \tag{2.7}$$

where

$$\omega(k) = \sqrt{|k|^2 + m^2}, \quad m \ge 0,\ k \in \mathbb{R}^d. \tag{2.8}$$

Then we set

$$H_f(m) = d\Gamma(h) \tag{2.9}$$
and it is called the free field Hamiltonian on $L^2(Q)$. Let $p = -i\partial = (-i\partial_{x_1}, \ldots, -i\partial_{x_d})$ be the momentum operators in $L^2(\mathbb{R}^d_x)$. We define the Schrödinger operator $H_p$ by

$$H_p = \frac12 p^2 + V, \tag{2.10}$$

where $V$ denotes a real-valued external potential. The conditions on $V$ will be specified later. The zero coupling Hamiltonian is now given by the self-adjoint operator

$$H_p\otimes 1 + 1\otimes H_f(m) \tag{2.11}$$

on the Hilbert space

$$\mathscr{H} = L^2(\mathbb{R}^d_x)\otimes L^2(Q). \tag{2.12}$$

The Pauli–Fierz Hamiltonian $H_{\rm PF}$ is defined by replacing $p\otimes 1$ in the zero coupling Hamiltonian (2.11) with $p\otimes 1 + \sqrt{\alpha}\mathscr{A}$, where $\alpha \ge 0$ is a coupling
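constant and

$$\mathscr{A}_\mu = \int^\oplus_{\mathbb{R}^d}\mathscr{A}_\mu(x)\,dx \tag{2.13}$$

is the so-called quantized radiation field. Here we used the identification $\mathscr{H} \cong \int^\oplus_{\mathbb{R}^d} L^2(Q)\,dx$. We shall define $\mathscr{A}_\mu(x)$ below. Let

$$\rho^j_\mu(\cdot, x) = \big(\hat\phi^j_\mu\Psi(\cdot, x)/\sqrt{\omega}\big)^\vee, \quad j = 1, \ldots, d-1,\ \mu = 1, \ldots, d, \tag{2.14}$$

where $\hat\phi^j_\mu$ is a cutoff function and $\hat X$ (respectively $\check X$) denotes the Fourier transform (respectively inverse Fourier transform) of $X$. Note that $\hat\rho^j_\mu(k, x) = \hat\phi^j_\mu(k)\Psi(k, x)/\sqrt{\omega(k)}$. Examples of cutoff functions are given later. The quantized radiation field is defined by

$$\mathscr{A}_\mu(x) = \mathscr{A}\Big(\bigoplus_{j=1}^{d-1}\rho^j_\mu(x)\Big), \quad \mu = 1, \ldots, d, \tag{2.15}$$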
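for each $x \in \mathbb{R}^d$. Now we arrive at the definition of the Pauli–Fierz Hamiltonian. It is defined by

$$H_{\rm PF} = \frac12(p\otimes 1 + \sqrt{\alpha}\mathscr{A})^2 + V\otimes 1 + 1\otimes H_f(m). \tag{2.16}$$

We omit $\otimes$ for notational convenience in what follows. Then $H_{\rm PF}$ is expressed as

$$H_{\rm PF} = \frac12(p + \sqrt{\alpha}\mathscr{A})^2 + V + H_f(m). \tag{2.17}$$

Assumption 2.1. Suppose that $\hat\rho^j_\mu \in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$ and

$$\sqrt{\omega}\hat\rho^j_\mu,\ \hat\rho^j_\mu,\ \hat\rho^j_\mu/\sqrt{\omega},\ \partial_{x_\mu}\hat\rho^j_\mu,\ \partial_{x_\mu}\hat\rho^j_\mu/\sqrt{\omega} \in L^\infty(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k)). \tag{2.18}$$

Under Assumption 2.1 it follows that

$$\|(p\cdot\mathscr{A} + \mathscr{A}\cdot p)F\| \le c_1\|(p^2 + H_f(m) + 1)F\|, \tag{2.19}$$

$$\|\mathscr{A}\cdot\mathscr{A}F\| \le c_2\|(H_f(m) + 1)F\|. \tag{2.20}$$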
Moreover, $H_{\rm PF}$ is self-adjoint on $D(p^2)\cap D(H_f(m))$ under Assumption 2.1. See [16, 17, 12] for the proof. We give examples of cutoff functions $\rho^j_\mu$.

Example 2.2 (Standard Pauli–Fierz Hamiltonian). The standard Pauli–Fierz Hamiltonian is defined by $H_{\rm PF}$ with the dimension $d = 3$, $m = 0$, and

$$\Psi(k, x) = e^{+ikx}, \quad \hat\phi^j_\mu(k) = \hat\varphi(k)e_\mu(k, j)/\sqrt{\omega},$$

where $e(k, j) = (e_1(k, j), e_2(k, j), e_3(k, j))$, $j = 1, 2$, denote polarization vectors, and $\hat\varphi$ is an ultraviolet cutoff function. Suppose that $\sqrt{\omega}\hat\varphi$, $\hat\varphi/\sqrt{\omega}$, $\hat\varphi/\omega \in L^2(\mathbb{R}^d)$. Then $\rho^j_\mu(k, x) \in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$ and (2.18) is fulfilled.
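As a rough sanity check of the integrability condition in Example 2.2, the following Python sketch evaluates the three squared $L^2$ norms for a hypothetical sharp ultraviolet cutoff $\hat\varphi = 1_{\{|k|\le\Lambda\}}$ in $d = 3$ with $m = 0$, so that $\omega(k) = |k|$. The sharp cutoff, the value of $\Lambda$ and all function names are illustrative assumptions and are not taken from the paper. In polar coordinates $\int_{|k|\le\Lambda}|k|^p\,dk = 4\pi\Lambda^{p+3}/(p+3)$, so under this assumption all three norms are finite without an infrared cutoff.

```python
import math


def radial_integral(power, cutoff, steps=20_000):
    """Midpoint-rule value of the integral of |k|^power over {|k| <= cutoff} in R^3.

    In polar coordinates this equals 4*pi * int_0^cutoff r^(power + 2) dr.
    """
    h = cutoff / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += r ** (power + 2)
    return 4.0 * math.pi * total * h


def cutoff_norms(cutoff):
    """Squared L^2 norms for omega(k) = |k| and the (assumed) sharp cutoff
    phi_hat = indicator of {|k| <= cutoff}."""
    return {
        "sqrt_omega_phi": radial_integral(1, cutoff),        # ||sqrt(omega) phi||^2
        "phi_over_sqrt_omega": radial_integral(-1, cutoff),  # ||phi / sqrt(omega)||^2
        "phi_over_omega": radial_integral(-2, cutoff),       # ||phi / omega||^2
    }


if __name__ == "__main__":
    for name, value in cutoff_norms(2.0).items():
        print(name, value)
```

For $\Lambda = 2$ the closed forms are $\pi\Lambda^4 = 16\pi$, $2\pi\Lambda^2 = 8\pi$ and $4\pi\Lambda = 8\pi$, which the quadrature reproduces up to discretization error.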
Example 2.3 (The Pauli–Fierz Hamiltonian with a Variable Mass). The Pauli–Fierz Hamiltonian with a variable mass $v$ instead of $m$ is studied in [13]. Then $d = 3$, $m = 0$, and $\Psi(k, x)$ is the unique solution to the Lippmann–Schwinger equation ([21]):

$$\Psi(k, x) = e^{+ikx} - \frac{1}{4\pi}\int\frac{e^{i|k||x-y|}\,v(y)}{|x-y|}\,\Psi(k, y)\,dy. \tag{2.21}$$

$\Psi(k, x)$ formally satisfies

$$(-\Delta_x + v(x))\Psi(k, x) = |k|^2\Psi(k, x), \quad k \neq 0.$$

It is established that the Pauli–Fierz Hamiltonian with a variable mass has a ground state for arbitrary values of the coupling constant when $|v(x)| \le C(1 + |x|^2)^{-\beta/2}$, $\beta > 3$, with some constant $C$. Then it is also seen that

$$|\Psi(k, x) - e^{ikx}| \le C(1 + |x|^2)^{-1/2}. \tag{2.22}$$

Since

$$\partial_{x_\mu}\Psi(k, x) = ik_\mu e^{ikx} - \frac{1}{4\pi}\int_{\mathbb{R}^3}\Big(i|k| - \frac{1}{|x-y|}\Big)\frac{(x_\mu - y_\mu)e^{i|k||x-y|}}{|x-y|^2}\,v(y)\Psi(k, y)\,dy, \tag{2.23}$$

it follows that

$$\sup_{k\in D,\ x\in\mathbb{R}^d}|\partial_{x_\mu}\Psi(k, x)| < \infty \tag{2.24}$$

for any compact set $D$ such that $D \not\ni 0$. Let $\operatorname{supp}\hat\phi^j_\mu \subset D$. Then $\rho^j_\mu \in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$ follows from (2.22) and (2.24). If, in addition to the condition $\operatorname{supp}\hat\phi^j_\mu \subset D$, we suppose that $\hat\phi^j_\mu/\sqrt{\omega}$, $\sqrt{\omega}\hat\phi^j_\mu$, $\hat\phi^j_\mu/\omega \in L^2(\mathbb{R}^d_k)$, then (2.18) is fulfilled.

2.2. Feynman–Kac type formulae

Let us prepare the Euclidean version of the quantized radiation field $\mathscr{A}(\Phi)$ to construct a functional integral representation of $e^{-tH_{\rm PF}}$ in the same way as [14]. Let $Q_E = \bigoplus^{d-1}\mathscr{S}_{\rm real}(\mathbb{R}^{d+1})$. There exist a probability measure $\mu_E$ on a measurable space $(Q_E, \mathscr{B}_E)$ and a Gaussian random variable $\mathscr{A}_E(\Phi)$ indexed by $\Phi \in \bigoplus^{d-1}L^2(\mathbb{R}^{d+1})$ such that $\mathbb{E}_{\mu_E}[\mathscr{A}_E(\Phi)] = 0$ and the covariance is given by

$$\mathbb{E}_{\mu_E}[\mathscr{A}_E(\Phi)\mathscr{A}_E(\Psi)] = \frac12\sum_{j=1}^{d-1}(\Phi_j, \Psi_j)_{L^2(\mathbb{R}^{d+1})}.$$
Both $L^2(Q)$ and $L^2(Q_E)$ are connected through the second quantization of the family of isometries $\{j_t\}_{t\in\mathbb{R}}$ between $L^2(\mathbb{R}^d)$ and $L^2(\mathbb{R}^{d+1})$:

$$\widehat{j_t f}(k_0, k) = \frac{e^{-ik_0 t}}{\sqrt{\pi}}\sqrt{\omega(k)/(\omega(k)^2 + |k_0|^2)}\,\hat f(k). \tag{2.25}$$

Define $J_t = \Gamma(\bigoplus^{d-1} j_t)\colon L^2(Q) \to L^2(Q_E)$. From the identity $j_t^* j_s = e^{-|t-s|\omega(-i\partial)}$ it follows that $J_t^* J_s = e^{-|t-s|H_f(m)}$. Let $\mathscr{X} = C([0, \infty); \mathbb{R}^d)$ be the set of continuous paths on $[0, \infty)$. Let $(B_t)_{t\ge 0}$ denote the $d$-dimensional Brownian motion starting at $x \in \mathbb{R}^d$ on $(\mathscr{X}, B(\mathscr{X}), P^x)$ with the Wiener measure $P^x$. That is, $P^x(B_0 = x) = 1$. Let $C_b^n(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$ be the set of strongly $n$-times differentiable $L^2(\mathbb{R}^d)$-valued functions on $\mathbb{R}^d$ such that $\sup_x\|\partial_x^z f(x)\|_{L^2(\mathbb{R}^d)} < \infty$ for $|z| \le n$. For $f_\mu \in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$, $\mu = 1, \ldots, d$, we can define an $L^2(\mathbb{R}^d)$-valued Stratonovich integral:

$$\sum_{\mu=1}^d\int_0^t f_\mu(B_s)\circ dB_s^\mu = \int_0^t f(B_s)\cdot dB_s + \frac12\int_0^t\partial\cdot f(B_s)\,ds, \tag{2.26}$$
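where $f(B_s)\cdot dB_s = \sum_{\mu=1}^d f_\mu(B_s)\,dB_s^\mu$ and $\partial\cdot f(B_s) = \sum_{\mu=1}^d(\partial_{x_\mu}f_\mu)(B_s)$. We also define an $L^2(\mathbb{R}^{d+1})$-valued Stratonovich integral by

$$\sum_{\mu=1}^d\int_0^t j_s f_\mu(B_s)\circ dB_s^\mu = \lim_{n\to\infty}\sum_{j=1}^n\sum_{\mu=1}^d\int_{t(j-1)/n}^{tj/n} j_{t(j-1)/n}f_\mu(B_s)\circ dB_s^\mu, \tag{2.27}$$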
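where $\lim_{n\to\infty}$ is a strong limit in $L^2(\mathscr{X}; L^2(\mathbb{R}^{d+1}))$. By the Itô isometry we have, for $S \le T$, the identity

$$\mathbb{E}^x\Big[\Big(\int_0^S j_s f(B_s)\cdot dB_s,\ \int_0^T j_s g(B_s)\cdot dB_s\Big)_{L^2(\mathbb{R}^{d+1})}\Big] = \sum_{\mu=1}^d\int_0^S\mathbb{E}^x[(f_\mu(B_s), g_\mu(B_s))]\,ds.$$

Hence we have the bound

$$\mathbb{E}^x\Big[\Big\|\sum_{\mu=1}^d\int_0^t j_s f_\mu(B_s)\circ dB_s^\mu\Big\|^2\Big] \le \int_0^t ds\,\mathbb{E}^x\Big[2\sum_{\mu=1}^d\|f_\mu(B_s)\|^2 + \frac12\|\partial\cdot f(B_s)\|^2\Big]. \tag{2.28}$$

The next proposition is fundamental.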
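Proposition 2.4. Let $V$ be bounded. Suppose Assumption 2.1. Then

$$(F, e^{-tH_{\rm PF}}G) = \int dx\,\mathbb{E}^x\big[e^{-\int_0^t V(B_s)ds}(J_0F(B_0), e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_t))_{L^2(Q)}\big], \tag{2.29}$$

where $K_t$ is the $\bigoplus^{d-1}L^2(\mathbb{R}^{d+1})$-valued stochastic integral given by

$$K_t = \bigoplus_{j=1}^{d-1}\sum_{\mu=1}^d\int_0^t j_s\rho^j_\mu(\cdot, B_s)\circ dB_s^\mu. \tag{2.30}$$

Here

$$\sum_{\mu=1}^d\int_0^t j_s\rho^j_\mu(\cdot, B_s)\circ dB_s^\mu = \int_0^t j_s\rho^j(\cdot, B_s)\cdot dB_s + \frac12\int_0^t j_s\,\partial\cdot\rho^j(\cdot, B_s)\,ds.$$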
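Proof. Suppose that $\hat\rho^j_\mu \in C_b^2(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$. Then (2.29) is proven in the same way as [16, Lemma 4.8]. Next we suppose that $\hat\rho^j_\mu(k, x) \in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$. Let $\chi \in C^\infty(\mathbb{R}^d)$ and $\varphi \in C_0^\infty(\mathbb{R}^d)$ be such that

$$\chi(x)\ \begin{cases} = 1, & |x| < 1,\\ < 1, & 1 \le |x| \le 2,\\ = 0, & 2 < |x|,\end{cases}$$

$\varphi \ge 0$ and $\int\varphi(x)\,dx = 1$. Define $\chi_M(x) = \chi(x/M)$ and $\varphi_n(x) = \varphi(x/n)n^{-d/2}$. Let $\hat\rho^j_\mu(k, x)_{M,n} = (\varphi_n * (\hat\rho^j_\mu(k, \cdot)\chi_M(\cdot)))(x)$ and $\hat\rho^j_\mu(k, x)_M = \hat\rho^j_\mu(k, x)\chi_M(x)$. We note that $\hat\rho^j_\mu(k, x)_{M,n} \in C_b^\infty(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$. Since $\hat\rho^j_\mu(k, x)_{M,n} \to \hat\rho^j_\mu(k, x)_M$ in $L^p(\mathbb{R}^d_x, L^2(\mathbb{R}^d_k))$ for $1 \le p < \infty$ as $n \to \infty$, there exists a subsequence $n'$ such that $\hat\rho^j_\mu(k, x)_{M,n'} \to \hat\rho^j_\mu(k, x)_M$ strongly in $L^2(\mathbb{R}^d_k)$ for almost every $x \in \mathbb{R}^d$. Furthermore, $\hat\rho^j_\mu(k, x)_M \to \hat\rho^j_\mu(k, x)$ in $L^2(\mathbb{R}^d_k)$ for each $x \in \mathbb{R}^d$ as $M \to \infty$. Then

$$\lim_{M\to\infty}\lim_{n'\to\infty}\hat\rho^j_\mu(k, x)_{M,n'} = \hat\rho^j_\mu(k, x) \tag{2.31}$$

strongly in $L^2(\mathbb{R}^d_k)$ for almost every $x \in \mathbb{R}^d$. In the same way as above, we can also see that

$$\lim_{M\to\infty}\lim_{n'\to\infty}\partial_x^z\hat\rho^j_\mu(k, x)_{M,n'} = \partial_x^z\hat\rho^j_\mu(k, x) \tag{2.32}$$

strongly in $L^2(\mathbb{R}^d_k)$ for almost every $x \in \mathbb{R}^d$ and $|z| \le 1$. Thus (2.29) holds with $\hat\rho^j_\mu$ replaced by $\hat\rho^j_\mu(k, x)_{M,n'}$. $H_{\rm PF}$ with $\rho^j_\mu$ replaced by $\hat\rho^j_\mu(k, x)_{M,n'}$ is denoted by $H_{\rm PF}(M, n')$. Let $F \in C_0^\infty\otimes D(H_f(m))$. Then we can prove directly that

$$\lim_{M\to\infty}\lim_{n'\to\infty}H_{\rm PF}(M, n')F = H_{\rm PF}F.$$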
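Since $C_0^\infty\otimes D(H_f(m))$ is a core of $H_{\rm PF}(M, n')$ and $H_{\rm PF}$,

$$\lim_{M\to\infty}\lim_{n'\to\infty}e^{-tH_{\rm PF}(M,n')} = e^{-tH_{\rm PF}} \tag{2.33}$$

strongly. Moreover

$$(F, e^{-tH_{\rm PF}(M,n')}G) = \int dx\,\mathbb{E}^x\big[(J_0F(x), e^{i\sqrt{\alpha}\mathscr{A}_E(K_t(M,n'))}e^{-\int_0^t V(B_s)ds}J_tG(B_t))\big], \tag{2.34}$$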
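where $K_t(M, n')$ is defined by $K_t$ with $\rho^j_\mu(k, x)$ replaced by $\rho^j_\mu(k, x)_{M,n'}$. The operator $N = d\Gamma(1)$ is called the number operator in $L^2(Q)$. Let $F \in D(N)$. Then the bound

$$\|\mathscr{A}(\Phi)F\| \le 2\|\Phi\|\,\|(N + 1)^{1/2}F\|$$

is known. From (2.34) and

$$\big|e^{i\sqrt{\alpha}\mathscr{A}_E(K_t(M,n'))} - e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}\big| \le \sqrt{\alpha}\,|\mathscr{A}_E(K_t(M, n') - K_t)|$$

it follows that

$$\begin{aligned}
&|(F, e^{-tH_{\rm PF}(M,n')}G) - (F, e^{-tH_{\rm PF}}G)|\\
&\quad\le \sqrt{\alpha}\int dx\,\mathbb{E}^x\big[(|J_0F(x)|, |\mathscr{A}_E(K_t(M, n') - K_t)|\,e^{-\int_0^t V(B_s)ds}|J_tG(B_t)|)\big]\\
&\quad\le C\sqrt{\alpha}\int dx\,\|(N + 1)^{1/2}F(x)\|\,\mathbb{E}^x\big[\|K_t(M, n') - K_t\|\,\|G(B_t)\|\big]\\
&\quad\le C\sqrt{\alpha}\int dx\,\|(N + 1)^{1/2}F(x)\|\,\big(\mathbb{E}^x[\|K_t(M, n') - K_t\|^2]\big)^{1/2}\big(\mathbb{E}^x[\|G(B_t)\|^2]\big)^{1/2}.
\end{aligned}$$

We estimate $\mathbb{E}^x[\|K_t(M, n') - K_t\|^2]$. By (2.28), we have

$$\mathbb{E}^x[\|K_t(M, n') - K_t\|^2] \le \mathbb{E}^x\Big[\sum_{j=1}^{d-1}\int_0^t\Big(2\sum_{\mu=1}^d\|\delta\rho^j_\mu(B_s)\|^2 + \frac12\|\delta\partial\cdot\rho^j(B_s)\|^2\Big)ds\Big],$$

where $\delta f = f - f_{M,n'}$. By (2.31) and (2.32), we see that

$$\lim_{M\to\infty}\lim_{n'\to\infty}\mathbb{E}^x[\|K_t(M, n') - K_t\|^2] = 0$$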
for each $x \in \mathbb{R}^d$. Then by the Lebesgue dominated convergence theorem we have

$$\lim_{M\to\infty}\lim_{n'\to\infty}\text{r.h.s. (2.34)} = \int dx\,\mathbb{E}^x\big[(J_0F(x), e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}e^{-\int_0^t V(B_s)ds}J_tG(B_t))\big]. \tag{2.35}$$

Then (2.29) also holds for $\rho^j_\mu \in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$. Thus the proposition follows.
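The identity $j_t^* j_s = e^{-|t-s|\omega(-i\partial)}$ used in this subsection reduces, by (2.25), to the scalar integral $\frac{1}{\pi}\int_{-\infty}^{\infty}e^{-ik_0\tau}\,\omega/(\omega^2 + k_0^2)\,dk_0 = e^{-\omega|\tau|}$ with $\tau = s - t$. The Python sketch below checks this numerically with a truncated trapezoid rule. It is only an illustration: the function name, the truncation `k_max` and the step count are assumptions made for the example, and the truncation limits the accuracy to roughly $2\omega/(\pi k_{\max})$ in the worst case $\tau = 0$.

```python
import math


def covariance_kernel(omega, tau, k_max=2000.0, steps=200_000):
    """Approximate (1/pi) * int_{-k_max}^{k_max} e^{-i*k0*tau} * omega / (omega^2 + k0^2) dk0.

    The imaginary part vanishes by symmetry, so only the cosine part is
    integrated (composite trapezoid rule on [0, k_max], then doubled).
    """
    h = k_max / steps

    def integrand(k0):
        return math.cos(k0 * tau) * omega / (omega * omega + k0 * k0)

    total = 0.5 * (integrand(0.0) + integrand(k_max))
    for i in range(1, steps):
        total += integrand(i * h)
    return 2.0 * total * h / math.pi


if __name__ == "__main__":
    for omega, tau in [(1.0, 0.7), (2.0, -1.5)]:
        # Compare the quadrature with the expected exponential decay.
        print(omega, tau, covariance_kernel(omega, tau), math.exp(-omega * abs(tau)))
```

The exponential decay in $|t - s|$ of this kernel is what turns $J_t^* J_s$ into the semigroup $e^{-|t-s|H_f(m)}$ after second quantization.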
2.3. One-parameter symmetric semigroup and generalized Pauli–Fierz Hamiltonian

We can extend the functional integral representation in Proposition 2.4 to more general external potentials and $\rho^j_\mu$.

Definition 2.5 (Kato-Class Potentials). An external potential $V\colon \mathbb{R}^d \to \mathbb{R}$ is called a Kato-class potential if and only if

$$\begin{cases}\displaystyle\sup_{x\in\mathbb{R}^d}\int_{B_1(x)}|\lambda(x - y)V(y)|\,dy < \infty, & d = 1,\\[2mm] \displaystyle\lim_{r\to 0}\sup_{x\in\mathbb{R}^d}\int_{B_r(x)}|\lambda(x - y)V(y)|\,dy = 0, & d \ge 2\end{cases} \tag{2.36}$$

holds, where $B_r(x)$ denotes the closed ball of radius $r$ centered at $x$, and

$$\lambda(x) = \begin{cases}1, & d = 1,\\ -\log|x|, & d = 2,\\ |x|^{2-d}, & d \ge 3.\end{cases} \tag{2.37}$$

We denote the set of Kato-class potentials by $\mathscr{K}_{\rm Kato}$. An equivalent characterization of the Kato class is as follows:

Proposition 2.6. A function $V$ is in $\mathscr{K}_{\rm Kato}$ if and only if

$$\lim_{t\downarrow 0}\sup_{x\in\mathbb{R}^d}\mathbb{E}^x\Big[\int_0^t|V(B_s)|\,ds\Big] = 0. \tag{2.38}$$
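Proof. See, e.g., [1, 6, 26, 27].

Definition 2.7. Let $\mathscr{K}$ be the set of external potentials $V = V_+ - V_-$ such that $0 \le V_+ \in L^1_{\rm loc}(\mathbb{R}^d)$ and $0 \le V_- \in \mathscr{K}_{\rm Kato}$.

Example 2.8. In [1, 26, 27], it is shown that $L^p_u(\mathbb{R}^d) \subset \mathscr{K}_{\rm Kato}$, where

$$L^p_u(\mathbb{R}^d) = \Big\{f\ \Big|\ \sup_x\int_{|x-y|\le 1}|f(y)|^p\,dy < \infty\Big\}$$

with

$$p\ \begin{cases} = 1, & d = 1,\\ > \dfrac{d}{2}, & d \ge 2.\end{cases} \tag{2.39}$$

In particular, if $V \in L^p(\mathbb{R}^d) + L^\infty(\mathbb{R}^d)$ with $p$ as in (2.39), then $V \in \mathscr{K}_{\rm Kato}$.

Example 2.9. Let $d = 3$ and $V(x) = P(x) - \dfrac{a}{|x|^b}$, where $a \ge 0$, $0 \le b < 2$ and $P(x) = \sum_{j=0}^{2n}a_j x^j$ is a polynomial such that $a_{2n} > 0$. Then $V \in \mathscr{K}$.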
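To make the Kato condition (2.36) concrete for the singular part $a/|x|^b$ of Example 2.9, the sketch below evaluates $\int_{|y|\le r}\lambda(y)|y|^{-b}\,dy$ in $d = 3$, where $\lambda(y) = |y|^{-1}$, and compares it with the closed form $4\pi r^{2-b}/(2-b)$. Only the ball centered at the singularity $x = 0$ is treated, and the function names are hypothetical; this is an illustration under those assumptions, not a verification of the supremum over all $x$.

```python
import math


def kato_integral_at_origin(b, r, steps=100_000):
    """Midpoint-rule value of int_{|y| <= r} |y|^(-1) * |y|^(-b) dy in R^3.

    For d = 3 the kernel is lambda(y) = |y|^(2 - d) = |y|^(-1); in polar
    coordinates the integral equals 4*pi * int_0^r rho^(1 - b) d rho.
    """
    if not 0.0 <= b < 2.0:
        raise ValueError("need 0 <= b < 2 for local integrability")
    h = r / steps
    total = sum(((i + 0.5) * h) ** (1.0 - b) for i in range(steps))
    return 4.0 * math.pi * total * h


def closed_form(b, r):
    """Exact value 4*pi*r^(2 - b)/(2 - b) of the same integral."""
    return 4.0 * math.pi * r ** (2.0 - b) / (2.0 - b)


if __name__ == "__main__":
    for r in (0.1, 0.01, 0.001):
        # Coulomb case b = 1: the integral is 4*pi*r and shrinks with r.
        print(r, kato_integral_at_origin(1.0, r), closed_form(1.0, r))
```

Since $2 - b > 0$, the integral tends to zero as $r \to 0$, which is the behavior required in (2.36) for $d \ge 2$; for $b \ge 2$ the radial integral diverges at the origin.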
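Now we shall see that, for $V \in \mathscr{K}$, the random variable $\int_0^t V_\pm(B_s)\,ds$ is finite almost surely with respect to the Wiener measure $P^x$.

Lemma 2.10. Let $0 \le V \in L^1_{\rm loc}(\mathbb{R}^d)$. Then $P^x\big(\int_0^t V(B_s)\,ds < \infty\big) = 1$ for each $x \in \mathbb{R}^d$.

Proof. Since $V \in L^1_{\rm loc}(\mathbb{R}^d)$, we can see that $\mathbb{E}^x\big[\int_0^t 1_N(B_s)V(B_s)\,ds\big] < \infty$ for the indicator function

$$1_N(k) = \begin{cases}1, & |k| \le N,\\ 0, & |k| > N.\end{cases}$$

Then there exists a measurable set $\mathscr{N}_N \subset \mathscr{X}$ such that $P^x(\mathscr{N}_N) = 0$ and $\int_0^t 1_N(B_s(\omega))V(B_s(\omega))\,ds < \infty$ for $\omega \in \mathscr{X}\setminus\mathscr{N}_N$. Set $\mathscr{N} = \bigcup_{N=1}^\infty\mathscr{N}_N$. For $\omega \in \mathscr{X}\setminus\mathscr{N}$ we can see that $\int_0^t 1_N(B_s(\omega))V(B_s(\omega))\,ds < \infty$ for arbitrary $N \ge 1$. Let $\omega \in \mathscr{X}\setminus\mathscr{N}$. There exists $N = N(\omega) \ge 1$ such that $\sup_{0\le s\le t}|B_s(\omega)| < N$. Hence

$$\int_0^t V(B_s(\omega))\,ds = \int_0^t 1_N(B_s(\omega))V(B_s(\omega))\,ds < \infty, \quad \omega \in \mathscr{X}\setminus\mathscr{N}.$$

Thus the lemma follows.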
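When $V_- \in \mathscr{K}_{\rm Kato}$, it can be seen that the exponential $e^{\int_0^t V_-(B_s)ds}$ is integrable with respect to $P^x$, and the supremum of $\mathbb{E}^x[e^{\int_0^t V_-(B_s)ds}]$ in $x$ is finite. We shall check it.

Lemma 2.11. Let $V \in \mathscr{K}_{\rm Kato}$. Then there exist $\beta > 0$ and $\gamma > 0$ such that

$$\sup_x\mathbb{E}^x\big[e^{\int_0^t V(B_s)ds}\big] < \gamma e^{\beta t}. \tag{2.40}$$

Furthermore, when $V \in L^p(\mathbb{R}^d)$ with

$$p\ \begin{cases} = 1, & d = 1,\\ > \dfrac{d}{2}, & d \ge 2,\end{cases}$$

there exists $C$ such that

$$\beta \le C\|V\|_p. \tag{2.41}$$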
Proof. By Proposition 2.6, there exists $t^* > 0$ such that

$$\alpha_t = \sup_x\mathbb{E}^x\Big[\int_0^t V(B_s)\,ds\Big] < 1$$

for all $t \le t^*$, and $\alpha_t \to 0$ as $t \to 0$. It is known as Khasminskii's lemma that

$$\sup_x\mathbb{E}^x\big[e^{\int_0^t V(B_s)ds}\big] < \frac{1}{1 - \alpha_t} \tag{2.42}$$
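for all $t \le t^*$. By means of the Markov property of the Brownian motion we have

$$\mathbb{E}^x\big[e^{\int_0^{2t^*}V(B_s)ds}\big] = \mathbb{E}^x\big[e^{\int_0^{t^*}V(B_s)ds}\,\mathbb{E}^{B_{t^*}}\big[e^{\int_0^{t^*}V(B_s)ds}\big]\big] \le \Big(\frac{1}{1 - \alpha_{t^*}}\Big)^2.$$

Repeating this procedure, we can see that

$$\sup_x\mathbb{E}^x\big[e^{\int_0^t V(B_s)ds}\big] \le \Big(\frac{1}{1 - \alpha_{t^*}}\Big)^{[t/t^*]+1} \tag{2.43}$$

for all $t > 0$, where $[z] = \max\{w \in \mathbb{Z} \mid w \le z\}$. Set $\gamma = \frac{1}{1-\alpha_{t^*}}$ and $\beta = \log\big(\frac{1}{1-\alpha_{t^*}}\big)^{1/t^*}$. Then (2.40) is proven. Next we prove (2.41). Suppose $V \in L^p(\mathbb{R}^d)$. In the case of $d = 1$, we directly see that

$$\alpha_t = \sup_x\int_0^t\mathbb{E}^x[V(B_s)]\,ds \le \int_0^t(2\pi s)^{-1/2}\,ds\,\|V\|_1. \tag{2.44}$$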
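Next, we let $d \ge 2$ and $q$ be such that $\frac1p + \frac1q = 1$. The following estimates are due to [1, Proof of Theorem 4.5]. Let an arbitrary $\epsilon > 0$ be fixed. We have

$$\begin{aligned}
\int_0^t\mathbb{E}^x[|V(B_s)|]\,ds &= \int_0^t\mathbb{E}^x[|V(B_s)|\chi_{|B_s-x|\ge\epsilon}]\,ds + \int_0^t\mathbb{E}^x[|V(B_s)|\chi_{|B_s-x|<\epsilon}]\,ds\\
&\le t\int_{|y|\ge\epsilon}(2\pi t)^{-d/2}e^{-|y|^2/(2t)}|V(x + y)|\,dy + e^t\int_0^\infty ds\,\mathbb{E}^x[e^{-s}|V(B_s)|\chi_{|B_s-x|<\epsilon}].
\end{aligned}$$

It is easy to see that

$$t\int_{|y|\ge\epsilon}(2\pi t)^{-d/2}e^{-|y|^2/(2t)}|V(x + y)|\,dy \le t(2\pi t)^{-d/2}\Big(\int_{|y|\ge\epsilon}e^{-q|y|^2/(2t)}\,dy\Big)^{1/q}\|V\|_p. \tag{2.45}$$

Let $f$ be the integral kernel of $(\frac12 p^2 + 1)^{-1}$. Then we see that

$$\int_0^\infty ds\,\mathbb{E}^x[e^{-s}|V(B_s)|\chi_{|B_s-x|<\epsilon}] \le \int_{|x-y|<\epsilon}f(x - y)|V(y)|\,dy.$$

Since $|f(z)| \le C\lambda(z)$ for $|z| \le \frac12$ with some constant $C$, we have

$$\int_0^\infty ds\,\mathbb{E}^x[e^{-s}|V(B_s)|\chi_{|B_s-x|<\epsilon}] \le C\int_{|x-y|<\epsilon}\lambda(x - y)|V(y)|\,dy$$

and then

$$\int_0^\infty ds\,\mathbb{E}^x[e^{-s}|V(B_s)|\chi_{|B_s-x|<\epsilon}] \le C\Big(\int_{|z|<\epsilon}\lambda(z)^q\,dz\Big)^{1/q}\|V\|_p \tag{2.46}$$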
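by the Hölder inequality. Hence from (2.44)–(2.46), there exists $C_t(\epsilon)$ such that $\alpha_t \le C_t(\epsilon)\|V\|_p$ and $\lim_{t\to 0}C_t(\epsilon) = C\big(\int_{|z|<\epsilon}\lambda(z)^q\,dz\big)^{1/q}$. Then for sufficiently small $T$ and $\epsilon$ we have $\beta \le \log\big(\frac{1}{1 - C_T(\epsilon)\|V\|_p}\big)^{1/T}$, and then there exists $D_T$ such that $\beta \le D_T\|V\|_p$. Then (2.41) follows.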
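The constants in Lemma 2.11 can be made explicit in the simplest case. The Python sketch below takes $d = 1$, uses the right-hand side of (2.44), $\int_0^t(2\pi s)^{-1/2}ds\,\|V\|_1 = \sqrt{2t/\pi}\,\|V\|_1$, as an upper bound for $\alpha_t$, and then forms $\gamma$ and $\beta$ as in the proof. The function names and the sample values are assumptions chosen for illustration. Replacing $\alpha_{t^*}$ by an upper bound smaller than one only enlarges $\gamma$ and $\beta$, so the resulting bound remains valid.

```python
import math


def alpha_bound_1d(t, v_l1_norm):
    """Right-hand side of (2.44): int_0^t (2*pi*s)^(-1/2) ds * ||V||_1 = sqrt(2t/pi) * ||V||_1."""
    return math.sqrt(2.0 * t / math.pi) * v_l1_norm


def khasminskii_constants(t_star, alpha):
    """gamma = 1/(1 - alpha) and beta = (1/t*) * log(1/(1 - alpha)), as in the proof of Lemma 2.11."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("Khasminskii's lemma needs alpha < 1")
    gamma = 1.0 / (1.0 - alpha)
    beta = math.log(gamma) / t_star
    return gamma, beta


def iterated_bound(t, t_star, alpha):
    """The bound (2.43): (1/(1 - alpha))^([t/t*] + 1)."""
    return (1.0 / (1.0 - alpha)) ** (math.floor(t / t_star) + 1)


if __name__ == "__main__":
    t_star = 0.5
    alpha = alpha_bound_1d(t_star, v_l1_norm=1.0)
    gamma, beta = khasminskii_constants(t_star, alpha)
    for t in (0.25, 1.0, 3.0):
        # (2.43) is dominated by gamma * exp(beta * t) because [t/t*] <= t/t*.
        print(t, iterated_bound(t, t_star, alpha), gamma * math.exp(beta * t))
```

The comparison in the loop is the step from (2.43) to (2.40): since $[t/t^*] \le t/t^*$, the iterated bound never exceeds $\gamma e^{\beta t}$.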
The functional integral representation (2.29) introduced in Proposition 2.4 is well defined not only for bounded external potentials and $\rho^j_\mu$ satisfying (2.18) but also for more general external potentials and $\rho^j_\mu$. We can identify the Hilbert space $\mathscr{H}$ with $L^2(\mathbb{R}^d\times Q)$ with the scalar product $(F, G) = \int dx\,(F(x), G(x))_{L^2(Q)}$. The functional integral representation of $(F, e^{-tH_{\rm PF}}G)$ is also given by

$$(F, e^{-tH_{\rm PF}}G) = \int dx\,\big(F(x), \mathbb{E}^x\big[e^{-\int_0^t V(B_s)ds}J_0^* e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_t)\big]\big)_{L^2(Q)}.$$

From this expression we shall define $(T_t)_{t\ge 0}$ by (2.47) below.

Assumption 2.12. We suppose that $V \in \mathscr{K}$ and $\hat\rho^j_\mu = \hat\rho^j_\mu(k, x) \in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$.

Note that under Assumption 2.12, $\mathscr{A}_\mu(x)$ is not relatively bounded with respect to $H_f(m)$ in the case of $m = 0$. Under Assumption 2.12, however, we define the family of linear operators $\{T_t\}_{t\ge 0}$ on $\mathscr{H}$ by

$$T_tF(x) = \mathbb{E}^x\big[e^{-\int_0^t V(B_s)ds}J_0^* e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tF(B_t)\big] \tag{2.47}$$

for all $t \ge 0$. Note that $K_t$ is well defined since $\hat\rho^j_\mu \in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$.

Lemma 2.13. Suppose Assumption 2.12. Then $T_t$ is bounded on $\mathscr{H}$ for $t \ge 0$.

Proof. By the definition of $T_t$ we have

$$\|T_tF\|^2_{\mathscr{H}} \le \int dx\,\mathbb{E}^x\big[e^{-2\int_0^t V(B_s)ds}\big]\,\mathbb{E}^x\big[\|F(B_t)\|^2_{L^2(Q)}\big].$$

Since $V \in \mathscr{K}$, $C = \sup_x\mathbb{E}^x\big[e^{-2\int_0^t V(B_s)ds}\big] < \infty$. Thus $\|T_tF\|^2_{\mathscr{H}} \le C\|F\|^2_{\mathscr{H}}$ follows.
In what follows we shall show that $\{T_t\}_{t\ge 0}$ is a strongly continuous one-parameter symmetric semigroup on $\mathscr{H}$. In order to show it we introduce the second quantization of the Euclidean group $\{u_t, r\}$ on $L^2(\mathbb{R}^{d+1})$, where the time shift operator $u_t\colon L^2(\mathbb{R}^{d+1}) \to L^2(\mathbb{R}^{d+1})$ is defined by $u_tf(x_0, x) = f(x_0 - t, x)$ and the time reflection $r\colon L^2(\mathbb{R}^{d+1}) \to L^2(\mathbb{R}^{d+1})$ by $rf(x_0, x) = f(-x_0, x)$ for $(x_0, x) \in \mathbb{R}\times\mathbb{R}^d$. The second quantizations of $u_t$ and $r$ are denoted by $U_t\colon L^2(Q_E) \to L^2(Q_E)$ and $R\colon L^2(Q_E) \to L^2(Q_E)$, respectively. Note that $r^* = r$,
$rr = r^*r = 1$, $u_t^* = u_{-t}$ and $u_t^*u_t = 1$ and that $U_t$ and $R$ are unitary. The time shift $u_t$, the time reflection $r$ and the isometry $j_t\colon L^2(\mathbb{R}^d) \to L^2(\mathbb{R}^{d+1})$ satisfy the lemma below.

Lemma 2.14. (1) $u_tj_s = j_{s+t}$ and $U_tJ_s = J_{s+t}$. (2) $rj_s = j_{-s}r$ and $RU_s = U_{-s}R$.

Proof. By the definition of $j_s$ we have

$$j_sf(x) = \frac{1}{\sqrt{\pi}(2\pi)^{(d+1)/2}}\int e^{i(k_0(x_0 - s) + k\cdot x)}\sqrt{\frac{\omega(k)}{\omega(k)^2 + |k_0|^2}}\,\hat f(k)\,dk_0\,dk.$$

Then $u_tj_s = j_{s+t}$ follows, and $U_tJ_s = \Gamma(u_t)\Gamma(j_s) = \Gamma(u_tj_s) = \Gamma(j_{s+t}) = J_{s+t}$. (2) is similarly proven.

Lemma 2.15. Suppose Assumption 2.12. Then it follows that $T_tT_s = T_{t+s}$ for all $t, s \ge 0$.

Proof. By the definition of $T_t$, we have

$$T_sT_tF(x) = \mathbb{E}^x\Big[e^{-\int_0^s V(B_r)dr}J_0^* e^{i\sqrt{\alpha}\mathscr{A}_E(K_s)}J_s\,\mathbb{E}^{B_s}\big[e^{-\int_0^t V(B_r)dr}J_0^* e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tF(B_t)\big]\Big]. \tag{2.48}$$
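Let $E_s = J_sJ_s^*$, $s \in \mathbb{R}$, be the family of projections. By the formulae $J_sJ_0^* = J_sJ_s^*U_{-s}^* = E_sU_{-s}^*$ and $J_t = U_{-s}J_{t+s}$, (2.48) is expressed as

$$T_sT_tF(x) = \mathbb{E}^x\Big[e^{-\int_0^s V(B_r)dr}J_0^* e^{i\sqrt{\alpha}\mathscr{A}_E(K_s)}E_s\,\mathbb{E}^{B_s}\big[e^{-\int_0^t V(B_r)dr}U_{-s}^* e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}U_{-s}J_{t+s}F(B_t)\big]\Big].$$

Since $U_s$ is unitary, we have

$$U_{-s}^* e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}U_{-s} = e^{i\sqrt{\alpha}\mathscr{A}_E(u_{-s}^*K_t)} \tag{2.49}$$

as an operator, where the exponent is given by

$$u_{-s}^*K_t = \bigoplus_{j=1}^{d-1}\sum_{\mu=1}^d\int_0^t j_{r+s}\rho^j_\mu(B_r)\circ dB_r^\mu.$$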
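Let $(\mathscr{F}_t)_{t\ge 0}$ be the natural filtration of the Brownian motion $(B_t)_{t\ge 0}$. By the Markov property of the projections $E_t$ ([24]), we can neglect $E_s$ in the expression for $T_sT_tF(x)$ above, and we have

$$T_sT_tF(x) = \mathbb{E}^x\Big[e^{-\int_0^s V(B_r)dr}J_0^* e^{i\sqrt{\alpha}\mathscr{A}_E(K_s)}\,\mathbb{E}^x\big[e^{-\int_s^{s+t}V(B_r)dr}e^{i\sqrt{\alpha}\mathscr{A}_E(K_s^{s+t})}J_{t+s}F(B_{s+t})\,\big|\,\mathscr{F}_s\big]\Big],$$

where $\mathbb{E}^x[\cdots\mid\mathscr{F}_s]$ denotes the conditional expectation with respect to $(\mathscr{F}_t)_{t\ge 0}$ and

$$K_s^{s+t} = \bigoplus_{j=1}^{d-1}\sum_{\mu=1}^d\int_s^{s+t}j_r\rho^j_\mu(B_r)\circ dB_r^\mu.$$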
Hence we obtain that

$$T_sT_tF(x) = \mathbb{E}^x\big[e^{-\int_0^{s+t}V(B_r)dr}J_0^* e^{i\sqrt{\alpha}\mathscr{A}_E(K_{s+t})}J_{s+t}F(B_{s+t})\big] = T_{s+t}F(x)$$
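and the lemma is proven.

Next we check the symmetry of $T_t$.

Lemma 2.16. Suppose Assumption 2.12. Then it follows that $T_t^* = T_t$ for $t \ge 0$.

Proof. By the functional integral representation and the unitarity of the time reflection $R$ on $L^2(Q_E)$, we have

$$\begin{aligned}
(F, T_tG) &= \int dx\,\mathbb{E}^x\big[e^{-\int_0^t V(B_s)ds}(RJ_0F(B_0), Re^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}RRJ_tG(B_t))\big]\\
&= \int dx\,\mathbb{E}^x\big[e^{-\int_0^t V(B_s)ds}(J_0F(B_0), e^{i\sqrt{\alpha}\mathscr{A}_E(rK_t)}J_{-t}G(B_t))\big],
\end{aligned}$$

where the exponent is $rK_t = \bigoplus_{j=1}^{d-1}\sum_{\mu=1}^d\int_0^t j_{-s}\rho^j_\mu(B_s)\circ dB_s^\mu$. By means of the time shift $U_t$ we also have

$$\begin{aligned}
(F, T_tG) &= \int dx\,\mathbb{E}^x\big[e^{-\int_0^t V(B_s)ds}(U_tJ_0F(B_0), U_te^{i\sqrt{\alpha}\mathscr{A}_E(rK_t)}U_t^*U_tJ_{-t}G(B_t))\big]\\
&= \int dx\,\mathbb{E}^x\big[e^{-\int_0^t V(B_s)ds}(J_tF(B_0), e^{i\sqrt{\alpha}\mathscr{A}_E(u_trK_t)}J_0G(B_t))\big],
\end{aligned}$$

where $u_trK_t = \bigoplus_{j=1}^{d-1}\sum_{\mu=1}^d\int_0^t j_{t-s}\rho^j_\mu(B_s)\circ dB_s^\mu$. Finally we set $\tilde B_s = B_{t-s} - B_t$, which equals $B_s$ in law. Then we have

$$(F, T_tG) = \int dx\,\mathbb{E}^0\big[e^{-\int_0^t V(x+\tilde B_s)ds}(J_tF(x), e^{i\sqrt{\alpha}\mathscr{A}_E(\widetilde{u_trK_t})}J_0G(x + \tilde B_t))\big], \tag{2.50}$$

where

$$\widetilde{u_trK_t} = \bigoplus_{j=1}^{d-1}\sum_{\mu=1}^d\int_0^t j_{t-s}\rho^j_\mu(x + \tilde B_s)\circ d\tilde B_s^\mu = \lim_{n\to\infty}\bigoplus_{j=1}^{d-1}\sum_{i=1}^n\Delta_j(i),$$

$\lim_{n\to\infty}$ is in the strong sense of $L^2(\mathscr{X}; L^2(\mathbb{R}^{d+1}))$ and

$$\Delta_j(i) = \sum_{\mu=1}^d\int_{t(i-1)/n}^{ti/n}j_{t-t(i-1)/n}\rho^j_\mu(x + \tilde B_s)\circ d\tilde B_s^\mu.$$

Then exchanging $\int dx$ and $\mathbb{E}^0$ in (2.50) we have

$$(F, T_tG) = \lim_{n\to\infty}\mathbb{E}^0\Big[\int dx\,e^{-\int_0^t V(x+\tilde B_s)ds}\big(J_tF(x), e^{i\sqrt{\alpha}\mathscr{A}_E(\bigoplus_{j=1}^{d-1}\sum_{i=1}^n\Delta_j(i))}J_0G(x - B_t)\big)\Big]
$$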
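and changing the variable $x - B_t$ to $x$ in $\int dx$ we have

$$(F, T_tG) = \lim_{n\to\infty}\mathbb{E}^0\Big[\int dx\,e^{-\int_0^t V(x+B_s)ds}\big(J_tF(x + B_t), e^{i\sqrt{\alpha}\mathscr{A}_E(\bigoplus_{j=1}^{d-1}\sum_{i=1}^n\tilde\Delta_j(i))}J_0G(x)\big)\Big],$$

where

$$\tilde\Delta_j(i) = -\sum_{\mu=1}^d\int_{t(i-1)/n}^{ti/n}j_{ti/n}\rho^j_\mu(x + B_s)\circ dB_s^\mu$$

and

$$\lim_{n\to\infty}\sum_{i=1}^n\tilde\Delta_j(i) = -\sum_{\mu=1}^d\int_0^t j_s\rho^j_\mu(x + B_s)\circ dB_s^\mu.$$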
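We thus can finally see that

$$(F, T_tG) = \int dx\,\mathbb{E}^x\big[e^{-\int_0^t V(B_s)ds}(J_tF(B_t), e^{-i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_0G(B_0))\big] = (T_tF, G).$$

Then the lemma follows.

Lemma 2.17. Suppose Assumption 2.12. Then $T_t$ is strongly continuous in $t \ge 0$ on $\mathscr{H}$.

Proof. Since $T_t$ is uniformly bounded and the semigroup property $T_tT_s = T_{t+s}$ holds, it is enough to show the weak continuity at $t = 0$. By the Lebesgue dominated convergence theorem it suffices to show that

$$\mathbb{E}^x\big[(J_0F(B_0), e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_t))\big] \to \mathbb{E}^x\big[(J_0F(B_0), J_0G(B_0))\big]$$

as $t \to 0$ for each $x \in \mathbb{R}^d$. We decompose

$$\begin{aligned}
&\mathbb{E}^x\big[(J_0F(B_0), e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_t))\big] - \mathbb{E}^x\big[(J_0F(B_0), J_0G(B_0))\big]\\
&\quad= \mathbb{E}^x\big[(J_0F(B_0), e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_t))\big] - \mathbb{E}^x\big[(J_0F(B_0), e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_0))\big]\\
&\qquad+ \mathbb{E}^x\big[(J_0F(B_0), e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_0))\big] - \mathbb{E}^x\big[(J_0F(B_0), e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_0G(B_0))\big]\\
&\qquad+ \mathbb{E}^x\big[(J_0F(B_0), e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_0G(B_0))\big] - \mathbb{E}^x\big[(J_0F(B_0), J_0G(B_0))\big].
\end{aligned}$$

The first and second lines of the right-hand side above converge to zero as $t \to 0$, since $B_t$ and $J_t$ are continuous in $t$. We will check that the third line also goes to zero. We have

$$\big|\mathbb{E}^x\big[(J_0F(B_0), e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_0G(B_0))\big] - \mathbb{E}^x\big[(J_0F(B_0), J_0G(B_0))\big]\big| \le \big(\mathbb{E}^x[\|\sqrt{\alpha}\mathscr{A}_E(K_t)J_0F(B_0)\|^2]\big)^{1/2}\big(\mathbb{E}^x[\|G(B_0)\|^2]\big)^{1/2}.$$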
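We have the bound

$$\mathbb{E}^x[\|\mathscr{A}_E(K_t)J_0F(B_0)\|^2] \le \|\sqrt{N + 1}F(x)\|^2\,\mathbb{E}^0[\|K_t(x)\|^2_{L^2(\mathbb{R}^{d+1})}],$$

where $K_t(x) = \bigoplus_{j=1}^{d-1}\sum_{\mu=1}^d\int_0^t j_s\rho^j_\mu(x + B_s)\circ dB_s^\mu$. We have

$$\mathbb{E}^0[\|K_t(x)\|^2_{L^2(\mathbb{R}^{d+1})}] \le \sum_{j=1}^{d-1}\int_0^t ds\,\mathbb{E}^x\Big[2\sum_{\mu=1}^d\|\rho^j_\mu(B_s)\|^2 + \frac12\|\partial\cdot\rho^j(B_s)\|^2\Big]. \tag{2.51}$$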
Then $\lim_{t\to 0}\mathbb{E}^x[\|\mathscr{A}_E(K_t)J_0F(B_0)\|^2] = 0$ follows and the proof is complete.

Theorem 2.18. Suppose Assumption 2.12. Let $V \in \mathscr{K}$. Then $\{T_t\}_{t\ge 0}$ is a strongly continuous one-parameter symmetric semigroup. In particular, there exists a self-adjoint operator $K_{\rm PF}$ bounded below such that

$$e^{-tK_{\rm PF}} = T_t, \quad t \ge 0, \tag{2.52}$$

and

$$e^{-tK_{\rm PF}}F(x) = \mathbb{E}^x\big[e^{-\int_0^t V(B_s)ds}J_0^* e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tF(B_t)\big]. \tag{2.53}$$
Proof. This follows from Lemmas 2.15–2.17.

Definition 2.19 (Generalized Pauli–Fierz Hamiltonians). Suppose Assumption 2.12. We define the generalized Pauli–Fierz Hamiltonian with an external potential $V \in \mathscr{K}$ to be the self-adjoint operator $K_{\rm PF}$ in (2.52).

Corollary 2.20. Suppose Assumption 2.12. Let us identify $\mathscr{H}$ with $L^2(\mathbb{R}^d\times Q)$. Then under this identification $e^{i(\pi/2)N}e^{-tK_{\rm PF}}e^{-i(\pi/2)N}$, $t > 0$, is positivity improving. In particular the ground state of $K_{\rm PF}$ is unique if it exists.

Proof. By (2.53), we can see that

$$(F, e^{i(\pi/2)N}e^{-tK_{\rm PF}}e^{-i(\pi/2)N}G) = \int dx\,\mathbb{E}^x\big[(J_0F(x), e^{-\int_0^t V(B_s)ds}e^{i(\pi/2)N}e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}e^{-i(\pi/2)N}J_tG(B_t))\big].$$

Since in [15] it is shown that $e^{i(\pi/2)N}e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}e^{-i(\pi/2)N}$ is positivity improving, $(F, e^{i(\pi/2)N}e^{-tK_{\rm PF}}e^{-i(\pi/2)N}G) > 0$ for all $0 \le F, G \in \mathscr{H}$ with $F \neq 0$ and $G \neq 0$. Then the corollary follows.

Let $L^p(\mathbb{R}^d; L^2(Q)) = \{f\colon \mathbb{R}^d \to L^2(Q) \mid \int\|f(x)\|^p_{L^2(Q)}\,dx < \infty\}$ and set the $L^p$ norm as $\|F\|_p = \big(\int\|F(x)\|^p_{L^2(Q)}\,dx\big)^{1/p}$.

Corollary 2.21. Suppose Assumption 2.12. Then $e^{-tK_{\rm PF}}$ can be extended to a bounded operator from $L^p(\mathbb{R}^d; L^2(Q))$ to itself for $1 \le p \le \infty$.
Proof. Let $p \neq \infty$, $p \neq 1$ and $\frac1p + \frac1q = 1$. Then we have

$$\|e^{-tK_{\rm PF}}F(x)\|^p_{L^2(Q)} \le \big(\mathbb{E}^x\big[e^{-\int_0^t V(B_s)ds}\|F(B_t)\|\big]\big)^p \le \big(\mathbb{E}^x\big[e^{-q\int_0^t V(B_s)ds}\big]\big)^{p/q}\,\mathbb{E}^x\big[\|F(B_t)\|^p_{L^2(Q)}\big].$$

Thus we have

$$\int\|e^{-tK_{\rm PF}}F(x)\|^p_{L^2(Q)}\,dx \le C\int\|F(x)\|^p_{L^2(Q)}\,dx.$$

In the case of $p = \infty$ and $p = 1$, the proof is similar.

2.4. Quadratic form and $K_{\rm PF}$

By the functional integral representation, we have the so-called diamagnetic inequality

$$|(F, e^{-tH_{\rm PF}}G)| \le (|F|, e^{-t(H_p + H_f(m))}|G|). \tag{2.54}$$
By means of the diamagnetic inequality, we can see that when $|V|^{1/2}$ is relatively bounded with respect to $(p^2/2)^{1/2}$ with a relative bound $a \ge 0$, it is also relatively bounded with respect to $(\frac12(p + \sqrt{\alpha}\mathscr{A})^2 + H_f(m))^{1/2}$ with a relative bound $\le a$. See [14]. Let $V = V_+ - V_-$ be such that $V_+ \in L^1_{\rm loc}(\mathbb{R}^d)$ and $V_-$ is infinitesimally small with respect to $p^2/2$ in the sense of forms. Then under Assumption 2.1 we can define the self-adjoint operator

$$H_{\rm PF} = \frac12(p + \sqrt{\alpha}\mathscr{A})^2 + H_f(m)\,\dot{+}\,V_+\,\dot{-}\,V_- \tag{2.55}$$

by the quadratic form sum $\dot\pm$.

Theorem 2.22. Let $V \in \mathscr{K}$ and suppose Assumption 2.1. Then $K_{\rm PF} = H_{\rm PF}$, where $H_{\rm PF}$ is defined by (2.55).

Proof. The functional integral representation of $e^{-tH_{\rm PF}}$ for (2.55) can be given by the procedure below [25, 14]. Let

$$V_{n,m}(x) = \begin{cases}n, & V(x) \ge n,\\ V(x), & m < V(x) < n,\\ m, & V(x) \le m.\end{cases}$$

Then $V_{n,m} \in L^\infty(\mathbb{R}^d)$, and the functional integral representation of $e^{-tH_{\rm PF}}$ with external potential $V_{n,m}$, which is denoted by $e^{-tH_{\rm PF}(n,m)}$, is given by Proposition 2.4. By the monotone convergence theorem for forms, we can see that $\lim_{n\to\infty}\lim_{m\to\infty}e^{-tH_{\rm PF}(n,m)} = e^{-tH_{\rm PF}}$, where $H_{\rm PF}$ is defined by (2.55). On the
other hand, the functional integral representation of $I = (F, e^{-tH_{\rm PF}(n,m)}G) = \Re I + i\Im I$ is divided into positive and negative parts as $I = (\Re I)_+ - (\Re I)_- + i(\Im I)_+ - i(\Im I)_-$, and each term converges as $n, m \to \infty$ by the monotone convergence theorem for integrals. Then the functional integral representation is given by

$$\begin{aligned}
(F, e^{-tH_{\rm PF}}G) &= \lim_{n,m\to\infty}\int dx\,\mathbb{E}^x\big[(J_0F(B_0), e^{-\int_0^t V_{n,+}(B_s)ds}e^{+\int_0^t V_{m,-}(B_s)ds}e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_t))\big]\\
&= \int dx\,\mathbb{E}^x\big[(J_0F(B_0), e^{-\int_0^t V(B_s)ds}e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_tG(B_t))\big].
\end{aligned} \tag{2.56}$$

Since $V \in \mathscr{K}$, we see that $V_+ \in L^1_{\rm loc}(\mathbb{R}^d)$ and $V_-$ is infinitesimally small with respect to $p^2/2$ in the sense of forms [6, Theorem 1.12]. Moreover $(F, e^{-tK_{\rm PF}}G)$ equals the right-hand side of (2.56). Then we conclude that $e^{-tH_{\rm PF}} = e^{-tK_{\rm PF}}$. Thus the theorem follows.

3. Pointwise Spatial Exponential Decays

In this section, we show the spatial exponential decay of bound states of $K_{\rm PF}$. Let $\varphi_b$ be a bound state of $K_{\rm PF}$ associated with the eigenvalue $E$:

$$K_{\rm PF}\varphi_b = E\varphi_b. \tag{3.1}$$
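Assumption 3.1. We say that $V = W + U \in \mathscr{E}$ if and only if $W \in L^1_{\rm loc}(\mathbb{R}^d)$, $\inf_x W(x) > -\infty$ and $0 > U \in L^p(\mathbb{R}^d)$ for some

$$p\ \begin{cases} = 1, & d = 1,\\ > \dfrac{d}{2}, & d \ge 2.\end{cases}$$

Let $W + U \in \mathscr{E}$ and set $W = W_+ - W_-$, where $W_\pm \ge 0$ is given by $W_+(x) = \max\{0, W(x)\}$ and $W_-(x) = \max\{0, -W(x)\}$. Since $U \in L^p(\mathbb{R}^d) \subset \mathscr{K}_{\rm Kato}$, $W_- \in L^\infty \subset \mathscr{K}_{\rm Kato}$ and $W_+ \in L^1_{\rm loc}(\mathbb{R}^d)$, we note that $\mathscr{E} \subset \mathscr{K}$. We set

$$W_\infty = \inf_x W(x). \tag{3.2}$$

A fundamental estimate to show the spatial exponential decay of bound states is the lemma below.

Lemma 3.2. Let $V = W + U \in \mathscr{E}$. Suppose that $\hat\rho^j_\mu \in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$. Then for arbitrary $t, a > 0$ and each $0 < \alpha < 1/2$, there exist constants $D_1$, $D_2$ and $D_3$ such that

$$\|\varphi_b(x)\|_{L^2(Q)} \le D_1e^{D_2\|U\|_pt}e^{Et}\big(D_3e^{-\frac{\alpha}{4}\frac{a^2}{t}}e^{-tW_\infty} + e^{-tW_a(x)}\big)\|\varphi_b\|_{\mathscr{H}}, \tag{3.3}$$

where $W_a(x) = \inf\{W(y) \mid |x - y| < a\}$.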
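Proof. It is a slight modification of [5]. Since $\varphi_b = e^{tE}e^{-tK_{\rm PF}}\varphi_b$, we have

$$\varphi_b(x) = e^{tE}\,\mathbb{E}^x\big[J_0^* e^{-\int_0^t V(B_s)ds}e^{i\sqrt{\alpha}\mathscr{A}_E(K_t)}J_t\varphi_b(B_t)\big]. \tag{3.4}$$

Hence for almost every $x$ it follows that

$$\|\varphi_b(x)\|_{L^2(Q)} \le e^{tE}\,\mathbb{E}^x\big[e^{-\int_0^t V(B_s)ds}\|\varphi_b(B_t)\|_{L^2(Q)}\big]. \tag{3.5}$$

By this, we have

$$\|\varphi_b(x)\|_{L^2(Q)} \le e^{tE}\big(\mathbb{E}^x\big[e^{-4\int_0^t W(B_s)ds}\big]\big)^{1/4}\big(\mathbb{E}^x\big[e^{-4\int_0^t U(B_s)ds}\big]\big)^{1/4}\|\varphi_b\|_{\mathscr{H}},$$

where we used the Schwarz inequality and

$$\mathbb{E}^x\big[\|\varphi_b(B_t)\|^2_{L^2(Q)}\big] = \int(2\pi t)^{-d/2}e^{-|y|^2/2t}\|\varphi_b(x + y)\|^2_{L^2(Q)}\,dy = \int e^{-\pi|z|^2}\|\varphi_b(x + \sqrt{2\pi t}z)\|^2_{L^2(Q)}\,dz \le \|\varphi_b\|^2_{\mathscr{H}}.$$

Let $A = \{\omega \in \mathscr{X} \mid \sup_{0\le s\le t}|B_s(\omega)| > a\}$. Then it follows from a martingale inequality that

$$\mathbb{E}^0[1_A] \le 2P^0(|B_t| \ge a) = 2(2\pi)^{-d/2}S_{d-1}\int_{a/\sqrt{t}}^\infty e^{-r^2/2}r^{d-1}\,dr \le \xi_\alpha e^{-\alpha a^2/t}$$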
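with some $\xi_\alpha$ for each $0 < \alpha < 1/2$. Thus it follows that

$$\begin{aligned}
\mathbb{E}^x\big[e^{-4\int_0^t W(B_s)ds}\big] &= \mathbb{E}^0\big[1_Ae^{-4\int_0^t W(B_s+x)ds}\big] + \mathbb{E}^0\big[1_{A^c}e^{-4\int_0^t W(B_s+x)ds}\big]\\
&\le e^{-4tW_\infty}\,\mathbb{E}^0[1_A] + e^{-4tW_a(x)}\\
&\le \xi_\alpha e^{-\alpha a^2/t}e^{-4tW_\infty} + e^{-4tW_a(x)}.
\end{aligned}$$

Next we estimate $\mathbb{E}^x\big[e^{-4\int_0^t U(B_s)ds}\big]$. Since $U$ is in the Kato class, there exist constants $D_1$ and $D_2$ such that $\mathbb{E}^x\big[e^{-4\int_0^t U(B_s)ds}\big] \le D_1e^{D_2\|U\|_pt}$ by Lemma 2.11. Setting $D_3 = \xi_\alpha^{1/4}$, we obtain the lemma by the inequality $(a + b)^{1/4} \le a^{1/4} + b^{1/4}$ for $a, b \ge 0$.

For $V = W + U \in \mathscr{E}$, we define

$$\Sigma = \liminf_{|x|\to\infty}V(x). \tag{3.6}$$

Since $U \in L^p(\mathbb{R}^d)$, $\liminf_{|x|\to\infty}U(x) = 0$ and hence

$$\Sigma = \liminf_{|x|\to\infty}W(x). \tag{3.7}$$

Moreover $\Sigma \ge W_\infty$ holds.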
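Theorem 3.3. Suppose that $V = W + U \in \mathscr{E}$ and $\hat\rho^j_\mu \in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$.

Confining Case 1. Suppose that $W(x) \ge \gamma|x|^{2n}$ outside a compact set $K$ for some $n > 0$ and some $\gamma > 0$. Let $0 < \alpha < 1/2$. Then there exists a constant $C_1$ such that

$$\|\varphi_b(x)\|_{L^2(Q)} \le C_1\exp\Big(-\frac{\alpha c}{16}|x|^{n+1}\Big)\|\varphi_b\|_{\mathscr{H}}, \tag{3.8}$$

where $c = \inf_{x\in\mathbb{R}^d\setminus K}W_{|x|/2}(x)/|x|^{2n}$.

Confining Case 2. Suppose that $\lim_{|x|\to\infty}W(x) = \infty$. Then there exist constants $C$ and $\delta$ such that

$$\|\varphi_b(x)\|_{L^2(Q)} \le C\exp(-\delta|x|)\|\varphi_b\|_{\mathscr{H}}. \tag{3.9}$$

Non-Confining Case. Suppose that $\Sigma > E$ and $\Sigma > W_\infty$. Let $0 < \beta < 1$. Then there exists a constant $C_2$ such that

$$\|\varphi_b(x)\|_{L^2(Q)} \le C_2\exp\Big(-\frac{\beta}{8\sqrt{2}}\frac{(\Sigma - E)}{\sqrt{\Sigma - W_\infty}}|x|\Big)\|\varphi_b\|_{\mathscr{H}}. \tag{3.10}$$

Proof. Since $\sup_x\|\varphi_b(x)\|_{L^2(Q)} < \infty$, it is enough to show all the statements for sufficiently large $|x|$.

Confining Case 1. Note that $W_{|x|/2}(x) \ge c|x|^{2n}$ for $x \in \mathbb{R}^d\setminus K$. Then we have the following bounds for $x \in \mathbb{R}^d\setminus K$:

$$|x|\,W_{|x|/2}(x)^{1/2} \ge c|x|^{n+1}, \tag{3.11}$$

$$|x|\,W_{|x|/2}(x)^{-1/2} \le c|x|^{1-n}. \tag{3.12}$$

Inserting $t = t(x) = W_{|x|/2}(x)^{-1/2}|x|$ and $a = a(x) = \frac{|x|}{2}$ in (3.3), we have

$$\|\varphi_b(x)\| \le e^{-\frac{\alpha}{16}c|x|^{n+1}}D_1e^{(D_2\|U\|_p + E)c|x|^{1-n}}\big(D_3e^{c|x|^{1-n}|W_\infty|} + e^{-(1-\frac{\alpha}{16})c|x|^{n+1}}\big)\|\varphi_b\|_{\mathscr{H}} \tag{3.13}$$

for $x \in \mathbb{R}^d\setminus K$. Then (3.8) follows.

Non-Confining Case. Rewrite formula (3.3) as

$$\|\varphi_b(x)\| \le D_1e^{D_2\|U\|_pt}\big(D_3e^{-\frac{\alpha}{4}\frac{a^2}{t}}e^{-t(W_\infty - E)} + e^{-t(W_a(x) - E)}\big)\|\varphi_b\|_{\mathscr{H}}. \tag{3.14}$$

Then, keeping both $\Sigma = \liminf_{|x|\to\infty}(-W_-(x))$ and $\Sigma > W_\infty$ unchanged, it is possible to choose the decomposition $V = W + U \in \mathscr{E}$ such that $\|U\|_p \le (\Sigma - E)/2$, since $\liminf_{|x|\to\infty}U(x) = 0$. Inserting $t = t(x) = \epsilon|x|$ and $a = a(x) = \frac{|x|}{2}$ in (3.14),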
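we have

$$\begin{aligned}
\|\varphi_b(x)\| &\le D_1e^{\epsilon\|U\|_p|x|}\big(D_3e^{-\frac{\alpha}{16\epsilon}|x|}e^{-\epsilon|x|(W_\infty - E)} + e^{-\epsilon|x|(W_{|x|/2}(x) - E)}\big)\|\varphi_b\|_{\mathscr{H}}\\
&\le D_1\big(D_3e^{-(\frac{\alpha}{16\epsilon} + \epsilon(W_\infty - E) - \frac{\epsilon}{2}(\Sigma - E))|x|} + e^{-\epsilon((W_{|x|/2}(x) - E) - \frac12(\Sigma - E))|x|}\big)\|\varphi_b\|_{\mathscr{H}}.
\end{aligned}$$

Choosing $\epsilon = \dfrac{\sqrt{\alpha/16}}{\sqrt{\Sigma - W_\infty}}$, the exponent in the first term above turns out to be

$$\frac{\alpha}{16\epsilon} + \epsilon(W_\infty - E) - \frac{\epsilon}{2}(\Sigma - E) = \frac{\epsilon}{2}(\Sigma - E).$$

Moreover we see that $\liminf_{|x|\to\infty}W_{|x|/2}(x) = \Sigma$, and obtain

$$\|\varphi_b(x)\|_{L^2(Q)} \le C_2e^{-\frac{\epsilon}{2}(\Sigma - E)|x|}\|\varphi_b\|_{\mathscr{H}}$$

for sufficiently large $|x|$. Then (3.10) follows.

Confining Case 2. Finally, we prove Confining Case 2. In this case for arbitrary $c > 0$ there exists $N$ such that $W_{|x|/2}(x) \ge c$ for all $|x| > N$. Inserting $t = t(x) = \epsilon|x|$ and $a = a(x) = \frac{|x|}{2}$ in (3.3), we obtain that

$$\begin{aligned}
\|\varphi_b(x)\| &\le D_1e^{\epsilon\|U\|_p|x|}\big(D_3e^{-\frac{\alpha}{16\epsilon}|x|}e^{-\epsilon|x|(W_\infty - E)} + e^{-\epsilon|x|(W_{|x|/2}(x) - E)}\big)\|\varphi_b\|_{\mathscr{H}}\\
&\le D_1\big(D_3e^{-(\frac{\alpha}{16\epsilon} - \epsilon\|U\|_p + \epsilon(W_\infty - E))|x|} + e^{-\epsilon|x|(c - E - \|U\|_p)}\big)\|\varphi_b\|_{\mathscr{H}}
\end{aligned}$$

for $|x| > N$. Choosing sufficiently large $c$ and sufficiently small $\epsilon$ such that

$$\frac{\alpha}{16\epsilon} - \epsilon\|U\|_p + \epsilon(W_\infty - E) > 0, \qquad c - E - \|U\|_p > 0,$$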
we have ϕb (x) ≤ C e−δ |x| for sufficiently large |x|. Then (3.9) follows. We give several remarks on Theorem 3.3. Independence of Bose Mass m. Suppose that ω(k) = |k|2 + m2 . Let ϕb be a normalized ground state of KPF : ϕb H = 1, and Em = inf σ(KPF ). It is shown that there exist also constants C1 and C2 such that
ϕb (x) L2 (Q) ≤ C1 e−C2 |x| , n
n ≥ 1,
by Theorem 3.3. Since the ground state energy Em is decreasing in m, we can take C1 and C2 independent of m < M with some M . This fact is nontrivial and useful to show the existence of ground states of the Pauli–Fierz model with m = 0. This is used in, e.g., [13].
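The balancing of exponents in the non-confining case of the proof can be checked numerically. The sketch below assumes that the small parameter in the proof (written `eps` here) enters through $t = \varepsilon|x|$ and is chosen as $\sqrt{\alpha/16}/\sqrt{\Sigma - W_\infty}$; the sample values of $\alpha$, $\Sigma$, $W_\infty$ and $E$ are hypothetical and only need to satisfy $\Sigma > E$, $\Sigma > W_\infty$ and $0 < \alpha < 1/2$.

```python
import math

def balanced_epsilon(alpha, sigma, w_inf):
    """Small parameter sqrt(alpha/16) / sqrt(Sigma - W_inf) used in the proof."""
    return math.sqrt(alpha / 16.0) / math.sqrt(sigma - w_inf)

def first_exponent(eps, alpha, sigma, w_inf, energy):
    """alpha/(16 eps) + eps (W_inf - E) - (eps/2)(Sigma - E)."""
    return (alpha / (16.0 * eps)
            + eps * (w_inf - energy)
            - 0.5 * eps * (sigma - energy))

def decay_rate(alpha, sigma, w_inf, energy):
    """Resulting exponential rate (eps/2)(Sigma - E)."""
    return 0.5 * balanced_epsilon(alpha, sigma, w_inf) * (sigma - energy)

# Hypothetical sample values (not taken from the paper).
alpha, sigma, w_inf, energy = 0.3, 0.0, -2.0, -1.0
eps = balanced_epsilon(alpha, sigma, w_inf)
rate = decay_rate(alpha, sigma, w_inf, energy)

# Rate appearing in (3.10), with beta = sqrt(2 alpha), which lies in (0, 1).
beta = math.sqrt(2.0 * alpha)
rate_from_3_10 = beta * (sigma - energy) / (8.0 * math.sqrt(2.0) * math.sqrt(sigma - w_inf))
```

With this choice the two expressions for the rate coincide, which suggests that the constant $\beta$ in (3.10) corresponds to $\sqrt{2\alpha}$.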
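Condition $W_\infty < \Sigma$. When $\inf_x V(x) < \Sigma$, it is possible to decompose $V = W + U \in \mathcal{E}$ such that $W_\infty < \Sigma$. In fact, for arbitrary $\varepsilon > 0$, there exists $y \in \mathbb{R}^d$ such that
\[ V(y) < \inf_x V(x) + \varepsilon. \]
Suppose that $\inf_x V(x) + \varepsilon < \Sigma$. Let $O_y \subset \mathbb{R}^d$ be a neighborhood of $y$. Then define
\[ u(x) = \begin{cases} U(x), & x \in O_y, \\ 0, & x \notin O_y. \end{cases} \]
Let $\tilde W = W + u$ and $\tilde U = U - u$. This yields that $V = \tilde W + \tilde U \in \mathcal{E}$ and $\tilde W_\infty < \inf_x V(x) + \varepsilon < \Sigma$.

Threshold. The threshold is defined by
\[ \Sigma_\infty = \lim_{R\to\infty}\ \inf_{F \in D_R,\ \|F\| = 1} (F, H_{\rm PF}F), \]
where $D_R = \{F \in D(H_{\rm PF}) \mid F(x) = 0,\ |x| < R\}$. We note that $\Sigma_\infty \ge \Sigma$, and $\Sigma = \Sigma_\infty = \infty$ in confining cases. The bound given in [10] is $\|e^{+C|\cdot|}1_{(-\infty,\lambda]}(H_{\rm PF})\|_{\mathcal{H}} < \infty$, where $C^2 + \lambda < \Sigma_\infty$. From this the bound
\[ \int dx\, e^{+\delta|x|}\|\phi_b(x)\|^2_{L^2(Q)} \le C\|\phi_b\|_{\mathcal{H}} \tag{3.15} \]
follows, where $\delta < \sqrt{\Sigma_\infty - E}$. Theorem 3.3, however, gives pointwise bounds:
\[ \|\phi_b(x)\|_{L^2(Q)} \le C_1 \exp(-C_2|x|^\beta)\,\|\phi_b\|_{\mathcal{H}}, \qquad \beta \ge 1. \tag{3.16} \]
In particular, the superexponential decay $\|\phi_b(x)\|_{L^2(Q)} \le C_1 e^{-C_2|x|^{n+1}}\|\phi_b\|_{\mathcal{H}}$ is shown for the case of polynomially increasing potentials (Confining Case 1), while in non-confining cases we show that in (3.16), $\beta = 1$ and
\[ C_2 < \frac{\Sigma - E}{8\sqrt{2}\sqrt{\Sigma - W_\infty}}. \tag{3.17} \]

We give examples of external potentials.

Example 3.4 (Confining Potentials). Let $V = V_+ - V_-$ be such that $V_+ \in L^p_{\rm loc}(\mathbb{R}^d)$ and $V_- \in L^p(\mathbb{R}^d)$, where $p = 1$ for $d = 1$ and $p > d/2$ for $d \ge 2$. In this case $V \in \mathcal{E}$.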
Example 3.5 (Coulomb Potentials). Suppose that Assumption 2.1 holds. Then $H_{\rm PF} = K_{\rm PF}$. Let $V = -\alpha Z/|x|$ be the Coulomb potential. Then $\inf\sigma(H_p) = -\alpha Z/2$. We have $(\phi\otimes 1, H_{\rm PF}\,\phi\otimes 1)_{\mathcal{H}} = (\phi, (H_p + V_{\rm eff})\phi)_{L^2(\mathbb{R}^d)}$ for $\phi \in D(\frac{1}{2}p^2)$, where
\[ V_{\rm eff}(x) = \frac{\alpha}{2}\sum_{j=1}^{d-1}\sum_{\mu,\nu=1}^{d} (\rho^j_\mu(x), \rho^j_\nu(x))_{L^2(\mathbb{R}^d)}. \]
Let $V_\infty = \sup_x \bigl|\sum_{j=1}^{d-1}\sum_{\mu,\nu=1}^{d} (\rho^j_\mu(x), \rho^j_\nu(x))_{L^2(\mathbb{R}^d)}\bigr|$. Thus
\[ \inf\sigma(H_{\rm PF}) \le -\frac{\alpha}{2}(Z - V_\infty). \]
When $Z > V_\infty$, $\inf\sigma(H_{\rm PF}) < \lim_{|x|\to\infty} V(x) = 0$ follows for all values of the coupling constant $\alpha$. Then ground states of $H_{\rm PF}$ decay as $C_1 e^{-C_2|x|}$ pointwise for all values of the coupling constant.

Acknowledgments

FH acknowledges support of Grant-in-Aid for Science Research (B) 20340032 from JSPS and Grant-in-Aid for Challenging Exploratory Research 22654018 from JSPS.

Appendix

In this appendix, we show the unitary equivalence between $H_{\rm PF}$ and the Pauli–Fierz Hamiltonian defined on $L^2(\mathbb{R}^d)\otimes\mathcal{F}$, where $\mathcal{F} = \bigoplus_{n=0}^\infty \otimes_s^n\bigl(\oplus^{d-1}L^2(\mathbb{R}^d)\bigr)$ is the Boson Fock space over $\oplus^{d-1}L^2(\mathbb{R}^d)$. Let $\Omega = \{1, 0, 0, \ldots\} \in \mathcal{F}$ be the Fock vacuum. The creation operator and the annihilation operator in $\mathcal{F}$ are denoted by $a^*(f)$ and $a(f)$, respectively, where $f = (f_1, \ldots, f_{d-1}) \in \oplus^{d-1}L^2(\mathbb{R}^d)$. They satisfy the canonical commutation relations:
\[ [a(f), a^*(g)] = \sum_{j=1}^{d-1}(\bar f_j, g_j)_{L^2(\mathbb{R}^d)}, \qquad [a^*(f), a^*(g)] = 0 = [a(f), a(g)]. \]
The field operator in $\mathcal{F}$ is given by
\[ A(\hat\phi) = \frac{1}{\sqrt{2}}\bigl(a^*(\hat\phi) + a(\tilde{\hat\phi})\bigr), \]
where $\tilde{\hat\phi}(k) = \hat\phi(-k)$. The quantized radiation field is defined by $A_\mu = \int^\oplus_{\mathbb{R}^d} A_\mu(x)\,dx$ under the identification $L^2(\mathbb{R}^d)\otimes\mathcal{F} \cong L^2(\mathbb{R}^d; \mathcal{F})$ and $A_\mu(x) = A(\hat\rho_\mu(x))$, where a
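cutoff function is given by $\hat\rho_\mu(x) = \hat\rho_\mu(k, x) = \oplus_{j=1}^{d-1}\hat\phi^j_\mu(k)\Psi(k, x)/\sqrt{\omega(k)}$. Finally the free field Hamiltonian is defined by
\[ d\Gamma(\omega) = \bigoplus_{k=0}^\infty \sum_{i=1}^k \underbrace{1\otimes\cdots\otimes\overset{i}{\omega}\otimes\cdots\otimes 1}_{k}. \tag{A.1} \]
Then the Pauli–Fierz Hamiltonian in $L^2(\mathbb{R}^d)\otimes\mathcal{F}$ is given by
\[ \hat H_{\rm PF} = \frac{1}{2}(p\otimes 1 + \sqrt{\alpha}A)^2 + V\otimes 1 + 1\otimes d\Gamma(\omega). \tag{A.2} \]
Suppose that $V$ is relatively bounded with respect to $\frac{1}{2}p^2$ with a relative bound strictly smaller than one, and that $\hat\rho^j_\mu \in C_b^1(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k))$ and
\[ \sqrt{\omega}\hat\rho^j_\mu,\ \hat\rho^j_\mu,\ \hat\rho^j_\mu/\sqrt{\omega},\ \partial_{x_\mu}\hat\rho^j_\mu,\ \partial_{x_\mu}\hat\rho^j_\mu/\sqrt{\omega} \in L^\infty(\mathbb{R}^d_x; L^2(\mathbb{R}^d_k)). \tag{A.3} \]
See Assumption 2.1. Then $\hat H_{\rm PF}$ is self-adjoint on $D(p^2\otimes 1)\cap D(1\otimes d\Gamma(\omega))$.

Now let us see the relationship between $L^2(Q)$ and $\mathcal{F}$. Let $U: \mathcal{F} \to L^2(Q)$ be defined by $U\Omega = 1$ and
\[ U\,{:}A(\hat\phi_1)\cdots A(\hat\phi_n){:}\,\Omega = {:}\mathscr{A}(\phi_1)\cdots\mathscr{A}(\phi_n){:}, \]
where the Wick product on the left-hand side is defined by moving all the creation operators to the left and all the annihilation operators to the right without using any commutation relations, while the Wick product on the right-hand side is defined recursively by ${:}\mathscr{A}(\phi){:} = \mathscr{A}(\phi)$ and
\[ {:}\mathscr{A}(\phi)\prod_{j=1}^n \mathscr{A}(\phi_j){:} = \mathscr{A}(\phi)\,{:}\prod_{j=1}^n \mathscr{A}(\phi_j){:} - \frac{1}{2}\sum_{k=1}^n (f_k, f)\,{:}\prod_{j\ne k}\mathscr{A}(\phi_j){:}. \]
The operator $U$ can be extended to a unitary operator from $\mathcal{F}$ to $L^2(Q)$, and it also implements $U\,d\Gamma(\omega)\,U^{-1} = H_f(m)$. Then under (A.3) it follows that $(1\otimes U)$ maps $D(\frac{1}{2}p^2\otimes 1)\cap D(1\otimes d\Gamma(\omega))$ to $D(\frac{1}{2}p^2\otimes 1)\cap D(1\otimes H_f(m))$ and
\[ (1\otimes U)\hat H_{\rm PF}(1\otimes U^{-1}) = H_{\rm PF}. \tag{A.4} \]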
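The recursive definition of the Wick product in the Appendix can be illustrated in the simplest situation where all test functions coincide. The sketch below makes that assumption, treats the field as a single real variable $x$, and writes `c` for the pairing that appears in the recursion; under these assumptions (which are illustrative and not taken from the paper) the recursion reads $W_{n+1}(x) = xW_n(x) - \frac{n}{2}c\,W_{n-1}(x)$ and produces rescaled Hermite polynomials.

```python
import numpy as np
from numpy.polynomial import polynomial as P
from numpy.polynomial import hermite_e as He

def wick_powers(n_max, c):
    """Coefficient arrays (lowest degree first) of W_0, ..., W_{n_max}, where
    W_0 = 1, W_1 = x and W_{n+1}(x) = x W_n(x) - (n/2) c W_{n-1}(x)."""
    W = [np.array([1.0]), np.array([0.0, 1.0])]
    for n in range(1, n_max):
        # Multiply W_n by x, then subtract the contraction term.
        W.append(P.polysub(P.polymulx(W[n]), 0.5 * n * c * W[n - 1]))
    return W[: n_max + 1]

# Hypothetical value of the pairing; W[2] is then x^2 - c/2.
c = 1.0
W = wick_powers(5, c)
```

In this one-variable setting $W_n(x) = s^n \mathrm{He}_n(x/s)$ with $s^2 = c/2$, where $\mathrm{He}_n$ denotes the probabilists' Hermite polynomial.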
References

[1] M. Aizenman and B. Simon, Brownian motion and Harnack's inequality for Schrödinger operators, Comm. Pure Appl. Math. 35 (1982) 209–270.
[2] V. Bach, J. Fröhlich and I. M. Sigal, Spectral analysis for systems of atoms and molecules coupled to the quantized radiation field, Comm. Math. Phys. 207 (1999) 249–290.
[3] K. Broderix, D. Hundertmark and H. Leschke, Continuity properties of Schrödinger semigroups with magnetic fields, Rev. Math. Phys. 12 (2000) 181–225.
[4] V. Betz, F. Hiroshima, J. Lőrinczi, R. A. Minlos and H. Spohn, Ground state properties of the Nelson Hamiltonian: a Gibbs measure-based approach, Rev. Math. Phys. 14 (2002) 173–198.
[5] R. Carmona, Pointwise bounds for Schrödinger operators, Comm. Math. Phys. 62 (1978) 97–106.
[6] H. L. Cycon, R. G. Froese, W. Kirsch and B. Simon, Schrödinger Operators (Springer-Verlag, 1987).
[7] C. Fefferman, J. Fröhlich and G. M. Graf, Stability of ultraviolet-cutoff quantum electrodynamics with non-relativistic matter, Comm. Math. Phys. 190 (1997) 309–330.
[8] C. Gérard, F. Hiroshima, A. Panati and A. Suzuki, Infrared divergence of a scalar quantum field model on a pseudo Riemannian manifold, Interdiscip. Inform. Sci. 15 (2009) 399–421.
[9] M. Gubinelli, Gibbs measures for self-interacting Wiener paths, Mark. Proc. Rel. Fields 12 (2006) 747–766.
[10] M. Griesemer, Exponential decay and ionization thresholds in non-relativistic quantum electrodynamics, J. Funct. Anal. 210 (2004) 321–340.
[11] M. Griesemer, E. Lieb and M. Loss, Ground states in non-relativistic quantum electrodynamics, Invent. Math. 145 (2001) 557–595.
[12] D. Hasler and I. Herbst, On the self-adjointness and domain of Pauli–Fierz type Hamiltonians, Rev. Math. Phys. 20 (2008) 787–800.
[13] T. Hidaka, On the existence of ground states for the Pauli–Fierz model with a variable mass, preprint (2010).
[14] F. Hiroshima, Functional integral representation of a model in quantum electrodynamics, Rev. Math. Phys. 9 (1997) 489–530.
[15] F. Hiroshima, Ground states of a model in nonrelativistic quantum electrodynamics II, J. Math. Phys. 41 (2000) 661–674.
[16] F. Hiroshima, Essential self-adjointness of translation invariant quantum field models for arbitrary coupling constants, Comm. Math. Phys. 211 (2000) 585–613.
[17] F. Hiroshima, Self-adjointness of the Pauli–Fierz Hamiltonian for arbitrary values of coupling constants, Ann. Henri Poincaré 3 (2002) 171–201.
[18] F. Hiroshima, Fiber Hamiltonians in nonrelativistic quantum electrodynamics, J. Funct. Anal. 252 (2007) 314–355.
[19] F. Hiroshima, T. Ichinose and J. Lőrinczi, Path integral representation for Schrödinger operator with Bernstein function of the Laplacian, preprint (2009).
[20] F. Hiroshima and J. Lőrinczi, Functional integral representations of the Pauli–Fierz model with spin 1/2, J. Funct. Anal. 254 (2008) 2127–2185.
[21] T. Ikebe, Eigenfunction expansion associated with the Schrödinger operators and their applications to scattering theory, Arch. Ration. Mech. Anal. 5 (1960) 1–34.
[22] J. Lőrinczi, R. A. Minlos and H. Spohn, The infrared behaviour in Nelson's model of a quantum particle coupled to a massless scalar field, Ann. Henri Poincaré 3 (2002) 1–28.
[23] E. Nelson, Schrödinger particles interacting with a quantized scalar field, in Proc. Conf. Analysis in Function Space, eds. W. T. Martin and I. Segal (MIT Press, Cambridge, 1964), p. 87.
[24] B. Simon, The $P(\phi)_2$ Euclidean (Quantum) Field Theory (Princeton Univ. Press, 1974).
[25] B. Simon, Functional Integral Representation and Quantum Physics (Academic Press, 1979).
[26] B. Simon, Schrödinger semigroups, Bull. Amer. Math. Soc. 7 (1982) 447–526.
[27] B. Simon, Kato's inequality and the comparison of semigroups, J. Funct. Anal. 32 (1979) 97–101.
[28] H. Spohn, Ground state of quantum particle coupled to a scalar boson field, Lett. Math. Phys. 44 (1998) 9–16.
[29] H. Spohn, Dynamics of Charged Particles and their Radiation Field (Cambridge University Press, 2004).
November 16, J070-S0129055X10004193
2010 15:28 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 22, No. 10 (2010) 1209–1240 c World Scientific Publishing Company DOI: 10.1142/S0129055X10004193
EIGENFUNCTION EXPANSIONS AND SPACETIME ESTIMATES FOR GENERATORS IN DIVERGENCE-FORM
MATANIA BEN-ARTZI
Institute of Mathematics, Hebrew University, Jerusalem 91904, Israel
[email protected]

Received 28 January 2010
Revised 10 August 2010

Let $H = -\sum_{j,k=1}^n \frac{\partial}{\partial x_j} a_{j,k}(x) \frac{\partial}{\partial x_k}$ be a formally self-adjoint (elliptic) operator in $L^2(\mathbb{R}^n)$, $n \ge 2$. The real coefficients $a_{j,k}(x) = a_{k,j}(x)$ are assumed to be bounded and to coincide with $-\Delta$ outside of a ball. The paper deals with two topics: (i) An eigenfunction expansion theorem, proving in particular that $H$ is unitarily equivalent to $-\Delta$, and (ii) Global spacetime estimates for the associated inhomogeneous wave equation, proved under suitable (“nontrapping”) additional assumptions on the coefficients. The main tool used here is a Limiting Absorption Principle (LAP) in the framework of weighted Sobolev spaces, which holds also at the threshold.

Keywords: Divergence-type operator; limiting absorption principle; eigenfunction expansion; spacetime estimates.

Mathematics Subject Classification 2010: 35J15, 35L15, 47F05
1. Introduction

Let $H = -\sum_{j,k=1}^n \partial_j a_{j,k}(x)\partial_k$, where $a_{j,k}(x) = a_{k,j}(x)$, be a formally self-adjoint operator in $L^2(\mathbb{R}^n)$, $n \ge 2$. The notations $\partial_j = \frac{\partial}{\partial x_j}$ and $\partial_t = \frac{\partial}{\partial t}$ are used throughout the paper. We assume that the real measurable matrix function $a(x) = \{a_{j,k}(x)\}_{1\le j,k\le n}$ satisfies, with some positive constants $a_1 > a_0 > 0$, $\Lambda_0 > 0$,
\[ a_0 I \le a(x) \le a_1 I, \qquad x \in \mathbb{R}^n, \tag{1.1} \]
\[ a(x) = I \qquad \text{for } |x| > \Lambda_0. \tag{1.2} \]
In what follows, we shall use the notation $H = -\nabla\cdot a(x)\nabla$. We retain the notation $H$ for the self-adjoint (Friedrichs) extension associated with the form $(a(x)\nabla\varphi, \nabla\psi)$, where $(\ ,\ )$ is the scalar product in $L^2(\mathbb{R}^n)$. When $a(x) \equiv I$ we set $H = H_0 = -\Delta$. Operators of this type appear in geometry (Laplacian on noncompact Riemannian manifolds) as well as in physics, typically when physical parameters vary in space (such as the acoustic propagator in a medium with variable speed of sound).
Under our assumptions (1.1) and (1.2), it follows that σ(H), the spectrum of H, is the half-axis [0, ∞), and is entirely continuous. In particular, the equality (Hu, u) = (a(x)∇u, ∇u) shows that H has no eigenvalue at zero. In addition, if the coefficient matrix a(x) is smooth, the absence of singular continuous spectrum follows from the classical work of Mourre ([58]). However, it seems that there is no proof in the literature establishing the absolute continuity of the spectrum in our case of non-smooth (and even discontinuous) coefficients. This fact is implied by our Theorem A stated in Sec. 3 below. The “threshold” z = 0 plays a special role in this setting, as we shall see later. The mere fact that both H and H0 are spectrally absolutely continuous over [0, ∞) does not imply that they are “identical”, namely, in the functional analytic setting, that they are “unitarily equivalent”. Thus one question that arises is: Question 1. Are the operators H and H0 unitarily equivalent, under the above assumptions on the coefficients? We next recall the definition of the wave operators related to H, H0 [50, Chap. X]. Consider the family of unitary operators W (t) = exp(itH) exp(−itH0 ),
$-\infty < t < \infty$.
The strong limits
\[ W_\pm(H, H_0) = \operatorname*{s-lim}_{t\to\pm\infty} W(t), \tag{1.3} \]
if they exist, are called the wave operators (relating $H$, $H_0$). These operators play an important role in scattering theory. They are clearly isometries. If the range of $W_+$ is equal to the absolutely continuous subspace of $H$ (which here is $L^2(\mathbb{R}^n)$ itself), we say that it is complete, with a similar definition for $W_-$. If either one is complete, then it is unitary (in the case at hand) and provides a unitary equivalence between $H$ and $H_0$. A second question that arises therefore is:

Question 2. Do the wave operators exist and, if so, are they complete?

As noted above, a positive answer to this question entails a positive answer to the first question.

Another aspect related to the spectral theory of $H$ is its associated eigenfunction expansion. When available, it serves as an analytic tool which is sharper than the abstract spectral theorem. In the case of $H_0$, the Fourier transform
\[ \mathcal{F}g(\xi) = \hat g(\xi) = (2\pi)^{-\frac{n}{2}}\int_{\mathbb{R}^n} g(x)e^{-i\xi x}\,dx, \tag{1.4} \]
serves to express $g(x)$ as
\[ g(x) = (2\pi)^{-\frac{n}{2}}\int_{\mathbb{R}^n} \hat g(\xi)e^{i\xi x}\,d\xi, \tag{1.5} \]
which can be viewed as an “expansion” of $g$ in terms of the “generalized eigenfunctions” (or “modes”) $\exp(i\xi x)$, associated with the eigenvalues $|\xi|^2$. Furthermore, the operator $\mathcal{F}$ is unitary and $\mathcal{F}H_0\mathcal{F}^{-1}$ is just multiplication by $|\xi|^2$ in Fourier space. Such (“diagonalizing”) expansions have been used extensively in quantum mechanics (for example, the Airy transform associated with the Stark Hamiltonian). It is therefore natural to pose the following question:

Question 3. Can one associate a similar “eigenfunction expansion” with the operator $H$? More specifically, can one replace the exponentials $\exp(i\xi x)$ by some approximating generalized eigenfunctions (“distorted plane waves”) so that the resulting transform remains unitary and diagonalizes the operator?

As a final topic in this paper, we turn back to the evolution (unitary) group $\exp(-itH)u_0$, which solves the Schrödinger equation
\[ i\partial_t u = Hu, \qquad u(0) = u_0. \]
The last 30 years have seen very intensive research on the global (spacetime) properties of these solutions, known as “Strichartz and smoothing” estimates. Instead of treating the Schrödinger equation we choose here to address the generalized wave equation,
\[ \partial_t^2 u = -Hu + f, \tag{1.6} \]
subject to initial conditions $u(0) = u_0$, $\partial_t u(0) = v_0$. The conservation of energy for this equation (in the homogeneous case, $f = 0$) is given by
\[ \int_{\mathbb{R}^n}\bigl[|H^\beta\partial_t u(x,t)|^2 + |H^{\beta+\frac{1}{2}}u(x,t)|^2\bigr]dx = \int_{\mathbb{R}^n}\bigl[|H^\beta v_0(x)|^2 + |H^{\beta+\frac{1}{2}}u_0(x)|^2\bigr]dx, \tag{1.7} \]
for any $\beta \in \mathbb{R}$, and any $t \in \mathbb{R}$. In this context, the dispersive character of the equation means that the solution “escapes” from any bounded set, as $|t| \to \infty$, in some average sense. We would like to estimate this decay in terms of the initial energy norm, namely, the right-hand side of (1.7). We therefore ask:

Question 4. Can one establish global $L^2$ spacetime estimates for solutions of (1.6) in terms of the initial energy norm?

In this paper, we answer affirmatively the first three questions. As for Question 4, we provide such estimates by imposing restrictive hypotheses on the coefficient matrix. The precise statements, as well as discussions of the relevant bibliography for each topic, are given in Sec. 3.
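The energy identity (1.7) has a transparent finite-dimensional analogue, which may help fix ideas. The sketch below is an illustration only: it replaces $H$ by a symmetric positive definite matrix obtained from a one-dimensional finite-difference discretization with Dirichlet boundary values (the function `divergence_form_matrix` and the sample coefficients are hypothetical choices, not taken from the paper), solves $\partial_t^2 u = -Hu$ exactly through the eigen-decomposition of the matrix, and evaluates the energy at several times.

```python
import numpy as np

def divergence_form_matrix(a):
    """Discrete Dirichlet analogue of -d/dx a(x) d/dx.

    `a` holds positive coefficients on the len(a) edges of a 1D grid
    with len(a) - 1 interior nodes.
    """
    n = len(a) - 1
    D = np.zeros((n + 1, n))
    for i in range(n):
        D[i, i] = 1.0        # node i enters the difference on edge i with +1
        D[i + 1, i] = -1.0   # and the difference on edge i + 1 with -1
    return D.T @ np.diag(a) @ D   # symmetric positive definite when a > 0

def wave_energy(H, u0, v0, t, beta=0.0):
    """|H^beta u_t|^2 + |H^(beta+1/2) u|^2 at time t for u'' = -H u."""
    lam, Q = np.linalg.eigh(H)      # H = Q diag(lam) Q^T with lam > 0
    w = np.sqrt(lam)
    a0 = Q.T @ u0                   # modal coefficients of the initial data
    b0 = Q.T @ v0
    u_hat = a0 * np.cos(w * t) + b0 * np.sin(w * t) / w
    ut_hat = -a0 * w * np.sin(w * t) + b0 * np.cos(w * t)
    return float(np.sum(lam ** (2 * beta) * ut_hat ** 2)
                 + np.sum(lam ** (2 * beta + 1) * u_hat ** 2))

# Hypothetical coefficients, equal to 1 near both ends of the grid.
a = np.array([1.0, 1.0, 3.0, 0.5, 2.0, 1.0, 1.0])
H = divergence_form_matrix(a)
rng = np.random.default_rng(0)
u0, v0 = rng.standard_normal(6), rng.standard_normal(6)
energies = [wave_energy(H, u0, v0, t, beta=0.5) for t in (0.0, 1.7, 10.0)]
```

In this matrix model the energy is conserved mode by mode, so nothing escapes; the decay discussed in the text is a genuinely infinite-dimensional (continuous-spectrum) effect.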
The main technical tool used here consists of a close study of the properties of the resolvent $R(z)$ as $z$ approaches the real axis. To be more specific, we introduce the general notion of the “continuity up to the spectrum” of the resolvent.

Definition 1.1. Let $[\alpha, \beta] \subseteq \mathbb{R}$. We say that $H$ satisfies the “Limiting Absorption Principle” (LAP) in $[\alpha, \beta]$ if $R(z)$, $z \in \mathbb{C}^\pm$, can be extended continuously to $\operatorname{Im} z = 0$, $\operatorname{Re} z \in [\alpha, \beta]$, in a suitable operator topology. In this case we denote the limiting values by $R^\pm(\lambda)$, $\alpha \le \lambda \le \beta$.

The precise specification of the operator topology in the above definition is left open. Typically, it will be the uniform operator topology associated with weighted-$L^2$ or Sobolev spaces, which are introduced in Sec. 2.

Note that the limiting values $R^-(\lambda)$ are, generally speaking, different from $R^+(\lambda)$. In fact, one has (formally) the “Stieltjes formula”
\[ A(\lambda) = \frac{1}{2\pi i}\bigl(R^+(\lambda) - R^-(\lambda)\bigr) = \frac{d}{d\lambda}E(\lambda), \]
where $E(\lambda)$ is the spectral family associated with $H$. The operator $A(\lambda)$, $\lambda \in [0, \infty)$, known in the physical literature as the “density of states” ([28, Chap. XIII]), plays an important role in our study.

The paper is organized as follows. Basic functional spaces and notations are introduced in Sec. 2. Our results are stated as Theorems A–C in Sec. 3. Around each of the three theorems, we discuss some background material as well as relevant references. Obviously, the large amount of existing literature excludes any possibility of compiling an exhaustive bibliography. Section 4 is devoted to revisiting the LAP as applied to the Laplacian $H_0$, and in particular obtaining uniform “low energy” estimates. In Sec. 5, we prove Theorem A, the LAP for $H$. The eigenfunction expansion theorem, Theorem B, is proved in Sec. 6. The global spacetime estimates for the generalized wave equation (1.6), as stated in Theorem C, are proved in Sec. 7. Some of the results presented here were announced in [9].

2. Functional Spaces and Notation

Throughout this paper we shall make use of the following weighted-$L^2$ and Sobolev spaces. First, for $s \in \mathbb{R}$ and $m$ a nonnegative integer we define
\[ L^{2,s}(\mathbb{R}^n) := \Bigl\{u(x) \,/\, \|u\|^2_{0,s} = \int_{\mathbb{R}^n}(1 + |x|^2)^s|u(x)|^2\,dx < \infty\Bigr\}, \tag{2.1} \]
\[ H^{m,s}(\mathbb{R}^n) := \Bigl\{u(x) \,/\, D^\alpha u \in L^{2,s},\ |\alpha| \le m,\ \|u\|^2_{m,s} = \sum_{|\alpha|\le m}\|D^\alpha u\|^2_{0,s}\Bigr\} \tag{2.2} \]
(we write $L^2$ for $L^{2,0}$ and $\|u\|_0 = \|u\|_{0,0}$).
More generally, for any $\sigma \in \mathbb{R}$, let $H^\sigma \equiv H^{\sigma,0}$ be the Sobolev space of order $\sigma$, namely,
\[ H^\sigma = \{\hat u \,/\, u \in L^{2,\sigma},\ \|\hat u\|_{\sigma,0} = \|u\|_{0,\sigma}\}, \tag{2.3} \]
where the Fourier transform is defined as in (1.4). For negative indices, we denote by $\{H^{-m,s}, \|\cdot\|_{-m,s}\}$ the dual space of $H^{m,-s}$. In particular, observe that any function $f \in H^{-1,s}$ can be represented (not uniquely) as
\[ f = f_0 + \sum_{k=1}^n i^{-1}\frac{\partial}{\partial x_k}f_k, \qquad f_k \in L^{2,s},\ 0 \le k \le n. \tag{2.4} \]
In the case $n = 2$ and $s > 1$, we define
\[ L^{2,s}_0(\mathbb{R}^2) = \{u \in L^{2,s}(\mathbb{R}^2) \,/\, \hat u(0) = 0\}, \]
and set $H^{-1,s}_0(\mathbb{R}^2)$ to be the space of functions $f \in H^{-1,s}(\mathbb{R}^2)$ which have a representation (2.4) where $f_k \in L^{2,s}_0$, $k = 0, 1, 2$.

For any two normed spaces $X$, $Y$, we denote by $B(X, Y)$ the space of bounded linear operators from $X$ to $Y$, equipped with the operator-norm $\|\cdot\|_{B(X,Y)}$ topology.

3. Statement of Results and Background

3.1. The limiting absorption principle (LAP)

We note that the operator $H$ can be extended in an obvious way (retaining the same notation) as a bounded operator $H: H^1_{\rm loc} \to H^{-1}_{\rm loc}$. In particular, $H: H^{1,-s} \to H^{-1,-s}$, for all $s \ge 0$. Furthermore, the graph-norm of $H$ in $H^{-1,-s}$ is equivalent to the norm of $H^{1,-s}$. Similarly, we can consider the resolvent $R(z)$ as defined on $L^{2,s}$, $s \ge 0$, where $L^{2,s}$ is densely and continuously embedded in $H^{-1,s}$.

The basic technical tool used in the present paper is given in the following theorem. It has its own significance, stating that the resolvent is continuous up to the spectrum, including the threshold at $\lambda = 0$.

Theorem A. Suppose that $a(x)$ satisfies (1.1), (1.2). Then the operator $H$ satisfies the LAP in $\mathbb{R}$. More precisely, let $s > 1$ and consider the resolvent $R(z) = (H - z)^{-1}$, $\operatorname{Im} z \ne 0$, as a bounded operator from $L^{2,s}(\mathbb{R}^n)$ to $H^{1,-s}(\mathbb{R}^n)$. Then:

(a) $R(z)$ is bounded with respect to the $H^{-1,s}(\mathbb{R}^n)$ norm. Using the density of $L^{2,s}$ in $H^{-1,s}$, we can therefore view $R(z)$ as a bounded operator from $H^{-1,s}(\mathbb{R}^n)$ to $H^{1,-s}(\mathbb{R}^n)$.

(b) The operator-valued functions, defined respectively in the lower and upper half-planes,
\[ z \to R(z) \in B(H^{-1,s}(\mathbb{R}^n), H^{1,-s}(\mathbb{R}^n)), \qquad s > 1,\ \pm\operatorname{Im} z > 0, \tag{3.1} \]
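can be extended continuously from $\mathbb{C}^\pm = \{z \,/\, \pm\operatorname{Im} z > 0\}$ to $\overline{\mathbb{C}^\pm} = \mathbb{C}^\pm \cup \mathbb{R}$ (with respect to the operator-norm topology of $B(H^{-1,s}(\mathbb{R}^n), H^{1,-s}(\mathbb{R}^n))$). In the case $n = 2$, replace $H^{-1,s}$ by $H^{-1,s}_0$.

Notation. We denote the limiting values of the resolvent on the real axis by
\[ R^\pm(\lambda) = \lim_{\varepsilon\to\pm 0} R(\lambda + i\varepsilon). \]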
The spectrum of $H$ is therefore entirely absolutely continuous. In particular, it follows that the limiting values $R^\pm(\lambda)$ are continuous at $\lambda = 0$ and $H$ has no resonance there.

The main focus of Theorem A is the LAP for $H$ at “low energies”, i.e. in intervals $[\alpha, \beta]$ where $\alpha < 0 < \beta$. However, to review the existing literature, we consider first the LAP in $(0, \infty)$, namely, over the interior of the spectrum.

Under assumptions close to ours here (but also assuming that $a(x)$ is continuously differentiable) a weaker version (roughly, “strong” instead of “uniform” convergence of the resolvents) was obtained by Eidus ([34, Theorem 4 and Remark 1]). His approach relied on elliptic (kernel) estimates.

The systematic treatment of the LAP started with the work of Agmon ([1]). He established it for operators of the type $H_0 + V$, where $V$ is a short-range perturbation. To obtain the LAP for $H_0$ he considered the action of division by symbols with simple zeros in weighted Sobolev spaces. We therefore label this approach as the “Fourier approach” (see [41, Chap. 14]). The short-range potential was treated by perturbation methods. Soon thereafter, two other approaches to the LAP were proposed, first the “Commutator method” (known as “Mourre’s method”) proposed in the classical paper [58] and then the “Spectral method”, initiated in joint works of the author with Devinatz ([12, 13]). In its implementation for partial differential operators, this method relies on estimates of traces of Sobolev functions on characteristic manifolds, somewhat in analogy to the division by symbols with simple zeros in the case of the Fourier method. In fact, it implies the Hölder continuity of the limiting values $R^\pm(\lambda)$ in a suitable operator topology. All three approaches yielded simple proofs for the LAP associated with $H = H_0 + V$, where $V$ is short-range, in the interior $(0, \infty)$ of the spectrum.
Using one of the aforementioned approaches, the LAP for $H$ has later been established, with $V$ being a long-range or Stark-like potential ([5, 45]), a potential in $L^p(\mathbb{R}^n)$ ([36, 47]), a potential depending only on direction ($x/|x|$) ([38]) or a perturbation of such a potential ([61, 62]). In these latter cases the condition $\alpha > 0$ is replaced by $\alpha > \limsup_{|x|\to\infty} V(x)$. The LAP for operators of the type $f(-\Delta) + V$, for a certain class of functions $f$, was derived in [17], using the spectral method. A remarkable success of Mourre’s method was in its application to the LAP in the case of the $N$-body Schrödinger operator (outside of thresholds) ([60]).
As mentioned in the Introduction, if the coefficient matrix $a(x)$ is smooth, the operator $H$ can be viewed as the Laplace–Beltrami operator $\Delta_g$ on noncompact manifolds, where $g$ is a smooth metric that approaches the Euclidean metric at infinity. The LAP in this case (in the interior of the spectrum) has already been established by Mourre. We refer to [65] and references therein for the case of perturbations of such operators. More recent works that employ the Mourre method for the derivation of the LAP in the interior of the spectrum, for asymptotically Euclidean spaces, are [75, Sec. 5] and [19, Theorem 2.2].

We now turn back to our topic here, the LAP in intervals containing the threshold at the bottom of the spectrum. The study of the resolvent near the threshold $\lambda = 0$ is sometimes referred to as “low energy estimates”. The literature in this case is considerably more limited. An inspection of the aforementioned works shows that the methods they employ cannot be extended in a straightforward way to our operator $H$. This case has been studied for the Laplacian $H_0$ in [12, Appendix A] and for $H$ in the one-dimensional case ($n = 1$) in [8, 10, 27]. The present paper deals with the multi-dimensional case $n \ge 2$.

In recent works, Bouclet ([21]) and Bony and Häfner ([20]) have applied the Mourre method in order to establish “low energy” LAP for $\Delta_g$ on noncompact manifolds of dimension $n \ge 3$, where the metric $g(x)$ is smooth but long-range. The paper [64] deals with the two-dimensional ($n = 2$) case, but the resolvent $R(z)$ is restricted to continuous compactly supported functions $f$, thus enabling the use of pointwise decay estimates of $R(z)f$ at infinity.

Finally we mention that the case of the closely related “acoustic propagator”, where the matrix $a(x) = b(x_1)I$ is scalar and dependent on a single coordinate, has been extensively studied [10, 22, 29, 31, 48, 49, 53], as well as the “anisotropic” case where $b(x_1)$ is a general positive matrix ([11]).
The LAP for the periodic case (namely, $a(x)$ is symmetric and periodic) has recently been established in [59]. Note that in this case the spectrum is absolutely continuous and consists of a union of intervals (“bands”).

The proof of Theorem A, based on the spectral approach, is given in Sec. 5. It uses an extended version of the LAP for $H_0$, with the resolvent $R_0(z)$ acting on elements of $H^{-1,s}$, for suitable positive values of $s$ (see Sec. 4).

Since $L^{2,s}$ (respectively $H^{1,-s}$) is densely and continuously embedded in $H^{-1,s}$ (respectively $L^{2,-s}$), we conclude that the resolvents $R_0(z)$, $R(z)$ can be extended continuously to $\overline{\mathbb{C}^\pm}$ in the $B(L^{2,s}(\mathbb{R}^n), L^{2,-s}(\mathbb{R}^n))$ operator topology. Using a well-known theorem of Kato and Kuroda ([51]), we obtain the following immediate corollary concerning the existence and completeness of the wave operators (see (1.3) for the definition).

Corollary 3.1. The wave operators $W_\pm(H, H_0)$ exist and are complete.
Indeed, all that is needed is that H, H0 satisfy the LAP in R, with respect to the same operator topologies. We refer to the paper [46] where the existence and completeness of the wave operators W± (H, H0 ) is established under suitable smoothness assumptions on a(x) (however, a(x) − I is not assumed to be compactly supported and H can include also magnetic and electric potentials).
3.2. The eigenfunction expansion theorem

The spectral theorem (for self-adjoint operators) can be viewed as a “generalized eigenfunction theorem”. In fact, using the result of Theorem A one can obtain a more refined version in this case as follows.

Let $\{E(\lambda), \lambda \in \mathbb{R}\}$ be the spectral family associated with $H$. Let $A(\lambda) = \frac{d}{d\lambda}E(\lambda)$ be its weak derivative. More precisely, we use the well-known formula,
\[ A(\lambda) = \frac{1}{2\pi i}\lim_{\varepsilon\to 0+}\bigl(R(\lambda + i\varepsilon) - R(\lambda - i\varepsilon)\bigr) = \frac{1}{2\pi i}\bigl(R^+(\lambda) - R^-(\lambda)\bigr). \tag{3.2} \]
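A finite-dimensional sketch may help to see why formula (3.2) recovers the spectral family. For a real symmetric matrix the spectrum is discrete, so the limit $\varepsilon \to 0+$ does not exist as a function of $\lambda$; nevertheless, for small fixed $\varepsilon$ the difference of resolvents is a sum of narrow Lorentzian peaks at the eigenvalues, and its trace integrates to (approximately) the number of eigenvalues in the interval. The matrix, the value of `eps` and the helper `count_states` below are illustrative assumptions, not part of the paper.

```python
import numpy as np

def density_of_states(H, lam, eps):
    """(1/(2 pi i)) (R(lam + i eps) - R(lam - i eps)) for a real symmetric matrix H."""
    I = np.eye(H.shape[0])
    R_plus = np.linalg.inv(H - (lam + 1j * eps) * I)
    R_minus = np.linalg.inv(H - (lam - 1j * eps) * I)
    return (R_plus - R_minus) / (2j * np.pi)

def count_states(H, a, b, eps=1e-2, num=6001):
    """Trapezoid-rule integral over [a, b] of the trace of the smoothed density."""
    grid = np.linspace(a, b, num)
    traces = np.array([np.trace(density_of_states(H, lam, eps)).real for lam in grid])
    h = grid[1] - grid[0]
    return float(h * (traces.sum() - 0.5 * (traces[0] + traces[-1])))

# Hypothetical 3x3 example with eigenvalues 1, 2 and 5.
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
H = Q @ np.diag([1.0, 2.0, 5.0]) @ Q.T
n_states = count_states(H, 0.0, 3.0)   # close to 2, the number of eigenvalues in (0, 3)
```

In the setting of Theorem A the spectrum is absolutely continuous, which is what allows the limit in (3.2) to exist as a bounded operator between weighted spaces.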
By Theorem A, we know that $A(\lambda) \in B(L^{2,s}(\mathbb{R}^n), L^{2,-s}(\mathbb{R}^n))$. The formal relation $(H - \lambda)A(\lambda) = 0$ can be given a rigorous meaning if, for example, we can find a bounded operator $T$ such that $T^*A(\lambda)T$ is bounded in $L^2(\mathbb{R}^n)$ and has a complete set (necessarily at most countable) of eigenvectors. These will serve as “generalized eigenvectors” for $H$. We refer to [18, Chaps. V and VI] and [23] for a development of this approach for self-adjoint elliptic operators. Note that by this approach we have at most a countable number of such generalized eigenvectors for any fixed $\lambda$. In the case of $H_0 = -\Delta$, they correspond to $|x|^{-\frac{n-3}{2}}J_{\kappa_j}(\sqrt{\lambda}|x|)\psi_j(\omega)$, where $\kappa_j = \sqrt{\lambda_j + \frac{(n-1)(n-3)}{4}}$, $\lambda_j$ being the $j$th eigenvalue of the Laplace–Beltrami operator on the unit sphere $S^{n-1}$, $\psi_j$ the corresponding eigenfunction and $J_\nu$ the Bessel function of order $\nu$.

On the other hand, the Fourier expansion (1.5) can be viewed as expressing a function in terms of the “generalized eigenfunctions” $\exp(i\xi x)$ of $H_0$. Observe that now there is a continuum of such functions corresponding to $\lambda > 0$, namely, $|\xi|^2 = \lambda$. From the physical point of view, this expansion in terms of “plane waves” proves to be more useful for many applications. In particular, replacing $-\Delta$ by the Schrödinger operator $-\Delta + V(x)$ one can expect, under certain hypotheses on the potential $V$, a similar expansion in terms of “distorted plane waves”. This has been accomplished, in increasing order of generality (more specifically, decay assumptions on $V(x)$ as $|x| \to \infty$) in [1, 2, 44, 63, 68]. See also [74] for an eigenfunction expansion for relativistic Schrödinger operators.

Here we use the LAP result of Theorem A in order to derive a similar expansion for the operator $H$. In fact, our generalized eigenfunctions are given by the following definition.
Definition 3.2. For every $\xi \in \mathbb{R}^n$ let
\[ \psi_\pm(x, \xi) = -R^\mp(|\xi|^2)\bigl((H - |\xi|^2)\exp(i\xi x)\bigr) = R^\mp(|\xi|^2)\Bigl(\sum_{l,j=1}^n \partial_l(a_{l,j}(x) - \delta_{l,j})\partial_j \exp(i\xi x)\Bigr). \tag{3.3} \]
The generalized eigenfunctions of $H$ are defined by
\[ \varphi_\pm(x, \xi) = \exp(i\xi x) + \psi_\pm(x, \xi). \tag{3.4} \]
We assume $n \ge 3$ in order to simplify the statement of the theorem. As we show below (see Proposition 6.1) the generalized eigenfunctions are (at least) continuous in $x$, so that the integral in the statement makes sense.

Theorem B. Suppose that $n \ge 3$ and that $a(x)$ satisfies (1.1) and (1.2). For any compactly supported $f \in L^2(\mathbb{R}^n)$ define
\[ (\mathbb{F}_\pm f)(\xi) = (2\pi)^{-\frac{n}{2}}\int_{\mathbb{R}^n} f(x)\varphi_\pm(x, \xi)\,dx, \qquad \xi \in \mathbb{R}^n. \tag{3.5} \]
Then the transformations $\mathbb{F}_\pm$ can be extended as unitary transformations (for which we retain the same notation) of $L^2(\mathbb{R}^n)$ onto itself. Furthermore, these transformations “diagonalize” $H$ in the following sense: $f \in L^2(\mathbb{R}^n)$ is in the domain $D(H)$ if and only if $|\xi|^2(\mathbb{F}_\pm f)(\xi) \in L^2(\mathbb{R}^n)$, and
\[ H = \mathbb{F}_\pm^* M_{|\xi|^2}\mathbb{F}_\pm, \tag{3.6} \]
where $M_{|\xi|^2}$ is the multiplication operator by $|\xi|^2$.

3.3. Spacetime estimates for a generalized wave equation

The Strichartz estimates ([72]) have become a fundamental ingredient in the study of nonlinear wave equations. They are $L^p$ spacetime estimates that are derived for operators whose leading part has constant coefficients. We refer to the books [4, 70, 71] for detailed accounts and further references. Here we focus exclusively on spacetime estimates pertinent to the framework of this paper, namely, weighted $L^2$ estimates. Indeed, once the “low energy estimates” of Theorem A are established, the method of proof here follows a standard methodology.

We recall first some results related to the Cauchy problem for the classical wave equation
\[ \Box u = \frac{\partial^2 u}{\partial t^2} - \Delta u = 0, \tag{3.7} \]
subject to the initial data
\[ u(x, 0) = u_0(x), \qquad \partial_t u(x, 0) = v_0(x), \qquad x \in \mathbb{R}^n. \tag{3.8} \]
Rn
n ≥ 4,
while in [7] we gave the estimate |x|−2α−1 |u(x, t)|2 dxdt ≤ Cα ( |∇|α u0 20 + |∇|α−1 v0 20 ), R
Rn
(3.9)
n ≥ 3,
(3.10)
for every $\alpha \in (0,1)$. Related results were obtained in [55] (allowing also dissipative terms), [42] (with some gain in regularity), [76] (with short-range potentials) and [39] for spherically symmetric solutions.

Here we consider the equation
$$\frac{\partial^2 u}{\partial t^2} + Hu = \frac{\partial^2 u}{\partial t^2} - \sum_{i,j=1}^{n}\partial_i\, a_{i,j}(x)\,\partial_j u = f(x,t), \tag{3.11}$$
subject to the initial data (3.8). We first replace the assumptions (1.1) and (1.2) by stronger ones as follows.

(H1)
$$a(x) = g^{-1}(x) = (g^{i,j}(x))_{1\le i,j\le n}, \tag{3.12}$$
where $g(x) = (g_{i,j}(x))_{1\le i,j\le n}$ is a smooth Riemannian metric on $\mathbb{R}^n$ such that
$$g(x) = I \quad \text{for } |x| > \Lambda_0. \tag{3.13}$$

(H2) The Hamiltonian flow associated with $h(x,\xi) = (g(x)\xi,\xi)$ is nontrapping for any (positive) value of $h$.
Recall that (H2) means that the flow associated with the Hamiltonian vectorfield $\mathcal{H} = \frac{\partial h}{\partial \xi}\frac{\partial}{\partial x} - \frac{\partial h}{\partial x}\frac{\partial}{\partial \xi}$ leaves any compact set in $\mathbb{R}^n_x$. Identical hypotheses are imposed in the study of resolvent estimates in semiclassical theory ([24, 25]). In our estimates we use “homogeneous Sobolev spaces” associated with the operator $H$. We note that since $H$ has no eigenvalue at zero, the operators $H^{-1}$ and $H^{-\frac12}$ are well defined self-adjoint operators. Note that $\|H^{\frac12}\theta\|_0$ is equivalent to the homogeneous Sobolev norm $\|\nabla\theta\|_0$.

Theorem C. Suppose that $n \ge 3$ and that $a(x)$ satisfies Hypotheses (H1) and (H2). Let $s > 1$.

(a) (Local Energy Decay) Let $u_0 \in D(H^{\frac12})$ and $v_0 \in L^2(\mathbb{R}^n)$. Then there exists a constant $C_1 = C_1(s,n) > 0$ such that the solution to (3.11) and (3.8) satisfies
$$\int_{\mathbb{R}}\int_{\mathbb{R}^n} (1+|x|^2)^{-s}\big[|H^{\frac12}u(x,t)|^2 + |u_t(x,t)|^2\big]\,dx\,dt \le C_1\Big\{\|H^{\frac12}u_0\|_0^2 + \|v_0\|_0^2 + \int_{\mathbb{R}}\int_{\mathbb{R}^n} |f(x,t)|^2\,dx\,dt\Big\}. \tag{3.14}$$
(b) (Amplitude Decay) Assume $f = 0$. Let $u_0 \in L^2(\mathbb{R}^n)$ and $v_0 \in D(H^{-\frac12})$. There exists a constant $C_2 = C_2(s,n) > 0$ such that the solution to (3.11) and (3.8) satisfies
$$\int_{\mathbb{R}}\int_{\mathbb{R}^n} (1+|x|^2)^{-s}\,|u(x,t)|^2\,dx\,dt \le C_2\big[\|u_0\|_0^2 + \|H^{-\frac12}v_0\|_0^2\big]. \tag{3.15}$$
These estimates generalize similar estimates obtained for the classical ($g = I$) wave equation ([7, 55]).

Remark 3.3. The estimate (3.14) is an “energy decay estimate” for the wave equation (3.11). A localized (in space) version of the estimate has served to obtain global (small amplitude) existence theorems for the corresponding nonlinear equation ([25, 40]).

Remark 3.4. The referee has pointed out to the author the recent preprint [19, Theorem 1.3], where a more general result is obtained, with the metric being long-range.

The weighted $L^2$-spacetime estimates for the dispersive equation
$$i^{-1}\frac{\partial}{\partial t}u = Lu,$$
have been extensively treated in recent years. In general, in this case there is also a gain of derivatives (so called “smoothing”) in addition to the energy decay. For the Schrödinger operator $L = -\Delta + V(x)$, with various assumptions on the potential $V$, we refer to [3, 6, 7, 15, 16, 42, 52, 67, 69, 77] and references therein. Smoothing estimates in the presence of magnetic potentials are considered in [30]. The Schrödinger operator on a Riemannian manifold is considered in [24, 33]. For more general operators, see [14, 17, 26, 43, 57, 66, 73] and references therein.

4. The Operator $H_0 = -\Delta$

Let $\{E_0(\lambda)\}$ be the spectral family associated with $H_0$, so that
$$(E_0(\lambda)h,h) = \int_{|\xi|^2\le\lambda} |\hat h|^2\,d\xi, \quad \lambda \ge 0,\ h \in L^2(\mathbb{R}^n). \tag{4.1}$$
Following the methodology of [13, 32], we see that the weak derivative $A_0(\lambda) = \frac{d}{d\lambda}E_0(\lambda)$ exists in $B(L^{2,s}, L^{2,-s})$ for any $s > \frac12$ and $\lambda > 0$. (Here and below we write $L^{2,s}$ for $L^{2,s}(\mathbb{R}^n)$.) Furthermore,
$$\langle A_0(\lambda)h,h\rangle = (2\sqrt{\lambda})^{-1}\int_{|\xi|^2=\lambda} |\hat h|^2\,d\tau, \tag{4.2}$$
where $\langle\,,\rangle$ is the $(L^{2,-s}, L^{2,s})$ pairing (conjugate linear with respect to the second term) and $d\tau$ is the Lebesgue surface measure. Recall that by the standard trace
lemma we have
$$\int_{|\xi|^2=\lambda} |\hat h|^2\,d\tau \le C\,\|\hat h\|_{H^s}^2, \quad s > \tfrac12. \tag{4.3}$$
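However, we can refine this estimate near $\lambda = 0$ as follows.

Proposition 4.1. Let $\frac12 < s < \frac32$, $h \in L^{2,s}$. For $n = 2$ assume further that $s > 1$ and $h \in L^{2,s}_0$. Then
$$\int_{|\xi|^2=\lambda} |\hat h|^2\,d\tau \le C\min(\lambda^{\gamma},1)\,\|\hat h\|_{H^s}^2, \tag{4.4}$$
where
$$\begin{cases} 0 < \gamma = s - \frac12, & n \ge 3, \\ 0 < \gamma < s - \frac12, & n = 2, \end{cases} \tag{4.5}$$
and $C = C(s,\gamma,n)$.

Proof. If $n \ge 3$, the proof follows as in [16, Appendix], using the “generalized Hardy inequality” due to Herbst [37], namely, that multiplication by $|\xi|^{-s}$ is bounded from $H^s$ into $L^2$ (see also [54, Sec. 9.4]). If $n = 2$ and $1 < s < \frac32$ we have, for $h \in L^{2,s}_0$,
$$|\hat h(\xi)| = |\hat h(\xi) - \hat h(0)| \le C_{s,\delta}\,|\xi|^{\delta}\,\|\hat h\|_{H^s},$$
for any $0 < \delta < \min(1, s-1)$. Using this estimate in the integral on the left-hand side of (4.4) the claim follows also in this case.

Combining Eqs. (4.2)–(4.4) we conclude that
$$|\langle A_0(\lambda)f,g\rangle| \le \langle A_0(\lambda)f,f\rangle^{\frac12}\,\langle A_0(\lambda)g,g\rangle^{\frac12} \le C\min(\lambda^{-\frac12},\lambda^{\eta})\,\|f\|_{0,s}\,\|g\|_{0,\sigma}, \quad f \in L^{2,s},\ g \in L^{2,\sigma}, \tag{4.6}$$
where either
$$\begin{aligned} &\text{(i)}\quad n \ge 3,\quad \tfrac12 < s,\sigma < \tfrac32,\quad s+\sigma > 2 \ \text{ and } \ 0 < 2\eta = s+\sigma-2, \\ &\text{or} \\ &\text{(ii)}\quad n = 2,\quad 1 < s < \tfrac32,\quad \tfrac12 < \sigma < \tfrac32,\quad s+\sigma > 2,\quad 0 < 2\eta < s+\sigma-2 \end{aligned} \tag{4.7}$$
and $\hat f(0) = 0$. In both cases, $A_0(\lambda)$ is Hölder continuous and vanishes at $0, \infty$, so as in [13] we obtain the following.

Proposition 4.2. The operator-valued function
$$z \to R_0(z) \in \begin{cases} B(L^{2,s}, L^{2,-\sigma}), & n \ge 3, \\ B(L^{2,s}_0, L^{2,-\sigma}), & n = 2, \end{cases} \tag{4.8}$$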
where $s, \sigma$ satisfy (4.7), can be extended continuously from $\mathbb{C}^{\pm}$ to $\overline{\mathbb{C}^{\pm}}$, in the respective uniform operator topologies.

Remark 4.3. We note that the conditions (4.7) yield the continuity of $A_0(\lambda)$ across the threshold $\lambda = 0$ and hence the continuity property of the resolvent as in Proposition 4.2. However, for the local continuity at any $\lambda_0 > 0$, it suffices to take $s, \sigma > \frac12$, as in [1]. This remark applies equally to the statements below, where the resolvent is considered in other functional settings.

We shall now extend this proposition to more general function spaces. Let $g \in H^{1,\sigma}$, where $s, \sigma$ satisfy (4.7). Let $f \in H^{-1,s}$ have a representation of the form (2.4). Equation (4.2) can be extended to yield an operator (for which we retain the same notation) $A_0(\lambda) \in B(H^{-1,s}, H^{-1,-\sigma})$, defined by (where now $\langle\,,\rangle$ is used for the $(H^{-1,s}, H^{1,\sigma})$ pairing)
$$\Big\langle A_0(\lambda)\Big[f_0 + i^{-1}\sum_{k=1}^{n}\frac{\partial}{\partial x_k}f_k\Big], g\Big\rangle = (2\sqrt{\lambda})^{-1}\int_{|\xi|^2=\lambda}\Big[\hat f_0(\xi) + \sum_{k=1}^{n}\xi_k\hat f_k(\xi)\Big]\overline{\hat g(\xi)}\,d\tau, \quad f \in H^{-1,s},\ g \in H^{1,\sigma}, \tag{4.9}$$
(replace $H^{-1,s}$ by $H^{-1,s}_0$ if $n = 2$). Observe that this definition makes good sense even though the representation (2.4) is not unique, since
$$f = f_0 + i^{-1}\sum_{k=1}^{n}\frac{\partial}{\partial x_k}f_k = \tilde f_0 + i^{-1}\sum_{k=1}^{n}\frac{\partial}{\partial x_k}\tilde f_k$$
implies
$$\hat f_0(\xi) + \sum_{k=1}^{n}\xi_k\hat f_k(\xi) = \hat{\tilde f}_0(\xi) + \sum_{k=1}^{n}\xi_k\hat{\tilde f}_k(\xi)$$
(as tempered distributions). To estimate the operator-norm of $A_0(\lambda)$ in this setting we use (4.9) and the considerations preceding Proposition 4.2, to obtain, instead of (4.6), for $k = 1, 2, \ldots, n$,
$$\Big|\Big\langle A_0(\lambda)\frac{\partial}{\partial x_k}f_k, g\Big\rangle\Big| \le C\min(\lambda^{-\frac12},\lambda^{\eta})\,\|f\|_{-1,s}\,\|g\|_{1,\sigma}, \quad f \in H^{-1,s},\ g \in H^{1,\sigma}, \tag{4.10}$$
where $s, \sigma$ satisfy (4.7) (replace $H^{-1,s}$ by $H^{-1,s}_0$ if $n = 2$).
We now define the extension of the resolvent operator by
$$R_0(z) = \int_0^{\infty}\frac{A_0(\lambda)}{\lambda - z}\,d\lambda, \quad \operatorname{Im} z \ne 0. \tag{4.11}$$
The convergence of the integral (in operator-norm) follows from the estimate (4.10). The LAP in this case is given in the following proposition.

Proposition 4.4. The operator-valued function $R_0(z)$ is well-defined (and analytic) for nonreal $z$ in the following functional setting:
$$z \to R_0(z) \in \begin{cases} B(H^{-1,s}, H^{1,-\sigma}), & n \ge 3, \\ B(H^{-1,s}_0, H^{1,-\sigma}), & n = 2, \end{cases} \tag{4.12}$$
where $s, \sigma$ satisfy (4.7). Furthermore, it can be extended continuously from $\mathbb{C}^{\pm}$ to $\overline{\mathbb{C}^{\pm}}$, in the respective uniform operator topologies. The limiting values are denoted by $R_0^{\pm}(\lambda)$. The extended function satisfies
$$(H_0 - z)R_0(z)f = f, \quad f \in H^{-1,s},\ z \in \overline{\mathbb{C}^{\pm}}, \tag{4.13}$$
where for $z = \lambda \in \mathbb{R}$, $R_0(z) = R_0^{\pm}(\lambda)$.

Proof. We assume for simplicity $n \ge 3$. By Definition (4.11) and the estimate (4.10), we get readily $R_0(z) \in B(H^{-1,s}, H^{-1,-\sigma})$ if $\operatorname{Im} z \ne 0$, as well as the analyticity of the map $z \to R_0(z)$, $\operatorname{Im} z \ne 0$. Furthermore, the extension to $\operatorname{Im} z = 0$ is carried out as in [13]. Equation (4.13) is obvious if $\operatorname{Im} z \ne 0$ and $f \in L^{2,s}$. By the density of $L^{2,s}$ in $H^{-1,s}$, the continuity of $R_0(z)$ on $H^{-1,s}$ and the continuity of $H_0 - z$ (in the sense of distributions), we can extend it to all $f \in H^{-1,s}$. As $z \to \lambda \pm i\cdot 0$, we have $R_0(z)f \to R_0^{\pm}(\lambda)f$ in $H^{-1,-\sigma}$. Applying the (constant coefficient) operator $H_0 - z$ yields, in the sense of distributions,
$$f = (H_0 - z)R_0(z)f \to (H_0 - \lambda)R_0^{\pm}(\lambda)f,$$
which establishes (4.13) also for $\operatorname{Im} z = 0$. Finally, the established continuity of $z \to R_0(z) \in B(H^{-1,s}, H^{-1,-\sigma})$ (up to the real boundary) and Eq. (4.13) imply the continuity of the map $z \to H_0R_0(z) \in B(H^{-1,s}, H^{-1,-\sigma})$. The stronger continuity claim (4.12) follows since the norm of $H^{1,-\sigma}$ is equivalent to the graph-norm of $H_0$ as a map of $H^{-1,-\sigma}$ to itself.

Remark 4.5. The main point here is the fact that the limiting values can be extended continuously to the threshold at $\lambda = 0$. In the neighborhood of any $\lambda > 0$ this proposition follows from [68, Theorem 2.3], where a very different proof is used. In fact, using the terminology there, the limit functions $R_0^{\pm}(\lambda)f$ are the unique (on either side of the positive real axis) radiative functions and they satisfy a suitable “Sommerfeld radiation condition”. We recall it here for the sake of completeness, since we will need it in the next section.
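Let $z = k^2 \in \mathbb{C}\setminus\{0\}$, $\operatorname{Im} k \ge 0$. For $f \in H^{-1,s}$ let $u = R_0(z)f \in H^{1,-\sigma}$ be as defined above. Then
$$\|\mathcal{R}u\|^2 = \int_{|x|>\Lambda_0}\Big|\,r^{-\frac{n-1}{2}}\frac{\partial}{\partial r}\big(r^{\frac{n-1}{2}}u\big) - iku\,\Big|^2\,dx < \infty, \tag{4.14}$$
where $r = |x|$. We shall refer to $\|\mathcal{R}u\|$ as the radiative norm of $u$. Furthermore, we can take $\frac12 < s, \sigma$, as in Remark 4.3.

5. The Operator $H$

Fix $[\alpha,\beta] \subseteq \mathbb{R}$ and let
$$\Omega = \{z \in \mathbb{C}^{+} : \alpha < \operatorname{Re} z < \beta,\ 0 < \operatorname{Im} z < 1\}. \tag{5.1}$$
Let $z = \mu + i\varepsilon \in \Omega$ and consider the equation
$$(H - z)u = f \in H^{-1,s}, \quad u \in H^{1,-\sigma}, \quad (f \in H^{-1,s}_0 \text{ if } n = 2). \tag{5.2}$$
(Observe that in the case $n = 2$ also $u \in L^{2,\sigma}_0$.) With $\Lambda_0$ as in (1.2), let $\chi(x) \in C^{\infty}(\mathbb{R}^n)$ be such that
$$\chi(x) = \begin{cases} 0, & |x| < \Lambda_0 + 1, \\ 1, & |x| > \Lambda_0 + 2. \end{cases} \tag{5.3}$$
Equation (5.2) can be written as
$$(H_0 - z)(\chi u) = \chi f - 2\nabla\chi\cdot\nabla u - u\Delta\chi. \tag{5.4}$$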
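The step from (5.2) to (5.4) can be spelled out as follows. This is only a sketch: it assumes, in line with (1.2), that $a(x) = I$ for $|x| > \Lambda_0$, so that $H$ and $H_0 = -\Delta$ act identically on the support of $\chi$.

```latex
\begin{align*}
(H_0 - z)(\chi u)
  &= -\Delta(\chi u) - z\,\chi u \\
  &= \chi\,(-\Delta u - z u) - 2\nabla\chi\cdot\nabla u - u\,\Delta\chi \\
  &= \chi\,(H - z)u - 2\nabla\chi\cdot\nabla u - u\,\Delta\chi
     \qquad \text{(since } Hu = -\Delta u \text{ on } \operatorname{supp}\chi \subseteq \{|x| \ge \Lambda_0 + 1\}\text{)} \\
  &= \chi f - 2\nabla\chi\cdot\nabla u - u\,\Delta\chi .
\end{align*}
```

The two extra terms come from the product rule and are supported in the shell $\Lambda_0 + 1 \le |x| \le \Lambda_0 + 2$, the only region where $\nabla\chi$ and $\Delta\chi$ can be nonzero.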
Letting $\psi(x) = 1 - \chi(\frac{x}{2}) \in C_0^{\infty}(\mathbb{R}^n)$ and using Proposition 4.4 and standard elliptic estimates, we obtain from (5.4)
$$\|u\|_{1,-\sigma} \le C\big[\|f\|_{-1,s} + \|\psi u\|_{0,-s}\big], \tag{5.5}$$
where $s, \sigma$ satisfy (4.7), and $C > 0$ depends only on $\Lambda_0, \sigma, s, n$. We note that since $\psi$ is compactly supported, the term $\|\psi u\|_{0,-s}$ can be replaced by $\|\psi u\|_{0,-s'}$ for any real $s'$. In fact, the second term in the right-hand side can be dispensed with, as is demonstrated in the following proposition.

Proposition 5.1. The solution to (5.2) satisfies
$$\|u\|_{1,-\sigma} \le C\,\|f\|_{-1,s}, \tag{5.6}$$
where $s, \sigma$ satisfy (4.7) and $C > 0$ depends only on $\sigma, s, n, \Lambda_0$.

Proof. In view of (5.5), we only need to show that
$$\|\psi u\|_{0,-s} \le C\,\|f\|_{-1,s}. \tag{5.7}$$
Since $L^{2,s}(\mathbb{R}^n)$ is dense in $H^{-1,s}(\mathbb{R}^n)$ it suffices to prove this inequality for $f \in L^{2,s}(\mathbb{R}^n) \cap H^{-1,s}(\mathbb{R}^n)$ (using the norm of $H^{-1,s}$). We argue by contradiction. Let
$$\{z_k\}_{k=1}^{\infty} \subseteq \Omega, \quad \{f_k\}_{k=1}^{\infty} \subseteq L^{2,s}(\mathbb{R}^n) \cap H^{-1,s}(\mathbb{R}^n)$$
(with $\hat f_k(0) = 0$ if $n = 2$) and
$$\{u_k = R(z_k)f_k\}_{k=1}^{\infty} \subseteq H^{1,-\sigma}(\mathbb{R}^n)$$
be such that
$$\|\psi u_k\|_{0,-s} = 1, \quad \|f_k\|_{-1,s} \le k^{-1}, \quad k = 1, 2, \ldots, \qquad z_k \to z_0 \in \bar\Omega \ \text{ as } k \to \infty. \tag{5.8}$$
By (5.5), $\{u_k\}_{k=1}^{\infty}$ is bounded in $H^{1,-\sigma}$. Replacing the sequence by a suitable subsequence (without changing notation) and using the Rellich compactness theorem we may assume that there exists a function $u \in L^{2,-\sigma'}$, $\sigma' > \sigma$, such that
$$u_k \to u \ \text{ in } L^{2,-\sigma'} \ \text{ as } k \to \infty. \tag{5.9}$$
Furthermore, by weak compactness we actually have (restricting again to a subsequence if needed)
$$u_k \xrightarrow{\ w\ } u \ \text{ in } H^{1,-\sigma} \ \text{ as } k \to \infty. \tag{5.10}$$
Since $H$ maps continuously $H^{1,-\sigma}$ into $H^{-1,-\sigma}$ we have
$$Hu_k \xrightarrow{\ w\ } Hu \ \text{ in } H^{-1,-\sigma} \ \text{ as } k \to \infty,$$
so that from $(H - z_k)u_k = f_k$ we infer that
$$(H - z_0)u = 0. \tag{5.11}$$
In view of (5.4) and Remark 4.5, the functions $\chi u_k$ are “radiative functions”. Since they are uniformly bounded in $H^{1,-\sigma}$ their “radiative norms” (4.14) are uniformly bounded. Suppose first that $z_0 \ne 0$. In view of Remark 4.5, we can take $s, \sigma > \frac12$. Then the limit function $u$ is a radiative solution to $(H_0 - z_0)u = 0$ in $|x| > \Lambda_0 + 2$ and hence must vanish there (see [68]). By the unique continuation property of solutions to (5.11) we conclude that $u \equiv 0$. Thus by (5.9) we get $\|\psi u_k\|_{0,-\sigma'} \to 0$ as $k \to \infty$, which contradicts (5.8). We are therefore left with the case $z_0 = 0$. In this case $u \in H^{1,-\sigma}$ satisfies the equation
$$\nabla\cdot(a(x)\nabla u) = 0. \tag{5.12}$$
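In particular, $\Delta u = 0$ in $|x| > \Lambda_0$ and
$$\int_{\Lambda_0}^{\infty} r^{-2\sigma}\int_{|x|=r}\Big(|u|^2 + \Big|\frac{\partial u}{\partial r}\Big|^2\Big)\,d\tau\,dr < \infty. \tag{5.13}$$
Consider first the case $n \ge 3$. We may then use the representation of $u$ by spherical harmonics, so that, with $x = r\omega$, $\omega \in S^{n-1}$,
$$u(x) = r^{-\frac{n-1}{2}}\Big\{\sum_{j=0}^{\infty} b_j r^{\mu_j}h_j(\omega) + \sum_{j=0}^{\infty} c_j r^{-\nu_j}h_j(\omega)\Big\}, \quad r > \Lambda_0, \tag{5.14}$$
where
$$\mu_j(\mu_j - 1) = \nu_j(\nu_j + 1) = \lambda_j + \frac{(n-1)(n-3)}{4}, \tag{5.15}$$
$0 = \lambda_0 < \lambda_1 \le \lambda_2 \le \cdots$ being the eigenvalues of the Laplace–Beltrami operator on $S^{n-1}$, and $h_j(\omega)$ the corresponding spherical harmonics. Since $\lambda_1 = n - 1$, it follows that
$$\mu_0 = \frac{n-1}{2}, \quad \mu_0 + 1 \le \mu_1 \le \mu_2 \cdots, \quad \frac{n-3}{2} = \nu_0 < \nu_1 \le \nu_2 \cdots. \tag{5.16}$$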
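The exponent relations (5.15) and (5.16) can be checked with exact rational arithmetic. The sketch below is an illustration only: it indexes by the degree of the spherical harmonic rather than by $j$, and it uses the standard eigenvalue $\lambda = \ell(\ell + n - 2)$ of the Laplace–Beltrami operator on $S^{n-1}$ for degree $\ell$; the function names are hypothetical.

```python
from fractions import Fraction


def exponents(degree, n):
    """Exponents (mu, nu) attached to spherical harmonics of the given degree on S^{n-1}."""
    mu = Fraction(2 * degree + n - 1, 2)  # r^{-(n-1)/2} * r^{mu}  = r^{degree}
    nu = Fraction(2 * degree + n - 3, 2)  # r^{-(n-1)/2} * r^{-nu} = r^{-(degree + n - 2)}
    return mu, nu


def indicial_value(degree, n):
    """Right-hand side of (5.15), with lambda = degree * (degree + n - 2)."""
    eigenvalue = degree * (degree + n - 2)
    return eigenvalue + Fraction((n - 1) * (n - 3), 4)


def satisfies_indicial_equation(n, max_degree=8):
    """Check mu(mu - 1) = nu(nu + 1) = lambda + (n-1)(n-3)/4 for degrees 0..max_degree."""
    for degree in range(max_degree + 1):
        mu, nu = exponents(degree, n)
        target = indicial_value(degree, n)
        if mu * (mu - 1) != target or nu * (nu + 1) != target:
            return False
    return True


if __name__ == "__main__":
    for n in (3, 4, 7):
        print(n, satisfies_indicial_equation(n), exponents(0, n), exponents(1, n))
```

For degree 0 this gives $\mu_0 = (n-1)/2$ and $\nu_0 = (n-3)/2$, and passing from degree 0 to degree 1 raises both exponents by exactly 1, in agreement with (5.16).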
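We now observe that (5.13) forces $b_0 = b_1 = \cdots = 0$. Also, by (5.14),
$$\int_{|x|=r}\frac{\partial u}{\partial r}\,d\tau = -(n-2)\,|S^{n-1}|\,c_0, \quad r > \Lambda_0, \tag{5.17}$$
($|S^{n-1}|$ is the surface measure of $S^{n-1}$), while integrating (5.12) we get
$$\int_{|x|=r}\frac{\partial u}{\partial r}\,d\tau = 0, \quad r > \Lambda_0. \tag{5.18}$$
Thus $c_0 = 0$. It now follows from (5.14) that, for $r > \Lambda_0$,
$$\int_{|x|=r}\Big(|u|^2 + \Big|\frac{\partial u}{\partial r}\Big|^2\Big)\,d\tau \le \Big(\frac{r}{\Lambda_0}\Big)^{-2\nu_1}\int_{|x|=\Lambda_0}\Big(|u|^2 + \Big|\frac{\partial u}{\partial r}\Big|^2\Big)\,d\tau. \tag{5.19}$$
Multiplying (5.12) by $\bar u$ and integrating by parts over the ball $|x| \le r$, we infer from (5.19) that the boundary term vanishes as $r \to \infty$. Thus $\nabla u \equiv 0$, in contradiction to (5.8) and (5.9).

It remains to deal with the case $n = 2$. Instead of (5.14), we now have
$$u(x) = r^{-\frac12}\Big\{\tilde b_0\, r^{\frac12}\log r + \sum_{j=0}^{\infty} b_j r^{\mu_j}h_j(\omega) + \sum_{j=1}^{\infty} c_j r^{-\nu_j}h_j(\omega)\Big\}, \quad r > \Lambda_0, \tag{5.20}$$
where $\mu_0 = \frac12$, $\mu_1 = \frac32$, $\nu_1 = \frac12$.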
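As in the derivation above, the condition (5.13) yields $b_0 = b_1 = \cdots = 0$. Also, we get $\tilde b_0 = 0$ (the coefficient of the logarithmic term) in view of (5.18). It now follows that
$$\int_{|x|=r}\bar u\,\frac{\partial u}{\partial r}\,d\tau = -2\pi\sum_{j=1}^{\infty}\Big(\nu_j + \frac12\Big)|c_j|^2\,r^{-2\nu_j - 1}, \quad r \ge \Lambda_0, \tag{5.21}$$
from which, as in the argument following (5.19), we deduce that $u \equiv 0$, again in contradiction to (5.8) and (5.9).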
Proof of Theorem A. Part (a) of the theorem is actually covered by Proposition 5.1. Moreover, the proposition implies that the operator-valued function
$$z \to R(z) \in B(H^{-1,s}(\mathbb{R}^n), H^{1,-\sigma}(\mathbb{R}^n)), \quad s > 1,\ z \in \Omega,$$
is uniformly bounded, where $s, \sigma$ satisfy (4.7). Here and below replace $H^{-1,s}$ by $H^{-1,s}_0$ if $n = 2$.

We next show that the function $z \to R(z)$ can be continuously extended to $\bar\Omega$ in the weak topology of $B(H^{-1,s}(\mathbb{R}^n), H^{1,-\sigma}(\mathbb{R}^n))$. To this end, we take $f \in H^{-1,s}(\mathbb{R}^n)$ and $g \in H^{-1,\sigma}(\mathbb{R}^n)$ and consider the function
$$z \to \langle g, R(z)f\rangle, \quad z \in \Omega,$$
where $\langle\,,\rangle$ is the $(H^{-1,\sigma}, H^{1,-\sigma})$ pairing. We need to show that it can be extended continuously to $\bar\Omega$. In view of the uniform boundedness established in Proposition 5.1, we can take $f, g$ in dense sets (of the respective spaces). In particular, we can take $f \in L^{2,s}(\mathbb{R}^n)$ and $g \in L^{2,\sigma}(\mathbb{R}^n)$, so that the continuity property in $\Omega$ is obvious. Consider therefore a sequence $\{z_k\}_{k=1}^{\infty} \subseteq \Omega$ such that $z_k \to z_0 \in [\alpha,\beta]$ as $k \to \infty$.

The sequence $\{u_k = R(z_k)f\}_{k=1}^{\infty}$ is bounded in $H^{1,-\sigma}(\mathbb{R}^n)$. Therefore there exists a subsequence $\{u_{k_j}\}_{j=1}^{\infty}$ which converges to a function $u \in L^{2,-\sigma'}$, $\sigma' > \sigma$. We can further assume that $u_{k_j} \xrightarrow{\ w\ } u$ in $H^{1,-\sigma}$ as $j \to \infty$. It follows that
$$\langle g, u_{k_j}\rangle \to \langle g, u\rangle \ \text{ as } j \to \infty.$$
Passing to the limit in $(H - z_{k_j})u_{k_j} = f$ we see that the limit function satisfies
$$(H - z_0)u = f.$$
We now repeat the argument employed in the proof of Proposition 5.1. If $z_0 \ne 0$ we note that the functions $\{\chi u_k\}_{k=1}^{\infty}$ are radiative functions with uniformly bounded “radiative norms” (4.14) in $|x| > \Lambda_0 + 2$. The same is therefore true for the limit function $u$. If $z_0 = 0$ the function $u \in H^{1,-\sigma}$ solves $Hu = f$.
In both cases this function is unique and we get the convergence
$$\langle g, R(z_k)f\rangle = \langle g, u_k\rangle \to \langle g, u\rangle \ \text{ as } k \to \infty.$$
We can now define
$$R^{+}(z_0)f = u, \tag{5.22}$$
with an analogous definition for $R^{-}(z_0)$. At this point we can readily deduce the following extension of the resolvent $R(z)$ as the inverse of $H - z$:
$$(H - z)R(z)f = f, \quad f \in H^{-1,s},\ z \in \overline{\mathbb{C}^{\pm}}, \tag{5.23}$$
where $R(z) = R^{\pm}(\lambda)$ when $z = \lambda \in \mathbb{R}$. Indeed, observe that if $\operatorname{Im} z \ne 0$ then $(H - z)R(z)f = f$ for $f \in L^{2,s}(\mathbb{R}^n)$ and $(H - z)R(z) \in B(H^{-1,s}, H^{-1,-\sigma})$, so the assertion follows from the density of $L^{2,s}(\mathbb{R}^n)$ in $H^{-1,s}(\mathbb{R}^n)$. For $z = \lambda \in \mathbb{R}$ we use the (just established) weak continuity of the map $z \to (H - z)R(z)$ from $H^{-1,s}$ into $H^{-1,-\sigma}$ in $\overline{\mathbb{C}^{\pm}}$.

The passage “from weak to uniform continuity” (in the operator topology) is a classical argument due to Agmon ([1]). In [8], we have applied it in the case $n = 1$. Here we outline the proof in the case $n > 1$.

We establish first the continuity of the operator-valued function $z \to R(z)$, $z \in \bar\Omega$, in the uniform operator topology of $B(H^{-1,s}(\mathbb{R}^n), L^{2,-\sigma}(\mathbb{R}^n))$.

Let $\{z_k\}_{k=1}^{\infty} \subseteq \bar\Omega$ and $\{f_k\}_{k=1}^{\infty} \subseteq H^{-1,s}(\mathbb{R}^n)$ be sequences such that $z_k \to z \in \bar\Omega$ as $k \to \infty$ and $f_k$ converges weakly to $f$ in $H^{-1,s}(\mathbb{R}^n)$. It suffices to prove that the sequence $u_k = R(z_k)f_k$, which is bounded in $H^{1,-\sigma}(\mathbb{R}^n)$, converges strongly in $L^{2,-\sigma}(\mathbb{R}^n)$. Since this is clear if $\operatorname{Im} z \ne 0$, we can take $z \in [\alpha,\beta]$.

Note first that we can take $\frac12 < \sigma' < \sigma$ so that $s, \sigma'$ satisfy (4.7). Then the sequence $\{u_k\}_{k=1}^{\infty}$ is bounded in $H^{1,-\sigma'}(\mathbb{R}^n)$ and there exists a subsequence $\{u_{k_j}\}_{j=1}^{\infty}$ which converges to a function $u \in L^{2,-\sigma}$. We can further assume that $u_{k_j} \xrightarrow{\ w\ } u$ in $H^{1,-\sigma'}$ as $j \to \infty$.

It follows that the limit function satisfies (see Eq. (5.23))
$$(H - z)u = f.$$
Once again we consider separately the cases $z \ne 0$ and $z = 0$. In the first case, in view of (5.23) and Remark 4.5 the functions $\chi u_k$ are “radiative functions”. Since they are uniformly bounded in $H^{1,-\sigma'}$ their “radiative norms” (4.14) are uniformly bounded, and we conclude that also $\|\mathcal{R}u\| < \infty$. In the second case, we simply note that $u \in H^{1,-\sigma'}$ solves $Hu = f$.

As in the proof of Proposition 5.1 we conclude that in both cases the limit is unique, so that the whole sequence $\{u_k\}_{k=1}^{\infty}$ converges to $u$ in $L^{2,-\sigma}(\mathbb{R}^n)$. Thus, the continuity in the uniform operator topology of $B(H^{-1,s}(\mathbb{R}^n), L^{2,-\sigma}(\mathbb{R}^n))$ is proved.
Finally, we claim that the operator-valued function $z \to R(z)$ is continuous in the uniform operator topology of $B(H^{-1,s}(\mathbb{R}^n), H^{1,-\sigma}(\mathbb{R}^n))$. Indeed, if we invoke Eq. (5.23) we get that also $z \to HR(z)$ is continuous in the uniform operator topology of $B(H^{-1,s}(\mathbb{R}^n), H^{-1,-\sigma}(\mathbb{R}^n))$. Since the domain of $H$ in $H^{-1,-\sigma}(\mathbb{R}^n)$ is $H^{1,-\sigma}(\mathbb{R}^n)$, the claim follows. The conclusion of the theorem follows by taking $\sigma = s$.

Remark 5.2. In view of (5.4) and Remark 4.5 it follows that for $\lambda > 0$ the functions $R^{\pm}(\lambda)f$, $f \in H^{-1,s}$, are “radiative”, i.e. satisfy a Sommerfeld radiation condition.

6. The Eigenfunction Expansion Theorem

In this section we prove Theorem B stated in Sec. 3. We first collect some basic properties of the generalized eigenfunctions in the following proposition.

Proposition 6.1. The generalized eigenfunctions $\varphi_\pm(x,\xi) = \exp(i\xi x) + \psi_\pm(x,\xi)$ (see (3.4)) are in $H^1_{loc}(\mathbb{R}^n)$ for each fixed $\xi \in \mathbb{R}^n$ and satisfy the equation
$$(H - |\xi|^2)\varphi_\pm(x,\xi) = 0. \tag{6.1}$$
In addition, these functions have the following properties:
(i) The map $\mathbb{R}^n \ni \xi \to \psi_\pm(\cdot,\xi) \in H^{1,-s}(\mathbb{R}^n)$, $s > 1$, is continuous.
(ii) For any compact $K \subseteq \mathbb{R}^n$ the family of functions $\{\varphi_\pm(x,\xi),\ \xi \in K\}$ is uniformly bounded and uniformly Hölder continuous in $x \in \mathbb{R}^n$.

Proof. Since $(H - |\xi|^2)\exp(i\xi x) \in H^{-1,s}$, $s > 1$, Eq. (6.1) follows from the definition (3.3) in view of Eq. (5.23). Furthermore, the map
$$\mathbb{R}^n \ni \xi \to (H - |\xi|^2)\exp(i\xi x) \in H^{-1,s}(\mathbb{R}^n), \quad s > 1,$$
is continuous, so the continuity assertion (i) follows from Theorem A. For $s > 1$ the set of functions $\{\psi_\pm(\cdot,\xi),\ \xi \in K\}$ is uniformly bounded in $H^{1,-s}$. Thus, in view of (6.1), it follows from the De Giorgi–Nash–Moser Theorem [35, Chap. 8] that the set $\{\varphi_\pm(x,\xi),\ \xi \in K\}$ is uniformly bounded and uniformly Hölder continuous in $\{|x| < R\}$ for every $R > 0$. In particular, we can take $R > \Lambda_0$ (see Eq. (1.2)). In the exterior domain $\{|x| > R\}$ the set $\{\psi_\pm(x,\xi),\ \xi \in K\}$ is bounded in $H^{1,-s}$, $s > 1$, and we have $(H_0 - |\xi|^2)\psi_\pm(x,\xi) = 0$. In addition the boundary values $\{\psi_\pm(x,\xi),\ |x| = R,\ \xi \in K\}$ are uniformly bounded. From well-known properties of solutions of the Helmholtz equation, we
conclude that this set is uniformly bounded and therefore, invoking once again the De Giorgi–Nash–Moser Theorem, uniformly Hölder continuous.

Proof of Theorem B. We use the LAP proved in Theorem A, adapting the methodology of Agmon’s proof ([1]) for the eigenfunction expansion in the case of Schrödinger operators with short-range potentials. To simplify notation, we give the proof for $\mathbb{F}_+$. Let $u \in H^1$ be compactly supported. For any $z$ such that $\operatorname{Im} z \ne 0$ we can write its Fourier transform as
$$\hat u(\xi) = (2\pi)^{-\frac{n}{2}}\int_{\mathbb{R}^n} u(x)\exp(-i\xi x)\,dx = \frac{(2\pi)^{-\frac{n}{2}}}{|\xi|^2 - z}\int_{\mathbb{R}^n} u(x)\,(H_0 - z)\exp(-i\xi x)\,dx.$$
Let $\theta \in C_0^{\infty}(\mathbb{R}^n)$ be a (real) cutoff function such that $\theta(x) = 1$ for $x$ in a neighborhood of the support of $u$. We can rewrite the above equality as
$$\hat u(\xi) = \frac{(2\pi)^{-\frac{n}{2}}}{|\xi|^2 - z}\,\big\langle (H_0 - z)u(x),\ \theta(x)\exp(i\xi x)\big\rangle,$$
where $\langle\cdot,\cdot\rangle$ is the $(H^{-1,s}, H^{1,-s})$ bilinear pairing (conjugate linear with respect to the second term). We have therefore, with $f = (H - z)u$,
$$\begin{aligned} \hat u(\xi) &= \frac{(2\pi)^{-\frac{n}{2}}}{|\xi|^2 - z}\Big(\big\langle (H - z)u(x),\ \theta(x)\exp(i\xi x)\big\rangle + \big\langle (H_0 - H)\exp(i\xi x),\ u(x)\big\rangle\Big) \\ &= \frac{(2\pi)^{-\frac{n}{2}}}{|\xi|^2 - z}\Big(\big\langle f(x),\ \theta(x)\exp(i\xi x)\big\rangle + \big\langle f(x),\ R(\bar z)(H_0 - H)\exp(i\xi x)\big\rangle\Big). \end{aligned} \tag{6.2}$$
Introducing the function
$$\tilde f(\xi,z) = \hat f(\xi) + (2\pi)^{-\frac{n}{2}}\big\langle f(x),\ R(\bar z)(H_0 - H)\exp(i\xi x)\big\rangle,$$
we have
$$\widehat{R(z)f}(\xi) = \hat u(\xi) = \frac{\tilde f(\xi,z)}{|\xi|^2 - z}, \quad \operatorname{Im} z \ne 0. \tag{6.3}$$
We now claim that this equation is valid for all compactly supported $f \in H^{-1}$. Indeed, let $u = R(z)f \in H^{1,-s}$, $s > 1$. Let $\psi(x) = 1 - \chi(x)$, where $\chi(x)$ is defined in (5.3). We set
$$u_k(x) = \psi(k^{-1}x)u(x), \quad f_k(x) = (H - z)(\psi(k^{-1}x)u(x)), \quad k = 1, 2, 3, \ldots.$$
The equality (6.3) is satisfied with $u, f$ replaced, respectively, by $u_k, f_k$.
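Since
$$\psi(k^{-1}x)u(x) \to u(x) \ \text{ as } k \to \infty$$
in $H^{1,-s}$, we have
$$(H - z)(\psi(k^{-1}x)u(x)) \to (H - z)u = f(x) \ \text{ as } k \to \infty$$
in $H^{-1,-s}$, where in the last step we have used Eq. (5.23). In addition, since $(H_0 - H)\exp(i\xi x)$ is compactly supported,
$$\big\langle f_k(x),\ R(\bar z)(H_0 - H)\exp(i\xi x)\big\rangle = \big\langle (H_0 - H)\exp(i\xi x),\ R(z)f_k(x)\big\rangle \to \big\langle (H_0 - H)\exp(i\xi x),\ R(z)f\big\rangle = \big\langle f,\ R(\bar z)(H_0 - H)\exp(i\xi x)\big\rangle \ \text{ as } k \to \infty.$$
Combining these considerations with the continuity of the Fourier transform (on tempered distributions) we establish that (6.3) is valid for all compactly supported $f \in H^{-1}$.

Let $\{E(\lambda),\ \lambda \in \mathbb{R}\}$ be the spectral family associated with $H$. Let $A(\lambda) = \frac{d}{d\lambda}E(\lambda)$ be its weak derivative. More precisely, we use the well-known formula
$$A(\lambda) = \frac{1}{2\pi i}\lim_{\epsilon\to 0+}\big(R(\lambda + i\epsilon) - R(\lambda - i\epsilon)\big),$$
to get (using Theorem A), for any $f \in H^{-1,s}$, $s > 1$,
$$\langle f, A(\lambda)f\rangle = \frac{1}{2\pi i}\big\langle f,\ (R^{+}(\lambda) - R^{-}(\lambda))f\big\rangle.$$
We now take $f \in L^2$ and compactly supported. From the resolvent equation we infer
$$R(\lambda + i\epsilon) - R(\lambda - i\epsilon) = 2i\epsilon\,R(\lambda + i\epsilon)R(\lambda - i\epsilon), \quad \epsilon > 0,$$
so that
$$\langle f, A(\lambda)f\rangle = \lim_{\epsilon\to 0+}\frac{\epsilon}{\pi}\,\|R(\lambda + i\epsilon)f\|_0^2.$$
Using Eq. (6.3) and Parseval’s theorem we therefore have
$$\langle f, A(\lambda)f\rangle = \lim_{\epsilon\to 0+}\frac{\epsilon}{\pi}\,\big\|(|\xi|^2 - (\lambda + i\epsilon))^{-1}\tilde f(\xi,\lambda + i\epsilon)\big\|_0^2. \tag{6.4}$$
Note that $\tilde f(\xi,z)$ can be extended continuously as $z \to \lambda + i\cdot 0$ by
$$\tilde f(\xi,\lambda) = \hat f(\xi) + (2\pi)^{-\frac{n}{2}}\big\langle f(x),\ R^{-}(\lambda)(H_0 - H)\exp(i\xi x)\big\rangle. \tag{6.5}$$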
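The passage from the resolvent equation to the norm formula used above can be checked in a few lines. The sketch below writes $\epsilon$ for the small positive parameter and $(\cdot,\cdot)$ for the $L^2$ inner product, and uses only that the resolvent of a self-adjoint operator is normal with $R(\lambda - i\epsilon) = R(\lambda + i\epsilon)^{*}$.

```latex
\begin{align*}
R(\lambda + i\epsilon) - R(\lambda - i\epsilon)
  &= \big[(\lambda + i\epsilon) - (\lambda - i\epsilon)\big]\,
     R(\lambda + i\epsilon) R(\lambda - i\epsilon)
   = 2i\epsilon\, R(\lambda + i\epsilon) R(\lambda + i\epsilon)^{*}, \\
\frac{1}{2\pi i}\big( f,\ [R(\lambda + i\epsilon) - R(\lambda - i\epsilon)] f \big)
  &= \frac{\epsilon}{\pi}\,\big( f,\ R(\lambda + i\epsilon) R(\lambda + i\epsilon)^{*} f \big)
   = \frac{\epsilon}{\pi}\,\| R(\lambda + i\epsilon)^{*} f \|_0^2
   = \frac{\epsilon}{\pi}\,\| R(\lambda + i\epsilon) f \|_0^2 .
\end{align*}
```

The last equality is where normality enters: for a normal operator $T$ one has $\|T^{*}f\| = \|Tf\|$.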
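In order to study properties of $\tilde f(\xi,z)$ as a function of $\xi$ we compute
$$\begin{aligned} \tilde f(\xi,z) &= \hat f(\xi) + (2\pi)^{-\frac{n}{2}}\Big\langle \sum_{l,j=1}^{n}\partial_l\,(a_{l,j}(x) - \delta_{l,j})\,\partial_j\exp(i\xi x),\ R(z)f(x)\Big\rangle \\ &= \hat f(\xi) + (2\pi)^{-\frac{n}{2}}\,i\sum_{l,j=1}^{n}\xi_j\int_{\mathbb{R}^n}(a_{l,j}(x) - \delta_{l,j})\,\partial_l(R(z)f(x))\exp(-i\xi x)\,dx, \end{aligned} \tag{6.6}$$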
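where in the last step we have used that both $\partial_l(R(z)f(x))$ and $(a_{l,j}(x) - \delta_{l,j})\exp(-i\xi x)$ are in $L^2$. Consider now the integral
$$g(\xi,z) = \int_{\mathbb{R}^n}(a_{l,j}(x) - \delta_{l,j})\,\partial_l(R(z)f(x))\exp(-i\xi x)\,dx, \quad z \in \Omega,$$
where $\Omega$ is as in (5.1). In view of Theorem A the family $\{\partial_l R(z)f(x)\}_{z\in\Omega}$ is uniformly bounded in $L^{2,-s}$, $s > 1$, so by Parseval’s theorem we get
$$\|g(\cdot,z)\|_0 < C, \quad z \in \Omega,$$
where $C$ only depends on $f$. This estimate and (6.6) imply that, if $f \in L^2$ is compactly supported:
(i) The function
$$\mathbb{R}^n \times \bar\Omega \ni (\xi,z) \to \tilde f(\xi,z)$$
is continuous. For real $z$ it is given by (6.5).
(ii)
$$\lim_{k\to\infty}\int_{|\xi|>k}(|\xi|^2 - z)^{-1}\,|\tilde f(\xi,z)|^2\,d\xi = 0,$$
uniformly in $z \in \Omega$.

As $z \to |\xi|^2 + i\cdot 0$, we have by Theorem A and Eq. (3.4),
$$\lim_{z\to|\xi|^2 + i\cdot 0}\tilde f(\xi,z) = (2\pi)^{-\frac{n}{2}}\int_{\mathbb{R}^n} f(x)\,\overline{\varphi_+(x,\xi)}\,dx = \mathbb{F}_+ f(\xi), \tag{6.7}$$
so that, taking (i) and (ii) into account we obtain from (6.4), for any compactly supported $f \in L^2$,
$$\langle f, A(\lambda)f\rangle = \frac{1}{2\sqrt{\lambda}}\int_{|\xi|^2=\lambda}|\mathbb{F}_+ f(\xi)|^2\,d\sigma, \quad \lambda > 0, \tag{6.8}$$
where $d\sigma$ is the surface Lebesgue measure. It follows that for any $[\alpha,\beta] \subseteq [0,\infty)$,
$$\int_{\alpha}^{\beta}\langle f, A(\lambda)f\rangle\,d\lambda = ((E(\beta) - E(\alpha))f, f) = \int_{\alpha\le|\xi|^2\le\beta}|\mathbb{F}_+ f(\xi)|^2\,d\xi. \tag{6.9}$$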
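The identity (6.9) is, in effect, integration in polar coordinates with $\lambda = |\xi|^2$ as the radial variable. As a hedged numerical illustration (not part of the proof), the sketch below checks it in $\mathbb{R}^3$ for a hypothetical radial profile standing in for $|\mathbb{F}_+ f(\xi)|^2$; the function names, the Gaussian profile and the interval $[0.5, 4]$ are assumptions made for the example.

```python
import math


def midpoint_rule(func, a, b, steps=20000):
    """Composite midpoint approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))


def integral_over_energy_shells(alpha, beta, profile):
    """Integrate (2 sqrt(lam))^{-1} times the surface integral over |xi|^2 = lam, for n = 3."""
    def integrand(lam):
        sphere_area = 4.0 * math.pi * lam  # area of the sphere of radius sqrt(lam) in R^3
        return sphere_area * profile(lam) / (2.0 * math.sqrt(lam))
    return midpoint_rule(integrand, alpha, beta)


def integral_over_annulus(alpha, beta, profile):
    """Integrate profile(|xi|^2) over alpha <= |xi|^2 <= beta using polar coordinates in R^3."""
    def integrand(r):
        return 4.0 * math.pi * r * r * profile(r * r)
    return midpoint_rule(integrand, math.sqrt(alpha), math.sqrt(beta))


def gaussian_profile(lam):
    """Hypothetical radial stand-in for |F_+ f(xi)|^2, as a function of lam = |xi|^2."""
    return math.exp(-lam)


if __name__ == "__main__":
    left = integral_over_energy_shells(0.5, 4.0, gaussian_profile)
    right = integral_over_annulus(0.5, 4.0, gaussian_profile)
    print(left, right)
```

The two quadratures agree up to discretization error, reflecting the change of variables $\lambda = r^2$, $d\lambda = 2\sqrt{\lambda}\,dr$, which is the origin of the factor $(2\sqrt{\lambda})^{-1}$ in (6.8).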
Letting $\alpha \to 0$, $\beta \to \infty$, we get
$$\|f\|_0 = \|\mathbb{F}_+ f\|_0. \tag{6.10}$$
Thus $f \to \mathbb{F}_+ f \in L^2(\mathbb{R}^n)$ is an isometry for compactly supported functions, which can be extended by density to all $f \in L^2(\mathbb{R}^n)$. Furthermore, since the spectrum of $H$ is entirely absolutely continuous, it follows that for every $f \in L^2$, Eq. (6.8) holds for almost all $\lambda > 0$ (with respect to the Lebesgue measure).
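Let $f \in D(H)$. By the spectral theorem
$$\langle Hf, A(\lambda)Hf\rangle = \lambda^2\langle f, A(\lambda)f\rangle = \frac{1}{2\sqrt{\lambda}}\int_{|\xi|^2=\lambda}\big||\xi|^2\,\mathbb{F}_+ f(\xi)\big|^2\,d\sigma, \quad \lambda > 0.$$
In particular,
$$\|Hf\|_0^2 = \int_{\mathbb{R}^n}\big||\xi|^2\,\mathbb{F}_+ f(\xi)\big|^2\,d\xi. \tag{6.11}$$
Conversely, if the right-hand side of (6.11) is finite, then $\int_0^{\infty}\lambda^2\langle f, A(\lambda)f\rangle\,d\lambda < \infty$, so $f \in D(H)$.

The adjoint operator $\mathbb{F}_+^{*}$ is a partial isometry (on the range of $\mathbb{F}_+$). If $f(x) \in L^2(\mathbb{R}^n)$ is compactly supported and $g(\xi) \in L^2(\mathbb{R}^n)$ is likewise compactly supported then
$$(\mathbb{F}_+ f, g) = (2\pi)^{-\frac{n}{2}}\int_{\mathbb{R}^n}\Big(\int_{\mathbb{R}^n} f(x)\,\overline{\varphi_+(x,\xi)}\,dx\Big)\overline{g(\xi)}\,d\xi = \int_{\mathbb{R}^n} f(x)\,\overline{\Big((2\pi)^{-\frac{n}{2}}\int_{\mathbb{R}^n} g(\xi)\,\varphi_+(x,\xi)\,d\xi\Big)}\,dx,$$
where in the change of order of integration Proposition 6.1 was taken into account. It follows that for a compactly supported $g(\xi) \in L^2(\mathbb{R}^n)$,
$$(\mathbb{F}_+^{*}g)(x) = (2\pi)^{-\frac{n}{2}}\int_{\mathbb{R}^n} g(\xi)\,\varphi_+(x,\xi)\,d\xi, \tag{6.12}$$
and the extension to all $g \in L^2(\mathbb{R}^n)$ is obtained by the fact that $\mathbb{F}_+^{*}$ is a partial isometry. Now if $f \in D(H)$, $g \in L^2(\mathbb{R}^n)$, we have
$$(Hf, g) = \int_{\mathbb{R}^n}|\xi|^2\,\mathbb{F}_+ f(\xi)\,\overline{\mathbb{F}_+ g(\xi)}\,d\xi = \int_{\mathbb{R}^n}\mathbb{F}_+^{*}\big(|\xi|^2\,\mathbb{F}_+ f(\xi)\big)\,\overline{g(\xi)}\,d\xi,$$
which is the statement (3.6) of the theorem. It follows from the spectral theorem that for every interval $J = [\alpha,\beta] \subseteq [0,\infty)$ and for every $f \in L^2(\mathbb{R}^n)$ we have, with $E_J = E(\beta) - E(\alpha)$ and $\chi_J$ the characteristic function of $J$,
$$E_J f(x) = \mathbb{F}_+^{*}\big(\chi_J(|\xi|^2)\,\mathbb{F}_+ f(\xi)\big),$$
or
$$\mathbb{F}_+ E_J f(\xi) = \chi_J(|\xi|^2)\,\mathbb{F}_+ f(\xi).$$
It remains to prove that the isometry $\mathbb{F}_+$ is onto (and hence unitary). So, suppose to the contrary that for some nonzero $g(\xi) \in L^2(\mathbb{R}^n)$
$$(\mathbb{F}_+^{*}g)(x) = 0.$$
In particular, for any $f \in L^2(\mathbb{R}^n)$ and any interval $J$ as above,
$$0 = (E_J f, \mathbb{F}_+^{*}g) = (\mathbb{F}_+ E_J f, g) = (\chi_J(|\xi|^2)\,\mathbb{F}_+ f(\xi), g(\xi)) = (\mathbb{F}_+ f(\xi), \chi_J(|\xi|^2)g(\xi)),$$
so that $\mathbb{F}_+^{*}(\chi_J(|\xi|^2)g(\xi)) = 0$.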
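By Eq. (6.12), we have, for any $0 \le \alpha < \beta$,
$$\int_{\alpha<|\xi|^2<\beta} g(\xi)\,\varphi_+(x,\xi)\,d\xi = 0,$$
so that, in view of the continuity properties of $\varphi_+(x,\xi)$ (see Proposition 6.1), for a.e. $\lambda \in (0,\infty)$,
$$\int_{|\xi|^2=\lambda} g(\xi)\,\varphi_+(x,\xi)\,d\sigma = 0. \tag{6.13}$$
From the definition (3.4), we get
$$\int_{|\xi|^2=\lambda} g(\xi)\exp(i\xi x)\,d\sigma - \int_{|\xi|^2=\lambda} g(\xi)\,R^{-}(\lambda)\big((H - \lambda)\exp(i\xi x)\big)\,d\sigma = 0. \tag{6.14}$$
Since $(H - \lambda)\exp(i\xi x)$ is compactly supported (when $|\xi|^2 = \lambda$), the continuity property of $R^{-}(\lambda)$ enables us to write
$$\int_{|\xi|^2=\lambda} g(\xi)\,R^{-}(\lambda)\big((H - \lambda)\exp(i\xi x)\big)\,d\sigma = R^{-}(\lambda)\int_{|\xi|^2=\lambda} g(\xi)\,(H - \lambda)\exp(i\xi x)\,d\sigma,$$
which, by Remark 5.2, satisfies a Sommerfeld radiation condition. We conclude that the function
$$G(x) = \int_{|\xi|^2=\lambda} g(\xi)\exp(i\xi x)\,d\sigma \in H^{1,-s}, \quad s > \tfrac12,$$
is a radiative solution (see Remark 4.5) of $(-\Delta - \lambda)G = 0$, and hence must vanish. Since this holds for a.e. $\lambda > 0$, we get $\hat g(\xi) = 0$, hence $g = 0$.

7. Global Spacetime Estimates

Proof of Theorem C. (a) Define, with $G = H^{\frac12}$,
$$u_\pm = \frac12\,(Gu \pm i\partial_t u). \tag{7.1}$$
Then
$$\partial_t u_\pm = \mp iGu_\pm \pm \frac{i}{2}f. \tag{7.2}$$
Defining
$$U(t) = \begin{pmatrix} u_+(t) \\ u_-(t) \end{pmatrix}, \tag{7.3}$$
we have
$$i^{-1}U'(t) = -KU + F, \qquad K = \begin{pmatrix} G & 0 \\ 0 & -G \end{pmatrix}, \quad F(t) = \begin{pmatrix} \frac12 f(\cdot,t) \\ -\frac12 f(\cdot,t) \end{pmatrix}. \tag{7.4}$$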
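For the reader's convenience, the reduction to first order can be verified directly. The sketch below uses only the definition of $u_\pm$ in (7.1) together with the equation (3.11) written as $\partial_t^2 u = -G^2 u + f$, where $G^2 = H$.

```latex
\begin{align*}
\partial_t u_\pm
  &= \tfrac12\big(G\,\partial_t u \pm i\,\partial_t^2 u\big)
   = \tfrac12\big(G\,\partial_t u \mp i\,G^2 u \pm i f\big), \\
\mp i\,G u_\pm
  &= \mp\tfrac{i}{2}\big(G^2 u \pm i\,G\,\partial_t u\big)
   = \mp\tfrac{i}{2}\,G^2 u + \tfrac12\,G\,\partial_t u, \\
\text{hence}\quad
\partial_t u_\pm &= \mp i\,G u_\pm \pm \tfrac{i}{2} f,
\qquad\text{i.e.}\qquad
i^{-1}\partial_t u_+ = -G u_+ + \tfrac12 f, \quad
i^{-1}\partial_t u_- = G u_- - \tfrac12 f .
\end{align*}
```

The last two identities are the two rows of the system $i^{-1}U' = -KU + F$ with $K = \operatorname{diag}(G, -G)$.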
Note that, as is common when treating evolution equations, we write $U(t), F(t), \ldots$ for $U(x,t), F(x,t), \ldots$ when there is no risk of confusion. The operator $K$ is a self-adjoint operator on $\mathcal{D} = L^2(\mathbb{R}^n) \oplus L^2(\mathbb{R}^n)$. Its spectral family $E_K(\lambda)$ is given by $E_K(\lambda) = E_G(\lambda) \oplus (I - E_G(-\lambda))$, $\lambda \in \mathbb{R}$, where $E_G$ is the spectral family of $G$.

Let $E(\lambda)$ be the spectral family of $H$, and let $A(\lambda) = \frac{d}{d\lambda}E(\lambda)$ be its weak derivative (3.2). By the definition of $G$ we have $E_G(\lambda) = E(\lambda^2)$, hence its weak derivative is given by
$$A_G(\lambda) = \frac{d}{d\lambda}E_G(\lambda) = 2\lambda A(\lambda^2), \quad \lambda > 0. \tag{7.5}$$
In view of the LAP (Theorem A), we therefore have that the operator-valued function $A_G(\lambda) \in B(L^{2,s}(\mathbb{R}^n), L^{2,-s}(\mathbb{R}^n))$ is continuous for $\lambda \ge 0$. Denoting $\mathcal{D}^s = L^{2,s}(\mathbb{R}^n) \oplus L^{2,s}(\mathbb{R}^n)$, it follows that
$$A_K(\lambda) = \frac{d}{d\lambda}E_K(\lambda) = A_G(\lambda) \oplus A_G(-\lambda), \quad \lambda \in \mathbb{R},$$
is continuous with values in $B(\mathcal{D}^s, \mathcal{D}^{-s})$ for $s > 1$.

Making use of Hypotheses (H1) and (H2), we invoke [65, Theorem 5.1] to conclude that $\limsup_{\mu\to\infty}\mu^{\frac12}\|A(\mu)\|_{B(L^{2,s},L^{2,-s})} < \infty$, so that by (7.5) there exists a constant $C > 0$ such that
$$\|A_G(\lambda)\|_{B(L^{2,s},L^{2,-s})} < C, \quad \lambda \ge 0. \tag{7.6}$$
It follows that also
$$\|A_K(\lambda)\|_{B(\mathcal{D}^s,\mathcal{D}^{-s})} < C, \quad s > 1,\ \lambda \in \mathbb{R}. \tag{7.7}$$
Let $\langle\,,\rangle$ be the bilinear pairing between $\mathcal{D}^{-s}$ and $\mathcal{D}^s$ (conjugate linear with respect to the second term). For any $\psi, \chi \in \mathcal{D}^s$ we have, in view of the fact that $A_K(\lambda)$ is a weak derivative of a spectral measure,
$$\begin{aligned} &\text{(i)}\quad |\langle A_K(\lambda)\psi,\chi\rangle|^2 \le \langle A_K(\lambda)\psi,\psi\rangle\cdot\langle A_K(\lambda)\chi,\chi\rangle, \\ &\text{(ii)}\quad \int_{-\infty}^{\infty}\langle A_K(\lambda)\psi,\psi\rangle\,d\lambda = \|\psi\|^2_{L^2(\mathbb{R}^n)\oplus L^2(\mathbb{R}^n)}. \end{aligned} \tag{7.8}$$
We first treat the pure Cauchy problem, i.e. $f \equiv 0$. To estimate $U(x,t) = e^{-itK}U(x,0)$ we use a duality argument. Some of the following computations will be rather formal, but they can easily be justified by
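a density argument, as in [7, 17]. We shall use $((\,,))$ for the scalar product in $L^2(\mathbb{R}^{n+1}) \oplus L^2(\mathbb{R}^{n+1})$. Take $w(x,t) \in C_0^{\infty}(\mathbb{R}^{n+1}) \oplus C_0^{\infty}(\mathbb{R}^{n+1})$. Then
$$\begin{aligned} ((U,w)) &= \int_{-\infty}^{\infty}\int_{\mathbb{R}^n} e^{-itK}U(x,0)\cdot w(x,t)\,dx\,dt \\ &= \int_{-\infty}^{\infty}\Big\langle A_K(\lambda)U(x,0),\ \int_{-\infty}^{\infty} e^{it\lambda}w(\cdot,t)\,dt\Big\rangle\,d\lambda \\ &= (2\pi)^{1/2}\int_{-\infty}^{\infty}\big\langle A_K(\lambda)U(x,0),\ \tilde w(\cdot,\lambda)\big\rangle\,d\lambda, \end{aligned}$$
where
$$\tilde w(x,\lambda) = (2\pi)^{-\frac12}\int_{\mathbb{R}} w(x,t)\,e^{it\lambda}\,dt.$$
Noting (7.8) and (7.7), and using the Cauchy–Schwarz inequality,
$$|((U,w))| \le (2\pi)^{1/2}\,\|U(x,0)\|_0\cdot\Big(\int_{-\infty}^{\infty}\big\langle A_K(\lambda)\tilde w(\cdot,\lambda),\ \tilde w(\cdot,\lambda)\big\rangle\,d\lambda\Big)^{\frac12} \le C\,\|U(x,0)\|_0\cdot\Big(\int_{-\infty}^{\infty}\|\tilde w(\cdot,\lambda)\|^2_{\mathcal{D}^s}\,d\lambda\Big)^{\frac12}.$$
It follows from the Plancherel theorem that
$$|((U,w))| \le C\,\|U(x,0)\|_0\,\Big(\int_{\mathbb{R}}\|w(\cdot,t)\|^2_{\mathcal{D}^s}\,dt\Big)^{\frac12}.$$
Let $\phi(x,t) \in C_0^{\infty}(\mathbb{R}^{n+1}) \oplus C_0^{\infty}(\mathbb{R}^{n+1})$, and take $w(x,t) = (1+|x|^2)^{-\frac{s}{2}}\phi(x,t)$, so that
$$|(((1+|x|^2)^{-\frac{s}{2}}U,\ \phi))| \le C\cdot\|U(x,0)\|_0\cdot\|\phi\|_{L^2(\mathbb{R}^{n+1})}.$$
This concludes the proof of the part involving the Cauchy data in (3.14), in view of (7.3).

To prove the part concerning the inhomogeneous equation, it suffices to take $u_0 = v_0 = 0$. In this case the Duhamel principle yields, for $t > 0$,
$$U(t) = \int_0^{t} e^{-i(t-\tau)K}F(\tau)\,d\tau,$$
where we have used the form (7.4) of the equation.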
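Integrating the inequality
$$\|U(t)\|_{\mathcal{D}^{-s}} \le \int_0^{t}\|e^{-i(t-\tau)K}F(\tau)\|_{\mathcal{D}^{-s}}\,d\tau,$$
we get
$$\int_0^{\infty}\|U(t)\|_{\mathcal{D}^{-s}}\,dt \le \int_0^{\infty}\int_{\tau}^{\infty}\|e^{-i(t-\tau)K}F(\tau)\|_{\mathcal{D}^{-s}}\,dt\,d\tau.$$
Invoking the first part of the proof we obtain
$$\int_0^{\infty}\|U(t)\|_{\mathcal{D}^{-s}}\,dt \le C\int_0^{\infty}\|F(\tau)\|_0\,d\tau,$$
which proves the part related to the inhomogeneous term in (3.14).

(b) Define
$$v_\pm(x,t) = \exp(\pm itG)\,\phi_\pm(x),$$
where
$$\phi_\pm(x) = \frac12\,[u_0(x) \mp iG^{-1}v_0(x)].$$
Then clearly
$$u(x,t) = v_+(x,t) + v_-(x,t). \tag{7.9}$$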
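We establish the estimate (3.15) for $v_+$. Taking $w(x,t) \in C_0^\infty(\mathbb{R}^{n+1})$ we proceed as in the first part of the proof. Let $\langle\,,\rangle$ be the $L^{2,-s}(\mathbb{R}^n)$, $L^{2,s}(\mathbb{R}^n)$ pairing. Then
\[
\begin{aligned}
(v_+, w) &= \int_{-\infty}^{\infty} e^{itG}\phi_+(x) \cdot w(x,t)\,dx\,dt \\
&= \int_0^\infty \Big\langle A_G(\lambda)\phi_+, \int_{-\infty}^{\infty} e^{-it\lambda} w(\cdot,t)\,dt \Big\rangle\,d\lambda \\
&= (2\pi)^{1/2} \int_0^\infty \langle A_G(\lambda)\phi_+, \tilde{w}(\cdot,\lambda)\rangle\,d\lambda,
\end{aligned}
\]
where
\[
\tilde{w}(x,\lambda) = (2\pi)^{-\frac12} \int_{\mathbb{R}} w(x,t)\,e^{-it\lambda}\,dt.
\]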
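Noting (7.6) as well as the inequalities (7.8) (with $A_G$ replacing $A_K$) and using the Cauchy–Schwarz inequality, we obtain
\[
\begin{aligned}
|(v_+, w)| &\le (2\pi)^{1/2}\,\|\phi_+\|_0 \cdot \left( \int_0^\infty \langle A_G(\lambda)\tilde{w}(\cdot,\lambda), \tilde{w}(\cdot,\lambda)\rangle\,d\lambda \right)^{1/2} \\
&\le C\,\|\phi_+\|_0 \cdot \left( \int_0^\infty \|\tilde{w}(\cdot,\lambda)\|^2_{0,s}\,d\lambda \right)^{\frac12}.
\end{aligned}
\]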
The Plancherel theorem yields
\[
|(v_+, w)| \le C\,\|\phi_+\|_0 \left( \int_{\mathbb{R}} \|w(\cdot,t)\|^2_{0,s}\,dt \right)^{1/2}.
\]
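The Plancherel step uses only that the weight acts in the variable $x$ while the Fourier transform acts in $t$. A sketch, assuming the usual weighted norm $\|f\|^2_{0,s} = \int_{\mathbb{R}^n} (1+|x|^2)^{s}|f(x)|^2\,dx$ (this definition is not restated in this section, so it is an assumption here):

```latex
\begin{align*}
\int_0^\infty \|\tilde{w}(\cdot,\lambda)\|^2_{0,s}\,d\lambda
 &\le \int_{\mathbb{R}} \|\tilde{w}(\cdot,\lambda)\|^2_{0,s}\,d\lambda
  = \int_{\mathbb{R}^n} (1+|x|^2)^{s} \left( \int_{\mathbb{R}} |\tilde{w}(x,\lambda)|^2\,d\lambda \right) dx \\
 &= \int_{\mathbb{R}^n} (1+|x|^2)^{s} \left( \int_{\mathbb{R}} |w(x,t)|^2\,dt \right) dx
  = \int_{\mathbb{R}} \|w(\cdot,t)\|^2_{0,s}\,dt .
\end{align*}
```

The middle equality is the Plancherel identity in $t$ for each fixed $x$; the outer interchanges are justified because $w$ is smooth and compactly supported.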
Let $\omega \in C_0^\infty(\mathbb{R}^{n+1})$, and take $w(x,t) = (1+|x|^2)^{-\frac{s}{2}}\omega(x,t)$, so that
\[
|((1+|x|^2)^{-\frac{s}{2}}v_+, \omega)| \le C \cdot \|\phi_+\|_0 \cdot \|\omega\|_{L^2(\mathbb{R}^{n+1})}.
\]
This (with the similar estimate for $v_-$) concludes the proof of the estimate (3.15).

Remark 7.1 (Optimality of the Requirement $s > 1$). A key point in the proof was the use of the uniform bound (7.6). In view of the relation (7.5), this is reduced to the uniform boundedness of $\lambda A(\lambda^2)$, $\lambda \ge 0$, in $B(L^{2,s}, L^{2,-s})$. By [65, Theorem 5.1] the boundedness at infinity, $\limsup_{\mu\to\infty} \mu^{\frac12}\|A(\mu)\| < \infty$, holds already with $s > \frac12$. Thus the further restriction $s > 1$ is needed in order to ensure the boundedness at $\lambda = 0$ (Theorem A).

Remark 7.2. Clearly we can take $[0, T]$ as the time interval, instead of $\mathbb{R}$, for any $T > 0$.

Acknowledgments

This work was partially done during my visits to the Department of Mathematics at Stanford University (Spring 2004) and the Department of Mathematics of the Université de Provence (Marseille, Spring 2006). I am grateful for the hospitality of both departments with special thanks to Professors Rafe Mazzeo and Yves Dermenjian. In addition, very stimulating discussions with S. Agmon, K. Hidano, Y. Pinchover, M. Ruzhansky, M. Sugimoto and T. Umeda are happily acknowledged. The author thanks the referee for calling his attention to the works [19–21].

References

[1] S. Agmon, Spectral properties of Schrödinger operators and scattering theory, Ann. Sc. Norm. Super. Pisa 2 (1975) 151–218.
[2] S. Agmon, J. Cruz-Sampedro and I. Herbst, Spectral properties of Schrödinger operators with potentials of order zero, J. Funct. Anal. 167 (1999) 345–369.
[3] Y. Ameur and B. Walther, Smoothing estimates for the Schrödinger equation with an inverse-square potential, preprint (2007).
[4] M. Beals and W. Strauss, $L^p$ estimates for the wave equation with a potential, Comm. Partial Differential Equations 18 (1993) 1365–1397.
[5] M. Ben-Artzi, Unitary equivalence and scattering theory for Stark-like Hamiltonians, J. Math. Phys. 25 (1984) 951–964.
[6] M. Ben-Artzi, Global estimates for the Schrödinger equation, J. Funct. Anal. 107 (1992) 362–368.
[7] M. Ben-Artzi, Regularity and smoothing for some equations of evolution, in Nonlinear Partial Differential Equations and Their Applications; Collège de France Seminar, Longman Scientific, Vol. 11, eds. H. Brezis and J. L. Lions (Longman Sci. Tech. 1994), pp. 1–12.
[8] M. Ben-Artzi, On spectral properties of the acoustic propagator in a layered band, J. Differential Equations 136 (1997) 115–135.
[9] M. Ben-Artzi, Spectral theory for divergence-form operators, in Spectral and Scattering Theory and Related Topics, ed. H. Ito, Vol. 1607 (RIMS Kokyuroku, 2008), pp. 77–84.
[10] M. Ben-Artzi, Y. Dermenjian and J.-C. Guillot, Analyticity properties and estimates of resolvent kernels near thresholds, Comm. Partial Differential Equations 25 (2000) 1753–1770.
[11] M. Ben-Artzi, Y. Dermenjian and A. Monsef, Resolvent kernel estimates near thresholds, Differential Integral Equations 19 (2006) 1–14.
[12] M. Ben-Artzi and A. Devinatz, The limiting absorption principle for a sum of tensor products and applications to the spectral theory of differential operators, J. Anal. Math. 43 (1983/84) 215–250.
[13] M. Ben-Artzi and A. Devinatz, The Limiting Absorption Principle for Partial Differential Operators, Memoirs of the AMS, Vol. 364 (Amer. Math. Soc., 1987).
[14] M. Ben-Artzi and A. Devinatz, Local smoothing and convergence properties for Schrödinger-type equations, J. Funct. Anal. 101 (1991) 231–254.
[15] M. Ben-Artzi and A. Devinatz, Regularity and decay of solutions to the Stark evolution equations, J. Funct. Anal. 154 (1998) 501–512.
[16] M. Ben-Artzi and S. Klainerman, Decay and regularity for the Schrödinger equation, J. Anal. Math. 58 (1992) 25–37.
[17] M. Ben-Artzi and J. Nemirovsky, Remarks on relativistic Schrödinger operators and their extensions, Ann. Inst. H. Poincaré 67 (1997) 29–39.
[18] Ju. M. Berezanskii, Expansion in Eigenfunctions of Selfadjoint Operators, Translations of Mathematical Monographs, Vol. 17 (Amer. Math. Soc., 1968).
[19] J.-F. Bony and D. Häfner, The semilinear wave equation on asymptotically Euclidean manifolds, arXiv:0810.0464.
[20] J.-F. Bony and D. Häfner, Low frequency resolvent estimates for long range perturbations of the Euclidean Laplacian, arXiv:0903.5531.
[21] J.-M. Bouclet, Low frequency estimates for long range perturbations in divergence form, arXiv:0806.3377.
[22] A. Boutet de Monvel-Berthier and D. Manda, Spectral and scattering theory for wave propagation in perturbed stratified media, J. Math. Anal. Appl. 191 (1995) 137–167.
[23] F. E. Browder, The eigenfunction expansion theorem for the general self-adjoint singular elliptic partial differential operator. I. The analytical foundation, Proc. Natl. Acad. Sci. 40 (1954) 454–459.
[24] N. Burq, Semi-classical estimates for the resolvent in nontrapping geometries, Int. Math. Res. Not. 5 (2002) 221–241.
[25] N. Burq, Global Strichartz estimates for nontrapping geometries: About an article by H. Smith and C. Sogge, Comm. Partial Differential Equations 28 (2003) 1675–1683.
[26] H. Chihara, Smoothing effects of dispersive pseudodifferential equations, Comm. Partial Differential Equations 27 (2002) 1953–2005.
[27] A. Cohen and T. Kappeler, Scattering and inverse scattering for steplike potentials in the Schrödinger equation, Indiana Univ. Math. J. 34 (1985) 127–180.
[28] C. Cohen-Tannoudji, B. Diu and F. Laloë, Quantum Mechanics (John Wiley, 1977).
[29] E. Croc and Y. Dermenjian, Analyse spectrale d'une bande acoustique multistratifiée. Partie I: Principe d'absorption limite pour une stratification simple, SIAM J. Math. Anal. 26 (1995) 880–924.
[30] P. D'Ancona and L. Fanelli, Strichartz and smoothing estimates for dispersive equations with magnetic potentials, Comm. Partial Differential Equations 33 (2008) 1082–1112.
[31] S. DeBièvre and W. Pravica, Spectral analysis for optical fibers and stratified fluids I: The limiting absorption principle, J. Funct. Anal. 98 (1991) 404–436.
[32] V. G. Deich, E. L. Korotayev and D. R. Yafaev, Theory of potential scattering, taking into account spatial anisotropy, J. Soviet Math. 34 (1986) 2040–2050.
[33] S.-I. Doi, Smoothing effects of Schrödinger evolution groups on Riemannian manifolds, Duke Math. J. 82 (1996) 679–706.
[34] D. M. Eidus, The principle of limiting absorption, in American Mathematical Society Translations, Series 2, Vol. 47 (Amer. Math. Soc., Providence, 1965), pp. 157–192. (Originally in Russian, Mat. Sb. 57 (1962) 13–44).
[35] D. Gilbarg and N. S. Trudinger, Elliptic Partial Differential Equations of Second Order (Springer-Verlag, 1977).
[36] M. Goldberg and W. Schlag, A limiting absorption principle for the three-dimensional Schrödinger equation with $L^p$ potentials, Int. Math. Res. Not. 75 (2004) 4049–4071.
[37] I. Herbst, Spectral theory of the operator $(p^2 + m^2)^{\frac12} - Ze^2/r$, Comm. Math. Phys. 53 (1977) 285–294.
[38] I. Herbst, Spectral and scattering theory for Schrödinger operators with potentials independent of $|x|$, Amer. J. Math. 113 (1991) 509–565.
[39] K. Hidano, Morawetz–Strichartz estimates for spherically symmetric solutions to wave equations and applications to semilinear Cauchy problems, Differential Integral Equations 20 (2007) 735–754.
[40] K. Hidano, J. Metcalfe, H. F. Smith, C. D. Sogge and Y. Zhou, On abstract Strichartz estimates and the Strauss conjecture for nontrapping obstacles, to appear in Trans. Amer. Math. Soc. (2009); http://front.math.ucdavis.edu/0805.1673.
[41] L. Hörmander, The Analysis of Linear Partial Differential Operators II (Springer-Verlag, 1983).
[42] T. Hoshiro, On weighted $L^2$ estimates of solutions to wave equations, J. Anal. Math. 72 (1997) 127–140.
[43] T. Hoshiro, Decay and regularity for dispersive equations with constant coefficients, J. Anal. Math. 91 (2003) 211–230.
[44] T. Ikebe, Eigenfunction expansions associated with the Schrödinger operators and their application to scattering theory, Arch. Ration. Mech. Anal. 5 (1960) 1–34.
[45] T. Ikebe and Y. Saito, Limiting absorption method and absolute continuity for the Schrödinger operators, J. Math. Kyoto Univ. Ser. A 7 (1972) 513–542.
[46] T. Ikebe and T. Tayoshi, Wave and scattering operators for second-order elliptic operators in $\mathbb{R}^n$, Publ. RIMS Kyoto Univ. Ser. A 4 (1968) 483–496.
[47] A. D. Ionescu and W. Schlag, Agmon–Kato–Kuroda theorems for a large class of perturbations, Duke Math. J. 131 (2006) 397–440.
[48] M. Kadowaki, Low and high energy resolvent estimates for wave propagation in stratified media and their applications, J. Differential Equations 179 (2002) 246–277.
[49] M. Kadowaki, Resolvent estimates and scattering states for dissipative systems, Publ. RIMS Kyoto Univ. Ser. A 38 (2002) 191–209.
[50] T. Kato, Perturbation Theory for Linear Operators (Springer-Verlag, 1966).
[51] T. Kato and S. T. Kuroda, The abstract theory of scattering, Rocky Mountain J. Math. 1 (1971) 127–171.
[52] T. Kato and K. Yajima, Some examples of smooth operators and the associated smoothing effect, Rev. Math. Phys. 1 (1989) 481–496.
[53] K. Kikuchi and H. Tamura, The limiting amplitude principle for acoustic propagators in perturbed stratified fluids, J. Differential Equations 93 (1991) 260–282.
[54] V. G. Maz'ya and T. O. Shaposhnikova, Theory of Sobolev Multipliers (Springer-Verlag, 2008).
[55] K. Mochizuki, Scattering theory for wave equations with dissipative terms, Publ. RIMS Kyoto Univ. Ser. A 12 (1976) 383–390.
[56] C. S. Morawetz, Time decay for the Klein–Gordon equation, Proc. Roy. Soc. Ser. A 306 (1968) 291–296.
[57] K. Morii, Time-global smoothing estimates for a class of dispersive equations with constant coefficients, Ark. Mat. 46 (2008) 363–375.
[58] E. Mourre, Absence of singular continuous spectrum for certain self-adjoint operators, Comm. Math. Phys. 78 (1980/81) 391–408.
[59] M. Murata and T. Tsuchida, Asymptotics of Green functions and the limiting absorption principle for elliptic operators with periodic coefficients, J. Math. Kyoto Univ. 46 (2006) 713–754.
[60] P. Perry, I. M. Sigal and B. Simon, Spectral analysis of $N$-body Schrödinger operators, Ann. Math. 114 (1981) 519–567.
[61] B. Perthame and L. Vega, Morrey–Campanato estimates for Helmholtz equations, J. Funct. Anal. 164 (1999) 340–355.
[62] B. Perthame and L. Vega, Energy decay and Sommerfeld condition for Helmholtz equation with variable index at infinity, preprint (2002).
[63] A. Ja. Povzner, The expansion of arbitrary functions in terms of eigenfunctions of the operator $-\Delta u + cu$, in American Mathematical Society Translations, Series 2, Vol. 60 (Amer. Math. Soc., 1966), pp. 1–49. (Originally in Russian, Mat. Sb. 32 (1953) 109–156).
[64] A. G. Ramm, Justification of the limiting absorption principle in $\mathbb{R}^2$, in Operator Theory and Applications, Fields Institute Communications, Vol. 25, eds. A. G. Ramm, P. N. Shivakumar and A. V. Strauss (Amer. Math. Soc., 2000), pp. 433–440.
[65] D. Robert, Asymptotique de la phase de diffusion à haute énergie pour des perturbations du second ordre du laplacien, Ann. Sci. Ecole Norm. Sup. (4) 25 (1992) 107–134.
[66] M. Ruzhansky and M. Sugimoto, Global $L^2$-boundedness theorems for a class of Fourier integral operators, Comm. Partial Differential Equations 31 (2006) 547–569.
[67] M. Ruzhansky and M. Sugimoto, A smoothing property of Schrödinger equations in the critical case, Math. Ann. 335 (2006) 645–673.
[68] Y. Saito, Spectral Representations for Schrödinger Operators with Long-Range Potentials, Lecture Notes in Mathematics, Vol. 727 (Springer-Verlag, 1979).
[69] B. Simon, Best constants in some operator smoothness estimates, J. Funct. Anal. 107 (1992) 66–71.
[70] C. D. Sogge, Lectures on Non-Linear Wave Equations, 2nd edn. (International Press, 2008).
[71] W. A. Strauss, Nonlinear Wave Equations, CBMS Lectures, Vol. 73 (Amer. Math. Soc., 1989).
[72] R. S. Strichartz, Restrictions of Fourier transforms to quadratic surfaces and decay of solutions of wave equations, Duke Math. J. 44 (1977) 705–714.
[73] M. Sugimoto, Global smoothing properties of generalized Schrödinger equations, J. Anal. Math. 76 (1998) 191–204.
[74] T. Umeda, Generalized eigenfunctions of relativistic Schrödinger operators I, Electronic J. Differential Equations 127 (2006) 1–46.
[75] A. Vasy and J. Wunsch, Positive commutators at the bottom of the spectrum, J. Funct. Anal. 259 (2010) 503–523.
[76] G. Vodev, Local energy decay of solutions to the wave equation for short-range potentials, Asymptot. Anal. 37 (2004) 175–187.
[77] B. G. Walther, A sharp weighted $L^2$-estimate for the solution to the time-dependent Schrödinger equation, Ark. Mat. 37 (1999) 381–393.
November 16, 2010 15:28 WSPC/S0129-055X 148-RMP J070-S0129055X10004211
Reviews in Mathematical Physics Vol. 22, No. 10 (2010) 1241–1243 © World Scientific Publishing Company DOI: 10.1142/S0129055X10004211
REVIEWS IN MATHEMATICAL PHYSICS Author Index Volume 22 (2010)
Barreira, L., Almost additive thermodynamic formalism: Some recent developments, 10 (2010) 1147
Bassi, A., Dürr, D. & Kolb, M., On the long time behavior of free stochastic Schrödinger evolutions, 1 (2010) 55
Ben-Artzi, M., Eigenfunction expansions and spacetime estimates for generators in divergence-form, 10 (2010) 1209
Ben Halima, M., Construction of certain fuzzy flag manifolds, 5 (2010) 533
Brain, S. & Landi, G., The 3D spin geometry of the quantum two-sphere, 8 (2010) 963
Bru, J.-B. & de Siqueira Pedra, W., Effect of a locally repulsive interaction on s-wave superconductors, 3 (2010) 233
Chatterjee, S., Lahiri, A. & Sengupta, A. N., Parallel transport over path spaces, 9 (2010) 1033
Daudé, T. & Nicoleau, F., Inverse scattering in de Sitter–Reissner–Nordström black hole spacetimes, 4 (2010) 431
de Oliveira, G., Asymptotics for Fermi curves: Small magnetic potential, 8 (2010) 881
De Roeck, W., Maes, C., Netočný, K. & Rey-Bellet, L., A note on the non-commutative Laplace–Varadhan integral lemma, 7 (2010) 839
de Siqueira Pedra, W., see Bru, J.-B., 3 (2010) 233
Demirel, S. & Harrell, II, E. M., On semiclassical and universal inequalities for eigenvalues of quantum graphs, 3 (2010) 305
Dimassi, M. & Petkov, V., Spectral shift function for operators with crossed magnetic and electric fields, 4 (2010) 355
Dirr, G., see Schulte-Herbrüggen, T., 6 (2010) 597
Dürr, D., see Bassi, A., 1 (2010) 55
Fehér, L. & Pusztai, B. G., Derivations of the trigonometric $BC_n$ Sutherland model by quantum Hamiltonian reduction, 6 (2010) 699
Glaser, S. J., see Schulte-Herbrüggen, T., 6 (2010) 597
Grigorian, S., Moduli spaces of $G_2$ manifolds, 9 (2010) 1061
Guha, P., Euler–Poincaré flows on the loop Bott–Virasoro group and space of tensor densities and (2 + 1)-dimensional integrable systems, 5 (2010) 485
Harrell, II, E. M., see Demirel, S., 3 (2010) 305
Helmke, U., see Schulte-Herbrüggen, T., 6 (2010) 597
Hidaka, T. & Hiroshima, F., Pauli–Fierz model with Kato-class potentials and exponential decays, 10 (2010) 1181
Hiroshima, F., see Hidaka, T., 10 (2010) 1181
Ichinose, W., On the Feynman path integral for nonrelativistic quantum electrodynamics, 5 (2010) 549
Jenčová, A. & Ruskai, M. B., A unified treatment of convexity of relative entropy and related trace functions, with conditions for equality, 9 (2010) 1099
Jensen, A. & Yajima, K., Spatial growth of fundamental solutions for certain perturbations of the harmonic oscillator, 2 (2010) 193
Kolb, M., see Bassi, A., 1 (2010) 55
Kriz, I., Perturbative deformations of conformal field theories revisited, 2 (2010) 117
Kusuoka, S. & Liang, S., A classical mechanical model of Brownian motion with plural particles, 7 (2010) 733
Lahiri, A., see Chatterjee, S., 9 (2010) 1033
Landi, G., see Brain, S., 8 (2010) 963
Liang, S., see Kusuoka, S., 7 (2010) 733
Longo, R., Martinetti, P. & Rehren, K.-H., Geometric modular action for disjoint intervals and boundary conformal field theory, 3 (2010) 331
Maes, C., see De Roeck, W., 7 (2010) 839
Marin, L., Dynamical bounds for Sturmian Schrödinger operators, 8 (2010) 859
Martinetti, P., see Longo, R., 3 (2010) 331
Matte, O. & Stockmeyer, E., Spectral theory of no-pair Hamiltonians, 1 (2010) 1
Morsella, G. & Tomassini, L., From global symmetries to local currents: The free (scalar) case in four dimensions, 1 (2010) 91
Nachtergaele, B., Schlein, B., Sims, R., Starr, S. & Zagrebnov, V., On the existence of the dynamics for anharmonic quantum oscillator systems, 2 (2010) 207
Netočný, K., see De Roeck, W., 7 (2010) 839
Nicoleau, F., see Daudé, T., 4 (2010) 431
Petkov, V., see Dimassi, M., 4 (2010) 355
Porta, M. & Simonella, S., Borel summability of $\varphi^4_4$ planar theory via multiscale analysis, 9 (2010) 995
Pusztai, B. G., see Fehér, L., 6 (2010) 699
Rehren, K.-H., see Longo, R., 3 (2010) 331
Rey-Bellet, L., see De Roeck, W., 7 (2010) 839
Robert, D., On the Herman–Kluk semiclassical approximation, 10 (2010) 1123
Ruskai, M. B., see Jenčová, A., 9 (2010) 1099
Sanders, K., The locally covariant Dirac field, 4 (2010) 381
Sango, M., Density dependent stochastic Navier–Stokes equations with non-Lipschitz random forcing, 6 (2010) 669
Schlein, B., see Nachtergaele, B., 2 (2010) 207
Schulte-Herbrüggen, T., Glaser, S. J., Dirr, G. & Helmke, U., Gradient flows for optimization in quantum information and quantum dynamics: Foundations and applications, 6 (2010) 597
Sengupta, A. N., see Chatterjee, S., 9 (2010) 1033
Simonella, S., see Porta, M., 9 (2010) 995
Sims, R., see Nachtergaele, B., 2 (2010) 207
Starr, S., see Nachtergaele, B., 2 (2010) 207
Stockmeyer, E., see Matte, O., 1 (2010) 1
Tomassini, L., see Morsella, G., 1 (2010) 91
Yajima, K., see Jensen, A., 2 (2010) 193
Zagrebnov, V., see Nachtergaele, B., 2 (2010) 207
Zhang, R. B. & Zhang, X., Projective module description of embedded noncommutative spaces, 5 (2010) 507
Zhang, X., see Zhang, R. B., 5 (2010) 507