February 11, 2009 13:36 WSPC/148-RMP
J070-00357
Reviews in Mathematical Physics Vol. 21, No. 1 (2009) 1–59 © World Scientific Publishing Company
BREATHERS IN INHOMOGENEOUS NONLINEAR LATTICES: AN ANALYSIS VIA CENTER MANIFOLD REDUCTION
GUILLAUME JAMES∗, BERNARDO SÁNCHEZ-REY†,‡ and JESÚS CUEVAS†,§
∗Institut National Polytechnique de Grenoble and CNRS, Laboratoire Jean Kuntzmann (UMR 5224), tour IRMA, BP 53, 38041 Grenoble Cedex 9, France
†Grupo de Física No Lineal, Universidad de Sevilla, Departamento de Física Aplicada I, Escuela Universitaria Politécnica, c/. Virgen de África 7, 41011-Sevilla, Spain
∗[email protected]
‡[email protected]
§[email protected]
Received 24 April 2008
Revised 6 October 2008
We consider an infinite chain of particles linearly coupled to their nearest neighbors and subject to an anharmonic local potential. The chain is assumed weakly inhomogeneous, i.e. coupling constants, particle masses and on-site potentials can have small variations along the chain. We look for small amplitude and time-periodic solutions, and, in particular, spatially localized ones (discrete breathers). The problem is reformulated as a nonautonomous recurrence in a space of time-periodic functions, where the dynamics is considered along the discrete spatial coordinate. Generalizing to nonautonomous maps a center manifold theorem previously obtained for infinite-dimensional autonomous maps [44], we show that small amplitude oscillations are determined by finite-dimensional nonautonomous mappings, whose dimension depends on the solution frequency. We consider the case of two-dimensional reduced mappings, which occur for frequencies close to the edges of the phonon band (computed for the unperturbed homogeneous chain). For a homogeneous chain, the reduced map is autonomous and reversible, and bifurcations of reversible homoclinic orbits or heteroclinic solutions are found for appropriate parameter values. These orbits correspond respectively to discrete breathers for the infinite chain, or “dark” breathers superposed on a spatially extended standing wave. Breather existence is shown in some cases for any value of the coupling constant, which generalizes (for small amplitude solutions) an existence result obtained by MacKay and Aubry at small coupling [57]. For an inhomogeneous chain, the study of the nonautonomous reduced map is in general far more involved. Here, the problem is considered when the chain presents a finite number of defects. For the principal part of the reduced recurrence, using the assumption of weak inhomogeneity, we show
∗ Corresponding author.
that homoclinics to 0 exist when the image of the unstable manifold under a linear transformation (depending on the defect sequence) intersects the stable manifold. This provides a geometrical understanding of tangent bifurcations of discrete breathers commonly observed in classes of systems with impurities as defect strengths are varied. The case of a mass impurity is studied in detail, and our geometrical analysis is successfully compared with direct numerical simulations. In addition, a class of homoclinic orbits is shown to persist for the full reduced mapping and yields a family of discrete breathers with maximal amplitude at the impurity site.
Keywords: Nonlinear lattices; spatial homogeneities; discrete breathers; center manifold reduction; homoclinic orbit; bifurcations.
Mathematics Subject Classification 2000: 37L60, 37K60, 37K50, 82C44, 37L10, 34C25, 34C37
1. Introduction
It is now well established that many nonlinear networks of interacting particles sustain time-periodic and spatially localized oscillations commonly denoted as discrete breathers. In spatially periodic systems, breathers are also called intrinsically localized modes [74] in distinction to Anderson modes of disordered linear systems [5]. The properties of discrete breathers have been analyzed in a large number of numerical works (see the reviews [28, 81, 22]) and their existence in periodic systems has been proved analytically in different contexts, see [57, 12, 72, 7, 13, 64, 26, 44] and references therein. In the context of numerical simulations or experiments, discrete breathers often denote a larger class of spatially localized oscillations, such as metastable states, oscillations with a certain degree of periodicity, or even chaotic oscillations interacting with a noisy extended background [42, 33]. Nonlinear waves of this type are now actually detected in real materials [70, 77, 71, 24, 58] and also generated in artificial systems such as Josephson junction arrays, micromechanical cantilever arrays and coupled optical waveguides (see references in [17]). They are thought to play a role in various physical processes such as the formation of local fluctuational openings in the DNA molecule [66, 67], which occurs in particular during thermal denaturation experiments. Beyond spatially periodic systems, it is a fundamental and challenging problem to understand breather properties in nonlinear and inhomogeneous media, such as non-periodic or disordered crystals, amorphous solids and biological macromolecules. For example, the interplay between nonlinearity and disorder can provide an alternative interpretation for slow relaxation processes in glasses [54, 55].
In quasi-one-dimensional media, moving localized waves interacting with impurities [52, 19, 29], extended defects [78] or local bends of the lattice (see [20] and its references) can remain trapped and release vibrational energy at specific sites. The modeling of thermal denaturation of DNA and the analysis of its local fluctuational openings, also known as denaturation bubbles, represents another problem where heterogeneity is important. In order to describe these phenomena, a nonlinear model at the scale of the DNA base pair has been introduced by Peyrard and Bishop [66] and further improved by Dauxois et al. [21]. The model describes the stretching $x_n(t)$ of the H-bonds between two bases, in the $n$th base pair along a
DNA molecule (a large value of $x_n$ corresponding to a local opening). Each bond fluctuates in an effective anharmonic potential $V$ and interacts with its nearest neighbors. The model is described by a Hamiltonian system, and can be coupled with a thermostat to study the effect of thermal noise in denaturation experiments. This model accurately describes the thermal denaturation of certain real DNA segments provided their heterogeneity is taken into account [16]. In its simplest form, the model incorporates different dissociation energies for the adenine-thymine (AT) and guanine-cytosine (GC) base pairs. The Hamiltonian of the system reads
\[ H = \sum_{n=-\infty}^{+\infty} \frac{m}{2}\dot{x}_n^2 + V_{s_n}(x_n) + \frac{k}{2}\left(1 + \rho e^{-\beta(x_{n+1}+x_n)}\right)(x_{n+1} - x_n)^2, \tag{1} \]
where $V_{s_n}(x) = D_{s_n}(1 - e^{-a_{s_n} x})^2$ is a Morse potential depending on the base pair sequence $s_n \in \{AT, GC\}$. The case $\rho = 0$ yields a particular case of a Klein–Gordon lattice, i.e. the model consists of a chain of anharmonic oscillators with harmonic nearest-neighbor coupling. For parameters corresponding to real DNA sequences, Langevin molecular dynamics of (1) have shown that some locations of discrete breathers heavily depend on the sequence and seem to coincide with functional sites in DNA [48], but at the present time this topic remains controversial [80]. From a mathematical point of view, Albanese and Fröhlich have proved the existence of breathers for a class of random Hamiltonian systems describing an infinite array of coupled anharmonic oscillators [3] (see also the earlier work [31] of Fröhlich et al. concerning quasiperiodic localized oscillations). These breather families can be parametrized by the solution frequencies, which belong to fat Cantor sets (i.e. with nonzero Lebesgue measure) of asymptotically full relative measure in the limit of zero amplitude. These solutions are nonlinear “continuations” of a given Anderson mode from the limit of zero amplitude, and the gaps in their frequency values originate from a dense set of resonances present in the system. For disordered Klein–Gordon lattices, complementary numerical results on the continuation of breathers with respect to frequency or the transition from breathers to Anderson modes are available in [54, 55, 6]. In addition, the existence of breathers in inhomogeneous Klein–Gordon lattices (with disordered on-site potentials) has been proved by Sepulchre and MacKay [72, 73] for small coupling $k$. The proof is based on the continuation method previously introduced by MacKay and Aubry [57] for a homogeneous chain (a method considerably generalized in [72]).
For $k = 0$, the system reduces to an array of uncoupled non-identical anharmonic oscillators, and the simplest type of discrete breather consists of a single particle oscillating while the others are at rest. Under a nonresonance condition [72, 73], this solution can be continued to small values of $k$ (in most cases at fixed frequency) using the implicit function theorem, yielding a spatially localized solution.
In this paper, we provide complementary mathematical tools for studying time-periodic oscillations (not necessarily spatially localized) in inhomogeneous infinite lattices. The theory is developed in a very general framework, and applied to
breather bifurcations in inhomogeneous Klein–Gordon lattices as lattice parameters and breather frequencies are varied. We start from a general Klein–Gordon lattice with Hamiltonian
\[ H = \sum_{n=-\infty}^{+\infty} \frac{M_n}{2}\dot{x}_n^2 + D_n V(A_n x_n) + \frac{K_n}{2}(x_{n+1} - x_n)^2 \tag{2} \]
(case $\rho = 0$ of (1) with more general inhomogeneities). The potential $V$ is assumed sufficiently smooth in a neighborhood of 0 with $V'(0) = 0$, $V''(0) = 1$. The general theory is a priori valid for small inhomogeneities and small amplitude oscillations. In particular, in our application to system (2) we assume $M_n$, $D_n$, $A_n$, $K_n$ to be close (uniformly in $n$) to positive constants. However, considering an example of a Klein–Gordon lattice with a mass defect, we check using numerical computations that our tools remain applicable up to strongly nonlinear regimes, and sometimes for a large inhomogeneity. Our analysis is based on a center manifold reduction and the concept of spatial dynamics. This concept was introduced by Kirchgässner [51] for nonlinear elliptic PDE in infinite strips, considered as (ill-posed) evolution problems in the unbounded space coordinate, and locally reduced to a finite-dimensional ODE on an invariant center manifold. This idea was transposed to the context of traveling waves in homogeneous infinite oscillator chains by Iooss and Kirchgässner [38], and center manifold reduction has been subsequently applied to the analysis of traveling waves and pulsating traveling waves in different one-dimensional homogeneous lattices [39, 46, 40, 75, 65, 41]. Indeed, looking for traveling waves in an oscillator chain yields an advance-delay differential equation (a system of such equations in the case of pulsating traveling waves), which can be reformulated as an infinite-dimensional evolution problem in the moving frame coordinate, and locally reduced to a finite-dimensional ODE under appropriate spectral conditions. In [43], one of us has proved the existence of breathers in Fermi–Pasta–Ulam (FPU) lattices using a similar technique in a discrete context.
The dynamical equations for time-periodic solutions were reformulated as an infinite-dimensional recurrence relation in a space of time-periodic functions, and then locally reduced to a finite-dimensional mapping on a center manifold, where breathers corresponded to homoclinic orbits to 0. A general center manifold theorem for infinite-dimensional maps with unbounded linearized operator has been proved subsequently [44] and has been used to analyze breather bifurcations in diatomic FPU lattices [45, 47] and spin lattices [62]. More generally, the dynamical equations of many one-dimensional lattices can be reformulated as infinite-dimensional maps in loop spaces as one looks for small amplitude time-periodic oscillations ([44, Sec. 6.1]). As shown in the present paper, the center manifold reduction theorem readily applies to homogeneous Klein–Gordon lattices, where $M_n = m$, $D_n = d$, $A_n = a$, $K_n = k$ in (2) and $m, d, a, k > 0$. This reduction result rigorously justifies (in the weakly nonlinear regime) a formal one-Fourier mode approximation previously
introduced in [15]. The equations of motion read
\[ m\frac{d^2 x_n}{dt^2} + d\,a\,V'(a x_n) = k(x_{n+1} - 2x_n + x_{n-1}), \quad n \in \mathbb{Z}. \tag{3} \]
Looking for time-periodic solutions (with frequency $\omega$) and setting $x_n(t) = \tilde{x}_n(\omega t)$, (3) can be formulated as an (ill-posed) recurrence relation $(\tilde{x}_{n+1}, \tilde{x}_n) = F(\tilde{x}_n, \tilde{x}_{n-1})$ in a space of $2\pi$-periodic functions. Using the theorem of [44], one can locally reduce the problem to a finite-dimensional mapping on a center manifold whose dimension depends on the frequency $\omega$. More precisely, Eq. (3) linearized at $x_n = 0$ admits solutions in the form of linear waves (phonons) with $x_n(t) = A\cos(qn - \omega_q t)$, whose frequency satisfies the dispersion relation
\[ m\omega_q^2 = a^2 d + 2k(1 - \cos q). \tag{4} \]
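As a small numerical illustration of the dispersion relation (4), the following sketch evaluates $\omega_q$ and the band edges, which are attained at $q = 0$ and $q = \pi$. The function names and the parameter values are illustrative assumptions and do not come from the paper.

```python
import math

def phonon_frequency(q, m, d, a, k):
    """Nonnegative root omega_q of m*omega_q**2 = a**2*d + 2*k*(1 - cos q), Eq. (4)."""
    return math.sqrt((a ** 2 * d + 2.0 * k * (1.0 - math.cos(q))) / m)

def phonon_band(m, d, a, k):
    """Band edges (omega_min, omega_max), reached at q = 0 and q = pi respectively."""
    return (phonon_frequency(0.0, m, d, a, k),
            phonon_frequency(math.pi, m, d, a, k))

# Hypothetical parameter values, chosen only for illustration.
omega_min, omega_max = phonon_band(m=1.0, d=1.0, a=1.0, k=0.5)
```

With these values, $\omega_{\min}^2 = a^2 d/m$ and $\omega_{\max}^2 = (a^2 d + 4k)/m$, so the lower edge stays strictly positive for any $d > 0$.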
The frequencies $\omega_q$ lie in a band $[\omega_{\min}, \omega_{\max}]$ with $\omega_{\min} > 0$. In the nonlinear case, the dimension of the center manifold depends on how many multiples of $\omega$ belong to (or are close to) the phonon band. When $\omega \approx \omega_{\max}$ or $\omega \approx \omega_{\min}$ (with no additional resonance), the center manifold is two-dimensional if solutions are searched even in time, which reduces (3) locally to a two-dimensional reversible mapping on the center manifold. For appropriate parameter values, this map admits small amplitude homoclinic solutions to 0 corresponding to breather solutions of (3). Breather solutions in this system have been proved to exist by MacKay and Aubry [57] for small values of the coupling parameter $k$. Known regions of breather existence are considerably extended here, since we prove the existence of small amplitude breathers for arbitrary values of $k$ in some cases and for frequencies close to the phonon band edges (see Theorem 7). In addition, we prove the existence of “dark breather” solutions, which converge towards a nonlinear standing wave as $n \to \pm\infty$ and have a much smaller amplitude at the center of the chain. These solutions correspond to heteroclinic orbits of the reduced two-dimensional map.
Furthermore, we extend this analysis to the case when small lattice inhomogeneities are present. The dynamical equations of the inhomogeneous system (2) take the form
\[ M_n\frac{d^2 x_n}{dt^2} + D_n A_n V'(A_n x_n) = K_n(x_{n+1} - x_n) - K_{n-1}(x_n - x_{n-1}), \quad n \in \mathbb{Z}, \tag{5} \]
and time-periodic solutions can be obtained as orbits of a nonautonomous map $(x_{n+1}, x_n) = F(\lambda_n, x_n, x_{n-1})$, where the nonconstant lattice parameters are embedded in a multicomponent parameter $\lambda_n$. Fixing $M_n = m + m_n$, $D_n = d + d_n$, $A_n = a + a_n$, $K_n = k + k_n$, we consider the case when constant lattice parameters $m, d, a, k > 0$ are perturbed by uniformly small sequences $(m_n)_{n\in\mathbb{Z}}$, $(d_n)_{n\in\mathbb{Z}}$, $(a_n)_{n\in\mathbb{Z}}$, $(k_n)_{n\in\mathbb{Z}}$. We prove (see Theorem 3) that small amplitude time-periodic solutions with frequencies close to $\omega_{\min}$ or $\omega_{\max}$ are determined by a two-dimensional nonautonomous map. Moreover, we generalize this reduction result in
the case when several multiples of $\omega$ are close to the band $[\omega_{\min}, \omega_{\max}]$, which yields a higher-dimensional reduced problem (see Theorem 4). In fact, we prove this type of reduction result in a very general framework, for infinite-dimensional mappings with small nonautonomous perturbations, considered in a neighborhood of a non-hyperbolic fixed point, or close to a bifurcation. The linear autonomous part of the map must satisfy a property of spectral separation (see Theorem 1), but a large number of one-dimensional lattices with finite-range coupling fall within this category. We obtain a direct proof of the reduction result by observing that any nonautonomous mapping $u_{n+1} = F(\lambda_n, u_n)$ can be seen as a projection of an extended autonomous mapping, to which the center manifold theorem of [44] can be applied under appropriate assumptions. The center manifold of the extended map is infinite-dimensional, but this case is also covered in [44]. The reduced nonautonomous mapping for the original system can be interpreted as a projection on a finite-dimensional subspace of the extended autonomous mapping restricted to the invariant center manifold. We use this reduction result to analyze the case when Eq. (5) presents a mass defect at a single site, all other lattice parameters being independent of $n$. In that case, the linearized problem admits a spatially localized mode (usually denoted as an impurity mode or defect mode), and a nonlinear continuation of this mode can be computed [29], corresponding to a Lyapunov family of periodic orbits. Klein–Gordon systems with a coupling defect or a harmonic impurity in the on-site potential share similar characteristics [19], as well as nonlinear lattices with a different type of nonlinearity [53]. In addition to this simple localization phenomenon, single impurities can have more complex effects in a nonlinear system.
Indeed it is a common feature to observe a complex sequence of tangent bifurcations between (deformations of) site-centered and bond-centered breathers in some neighborhood of the defect as the strength of an impurity is varied [20, 53]. Using numerical computations we show some examples of such bifurcations in the present paper, as one varies the strength of a mass defect in system (5). From a physical point of view, it is quite important to understand how a local change in the lattice parameters modifies the set of spatially localized solutions. For example, this could help explain how a mutation at a specific location of a homogeneous sequence of (artificial) DNA would modify the structure of fluctuational openings [48]. This paper provides a qualitative explanation of such tangent bifurcations, which also proves quantitatively very accurate when compared with numerical simulations of the Klein–Gordon model. According to the previously described reduction theorem, for a small mass defect of size $\epsilon$, small amplitude breather solutions of (5) with frequencies below (and close to) $\omega_{\min}$ are described by a two-dimensional nonautonomous mapping $v_{n+1} = f(v_n, \omega) + g(n, v_n, \omega, \epsilon)$. Here we consider the principal part of the reduced mapping as $(v_n, \omega, \epsilon) \approx (0, \omega_{\min}, 0)$. We show that this truncated reduced map admits a homoclinic orbit to 0 (corresponding to an approximate breather solution for the oscillator chain) if, for $\epsilon = 0$, the image of the unstable manifold of 0 under a certain linear shear intersects its stable manifold.
The linear shear is $O(\epsilon)$-close to the identity. When the on-site potential is soft (i.e. the period of small oscillations in this potential increases with amplitude), these manifolds have very complicated windings characteristic of homoclinic chaos, hence the set of their intersections changes in a complex way as the linear shear varies, or equivalently as one varies the mass defect. This phenomenon explains the existence of the above-mentioned tangent bifurcations, at least for small defect sizes, and for small amplitude breathers with frequencies close to the phonon band. In addition, we show (by comparison with direct numerical simulations of the Klein–Gordon model) that this picture remains valid quite far from the weakly nonlinear regime. Let us note that, to obtain an exact solution of (5) from an orbit of the truncated map, it would be necessary to control the effect of higher order terms (with respect to $v_n$, $\omega - \omega_{\min}$, $\epsilon$) present in the full reduced mapping and prove the persistence of this solution. This result is obtained for Lyapunov families of periodic orbits, which correspond here to discrete breathers with maximal amplitude at the impurity site (see Theorem 8 in Sec. 4.1.5). In that case, the corresponding orbits of the reduced mapping appear through a pitchfork bifurcation when $\omega$ reaches the linear defect mode frequency. The persistence of the above-mentioned tangent bifurcations of discrete breathers is a much more complex problem, whose study would require asymptotic techniques beyond all algebraic orders (more details in this respect are given in Sec. 4.1.4). This problem is not examined here from the analytical side, but we instead compare numerically computed solutions of (5) with approximate solutions deduced from the truncated map. The very good agreement leads us to conjecture that most of the tangent bifurcations existing for the truncated problem persist for the full reduced system.
Lastly we consider the more general case when system (5) admits a finite number of defects, i.e. perturbations $m_n$, $d_n$, $a_n$, $k_n$ have a compact support (as above these perturbations are assumed to be small, of order $\epsilon$). We show that the approach developed for a single impurity can be extended to this case (see Lemma 7), where the linear shear is replaced by a more general linear near-identity transformation $A_\epsilon$. The linear transformation $A_\epsilon$ provides a useful tool for studying breather bifurcations in Klein–Gordon lattices with a finite number of impurities, as for the single impurity case that we have analyzed in detail. By computing the principal part of $A_\epsilon$ when $\epsilon$ is small and frequencies are close to $\omega_{\min}$, we show that the effect of the parameter sequence $(\lambda_n)_{n\in\mathbb{Z}}$ on the set of small amplitude breather solutions should mainly depend on weighted averages of the defect values.
The outline of the paper is as follows. Section 2 presents the center manifold reduction theory for time-periodic oscillations in weakly inhomogeneous nonlinear lattices. We treat the case of Klein–Gordon lattices in detail in Secs. 2.1 and 2.3, and formulate the reduction theory in a much more general setting in Sec. 2.2. Section 3 concerns spatially homogeneous Klein–Gordon lattices. Existence theorems for small amplitude breather and dark breather solutions are deduced from the dynamics of two-dimensional reversible maps on invariant center manifolds. The case of weakly inhomogeneous Klein–Gordon chains is considered in Sec. 4, where
the truncated reduced map is analyzed for a finite number of defects. A geometrical condition for the existence of homoclinic orbits to 0 is derived in Sec. 4.2, and some homoclinic bifurcations are studied in detail in Sec. 4.1 for a single mass defect. In the latter case, breather solutions are numerically computed in Sec. 5 and the results are successfully compared with our analytical findings.
2. Reduction Result for Small Inhomogeneities
In this section, we consider system (5) in the limit of small inhomogeneities. We show that all small amplitude time-periodic solutions are determined by a finite-dimensional nonautonomous map, whose dimension depends on the frequency domain under consideration. For this purpose we reformulate (5) as a map in a loop space, perturbed by a small nonautonomous term (Sec. 2.1). Then we prove in Sec. 2.2 a general center manifold reduction theorem for infinite-dimensional maps with small nonautonomous perturbations. This result is based on the center manifold theorem proved in [44] for autonomous systems. Our general result is applied to the inhomogeneous Klein–Gordon lattice, which yields the above mentioned reduction result (Sec. 2.3).
2.1. The Klein–Gordon system as a map in a loop space
We set $x_n(t) = y_n(\omega (k/m)^{1/2} t)$ in Eq. (5), where $y_n$ is $2\pi$-periodic in $t$ (hence $x_n$ is time-periodic with frequency $\omega (k/m)^{1/2}$). The constant $a > 0$ being fixed, we also define $\tilde{V}(x) = a^{-2} V(ax)$. Equation (5) becomes
\[ \omega^2 (1 + \epsilon_n)\frac{d^2 y_n}{dt^2} + \Omega^2 (1 + \eta_n)\tilde{V}'((1 + \gamma_n) y_n) = y_{n+1} - y_n - (1 + \kappa_n)(y_n - y_{n-1}), \quad n \in \mathbb{Z}, \tag{6} \]
where $\Omega^2 = a^2 d/k$ and $1 + \epsilon_n = (1 + \frac{m_n}{m})/(1 + \frac{k_n}{k})$, $1 + \eta_n = (1 + \frac{d_n}{d})(1 + \frac{a_n}{a})/(1 + \frac{k_n}{k})$, $\gamma_n = \frac{a_n}{a}$, $1 + \kappa_n = (1 + \frac{k_{n-1}}{k})/(1 + \frac{k_n}{k})$. The sequences $(\epsilon_n)_{n\in\mathbb{Z}}$, $(\eta_n)_{n\in\mathbb{Z}}$, $(\gamma_n)_{n\in\mathbb{Z}}$, $(\kappa_n)_{n\in\mathbb{Z}}$ will be assumed sufficiently small in $\ell_\infty(\mathbb{Z})$, where $\ell_\infty(\mathbb{Z})$ is the classical Banach space of bounded sequences on $\mathbb{Z}$, equipped with the supremum norm. To simplify the notations, we shall drop the tilde in the sequel when referring to the renormalized potential $\tilde{V}$. Moreover we shall use the shorter notation $\{\epsilon\}$ when referring to sequences $(\epsilon_n)_{n\in\mathbb{Z}}$. To analyze system (6), we use the same approach as in [44] for spatially homogeneous systems. We reformulate (6) as a (nonautonomous) recurrence relation in a space of $2\pi$-periodic functions of $t$, and locally reduce the (spatial) dynamics to one on a finite-dimensional center manifold. We restrict our attention to the case when $y_n$ is even in $t$ in order to deal with lower-dimensional problems. More precisely, we assume $y_n \in H^2_\#$ for all $n \in \mathbb{Z}$, where $H^n_\# = \{y \in H^n_{\mathrm{per}}(0, 2\pi),\ y \text{ is even}\}$ and $H^n_{\mathrm{per}}(0, 2\pi)$ denotes the classical Sobolev space of $2\pi$-periodic functions ($H^0_{\mathrm{per}}(0, 2\pi) = L^2_{\mathrm{per}}(0, 2\pi)$).
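The rescaled sequences entering Eq. (6) follow directly from the lattice perturbations $m_n$, $d_n$, $a_n$, $k_n$. The sketch below evaluates them on a finite window of sites. It assumes (this is not stated in the text) that the perturbations vanish outside the window, so that $k_{n-1} = 0$ is used at the first site; the function name and the sample values are hypothetical.

```python
def rescaled_parameters(m, d, a, k, m_pert, d_pert, a_pert, k_pert):
    """Return lists (eps, eta, gamma, kappa) from the definitions following Eq. (6).

    m_pert[n], d_pert[n], a_pert[n], k_pert[n] hold the perturbations
    m_n, d_n, a_n, k_n on a finite window. Assumption: perturbations
    vanish outside the window, so k_{-1} = 0 at the first site.
    """
    eps, eta, gamma, kappa = [], [], [], []
    for n in range(len(m_pert)):
        k_ratio = 1.0 + k_pert[n] / k
        k_prev = k_pert[n - 1] if n > 0 else 0.0
        eps.append((1.0 + m_pert[n] / m) / k_ratio - 1.0)
        eta.append((1.0 + d_pert[n] / d) * (1.0 + a_pert[n] / a) / k_ratio - 1.0)
        gamma.append(a_pert[n] / a)
        kappa.append((1.0 + k_prev / k) / k_ratio - 1.0)
    return eps, eta, gamma, kappa

# Single mass defect at the middle site (illustrative values only).
zeros = [0.0, 0.0, 0.0]
eps, eta, gamma, kappa = rescaled_parameters(
    1.0, 1.0, 1.0, 1.0, [0.0, 0.1, 0.0], zeros, zeros, zeros)
```

For a pure mass defect only $\{\epsilon\}$ is nonzero, which matches the single-impurity situation discussed in the Introduction.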
Since our analysis concerns small amplitude solutions and small inhomogeneities, the first step consists in studying the linearized system at $y_n = 0$ when $\epsilon_n$, $\eta_n$, $\gamma_n$, $\kappa_n$ are fixed equal to 0. In that case, Eq. (6) yields
\[ \omega^2\frac{d^2 y_n}{dt^2} + \Omega^2 y_n = y_{n+1} - 2y_n + y_{n-1}, \quad n \in \mathbb{Z}. \tag{7} \]
Now we rewrite the problem as an infinite-dimensional linear mapping. For this purpose we introduce $Y_n = (y_{n-1}, y_n) \in D$, where $D = H^2_\# \times H^2_\#$. Equation (7) can be written
\[ Y_{n+1} = A_\omega Y_n, \quad n \in \mathbb{Z}, \tag{8} \]
where
\[ A_\omega(z, y) = \left(y,\ \omega^2\frac{d^2 y}{dt^2} + (\Omega^2 + 2)y - z\right) \tag{9} \]
and Eq. (8) holds in $X = H^2_\# \times H^0_\#$. The operator $A_\omega : D \subset X \to X$ is unbounded in $X$ (of domain $D$) and closed (we omit the additional parameter $\Omega$ in the notation $A_\omega$). The spectrum of $A_\omega$ consists of essential spectrum at the origin and an infinite number of eigenvalues $\sigma_p$, $\sigma_p^{-1}$ ($p \geq 0$) depending on $\omega$, $\Omega$, and satisfying the dispersion relation
\[ \sigma^2 + (\omega^2 p^2 - \Omega^2 - 2)\sigma + 1 = 0 \tag{10} \]
(it follows that $\sigma_p$ is either real or has modulus one). Equation (10) is directly obtained by setting $y_n = \sigma^n \cos(pt)$ in Eq. (7). The invariance $\sigma \to \sigma^{-1}$ in (10) originates from the invariance $n \to -n$ in (7). In the sequel, we shall denote by $\sigma_p$ the solution of (10) satisfying $|\sigma_p| \geq 1$ and $\mathrm{Im}\,\sigma_p \leq 0$. Clearly $\sigma_p$ is real negative for $p$ large enough and $\lim_{p\to+\infty}\sigma_p = -\infty$. Moreover, $\sigma_p^{-1}$ accumulates at $\sigma = 0$ as $p \to +\infty$. It follows that the number of eigenvalues of $A_\omega$ on the unit circle is finite for any value of the parameters $\omega$, $\Omega$. In addition, the eigenvalues $\sigma_p$, $\sigma_p^{-1}$ defined by (10) lie on the unit circle when $\Omega \leq \omega p \leq (4 + \Omega^2)^{1/2}$. This property has a simple interpretation. Multiplying (10) by $\sigma^{-1}$, setting $\sigma = e^{iq}$ and $\omega_q = \omega p (k/m)^{1/2}$, one finds the usual dispersion relation (4). Consequently, if $\omega p (k/m)^{1/2}$ lies inside the phonon band $[\omega_{\min}, \omega_{\max}]$ for some $p \in \mathbb{N}$, then $A_\omega$ admits a pair of eigenvalues $e^{\pm iq}$ on the unit circle determined by the dispersion relation (4). This condition on $\omega$ is equivalent to prescribing $\Omega \leq \omega p \leq (4 + \Omega^2)^{1/2}$.
Now let us describe the spectrum of $A_\omega$ near the unit circle when $\Omega > 0$ is fixed and $\omega$ is varied. As we shall see, the number of eigenvalues of $A_\omega$ on the unit circle changes as $\omega$ crosses an infinite sequence of decreasing critical values $\omega_1 > \omega_2 > \cdots > 0$. Small amplitude solutions of the nonlinear system bifurcating from $y_n = 0$ will be found near these critical frequencies. We begin by studying the evolution of each pair of eigenvalues $\sigma_p$, $\sigma_p^{-1}$ as $\omega$ varies. Firstly, one can easily check that $\sigma_0$, $\sigma_0^{-1}$ are independent of $\omega$, real positive and lie strictly off the unit circle.
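The root selection for the dispersion relation (10) can be checked numerically. The following is a minimal sketch, assuming the convention $|\sigma_p| \geq 1$, $\mathrm{Im}\,\sigma_p \leq 0$ stated above; the function names and the sample values of $\omega$, $\Omega$ are illustrative and not taken from the paper.

```python
import math

def sigma_p(p, omega, Omega):
    """Root of sigma**2 + b*sigma + 1 = 0, b = omega^2 p^2 - Omega^2 - 2 (Eq. (10)),
    selected so that |sigma| >= 1 and Im(sigma) <= 0."""
    b = (omega * p) ** 2 - Omega ** 2 - 2.0
    if abs(b) <= 2.0:
        # Complex-conjugate pair (or double root +-1) on the unit circle.
        return complex(-b / 2.0, -math.sqrt(4.0 - b * b) / 2.0)
    # Two real roots sigma, 1/sigma: keep the one of larger modulus.
    root = math.sqrt(b * b - 4.0)
    return complex((-b - math.copysign(root, b)) / 2.0, 0.0)

def on_unit_circle(p, omega, Omega):
    """Criterion from the text: Omega <= omega*p <= sqrt(4 + Omega^2)."""
    return Omega <= omega * p <= math.sqrt(4.0 + Omega ** 2)

# Illustrative values: Omega = 1.5, so that sqrt(4 + Omega^2) = 2.5.
sigma0 = sigma_p(0, 2.5, 1.5)       # p = 0: real positive, off the unit circle
sigma1_top = sigma_p(1, 2.5, 1.5)   # omega = sqrt(4 + Omega^2): double root at -1
sigma1_bottom = sigma_p(1, 1.5, 1.5)  # omega = Omega: double root at +1
```

The companion eigenvalue is always `1 / sigma_p(...)`, reflecting the invariance $\sigma \to \sigma^{-1}$ of (10).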
Secondly, we consider the case $p \geq 1$. For $\omega > \sqrt{4 + \Omega^2}/p$, $\sigma_p$, $\sigma_p^{-1}$ are real negative and lie strictly off the unit circle. When $\omega$ decreases, they approach the unit circle and one has $\sigma_p = \sigma_p^{-1} = -1$ for $\omega = \sqrt{4 + \Omega^2}/p$ (this corresponds to a frequency $\omega_q$ at the top of the phonon band, for a wavenumber $q = \pi$). At this critical parameter value, $\sigma_p = -1$ is a double non semi-simple eigenvalue of $A_\omega$. For $\Omega/p < \omega < \sqrt{4 + \Omega^2}/p$, $\sigma_p$, $\sigma_p^{-1}$ lie on the unit circle, and approach $+1$ as $\omega$ decreases. One has $\sigma_p = \sigma_p^{-1} = 1$ for $\omega = \Omega/p$, and then $+1$ is a double non semi-simple eigenvalue of $A_\omega$ (this corresponds to a frequency $\omega_q$ at the bottom of the phonon band, for a wavenumber $q = 0$). For $\omega < \Omega/p$, $\sigma_p$, $\sigma_p^{-1}$ are real positive and lie strictly off the unit circle.
Now let us qualitatively describe the evolution of the whole spectrum of $A_\omega$. When $\omega > \sqrt{4 + \Omega^2}$ the spectrum of $A_\omega$ lies strictly off the unit circle (both inside and outside the unit disc). When $\omega$ decreases, the eigenvalues $\sigma_p$ approach the unit circle for all $p \geq 1$. As the first critical value $\omega_1 = \sqrt{4 + \Omega^2}$ is reached, the eigenvalues $\sigma_1$, $\sigma_1^{-1}$ collide and yield a double (non semi-simple) eigenvalue $\sigma_1 = -1$, while the remaining part of the spectrum is hyperbolic. When $\omega$ is further decreased, two different situations occur depending on the value of $\Omega$.
For $\Omega > 2/\sqrt{3}$, $\sigma_1$, $\sigma_1^{-1}$ are the only eigenvalues on the unit circle for $\Omega \leq \omega \leq \sqrt{4 + \Omega^2}$. One has $\sigma_1 = \sigma_1^{-1} = 1$ at the second critical value $\omega_2 = \Omega$. When $\omega$ is further decreased, $\sigma_1$, $\sigma_1^{-1}$ are real positive and lie strictly off the unit circle. One has $\sigma_2 = \sigma_2^{-1} = -1$ at the third critical value $\omega_3 = \sqrt{4 + \Omega^2}/2 < \Omega$. The situation is sketched in Fig. 1.
The case $\Omega < 2/\sqrt{3}$ is different, since $\sigma_1$, $\sigma_1^{-1}$ are the only eigenvalues on the unit circle in the smaller frequency range $\sqrt{4 + \Omega^2}/2 < \omega \leq \sqrt{4 + \Omega^2}$. Indeed one has $\sigma_2 = \sigma_2^{-1} = -1$ at the second critical value $\omega_2 = \sqrt{4 + \Omega^2}/2 > \Omega$. For $\omega < \sqrt{4 + \Omega^2}/2$ and $\omega \approx \sqrt{4 + \Omega^2}/2$ the spectrum of $A_\omega$ on the unit circle consists of two pairs of simple eigenvalues $\sigma_1$, $\sigma_1^{-1}$, $\sigma_2$, $\sigma_2^{-1}$. In the interval $\Omega < \omega < \sqrt{4 + \Omega^2}/2$ other eigenvalues may collide at $-1$ depending on the value of $\Omega$. The situation is sketched in Fig. 2.
In what follows, we restrict our attention to the neighborhood of critical frequencies $\omega \approx \omega_2$ with $\Omega > 2/\sqrt{3}$, and $\omega \approx \omega_1$. This leads us to consider the small parameter $\mu$ defined by $\omega^2 = \omega_i^2 + \mu$. As $\omega$ equals one of the critical frequencies
Fig. 1. Spectrum of $A_\omega$ near the unit circle as $\omega$ is varied, in the case $\Omega > 2/\sqrt{3}$. The unbounded part of the spectrum on the negative real axis is not shown. The arrows indicate how the eigenvalues have moved from their positions in the previous graph, after $\omega$ has been decreased.
Fig. 2. Spectrum of $A_\omega$ near the unit circle as $\omega$ is varied, in the case $\Omega < 2/\sqrt{3}$. The unbounded part of the spectrum on the negative real axis is not shown. The arrows indicate how the eigenvalues have moved from their positions in the previous graph, after $\omega$ has been decreased.
$\omega_1$, $\omega_2$, the spectrum of $A_\omega$ on the unit circle consists only of a double eigenvalue $-1$ or $+1$, isolated from the hyperbolic part of the spectrum. For $\omega \approx \omega_1$ and $\omega \approx \omega_2$, this splitting of the spectrum of $A_\omega$ will allow us to reduce (6) locally to a map on a two-dimensional invariant center manifold (see Sec. 2.2). In addition, the above spectral analysis shows that the fixed point $Y = 0$ of (8) is hyperbolic when $\omega > \omega_1$ or $\omega < \omega_2$ and $\omega \approx \omega_2$. In this case, when nonlinear effects are taken into account, we shall see that the stable and unstable manifolds $W^s(0)$, $W^u(0)$ may intersect depending on the local properties of the anharmonic potential $V$, leading to the existence of homoclinic orbits to $Y = 0$. Although we shall restrict to the cases $\omega \approx \sqrt{4 + \Omega^2}$ and $\omega \approx \Omega$, it is interesting to give some comments on the situation when $\omega$ is close to the other critical values $\sqrt{4 + \Omega^2}/p$ and $\Omega/p$, for an integer $p \geq 2$. Clearly if $y_n$ is a $2\pi$-periodic solution of (6) for a given value of $\omega$, then so is $y_n(pt)$ when $\omega$ is replaced by $\omega/p$. Consequently, all solutions $y_n$ obtained for $\omega \approx \sqrt{4 + \Omega^2}$ or $\omega \approx \Omega$ provide additional solutions $y_n(pt)$ for $\omega \approx \sqrt{4 + \Omega^2}/p$ or $\omega \approx \Omega/p$. These additional solutions are “artificial”, since they become equal to the previous ones if one goes back to the unscaled system (5). However, they should be embedded in larger families of small amplitude solutions if $A_\omega$ possesses additional pairs of eigenvalues on the unit circle (this is the case e.g. for $\Omega < 2/\sqrt{3}$ and $\omega \approx \omega_2 = \sqrt{4 + \Omega^2}/2$, see Fig. 2). Another interesting remark can be made in the case when $\omega \approx \Omega$ and $\Omega < 2/\sqrt{3}$. The dimension of the center manifold depends on $\Omega$ and is higher than 2 (at least 4). The bifurcations of homoclinic solutions become much harder to analyze because slow hyperbolic modes coexist with fast oscillatory modes.
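The count of eigenvalues of $A_\omega$ on the unit circle, and hence the dimension of the center subspace of the linearized map, can be sketched as follows. This is an illustration under stated assumptions: each harmonic $p \geq 1$ with $\Omega \leq \omega p \leq \sqrt{4 + \Omega^2}$ is taken to contribute two dimensions (a pair $\sigma_p$, $\sigma_p^{-1}$ or a double eigenvalue $\pm 1$, counted with algebraic multiplicity), in the even-in-time setting used here. The function name and sample values are hypothetical.

```python
import math

def center_dimension(omega, Omega):
    """Number of eigenvalues of A_omega on the unit circle, counted with
    algebraic multiplicity, for omega > 0 and Omega > 0 (even-in-time setting)."""
    upper = math.sqrt(4.0 + Omega ** 2)
    dim = 0
    p = 1
    while omega * p <= upper:
        if omega * p >= Omega:
            dim += 2  # pair sigma_p, 1/sigma_p, or a double eigenvalue +-1
        p += 1
    return dim

dim_large_Omega = center_dimension(1.5, 1.5)  # omega = Omega > 2/sqrt(3)
dim_small_Omega = center_dimension(1.0, 1.0)  # omega = Omega < 2/sqrt(3)
dim_hyperbolic = center_dimension(3.0, 1.5)   # omega above sqrt(4 + Omega^2)
```

In the second case the harmonic $p = 2$ also falls inside the band, which gives the higher-dimensional situation (dimension at least 4) mentioned above.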
In that case, subtle bifurcation phenomena beyond all algebraic orders can be expected, such as the existence of orbits homoclinic to exponentially small periodic or quasi-periodic orbits whose size could not be cancelled in general. Such phenomena have been analyzed e.g. in [56] for reversible flows, when only one oscillatory mode coexists with hyperbolic modes close to bifurcation (i.e. for reversible solutions homoclinic to exponentially small periodic orbits). This situation occurs in particular for bifurcations of traveling waves and pulsating traveling waves (traveling breathers) in different one-dimensional lattices [38, 46, 40, 75].

System (6) will be analyzed in the limit of small amplitude solutions and for small parameters $\mu$, $\{\epsilon\}$, $\{\eta\}$, $\{\gamma\}$, $\{\kappa\}$. The parameter space will be denoted as
$E = \mathbb{R} \times (\ell_\infty(\mathbb{Z}))^4$. All parameters are embedded in a multicomponent parameter $\{\lambda\} = (\mu, \{\epsilon\}, \{\eta\}, \{\gamma\}, \{\kappa\}) \in E$. In addition we denote by $\tau_n$ the index shift in $\ell_\infty(\mathbb{Z})$, i.e. $\{\tau_n\{\epsilon\}\}_k = \epsilon_{n+k}$.

Equation (6) can be rewritten in the form of a nonautonomous mapping in a function space. More precisely we have
\[ Y_{n+1} = L Y_n + N(Y_n, \lambda_n), \quad n \in \mathbb{Z}, \tag{11} \]
where $Y_n = (y_{n-1}, y_n) = (z_n, y_n) \in D$, $\lambda_n = (\mu, \epsilon_n, \eta_n, \gamma_n, \kappa_n) \in \mathbb{R}^5$, $L = A_{\omega_i}$ (for $i = 1$ or $2$) and
\[ N(z, y, \lambda_n) = (0, N_2(z, y, \lambda_n)), \]
\[ N_2(z, y, \lambda_n) = (\omega_i^2 \epsilon_n + \mu(1 + \epsilon_n)) \frac{d^2 y}{dt^2} + \Omega^2 [(1 + \eta_n)(1 + \gamma_n) - 1]\, y + \kappa_n (y - z) + W(y, \eta_n, \gamma_n), \tag{12} \]
\[ W(y, \eta, \gamma) = \Omega^2 (1 + \eta)\left(V'[(1 + \gamma) y] - (1 + \gamma) y\right). \]
Equation (11) holds in the Hilbert space $X$. The potential $V$ is assumed sufficiently smooth ($C^{p+1}$, with $p \geq 5$) in a neighborhood of 0. It follows that $N : D \times \mathbb{R}^5 \to X$ is $C^k$ ($k = p - 2 \geq 3$) in a neighborhood of $(Y, \lambda) = 0$. The operator $N$ consists of higher order terms as $(Y, \lambda_n) \approx 0$, i.e. we have $N(0, \lambda) = 0$, $D_Y N(0, 0) = 0$. We note that (11) is invariant under the symmetry $T Y = Y(\cdot + \pi)$. Moreover the usual invariance under index shifts $\{Y\} \to \tau_1\{Y\}$ is broken by the inhomogeneity of the lattice, and replaced by the invariance $(\{Y\}, \{\lambda\}) \to (\tau_1\{Y\}, \tau_1\{\lambda\})$.

In the next section, we prove a general center manifold reduction theorem for maps having the form (11), under appropriate spectral conditions on $L$ and for small nonautonomous perturbations $\{\lambda\} \in E$. This analysis relies on the reduction results proved in [44] for autonomous maps. To simplify the proof, problem (11) will be considered as a projection of a suitable autonomous mapping to which the center manifold theorem can be directly applied.
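
As a short check of the properties $N(0, \lambda) = 0$ and $D_Y N(0, 0) = 0$, one can differentiate the on-site term directly. The sketch below assumes the normalization $V'(0) = 0$, $V''(0) = 1$, which is not restated in this section but is consistent with the expansion of $W$ used later in (40), and takes $W(y, \eta, \gamma) = \Omega^2 (1 + \eta)(V'[(1 + \gamma) y] - (1 + \gamma) y)$:
\[ W(0, \eta, \gamma) = \Omega^2 (1 + \eta)\, V'(0) = 0, \qquad \partial_y W(0, \eta, \gamma) = \Omega^2 (1 + \eta)(1 + \gamma)\,(V''(0) - 1) = 0. \]
Under this assumption $W$ is at least quadratic in $y$ for all $(\eta, \gamma)$, while the remaining terms of $N_2$ are linear in $(z, y)$ with coefficients that vanish at $\lambda_n = 0$.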
2.2. Center manifold reduction for nonautonomous perturbations of infinite-dimensional maps

In this section, we reformulate the situation of Sec. 2.1 in a general framework, and prove a local center manifold reduction result for problems of this type. This level of generality is relevant for nonlinear lattices, because the dynamical equations of many one-dimensional lattices can be reformulated as infinite-dimensional maps in loop spaces as one looks for small amplitude time-periodic oscillations. Indeed, if the coupling between sites has finite range (i.e. $x_n$ is coupled to $x_k$ for $|n - k| \leq p$), then in general $x_{n+p}$ can be obtained locally as a function of $x_{n+p-1}, \ldots, x_{n-p}$ using the implicit function theorem (for some examples see e.g. [44, Sec. 6.1], or [47]).

To work in a general setting, let us consider a Hilbert space $X$ and a closed linear operator $L : D \subset X \to X$ of domain $D$, $L$ being in general unbounded.
We equip $D$ with the scalar product $\langle u, v \rangle_D = \langle Lu, Lv \rangle_X + \langle u, v \rangle_X$, hence $D$ is a Hilbert space continuously embedded in $X$. We denote by $U \times V$ a neighborhood of 0 in $D \times \mathbb{R}^p$ and consider a nonlinear map $N \in C^k(U \times V, X)$ ($k \geq 2$), where $N(Y, \lambda)$ satisfies $N(0, \lambda) = 0$, $D_Y N(0, 0) = 0$. We look for sequences $(Y_n)_{n \in \mathbb{Z}}$ in $U$ satisfying
\[ \forall n \in \mathbb{Z}, \quad Y_{n+1} = L Y_n + N(Y_n, \lambda_n) \ \text{ in } X, \tag{13} \]
where $\{\lambda\} = (\lambda_n)_{n \in \mathbb{Z}}$ is a bounded sequence in $V$ treated as a parameter. In what follows we denote by $E = \ell_\infty(\mathbb{Z}, \mathbb{R}^p)$ the Banach space in which $\{\lambda\}$ lies. Notice that $Y = 0$ is a fixed point of (13). We assume that $L$ has the property of spectral separation, i.e. $L$ satisfies the assumption (H) described below (in what follows we denote by $\sigma(T)$ the spectrum of a linear operator $T$).

Assumption (H). The operator $L$ has nonempty hyperbolic ($|z| \neq 1$) and central ($|z| = 1$) spectral parts. Moreover, there exists an annulus $A = \{z \in \mathbb{C},\ r \leq |z| \leq R\}$ ($r < 1 < R$) such that the only part of the spectrum of $L$ in $A$ lies on the unit circle.

The situation corresponding to Assumption (H) is sketched in Fig. 3. Under assumption (H), the hyperbolic part $\sigma_h$ of $\sigma(L)$ is isolated from its central part $\sigma_c$. In particular this allows one to split $X$ into two subspaces $X_c$, $X_h$ invariant under $L$, corresponding to $\sigma_c$, $\sigma_h$ respectively. More precisely, $L_h = L|_{X_h}$ and $L_c = L|_{X_c}$ satisfy $\sigma(L_h) = \sigma_h$ and $\sigma(L_c) = \sigma_c$. The invariant subspace $X_c$ is called the center subspace, and $X_h$ is the hyperbolic subspace. The subspace $X_c$ is finite-dimensional when the spectrum of $L$ on the unit circle consists of a finite number of eigenvalues with finite multiplicities (we do not need this assumption for the reduction theorem constructed here).

Fig. 3. Spectrum of $L$ (dots), unit circle (dashed) and oriented circles $C(r)$, $C(R)$.
The spectral projection $\pi_c$ on the center subspace can be defined in the following way (see e.g. [49])
\[ \pi_c = \frac{1}{2i\pi} \int_{C(R)} (zI - L)^{-1}\, dz - \frac{1}{2i\pi} \int_{C(r)} (zI - L)^{-1}\, dz, \]
where $C(r)$ denotes the circle of center $z = 0$ and radius $r$ (see Fig. 3). One has $\pi_c \in \mathcal{L}(X, D)$, $X_c = \pi_c X \subset D$ and $\pi_c L = L \pi_c$, where $\mathcal{L}(X, D)$ denotes the set of bounded operators from $X$ into $D$. In the sequel we note $\pi_h = I - \pi_c$ and $D_h = \pi_h D$.

Remark 1. Let us consider the situation of Sec. 2.1 and the linear operator $L$ of Eq. (11). In the case $\omega = \omega_2$ and $\Omega > 2/\sqrt{3}$, the spectrum of $L$ on the unit circle consists of a double non semi-simple eigenvalue $+1$. Moreover, for $\omega = \omega_1$ the spectrum of $L$ on the unit circle consists of a double non semi-simple eigenvalue $-1$. In both cases the associated invariant subspace $X_c$ is spanned by $V_z = (\cos t, 0)$, $V_y = (0, \cos t)$ and we have $\pi_c Y = \frac{1}{\pi} \left( \int_0^{2\pi} Y(t) \cos t\, dt \right) \cos t$. In addition the unit circle is isolated from the remainder of the spectrum, since the latter is discrete and only accumulates at the origin and at $-\infty$ on the real axis. It follows that $L$ satisfies Assumption (H).

Now we state the center manifold reduction theorem in the general case. In the sequel we note $Y^c = \pi_c Y$, $Y^h = \pi_h Y$.

Theorem 1. Assume that $L$ has the property of spectral separation, i.e. satisfies Assumption (H). There exists a neighborhood $\Omega \times \Lambda$ of 0 in $D \times E$ and a map $\psi \in C^k(X_c \times \Lambda, D_h)$ (with $\psi(0, \{\lambda\}) = 0$, $D_{Y^c} \psi(0, 0) = 0$) such that for all $\{\lambda\} \in \Lambda$ the following holds:

(i) If $\{Y\}$ is a solution of (13) such that $Y_n \in \Omega$ for all $n \in \mathbb{Z}$, then $Y_n^h = \psi(Y_n^c, \tau_n\{\lambda\})$ for all $n \in \mathbb{Z}$ and $Y_n^c$ satisfies the nonautonomous recurrence relation in $X_c$
\[ \forall n \in \mathbb{Z}, \quad Y_{n+1}^c = f_n(Y_n^c, \{\lambda\}), \tag{14} \]
where $f_n \in C^k((X_c \cap \Omega) \times \Lambda, X_c)$ is defined by
\[ f_n(\cdot, \{\lambda\}) = \pi_c (L + N(\cdot, \lambda_n)) \circ (I + \psi(\cdot, \tau_n\{\lambda\})). \]

(ii) Conversely, if $\{Y^c\}$ is a solution of (14) such that $Y_n^c \in \Omega$ for all $n \in \mathbb{Z}$, then $Y_n = Y_n^c + \psi(Y_n^c, \tau_n\{\lambda\})$ satisfies (13).

(iii) If $L + N(\cdot, \lambda)$ commutes with a linear isometry $T \in \mathcal{L}(X) \cap \mathcal{L}(D)$ then $T \psi(\cdot, \{\lambda\}) = \psi(\cdot, \{\lambda\}) \circ T$ and $T f_n(\cdot, \{\lambda\}) = f_n(\cdot, \{\lambda\}) \circ T$.

Properties (i) and (ii) reduce the local study of (13) to that of the nonautonomous recurrence relation (14) in the subspace $X_c$. Note that the dependency of $\psi$ and the reduced map $f_n$ with respect to sequences $\{\lambda\}$ is nonlocal.

In what follows we give a simple proof of Theorem 1 which relies on the fact that the nonautonomous mapping (13) can be seen as a projection of an extended autonomous mapping, to which the center manifold theorem proved in [44] can
be applied. This procedure will explain why the result of Theorem 1 can be seen as a center manifold reduction, since the reduction function $\psi$ will appear as one component of the function having the center manifold as its graph. The reduced nonautonomous mapping (14) will be interpreted as a projection of the extended autonomous mapping restricted to the invariant center manifold.

Theorem 1 has been proved in [44] in the case of an autonomous mapping, when the sequence $\{\lambda\}$ is absent or replaced by a simple parameter $\lambda \in \mathbb{R}^p$. To recover this autonomous case we introduce the additional variable $S_n = \tau_n\{\lambda\} \in E$. Note that for any fixed $n \in \mathbb{Z}$, $S_n$ denotes a bounded sequence in $\mathbb{R}^p$ (to simplify the notations we use the symbol $S_n$ instead of $\{S_n\}$). Given a sequence $\{\lambda\} \in E$ we also note $\delta_0\{\lambda\} = \lambda_0$. Equation (13) can be rewritten
\[ Y_{n+1} = L Y_n + N(Y_n, \delta_0 S_n), \quad S_{n+1} = \tau_1 S_n, \tag{15} \]
which is an autonomous mapping in $X \times E$. In what follows, we apply the theory of [44] to system (15). As we shall see, the corresponding center manifold will be infinite-dimensional due to the second component of (15). The case of infinite-dimensional center manifolds has been treated in [44], with the counterpart that the theory is restricted to maps in Hilbert spaces. Consequently the first step is to search for $S_n$ in a suitable Hilbert space instead of the Banach space $E$.

For this purpose we consider the space of sequences
\[ h_{-1} = \left\{ \{u\} \,/\, u_k \in \mathbb{C}^p,\ \|\{u\}\|_{-1} < +\infty \right\}, \quad \text{where } \|\{u\}\|_{-1}^2 = \sum_{k \in \mathbb{Z}} (1 + k^2)^{-1} \|u_k\|^2. \]
The space $h_{-1}$ defines a Hilbert space equipped with the scalar product $\langle \{u\}, \{v\} \rangle = \sum_{k \in \mathbb{Z}} (1 + k^2)^{-1}\, u_k \cdot v_k$, where $\cdot$ denotes the usual scalar product on $\mathbb{C}^p$ and $\|\cdot\|$ the associated norm. For all $n \in \mathbb{Z}$, we now search for $S_n$ in the space $H = h_{-1} \cap (\mathbb{R}^p)^{\mathbb{Z}}$ consisting of real sequences in $h_{-1}$. Note that $E \subset H$, the embedding being continuous. Since sequences in $H$ may be unbounded and $N(Y, \cdot)$ is defined on a neighborhood $V$ of $\lambda = 0$ in $\mathbb{R}^p$, we replace (15) by a locally equivalent problem
\[ (Y_{n+1}, S_{n+1}) = F(Y_n, S_n) \tag{16} \]
where $F(Y, S) = (LY + N(Y, \gamma(\delta_0 S)), \tau_1 S)$, $\gamma : \mathbb{R}^p \to V$ is a $C^\infty$ cut-off function satisfying $\|\gamma(x)\| \leq \|x\|$, $\gamma(x) = x$ for $\|x\| < r$, $\gamma(x) = 0$ for $\|x\| > 2r$, $r$ being chosen small enough (with $B(0, 2r) \subset V$). Problem (16) is an autonomous mapping in $X \times H$.

In order to apply the center manifold theorem of [44] we need to study the spectrum of $DF(0) = L \times \tau_1$. One has clearly $\sigma(DF(0)) = \sigma(L) \cup \sigma(\tau_1)$, where $\sigma(\tau_1)$ is determined in the following lemma.

Lemma 1. The spectrum $\sigma(\tau_1)$ of $\tau_1 : H \to H$ consists of the unit circle.
Proof. Consider the complexification $h_{-1}$ of $H$. Given a sequence $\{f\} \in h_{-1}$ and $z \in \mathbb{C}$, we look for $\{u\} \in h_{-1}$ satisfying
\[ (zI - \tau_1)\{u\} = \{f\}. \tag{17} \]
Equation (17) can be solved in a simple manner using Fourier series. Recall that the periodic Sobolev space $H^1_{per}(0, 2\pi)$ can be defined as the set of functions in $L^2(\mathbb{R}/2\pi\mathbb{Z}, \mathbb{C}^p)$ whose Fourier coefficients form a sequence in $h_1$, where
\[ h_1 = \left\{ \{u\} \,/\, u_k \in \mathbb{C}^p,\ \sum_{k \in \mathbb{Z}} (1 + k^2) \|u_k\|^2 < +\infty \right\}. \]
In the same way its dual space $H^{-1}_{per}(0, 2\pi)$ is isomorphic to $h_{-1}$, where the isomorphism $C : H^{-1}_{per}(0, 2\pi) \to h_{-1}$ is again given by $C_n(T) = \frac{1}{2\pi} \langle T, e^{-int} \rangle$ for all $T \in H^{-1}_{per}(0, 2\pi)$. In addition one has the useful property $\tau_1 C(T) = C(e^{-it} T)$ for all $T \in H^{-1}_{per}(0, 2\pi)$. Now return to Eq. (17) and consider $T = C^{-1}(\{u\})$ and $S = C^{-1}(\{f\})$. One obtains the equivalent problem in $H^{-1}_{per}(0, 2\pi)$
\[ (z - e^{-it}) T = S. \tag{18} \]
If $|z| \neq 1$ then (18) has the unique solution $T = (z - e^{-it})^{-1} S$, hence $z \notin \sigma(\tau_1)$. If $z = e^{i\theta}$ is chosen on the unit circle, $T = 2\pi \delta_{-\theta}$ is a solution for $S = 0$, corresponding to an eigenvector $\{u\} = \{e^{in\theta}\}$ of $\tau_1$.

As follows from Lemma 1, $\sigma(DF(0))$ consists of the union of $\sigma(L)$ with the unit circle. Consequently, $DF(0)$ has the property of spectral separation, i.e. the hyperbolic part of its spectrum is isolated from the unit circle. Moreover the center subspace of $DF(0)$ is simply $X_c \times H$. With these spectral properties at hand, we now apply the center manifold theorem of [44] which states the following.

Theorem 2. There exists a neighborhood $\Omega \times \tilde{\Lambda}$ of $(Y, S) = 0$ in $D \times H$ and a map $\psi \in C^k(X_c \times H, D_h)$ (with $\psi(0, 0) = 0$, $D\psi(0, 0) = 0$) such that the manifold
\[ M = \{ (Y, S) \in D \times H \,/\, Y = Y^c + \psi(Y^c, S),\ Y^c \in X_c \} \]
has the following properties:

(i) $M$ is locally invariant under $F$, i.e. if $(Y, S) \in M \cap (\Omega \times \tilde{\Lambda})$, then $F(Y, S) \in M$.

(ii) If $\{(Y, S)\}$ is a solution of (16) such that $(Y_n, S_n) \in \Omega \times \tilde{\Lambda}$ for all $n \in \mathbb{Z}$, then $(Y_n, S_n) \in M$ for all $n \in \mathbb{Z}$ (i.e. $Y_n^h = \psi(Y_n^c, S_n)$) and $(Y_n^c, S_n)$ satisfies the recurrence relation in $X_c \times H$
\[ Y_{n+1}^c = \tilde{f}(Y_n^c, S_n), \quad S_{n+1} = \tau_1 S_n, \tag{19} \]
where $\tilde{f}(Y^c, S) = \pi_c [L + N(\cdot, \gamma(\delta_0 S))](Y^c + \psi(Y^c, S))$.
(iii) Conversely, given a solution $\{(Y^c, S)\}$ of (19) such that $(Y_n^c, S_n) \in \Omega \times \tilde{\Lambda}$ for all $n \in \mathbb{Z}$, consider $Y_n = Y_n^c + \psi(Y_n^c, S_n)$. Then $(Y_n, S_n)$ defines a solution of (16) lying on $M$.

(iv) If $L + N(\cdot, \lambda)$ commutes with a linear isometry $T \in \mathcal{L}(X) \cap \mathcal{L}(D)$ then $T \psi(Y^c, S) = \psi(T Y^c, S)$ and $T \tilde{f}(Y^c, S) = \tilde{f}(T Y^c, S)$.

The manifold $M$ is called a local $C^k$ center manifold for (16). It is locally invariant under $F$ (as stated by property (i)) and the linear isometries of (16). Property (iv) expresses the invariance of $M$ under the linear isometry $T \times I$ of (16).

Now the proof of Theorem 1 follows directly from Theorem 2. Since $E$ is continuously embedded in $H$, $\psi$ defines a $C^k$ map from $X_c \times E$ into $D_h$. In Theorem 1 we choose $\Lambda$ as a ball of center 0 in $E$ such that $\Lambda \subset \tilde{\Lambda}$ and $\gamma = I$ on $\Lambda$. Then problems (13) and (16) are equivalent for all $\{\lambda\} \in \Lambda$, with $S_n = \tau_n\{\lambda\}$, and properties (i)–(iii) of Theorem 1 are directly deduced from properties (ii)–(iv) of Theorem 2. In addition, since $(0, \tau_n\{\lambda\})$ is a solution of (16) for all $\{\lambda\} \in \Lambda$ it follows that $\psi(0, \tau_n\{\lambda\}) = 0$ (by property (ii) of Theorem 2), and consequently $\psi(0, \{\lambda\}) = 0$.

2.3. Application to the Klein–Gordon lattice

2.3.1. Reduction result

In this section, we apply the reduction Theorem 1 to the inhomogeneous Klein–Gordon lattice considered in Sec. 2.1. We recall that the inhomogeneous system (6) has been reformulated as a nonautonomous map in a loop space given by expression (11). All parameters (sequences of heterogeneities and frequency shift $\mu$) are embedded in the multicomponent parameter $\{\lambda\} = (\mu, \{\epsilon\}, \{\eta\}, \{\gamma\}, \{\kappa\}) \in E = \mathbb{R} \times (\ell_\infty(\mathbb{Z}))^4$. The problem has exactly the general form (13) (in the particular case in which the first component of $\{\lambda\}$ is constant) and consequently the reduction Theorem 1 can be applied to (11). This yields the reduction result for the original system (6) stated below (Theorem 3).

It is straightforward to check that system (6) has the reduction properties (i) and (ii) described in this theorem since the equivalent system (11) satisfies properties (i) and (ii) of Theorem 1 (see Remark 1). However, it remains to compute the explicit forms (21) and (24) of the recurrence relations given below. These expressions do not simply correspond to the two-dimensional mapping (14) rewritten as a second order recurrence relation. In addition we rewrite (14) in normal form, i.e. we perform a polynomial change of variables which simplifies (14) by keeping only its essential terms. These computations will be the object of the next three Secs. 2.3.2–2.3.4. Property (iii) below is equivalent to property (iii) of Theorem 1, where the symmetry $T$ is the half period time shift which satisfies $T|_{X_c} = -I$.

Theorem 3. Fix $\omega^2 = \omega_c^2 + \mu$ in Eq. (6), where $\omega_c = \sqrt{4 + \Omega^2}$ or $\omega_c = \Omega$ (in that case we further assume $\Omega > 2/\sqrt{3}$). There exist neighborhoods $U$, $V$ and $W$ of 0 in
$H^2_\#$, $E$ and $\mathbb{R}$, respectively, and a $C^k$ map $\phi : \mathbb{R}^2 \times E \to H^2_\#$ (with $\phi(0, \{\lambda\}) = 0$, $D\phi(0, 0) = 0$) such that the following holds for all $\{\lambda\} \in V$.

(i) All solutions of (6) such that $y_n \in U$ for all $n \in \mathbb{Z}$ have the form $y_n(t) = \beta_n \cos t + [\phi(\beta_{n-1}, \beta_n, \tau_n\{\lambda\})](t)$. For $\omega_c = \Omega$, $\beta_n$ satisfies a recurrence relation
\[ \beta_{n+1} - 2\beta_n + \beta_{n-1} = R_n(\beta_{n-1}, \beta_n, \{\lambda\}) \tag{20} \]
where $R_n : W^2 \times V \to \mathbb{R}$ is $C^k$. The principal part of $R_n$ reads
\[ R_n(\alpha, \beta, \{\lambda\}) = (\Omega^2 \eta_n (1 + \gamma_n) - (\Omega^2 + \mu)\epsilon_n + \Omega^2 \gamma_n - \mu)\beta + \kappa_n(\beta - \alpha) + B\beta^3 + \text{h.o.t.}, \tag{21} \]
\[ B = \frac{\Omega^2}{8} \left( V^{(4)}(0) - \frac{5}{3} (V^{(3)}(0))^2 \right). \tag{22} \]
For $\omega_c = \sqrt{4 + \Omega^2}$ one has
\[ \beta_{n+1} + 2\beta_n + \beta_{n-1} = R_n(\beta_{n-1}, \beta_n, \{\lambda\}), \tag{23} \]
with
\[ R_n(\alpha, \beta, \{\lambda\}) = (\Omega^2 \eta_n (1 + \gamma_n) - (4 + \Omega^2 + \mu)\epsilon_n + \Omega^2 \gamma_n - \mu)\beta + \kappa_n(\beta - \alpha) + \tilde{B}\beta^3 + \text{h.o.t.}, \tag{24} \]
\[ \tilde{B} = \frac{\Omega^2}{8} \left[ V^{(4)}(0) + (V^{(3)}(0))^2 \left( \frac{\Omega^2}{16 + 3\Omega^2} - 2 \right) \right]. \tag{25} \]
In both cases, higher order terms in $R_n$ are $O(\|(\alpha, \beta)\|^3 \|\{\lambda\}\|_E + \|(\alpha, \beta)\|^5)$ and non-local in $\{\lambda\}$.

(ii) If $\beta_n$ is a solution of problem (20) or (23) (respectively, for $\omega_c = \Omega$ or $\omega_c = \sqrt{4 + \Omega^2}$), such that $\beta_n \in W$ for all $n \in \mathbb{Z}$, then $y_n(t) = \beta_n \cos t + \phi(\beta_{n-1}, \beta_n, \tau_n\{\lambda\})$ satisfies Eq. (6).

(iii) The functions $\phi$ and $R_n$ have the following symmetries
\[ \phi(-\alpha, -\beta, \{\lambda\}) = T\phi(\alpha, \beta, \{\lambda\}), \quad R_n(-\alpha, -\beta, \{\lambda\}) = -R_n(\alpha, \beta, \{\lambda\}), \]
where $T$ denotes the half period time shift $[T\phi(\cdot)](t) = [\phi(\cdot)](t + \pi)$.

A possible way of computing the reduced recurrence relations (20) and (23) would be to consider the equivalent autonomous mapping (15) and use a classical computation scheme for center manifolds of autonomous systems (see e.g. [79] for a description of the method). The first step consists in computing the Taylor expansion of the reduction function $\psi$ up to a given order. This can be done using a nonlocal equation for $Y_n$ (obtained by expressing $Y_n$ in (15) as a function of $N(Y_n, \delta_0 S_n)$) and computing the Taylor coefficients of $\psi$ by induction (see [79]). The second step is to compute the reduced recurrence relation (19), which is completely determined by $\psi$.
In the next three sections, we shall use a different method yielding simpler computations. First we compute the expressions (21) and (24) in the autonomous case $\{\epsilon\} = \{\eta\} = \{\gamma\} = \{\kappa\} = 0$, using the method of [44]. Then, using a symmetry argument, we deduce how the leading order part of the reduced equation is modified by the nonautonomous terms of (11).

To end this section, we point out a generalization of Theorem 3. As follows from the analysis of Sec. 2.1, the dimension of the center space $X_c$ of $A_\omega$ is twice the number of multiples of $\omega$ lying within the band $[\Omega, (4 + \Omega^2)^{1/2}]$. More precisely, if $\Omega \leq \omega p \leq (4 + \Omega^2)^{1/2}$ for all integers $p \in \{p_0, \ldots, p_1\}$, with no additional multiples entering the band, then the center space is spanned by the corresponding Fourier modes $(\cos(pt), 0)$, $(0, \cos(pt))$. As above the following reduction result follows from Theorem 1.

Theorem 4. Consider $\omega_c > 0$ such that $\omega_c p \in [\Omega, (4 + \Omega^2)^{1/2}]$ for all integers $p \in \{p_0, \ldots, p_1\}$, with no additional multiples in this interval. Fix $\omega^2 = \omega_c^2 + \mu$ in Eq. (6) and note $N = p_1 - p_0 + 1$. Consider the subspace $H_c$ of $H^2_\#$ spanned by the $N$ Fourier modes $\cos(p_0 t), \ldots, \cos(p_1 t)$ and its complementary subspace $H_c^\perp$ consisting of orthogonal Fourier modes. There exist neighborhoods $U$, $V$ of 0 in $H^2_\#$, $E$, respectively, and a $C^k$ map $\phi : \mathbb{R}^{2N} \times E \to H_c^\perp$ (with $\phi(0, \{\lambda\}) = 0$, $D\phi(0, 0) = 0$), such that for all $\{\lambda\} \in V$, all solutions of (6) such that $y_n \in U$ for all $n \in \mathbb{Z}$ have the form
\[ y_n(t) = \sum_{p = p_0}^{p_1} [\beta_n^{(p)} \cos(pt)] + \phi(\beta_{n-1}^{(p_0)}, \beta_n^{(p_0)}, \ldots, \beta_{n-1}^{(p_1)}, \beta_n^{(p_1)}, \tau_n\{\lambda\}). \tag{26} \]
Moreover, all small amplitude solutions of (6) are determined by a finite-dimensional recurrence relation obtained by projecting (6) on $H_c$ and using the ansatz (26).

In the following Secs. 2.3.2–2.3.4, we compute the explicit forms of the reduced recurrence relations given in Theorem 3.
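
The counting rule behind Theorem 4 can be illustrated with a minimal sketch. It counts the integers $p$ for which $p\omega$ lies in the phonon band and returns twice that number as the dimension of the center space. The function name `center_dimension` and the tolerance `tol` are illustrative choices, not notation from the text, and the sketch assumes that on the Fourier mode $\cos(pt)$ the linearization of (6) at $y = 0$ reads $y_{n+1} + y_{n-1} = (2 + \Omega^2 - p^2\omega^2) y_n$, which is consistent with (28) but is not restated in this section.

```python
import math


def center_dimension(omega, Omega, tol=1e-12):
    """Return (dimension of the center space, list of resonant modes p).

    Assumption: a mode cos(p t) contributes a pair of multipliers on the
    unit circle exactly when Omega <= p * omega <= sqrt(4 + Omega^2),
    i.e. when the multiple p * omega lies in the phonon band.
    """
    if omega <= 0:
        raise ValueError("omega must be positive")
    upper = math.sqrt(4.0 + Omega ** 2)  # upper edge of the phonon band
    modes = []
    p = 1
    while p * omega <= upper + tol:      # multiples above the band never re-enter
        if p * omega >= Omega - tol:
            modes.append(p)
        p += 1
    # each resonant mode spans (cos(pt), 0) and (0, cos(pt))
    return 2 * len(modes), modes


if __name__ == "__main__":
    # Omega > 2/sqrt(3): only p = 1 is resonant at the lower band edge
    print(center_dimension(2.0, 2.0))
    # Omega < 2/sqrt(3): p = 2 also falls inside the band
    print(center_dimension(1.0, 1.0))
```

With $\omega = \Omega$, the second multiple $2\Omega$ stays in the band exactly when $4\Omega^2 \leq 4 + \Omega^2$, i.e. $\Omega \leq 2/\sqrt{3}$, which is the threshold appearing in Theorem 3.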
2.3.2. Homogeneous case near the lower phonon band edge
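In this section, we restrict our attention to the case when $\Omega > 2/\sqrt{3}$ and $\omega \approx \omega_2 = \Omega$. We consider the autonomous case when $\{\epsilon\} = \{\eta\} = \{\gamma\} = \{\kappa\} = 0$. Equation (11) now reads
\[ Y_{n+1} = L Y_n + N(Y_n, \mu), \quad n \in \mathbb{Z}, \tag{27} \]
where $L = A_\Omega$ is given by (9) and
\[ N((z, y), \mu) = \left( 0,\ \mu \frac{d^2 y}{dt^2} + W(y) \right), \]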
with $W(y) = \Omega^2 (V'(y) - y)$. System (27) is a reformulation of the equations of motion for the homogeneous Klein–Gordon lattice
\[ \omega^2 \frac{d^2 y_n}{dt^2} + \Omega^2 V'(y_n) = y_{n+1} - 2y_n + y_{n-1}, \quad n \in \mathbb{Z}. \tag{28} \]
As in the nonautonomous case (11), system (27) is invariant under the symmetry $T Y = Y(\cdot + \pi)$. Moreover, the invariance $y_n \to y_{-n}$ of (28) implies that (27) is reversible with respect to the symmetry $R(z, y) = (y, z)$, i.e. if $Y_n$ is a solution then so is $R Y_{-n}$. In other words, if $Y$ and $[L + N(\cdot, \mu)](RY)$ are in some neighborhood of 0 in $D$ one has $((L + N(\cdot, \mu)) \circ R)^2 Y = Y$. Lastly, due to the existence of the additional symmetry $T$, it is worthwhile to notice that $TR$ defines another reversibility symmetry.

In what follows we use the notations introduced in Sec. 2.2. We recall that the spectrum of $L = A_\Omega$ on the unit circle consists of a double non semi-simple eigenvalue $+1$, and the associated two-dimensional invariant subspace $X_c$ is spanned by the vectors $V_z = (\cos t, 0)$, $V_y = (0, \cos t)$, with
\[ L|_{X_c} = \begin{pmatrix} 0 & 1 \\ -1 & 2 \end{pmatrix} \]
in the basis $(V_z, V_y)$. For $\mu$ in some neighborhood $\Lambda$ of 0, (27) admits a $C^k$ two-dimensional local center manifold $M_\mu \subset D$ (which can be written as a graph over $X_c$), locally invariant under $L + N(\cdot, \mu)$ (see [44, Theorem 1, p. 32]). One can write
\[ M_\mu = \left\{ Y \in D \,/\, Y = a V_z + b V_y + \psi(a, b, \mu),\ (a, b) \in \mathbb{R}^2 \right\}, \tag{29} \]
where $\psi \in C^k(\mathbb{R}^2 \times \Lambda, D_h)$ and $\psi(a, b, \mu) = O(\|(a, b)\|^2 + \|(a, b)\|\,|\mu|)$. Moreover, $M_\mu$ is invariant under $T$ and $R$ (see [44, Theorem 2, p. 34 and Sec. 5.2]).

In the sequel, we use the notations $P^*(y) = \frac{1}{\pi} \int_0^{2\pi} y(t) \cos t\, dt$, $P_c y = P^*(y) \cos t$ and $H_h^2 = \{ y \in H^2_\# \,/\, P^*(y) = 0 \}$. The spectral projection $\pi_c$ on $X_c$ reads $\pi_c(z, y) = (P_c z, P_c y)$ and we have $D_h = H_h^2 \times H_h^2$. Since $M_\mu$ is invariant under $R$ and $V_z$, $V_y$ are exchanged by $R$, we have the symmetry property $R\psi(a, b, \mu) = \psi(b, a, \mu)$. Consequently, the function $\psi$ has the form
\[ \psi(a, b, \mu) = (\varphi(b, a, \mu), \varphi(a, b, \mu)) \tag{30} \]
with $\varphi \in C^k(\mathbb{R}^2 \times \Lambda, H_h^2)$. Since $M_\mu$ is invariant under $T$ and $T|_{X_c} = -I$ we have in addition
\[ T\varphi(a, b, \mu) = \varphi(-a, -b, \mu). \tag{31} \]
For $\mu \approx 0$, the center manifold $M_\mu$ contains all solutions $Y_n$ of (27) staying in a sufficiently small neighborhood of $Y = 0$ in $D$ for all $n \in \mathbb{Z}$. Their coordinates $(a_n, b_n)$ on $M_\mu$ are thus given by a two-dimensional mapping which determines all
small amplitude solutions when $\mu \approx 0$. The reduced mapping is given by
\[ \begin{pmatrix} a_{n+1} \\ b_{n+1} \end{pmatrix} = f_\mu \begin{pmatrix} a_n \\ b_n \end{pmatrix} \tag{32} \]
where
\[ f_\mu \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} b \\ -a + 2b + r(a, b, \mu) \end{pmatrix}, \tag{33} \]
\[ r(a, b, \mu) = -\mu b + P^* W(b \cos t + \varphi(a, b, \mu)). \tag{34} \]
One obtains Eq. (32) using the fact that
\[ z_n = a_n \cos t + \varphi(b_n, a_n, \mu), \quad y_n = b_n \cos t + \varphi(a_n, b_n, \mu) \]
for $Y_n = (z_n, y_n) \in M_\mu$ and applying $P^*$ to Eq. (27) (one has $P^*\varphi = 0$ and $P^* \circ \frac{d^2}{dt^2} = -P^*$ on $H^2_\#$).

Since the reduced mapping inherits the symmetries of (27) [44], $f_\mu$ commutes with $T|_{X_c} = -I$ and thus $r(-a, -b, \mu) = -r(a, b, \mu)$. Moreover, $f_\mu$ is reversible with respect to the symmetry $R(a, b) = (b, a)$, i.e. $(f_\mu \circ R)^2 = I$. This yields the identity
\[ r(a, b, \mu) = r(-a + 2b + r(a, b, \mu), b, \mu). \]
This imposes the following structure for the Taylor expansion of $r$ at $(a, b, \mu) = 0$
\[ r(a, b, \mu) = -b\mu + c_1 b^3 + c_2 a b^2 - \frac{1}{2} c_2 a^2 b + O(|b|(|a| + |b|)^4 + |b|(|a| + |b|)^2 |\mu|), \tag{35} \]
where the coefficients $c_1$, $c_2$ have to be determined. Note that $r(a, 0, \mu) = 0$ (see [44, p. 53] for details).

To determine the unknown coefficients of (35), we first compute the leading order terms in the Taylor expansion of $\psi$ at $(a, b, \mu) = 0$. This can be done using the fact that $M_\mu$ is locally invariant under $L + N(\cdot, \mu)$ (see [44, Theorem 1, p. 32]). For $(a, b) \approx 0$, this yields
\[ \pi_h [L + N(\cdot, \mu)]((a, b) \cos t + \psi(a, b, \mu)) = \psi(f_\mu(a, b), \mu) \tag{36} \]
or equivalently
\[ \varphi(-a + 2b + r(a, b, \mu), b, \mu) = \varphi(a, b, \mu), \tag{37} \]
\[ \varphi(b, -a + 2b + r(a, b, \mu), \mu) = \left( \Omega^2 \frac{d^2}{dt^2} + 2 + \Omega^2 \right) \varphi(a, b, \mu) - \varphi(b, a, \mu) + (1 - P_c) \left( \mu \frac{d^2}{dt^2} + W \right)(b \cos t + \varphi(a, b, \mu)). \tag{38} \]
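Thanks to the symmetry property (37), the Taylor expansion of $\varphi$ at order 2 takes the form
\[ \varphi(a, b, \mu) = \varphi_{011}\, b\mu - \frac{1}{2} \varphi_{110}\, a^2 + \varphi_{110}\, ab + \varphi_{020}\, b^2 + \text{h.o.t.} \tag{39} \]
By an identification procedure we now compute the coefficients $\varphi_{pqr}$ in (39), using (38) and the expansion
\[ W(y) = \Omega^2 \left( \frac{1}{2} V^{(3)}(0)\, y^2 + \frac{1}{6} V^{(4)}(0)\, y^3 + O(y^4) \right). \tag{40} \]
Identification at order $b\mu$ gives
\[ \left( \frac{d^2}{dt^2} + 1 \right) \varphi_{011} = 0, \]
hence $\varphi_{011} = 0$ since $\varphi_{011} \in H_h^2$. Identification at order $ab$ leads to
\[ \varphi_{020} = -\frac{1}{4} \left( \Omega^2 \frac{d^2}{dt^2} + 2 + \Omega^2 \right) \varphi_{110} \tag{41} \]
and identification at order $b^2$ yields
\[ -\varphi_{110} + \left( \Omega^2 \frac{d^2}{dt^2} - 2 + \Omega^2 \right) \varphi_{020} = -\frac{1}{2} \Omega^2 V^{(3)}(0) \cos^2 t. \tag{42} \]
Substituting (41) into (42) gives
\[ \left( \frac{d^2}{dt^2} + 1 \right)^2 \varphi_{110} = \frac{2}{\Omega^2} V^{(3)}(0) \cos^2 t \]
and consequently
\[ \varphi_{110} = \frac{1}{\Omega^2} V^{(3)}(0) \left( 1 + \frac{1}{9} \cos(2t) \right), \]
\[ \varphi_{020} = \frac{1}{2} V^{(3)}(0) \left[ -\frac{1}{2} - \frac{1}{\Omega^2} + \left( \frac{1}{6} - \frac{1}{9\Omega^2} \right) \cos(2t) \right]. \]
As a conclusion, we obtain
\[ \varphi(a, b, \mu) = \frac{1}{\Omega^2} V^{(3)}(0) \left( 1 + \frac{1}{9} \cos(2t) \right) \left( ab - \frac{1}{2} a^2 \right) + \frac{1}{2} V^{(3)}(0) \left[ -\frac{1}{2} - \frac{1}{\Omega^2} + \left( \frac{1}{6} - \frac{1}{9\Omega^2} \right) \cos(2t) \right] b^2 + \text{h.o.t.} \tag{43} \]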
We now compute the two-dimensional mapping giving the coordinates $(a_n, b_n)$ of the solutions on $M_\mu$. Equation (32) can be written
\[ a_{n+1} = b_n, \quad b_{n+1} - 2b_n + b_{n-1} = r(b_{n-1}, b_n, \mu). \tag{44} \]
Using (34), (43) and (40) yields in Eq. (35)
\[ c_1 = \frac{1}{8} \Omega^2 V^{(4)}(0) - \frac{1}{4} \left( \frac{19}{9} + \frac{5}{6} \Omega^2 \right) (V^{(3)}(0))^2, \quad c_2 = \frac{19}{18} (V^{(3)}(0))^2. \tag{45} \]
Lastly, one can write (44) in normal form using the change of variables $b_n = \beta_n - \frac{c_2}{12} \beta_n^3$. The normal form of (44) at order 3 reads
\[ \beta_{n+1} - 2\beta_n + \beta_{n-1} = -\mu\beta_n + B\beta_n^3 + \text{h.o.t.} \tag{46} \]
with
\[ B = c_1 + \frac{c_2}{2} = \frac{\Omega^2}{8} \left( V^{(4)}(0) - \frac{5}{3} (V^{(3)}(0))^2 \right). \tag{47} \]
This yields the explicit form (20) of the reduced recurrence relation in the autonomous case $\{\epsilon\} = \{\eta\} = \{\gamma\} = \{\kappa\} = 0$.

2.3.3. Homogeneous case near the upper phonon band edge

In this section, we consider the case $\omega \approx \omega_1 = \sqrt{4 + \Omega^2}$, in the autonomous case when $\{\epsilon\} = \{\eta\} = \{\gamma\} = \{\kappa\} = 0$. Equation (11) takes the form (27), where $L = A_{\omega_1}$ is given by (9). The spectrum of $L$ on the unit circle consists of a double non semi-simple eigenvalue $-1$, and the center space $X_c$ is again spanned by $V_z = (\cos t, 0)$, $V_y = (0, \cos t)$.

For $\mu = \omega^2 - \omega_1^2$ in some neighborhood $\Lambda$ of 0, there exists a smooth two-dimensional local center manifold $M_\mu \subset D$ locally invariant under $L + N(\cdot, \mu)$, $T$, $R$ and having the form (29). The function $\psi$ having the center manifold as its graph has the form (30) and shares the property (31). For $\mu \approx 0$, the center manifold $M_\mu$ contains all solutions $Y_n$ of (27) staying in a sufficiently small neighborhood of $Y = 0$ in $D$ for all $n \in \mathbb{Z}$. Their coordinates $(a_n, b_n)$ on $M_\mu$ are then given by a two-dimensional mapping, which determines all small amplitude solutions when $\mu \approx 0$. The operator $L$ has the following structure in the basis $(V_z, V_y)$
\[ L|_{X_c} = \begin{pmatrix} 0 & 1 \\ -1 & -2 \end{pmatrix} \]
and the reduced mapping is given by
\[ \begin{pmatrix} a_{n+1} \\ b_{n+1} \end{pmatrix} = f_\mu \begin{pmatrix} a_n \\ b_n \end{pmatrix} \tag{48} \]
where
\[ f_\mu \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} b \\ -a - 2b + r(a, b, \mu) \end{pmatrix} \tag{49} \]
and $r$ is defined by (34). Since the reduced mapping inherits the symmetries of (27), $f_\mu$ commutes with $T|_{X_c} = -I$, hence $r(-a, -b, \mu) = -r(a, b, \mu)$. Moreover, (48) is reversible with respect to the symmetry $R(a, b) = (b, a)$, which yields the identity
\[ r(a, b, \mu) = r(-a - 2b + r(a, b, \mu), b, \mu). \]
This implies $r(a, 0, \mu) = 0$ and
\[ r(a, b, \mu) = -b\mu + c_1 b^3 + c_2 a b^2 + \frac{1}{2} c_2 a^2 b + \text{h.o.t.}, \tag{50} \]
where the coefficients $c_1$, $c_2$ have to be determined. For this purpose, we first compute the leading order terms in the Taylor expansion of $\psi$ at $(a, b, \mu) = 0$, using the fact that $M_\mu$ is locally invariant under $L + N(\cdot, \mu)$. Equation (36) yields
\[ \varphi(-a - 2b + r(a, b, \mu), b, \mu) = \varphi(a, b, \mu), \tag{51} \]
\[ \varphi(b, -a - 2b + r(a, b, \mu), \mu) = \left( \omega_1^2 \frac{d^2}{dt^2} + 2 + \Omega^2 \right) \varphi(a, b, \mu) - \varphi(b, a, \mu) + (1 - P_c) \left( \mu \frac{d^2}{dt^2} + W \right)(b \cos t + \varphi(a, b, \mu)). \tag{52} \]
The Taylor expansion of $\varphi$ at order 2 takes the following form (due to the symmetry property (51))
\[ \varphi(a, b, \mu) = \varphi_{011}\, b\mu + \frac{1}{2} \varphi_{110}\, a^2 + \varphi_{110}\, ab + \varphi_{020}\, b^2 + \text{h.o.t.} \tag{53} \]
By an identification procedure we now compute the coefficients $\varphi_{pqr}$ in (53), using (52) and the expansion (40). Identification at order $b\mu$ gives
\[ \left( \frac{d^2}{dt^2} + 1 \right) \varphi_{011} = 0, \]
hence $\varphi_{011} = 0$ since $\varphi_{011} \in H_h^2$. Identification at order $ab$ leads to
\[ \varphi_{020} = \frac{1}{4} \left( \omega_1^2 \frac{d^2}{dt^2} + 2 + \Omega^2 \right) \varphi_{110} \tag{54} \]
and identification at order $b^2$ yields
\[ \varphi_{110} + \left( \omega_1^2 \frac{d^2}{dt^2} - 2 + \Omega^2 \right) \varphi_{020} = -\frac{1}{2} \Omega^2 V^{(3)}(0) \cos^2 t. \tag{55} \]
Substituting (54) into (55) gives
\[ \left( \omega_1^2 \frac{d^2}{dt^2} + \Omega^2 \right)^2 \varphi_{110} = -2\Omega^2 V^{(3)}(0) \cos^2 t \]
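and consequently
\[ \varphi_{110} = -V^{(3)}(0) \left[ \frac{1}{\Omega^2} + \frac{\Omega^2}{(16 + 3\Omega^2)^2} \cos(2t) \right], \]
\[ \varphi_{020} = -\frac{1}{4} \Omega^2 V^{(3)}(0) \left[ \frac{1}{\Omega^2} + \frac{2}{\Omega^4} + \left( \frac{2}{(16 + 3\Omega^2)^2} - \frac{1}{16 + 3\Omega^2} \right) \cos(2t) \right]. \]
As a conclusion, we obtain
\[ \varphi(a, b, \mu) = -V^{(3)}(0) \left[ \frac{1}{\Omega^2} + \frac{\Omega^2}{(16 + 3\Omega^2)^2} \cos(2t) \right] \left( \frac{1}{2} a^2 + ab \right) - \frac{1}{4} \Omega^2 V^{(3)}(0) \left[ \frac{1}{\Omega^2} + \frac{2}{\Omega^4} + \left( \frac{2}{(16 + 3\Omega^2)^2} - \frac{1}{16 + 3\Omega^2} \right) \cos(2t) \right] b^2 + \text{h.o.t.} \tag{56} \]
We now compute the two-dimensional mapping giving the coordinates $(a_n, b_n)$ of the solutions on $M_\mu$. Equation (48) can be written
\[ a_{n+1} = b_n, \quad b_{n+1} + 2b_n + b_{n-1} = r(b_{n-1}, b_n, \mu). \tag{57} \]
Using (34), (56) and (40) yields in Eq. (50)
\[ c_1 = \frac{1}{8} \Omega^2 \left[ V^{(4)}(0) - (V^{(3)}(0))^2\, \Omega^2 \left( \frac{2}{\Omega^2} + \frac{4}{\Omega^4} + \frac{2}{(16 + 3\Omega^2)^2} - \frac{1}{16 + 3\Omega^2} \right) \right], \tag{58} \]
\[ c_2 = -\Omega^4 (V^{(3)}(0))^2 \left[ \frac{1}{\Omega^4} + \frac{1}{2} \frac{1}{(16 + 3\Omega^2)^2} \right]. \tag{59} \]
The transformation $b_n = \beta_n - \frac{c_2}{12} \beta_n^3$ yields the normal form of (57) of order 3
\[ \beta_{n+1} + 2\beta_n + \beta_{n-1} = -\mu\beta_n + \tilde{B}\beta_n^3 + \text{h.o.t.}, \tag{60} \]
with $\tilde{B}$ defined by (25). This yields the explicit form (23) of the reduced recurrence relation in the autonomous case $\{\epsilon\} = \{\eta\} = \{\gamma\} = \{\kappa\} = 0$.

2.3.4. Inhomogeneous cases

Using the normal form computations performed in the above sections for the autonomous case, one can obtain by perturbation the principal part (20) (or (23)) of the normal form for $\omega \approx \omega_2$ (or $\omega \approx \omega_1$) in the nonautonomous case. In what follows this computation is described for $\omega \approx \omega_2$, the treatment for $\omega \approx \omega_1$ being completely similar.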
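
The homogeneous coefficients that serve as the starting point here can be cross-checked with exact rational arithmetic. The sketch below is a consistency check, not part of the original derivation. It evaluates $c_1$, $c_2$ as read from (45) and from (58), (59), and compares the resulting cubic coefficients with the closed forms (22) and (25). The combination $c_1 + c_2/2$ is the one stated in (47). The combination $c_1 - c_2/2$ used for the upper edge is an assumption inferred from applying the same change of variables to (57) and is not quoted from the text. The function names and sample values are illustrative.

```python
from fractions import Fraction


def lower_edge(Om2, V3, V4):
    """Return (c1 + c2/2, closed form B) near omega = Omega.

    Om2 stands for Omega^2, V3 for V^(3)(0), V4 for V^(4)(0).
    """
    Om2, V3, V4 = Fraction(Om2), Fraction(V3), Fraction(V4)
    c2 = Fraction(19, 18) * V3 ** 2
    c1 = Fraction(1, 8) * Om2 * V4 - (Fraction(19, 36) + Fraction(5, 24) * Om2) * V3 ** 2
    B_closed = Om2 / 8 * (V4 - Fraction(5, 3) * V3 ** 2)
    return c1 + c2 / 2, B_closed


def upper_edge(Om2, V3, V4):
    """Return (c1 - c2/2, closed form B~) near omega = sqrt(4 + Omega^2).

    The combination c1 - c2/2 is an inferred assumption (see lead-in).
    """
    Om2, V3, V4 = Fraction(Om2), Fraction(V3), Fraction(V4)
    D = 16 + 3 * Om2
    c2 = -Om2 ** 2 * V3 ** 2 * (1 / Om2 ** 2 + Fraction(1, 2) / D ** 2)
    c1 = Fraction(1, 8) * Om2 * (
        V4 - V3 ** 2 * Om2 * (2 / Om2 + 4 / Om2 ** 2 + 2 / D ** 2 - 1 / D)
    )
    Bt_closed = Om2 / 8 * (V4 + V3 ** 2 * (Om2 / D - 2))
    return c1 - c2 / 2, Bt_closed


if __name__ == "__main__":
    # arbitrary sample values: Omega^2 = 3, V'''(0) = 2, V''''(0) = -1
    print(lower_edge(3, 2, -1))
    print(upper_edge(3, 2, -1))
```

Both pairs agree exactly for rational inputs, which supports the readings of (45), (58) and (59) used above.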
Theorem 3 is obtained by applying the reduction Theorem 1 to the first order system (11). According to Theorem 1(i), small amplitude solutions $Y_n = (z_n, y_n)$ of (11) have the following form for small $\{\lambda\} \in E$
\[ Y_n = (a_n, b_n) \cos t + \Psi(a_n, b_n, \tau_n\{\lambda\}), \tag{61} \]
where $\Psi(a, b, \{\lambda\}) = \psi((a, b) \cos t, \{\lambda\}) \in D_h$ and $\psi$ denotes the reduction function of Theorem 1. In the sequel we shall note $\Psi = (\Psi_1, \Psi_2)$.

Let us compute the explicit form of the reduced map (14). For this purpose, one has to use the ansatz (61) in Eq. (11) and project the latter on the Fourier mode $\cos t$. Setting $F_n(a, b, \{\lambda\}) \cos t = f_n((a, b) \cos t, \{\lambda\})$, the reduced map (14) becomes
\[ \begin{pmatrix} a_{n+1} \\ b_{n+1} \end{pmatrix} = F_n(\cdot, \{\lambda\}) \begin{pmatrix} a_n \\ b_n \end{pmatrix}, \tag{62} \]
\[ F_n(\cdot, \{\lambda\}) \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} b \\ -a + 2b + r_n(a, b, \{\lambda\}) \end{pmatrix}, \tag{63} \]
where (recall $\{\lambda\} = (\mu, \{\epsilon\}, \{\eta\}, \{\gamma\}, \{\kappa\})$)
\[ r_n(a, b, \{\lambda\}) = -(\Omega^2 \epsilon_n + \mu(1 + \epsilon_n))\, b + \Omega^2 [(1 + \eta_n)(1 + \gamma_n) - 1]\, b + \kappa_n (b - a) + P^* W(b \cos t + \Psi_2(a, b, \tau_n\{\lambda\}), \eta_n, \gamma_n) \tag{64} \]
and the function $W$ is the one introduced in Sec. 2.1 together with (12). Since $f_n(\cdot, \{\lambda\})$ commutes with $T$ and $T|_{X_c} = -I$, the map $F_n(\cdot, \{\lambda\})$ commutes with $-I$. We have consequently
\[ r_n(a, b, \{\lambda\}) = -(\Omega^2 \epsilon_n + \mu(1 + \epsilon_n))\, b + \Omega^2 [(1 + \eta_n)(1 + \gamma_n) - 1]\, b + \kappa_n (b - a) + c_1 b^3 + c_2 a b^2 + c_3 a^2 b + O(\|(a, b)\|^3 \|\{\lambda\}\|_E + \|(a, b)\|^5), \tag{65} \]
where the coefficients $c_i$ need to be determined. Now, since $r_n[a, b, (\mu, 0, 0, 0, 0)] = r(a, b, \mu)$ in the homogeneous case (see Sec. 2.3.2), we have $c_3 = -\frac{1}{2} c_2$ and $c_1$, $c_2$ are defined by (45). Consequently we have computed the principal part of the reduced Eq. (62) in the nonautonomous case.

To obtain the normal form of (62) of order three we now define $P(\beta) = \beta - \frac{c_2}{12} \beta^3$ and consider as in Sec. 2.3.2
\[ a_n = P(\alpha_n), \quad b_n = P(\beta_n). \]
This yields the normal form of (62) of order 3 given in Eq. (20). Moreover, the small amplitude solutions of (6) have the form
\[ y_n = \left( \beta_n - \frac{c_2}{12} \beta_n^3 \right) \cos t + \Psi_2(P(\beta_{n-1}), P(\beta_n), \tau_n\{\lambda\}), \]
therefore the reduction function $\phi$ of Theorem 3 is given by $\phi(\alpha, \beta, \{\lambda\}) = -\frac{c_2}{12} \beta^3 \cos t + \Psi_2(P(\alpha), P(\beta), \{\lambda\})$. Note that the reduction function $\phi$ has a component along the Fourier mode $\cos t$ after the normal form transformation.
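
The effect of the change of variables $P(\beta) = \beta - \frac{c_2}{12}\beta^3$ can be checked numerically on the cubic truncation of the homogeneous map at $\mu = 0$. The sketch below is a hedged illustration under stated assumptions. It uses arbitrary sample values for $c_1$, $c_2$ (not taken from any particular potential) and drops all higher order terms of $r$. It then compares one step of the truncated map in the variables $(a, b)$ with the image under $P$ of the normal form prediction $-\alpha + 2\beta + B\beta^3$, $B = c_1 + c_2/2$. If the transformation removes the $ab^2$ and $a^2 b$ terms as claimed, the mismatch should be of fifth order in the amplitude.

```python
def residual(scale, c1=-1.0, c2=0.5, alpha0=0.3, beta0=1.0):
    """Mismatch between the truncated map and the cubic normal form.

    All parameter values are illustrative assumptions. The map is the
    lower-edge recurrence at mu = 0 with
    r(a, b) = c1 b^3 + c2 a b^2 - (c2/2) a^2 b.
    """
    k = c2 / 12.0
    B = c1 + c2 / 2.0

    def P(x):
        # near-identity change of variables b = P(beta)
        return x - k * x ** 3

    alpha, beta = scale * alpha0, scale * beta0
    a, b = P(alpha), P(beta)
    # one step of the truncated map in the original coordinates
    b_next = -a + 2.0 * b + c1 * b ** 3 + c2 * a * b ** 2 - 0.5 * c2 * a ** 2 * b
    # one step of the cubic normal form in the new coordinates
    beta_next = -alpha + 2.0 * beta + B * beta ** 3
    return b_next - P(beta_next)


if __name__ == "__main__":
    r1, r2 = residual(0.1), residual(0.05)
    # halving the amplitude should divide a fifth-order residual by about 2**5
    print(r1, r2, r1 / r2)
```

A ratio close to $2^5$ between the two residuals indicates that no cubic term survives the transformation, in line with the normal form (20).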
3. Exact Periodic Solutions for a Homogeneous Lattice

Here we consider the case of the homogeneous Klein–Gordon lattice (3), which leads us to system (6) with $\{\epsilon\} = \{\eta\} = \{\gamma\} = \{\kappa\} = 0$. Breather solutions have been proved to exist by MacKay and Aubry [57] for system (3) with small values of the coupling parameter $k$ and nonresonant breather frequencies. Here we prove the existence of small amplitude breathers for arbitrary values of $k$ in some cases and frequencies close to the phonon band edges (see Theorems 5(i), 6(i) and 7 below). We also prove the existence of dark breather solutions, which converge towards a nonlinear standing wave as $n \to \pm\infty$ and have a much smaller amplitude at the center of the chain.

Let us start with the case $\omega \approx \Omega$ and $\Omega > 2/\sqrt{3}$ in (6). By Theorem 3, small amplitude solutions of (6) in $H^2_\#$ are determined by the recurrence relation (20). This recurrence becomes autonomous for a homogeneous lattice and takes the form (46). It is important to note that the invariance $n \to -n$ of (6) in the homogeneous case is inherited by (46) (see [44, Sec. 5.2 and Theorem 2]). This invariance implies that the two-dimensional map $(\beta_{n-1}, \beta_n) \mapsto (\beta_n, \beta_{n+1})$ is reversible. Bifurcations of small amplitude homoclinic and heteroclinic solutions have been studied in [44, Sec. 6.2.3] for this class of maps. This yields the following result for the recurrence relation (46).

Lemma 2. Assume $\Omega > 2/\sqrt{3}$ and $B = \frac{\Omega^2}{8}\left(V^{(4)}(0) - \frac{5}{3}(V^{(3)}(0))^2\right) \neq 0$. For $\mu \approx 0$, the recurrence relation (46) has the following solutions:
(i) For $\mu < 0$ and $B < 0$, (46) has at least two homoclinic solutions $\beta_n^1, \beta_n^2$ (and also $-\beta_n^1, -\beta_n^2$) such that $\lim_{n \to \pm\infty} \beta_n^i = 0$. These solutions have the symmetries $\beta^1_{-n+1} = \beta^1_n$, $\beta^2_{-n} = \beta^2_n$ and satisfy $0 < \beta_n^i \leq C|\mu|^{1/2}\sigma_1^{-|n|}$, with $\sigma_1 = 1 + O(|\mu|^{1/2}) > 1$.
(ii) If $\mu$ and $B$ have the same sign, (46) has two symmetric fixed points $\pm\beta^* = O(|\mu|^{1/2})$.
(iii) For $\mu > 0$ and $B > 0$, (46) has at least two heteroclinic solutions $\beta_n^3, \beta_n^4$ (and also $-\beta_n^3, -\beta_n^4$) such that $\lim_{n \to \pm\infty} \beta_n^i = \pm\beta^*$. These solutions have the symmetries $\beta^3_{-n+1} = -\beta^3_n$ and $\beta^4_{-n} = -\beta^4_n$. Moreover, $\beta_n^3, \beta_n^4$ are $O(\mu^{1/2})$ as $n \to \pm\infty$ and $O(\mu)$ for bounded values of $n$.

Note that for $B > 0$ and $\mu < 0$ ($\mu \approx 0$), (46) has no small amplitude homoclinic solution to 0. For $\mu < 0$, typical plots of the stable and unstable manifolds of the fixed point $\beta_n = 0$ are shown in Fig. 5 (nonintersecting case $B > 0$) and in Figs. 7 and 12 (intersecting case $B < 0$).

Theorem 3 ensures that each solution $\beta_n^i$ in Lemma 2 corresponds to a solution $y_n^i$ of (6) given by
\[ y_n^i(t) = \beta_n^i \cos t + \varphi(\beta_{n-1}^i, \beta_n^i, (\mu, 0, 0, 0, 0)) \tag{66} \]
with $\omega^2 = \Omega^2 + \mu$ in (6). This yields the following result (the symmetries of $y_n^i$ are due to the symmetries of $\beta_n^i$ described in Lemma 2).

Theorem 5. Fix $\{\epsilon\} = \{\eta\} = \{\gamma\} = \{\kappa\} = 0$ in Eq. (6). Assume $\Omega > 2/\sqrt{3}$ and $b = V^{(4)}(0) - \frac{5}{3}(V^{(3)}(0))^2 \neq 0$. For $\omega \approx \Omega$, problem (6) has the following solutions with $y_n \in H^2_\#$ for all $n \in \mathbb{Z}$:
(i) For $\omega < \Omega$ and $b < 0$, (6) has at least two homoclinic solutions $y_n^1, y_n^2$ (and also $y_n^1(t + \pi), y_n^2(t + \pi)$) such that $\lim_{n \to \pm\infty} \|y_n^i\|_{H^2_\#} = 0$. These solutions satisfy $y^1_{-n+1} = y^1_n$, $y^2_{-n} = y^2_n$ and have the form
\[ y_n^i = \beta_n^i \cos t + O(|\omega - \Omega|), \tag{67} \]
where $0 < \beta_n^i \leq C|\omega - \Omega|^{1/2}\sigma_1^{-|n|}$ and $\sigma_1 = 1 + O(|\omega - \Omega|^{1/2}) > 1$. Solutions $y_n^1, y_n^2$ correspond to small amplitude breathers with a slow exponential decay as $n \to \pm\infty$.
(ii) If $\omega - \Omega$ and $b$ have the same sign, (6) admits a solution $y^0 \in H^2_\#$ independent of $n$, corresponding to collective in-phase oscillations. It has the form $y^0(t) = \beta^* \cos t + O(|\omega - \Omega|)$ and $\beta^* = O(|\omega - \Omega|^{1/2})$.
(iii) For $\omega > \Omega$ and $b > 0$, (6) has at least two heteroclinic solutions $y_n^3, y_n^4$ (and also $y_n^3(t + \pi), y_n^4(t + \pi)$) such that $\lim_{n \to -\infty} \|y_n^i - y^0(t + \pi)\|_{H^2_\#} = 0$ and $\lim_{n \to +\infty} \|y_n^i - y^0\|_{H^2_\#} = 0$. These solutions satisfy $y^3_{-n+1}(t) = y^3_n(t + \pi)$ and $y^4_{-n}(t) = y^4_n(t + \pi)$. Moreover, their norms $\|y_n^3\|_{H^2_\#}$, $\|y_n^4\|_{H^2_\#}$ are $O((\omega - \Omega)^{1/2})$ as $n \to \pm\infty$ and $O(\omega - \Omega)$ for bounded values of $n$. Solutions $y_n^3, y_n^4$ correspond to small amplitude dark breathers.

In addition, note that for $b > 0$ there exists no small amplitude discrete breather $y_n \in H^2_\#$ with $\omega < \Omega$ and $\omega \approx \Omega$ (since (46) has no small amplitude solution homoclinic to 0).

Now we consider the case $\omega \approx \omega_c$ with $\omega_c = \sqrt{4 + \Omega^2}$. In that case, Eq. (6) can be locally reduced to the recurrence relation (23), which becomes again autonomous if $\{\epsilon\} = \{\eta\} = \{\gamma\} = \{\kappa\} = 0$ and has the invariance $n \to -n$. This class of recurrence relations has been studied in [44, Sec. 6.2.3, Lemma 7] to which we refer for details. In addition one can note that the recurrence (60) can be recast in the form (46) by setting $\beta_n = (-1)^n \tilde\beta_n$. The following result for the recurrence relation (60) follows.

Lemma 3. Assume $\tilde B = \frac{\Omega^2}{8}\left(V^{(4)}(0) + (V^{(3)}(0))^2\left(\frac{\Omega^2}{16 + 3\Omega^2} - 2\right)\right) \neq 0$. For $\mu \approx 0$, the recurrence relation (60) has the following solutions:
(i) For $\mu > 0$ and $\tilde B > 0$, (60) has at least two homoclinic solutions $\beta_n^1, \beta_n^2$ (and also $-\beta_n^1, -\beta_n^2$) such that $\lim_{n \to \pm\infty} \beta_n^i = 0$. These solutions have the symmetries $\beta^1_{-n+1} = -\beta^1_n$, $\beta^2_{-n} = \beta^2_n$ and satisfy $0 < (-1)^n \beta_n^i \leq C\mu^{1/2}|\sigma_1|^{-|n|}$, with $|\sigma_1| = 1 + O(|\mu|^{1/2}) > 1$.
(ii) If $\mu$ and $\tilde B$ have the same sign, (60) has a period 2 solution $\beta_n^0 = (-1)^n \beta^*$, with $\beta^* = O(|\mu|^{1/2})$.
(iii) For $\mu < 0$ and $\tilde B < 0$, (60) has at least two heteroclinic solutions $\beta_n^3, \beta_n^4$ (and also $-\beta_n^3, -\beta_n^4$) such that $\lim_{n \to \pm\infty} |\beta_n^i \mp \beta_n^0| = 0$. These solutions have the symmetries $\beta^3_{-n+1} = \beta^3_n$ and $\beta^4_{-n} = -\beta^4_n$. Moreover, $\beta_n^3, \beta_n^4$ are $O(|\mu|^{1/2})$ as $n \to \pm\infty$ and $O(|\mu|)$ for bounded values of $n$.

In addition, for $\tilde B < 0$ and $\mu > 0$ ($\mu \approx 0$) problem (60) has no small amplitude homoclinic solution to 0. As above, the solutions of the reduced recurrence relation provided by Lemma 3 yield the following solutions of (6).

Theorem 6. Fix $\{\epsilon\} = \{\eta\} = \{\gamma\} = \{\kappa\} = 0$ in Eq. (6). Assume $\tilde b = V^{(4)}(0) + (V^{(3)}(0))^2\left(\frac{\Omega^2}{16 + 3\Omega^2} - 2\right) \neq 0$. For $\omega \approx \omega_c = \sqrt{4 + \Omega^2}$, problem (6) has the following solutions with $y_n \in H^2_\#$ for all $n \in \mathbb{Z}$:
(i) For $\omega > \omega_c$ and $\tilde b > 0$, (6) has at least two homoclinic solutions $y_n^1, y_n^2$ (and also $y_n^1(t + \pi), y_n^2(t + \pi)$) such that $\lim_{n \to \pm\infty} \|y_n^i\|_{H^2_\#} = 0$. These solutions satisfy $y^1_{-n+1}(t) = y^1_n(t + \pi)$, $y^2_{-n} = y^2_n$ and have the form
\[ y_n^i = \beta_n^i \cos t + O(|\omega - \omega_c|), \tag{68} \]
where $0 < (-1)^n \beta_n^i \leq C(\omega - \omega_c)^{1/2}|\sigma_1|^{-|n|}$ and $|\sigma_1| = 1 + O((\omega - \omega_c)^{1/2}) > 1$. Solutions $y_n^1, y_n^2$ correspond to small amplitude breathers with a slow exponential decay as $n \to \pm\infty$.
(ii) If $\omega - \omega_c$ and $\tilde b$ have the same sign, (6) admits a solution $y_n^0$ that is 2-periodic in $n$, corresponding to out-of-phase oscillations. It has the form $y_n^0(t) = y(t + n\pi)$ with $y(t) = \beta^* \cos t + O(|\omega - \omega_c|)$ ($y \in H^2_\#$) and $\beta^* = O(|\omega - \omega_c|^{1/2})$.
(iii) For $\omega < \omega_c$ and $\tilde b < 0$, (6) has at least two heteroclinic solutions $y_n^3, y_n^4$ (and also $y_n^3(t + \pi), y_n^4(t + \pi)$) such that $\lim_{n \to -\infty} \|y_n^i - y^0(t + \pi)\|_{H^2_\#} = 0$ and $\lim_{n \to +\infty} \|y_n^i - y^0\|_{H^2_\#} = 0$. These solutions satisfy $y^3_{-n+1} = y^3_n$ and $y^4_{-n}(t) = y^4_n(t + \pi)$. Moreover, their norms $\|y_n^3\|_{H^2_\#}$, $\|y_n^4\|_{H^2_\#}$ are $O(|\omega - \omega_c|^{1/2})$ as $n \to \pm\infty$ and $O(|\omega - \omega_c|)$ for bounded values of $n$. Solutions $y_n^3, y_n^4$ correspond to small amplitude dark breathers.

In addition, for $\tilde b < 0$ there exists no small amplitude discrete breather $y_n \in H^2_\#$ with $\omega > \omega_c$ and $\omega \approx \omega_c$.

It is worthwhile mentioning that approximate breather solutions of (6) can be obtained in the form of modulated plane waves, using multiscale expansions (see [34, 35] and references therein), where the error can be controlled over finite time intervals. The envelope of a modulated wave satisfies the nonlinear Schrödinger (NLS) equation, and does not propagate along the chain when a plane wave with wavenumber $q = 0$ or $q = \pi$ is modulated (its group velocity vanishes). In these two cases the NLS equation is focusing (i.e. time-periodic and spatially localized solutions exist) when $b < 0$ and $\tilde b > 0$ respectively, which coincides (according to Theorems 5 and 6) with the parameter values for which exact breathers exist.
In addition, as shown in [27] the condition $b < 0$ leads to the instability of nonlinear standing waves with wavenumber $q = 0$. If periodic boundary conditions are considered, these standing waves become unstable above a critical energy via a tangent bifurcation. When the lattice period tends to infinity, the energy threshold goes to 0 and bifurcating solutions are slowly spatially modulated. The same result has been obtained for standing waves with $q = \pi$ when $V$ is even and $\tilde b > 0$.

In what follows we reformulate the results with respect to the unscaled original system (3). For conciseness, we only describe breather bifurcations, but conditions for dark breather bifurcations are easily deduced from Theorems 5 and 6. We express the condition $\tilde b > 0$ of Theorem 6 in a different way using the relation
\[ \tilde b = b - \frac{16}{3}\,\frac{(V^{(3)}(0))^2}{16 + 3\Omega^2}. \]
In addition, as the rescaled potential $\tilde V$ of (6) is replaced by the original potential $V$ of (3), coefficients $b$ and $\tilde b$ are simply replaced by $h = a^{-2}b$ and $\tilde h = a^{-2}\tilde b$.

Theorem 7. Consider the Klein–Gordon lattice (3), where the on-site potential $V$ satisfies $V'(0) = 0$, $V''(0) = 1$ and $m, d, a, k > 0$. Assume $h = V^{(4)}(0) - \frac{5}{3}(V^{(3)}(0))^2 \neq 0$ and note $\Omega^2 = a^2 d/k$, $\omega_{\min}^2 = a^2 d/m$, $\omega_{\max}^2 = (a^2 d + 4k)/m$ and $H = V^{(4)}(0) - 2(V^{(3)}(0))^2$.
(i) If $h < 0$ and $\Omega^2 > 4/3$, system (3) admits two families of breather solutions $x_n^1, x_n^2$ parametrized by their frequency $\omega$ (in addition to phase shift), where $\omega \approx \omega_{\min}$ and $\omega < \omega_{\min}$. These solutions satisfy $x^1_{-n+1} = x^1_n$ and $x^2_{-n} = x^2_n$ and decay exponentially as $n \to \pm\infty$. As $\omega \to \omega_{\min}$, the amplitude of oscillations and the exponential rate of decay are $O(|\omega - \omega_{\min}|^{1/2})$. The breather profile is a slow modulation of a linear mode with wavenumber $q = 0$.
(ii) If $h > 0$ and $\Omega^2 > -16H/(3h)$, system (3) admits two families of breather solutions $x_n^1, x_n^2$ parametrized by their frequency $\omega$ (in addition to phase shift), where $\omega \approx \omega_{\max}$ and $\omega > \omega_{\max}$. These solutions satisfy $x^1_{-n+1}(t) = x^1_n(t + \pi/\omega)$, $x^2_{-n} = x^2_n$ and decay exponentially as $n \to \pm\infty$. As $\omega \to \omega_{\max}$, the amplitude of oscillations and the exponential rate of decay are $O(|\omega - \omega_{\max}|^{1/2})$. The breather profile is a slow modulation of a linear mode with wavenumber $q = \pi$.

To interpret the conditions on the on-site potential $V$ in properties (i) and (ii), it is interesting to note that $V$ is soft for $h < 0$ and hard for $h > 0$ near the origin (i.e. the period of small oscillations in this potential respectively increases or decreases with amplitude). The condition on $\Omega$ in property (i) corresponds to a nonresonance condition, i.e. it ensures that no multiple of $\omega$ lies in the phonon band $[\omega_{\min}, \omega_{\max}]$ for $\omega \approx \omega_{\min}$. The condition on $\Omega$ in property (ii) is of a different nature and is equivalent (with the condition $h > 0$) to fixing $\tilde h > 0$.

Discrete breathers were known to exist in Klein–Gordon lattices for small coupling $k$ after the work of MacKay and Aubry [57]. Theorem 7 considerably enlarges the domain of breather existence, with the limitation that it only provides small
amplitude solutions. In particular, it is interesting to note that small amplitude breathers of property (ii) exist for all values of k if H > 0.
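The sign conditions of Theorem 7 are simple enough to be evaluated mechanically. The sketch below is illustrative only: the function names `theorem7_case` and `h_tilde_like` are hypothetical, the inputs are the derivatives $V^{(3)}(0)$, $V^{(4)}(0)$ and the value of $\Omega^2$, and exact rational arithmetic is used so that boundary cases are not blurred by rounding. The second function evaluates the expression of $\tilde b$ from Theorem 6 with these inputs, which allows one to check the stated equivalence between $\Omega^2 > -16H/(3h)$ and positivity of that expression when $h > 0$.

```python
from fractions import Fraction


def theorem7_case(v3, v4, omega_sq):
    """Return which property of Theorem 7 applies: "i", "ii", "none" or "degenerate".

    v3, v4   : third and fourth derivatives of V at 0
    omega_sq : the quantity Omega^2 = a^2 d / k
    """
    v3, v4, omega_sq = Fraction(v3), Fraction(v4), Fraction(omega_sq)
    h = v4 - Fraction(5, 3) * v3**2   # soft potential if h < 0, hard if h > 0
    H = v4 - 2 * v3**2
    if h == 0:
        return "degenerate"           # excluded by the assumption h != 0
    if h < 0:
        return "i" if omega_sq > Fraction(4, 3) else "none"
    return "ii" if omega_sq > -16 * H / (3 * h) else "none"


def h_tilde_like(v3, v4, omega_sq):
    """Expression of b-tilde (Theorem 6) evaluated with the given derivatives."""
    v3, v4, omega_sq = Fraction(v3), Fraction(v4), Fraction(omega_sq)
    return v4 + v3**2 * (omega_sq / (16 + 3 * omega_sq) - 2)


if __name__ == "__main__":
    # Even hard quartic potential (v3 = 0, v4 > 0): H > 0, so (ii) holds for any Omega^2 > 0.
    print(theorem7_case(0, 6, Fraction(1, 100)))
    # A potential with v3 = 1, v4 = 9/5: the threshold on Omega^2 is 8.
    print(theorem7_case(1, Fraction(9, 5), 9), theorem7_case(1, Fraction(9, 5), 8))
```

With $H > 0$ and $h > 0$ the threshold $-16H/(3h)$ is negative, which is the observation made above that breathers of property (ii) exist for all values of $k$.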
4. Normal Form Analysis for Inhomogeneous Lattices

According to Theorem 3, small amplitude solutions of (6) in $H^2_\#$ are described (for small inhomogeneities and frequencies close to the phonon band edges) by finite-dimensional nonautonomous recurrence relations. In what follows we only consider the case $\omega \approx \Omega$, the situation when $\omega \approx \sqrt{4 + \Omega^2}$ yielding similar phenomena. At leading order, the reduced recurrence relation (20) can be approximated by
\[ \beta_{n+1} - 2\beta_n + \beta_{n-1} = (\Omega^2 \eta_n(1 + \gamma_n) - (\Omega^2 + \mu)\epsilon_n + \Omega^2 \gamma_n - \mu)\beta_n + \kappa_n(\beta_n - \beta_{n-1}) + B\beta_n^3. \tag{69} \]
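To make the structure of (69) concrete, the right-hand side can be written as a small function of the local defect parameters. This is a minimal sketch with hypothetical names (`linear_coefficient`, `rhs69`); it only transcribes (69), and the checks in the usage part compare it with the linear part of (64) and with the single mass defect treated in Sec. 4.1, where $\epsilon_n = m_0\delta_{n0}$ and the other defect sequences vanish.

```python
def linear_coefficient(Omega, mu, eps_n=0.0, eta_n=0.0, gamma_n=0.0):
    """Coefficient multiplying beta_n on the right-hand side of (69)."""
    O2 = Omega**2
    return O2 * eta_n * (1 + gamma_n) - (O2 + mu) * eps_n + O2 * gamma_n - mu


def rhs69(beta_prev, beta, Omega, mu, B,
          eps_n=0.0, eta_n=0.0, gamma_n=0.0, kappa_n=0.0):
    """Right-hand side of (69) at site n, given beta_{n-1} and beta_n."""
    lin = linear_coefficient(Omega, mu, eps_n, eta_n, gamma_n)
    return lin * beta + kappa_n * (beta - beta_prev) + B * beta**3


if __name__ == "__main__":
    Omega, mu, m0 = 10.0, -1.99, 0.005   # illustrative values
    omega_sq = Omega**2 + mu
    # Single mass defect at the defect site: coefficient equals -(omega^2 m0 + mu).
    print(linear_coefficient(Omega, mu, eps_n=m0), -(omega_sq * m0 + mu))
```

The coefficient of $\beta_n$ is the same expression as the linear part of $r_n$ in (64), expanded and regrouped.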
Different kinds of techniques can be employed to obtain homoclinic solutions of (69). One can use variational methods for asymptotically periodic sequences [63] (see also [82] in the homogeneous case), or proceed by perturbation near an uncoupled limit (also denoted anti-continuous or anti-integrable limit) where Ω and B are large (see [9, 10, 4] and [57, Sec. 9]). Existence results of localized solutions are also available in [1, 2] for disordered defect sequences. Another approach is to start from a known uniformly hyperbolic homoclinic solution in the homogeneous case, which persists for small inhomogeneities by the implicit function theorem, and obtain estimates for defect sizes allowing persistence (see the technique developed by Bishnani and MacKay [14]). Interesting related results on the structural stability of discrete dynamical systems under nonautonomous perturbations can be found in [30]. With a different point of view, we develop here a dynamical system technique, valid for a finite number of defects, which allows to analyze bifurcations of homoclinic solutions as defects are varied (see Secs. 4.1 and 4.2). For an isolated defect we highlight, near critical defect values, bifurcations of new homoclinic solutions (having no counterpart in the homogeneous system) or the disappearance of homoclinic solutions existing in the homogeneous case. Our method is also generalized to a finite number of defects, with the counterpart that (69) is modified by suitable higher order terms depending on the defect sequence (this procedure only provides approximate solutions of (69)). However, this does not constitute a strong limitation since the full reduced equation (20) is itself a higher order perturbation of (69). Note that Eq. (69) is valid (according to Theorem 3) for small defect sizes and µ ≈ 0, where the parameter µ determines for µ < 0 the (weak) degree of hyperbolicity of the fixed point βn = 0 in the homogeneous case. 
Our analysis does not impose conditions on the relative sizes of these parameters. In order to obtain exact breather solutions of (6) via Theorem 3, it would be necessary to proceed in two steps. The first step is the one described above, where exact or approximate homoclinic solutions are obtained for the truncated problem
(69). The second step is to show that these solutions persist for the complete equation (20) as higher order terms are added. For this purpose a typical procedure would be to solve (20) using the contraction mapping theorem in the neighborhood of an exact or approximate solution of (69). However, in Sec. 4.1.4, we analyze tangent bifurcations of homoclinic orbits for which the persistence problem would become extremely complex, since it requires asymptotic techniques beyond all algebraic orders (see Sec. 4.1.4 for more details). We shall not analytically examine the persistence of such bifurcations for the complete equation (20). Instead, we shall later compare approximate solutions $y_n(t) \approx \beta_n \cos t$ (deduced from (69) and Theorem 3) to numerically computed solutions of the original problem (6) (see Sec. 5). This will allow us to study the validity of approximation (69) far from the small amplitude limit and as inhomogeneities become larger.

The persistence of homoclinic solutions for the complete equation (20) will be shown for particular homoclinic orbits which appear through a pitchfork bifurcation at the origin when $\omega$ reaches a critical value. These orbits correspond to discrete breathers with maximal amplitude at the impurity site (see Theorem 8 in Sec. 4.1.5). This part involves standard bifurcation techniques, in contrast with the persistence problem of the above mentioned tangent bifurcations.

Note that other interesting bifurcations can exist when impurities act at a purely nonlinear level (see [76, 50] for some examples in spatially discrete or continuous systems). This would correspond to the situation when the on-site potential in (2) has a harmonic part independent of $n$, whereas higher order terms are inhomogeneous. The subsequent analysis of the reduced recurrence relation would be quite different, and in particular the method developed in Secs. 4.1 and 4.2 (based on a linear deformation of the unstable manifold) would not apply.

4.1. Case of a single mass defect

We start with the simplest case when the coefficients of (69) are constant, except at $n = 0$ where their value changes. To fix the idea we consider the case of a single mass defect in Eq. (5), i.e. $D_n = d$, $K_n = k$, $A_n = a$, $M_n = 1 + m_n$, $m_n = m_0\delta_{n0}$. The case when all lattice parameters are allowed to vary over a finite number of sites will be considered in Sec. 4.2. For Eq. (6), the above assumption yields $\eta_n = \gamma_n = \kappa_n = 0$ and $\epsilon_n = m_0\delta_{n0}$. Equation (69) reads (recall $\omega^2 = \Omega^2 + \mu$)
\[ \beta_{n+1} - 2\beta_n + \beta_{n-1} = -(\omega^2 m_0\delta_{n0} + \mu)\beta_n + B\beta_n^3. \tag{70} \]
Setting $\beta_{n-1} = \alpha_n$ and $U_n = (\alpha_n, \beta_n)^T$, Eq. (70) can be rewritten
\[ U_{n+1} = G_\omega(U_n) - \omega^2 m_0\delta_{n0}\begin{pmatrix} 0 \\ \beta_n \end{pmatrix}, \tag{71} \]
where
\[ G_\omega(U_n) = \begin{pmatrix} \beta_n \\ -\alpha_n + 2\beta_n + (\Omega^2 - \omega^2)\beta_n + B\beta_n^3 \end{pmatrix}. \tag{72} \]
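The first order form (71)–(72) is straightforward to iterate numerically. The following is a minimal sketch with hypothetical function names (`G`, `step`, `orbit`), using plain tuples for $U_n = (\alpha_n, \beta_n)$; the parameter values in the usage part are illustrative and not prescribed by the text. The usage part verifies that the iterates satisfy the scalar recurrence (70) at every site, including the defect site $n = 0$.

```python
def G(U, omega, Omega, B):
    """The map G_omega of (72), acting on U = (alpha, beta)."""
    alpha, beta = U
    return (beta, -alpha + 2 * beta + (Omega**2 - omega**2) * beta + B * beta**3)


def step(U, n, omega, Omega, B, m0):
    """One step U_n -> U_{n+1} of (71); the defect term acts only at n = 0."""
    alpha1, beta1 = G(U, omega, Omega, B)
    if n == 0:
        beta1 -= omega**2 * m0 * U[1]
    return (alpha1, beta1)


def orbit(U_start, n_start, n_steps, omega, Omega, B, m0):
    """Return the list [(n, U_n)] for n = n_start, ..., n_start + n_steps."""
    out = [(n_start, U_start)]
    U = U_start
    for n in range(n_start, n_start + n_steps):
        U = step(U, n, omega, Omega, B, m0)
        out.append((n + 1, U))
    return out


if __name__ == "__main__":
    omega, Omega, B, m0 = 9.9, 10.0, -75.0, 0.005
    mu = omega**2 - Omega**2
    pts = orbit((0.001, 0.002), -2, 5, omega, Omega, B, m0)
    for (n, (a, b)), (_, (_, b_next)) in zip(pts, pts[1:]):
        defect = omega**2 * m0 if n == 0 else 0.0
        residual = b_next - 2 * b + a + (defect + mu) * b - B * b**3
        print(n, residual)   # residual of (70), expected to be at rounding level
```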
One has in particular
\[ U_1 = A(\omega, m_0)G_\omega(U_0) \tag{73} \]
where the linear transformation
\[ A(\omega, m_0) = \begin{pmatrix} 1 & 0 \\ -\omega^2 m_0 & 1 \end{pmatrix} \tag{74} \]
corresponds to a linear shear. Note that the axis $\alpha = 0$ consists of fixed points of $A(\omega, m_0)$.

It is worthwhile to notice that the map $G_\omega$ is reversible under the symmetry $R : (\alpha, \beta) \mapsto (\beta, \alpha)$, i.e. $G_\omega \circ R = R \circ G_\omega^{-1}$. In other words, if $U_n$ is a solution of (71) for $m_0 = 0$ then $RU_{-n}$ is also a solution. This property is due to the fact that Eq. (70) with $m_0 = 0$ has the invariances $n \to n + 1$ and $n \to -n$. Obviously the latter invariance still exists for $m_0 \neq 0$. Consequently, for all $m_0 \in \mathbb{R}$, if $U_n$ is a solution of (71) then $RU_{-n+1}$ is also a solution.

Now we shall use a geometrical argument to find homoclinic orbits to 0 for Eq. (70). In the sequel, we consider the stable manifold $W^s(0)$ of the fixed point $(\alpha, \beta) = 0$ of $G_\omega$, and its unstable manifold $W^u(0)$, both existing for $\omega < \Omega$. The following result follows immediately.

Lemma 4. For $0 < \omega < \Omega$, Eq. (70) possesses a homoclinic orbit to 0 if and only if $W^s(0)$ and $A(\omega, m_0)W^u(0)$ intersect.

In addition it is useful to notice that $W^s(0)$ and $W^u(0)$ are exchanged by the reversibility symmetry $R$.

4.1.1. Linear case

As a simple illustration, consider the linear case when $V$ is harmonic, in which $B = 0$. Equation (70) reads
\[ \beta_{n+1} - 2\beta_n + \beta_{n-1} = -(\omega^2 m_0\delta_{n0} + \mu)\beta_n. \tag{75} \]
In that case, $W^s(0)$ and $W^u(0)$ correspond respectively to the stable and the unstable eigenspace of a linear mapping in $\mathbb{R}^2$. The situation is sketched in Fig. 4. The corresponding stable eigenvalue $\sigma \in (0, 1)$ is given by
\[ \sigma = 1 - \frac{\mu}{2} - \frac{1}{2}(\mu^2 - 4\mu)^{1/2}, \qquad \mu = \omega^2 - \Omega^2 < 0, \tag{76} \]
and $W^s(0)$ is the line $\beta = \sigma\alpha$, $W^u(0)$ being the line $\beta = \sigma^{-1}\alpha$. For fixed $\omega < \Omega$, $A(\omega, m_0)$ maps the unstable eigenspace on the stable eigenspace if and only if $m_0 > 0$ (mass is increased at the defect) and
\[ m_0 = m_l(\omega), \tag{77} \]
where
\[ m_l(\omega) = \frac{1}{\omega^2}(\sigma^{-1} - \sigma). \tag{78} \]
Fig. 4. Linear case (B = 0). Left panel: Stable manifold (in the half plane α > 0) and images of the unstable manifold by A(ω, m0 ) for m0 = 0.005, m0 = ml and m0 = ml + 0.005; Right panel: Homoclinic orbit to 0 for m0 = ml . In both panels we have fixed Ω = 10, ω = 9.99, which implies ml = 0.0092.
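The value $m_l = 0.0092$ quoted in the caption of Fig. 4 can be recovered directly from (76) and (78). The sketch below uses hypothetical function names (`sigma_stable`, `m_l`) and only the Python standard library; it also checks that the shear (74) with $m_0 = m_l$ sends the unstable direction $(1, \sigma^{-1})$ onto the stable direction $(1, \sigma)$.

```python
import math


def sigma_stable(omega, Omega):
    """Stable eigenvalue sigma of (76), defined for 0 < omega < Omega."""
    mu = omega**2 - Omega**2
    return 1 - mu / 2 - math.sqrt(mu**2 - 4 * mu) / 2


def m_l(omega, Omega):
    """Critical mass defect m_l(omega) of (78)."""
    s = sigma_stable(omega, Omega)
    return (1 / s - s) / omega**2


if __name__ == "__main__":
    Omega, omega = 10.0, 9.99          # values used in Fig. 4
    s = sigma_stable(omega, Omega)
    m = m_l(omega, Omega)
    print(round(m, 4))
    # Image of the unstable direction (1, 1/sigma) under the shear A(omega, m_l):
    image = (1.0, 1 / s - omega**2 * m)
    print(image[1], s)                 # second component should equal sigma
```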
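Now keeping fixed $m_0 > 0$, condition (77) can be rewritten $\omega = \omega_l(m_0)$, where (for $\omega < \Omega$)
\[ \omega_l^2 = \begin{cases} \dfrac{1}{1 - m_0^2}\left[\Omega^2 + 2 - (4 + m_0^2\Omega^2(\Omega^2 + 4))^{1/2}\right], & m_0 \neq 1, \\[2mm] \dfrac{1}{2}(\Omega^2 + 2) - \dfrac{2}{\Omega^2 + 2}, & m_0 = 1. \end{cases} \tag{79} \]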
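Formula (79) is the inverse of the relation $m_0 = m_l(\omega)$ of (77)–(78), and this can be checked numerically by a round trip. The following sketch uses hypothetical function names (`omega_l`, `m_l_of_omega`) and illustrative parameter values; it is a consistency check, not part of the original analysis.

```python
import math


def omega_l(m0, Omega):
    """Linear localized mode frequency omega_l(m0) from (79), for m0 > 0."""
    O2 = Omega**2
    if m0 == 1:
        w2 = 0.5 * (O2 + 2) - 2 / (O2 + 2)
    else:
        w2 = (O2 + 2 - math.sqrt(4 + m0**2 * O2 * (O2 + 4))) / (1 - m0**2)
    return math.sqrt(w2)


def m_l_of_omega(omega, Omega):
    """Critical defect m_l(omega) from (76) and (78)."""
    mu = omega**2 - Omega**2
    sigma = 1 - mu / 2 - math.sqrt(mu**2 - 4 * mu) / 2
    return (1 / sigma - sigma) / omega**2


if __name__ == "__main__":
    Omega = 10.0
    for omega in (9.99, 9.9, 9.0):
        m0 = m_l_of_omega(omega, Omega)
        print(omega, omega_l(m0, Omega))   # the two columns should agree
```

For $m_0 > 0$ the computed $\omega_l$ lies below $\Omega$, in agreement with the statement that the linear localized mode frequency lies below the phonon band when mass is increased at the defect.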
The solutions of (75) homoclinic to 0 are spanned by $\beta_n = \sigma^{|n|}$, and the corresponding solutions of (6) in the linear case read $y_n(t) = \beta_n \cos t$ with $\omega = \omega_l(m_0)$. One recovers a classical result, i.e. if mass is increased at the defect then the linear localized mode frequency lies below the phonon band and is given by $\omega_l$.

Now let us consider the effects of nonlinear terms. For this purpose, we start with the simplest case of a hard potential, i.e. $B > 0$. The situation when $B < 0$ is far more complex and will be investigated later.

4.1.2. Nonlinear defect modes for hard on-site potentials

If $B > 0$, $W^s(0)$ and $W^u(0)$ do not intersect (except at the origin) for $0 < \omega < \Omega$. Indeed, one can show by induction that $|\beta_n| > |\beta_{n-1}| > 0$ for any nontrivial orbit on $W^u(0)$, which implies $W^u(0)$ lies inside the sector formed by the lines $\alpha = \beta$ and $\alpha = 0$. In the same way, $W^s(0)$ lies inside the sector formed by the lines $\alpha = \beta$ and $\beta = 0$, hence it does not intersect $W^u(0)$. The above property also implies that $W^u(0)$ can be defined (globally) as the graph $\alpha = g(\beta)$ of an increasing function $g$, and the same holds true for $W^s(0) = RW^u(0)$ on which $\beta = g(\alpha)$. For fixed $\omega \in (0, \Omega)$, the local unstable manifold can be approximated by $\alpha = g(\beta) = \sigma\beta + b\beta^3 + O(|\beta|^5)$, with $b = \sigma^2(\sigma^2 - \sigma^{-2})^{-1}B < 0$ (this coefficient can be computed by a classical identification procedure, using the fact that $W^u(0)$
is invariant under $G_\omega$). Consequently, $W^s(0)$ and $W^u(0)$ have the local shape represented in Fig. 5. The same situation occurs in the limit $\omega \approx \Omega$ (one can locally approximate the map $G_\omega$ up to any order in $U, \mu$ using the time-one map of an integrable flow [8], which allows one to determine the shape of $W^s(0)$ and $W^u(0)$ close to $U_n = 0$).

Fig. 5. Case $B > 0$ and $\omega < \Omega$. Stable and unstable manifolds (in the half plane $\alpha > 0$), and image of the unstable manifold by $A(\omega, m_0)$ for $m_0 = m_l$ and $m_0 = 0.05 > m_l$. We have fixed $\Omega = 10$ and $\omega = 9.9$.

In the case when $m_0 \leq 0$, the curves $W^s(0)$ and $A(\omega, m_0)W^u(0)$ do not intersect ($A(\omega, m_0)W^u(0)$ remains inside the sector formed by the lines $\alpha = \beta$ and $\alpha = 0$). However, $W^s(0)$ and $A(\omega, m_0)W^u(0)$ intersect if $m_0 > 0$ provided
\[ m_0 > m_l(\omega), \tag{80} \]
which means there exists an orbit homoclinic to 0 for Eq. (70). This property is clear for $m_0 \approx m_l$ where there exists a unique intersection point (in the half plane $\alpha > 0$) close to $U = 0$ due to the local shape of $W^s(0)$ and $W^u(0)$. Moreover, we numerically find a unique intersection point for all values of $m_0$ satisfying (80). Condition (80) is equivalent to
\[ \omega_l < \omega < \Omega. \tag{81} \]
The amplitude of the homoclinic orbit is $O(\sqrt{\omega - \omega_l})$ as $m_0$ is fixed and $\omega \to \omega_l$ (at the limit $A(\omega, m_0)W^u(0)$ and $W^s(0)$ become tangent at the origin), and its spatial decay rate $\sigma$ is given by Eq. (76). This homoclinic orbit corresponds for Eq. (6) to a nonlinear defect mode, i.e. a nonlinear analogue of the above-mentioned
linear localized mode. This solution can be approximated by $y_n(t) \approx \beta_n \cos t$ for $m_0 \approx 0$ and, contrary to the linear case, its frequency $\omega$ varies with amplitude. In Sec. 4.1.5, we show the persistence of the above homoclinic solution $\beta_n$ for the full normal form (20) and the existence of corresponding small amplitude solutions of (5) (see Theorem 8). Alternatively, these solutions could be obtained using an infinite-dimensional version of the Lyapunov center theorem (see Sec. 4.1.5 for more details).

Lastly, let us notice that the above homoclinic orbit possesses the symmetry $\beta_{-n} = \beta_n$, or equivalently $RU_{-n+1} = U_n$. It suffices to check the latter relation for $n = 0$ to prove it for any $n$, since both solutions $RU_{-n+1}$ and $U_n$ coincide if they satisfy the same initial condition. Since $U_0$ lies on the unstable manifold we have $\alpha_0 = g(\beta_0)$, and in the same way $\beta_1 = g(\alpha_1)$ since $U_1$ lies on the stable manifold. Since by definition $\alpha_1 = \beta_0$, this implies $\alpha_0 = \beta_1$ and thus $RU_1 = U_0$. Using the properties $U_1 = G_\omega(U_0) - \omega^2 m_0(0, \beta_0)^T$ and $RU_1 = U_0$ we also deduce the relations
\[ 2\alpha_0 = [2 + \Omega^2 - \omega^2(m_0 + 1)]\beta_0 + B\beta_0^3 \tag{82} \]
\[ 2\beta_1 = [2 + \Omega^2 - \omega^2(m_0 + 1)]\alpha_1 + B\alpha_1^3, \tag{83} \]
which are useful in particular for the numerical computation of $U_0, U_1$.

4.1.3. Nonlinear defect mode with algebraic decay

In the situation of Sec. 4.1.2 ($B > 0$), the case when $m_0$ is fixed and $\omega \to \Omega$ deserves special attention. Indeed, the homoclinic orbit $(\alpha_n, \beta_n)$ converges in this limit towards a solution having an algebraic decay as $n \to \pm\infty$. More precisely, if $\omega = \Omega$ and $m_0 > 0$ then $W^s(0)$ and $A(\omega, m_0)W^u(0)$ intersect at a unique point $(\alpha_1, \beta_1)$ in the half plane $\alpha > 0$ (see Fig. 6). This can be checked analytically for $m_0 \approx 0$ and $(\alpha, \beta) \approx 0$, since the unstable manifold can be locally parametrized by $\alpha = \beta - (B/2)^{1/2}\beta^2 + O(|\beta|^3)$ in the half plane $\alpha > 0$ (this expansion follows from a classical identification procedure). Using this relation for $(\alpha_0, \beta_0)$ in conjunction with (82) we find as $m_0 \to 0$
\[ \beta_0 = \Omega^2(2B)^{-1/2}m_0 + O(m_0^2). \tag{84} \]
Note that in this non-hyperbolic case, the function $g$ having the unstable manifold as its graph is not $C^2$ at $\beta = 0$ (in the half plane $\alpha < 0$, one has $\alpha = \beta + (B/2)^{1/2}\beta^2 + O(|\beta|^3)$ on the local unstable manifold). Far from the small amplitude limit, we have also checked numerically the existence and uniqueness of $W^s(0) \cap A(\omega, m_0)W^u(0)$ in the half plane $\alpha > 0$. Consequently, there exists a solution of (70) homoclinic to 0 for $\omega = \Omega$ and $m_0 > 0$. This solution has an algebraic decay due to the fact that the origin is no longer a hyperbolic fixed point for $\omega = \Omega$.

One can approximate the solution profile for $m_0 \approx 0$, using the fact that (70) admits at both sides of $n = 0$ a continuum limit. Indeed, setting
\[ \beta_n \approx m_0\beta(x), \qquad x = m_0 n, \tag{85} \]
one obtains the following differential equation
\[ \frac{d^2\beta}{dx^2} = B\beta^3, \qquad x \in (-\infty, 0) \text{ or } (0, +\infty), \]
from which we deduce (multiply by $d\beta/dx$ and integrate)
\[ \frac{d\beta}{dx} = -\mathrm{sign}(x)(B/2)^{1/2}\beta^2 \tag{86} \]
since $\beta(x) \to 0$ as $x \to \pm\infty$. Using (86) and (84) one obtains the following approximation of the homoclinic solution for $m_0 \approx 0$
\[ \beta_n \approx m_0\sqrt{\frac{2}{B}}\left(m_0|n| + \frac{2}{\Omega^2}\right)^{-1}. \tag{87} \]
This yields an approximate solution $y_n(t) \approx \beta_n \cos t$ of (6), corresponding to a breather with an algebraic decay and a frequency $\omega = \Omega$ at the bottom of the phonon band.

Fig. 6. Case $B > 0$ and $\omega = \Omega$. Stable and unstable manifolds (in the half plane $\alpha > 0$), and image of the unstable manifold by $A(\omega, m_0)$ for $m_0 = 0.05$. We have fixed $\Omega = 10$ in this example.

4.1.4. Case of soft on-site potentials

In this section, we make some considerations on the case $B < 0$ (soft on-site potential $V$) which is far more complex. We fix the parameter $\omega$ in Eq. (70) and let $m_0$ vary. Using a geometrical argument, we show that two (asymmetric) homoclinic
solutions of (70) having one hump near the defect site $n = 0$ disappear through a tangent bifurcation, at a critical value of $m_0$ which can be estimated. As we shall later numerically check (see Sec. 5), the same features occur for the Klein–Gordon model which was locally reduced to (20), i.e. to a higher order perturbation of (70). In addition, we show (again using a geometrical argument) that a symmetric solution $\beta_n$ of (70) homoclinic to 0 and centered at $n = 0$ disappears through a pitchfork bifurcation with $-\beta_n$ for $m_0 = m_l(\omega)$. This bifurcation persists for the full normal form (20) as we shall see in Sec. 4.1.5. In the present section we only study the simplest homoclinic bifurcations that occur in the soft potential case when $m_0$ is varied, but an infinity of tangent bifurcations occur in fact due to the complicated structure of the stable and unstable manifolds of the origin.

In order to treat the case when $m_0 \approx 0$, we start by fixing $m_0 = 0$ and proceed perturbatively. For $m_0 = 0$, $\mu < 0$ and $B < 0$, Eq. (70) possesses homoclinic solutions to 0. This case has been analyzed in several references with different viewpoints and for different parameter ranges, see e.g. [68, 37, 4, 63, 82, 44]. The dynamics of the map $G_\omega$ is rather complex due to the fact that the stable and unstable manifolds of the origin intersect transversally in general (see Figs. 7 and 12). This implies the existence of an invariant Cantor set on which some iterate $G_\omega^p$ is topologically conjugate to a full shift on $N$ symbols [8], which yields a rich variety of solutions and in particular an infinity of homoclinic orbits to 0.

Among these different homoclinic orbits one can point out two particular ones $U_n^i = (\alpha_n^i, \beta_n^i)^T$ ($i = 1, 2$), corresponding for the Klein–Gordon chain to breather solutions with a single hump near $n = 0$ (site-centered or bond-centered). These solutions have been described in Lemma 2 and Theorem 5 in the small amplitude limit. The corresponding homoclinics $U_n^1, U_n^2$ are reversible, i.e. they satisfy $RU^2_{-n+1} = U^2_n$ ($\beta^2_{-n} = \beta^2_n$) and $RU^1_{-n+2} = U^1_n$ ($\beta^1_{-n+1} = \beta^1_n$). In Fig. 7, the point with label 2 lying on the axis $\alpha = \beta$ corresponds to $U_1^1$, and the points with labels 3, 1 correspond to $U_0^2, U_1^2$, respectively. Obviously any translation of $U_n^i$ generates a breather solution of (6) having its maximal amplitude near a different site.

Now let us consider the situation when $\omega$ is kept fixed and a small mass defect $m_0$ is introduced in (70). As illustrated in Fig. 7, each of the above solutions is structurally stable. For example, let us consider in Fig. 7 the intersection points 1, 2, 3 between $W^u(0)$ and $W^s(0)$. Each of these intersections persists (points 1′, 2′, 3′ in Fig. 7) as the linear shear $A(\omega, m_0)$ is applied to $W^u(0)$ for $m_0 \approx 0$ (dashed line in Fig. 7). Let us examine the corresponding solutions of (71) and the related breather solutions of the Klein–Gordon model.

We denote by $\tilde U_n^2 = (\tilde\alpha_n^2, \tilde\beta_n^2)^T$ the solution of (71) with initial data $\tilde U_1^2$ at the point 1′. This solution is homoclinic to 0 according to Lemma 4. Repeating an argument of Sec. 4.1.2, one can show that $R\tilde U^2_{-n+1} = \tilde U^2_n$, i.e. $\tilde\beta^2_{-n} = \tilde\beta^2_n$. Consequently, $\tilde U_n^2$ corresponds to an (approximate) breather solution of (6) centered at the defect site $n = 0$. This solution is a small deformation of the site-centered breather $y_n^2$ of Theorem 5.
Fig. 7. First intersection points between the stable and unstable manifolds for parameters ω = 9.9 (µ = −1.99) and B = −75. The dashed line depicts the image of the unstable manifold by the linear shear A(ω, m0 ) for m0 = 0.005.
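A rough way to locate the symmetric intersection point (label 2 in Fig. 7) is to follow the unstable manifold numerically until it meets the line $\alpha = \beta$. The sketch below is only one possible approach and is not the procedure used to produce Fig. 7: it approximates the local unstable manifold by its tangent direction $\beta = \sigma^{-1}\alpha$ at a very small amplitude, iterates $G_\omega$ of (72) with $m_0 = 0$, and uses bisection on a fundamental segment. The names `first_diagonal_crossing`, `iterate_from` and the starting amplitude `s0` are assumptions made for illustration.

```python
import math

OMEGA, omega, B = 10.0, 9.9, -75.0        # parameter values of Fig. 7
MU = omega**2 - OMEGA**2                  # approximately -1.99
LAM = 1 - MU / 2 + math.sqrt(MU**2 - 4 * MU) / 2   # unstable eigenvalue 1/sigma


def G(alpha, beta):
    """The map G_omega of (72)."""
    return beta, -alpha + 2 * beta - MU * beta + B * beta**3


def iterate_from(s, n_steps):
    """Start at (s, LAM*s) on the unstable eigendirection and apply G n_steps times."""
    a, b = s, LAM * s
    for _ in range(n_steps):
        a, b = G(a, b)
    return a, b


def first_diagonal_crossing(s0=1e-7, max_steps=60):
    """Approximate the first point where W^u(0) meets the line alpha = beta."""
    # Count the steps after which the orbit of s0 first falls on or below the diagonal.
    n, (a, b) = 0, (s0, LAM * s0)
    while b - a > 0:
        a, b = G(a, b)
        n += 1
        if n > max_steps:
            raise RuntimeError("no crossing found")
    # beta - alpha after n steps changes sign for s in [s0/LAM, s0]: bisect on s.
    lo, hi = s0 / LAM, s0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid == lo or mid == hi:
            break
        am, bm = iterate_from(mid, n)
        if bm - am > 0:
            lo = mid
        else:
            hi = mid
    return iterate_from(hi, n)


if __name__ == "__main__":
    alpha, beta = first_diagonal_crossing()
    print(alpha, beta)   # a point of W^u(0) on the diagonal alpha = beta
```

By the reversibility of $G_\omega$ under $R$, a point of $W^u(0)$ lying on the line $\alpha = \beta$ also belongs to $W^s(0) = RW^u(0)$, so it is a homoclinic point of the bond-centered type.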
Now let us denote by $\tilde U_n^1$ the homoclinic solution of (71) with initial data $\tilde U_1^1$ at the point 2′. It corresponds to an (approximate) breather solution of (6), whose profile is a small deformation of the breather $y_n^1$ centered between $n = 0$ and $n = 1$ (see Theorem 5). Since $\tilde U_1^1$ does not belong to the line $\alpha = \beta$ (it lies at a distance $O(|m_0|)$), the corresponding breather solution is not symmetric any more, which was expected since the atomic masses at $n = 0, 1$ are different.

Lastly we note $\tilde U_n^3 = (\tilde\alpha_n^3, \tilde\beta_n^3)^T$ the homoclinic solution of (71) with initial data $\tilde U_1^3$ at the point 3′. Since $\tilde U_1^3$ is $O(|m_0|)$-close to $U_0^2$ (point with label 3), $\tilde U_n^3$ is a small deformation of the solution $U^2_{n-1}$ existing for $m_0 = 0$. In other words, $\tilde U_n^3$ corresponds to a small deformation of the breather $y^2_{n-1}$ centered at $n = 1$. The mass defect at $n = 0$ breaks the mirror symmetry of the solution, since its amplitude has only the imperfect symmetry $\tilde\beta^3_{-n+1} - \tilde\beta^3_{n+1} = O(|m_0|)$ for $n \neq 0$.

A more delicate question concerns the continuation and the possible bifurcations of the above homoclinic solutions as $m_0$ is further varied. The evolution of $\tilde U_n^1, \tilde U_n^2, \tilde U_n^3$ depends on the structure of the homoclinic windings near $U_1^1, U_1^2, U_0^2$. Numerically we find that the lobes formed near these points by the stable and unstable manifolds have the structure shown in Fig. 7. These manifold windings can be analytically approximated as explained in [37, Sec. 3.5] or [36, Sec. 4].

At a critical value $m_0 = m_c(\omega) > 0$, the points with labels 2′ and 3′ on $W^s(0) \cap A(\omega, m_0)W^u(0)$ collide as $W^s(0)$ and $A(\omega, m_0)W^u(0)$ become tangent. Consequently, the solutions $\tilde U_n^3$ and $\tilde U_n^1$ disappear through a tangent bifurcation above this critical value of $m_0$.
Obviously, since we consider the truncated map (71) instead of the full recurrence relation (20), these solutions only correspond to approximate breather solutions of (6). It should be hard to prove that the above tangent bifurcation of homoclinic solutions persists for the full reduced equation (20), because it involves phenomena beyond all algebraic orders for µ ≈ 0. Indeed, for the truncated map (71) with m0 = 0, the splitting of W s (0) and W u (0) lies beyond all orders in µ. This is due to the fact that the map (72) can be approximated up to an arbitrary order in (Un , µ) using the time-one map of an integrable flow [8], for which W s (0) and W u (0) coincide and form a pair of symmetric homoclinic loops. The analysis of the above tangent bifurcation, which requires to estimate the splitting distance between W s (0) and W u (0) and the angles at their intersection, would therefore involve difficult beyond all orders asymptotics. In particular, the critical value of m0 at which tangent bifurcation occurs for the truncated map (71) lies beyond all orders in µ (since it is of the order of the splitting distance between W s (0) and W u (0)), and the same phenomenon can be expected when the higher order terms of (20) are taken into account. Analytical results on the exponentially small splitting of separatrices have been derived for some families of analytic maps (see [32, 23, 25] and references therein), but our case is more complex since (for an analytic on-site potential V ) the center manifold reduction breaks in general the analyticity of the reduced equation. A strategy to tackle this problem would be to proceed as in [56, Sec. 8], where the center manifold reduction is replaced by an infinite-dimensional normal form reduction. This would lead to difficult analytical problems which lie beyond the scope of this paper. Instead, in Sec. 
5 we shall check numerically that the above tangent bifurcation occurs for breather solutions of (6) close to our approximate solutions, at a critical value of m0 close to $m_c(\omega)$.

In what follows we give a simple method to estimate $m_c(\omega)$, which is based on a simple approximation of $W^u(0)$. Note that the method does not work in the limit µ ≈ 0 in which the center manifold reduction is achieved, but fits quite well our numerical computations in a different parameter regime. Let us consider a cubic approximation $W^u_{app}$ of the local unstable manifold of Fig. 7, parametrized by $\beta = \lambda\alpha - c^2\alpha^3$. The coefficient c depends on µ and B and need not be specified in what follows (a value of c suitable when λ is large is computed in [36, Eq. (60)]). We denote by $\lambda = \sigma^{-1} = 1 - \mu/2 + \sqrt{\mu^2 - 4\mu}/2$ the unstable eigenvalue. We have
\[ \beta = \lambda_0\alpha - c^2\alpha^3 \tag{88} \]
on the curve $A(\omega, m_0) W^u_{app}$, where $\lambda - \omega^2 m_0 = \lambda_0$. By symmetry we can approximate the local stable manifold using the curve $W^s_{app}$ parametrized by
\[ \alpha = \lambda\beta - c^2\beta^3. \tag{89} \]
The curves $A(\omega, m_0) W^u_{app}$ and $W^s_{app}$ become tangent at (α, β) when in addition
\[ (\lambda - 3c^2\beta^2)(\lambda_0 - 3c^2\alpha^2) = 1. \tag{90} \]
In order to compute m0 = mc as a function of ω, or, equivalently, the corresponding value of λ0 as a function of λ, one has to solve the nonlinear system
(88)–(90) with respect to α, β, λ0, which yields a solution depending on λ. Instead of using λ it is practical to parametrize the solutions by t = β/α. This yields
\[ \alpha = \frac{1}{c\sqrt{2}}\left(t + \frac{1}{t^3}\right)^{1/2}, \qquad \beta = \frac{t}{c\sqrt{2}}\left(t + \frac{1}{t^3}\right)^{1/2}, \]
\[ \lambda_0 = \frac{3}{2}\,t + \frac{1}{2t^3}, \qquad \lambda = \frac{3}{2t} + \frac{1}{2}\,t^3. \]
Since $\mu = 2 - \lambda - \lambda^{-1}$ and $m_0 = (\lambda - \lambda_0)(\Omega^2 + \mu)^{-1}$ it follows
\[ \mu = 2 - \frac{3 + t^4}{2t} - \frac{2t}{3 + t^4}, \tag{91} \]
\[ m_0 = \frac{1}{2}\left(t - \frac{1}{t}\right)^3 (\Omega^2 + \mu)^{-1}. \tag{92} \]
Given a value of $\mu \in (-\Omega^2, -1/2)$, one can approximate $m_c$ by the value of m0 given by Eqs. (91), (92). For example, in the case numerically studied in Fig. 7 we have ω = 9.9 and µ = −1.99. Consequently, λ ≈ 3.721, t ≈ 1.7935 and λ0 ≈ 2.777, which yields $m_c$ ≈ 0.009632. A numerical study of the map yields $m_c$ ∈ (0.00963, 0.00964), and consequently our approximation works very well in this parameter regime. Moreover, the approximation is extremely close to the actual value of m0 at which a tangent bifurcation occurs between the corresponding breather solutions of the Klein–Gordon system (numerically we again find m0 ∈ (0.00963, 0.00964), see Sec. 5 for more details).

Although it gives precise numerical results in a certain parameter range, the approximation (91), (92) is not always valid. Indeed, the parameter regime µ > −1/2 is not described within this approximation. Moreover, one can check that $W^u_{app}$ intersects $W^s_{app}$ on the line α = β with an angle depending solely on λ, and not on the coefficient B (in particular, $W^u_{app}$ and $W^s_{app}$ become tangent for λ = 2). This problem could be solved by adding a quintic term $d\alpha^5$ in Eq. (88).

The intersection point with label 1 between $W^s(0)$ and $A(\omega, m_0) W^u(0)$ persists for $0 < m_0 < m_l(\omega)$, or equivalently $0 < \omega < \omega_l(m_0)$, and consequently the reversible homoclinic solution $\tilde U_n^2$ exists within this parameter range. At $\omega = \omega_l$, this solution disappears through a pitchfork bifurcation with the symmetric solution $-\tilde U_n^2$ (the amplitude of the homoclinic orbit is $O(\sqrt{|\omega - \omega_l|})$ as m0 is fixed and $\omega \to \omega_l$). This homoclinic orbit corresponds for Eq. (6) to a nonlinear defect mode, i.e. a nonlinear analogue of the linear localized mode of Sec. 4.1.1. The existence of exact small amplitude solutions of this type (with $\omega \approx \omega_l$) is proved in Sec. 4.1.5 (see Theorem 8). For m0 ≈ 0 and $\omega \approx \omega_l$, the breather solution of (6) can be approximated by $y_n(t) \approx \beta_n \cos t$, and the frequency ω varies with amplitude and lies below $\omega_l$.
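The estimate of $m_c$ based on Eqs. (91), (92) can be evaluated numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and the inversion of λ(t) by bisection is a choice made here (it relies on λ(t) = 3/(2t) + t³/2 being increasing for t > 1 with λ(1) = 2).

```python
import math


def unstable_eigenvalue(mu):
    """lambda = 1 - mu/2 + sqrt(mu^2 - 4 mu)/2, the unstable eigenvalue for mu < 0."""
    return 1.0 - mu / 2.0 + math.sqrt(mu * mu - 4.0 * mu) / 2.0


def critical_mass_defect(omega, Omega):
    """Approximate m_c from Eqs. (91)-(92); returns (t, lambda, lambda0, m_c).

    Hypothetical helper: it solves lambda = 3/(2t) + t^3/2 for t > 1 by
    bisection, then uses lambda0 = 3t/2 + 1/(2 t^3) and
    m_c = (lambda - lambda0) / omega^2.
    """
    mu = omega ** 2 - Omega ** 2
    lam = unstable_eigenvalue(mu)
    if lam <= 2.0:
        raise ValueError("the approximation requires mu < -1/2, i.e. lambda > 2")

    def f(t):
        return 1.5 / t + 0.5 * t ** 3 - lam

    lo, hi = 1.0, 2.0
    while f(hi) < 0.0:          # enlarge the bracket until it contains the root
        hi *= 2.0
    for _ in range(200):        # plain bisection on the increasing function f
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    lam0 = 1.5 * t + 0.5 / t ** 3
    m_c = (lam - lam0) / (Omega ** 2 + mu)
    return t, lam, lam0, m_c


if __name__ == "__main__":
    # Parameters of Fig. 7: omega = 9.9, Omega = 10 (mu = -1.99).
    print(critical_mass_defect(9.9, 10.0))
```

For the parameters of Fig. 7 the returned value of $m_c$ should fall in the interval (0.00963, 0.00964) quoted in the text; the last digits depend on how t and λ are rounded.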
More generally, the evolution of the set A(ω, m0 )W u (0) ∩ W s (0) as m0 varies is very complex, due to the complex shape of the stable and unstable manifolds and
the complicated structure of their intersection (see Fig. 12). In Sec. 5, we shall give some additional examples of breather bifurcations which can be deduced from the fine structure of the stable and unstable manifolds.

Note that previous studies have examined, for certain families of reversible two-dimensional maps, how parameter changes modify the intersections between the stable and unstable manifolds of the origin and the associated set of homoclinic solutions [15, 18]. These (autonomous) maps are directly obtained from the discrete nonlinear Schrödinger equation or generalized versions (due to their phase invariance), as one looks for oscillatory solutions with a single Fourier component. Although we obtain similar types of tangent bifurcations as defect strengths are varied, our situation is quite different since we are concerned with a nonautonomous map, where the impurity leads to consider a linear shear of the unstable manifold.

4.1.5. Persistence of a nonlinear defect mode

In Secs. 4.1.2 and 4.1.4, we have seen that symmetric homoclinic solutions corresponding to a nonlinear defect mode exist for the truncated normal form (70), both for hard and soft on-site potentials. In this section we prove that these solutions persist for the full system (20). They appear through a pitchfork bifurcation at the origin, when m0 > 0 is fixed (close to 0) and ω crosses the critical value $\omega_l(m_0)$. This bifurcation is supercritical for hard on-site potentials and subcritical for soft ones. As a consequence, the center manifold theorem yields the existence of corresponding defect modes for the Klein–Gordon system (3) (see Theorem 8 below).
In the case of a single mass defect $m_n = m_0\delta_{n0}$, the normal form (20) reads
\[ \beta_{n+1} - 2\beta_n + \beta_{n-1} + (\omega^2 m_0\delta_{n0} + \omega^2 - \Omega^2)\beta_n - B\beta_n^3 + R_n(\beta_{n-1}, \beta_n, m_0, \mu) = 0, \tag{93} \]
where $R_n(\beta_{n-1}, \beta_n, m_0, \mu) = O[(|\beta_{n-1}|^3 + |\beta_n|^3)(|m_0| + |\mu| + \beta_{n-1}^2 + \beta_n^2)]$ uniformly in $n \in \mathbb{Z}$ (since in the nonlinear term (64), $\|\tau_n\{\lambda\}\|_{\ell_\infty(\mathbb{Z})}$ does not depend on n). In the sequel we fix m0 > 0 close to 0 and vary $\omega^2$ (from now on we omit m0 in notations). For simplicity we shall note β the sequence $\{\beta_n\}$. Looking for solutions β of (93) in $\ell_2(\mathbb{Z})$, Eq. (93) takes the form
\[ F(\beta, \omega^2) = 0, \tag{94} \]
where $F : \ell_2(\mathbb{Z}) \times \mathbb{R}^+ \to \ell_2(\mathbb{Z})$ is $C^k$ in a neighborhood $B \times O$ of $(0, \Omega^2)$ and $F(-\beta, \omega^2) = -F(\beta, \omega^2)$ ($\mathbb{Z}_2$-symmetry). The neighborhood O can be fixed independently of m0 for m0 sufficiently small, since Theorem 3 yields a reduction result valid for $(\omega^2, m_0)$ in a neighborhood of $(\Omega^2, 0)$. Moreover, O contains the critical value $\omega_l(m_0)^2$ for m0 ≈ 0 since $\omega_l(0) = \Omega$.

Now we look for solutions of (94) bifurcating from β = 0. From the analysis of Sec. 4.1.1 it follows that $DF(0, \omega^2)$ has a nontrivial kernel if and only if $\omega^2 = \omega_l(m_0)^2$. The kernel of $L = DF(0, \omega_l^2)$ is one-dimensional and spanned by the
eigenvector ξ given by $\xi_n = \sigma^{|n|}$ (see Eq. (76) for the definition of σ). To determine the range R(L) of L, we note that L = T + A with
\[ (T\beta)_n = \beta_{n+1} - 2\beta_n + \beta_{n-1} + (\omega_l^2 - \Omega^2)\beta_n, \qquad (A\beta)_n = \omega_l^2 m_0\delta_{n0}\beta_n. \]
Since $\omega_l^2 < \Omega^2$, T is invertible in $\ell_2(\mathbb{Z})$ by Lax–Milgram's theorem. Since A is compact in $\ell_2(\mathbb{Z})$, it follows that L is Fredholm with index 0 and codim R(L) = 1. In addition, $R(L) = \xi^\perp$ since L is selfadjoint.

We now assume $\omega^2 \approx \omega_l^2$. The solutions of (94) near $(0, \omega_l^2)$ can be determined using classical results for bifurcations at a simple eigenvalue, based on a Lyapunov–Schmidt reduction (see e.g. [60, 61]). The $\mathbb{Z}_2$-symmetry of F and the nondegeneracy conditions
\[ (D^2_{\beta\omega^2} F(0, \omega_l^2)\cdot\xi, \xi)_{\ell_2} = m_0 + c^2 \neq 0, \]
\[ (D^3_\beta F(0, \omega_l^2)\cdot[\xi]^3, \xi)_{\ell_2} = -6Bb^2 \neq 0, \]
in which
\[ c^2 = \|\xi\|_{\ell_2}^2 = \frac{2}{1 - \sigma^2} - 1, \qquad b^2 = \sum_{n\in\mathbb{Z}} \xi_n^4 = \frac{2}{1 - \sigma^4} - 1, \tag{95} \]
guarantee that the set of solutions of (94) near $(0, \omega_l^2)$ consists in a pitchfork lying in a two-dimensional submanifold of $\ell_2(\mathbb{Z}) \times \mathbb{R}$ (see [61, Proposition 1.9, p. 438]). More precisely, the Lyapunov–Schmidt reduction yields the bifurcation equation
\[ \epsilon\,(m_0 + c^2)(\omega^2 - \omega_l^2) - Bb^2\epsilon^3 + \text{h.o.t.} = 0, \]
where $\omega^2 - \omega_l^2 \approx 0$ and $\epsilon \approx 0$ denotes the coordinate of small amplitude solutions β along the kernel of L. The local branch of nontrivial solutions of (94) can be therefore parametrized by
\[ \beta = \epsilon\,\xi + O(\epsilon^3) \quad \text{in } \ell_2(\mathbb{Z}), \tag{96} \]
\[ \omega^2 = \omega_l^2 + \frac{Bb^2}{m_0 + c^2}\,\epsilon^2 + O(\epsilon^4) \tag{97} \]
(note that $\omega^2$ is even in ε). The pitchfork bifurcation is supercritical for B > 0 (i.e. for hard on-site potentials) and subcritical for B < 0 (soft on-site potentials). In the degenerate case B = 0, a branch of solutions bifurcating from β = 0 still exists, and higher order terms of (97) determine the direction of bifurcation. As a corollary of (96), note that bifurcating solutions are also $O(|\epsilon|)$ in $\ell_\infty(\mathbb{Z})$.

Applying the center manifold Theorem 3, the homoclinic solutions (96), (97) of the reduced recurrence relation correspond to small amplitude solutions of (6), with $y_n \in H^2_\#$ for all $n \in \mathbb{Z}$ and $\lim_{n\to\pm\infty} \|y_n\|_{H^2_\#} = 0$. This yields the following existence result of a nonlinear defect mode in the original system (5).

Theorem 8. Consider the Klein–Gordon lattice (5), where the inhomogeneity lies in the mass parameter $M_n = m + m_0\delta_{n0}$ (with m, m0 > 0 and m0 ≈ 0) and all other
lattice parameters $D_n = d$, $A_n = a$, $K_n = k$ are constant (with d, a, k > 0). Assume the on-site potential V satisfies $V'(0) = 0$, $V''(0) = 1$ and note $\Omega^2 = a^2 d/k$.

(i) In the linear case $V(y) = \frac{1}{2}y^2$, Eq. (5) admits spatially localized solutions $y_n(t) = \epsilon\,\sigma^{|n|}\cos(\Omega_0 t)$ (linear defect mode), with frequency $\Omega_0 = \omega_l\sqrt{k/m}$ and $\omega_l(m_0)$ defined by (79). Their spatial decay is fixed by $\sigma(m_0) \in (0, 1)$, which is determined by Eq. (76) taken for $\omega = \omega_l$.

(ii) In the nonlinear case, assume in addition $\Omega^2 > 4/3$. Equation (5) admits a family of spatially localized solutions $y_n(t) = \epsilon\,\sigma^{|n|}\cos(\Omega_\epsilon t) + O(\epsilon^2)$ (nonlinear defect mode), parametrized by $\epsilon \approx 0$ (in a neighborhood of 0 whose size depends on m0), with frequency $\Omega_\epsilon = \Omega_0 + h\,\Omega_1\epsilon^2 + O(\epsilon^4)$, where $h = V^{(4)}(0) - \frac{5}{3}(V^{(3)}(0))^2$, $\Omega_1 = \dfrac{k\,\Omega^2 b^2 a^2}{16\,m\,\Omega_0\,(m_0 + c^2)} > 0$ and b, c are given by (95).
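The constants $c^2$, $b^2$ of (95) and the leading-order branch (97) are easy to check numerically. The sketch below is only illustrative: the function names are hypothetical, the value of σ passed in is left to the user (it is fixed by Eq. (76), which is not restated here), and the $O(\epsilon^4)$ remainder of (97) is simply dropped.

```python
def norms_closed_form(sigma):
    """Closed forms of c^2 = sum sigma^(2|n|) and b^2 = sum sigma^(4|n|), cf. (95)."""
    c2 = 2.0 / (1.0 - sigma ** 2) - 1.0
    b2 = 2.0 / (1.0 - sigma ** 4) - 1.0
    return c2, b2


def norms_by_summation(sigma, n_max=200):
    """Same quantities from truncated sums over -n_max <= n <= n_max."""
    c2 = sum(sigma ** (2 * abs(n)) for n in range(-n_max, n_max + 1))
    b2 = sum(sigma ** (4 * abs(n)) for n in range(-n_max, n_max + 1))
    return c2, b2


def branch_frequency_squared(eps, omega_l2, m0, B, sigma):
    """Leading-order omega^2 on the bifurcating branch, Eq. (97) without O(eps^4)."""
    c2, b2 = norms_closed_form(sigma)
    return omega_l2 + B * b2 / (m0 + c2) * eps ** 2


if __name__ == "__main__":
    # Illustrative values only (sigma = 0.2 is an assumption, not taken from the text).
    print(norms_closed_form(0.2), norms_by_summation(0.2))
    print(branch_frequency_squared(0.01, 96.76, 0.05, 75.0, 0.2))
```

With B > 0 the computed $\omega^2$ lies above $\omega_l^2$ and with B < 0 below it, which matches the supercritical and subcritical cases discussed before the theorem.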
To end this section, we compare our approach with another analysis of (5) based on the Lyapunov center theorem in its infinite-dimensional version. Under a nonresonance condition, i.e. when no multiple of $\Omega_0$ lies in the phonon band $[\omega_{\min}, \omega_{\max}]$, the Lyapunov center theorem ensures that $y_n = 0$ is contained in a two-dimensional invariant manifold of $\ell_2(\mathbb{Z})$ consisting of small amplitude periodic solutions, whose frequency tends to $\Omega_0$ as they approach the equilibrium (see e.g. [60] for more details on the Lyapunov center theorem). The nonlinear defect mode considered in Theorem 8 corresponds to a Lyapunov family of periodic orbits, and for small enough m0 > 0 the condition $\Omega^2 > 4/3$ implies the above nonresonance condition. Indeed, assuming $\Omega^2 > 4/3$ is equivalent to fixing $2\omega_{\min} > \omega_{\max}$ (recall $\omega_{\min}^2 = a^2 d/m$, $\omega_{\max}^2 = (a^2 d + 4k)/m$). Since $\lim_{m_0\to 0}\Omega_0 = \omega_{\min}$, one has $2\Omega_0 > \omega_{\max}$ when m0 > 0 and m0 ≈ 0, which establishes the nonresonance condition since in addition $\Omega_0 < \omega_{\min}$ (recall $\omega_l < \Omega$). More generally, one can see from the Lyapunov center theorem that a nonlinear defect mode exists when m0 is not necessarily small, provided the nonresonance condition is fulfilled.

When m0 → 0, the Lyapunov center theorem is not adequate to analyze (5) because it is valid in a neighborhood of $y_n = 0$ which may shrink to zero in the limit. This is due to the fact that the frequency $\Omega_0$ enters the continuous spectrum for m0 = 0, which violates the nonresonance condition. For example, the Lyapunov center theorem also asserts that the Lyapunov family of periodic orbits contains the only periodic solutions near $y_n = 0$ in $\ell_2(\mathbb{Z})$, with frequency close to $\Omega_0$. However, as seen in Sec. 4.1.4 for the principal part of the normal form (20), many spatially localized solutions can exist in the vicinity of the defect mode when m0 is small (see also Sec. 5 where these results are checked numerically for the full Klein–Gordon model).

By contrast, the center manifold reduction we employ is well adapted to determine bifurcating solutions for m0 ≈ 0 and frequencies close to $\omega_{\min}$, although their persistence for the full normal form would be hard to analyze here (except in the special case of Theorem 8).
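The equivalence between $\Omega^2 > 4/3$ and $2\omega_{\min} > \omega_{\max}$ used in the comparison above can be spelled out in one line. The computation below is a worked check that relies only on the band-edge expressions $\omega_{\min}^2 = a^2 d/m$, $\omega_{\max}^2 = (a^2 d + 4k)/m$ and on $\Omega^2 = a^2 d/k$:
\[
2\omega_{\min} > \omega_{\max}
\iff \frac{4a^2 d}{m} > \frac{a^2 d + 4k}{m}
\iff 3a^2 d > 4k
\iff \Omega^2 = \frac{a^2 d}{k} > \frac{4}{3}.
\]
Combined with $\Omega_0 < \omega_{\min}$, the inequality $2\Omega_0 > \omega_{\max}$ places $\Omega_0$ below the band and every higher multiple of $\Omega_0$ above it.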
4.2. Case of finitely many defects

This section generalizes the analysis of the above section to the case when Eq. (5) admits a finite number of inhomogeneities. More precisely, we assume in Eq. (6) $\eta_n = \gamma_n = \kappa_n = \epsilon_n = 0$ if $|n| \geq n_0 + 1$, for a given integer $n_0 \geq 0$. Note that this assumption allows one to cover the case of an odd number of defects as well as an even number. The situation is more complex than in Sec. 4.1, because studying homoclinic solutions of (69) leads to finding the intersections of the stable manifold with the image of the unstable manifold under a nonlinear transformation (see Lemma 5 below). However, one can recover the linear case if one replaces the relevant spatial map by a suitable one, both being equal at leading order only. In what follows we shall develop this leading order theory, considering as higher order terms all terms being $o(\|(\alpha_n, \beta_n)\|^3)$. Equation (69) reads
\[ \beta_{n+1} - 2\beta_n + \beta_{n-1} = (\theta_n - \mu)\beta_n - \kappa_n\beta_{n-1} + B\beta_n^3, \tag{98} \]
where $\theta_n = \Omega^2(\eta_n + \eta_n\gamma_n + \gamma_n) - \omega^2\epsilon_n + \kappa_n$, $\omega^2 = \Omega^2 + \mu$. In the sequel, we shall note $\varepsilon = \|\{\theta\}\|_{\ell_\infty(\mathbb{Z})} + \|\{\kappa\}\|_{\ell_\infty(\mathbb{Z})}$. Setting $\beta_{n-1} = \alpha_n$ and $U_n = (\alpha_n, \beta_n)^T$, Eq. (98) can be rewritten
\[ U_{n+1} = F_n(U_n), \tag{99} \]
\[ F_n(\alpha, \beta) = \begin{pmatrix} \beta \\ -(1 + \kappa_n)\alpha + (2 + \theta_n - \mu)\beta + B\beta^3 \end{pmatrix}. \tag{100} \]
Noting $F = G_\omega$ for simplicity (see definition (72)), one can observe that
\[ F_n = (I + T_n)F + O(|\kappa_n||\beta|^3), \qquad T_n = \begin{pmatrix} 0 & 0 \\ \theta_n + (\mu - 2)\kappa_n & \kappa_n \end{pmatrix}. \tag{101} \]
Note that higher order terms are absent from Eq. (101) if $\kappa_n = 0$. Since $F_n = F$ for $|n| \geq n_0 + 1$ one has the following property.

Lemma 5. Fix µ < 0 and denote by $W^s(0)$, $W^u(0)$ the stable and unstable manifolds of the fixed point U = 0 of F. Consider the nonlinear map $G = F_{n_0} \circ F_{n_0-1} \circ \cdots \circ F_{-n_0} \circ F^{-2n_0-1}$. Equation (99) possesses a homoclinic orbit to 0 if and only if $W^s(0)$ and $G(W^u(0))$ intersect.

Lemma 5 is hard to use for analyzing homoclinic solutions since it involves a nonlinear transformation G instead of a linear one as in Lemma 4. However one can recover the linear case when replacing $F_n$ by a suitable approximation $\hat F_n$, equal to $F_n$ up to higher order terms. This is possible thanks to property (104) of Lemma 6
below. In the sequel, we note
\[ L = DF(0) = \begin{pmatrix} 0 & 1 \\ -1 & 2 - \mu \end{pmatrix}. \]

Lemma 6. Consider the collection of maps $\hat F_n$ ($-n_0 \leq n \leq n_0$) defined by
\[ \hat F_n = A_n F \circ A_{n-1}^{-1}, \tag{102} \]
where $A_{-n_0-1} = I$ and for $n \geq -n_0$
\[ A_n = L_n L_{n-1} \cdots L_{-n_0} L^{-n-n_0-1}, \qquad L_n = (I + T_n)L. \tag{103} \]
The map $\hat F_n$ is a leading order approximation of $F_n$, i.e. $\hat F_n = F_n + O(\varepsilon\|(\alpha, \beta)\|^3)$. Moreover one has the property
\[ \hat F_{n_0} \circ \hat F_{n_0-1} \circ \cdots \circ \hat F_{-n_0} = A F^{2n_0+1}, \tag{104} \]
where $A = A_{n_0}$ reads
\[ A = L_{n_0} L_{n_0-1} \cdots L_{-n_0} L^{-2n_0-1} = I + O(\varepsilon). \tag{105} \]

Proof. First we note that the sequence $A_n$ satisfies $A_{-n_0} = I + T_{-n_0}$ and
\[ A_{n+1} = (I + T_{n+1}) L A_n L^{-1} \tag{106} \]
for all $n \geq -n_0 - 1$. It follows for $-n_0 \leq n \leq n_0$
\[ \hat F_n = (I + T_n) L A_{n-1} L^{-1} F \circ A_{n-1}^{-1}. \tag{107} \]
Now let us note that $A_n = I + O(\varepsilon)$. Moreover, the following identity holds true for any parameter-dependent matrix $M \in M_2(\mathbb{R})$ with $M = O(\varepsilon)$
\[ F \circ (I + M) = L(I + M)L^{-1} F + O(\varepsilon\|(\alpha, \beta)\|^3). \tag{108} \]
Consequently one has also $F = L(I + M)L^{-1} F \circ (I + M)^{-1} + O(\varepsilon\|(\alpha, \beta)\|^3)$. Using this property in Eq. (107) leads to $\hat F_n = (I + T_n)F + O(\varepsilon\|(\alpha, \beta)\|^3)$. Using (101) this yields $\hat F_n = F_n + O(\varepsilon\|(\alpha, \beta)\|^3)$, therefore $\hat F_n$ is a leading order approximation of $F_n$. Property (104) follows directly from the definition of $\hat F_n$.

It is worthwhile stressing that A = DG(0), where G is the nonlinear transformation introduced in Lemma 5. Now we fix in addition $\hat F_n = F = F_n$ for $|n| \geq n_0 + 1$. According to Lemma 6 we have also $\hat F_n = F_n + O(\varepsilon\|(\alpha, \beta)\|^3)$ for $|n| \leq n_0$. In the sequel we approximate system (99) by the new one
\[ U_{n+1} = \hat F_n(U_n). \tag{109} \]
Property (104) implies the following result, since $W^u(0)$ is invariant under $F^{2n_0+1}$.
Lemma 7. Fix µ < 0 and denote by $W^s(0)$, $W^u(0)$ the stable and unstable manifolds of the fixed point U = 0 of F. Equation (109) possesses a solution $U_n$ homoclinic to 0 if and only if $W^s(0)$ and $A(W^u(0))$ intersect, where the matrix $A = I + O(\varepsilon)$ is defined in Lemma 6. The intersection point corresponds to $U_{n_0+1}$.

Consequently, as in Sec. 4.1 one recovers the problem of finding the intersection of $W^s(0)$ with the image of $W^u(0)$ under the (near-identity) linear transformation A. Note that $A = I + T_0$ in the single defect case $n_0 = 0$. Here we shall not attempt to relate the bifurcations of breather solutions of (6) with the properties of the inhomogeneities, via an analysis of homoclinic solutions of (109). This question will be considered in future works using the simplification provided by Lemma 7. As for a single defect, for B < 0 one can expect multiple tangent bifurcations between (deformations of) site-centered and bond-centered breathers as inhomogeneities are varied, due to the winding structure of $W^u(0)$ and $W^s(0)$.

It is now interesting to compute the leading order contribution of the sequence of inhomogeneities to the matrix A. This is the object of the following lemma.

Lemma 8. The matrix A of Lemma 6 takes the form $A = I + M + O(\varepsilon^2 + \varepsilon|\mu|)$, where
\[ M = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix}, \]
\[ M_{11} = \sum_{n=0}^{2n_0} n(n+1)\rho_{n_0-n} - n\kappa_{n_0-n}, \qquad M_{12} = \sum_{n=0}^{2n_0} -n^2\rho_{n_0-n} + n\kappa_{n_0-n}, \]
\[ M_{21} = \sum_{n=0}^{2n_0} (n+1)^2\rho_{n_0-n} - (n+1)\kappa_{n_0-n}, \qquad M_{22} = \sum_{n=0}^{2n_0} -n(n+1)\rho_{n_0-n} + (n+1)\kappa_{n_0-n}, \]
\[ \rho_n = \Omega^2(\eta_n + \gamma_n - \epsilon_n). \]

Proof. Since $T_n = O(\varepsilon)$ it follows from definition (105)
\[ A = I + \sum_{n=0}^{2n_0} L^n T_{n_0-n} L^{-n} + O(\varepsilon^2). \tag{110} \]
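Now we use the expansions
\[ T_n = M_n + O(\varepsilon^2 + \varepsilon|\mu|), \qquad L = L_c + O(|\mu|), \]
\[ M_n = \begin{pmatrix} 0 & 0 \\ \rho_n - \kappa_n & \kappa_n \end{pmatrix}, \qquad L_c = \begin{pmatrix} 0 & 1 \\ -1 & 2 \end{pmatrix}, \]
to obtain
\[ A = I + M + O(\varepsilon^2 + \varepsilon|\mu|), \qquad M = \sum_{n=0}^{2n_0} L_c^n M_{n_0-n} L_c^{-n}, \]
where
\[ L_c^n = \begin{pmatrix} -n+1 & n \\ -n & n+1 \end{pmatrix}. \]
Then simple computations lead to the coefficients of M provided above.

Interestingly, Lemma 8 shows that the influence of the inhomogeneities on the set of homoclinic solutions depends at leading order (via the matrix I + M) on algebraically-weighted averages of $\{\kappa\}$ and $\{\rho\}$. Now let us return to the original parameters $m_n$, $d_n$, $a_n$, $k_n$ describing the lattice inhomogeneities (see Eqs. (5) and (6)), with $m_n = d_n = a_n = 0$ for $|n| \geq n_0 + 1$, $k_n = 0$ for $n \leq -n_0 - 1$ and $n \geq n_0$. Let us note $\tilde\varepsilon = \|\{m_n/m\}\|_\infty + \|\{d_n/d\}\|_\infty + \|\{a_n/a\}\|_\infty + \|\{k_n/k\}\|_\infty$. One obtains
\[ \kappa_n = \frac{k_{n-1} - k_n}{k} + O(\tilde\varepsilon^2), \qquad \rho_n = r_n + O(\tilde\varepsilon^2), \]
where
\[ r_n = \Omega^2\left(\frac{d_n}{d} + 2\,\frac{a_n}{a} - \frac{m_n}{m}\right) \]
is a linear combination of the on-site potential and mass defect impurities. Some coefficients of M can be simplified since
\[ \sum_{n=0}^{2n_0} \kappa_{n_0-n} = O(\tilde\varepsilon^2), \qquad \sum_{n=0}^{2n_0} -n\kappa_{n_0-n} = \frac{1}{k}\sum_{n=-n_0}^{n_0-1} k_n + O(\tilde\varepsilon^2). \]
Noting
\[ I_k = \sum_{n=-n_0}^{n_0} n^k r_n, \qquad J_0 = \frac{1}{k}\sum_{n=-n_0}^{n_0-1} k_n, \]
one finally obtains $A = I + \tilde M + O(\tilde\varepsilon^2 + \tilde\varepsilon|\mu|)$ with
\[ \tilde M = \begin{pmatrix} \tilde M_{11} & -\tilde M_{11} + n_0 I_0 - I_1 \\ \tilde M_{11} + (n_0+1)I_0 - I_1 & -\tilde M_{11} \end{pmatrix} \]
and $\tilde M_{11} = n_0(n_0+1)I_0 - (2n_0+1)I_1 + I_2 + J_0$.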
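Lemma 8 lends itself to a direct numerical check. The sketch below compares the exact product (105) with I + M for small defects, under simplifying assumptions made here and not in the text: µ = 0 (so that L equals the matrix $L_c$ of the proof), and $T_n$ taken equal to its leading part with lower-left entry $\rho_n - \kappa_n$. The function names and the defect values are hypothetical.

```python
import numpy as np


def shear_matrix_exact(rho, kappa, n0):
    """A = L_{n0} ... L_{-n0} L^{-(2 n0 + 1)}, with L_n = (I + T_n) L, at mu = 0.

    rho[j + n0] and kappa[j + n0] hold rho_j and kappa_j for -n0 <= j <= n0.
    Assumption: T_n = [[0, 0], [rho_n - kappa_n, kappa_n]] (leading part only).
    """
    L = np.array([[0.0, 1.0], [-1.0, 2.0]])
    A = np.eye(2)
    for j in range(-n0, n0 + 1):          # L_{-n0} is applied first, L_{n0} last
        T = np.array([[0.0, 0.0],
                      [rho[j + n0] - kappa[j + n0], kappa[j + n0]]])
        A = (np.eye(2) + T) @ L @ A
    return A @ np.linalg.matrix_power(L, -(2 * n0 + 1))


def leading_order_M(rho, kappa, n0):
    """Matrix M of Lemma 8, assembled from its four closed-form sums."""
    M = np.zeros((2, 2))
    for n in range(2 * n0 + 1):
        r = rho[2 * n0 - n]               # rho_{n0 - n}
        k = kappa[2 * n0 - n]             # kappa_{n0 - n}
        M += np.array([[n * (n + 1) * r - n * k, -n * n * r + n * k],
                       [(n + 1) ** 2 * r - (n + 1) * k,
                        -n * (n + 1) * r + (n + 1) * k]])
    return M


if __name__ == "__main__":
    eps = 1e-5                            # overall defect size (illustrative)
    rho = eps * np.array([0.3, -1.0, 2.0, 0.7, -0.4])
    kappa = eps * np.array([1.0, -2.0, 0.5, 1.5, -1.0])
    A = shear_matrix_exact(rho, kappa, 2)
    M = leading_order_M(rho, kappa, 2)
    print(np.abs(A - np.eye(2) - M).max(), np.abs(M).max())
```

The printed discrepancy should be of second order in the defect size, hence much smaller than the entries of M, in line with the remainder $O(\varepsilon^2 + \varepsilon|\mu|)$ of the lemma at µ = 0.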
Consequently, the matrix A depends (at leading order in $\tilde\varepsilon$ and µ) on the average values $I_0$, $J_0$ of $r_n$, $k_n/k$, and on the weighted averages $I_1$, $I_2$ of $r_n$ (with linear and quadratic weights, respectively). Since $\mathrm{Tr}(\tilde M) = 0$, it follows $\mathrm{Tr}(A) = 2 + O(\tilde\varepsilon^2 + \tilde\varepsilon|\mu|)$ and $\mathrm{Det}(A) = 1 + O(\tilde\varepsilon^2)$ (Det(A) is independent of µ due to identity (105)). As a consequence, in order to study the spectrum of A for $\tilde\varepsilon, \mu \approx 0$ (and determine to which type of linear transformation it corresponds) it would be necessary to compute the quadratic terms in $(\tilde\varepsilon, \mu)$ in its expansion.

5. Numerical Results

We have performed numerical computations in order to check the range of validity of the analysis of Sec. 4.1, and in particular if discrepancies appear for large amplitude solutions or if parameters $(m_0, \omega)$ are moved away from $(0, \Omega)$. More precisely, we have computed breather solutions of the Klein–Gordon lattice
\[ \omega^2(1 + m_n)\frac{d^2 y_n}{dt^2} + \Omega^2 V'(y_n) = y_{n+1} - 2y_n + y_{n-1} \tag{111} \]
with a single mass defect $m_n = m_0\delta_{n,0}$ and periodic boundary conditions $y_{-N}(t) = y_N(t)$. In general we have used a lattice with 101 particles, except for the computations of breathers with algebraic decay (case ω = Ω) where 401 particles have been considered. The computations have been compared with homoclinic orbits to 0 of the two-dimensional map (71). For the numerical computations we have always fixed Ω = 10 (recall Ω is the lower phonon band edge for the infinite system). This can be done taking, for instance, k = 0.01, and d = 1 in the original problem (6). For the potential V we have chosen a polynomial of degree 4 with $V''(0) = 1$.

5.1. Hard potentials

To start we have considered the simplest case of a hard potential, i.e. a potential with a strictly positive hardening coefficient B (see definition (22)). We have chosen
\[ V(x) = \frac{x^2}{2} + \frac{x^4}{4}, \tag{112} \]
for which B = 75. In this case, the reduced map (71) possesses a unique orbit homoclinic to 0 in the sector α > 0, β > 0, for m0 > 0 and ωl < ω < Ω. An example of this homoclinic orbit is shown in Fig. 5 for a frequency ω = 9.9 (µ = −1.99) and a mass defect m0 = 0.05. In Fig. 8 (left panel), we compare the approximate solution yn = βn cos t obtained with this homoclinic orbit (circles) with the exact breather profile computed with the standard numerical method based on the anti-continuous limit [59] (continuous line). The agreement is excellent even if the solution profile is very localized. Indeed, as one computes the eigenvalues σ, σ −1 (Eq. (76)) of the linearized map (75) with m0 = 0, one obtains σ ≈ 0.27, which implies a strong spatial localization visible in Fig. 8. The accuracy of the center manifold reduction (a priori expected for σ ≈ 1) is surprisingly good in this parameter regime.
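The localization rate σ and the lower end $\omega_l$ of the frequency gap can be recomputed from the linear part of the map. The sketch below is an illustration under stated assumptions: σ is taken as the inverse of the unstable eigenvalue $\lambda = 1 - \mu/2 + \sqrt{\mu^2 - 4\mu}/2$, and $\omega_l$ is obtained from the condition $\sigma^{-1} - \sigma = \omega^2 m_0$, which is derived here by inserting $\beta_n = \sigma^{|n|}$ into the linearization of (93) at n = 0 and n ≠ 0 (it is not quoted from Eq. (79)). Function names are hypothetical.

```python
import math


def sigma(omega, Omega):
    """Stable eigenvalue sigma = 1/lambda of the linearized homogeneous map."""
    mu = omega ** 2 - Omega ** 2
    lam = 1.0 - mu / 2.0 + math.sqrt(mu * mu - 4.0 * mu) / 2.0
    return 1.0 / lam


def linear_defect_frequency(m0, Omega):
    """Root omega_l in (0, Omega) of sqrt(mu^2 - 4 mu) = m0 * omega^2.

    Uses 1/sigma - sigma = sqrt(mu^2 - 4 mu) with mu = omega^2 - Omega^2;
    the left-hand side decreases and the right-hand side increases with
    omega, so plain bisection finds the unique root.
    """
    def g(omega):
        mu = omega ** 2 - Omega ** 2
        return math.sqrt(mu * mu - 4.0 * mu) - m0 * omega ** 2

    lo, hi = 0.0, Omega
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(sigma(9.9, 10.0))                     # localization rate at omega = 9.9
    print(linear_defect_frequency(0.05, 10.0))  # defect mode frequency for m0 = 0.05
```

For ω = 9.9 and Ω = 10 the first value should round to the σ ≈ 0.27 quoted above, and for m0 = 0.05 the second gives the frequency at which the gap $\omega_l < \omega < \Omega$ begins.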
[Figure 8: two panels showing y_n(0) versus n.]
Fig. 8. Comparison between the profile of a breather solution (continuous line) of the Klein– Gordon system (111) with hard potential (112) and the approximate solution yn = βn cos t (circles) constructed with the homoclinic orbit of (71). We have considered a mass defect m0 = 0.05. In the left panel we have chosen a frequency ω = 9.9 (µ = −1.99). In the right panel we have fixed ω = 9.837 (µ = −3.23) very close to ωl (note the change of scale for the vertical axis).
The breather solution can be continued for decreasing frequencies up to ωl ≈ 9.8369, which is the frequency of the linear defect mode at which the breather solution bifurcates. Figure 8 (right panel) compares again the numerically computed breather profile and the approximate solution obtained with the homoclinic orbit, but now very close to this bifurcation point (at ω = 9.837, i.e. µ = −3.23). We still observe an excellent agreement. Note that the oscillations amplitudes are very small, but the solution is still strongly localized. For increasing frequencies the continuation path ends up at the lower edge of the phonon band ω = Ω (µ = 0). For this particular frequency value the breather solution (see continuous line in Fig. 9, left panel) presents an algebraic decay which is very well described by approximation (87). This approximation fails to describe the maximum amplitude of the oscillation β0 for these parameter values. This is not surprising since β0 is not small, and βn varies rapidly near n = 0, hence m0 should be further decreased to attain the domain of validity of the ansatz (85) near the solution center. However, the value of β0 obtained from the exact homoclinic orbit of (71) fits very well the maximum amplitude of the breather solution, as it is shown in Fig. 9, right panel. Note that the agreement is very good even for very large amplitudes or very large mass defect i.e. within a surprisingly large parameter range for a local theory. It is interesting to remark that the accuracy of this fit depends on the symmetry of the potential V (x) we have chosen. Figure 10 shows what happens if we add to the polynomial potential (112) a cubic term x3 /6 that breaks its symmetry. The range of validity of our leading order approximation reduces significantly. A similar result was obtained in [69] for breather solutions in spatially homogeneous Fermi– Pasta–Ulam lattices. Obviously the agreement would be improved by taking into
[Figure 9: left panel, y_n(0) versus n; right panel, y_0(0) and y_1(0) versus m_0.]
Fig. 9. Left panel: Breather solution at the lower edge of the phonon band ω = Ω (µ = 0) for a mass defect m0 = 0.05 and the symmetric potential V (x) = x2 /2 + x4 /4. The continuous line corresponds to the numerically computed breather solution. The circles represent approximation (87) of the homoclinic orbit that fits very well the algebraic decay of the breather tails. Right panel: The continuous lines represent the amplitude of the breather solution at n = 0 and n = 1 versus mass defect. The circles correspond to the homoclinic solutions of the nonlinear map (71) (the upper plot represents β0 and the lower plot β1 ).
[Figure 10: y_0(0) and y_1(0) versus m_0.]
Fig. 10. Same computation as in Fig. 9, right panel, but now for the asymmetric potential $V(x) = x^2/2 + x^3/6 + x^4/4$.
account the Taylor expansion of the reduction function φ (see Theorem 3) and computing the normal form at a higher order. Finally, we have numerically studied the spectral stability of the breather solutions by finding the eigenvalues of the Floquet operator, which gives us the evolution
of any small perturbation over one period [11]. We have checked that all breather solutions in the gap $\omega_l < \omega < \Omega$ are spectrally stable, at least for the value of the frequency parameter Ω = 10 we have considered.

5.2. Soft potentials

In the case of soft potentials (when the coefficient B defined by (22) is strictly negative), the situation is far more complex due to the much more intricate structure of the intersections between the stable and unstable manifolds. Therefore one expects a richer bifurcation scenario as parameters (breather frequency, mass defect) are varied. Our computations have been performed with the symmetric potential
\[ V(x) = \frac{x^2}{2} - \frac{x^4}{4}, \tag{113} \]
for which B = −75. Let us recall some basic features of the analysis performed in Sec. 4.1.4, in order to compare the results with numerical computations. For the (truncated) reduced mapping (71) with m0 = 0, Fig. 7 shows some intersections of stable and unstable manifolds emanating from the saddle point at the origin, for a frequency value ω = 9.9 < Ω. Iterating the map with an initial condition $U_1$ at the homoclinic point with label 1, we obtain a homoclinic orbit which corresponds to a one-site breather centered at n = 0. With an initial condition $U_1$ at the homoclinic point with label 2, the corresponding breather is a two-site breather with maximal amplitude at n = 0 and n = 1. An initial condition $U_1$ at the homoclinic point with label 3 (symmetric of point 1 with respect to the line α = β) corresponds to a one-site breather centered at site n = 1.

The dashed line of Fig. 7 depicts the image of the unstable manifold by the linear shear $A(\omega, m_0)$ for m0 = 0.005. As m0 increases, $A(\omega, m_0) W^u(0)$ moves further down so that the intersection points 2′ and 3′, corresponding to homoclinic orbits of the inhomogeneous problem, get closer and closer. So there exists a critical value of m0 for which these intersection points collide and then disappear. In fact, we have checked numerically that this tangent bifurcation occurs at a critical value m0 ∈ (0.00963, 0.00964) for problem (71). This critical value can be approximated using Eqs. (91), (92), which yields m0 ≈ 0.009632 in the present case.

These results correspond very precisely to a breather bifurcation numerically observed in the Klein–Gordon chain (at a critical value m0 ∈ (0.00963, 0.00964)) and depicted in Fig. 11. The upper branch of Fig. 11(a) represents the energy of a breather solution corresponding to point 2′. For m0 ≈ 0, the breather has a maximal amplitude at sites n = 0, 1. A profile of this breather for m0 = 0.0093, close to the bifurcation point, is shown in Fig. 11(b), where the amplitude is now much larger at n = 1. The lower branch of Fig. 11(a) represents a one-site breather centered at n = 1 and corresponds to point 3′. Its profile for m0 = 0.0093 is shown in Fig. 11(c).
[Figure 11: left panel, energy versus m_0; right panels (upper branch, lower branch), y_n(0) versus n.]
Fig. 11. Tangent bifurcation between breather solutions numerically computed in a Klein–Gordon chain with a soft potential. The chain presents a mass defect m0 at n = 0, and the bifurcation occurs as m0 is increased. In the left panel, the breather energies $E = \sum_{n\in\mathbb{Z}} \Omega^2 V(y_n(0)) + (y_{n+1}(0) - y_n(0))^2/2$ are depicted versus m0 (the breathers are even in t with frequency ω = 9.9). For m0 ≈ 0, the upper branch represents a two-site breather centered between sites n = 0 and n = 1. The lower branch represents a one-site breather centered at n = 1. The breather profiles close to the bifurcation point are plotted in the right panels (the value of m0 is marked with a dashed line in the left panel).
We have numerically computed the Floquet spectra of these breather solutions for the parameter values of Fig. 11. The solutions on the lower branch are spectrally stable, whereas the solutions on the upper branch are unstable. As in Sec. 5.1, we have also computed one-site breathers centered at the mass defect, corresponding to point 1 in Fig. 7. Again we have found excellent agreement between the numerically computed breather profiles and the approximate solutions obtained using the map (71). As expected from the analysis of Sec. 4.1.4, these breathers survive up to m0 = $m_l(\omega)$, i.e. up to a much higher value of m0 than the families 2′, 3′ described above. A part of the intersecting stable and unstable manifolds is shown in the left panel of Fig. 12. Due to their complicated windings, new intersection points appear between $A(\omega, m_0)W^u(0)$ and $W^s(0)$ as m0 is chosen in certain windows of the parameter space, giving rise to new homoclinic solutions of (70). An example is shown in the region marked with a rectangle (see the details in the right panel of Fig. 12). For some value of m0 ∈ (0.01064, 0.01065), a new intersection point between $A(\omega, m_0)W^u(0)$ and $W^s(0)$ appears. As m0 is further increased, this inverse tangent bifurcation gives rise to two new homoclinic points 5′ and 6′. Correspondingly, we have numerically checked that an inverse tangent bifurcation occurs in the Klein–Gordon chain at a critical value of m0 ∈ (0.01064, 0.01065), giving rise to new breather solutions which do not exist in the homogeneous chain. The point 4′ in Fig. 12 also exists for m0 = 0. Returning to Fig. 7, it is obtained by applying the inverse map $G_\omega^{-1}$ to the point with label 2. In the homogeneous limit m0 = 0, this homoclinic point consequently corresponds to a two-site breather
[Figure 12: two panels plotting β_n against α_n; the labelled points are 4′, 5′ and 6′.]
Fig. 12. Emergence of new intersection points between $A(\omega, m_0)W^u$ (dashed curve) and $W^s$ (drawn with a full line) as the mass defect is increased. The figure corresponds to m0 = 0.012 and ω = 9.9. The right panel shows a zoom of the left panel over the region marked with a rectangle. The new homoclinic points 5′ and 6′ correspond to new breather solutions of the Klein–Gordon lattice. The point with label 4′ corresponds to a two-site breather, which exists in the homogeneous lattice and persists for m0 ≤ 0.012.
[Figure 13: the left panel plots Energy against m0; the right panels plot the profiles y_n(0) against n for the upper branch, the central branch and the lower branch.]
Fig. 13. Bifurcation diagram of breather solutions numerically computed in the Klein–Gordon chain, with a soft potential and a mass defect m0 at n = 0. In the left panel the breather energies E are depicted versus m0 (see the definition of E in the caption of Fig. 11). The breather frequency is ω = 9.9. The lower branch to the left of the vertical line corresponds to a two-site breather centered between n = 1 and n = 2. The right panel shows the profiles of the three breathers for m0 = 0.012, when all of them coexist (the value of m0 is marked with a vertical line in the left panel).
centered between n = 1 and n = 2. As Fig. 12 shows, an increase of the mass defect m0 moves point 5′ towards point 4′ until they collide and disappear through a new tangent bifurcation. This tangent bifurcation is also found numerically in the Klein–Gordon chain at a critical value of the mass defect very close to the theoretical one (in both cases one obtains m0 ≈ 0.01268).
Figure 13 shows the bifurcation diagram of the numerically computed breathers corresponding to homoclinic points 4′, 5′, 6′ (left panel), and gives their profiles for a given value of m0 in the right panel. A numerical computation of Floquet spectra shows that all these breathers are unstable.

In conclusion, we have seen that the truncated normal form (70) allows one to predict with high precision certain breather bifurcations in the Klein–Gordon chain, which occur as the mass defect m0 is varied. These bifurcations depend on the fine structure of the windings of the stable and unstable manifolds of the origin, computed on the truncated normal form without defect.

Acknowledgments

This work has been supported by the French Ministry of Research through the CNRS Program ACI NIM (New Interfaces of Mathematics). G.J. wishes to thank Michel Peyrard for initiating this research, and is grateful to R. MacKay for pointing out interesting bibliographical references. B.S-R. and J.C. acknowledge sponsorship by the Ministerio de Educación y Ciencia, Spain, project FIS2004-01183. B.S-R. is grateful to the Institut de Mathématiques de Toulouse (UMR 5219) where a part of this work was carried out during a visit from September to October 2006.

References

[1] C. Albanese and J. Fröhlich, Periodic solutions of some infinite-dimensional Hamiltonian systems associated with nonlinear partial difference equations. I, Commun. Math. Phys. 116 (1988) 475–502.
[2] C. Albanese and J. Fröhlich, Periodic solutions of some infinite-dimensional Hamiltonian systems associated with nonlinear partial difference equations. II, Commun. Math. Phys. 119 (1988) 677–699.
[3] C. Albanese and J. Fröhlich, Perturbation theory for periodic orbits in a class of infinite-dimensional Hamiltonian systems, Commun. Math. Phys. 138 (1991) 193–205.
[4] G. L. Alfimov, V. A. Brazhnyi and V. V. Konotop, On classification of intrinsic localized modes for the discrete nonlinear Schrödinger equation, Phys. D 194 (2004) 127–150.
[5] P. W. Anderson, Absence of diffusion in certain random lattices, Phys. Rev. 109 (1958) 1492–1505.
[6] J. F. R. Archilla, R. S. MacKay and J. L. Marin, Discrete breathers and Anderson modes: Two faces of the same phenomenon? Phys. D 134 (1999) 406–418.
[7] G. Arioli and A. Szulkin, Periodic motions of an infinite lattice of particles: The strongly indefinite case, Ann. Sci. Math. Québec 22 (1998) 97–119.
[8] D. K. Arrowsmith and C. M. Place, An Introduction to Dynamical Systems (Cambridge University Press, 1990).
[9] S. Aubry and G. Abramovici, Chaotic trajectories in the standard map: The concept of anti-integrability, Phys. D 43 (1990) 199–219.
[10] S. Aubry, Anti-integrability in dynamical and variational problems, Phys. D 86 (1995) 284–296.
[11] S. Aubry, Breathers in nonlinear lattices: Existence, linear stability and quantization, Phys. D 103 (1997) 201–250.
[12] S. Aubry, Discrete breathers in anharmonic models with acoustic phonons, Ann. Inst. H. Poincaré Phys. Théor. 68(4) (1998) 381–420.
[13] S. Aubry, G. Kopidakis and V. Kadelburg, Variational proof for hard discrete breathers in some classes of Hamiltonian dynamical systems, Discrete Contin. Dyn. Syst. B 1 (2001) 271–298.
[14] Z. Bishnani and R. S. MacKay, Safety criteria for aperiodically forced systems, Dyn. Syst. 18 (2003) 107–129.
[15] T. Bountis, H. W. Capel, M. Kollmann, J. C. Ross, J. M. Bergamin and J. P. van der Weele, Multibreather and homoclinic orbits in 1-dimensional nonlinear lattices, Phys. Lett. A 268 (2000) 50–60.
[16] A. Campa and A. Giansanti, Experimental tests of the Peyrard–Bishop model applied to the melting of very short DNA chains, Phys. Rev. E 58 (1998) 3585–3588.
[17] D. K. Campbell, S. Flach and Yu. S. Kivshar, Localizing energy through nonlinearity and discreteness, Phys. Today 57 (2004) 43–49.
[18] R. Carretero-González, J. D. Talley, C. Chong and B. A. Malomed, Multistable solitons in the cubic-quintic discrete nonlinear Schrödinger equation, Phys. D 216 (2006) 77–89.
[19] J. Cuevas, F. Palmero, J. F. R. Archilla and F. R. Romero, Moving discrete breathers in a Klein–Gordon chain with an impurity, J. Phys. A 35 (2002) 10519–10530.
[20] J. Cuevas and P. G. Kevrekidis, Breather statics and dynamics in Klein–Gordon chains with a bend, Phys. Rev. E 69 (2004) art. no. 056609, 13 pp.
[21] T. Dauxois, M. Peyrard and A. R. Bishop, Entropy-driven DNA denaturation, Phys. Rev. E 47(1) (1993) R44–R47.
[22] T. Dauxois, A. Litvak-Hinenzon, R. S. MacKay and A. Spanoudaki (eds.), Energy Localisation and Transfer, Advanced Series in Nonlinear Dynamics, Vol. 22 (World Scientific, 2004).
[23] A. Delshams and R. Ramírez-Ros, Exponentially small splitting of separatrices for perturbed integrable standard-like maps, J. Nonlinear Sci. 8 (1998) 317–352.
[24] J. Edler and P. Hamm, Self-trapping of the amide I band in a peptide model crystal, J. Chem. Phys. 117 (2002) 2415–2424.
[25] B. Fiedler and J. Scheurle, Discretization of homoclinic orbits, rapid forcing and “invisible chaos”, Mem. Amer. Math. Soc. 119 (1996) No. 570.
[26] S. Flach, Existence of localized excitations in nonlinear Hamiltonian lattices, Phys. Rev. E 51 (1995) 1503–1507.
[27] S. Flach, Tangent bifurcation of band edge plane waves, dynamical symmetry breaking and vibrational localization, Phys. D 91 (1996) 223–243.
[28] S. Flach and C. R. Willis, Discrete breathers, Phys. Rep. 295 (1998) 181–264.
[29] K. Forinash, M. Peyrard and B. Malomed, Interaction of discrete breathers with impurity modes, Phys. Rev. E 49 (1994) 3400–3411.
[30] J. M. Franks, Time dependent stable diffeomorphisms, Invent. Math. 24 (1974) 163–172.
[31] J. Fröhlich, T. Spencer and C. E. Wayne, Localization in disordered, nonlinear dynamical systems, J. Stat. Phys. 42 (1986) 247–274.
[32] V. Gelfreich, Splitting of a small separatrix loop near the saddle-center bifurcation in area-preserving maps, Phys. D 136 (2000) 266–279.
[33] B. Gershgorin, Yu. V. Lvov and D. Cai, Renormalized waves and discrete breathers in β-Fermi–Pasta–Ulam chains, Phys. Rev. Lett. 95 (2005) art. no. 264302, 4 pp.
[34] J. Giannoulis and A. Mielke, The nonlinear Schrödinger equation as a macroscopic limit for an oscillator chain with cubic nonlinearities, Nonlinearity 17 (2004) 551–565.
[35] J. Giannoulis and A. Mielke, Dispersive evolution of pulses in oscillator chains with general interaction potentials, Discrete Contin. Dyn. Syst. B 6 (2006) 493–523.
[36] D. Hennig, K. Ø. Rasmussen, H. Gabriel and A. Bülow, Solitonlike solutions of the discrete nonlinear Schrödinger equation, Phys. Rev. E 54(5) (1996) 5788–5801.
[37] D. Hennig and G. P. Tsironis, Wave transmission in nonlinear lattices, Phys. Rep. 307 (1999) 333–432.
[38] G. Iooss and K. Kirchgässner, Travelling waves in a chain of coupled nonlinear oscillators, Commun. Math. Phys. 211 (2000) 439–464.
[39] G. Iooss, Travelling waves in the Fermi–Pasta–Ulam lattice, Nonlinearity 13 (2000) 849–866.
[40] G. Iooss and G. James, Localized waves in nonlinear oscillator chains, Chaos 15 (2005) art. no. 015113, 15 pp.
[41] G. Iooss and D. E. Pelinovsky, Normal form for travelling kinks in discrete Klein–Gordon lattices, Phys. D 216 (2006) 327–345.
[42] M. V. Ivanchenko, O. I. Kanakov, V. D. Shalfeev and S. Flach, Discrete breathers in transient processes and thermal equilibrium, Phys. D 198 (2004) 120–135.
[43] G. James, Existence of breathers on FPU lattices, C. R. Acad. Sci. Paris Sér. I 332 (2001) 581–586.
[44] G. James, Centre manifold reduction for quasilinear discrete systems, J. Nonlinear Sci. 13 (2003) 27–63.
[45] G. James and P. Noble, Breathers on diatomic Fermi–Pasta–Ulam lattices, Phys. D 196 (2004) 124–171.
[46] G. James and Y. Sire, Travelling breathers with exponentially small tails in a chain of nonlinear oscillators, Commun. Math. Phys. 257 (2005) 51–85.
[47] G. James and M. Kastner, Bifurcations of discrete breathers in a diatomic Fermi–Pasta–Ulam chain, Nonlinearity 20 (2007) 631–657.
[48] G. Kalosakas, K. Ø. Rasmussen, A. R. Bishop, C. H. Choi and A. Usheva, Sequence-specific thermal fluctuations identify start sites for DNA transcription, Europhys. Lett. 68(1) (2004) 127–133.
[49] T. Kato, Perturbation Theory for Linear Operators (Springer Verlag, 1966).
[50] P. G. Kevrekidis, Yu. S. Kivshar and A. S. Kovalev, Instabilities and bifurcations of nonlinear impurity modes, Phys. Rev. E 67 (2003) art. no. 046604, 8 pp.
[51] K. Kirchgässner, Wave solutions of reversible systems and applications, J. Differential Equations 45 (1982) 113–127.
[52] S. A. Kiselev, S. R. Bickham and A. J. Sievers, Anharmonic gap mode in a one-dimensional diatomic lattice with nearest-neighbor Born–Mayer–Coulomb potentials and its interaction with a mass-defect impurity, Phys. Rev. B 50 (1994) 9135–9152.
[53] Yu. S. Kivshar, F. Zhang and A. S. Kovalev, Stable nonlinear heavy-mass impurity modes, Phys. Rev. B 55 (1997) 14265–14269.
[54] G. Kopidakis and S. Aubry, Intraband discrete breathers in disordered nonlinear systems. I. Delocalization, Phys. D 130 (1999) 155–186.
[55] G. Kopidakis and S. Aubry, Intraband discrete breathers in disordered nonlinear systems. II. Localization, Phys. D 139 (2000) 247–275.
[56] E. Lombardi, Oscillatory Integrals and Phenomena beyond All Algebraic Orders with Applications to Homoclinic Orbits in Reversible Systems, Lecture Notes in Mathematics, Vol. 1741 (Springer-Verlag, 2000).
[57] R. S. MacKay and S. Aubry, Proof of existence of breathers for time-reversible or Hamiltonian networks of weakly coupled oscillators, Nonlinearity 7 (1994) 1623–1643.
[58] M. E. Manley, M. Yethiraj, H. Sinn, H. M. Volz, A. Alatas, J. C. Lashley, W. L. Hults, G. H. Lander and J. L. Smith, Formation of a new dynamical mode in α-uranium observed by inelastic X-ray and neutron scattering, Phys. Rev. Lett. 96 (2006) art. no. 125501, 4 pp.
[59] J. L. Marin and S. Aubry, Breathers in nonlinear lattices: Numerical calculation from the anticontinuous limit, Nonlinearity 9 (1996) 1501–1528.
[60] H. Kielhöfer, Bifurcation Theory. An Introduction with Applications to PDEs, Applied Mathematical Sciences, Vol. 156 (Springer Verlag, 2004).
[61] J. E. Marsden and T. Hughes, Mathematical Foundations of Elasticity (Dover Publications, 1994).
[62] P. Noble, Existence of breathers in classical ferromagnetic lattices, Nonlinearity 17 (2004) 1–15.
[63] A. Pankov and N. Zakharchenko, On some discrete variational problems, Acta Appl. Math. 65 (2001) 295–303.
[64] A. Pankov, Travelling Waves and Periodic Oscillations in Fermi–Pasta–Ulam Lattices (Imperial College Press, London, 2005).
[65] D. E. Pelinovsky and V. M. Rothos, Bifurcations of travelling wave solutions in the discrete NLS equations, Phys. D 202 (2005) 16–36.
[66] M. Peyrard and A. R. Bishop, Statistical mechanics of a nonlinear model for DNA denaturation, Phys. Rev. Lett. 62 (1989) 2755–2758.
[67] M. Peyrard, Nonlinear dynamics and statistical physics of DNA, Nonlinearity 17 (2004) 1–40.
[68] W.-X. Qin and X. Xiao, Homoclinic orbits and localized solutions in nonlinear Schrödinger lattices, Nonlinearity 20 (2007) 2305–2317.
[69] B. Sánchez-Rey, G. James, J. Cuevas and J. F. R. Archilla, Bright and dark breathers in Fermi–Pasta–Ulam lattices, Phys. Rev. B 70 (2004) art. no. 014301, 10 pp.
[70] M. Sato and A. J. Sievers, Direct observation of the discrete character of intrinsic localized modes in an antiferromagnet, Nature 432 (2004) 486–488.
[71] U. T. Schwarz, L. Q. English and A. J. Sievers, Experimental generation and observation of intrinsic localized spin wave modes in an antiferromagnet, Phys. Rev. Lett. 83 (1999) 223–226.
[72] J. A. Sepulchre and R. S. MacKay, Localized oscillations in conservative and dissipative networks of weakly coupled autonomous oscillators, Nonlinearity 10 (1997) 679–713.
[73] J. A. Sepulchre and R. S. MacKay, Discrete breathers in disordered media, Phys. D 113 (1998) 342–345.
[74] A. J. Sievers and S. Takeno, Intrinsic localized modes in anharmonic crystals, Phys. Rev. Lett. 61 (1988) 970–973.
[75] Y. Sire, Travelling breathers in Klein–Gordon lattices as homoclinic orbits to p-tori, J. Dynam. Differential Equations 17 (2005) 779–823.
[76] A. A. Sukhorukov, Yu. S. Kivshar, J. J. Rasmussen and P. L. Christiansen, Nonlinearity and disorder: Classification and stability of nonlinear impurity modes, Phys. Rev. E 63 (2001) art. no. 036601, 18 pp.
[77] B. I. Swanson, J. A. Brozik, S. P. Love, G. F. Strouse and A. P. Shreve, Observation of intrinsically localized modes in a discrete low-dimensional material, Phys. Rev. Lett. 82 (1999) 3288–3291.
[78] J. J. L. Ting and M. Peyrard, Effective breather trapping mechanism for DNA transcription, Phys. Rev. E 53(1) (1996) 1011–1020.
[79] A. Vanderbauwhede, Center manifolds, normal forms and elementary bifurcations, in Dynamics Reported 2, eds. U. Kirchgraber and H. O. Walther (John Wiley and Sons Ltd and B. G. Teubner, 1989), pp. 89–169.
[80] T. S. van Erp, S. Cuesta-Lopez and M. Peyrard, Bubbles and denaturation in DNA, Eur. Phys. J. E 20 (2006) 421–434.
[81] L. Vázquez, R. S. MacKay and M. P. Zorzano (eds), Localization and Energy Transfer in Nonlinear Systems, Proceedings of the Third Conference (San Lorenzo de El Escorial, Spain, 17–21 June 2002) (World Scientific, 2003).
[82] M. I. Weinstein, Excitation thresholds for nonlinear localized modes on lattices, Nonlinearity 12 (1999) 673–691.
February 11, 2009 13:39 WSPC/148-RMP
J070-00358
Reviews in Mathematical Physics Vol. 21, No. 1 (2009) 61–109 c World Scientific Publishing Company
LONG-TIME ASYMPTOTICS OF THE TODA LATTICE FOR DECAYING INITIAL DATA REVISITED
HELGE KRÜGER
Department of Mathematics, Rice University, Houston, TX 77005, USA
[email protected]
http://math.rice.edu/~hk7/

GERALD TESCHL
Faculty of Mathematics, Nordbergstrasse 15, 1090 Wien, Austria and International Erwin Schrödinger Institute for Mathematical Physics, Boltzmanngasse 9, 1090 Wien, Austria
[email protected]
http://www.mat.univie.ac.at/~gerald/

Received 29 April 2008
Revised 22 October 2008

The purpose of this article is to give a streamlined and self-contained treatment of the long-time asymptotics of the Toda lattice for decaying initial data in the soliton and in the similarity region via the method of nonlinear steepest descent.

Keywords: Riemann–Hilbert problem; Toda lattice; solitons.

Mathematics Subject Classification 2000: 37K40, 37K45, 35Q15, 37K10
1. Introduction

The simplest model of a solid is a chain of particles with nearest neighbor interaction. The Hamiltonian of such a system is given by

$$H(p, q) = \sum_{n\in\mathbb{Z}} \left( \frac{p(n,t)^2}{2} + V(q(n+1,t) - q(n,t)) \right), \tag{1.1}$$

where q(n, t) is the displacement of the nth particle from its equilibrium position, p(n, t) is its momentum (mass m = 1), and V(r) is the interaction potential. Restricting the attention to finitely many particles (e.g., by imposing periodic boundary conditions) and to the harmonic interaction $V(r) = \frac{r^2}{2}$, the equations of motion form a linear system of differential equations with constant coefficients.
The solution is then given by a superposition of the associated normal modes. Around 1950, it was generally believed that a generic nonlinear perturbation would lead to thermalization. That is, for any initial condition the energy should eventually be equally distributed over all normal modes. In 1955, Fermi, Pasta and Ulam carried out a seemingly innocent computer experiment at Los Alamos [15] to investigate the rate of approach to the equipartition of energy. However, much to everybody's surprise, the experiment indicated, instead of the expected thermalization, a quasi-periodic motion of the system! Many attempts were made to explain this result but it was not until ten years later that Kruskal and Zabusky [43] revealed the connections with solitons (see [2] for further historical information and a pedagogical discussion). This had a big impact on soliton mathematics and led to an explosive growth in the last decades. In particular, it led to the search for a potential V(r) for which the above system has soliton solutions. By considering addition formulas for elliptic functions, Toda came up with the choice $V(r) = e^{-r} + r - 1$. The corresponding system is now known as the Toda equation [40]. The equation of motion in this case reads explicitly

$$\begin{aligned} \frac{d}{dt} p(n,t) &= -\frac{\partial H(p,q)}{\partial q(n,t)} = e^{-(q(n,t)-q(n-1,t))} - e^{-(q(n+1,t)-q(n,t))}, \\ \frac{d}{dt} q(n,t) &= \frac{\partial H(p,q)}{\partial p(n,t)} = p(n,t). \end{aligned} \tag{1.2}$$
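The equations of motion (1.2) can be explored numerically. The following Python sketch integrates them with a classical Runge–Kutta scheme and monitors the Hamiltonian (1.1) with the Toda potential. The periodic chain of 16 particles, the step size and the integrator are assumptions made for illustration (the text treats the doubly infinite lattice), and the function names are hypothetical.

```python
import math


def toda_rhs(q, p):
    """Right-hand side of (1.2) on a periodic chain (indices modulo N)."""
    N = len(q)
    dq = list(p)
    dp = [
        math.exp(-(q[n] - q[n - 1])) - math.exp(-(q[(n + 1) % N] - q[n]))
        for n in range(N)
    ]
    return dq, dp


def toda_energy(q, p):
    """H(p, q) from (1.1) with V(r) = exp(-r) + r - 1 on the periodic chain."""
    N = len(q)
    kinetic = sum(pn * pn / 2 for pn in p)
    potential = 0.0
    for n in range(N):
        r = q[(n + 1) % N] - q[n]
        potential += math.exp(-r) + r - 1
    return kinetic + potential


def rk4_step(q, p, dt):
    """One classical fourth-order Runge-Kutta step for (1.2)."""
    def shifted(x, k, h):
        return [xi + h * ki for xi, ki in zip(x, k)]

    k1q, k1p = toda_rhs(q, p)
    k2q, k2p = toda_rhs(shifted(q, k1q, dt / 2), shifted(p, k1p, dt / 2))
    k3q, k3p = toda_rhs(shifted(q, k2q, dt / 2), shifted(p, k2p, dt / 2))
    k4q, k4p = toda_rhs(shifted(q, k3q, dt), shifted(p, k3p, dt))
    q_new = [x + dt / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(q, k1q, k2q, k3q, k4q)]
    p_new = [x + dt / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1p, k2p, k3p, k4p)]
    return q_new, p_new


# One displaced particle, all particles initially at rest.
q = [1.0 if n == 0 else 0.0 for n in range(16)]
p = [0.0] * 16
initial_energy = toda_energy(q, p)
for _ in range(2000):
    q, p = rk4_step(q, p, 0.005)
energy_drift = abs(toda_energy(q, p) - initial_energy)
total_momentum = sum(p)
```

Both `energy_drift` and `total_momentum` should stay close to zero, since the energy and the total momentum are conserved by (1.2); the scheme itself is not symplectic, so a small drift over long times is expected.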
The important property of the Toda equation is the existence of the so-called soliton solutions, that is, pulselike waves which spread in time without changing their size or shape and interact with each other in a particle-like way. This is a surprising phenomenon, since for a generic linear equation one would expect spreading of waves (dispersion) and for a generic nonlinear force one would expect that solutions only exist for a finite time (breaking of waves). Obviously, our particular force is such that both phenomena cancel each other giving rise to a stable wave existing for all time! In fact, in the simplest case of one soliton, you can easily verify that this solution is given by

$$q_1(n,t) = q_+ + \log\left( \frac{1 + \frac{\gamma}{1 - e^{-2\kappa}} \exp(-2\kappa n + 2\sigma\sinh(\kappa)t)}{1 + \frac{\gamma}{1 - e^{-2\kappa}} \exp(-2\kappa(n+1) + 2\sigma\sinh(\kappa)t)} \right), \tag{1.3}$$

with $\kappa, \gamma > 0$ and $\sigma \in \{\pm 1\}$. It describes a single bump traveling through the crystal with speed $\sigma\sinh(\kappa)/\kappa$ and width proportional to $1/\kappa$. In other words, the smaller the soliton the faster it propagates. It results in a total displacement $2\kappa$ of the crystal. However, this is just the tip of the iceberg and can be generalized to the N-soliton solution

$$q_N(n,t) = q_+ + \log\left( \frac{\det(I + C_N(n,t))}{\det(I + C_N(n+1,t))} \right), \tag{1.4}$$
Fig. 1. One soliton $q_1(n, 0)$ with κ = 1, γ = 1, and q0 = 0.
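where

$$C_N(n,t) = \left( \frac{\sqrt{\gamma_i(n,t)\gamma_j(n,t)}}{1 - e^{-(\kappa_i+\kappa_j)}} \right)_{1\le i,j\le N}, \qquad \gamma_j(n,t) = \gamma_j e^{-2\kappa_j n - 2\sigma_j\sinh(\kappa_j)t}, \tag{1.5}$$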
with $\kappa_j, \gamma_j > 0$ and $\sigma_j \in \{\pm 1\}$. The case N = 1 coincides with the one-soliton solution from above and asymptotically, as t → ∞, the N-soliton solution can be written as a sum of one-soliton solutions. Historically such solitary waves were first observed by the naval architect Russell [35] who followed the bow wave of a barge which moved along a channel maintaining its speed and size (see the review article [33] for further information). The importance of these solitary waves is that they constitute the stable part of the solutions arising from arbitrary short range initial conditions and can be used to explain the quasi-periodic behavior found by Fermi, Pasta and Ulam. In fact, the classical result discovered by Zabusky and Kruskal [43] states that every “short range” initial condition eventually splits into a number of stable solitons and a decaying background radiation component. This is illustrated in Fig. 2 which shows the numerically computed solution q(n, t) corresponding to the initial condition $q(n, 0) = \delta_{0,n}$, p(n, 0) = 0 at some large time t = 130. You can see the soliton region $|n/t| > 1$ with two single solitons on the very left, respectively, right and the similarity region $|n/t| < 1$ where there is a continuous displacement plus some small oscillations which decay like $t^{-1/2}$ and are asymptotically given by

$$q(n,t) \asymp 2\log(T_0(z_0)) + \left( \frac{2\nu(z_0)}{-\sin(\theta_0)t} \right)^{1/2} \cos(t\Phi_0(z_0) + \nu(z_0)\log(t) - \delta(z_0)), \tag{1.6}$$
where $z_0 = e^{i\theta_0}$ is a slow variable depending only on $n/t$ and the functions $T_0(z_0)$, $\nu(z_0)$, $\Phi_0(z_0)$, and $\delta(z_0)$ are explicitly given in terms of the scattering data associated with the initial data. Our main goal will be to mathematically justify this formula for the solution in the similarity region $|n/t| < 1$ (Theorem 2.2) and to
Fig. 2. Numerically computed solution q(n, 150) of the Toda lattice, with initial condition all particles at rest in their equilibrium positions except for the one in the middle which is displaced by 1.
show that the solution splits into a number of solitons in the soliton region $|n/t| > 1$ (Theorem 2.1). Existence of soliton solutions is usually connected to complete integrability of the system, and this is also true for the Toda equation. To see that the Toda equation is indeed integrable we introduce Flaschka's variables [16]

$$a(n,t) = \frac{1}{2} e^{-(q(n+1,t)-q(n,t))/2}, \qquad b(n,t) = -\frac{1}{2} p(n,t) \tag{1.7}$$

and obtain the form most convenient for us

$$\frac{d}{dt} a(t) = a(t)(b^+(t) - b(t)), \qquad \frac{d}{dt} b(t) = 2(a(t)^2 - a^-(t)^2). \tag{1.8}$$

Here we have used the abbreviation

$$f^\pm(n) = f(n \pm 1). \tag{1.9}$$

Note that if $q(n,t) \to q_\pm$ sufficiently fast as $n \to \pm\infty$, the converse map is given by

$$q(n,t) = q_+ + 2\log\left( \prod_{j=n}^{\infty} (2a(j,t)) \right), \qquad p(n,t) = -2b(n,t). \tag{1.10}$$
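The change of variables (1.7) and its inverse (1.10) can be checked on a concrete profile. The sketch below, in Python, samples the one-soliton displacement (1.3) at t = 0 with the parameters of Fig. 1 on a finite window, maps it to Flaschka's variables and back, and also confirms the total displacement 2κ. The finite window with q(n) = q+ to its right, the placeholder momenta and all function names are assumptions introduced for illustration.

```python
import math


def one_soliton(n, t, kappa=1.0, gamma=1.0, sigma=1, q_plus=0.0):
    """One-soliton displacement q_1(n, t) from (1.3)."""
    c = gamma / (1.0 - math.exp(-2.0 * kappa))
    phase = 2.0 * sigma * math.sinh(kappa) * t
    num = 1.0 + c * math.exp(-2.0 * kappa * n + phase)
    den = 1.0 + c * math.exp(-2.0 * kappa * (n + 1) + phase)
    return q_plus + math.log(num / den)


def to_flaschka(q, p, q_plus):
    """Map (q, p) on a finite window to (a, b) via (1.7).

    Assumption: q(n) = q_plus for every n to the right of the window.
    """
    q_ext = list(q) + [q_plus]
    a = [0.5 * math.exp(-(q_ext[n + 1] - q_ext[n]) / 2.0) for n in range(len(q))]
    b = [-pn / 2.0 for pn in p]
    return a, b


def from_flaschka(a, b, q_plus):
    """Inverse map (1.10); the product is accumulated as a sum of logarithms."""
    q = [0.0] * len(a)
    tail = 0.0  # sum of log(2 a(j)) over all j >= n inside the window
    for n in range(len(a) - 1, -1, -1):
        tail += math.log(2.0 * a[n])
        q[n] = q_plus + 2.0 * tail
    p = [-2.0 * bn for bn in b]
    return q, p


sites = list(range(-20, 21))
q = [one_soliton(n, 0.0) for n in sites]
p = [0.0] * len(sites)  # placeholder momenta; not the soliton's true momenta
a, b = to_flaschka(q, p, q_plus=0.0)
q_back, p_back = from_flaschka(a, b, q_plus=0.0)
round_trip_error = max(abs(x - y) for x, y in zip(q, q_back))
total_displacement = q[0] - q[-1]
```

Far to the right of the bump the coefficients satisfy 2a(n) ≈ 1, so the infinite product in (1.10) is well approximated by the finite one used here.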
Moreover, $q(n,t) \to q_\pm$, $p(n,t) \to 0$ as $|n| \to \infty$ corresponds to $a(n,t) \to \frac{1}{2}$, $b(n,t) \to 0$. To show complete integrability it suffices to find a so-called Lax pair [27], that is, two operators H(t), P(t) in $\ell^2(\mathbb{Z})$ such that the Lax equation

$$\frac{d}{dt} H(t) = P(t)H(t) - H(t)P(t) \tag{1.11}$$

is equivalent to (1.8). One can easily convince oneself that the right choice is

$$H(t) = a(t)S^+ + a^-(t)S^- + b(t), \qquad P(t) = a(t)S^+ - a^-(t)S^-, \tag{1.12}$$

where $(S^\pm f)(n) = f^\pm(n) = f(n \pm 1)$ are the shift operators. Now the Lax equation (1.11) implies that the operators H(t) for different t ∈ R are unitarily equivalent (cf. [38, Theorem 12.4]):

Theorem 1.1. Let P(t) be a family of bounded skew-adjoint operators, such that $t \mapsto P(t)$ is differentiable. Then there exists a family of unitary propagators U(t, s) for P(t), that is,

$$\frac{d}{dt} U(t,s) = P(t)U(t,s), \qquad U(s,s) = I. \tag{1.13}$$

Moreover, the Lax equation (1.11) implies

$$H(t) = U(t,s)H(s)U(t,s)^{-1}. \tag{1.14}$$
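The equivalence between the Lax equation (1.11) and the system (1.8) can be verified entry by entry on a small example. The Python sketch below builds the matrices of H and P from (1.12) on a periodic lattice with six sites (an assumption made so that the operators become finite matrices; the text works on the doubly infinite lattice) and compares the time derivative of H obtained from (1.8) with the commutator PH − HP. The sample coefficients and function names are hypothetical.

```python
def lax_matrices(a, b):
    """Matrices of H = a S^+ + a^- S^- + b and P = a S^+ - a^- S^- from (1.12)
    on a periodic lattice with N = len(a) sites."""
    N = len(a)
    H = [[0.0] * N for _ in range(N)]
    P = [[0.0] * N for _ in range(N)]
    for n in range(N):
        up, down = (n + 1) % N, (n - 1) % N
        H[n][n] += b[n]
        H[n][up] += a[n]
        H[n][down] += a[down]
        P[n][up] += a[n]
        P[n][down] -= a[down]
    return H, P


def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    N = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def flaschka_rhs(a, b):
    """Right-hand side of (1.8) with periodic indexing."""
    N = len(a)
    da = [a[n] * (b[(n + 1) % N] - b[n]) for n in range(N)]
    db = [2.0 * (a[n] ** 2 - a[n - 1] ** 2) for n in range(N)]
    return da, db


# Arbitrary sample coefficients (illustrative values only).
a = [0.5, 0.7, 0.4, 0.9, 0.6, 0.3]
b = [0.1, -0.2, 0.3, 0.0, -0.4, 0.2]

H, P = lax_matrices(a, b)
da, db = flaschka_rhs(a, b)
dH, _ = lax_matrices(da, db)  # H is linear in (a, b), so dH/dt has the same pattern
PH, HP = matmul(P, H), matmul(H, P)
N = len(a)
residual = max(abs(dH[i][j] - (PH[i][j] - HP[i][j]))
               for i in range(N) for j in range(N))
skew_defect = max(abs(P[i][j] + P[j][i]) for i in range(N) for j in range(N))
```

A vanishing `residual` confirms (1.11) for these coefficients, and a vanishing `skew_defect` confirms that the finite matrix P is skew-symmetric, in line with the hypothesis of Theorem 1.1.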
This result has several important consequences. First of all, it implies global existence of solutions of the Toda lattice. In fact, considering the Banach space of all bounded real-valued coefficients (a(n), b(n)) (with the sup norm), local existence follows from standard results for differential equations in Banach spaces. Moreover, Theorem 1.1 implies that the norm $\|H(t)\|$ is constant, which in turn provides a uniform bound on the coefficients of H(t),

$$\|a(t)\|_\infty + \|b(t)\|_\infty \le 2\|H(t)\| = 2\|H(0)\|. \tag{1.15}$$

Hence solutions of the Toda lattice cannot blow up and are global in time (see [38, Sec. 12.2] for details). Second, it provides an infinite sequence of conservation laws expected from a completely integrable system. Indeed, if the Lax equation (1.11) holds for H(t), it automatically also holds for $H(t)^j$. Taking traces shows that

$$\operatorname{tr}\left( H(t)^j - H_0^j \right), \qquad j \in \mathbb{N}, \tag{1.16}$$

is an infinite sequence of conserved quantities, where $H_0$ is the operator corresponding to the constant solution $a_0(n,t) = \frac{1}{2}$, $b_0(n,t) = 0$ (it is needed to make the trace
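converge). Introducing a suitable symplectic structure, they can be shown to be in involution as well ([18, Sec. 1.7]). For example,

$$\begin{aligned} \operatorname{tr}(H(t) - H_0) &= \sum_{n\in\mathbb{Z}} b(n,t) = -\frac{1}{2} \sum_{n\in\mathbb{Z}} p(n,t) \quad\text{and} \\ \operatorname{tr}(H(t)^2 - H_0^2) &= \sum_{n\in\mathbb{Z}} \left( b(n,t)^2 + 2\left( a(n,t)^2 - \frac{1}{4} \right) \right) = \frac{1}{2} H(p,q) \end{aligned} \tag{1.17}$$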
correspond to conservation of the total momentum and the total energy, respectively. These observations pave the way for a solution of the Toda equation via the inverse scattering transform originally invented by Gardner, Greene, Kruskal and Miura [17] for the Korteweg–de Vries equation (see [38, Sec. 13.4] for the case of the Toda lattice). In particular, Theorem 1.1 implies that the operators H(t), t ∈ R, are unitarily equivalent and that the spectrum σ(H(t)) is independent of t. Now the general idea is to find suitable spectral data S(H(t)) for H(t) which uniquely determine H(t). Then, Eq. (1.11) can be used to derive linear evolution equations for S(H(t)) which are easy to solve. In our case these data will be the so-called scattering data and the formal procedure (which can be thought of as a nonlinear Fourier transform) is summarized below:

$$\begin{array}{ccc} S(H(0)) & \xrightarrow{\ \text{time evolution}\ } & S(H(t)) \\ \Big\uparrow {\scriptstyle \text{direct scattering}} & & \Big\downarrow {\scriptstyle \text{inverse scattering}} \\ (a(0), b(0)) & & (a(t), b(t)) \end{array}$$
The inverse scattering step will be done by reformulating the problem as a Riemann– Hilbert factorization problem. This Riemann–Hilbert problem will then be analyzed using the method of nonlinear steepest descent by Deift and Zhou [4] (which is the nonlinear analog of the steepest descent for Fourier type integrals). In fact, one of our goals is to give a complete and expository introduction to this method. We are trying to present a streamlined and simplified approach with complete proofs. In particular, we have added two appendices which show how to solve the localized Riemann–Hilbert problem on a small cross via parabolic cylinder functions and how to rewrite Riemann–Hilbert problems as singular integral equations. Only some basic knowledge on Riemann–Hilbert problems, which can be found for example in the beautiful lecture notes by Deift [3], is required. For further information on the history of the steepest descent method, which was inspired by earlier work of Manakov [28] and Its [19], and the problem of finding
the long-time asymptotics for integrable nonlinear wave equations, we refer to the survey by Deift, Its and Zhou [7]. More information on the Toda lattice can be found in the monographs by Faddeev and Takhtajan [14], Gesztesy, Holden, Michor and Teschl [18], Teschl [38], or Toda [40]. Here we partly followed the review article [39]. A much more comprehensive guide to the literature can be found in [18, Sec. 1.8]. First results on the long-time asymptotics of the doubly infinite Toda lattice were given by Novokshenov and Habibullin [32] and Kamvissis [20]. Long-time asymptotics for the finite and semi-infinite Toda lattice can be found in Moser [30] and Deift, Li and Tomei [9], respectively. The long-time behavior of the Toda shock problem was investigated by Kamvissis [21] and Venakides, Deift and Oba [41] and of the Toda rarefaction problem by Deift, Kamvissis, Kriecherbauer and Zhou [11]. For the case of a periodic driving force see Deift, Kriecherbauer and Venakides [8]. Finally, we also want to mention that one could replace the constant background solution by a periodic one. However, this case exhibits a much different behavior, as was pointed out by Kamvissis and Teschl in [22] (see also [12, 13, 23, 24, 26] for a rigorous mathematical treatment).

2. Main Results

As stated in the introduction, we want to compute the long-time asymptotics for the doubly infinite Toda lattice which reads in Flaschka's variables

$$\begin{aligned} \dot{b}(n,t) &= 2(a(n,t)^2 - a(n-1,t)^2), \\ \dot{a}(n,t) &= a(n,t)(b(n+1,t) - b(n,t)), \end{aligned} \qquad (n,t) \in \mathbb{Z}\times\mathbb{R}. \tag{2.1}$$

Here the dot denotes differentiation with respect to time. We will consider solutions (a, b) satisfying

$$\sum_{n} (1 + |n|)^{l+1} \left( \left| a(n,t) - \frac{1}{2} \right| + |b(n,t)| \right) < \infty \tag{2.2}$$

for some l ∈ N for one (and hence for all, see [38]) t ∈ R. It is well known that the corresponding initial value problem has unique global solutions which can be computed via the inverse scattering transform [38].
The long-time asymptotics were first derived by Novokshenov and Habibullin [32] and were later made rigorous by Kamvissis [20] under the additional assumption that no solitons are present. The case of solitons was recently investigated by us in [25]. As one of our main simplifications in contradistinction to [20], we will work with the vector Riemann–Hilbert problem which arises naturally from the inverse scattering theory, thus avoiding the detour over the associated matrix Riemann–Hilbert problem. This also avoids the singularities appearing in the matrix Riemann–Hilbert problem in case the reflection coefficient is −1 at the band edges. To state the main results, we begin by recalling that the sequences a(n, t), b(n, t), n ∈ Z, for fixed t ∈ R, are uniquely determined by their scattering data, that
February 11, 2009 13:39 WSPC/148-RMP
68
J070-00358
H. Kr¨ uger & G. Teschl
is, by its right reflection coefficient $R_+(z,t)$, $|z| = 1$, and its eigenvalues $\lambda_j \in (-\infty,-1)\cup(1,\infty)$, $j = 1,\dots,N$, together with the corresponding right norming constants $\gamma_{+,j}(t) > 0$, $j = 1,\dots,N$. It is well known that under the assumption (2.2) the reflection coefficients are $C^{l+1}(\mathbb{T})$. Rather than in the complex plane, we will work on the unit disc using the usual Joukowski transformation
\[
\lambda = \frac12\Bigl(z + \frac1z\Bigr), \qquad z = \lambda - \sqrt{\lambda^2 - 1}, \qquad \lambda \in \mathbb{C},\ |z| \le 1. \tag{2.3}
\]
In these new coordinates the eigenvalues $\lambda_j \in (-\infty,-1)\cup(1,\infty)$ will be denoted by $\zeta_j \in (-1,0)\cup(0,1)$. The continuous spectrum $[-1,1]$ is mapped to the unit circle $\mathbb{T}$. Moreover, the phase of the associated Riemann–Hilbert problem is given by
\[
\Phi(z) = z - z^{-1} + 2\,\frac{n}{t}\,\log(z) \tag{2.4}
\]
and the stationary phase points, $\Phi'(z) = 0$, are denoted by
\[
z_0 = -\frac{n}{t} - \sqrt{\Bigl(\frac{n}{t}\Bigr)^2 - 1}, \qquad
z_0^{-1} = -\frac{n}{t} + \sqrt{\Bigl(\frac{n}{t}\Bigr)^2 - 1} \tag{2.5}
\]
and correspond to
\[
\lambda_0 = -\frac{n}{t}. \tag{2.6}
\]
Here the branch of the square root is chosen such that $\operatorname{Im}(\sqrt{z}) \ge 0$. For $n/t < -1$ we have $z_0 \in (0,1)$, for $-1 \le n/t \le 1$ we have $z_0 \in \mathbb{T}$ (and hence $z_0^{-1} = \overline{z_0}$), and for $n/t > 1$ we have $z_0 \in (-1,0)$. For $|n/t| > 1$ we will also need the value $\zeta_0 \in (-1,0)\cup(0,1)$ defined via $\operatorname{Re}(\Phi(\zeta_0)) = 0$, that is,
\[
\frac{n}{t} = -\frac{\zeta_0 - \zeta_0^{-1}}{2\log(|\zeta_0|)}. \tag{2.7}
\]
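The definitions (2.5) and (2.7) can be explored numerically. The following Python sketch is only an illustration: the function names and the bisection solver are assumptions made here, and $z_0$ is selected as the root of $z^2 + 2(n/t)z + 1 = 0$ with $|z_0| \le 1$ (lower half of the unit circle when $|n/t| \le 1$), which is one way to realize the branch convention stated above.

```python
import math

def stationary_point(x):
    """Stationary phase point z0 for x = n/t, cf. (2.5).

    z0 is taken as the root of z^2 + 2*x*z + 1 = 0 with |z0| <= 1; on the
    unit circle the root in the lower half-plane is used (assumption).
    """
    if abs(x) <= 1:
        return complex(-x, -math.sqrt(1 - x * x))
    s = math.sqrt(x * x - 1)
    return complex(-x - s if x < -1 else -x + s, 0.0)

def phase_derivative(z, x):
    """Phi'(z) for Phi(z) = z - 1/z + 2*x*log(z), cf. (2.4)."""
    return 1 + 1 / z**2 + 2 * x / z

def soliton_velocity(zeta):
    """Right-hand side of (2.7): -(zeta - 1/zeta) / (2 log|zeta|)."""
    return -(zeta - 1 / zeta) / (2 * math.log(abs(zeta)))

def zeta_zero(x, iterations=200):
    """Solve (2.7) for zeta0 by bisection when |x| > 1."""
    if abs(x) <= 1:
        return -1.0  # notational convention used in the text
    target = -abs(x)  # on (0, 1) the velocity increases from -inf to -1
    lo, hi = 1e-12, 1 - 1e-9
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if soliton_velocity(mid) < target:
            lo = mid
        else:
            hi = mid
    zeta = 0.5 * (lo + hi)
    # the velocity is odd in zeta, so x > 1 corresponds to -zeta
    return zeta if x < 0 else -zeta
```

For example, `stationary_point(-1.5)` and `zeta_zero(-1.5)` return two points of $(0,1)$ which can be compared with the ordering of $\zeta_0$ and $z_0$ described in the text.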
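We will set $\zeta_0 = -1$ if $|n/t| \le 1$ for notational convenience. A simple analysis shows that for $n/t < -1$ we have $0 < \zeta_0 < z_0 < 1$ and for $n/t > 1$ we have $-1 < z_0 < \zeta_0 < 0$. Furthermore, recall that the transmission coefficient $T(z)$, $|z| \le 1$, is time independent and can be reconstructed using the Poisson–Jensen formula. In particular, we define the partial transmission coefficient with respect to $z_0$ by
\[
T(z,z_0) =
\begin{cases}
\displaystyle\prod_{\zeta_k \in (\zeta_0,0)} |\zeta_k|\,\frac{z - \zeta_k^{-1}}{z - \zeta_k}, & z_0 \in (-1,0),\\[2ex]
\displaystyle\prod_{\zeta_k \in (-1,0)} |\zeta_k|\,\frac{z - \zeta_k^{-1}}{z - \zeta_k}\,
\exp\Bigl(\frac{1}{2\pi i}\int_{\overline{z_0}}^{z_0} \log(|T(s)|)\,\frac{s+z}{s-z}\,\frac{ds}{s}\Bigr), & |z_0| = 1,\\[2ex]
\displaystyle\prod_{\zeta_k \in (-1,0)\cup(\zeta_0,1)} |\zeta_k|\,\frac{z - \zeta_k^{-1}}{z - \zeta_k}\,
\exp\Bigl(\frac{1}{2\pi i}\int_{\mathbb{T}} \log(|T(s)|)\,\frac{s+z}{s-z}\,\frac{ds}{s}\Bigr), & z_0 \in (0,1).
\end{cases} \tag{2.8}
\]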
Here, in the case z0 ∈ T, the integral is to be taken along the arc Σ(z0 ) = {z ∈ T|Re(z) < Re(z0 )} oriented counterclockwise. For z0 ∈ (−1, 0) we set Σ(z0 ) = ∅
Long-Time Asymptotics of the Toda Lattice for Decaying Initial Data Revisited
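and for $z_0 \in (0,1)$ we set $\Sigma(z_0) = \mathbb{T}$. Then $T(z,z_0)$ is meromorphic for $z \in \mathbb{C}\setminus\Sigma(z_0)$. Observe that $T(z,z_0) = T(z)$ once $z_0 \in (0,1)$ and $(0,\zeta_0)$ contains no eigenvalues. Moreover, $T(z,z_0)$ can be computed in terms of the scattering data since $|T(z)|^2 = 1 - |R_+(z,t)|^2 = 1 - |R_+(z,0)|^2$. Moreover, we set
\[
T_0(z_0) = T(0,z_0) =
\begin{cases}
\displaystyle\prod_{\zeta_k \in (\zeta_0,0)} |\zeta_k|^{-1}, & z_0 \in (-1,0),\\[2ex]
\displaystyle\prod_{\zeta_k \in (-1,0)} |\zeta_k|^{-1}\,
\exp\Bigl(\frac{1}{2\pi i}\int_{\overline{z_0}}^{z_0} \log(|T(s)|)\,\frac{ds}{s}\Bigr), & |z_0| = 1,\\[2ex]
\displaystyle\prod_{\zeta_k \in (-1,0)\cup(\zeta_0,1)} |\zeta_k|^{-1}\,
\exp\Bigl(\frac{1}{2\pi i}\int_{\mathbb{T}} \log(|T(s)|)\,\frac{ds}{s}\Bigr), & z_0 \in (0,1),
\end{cases} \tag{2.9}
\]
and
\[
T_1(z_0) = \frac{\partial}{\partial z}\log T(z,z_0)\Big|_{z=0} =
\begin{cases}
\displaystyle\sum_{\zeta_k \in (\zeta_0,0)} (\zeta_k^{-1} - \zeta_k), & z_0 \in (-1,0),\\[2ex]
\displaystyle\sum_{\zeta_k \in (-1,0)} (\zeta_k^{-1} - \zeta_k)
+ \frac{1}{\pi i}\int_{\overline{z_0}}^{z_0} \log(|T(s)|)\,\frac{ds}{s^2}, & |z_0| = 1,\\[2ex]
\displaystyle\sum_{\zeta_k \in (-1,0)\cup(\zeta_0,1)} (\zeta_k^{-1} - \zeta_k)
+ \frac{1}{\pi i}\int_{\mathbb{T}} \log(|T(s)|)\,\frac{ds}{s^2}, & z_0 \in (0,1).
\end{cases} \tag{2.10}
\]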
In other words, $T(z,z_0) = T_0(z_0)\bigl(1 + T_1(z_0)z + O(z^2)\bigr)$.

Theorem 2.1 (Soliton Region). Assume (2.2) for some $l \in \mathbb{N}$ and abbreviate by $c_k = -\dfrac{\zeta_k - \zeta_k^{-1}}{2\log(|\zeta_k|)}$ the velocity of the $k$th soliton determined by $\operatorname{Re}(\Phi(\zeta_k)) = 0$. Then the asymptotics in the soliton region, $|n/t| \ge 1 + C/t\,\log(t)^2$ for any $C > 0$, are as follows. Let $\varepsilon > 0$ be sufficiently small such that the intervals $[c_k - \varepsilon, c_k + \varepsilon]$, $1 \le k \le N$, are disjoint and lie inside $(-\infty,-1)\cup(1,\infty)$. If $|n/t - c_k| < \varepsilon$ for some $k$, the solution is asymptotically given by a single soliton
\[
\begin{aligned}
\prod_{j=n}^{\infty} (2a(j,t)) &= T_0(z_0)\Biggl(\sqrt{\frac{1 - \zeta_k^2 + \gamma_k(n,t)}{1 - \zeta_k^2 + \gamma_k(n+1,t)}} + O(t^{-l})\Biggr),\\
\sum_{j=n+1}^{\infty} b(j,t) &= \frac12 T_1(z_0) + \frac{\gamma_k(n,t)\,\zeta_k(1 - \zeta_k^2)}{2\bigl((\gamma_k(n,t) - 1)\zeta_k^2 + 1\bigr)} + O(t^{-l}),
\end{aligned} \tag{2.11}
\]
where
\[
\gamma_k(n,t) = \gamma_k\, T\bigl(\zeta_k, -c_k - \sqrt{c_k^2 - 1}\bigr)^{-2}\, e^{t(\zeta_k - \zeta_k^{-1})}\, \zeta_k^{2n}. \tag{2.12}
\]
If $|n/t - c_k| \ge \varepsilon$ for all $k$, one has
\[
\begin{aligned}
\prod_{j=n}^{\infty} (2a(j,t)) &= T_0(z_0)\bigl(1 + O(t^{-l})\bigr),\\
\sum_{j=n+1}^{\infty} b(j,t) &= \frac12 T_1(z_0) + O(t^{-l}).
\end{aligned} \tag{2.13}
\]
Note that one can choose $|n/t - c_k| < \varepsilon_1$ for the regions where (2.11) is valid, respectively $|n/t - c_k| \ge \varepsilon_2$ for the regions where (2.13) is valid, such that the regions overlap if $\varepsilon_1 > \varepsilon_2$. Due to the exponential decay of the one-soliton solution, both formulas of course produce the same result on the overlap.

In particular, we recover the well-known fact that the solution splits into a sum of independent solitons, where the presence of the other solitons and of the radiation part corresponding to the continuous spectrum manifests itself in phase shifts given by $T\bigl(\zeta_k, -c_k - \sqrt{c_k^2 - 1}\bigr)^{-2}$. Indeed, notice that for $\zeta_k \in (-1,0)$ this term just contains the product over the Blaschke factors corresponding to solitons $\zeta_j$ with $\zeta_k < \zeta_j$. For $\zeta_k \in (0,1)$, we have the product over the Blaschke factors corresponding to solitons $\zeta_j \in (-1,0)$, the integral over the full unit circle, plus the product over the Blaschke factors corresponding to solitons $\zeta_j$ with $\zeta_k > \zeta_j$.

Furthermore, this result shows that in the region $n/t > 1$ the solution is asymptotically given by an $N_-$-soliton solution, where $N_-$ is the number of $\zeta_j \in (-1,0)$, formed from the data $\zeta_j$, $\gamma_j$ for all $\zeta_j \in (-1,0)$. Similarly, in the region $n/t < -1$ the solution is asymptotically given by an $N_+$-soliton solution, where $N_+$ is the number of $\zeta_j \in (0,1)$, formed from the data $\zeta_j$, $\tilde\gamma_j$ for all $\zeta_j \in (0,1)$, where
\[
\tilde\gamma_j = \gamma_j \prod_{\zeta_k \in (-1,0)} |\zeta_k|\,\frac{\zeta_j - \zeta_k^{-1}}{\zeta_j - \zeta_k}\,
\exp\Bigl(\frac{1}{2\pi i}\int_{\mathbb{T}} \log(|T(s)|)\,\frac{s + \zeta_j}{s - \zeta_j}\,\frac{ds}{s}\Bigr). \tag{2.14}
\]
In the remaining region, we will show

Theorem 2.2 (Similarity Region). Assume (2.2) with $l \ge 5$. Then, away from the soliton region, $|n/t| \le 1 - C$ for any $C > 0$, the asymptotics are given by
\[
\prod_{j=n}^{\infty} (2a(j,t)) = T_0(z_0)\Biggl(1 + \Bigl(\frac{\nu(z_0)}{-2\sin(\theta_0)\,t}\Bigr)^{1/2}
\cos\bigl(t\Phi_0(z_0) + \nu(z_0)\log(t) - \delta(z_0)\bigr) + O(t^{-\alpha})\Biggr), \tag{2.15}
\]
\[
\sum_{j=n+1}^{\infty} b(j,t) = \frac12 T_1(z_0) + \Bigl(\frac{\nu(z_0)}{-2\sin(\theta_0)\,t}\Bigr)^{1/2}
\cos\bigl(t\Phi_0(z_0) + \nu(z_0)\log(t) - \delta(z_0) + \theta_0\bigr) + O(t^{-\alpha}), \tag{2.16}
\]
for any $\alpha < 1$. Here $z_0 = e^{i\theta_0}$,
\[
\begin{aligned}
\Phi_0(z_0) &= 2\bigl(\sin(\theta_0) - \theta_0\cos(\theta_0)\bigr),\\
\nu(z_0) &= -\frac{1}{\pi}\log(|T(z_0)|),\\
\delta(z_0) &= \pi/4 - 3\nu(z_0)\log|2\sin(\theta_0)| + 2\arg(\tilde T(z_0)) - \arg(R_+(z_0,0)) + \arg(\Gamma(i\nu(z_0))),\\
\tilde T(z_0) &= \prod_{\zeta_k \in (-1,0)} |\zeta_k|\,\frac{z_0 - \zeta_k^{-1}}{z_0 - \zeta_k}\cdot
\exp\Bigl(\frac{1}{2\pi i}\int_{\overline{z_0}}^{z_0} \log\Bigl(\frac{|T(s)|}{|T(z_0)|}\Bigr)\frac{s + z_0}{s - z_0}\,\frac{ds}{s}\Bigr),
\end{aligned} \tag{2.17}
\]
and $\Gamma(z)$ is the gamma function.

For $a(n,t)$, respectively $b(n,t)$, we obtain as a simple consequence:

Corollary 2.3. Under the same assumptions as in Theorem 2.2 we have
\[
a(n,t) = \frac12 + \Bigl(\frac{-\sin(\theta_0)\,\nu(z_0)}{2t}\Bigr)^{1/2}
\cos\bigl(t\Phi_0(z_0) + \nu(z_0)\log(t) - \delta(z_0) - \theta_0\bigr) + O(t^{-\alpha}), \tag{2.18}
\]
\[
b(n,t) = \Bigl(\frac{-2\sin(\theta_0)\,\nu(z_0)}{t}\Bigr)^{1/2}
\sin\bigl(t\Phi_0(z_0) + \nu(z_0)\log(t) - \delta(z_0) + 2\theta_0\bigr) + O(t^{-\alpha}). \tag{2.19}
\]
Proof. To get the first formula we use $a(n,t) = \frac12 \prod_{j=n}^{\infty}(2a(j,t)) \big/ \prod_{j=n+1}^{\infty}(2a(j,t))$. Now set $x = n/t$ and observe $\theta_0\bigl(\frac{n\pm1}{t}\bigr) = \theta_0\bigl(x \pm \frac1t\bigr) = \theta_0(x) \pm \theta_0'(x)\frac1t + O(t^{-2})$ uniformly in $|x| \le 1 - C$. Similarly for the other terms, and hence one checks that the only difference up to $O(t^{-\alpha})$ errors in the above formulas for $n$ and $n \pm 1$ is a $\mp 2\theta_0$ in the argument of the cosine (stemming from the $t\Phi_0(z_0)$ term). The second formula follows in the same manner from $b(n,t) = \sum_{j=n}^{\infty} b(j,t) - \sum_{j=n+1}^{\infty} b(j,t)$.

This is illustrated in Fig. 3, which shows the same solution as in Fig. 2 but in Flaschka's variables. It is also interesting to look at the relation between the energy $\lambda$ of the underlying Lax operator $H$ and the propagation speed at which the corresponding parts of the Toda lattice travel, that is, the analog of the classical dispersion relation. By the above theorems, the nonlinear dispersion relation is given by (see Fig. 4)
\[
v(\lambda) = \frac{n}{t}, \tag{2.20}
\]
where
\[
v(\lambda) =
\begin{cases}
-\lambda, & \lambda \in [-1,1],\\[1ex]
\dfrac{\sqrt{\lambda^2 - 1}}{\log\bigl(|\lambda - \sqrt{\lambda^2 - 1}|\bigr)}, & \lambda \in (-\infty,-1]\cup[1,\infty).
\end{cases} \tag{2.21}
\]
Fig. 3. Numerically computed solution a(n, 150) of the Toda lattice in Flaschka's variables.

Fig. 4. Nonlinear dispersion relation for the Toda lattice.
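The dispersion relation (2.21) is easy to tabulate. The sketch below is an illustration only; the function name `velocity` is chosen here, and the sign of the square root for $|\lambda| > 1$ is fixed by computing the Joukowski image $\zeta$ of $\lambda$ inside the unit disc and using $v = -(\zeta - \zeta^{-1})/(2\log|\zeta|)$, which agrees with (2.21) under the branch convention of (2.3).

```python
import math

def velocity(lam):
    """Nonlinear dispersion relation v(lambda), cf. (2.21).

    For |lambda| > 1 the value is computed from the point zeta with
    |zeta| < 1 and lambda = (zeta + 1/zeta)/2, which fixes the sign of
    the square root in (2.21).
    """
    if abs(lam) <= 1:
        return -lam
    root = math.sqrt(lam * lam - 1)
    zeta = lam - math.copysign(root, lam)  # Joukowski image inside the disc
    return -(zeta - 1 / zeta) / (2 * math.log(abs(zeta)))

if __name__ == "__main__":
    for lam in (-3.0, -1.25, -0.5, 0.0, 0.5, 1.25, 3.0):
        print(f"lambda = {lam:5.2f}   v = {velocity(lam): .6f}")
```

Energies in the continuous spectrum travel with speed at most 1, while eigenvalues outside $[-1,1]$ correspond to solitons with $|v| > 1$.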
We will not address the asymptotics in the missing region around |n| ≈ t. In the case |R+ (z, 0)| < 1, the solution can be given in terms of Painlevé II transcendents. If |R+ (z, 0)| = 1 (which is the generic case), an additional region, the collisionless shock region, will appear where the solution can be described in terms of elliptic functions. For the Painlevé region we refer to [4, 20]. For the collisionless shock region, an outline using the g-function method was given in [10] (for the case of
the Korteweg–de Vries equation). The case of the Toda lattice will be dealt with in [29]. We also remark that the present methods can also be used to obtain further terms in the asymptotic expansion [5]. Finally, note that one can obtain the asymptotics for n ≥ 0 from the ones for n ≤ 0 by virtue of a simple reflection. Similarly for t ≥ 0 versus t ≤ 0.

Lemma 2.4. Suppose $a(n,t)$, $b(n,t)$ satisfy the Toda equation (2.1), then so do
\[
\tilde a(n,t) = a(-n-1,t), \qquad \tilde b(n,t) = -b(-n,t),
\]
respectively,
\[
\tilde a(n,t) = a(n,-t), \qquad \tilde b(n,t) = -b(n,-t).
\]
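Both symmetries of Lemma 2.4 can be checked directly on the right-hand side of (2.1). The following Python sketch does this on a periodic ring of sites; the periodic closure and all function names are assumptions made for the illustration (the lemma itself concerns the doubly infinite lattice), but the index manipulations are the same.

```python
import random

def toda_rhs(a, b):
    """Right-hand side of (2.1) on a periodic ring (indices taken mod N)."""
    n_sites = len(a)
    da = [a[n] * (b[(n + 1) % n_sites] - b[n]) for n in range(n_sites)]
    db = [2 * (a[n] ** 2 - a[n - 1] ** 2) for n in range(n_sites)]
    return da, db

def reflect(a, b):
    """Spatial reflection: a~(n) = a(-n-1), b~(n) = -b(-n)."""
    n_sites = len(a)
    return ([a[(-n - 1) % n_sites] for n in range(n_sites)],
            [-b[(-n) % n_sites] for n in range(n_sites)])

def symmetry_defects(a, b):
    """Largest violation of the two symmetries for the given state."""
    da, db = toda_rhs(a, b)
    # Reflection: d/dt of the reflected state is the reflected derivative.
    ra, rb = toda_rhs(*reflect(a, b))
    ea, eb = reflect(da, db)
    reflection = max(abs(x - y) for x, y in zip(ra + rb, ea + eb))
    # Time reversal: (a, b) -> (a, -b) must give the derivative (-da, db).
    ta, tb = toda_rhs(list(a), [-x for x in b])
    reversal = max(abs(x - y) for x, y in zip(ta + tb, [-x for x in da] + db))
    return reflection, reversal

if __name__ == "__main__":
    rng = random.Random(0)
    a = [0.5 + 0.1 * rng.random() for _ in range(7)]
    b = [rng.uniform(-0.2, 0.2) for _ in range(7)]
    print(symmetry_defects(a, b))
```

Both defects vanish up to rounding for arbitrary states, reflecting that the symmetries hold at the level of the vector field and not only for special solutions.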
3. The Inverse Scattering Transform and the Riemann–Hilbert Problem

In this section, we want to derive the Riemann–Hilbert problem from scattering theory. The special case without eigenvalues was first given in Kamvissis [20]. How eigenvalues can be added was first shown in Deift, Kamvissis, Kriecherbauer and Zhou [11]. We essentially follow [25] in this section. For the necessary results from scattering theory, respectively the inverse scattering transform for the Toda lattice, we refer to [36–38]. Associated with $a(t)$, $b(t)$ is a self-adjoint Jacobi operator
\[
H(t) = a(t)S^+ + a^-(t)S^- + b(t) \tag{3.1}
\]
in $\ell^2(\mathbb{Z})$, where $S^\pm f(n) = f^\pm(n) = f(n \pm 1)$ are the usual shift operators and $\ell^2(\mathbb{Z})$ denotes the Hilbert space of square summable (complex-valued) sequences over $\mathbb{Z}$. By our assumption (2.2), the spectrum of $H$ consists of an absolutely continuous part $[-1,1]$ plus a finite number of eigenvalues $\lambda_k \in \mathbb{R}\setminus[-1,1]$, $1 \le k \le N$. In addition, there exist two Jost functions $\psi_\pm(z,n,t)$ which solve the recurrence equation
\[
H(t)\psi_\pm(z,n,t) = \frac{z + z^{-1}}{2}\,\psi_\pm(z,n,t), \qquad |z| \le 1, \tag{3.2}
\]
and asymptotically look like the free solutions
\[
\lim_{n\to\pm\infty} z^{\mp n}\psi_\pm(z,n,t) = 1. \tag{3.3}
\]
Both $\psi_\pm(z,n,t)$ are analytic for $0 < |z| < 1$ with smooth boundary values for $|z| = 1$. The asymptotics of the two Jost functions are
\[
\psi_\pm(z,n,t) = \frac{z^{\pm n}}{A_\pm(n,t)}\bigl(1 + 2B_\pm(n,t)z + O(z^2)\bigr), \tag{3.4}
\]
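as $z \to 0$, where
\[
\begin{aligned}
A_+(n,t) &= \prod_{j=n}^{\infty} 2a(j,t), & B_+(n,t) &= -\sum_{j=n+1}^{\infty} b(j,t),\\
A_-(n,t) &= \prod_{j=-\infty}^{n-1} 2a(j,t), & B_-(n,t) &= -\sum_{j=-\infty}^{n-1} b(j,t).
\end{aligned} \tag{3.5}
\]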
One has the scattering relations
\[
T(z)\psi_\mp(z,n,t) = \overline{\psi_\pm(z,n,t)} + R_\pm(z,t)\psi_\pm(z,n,t), \qquad |z| = 1, \tag{3.6}
\]
where $T(z)$, $R_\pm(z,t)$ are the transmission, respectively reflection, coefficients. The transmission and reflection coefficients have the following well-known properties ([38, Sec. 10.2]):

Lemma 3.1. The transmission coefficient $T(z)$ has a meromorphic extension to the interior of the unit circle with simple poles at the images of the eigenvalues $\zeta_j$. The residues of $T(z)$ are given by
\[
\operatorname{Res}_{\zeta_k} T(z) = -\zeta_k\frac{\gamma_{+,k}(t)}{\mu_k(t)} = -\zeta_k\gamma_{-,k}(t)\mu_k(t), \tag{3.7}
\]
where
\[
\gamma_{\pm,k}(t)^{-1} = \sum_{n\in\mathbb{Z}} |\psi_\pm(\zeta_k,n,t)|^2 \tag{3.8}
\]
and $\psi_-(\zeta_k,n,t) = \mu_k(t)\psi_+(\zeta_k,n,t)$. Moreover,
\[
T(z)\overline{R_+(z,t)} + \overline{T(z)}R_-(z,t) = 0, \qquad |T(z)|^2 + |R_\pm(z,t)|^2 = 1. \tag{3.9}
\]
In particular, one reflection coefficient, say $R(z,t) = R_+(z,t)$, and one set of norming constants, say $\gamma_k(t) = \gamma_{+,k}(t)$, suffices. Moreover, the time dependence is given by ([38, Theorem 13.4]):

Lemma 3.2. The time evolutions of the quantities $R_+(z,t)$, $\gamma_{+,k}(t)$ are given by
\[
R(z,t) = R(z)e^{t(z - z^{-1})}, \tag{3.10}
\]
\[
\gamma_k(t) = \gamma_k e^{t(\zeta_k - \zeta_k^{-1})}, \tag{3.11}
\]
where $R(z) = R(z,0)$ and $\gamma_k = \gamma_k(0)$.

Now we define the sectionally meromorphic vector
\[
m(z,n,t) =
\begin{cases}
\bigl(T(z)\psi_-(z,n,t)z^{n} \quad \psi_+(z,n,t)z^{-n}\bigr), & |z| < 1,\\[1ex]
\bigl(\psi_+(z^{-1},n,t)z^{n} \quad T(z^{-1})\psi_-(z^{-1},n,t)z^{-n}\bigr), & |z| > 1.
\end{cases} \tag{3.12}
\]
We are interested in the jump condition of $m(z,n,t)$ on the unit circle $\mathbb{T}$ (oriented counterclockwise). To formulate our jump condition we use the following convention: When representing functions on $\mathbb{T}$, the lower subscript denotes the non-tangential limit from different sides,
\[
m_\pm(z) = \lim_{\zeta\to z,\ |\zeta|^{\pm1} < 1} m(\zeta), \qquad |z| = 1. \tag{3.13}
\]
In general, for an oriented contour $\Sigma$, $m_+(z)$ (respectively, $m_-(z)$) will denote the limit of $m(\zeta)$ as $\zeta \to z$ from the positive (respectively, negative) side of $\Sigma$. Here the positive (respectively, negative) side is the one which lies to the left (respectively, right) as one traverses the contour in the direction of the orientation. Using the notation above implicitly assumes that these limits exist in the sense that $m(z)$ extends to a continuous function on the boundary.

Theorem 3.3 (Vector Riemann–Hilbert Problem). Let $S_+(H(0)) = \{R(z), |z| = 1;\ (\zeta_k,\gamma_k), 1 \le k \le N\}$ be the left scattering data of the operator $H(0)$. Then $m(z) = m(z,n,t)$ defined in (3.12) is a solution of the following vector Riemann–Hilbert problem.

Find a function $m(z)$ which is meromorphic away from the unit circle with simple poles at $\zeta_k$, $\zeta_k^{-1}$ and satisfies:

(i) The jump condition
\[
m_+(z) = m_-(z)v(z), \qquad
v(z) = \begin{pmatrix} 1 - |R(z)|^2 & -\overline{R(z)}e^{-t\Phi(z)}\\ R(z)e^{t\Phi(z)} & 1 \end{pmatrix}, \tag{3.14}
\]
for $z \in \mathbb{T}$,

(ii) the pole conditions
\[
\begin{aligned}
\operatorname{Res}_{\zeta_k} m(z) &= \lim_{z\to\zeta_k} m(z)\begin{pmatrix} 0 & 0\\ -\zeta_k\gamma_k e^{t\Phi(\zeta_k)} & 0 \end{pmatrix},\\
\operatorname{Res}_{\zeta_k^{-1}} m(z) &= \lim_{z\to\zeta_k^{-1}} m(z)\begin{pmatrix} 0 & \zeta_k^{-1}\gamma_k e^{t\Phi(\zeta_k)}\\ 0 & 0 \end{pmatrix},
\end{aligned} \tag{3.15}
\]

(iii) the symmetry condition
\[
m(z^{-1}) = m(z)\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}, \tag{3.16}
\]

(iv) and the normalization
\[
m(0) = (m_1 \quad m_2), \qquad m_1\cdot m_2 = 1, \qquad m_1 > 0. \tag{3.17}
\]

Here the phase is given by
\[
\Phi(z) = z - z^{-1} + 2\,\frac{n}{t}\,\log z. \tag{3.18}
\]
Proof. The jump condition (3.14) is a simple calculation using the scattering relations (3.6) plus (3.9). The pole conditions follow since $T(z)$ is meromorphic in $|z| < 1$ with simple poles at $\zeta_k$ and residues given by (3.7). The symmetry condition holds by construction and the normalization (3.17) is immediate from the following lemma.

Observe that the pole condition at $\zeta_k$ is sufficient since the one at $\zeta_k^{-1}$ follows by symmetry. Moreover, it can be shown that the solution of the above Riemann–Hilbert problem is unique [25]. However, we will not need this fact here and it will follow as a byproduct of our analysis at least for sufficiently large $t$. Moreover, we have the following asymptotic behavior near $z = 0$:

Lemma 3.4. The function $m(z,n,t)$ defined in (3.12) satisfies
\[
m(z,n,t) = \Bigl(A(n,t)\bigl(1 - 2B(n-1,t)z\bigr) \quad \frac{1}{A(n,t)}\bigl(1 + 2B(n,t)z\bigr)\Bigr) + O(z^2). \tag{3.19}
\]
Here $A(n,t) = A_+(n,t)$ and $B(n,t) = B_+(n,t)$ are defined in (3.5).

Proof. This follows from (3.4) and $T(z) = A_+A_-\bigl(1 - 2(B_+ - b + B_-)z + O(z^2)\bigr)$.
For our further analysis it will be convenient to rewrite the pole condition as a jump condition and hence turn our meromorphic Riemann–Hilbert problem into a holomorphic Riemann–Hilbert problem following [11]. Choose $\varepsilon$ so small that the discs $|z - \zeta_k| < \varepsilon$ are inside the set $\{z \mid 0 < |z| < 1\}$ and do not intersect. Then redefine $m$ in a neighborhood of $\zeta_k$, respectively $\zeta_k^{-1}$, according to
\[
m(z) =
\begin{cases}
m(z)\begin{pmatrix} 1 & 0\\ \dfrac{\zeta_k\gamma_k e^{t\Phi(\zeta_k)}}{z - \zeta_k} & 1 \end{pmatrix}, & |z - \zeta_k| < \varepsilon,\\[3ex]
m(z)\begin{pmatrix} 1 & -\dfrac{z\gamma_k e^{t\Phi(\zeta_k)}}{z - \zeta_k^{-1}}\\ 0 & 1 \end{pmatrix}, & |z^{-1} - \zeta_k| < \varepsilon,\\[3ex]
m(z), & \text{else}.
\end{cases} \tag{3.20}
\]
Then a straightforward calculation using $\operatorname{Res}_\zeta m = \lim_{z\to\zeta}(z - \zeta)m(z)$ shows

Lemma 3.5. Suppose $m(z)$ is redefined as in (3.20). Then $m(z)$ is holomorphic away from the unit circle and satisfies (3.14), (3.16), (3.17) and the pole conditions are replaced by the jump conditions
\[
\begin{aligned}
m_+(z) &= m_-(z)\begin{pmatrix} 1 & 0\\ \dfrac{\zeta_k\gamma_k e^{t\Phi(\zeta_k)}}{z - \zeta_k} & 1 \end{pmatrix}, & |z - \zeta_k| &= \varepsilon,\\
m_+(z) &= m_-(z)\begin{pmatrix} 1 & \dfrac{z\gamma_k e^{t\Phi(\zeta_k)}}{z - \zeta_k^{-1}}\\ 0 & 1 \end{pmatrix}, & |z^{-1} - \zeta_k| &= \varepsilon,
\end{aligned} \tag{3.21}
\]
where the small circle around $\zeta_k$ is oriented counterclockwise and the one around $\zeta_k^{-1}$ is oriented clockwise.

Finally, we note that the case of just one eigenvalue and zero reflection coefficient can be solved explicitly.

Lemma 3.6 (One Soliton Solution). Suppose there is only one eigenvalue and a vanishing reflection coefficient, that is, $S_+(H(t)) = \{R(z) \equiv 0, |z| = 1;\ (\zeta,\gamma)\}$ with $\zeta \in (-1,0)\cup(0,1)$ and $\gamma \ge 0$. Then the Riemann–Hilbert problem (3.14)–(3.17) has a unique solution, which is given by
\[
\begin{aligned}
m_0(z) &= \bigl(f(z) \quad f(1/z)\bigr),\\
f(z) &= \frac{1}{\sqrt{1 - \zeta^2 + \gamma(n,t)}\,\sqrt{1 - \zeta^2 + \zeta^2\gamma(n,t)}}
\Bigl(\gamma(n,t)\zeta^2\,\frac{z - \zeta^{-1}}{z - \zeta} + 1 - \zeta^2\Bigr),
\end{aligned} \tag{3.22}
\]
where $\gamma(n,t) = \gamma e^{t\Phi(\zeta)}$. In particular,
\[
A_+(n,t) = \sqrt{\frac{1 - \zeta^2 + \gamma(n,t)}{1 - \zeta^2 + \gamma(n,t)\zeta^2}}, \qquad
B_+(n,t) = \frac{\gamma(n,t)\zeta(\zeta^2 - 1)}{2\bigl(1 - \zeta^2 + \gamma(n,t)\zeta^2\bigr)}. \tag{3.23}
\]
Furthermore, the zero solution is the only solution of the corresponding vanishing problem where the normalization is replaced by $m(0) = (0 \quad m_2)$ with $m_2$ arbitrary.

Proof. By symmetry, the solution must be of the form $m_0(z) = \bigl(f(z) \quad f(1/z)\bigr)$, where $f(z)$ is meromorphic in $\mathbb{C}\cup\{\infty\}$ with the only possible pole at $\zeta$. Hence
\[
f(z) = \frac{1}{A}\Bigl(1 + 2\frac{B}{z - \zeta}\Bigr),
\]
where the unknown constants $A$ and $B$ are uniquely determined by the pole condition $\operatorname{Res}_\zeta f(z) = -\zeta\gamma(n,t)f(\zeta^{-1})$ and the normalization $f(0)f(\infty) = 1$, $f(0) > 0$.

4. Conjugation and Deformation

This section demonstrates how to conjugate our Riemann–Hilbert problem and deform the jump contours, such that the jumps will be exponentially close to the identity away from the stationary phase points. In order to do this, we will assume that $R(z)$ has an analytic extension to a strip around the unit circle throughout this and the following section. This is for example the case if the decay in (2.2) is exponential. We will eventually show how to remove this assumption in Sec. 6.
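Returning briefly to the one-soliton solution of Lemma 3.6: the formulas (3.22) and (3.23) can be tested numerically against the Toda equation (2.1). The Python sketch below is an illustration under stated assumptions: it recovers $a(n,t) = A_+(n,t)/(2A_+(n+1,t))$ and $b(n,t) = B_+(n,t) - B_+(n-1,t)$ from (3.5), writes $\gamma(n,t) = \gamma e^{t(\zeta - \zeta^{-1})}\zeta^{2n}$, and approximates time derivatives by central differences; the function names and the step size `h` are choices made here.

```python
import math

def gamma_nt(gamma, zeta, n, t):
    """gamma(n, t) = gamma * exp(t * Phi(zeta)) = gamma * e^{t(zeta - 1/zeta)} * zeta^{2n}."""
    return gamma * math.exp(t * (zeta - 1 / zeta)) * zeta ** (2 * n)

def A_plus(gamma, zeta, n, t):
    g = gamma_nt(gamma, zeta, n, t)
    return math.sqrt((1 - zeta**2 + g) / (1 - zeta**2 + g * zeta**2))

def B_plus(gamma, zeta, n, t):
    g = gamma_nt(gamma, zeta, n, t)
    return g * zeta * (zeta**2 - 1) / (2 * (1 - zeta**2 + g * zeta**2))

def a(gamma, zeta, n, t):
    # A_+(n) = prod_{j >= n} 2 a(j)  =>  a(n) = A_+(n) / (2 A_+(n + 1))
    return 0.5 * A_plus(gamma, zeta, n, t) / A_plus(gamma, zeta, n + 1, t)

def b(gamma, zeta, n, t):
    # B_+(n) = -sum_{j > n} b(j)  =>  b(n) = B_+(n) - B_+(n - 1)
    return B_plus(gamma, zeta, n, t) - B_plus(gamma, zeta, n - 1, t)

def f(z, g, zeta):
    """The function f of (3.22) for a given value g = gamma(n, t)."""
    norm = math.sqrt(1 - zeta**2 + g) * math.sqrt(1 - zeta**2 + zeta**2 * g)
    return (g * zeta**2 * (z - 1 / zeta) / (z - zeta) + 1 - zeta**2) / norm

def toda_residuals(gamma, zeta, n, t, h=1e-5):
    """Residuals of (2.1), with time derivatives from central differences."""
    da = (a(gamma, zeta, n, t + h) - a(gamma, zeta, n, t - h)) / (2 * h)
    db = (b(gamma, zeta, n, t + h) - b(gamma, zeta, n, t - h)) / (2 * h)
    ra = da - a(gamma, zeta, n, t) * (b(gamma, zeta, n + 1, t) - b(gamma, zeta, n, t))
    rb = db - 2 * (a(gamma, zeta, n, t) ** 2 - a(gamma, zeta, n - 1, t) ** 2)
    return ra, rb
```

The residuals returned by `toda_residuals` are of the size of the finite-difference error, and `f` reproduces the normalization $f(0) = A_+(n,t)$ implied by (3.19).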
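For easy reference we note the following result which can be checked by a straightforward calculation.

Lemma 4.1 (Conjugation). Assume that $\tilde\Sigma \subseteq \Sigma$. Let $D$ be a matrix of the form
\[
D(z) = \begin{pmatrix} d(z)^{-1} & 0\\ 0 & d(z) \end{pmatrix}, \tag{4.1}
\]
where $d : \mathbb{C}\setminus\tilde\Sigma \to \mathbb{C}$ is a sectionally analytic function. Set
\[
\tilde m(z) = m(z)D(z), \tag{4.2}
\]
then the jump matrix transforms according to
\[
\tilde v(z) = D_-(z)^{-1}v(z)D_+(z). \tag{4.3}
\]
If $d$ satisfies $d(z^{-1}) = d(z)^{-1}$ and $d(0) > 0$, then the transformation $\tilde m(z) = m(z)D(z)$ respects our symmetry, that is, $\tilde m(z)$ satisfies (3.16) if and only if $m(z)$ does.

In particular, we obtain
\[
\tilde v = \begin{pmatrix} v_{11} & v_{12}d^2\\ v_{21}d^{-2} & v_{22} \end{pmatrix}, \qquad z \in \Sigma\setminus\tilde\Sigma, \tag{4.4}
\]
respectively,
\[
\tilde v = \begin{pmatrix} \dfrac{d_-}{d_+}v_{11} & v_{12}d_+d_-\\[1.5ex] v_{21}d_+^{-1}d_-^{-1} & \dfrac{d_+}{d_-}v_{22} \end{pmatrix}, \qquad z \in \tilde\Sigma. \tag{4.5}
\]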
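The "straightforward calculation" behind (4.4) and (4.5) is a product of three $2\times2$ matrices. As a sanity check, the following Python sketch evaluates $D_-^{-1}vD_+$ for arbitrary complex entries and compares it with the closed forms; the helper names are illustrative and the numerical values are arbitrary test data, not quantities from the text.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(A):
    """Inverse of a 2x2 matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def conjugation_matrix(d):
    """D = diag(1/d, d), cf. (4.1)."""
    return [[1 / d, 0], [0, d]]

def conjugated_jump(v, d_minus, d_plus):
    """Left-hand side of (4.3): D_-^{-1} v D_+."""
    return matmul(matmul(inverse(conjugation_matrix(d_minus)), v),
                  conjugation_matrix(d_plus))

def formula_4_5(v, d_minus, d_plus):
    """Closed form (4.5); for d_minus == d_plus it reduces to (4.4)."""
    return [[d_minus / d_plus * v[0][0], v[0][1] * d_plus * d_minus],
            [v[1][0] / (d_plus * d_minus), d_plus / d_minus * v[1][1]]]

def max_entry_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))
```

Taking `d_minus == d_plus` covers the case $z \in \Sigma\setminus\tilde\Sigma$, where $d$ has no jump.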
In order to remove the poles there are two cases to distinguish. If $\operatorname{Re}(\Phi(\zeta_k)) < 0$ the corresponding jumps (3.21) are exponentially close to the identity as $t \to \infty$ and there is nothing to do.

Otherwise, if $\operatorname{Re}(\Phi(\zeta_k)) > 0$, we use conjugation to turn the jumps into exponentially decaying ones, again following Deift, Kamvissis, Kriecherbauer, and Zhou [11] (see also [25]). For this purpose, we will use the next lemma which shows how $\gamma_k e^{t\Phi(\zeta_k)}$ can be replaced by its inverse. It turns out that we will have to handle the poles at $\zeta_k$ and $\zeta_k^{-1}$ in one step in order to preserve symmetry and in order to not add additional poles elsewhere.

Lemma 4.2. Assume that the Riemann–Hilbert problem for $m$ has jump conditions near $\zeta$ and $\zeta^{-1}$ given by
\[
\begin{aligned}
m_+(z) &= m_-(z)\begin{pmatrix} 1 & 0\\ \dfrac{\gamma\zeta}{z - \zeta} & 1 \end{pmatrix}, & |z - \zeta| &= \varepsilon,\\
m_+(z) &= m_-(z)\begin{pmatrix} 1 & \dfrac{\gamma z}{z - \zeta^{-1}}\\ 0 & 1 \end{pmatrix}, & |z^{-1} - \zeta| &= \varepsilon.
\end{aligned} \tag{4.6}
\]
Then this Riemann–Hilbert problem is equivalent to a Riemann–Hilbert problem for $\tilde m$ which has jump conditions near $\zeta$ and $\zeta^{-1}$ given by
\[
\begin{aligned}
\tilde m_+(z) &= \tilde m_-(z)\begin{pmatrix} 1 & \dfrac{(\zeta z - 1)^2}{\zeta(z - \zeta)\gamma}\\ 0 & 1 \end{pmatrix}, & |z - \zeta| &= \varepsilon,\\
\tilde m_+(z) &= \tilde m_-(z)\begin{pmatrix} 1 & 0\\ \dfrac{(z - \zeta)^2}{\zeta z(\zeta z - 1)\gamma} & 1 \end{pmatrix}, & |z^{-1} - \zeta| &= \varepsilon,
\end{aligned}
\]
and all remaining data conjugated (as in Lemma 4.1) by
\[
D(z) = \begin{pmatrix} \dfrac{z - \zeta}{\zeta z - 1} & 0\\ 0 & \dfrac{\zeta z - 1}{z - \zeta} \end{pmatrix}. \tag{4.7}
\]
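Proof. To turn $\gamma$ into $\gamma^{-1}$, introduce $D$ by
\[
D(z) =
\begin{cases}
\begin{pmatrix} 1 & \dfrac{1}{\gamma}\dfrac{z - \zeta}{\zeta}\\[1.5ex] -\gamma\dfrac{\zeta}{z - \zeta} & 0 \end{pmatrix}
\begin{pmatrix} \dfrac{z - \zeta}{\zeta z - 1} & 0\\ 0 & \dfrac{\zeta z - 1}{z - \zeta} \end{pmatrix}, & |z - \zeta| < \varepsilon,\\[5ex]
\begin{pmatrix} 0 & \gamma\dfrac{z\zeta}{z\zeta - 1}\\[1.5ex] -\dfrac{1}{\gamma}\dfrac{z\zeta - 1}{z\zeta} & 1 \end{pmatrix}
\begin{pmatrix} \dfrac{z - \zeta}{\zeta z - 1} & 0\\ 0 & \dfrac{\zeta z - 1}{z - \zeta} \end{pmatrix}, & |z^{-1} - \zeta| < \varepsilon,\\[5ex]
\begin{pmatrix} \dfrac{z - \zeta}{\zeta z - 1} & 0\\ 0 & \dfrac{\zeta z - 1}{z - \zeta} \end{pmatrix}, & \text{else},
\end{cases}
\]
and note that $D(z)$ is analytic away from the two circles. Now set $\tilde m(z) = m(z)D(z)$, which is again symmetric by $D(z^{-1}) = \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} D(z) \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}$. The jumps along $|z - \zeta| = \varepsilon$ and $|z^{-1} - \zeta| = \varepsilon$ follow by a straightforward calculation and the remaining jumps follow from Lemma 4.1.

The jumps along $\mathbb{T}$ are of oscillatory type and our aim is to apply a contour deformation which will move them into regions where the oscillatory terms will decay exponentially. Since the jump matrix $v$ contains both $\exp(t\Phi)$ and $\exp(-t\Phi)$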
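we need to separate them in order to be able to move them to different regions of the complex plane. For this we will need the following factorizations of the jump condition (3.14). First of all
\[
v(z) = b_-(z)^{-1}b_+(z), \tag{4.8}
\]
where
\[
b_-(z) = \begin{pmatrix} 1 & \overline{R(z)}e^{-t\Phi(z)}\\ 0 & 1 \end{pmatrix}, \qquad
b_+(z) = \begin{pmatrix} 1 & 0\\ R(z)e^{t\Phi(z)} & 1 \end{pmatrix}.
\]
This will be the proper factorization for $z > z_0$. Here $z > z_0$ has to be understood as $\lambda(z) > \lambda_0$. Similarly, we have
\[
v(z) = B_-(z)^{-1}\begin{pmatrix} 1 - |R(z)|^2 & 0\\ 0 & \dfrac{1}{1 - |R(z)|^2} \end{pmatrix}B_+(z), \tag{4.9}
\]
where
\[
B_-(z) = \begin{pmatrix} 1 & 0\\ -\dfrac{R(z)e^{t\Phi(z)}}{1 - |R(z)|^2} & 1 \end{pmatrix}, \qquad
B_+(z) = \begin{pmatrix} 1 & -\dfrac{\overline{R(z)}e^{-t\Phi(z)}}{1 - |R(z)|^2}\\ 0 & 1 \end{pmatrix}.
\]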
This will be the proper factorization for $z < z_0$. To get rid of the diagonal part we need to solve the corresponding scalar Riemann–Hilbert problem. Moreover, for $z_0 \in (-1,0)$ we have $\operatorname{Re}(\Phi(z)) > 0$ for $z \in (\zeta_0,0)$ and $\operatorname{Re}(\Phi(z)) < 0$ for $z \in (-1,\zeta_0)\cup(0,1)$, for $z_0 \in \mathbb{T}$ we have $\operatorname{Re}(\Phi(z)) > 0$ for $z \in (-1,0)$ and $\operatorname{Re}(\Phi(z)) < 0$ for $z \in (0,1)$, and for $z_0 \in (0,1)$ we have $\operatorname{Re}(\Phi(z)) > 0$ for $z \in (-1,0)\cup(\zeta_0,1)$ and $\operatorname{Re}(\Phi(z)) < 0$ for $z \in (0,\zeta_0)$ (compare Fig. 5 and note that by $\operatorname{Re}(\Phi(z^{-1})) = -\operatorname{Re}(\Phi(z))$ the curves $\operatorname{Re}(\Phi(z)) = 0$ are symmetric with respect to $z \mapsto z^{-1}$). Together with the Blaschke factors needed to conjugate the jumps near the eigenvalues, this is just the partial transmission coefficient $T(z,z_0)$ introduced in (2.8). In fact, it satisfies the following scalar meromorphic Riemann–Hilbert problem:

Lemma 4.3. Set $\Sigma(z_0) = \emptyset$ for $z_0 \in (-1,0)$, $\Sigma(z_0) = \{z \in \mathbb{T} \mid \operatorname{Re}(z) < \operatorname{Re}(z_0)\}$ for $z_0 \in \mathbb{T}$, and $\Sigma(z_0) = \mathbb{T}$ for $z_0 \in (0,1)$. Then the partial transmission coefficient $T(z,z_0)$ is meromorphic for $z \in \mathbb{C}\setminus\Sigma(z_0)$, with simple poles at $\zeta_j$ and simple zeros at $\zeta_j^{-1}$ for all $j$ with $\frac12(\zeta_j + \zeta_j^{-1}) < \lambda_0$, and satisfies the jump condition
\[
T_+(z,z_0) = T_-(z,z_0)\bigl(1 - |R(z)|^2\bigr), \qquad z \in \Sigma(z_0).
\]
Moreover,

(i) $T(z^{-1},z_0) = T(z,z_0)^{-1}$, $z \in \mathbb{C}\setminus\Sigma(z_0)$, and $T(0,z_0) > 0$,

(ii) $\overline{T(z,z_0)} = T(\bar z,z_0)$, $z \in \mathbb{C}$, and in particular $T(z,z_0)$ is real-valued for $z \in \mathbb{R}$,

(iii) $T(z,z_0) = T(z)(C + o(1))$ with $C \ne 0$ for $|z| \le 1$ near $\pm1$ if $\pm1 \in \Sigma(z_0)$ and continuous otherwise.

Proof. That $\zeta_j$ are simple poles and $\zeta_j^{-1}$ are simple zeros is obvious from the Blaschke factors and that $T(z,z_0)$ has the given jump follows from Plemelj's formulas. (i)–(iii) are straightforward to check.

Observe that for $\zeta_0 < \zeta_N$ if $\zeta_N \in (0,1)$, respectively $\zeta_0 < 1$ else, we have $T(z) = T(z,z_0)$. Moreover, note that (i) and (ii) imply
\[
|T(z,z_0)|^2 = T(\bar z,z_0)T(z,z_0) = T(z^{-1},z_0)T(z,z_0) = 1, \qquad z \in \mathbb{T}\setminus\Sigma(z_0). \tag{4.10}
\]
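Before performing the conjugation, the two factorizations (4.8) and (4.9) of the jump matrix (3.14) can be confirmed numerically at a point of the unit circle. The Python sketch below is a minimal check with arbitrary test values for $R(z)$, $n/t$ and $t$; the helper names are illustrative assumptions, and `E` stands for $e^{t\Phi(z)}$, which is unimodular on $\mathbb{T}$.

```python
import cmath

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(A):
    """Inverse of a 2x2 matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def jump_data(R, E):
    """Jump matrix (3.14) and the factors of (4.8) and (4.9).

    R is the value R(z) with |R| < 1, E the value exp(t * Phi(z)).
    """
    Rbar = R.conjugate()
    q = 1 - abs(R) ** 2
    v = [[q, -Rbar / E], [R * E, 1]]
    b_minus = [[1, Rbar / E], [0, 1]]
    b_plus = [[1, 0], [R * E, 1]]
    B_minus = [[1, 0], [-R * E / q, 1]]
    B_plus = [[1, -Rbar / (E * q)], [0, 1]]
    middle = [[q, 0], [0, 1 / q]]
    return v, b_minus, b_plus, B_minus, B_plus, middle

def factorization_defects(R, E):
    """Largest entrywise deviation of (4.8) and of (4.9) from v."""
    v, bm, bp, Bm, Bp, mid = jump_data(R, E)
    first = matmul(inverse(bm), bp)
    second = matmul(matmul(inverse(Bm), mid), Bp)
    diff = lambda X: max(abs(X[i][j] - v[i][j]) for i in range(2) for j in range(2))
    return diff(first), diff(second)

if __name__ == "__main__":
    z, x, t = cmath.exp(0.7j), 0.2, 5.0
    E = cmath.exp(t * (z - 1 / z + 2 * x * cmath.log(z)))
    print(factorization_defects(0.3 - 0.4j, E))
```

The first factorization is triangular with the $e^{\pm t\Phi}$ terms separated, while the second moves the diagonal part $1 - |R|^2$ into the middle factor, which is what the partial transmission coefficient removes.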
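Now we are ready to perform our conjugation step. Introduce
\[
D(z) =
\begin{cases}
\begin{pmatrix} 1 & \dfrac{z - \zeta_k}{\zeta_k\gamma_k e^{t\Phi(\zeta_k)}}\\[1.5ex] -\dfrac{\zeta_k\gamma_k e^{t\Phi(\zeta_k)}}{z - \zeta_k} & 0 \end{pmatrix}D_0(z), & |z - \zeta_k| < \varepsilon,\ \lambda_k < \frac12(\zeta_0 + \zeta_0^{-1}),\\[5ex]
\begin{pmatrix} 0 & \dfrac{z\zeta_k\gamma_k e^{t\Phi(\zeta_k)}}{z\zeta_k - 1}\\[1.5ex] -\dfrac{z\zeta_k - 1}{z\zeta_k\gamma_k e^{t\Phi(\zeta_k)}} & 1 \end{pmatrix}D_0(z), & |z^{-1} - \zeta_k| < \varepsilon,\ \lambda_k < \frac12(\zeta_0 + \zeta_0^{-1}),\\[5ex]
D_0(z), & \text{else},
\end{cases}
\]
where
\[
D_0(z) = \begin{pmatrix} T(z,z_0)^{-1} & 0\\ 0 & T(z,z_0) \end{pmatrix}.
\]
Note that we have
\[
D(z^{-1}) = \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}D(z)\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}.
\]
Now we conjugate our vector $m(z)$ defined in (3.12), respectively (3.20), using $D(z)$,
\[
\tilde m(z) = m(z)D(z). \tag{4.11}
\]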
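Since $T(z,z_0)$ is either nonzero and continuous near $z = \pm1$ (if $\pm1 \notin \Sigma(z_0)$) or it has the same behaviour as $T(z)$ near $z = \pm1$ (if $\pm1 \in \Sigma(z_0)$), the new vector $\tilde m(z)$ is again continuous near $z = \pm1$ (even if $T(z)$ vanishes there).

Then, using Lemmas 4.1 and 4.2, the jumps corresponding to eigenvalues $\lambda_k < \frac12(\zeta_0 + \zeta_0^{-1})$ (if any) are given by
\[
\begin{aligned}
\tilde v(z) &= \begin{pmatrix} 1 & \dfrac{z - \zeta_k}{\zeta_k\gamma_k T(z,z_0)^{-2}e^{t\Phi(\zeta_k)}}\\ 0 & 1 \end{pmatrix}, & |z - \zeta_k| &= \varepsilon,\\
\tilde v(z) &= \begin{pmatrix} 1 & 0\\ \dfrac{\zeta_k z - 1}{\zeta_k z\gamma_k T(z,z_0)^{2}e^{t\Phi(\zeta_k)}} & 1 \end{pmatrix}, & |z^{-1} - \zeta_k| &= \varepsilon,
\end{aligned} \tag{4.12}
\]
and corresponding to eigenvalues $\lambda_k > \frac12(\zeta_0 + \zeta_0^{-1})$ (if any) by
\[
\begin{aligned}
\tilde v(z) &= \begin{pmatrix} 1 & 0\\ \dfrac{\zeta_k\gamma_k T(z,z_0)^{-2}e^{t\Phi(\zeta_k)}}{z - \zeta_k} & 1 \end{pmatrix}, & |z - \zeta_k| &= \varepsilon,\\
\tilde v(z) &= \begin{pmatrix} 1 & \dfrac{z\gamma_k T(z,z_0)^{2}e^{t\Phi(\zeta_k)}}{z - \zeta_k^{-1}}\\ 0 & 1 \end{pmatrix}, & |z^{-1} - \zeta_k| &= \varepsilon.
\end{aligned} \tag{4.13}
\]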
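In particular, an investigation of the sign of $\operatorname{Re}(\Phi(z))$ (see Fig. 5 below) shows that all off-diagonal entries of these jump matrices, except for possibly one if $\zeta_{k_0} = \zeta_0$ for some $k_0$, are exponentially decreasing. In the latter case we will keep the pole condition for $\zeta_{k_0} = \zeta_0$ which now reads
\[
\begin{aligned}
\operatorname{Res}_{\zeta_{k_0}}\tilde m(z) &= \lim_{z\to\zeta_{k_0}}\tilde m(z)\begin{pmatrix} 0 & 0\\ -\zeta_{k_0}\gamma_{k_0}T(\zeta_{k_0},z_0)^{-2}e^{t\Phi(\zeta_{k_0})} & 0 \end{pmatrix},\\
\operatorname{Res}_{\zeta_{k_0}^{-1}}\tilde m(z) &= \lim_{z\to\zeta_{k_0}^{-1}}\tilde m(z)\begin{pmatrix} 0 & \zeta_{k_0}^{-1}\gamma_{k_0}T(\zeta_{k_0},z_0)^{-2}e^{t\Phi(\zeta_{k_0})}\\ 0 & 0 \end{pmatrix}.
\end{aligned} \tag{4.14}
\]
Furthermore, the jump along $\mathbb{T}$ is given by
\[
\tilde v(z) =
\begin{cases}
\tilde b_-(z)^{-1}\tilde b_+(z), & \lambda(z) > \lambda_0,\\
\tilde B_-(z)^{-1}\tilde B_+(z), & \lambda(z) < \lambda_0,
\end{cases} \tag{4.15}
\]
where
\[
\tilde b_-(z) = \begin{pmatrix} 1 & \dfrac{R(z^{-1})e^{-t\Phi(z)}}{T(z^{-1},z_0)^2}\\ 0 & 1 \end{pmatrix}, \qquad
\tilde b_+(z) = \begin{pmatrix} 1 & 0\\ \dfrac{R(z)e^{t\Phi(z)}}{T(z,z_0)^2} & 1 \end{pmatrix}, \tag{4.16}
\]
and
\[
\tilde B_-(z) = \begin{pmatrix} 1 & 0\\ -\dfrac{T_-(z,z_0)^{-2}}{1 - R(z)R(z^{-1})}R(z)e^{t\Phi(z)} & 1 \end{pmatrix}, \qquad
\tilde B_+(z) = \begin{pmatrix} 1 & -\dfrac{T_+(z,z_0)^{2}}{1 - R(z)R(z^{-1})}R(z^{-1})e^{-t\Phi(z)}\\ 0 & 1 \end{pmatrix}. \tag{4.17}
\]
Here we have used $T_\pm(z^{-1},z_0) = T_\pm(\bar z,z_0) = \overline{T_\pm(z,z_0)}$ and $R(z^{-1}) = R(\bar z) = \overline{R(z)}$ for $z \in \mathbb{T}$ to show that there exists an analytic continuation into a neighborhood of the unit circle. Moreover, using
\[
T_\pm(z,z_0) = T_\mp(z^{-1},z_0)^{-1}, \qquad z \in \Sigma(z_0),
\]
we can write
\[
\frac{T_-(z,z_0)^{-2}}{1 - R(z)R(z^{-1})} = \frac{\overline{T_-(z,z_0)}}{T_-(z,z_0)}, \qquad
\frac{T_+(z,z_0)^{2}}{1 - R(z)R(z^{-1})} = \frac{T_+(z,z_0)}{\overline{T_+(z,z_0)}} \tag{4.18}
\]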
for $z \in \mathbb{T}$, which shows that the matrix entries are in fact bounded.

Now we deform the jump along $\mathbb{T}$ to move the oscillatory terms into regions where they are decaying. There are three cases to distinguish (see Fig. 5):

Case 1: $z_0 \in (-1,0)$. In this case we will set $\Sigma_\pm = \{z \mid |z| = (1 - \varepsilon)^{\pm1}\}$ for some small $\varepsilon \in (0,1)$ such that $\Sigma_\pm$ lies in the region with $\pm\operatorname{Re}(\Phi(z)) < 0$ and such that we do not intersect the original contours (i.e., we stay away from $\zeta_j^{\pm1}$). Then we can split our jump by redefining $\tilde m(z)$ according to
\[
\hat m(z) =
\begin{cases}
\tilde m(z)\tilde b_+(z)^{-1}, & (1 - \varepsilon) < |z| < 1,\\
\tilde m(z)\tilde b_-(z)^{-1}, & 1 < |z| < (1 - \varepsilon)^{-1},\\
\tilde m(z), & \text{else}.
\end{cases} \tag{4.19}
\]
It is straightforward to check that the jump along $\mathbb{T}$ disappears and the jump along $\Sigma_\pm$ is given by
\[
\hat v(z) =
\begin{cases}
\tilde b_+(z), & z \in \Sigma_+,\\
\tilde b_-(z)^{-1}, & z \in \Sigma_-.
\end{cases} \tag{4.20}
\]
The other jumps (4.12), (4.13) as well as the pole condition (4.14) (if present) are unchanged. Note that the resulting Riemann–Hilbert problem still satisfies our symmetry condition (3.16) since we have
\[
\tilde b_\pm(z^{-1}) = \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}\tilde b_\mp(z)\begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}.
\]

Fig. 5. Sign of Re(Φ(z)) for different values of z0.
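By construction all jumps (4.12), (4.13), and (4.19) are exponentially close to the identity as $t \to \infty$. The only non-decaying part is the pole condition (4.14) (if present).

Case 2: $z_0 \in \mathbb{T}\setminus\{\pm1\}$. In this case we will set $\Sigma_\pm = \Sigma_\pm^1 \cup \Sigma_\pm^2$ as indicated in Fig. 6. Again note that $\Sigma_\pm^1$, respectively $\Sigma_\mp^2$, lies in the region with $\pm\operatorname{Re}(\Phi(z)) < 0$ and must be chosen such that we do not intersect any other parts of the contour. Then we can split our jump by redefining $\tilde m(z)$ according to
\[
\hat m(z) =
\begin{cases}
\tilde m(z)\tilde b_+(z)^{-1}, & z \text{ between } \mathbb{T} \text{ and } \Sigma_+^1,\\
\tilde m(z)\tilde b_-(z)^{-1}, & z \text{ between } \mathbb{T} \text{ and } \Sigma_-^1,\\
\tilde m(z)\tilde B_+(z)^{-1}, & z \text{ between } \mathbb{T} \text{ and } \Sigma_+^2,\\
\tilde m(z)\tilde B_-(z)^{-1}, & z \text{ between } \mathbb{T} \text{ and } \Sigma_-^2,\\
\tilde m(z), & \text{else}.
\end{cases} \tag{4.21}
\]
One checks that the jump along $\mathbb{T}$ disappears and the jump along $\Sigma_\pm$ is given by
\[
\hat v(z) =
\begin{cases}
\tilde b_+(z), & z \in \Sigma_+^1,\\
\tilde b_-(z)^{-1}, & z \in \Sigma_-^1,\\
\tilde B_+(z), & z \in \Sigma_+^2,\\
\tilde B_-(z)^{-1}, & z \in \Sigma_-^2.
\end{cases} \tag{4.22}
\]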
All other jumps (4.12) and (4.13) are unchanged. Again the resulting Riemann– Hilbert problem still satisfies our symmetry condition (3.16) and the jump along
Fig. 6. Deformed contour.
$\Sigma_\pm$ away from the stationary phase points $z_0$, $z_0^{-1}$ is exponentially close to the identity as $t \to \infty$.

Case 3: $z_0 \in (0,1)$. In this case, we will set $\Sigma_\pm = \{z \mid |z| = (1 - \varepsilon)^{\pm1}\}$ for some small $\varepsilon \in (0,1)$ such that $\Sigma_\pm$ lies in the region with $\mp\operatorname{Re}(\Phi(z)) < 0$ and such that we do not intersect the original contours. Then we can split our jump by redefining $\tilde m(z)$ according to
\[
\hat m(z) =
\begin{cases}
\tilde m(z)\tilde B_+(z)^{-1}, & (1 - \varepsilon) < |z| < 1,\\
\tilde m(z)\tilde B_-(z)^{-1}, & 1 < |z| < (1 - \varepsilon)^{-1},\\
\tilde m(z), & \text{else}.
\end{cases} \tag{4.23}
\]
One checks that the jump along $\mathbb{T}$ disappears and the jump along $\Sigma_\pm$ is given by
\[
\hat v(z) =
\begin{cases}
\tilde B_+(z), & z \in \Sigma_+,\\
\tilde B_-(z)^{-1}, & z \in \Sigma_-.
\end{cases} \tag{4.24}
\]
The other jumps (4.12), (4.13) as well as the pole condition (4.14) (if present) are unchanged. Again the resulting Riemann–Hilbert problem still satisfies our symmetry condition (3.16) and all jumps (4.12), (4.13), and (4.23) are exponentially close to the identity as $t \to \infty$. The only non-decaying part is the pole condition (4.14) (if present).

In Cases 1 and 3, we can immediately apply Theorem B.6 to $\hat m$ as follows: If $|n/t - c_k| > \varepsilon$ for all $k$ we can choose $\gamma_0^t = 0$. Since the error between $\hat w^t$ and $\hat w_0^t$ is exponentially small, this proves the second part of Theorem 2.1 in the analytic case upon comparing
\[
m(z) = \hat m(z)\begin{pmatrix} T(z,z_0) & 0\\ 0 & T(z,z_0)^{-1} \end{pmatrix} \tag{4.25}
\]
with (3.19). The changes necessary for the general case will be given in Sec. 6.

Otherwise, if $|n/t - c_k| < \varepsilon$ for some $k$, we choose $\gamma_0^t = \gamma_k(n,t)$. Again we conclude that the error between $\hat w^t$ and $\hat w_0^t$ is exponentially small, proving the first part of Theorem 2.1. The changes necessary for the general case will also be given in Sec. 6.

In Case 2, the jump will not decay on the two small crosses containing the stationary phase points $z_0$ and $z_0^{-1}$. Hence we will need to continue the investigation of this problem in the next section.

5. Reduction to a Riemann–Hilbert Problem on a Small Cross

In the previous section, we have shown that for $z_0 \in \mathbb{T}\setminus\{\pm1\}$ we can reduce everything to a Riemann–Hilbert problem for $\hat m(z)$ such that the jumps are of order $O(t^{-1})$ except in small neighborhoods of the stationary phase points $z_0$ and $z_0^{-1}$. Denote by $\Sigma^C(z_0^{\pm1})$ the parts of $\Sigma_+ \cup \Sigma_-$ inside a small neighborhood of $z_0^{\pm1}$. In this section, we will show that everything can be reduced to solving the two problems in the two small crosses $\Sigma^C(z_0)$, respectively $\Sigma^C(z_0^{-1})$.
It will be slightly more convenient to use the alternate normalization
\[
\check m(z) = \frac{1}{\tilde A}\,\hat m(z), \qquad \tilde A = \frac{1}{T_0}A, \tag{5.1}
\]
such that
\[
\check m(0) = \Bigl(1 \quad \frac{1}{\tilde A^2}\Bigr). \tag{5.2}
\]
Without loss of generality, we can also assume that $\hat\Sigma$ consists of two straight lines in a sufficiently small neighborhood of $z_0$.

We will need the solution of the corresponding $2\times2$ matrix problem
\[
M_+^C(z) = M_-^C(z)\tilde v(z), \qquad z \in \Sigma^C, \qquad M^C(\infty) = \mathbb{I}, \tag{5.3}
\]
where the jump $\tilde v$ is the same as for $\tilde m(z)$ but restricted to a neighborhood of one of the two crosses $\Sigma^C = (\Sigma_+ \cup \Sigma_-) \cap \{z \mid |z - z_0| < \varepsilon/2\}$ for some small $\varepsilon > 0$.

As a first step we make a change of coordinates
\[
\zeta = \frac{\sqrt{-2\sin(\theta_0)}}{z_0 i}(z - z_0), \qquad z = z_0 + \frac{z_0 i}{\sqrt{-2\sin(\theta_0)}}\,\zeta \tag{5.4}
\]
such that the phase reads $\Phi(z) = i\Phi_0 + \frac{i}{2}\zeta^2 + O(\zeta^3)$. Here we have set
\[
z_0 = e^{i\theta_0}, \qquad \theta_0 \in (-\pi,0),
\]
respectively $\cos(\theta_0) = -n/t$, which implies
\[
\Phi_0 = 2\bigl(\sin(\theta_0) - \theta_0\cos(\theta_0)\bigr), \qquad \Phi''(z_0) = 2ie^{-2i\theta_0}\sin(\theta_0).
\]
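The local expansion of the phase under the change of coordinates (5.4) can be checked numerically. The following Python sketch is an illustration with hypothetical function names; it uses the principal branch of the logarithm, which is adequate near $z_0 = e^{i\theta_0}$ with $\theta_0 \in (-\pi,0)$, and evaluates the remainder $\Phi(z(\zeta)) - i\Phi_0 - \tfrac{i}{2}\zeta^2$.

```python
import cmath
import math

def phase(z, x):
    """Phi(z) = z - 1/z + 2*x*log(z) with x = n/t (principal logarithm)."""
    return z - 1 / z + 2 * x * cmath.log(z)

def local_data(theta0):
    """Return x = n/t, z0, Phi_0 and the claimed value of Phi''(z0)."""
    x = -math.cos(theta0)
    z0 = cmath.exp(1j * theta0)
    Phi0 = 2 * (math.sin(theta0) - theta0 * math.cos(theta0))
    second = 2j * cmath.exp(-2j * theta0) * math.sin(theta0)
    return x, z0, Phi0, second

def z_of_zeta(zeta, theta0):
    """Inverse change of coordinates in (5.4)."""
    z0 = cmath.exp(1j * theta0)
    return z0 + z0 * 1j * zeta / math.sqrt(-2 * math.sin(theta0))

def remainder(zeta, theta0):
    """Phi(z(zeta)) - i*Phi_0 - (i/2)*zeta^2, expected to be O(zeta^3)."""
    x, _, Phi0, _ = local_data(theta0)
    return phase(z_of_zeta(zeta, theta0), x) - 1j * Phi0 - 0.5j * zeta**2

if __name__ == "__main__":
    for zeta in (1e-1, 1e-2, 1e-3):
        print(zeta, abs(remainder(zeta, -1.0)))
```

Halving $\zeta$ reduces the remainder by roughly a factor of eight, consistent with the cubic error term.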
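The corresponding Riemann–Hilbert problem will be solved in Appendix A. To apply this result we need the behavior of our jump matrices near $z_0$, that is, the behavior of $T(z, z_0)$ near $z \to z_0$.

Lemma 5.1. Let $z_0 \in \mathbb{T}$, then
\[
T(z, z_0) = \left( -\overline{z_0}\,\frac{z - z_0}{z - \overline{z_0}} \right)^{i\nu} \tilde T(z, z_0), \tag{5.5}
\]
where $\nu = -\frac{1}{\pi}\log(|T(z_0)|)$ and the branch cut of the logarithm used to define $z^{i\nu} = e^{i\nu\log(z)}$ is chosen along the negative real axis. Here
\[
\tilde T(z, z_0) = \prod_{\zeta_k \in (-1,0)} |\zeta_k|\,\frac{z - \zeta_k^{-1}}{z - \zeta_k} \cdot \exp\left( \frac{1}{2\pi i} \int_{\overline{z_0}}^{z_0} \log\left( \frac{|T(s)|}{|T(z_0)|} \right) \frac{s+z}{s-z}\,\frac{ds}{s} \right)
\]
is Hölder continuous of any exponent less than 1 at $z = z_0$ and satisfies $\tilde T(z_0, z_0) \in \mathbb{T}$.

Proof. This follows since
\[
\exp\left( \frac{1}{2\pi i} \int_{\overline{z_0}}^{z_0} \log(|T(z_0)|)\,\frac{s+z}{s-z}\,\frac{ds}{s} \right) = \left( -\overline{z_0}\,\frac{z - z_0}{z - \overline{z_0}} \right)^{i\nu}.
\]
The property $\tilde T(z_0, z_0) \in \mathbb{T}$ follows after letting $z \to z_0$ in (4.10).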
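The behavior of the prefactor in (5.5) near $z_0$ can be illustrated numerically in the coordinate $\zeta$ of (5.4). The sketch below reads the base of the power as $-\overline{z_0}(z - z_0)/(z - \overline{z_0})$ (the complex conjugates are an assumption recovered from the proof) and checks that it behaves like $\zeta\,(-2\sin\theta_0)^{-3/2}$ for small $\zeta$; the helper names and the sample `theta0` are hypothetical.

```python
import cmath
import math

theta0 = -1.0                              # sample value in (-pi, 0)
z0 = cmath.exp(1j * theta0)
scale = -2 * math.sin(theta0)              # positive for theta0 in (-pi, 0)

def mobius_factor(z):
    """Base of the power in (5.5): -conj(z0) * (z - z0) / (z - conj(z0))."""
    return -z0.conjugate() * (z - z0) / (z - z0.conjugate())

def z_of_zeta(zeta):
    """Inverse change of coordinates from (5.4)."""
    return z0 + z0 * 1j * zeta / math.sqrt(scale)

for zeta in (1e-2, 1e-3, 1e-4):
    ratio = mobius_factor(z_of_zeta(zeta)) / zeta
    # ratio should approach scale**(-3/2) as zeta -> 0
    print(zeta, ratio, scale ** -1.5)
```

Raising this leading term to the power $i\nu$ gives the factor $\zeta^{i\nu} e^{-\frac{3}{2} i\nu\log(-2\sin(\theta_0))}$.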
Long-Time Asymptotics of the Toda Lattice for Decaying Initial Data Revisited
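Now if $z(\zeta)$ is defined as in (5.4) and $0 < \alpha < 1$, then there is an $L > 0$ such that
\[
\left| T(z(\zeta), z_0) - \zeta^{i\nu}\,\tilde T(z_0, z_0)\,e^{-\frac{3}{2} i\nu \log(-2\sin(\theta_0))} \right| \le L|\zeta|^{\alpha},
\]
where the branch cut of $\zeta^{i\nu}$ is tangent to the negative real axis. Clearly we also have
\[
|R(z(\zeta)) - R(z_0)| \le L|\zeta|^{\alpha}
\]
and thus the assumptions of Theorem A.1 are satisfied with
\[
r = R(z_0)\,\tilde T(z_0, z_0)^{-2}\,e^{3i\nu\log(-2\sin(\theta_0))}
\]
and the solution of (5.3) is given by
\[
M^C(z) = I - \frac{1}{(-2\sin(\theta_0)t)^{1/2}}\,\frac{M_0\, z_0}{z - z_0} + O\left(\frac{1}{t^{\alpha}}\right), \tag{5.6}
\]
\[
M_0 = \begin{pmatrix} 0 & -\beta \\ \overline{\beta} & 0 \end{pmatrix}, \qquad
\beta = \sqrt{\nu}\,e^{i(\pi/4 - \arg(R(z_0)) + \arg(\Gamma(i\nu)))}\,(-2\sin(\theta_0))^{-3i\nu}\,\tilde T(z_0, z_0)^{2}\,e^{-it\Phi_0}\,t^{-i\nu}, \tag{5.7}
\]
where $1/2 < \alpha < 1$ and $\cos(\theta_0) = -\lambda_0$. Note $|r| = |R(z_0)|$ and hence $\nu = -\frac{1}{2\pi}\log(1 - |R(z_0)|^2)$.

Now we are ready to show

Theorem 5.2. The solution $\check m(z)$ is given by
\[
\check m(z) = \begin{pmatrix} 1 & 1 \end{pmatrix} - \frac{1}{(-2\sin(\theta_0)t)^{1/2}}\,\bigl(m_0(z) + \overline{m}_0(z)\bigr) + O\left(\frac{1}{t^{\alpha}}\right), \tag{5.8}
\]
where
\[
m_0(z) = \begin{pmatrix} \overline{\beta}\,\dfrac{z}{z - z_0} & -\beta\,\dfrac{z_0}{z - z_0} \end{pmatrix}, \qquad
\overline{m}_0(z) = \overline{m_0(\overline{z})} = m_0(z^{-1}) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{5.9}
\]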
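Proof. Introduce $m(z)$ by
\[
m(z) = \begin{cases}
\check m(z)\,M^C(z)^{-1}, & |z - z_0| \le \varepsilon, \\
\check m(z)\,\tilde M^C(z)^{-1}, & |z^{-1} - z_0| \le \varepsilon, \\
\check m(z), & \text{else},
\end{cases}
\]
where
\[
\tilde M^C(z) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} M^C(z^{-1}) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
= I - \frac{1}{(-2\sin(\theta_0)t)^{1/2}}\,\frac{z\,\overline{M_0}}{z - \overline{z_0}} + O\left(\frac{1}{t^{\alpha}}\right).
\]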
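The Riemann–Hilbert problem for $m$ has jumps given by
\[
v(z) = \begin{cases}
M^C(z)^{-1}, & |z - z_0| = \varepsilon, \\
M^C(z)\,\hat v(z)\,M^C(z)^{-1}, & z \in \hat\Sigma,\ \frac{\varepsilon}{2} < |z - z_0| < \varepsilon, \\
I, & z \in \hat\Sigma,\ |z - z_0| < \frac{\varepsilon}{2}, \\
\tilde M^C(z)^{-1}, & |z^{-1} - z_0| = \varepsilon, \\
\tilde M^C(z)\,\hat v(z)\,\tilde M^C(z)^{-1}, & z \in \hat\Sigma,\ \frac{\varepsilon}{2} < |z^{-1} - z_0| < \varepsilon, \\
I, & z \in \hat\Sigma,\ |z^{-1} - z_0| < \frac{\varepsilon}{2}, \\
\hat v(z), & \text{else}.
\end{cases}
\]
The jumps are $I + O(t^{-1/2})$ on the loops $|z - z_0| = \varepsilon$, $|z^{-1} - z_0| = \varepsilon$ and even $I + O(t^{-\alpha})$ on the rest (in the $L^\infty$ norm, hence also in the $L^2$ one). In particular, as in Lemma A.3 we infer
\[
\| \mu - \begin{pmatrix} 1 & 1 \end{pmatrix} \|_2 = O(t^{-1/2}).
\]
Thus we have with $\Omega_\infty$ as in (B.8)
\[
\begin{aligned}
m(z) &= \begin{pmatrix} 1 & 1 \end{pmatrix} + \frac{1}{2\pi i}\int_{\Sigma} \mu(s)\,w(s)\,\Omega_\infty(s, z) \\
&= \begin{pmatrix} 1 & 1 \end{pmatrix} + \frac{1}{2\pi i}\int_{|s - z_0| = \varepsilon} \mu(s)\,(M^C(s)^{-1} - I)\,\Omega_\infty(s, z) \\
&\quad + \frac{1}{2\pi i}\int_{|s^{-1} - z_0| = \varepsilon} \mu(s)\,(\tilde M^C(s)^{-1} - I)\,\Omega_\infty(s, z) + O(t^{-\alpha}) \\
&= \begin{pmatrix} 1 & 1 \end{pmatrix} + \frac{1}{(-2\sin(\theta_0)t)^{1/2}} \begin{pmatrix} 1 & 1 \end{pmatrix} M_0\,\frac{1}{2\pi i}\int_{|s - z_0| = \varepsilon} \frac{z_0}{s - z_0}\,\Omega_\infty(s, z) \\
&\quad + \frac{1}{(-2\sin(\theta_0)t)^{1/2}} \begin{pmatrix} 1 & 1 \end{pmatrix} \overline{M_0}\,\frac{1}{2\pi i}\int_{|s^{-1} - z_0| = \varepsilon} \frac{s}{s - \overline{z_0}}\,\Omega_\infty(s, z) + O(t^{-\alpha}) \\
&= \begin{pmatrix} 1 & 1 \end{pmatrix} - \frac{1}{(-2\sin(\theta_0)t)^{1/2}}\,\bigl(m_0(z) + \overline{m}_0(z)\bigr) + O\left(\frac{1}{t^{\alpha}}\right),
\end{aligned}
\]
finishing the proof.

Hence, using (3.19) and (5.1),
\[
(\check m(z))_2 = \frac{1}{\tilde A^{2}}\bigl(1 + (T_1 + 2B)z + O(z^2)\bigr) \tag{5.10}
\]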
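and comparing with
\[
(\check m(z))_2 = 1 - \frac{2\operatorname{Re}(\beta)}{(-2\sin(\theta_0)t)^{1/2}} + O\left(\frac{1}{t^{\alpha}}\right) - \frac{2\operatorname{Re}(\overline{z_0}\beta)}{(-2\sin(\theta_0)t)^{1/2}}\,z + O(z^2), \tag{5.11}
\]
we obtain
\[
\tilde A^{2} = 1 + \frac{2\operatorname{Re}(\beta)}{(-2\sin(\theta_0)t)^{1/2}} + O\left(\frac{1}{t^{\alpha}}\right) \tag{5.12}
\]
and
\[
T_1 + 2B = -\frac{2\operatorname{Re}(\overline{z_0}\beta)}{(-2\sin(\theta_0)t)^{1/2}} + O\left(\frac{1}{t^{\alpha}}\right). \tag{5.13}
\]
In summary we have
\[
A = T_0\left( 1 + \frac{\operatorname{Re}(\beta)}{(-2\sin(\theta_0)t)^{1/2}} + O\left(\frac{1}{t^{\alpha}}\right) \right), \tag{5.14}
\]
\[
B = -\frac{1}{2}T_1 - \frac{\operatorname{Re}(\overline{z_0}\beta)}{(-2\sin(\theta_0)t)^{1/2}} + O\left(\frac{1}{t^{\alpha}}\right), \tag{5.15}
\]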
which proves Theorem 2.2 in the analytic case.

Remark 5.3. Note that, in contrast to Theorem B.6, Theorem 5.2 does not require uniform boundedness of the associated integral operators, but only some knowledge of the solution of the Riemann–Hilbert problem. However, it requires that the solution is of the form $I + o(1)$ and hence cannot be used in the soliton region.

6. Analytic Approximation

In this section, we want to present the necessary changes in the case where the reflection coefficient does not have an analytic extension. The idea is to use an analytic approximation and split the reflection coefficient into an analytic part plus a small rest. The analytic part will be moved to the complex plane while the rest remains on the unit circle. This needs to be done in such a way that the rest is of $O(t^{-l})$ and the growth of the analytic part can be controlled by the decay of the phase.

In the soliton region a straightforward splitting based on the Fourier series
\[
R(z) = \sum_{k=-\infty}^{\infty} \hat R(k)\,z^{k} \tag{6.1}
\]
will be sufficient. It is well known that our assumption (2.2) implies $k^{l}\hat R(-k) \in \ell^1(\mathbb{N})$ (this follows from the estimate [38, Eq. (10.83)]) and $R \in C^{l}(\mathbb{T})$.

Lemma 6.1. Suppose $\hat R(k) \in \ell^1(\mathbb{Z})$, $k^{l}\hat R(-k) \in \ell^1(\mathbb{N})$ and let $0 < \varepsilon < 1$, $\beta > 0$ be given. Then we can split the reflection coefficient according to $R(z) = R_{a,t}(z) + R_{r,t}(z)$ such that $R_{a,t}(z)$ is analytic in $0 < |z| < 1$ and
\[
|R_{a,t}(z)\,e^{-\beta t}| = O(t^{-l}), \quad 1 - \varepsilon \le |z| \le 1, \qquad |R_{r,t}(z)| = O(t^{-l}), \quad |z| = 1. \tag{6.2}
\]
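Proof. We choose $R_{a,t}(z) = \sum_{k=-K(t)}^{\infty} \hat R(k)\,z^{k}$ with $K(t) = \left\lfloor \frac{\beta_0}{-\log(1-\varepsilon)}\,t \right\rfloor$ for some positive $\beta_0 < \beta$. Then, for $1 - \varepsilon \le |z|$,
\[
|R_{a,t}(z)\,e^{-\beta t}| \le \sum_{k=-K(t)}^{\infty} |\hat R(k)|\,e^{-\beta t}(1-\varepsilon)^{k} \le \|\hat R\|_1\,e^{-\beta t}(1-\varepsilon)^{-K(t)} \le \|\hat R\|_1\,e^{-(\beta - \beta_0)t}.
\]
Similarly, for $|z| = 1$,
\[
|R_{r,t}(z)| \le \sum_{k=-\infty}^{-K(t)-1} |\hat R(k)| \le \mathrm{const} \sum_{k=K(t)+1}^{\infty} \frac{k^{l}}{K(t)^{l}}\,|\hat R(-k)| \le \frac{\mathrm{const}}{K(t)^{l}} \le \frac{\mathrm{const}}{t^{l}}.
\]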
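The splitting used in this proof can be tried out on a toy example. The sketch below is only an illustration under stated assumptions: the coefficients `r_hat` are hypothetical, rapidly decaying stand-ins for $\hat R(k)$ (truncated to finitely many terms), and the parameter values are arbitrary. It checks the two elementary bounds from the proof at one sample point each.

```python
import cmath
import math

def split_coefficients(r_hat, K):
    """Split {k: R_hat(k)} into the analytic part (k >= -K) and the rest (k < -K)."""
    analytic = {k: c for k, c in r_hat.items() if k >= -K}
    rest = {k: c for k, c in r_hat.items() if k < -K}
    return analytic, rest

def evaluate(coeffs, z):
    """Evaluate the finite Laurent sum  sum_k c_k z^k."""
    return sum(c * z**k for k, c in coeffs.items())

# Hypothetical coefficients standing in for R_hat(k); not taken from the paper.
N, l = 200, 2
r_hat = {k: 0.3 / (1 + abs(k)) ** (l + 2) for k in range(-N, N + 1)}

eps, beta, beta0, t = 0.5, 1.0, 0.5, 10.0
K = math.floor(beta0 * t / (-math.log(1 - eps)))   # cutoff K(t) from the proof
analytic, rest = split_coefficients(r_hat, K)

norm1 = sum(abs(c) for c in r_hat.values())
z_inner = (1 - eps) * cmath.exp(0.7j)              # sample point on |z| = 1 - eps
z_circle = cmath.exp(0.7j)                         # sample point on |z| = 1

lhs_analytic = abs(evaluate(analytic, z_inner)) * math.exp(-beta * t)
rhs_analytic = norm1 * math.exp(-(beta - beta0) * t)
lhs_rest = abs(evaluate(rest, z_circle))
rhs_rest = sum(abs(c) for c in rest.values())
print(K, lhs_analytic <= rhs_analytic, lhs_rest <= rhs_rest)
```

The cutoff grows linearly in $t$, so the tail sum `rhs_rest` inherits the decay of the coefficients, while the analytic part is allowed to grow only at the rate $e^{\beta_0 t}$ on the inner circle.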
To apply this lemma in the soliton region $z_0 \in (-1, 0)$ we choose
\[
\beta = \min_{|z| = 1 - \varepsilon} -\operatorname{Re}(\Phi(z)) > 0 \tag{6.3}
\]
and split $R(z) = R_{a,t}(z) + R_{r,t}(z)$ according to Lemma 6.1 to obtain
\[
\tilde b_\pm(z) = \tilde b_{a,t,\pm}(z)\,\tilde b_{r,t,\pm}(z) = \tilde b_{r,t,\pm}(z)\,\tilde b_{a,t,\pm}(z).
\]
Here $\tilde b_{a,t,\pm}(z)$, $\tilde b_{r,t,\pm}(z)$ denote the matrices obtained from $\tilde b_\pm(z)$ as defined in (4.16) by replacing $R(z)$ with $R_{a,t}(z)$, $R_{r,t}(z)$, respectively. Now we can move the analytic parts into the complex plane as in Sec. 4 while leaving the rest on $\mathbb{T}$. Hence, rather than (4.20), the jump now reads
\[
\hat v(z) = \begin{cases}
\tilde b_{a,t,+}(z), & z \in \Sigma_+, \\
\tilde b_{a,t,-}(z)^{-1}, & z \in \Sigma_-, \\
\tilde b_{r,t,-}(z)^{-1}\,\tilde b_{r,t,+}(z), & z \in \mathbb{T}.
\end{cases} \tag{6.4}
\]
By construction we have $\hat v(z) = I + O(t^{-l})$ on the whole contour and the rest follows as in Sec. 4.

In the other soliton region $z_0 \in (0, 1)$, we proceed similarly, with the only difference that the jump matrices $\tilde B_\pm(z)$ have at first sight more complicated off-diagonal entries. To remedy this we will rewrite them in terms of left rather than right scattering data. For this purpose, let us use the notation $R_r(z) \equiv R_+(z)$ for the right and $R_l(z) \equiv R_-(z)$ for the left reflection coefficient. Moreover, let $T_r(z, z_0) \equiv T(z, z_0)$ be the right and $T_l(z, z_0) \equiv T(z)/T(z, z_0)$ be the left partial transmission coefficient.
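With this notation we have
\[
\tilde v(z) = \begin{cases}
\tilde b_-(z)^{-1}\,\tilde b_+(z), & \lambda(z) > \lambda_0, \\
\tilde B_-(z)^{-1}\,\tilde B_+(z), & \lambda(z) < \lambda_0,
\end{cases} \tag{6.5}
\]
where
\[
\tilde b_-(z) = \begin{pmatrix} 1 & \dfrac{R_r(z^{-1})\,e^{-t\Phi(z)}}{T_r(z^{-1}, z_0)^{2}} \\ 0 & 1 \end{pmatrix}, \qquad
\tilde b_+(z) = \begin{pmatrix} 1 & 0 \\ \dfrac{R_r(z)\,e^{t\Phi(z)}}{T_r(z, z_0)^{2}} & 1 \end{pmatrix},
\]
and
\[
\tilde B_-(z) = \begin{pmatrix} 1 & 0 \\ -\dfrac{T_{r,-}(z, z_0)^{-2}}{|T(z)|^{2}}\,R_r(z)\,e^{t\Phi(z)} & 1 \end{pmatrix}, \qquad
\tilde B_+(z) = \begin{pmatrix} 1 & -\dfrac{T_{r,+}(z, z_0)^{2}}{|T(z)|^{2}}\,R_r(z^{-1})\,e^{-t\Phi(z)} \\ 0 & 1 \end{pmatrix}.
\]
Using (3.9) together with (4.18) we can further write
\[
\tilde B_-(z) = \begin{pmatrix} 1 & 0 \\ \dfrac{R_l(z^{-1})\,e^{-t\Phi(z)}}{T_l(z^{-1}, z_0)^{2}} & 1 \end{pmatrix}, \qquad
\tilde B_+(z) = \begin{pmatrix} 1 & \dfrac{R_l(z)\,e^{t\Phi(z)}}{T_l(z, z_0)^{2}} \\ 0 & 1 \end{pmatrix}.
\]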
Now we can proceed as before with $\tilde B_\pm(z)$ as with $\tilde b_\pm(z)$ by splitting $R_l(z)$ rather than $R_r(z)$.

In the similarity region we need to take the small vicinities of the stationary phase points into account. Since the phase is quadratic near these points, we cannot use it to dominate the exponential growth of the analytic part away from the unit circle. Hence we will take the phase as a new variable and use the Fourier transform with respect to this new variable. Since this change of coordinates is singular near the stationary phase points, there is a price we have to pay, namely, requiring additional smoothness for $R(z)$. We begin with

Lemma 6.2. Suppose $R(z) \in C^{5}(\mathbb{T})$. Then we can split $R(z)$ according to
\[
R(z) = R_0(z) + (z - z_0)(z - \overline{z_0})\,H(z), \qquad z \in \Sigma(z_0), \tag{6.6}
\]
where $R_0(z)$ is a real polynomial in $z$ such that $H(z)$ vanishes at $z_0$, $\overline{z_0}$ of order three and has a Fourier series
\[
H(z) = \sum_{k=-\infty}^{\infty} \hat H_k\,e^{k\omega_0\Phi(z)}, \qquad \omega_0 = \frac{\pi}{\pi\cos(\theta_0) + \Phi_0}, \tag{6.7}
\]
with $k\hat H_k$ summable. Here $\Phi_0 = \Phi(z_0)/i$.

Proof. By choosing a polynomial $R_0$, we can match the values of $R$ and its first four derivatives at $z_0$, $\overline{z_0}$. Hence $H(z) \in C^{4}(\mathbb{T})$ and vanishes together with its first three derivatives at $z_0$, $\overline{z_0}$. When restricted to $\Sigma(z_0)$ the phase $\Phi(z)/i$ gives a one-to-one coordinate transform $\Sigma(z_0) \to [i\Phi_0, i\Phi_0 + i\omega_0]$ and we can hence express $H(z)$ in this new coordinate. The coordinate transform locally looks like a square root near $z_0$ and $\overline{z_0}$; however, due to our assumption that $H$ vanishes there, $H$ is still $C^{2}$ in this new coordinate and the Fourier transform with respect to this new coordinate exists and has the required properties.

Moreover, as in Lemma 6.1 we obtain:

Lemma 6.3. Let $H(z)$ be as in the previous lemma. Then we can split $H(z)$ according to $H(z) = H_{a,t}(z) + H_{r,t}(z)$ such that $H_{a,t}(z)$ is analytic in the region $\operatorname{Re}(\Phi(z)) < 0$ and
\[
|H_{a,t}(z)\,e^{\Phi(z)t/2}| = O(1), \quad \operatorname{Re}(\Phi(z)) < 0,\ |z| \le 1, \qquad |H_{r,t}(z)| = O(t^{-1}), \quad |z| = 1. \tag{6.8}
\]

Proof. We choose $H_{a,t}(z) = \sum_{k=-K(t)}^{\infty} \hat H_k\,e^{k\omega_0\Phi(z)}$ with $K(t) = \lfloor t/(2\omega_0) \rfloor$. The rest follows as in Lemma 6.1.
By construction $R_{a,t}(z) = R_0(z) + (z - z_0)(z - \overline{z_0})\,H_{a,t}(z)$ will satisfy the required Lipschitz estimate in a vicinity of the stationary phase points (uniformly in $t$) and all jumps will be $I + O(t^{-1})$. Hence we can proceed as in Sec. 5.

Acknowledgments

We thank Ira Egorova, Katrin Grunert, Alice Mikikits-Leitner, and Johanna Michor for pointing out errors in a previous version of this article. Furthermore, we are indebted to Fritz Gesztesy and the anonymous referee for valuable suggestions improving the presentation of the material. This research was supported by the Austrian Science Fund (FWF) under Grant No. Y330.
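Appendix A. The Solution on a Small Cross

Introduce the cross $\Sigma = \Sigma_1 \cup \dots \cup \Sigma_4$ (see Fig. 7) by
\[
\begin{aligned}
\Sigma_1 &= \{u e^{-i\pi/4},\ u \in [0, \infty)\}, & \Sigma_2 &= \{u e^{i\pi/4},\ u \in [0, \infty)\}, \\
\Sigma_3 &= \{u e^{3i\pi/4},\ u \in [0, \infty)\}, & \Sigma_4 &= \{u e^{-3i\pi/4},\ u \in [0, \infty)\}.
\end{aligned} \tag{A.1}
\]
Orient $\Sigma$ such that the real part of $z$ increases in the positive direction. Denote by $\mathbb{D} = \{z : |z| < 1\}$ the open unit disc. Throughout this section, $z^{i\nu}$ will denote the function $e^{i\nu\log(z)}$, where the branch cut of the logarithm is chosen along the negative real axis $(-\infty, 0)$.

Introduce the following jump matrices ($v_j$ for $z \in \Sigma_j$)
\[
\begin{aligned}
v_1 &= \begin{pmatrix} 1 & -R_1(z)\,z^{2i\nu}e^{-t\Phi(z)} \\ 0 & 1 \end{pmatrix}, &
v_2 &= \begin{pmatrix} 1 & 0 \\ R_2(z)\,z^{-2i\nu}e^{t\Phi(z)} & 1 \end{pmatrix}, \\
v_3 &= \begin{pmatrix} 1 & -R_3(z)\,z^{2i\nu}e^{-t\Phi(z)} \\ 0 & 1 \end{pmatrix}, &
v_4 &= \begin{pmatrix} 1 & 0 \\ R_4(z)\,z^{-2i\nu}e^{t\Phi(z)} & 1 \end{pmatrix}.
\end{aligned} \tag{A.2}
\]
Now consider the RHP given by
\[
\begin{aligned}
m_+(z) &= m_-(z)\,v_j(z), & z \in \Sigma_j,\quad j = 1, 2, 3, 4, \\
m(z) &\to I, & z \to \infty.
\end{aligned} \tag{A.3}
\]
We have the next theorem, in which we follow the computations of [4, Secs. 3 and 4]. The method can be found in earlier literature, see for example [19]. One can also find arguments like this in [20, Sec. 5] or [6, (3.65)–(3.76)].

Fig. 7. Contours of a cross.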
We will allow some variation in all parameters as indicated in the next result.

Theorem A.1. Assume there is some $\rho_0 > 0$ such that $v_j(z) = I$ for $|z| > \rho_0$. Moreover, suppose that within $|z| \le \rho_0$ the following estimates hold:

(i) The phase satisfies $\Phi(0) = i\Phi_0 \in i\mathbb{R}$, $\Phi'(0) = 0$, $\Phi''(0) = i$ and
\[
\pm\operatorname{Re}\bigl(\Phi(z)\bigr) \ge \frac{1}{4}|z|^{2}, \qquad \begin{cases} + & \text{for } z \in \Sigma_1 \cup \Sigma_3, \\ - & \text{else}, \end{cases} \tag{A.4}
\]
\[
\left| \Phi(z) - \Phi(0) - \frac{iz^{2}}{2} \right| \le C|z|^{3}. \tag{A.5}
\]

(ii) There is some $r \in \mathbb{D}$ and constants $(\alpha, L) \in (0, 1] \times (0, \infty)$ such that $R_j$, $j = 1, \dots, 4$, satisfy Hölder conditions of the form
\[
\begin{aligned}
|R_1(z) - \overline{r}| &\le L|z|^{\alpha}, & |R_2(z) - r| &\le L|z|^{\alpha}, \\
\left| R_3(z) - \frac{\overline{r}}{1 - |r|^{2}} \right| &\le L|z|^{\alpha}, & \left| R_4(z) - \frac{r}{1 - |r|^{2}} \right| &\le L|z|^{\alpha}.
\end{aligned} \tag{A.6}
\]

Then the solution of the RHP (A.3) satisfies
\[
m(z) = I + \frac{1}{z}\,\frac{i}{t^{1/2}} \begin{pmatrix} 0 & -\beta \\ \overline{\beta} & 0 \end{pmatrix} + O(t^{-\frac{1+\alpha}{2}}), \tag{A.7}
\]
for $|z| > \rho_0$, where
\[
\beta = \sqrt{\nu}\,e^{i(\pi/4 - \arg(r) + \arg(\Gamma(i\nu)))}\,e^{-it\Phi_0}\,t^{-i\nu}, \qquad \nu = -\frac{1}{2\pi}\log(1 - |r|^{2}). \tag{A.8}
\]
Furthermore, if $R_j(z)$ and $\Phi(z)$ depend on some parameter, the error term is uniform with respect to this parameter as long as $r$ remains within a compact subset of $\mathbb{D}$ and the constants in the above estimates can be chosen independent of the parameters.

We remark that the solution of the RHP (A.3) is unique. This follows from the usual Liouville argument [3, Lemma 7.18] since $\det(v_j) = 1$.

Note that the actual value of $\rho_0$ is of no importance. In fact, if we choose $0 < \rho_1 < \rho_0$, then the solution $\tilde m$ of the problem with jump $\tilde v$, where $\tilde v$ is equal to $v$ for $|z| < \rho_1$ and $I$ otherwise, differs from $m$ only by an exponentially small error.

This already indicates that we should be able to replace $R_j(z)$ by their respective values at $z = 0$. To see this we start by rewriting our RHP as a singular integral equation. We will use the theory developed in Appendix B for the case of $2 \times 2$ matrix-valued functions with $m_0(z) = I$ and the usual Cauchy kernel (since we will not require symmetry in this section)
\[
\Omega(s, z) = I\,\frac{ds}{s - z}.
\]
Moreover, since our contour is unbounded, we will assume $w \in L^{1}(\Sigma) \cap L^{2}(\Sigma)$. All results from Appendix B still hold in this case with some straightforward modifications if one observes that $\mu - I \in L^{2}(\Sigma)$. Indeed, as in Theorem B.3, in the special case $b_+(z) = v_j(z)$ and $b_-(z) = I$ for $z \in \Sigma_j$, we obtain
\[
m(z) = I + \frac{1}{2\pi i}\int_{\Sigma} \mu(s)\,w(s)\,\frac{ds}{s - z}, \tag{A.9}
\]
where $\mu - I$ is the solution of the singular integral equation
\[
(I - C_w)(\mu - I) = C_w I, \tag{A.10}
\]
that is,
\[
\mu = I + (I - C_w)^{-1}C_w I, \qquad C_w f = C_-(wf). \tag{A.11}
\]
Here $C$ denotes the usual Cauchy operator and we set $w(z) = w_+(z)$ (since $w_-(z) = 0$).

As our first step we will get rid of some constants and rescale the entire problem by setting
\[
\hat m(z) = D(t)^{-1}\,m(zt^{-1/2})\,D(t), \tag{A.12}
\]
where
\[
D(t) = \begin{pmatrix} d(t)^{-1} & 0 \\ 0 & d(t) \end{pmatrix}, \qquad d(t) = e^{it\Phi_0/2}\,t^{i\nu/2}, \qquad d(t)^{-1} = \overline{d(t)}. \tag{A.13}
\]
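Then one easily checks that $\hat m(z)$ solves the RHP
\[
\begin{aligned}
\hat m_+(z) &= \hat m_-(z)\,\hat v_j(z), & z \in \Sigma_j,\quad j = 1, 2, 3, 4, \\
\hat m(z) &\to I, & z \to \infty,\quad z \notin \Sigma,
\end{aligned} \tag{A.14}
\]
where $\hat v_j(z) = D(t)^{-1}\,v_j(zt^{-1/2})\,D(t)$, $j = 1, \dots, 4$, explicitly
\[
\begin{aligned}
\hat v_1(z) &= \begin{pmatrix} 1 & -R_1(zt^{-1/2})\,z^{2i\nu}e^{-t(\Phi(zt^{-1/2}) - \Phi(0))} \\ 0 & 1 \end{pmatrix}, \\
\hat v_2(z) &= \begin{pmatrix} 1 & 0 \\ R_2(zt^{-1/2})\,z^{-2i\nu}e^{t(\Phi(zt^{-1/2}) - \Phi(0))} & 1 \end{pmatrix}, \\
\hat v_3(z) &= \begin{pmatrix} 1 & -R_3(zt^{-1/2})\,z^{2i\nu}e^{-t(\Phi(zt^{-1/2}) - \Phi(0))} \\ 0 & 1 \end{pmatrix}, \\
\hat v_4(z) &= \begin{pmatrix} 1 & 0 \\ R_4(zt^{-1/2})\,z^{-2i\nu}e^{t(\Phi(zt^{-1/2}) - \Phi(0))} & 1 \end{pmatrix}.
\end{aligned} \tag{A.15}
\]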
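The effect of the conjugation by $D(t)$ and the rescaling $z \mapsto zt^{-1/2}$ can be checked directly on the first jump matrix. The following sketch uses hypothetical model data (an exactly quadratic phase and a constant $R_1$, neither taken from the paper) and plain nested lists for the $2 \times 2$ matrices; it compares $D(t)^{-1} v_1(zt^{-1/2}) D(t)$ with the explicit entry in (A.15).

```python
import cmath

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def zpow(z, exponent):
    """z**exponent with the branch cut of log along the negative real axis."""
    return cmath.exp(exponent * cmath.log(z))

# Hypothetical model data: exactly quadratic phase and constant R_1.
phi0, nu, r1, t = 0.8, 0.15, 0.4 - 0.2j, 50.0

def phase(z):
    return 1j * phi0 + 0.5j * z * z

def v1(z):
    """Jump matrix v_1 from (A.2)."""
    return [[1, -r1 * zpow(z, 2j * nu) * cmath.exp(-t * phase(z))], [0, 1]]

def v1_hat(z):
    """Rescaled jump matrix from (A.15)."""
    w = z / t**0.5
    entry = -r1 * zpow(z, 2j * nu) * cmath.exp(-t * (phase(w) - phase(0)))
    return [[1, entry], [0, 1]]

d = cmath.exp(0.5j * t * phi0) * zpow(t, 0.5j * nu)   # d(t) from (A.13)
D = [[1 / d, 0], [0, d]]
D_inv = [[d, 0], [0, 1 / d]]

z = 1.3 * cmath.exp(-0.25j * cmath.pi)                # a point on Sigma_1
lhs = matmul(matmul(D_inv, v1(z / t**0.5)), D)
rhs = v1_hat(z)
print(abs(lhs[0][1] - rhs[0][1]))                     # should be at rounding level
```

The factor $d(t)^2 = e^{it\Phi_0}t^{i\nu}$ produced by the conjugation is exactly what removes $\Phi(0)$ from the exponent and the power $t^{-i\nu}$ coming from $(zt^{-1/2})^{2i\nu}$.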
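Our next aim is to show that the solution $\hat m(z)$ of the rescaled problem is close to the solution $\hat m^c(z)$ of the RHP
\[
\begin{aligned}
\hat m^c_+(z) &= \hat m^c_-(z)\,\hat v^c_j(z), & z \in \Sigma_j,\quad j = 1, 2, 3, 4, \\
\hat m^c(z) &\to I, & z \to \infty,\quad z \notin \Sigma,
\end{aligned} \tag{A.16}
\]
associated with the following jump matrices
\[
\begin{aligned}
\hat v^c_1(z) &= \begin{pmatrix} 1 & -\overline{r}\,z^{2i\nu}e^{-iz^{2}/2} \\ 0 & 1 \end{pmatrix}, &
\hat v^c_2(z) &= \begin{pmatrix} 1 & 0 \\ r\,z^{-2i\nu}e^{iz^{2}/2} & 1 \end{pmatrix}, \\
\hat v^c_3(z) &= \begin{pmatrix} 1 & -\dfrac{\overline{r}}{1 - |r|^{2}}\,z^{2i\nu}e^{-iz^{2}/2} \\ 0 & 1 \end{pmatrix}, &
\hat v^c_4(z) &= \begin{pmatrix} 1 & 0 \\ \dfrac{r}{1 - |r|^{2}}\,z^{-2i\nu}e^{iz^{2}/2} & 1 \end{pmatrix}.
\end{aligned} \tag{A.17}
\]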
The difference between these jump matrices can be estimated as follows.

Lemma A.2. The matrices $\hat w^c$ and $\hat w$ are close in the sense that
\[
\hat w_j(z) = \hat w^c_j(z) + O(t^{-\alpha/2}e^{-|z|^{2}/8}), \qquad z \in \Sigma_j,\quad j = 1, \dots, 4. \tag{A.18}
\]
Furthermore, the error term is uniform with respect to parameters as stated in Theorem A.1.

Proof. We only give the proof for $z \in \Sigma_1$, the other cases being similar. There is only one nonzero matrix entry in $\hat w_j(z) - \hat w^c_j(z)$ given by
\[
W = \begin{cases}
-R_1(zt^{-1/2})\,z^{2i\nu}e^{-t(\Phi(zt^{-1/2}) - \Phi(0))} + \overline{r}\,z^{2i\nu}e^{-iz^{2}/2}, & |z| \le \rho_0 t^{1/2}, \\
\overline{r}\,z^{2i\nu}e^{-iz^{2}/2}, & |z| > \rho_0 t^{1/2}.
\end{cases}
\]
A straightforward estimate for $|z| \le \rho_0 t^{1/2}$ shows
\[
\begin{aligned}
|W| &= e^{\nu\pi/4}\,|R_1(zt^{-1/2})\,e^{-t\hat\Phi(zt^{-1/2})} - \overline{r}|\,e^{-|z|^{2}/2} \\
&\le e^{\nu\pi/4}\,|R_1(zt^{-1/2}) - \overline{r}|\,e^{\operatorname{Re}(-t\hat\Phi(zt^{-1/2})) - |z|^{2}/2} + e^{\nu\pi/4}\,|e^{-t\hat\Phi(zt^{-1/2})} - 1|\,e^{-|z|^{2}/2} \\
&\le e^{\nu\pi/4}\,|R_1(zt^{-1/2}) - \overline{r}|\,e^{-|z|^{2}/4} + e^{\nu\pi/4}\,t\,|\hat\Phi(zt^{-1/2})|\,e^{-|z|^{2}/4},
\end{aligned}
\]
where $\hat\Phi(z) = \Phi(z) - \Phi(0) - \frac{i}{2}z^{2} = \frac{\Phi'''(0)}{6}z^{3} + \dots$. Here we have used $\frac{i}{2}z^{2} = \frac{1}{2}|z|^{2}$ for $z \in \Sigma_1$ and $\operatorname{Re}(-t\hat\Phi(zt^{-1/2})) \le |z|^{2}/4$ by (A.4). Furthermore, by (A.5) and (A.6),
\[
|W| \le e^{\nu\pi/4}\,L\,t^{-\alpha/2}|z|^{\alpha}e^{-|z|^{2}/4} + e^{\nu\pi/4}\,C\,t^{-1/2}|z|^{3}e^{-|z|^{2}/4}
\]
for $|z| \le \rho_0 t^{1/2}$. For $|z| > \rho_0 t^{1/2}$ we have
\[
|W| \le e^{\nu\pi/4}\,e^{-|z|^{2}/2} \le e^{\nu\pi/4}\,e^{-\rho_0^{2}t/4}\,e^{-|z|^{2}/4},
\]
which finishes the proof.

The next lemma allows us to replace $\hat m(z)$ by $\hat m^c(z)$.
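Lemma A.3. Consider the RHP
\[
\begin{aligned}
m_+(z) &= m_-(z)\,v(z), & z \in \Sigma, \\
m(z) &\to I, & z \to \infty,\quad z \notin \Sigma.
\end{aligned} \tag{A.19}
\]
Assume that $w \in L^{2}(\Sigma) \cap L^{\infty}(\Sigma)$. Then
\[
\|\mu - I\|_2 \le \frac{c\,\|w\|_2}{1 - c\,\|w\|_\infty} \tag{A.20}
\]
provided $c\,\|w\|_\infty < 1$, where $c$ is the norm of the Cauchy operator on $L^{2}(\Sigma)$.

Proof. This follows since $\tilde\mu = \mu - I \in L^{2}(\Sigma)$ satisfies $(I - C_w)\tilde\mu = C_w I$.

Lemma A.4. The solution $\hat m(z)$ has a convergent asymptotic expansion
\[
\hat m(z) = I + \frac{1}{z}\hat M(t) + O\left(\frac{1}{z^{2}}\right) \tag{A.21}
\]
for $|z| > \rho_0 t^{1/2}$ with the error term uniform in $t$. Moreover,
\[
\hat M(t) = \hat M^c + O(t^{-\alpha/2}). \tag{A.22}
\]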
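Proof. Consider $\hat m^d(z) = \hat m(z)\,\hat m^c(z)^{-1}$, whose jump matrix is given by
\[
\hat v^d(z) = \hat m^c_-(z)\,\hat v(z)\,\hat v^c(z)^{-1}\,\hat m^c_-(z)^{-1} = I + \hat m^c_-(z)\bigl(\hat w(z) - \hat w^c(z)\bigr)\hat m^c_-(z)^{-1}.
\]
By Lemma A.2, we have that $\hat w - \hat w^c$ is decaying of order $t^{-\alpha/2}$ in the norms of $L^{1}$ and $L^{\infty}$ and thus the same is true for $\hat w^d = \hat v^d - I$. Hence by the previous lemma
\[
\|\hat\mu^d - I\|_2 = O(t^{-\alpha/2}).
\]
Furthermore, by $\hat\mu^d = \hat m^d_- = \hat m_-(\hat m^c_-)^{-1} = \hat\mu(\hat\mu^c)^{-1}$ we infer
\[
\|\hat\mu - \hat\mu^c\|_2 = O(t^{-\alpha/2})
\]
since $\hat\mu^c$ is bounded. Now
\[
\hat m(z) = I - \frac{1}{2\pi i}\,\frac{1}{z}\int_{\Sigma} \hat\mu(s)\,\hat w(s)\,ds + \frac{1}{2\pi i}\,\frac{1}{z}\int_{\Sigma} s\,\hat\mu(s)\,\hat w(s)\,\frac{ds}{s - z}
\]
shows (recall that $\hat w$ is supported inside $|z| \le \rho_0 t^{1/2}$)
\[
\hat m(z) = I + \frac{1}{z}\hat M(t) + O\left( \frac{\|\hat\mu(s)\|_2\,\|s\hat w(s)\|_2}{z^{2}} \right),
\]
where
\[
\hat M(t) = -\frac{1}{2\pi i}\int_{\Sigma} \hat\mu(s)\,\hat w(s)\,ds.
\]
Now the rest follows from
\[
\hat M(t) = \hat M^c - \frac{1}{2\pi i}\int_{\Sigma} \bigl(\hat\mu(s)\,\hat w(s) - \hat\mu^c(s)\,\hat w^c(s)\bigr)\,ds
\]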
using $\|\hat\mu\hat w - \hat\mu^c\hat w^c\|_1 \le \|\hat w - \hat w^c\|_1 + \|\hat\mu - I\|_2\,\|\hat w - \hat w^c\|_2 + \|\hat\mu - \hat\mu^c\|_2\,\|\hat w^c\|_2$.

Finally, it remains to solve (A.16) and to show:

Theorem A.5. The solution of the RHP (A.16) is of the form
\[
\hat m^c(z) = I + \frac{1}{z}\hat M^c + O\left(\frac{1}{z^{2}}\right), \tag{A.23}
\]
where
\[
\hat M^c = i\begin{pmatrix} 0 & -\beta \\ \overline{\beta} & 0 \end{pmatrix}, \qquad \beta = \sqrt{\nu}\,e^{i(\pi/4 - \arg(r) + \arg(\Gamma(i\nu)))}. \tag{A.24}
\]
The error term is uniform with respect to $r$ in compact subsets of $\mathbb{D}$. Moreover, the solution is bounded (again uniformly with respect to $r$).

Given this result, Theorem A.1 follows from Lemma A.4 via
\[
\begin{aligned}
m(z) &= D(t)\,\hat m(zt^{1/2})\,D(t)^{-1} = I + \frac{1}{t^{1/2}z}\,D(t)\,\hat M(t)\,D(t)^{-1} + O(z^{-2}t^{-1}) \\
&= I + \frac{1}{t^{1/2}z}\,D(t)\,\hat M^c\,D(t)^{-1} + O(t^{-(1+\alpha)/2})
\end{aligned} \tag{A.25}
\]
for $|z| > \rho_0$, since $D(t)$ is bounded. The proof of this result will be given in the remainder of this section.

Fig. 8. Deforming back the cross.

In order to solve (A.16) we begin with a deformation which moves the jump to $\mathbb{R}$ as follows. Denote the region enclosed by $\mathbb{R}$ and $\Sigma_j$ as $\Omega_j$ (cf. Fig. 8) and define
\[
\tilde m^c(z) = \hat m^c(z)\begin{cases} D_0(z)\,D_j, & z \in \Omega_j,\quad j = 1, \dots, 4, \\ D_0(z), & \text{else}, \end{cases} \tag{A.26}
\]
where
\[
D_0(z) = \begin{pmatrix} z^{i\nu}e^{-iz^{2}/4} & 0 \\ 0 & z^{-i\nu}e^{iz^{2}/4} \end{pmatrix},
\]
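and
\[
D_1 = \begin{pmatrix} 1 & \overline{r} \\ 0 & 1 \end{pmatrix}, \quad
D_2 = \begin{pmatrix} 1 & 0 \\ r & 1 \end{pmatrix}, \quad
D_3 = \begin{pmatrix} 1 & -\dfrac{\overline{r}}{1 - |r|^{2}} \\ 0 & 1 \end{pmatrix}, \quad
D_4 = \begin{pmatrix} 1 & 0 \\ -\dfrac{r}{1 - |r|^{2}} & 1 \end{pmatrix}.
\]

Lemma A.6. The function $\tilde m^c(z)$ defined in (A.26) satisfies the RHP
\[
\begin{aligned}
\tilde m^c_+(z) &= \tilde m^c_-(z)\begin{pmatrix} 1 - |r|^{2} & -\overline{r} \\ r & 1 \end{pmatrix}, & z \in \mathbb{R}, \\
\tilde m^c(z) &= \left( I + \frac{1}{z}\hat M^c + \dots \right) D_0(z), & z \to \infty,\quad \frac{\pi}{4} < \arg(z) < \frac{3\pi}{4}.
\end{aligned} \tag{A.27}
\]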
Proof. First, one checks that $\tilde m^c_+(z) = \tilde m^c_-(z)\,D_0(z)^{-1}\,\hat v^c_1(z)\,D_0(z)\,D_1 = \tilde m^c_-(z)$, $z \in \Sigma_1$, and similarly for $z \in \Sigma_2, \Sigma_3, \Sigma_4$. To compute the jump along $\mathbb{R}$ observe that, by our choice of branch cut for $z^{i\nu}$, $D_0(z)$ has a jump along the negative real axis given by
\[
D_{0,\pm}(z) = \begin{pmatrix} e^{(\log|z| \pm i\pi)i\nu}e^{-iz^{2}/4} & 0 \\ 0 & e^{-(\log|z| \pm i\pi)i\nu}e^{iz^{2}/4} \end{pmatrix}, \qquad z < 0.
\]
Hence the jump along $\mathbb{R}$ is given by
\[
D_1^{-1}D_2, \quad z > 0, \qquad \text{and} \qquad D_4^{-1}\,D_{0,-}(z)^{-1}\,D_{0,+}(z)\,D_3, \quad z < 0,
\]
and (A.27) follows after recalling $e^{-2\pi\nu} = 1 - |r|^{2}$.
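The two matrix products appearing in this proof can be verified numerically. The sketch below is an illustration with an arbitrary sample value of $r$ inside the unit disc and an arbitrary point $x < 0$; the helper names (`matmul`, `inv2`, `D0_boundary`) are hypothetical and only the standard library is used.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

r = 0.3 + 0.4j                         # sample value with |r| < 1
q = 1 - abs(r) ** 2
nu = -math.log(q) / (2 * math.pi)      # so that exp(-2*pi*nu) = 1 - |r|^2

D1 = [[1, r.conjugate()], [0, 1]]
D2 = [[1, 0], [r, 1]]
D3 = [[1, -r.conjugate() / q], [0, 1]]
D4 = [[1, 0], [-r / q, 1]]

def D0_boundary(x, sign):
    """Boundary value D_{0,+-}(x) for x < 0, using log(x) = log|x| +- i*pi."""
    log_x = math.log(abs(x)) + sign * 1j * math.pi
    a = cmath.exp(1j * nu * log_x) * cmath.exp(-0.25j * x * x)
    return [[a, 0], [0, 1 / a]]

target = [[q, -r.conjugate()], [r, 1]]   # jump matrix in (A.27)

jump_pos = matmul(inv2(D1), D2)
x = -0.7
jump_neg = matmul(
    matmul(matmul(inv2(D4), inv2(D0_boundary(x, -1))), D0_boundary(x, +1)), D3)

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

print(max_diff(jump_pos, target), max_diff(jump_neg, target))
```

Both differences should be at the level of floating-point rounding, reflecting that $D_{0,-}^{-1}D_{0,+} = \operatorname{diag}(e^{-2\pi\nu}, e^{2\pi\nu})$ on the negative real axis.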
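Now, we can follow [4, (4.17)–(4.51)] to construct an approximate solution.

The idea is as follows: since the jump matrix for (A.27) is constant, the derivative $\frac{d}{dz}\tilde m^c(z)$ has the same jump and hence is given by $n(z)\,\tilde m^c(z)$, where the entire matrix $n(z)$ can be determined from the behavior as $z \to \infty$. Since this will just serve as a motivation for our ansatz, we will not worry about justifying any steps.

For $z$ in the sector $\frac{\pi}{4} < \arg(z) < \frac{3\pi}{4}$ (enclosed by $\Sigma_2$ and $\Sigma_3$) we have $\tilde m^c(z) = \hat m^c(z)\,D_0(z)$ and hence
\[
\begin{aligned}
\left( \frac{d}{dz}\tilde m^c(z) + \frac{iz}{2}\sigma_3\,\tilde m^c(z) \right)\tilde m^c(z)^{-1}
&= \left( i\left( \frac{\nu}{z} - \frac{z}{2} \right)\hat m^c(z)\,\sigma_3 + \frac{d}{dz}\hat m^c(z) + i\frac{z}{2}\sigma_3\,\hat m^c(z) \right)\hat m^c(z)^{-1} \\
&= \frac{i}{2}[\sigma_3, \hat M^c] + O\left(\frac{1}{z}\right), \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.
\end{aligned}
\]
Since the left-hand side has no jump, it is entire and hence by Liouville's theorem a constant given by the right-hand side. In other words,
\[
\frac{d}{dz}\tilde m^c(z) + \frac{iz}{2}\sigma_3\,\tilde m^c(z) = \beta\,\tilde m^c(z), \qquad \beta = \begin{pmatrix} 0 & \beta_{12} \\ \beta_{21} & 0 \end{pmatrix} = \frac{i}{2}[\sigma_3, \hat M^c]. \tag{A.28}
\]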
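This differential equation can be solved in terms of parabolic cylinder functions, which then gives the solution of (A.27).

Lemma A.7. The RHP (A.27) has a unique solution, and the term $\hat M^c$ is given by
\[
\hat M^c = i\begin{pmatrix} 0 & -\beta_{12} \\ \beta_{21} & 0 \end{pmatrix}, \qquad \beta_{12} = \overline{\beta_{21}} = \sqrt{\nu}\,e^{i(\pi/4 - \arg(r) + \arg(\Gamma(i\nu)))}. \tag{A.29}
\]

Proof. Uniqueness follows by the standard Liouville argument since the determinant of the jump matrix is equal to 1. To find the solution we use the ansatz
\[
\tilde m^c(z) = \begin{pmatrix} \psi_{11}(z) & \psi_{12}(z) \\ \psi_{21}(z) & \psi_{22}(z) \end{pmatrix},
\]
where the functions $\psi_{jk}(z)$ satisfy
\[
\begin{aligned}
\psi_{11}''(z) &= -\left( \frac{i}{2} + \frac{1}{4}z^{2} - \beta_{12}\beta_{21} \right)\psi_{11}(z), &
\psi_{12}(z) &= \frac{1}{\beta_{21}}\left( \frac{d}{dz} - \frac{iz}{2} \right)\psi_{22}(z), \\
\psi_{21}(z) &= \frac{1}{\beta_{12}}\left( \frac{d}{dz} + \frac{iz}{2} \right)\psi_{11}(z), &
\psi_{22}''(z) &= \left( \frac{i}{2} - \frac{1}{4}z^{2} + \beta_{12}\beta_{21} \right)\psi_{22}(z).
\end{aligned}
\]
That is, $\psi_{11}(e^{3\pi i/4}\zeta)$ satisfies the parabolic cylinder equation
\[
D''(\zeta) + \left( a + \frac{1}{2} - \frac{1}{4}\zeta^{2} \right)D(\zeta) = 0
\]
with $a = i\beta_{12}\beta_{21}$ and $\psi_{22}(e^{i\pi/4}\zeta)$ satisfies the parabolic cylinder equation with $a = -i\beta_{12}\beta_{21}$. Let $D_a$ be the entire parabolic cylinder function of [42, §16.5] and set
\[
\psi_{11}(z) = \begin{cases} e^{-3\pi\nu/4}\,D_{i\nu}(-e^{i\pi/4}z), & \operatorname{Im}(z) > 0, \\ e^{\pi\nu/4}\,D_{i\nu}(e^{i\pi/4}z), & \operatorname{Im}(z) < 0, \end{cases}
\qquad
\psi_{22}(z) = \begin{cases} e^{\pi\nu/4}\,D_{-i\nu}(-ie^{i\pi/4}z), & \operatorname{Im}(z) > 0, \\ e^{-3\pi\nu/4}\,D_{-i\nu}(ie^{i\pi/4}z), & \operatorname{Im}(z) < 0. \end{cases}
\]
Using the asymptotic behavior
\[
D_a(z) = z^{a}e^{-z^{2}/4}\left( 1 - \frac{a(a-1)}{2z^{2}} + O(z^{-4}) \right), \qquad z \to \infty,\quad |\arg(z)| \le 3\pi/4,
\]
shows that the choice $\beta_{12}\beta_{21} = \nu$ ensures the correct asymptotics
\[
\begin{aligned}
\psi_{11}(z) &= z^{i\nu}e^{-iz^{2}/4}\bigl(1 + O(z^{-2})\bigr), &
\psi_{12}(z) &= -i\beta_{12}\,z^{-i\nu}e^{iz^{2}/4}\bigl(z^{-1} + O(z^{-3})\bigr), \\
\psi_{21}(z) &= i\beta_{21}\,z^{i\nu}e^{-iz^{2}/4}\bigl(z^{-1} + O(z^{-3})\bigr), &
\psi_{22}(z) &= z^{-i\nu}e^{iz^{2}/4}\bigl(1 + O(z^{-2})\bigr),
\end{aligned}
\]
as $z \to \infty$ inside the half plane $\operatorname{Im}(z) \ge 0$. In particular,
\[
\tilde m^c(z) = \left( I + \frac{1}{z}\hat M^c + O(z^{-2}) \right)D_0(z) \qquad \text{with} \qquad \hat M^c = i\begin{pmatrix} 0 & -\beta_{12} \\ \beta_{21} & 0 \end{pmatrix}.
\]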
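It remains to check that we have the correct jump. Since by construction both limits $\tilde m^c_+(z)$ and $\tilde m^c_-(z)$ satisfy the same differential equation (A.28), there is a constant matrix $v$ such that $\tilde m^c_+(z) = \tilde m^c_-(z)\,v$. Moreover, since the coefficient matrix of the linear differential equation (A.28) has trace 0, the determinant of $\tilde m^c_\pm(z)$ is constant and hence $\det(\tilde m^c_\pm(z)) = 1$ by our asymptotics. Moreover, a straightforward calculation shows
\[
v = \tilde m^c_-(0)^{-1}\,\tilde m^c_+(0) = \begin{pmatrix} e^{-2\pi\nu} & -\dfrac{\sqrt{2\pi}\,e^{-i\pi/4}e^{-\pi\nu/2}}{\sqrt{\nu}\,\Gamma(i\nu)}\,\gamma^{-1} \\ \dfrac{\sqrt{2\pi}\,e^{i\pi/4}e^{-\pi\nu/2}}{\sqrt{\nu}\,\Gamma(-i\nu)}\,\gamma & 1 \end{pmatrix},
\]
where $\gamma = \frac{\sqrt{\nu}}{\beta_{12}} = \frac{\beta_{21}}{\sqrt{\nu}}$. Here we have used
\[
D_a(0) = \frac{2^{a/2}\sqrt{\pi}}{\Gamma((1-a)/2)}, \qquad D_a'(0) = -\frac{2^{(1+a)/2}\sqrt{\pi}}{\Gamma(-a/2)}
\]
plus the duplication formula $\Gamma(z)\,\Gamma(z + \frac{1}{2}) = 2^{1-2z}\sqrt{\pi}\,\Gamma(2z)$ for the Gamma function. Hence, if we choose
\[
\gamma = \frac{\sqrt{\nu}\,\Gamma(-i\nu)}{\sqrt{2\pi}\,e^{i\pi/4}e^{-\pi\nu/2}}\,r,
\]
we have
\[
v = \begin{pmatrix} 1 - |r|^{2} & -\overline{r} \\ r & 1 \end{pmatrix}
\]
since $|\gamma|^{2} = 1$. To see this use $|\Gamma(-i\nu)|^{2} = \frac{\Gamma(1 - i\nu)\,\Gamma(i\nu)}{-i\nu} = \frac{\pi}{\nu\sinh(\pi\nu)}$, which follows from Euler's reflection formula $\Gamma(1 - z)\,\Gamma(z) = \frac{\pi}{\sin(\pi z)}$ for the Gamma function. In particular,
\[
\beta_{12} = \overline{\beta_{21}} = \sqrt{\nu}\,e^{i(\pi/4 - \arg(r) + \arg(\Gamma(i\nu)))},
\]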
which finishes the proof.

Remark A.8. An inspection of the proof shows that $\hat m^c$ is given by the solution of a differential equation depending analytically on $\nu$. Hence, $\hat m^c$ depends analytically on $\nu = -\frac{1}{2\pi}\log(1 - |r|^{2})$. This implies local Lipschitz dependence on $r$ as long as $r \in \mathbb{D}$.

Appendix B. Singular Integral Equations

In this section we show how to transform a meromorphic vector Riemann–Hilbert problem with simple poles at $\zeta$, $\zeta^{-1}$,
\[
\begin{aligned}
& m_+(z) = m_-(z)\,v(z), \qquad z \in \Sigma, \\
& \operatorname{Res}_{\zeta} m(z) = \lim_{z \to \zeta} m(z)\begin{pmatrix} 0 & 0 \\ -\zeta\gamma & 0 \end{pmatrix}, \qquad
\operatorname{Res}_{\zeta^{-1}} m(z) = \lim_{z \to \zeta^{-1}} m(z)\begin{pmatrix} 0 & \zeta^{-1}\gamma \\ 0 & 0 \end{pmatrix}, \\
& m(z^{-1}) = m(z)\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \\
& m(0) = \begin{pmatrix} 1 & m_2 \end{pmatrix},
\end{aligned} \tag{B.1}
\]
where $\zeta \in (-1, 0) \cup (0, 1)$ and $\gamma \ge 0$, into a singular integral equation. Since we require the symmetry condition for our Riemann–Hilbert problems we need to adapt the usual Cauchy kernel to preserve this symmetry. Moreover, we keep the single soliton as an inhomogeneous term which will play the role of the leading asymptotics in our applications.

Hypothesis B.1. Suppose the jump data $(\Sigma, v)$ satisfy the following assumptions:

(i) $\Sigma$ consists of a finite number of smooth oriented finite curves in $\mathbb{C}$ which intersect at most finitely many times with all intersections being transversal.

(ii) $\Sigma$ does not contain $0$, $\zeta^{\pm 1}$.

(iii) $\Sigma$ is invariant under $z \mapsto z^{-1}$ and is oriented such that under the mapping $z \mapsto z^{-1}$ sequences converging from the positive side to $\Sigma$ are mapped to sequences converging to the negative side.

(iv) The jump matrix $v$ is invertible and can be factorized according to $v = b_-^{-1}b_+ = (I - w_-)^{-1}(I + w_+)$, where $w_\pm = \pm(b_\pm - I)$ are continuous and satisfy
\[
w_\pm(z^{-1}) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} w_\mp(z) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad z \in \Sigma. \tag{B.2}
\]

The classical Cauchy transform of a function $f : \Sigma \to \mathbb{C}$ which is square integrable is the analytic function $Cf : \mathbb{C}\setminus\Sigma \to \mathbb{C}$ given by
\[
(Cf)(z) = \frac{1}{2\pi i}\int_{\Sigma} \frac{f(s)}{s - z}\,ds, \qquad z \in \mathbb{C}\setminus\Sigma. \tag{B.3}
\]
Denote the non-tangential boundary values from both sides (taken possibly in the $L^{2}$-sense, see, e.g., [3, Eq. (7.2)]) by $C_+f$, respectively $C_-f$. Then it is well known that $C_+$ and $C_-$ are bounded operators $L^{2}(\Sigma) \to L^{2}(\Sigma)$, which satisfy $C_+ - C_- = I$ and $C_+C_- = 0$ (see, e.g., [1]). Moreover, one has the Plemelj–Sokhotsky formula [31]
\[
C_\pm = \frac{1}{2}(iH \pm I),
\]
where
\[
(Hf)(t) = \frac{1}{\pi}\,\operatorname{p.v.}\!\!\int_{\Sigma} \frac{f(s)}{t - s}\,ds, \qquad t \in \Sigma, \tag{B.4}
\]
is the Hilbert transform and $\operatorname{p.v.}\!\int$ denotes the principal value integral.
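In order to respect the symmetry condition we will restrict our attention to the set $L^{2}_s(\Sigma)$ of square integrable functions $f : \Sigma \to \mathbb{C}^{2}$ such that
\[
f(z^{-1}) = f(z)\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{B.5}
\]
Clearly this will only be possible if we require our jump data to be symmetric as well (i.e., Hypothesis B.1 holds).

Next we introduce the Cauchy operator
\[
(Cf)(z) = \frac{1}{2\pi i}\int_{\Sigma} f(s)\,\Omega_\zeta(s, z) \tag{B.6}
\]
acting on vector-valued functions $f : \Sigma \to \mathbb{C}^{2}$. Here the Cauchy kernel is given by
\[
\Omega_\zeta(s, z) = \begin{pmatrix} \dfrac{z - \zeta^{-1}}{s - \zeta^{-1}}\,\dfrac{1}{s - z} & 0 \\ 0 & \dfrac{z - \zeta}{s - \zeta}\,\dfrac{1}{s - z} \end{pmatrix} ds
= \begin{pmatrix} \dfrac{1}{s - z} - \dfrac{1}{s - \zeta^{-1}} & 0 \\ 0 & \dfrac{1}{s - z} - \dfrac{1}{s - \zeta} \end{pmatrix} ds, \tag{B.7}
\]
for some fixed $\zeta \notin \Sigma$. In the case $\zeta = \infty$ we set
\[
\Omega_\infty(s, z) = \begin{pmatrix} \dfrac{1}{s - z} - \dfrac{1}{s} & 0 \\ 0 & \dfrac{1}{s - z} \end{pmatrix} ds \tag{B.8}
\]
and one easily checks the symmetry property:
\[
\Omega_\zeta(1/s, 1/z) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \Omega_\zeta(s, z) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{B.9}
\]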
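Both the equality of the two forms in (B.7) and the symmetry (B.9) can be confirmed numerically at sample points. The sketch below works with the scalar diagonal entries of the kernel (the coefficients of $ds$); the function names and the sample values of $s$, $z$, $\zeta$ are hypothetical. For the symmetry check, the substitution $s \mapsto 1/s$ contributes the factor $d(1/s) = -ds/s^{2}$, and conjugation by the flip matrix swaps the two diagonal entries.

```python
def omega_product_form(s, z, zeta):
    """Diagonal entries of Omega_zeta(s, z)/ds in the first form of (B.7)."""
    return ((z - 1 / zeta) / ((s - 1 / zeta) * (s - z)),
            (z - zeta) / ((s - zeta) * (s - z)))

def omega_difference_form(s, z, zeta):
    """Diagonal entries of Omega_zeta(s, z)/ds in the second form of (B.7)."""
    return (1 / (s - z) - 1 / (s - 1 / zeta),
            1 / (s - z) - 1 / (s - zeta))

def omega_reflected(s, z, zeta):
    """Entries of Omega_zeta(1/s, 1/z) per ds, including d(1/s) = -ds/s**2."""
    a, b = omega_difference_form(1 / s, 1 / z, zeta)
    return (-a / s**2, -b / s**2)

s, z, zeta = 0.6 + 0.8j, 0.3 - 0.2j, -0.5       # arbitrary sample points

first = omega_product_form(s, z, zeta)
second = omega_difference_form(s, z, zeta)
reflected = omega_reflected(s, z, zeta)

# (B.7): both forms agree; (B.9): reflection swaps the diagonal entries.
print(abs(first[0] - second[0]), abs(first[1] - second[1]))
print(abs(reflected[0] - second[1]), abs(reflected[1] - second[0]))
```

All four printed differences should be at the level of floating-point rounding.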
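The properties of $C$ are summarized in the next lemma.

Lemma B.2. Assume Hypothesis B.1. The Cauchy operator $C$ has the properties that the boundary values $C_\pm$ are bounded operators $L^{2}_s(\Sigma) \to L^{2}_s(\Sigma)$ which satisfy
\[
C_+ - C_- = I \tag{B.10}
\]
and
\[
(Cf)(\zeta^{-1}) = \begin{pmatrix} 0 & * \end{pmatrix}, \qquad (Cf)(\zeta) = \begin{pmatrix} * & 0 \end{pmatrix}. \tag{B.11}
\]
Here $*$ is a placeholder for an unspecified value. Furthermore, $C$ restricts to $L^{2}_s(\Sigma)$, that is,
\[
(Cf)(z^{-1}) = (Cf)(z)\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad z \in \mathbb{C}\setminus\Sigma, \tag{B.12}
\]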
for $f \in L^{2}_s(\Sigma)$ and if $w_\pm$ satisfy (B.2) we also have
\[
C_\pm(fw_\mp)(1/z) = C_\mp(fw_\pm)(z)\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad z \in \Sigma. \tag{B.13}
\]
Proof. Everything follows from (B.9) and the fact that $C$ inherits all properties from the classical Cauchy operator.

We have thus obtained a Cauchy transform with the required properties. Following [1, Secs. 7 and 8], respectively [25], we can solve our Riemann–Hilbert problem using this Cauchy operator.

Introduce the operator $C_w : L^{2}_s(\Sigma) \to L^{2}_s(\Sigma)$ by
\[
C_w f = C_+(fw_-) + C_-(fw_+), \qquad f \in L^{2}_s(\Sigma), \tag{B.14}
\]
and recall from Lemma 3.6 that the unique solution corresponding to $v \equiv I$ is given by
\[
m_0(z) = \begin{pmatrix} f(z) & f\left(\dfrac{1}{z}\right) \end{pmatrix}, \qquad f(z) = \frac{1}{1 - \zeta^{2} + \gamma}\left( \gamma\zeta^{2}\,\frac{z - \zeta^{-1}}{z - \zeta} + 1 - \zeta^{2} \right).
\]
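Observe that for $\gamma = 0$ we have $f(z) = 1$ and for $\gamma = \infty$ we have $f(z) = \zeta^{2}\,\frac{z - \zeta^{-1}}{z - \zeta}$. In particular, $m_0(z)$ is uniformly bounded away from $\zeta$ for all $\gamma \in [0, \infty]$.

Then we have the next result.

Theorem B.3. Assume Hypothesis B.1. Suppose $m$ solves the Riemann–Hilbert problem (B.1). Then
\[
m(z) = (1 - c_0)\,m_0(z) + \frac{1}{2\pi i}\int_{\Sigma} \mu(s)\,(w_+(s) + w_-(s))\,\Omega_\zeta(s, z), \tag{B.15}
\]
where
\[
\mu = m_+b_+^{-1} = m_-b_-^{-1} \qquad \text{and} \qquad c_0 = \left( \frac{1}{2\pi i}\int_{\Sigma} \mu(s)\,(w_+(s) + w_-(s))\,\Omega_\zeta(s, 0) \right)_1.
\]
Here $(m)_j$ denotes the $j$th component of a vector. Furthermore, $\mu$ solves
\[
(I - C_w)\mu = (1 - c_0)\,m_0. \tag{B.16}
\]
Conversely, suppose $\tilde\mu$ solves
\[
(I - C_w)\tilde\mu = m_0, \tag{B.17}
\]
and
\[
\tilde c_0 = \left( \frac{1}{2\pi i}\int_{\Sigma} \tilde\mu(s)\,(w_+(s) + w_-(s))\,\Omega_\zeta(s, 0) \right)_1 \ne -1,
\]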
then $m$ defined via (B.15), with $(1 - c_0) = (1 + \tilde c_0)^{-1}$ and $\mu = (1 + \tilde c_0)^{-1}\tilde\mu$, solves the Riemann–Hilbert problem (B.1) and $\mu = m_\pm b_\pm^{-1}$.

Proof. If $m$ solves (B.1) and we set $\mu = m_\pm b_\pm^{-1}$, then $m$ satisfies an additive jump given by
\[
m_+ - m_- = \mu(w_+ + w_-).
\]
Hence, if we denote the right-hand side of (B.15) by $\tilde m$, both functions satisfy the same additive jump. Furthermore, Hypothesis B.1 implies that $\mu$ is symmetric and hence so is $\tilde m$. Using (B.11) we also see that $\tilde m$ satisfies the same pole conditions as $m_0$. In summary, $m - \tilde m$ has no jump and solves (B.1) with $v \equiv I$ except for the normalization which is given by $m(0) - \tilde m(0) = \begin{pmatrix} 0 & * \end{pmatrix}$. Hence Lemma 3.6 implies $m - \tilde m = 0$.

Moreover, if $m$ is given by (B.15), then (B.10) implies
\[
\begin{aligned}
m_\pm &= (1 - c_0)\,m_0 + C_\pm(\mu w_-) + C_\pm(\mu w_+) \\
&= (1 - c_0)\,m_0 + C_w(\mu) \pm \mu w_\pm \\
&= (1 - c_0)\,m_0 - (I - C_w)\mu + \mu b_\pm.
\end{aligned} \tag{B.18}
\]
From this we conclude that $\mu = m_\pm b_\pm^{-1}$ solves (B.16).

Conversely, if $\tilde\mu$ solves (B.17), then set
\[
\tilde m(z) = m_0(z) + \frac{1}{2\pi i}\int_{\Sigma} \tilde\mu(s)\,(w_+(s) + w_-(s))\,\Omega_\zeta(s, z),
\]
and the same calculation as in (B.18) implies $\tilde m_\pm = \tilde\mu b_\pm$, which shows that $m = (1 + \tilde c_0)^{-1}\tilde m$ solves the Riemann–Hilbert problem (B.1).

Note that in the special case $\gamma = 0$ we have $m_0(z) = \begin{pmatrix} 1 & 1 \end{pmatrix}$ and we can choose $\zeta$ as we please, say $\zeta = \infty$ such that $c_0 = \tilde c_0 = 0$ in the above theorem.

Hence we have a formula for the solution of our Riemann–Hilbert problem $m(z)$ in terms of $(I - C_w)^{-1}m_0$ and this clearly raises the question of bounded invertibility of $I - C_w$. This follows from Fredholm theory (cf., e.g., [44]):

Lemma B.4. Assume Hypothesis B.1. The operator $I - C_w$ is Fredholm of index zero,
\[
\operatorname{ind}(I - C_w) = 0. \tag{B.19}
\]

Proof. Since one can easily check
\[
(I - C_w)(I - C_{-w}) = (I - C_{-w})(I - C_w) = I - T_w,
\]
where
\[
T_w = T_{++} + T_{+-} + T_{-+} + T_{--}, \qquad T_{\sigma_1\sigma_2}(f) = C_{\sigma_1}[C_{\sigma_2}(fw_{-\sigma_2})\,w_{-\sigma_1}], \tag{B.20}
\]
it suffices to check that the operators $T_{\sigma_1\sigma_2}$ are compact ([34, Theorem 1.4.3]). By Mergelyan's theorem we can approximate $w_\pm$ by rational functions and, since the norm limit of compact operators is compact, we can assume without loss that $w_\pm$ have an analytic extension to a neighborhood of $\Sigma$.

Indeed, suppose $f_n \in L^{2}(\Sigma)$ converges weakly to zero. Without loss we can assume $f_n$ to be continuous. We will show that $\|T_w f_n\|_{L^{2}} \to 0$. Using the analyticity of $w$ in a neighborhood of $\Sigma$ and the definition of $C_\pm$, we can slightly deform the contour $\Sigma$ to some contour $\Sigma_\pm$ close to $\Sigma$, on the left, and have, by Cauchy's theorem,
\[
T_{++}f_n(z) = \frac{1}{2\pi i}\int_{\Sigma_+} \bigl(C(f_nw_-)(s)\,w_-(s)\bigr)\,\Omega_\zeta(s, z).
\]
Now $(C(f_nw_-)w_-)(z) \to 0$ as $n \to \infty$. Also
\[
|(C(f_nw_-)w_-)(z)| < \mathrm{const}\,\|f_n\|_{L^{2}}\,\|w_-\|_{L^{\infty}} < \mathrm{const}
\]
and thus, by the dominated convergence theorem, $\|T_{++}f_n\|_{L^{2}} \to 0$ as desired.

Moreover, considering $I - \varepsilon C_w = I - C_{\varepsilon w}$ for $0 \le \varepsilon \le 1$ we obtain $\operatorname{ind}(I - C_w) = \operatorname{ind}(I) = 0$ from homotopy invariance of the index.

By the Fredholm alternative, it follows that to show the bounded invertibility of $I - C_w$ we only need to show that $\ker(I - C_w) = 0$. The latter is equivalent to unique solvability of the corresponding vanishing Riemann–Hilbert problem in the case $\gamma = 0$ (where we can choose $\zeta = \infty$ such that $c_0 = \tilde c_0 = 0$).

Corollary B.5. Assume Hypothesis B.1. A unique solution of the Riemann–Hilbert problem (B.1) with $\gamma = 0$ exists if and only if the corresponding vanishing Riemann–Hilbert problem, where the normalization condition is replaced by $m(0) = \begin{pmatrix} 0 & m_2 \end{pmatrix}$, with $m_2$ arbitrary, has at most one solution.

We are interested in comparing a Riemann–Hilbert problem for which $\|w\|_\infty$ is small with the one-soliton problem, where
\[
\|w\|_\infty = \|w_+\|_{L^{\infty}(\Sigma)} + \|w_-\|_{L^{\infty}(\Sigma)}. \tag{B.21}
\]
For such a situation we have the following result:

Theorem B.6. Fix a contour $\Sigma$ and choose $\zeta$, $\gamma = \gamma^t$, $v^t$ depending on some parameter $t \in \mathbb{R}$ such that Hypothesis B.1 holds. Assume that $w^t$ satisfies

$$\|w^t\|_\infty \le \rho(t) \tag{B.22}$$

for some function $\rho(t) \to 0$ as $t \to \infty$. Then $(I - C_{w^t})^{-1} : L^2_s(\Sigma) \to L^2_s(\Sigma)$ exists for sufficiently large $t$ and the solution $m(z)$ of the Riemann–Hilbert problem (B.1) differs from the one-soliton solution $m_0^t(z)$ only by $O(\rho(t))$, where the error term depends on the distance of $z$ to $\Sigma \cup \{\zeta^{\pm 1}\}$.
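The smallness mechanism behind this theorem can be illustrated in finite dimensions. The following Python sketch is only a toy model: a fixed 2×2 matrix, rescaled to Frobenius norm $\rho$, stands in for the operator $C_{w^t}$ (an assumption made purely for illustration; the matrix entries and all function names are hypothetical and not taken from the paper). It sums the Neumann series for $(I - C)^{-1}$ and measures how far the result is from the identity, which should be at most $\rho/(1-\rho)$, that is, $O(\rho)$.

```python
import math


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def frobenius(a):
    """Frobenius norm; it is submultiplicative and bounds the operator norm."""
    return math.sqrt(sum(x * x for row in a for x in row))


def neumann_inverse(c, terms=80):
    """Approximate (I - c)^{-1} by the partial sum I + c + c^2 + ... ."""
    n = len(c)
    total = identity(n)
    power = identity(n)
    for _ in range(terms):
        power = matmul(power, c)
        total = [[total[i][j] + power[i][j] for j in range(n)]
                 for i in range(n)]
    return total


def deviation_from_identity(rho):
    """Norm of (I - C)^{-1} - I for a toy matrix C with Frobenius norm rho."""
    base = [[0.3, -0.5], [0.4, 0.2]]  # arbitrary illustrative entries
    scale = rho / frobenius(base)
    c = [[scale * x for x in row] for row in base]
    inv = neumann_inverse(c)
    eye = identity(len(c))
    diff = [[inv[i][j] - eye[i][j] for j in range(len(c))]
            for i in range(len(c))]
    return frobenius(diff)


if __name__ == "__main__":
    for rho in (0.2, 0.02, 0.002):
        print(rho, deviation_from_identity(rho), rho / (1 - rho))
```

In the theorem itself the role of the matrix norm is played by the operator norm of $C_{w^t}$ on $L^2_s(\Sigma)$, which is controlled by $\|w^t\|_\infty$.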
Proof. By boundedness of the Cauchy transform, one has $\|C_{w^t}\| \le \mathrm{const}\,\|w^t\|_\infty$. Thus, by the Neumann series, we infer that $(I - C_{w^t})^{-1}$ exists for sufficiently large $t$ and

$$\|(I - C_{w^t})^{-1} - I\| = O(\rho(t)).$$

This implies $\|\tilde\mu^t - m_0^t\|_{L^2_s} = O(\rho(t))$ and $\tilde c_0^t = O(\rho(t))$ (note $\tilde\mu_0^t = \mu_0^t = m_0^t$). Consequently $c_0^t = O(\rho(t))$ and $\|\mu^t - m_0^t\|_{L^2_s} = O(\rho(t))$ and thus $m^t(z) - m_0^t(z) = O(\rho(t))$ uniformly in $z$ as long as it stays a positive distance away from $\Sigma \cup \{\zeta^{\pm 1}\}$.

References

[1] R. Beals and R. Coifman, Scattering and inverse scattering for first order systems, Comm. Pure Appl. Math. 37 (1984) 39–90.
[2] T. Dauxois, M. Peyrard and S. Ruffo, The Fermi–Pasta–Ulam 'numerical experiment': History and pedagogical perspectives, Eur. J. Phys. 26 (2005) S3–S11.
[3] P. Deift, Orthogonal Polynomials and Random Matrices: A Riemann–Hilbert Approach, Courant Lecture Notes, Vol. 3 (Amer. Math. Soc., Rhode Island, 1998).
[4] P. Deift and X. Zhou, A steepest descent method for oscillatory Riemann–Hilbert problems, Ann. of Math. (2) 137 (1993) 295–368.
[5] P. Deift and X. Zhou, Long-time asymptotics for integrable systems. Higher order theory, Comm. Math. Phys. 165(1) (1994) 175–191.
[6] P. Deift and X. Zhou, Asymptotics for the Painlevé II equation, Comm. Pure Appl. Math. 48 (1995) 277–337.
[7] P. A. Deift, A. R. Its and X. Zhou, Long-time asymptotics for integrable nonlinear wave equations, in Important Developments in Soliton Theory, Springer Ser. Nonlinear Dynam., eds. A. S. Fokas and V. E. Zakharov (Springer, Berlin, 1993), pp. 181–204.
[8] P. Deift, T. Kriecherbauer and S. Venakides, Forced lattice vibrations. I, II, Comm. Pure Appl. Math. 48(11) (1995) 1187–1249, 1251–1298.
[9] P. Deift, L. C. Li and C. Tomei, Toda flows with infinitely many variables, J. Funct. Anal. 64 (1985) 358–402.
[10] P. Deift, S. Venakides and X. Zhou, The collisionless shock region for the long time behavior of solutions of the KdV equation, Comm. Pure Appl. Math. 47 (1994) 199–206.
[11] P. Deift, S. Kamvissis, T. Kriecherbauer and X. Zhou, The Toda rarefaction problem, Comm. Pure Appl. Math. 49(1) (1996) 35–83.
[12] I. Egorova, J. Michor and G. Teschl, Inverse scattering transform for the Toda hierarchy with quasi-periodic background, Proc. Amer. Math. Soc. 135 (2007) 1817–1827.
[13] I. Egorova, J. Michor and G. Teschl, Soliton solutions of the Toda hierarchy on quasi-periodic background revisited, to appear in Math. Nach.
[14] L. Faddeev and L. Takhtajan, Hamiltonian Methods in the Theory of Solitons (Springer, Berlin, 1987).
[15] E. Fermi, J. Pasta and S. Ulam, Studies of nonlinear problems, in Collected Works of Enrico Fermi, Vol. II. Theory, Methods, and Applications, ed. E. Segre, 2nd edn. (Marcel Dekker, New York, 2000), pp. 978–988; reprinted from University of Chicago Press (1965).
[16] H. Flaschka, The Toda lattice. I. Existence of integrals, Phys. Rev. B 9 (1974) 1924–1925.
[17] C. S. Gardner, J. M. Greene, M. D. Kruskal and R. M. Miura, A method for solving the Korteweg–de Vries equation, Phys. Rev. Lett. 19 (1967) 1095–1097.
[18] F. Gesztesy, H. Holden, J. Michor and G. Teschl, Soliton Equations and Their Algebro-Geometric Solutions. Volume II: (1 + 1)-Dimensional Discrete Models, Cambridge Studies in Advanced Mathematics, Vol. 114 (Cambridge University Press, Cambridge, 2008).
[19] A. R. Its, Asymptotics of solutions of the nonlinear Schrödinger equation and isomonodromic deformations of systems of linear differential equations, Soviet Math. Dokl. 24 (1981) 452–456.
[20] S. Kamvissis, On the long time behavior of the doubly infinite Toda lattice under initial data decaying at infinity, Comm. Math. Phys. 153(3) (1993) 479–519.
[21] S. Kamvissis, On the Toda shock problem, Phys. D 65 (1993) 242–266.
[22] S. Kamvissis and G. Teschl, Stability of periodic soliton equations under short range perturbations, Phys. Lett. A 364 (2007) 480–483.
[23] S. Kamvissis and G. Teschl, Stability of the periodic Toda lattice under short range perturbations, arXiv:0705.0346.
[24] S. Kamvissis and G. Teschl, Stability of the periodic Toda lattice: Higher order asymptotics, arXiv:0805.3847.
[25] H. Krüger and G. Teschl, Long-time asymptotics for the Toda lattice in the soliton region, to appear in Math. Z.
[26] H. Krüger and G. Teschl, Long-time asymptotics for the periodic Toda lattice in the soliton region, arXiv:0807.0244.
[27] P. D. Lax, Integrals of nonlinear equations of evolution and solitary waves, Comm. Pure Appl. Math. 21 (1968) 467–490.
[28] S. V. Manakov, Nonlinear Fraunhofer diffraction, Sov. Phys. JETP 38 (1974) 693–696.
[29] J. Michor, I. Nenciu and G. Teschl, Long-time asymptotics of the Toda lattice in the collisionless shock region, in preparation.
[30] J. Moser, Finitely many mass points on the line under the influence of an exponential potential: An integrable system, in Dynamical Systems, Theory and Applications, ed. J. Moser, Lecture Notes in Phys., Vol. 38 (Springer, Berlin, 1975), pp. 467–497.
[31] N. I. Muskhelishvili, Singular Integral Equations (P. Noordhoff Ltd., Groningen, 1953).
[32] V. Yu. Novokshenov and I. T. Habibullin, Nonlinear differential-difference schemes integrable by the method of the inverse scattering problem. Asymptotics of the solution for t → ∞, Soviet Math. Dokl. 23(2) (1981) 304–307.
[33] R. S. Palais, The symmetries of solitons, Bull. Amer. Math. Soc. 34 (1997) 339–403.
[34] S. Prössdorf, Some Classes of Singular Equations (North-Holland, Amsterdam, 1978).
[35] J. S. Russell, Report on waves, in 14th Mtg. of the British Assoc. for the Advancement of Science (John Murray, London, 1844), pp. 311–390 + 57 plates.
[36] G. Teschl, Inverse scattering transform for the Toda hierarchy, Math. Nach. 202 (1999) 163–171.
[37] G. Teschl, On the initial value problem of the Toda and Kac–van Moerbeke hierarchies, in Differential Equations and Mathematical Physics, eds. R. Weikard and G. Weinstein, AMS/IP Studies in Advanced Mathematics, Vol. 16 (Amer. Math. Soc., Providence, 2000), pp. 375–384.
[38] G. Teschl, Jacobi Operators and Completely Integrable Nonlinear Lattices, Math. Surv. and Mon., Vol. 72 (Amer. Math. Soc., Rhode Island, 2000).
[39] G. Teschl, Almost everything you always wanted to know about the Toda equation, Jahresber. Deutsch. Math.-Verein. 103(4) (2001) 149–162.
[40] M. Toda, Theory of Nonlinear Lattices, 2nd edn. (Springer, Berlin, 1989).
[41] S. Venakides, P. Deift and R. Oba, The Toda shock problem, Comm. Pure Appl. Math. 44 (1991) 1171–1242.
[42] E. T. Whittaker and G. N. Watson, A Course of Modern Analysis, 4th edn. (Cambridge University Press, Cambridge, 1927).
[43] N. J. Zabusky and M. D. Kruskal, Interaction of solitons in a collisionless plasma and the recurrence of initial states, Phys. Rev. Lett. 15 (1965) 240–243.
[44] X. Zhou, The Riemann–Hilbert problem and inverse scattering, SIAM J. Math. Anal. 20(4) (1989) 966–986.
February 11, 2009 13:42 WSPC/148-RMP
J070-00359
Reviews in Mathematical Physics Vol. 21, No. 1 (2009) 111–154 c World Scientific Publishing Company
EFFECTIVE CONSTRAINTS FOR QUANTUM SYSTEMS
MARTIN BOJOWALD
Institute for Gravitation and the Cosmos, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
[email protected]

BARBARA SANDHÖFER
Institute for Theoretical Physics, University of Cologne, Zülpicher Strasse 77, 50937 Cologne, Germany
[email protected]

AURELIANO SKIRZEWSKI
Centro de Física Fundamental, Universidad de los Andes, Mérida 5101, Venezuela
[email protected]

ARTUR TSOBANJAN
Institute for Gravitation and the Cosmos, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
[email protected]

Received 28 April 2008
Revised 7 November 2008

An effective formalism for quantum constrained systems is presented which allows manageable derivations of solutions and observables, including a treatment of physical reality conditions without requiring full knowledge of the physical inner product. Instead of a state equation from a constraint operator, an infinite system of constraint functions on the quantum phase space of expectation values and moments of states is used. The examples of linear constraints as well as the free non-relativistic particle in parametrized form illustrate how standard problems of constrained systems can be dealt with in this framework.

Keywords: Constrained systems; physical Hilbert space; effective equations.

Mathematics Subject Classification 2000: 53D17, 81S10
1. Introduction

Effective equations are a trusted tool to sidestep some of the mathematical and conceptual difficulties of quantum theories. Quantum corrections to classical equations
of motion are usually easier to analyze than the behavior of outright quantum states, and they can often be derived in a manageable way. This is illustrated, e.g., by the derivation of the low-energy effective action for anharmonic oscillators in [1] or, equivalently, effective equations for canonical quantum systems in [2–4]. But effective equations are not merely quantum corrected classical equations. They provide direct solutions for quantum properties such as expectation values or fluctuations. While semiclassical regimes play important roles in providing useful approximation schemes, effective equations present a much more general method. In fact, they may be viewed as an analysis of quantum properties independently of specific Hilbert space representation issues. As we will discuss here, this is especially realized for constrained systems which commonly have additional complications such as the derivation of a physical inner product or the problem of time in general relativity [5].

We therefore develop an effective constraint formalism parallel to that of effective equations for unconstrained systems. Its advantages are that (i) it avoids directly writing an integral (or other) form of a physical inner product, which is instead implemented by reality conditions for the physical variables; (ii) it shows when a phase space variable evolves classically enough to play the role of internal time which, in a precise sense, emerges from quantum gravity; and (iii) it directly provides physical quantities such as expectation values and fluctuations as relational functions of internal time, rather than computing a whole wave function first and then performing integrations. These advantages avoid conceptual problems and some technical difficulties in solving quantum equations. They can also bring out general properties more clearly, especially in quantum cosmology.
Moreover, they provide equations which are more easily implemented numerically than equations for states followed by integrations to compute expectation values. (Finally, although we discuss only systems with a single classical constraint in this paper, anomaly issues can much more directly be analyzed at the effective level; see [6–8] for work in this direction.) As we will see, however, there are still various unresolved mathematical issues for a completely general formulation. In this article, we propose the general principles behind an effective formulation of constrained systems and illustrate properties and difficulties by simple examples, including the parametrized free, non-relativistic particle where we will demonstrate the interplay of classical and quantum variables as it occurs in constrained systems. Specific procedures used in this concrete example will be general enough to encompass any non-relativistic system in parameterized form. Relativistic systems show further subtleties and will be dealt with in a forthcoming paper.
2. Setting

We first review the setup of effective equations for unconstrained Hamiltonian systems [2–4], which we will generalize to systems with constraints in the following section.
We describe a state by its moments rather than a wave function in a certain Hilbert space representation. This has the immediate advantage that the description is manifestly representation independent and deals directly with quantities of physical interest, such as expectation values and fluctuations. Just as a Hilbert space representation, the system is determined through the algebra of its basic operators and their $*$-relations (adjointness or reality conditions). In terms of expectation values, fluctuations and all higher moments, this structure takes the form of an infinite dimensional phase space whose Poisson relations are derived from the basic commutation algebra. Dynamics is determined by a Hamiltonian on this phase space. As a function of all the phase space variables it is obtained by taking the expectation value of the Hamiltonian operator in a general state and expressing the state dependence as a dependence on all the moments. Thus, the Hamiltonian operator determines a function on the infinite dimensional phase space which generates Hamiltonian evolution.$^a$

$^a$ This viewpoint in the present context of effective equations goes back to [9]. While some underlying constructions can be related to the geometrical formulation of quantum mechanics developed in [10–12], the geometrical formulation has so far not provided a rigorous derivation of effective equations. Present methods in this context remain incomplete due to a lack of treating quantum variables properly, which take center stage in the methods of [2] and those developed here. In some cases, it may be enough to place upper bounds on additional correction terms from quantum variables, based on semiclassicality assumptions. This may be done within the geometric formulation to provide semiclassical equations [13, 14], but it is insufficient for effective equations.

Specifically, for an ordinary quantum mechanical system with canonical basic operators $\hat q$ and $\hat p$ satisfying $[\hat q, \hat p] = i\hbar$ we have a phase space coordinatized by the expectation values $q := \langle\hat q\rangle$ and $p := \langle\hat p\rangle$ as well as infinitely many quantum variables$^b$

$$G^{a,b} := \langle(\hat p - \langle\hat p\rangle)^a (\hat q - \langle\hat q\rangle)^b\rangle_{\rm Weyl} \tag{2.1}$$

for integer $a$ and $b$ such that $a + b \ge 2$, where the totally symmetric ordering is used. For $a + b = 2$, for instance, this provides fluctuations $(\Delta q)^2 = G^{0,2} = G^{qq}$ and $(\Delta p)^2 = G^{2,0} = G^{pp}$ as well as the covariance $G^{1,1} = G^{qp}$. As indicated, for moments of lower orders it is often helpful to list the variables appearing as operators directly.

$^b$ Notice that the notation used here differs from that introduced in [2] because we found that the considerations of the present article, in which several canonical pairs are involved, can be presented more clearly in this way.

The symplectic structure is determined through Poisson brackets which follow by the basic rule $\{A, B\} = -i\hbar^{-1}\langle[\hat A, \hat B]\rangle$ for any two operators $\hat A$ and $\hat B$ which define phase space functions $A := \langle\hat A\rangle$ and $B := \langle\hat B\rangle$. Moreover, for products of expectation values in the quantum variables one simply uses the Leibniz rule to reduce all brackets to the elementary ones. General Poisson brackets between the quantum variables then satisfy the formula$^c$

$$\{G^{a,b}, G^{c,d}\} = \sum_{r,s=0}^{\infty} \left(-\frac{1}{4}\hbar^2\right)^{r+s} \sum_{j,k} \binom{a}{j}\binom{b}{k}\binom{c}{k}\binom{d}{j}\, G^{a+c-j-k,\,b+d-j-k}\,(\delta_{j,2r+1}\delta_{k,2s} - \delta_{j,2r}\delta_{k,2s+1}) - ad\,G^{a-1,b}G^{c,d-1} + bc\,G^{a,b-1}G^{c-1,d} \tag{2.2}$$

where the summation of $j$ and $k$ is over the ranges $0 \le j \le \min(a, d)$ and $0 \le k \le \min(b, c)$, respectively. (For low order moments, it is easier to use direct calculations of Poisson brackets via expectation values of commutators.) This defines the kinematics of the quantum system formulated in terms of moments. The role of the commutator algebra of basic operators is clearly seen in Poisson brackets.

$^c$ We thank Joseph Ochoa for bringing a mistake in the corresponding formula of [2], as well as its correction, to our attention. We would like to note that the derivation presented in [2], if followed through correctly, does yield (2.2).

Dynamics is defined by a quantum Hamiltonian derived from the Hamiltonian operator by taking expectation values. This results in a function of expectation values and moments through the state used for the expectation value. By Taylor expansion, we have

$$H_Q(q, p, G^{a,b}) = \langle H(\hat q, \hat p)\rangle_{\rm Weyl} = \langle H(q + (\hat q - q), p + (\hat p - p))\rangle_{\rm Weyl} = H(q, p) + \sum_{a=0}^{\infty}\sum_{b=0}^{\infty} \frac{1}{a!\,b!}\frac{\partial^{a+b} H(q, p)}{\partial p^a\,\partial q^b}\, G^{a,b} \tag{2.3}$$

where we understand $G^{a,b} = 0$ if $a + b < 2$ and $H(q, p)$ is the classical Hamiltonian evaluated in expectation values. As written explicitly, we assume the Hamiltonian to be Weyl ordered. If another ordering is desired, it can be reduced to Weyl ordering by adding re-ordering terms.

Having a Hamiltonian and Poisson relations of all the quantum variables, one can compute Hamiltonian equations of motion $\dot q = \{q, H_Q\}$, $\dot p = \{p, H_Q\}$ and $\dot G^{a,b} = \{G^{a,b}, H_Q\}$. This results in infinitely many equations of motion which, in general, are all coupled to each other. This set of infinitely many ordinary differential equations is fully equivalent to the partial differential equation for a wave function given by the Schrödinger equation. In general, one can expect a partial differential equation to be solved more easily than infinitely many coupled ordinary ones. Exceptions are solvable systems such as the harmonic oscillator or the spatially flat quantum cosmology of a free, massless scalar field [15] where equations of motion for expectation values and higher moments decouple. This decoupling also allows a precise determination of properties of dynamical coherent states [16]. Such solvable systems can then be used as the basis for a perturbation theory to analyze more general systems, just like free quantum field theory provides a solvable basis for interacting ones. In quantum cosmology, this is developed in [17–19]. Moreover,
semiclassical and some other regimes allow one to decouple and truncate the equations consistently, resulting in a finite set of ordinary differential equations. This is easier to solve and, as we will discuss in detail below, can be exploited to avoid conceptual problems especially in the context of constrained systems.

3. Effective Constraints

For a constrained system, the definition of phase space variables (2.1) in addition to expectation values of basic operators is the same. For several basic variables, copies of independent moments as well as cross-correlations between different canonical pairs need to be taken into account. A useful notation, especially for two canonical pairs $(q, p; q_1, p_1)$ as we will use it later, is

$$G^{a,b}_{c,d} \equiv \langle(\hat p - p)^a (\hat q - q)^b (\hat p_1 - p_1)^c (\hat q_1 - q_1)^d\rangle_{\rm Weyl}. \tag{3.1}$$
Also here we will, for the sake of clarity, sometimes use a direct listing of operators, as in $G^{qq} = G^{0,2}_{0,0} = (\Delta q)^2$ or the covariance $G^{q}_{p_1} = G^{0,1}_{1,0}$, for low order moments.

We assume that we have a single constraint $\hat C$ in the quantum system and no true Hamiltonian; cases of several constraints or constrained systems with a true Hamiltonian can be analyzed analogously. We clearly must impose the principal quantum constraint $C_Q(q, p, G^{a,b}) := \langle\hat C\rangle = 0$ since any physical state $|\psi\rangle$, whose expectation values and moments we are computing, must be annihilated by our constraint, $\hat C|\psi\rangle = 0$. Just as the quantum Hamiltonian $H_Q$ before, the quantum constraint can be written as a function of expectation values and quantum variables by Taylor expansion as in (2.3). However, this one condition for the phase space variables is much weaker than imposing a Dirac constraint on states, $\hat C|\psi\rangle = 0$. In fact, a simple counting of degrees of freedom shows that additional constraints must be imposed: One classical constraint such as $C = 0$ removes a pair of canonical variables by restricting to the constraint surface and factoring out the flow generated by the constraint. For a quantum system, on the other hand, we need to eliminate infinitely many variables such as a canonical pair $(q, p)$ together with all the quantum variables it defines. Imposing only $C_Q = 0$ would remove a canonical pair but leave all its quantum variables unrestricted. These additional variables are to be removed by infinitely many further constraints.

There are obvious candidates for these constraints. If $\hat C|\psi\rangle = 0$ for any physical state, we do not just have a single constraint $\langle\hat C\rangle = 0$ but infinitely many quantum constraints

$$C^{(n)} := \langle\hat C^n\rangle = 0 \tag{3.2}$$

$$C^{(n)}_{f(q,p)} := \langle f(\hat q, \hat p)\,\hat C^n\rangle = 0 \tag{3.3}$$
for positive integer n and arbitrary phase space functions f (q, p). All these expectation values vanish for physical states, and in general differ from each other on the quantum phase space. For arbitrary f (q, p), there is an uncountable number of constraints which should be restricted suitably such that a closed system of constraints
results which provides a complete reduction of the quantum phase space. The form of functions $f(q, p)$ to be included in the quantum constraint system depends on the form of the classical constraint and its basic algebra. Examples and a general construction scheme are presented below.

We thus have indeed infinitely many constraints,$^d$ which constitute the basis for our effective constraints framework. This is to be solved as a classical constrained system and we naturally adapt notions encountered in the classical constraint analysis. The constraint surface is the surface defined by setting all of the constraint functions to zero. A function on phase space is first class if it has a vanishing Poisson bracket with all of the constraints on the constraint surface and is second class otherwise. A set of constraints is first class if they form a closed Poisson algebra. Continuing the analogy, we refer to the Poisson flows generated by first class constraints as gauge transformations. A second class function varies along some of these flows and is therefore gauge dependent.

As we have to solve an infinite system of constraints on an infinite dimensional phase space, an effective treatment requires approximations whose explicit form depends on the specific constraints. At this point, some caution is required: approximations typically entail disregarding quantum variables beyond a certain order to make the system finite. Doing so for an order of moments larger than two results in a Poisson structure which is not symplectic because only the expectation values form a symplectic submanifold of the full quantum phase space, but no set of moments to a certain order does. We are then dealing with constrained systems on Poisson manifolds such that the usual countings of degrees of freedom do not apply. For instance, it is not guaranteed that each constraint generates an independent flow even if it weakly commutes with all other constraints which would usually make it first class.
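The loss of a symplectic structure on a finite set of moments can be seen in a small example. The Python sketch below takes only the three second-order moments of one canonical pair as coordinates. It assumes the brackets $\{G^{qq}, G^{qp}\} = 2G^{qq}$, $\{G^{qq}, G^{pp}\} = 4G^{qp}$ and $\{G^{qp}, G^{pp}\} = 2G^{pp}$, obtained by direct commutator calculation with $\{q, p\} = 1$; these explicit values, the sample point and all function names are assumptions of this illustration rather than statements quoted from the text. The resulting Poisson matrix is a 3×3 antisymmetric matrix, hence degenerate, and the gradient of $G^{qq}G^{pp} - (G^{qp})^2$ lies in its kernel.

```python
from fractions import Fraction


def poisson_matrix(gqq, gqp, gpp):
    """Brackets of (G^qq, G^qp, G^pp) among themselves (assumed values)."""
    return [
        [0, 2 * gqq, 4 * gqp],
        [-2 * gqq, 0, 2 * gpp],
        [-4 * gqp, -2 * gpp, 0],
    ]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def casimir_gradient(gqq, gqp, gpp):
    """Gradient of G^qq G^pp - (G^qp)^2 with respect to (G^qq, G^qp, G^pp)."""
    return [gpp, -2 * gqp, gqq]


def apply(m, v):
    """Matrix-vector product for a 3x3 matrix."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


if __name__ == "__main__":
    # Arbitrary sample point; exact rational arithmetic avoids rounding issues.
    point = (Fraction(3, 2), Fraction(1, 4), Fraction(5, 7))
    P = poisson_matrix(*point)
    print("det P =", det3(P))
    print("P . grad =", apply(P, casimir_gradient(*point)))
```

A vanishing determinant means the bracket cannot be inverted to a symplectic form on these three variables, which is the kind of situation in which the usual counting of degrees of freedom fails.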
Properties of constrained systems in the more general setting of Poisson manifolds which need not be symplectic are discussed, e.g., in [21].

$^d$ As observed in [20], a single constraint $C^{(2)}$ would guarantee a complete reduction for a system where zero is in the discrete part of the spectrum of a self-adjoint $\hat C$. In this case, non-degeneracy of the inner product ensures that $\langle\psi|\hat C^2|\psi\rangle = 0$ implies $\hat C|\psi\rangle = 0$. However, details of the quantization and the quantum representation are required for this conclusion, based also on properties of the spectrum, which is against the spirit of effective equations. Moreover, the resulting constraint equation $C^{(2)}$ is in general rather complicated and must be approximated for explicit analytical or numerical solutions. Then, if $C^{(2)} = 0$ is no longer imposed exactly, a large amount of freedom for uncontrolled deviations from $\hat C|\psi\rangle = 0$ would open up. In our approach, we are using more than one constraint which ensures that even under approximations the system remains sufficiently well controlled. Moreover, our considerations remain valid for constraints with zero in the continuous parts of their spectra, although as always there are additional subtleties.

We also emphasize that gauge flows generated by quantum constraints on the quantum phase space play important roles, which one may not have expected from the usual Dirac treatment of constraints. There, only a constraint equation is written for states, but no gauge flow on the Hilbert space needs to be factored out. In fact, the gauge flow which one could define by $\exp(it\hat C)|\psi\rangle$ for a self-adjoint $\hat C$ is trivial on physical states which solve the constraint equation $\hat C|\psi\rangle = 0$. In the context of effective constraints, there are two main reasons why the gauge flow is non-trivial and becomes important for a complete removal of gauge dependent variables.

First, the flow on expectation values of observables is produced by transformation of both the state and its dual, $\langle\psi|\exp(-it\hat C)\,\hat O\,\exp(it\hat C)|\psi\rangle$. To conclude that the gauge flow is trivial one implicitly uses self-adjointness, deducing that $\langle\psi|\hat C = 0$ on physical states. If the physical states are not part of the kinematical Hilbert space, the adjoint action of our original constraint operator on them is undefined altogether and $\hat C$ has to be modified before we can define the flow. These are specific properties of the kinematical representation which we are not making use of in the effective procedure used here, where reality and normalization conditions are not imposed before the very end of finding properties of states in the physical Hilbert space. The expectation values and moments we are dealing with when imposing quantum constraints thus form a much wider manifold than the Hilbert space setting would allow. Here, not only constraint equations but also gauge flows on the constraint surface are crucial. If representation properties are given which imply that physical states are in the kinematical Hilbert space, we will indeed see that the corresponding flow is trivial as the example in Sec. 4.2 shows.

Secondly, the Dirac constraint $\hat C|\psi\rangle = 0$ corresponds to infinitely many conditions, and only when all of them are solved can the gauge-flow trivialize. An effective treatment, on the other hand, shows its strength especially when one can reduce the required set of equations to finitely many ones, which in our case would imply only a partial solution of the Dirac constraint.
On these partial solutions, which for instance make sure that fluctuations correspond to those of a state satisfying $\hat C|\psi\rangle = 0$ even though other moments do not need to come from such a state, the gauge-flow does not become trivial.

Our method of solving quantum constrained systems may be outlined as follows: We start by finding the complete first class set of constraint functions representing the quantum constraint. Setting these to zero defines the constraint surface, with constraint functions generating gauge transformations on it. We construct observables from the gauge invariant functions and recover dynamics, where appropriate, as a gauge transformation of non-observable quantities.

We illustrate this method of solving quantum constraints in specific examples below. There are also general conclusions which can be drawn. As the main requirements, we have to ensure the system of effective constraints to be consistent and complete. Consistency means that the set of all constraints should be first class, if we start with a single classical constraint or a first class set of several constraints. As we will illustrate by examples, this puts restrictions on the form of quantum constraints, related to the ordering of operators used, beyond the basic requirement that they be zero when computed in physical states. To show that the constraints are complete, i.e. they remove all expectation values and quantum variables associated with one canonical pair, we will consider a constraint $\hat C = \hat q$ in Sec. 4.1. Since locally one can always choose a single (irreducible) constraint to be a phase space variable, this will serve as proof that local degrees
of freedom are reduced correctly. (Still, global issues may pose non-trivialities since entire gauge orbits must be factored out when constraints are solved.)

3.1. The form of quantum constraints

At first sight, our definition of quantum constraints may seem problematic. Some of them in (3.3) are defined as expectation values of non-symmetric operators, thus implying complex valued constraint functions. (We specifically do not order symmetrically in (3.3) because this would give rise to terms where some $\hat q$ or $\hat p$ appear to the right while others remain to the left. This would not vanish for physical states and therefore not correspond to a constraint.) This may appear problematic, but one should note that this reality statement is dependent on the (kinematical) inner product used before the constraints are imposed. This inner product in general differs from the physical one if zero is in the continuous part of the spectrum of the constraint and thus reality in the kinematical inner product is not physically relevant. Moreover, in gravitational theories it is common or even required to work with constraint operators which are not self-adjoint [22], and thus complex valued constraints have to be expected in general. For physical statements, which are derived after the constraints have been implemented, only the final reality conditions of the physical inner product are relevant.$^e$

As we will discuss in more detail later, this physical reality can be implemented effectively: We solve the constraints on the quantum phase space, and then impose the condition that the reduced quantum phase space be real. We will see explicitly that complex-valued quantum variables on the unconstrained phase space are helpful to ensure consistency. In parallel to Hilbert space notation, we call quantum variables (2.1) on the original quantum phase space kinematical quantum variables, and those on the reduced quantum phase space physical quantum variables.
Kinematical quantum variables are allowed to take complex values because their reality would only refer to the inner product used on the kinematical Hilbert space. For physical quantum variables in the physical Hilbert space as usually defined, on the other hand, reality conditions must be imposed. 3.1.1. Closure of constrained system Still, it may seem obvious how to avoid the question of reality of the constraints n (p, q)Weyl altogether by using quantum constraints defined as GC f (q,p) = Cˆ n f n n such as GC q and GC p with the symmetric ordering used as in (2.1). Here, the symmetric ordering contained in the definition of quantum variables must leave Cˆ intact as a possibly composite operator, i.e. we have for instance GC,p = 12 Cˆ pˆ + ˆ − Cp independently of the functional form of Cˆ in terms of qˆ and pˆ. Otherwise pˆC e At
least partially, the meaning of reality conditions depends on specifics of the measurement process. This may be further reason to keep an open mind toward reality conditions especially in quantum gravity.
February 11, 2009 13:42 WSPC/148-RMP
J070-00359
Effective Constraints for Quantum Systems
119
it would not be guaranteed that the expectation value vanishes on physical states. We could not include variables with higher powers of q and p, such as $G^{C^n pp}$, as constraints because there would be terms in the totally symmetric ordering (such as $\hat p\hat C^n\hat p$) not annihilating a physical state. But, e.g., $G^{C\hat p^2}$ understood as $\frac{1}{2}\langle \hat C\hat p^2 + \hat p^2\hat C\rangle - Cp^2$ would be allowed. The use of such symmetrically ordered variables would imply real quantum constraints. However, this procedure is not feasible: The constraints would not form a closed set and not even be first class. We have, for instance,

$$
\begin{aligned}
\{G^{C^n f(q,p)}, G^{C^m g(q,p)}\} ={}& \frac{1}{4i\hbar}\langle[\hat C^n\hat f + \hat f\hat C^n, \hat C^m\hat g + \hat g\hat C^m]\rangle \\
& - \frac{g}{2i\hbar}\langle[\hat C^n\hat f + \hat f\hat C^n, \hat C^m]\rangle - \frac{C^m}{2i\hbar}\langle[\hat C^n\hat f + \hat f\hat C^n, \hat g]\rangle \\
& - \frac{f}{2i\hbar}\langle[\hat C^n, \hat C^m\hat g + \hat g\hat C^m]\rangle - \frac{C^n}{2i\hbar}\langle[\hat f, \hat C^m\hat g + \hat g\hat C^m]\rangle \\
& + \{C^n f, C^m g\}.
\end{aligned}
\tag{3.4}
$$
The first commutator contains several terms which vanish when the expectation value is taken in a physical state, but also the two contributions $[\hat C^n, \hat g]\hat C^m\hat f$ and $\hat f\hat C^m[\hat C^n, \hat g]$ whose expectation value in a physical state vanishes only if $\hat f$ or $\hat g$ commute with $\hat C$. This would require quantum observables to be known and used in the quantum constraints, which in general would be too restrictive and impractical. By contrast, the quantum constraints defined above do form a first class system: we have

$$
[\hat f\hat C^n, \hat g\hat C^m] = [\hat f, \hat g]\hat C^{n+m} + \hat f[\hat C^n, \hat g]\hat C^m + \hat g[\hat f, \hat C^m]\hat C^n
\tag{3.5}
$$
whose expectation value in any physical state vanishes. Thus, using these constraints implies that their quantum Poisson brackets vanish on the constraint surface, providing a weakly commuting set:

$$
\{C_f^{(n)}, C_g^{(m)}\} = \frac{1}{i\hbar}\langle[\hat f\hat C^n, \hat g\hat C^m]\rangle \approx 0 .
\tag{3.6}
$$

A further possibility of using Weyl-ordered constraints of a specific form will be discussed briefly in Sec. 3.2, but also this appears less practical in concrete examples than using non-symmetrized constraints. Constraints thus result for all phase space functions f(q, p), but not all constraints in this uncountable set can be independent. For practical purposes, one would like to keep the number of allowed functions to a minimum while keeping the system complete. Then, however, the set of quantum constraints is not guaranteed to be closed for any restricted choice of phase space functions in their definition. If $C_f^{(n)}$ and $C_g^{(m)}$ are quantum constraints, closure requires the presence of $C^{(n+m)}_{[f,g]}$ (for $n \geq 2$), $C^{(m)}_{f[C^n,g]}$ and $C^{(n)}_{g[C^m,f]}$ as additional constraints according to (3.5). This allows the specification of a construction procedure for a closed set of quantum constraints. As we will see in examples later, for a system in canonical variables
(q, p) it is necessary to include at least $C_q^{(n)}$ and $C_p^{(m)}$ in the set of constraints for a complete reduction. With $C^{(n)}_{[q,p]} = i\hbar C^{(n)}$, the first new constraints resulting from a closed constraint algebra add nothing new. However, in general the new constraints $C^{(n)}_{q[C^m,p]}$ and $C^{(n)}_{p[C^m,q]}$ will be independent and have to be included. Iteration of the procedure generates further constraints in a process which may or may not stop after finitely many steps depending on the form of the classical constraint. Although many independent constraints have to be considered for a complete system, most of them will involve quantum variables of a high degree. To a given order in the moments it is thus sufficient to consider only a finite number of constraints which can be determined and analyzed systematically. Such truncations and approximations will be discussed by examples in Secs. 5 and 6.

3.1.2. Number of effective constraints: linear constraint operator

For special classes of constraints one can draw further conclusions at a more general level. In particular for a linear constraint, which shows the local behavior of singly constrained systems, it is sufficient to consider polynomial multiplying functions as we will justify by counting degrees of freedom. Because this counting depends on the number of degrees of freedom, we generalize, in this section only, our previous setting to a quantum system of N + 1 canonical pairs of operators $(\hat q^i, \hat p_i)_{i=1,\dots,N+1}$ satisfying the usual commutation relations $[\hat q^i, \hat p_j] = i\hbar\delta^i_j$. Furthermore, it is sufficient to consider only the case where the constraint itself is one of the canonical variables. Given any constraint operator $\hat C$, linear in the canonical variables, we can always find linear combinations of the canonical operators $((\hat x^i)_{i=1,\dots,2N}; \hat q, \hat p)$ such that $\hat q = \hat C$ and

$$
[\hat q, \hat p] = i\hbar, \qquad [\hat q, \hat x^i] = [\hat p, \hat x^i] = 0, \qquad [\hat x^i, \hat x^j] = i\hbar\,(\delta_{i,j-N} - \delta_{i-N,j}),
$$

i.e. $\hat x^i$ form an algebra of N canonical pairs (i = 1, ..., N and i = N + 1, ..., 2N corresponding to the configuration and momentum operators, respectively).^f For the rest of this subsection we assume the above notation, so that our quantum system is parametrized by the expectation values $q := \langle\hat q\rangle$, $p := \langle\hat p\rangle$, $x^i := \langle\hat x^i\rangle$, i = 1, ..., 2N, and the quantum variables

$$
G^{a_1,a_2,\dots,a_{2N};b,c} := \langle(\hat x^1 - x^1)^{a_1}\cdots(\hat x^{2N} - x^{2N})^{a_{2N}}(\hat p - p)^b(\hat q - q)^c\rangle_{\rm Weyl}
\tag{3.7}
$$

where the operator product is totally symmetrized.

^f The linear combinations that would satisfy the above relations may be obtained by performing a linear canonical transformation on the operators (post-quantization). Such combinations are not unique, but this fact is not important for the purpose of counting the degrees of freedom.

As proposed, we include among the constraints all functions of the form $C_f = \langle\hat f\hat C\rangle$, where $\hat f$ is now any operator polynomial in the canonical variables. This proposition is consistent with $\hat C|\psi\rangle = 0$ and the set of operators of the form $\hat f\hat C$ is closed under taking commutators. As a result the set of all such functions $C_f$ is
first-class with respect to the Poisson bracket induced by the commutator. ($C_f^{(n)}$ is automatically included in the above constraints through $C_{f'}$ where $\hat f' = \hat f\hat C^{n-1}$, which is polynomial in the canonical variables so long as $\hat f$ is.) In principle, we have an infinite number of constraints to restrict an infinite number of quantum variables. To see how the degrees of freedom are reduced, we proceed order by order. Variables of the order M in N + 1 canonical pairs are defined as in Eq. (3.7), with $\sum_i a_i + b + c = M$. The total number of different combinations of this form is the same as the number of ways the positive powers adding up to M can be distributed between 2(N + 1) terms, that is $\binom{M+2(N+1)-1}{2(N+1)-1}$. Solving a single constraint classically results in the (local) removal of one canonical pair. Subsequent quantization of the theory would result in quantum variables corresponding to N canonical pairs. In the rest of the section we demonstrate that our selected form of the constraints leaves unrestricted precisely the quantum variables of the form $G^{a_1,\dots,a_{2N};0,0}$. It is convenient to make another change in variables. We note that in order to permute two non-commuting canonical operators in a product we need to add $i\hbar$ times a lower order product. Starting with a completely symmetrized product of order M and iterating the procedure we can express it in terms of a sum of unsymmetrized products of orders M and below, in some pre-selected order. In particular, we consider variables of the form:

$$
F^{a_1,a_2,\dots,a_{2N};b,c} := \langle(\hat x^1)^{a_1}\cdots(\hat x^{2N})^{a_{2N}}\hat p^b\hat q^c\rangle .
\tag{3.8}
$$
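The counting of order-M variables quoted above can be checked by brute force. The following is a minimal sketch, not part of the original derivation: it enumerates all exponent tuples $(a_1,\dots,a_{2N},b,c)$ of non-negative integers with total M and compares the count with the binomial coefficient. The function names are hypothetical and chosen only for this illustration.

```python
from itertools import product
from math import comb


def count_order_M_variables(M: int, N: int) -> int:
    """Brute-force count of exponent tuples (a_1..a_2N, b, c) summing to M."""
    slots = 2 * (N + 1)  # 2N exponents a_i plus the two exponents b and c
    return sum(1 for exps in product(range(M + 1), repeat=slots) if sum(exps) == M)


def closed_form(M: int, N: int) -> int:
    """Binomial coefficient C(M + 2(N+1) - 1, 2(N+1) - 1) quoted in the text."""
    slots = 2 * (N + 1)
    return comb(M + slots - 1, slots - 1)


# Compare both counts for a few small cases (N + 1 canonical pairs, order M).
for N in (0, 1):
    for M in (1, 2, 3):
        assert count_order_M_variables(M, N) == closed_form(M, N)

# Second-order moments of two canonical pairs (N = 1, M = 2).
print(closed_form(2, 1))
```

For N = 1 and M = 2 this gives the number of second-order moments of two canonical pairs, which is the count used again in the example of Sec. 4.3.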
It is easy to see that there is a one-to-one correspondence between variables (3.7) (combined with the expectation values) and (3.8), but the precise mapping is tedious to derive and not necessary for counting. We can immediately see that our constraints require $F^{a_1,a_2,\dots,a_{2N};b,c} = 0$ for $c \neq 0$. Moreover, all of the constraints $C_f = \langle\hat f\hat C\rangle$ may be written as a combination of the variables $F^{a_1,a_2,\dots,a_{2N};b,c}$, $c \neq 0$ (again, this can be seen by noting that we may rearrange the order of operators in a product by adding terms proportional to lower order products). There are still too many degrees of freedom left as none of the variables $F^{a_1,a_2,\dots,a_{2N};b,0}$ are constrained. At this point, however, we have yet to account for the unphysical degrees of freedom associated with the gauge transformations. Indeed, every constraint induces a flow on the space of quantum variables through the Poisson bracket, associated with the commutator of the algebra of canonical operators. The set of constraints $C_f$ is first-class, which means that the flows they produce preserve constraints and are therefore tangent to the constraint surface. However, not all of the flow-generating vector fields corresponding to the distinct constraints considered above will be linearly independent on the constraint surface because, to a fixed order in moments, we are dealing with a non-symplectic Poisson manifold. The degeneracy becomes obvious when we count the degrees of freedom to a given order. To order M the constraints are accounted for by variables $F^{a_1,a_2,\dots,a_{2N};b,c+1}$, where
$\sum_i a_i + b + c + 1 = M$. Counting as earlier in the section, there are $\binom{M+2(N+1)-2}{2(N+1)-1}$ such variables. Subtracting the number of constraints from the number of quantum variables of order M, we are left with

$$
\begin{aligned}
\binom{M+2(N+1)-1}{2(N+1)-1} - \binom{M+2(N+1)-2}{2(N+1)-1}
&= \left(\frac{M+2(N+1)-1}{M+2(N+1)-1-(2N+1)} - 1\right)\binom{M+2(N+1)-2}{2(N+1)-1} \\
&= \frac{2(N+1)-1}{M}\binom{M+2(N+1)-2}{2(N+1)-1}
\end{aligned}
\tag{3.9}
$$

unrestricted quantum variables. If each constraint does generate an independent non-vanishing flow, we should subtract the number of constraints from the result again to get $\frac{2(N+1)-1-M}{M}\binom{M+2(N+1)-2}{2(N+1)-1}$ physical degrees of freedom of order M. This number becomes negative once M is large enough, raising the possibility that the system has been over-constrained. Fortunately, this is not the case. All of the operators $\hat x^i$ commute with the original constraint operator $\hat C(\equiv \hat q)$, which means that any function of the expectation value of a polynomial in $(\hat x^i)_{i=1,\dots,2N}$, $g = \langle g[\hat x^i]\rangle$, weakly commutes with every constraint:

$$
\begin{aligned}
\{C_f, \langle g[\hat x^i]\rangle\} &= \frac{1}{i\hbar}\langle[\hat f\hat C, g[\hat x^i]]\rangle \\
&= \frac{1}{i\hbar}\langle\hat f[\hat C, g[\hat x^i]] + [\hat f, g[\hat x^i]]\hat C\rangle \\
&= \frac{1}{i\hbar}\langle[\hat f, g[\hat x^i]]\hat C\rangle
\end{aligned}
\tag{3.10}
$$
which vanishes on the constraint surface. This means that the variables $F^{a_1,a_2,\dots,a_{2N};0,0}$ are both unconstrained and unaffected by the gauge flows. They can be used to construct the quantum variables corresponding to precisely N canonical pairs, so that we have at least the correct number of physical degrees of freedom. Finally we show that the variables $F^{a_1,a_2,\dots,a_{2N};b,0}$, $b \neq 0$, are not gauge invariant:

$$
\begin{aligned}
\{C_f, F^{a_1,a_2,\dots,a_{2N};b,0}\} &= \frac{1}{i\hbar}\langle[\hat f\hat C, (\hat x^1)^{a_1}\cdots(\hat x^{2N})^{a_{2N}}\hat p^b]\rangle \\
&= \frac{1}{i\hbar}\langle[\hat f, (\hat x^1)^{a_1}\cdots(\hat x^{2N})^{a_{2N}}\hat p^b]\hat C + i\hbar b\,\hat f(\hat x^1)^{a_1}\cdots(\hat x^{2N})^{a_{2N}}\hat p^{b-1}\rangle \\
&\approx b\,\langle\hat f(\hat x^1)^{a_1}\cdots(\hat x^{2N})^{a_{2N}}\hat p^{b-1}\rangle ,
\end{aligned}
\tag{3.11}
$$

where "≈" denotes equality on the constraint surface. One may still suspect that a gauge may be selected such that the flows on one of these variables vanish, however
this is not the case. Substituting a constraint such that $\hat f = g[\hat x^i]\hat C^{b-1}$, where $g[x^i]$ is some polynomial in 2N variables:

$$
\{C_{gC^{b-1}}, F^{a_1,a_2,\dots,a_{2N};b,0}\} \approx b\,\langle g[\hat x^i]\left((\hat x^1)^{a_1}\cdots(\hat x^{2N})^{a_{2N}}\right)\hat C^{b-1}\hat p^{b-1}\rangle
$$

and commuting all the $\hat C$ to the right one by one, such that $\hat C^{b-1}\hat p^{b-1} = (b-1)!\,(i\hbar)^{b-1} + \cdots$ up to operators of the form $\hat A\hat C$, we have

$$
\{C_{gC^{b-1}}, F^{a_1,a_2,\dots,a_{2N};b,0}\} \approx b!\,(i\hbar)^{b-1}\langle g[\hat x^i]\left((\hat x^1)^{a_1}\cdots(\hat x^{2N})^{a_{2N}}\right)\rangle .
\tag{3.12}
$$

Since the right-hand side is a gauge independent function, (3.12) tells us that it is impossible to get rid of all flows on a given variable $F^{a_1,a_2,\dots,a_{2N};b,0}$ by simply picking a gauge. To summarize: using an alternative set of variables $F^{a_1,a_2,\dots,a_{2N};b,c}$ defined in Eq. (3.8) we find that constraints become $F^{a_1,a_2,\dots,a_{2N};b,c} \approx 0$, $c \neq 0$; the variables $F^{a_1,a_2,\dots,a_{2N};b,0}$, $b \neq 0$, are gauge dependent, which leaves the gauge invariant and unconstrained physical variables $F^{a_1,a_2,\dots,a_{2N};0,0}$. These may then be used to determine directly the physical quantum variables $G^{a_1,\dots,a_{2N};0,0}$ defined in Eq. (3.7). Thus, for a linear constraint a correct reduction in the degrees of freedom is achieved by applying constraints of the form $C_f = \langle\hat f\hat C\rangle$ (polynomial in the canonical variables), as can be directly observed order by order in the quantum variables. Locally, our procedure of effective constraints is complete and consistent since any irreducible constraint can locally be chosen as a canonical coordinate.

3.2. Generating functional

More generally, one can work with a generating functional of all constraints with polynomial-type multipliers, which can then be extended to arbitrary constraints including non-linear ones. To elaborate, we return to a single canonical pair and denote basic operators as $(\hat x^i)_{i=1,2} = (\hat q, \hat p)$ such that they satisfy the Heisenberg algebra $[\hat x^i, \hat x^j] = i\hbar\epsilon^{ij}$, where $\epsilon^{ij}$ are the components of the non-degenerate antisymmetric tensor with $\epsilon^{12} = 1$. We assume that there is a Weyl ordered constraint operator $C(\hat x^i)$ obtained by inserting the basic operators in the classical constraint and then Weyl ordering.
We can generate the Weyl ordered form of all quantum constraints and their algebra through use of a generating functional, defining $C_\alpha(\hat x^i) := e^{\frac{i}{\hbar}\alpha_i\hat x^i}C(\hat x^i)$ for all $\alpha_i \in \mathbb{R}$, which, as we show below, do form a closed algebra. It is clear that $\langle C_\alpha(\hat x^i)\rangle = 0$ for physical states, and thus we have a specific class of infinitely many quantum constraints. This class includes polynomials as multipliers which arise from

$$
\langle\hat q^a\hat p^b\hat C\rangle \propto \left.\frac{\partial^{a+b}}{\partial\alpha_1^a\,\partial\alpha_2^b}\langle C_\alpha(\hat x^i)\rangle\right|_{\alpha=0}
$$

in specific orderings as Weyl ordered versions of $\hat q^a\hat p^b\hat C$ such that expectation values remain zero in physical states because $\langle C_\alpha(\hat x^i)\rangle = 0$ for all α. From Sec. 3.1.1 one
may suspect that this system is not closed, but closure does turn out to be realized. To establish this, we provide several auxiliary calculations. First, we have

$$
\begin{aligned}
[\hat x^{(i_1}\cdots\hat x^{i_n)}, \hat x^j]_+ &= \frac{1}{2}\,\delta^{i_1}_{(j_1}\cdots\delta^{i_n}_{j_n)}\left(\hat x^{j_1}\cdots\hat x^{j_n}\hat x^j + \hat x^j\hat x^{j_1}\cdots\hat x^{j_n}\right) \\
&= \frac{1}{2(n+1)}\,\delta^{i_1}_{(j_1}\cdots\delta^{i_n}_{j_n)}\Bigg(2\sum_{r=0}^{n}\hat x^{j_1}\cdots\hat x^{j_r}\hat x^j\hat x^{j_{r+1}}\cdots\hat x^{j_n} \\
&\qquad + \sum_{r=1}^{n} i\hbar(n+1-r)\,\epsilon^{jj_r}\,\hat x^{j_1}\cdots\hat x^{j_{r-1}}\hat x^{j_{r+1}}\cdots\hat x^{j_n} \\
&\qquad + \sum_{r=1}^{n} i\hbar(n+1-r)\,\epsilon^{j_{n-r}j}\,\hat x^{j_1}\cdots\hat x^{j_{n-r-1}}\hat x^{j_{n-r+1}}\cdots\hat x^{j_n}\Bigg) \\
&= \hat x^{(i_1}\cdots\hat x^{i_n}\hat x^{j)} .
\end{aligned}
\tag{3.13}
$$
Thus, the anticommutator of a Weyl ordered operator with a basic operator is also Weyl ordered. From Baker–Campbell–Hausdorff identities it follows that $e^{\frac{i}{\hbar}\alpha_i\hat x^i}$ acts as a displacement operator

$$
e^{\frac{i}{\hbar}\alpha_i\hat x^i}\,\hat x^j\,e^{-\frac{i}{\hbar}\alpha_i\hat x^i} = \hat x^j + \epsilon^{ji}\alpha_i .
\tag{3.14}
$$
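This also shows the algebra of these operators:

$$
e^{\frac{i}{\hbar}\alpha_i\hat x^i}\,e^{\frac{i}{\hbar}\beta_i\hat x^i}
= e^{\frac{i}{\hbar}\alpha_i\hat x^i + \frac{i}{\hbar}\beta_i\hat x^i - \frac{1}{2\hbar^2}[\alpha\cdot\hat x,\,\beta\cdot\hat x]}
= e^{\frac{i}{\hbar}(\alpha_i+\beta_i)\hat x^i}\,e^{-\frac{i}{2\hbar}\alpha_i\epsilon^{ij}\beta_j} .
\tag{3.15}
$$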
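With this, one can realize the operator $C_\alpha(\hat x^i)$ as

$$
\begin{aligned}
C_\alpha(\hat x^i) := e^{\frac{i}{\hbar}\alpha_i\hat x^i}C(\hat x^i)
&= e^{\frac{i}{2\hbar}\alpha_i\hat x^i}\,C\!\left(\hat x^i + \tfrac{1}{2}\epsilon^{ij}\alpha_j\right)e^{\frac{i}{2\hbar}\alpha_i\hat x^i} \\
&= \sum_{n=0}^{\infty}\frac{1}{n!\,(2\hbar)^n}\sum_{m=0}^{n}\binom{n}{m}(i\alpha\cdot\hat x)^m\,C\!\left(\hat x^i + \tfrac{1}{2}\epsilon^{ij}\alpha_j\right)(i\alpha\cdot\hat x)^{n-m} \\
&= \sum_{n=0}^{\infty}\frac{1}{n!\,(2\hbar)^n}\left[i\alpha\cdot\hat x,\; C\!\left(\hat x^i + \tfrac{1}{2}\epsilon^{ij}\alpha_j\right)\right]_{+n} ,
\end{aligned}
\tag{3.16}
$$

which is manifestly Weyl ordered due to (3.13). Here, we use the iterative definition $[\hat A, \hat C]_{+0} := \hat C$ and $[\hat A, \hat C]_{+n} := [\hat A, [\hat A, \hat C]_{+(n-1)}]_+$. Finally, the algebra of constraints is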
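$$
[C_\alpha(\hat x^i), C_\beta(\hat x^i)] = \left(e^{\frac{i}{\hbar}\alpha_i\hat x^i}C(\hat x^i)\,e^{\frac{i}{\hbar}\beta_i\hat x^i} - e^{\frac{i}{\hbar}\beta_i\hat x^i}C(\hat x^i)\,e^{\frac{i}{\hbar}\alpha_i\hat x^i}\right)C(\hat x^i)
\tag{3.17}
$$

$$
\begin{aligned}
&= \Big(e^{\frac{i}{2\hbar}\alpha_i\hat x^i}\,C\!\left(\hat x^i + \tfrac{1}{2}\epsilon^{ij}\alpha_j\right)e^{\frac{i}{\hbar}\beta_i(\hat x^i + \frac{1}{2}\epsilon^{ij}\alpha_j)}\,e^{\frac{i}{2\hbar}\alpha_i\hat x^i} \\
&\qquad - e^{\frac{i}{2\hbar}\alpha_i\hat x^i}\,e^{\frac{i}{\hbar}\beta_i(\hat x^i - \frac{1}{2}\epsilon^{ij}\alpha_j)}\,C\!\left(\hat x^i - \tfrac{1}{2}\epsilon^{ij}\alpha_j\right)e^{\frac{i}{2\hbar}\alpha_i\hat x^i}\Big)C(\hat x^i) \\
&= \Big(e^{\frac{i}{\hbar}\beta_i\frac{1}{2}\epsilon^{ij}\alpha_j}\,e^{\frac{i}{2\hbar}(\alpha_i+\beta_i)\hat x^i}\,C\!\left(\hat x^i + \tfrac{1}{2}\epsilon^{ij}(\alpha_j - \beta_j)\right)e^{\frac{i}{2\hbar}(\alpha_i+\beta_i)\hat x^i} \\
&\qquad - e^{-\frac{i}{\hbar}\beta_i\frac{1}{2}\epsilon^{ij}\alpha_j}\,e^{\frac{i}{2\hbar}(\alpha_i+\beta_i)\hat x^i}\,C\!\left(\hat x^i - \tfrac{1}{2}\epsilon^{ij}(\alpha_j - \beta_j)\right)e^{\frac{i}{2\hbar}(\alpha_i+\beta_i)\hat x^i}\Big)C(\hat x^i)
\end{aligned}
\tag{3.18}
$$

and thus

$$
[C_\alpha(\hat x^i), C_\beta(\hat x^i)] = \left(e^{\frac{i}{2\hbar}\beta_i\epsilon^{ij}\alpha_j}\,C\!\left(\hat x^i + \tfrac{1}{2}\epsilon^{ij}(\alpha_j - \beta_j)\right) - e^{-\frac{i}{2\hbar}\beta_i\epsilon^{ij}\alpha_j}\,C\!\left(\hat x^i - \tfrac{1}{2}\epsilon^{ij}(\alpha_j - \beta_j)\right)\right)C_{\alpha+\beta}(\hat x^i) .
\tag{3.19}
$$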
This produces a closed set of Weyl ordered and thus real effective constraints, which is uncountable. There are closed subsets obtained by allowing $\alpha_i$ to take values only in a lattice in phase space, but in this case the completeness issue becomes more difficult to address. Moreover, the $C_\alpha$ may be difficult to compute in specific examples. At this stage, we turn to a discussion of specific examples based on polynomial multipliers in quantum constraints, rather than providing further general properties of Weyl ordered effective constraints.

4. Linear Examples

Given that the precise implementation of a set of quantum constraints depends on the form of the constrained system, we illustrate typical properties by examples, starting with linear ones.

4.1. A canonical variable as constraint: $\hat C = \hat q$

From $C^{(n)} = 0$ we obtain that all quantum variables $G^{q^n}$ are constrained to vanish, in addition to $C_Q = q$ itself. $C_q^{(n)}$ is included as $C^{(n+1)}$; adding $C_p^{(n)} = \langle\hat p\hat q^n\rangle$ leaves us with a closed set of constraints, which suffices for discussion of moments up to second-order. At higher orders, one has to include $C^{(n)}_{p^m} = \langle\hat p^m\hat q^n\rangle$. In this example, it is feasible to work with the symmetrically ordered quantum variables since there is an obvious quantum observable $\hat q$ commuting with the constraint. For instance, quantum variables $G^{C^n q}$ and $G^{C^n p}$ form a closed set of constraints as may be deduced from (3.4) and the subsequent discussion. Using the Poisson relations (2.2) we verify the first-class nature of the system of constraints: for b = d = 0 we obviously have $\{G^{a,0}, G^{c,0}\} = 0$, for b = 0 and d = 1 we have $\{G^{a,0}, G^{c,1}\} = a(G^{a+c-1,0} - G^{a-1,0}G^{c,0}) \approx 0$ and for b = d = 1, $\{G^{a,1}, G^{c,1}\} = (a - c)G^{a+c-1,1} - aG^{a-1,1}G^{c,0} + cG^{a,0}G^{c-1,1} \approx 0$. To discuss moments up to second-order, constraints with at most a single power of p are needed. These constraints are in fact equivalent to constraints given by
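quantum variables due to

$$
\begin{aligned}
G^{q^n} &= \langle(\hat q - q)^n\rangle = \sum_{j=0}^{n}\binom{n}{j}(-1)^j q^j\langle\hat q^{n-j}\rangle = \sum_{j=0}^{n-1}\binom{n}{j}(-1)^j q^j C^{(n-j)} + (-1)^n q^n \\
G^{q^n p} &= \frac{1}{n+1}\left\langle(\hat q - q)^n(\hat p - p) + (\hat q - q)^{n-1}(\hat p - p)(\hat q - q) + \cdots + (\hat p - p)(\hat q - q)^n\right\rangle \\
&= \frac{1}{n+1}\left\langle(n+1)(\hat p - p)(\hat q - q)^n + \tfrac{1}{2}i\hbar\,n(n+1)(\hat q - q)^{n-1}\right\rangle \\
&= \langle\hat p\hat q^n\rangle - p\langle(\hat q - q)^n\rangle + \sum_{j=1}^{n}\binom{n}{j}(-1)^j q^j\langle\hat p\hat q^{n-j}\rangle + \tfrac{1}{2}i\hbar\,n\langle(\hat q - q)^{n-1}\rangle \\
&= C_p^{(n)} - pG^{q^n} + \sum_{j=1}^{n-1}\binom{n}{j}(-1)^j q^j C_p^{(n-j)} + (-1)^n q^n p + \tfrac{1}{2}i\hbar\,nG^{q^{n-1}} .
\end{aligned}
\tag{4.1}
$$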
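As an illustration, here is the case n = 1 of (4.1) written out; this is a sketch that uses only the definitions above, together with the normalization $G^{q^0} = \langle 1\rangle = 1$ and the fact that the sum over j is empty for n = 1:

```latex
\begin{aligned}
G^{q} &= C^{(1)} - q = \langle\hat q\rangle - q = 0, \\
G^{qp} &= C_p^{(1)} - p\,G^{q} + (-1)^1 q\,p + \tfrac{1}{2}i\hbar\,G^{q^0}
        = C_p^{(1)} - qp + \tfrac{1}{2}i\hbar .
\end{aligned}
```

On the constraint surface, where $q \approx 0$ and $C_p^{(1)} \approx 0$, only the constant $\tfrac{1}{2}i\hbar$ survives in $G^{qp}$.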
Starting from n = 1 one can iteratively verify that the relations above provide a one-to-one mapping from $\{C^{(n)}, C_p^{(m-1)}\}_{n,m\in\mathbb{N}}$ to $\{G^{q^n}, G^{q^m p}\}_{n,m\in\mathbb{N}}$ which provides specific examples of the relation between (3.7) and (3.8) in Sec. 3.1.2. Thus, the constraint surface as well as the gauge flow can be analyzed using quantum variables. For this type of classical constraint, reordering will only lead to either a constant or to terms depending on quantum variables defined without reference to $\hat p$. Since these are already included in the set of constraints and a constant does not matter for generating canonical transformations, they can be eliminated when computing the gauge flow. Note, however, that there is a constant term $\frac{1}{2}i\hbar$ in $G^{q^n p}$ for n = 1 which will play an important role in determining the constraint surface. The fact that constraints are complex valued does not pose a problem for the gauge flow since imaginary contributions come only with coefficients which are (real) constraints themselves and thus vanish weakly, or are constant and thus irrelevant for the flow. Also the gauge flow generated by the quantum constraints up to second-order can be computed using quantum variables such as $G^{q^n}$ and $G^{q^n p}$ rather than the non-symmetric version. For the moments of different orders, we then have the following constraints and gauge transformations.
(i) Expectation values: one constraint $q \approx 0$ generating one gauge transformation $p \to p + \lambda_1$.
(ii) Fluctuations: two constraints $G^{qq} \approx 0$ and $G^{qp} \approx \text{const}$, generating gauge transformations $G^{pp} \to G^{pp} + 4\lambda_2 G^{qp}$ and $G^{pp} \to G^{pp}(1 + 2\lambda_3)$, respectively. As we will see in Eq. (4.2) below, $G^{qp}$ is non-zero on the constraint surface, so that $G^{pp}$ can be freely rescaled using gauge transformations.
(iii) Higher moments: at each order, we have constraints $C^{(n-m)}_{p^m}$ with m < n and only $G^{p^n}$ is left to be removed by gauge generated e.g. by $G^{q^n}$.
This confirms the counting of Sec. 3.1.2.
Thus, to second-order we see that two moments are eliminated by quantum constraints while the remaining one is gauge. In this way, the quantum variables are eliminated completely either by constraints or by being pure gauge. (Moments such as Gqp were not included in the counting argument of Sec. 3.1.2 in the context of the dimension of the gauge
flow to be factored out. Here, in fact, we verify that the flow generated by $G^{qq}$ suffices to factor out all remaining quantum variables to second-order.) This example also illustrates nicely the role of imaginary contributions to the constraints from the perspective of the kinematical inner product. The constraint $C_p^{(1)} = \langle\hat p\hat q\rangle = 0$ implies that

$$
G^{qp} = \frac{1}{2}\langle\hat q\hat p + \hat p\hat q\rangle - qp = \langle\hat p\hat q\rangle - qp + \frac{1}{2}i\hbar \approx \frac{1}{2}i\hbar
\tag{4.2}
$$
must be imaginary. From the point of view of the kinematical inner product this seems problematic since we are taking the expectation value of a symmetrically ordered product of self-adjoint operators. However, the inner product of the kinematical Hilbert space is only auxiliary, and from our perspective not even necessary to specify. Then, an imaginary value (4.2) of some kinematical quantum variables has a big advantage: It allows us to formulate the quantum constrained system without violating uncertainty relations. For an unconstrained system, we have the generalized uncertainty relation

$$
G^{qq}G^{pp} - (G^{qp})^2 \geq \frac{1}{4}\hbar^2 .
\tag{4.3}
$$
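The role of the imaginary value in (4.2) can be made concrete with a small numerical sketch. It evaluates the left-hand side of (4.3) once for $G^{qq} = 0$, $G^{qp} = \frac{1}{2}i\hbar$ and once for the real choice $G^{qq} = 0 = G^{qp}$. Units with ħ = 1, the sample value of $G^{pp}$ and the function name are assumptions made only for this illustration.

```python
HBAR = 1.0  # assumption: units in which hbar = 1


def uncertainty_lhs(G_qq, G_pp, G_qp):
    """Left-hand side G^qq G^pp - (G^qp)^2 of the uncertainty relation (4.3)."""
    return G_qq * G_pp - G_qp * G_qp


G_pp = 3.7            # arbitrary sample value; it drops out because G^qq = 0
bound = HBAR**2 / 4   # right-hand side of (4.3)

# Complex-valued kinematical moment as in (4.2): G^qq ~ 0, G^qp ~ i*hbar/2.
lhs_complex = uncertainty_lhs(0.0, G_pp, 0.5j * HBAR)

# Real-valued alternative: G^qq ~ 0 ~ G^qp.
lhs_real = uncertainty_lhs(0.0, G_pp, 0.0)

print(abs(lhs_complex - bound) < 1e-12)  # relation saturated
print(lhs_real < bound)                  # relation violated
```

In this sketch the complex value reproduces the bound exactly, while the real choice falls below it for any value of $G^{pp}$.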
This relation, which is important for an analysis of coherent states, would be violated had we worked with real quantum constraints $G^{qq} \approx 0 \approx G^{qp}$ instead of $(C^{(2)}, C_p^{(1)})$. Again, this is not problematic because the uncertainty relation is formulated with respect to the kinematical inner product, which may change. Still, the uncertainty relations are useful to construct coherent states and it is often helpful to have them at one's disposal. They can be formulated without using self-adjointness, but this would require one to treat $\hat q$, $\hat p$ as well as $\hat q^\dagger$ and $\hat p^\dagger$ as independent such that their commutators (needed on the right-hand side of an uncertainty relation) are unknown. The imaginary value of $G^{qp}$ obtained with our definition of the quantum constraints, on the other hand, allows us to implement the constraints in a way respecting the standard uncertainty relation: $-(G^{qp})^2 = \frac{1}{4}\hbar^2$ from (4.2) saturates the relation.

4.2. Discrete momentum as constraint: $\hat C = \hat p$ on a circle

We now assume classical phase space variables $\phi \in S^1$ with momentum p. This requires a non-canonical basic algebra generated by the operators $\hat p$, $\widehat{\sin\phi}$ and $\widehat{\cos\phi}$ with

$$
[\widehat{\sin\phi}, \hat p] = i\hbar\,\widehat{\cos\phi}, \qquad [\widehat{\cos\phi}, \hat p] = -i\hbar\,\widehat{\sin\phi} .
\tag{4.4}
$$
This example can also be seen as a model for isotropic loop quantum cosmology and gravity [23–25]. The constraint operator $\hat C = \hat p$ implies the presence of quantum constraints $C_Q = p$ as well as $C_p^{(n-1)} \approx G^{p^n}$. This is not sufficient to remove all quantum variables by constraints or gauge, and we need to include quantum constraints
referring to φ. Unlike in Sec. 4.1, we cannot take f = φ because there is no operator for φ. If we choose $C^{(n)}_{\sin\phi}$ as starting point, the requirement of a closed set of constraints generates $C^{(n)}_{1\cdot[p,\sin\phi]} = -C^{(n)}_{\cos\phi}$. Taken together, those constraints generate $C^{(n)}_{\sin\phi[p,\cos\phi]} = C^{(n)}_{\sin^2\phi}$, $C^{(n)}_{\sin\phi[p,\sin\phi]} = -C^{(n)}_{\sin\phi\cos\phi}$ and $C^{(n)}_{\cos\phi[p,\sin\phi]} = -C^{(n)}_{\cos^2\phi}$, i.e. all quantum constraints $C^{(n)}_{f(\phi)}$ with a function f depending on φ polynomially of second degree through sin φ and cos φ. Iterating the procedure results in a closed set of constraints $p$, $G^{p^n}$ and $C^{(n)}_{P(\sin\phi,\cos\phi)}$ with arbitrary polynomials P(x, y). In this case, we have independent uncertainty relations for each pair of self-adjoint operators. Relevant for consistency with the constraints is the relation

$$
G^{pp}G^{\cos\phi\cos\phi} - (G^{p\cos\phi})^2 \geq \frac{1}{4}\hbar^2\langle\widehat{\sin\phi}\rangle^2
$$

and its obvious analog exchanging cos φ and sin φ. Also here, one can see as before that the imaginary part of $G^{p\cos\phi} = C^{(1)}_{\cos\phi} - p\langle\widehat{\cos\phi}\rangle + \frac{1}{2}i\hbar\langle\widehat{\sin\phi}\rangle \approx \frac{1}{2}i\hbar\langle\widehat{\sin\phi}\rangle$ allows one to respect the uncertainty relation even though $G^{pp} \approx 0$. Note that this is similar to the previous example, although now zero being in the discrete spectrum of $\hat p$ would allow one to use a physical Hilbert space as a subspace of the kinematical one whose reality conditions could thus be preserved. If this is done, $G^{p\cos\phi}$ must be real even kinematically because the kinematical inner product determines the physical one just by restriction. Demanding both $G^{p\cos\phi}$ and $\langle\widehat{\sin\phi}\rangle$ to be real, the only way to satisfy $G^{p\cos\phi} \approx \frac{1}{2}i\hbar\langle\widehat{\sin\phi}\rangle$ is to set $G^{p\cos\phi} \approx 0$ and $\langle\widehat{\sin\phi}\rangle \approx 0$. Therefore, in this example, the uncertainty relation above is automatically saturated even for real kinematical quantum variables. Alternatively, if one knows that the constraint is represented as a self-adjoint operator with zero in the discrete part of its spectrum, the same relations can be recovered by appealing directly to the existence of creation and annihilation operators which map zero eigenstates of the constraint to other states in the kinematical Hilbert space. For these operators to exist, the physical Hilbert space must indeed be a subspace of the kinematical Hilbert space (given by zero eigenstates of the constraint operator and the inner product on those states) such that this argument explicitly refers to the discrete spectrum case only.
Using this information about the quantum representation makes it possible to do the reduction of effective constraints without introducing complex-valued kinematical quantum variables. Indeed, in our case $\hat a^\dagger = \widehat{\cos\phi} + i\,\widehat{\sin\phi}$ and $\hat a = \widehat{\cos\phi} - i\,\widehat{\sin\phi}$, respectively, raise and lower the discrete eigenvalues of $\hat p$ represented on the Hilbert space $L^2(S^1, d\phi)$. For any eigenstate of $\hat p$, then, $\langle\hat a^\dagger\rangle = \langle\widehat{\cos\phi}\rangle + i\langle\widehat{\sin\phi}\rangle = 0$ and $\langle\hat a\rangle = \langle\widehat{\cos\phi}\rangle - i\langle\widehat{\sin\phi}\rangle = 0$. Thus, we again derive that the right-hand side of uncertainty relations vanishes in physical states, making real-valued kinematical quantum variables consistent. Moreover, this example shows that for a constraint with zero in the discrete part of its spectrum, additional constraints follow which can be used to eliminate variables which in the general effective treatment appear as gauge. In fact, all moments involving sin φ or cos φ are constrained to vanish if $\langle\hat a^n\rangle = 0 = \langle(\hat a^\dagger)^m\rangle$ is used for physical states. In this case, no gauge flow is necessary to factor out these moments, but
in contrast to the gauge flow itself this can only be seen based on representation properties. Using complex valued kinematical quantum variables turns out to be more general and applicable to constraints with zero in the discrete or continuous spectrum. For systems with zero in the discrete spectrum, this can be avoided but requires one to refer explicitly to properties of the quantum representation or the operator algebra.

4.3. Two component system with constraint: $\hat C = \hat p_1 - \hat p$

As an example which can be interpreted as a parameterized version of an unconstrained system, we consider a system with a 4-dimensional phase space and phase space coordinates $(q, p; q_1, p_1)$. The system is governed by a linear constraint

$$
C_Q = p_1 - p .
\tag{4.5}
$$
The classical constraint can, of course, be transformed canonically to a constraint which is identical to one of the phase space coordinates since $(\frac{1}{2}(q_1 - q), C; \frac{1}{2}(q_1 + q), p_1 + p)$ forms a system of canonical coordinates and momenta containing $C = p_1 - p$. Moreover, the transformation is linear and can easily be taken over to the quantum level as a unitary transformation. The orders of moments do not mix under such a linear transformation, and thus the arguments put forward in Sec. 4.1 can directly be used to conclude that the system discussed here is consistent and complete. Nevertheless, it is instructive to look at details of the procedure without doing such a transformation, which will serve as a guide for more complicated cases. Expectation values satisfy the classical gauge transformations

$$
-\dot q = 1 = \dot q_1, \qquad \dot p = 0 = \dot p_1 .
\tag{4.6}
$$
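For orientation, the flow (4.6) follows directly from the constraint (4.5). The following sketch assumes the standard brackets $\{q, p\} = 1 = \{q_1, p_1\}$ and introduces a gauge parameter λ purely for illustration:

```latex
\begin{aligned}
\dot q &= \{q, p_1 - p\} = -1, &\qquad \dot q_1 &= \{q_1, p_1 - p\} = 1, \\
\dot p &= \{p, p_1 - p\} = 0, &\qquad \dot p_1 &= \{p_1, p_1 - p\} = 0, \\
q(\lambda) &= q(0) - \lambda, &\qquad q_1(\lambda) &= q_1(0) + \lambda .
\end{aligned}
```

The combination $q + q_1$ is therefore gauge invariant, in line with $\frac{1}{2}(q_1 + q)$ appearing as a canonical coordinate that commutes with $C$ in the transformation quoted above.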
At this point, we recall that, before constraints have been solved, there are no reality or positivity conditions for the kinematical quantum variables

$$
G^{a,b}_{c,d} = \langle(\hat p - p)^a(\hat q - q)^b(\hat p_1 - p_1)^c(\hat q_1 - q_1)^d\rangle_{\rm Weyl} .
\tag{4.7}
$$

Their gauge transformations are

$$
\dot G^{a,b}_{c,d} = 0 .
\tag{4.8}
$$
Even though these variables remain constant, as do those of the deparameterized system, here we have additional moments compared to an unconstrained canonical pair: solving the constraints has to eliminate all quantum variables with respect to one canonical pair, but also cross-correlations to the unconstrained pair.

4.3.1. Constraints

In addition to gauge transformations (4.6) and (4.8) generated by the principal quantum constraint $C_Q = C^{(1)}$, the system is subject to further constraints and their gauge transformations. As explained above, the quantum constraints have to
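form a complete, first class set. Such a set is given by

$$
\begin{aligned}
C^{(n)} &= \sum_{m=0}^{n}\sum_{k=0}^{m}\sum_{\ell=0}^{n-m}\binom{n}{m}\binom{m}{k}\binom{n-m}{\ell}(-1)^{n-m}p_1^k p^\ell\, G^{n-m-\ell,0}_{m-k,0} \\
C_q^{(n)} &= \sum_{m=0}^{n}\sum_{k=0}^{m}\sum_{\ell=0}^{n-m}\binom{n}{m}\binom{m}{k}\binom{n-m}{\ell}(-1)^{n-m}p_1^k p^\ell \left(G^{n-m-\ell,1}_{m-k,0} - \frac{i\hbar}{2}(n-m-\ell)\,G^{n-m-\ell-1,0}_{m-k,0}\right) \\
C_p^{(n)} &= \sum_{m=0}^{n}\sum_{k=0}^{m}\sum_{\ell=0}^{n-m}\binom{n}{m}\binom{m}{k}\binom{n-m}{\ell}(-1)^{n-m}p_1^k p^\ell \left(p\,G^{n-m-\ell,0}_{m-k,0} + G^{n-m-\ell+1,0}_{m-k,0}\right) \\
C_{p_1}^{(n)} &= \sum_{m=0}^{n}\sum_{k=0}^{m}\sum_{\ell=0}^{n-m}\binom{n}{m}\binom{m}{k}\binom{n-m}{\ell}(-1)^{n-m}p_1^k p^\ell \left(p_1\,G^{n-m-\ell,0}_{m-k,0} + G^{n-m-\ell,0}_{m-k+1,0}\right) \\
C_{q_1}^{(n)} &= \sum_{m=0}^{n}\sum_{k=0}^{m}\sum_{\ell=0}^{n-m}\binom{n}{m}\binom{m}{k}\binom{n-m}{\ell}(-1)^{n-m}p_1^k p^\ell \left(G^{n-m-\ell,0}_{m-k,1} + \frac{i\hbar}{2}(m-k)\,G^{n-m-\ell,0}_{m-k-1,0}\right) .
\end{aligned}
$$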
These constraints are accompanied by analogous expressions involving polynomial factors of the basic operators, which we will not be using to the orders considered here. In this section we solve our constraints as given to second-order in quantum variables and determine the gauge orbits they generate. The procedure generalizes to higher orders. At this point, a further choice arises: we need to determine which variables we want to solve in terms of others which are to be kept free. This is related to the choice of time in a deparametrization procedure: In the absence of a Hamiltonian and an absolute time, one variable is selected, whose change is used to describe the relative evolution of other variables; we refer to Sec. 5.2 for further discussions of time deparametrization. Here, we view q1 as the time variable which is demoted from a physical variable to the status of an evolution parameter, and thus H = p will be the Hamiltonian generating evolution in this time. Notice that time is chosen after quantization when dealing with effective constraints. (For our linear constraint, of course, the roles of the two canonical pairs can be exchanged, with q playing the role of time.) Classically, it is then straightforward to solve the constraint and discuss gauge, and the same applies to expectation values in the quantum theory. The discussion of quantum variables is, however, non-trivial and is therefore presented here in
some detail for second-order moments. Having made a choice of time, a complete deparametrization requires that all quantum variables of the form $G^{a_1,b_1}_{a_2,b_2}$ with $a_2 \neq 0$ or $b_2 \neq 0$ be completely constrained or removed by gauge. Only quantum variables $G^{a,b}_{0,0}$ are allowed to remain free, and must do so without any further restrictions. To second-order, the deparametrized system has 2 + 3 = 5 variables; the parametrized theory has 4 + 10 = 14. We begin by eliminating quantum variables in favor of the variables associated with the canonical pair $(q_1, p_1)$ only. From the fact that, on the one hand, $G_{p_1p_1}$, $G_{p_1q_1}$ and $G_{q_1q_1}$ should satisfy the uncertainty relations and thus cannot all vanish but, on the other hand, are not present in the unconstrained system, we expect at least one of them to be removed by gauge. Differences to the classical treatment first arise for second-order moments, which we now discuss. At this second-order,^g i.e. keeping only second-order moments as well as terms linear in ħ, the constraints form a closed and complete system given by

$$
\begin{aligned}
C^{(n)}|_{N=2} &= c_n + d_n\left(G^{pp} + G_{p_1p_1} - 2G^{p}_{p_1}\right) \\
C_q^{(n)}|_{N=2} &= a_n\left(G^{pq} - G^{q}_{p_1} - \frac{i\hbar}{2}\right) + b_n\left(G^{pp} - 2G^{p}_{p_1} + G_{p_1p_1}\right) \\
C_p^{(n)}|_{N=2} &= a_n\left(G^{pp} - G^{p}_{p_1}\right) \\
C_{p_1}^{(n)}|_{N=2} &= a_n\left(G^{p}_{p_1} - G_{p_1p_1}\right) \\
C_{q_1}^{(n)}|_{N=2} &= a_n\left(G^{p}_{q_1} - G_{p_1q_1} - \frac{i\hbar}{2}\right) + c_n\left(G_{p_1p_1} - 2G^{p}_{p_1} + 2c_nG^{pp}\right),
\end{aligned}
$$

where

$$
\begin{aligned}
a_n &= a_n(p_1, q) \equiv -n\,(C^{(1)})^{n-1}, & b_n &= b_n(p_1, q) \equiv \frac{i\hbar}{2}\,\frac{n}{2}(n-1)(n-2)\,(C^{(1)})^{n-3}, \\
c_n &= c_n(p_1, q) \equiv (C^{(1)})^{n}, & d_n &= d_n(p_1, q) \equiv \frac{n}{2}(n-1)\,(C^{(1)})^{n-2},
\end{aligned}
$$
and C (1) = p1 − p is the linear constraint which in this case is identical with the classical constraint. Due to the fact that the prefactors in the constraint equations contain C (1) , we find non-trivial constraints only when the exponent of C (1) vanishes. This happens for a1 , b3 and d2 , while cn vanishes for all n. For higher n no additional constraints arise. Constraints arising for n = 2, 3 turn out to be linear combinations of the constraints arising for n = 1. Therefore we find for the second-order system only five independent constraints: C (1) |N =2 = p1 − p and Cq(1) |N =2 = −
i − Gqp + Gqp1 , 2
Cp(1) | = Gp1 p1 − Gpp1 , 1 N =2 g The
Cp(1) |N =2 = Gpp1 − Gpp
Cq(1) | = 1 N =2
moment expansion is formalized in Sec. 6.1.
i − Gpq1 + Gq1 p1 . 2
From these equations it is already obvious that four second-order moments referring to $q_1$ or $p_1$ can be eliminated through the use of constraints. In addition to $p_1 = p$ for expectation values, these are
\[
G^{qp_1} \approx \tfrac{1}{2}i\hbar + G^{qp}, \qquad G^{pp_1} \approx G^{pp}, \qquad G^{p_1p_1} \approx G^{pp_1} \approx G^{pp} \tag{4.9}
\]
as well as
\[
G^{pq_1} \approx \tfrac{1}{2}i\hbar + G^{q_1p_1} \tag{4.10}
\]
which is not yet completely expressed in terms of moments only of $(q, p)$. The remaining moments of $(q_1, p_1)$ are not constrained at all, and thus must be eliminated by gauge transformations. To summarize, three expectation values are left unconstrained, one of which should be unphysical; six second-order variables are unconstrained, three of which should be unphysical. Notice that there is no contradiction to the fact that we have four weakly commuting (and independent) constraints but expect only three variables to be removed by gauge. These are constraints on the space of second-order moments, which, in this truncation, as noted before do not have a non-degenerate Poisson bracket (although the space of all moments has a non-degenerate symplectic structure). Weak commutation then does not imply first class nature in the traditional sense (see e.g. [21]), and four weakly commuting constraints may declare fewer than four variables as gauge. While the constraints as functionals are independent, their gauge flows may be linearly dependent.
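The counting used in this subsection can be made explicit. The following is a minimal Python sketch, not part of the original analysis: the helper names `num_moments` and `num_variables` are ours, and the only assumption is that moments are symmetric in their indices, so that a moment of a given order corresponds to a multiset of phase-space variables.

```python
from math import comb

def num_moments(num_vars, order):
    """Number of independent symmetric moments of the given order
    in num_vars phase-space variables (multisets of size order)."""
    return comb(num_vars + order - 1, order)

def num_variables(num_vars, max_order=2):
    """Expectation values plus all moments of order 2..max_order."""
    return num_vars + sum(num_moments(num_vars, k)
                          for k in range(2, max_order + 1))

# Deparametrized system: one canonical pair (q, p) -> 2 + 3 variables.
deparametrized = num_variables(2)
# Parametrized system: two canonical pairs (q, p; q1, p1) -> 4 + 10 variables.
parametrized = num_variables(4)

# Of the 10 second-order moments, 4 are fixed by constraints, cf. (4.9)-(4.10),
# and 3 of the remaining 6 are expected to be pure gauge.
physical_second_order = num_moments(4, 2) - 4 - 3
```

With these definitions `deparametrized` and `parametrized` reproduce the totals 5 and 14 quoted above, and `physical_second_order` gives the three physical second-order variables.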
4.3.2. Observables

To explicitly account for the unphysical degrees of freedom, we consider the gauge transformations generated by the constraints. The quantum constraint $p_1 - p \approx 0$ produces a flow on the expectation values only, which agrees with the classical flow (4.6). The second-order constraints produce no (independent)^h flow on the expectation values. Also $G^{pp}$ is gauge invariant. For the five remaining free second-order variables, $G^{pp_1} - G^{pp} \approx 0$ generates a flow (on the constraint surface):
\[
\begin{aligned}
\delta G^{qp} &= G^{pp_1} - 2G^{pp} \approx -G^{pp},\\
\delta G^{qq} &= 2G^{qp_1} - 4G^{qp} \approx i\hbar - 2G^{qp},\\
\delta G^{q_1p_1} &= G^{pp_1} \approx G^{pp},\\
\delta G^{qq_1} &= G^{q_1p_1} + G^{qp} - 2G^{pq_1} \approx G^{qp} - G^{q_1p_1} - i\hbar,\\
\delta G^{q_1q_1} &= 2G^{pq_1} \approx i\hbar + 2G^{q_1p_1}.
\end{aligned}
\tag{4.11}
\]

^h The parts of the second-order constraints proportional to $C^{(1)}$ that have been discarded can also be ignored when computing the flows generated on the constraint surface, as the missing contributions are proportional to the gauge flow associated with $C^{(1)}$. This is true in general, and extends to higher orders.
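$G^{pp_1} - G^{p_1p_1} \approx 0$ gives:
\[
\begin{aligned}
\delta G^{qp} &\approx G^{pp}, & \delta G^{qq} &\approx i\hbar + 2G^{qp}, & \delta G^{q_1p_1} &\approx -G^{pp},\\
\delta G^{qq_1} &\approx G^{q_1p_1} - G^{qp} - i\hbar, & \delta G^{q_1q_1} &\approx i\hbar - 2G^{q_1p_1}.
\end{aligned}
\tag{4.12}
\]
$\tfrac{1}{2}i\hbar + G^{qp} - G^{qp_1} \approx 0$ gives:
\[
\begin{aligned}
\delta G^{qp} &\approx \tfrac{1}{2}i\hbar + G^{qp}, & \delta G^{qq} &\approx 2G^{qq}, & \delta G^{q_1p_1} &\approx -\tfrac{1}{2}i\hbar - G^{qp},\\
\delta G^{qq_1} &\approx G^{qq_1} - G^{qq}, & \delta G^{q_1q_1} &\approx -2G^{qq_1}.
\end{aligned}
\tag{4.13}
\]
$\tfrac{1}{2}i\hbar - G^{pq_1} + G^{q_1p_1} \approx 0$ gives:
\[
\begin{aligned}
\delta G^{qp} &\approx -\tfrac{1}{2}i\hbar - G^{q_1p_1}, & \delta G^{qq} &\approx -2G^{qq_1}, & \delta G^{q_1p_1} &\approx \tfrac{1}{2}i\hbar + G^{q_1p_1},\\
\delta G^{qq_1} &\approx G^{qq_1} - G^{q_1q_1}, & \delta G^{q_1q_1} &\approx 2G^{q_1q_1}.
\end{aligned}
\tag{4.14}
\]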
All of the gauge flows obey
\[
\delta G^{qp} = -\delta G^{q_1p_1}, \qquad \delta G^{qq_1} = -\tfrac{1}{2}\left(\delta G^{qq} + \delta G^{q_1q_1}\right). \tag{4.15}
\]
Thus, in addition to $A_1 := G^{pp}$ we can identify the observables $A_2 := G^{qq} + 2G^{qq_1} + G^{q_1q_1}$ and $A_3 := G^{qp} + G^{q_1p_1}$. They satisfy the algebra $\{A_1, A_3\} = -2A_1$, $\{A_1, A_2\} \approx -4(A_3 + \tfrac{1}{2}i\hbar)$, $\{A_2, A_3\} = 2A_2$ on the constraint surface which, except for the imaginary term, agrees with the Poisson algebra expected for unconstrained quantum variables of second order. The imaginary term can easily be absorbed into the definition of $A_3$, which leads us to the physical quantum variables
\[
\mathcal{G}^{qq} := G^{qq} + 2G^{qq_1} + G^{q_1q_1}, \qquad \mathcal{G}^{pp} := G^{pp}, \qquad \mathcal{G}^{qp} := G^{qp} + G^{q_1p_1} + \tfrac{1}{2}i\hbar. \tag{4.16}
\]
They commute with all the constraints and satisfy the standard algebra for second-order moments, thus providing the correct representation. To implement the physical inner product, we simply demand that all the physical quantum variables be real. This means that $G^{qp} + G^{q_1p_1}$ must carry the imaginary contribution $-\tfrac{1}{2}i\hbar$, which is possible for kinematical quantum variables.
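The invariance of $A_2$ and $A_3$ can be checked mechanically from the four flows (4.11)-(4.14). The sketch below is ours and only illustrative: it encodes the on-shell right-hand sides of those flows as we read them, with string labels such as `"q1p1"` standing for $G^{q_1p_1}$, and evaluates the induced variations of $A_2$ and $A_3$ at random complex values of the moments. The value of `HBAR` is an arbitrary test number.

```python
import random

HBAR = 2.0  # arbitrary test value, not a physical choice

LABELS = ("pp", "qp", "qq", "q1p1", "qq1", "q1q1")

def flows(G, hbar=HBAR):
    """On-shell flows (4.11)-(4.14): each dict maps a moment label
    to its variation under one of the four second-order constraints."""
    ih = 1j * hbar
    return [
        {"qp": -G["pp"], "qq": ih - 2 * G["qp"], "q1p1": G["pp"],
         "qq1": G["qp"] - G["q1p1"] - ih, "q1q1": ih + 2 * G["q1p1"]},   # (4.11)
        {"qp": G["pp"], "qq": ih + 2 * G["qp"], "q1p1": -G["pp"],
         "qq1": G["q1p1"] - G["qp"] - ih, "q1q1": ih - 2 * G["q1p1"]},   # (4.12)
        {"qp": ih / 2 + G["qp"], "qq": 2 * G["qq"],
         "q1p1": -ih / 2 - G["qp"],
         "qq1": G["qq1"] - G["qq"], "q1q1": -2 * G["qq1"]},              # (4.13)
        {"qp": -ih / 2 - G["q1p1"], "qq": -2 * G["qq1"],
         "q1p1": ih / 2 + G["q1p1"],
         "qq1": G["qq1"] - G["q1q1"], "q1q1": 2 * G["q1q1"]},            # (4.14)
    ]

def variations(d):
    """Variations of A2 = Gqq + 2 Gqq1 + Gq1q1 and A3 = Gqp + Gq1p1.
    Their vanishing is equivalent to the two relations in (4.15)."""
    dA2 = d["qq"] + 2 * d["qq1"] + d["q1q1"]
    dA3 = d["qp"] + d["q1p1"]
    return dA2, dA3

def max_violation(seed=0):
    """Largest |delta A2| or |delta A3| over all four flows
    at a random point of the constraint surface."""
    rng = random.Random(seed)
    G = {k: complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for k in LABELS}
    return max(abs(v) for d in flows(G) for v in variations(d))
```

For any seed, `max_violation` should return a number at the level of floating-point round-off, consistent with $A_2$ and $A_3$ being gauge invariant on the constraint surface.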
4.3.3. Gauge fixing

In fact, one can choose a gauge where all physical quantum variables agree with the kinematical quantum variables of the pair $(q, p)$, and kinematical quantum variables of the pair $(q_1, p_1)$ satisfy $G^{p_1p_1} = 0$ and $G^{q_1p_1} = -\tfrac{1}{2}i\hbar$. This choice violates kinematical reality conditions, but it ensures physical reality and preserves the kinematical uncertainty relation even though one fluctuation vanishes. Other gauge choices are possible since only $G^{qp} + G^{q_1p_1}$ is required to have imaginary part $-\tfrac{1}{2}\hbar$ for real $\mathcal{G}^{qp}$, which can be distributed in different ways between the two moments. Thus, there are different choices of the kinematical reality conditions. Such gauge choices may be related to some of the freedom contained in choosing
the kinematical Hilbert space which would similarly affect the reality of kinematical quantum variables.

The algebra of the physical variables can be recovered without the knowledge of their explicit form as observables, by completely fixing the gauge degrees of freedom and using the Dirac bracket to find the Poisson structure on the remaining free parameters. We introduce gauge conditions $\phi_i = 0$ which together with the second-order constraints define a symplectic subspace $\Sigma_\phi$ of the space of second-order quantum variables. Our conditions should fix the gauge freedom entirely, which means that the flow due to any remaining first class constraints should vanish on $\Sigma_\phi$. (We recall that the space of second-order moments does not form a symplectic subspace of the space of all moments, but it does define a Poisson manifold. In such a situation, not all first class constraints need to be gauge-fixed to obtain a symplectic gauge-fixing surface.) In order to ensure that the conditions put no restrictions on the physical degrees of freedom, we demand that no non-trivial function of the gauge conditions be itself gauge invariant.

The simple gauge discussed above corresponds to $\phi_4 = G^{p_1q_1} + \tfrac{1}{2}i\hbar = 0$, $\phi_5 = G^{q_1q_1} = 0$ and $\phi_6 = G^{qq_1} = 0$. Under these conditions $C_{q_1}^{(1)}$ remains first class but has a vanishing flow (4.14) on the surface $\Sigma_\phi$. The other second-order constraints now form a second class system when combined with the gauge conditions. The combination of constraints and gauge fixing conditions eliminates all second-order variables except for $G^{pp}$, $G^{qq}$ and $G^{qp}$, which therefore parameterize $\Sigma_\phi$. Labeling $\phi_1 = C_p^{(1)}$, $\phi_2 = C_q^{(1)}$, $\phi_3 = C_{p_1}^{(1)}$, the commutator matrix $\Delta_{ij} := \{\phi_i, \phi_j\}$ on $\Sigma_\phi$,
\[
\Delta|_{\Sigma_\phi} =
\begin{pmatrix}
0 & 0 & 0 & -G^{pp} & 0 & \tfrac{1}{2}i\hbar - G^{qp}\\
0 & 0 & 0 & \tfrac{1}{2}i\hbar + G^{qp} & 0 & G^{qq}\\
0 & 0 & 0 & G^{qq} & -2i\hbar & \tfrac{1}{2}i\hbar + G^{qp}\\
G^{pp} & -\tfrac{1}{2}i\hbar - G^{qp} & -G^{qq} & 0 & 0 & 0\\
0 & 0 & 2i\hbar & 0 & 0 & 0\\
G^{qp} - \tfrac{1}{2}i\hbar & -G^{qq} & -\tfrac{1}{2}i\hbar - G^{qp} & 0 & 0 & 0
\end{pmatrix}
\]
is invertible. The Dirac bracket
\[
\{f, g\}_{\rm Dirac} := \{f, g\} - \{f, \phi_i\}\,(\Delta^{-1})^{ij}\,\{\phi_j, g\}
\]
for a second class system of constraints can easily be computed for the remaining free parameters $G^{qq}$, $G^{pp}$ and $G^{qp}$, recovering precisely the algebra satisfied by the physical quantum variables (4.16). Thus, fixing the gauge freedom entirely, we recover the physical Poisson algebra. In a general situation, where finding the explicit form of observables is more difficult, this alternative method of obtaining their Poisson algebra is easier to utilize.
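Invertibility of the commutator matrix can be tested numerically. The sketch below is ours and rests on one assumption: that $\Delta|_{\Sigma_\phi}$ has the antisymmetric block form $\begin{pmatrix}0 & B\\ -B^T & 0\end{pmatrix}$ with the $3\times 3$ block $B$ coded in `off_diagonal_block` (our reading of the matrix entries). For such a block form with $3\times 3$ blocks, $\det\Delta = (\det B)^2$, so it suffices to evaluate $\det B$. The helper names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested sequences."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def off_diagonal_block(Gpp, Gqq, Gqp, hbar):
    """Assumed upper-right block B of Delta = [[0, B], [-B^T, 0]]."""
    ih = 1j * hbar
    return [
        [-Gpp,         0,       ih / 2 - Gqp],
        [ih / 2 + Gqp, 0,       Gqq],
        [Gqq,          -2 * ih, ih / 2 + Gqp],
    ]

def closed_form_det_B(Gpp, Gqq, Gqp, hbar):
    """Expansion of det B along its second column."""
    return -2j * hbar * (Gpp * Gqq - Gqp ** 2 - hbar ** 2 / 4)

def det_delta(Gpp, Gqq, Gqp, hbar):
    """det(Delta) = det(B)**2 for the 6x6 block form with 3x3 blocks."""
    return det3(off_diagonal_block(Gpp, Gqq, Gqp, hbar)) ** 2
```

Under this reading, $\det B = -2i\hbar\,(G^{pp}G^{qq} - (G^{qp})^2 - \hbar^2/4)$, so $\Delta$ would fail to be invertible only where the $(q, p)$ moments saturate the uncertainty relation; for example `det_delta(1, 1, 0, 2)` vanishes while `det_delta(2, 3, 1, 2)` does not.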
5. Truncations

Linear constraints show that consistency and completeness are satisfied in our formulation of effective constraints. Locally, every constraint can be linearized by a canonical transformation, but global issues may be important especially in the quantum theory. Moreover, moments transform in complicated ways under general canonical transformations, mixing the orders of quantum variables. We will thus discuss nonlinear examples to show the practicality of our procedures. Before doing so, we provide a more systematic analysis of the treatment of infinitely many constraints as they arise on the quantum phase space.

The above examples only considered quantum variables up to second order. A reduction of this form is always necessary if one intends to derive effective equations from a constrained system. For practical purposes, infinite dimensional systems have to be reduced to a certain finite order of quantum variables so that one can actually retrieve some information from the system. There are two possibilities to do this: an approximate solution scheme order by order, or a sharp truncation, i.e. discarding all quantum variables above some order. It is then necessary to check whether the system of constraints can still be formulated in a consistent way after such a reduction has been carried out. A priori one cannot assume, for instance, that a sharply truncated system of constraints has any non-trivial solution at all. It may turn out that all degrees of freedom are removed by the truncated constraints. Also it is not clear how many (truncated) constraints have to be taken into account at a certain order of the truncation.

In this section, we first consider a linear example and show that it can consistently be truncated. We then turn to the more elaborate and more physical example of the parametrized free, nonrelativistic particle. Here, sharp truncations turn out to be inconsistent.
While this makes sharp truncations unreliable as a general tool, it is instructive to go through examples where they are inconsistent. The following section will then be devoted to consistent approximations without a sharp truncation.

5.1. Truncated system of constraints for $\hat{C} = \hat{q}$

The system as in Sec. 4.1 is governed by a constraint $C_{\rm class} = q$ which on the quantum level entails the constraint operator $\hat{C} = \hat{q}$. (We explicitly denote the classical constraint as $C_{\rm class}$ because by our general rule we reserve the letter $C$ for the expectation value $\langle\hat{C}\rangle$.) This implies the following constraints on the quantum phase space:
\[
\begin{aligned}
C^{(n)} &= \langle\hat{C}^n\rangle = C_{\rm class}^n + \sum_{j=0}^{n-1}\binom{n}{j}\, C_{\rm class}^j\, G^{0,n-j},\\
C_q^{(n)} &= \langle\hat{q}\,\hat{C}^n\rangle = C^{(n+1)},\\
C_p^{(n)} &= \langle\hat{p}\,\hat{C}^n\rangle = p\,C_{\rm class}^n + p\sum_{j=0}^{n-1}\binom{n}{j}\, C_{\rm class}^j\, G^{0,n-j}
 + \sum_{j=0}^{n-1}\binom{n}{j}\, C_{\rm class}^j\left(\frac{1}{a_{n-j}}\,G^{1,n-j} - i\hbar\,\frac{(n-j)^2}{(n-j+1)^2}\,G^{0,n-j-1}\right),
\end{aligned}
\]
where $a_{n-j}$ are constant coefficients. These are accompanied by similar expressions of higher polynomial constraints, i.e. $C^{(n)}_{p^m}$, which are more lengthy in explicit form due to the reordering involved in quantum variables. The lowest power constraint yields $C^{(1)} = C_{\rm class} \approx 0$. Inserting this, the higher power constraints reduce to
\[
C^{(n)} \approx G^{0,n}, \qquad C_q^{(n)} \approx G^{0,n+1}, \qquad C_p^{(n)} \approx p\,G^{0,n} + \frac{1}{a_n}\,G^{1,n} - i\hbar\,\frac{n^2}{(n+1)^2}\,G^{0,n-1}.
\]
Performing a sharp truncation at $N$th order we set $G^{a,b} = 0$ for all $a + b > N$. As non-trivial constraints remain
\[
\begin{aligned}
C^{(n)}|_N &\approx G^{0,n} && \text{for all } n \le N,\\
C_p^{(n)}|_N &\approx p\,G^{0,n} + \frac{1}{a_n}\,G^{1,n} - i\hbar\,\frac{n^2}{(n+1)^2}\,G^{0,n-1} && \text{for all } n \le N-1,\\
C_p^{(N)}|_N &\approx p\,G^{0,N} - i\hbar\,\frac{N^2}{(N+1)^2}\,G^{0,N-1} && \text{for } n = N.
\end{aligned}
\]
Solving the quantum constraints $C^{(n)} \approx 0$ and inserting the solutions into the constraints $C_p^{(n)}$ yields
\[
\begin{aligned}
C_p^{(n)}|_N &\approx \frac{1}{a_n}\,G^{1,n} && \text{for all } n \le N-1,\\
C_p^{(n)}|_N &\approx 0 && \text{for all } n \ge N.
\end{aligned}
\]
Thus we find that for the truncated system, $G^{0,n}$ are eliminated through the constraints $C^{(n)} = 0$, whereas the quantum variables $G^{1,n}$ are eliminated through $C_p^{(n)} = 0$. Higher polynomial constraints can be expanded as
\[
\begin{aligned}
C^{(n)}_{p^k} &= \sum_{i=0}^{k}\sum_{j=0}^{n}\binom{k}{i}\binom{n}{j}\, p^i\, C_{\rm class}^j\, \langle(\hat{p}-p)^{k-i}(\hat{q}-q)^{n-j}\rangle\\
&\approx \sum_{i=0}^{k}\binom{k}{i}\, p^i\, \langle(\hat{p}-p)^{k-i}(\hat{q}-q)^{n}\rangle = \frac{G^{k,n}}{b_{k,n}} + \cdots
\end{aligned}
\]
with some coefficients $b_{k,n}$ and where moments of lower order in $p$ are not written explicitly because they can be determined from constraints of smaller $k$. Therefore, these constraints fix all remaining moments except $G^{n,0}$. Due to the constraint $C^{(1)} = C_{\rm class} \approx 0$, moreover, expectation values are restricted to the classical constraint hypersurface. No further restrictions on these degrees of freedom arise and also the gauge flows act such that the moments are removed in the proper way.
In particular, all remaining unconstrained $G^{n,0}$ become pure gauge: they can be changed arbitrarily by a gauge transformation. (This again confirms considerations in Sec. 3.1.2 because the gauge flow of $C^{(n)}_{q^m} = C^{(n+m)}$ is sufficient to remove all gauge without making use of $C^{(n)}_{p^m}$ with $m \neq 0$, where operators not commuting with the constraint would occur.) The system can thus be truncated consistently. For a truncation at $N$th order of a linear classical constraint, constraints up to order $N$ have to be taken into account.

However, the linear case is quite special because we only had to truncate the system of constraints, but not individual constraints: any effective constraint contains quantum variables of only one fixed order. Referring back to Sec. 3.1.2, when $\hat{C}$ is linear, we can impose all of the constraints and remove all of the gauge degrees of freedom in variables up to a given order without invoking higher-order constraints. This is accomplished by treating higher-order constraints as imposing conditions on higher-order quantum variables (possibly in terms of the lower-order unconstrained variables) and noting that using Eq. (3.12) there is no need to refer to constraints containing polynomial terms of order above $F^{a_1,a_2,\ldots,a_{2N};b,0}$ itself in order to demonstrate that this variable may be rescaled using gauge transformations. The gauge-invariant degrees of freedom that remain weakly commute with all constraints and not just the constraints up to the order considered; see Eq. (3.10). As a result, in the linear examples of Sec. 4, higher order constraints do not affect the reduction of the degrees of freedom for orders below and so could be disregarded without making any approximations. For nonlinear constraints, however, orders of moments mix and constraints relevant at low orders can contain moments of higher order. It is then more crucial to see how the higher moments could be disregarded consistently, as we will do in what follows.
5.2. Truncated system of constraints for the parametrized free non-relativistic particle

The motion of a free particle of mass $M$ in one dimension is described on the phase space $(p, q)$. Through the introduction of an arbitrary time parameter $t$, time can be turned into an additional degree of freedom. The system is then formulated on the 4-dimensional phase space with coordinates $(t, p_t; q, p)$. The Hamiltonian constraint of the parametrized free non-relativistic particle is given by
\[
C_{\rm class} = p_t + \frac{p^2}{2M}, \tag{5.1}
\]
which is constrained to vanish. Promoting phase space variables to operators, Dirac constraint quantization yields the quantum constraint
\[
\left(\hat{p}_t + \frac{\hat{p}^2}{2M}\right)\Psi = 0. \tag{5.2}
\]
In the Schrödinger representation, one arrives at an equation that is formally equivalent to the time-dependent Schrödinger equation^i
\[
i\hbar\,\frac{\partial\Psi(t,q)}{\partial t} = \frac{\hbar^2}{2M}\,\frac{\partial^2\Psi(t,q)}{\partial q^2}. \tag{5.3}
\]
As is well known, solutions to this equation are given by
\[
\Psi(t,q) = \int dk\, A(k)\, e^{\frac{i}{\hbar}E(k)t + ikq} \tag{5.4}
\]
where $E(k) = \frac{\hbar^2k^2}{2M}$. For the quantum variables we use, as before, the notation
\[
G^{a,b}_{c,d} = \left\langle(\hat{p}-p)^a(\hat{q}-q)^b(\hat{p}_t-p_t)^c(\hat{t}-t)^d\right\rangle_{\rm Weyl}. \tag{5.5}
\]
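As a quick sanity check on the sign conventions, one can verify numerically that a single mode of (5.4) solves (5.3). The following Python sketch is ours and assumes the forms of (5.3) and (5.4) as read here, namely $i\hbar\,\partial_t\Psi = (\hbar^2/2M)\,\partial_q^2\Psi$ and $\Psi_k = \exp(iE(k)t/\hbar + ikq)$; derivatives are approximated by central finite differences with a hypothetical step size `h`.

```python
import cmath

def psi(t, q, k, M=1.0, hbar=1.0):
    """Single mode of (5.4): exp(i E(k) t / hbar + i k q)."""
    E = hbar ** 2 * k ** 2 / (2.0 * M)
    return cmath.exp(1j * (E * t / hbar + k * q))

def residual(t, q, k, M=1.0, hbar=1.0, h=1e-4):
    """i*hbar*dPsi/dt - (hbar^2 / 2M) d^2Psi/dq^2, with both derivatives
    approximated by central finite differences of step h."""
    dpsi_dt = (psi(t + h, q, k, M, hbar) - psi(t - h, q, k, M, hbar)) / (2 * h)
    d2psi_dq2 = (psi(t, q + h, k, M, hbar) - 2 * psi(t, q, k, M, hbar)
                 + psi(t, q - h, k, M, hbar)) / h ** 2
    return 1j * hbar * dpsi_dt - hbar ** 2 / (2 * M) * d2psi_dq2
```

The residual should vanish up to discretization and round-off error for any real `k`; with the opposite sign in the exponent of the time dependence it would instead be of order $2E(k)$.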
In its general form, the set of constraints on the quantum phase space is given in the Appendix.

^i In contrast to the ordinary, time-dependent Schrödinger equation, time is an operator in the equation obtained here and not an external parameter. This implies that the Hamiltonian which generates evolution in time, $\hat{H}_{\rm phys} = \hat{p}^2/2M$, has the same action on physical states as the momentum operator canonically conjugate to time. In contrast to the physical Hamiltonian, which is bounded below and positive semidefinite, the spectrum of the time momentum $\hat{p}_t$ covers the entire real line. On physical solutions, however, only positive "frequencies" contribute.

5.2.1. Zeroth order truncation

Truncation of the system at zeroth order, i.e. setting all quantum variables to zero, yields $C^{(n)}|_{N=0} = C_{\rm class}^n$ together with
\[
C_q^{(n)}|_{N=0} = q\,C_{\rm class}^n + \frac{i\hbar}{2}\,\frac{p}{M}\,n\,C_{\rm class}^{n-1}, \qquad
C_t^{(n)}|_{N=0} = t\,C_{\rm class}^n + \frac{i\hbar}{2}\,n\,C_{\rm class}^{n-1}
\]
as the required constraints. This truncation is not consistent. Inserting the condition $C_{\rm class} = 0$ into the expressions of the remaining constraints results in inconsistency: for example, $C_t^{(1)}|_{N=0} = t\,C_{\rm class} + \tfrac{1}{2}i\hbar$ implies $\tfrac{1}{2}i\hbar = 0$. The reason may seem clear: a truncation at zeroth order can be understood as neglecting all quantum properties of the system. But this is not possible for a free particle. For example, there is no solution in which the spread in both $p$ and $q$ would stay negligible throughout the particle's evolution. There is no wave-packet which would remain tightly peaked throughout the evolution and a description in terms of expectation values alone seems insufficient in this case.

5.2.2. Second order truncation

But even if one takes into account the second-order quantum variables, spreads and fluctuations, an inconsistent system results. The expanded constraints can also be
found in the appendix, which we now sharply truncate at second order in moments. From $C^{(n)}$ only three non-trivial constraints follow,
\[
\begin{aligned}
C^{(1)} &= C_{\rm class} + \frac{1}{2M}\,G^{2,0}_{0,0},\\
C^{(2)}|_{N=2} &= C_{\rm class}^2 - (6C_{\rm class} - 4p_t)\,G^{2,0}_{0,0} + \frac{4p}{2M}\,G^{1,0}_{1,0} + G^{0,0}_{2,0},\\
C^{(3)}|_{N=2} &= C_{\rm class}^3,
\end{aligned}
\]
upon inserting the constraints successively. Thus for an $N = 2$ order truncation, at $n = 3$, the classical constraint is recovered and must vanish for the truncated system. Then, $C^{(1)} \approx 0$ yields $G^{2,0}_{0,0} \approx 0$ which is too strong for a consistent reduction since one expects the fluctuation $G^{pp}$ to be freely specifiable. It has to remain a physical degree of freedom after solving the constraints, for otherwise no general wave packet as in (5.4) can be posed as an initial condition of the free particle. As we see, the sharply truncated system is over-constrained. In particular, the constraint $C^{(3)}$, when truncated to second-order moments, reduces to the classical constraint $C_{\rm class}^3$, which then immediately implies $G^{pp} = 0$ due to $C^{(1)}$.

This observation points to a resolution of the inconsistency: while $C^{(1)}$ is already of second order even without a truncation, $C^{(3)}$ contains higher order moments. The truncation is then inconsistent in that we are ignoring higher orders next to an expression which we then constrain to be zero. For unconstrained moments, this would be consistent; but it is not if some of the moments are constrained to vanish. Thus, a more careful approximation scheme must be used where we do not truncate sharply but ignore higher moments only when they appear together with lower moments not constrained to vanish. In such a scheme, as discussed in the following section, $C^{(3)}$ would pose a constraint on the higher moments in terms of $C_{\rm class} \approx -G^{pp}/2M$, but would not require $C_{\rm class}$ or $G^{pp}$ to vanish.

6. Consistent Approximations

Through the iteration described in Sec. 3.1, the polynomial constraints of Sec. 3.1.2 or the generating function of Sec.
3.2 one arrives at an infinite number of constraints imposed on an infinite number of quantum variables. The linear systems have already demonstrated consistency and completeness of the whole system, but for practical purposes the infinite number of constraints and variables is to be reduced. We have seen in the preceding section that sharp truncations are in general inconsistent and that more careful approximation schemes are required. Depending on the specific reduction, it is neither obvious that the effective constraints are consistent in that they allow solutions for expectation values and moments at all, nor is it guaranteed that the constraints at hand do actually eliminate all unphysical degrees of freedom. For each classical canonical pair which is removed by imposing the constraints, all the corresponding moments as well as cross-moments with the unconstrained canonical variables should be removed. Classically, as well as in our quantum phase space formulation, the elimination of unphysical degrees of
freedom is a twofold process: the constraints can either restrict unphysical degrees of freedom to specific functions of the physical degrees of freedom, or unphysical degrees of freedom can be turned into mere gauge degrees of freedom under the transformations generated by the constraints and then gauge fixed if desired. In the following, we will first demonstrate by way of a non-trivial example, rather than referring to linearization, that the constraints as formulated in Sec. 3.1 are consistent, before turning to the elimination of the unphysical degrees of freedom. Our specific example is again the parametrized free non-relativistic particle, but the general considerations of Sec. 6.1 hold for any parameterized non-relativistic system.

We use the variables and constraints as they have been determined in Sec. 5.2. This establishes a hierarchy of the constraints, suggesting to solve $C^{(n)}$ first, then $C_q^{(n)}$, $C_t^{(n)}$, $C_{p_t}^{(n)}$ and $C_p^{(n)}$, and the remaining constraints (A.3)–(A.6) first for $k = 1$, then $k = 2$ etc. Note that for each $k$ in (A.3)–(A.6) the $r = k$ term is the only contribution of a form not appearing at lower orders. The terms occurring in the $r$-sum are linear combinations of the constraints (A.3)–(A.6) for $k' < k$. Thus apart from the $r = k$ term all other terms vanish if the lower $k$ constraints are satisfied.

It is important to notice that the structure of the constraints is such that on the constraint hypersurface $C^{(n)}$, $C^{(n)}_{qp^k}$, $C_q^{(n)}$, $C^{(n)}_{tp^k}$ and $C_t^{(n)}$ contain as lowest order terms expectation values, whereas $C^{(n)}_{pp^k}$, $C_p^{(n)}$, $C^{(n)}_{p_tp^k}$ and $C_{p_t}^{(n)}$ have second-order moments as lowest contribution. The highest order moments occurring in $C^{(n)}$ are of order $2n$, $2n + 1$ for $C_q^{(n)}$, $C_t^{(n)}$, $C_p^{(n)}$ and $C_{p_t}^{(n)}$, and $2n + 1 + k$ in $C^{(n)}_{qp^k}$, $C^{(n)}_{tp^k}$, $C^{(n)}_{pp^k}$ and $C^{(n)}_{p_tp^k}$. The structure of (A.3)–(A.6) implies that the lowest contributing order in the $j$- and $\ell$-sums (on the constraint hypersurface) is $j + \ell + k \pm 1$ and rises with $k$. Consequently, there exists a maximal $k$ up to which constraints have to be studied if only moments up to a certain order are taken into account.

We check the consistency of the constraints order by order in the moments. This means that we first have to verify that one can actually solve the constraints for the expectation values. This analysis will then be displayed explicitly for second- and third-order moments.

6.1. General procedure and moment expansion

To verify consistency up to a certain order, one can exploit the fact that up to a fixed order $N$ of the moments only a finite number of constraints have to be taken into account. This can be seen from the following argument: all constraints (A.1)–(A.6) have a structure, for which
\[
C^{(n)} = \sum_{m=0}^{n}\sum_{j=0}^{m}\sum_{\ell=0}^{2(n-m)} \binom{n}{m}\binom{m}{j}\binom{2(n-m)}{\ell}\, \frac{1}{(2M)^{n-m}}\; p_t^{\,m-j}\; p^{\,2(n-m)-\ell}\; G^{\ell,0}_{j,0}
\]
is representative. In the $j$-$\ell$-summation, the relevant moments occur for $j + \ell \pm 1 \le N$. From this condition, a number of pairs $(j, \ell)$ result for which the sums occurring
in (A.1)–(A.6) can be evaluated. There remain sums over $m$ containing $p_t$, which should be eliminated if we choose $t$ as internal time to make contact with the quantum theory of the deparameterized system. (Our consistent approximation procedure, however, is more general and does not require the choice of an internal time.) We can achieve this by rewriting these as terms of the form $n(n-1)\cdots(n-g)\,C_{\rm class}^{n-g-1}$ multiplied by powers of $p$ and $2M$, where $g$ is an integer depending on the values of $j$ and $\ell$. (See the examples in Eqs. (A.7)–(A.21).) This is achieved by eliminating $p_t$ via^j $C^{(1)} = C_Q \approx 0$ and illustrates the central role played by the principal quantum constraint $C_Q$. For a fixed order $N$ of moments, there is a factor of lowest and one of highest power of $C_{\rm class}$. In $C^{(n)}$, e.g., the highest power is given for $j = 0$, $\ell = 0$ (with $m = n$) and is simply $C_{\rm class}^n$, whereas the lowest power is given for $\ell = 0$, $j = N$ and is given by $n(n-1)\cdots(n-(N-1))\,C_{\rm class}^{n-N}$.^k Since $C_{\rm class} \approx -G^{2,0}_{0,0}/2M$, powers of second-order moments ensue (or higher $q$-moments if there is a potential). Together with powers of $\hbar$ in some of the terms, this must be compared with the orders of higher moments in order to approximate consistently.

To formalize the required moment expansions, one can replace each moment $G^{a,b}_{c,d}$ by $\lambda^{a+b+c+d}\,G^{a,b}_{c,d}$ and expand in $\lambda$. This automatically guarantees that higher order moments appear at higher orders in the expansion, and that products of moments are of higher order than the moments themselves. Moreover, in order to leave the uncertainty relation unchanged, we have to replace $\hbar$ by $\lambda^2\hbar$, which ensures that it is of higher order, too, without performing a specific $\hbar$-expansion. After the $\lambda$-expansion has been performed, $\lambda$ can be set equal to one to reproduce the original terms. (Assumptions of orders of moments behind this expansion scheme can easily be verified for Gaussian coherent states of the harmonic oscillator, where a moment $G^{a,b}$ is of order at least $\hbar^{(a+b)/2}$.)

One can now rewrite the sum over $m$ for all those terms which produce factors with powers of $C_{\rm class}$ down to the lowest power occurring in front of the relevant moments. In $C^{(n)}$ this would correspond to $C_{\rm class}^{n-N}$. One can therefore rewrite the constraints in the form
\[
C_{\rm class}^{n}\,Y_1 + n\,C_{\rm class}^{n-1}\,Y_2 + n(n-1)\,C_{\rm class}^{n-2}\,Y_3 + \cdots + R \approx 0, \tag{6.1}
\]
where $Y_i$ are functions linear in moments including those of order smaller than $N$, and $R$ contains only moments which are of higher order. This allows one to successively solve the constraints for $n = 1$, $n = 2$, etc. and discard all constraints arising for $n \ge N + 1$, $n > 0$. In each case, one has to find the terms of lowest order, in combination with powers $C_{\rm class}^n$, to see at which order in the moment expansion a constraint becomes relevant.

^j In our example of the free particle, we have $C_Q = p_t + p^2/2M + G^{2,0}_{0,0}/2M$. If there is a potential, there will be further classical terms as well as quantum variables $G^{0,n}_{0,0}$.

^k This term arises of course as well for $\ell = N$, $j = 0$; $\ell = 1$, $j = N - 1$; etc.
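The representative triple sum for $C^{(n)}$ quoted in this subsection can be evaluated directly. The sketch below is ours: it assumes the sum has the form $\sum_{m,j,\ell}\binom{n}{m}\binom{m}{j}\binom{2(n-m)}{\ell}(2M)^{-(n-m)}p_t^{m-j}p^{2(n-m)-\ell}G^{\ell,0}_{j,0}$, uses the convention that the zeroth-order "moment" equals one and first-order central moments vanish, and treats any moment missing from the input dictionary as zero (a sharp truncation). The function name `constraint_sum` is hypothetical.

```python
from math import comb

def constraint_sum(n, p, pt, M, G):
    """Evaluate the representative triple sum for C^(n).

    G maps (l, j) to the moment G^{l,0}_{j,0} (l powers of p, j of p_t).
    Assumed conventions: G^{0,0}_{0,0} = 1, first-order central moments
    vanish, and moments absent from G are set to zero.
    """
    def moment(l, j):
        if l + j == 0:
            return 1.0
        if l + j == 1:
            return 0.0
        return G.get((l, j), 0.0)

    total = 0.0
    for m in range(n + 1):
        for j in range(m + 1):
            for l in range(2 * (n - m) + 1):
                total += (comb(n, m) * comb(m, j) * comb(2 * (n - m), l)
                          * pt ** (m - j) * p ** (2 * (n - m) - l)
                          * moment(l, j) / (2 * M) ** (n - m))
    return total
```

For $n = 1$ this reproduces the principal constraint $C_Q = p_t + p^2/2M + G^{2,0}_{0,0}/2M$ of footnote j, and with all moments set to zero it returns $C_{\rm class}^n$.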
It is crucial for this procedure to work that $C_{\rm class}^n$, which arises in all constraints, can be eliminated at least for all $n > n'$ through terms of higher order moments using the principal constraint $C_Q$. This key property can easily be seen to be realized for any non-relativistic particle even in a potential, as long as $p_t$ appears linearly. (For relativistic particles, additional subtleties arise as discussed in a forthcoming paper.) While (A.1)–(A.6) change their form in such a case with a different classical constraint, the procedure sketched here still applies. Thus, it does not only refer to quadratic constraints but is sufficiently general for non-relativistic quantum mechanics. We will explicitly demonstrate the procedure for the free particle in what follows. For that purpose, we rewrote the set of constraints in the required form (6.1) for moments up to third order as seen in the Appendix.
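The order counting of the moment expansion described in this subsection is simple enough to encode. The following Python sketch is ours and assumes only the rules stated in the text: a moment $G^{a,b}_{c,d}$ counts as $\lambda^{a+b+c+d}$, a product of moments adds the individual orders, and each factor of $\hbar$ counts as $\lambda^2$. The names `lambda_order` and `keep` are hypothetical.

```python
def lambda_order(moment_orders, hbar_power=0):
    """Order in lambda of a term: product of moments with the given total
    orders a+b+c+d, times hbar**hbar_power (hbar counts as lambda**2)."""
    return sum(moment_orders) + 2 * hbar_power

def keep(moment_orders, hbar_power=0, N=2):
    """True if the term survives an expansion up to order lambda**N."""
    return lambda_order(moment_orders, hbar_power) <= N

# Terms linear in a second-order moment, or linear in hbar, are kept at N = 2 ...
second_order_moment = keep([2])
bare_hbar = keep([], hbar_power=1)
# ... while a second-order moment times hbar, or a product of two
# second-order moments, is already of order lambda**4 and is dropped.
moment_times_hbar = keep([2], hbar_power=1)
product_of_moments = keep([2, 2])
```

This is the bookkeeping behind treating quadratic terms in second-order moments, and second-order moments multiplied by $\hbar$, as higher-order contributions.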
6.2. Consistency of constraints for expectation values

At zeroth order, we keep only expectation values. All moments are of order $O(\lambda^2)$ or higher. As only relevant constraints we therefore find $C^{(n)} \approx 0$, cf. Appendix. Keeping only zeroth order terms, this reduces to $C^{(n)} = C_{\rm class}^n \approx 0$. This in turn corresponds to the single constraint $C_{\rm class} \approx 0$ which can be used to eliminate $p_t$ in terms of $p$. The system of constraints is obviously consistent at zeroth order and no constraints on variables associated with the pair $(q, p)$ result.

As explained above, the only constraint that restricts zeroth order moments is $C^{(1)} = C_{\rm class} \approx 0$. This constraint allows us to eliminate $p_t$. It generates a gauge flow on expectation values given by
\[
\dot{p} = 0, \qquad \dot{p}_t = 0, \qquad \dot{q} = \frac{p}{M}, \qquad \dot{t} = 1. \tag{6.2}
\]
The two observables of the system are therefore
\[
P^{(0)} = p \quad \text{and} \quad Q^{(0)} = q - t\,\frac{p}{M} \quad \text{with} \quad \{Q^{(0)}, P^{(0)}\} = 1. \tag{6.3}
\]
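That $Q^{(0)}$ and $P^{(0)}$ are constant along the gauge flow (6.2) can be seen by integrating the flow, which is elementary here. The sketch below is ours: `flow` returns the exact solution of (6.2) after a flow parameter `s`, and `observables` evaluates (6.3); both names and the tuple layout `(q, p, t, pt)` are illustrative choices.

```python
def flow(state, s, M):
    """Exact solution of the gauge flow (6.2) after parameter distance s.
    state = (q, p, t, pt); p and pt are constant, q and t grow linearly."""
    q, p, t, pt = state
    return (q + s * p / M, p, t + s, pt)

def observables(state, M):
    """The zeroth-order observables (6.3): Q0 = q - t p / M and P0 = p."""
    q, p, t, pt = state
    return (q - t * p / M, p)
```

For instance, with `state = (1.0, 2.0, 0.5, -1.0)` and `M = 4.0`, both `observables(state, M)` and `observables(flow(state, 3.0, M), M)` give $Q^{(0)} = 0.75$ and $P^{(0)} = 2$.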
These correspond to the two physical degrees of freedom corresponding to expectation values of canonical variables. Among the four original degrees of freedom of the system, $p_t$ is eliminated via the constraint and $t$ is a pure gauge degree of freedom. There are no further constraints to this order, which is thus consistent.

6.3. Consistency of constraints up to second-order moments

At second order, we include second-order moments and orders of $\hbar$ (recall that $\hbar$ is of order $\lambda^2$ in the moment expansion) in addition to expectation values. Third-order contributions are set to zero. We find that in addition to $C^{(1)}$, the new constraints $C_q^{(1)}$, $C_t^{(1)}$, $C_{p_t}^{(1)}$ and $C_p^{(1)}$ arise. All other constraints are of higher order: second-order moments enter in these equations only through quadratic terms or with a factor of $\hbar$, both of which are considered as higher order terms, cf. Appendix. The
only non-trivial constraints are therefore
\begin{align}
C^{(1)} &= C_{\rm class} + \frac{1}{2M}\,G^{2,0}_{0,0} \approx 0, \tag{6.4}\\
C_q^{(1)} &= G^{0,1}_{1,0} + \frac{p}{M}\,\frac{i\hbar}{2} + \frac{p}{M}\,G^{1,1}_{0,0} \approx 0, \tag{6.5}\\
C_t^{(1)} &= G^{0,0}_{1,1} + \frac{p}{M}\,G^{1,0}_{0,1} + \frac{i\hbar}{2} \approx 0, \tag{6.6}\\
C_{p_t}^{(1)} &= G^{0,0}_{2,0} + \frac{p}{M}\,G^{1,0}_{1,0} \approx 0, \tag{6.7}\\
C_p^{(1)} &= G^{1,0}_{1,0} + \frac{p}{M}\,G^{2,0}_{0,0} \approx 0, \tag{6.8}
\end{align}
where third-order contributions have been set to zero. In accordance with the order of expectation values, we use the first constraint to eliminate $p_t = -p^2/2M - G^{2,0}_{0,0}/2M$ and solve for second-order moments
\begin{align}
G^{0,1}_{1,0} &= -\frac{p}{M}\,\frac{i\hbar}{2} - \frac{p}{M}\,G^{1,1}_{0,0}, & \frac{p}{M}\,G^{1,0}_{0,1} &= -G^{0,0}_{1,1} - \frac{i\hbar}{2}, \tag{6.9}\\
G^{0,0}_{2,0} &= -\frac{p}{M}\,G^{1,0}_{1,0}, & G^{1,0}_{1,0} &= -\frac{p}{M}\,G^{2,0}_{0,0}. \tag{6.10}
\end{align}
As constraints for $k > 1$ contain second-order moments only through $C_{\rm class}^n$, they are trivial as well. This follows from the first constraint which sets $C_{\rm class}^n \sim (G^{2,0}_{0,0})^n \sim O(\lambda^{2n})$. Thus, as far as the second-order moments are concerned, the system of constraints is consistent: $G^{0,0}_{2,0}$, $G^{1,0}_{1,0}$, $G^{1,0}_{0,1}$ and $G^{0,1}_{1,0}$ are fully determined while all second-order moments associated with the pair $(q, p)$ can be specified freely. All remaining constraints then determine higher moments. This is the same situation as experienced in the linear case as far as solving the constraints for second-order moments is concerned. The inconsistency of Sec. 5.2.2 is avoided because $C^{(3)}$, which made $C_{\rm class}$ and thus $G^{2,0}_{0,0}$ vanish in the sharp truncation, is now realized as a higher order constraint in the moment expansion.

Gauge transformations are generated by $C^{(1)}$, $C_q^{(1)}$, $C_t^{(1)}$, $C_{p_t}^{(1)}$ and $C_p^{(1)}$ where third-order contributions are set to zero as in (6.10). In comparison to Sec. 6.2, we have four additional gauge transformations. Whereas $P^{(2)} := P^{(0)}$ remains gauge invariant under these transformations as well, this is not the case for $Q^{(0)}$. The latter has to be corrected by adding second-order moments such that an observable
\[
Q^{(2)} = Q^{(0)} - \frac{1}{M}\,G^{1,0}_{0,1} \tag{6.11}
\]
results satisfying $\{Q^{(2)}, P^{(2)}\} = 1$.

Calculating the transformations generated by the constraints on second-order moments shows that $\mathcal{G}^{pp(2)} = G^{2,0}_{0,0}$ is an observable, i.e. commutes with all five constraints on the hypersurface defined by these constraints. The form of the gauge
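orbits suggests to make the ansatz
\[ \mathcal{G}^{qp(2)} = G^{1,1}_{0,0} + G^{0,0}_{1,1} - \frac{t}{M}G^{2,0}_{0,0} + \frac{i\hbar}{2} \tag{6.12} \]
\[ \mathcal{G}^{qq(2)} = G^{0,2}_{0,0} - 2\frac{p}{M}G^{0,1}_{0,1} + \frac{p^2}{M^2}G^{0,0}_{0,2} - \frac{2t}{M}\left(G^{1,1}_{0,0} + G^{0,0}_{1,1} + \frac{i\hbar}{2}\right) + \frac{t^2}{M^2}G^{2,0}_{0,0} \tag{6.13} \]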
for the remaining two observables. They are invariant under gauge transformations. The term $i\hbar/2$ is included such that the Poisson brackets between $\mathcal{G}^{qq(2)}$ and the remaining two quantum observables are of the required form. They satisfy
\[ \{\mathcal{G}^{pp(2)}, \mathcal{G}^{qp(2)}\} = -2\mathcal{G}^{pp(2)}, \qquad \{\mathcal{G}^{pp(2)}, \mathcal{G}^{qq(2)}\} = -4\mathcal{G}^{qp(2)}, \qquad \{\mathcal{G}^{qp(2)}, \mathcal{G}^{qq(2)}\} = -2\mathcal{G}^{qq(2)}. \]
Commutators between the variables $Q^{(2)}$, $P^{(2)}$ and the physical quantum variables $\mathcal{G}^{qq(2)}$, $\mathcal{G}^{pp(2)}$ and $\mathcal{G}^{qp(2)}$ vanish.

Thus we showed that four of the ten second-order moments are eliminated directly by the constraints. Three of the remaining second-order moments, $G^{0,0}_{1,1}$, $G^{0,0}_{0,2}$ and $G^{0,1}_{0,1}$, are pure gauge degrees of freedom. Consequently three physical quantum degrees of freedom remain at second-order. The observables can be used to determine the general motion of the system in coordinate time: From (6.3) and (6.11) together with (6.10) and (6.12), we obtain
\[ \begin{aligned} q(t) &= Q^{(2)} + \frac{t}{M}P^{(2)} + \frac{1}{M}G^{tp} \approx Q^{(2)} + \frac{t}{M}P^{(2)} - \frac{1}{p}\left(G^{tp_t} + \frac{i\hbar}{2}\right) \\ &= Q^{(2)} + \frac{t}{M}P^{(2)} - \frac{1}{P^{(2)}}\left(\mathcal{G}^{qp(2)} - G^{qp} + \frac{t}{M}\mathcal{G}^{pp(2)}\right) \end{aligned} \tag{6.14} \]
for the relational dependence between $q$, $t$ and $G^{qp}$. Thus, the moments appear in the solutions for expectation values in coordinate time, which illustrates the relation between expectation values and moments. At this stage, we still have to choose a gauge if we want to relate the non-observables $q$, $t$ and $G^{qp}$ in this equation to properties in a kinematical Hilbert space. A convenient choice is to treat $(t, p_t)$ like a fully constrained pair as we have analyzed it in the example of a linear constraint in Sec. 4. This suggests to fix the gauge by requiring that $G^{tp_t} = -\frac{1}{2}i\hbar$ has no real part but only the imaginary part needed for physical quantum variables to be real. Moreover, as in the linear case we can gauge fix $G^{tt} = 0$, such that the uncertainty relation $G^{tt}G^{p_tp_t} - (G^{tp_t})^2 \geq \hbar^2/4$ is saturated independently of the behavior of the $(q, p)$-variables. (For $G^{tt} \neq 0$, it would depend on those variables via $G^{p_tp_t} \approx p^2G^{pp}/M^2$ from (6.10).) Finally, this is the only gauge condition for $G^{tp_t}$ which works for all values of $P^{(2)}$, including $P^{(2)} = 0$ in (6.14).
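The saturation claim can be made explicit in one line. The following worked step only assumes the two gauge conditions just stated, $G^{tt} = 0$ and $G^{tp_t} = -\frac{1}{2}i\hbar$:

```latex
G^{tt}G^{p_tp_t} - \left(G^{tp_t}\right)^2
  = 0\cdot G^{p_tp_t} - \left(-\frac{i\hbar}{2}\right)^2
  = \frac{\hbar^2}{4}
```

The bound is therefore reached with equality whatever value $G^{p_tp_t}$ takes, which is why the $(q, p)$-variables do not enter.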
In this gauge, we obtain
\[ q(t) = Q^{(2)} + \frac{P^{(2)}}{M}t, \qquad G^{qp}(t) = \mathcal{G}^{qp(2)} + \frac{\mathcal{G}^{pp(2)}}{M}t \tag{6.15} \]
in agreement with the solutions one would obtain for the deparametrized free particle. In this case, there is no quantum back-reaction of quantum variables affecting the motion of expectation values because the particle is free. In the presence of a potential, equations analogous to those derived here would exhibit those effects. While it would in general be difficult to determine precise observables in such a case, they can be computed perturbatively starting from the observables found here for the free particle.
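The gauge-fixed free-particle solutions can be evaluated directly. The sketch below is illustrative only: the function names and the sample values are hypothetical, and exact rational arithmetic is used merely to make the check unambiguous. It evaluates the linear-in-$t$ solutions for $q(t)$ and $G^{qp}(t)$ and confirms that the combination $G^{qp}(t) - (t/M)\,G^{pp}$, to which the second-order observable reduces in the gauge $G^{tp_t} = -\frac{1}{2}i\hbar$, is independent of $t$.

```python
from fractions import Fraction


def q_of_t(t, Q2, P2, M):
    """Expectation value q(t) = Q2 + (P2 / M) t for the free particle."""
    return Q2 + P2 * t / M


def Gqp_of_t(t, Gqp2, Gpp2, M):
    """Covariance G^{qp}(t) = Gqp2 + (Gpp2 / M) t."""
    return Gqp2 + Gpp2 * t / M


def recovered_Gqp2(t, Gqp_t, Gpp2, M):
    """Observable G^{qp}(t) - (t / M) G^{pp}; should not depend on t."""
    return Gqp_t - Gpp2 * t / M


if __name__ == "__main__":
    # Hypothetical sample data (not from the paper).
    M, Q2, P2 = Fraction(2), Fraction(1), Fraction(3)
    Gqp2, Gpp2 = Fraction(1, 4), Fraction(1, 2)
    for t in (Fraction(0), Fraction(1), Fraction(5, 2)):
        Gqp_t = Gqp_of_t(t, Gqp2, Gpp2, M)
        print(t, q_of_t(t, Q2, P2, M), Gqp_t,
              recovered_Gqp2(t, Gqp_t, Gpp2, M))
```

The last printed column stays at the initial value of the observable for every $t$, as expected for a gauge-invariant quantity.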
6.4. Consistency of constraints up to third-order moments

Including third-order terms in the analysis, solutions to the constraints $C_q^{(1)}$, $C_t^{(1)}$, $C_{p_t}^{(1)}$ and $C_p^{(1)}$ become
\[ G^{0,1}_{1,0} = -\frac{p}{M}G^{1,1}_{0,0} - \frac{i\hbar}{2}\frac{p}{M} - \frac{1}{2M}G^{2,1}_{0,0}, \tag{6.16} \]
\[ \frac{p}{M}G^{1,0}_{0,1} = -G^{0,0}_{1,1} - \frac{i\hbar}{2} - \frac{1}{2M}G^{2,0}_{0,1}, \tag{6.17} \]
\[ G^{0,0}_{2,0} = -\frac{p}{M}G^{1,0}_{1,0} - \frac{1}{2M}G^{2,0}_{1,0}, \tag{6.18} \]
\[ G^{1,0}_{1,0} = -\frac{p}{M}G^{2,0}_{0,0} - \frac{1}{2M}G^{3,0}_{0,0}. \tag{6.19} \]
As in the previous subsection, they will be used to determine second-order moments. The constraint $C^{(1)}$ contains no third-order term and thus remains unaltered. Third-order moments are determined by higher constraints $C_{qp}^{(1)}$, $C_{tp}^{(1)}$, $C_{p_tp}^{(1)}$, $C_{p^2}^{(1)}$ and $C_q^{(2)}$, $C_t^{(2)}$, $C_{p_t}^{(2)}$. All other constraints contain third-order moments with a factor of $\hbar$ or of second or higher moments, both of which provide terms of higher order.

For instance, we may consider the constraints $C_{qp^2}^{(1)}$, $C_{tp^2}^{(1)}$, cf. (A.17), (A.18). They both contain third-order moments with a factor of $C_{\rm class}$, which, after solving $C^{(1)}$, becomes a term of fifth order. The remaining second- and third-order terms occur with a factor of $\hbar$, and are thus of fourth and fifth order. From this consideration of orders in the moment expansion we conclude that $C_{qp^2}^{(1)}$ and $C_{tp^2}^{(1)}$ do not constrain third-order moments but become relevant only at higher than third order of the approximation scheme.

For $n = 1$ the constraints that actually determine third-order moments are $C_{qp}^{(1)}$, $C_{tp}^{(1)}$, $C_{p_tp}^{(1)}$ and $C_{p^2}^{(1)}$. On the constraint hypersurface, they imply
\[ G^{1,1}_{1,0} \approx -\frac{p}{M}G^{2,1}_{0,0} + \frac{1}{2M}G^{2,0}_{0,0}\left(G^{1,1}_{0,0} - i\hbar\right), \]
\[ G^{1,0}_{2,0} \approx \frac{1}{2M}G^{2,0}_{0,0}\left(\frac{1}{2M}G^{3,0}_{0,0} + \frac{p}{M}G^{2,0}_{0,0}\right) - \frac{p}{M}G^{2,0}_{1,0}, \]
\[ G^{1,0}_{1,1} \approx \frac{1}{2M}G^{2,0}_{0,0}G^{1,0}_{0,1} - \frac{p}{M}G^{2,0}_{0,1}, \]
\[ G^{2,0}_{1,0} \approx \frac{1}{2M}G^{2,0}_{0,0}G^{2,0}_{0,0} - \frac{p}{M}G^{3,0}_{0,0}. \]
Note that this holds on the constraint hypersurface defined by the constraints $C^{(1)}$, $C_q^{(1)}$, $C_t^{(1)}$, $C_{p_t}^{(1)}$ and $C_p^{(1)}$. Dropping fourth and fifth order terms, we find the simple relations
\[ G^{1,1}_{1,0} \approx -\frac{p}{M}G^{2,1}_{0,0}, \qquad G^{1,0}_{1,1} \approx -\frac{p}{M}G^{2,0}_{0,1}, \qquad G^{1,0}_{2,0} \approx -\frac{p}{M}G^{2,0}_{1,0}, \qquad G^{2,0}_{1,0} \approx -\frac{p}{M}G^{3,0}_{0,0}. \]
This happens in a consistent manner because unconstrained third-order moments appear on the right-hand sides. No condition for the $(q, p)$-moments appearing here arises in this way, but the third-order moments $G^{1,0}_{1,1}$ and $G^{0,0}_{2,1}$ associated with $(t, p_t)$ remain unspecified at this stage. The constraints $C_q^{(2)}$, $C_t^{(2)}$, $C_{p_t}^{(2)}$ arising for $n = 2$ yield
\[ G^{0,1}_{2,0} \approx \frac{p}{2M^2}G^{2,0}_{0,0}G^{1,1}_{0,0}, \]
\[ G^{0,0}_{2,1} \approx \frac{1}{M}G^{2,0}_{0,0}\left(G^{0,0}_{1,1} + \frac{1}{2M}G^{2,0}_{0,1}\right) + \frac{p^2}{M^2}G^{2,0}_{0,1}, \]
\[ G^{0,0}_{3,0} \approx -2\frac{p}{M}\left[\frac{1}{2M}G^{2,0}_{0,0}\left(\frac{p}{2M}G^{2,0}_{0,0} + \frac{1}{2M}G^{3,0}_{0,0}\right) + \frac{p^2}{2M^2}G^{3,0}_{0,0}\right], \]
which, after setting higher-order terms to zero, sets
\[ G^{0,1}_{2,0} \approx 0, \qquad G^{0,0}_{2,1} \approx \frac{p^2}{M^2}G^{2,0}_{0,1}, \qquad G^{0,0}_{3,0} \approx -2\frac{p^3}{2M^3}G^{3,0}_{0,0}. \]
The inclusion of third-order terms and new constraints does not affect $P^{(2)}$ and $Q^{(2)}$. They remain constant under gauge transformations. We therefore write
\[ P^{(3)} := P^{(0)} \quad \text{and} \quad Q^{(3)} := Q^{(2)}. \tag{6.20} \]
Accordingly, their Poisson bracket is unaltered. The situation is different for the second-order quantum variables. Only $\mathcal{G}^{pp(2)}$ remains invariant under the flow generated by third-order constraints. Now that third-order terms are included, $\mathcal{G}^{qp(2)}$ and $\mathcal{G}^{qq(2)}$ are no longer observables. The former transforms under gauge transformations as follows
\[ \{\mathcal{G}^{qp(2)}, C_q^{(1)}\} = \frac{1}{2M}G^{2,1}_{0,0}, \qquad \{\mathcal{G}^{qp(2)}, C_t^{(1)}\} = \frac{1}{2M}G^{2,0}_{0,1}, \]
\[ \{\mathcal{G}^{qp(2)}, C_{p_t}^{(1)}\} = \frac{1}{2M}G^{2,0}_{1,0}, \qquad \{\mathcal{G}^{qp(2)}, C_p^{(1)}\} = \frac{1}{2M}G^{3,0}_{0,0}, \]
whereas Poisson brackets with $C_{qp}^{(1)}$, $C_{tp}^{(1)}$, $C_{p_tp}^{(1)}$ and $C_{p^2}^{(1)}$ are of fourth order in the moment expansion. The terms on the right-hand side can be eliminated through the addition of a third-order moment by
\[ \mathcal{G}^{qp(3)} := \mathcal{G}^{qp(2)} - \frac{1}{2M}G^{2,0}_{0,1}. \tag{6.21} \]
This has vanishing Poisson brackets with all constraints up to fourth order terms. Moreover, it has vanishing Poisson bracket with $P^{(3)}$ as well as $Q^{(3)}$. The Poisson bracket with $\mathcal{G}^{pp(3)} := \mathcal{G}^{pp(2)}$ remains unaltered, $\{\mathcal{G}^{qp(3)}, \mathcal{G}^{pp(3)}\} = 2\mathcal{G}^{pp(3)}$.
The transformations generated by the constraints on $\mathcal{G}^{qq(2)}$ are of a more complicated form and we have not found a simple way of writing $\mathcal{G}^{qq(3)}$ in explicit form. We conclude at this place because the applicability of effective constraints has been demonstrated. As already mentioned, the procedure also applies to interacting systems: We can solve the constraints in the same manner and using the same orders of constraints. The main consequence in the presence of a potential $V(q)$ is that additional $q$-moments appear as extra terms in solutions at certain orders, whose precise form depends on the potential. For a small potential, this can be dealt with by perturbation theory around the free solutions.

7. Conclusions

We have introduced an effective procedure to treat constrained systems, which demonstrates how many of the technical and conceptual problems arising otherwise in those cases can be avoided or overcome. The procedure applies equally well to constraints with zero in the discrete or continuous parts of their spectra and is, in fact, independent of many representation properties. For each classical constraint, infinitely many constraints are imposed on an infinite-dimensional quantum phase space comprised of expectation values and moments of states. This system is manageable when solved order by order in the moments because this requires the consideration of only finitely many constraints at each order. A formal definition of this moment expansion has been given in Sec. 6.1.

The principal constraint is simply the expectation value $C_Q = \langle\hat{C}\rangle$ of a constraint operator, viewed as a function of moments via the state used. Unless the constraint is linear, there are quantum corrections depending on moments which can be analyzed for physical implications. Moments are themselves subject to further constraints, thus restricting the form of quantum corrections in $C_Q$.
We have demonstrated that there is a consistent procedure in which an expansion by moments can be defined, in analogy with an expansion by moments in effective equations for unconstrained systems. This has been shown to be applicable to any parametrized non-relativistic system. We have also demonstrated the procedure with explicit calculations in a simple example corresponding to the parameterized free non-relativistic particle. In these cases, when faced with infinitely many constraints we could explicitly choose an internal time variable and eliminate all its associated moments to the orders considered. Especially for the free particle, we were able to determine observables invariant under the flows generated by the constraints, and more generally observed how such equations encode quantum back-reaction of moments on expectation values in an interacting system. These observables were subjected to reality conditions to ensure that they correspond to expectation values and moments computed in a state of the physical Hilbert space. Especially physical Hilbert space issues appear much simpler in this framework compared to a direct treatment, being imposed just by reality conditions for functions rather than self-adjointness conditions for operators. Nevertheless, crucial
properties of the physical Hilbert space are still recognizable despite the fact that we do not refer to a specific quantum representation. We also emphasize that we choose an internal time after quantization, because we do so when evaluating effective constraints obtained from expectation values of operators. This is a new feature which may allow new concepts of emergent times given by quantum variables even in situations where no classical internal time would be available (see, e.g., [19]). In the examples, we have explicitly implemented the physical Hilbert space by reality conditions on observables given by physical expectation values and physical quantum variables. Observables thus play important roles and techniques of [26–28] might prove useful in this context. Notice that we are referring to observables of the quantum theory, although they formally appear as observables in a classical-type theory of infinitely many constraints for infinitely many variables. The fact that it often suffices to compute these observables order by order in the moment expansion greatly simplifies the computation of observables of the quantum theory. Nevertheless, especially for gravitational systems of sufficiently large complexity one does not even expect classical observables to be computable in explicit form. Then, additional expansions such as cosmological perturbations can be combined with the moment expansion to make calculations feasible. This provides almost all applications of interest. Moreover, if observables cannot be determined completely, gauge fixing conditions can be used. As we observed, depending on the specific gauge fixing some of the kinematical quantum variables (before imposing constraints) can be complex-valued while the final physical variables are required to be real.
Different gauge fixings imply different kinematical reality conditions, which can be understood as different kinematical Hilbert space structures all resulting in the same physical Hilbert space. While we have discussed only the simplest examples, this led us to introduce approximation schemes which are suitable more generally. In more complicated systems such as quantum cosmology one may not be able to find, e.g., explicit expressions for physical quantum variables as complete observables. But for effective equations it is sufficient to know the local behavior of gauge-invariant quantities, which can then be connected to long-term trajectories obtained by solving effective equations. A local treatment, on the other hand, allows one to linearize gauge orbits, making it possible to determine observables. Moreover, as always in the context of effective equations, simple models can serve as a basis for perturbation theories of more complicated systems.

A class of systems of particular interest is given by quantum cosmology as an example for parametrized relativistic systems to be discussed in a forthcoming paper. In such a case, the linear term $p_t$ in the systems considered here would be replaced by a square $p_t^2$. There is thus a sign ambiguity in $p_t$ which has some subtle implications. Moreover, the principal quantum constraint $C_Q$ will then acquire an additional moment $G^{p_tp_t}$ which may spoil the suitability of $t$ as internal time in quantum theory provided that the fluctuation $G^{p_tp_t}$ can become large enough for no real solution for $p_t$ to exist. This demonstrates a further advantage of the effective
constraint formalism which we have not elaborated here: the self-consistency of emergent time pictures can be analyzed directly from the structure of equations. Finally, if there are several classical constraints, anomaly issues can be analyzed at the effective level without many of the intricacies arising for constraint operators. Also this is discussed in more detail elsewhere [8]. To summarize, we have seen that the principal constraint CQ already provides quantum corrections on the classical constrained variables. The procedure thus promises a manageable route to derive corrections from, e.g., quantum gravity in a way in which physical reality conditions can be implemented. Since such conditions can be imposed order by order in moments as well as other perturbations, results can be arrived at much more easily compared to the computation of full physical states in a Hilbert space. Nevertheless, all physical requirements are implemented.
Acknowledgments

We thank Alejandro Corichi for discussions. B.S. thanks the Friedrich-Ebert-Stiftung for financial support. Work of M.B. was supported in part by NSF grant PHY0653127.
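Appendix. System of Constraints for the Parametrized Free Particle

General expressions for the constraints are
\[ C^{(n)} = \sum_{m=0}^{n}\sum_{j=0}^{m}\sum_{\ell=0}^{2(n-m)} \binom{n}{m}\binom{m}{j}\binom{2(n-m)}{\ell} p_t^{m-j}\frac{p^{2(n-m)-\ell}}{(2M)^{n-m}}\, G^{\ell,0}_{j,0}, \tag{A.1} \]
\[ \begin{aligned} C_q^{(n)} &= \sum_{m=0}^{n}\sum_{j=0}^{m}\sum_{\ell=0}^{2(n-m)} \binom{n}{m}\binom{m}{j}\binom{2(n-m)}{\ell} p_t^{m-j}\frac{p^{2(n-m)-\ell}}{(2M)^{n-m}} \left(qG^{\ell,0}_{j,0} + G^{\ell,1}_{j,0} + \frac{i\hbar}{2}\,\ell\, G^{\ell-1,0}_{j,0}\right), \\ C_t^{(n)} &= \sum_{m=0}^{n}\sum_{j=0}^{m}\sum_{\ell=0}^{2(n-m)} \binom{n}{m}\binom{m}{j}\binom{2(n-m)}{\ell} p_t^{m-j}\frac{p^{2(n-m)-\ell}}{(2M)^{n-m}} \left(tG^{\ell,0}_{j,0} + G^{\ell,0}_{j,1} + \frac{i\hbar}{2}\, j\, G^{\ell,0}_{j-1,0}\right), \\ C_{p_t}^{(n)} &= \sum_{m=0}^{n}\sum_{j=0}^{m}\sum_{\ell=0}^{2(n-m)} \binom{n}{m}\binom{m}{j}\binom{2(n-m)}{\ell} p_t^{m-j}\frac{p^{2(n-m)-\ell}}{(2M)^{n-m}} \left(p_tG^{\ell,0}_{j,0} + G^{\ell,0}_{j+1,0}\right), \end{aligned} \tag{A.2} \]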
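The triple binomial sum in (A.1) involves only the commuting pair $(p_t, p)$, so its combinatorics can be sanity-checked with ordinary commuting random variables. The sketch below is an illustration under that assumption and not the quantum computation itself: the function names and sample data are hypothetical, and the moments $G^{\ell,0}_{j,0}$ are modeled as classical central moments of an equally weighted sample. It compares the sum with a direct average of $(p_t + p^2/2M)^n$ in exact rational arithmetic.

```python
from fractions import Fraction
from math import comb


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def central_moment(samples, l, j):
    """Classical stand-in for G^{l,0}_{j,0}: E[(p-<p>)^l (pt-<pt>)^j]."""
    p_bar = mean(p for p, _ in samples)
    pt_bar = mean(pt for _, pt in samples)
    return mean((p - p_bar) ** l * (pt - pt_bar) ** j for p, pt in samples)


def constraint_power_via_sum(samples, n, M):
    """Evaluate the triple binomial sum of (A.1) for the sample."""
    p = mean(s[0] for s in samples)
    pt = mean(s[1] for s in samples)
    total = Fraction(0)
    for m in range(n + 1):
        for j in range(m + 1):
            for l in range(2 * (n - m) + 1):
                coeff = comb(n, m) * comb(m, j) * comb(2 * (n - m), l)
                total += (coeff * pt ** (m - j) * p ** (2 * (n - m) - l)
                          / (2 * M) ** (n - m)
                          * central_moment(samples, l, j))
    return total


def constraint_power_direct(samples, n, M):
    """Direct average of (pt + p^2 / 2M)^n over the sample."""
    return mean((pt + p * p / (2 * M)) ** n for p, pt in samples)


if __name__ == "__main__":
    # Hypothetical sample of (p, pt) pairs, not taken from the paper.
    data = [(Fraction(1), Fraction(-1)), (Fraction(2), Fraction(-3)),
            (Fraction(1, 2), Fraction(0)), (Fraction(-1), Fraction(2))]
    M = Fraction(3, 2)
    for n in (1, 2, 3):
        print(n, constraint_power_via_sum(data, n, M),
              constraint_power_direct(data, n, M))
```

The two columns agree for each $n$, which confirms the index ranges and binomial factors of the sum; the $\hbar$-dependent reordering terms in the other constraints are outside the scope of this check.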
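\[ C_{p^k}^{(n)} = \sum_{m=0}^{n}\sum_{j=0}^{m}\sum_{r=0}^{k}\sum_{\ell=0}^{2(n-m)} \binom{n}{m}\binom{m}{j}\binom{2(n-m)}{\ell}\binom{k}{r} p_t^{m-j}\frac{p^{2(n-m)+k-\ell-r}}{(2M)^{n-m}}\, G^{\ell+r,0}_{j,0}, \tag{A.3} \]
\[ C_{tp^k}^{(n)} = \sum_{m=0}^{n}\sum_{j=0}^{m}\sum_{r=0}^{k}\sum_{\ell=0}^{2(n-m)} \binom{n}{m}\binom{m}{j}\binom{2(n-m)}{\ell}\binom{k}{r} p_t^{m-j}\frac{p^{2(n-m)+k-\ell-r}}{(2M)^{n-m}} \left(tG^{\ell+r,0}_{j,0} + G^{\ell+r,0}_{j,1} + \frac{i\hbar}{2}\, j\, G^{\ell+r,0}_{j-1,0}\right), \tag{A.4} \]
\[ C_{qp^k}^{(n)} = \sum_{m=0}^{n}\sum_{j=0}^{m}\sum_{r=0}^{k}\sum_{\ell=0}^{2(n-m)} \binom{n}{m}\binom{m}{j}\binom{2(n-m)}{\ell}\binom{k}{r} p_t^{m-j}\frac{p^{2(n-m)+k-\ell-r}}{(2M)^{n-m}} \left(qG^{\ell+r,0}_{j,0} + G^{\ell+r,1}_{j,0} + \frac{i\hbar}{2}(\ell+r)\, G^{\ell+r-1,0}_{j,0}\right), \tag{A.5} \]
\[ C_{p_tp^k}^{(n)} = \sum_{m=0}^{n}\sum_{j=0}^{m}\sum_{r=0}^{k}\sum_{\ell=0}^{2(n-m)} \binom{n}{m}\binom{m}{j}\binom{2(n-m)}{\ell}\binom{k}{r} p_t^{m-j}\frac{p^{2(n-m)+k-\ell-r}}{(2M)^{n-m}} \left(p_tG^{\ell+r,0}_{j,0} + G^{\ell+r,0}_{j+1,0}\right). \tag{A.6} \]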
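In addition to those written explicitly here, there are those involving higher polynomials also in $q$, $t$ and $p_t$. The first two types of those constraints are more lengthy due to reorderings in the quantum variables. The constraints listed suffice for considerations in this paper. In the following, we give a moment expansion, using $X_i$ and $R_i$ to denote linear functions of higher, i.e. at least fourth, order moments. The leading terms are given by
\[ \begin{aligned} C^{(n)} ={}& C_{\rm class}^{n} + nC_{\rm class}^{n-1}\frac{1}{2M}G^{2,0}_{0,0} \\ &+ n(n-1)C_{\rm class}^{n-2}\left(\frac{p^2}{2M^2}G^{2,0}_{0,0} + \frac{p}{M}G^{1,0}_{1,0} + \frac{1}{2}G^{0,0}_{2,0} + \frac{1}{2M}G^{2,0}_{1,0} + \frac{p}{2M^2}G^{3,0}_{0,0} + \frac{1}{8M^2}G^{4,0}_{0,0}\right) \\ &+ n(n-1)(n-2)C_{\rm class}^{n-3}\left(\frac{p^2}{2M^2}G^{2,0}_{1,0} + \frac{p}{2M}G^{1,0}_{2,0} + \frac{1}{6}G^{0,0}_{3,0} + \frac{p^3}{6M^3}G^{3,0}_{0,0} + X_1\right) \\ &+ n(n-1)(n-2)(n-3)R_1 = 0, \end{aligned} \tag{A.7} \]
\[ \begin{aligned} C_q^{(n)} ={}& qC^{(n)} + nC_{\rm class}^{n-1}\left(\frac{i\hbar}{2}\frac{p}{M} + G^{0,1}_{1,0} + \frac{p}{M}G^{1,1}_{0,0} + \frac{1}{2M}G^{2,1}_{0,0}\right) \\ &+ n(n-1)C_{\rm class}^{n-2}\left[\frac{i\hbar}{2}\left(\frac{1}{M}G^{1,0}_{1,0} + \frac{3p}{2M^2}G^{2,0}_{0,0}\right) + \frac{p}{M}G^{1,1}_{1,0} + \frac{p^2}{2M^2}G^{2,1}_{0,0} + \frac{1}{2}G^{0,1}_{2,0} + \frac{i\hbar}{2}\frac{1}{2M^2}G^{3,0}_{0,0} + X_2\right] \\ &+ n(n-1)(n-2)C_{\rm class}^{n-3}\left[\frac{i\hbar}{2}\left(\frac{p^2}{M^2}G^{1,0}_{1,0} + \frac{p^3}{2M^3}G^{2,0}_{0,0} + \frac{p}{2M}G^{0,0}_{2,0} + \frac{3p}{2M^2}G^{2,0}_{1,0} + \frac{p^2}{M^3}G^{3,0}_{0,0} + \frac{1}{2M}G^{1,0}_{2,0}\right) + X_3\right] \\ &+ n(n-1)(n-2)(n-3)C_{\rm class}^{n-4}\left[\frac{i\hbar}{2}\left(\frac{p^4}{6M^4}G^{3,0}_{0,0} + \frac{p^3}{2M^3}G^{2,0}_{1,0} + \frac{p^2}{2M^2}G^{1,0}_{2,0} + \frac{p}{6M}G^{0,0}_{3,0}\right) + X_4\right] \\ &+ n(n-1)(n-2)(n-3)(n-4)R_2 = 0, \end{aligned} \tag{A.8} \]
\[ \begin{aligned} C_t^{(n)} ={}& tC^{(n)} + nC_{\rm class}^{n-1}\left(\frac{i\hbar}{2} + G^{0,0}_{1,1} + \frac{p}{M}G^{1,0}_{0,1} + \frac{1}{2M}G^{2,0}_{0,1}\right) \\ &+ n(n-1)C_{\rm class}^{n-2}\left[\frac{i\hbar}{2}\frac{1}{2M}G^{2,0}_{0,0} + \frac{p^2}{2M^2}G^{2,0}_{0,1} + \frac{p}{M}G^{1,0}_{1,1} + \frac{1}{2}G^{0,0}_{2,1} + X_5\right] \\ &+ n(n-1)(n-2)C_{\rm class}^{n-3}\left[\frac{i\hbar}{2}\left(\frac{p}{M}G^{1,0}_{1,0} + \frac{1}{2}G^{0,0}_{2,0} + \frac{p^2}{2M^2}G^{2,0}_{0,0} + \frac{p}{2M^2}G^{3,0}_{0,0} + \frac{1}{2M}G^{2,0}_{1,0}\right) + X_6\right] \\ &+ n(n-1)(n-2)(n-3)C_{\rm class}^{n-4}\left[\frac{i\hbar}{2}\left(\frac{p^3}{6M^3}G^{3,0}_{0,0} + \frac{p^2}{2M^2}G^{2,0}_{1,0} + \frac{p}{2M}G^{1,0}_{2,0} + \frac{1}{6}G^{0,0}_{3,0}\right) + X_7\right] \\ &+ n(n-1)(n-2)(n-3)(n-4)R_3 = 0, \end{aligned} \tag{A.9} \]
\[ \begin{aligned} C_{p_t}^{(n)} ={}& p_tC^{(n)} + nC_{\rm class}^{n-1}\left(G^{0,0}_{2,0} + \frac{p}{M}G^{1,0}_{1,0} + \frac{1}{2M}G^{2,0}_{1,0}\right) \\ &+ n(n-1)C_{\rm class}^{n-2}\left(\frac{p^2}{2M^2}G^{2,0}_{1,0} + \frac{p}{M}G^{1,0}_{2,0} + \frac{1}{2}G^{0,0}_{3,0} + X_8\right) + n(n-1)(n-2)R_4 = 0, \end{aligned} \tag{A.10} \]
\[ \begin{aligned} C_p^{(n)} ={}& pC^{(n)} + nC_{\rm class}^{n-1}\left(\frac{1}{2M}G^{3,0}_{0,0} + \frac{p}{M}G^{2,0}_{0,0} + G^{1,0}_{1,0}\right) \\ &+ n(n-1)C_{\rm class}^{n-2}\left(\frac{p^2}{2M^2}G^{3,0}_{0,0} + \frac{p}{M}G^{2,0}_{1,0} + \frac{1}{2}G^{1,0}_{2,0} + X_9\right) + n(n-1)(n-2)R_5 = 0, \end{aligned} \tag{A.11} \]
\[ C_{p^2}^{(n)} = 2pC_p^{(n)} - p^2C^{(n)} + C_{\rm class}^{n}G^{2,0}_{0,0} + nC_{\rm class}^{n-1}\left(\frac{p}{M}G^{3,0}_{0,0} + G^{2,0}_{1,0} + \frac{1}{2M}G^{4,0}_{0,0}\right) + n(n-1)R_6 = 0, \tag{A.12} \]
\[ \begin{aligned} C_{tp}^{(n)} ={}& tC_p^{(n)} + pC_t^{(n)} + C_{\rm class}^{n}G^{1,0}_{0,1} + nC_{\rm class}^{n-1}\left(G^{1,0}_{1,1} + \frac{p}{M}G^{2,0}_{0,1} + \frac{1}{2M}G^{3,0}_{0,1}\right) \\ &+ n(n-1)C_{\rm class}^{n-2}\left[\frac{i\hbar}{2}\left(\frac{p}{M}G^{2,0}_{0,0} + G^{1,0}_{1,0} + \frac{1}{2M}G^{3,0}_{0,0}\right) + X_{10}\right] \\ &+ n(n-1)(n-2)C_{\rm class}^{n-3}\left[\frac{i\hbar}{2}\left(\frac{p^2}{2M^2}G^{3,0}_{0,0} + \frac{1}{2}G^{1,0}_{2,0} + \frac{p}{M}G^{2,0}_{1,0}\right) + X_{11}\right] \\ &+ n(n-1)(n-2)(n-3)R_7 = 0, \end{aligned} \tag{A.13} \]
\[ \begin{aligned} C_{qp}^{(n)} ={}& qC_p^{(n)} + pC_q^{(n)} + C_{\rm class}^{n}\left(\frac{i\hbar}{2} + G^{1,1}_{0,0}\right) \\ &+ nC_{\rm class}^{n-1}\left(3\frac{i\hbar}{2}\frac{1}{2M}G^{2,0}_{0,0} + G^{1,1}_{1,0} + \frac{p}{M}G^{2,1}_{0,0} + \frac{1}{2M}G^{3,1}_{0,0}\right) \\ &+ n(n-1)C_{\rm class}^{n-2}\left[\frac{i\hbar}{2}\left(\frac{3p^2}{2M^2}G^{2,0}_{0,0} + \frac{2p}{M}G^{1,0}_{1,0} + \frac{1}{2}G^{0,0}_{2,0} + \frac{2p}{M^2}G^{3,0}_{0,0} + \frac{3}{2M}G^{2,0}_{1,0}\right) + X_{12}\right] \\ &+ n(n-1)(n-2)C_{\rm class}^{n-3}\left[\frac{i\hbar}{2}\left(\frac{3p^2}{2M^2}G^{2,0}_{1,0} + \frac{2p^3}{3M^3}G^{3,0}_{0,0} + \frac{p}{M}G^{1,0}_{2,0} + \frac{1}{6}G^{0,0}_{3,0}\right) + X_{13}\right] \\ &+ n(n-1)(n-2)(n-3)R_8 = 0, \end{aligned} \tag{A.14} \]
\[ C_{p_tp}^{(n)} = p_tC_p^{(n)} + pC_{p_t}^{(n)} + C_{\rm class}^{n}G^{1,0}_{1,0} + nC_{\rm class}^{n-1}\left(G^{1,0}_{2,0} + \frac{p}{M}G^{2,0}_{1,0} + \frac{1}{2M}G^{3,0}_{1,0}\right) + n(n-1)R_9 = 0, \tag{A.15} \]
\[ C_{p^3}^{(n)} = 3pC_{p^2}^{(n)} - 3p^2C_p^{(n)} + p^3C^{(n)} + C_{\rm class}^{n}G^{3,0}_{0,0} + nC_{\rm class}^{n-1}X_{14} + n(n-1)R_{10} = 0, \tag{A.16} \]
\[ \begin{aligned} C_{tp^2}^{(n)} ={}& tC_{p^2}^{(n)} - p^2C_t^{(n)} + 2pC_{tp}^{(n)} - 2ptC_p^{(n)} + C_{\rm class}^{n}G^{2,0}_{0,1} + nC_{\rm class}^{n-1}\left(\frac{i\hbar}{2}G^{2,0}_{0,0} + X_{15}\right) \\ &+ n(n-1)C_{\rm class}^{n-2}\left[\frac{i\hbar}{2}\left(\frac{p}{M}G^{3,0}_{0,0} + G^{2,0}_{1,0}\right) + X_{16}\right] + n(n-1)(n-2)R_{11} = 0, \end{aligned} \tag{A.17} \]
\[ \begin{aligned} C_{qp^2}^{(n)} ={}& qC_{p^2}^{(n)} - p^2C_q^{(n)} + 2pC_{qp}^{(n)} - 2pqC_p^{(n)} + C_{\rm class}^{n}G^{2,1}_{0,0} \\ &+ nC_{\rm class}^{n-1}\left[\frac{i\hbar}{2}\left(3\frac{p}{M}G^{2,0}_{0,0} + 2G^{1,0}_{1,0} + 4\frac{1}{2M}G^{3,0}_{0,0}\right) + X_{17}\right] \\ &+ n(n-1)C_{\rm class}^{n-2}\left[\frac{i\hbar}{2}\left(\frac{2p^2}{M^2}G^{3,0}_{0,0} + 3\frac{p}{M}G^{2,0}_{1,0} + G^{1,0}_{2,0}\right) + X_{18}\right] + n(n-1)(n-2)R_{12} = 0, \end{aligned} \tag{A.18} \]
\[ C_{p_tp^2}^{(n)} = p_tC_{p^2}^{(n)} - p^2C_{p_t}^{(n)} + 2pC_{p_tp}^{(n)} - 2pp_tC_p^{(n)} + C_{\rm class}^{n}G^{2,0}_{1,0} + nC_{\rm class}^{n-1}X_{19} + n(n-1)R_{13} = 0, \tag{A.19} \]
\[ \begin{aligned} C_{tp^3}^{(n)} ={}& tC_{p^3}^{(n)} + p^3C_t^{(n)} - 3p^2C_{tp}^{(n)} + 3p^2tC_p^{(n)} + 3pC_{tp^2}^{(n)} - 3ptC_{p^2}^{(n)} \\ &+ C_{\rm class}^{n}G^{3,0}_{0,1} + nC_{\rm class}^{n-1}\left(\frac{i\hbar}{2}G^{3,0}_{0,0} + X_{20}\right) + n(n-1)R_{14} = 0, \end{aligned} \tag{A.20} \]
\[ \begin{aligned} C_{qp^3}^{(n)} ={}& qC_{p^3}^{(n)} + p^3C_q^{(n)} - 3p^2C_{qp}^{(n)} + 3p^2qC_p^{(n)} + 3pC_{qp^2}^{(n)} - 3pqC_{p^2}^{(n)} \\ &+ C_{\rm class}^{n}\left(G^{3,1}_{0,0} + 3\frac{i\hbar}{2}G^{2,0}_{0,0}\right) + nC_{\rm class}^{n-1}\left[\frac{i\hbar}{2}\left(4\frac{p}{M}G^{3,0}_{0,0} + 3G^{2,0}_{1,0}\right) + X_{21}\right] + n(n-1)R_{15} = 0. \end{aligned} \tag{A.21} \]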
References

[1] F. Cametti, G. Jona-Lasinio, C. Presilla and F. Toninelli, Comparison between quantum and classical dynamics in the effective action formalism, quant-ph/9910065.
[2] M. Bojowald and A. Skirzewski, Effective equations of motion for quantum systems, Rev. Math. Phys. 18 (2006) 713–745; math-ph/0511043.
[3] A. Skirzewski, Effective equations of motion for quantum systems, PhD thesis, Humboldt-Universität Berlin (2006).
[4] M. Bojowald and A. Skirzewski, Quantum gravity and higher curvature actions, Int. J. Geom. Methods Mod. Phys. 4 (2007) 25–52; hep-th/0606232.
[5] K. V. Kuchař, Time and interpretations of quantum gravity, in Proc. 4th Canadian Conf. on General Relativity and Relativistic Astrophysics, eds. G. Kunstatter, D. E. Vincent and J. G. Williams (World Scientific, Singapore, 1992), pp. 211–314.
[6] M. Bojowald and G. Hossain, Cosmological vector modes and quantum gravity effects, Class. Quantum Grav. 24 (2007) 4801–4816; arXiv:0709.0872.
[7] M. Bojowald and G. Hossain, Quantum gravity corrections to gravitational wave dispersion, Phys. Rev. D 77 (2008) 023508; arXiv:0709.2365.
[8] M. Bojowald, G. Hossain, M. Kagan and S. Shankaranarayanan, Anomaly freedom in perturbative loop quantum gravity, Phys. Rev. D 78 (2008) 063547; arXiv:0806.3929.
[9] M. Bojowald, P. Singh and A. Skirzewski, Coordinate time dependence in quantum gravity, Phys. Rev. D 70 (2004) 124022; gr-qc/0408094.
[10] T. W. B. Kibble, Geometrization of quantum mechanics, Commun. Math. Phys. 65 (1979) 189–201.
[11] A. Heslot, Quantum mechanics as a classical theory, Phys. Rev. D 31 (1985) 1341–1348.
[12] A. Ashtekar and T. A. Schilling, Geometrical formulation of quantum mechanics, in On Einstein’s Path: Essays in Honor of Engelbert Schücking, ed. A. Harvey (Springer, New York, 1999), pp. 23–65; gr-qc/9706069.
[13] J. Willis, On the low-energy ramifications and a mathematical extension of loop quantum gravity, PhD thesis, The Pennsylvania State University (2004).
[14] G. Date, On obtaining classical mechanics from quantum mechanics, Class. Quantum Grav. 24 (2007) 535–550; gr-qc/0606078.
[15] M. Bojowald, Large scale effective theory for cosmological bounces, Phys. Rev. D 75 (2007) 081301(R); gr-qc/0608100.
[16] M. Bojowald, Dynamical coherent states and physical solutions of quantum cosmological bounces, Phys. Rev. D 75 (2007) 123512; gr-qc/0703144.
[17] M. Bojowald, H. Hernández and A. Skirzewski, Effective equations for isotropic quantum cosmology including matter, Phys. Rev. D 76 (2007) 063511; arXiv:0706.1057.
[18] M. Bojowald, Quantum nature of cosmological bounces, Gen. Rel. Grav. 40 (2008) 2659–2683; arXiv:0801.4001.
[19] M. Bojowald and R. Tavakol, Recollapsing quantum cosmologies and the question of entropy, Phys. Rev. D 78 (2008) 023515; arXiv:0803.4484.
[20] A. Corichi, On the geometry of quantum constrained systems, Class. Quantum Grav. 25 (2008) 135013; arXiv:0801.1119.
[21] M. Bojowald and T. Strobl, Poisson geometry in constrained systems, Rev. Math. Phys. 15 (2003) 663–703; hep-th/0112074.
[22] A. Komar, Constraints, hermiticity, and correspondence, Phys. Rev. D 19 (1979) 2908–2912.
[23] M. Bojowald, Isotropic loop quantum cosmology, Class. Quantum Grav. 19 (2002) 2717–2741; gr-qc/0202077.
[24] A. Ashtekar, M. Bojowald and J. Lewandowski, Mathematical structure of loop quantum cosmology, Adv. Theor. Math. Phys. 7 (2003) 233–268; gr-qc/0304074.
[25] M. Bojowald, Loop quantum cosmology, Living Rev. Relativity 8 (2005) 11; gr-qc/0601085; http://relativity.livingreviews.org/Articles/lrr-2005-11/.
[26] B. Dittrich, Partial and complete observables for Hamiltonian constrained systems, Gen. Rel. Grav. 39 (2007) 1891–1927; gr-qc/0411013.
[27] B. Dittrich, Partial and complete observables for canonical general relativity, Class. Quantum Grav. 23 (2006) 6155–6184; gr-qc/0507106.
[28] B. Dittrich, Aspects of classical and quantum dynamics of canonical general relativity, PhD thesis, University of Potsdam (2005).
March 10, 2009 19:20 WSPC/148-RMP J070-00360
Reviews in Mathematical Physics Vol. 21, No. 2 (2009) 155–227 © World Scientific Publishing Company
THE POINT PROCESSES OF THE GRW THEORY OF WAVE FUNCTION COLLAPSE∗
RODERICH TUMULKA Department of Mathematics, Rutgers University, 110 Frelinghuysen Road, Piscataway, NJ 08854-8019, USA
[email protected]
Received 22 April 2008
Revised 25 October 2008

The Ghirardi–Rimini–Weber (GRW) theory is a physical theory that, when combined with a suitable ontology, provides an explanation of quantum mechanics. The so-called collapse of the wave function is problematic in conventional quantum theory but not in the GRW theory, in which it is governed by a stochastic law. A possible ontology is the flash ontology, according to which matter consists of random points in space-time, called flashes. The joint distribution of these points, a point process in space-time, is the topic of this work. The mathematical results concern mainly the existence and uniqueness of this distribution for several variants of the theory. Particular attention is paid to the relativistic version of the GRW theory that was developed in 2004.

Keywords: Quantum theory without observers; Ghirardi–Rimini–Weber (GRW) theory of spontaneous wave function collapse; relativistic Lorentz covariance; flash ontology; Dirac equation; Dirac evolution between Cauchy surfaces and hyperboloids.

Mathematics Subject Classification 2000: 81P05, 46N50, 83A05, 81Q99
Contents
1. Introduction
1.1. Physical motivation
1.2. A philosophical aspect
2. Scheme of GRW Theories with Flash Ontology
2.1. The simplest case of GRWf
2.2. Labeled flashes
2.3. Variable total flash rate
∗A version of this work has been accepted as a Habilitation thesis by the Mathematics Institute of Eberhard-Karls-Universität Tübingen, Germany. The main difference between the thesis and the present version is that the proof of Theorem 1 (a Kolmogorov extension theorem for POVMs) was included in the thesis but not here, as it has been published separately [73].
2.4. Time-dependent operators
2.4.1. “Gauge” freedom
2.5. General scheme of GRWf theories
2.5.1. Nonpositive collapse operators
2.5.2. Past-dependent operators
2.5.3. “Gauge” freedom once more
2.5.4. Ways of specifying the theory
2.6. Flashes + POVM = GRWf
2.6.1. Reconstructing Λ
3. Rigorous Treatment of the GRWf Scheme
3.1. Weak integrals
3.2. POVMs
3.3. The simplest case of GRWf
3.4. Time-dependent operators
3.4.1. Given W and Λ
3.4.2. Given H and Λ
3.5. The general GRWf scheme
3.5.1. Given W and Λ
3.5.2. Given H and Λ
3.6. Reconstructing W and Λ
4. Relativistic GRW Theory
4.1. Abstract definition of the relativistic flash process
4.2. Concrete specification
4.3. Existence theorem in Minkowski space-time
5. Outlook
5.1. Nonlocality
5.2. Other approaches to relativistic collapse theories
5.3. The value of a precise definition
Appendix. Proofs of Lemmas
A.1. Weak integrals
A.2. Dyson series
1. Introduction

This work concerns the foundations of quantum mechanics. The Ghirardi–Rimini–Weber (GRW) theory is a proposal for a precise definition of quantum mechanics, intended to replace the conventional rules of quantum mechanics (as formulated by, e.g., Dirac and von Neumann) and to overcome a certain vagueness and imprecision inherent in these rules. This vagueness and imprecision arise from the situation that these rules specify what a macroscopic observer will see when measuring a certain observable, but leave unspecified exactly which systems should be counted as macroscopic, or as observers, and exactly which physical processes should be
counted as measurements. The GRW theory, as proposed in 1986 by Ghirardi, Rimini and Weber [40] and Bell [9], solves this problem for the entire realm of non-relativistic quantum mechanics, and a key role in this theory is played by a stochastic law according to which wave functions collapse at random times, rather than at the intervention of an observer. It is a “quantum theory without observers” [42, 43]. After the success of this approach with non-relativistic quantum mechanics, the question arises whether and how the GRW theory can be extended to quantum field theory, to relativistic quantum mechanics, and to relativistic quantum field theory. This question has been worked on intensely over the past 20 years, but not completely and finally answered. The first seriously relativistic theories of the GRW type, and in fact the first seriously relativistic quantum theories without observers, were developed in 2002 by Dowker and Henson [28] (on a discrete space-time) and in 2004 by the author [68] (on a flat or curved Lorentzian manifold). A major part of this work (Sec. 4) consists of a study of the model the author has proposed. This model, which is abbreviated rGRWf for “relativistic GRW theory with flash ontology,” uses some elements that were suggested for this purpose already in 1987 by Bell [9], in particular the “flash ontology,” which corresponds to a point process in space-time. Since the flash ontology is incompatible with the standard way of extending the GRW theory to quantum field theory — the CSL (continuous spontaneous localization) approach pioneered particularly by Pearle [55] and employing diffusion processes in Hilbert space — we developed in [69] a different way of extending the GRW theory to quantum field theories, suitable for flashes. 
A key element of this extension is an abstract scheme generalizing the original GRW theory (which applies to non-relativistic quantum mechanics), in which the theory is defined by specifying the Hamiltonian operator (as in conventional quantum theory) and the flash rate operators. This scheme is directly applicable to quantum field theories. A major part of this work (Secs. 2 and 3) consists of a description, further generalization and mathematical analysis of this scheme, including existence theorems providing exact conditions for the existence of the relevant point processes. The further generalization is necessary to include the process of the rGRWf theory. The goal of this work is to provide a firm mathematical basis for the GRW theories with flash ontology. It lies in the nature of the topic that this work must be a mixture of mathematics, physics, and philosophy. The theorems and proofs presented here appear here for the first time, while the physical (and philosophical) considerations reported here have been published before [28, 68–70, 72, 2]. The relevant mathematical considerations involve concepts and results from several fields, including stochastic processes; operators in Hilbert space; and differential geometry of Lorentzian manifolds. The main results of this work are existence proofs for the relevant point processes. An existence question arises in many physical theories and is often remarkably difficult. For example, the existence of Newtonian trajectories with Coulomb interaction (for almost all initial conditions) is still an open problem for more than 4 particles. For
March 10, 2009 19:20 WSPC/148-RMP
158
J070-00360
R. Tumulka
existence results about other quantum theories without observers, see [14, 36, 66]. A simple introduction to rGRWf is given in [70]; discussions of rGRWf can also be found in [1, 2, 46, 51, 52, 37].

1.1. Physical motivation

When the standard quantum formalism utilizes the concept of collapse of the wave function, it does so in a rather ill-defined way, introducing a collapse whenever “an observer” intervenes. This is replaced by a concept of objective collapse, or spontaneous collapse, in GRW-type theories. These theories replace the unitary Schrödinger evolution of the wave function by a nonlinear, stochastic evolution, so that the Schrödinger evolution remains a good approximation for microscopic systems while superpositions of macroscopically different states (such as Schrödinger’s cat) quickly collapse into one of the contributions. The GRW theory [40, 9, 3] is the simplest and best-known theory of this kind; another is the Continuous Spontaneous Localization (CSL) approach [55, 3]. These theories, when combined with a suitable ontology, provide paradox-free versions of quantum mechanics and possible explanations of the quantum formalism in terms of objective events, and thus “quantum theories without observers.”

Quantum theory is conventionally formulated as a positivistic theory, i.e. as a set of rules predicting what an observer will see when performing an experiment (more specifically, predicting which are the possible outcomes of the experiment, and which are their probabilities), also called the quantum formalism. Many physicists have felt it desirable to formulate quantum theory instead as a realistic theory, i.e. one describing (a model of) reality, independently of the presence of observers; in other words, describing all events that actually happen. This idea was most prominently advocated by Einstein [33], Bell [11], Schrödinger [65], de Broglie [22], Bohm [17] and Popper [60].
Realistic theories have come to be known as quantum theories without observers (QTWO) [42, 43]. Since in a QTWO also the observer and experiments are contained as special cases of matter and events, the quantum formalism remains valid but is a theorem and not an axiom, that is, a consequence of the QTWO and not its basic postulate. Conversely, a QTWO provides an explanation of the quantum formalism, describing how and why the outcomes specified by the formalism come about with their respective probabilities. There are two examples of QTWO that work in a satisfactory way (as pointed out by, e.g., Bell [8], Goldstein [42, 43] and Putnam [61]): Bohmian mechanics [17, 7, 13] and GRW theory [40, 9, 3], as well as variants of these two theories. (It may or may not be possible that also other approaches, such as the “many worlds” view or the “decoherent histories” program, can be developed into satisfactory QTWOs [42, 43, 2].) Among the variants of GRW theory (i.e. among the mathematical theories of spontaneous wave function collapse besides the original GRW model), the most notable is the continuous spontaneous localization (CSL) theory of Pearle [55];
Point Processes of GRW Theory of Wave Function Collapse
similar models were considered by Diósi [24], Belavkin [5], Gisin [41], and Ghirardi, Pearle, and Rimini [39]. Aside from explicit mathematical models, the idea that the Schrödinger equation might have to be replaced by a nonlinear and stochastic evolution has also been advocated by such distinguished theoretical physicists as Penrose [58] and Leggett [49].

1.2. A philosophical aspect

A crucial part of QTWOs is the so-called primitive ontology [2]. This means variables describing the distribution of matter in space and time. Here are four examples of primitive ontology:

• Particle ontology. Matter consists of point particles, mathematically represented by a location $Q_t$ in physical 3-space for every time $t$, or, equivalently, by a curve in space-time called the particle’s world line. This is the primitive ontology both of Bohmian and classical mechanics. One should imagine that each electron or quark is one point, so that a macroscopic object consists of more than $10^{23}$ particles.
• String ontology. Matter consists of strings, mathematically described by a curve in physical 3-space (or possibly another dimension of physical space), or, equivalently, by a 2-surface in space-time called the world sheet. One should imagine that each electron consists of one or more strings.
• Flash ontology. Matter consists of discrete points in space-time, called world points or flashes. One should imagine that a solid object consists of more than $10^6$ flashes per cubic centimeter per second. More flashes means more matter.
• Matter density ontology. Matter is continuously distributed in space, mathematically described by a density function $m(q, t)$, where $q$ is the location in physical 3-space and $t$ the time.

A QTWO needs a primitive ontology to give physical meaning to the mathematical objects considered by the theory [2, 52]. The role of the wave function then is “to tell the matter how to move” [2], that is, to govern the primitive ontology (in a stochastic way).
The theory we are mainly considering here, rGRWf, uses the flash ontology, which was first proposed for the original (non-relativistic) GRW model by Bell [9] and adopted in [48, 42, 43, 69]. Interestingly, the (non-relativistic) GRW evolution of the wave function can reasonably be combined with the matter density ontology as well [12,42,43,2]; thus, there are two different GRW theories, called GRWm and GRWf, with the same wave function but different ontologies [2]. However, it is not known how GRWm could be made relativistic. Likewise, it is not known how Bohmian mechanics could be made relativistic. More precisely, there does exist a natural and convincing way of defining Bohmian world lines on a relativistic space-time [29, 71], but it presupposes the existence of a preferred slicing of space-time into spacelike 3-surfaces, called the time foliation. The time foliation may itself be given by a Lorentz-invariant law, but still
Fig. 1. A typical pattern of flashes in space-time (r = space, t = time), and thus a possible world according to the GRW theory with the flash ontology.
it seems against the spirit of relativity because it defines a notion of absolute simultaneity. This does not mean that this theory is wrong; it means that if it is right then we will have to adopt a different understanding of relativity. An overview of recent research about Bohmian mechanics and relativity can be found in [71, Sec. 3.3].

We introduce some notation. Throughout this work, $\mathscr{H}$ will always be a separable complex Hilbert space. The adjoint of an operator $T$ on $\mathscr{H}$ is denoted $T^*$. The Borel σ-algebra of a topological space $X$ will be denoted $\mathcal{B}(X)$.

2. Scheme of GRW Theories with Flash Ontology

This chapter is of a physical character. It provides an overview of GRW theories with flash ontology (hereafter, GRWf theories). The mathematical considerations in this chapter are not intended to be rigorous (except when stated otherwise). For example, we will pretend that functions are differentiable or operators invertible whenever that is useful. We describe a general scheme of GRWf theories (including, but more general than, the scheme described in [69]). We begin with a simple special case and increase generality step by step, finally arriving at the general version that contains also rGRWf. Given the scheme, a particular GRWf theory can be defined by specifying certain operators. This situation is roughly analogous to the general Schrödinger equation
\[ i\hbar \frac{d\psi_t}{dt} = H\psi_t, \tag{1} \]
which becomes a concrete evolution equation only after specifying the self-adjoint operator H, called the Hamiltonian.
2.1. The simplest case of GRWf

We take physical space to be $\mathbb{R}^3$ and the time axis to be $\mathbb{R}$. To specify the probabilistic law for the flash process, we specify the rate density $r(q, t)$ at time $t \in \mathbb{R}$ for a flash to occur at $q \in \mathbb{R}^3$, which means, roughly speaking, that the probability of a flash in an infinitesimal volume $dq$ around $q$ between $t$ and $t + dt$, conditional on the flashes in the past of $t$, equals $r(q, t)\,dq\,dt$. The first basic equation of GRWf says that the flash rate density is given by
\[ r(q, t) = \langle \psi_t | \Lambda(q) \psi_t \rangle. \tag{2} \]
Here $\Lambda(q)$ is a positive operator, called the flash rate operator, which must be specified to define the theory, and $\psi_t \in \mathscr{H}$ is called the wave function or state vector at time $t$, which fulfills $\|\psi_t\| = 1$ and evolves according to the following two evolution laws. When a flash occurs at time $T$ and location $Q$, the wave function changes discontinuously according to the second basic equation,
\[ \psi_{T+} = \frac{\Lambda(Q)^{1/2} \psi_{T-}}{\|\Lambda(Q)^{1/2} \psi_{T-}\|}. \tag{3} \]
Here, $\psi_{T+} = \lim_{t \searrow T} \psi_t$ and $\psi_{T-} = \lim_{t \nearrow T} \psi_t$. This is called the collapse of the state vector at time $T$ and location $Q$. Between the flashes, the wave function evolves according to the Schrödinger equation (1). Once the operators $H$ and $\Lambda(q)$ are specified, the equations are intended to define the flash process
\[ F = ((T_1, Q_1), (T_2, Q_2), \ldots), \tag{4} \]
as follows: Choose, at an “initial time” $t_0$, the initial state vector $\psi_{t_0} \in \mathscr{H}$ with $\|\psi_{t_0}\| = 1$, and evolve it using the Schrödinger equation (1) up to the time $T_1 > t_0$ at which the first flash occurs; let $Q_1$ be the location of the first flash; collapse the state vector at time $T_1$ and location $Q_1$; continue with the collapsed state vector. (In the more general variants of the GRWf scheme, it can happen that the sequence $F$ ends after finitely many flashes if the rate is very low; in the simple variant we are presently considering, this does not happen, as we will see.)

Example 1. The original 1986 GRW model [40, 9] is designed for non-relativistic quantum mechanics of $N$ particles; for $N = 1$ it fits the above scheme as follows: $\mathscr{H} = L^2(\mathbb{R}^3)$; $H$ is the usual Hamiltonian of non-relativistic quantum mechanics, a self-adjoint extension of
\[ H\psi = -\frac{\hbar^2}{2m} \nabla^2 \psi + V\psi \tag{5} \]
for $\psi \in C_0^\infty(\mathbb{R}^3)$, where $m$ is the particle’s mass and $V$ the potential; finally, the flash rate operators are multiplication operators by a Gaussian,
\[ \Lambda(q)\psi(r) = \frac{\lambda}{(2\pi\sigma^2)^{3/2}}\, e^{-(r-q)^2/2\sigma^2}\, \psi(r), \tag{6} \]
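where $\lambda$ and $\sigma$ are new constants of nature with suggested values $\lambda \approx 10^{-15}\,\mathrm{s}^{-1}$ and $\sigma \approx 10^{-7}\,\mathrm{m}$. Since
\[ \int_{\mathbb{R}^3} \Lambda(q)\,dq = \lambda I, \tag{7} \]
where $I$ is the identity operator on $\mathscr{H}$, the total flash rate
\[ r(\mathbb{R}^3, t) = \int_{\mathbb{R}^3} r(q, t)\,dq = \lambda \tag{8} \]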
is independent of the state vector and constant in time. Thus, the flash times $T_1, T_2, \ldots$ form a Poisson process with intensity $\lambda$ (while the locations $Q_1, Q_2, \ldots$ do depend on $\psi$).

Example 2. A version of the GRW model advocated by Dove and Squires [26] and Tumulka [69] corresponding to non-relativistic quantum mechanics of $N$ identical particles fits into the scheme as follows: $\mathscr{H} = S_\pm L^2(\mathbb{R}^3)^{\otimes N}$ with $S_+$ the symmetrizer and $S_-$ the anti-symmetrizer, i.e. $\mathscr{H}$ is the space of symmetric (for bosons), respectively, anti-symmetric (for fermions) $L^2$ functions on $\mathbb{R}^{3N}$; $H$ is the usual Hamiltonian, a self-adjoint extension of
\[ H\psi = -\sum_{i=1}^{N} \frac{\hbar^2}{2m} \nabla_i^2 \psi + V\psi, \tag{9} \]
for $\psi \in C_0^\infty(\mathbb{R}^{3N}) \cap \mathscr{H}$; finally, the flash rate operators are
\[ \Lambda(q)\psi(r_1, \ldots, r_N) = \sum_{i=1}^{N} \frac{\lambda}{(2\pi\sigma^2)^{3/2}}\, e^{-(r_i-q)^2/2\sigma^2}\, \psi(r_1, \ldots, r_N) \tag{10} \]
with the same constants as before. Then (7) holds with $N\lambda$ instead of $\lambda$, and hence the total flash rate is larger by a factor $N$,
\[ r(\mathbb{R}^3, t) = N\lambda. \tag{11} \]
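The two basic equations (2) and (3), together with the Gaussian flash rate operator (6) and the normalization (7), can be illustrated numerically. The following is only a minimal sketch under stated assumptions: it works in one space dimension instead of three (so the prefactor is $(2\pi\sigma^2)^{-1/2}$), discretizes the wave function on a uniform grid, and uses toy units with $\lambda = \sigma = 1$; the function names (`gaussian_kernel`, `flash_rate_density`, `collapse`, `two_packet_state`) are hypothetical and not taken from the literature.

```python
import math

def gaussian_kernel(r, q, lam, sigma):
    """1D analog of the multiplication kernel in (6): lam times a normalized Gaussian."""
    return lam * math.exp(-(r - q) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def norm(psi, dr):
    """Grid (Riemann-sum) approximation of the L^2 norm."""
    return math.sqrt(sum(abs(c) ** 2 for c in psi) * dr)

def flash_rate_density(psi, grid, dr, q, lam, sigma):
    """r(q) = <psi | Lambda(q) psi>, cf. (2), for a multiplication operator Lambda(q)."""
    return sum(gaussian_kernel(r, q, lam, sigma) * abs(c) ** 2
               for r, c in zip(grid, psi)) * dr

def collapse(psi, grid, dr, q, lam, sigma):
    """psi -> Lambda(q)^{1/2} psi / ||Lambda(q)^{1/2} psi||, cf. (3).

    Assumes the flash location q is not so far from the support of psi
    that every kernel value underflows to zero.
    """
    out = [math.sqrt(gaussian_kernel(r, q, lam, sigma)) * c for r, c in zip(grid, psi)]
    n = norm(out, dr)
    return [c / n for c in out]

def two_packet_state(grid, dr, centre=4.0, width=0.5):
    """Toy initial state: normalized superposition of two Gaussian packets at +/- centre."""
    raw = [math.exp(-(r - centre) ** 2 / (2 * width ** 2))
           + math.exp(-(r + centre) ** 2 / (2 * width ** 2)) for r in grid]
    n = norm(raw, dr)
    return [c / n for c in raw]
```

In this toy setting, integrating `flash_rate_density` over all flash locations `q` reproduces the total rate `lam`, in analogy to (8), and a collapse centred on one packet leaves essentially no weight on the other, which is the localization effect the Gaussian in (6) is designed to produce.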
The condition (7) plays a role to ensure the important property that the distribution of $F$ is given by a POVM, i.e. there is a POVM (positive-operator-valued measure, see Sec. 3.2) $G(\cdot)$ on the history space $\Omega = (\mathbb{R}^4)^{\mathbb{N}}$, called the history POVM, such that for $A \subseteq \Omega$
\[ \mathbb{P}(F \in A) = \langle \psi | G(A) \psi \rangle \tag{12} \]
with $\psi = \psi_{t_0}$ the initial state vector.$^a$

$^a$ A physical consequence of this property is the impossibility of superluminal communication by means of entanglement (“no signalling”). Indeed, consider two systems, $a$ and $b$, that are distant and do not interact (e.g., because they are spacelike separated) but may be entangled. Their joint wave function lies in $\mathscr{H}_a \otimes \mathscr{H}_b$ and the Hamiltonian and the flash rate operators split in such a way that the history POVM splits as well: $G(A_a \times A_b) = G(A_a) \otimes G(A_b)$ for any event $A_a$ ($A_b$) concerning the flashes in system $a$ ($b$) [9]. As a consequence, the marginal distribution of the flashes in system $a$ does not depend on the external fields in system $b$, nor on $\psi$ except through its reduced density matrix $\rho_a = \mathrm{tr}_b\, |\psi\rangle\langle\psi|$, which implies no signalling [9].
We can come close to an explicit expression for the history POVM $G(\cdot)$ by providing an explicit expression for its marginal $G_n(\cdot)$ for the first $n$ flashes, which we obtain by a formal calculation [69] from (1), (2), (3) and (7), writing $X$ for the space-time point $(Q, T)$ (and $x = (q, t)$, $dx = dq\,dt$):
\begin{align}
\mathbb{P}(X_1 \in dx_1, \ldots, X_n \in dx_n) &= \langle \psi | G_n(dx_1 \times \cdots \times dx_n) \psi \rangle \tag{13} \\
&= \langle \psi | L_n^* L_n \psi \rangle\, dx_1 \cdots dx_n \tag{14}
\end{align}
with Ln (x1 , . . . , xn ) = 1t0
(15)
Here, 1t0
(16)
which means we assume separate flash rate operators for every type; the new collapse rule prescribes that if a flash of type $I$ occurs at time $T$ and location $Q$ then
\[ \psi_{T+} = \frac{\Lambda_I(Q)^{1/2} \psi_{T-}}{\|\Lambda_I(Q)^{1/2} \psi_{T-}\|}. \tag{17} \]
Concerning the total flash rate operator, we assume, instead of (7),
\[ \sum_{i \in \mathscr{L}} \int_{\mathbb{R}^3} \Lambda_i(q)\,dq = \lambda I. \tag{18} \]
(It should not lead to confusion that a capital $I$ is sometimes used for the identity operator and sometimes for a random label.) As a consequence, the joint distribution of the first $n$ flashes together with their labels is
\[ \mathbb{P}(F_n \in df_n) = \mathbb{P}(X_1 \in dx_1, I_1 = i_1, \ldots, X_n \in dx_n, I_n = i_n) = \langle \psi | L_n^* L_n \psi \rangle\, dx_1 \cdots dx_n \tag{19} \]
with Ln = Ln (x1 , i1 , . . . , xn , in ) = 1t0
(20)
Example 3. The original GRW model (corresponding to non-relativistic quantum mechanics of $N$ distinguishable particles) fits this scheme as follows: $\mathscr{H} = L^2(\mathbb{R}^{3N})$; $\mathscr{L} = \{1, \ldots, N\}$; $H$ is the usual Hamiltonian of non-relativistic quantum mechanics, a self-adjoint extension of
\[ H\psi = -\sum_{i=1}^{N} \frac{\hbar^2}{2m_i} \nabla_i^2 \psi + V\psi \tag{21} \]
for $\psi \in C_0^\infty(\mathbb{R}^{3N})$, where $m_i$ is the mass of particle $i$; finally, the flash rate operators are
\[ \Lambda_i(q)\psi(r_1, \ldots, r_N) = \frac{\lambda_i}{(2\pi\sigma^2)^{3/2}}\, e^{-(r_i-q)^2/2\sigma^2}\, \psi(r_1, \ldots, r_N). \tag{22} \]
Note that the rate constant $\lambda_i$ may depend on the label $i$, so that different variables $r_i$ (what one would call “different particles” in orthodox quantum mechanics) may have different flash rates. (In fact, if the frequencies $\lambda_i$ are all different, one might use them as the labels.) One easily checks (18) with $\lambda = \lambda_1 + \cdots + \lambda_N$.

2.3. Variable total flash rate
We now stop assuming that the total flash rate operator $\sum_i \int dq\, \Lambda_i(q)$ is a multiple of the identity; that is, we drop (7) and (18). As pointed out in [69], this situation naturally arises for a GRWf process appropriate for quantum field theory (corresponding to a variable number of particles), as already suggested by the fact, see (11), that the total flash rate is proportional to the number of particles. A stochastic wave function evolution very similar to the one discussed here was proposed by Blanchard and Jadczyk [15] as a model of a quantum system interacting with a classical one; see [46] for a discussion of the commonalities. We can keep the same formulas, (16) for the flash rate and (17) for the collapse, and we want that the joint distribution of all flashes is still given by a history POVM. It turns out [69] that this naturally leads us to a modified, nonlinear Schrödinger
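equation. We will take the opposite route here, start by postulating the appropriate nonlinear Schrödinger equation [69, 15], and derive from there that the distribution of flashes is indeed given by a POVM. Here is the equation:
\[ i\hbar \frac{d\psi_t}{dt} = H\psi_t - \frac{i\hbar}{2} \Lambda(\mathbb{R}^3)\psi_t + \frac{i\hbar}{2} \langle \psi_t | \Lambda(\mathbb{R}^3)\psi_t \rangle\, \psi_t, \tag{23} \]
where
\[ \Lambda(\mathbb{R}^3) = \sum_{i \in \mathscr{L}} \int_{\mathbb{R}^3} \Lambda_i(q)\,dq. \tag{24} \]
Note that (23) formally preserves $\|\psi_t\|$ if $\|\psi\| = 1$ initially:
\begin{align}
\frac{d\|\psi_t\|^2}{dt} &= 2\,\mathrm{Re} \Bigl\langle \psi_t \Big| \frac{d\psi_t}{dt} \Bigr\rangle \tag{25} \\
&= 2\,\mathrm{Re} \Bigl[ -\frac{i}{\hbar} \langle \psi_t | H\psi_t \rangle - \frac{1}{2} \langle \psi_t | \Lambda(\mathbb{R}^3)\psi_t \rangle + \frac{1}{2} \langle \psi_t | \Lambda(\mathbb{R}^3)\psi_t \rangle \langle \psi_t | \psi_t \rangle \Bigr] \tag{26} \\
&= (\|\psi_t\|^2 - 1)\, \langle \psi_t | \Lambda(\mathbb{R}^3)\psi_t \rangle = 0. \tag{27}
\end{align}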
This calculation shows that the last term in (23), the nonlinear term, serves the purpose of keeping $\psi_t$ normalized to 1. Note further that if, as we assumed in the previous sections, $\Lambda(\mathbb{R}^3) = \lambda I$ and $\|\psi\| = 1$ initially then (23) reduces to the Schrödinger equation (1).

Next, we want to obtain formulas analogous to (19) and (20) for the distribution of the first $n$ flashes from the flash rate (16), the collapse law (17) and the between-flashes evolution (23). However, since the total flash rate is not constant any more, it need not be bounded away from zero, and, as a consequence, it can have positive probability that only finitely many flashes occur. The appropriate history space is therefore not $(\mathbb{R}^4)^{\mathbb{N}}$ but the space of all finite or infinite sequences,
\[ \Omega = \bigcup_{m=0}^{\infty} (\mathbb{R}^4)^m \cup (\mathbb{R}^4)^{\mathbb{N}} \tag{28} \]
(where $(\mathbb{R}^4)^0$ has a single element, the empty sequence). Another method of representing finite-or-infinite sequences is based on introducing a formal symbol $\diamond$ (“cemetery state”) and writing a finite sequence in $\mathbb{R}^4$, such as $(x_1, \ldots, x_n)$, as the infinite sequence $(x_1, \ldots, x_n, \diamond, \diamond, \ldots)$ in $\mathbb{R}^4 \cup \{\diamond\}$. Then $\Omega$ can be understood as
\[ \Omega = \{(z_1, z_2, \ldots) \in (\mathbb{R}^4 \cup \{\diamond\})^{\mathbb{N}} : z_n = \diamond \Rightarrow z_{n+1} = \diamond\}, \tag{29} \]
the set of sequences for which $\diamond$ is “absorbing”, i.e. the sequence cannot leave the cemetery state once it is reached. In this representation, the number of flashes $\#F$ in a sequence $F = (z_1, z_2, \ldots) \in (\mathbb{R}^4 \cup \{\diamond\})^{\mathbb{N}}$ has to be defined as
\[ \#F = \inf\{n \in \mathbb{N} : z_n = \diamond\} - 1 \tag{30} \]
(with the understanding $\inf \emptyset = \infty$).
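The cemetery-state representation (29) and the counting rule (30) translate directly into a small data-structure sketch. The code below is illustrative only and rests on assumptions that are not in the text: a history is given as a finite prefix of the infinite sequence, the cemetery state is modelled by the Python value `None`, and the function names are hypothetical. Because only a finite prefix is inspected, the number of flashes is reported as undetermined when no cemetery entry has been seen yet.

```python
CEMETERY = None  # stand-in for the formal cemetery symbol; any sentinel outside R^4 would do

def is_admissible(prefix):
    """Check the absorbing condition of (29) on a finite prefix:
    once the cemetery state is reached, every later entry must also be the cemetery state."""
    reached = False
    for z in prefix:
        if z is CEMETERY:
            reached = True
        elif reached:
            return False
    return True

def number_of_flashes(prefix):
    """Evaluate #F as in (30) from a finite prefix.

    With entries numbered from 1, #F = (index of first cemetery entry) - 1,
    which equals the 0-based position of the first cemetery entry.
    Returns None if the prefix contains no cemetery entry, since then
    #F (possibly infinite) cannot be decided from the prefix alone.
    """
    for position, z in enumerate(prefix):
        if z is CEMETERY:
            return position
    return None
```

For instance, a prefix consisting of two space-time points followed by cemetery entries yields two flashes, and a prefix that starts in the cemetery state corresponds to the empty sequence in (28).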
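For expressing the distribution of the flashes, let us introduce the following abbreviations:
\[ W_t = e^{-\frac{1}{2} \Lambda(\mathbb{R}^3) t - \frac{i}{\hbar} H t} \quad \text{for } t \ge 0, \qquad W_t = 0 \quad \text{for } t < 0 \tag{31} \]
and
\[ L_n = L_n(x_1, i_1, \ldots, x_n, i_n) = \Lambda_{i_n}(q_n)^{1/2}\, W_{t_n - t_{n-1}}\, \Lambda_{i_{n-1}}(q_{n-1})^{1/2}\, W_{t_{n-1} - t_{n-2}} \cdots \Lambda_{i_1}(q_1)^{1/2}\, W_{t_1 - t_0}, \tag{32} \]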
in analogy to (20). In terms of the initial wave function $\psi = \psi_{t_0}$ and the flashes $Z_1, \ldots, Z_n$ (possibly $n = 0$) between $t_0$ and $t$, $\psi_t$ can be expressed as
\[ \psi_t = \frac{W_{t - t_n} L_n(Z_1, \ldots, Z_n)\psi}{\|W_{t - t_n} L_n(Z_1, \ldots, Z_n)\psi\|}. \tag{33} \]
To see this, note that the right-hand side changes according to (17) at a flash and according to (23) in between, as one can confirm by computing its time derivative. Note that the nonlinear term in (23) arises from the normalizing factor in the denominator. Indeed, we could take (33) as the definition of $\psi_t$ and derive the nonlinear Schrödinger equation (23) from it.

By formal computation, one obtains from (16), (17) and (23) the following expressions: The probability that no flash occurs up to time $t_0 + t$ with $t > 0$ is
\[ \mathbb{P}(\#F = 0 \text{ or } T_1 > t_0 + t) = \langle \psi | W_t^* W_t \psi \rangle; \tag{34} \]
the probability that no flash occurs at all is
\[ \mathbb{P}(\#F = 0) = \Bigl\langle \psi \Big| \lim_{t \to \infty} W_t^* W_t\, \psi \Bigr\rangle; \tag{35} \]
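the probability that at least one flash occurs, and occurs in $dx_1$ with label $i_1$ is
\begin{align}
\mathbb{P}(\#F \ge 1, Z_1 \in dz_1) &= \langle \psi | W_{t_1 - t_0}^* \Lambda_{i_1}(q_1) W_{t_1 - t_0} \psi \rangle\, dx_1 \tag{36} \\
&= \langle \psi | L_1(z_1)^* L_1(z_1) \psi \rangle\, dx_1; \tag{37}
\end{align}
the probability that at least $n$ flashes occur, and occur in $dx_1, \ldots, dx_n$ with particular labels is
\[ \mathbb{P}(\#F \ge n, Z_1 \in dz_1, \ldots, Z_n \in dz_n) = \langle \psi | L_n^* L_n \psi \rangle\, dx_1 \cdots dx_n; \tag{38} \]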
the probability $\mathbb{P}(\#F \ge n)$ that at least $n$ flashes occur can be obtained from this by integrating out $dx_1, \ldots, dx_n$ and summing over $i_1, \ldots, i_n$, but for this quantity there is no simple formula (without integral) in terms of $\Lambda$’s and $W$’s. The formula (38) is the basis for the fact that the joint distribution of the flashes is given by a history POVM (see Theorem 3 in Sec. 3.4.1 below). The probability that the process stops after $n$ flashes, which occur at particular space-time points and particular labels is
\[ \mathbb{P}(\#F = n, Z_1 \in dz_1, \ldots, Z_n \in dz_n) = \Bigl\langle \psi \Big| L_n^* \Bigl( \lim_{t \to \infty} W_t^* W_t \Bigr) L_n \psi \Bigr\rangle\, dx_1 \cdots dx_n; \tag{39} \]
again, $\mathbb{P}(\#F = n)$ can be obtained by integration.
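The no-flash probabilities (34) and (35) and the between-flashes evolution can be checked on a toy model. The sketch below makes several assumptions that go beyond the text: the Hilbert space is two-dimensional, $H$ and the total flash rate operator $\Lambda(\mathbb{R}^3)$ are both diagonal in the same basis (so that $W_t$ in (31) is an explicit diagonal matrix), and $\hbar = 1$; the function names are hypothetical. It is meant as an illustration, not as a general implementation.

```python
import cmath
import math

HBAR = 1.0  # toy units (assumption)

def w_apply(psi, t, energies, rates):
    """Apply W_t of (31) when H = diag(energies) and Lambda(R^3) = diag(rates)."""
    if t < 0:
        return [0j for _ in psi]
    return [cmath.exp(-0.5 * g * t - 1j * e * t / HBAR) * c
            for c, e, g in zip(psi, energies, rates)]

def norm_sq(vec):
    return sum(abs(c) ** 2 for c in vec)

def survival_probability(psi, t, energies, rates):
    """<psi | W_t^* W_t psi>: probability of no flash up to time t0 + t, cf. (34)."""
    return norm_sq(w_apply(psi, t, energies, rates))

def psi_between_flashes(psi, t, energies, rates):
    """Normalized W_t psi, i.e. (33) with n = 0."""
    vec = w_apply(psi, t, energies, rates)
    n = math.sqrt(norm_sq(vec))
    return [c / n for c in vec]

def rhs_eq23(psi, energies, rates):
    """Right-hand side of (23) solved for d(psi)/dt:
    -(i/hbar) H psi - (1/2) Lambda psi + (1/2) <psi|Lambda psi> psi."""
    mean_rate = sum(g * abs(c) ** 2 for c, g in zip(psi, rates))
    return [(-1j * e / HBAR - 0.5 * g + 0.5 * mean_rate) * c
            for c, e, g in zip(psi, energies, rates)]
```

If one diagonal entry of the total flash rate operator vanishes, the survival probability tends to the weight of $\psi$ on that component rather than to zero, which illustrates why a positive probability of finitely many flashes, and hence the enlarged history space (28), has to be allowed once (7) is dropped. A finite-difference derivative of `psi_between_flashes` can be compared with `rhs_eq23` to confirm that the normalized expression (33) evolves according to (23) in this commuting toy case.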
Example 4. The following version of GRWf, corresponding to a non-relativistic quantum field theory (i.e. quantum mechanics with a variable number of particles), was proposed in [69]. The labels correspond to different particle species (electron, quark, . . .); $\mathscr{H}$ is a product of spaces corresponding to the particle species,
\[ \mathscr{H} = \bigotimes_{i \in \mathscr{L}} \mathscr{H}_i, \tag{40} \]
where $\mathscr{H}_i$ is a copy of the (bosonic or fermionic) Fock space, i.e.
\[ \mathscr{H}_i = \bigoplus_{N=0}^{\infty} \mathscr{H}_i^{(N)} = \bigoplus_{N=0}^{\infty} S_\pm L^2(\mathbb{R}^3, \mathbb{C}^{2s_i+1})^{\otimes N} \tag{41} \]
with $s_i \in \{0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots\}$ the spin of species $i$; a typical Hamiltonian consists of a contribution like (9) on every $\mathscr{H}_i^{(N)}$ plus contributions creating and annihilating particles (see, e.g., [19] for concrete examples); finally, the flash rate operators $\Lambda_i(q)$ are given by (10) on every $\mathscr{H}_i^{(N)}$. As a consequence,
\[ \int_{\mathbb{R}^3} \Lambda_i(q)\,dq = \lambda \hat{N}_i, \tag{42} \]
where $\hat{N}_i$ is the particle number operator of species $i$,
\[ \hat{N}_i \psi = N \psi \quad \text{for } \psi \in \mathscr{H}_i^{(N)} \otimes \bigotimes_{i' \ne i} \mathscr{H}_{i'}, \tag{43} \]
which is unbounded. Indeed, $\Lambda_i(q)$, defined by (10) on every $N$-particle sector $\mathscr{H}_i^{(N)}$, is nothing but the particle number density operator $\hat{N}_i(q)$ of species $i$ (which actually is an operator-valued distribution) convolved with the Gaussian of width $\sigma$. Conversely, $\Lambda_i(q)$ could be defined as $\hat{N}_i(q)$ convolved with the Gaussian of width $\sigma$, also if a given quantum field theory is not of the structure (40), (41).
2.4. Time-dependent operators

Suppose now that the relevant operators are explicitly time dependent: the Hamiltonian $H(t)$ and the flash rate operators $\Lambda_i(q, t)$. It is straightforward to adapt the basic equations of GRWf to this situation. We rewrite the flash rate density as
\[ r_i(q, t) = \langle \psi_t | \Lambda_i(q, t)\psi_t \rangle, \tag{44} \]
the collapse law as
\[ \psi_{T+} = \frac{\Lambda_I(Q, T)^{1/2} \psi_{T-}}{\|\Lambda_I(Q, T)^{1/2} \psi_{T-}\|}, \tag{45} \]
and the between-flashes evolution law as
\[ i\hbar \frac{d\psi_t}{dt} = H(t)\psi_t - \frac{i\hbar}{2} \Lambda(\mathbb{R}^3, t)\psi_t + \frac{i\hbar}{2} \langle \psi_t | \Lambda(\mathbb{R}^3, t)\psi_t \rangle\, \psi_t, \tag{46} \]
where
\[ \Lambda(\mathbb{R}^3, t) = \sum_{i \in \mathscr{L}} \int_{\mathbb{R}^3} \Lambda_i(q, t)\,dq. \tag{47} \]
These equations reduce to (16), (17) and (23) if $H(t)$ and $\Lambda_i(q, t)$ are constant as functions of $t$. We also write $\Lambda(z)$ instead of $\Lambda_i(q, t)$, where $z = (q, t, i)$ is a labeled flash. From the above equations, we obtain by a formal computation in analogy to (38) that
\[ \mathbb{P}(\#F \ge n, Z_1 \in dz_1, \ldots, Z_n \in dz_n) = \langle \psi | L_n^* L_n \psi \rangle\, dx_1 \cdots dx_n \tag{48} \]
with
\[ L_n = L_n(z_1, \ldots, z_n) = \Lambda(z_n)^{1/2}\, W_{t_{n-1}}^{t_n}\, \Lambda(z_{n-1})^{1/2}\, W_{t_{n-2}}^{t_{n-1}} \cdots \Lambda(z_1)^{1/2}\, W_{t_0}^{t_1}, \tag{49} \]
where $W_s^t$ is defined by
\[ W_t^t = I, \qquad \frac{dW_s^t}{dt} = \Bigl( -\frac{1}{2} \Lambda(\mathbb{R}^3, t) - \frac{i}{\hbar} H(t) \Bigr) W_s^t \tag{50} \]
for $t \ge s$ and $W_s^t = 0$ for $t < s$. As a formal consequence of (50),
\[ \frac{d}{dt} \bigl( W_s^{t*} W_s^t \bigr) = -W_s^{t*}\, \Lambda(\mathbb{R}^3, t)\, W_s^t \tag{51} \]
for $t \ge s$.

2.4.1. “Gauge” freedom

There remains a certain freedom in the choice of the operators $H(t)$ and $\Lambda(z)$ used to define the theory. A different choice $\tilde{H}(t)$ and $\tilde{\Lambda}(z)$ of Hamiltonian and flash rate operators can lead to the same history POVM $G(\cdot)$ as $H(t)$ and $\Lambda(z)$, and thus to the same probability distribution of the flashes. In this case, we regard $\tilde{H}(t)$ and $\tilde{\Lambda}(z)$ as physically equivalent to $H(t)$ and $\Lambda(z)$, that is, as a different representation of the same physical theory. See [2] for a discussion of the concept of physical equivalence. For this conclusion, it plays a role that we regard the flashes as the primitive ontology, and the theory as defined by defining the distribution of the flashes. Had we regarded the wave function as the primitive ontology, then a change in $H(t)$ would not have been admissible, as it leads to a different function for $\psi_t$. One can say that in GRWf theories we care about wave functions only insofar as we care about flashes, and that is why two wave functions, $\psi_t$ and $\tilde{\psi}_t$, arising from different choices of $H(t)$ and $\Lambda(z)$, can be regarded as just two representations of the same physical evolution, mathematically represented by the probability distribution $\mathbb{P}(\cdot) = \langle \psi_0 | G(\cdot)\psi_0 \rangle$ of the flashes.

Note the similarity between the freedom about $H(t)$ and $\Lambda(z)$ and the gauge invariance of (classical) electrodynamics: Different choices of vector potentials $A_\mu$ are physically equivalent, i.e. different representations of the same reality, because
we regard not $A_\mu$ but the field strength $F_{\mu\nu}$ as the primitive ontology. (Alternatively, one may regard only the particle trajectories as the primitive ontology, and these depend only on the field strength $F_{\mu\nu}$.)

Returning to $H(t)$ and $\Lambda(z)$, here is a way of constructing $\tilde{H}(t)$ and $\tilde{\Lambda}(z)$ that lead to the same history POVM $G(\cdot)$. Let $U_s^t$ for $s, t \in \mathbb{R}$ be a family of unitary operators such that
\[ U_t^t = I, \qquad U_s^t U_r^s = U_r^t \tag{52} \]
for all $r, s, t \in \mathbb{R}$. Let us assume $t_0 = 0$ for ease of notation. Note that all $U_s^t$ are determined by the subfamily $(U_0^t)_{t \in \mathbb{R}}$ because, by (52), $U_s^t = U_0^t (U_0^s)^{-1}$. Now set
\[ \tilde{H}(t) = U_t^0 H(t) U_0^t + i\hbar\, \frac{dU_t^0}{dt}\, U_0^t \tag{53} \]
and
\[ \tilde{\Lambda}_i(q, t) = U_t^0\, \Lambda_i(q, t)\, U_0^t. \tag{54} \]
Then
\begin{align}
\tilde{\Lambda}(\mathbb{R}^3, t) &= U_t^0\, \Lambda(\mathbb{R}^3, t)\, U_0^t, \tag{55} \\
\tilde{W}_s^t &= U_t^0\, W_s^t\, U_0^s, \tag{56} \\
\tilde{L}_n &= U_{t_n}^0 L_n, \tag{57}
\end{align}
and thus
\[ \tilde{G}_n(\cdot) = G_n(\cdot), \tag{58} \]
as we have claimed. This can be understood in the following way. Imagine there is a separate Hilbert space $\mathscr{H}_t$ for every time $t$. Then there are many ways of identifying $\mathscr{H}_s$ with $\mathscr{H}_t$ for all $s, t \in \mathbb{R}$, each corresponding to a family of unitary isomorphisms $V_s^t : \mathscr{H}_s \to \mathscr{H}_t$ with $V_t^t = I$ and $V_s^t V_r^s = V_r^t$. Two such families $V, \hat{V}$ differ by a family of unitary operators $U_s^t = V_t^0 \hat{V}_s^t V_0^s$ on $\mathscr{H}_0$ satisfying (52), and thus, if we started with a tacit identification of all $\mathscr{H}_t$’s, every other way of identifying them is represented by a family $U_s^t$. Also in ordinary quantum mechanics different ways of identifying the $\mathscr{H}_t$’s are known: the Schrödinger picture and the Heisenberg picture. In the Heisenberg picture, $\mathscr{H}_s$ and $\mathscr{H}_t$ are identified along the unitary time evolution, so that $\psi_s$ and $\psi_t$ are identified as the same vector; in the Schrödinger picture, $\mathscr{H}_s$ and $\mathscr{H}_t$ are so identified that the position operators (represented by a projection-valued measure on $\mathbb{R}^3$) are time-independent. Also in GRWf, we can speak of a Schrödinger picture and a Heisenberg picture. In GRWf, a role similar to that of the position operators in ordinary quantum mechanics is played by the flash rate operators $\Lambda(q)$. If we assume them to be time-independent, as we did in Sec. 2.1, then this entails a particular way of identifying the $\mathscr{H}_t$’s; if we drop this assumption, as we do in this section, then other identifications are possible.
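Returning to the step from the transformed operators to the invariance of $G_n$ claimed in Sec. 2.4.1, the telescoping can be written out explicitly. The following is a sketch at the same formal level as the rest of this chapter. It assumes $t_0 = 0$ as in the text, and it uses that conjugation by a unitary commutes with taking the positive square root, so that $\tilde{\Lambda}(z_k)^{1/2} = U_{t_k}^0 \Lambda(z_k)^{1/2} U_0^{t_k}$; the intermediate factors cancel because $U_0^{t_k} U_{t_k}^0 = U_{t_k}^{t_k} = I$ by (52), and $U_0^{t_0} = U_0^0 = I$.

```latex
\begin{align*}
\tilde{L}_n
 &= \tilde{\Lambda}(z_n)^{1/2}\, \tilde{W}_{t_{n-1}}^{t_n} \cdots
    \tilde{\Lambda}(z_1)^{1/2}\, \tilde{W}_{t_0}^{t_1} \\
 &= U_{t_n}^0 \Lambda(z_n)^{1/2} \bigl( U_0^{t_n} U_{t_n}^0 \bigr) W_{t_{n-1}}^{t_n}
    \bigl( U_0^{t_{n-1}} U_{t_{n-1}}^0 \bigr) \Lambda(z_{n-1})^{1/2} \cdots
    \Lambda(z_1)^{1/2} \bigl( U_0^{t_1} U_{t_1}^0 \bigr) W_{t_0}^{t_1} U_0^{t_0} \\
 &= U_{t_n}^0 L_n , \\
\tilde{L}_n^* \tilde{L}_n
 &= L_n^* \bigl( U_{t_n}^0 \bigr)^* U_{t_n}^0 L_n = L_n^* L_n .
\end{align*}
```

Since the density of $G_n$ with respect to $dx_1 \cdots dx_n$ is $L_n^* L_n$ by (48), the last line is exactly the statement that both choices of operators yield the same marginals of the history POVM.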
Heisenberg picture. The analog of the Heisenberg picture in GRWf is characterized by the condition
\[ \tilde{H}(t) = 0, \tag{59} \]
so that the Hamiltonian contribution to the evolution of the state vector $\psi_t$ disappears. It can be obtained through the choice
\[ \frac{dU_0^t}{dt} = -\frac{i}{\hbar} H(t)\, U_0^t. \tag{60} \]
(Note, however, a fine conceptual difference from the Heisenberg picture in ordinary quantum mechanics: In ordinary quantum mechanics, it is the observables that evolve, while in GRWf, which is not defined in terms of observables, it is the flash rate operators.) In the Heisenberg picture, we have that
\[ \frac{dW_s^t}{dt} = -\frac{1}{2} \Lambda(\mathbb{R}^3, t)\, W_s^t \tag{61} \]
for $t \ge s$, with $W_t^t = I$ and $W_s^t = 0$ for $t < s$. (One might be tempted to think that (61), as it does not contain the skew-adjoint factor $iH(t)$, implies that all $W_s^t$ are self-adjoint, but this is generically not the case; it is the case when all $\Lambda(\mathbb{R}^3, t)$ commute with each other.)

Square-root picture. This “gauge” is characterized by the condition
\[ \tilde{W}_0^t \ge 0. \tag{62} \]
In fact, since in every “gauge” $\tilde{W}_0^{t*} \tilde{W}_0^t = W_0^{t*} W_0^t$ by (56), $\tilde{W}$ can be expressed through $W$ according to
\[ \tilde{W}_0^t = (W_0^{t*} W_0^t)^{1/2}. \tag{63} \]
This relation gives the “square-root picture” its name. This picture can be obtained through the choice$^b$
\[ U_0^t = W_0^t\, (W_0^{t*} W_0^t)^{-1/2}. \tag{64} \]
The advantage of the square-root picture is that $\tilde{W}_0^t$ can be easily computed by (63) if $W_0^{t*} W_0^t$ is given. Since $\tilde{W}_s^t$ need not be positive (nor self-adjoint) for $s \ne 0$, the square-root picture only simplifies the rightmost term in (49). But this will be different in Sec. 2.5 when we allow flash rate operators to depend on previous flashes.

$^b$ The expression (64) is indeed rigorously defined and unitary if $W_0^t$ is bijective. To see that it is well-defined, note that if $W_0^t$ is bijective then so are $W_0^{t*}$ and $W_0^{t*} W_0^t$, and thus also $T := (W_0^{t*} W_0^t)^{1/2}$ (as the bijectivity of $T^2$ implies that of $T$). In particular, $U_0^t$ is bijective as the product of the bijective operators $W_0^t$ and $T^{-1}$. To see that $U_0^t$ is unitary, note that its adjoint is its inverse, $U_0^{t*} U_0^t = T^{-1} T^2 T^{-1} = I$ (as $T$ and thus $T^{-1}$ are self-adjoint). Finally note that $U_0^0 = I$ by definition, and that (56) yields (63).
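The square-root gauge (64) and the argument of footnote b can be tried out on a small example. The sketch below assumes a finite-dimensional toy setting that is not in the text: $W_0^t$ is replaced by an arbitrary invertible $2 \times 2$ complex matrix, and the positive square root is computed with the closed formula $\sqrt{A} = (A + \sqrt{\det A}\, I) / \sqrt{\operatorname{tr} A + 2\sqrt{\det A}}$, which holds for positive definite $2 \times 2$ matrices by the Cayley–Hamilton theorem. All function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def adjoint(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def inverse(a):
    d = det(a)
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]

def sqrt_positive(a):
    """Positive square root of a positive definite 2x2 matrix (closed 2x2 formula)."""
    s = math.sqrt(det(a).real)                      # sqrt(det A) = det of the root
    t = math.sqrt((a[0][0] + a[1][1]).real + 2 * s)  # trace of the root
    return [[(a[0][0] + s) / t, a[0][1] / t],
            [a[1][0] / t, (a[1][1] + s) / t]]

def square_root_gauge(w):
    """Return (U, T) with T = (W^* W)^{1/2} and U = W T^{-1}, cf. (63) and (64)."""
    t = sqrt_positive(matmul(adjoint(w), w))
    u = matmul(w, inverse(t))
    return u, t
```

For an invertible `w`, the returned `u` is unitary and `adjoint(u)` applied to `w` gives back the positive matrix `t`, which is the finite-dimensional counterpart of the statement that (56) yields (63) in this gauge.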
2.5. General scheme of GRWf theories

The scheme of GRWf we have developed so far is this: Given the operators $H(t)$ and $\Lambda_i(q, t)$, the corresponding GRWf theory is defined by (44)–(47). This scheme can be naturally generalized in two ways.

2.5.1. Nonpositive collapse operators

First, instead of the positive operators $\Lambda_i(q, t)^{1/2}$ in the collapse law (45) we can put a collapse operator $C_i(q, t)$ which satisfies
\[ C_i(q, t)^* C_i(q, t) = \Lambda_i(q, t) \tag{65} \]
but is not necessarily positive, and not necessarily self-adjoint. That is, we replace (45) with
\[ \psi_{T+} = \frac{C_I(Q, T)\psi_{T-}}{\|C_I(Q, T)\psi_{T-}\|}. \tag{66} \]
We also write $C(z)$ instead of $C_i(q, t)$, where $z = (q, t, i)$ is a labeled flash.

2.5.2. Past-dependent operators

Second, we can allow that both the Hamiltonian $H$ and the collapse operator $C$ depend on the previous flashes,
\[ H = H(z_1, \ldots, z_n, t) = H(f_n, t), \tag{67} \]
\[ C = C_i(z_1, \ldots, z_n, q, t) = C(f_n, z). \tag{68} \]
We write $f_n := (z_1, \ldots, z_n)$ and $f_{n-1} = (z_1, \ldots, z_{n-1})$. Indeed, this situation occurs in the relativistic GRWf model, see Sec. 4. The GRWf process can then be defined, instead of by (48)–(50), by
\[ \mathbb{P}(\#F \ge n, F_n \in df_n) = \langle \psi | L_n^* L_n \psi \rangle\, dx_1 \cdots dx_n \tag{69} \]
with
\[ L_n = L_n(f_n) = C(f_n)\, W^{t_n}(f_{n-1})\, L_{n-1}(f_{n-1}), \qquad L_0(\emptyset) = I, \tag{70} \]
where $W^t(f_n)$ is defined by
\[ W^{t_n}(f_n) = I, \qquad \frac{dW^t(f_n)}{dt} = \Bigl( -\frac{1}{2} \Lambda(f_n, \mathbb{R}^3, t) - \frac{i}{\hbar} H(f_n, t) \Bigr) W^t(f_n) \tag{71} \]
for $t \ge t_n$ and $W^t(f_n) = 0$ for $t < t_n$; here,
\[ \Lambda(f_n, \mathbb{R}^3, t) = \sum_{i \in \mathscr{L}} \int_{\mathbb{R}^3} C_i(f_n, q, t)^* C_i(f_n, q, t)\, d^3q. \tag{72} \]
(It is unnecessary now to specify two times for the $W$ operator, as in the notation $W_s^t$, because now $s = t_n$, where $t_n$ is the time of the last flash in $f_n$.)
2.5.3. “Gauge” freedom once more

In addition to the gauge freedom described in Sec. 2.4.1, there is another gauge freedom when the operators $H$ and $C$ can depend on the past flashes $f = (z_1, \ldots, z_n)$, and exploiting this freedom one can ensure that all $C$’s are positive operators.

Here is a way of constructing different operators $\tilde{H}(f, t)$ and $\tilde{C}(f, z)$ that lead to the same history POVM $G(\cdot)$. The construction is the same as in Sec. 2.4.1, except that the unitaries $U_s^t$ are now allowed to depend on the past flashes $f$. That is, let $U_s^t(f)$ for $s, t \in \mathbb{R}$, $f \in \bigcup_{n=0}^{\infty} (\mathbb{R}^4 \times \mathscr{L})^n$ be an arbitrary family of unitary operators such that
\[ U_t^t(f) = I, \qquad U_s^t(f)\, U_r^s(f) = U_r^t(f) \tag{73} \]
for all $r, s, t \in \mathbb{R}$, and set (assuming $t_0 = 0$ for ease of notation)
\[ \tilde{H}(f, t) = U_t^0(f)\, H(f, t)\, U_0^t(f) + i\hbar\, \frac{dU_t^0(f)}{dt}\, U_0^t(f) \tag{74} \]
and
\[ \tilde{C}(f, z) = U_t^0(f, z)\, C(f, z)\, U_0^t(f). \tag{75} \]
It follows that
\begin{align}
\tilde{\Lambda}(f, z) &= U_t^0(f)\, \Lambda(f, z)\, U_0^t(f), \tag{76} \\
\tilde{\Lambda}(f, \mathbb{R}^3, t) &= U_t^0(f)\, \Lambda(f, \mathbb{R}^3, t)\, U_0^t(f), \tag{77} \\
\tilde{W}^t(f) &= U_t^0(f)\, W^t(f)\, U_0^{t_n}(f), \tag{78} \\
\tilde{L}_n(f) &= U_{t_n}^0(f)\, L_n(f), \tag{79}
\end{align}
for $f = (z_1, \ldots, z_n)$, and thus
\[ \tilde{G}_n(\cdot) = G_n(\cdot), \tag{80} \]
as we have claimed.

Heisenberg-plus picture. This is the special case characterized by the conditions
\[ \tilde{H}(f, t) = 0, \qquad \tilde{C}(f, z) \ge 0, \tag{81} \]
so that
\[ \tilde{C}(f, z) = \tilde{\Lambda}(f, z)^{1/2}. \tag{82} \]
(The tag “plus” indicates that the $\tilde{C}$ are positive.) It can be obtained through the particular choice of $U_0^t(f)$ defined by
\[ \frac{dU_0^t(f)}{dt} = -\frac{i}{\hbar} H(t, f)\, U_0^t(f) \tag{83} \]
for $t > t_n > \cdots > t_1 > 0$ and $f = (z_1, \ldots, z_n)$, $z_k = (q_k, t_k, i_k)$, with the initial condition $U_0^{t_n}(f)$ chosen so that
\[ U_0^0(\emptyset) = I, \qquad U_0^{t_n}(f)^*\, C(f)\, U_0^{t_n}(f_{n-1}) \ge 0 \tag{84} \]
with $f_{n-1} = (z_1, \ldots, z_{n-1})$. Indeed, $U_0^{t_n}(f)$ is determined by (84) from $C(f)$ and $U_0^{t_n}(f_{n-1})$, provided that $C(f) : \mathscr{H} \to \mathscr{H}$ is bijective (see footnote c).

Square-root-plus picture. This case is characterized by the conditions
\[ \tilde W^t(f) \ge 0, \qquad \tilde C(f, z) \ge 0 \tag{85} \]
and can be obtained through the particular choice of $U_0^t(f)$ defined by two equations, (84) and
\[ U_{t_n}^t(f) = W (W^* W)^{-1/2} \tag{86} \]
with $W = W^t(f)$ and $n = \#f$. From (84) it follows by (75) that all $\tilde C$ are positive. From (86), as in the context of (64), it follows by (78) that
\[ \tilde W^t(f) = U_0^{t_n}(f)^*\, (W^* W)^{1/2}\, U_0^{t_n}(f), \tag{87} \]
the analog of (63). As a consequence, $\tilde W^t(f)$ is positive.

2.5.4. Ways of specifying the theory

The theory can be specified by specifying all the operators $H(f, t)$ and $C(f)$. If all the $C(f)$ are positive, one can instead specify the $\Lambda(f)$ (as then $C(f) = \Lambda(f)^{1/2}$). In the Heisenberg-plus picture, one has to specify only the operators $\Lambda(f)$ since all $H(f, t) = 0$.

Let us return to the case of nonzero $H(f, t)$. One can specify, instead of $H(f, t)$, directly the $W^t(f)$ (in addition to the $C(f)$), provided they satisfy the following condition of consistency with the specified $C(f)$:
\[ \frac{d}{dt}\bigl[ W^t(f)^* W^t(f) \bigr] = -W^t(f)^*\, \Lambda(f, \mathbb{R}^3, t)\, W^t(f). \tag{88} \]
To specify the $W$ instead of the $H$ operators is analogous to specifying, in ordinary quantum mechanics, the unitary time-evolution operators $U_s^t$ instead of the Hamiltonians $H(t)$.

The wave function $\psi_t$ at time $t$ can be expressed through the $W$ operators. It depends, of course, on the flashes $f$ between $t_0$ and $t$:
\[ \psi_t = \frac{W^t(f)\, L_{\#f}(f)\, \psi}{\| W^t(f)\, L_{\#f}(f)\, \psi \|} \tag{89} \]
with $L_n$ defined by (70).

Footnote c. This follows from the fact that the operator $T := C(f)\, U_0^{t_n}(f_{n-1})$, as it is bounded and bijective, possesses a unique polar decomposition [63, Theorem 12.35] $T = UP$ as a product of a unitary $U$ and a bounded positive $P$. Now if $V$ is a unitary so that $VT$ is positive, then $VT = (T^* V^* V T)^{1/2} = (T^* T)^{1/2} = P$ and $V = U^*$. That is, $U_0^{t_n}(f) = U$.
2.6. Flashes + POVM = GRWf

We have described how theories of the GRWf type, specified in terms of the flash rate (and other) operators, give rise to a distribution of flashes given by a POVM. We now argue that essentially every theory with flash ontology in which the distribution of the flashes is given by a POVM arises from the GRWf scheme for a suitable choice of flash rate operators, and is thus a collapse theory. (A rigorous discussion is provided in Sec. 3.6.) In particular, this suggests that rGRWf can be expressed, in any coordinate system, in the GRWf scheme.

We assume that the history POVM $G(\cdot)$ is such that for every $n \in \mathbb{N}$ its marginal $G_n(\cdot)$ for the first $n$ flashes has a positive-operator-valued density function $E_n : (\mathbb{R}^4 \times \mathscr{L})^n \to \mathcal{B}(\mathscr{H})$ relative to the Lebesgue measure, i.e.
\[ G_n(A) = \int_A E_n(f_n)\, df_n, \tag{90} \]
where $f_n = (x_1, i_1, \ldots, x_n, i_n)$, $x_k \in \mathbb{R}^4$, and the notation $df_n$ means
\[ \int_A g(f_n)\, df_n = \sum_{i_1 \cdots i_n \in \mathscr{L}} \int_{\mathbb{R}^{4n}} 1_{f_n \in A}\; g(f_n)\, dx_1 \cdots dx_n. \tag{91} \]
This assumption is fulfilled for the history POVM $G(\cdot)$ of GRWf with
\[ E_n(f_n) = L_n(f_n)^* L_n(f_n). \tag{92} \]
2.6.1. Reconstructing Λ

We now explain how to reconstruct the flash rate operators from the history POVM, i.e. how to extract $\Lambda(f)$ from the operator-valued functions $E_n$, first in the square-root-plus picture, and afterwards in the Heisenberg-plus picture.

Square-root-plus picture. Set $L_0(\emptyset) = I$,
\[ W^t(f = \emptyset) = G(T_1 > t)^{1/2}, \tag{93} \]
where
\[ G(T_1 > t) = \sum_{i \in \mathscr{L}} \int_t^{\infty} ds \int_{\mathbb{R}^3} dq\; E_1(q, s, i). \tag{94} \]
Now set
\[ \Lambda(f = \emptyset, q, t, i) = W^t(\emptyset)^{-1}\, E_1(q, t, i)\, W^t(\emptyset)^{-1}. \tag{95} \]
Continue inductively along the number $n$ of flashes, setting
\[ L_n(f_n) = \Lambda(f_n)^{1/2}\, W^{t_n}(f_{n-1})\, L_{n-1}(f_{n-1}) \tag{96} \]
and
\[ W^t(f_n) = \Bigl[ L_n^*(f_n)^{-1} \sum_{i \in \mathscr{L}} \int_t^{\infty} ds \int_{\mathbb{R}^3} dq\; E_{n+1}(f_n, q, s, i)\; L_n(f_n)^{-1} \Bigr]^{1/2}. \tag{97} \]
Now set
\[ \Lambda(f_n, z) = W^t(f_n)^{-1}\, L_n^*(f_n)^{-1}\, E_{n+1}(f_n, z)\, L_n(f_n)^{-1}\, W^t(f_n)^{-1}. \tag{98} \]
It then follows formally that
\[ L_n^*(f_n)\, L_n(f_n) = E_n(f_n). \tag{99} \]
Theorem 5 in Sec. 3.6 provides conditions under which this computation actually works.

Heisenberg-plus picture. Set $L_0(\emptyset) = I$. We determine $\Lambda(f = \emptyset, q, t, i)$ for $t > 0$ (and all $q \in \mathbb{R}^3$, $i \in \mathscr{L}$) by simultaneously solving
\[ \frac{dW^t(\emptyset)}{dt} = -\frac{1}{2}\, \Lambda(\emptyset, \mathbb{R}^3, t)\, W^t(\emptyset) \tag{100} \]
with initial datum
\[ W^{t_0}(\emptyset) = I, \tag{101} \]
and
\[ \Lambda_i(f = \emptyset, q, t) = W^t(\emptyset)^{*-1}\, E_1(q, t, i)\, W^t(\emptyset)^{-1}. \tag{102} \]
That is,
\[ \frac{dW^t(\emptyset)}{dt} = -\frac{1}{2}\, W^t(\emptyset)^{*-1} \sum_{i \in \mathscr{L}} \int_{\mathbb{R}^3} dq\; E_1(q, t, i). \tag{103} \]
Now we proceed by induction along the number of flashes. Suppose that for all sequences of up to $n-1$ flashes, $f_{n-1} = (z_1, \ldots, z_{n-1})$, the operators $\Lambda_i(q, t, f_{n-1})$, $W^t(f_{n-1})$, and $L_{n-1}(f_{n-1})$ are known for all $i \in \mathscr{L}$, $q \in \mathbb{R}^3$, and $t \ge t_n$. For arbitrary $f_n = (z_1, \ldots, z_n)$, set
\[ L_n(f_n) = \Lambda(f_n)^{1/2}\, W^{t_n}(f_{n-1})\, L_{n-1}(f_{n-1}). \tag{104} \]
Solve simultaneously
\[ \frac{dW^t(f_n)}{dt} = -\frac{1}{2}\, \Lambda(f_n, \mathbb{R}^3, t)\, W^t(f_n) \tag{105} \]
with initial datum
\[ W^{t_n}(f_n) = I, \tag{106} \]
and
\[ \Lambda(f_n, z) = \bigl( L_n^*(f_n)\, W^t(f_n)^* \bigr)^{-1} E_{n+1}(f_n, z)\, \bigl( W^t(f_n)\, L_n(f_n) \bigr)^{-1}. \tag{107} \]
Then (70) and (71) are satisfied by construction, and $L_n^*(f_n) L_n(f_n) = E_n(f_n)$ by (107).
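To illustrate the reconstruction, it may help to look at the degenerate case of a one-dimensional Hilbert space with no spatial variable and no labels, so that every operator is a number and $E_1$ is just the probability density of the first flash time. In that case (93)-(95) say that the flash rate is the hazard rate of this density, i.e. density divided by survival probability. The Python sketch below checks this for the toy density $E_1(t) = t e^{-t}$ with $t_0 = 0$; the density, the function names and the quadrature parameters are assumptions chosen for illustration and are not part of the scheme.

```python
import math

def density(t):
    """Toy density of the first flash time, E_1(t) = t * exp(-t) for t >= 0 (assumed)."""
    return t * math.exp(-t)

def survival(t, upper=60.0, steps=50_000):
    """Scalar analog of (94): G(T_1 > t), the integral of the density over (t, infinity).

    Midpoint rule on [t, upper]; the tail beyond `upper` is negligible for this density.
    """
    if t >= upper:
        return 0.0
    h = (upper - t) / steps
    return h * sum(density(t + (k + 0.5) * h) for k in range(steps))

def flash_rate(t):
    """Scalar analog of (93) and (95): Lambda = W^{-1} E_1 W^{-1} with W = G(T_1 > t)^{1/2}."""
    w = math.sqrt(survival(t))
    return density(t) / (w * w)

if __name__ == "__main__":
    for t in (0.5, 1.0, 3.0):
        # For this density the survival probability is (1 + t) exp(-t),
        # so the reconstructed rate should be close to t / (1 + t).
        print(t, flash_rate(t), t / (1.0 + t))
```

In the same scalar setting the Heisenberg-plus equations (100)-(102) give $d(W^2)/dt = -E_1$, so that $W^2$ is again the survival probability and both pictures yield the same rate, as one would expect when all operators commute.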
3. Rigorous Treatment of the GRWf Scheme

In this chapter we repeat the considerations of Chap. 2 in a rigorous treatment; here we provide the exact conditions under which our constructions work and the point processes exist.

3.1. Weak integrals

Let $\mathcal{B}(\mathscr{H})$ denote the space of bounded operators on the Hilbert space $\mathscr{H}$. We say that an operator-valued function $\Lambda : (M, \mathcal{A}) \to \mathcal{B}(\mathscr{H})$ is weakly measurable if for every $\psi \in \mathscr{H}$ the function $f_\psi : M \to \mathbb{C}$, defined by $f_\psi(q) = \langle \psi | \Lambda(q) \psi \rangle$, is Borel measurable. In that case, also $q \mapsto \langle \phi | \Lambda(q) \psi \rangle$ is Borel measurable because, using polarization,
\[ \langle \phi | \Lambda(q) \psi \rangle = \frac{1}{4} \bigl( f_{\phi+\psi}(q) - f_{\phi-\psi}(q) - i f_{\phi+i\psi}(q) + i f_{\phi-i\psi}(q) \bigr). \tag{108} \]
Moreover, also the adjoint $q \mapsto \Lambda(q)^*$ is weakly measurable.

Let $\mu$ be a σ-finite measure on $(M, \mathcal{A})$. We understand the expression $\int \Lambda(q)\, \mu(dq)$ as a weak integral defined by
\[ T = \int \Lambda(q)\, \mu(dq) \;:\Leftrightarrow\; \forall \psi \in \mathscr{H} : \langle \psi | T \psi \rangle = \int \langle \psi | \Lambda(q) \psi \rangle\, \mu(dq). \tag{109} \]
Throughout this paper, all integrals over operators are weak integrals. (Another concept of integration of Banach-space-valued functions is the Bochner integral [74], which is not suitable for our purposes since relevant examples of flash rate operators $\Lambda(q)$, such as (6), are weakly integrable but not Bochner integrable: Bochner integrability requires $\int \|\Lambda(q)\|\, \mu(dq) < \infty$, while, for example, for the $\Lambda(q)$ of the original GRW model, given by (6), $\|\Lambda(q)\| = \lambda/(2\pi\sigma^2)^{3/2} = \text{const.}$ for all $q \in M = \mathbb{R}^3$, and $\mu$ is the Lebesgue measure, so that in fact $\int \|\Lambda(q)\|\, \mu(dq) = \infty$.)

Note that $T$ need not exist (for example, $\Lambda(q) = I$ for all $q \in \mathbb{R}^3$), but if it exists then it is unique, as $T$ is determined by the values $\langle \psi | T \psi \rangle$. Moreover, if $T$ exists then
\[ \langle \phi | T \psi \rangle = \int \langle \phi | \Lambda(q) \psi \rangle\, \mu(dq) \tag{110} \]
by (108); in particular, $q \mapsto \langle \phi | \Lambda(q) \psi \rangle$ is (absolutely) integrable. We can guarantee the existence of $T$ in a special case:

Lemma 1. If $\Lambda : M \to \mathcal{B}(\mathscr{H})$ is weakly measurable and $\Lambda(q)$ is positive for every $q \in M$ then
\[ S = \Bigl\{ \psi \in \mathscr{H} : \int \langle \psi | \Lambda(q) \psi \rangle\, \mu(dq) < \infty \Bigr\} \tag{111} \]
is a subspace, and
\[ B(\phi, \psi) = \int \langle \phi | \Lambda(q) \psi \rangle\, \mu(dq) \qquad \forall \phi, \psi \in S \tag{112} \]
defines a positive Hermitian sesquilinear form on $S$. Moreover, if $B$ is bounded and $S = \mathscr{H}$ then, by the Riesz lemma, there is a positive operator $T \in \mathcal{B}(\mathscr{H})$ such that $B(\phi, \psi) = \langle \phi | T \psi \rangle$.

The proof of this lemma, and those of all other lemmas in this subsection, are included in Appendix A.1.

Lemma 2. Let $S$ be a dense subspace of $\mathscr{H}$, $T \in \mathcal{B}(\mathscr{H})$, $\Lambda : M \to \mathcal{B}(\mathscr{H})$ weakly measurable and $\Lambda(q)$ positive for every $q \in M$. If the equation
\[ \langle \psi | T \psi \rangle = \int_M \langle \psi | \Lambda(q) \psi \rangle\, \mu(dq) \tag{113} \]
is true for all $\psi \in S$ then it is true for all $\psi \in \mathscr{H}$.

In other words, if (113) holds on $S$ then $T = \int \Lambda(q)\, \mu(dq)$.

Below we collect some lemmas about weak measurability. Most of the proofs (see Appendix A.1) were provided to the author by Reiner Schätzle (Tübingen).

Lemma 3. Let $\{\phi_n : n \in \mathbb{N}\}$ be an orthonormal basis of $\mathscr{H}$. The function $q \mapsto \Lambda(q)$ is weakly measurable if and only if for all $n, m \in \mathbb{N}$, $q \mapsto \Lambda_{nm}(q) := \langle \phi_n | \Lambda(q) \phi_m \rangle$ is measurable.

Lemma 4. If $q \mapsto \Lambda(q)$ is weakly measurable and $R$, $S$, and $T = \int \Lambda(q)\, \mu(dq)$ are bounded operators then $q \mapsto R \Lambda(q) S$ is weakly measurable, and
\[ RTS = \int R \Lambda(q) S\, \mu(dq). \tag{114} \]

Lemma 5. If $\Lambda, \Lambda' : M \to \mathcal{B}(\mathscr{H})$ are both weakly measurable then so is their product, $q \mapsto \Lambda(q) \Lambda'(q)$.

Lemma 6. If $\Lambda : M \to \mathcal{B}(\mathscr{H})$ is weakly measurable and every $\Lambda(q)$ is self-adjoint then $q \mapsto \|\Lambda(q)\|$ is measurable.

Lemma 7. If $\Lambda : M \to \mathcal{B}(\mathscr{H})$ is weakly measurable and $\Lambda(q)$ is positive and bijective for every $q \in M$ then $q \mapsto \Lambda(q)^{-1}$ is weakly measurable.

Lemma 8. If $\Lambda : M \to \mathcal{B}(\mathscr{H})$ is weakly measurable and $\Lambda(q) \ge 0$ for every $q \in M$ then $q \mapsto \Lambda(q)^{1/2}$ is weakly measurable.

3.2. POVMs

A relevant mathematical concept for GRW theories is that of POVM (positive-operator-valued measure). In this section, we recall the definition of POVM and a theorem about POVMs that we need, an analog of the Kolmogorov extension theorem [73].

Definition 1. A POVM (positive-operator-valued measure) on the measurable space $(\Omega, \mathcal{A})$ acting on $\mathscr{H}$ is a mapping $G : \mathcal{A} \to \mathcal{B}(\mathscr{H})$ from a σ-algebra $\mathcal{A}$ on the
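set $\Omega$ such that (i) $G(\Omega) = I$, (ii) $G(A) \ge 0$ for every $A \in \mathcal{A}$, and (iii) (weak σ-additivity) for any sequence of pairwise disjoint sets $A_1, A_2, \ldots \in \mathcal{A}$
\[ G\Bigl( \bigcup_{i=1}^{\infty} A_i \Bigr) = \sum_{i=1}^{\infty} G(A_i), \tag{115} \]
where the sum on the right-hand side converges weakly, i.e. $\sum_i \langle \psi | G(A_i) \psi \rangle$ converges, for every $\psi \in \mathscr{H}$, to $\langle \psi | G(\cup_i A_i) \psi \rangle$.

If $G$ is a POVM on $(\Omega, \mathcal{A})$ and $\psi \in \mathscr{H}$ with $\|\psi\| = 1$, then $A \mapsto \langle \psi | G(A) \psi \rangle$ is a probability measure on $(\Omega, \mathcal{A})$.

We quote a theorem that we need from [73] (see there for the proof), an analog of the Kolmogorov extension theorem for POVMs. Recall that a Borel space is a measurable space isomorphic to a Borel subset of $[0, 1]$; in particular, any Polish space with its Borel σ-algebra is a Borel space [47].

Theorem 1. Let $(M, \mathcal{A})$ be a Borel space and $G_n(\cdot)$, for every $n \in \mathbb{N}$, a POVM on $(M^n, \mathcal{A}^{\otimes n})$. If the family $G_n(\cdot)$ satisfies the consistency property
\[ G_{n+1}(A \times M) = G_n(A) \qquad \forall A \in \mathcal{A}^{\otimes n} \tag{116} \]
then there exists a unique POVM $G(\cdot)$ on $(M^{\mathbb{N}}, \mathcal{A}^{\otimes \mathbb{N}})$ (where $\mathcal{A}^{\otimes \mathbb{N}}$ is the σ-algebra generated by the cylinder sets) such that for all $n \in \mathbb{N}$ and all sets $A \in \mathcal{A}^{\otimes n}$,
\[ G_n(A) = G(A \times M^{\mathbb{N}}). \tag{117} \]
Moreover, for every $\psi \in \mathscr{H}$ with $\|\psi\| = 1$ there exists a unique probability measure $\mu^\psi$ on $(M^{\mathbb{N}}, \mathcal{A}^{\otimes \mathbb{N}})$ such that for all $n \in \mathbb{N}$ and all sets $A \in \mathcal{A}^{\otimes n}$, $\mu^\psi(A \times M^{\mathbb{N}}) = \langle \psi | G_n(A) \psi \rangle$, and in fact $\mu^\psi(\cdot) = \langle \psi | G(\cdot) \psi \rangle$.

3.3. The simplest case of GRWf

Let $H$ be a (possibly unbounded) self-adjoint operator on the separable Hilbert space $\mathscr{H}$. Let $(Q, \mathcal{A}_Q)$ be a Borel space and $\mu$ a σ-finite measure on $(Q, \mathcal{A}_Q)$; $Q$ plays the role of physical space, which in Sec. 2.1 we took to be $Q = \mathbb{R}^3$ with $\mathcal{A}_Q$ the Borel σ-algebra and $\mu$ the Lebesgue measure.

Assumption 1. For every $q \in Q$, $\Lambda(q)$ is a bounded positive operator, $\Lambda : Q \to \mathcal{B}(\mathscr{H})$ is weakly measurable, and
\[ \int_Q \Lambda(q)\, \mu(dq) = \lambda I \]
for a constant $\lambda > 0$.

Let $\mu_{\mathrm{Leb}}$ denote the Lebesgue measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$.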
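As a quick plausibility check of the normalization in Assumption 1, one may consider a one-dimensional analog of the Gaussian flash rate operators of the original GRW model, taking $\Lambda(q)$ to be multiplication by $\lambda (2\pi\sigma^2)^{-1/2} \exp(-(x-q)^2/2\sigma^2)$ on $L^2(\mathbb{R})$. For multiplication operators the condition $\int \Lambda(q)\, dq = \lambda I$ amounts to the kernel integrating to $\lambda$ over $q$ at every position $x$. The Python sketch below verifies this numerically; the one-dimensional setting, the values of `LAM` and `SIGMA`, and the function names are assumptions made for illustration.

```python
import math

LAM = 2.0     # illustrative total flash rate lambda (assumed value)
SIGMA = 0.5   # illustrative localization width sigma (assumed value)

def rate_kernel(x, q):
    """Multiplication kernel of the toy Lambda(q) at position x (1D Gaussian)."""
    norm = LAM / math.sqrt(2.0 * math.pi * SIGMA ** 2)
    return norm * math.exp(-((x - q) ** 2) / (2.0 * SIGMA ** 2))

def total_rate(x, half_width=10.0, steps=20_000):
    """Midpoint-rule approximation of the integral of rate_kernel(x, q) over q.

    The integration window is x +/- half_width * SIGMA; the Gaussian tails
    outside this window are negligible.
    """
    a = x - half_width * SIGMA
    h = 2.0 * half_width * SIGMA / steps
    return h * sum(rate_kernel(x, a + (k + 0.5) * h) for k in range(steps))

if __name__ == "__main__":
    for x in (-1.0, 0.0, 0.3):
        # Each value should be close to LAM, independently of x.
        print(x, total_rate(x))
```

Note that the operator norm of this toy $\Lambda(q)$ equals the peak value of the kernel and does not depend on $q$, so its integral over $q$ diverges; this is the same mechanism by which weak integrability holds while Bochner integrability fails in Sec. 3.1.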
Definition 2. Under Assumption 1, a random variable
\[ F = (X_1, X_2, \ldots) = ((Q_1, T_1), (Q_2, T_2), \ldots) \]
with values in $(\Omega, \mathcal{A}) = ((Q \times \mathbb{R})^{\mathbb{N}}, (\mathcal{A}_Q \otimes \mathcal{B}(\mathbb{R}))^{\otimes \mathbb{N}})$ is a GRWf process with Hamiltonian $H$, flash rate operators $\Lambda(q)$, initial time $t_0$ and initial state vector $\psi$ if for every $n \in \mathbb{N}$ the joint distribution of $X_1, \ldots, X_n$ is absolutely continuous relative to $(\mu \otimes \mu_{\mathrm{Leb}})^{\otimes n}$ on $(Q \times \mathbb{R})^n$ with density $\langle \psi | L_n^* L_n \psi \rangle$, where $L_n(x_1, \ldots, x_n)$ is given by (15).

Theorem 2. Under Assumption 1, there exists a GRWf process for every initial time $t_0$ and every initial state vector $\psi \in \mathscr{H}$ with $\|\psi\| = 1$. Its distribution is unique and of the form $\langle \psi | G(\cdot) \psi \rangle$ for a suitable history POVM $G(\cdot)$ on $((Q \times \mathbb{R})^{\mathbb{N}}, (\mathcal{A}_Q \otimes \mathcal{B}(\mathbb{R}))^{\otimes \mathbb{N}})$.

A crucial step towards proving Theorem 2 is the following lemma.

Lemma 9. Set $L_0 = I$. Under Assumption 1, for all $n \in \mathbb{N}$, $L_n \in \mathcal{B}(\mathscr{H})$ is well defined, $(x_1, \ldots, x_n) \mapsto L_n^* L_n$ is weakly measurable, and
\[ \int_{\mathbb{R}} dt_n \int_Q \mu(dq_n)\, L_n^* L_n = L_{n-1}^* L_{n-1}. \tag{118} \]

Proof. Since $H$ is self-adjoint, the expression $e^{-iHt/\hbar}$ defines a unitary operator. Since $\Lambda(q)$ is positive and defined on all of $\mathscr{H}$, it is self-adjoint, and $\Lambda(q)^{1/2}$ exists and is a bounded operator. Thus, $L_n$ is well defined on all of $\mathscr{H}$ and a bounded operator. Moreover, $L_n^* L_n$ as a function
\[ (Q \times \mathbb{R})^n \ni (x_1, \ldots, x_n) \mapsto L_n^*(x_1, \ldots, x_n)\, L_n(x_1, \ldots, x_n) \in \mathcal{B}(\mathscr{H}) \tag{119} \]
is weakly measurable: Every $\Lambda(q_k)$ is weakly measurable by definition, also as a function on $(Q \times \mathbb{R})^n$ that does not depend on $t_k$ and $x_\ell$ for $\ell \ne k$. By Lemma 8, also $(x_1, \ldots, x_n) \mapsto \Lambda(q_k)^{1/2}$ is weakly measurable. The operator-valued function $t \mapsto e^{-iHt}$ is weakly measurable because $t \mapsto \langle \phi | e^{-iHt} \psi \rangle$ is even continuous, as even $t \mapsto e^{-iHt} \psi$ is continuous for self-adjoint $H$ [62]. Thus, also $(x_1, \ldots, x_n) \mapsto e^{-iH(t_{k+1} - t_k)/\hbar}$ is weakly measurable. The number-valued function 1t0
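\begin{align}
{} &= \int_{\mathbb{R}} dt_n \int_Q \mu(dq_n)\, \langle \psi | L_n^* L_n \psi \rangle \tag{120} \\
&= \int_{\mathbb{R}} dt_n \int_Q \mu(dq_n)\, 1_{t_{n-1} < t_n}\, e^{-\lambda (t_n - t_{n-1})} \notag \\
&\qquad \times \bigl\langle \psi \big| L_{n-1}^*\, e^{iH(t_n - t_{n-1})/\hbar}\, \Lambda(q_n)\, e^{-iH(t_n - t_{n-1})/\hbar}\, L_{n-1} \psi \bigr\rangle \notag \\
&= \int_{\mathbb{R}} dt_n\, 1_{t_{n-1} < t_n}\, e^{-\lambda (t_n - t_{n-1})} \Bigl\langle e^{-iH(t_n - t_{n-1})/\hbar} L_{n-1} \psi \Big| \int_Q \Lambda(q_n)\, \mu(dq_n)\; e^{-iH(t_n - t_{n-1})/\hbar} L_{n-1} \psi \Bigr\rangle \notag \\
&= \int_{\mathbb{R}} dt_n\, 1_{t_{n-1} < t_n}\, e^{-\lambda (t_n - t_{n-1})}\, \lambda\, \bigl\| e^{-iH(t_n - t_{n-1})/\hbar} L_{n-1} \psi \bigr\|^2 \notag \\
&= \| L_{n-1} \psi \|^2 \int_{t_{n-1}}^{\infty} dt_n\, e^{-\lambda (t_n - t_{n-1})}\, \lambda \notag \\
&= \| L_{n-1} \psi \|^2 = \langle \psi | L_{n-1}^* L_{n-1} \psi \rangle. \notag
\end{align}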
This implies (118).

Proof of Theorem 2. We use Theorem 1 for $(M, \mathcal{A}) = (Q \times \mathbb{R}, \mathcal{A}_Q \otimes \mathcal{B}(\mathbb{R}))$. We have to check that $L_n^* L_n$ is the density of a POVM $G_n(\cdot)$ satisfying the consistency property (116). Set, for all $A \in \mathcal{A}_n := (\mathcal{A}_Q \otimes \mathcal{B}(\mathbb{R}))^{\otimes n}$,
\[ G_n(A) = \int_A \tilde\mu^{\otimes n}(dx_1 \cdots dx_n)\, L_n^* L_n \tag{121} \]
with $\tilde\mu = \mu \otimes \mu_{\mathrm{Leb}}$. This defines a POVM: For $A = (Q \times \mathbb{R})^n$, the right-hand side is, by repeated application of (118), the identity. To see that $G_n(A)$ is a well-defined bounded operator for all $A \in \mathcal{A}_n$, we apply Lemma 1 to $M = (Q \times \mathbb{R})^n$, $\mathcal{A} = \mathcal{A}_n$, $\Lambda = L_n^* L_n \ge 0$: in our case $S = \mathscr{H}$ because for all $\psi \in \mathscr{H}$,
\begin{align}
\int_A \tilde\mu^{\otimes n}(dx_1 \cdots dx_n)\, \langle \psi | L_n^* L_n \psi \rangle &= \int_A \tilde\mu^{\otimes n}(dx_1 \cdots dx_n)\, \| L_n \psi \|^2 \tag{122} \\
&\le \int_{(Q \times \mathbb{R})^n} \tilde\mu^{\otimes n}(dx_1 \cdots dx_n)\, \| L_n \psi \|^2 \notag \\
&= \langle \psi | G_n((Q \times \mathbb{R})^n) \psi \rangle = \| \psi \|^2. \tag{123}
\end{align}
According to Lemma 1, the sesquilinear form (112) is defined on $\mathscr{H} \times \mathscr{H}$, and by (122), (123) is bounded, thus defining a bounded operator $G_n(A)$. To see that $G_n(\cdot)$ is weakly σ-additive, just note that $\int_A \tilde\mu^{\otimes n}(dx_1 \cdots dx_n)\, \langle \psi | L_n^* L_n \psi \rangle$ is σ-additive in $A$. The consistency condition (116) follows from (118).

By Theorem 1, there is a unique POVM $G$ on $(Q \times \mathbb{R})^{\mathbb{N}}$ whose marginals are the $G_n$. Moreover, for every $\psi \in \mathscr{H}$ with $\|\psi\| = 1$ there is a unique probability measure $\mathbb{P}^\psi$ on $(Q \times \mathbb{R})^{\mathbb{N}}$ that extends the distributions $\langle \psi | G_n(\cdot) \psi \rangle$, and which is thus the distribution of the GRWf process with initial time $t_0$ and initial state vector $\psi$. Finally, $\mathbb{P}^\psi(\cdot) = \langle \psi | G(\cdot) \psi \rangle$.

The labeled GRWf processes we considered in Sec. 2.2 are included in Definition 2, and their existence is covered by Theorem 2 by setting $Q = \mathbb{R}^3 \times \mathscr{L}$, where $\mathscr{L}$ is a finite or countable set of labels, $\mathcal{A}_Q = \mathcal{B}(\mathbb{R}^3) \otimes \mathcal{P}(\mathscr{L})$, where $\mathcal{P}(\mathscr{L})$ is the power set of $\mathscr{L}$, and $\mu = \mu_{\mathrm{Leb}} \otimes \nu$, where $\mu_{\mathrm{Leb}}$ is the Lebesgue measure on $\mathbb{R}^3$ and $\nu$ the counting measure on $\mathscr{L}$. Thus, the labeled GRWf process is a point process in $\mathbb{R}^4 \times \mathscr{L}$, and the distribution of the first $n$ labeled flashes is given by (19). Assumption 1 requires, in the notation of Sec. 2.2, that every $\Lambda_i(q)$ is bounded, that $q \mapsto \Lambda_i(q)$ is weakly measurable for every $i \in \mathscr{L}$, and that (18) holds (where the series converges weakly if $\mathscr{L}$ is infinite).

3.4. Time-dependent operators

We now show the existence of a GRWf process for time-dependent $H(t)$ and $\Lambda(q, t)$ operators with a variable total flash rate (not requiring the normalization $\int \Lambda(q)\, dq = \lambda I$), but only under rather restrictive assumptions, particularly that these operators are bounded. In many of the physical applications it would be desirable to permit unbounded (self-adjoint, in particular densely defined) operators: first, the physical Hamiltonians $H(t)$ are (more often than not) unbounded, and second, the flash rate operators in quantum field theory, as described in Example 4, are naturally unbounded.

As mentioned already in Sec. 2.5.4, one can either ask about the construction of the process from given $H(t)$ and $\Lambda(z)$, or from given $W_s^t$ and $\Lambda(z)$.
For our purposes it is useful to assume the second point of view first, and to turn to the construction of $W_s^t$ from $H(t)$ afterwards.

3.4.1. Given W and Λ

Fix the initial time $t_0 \in \mathbb{R}$. Suppose that we are given operators $W_s^t$ for every $s, t \ge t_0$ and $\Lambda(q, t)$ for every $t \ge t_0$ and $q \in Q$, where $(Q, \mathcal{A}_Q)$ is again a Borel space and $\mu$ a σ-finite measure on $(Q, \mathcal{A}_Q)$.

Definition 3. Let $M = Q \times \mathbb{R} \cup \{\diamond\}$ and $\mathcal{A}_M = \mathcal{A}_Q \otimes \mathcal{B}(\mathbb{R}) \times \mathcal{A}_\diamond$, where $\mathcal{A}_\diamond = \{\emptyset, \{\diamond\}\}$. A random variable
\[ F = (Z_1, Z_2, \ldots) \]
with values in $(M^{\mathbb{N}}, \mathcal{A}_M^{\otimes \mathbb{N}})$ is a GRWf process with time-dependent flash rate operators $\Lambda(q, t)$, evolution operators $W_s^t$, initial time $t_0$, and initial state vector $\psi$ if $\diamond$ is absorbing and for every $n \in \mathbb{N}$ the joint distribution of $Z_1, \ldots, Z_n$ satisfies (the analogs of (48) and (49))
\[ \mathbb{P}(\#F \ge n, (Z_1, \ldots, Z_n) \in A) = \int_A \tilde\mu^{\otimes n}(dz_1 \cdots dz_n)\, \langle \psi | L_n^* L_n \psi \rangle \tag{124} \]
for $A \in (\mathcal{A}_Q \otimes \mathcal{B}(\mathbb{R}))^{\otimes n}$, where $L_0 = I$ and
\[ L_n = L_n(z_1, \ldots, z_n) = \Lambda(z_n)^{1/2}\, W_{t_{n-1}}^{t_n}\, L_{n-1}(z_1, \ldots, z_{n-1}). \tag{125} \]
Assumption 2. For every $q \in Q$ and $t \ge t_0$, $\Lambda(q, t)$ is a bounded operator; $(q, t) \mapsto \Lambda(q, t)$ is weakly measurable; for every $t \ge t_0$,
\[ \Lambda(Q, t) := \int_Q \Lambda(q, t)\, \mu(dq) \tag{126} \]
exists as a bounded operator.

Assumption 3. For every $s, t \ge t_0$, $W_s^t$ is a bounded operator; for $t < s$, $W_s^t = 0$; the function $(s, t) \mapsto W_s^t$ is weakly measurable and satisfies the following weak version of (51):
\[ W_s^{t*} W_s^t - I = -\int_s^t dt'\; W_s^{t'*}\, \Lambda(Q, t')\, W_s^{t'}. \tag{127} \]

We remark that, as a consequence of the weak measurability of $(q, t) \mapsto \Lambda(q, t)$ and the existence of $\Lambda(Q, t)$ as a bounded operator, $t \mapsto \Lambda(Q, t)$ is weakly measurable.

Theorem 3. Under Assumptions 2 and 3, there exists a GRWf process for every initial time $t_0$ and every initial state vector $\psi \in \mathscr{H}$ with $\|\psi\| = 1$ with flash rate operators $\Lambda(q, t)$ and evolution operators $W_s^t$. The distribution of the GRWf process is unique and of the form $\langle \psi | G(\cdot) \psi \rangle$ for a suitable history POVM $G(\cdot)$ on $(M^{\mathbb{N}}, \mathcal{A}_M^{\otimes \mathbb{N}})$.

Lemma 10. Under Assumptions 2 and 3, there exists a unique positive operator $T_s \in \mathcal{B}(\mathscr{H})$, denoted $\lim_{t \to \infty} W_s^{t*} W_s^t$ in the following, such that
\[ \langle \psi | T_s \psi \rangle = \lim_{t \to \infty} \langle \psi | W_s^{t*} W_s^t \psi \rangle. \tag{128} \]
Indeed,
\[ \lim_{t \to \infty} W_s^{t*} W_s^t = T_s = I - \int_s^{\infty} dt'\; W_s^{t'*}\, \Lambda(Q, t')\, W_s^{t'}. \tag{129} \]
Moreover, $s \mapsto T_s$ is weakly measurable.
Proof. Keep $s \in \mathbb{R}$ fixed. Since $W_s^{t'*} \Lambda(Q, t') W_s^{t'}$ is a positive operator, so is its integral over $t'$, so that (127) implies $W_s^{t*} W_s^t \le I$ and $W_s^{t_2 *} W_s^{t_2} \le W_s^{t_1 *} W_s^{t_1}$ for $t_1 \le t_2$. Therefore, $t \mapsto \langle \psi | W_s^{t*} W_s^t \psi \rangle$ is a decreasing nonnegative function for every $\psi \in \mathscr{H}$ and thus possesses a limit $\alpha_\psi$ as $t \to \infty$. Define (polarization)
\[ \alpha(\phi, \psi) = \frac{1}{4} \bigl( \alpha_{\phi+\psi} - \alpha_{\phi-\psi} - i \alpha_{\phi+i\psi} + i \alpha_{\phi-i\psi} \bigr). \tag{130} \]
Then
\[ \alpha(\phi, \psi) = \lim_{t \to \infty} \langle \phi | W_s^{t*} W_s^t \psi \rangle \tag{131} \]
for all $\phi, \psi \in \mathscr{H}$ by (130) and the linearity of limits; in particular, the limit on the right-hand side exists. It follows that $\alpha$ is a sesquilinear form $\mathscr{H} \times \mathscr{H} \to \mathbb{C}$, Hermitian, positive, and bounded (with $\|\alpha\| \le 1$). By the Riesz lemma, there is a bounded positive operator $T_s$ such that $\alpha(\phi, \psi) = \langle \phi | T_s \psi \rangle$. Now (131) implies (128) and (129). From (129) we see that $T_s$ is weakly measurable, as integrals (such as $\int_s^{\infty} dt'\, \langle \psi | W_s^{t'*} \Lambda(Q, t') W_s^{t'} \psi \rangle$) are measurable functions of their boundaries.

Lemma 11. Under Assumptions 2 and 3, for all $n \in \mathbb{N}$, $L_n \in \mathcal{B}(\mathscr{H})$ is well defined, $(x_1, \ldots, x_n) \mapsto L_n^* L_n$ is weakly measurable, and
\[ \int_{\mathbb{R}} dt_n \int_Q \mu(dq_n)\, L_n^* L_n = L_{n-1}^* \Bigl( I - \lim_{t \to \infty} W_{t_{n-1}}^{t*} W_{t_{n-1}}^{t} \Bigr) L_{n-1}. \tag{132} \]
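Proof. $L_n$ is well defined on all of $\mathscr{H}$ and a bounded operator because the same is true of $W_s^t$ and $\Lambda(q)^{1/2}$. Moreover, $L_n^* L_n$ as a function
\[ M^n \ni (z_1, \ldots, z_n) \mapsto L_n^*(z_1, \ldots, z_n)\, L_n(z_1, \ldots, z_n) \in \mathcal{B}(\mathscr{H}) \tag{133} \]
is weakly measurable: $z \mapsto \Lambda(z)$ is weakly measurable by Assumption 2, $z \mapsto \Lambda(z)^{1/2}$ by Lemma 8, $(s, t) \mapsto W_s^t$ by Assumption 3; now by Lemma 5, $L_n$ and $L_n^* L_n$ are weakly measurable. By definition (125), for any $\psi \in \mathscr{H}$,
\begin{align*}
\Bigl\langle \psi \Big| \int_{\mathbb{R}} dt_n \int_Q \mu(dq_n)\, L_n^* L_n\, \psi \Bigr\rangle
&= \int_{\mathbb{R}} dt_n \int_Q \mu(dq_n)\, \langle \psi | L_n^* L_n \psi \rangle \\
&= \int_{\mathbb{R}} dt_n \int_Q \mu(dq_n)\, \bigl\langle \psi \big| L_{n-1}^*\, W_{t_{n-1}}^{t_n *}\, \Lambda(q_n, t_n)\, W_{t_{n-1}}^{t_n}\, L_{n-1} \psi \bigr\rangle \\
&= \int dt_n \int_Q \mu(dq_n)\, \bigl\langle W_{t_{n-1}}^{t_n} L_{n-1} \psi \big| \Lambda(q_n, t_n)\, W_{t_{n-1}}^{t_n} L_{n-1} \psi \bigr\rangle \\
&= \int dt_n\, \Bigl\langle W_{t_{n-1}}^{t_n} L_{n-1} \psi \Big| \int_Q \Lambda(q_n, t_n)\, \mu(dq_n)\; W_{t_{n-1}}^{t_n} L_{n-1} \psi \Bigr\rangle \\
&= \int dt_n\, \bigl\langle W_{t_{n-1}}^{t_n} L_{n-1} \psi \big| \Lambda(Q, t_n)\, W_{t_{n-1}}^{t_n} L_{n-1} \psi \bigr\rangle \\
&= \int dt_n\, \bigl\langle L_{n-1} \psi \big| W_{t_{n-1}}^{t_n *}\, \Lambda(Q, t_n)\, W_{t_{n-1}}^{t_n}\, L_{n-1} \psi \bigr\rangle
\end{align*}
(by (129) and $W_{t_{n-1}}^{t_n} = 0$ for $t_n < t_{n-1}$)
\begin{align*}
&= \Bigl\langle L_{n-1} \psi \Big| \Bigl( I - \lim_{t \to \infty} W_{t_{n-1}}^{t*} W_{t_{n-1}}^{t} \Bigr) L_{n-1} \psi \Bigr\rangle \\
&= \Bigl\langle \psi \Big| L_{n-1}^* \Bigl( I - \lim_{t \to \infty} W_{t_{n-1}}^{t*} W_{t_{n-1}}^{t} \Bigr) L_{n-1} \psi \Bigr\rangle.
\end{align*}
This implies (132).

Proof of Theorem 3. We proceed very much as in the proof of Theorem 2, and begin with defining the POVM $G_n(\cdot)$ on $M^n$ that will be the marginal of the history POVM $G(\cdot)$. Recall that $M = Q \times \mathbb{R} \cup \{\diamond\}$. For $k = 0, 1, 2, \ldots, n$, let
\[ \Omega_{kn} = \{ (z_1, \ldots, z_n) \in M^n : z_1, \ldots, z_k \in Q \times \mathbb{R},\; z_{k+1} = \cdots = z_n = \diamond \}. \tag{134} \]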
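(In particular, $\Omega_{nn} = (Q \times \mathbb{R})^n$.) We want that $G_n(\cdot)$ is concentrated on $\cup_{k=0}^{n} \Omega_{kn}$ (so that sequences in which $\diamond$ is followed by a flash do not occur). Consider an arbitrary $A \in \mathcal{A}_M^{\otimes n}$; then $A \cap \Omega_{kn}$ is of the form $A_k \times \{\diamond\}^{n-k}$ for suitable $A_k \subseteq (Q \times \mathbb{R})^k$, indeed with $A_k \in (\mathcal{A}_Q \otimes \mathcal{B}(\mathbb{R}))^{\otimes k}$. Note $A_n = A \cap \Omega_{nn}$. Set
\begin{align}
G_n(A) ={}& \sum_{k=0}^{n-1} \int_{A_k} \tilde\mu^{\otimes k}(dx_1 \cdots dx_k)\, L_k^*(x_1, \ldots, x_k) \Bigl( \lim_{t \to \infty} W_{t_k}^{t*} W_{t_k}^{t} \Bigr) L_k(x_1, \ldots, x_k) \notag \\
&+ \int_{A_n} \tilde\mu^{\otimes n}(dx_1 \cdots dx_n)\, L_n^*(x_1, \ldots, x_n)\, L_n(x_1, \ldots, x_n). \tag{135}
\end{align}
We show that this defines a POVM. We begin with the case $A = M^n$: Then $A_k = (Q \times \mathbb{R})^k$, and
\[ \int_{A_n} \tilde\mu^{\otimes n}(dx_1 \cdots dx_n)\, L_n^* L_n = \int_{(Q \times \mathbb{R})^{n-1}} \tilde\mu^{\otimes (n-1)}(dx_1 \cdots dx_{n-1}) \int dt_n \int_Q \mu(dq_n)\, L_n^* L_n \tag{136} \]
(by Lemma 11)
\begin{align}
&= \int_{(Q \times \mathbb{R})^{n-1}} \tilde\mu^{\otimes (n-1)}(dx_1 \cdots dx_{n-1})\, L_{n-1}^* \Bigl( I - \lim_{t \to \infty} W_{t_{n-1}}^{t*} W_{t_{n-1}}^{t} \Bigr) L_{n-1} \tag{137} \\
&= \int_{A_{n-1}} \tilde\mu^{\otimes (n-1)}(dx_1 \cdots dx_{n-1})\, L_{n-1}^* L_{n-1} \notag \\
&\quad - \int_{A_{n-1}} \tilde\mu^{\otimes (n-1)}(dx_1 \cdots dx_{n-1})\, L_{n-1}^* \Bigl( \lim_{t \to \infty} W_{t_{n-1}}^{t*} W_{t_{n-1}}^{t} \Bigr) L_{n-1}. \tag{138}
\end{align}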
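Iterating this calculation $n$ times, we obtain
\[ \int_{A_n} \tilde\mu^{\otimes n}(dx_1 \cdots dx_n)\, L_n^* L_n = I - \sum_{k=0}^{n-1} \int_{A_k} \tilde\mu^{\otimes k}(dx_1 \cdots dx_k)\, L_k^* \Bigl( \lim_{t \to \infty} W_{t_k}^{t*} W_{t_k}^{t} \Bigr) L_k. \tag{139} \]
Together with (135), it follows that $G_n(A) = G_n(M^n) = I$.

To see that $G_n(A)$ is a well-defined bounded operator for all $A \in \mathcal{A}_M^{\otimes n}$, it suffices, by Lemma 1, that $\langle \psi | G_n(A) \psi \rangle \le \|\psi\|^2$ when $G_n(A)$ is replaced with its definition, i.e. with the right-hand side of (135). And indeed,
\begin{align}
\langle \psi | G_n(A) \psi \rangle ={}& \sum_{k=0}^{n-1} \int_{A_k} \tilde\mu^{\otimes k}(dx_1 \cdots dx_k)\, \Bigl\langle \psi \Big| L_k^* \Bigl( \lim_{t \to \infty} W_{t_k}^{t*} W_{t_k}^{t} \Bigr) L_k \psi \Bigr\rangle \notag \\
&+ \int_{A_n} \tilde\mu^{\otimes n}(dx_1 \cdots dx_n)\, \langle \psi | L_n^* L_n \psi \rangle \tag{140}
\end{align}
(because the integrands are nonnegative by Lemma 10)
\begin{align}
\le{}& \sum_{k=0}^{n-1} \int_{(Q \times \mathbb{R})^k} \tilde\mu^{\otimes k}(dx_1 \cdots dx_k)\, \Bigl\langle \psi \Big| L_k^* \Bigl( \lim_{t \to \infty} W_{t_k}^{t*} W_{t_k}^{t} \Bigr) L_k \psi \Bigr\rangle \notag \\
&+ \int_{(Q \times \mathbb{R})^n} \tilde\mu^{\otimes n}(dx_1 \cdots dx_n)\, \langle \psi | L_n^* L_n \psi \rangle \notag \\
={}& \langle \psi | G_n(M^n) \psi \rangle = \|\psi\|^2. \tag{141}
\end{align}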
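To see that $G_n(\cdot)$ is σ-additive in the weak sense, just note that $\int_A$ is σ-additive in $A$. We check the consistency condition (116):
\[ G_{n+1}(A \times M) = G_{n+1}(A \times (Q \times \mathbb{R})) + G_{n+1}(A \times \{\diamond\}) \tag{142} \]
(by definition (135))
\begin{align}
={}& \int_{A_n \times (Q \times \mathbb{R})} \tilde\mu^{\otimes (n+1)}(dx_1 \cdots dx_{n+1})\, L_{n+1}^* L_{n+1} \notag \\
&+ \sum_{k=0}^{n} \int_{A_k} \tilde\mu^{\otimes k}(dx_1 \cdots dx_k)\, L_k^* \Bigl( \lim_{t \to \infty} W_{t_k}^{t*} W_{t_k}^{t} \Bigr) L_k \tag{143}
\end{align}
(by (132))
\begin{align}
={}& \int_{A_n} \tilde\mu^{\otimes n}(dx_1 \cdots dx_n)\, L_n^* \Bigl( I - \lim_{t \to \infty} W_{t_n}^{t*} W_{t_n}^{t} \Bigr) L_n \notag \\
&+ \sum_{k=0}^{n} \int_{A_k} \tilde\mu^{\otimes k}(dx_1 \cdots dx_k)\, L_k^* \Bigl( \lim_{t \to \infty} W_{t_k}^{t*} W_{t_k}^{t} \Bigr) L_k \tag{144} \\
={}& \int_{A_n} \tilde\mu^{\otimes n}(dx_1 \cdots dx_n)\, L_n^* L_n + \sum_{k=0}^{n-1} \int_{A_k} \tilde\mu^{\otimes k}(dx_1 \cdots dx_k)\, L_k^* \Bigl( \lim_{t \to \infty} W_{t_k}^{t*} W_{t_k}^{t} \Bigr) L_k \tag{145} \\
={}& G_n(A). \tag{146}
\end{align}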
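By Theorem 1, there is a unique POVM $G(\cdot)$ on $M^{\mathbb{N}}$ whose marginals are the $G_n(\cdot)$. It is concentrated on the set $\Omega$ given by (29) of sequences for which $\diamond$ is absorbing, because any other sequence, one with a space-time point after $\diamond$, would already for sufficiently large $n$ fail to be contained in any of $\Omega_{kn}$ as defined in (134). The history POVM $G(\cdot)$ is also concentrated on those sequences that are ordered by the time coordinates of the flashes, $T_1 < T_2 < \cdots$.

Moreover, for every $\psi \in \mathscr{H}$ with $\|\psi\| = 1$ there is a unique probability measure $\mathbb{P}^\psi$ on $M^{\mathbb{N}}$ that extends the distributions $\langle \psi | G_n(\cdot) \psi \rangle$. Indeed, $\mathbb{P}^\psi(\cdot) = \langle \psi | G(\cdot) \psi \rangle$. To see that it satisfies (124), note that for an event $A$ concerning $Z_1, \ldots, Z_n$ and entailing that $\#F \ge n$, in other words for $A \subseteq (Q \times \mathbb{R})^n$ with $A \in (\mathcal{A}_Q \otimes \mathcal{B}(\mathbb{R}))^{\otimes n}$, we have that $A_n = A$ and thus
\[ G_n(A) = \int_A \tilde\mu^{\otimes n}(dx_1 \cdots dx_n)\, L_n^* L_n \tag{147} \]
by (135). As a consequence, $\mathbb{P}^\psi$ defines a GRWf process.

To show that $\mathbb{P}^\psi$ is uniquely determined by (124), we show that the joint distribution of the first $n$ components of $F$, $Z_k \in M$, must be $\langle \psi | G_n(\cdot) \psi \rangle$. Indeed, from (124) it follows that, for $A \subseteq (Q \times \mathbb{R})^n$ with $A \in (\mathcal{A}_Q \otimes \mathcal{B}(\mathbb{R}))^{\otimes n}$,
\begin{align}
\mathbb{P}(\#F = n, (Z_1, \ldots, Z_n) \in A) ={}& \mathbb{P}(\#F \ge n, (Z_1, \ldots, Z_n) \in A) \notag \\
&- \mathbb{P}(\#F \ge n+1, (Z_1, \ldots, Z_{n+1}) \in A \times (Q \times \mathbb{R})) \tag{148} \\
={}& \int_A \tilde\mu^{\otimes n}(dz_1 \cdots dz_n)\, \langle \psi | L_n^* L_n \psi \rangle \notag \\
&- \int_{A \times (Q \times \mathbb{R})} \tilde\mu^{\otimes (n+1)}(dz_1 \cdots dz_{n+1})\, \langle \psi | L_{n+1}^* L_{n+1} \psi \rangle \tag{149}
\end{align}
(by (132))
\begin{align}
={}& \int_A \tilde\mu^{\otimes n}(dz_1 \cdots dz_n)\, \langle \psi | L_n^* L_n \psi \rangle \notag \\
&- \int_A \tilde\mu^{\otimes n}(dz_1 \cdots dz_n)\, \Bigl\langle \psi \Big| L_n^* \Bigl( I - \lim_{t \to \infty} W_{t_n}^{t*} W_{t_n}^{t} \Bigr) L_n \psi \Bigr\rangle \tag{150} \\
={}& \int_A \tilde\mu^{\otimes n}(dz_1 \cdots dz_n)\, \Bigl\langle \psi \Big| L_n^* \Bigl( \lim_{t \to \infty} W_{t_n}^{t*} W_{t_n}^{t} \Bigr) L_n \psi \Bigr\rangle. \tag{151}
\end{align}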
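This implies, for $A \subseteq M^n$ with $A \in \mathcal{A}_M^{\otimes n}$ (using that $\diamond$ is absorbing), that
\begin{align}
\mathbb{P}(A) ={}& \sum_{k=0}^{n-1} \mathbb{P}(\#F = k, (Z_1, \ldots, Z_k) \in A_k) + \mathbb{P}(\#F \ge n, (Z_1, \ldots, Z_n) \in A_n) \notag \\
={}& \sum_{k=0}^{n-1} \int_{A_k} \tilde\mu^{\otimes k}(dz_1 \cdots dz_k)\, \Bigl\langle \psi \Big| L_k^* \Bigl( \lim_{t \to \infty} W_{t_k}^{t*} W_{t_k}^{t} \Bigr) L_k \psi \Bigr\rangle \notag \\
&+ \int_{A_n} \tilde\mu^{\otimes n}(dz_1 \cdots dz_n)\, \langle \psi | L_n^* L_n \psi \rangle \tag{152} \\
={}& \langle \psi | G_n(A) \psi \rangle. \tag{153}
\end{align}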
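The role of the limit operator $T_s$ of Lemma 10 can be made concrete in the scalar case of a one-dimensional Hilbert space, where (127) is solved by $W_s^{t*} W_s^t = \exp(-\int_s^t \Lambda(Q, t')\, dt')$ and, by (151) with $n = 0$, $T_{t_0}$ is simply the probability that no flash ever occurs. The Python sketch below evaluates the right-hand side of (129) numerically for the toy rate $\Lambda(Q, t) = (1+t)^{-2}$, whose total integral over $[0, \infty)$ is finite; this rate, the function names and the quadrature are assumptions made for illustration only.

```python
import math

def flash_rate(t):
    """Toy scalar total flash rate Lambda(Q, t) = 1 / (1 + t)^2 (assumed)."""
    return 1.0 / (1.0 + t) ** 2

def survival(s, t):
    """Scalar W_s^{t*} W_s^t = exp(-integral of flash_rate over [s, t]), in closed form."""
    return math.exp(-(1.0 / (1.0 + s) - 1.0 / (1.0 + t)))

def stop_probability(s, steps=100_000):
    """Right-hand side of (129) in the scalar case: 1 - int_s^infty W* Lambda W dt'.

    The half-line is mapped to (0, 1) via t' = s + u / (1 - u), and the
    resulting integral is approximated by the midpoint rule.
    """
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        t = s + u / (1.0 - u)
        total += survival(s, t) * flash_rate(t) / (1.0 - u) ** 2
    return 1.0 - h * total

if __name__ == "__main__":
    for s in (0.0, 1.0):
        # Compare with the limit of survival(s, t) as t -> infinity, exp(-1 / (1 + s)).
        print(s, stop_probability(s), math.exp(-1.0 / (1.0 + s)))
```

For a rate with infinite total integral, such as the constant rate of Sec. 3.3, the same computation gives $T_s = 0$, in agreement with the fact that the flash sequence then almost surely never stops.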
3.4.2. Given H and Λ

Now suppose that we are given operators $H(t)$ for every $t \ge 0$ and $\Lambda(q, t)$ for every $t \ge t_0$ and $q \in Q$, where $(Q, \mathcal{A}_Q)$ is again a Borel space and $\mu$ a σ-finite measure on $(Q, \mathcal{A}_Q)$. Our aim now is to construct the evolution operators $W_s^t$.

Assumption 4. For every $t \ge t_0$, $H(t)$ is a bounded self-adjoint operator; $t \mapsto H(t)$ is weakly measurable. Moreover, for every $t \ge t_0$
\[ \int_{t_0}^{t} \| H(s) \|\, ds < \infty, \qquad \int_{t_0}^{t} \| \Lambda(Q, s) \|\, ds < \infty. \tag{154} \]

The functions $t \mapsto \|H(t)\|$ and $t \mapsto \|\Lambda(Q, t)\|$ are measurable by Lemma 6. As an abbreviation, set
\[ R_t = -\frac{1}{2}\, \Lambda(Q, t) - \frac{i}{\hbar}\, H(t). \tag{155} \]
Note that $t \mapsto R_t$ is weakly measurable and $R_t$ is bounded with $\|R_t\| \le \frac{1}{2} \|\Lambda(Q, t)\| + \frac{1}{\hbar} \|H(t)\|$, so that $\int_{t_0}^{t} \|R_s\|\, ds < \infty$. Now define $W_s^t$ by the Dyson series
\[ W_s^t = I + \sum_{n=1}^{\infty} \int_s^t dt_1 \int_{t_1}^t dt_2 \cdots \int_{t_{n-1}}^t dt_n\; R_{t_n} \cdots R_{t_1}. \tag{156} \]

Lemma 12. Under Assumptions 2 and 4, the Dyson series (156) is weakly convergent and defines a bounded operator $W_s^t$ on $\mathscr{H}$. The function $(s, t) \mapsto W_s^t$ is weakly measurable and satisfies the following weak version of (50):
\[ W_s^t - I = \int_s^t dt' \Bigl( -\frac{1}{2}\, \Lambda(Q, t') - \frac{i}{\hbar}\, H(t') \Bigr) W_s^{t'}, \tag{157} \]
as well as (127). Thus, Assumption 3 is fulfilled.

The proof is included in Appendix A.2.
Corollary 1. Under Assumptions 2 and 4, there exists, for every initial time $t_0$ and every initial state vector $\psi \in \mathscr{H}$ with $\|\psi\| = 1$, a GRWf process with Hamiltonians $H(t)$ and flash rate operators $\Lambda(q, t)$, where $W_s^t$ is given by the Dyson series (156). The distribution of the process is unique and of the form $\langle \psi | G(\cdot) \psi \rangle$ for a suitable history POVM $G(\cdot)$ on $(M^{\mathbb{N}}, \mathcal{A}_M^{\otimes \mathbb{N}})$.

Proof. By Lemma 12, Assumption 3 is fulfilled, and the statement follows from Theorem 3.

3.5. The general GRWf scheme

The methods developed in the previous section for time-dependent $H$ and $\Lambda$ operators cover also the general scheme, in which the operators may depend on previous flashes and the collapse operator $C$ is not necessarily the positive square root of $\Lambda$. Since the proofs are essentially the same, we formulate only the results.

3.5.1. Given W and Λ

Fix the initial time $t_0 \in \mathbb{R}$, let $(Q, \mathcal{A}_Q)$ be a Borel space and $\mu$ a σ-finite measure on $(Q, \mathcal{A}_Q)$. Let
\[ \Omega := \bigcup_{n=0}^{\infty} \Omega^{(n)} := \bigcup_{n=0}^{\infty} \{ (z_1, \ldots, z_n) : z_k = (q_k, t_k) \in Q \times \mathbb{R},\; t_0 \le t_1 \le \cdots \le t_n \}. \tag{158} \]
For $f \in \Omega^{(n)}$ set $\#f := n$. Suppose that for every sequence $f \in \Omega$ we are given operators $W^t(f)$ for every $t \ge t_{\#f}$ and $C(f, q, t)$ for every $t \ge t_{\#f}$ and $q \in Q$.

Definition 4. Let $M = Q \times \mathbb{R} \cup \{\diamond\}$ and $\mathcal{A}_M = \mathcal{A}_Q \otimes \mathcal{B}(\mathbb{R}) \times \mathcal{A}_\diamond$, where $\mathcal{A}_\diamond = \{\emptyset, \{\diamond\}\}$. A random variable
\[ F = (Z_1, Z_2, \ldots) \]
with values in $(M^{\mathbb{N}}, \mathcal{A}_M^{\otimes \mathbb{N}})$ is a GRWf process with past-dependent collapse operators $C(f, q, t)$, evolution operators $W^t(f)$, initial time $t_0$, and initial state vector $\psi$ if $\diamond$ is absorbing and for every $n \in \mathbb{N}$ the joint distribution of $Z_1, \ldots, Z_n$ satisfies
\[ \mathbb{P}(\#F \ge n, (Z_1, \ldots, Z_n) \in A) = \int_A \tilde\mu^{\otimes n}(dz_1 \cdots dz_n)\, \langle \psi | L_n^* L_n \psi \rangle \tag{159} \]
for $A \in (\mathcal{A}_Q \otimes \mathcal{B}(\mathbb{R}))^{\otimes n}$, where $L_0 = I$ and
\[ L_n = L_n(z_1, \ldots, z_n) = C(z_1, \ldots, z_n)\, W^{t_n}(z_1, \ldots, z_{n-1})\, L_{n-1}(z_1, \ldots, z_{n-1}). \tag{160} \]

Assumption 5. For every $f \in \Omega$, $q \in Q$ and $t \ge t_{\#f}$, $C(f, q, t)$ is a bounded operator; $(f, q, t) \mapsto C(f, q, t)$ is weakly measurable; for every $t \ge t_{\#f}$,
\[ \Lambda(f, Q, t) := \int_Q C(f, q, t)^*\, C(f, q, t)\, \mu(dq) \tag{161} \]
is a bounded operator.
Assumption 6. For every $f \in \Omega$ and $t \ge t_{\#f}$, $W^t(f)$ is a bounded operator; for $t < t_{\#f}$, $W^t(f) = 0$; the function $(f, t) \mapsto W^t(f)$ is weakly measurable and satisfies
\[ W^t(f)^* W^t(f) - I = -\int_{t_{\#f}}^{t} dt'\; W^{t'}(f)^*\, \Lambda(f, Q, t')\, W^{t'}(f). \tag{162} \]

We remark that, as a consequence of Lemma 5, of the weak measurability of $(f, q, t) \mapsto C(f, q, t)$ and of the existence of $\Lambda(f, Q, t)$ as a bounded operator, $(f, t) \mapsto \Lambda(f, Q, t)$ is weakly measurable.

Theorem 4. Under Assumptions 5 and 6, there exists a GRWf process for every initial time $t_0$ and every initial state vector $\psi \in \mathscr{H}$ with $\|\psi\| = 1$ with collapse operators $C(f, q, t)$ and evolution operators $W^t(f)$. The distribution of the GRWf process is unique and of the form $\langle \psi | G(\cdot) \psi \rangle$ for a suitable history POVM $G(\cdot)$ on $(M^{\mathbb{N}}, \mathcal{A}_M^{\otimes \mathbb{N}})$.

3.5.2. Given H and Λ

Now suppose that for every $f \in \Omega$, $t \ge t_{\#f}$ and $q \in Q$ we are given operators $H(f, t)$ and $C(f, q, t)$.

Assumption 7. For every $f \in \Omega$ and $t \ge t_{\#f}$, $H(f, t)$ is a bounded self-adjoint operator; $(f, t) \mapsto H(f, t)$ is weakly measurable. Moreover, for every $t \ge t_{\#f}$
\[ \int_{t_{\#f}}^{t} \| H(f, s) \|\, ds < \infty, \qquad \int_{t_{\#f}}^{t} \| \Lambda(f, Q, s) \|\, ds < \infty. \tag{163} \]

The functions $(f, t) \mapsto \|H(f, t)\|$ and $(f, t) \mapsto \|\Lambda(f, Q, t)\|$ are measurable, as pointed out in Sec. 3.4.2. Set
\[ R_t(f) = -\frac{1}{2}\, \Lambda(f, Q, t) - \frac{i}{\hbar}\, H(f, t). \tag{164} \]
Then $(f, t) \mapsto R_t(f)$ is weakly measurable and $R_t(f)$ is bounded with $\|R_t(f)\| \le \frac{1}{2} \|\Lambda(f, Q, t)\| + \frac{1}{\hbar} \|H(f, t)\|$, so that $\int_{t_{\#f}}^{t} \|R_s(f)\|\, ds < \infty$. Now define $W^t(f)$ by the appropriate Dyson series
\[ W^t(f) = I + \sum_{n=1}^{\infty} \int_{t_{\#f}}^{t} ds_1 \int_{s_1}^{t} ds_2 \cdots \int_{s_{n-1}}^{t} ds_n\; R_{s_n}(f) \cdots R_{s_1}(f). \tag{165} \]

Corollary 2. Under Assumptions 5 and 7, there exists, for every initial time $t_0$ and every initial state vector $\psi \in \mathscr{H}$ with $\|\psi\| = 1$, a GRWf process with past-dependent Hamiltonians $H(f, t)$ and collapse operators $C(f, q, t)$, where $W^t(f)$ is given by the Dyson series (165). The distribution of the process is unique and of the form $\langle \psi | G(\cdot) \psi \rangle$ for a suitable history POVM $G(\cdot)$ on $(M^{\mathbb{N}}, \mathcal{A}_M^{\otimes \mathbb{N}})$.
3.6. Reconstructing W and Λ

We now make the considerations of Sec. 2.5.3 rigorous and show that the "square-root-plus picture" exists. For simplicity, we ignore the possibility that the sequence of flashes could stop, thus discarding the symbol $\diamond$ and assuming that $G(\cdot)$ is a POVM on $(M^{\mathbb{N}}, \mathcal{A}_M^{\otimes \mathbb{N}})$ with $M = Q \times \mathbb{R}$ and $\mathcal{A}_M = \mathcal{A}_Q \otimes \mathcal{B}(\mathbb{R})$. Define the marginal $G_n(\cdot)$ of $G(\cdot)$ by
\[ G_n(A) = G(A \times M^{\mathbb{N}}) \tag{166} \]
for all $A \in \mathcal{A}_M^{\otimes n}$. Let
\[ \Omega = \{ (z_1, z_2, \ldots) \in M^{\mathbb{N}} : z_k = (q_k, t_k) \in M,\; t_0 \le t_1 \le t_2 \le \cdots \} \tag{167} \]
be the set of time-ordered sequences of flashes, and
\[ \Omega^{(n)} = \{ (z_1, \ldots, z_n) \in M^n : t_0 \le t_1 \le \cdots \le t_n \} \tag{168} \]
the set of length-$n$ time-ordered sequences as in (158).

Assumption 8. The POVM $G(\cdot)$ on $(M^{\mathbb{N}}, \mathcal{A}_M^{\otimes \mathbb{N}})$ is such that
• each of its marginals $G_n(\cdot)$ possesses an operator-valued density function $E_n$, i.e. there is a weakly measurable $E_n : M^n \to \mathcal{B}(\mathscr{H})$ with
\[ G_n(A) = \int_A \tilde\mu^{\otimes n}(dz_1 \cdots dz_n)\, E_n(z_1, \ldots, z_n) \tag{169} \]
for all $A \in \mathcal{A}_M^{\otimes n}$;
• $G(\cdot)$ is concentrated on $\Omega$ as given by (167), i.e. $G(\Omega) = I$;
• for all $f \in M^n$ and $t \ge t_n$,
\[ \int_Q \mu(dq)\, E_{n+1}(f, q, t) \tag{170} \]
exists as a bounded operator;
• $E_n(f) : \mathscr{H} \to \mathscr{H}$ is a bijective operator for all $f \in M^n$, and
\[ \int_t^{\infty} ds \int_Q \mu(dq)\, E_{n+1}(f, q, s) \tag{171} \]
is a bijective operator $\mathscr{H} \to \mathscr{H}$ for every $t \ge t_n$.

Theorem 5. If a given POVM $G(\cdot)$ on $(M^{\mathbb{N}}, \mathcal{A}_M^{\otimes \mathbb{N}})$ satisfies Assumption 8 then there exist positive operators $C(f)$ and $W^t(f)$ (square-root-plus picture), satisfying Assumptions 5 and 6, so that $G(\cdot)$ is the history POVM of the GRWf process associated with $C(f)$ and $W^t(f)$ by Theorem 4.

We conjecture that the last item in Assumption 8 is stronger than necessary, in particular that $E_n(f)$ does not have to be a bijective operator. In particular, rGRWf possesses a positive-operator-valued density function $E_n(f)$ which is not bijective, and we conjecture that it fits the GRWf scheme nonetheless.
Lemma 13. If the POVM $G(\cdot)$ on $(M^{\mathbb{N}}, \mathcal{A}_M^{\otimes\mathbb{N}})$ is such that each of its marginals $G_n(\cdot)$ possesses an operator-valued density function $E_n$ as in (169) relative to $\tilde\mu^{\otimes n}$, then $E_n(f) \ge 0$ for $\tilde\mu^{\otimes n}$-almost all $f \in M^n$, and
\[ \int_M \tilde\mu(dz)\, E_{n+1}(f, z) = E_n(f) \tag{172} \]
for $\tilde\mu^{\otimes n}$-almost all $f \in M^n$. If $G(\Omega) = I$ then $E_n(f) = 0$ for $\tilde\mu^{\otimes n}$-almost all $f \in M^n \setminus \Omega^{(n)}$.

Proof. We begin with showing that $E_n(f) \ge 0$ for $\tilde\mu^{\otimes n}$-almost all $f$. Let $S$ be a countable dense subset of $\mathscr{H}$, and for $\psi \in S$ let $A_\psi$ be the set of $f$ for which $\langle\psi|E_n(f)\psi\rangle < 0$. Since $f \mapsto \langle\psi|E_n(f)\psi\rangle$ is a Radon–Nikodym density function of the measure $\langle\psi|G_n(\cdot)\psi\rangle$ relative to $\tilde\mu^{\otimes n}$, it is nonnegative almost everywhere, i.e. $A_\psi$ is a null set. As a consequence, $A_S := \bigcup_{\psi\in S} A_\psi$ is a null set. Now for arbitrary $\psi \in \mathscr{H}$, there is a sequence $(\psi_m)_{m\in\mathbb{N}}$ in $S$ with $\psi_m \to \psi$ as $m \to \infty$, and hence $\langle\psi_m|E_n(f)\psi_m\rangle \to \langle\psi|E_n(f)\psi\rangle$. Since the limit cannot be negative if none of the members of the sequence is, $\langle\psi|E_n(f)\psi\rangle \ge 0$ on $M^n \setminus A_S$, which is what we claimed.

We turn to (172). There is no loss of generality in assuming that $E_n(f) \ge 0$ for all (instead of almost all) $f \in M^n$ (and all $n$): a change of $(f, z) \mapsto E_{n+1}(f, z)$ on a $\tilde\mu^{\otimes(n+1)}$-null set entails that for $\tilde\mu^{\otimes n}$-almost all $f$, $z \mapsto E_{n+1}(f, z)$ changes only on a $\tilde\mu$-null set of $z$'s, so that the integral in (172) is not affected. Let $\psi \in \mathscr{H}$ and consider the two functions
\[ g^\psi(f) = \int_M \tilde\mu(dz)\, \langle\psi|E_{n+1}(f, z)\psi\rangle, \qquad h^\psi(f) = \langle\psi|E_n(f)\psi\rangle. \tag{173} \]
By the Fubini–Tonelli theorem,
\[ \int_A \tilde\mu^{\otimes n}(df)\, g^\psi(f) = \int_{A\times M} \tilde\mu^{\otimes(n+1)}(df_{n+1})\, \langle\psi|E_{n+1}(f_{n+1})\psi\rangle = \langle\psi|G_{n+1}(A \times M)\psi\rangle. \tag{174} \]
Since $G_{n+1}(A \times M) = G_n(A)$, $g^\psi$ is a density function of the measure $\langle\psi|G_n(\cdot)\psi\rangle$ relative to $\tilde\mu^{\otimes n}$. Of course, $h^\psi$ is another density function of the same measure. By the Radon–Nikodym theorem, the density is unique up to changes on null sets, and thus
\[ g^\psi(f) = h^\psi(f) \tag{175} \]
for almost all $f$. We still have to show that a null set containing all $f$ for which (175) fails to hold can be chosen independently of $\psi$. To this end, let $S$ be a countable dense subset of $\mathscr{H}$; without loss of generality we assume that $S$ is a vector space over the complex rationals $\mathbb{Q} + i\mathbb{Q}$. For $\psi \in S$ let $A_\psi$ be the set of those $f$ for which (175) fails to hold. We know that $A_\psi$ is a null set, and thus that $A_S = \bigcup_{\psi\in S} A_\psi$ is a null set. Fix $f \in M^n \setminus A_S$. We have that for all $\psi \in S$, $g^\psi(f) = h^\psi(f)$. By the vector
space structure of $S$, if $\phi$ and $\psi$ are contained in $S$ then so are $\phi \pm \psi$ and $\phi \pm i\psi$; using the polarization identity
\[ \langle\phi|T\psi\rangle = \tfrac{1}{4}\bigl(Q(\phi+\psi) - Q(\phi-\psi) - iQ(\phi+i\psi) + iQ(\phi-i\psi)\bigr) \tag{176} \]
with $Q(\chi) = \langle\chi|T\chi\rangle$, we obtain that
\[ \int_M \tilde\mu(dz)\, \langle\phi|E_{n+1}(f, z)\psi\rangle = \langle\phi|E_n(f)\psi\rangle \tag{177} \]
for all $\phi, \psi \in S$. By linearity in $\phi, \psi$ of each side, this is also true for all $\phi, \psi$ in the $\mathbb{C}$-vector space spanned by $S$. That is, $g^\psi(f) = h^\psi(f)$ for all $\psi$ from a dense subspace of $\mathscr{H}$, and hence, by Lemma 2, for all $\psi \in \mathscr{H}$. That is, (172) holds for all $f \in M^n \setminus A_S$.

Now suppose $G(\Omega) = I$. Then $G_n(\Omega^{(n)}) = I$, or $G_n(A_n) = 0$ for $A_n := M^n \setminus \Omega^{(n)}$. Let $S$ be a countable dense subset of $\mathscr{H}$, and for $\psi \in S$ let $A_\psi$ be the set of those $f \in A_n$ for which $\langle\psi|E_n(f)\psi\rangle \ne 0$. Since the integral of the nonnegative function $f \mapsto \langle\psi|E_n(f)\psi\rangle$ over $A_n$ equals $\langle\psi|G_n(A_n)\psi\rangle = 0$, the function must vanish $\tilde\mu^{\otimes n}$-almost everywhere in $A_n$, and thus $\tilde\mu^{\otimes n}(A_\psi) = 0$. As a consequence, $A_S := \bigcup_{\psi\in S} A_\psi$ is a null set. Now for arbitrary $\psi \in \mathscr{H}$, there is a sequence $(\psi_m)_{m\in\mathbb{N}}$ in $S$ with $\psi_m \to \psi$ as $m \to \infty$, and hence (since $E_n(f)$ is bounded) $0 = \langle\psi_m|E_n(f)\psi_m\rangle \to \langle\psi|E_n(f)\psi\rangle$ for every $f \in A_n \setminus A_S$, which is what we claimed.

Proof of Theorem 5. There is no loss of generality in assuming that $E_n(f) \ge 0$ for all (instead of almost all) $f \in M^n$ (and all $n$), that (172) holds for all $f \in M^n$, and that $E_n(f) = 0$ for all $f \in M^n \setminus \Omega^{(n)}$: Inductively along $n$, we change $E_n(f)$ to zero if the given $E_n(f)$ was not positive or nonzero for $f \notin \Omega^{(n)}$; then we change $E_{n+1}(f, z)$ on a null set of $f$'s (and thus for a null set of pairs $(f, z) \in M^{n+1}$) so as to make (172) true for all $f \in M^n$ (which is clearly possible in a weakly measurable way).

For the reconstruction of the $W$ and $C$ operators we proceed along the lines of (93)–(98). Set $L_0 := I$ and, for $t \ge t_0$,
\[ W^t(\emptyset) := \Bigl[\int_t^\infty ds \int_Q \mu(dq)\, E_1(q, s)\Bigr]^{1/2}. \tag{178} \]
By (171), the bracket is a well-defined and bijective operator, and must be positive because $E_1(q, s) \ge 0$. Thus, the square root exists and is positive; it is bijective, too, since if $T^2$ is bijective then so is $T$. For $t < t_0$ set $W^t(\emptyset) = 0$. The function $t \mapsto W^t(\emptyset)$ is weakly measurable because integrals (such as $\int_t^\infty ds \int_Q \mu(dq)\, \langle\psi|E_1(q, s)\psi\rangle$) are measurable functions of their boundaries, and by Lemma 8 the root is measurable, too. Now set, for all $q \in Q$ and $t \ge t_0$,
\[ \Lambda(q, t) := W^t(\emptyset)^{-1} E_1(q, t)\, W^t(\emptyset)^{-1}. \tag{179} \]
This is well-defined and bijective (since $E_1(q, t)$ was assumed bijective); it is positive because $E_1(q, t)$ is positive and $W^t(\emptyset)$ is self-adjoint (and thus so is its inverse). It
is weakly measurable as a function of $(q, t)$ because $W^t(\emptyset)$ is, by Lemma 7 $W^t(\emptyset)^{-1}$ is, $E_1(q, t)$ is by assumption, and the product is by Lemma 5. Now set
\[ C(q, t) := \Lambda(q, t)^{1/2}. \tag{180} \]
It is clearly well-defined, bijective, positive, and weakly measurable as a function of $(q, t)$.

Our induction hypothesis asserts that $L_{n-1}(z_1, \ldots, z_{n-1})$, $W^{t_n}(z_1, \ldots, z_{n-1})$, $\Lambda(z_1, \ldots, z_n)$ and $C(z_1, \ldots, z_n)$ are all well-defined, bijective, positive except $L_{n-1}(z_1, \ldots, z_{n-1})$, and weakly measurable as a function of $(z_1, \ldots, z_n) \in \Omega^{(n)}$; the set $\Omega^{(n)}$ was defined in (158); furthermore, it is part of the induction hypothesis that $L_{n-1}^{-1}(z_1, \ldots, z_{n-1})$ is weakly measurable. Now set, for $f_n = (z_1, \ldots, z_n) \in \Omega^{(n)}$ and $f_{n-1} = (z_1, \ldots, z_{n-1})$,
\[ L_n(f_n) := C(f_n)\, W^{t_n}(f_{n-1})\, L_{n-1}(f_{n-1}). \tag{181} \]
By induction hypothesis, all factors are well-defined, bijective, and weakly measurable as a function of $f_n$, and using Lemma 5, so is $L_n$; $L_n^{-1}$ is weakly measurable too, since $C(f_n)^{-1}$ and $W^{t_n}(f_{n-1})^{-1}$ are by Lemma 7, and $L_{n-1}(f_{n-1})^{-1}$ is by induction hypothesis. Set, for $t \ge t_n$,
\[ W^t(f_n) := \Bigl[L_n^*(f_n)^{-1} \int_t^\infty ds \int_Q \mu(dq)\, E_{n+1}(f_n, q, s)\, L_n(f_n)^{-1}\Bigr]^{1/2}. \tag{182} \]
Note that the adjoint of a bijective operator is bijective (because if $S$ is a left (right) inverse of $T$ then $S^*$ is a right (left) inverse of $T^*$), and the inverse of the adjoint is the adjoint of the inverse. That is why the bracket is a positive operator, so that the square root can be taken. By assumption, (171) is bijective for $t \ge t_n$, and thus so is $W^t(f_n)$. We already know that $f_n \mapsto L_n(f_n)^{-1}$ is weakly measurable; so is the adjoint, and the middle integral is because $(f_n, q, s) \mapsto E_{n+1}(f_n, q, s)$ is by assumption. Thus, $(f_n, t) \mapsto W^t(f_n)$ is weakly measurable. By the same arguments, with $z = (q, t) \in Q \times \mathbb{R}$ and $t \ge t_n$,
\[ \Lambda(f_n, z) := W^t(f_n)^{-1} L_n^*(f_n)^{-1} E_{n+1}(f_n, z)\, L_n(f_n)^{-1} W^t(f_n)^{-1} \tag{183} \]
and
\[ C(f_n, z) = \Lambda(f_n, z)^{1/2} \tag{184} \]
are well-defined, bijective, positive and weakly measurable as functions of $(f_n, z)$. This proves the induction hypothesis for $n + 1$.

It now follows directly from (183), (181), and (179) that
\[ E_n(f_n) = L_n^*(f_n) L_n(f_n) \text{ for } f_n \in \Omega^{(n)}, \quad\text{and}\quad E_n(f_n) = 0 = L_n^*(f_n) L_n(f_n) \text{ for } f_n \in M^n \setminus \Omega^{(n)}. \tag{185} \]
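The step from (183) and (181) to (185) can be spelled out as a short induction. The following is a sketch of that computation, not a quotation from the proof; the abbreviations $W$, $C$, $\Lambda$ are introduced here only for brevity, and the only facts used are that $C$ and $W^t$ are positive (hence self-adjoint) and that $\Lambda = C^2$ by (184).

```latex
% Inductive step behind (185); notation as in (181)-(184).
% Abbreviations (introduced here only for brevity):
%   W := W^{t_{n+1}}(f_n),  C := C(f_{n+1}),  \Lambda := \Lambda(f_{n+1}),
%   with f_{n+1} = (f_n, z_{n+1}) and z_{n+1} = (q_{n+1}, t_{n+1}).
\begin{align*}
L_{n+1}(f_{n+1})^* L_{n+1}(f_{n+1})
  &= L_n(f_n)^* \, W \, C^2 \, W \, L_n(f_n)
     && \text{by (181), since } C = C^*,\ W = W^* \\
  &= L_n(f_n)^* \, W \, \Lambda \, W \, L_n(f_n)
     && \text{by (184)} \\
  &= L_n(f_n)^* \bigl( L_n^*(f_n)^{-1} E_{n+1}(f_{n+1}) L_n(f_n)^{-1} \bigr) L_n(f_n)
     && \text{by (183)} \\
  &= E_{n+1}(f_{n+1}).
\end{align*}
% Base case n = 1: L_1 = C(q,t) W^t(\emptyset) L_0 with L_0 = I, and (179), (180) give
% L_1^* L_1 = W^t(\emptyset) \Lambda(q,t) W^t(\emptyset) = E_1(q,t).
```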
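To show that Assumption 5 is fulfilled, it remains to check that $\Lambda(f, Q, t)$ exists as a bounded operator. Indeed, with $f = f_n$ and $z = (q, t)$,
\[ \int_Q \mu(dq)\, \langle\psi|C(f, q, t)^* C(f, q, t)\psi\rangle = \int_Q \mu(dq)\, \langle\psi|\Lambda(f, q, t)\psi\rangle \tag{186} \]
(by the definition of $\Lambda(f, q, t)$)
\[ = \Bigl\langle L_n(f_n)^{-1} W^t(f_n)^{-1}\psi \Bigm| \int_Q \mu(dq)\, E_{n+1}(f_n, z)\, L_n(f_n)^{-1} W^t(f_n)^{-1}\psi \Bigr\rangle \tag{187} \]
\[ \le \Bigl\|\int_Q \mu(dq)\, E_{n+1}(f_n, z)\Bigr\|\, \|L_n(f_n)^{-1}\|^2\, \|W^t(f_n)^{-1}\|^2\, \|\psi\|^2. \tag{188} \]
The operators in the norms are bounded because $L_n(f_n)^{-1}$ and $W^t(f_n)^{-1}$ are bijective, and (170) was assumed to be bounded.

To show that Assumption 6 is fulfilled, it remains to check (162).
\[ \int_{t_n}^t dt'\, W^{t'}(f_n)^* \Lambda(f, Q, t')\, W^{t'}(f_n) \tag{189} \]
\[ = \int_{t_n}^t dt' \int_Q \mu(dq)\, W^{t'}(f_n)^* \Lambda(f, q, t')\, W^{t'}(f_n) \tag{190} \]
(by (183))
\[ = \int_{t_n}^t dt' \int_Q \mu(dq)\, L_n^*(f_n)^{-1} E_{n+1}(f_n, q, t')\, L_n(f_n)^{-1} \tag{191} \]
(by Lemma 4)
\[ = L_n^*(f_n)^{-1} \Bigl[\int_{t_n}^t dt' \int_Q \mu(dq)\, E_{n+1}(f_n, q, t')\Bigr] L_n(f_n)^{-1}, \tag{192} \]
while by (182)
\[ W^t(f_n)^* W^t(f_n) = L_n^*(f_n)^{-1} \Bigl[\int_t^\infty dt' \int_Q \mu(dq)\, E_{n+1}(f_n, q, t')\Bigr] L_n(f_n)^{-1}. \tag{193} \]
Thus, the sum of the two equations is
\[ W^t(f_n)^* W^t(f_n) + \int_{t_n}^t dt'\, W^{t'}(f_n)^* \Lambda(f, Q, t')\, W^{t'}(f_n) = L_n^*(f_n)^{-1} \Bigl[\int_{t_n}^\infty dt' \int_Q \mu(dq)\, E_{n+1}(f_n, q, t')\Bigr] L_n(f_n)^{-1} \tag{194} \]
(by (172))
\[ = L_n^*(f_n)^{-1} E_n(f_n)\, L_n(f_n)^{-1} = L_n^*(f_n)^{-1} L_n^*(f_n) L_n(f_n) L_n(f_n)^{-1} = I \tag{195} \]
by (185).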
4. Relativistic GRW Theory

We begin by introducing some terminology and notation. We generally intend that all manifolds, surfaces, and curves are $C^\infty$. A space-time $(M, g)$ is a time-oriented Lorentzian 4-manifold (see, e.g., [54]). The simplest example is Minkowski space-time ($M = \mathbb{R}^4$, $g = \mathrm{diag}(1, -1, -1, -1)$). A 3-surface is a 3-dimensional embedded submanifold (without boundary) of $M$ that is closed in the topology of $M$. A 3-surface $\Sigma$ is spacelike if every nonzero tangent vector to $\Sigma$ is spacelike. Note that a spacelike 3-surface is a Riemannian manifold. If $\Sigma$ is a spacelike 3-surface and $x, y \in \Sigma$, the spacelike distance from $x$ to $y$ along $\Sigma$, $\mathrm{dist}_\Sigma(x, y)$, is the infimum of the Riemannian lengths of all curves in $\Sigma$ connecting $x$ to $y$.

A curve in $M$ is timelike if every nonzero tangent vector to the curve is timelike; we will always regard timelike curves as directed towards the future, i.e. we assume that the derivative relative to the curve parameter is future-pointing. A timelike curve is inextendible in $M$ if it is not a proper subset of a timelike curve in $M$. A curve in $M$ is causal if every nonzero tangent vector to the curve is either timelike or lightlike; we also regard causal curves as directed towards the future.

For every subset $A \subseteq M$, the (causal) future of $A$ is the set
\[ J^+(A) = \{y \in M : \exists\, x \in A\ \exists \text{ a causal curve from } x \text{ to } y\}, \tag{196} \]
and the (causal) past of $A$ is
\[ J^-(A) = \{y \in M : \exists\, x \in A\ \exists \text{ a causal curve from } y \text{ to } x\}. \tag{197} \]
For example, in Minkowski space-time
\[ J^+(x) = \{y \in \mathbb{R}^4 : (y^\mu - x^\mu)(y_\mu - x_\mu) \ge 0,\ y^0 - x^0 \ge 0\}. \tag{198} \]
(As usual, $y_\mu = g_{\mu\nu} y^\nu$, and we adopt the sum convention implying summation over indices that appear both upstairs and downstairs.) For $y \in J^+(x)$, the timelike distance of $y$ from $x$, $\tau(y, x)$, is the supremum of the lengths of all causal curves connecting $x$ to $y$. For Minkowski space-time,
\[ \tau(y, x) = \bigl((y^\mu - x^\mu)(y_\mu - x_\mu)\bigr)^{1/2}. \tag{199} \]
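For Minkowski space-time, (198) and (199) are explicit enough to evaluate directly. The following is a minimal sketch, assuming points are given as 4-tuples $(x^0, x^1, x^2, x^3)$ in a fixed Lorentz frame; the function names are illustrative and do not come from the paper.

```python
import math

def minkowski_sq(v):
    """Lorentzian square v^mu v_mu with signature (+, -, -, -)."""
    t, x1, x2, x3 = v
    return t * t - x1 * x1 - x2 * x2 - x3 * x3

def in_causal_future(y, x):
    """True if y lies in J^+(x), following (198)."""
    d = tuple(a - b for a, b in zip(y, x))
    return minkowski_sq(d) >= 0 and d[0] >= 0

def timelike_distance(y, x):
    """tau(y, x) from (199); only defined for y in J^+(x)."""
    if not in_causal_future(y, x):
        raise ValueError("y is not in the causal future of x")
    d = tuple(a - b for a, b in zip(y, x))
    return math.sqrt(minkowski_sq(d))

origin = (0.0, 0.0, 0.0, 0.0)
print(timelike_distance((5.0, 3.0, 0.0, 0.0), origin))  # 4.0
print(timelike_distance((1.0, 1.0, 0.0, 0.0), origin))  # 0.0, a point on the light cone
print(in_causal_future((1.0, 2.0, 0.0, 0.0), origin))   # False, spacelike separation
```

The second example illustrates that $\tau$ vanishes on the boundary of $J^+(x)$, which is the last condition required in Assumption 9.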
Assumption 9. $(M, g)$ is such that $\tau(\cdot, x) : J^+(x) \to [0, \infty)$ is $C^\infty$ on the interior of $J^+(x)$, and its derivative $\nabla_\mu\tau$ vanishes nowhere. Furthermore, $\tau(y, x) = 0$ if and only if $y \in \partial J^+(x)$.

For example, this is the case in Minkowski space-time. It is not the case in space-time manifolds with closed timelike curves, in which $\tau$ may have nondifferentiable points. The future hyperboloid based at a point $x$ and with distance parameter $s > 0$ is the set
\[ H_s(x) = \{y \in J^+(x) : \tau(y, x) = s\}. \tag{200} \]
If $x \in J^+(x')$ then we write $H(x, x') = H_{\tau(x, x')}(x')$ for the future hyperboloid based at $x'$ containing $x$. In Minkowski space-time, the future hyperboloids are
\[ H_s(x) = \Bigl\{(y^0, y^1, y^2, y^3) \in \mathbb{R}^4 : y^0 = x^0 + \Bigl(s^2 + \sum_{i=1}^3 (y^i - x^i)^2\Bigr)^{1/2}\Bigr\}. \tag{201} \]
From Assumption 9, it follows (by the implicit function theorem) that $H_s(x)$ is an embedded submanifold, and thus a 3-surface; it is spacelike because $\nabla_\mu\tau$ is timelike.

A Cauchy surface in $M$ is a spacelike 3-surface that intersects every inextendible causal curve in $M$ exactly once.$^d$ Let $\mathcal{C}$ be the set of all Cauchy surfaces in $M$, $\mathcal{H}$ the set of all future hyperboloids in $M$. The future hyperboloids are not necessarily Cauchy surfaces. In Minkowski space-time, for example, they never are: Indeed, for given $x \in \mathbb{R}^4$, $t \mapsto y(t) = x + (t, \sqrt{1 + t^2}, 0, 0)$ is an inextendible timelike curve that does not intersect $J^+(x)$, and in particular not the future hyperboloids. To see this, note first that its tangent vector $u^\mu = dy^\mu/dt = (1, t/\sqrt{1 + t^2}, 0, 0)$ is always timelike as $u^\mu u_\mu = 1 - t^2/(1 + t^2) > 0$, and since $u^\mu$ is nonzero every other tangent vector is a multiple of $u^\mu$. It is inextendible because $y^0(t) \to \pm\infty$ as $t \to \pm\infty$, and it does not intersect $J^+(x)$ because $(y^\mu(t) - x^\mu)(y_\mu(t) - x_\mu) = t^2 - (1 + t^2) = -1 < 0$.

As a consequence of its Lorentzian metric, $M$ is endowed with a natural σ-finite measure, which we denote $d^4x$. Similarly, every spacelike 3-surface $\Sigma$, being a Riemannian manifold, is endowed with a natural σ-finite measure, the Riemannian volume measure, which we denote $d^3x$ (it will always be clear which $\Sigma$ we refer to). For example, if the hyperboloid $H_s(0)$ given by (201) is coordinatized by $x^1, x^2, x^3$ then the measure $d^3x$ has density $1/\sqrt{1 + r^2/s^2}$ in coordinates, i.e.
\[ \int_{H_s(0)} d^3x\, f(x) = \int_{\mathbb{R}^3} dx^1\, dx^2\, dx^3\, \frac{f\bigl(\sqrt{s^2 + r^2}, x^1, x^2, x^3\bigr)}{\sqrt{1 + r^2/s^2}}, \tag{202} \]
where $r(x^1, x^2, x^3) := \bigl(\sum_{k=1}^3 (x^k)^2\bigr)^{1/2}$.

Lemma 14 (Coarea formula). Under Assumption 9, for any $x' \in M$ and any measurable $f : J^+(x') \to [0, \infty)$,
\[ \int_{J^+(x')} d^4x\, f(x) = \int_0^\infty ds \int_{H_s(x')} d^3x\, f(x). \tag{203} \]

Proof. The general coarea formula can be found as [35, Theorem 3.2.12]. (Actually, it is not necessary that $\tau(\cdot, x')$ be $C^\infty$: we only need locally Lipschitz (which is true in any Lorentzian manifold), but would then have to say more about the definition of $d^3x$.) The easiest way to see that the Jacobian factor is correct is by noting that an orthonormal basis of the tangent space in $x$ to $H_s(x')$, together with the future-pointing unit normal in $x$ to $H_s(x')$, forms an orthonormal basis of the tangent space in $x$ to $M$.

$^d$ O'Neill [54] defines a Cauchy surface as a subset that intersects every inextendible timelike curve in $M$ exactly once. That is different in two ways: it allows submanifolds that are not $C^\infty$, and it allows 3-surfaces possessing lightlike tangent vectors.
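The Minkowski claims made earlier in this section (points of the form (201) have timelike distance $s$ from the base point, and the curve $y(t) = x + (t, \sqrt{1+t^2}, 0, 0)$ is timelike yet stays outside $J^+(x)$) can be checked numerically. This is a minimal sketch with the base point $x$ placed at the origin; the helper names are illustrative and not taken from the paper.

```python
import math

def minkowski_sq(v):
    """Lorentzian square v^mu v_mu with signature (+, -, -, -)."""
    t, x1, x2, x3 = v
    return t * t - x1 * x1 - x2 * x2 - x3 * x3

def hyperboloid_point(s, x1, x2, x3):
    """Point of H_s(0) above the spatial coordinates (x1, x2, x3), as in (201)."""
    return (math.sqrt(s * s + x1 * x1 + x2 * x2 + x3 * x3), x1, x2, x3)

def curve(t):
    """y(t) - x = (t, sqrt(1 + t^2), 0, 0), the curve used in the text."""
    return (t, math.sqrt(1.0 + t * t), 0.0, 0.0)

def curve_tangent(t):
    """Tangent vector u = dy/dt of the curve above."""
    return (1.0, t / math.sqrt(1.0 + t * t), 0.0, 0.0)

# Every point built from (201) has timelike distance s from the base point.
p = hyperboloid_point(2.0, 1.0, -3.0, 0.5)
print(math.sqrt(minkowski_sq(p)))  # approximately 2.0

# The curve is timelike (u.u > 0) but y(t) - x has Lorentzian square -1,
# so it never enters J^+(x) and misses every future hyperboloid based at x.
for t in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(t, minkowski_sq(curve_tangent(t)) > 0, minkowski_sq(curve(t)))
```

Up to floating-point rounding, the last column is $-1$ for every $t$, matching $t^2 - (1 + t^2) = -1$.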
4.1. Abstract definition of the relativistic flash process

We now present the abstract definition of the relativistic GRW flash process, or rGRWf process. It is abstract in the sense that it supposes certain operators as given, for which we provide concrete specification later in Sec. 4.2.

Suppose that for every $\Sigma \in \mathcal{C} \cup \mathcal{H}$ we are given a Hilbert space $\mathscr{H}_\Sigma$, and we are given another Hilbert space $\mathscr{H}_0$. Suppose that we are given unitary time evolution operators, as in ordinary quantum mechanics, in the following sense: For every two $\Sigma, \Sigma' \in \mathcal{C} \cup \mathcal{H}$ we are given a unitary isomorphism $U_\Sigma^{\Sigma'} : \mathscr{H}_\Sigma \to \mathscr{H}_{\Sigma'}$ such that
\[ U_\Sigma^\Sigma = I_\Sigma, \qquad U_{\Sigma'}^{\Sigma''} U_\Sigma^{\Sigma'} = U_\Sigma^{\Sigma''}, \tag{204} \]
where $I_\Sigma$ denotes the identity operator on $\mathscr{H}_\Sigma$. (Our example will be the time evolution defined by the Dirac equation.) The family $(U_\Sigma^{\Sigma'})$ can be represented in another way through a family of unitary isomorphisms $U_\Sigma : \mathscr{H}_0 \to \mathscr{H}_\Sigma$. Indeed, if $(U_\Sigma^{\Sigma'})$ are given, choose an arbitrary $\Sigma_0 \in \mathcal{C} \cup \mathcal{H}$ and an arbitrary unitary isomorphism $U_{\Sigma_0} : \mathscr{H}_0 \to \mathscr{H}_{\Sigma_0}$, and set $U_\Sigma = U_{\Sigma_0}^{\Sigma} U_{\Sigma_0}$. Conversely, if a family $U_\Sigma : \mathscr{H}_0 \to \mathscr{H}_\Sigma$ is given, define $U_\Sigma^{\Sigma'} = U_{\Sigma'} U_\Sigma^{-1}$, and (204) is satisfied. (Note that identifying $\mathscr{H}_0$ with $\mathscr{H}_\Sigma$ by means of $U_\Sigma$ is nothing but the Heisenberg picture.)

Furthermore, for every $\Sigma \in \mathcal{H}$ we are given an operator-valued function $\Lambda_\Sigma : \Sigma \to \mathcal{B}(\mathscr{H}_\Sigma)$ such that every $\Lambda_\Sigma(x)$ is positive. Let $\lambda > 0$ be a constant (the same as in Sec. 2.1).

Assumption 10. $\Lambda_\Sigma : \Sigma \to \mathcal{B}(\mathscr{H}_\Sigma)$ is weakly measurable, and
\[ \int_\Sigma d^3x\, \Lambda_\Sigma(x) = \lambda I_\Sigma. \tag{205} \]
In addition, on the set $\{(x, x') \in M^2 : x \in J^+(x')\}$ the function
\[ (x, x') \mapsto U_{H(x,x')}^{-1} \Lambda_{H(x,x')}(x)\, U_{H(x,x')} \in \mathcal{B}(\mathscr{H}_0), \tag{206} \]
where $H(x, x')$ means the future hyperboloid based at $x'$ containing $x$, is weakly measurable.

(For a concrete specification of $\mathscr{H}_\Sigma$, $U_\Sigma^{\Sigma'}$, and $\Lambda_\Sigma(x)$ see Sec. 4.2.) Moreover, suppose we are given a finite label set $\mathscr{L}$; set $N := \#\mathscr{L}$. For every $i \in \mathscr{L}$ we are given a point $X_{i,0} \in M$, called the seed flash with label $i$. Finally, we are given a vector
\[ \psi \in \bigotimes_{i\in\mathscr{L}} \mathscr{H}_0 \tag{207} \]
(i.e. the product of $N$ copies of $\mathscr{H}_0$) with $\|\psi\| = 1$. Let the history space be
\[ \Omega := M^{\mathscr{L}\times\mathbb{N}} \tag{208} \]
Fig. 2. The 3-surface $\Sigma(x', x) = H(x, x')$ of constant timelike distance from $x'$ containing $x$, in Minkowski space-time.
(corresponding to one sequence of flashes in $M$ for every label $i$) with σ-algebra
\[ \mathcal{A} := \mathcal{B}(M)^{\otimes(\mathscr{L}\times\mathbb{N})}. \tag{209} \]
For $x, x' \in M$ define the operator $K_{x'}(x) \in \mathcal{B}(\mathscr{H}_0)$ by
\[ K_{x'}(x) := 1_{x\in J^+(x')}\, e^{-\lambda\tau(x,x')/2}\, U_\Sigma^{-1} \Lambda_\Sigma(x)^{1/2} U_\Sigma, \tag{210} \]
where $\Sigma = H(x, x')$ is the future hyperboloid based at $x'$ containing $x$ as in Fig. 2. For any sequence $f = (x_0, x_1, x_2, \ldots, x_n)$ of space-time points, set
\[ K(f) := K_{x_{n-1}}(x_n) \cdots K_{x_1}(x_2)\, K_{x_0}(x_1). \tag{211} \]
Definition 5. Given the data just listed, an rGRWf process is a random variable $F = (X_{i,k} : i \in \mathscr{L}, k \in \mathbb{N})$ with values in $(\Omega, \mathcal{A})$ such that for every choice of $n = (n_i) \in \mathbb{N}^{\mathscr{L}}$ the joint distribution of the first $n_i$ flashes of type $i$ is
\[ \mathbb{P}(X_{i,k} \in d^4x_{i,k} : i \in \mathscr{L}, k \le n_i) = \Bigl\|\bigotimes_{i\in\mathscr{L}} K(f_i)\, \psi\Bigr\|^2\, df \tag{212} \]
with the notation $f_i = (x_{i,0}, \ldots, x_{i,n_i})$ and
\[ df = \prod_{i\in\mathscr{L}} \prod_{k=1}^{n_i} d^4x_{i,k}. \tag{213} \]

Theorem 6. Given the data listed above, and if Assumptions 9 and 10 hold, then an rGRWf process exists and is unique in distribution. The distribution is $\langle\psi|G(\cdot)\psi\rangle$ for a suitable history POVM $G(\cdot)$ on the history space $\Omega$.
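Lemma 15. Under Assumptions 9 and 10, $(x, x') \mapsto K_{x'}^*(x) K_{x'}(x)$ is weakly measurable, and
\[ \int_M d^4x\, K_{x'}^*(x) K_{x'}(x) = I. \tag{214} \]

Proof. The function $M^2 \ni (x, x') \mapsto K_{x'}(x) \in \mathcal{B}(\mathscr{H}_0)$ is weakly measurable because it is, up to the measurable factor $1_{x\in J^+(x')}\, e^{-\lambda\tau(x,x')/2}$, the square root of $(x, x') \mapsto U_{H(x,x')}^{-1} \Lambda_{H(x,x')}(x)\, U_{H(x,x')}$, which is weakly measurable by Assumption 10. By the usual arguments, $M^2 \ni (x, x') \mapsto K_{x'}^*(x) K_{x'}(x)$ is weakly measurable. By definition (210), with $\Sigma = H(x, x')$ and $x \in J^+(x')$,
\[ e^{\lambda\tau(x,x')} K_{x'}^*(x) K_{x'}(x) = \bigl(U_\Sigma^{-1} \Lambda_\Sigma(x)^{1/2} U_\Sigma\bigr)^* U_\Sigma^{-1} \Lambda_\Sigma(x)^{1/2} U_\Sigma \tag{215} \]
(because $U_\Sigma$ is unitary and $\Lambda_\Sigma(x)$ is self-adjoint)
\[ = U_\Sigma^{-1} \Lambda_\Sigma(x)^{1/2} U_\Sigma\, U_\Sigma^{-1} \Lambda_\Sigma(x)^{1/2} U_\Sigma = U_\Sigma^{-1} \Lambda_\Sigma(x)\, U_\Sigma. \tag{216} \]
Thus,
\[ \int_M d^4x\, K_{x'}^*(x) K_{x'}(x) = \int_M d^4x\, 1_{x\in J^+(x')}\, e^{-\lambda\tau(x,x')}\, U_{H(x,x')}^{-1} \Lambda_{H(x,x')}(x)\, U_{H(x,x')} \]
(by Lemma 14)
\[ = \int_0^\infty ds\, e^{-\lambda s} \int_{H_s(x')} d^3x\, U_{H_s(x')}^{-1} \Lambda_{H_s(x')}(x)\, U_{H_s(x')} \]
(by Lemma 4)
\[ = \int_0^\infty ds\, e^{-\lambda s}\, U_{H_s(x')}^{-1} \Bigl[\int_{H_s(x')} d^3x\, \Lambda_{H_s(x')}(x)\Bigr] U_{H_s(x')} \]
(by (205))
\[ = \int_0^\infty ds\, e^{-\lambda s}\, U_{H_s(x')}^{-1}\, \lambda I_{H_s(x')}\, U_{H_s(x')} = \int_0^\infty ds\, e^{-\lambda s} \lambda I = I. \]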
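The final equality in this proof rests on an elementary integral, written out below as a worked step. The remark about an exponential law in the comments is an interpretation offered here under the assumption of a single label and a normalized $\psi$; it is not a statement quoted from the proof.

```latex
% Worked step for the last line of the proof, for any constant \lambda > 0:
\[
  \int_0^\infty ds\, \lambda e^{-\lambda s}
  = \bigl[-e^{-\lambda s}\bigr]_{s=0}^{s=\infty}
  = 0 - (-1) = 1 .
\]
% Same computation restricted to one hyperboloid H_s(x'), using (216) and (205):
\[
  \int_{H_s(x')} d^3x\, \langle\psi| K_{x'}^*(x) K_{x'}(x) \psi\rangle
  = \lambda e^{-\lambda s}\, \|\psi\|^2 .
\]
% Reading (an assumption of this note): for \|\psi\| = 1 the timelike distance
% s = \tau(x, x') from a flash at x' to the next flash at x has density
% \lambda e^{-\lambda s}, i.e. it is exponentially distributed with rate \lambda.
```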
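The following lemma is the analog of Lemma 4 for tensor products.

Lemma 16. If $\mathscr{H}_1, \mathscr{H}_2, \mathscr{H}_3$ are separable Hilbert spaces, $q \mapsto \Lambda(q) \in \mathcal{B}(\mathscr{H}_2)$ is weakly measurable, and $R \in \mathcal{B}(\mathscr{H}_1)$, $S \in \mathcal{B}(\mathscr{H}_3)$, and $T = \int \Lambda(q)\, \mu(dq) \in \mathcal{B}(\mathscr{H}_2)$ then $q \mapsto R \otimes \Lambda(q) \otimes S \in \mathcal{B}(\mathscr{H}_1 \otimes \mathscr{H}_2 \otimes \mathscr{H}_3)$ is weakly measurable, and
\[ R \otimes T \otimes S = \int R \otimes \Lambda(q) \otimes S\, \mu(dq). \tag{217} \]

Proof. This is an immediate consequence of Lemma 4: replace $R \to R \otimes I \otimes I$, $T \to I \otimes T \otimes I$, and $S \to I \otimes I \otimes S$, and note that $(P \otimes I)(I \otimes Q) = P \otimes Q$.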
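The algebraic identity used in this proof can be illustrated in finite dimensions, where tensor products of operators become Kronecker products of matrices. The sketch below is only a finite-dimensional sanity check with small integer matrices chosen for illustration; it says nothing about the measurability part of the lemma.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product, i.e. the matrix of a (x) b in the product basis."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def identity(n):
    """n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

I2 = identity(2)
R = [[1, 2], [3, 4]]
T = [[0, 1], [1, 5]]
S = [[2, 0], [7, -1]]

# (P (x) I)(I (x) Q) = P (x) Q, here with P = R and Q = T
print(matmul(kron(R, I2), kron(I2, T)) == kron(R, T))  # True

# Three factors: (R (x) I (x) I)(I (x) T (x) I)(I (x) I (x) S) = R (x) T (x) S
lhs = matmul(matmul(kron(kron(R, I2), I2), kron(kron(I2, T), I2)),
             kron(kron(I2, I2), S))
print(lhs == kron(kron(R, T), S))  # True
```

Integer entries are used so that the comparisons are exact rather than subject to floating-point rounding.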
Proof of Theorem 6. For every $n \in \mathbb{N}$, we define a POVM $G_n(\cdot)$ on $((M^{\mathscr{L}})^n, \mathcal{B}(M)^{\otimes\mathscr{L}\times n})$ as follows:
\[ G_n(A) = \int_A \bigotimes_{i\in\mathscr{L}} K(f_i)^* K(f_i) \prod_{i\in\mathscr{L}} \prod_{k=1}^n d^4x_{i,k}. \tag{218} \]
First, for $A = (M^{\mathscr{L}})^n$, we obtain from $nN$-fold application of Lemma 15 (and Lemma 16) that $G_n(A) = I$. For arbitrary $A \in \mathcal{B}(M)^{\otimes\mathscr{L}\times n}$, the existence and boundedness of $G_n(A)$ follows again from Lemma 1 if only the right-hand side of (218), when sandwiched between $\psi$'s, remains $\le \|\psi\|^2$. Indeed,
\[ \int_A \prod_{i\in\mathscr{L}} \prod_{k=1}^n d^4x_{i,k}\, \langle\psi| \otimes_i K(f_i)^* K(f_i)\, \psi\rangle \tag{219} \]
(because the integrand is nonnegative)
\[ \le \int_{(M^{\mathscr{L}})^n} \prod_{i\in\mathscr{L}} \prod_{k=1}^n d^4x_{i,k}\, \langle\psi| \otimes_i K(f_i)^* K(f_i)\, \psi\rangle = \|\psi\|^2. \tag{220} \]
To see that $G_n(\cdot)$ is σ-additive, just note that $\int_A$ is σ-additive in $A$. Thus, $G_n(\cdot)$ is a POVM. The consistency condition (116) follows from $N$-fold application of Lemma 15 (and Lemma 16), namely to $K_{x_{i,n}}(x_{i,n+1})^* K_{x_{i,n}}(x_{i,n+1})$ for all $i \in \mathscr{L}$. Now Theorem 1 provides a POVM $G(\cdot)$ on $(M^{\mathscr{L}})^{\mathbb{N}} = M^{\mathscr{L}\times\mathbb{N}}$ whose marginals are the $G_n(\cdot)$. To see that $\langle\psi|G(\cdot)\psi\rangle$ is the distribution of an rGRWf process, note that in case $n_i = n$ for all $i \in \mathscr{L}$, (212) means
\[ \mathbb{P}(X_{i,k} \in d^4x_{i,k} : i \in \mathscr{L}, k \le n) = \langle\psi|G_n(df)\psi\rangle, \tag{221} \]
while (212) for unequal $n_i$ follows from the case before by choosing $n$ large enough ($n = \max\{n_i : i \in \mathscr{L}\}$) and applying Lemma 15 to integrate out some of the $x_{i,k}$. The uniqueness of the distribution $\mathbb{P}^\psi$ of the rGRWf process follows from Theorem 1 because (212) implies for the case in which all $n_i = n$ that the joint distribution of the first $n$ flashes of all labels is given by the POVM $G_n(\cdot)$, and then the uniqueness statement of Theorem 1 provides the uniqueness of $\mathbb{P}^\psi$.

4.2. Concrete specification

We now present an outline for defining $\mathscr{H}_\Sigma$, $U_\Sigma$, and $\Lambda_\Sigma(x)$. A rigorous definition will be presented in Sec. 4.3 for Minkowski space-time. Concretely, we intend to take $\mathscr{H}_\Sigma$ to be $L^2(D|_\Sigma)$, the space of square-integrable measurable sections of the vector bundle $D|_\Sigma$ modulo changes on null sets. The vector bundle $D$ is the bundle of Dirac spin spaces [23, 59], a complex bundle of rank 4 over $M$, endowed with a connection (whose curvature arises from the
curvature of M ). We obtain the operators UΣΣ by solving the Dirac equation ie µ −iγ ∇µ − Aµ ψ = mψ, (222) where γ µ are the Dirac matrices, ∇ is the covariant derivative operator, e ∈ R is a constant, the charge parameter, Aµ is a 1-form, the electromagnetic vector potential, and m > 0 is a constant, the mass parameter. We take ΛΣ (x) to be the multiplication operator on L2 (D|Σ ) by a function of the spacelike distance from x along Σ, ΛΣ (x)ψ(y) = λN (y)(distΣ (x, y))ψ(y),
(223)
for all y ∈ Σ, where : [0, ∞) → [0, 1] is a fixed function that we call the profile function, and −1 d3 x (distΣ (x, y)) . (224) N (y) = Σ
The normalizing factor $N$ is chosen so as to ensure (205). As an example, the profile function could be a Gaussian,
\[ u \mapsto \exp\Bigl(-\frac{u^2}{2\sigma^2}\Bigr) \tag{225} \]
with $\sigma$ the same constant as in Sec. 2.1. It is sometimes useful to assume that the profile function has compact support $[0, \sigma]$; then it is clear that $N(y)$ is finite.

From the concrete specification just given, it becomes clear that rGRWf is defined in a covariant way. No coordinate system on $M$ was ever chosen; the unitary evolution from one 3-surface to another is given by the Dirac equation; in contrast to Bohmian mechanics for relativistic space-time (as developed in [29]), no “time foliation” (preferred foliation of $M$ into spacelike 3-surfaces) is assumed or constructed; more generally, no concept of simultaneity-at-a-distance is involved.

The reader should note that this is more than that the Poincaré group (the isometry group of Minkowski space-time) acts on the theory’s solutions. In detail, let us call a theory weakly covariant if the set of possible probability measures on the history of the primitive ontology (PO) is closed under the action of the Poincaré group. Indeed, rGRWf is weakly covariant, but the concept of weak covariance is too weak to capture the idea of a relativistic theory. As a simple example, we can turn non-relativistic classical mechanics (with instantaneous interaction-at-a-distance) into a weakly covariant theory in Minkowski space-time in the following way: (i) postulate the existence of an additional physical object mathematically represented by a timelike vector field $n^\mu$ subject to the field equation $\partial_\nu n^\mu = 0$ (which ensures $n^\mu$ is constant); (ii) the vector field selects a Lorentz frame (whose time axis lies in the direction of $n^\mu$); (iii) in this frame apply the non-relativistic equations. (Since probability plays no role here, we can think of the probability measure as a Dirac measure concentrated on a single history.) This theory is weakly covariant, as the transformed history simply has a different $n^\mu$ vector, even though
in a world governed by this theory the Michelson–Morley experiment has a nonzero result and superluminal communication is possible. (On top of that, the definition of weak covariance is limited to special relativity, and it is not clear how to adapt it to curved space-time.)

Fay Dowker (through personal communication on January 28, 2004) has proposed the following definition for the concept of a covariant law: Suppose a law $L$ is such that for every Cauchy surface $\Sigma$ in $M$ there is a set $\mathscr{I}_\Sigma$ of possible initial data on $\Sigma$, and $L$ associates with every $D \in \mathscr{I}_\Sigma$ a probability measure $\mathbb{P}_D$ on the space of possible histories of the PO in $J^+(\Sigma)$. Now call the law $L$ strongly covariant if for every two Cauchy surfaces $\Sigma_1, \Sigma_2$ with $\Sigma_2 \subseteq J^+(\Sigma_1)$ and every $D_1 \in \mathscr{I}_{\Sigma_1}$ there is a random variable $D_2$ with values in $\mathscr{I}_{\Sigma_2}$ so that the history of the PO in $J^+(\Sigma_2)$ can equally be regarded as generated by the initial datum $D_1$ on $\Sigma_1$ or $D_2$ on $\Sigma_2$; that is, the distribution $\mathbb{P}_{D_2}$ averaged over the distribution of $D_2$ agrees with $\mathbb{P}_{D_1}$ restricted to $J^+(\Sigma_2)$. This definition is fulfilled by the law of rGRWf (when suitably formulated, see [72, 70]), where the initial data on $\Sigma$ are the wave function on $\Sigma$ and the last flash of each label before $\Sigma$. This definition is intended to exclude that a theory presupposes or generates a foliation, or any other notion of simultaneity-at-a-distance.

4.3. Existence theorem in Minkowski space-time

Let $(M, g)$ be Minkowski space-time, $M = \mathbb{R}^4$, $g = \mathrm{diag}(1, -1, -1, -1)$. Then the Dirac bundle is trivial, $D = M \times \mathbb{C}^4$, and its connection is flat, so that we can replace the covariant derivative $\nabla_\mu$ by the partial derivative $\partial_\mu$. For $\Sigma \in \mathcal{C} \cup \mathcal{H}$, $\mathscr{H}_\Sigma := L^2(D|_\Sigma) = L^2(\Sigma, \mathbb{C}^4, h, d^3x)$, which means the space of measurable functions $\psi : \Sigma \to \mathbb{C}^4$ (modulo changes on null sets) that are square-integrable in the sense
\[ \int_\Sigma d^3x\, \psi^*(x) h(x) \psi(x) < \infty, \tag{226} \]
where $h : \Sigma \to \mathbb{C}^{4\times 4}$ is a measurable function into the positive definite Hermitian $4 \times 4$ matrices that we define below. The scalar product in $L^2(\Sigma, \mathbb{C}^4, h, d^3x)$ is
\[ \langle\phi|\psi\rangle_\Sigma = \int_\Sigma d^3x\, \phi^*(x) h(x) \psi(x). \tag{227} \]
It is clear that $\mathscr{H}_\Sigma$ is a Hilbert space. Here,
\[ h(x) = \gamma^0 \gamma^\mu n_\mu(x), \tag{228} \]
where $n_\mu(x)$ is the future-pointing unit normal on $\Sigma$ at $x \in \Sigma$, so normalized that $n^\mu(x) n_\mu(x) = 1$. Since $\Sigma$ is $C^\infty$, so are $n_\mu$ and $x \mapsto h(x)$. It is a known fact that the matrix $\gamma^0 \gamma^\mu n_\mu$ is positive definite for every future-pointing timelike vector $n_\mu$. The scalar product (227) can also be written
\[ \langle\phi|\psi\rangle_\Sigma = \int_\Sigma d^3x\, \bar\phi(x) \gamma^\mu n_\mu(x) \psi(x), \tag{229} \]
where $\bar\phi(x) = \phi^*(x)\gamma^0$ (while $\phi^*$ means component-wise conjugation). It is a known fact that $\phi \mapsto \bar\phi$ is a Lorentz-invariant operation, while $\phi \mapsto \phi^*$ is not. As a consequence, $\mathscr{H}_\Sigma$ and $\langle\cdot|\cdot\rangle_\Sigma$ are defined in a Lorentz-invariant way.

Assumption 11. The profile function, a map $[0, \infty) \to [0, 1]$, is (Borel) measurable, and its integral over $u \in [0, \infty)$ against the weight $e^{\kappa u}$ is strictly positive and finite (230) for every $\kappa > 0$.

This is true, for example, of the Gaussian (225), and when the profile function has compact support (and is not almost-everywhere zero). The operators $\Lambda_\Sigma(x)$ are defined by (223).

Assumption 12. The 1-form $A : \mathbb{R}^4 \to \mathbb{R}^4$ is time-independent in a suitable Lorentz frame, $C^\infty$, and satisfies
\[ \exists\, M, \xi > 0 : \forall\, x \in \mathbb{R}^3,\ \forall\mu : |A_\mu(x)| < M(|x| + 1)^{-4-\xi}. \tag{231} \]
Theorem 7. Under Assumptions 11 and 12 and with $\mathscr{H}_\Sigma$, $U_\Sigma$, and $\Lambda_\Sigma(x)$ as specified above and $\mathscr{L}$ any finite label set, the hypotheses of Theorem 6 are fulfilled. As a consequence, an rGRWf process exists, it is unique in distribution, and the distribution is $\langle\psi|G(\cdot)\psi\rangle$ for a certain POVM $G(\cdot)$.

We conjecture that Assumption 12 is stronger than necessary, in particular that $A_\mu$ does not have to be time-independent.

According to a theorem of Dimock [23], the Dirac equation defines a unitary isomorphism $U_\Sigma^{\Sigma'} : \mathscr{H}_\Sigma \to \mathscr{H}_{\Sigma'}$ for all Cauchy surfaces $\Sigma, \Sigma'$. The evolution from a Cauchy surface to a hyperboloid is provided by the following lemma.

Lemma 17. Under Assumption 12, the Dirac equation (222) defines a unitary isomorphism $U_\Sigma^{\Sigma'} : \mathscr{H}_\Sigma \to \mathscr{H}_{\Sigma'}$ for all $\Sigma, \Sigma' \in \mathcal{C} \cup \mathcal{H}$.

Proof. First note that we assume $m > 0$ in the Dirac equation (with $m = 0$ this proof would not work). Choose a Lorentz frame in which Assumption 12 holds (allowing us to identify $M$ with $\mathbb{R}^4$), and let $\Sigma_0$ be the 3-surface defined by $t = 0$. Since $d^3x$ on $\Sigma_0$ is just the Lebesgue measure and $h = I$, we write $L^2(\mathbb{R}^3, \mathbb{C}^4)$ instead of $L^2(\Sigma_0, \mathbb{C}^4, h, d^3x)$. It suffices to define $U_{\Sigma_0}^{\Sigma}$ for all $\Sigma \in \mathcal{H}$. We define the $U$ operator first on a dense subspace $S$ of $L^2(\mathbb{R}^3, \mathbb{C}^4)$, then show that it is bounded and take its bounded extension on all of $L^2(\mathbb{R}^3, \mathbb{C}^4)$; we leave $S$ to be chosen later but assume $S \subseteq C^\infty(\mathbb{R}^3, \mathbb{C}^4)$. We define the $U$ operators by solving the Dirac equation for $\psi_0 \in S$ to obtain $\psi : \mathbb{R}^4 \to \mathbb{C}^4$ on space-time and then restricting $\psi$ to $\Sigma$. By a result of Chernoff [18], for $C^\infty$ time-independent $A_\mu$, the Dirac Hamiltonian is essentially self-adjoint on $C_0^\infty(\mathbb{R}^3, \mathbb{C}^4)$ (i.e. compactly supported functions), so that there is no ambiguity about $H$, and $\psi \in C^\infty(\mathbb{R}^4, \mathbb{C}^4)$ if $\psi_0 \in C^\infty(\mathbb{R}^3, \mathbb{C}^4) \cap L^2(\mathbb{R}^3, \mathbb{C}^4)$. As a consequence, no ambiguity about changing
$\psi$ on null sets arises, and $\psi_\Sigma := \psi|_\Sigma$ is well defined. By the linearity of the Dirac equation, $\psi_0 \mapsto \psi_\Sigma$ is linear. We write $\|\psi_\Sigma\|$ for $\langle\psi_\Sigma|\psi_\Sigma\rangle_\Sigma^{1/2}$. We now show that
\[ \|\psi_\Sigma\| \le \|\psi_0\| \quad\text{for } \Sigma \in \mathcal{H}. \tag{232} \]
Without loss of generality, we assume $\|\psi_0\| = 1$. Define the probability current vector field$^e$ $j : \mathbb{R}^4 \to \mathbb{R}^4$ by
\[ j^\mu = \bar\psi \gamma^\mu \psi \tag{233} \]
and note that, for any spacelike 3-surface $\Sigma$,
\[ \|\psi\|_\Sigma^2 = \int_\Sigma d^3x\, j^\mu(x) n_\mu(x) \tag{234} \]
is the flux of $j$ across $\Sigma$. Some well-known properties of $j$: (i) Since $h$ in (227) is positive definite, $j$ has positive Lorentzian scalar product $j^\mu n_\mu$ with every future-pointing timelike $n_\mu$, and thus $j$ is future-pointing causal. (ii) $j$ is divergence free, i.e. the continuity equation
\[ \partial_\mu j^\mu = 0 \tag{235} \]
holds as a consequence of the Dirac equation. (iii) Since $j^\mu(x) n_\mu(x) \ge 0$,
\[ \nu(d^3x) = j^\mu(x) n_\mu(x)\, d^3x \tag{236} \]
is a σ-finite measure on $\Sigma$.

We now use Bohmian mechanics as a mathematical tool. According to this theory, electrons have world lines. In the single-particle version of this theory, with the electron there is associated a wave function $\psi$ of norm 1 obeying the Dirac equation, and the electron’s position $Q(t) \in \mathbb{R}^3$ at time $t \in \mathbb{R}$ evolves according to
\[ \frac{dQ}{dt} = \frac{(j^1, j^2, j^3)}{j^0}(t, Q(t)) \tag{237} \]
with $j$ given by (233). Equivalently, the world line $L = \{(t, Q(t)) : t\}$ is an integral curve of the vector field $j$. The initial position $Q(0)$ is chosen at random with probability distribution $|\psi_0|^2 d^3x$, thus defining a probability distribution $\mathbb{P}$ for the random path $L$. More precisely, $L = \{(t, Q(t)) : t \in (\tau_-, \tau_+)\}$, with $(\tau_-, \tau_+)$ the maximal domain of definition of the solution to (237); in case $\tau_- = -\infty$ and $\tau_+ = +\infty$ we say that $L$ exists globally in time; $\tau_\pm$ can be finite if the world line runs into a zero of $j$.

$^e$ This is standard terminology. In rGRWf, of course, it does not signify the flow of probability. In Bohmian mechanics it does.
We use the following fact [67]: Given a future-pointing causal $C^\infty$ divergence-free vector field $j$ on $\mathbb{R}^4$ whose flux across $\Sigma_0 = \{0\} \times \mathbb{R}^3$ is 1 and a spacelike 3-surface $\Sigma$, then for every measurable $A \subseteq \Sigma$,
\[ \mathbb{P}(L \cap A \ne \emptyset) = \nu(A \setminus B_0), \tag{238} \]
where $L$ is the random Bohmian trajectory, i.e. integral curve of $j$, whose initial point has distribution $|\psi_0|^2 d^3x$ on $\Sigma_0$; $\mathbb{P}(L \cap A \ne \emptyset)$ is the probability of the Bohmian trajectory intersecting $A$; $\nu$ is given by (236); and $B_0$ is the set of points $x \in \Sigma$ with $\psi(x) \ne 0$ which do not lie on any Bohmian trajectory starting on $\Sigma_0$. Note that the hypotheses on $j$ are fulfilled in our case; note also that the Bohmian trajectories are causal since $j$ is, and thus a spacelike 3-surface intersects each Bohmian trajectory at most once.

We thus obtain a stochastic interpretation of the flux across $A$ as the probability of the random curve $L$ intersecting $A$, but we need to get control of the set $B_0$. To this end, we use the global existence theorem of Teufel and myself [66] for Bohmian trajectories, which implies the following: Given an electromagnetic potential $A_\mu$ on $\mathbb{R}^4$ that is time-independent and $C^\infty$, and an initial wave function $\psi_0 \in L^2(\mathbb{R}^3, \mathbb{C}^4) \cap C^\infty(\mathbb{R}^3, \mathbb{C}^4)$, then almost all Bohmian trajectories exist for all times, where “almost all” refers to the $|\psi_0|^2$ distribution over the initial point on $\{0\} \times \mathbb{R}^3$. From this we get control of $B_0 \subseteq \Sigma \in \mathcal{C} \cup \mathcal{H}$, namely
\[ \nu(B_0) = 0. \tag{239} \]
For $\Sigma$ of the form $\Sigma_t := \{t\} \times \mathbb{R}^3$, this would be immediate from the global existence theorem, applied to $\Sigma_t$ as the initial time, noting that the $|\psi_t|^2$ distribution coincides with $\nu$. For $\Sigma \in \mathcal{H}$, we have to do some work: Let $B_t$ be the set of points $x \in \Sigma$ such that $\psi(x) \ne 0$ (so that there exists a trajectory through $x$) and the trajectory through $x$ does not exist at time $t$, i.e. does not intersect $\Sigma_t := \{t\} \times \mathbb{R}^3$. Now we ask for the probability that the random trajectory $L$ starting on $\Sigma_0$ intersects $\Sigma$ in a point $x \in B_t$. In that event, $L$ has to coincide with the unique trajectory through $x$, which does not intersect $\Sigma_t$ and thus does not exist globally. By the global existence theorem, this probability is zero:
\[ 0 = \mathbb{P}(L \cap B_t \ne \emptyset) = \nu(B_t \setminus B_0), \tag{240} \]
where the second equality is (238). Now choose an arbitrary measurable set $A \subseteq \Sigma$ with $\nu(A) < \infty$ and consider $A \cap B_{t_1} \cap \cdots \cap B_{t_m}$ instead of $B_t$ and observe that
\[ \nu\bigl((A \cap B_{t_1} \cap \cdots \cap B_{t_m}) \setminus B_0\bigr) \le \nu(B_{t_1} \setminus B_0) = 0. \tag{241} \]
Put differently,
\[ \nu(A \cap B_{t_1} \cap \cdots \cap B_{t_m}) = \nu(A \cap B_{t_1} \cap \cdots \cap B_{t_m} \cap B_0). \tag{242} \]
By the same argument for any time $t_{m+1}$ instead of 0,
\[ \nu(A \cap B_{t_1} \cap \cdots \cap B_{t_m}) = \nu(A \cap B_{t_1} \cap \cdots \cap B_{t_m} \cap B_{t_{m+1}}). \tag{243} \]
Setting $t_1 = 0$ and by induction along $m \in \mathbb{N}$,
$$\nu(A \cap B_0) = \nu(A \cap B_0 \cap B_{t_1} \cap \cdots \cap B_{t_m}). \tag{244}$$
Now consider an infinite sequence $(t_m)$ that is dense in $\mathbb{R}$ (say, an enumeration of $\mathbb{Q}$); then
$$\nu(A \cap B_0) = \lim_{m \to \infty} \nu(A \cap B_0 \cap B_{t_1} \cap \cdots \cap B_{t_m}) = \nu\Bigl(A \cap B_0 \cap \bigcap_{m=1}^{\infty} B_{t_m}\Bigr) = 0 \tag{245}$$
because $\bigcap_m B_{t_m} = \emptyset$, as every trajectory exists for some time interval of positive length (and thus, e.g., at some rational time). Since $A$ was arbitrary with finite measure, $\nu(B_0) = 0$, which is what we claimed in (239). As a consequence of (239), we have from (238) that
$$1 \ge \mathbb{P}(L \cap \Sigma \neq \emptyset) = \nu(\Sigma) = \|\psi\|_\Sigma^2, \tag{246}$$
which shows (232).

Now we show that $\|\psi\|_\Sigma = \|\psi_0\|$ for $\Sigma \in H$. For this we use the flux-across-surfaces theorem of Dürr and Pickl [32], which implies the following: Under Assumption 12, for all $\psi_0$ with $\|\psi_0\| = 1$ from a suitable dense subspace $S$ of $L^2(\mathbb{R}^3, \mathbb{C}^4)$ with $S \subseteq C^\infty(\mathbb{R}^3, \mathbb{C}^4)$ it is true that
$$\lim_{s \to \infty} \int_{H_s(0)} d^3x\, j^\mu n_\mu = 1. \tag{247}$$
This fixes the subspace $S$ (and this is where the condition (231) enters). We want to show that $P_s := \|\psi\|^2_{H_s(0)} = 1$ for every $s > 0$ and $\psi_0 \in S$ with $\|\psi_0\| = 1$. This quantity is the probability that the random Bohmian trajectory $L$ intersects $H_s(0)$. In particular, it is decreasing in $s$,
$$P_{s_1} \ge P_{s_2} \quad \text{if } s_1 \le s_2. \tag{248}$$
Now, according to (247), $P_s \to 1$ as $s \to \infty$ while $P_s \le 1$, and thus $P_s = 1$ for all $s > 0$.

What we have obtained is that, for any $\Sigma \in H$, $U := U^{\Sigma}_{\Sigma_0} : S \to \mathscr{H}_\Sigma$ is norm-preserving. It is therefore bounded and possesses a unique bounded extension $\tilde{U}$ to all of $\mathscr{H}_{\Sigma_0} = L^2(\mathbb{R}^3, \mathbb{C}^4)$. To see that $\tilde{U}$ is norm-preserving, too, consider a convergent sequence $\psi_n \to \psi$ with $\psi_n \in S$ and note that
$$\|\tilde{U}\psi\|_\Sigma = \Bigl\|\lim_{n \to \infty} U\psi_n\Bigr\|_\Sigma = \lim_{n \to \infty} \|U\psi_n\|_\Sigma = \lim_{n \to \infty} \|\psi_n\| = \Bigl\|\lim_{n \to \infty} \psi_n\Bigr\| = \|\psi\|. \tag{249}$$
In the following we write $U^{\Sigma}_{\Sigma_0}$ for $\tilde{U}$.

Now we show that $U := U^{\Sigma}_{\Sigma_0}$ is onto. We first observe that the range of $U$ is a closed subspace because if, in $\mathscr{H}_\Sigma$, $\phi_n \to \phi$ and $\phi_n = U\psi_n$ then $(\phi_n)$ is a
Cauchy sequence, and thus so is (ψn ), and thus (ψn ) converges, and U limn ψn = limn U ψn = limn φn = φ, so that φ lies in the range. It remains to show that the range of U is dense in HΣ : The range of U contains C0∞ (Σ, C4 ) (i.e. compactly supported) because for such a ψΣ there exists a Cauchy surface Σ that has the support of ψΣ in common with Σ. By Dimock’s existence theorem, there is a unique ψ : R4 → C4 , solving the Dirac equation, whose restriction to Σ , and thus to Σ, is ψΣ . Set ψ0 to be the restriction of ψ to Σ0 . Proof of Theorem 7. To begin with, Assumption 9 is satisfied in Minkowski spacetime, and the UΣ operators are provided by Lemma 17. Now we show that the quantity N (y) is always well defined by (224), which means that Σ d3 x(distΣ (x, y)) is finite and nonzero: It could only be zero if were zero almost everywhere, which is excluded by the positivity in (230). To check that it is finite, we only need check that it is finite for x = 0 and y = (s, 0, 0, 0) since there is an isometry of Minkowski space carrying Hs (x ) is into Hs (0) and y into (s, 0, 0, 0). In particular, N (y) is actually independent of y (in Minkowski space-time!). Now we calculate d3 x ◦ distΣ (x, (s, 0, 0, 0)) (250) Σ=Hs (0)
(by (202)) √ 1 2 3 2 2 1 2 3 ◦ distΣ (( s + r , x , x , x ), (s, 0, 0, 0)) = dx dx dx 1 + r2 /s2 R3 (s sinh−1 (r/s)) = dx1 dx2 dx3 1 + r2 /s2 R3 (where sinh−1 means the inverse function of sinh) ∞ 4πr2 = dr (s sinh−1 (r/s)) 1 + r2 /s2 0 (substituting r = s sinh(u/s) so that du = dr/ 1 + r2 /s2 ) ∞ = 4πs2 du (u) sinh(u/s)2 .
(251) (252)
(253)
(254)
0
Since is a bounded function, what is relevant for finiteness of this integral is the asymptotics for u → ∞, where sinh ∼ 12 exp and thus sinh(u/s)2 ∼ 14 exp(2u/s). Thus, the finiteness in (230) is (necessary and) sufficient for the finiteness of this integral for every s > 0. The operators ΛΣ (x), defined by (223), are weakly measurable as a function of x ∈ Σ = Hs (x ) whenever (x, y) → ◦ dist(x, y) is measurable. This is satisfied since for future hyperboloids in Minkowski space-time, (x, y) → dist(x, y) is a measurable (even C ∞ ) function, : [0, ∞) → [0, 1] is measurable by Assumption 11, and N (y) is actually independent of y.
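The change of variables $r = s\sinh(u/s)$ used in the computation above can be checked numerically. The following is only a sketch under stated assumptions: the name `g` and its Gaussian shape are hypothetical stand-ins for the profile function (the text only requires a bounded measurable profile with the decay demanded by (230)), and the quadrature routine is a plain composite Simpson rule.

```python
import math

def g(u, sigma=1.0):
    # Hypothetical Gaussian profile (an assumption made for illustration only).
    return math.exp(-u * u / (2.0 * sigma * sigma))

def simpson(f, a, b, n=20000):
    # Composite Simpson rule on [a, b]; n must be even.
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def radial_integral(s, u_max):
    # Radial form: integrate 4*pi*r^2 g(s*asinh(r/s)) / sqrt(1 + r^2/s^2)
    # over r from 0 to r_max = s*sinh(u_max/s).
    r_max = s * math.sinh(u_max / s)

    def integrand(r):
        weight = 4.0 * math.pi * r * r / math.sqrt(1.0 + (r / s) ** 2)
        return weight * g(s * math.asinh(r / s))

    return simpson(integrand, 0.0, r_max)

def rapidity_integral(s, u_max):
    # Form after substituting r = s*sinh(u/s): 4*pi*s^2 * int g(u) sinh(u/s)^2 du.
    def integrand(u):
        return 4.0 * math.pi * s * s * g(u) * math.sinh(u / s) ** 2

    return simpson(integrand, 0.0, u_max)

if __name__ == "__main__":
    s, u_max = 2.0, 12.0
    print(radial_integral(s, u_max), rapidity_integral(s, u_max))
```

For this particular Gaussian choice the integrand in $u$ grows like $\exp(2u/s)$ times a Gaussian, so the integral is finite for every $s > 0$, in line with the role of condition (230).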
To check (205) for Σ = Hs (x ) and arbitrary ψ ∈ HΣ , d3 xψ|ΛΣ (x)ψ Σ
µ ¯ d3 y ψ(y)γ nµ (y)λN (y)(distΣ (x, y))ψ(y)
d3 x
= Σ
(255) (256)
Σ
(we can reorder the integrals because the integrand is nonnegative) µ ¯ = λ d3 y ψ(y)γ nµ (y)ψ(y)N (y) d3 x (distΣ (x, y)) (257)
Σ
Σ µ ¯ d3 y ψ(y)γ nµ (y)ψ(y) = λψ|ψ.
= λ
(258)
Σ
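We now show the measurability of (206). To this end, we define, for every hyperboloid $H_s(x)$, a diffeomorphism $\varphi_{s,x} : H_s(x) \to \mathbb{R}^3$ by $\varphi_{s,x}(y) = (y^1 - x^1, y^2 - x^2, y^3 - x^3)$. This induces a linear mapping $M_{s,x} : L^2(\mathbb{R}^3, \mathbb{C}^4) \to \mathscr{H}_{H_s(x)}$ defined by $M_{s,x}\psi(y) = \psi(\varphi_{s,x}(y))$; $M_{s,x}\psi$ is square-integrable because
$$\|M_{s,x}\psi\|^2_{H_s(x)} = \int_{H_s(x)} d^3y\, (M_{s,x}\psi)^*(y)\, \gamma^0\gamma^\mu n_\mu(y)\, (M_{s,x}\psi)(y)$$
$$= \int_{\mathbb{R}^3} d^3v\, \psi^*(v)\, \gamma^0\gamma^\mu \bigl(1, v/\sqrt{s^2 + v^2}\bigr)_\mu\, \psi(v) \tag{259}$$
$$\le \int_{\mathbb{R}^3} d^3v\, |\psi(v)|^2 \sum_{\mu=0}^{3} \|\gamma^0\gamma^\mu\|_{\mathbb{C}^4} < \infty, \tag{260}$$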
−1 which indeed implies Ms,x ≤ ( µ γ 0 γ µ )1/2 . Similarly, Ms,x ψ(v) = ψ(ϕ−1 s,x (v)) −1 is a bounded operator. We check that (x, x ) → Mτ (x,x ),x ΛH(x,x ) (x)Mτ (x,x ),x is weakly measurable: ψ|Mτ−1 (x,x),x ΛH(x,x ) (x)Mτ (x,x ),x ψ = d3 v ψ ∗ (v)ψ(v)λN (dist(x, ϕτ (x,x ),x (v))
(261) (262)
R3
which is measurable since the integrand is measurable in (x, x , v). It remains to H(x,x ) show that (x, x ) → Mτ−1 is weakly measurable. By a translation (x,x),x UΣ0 H (0)
−1 x → 0, it suffices to show that s → ψ0 |Ms,0 UΣ0s ψ0 is measurable for all ψ0 ∈ L2 (R3 , C4 ), which follows (since the operators are bounded) from the fact −1 Hs (0) 3 UΣ0 ψ0 (v) = ψ(ϕ−1 that s → Ms,0 s,0 (v)) is continuous for all v ∈ R and ψ0 ∈ S, 4 4 ∞ as then ψ : R → C is C . Thus, Assumption 10 is fulfilled, too, and Theorem 6 applies.
5. Outlook

5.1. Nonlocality

Locality means that if two space-time regions A and B are spacelike separated then events in A cannot influence those in B or vice versa. Let me point out why rGRWf is a nonlocal theory. rGRWf specifies the joint distribution of flashes, some of which may occur in A and some in B. The distribution of those in A, i.e. of how many flashes occur in A and at which space-time points, is in general not independent of the flashes in B (except in case the initial state vector factorizes):
$$\mathbb{P}(F \cap A \in \cdot \mid F \cap B) \neq \mathbb{P}(F \cap A \in \cdot). \tag{263}$$
But this is not yet an influence between B and A: correlation is not causation. After all, the flashes in A and those in B may be correlated because of a common cause in the past. Taking this into account, the criterion for the absence of an influence between A and B is that $F \cap A$ and $F \cap B$ are conditionally independent, given the history of their common past. And also this can fail in rGRWf:
$$\mathbb{P}\bigl(F \cap A \in \cdot \mid F \cap B,\, F \cap J^-(A) \cap J^-(B)\bigr) \neq \mathbb{P}\bigl(F \cap A \in \cdot \mid F \cap J^-(A) \cap J^-(B)\bigr). \tag{264}$$
Thus, rGRWf is nonlocal.

The nonlocality of rGRWf should be seen in connection with Bell's famous nonlocality argument [6, 10], according to which the laws of our universe must be nonlocal. The argument shows that every local theory entails that the predicted probabilities for certain experiments satisfy Bell's inequality, which however is violated according to the quantum formalism and in experiment (and according to rGRWf). Many authors, beginning with Einstein, Podolsky and Rosen [34], have expressed the view that locality follows from relativistic covariance. This view seems dubious given Bell's result that locality is wrong while relativity has been extraordinarily successful. More detailed arguments to the effect that nonlocality does not contradict relativity (or, in other words, that the concept of locality is not equivalent to that of relativistic covariance) have been given in [50, 44]. The strongest argument to this effect is, however, the existence of rGRWf, a nonlocal theory that is convincingly covariant. Indeed, the biggest hurdle on the way to a relativistic quantum theory without observer was to find a theory that is nonlocal yet covariant. Thus, this is perhaps the most remarkable aspect of rGRWf.

So how does rGRWf accomplish this feat? How does it reconcile relativity and nonlocality?
We think that the following point, which we have first described in [72], is crucial: If space-time regions A and B are spacelike separated, then nonlocality means that events in A can influence those in B or vice versa. Of course, an influence from A to B would mean an influence
to the past in some Lorentz frames. In rGRWf, however, the words "or vice versa" are important, as in rGRWf there is no objective fact about whether the influence took place from A to B or from B to A. The rGRWf laws simply prescribe the joint distribution of flashes in A and B, but do not say that nature made the first random decision in A, which then influenced the flashes in B. There is no need for rGRWf to specify in which order to make random decisions. One can say that the direction of the influence depends on the chosen Lorentz frame. In a frame in which A is earlier than B one would conclude that the flashes in A have influenced those in B, while in a frame in which B is earlier than A one would conclude the opposite. The following simple illustration of how an influence can fail to have a direction is due to Conway and Kochen [20].

Example 5. Consider a discrete space-time $M$ as depicted in Fig. 3, which can be thought of as a subset of 1 + 1-dimensional Minkowski space. In terms of a suitable time coordinate function $T$, all space-time points have positive integer values of $T$, and at time $T$ there exist $T$ space points. The PO is a field $\phi : M \to \{0, 1\}$ subject to two laws: (i) If $x$ is any point in $M$ and $y, z$ its two neighbors in the future then $\phi(x) + \phi(y) + \phi(z) \in \{0, 2\}$. (ii) Given all values of $\phi$ up to time $T$, the random event $\phi(x) = 1$ has conditional probability 1/2 for any point $x$ with $T(x) > T$. Let us generate a random space-time history according to these laws. On the one point $x$ with $T(x) = 1$ we choose $\phi(x)$ at random according to (ii), with probability 1/2 for $\phi(x) = 1$. Then we can choose, for the left point $y$ with $T(y) = 2$, the value $\phi(y)$, again with probability 1/2 for $\phi(y) = 1$. Then, by (i), for the right point $z$ with $T(z) = 2$, the value $\phi(z)$ is determined by $\phi(x)$ and $\phi(y)$. Similarly, if we have chosen all $\phi$ values up to time $T$ then any single $\phi$ value in the row $T + 1$ will determine all the other values in this row.
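The row-by-row generation described in Example 5 can be sketched in a few lines. This is an illustrative sketch, not code from [20]: the list representation of a row, the indexing convention (point `i` of one row has future neighbors `i` and `i + 1` in the next row), and the names `next_row`, `satisfies_law_i` and `simulate` are all assumptions made here. Law (i) says the three values sum to an even number, so each value in the new row is the XOR of its neighbor in that row and the point they share below.

```python
import random

def next_row(old, anchor, value):
    # Fill row T+1 from row T once the value at position `anchor` is chosen.
    # Law (i): old[i] + new[i] + new[i+1] is 0 or 2, i.e. new[i] ^ new[i+1] == old[i].
    n = len(old) + 1
    new = [None] * n
    new[anchor] = value
    for i in range(anchor, n - 1):      # propagate to the right
        new[i + 1] = new[i] ^ old[i]
    for i in range(anchor, 0, -1):      # propagate to the left
        new[i - 1] = new[i] ^ old[i - 1]
    return new

def satisfies_law_i(old, new):
    # Check law (i) for every point of the old row and its two future neighbors.
    return all(old[i] + new[i] + new[i + 1] in (0, 2) for i in range(len(old)))

def simulate(t_max, rng, anchor_rule):
    # Law (ii): each freely chosen value is 1 with probability 1/2.
    # anchor_rule(n) picks which of the n points of the new row gets the coin toss.
    rows = [[rng.randint(0, 1)]]
    while len(rows) < t_max:
        old = rows[-1]
        anchor = anchor_rule(len(old) + 1)
        rows.append(next_row(old, anchor, rng.randint(0, 1)))
    return rows

if __name__ == "__main__":
    for row in simulate(5, random.Random(), lambda n: 0):
        print(row)
```

In this sketch the choice of `anchor` changes only how the row is computed, not which rows are possible: for a given previous row there are exactly two admissible next rows, each reached with probability 1/2 whichever position receives the coin toss.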
This model world is not meant to be relativistic, but it illustrates influences without direction: Suppose we simulate the model one time step after another, and suppose we have filled in the φ values up to time T . Let x be the leftmost point at time T + 1, and y the rightmost one. Now we may throw a coin to choose φ(x), and then compute all the other φ values in that row. Or we may throw a coin for
Fig. 3. The discrete space-time considered in the text, and the T function on it (rows labeled T = 1 to T = 4). The bullets symbolize the space-time points, while the lines have no physical meaning and serve only for indicating how to continue the figure to infinity.
φ(y) and compute φ(x) from that. In one case there was an influence from x to y, in the other from y to x. But there is no objective direction of the influence in the model world. The theory specifies no such direction, and there is no need to specify it. For a physical theory it suffices to specify the joint probability distribution of the history of the PO. The direction of the influence lies only in the way we choose to look at, or simulate, the model world, like a choice of gauge or a choice of coordinates; it represents no objective fact in the world. The situation is the same as any other situation of simulating two dependent random variables X, Y with known joint distribution: One could first simulate X according to its known marginal distribution and then Y according to its known conditional distribution given X, or vice versa, and none of these two orderings is more correct than the other. 5.2. Other approaches to relativistic collapse theories In this subsection, we mention the approaches to relativistic collapse theories other than rGRWf in the literature, and describe the differences. A crucial part of the problem of specifying a relativistic collapse theory is to specify a law for the primitive ontology. The need for a clear specification of the primitive ontology has often not been sufficiently appreciated in the literature. Many authors have focused on the problem of specifying a Lorentz-invariant law that associates with every spacelike 3-surface Σ in space-time a wave function ψΣ , in such a way that macroscopic superpositions collapse appropriately (e.g., [41, 56, 38, 57, 53]). But such a law is only half of what is needed for a relativistic collapse theory: the other half concerns the primitive ontology. Dowker and Henson [28] describe a collapse model on a lattice space-time Z2 in 1 + 1 dimension. This model has many traits in common with rGRWf (except that rGRWf lives on manifolds). 
In particular, it is relativistic in the appropriate lattice sense, and it defines a primitive ontology consisting of field values at the lattice sites (a primitive ontology not among the examples listed in Sec. 1.2). In contrast to rGRWf, this model incorporates interaction while rGRWf assumes non-interacting “particles” (of course, there are no particles in this theory, just flashes). An important future goal for rGRWf is the development of a version with interaction. Hellwig and Kraus [45] worry about the relativistic invariance of wave function collapse in ordinary quantum mechanics and propose that wave functions collapse along the past light cone of the space-time point at which a measurement takes place. They assume as given the space-time points X1 , . . . , Xn at which measurements take place (some of which may be spacelike separated) and the observables O1 , . . . , On ∈ B(H ) measured there with results R1 , . . . , Rn ∈ R and associate with every x ∈ M a collapsed state vector ψx ∈ H . In detail, they assume the Heisenberg picture in which the unitary evolution of the state vector disappears; let Pk , for k = 1, . . . , n, be the projection to the eigenspace of Ok with eigenvalue
$R_k$ and set
$$\psi_x = \frac{\prod_{k : X_k \notin J^+(x)} P_k\, \psi}{\Bigl\|\prod_{k : X_k \notin J^+(x)} P_k\, \psi\Bigr\|} \in \mathscr{H}, \tag{265}$$
where $\psi$ is the initial state vector, an empty product is understood as the identity operator, and the ordering in the product is such that whenever $X_k \in J^+(X_\ell)$ then $P_k$ stands to the left of $P_\ell$. It is assumed that for spacelike separated $X_k$ and $X_\ell$, $O_k$ commutes with $O_\ell$, and thus $P_k$ with $P_\ell$. [We mention that in [45], the term Tr(QPW) in equations (3)–(5) should read Tr(QPWP).] This rule involves a kind of retrocausation, as the decision, made by an observer at $X_k$, about which $O_k$ to measure influences the reality in the past, more precisely at those points $x$ that are spacelike separated from $X_k$ and that therefore are earlier than $X_k$ in some inertial frames.

Even more problematic is that the use of the proposal of Hellwig and Kraus remains unclear, for two reasons. First, in ordinary quantum mechanics the formalism is usually supposed to specify the joint probability distribution of the results $R_k$, which follows from the conventional quantum formalism (with instantaneous collapse at every measurement):
$$\mathbb{P}(R_1 = r_1, \ldots, R_n = r_n) = \Bigl\|\prod_{k=1}^{n} P_k\, \psi\Bigr\|^2 \tag{266}$$
with $P_k$ the projection to the eigenspace with eigenvalue $r_k$, and the ordering of the factors in the product as before (whenever $X_k \in J^+(X_\ell)$ then $P_k$ is left of $P_\ell$, while for spacelike separated $X_k$ and $X_\ell$, $P_k$ commutes with $P_\ell$). Formula (266) is manifestly Lorentz invariant, and since the measurement results constitute (in a vague and imprecise way) the primitive ontology of ordinary quantum mechanics it suffices that their distribution be specified by the laws of the theory in a Lorentz-invariant manner, making a rule like (265) irrelevant.

Second, instead of defining a state vector $\psi_x$ for every space-time point $x$ it seems more natural to define a state vector $\psi_\Sigma$ for every spacelike 3-surface $\Sigma$ (even for a single particle in the presence of collapses, be they due to flashes or to measurements). Indeed, such is the case in rGRWf (and in the model of Dowker and Henson [28]), so it certainly does not conflict with relativistic invariance (as Hellwig and Kraus seem to think). The notion of a state vector $\psi_\Sigma$ for every surface $\Sigma$ is, of course, much older; it is used by Tomonaga and Schwinger in the 1940s, and implicit in the derivation of (266). If $\psi$ is admitted to depend on $\Sigma$ then the apparent conflict between instantaneous collapse and relativity evaporates: it is then completely consistent that $\psi$ collapses instantaneously (on all of 3-space) in every Lorentz frame because the collapse is associated with some space-time point
X, and ψΣ is a collapsed state vector on every spacelike 3-surface Σ with X ∈ J − (Σ) but uncollapsed on every Σ with X ∈ J + (Σ). In contrast, for the primitive ontology at a space-time point x it would not make sense to depend on a 3-surface Σ. Dove and Squires [27, 25] essentially reiterate the ideas of Hellwig and Kraus in the context of a GRW theory with flash ontology. They propose a Lorentz-invariant rule for collapsing the wave function given the flashes, but no law for the flashes given the initial wave function. That is, what they provide is, at best, a part of a collapse theory. Furthermore, their proposal is based on the misconception that they have to define the value ψ(x) of the wave function for every space-time point x (if the system consists of a single particle, N = 1). This was already discussed above in the context of Hellwig and Kraus’s proposal. Blanchard and Jadczyk [16] start from the consideration of a system of quantum particles continuously observed by detectors of limited efficiency, which manage only every now and then to detect a particle. This consideration is related to GRW theory as the detection events are points in space-time, and are reasonably modeled in a stochastic way by a point process in space-time whose distribution may coincide with that of a GRWf process. To obtain a relativistic version of this model, one might try to analyze the behavior of detectors consisting of relativistic particles, but Blanchard and Jadczyk instead try to guess relativistic equations. What they guess is not related to rGRWf, and in fact does not answer the question of the probability distribution of the detection events. They consider a wave function Ψτ on space-time that, instead of being a solution to the Dirac equation, evolves. 
That is, the wave function is not a function on space-time but a one-parameter family of functions on space-time, where the parameter τ is a pseudo-time, anyway a fifth coordinate (in addition to the four space-time coordinates). I do not see why a theory based on such a wave function should lead to any predictions related to those of quantum mechanics. In Blanchard and Jadczyk’s model of detection, they propose a stochastic rule for a random τ value associated with the detection event, but no rule for a random space-time point. Moreover, this rule is not Lorentz invariant but assumes a preferred frame, which they call the rest frame of the detector. That may seem natural when modeling a detector, but it would not be admissible for a relativistic theory of flashes. Ruschhaupt [64] continues where Blanchard and Jadczyk have stopped. His contribution is to associate a space-time point with the detection event as follows: he assumes that a world line s → x(s) of the detector is given, parametrized with proper time, and when Blanchard and Jadczyk’s rule generates a random value τ of the pseudo-time, Ruschhaupt inserts this value into x(·) to obtain a random space-time point x(τ ). Since the world line x(·) is given, this model, unlike rGRWf, does not qualify as a fundamental theory. On top of that, there is no reason why the predictions of this model should be related to those of quantum mechanics.
Conway and Kochen [20] claim to have shown that relativistic GRW theories are impossible. rGRWf is a counterexample to their claim; the model of Dowker and Henson [28] is another counterexample. A detailed evaluation of their arguments is given in [72]; see [4] for a further critique, and [21] for Conway and Kochen’s reply to [4] and [72]. Here is a summary of [72]: Conway and Kochen claim that the impossibility of relativistic GRW theories is a corollary of a physical statement they derive in [20] and call the “free will theorem”; it is intended to exclude deterministic theories of quantum mechanics. The proof of the free will theorem contains a logical gap in the sense that it uses a hypothesis that is stronger than formulated in the statement of the “theorem.” The weaker version of the hypothesis (“FIN” or “effective locality”) is, in fact, fulfilled by rGRWf, while the stronger one is violated. The stronger version is equivalent to locality (in the sense of Einstein, Podolsky, Rosen and Bell [10], and in the sense of Sec. 5.1 above), which was shown by Bell in 1964 [6] to conflict with certain probability distributions predicted by quantum mechanics and afterwards confirmed in experiment. Thus, EPRB locality is wrong in our world, making a theorem assuming it useless. (However, the Conway–Kochen proof could be turned around into a disproof of EPRB locality, assuming determinism [4].) Moreover, Conway and Kochen’s argument from the free will theorem to the impossibility of relativistic GRW theories supposes that every stochastic theory is equivalent to a deterministic one (by making all random decisions at the initial time), which in this case is incorrect in a relevant way because the probability distribution in rGRWf depends on the external field Aµ , which observers are free to influence at later times. 5.3. The value of a precise definition In the introduction we mentioned that the GRW theory provides a precise definition of quantum mechanics. 
As always with precise definitions, it is easy to find many physicists who will honestly declare that they do not need such a definition for their work. So an example will be given of what such a definition is good for. The example consists of a simple physical statement that one would like to prove, and a simple proof based on GRW theory (with flash ontology) as a precisely defined theory. (By the way, this simple proof appears here for the first time in print.) However, from the rules of ordinary quantum mechanics it is impossible to get anywhere near a proof. The statement is this:

For every conceivable experiment that one could carry out on a physical system there is a POVM $E(\cdot)$ so that the probability distribution of the result $R$ is $\langle\psi|E(\cdot)\psi\rangle$, where $\psi$ is the system's wave function. (267)

Below we show that this is true in a (hypothetical) world governed by GRWf, for any choice of Hamiltonian and flash rate operators (while $E(\cdot)$ depends on this choice, of course); we will translate the physical statement (267) into a mathematical one and give a proof.
What is the status of (267) in ordinary quantum mechanics? There, one introduces as an axiom (rather than theorem) that observables correspond to self-adjoint operators, and specifies the distribution of the result if an observable is measured, and a formula for the subsequent collapse of the wave function. But it is well known that not every conceivable experiment is the measurement of an observable: Self-adjoint operators correspond to projection-valued measures (PVMs), which are POVMs $P(\cdot)$ such that $P(A)$ is a projection for every measurable set $A$; it is easy to name experiments whose POVMs $E(\cdot)$ are not PVMs, for example a cascade of several measurements corresponding to non-commuting operators, or a "time-of-arrival measurement" observing the time a detector clicks. Thus, the usual axioms of quantum mechanics do not exhaust all conceivable experiments. One is tempted to introduce (267) as a further axiom.

Let us return to GRWf theories. To translate (267) into a mathematical statement, we note that the result of an experiment will be read off from the arrangement of matter in space and time, that is, from the primitive ontology. Thus, the result $R$ is a function of the random pattern of flashes $F$, $R = \zeta(F)$. (Note that we do not model a class of experiments, but claim that any experiment deserving the name must be of this form.) We assume that $\zeta$ is a measurable function from the appropriate history space $\Omega$ (such as $M^N$) to the value space $V$ of the experiment. We also assume that the experiment begins at time $t_0$, that the Hilbert space is $\mathscr{H} = \mathscr{H}_{\mathrm{sys}} \otimes \mathscr{H}_{\mathrm{env}}$, where $\mathscr{H}_{\mathrm{sys}}$ is the Hilbert space of the system and $\mathscr{H}_{\mathrm{env}}$ that of its environment, and that the wave function at time $t_0$ is a product, $\Psi_{t_0} = \psi \otimes \phi$ (which expresses that the system and apparatus are initially independent and justifies saying that the system has wave function $\psi$). Finally, the distribution of the GRWf process is given by a history POVM $G(\cdot)$ on the appropriate history space.

Now, the physical statement (267) reduces to the following mathematical statement (which is mathematically not deep):

Theorem 8. Let $\mathscr{H} = \mathscr{H}_{\mathrm{sys}} \otimes \mathscr{H}_{\mathrm{env}}$ be a separable Hilbert space, $G(\cdot)$ a POVM on $(\Omega, \mathcal{A}_\Omega)$ acting on $\mathscr{H}$, $\phi$ a fixed vector in $\mathscr{H}_{\mathrm{env}}$ with $\|\phi\| = 1$, and $\zeta : (\Omega, \mathcal{A}_\Omega) \to (V, \mathcal{A}_V)$ a measurable function. For every $\psi \in \mathscr{H}_{\mathrm{sys}}$ with $\|\psi\| = 1$, let $\Psi_{t_0} = \psi \otimes \phi$, $F_\psi$ be a random variable in $\Omega$ with distribution $\langle\Psi_{t_0}|G(\cdot)\Psi_{t_0}\rangle$, and $R_\psi = \zeta(F_\psi)$. Then there is a POVM $E(\cdot)$ on $(V, \mathcal{A}_V)$ acting on $\mathscr{H}_{\mathrm{sys}}$ so that the distribution of $R_\psi$ is $\langle\psi|E(\cdot)\psi\rangle$.

Proof. For $A \subseteq V$ with $A \in \mathcal{A}_V$,
$$\mathbb{P}(R \in A) = \mathbb{P}(F \in \zeta^{-1}(A)) = \langle\Psi_{t_0}|G(\zeta^{-1}(A))\Psi_{t_0}\rangle \tag{268}$$
$$= \langle\psi \otimes \phi|G(\zeta^{-1}(A))\,\psi \otimes \phi\rangle = \langle\psi|E(A)\psi\rangle_{\mathrm{sys}}, \tag{269}$$
where $\langle\cdot|\cdot\rangle_{\mathrm{sys}}$ denotes the scalar product in $\mathscr{H}_{\mathrm{sys}}$, and $E(A) : \mathscr{H}_{\mathrm{sys}} \to \mathscr{H}_{\mathrm{sys}}$ is defined by first mapping $\psi \mapsto G(\zeta^{-1}(A))\,\psi \otimes \phi$ and then taking the partial scalar product with $\phi$. The partial scalar product with $\phi$ is the adjoint of $\psi \mapsto \psi \otimes \phi$, indeed the
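unique bounded linear mapping $L_\phi : \mathscr{H}_{\mathrm{sys}} \otimes \mathscr{H}_{\mathrm{env}} \to \mathscr{H}_{\mathrm{sys}}$ such that
$$L_\phi(\psi \otimes \chi) = \langle\phi|\chi\rangle_{\mathrm{env}}\, \psi. \tag{270}$$
It has $\|L_\phi\| = \|\phi\|$ and satisfies
$$\langle\psi|L_\phi \Psi\rangle_{\mathrm{sys}} = \langle\psi \otimes \phi|\Psi\rangle. \tag{271}$$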
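As a finite-dimensional illustration of this construction, the sketch below computes $E(A)$ from $G$, $\zeta$ and $\phi$ through the matrix elements $E(A)_{ij} = \sum_{a,b} \overline{\phi_a}\, G(\zeta^{-1}(A))_{(i,a),(j,b)}\, \phi_b$, which is what the partial scalar product with $\phi$ amounts to in a product basis. Everything specific here is an assumption made for illustration: two-dimensional system and environment spaces, a four-point history space, a history POVM obtained by reading out the product basis after a CNOT-type coupling, and a pointer function that returns the environment bit.

```python
import math

DIM_SYS, DIM_ENV = 2, 2

def idx(i, a):
    # Index of the product basis vector |i> (system) tensor |a> (environment).
    return i * DIM_ENV + a

def cnot(k):
    # Hypothetical coupling: the system bit flips the environment bit.
    i, a = divmod(k, DIM_ENV)
    return idx(i, a ^ i)

def history_povm():
    # Toy history POVM on Omega = {0, 1, 2, 3}: G({omega}) = U^* |omega><omega| U
    # with U the CNOT permutation, so G({omega}) projects onto basis vector cnot(omega).
    povm = {}
    for omega in range(DIM_SYS * DIM_ENV):
        m = [[0.0] * (DIM_SYS * DIM_ENV) for _ in range(DIM_SYS * DIM_ENV)]
        m[cnot(omega)][cnot(omega)] = 1.0
        povm[omega] = m
    return povm

def zeta(omega):
    # Hypothetical pointer reading: the environment bit of the recorded history.
    return omega % DIM_ENV

def reduced_povm(povm, phi, result):
    # E({result}): sum G over zeta^{-1}(result), then contract with phi on the environment.
    e = [[0.0] * DIM_SYS for _ in range(DIM_SYS)]
    for omega, m in povm.items():
        if zeta(omega) != result:
            continue
        for i in range(DIM_SYS):
            for j in range(DIM_SYS):
                for a in range(DIM_ENV):
                    for b in range(DIM_ENV):
                        e[i][j] += phi[a].conjugate() * m[idx(i, a)][idx(j, b)] * phi[b]
    return e

def expectation(psi, e):
    # <psi|E psi> on the system space.
    return sum(psi[i].conjugate() * e[i][j] * psi[j]
               for i in range(DIM_SYS) for j in range(DIM_SYS)).real

def joint_probability(psi, phi, povm, result):
    # <psi (x) phi | G(zeta^{-1}(result)) psi (x) phi> on the full space.
    big = [psi[i] * phi[a] for i in range(DIM_SYS) for a in range(DIM_ENV)]
    n = len(big)
    return sum(big[k].conjugate() * m[k][l] * big[l]
               for omega, m in povm.items() if zeta(omega) == result
               for k in range(n) for l in range(n)).real

if __name__ == "__main__":
    G = history_povm()
    phi = (math.sqrt(0.8), math.sqrt(0.2))
    print(reduced_povm(G, phi, 0), reduced_povm(G, phi, 1))
```

With the environment vector chosen in the usage lines, the two operators are diagonal with unequal entries strictly between 0 and 1, so they sum to the identity without being projections: in this toy setting $E(\cdot)$ is a POVM but not a PVM, even though $G(\cdot)$ was built from projections.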
We check that $E(\cdot)$ is a POVM: For $A = V$ (the entire space), $\zeta^{-1}(V) = \Omega$ and $G(\zeta^{-1}(V)) = I$, and $E(V) = I$ by (270). For every $A$, $E(A)$ is clearly well defined and bounded, and positive by (271). The weak $\sigma$-additivity follows from that of $G(\cdot)$.

(There does exist, though, another argument yielding (267), due to Dürr et al. [31]. It constitutes a proof of (267) from Bohmian mechanics, another proposal for the precise definition of quantum mechanics; but on the basis of ordinary quantum mechanics it remains incomplete. Here is an outline of the argument: Suppose that the experiment begins at time $t_0$ and ends at $t_1$; that, as before, $\mathscr{H} = \mathscr{H}_{\mathrm{sys}} \otimes \mathscr{H}_{\mathrm{env}}$ and $\Psi_{t_0} = \psi \otimes \phi$; that the time evolution of the wave function is given by a unitary operator $U_{t_0}^{t_1}$, so that $\Psi_{t_1} = U_{t_0}^{t_1} \Psi_{t_0}$. Now assume Born's rule, according to which the probability distribution of the configuration $Q$ at time $t_1$ is $\langle\Psi_{t_1}|P(\cdot)\Psi_{t_1}\rangle$ for a suitable PVM $P(\cdot)$ on configuration space $\mathcal{Q}$ acting on $\mathscr{H}$, the "configuration PVM". Finally, assume that $R$ is a function of $Q$, $R = \zeta(Q)$. (Here is where the argument works in Bohmian mechanics but not really in ordinary quantum mechanics, as one assumes that the configuration is part of the primitive ontology.) Then
$$\mathbb{P}(R \in A) = \bigl\langle\psi \otimes \phi\bigm|\bigl(U_{t_0}^{t_1}\bigr)^* P(\zeta^{-1}(A))\, U_{t_0}^{t_1}\, \psi \otimes \phi\bigr\rangle = \langle\psi|E(A)\psi\rangle, \tag{272}$$
and $E(\cdot)$ is a POVM.)

To sum up, the value of a precise definition of a physical theory is much the same as the value of a precise definition of a mathematical concept: It allows us to provide proofs for statements that we are interested in. Without the precise definition, many of these statements remain mere guesses or intuitions. And often, the clarity afforded by this precision helps us make new discoveries.

Acknowledgments

I thank Valia Allori (Northern Illinois University), Angelo Bassi (Trieste), Fay Dowker (Imperial College London), Detlef Dürr (LMU München), Gian-Carlo Ghirardi (ICTP Trieste), Sheldon Goldstein (Rutgers University), Frank Loose (Tübingen), Tim Maudlin (Rutgers University), Rainer Nagel (Tübingen), Travis Norsen (Marlboro College), Philip Pearle (Hamilton College), Peter Pickl (ETH Zürich), Reiner Schätzle (Tübingen), Luca Tenuta (Tübingen), Stefan Teufel (Tübingen), Jakob Wachsmuth (Tübingen), and Nino Zanghì (Genova) for helpful discussions at various times on various topics related to this work.
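Appendix. Proofs of Lemmas

A.1. Weak integrals

Proof of Lemma 1. If $P : \mathscr{H} \to \mathscr{H}$ is a positive operator then it is self-adjoint.$^{f}$ Therefore, $P^{1/2}$ exists, and $\langle\psi|P\psi\rangle = \|P^{1/2}\psi\|^2$. As a consequence, setting $P = \Lambda(q)$, if $\psi \in S$ then $q \mapsto \Lambda(q)^{1/2}\psi$ is a square-integrable function, and thus, if $\phi, \psi \in S$,
$$\int |\langle\phi|\Lambda(q)\psi\rangle|\, \mu(dq) \le \int \|\Lambda(q)^{1/2}\phi\|\, \|\Lambda(q)^{1/2}\psi\|\, \mu(dq) < \infty \tag{273}$$
by the Cauchy–Schwarz inequality in $\mathscr{H}$ and that in $L^2(M, \mu)$. This shows that $S$ is a subspace. For $\phi, \psi \in S$, set $Q(\psi) = \int \langle\psi|\Lambda(q)\psi\rangle\, \mu(dq)$ and
$$B(\phi, \psi) = \tfrac{1}{4}\bigl(Q(\phi + \psi) - Q(\phi - \psi) + iQ(\phi - i\psi) - iQ(\phi + i\psi)\bigr). \tag{274}$$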
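The polarization formula (274) recovers a sesquilinear form (antilinear in its first argument) from its quadratic form. A minimal numerical check is sketched below, assuming for illustration that the integrated operator is a diagonal positive matrix with the given `weights`; the function names are hypothetical and chosen here only to mirror the symbols $Q$ and $B$.

```python
def form(phi, psi, weights):
    # <phi|T psi> for a diagonal positive T with the given weights
    # (antilinear in phi, linear in psi).
    return sum(w * p.conjugate() * s for w, p, s in zip(weights, phi, psi))

def Q(psi, weights):
    # Quadratic form Q(psi) = <psi|T psi>, a nonnegative real number.
    return form(psi, psi, weights).real

def combine(x, y, c):
    # The vector x + c*y.
    return [xi + c * yi for xi, yi in zip(x, y)]

def B(phi, psi, weights):
    # Polarization formula (274).
    return 0.25 * (Q(combine(phi, psi, 1), weights)
                   - Q(combine(phi, psi, -1), weights)
                   + 1j * Q(combine(phi, psi, -1j), weights)
                   - 1j * Q(combine(phi, psi, 1j), weights))

if __name__ == "__main__":
    w = [0.5, 2.0, 1.25]
    phi = [1 + 2j, -0.5j, 0.3]
    psi = [0.2 - 1j, 1.5, -0.7 + 0.4j]
    print(B(phi, psi, w), form(phi, psi, w))
```

The two printed numbers agree up to rounding, which is the finite-dimensional content of the statement that $B$ is determined by $Q$.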
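Then (112) follows; sesquilinearity and positivity (and, in particular, Hermitian symmetry) follow from (112).

Proof of Lemma 2. For arbitrary $\psi \in \mathscr{H}$, there is a sequence $(\psi_n)_{n \in \mathbb{N}}$ in $S$ with $\psi_n \to \psi$. Since $T$ is bounded, $\langle\psi_n|T\psi_n\rangle \to \langle\psi|T\psi\rangle$. What we have to show is
$$\int_M \langle\psi_n|\Lambda(q)\psi_n\rangle\, \mu(dq) \to \int_M \langle\psi|\Lambda(q)\psi\rangle\, \mu(dq). \tag{275}$$
For every $n \in \mathbb{N}$, define the function $f_n : M \to [0, \infty)$ by
$$f_n(q) = \langle\psi_n|\Lambda(q)\psi_n\rangle. \tag{276}$$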
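Let $[f_n]$ denote its equivalence class modulo changes on a $\mu$-null set. Since $\int f_n(q)\, \mu(dq) = \langle\psi_n|T\psi_n\rangle < \infty$, $[f_n] \in L^1(M, \mu)$. The sequence $([f_n])$ is a Cauchy sequence in $L^1$:
$$\|f_n - f_m\|_1 = \int \bigl|\langle\psi_n - \psi_m + \psi_m|\Lambda(q)(\psi_n - \psi_m + \psi_m)\rangle - \langle\psi_m|\Lambda(q)\psi_m\rangle\bigr|\, \mu(dq) \tag{277}$$
$$\le \int \langle\psi_n - \psi_m|\Lambda(q)(\psi_n - \psi_m)\rangle\, \mu(dq) + \int 2\,|\langle\psi_n - \psi_m|\Lambda(q)\psi_m\rangle|\, \mu(dq) \tag{278}$$
(using the Cauchy–Schwarz inequality for $\mathscr{H}$)

f Because $\langle\phi + \psi|P(\phi + \psi)\rangle = \langle P(\phi + \psi)|\phi + \psi\rangle$ implies $\langle\phi|P\psi\rangle + \langle\psi|P\phi\rangle = \langle P\phi|\psi\rangle + \langle P\psi|\phi\rangle$; call this equation (1); consider the same equation with $i\psi$ instead of $\psi$, and call it equation (2); equation (1) minus $i$ times equation (2) yields $\langle\phi|P\psi\rangle = \langle P\phi|\psi\rangle$.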
March 10, 2009 19:20 WSPC/148-RMP
218
J070-00360
R. Tumulka
≤ ψn − ψm |T (ψn − ψm ) + 2Λ(q)1/2 (ψn − ψm )Λ(q)1/2 ψm µ(dq)
(279)
(using the Cauchy–Schwarz inequality for L2 (M, µ)) 1/2 2 1/2 2 ≤ T ψn − ψm + 2 Λ(q) (ψn − ψm ) µ(dq) ×
1/2 Λ(q)1/2 ψm 2 µ(dq)
(280)
= T ψn − ψm 2 + 2ψn − ψm |T (ψn − ψm )1/2 ψm |T ψm 1/2
(281)
≤ T ψn − ψm + 2T
(282)
2
1/2
ψn − ψm T
1/2
ψm
= T (ψn − ψm + 2ψm )ψn − ψm → 0
(283)
as n, m → ∞. Since ([fn ]) is a Cauchy sequence in the Banach space L (M, µ), fn (q)µ(dq) → f (q)µ(dq). it converges, say [fn ] → [f ], and ψn |T ψn = On the other hand, since the Λ(q) are bounded, the fn converge pointwise to q → ψ|Λ(q)ψ, and f (the L1 limit) must agree with the pointwise limit µ-almost everywhere. Thus, q → ψ|Λ(q)ψ is an L1 function, and (275) holds. 1
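The polarization identity (274) used in the proof of Lemma 1 can be checked numerically in finite dimensions. The following is a minimal sketch, assuming a random 4×4 Hermitian matrix `T` as a stand-in for the operator defined by the weak integral and an inner product that is antilinear in its first argument; the helper names (`inner`, `apply`, `Q`, `B`) are illustrative and do not come from the paper.

```python
import random

def inner(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def apply(T, v):
    """Matrix-vector product T v for a list-of-rows matrix."""
    return [sum(t * x for t, x in zip(row, v)) for row in T]

def Q(T, v):
    """Quadratic form Q(v) = <v|T v>."""
    return inner(v, apply(T, v))

def combine(u, v, c):
    """Return the vector u + c v."""
    return [a + c * b for a, b in zip(u, v)]

def B(T, phi, psi):
    """Sesquilinear form recovered from Q by polarization, as in (274)."""
    return 0.25 * (Q(T, combine(phi, psi, 1)) - Q(T, combine(phi, psi, -1))
                   + 1j * Q(T, combine(phi, psi, -1j))
                   - 1j * Q(T, combine(phi, psi, 1j)))

def random_vector(n, rng):
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

def random_hermitian(n, rng):
    """A + A^dagger for a random complex matrix A (hypothetical test operator)."""
    A = [random_vector(n, rng) for _ in range(n)]
    return [[A[i][j] + A[j][i].conjugate() for j in range(n)] for i in range(n)]

rng = random.Random(0)
T = random_hermitian(4, rng)
phi, psi = random_vector(4, rng), random_vector(4, rng)

# B(phi, psi) should coincide with <phi|T psi> up to rounding.
polarization_error = abs(B(T, phi, psi) - inner(phi, apply(T, psi)))
print(polarization_error)
```

The identity is purely algebraic, so the check would work for any square matrix; a Hermitian one is used only because it mirrors the positive operators of the lemma.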
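Proof of Lemma 3. The “only if” part is clear, and the “if” part follows from
\[ \langle\phi|\Lambda(q)\psi\rangle = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \langle\phi|\phi_n\rangle\,\Lambda_{nm}(q)\,\langle\phi_m|\psi\rangle, \tag{284} \]
where the series converges for every $q$, and the fact that the pointwise limit of measurable functions is measurable.

Proof of Lemma 4. $q \mapsto R\Lambda(q)S$ is weakly measurable because, if $\{\phi_n : n \in \mathbb{N}\}$ is an orthonormal basis, $\langle\phi_n|R\Lambda(q)S\phi_m\rangle = \sum_{k=1}^{\infty} \sum_{\ell=1}^{\infty} \langle\phi_n|R\phi_k\rangle\,\langle\phi_k|\Lambda(q)\phi_\ell\rangle\,\langle\phi_\ell|S\phi_m\rangle$, as $R$ and $\Lambda(q)$ are bounded. To check (114), note that since $R$ is bounded, its adjoint $R^*$ is defined on all of $\mathscr{H}$ and is bounded too, so that $\langle R^*\phi|\Lambda(q)S\psi\rangle$ exists for all $\phi$, $\psi$ and $q$, and is integrable by (110) with $\phi$ replaced by $R^*\phi$ and $\psi$ by $S\psi$:
\[ \langle R^*\phi|TS\psi\rangle = \int \langle R^*\phi|\Lambda(q)S\psi\rangle\,\mu(dq) = \int \langle\phi|R\Lambda(q)S\psi\rangle\,\mu(dq). \tag{285} \]
The left-hand side equals $\langle\phi|RTS\psi\rangle$, and the right-hand side $\langle\phi|\int R\Lambda(q)S\,\mu(dq)\,\psi\rangle$.

Proof of Lemma 5. For every $q$,
\[ \langle\phi_n|\Lambda(q)\Lambda'(q)\phi_m\rangle = \sum_{\ell=1}^{\infty} \Lambda_{n\ell}(q)\,\Lambda'_{\ell m}(q) \tag{286} \]
because $\Lambda(q)$ is bounded. The right-hand side is a measurable function of $q$ because products, sums, and limits of measurable functions are measurable.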
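Proof of Lemma 6. For bounded, self-adjoint $T$, it is known [63, Theorem 12.25] that
\[ \|T\| = \sup_{\|\psi\|=1} |\langle\psi|T\psi\rangle|. \tag{287} \]
Let $S$ be any countable dense subset of the unit sphere of $\mathscr{H}$. Then
\[ \sup_{\|\psi\|=1} |\langle\psi|T\psi\rangle| = \sup_{\psi\in S} |\langle\psi|T\psi\rangle|. \tag{288} \]
The $\ge$ relation is clear, and for the $\le$ relation consider any $\psi \in \mathscr{H}$ with $\|\psi\| = 1$ and note that there is a sequence $(\psi_m) \subseteq S$ with $\psi_m \to \psi$ and therefore, by the boundedness of $T$, $\langle\psi_m|T\psi_m\rangle \to \langle\psi|T\psi\rangle$; as a consequence, for every $\varepsilon > 0$,
\[ |\langle\psi|T\psi\rangle| - \varepsilon \le |\langle\psi_m|T\psi_m\rangle| \tag{289} \]
for sufficiently large $m$. Thus,
\[ \|\Lambda(q)\| = \sup_{\psi\in S} |\langle\psi|\Lambda(q)\psi\rangle|, \tag{290} \]
and the supremum of countably many measurable functions is measurable.

Proof of Lemma 7. A positive operator $\Lambda(q)$ that is defined on all of $\mathscr{H}$ is self-adjoint, and if it is bounded and bijective then its spectrum must be contained in some interval $[a, b]$ with $0 < a < b < \infty$. For every $n \in \mathbb{N}$ let $A_n \subseteq M$ be the set of those $q$ for which the spectrum of $\Lambda(q)$ is contained in $[1/n, n]$. To see that this set is measurable, choose any countable dense subset $S$ of the unit sphere of $\mathscr{H}$ and define
\[ A'_n := \Bigl\{ q \in M : \langle\psi|\Lambda(q)\psi\rangle \in \bigl[\tfrac{1}{n}, n\bigr] \ \ \forall \psi \in S \Bigr\}. \tag{291} \]
This set is measurable because it is the countable intersection of the measurable sets $A'_n(\psi) = \{q \in M : \langle\psi|\Lambda(q)\psi\rangle \in [\tfrac{1}{n}, n]\}$. But in fact, $A_n = A'_n$: $A_n \subseteq A'_n$ is clear, and if $q \in A'_n$ and $\psi \in \mathscr{H}$ has norm 1 then there is a sequence $(\psi_m)$ in $S$ with $\psi_m \to \psi$, and by the boundedness of $\Lambda(q)$ also $\langle\psi_m|\Lambda(q)\psi_m\rangle \to \langle\psi|\Lambda(q)\psi\rangle$, and therefore $\langle\psi|\Lambda(q)\psi\rangle \in [\tfrac{1}{n}, n]$. Since $\cup_n A_n = M$, it suffices to show on $A_n$ that $q \mapsto \Lambda(q)^{-1}$ is weakly measurable. For $q \in A_n$, consider $1/n$ times the Neumann series applied to $I - \tfrac{1}{n}\Lambda(q)$,
\[ \frac{1}{n} \sum_{k=0}^{\infty} \Bigl(I - \frac{1}{n}\Lambda(q)\Bigr)^k. \tag{292} \]
The series converges in norm because $\|I - \tfrac{1}{n}\Lambda(q)\| \le 1 - 1/n^2$, and since, in case of convergence, $\sum_k T^k = (I - T)^{-1}$, (292) is the inverse of $\Lambda(q)$. As a consequence, (292) also converges weakly, and
\[ \langle\psi|\Lambda(q)^{-1}\psi\rangle = \frac{1}{n} \sum_{k=0}^{\infty} \Bigl\langle\psi\Bigm|\Bigl(I - \frac{1}{n}\Lambda(q)\Bigr)^k \psi\Bigr\rangle. \tag{293} \]
Each term on the right-hand side is a measurable function of $q \in A_n$ by Lemma 5, and thus so is the series.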
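The Neumann series (292) is easy to try out on a small matrix. The sketch below is only an illustration under stated assumptions: `Lam` is a hypothetical 2×2 symmetric matrix whose eigenvalues (about 0.71 and 1.39) lie in $[1/n, n]$ for $n = 2$, and the function names are not from the paper. It sums the first terms of $\frac{1}{n}\sum_k (I - \frac{1}{n}\Lambda)^k$ and compares the result with the inverse.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def neumann_inverse(Lam, n, terms):
    """Approximate Lam^{-1} by (1/n) * sum_{k < terms} (I - Lam/n)^k, cf. (292)."""
    d = len(Lam)
    identity = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    step = [[identity[i][j] - Lam[i][j] / n for j in range(d)] for i in range(d)]
    power = identity                      # (I - Lam/n)^0
    total = [[0.0] * d for _ in range(d)]
    for _ in range(terms):
        total = [[total[i][j] + power[i][j] for j in range(d)] for i in range(d)]
        power = matmul(power, step)       # next power of (I - Lam/n)
    return [[total[i][j] / n for j in range(d)] for i in range(d)]

# Hypothetical positive matrix with spectrum inside [1/2, 2].
Lam = [[1.2, 0.3], [0.3, 0.9]]
approx_inv = neumann_inverse(Lam, n=2, terms=200)

# Lam times the series should be close to the identity matrix.
product = matmul(Lam, approx_inv)
neumann_error = max(abs(product[i][j] - (1.0 if i == j else 0.0))
                    for i in range(2) for j in range(2))
print(neumann_error)
```

The geometric convergence rate is governed by $\|I - \frac{1}{n}\Lambda\|$, which is why the proof restricts to the sets $A_n$ before summing.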
R. Tumulka
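Proof of Lemma 8. For $n \in \mathbb{N}$ set $A_n = \{q \in M : \|\Lambda(q)\| \le n\}$. By Lemma 6 this is a measurable set. Since $\cup_n A_n = M$, it suffices to show on $A_n$ that $\Lambda(q)^{1/2}$ is weakly measurable. We use the Taylor expansion of the square root function $x \mapsto x^{1/2}$ around $x = 1$,
\[ (1 + t)^{1/2} = \sum_{k=0}^{\infty} \binom{1/2}{k} t^k, \tag{294} \]
where
\[ \binom{\alpha}{k} = \frac{\alpha(\alpha - 1) \cdots (\alpha - k + 1)}{k!}. \tag{295} \]
The series converges absolutely for $|t| < 1$, and thus the corresponding operator series
\[ \sum_{k=0}^{\infty} \binom{1/2}{k} T^k \tag{296} \]
converges in norm for self-adjoint $T$ with $\|T\| < 1$. In this case (in which $I + T \ge 0$), we obtain from the functional calculus for self-adjoint operators that (296) equals indeed $(I + T)^{1/2}$. Now let $0 < \varepsilon < 1/2$ and $q \in A_n$, and set
\[ T = \frac{1}{n}\Lambda(q) - (1 - \varepsilon)I, \tag{297} \]
so that $I + T = \varepsilon I + \tfrac{1}{n}\Lambda(q)$. Then $-(1 - \varepsilon)I \le T \le \tfrac{1}{n}\Lambda(q) - \tfrac{1}{2}I \le I - \tfrac{1}{2}I = \tfrac{1}{2}I$ and thus $\|T\| \le 1 - \varepsilon$. Thus,
\[ \Bigl(\varepsilon I + \frac{1}{n}\Lambda(q)\Bigr)^{1/2} = \sum_{k=0}^{\infty} \binom{1/2}{k} T^k = \sum_{k=0}^{\infty} \binom{1/2}{k} \Bigl(\frac{1}{n}\Lambda(q) - (1 - \varepsilon)I\Bigr)^k. \tag{298} \]
From this we can conclude with Lemma 5 that $q \mapsto (\varepsilon I + \tfrac{1}{n}\Lambda(q))^{1/2}$ is weakly measurable. Since limits of measurable functions are measurable, it only remains to show that
\[ \Bigl\langle\psi\Bigm|\Bigl(\varepsilon I + \frac{1}{n}\Lambda(q)\Bigr)^{1/2}\psi\Bigr\rangle \to \Bigl\langle\psi\Bigm|\frac{1}{\sqrt{n}}\Lambda(q)^{1/2}\psi\Bigr\rangle \quad \text{as } \varepsilon \to 0. \tag{299} \]
Indeed, for any positive bounded operator $S$, this convergence statement holds even in norm:
\[ \|(\varepsilon I + S)^{1/2} - S^{1/2}\| \to 0 \quad \text{as } \varepsilon \to 0. \tag{300} \]
To see this, set $R_\pm = (\varepsilon I + S)^{1/2} \pm S^{1/2}$; note $R_+ \ge \varepsilon^{1/2} I$, so that $R_+$ is bijective and $\|R_+^{-1}\| \le \varepsilon^{-1/2}$; note $R_+ R_- = \varepsilon I + S - S = \varepsilon I$ (since $(\varepsilon I + S)^{1/2}$ and $S^{1/2}$ commute because $\varepsilon I + S$ and $S$ commute); thus $R_- = \varepsilon R_+^{-1}$. As a consequence, $\|R_-\| = \varepsilon \|R_+^{-1}\| \le \varepsilon^{1/2} \to 0$ as $\varepsilon \to 0$, which is (300).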
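On the spectrum of $\Lambda(q)$ the argument above reduces to statements about real numbers, which can be checked directly. The following sketch works under that simplifying assumption (scalars $x \in [0, n]$ in place of the operator; the values of `n`, `eps` and `samples` are arbitrary illustrative choices). It sums the binomial series (294) at $t = x/n - (1 - \varepsilon)$ and confirms the bound $(\varepsilon + s)^{1/2} - s^{1/2} \le \varepsilon^{1/2}$ behind (300).

```python
import math

def binom_half_series(t, terms):
    """Partial sum of sum_k binom(1/2, k) t^k, the Taylor series of (1 + t)^(1/2)."""
    total, coeff, power = 0.0, 1.0, 1.0      # coeff starts at binom(1/2, 0) = 1
    for k in range(terms):
        total += coeff * power
        coeff *= (0.5 - k) / (k + 1)         # binom(1/2, k+1) from binom(1/2, k)
        power *= t
    return total

def regularized_sqrt(x, n, eps, terms=2000):
    """Series value of (eps + x/n)^(1/2) for 0 <= x <= n, with t as in (297)."""
    t = x / n - (1.0 - eps)                  # satisfies |t| <= 1 - eps
    return binom_half_series(t, terms)

n, eps = 3, 0.05
samples = [0.0, 0.4, 1.0, 2.5, 3.0]          # hypothetical spectral values in [0, n]

# The series should reproduce the square root of eps + x/n.
series_error = max(abs(regularized_sqrt(x, n, eps) - math.sqrt(eps + x / n))
                   for x in samples)

# Scalar form of (300): the regularization error is at most eps^(1/2).
limit_gap = max(abs(math.sqrt(eps + x / n) - math.sqrt(x / n)) for x in samples)
print(series_error, limit_gap, math.sqrt(eps))
```

The worst case for the gap is $x = 0$, where it equals $\varepsilon^{1/2}$ exactly; this is the point at which the plain Taylor series around 1 would fail to converge without the shift by $\varepsilon$.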
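A.2. Dyson series

Proof of Lemma 12. To see that (156) is weakly convergent, note that
\[ \sum_{n=1}^{\infty} \int_s^t dt_1 \int_{t_1}^t dt_2 \cdots \int_{t_{n-1}}^t dt_n\, |\langle\psi|R_{t_n} \cdots R_{t_1}\psi\rangle| \tag{301} \]
\[ \le \|\psi\|^2 \sum_{n=1}^{\infty} \int_s^t dt_1 \int_{t_1}^t dt_2 \cdots \int_{t_{n-1}}^t dt_n\, \|R_{t_n}\| \cdots \|R_{t_1}\| \tag{302} \]
\[ = \|\psi\|^2 \sum_{n=1}^{\infty} \frac{1}{n!} \Bigl(\int_s^t dt_1\, \|R_{t_1}\|\Bigr)^n \le \|\psi\|^2\, e^{\int_s^t dt_1 \|R_{t_1}\|} < \infty. \tag{303} \]
As a consequence, $\langle\psi|W_s^t\psi\rangle$ is well defined and defines a bounded quadratic form and thus a bounded operator $W_s^t : \mathscr{H} \to \mathscr{H}$. To see that $(s, t) \mapsto W_s^t$ is weakly measurable, note that (i) $t \mapsto R_t$ is; (ii) by Lemma 5, $(t_1, \ldots, t_n) \mapsto R_{t_n} \cdots R_{t_1}$ is; (iii) integrals are measurable functions of their boundaries; and (iv) limits of measurable functions are measurable.

To check (157), note first that the domain of integration in $\mathbb{R}^n$ for the $n$th term of (156) is characterized by $s \le t_1 \le \cdots \le t_n \le t$, and changing the order of integration (because of absolute weak convergence), (156) can be rewritten as
\[ W_s^t = I + \sum_{n=1}^{\infty} \int_s^t dt_n \int_s^{t_n} dt_{n-1} \cdots \int_s^{t_2} dt_1\, R_{t_n} \cdots R_{t_1}. \tag{304} \]
As a consequence, the right-hand side of (157) is
\[ \int_s^t dt'\, R_{t'} W_s^{t'} = \int_s^t dt'\, R_{t'} + \int_s^t dt'\, R_{t'} \sum_{n=1}^{\infty} \int_s^{t'} dt_n \int_s^{t_n} dt_{n-1} \cdots \int_s^{t_2} dt_1\, R_{t_n} \cdots R_{t_1} \tag{305} \]
(using (114))
\[ = \int_s^t dt'\, R_{t'} + \int_s^t dt' \sum_{n=1}^{\infty} \int_s^{t'} dt_n \int_s^{t_n} dt_{n-1} \cdots \int_s^{t_2} dt_1\, R_{t'} R_{t_n} \cdots R_{t_1} \tag{306} \]
($\int dt'$ and $\sum_n$ can be exchanged because of absolute (weak) convergence)
\[ = \int_s^t dt'\, R_{t'} + \sum_{n=1}^{\infty} \int_s^t dt' \int_s^{t'} dt_n \int_s^{t_n} dt_{n-1} \cdots \int_s^{t_2} dt_1\, R_{t'} R_{t_n} \cdots R_{t_1} \tag{307} \]
(rename $t' \to t_{n+1}$)
\[ = \int_s^t dt_1\, R_{t_1} + \sum_{n=1}^{\infty} \int_s^t dt_{n+1} \int_s^{t_{n+1}} dt_n \int_s^{t_n} dt_{n-1} \cdots \int_s^{t_2} dt_1\, R_{t_{n+1}} R_{t_n} \cdots R_{t_1} \tag{308} \]
($m := n + 1$)
\[ = \sum_{m=1}^{\infty} \int_s^t dt_m \int_s^{t_m} dt_{m-1} \cdots \int_s^{t_2} dt_1\, R_{t_m} R_{t_{m-1}} \cdots R_{t_1} = W_s^t - I. \tag{309} \]

To check (127), we proceed in a similar way. To simplify notation, set $\tau = (t_1, \ldots, t_n)$, $R_\tau = R_{t_n} \cdots R_{t_1}$, and
\[ S_n(s, t) = \{(t_1, \ldots, t_n) \in \mathbb{R}^n : s \le t_1 \le \cdots \le t_n \le t\}. \tag{310} \]
For $n = 0$, set
\[ R_\emptyset = I \quad \text{and} \quad \int_{S_0(s,t)} d\tau\, f(\tau) = f(\emptyset). \tag{311} \]
Then the Dyson series (156) can be written as
\[ W_s^t = \sum_{n=0}^{\infty} \int_{S_n(s,t)} d\tau\, R_\tau. \tag{312} \]
Now observe that the right-hand side of (127) is
\[ -\int_s^t dt'\, W_s^{t'*}\, \Lambda(Q, t')\, W_s^{t'} = \int_s^t dt'\, W_s^{t'*} (R_{t'}^* + R_{t'}) W_s^{t'} \tag{313} \]
(using (114); the ordering of summation and integration can be changed because of absolute (weak) convergence)
\[ = \int_s^t dt' \sum_{n,n^*=0}^{\infty} \int_{S_n(s,t')} d\tau \int_{S_{n^*}(s,t')} d\tau^*\, R_{\tau^*}^* (R_{t'}^* + R_{t'}) R_\tau \tag{314} \]
(separating $R_{t'}^*$ and $R_{t'}$)
\[ = \int_s^t dt' \sum_{n,n^*=0}^{\infty} \int_{S_n(s,t')} d\tau \int_{S_{n^*}(s,t')} d\tau^*\, R_{\tau^*}^* R_{t'}^* R_\tau + \int_s^t dt' \sum_{n,n^*=0}^{\infty} \int_{S_n(s,t')} d\tau \int_{S_{n^*}(s,t')} d\tau^*\, R_{\tau^*}^* R_{t'} R_\tau \tag{315} \]
(changing the ordering of integration and summation, and setting $t_0 = t_0^* = s$)
\[ = \sum_{n,n^*=0}^{\infty} \int_{S_n(s,t)} d\tau \int_{S_{n^*}(s,t)} d\tau^* \int_s^t dt'\, 1_{t_n \le t'}\, 1_{t^*_{n^*} \le t'}\, R_{\tau^*}^* R_{t'}^* R_\tau + \sum_{n,n^*=0}^{\infty} \int_{S_n(s,t)} d\tau \int_{S_{n^*}(s,t)} d\tau^* \int_s^t dt'\, 1_{t_n \le t'}\, 1_{t^*_{n^*} \le t'}\, R_{\tau^*}^* R_{t'} R_\tau \tag{316} \]
(renaming either $t' \to t_{n+1}$ or $t' \to t^*_{n^*+1}$)
\[ = \sum_{n,n^*=0}^{\infty} \int_{S_n(s,t)} d\tau \int_{S_{n^*+1}(s,t)} d\tau^*\, 1_{t_n \le t^*_{n^*+1}}\, R_{\tau^*}^* R_\tau + \sum_{n,n^*=0}^{\infty} \int_{S_{n+1}(s,t)} d\tau \int_{S_{n^*}(s,t)} d\tau^*\, 1_{t^*_{n^*} \le t_{n+1}}\, R_{\tau^*}^* R_\tau \tag{317} \]
(renaming either $m^* = n^* + 1$ and $m = n$, or $m^* = n^*$ and $m = n + 1$)
\[ = \sum_{m=0}^{\infty} \sum_{m^*=1}^{\infty} \int_{S_m(s,t)} d\tau \int_{S_{m^*}(s,t)} d\tau^*\, 1_{t_m \le t^*_{m^*}}\, R_{\tau^*}^* R_\tau + \sum_{m=1}^{\infty} \sum_{m^*=0}^{\infty} \int_{S_m(s,t)} d\tau \int_{S_{m^*}(s,t)} d\tau^*\, 1_{t^*_{m^*} \le t_m}\, R_{\tau^*}^* R_\tau \tag{318} \]
(separating the terms with $m = 0$ or $m^* = 0$)
\[ = \sum_{m,m^*=1}^{\infty} \int_{S_m(s,t)} d\tau \int_{S_{m^*}(s,t)} d\tau^*\, 1_{t_m \le t^*_{m^*}}\, R_{\tau^*}^* R_\tau + \sum_{m^*=1}^{\infty} \int_{S_{m^*}(s,t)} d\tau^*\, R_{\tau^*}^* + \sum_{m=1}^{\infty} \sum_{m^*=1}^{\infty} \int_{S_m(s,t)} d\tau \int_{S_{m^*}(s,t)} d\tau^*\, 1_{t^*_{m^*} \le t_m}\, R_{\tau^*}^* R_\tau + \sum_{m=1}^{\infty} \int_{S_m(s,t)} d\tau\, R_\tau \tag{319} \]
(combining the first and third term)
\[ = \sum_{m,m^*=1}^{\infty} \int_{S_m(s,t)} d\tau \int_{S_{m^*}(s,t)} d\tau^*\, R_{\tau^*}^* R_\tau + \sum_{m^*=1}^{\infty} \int_{S_{m^*}(s,t)} d\tau^*\, R_{\tau^*}^* + \sum_{m=1}^{\infty} \int_{S_m(s,t)} d\tau\, R_\tau \tag{320} \]
\[ = -I + \sum_{m=0}^{\infty} \sum_{m^*=0}^{\infty} \int_{S_m(s,t)} d\tau \int_{S_{m^*}(s,t)} d\tau^*\, R_{\tau^*}^* R_\tau \tag{321} \]
\[ = -I + W_s^{t*} W_s^t. \]
This shows (127).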
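The first computation in this proof shows that the right-hand side of (157), $\int_s^t dt'\, R_{t'} W_s^{t'}$, equals $W_s^t - I$, so the partial sums of the Dyson series can be generated by iterating $W \mapsto I + \int_s^{\cdot} R_u W_s^u\, du$ starting from $W = I$. The sketch below illustrates this numerically under assumptions that are not in the paper: a hypothetical 2×2 generator $R_t = (1 + t)A$ with $A$ the rotation generator (chosen so that all $R_t$ commute and $W_0^1 = \exp(\tfrac{3}{2}A)$ is known in closed form), and trapezoidal quadrature on a uniform grid.

```python
import math

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def mat_mul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(A, B, scale=1.0):
    """Return A + scale * B for 2x2 matrices."""
    return [[A[i][j] + scale * B[i][j] for j in range(2)] for i in range(2)]

def R(t):
    """Hypothetical bounded generator: (1 + t) times the rotation generator."""
    a = 1.0 + t
    return [[0.0, a], [-a, 0.0]]

def dyson_partial_sum(s, t, order, steps=400):
    """Dyson series for W_s^t up to the given order, obtained by iterating
    W <- I + int_s^{t'} R_u W_s^u du with the trapezoid rule on a uniform grid."""
    h = (t - s) / steps
    grid = [s + i * h for i in range(steps + 1)]
    W = [IDENTITY for _ in grid]                      # order 0: W = I
    for _ in range(order):
        integrand = [mat_mul(R(u), W[i]) for i, u in enumerate(grid)]
        new_W = [IDENTITY]
        running = [[0.0, 0.0], [0.0, 0.0]]
        for i in range(1, steps + 1):
            # cumulative trapezoid rule for the integral from s to grid[i]
            pair = mat_add(integrand[i - 1], integrand[i])
            running = mat_add(running, pair, 0.5 * h)
            new_W.append(mat_add(IDENTITY, running))
        W = new_W
    return W[-1]

W = dyson_partial_sum(0.0, 1.0, order=30)
theta = 1.5                                           # int_0^1 (1 + u) du
exact = [[math.cos(theta), math.sin(theta)],
         [-math.sin(theta), math.cos(theta)]]
dyson_error = max(abs(W[i][j] - exact[i][j]) for i in range(2) for j in range(2))
print(dyson_error)
```

Each pass of the loop adds one more term of the series, with later times acting to the left as in $R_{t_n} \cdots R_{t_1}$; the remaining error in this sketch comes from the quadrature step, not from truncating the series.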
References

[1] V. Allori, M. Dorato, F. Laudisa and N. Zanghì, La Natura Delle Cose, Introduzione ai Fondamenti e Alla Filosofia Della Fisica (Carocci, Rome, 2005).
[2] V. Allori, S. Goldstein, R. Tumulka and N. Zanghì, On the common structure of Bohmian mechanics and the Ghirardi–Rimini–Weber theory, British J. Philos. Sci. 59 (2008) 353–389; arXiv:quant-ph/0603027.
[3] A. Bassi and G. C. Ghirardi, Dynamical reduction models, Phys. Rep. 379 (2003) 257–426.
[4] A. Bassi and G. C. Ghirardi, The Conway–Kochen argument and relativistic GRW models, Found. Phys. 37 (2007) 169–185; arXiv:quant-ph/0610209.
[5] V. P. Belavkin, A new wave equation for a continuous nondemolition measurement, Phys. Lett. A 140 (1989) 355–358.
[6] J. S. Bell, On the Einstein–Podolsky–Rosen paradox, Physics 1 (1964) 195–200; Reprinted as Chap. 2 of [10].
[7] J. S. Bell, On the problem of hidden variables in quantum mechanics, Rev. Modern Phys. 38 (1966) 447–452; Reprinted as Chap. 1 of [10].
[8] J. S. Bell, Six possible worlds of quantum mechanics, in Proc. Nobel Symp. 65. Possible Worlds in Humanities, Arts and Sciences (Stockholm, August 11–15, 1986), ed. S. Allén (Walter de Gruyter, 1989), pp. 359–373; Reprinted as Chap. 20 in [10].
[9] J. S. Bell, Are there quantum jumps?, in Schrödinger. Centenary Celebration of a Polymath, ed. C. E. W. Kilmister (Cambridge University Press, 1987), pp. 41–52; Reprinted as Chap. 22 of [10].
[10] J. S. Bell, Speakable and Unspeakable in Quantum Mechanics (Cambridge University Press, 1987).
[11] J. S. Bell, Against “measurement”, in Sixty-Two Years of Uncertainty: Historical, Philosophical, and Physical Inquiries into the Foundations of Quantum Physics, ed. A. I. Miller, NATO ASI Series B, Vol. 226 (Plenum Press, 1990), pp. 17–31; Reprinted in Phys. World 3(8) (1990) 33–40.
[12] F. Benatti, G. C. Ghirardi and R. Grassi, Describing the macroscopic world: Closing the circle within the dynamical reduction program, Found. Phys. 25 (1995) 5–38.
[13] K. Berndl, M. Daumer, D. Dürr, S. Goldstein and N. Zanghì, A survey of Bohmian mechanics, Il Nuovo Cimento B 110 (1995) 737–750; arXiv:quant-ph/9504010.
[14] K. Berndl, D. Dürr, S. Goldstein, G. Peruzzi and N. Zanghì, On the global existence of Bohmian mechanics, Commun. Math. Phys. 173 (1995) 647–673; arXiv:quant-ph/9503013.
[15] P. Blanchard and A. Jadczyk, Events and piecewise deterministic dynamics in event-enhanced quantum theory, Phys. Lett. A 203 (1995) 260–266.
[16] P. Blanchard and A. Jadczyk, Relativistic quantum events, Found. Phys. 26 (1996) 1669–1681.
[17] D. Bohm, A suggested interpretation of the quantum theory in terms of “hidden” variables, I and II, Phys. Rev. 85 (1952) 166–193.
[18] P. R. Chernoff, Essential self-adjointness of powers of generators of hyperbolic equations, J. Funct. Anal. 12 (1973) 401–414.
[19] S. Colin, T. Durt and R. Tumulka, On superselection rules in Bohm–Bell theories, J. Phys. A 39 (2006) 15403–15419; arXiv:quant-ph/0509177.
[20] J. H. Conway and S. Kochen, The free will theorem, Found. Phys. 36 (2006) 1441–1473; arXiv:quant-ph/0604079.
[21] J. H. Conway and S. Kochen, Reply to comments of Bassi, Ghirardi, and Tumulka on the free will theorem, Found. Phys. 37 (2007) 1643–1647; arXiv:quant-ph/0701016.
[22] L. De Broglie, La nouvelle dynamique des quanta, in Electrons et Photons: Rapports et Discussions du Cinquième Conseil de Physique tenu à Bruxelles du 24 au 29 Octobre 1927 sous les Auspices de l’Institut International de Physique Solvay (Gauthier-Villars, Paris, 1928); English translation, The new dynamics of quanta, in Quantum Theory at the Crossroads, eds. G. Bacciagaluppi and A. Valentini (Cambridge University Press, 2009), pp. 374–407.
[23] J. Dimock, Dirac quantum fields on a manifold, Trans. Amer. Math. Soc. 269 (1982) 133–147.
[24] L. Diósi, Localized solution of a simple nonlinear quantum Langevin equation, Phys. Lett. A 132 (1988) 233–236.
[25] C. Dove, Explicit wavefunction collapse and quantum measurement, Ph.D. thesis, Department of Mathematical Sciences, University of Durham (1996).
[26] C. Dove and E. J. Squires, Symmetric versions of explicit wavefunction collapse models, Found. Phys. 25 (1995) 1267–1282.
[27] C. Dove and E. J. Squires, A local model of explicit wavefunction collapse, preprint (1996); arXiv:quant-ph/9605047.
[28] F. Dowker and J. Henson, Spontaneous collapse models on a lattice, J. Statist. Phys. 115 (2004) 1327–1339; arXiv:quant-ph/0209051.
[29] D. Dürr, S. Goldstein, K. Münch-Berndl and N. Zanghì, Hypersurface Bohm–Dirac models, Phys. Rev. A 60 (1999) 2729–2736; arXiv:quant-ph/9801070.
[30] D. Dürr, S. Goldstein and N. Zanghì, Quantum equilibrium and the origin of absolute uncertainty, J. Statist. Phys. 67 (1992) 843–907; arXiv:quant-ph/0308039.
[31] D. Dürr, S. Goldstein and N. Zanghì, Quantum equilibrium and the role of operators as observables in quantum theory, J. Statist. Phys. 116 (2004) 959–1055; arXiv:quant-ph/0308038.
[32] D. Dürr and P. Pickl, Flux-across-surfaces theorem for a Dirac particle, J. Math. Phys. 44 (2003) 423–456; math-ph/0207010.
[33] A. Einstein, Reply to criticisms, in Albert Einstein, Philosopher-Scientist, ed. P. A. Schilpp (Library of Living Philosophers, Evanston, IL, 1949), pp. 663–688.
[34] A. Einstein, B. Podolsky and N. Rosen, Can quantum-mechanical description of physical reality be considered complete?, Phys. Rev. 47 (1935) 777–780.
[35] H. Federer, Geometric Measure Theory (Springer, Berlin, 1969).
[36] H.-O. Georgii and R. Tumulka, Global existence of Bell’s time-inhomogeneous jump process for lattice quantum field theory, Markov Process. Related Fields 11 (2005) 1–18; arXiv:math.PR/0312294.
[37] G. C. Ghirardi, Some reflections inspired by my research activity in quantum mechanics, J. Phys. A 40 (2007) 2891–2917.
[38] G. C. Ghirardi, R. Grassi and P. Pearle, Relativistic dynamical reduction models: General framework and examples, Found. Phys. 20 (1990) 1271–1316.
[39] G. C. Ghirardi, P. Pearle and A. Rimini, Markov processes in Hilbert space and continuous spontaneous localization of systems of identical particles, Phys. Rev. A 42 (1990) 78–89.
[40] G. C. Ghirardi, A. Rimini and T. Weber, Unified dynamics for microscopic and macroscopic systems, Phys. Rev. D 34 (1986) 470–491.
[41] N. Gisin, Stochastic quantum dynamics and relativity, Helv. Phys. Acta 62 (1989) 363–371.
[42] S. Goldstein, Quantum theory without observers. Part one, Physics Today (March 1998), pp. 42–46.
[43] S. Goldstein, Quantum theory without observers. Part two, Physics Today (April 1998), pp. 38–42.
[44] S. Goldstein and R. Tumulka, Opposite arrows of time can reconcile relativity and nonlocality, Classical Quantum Gravity 20 (2003) 557–564; arXiv:quant-ph/0105040.
[45] K.-E. Hellwig and K. Kraus, Formal description of measurements in local quantum field theory, Phys. Rev. D 1 (1970) 566–571.
[46] A. Jadczyk, Some comments on the formal structure of spontaneous localization theories, in Quantum Mechanics: Are there Quantum Jumps? And On the Present Status of Quantum Mechanics, eds. A. Bassi, D. Dürr, T. Weber and N. Zanghì, AIP Conference Proceedings, Vol. 844 (American Institute of Physics, 2006), pp. 192–199; arXiv:quant-ph/0603046.
[47] O. Kallenberg, Foundations of Modern Probability (Springer, 1997).
[48] A. Kent, “Quantum jumps” and indistinguishability, Mod. Phys. Lett. A 4(19) (1989) 1839–1845.
[49] A. J. Leggett, Testing the limits of quantum mechanics: Motivation, state of play, prospects, J. Phys. Condens. Matter 14 (2002) R415–R451.
[50] T. Maudlin, Quantum Non-Locality and Relativity (Blackwell, 1994).
[51] T. Maudlin, Non-local correlations in quantum theory: Some ways the trick might be done, in Einstein, Relativity, and Absolute Simultaneity, eds. W. L. Craig and Q. Smith (Routledge, 2007), pp. 186–209.
[52] T. Maudlin, Completeness, supervenience and ontology, J. Phys. A 40 (2007) 3151–3171.
[53] O. Nicrosini and A. Rimini, Relativistic spontaneous localization: A proposal, Found. Phys. 33 (2003) 1061–1084; arXiv:quant-ph/0207145.
[54] B. O’Neill, Semi-Riemannian Geometry (Academic Press, 1983).
[55] P. Pearle, Combining stochastic dynamical state-vector reduction with spontaneous localization, Phys. Rev. A 39 (1989) 2277–2289.
[56] P. Pearle, Toward a relativistic theory of statevector reduction, in Sixty-Two Years of Uncertainty: Historical, Philosophical, and Physical Inquiries into the Foundations of Quantum Physics, ed. A. I. Miller, NATO ASI Series B, Vol. 226 (Plenum Press, 1990), pp. 193–214.
[57] P. Pearle, Relativistic collapse model with tachyonic features, Phys. Rev. A 59 (1999) 80–101; arXiv:quant-ph/9902046.
[58] R. Penrose, The Road to Reality (Random House, 2004).
[59] R. Penrose and W. Rindler, Spinors and Space-Time, Volume 1: Two-Spinor Calculus and Relativistic Fields (Cambridge University Press, 1984).
[60] K. R. Popper, Quantum mechanics without “the observer”, in Quantum Theory and Reality, ed. M. Bunge (Springer, 1967), pp. 7–44.
[61] H. Putnam, A philosopher looks at quantum mechanics (again), British J. Philos. Sci. 56 (2005) 615–634.
[62] M. Reed and B. Simon, Methods of Modern Mathematical Physics I: Functional Analysis (Academic Press, 1972).
[63] W. Rudin, Functional Analysis (McGraw-Hill, 1973).
[64] A. Ruschhaupt, A relativistic extension of event-enhanced quantum theory, J. Phys. A 35 (2002) 9227–9243; arXiv:quant-ph/0204079.
[65] E. Schrödinger, Die gegenwärtige Situation in der Quantenmechanik, Naturwissenschaften 23 (1935) 844–849.
[66] S. Teufel and R. Tumulka, Simple proof for global existence of Bohmian trajectories, Commun. Math. Phys. 258 (2005) 349–365; arXiv:math-ph/0406030.
[67] R. Tumulka, Closed 3-forms and random worldlines, Ph.D. thesis, Mathematisches Institut, Ludwig-Maximilians-Universität München (2001); http://edoc.ub.uni-muenchen.de/7/.
[68] R. Tumulka, A relativistic version of the Ghirardi–Rimini–Weber model, J. Statist. Phys. 125 (2006) 821–840; arXiv:quant-ph/0406094.
[69] R. Tumulka, On spontaneous wave function collapse and quantum field theory, Proc. Roy. Soc. A 462 (2006) 1897–1908; arXiv:quant-ph/0508230.
[70] R. Tumulka, Collapse and relativity, in Quantum Mechanics: Are there Quantum Jumps? And On the Present Status of Quantum Mechanics, eds. A. Bassi, D. Dürr, T. Weber and N. Zanghì, AIP Conference Proceedings, Vol. 844 (American Institute of Physics, 2006), pp. 340–352; arXiv:quant-ph/0602208.
[71] R. Tumulka, The ‘unromantic pictures’ of quantum theory, J. Phys. A 40 (2007) 3245–3273; arXiv:quant-ph/0607124.
[72] R. Tumulka, Comment on “the free will theorem”, Found. Phys. 37 (2007) 186–197; arXiv:quant-ph/0611283.
[73] R. Tumulka, A Kolmogorov extension theorem for POVMs, Lett. Math. Phys. 84 (2008) 41–46; arXiv:0710.3605.
[74] K. Yosida, Functional Analysis, Grundlehren der Mathematischen Wissenschaften, Vol. 123, 6th edn. (Springer-Verlag, 1980).
March 10, 2009 17:53 WSPC/148-RMP
J070-00361
Reviews in Mathematical Physics Vol. 21, No. 2 (2009) 229–278 © World Scientific Publishing Company
ON MATHEMATICAL MODELS FOR BOSE–EINSTEIN CONDENSATES IN OPTICAL LATTICES
AMANDINE AFTALION
CNRS, CMAP, UMR 7641, Ecole Polytechnique, 91128 Palaiseau Cedex, France
[email protected]

BERNARD HELFFER
Laboratoire de Mathématiques, Univ Paris-Sud et CNRS, Bat 425, 91 405 Orsay Cedex, France
Bernard.Helff [email protected]

Received 5 May 2008
Revised 10 November 2008

Our aim is to analyze the various energy functionals appearing in the physics literature and describing the behavior of a Bose–Einstein condensate in an optical lattice. We want to justify the use of some reduced models and control the error of approximation. For that purpose, we will use the semi-classical analysis developed for linear problems related to the Schrödinger operator with periodic potential or multiple wells potentials. We justify, in some asymptotic regimes, the reduction to low dimensional problems and analyze the reduced problems.

Keywords: Bose–Einstein condensates; optical lattice; semi-classical analysis.

Mathematics Subject Classification 2000: 35Q55, 35J20
1. Introduction

1.1. The physical motivation for Bose–Einstein condensates in optical lattices

Superfluidity and superconductivity are two spectacular manifestations of quantum mechanics at the macroscopic scale. Among their striking characteristics is the existence of vortices with quantized circulation. The physics of such vortices is of tremendous importance in the field of quantum fluids and extends beyond condensed matter physics. The advantage of ultracold gaseous Bose–Einstein condensates is to allow tests in the laboratory to study various aspects of macroscopic quantum physics. There is a large body of research, experimental, theoretical and mathematical, on vortices in Bose–Einstein condensates [28, 29, 2, 24]. Current physical interest is in the investigation of very small atomic assemblies, for which
one would have one vortex per particle, which is a challenge in terms of detection and signal analysis. An appealing option consists in parallelizing the study, by producing simultaneously a large number of micro-BECs rotating at the various nodes of an optical lattice [33]. Experiments are under way. A major topic is the transition from a Mott insulator phase to a superfluid phase. We refer to the paper of Zwerger [37] and the references therein for more details.

Our framework of study will be in the mean field regime where the condensate can be described by a Gross–Pitaevskii type energy with a term modeling the optical lattice potential. The mean field description of a condensate by the Gross–Pitaevskii energy has been derived as the limit of the Hamiltonian for $N$ bosons, when $N$ tends to infinity [25, 23], in the case of a condensate without optical lattice. The scattering length $a_N$ of the interaction in the $N$-body problem is such that $N a_N \to g$. The rigorous derivation in the case of an optical lattice where there are fewer atoms per site is nevertheless open.

In a one-dimensional optical lattice, the condensate splits into a stack of weakly coupled disk-shaped condensates, which leads to some intriguing analogues with high-$T_c$ superconductors due to their similar layered structure [34, 35, 22, 7–9]. Our aim, in this paper, is to address mathematical models that describe a BEC in an optical lattice. Related models have been analyzed in [3] with Gamma convergence techniques. The theory which we will develop is inspired by a series of physics papers [33–35, 22, 36]. We want to justify their reduction to simpler energy functionals in certain regimes of parameters and in particular understand the ground state energy. This relies on cases where the problem becomes almost linear in some direction.

The ground state energy of a rotating Bose–Einstein condensate is given by the minimization of
\[ Q_\Omega(\Psi) := \int_{\mathbb{R}^3} \Bigl( \frac{1}{2} |\nabla\Psi - i\Omega \times \mathbf{r}\,\Psi|^2 - \frac{1}{2}\Omega^2 r^2 |\Psi|^2 + (V(\mathbf{r}) + W(z)) |\Psi|^2 + g|\Psi|^4 \Bigr)\, dx\,dy\,dz, \tag{1.1} \]
under the constraint
\[ \int_{\mathbb{R}^3} |\Psi(x, y, z)|^2\, dx\,dy\,dz = 1, \tag{1.2} \]
where
• $r^2 = x^2 + y^2$, $\mathbf{r} = (x, y, z)$,
• $\Omega \ge 0$ is the rotational velocity along the $z$ axis,
• $\Omega \times \mathbf{r} = \Omega(-y, x, 0)$,
• $g \ge 0$ is the scattering length.

The experimental device leading to the realization of optical lattices requires a trapping potential $V(\mathbf{r})$ given by
\[ V(\mathbf{r}) = \frac{1}{2}\bigl(\omega_\perp^2 r^2 + \omega_z^2 z^2\bigr), \tag{1.3} \]
corresponding to the magnetic trap. We assume that the radial trapping frequency is much larger than the axial trapping frequency, that is
\[ 0 \le \omega_z \ll \omega_\perp. \tag{1.4} \]
We will always assume the condition
\[ 0 \le \Omega < \omega_\perp \tag{1.5} \]
for the existence of a minimizer: the trapping potential has to be stronger than the centrifugal force. The presence of the one-dimensional optical lattice in the $z$ direction is modeled by
\[ W(z) = \frac{1}{\varepsilon^2}\, w(z), \tag{1.6} \]
where $1/\varepsilon^2$ is the lattice depth,$^a$ and $w$ is a positive $T$-periodic function. In the whole paper, we will assume:

Assumption 1.1. The potential $w$ is a $C^\infty$ even, nonnegative function on $\mathbb{R}$ which is $T$-periodic and admits as unique minima the points $kT$ ($k \in \mathbb{Z}$). Moreover these minima are nondegenerate. Thus,
\[ w(z + T) = w(z), \quad w(0) = 0, \quad w''(0) > 0, \quad w(z) > 0 \ \text{ if } z \notin T\mathbb{Z}. \tag{1.7} \]
An example is
\[ w(z) = \sin^2\Bigl(\frac{2\pi z}{\lambda}\Bigr), \tag{1.8} \]
where $\lambda$ is the wavelength of the laser light. The optical potential $W$ creates a one-dimensional lattice of wells separated by a distance $T = \lambda/2$. We will assume that $\varepsilon$ tends to 0 (this means deep lattice) and that $T$ is fixed. Furthermore, we assume that the lattice is deep enough so that it dominates over the magnetic trapping potential in the $z$ direction and that the number of sites is large. Thus we will, in this paper, ignore the magnetic trap in the $z$ direction, and this will correspond to the case
\[ \omega_z = 0. \tag{1.9} \]
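To get a feel for the deep-lattice regime, one can compare the lowest eigenvalue of $-\frac{1}{2}\frac{d^2}{dz^2} + \frac{1}{\varepsilon^2} w(z)$ on a single well with its harmonic approximation. The sketch below is an illustration only, under assumptions that are ours rather than the paper's: the example potential (1.8) with $\lambda = 2$ (so $T = 1$), Dirichlet conditions at $\pm T/2$ instead of periodic ones, `eps` standing for the small parameter in the lattice depth $1/\varepsilon^2$, a second-order finite-difference grid, and a Sturm-sequence bisection for the lowest eigenvalue. Near $z = 0$ one has $w(z) \approx (2\pi/\lambda)^2 z^2$, which gives the harmonic prediction $\sqrt{2}\,\pi/(\lambda\varepsilon)$.

```python
import math

def count_below(diag, off, x):
    """Number of eigenvalues below x of the symmetric tridiagonal matrix with
    diagonal `diag` and constant off-diagonal `off` (Sturm sequence count)."""
    count, q = 0, 1.0
    for i, d in enumerate(diag):
        q = d - x - (off * off / q if i > 0 else 0.0)
        if q == 0.0:
            q = 1e-30                     # avoid a division by zero in the next step
        if q < 0.0:
            count += 1
    return count

def lowest_eigenvalue(diag, off, tol=1e-8):
    """Bisection for the smallest eigenvalue, started from Gershgorin bounds."""
    lo = min(diag) - 2.0 * abs(off)
    hi = max(diag) + 2.0 * abs(off)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) >= 1:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def single_well_ground_energy(eps, wavelength=2.0, points=400):
    """Lowest Dirichlet eigenvalue of -1/2 d^2/dz^2 + sin^2(2 pi z / wavelength) / eps^2
    on one period [-T/2, T/2], T = wavelength / 2, by finite differences."""
    T = wavelength / 2.0
    h = T / (points + 1)
    off = -0.5 / h**2
    diag = []
    for i in range(1, points + 1):
        z = -T / 2.0 + i * h
        w = math.sin(2.0 * math.pi * z / wavelength) ** 2
        diag.append(1.0 / h**2 + w / eps**2)
    return lowest_eigenvalue(diag, off)

def harmonic_prediction(eps, wavelength=2.0):
    """Ground energy omega/2 of the harmonic well with omega = sqrt(2) (2 pi / wavelength) / eps."""
    return math.sqrt(2.0) * math.pi / (wavelength * eps)

ratio_coarse = single_well_ground_energy(0.05) / harmonic_prediction(0.05)
ratio_fine = single_well_ground_energy(0.025) / harmonic_prediction(0.025)
print(ratio_coarse, ratio_fine)
```

In this sketch the ratio of the computed energy to the harmonic value moves toward 1 as `eps` decreases, in line with a lowest eigenvalue of order $1/\varepsilon$ in the deep-lattice limit.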
We will mainly discuss, instead of a problem in $\mathbb{R}^3$, a periodic problem in the $z$ direction, that is in $\mathbb{R}^2_{x,y} \times [-\frac{T}{2}, \frac{T}{2}[$, where $T$ corresponds to the period of the optical lattice, or in $\mathbb{R}^2_{x,y} \times [-\frac{NT}{2}, \frac{NT}{2}[$ for a fixed integer $N \ge 1$. Therefore, we focus on the minimization of the functional
\[ Q^{\mathrm{per},N}_\Omega(\Psi) := \int_{\mathbb{R}^2_{x,y} \times ]-\frac{NT}{2}, \frac{NT}{2}[} \Bigl( \frac{1}{2} |\nabla\Psi - i\Omega \times \mathbf{r}\,\Psi|^2 - \frac{1}{2}\Omega^2 r^2 |\Psi|^2 + (V(\mathbf{r}) + W(z)) |\Psi|^2 + g|\Psi|^4 \Bigr)\, dx\,dy\,dz, \tag{1.10} \]
under the constraint
\[ \int_{\mathbb{R}^2_{x,y} \times ]-\frac{NT}{2}, \frac{NT}{2}[} |\Psi(x, y, z)|^2\, dx\,dy\,dz = 1, \tag{1.11} \]
with
\[ V(\mathbf{r}) = \frac{1}{2}\omega_\perp^2 r^2, \tag{1.12} \]
the potential $W$ given by (1.6), (1.7), and the wave function $\Psi$ satisfying
\[ \Psi(x, y, z + NT) = \Psi(x, y, z). \tag{1.13} \]
This functional has a minimizer in the unit sphere $S^{\mathrm{per},N}_\Omega$ of its natural form domain and we call
\[ E^{\mathrm{per},N}_\Omega = \inf_{\Psi \in S^{\mathrm{per},N}_\Omega} Q^{\mathrm{per},N}_\Omega(\Psi). \tag{1.14} \]
Notation. In the case $N = 1$, we will write more simply
\[ Q^{\mathrm{per}}_\Omega := Q^{\mathrm{per},(N=1)}_\Omega, \quad E^{\mathrm{per}}_\Omega := E^{\mathrm{per},(N=1)}_\Omega. \tag{1.15} \]

$^a$ Called $V_z$ in [33].
When $\Omega = 0$, we will sometimes omit the reference to $\Omega$.

Our aim is to justify that the ground state energy can be well approximated by the study of simpler models introduced in physics papers [33, 34, 22] and to measure the error made in the approximation. For that purpose, we will describe how, in certain regimes, the semi-classical analysis developed for linear problems related to the Schrödinger operator with periodic potential or multiple wells potentials is relevant: Outassourt [27], Helffer–Sjöstrand [18, 15] or for an alternative approach [31].

1.2. The linear model

The linear model which appears naturally is a selfadjoint realization associated with the differential operator
\[ H_\Omega = H^\Omega_\perp + H_z, \tag{1.16} \]
with
\[ H^\Omega_\perp := -\frac{1}{2}\Delta_{x,y} + \frac{1}{2}\omega_\perp^2 r^2 - \Omega L_z, \tag{1.17} \]
\[ L_z = i(x\partial_y - y\partial_x), \tag{1.18} \]
J070-00361
Mathematical Models for Bose–Einstein Condensates in Optical Lattices
233
and
\[ H_z := -\frac{1}{2}\frac{d^2}{dz^2} + W(z). \tag{1.19} \]
In the transverse direction, we will consider the unique natural selfadjoint extension in $L^2(\mathbb{R}^2_{x,y})$ of the positive operator $H^\Omega_\perp$ by keeping the same notation. In the longitudinal direction, we will consider specific realizations of $H_z$ and in particular the $T$-periodic problem or more generally the $(NT)$-periodic problem attached to $H_z$ which will be denoted by $H_z^{\mathrm{per}}$ and $H_z^{\mathrm{per},N}$ and we keep the notation $H_z$ for the problem on the whole line. So our model will be the self-adjoint operator
\[ H^{\mathrm{per},N}_\Omega = H^\Omega_\perp + H_z^{\mathrm{per},N}. \tag{1.20} \]
In this situation with separate variables, we can split the spectral analysis, the spectrum of $H^{\mathrm{per},N}_\Omega$ being the closed set
\[ \sigma(H^{\mathrm{per},N}_\Omega) := \sigma(H^\Omega_\perp) + \sigma(H_z^{\mathrm{per},N}). \tag{1.21} \]
The first operator $H^\Omega_\perp$ is a harmonic oscillator with discrete spectrum. Under Condition (1.5), the bottom of its spectrum is given by
\[ \lambda^\perp_1 := \inf(\sigma(H^\Omega_\perp)) = \omega_\perp. \tag{1.22} \]
A corresponding ground state is the Gaussian
\[ \psi_\perp = \Bigl(\frac{\omega_\perp}{\pi}\Bigr)^{1/2} \exp\Bigl(-\frac{\omega_\perp}{2} r^2\Bigr). \tag{1.23} \]
Note that the ground state energy and the ground state are independent of $\Omega$. The gap between the ground state energy and the second eigenvalue (which has multiplicity 1 or 2) is given by
\[ \delta_\perp := \lambda^\perp_{2,\Omega} - \lambda^\perp_1 = \omega_\perp - \Omega. \tag{1.24} \]
The properties of the periodic Hamiltonian $H_z^{\mathrm{per},N}$, which will be described in Sec. 3.2 (formulas (3.8) and (3.9) for the physical model), depend on the value of $N$. In the case $N = 1$, we call $\phi_1(z)$ the ground state of $H_z^{\mathrm{per}}$ and $\lambda_{1,z}$ the ground energy (or lowest eigenvalue). In the semi-classical regime $\varepsilon \to 0$, $\lambda_{1,z}$ satisfies
\[ \lambda_{1,z} \sim \frac{c}{\varepsilon}, \tag{1.25} \]
for some $c > 0$. The splitting $\delta_z$ between the ground state energy and the first excited eigenvalue satisfies
\[ \delta_z \sim \frac{\tilde{c}}{\varepsilon}, \tag{1.26} \]
for some $\tilde{c} > 0$. For $N > 1$, the ground state energy of $H_z^{\mathrm{per},N}$ is unchanged and the corresponding ground state $\phi^N_1$ is the periodic extension of $\phi_1$ considered as an $(NT)$-periodic function. More precisely, in order to have the $L^2$-normalizations, the relation is
\[ \phi^N_1 = \frac{1}{\sqrt{N}}\, \phi_1, \tag{1.27} \]
on the line. But we now have $N$ exponentially close eigenvalues of the order of $\lambda_{1,z}$ lying in the first band of the spectrum of $H_z$ on the whole line. They are separated from the $(N + 1)$th by a splitting $\delta_z^N$ which satisfies
\[ \delta_z^N = \delta_z + \tilde{O}(\exp(-S/\varepsilon)). \tag{1.28} \]
Here the notation $\tilde{O}(\exp(-S/\varepsilon))$ means
\[ \tilde{O}(\exp(-S/\varepsilon)) = O(\exp(-S'/\varepsilon)), \quad \forall S' < S. \tag{1.29} \]
The first $N$ eigenfunctions satisfy
\[ \phi^N_\ell(z + T) = \exp\Bigl(\frac{2i\pi(\ell - 1)}{N}\Bigr)\, \phi^N_\ell(z), \quad \text{for } \ell = 1, \ldots, N, \tag{1.30} \]
corresponding to the special values $k = \frac{2\pi(\ell - 1)}{NT}$ of what will be called later a $k$-Floquet condition. We will also use another real orthonormal basis (called $(NT)$-periodic Wannier functions basis) $(\psi^N_j)$ ($j = 0, \ldots, N - 1$) of the spectral space attached to the first $N$ eigenvalues. Each of these $(NT)$-periodic functions has the advantage of being localized (as $\varepsilon \to 0$) in a specific well of $W$ considered as defined on $\mathbb{R}/(NT)\mathbb{Z}$.

1.3. The reduced functionals

We want to prove the reduction to lower dimensional functionals by using the spectral analysis of the linear problem. There are two natural ideas to compute upper bounds, and thus find these functionals. We can
• either use test functions of the type
\[ \Psi(x, y, z) = \phi(z)\, \psi_\perp(x, y), \tag{1.31} \]
where $\psi_\perp$ is the first normalized eigenfunction of $H^\Omega_\perp$, and minimize among all possible $L^2$-normalized $\phi(z)$ to obtain a 1D-longitudinal reduced problem,
• or use
— in the case $N = 1$,
\[ \Psi(x, y, z) = \phi_1(z)\, \psi(x, y), \tag{1.32} \]
where $\phi_1$ is the first eigenfunction of $H_z^{\mathrm{per}}$, and minimize among all possible $L^2$-normalized $\psi(x, y)$ to obtain a 2D transverse reduced problem,
— or in the case $N \ge 1$,
\[ \Psi(x, y, z) = \sum_{j=0}^{N-1} \psi^N_j(z)\, \psi_{j,\perp}(x, y), \tag{1.33} \]
where $\psi^N_j(z)$ is the orthonormal basis of Wannier functions mentioned above, and minimize on the suitably normalized $\psi_{j,\perp}$'s which provide $N$
March 10, 2009 17:53 WSPC/148-RMP
J070-00361
Mathematical Models for Bose–Einstein Condensates in Optical Lattices
235
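coupled problems. We denote by $\Pi_N$ the projection on this space. For $\Psi \in L^2(\mathbb{R}^2 \times ]-\frac{NT}{2}, \frac{NT}{2}[)$, we have
\[ \Pi_N \Psi = \sum_{j=0}^{N-1} \psi_j^N(z)\,\psi_{j,\perp}(x, y), \tag{1.34} \]
with
\[ \psi_{j,\perp}(x, y) = \int_{]-\frac{NT}{2}, \frac{NT}{2}[} \Psi(x, y, z)\,\psi_j^N(z)\,dz. \]
Computing the energy of a test function of type (1.31), we get
\[ Q_\Omega^{per,N}(\Psi) = \omega_\perp + E_A^N(\phi) \tag{1.35} \]
where $E_A^N$ is the functional on the $NT$-periodic functions in the $z$ direction, defined on $H^1(\mathbb{R}/NT\mathbb{Z})$ by
\[ \phi \mapsto E_A^N(\phi) = \int_{-\frac{NT}{2}}^{\frac{NT}{2}} \Big( \frac{1}{2}|\phi'(z)|^2 + W_\varepsilon(z)|\phi(z)|^2 + \hat g\,|\phi(z)|^4 \Big)\,dz \tag{1.36} \]
with
\[ \hat g := g \int_{\mathbb{R}^2} |\psi_\perp(x, y)|^4\,dx\,dy = \frac{1}{2\pi}\,g\,\omega_\perp. \tag{1.37} \]
The functional $E_A^N$ is introduced by [22] who analyze a particular case. Its study in the small $\varepsilon$ limit is one of the aims of this paper.

For test functions of type (1.32), we get in the case $N = 1$
\[ Q_\Omega^{per}(\Psi) = \lambda_{1,z} + E_{B,\Omega}(\psi) \tag{1.38} \]
with
\[ E_{B,\Omega}(\psi) := \int_{\mathbb{R}^2_{x,y}} \Big( \frac{1}{2}|\nabla_{x,y}\psi - i\Omega \times r\,\psi|^2 - \frac{1}{2}\Omega^2 r^2|\psi|^2 + \frac{1}{2}\omega_\perp^2 (x^2 + y^2)|\psi|^2 + \tilde g\,|\psi|^4 \Big)\,dx\,dy, \tag{1.39} \]
and
\[ \tilde g := g \int_{-\frac{T}{2}}^{\frac{T}{2}} |\phi_1(z)|^4\,dz. \tag{1.40} \]
In the case $N > 1$, we define $E_{B,\Omega}^N((\psi_{j,\perp})_{j=0,\dots,N-1})$ by
\[ Q_\Omega^{per,N}(\Psi) := \lambda_{1,z} \sum_j \|\psi_{j,\perp}\|^2 + E_{B,\Omega}^N((\psi_{j,\perp})) \tag{1.41} \]
with
\[ \Psi = \sum_{j=0}^{N-1} \psi_j^N(z)\,\psi_{j,\perp}(x, y). \tag{1.42} \]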
Of course when minimizing over normalized $\Psi$'s, one gets more simply the problem of minimizing
\[ Q_\Omega^{per,N}(\Psi) = \lambda_{1,z} + E_{B,\Omega}^N((\psi_{j,\perp})). \tag{1.43} \]
As such, the energy $E_{B,\Omega}^N$ does not provide $N$ coupled problems but one single energy depending on $N$ test functions. Nevertheless, in the small $\varepsilon$ limit, the Wannier functions are localized in each well. Thus each function $\psi_{j,\perp}$ only interacts with its nearest neighbors and this simplification provides $N$ coupled problems, as suggested by Snoek [33] on the basis of formal computations. We will analyze their validity. This reduced functional is somehow related to the Lawrence–Doniach model for superconductors (see [7, 8]).

1.4. Main results

1.4.1. The reference quantities: $m_A^N$ and $m_{B,\Omega}^N$

We are able to justify the reductions to the lower dimensional functionals $E_A^N$ and $E_{B,\Omega}^N$ when their infimum is much smaller than the gap between the first two excited states of the linear problem in the other direction, namely in case A, when $m_A^N$ is much smaller than $\delta_\perp$, where
\[ m_A^N = \inf_{\|\phi\| = 1} E_A^N(\phi), \tag{1.44} \]
and in case B, when $m_{B,\Omega}^N$ is much smaller than the gap between the two first bands of the periodic problem on the line, where
\[ m_{B,\Omega}^N = \inf_{\sum_j \|\psi_{j,\perp}\|^2 = 1} E_{B,\Omega}^N((\psi_{j,\perp})). \tag{1.45} \]
We will also give more accurate estimates of $m_A^N$ and $m_{B,\Omega}^N$ according to the regime of parameters. Here we consider two cases:

• the Weak Interaction case, where the interaction term ($L^4$ term) is at most of the same order as the ground state of the linear problem in the same direction;
• the Thomas–Fermi case, where the kinetic energy term is much smaller than the potential and interaction terms.

In what follows, when $N$ is not mentioned in $m_A^N$, $m_{B,\Omega}^N$, $E_A^N$, $E_{B,\Omega}^N$, then the notations are for $N = 1$. Similarly, if $\Omega$ is not mentioned, this means that either the considered quantity is independent of $\Omega$ or that we are treating the case $\Omega = 0$. To mention the dependence on other parameters, we will sometimes explicitly write this dependence like for example $m_A^N(\varepsilon, \hat g)$ or $m_{B,\Omega}^N(\tilde g, \omega_\perp)$.
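1.4.2. Universal estimates and applications

Using the test function $\Psi^{per,N}(x, y, z) = \psi_\perp(x, y)\,\phi_1^N(z)$, where $\phi_1^N$ is the normalized ground state introduced in (1.27) and $\psi_\perp(x, y)$ is the ground state of $H_\perp^\Omega$, actually independent of $\Omega$, leads to the following trivial and universal inequalities (which are valid for any $N$ and any $\Omega$ such that $0 \le \Omega < \omega_\perp$)
\[ \lambda_{1,z} + \omega_\perp \le E_\Omega^{per,N} \le \lambda_{1,z} + \omega_\perp + I_N, \tag{1.46} \]
where
\[ I_N := \frac{g\,\omega_\perp}{2N\pi} \int_{-\frac{T}{2}}^{\frac{T}{2}} |\phi_1(z)|^4\,dz = \frac{I}{N}. \tag{1.47} \]
From (1.27), we have:
\[ \int_{-\frac{NT}{2}}^{\frac{NT}{2}} \big(\phi_1^N(z)\big)^4\,dz = \frac{1}{N^2} \int_{-\frac{NT}{2}}^{\frac{NT}{2}} \phi_1(z)^4\,dz = \frac{1}{N} \int_{-\frac{T}{2}}^{\frac{T}{2}} \phi_1(z)^4\,dz. \tag{1.48} \]
Moreover, as $\varepsilon \to 0$ and under Assumption (1.7), it can be proved (see (3.10)) that
\[ I_N \sim c_4\,\frac{g\,\omega_\perp}{2\pi N}\,\varepsilon^{-\frac{1}{2}}. \tag{1.49} \]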
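The scaling in (1.48) can be checked numerically. The sketch below is illustrative only: it does not compute the true ground state $\phi_1$, but replaces it by an arbitrary positive $T$-periodic profile (the hypothetical function `profile`, an assumption made here), normalizes it on one period, and compares the two sides of (1.48). The values of `T` and `N` are arbitrary.

```python
import math

T = 1.0   # period of the lattice (illustrative value)
N = 4     # number of periods in the (NT)-periodic problem

def profile(z):
    # Hypothetical T-periodic positive profile standing in for phi_1.
    return math.exp(math.cos(2.0 * math.pi * z / T))

def integrate(f, a, b, n=20000):
    # Midpoint rule on [a, b] with n cells.
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# L2-normalize the stand-in for phi_1 on one period ]-T/2, T/2[.
norm1 = math.sqrt(integrate(lambda z: profile(z) ** 2, -T / 2, T / 2))

def phi1(z):
    return profile(z) / norm1

def phi1N(z):
    # (1.27): phi_1^N = phi_1 / sqrt(N), normalized on ]-NT/2, NT/2[.
    return phi1(z) / math.sqrt(N)

l2_norm_N = integrate(lambda z: phi1N(z) ** 2, -N * T / 2, N * T / 2)
lhs = integrate(lambda z: phi1N(z) ** 4, -N * T / 2, N * T / 2)
rhs = integrate(lambda z: phi1(z) ** 4, -T / 2, T / 2) / N
```

Here `l2_norm_N` should be close to 1 and `lhs` close to `rhs`, which is the factor $1/N$ appearing in $I_N = I/N$.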
An immediate analysis shows that $\lambda_{1,z} + \omega_\perp$ is a good asymptotic of $E_\Omega^{per,N}$ in the limit $\varepsilon \to 0$ when $g$ is sufficiently small (what we can call the quasi-linear situation). More precisely, we have

Theorem 1.2. Under the condition that either
\[ \text{(QLa)} \qquad g \ll \varepsilon^{\frac{1}{2}}, \tag{1.50} \]
or
\[ \text{(QLb)} \qquad g\,\omega_\perp\,\varepsilon^{\frac{1}{2}} \ll 1, \tag{1.51} \]
then we have
\[ E_\Omega^{per,N} = (\lambda_{1,z} + \omega_\perp)\,(1 + o(1)), \tag{1.52} \]
as $\varepsilon$ tends to 0.

Each of these conditions implies indeed that $I_N$ is small relatively to $\lambda_z$ or to $\omega_\perp$. Our main goal is to have more accurate estimates than (1.52), to analyze more general cases when none of these two conditions is satisfied and to give natural sufficient conditions allowing the analysis of reduced models.
1.4.3. Case (A): The longitudinal model

We consider states which are of type (1.31) with $\phi \in L^2(\mathbb{R}_z/(NT)\mathbb{Z})$. The energy of such test functions provides the upper bound
\[ E_\Omega^{per,N} \le \omega_\perp + m_A^N(\varepsilon, \hat g), \tag{1.53} \]
where $m_A^N$ is given by (1.44) and $\hat g$ was introduced in (1.37). In order to show that the upper bound is an approximate lower bound, we first address the "Weak Interaction" case,
\[ \text{(AWIa)} \qquad 1 \ll \varepsilon\,(\omega_\perp - \Omega), \tag{1.54} \]
and, for a given $c > 0$,
\[ \text{(AWIb)} \qquad g\,\omega_\perp\,\varepsilon^{\frac{1}{2}} \le c. \tag{1.55} \]
The first assumption implies that the lowest eigenvalue $\lambda_{1,z}$ of the linear problem in the $z$ direction (having in mind (1.25)) is much smaller than the gap in the transverse direction $\delta_\perp = \omega_\perp - \Omega$. This will allow the projection onto the subspace $\psi_\perp \otimes L^2(\mathbb{R}_z/(NT)\mathbb{Z})$. The second assumption implies that the nonlinear term (of order $g\,\omega_\perp/\sqrt{\varepsilon}$) is of the same order as $\lambda_{1,z}$. It implies, using (1.25), (1.49) and the universal estimate
\[ \lambda_{1,z} \le m_A^N \le \lambda_{1,z} + I_N, \tag{1.56} \]
that
\[ m_A^N \approx \frac{1}{\varepsilon}. \tag{1.57} \]
Here $\approx$ means "of the same order" in the considered regime of parameters. More precisely we mean by writing (1.57) that, for any $\varepsilon_0 > 0$, there exists $C > 0$ such that, for all $\varepsilon \in ]0, \varepsilon_0]$, any $g$, $\omega_\perp$ satisfying (1.55),
\[ \frac{1}{C\varepsilon} \le m_A^N \le \frac{C}{\varepsilon}. \]
Note that most of the time, we will not control the constant with respect to $N$. All these rough estimates are obtained by rather elementary semi-classical methods which are recalled in Sec. 3. More precise asymptotics of $m_A^N$ will be given under the additional Assumption (1.50) in Sec. 5.2. Thus, by (1.54), $m_A^N$ is much smaller than $\delta_\perp$. We will prove:

Theorem 1.3. When $\varepsilon$ tends to 0, and under Conditions (1.54) and (1.55), we have
\[ E_\Omega^{per,N} = \omega_\perp + m_A^N(\varepsilon, \hat g)\,(1 + o(1)). \tag{1.58} \]
We now describe the "Thomas–Fermi" regime, where we can also justify the reduction to the longitudinal model. We assume that, for some given $c > 0$,
\[ \text{(ATFa)} \qquad g\,\omega_\perp \sqrt{\varepsilon} \gg 1, \tag{1.59} \]
\[ \text{(ATFb)} \qquad g\,\omega_\perp\,\varepsilon^{2} \le c, \tag{1.60} \]
\[ \text{(ATFc)} \qquad g^{\frac{5}{12}}\,\varepsilon^{-\frac{1}{6}}\,\omega_\perp^{\frac{5}{12}} \ll (\omega_\perp - \Omega)^{\frac{3}{8}}. \tag{1.61} \]
Note that (1.59) is the converse of (1.55) while (1.59) and (1.61) imply that $1 \ll \varepsilon\,(\omega_\perp - \Omega)$. This implies $\lambda_{1,z} \ll \delta_\perp$, which is the main condition to reduce to case A. Assumptions (1.59) and (1.60) allow to show that:
\[ m_A^N \approx \Big(\frac{g\,\omega_\perp}{\varepsilon}\Big)^{\frac{2}{3}}, \tag{1.62} \]
and this also implies that the nonlinear term is much bigger than $\delta_z$. The estimate (1.62) will be shown in Sec. 5.3, together with more precise ones with stronger hypotheses (see assumptions (5.12) and (5.13)).

Theorem 1.4. When $\varepsilon$ tends to 0, and under Conditions (1.59)–(1.61), we have
\[ E_\Omega^{per,N} = \omega_\perp + m_A^N(\varepsilon, \hat g)\,(1 + o(1)). \tag{1.63} \]
The proofs give actually much stronger results.
1.4.4. Case (B): The transverse model

This corresponds to the idea of a reduction on the ground eigenspace in the $z$ variable, where the interaction term is kept in the transverse problem: therefore, this is a regime where $\varepsilon\,\omega_\perp \ll 1$. We recall that we denote by $\lambda_{1,z}$ the ($N$-independent) ground state energy of $H_z^{per,N}$ and by $\phi_1^N$ the normalized ground state. We consider states which are of type (1.32) or (1.33). We have defined $E_{B,\Omega}^N$ by (1.41), (1.42) and $m_{B,\Omega}^N$, the infimum of the energy of such test functions, by (1.45). We have the upper bound
\[ E_\Omega^{per,N} \le \lambda_{1,z} + m_{B,\Omega}^N. \tag{1.64} \]
When $N = 1$, $m_{B,\Omega}$ is a function of $\tilde g$ and $\omega_\perp$ as it is clear from (1.39) and (1.45). Note that, as for the estimate of $I_N$, we get
\[ \tilde g = g \int_{-\frac{T}{2}}^{\frac{T}{2}} \phi_1(z)^4\,dz \approx \frac{g}{\sqrt{\varepsilon}}. \tag{1.65} \]
Again we can discuss two different cases according to the size of the interaction. In the Weak Interaction case, we prove the following:

Theorem 1.5. When $\varepsilon$ tends to 0, and under the conditions
\[ \text{(BWIa)} \qquad g\,\varepsilon^{-\frac{1}{2}} \le C, \tag{1.66} \]
\[ \text{(BWIb)} \qquad \varepsilon\,\omega_\perp \ll 1, \tag{1.67} \]
then
\[ E_\Omega^{per,N} = \lambda_{1,z} + m_{B,\Omega}^N\,(1 + o(1)). \tag{1.68} \]
Condition (BWIb) implies that the bottom of the spectrum of the linear problem in the $x$–$y$ direction is much smaller than $\delta_z$, the gap in the $z$ direction, which is of order $1/\varepsilon$. Condition (1.66), together with (1.46) and (1.49), implies that $m_{B,\Omega}^N$ satisfies
\[ m_{B,\Omega}^N \approx \omega_\perp. \tag{1.69} \]
Indeed, (BWIa) and (BWIb) imply $g\,\varepsilon^{\frac{1}{2}}\,\omega_\perp \ll 1$, that is (QLb).

In the Thomas–Fermi case, we prove the following:

Theorem 1.6. When $\varepsilon$ tends to 0, and under the conditions
\[ \text{(BTFa)} \qquad \sqrt{\varepsilon} \ll g, \tag{1.70} \]
\[ \text{(BTFb)} \qquad \omega_\perp \sqrt{g}\,\varepsilon^{\frac{3}{4}} \ll 1, \tag{1.71} \]
and
\[ \text{(BTFc)} \qquad g^{\frac{3}{2}}\,\varepsilon^{\frac{1}{4}}\,\omega_\perp \ll 1, \tag{1.72} \]
then
\[ E_\Omega^{per,N} = \lambda_{1,z} + m_{B,\Omega}^N\,(1 + o(1)). \tag{1.73} \]
Note that (BTFa) is the converse of (BWIa). We will see in Proposition 6.6 (together with (6.31), (6.43) and (6.44)) that, under these assumptions and Assumption (6.42), the term $m_{B,\Omega}^N$ satisfies
\[ m_{B,\Omega}^N \approx \omega_\perp \sqrt{g}/\varepsilon^{1/4}, \tag{1.74} \]
and thus is much smaller than $\delta_z^N$ which is of order $\frac{1}{\varepsilon}$.

Our proofs are made up of two parts: rough or accurate estimates of $m_A^N$ and $m_{B,\Omega}^N$ on the one hand and a lower bound for $E_\Omega^{per,N}$ on the other hand. The lower bound consists in showing that the upper bound obtained by projecting on the special states introduced above in (1.31), (1.32) or (1.33) is actually also asymptotically a good lower bound.
1.4.5. Tunneling effect and discrete models

Since the Wannier functions are localized in the $z$ variable, the energy of a function $\Psi = \sum_{j=0}^{N-1} \psi_j^N(z)\,\psi_{j,\perp}(x, y)$ provides at leading order the sum of $N$ decoupled energies for $\psi_{j,\perp}$ on each slice $j$. At the next order, in the computation of the $L^2$ norm of the gradient, only the nearest neighbors in $z$ interact through an exponentially small term, describing what is called the tunneling effect. These simplifications are discussed in Sec. 7.

In case A, the behavior on each slice $j$ is the same, given by $\psi_\perp$, and it is the behavior in the $z$ direction which has a tunneling contribution. There are no vortices whatever the velocity $\Omega$. In case B, for $N = 1$, there are vortices for large velocity and they are located on each slice at the same place. For $N$ large, it is an open and interesting question to analyze whether it is possible for a vortex line to vary location according to the slice, whether vortices interact between the slices and how. This could be performed using our reduced models.

1.4.6. Comparison with the global problem on $\mathbb{R}^3$

To conclude with the presentation of the main results, let us observe that, if we denote by $E_\Omega(g)$ the infimum of $Q_{\Omega,g}$ introduced in (1.1) over $L^2(\mathbb{R}^3)$-normalized $\Psi$'s, then, for all $g \ge 0$, all $0 \le \Omega < \omega_\perp$,
\[ E_{\Omega=0}(g) = E_\Omega(g) = E_\Omega(g = 0) = E_\Omega^{per}(g = 0). \tag{1.75} \]
Hence, if we look at the Bose–Einstein functional on $\mathbb{R}^3$, the infimum of the functional restricted to $L^2$-normalized states is independent of $g \ge 0$ and $\Omega$ and is immediately obtained by the ground state energy of the Hamiltonian attached to the case $g = 0$ and $\Omega = 0$. This explains why, following the physicists, we have considered the $(NT)$-periodic problem, which exhibits more interesting properties.

1.5. Organization of the paper

The paper is organized as follows. In Sec. 2, we start the spectral analysis of the linear problems in the longitudinal and transverse directions. We recall in particular the main techniques which can be used for the analysis of the spectral problem with periodic potential on the line. Section 3 is devoted to the semi-classical results for the periodic problem. Although we are mainly interested in 1D-problems we recall here techniques which are true in any dimension and can be useful for the analysis of 2D or 3D optical lattices, at least when $\Omega = 0$. In Sec. 4, we prove the main theorems for case A. In Sec. 5, we analyze the ground state of the 1D nonlinear energy $E_A^N$ for $N = 1$ and $N > 1$ and also distinguish between the two cases: Weak Interaction and Thomas–Fermi. Section 6 corresponds to a similar analysis for the transverse models $E_B^N$. Section 7 is devoted to the tunneling effects and discusses, on the basis of the semi-classical estimates of
Sec. 3, some results obtained by physicists on the discrete nonlinear Schrödinger model.

2. Analysis of the Linear Model

The linear model which appears naturally is associated to
\[ H^\Omega = H_\perp^\Omega + H_z, \]
which was presented in the introduction (see (1.17)–(1.21)). A natural condition (for the strict positivity of the operator $H_\perp^\Omega$) is Condition (1.5). In this situation with separate variables, we can split the spectral analysis in the separate spectral analysis of $H_\perp^\Omega$, whose main properties were recalled in the introduction, and the spectral analysis of a suitable realization of $H_z$ which will be presented in the next subsection.

There are two related approaches that we will describe for the analysis of the spectrum of $H_z$, which is known to be a band spectrum, i.e. an absolutely continuous spectrum which is a union of closed intervals, which are called the bands. We will then give a specific treatment of the $(NT)$-periodic problem.

2.1. Floquet's theory

We can first use the Floquet theory (or the Bloch theory, which is an alternative name for the same theory, see for example [15] for a short presentation). One can show that the spectrum of $H_z$ is obtained by taking the closure of $\bigcup_{k \in [0, 2\pi/T]} \sigma(H_{z,k})$ where
\[ H_{z,k} = -\frac{1}{2}\Big(\frac{d}{dz} + ik\Big)^2 + W_\varepsilon(z) \]
is considered as an operator on $L^2(\mathbb{R}/T\mathbb{Z})$. So
\[ \sigma(H_z) = \overline{\bigcup_{k \in [0, \frac{2\pi}{T}]} \sigma(H_{z,k})}. \tag{2.1} \]
We now write
\[ \Gamma = T\mathbb{Z} \quad \text{and} \quad \Gamma^* = \frac{2\pi}{T}\mathbb{Z}. \tag{2.2} \]
Hence we have to analyze for each $k$ the operator $H_{z,k}$ on $L^2(\mathbb{R}/\Gamma)$. Later we will use the notation
\[ H_z^{per} = H_{z,0}. \tag{2.3} \]
A unitary equivalent presentation of this approach consists in analyzing $H_z$ restricted to the subspace $h_k$ of the $u \in L^2_{loc}(\mathbb{R})$ such that
\[ u(z + T) = e^{ikT} u(z). \tag{2.4} \]
Here we did not see a $k$-dependence in the differential operator but this is the choice of the space $h_k$ (which is NOT in $L^2(\mathbb{R})$), which gives the $k$-dependence. Condition (2.4) is called a Floquet condition.
This means that we have written, using the language of the Hilbertian-integrals, the decomposition
\[ L^2(\mathbb{R}) = \int^{\oplus}_{[0, 2\pi/T]} h_k\,dk \tag{2.5} \]
and that we have for the operator the corresponding decomposition
\[ H_z = \int^{\oplus}_{[0, 2\pi/T]} \widetilde{H}_{z,k}\,dk, \tag{2.6} \]
with $\widetilde{H}_{z,k}$ unitary equivalent to $H_{z,k}$.

For each $k \in [0, 2\pi/T[$, $H_{z,k}$ has a discrete spectrum which can be described by an increasing sequence of eigenvalues $(\lambda_j(k))_{j \in \mathbb{N}}$. The spectrum of $H_z$ is then a union of bands $B_j$, each band being described by the range of $\lambda_j$. At least when we have the additional symmetry $W$ even, one can determine for which value of $k$ the ends of the band $B_j$ are obtained. For $j = 1$, we know in addition from the diamagnetic inequality that the minimum of $\lambda_1$ is obtained for $k = 0$:
\[ \inf_k \lambda_1(k) = \lambda_1(0). \tag{2.7} \]

2.2. Wannier's approach

When the band is simple (and this will be the case for the lowest band in the regime $\varepsilon$ small), one can associate to $\lambda_j(k)$ a normalized$^{b}$ eigenfunction $\varphi_j(z, k)$ with in addition an analyticity with respect to $k$ together with the $(2\pi/T)$-periodicity in $k$. In this case (we now take $j = 1$), one can associate to $\varphi_1$, which satisfies
\[ \varphi_1(z + T; k) = \varphi_1(z, k), \tag{2.8} \]
and
\[ \varphi_1\Big(z; k + \frac{2\pi}{T}\Big) = \varphi_1(z, k), \tag{2.9} \]
a family of Wannier's functions $(\psi_\ell)_{\ell \in \Gamma}$ defined by
\[ \psi_0(z) = \frac{T}{2\pi} \int_0^{\frac{2\pi}{T}} \exp(ikz)\,\varphi_1(z, k)\,dk, \qquad \psi_\ell(z) = \psi_0(z - \ell), \tag{2.10} \]
for $\ell \in \Gamma$. In addition, we can take $\psi_0$ real. One can indeed construct $\varphi_1$ satisfying in addition the condition
\[ \varphi_1(z, k) = \overline{\varphi_1(z, -k)}. \tag{2.11} \]
One obtains (after some normalization of $\psi_0$) that:

Proposition 2.1.
(i) The family $(\psi_\ell)_{\ell \in \Gamma}$ gives an orthonormal basis of the spectral space attached to the first band.
(ii) $\psi_0$ is an exponentially decreasing function.

$^{b}$ In $L^2(]-\frac{T}{2}, \frac{T}{2}[)$.
The second point can be proved using the analyticity$^{c}$ with respect to $k$. This orthonormal basis corresponding to the first band plays the role of the basis $P_j(z) \exp\big(-\frac{|z|^2}{2}\big)$ in the Lowest Landau Level approximation. Note that we recover $\varphi_1(z, k)$ by the formula
\[ \varphi_1(z, k) = \exp(-ikz) \sum_{\ell \in \Gamma} \exp(ik\ell)\,\psi_\ell(z). \tag{2.12} \]
Moreover, the operator $A$ on $\ell^2(\Gamma)$ whose matrix is given by
\[ A_{\ell\ell'} = \langle H_z \psi_\ell, \psi_{\ell'} \rangle \tag{2.13} \]
is unitary equivalent to the restriction of $H_z$ to the spectral space attached to the first band. One can of course observe that $A$ commutes with the translation on $\ell^2(\Gamma)$, so it is a convolution operator by a sequence $a \in \ell^1(\Gamma)$ (actually in the space of the rapidly decreasing sequences $\mathcal{S}(\Gamma)$),
\[ A_{\ell\ell'} = a(\ell - \ell'), \tag{2.14} \]
which is actually the Fourier series of $k \mapsto \lambda_1(k)$:
\[ \hat\lambda_1 = a, \tag{2.15} \]
where
\[ \hat\lambda_1(\ell) := \frac{T}{2\pi} \int_0^{2\pi/T} \exp(-i\ell k)\,\lambda_1(k)\,dk. \tag{2.16} \]
So we have
\[ (Au)(\ell) = \sum_{\ell' \in \Gamma} a(\ell - \ell')\,u(\ell'), \quad \text{for } u \in \ell^2(\Gamma). \]
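As an illustration of (2.14)–(2.16), the following sketch computes the coefficients $a(\ell)$, $\ell = mT$, by numerical quadrature for a model band function. The cosine band `lambda1` and the values of `a0` and `tau` are assumptions chosen for illustration only; they are not computed from $H_z$.

```python
import cmath
import math

T = 1.0
a0, tau = 3.0, 0.01   # illustrative values, not derived from H_z

def lambda1(k):
    # Model first band (assumed form): a0 - 2*tau*cos(kT).
    return a0 - 2.0 * tau * math.cos(k * T)

def fourier_coefficient(m, n=4096):
    # (2.16) with ell = m*T:
    # (T / 2pi) * int_0^{2pi/T} exp(-i*ell*k) * lambda1(k) dk,
    # approximated by the midpoint rule with n cells.
    h = (2.0 * math.pi / T) / n
    total = sum(
        cmath.exp(-1j * m * T * (j + 0.5) * h) * lambda1((j + 0.5) * h)
        for j in range(n)
    )
    return (T / (2.0 * math.pi)) * h * total

# Coefficients a(mT) for m = -3, ..., 3.
a = {m: fourier_coefficient(m) for m in range(-3, 4)}
```

For this model band only `a[0]`, `a[1]` and `a[-1]` are non-negligible, so the convolution operator $A$ couples each site of $\Gamma$ to its two nearest neighbors.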
$^{c}$ One can make a contour deformation in the integral defining $\psi_0$ in (2.10).

2.3. $(NT)$-periodic problem

There is another way to proceed which is the one we will choose in this paper. We keep $w$ $T$-periodic but look at the $(NT)$-periodic problem and we analyze this problem. The spectrum is discrete but the idea is that we will recover the band spectrum in the limit $N \to +\infty$. If we compare with what we do in the Floquet theory, the analysis of the $(NT)$-periodic problem consists in considering the direct sum of the problems with a Floquet condition corresponding to $k = 0, \frac{2\pi}{NT}, \dots, \frac{2\pi(N-1)}{NT}$.

Note that this decomposition into a direct sum works only for linear problems, so it will be interesting to explore this approach for the nonlinear problem. In this spirit, it can be useful to have an adapted orthonormal basis of the spectral space attached to the first $N$ eigenvalues of the $NT$-periodic problem (which can be identified with the vector space generated by the eigenfunctions corresponding to the $N$ Floquet eigenvalues associated with $k = 0, \frac{2\pi}{NT}, \dots, \frac{2\pi(N-1)}{NT}$).

Our claim is that there exists an orthonormal basis, for the $L^2$-norm on $]-\frac{NT}{2}, \frac{NT}{2}[$, consisting of $(NT)$-periodic functions and replacing the Wannier functions. We write
\[ \psi_0^N(z) = \frac{1}{\sqrt{N}} \sum_{j=1}^{N} \phi_j^N(z), \tag{2.17} \]
where $\phi_j^N$ is an eigenfunction$^{d}$ of the $(NT)$-periodic problem, chosen in such a way that
\[ \phi_j^N(z + T) = \omega_N^{j-1}\,\phi_j^N(z), \tag{2.18} \]
with $\omega_N = \exp(2i\pi/N)$. We can then introduce
\[ \Gamma_N = \Gamma/(NT\mathbb{Z}), \tag{2.19} \]
and define, for $\ell \in \Gamma_N$, the $(NT)$-Wannier functions
\[ \psi_\ell^N(z) = \psi_0^N(z - \ell). \tag{2.20} \]
This gives an orthonormal basis of the eigenspace attached to the first $N$ eigenvalues of the $(NT)$-periodic problem. These first $N$ eigenvalues belong to the previously defined first band.

Note that conversely, we can recover the eigenfunctions $\phi_j^N$ from the $\psi_j^N$ by a discrete Fourier transform. In particular we have
\[ \phi_1^N = \frac{1}{\sqrt{N}} \sum_{j=0}^{N-1} \psi_j^N. \tag{2.21} \]
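The passage between the eigenfunctions $\phi_j^N$ and the $(NT)$-Wannier functions is a finite unitary change of basis. The sketch below checks this at the level of coefficients only: each $\phi_j^N$ is represented by the $j$th coordinate vector of $\mathbb{C}^N$ (an abstraction assumed here, no actual eigenfunction is computed), and the translate of $\psi_0^N$ by $mT$ is expanded using (2.17), (2.18) and (2.20). The value of `N` is arbitrary.

```python
import cmath
import math

N = 5
omega = cmath.exp(2j * math.pi / N)   # omega_N in (2.18)

def wannier_coefficients(m):
    # Coefficients of z -> psi_0^N(z - m*T) in the basis (phi_1^N, ..., phi_N^N):
    # by (2.18), translating by m*T multiplies phi_j^N by omega^{-(j-1)m}.
    return [omega ** (-(j - 1) * m) / math.sqrt(N) for j in range(1, N + 1)]

def inner(u, v):
    # Hermitian inner product on C^N.
    return sum(a * b.conjugate() for a, b in zip(u, v))

# Coefficient vectors of the N Wannier functions and their Gram matrix.
W = [wannier_coefficients(m) for m in range(N)]
gram = [[inner(W[m], W[n]) for n in range(N)] for m in range(N)]

# (2.21): (1/sqrt(N)) * (sum of the Wannier functions) should be phi_1^N,
# i.e. the coordinate vector (1, 0, ..., 0).
recovered = [sum(W[m][j] for m in range(N)) / math.sqrt(N) for j in range(N)]
```

The Gram matrix should be the identity (orthonormality of the Wannier functions) and `recovered` should be the first coordinate vector, which is (2.21).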
Except the fact that these "Wannier" functions are NOT exponentially decreasing at $\infty$ (they are by construction $(NT)$-periodic), one can then play with them in the same way (this corresponds to the replacement of the Fourier series by the finite dimensional one). We then meet the "discrete convolution" on $\ell^2(\Gamma_N)$:
\[ (A_N u)(\ell) = \sum_{\ell' \in \Gamma_N} a_N(\ell - \ell')\,u(\ell'), \quad \text{for } u \in \ell^2(\Gamma_N). \]
Of course $\ell^2(\Gamma_N)$ is nothing else than $\mathbb{C}^N$ with its natural Hermitian structure.

We have presented different techniques to determine the bottom of the spectrum of $H_z$, which all provide the same ground energy. We will now recall more quantitative results based on the so-called semi-classical analysis.

$^{d}$ Note that except in the case $j = 1$, we do not claim that $\phi_j^N$ is the $j$th eigenfunction but this is the first one corresponding to the condition (2.18).
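To make the discrete convolution concrete, the sketch below builds the matrix of $A_N$ from a list of $N$ eigenvalues attached to the Floquet conditions $k = 2\pi(j-1)/(NT)$. The cosine form of the eigenvalues and the numbers `a0`, `tau` are assumptions for illustration; since this list is symmetric under $k \mapsto -k$, the sign convention chosen for the discrete Fourier transform does not matter here.

```python
import cmath
import math

N = 6
a0, tau = 3.0, 0.01   # illustrative values (assumed)
omega = cmath.exp(2j * math.pi / N)

# Assumed eigenvalues of the (NT)-periodic problem in the first band,
# indexed by j = 1, ..., N as in (2.18).
eigenvalues = [
    a0 - 2.0 * tau * math.cos(2.0 * math.pi * (j - 1) / N)
    for j in range(1, N + 1)
]

def a_N(d):
    # Discrete Fourier coefficient a_N(d) = (1/N) * sum_j lambda_j * omega^{(j-1)d}.
    return sum(
        lam * omega ** ((j - 1) * d)
        for j, lam in enumerate(eigenvalues, start=1)
    ) / N

# Matrix of A_N: entry (m, n) depends only on (m - n) mod N (circulant matrix).
A = [[a_N((m - n) % N) for n in range(N)] for m in range(N)]

def apply(matrix, u):
    # Discrete convolution (A_N u)(m) = sum_n a_N(m - n) u(n).
    return [sum(matrix[m][n] * u[n] for n in range(N)) for m in range(N)]

# The constant vector represents phi_1^N, see (2.21).
u = [1.0 / math.sqrt(N)] * N
Au = apply(A, u)
```

With these assumed eigenvalues the matrix has `a0` on the diagonal and `-tau` between nearest neighbors (cyclically), and the constant vector is an eigenvector for the lowest eigenvalue `a0 - 2*tau`.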
3. Semi-Classical Analysis for the Periodic Case

3.1. Preliminary discussion

Till now, we have not strongly used that we are in a semi-classical regime: our semi-classical parameter here will not be the Planck constant (which was already assumed to be equal to 1) but $\varepsilon$. We will now use this additional assumption for presenting quantitative results. The literature in optical lattices is mainly analyzing a very particular model, the Mathieu equation. We will sketch how one can do this in full generality. For the one-dimensional case which is considered here, one can refer to Harrell [17] (who uses techniques of ordinary differential equations) or to the book of Eastham [16], but we will describe a proof which is not limited to the one-dimensional situation (see [31, 19, 27]) and is described in the books of Helffer [18] or Dimassi–Sjöstrand [15].

As we have shown in the previous section, the description of the first band can be either obtained by a good approximation of $\lambda_1(k)$ and $\varphi_1(z, k)$ as $\varepsilon \to 0$ or by first finding a good approximation of the Wannier function $\psi_0$ introduced in (2.10), which is expected to be exponentially localized in one well, or of the $(NT)$-periodic Wannier function introduced in (2.17).

The analysis is done usually in two steps. First we localize roughly $\lambda_1(k)$, then we analyze very accurately the variation of $\lambda_1(k) - \lambda_1(0)$. The first one will be obtained by a harmonic approximation and the second one by the analysis of the tunneling effect.

3.2. The harmonic approximation

We recall that we work under Assumption 1.1. The statements below are sometimes written vaguely and we refer to [15] or [18] for more precise mathematical statements. For the approximation of $\lambda_{1,z}(0)$ (actually for any $\lambda_{1,z}(k)$) the rule is that we replace $w(z)$ (having in mind (1.7)) by its quadratic approximation at 0. The harmonic approximation consists in first looking at the operator
\[ -\frac{1}{2}\frac{d^2}{dz^2} + \frac{w''(0)}{2\varepsilon^2}\,z^2, \tag{3.1} \]
on $\mathbb{R}$. For the model in [33], $w(z) = \sin^2(\frac{\pi z}{T})$, and we find
\[ -\frac{1}{2}\frac{d^2}{dz^2} + \frac{1}{\varepsilon^2}\Big(\frac{\pi z}{T}\Big)^2. \tag{3.2} \]
This operator is a harmonic oscillator whose spectrum is explicitly known. The $j$th eigenvalue is given by
\[ \lambda_{j,z}^{har} = \frac{j - \frac{1}{2}}{\varepsilon}\sqrt{w''(0)}. \tag{3.3} \]
The two main pieces of information we have to keep in mind are that the ground state energy is
\[ \lambda_{1,z}^{har} = \frac{1}{2\varepsilon}\sqrt{w''(0)}, \tag{3.4} \]
and that the gap between the first eigenvalue and the second eigenvalue is given by
\[ \delta_z^{har} := \lambda_{2,z}^{har} - \lambda_{1,z}^{har} = \frac{1}{\varepsilon}\sqrt{w''(0)}. \tag{3.5} \]
The corresponding positive $L^2$-normalized ground state is then given by
\[ \psi^{har}(z) = \pi^{-\frac{1}{4}}\,w''(0)^{\frac{1}{8}}\,\varepsilon^{-\frac{1}{4}} \exp\Big(-w''(0)^{\frac{1}{2}}\,\frac{z^2}{2\varepsilon}\Big). \tag{3.6} \]
It will also be important later to have the computation of the $L^4$ norm. So we get by immediate computation:
\[ \int_{\mathbb{R}} \psi^{har}(z)^4\,dz = (2\pi)^{-\frac{1}{2}}\,w''(0)^{\frac{1}{4}}\,\varepsilon^{-\frac{1}{2}}. \tag{3.7} \]
The mathematical result is that the value (3.4) provides a good approximation of $\lambda_{1,z}(0)$ (and hence of the bottom of the spectrum of $H_z$) with an error which is $O(1)$ as $\varepsilon \to 0$:
\[ \lambda_{1,z}(0) = \lambda_{1,z}^{har} + O(1). \tag{3.8} \]
By working a little more, one can actually obtain a complete expansion of $\lambda_{1,z}(0)$ in powers of $\varepsilon$ and hence, of $\lambda_{1,z}(k)$, since they have the same expansion. For each $j \in \mathbb{N}^*$, one has a similar expansion for $\lambda_{j,z}(0)$. This implies in particular an estimate of $\lambda_{2,z}(0) - \lambda_{1,z}(0)$, called the longitudinal gap:
\[ \delta_z := \lambda_{2,z}(0) - \lambda_{1,z}(0) = \frac{\sqrt{w''(0)}}{\varepsilon} + O(1). \tag{3.9} \]
From now on, we simply write $\lambda_{1,z}$ or $\lambda_1$ instead of $\lambda_{1,z}(0)$ for the ground state energy of the periodic problem.

Let us note that the ground state of the harmonic oscillator also provides a good approximation of the ground state of $H_z^{per}$. So we obtain, using (3.7), that for $\phi_1$, the $L^2$-normalized ground state of $H_z^{per}$, we have
\[ \int_{-\frac{T}{2}}^{+\frac{T}{2}} \phi_1(z)^4\,dz = (2\pi)^{-\frac{1}{2}}\,w''(0)^{\frac{1}{4}}\,\varepsilon^{-\frac{1}{2}} + O(1). \tag{3.10} \]
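The harmonic approximation (3.4)–(3.5) and the bounded remainders in (3.8)–(3.9) can be observed numerically. The following is a minimal sketch under stated assumptions: the Mathieu-type potential $w(z) = \sin^2(\pi z/T)$, a second-order finite-difference discretization on $]-T/2, T/2[$ with Dirichlet conditions at $\pm T/2$ (used here in place of the periodic condition, on the assumption that the low-lying states are concentrated near $z = 0$ so that the boundary condition changes these eigenvalues only slightly), and a Sturm-sequence bisection for the eigenvalues of the resulting tridiagonal matrix. The grid size `n` and the value of `eps` are arbitrary choices.

```python
import math

T, eps, n = 1.0, 0.05, 1000           # illustrative parameters
h = T / (n + 1)
z = [-T / 2 + (i + 1) * h for i in range(n)]

# Discretization of H = -(1/2) d^2/dz^2 + w(z)/eps^2, w(z) = sin^2(pi z / T):
# diagonal entries 1/h^2 + w(z_i)/eps^2, off-diagonal entries -1/(2 h^2).
diag = [1.0 / h**2 + math.sin(math.pi * zi / T) ** 2 / eps**2 for zi in z]
off2 = (0.5 / h**2) ** 2              # square of the off-diagonal entry

def count_below(x):
    # Sturm sequence: number of eigenvalues of the tridiagonal matrix below x.
    count, q = 0, 1.0
    for i, d in enumerate(diag):
        q = d - x - (off2 / q if i > 0 else 0.0)
        if q == 0.0:
            q = -1e-300               # avoid a division by zero at the next step
        if q < 0.0:
            count += 1
    return count

def eigenvalue(k, tol=1e-10):
    # k-th eigenvalue (k = 1 is the lowest), located by bisection
    # between 0 and a Gershgorin upper bound.
    lo, hi = 0.0, max(diag) + 1.0 / h**2
    while hi - lo > tol * max(1.0, abs(hi)):
        mid = 0.5 * (lo + hi)
        if count_below(mid) >= k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

w2 = 2.0 * math.pi**2 / T**2           # w''(0) for this potential
lam_har = math.sqrt(w2) / (2.0 * eps)  # harmonic ground energy (3.4)
gap_har = math.sqrt(w2) / eps          # harmonic gap (3.5)
lam1, lam2 = eigenvalue(1), eigenvalue(2)
```

Comparing `lam1` with `lam_har` and `lam2 - lam1` with `gap_har` for a few decreasing values of `eps` gives a quick check that the differences stay bounded while the harmonic values grow like $1/\varepsilon$.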
3.3. The tunneling effect

We now briefly explain the results about the length of the first band, which is exponentially small as $\varepsilon \to 0$. The results can take the following form (see the work of Outassourt [27] or the book by Dimassi–Sjöstrand, Formula (6.26))
\[ \lambda_1(k) - \lambda_1(0) = 2(1 - \cos(kT))\,\tau + O\Big(\exp\Big(-\frac{S + \alpha}{\varepsilon}\Big)\Big) \tag{3.11} \]
with $\alpha > 0$ (arbitrarily close from below to 1) and, for some $c_\tau \ne 0$,
\[ \tau \sim c_\tau\,\varepsilon^{-\frac{3}{2}} \exp\Big(-\frac{S}{\varepsilon}\Big). \tag{3.12} \]
Moreover one can express the constants $c_\tau$ and $S$ once $w$ is given (see$^{e}$ also [18] in addition to the previous references). This $\tau$ seems to be called in some physical literature the hopping amplitude. Here, we simply explain how one computes $S$ which determines the exponential decay of $\tau$ as $\varepsilon \to 0$. In any dimension, $S$ is interpreted as the minimal Agmon distance between two different minima of the potential $w$. In one dimension, with $w$ satisfying Assumption (1.1), this distance is simply the Agmon distance between two consecutive minima and is given by
\[ S := \sqrt{2} \int_{-\frac{T}{2}}^{\frac{T}{2}} \sqrt{w(z)}\,dz. \tag{3.13} \]
In particular, when $w(z) = \sin^2(\frac{\pi z}{T})$, we get
\[ S := \sqrt{2} \int_{-\frac{T}{2}}^{\frac{T}{2}} \Big|\sin\frac{\pi z}{T}\Big|\,dz = \frac{2\sqrt{2}\,T}{\pi}. \tag{3.14} \]
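The closed form (3.14) is easy to confirm by quadrature of (3.13). The sketch below is only an illustration; the helper name `agmon_distance`, the value of `T` and the number of quadrature cells are choices made here.

```python
import math

T = 1.0   # illustrative period

def w(z):
    # Mathieu-type potential used in (3.14).
    return math.sin(math.pi * z / T) ** 2

def agmon_distance(potential, period, n=20000):
    # (3.13): S = sqrt(2) * int_{-T/2}^{T/2} sqrt(w(z)) dz, midpoint rule.
    # n is even, so the kink of sqrt(w) at z = 0 sits on a cell boundary.
    h = period / n
    total = sum(
        math.sqrt(potential(-period / 2 + (i + 0.5) * h)) for i in range(n)
    )
    return math.sqrt(2.0) * h * total

S_numeric = agmon_distance(w, T)
S_closed = 2.0 * math.sqrt(2.0) * T / math.pi   # right-hand side of (3.14)
```

The same function can be applied to another even $T$-periodic potential vanishing only at the lattice points, in which case `S_numeric` gives the exponent governing the decay of $\tau$ in (3.12).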
This is to compare to (14) in [34], which is not an exact formula (as wrongly claimed) but only an asymptotically correct formula. It can be found, for this Mathieu operator, in [1].

Let us give the formula for the constant $c_\tau$. It can be found in [17], see also [27, Formula (4.14)] and [18, pp. 58–59]. We have:
\[ c_\tau = 2^{\frac{3}{4}}\,\pi^{-\frac{1}{2}} \exp A_\tau, \tag{3.15} \]
with (assuming $w$ even)
\[ A_\tau = \lim_{\eta \to 0} \Big( \int_\eta^{\frac{T}{2}} \frac{1}{\sqrt{w(z)}}\,dz + \frac{\sqrt{2}}{\sqrt{w''(0)}} \ln \eta \Big). \tag{3.16} \]
We just sketch the mathematical proof. Filling out all the wells suitably except one (say 0), we get a new potential $w^{mod} \ge w$ which coincides with $w$ in an interval containing 0 and excluding small neighborhoods of all the other minima. We consider, for $\varepsilon$ small enough, the ground state of this modified problem and (multiplying by a cut-off function) we get a function $\psi_0^{app}$ (and an eigenvalue $\lambda_1^{app}$) which is a very good approximation of $\psi_0$.

$^{e}$ The computation is a little simpler in the case when $w$ is even.
Now the hopping amplitude in the abstract theory is given$^{f}$ exactly by
\[ -\tau = a(T) = \langle H_z \psi_0, \psi_1 \rangle = \langle (H_z - \mu)\psi_0, \psi_1 \rangle, \tag{3.17} \]
the last equality being satisfied, due to the orthogonality of $\psi_0$ and $\psi_1$, for any $\mu$. When replacing $\psi_0$ by its approximation, one has to be careful, because $\psi_0^{app}$ and $\psi_1^{app} := \psi_0^{app}(\cdot - T)$ are no more orthogonal. So this leads to take $\mu = \lambda_1^{app}$, and one can prove that
\[ \tau \sim -\langle (H_z - \lambda_1^{app})\psi_0^{app}, \psi_1^{app} \rangle. \tag{3.18} \]
An easy way to see that $\tau$ is exponentially small is to observe that
\[ \langle (H_z - \lambda_1^{app})\psi_0^{app}, \psi_1^{app} \rangle = \varepsilon^{-2} \langle (w(z) - w^{mod})\psi_0^{app}, \psi_1^{app} \rangle, \tag{3.19} \]
and to use the information on the asymptotic decay of $\psi_0^{app}$. The WKB-approximation of $\psi_0^{app}$ is, in a neighborhood of 0,
\[ \psi_0^{wkb} = \varepsilon^{-\frac{1}{4}}\,b(z, \varepsilon) \exp\Big(-\frac{1}{\varepsilon} \int_0^z \sqrt{2 w(s)}\,ds\Big), \quad \text{for } z \ge 0, \tag{3.20} \]
with
\[ b(z, \varepsilon) \sim \sum_{j \ge 0} b_j(z)\,\varepsilon^j, \tag{3.21} \]
and
\[ b_0(z) = \pi^{-\frac{1}{4}} \exp\Big(-\int_0^z \frac{(w^{\frac{1}{2}})'(t) - \big(\frac{w''(0)}{2}\big)^{\frac{1}{2}}}{2\,w(t)^{\frac{1}{2}}}\,dt\Big). \tag{3.22} \]
It should then be completed by symmetry to get an even WKB solution on $]-T, +T[$. Note that we have
\[ (w^{\frac{1}{2}})'(T_-) = -\Big(\frac{w''(0)}{2}\Big)^{\frac{1}{2}}, \]
which implies that $b_0$ tends to $+\infty$ as $z \to T_-$. An integration by parts together with a WKB approximation leads to the asymptotic estimate of $\tau$ announced in (3.12). More precisely, we get that the prefactor $c_\tau$ is immediately related to the constant $b_0(\frac{T}{2})^2 \sqrt{w(\frac{T}{2})}$ and this leads to (3.15). Note that more generally we have
\[ b_0(z)\,b_0(T - z) \sqrt{w(z)} = \mathrm{Cst}, \tag{3.23} \]
which again shows the blowing up of $b_0$ at $T$. Finally, we emphasize that $\psi_0^{wkb}$ is a good approximation of $\psi_0$ only in intervals $]-T + \eta, T - \eta[$ for some $\eta > 0$.

$^{f}$ For the Mathieu potential, this is consistent with Formula (13) in [34].
One can also see that $a(kT)$ is of the order of $|a(T)|^{|k|}$ (for $k \ge 2$):
\[ a(kT) = O(\tau^2), \tag{3.24} \]
so it is legitimate in order to compute the width of the first band to forget all the $a(\ell)$ for $\ell \in \Gamma$, $\ell \ne 0, \pm T$. Thus, in the $k$ variable, the spectrum (corresponding to the first band) is up to a very small error, of the order of the square of $a(T)$, given by the operator of multiplication in $L^2(\mathbb{R}/\Gamma)$ by the function $a(0) + 2a(T)\cos(kT)$.

3.4. Semi-classics for the $(NT)$-periodic Wannier functions

What is written above corresponds to the use of Wannier functions on $\mathbb{R}$. One can write a close theory using the $(NT)$-periodic Wannier functions without modifying the main terms of the asymptotics. In particular, $\psi_0^{wkb}$ is also a good approximation of $\psi_0^N$ for $N > 1$.

Proposition 3.1. There exists $c(\varepsilon)$ with $c(0) = 1$, such that, for all $\eta > 0$, for all $q > 0$, there exists a constant $C_{\eta,q}$, such that we have
\[ \Big| \exp\Big(\frac{1}{\varepsilon} \int_0^{|z|} \sqrt{2 w(s)}\,ds\Big) \big(\psi_0^N(z) - \psi_0^{wkb}(z)\big) \Big| \le C_{\eta,q}\,\varepsilon^q, \quad \forall z \in ]-T + \eta, T - \eta[. \tag{3.25} \]
For any $\alpha > 0$, there exists $\eta > 0$ and $C_\alpha$ such that
\[ \Big| \exp\Big(\frac{S_0}{\varepsilon}\Big)\,\psi_0^N(z) \Big| \le C_\alpha \exp\Big(\frac{\alpha}{\varepsilon}\Big), \quad \forall z \notin ]-T + \eta, T - \eta[. \tag{3.26} \]
Although we will mainly use the $(NT)$-Wannier functions in this paper, the interest of the Wannier functions on $\mathbb{R}$ is that they allow to recover the information for all Floquet eigenvalues and this could be important if we want to control the constants with respect to $N$.

4. Justification of the Reduction to the Longitudinal Energy $E_A^N$

4.1. Main result

In this section, we address the reduction to the energy $E_A^N$ defined in (1.36) and prove the following theorem (recall that $m_A^N$ is defined in (1.44)):

Theorem 4.1. If
\[ \text{(A}\Omega\text{a)} \qquad m_A^N(\varepsilon, \hat g)\,(\omega_\perp - \Omega)^{-1} \ll 1 \tag{4.1} \]
and
\[ \text{(A}\Omega\text{b)} \qquad g\,(2\omega_\perp - \Omega)\,m_A^N(\varepsilon, \hat g)\,(\omega_\perp - \Omega)^{-\frac{3}{2}} \ll 1, \tag{4.2} \]
we have
\[ \inf_{\|\Psi\| = 1} Q_\Omega^{per,N}(\Psi) = \omega_\perp + m_A^N(\varepsilon, \hat g)\,(1 + o(1)). \tag{4.3} \]
Both Theorems 1.3 and 1.4 are a consequence of Theorem 4.1 as soon as we have the appropriate rough estimates on $m_A^N$ already presented in the introduction. This is what we explain first in Sec. 4.2 before proving the theorem in Sec. 4.3.

4.2. Proof of Theorems 1.3 and 1.4

4.2.1. Weak Interaction case

In the Weak Interaction case, we recall from (1.57) that, when (1.55) is satisfied, then
\[ m_A^N \approx 1/\varepsilon. \tag{4.4} \]
Therefore, when (1.54) and (1.55) are satisfied, then (4.1) and (4.2) automatically hold with the observation that
\[ g\,(2\omega_\perp - \Omega)\,(\omega_\perp - \Omega)^{-\frac{3}{2}}\,m_A^N(\varepsilon, \hat g) \le C\,g\,(2\omega_\perp - \Omega)\,\varepsilon^{\frac{1}{2}}\,\big(\varepsilon(\omega_\perp - \Omega)\big)^{-\frac{3}{2}} \ll 1, \]
and Theorem 1.3 follows from Theorem 4.1.

4.2.2. Thomas–Fermi case

In the Thomas–Fermi case, we will prove in (5.11) that, when (1.59) and (1.60) are satisfied, then
\[ m_A^N \approx (g\,\omega_\perp/\varepsilon)^{2/3}. \tag{4.5} \]
Let us verify that, if (1.59)–(1.61) are satisfied, then (4.1) and (4.2) hold. This will prove Theorem 1.4. We get (4.1) in the following way. First we have:
\[ (\omega_\perp - \Omega)^{-1}\,m_A^N(\varepsilon, \hat g) \le C\,(\omega_\perp - \Omega)^{-1}\,\omega_\perp^{\frac{2}{3}}\,g^{\frac{2}{3}}\,\varepsilon^{-\frac{2}{3}}. \]
Hence (4.1) is a consequence of
\[ g\,\omega_\perp \ll \varepsilon\,(\omega_\perp - \Omega)^{\frac{3}{2}}, \tag{4.6} \]
which follows from (1.61) since (1.59) and (1.61) imply that $\varepsilon\,(\omega_\perp - \Omega) \gg 1$. The check of (4.2) is then immediate from (1.61) and (4.5).
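4.3. Proof of Theorem 4.1

Because of the upper bound (1.53), Theorem 4.1 is a consequence of the following proposition, recalling that $\delta_\perp = \omega_\perp - \Omega$.

Proposition 4.2. There exists a constant $C > 0$ such that, for all $\varepsilon \in ]0, 1]$, for all $\omega_\perp$, $\Omega$ s.t. $\delta_\perp \ge 1$ and for all $g \ge 0$,
\[ \inf_{\|\Psi\| = 1} Q_\Omega^{per,N}(\Psi) = \omega_\perp + m_A^N(\varepsilon, \hat g)\,\big(1 - C\,r_A(\varepsilon, g)\big), \tag{4.7} \]
with
\[ 0 \le r_A(\varepsilon, g) \le g^{1/4}\,\delta_\perp^{-\frac{1}{8}} \Big(\frac{\delta_\perp + \omega_\perp}{\delta_\perp}\Big)^{\frac{1}{4}} m_A^N(\varepsilon, \hat g)^{\frac{1}{4}} + m_A^N(\varepsilon, \hat g)\,\delta_\perp^{-1}. \tag{4.8} \]

Proof. For simplicity, we make the proof for $\Omega = 0$. The proof does not depend on $N$ and for $\Omega$ not zero, we will make a remark at the end on how to adapt it, using the diamagnetic inequality. Note also that
\[ 1 - C\,r_A(\varepsilon, g) \ge 0 \]
by the lower bound. So we have only to prove (4.8) under the additional condition that the right-hand side of (4.8) is less than some fixed $\alpha_0$. In any case, the estimate is only interesting in this case.

The proof is inspired by [4] where a reduction is made from a 3D to a 2D setting for a fast rotation. We project a minimizer $\Psi$ onto $\psi_\perp \otimes L^2(\mathbb{R}/NT\mathbb{Z})$, and call $\psi_\perp(x, y)\,\xi(z)$ its projection:
\[ \Psi(x, y, z) = \psi_\perp(x, y)\,\xi(z) + w(x, y, z) \tag{4.9} \]
with
\[ \int_{\mathbb{R}^2} \psi_\perp(x, y)\,w(x, y, z)\,dx\,dy = 0. \]
The orthogonality condition implies in particular
\[ 1 = \int_{-\frac{NT}{2}}^{\frac{NT}{2}} |\xi(z)|^2\,dz + \int_{\mathbb{R}^2 \times ]-\frac{NT}{2}, \frac{NT}{2}[} |w(x, y, z)|^2\,dx\,dy\,dz \tag{4.10} \]
and we have the lower bound
\[ \int_{-\frac{NT}{2}}^{\frac{NT}{2}} E_B(w(\cdot, \cdot, z))\,dz \ge (\delta_\perp + \omega_\perp) \int_{\mathbb{R}^2 \times ]-\frac{NT}{2}, \frac{NT}{2}[} |w(x, y, z)|^2\,dx\,dy\,dz, \tag{4.11} \]
with
\[ E_B(\psi) = \int_{\mathbb{R}^2} \Big( \frac{1}{2}|\nabla_{x,y}\psi(x, y)|^2 + \frac{\omega_\perp^2}{2}(x^2 + y^2)\,|\psi(x, y)|^2 \Big)\,dx\,dy. \]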
Mathematical Models for Bose–Einstein Condensates in Optical Lattices
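We compute the energy of $\Psi$ and use the orthogonality condition and the equation satisfied by $\psi_\perp$ to find that all the cross terms disappear, so that
\[
\begin{aligned}
Q^{\mathrm{per},N}(\Psi) ={}& \omega_\perp\int_{-NT/2}^{NT/2}|\xi(z)|^2\,dz + E_A^N(\xi) + \int_{\mathbb R^2} E_A^N(w(x,y,\cdot))\,dxdy \\
&+ \int_{-NT/2}^{NT/2} E_B(w(\cdot,\cdot,z))\,dz + g\int_{\mathbb R^2\times]-\frac{NT}{2},\frac{NT}{2}[} |\Psi(x,y,z)|^4\,dxdydz,
\end{aligned} \tag{4.12}
\]
where
\[ E_A^N(\phi) = \int_{-NT/2}^{NT/2}\Bigl(\frac12|\phi'(z)|^2 + W(z)\,|\phi|^2\Bigr)dz. \]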
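From (4.10)–(4.12), we find
\[ Q^{\mathrm{per},N}(\Psi) \ge \omega_\perp + \frac{\delta_\perp}{\delta_\perp+\omega_\perp}\int_{-NT/2}^{NT/2} E_B(w(\cdot,\cdot,z))\,dz + \int_{\mathbb R^2} E_A^N(w(x,y,\cdot))\,dxdy. \tag{4.13} \]
We use (4.13) together with the upper bound (1.53) and (4.11) to derive that
\[ \int_{\mathbb R^2\times]-\frac{NT}{2},\frac{NT}{2}[} |w(x,y,z)|^2\,dxdydz \le \frac{m_A^N(\varepsilon,\hat g)}{\delta_\perp}. \tag{4.14} \]
Note that the right-hand side in (4.14) is very small according to Conditions (4.1) and (4.2), and that (4.14) implies
\[ \int_{-NT/2}^{NT/2} |\xi(z)|^2\,dz \ge 1 - \frac{m_A^N(\varepsilon,\hat g)}{\delta_\perp}. \tag{4.15} \]
Then we also get
\[ \int |\nabla_{x,y} w|^2\,dxdydz \le 2\,\frac{\delta_\perp+\omega_\perp}{\delta_\perp}\, m_A^N(\varepsilon,\hat g), \qquad \int |\partial_z w|^2\,dxdydz \le 2\, m_A^N(\varepsilon,\hat g), \tag{4.16} \]
where both integrals are over $\mathbb R^2\times]-\frac{NT}{2},\frac{NT}{2}[$.

The proof of the Sobolev embedding of $H^1(\mathbb R^3)$ in $L^6(\mathbb R^3)$ gives (see, for example, [11, p. 164, line −1]), for a general function $v$ in $H^1(\mathbb R^3)$,
\[ \|v\|_6 \le 4\,\|\partial_x v\|_2^{1/3}\,\|\partial_y v\|_2^{1/3}\,\|\partial_z v\|_2^{1/3}. \tag{4.17} \]
Here $\|\cdot\|_p$ denotes the norm in $L^p(\mathbb R^3)$.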
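In our case, we are working in $H^1(\mathbb R^2_{x,y}\times(\mathbb R_z/NT\mathbb Z))$. A partition of unity in the $z$ variable allows us to extend this estimate to this case, and we get, for another constant $C_N$,
\[ \|w\|_6 \le C_N\,\|\partial_x w\|_2^{1/3}\,\|\partial_y w\|_2^{1/3}\,\bigl(\|\partial_z w\|_2^2 + \|w\|_2^2\bigr)^{1/6}, \tag{4.18} \]
where this time $\|\cdot\|_p$ denotes the norm in $L^p(\mathbb R^2_{x,y}\times]-\frac{NT}{2},\frac{NT}{2}[)$. So we obtain:
\[ \|w\|_6 \le \tilde C\, m_A^N(\varepsilon,\hat g)^{1/2}\Bigl(\frac{\delta_\perp+\omega_\perp}{\delta_\perp}\Bigr)^{1/3}. \tag{4.19} \]
($C$, $\tilde C$ are $N$-dependent constants possibly changing from line to line.) Since, by Hölder's inequality,
\[ \|w\|_4 \le \|w\|_2^{1/4}\,\|w\|_6^{3/4}, \]
we deduce that
\[ \|w\|_4 \le C\, m_A^N(\varepsilon,\hat g)^{1/2}\,\delta_\perp^{-1/8}\Bigl(\frac{\delta_\perp+\omega_\perp}{\delta_\perp}\Bigr)^{1/4}. \tag{4.20} \]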
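The exponents in (4.19) and (4.20) follow from pure bookkeeping on the three a priori bounds for $w$. The sketch below is only an illustration: it tracks powers of $m = m_A^N$, $\delta_\perp$ and $\kappa = (\delta_\perp+\omega_\perp)/\delta_\perp$ with exact fractions, ignores all constants, and assumes $\delta_\perp \ge 1$ so that $\|w\|_2^2$ is dominated by $m$ in (4.18). The helper names are hypothetical.

```python
from fractions import Fraction as F

# A bound is stored as exponents (power of m, power of delta, power of kappa),
# with m = m_A^N, delta = delta_perp, kappa = (delta_perp + omega_perp) / delta_perp.
def mul(*terms):
    """Multiply bounds: exponents add."""
    return tuple(sum(t[i] for t in terms) for i in range(3))

def power(term, p):
    """Raise a bound to the power p: exponents scale."""
    return tuple(e * p for e in term)

norm_w_2 = power((F(1), F(-1), F(0)), F(1, 2))    # ||w||_2 <~ (m / delta)^(1/2)
norm_dxy_2 = power((F(1), F(0), F(1)), F(1, 2))   # ||d_x w||_2, ||d_y w||_2 <~ (m * kappa)^(1/2)
norm_dz_2 = power((F(1), F(0), F(0)), F(1, 2))    # ||d_z w||_2 <~ m^(1/2)

# Sobolev-type bound: product of the three directional norms, each to the power 1/3.
norm_w_6 = mul(power(norm_dxy_2, F(1, 3)),
               power(norm_dxy_2, F(1, 3)),
               power(norm_dz_2, F(1, 3)))

# Hoelder interpolation between L^2 and L^6.
norm_w_4 = mul(power(norm_w_2, F(1, 4)), power(norm_w_6, F(3, 4)))
```

Here `norm_w_6` comes out as the exponent triple $(1/2, 0, 1/3)$ and `norm_w_4` as $(1/2, -1/8, 1/4)$.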
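We expand
\[ |\Psi|^4 = |\psi_\perp|^4|\xi|^4 + 2|\psi_\perp|^2|\xi|^2|w|^2 + 4\Bigl(\Re(\psi_\perp\xi\bar w) + \frac12|w|^2\Bigr)^2 + 4|\psi_\perp|^2|\xi|^2\,\Re(\psi_\perp\xi\bar w). \]
Since (4.12) implies that
\[ Q^{\mathrm{per},N}(\Psi) \ge \omega_\perp + E_A^N(\xi) - 4g\int_{\mathbb R^2\times]-\frac{NT}{2},\frac{NT}{2}[} |\psi_\perp(x,y)|^3\,|\xi(z)|^3\,|w(x,y,z)|\,dxdydz, \]
in order to get the lower bound, we just need to prove that the last term is a perturbation of $E_A^N(\xi)$. We can do the following estimates:
\[
\begin{aligned}
g\int |\psi_\perp|^3|\xi|^3|w|\,dxdydz
&\le g\,\|w\|_4\Bigl(\int|\psi_\perp(x,y)|^4dxdy\Bigr)^{3/4}\Bigl(\int|\xi(z)|^4dz\Bigr)^{3/4} \\
&\le c_0\, g\,\omega_\perp^{3/4}\,\|w\|_4\Bigl(\int|\xi(z)|^4dz\Bigr)^{3/4} \\
&\le c_1\, g^{1/4}\,(E_A^N(\xi))^{3/4}\,\|w\|_4 \\
&\le c_2\, g^{1/4}\,\delta_\perp^{-1/8}\Bigl(\frac{\delta_\perp+\omega_\perp}{\delta_\perp}\Bigr)^{1/4} m_A^N(\varepsilon,\hat g)^{1/2}\,(E_A^N(\xi))^{3/4} \\
&\le c_3\, g^{1/4}\,\delta_\perp^{-1/8}\Bigl(\frac{\delta_\perp+\omega_\perp}{\delta_\perp}\Bigr)^{1/4} m_A^N(\varepsilon,\hat g)^{1/4}\,\bigl(1 + C\, m_A^N(\varepsilon,\hat g)\,\delta_\perp^{-1}\bigr)\,E_A^N(\xi).
\end{aligned}
\]
The first line is Hölder's inequality, and the fourth uses (4.20). To get the last line, we have used the lower bound
\[ E_A^N(\xi) \ge m_A^N(\varepsilon,\hat g)\,\|\xi\|_2^4, \]
and (4.15).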
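The algebra behind the expansion of $|\Psi|^4$, and the pointwise lower bound by $|\psi_\perp\xi|^4 - 4|\psi_\perp\xi|^3|w|$ that the proof relies on, can be checked numerically. The following is a minimal sketch on random complex samples; `a` stands for the value of $\psi_\perp\xi$ at a point and `w` for the value of the remainder (both names are illustrative only).

```python
import random

def quartic_expansion(a, w):
    """Right-hand side of the expansion of |a + w|^4, with a playing the role of psi_perp * xi."""
    re = (a * w.conjugate()).real          # Re(a * conj(w))
    return (abs(a) ** 4
            + 2 * abs(a) ** 2 * abs(w) ** 2
            + 4 * (re + 0.5 * abs(w) ** 2) ** 2
            + 4 * abs(a) ** 2 * re)

def check(samples=1000, seed=0):
    """Return the largest deviation between |a + w|^4 and its expansion over random samples."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        a = complex(rng.uniform(-2, 2), rng.uniform(-2, 2))
        w = complex(rng.uniform(-2, 2), rng.uniform(-2, 2))
        lhs = abs(a + w) ** 4
        # Only the last term of the expansion can be negative, and it is >= -4 |a|^3 |w|.
        assert lhs >= abs(a) ** 4 - 4 * abs(a) ** 3 * abs(w) - 1e-9
        worst = max(worst, abs(lhs - quartic_expansion(a, w)))
    return worst
```

Calling `check()` returns a deviation at the level of floating-point rounding.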
This leads to
\[ Q^{\mathrm{per},N}(\Psi) \ge \omega_\perp + \Bigl(1 - C g^{1/4}\,\delta_\perp^{-1/8}\Bigl(\frac{\delta_\perp+\omega_\perp}{\delta_\perp}\Bigr)^{1/4} m_A^N(\varepsilon,\hat g)^{1/4} - C\, m_A^N(\varepsilon,\hat g)\,\delta_\perp^{-1}\Bigr)\,E_A^N(\xi), \]
and then to (4.7).

Remark 4.3. In the case with rotation $\Omega$, the proof is the same if we replace $E_B$ by $E_{B,\Omega}$ defined by
\[ E_{B,\Omega}(\psi) = \int_{\mathbb R^2}\Bigl(\frac12|\nabla_{x,y}\psi - i\Omega r^\perp\psi|^2 + \frac12(\omega_\perp^2-\Omega^2)\,r^2|\psi|^2\Bigr)dxdy. \tag{4.21} \]
We also use the diamagnetic inequality
\[ \int |\nabla|w|(x,y)|^2\,dxdy \le \int |(\nabla w - i\Omega r^\perp w)(x,y)|^2\,dxdy, \tag{4.22} \]
which provides the Sobolev injections.

Remark 4.4. Here, we have not proved that the minimizer of $E$ behaves almost like the ground state in $x, y$ times a function $\xi$ which minimizes $E_A$. We are just able (see (4.14)) to prove that the minimizer is close to its projection (in some $L^2$ or $L^4$ norm). When $N = 1$, this can be improved under the stronger condition (1.51). We first observe (note that (4.13) is still true with the addition of $E_A(\xi)$ on the right-hand side) that
\[ E_A(\xi) \le m_A(\varepsilon,\hat g). \tag{4.23} \]
Using (4.15), and assuming $m_A/\delta_\perp < 1$, this leads to
\[ E_A(\xi) \le m_A(\varepsilon,\hat g)\Bigl(1 - \frac{m_A(\varepsilon,\hat g)}{\delta_\perp}\Bigr)^{-1}\|\xi\|^2. \tag{4.24} \]
We will show in Sec. 5.2 (see (5.7)) how to proceed in order to show that $\xi$ is close to the ground state $\phi_1(z)$ of $H_z^{\mathrm{per}}$. This makes it possible to improve the information given in Theorem 1.2.

5. The 1D Periodic Model: Estimates for $m_A^N$

The aim of this section is to analyze $m_A^N$. Rough estimates were already given for the weak interaction case, and they were enough for the justification of the model; the corresponding rough estimates needed for the Thomas–Fermi justification will be obtained in this section. We will then look at accurate estimates for $m_A^N$, which will be established under stronger hypotheses. We will end the section with a discussion of the case $N > 1$, which finally leads to the introduction of the DNLS model for the Weak Interaction case.
5.1. Universal estimates

We consider the one-dimensional situation and a $T$-periodic potential $W$, which could be for example $W(z) = (\sin \pi z)^2/\varepsilon^2$. We consider the problem of minimizing on $L^2(\mathbb R/T\mathbb Z)$ the functional
\[ \psi \mapsto G(\psi) = \frac12\int_{-T/2}^{T/2}|\psi'(z)|^2\,dz + \int_{-T/2}^{T/2} W(z)\,|\psi(z)|^2\,dz + \hat g\int_{-T/2}^{T/2}|\psi(z)|^4\,dz, \tag{5.1} \]
over $\|\psi\|_{L^2} = 1$. We are interested in the control of the minimum $m(\hat g)$ of the functional. It is clear that
\[ \lambda_1 \le m(\hat g) \le \lambda_1 + \hat g\int_{-T/2}^{T/2}|\phi_1(z)|^4\,dz, \tag{5.2} \]
so the question is now to improve the lower bound. We will use the following perturbation lemma.

Lemma 5.1. If $\hat g \ge 0$, then
\[ m(\hat g) \ge \lambda_1 + \hat g\,\|\phi_1\|_4^4 - 2^{5/2}\,\hat g^{3/2}\,\|\phi_1\|_6^3\,\|\phi_1\|_4^2\,(\lambda_2-\lambda_1)^{-1/2}, \tag{5.3} \]
where $(\lambda_1, \phi_1)$ is the spectral pair of $-\frac12\frac{d^2}{dz^2} + W(z)$ corresponding to the ground state energy (with $\|\phi_1\|_2 = 1$) and $\lambda_2$ is the second eigenvalue. Moreover, if $\phi_{\min}$ is a minimizer of $G$, then there exists a complex number $c$ of modulus 1 such that
\[ \|\phi_{\min} - c\,\phi_1\|_{L^2}^2 \le 2\hat g\,\frac{\|\phi_1\|_4^4}{\lambda_2-\lambda_1}. \tag{5.4} \]
We will not give the proof of this lemma, which is close to the proof of Proposition 4.2.

Remark 5.2. Everything being universal, one can of course replace $T$ by $NT$ in the description.

5.2. Semi-classical results in the weak interaction case: N = 1

We first recall that using (3.10) we have, under Condition (1.55), the rough control
\[ \frac{1}{C\varepsilon} \le \lambda_{1,z} \le m_A(\varepsilon,\hat g) \le \lambda_{1,z} + \hat g\int_{-T/2}^{T/2}|\phi_1(z)|^4\,dz \le \frac{C}{\varepsilon}, \tag{5.5} \]
which leads to (1.57) for $N = 1$ and was sufficient for the justification of the longitudinal model A. Let us now show that under stronger assumptions one can obtain more accurate asymptotics, including the main contribution of the nonlinear interaction.
Proposition 5.3. Under assumption (1.51), $m_A$ admits the following asymptotics:
\[ m_A(\varepsilon,\hat g) = \lambda_1^{\mathrm{har}}(\varepsilon) + \pi^{-1/2}\,w''(0)^{1/4}\,\hat g\,\varepsilon^{-1/2} + c_0 + O(\varepsilon) + O(\hat g^{3/2}\varepsilon^{-1/4}). \tag{5.6} \]

Proof. Indeed, $\lambda_1$ and $\lambda_2 - \lambda_1$ are of order $1/\varepsilon$, and by (3.10) and (5.4), we get
\[ \|\phi_{\min} - c\,\phi_1\|_{L^2}^2 \le C\,\hat g\,\varepsilon^{1/2}. \tag{5.7} \]
Using the harmonic approximation, the term $\|\phi_1\|_6$ is of order $\varepsilon^{-1/6}$ and the remainder appearing in (5.3) is of order $\hat g^{3/2}\varepsilon^{-1/4}$. Altogether we get for the energy
\[ m_A(\varepsilon,\hat g) = \lambda_{1,z} + \hat g\int_{-T/2}^{T/2}|\phi_1(z)|^4\,dz + O(\hat g^{3/2}\varepsilon^{-1/4}). \tag{5.8} \]
Using (3.10), we obtain (5.6). These asymptotics become interesting in the semi-classical regime if (1.51) holds.

Remark 5.4. Exponentially small effects will be discussed in Sec. 7.

5.3. Semi-classical analysis in a Thomas–Fermi regime: Case N = 1

5.3.1. Main results

In this subsection, we first give the rough estimate leading to (1.62) for $N = 1$. Recall that $\hat g = \frac{1}{\pi} g\omega_\perp$, but $\hat g$ and $\varepsilon$ are taken as independent parameters.

Proposition 5.5. If for some $c > 0$,
\[ \hat g\,\varepsilon^2 \le c, \tag{5.9} \]
and if
\[ \hat g\,\varepsilon^{1/2} \gg 1, \tag{5.10} \]
then there exist $C$ and $\varepsilon_0$ such that
\[ \frac1C\,\hat g^{2/3}\varepsilon^{-2/3} \le m_A(\varepsilon,\hat g) \le C\,\hat g^{2/3}\varepsilon^{-2/3}, \qquad \forall\varepsilon \in\, ]0,\varepsilon_0]. \tag{5.11} \]
This will be proved in the rest of the section, as well as

Proposition 5.6. If
\[ \hat g\,\varepsilon^2 \ll 1, \tag{5.12} \]
and (5.10) are satisfied, then
\[ m_A(\varepsilon,\hat g) = 2^{-4/3}\,3^{5/3}\,5^{-1}\,w''(0)^{2/3}\,\hat g^{2/3}\varepsilon^{-2/3}\,\bigl(1 + O(\hat g^{-2/3}\varepsilon^{-1/3})\bigr). \tag{5.13} \]
The new assumption is (5.12), which is stronger than (5.9).
5.3.2. The harmonic functional on R

Let us start with the case of a harmonic potential $W(z) = \gamma\,\frac{z^2}{2\varepsilon^2}$ on $\mathbb R$, with $\gamma > 0$, and consider the problem of minimizing
\[ q^{Hr,T}(u) = \frac12\int_{-T/2}^{T/2} u'(t)^2\,dt + \frac{\gamma}{2\varepsilon^2}\int_{-T/2}^{T/2} t^2 u(t)^2\,dt + \hat g\int_{-T/2}^{T/2} u(t)^4\,dt \tag{5.14} \]
over the $u$'s in the form domain of $q^{Hr,T}$ such that $\|u\|_2 = 1$. We denote by $m_A^{Hr,T}$ the infimum of the functional. Actually there are two approximating "harmonic" functionals of interest, corresponding to $T$ finite and to $T = +\infty$. An interesting point is that, for $T$ large enough, the minimizers of these two functionals are the same, as we will see below. But let us start with the case $T = +\infty$.

Lemma 5.7. If (5.10) holds, then
\[ m_A^{Hr,+\infty}(\varepsilon,\hat g) = 2^{-4/3}\,3^{5/3}\,5^{-1}\,\gamma^{2/3}\,\hat g^{2/3}\varepsilon^{-2/3}\,\bigl(1 + O(\hat g^{-2/3}\varepsilon^{-1/3})\bigr). \tag{5.15} \]
The proof is rather standard. The analysis is done through a dilation. We look for an $L^2$-normalized test function $\phi$ in the form
\[ \phi(z) = \rho^{1/2}\,v(\rho z), \tag{5.16} \]
with $\rho$ and $v$ to be determined. The 1D energy of $\phi$ becomes
\[ \frac12\rho^2\int_{\mathbb R} v'(t)^2\,dt + \hat\gamma\,\rho^{-2}\varepsilon^{-2}\int_{\mathbb R} t^2 v(t)^2\,dt + \hat g\,\rho\int_{\mathbb R} v(t)^4\,dt, \tag{5.17} \]
with
\[ \hat\gamma = \frac12\gamma. \]
This leads us to choose $\rho = \rho_\gamma$ such that
\[ \rho_\gamma = \hat\gamma^{1/3}\,\varepsilon^{-2/3}\,\hat g^{-1/3}, \tag{5.18} \]
and the energy of this model becomes
\[ \hat\gamma^{1/3}\,\hat g^{2/3}\varepsilon^{-2/3}\Bigl(q_{TF}(v) + \frac12\,(\hat\gamma^{-1}\varepsilon^2\hat g^4)^{-1/3}\int_{\mathbb R} v'(t)^2\,dt\Bigr) \tag{5.19} \]
with
\[ q_{TF}(v) := \int_{\mathbb R} t^2 v(t)^2\,dt + \int_{\mathbb R} v(t)^4\,dt. \tag{5.20} \]
This is asymptotically of the order of $\hat g^{2/3}\varepsilon^{-2/3}$, and Condition (5.10) is just the condition that the kinetic term is negligible in the computation of the energy.
The value of the infimum of $q_{TF}(v)$ and the control of the remainder are rather standard (see [2, Proposition 3.3] or [14], which treat the (2D)-case). One has to regularize the inverted parabola
\[ v_{\min}(t) = 2^{-1/2}\,(\lambda - t^2)_+^{1/2}, \tag{5.21} \]
with
\[ \lambda = \Bigl(\frac32\Bigr)^{2/3}, \tag{5.22} \]
and, for $x \in \mathbb R$, $(x)_+ = \max(x, 0)$, which realizes the infimum but is not in $H^1$.

5.3.3. The harmonic functional on $]-\frac T2, \frac T2[$

We consider now the case of the interval and have the following lemma.

Lemma 5.8. Under assumption (5.10), there exists $C > 0$ such that
\[ m_A^{\mathrm{har},T}(\varepsilon,\hat g) \ge \frac1C\,\hat g^{2/3}\varepsilon^{-2/3}. \tag{5.23} \]
The proof is a variant of the previous lemma. It is easy to see that the minimizers coincide if
\[ \rho_\gamma\,\frac T2 > \lambda^{1/2}, \tag{5.24} \]
that is
\[ \frac T2 > \Bigl(\frac32\Bigr)^{1/3}\hat g^{1/3}\,\varepsilon^{2/3}\,\hat\gamma^{-1/3}. \tag{5.25} \]
If (5.25) is not satisfied, we can still have a lower bound for the infimum of the functional. The renormalized functional reads
\[ q^{\mathrm{ren},T}(v) := \frac12\rho^2\int_{-\rho T/2}^{\rho T/2} v'(t)^2\,dt + \rho^{-2}\varepsilon^{-2}\hat\gamma\int_{-\rho T/2}^{\rho T/2} t^2 v(t)^2\,dt + \hat g\,\rho\int_{-\rho T/2}^{\rho T/2} v(t)^4\,dt, \tag{5.26} \]
which satisfies
\[ q^{\mathrm{ren},T}(v) \ge \hat g\,\rho\int_{-\rho T/2}^{\rho T/2} v(t)^4\,dt. \]
Using the Hölder inequality, we obtain, if $\|v\|_2 = 1$,
\[ q^{\mathrm{ren},T}(v) \ge (\hat g\rho)\,(\rho T)^{-1}, \]
and using our assumption, we obtain
\[ q^{\mathrm{ren},T}(v) \ge \frac12\lambda^{-1/2}\,(\hat g\rho) \ge \frac1C\,\hat g^{2/3}\varepsilon^{-2/3}, \tag{5.27} \]
if $\|v\|_2 = 1$. We then immediately obtain Lemma 5.8.
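As a numerical sanity check of the inverted parabola (5.21)–(5.22), one can verify that it is $L^2$-normalized and evaluate $q_{TF}$ on it. The sketch below uses a plain midpoint rule; the closed form $\frac25\lambda^{5/2}$ used for comparison is obtained by integrating the polynomial integrand by hand, and the function names are illustrative.

```python
import math

LAM = (3 / 2) ** (2 / 3)            # lambda from (5.22)

def v_min_sq(t):
    """Square of the inverted parabola (5.21): v_min(t)^2 = (lambda - t^2)_+ / 2."""
    return 0.5 * max(LAM - t * t, 0.0)

def integrate(f, a, b, n=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def tf_mass_and_energy():
    """Return (int v^2, q_TF(v)) for v = v_min, integrating over the support of v_min."""
    r = math.sqrt(LAM)
    mass = integrate(v_min_sq, -r, r)
    energy = integrate(lambda t: t * t * v_min_sq(t) + v_min_sq(t) ** 2, -r, r)
    return mass, energy
```

The mass evaluates to 1 and the energy to $\frac25\lambda^{5/2} = 2^{-2/3}\,3^{5/3}\,5^{-1} \approx 0.786$, up to quadrature error.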
5.3.4. Relevance of the "harmonic functional" for rough bounds

First we prove Proposition 5.5. We can proceed by direct comparison, observing that we can find $\alpha > 0$ such that
\[ w(z) \le \alpha z^2, \qquad \forall z \in \Bigl]-\frac T2, +\frac T2\Bigr[, \]
and
\[ \rho_\alpha T > 2\lambda^{1/2}. \]
Here, we use (5.9) and
\[ \rho_\alpha = c_0\,\alpha^{1/3}\,(\varepsilon^{-2/3}\hat g^{-1/3}) \ge c_0\,\alpha^{1/3}\,c^{-1/3}. \]
We can then use the asymptotic estimate (5.15) with $\gamma = \alpha$ to get the upper bound in (5.11). Using now assumption (1.1), we can also find $\hat\alpha$ such that
\[ w(z) \ge \hat\alpha z^2, \qquad \forall z \in \Bigl]-\frac T2, +\frac T2\Bigr[. \]
This leads, using our analysis of $q_{TF}$ in the harmonic case, to the lower bound in (5.11).

5.3.5. Relevance of the "harmonic functional" for the asymptotic behavior

In order to have a better localized minimizer, we should assume that $\rho \to +\infty$, and this corresponds to replacing assumption (5.9) by the stronger assumption (5.12). Moreover, we have to verify that under this assumption the "harmonic approximation" is valid for this energy computation. For this, we should analyze the localization of the minimizer. Assuming that such a localized minimizer exists (minimize the functional $v \mapsto \int (z^2 v(z)^2 + v(z)^4)\,dz$), we can also get an upper bound of $m_A$ by using a harmonic approximation and a lower bound of the same order. For the lower bound, we have just to analyze (forgetting the positive kinetic term) the infimum of the functional
\[ \phi \mapsto \int_{-T/2}^{T/2}\Bigl(\frac{w(z)}{\varepsilon^2}\,\phi^2 + \hat g\,\phi^4\Bigr)dz. \]
As in the other case, a minimizer (over the $L^2$-normalized $\phi$'s) should satisfy, for some $\mu > 0$, the Euler–Lagrange equation
\[ \frac{w(z)}{\varepsilon^2}\,\phi(z) + 2\hat g\,\phi(z)^3 = \mu\,\phi(z), \]
where $\mu$ will be determined by the $L^2$ normalization over $]-\frac T2, \frac T2[$.
We find
\[ \phi(z) = \Bigl(\frac{1}{2\hat g}\Bigr)^{1/2}\Bigl(\mu - \frac{w(z)}{\varepsilon^2}\Bigr)_+^{1/2} \tag{5.28} \]
with
\[ \frac{1}{2\hat g}\int\Bigl(\mu - \frac{w(z)}{\varepsilon^2}\Bigr)_+ dz = 1. \tag{5.29} \]
But we know from the upper bound that $\mu$ is less than two times the energy, which is asymptotically lower than $m_A^{\mathrm{har}}(\hat g)$. In particular, if $\mu\varepsilon^2$ is small, it is easy to estimate $\mu$ using the harmonic approximation of $w$ at its minimum. It remains to verify the behavior of $\mu\varepsilon^2$. We find
\[ \mu\,\varepsilon^2 \le C\,\hat g^{2/3}\varepsilon^{4/3}. \]
Not surprisingly, this shows that $\mu\varepsilon^2$ is small as $\rho \to +\infty$. So finally, we have obtained Proposition 5.6.

5.4. The case N > 1

We would like to extend our rough or accurate estimates for $m_A$ to the case $N > 1$, keeping the same kind of assumptions.

5.4.1. Universal control

We now consider the functional over $]-\frac{NT}{2}, \frac{NT}{2}[$. Using the minimizer obtained for $N = 1$ and extending it by periodicity, we get, after renormalization, the general upper bound
\[ m_A^N(\varepsilon,\hat g) \le m_A\Bigl(\varepsilon, \frac{\hat g}{N}\Bigr). \tag{5.30} \]
From this comparison, we obtain immediately the rough upper bounds in the WI case and in the TF case.

5.4.2. Rough lower bounds

In the WI case, we always have, observing that $\lambda_{1,z}$ is the ground state energy for any $N \in \mathbb N^*$,
\[ \lambda_{1,z} \le m_A^N(\varepsilon,\hat g). \tag{5.31} \]
Hence we obtain in full generality
Proposition 5.9. Under condition (1.54), for any $N \ge 1$, we have
\[ m_A^N(\varepsilon,\hat g) \approx \frac{1}{\varepsilon}. \tag{5.32} \]
In the TF case, it remains to prove the lower bound, which will be a consequence of the following inequality:
\[ m_A^N(\varepsilon,\hat g) \ge \frac{1}{CN^2}\,\hat g^{2/3}\varepsilon^{-2/3}. \tag{5.33} \]
We indeed observe that if $u_N$ is a normalized minimizer, then there exists one interval $I_j := ]j\frac T2, (j+2)\frac T2[$ ($j \in \{-N, \dots, N-2\}$) such that
\[ \int_{I_j} |u_N|^2\,dz \ge \frac1N. \]
We can then write, forgetting the kinetic term and translating $I_j$ to $]-\frac T2, +\frac T2[$,
\[
\begin{aligned}
m_A^N(\varepsilon,\hat g) &\ge \varepsilon^{-2}\int_{I_j} w(z)\,|u_N|^2\,dz + \hat g\int_{I_j}|u_N|^4\,dz \\
&\ge \inf\bigl(\|u_N\|_{L^2(I_j)}^2, \|u_N\|_{L^2(I_j)}^4\bigr)\ \inf_{\|u\|=1}\int_{-T/2}^{+T/2}\bigl(W|u|^2 + \hat g|u|^4\bigr)dz.
\end{aligned}
\]
Then we can combine the lower bound obtained for $N = 1$ and the inequality $w(z) \ge \hat\alpha z^2$ to get (5.33). So we finally get that $m_A^N$ has the right order in the TF case.

Proposition 5.10. Under assumptions (5.9) and (5.10), we have, for any $N \ge 1$,
\[ m_A^N(\varepsilon,\hat g) \approx \hat g^{2/3}\varepsilon^{-2/3}. \tag{5.34} \]
This extends to general $N$ our former Proposition 5.5.

5.4.3. Asymptotics

We would like to give conditions under which the universal upper bound (5.30) actually becomes, asymptotically or exactly, a lower bound.

Proposition 5.11. Under either assumption (1.51) or assumptions (5.10) and (5.12),
\[ m_A^N(\varepsilon,\hat g) \sim m_A\Bigl(\varepsilon, \frac{\hat g}{N}\Bigr). \tag{5.35} \]

Proof. The upper bound was already obtained in (5.30). The proof of the lower bound is different in the two considered cases.
WI case. We will see later (in (7.6)), by a rough analysis of the tunneling effect and the property that the infimum of the function
\[ \mathbb C^N \ni (c_0, c_1, \dots, c_{N-1}) \mapsto \sum_{j=0}^{N-1}|c_j|^4 \]
over $\sum_j |c_j|^2 = 1$ is attained when all the $|c_j|$'s are equal:
\[ |c_j| = \frac{1}{\sqrt N}, \qquad \text{for } j = 0, \dots, N-1, \tag{5.36} \]
that, under assumption (1.51), there exist $C > 0$, $\varepsilon_0 > 0$ and $\alpha > 0$ such that
\[ m_A^N(\varepsilon,\hat g) \ge m_A\Bigl(\varepsilon, \frac{\hat g}{N}\Bigr) - C(\hat g + 1)\exp\Bigl(-\frac{\alpha}{\varepsilon}\Bigr), \qquad \forall\varepsilon \in (0, \varepsilon_0]. \tag{5.37} \]
NT 2
− N2T
T2 w w 2 4 2 4 |uN | + g|uN | dz = N |uN | + g|uN | dz 2 2 − T2 =
T 2
w √ g √ 2 4 | | N u | + N u | dz N N 2 N
− T2
≥ inf
v=1
T 2
− T2
w 2 g 4 |v| |v| + dz. 2 N
But under assumptions (5.10) and (5.12), the last term in the inequality has same asymptotics as mA (, Nbg ) and we are done. 6. Study of Case (B): Justification of the Transverse Reduced Model 6.1. Main result N by (1.41), (1.42) and mN We have defined EB,Ω B,Ω , the infimum of the energy by (1.45). In case B, the proof of the reduction does not depend on whether N = 1 or N > 1. The only difference is when looking at the rough or accurate estimates of the reduced model. Note that only rough estimates are used in the part concerning the justification of the model. The reduction is very similar to case A, and we will prove
Theorem 6.1. If
\[ \text{(RBa)}\qquad \varepsilon\, m^N_{B,\Omega} \ll 1, \tag{6.1} \]
and
\[ \text{(RBb)}\qquad g\,\varepsilon^{1/2}\, m^N_{B,\Omega} \ll 1, \tag{6.2} \]
then, as $\varepsilon$ tends to 0,
\[ \inf_{\|\Psi\|=1} Q^{\mathrm{per},N}_\Omega(\Psi) = \lambda_{1,z} + m^N_{B,\Omega}\,(1 + o(1)). \tag{6.3} \]
Then Theorems 1.5 and 1.6 follow from this result and appropriate estimates on $m^N_{B,\Omega}$, as we will prove in Sec. 6.4, while the proof of Theorem 6.1 is made in Sec. 6.2.

6.2. Proof of Theorem 6.1

We recall that we have the universal upper bound (1.64). The lower bound follows from the following proposition and the fact that there exists $c > 0$ such that $\delta_z^N \sim c/\varepsilon$, as $\varepsilon$ tends to 0.

Proposition 6.2. There exists a universal constant $C > 0$ such that
\[ \inf_{\|\Psi\|=1} Q^{\mathrm{per},N}_\Omega(\Psi) = \lambda_{1,z} + m^N_{B,\Omega}\,(1 - C\, r_B^N) \tag{6.4} \]
with
\[ 0 \le r_B^N \le m^N_{B,\Omega}\,(\delta_z^N)^{-1} + g^{1/4}\,(\delta_z^N)^{-1/8}\,(m^N_{B,\Omega})^{1/4}\Bigl(1 + \frac{\lambda_{1,z}}{\delta_z^N}\Bigr)^{1/8}. \tag{6.5} \]

Proof. Essentially this corresponds to exchanging the roles of (A) and (B). We start from a minimizer $\Psi$ and first write
\[ \Psi = \Pi_N\Psi + w \tag{6.6} \]
where $\Pi_N$ is the orthogonal projection relative to the first $N$ eigenfunctions of $H_z$ introduced in (1.34). We have the lower bound
\[ \int_{\mathbb R^2_{x,y}} E_A(w)\,dxdy \ge \lambda_{N+1,z}\int_{\mathbb R^2\times]-\frac{NT}{2},+\frac{NT}{2}[} |w(x,y,z)|^2\,dxdydz, \tag{6.7} \]
with
\[ E_A(\phi) := \int_{-NT/2}^{NT/2}\Bigl(\frac12\phi'(z)^2 + \frac{1}{\varepsilon^2}\,w(z)\,\phi(z)^2\Bigr)dz. \tag{6.8} \]
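We now rewrite the energy in the form
\[ Q^{\mathrm{per},N}_\Omega(\Psi) = \int_{-NT/2}^{NT/2} E_{B,\Omega}(\Psi)\,dz + \int_{\mathbb R^2_{x,y}} E_A(\Pi_N\Psi)\,dxdy + \int_{\mathbb R^2_{x,y}} E_A(w)\,dxdy + I_N(\Psi), \tag{6.9} \]
with
\[ I_N(\Psi) = g\int |\Psi|^4\,dxdydz, \tag{6.10} \]
and
\[ E_{B,\Omega}(\psi) = \int_{\mathbb R^2_{x,y}}\Bigl(\frac12|\nabla_{x,y}\psi - i\Omega r^\perp\psi|^2 + \frac12(\omega_\perp^2-\Omega^2)\,r^2|\psi|^2\Bigr)dxdy, \tag{6.11} \]
with $r^\perp = (-y, x)$. We note that $I_N \ge 0$ and that
\[ E_{B,\Omega}(\psi) \ge \omega_\perp\,\|\psi\|^2. \tag{6.12} \]
We first get the control of $\|w\|^2$. Having in mind (1.64), we obtain
\[ \lambda_{1,z} + m^N_{B,\Omega} \ge Q^{\mathrm{per},N}_\Omega(\Psi) \ge \omega_\perp + \lambda_{N+1,z}\,\|w\|^2 + \lambda_{1,z}\,\|\Pi_N\Psi\|^2 \tag{6.13} \]
and this implies
\[ \|w\|^2 \le \frac{m^N_{B,\Omega}}{\delta_z^N}. \tag{6.14} \]
The right-hand side in (6.14) is small according to (6.1). Note also that we have immediately from (6.6),
\[ \|\Pi_N\Psi\|^2 \ge 1 - \frac{m^N_{B,\Omega}}{\delta_z^N}. \tag{6.15} \]
We now have to control the derivatives of $w$. For the transverse control, we start from
\[ \lambda_{1,z} + m^N_{B,\Omega} \ge \lambda_{1,z} + \frac12\int_{\mathbb R^2_{x,y}\times]-\frac{NT}{2},\frac{NT}{2}[} |\nabla_{x,y}w - i\Omega r^\perp w|^2\,dxdydz, \tag{6.16} \]
which leads to
\[ \bigl\|\,|\nabla_{x,y}w - i\Omega r^\perp w|\,\bigr\|^2 \le 2\,m^N_{B,\Omega}. \]
For the longitudinal control, we write, for any $\alpha \in [0, 1]$,
\[ \lambda_{1,z} + m^N_{B,\Omega} \ge \lambda_{1,z}\,\|\Pi_N\Psi\|^2 + \frac{\alpha}{2}\,\|\partial_z w\|^2 + \lambda_{N+1,z}(1-\alpha)\,\|w\|^2. \tag{6.17} \]
We determine $\alpha$ by writing
\[ \lambda_{N+1,z}(1-\alpha) = \lambda_{1,z}, \tag{6.18} \]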
hence
\[ \alpha = 1 - \frac{\lambda_{1,z}}{\lambda_{N+1,z}}. \tag{6.19} \]
So we have
\[ \|\partial_z w\|^2 \le \frac{2}{\alpha}\,m^N_{B,\Omega} \le 2\,\frac{\lambda_{N+1,z}}{\delta_z^N}\,m^N_{B,\Omega}. \tag{6.20} \]
In the semi-classical regime where we are, this leads to the existence of a constant $C$ such that
\[ \|\partial_z w\|^2 \le C\,m^N_{B,\Omega}. \tag{6.21} \]
Using in addition the diamagnetic inequality, we obtain
\[ \|\nabla|w|\|_2^2 \le C\,m^N_{B,\Omega}. \tag{6.22} \]
As in the other case, we obtain from Sobolev's inequality the control of $w$ in the $L^6$ norm,
\[ \|w\|_6 \le C\,(m^N_{B,\Omega})^{1/2}\Bigl(1 + \frac{1}{\delta_z^N}\Bigr)^{1/3} \le \tilde C\,(m^N_{B,\Omega})^{1/2}, \tag{6.23} \]
where we have used that $\delta_z^N \gg 1$ in the semi-classical regime. Using Hölder's inequality, we obtain
\[ \|w\|_4 \le C\,(m^N_{B,\Omega})^{1/2}\,(\delta_z^N)^{-1/8}. \tag{6.24} \]
We now have all the estimates needed to mimic the proof of case A. We start from
\[ Q^{\mathrm{per},N}_\Omega(\Psi) \ge \lambda_{1,z} + E_B(\Pi_N\Psi) - 4g\int |\Pi_N\Psi|^3\,|w|\,dxdydz. \tag{6.25} \]
We now have to control the third term in (6.25) by the second term. This is done as in case A, in the following way:
\[ 4g\int |\Pi_N\Psi|^3|w|\,dxdydz \le 4g\,\|\Pi_N\Psi\|_4^3\,\|w\|_4 \le C_1\, g^{1/4}\,(\delta_z^N)^{-1/8}\,(E_B(\Pi_N\Psi))^{3/4}\,(m^N_{B,\Omega})^{1/2}. \tag{6.26} \]
We now use
\[ E_B(\Pi_N\Psi) \ge m^N_{B,\Omega}\,\|\Pi_N\Psi\|_2^4, \tag{6.27} \]
which together with (6.14) leads to
\[ m^N_{B,\Omega} \le C\Bigl(1 + \frac{m^N_{B,\Omega}}{\delta_z^N}\Bigr)E_B(\Pi_N\Psi). \tag{6.28} \]
This leads to
\[ 4g\int |\Pi_N\Psi|^3|w|\,dxdydz \le C_2\, g^{1/4}\,(m^N_{B,\Omega})^{1/4}\,(\delta_z^N)^{-1/8}\Bigl(1 + \frac{m^N_{B,\Omega}}{\delta_z^N}\Bigr)E_B(\Pi_N\Psi). \tag{6.29} \]
Using this control, (6.14), (6.25) and (6.27), we have obtained the detailed proof of (6.4) in the general case.
6.3. On the minimizers of $E_B$

In order to get bounds for $m_{B,\Omega}$, we can analyze the case $\Omega = 0$. It is standard (see [2] or [21]) to prove

Proposition 6.3. The minimizer of $E_B$ over the normalized $\psi$'s is unique (up to a multiplicative constant of modulus 1) and radial.

If $\psi$ is radial, we have that $E_{B,\Omega}(\psi) = E_B(\psi)$. Therefore, we get the following:

Corollary 6.4. We always have
\[ \inf E_{B,\Omega} := m_{B,\Omega} \le m_B. \tag{6.30} \]

6.4. Proof of Theorems 1.5 and 1.6

The issue is to determine the magnitude of the infimum of the energy of the transverse problem, $m^N_{B,\Omega}$.

6.4.1. Reduction to the case N = 1

As in Case A, it is immediate to see that
\[ m^N_{B,\Omega} \le m_{B,\Omega}\Bigl(\frac{\tilde g}{N}, \omega_\perp\Bigr). \tag{6.31} \]
Indeed, if $\psi_{\min,N}$ is the $T$-periodic minimizer for (1.39) with $g_N = \tilde g/N$, we get (6.31) by using (1.27), (2.21) and taking $\psi_{j,\perp} = \frac{1}{\sqrt N}\psi_{\min,N}$. So it remains to analyze the case $N = 1$. This depends on the magnitude of $\tilde g$ and leads us to consider two cases.

6.4.2. The Weak Interaction regime: Case N = 1

Proposition 6.5. If (1.66) holds, then
\[ m_{B,\Omega}(\tilde g, \omega_\perp) \le C\,\omega_\perp. \tag{6.32} \]
Indeed, (1.66) implies that $\tilde g$ is bounded, and the test function $\psi_\perp$ (which is independent of $\Omega$) implies the proposition. Therefore, if (1.66) and (1.67) are satisfied, then Theorem 6.1 holds and implies Theorem 1.5.

6.4.3. The Thomas–Fermi regime: Case N = 1

We start with the case when $\Omega = 0$. When $\tilde g$ is not bounded, we can meet a Thomas–Fermi situation.
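Proposition 6.6. If $\tilde g \to +\infty$, the function $m_B(\tilde g, \omega_\perp)$ satisfies
\[ m_B(\tilde g, \omega_\perp) \sim c_{TF}\,\omega_\perp\sqrt{\tilde g}, \tag{6.33} \]
with
\[ c_{TF} = \frac{\pi}{24}\,\lambda^3 = 3^{-1}\,2^{3/2}\,\pi^{-1/2}. \tag{6.34} \]
Therefore, if (1.70)–(1.72) are satisfied, then Theorem 6.1 implies Theorem 1.6.

Proof. A rescaling of lengths by $(\sqrt{\tilde g}/\omega_\perp)^{1/2}$ yields a new energy
\[ u \mapsto \frac{\omega_\perp}{2}\int_{\mathbb R^2}\Bigl(\frac{1}{\sqrt{\tilde g}}\,|\nabla u|^2 + \sqrt{\tilde g}\,r^2|u|^2 + 2\sqrt{\tilde g}\,|u|^4\Bigr)dxdy, \]
which is of the Thomas–Fermi type (that is, the kinetic energy can be neglected) if
\[ \frac{1}{\sqrt{\tilde g}} \ll \sqrt{\tilde g}. \tag{6.35} \]
This leads then simply to the TF reduced functional
\[ u \mapsto (\omega_\perp\sqrt{\tilde g})\int_{\mathbb R^2}\Bigl(\frac12 r^2|u|^2 + |u|^4\Bigr)dxdy, \]
whose infimum over the unit sphere in $L^2(\mathbb R^2)$ is $c_{TF}\,(\omega_\perp\sqrt{\tilde g})$, with $c_{TF} > 0$ defined by:
\[ c_{TF} = \inf_{\|u\|_2=1}\int_{\mathbb R^2}\Bigl(\frac12 r^2|u(x,y)|^2 + |u(x,y)|^4\Bigr)dxdy. \tag{6.36} \]
The minimizer exists and is explicitly known as
\[ u_{\min}(x,y) = \frac12\,(\lambda - r^2)_+^{1/2} \]
with
\[ \lambda = 2^{3/2}\,\pi^{-1/2}. \]
This leads to (6.34). In addition, by a careful computation ([2]) we obtain more precisely

Lemma 6.7. There exists $c$ such that, as $\tilde g$ tends to $+\infty$,
\[ \frac{m_B}{\omega_\perp} = c_{TF}\sqrt{\tilde g} + \frac{c}{\sqrt{\tilde g}}\ln\tilde g + O\Bigl(\frac{1}{\sqrt{\tilde g}}\Bigr), \tag{6.37} \]
with $c_{TF}$ defined in (6.36).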
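The explicit values in (6.34) can be checked by direct quadrature on the radial profile $u_{\min}$. The sketch below is only a numerical illustration: it integrates in polar coordinates with a midpoint rule and compares with $\frac{\pi}{24}\lambda^3$; the function names are illustrative.

```python
import math

LAM2 = 2 ** 1.5 / math.sqrt(math.pi)      # lambda = 2^(3/2) * pi^(-1/2)

def u_min_sq(r):
    """Square of the 2D Thomas-Fermi profile: u_min^2 = (lambda - r^2)_+ / 4."""
    return 0.25 * max(LAM2 - r * r, 0.0)

def radial_integral(f, rmax, n=20000):
    """Integral over the disc of radius rmax of a radial function f (midpoint rule in r)."""
    h = rmax / n
    return 2 * math.pi * h * sum(f((k + 0.5) * h) * (k + 0.5) * h for k in range(n))

def tf2d_mass_and_energy():
    """Return (int |u|^2, int (r^2 |u|^2 / 2 + |u|^4)) for u = u_min."""
    rmax = math.sqrt(LAM2)
    mass = radial_integral(u_min_sq, rmax)
    energy = radial_integral(lambda r: 0.5 * r * r * u_min_sq(r) + u_min_sq(r) ** 2, rmax)
    return mass, energy
```

The mass evaluates to 1 and the energy to $\frac{\pi}{24}\lambda^3 = 2^{3/2}/(3\sqrt\pi) \approx 0.532$, up to quadrature error.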
Remark 6.8. Note that we have the universal lower bound
\[ m_B(\tilde g, \omega_\perp) \ge c_{TF}\,\omega_\perp\sqrt{\tilde g}. \tag{6.38} \]
This lower bound becomes better than the universal lower bound by $\omega_\perp$ as soon as
\[ c_{TF}\sqrt{\tilde g} > 1. \tag{6.39} \]

Remark 6.9. In the semi-classical regime, conditions (BTFa) and (BTFc) in Theorem 1.6 (take their product) imply that this two-dimensional energy is much smaller than $1/\varepsilon$, that is
\[ \omega_\perp\,g^{1/2}\,\varepsilon^{-1/4} \ll \varepsilon^{-1}. \tag{6.40} \]
We now look at the case when $\Omega > 0$. The previous proof, using that the minimizer of the TF reduced functional in (6.36) is radial, yields

Proposition 6.10. There exists $C$ such that, as $\tilde g \to +\infty$,
\[ m_{B,\Omega}(\tilde g, \omega_\perp) \le m_B(\tilde g, \omega_\perp) + C\,\ln\tilde g\;\tilde g^{-1/2}. \tag{6.41} \]
This will be improved in (6.30) by a direct study of the minimizer of $E_{B,\Omega}$.

Remark 6.11. For a lower bound, we can use the TF reduced functional
\[ I_\Omega(u) = \omega_\perp\sqrt{\tilde g}\int_{\mathbb R^2}\Bigl(\frac12(1 - \Omega^2/\omega_\perp^2)\,r^2|u|^2 + |u|^4\Bigr)dxdy \]
whose minimum is explicit:
\[ \inf_{\|u\|=1} I_\Omega(u) = \omega_\perp\sqrt{\tilde g}\;c_{TF}\,(1 - \Omega^2/\omega_\perp^2)^{1/2}. \]
Thus we get that, if there exists $\beta \in [0, 1[$ such that
\[ 0 \le \Omega/\omega_\perp \le \beta, \tag{6.42} \]
then, as $\tilde g \to +\infty$,
\[ m_{B,\Omega}(\tilde g, \omega_\perp) \approx \omega_\perp\sqrt{\tilde g}. \tag{6.43} \]
The uniformity of the approximation depends on $\beta$. In fact, if one wants a more precise expansion of the energy, one can use the ground state $\rho$ of $I_\Omega$ to split the energy $E_{B,\Omega}(u)$. Indeed the Euler–Lagrange equation for $\rho$ multiplied by $(1 - |u|^2)$ for any function $u$ yields the identity (see [2])
\[ E_{B,\Omega}(u) = I_\Omega(\rho) + \rho^2|\nabla v - i\Omega\times r\,v|^2 + \tilde g\rho^4(1 - |v|^2)^2 \]
where $v = u/\rho$. Thus, $I_\Omega$ always provides a lower bound with an inverted parabola profile as soon as we are in a TF situation. The second part of the energy contains the vortex contribution, which is of lower order when $\Omega/\omega_\perp \ll 1$. More precisely, the first vortex is observed for a velocity $\Omega$ of order $\omega_\perp \ln\tilde g/\sqrt{\tilde g}$. When $\Omega$ increases and becomes at most like $\beta\omega_\perp$ with $\beta < 1$, the two parts of the energy $I(\rho)$ and the
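rest become of similar magnitude. In the limit $\Omega \to \omega_\perp$, there are a lot of vortices and the description can be made with the lowest Landau level set of states. The leading order term of the energy is the first eigenvalue of $-(\nabla - i\Omega\times r)^2$, which is equal to $\Omega$.

6.5. Lower bounds in the TF case (N ≥ 1)

In the proof of Theorem 1.6, we need a lower bound of $m^N_{B,\Omega}$, which will be established in this subsection. We start from a minimizer $(\psi_{\ell,\perp})_\ell$. Due to the normalization, there exists at least one $j$ such that
\[ \|\psi_{j,\perp}\| \ge \frac{1}{\sqrt N}. \]
Then we write (neglecting the kinetic part)
\[ m^N_{B,\Omega} \ge \frac12(\omega_\perp^2 - \Omega^2)\int r^2|\psi_{j,\perp}|^2 + g\int_{-NT/2}^{NT/2}\int_{\mathbb R^2_{x,y}}\Bigl|\sum_{\ell=0}^{N-1}\psi_\ell^N(z)\,\psi_{\ell,\perp}(x,y)\Bigr|^4 dz\,dxdy. \]
When expanding $\bigl|\sum_{\ell=0}^{N-1}\psi_\ell^N(z)\psi_{\ell,\perp}(x,y)\bigr|^4$, the mixed terms are exponentially small (see Sec. 7.1) in comparison to $\sum_\ell\|\psi_{\ell,\perp}\|_{L^4}^4$, hence we get, for some $\alpha > 0$,
\[ m^N_{B,\Omega} \ge \frac12(\omega_\perp^2 - \Omega^2)\int r^2|\psi_{j,\perp}|^2 + g\int_{-NT/2}^{NT/2}\psi_0^N(z)^4\,dz\int(\psi_{j,\perp})^4\,dxdy\;\Bigl(1 - \exp\Bigl(-\frac{\alpha}{\varepsilon}\Bigr)\Bigr). \]
We now use (7.4) to obtain
\[
\begin{aligned}
m^N_{B,\Omega} &\ge \frac12(\omega_\perp^2 - \Omega^2)\int r^2|\psi_{j,\perp}|^2 + g\int_{-T/2}^{T/2}\phi_1(z)^4\,dz\int\psi_{j,\perp}^4\,dxdy\;\Bigl(1 - \exp\Bigl(-\frac{\alpha}{\varepsilon}\Bigr)\Bigr) \\
&= \frac12(\omega_\perp^2 - \Omega^2)\int r^2|\psi_{j,\perp}|^2 + \tilde g\int\psi_{j,\perp}^4\,dxdy\;\Bigl(1 - \exp\Bigl(-\frac{\alpha}{\varepsilon}\Bigr)\Bigr) \\
&\ge \Bigl(\frac12(\omega_\perp^2 - \Omega^2)\int r^2|\psi_{j,\perp}|^2 + \tilde g\int\psi_{j,\perp}^4\,dxdy\Bigr)\Bigl(1 - \exp\Bigl(-\frac{\alpha}{\varepsilon}\Bigr)\Bigr) \\
&\ge \frac{1}{N^2}\Bigl(1 - \exp\Bigl(-\frac{\alpha}{\varepsilon}\Bigr)\Bigr)\inf_{\psi,\,\|\psi\|=1}\Bigl(\frac12(\omega_\perp^2 - \Omega^2)\int r^2|\psi|^2 + \tilde g\int\psi^4\,dxdy\Bigr).
\end{aligned}
\]
One can then use the asymptotics obtained in the proof of (6.43) to get, under Assumption (6.42), the existence of $C_{N,\beta} > 0$ such that, as $\varepsilon$ tends to 0 and $\tilde g$ to $\infty$,
\[ m^N_{B,\Omega} \ge \frac{1}{C_{N,\beta}}\,\omega_\perp\sqrt{\tilde g}. \tag{6.44} \]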
7. Tunneling Effects for the Nonlinear Models

It is only in this section that we will exhibit the role of the localized $(NT)$-periodic Wannier functions.

7.1. Towards the DNLS model

7.1.1. Preliminaries

Our aim in this section is to discuss possible asymptotics for $m_A^N$ in the case when $N > 1$, which will involve the tunneling effect. Although we have no final result on this part, we would like to show how we reach a familiar model considered by physicists (see [22, 26, 36]): a discrete model called the DNLS model. In particular we will describe in Proposition 7.6 under which assumptions one can get a simplified model. The starting point in this subsection is that we replace the issue of minimizing $E_A^{N,\varepsilon,\hat g}$ on the $(NT)$-periodic $L^2$-normalized functions by restricting the approximation to the eigenspace $\operatorname{Im}\pi_N$ associated with the first $N$ eigenvalues of the linear problem.

7.1.2. Projecting on the eigenspace $\operatorname{Im}\pi_N$

Our aim is to analyze the reduced functional
\[ \mathbb C^N \ni c = (c_j)_{j=0,\dots,N-1} \mapsto E_A^{N,\varepsilon,\hat g,\mathrm{red}}(c) = E_A^{N,\varepsilon,\hat g}\Bigl(\sum_{j=0}^{N-1} c_j\,\psi_j^N\Bigr), \tag{7.1} \]
where $E_A^{N,\varepsilon,\hat g}$ is the former $E_A^N$ given in (1.36), with the explicit notation of the dependence on the parameters, and the $\psi_j^N$ are the $(NT)$-periodic Wannier functions. When $N = 1$, the error which is made has been estimated in (5.8) under the assumption that $\hat g\,\varepsilon^{1/2}$ is small, i.e. (1.51). Replacing in the argument the projection on the first eigenspace by $\pi_N$, the same result holds for $N > 1$. So we have:

Proposition 7.1. Under condition (1.51),
\[ m_A^N(\varepsilon,\hat g) = m_A^{N,(0)}(\varepsilon,\hat g) + O(\hat g^{3/2}\varepsilon^{-1/4}), \tag{7.2} \]
with
\[ m_A^{N,(0)}(\varepsilon,\hat g) := \inf_{\{c \,\mid\, \sum_{j=0}^{N-1}|c_j|^2 = 1\}} E_A^{N,\varepsilon,\hat g,\mathrm{red}}(c). \tag{7.3} \]
We now concentrate our discussion on the model obtained after this first approximation. More specifically, we are interested in the asymptotics of $m_A^{N,(0)}(\varepsilon,\hat g)$.

7.1.3. Neglecting the tunneling

Let $\lambda^N_{1,z} = \lambda_{1,z}$ be the bottom of the $(NT)$-periodic spectrum of $H_z$ on $]-\frac{NT}{2}, \frac{NT}{2}[$. Strictly speaking, we can start the analysis of this first approximate model only under condition (1.51).
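Neglecting the tunneling effect, we are led to the minimum of the functional
\[ \mathbb C^N \ni c \mapsto E_A^{N,\varepsilon,\hat g,(1)}(c) := \lambda_{1,z}\Bigl(\sum_{j=0}^{N-1}|c_j|^2\Bigr) + \hat g\Bigl(\sum_{j=0}^{N-1}|c_j|^4\Bigr)\Bigl(\int_{-NT/2}^{NT/2}|\psi_0^N(z)|^4\,dz\Bigr), \]
over the $c$'s such that
\[ \sum_{j=0}^{N-1}|c_j|^2 = 1. \]
Observing (see [15]) that
\[ \int_{-NT/2}^{NT/2}|\psi_0^N(z)|^4\,dz = \int_{-T/2}^{T/2}\phi_1(z)^4\,dz + \tilde O\Bigl(\exp\Bigl(-\frac{S}{2\varepsilon}\Bigr)\Bigr), \tag{7.4} \]
where $\phi_1$ is the ground state of the $T$-periodic problem, the minimum of this approximate functional, which is attained for $c_j = N^{-1/2}$, is
\[ m_A^{N,(1)} = \lambda_{1,z} + \frac{\hat g}{N}\int_{-NT/2}^{NT/2}|\psi_0^N(z)|^4\,dz. \tag{7.5} \]
So as a first approximation, we have obtained

Proposition 7.2.
\[ m_A^{N,(0)}(\varepsilon,\hat g) = \lambda_{1,z} + \frac{\hat g}{N}\int_{-NT/2}^{NT/2}|\psi_0^N(z)|^4\,dz + (\hat g + 1)\,\tilde O\Bigl(\exp\Bigl(-\frac{S}{\varepsilon}\Bigr)\Bigr), \]
or
\[ m_A^{N,(0)}(\varepsilon,\hat g) = \lambda_{1,z} + \frac{\hat g}{N}\int_{-T/2}^{T/2}\phi_1(z)^4\,dz + \hat g\,\tilde O\Bigl(\exp\Bigl(-\frac{S}{2\varepsilon}\Bigr)\Bigr) + \tilde O\Bigl(\exp\Bigl(-\frac{S}{\varepsilon}\Bigr)\Bigr). \tag{7.6} \]
The definition of $\tilde O$ is given in (1.29). If we apply this result to our context, where $\hat g$ is proportional to $\omega_\perp g$, this yields information on the behavior of $m_A^{N,(0)}$ independently of assumption (1.51).

7.1.4. Taking into account the tunneling

If we keep the main tunneling term, we get the following more accurate approximating functional
\[ \mathbb C^N \ni c \mapsto E_A^{N,\varepsilon,\hat g,(2)}(c) := \tilde\lambda_1\Bigl(\sum_{j=0}^{N-1}|c_j|^2\Bigr) - \tau\sum_{j=0}^{N-1} c_j\,\bar c_{j+1} + \hat g\Bigl(\sum_{j=0}^{N-1}|c_j|^4\Bigr)\Bigl(\int_{-NT/2}^{NT/2}|\psi_0^N(z)|^4\,dz\Bigr). \tag{7.7} \]
Here $\tau$ is the hopping amplitude introduced around (3.12), $\tilde\lambda_1$ is the lowest eigenvalue corresponding to the Floquet condition $k = \frac N2$ for the linear problem on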
$]-\frac T2, \frac T2[$, which is exponentially close to $\lambda_1$, and we take the convention that $c_N = c_0$. The quadratic form corresponds to the approximation in the first band:
\[ \mathbb C^N \ni c \mapsto \tilde\lambda_1\sum_{j=0}^{N-1}|c_j|^2 - \tau\sum_{j=0}^{N-1} c_j\,\bar c_{j+1}, \tag{7.8} \]
which can be shown to be correct modulo $\tilde O(\exp(-\frac{2S}{\varepsilon}))$.

Remark 7.3. This time the minimizer could depend on $\hat g$. This is the kind of problem which is analyzed in [22].

Discussion about the justification of $E_A^{N,\varepsilon,\hat g,(2)}$

One can wonder why we forget some terms in the computation. Let us do this more carefully. To be consistent with what we forget in the linear case (terms of order $\tilde O(\tau^2)$), we show first that one can approximate $\hat g\int_{-NT/2}^{NT/2}\bigl|\sum_{j=0}^{N-1} c_j\psi_j^N(z)\bigr|^4 dz$ by means of
\[
\begin{aligned}
\int_{-NT/2}^{NT/2}\Bigl|\sum_{j=0}^{N-1} c_j\,\psi_j^N(z)\Bigr|^4 dz ={}& \Bigl(\sum_{j=0}^{N-1}|c_j|^4\Bigr)\int_{-NT/2}^{NT/2}|\psi_0^N|^4\,dz \\
&+ 2\sum_{j=0}^{N-1}\bigl(|c_j|^2 + |c_{j+1}|^2\bigr)\bigl(c_j\,\bar c_{j+1} + c_{j+1}\,\bar c_j\bigr)\int_{-NT/2}^{NT/2}\psi_0^N(z)\,|\psi_0^N(z)|^2\,\psi_1^N(z)\,dz \\
&+ \tilde O(\tau^2).
\end{aligned} \tag{7.9}
\]
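The combinatorial coefficients of the nearest-neighbour terms in an expansion of this type can be checked on a two-site toy case. In the sketch below, `s` and `t` stand for the real values of two neighbouring Wannier functions at a point, and `a`, `b` for the corresponding complex coefficients; these names, and the restriction to two sites, are assumptions made for illustration. The $s^3t$ and $st^3$ terms carry the factor $2(|a|^2 s^3 t + |b|^2 s t^3)(a\bar b + \bar a b)$, while the $s^2t^2$ terms are of the type controlled by Lemma 7.4.

```python
import random

def two_site_quartic(a, b, s, t):
    """Expansion of |a*s + b*t|^4 for complex a, b and real s, t."""
    c = 2 * (a * b.conjugate()).real        # a*conj(b) + conj(a)*b
    return (abs(a) ** 4 * s ** 4 + abs(b) ** 4 * t ** 4
            + 2 * c * (abs(a) ** 2 * s ** 3 * t + abs(b) ** 2 * s * t ** 3)
            + (2 * abs(a) ** 2 * abs(b) ** 2 + c ** 2) * s ** 2 * t ** 2)

def max_error(samples=1000, seed=2):
    """Largest deviation between |a*s + b*t|^4 and its expansion over random samples."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        a = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
        b = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
        s, t = rng.uniform(-2, 2), rng.uniform(-2, 2)
        worst = max(worst, abs(abs(a * s + b * t) ** 4 - two_site_quartic(a, b, s, t)))
    return worst
```

Calling `max_error()` returns a value at the level of floating-point rounding, which confirms the coefficients of the mixed terms.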
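This first approximation is based on the following lemma.

Lemma 7.4.
\[ \int_{-NT/2}^{NT/2}\psi_0^N(z)^2\,\psi_1^N(z)^2\,dz = \tilde O\Bigl(\exp\Bigl(-\frac{2S}{\varepsilon}\Bigr)\Bigr). \]
This is based on the property that, for all $\eta > 0$, there exists $C_\eta$ such that
\[ |\psi_0^N(z)| \le C_\eta\exp\Bigl(\frac{\eta}{\varepsilon}\Bigr)\exp\Bigl(-\frac{1}{\varepsilon}\,d^{\mathrm{mod}}_{Ag}(z)\Bigr), \tag{7.10} \]
where $d^{\mathrm{mod}}_{Ag}(z)$ is an even function such that
\[ d^{\mathrm{mod}}_{Ag}(z, 0) = \sqrt2\int_0^z\sqrt{w(t)}\,dt, \qquad \text{for } z \in [0, T[, \]
and such that $d^{\mathrm{mod}}_{Ag}(z, 0)$ is increasing for $z \ge 0$.

g We use here the assumption that the potential, and hence $\psi_0^N$, is even. We recall also that the $\psi_j$ are real.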
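On the contrary, it is a priori unclear$^{h}$ why one could neglect terms like

$$\hat\tau = \hat g \int_{-\frac{NT}{2}}^{\frac{NT}{2}} \psi_0^N(z)^3\, \psi_1^N(z)\, dz \tag{7.11}$$

(where we recall that $w$ is even by assumption (1.1) and that this implies that $\psi_0^N$ is even and real). This term is a priori of the same order as $\tau$. Indeed, we have:

Lemma 7.5.

$$\int_{-\frac{NT}{2}}^{+\frac{NT}{2}} \psi_0^N(z)^3\, \psi_1^N(z)\, dz = O\left(\exp\left(-\frac{S}{\epsilon}\right)\right). \tag{7.12}$$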
Due to the decay estimates (7.10) for these $(NT)$-Wannier functions, the term to integrate in (7.12) decays like

$$O\left(\exp\left(-\frac{1}{\epsilon}\left(3\, d^{\mathrm{mod}}_{Ag}(z) + d^{\mathrm{mod}}_{Ag}(z - T)\right)\right)\right),$$

so the main contribution comes from the origin and has the same size as $\exp\left(-\frac{S}{\epsilon}\right)$. So it is necessary to be careful,$^{i}$ if one wants to neglect $\hat\tau$.

Let us now try to estimate $\int_{-\frac{NT}{2}}^{+\frac{NT}{2}} \psi_0^N(z)^3 \psi_1^N(z)\, dz$ as $\epsilon \to 0$ more precisely. Heuristically, one can try to use a WKB approximation. This is available for $\psi_0^N$ in the neighborhood of 0 but, unfortunately, we do not have a good WKB approximation of $\psi_1^N(z)$ close to the origin, as observed in Sec. 3.3 (see (3.23)). So we have no obvious main term for the asymptotic behavior of $\int_{-\frac{NT}{2}}^{+\frac{NT}{2}} \psi_0^N(z)^3 \psi_1^N(z)\, dz$.

A reasonable guess (which is implicitly used by the physicists) should be that:

$$\hat\tau = \hat g\, \tau\, o(1), \quad \text{as } \epsilon \to 0. \tag{7.13}$$

The weaker mathematical result, which is obtained from Lemma 7.5, is the following

$$\hat\tau = \hat g\, \tau\, O(1), \quad \text{as } \epsilon \to 0. \tag{7.14}$$

This leads to the proposition.

Proposition 7.6. Under the assumption that there exists $\eta > 0$ such that

$$0 \le \hat g \exp\left(\frac{\eta}{\epsilon}\right) \le 1, \tag{7.15}$$

then

$$m_A^{N,(0)} = m_A^{N,(2)} + o(\tau) \tag{7.16}$$

holds.

This gives a motivation for the analysis of the DNLS model of [36] (with an extra term in $\lambda \sum_{j=0}^{N-1} |c_j|^2$).

$^{h}$ In [22, p. 5], between formulas (18) and (19), the term $\hat\tau$ is discussed; see also p. 6 around formula (20).
$^{i}$ We thank M. Snoek for kindly answering our questions on this problem.
If we consider the (N T )-periodic Floquet problem, we arrive naturally to questions analyzed in [22, (16)–(18)], and the remark after (21) in this paper. 7.2. On approximate models in case B: Towards Snoek’s model N introduced in Using the basis of the (N T )-Wannier functions, we can consider EB (1.43) and consider the decomposition N (ψ0,⊥ , . . . , ψN −1,⊥ ) EB
4 N −1 N N := EB (ψ0,⊥ , . . . , ψN −1,⊥ ) + g ψj (z)ψj,⊥ (x, y) . j=0 4 L
We now use various approximations related to the analysis of the z-problem ((N T )-Wannier functions). We get
N (ψ0,⊥ , . . . , ψN −1,⊥ ) EB
∼s
N −1
ψj,⊥ + t 2
N −1
j=0
and
(ψj,⊥ , ψj+1,⊥ + ψj,⊥ , ψj−1,⊥ ) ,
j=0
4 N −1 N −1 4 g ψ (z)ψ (x, y) ∼ g ψ ψj,⊥ 4L4 . j j,⊥ 0 L4 j=0 4 j=0 L
So the approximate functional becomes N −1 1 N,approx |∇ψ⊥,j |2 + V (x, y)|ψj,⊥ (x, y)|2 dxdy ((ψj,⊥ )j ) = EB 2 2 j=0 R N −1
+s
ψj,⊥ 2
j=0
+t
N −1
(ψj,⊥ , ψj+1,⊥ + ψj,⊥ , ψj−1,⊥ )
j=0
+ g
N −1
ψj,⊥ 4L4 ,
j=0
which should be minimized over the (ψj,⊥ )j such that N −1
ψj,⊥ 2 = 1.
j=0
This is the model described by Snoek [33].
(7.17)
Starting from this model, one can, depending on the size of the various parameters, come back in some cases to the situation when $(\psi_{j,\perp})_j$ is of the form $c_j \psi_\perp$, with $\sum_{j=0}^{N-1} |c_j|^2 = 1$. In this case, we come back to the results of the previous subsection. In other cases, the problem seems completely open. This regime should lead to situations where vortices in the slice $j$ are coupled with the neighboring slices. This is still to be analyzed.

8. Conclusion

In this paper, we have analyzed the $(NT)$-periodic problem. Case B, which leads to $N$ coupled nonlinear problems, provides many interesting directions of work. Other related models are still to be analyzed in relationship with our paper. For instance, it is natural to study the full 3D problem with a constraint on the $L^2$ norm and the harmonic trapping potential also in the $z$ direction. Another natural physical problem would be to analyze the quantity

$$\lim_{N_c \to +\infty} \frac{1}{N_c} \inf_{\int_{\mathbb{R}^2 \times ]-\frac{NT}{2},\, +\frac{NT}{2}[} |\Psi|^2\, dx = N_c} Q^{\mathrm{per},N}_{\Omega}(\Psi)$$
where we compute the energy by integrating over $N$ periods and where $N_c/N = \nu$ ($\nu$ fixed). Upper bounds for this model are the periodic models with $g$ replaced by $g\nu$. This point of view appears for example in [22] for discrete models.

A related question is to analyze under which condition a minimizer of the $(NT)$-periodic problem is actually $T$-periodic. The general answer is unknown. One suspects by bifurcation arguments that it is true for $g$ and $\Omega$ small enough, but physicists seem to expect other situations.

The discrete nonlinear model seems to appear in other contexts. It is addressed in [26]. A number of their results would require some rigorous justifications, for instance, the stability analysis.

Acknowledgments

We would like to thank X. Blanc for his careful reading of the manuscript and for discussions. We also thank M. Snoek for helpful discussions. This work is partially supported by the French ministry grant ANR-BLAN-0238, named VoLQuan.

References

[1] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions, Applied Math Series, Vol. 55 (National Bureau of Standards, 1964).
[2] A. Aftalion, Vortices in Bose–Einstein Condensates, Progress in Nonlinear Differential Equations and Their Applications, Vol. 67 (Birkhäuser, 2006).
[3] A. Aftalion, On the energy of a Bose–Einstein condensate in an optical lattice, Rev. Math. Phys. 19(4) (2007) 371–384.
[4] A. Aftalion and X. Blanc, Reduced energy functionals for a three dimensional fast rotating Bose–Einstein condensates, to appear in Ann. Inst. H. Poincaré Anal. Non Linéaire 25(2) (2008) 339–355.
[5] A. Aftalion, X. Blanc and F. Nier, Lowest Landau level functional and Bargmann transform in Bose–Einstein condensates, J. Funct. Anal. 241 (2006) 661–702.
[6] A. Aftalion and B. Helffer, On mathematical models for Bose–Einstein condensates in optical lattices (expanded version), preprint (May 2008); revised in (October 2008); http://fr.arxiv.org/abs/0810.4003.
[7] S. Alama, A. J. Berlinsky and L. Bronsard, Minimizers of the Lawrence–Doniach energy in the small-coupling limit: Finite width samples in a parallel field, Ann. Inst. H. Poincaré Anal. Non Linéaire 19(3) (2002) 281–312.
[8] S. Alama, A. J. Berlinsky and L. Bronsard, Periodic lattices for the Lawrence–Doniach energy of layered superconductors in a parallel field, Commun. Contemp. Math. 3(3) (2001) 457–404.
[9] S. Alama, L. Bronsard and E. Sandier, On the shape of interlayer vortices in the Lawrence–Doniach model, Trans. Amer. Math. Soc. 360 (2008) 1–34.
[10] I. Bloch, J. Dalibard and W. Zwerger, Many-body physics with ultracold gases, Rev. Mod. Phys. 80 (2008) 885–964.
[11] H. Brezis, Analyse fonctionnelle, Théorie et applications (Dunod, 1983).
[12] F. Bethuel, H. Brezis and F. Hélein, Ginzburg–Landau Vortices, Progress in Nonlinear Partial Differential Equations and their Applications, Vol. 13 (Birkhäuser Boston, Boston, 1994).
[13] H. Brezis and L. Oswald, Remarks on sublinear elliptic equations, Nonlinear Anal. 10 (1986) 55–64.
[14] M. Correggi, T. Rindler-Daller and J. Yngvason, Rapidly rotating Bose–Einstein condensates in strongly anharmonic traps, J. Math. Phys. 48 (2007) 042104.
[15] M. Dimassi and J. Sjöstrand, Spectral Asymptotics in the Semi-Classical Limit, London Mathematical Society Lecture Note Series, Vol. 268 (Cambridge University Press, 1999).
[16] M. S. P. Eastham, The Spectral Theory of Periodic Differential Equations (Scottish Academic Press, 1973).
[17] E. M. Harrell, The band-structure of a one-dimensional, periodic system in a scaling limit, Ann. Phys. 119(2) (1979) 351–369.
[18] B. Helffer, Semi-Classical Analysis for the Schrödinger Operator and Applications, Lecture Notes in Mathematics, Vol. 1336 (Springer Verlag, 1988).
[19] B. Helffer and J. Sjöstrand, Analyse semi-classique pour l'équation de Harper, Bull. Soc. Math. France 116(4) (1988) Mémoire 34.
[20] B. Helffer and J. Sjöstrand, Equation de Schrödinger avec champ magnétique et équation de Harper, Proc. Sonderborg Summer School, Springer Lect. Notes in Physics, Vol. 345 (Springer, 1989), pp. 118–197.
[21] R. Ignat and V. Millot, The critical velocity for vortex existence in a two-dimensional rotating Bose–Einstein condensate, J. Funct. Anal. 233 (2006) 260–306.
[22] M. Krämer, C. Menotti, L. Pitaevskii and S. Stringari, Bose–Einstein condensates in 1D optical lattices: Compressibility, Bloch bands and elementary excitations (27 October 2003); arXiv:cond-mat/0305300.
[23] E. H. Lieb and R. Seiringer, Derivation of the Gross–Pitaevskii equation for rotating Bose gases, Commun. Math. Phys. 264 (2006) 505–537.
[24] E. H. Lieb, R. Seiringer, J. P. Solovej and J. Yngvason, The Mathematics of the Bose Gas and Its Condensation (Birkhäuser, Basel, 2005).
[25] E. H. Lieb, R. Seiringer and J. Yngvason, A rigorous derivation of the Gross–Pitaevskii energy functional for a two-dimensional Bose gas, Commun. Math. Phys. 224 (2001) 17–31.
[26] M. Machholm, A. Nicholin, C. J. Pethick and H. Smith, Spatial period-doubling in Bose–Einstein condensates in an optical lattice, Phys. Rev. A 69 (2004) 043604.
[27] A. Outassourt, Comportement semi-classique pour l'opérateur de Schrödinger à potentiel périodique, J. Funct. Anal. 72(1) (1987) 65–93.
[28] C. Pethick and H. Smith, Bose–Einstein Condensation of Dilute Gases (Cambridge University Press, 2001).
[29] L. P. Pitaevskii and S. Stringari, Bose–Einstein Condensation (Oxford Science Publications, 2003).
[30] M. Reed and B. Simon, Methods of Modern Mathematical Physics, Vol. IV: Analysis of Operators (Academic Press, New York, 1978).
[31] B. Simon, Semi-classical analysis of low lying eigenvalues III. Width of the ground state band in strongly coupled solids, Ann. Phys. 158 (1984) 415–420.
[32] K. Schnee and J. Yngvason, Bosons in disc-shape traps: From 3D to 2D, preprint (2005); arXiv:math-ph/0510006.
[33] M. Snoek, Vortex matter and ultracold superstrings in optical lattices, Ph.D. thesis (2006).
[34] M. Snoek and H. T. C. Stoof, Vortex-lattice melting in a one-dimensional optical lattice, Phys. Rev. Lett. 96 (2006) 230402; arXiv:cond-mat/0601695 (31 January 2006).
[35] M. Snoek and H. T. C. Stoof, Theory of vortex-lattice melting in a one-dimensional optical lattice, Phys. Rev. A 74 (2006) 033615; arXiv:cond-mat/0605699 (May 2006).
[36] A. Smerzi, A. Trombettoni, P. G. Kevrekidis and A. R. Bishop, Dynamical superfluid-insulator transition in a chain of weakly coupled Bose–Einstein condensates, Phys. Rev. Lett. 89 (2002) 170402.
[37] W. Zwerger, Mott–Hubbard transition of cold atoms in optical lattices, J. Opt. B Quantum Semiclass. Opt. 5 (2003) S9–S16.
March 10, 2009 17:57 WSPC/148-RMP
J070-00362
Reviews in Mathematical Physics Vol. 21, No. 2 (2009) 279–313 c World Scientific Publishing Company
A MATHEMATICAL THEORY FOR VIBRATIONAL LEVELS ASSOCIATED WITH HYDROGEN BONDS II: THE NON-SYMMETRIC CASE
GEORGE A. HAGEDORN Department of Mathematics and Center for Statistical Mechanics, Mathematical Physics, and Theoretical Chemistry, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061-0123, USA
[email protected]

ALAIN JOYE
Institut Fourier, Unité Mixte de Recherche CNRS–UJF 5582, Université de Grenoble I, BP 74, F-38402 Saint Martin d'Hères Cedex, France
[email protected]

Received 30 July 2008
Revised 22 October 2008

We propose an alternative to the usual time-independent Born–Oppenheimer approximation that is specifically designed to describe molecules with non-symmetrical hydrogen bonds. In our approach, the masses of the hydrogen nuclei are scaled differently from those of the heavier nuclei, and we employ a specialized form for the electron energy level surface. As a result, the different vibrational modes appear at different orders of approximation. Although we develop a general theory, our analysis is motivated by an examination of the FHCl$^-$ ion. We describe our results for it in detail. We prove the existence of quasimodes and quasienergies for the nuclear vibrational and rotational motion to arbitrary order in the Born–Oppenheimer parameter $\epsilon$. When the electronic motion is also included, we provide simple formulas for the quasienergies up to order $\epsilon^3$ that compare well with experiment and numerical results.

Keywords: Born–Oppenheimer approximation; hydrogen bonds; vibrational levels.

Mathematics Subject Classification 2000: 81V55, 92E99, 81Q20
1. Introduction

This is the second in a series of articles devoted to the study of vibrational levels associated with hydrogen bonds. The first paper [5] deals with stretching vibrations of the hydrogen bond in the symmetric case in which the hydrogen binds two identical atoms or molecules. Our prototypical example is FHF$^-$, which displays
strong anharmonic effects, coupling between vibrational modes, and a low frequency for the vibration of the hydrogen along the F–F axis. This second paper deals with all the vibrations and rotations in the non-symmetric situation. Our canonical example is FHCl$^-$, which displays weaker anharmonic effects and a high frequency for the vibration of the hydrogen along the F–Cl axis.

Both of our papers contain two main new ideas. The first is the same for both papers. Standard Born–Oppenheimer approximations keep the electron masses fixed while all the nuclear masses are taken proportional to $\epsilon^{-4}$. We take the hydrogen mass proportional to $\epsilon^{-3}$ while keeping the heavier atoms' masses proportional to $\epsilon^{-4}$. This is physically appropriate for many molecules of interest: If the mass of an electron is 1 and $\epsilon$ is defined so the mass of a carbon C$^{12}$ nucleus is $\epsilon^{-4}$, then $\epsilon = 0.0821$, and the mass of a H$^{1}$ nucleus is $1.015\, \epsilon^{-3}$.

The second novel idea is to exploit the smallness of certain derivatives of the electron energy level surface for the molecule being studied. Here our two papers are completely different, and they are motivated by examinations of numerically computed electron energy level surfaces using Gaussian 2003 software [3]. In the symmetric case, the second derivative associated with moving the H along the axis of AHA is small, and we could allow it to be small and negative if the H nucleus felt a double well potential. In the non-symmetric case, if the H is more weakly bound to the B in AHB, we assume all the derivatives associated with moving the B relative to AH in AHB are small. We assume all derivatives associated with stretching the distance between A and H not to be small. To describe the smallness of the small derivatives, we could have introduced another small parameter. Instead, we have elected to let $\epsilon$ play a second role. We take all the small derivatives to be proportional to $\epsilon$.
For the choice of $\epsilon = 0.0821$ indicated above, that is again appropriate for our FHF$^-$ and FHCl$^-$ examples. The small derivatives are on the order of $\epsilon$ in units where the non-small derivatives are on the order of 1.

We shall now restrict our attention to triatomic non-symmetrical hydrogen bonded molecules AHB, and assume the H is more strongly bound to the A. We do an asymptotic expansion for small $\epsilon$, and our main results are the following:

(1) To their respective leading orders, the vibrational levels are described by three independent harmonic oscillators in appropriate Jacobi coordinates: two separate one-dimensional harmonic oscillators and one two-dimensional isotropic harmonic oscillator. This is in contrast to the usual Born–Oppenheimer theory in which one obtains one coupled four-dimensional harmonic oscillator. Our technique does not require going through the diagonalization process to separate the normal modes. The different modes appear at different orders of the expansion, in contrast to the Born–Oppenheimer situation, where all vibrations are of order $\epsilon^2$.
(2) The highest frequency vibrational states have energy of order $\epsilon^{3/2}$. These are the stretching oscillations of the A–H bond with the B approximately sitting still.
(3) The next highest frequency vibrations are the two degenerate bending modes. They are of order $\epsilon^2$.
(4) The lowest vibrational energies are of order $\epsilon^{5/2}$. They are the stretching oscillations of the weak bond between the AH and the B.

For the specific case of FHCl$^-$, we have the following comparison of results, where vibrational energies are measured in cm$^{-1}$. The experimental results come from [2]. We note that the experiments were not done in the "gas phase", so they may not accurately represent results for the isolated ions. All the Gaussian 2003 results presented in this paper are obtained by using the MP2 technique with the aug-cc-pvdz basis set. The software implements the standard Born–Oppenheimer approximation. The results for our model come from approximating the ground state electron energy surface with Gaussian 2003 and then applying our techniques.
Mode                  Experiment   Gaussian '03   Our model
F–H stretch                 2710           2960        2960
bends (degenerate)           843            875         871
FH–Cl stretch                275            246         251
Remarks.

(1) It is not surprising that the results for our model are close to those obtained by Gaussian since we have used the same electron energy surface. The Gaussian software deals with the full 4-dimensional harmonic oscillator, whereas our technique deals with two 1-dimensional harmonic oscillators and one isotropic 2-dimensional harmonic oscillator. Evidently the Jacobi coordinates we have chosen are very close to the normal mode coordinates for the 4-dimensional oscillator.
(2) The results from Gaussian and our model are just leading order (harmonic) calculations. Including higher order terms from the expansions might bring these into better agreement with experiment. Also, we again emphasize that the experimental results were not obtained for isolated ions.

A recent chemistry article [9] contains data for vibrations of 18 hydrogen bonded molecules in the gas phase. It also contains an idea for quantifying how symmetric or non-symmetric a hydrogen bond is. Its conclusions are consistent with the analysis in our two papers. Figure 2 of that article plots the vibrational frequency of the A–H stretch versus the difference in the "proton affinities" of A and B for a molecule AHB. When A and B are identical, the frequency is low (800–1000 cm$^{-1}$), and when they attract the proton very differently, the frequency is high (1600–3500 cm$^{-1}$). In our symmetric analysis, this vibrational energy is of order $\epsilon^2$, whereas in our non-symmetric analysis, it is of order $\epsilon^{3/2}$, which is roughly 3.5 times larger when $\epsilon = 0.0821$.
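The numerical values quoted for the scaling parameter can be checked with a few lines of arithmetic. The sketch below is only an illustration: the mass constants are standard reference values supplied here as assumptions (they do not appear in the text), the carbon nucleus mass is approximated by the atomic mass minus six electron masses, and the results agree with the quoted 0.0821, 1.015 and 3.5 only to roughly the precision of that rounding.

```python
import math

# Assumed reference values (not taken from the paper), in electron masses.
ELECTRON_MASSES_PER_U = 1822.888486   # one atomic mass unit
C12_ATOM_MASS_U = 12.0                # carbon-12 atom, by definition of u
PROTON_MASS = 1836.152673             # H-1 nucleus


def born_oppenheimer_epsilon(nuclear_mass):
    """Return epsilon such that nuclear_mass == epsilon**-4 (electron mass = 1)."""
    return nuclear_mass ** -0.25


# Approximate the C-12 nucleus by the atom minus its six electrons
# (electronic binding energy is neglected).
c12_nucleus = C12_ATOM_MASS_U * ELECTRON_MASSES_PER_U - 6.0

eps = born_oppenheimer_epsilon(c12_nucleus)

# Coefficient c in (hydrogen nuclear mass) = c * eps**-3.
hydrogen_coefficient = PROTON_MASS * eps ** 3

# Ratio of an order eps**(3/2) energy to an order eps**2 energy.
ratio = eps ** 1.5 / eps ** 2

print(f"epsilon              = {eps:.4f}")
print(f"hydrogen coefficient = {hydrogen_coefficient:.3f}")
print(f"eps^(3/2) / eps^2    = {ratio:.2f}")
```

The last quantity equals $\epsilon^{-1/2}$, which is the factor of roughly 3.5 between the symmetric and non-symmetric orders mentioned above.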
Remarks.

(1) We assume that the ground state electron energy level we are considering is non-degenerate for all nuclear configurations of interest. Thus, we do not consider situations that exhibit the Renner–Teller effect [8, 10, 6].
(2) Since our analysis includes rotations of the whole molecule, some small effects show up in the calculations. For example, l-type doubling [7] occurs for terms that have non-zero eigenvalues of the $L_z$ operator at low order. ($L_z$ is the nuclear angular momentum around the A–B axis.) States corresponding to $L_z$ eigenvalue $\pm k$ with $k \ge 1$ generically have their degeneracy in energy split at order $\epsilon^{2+3k}$ in our model.

The paper is organized as follows: in Sec. 2, we describe our model in detail. In Sec. 3, we do the semiclassical expansion to all orders for the nuclei. In Sec. 4, we include the electrons. However, when we include the electrons, we just show that the energy expansion is valid through order $\epsilon^3$. Going to higher order is extremely complicated.

2. Semiclassical Analysis for the Effective Nuclear Hamiltonian

In this section, we give a precise description of the Hamiltonian for the nuclei. As mentioned above, we consider a molecular system AHB in which the hydrogen is much more tightly bound to the A than to the B. We construct the coordinate system we use in two steps, as illustrated in the figures below. The first step is to choose a standard Jacobi coordinate system for the nuclei in their center of mass frame of reference. The first three coordinates $X_1$, $X_2$, and $X_3$ are the components of the vector $\vec{X}$ from the A nucleus to the H nucleus. The fourth, fifth, and sixth coordinates $Y_1$, $Y_2$, and $Y_3$ are the components of the vector $\vec{Y}$ from the center of mass of the A and H nuclei to the B nucleus (Fig. 1). We now change from these coordinates to new ones that we call $(Y, \theta, \phi, R, \gamma, X)$. The $(Y, \theta, \phi)$ are spherical coordinates for the vector described
Fig. 1. Jacobi coordinates for the molecule.
by $(Y_1, Y_2, Y_3)$ in the original center of mass frame of reference. The $(R, \gamma, X)$ are cylindrical coordinates for the vector $(X_1, X_2, X_3)$ in a frame of reference that rotates so that the axis for these coordinates is in the direction of the vector described by $(Y_1, Y_2, Y_3)$. The precise definition is given below. (See Figs. 2 and 3.)

One reason for using these coordinates is that the potential energy surface depends only on $Y$, $X$, and $R$. A second reason is that in these coordinates, we can easily separate the total angular momentum $J^2$ and its $z$ component $J_z$ from the other motions. Also, to low order in perturbation theory, the angular momentum $L_z$ conjugate to $\gamma$ (which is the angular momentum in the direction of $(Y_1, Y_2, Y_3)$) gives another convenient quantum number. Note that $L_z$ does not commute with the full Hamiltonian.
Fig. 2. Jacobi coordinates fixed at the origin.

Fig. 3. The final coordinate system.
The drawback to using this coordinate system is that the kinetic energy expression is quite messy. The complication comes from the Laplacian in the $(Y, \theta, \phi)$ variables. The Laplacian in $(R, \gamma, X)$ is simply the usual cylindrical Laplacian.

These coordinates are closely related to ones used in [4] to deal with Born–Oppenheimer approximations for diatomic Coulomb systems. There is a minus sign error in the expression for the $L \cdot J$ term on page 32 of [4].

As mentioned above, $(Y, \theta, \phi)$ are just standard spherical coordinates. To describe the other three coordinates precisely, we first define the rotation
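$$R_1(\theta, \phi) = \begin{pmatrix} \cos(\theta)\cos(\phi) & -\sin(\phi) & \sin(\theta)\cos(\phi) \\ \cos(\theta)\sin(\phi) & \cos(\phi) & \sin(\theta)\sin(\phi) \\ -\sin(\theta) & 0 & \cos(\theta) \end{pmatrix}.$$

It maps the vector $\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$ to the unit vector in the direction of $\begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix}$. We then define coordinates $(\xi_1, \xi_2, \xi_3)$ by

$$\begin{pmatrix} \xi_1 \\ \xi_2 \\ \xi_3 \end{pmatrix} = [R_1(\theta, \phi)]^{-1} \begin{pmatrix} X_1 \\ X_2 \\ X_3 \end{pmatrix}.$$

Next, we define another rotation

$$R_2(\gamma) = \begin{pmatrix} \cos(\gamma) & -\sin(\gamma) & 0 \\ \sin(\gamma) & \cos(\gamma) & 0 \\ 0 & 0 & 1 \end{pmatrix},$$

where, for generic vectors $\xi$, $\gamma$ is defined by requiring the second component of $[R_2(\gamma)]^{-1} \xi$ to be 0 and its first component to be positive. We then define coordinates $X$ and $R$ by

$$\begin{pmatrix} R \\ 0 \\ X \end{pmatrix} = [R_2(\gamma)]^{-1} \begin{pmatrix} \xi_1 \\ \xi_2 \\ \xi_3 \end{pmatrix}.$$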
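The two-step change of variables can be sketched numerically as follows. This is a minimal illustration, not the authors' code: the function name `to_molecular_coordinates` is hypothetical, the inverse rotations are applied as transposes (which is valid because $R_1$ and $R_2$ are rotations), and the conventions for the degenerate cases $\vec{Y}$ parallel to the third axis or $R = 0$ are simply those of `atan2`.

```python
import math


def to_molecular_coordinates(x_vec, y_vec):
    """Map Jacobi vectors (X1,X2,X3), (Y1,Y2,Y3) to (Y, theta, phi, R, gamma, X).

    Hypothetical helper written for illustration only.
    """
    x1, x2, x3 = x_vec
    y1, y2, y3 = y_vec

    # Spherical coordinates of the vector Y.
    y_len = math.sqrt(y1 * y1 + y2 * y2 + y3 * y3)
    theta = math.acos(y3 / y_len)
    phi = math.atan2(y2, y1)

    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(phi), math.sin(phi)

    # xi = R1(theta, phi)^{-1} X, with R1^{-1} equal to the transpose of R1.
    xi1 = ct * cp * x1 + ct * sp * x2 - st * x3
    xi2 = -sp * x1 + cp * x2
    xi3 = st * cp * x1 + st * sp * x2 + ct * x3

    # gamma makes the second component of R2(gamma)^{-1} xi vanish and the
    # first component positive; R and X are then the cylindrical coordinates.
    gamma = math.atan2(xi2, xi1)
    r_cyl = math.hypot(xi1, xi2)
    return y_len, theta, phi, r_cyl, gamma, xi3


if __name__ == "__main__":
    coords = to_molecular_coordinates((0.3, -0.5, 1.1), (1.0, 2.0, 2.0))
    print(coords)
```

By construction $X$ is the component of $\vec{X}$ along $\vec{Y}$ and $R$ is the length of the perpendicular part, which gives a quick consistency check for any input.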
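Our Hamiltonian has kinetic energy

$$-\frac{\epsilon^3}{2\mu_1(\epsilon)}\, \Delta_{(X_1,X_2,X_3)} - \frac{\epsilon^4}{2\mu_2(\epsilon)}\, \Delta_{(Y_1,Y_2,Y_3)},$$

where $\mu_1(\epsilon)$ and $\mu_2(\epsilon)$ are modified reduced masses that we describe in detail below. Since Laplacians are rotationally invariant, under our coordinate changes, the first term simply becomes the usual cylindrical Laplacian

$$-\frac{\epsilon^3}{2\mu_1(\epsilon)} \left( \frac{\partial^2}{\partial R^2} + \frac{1}{R}\frac{\partial}{\partial R} + \frac{1}{R^2}\frac{\partial^2}{\partial \gamma^2} + \frac{\partial^2}{\partial X^2} \right).$$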
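By a very tedious calculation, the second term in the kinetic energy is

$$-\frac{\epsilon^4}{2\mu_2(\epsilon)} \left( \frac{\partial^2}{\partial Y^2} + \frac{2}{Y}\frac{\partial}{\partial Y} - \frac{1}{Y^2}\left\{ J^2 - 2 L \cdot J + L^2 \right\} \right),$$

where

$$J^2 = -\frac{\partial^2}{\partial \theta^2} - \cot\theta\, \frac{\partial}{\partial \theta} - \frac{1}{\sin^2\theta} \left( \frac{\partial^2}{\partial \phi^2} + \frac{\partial^2}{\partial \gamma^2} \right) + \frac{2\cos\theta}{\sin^2\theta}\, \frac{\partial^2}{\partial \phi\, \partial \gamma} \tag{2.1}$$

is the total angular momentum operator,

$$L \cdot J = \left( R \sin\gamma\, \frac{\partial}{\partial X} - X \sin\gamma\, \frac{\partial}{\partial R} - \frac{X}{R} \cos\gamma\, \frac{\partial}{\partial \gamma} \right) \left( \frac{1}{\sin\theta}\frac{\partial}{\partial \phi} - \cot\theta\, \frac{\partial}{\partial \gamma} \right) + \left( R \cos\gamma\, \frac{\partial}{\partial X} - X \cos\gamma\, \frac{\partial}{\partial R} + \frac{X}{R} \sin\gamma\, \frac{\partial}{\partial \gamma} \right) \frac{\partial}{\partial \theta},$$

and

$$L^2 = -R^2 \frac{\partial^2}{\partial X^2} - X^2 \frac{\partial^2}{\partial R^2} + 2XR\, \frac{\partial^2}{\partial X\, \partial R} - \frac{X^2}{R^2}\frac{\partial^2}{\partial \gamma^2} + \left( R - \frac{X^2}{R} \right) \frac{\partial}{\partial R} + 2X \frac{\partial}{\partial X} + \frac{\partial^2}{\partial \gamma^2}.$$

The modified reduced masses are

$$\mu_1(\epsilon) = \epsilon^3\, \frac{\epsilon^{-4} m_A\, \epsilon^{-3} m_H}{\epsilon^{-4} m_A + \epsilon^{-3} m_H} \quad \text{and} \quad \mu_2(\epsilon) = \epsilon^4\, \frac{(\epsilon^{-4} m_A + \epsilon^{-3} m_H)\, \epsilon^{-4} m_B}{\epsilon^{-4} m_A + \epsilon^{-3} m_H + \epsilon^{-4} m_B},$$

where the three nuclei have masses $\epsilon^{-4} m_A$, $\epsilon^{-3} m_H$, and $\epsilon^{-4} m_B$. The modified reduced masses have limits as $\epsilon$ tends to zero. To isolate the leading behavior, we abuse notation and define $\mu_1 = \lim_{\epsilon \to 0} \mu_1(\epsilon) = m_H$ and $\mu_2 = \lim_{\epsilon \to 0} \mu_2(\epsilon) = \frac{m_A m_B}{m_A + m_B}$. Then we have

$$\frac{\epsilon^3}{2\mu_1(\epsilon)} = \frac{\epsilon^3}{2\mu_1} + \frac{\epsilon^4}{2 m_A}.$$

Similarly,

$$\frac{\epsilon^4}{2\mu_2(\epsilon)} = \frac{\epsilon^4}{2\mu_2} - \frac{\epsilon^5\, m_H}{2 m_A (m_A + \epsilon\, m_H)}.$$

We define the operator

$$\epsilon^4 D(\epsilon) = -\frac{\epsilon^4}{2 m_A}\, \Delta_{(X_1,X_2,X_3)} + \frac{\epsilon^5\, m_H}{2 m_A (m_A + \epsilon\, m_H)}\, \Delta_{(Y_1,Y_2,Y_3)},$$

written in the new variables, so that the kinetic energy can be expressed as

$$-\frac{\epsilon^3}{2\mu_1}\, \Delta_{(X_1,X_2,X_3)} - \frac{\epsilon^4}{2\mu_2}\, \Delta_{(Y_1,Y_2,Y_3)} + \epsilon^4 D(\epsilon),$$

all written in terms of $(Y, \theta, \phi, R, \gamma, X)$.

The quantum fluctuations of the nuclei around their equilibrium positions occur on short length scales, so we now do the appropriate rescaling of variables. We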
assume the ground state electron energy surface has a minimum at $Y = Y_0$, $R = 0$ (because the hydrogen bond is linear), and $X = X_0$. Under the rescaling, the angles $\theta$, $\phi$ and $\gamma$ remain unchanged, but we replace $Y$, $R$, and $X$ by

$$y = (Y - Y_0)/\epsilon^{3/4}, \quad r = R/\epsilon^{1/2}, \quad \text{and} \quad x = (X - X_0)/\epsilon^{3/4}.$$
4 13/4 ∂ + {J 2 − 2L · J + L2 } + 4 D(), (2.2) 3/4 µ2 (Y0 + y) ∂y 2µ2 (Y0 + 3/4 y)2
where J 2 is still given by (2.1), but L · J and L2 are now given by the -dependent expressions ∂ ∂ − −1/2 (X0 + 3/4 x) sin γ L · J = −1/4 r sin γ ∂x ∂r 3/4 ∂ 1 ∂ ∂ x −1/2 X0 + cos γ − cot θ − r ∂γ sin θ ∂φ ∂γ ∂ ∂ − −1/2 (X0 + 3/4 x) cos γ + −1/4 r cos γ ∂x ∂r 3/4 ∂ ∂ x −1/2 X0 + + sin γ r ∂γ ∂θ and ∂2 ∂2 ∂2 − −1 (X0 + 3/4 x)2 2 + 2−3/4 (X0 + 3/4 x)r 2 ∂x ∂x∂r ∂r −1 3/4 2 2 3/4 2 ∂ (X0 + x) ∂ (X0 + x) − + −1 r− r2 ∂γ 2 r ∂r
L2 = −−1/2 r2
+ 2−3/4 (X0 + 3/4 x)
∂2 ∂ + . ∂x ∂γ 2
(2.3)
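Remarks.

(1) The operator $L \cdot J$ can be rewritten as

$$L \cdot J = \epsilon^{-1/4}\, \frac{r}{2} \frac{\partial}{\partial x} (L_+ - L_-) - \epsilon^{-1/2}\, \frac{X_0 + \epsilon^{3/4} x}{2} \frac{\partial}{\partial r} (L_+ - L_-) - i\, \epsilon^{-1/2}\, \frac{X_0 + \epsilon^{3/4} x}{2r} \frac{\partial}{\partial \gamma} (L_+ + L_-), \tag{2.4}$$

where

$$L_\pm = e^{\pm i \gamma} \left( \pm \frac{\partial}{\partial \theta} + i \cot\theta\, \frac{\partial}{\partial \gamma} - i\, \frac{1}{\sin\theta} \frac{\partial}{\partial \phi} \right).$$

By explicit computation, one can verify that $L_+$ and $L_-$ commute with both $J^2$ and $J_z$. The operators $L_+$ and $L_-$ are raising and lowering operators for the eigenstates of $L_z$.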
(2) The dominant order terms in the expressions in $L \cdot J$ and $L^2$ are the ones of order $\epsilon^{-1}$ in $L^2$. Because of the overall factor of $\epsilon^4$ that multiplies these operators in the Hamiltonian, they are not relevant until the order $\epsilon^3$ perturbation calculations.

Motivated by numerical calculations for the FHCl$^-$ ion, we assume the ground state electron energy surface near its minimum depends only weakly on $R$ and $Y$. To exploit this, we decompose the potential energy surface as

$$V_1(X) + \epsilon\, V_2(X, R, Y), \tag{2.5}$$

where $V_1$ and $V_2$ have Taylor expansions of the forms

$$V_1(X) \sim a_0 + \sum_{j=2}^{\infty} a_j (X - X_0)^j, \tag{2.6}$$

and

$$V_2(X, R, Y) \sim \sum_{\substack{j+k+l \ge 2 \\ k+l \ge 1 \\ k \text{ even}}} b_{j,k,l}\, (X - X_0)^j R^k (Y - Y_0)^l. \tag{2.7}$$

The restrictions on the indices in $V_2$ are obtained by requiring all pure $X$ dependence to be in $V_1$ and by requiring $V_2$ to be even in $R$ (because of the symmetry). We now can state our results for the semiclassical analysis of the bound states for the nuclei.

Theorem 2.1. Consider the Hamiltonian

$$H(\epsilon) = -\frac{\epsilon^3}{2\mu_1(\epsilon)}\, \Delta_{(X_1,X_2,X_3)} - \frac{\epsilon^4}{2\mu_2(\epsilon)}\, \Delta_{(Y_1,Y_2,Y_3)} + V_1(X) + \epsilon\, V_2(X, R, Y),$$

rewritten in terms of the variables $(X, R, Y, \theta, \phi, \gamma)$. Assume $V_1$ and $V_2$ are $C^\infty$ functions that satisfy (2.6) and (2.7). Assume $V_1$ has a unique global minimum $a_0$ at $X = X_0 > 0$, with $a_2 > 0$ in (2.6), and that $\liminf_{|X| \to \infty} V(X) > a_0$. Assume $V_2$ has a unique global minimum of 0 at $X = X_0$, $R = 0$, and $Y = Y_0 > 0$, with $b_{0,2,0} > 0$ and $b_{0,0,2} > 0$ in (2.7). Given any integer $N > 0$, there exist a non-zero quasimode $\Psi_{N/4}(\epsilon) = \sum_{l=0}^{N} \epsilon^{l/4} \psi_{l/4}$ and a quasienergy $E_{N/4}(\epsilon) = \sum_{l=0}^{N} \epsilon^{l/4} E_{l/4}$, such that $\|\psi_{l/4}\| = O(1)$ for each $l$, $E_{l/4} = O(1)$ for each $l$, and

$$\left\| (H(\epsilon) - E_{N/4}(\epsilon))\, \Psi_{N/4}(\epsilon) \right\| \le C_N\, \epsilon^{(N+1)/4},$$

for some $C_N$ that depends on the choices of $n$, $k$, $m$, and $p$ below. Furthermore,

$$E_0 = a_0,$$

$$E_{1/4} = E_{2/4} = E_{3/4} = E_{4/4} = E_{5/4} = E_{7/4} = E_{9/4} = E_{11/4} = 0,$$

$$E_{6/4} = \sqrt{2 a_2/\mu_1}\, \left( n + \frac{1}{2} \right), \quad \text{for } n = 0, 1, \ldots,$$

$$E_{8/4} = \sqrt{2 b_{0,2,0}/\mu_1}\, (2m + |k| + 1), \quad \text{for an integer } k, \text{ and } m = 0, 1, \ldots,$$

$$E_{10/4} = \sqrt{2 b_{0,0,2}/\mu_2}\, \left( p + \frac{1}{2} \right), \quad \text{for } p = 0, 1, \ldots,$$

and $E_{12/4}$ is given by the expression (3.7). The rotational energy first appears in $E_{16/4}$. For fixed angular momentum quantum numbers $j$ and $j_z$, for order $N \ge 12$, the states with $k = 0$ are non-degenerate, and the states with $|k| > 0$ have multiplicity at most 2.

Remarks.

(1) We construct the quasimode in Theorem 2.1 with $\psi_0$ equal to a normalized product of harmonic oscillator states, so $\psi_0 \ne 0$. We choose the $\psi_{l/4}$ for $l > 0$ to be orthogonal to $\psi_0$, so the total quasimode $\Psi_{N/4}$ has $\|\Psi_{N/4}\| \ge 1$.
(2) The quasimode construction of Theorem 2.1 guarantees that $H(\epsilon)$ has some spectrum within a distance $C_N \epsilon^{(N+1)/4}$ of $E_{N/4}(\epsilon)$. If this interval lies below the essential spectrum of $H(\epsilon)$, then there must be a bound state in this interval. Our techniques cannot rule out the possibility that there might be points in the spectrum not associated with the quasimodes. However, to the best of our knowledge, in appropriate energy ranges, no experiments have indicated the presence of bound states other than those associated with the quasimodes.
(3) Theorem 2.1 is stated with global hypotheses and without growth conditions on the potential. When the electronic motion is also included, the potential energy surface may only exist locally. The cutoff functions that are introduced in Proposition 3.2 allow us to obtain analogous results with only local assumptions.
(4) Cutoff functions are introduced in Sec. 3.2. They are required even when $V$ is defined everywhere, because it may grow rapidly. For example, in one dimension, if $V(x)$ grows faster than $e^{c x^2}$ for all $c$, no harmonic oscillator eigenstate is in the domain of multiplication by $V(x)$. Without the cutoff, our $\psi_0$ would be a multi-dimensional harmonic oscillator eigenstate, and we would not be able to prove error bounds. When we multiply by the cutoff, the resulting function is in the domain of $V$.
When $V$ is not globally defined (as might occur in a molecular situation when an electron energy level hits the essential spectrum), the cutoff function is chosen so that the support of the quasimode lies inside the support of $V$. See Sec. 3.2.

For the FHCl$^-$ ion, we have calculated values for the first few coefficients in the expansion for $V$, based on numerically differentiating results from Gaussian 2003. Here distances are measured in Angstroms, energies are in Hartrees, and we have used $\epsilon = 0.0821$.

$$a_0 = -560.160, \quad a_2 = 0.567, \quad b_{0,2,0} = 0.597, \quad b_{1,0,1} = 0.853, \quad b_{0,0,2} = 0.664.$$
The $\epsilon$ in (2.5) reflects the weakness of the hydrogen bond, and also that the molecule can bend easily. The FHCl$^-$ ion essentially looks like a slightly deformed FH molecule with a Cl$^-$ ion quite a long way from the FH. Gaussian 2003 assigns charges associated with each atom, and it obtains:

    F        H        Cl
    −0.58    0.51     −0.93

The calculated F–H distance is 0.98 Angstrom, and the H–Cl distance is 1.91 Angstroms. (For HF alone, the charges are ±0.33, the H–F distance is 0.925 Angstrom, and the calculated vibrational frequency is 4083 cm$^{-1}$.)

Experimental values [2] for the vibrational frequencies of FHCl$^-$ (in cm$^{-1}$) are

    275     FH oscillates relative to the Cl,
    843     bends (2 degenerate modes),
    2710    FH oscillates.

Gaussian 2003 calculates the harmonic vibrational frequencies (in cm$^{-1}$) to be

    246     FH oscillates relative to the Cl,
    875     bends (2 degenerate modes),
    2960    FH oscillates.

To leading order, our model has the FH oscillation, the bending, and the oscillation of FH relative to the Cl with frequencies proportional to $\epsilon^{3/2}$, $\epsilon^{2}$, and $\epsilon^{5/2}$, respectively. The specific harmonic frequencies that we obtain for FHCl$^-$ (in cm$^{-1}$) are

    251     FH moves relative to the Cl,
    871     bends (2 degenerate modes),
    2960    FH oscillates.
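The three frequency lists above can be compared mode by mode. The following is only a small sketch that tabulates the signed relative deviation of the model values from the Gaussian 2003 and experimental values quoted in the text; the function and variable names are illustrative and are not part of the paper.

```python
# Frequencies in cm^-1, copied from the three lists in the text.
MODES = ("FH relative to Cl", "bend (2 degenerate modes)", "FH oscillates")
EXPERIMENT = (275.0, 843.0, 2710.0)
GAUSSIAN = (246.0, 875.0, 2960.0)
MODEL = (251.0, 871.0, 2960.0)


def relative_deviation(value, reference):
    """Signed relative deviation of `value` from `reference`."""
    return (value - reference) / reference


def comparison_rows():
    """Return (mode, model vs Gaussian, model vs experiment) for each mode."""
    rows = []
    for mode, exp, gau, mod in zip(MODES, EXPERIMENT, GAUSSIAN, MODEL):
        rows.append((mode,
                     relative_deviation(mod, gau),
                     relative_deviation(mod, exp)))
    return rows


if __name__ == "__main__":
    for mode, vs_gauss, vs_exp in comparison_rows():
        print(f"{mode:28s} vs Gaussian {vs_gauss:+.1%}   vs experiment {vs_exp:+.1%}")
```

The model is a harmonic, leading-order calculation, so the natural comparison is with the harmonic Gaussian 2003 values rather than with experiment.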
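3. The Perturbation Expansion for the Nuclei

We now do the perturbation expansion for the semiclassical motion of the nuclei under the global hypotheses of Theorem 2.1. When the hypotheses are satisfied only locally, see Proposition 3.2. The perturbation expansion describes the small $\epsilon$ dependence of the eigenvalue problem for the following differential operator
\[
\begin{aligned}
&-\frac{\epsilon^{3/2}}{2\mu_1}\frac{\partial^2}{\partial x^2}
-\frac{\epsilon^{2}}{2\mu_1}\left(\frac{\partial^2}{\partial r^2}+\frac{1}{r}\frac{\partial}{\partial r}+\frac{1}{r^2}\frac{\partial^2}{\partial\gamma^2}\right)
-\frac{\epsilon^{5/2}}{2\mu_2}\frac{\partial^2}{\partial y^2}\\
&-\frac{\epsilon^{13/4}}{\mu_2\,(Y_0+\epsilon^{3/4}y)}\frac{\partial}{\partial y}
+\frac{\epsilon^{4}}{2\mu_2\,(Y_0+\epsilon^{3/4}y)^2}\,\{J^2-2L\cdot J+L^2\}\\
&+a_0+\sum_{j=2}^{\infty}a_j\,\epsilon^{3j/4}x^j
+\sum_{\substack{j+k+l\ge 2\\ k+l\ge 1\\ k\ \mathrm{even}}} b_{j,k,l}\,\epsilon^{1+\frac{3(j+l)+2k}{4}}\,x^j r^k y^l .
\end{aligned}
\tag{3.1}
\]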
G. A. Hagedorn & A. Joye
At this point we should make the Ansatz that the eigenvalue and eigenfunction have expansions of the forms
\[
E=\sum_{l=0}^{\infty}\nu_l(\epsilon)\,E_{q_l}
\quad\text{and}\quad
\psi(x,r,y,\theta,\phi,\gamma)=\sum_{l=0}^{\infty}\nu_l(\epsilon)\,\psi_{q_l}(x,r,y,\theta,\phi,\gamma).
\]
Here, $\nu_0(\epsilon) = 1$, $\psi_0$ is non-trivial, and $\nu_{l+1}(\epsilon)/\nu_l(\epsilon) \to 0$ as $\epsilon \to 0$. However, one learns that every $\nu_l(\epsilon)$ that occurs is some power of $\epsilon^{1/4}$, so it is somewhat simpler just to take $\nu_l(\epsilon) = \epsilon^{l/4}$, i.e.
\[
E=\sum_{l=0}^{\infty}\epsilon^{l/4}E_{l/4}
\quad\text{and}\quad
\psi(x,r,y,\theta,\phi,\gamma)=\sum_{l=0}^{\infty}\epsilon^{l/4}\psi_{l/4}(x,r,y,\theta,\phi,\gamma).
\]
Our Hamiltonian, $J^2$, and $J_z$ all commute with one another, so we can simultaneously diagonalize these three operators. The eigenvalues of $J^2$ are $j(j+1)$, where $j = 0, 1, 2, \dots$, and for a given $j$, they have degeneracy $(2j+1)^2$. We henceforth use the specific basis for the eigenspace for fixed $j$ that is given in [1, Sec. 4.7]:
\[
\{\,|j,j_z,k\rangle : j_z=-j,-j+1,\dots,j;\ k=-j,-j+1,\dots,j\,\},
\]
where
\[
J_z|j,j_z,k\rangle=j_z|j,j_z,k\rangle
\quad\text{and}\quad
L_z|j,j_z,k\rangle=k|j,j_z,k\rangle,
\]
where $J_z = -i\,\partial/\partial\phi$ and $L_z = -i\,\partial/\partial\gamma$. Note that although $J^2$, $J_z$, and $L_z$ all commute with one another, $L_z$ does not commute with the Hamiltonian. For future reference, we note also that the operators in (2.4) have
\[
L_+|j,j_z,k\rangle=\alpha_{+,j,j_z,k}\,|j,j_z,k+1\rangle
\quad\text{and}\quad
L_-|j,j_z,k\rangle=\alpha_{-,j,j_z,k}\,|j,j_z,k-1\rangle,
\]
for some $\alpha_{\pm,j,j_z,k}$. When $|k| = j$, $\alpha_{+,j,j_z,j} = 0$ and $\alpha_{-,j,j_z,-j} = 0$.

By restricting attention to given values of $j$ and $j_z$, the wave functions in our expansion can now be regarded (with some abuse of notation) as
\[
\psi_{l/4}(x,r,y,\theta,\phi,\gamma)=\sum_{k=-j}^{j}\psi_{l/4}(x,r,y,k)\,|j,j_z,k\rangle .
\]
We now substitute the Ansatz into the eigenvalue equation and equate terms order by order. We do not worry about normalization, but produce a quasimode that is $O(1)$ as $\epsilon$ tends to 0. To simplify some of the discussion, we take $\psi_{l/4}$ orthogonal to $\psi_0$ for $l > 0$. The results of these computations yield the formal expansions of Theorem 2.1.

Order $\epsilon^{0}$. These terms simply require $a_0\psi_0 = E_0\psi_0$. So, $E_0 = a_0$.

Order $\epsilon^{l/4}$ for $1 \le l \le 5$. The terms of these orders successively require $a_0\psi_{l/4} = E_0\psi_{l/4} + E_{l/4}\psi_0$. So, $E_{l/4} = 0$.
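Order $\epsilon^{6/4}$. These terms require
\[
-\frac{1}{2\mu_1}\frac{\partial^2\psi_0}{\partial x^2}+a_2x^2\psi_0=E_{6/4}\psi_0 .
\]
This forces
\[
E_{6/4}=\left(n+\frac12\right)\sqrt{2a_2/\mu_1}\quad\text{for some } n=0,1,\dots,
\]
and $\psi_0(x,r,y,k) = f_0(r,y,k)\,\Phi_1(x)$, where
\[
\Phi_1(x)=(2a_2\mu_1)^{1/8}\,\pi^{-1/4}\,2^{-n/2}\,(n!)^{-1/2}\,H_n(x')\,e^{-x'^2/2}
\]
with $x' = (2a_2\mu_1)^{1/4}x$. The function $f_0$ is not yet determined.

Order $\epsilon^{7/4}$. We introduce the notation
\[
H_{0,x}=-\frac{1}{2\mu_1}\frac{\partial^2}{\partial x^2}+a_2x^2 .
\]
Then the $\epsilon^{7/4}$ terms require $[H_{0,x} - E_{6/4}]\psi_{1/4} = E_{7/4}\psi_0$. We first examine the components of this equation that are multiples of $\Phi_1(x)$. These $\parallel x$ components require $E_{7/4} = 0$. We then examine the components that are perpendicular to $\Phi_1(x)$ in the $x$ variables. These $\perp x$ components require $\psi_{1/4}(x,r,y,k) = f_{1/4}(r,y,k)\,\Phi_1(x)$, where the function $f_{1/4}$ is not yet determined.

Order $\epsilon^{8/4}$. These terms require
\[
[H_{0,x}-E_{6/4}]\psi_{2/4}
-\frac{1}{2\mu_1}\left(\frac{\partial^2\psi_0}{\partial r^2}+\frac1r\frac{\partial\psi_0}{\partial r}+\frac{1}{r^2}\frac{\partial^2\psi_0}{\partial\gamma^2}\right)
+b_{0,2,0}\,r^2\psi_0=E_{8/4}\psi_0 .
\]
The $\parallel x$ components of this equation require $H_{0,r,\gamma}\psi_0 = E_{8/4}\psi_0$, where
\[
H_{0,r,\gamma}=-\frac{1}{2\mu_1}\left(\frac{\partial^2}{\partial r^2}+\frac1r\frac{\partial}{\partial r}+\frac{1}{r^2}\frac{\partial^2}{\partial\gamma^2}\right)+b_{0,2,0}\,r^2 .
\]
This is a standard isotropic two dimensional harmonic oscillator problem that one can solve by separating variables. In our context, the angular operator $L_z = -i\,\partial/\partial\gamma$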
has eigenvalues $k = 0, \pm1, \pm2, \dots, \pm j$ and eigenfunctions $e^{ik\gamma}$. For each such $k$, the radial operator
\[
-\frac{1}{2\mu_1}\left(\frac{\partial^2}{\partial r^2}+\frac1r\frac{\partial}{\partial r}-\frac{k^2}{r^2}\right)+b_{0,2,0}\,r^2
\]
has eigenvalues
\[
E_{8/4}=(2m+|k|+1)\sqrt{2b_{0,2,0}/\mu_1},\quad\text{where } m=0,1,\dots .
\]
The corresponding normalized eigenfunctions are
\[
(2b_{0,2,0}\mu_1)^{1/4}\sqrt{\frac{2\,(m!)}{(m+|k|)!}}\;(r')^{|k|}\,L_m^{|k|}(r'^2)\,e^{-r'^2/2},
\]
where $r' = (2b_{0,2,0}\mu_1)^{1/4}r$, $m \ge 0$, and $L_m^{|k|}$ is a Laguerre polynomial. We permanently fix one such value of $E_{8/4}$. Since different pairs $(m,k)$ can occur, we define
\[
K=\{k\in\mathbb{Z} : |k|\le j \text{ and } m(k)\ge 0\},
\quad\text{where}\quad
m(k)=\frac12\left(\frac{E_{8/4}}{\sqrt{2b_{0,2,0}/\mu_1}}-|k|-1\right).
\]
One can easily show that $K$ is non-empty and has at most $j+1$ elements. For $k \in K$, we define the normalized wave functions
\[
\Phi_2(|k|,r)=(2b_{0,2,0}\mu_1)^{1/4}\sqrt{\frac{2\,(m(k)!)}{(m(k)+|k|)!}}\;(r')^{|k|}\,L_{m(k)}^{|k|}(r'^2)\,e^{-r'^2/2}
\]
and take
\[
f_0(r,y,k)=\begin{cases} g_0(y,k)\,\Phi_2(|k|,r) & \text{if } k\in K,\\ 0 & \text{otherwise.}\end{cases}
\]
The functions $g_0(y,k)$ for $k \in K$ are not yet determined. However, we now have
\[
\psi_0(x,r,y,\theta,\phi,\gamma)=\sum_{k\in K} g_0(y,k)\,\Phi_1(x)\,\Phi_2(|k|,r)\,|j,j_z,k\rangle .
\]
For future reference, we let $Z_1$ denote the subspace spanned by $\{\Phi_1(x)\Phi_2(|k|,r)\,|j,j_z,k\rangle : k \in K\}$. The $\perp x$ terms at this order require $[H_{0,x} - E_{6/4}]\psi_{2/4} = 0$, which simply forces $\psi_{2/4} = f_{2/4}(r,y,k)\,\Phi_1(x)$.
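The set $K$ and the normalization of $\Phi_2$ can be checked numerically. The sketch below is illustrative only: it writes the fixed level as $2m + |k| = $ `total`, it assumes that $m(k)$ must be a non-negative integer (which is what makes $K$ have at most $j+1$ elements), and it uses a hypothetical value $\mu_1 = 1$, since the text does not quote the reduced masses. The normalization is with respect to the measure $r\,dr$.

```python
import math


def laguerre(m, alpha, t):
    """Generalized Laguerre polynomial L_m^alpha(t) via the three-term recurrence."""
    if m == 0:
        return 1.0
    prev, cur = 1.0, 1.0 + alpha - t
    for i in range(1, m):
        prev, cur = cur, ((2 * i + 1 + alpha - t) * cur - (i + alpha) * prev) / (i + 1)
    return cur


def phi2(k, m, r, b=0.597, mu1=1.0):
    """Radial eigenfunction Phi_2(|k|, r); mu1 = 1.0 is a hypothetical value."""
    beta = (2.0 * b * mu1) ** 0.25          # r' = beta * r
    rp = beta * r
    norm = beta * math.sqrt(2.0 * math.factorial(m) / math.factorial(m + abs(k)))
    return norm * rp ** abs(k) * laguerre(m, abs(k), rp * rp) * math.exp(-rp * rp / 2.0)


def allowed_k(j, total):
    """The set K for the level 2m + |k| = total, assuming m(k) is a non-negative integer."""
    return [k for k in range(-j, j + 1)
            if abs(k) <= total and (total - abs(k)) % 2 == 0]


def norm_squared(k, m, r_max=12.0, steps=24000):
    """Trapezoidal estimate of the integral of |Phi_2|^2 r dr over (0, r_max)."""
    h = r_max / steps
    acc = 0.0
    for i in range(1, steps):               # the integrand vanishes at r = 0
        r = i * h
        acc += phi2(k, m, r) ** 2 * r
    return acc * h


if __name__ == "__main__":
    j, total = 3, 4
    for k in allowed_k(j, total):
        m = (total - abs(k)) // 2
        print(k, m, round(norm_squared(k, m), 6))
```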
Order $\epsilon^{9/4}$. These terms require
\[
[H_{0,x}-E_{6/4}]\psi_{3/4}+[H_{0,r,\gamma}-E_{8/4}]\psi_{1/4}+a_3x^3\psi_0=E_{9/4}\psi_0 .
\tag{3.2}
\]
The $\parallel x$ components of this equation require
\[
[H_{0,r,\gamma}-E_{8/4}]\psi_{1/4}=E_{9/4}\psi_0 .
\tag{3.3}
\]
We first examine the components of this equation that belong to the subspace $Z_1$. These $\parallel x\ \parallel Z_1$ components require $E_{9/4} = 0$. Next, the $\parallel x\ \perp Z_1$ components of (3.3) that are orthogonal to $Z_1$ require $[H_{0,r,\gamma} - E_{8/4}]\psi_{1/4} = 0$. This forces us to choose
\[
f_{1/4}(r,y,k)=\begin{cases} g_{1/4}(y,k)\,\Phi_2(|k|,r) & \text{if } k\in K,\\ 0 & \text{otherwise.}\end{cases}
\]
The $\perp x$ components of (3.2) require $[H_{0,x} - E_{6/4}]\psi_{3/4} + a_3x^3\psi_0 = 0$. We solve this equation by applying the reduced resolvent operator $[H_{0,x} - E_{6/4}]_r^{-1}$. The result is
\[
\psi_{3/4}(x,r,y,k)=-a_3\,g_0(y,k)\,\Phi_2(|k|,r)\,[H_{0,x}-E_{6/4}]_r^{-1}(x^3\Phi_1(x))+f_{3/4}(r,y,k)\,\Phi_1(x),
\tag{3.4}
\]
where the first term is present only for $k \in K$.
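Order $\epsilon^{10/4}$.
\[
[H_{0,x}-E_{6/4}]\psi_{4/4}+[H_{0,r,\gamma}-E_{8/4}]\psi_{2/4}
-\frac{1}{2\mu_2}\frac{\partial^2\psi_0}{\partial y^2}
+a_3x^3\psi_{1/4}+b_{0,0,2}\,y^2\psi_0+b_{1,0,1}\,xy\,\psi_0=E_{10/4}\psi_0 .
\tag{3.5}
\]
The $\parallel x\ \parallel Z_1$ components require
\[
-\frac{1}{2\mu_2}\frac{\partial^2\psi_0}{\partial y^2}+b_{0,0,2}\,y^2\psi_0=E_{10/4}\psi_0 .
\]
This forces us to choose
\[
E_{10/4}=\left(p+\frac12\right)\sqrt{2b_{0,0,2}/\mu_2}\quad\text{where } p=0,1,\dots,
\]
and
\[
g_0(y,k)=c_{0,k}\,\Phi_3(y)\quad\text{if } k\in K,
\tag{3.6}
\]
where
\[
\Phi_3(y)=(2b_{0,0,2}\mu_2)^{1/8}\,\pi^{-1/4}\,2^{-p/2}\,(p!)^{-1/2}\,H_p(y')\,e^{-y'^2/2}
\]
with $y' = (2b_{0,0,2}\mu_2)^{1/4}y$.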
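As a sanity check on the one dimensional oscillator data at this order, the sketch below verifies numerically that $\Phi_3$ has unit $L^2$ norm and that it satisfies $-\frac{1}{2\mu_2}\Phi_3'' + b_{0,0,2}y^2\Phi_3 = E_{10/4}\Phi_3$ up to finite-difference error. The value $\mu_2 = 1$ is a hypothetical placeholder (the text does not quote the reduced masses), and the helper names are illustrative. The same check applies to $\Phi_1$ with $(a_2, \mu_1, n)$ in place of $(b_{0,0,2}, \mu_2, p)$.

```python
import math


def hermite(p, t):
    """Physicists' Hermite polynomial H_p(t) via H_{i+1} = 2 t H_i - 2 i H_{i-1}."""
    if p == 0:
        return 1.0
    prev, cur = 1.0, 2.0 * t
    for i in range(1, p):
        prev, cur = cur, 2.0 * t * cur - 2.0 * i * prev
    return cur


def phi3(p, y, b=0.664, mu2=1.0):
    """Normalized oscillator state Phi_3(y); mu2 = 1.0 is a hypothetical value."""
    yp = (2.0 * b * mu2) ** 0.25 * y
    norm = ((2.0 * b * mu2) ** 0.125 * math.pi ** (-0.25)
            * 2.0 ** (-p / 2.0) / math.sqrt(math.factorial(p)))
    return norm * hermite(p, yp) * math.exp(-yp * yp / 2.0)


def energy(p, b=0.664, mu2=1.0):
    """E_{10/4} = (p + 1/2) sqrt(2 b / mu2)."""
    return (p + 0.5) * math.sqrt(2.0 * b / mu2)


def residual(p, y, h=1e-3, b=0.664, mu2=1.0):
    """Finite-difference value of -(1/(2 mu2)) Phi'' + b y^2 Phi - E Phi at y."""
    mid = phi3(p, y, b, mu2)
    second = (phi3(p, y + h, b, mu2) - 2.0 * mid + phi3(p, y - h, b, mu2)) / h ** 2
    return -second / (2.0 * mu2) + b * y * y * mid - energy(p, b, mu2) * mid


def norm_squared(p, y_max=10.0, steps=20000):
    """Trapezoidal estimate of the integral of Phi_3^2 over (-y_max, y_max)."""
    h = 2.0 * y_max / steps
    return h * sum(phi3(p, -y_max + i * h) ** 2 for i in range(1, steps))
```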
So far, the $c_{0,k}$ in (3.6) are arbitrary for $k \in K$, but we henceforth assume they satisfy the normalization condition
\[
\sum_{k\in K}|c_{0,k}|^2=1 .
\]
For future reference, we let $Z_2$ denote the subspace spanned by $\{\Phi_1(x)\Phi_2(|k|,r)\Phi_3(y)\,|j,j_z,k\rangle : k \in K\}$. The $\parallel x\ \perp Z_1$ components require
\[
f_{2/4}(r,y,k)=\begin{cases} g_{2/4}(y,k)\,\Phi_2(|k|,r) & \text{if } k\in K,\\ 0 & \text{otherwise.}\end{cases}
\]
The $\perp x$ components require $[H_{0,x} - E_{6/4}]\psi_{4/4} + a_3x^3\psi_{1/4} + b_{1,0,1}\,xy\,\psi_0 = 0$. We apply the reduced resolvent of $H_{0,x}$ to obtain
\[
\begin{aligned}
\psi_{4/4}(x,r,y,k)={}&-a_3\,g_{1/4}(y,k)\,\Phi_2(|k|,r)\,[H_{0,x}-E_{6/4}]_r^{-1}(x^3\Phi_1(x))\\
&-b_{1,0,1}\,c_{0,k}\,y\,\Phi_3(y)\,\Phi_2(|k|,r)\,[H_{0,x}-E_{6/4}]_r^{-1}(x\,\Phi_1(x))
+f_{4/4}(r,y,k)\,\Phi_1(x).
\end{aligned}
\]
Note that the first two terms are zero if $k \notin K$.

Remarks.
(1) At this point, we have completely determined $\psi_0$, except for the values of $c_{0,k}$ for $k \in K$. Restoring the angular dependence in the notation, we have
\[
\psi_0=\sum_{k\in K}c_{0,k}\,\Phi_1(x)\,\Phi_2(|k|,r)\,\Phi_3(y)\,|j,j_z,k\rangle .
\]
Since $j$ and $j_z$ are fixed, this is a linear combination of at most $j+1$ linearly independent states.
(2) As we shall see, the degeneracy generically partially splits at order $\epsilon^{12/4}$. At that point, states with different values of $|k|$ have different energy, but two states with $k = \pm\lambda$ for $\lambda > 0$ have the same $E_{12/4}$. In terms of the energy, the degeneracy of these two states generically splits completely at order $\epsilon^{2+3\lambda}$. When $\lambda = 1$, this splitting has long been observed in the spectra of linear polyatomic molecules. It is called $l$-type doubling [7].
(3) We have determined the dominant terms for the eigenvalue:
\[
E_0+\epsilon^{3/2}\left(n+\frac12\right)\sqrt{2a_2/\mu_1}
+\epsilon^{2}\,(2m(k)+|k|+1)\sqrt{2b_{0,2,0}/\mu_1}
+\epsilon^{5/2}\left(p+\frac12\right)\sqrt{2b_{0,0,2}/\mu_2}.
\]
This quantity does not depend on the quantum numbers $j$, $j_z$, or $k \in K$. The dominant contribution to the energy from the total angular momentum is $\dfrac{j(j+1)\,\epsilon^4}{2\mu_2Y_0^2}$, so it enters at order 16/4.
(4) Below we impose the condition that every $\psi_{l/4}$ with $l > 0$ be orthogonal to the subspace $Z_2$.
(5) At the next order, the pattern emerges for how to do all higher order formal perturbation calculations. For $l \ge 11$, we have the following:
• the $\parallel x\ \parallel Z_1\ \parallel y$ terms determine $E_{l/4}$,
• the $\parallel x\ \parallel Z_1\ \perp y$ terms determine the $y$-dependence of $g_{(l-10)/4}(y,k)$,
• the $\parallel x\ \perp Z_1$ terms determine the $r$ and $k$ dependence of $f_{(l-8)/4}(r,y,k)$, and
• the $\perp x$ terms determine the $x$-dependence of $\psi_{(l-6)/4}(x,r,y,k)$.
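The dominant terms of the eigenvalue listed in Remark (3) are easy to evaluate once the coefficients are known. The sketch below is only an illustration: the function name and argument order are my own, and the reduced masses `mu1` and `mu2` must be supplied by the user because the text does not quote them. It makes explicit that the result depends on $k$ only through the fixed combination $2m(k) + |k|$.

```python
import math


def dominant_energy(eps, n, m, k, p, a0, a2, b020, b002, mu1, mu2):
    """E_0 + eps^{3/2} E_{6/4} + eps^2 E_{8/4} + eps^{5/2} E_{10/4}."""
    e6 = (n + 0.5) * math.sqrt(2.0 * a2 / mu1)                  # x oscillator
    e8 = (2 * m + abs(k) + 1) * math.sqrt(2.0 * b020 / mu1)     # 2D bending oscillator
    e10 = (p + 0.5) * math.sqrt(2.0 * b002 / mu2)               # y oscillator
    return a0 + eps ** 1.5 * e6 + eps ** 2 * e8 + eps ** 2.5 * e10


if __name__ == "__main__":
    # Coefficients quoted in the text for FHCl^-; mu1 = mu2 = 1.0 are hypothetical.
    args = dict(a0=-560.160, a2=0.567, b020=0.597, b002=0.664, mu1=1.0, mu2=1.0)
    print(dominant_energy(0.0821, 0, 0, 0, 0, **args))
```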
Since the general pattern occurs at the next order, we present full calculations for only one more order explicitly.

Order $\epsilon^{11/4}$.
\[
\begin{aligned}
&[H_{0,x}-E_{6/4}]\psi_{5/4}+[H_{0,r,\gamma}-E_{8/4}]\psi_{3/4}+[H_{0,y}-E_{10/4}]\psi_{1/4}\\
&\quad+a_3x^3\psi_{2/4}+b_{1,0,1}\,xy\,\psi_{1/4}+b_{0,2,1}\,r^2y\,\psi_0+b_{1,2,0}\,xr^2\psi_0=E_{11/4}\psi_0 .
\end{aligned}
\]
The $\parallel x\ \parallel Z_1\ \parallel y$ terms require $E_{11/4} = 0$. The $\parallel x\ \parallel Z_1\ \perp y$ terms require
\[
g_{1/4}(y,k)=-b_{0,2,1}\,c_{0,k}\,\langle\Phi_2(|k|,r),\,r^2\Phi_2(|k|,r)\rangle_r\,[H_{0,y}-E_{10/4}]_r^{-1}(y\,\Phi_3(y))
\]
for $k \in K$. This is the first place in the perturbation calculations where different values of $|k|$ yield different results. Note that we could add $c_{1/4,k}\Phi_3(y)$ to $g_{1/4}(y,k)$ when $k \in K$, but we have chosen $c_{1/4,k} = 0$ to impose the condition that $\psi_{1/4}$ be orthogonal to the subspace $Z_2$. See Remark 4 above.

The $\parallel x\ \perp Z_1$ terms require
\[
[H_{0,r,\gamma}-E_{8/4}]f_{3/4}+P_{\perp Z_1}[H_{0,y}-E_{10/4}]f_{1/4}+b_{0,2,1}\,y\,P_{\perp Z_1}r^2f_0=0,
\]
where $P_{\perp Z_1}$ denotes the projection onto functions orthogonal to the subspace $Z_1$. We have already seen that the non-zero $f_{1/4}(r,y,k)$ belong to the subspace $Z_1$, so $P_{\perp Z_1}[H_{0,y} - E_{10/4}]f_{1/4} = 0$. Thus, applying the reduced resolvent of $H_{0,r,\gamma}$ (which is zero on $Z_1$), we obtain
\[
f_{3/4}(r,y,k)=-b_{0,2,1}\,c_{0,k}\,y\,\Phi_3(y)\,[H_{0,r}(|k|)-E_{8/4}]_r^{-1}P_{\perp Z_1}r^2\Phi_2(|k|,r)+g_{3/4}(y,k)\,\Phi_2(|k|,r)\quad\text{if } k\in K,
\]
and $f_{3/4}(r,y,k) = 0$ if $k \notin K$. Here, we have used the notation
\[
H_{0,r}(|k|)=-\frac{1}{2\mu_1}\left(\frac{\partial^2}{\partial r^2}+\frac1r\frac{\partial}{\partial r}-\frac{k^2}{r^2}\right)+b_{0,2,0}\,r^2
\]
and the direct sum decomposition
\[
[H_{0,r,\gamma}-E_{8/4}]_r^{-1}=\bigoplus_{|k|\le j}[H_{0,r}(|k|)-E_{8/4}]_r^{-1}
\]
which results from $H_{0,r,\gamma}$ commuting with $L_z$. The $\perp x$ terms require
\[
[H_{0,x}-E_{6/4}]\psi_{5/4}+[H_{0,r,\gamma}-E_{8/4}]\psi_{3/4}^{\perp x}
+a_3x^3\psi_{2/4}+b_{1,0,1}\,xy\,\psi_{1/4}+b_{1,2,0}\,xr^2\psi_0=0,
\]
where $\psi_{3/4}^{\perp x}$ denotes the component of $\psi_{3/4}$ orthogonal to $\Phi_1(x)$ in the $x$ variables. By combining this with (3.4) and (3.6), we have
\[
\psi_{3/4}^{\perp x}(x,r,y,k)=\begin{cases}
-a_3\,c_{0,k}\,\Phi_3(y)\,\Phi_2(|k|,r)\,[H_{0,x}-E_{6/4}]_r^{-1}(x^3\Phi_1(x)) & \text{if } k\in K,\\
0 & \text{if } k\notin K.
\end{cases}
\]
So, we see that $[H_{0,r,\gamma} - E_{8/4}]\psi_{3/4}^{\perp x} = 0$. Thus, we have
\[
\begin{aligned}
\psi_{5/4}(x,r,y,k)={}&-a_3\,g_{2/4}(y,k)\,\Phi_2(|k|,r)\,\big([H_{0,x}-E_{6/4}]_r^{-1}(x^3\Phi_1(x))\big)\\
&-b_{1,0,1}\,y\,g_{1/4}(y,k)\,\Phi_2(|k|,r)\,\big([H_{0,x}-E_{6/4}]_r^{-1}(x\Phi_1(x))\big)\\
&-b_{1,2,0}\,c_{0,k}\,\Phi_3(y)\,r^2\Phi_2(|k|,r)\,\big([H_{0,x}-E_{6/4}]_r^{-1}(x\Phi_1(x))\big)
+f_{5/4}(r,y,k)\,\Phi_1(x)\quad\text{if } k\in K.
\end{aligned}
\]
For $k \notin K$, $\psi_{5/4}(x,r,y,k) = f_{5/4}(r,y,k)\,\Phi_1(x)$. Note that only $g_{2/4}(y,k)$ (for $k \in K$) and $f_{5/4}(r,y,k)$ in these expressions have not yet been determined.

Remarks.
(1) Amazingly, $\psi_{1/4} \neq 0$. This component of the wave function involves an anharmonic correction related to the bending and AH–B stretching modes. Restoring the angular dependence to the notation, we have
\[
\psi_{1/4}(x,r,y,\theta,\phi,\gamma)=-b_{0,2,1}\sum_{k\in K}c_{0,k}\,\langle\Phi_2(|k|,r),\,r^2\Phi_2(|k|,r)\rangle_r\;
\Phi_1(x)\,\Phi_2(|k|,r)\,[H_{0,y}-E_{10/4}]_r^{-1}(y\,\Phi_3(y))\,|j,j_z,k\rangle .
\]
(2) Although we do not present the full calculations at order $\epsilon^{12/4}$, we do calculate $E_{12/4}$ explicitly. It generically contains non-zero anharmonic corrections.
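Before going further with the expansion, we present a summary of what has been determined so far.
\[
E=E_0+\epsilon^{3/2}\left(n+\frac12\right)\sqrt{2a_2/\mu_1}
+\epsilon^{2}\,(2m(k)+|k|+1)\sqrt{2b_{0,2,0}/\mu_1}
+\epsilon^{5/2}\left(p+\frac12\right)\sqrt{2b_{0,0,2}/\mu_2}+O(\epsilon^{12/4}).
\]
The last information for $E$ came from order 11/4, $\parallel x\ \parallel Z_1\ \parallel y$.
\[
\psi_0=\sum_{k\in K}c_{0,k}\,\Phi_1(x)\,\Phi_2(|k|,r)\,\Phi_3(y)\,|j,j_z,k\rangle .
\]
This was completely determined at order 10/4, $\parallel x\ \parallel Z_1$.
\[
\psi_{1/4}=-b_{0,2,1}\sum_{k\in K}c_{0,k}\,\langle\Phi_2(|k|,r),\,r^2\Phi_2(|k|,r)\rangle_r\;
\Phi_1(x)\,\Phi_2(|k|,r)\,[H_{0,y}-E_{10/4}]_r^{-1}(y\,\Phi_3(y))\,|j,j_z,k\rangle .
\]
This was completely determined at order 11/4, $\parallel x\ \parallel Z_1\ \perp y$.
\[
\psi_{2/4}=\sum_{k\in K}g_{2/4}(y,k)\,\Phi_1(x)\,\Phi_2(|k|,r)\,|j,j_z,k\rangle .
\]
The last information came from order 10/4, $\parallel x\ \perp Z_1$.
\[
\begin{aligned}
\psi_{3/4}={}&-b_{0,2,1}\sum_{k\in K}c_{0,k}\,\Phi_1(x)\,(y\Phi_3(y))\,[H_{0,r}(|k|)-E_{8/4}]_r^{-1}\big(P_{\perp Z_1}r^2\Phi_2(|k|,r)\big)\,|j,j_z,k\rangle\\
&+\sum_{k\in K}g_{3/4}(y,k)\,\Phi_1(x)\,\Phi_2(|k|,r)\,|j,j_z,k\rangle .
\end{aligned}
\]
The last information came from order 11/4, $\parallel x\ \perp Z_1$.
\[
\begin{aligned}
\psi_{4/4}={}&a_3\,b_{0,2,1}\,[H_{0,y}-E_{10/4}]_r^{-1}(y\Phi_3(y))\,[H_{0,x}-E_{6/4}]_r^{-1}(x^3\Phi_1(x))\\
&\qquad\times\sum_{k\in K}c_{0,k}\,\langle\Phi_2(|k|,r),\,r^2\Phi_2(|k|,r)\rangle_r\,\Phi_2(|k|,r)\,|j,j_z,k\rangle\\
&-b_{1,0,1}\,(y\Phi_3(y))\,[H_{0,x}-E_{6/4}]_r^{-1}(x\,\Phi_1(x))\sum_{k\in K}c_{0,k}\,\Phi_2(|k|,r)\,|j,j_z,k\rangle\\
&+\sum_{k=-j}^{j}f_{4/4}(r,y,k)\,\Phi_1(x)\,|j,j_z,k\rangle .
\end{aligned}
\]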
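The last information came from order 10/4, $\perp x$ (coupled with 11/4, $\parallel x\ \parallel Z_1\ \perp y$, because of $g_{1/4}$).
\[
\begin{aligned}
\psi_{5/4}={}&-a_3\sum_{k\in K}g_{2/4}(y,k)\,\Phi_2(|k|,r)\,\big([H_{0,x}-E_{6/4}]_r^{-1}(x^3\Phi_1(x))\big)\,|j,j_z,k\rangle\\
&+b_{1,0,1}\,b_{0,2,1}\sum_{k\in K}c_{0,k}\,\langle\Phi_2(|k|,r),\,r^2\Phi_2(|k|,r)\rangle_r\,\big([H_{0,x}-E_{6/4}]_r^{-1}(x\Phi_1(x))\big)\\
&\qquad\times\Phi_2(|k|,r)\,\big(y\,[H_{0,y}-E_{10/4}]_r^{-1}(y\,\Phi_3(y))\big)\,|j,j_z,k\rangle\\
&-b_{1,2,0}\sum_{k\in K}c_{0,k}\,\Phi_3(y)\,r^2\Phi_2(|k|,r)\,\big([H_{0,x}-E_{6/4}]_r^{-1}(x\Phi_1(x))\big)\,|j,j_z,k\rangle\\
&+\sum_{k=-j}^{j}f_{5/4}(r,y,k)\,\Phi_1(x)\,|j,j_z,k\rangle .
\end{aligned}
\]
(The sign of the second term follows from substituting the expression for $g_{1/4}$ into the term $-b_{1,0,1}\,y\,g_{1/4}\,\Phi_2\,[H_{0,x}-E_{6/4}]_r^{-1}(x\Phi_1)$ obtained at order 11/4.) The last information came from order 11/4, $\perp x$.

We now return to describing higher orders of the perturbation expansion. We determine $E_{12/4}$, and explicitly write the equations that must be solved through order $\epsilon^{16/4}$. That is the order at which the angular momentum quantum number $j$ appears, and the degeneracy due to rotations is split.

Order $\epsilon^{12/4}$.
\[
\begin{aligned}
&[H_{0,x}-E_{6/4}]\psi_{6/4}+[H_{0,r,\gamma}-E_{8/4}]\psi_{4/4}+[H_{0,y}-E_{10/4}]\psi_{2/4}
+a_3x^3\psi_{3/4}+a_4x^4\psi_0\\
&\quad+b_{1,0,1}\,xy\,\psi_{2/4}+b_{0,2,1}\,r^2y\,\psi_{1/4}+b_{1,2,0}\,xr^2\psi_{1/4}+b_{0,4,0}\,r^4\psi_0
+\frac{X_0^2}{2\mu_2Y_0^2}\,H_{0,r,\gamma}\psi_0=E_{12/4}\psi_0 .
\end{aligned}
\]
From the $\parallel x\ \parallel Z_1\ \parallel y$ terms, we can easily solve for $E_{12/4}$.
\[
\begin{aligned}
E_{12/4}={}&-a_3^2\,\langle\Phi_1(x),\,x^3[H_{0,x}-E_{6/4}]_r^{-1}x^3\Phi_1(x)\rangle_x+a_4\,\langle\Phi_1(x),\,x^4\Phi_1(x)\rangle_x\\
&-b_{0,2,1}^2\,\langle\Phi_2(|k|,r),\,r^2\Phi_2(|k|,r)\rangle_r^2\,\langle\Phi_3(y),\,y[H_{0,y}-E_{10/4}]_r^{-1}y\Phi_3(y)\rangle_y\\
&+b_{0,4,0}\,\langle\Phi_2(|k|,r),\,r^4\Phi_2(|k|,r)\rangle_r+\frac{X_0^2}{2\mu_2Y_0^2}\sqrt{2b_{0,2,0}/\mu_1}\,(2m(k)+|k|+1).
\end{aligned}
\]
As long as $b_{0,4,0} \neq 0$, this expression yields different values for different $|k|$. To see this, first note that the factor
\[
\langle\Phi_2(|k|,r),\,r^2\Phi_2(|k|,r)\rangle_r^2=\frac{(2m(k)+|k|+1)^2}{2b_{0,2,0}\mu_1}
\]
does not depend on $k$, and the term
\[
\frac{X_0^2}{2\mu_2Y_0^2}\sqrt{2b_{0,2,0}/\mu_1}\,(2m(k)+|k|+1)
\]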
does not depend on $k$. In fact, the only term that has non-trivial dependence on $k$ in $E_{12/4}$ is
\[
\langle\Phi_2(|k|,r),\,r^4\Phi_2(|k|,r)\rangle_r=\frac{(2+3|k|+k^2)+6(|k|+1)\,m(k)+6\,m(k)^2}{2b_{0,2,0}\mu_1}.
\]
We now show that different values of $k$ yield different values of this quantity. Let $k_1 \ge 0$ and $k_2 \ge 0$ be two different values of $|k|$ that yield the same result. Simultaneously solving
\[
(2+3k_1+k_1^2)+6(k_1+1)m(k_1)+6m(k_1)^2=(2+3k_2+k_2^2)+6(k_2+1)m(k_2)+6m(k_2)^2
\]
and
\[
2m(k_1)+k_1+1=2m(k_2)+k_2+1
\]
forces
\[
m(k_1)=\frac{-3-5k_1+k_2}{6},\qquad m(k_2)=\frac{-3+k_1-5k_2}{6}.
\]
However, $m(k_1)$ and $m(k_2)$ must both be non-negative. There are no simultaneous non-negative solutions to
\[
k_2>3+5k_1,\qquad k_2<\frac{-3+k_1}{5},
\]
since this would require $3 + 5k_1 < -3/5 + k_1/5$, which requires $24k_1 < -18$ or $k_1 < -3/4$. This contradicts $k_1 \ge 0$, so different values of $|k|$ must yield different values for $E_{12/4}$. Therefore, at this level of perturbation, the eigenvalues generically have multiplicity 1 when $k = 0$ and multiplicity 2 when $k \ge 1$. Explicitly,
\[
\begin{aligned}
E_{12/4}={}&-\frac{a_3^2}{32\,\mu_1a_2^2}\,(11+30n+30n^2)+\frac{3a_4}{8a_2\mu_1}\,(1+2n+2n^2)
-\frac{b_{0,2,1}^2}{8\,b_{0,2,0}\,b_{0,0,2}\,\mu_1}\,(2m(k)+|k|+1)^2\\
&+\frac{b_{0,4,0}}{2b_{0,2,0}\mu_1}\,\big((2+3|k|+k^2)+6(|k|+1)\,m(k)+6\,m(k)^2\big)
+\frac{X_0^2}{\mu_2Y_0^2}\sqrt{\frac{b_{0,2,0}}{2\mu_1}}\,(2m(k)+|k|+1).
\end{aligned}
\tag{3.7}
\]
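The explicit coefficients in (3.7) can be cross-checked against the matrix-element form of $E_{12/4}$ by working in the number basis of the one dimensional oscillator, where $x = s\,(A + A^\dagger)$ with $s = (2\mu\omega)^{-1/2}$ and $\omega = \sqrt{2a/\mu}$. The sketch below is an independent check, not the authors' computation; all function names and the numerical inputs in the usage are hypothetical. It also illustrates the $|k|$-dependence: with $N = 2m(k) + |k|$ held fixed, the $r^4$ factor equals $2 + 3N + \tfrac32 N^2 - \tfrac12 k^2$, which is strictly decreasing in $|k|$.

```python
import math


def apply_x(vec, scale):
    """Apply x = scale * (A + A^dagger) to number-basis coefficients."""
    out = [0.0] * len(vec)
    for q, c in enumerate(vec):
        if c == 0.0:
            continue
        if q + 1 < len(vec):
            out[q + 1] += scale * math.sqrt(q + 1) * c
        if q > 0:
            out[q - 1] += scale * math.sqrt(q) * c
    return out


def power_on_state(n, power, a, mu):
    """Coefficients of x^power |n> for H0 = -(1/(2 mu)) d^2/dx^2 + a x^2, and omega."""
    omega = math.sqrt(2.0 * a / mu)
    scale = math.sqrt(1.0 / (2.0 * mu * omega))
    vec = [0.0] * (n + power + 1)       # large enough: no truncation occurs
    vec[n] = 1.0
    for _ in range(power):
        vec = apply_x(vec, scale)
    return vec, omega


def resolvent_sum(n, power, a, mu):
    """<n| x^power [H0 - E_n]_r^{-1} x^power |n> with the reduced resolvent."""
    vec, omega = power_on_state(n, power, a, mu)
    return sum(c * c / ((q - n) * omega) for q, c in enumerate(vec) if q != n)


def x4_expectation(n, a, mu):
    """<n| x^4 |n>, computed as the squared norm of x^2 |n>."""
    vec, _ = power_on_state(n, 2, a, mu)
    return sum(c * c for c in vec)


def r4_factor(m, k):
    """Numerator of <Phi_2, r^4 Phi_2>_r in units of 1 / (2 b020 mu1)."""
    k = abs(k)
    return (2 + 3 * k + k * k) + 6 * (k + 1) * m + 6 * m * m


def e12(n, m, k, a2, a3, a4, b020, b002, b021, b040, mu1, mu2, X0, Y0):
    """Right-hand side of (3.7)."""
    s = 2 * m + abs(k) + 1
    return (-a3 ** 2 * (11 + 30 * n + 30 * n * n) / (32.0 * mu1 * a2 ** 2)
            + 3.0 * a4 * (1 + 2 * n + 2 * n * n) / (8.0 * a2 * mu1)
            - b021 ** 2 * s * s / (8.0 * b020 * b002 * mu1)
            + b040 * r4_factor(m, k) / (2.0 * b020 * mu1)
            + X0 ** 2 / (mu2 * Y0 ** 2) * math.sqrt(b020 / (2.0 * mu1)) * s)
```

For instance, `resolvent_sum(p, 1, b002, mu2)` reproduces the factor $1/(4b_{0,0,2})$ that turns the third matrix-element term into the third term of (3.7).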
Order 13/4 . [H0,x − E6/4 ]ψ7/4 + [H0,r,γ − E8/4 ]ψ5/4 + [H0,y − E10/4 ]ψ3/4 −
1 ∂ψ0 + a3 x3 ψ4/4 + a4 x4 ψ1/4 + b1,0,1 xyψ3/4 + b0,2,1 r2 yψ2/4 2µ2 Y0 ∂y
+ b1,2,0 xr2 ψ2/4 + b0,4,0 r4 ψ1/4 + b0,0,3 y 3 ψ0 + b1,0,2 xy 2 ψ0 + b2,0,1 x2 yψ0 ∂ X02 X0 ∂2 + + H0,r,γ ψ1/4 + r ψ0 2µ2 Y02 µ2 Y02 ∂x∂r ∂x = E13/4 ψ0 + E12/4 ψ1/4 . Order 14/4 . [H0,x − E6/4 ]ψ8/4 + [H0,r,γ − E8/4 ]ψ6/4 + [H0,y − E10/4 ]ψ4/4 −
∂ψ1/4 1 + a3 x3 ψ5/4 + a4 x4 ψ2/4 + b1,0,1 xyψ4/4 + b2,0,1 x2 yψ1/4 2µ2 Y0 ∂y
+ b0,2,1 r2 yψ3/4 + b1,2,0 xr2 ψ3/4 + b0,4,0 r4 ψ2/4 + b0,0,3 y 3 ψ1/4 + b1,0,2 xy 2 ψ1/4 + b0,2,2 r2 y 2 ψ0 + b1,2,1 xr2 yψ0 + b2,2,0 x2 r2 ψ0 ∂ X02 X0 r2 ∂ 2 ∂2 + + H ψ + − ψ0 r ψ 0,r,γ 2/4 1/4 2µ2 Y02 µ2 Y02 ∂x∂r ∂x 2µ2 Y02 ∂x2
X0 ∂ 1 ∂ ∂ ∂ + cos γ − cot θ ∂r r ∂γ sin θ ∂φ ∂γ X0 ∂ ∂ ∂ − sin γ ψ0 + X0 cos γ ∂r r ∂γ ∂θ
1 + µ2 Y02
X0 sin γ
= E14/4 ψ0 + E13/4 ψ1/4 + E12/4 ψ2/4 . Note. This is where we first encounter operators that mix the various different values of k. If we use (2.4) in the above expression and take ψ0 to be a linear combination of the two degenerate states with |k| = λ, we see that the last term on the left hand side of the equation contains L± |j, jz , λ and L± |j, jz , −λ, which are linear combinations of |j, jz , λ ± 1 and L± |j, jz , −λ ± 1, respectively. Thus, ψ6/4 is the lowest order term that involves k = ±λ. Order 15/4 . [H0,x − E6/4 ]ψ9/4 + [H0,r,γ − E8/4 ]ψ7/4 + [H0,y − E10/4 ]ψ5/4 −
1 ∂ψ2/4 + a3 x3 ψ6/4 + a4 x4 ψ3/4 + a5 x5 ψ0 + b1,0,1 xyψ5/4 2µ2 Y0 ∂y
+ b0,2,1r2 yψ4/4 + b1,2,0 xr2 ψ4/4 + b0,4,0 r4 ψ3/4 + b0,0,3 y 3 ψ2/4 + b2,0,1x2 yψ2/4 + b1,0,2 xy 2 ψ2/4 + b0,2,2 r2 y 2 ψ1/4 + b1,2,1 xr2 yψ1/4 + b2,2,0x2 r2 ψ1/4 + b0,4,1 r4 yψ0 + b1,4,0 xr4 ψ0 ∂ X02 X0 r2 ∂ 2 ∂2 + + H ψ + − ψ1/4 r ψ 0,r,γ 3/4 2/4 2µ2 Y02 µ2 Y02 ∂x∂r ∂x 2µ2 Y02 ∂x2 X0 ∂ 1 ∂ ∂ 1 ∂ + cos γ − cot θ X + sin γ 0 µ2 Y02 ∂r r ∂γ sin θ ∂φ ∂γ X0 ∂ ∂ ∂ − sin γ ψ1/4 + X0 cos γ ∂r r ∂γ ∂θ 1 2 X0 1∂ X0 y ∂2 + + − L x − − ψ0 µ2 Y02 Y0 ∂r2 r ∂r r2 z 1 ∂ ∂ 1 ∂ − cot θ + −r sin γ µ2 Y02 ∂x sin θ ∂φ ∂γ ∂2 ψ0 − r cos γ ∂x∂θ = E15/4 ψ0 + E14/4 ψ1/4 + E13/4 ψ2/4 + E12/4 ψ3/4 . Order 16/4 . [H0,x − E6/4 ]ψ10/4 + [H0,r,γ − E8/4 ]ψ8/4 + [H0,y − E10/4 ]ψ6/4 −
1 ∂ψ3/4 + a3 x3 ψ7/4 + a4 x4 ψ4/4 + a5 x5 ψ1/4 + b1,0,1 xyψ6/4 2µ2 Y0 ∂y
+ b0,2,1 r2 yψ5/4 + b1,2,0 xr2 ψ5/4 + b0,4,0 r4 ψ4/4 + b2,0,1 x2 yψ3/4 + b0,0,3 y 3 ψ3/4 + b1,0,2 xy 2 ψ3/4 + b0,2,2 r2 y 2 ψ2/4 + b1,2,1 xr2 yψ2/4 + b2,2,0 x2 r2 ψ2/4 + b0,4,1 r4 yψ1/4 + b1,4,0 xr4 ψ1/4 + b0,6,0 r6 ψ0 + b0,0,4 y 4 ψ0 + b1,0,3 xy 3 ψ0 + b2,0,2 x2 y 2 ψ0 + b3,0,1 x3 yψ0 ∂ X02 X0 r2 ∂ 2 ∂2 + + H0,r,γ ψ4/4 + ψ2/4 r ψ3/4 − 2 2 2µ2 Y0 µ2 Y0 ∂x∂r ∂x 2µ2 Y02 ∂x2 X0 ∂ 1 ∂ ∂ 1 ∂ + cos γ − cot θ + sin γ X 0 µ2 Y02 ∂r r ∂γ sin θ ∂φ ∂γ X0 ∂ ∂ X0 X0 y ∂ − sin γ ψ2/4 + + X0 cos γ x − H0,r,γ ψ1/4 ∂r r ∂γ ∂θ µ2 Y02 Y0
1 + µ2 Y02
2X0 y + − µ2 Y03 +
∂ −r sin γ ∂x
∂2 ∂ +r ∂x ∂x∂r
∂ 1 ∂ − cot θ sin θ ∂φ ∂γ
1 + 2µ2 Y02
∂2 ψ1/4 − r cos γ ∂x∂θ
∂ ∂ ∂2 2 +r + 2x − Lz 2xr ψ0 ∂x∂r ∂r ∂x
j(j + 1) 1 ∂ψ0 + D(0)ψ0 ψ0 + y 2µ2 Y02 µ2 Y02 ∂y
$= E_{16/4}\psi_0 + E_{15/4}\psi_{1/4} + E_{14/4}\psi_{2/4} + E_{13/4}\psi_{3/4} + E_{12/4}\psi_{4/4}.$

3.1. The complete asymptotic expansion

We now prove the existence of a complete expansion in powers of $\epsilon^{1/4}$ for the quasienergies and the corresponding quasimodes under suitable hypotheses. The following proposition completes the proof of Theorem 2.1.

Proposition 3.1. We assume the potential energy surface (2.5) is smooth, with Taylor series given by (2.6) and (2.7). Then, the eigenvalue problem for (3.1) can be solved by formal asymptotic expansions of the form
\[
E=\sum_{l=0}^{N}\epsilon^{l/4}E_{l/4}+O(\epsilon^{(N+1)/4}),
\]
\[
\psi(x,r,y,\theta,\phi,\gamma)=\sum_{l=0}^{N}\epsilon^{l/4}\psi_{l/4}(x,r,y,\theta,\phi,\gamma)+O(\epsilon^{(N+1)/4}),
\]
for any N ∈ N. Proof. Keeping the original variables (X, R, Y ), we first make use of the invariant subspace L generated by the basis {|k}k=−j,...,j of eigenvectors of Lz , where we have dropped the fixed parameters j and jz from the notation. In this basis, the the identity operator J 2 − 2L · J + L2 can be represented by a matrix. ∂ Let I denote ∂ ∂ −i +i cos(θ) matrix, A denote the matrix representation of i sin(γ) sin(θ) ∂φ ∂γ +cos(γ) ∂θ , ∂ ∂ ∂ and B denote the matrix representation of i cos(γ) sin(θ) −i ∂φ + i cos(θ) ∂γ + sin(γ) ∂θ . Note that these angular differential operators can be written as linear combinations of L+ and L− , which ensures that they leave L invariant. With these definitions, we can write 2 ∂2 ∂2 2 2 2 ∂ − X + 2XR J − 2L · J + L = j(j + 1) + −R2 ∂X 2 ∂RX ∂R2 2 2 ∂ ∂ X X + 2X + R− − 1 L2z I+ R ∂R ∂X R2 ∂ ∂ X −X −2 R A − 2 B. ∂X ∂R R
Then, going to the rescaled variables and dropping the symbol I, the differential operator (3.1) takes the form −
8/4 6/4 ∂ 2 − 2 2µ1 () ∂x 2µ1 ()
10/4 ∂ 2 2µ2 () ∂y 2 2 ∂ ∂ 12/4 (X0 + 3/4 x)2 1 2 13/4 1∂ − − L − + r ∂r r2 z µ2 ()(Y0 + 3/4 y) ∂y 2µ2 ()(Y0 + 3/4 y)2 ∂r2 13/4 (X0 + 3/4 x) ∂ ∂2 + + r ∂x∂r ∂x µ2 ()(Y0 + 3/4 y)2 2 1 14/4 14/4 (X0 + 3/4 x) ∂ 2 ∂ A− B − + r r ∂x2 µ2 ()(Y0 + 3/4 y)2 ∂r 2µ2 ()(Y0 + 3/4 y)2 15/4 16/4 ∂ ∂ 2 − − Lz r A + j(j + 1) + r ∂x ∂r µ2 ()(Y0 + 3/4 y)2 2µ2 ()(Y0 + 3/4 y)2 + a0 +
∞ j=2
∂2 1 1∂ − L2 + ∂r2 r ∂r r2 z
aj 3j/4 xj +
bj,k,l 1+
−
3(j+l)+2k 4
xj rk y l .
j+k+l≥2 k+l≥1 k even
We get a matrix valued differential operator given as a formal infinite series in powers of $\epsilon^{1/4}$ by expanding the reduced masses $\mu_j(\epsilon)$ and the denominators $(Y_0 + \epsilon^{3/4}y)$ and $(Y_0 + \epsilon^{3/4}y)^2$. Observe that in each term of the resulting expansion, the differential operators are at most of order two. The $r$ dependence of these operators is explicit, which will allow us to check that the factors $1/r$ and $1/r^2$ do not cause divergences in the expressions that we encounter below. The measure in the $r$ variable is $r\,dr$, so the only term that might yield a vector not in $L^2$ is the $L_z^2/r^2$. In the eigenspace where $L_z$ multiplies by zero, there is no problem. In the eigenspaces where $L_z$ multiplies by something non-zero, the wave functions contain factors of $r$, so again, there is no problem.

We introduce the notation
\[
\Psi_{l/4}(x,r,y)=\sum_{k=-j}^{j}\psi_{l/4}(x,r,y,k)\,|k\rangle\equiv
\begin{pmatrix}\psi_{l/4}(x,r,y,-j)\\ \psi_{l/4}(x,r,y,-j+1)\\ \vdots\\ \psi_{l/4}(x,r,y,j)\end{pmatrix}.
\]
We have already explicitly presented perturbation theory through order $\epsilon^{l/4}$ for $l \le 11$. The equation we must solve at order $\epsilon^{l/4}$ with $l \ge 12$ now can be
expressed as
\[
\begin{aligned}
&(H_{0,x}-E_{6/4})\Psi_{(l-6)/4}+(H_{0,r,\gamma}-E_{8/4})\Psi_{(l-8)/4}+(H_{0,y}-E_{10/4})\Psi_{(l-10)/4}\\
&\quad+a_3x^3\Psi_{(l-9)/4}+b_{1,0,1}\,xy\,\Psi_{(l-10)/4}+\sum_{q=11}^{l}D_q\Psi_{(l-q)/4}
=E_{l/4}\Psi_{0/4}+\cdots+E_{12/4}\Psi_{(l-12)/4},
\end{aligned}
\tag{3.8}
\]
where the symbols $D_q$ denote at most second order differential operators in $x, r, y$ with matrix valued coefficients whose entries are polynomials in these variables divided by $r^p$, with $p = 0, 1, 2$. We note also that $H_{0,r,\gamma}$ is now matrix-valued, because of the centrifugal term $L_z^2/r^2$, whereas $H_{0,x}$ and $H_{0,y}$ are scalar differential operators multiplied by the identity matrix. The point of this decomposition is to separate the vectors $\Psi_{q/4}$ of order less than or equal to $(l-11)/4$ from those of order $(l-10)/4$ to $(l-6)/4$.

Let $P_x$, $P_y$ and $P_{r,\gamma}$ be the orthogonal projectors on the eigenstates $\Phi_1(x)$, $\Phi_3(y)$ and on the subspace $Z_0 = \mathrm{span}\{\Phi_2(r,|k|)\,|k\rangle\}_{k\in K}$, respectively. We abuse notation and use the same symbols to denote the corresponding projectors when considered on $L^2(\mathbb{R}_x, dx) \otimes L^2(\mathbb{R}_r^+, r\,dr) \otimes L^2(\mathbb{R}_y, dy) \otimes \mathcal{L}$. Note that these operators commute with one another and that the following identity holds for any $q \in \mathbb{N}$:
\[
P_x\,x^{2q+1}=P_x\,x^{2q+1}P_x^{\perp},\quad\text{where } P_x^{\perp}=I-P_x .
\tag{3.9}
\]
Also, we have constructed $\Psi_{l/4}$ so that
\[
\Psi_0=P_xP_{r,\gamma}P_y\Psi_0\quad\text{and}\quad P_xP_{r,\gamma}P_y\Psi_{l/4}=0\quad\text{for all } l\ge 1 .
\tag{3.10}
\]
Hence, for $l \ge 1$,
\[
\Psi_{l/4}=P_x^{\perp}\Psi_{l/4}+P_xP_{r,\gamma}^{\perp}\Psi_{l/4}+P_xP_{r,\gamma}P_y^{\perp}\Psi_{l/4}.
\tag{3.11}
\]
In terms of the quantities introduced in the explicit computations of the lower orders, we have in particular
\[
\begin{aligned}
P_x\Psi_{l/4}&=\sum_{k=-j}^{j}\Phi_1(x)\,f_{l/4}(r,y,k)\,|k\rangle,\\
P_xP_{r,\gamma}\Psi_{l/4}&=\sum_{k\in K}\Phi_1(x)\,\Phi_2(r,|k|)\,g_{l/4}(y,k)\,|k\rangle,\\
P_xP_{r,\gamma}P_y\Psi_0&=\sum_{k\in K}\Phi_1(x)\,\Phi_2(r,|k|)\,\Phi_3(y)\,c_{k,0}\,|k\rangle,
\end{aligned}
\tag{3.12}
\]
where $c_{k,0} \in \mathbb{C}$ and $\sum_{k\in K}|c_{k,0}|^2 = 1$. Note that by virtue of (3.10),
\[
g_{l/4}(y,k)=P_y^{\perp}g_{l/4}(y,k)\quad\text{for any } k\in K \text{ and any } l>0 .
\tag{3.13}
\]
We solve (3.8) by two independent steps. The first consists of determining the vectors $\Psi_{l/4}$ for any set of coefficients $\{c_{0,k}\}_{k\in K}$, and the other consists of solving
an eigenvalue equation for $E_{l/4}$ in $\mathbb{C}^{\#(K)}$ which may reduce the set of free coefficients $\{c_{0,k}\}_{k\in K}$. It is only when we construct the actual quasimode that we restrict the values of the coefficients $\{c_{0,k}\}_{k\in K}$ to those given by the determination of the $E_{l/4}$'s.

We now formulate our induction hypothesis for $l \ge 12$.

IH: After solving Eq. (3.8) through order $\epsilon^{(l-1)/4}$ for vectors satisfying (3.10), we have:

• The following vectors are determined completely in terms of the coefficients $\{c_{0,k}\}_{k\in K}$ and depend linearly on $\{c_{0,k}\}_{k\in K}$:
\[
\begin{aligned}
&\Psi_{q/4}\ \text{ for } q=0,1,\dots,l-11,\qquad (I-P_xP_{r,\gamma})\Psi_{(l-10)/4},\qquad (I-P_xP_{r,\gamma})\Psi_{(l-9)/4},\\
&(I-P_x)\Psi_{(l-8)/4},\quad\text{and}\quad (I-P_x-P_x^{\perp}P_{r,\gamma})\Psi_{(l-7)/4}.
\end{aligned}
\tag{3.14}
\]
• The $x$ dependence of the vector $P_x^{\perp}P_{r,\gamma}\Psi_{(l-7)/4}$ is determined and has the form
\[
P_x^{\perp}P_{r,\gamma}\Psi_{(l-7)/4}=P_x^{\perp}P_{r,\gamma}\Psi_{(l-7)/4}(\{g_{(l-10)/4}\}),
\tag{3.15}
\]
with linear dependence on $\{g_{(l-10)/4}(y,k)\}_{k\in K}$, the set of functions $\{g_{(l-10)/4}\}$ entailing the unknown $y$ dependence.
• There exist vector spaces $W_q \subseteq \mathbb{C}^{\#(K)}$ satisfying
\[
\mathbb{C}^{\#(K)}=W_0\supseteq W_1\supseteq\cdots\supseteq W_{l-1}
\tag{3.16}
\]
such that $E_{q/4}$ is determined by an eigenvalue equation in $W_q$, for $q = 0, 1, \dots, l-1$.

Our explicit computations show that these properties are satisfied for $l = 12$, with $W_q = \mathbb{C}^{\#(K)}$, for $q = 0, \dots, 11$.

We now show that the induction hypothesis holds at order $\epsilon^{l/4}$. Using (3.9) and (3.10) and applying $P_xP_{r,\gamma}P_y$ to Eq. (3.8) yields
\[
E_{l/4}\Psi_0=P_xP_{r,\gamma}P_y\Big(a_3x^3P_x^{\perp}\Psi_{(l-9)/4}+b_{1,0,1}\,xy\,P_x^{\perp}\Psi_{(l-10)/4}+\sum_{q=11}^{l}D_q\Psi_{(l-q)/4}\Big).
\]
We note that for $s = 9, 10$, the vectors $P_x^{\perp}\Psi_{(l-s)/4} = P_x^{\perp}(I - P_xP_{r,\gamma})\Psi_{(l-s)/4}$ are completely determined by IH. By IH again, the right-hand side depends linearly on the set $\{c_{0,k}\}_{k\in K}$. Expressing the equation in the basis $\{\Phi_1(x)\Phi_2(|k|,r)\Phi_3(y)\}_{k\in K}$ of $Z_2$, we get a finite dimensional eigenvalue equation. Restricting attention to the subspace $W_{l-1} \subseteq \mathbb{C}^{\#(K)}$ of free coefficients, we get an eigenvalue equation in $W_{l-1}$ which we solve to yield $E_{l/4}$ and the subspace $W_l \subseteq W_{l-1}$ of free coefficients.
We now turn to the computation of the vectors. Application of $P_xP_{r,\gamma}P_y^{\perp}$ to Eq. (3.8) yields
\[
P_xP_{r,\gamma}P_y^{\perp}\Psi_{(l-10)/4}=-(H_{0,y}-E_{10/4})_r^{-1}P_xP_{r,\gamma}P_y^{\perp}\Big(a_3x^3P_x^{\perp}\Psi_{(l-9)/4}+b_{1,0,1}\,xy\,P_x^{\perp}\Psi_{(l-10)/4}+\sum_{q=11}^{l}\tilde D_q\Psi_{(l-q)/4}\Big),
\tag{3.17}
\]
where $\tilde D_q = D_q - E_{q/4}$. The right hand side is known by IH, and since $P_xP_{r,\gamma}P_y^{\perp}\Psi_{(l-10)/4} = P_xP_{r,\gamma}\Psi_{(l-10)/4}$ (see (3.12), (3.13)), (3.11) implies that $\Psi_{(l-10)/4}$ is fully determined up to the coefficients $\{c_{0,k}\}_{k\in K}$. Since the dependence of $P_xP_{r,\gamma}\Psi_{(l-10)/4}$ is linear in the previously determined quantities, we get by IH that $\Psi_{(l-10)/4}$ depends linearly on the coefficients $\{c_{0,k}\}_{k\in K}$. Hence, the vector $P_x^{\perp}P_{r,\gamma}\Psi_{(l-7)/4}(\{g_{(l-10)/4}\})$ in IH is, in turn, fully determined, and it depends linearly on the $\{c_{0,k}\}_{k\in K}$'s. Thus, the same is true for $(I - P_x)\Psi_{(l-7)/4}$.

Application of $P_xP_{r,\gamma}^{\perp}$ to Eq. (3.8) yields
\[
P_xP_{r,\gamma}^{\perp}\Psi_{(l-8)/4}=-(H_{0,r,\gamma}-E_{8/4})_r^{-1}P_xP_{r,\gamma}^{\perp}\Big((H_{0,y}-E_{10/4})\Psi_{(l-10)/4}+a_3x^3P_x^{\perp}\Psi_{(l-9)/4}+b_{1,0,1}\,xy\,P_x^{\perp}\Psi_{(l-10)/4}+\sum_{q=11}^{l}\tilde D_q\Psi_{(l-q)/4}\Big),
\tag{3.18}
\]
where, by the same arguments, the right-hand side is fully determined up to the coefficients $\{c_{0,k}\}_{k\in K}$, on which it depends linearly. Now, from IH and the identity
\[
P_x\Psi_{(l-8)/4}=P_xP_{r,\gamma}\Psi_{(l-8)/4}+P_xP_{r,\gamma}^{\perp}\Psi_{(l-8)/4}
\]
we see that $(I - P_xP_{r,\gamma})\Psi_{(l-8)/4}$ is fully determined and depends linearly on the coefficients $\{c_{0,k}\}_{k\in K}$.

Finally, application of $P_x^{\perp}$ to Eq. (3.8) yields
\[
\begin{aligned}
P_x^{\perp}\Psi_{(l-6)/4}=-(H_{0,x}-E_{6/4})_r^{-1}P_x^{\perp}\Big(&(H_{0,r,\gamma}-E_{8/4})P_x^{\perp}\Psi_{(l-8)/4}+(H_{0,y}-E_{10/4})P_x^{\perp}\Psi_{(l-10)/4}\\
&+a_3x^3\Psi_{(l-9)/4}+b_{1,0,1}\,xy\,\Psi_{(l-10)/4}+\sum_{q=11}^{l}\tilde D_q\Psi_{(l-q)/4}\Big),
\end{aligned}
\tag{3.19}
\]
where, this time, the right-hand side is not fully determined since there is no projector $P_x^{\perp}$ acting on $\Psi_{(l-9)/4}$. However, at this step, $\Psi_{(l-10)/4}$ and $P_x^{\perp}\Psi_{(l-8)/4} = P_x^{\perp}(I - P_xP_{r,\gamma})\Psi_{(l-8)/4}$ are fully determined and linear in the $\{c_{0,k}\}_{k\in K}$, so that from IH we see that the only undetermined part comes from
\[
P_xP_{r,\gamma}\Psi_{(l-9)/4}=\sum_{k\in K}\Phi_1(x)\,\Phi_2(r,|k|)\,g_{(l-9)/4}(y,k)\,c_{k,0}\,|k\rangle .
\]
We conclude that the $x$ dependence of the vector $P_x^{\perp}\Psi_{(l-6)/4}$ is determined, and that the undetermined part of this vector depends on the set of functions $\{g_{(l-9)/4}(y,k)\}_{k\in K}$ purely linearly. Thus, we have reproduced all the requirements of the induction hypothesis, which ends the proof.

3.2. The expansion around a local minimum

We now describe the construction of quasimodes of arbitrarily high order under assumptions that are only local. This construction uses the formal expansions of Proposition 3.1 and the insertion of cutoff functions. The construction is quite similar to that given in [5], so we refrain from presenting all details.

Let $N \ge 0$ be fixed and set
\[
\begin{aligned}
\Psi^{(N)}(x,r,y,\theta,\phi,\gamma)&=\sum_{l=0}^{N}\epsilon^{l/4}\psi_{l/4}(x,r,y,\theta,\phi,\gamma),\\
E^{(N)}&=\sum_{l=0}^{N}\epsilon^{l/4}E_{l/4},\\
V^{(N)}(X,Y,R)&=\sum_{l\le (N+1)/3}a_l\,(X-X_0)^l
+\epsilon\sum_{\substack{j+k+l\ge 2\\ k+l\ge 1\\ k\ \mathrm{even}\\ 4+3(j+l)+2k\le N}}b_{j,k,l}\,(X-X_0)^jR^k(Y-Y_0)^l,
\end{aligned}
\tag{3.20}
\]
where the vectors $\psi_{l/4}$ and the scalars $E_{l/4}$ are defined in Proposition 3.1. Then we introduce a cutoff function. Let $F : \mathbb{R} \to [0,1]$ be $C^\infty$ and such that $\mathrm{supp}\,F \subset [-2,2]$ with $F(t) = 1$ for $t \in [-1,1]$. We set
\[
F_\epsilon(X,R,Y)=F\big((X-X_0)/\epsilon^{\delta_1}\big)\,F\big(R/\epsilon^{\delta_2}\big)\,F\big((Y-Y_0)/\epsilon^{\delta_3}\big),
\]
where $0 < \delta_1 < 3/4$, $0 < \delta_2 < 1/2$ and $0 < \delta_3 < 3/4$. The quasimode $\Psi_Q^{(N)}$ is defined as
\[
\Psi_Q^{(N)}(X,R,Y,\theta,\phi,\gamma)=\epsilon^{-5/4}\,F_\epsilon(X,R,Y)\,\Psi^{(N)}\big((X-X_0)/\epsilon^{3/4},\,R/\epsilon^{1/2},\,(Y-Y_0)/\epsilon^{3/4},\,\theta,\phi,\gamma\big).
\tag{3.21}
\]
The factor of $\epsilon^{-5/4}$ in this expression ensures asymptotic normalization of the quasimode because of the Jacobian factor in the integral for the $L^2$ norm.

Proposition 3.2. Let
\[
H(\epsilon)=-\frac{\epsilon^3}{2\mu_1(\epsilon)}\Delta_{(X_1,X_2,X_3)}-\frac{\epsilon^4}{2\mu_2(\epsilon)}\Delta_{(Y_1,Y_2,Y_3)}+V_1(X)+\epsilon\,V_2(X,R,Y),
\]
satisfy the hypotheses of Proposition 3.1. Then, for any $N \in \mathbb{N}$, there exists a constant $C_N$, such that the vector (3.21) and the scalar (3.20) satisfy
\[
\|\Psi_Q^{(N)}\|=1+O(\epsilon^{1/4})
\quad\text{and}\quad
\frac{\|H(\epsilon)\Psi_Q^{(N)}-E^{(N)}\Psi_Q^{(N)}\|}{\|\Psi_Q^{(N)}\|}\le C_N\,\epsilon^{(N+1)/4},\quad\text{as } \epsilon\to 0 .
\]
Proof. We begin by computing the norm of $\Psi_Q^{(N)}$. The vectors $\psi_{l/4}$, for $l = 0, \dots, N$, are given as finite linear combinations of angular functions $|k, j_z, j\rangle$, $(k = -j, \dots, j)$, multiplied by Gaussians in $x, r, y$, times polynomials in these variables. Thus, they all belong to $L^2$. In particular, by our choices for $\psi_0$, we have
\[
\int\big|\epsilon^{-5/4}\psi_0\big((X-X_0)/\epsilon^{3/4},\,R/\epsilon^{1/2},\,(Y-Y_0)/\epsilon^{3/4},\,\theta,\phi,\gamma\big)\big|^2\,R\,dR\,dX\,dY\,d\Omega
=\int|\psi_0(x,r,y,\theta,\phi,\gamma)|^2\,r\,dr\,dx\,dy\,d\Omega=1,
\]
where $d\Omega$ denotes the solid angle element in the angular variables. The norms of the other $\psi_{l/4}$ are similarly $O(1)$. Hence $\|\Psi_Q^{(N)}\|^2 = \|\Psi^{(N)} + (F_\epsilon - 1)\Psi^{(N)}\|^2$, where
\[
\|(1-F_\epsilon)\Psi^{(N)}\|^2\le\int_{\substack{|X-X_0|\ge\epsilon^{\delta_1}\\ R\ge\epsilon^{\delta_2}\\ |Y-Y_0|\ge\epsilon^{\delta_3}}}\big|\epsilon^{-5/4}\Psi^{(N)}\big((X-X_0)/\epsilon^{3/4},\,R/\epsilon^{1/2},\,(Y-Y_0)/\epsilon^{3/4},\,\theta,\phi,\gamma\big)\big|^2\,R\,dR\,dX\,dY\,d\Omega .
\tag{3.22}
\]
The choice of exponents $\delta_j$ and the exponential decay of $\Psi^{(N)}$ imply that (3.22) is of order $\epsilon^\infty$, and we finally see that $\|\Psi_Q^{(N)}\| = 1 + O(\epsilon^{1/4})$.

By construction, there exist $C > 0$ and $D > 0$, independent of $\epsilon$, such that $R^{(N)}(X,R,Y) = V_1(X) + \epsilon V_2(X,R,Y) - V^{(N)}(X,Y,R)$ satisfies
\[
|R^{(N)}(X,R,Y)|\le C\big(|X-X_0|^{(N+1)/3}+\epsilon\,|X-X_0|^a\,R^b\,|Y-Y_0|^c\big),
\tag{3.23}
\]
where $4 + 3(a + c) + 2b \ge N + 1$, if $(|X - X_0| + R + |Y - Y_0|) < D$. Consider now
\[
V\Psi_Q^{(N)}=V^{(N)}\Psi_Q^{(N)}+R^{(N)}\Psi_Q^{(N)}=V^{(N)}F_\epsilon\Psi^{(N)}+R^{(N)}F_\epsilon\Psi^{(N)} .
\]
Due to the support conditions imposed by the cutoff, we can estimate $F R^{(N)}$ by means of (3.23), and, after passing to the rescaled variables $x$, $r$, $y$, we obtain
$$F(X, R, Y)\, |R^{(N)}(X, R, Y)| \le F(X, R, Y)\, \epsilon^{(N+1)/4}\, C\,(|x|^{(N+1)/3} + |x|^a r^b |y|^c).$$
Once again using the Gaussian decay of $\Psi^{(N)}$, we finally get the $L^2$ estimate $\|R^{(N)} \Psi_Q^{(N)}\| = O(\epsilon^{(N+1)/4})$. We now have estimated everything except the terms in which the kinetic energy acts on the cutoffs. First note that derivatives with respect to angular variables do not affect the cutoffs. Next, by the Leibniz formula, the first and second derivatives with respect to $x$, $y$, or $r$ acting on $F \Psi^{(N)}$ yield supplementary terms given by first and second derivatives of $F$ multiplied by $\Psi^{(N)}$ or first derivatives of $\Psi^{(N)}$. By construction of the cutoff, the successive derivatives of $F$ are supported away from the origin in at least one of the variables $x$, $y$, or $r$. Since $\Psi^{(N)}$ and its derivatives are Gaussians times polynomials in these variables, these supplementary terms are all of order $\epsilon^\infty$. Finally, taking into account the formal expansions of Theorem 3.1, and the definition
$$H^{(N)}(\epsilon) = -\frac{\epsilon^3}{2\mu_1(\epsilon)} \Delta_{(X_1, X_2, X_3)} - \frac{\epsilon^4}{2\mu_2(\epsilon)} \Delta_{(Y_1, Y_2, Y_3)} + V^{(N)}(X, R, Y, \epsilon),$$
we get the $L^2$ norm estimate
$$\|H(\epsilon)\Psi_Q^{(N)} - E^{(N)} \Psi_Q^{(N)}\| = \|H^{(N)}(\epsilon)\Psi_Q^{(N)} - E^{(N)} \Psi_Q^{(N)}\| + O(\epsilon^{(N+1)/4}) = \|F\,(H^{(N)}(\epsilon)\Psi^{(N)} - E^{(N)} \Psi^{(N)})\| + O(\epsilon^{(N+1)/4}) + O(\epsilon^\infty) = O(\epsilon^{(N+1)/4}).$$
4. Inclusion of the Electrons

In this section, we show that including the quantum mechanical treatment of the electrons does not change the expression for the energy up to an error of order $\epsilon^3$. We decompose the Hamiltonian for all the particles in the molecule as the sum of the nuclear kinetic energy plus a self-adjoint electron Hamiltonian $h_1(Y, \theta, \phi, R, \gamma, X)$. The electron Hamiltonian depends parametrically on $(Y, \theta, \phi, R, \gamma, X)$ and acts on functions of all of the electron variables, which we describe jointly with the single symbol $Z$. To avoid questions about Berry phases, we assume $h_1(Y, \theta, \phi, R, \gamma, X)$ commutes with complex conjugation, i.e. it is a real symmetric operator. Because of rotational symmetries, the electron Hamiltonian can be written as
$$h_1(Y, \theta, \phi, R, \gamma, X) = U(\theta, \phi, \gamma)\, h_2(X, R, Y)\, U(\theta, \phi, \gamma)^{-1},$$
where $U(\theta, \phi, \gamma)$ is unitary on the electron Hilbert space and depends smoothly on $\theta$, $\phi$, and $\gamma$. As a consequence, discrete eigenvalues of $h_1(Y, \theta, \phi, R, \gamma, X)$ do not depend on $\theta$, $\phi$, or $\gamma$. We assume that the resolvent of $h_2(X, R, Y)$ depends smoothly on $(X, R, Y)$. As a result, all discrete eigenvalues of $h_1(Y, \theta, \phi, R, \gamma, X)$ depend smoothly on the nuclear configurations. We assume further that the ground state eigenvalue $V(X, R, Y)$ of $h_1(Y, \theta, \phi, R, \gamma, X)$ is discrete and non-degenerate for each fixed value of $(Y, \theta, \phi, R, \gamma, X)$. We also assume that $V(X, R, Y)$ has a global minimum at $(X_0, 0, Y_0)$ with a strictly positive Hessian at that minimum. To ensure that we are approximating discrete eigenvalues for the full molecular Hamiltonian, we assume that $V(X_0, 0, Y_0)$ is strictly below the bottom of the spectrum of $h_2(X, R, Y)$ for all $(X, R, Y)$ outside a small neighborhood of $(X_0, 0, Y_0)$.
We now introduce $\epsilon$-dependence in $h_2$, and hence $h_1$. We choose functions $V_1(X)$ and $V_2(X, R, Y)$ that satisfy $V(X, R, Y) = V_1(X) + \epsilon_0 V_2(X, R, Y)$ and the restrictions imposed after expression (2.5). Here $\epsilon_0$ is a fixed value of $\epsilon$ that we take to be the fourth root of the electron mass divided by the carbon $C^{12}$ nuclear mass. We then define $h(\epsilon, Y, \theta, \phi, R, \gamma, X)$ by replacing $V(X, R, Y)$ by $V_1(X) + \epsilon V_2(X, R, Y)$ in the spectral decomposition of $h_1(Y, \theta, \phi, R, \gamma, X)$. Thus, we only introduce $\epsilon$-dependence in this single eigenvalue and alter none of the eigenfunctions.
Remark. To minimize technicalities, we have made assumptions for all $(X, R, Y)$. At the expense of inserting cutoff functions, our assumptions need only be imposed for $(X, R, Y)$ in a neighborhood of $(X_0, 0, Y_0)$.
We shall write down an explicit quasimode with an $O(\epsilon^{12/4})$ energy error for the Schrödinger operator
$$H(\epsilon) = -\frac{\epsilon^3}{2\mu_1(\epsilon)} \Delta_{(X_1, X_2, X_3)} - \frac{\epsilon^4}{2\mu_2(\epsilon)} \Delta_{(Y_1, Y_2, Y_3)} + h(\epsilon, X_1, X_2, X_3, Y_1, Y_2, Y_3),$$
rewritten in terms of the variables $(Y, \theta, \phi, R, \gamma, X, Z)$. The quasienergy will be
$$E(\epsilon) = E_0 + \epsilon^{6/4} E_{6/4} + \epsilon^{8/4} E_{8/4} + \epsilon^{10/4} E_{10/4}, \qquad (4.1)$$
but the quasimode will be somewhat complicated. To specify the quasimode, we first let $\chi(Y, \theta, \phi, R, \gamma, X, Z)$ denote a normalized real ground state eigenfunction of $h(\epsilon, Y, \theta, \phi, R, \gamma, X)$ that depends continuously on its variables. Next, we let
$$\zeta(\epsilon, Y, \theta, \phi, R, \gamma, X) = \epsilon^{-5/4} \sum_{l=0}^{5} \epsilon^{l/4}\, \psi_{l/4}\!\left(\frac{X - X_0}{\epsilon^{3/4}},\, \frac{R}{\epsilon^{1/2}},\, \frac{Y - Y_0}{\epsilon^{3/4}},\, \theta, \phi, \gamma\right),$$
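where the $\psi_{l/4}$ are the wave functions from Sec. 3 with $g_{2/4}(y, \pm\lambda) = g_{3/4}(y, \pm\lambda) = f_{4/4}(r, y, k) = f_{5/4}(r, y, k) = 0$. Note that when $\lambda = 0$ there is one linearly independent choice for $\zeta$. When $\lambda > 0$, we have two linearly independent choices corresponding to $k = \pm\lambda$. The quasimode is
$$\Psi(\epsilon, Y, \theta, \phi, R, \gamma, X, Z) = F(X, R, Y)\, \zeta(\epsilon, Y, \theta, \phi, R, \gamma, X)\, \chi(Y, \theta, \phi, R, \gamma, X, Z) + \frac{\epsilon^3}{2\mu_1}\, F(X, R, Y)\, [h(\epsilon, Y, \theta, \phi, R, \gamma, X) - V(\epsilon, X, R, Y)]_r^{-1} \left( \frac{\partial \zeta}{\partial X}(\epsilon, Y, \theta, \phi, R, \gamma, X)\, \frac{\partial \chi}{\partial X}(Y, \theta, \phi, R, \gamma, X, Z) + \frac{\partial \zeta}{\partial R}(\epsilon, Y, \theta, \phi, R, \gamma, X)\, \frac{\partial \chi}{\partial R}(Y, \theta, \phi, R, \gamma, X, Z) \right). \qquad (4.2)$$
Theorem 4.1. There exists a constant $C$, such that the function $\Psi(\epsilon)$ given by (4.2) and quasienergy $E(\epsilon)$ given by (4.1) satisfy $\|\Psi(\epsilon)\| = 1 + O(\epsilon^{1/2})$ and
$$\|(H(\epsilon) - E(\epsilon))\Psi(\epsilon, \cdot)\| \le C \epsilon^3. \qquad (4.3)$$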
Proof. The function $\Psi(\epsilon, \cdot)$ equals the normalized vector $\psi_0 \chi$ plus terms that are orthogonal to $\psi_0 \chi$. Since the largest of these orthogonal terms is $\epsilon^{1/4} \psi_{1/4} \chi$, we see that $\Psi(\epsilon)$ has norm $1 + O(\epsilon^{1/2})$. To prove the second estimate of the theorem, we begin by noting that the electronic eigenfunction $\chi$ has the form $\chi(Y, \theta, \phi, R, \gamma, X, Z) = U(\theta, \phi, \gamma)\, \chi_0(Y, R, X, Z)$, where $U(\theta, \phi, \gamma)$ is unitary. We next compute
$$(H(\epsilon) - E(\epsilon))\, F(X, R, Y)\, \zeta(\epsilon, \cdot)\, \chi(\cdot), \qquad (4.4)$$
where $H(\epsilon)$ is decomposed as
$$H(\epsilon) = -\frac{\epsilon^3}{2\mu_1(\epsilon)} \Delta_{(X_1, X_2, X_3)} - \frac{\epsilon^4}{2\mu_2(\epsilon)} \Delta_{(Y_1, Y_2, Y_3)} + [h(\epsilon, X_1, X_2, X_3, Y_1, Y_2, Y_3) - V_1(X) - \epsilon V_2(X, R, Y)] + V_1(X) + \epsilon V_2(X, R, Y),$$
with the two final terms expanded in their Taylor series of appropriate orders. We write the resulting expression in the variables $(Y, \theta, \phi, R, \gamma, X, Z)$. When all the
derivatives in $H(\epsilon)$ act on $\zeta$, all terms that are larger than order $\epsilon^3$ cancel because of Taylor series estimates and the choices of the $\psi_{l/4}$. When all the derivatives act on $\chi$, all terms are $O(\epsilon^3)$ or smaller because $\chi$ is smooth and the cutoffs are zero near the singularity at $Y = 0$. When any derivatives act on $F$, we obtain terms of order $O(\epsilon^q)$, for any $q$, due to the rapid falloff of the functions in $\zeta$. The term that arises from $[h(\epsilon) - V_1 - \epsilon V_2]$ yields zero because it acts only on $\chi$. The remaining terms in (4.4) contain terms in which a partial derivative acts on $\zeta$ and the same partial derivative acts on $\chi$. All of these terms are $O(\epsilon^3)$ or smaller, except for
$$\frac{\partial \zeta}{\partial X}(\epsilon, Y, \theta, \phi, R, \gamma, X)\, \frac{\partial \chi}{\partial X}(Y, \theta, \phi, R, \gamma, X, Z) + \frac{\partial \zeta}{\partial R}(\epsilon, Y, \theta, \phi, R, \gamma, X)\, \frac{\partial \chi}{\partial R}(Y, \theta, \phi, R, \gamma, X, Z). \qquad (4.5)$$
Thus, (4.4) yields (4.5) plus $O(\epsilon^3)$. However, when $[h(\epsilon) - V_1 - \epsilon V_2]$ acts on the second term in (4.2), the terms that arise from (4.5) cancel, leaving us with $O(\epsilon^3)$ errors plus the kinetic energy and potential terms acting on the second term in (4.2). Because of the cutoff, the potential terms yield bounded operators times $O(\epsilon^3)$ terms. When the kinetic energy acts on these terms, we obtain terms of order $\epsilon^{9/2}$ or smaller, since everything is smooth, and the largest terms come from $\epsilon^6$ and two $X$-derivatives acting on $\zeta$. Note that when computing the norm in (4.3), it is essential that $\chi$ be orthogonal to $\partial\chi/\partial X$ and $\partial\chi/\partial R$, or cross terms would yield terms of order greater than $\epsilon^3$. This orthogonality is guaranteed by our hypothesis that the electron Hamiltonian $h(\epsilon, \cdot)$ be real symmetric and that we choose $\chi$ to be real.
Acknowledgment

The first author was partially supported by National Science Foundation Grant DMS-0600944.

References

[1] A. R. Edmonds, Angular Momentum in Quantum Mechanics (Princeton University Press, 1974).
[2] J. C. Evans and G. Y.-S. Lo, Vibrational spectra of hydrogen dihalide ions, J. Phys. Chem. 70 (1966) 543–545.
[3] M. J. Frisch, G. W. Trucks, H. B. Schlegel, G. E. Scuseria, M. A. Robb, J. R. Cheeseman, J. A. Montgomery, Jr., T. Vreven, K. N. Kudin, J. C. Burant, J. M. Millam, S. S. Iyengar, J. Tomasi, V. Barone, B. Mennucci, M. Cossi, G. Scalmani, N. Rega, G. A. Petersson, H. Nakatsuji, M. Hada, M. Ehara, K. Toyota, R. Fukuda, J. Hasegawa, M. Ishida, T. Nakajima, Y. Honda, O. Kitao, H. Nakai, M. Klene, X. Li, J. E. Knox, H. P. Hratchian, J. B. Cross, V. Bakken, C. Adamo, J. Jaramillo, R. Gomperts, R. E. Stratmann, O. Yazyev, A. J. Austin, R. Cammi, C. Pomelli, J. W. Ochterski, P. Y. Ayala, K. Morokuma, G. A. Voth, P. Salvador, J. J. Dannenberg, V. G. Zakrzewski, S. Dapprich, A. D. Daniels, M. C. Strain, O. Farkas, D. K. Malick, A. D. Rabuck, K. Raghavachari, J. B. Foresman, J. V. Ortiz, Q. Cui, A. G. Baboul, S. Clifford, J. Cioslowski, B. B. Stefanov, G. Liu, A. Liashenko, P. Piskorz, I. Komaromi, R. L. Martin, D. J. Fox, T. Keith, M. A. Al-Laham, C. Y. Peng, A. Nanayakkara, M. Challacombe, P. M. W. Gill, B. Johnson, W. Chen, M. W. Wong, C. Gonzalez and J. A. Pople, Gaussian 03, Revision C.02 (Gaussian, Inc., Wallingford CT, 2004).
[4] G. A. Hagedorn, High order corrections to the time-independent Born–Oppenheimer expansion II: Diatomic Coulomb systems, Commun. Math. Phys. 116 (1988) 23–44.
[5] G. A. Hagedorn and A. Joye, A mathematical theory for vibrational levels associated with hydrogen bonds I: The symmetric case, Commun. Math. Phys. 274 (2007) 691–715.
[6] M. S. Herman, Born–Oppenheimer corrections near a Renner–Teller crossing, PhD dissertation, Virginia Tech (July 2008).
[7] G. Herzberg, l-type doubling in linear polyatomic molecules, Rev. Mod. Phys. 14 (1942) 219–223.
[8] R. Renner, On the theory of the interaction between electronic and nuclear motion for three-atomic, bar-shaped molecules, Z. Phys. 92 (1934) 172–183.
[9] J. R. Roscioli, L. R. McCunn and M. A. Johnson, Quantum structure of the intermolecular proton bond, Science 316 (2007) 249–254.
[10] D. R. Yarkony, Diabolical conical intersections, Rev. Mod. Phys. 68 (1996) 985–1013.
April 2, 2009 10:19 WSPC/148-RMP
J070-00363
Reviews in Mathematical Physics Vol. 21, No. 3 (2009) 315–371 © World Scientific Publishing Company
QUANTIZATION OF SINGULAR REDUCTION
L. BATES, R. CUSHMAN, M. HAMILTON and J. ŚNIATYCKI∗
Department of Mathematics and Statistics, University of Calgary, Calgary, Alberta, T2N 1N4 Canada
∗[email protected]
Received 29 April 2008
Revised 13 December 2008
This paper creates a theory of quantization of singularly reduced systems. We compare our results with those obtained by quantizing algebraically reduced systems. In the case of a Kähler polarization, we show that quantization of a singularly reduced system commutes with reduction, thus generalizing results of Sternberg and Guillemin. We illustrate our theory by treating an example of Arms, Gotay and Jennings where algebraic and singular reduction at the zero level of the momentum mapping differ. In spite of this, their quantizations agree.
Keywords: Singular reduction; algebraic reduction; geometric quantization; decomposition of quantization representation; differential space.
Mathematics Subject Classification 2000: 53D50, 53D20, 58A40
1. Introduction

The problem of commutativity of quantization and reduction appears in physics in the context of quantization of theories with constraints such as electrodynamics, general relativity and Yang–Mills theory. In these examples, one deals with a problem having a Hamiltonian action of an infinite-dimensional gauge Lie group on an infinite-dimensional weakly symplectic manifold of Cauchy data of the theory. The space of physical degrees of freedom is the space of orbits of the gauge group contained in the zero level of an equivariant momentum map for the gauge group action. One can quantize the original space of Cauchy data and postulate that physically admissible states satisfy the quantized constraint conditions. Another approach is to perform classical reduction first and then quantize the obtained reduced space. Equivalence of these two approaches was first investigated in the context of field theory by Dirac [10]. A precise formulation of the problem of commutativity of quantization and reduction in the context of group representations was given by Guillemin and Sternberg [14]. In their work, “quantization” means geometric quantization used as a technique to construct a unitary representation of a compact connected Lie
group G from the action of G on a compact symplectic manifold (P, ω). They use “reduction” to mean regular reduction in the sense of Meyer [25] and Marsden and Weinstein [22]. Regular reduction describes the symplectic structure of the orbit space P/G under the assumption that the action of G on P is free and proper and admits a co-adjoint equivariant momentum map $J : P \to \mathfrak{g}^*$. For a non-zero co-adjoint orbit $O \subseteq \mathfrak{g}^*$, the product manifold $P \times O$ has a symplectic structure $\omega_{P \times O} = \omega \oplus (-\omega_O)$. A Hamiltonian action of G on P gives rise to a Hamiltonian action
$$\Phi_{P \times O} : G \times (P \times O) \to P \times O : (g, (p, \mu)) \mapsto (gp, \mathrm{Ad}^*_{g^{-1}} \mu)$$
of G with a momentum map
$$J_{P \times O} : P \times O \to \mathfrak{g}^* : (p, \mu) \mapsto J(p) - \mu. \qquad (1)$$
From (1), it follows that $J_{P \times O}^{-1}(0)$ is the graph of J restricted to $J^{-1}(O)$. For a free and proper action, $J^{-1}(O)/G$ and $J_{P \times O}^{-1}(0)/G$ are symplectomorphic [15]. The identification of $J^{-1}(O)/G$ and $J_{P \times O}^{-1}(0)/G$ is called the “shifting trick”. Results of [14] state that, under some additional technical assumptions, the decomposition
$$R = \bigoplus_i m_i R_i \qquad (2)$$
of the unitary representation R of G (obtained by geometric quantization of (P, ω)) into irreducible unitary representations Ri , corresponding to quantizable co-adjoint orbits Oi ⊆ g∗ , has the following properties: (i) the multiplicity mi = 0 unless the corresponding co-adjoint orbit is in the range of the momentum map J : P → g∗ and (ii) if mi > 0, then it is the dimension of the space of sections of a holomorphic line bundle over J −1 (Oi )/G obtained from the prequantization line bundle over P . The problem of generalizing the results of Guillemin and Sternberg to non-free actions of a compact Lie group on a compact symplectic manifold has been studied by several authors, see [27, 24, 19, 13, 12] and references quoted there. In this situation, J −1 (0)/G is a stratified space, see Lerman and Sjamaar [28]. For a non-zero co-adjoint orbit O ⊆ g∗ , the structure of J −1 (O)/G was studied in [7]. In either case, in order to prove that “quantization commutes with reduction”, one has to generalize geometric quantization to singular spaces. In [24], Meinrenken and Sjamaar used the technique of a partial resolution of singularities of J −1 (0)/G. They decomposed the quantization representation into irreducibles. But they defined “quantization” in a different way, namely as an equivariant spin-C index, i.e. a virtual representation, which may have negative dimension. Their results hold for an arbitrary compact symplectic manifold which is G-equivariantly prequantizable, where G is a compact Lie group. Using index theory allows one to avoid the introduction of a polarization. The problems encountered in generalizing theorems that state “quantization commutes with reduction” to a non-free action stem from the fact that regular
reduction is not a convenient way of describing the structure of the orbit space in the presence of singularities. There are two approaches to reduction which automatically take care of singularities of the orbit space, namely singular reduction and algebraic reduction, both of which were introduced in 1983, see [8, 34]. On the one hand, in singular reduction the space P/G of G-orbits in P is treated as a differential space with its ring of smooth functions being isomorphic to the ring C ∞ (P )G of G-invariant smooth functions on P . The superscript G denotes the subspace of G-invariant elements. Singular reduction enables one to use differential geometric techniques to study singular spaces. In particular, it allows a complete description of the structure of the orbit space P/G when the action of G on P is proper, see [9,30]. Singular reduction associates to J −1 (0)/G the Poisson algebra C ∞ (P )G /I G , where I G is the ideal of G-invariant smooth functions on P that vanish on J −1 (0). On the other hand, algebraic reduction of J −1 (0)/G gives rise to a Poisson algebra (C ∞ (P )/J )G , where J is the ideal in C ∞ (P ) generated by components of the momentum map. On the one hand, algebraic reduction does not require properness of the action of G on P . On the other hand, it leads to a Poisson algebra which need not be an algebra of functions even if G is compact, see Arms, Gotay and Jennings [4]. Algebraic reduction was extended to non-zero co-adjoint orbits by Wilbour [35], Kimura [20] and Arms [2]. In [2], Arms showed that the “shifting trick” applies to algebraic reduction. In other words, algebraic reduction of J −1 (O) gives a Poisson algebra, which is isomorphic to the Poisson algebra given by algebraic reduction of JP−1 ×O (0). Geometric quantization of algebraic reduction at quantizable co-adjoint orbits in terms of the quantization structure of the ambient symplectic manifold was given in [32]. 
Using the “shifting trick” of Guillemin and Sternberg [16], one obtains a generalization of decomposition (2) without assuming the compactness of the symplectic manifold (P, ω) or the Lie group G [33]. The aim of this paper is to apply the technique of [32] to obtain an analogous quantization of singularly reduced Poisson algebras, and to investigate whether quantization commutes with singular reduction. On the classical level, we establish the validity of the “shifting trick” for singular reduction. In other words, we show that, if the action of G on P is proper, then $J^{-1}(O)$ and $J_{P \times O}^{-1}(0)$ are diffeomorphic Poisson differential spaces. We show that quantization of singular reduction of $J^{-1}(0)/G$, following the algebraic scheme developed in [32], encounters obstructions related to the existence of elements of $(C^\infty(P)/\mathcal{J})^G$ which correspond to functions that vanish identically on $J^{-1}(0)$. We give conditions under which these obstructions vanish. When these conditions hold, we get a quantization of the subalgebra of $C^\infty(P)^G/I^G$ consisting of elements that preserve the polarization. In general, we do not know if the quantization obtained by singular reduction is equivalent to the quantization obtained by algebraic reduction. Our main result is Theorem 3.10, which states that geometric quantization in terms of a Kähler polarization commutes with singular reduction. Moreover, the generalization of decomposition (2) obtained by quantization of singular reduction
is the same as that obtained by quantization of algebraic reduction. Let $H$ denote the representation space of the representation $R$ obtained by geometric quantization of the action of G on (P, ω) and let $H_i$ be the representation space of an irreducible unitary representation $R_i$ of G corresponding to a quantizable co-adjoint orbit $O_i$. Since neither P nor G is assumed to be compact, $H$ and $H_i$ are Hilbert spaces which need not be finite-dimensional. Under the assumptions of Theorem 3.10, geometric quantization of singular reduction at a co-adjoint orbit $O_i$ gives rise to a projection operator $\Pi_i$ on $H$ such that range $\Pi_i$ is the largest closed G-invariant subspace of $H$ on which $R$ is equivalent to a multiple (possibly infinite) of the irreducible unitary representation $R_i$ of G. In this way, we get a discrete part of the decomposition of the quantization representation $R$:
$$H = \Big(\bigoplus_i \operatorname{range} \Pi_i\Big) \oplus \widetilde{H}, \qquad (3)$$
where $\widetilde{H}$ is the orthogonal complement of $\big(\bigoplus_i \operatorname{range} \Pi_i\big)$ in $H$. $\widetilde{H}$ may contain subspaces of $H$ on which $R$ is equivalent to irreducible unitary representations that are not given by quantization of appropriate co-adjoint orbits. Also, $\widetilde{H}$ may contain the subspace of $H$ corresponding to the continuous part of the spectral measure in the decomposition of $H$. The next step in the study of “commutation of quantization and reduction” is to try to describe the continuous part of the spectral measure in terms of geometric quantization of corresponding co-adjoint orbits. This is an open problem.
We illustrate our results with an analysis of the representation of SU(2) obtained by quantization of the lift to the cotangent bundle $T^*\mathbb{C}^2$ of the canonical linear action of SU(2) on $\mathbb{C}^2$. This example has been studied by Arms, Gotay and Jennings in [4]. It is the simplest case where algebraic reduction at zero differs from singular reduction at zero. Quantization before reduction gives rise to a representation space $H$ consisting of holomorphic functions on $\mathbb{C}^4$ that are square integrable with an exponentially decaying weight function. The space $H_0$ of elements of $H$ which are invariant under the representation consists of analytic functions of $z^2 = z_1^2 + z_2^2 + z_3^2 + z_4^2$. For each quantizable co-adjoint orbit $O_n \subseteq \mathfrak{su}(2)^*$, the subspace $H_n$ of $H$, on which the quantization representation $R$ is unitarily equivalent to the representation $R_{O_n}$ corresponding to $O_n$, consists of functions of the form $\Psi_0(z^2)\, p_n$, where $\Psi_0(z^2) \in H_0$ and $p_n$ is a polynomial on $\mathbb{C}^2$, which belongs to the representation space of the irreducible unitary representation $R_{O_n}$ of su(2). The Hilbert space of the quantization representation decomposes as $H = \bigoplus_{n=0}^{\infty} H_n$. Since all multiplicities appearing here are infinite, this result cannot be obtained using the approach of Guillemin and Sternberg [14] or Meinrenken and Sjamaar [24].
Even though algebraic and singular reduction at zero give rise to different Poisson algebras, their quantization leads to the same Hilbert space H0 . Moreover, they give the same quantum operators on H0 corresponding to quantizable invariant functions. The action of SU(2) on T (C2 \{0}) is free and proper so that algebraic reduction, singular reduction and regular reduction at non-zero co-adjoint orbits
coincide. The quantization of reduction at quantizable non-zero co-adjoint orbits in terms of the “shifting trick” of Guillemin and Sternberg [16] used here corresponds to the “covariance” approach in representation theory. In order to relate geometric quantization of quantizable co-adjoint orbits of SU(2) to the corresponding standard irreducible representations of SU(2), we realize prequantization line bundles as associated bundles of a principal complex line bundle. Even though the computations leading to irreducible representations of SU(2) in terms of homogeneous polynomials are standard, we give them in two appendices for completeness.
Here we have restricted our attention to Kähler polarizations because they lead directly to unitary representations. All other polarizations require additional structures, for example a bundle of half-densities or half-forms, in order to provide a scalar product in the space of sections covariantly constant along a polarization. When an additional structure is needed, one has to take it into account in order to be able to verify that “quantization commutes with reduction”. For example, in the case of a real polarization and a free action of a non-unimodular Lie group, one has to consider a correction given in terms of the trace of the adjoint representation, see Duval et al. [11]. Under the assumption that G is compact and $\mathcal{J}$ is a radical ideal, results of this paper should be comparable to some results of Huebschmann [19], in which a quantization of a stratified symplectic space was constructed. It would be very interesting to understand the relationship between Huebschmann’s results in [19] and our results here.

2. Reduction

2.1. Hamiltonian action

We consider a symplectic manifold (P, ω). For each $f \in C^\infty(P)$, the Hamiltonian vector field of $f$ is the unique vector field $X_f$ on P such that
$$X_f \,\lrcorner\, \omega = -df, \qquad (4)$$
where $\lrcorner$ denotes the left interior product (contraction). For each $f_1, f_2 \in C^\infty(P)$, the Poisson bracket of $f_1$, $f_2$ is given by
$$\{f_1, f_2\} = -X_{f_1} f_2 = -\omega(X_{f_1}, X_{f_2}). \qquad (5)$$
The Poisson bracket (5) is bilinear, antisymmetric, acts as a derivation
$$\{f_1, f_2 f_3\} = f_2 \{f_1, f_3\} + f_3 \{f_1, f_2\}, \qquad (6)$$
and satisfies the Jacobi identity
$$\{\{f_1, f_2\}, f_3\} + \{\{f_2, f_3\}, f_1\} + \{\{f_3, f_1\}, f_2\} = 0. \qquad (7)$$
The associative algebra $C^\infty(P)$ endowed with the Poisson bracket (5) is called the Poisson algebra of (P, ω).
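The derivation property (6) and the Jacobi identity (7) can be checked exactly on a small example. The sketch below is illustrative only and makes assumptions not in the text: it takes P = R² with coordinates (q, p), works with polynomial functions stored as dictionaries, and uses the canonical bracket {f, g} = f_q g_p − f_p g_q. The overall sign of the bracket depends on the convention chosen for ω, but (6) and (7) are insensitive to it. All function names are ours.

```python
from collections import defaultdict
from itertools import product

# A polynomial in (q, p) is a dict {(i, j): coeff}, meaning sum coeff * q^i * p^j.


def clean(poly):
    """Drop zero coefficients so that equal polynomials compare equal."""
    return {k: c for k, c in poly.items() if c != 0}


def add(f, g, sign=1):
    """Return f + sign * g."""
    out = defaultdict(int)
    for k, c in f.items():
        out[k] += c
    for k, c in g.items():
        out[k] += sign * c
    return clean(out)


def mul(f, g):
    """Return the product f * g."""
    out = defaultdict(int)
    for ((i, j), a), ((k, l), b) in product(f.items(), g.items()):
        out[(i + k, j + l)] += a * b
    return clean(out)


def d_q(f):
    """Partial derivative with respect to q."""
    return clean({(i - 1, j): i * c for (i, j), c in f.items() if i > 0})


def d_p(f):
    """Partial derivative with respect to p."""
    return clean({(i, j - 1): j * c for (i, j), c in f.items() if j > 0})


def bracket(f, g):
    """Canonical bracket {f, g} = f_q g_p - f_p g_q (assumed sign convention)."""
    return add(mul(d_q(f), d_p(g)), mul(d_p(f), d_q(g)), sign=-1)


f1 = {(2, 0): 1, (0, 1): 3}    # q^2 + 3p
f2 = {(1, 1): 1}               # q p
f3 = {(0, 2): 1, (1, 0): -2}   # p^2 - 2q

# Derivation property (6): {f1, f2 f3} = f2 {f1, f3} + f3 {f1, f2}.
leibniz_lhs = bracket(f1, mul(f2, f3))
leibniz_rhs = add(mul(f2, bracket(f1, f3)), mul(f3, bracket(f1, f2)))

# Jacobi identity (7): the cyclic sum of double brackets vanishes.
jacobi = add(add(bracket(bracket(f1, f2), f3),
                 bracket(bracket(f2, f3), f1)),
             bracket(bracket(f3, f1), f2))

print(leibniz_lhs == leibniz_rhs, jacobi == {})
```

Integer coefficients keep the arithmetic exact, so the two comparisons test the identities themselves rather than a floating-point approximation of them.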
Let G be a connected Lie group, and let
$$\Phi : G \times P \to P : (g, p) \mapsto \Phi_g(p) = gp \qquad (8)$$
be an action of G on P. We assume that the action Φ is symplectic, that is, $\Phi_g^* \omega = \omega$, and that it has an $\mathrm{Ad}^*$-equivariant momentum map $J : P \to \mathfrak{g}^*$, where $\mathfrak{g}^*$ is the dual of the Lie algebra $\mathfrak{g}$ of G. For each $\xi \in \mathfrak{g}$, the action on P of the one-parameter subgroup $\exp t\xi$ of G is given by translations along the integral curves of $X_{J_\xi}$. Here $J_\xi = \langle J \mid \xi \rangle$ is the momentum corresponding to ξ. We say that an action Φ is free if $gp = p$ implies that $g$ is the identity element of G. Also, Φ is said to be proper if, for every convergent sequence $\{p_n\}$ in P and every sequence $\{g_n\}$ in G such that the sequence $\{g_n p_n\}$ is convergent, there is a subsequence $\{g_{n_k}\}$ which converges such that
$$\lim_{k \to \infty} (g_{n_k} p_{n_k}) = \Big(\lim_{k \to \infty} g_{n_k}\Big) \Big(\lim_{k \to \infty} p_{n_k}\Big). \qquad (9)$$
2.2. Regular reduction

If the action Φ of G on P is free and proper, then the space P/G of G-orbits on P is a manifold and the orbit map $\pi : P \to P/G$ is a locally trivial fibration. The action Φ of G on P induces on P the structure of a (left) principal G-bundle over P/G. The ring $C^\infty(P/G)$ of smooth functions on P/G is isomorphic to the ring $C^\infty(P)^G$ of smooth G-invariant functions on P. The isomorphism is given by the pull-back by the G-orbit map
$$\pi^* : C^\infty(P/G) \to C^\infty(P)^G : \check{f} \mapsto \pi^* \check{f} = \check{f} \circ \pi. \qquad (10)$$
Since the action Φ is symplectic, that is, it preserves ω, it also preserves the Poisson bracket. In other words, $\Phi_g^* \{f_1, f_2\} = \{\Phi_g^* f_1, \Phi_g^* f_2\}$ for every $g \in G$ and every $f_1, f_2 \in C^\infty(P)$. Hence, $C^\infty(P)^G$ is a Poisson subalgebra of $C^\infty(P)$. Using the isomorphism $\pi^*$ we can pull back the Poisson algebra structure from $C^\infty(P)^G$ to $C^\infty(P/G)$. In particular, for $\check{f}_1, \check{f}_2 \in C^\infty(P/G)$, their Poisson bracket $\{\check{f}_1, \check{f}_2\}$ satisfies
$$\pi^* \{\check{f}_1, \check{f}_2\} = \{\pi^* \check{f}_1, \pi^* \check{f}_2\}. \qquad (11)$$
Since the orbit space P/G is a Poisson manifold, it is foliated by symplectic manifolds, see Libermann and Marle [21]. For each $p \in P$, the symplectic leaf of P/G through $\pi(p)$ can be characterized as follows. For $\mu = J(p)$ let
$$G_\mu = \{g \in G \mid \mathrm{Ad}^*_g \mu = \mu\}$$
(12)
be the isotropy group of µ and let O = {Ad∗g µ ∈ g∗ | g ∈ G}
(13)
be the co-adjoint orbit through µ. The orbit space J −1 (O)/G is naturally identified with π(J −1 (O)). The set π(J −1 (O)) is the symplectic leaf of P/G through π(p).
In order to describe the symplectic structure of $\pi(J^{-1}(O))$ observe that, since the action of $G_\mu$ is free and proper, $J^{-1}(\mu)$ is a submanifold of P, and the $G_\mu$-orbit space $P_\mu = J^{-1}(\mu)/G_\mu$ is a manifold. Let $\pi_\mu : J^{-1}(\mu) \to P_\mu$ be the $G_\mu$-orbit map. The manifold $P_\mu$ inherits a symplectic form $\omega_\mu$ such that $\pi_\mu^* \omega_\mu$ coincides with the pull-back of ω to $J^{-1}(\mu)$ by the inclusion map, see Marsden and Weinstein [22]. There is a canonical bijection between $J^{-1}(\mu)/G_\mu$ and $\pi(J^{-1}(O))$ such that for each $p \in J^{-1}(\mu)$ the $G_\mu$-orbit through $p$ is mapped to the G-orbit through $p$. This bijection relates the symplectic structures of $P_\mu$ and $\pi(J^{-1}(O))$, see [9]. On the manifold $P \times O$ there is a symplectic form $\omega_{P \times O} = \mathrm{pr}_1^* \omega - \mathrm{pr}_2^* \omega_O$, where $\mathrm{pr}_i$ is the projection onto the $i$th factor of $P \times O$ and $\omega_O$ is the symplectic form of the co-adjoint orbit O. For $\mu \neq 0$, one can describe $\pi(J^{-1}(O))$ as the reduction at $0 \in \mathfrak{g}^*$ of the action of G on $(P \times O, \omega_{P \times O})$ given by
$$G \times (P \times O) \to (P \times O) : (g, (p, \nu)) \mapsto (gp, \mathrm{Ad}^*_{g^{-1}} \nu), \qquad (14)$$
which has a momentum map
$$J_{P \times O} : (P \times O) \to \mathfrak{g}^* : (p, \nu) \mapsto J(p) - \nu. \qquad (15)$$
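A concrete instance of (14) and (15) may help fix ideas. The following Python sketch is an illustration under assumptions of our own, not an example taken from the text: G = SO(3) acts on P = T*R³ ≅ R³ × R³ by (q, p) ↦ (gq, gp), the momentum map is taken to be the angular momentum J(q, p) = q × p (sign conventions vary), and so(3)* is identified with R³ so that the co-adjoint action becomes ordinary rotation. The code checks numerically that J is equivariant and that the zero level of the shifted momentum map J(p) − ν is preserved by the action (14). All function names are ours.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rot_z(t):
    """Rotation by angle t about the z-axis."""
    c, s = math.cos(t), math.sin(t)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))


def rot_x(t):
    """Rotation by angle t about the x-axis."""
    c, s = math.cos(t), math.sin(t)
    return ((1.0, 0.0, 0.0), (0.0, c, -s), (0.0, s, c))


def matmul(a, b):
    """Product of two 3x3 matrices."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
                 for i in range(3))


def act(g, v):
    """Apply the rotation matrix g to the vector v."""
    return tuple(sum(g[i][k] * v[k] for k in range(3)) for i in range(3))


def J(q, p):
    """Angular momentum, used here as the momentum map (an assumption)."""
    return cross(q, p)


def J_shifted(q, p, nu):
    """Momentum map (15) on P x O: (q, p, nu) -> J(q, p) - nu."""
    return tuple(a - b for a, b in zip(J(q, p), nu))


def close(u, v, tol=1e-12):
    return all(abs(a - b) < tol for a, b in zip(u, v))


g = matmul(rot_z(0.7), rot_x(-1.1))          # a generic rotation
q, p = (1.0, 2.0, -0.5), (0.3, -1.0, 2.0)
nu = J(q, p)                                  # so (q, p, nu) lies in the zero level of (15)

equivariant = close(J(act(g, q), act(g, p)), act(g, J(q, p)))
zero_level_preserved = close(J_shifted(act(g, q), act(g, p), act(g, nu)),
                             (0.0, 0.0, 0.0))
print(equivariant, zero_level_preserved)
```

Note that this action is not free (the stabilizer is non-trivial when q and p are parallel), so the example illustrates only the formulas (14) and (15), not the hypotheses of regular reduction.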
In other words, for a free and proper action, we have a symplectomorphism between a symplectic leaf $\pi(J^{-1}(O))$ of the Poisson manifold P/G and the symplectic manifold $J_{P \times O}^{-1}(0)/G$, see Guillemin and Sternberg [16].

2.3. Singular reduction

Singular reduction generalizes regular reduction to the situation where the Hamiltonian action of G is proper but not necessarily free. The orbit space P/G, endowed with the algebra $C^\infty(P/G)$, which is isomorphic to $C^\infty(P)^G$, is a differential space. It is stratified by orbit type. Each stratum is a Poisson manifold which is foliated by symplectic manifolds, see [9]. For each co-adjoint orbit O, the orbit space $J^{-1}(O)/G = \pi(J^{-1}(O)) \subset P/G$ is a differential subspace of P/G. Consider first $J^{-1}(0)/G$. Since the momentum map $J : P \to \mathfrak{g}^*$ is continuous, $J^{-1}(0)$ is a closed subset of P. Moreover, it is invariant under the action of G. Hence $\pi(J^{-1}(0)) = J^{-1}(0)/G$ is a closed differential subspace of P/G.
Fact 2.1. Let $G \times P \to P$ be a proper action of a Lie group G on a manifold P and let C be a G-invariant closed subset of P endowed with a differential structure $C^\infty(C)$ induced by the inclusion map $C \to P$. For each G-invariant function $f \in C^\infty(C)$ there exists a G-invariant extension $h$ in $C^\infty(P)$.
Proof. By definition, a function $f : C \to \mathbb{R}$ is in $C^\infty(C)$ if, for each $x \in C$, there exists a neighborhood $U_x$ of $x$ in P and a function $h_1 \in C^\infty(P)$ such that $f|_{C \cap U_x} = h_1|_{C \cap U_x}$. Since the action of G on P is proper, there exists a slice $S_x$ at $x$ for this action. Without loss of generality, we may assume that $S_x \subseteq U_x$. The intersection $C \cap S_x$ is closed in $S_x$, hence the restriction $f|_{C \cap S_x} = h_1|_{C \cap S_x}$
extends to a smooth function $h_2$ on $S_x$. Since the isotropy group $G_x$ is compact and preserves $S_x$, we may average $h_2$ over $G_x$, obtaining a $G_x$-invariant extension $h_3$ of $f|_{C \cap S_x}$ to $S_x$. The product $S_x \times O_x$, where $O_x = \{gx \mid g \in G\}$ is the orbit of G through $x$, is a G-invariant neighborhood of $x$ in P. Using the projection $S_x \times O_x \to S_x$, we pull back $h_3$ to a smooth G-invariant function $h_x$ on $S_x \times O_x$. Since C is closed, its complement $P \setminus C$ is open. The open sets $P \setminus C$ and $S_x \times O_x$, for $x \in C$, form a G-invariant covering of P. Using a locally finite subcovering and a subordinate G-invariant partition of unity we can extend $h_x$ to a globally defined smooth function $h$ on P.
Let
$$I = \{f \in C^\infty(P) \mid f|_{J^{-1}(0)} = 0\}. \qquad (16)$$
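The averaging step in the proof of Fact 2.1 can be illustrated in a toy setting. The sketch below is hypothetical and chosen by us for simplicity: the compact group is the cyclic group C4 acting on R² by quarter turns, the invariant closed set C is the union of the two coordinate axes, the invariant function on C is f = x² + y², and `h2` is a non-invariant extension of f. Averaging `h2` over the group gives an invariant function that still agrees with f on C. Integer sample points are used so that the arithmetic is exact.

```python
from fractions import Fraction


def rotate(v):
    """Generator of C4: rotation of the plane by a quarter turn."""
    x, y = v
    return (-y, x)


def orbit(v):
    """The four images of v under C4 (with repetition at the origin)."""
    pts = [v]
    for _ in range(3):
        pts.append(rotate(pts[-1]))
    return pts


def f_on_C(v):
    """The given invariant function, only used on C = {x * y = 0}."""
    x, y = v
    return x * x + y * y


def h2(v):
    """A non-invariant extension of f_on_C: the extra term vanishes on C."""
    x, y = v
    return x * x + y * y + x * y * (x + 3)


def average(h):
    """Average h over the group C4; the result is C4-invariant."""
    def h_avg(v):
        return Fraction(sum(h(w) for w in orbit(v)), 4)
    return h_avg


h3 = average(h2)

samples = [(1, 2), (-3, 5), (4, -1), (0, 0)]
points_on_C = [(2, 0), (0, -7), (-5, 0), (0, 0)]

invariant = all(h3(rotate(v)) == h3(v) for v in samples)
extends_f = all(h3(v) == f_on_C(v) for v in points_on_C)
print(h2((1, 2)) == h2(rotate((1, 2))), invariant, extends_f)
```

Averaging leaves the values on C unchanged precisely because f is already invariant there and C is mapped to itself by the group, which is the point used in the proof.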
Since J : P → g∗ is continuous, it follows that J −1 (0) is closed in P and every smooth function on J −1 (0) extends to a smooth function on P . Hence, we identify C ∞ (J −1 (0)) with C ∞ (P )/I. Similarly, since the action of G on P is proper, every G-invariant smooth function on J −1 (0) extends to a Ginvariant function on P . Hence, we can identify C ∞ (J −1 (0))G = (C ∞ (P )/I)G with C ∞ (P )G /(C ∞ (P )G ∩ I). On the other hand, the space C ∞ (J −1 (0))G can be identified with C ∞ (J −1 (0)/G). Hence, we have C ∞ (J −1 (0)/G) = C ∞ (J −1 (0))G = (C ∞ (P )/I)G = C ∞ (P )G /(C ∞ (P )G ∩ I) = C ∞ (P )G /I G ˇ = C ∞ (P/G)/I.
(17)
Here I G = {f ∈ C ∞ (P )G | f|J −1 (0) = 0}
(18)
and $\check{I} = \{\check{f} \in C^\infty(P/G) \mid \check{f}|_{J^{-1}(0)/G} = 0\}$.
(19)
Taking into account equation (10) we see that $\pi^*(\check{I}) = I^G$. For $\check{f} \in C^\infty(P/G)$, we denote by $[\check{f}]$ the equivalence class of $\check{f}$ modulo $\check{I}$.
Fact 2.2. $\check{I}$ is a Poisson ideal in $C^\infty(P/G)$. Hence the space $C^\infty(J^{-1}(0)/G)$ (which equals $C^\infty(P/G)/\check{I}$) inherits the structure of a Poisson algebra with bracket given by $\{[\check{f}_1], [\check{f}_2]\} = [\{\check{f}_1, \check{f}_2\}]$.
(20)
Proof. If fˇ ∈ C ∞ (P/G) then π ∗ fˇ ∈ C ∞ (P )G and Xπ∗ fˇJξ = −XJξ (π ∗ fˇ) = 0,
(21)
for every $\xi \in \mathfrak{g}$. Hence, $X_{\pi^*\check{f}}$ preserves the level sets of the momentum map $J$. In particular, $X_{\pi^*\check{f}}$ preserves the zero level set $J^{-1}(0)$. If $\check{h} \in \check{I}$, then $\pi^*\check{h}$ vanishes on $J^{-1}(0)$ and $X_{\pi^*\check{f}}\,\pi^*\check{h}$ also vanishes on $J^{-1}(0)$. Since $\pi^*\{\check{f}, \check{h}\} = \{\pi^*\check{f}, \pi^*\check{h}\} = -X_{\pi^*\check{f}}\,\pi^*\check{h}$,
(22)
it follows that $\pi^*\{\check{f}, \check{h}\}$ vanishes on $J^{-1}(0)$. Thus, the bracket $\{\check{f}, \check{h}\}$ vanishes on $\pi(J^{-1}(0)) = J^{-1}(0)/G$. Consequently, $\check{I}$ is a Poisson ideal.
Since $\check{I}$ is a Poisson ideal in $C^\infty(P/G)$, it follows that the quotient $C^\infty(P/G)/\check{I} = C^\infty(J^{-1}(0)/G)$ inherits the structure of a Poisson algebra with bracket given by $\{[\check{f}_1], [\check{f}_2]\} = [\{\check{f}_1, \check{f}_2\}]$. The Poisson algebra $C^\infty(J^{-1}(0)/G) = C^\infty(P/G)/\check{I}$ is the singularly reduced Poisson algebra at the zero level of $J$.
Note that the ideal $I$ (16) need not be a Poisson ideal. In Dirac's terminology, the constraints defining $J^{-1}(0)$ are first class if $I$ is a Poisson ideal. If $I$ is not a Poisson ideal, then there is a function $f_1$ in $I$ and a function $f_2$ in $C^\infty(P)$ such that their bracket $\{f_1, f_2\}$ is not in $I$. The function $f_2$ is called a second class constraint. One of the problems of Dirac's theory was the treatment of second class constraints, see Dirac [10]. Since we are dealing only with $G$-invariant functions, that is, our Poisson algebra is $C^\infty(J^{-1}(0)/G) = (C^\infty(P)/I)^G$, we avoid the problem of second class constraints, because $C^\infty(J^{-1}(0))^G = C^\infty(P/G)/\check{I}$ and the ideal $\check{I}$ is Poisson.
Because a co-adjoint orbit $O \neq \{0\}$ need not be closed, smooth functions on $J^{-1}(O)/G$ need not extend to functions in $C^\infty(P/G)$. As in the case of regular reduction, we will show below that reduction at $O \neq \{0\}$ and reduction of the action of $G$ on $P \times O$ at the zero value of the momentum map $J_{P\times O}$ (15) are equivalent. Since $J^{-1}(O)/G$ need not be a manifold, our argument is given in the framework of differential spaces.
Fact 2.3. Let $F : P \to Q$ be a smooth map between differential spaces. For every differential subspace $R \subseteq P$ the restriction $F_R : R \to Q$ of $F$ to $R$ is smooth. Moreover, if $S \subseteq Q$ is a differential subspace containing the range of $F$ then the map $F^S : P \to S : p \mapsto F(p)$ (restriction of the co-domain) is smooth.
Proof. Smoothness of the restriction of the domain of $F$ to $R$ is obvious.
In order to prove smoothness of $F^S$, we need to show that $f \circ F^S \in C^\infty(P)$ for each $f \in C^\infty(S)$. Suppose $f \in C^\infty(S)$. Then, for each $q \in S$, there is a neighborhood $U$ of $q$ in $Q$ and $h \in C^\infty(Q)$ such that $f|_{S \cap U} = h|_{S \cap U}$. Moreover, $V = F^{-1}(U)$ is open in $P$. For every $p \in V$, $F(p) \in U \cap S$ because the range of $F$ is a subset of $S$, and $f \circ F^S(p) = f \circ F(p) = h(F(p)) = (h \circ F)(p)$. Hence, $f \circ F^S|_V = h \circ F|_V$. Since $h \circ F \in C^\infty(P)$, it follows that $f \circ F^S \in C^\infty(P)$.
We now return to discussing singular reduction at a co-adjoint orbit $O \neq \{0\}$.
Theorem 2.4. Assume that the action of $G$ on $P$ is proper. There is a natural Poisson diffeomorphism $\check{F}$ between $J_{P\times O}^{-1}(0)/G$ and $J^{-1}(O)/G$ with the
Poisson algebra structure on $C^\infty(J^{-1}(O)/G)$ induced by the inclusion map $J^{-1}(O)/G \to P/G$.
Proof. The restriction of the domain of $F_1 : P \times O \to P : (p, \mu) \mapsto p$ to $J_{P\times O}^{-1}(0)$ gives a smooth map $F_2 : J_{P\times O}^{-1}(0) \to P$. If $J_{P\times O}(p, \mu) = 0$, then $J(p) = \mu$. Hence the range of $F_2$ is $J^{-1}(O)$. Restricting the co-domain of $F_2$ to $J^{-1}(O)$ we get a smooth map $F : J_{P\times O}^{-1}(0) \to J^{-1}(O)$.
Consider now a map $H_1 : P \to P \times \mathfrak{g}^* : p \mapsto (p, J(p))$. Restricting the domain of $H_1$ to $J^{-1}(O)$, we get a smooth map $H_2 : J^{-1}(O) \to P \times \mathfrak{g}^*$ with range $J_{P\times O}^{-1}(0)$. Hence, the restriction of the co-domain of $H_2$ to $J_{P\times O}^{-1}(0)$ yields a smooth map $H : J^{-1}(O) \to J_{P\times O}^{-1}(0)$.
For each $p \in J^{-1}(O)$, we have $F(H(p)) = F(p, J(p)) = p$. Similarly, $H(F(p, J(p))) = H(p) = (p, J(p))$ for each $(p, J(p)) \in J_{P\times O}^{-1}(0)$. Hence, $H = F^{-1}$, which implies that $J^{-1}(O)$ and $J_{P\times O}^{-1}(0)$ are diffeomorphic.
The maps $F_1, F_2, F$ and $H_1, H_2, H$ intertwine the actions of $G$. Hence, they pass to smooth maps of the corresponding $G$-orbit spaces, namely, $\check{F}_1, \check{F}_2, \check{F}$ and $\check{H}_1, \check{H}_2, \check{H}$. Since $\check{H} = \check{F}^{-1}$, it follows that $\check{F} : J_{P\times O}^{-1}(0)/G \to J^{-1}(O)/G$ is a diffeomorphism. Hence, $\check{F}^* : C^\infty(J^{-1}(O)/G) \to C^\infty(J_{P\times O}^{-1}(0)/G)$ is an isomorphism of associative algebras.
We need to show that $\check{F}^*$ is an isomorphism of Poisson algebras. The Poisson bracket on $C^\infty(J^{-1}(O)/G)$ is induced by the inclusion map $\iota_O : J^{-1}(O)/G \to P/G$ and the Poisson bracket on $C^\infty(J_{P\times O}^{-1}(0)/G)$ is given by Eq. (20). Since the symplectic form of $P \times O$ is $\omega_{P\times O} = \mathrm{pr}_1^*\omega - \mathrm{pr}_2^*\omega_O$, the map $F_1 : P \times O \to P : (p, \mu) \mapsto p$ is Poisson. Moreover, $F_1$ is $G$-equivariant. Hence, it induces a Poisson map $\check{F}_1 : (P \times O)/G \to P/G$. Thus, the mapping $\check{F}_1^* : C^\infty(P/G) \to C^\infty((P \times O)/G)$ is a Poisson algebra homomorphism. The inclusion map $\iota : J_{P\times O}^{-1}(0)/G \to (P \times O)/G$ is also a Poisson map. Therefore, the restriction of $\check{F}_1$ to $J_{P\times O}^{-1}(0)/G$, given by $\check{F}_2 = \check{F}_1 \circ \iota : J_{P\times O}^{-1}(0)/G \to P/G$, is a Poisson map, being a composition of Poisson maps.
In other words, Fˇ2∗ : C ∞ (P/G) → C ∞ (JP−1 ×O (0)/G) is a Poisson algebra homomorphism. We need the following: Fact 2.5. Let IˇO be the ideal in the associative algebra C ∞ (P/G) consisting of functions that vanish on J −1 (O)/G. Then IˇO is a Poisson ideal. Proof. If fˇ ∈ C ∞ (P/G) then π ∗ fˇ ∈ C ∞ (P )G and, for every ξ ∈ g, Xπ∗ fˇJξ = −XJξ π ∗ fˇ = 0.
(23)
Hence, $X_{\pi^*\check{f}}$ preserves the level sets of the momentum map $J$. In particular, $X_{\pi^*\check{f}}$ preserves the level set $J^{-1}(O)$. If $\check{h} \in \check{I}_O$, then $\pi^*\check{h}$ vanishes on $J^{-1}(O)$ and $X_{\pi^*\check{f}}\,\pi^*\check{h}$ also vanishes on $J^{-1}(O)$. Since $\pi^*\{\check{f}, \check{h}\} = \{\pi^*\check{f}, \pi^*\check{h}\} = -X_{\pi^*\check{f}}\,\pi^*\check{h}$,
(24)
it follows that $\pi^*\{\check{f}, \check{h}\}$ vanishes on $J^{-1}(O)$. Thus the bracket $\{\check{f}, \check{h}\}$ vanishes on $\pi(J^{-1}(O)) = J^{-1}(O)/G$. This implies that $\check{I}_O$ is a Poisson ideal.
We continue with the proof of Theorem 2.4. Since $\check{I}_O$ is a Poisson ideal in $C^\infty(P/G)$ it follows that the quotient $C^\infty(P/G)/\check{I}_O$ inherits the structure of a Poisson algebra with bracket given by
{ˇι∗O fˇ1 , ˇι∗O fˇ2 } = ˇι∗O {fˇ1 , fˇ2 }.
(25)
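The role of the Poisson ideal property in (25) can be made explicit with a short check. The following is only a sketch and not part of the original argument; the symbols $\check{k}_1, \check{k}_2$ are introduced here for illustration and denote arbitrary elements of $\check{I}_O$.

```latex
% Sketch: the bracket on C^\infty(P/G)/\check{I}_O does not depend on representatives.
% \check{k}_1, \check{k}_2 \in \check{I}_O are illustrative names, not taken from the source.
\begin{align*}
\{\check{f}_1 + \check{k}_1,\ \check{f}_2 + \check{k}_2\}
  &= \{\check{f}_1, \check{f}_2\}
   + \{\check{f}_1, \check{k}_2\}
   + \{\check{k}_1, \check{f}_2\}
   + \{\check{k}_1, \check{k}_2\}.
\end{align*}
% Each of the last three brackets has an entry in \check{I}_O, so by Fact 2.5
% (and antisymmetry of the bracket) it vanishes on J^{-1}(O)/G. Restricting to
% J^{-1}(O)/G along the inclusion \iota_O therefore gives
\begin{align*}
\iota_O^*\{\check{f}_1 + \check{k}_1,\ \check{f}_2 + \check{k}_2\}
  = \iota_O^*\{\check{f}_1, \check{f}_2\}.
\end{align*}
```

In this form it is visible that only the Poisson ideal property of $\check{I}_O$ is used, not any regularity of $J^{-1}(O)/G$.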
Next we show that $\check{F}_2^*$ vanishes on the ideal $\check{I}_O$. Let $\pi_{P\times O} : P \times O \to (P \times O)/G$ be the $G$-orbit map. For each $\check{f} \in \check{I}_O$, $\check{F}_2^*\check{f} \in C^\infty(J_{P\times O}^{-1}(0)/G)$. We know that, if $(p, J(p)) \in J_{P\times O}^{-1}(0)$ then $p \in J^{-1}(O)$. Hence, $\check{F}_2^*\check{f}(\pi_{P\times O}(p, J(p))) = \check{f}(\check{F}_2(\pi_{P\times O}(p, J(p)))) = \check{f}(\pi(F_2(p, J(p)))) = \pi^*\check{f}(F_2(p, J(p))) = \pi^*\check{f}(p) = \check{f}(\pi(p)) = 0$, because $\pi(p) \in J^{-1}(O)/G$ and $\check{f}$ vanishes on $J^{-1}(O)/G$.
Since $\check{I}_O$ is a Poisson ideal, it follows that $\check{F}_2^*$ induces a Poisson algebra homomorphism $\mu : C^\infty(P/G)/\check{I}_O \to C^\infty(J_{P\times O}^{-1}(0)/G)$. Recall that $C^\infty(P/G)/\check{I}_O$ is the subspace of $C^\infty(J^{-1}(O)/G)$ consisting of smooth functions on $J^{-1}(O)/G$ which extend to smooth functions on $P/G$. Hence, the Poisson homomorphism $\mu$ is the restriction of $\check{F}^* : C^\infty(J^{-1}(O)/G) \to C^\infty(J_{P\times O}^{-1}(0)/G)$ to $C^\infty(P/G)/\check{I}_O$.
In order to show that $\check{F}^*$ is a Poisson homomorphism, note that, for every $\check{f}_1, \check{f}_2 \in C^\infty(J^{-1}(O)/G)$ and $x \in J_{P\times O}^{-1}(0)/G$, the value at $x$ of the Poisson bracket $\{\check{F}^*\check{f}_1, \check{F}^*\check{f}_2\}$ depends only on the first order jets of $\check{F}^*\check{f}_1$ and $\check{F}^*\check{f}_2$ at $x$. On the other hand, there exist a neighborhood $U$ of $\check{F}(x)$ in $P/G$ and $\check{h}_1, \check{h}_2 \in C^\infty(P/G)$ such that $\check{h}_i|_{U \cap J^{-1}(O)/G} = \check{f}_i|_{U \cap J^{-1}(O)/G}$ for $i = 1, 2$, and
$\{\check{F}^*\check{h}_1|_{J^{-1}(O)/G}, \check{F}^*\check{h}_2|_{J^{-1}(O)/G}\}(x) = \check{F}^*\{\check{h}_1|_{J^{-1}(O)/G}, \check{h}_2|_{J^{-1}(O)/G}\}(x)$
(26)
because $\check{h}_i|_{J^{-1}(O)/G} \in C^\infty(P/G)/\check{I}_O$. Therefore, $\{\check{F}^*\check{f}_1, \check{F}^*\check{f}_2\}(x) = \{\check{F}^*\check{h}_1|_{J^{-1}(O)/G}, \check{F}^*\check{h}_2|_{J^{-1}(O)/G}\}(x) = \check{F}^*\{\check{h}_1|_{J^{-1}(O)/G}, \check{h}_2|_{J^{-1}(O)/G}\}(x) = \{\check{h}_1|_{J^{-1}(O)/G}, \check{h}_2|_{J^{-1}(O)/G}\}(\check{F}(x)) = \{\check{f}_1, \check{f}_2\}(\check{F}(x)) = \check{F}^*\{\check{f}_1, \check{f}_2\}(x)$. Hence, $\check{F}^*$ is a Poisson algebra homomorphism. Since $\check{F}$ is a diffeomorphism, it follows that $\check{F}^*$ is a Poisson algebra isomorphism.
3. Quantization
3.1. Geometric quantization
In this section, we give a brief review of geometric quantization of symplectic manifolds following [29].
3.1.1. Prequantization
Let $\lambda : L \to P$ be a prequantization complex line bundle. Let $\theta$ be the connection 1-form on the associated $\mathbb{C}^\times$ principal bundle $L^\times$ and let $\nabla$ be the corresponding covariant derivative on the space $\Gamma^\infty(L)$ of smooth sections of $L$. We identify $L^\times$ with the subset of $L$ consisting of all non-zero elements of $L$. Hence, a nowhere vanishing section of $L$ is considered to be a section of $L^\times$. A different identification is discussed in Appendix B. For each non-zero section $\sigma$ of $L$ and vector field $X$ on $P$, $\nabla_X\sigma = 2\pi i\,(X \,\lrcorner\, \sigma^*\theta)\,\sigma$.
(27)
We require that the connection $\nabla$ satisfies the prequantization condition $(\nabla_X\nabla_{X'} - \nabla_{X'}\nabla_X - \nabla_{[X,X']})\sigma = -(2\pi\hbar)^{-1}\, i\,\omega(X, X')\sigma$
(28)
for every section $\sigma$ of $L$ and every pair $X, X'$ of vector fields on $P$. Here $\hbar$ is Planck's constant divided by $2\pi$.$^{a}$ The prequantization condition can be satisfied if the de Rham cohomology class $[(2\pi\hbar)^{-1}\omega]$ on $P$ is integral. If this cohomology condition holds, then the symplectic manifold $(P, \omega)$ is said to be quantizable.
For each $f \in C^\infty(P)$, the Hamiltonian vector field $X_f$ of $f$ has a unique connection preserving lift to $L^\times$. This gives rise to a prequantization map $P : C^\infty(P) \times \Gamma^\infty(L) \to \Gamma^\infty(L) : (f, \sigma) \mapsto P_f\sigma = (-i\hbar\nabla_{X_f} + f)\sigma$.
(29)
For $f_1, f_2 \in C^\infty(P)$, $[P_{f_1}, P_{f_2}]\sigma = -i\hbar P_{\{f_1,f_2\}}\sigma$. Hence, the map $f \mapsto (-i\hbar)^{-1}P_f$ is a representation of the Poisson algebra $C^\infty(P)$ on the space $\Gamma^\infty(L)$ of sections of the bundle $\lambda : L \to P$, which we call a prequantization representation. For each $f \in C^\infty(P)$ such that the Hamiltonian vector field $X_f$ is complete, the operator $P_f$ is skew adjoint on the space of sections of $\lambda$ that are square integrable with respect to the scalar product
$(\sigma_1 \mid \sigma_2) = \int_P \langle\sigma_1 \mid \sigma_2\rangle\,\omega^n$,
(30)
where $n = \tfrac{1}{2}\dim P$. Here $\langle\sigma_1(p) \mid \sigma_2(p)\rangle$ is a Hermitian form on $L_p = \lambda^{-1}(p)$; we assume that it is invariant under parallel transport defined by the connection $\nabla$ on $L$.
Restricting the prequantization representation to the Poisson algebra spanned by the momenta $J_\xi$, for $\xi \in \mathfrak{g}$, we get a representation $\xi \mapsto (-i\hbar)^{-1}P_{J_\xi}$ of $\mathfrak{g}$ on $\Gamma^\infty(L)$. If the action of $G$ on $P$ lifts to a connection-preserving action of $G$ on $L$, this representation integrates to a representation $U : G \times \Gamma^\infty(L) \to \Gamma^\infty(L) : (g, \sigma) \mapsto U_g\sigma$ of $G$ on $\Gamma^\infty(L)$ such that, for each $g \in G$, $f \in C^\infty(P)$ and $\sigma \in \Gamma^\infty(L)$, $U_g(f\sigma) = (\Phi_{g^{-1}}^* f)\,U_g\sigma$. In general, the prequantization representation of $G$ is not irreducible. It is unitary on the Hilbert space of sections of $L$ that are square integrable with respect to the scalar product (30).
$^{a}$ In order to get formulae in the theory of representations of Lie groups, set $\hbar = i$.
3.1.2. Polarization
A polarization of a symplectic manifold $(P, \omega)$ is an involutive Lagrangian distribution $F \subset \mathbb{C} \otimes TP$ such that $D = F \cap \bar{F} \cap TP$ and $E = (F + \bar{F}) \cap TP$, where $\bar{F}$ denotes the complex conjugate of $F$, are involutive distributions on $P$. Let $C^\infty(P)^0_F$ be the space of smooth complex valued functions on $P$ that are constant along $F$, that is, $C^\infty(P)^0_F = \{f \in C^\infty(P) \otimes \mathbb{C} \mid uf = 0 \text{ for all } u \in F\}$. We assume that the polarization $F$ is strongly admissible, that is, $F$ is locally spanned by Hamiltonian vector fields of functions in $C^\infty(P)^0_F$. Let $C^\infty_F(P)$ denote the space of functions on $P$ whose Hamiltonian vector fields preserve $F$. In other words, $f \in C^\infty_F(P)$ if, for every $h \in C^\infty(P)^0_F$, the Poisson bracket $\{f, h\} \in C^\infty(P)^0_F$. If $f_1, f_2 \in C^\infty_F(P)$ and $h \in C^\infty(P)^0_F$ then the Jacobi identity implies that $\{\{f_1, f_2\}, h\} = -\{f_2, \{f_1, h\}\} + \{f_1, \{f_2, h\}\} \in C^\infty(P)^0_F$. Hence, for a strongly admissible polarization, the ring $C^\infty_F(P)$ is a Poisson subalgebra of $C^\infty(P)$.
Let $\Gamma^\infty_F(L)$ denote the space of smooth sections of $L$ that are covariantly constant along $F$, namely, $\Gamma^\infty_F(L) = \{\sigma \in \Gamma^\infty(L) \mid \nabla_u\sigma = 0 \text{ for all } u \in F\}$.
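For each $h \in C^\infty(P)^0_F$, $f \in C^\infty_F(P)$ and $\sigma \in \Gamma^\infty_F(L)$ we have $\nabla_{X_h}(P_f\sigma) = 0$. Thus, for every $f \in C^\infty_F(P)$, the operator $P_f$ maps $\Gamma^\infty_F(L)$ to itself. Restricting the map $(f, \sigma) \mapsto P_f\sigma$ to $C^\infty_F(P) \times \Gamma^\infty_F(L)$ we obtain the quantization map $Q : C^\infty_F(P) \times \Gamma^\infty_F(L) \to \Gamma^\infty_F(L) : (f, \sigma) \mapsto Q_f\sigma = P_f\sigma = (-i\hbar\nabla_{X_f} + f)\sigma$.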
Assume that the action $\Phi : G \times P \to P$ preserves the polarization $F$. Hence, for each $\xi \in \mathfrak{g}$, the momentum $J_\xi$ is in $C^\infty_F(P)$. Restricting the prequantization representation to the Poisson algebra spanned by $J_\xi$, for $\xi \in \mathfrak{g}$, we get a representation $\xi \mapsto (-i\hbar)^{-1}Q_{J_\xi}$ of $\mathfrak{g}$ on $\Gamma^\infty_F(L)$. This representation integrates to a representation $R : G \times \Gamma^\infty_F(L) \to \Gamma^\infty_F(L) : (g, \sigma) \mapsto R_g\sigma$ of $G$ on $\Gamma^\infty_F(L)$ such that, for each $g \in G$, $f \in C^\infty(P)^0_F$ and $\sigma \in \Gamma^\infty_F(L)$, $R_g(f\sigma) = (\Phi_{g^{-1}}^* f)\,R_g\sigma$.
Suppose that $F$ is a positive Kähler polarization of $(P, \omega)$. In other words, suppose that $P$ has the structure of a Kähler manifold. Then $F$ is the distribution of antiholomorphic directions, and $i\,\omega(u, \bar{u}) \ge 0$ for all $u \in F$. Also, $L$ is a holomorphic line bundle over $P$ and $\Gamma^\infty_F(L)$ is the space of holomorphic sections of $L$. The representation $R$ is unitary on the Hilbert space $H$ of holomorphic sections of $L$ that are square integrable with respect to the scalar product given by Eq. (30). Other types of polarization do not admit non-zero sections in $\Gamma^\infty_F(L)$ that are square integrable with respect to the scalar product (30). They have to be considered separately.
3.2. Quantization of reduced Poisson algebras
For a free and proper action of $G$ on $(P, \omega)$, algebraic reduction and singular reduction are equivalent to regular reduction, which leads to the Poisson algebra of the reduced symplectic manifold $(P_\mu, \omega_\mu)$, for $\mu \in \mathfrak{g}^*$. Moreover, quantization of a regularly reduced Poisson algebra corresponds to geometric quantization of the reduced symplectic manifold $(P_\mu, \omega_\mu)$. This has been the object of study of Guillemin and Sternberg [14] and others, see Huebschmann [17] and references quoted in Guillemin, Lerman and Sternberg [13].
3.2.1. Quantization of singular reduction at $0 \in \mathfrak{g}^*$
From Sec. 2.3, we know that $C^\infty(J^{-1}(0)/G) = C^\infty(J^{-1}(0))^G = (C^\infty(P)/I)^G = C^\infty(P)^G/I^G = C^\infty(P/G)/\check{I}$ is a Poisson algebra obtained by singular reduction at $0 \in \mathfrak{g}^*$. As in the case of geometric quantization of a symplectic manifold, we consider first a prequantization of $C^\infty(P)^G/I^G$, followed by its quantization.
Let $I\Gamma^\infty(L) = \{f_1\sigma_1 + \cdots + f_n\sigma_n \mid n \in \mathbb{N},\ f_1, \ldots, f_n \in I,\ \sigma_1, \ldots, \sigma_n \in \Gamma^\infty(L)\}$. The quotient $\Gamma^\infty(L)/I\Gamma^\infty(L)$ corresponds to localization of sections of $L$ at $J^{-1}(0)$. For each $\sigma \in \Gamma^\infty(L)$, let $[\sigma]$ be the class of $\sigma$ in $\Gamma^\infty(L)/I\Gamma^\infty(L)$, and for each $f \in C^\infty(P)$ let $[f]$ be the class of $f$ in $C^\infty(P)/I$.
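Since $I$ and $I\Gamma^\infty(L)$ are $G$-invariant, the prequantization representation $U$ of $G$ on $\Gamma^\infty(L)$ induces an action of $G$ on $\Gamma^\infty(L)/I\Gamma^\infty(L)$ given by $G \times (\Gamma^\infty(L)/I\Gamma^\infty(L)) \to \Gamma^\infty(L)/I\Gamma^\infty(L) : (g, [\sigma]) \mapsto [U_g\sigma]$. Let $(\Gamma^\infty(L)/I\Gamma^\infty(L))^G = \{[\sigma] \in \Gamma^\infty(L)/I\Gamma^\infty(L) \mid [U_g\sigma] = [\sigma] \text{ for all } g \in G\}$ be the space of $G$-invariant elements of $\Gamma^\infty(L)/I\Gamma^\infty(L)$. Since $G$ is connected, $(\Gamma^\infty(L)/I\Gamma^\infty(L))^G = \{[\sigma] \in \Gamma^\infty(L)/I\Gamma^\infty(L) \mid [P_{J_\xi}\sigma] = 0 \ \forall \xi \in \mathfrak{g}\}$.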
Theorem 3.1. The space $(\Gamma^\infty(L)/I\Gamma^\infty(L))^G$ is a module over the ring $C^\infty(P)^G/I^G$. Moreover, if the system satisfies the singular prequantization condition, namely $P_k\tau \in I\Gamma^\infty(L)$ for all $k \in I^G$ and all $\tau \in \Gamma^\infty(L)$ such that $P_{J_\xi}\tau \in I\Gamma^\infty(L)$ for all $\xi \in \mathfrak{g}$,
(31)
then the singularly reduced prequantization map
$(C^\infty(P)^G/I^G) \times (\Gamma^\infty(L)/I\Gamma^\infty(L))^G \to (\Gamma^\infty(L)/I\Gamma^\infty(L))^G : ([f], [\sigma]) \mapsto P_{[f]}[\sigma] = [P_f\sigma]$
(32)
is well-defined.
Proof. To show that the space $(\Gamma^\infty(L)/I\Gamma^\infty(L))^G$ is a module over the ring $C^\infty(P)^G/I^G$, we need independence of representatives of the classes in the various quotients, as well as closure under multiplication. For each $k \in I^G$ and $\sigma \in \Gamma^\infty(L)$, we have $k\sigma \in I\Gamma^\infty(L)$. Similarly, $f\sigma \in I\Gamma^\infty(L)$ for each $f \in C^\infty(P)^G$ and $\sigma \in I\Gamma^\infty(L)$. Hence $[f][\sigma] = [f\sigma] \in \Gamma^\infty(L)/I\Gamma^\infty(L)$ is independent of the representatives $f$ of $[f]$ and $\sigma$ of $[\sigma]$. Moreover, $[f\sigma]$ is $G$-invariant if $P_{J_\xi}(f\sigma) \in I\Gamma^\infty(L)$ for each $\xi \in \mathfrak{g}$. However, $f$ and $[\sigma]$ satisfy $P_{J_\xi}(f\sigma) = -i\hbar\,(X_{J_\xi}f)\sigma + fP_{J_\xi}\sigma$ for each $\xi \in \mathfrak{g}$. The first term on the right-hand side is zero since $f$ is $G$-invariant, and the second term is in $I\Gamma^\infty(L)$ since $[\sigma]$ is $G$-invariant, and so $P_{J_\xi}\sigma \in I\Gamma^\infty(L)$. Thus $(\Gamma^\infty(L)/I\Gamma^\infty(L))^G$ is a module over $C^\infty(P)^G/I^G$.
Now, suppose that the system satisfies the singular prequantization condition. To show that the reduced prequantization map is well-defined, we need to show that $[P_f\sigma]$ is independent of the choice of representatives $f$ of $[f]$ and $\sigma$ of $[\sigma]$. In addition, we need to show that $P_f$ maps $G$-invariant classes to $G$-invariant classes.
The singular prequantization condition implies that, for $[f] \in C^\infty(P)^G/I^G$ and $[\sigma] \in (\Gamma^\infty(L)/I\Gamma^\infty(L))^G$, the class $[P_f\sigma]$ is independent of the representative $f \in C^\infty(P)^G$ of $[f]$. Indeed, this is why we require it in the singular prequantization condition. For $k\sigma \in I\Gamma^\infty(L)$, where $k \in I$, $P_f(k\sigma) = -i\hbar\,(X_fk)\sigma + kP_f\sigma$. Since $f \in C^\infty(P)^G$, it follows that $X_f$ preserves $J^{-1}(0)$, and so $X_f$ is tangent to $J^{-1}(0)$. Since $k|_{J^{-1}(0)}$ is zero, $X_fk$ vanishes on $J^{-1}(0)$, and so is in $I$. Therefore, $P_f$ maps $I\Gamma^\infty(L)$ to itself. This implies that $[P_f\sigma]$ is independent of the representative $\sigma$ of $[\sigma]$ as well.
It remains to show that $[P_f\sigma]$ is $G$-invariant if $[\sigma]$ is, namely, that $P_{J_\xi}(P_f\sigma) \in I\Gamma^\infty(L)$ for all $\xi \in \mathfrak{g}$. Now $P_{J_\xi}(P_f\sigma) = P_f(P_{J_\xi}\sigma) + [P_{J_\xi}, P_f]\sigma$. First, $P_{J_\xi}\sigma \in I\Gamma^\infty(L)$ by assumption. By the previous paragraph $P_f$ maps $I\Gamma^\infty(L)$ to itself. So the first term is in $I\Gamma^\infty(L)$. Moreover, $[P_{J_\xi}, P_f] = -i\hbar P_{\{J_\xi, f\}}$, which is 0 since $f$ is $G$-invariant. Thus $[P_f\sigma]$ is $G$-invariant.
Therefore, the singularly reduced prequantization map (32) is well defined.
In geometric quantization the transition from prequantization to quantization consists of the restriction of the domain of the prequantization map to quantizable functions $C^\infty_F(P)$ and polarized sections $\Gamma^\infty_F(L)$. Here, “quantizable” functions are those whose Hamiltonian vector fields preserve the polarization $F$. Such functions form a Poisson subalgebra $C^\infty_F(P)$. Polarized sections are sections of $L$ that are covariantly constant along $F$. The analogues in singular reduction of the spaces $C^\infty_F(P)$ and $\Gamma^\infty_F(L)$ are $(C^\infty_F(P) \cap C^\infty(P)^G)/I^G$ and $(\Gamma^\infty_F(L)/I\Gamma^\infty(L))^G$, respectively.
Theorem 3.2. The space $S = (\Gamma^\infty_F(L)/I\Gamma^\infty(L))^G$ is a module over the ring $R = (C^\infty(P)^0_F \cap C^\infty(P)^G)/I^G$. Moreover, if the system satisfies the singular quantization condition
$P_k\tau \in I\Gamma^\infty(L)$ for all $k \in I^G \cap C^\infty_F(P)$ and all $\tau \in \Gamma^\infty_F(L)$ such that $P_{J_\xi}\tau \in I\Gamma^\infty(L)$ for all $\xi \in \mathfrak{g}$
(33)
then the singularly reduced quantization map R × S → S : ([f ], [σ]) → Q[f ] [σ] = [Qf σ]
(34)
is well-defined.
Proof. Again, in order to show that $S$ is a module over the ring $R$, we need to show closure under multiplication and independence of the representatives of the various quotients. Independence was already shown in the proof of the preceding theorem; since both the ring and the (putative) module are sub-objects of the corresponding objects considered there, independence of representatives holds here as well. For closure, we only need to show that $f\sigma$ is polarized if $\sigma$ is polarized and $f \in C^\infty(P)^0_F \cap C^\infty(P)^G$. But this is essentially the definition of $C^\infty(P)^0_F$: if $Y \in F$, then $\nabla_Y(f\sigma) = df(Y)\sigma + f\nabla_Y\sigma$; the second term is zero since $\sigma$ is polarized, and the first is zero by the definition of $C^\infty(P)^0_F$. Thus $S$ is a module over $R$.
As for the fact that the singularly reduced quantization map is well-defined, we have already shown in the proof of the previous theorem that the prequantization map is independent of the representatives of the classes $[f]$ and $[\sigma]$. The argument for the singularly reduced quantization map is identical, except that $f$ and $\sigma$ are quantizable and polarized, respectively. The only thing left to check is that $Q_f$ maps into the correct space, namely, that if $f$ is quantizable and $\sigma$ is polarized, then $Q_f\sigma$ is polarized. But as noted in Sec. 3.1.2, for each $f \in C^\infty_F(P)$, the operator $P_f$ takes $\Gamma^\infty_F(L)$ to itself, which is exactly what we need. Thus the reduced quantization map (34) is well-defined.
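Both proofs above rely on a Leibniz-type rule for the prequantization operator. As a sketch, assuming the operator has the form $P_f = -i\hbar\nabla_{X_f} + f$ of Eq. (29) (the placement of $\hbar$ is an assumption about the normalization), the rule follows directly from the derivation property of $\nabla$ for a function $k$ and a section $\sigma$:

```latex
% Sketch: Leibniz rule for P_f = -i\hbar \nabla_{X_f} + f acting on a product k\sigma.
% Assumes only that \nabla_X(k\sigma) = (Xk)\sigma + k\nabla_X\sigma.
\begin{align*}
P_f(k\sigma)
  &= -i\hbar\,\nabla_{X_f}(k\sigma) + f k\sigma \\
  &= -i\hbar\,\bigl[(X_f k)\sigma + k\,\nabla_{X_f}\sigma\bigr] + k f\sigma \\
  &= -i\hbar\,(X_f k)\sigma + k\,\bigl(-i\hbar\,\nabla_{X_f}\sigma + f\sigma\bigr) \\
  &= -i\hbar\,(X_f k)\sigma + k\,P_f\sigma .
\end{align*}
```

Taking $f = J_\xi$ and a $G$-invariant function in place of $k$ makes the first term vanish, which is the step used for closure under multiplication. Taking $k \in I$ and $f \in C^\infty(P)^G$ puts the first term in $I\Gamma^\infty(L)$, because $X_f$ is tangent to $J^{-1}(0)$.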
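Remark 3.3. Observe that the singular prequantization condition implies the singular quantization condition. Both of these conditions hold if 0 is a regular value of $J$.
3.2.2. Comparison with algebraic reduction
Let $\mathcal{J}$ be the ideal in $C^\infty(P)$ generated by components of the momentum map $J : P \to \mathfrak{g}^*$. In other words, $\mathcal{J} = \{\sum_{i=1}^{k} f_iJ_{\xi_i} \mid f_i \in C^\infty(P)\}$, where $(\xi_1, \ldots, \xi_k)$ is a basis in $\mathfrak{g}$. The Poisson algebra of algebraic reduction at $J = 0$ is the space $(C^\infty(P)/\mathcal{J}C^\infty(P))^G$ of $G$-invariant elements in $C^\infty(P)/\mathcal{J}C^\infty(P)$. Prequantization gives an action
$(C^\infty(P)/\mathcal{J}C^\infty(P))^G \times (\Gamma^\infty(L)/\mathcal{J}\Gamma^\infty(L))^G \to (\Gamma^\infty(L)/\mathcal{J}\Gamma^\infty(L))^G : (\langle f\rangle, \langle\sigma\rangle) \mapsto P_{\langle f\rangle}\langle\sigma\rangle = \langle P_f\sigma\rangle$,
where $\langle f\rangle$ denotes the class of $f$ in $(C^\infty(P)/\mathcal{J}C^\infty(P))^G$ and $\langle\sigma\rangle$ denotes the class of $\sigma$ in $(\Gamma^\infty(L)/\mathcal{J}\Gamma^\infty(L))^G$. Similarly, quantization gives an action
$(C^\infty_F(P)/\mathcal{J}C^\infty(P))^G \times (\Gamma^\infty_F(L)/\mathcal{J}\Gamma^\infty(L))^G \to (\Gamma^\infty_F(L)/\mathcal{J}\Gamma^\infty(L))^G : (\langle f\rangle, \langle\sigma\rangle) \mapsto Q_{\langle f\rangle}\langle\sigma\rangle = \langle Q_f\sigma\rangle$.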
In the case of algebraic reduction, both actions are well defined without any additional conditions. Observe that singular prequantization and quantization conditions involve prequantization and quantization operators of functions in $I^G$, respectively. If $\mathcal{J}$ is not a radical ideal, there may exist $G$-invariant functions which vanish on $J^{-1}(0)$ and are not in $\mathcal{J}$. Moreover, prequantization and quantization of algebraic reduction allow for non-zero operators to be assigned to such functions. On the other hand, prequantization and quantization of singular reduction require that operators corresponding to such functions vanish.
In Sec. 4, we discuss an example where $\mathcal{J}$ is not a radical ideal. In addition, the singularly and algebraically reduced Poisson algebras are not isomorphic. Nevertheless, their quantizations yield the same quantum system because quantization of the algebraically reduced Poisson algebra assigns zero operators to equivalence classes of functions in $I^G$. The reason for this is the fact that for each $k \in I^G$ the Hamiltonian vector field $X_k$ of $k$ restricted to $J^{-1}(0)$ is a linear combination of vector fields $X_{J_\xi}$, for $\xi \in \mathfrak{g}$, with coefficients given by functions on $J^{-1}(0)$.
Theorem 3.4. Suppose that for each $k \in I^G$, $X_k|_{J^{-1}(0)}$ is a linear combination of $X_{J_\xi}$, $\xi \in \mathfrak{g}$, such that the coefficients are functions which extend to a neighborhood of $J^{-1}(0)$. Then the system satisfies the singular prequantization condition, and thus the singular quantization condition as well.
Proof. Recall that the singular prequantization condition is the following: $P_k\tau \in I\Gamma^\infty(L)$ for all $k \in I^G$ and $\tau$ such that $P_{J_\xi}\tau \in I\Gamma^\infty(L)$ for all $\xi \in \mathfrak{g}$.
(35)
Suppose $P$ satisfies the vector field spanning condition of the theorem, namely, given $k \in I^G$, there exist functions $c_\xi$ defined on a neighborhood of $J^{-1}(0)$ such that $X_k = \sum_{\xi \in \mathfrak{g}} c_\xi X_{J_\xi}$ on $J^{-1}(0)$. Since the $c_\xi$ are defined near $J^{-1}(0)$ as well as on it, we can write
$X_k = Y + \sum_{\xi \in \mathfrak{g}} c_\xi X_{J_\xi}$, (36)
which is valid on a neighborhood of $J^{-1}(0)$. Here $Y$ is a vector field which vanishes on $J^{-1}(0)$. Without loss of generality (by extending the $c_\xi$ smoothly to zero outside a neighborhood of $J^{-1}(0)$ and adjusting the definition of $Y$) we may assume that (36) holds on all of $P$.
Suppose $\tau \in \Gamma^\infty(L)$ is such that $P_{J_\xi}\tau \in I\Gamma^\infty(L)$ for all $\xi \in \mathfrak{g}$. We wish to show that $P_k\tau \in I\Gamma^\infty(L)$ for all $k \in I^G$. Expanding the definition of $P_k\tau$ using (36) gives $P_k\tau = -i\hbar\nabla_{X_k}\tau + k\tau = -i\hbar\nabla_{(\sum_\xi c_\xi X_{J_\xi} + Y)}\tau + k\tau = -i\hbar\sum_\xi c_\xi\nabla_{X_{J_\xi}}\tau - i\hbar\nabla_Y\tau + k\tau$.
(37)
Consider the last line of (37). The term $k\tau$ is in $I\Gamma^\infty(L)$, by definition. We see that the second term $-i\hbar\nabla_Y\tau$ is in $I\Gamma^\infty(L)$, as follows. Write $\tau = \psi\sigma_1$. Then $\nabla_Y\tau = Y(\psi)\sigma_1 + i\psi\theta(Y)\sigma_1$. Since $Y$ vanishes on $J^{-1}(0)$, both $Y(\psi)$ and $\theta(Y)$ vanish on $J^{-1}(0)$. So $\nabla_Y\tau$ is also in $I\Gamma^\infty(L)$. Finally, $\nabla_{X_{J_\xi}}\tau = i\hbar^{-1}(P_{J_\xi}\tau - J_\xi\tau)$ is in $I\Gamma^\infty(L)$ since, by hypothesis, $P_{J_\xi}\tau \in I\Gamma^\infty(L)$. ($J_\xi\tau$ is clearly in $I\Gamma^\infty(L)$.) Therefore, $P_k\tau \in I\Gamma^\infty(L)$. Thus the system satisfies the singular prequantization condition.
3.2.3. Quantization of reduction at $O \neq \{0\}$
As we have seen above, singular and algebraic reduction at a co-adjoint orbit $O \neq \{0\}$ are equivalent to the corresponding reduction of the action of $G$ on $(P \times O, \omega_{P\times O} = \mathrm{pr}_1^*\omega - \mathrm{pr}_2^*\omega_O)$ at the zero level of the momentum map $J_{P\times O} : (p, \mu) \mapsto J(p) - \mu$. Therefore, we interpret quantization of reduction of $(P \times O, \omega_{P\times O})$ at $J_{P\times O}^{-1}(0)$ as quantization of reduction of $(P, \omega)$ at $J^{-1}(O)$. In this subsection we construct a quantization structure on the co-adjoint orbit $(O, \omega_O)$ and on $(P \times O, \omega_{P\times O})$. This gives rise to the quantization maps $Q^O$ and $Q^{P\times O}$, respectively, see (38) and (41). In turn these quantization maps give rise to the quantum representations $R^O$ and $R^{P\times O}$, respectively. The details follow.
We now introduce a quantization structure on a co-adjoint orbit $O$. This means that $O$ has to be a quantizable co-adjoint orbit. Let $\pi_O : L_O \to O$ be a prequantization complex line bundle for $(O, \omega_O)$. Denote by $\nabla^O$ the covariant derivative associated to the connection 1-form $\theta_O$ on $L_O$ and let $F_O$ be a strongly admissible positive $\mathrm{Ad}^*_G$-invariant polarization of $(O, \omega_O)$. Let $C^\infty(O)_{F_O}$ be the space of functions in $C^\infty(O)$ such that their Hamiltonian vector fields preserve the polarization $F_O$. Similarly, let $\Gamma^\infty_{F_O}(L_O)$ be the space of smooth sections of $L_O$ that are
covariantly constant along $F_O$. Denote the corresponding quantization map by $Q^O : C^\infty_{F_O}(O) \times \Gamma^\infty_{F_O}(L_O) \to \Gamma^\infty_{F_O}(L_O)$
(38)
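and by $R^O$ the corresponding representation of $G$ on $\Gamma^\infty_{F_O}(L_O)$.
Complex conjugation $z \mapsto \bar{z}$ in $L_O$ is an automorphism of $L_O$ as a real vector bundle over $O$, but it conjugates its complex structure. We denote by $\bar{L}_O$ the complex line bundle over $O$ with the conjugate complex structure endowed with covariant derivative $\bar{\nabla}^O$ corresponding to a connection 1-form $\bar{\theta}_O = -\theta_O$ and curvature form $d\bar{\theta}_O = -\omega_O$. By assumption, $F_O$ is a positive polarization of $(O, \omega_O)$, that is, $i\omega_O(w, \bar{w}) \ge 0$ for all $w \in F_O$. This implies that $\bar{F}_O$ is a positive polarization of $(O, -\omega_O)$. Thus, a given quantization structure on $(O, \omega_O)$ induces a quantization structure on $(O, -\omega_O)$ given by $\bar{L}_O$, $\bar{\nabla}^O$, and $\bar{F}_O$. Denote by $\bar{Q}^O$ the quantization map induced by this structure and by $\bar{R}^O$ the corresponding representation of $G$ on $\Gamma^\infty_{\bar{F}_O}(\bar{L}_O)$.
We consider the line bundle $\lambda_{P\times O} : L_{P\times O} \to P \times O$ defined as the tensor product of the bundles $\lambda : L \to P$ and $\bar{\lambda}_O : \bar{L}_O \to O$. More precisely, $L_{P\times O} = \mathrm{pr}_1^*L \otimes \mathrm{pr}_2^*\bar{L}_O$, where $\mathrm{pr}_1^*L$ and $\mathrm{pr}_2^*\bar{L}_O$ are pullbacks to $P \times O$ of $L$ and $\bar{L}_O$ by the projection maps on the first and the second factors, respectively. Local sections of $\lambda_{P\times O} : L_{P\times O} \to P \times O$ are linear combinations of sections of the form $\sigma = \sigma_P \otimes \bar{\sigma}_O$, where $\sigma_P$ and $\bar{\sigma}_O$ are local sections of $L$ and $\bar{L}_O$, respectively. Let $\nabla^{P\times O}$ be a connection on $L_{P\times O}$ defined by $\nabla^{P\times O}(\sigma_P \otimes \bar{\sigma}_O) = \nabla\sigma_P \otimes \bar{\sigma}_O + \sigma_P \otimes \bar{\nabla}^O\bar{\sigma}_O$. The connection $\nabla^{P\times O}$ satisfies the prequantization condition for $\omega_{P\times O} = \mathrm{pr}_1^*\omega - \mathrm{pr}_2^*\omega_O$. Finally, we choose the polarization $F_{P\times O}$ to be the direct sum $F \oplus \bar{F}_O$. It is a strongly admissible positive $G$-invariant polarization of $(P \times O, \omega_{P\times O})$.
Moreover, $\sigma_P \otimes \bar{\sigma}_O$ is covariantly constant along $F_{P\times O}$ if and only if $\sigma_P$ is covariantly constant along $F$ and $\bar{\sigma}_O$ is covariantly constant along $\bar{F}_O$. Therefore $\Gamma^\infty(L_{P\times O}) = \Gamma^\infty(L) \otimes \Gamma^\infty(\bar{L}_O)$, and the space of smooth sections of $L_{P\times O}$ that are covariantly constant along $F_{P\times O}$ is $\Gamma^\infty_{F_{P\times O}}(L_{P\times O}) = \Gamma^\infty_F(L) \otimes \Gamma^\infty_{\bar{F}_O}(\bar{L}_O)$. If $\sigma_P \otimes \bar{\sigma}_O \in \Gamma^\infty_{F_{P\times O}}(L_{P\times O})$, $f_P \in C^\infty_F(P)$ and $f_O \in C^\infty_{\bar{F}_O}(O)$, then $Q^{P\times O}_{\mathrm{pr}_1^*f_P}(\sigma_P \otimes \bar{\sigma}_O) = (Q_{f_P}\sigma_P) \otimes \bar{\sigma}_O$ and $Q^{P\times O}_{\mathrm{pr}_2^*f_O}(\sigma_P \otimes \bar{\sigma}_O) = \sigma_P \otimes \bar{Q}^O_{f_O}\bar{\sigma}_O$. Thus, the quantization representation $R^{P\times O}$ of $G$ on $\Gamma^\infty_{F_{P\times O}}(L_{P\times O})$ is the tensor product of the quantization representations $R$ and $\bar{R}^O$.
Quantization of singular reduction of $(P \times O, \omega_{P\times O})$ at $J_{P\times O}^{-1}(0)$ is interpreted as quantization of singular reduction of $(P, \omega)$ at $J^{-1}(O)$. We denote by $I_{P\times O}$ the ideal in $C^\infty(P \times O)$ consisting of functions that vanish on $J_{P\times O}^{-1}(0)$. The singular prequantization condition is $P^{P\times O}_k\tau \in I_{P\times O}\Gamma^\infty(L_{P\times O})$ for all $k \in I^G_{P\times O}$ and all $\tau \in \Gamma^\infty(L_{P\times O})$ such that $P^{P\times O}_{(J_{P\times O})_\xi}\tau \in I_{P\times O}\Gamma^\infty(L_{P\times O})$ $\forall \xi \in \mathfrak{g}$,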
(39)
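and the singular quantization condition reads $P^{P\times O}_k\tau \in I_{P\times O}\Gamma^\infty(L_{P\times O})$ for all $k \in I^G_{P\times O} \cap C^\infty_{F_{P\times O}}(P \times O)$ and all $\tau \in \Gamma^\infty_{F_{P\times O}}(L_{P\times O})$ such that $P^{P\times O}_{(J_{P\times O})_\xi}\tau \in I_{P\times O}\Gamma^\infty(L_{P\times O})$ $\forall \xi \in \mathfrak{g}$.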
(40)
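The representation space of prequantization of singular reduction at $O$ is $S = (\Gamma^\infty(L_{P\times O})/I_{P\times O}\Gamma^\infty(L_{P\times O}))^G$, and the corresponding prequantization map is given by $(C^\infty(P \times O)^G/I^G_{P\times O}) \times S \to S : ([f], [\sigma]) \mapsto [P^{P\times O}_f\sigma]$, where $f \in C^\infty(P \times O)$ is any representative of $[f] \in C^\infty(P \times O)^G/I^G_{P\times O}$ and $\sigma \in \Gamma^\infty(L_{P\times O})$ is any representative of $[\sigma] \in S$. Similarly, the Poisson algebra of quantizable elements of $C^\infty(P \times O)^G/I^G_{P\times O}$ in the polarization $F_{P\times O}$ is $C^\infty(P \times O)^G_{F_{P\times O}}/I^G_{P\times O}$. The representation space of quantization of singular reduction at $O$ is $S_{F_{P\times O}} = (\Gamma^\infty_{F_{P\times O}}(L_{P\times O})/I_{P\times O}\Gamma^\infty(L_{P\times O}))^G$. The corresponding quantization map is
$Q^{P\times O} : (C^\infty(P \times O)^G_{F_{P\times O}}/I^G_{P\times O}) \times S_{F_{P\times O}} \to S_{F_{P\times O}} : ([f], [\sigma]) \mapsto [Q_f\sigma]$
(41)
where $f \in C^\infty(P \times O)^G_{F_{P\times O}}$ is any representative of $[f] \in C^\infty(P \times O)^G_{F_{P\times O}}/I^G_{P\times O}$ and $\sigma \in \Gamma^\infty_{F_{P\times O}}(L_{P\times O})$ is any representative of $[\sigma] \in S_{F_{P\times O}}$.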
3.2.4. Kähler polarizations
A Kähler polarization is a positive polarization $F$ such that $F \cap \bar{F} = 0$. A symplectic manifold $(P, \omega)$ endowed with a positive Kähler polarization $F$ has the structure of a complex Kähler manifold such that the distribution $F$ consists of antiholomorphic directions. Moreover, the prequantization line bundle $L$ over $P$ is holomorphic and the space $\Gamma^\infty_F(L)$ consists of holomorphic sections of $L$. Square integrable sections in $\Gamma^\infty_F(L)$ form a Hilbert space $H_F$ [6]. We use this notation to emphasize the polarization $F$. For each $f \in C^\infty(P)$ such that the Hamiltonian vector field $X_f$ is complete and preserves the polarization $F$, the quantization operator $Q_f$ is skew-adjoint on $H_F$. The corresponding quantization representation $R$ of $G$ is unitary on $H_F$.
Lemma 3.5. Let $Q$ be a Lagrangian submanifold of a connected $2n$-dimensional Kähler manifold $P$, and let $f$ be a holomorphic function that vanishes identically on $Q$. Then $f$ vanishes identically on $P$.
Proof. It suffices to work locally. So take $n$ nowhere zero independent real vector fields $X_1, \ldots, X_n$ tangent to $Q$ and extend them to smooth real vector fields in a $P$-open neighborhood of a point $p \in Q$. Let $J$ denote the associated almost-complex structure tensor, and consider the complex vector fields $Z_a = X_a + iJX_a$. Since $f$ is holomorphic, $Z_af = 0$. Furthermore, since the Kähler condition implies $J\,TQ \cap TQ = \{0\}$, see McDuff and Salamon [23], the vector fields
$X_1, \ldots, X_n, JX_1, \ldots, JX_n$ are independent. By using suitable linear combinations of the $X_a$ and the $Z_a$, this implies that all partial derivatives of $f$ vanish at the point $p$. Since $f$ is analytic, it vanishes identically.
Corollary 3.6. If $\sigma$ is a holomorphic section of a line bundle $L$ over a Kähler manifold $P$, which vanishes identically on a Lagrangian submanifold $Q$, then $\sigma$ vanishes identically on $P$.
Proof. Let $\tau : V \to U \times \mathbb{C}$ be a local trivialization of the line bundle $L$. Then restricting the holomorphic section $\sigma$ to $U \cap F$ gives rise to a holomorphic function $f : U \cap F \to \mathbb{C} : u \mapsto f(u)$ such that $\sigma|_{U \cap F}(u) = (u, f(u))$. From Lemma 3.5 it follows that $f|_{U \cap F} = 0$ and therefore $\sigma|_U = 0$. Since $\sigma$ is holomorphic, it follows that $\sigma = 0$ on $P$.
Corollary 3.7. Let $F$ be a Kähler polarization of $(P, \omega)$ and let $J : P \to \mathfrak{g}^*$ be the momentum map for a Hamiltonian action of $G$ on $P$. If $J^{-1}(0)$ contains a Lagrangian submanifold of $(P, \omega)$ then $\Gamma^\infty_F(L) \cap I\Gamma^\infty(L) = \{0\}$.
Proof. Since $I = \{f \in C^\infty(P) \mid f|_{J^{-1}(0)} = 0\}$, it follows that $\Gamma^\infty_F(L) \cap I\Gamma^\infty(L)$ is the set of holomorphic sections of the line bundle $L$ that vanish on $J^{-1}(0)$. By hypothesis, $J^{-1}(0)$ contains a Lagrangian submanifold $Q$ of $P$. Hence, sections in $\Gamma^\infty_F(L) \cap I\Gamma^\infty(L)$ are holomorphic and vanish on a Lagrangian submanifold $Q$ of $P$. The result follows using Corollary 3.6.
Let $\Gamma^\infty_F(L)^G$ be the space of $G$-invariant sections in $\Gamma^\infty_F(L)$. Since $G$ is connected, it follows that $\Gamma^\infty_F(L)^G = \{\sigma \in \Gamma^\infty_F(L) \mid Q_{J_\xi}\sigma = 0 \text{ for all } \xi \in \mathfrak{g}\}$.
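Theorem 3.8. Let $F$ be a Kähler polarization of $(P, \omega)$ and let $J : P \to \mathfrak{g}^*$ be a momentum map for an action on $(P, \omega)$ of a connected Lie group $G$ such that $J^{-1}(0)$ contains a Lagrangian submanifold of $(P, \omega)$. Then the spaces $\Gamma^\infty_F(L)^G$, $(\Gamma^\infty_F(L)/\mathcal{J}\Gamma^\infty(L))^G$, and $(\Gamma^\infty_F(L)/I\Gamma^\infty(L))^G$ may be naturally identified with each other. If the singular quantization condition, namely,

$$P_k\tau \in I\Gamma^\infty(L) \ \text{for all } k \in I^G \cap C^\infty_F(P) \ \text{and all } \tau \in \Gamma^\infty_F(L) \ \text{such that } P_{J_\xi}\tau \in I\Gamma^\infty(L) \ \forall \xi \in \mathfrak{g} \tag{42}$$

is satisfied, then $Q_f\sigma = Q_{[f]}[\sigma]$ for every $f \in C^\infty_F(P)^G$ and every $\sigma \in \Gamma^\infty_F(L)^G$.

Proof. It follows from Corollary 3.7 that $\Gamma^\infty_F(L) \cap I\Gamma^\infty(L) = \{0\}$. Since $\mathcal{J} \subseteq I$, it follows that $\Gamma^\infty_F(L) \cap \mathcal{J}\Gamma^\infty(L) = \{0\}$. Hence,

$$\begin{aligned}
(\Gamma^\infty_F(L)/I\Gamma^\infty(L))^G &= \{[\sigma] \in \Gamma^\infty_F(L)/I\Gamma^\infty(L) \mid [P_{J_\xi}\sigma] = 0 \ \forall \xi \in \mathfrak{g}\} \\
&= \{\sigma \in \Gamma^\infty_F(L) \mid P_{J_\xi}\sigma \in I\Gamma^\infty(L) \ \forall \xi \in \mathfrak{g}\} \\
&= \{\sigma \in \Gamma^\infty_F(L) \mid Q_{J_\xi}\sigma \in I\Gamma^\infty(L) \ \forall \xi \in \mathfrak{g}\} \\
&= \{\sigma \in \Gamma^\infty_F(L) \mid Q_{J_\xi}\sigma \in \Gamma^\infty_F(L) \cap I\Gamma^\infty(L) \ \forall \xi \in \mathfrak{g}\} \\
&= \{\sigma \in \Gamma^\infty_F(L) \mid Q_{J_\xi}\sigma = 0 \ \forall \xi \in \mathfrak{g}\} = \Gamma^\infty_F(L)^G.
\end{aligned}$$

If the singular quantization condition is satisfied, then $Q_{[f]}[\sigma]$ is well defined for every $[f] \in C^\infty_F(P)^G/I^G$ and for every $[\sigma] \in (\Gamma^\infty_F(L)/I\Gamma^\infty(L))^G$. Also, for each $\sigma \in \Gamma^\infty_F(L)^G$, $Q_f\sigma \in \Gamma^\infty_F(L)^G$ and $[\sigma] \in (\Gamma^\infty_F(L)/I\Gamma^\infty(L))^G$. Hence,

$$Q_{[f]}[\sigma] = [Q_f\sigma] = Q_f\sigma \ \operatorname{mod}\,(\Gamma^\infty_F(L) \cap I\Gamma^\infty(L)) = Q_f\sigma.$$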
This completes the proof of Theorem 3.8.

Theorem 3.8 implies that quantized singular reduction at $0 \in \mathfrak{g}^*$ provides information about $G$-invariant polarized sections. We shall show that quantized singular reduction at a non-zero co-adjoint orbit $O$ facilitates a description of the closed invariant subspace of $H_F$ on which the quantization representation is equivalent to the irreducible unitary representation corresponding to $O$. First we state a known result in representation theory using the notation of Sec. 3.2.4. Its proof can be found in [33].

Fact 3.9. Let $R$ be a unitary representation of a connected Lie group $G$ on a Hilbert space $H_F$, and $R_O$ be an irreducible unitary representation of $G$ on a Hilbert space $H_O$. The space $(H_F \otimes \bar{H}_O)^G$ of $G$-invariant elements in $H_F \otimes \bar{H}_O$, where $\bar{H}_O$ denotes the complex conjugate of $H_O$, determines a projection operator $\Pi_O$ defined on $H_F$ such that the range of $\Pi_O$ is the largest closed $G$-invariant subspace of $H_F$ on which the representation $R$ of $G$ is equivalent to a Hilbert direct sum of copies of the irreducible representation $R_O$.

Theorem 3.10. Let $(O, \omega_O)$ be a quantizable co-adjoint orbit such that geometric quantization with respect to a Kähler polarization $F_O$ on $O$ gives rise to an irreducible unitary representation $R_O$ of $G$ on a Hilbert space $H_O$. In addition, assume that the quantization of $(P, \omega)$ with respect to a Kähler polarization $F$, and the quantization of $(P \times O, \omega_{P \times O})$ with respect to a Kähler polarization $F \oplus \bar{F}_O$, give rise to unitary representations $R$ and $R \otimes \bar{R}_O$ of $G$, on Hilbert spaces $H_F$ and $H_F \otimes \bar{H}_O$, respectively. Also assume that there is a Lagrangian submanifold of $(P \times O, \omega_{P \times O})$ contained in $J^{-1}_{P \times O}(0)$. Under these assumptions, the space of states of the quantization of singular reduction at $O$ and that of the quantization of algebraic reduction at $O$ coincide.
Moreover, they give rise to a projection operator $\Pi_O$ defined on $H_F$ such that range $\Pi_O$ is the largest closed $G$-invariant subspace of $H_F$ on which $R$ is equivalent to a Hilbert direct sum of copies of the irreducible representation $R_O$.

Proof. By Theorem 3.8, there is a natural identification of $\Gamma^\infty_{F_{P \times O}}(L_{P \times O})^G$, the space of $G$-invariant polarized sections of $L_{P \times O}$, with the representation space

$$\bigl(\Gamma^\infty_{F_{P \times O}}(L_{P \times O}) / I_{P \times O}\Gamma^\infty_{F_{P \times O}}(L_{P \times O})\bigr)^G$$

of quantization of singular reduction at $O$. Let $H^G_{P \times O}$ be the Hilbert space of square integrable sections in $\Gamma^\infty_{F_{P \times O}}(L_{P \times O})^G$. Recall that, by construction, $H^G_{P \times O} = (H_F \otimes \bar{H}_O)^G$, see Sec. 3.2.3. Note that the tensor product in the preceding formula is a completed tensor product. It follows from Fact 3.9 that $(H_F \otimes \bar{H}_O)^G$ determines a projection operator $\Pi_O$ defined on $H_F$ such that its range is the largest closed $G$-invariant subspace of $H_F$ on which the representation $R$ of $G$ is unitarily equivalent to a Hilbert direct sum of copies of the irreducible representation $R_O$.

Remark 3.11. Note that Theorem 3.10 does not require the singular quantization condition.

3.3. An alternative approach

From Sec. 2.3, we know that $C^\infty(P)^G/I^G$ is the Poisson algebra obtained by singular reduction at $0 \in \mathfrak{g}^*$. The space of reduced sections needs to be a module over this algebra. In Sec. 3.2.2 it is defined to be $(\Gamma^\infty(L)/I\Gamma^\infty(L))^G$. However, in order for the singularly reduced prequantization map to be well-defined, we needed to assume (as in Theorems 3.1 and 3.2) that $P_k\tau \in I\Gamma^\infty(L)$ for all $k \in I^G$ and all $\tau$ which are $G$-invariant mod $I$, which we called the "singular prequantization condition".

Another approach, which we explore in this subsection, is to define the space of reduced sections by fiat in such a way that the reduced prequantization map is well-defined. Rather than requiring that the $P_k\tau$ terms become zero in the quotient, we include them in the kernel of the quotient map. To this end, we define $A$ to be the space spanned by $P_k\tau$ where $k \in I^G$ and $\tau \in \Gamma^\infty(L)$ such that $P_{J_\xi}\tau \in I\Gamma^\infty(L)$ for every $\xi \in \mathfrak{g}$. Note that $P_k$ satisfies the singular prequantization condition if and only if $A = \{0\}$. Now define

$$K = \operatorname{span}\bigl(A \cup I\Gamma^\infty(L)\bigr). \tag{43}$$

In essence, $K$ is $I\Gamma^\infty(L)$, expanded by everything that needs to be zero in order for (pre)quantization of singular reduction to be defined. We begin with some technical results that will make the ensuing calculations easier.

Lemma 3.12. Let $f \in C^\infty(P)^G$ and $\zeta \in \mathfrak{g}$, and let $\tau$ be $G$-invariant mod $I\Gamma^\infty(L)$, namely, $P_{J_\xi}\tau \in I\Gamma^\infty(L)$ for all $\xi \in \mathfrak{g}$. Then $f\tau$, $P_f\tau$, and $P_{J_\zeta}\tau$ are also $G$-invariant mod $I\Gamma^\infty(L)$.

Proof. These are all straightforward calculations. For $f\tau$,

$$P_{J_\xi}(f\tau) = fP_{J_\xi}\tau - iX_{J_\xi}(f)\tau.$$

By assumption, the section $P_{J_\xi}\tau$ is in $I\Gamma^\infty(L)$, and so when multiplied by the function $f$ it is still in $I\Gamma^\infty(L)$. The second term vanishes by the $G$-invariance of $f$, and so $P_{J_\xi}(f\tau)$ is in $I\Gamma^\infty(L)$.
For $P_f\tau$,

$$P_{J_\xi}P_f\tau = P_f(P_{J_\xi}\tau) + [P_{J_\xi}, P_f]\tau = P_fP_{J_\xi}\tau + iP_{\{J_\xi, f\}}\tau.$$

The second term vanishes because of the $G$-invariance of $f$, while the first term is in $I\Gamma^\infty(L)$ because $P_f$ maps $I\Gamma^\infty(L)$ to itself, as noted in the proof of Theorem 3.1. Finally, for $P_{J_\zeta}\tau$, it suffices to show that $P_{J_\zeta}$ maps $I\Gamma^\infty(L)$ to itself:

$$P_{J_\zeta}(k\sigma) = kP_{J_\zeta}\sigma - iX_{J_\zeta}(k)\sigma \tag{44}$$

where $k \in I$ and $\sigma \in \Gamma^\infty(L)$. The first term is clearly in $I\Gamma^\infty(L)$. Since $X_{J_\zeta}$ is tangent to $J^{-1}(0)$, and $k$ vanishes on $J^{-1}(0)$, the second term is zero on $J^{-1}(0)$, and so $P_{J_\zeta}(k\sigma) \in I\Gamma^\infty(L)$.

Lemma 3.13. $K$ is a $G$-invariant, $C^\infty(P)^G$-submodule of $\Gamma^\infty(L)$. In addition, $\Gamma^\infty(L)/K$ is a module over $C^\infty(P)^G$ and $C^\infty(P)^G/I^G$. Furthermore, $P_f\sigma \in K$ for all $\sigma \in K$ and every $f \in I^G$.

Proof. Since $K$ is generated by two types of sections, those of the form $h\sigma$ for $h \in I$ and those of the form $P_k\tau$, we need to check each assertion on each of these two types. To show $K$ is a submodule, note that given $f \in C^\infty(P)^G$, $f(h\sigma) = (fh)\sigma$, which is in $K$. Also,

$$fP_k\tau = P_kf\tau + i\{f, k\}\tau. \tag{45}$$

The first term is in $K$ since it is $P_k$ of the section $f\tau$, which is $G$-invariant mod $I\Gamma^\infty(L)$ by Lemma 3.12, while the second term is in $K$ since $\{f, k\}$ is in $I^G$, because $k \in I^G$ and $I^G$ is a Poisson ideal. Thus $K$ is a submodule of $\Gamma^\infty(L)$, which implies $\Gamma^\infty(L)/K$ is a module over $C^\infty(P)^G$. For it to be a module over $C^\infty(P)^G/I^G$, we only need that multiplication by elements of $I^G$ preserves $K$, which is trivial.

To show $G$-invariance, since $G$ is connected, and the action of $G$ on $\Gamma^\infty(L)$ is generated by $P_{J_\xi}$ for $\xi \in \mathfrak{g}$, it suffices to check that $P_{J_\xi}\kappa \in K$ for each $\kappa \in K$. As noted around (44) above, $P_{J_\xi}$ maps $I\Gamma^\infty(L)$ to itself, so it suffices to check for elements of $K$ of the form $P_k\tau$. To that end,

$$P_{J_\xi}(P_k\tau) = P_k(P_{J_\xi}\tau) + iP_{\{J_\xi, k\}}\tau.$$

The first term is in $K$ because $P_{J_\xi}\tau$ is $G$-invariant mod $I\Gamma^\infty(L)$ by Lemma 3.12, while the second term is zero by $G$-invariance of $k$.

Finally, we show the third assertion. First, for $f$ in $C^\infty(P)^G$, $P_f$ maps $I\Gamma^\infty(L)$ to itself, as shown in the proof of Theorem 3.1, and so we only need to check it for elements of the form $P_k\tau$. To that end,

$$P_fP_k\tau = P_kP_f\tau + iP_{\{f, k\}}\tau.$$

The second term is in $K$ since $\{f, k\} \in I^G$, while the first term is in $K$ since $P_f\tau$ is $G$-invariant mod $I\Gamma^\infty(L)$ by Lemma 3.12. Thus $P_f$ maps $K$ to itself, for all $f \in C^\infty(P)^G$.
It follows that we can define an action of $G$ on $\Gamma^\infty(L)/K$ by $(g, [\sigma]) \mapsto [U_g\sigma]$, where $U$ is the prequantization representation of $G$ on $\Gamma^\infty(L)$ and $[\sigma]$ is the class of $\sigma$ in $\Gamma^\infty(L)/K$. A class $[\sigma] \in \Gamma^\infty(L)/K$ is $G$-invariant if, for all $\xi \in \mathfrak{g}$, $[P_{J_\xi}\sigma] = 0$, which is equivalent to $P_{J_\xi}\sigma \in K$. We denote by $(\Gamma^\infty(L)/K)^G$ the space of $G$-invariant elements in $\Gamma^\infty(L)/K$. By the following theorem both the prequantization representation and the quantization representation are well-defined on $(\Gamma^\infty(L)/K)^G$.

Theorem 3.14. $(\Gamma^\infty(L)/K)^G$ is a module over the ring $C^\infty(P)^G/I^G$. Moreover, the singularly reduced prequantization map

$$(C^\infty(P)^G/I) \times (\Gamma^\infty(L)/K)^G \to (\Gamma^\infty(L)/K)^G : ([f], [\sigma]) \mapsto [P_f\sigma] \tag{46}$$

is well-defined.

Proof. We have already shown that $\Gamma^\infty(L)/K$ is a module over $C^\infty(P)^G/I^G$, and it is easy to see that the $G$-invariant parts form a submodule. Moreover, we have chosen $K$ so that the map

$$(C^\infty(P)^G/I) \times (\Gamma^\infty(L)/K) \to (\Gamma^\infty(L)/K) : ([f], [\sigma]) \mapsto [P_f\sigma] \tag{47}$$

is well-defined. It remains only to show that $P_f$ maps $(\Gamma^\infty(L)/K)^G$ into itself, namely that $[P_f\sigma]$ is $G$-invariant if $[\sigma]$ is. This requires that $P_{J_\xi}P_f\sigma$ is in $K$ if $P_{J_\xi}\sigma \in K$. For such a $\sigma$,

$$P_{J_\xi}(P_f\sigma) = P_f(P_{J_\xi}\sigma) + iP_{\{f, J_\xi\}}\sigma.$$

Since $f$ is $G$-invariant, $\{J_\xi, f\} = 0$ and the second term is zero. The first term is in $K$ since, by the preceding lemma, $P_f$ maps $K$ into itself.

Theorem 3.15. Let $R$ be the ring $(C^\infty(P)^0_F \cap C^\infty(P)^G)/I^G$. Then the space $(\Gamma^\infty_F(L)/K)^G$ is a module over $R$. Moreover, the singularly reduced quantization map

$$(C^\infty_F(P) \cap C^\infty(P)^G)/I \times (\Gamma^\infty_F(L)/K)^G \to (\Gamma^\infty_F(L)/K)^G : ([f], [\sigma]) \mapsto [P_f\sigma] \tag{48}$$

is well-defined.

Proof. We know that $(\Gamma^\infty(L)/K)^G$ is a module over $C^\infty(P)^G/I^G$. If $f$ is in $C^\infty(P)^0_F \cap C^\infty(P)^G$ and $\sigma$ is in $\Gamma^\infty_F(L)$, then $f\sigma$ is in $\Gamma^\infty_F(L)$. So $[f\sigma]$ is in $(\Gamma^\infty_F(L)/K)^G$. Thus $(\Gamma^\infty_F(L)/K)^G$ is a module over $(C^\infty(P)^0_F \cap C^\infty(P)^G)/I$. We have already shown in the preceding theorem that the reduced prequantization map (46) is well defined. Since (as shown in Sec. 3.1.2) if $f \in C^\infty_F(P)$ and $\sigma \in \Gamma^\infty_F(L)$, then $P_f\sigma \in \Gamma^\infty_F(L)$, the restricted map (48) is also well-defined.
Thus, quantization of singular reduction using the submodule $K$ is always defined, even if the system does not satisfy the singular prequantization condition.

Theorem 3.16. If a system satisfies the singular (pre)quantization condition, then (pre)quantization of singular reduction defined using the submodule $K$ is the same as that defined in the manner of Sec. 3.2.2. In addition, if the polarization is a Kähler polarization, the quantization of algebraic reduction is the same as that using the submodule $K$ as well.

Proof. This is straightforward. If the system satisfies the singular (pre)quantization condition, then $K = I\Gamma^\infty(L)$, and so the two constructions are the same. Furthermore, by Theorem 3.4, if we have a system with a Kähler polarization satisfying the singular quantization condition, then the quantizations coming from singular and algebraic reduction are the same.

If the singular quantization condition does not hold, then the quantization of singular reduction using $K$, although defined, may not be equal to the quantization of algebraic reduction, since the quantum operators $Q_k$ corresponding to functions $k$ in $I^G/\mathcal{J}$ may have a non-zero image a priori, even on sections that are $G$-invariant mod $I\Gamma^\infty(L)$, and these images will vanish when we divide by $K$.

Remark 3.17. When $F$ is a Kähler polarization, we have shown that quantization of singular reduction in terms of the module $(\Gamma^\infty_F(L)/I\Gamma^\infty(L))^G$ commutes with reduction, see Theorem 3.10. This result is based on the fact that $\Gamma^\infty_F(L) \cap I\Gamma^\infty(L) = \{0\}$ provided $J^{-1}(0)$ contains a Lagrangian submanifold. Since $I\Gamma^\infty(L)$ is properly contained in $K$, we cannot conclude that $\Gamma^\infty_F(L) \cap K = \{0\}$ if $J^{-1}(0)$ contains a Lagrangian submanifold. Hence, if we quantize in terms of the module $(\Gamma^\infty_F(L)/K)^G$, we may lose the result that quantization and singular reduction commute.

4. AGJ's Example

Here we rework an example of Arms, Gotay and Jennings [4].

4.1. Classical description

Using the idea of a momentum mapping in classical mechanics we describe their example.

4.1.1. Real notation

We start by constructing two real orthogonal representations of SU(2) on $\mathbb{R}^4$. Recall that the set of $2 \times 2$ complex matrices of the form $\begin{pmatrix} \alpha & -\beta \\ \bar\beta & \bar\alpha \end{pmatrix}$, where $|\alpha|^2 + |\beta|^2 = 1$, is the Lie group SU(2).$^b$ We identify $(\alpha, \bar\beta) \in \mathbb{C}^2$ with the quaternion $x = \alpha + j\bar\beta \in \mathbb{H}$. This identifies the complex number $i$ with the quaternion $i$. Let $H_u$ be the Lie group of quaternions of unit length, that is, $x \in H_u$ if and only if $x\bar{x} = 1$. The mapping

$$\varphi : \mathrm{SU}(2) \to H_u : \begin{pmatrix} \alpha & -\beta \\ \bar\beta & \bar\alpha \end{pmatrix} \mapsto x = \alpha + j\bar\beta$$

is an isomorphism of Lie groups, whose tangent at the identity element $\mathrm{id}_{\mathrm{SU}(2)}$ is

$$\psi : \mathfrak{su}(2) = T_{\mathrm{id}_{\mathrm{SU}(2)}}\mathrm{SU}(2) \to T_1H_u = \mathbb{R}^3 = \operatorname{span}_{\mathbb{R}}\{i, j, k\} : X = \begin{pmatrix} ix & -y - iz \\ y - iz & -ix \end{pmatrix} \mapsto \xi = ix + j(y - iz),$$

where $x, y, z \in \mathbb{R}$. The map $\psi$ is an isomorphism of Lie algebras, namely,

$$\psi([X, Y]) = \psi(XY - YX) = \xi\eta - \eta\xi = [\xi, \eta] = [\psi(X), \psi(Y)],$$

where $X, Y \in \mathfrak{su}(2)$ and $\xi = \psi(X)$, $\eta = \psi(Y) \in T_1H_u$.

$^b$ The Lie group $H_u$ is also equal to the Lie group Sp(1) or U(1, $\mathbb{H}$).

Give $\mathbb{H} = \mathbb{R}^4$ the standard Euclidean inner product $\langle x, y \rangle = \frac{1}{2}(x\bar{y} + y\bar{x})$. It can be shown that

Lemma 4.1. The mapping

$$\Phi : H_u \times H_u \to \mathrm{SO}(4, \mathbb{R}) : (a, b) \mapsto L_{a,b}, \tag{49}$$

where $L_{a,b} : \mathbb{H} \to \mathbb{H} : x \mapsto ax\bar{b}$, is a surjective homomorphism of Lie groups with kernel $\mathbb{Z}_2 = \{\pm(1, 1)\}$.

The mapping $\Phi$ (49) gives rise to two injective Lie group homomorphisms

$$\Phi^\ell : H_u \to \mathrm{SO}(4, \mathbb{R}) : a \mapsto \Phi_{a,1}, \tag{50}$$

where $\Phi_{a,1} : \mathbb{R}^4 \to \mathbb{R}^4 : x \mapsto ax$, and

$$\Phi^r : H_u \to \mathrm{SO}(4, \mathbb{R}) : b \mapsto \Phi_{1,b}, \tag{51}$$

where $\Phi_{1,b} : \mathbb{R}^4 \to \mathbb{R}^4 : x \mapsto x\bar{b}$. The tangent at the identity element of $H_u$ of $\Phi^\ell$ and $\Phi^r$ gives rise to the maps

$$T_1\Phi^\ell : T_1H_u = \mathbb{R}^3 = \operatorname{span}_{\mathbb{R}}\{i, j, k\} \to T_{\mathrm{id}}\mathrm{SO}(4, \mathbb{R}) = \mathfrak{so}(4, \mathbb{R}) : \xi \mapsto \ell_{\xi,0}$$

and

$$T_1\Phi^r : T_1H_u = \mathbb{R}^3 \to T_{\mathrm{id}}\mathrm{SO}(4, \mathbb{R}) = \mathfrak{so}(4, \mathbb{R}) : \eta \mapsto \ell_{0,\eta},$$

respectively. Here $\ell_{\xi,0} : \mathbb{R}^4 \to \mathbb{R}^4 : x \mapsto \xi x$ and $\ell_{0,\eta} : \mathbb{R}^4 \to \mathbb{R}^4 : x \mapsto x\bar\eta$. In particular, with respect to the standard basis of $\mathbb{R}^4$, we have

$$\ell_{i,0} = \begin{pmatrix} 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad
\ell_{j,0} = \begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix}, \quad
\ell_{k,0} = \begin{pmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}$$

and

$$\ell_{0,i} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad
\ell_{0,j} = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix}, \quad
\ell_{0,k} = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix}.$$
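The six matrices above can be regenerated mechanically from quaternion multiplication. The following is a minimal sketch, assuming quaternions are stored as 4-tuples in the basis $1, i, j, k$; the helper names (`qmul`, `ell_left`, `ell_right`, and so on) are ours and purely illustrative. It builds the matrices of $x \mapsto \xi x$ and $x \mapsto x\bar\eta$ and checks that they are skew-symmetric, that left and right multiplications commute, and that the left ones close under the commutator.

```python
# Quaternions as 4-tuples (x0, x1, x2, x3) meaning x0 + x1*i + x2*j + x3*k.
BASIS = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
ONE, QI, QJ, QK = BASIS


def qmul(a, b):
    """Hamilton product a*b of two quaternions."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)


def qconj(a):
    """Quaternionic conjugate."""
    return (a[0], -a[1], -a[2], -a[3])


def matrix_of(f):
    """Matrix (list of rows) of a linear map f: H -> H in the basis 1, i, j, k."""
    cols = [f(e) for e in BASIS]
    return [[cols[c][r] for c in range(4)] for r in range(4)]


def ell_left(xi):
    """Matrix of x -> xi * x."""
    return matrix_of(lambda x: qmul(xi, x))


def ell_right(eta):
    """Matrix of x -> x * conj(eta)."""
    return matrix_of(lambda x: qmul(x, qconj(eta)))


def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[r][c] - BA[r][c] for c in range(4)] for r in range(4)]


def is_skew(A):
    return all(A[r][c] == -A[c][r] for r in range(4) for c in range(4))


if __name__ == "__main__":
    for name, M in [("ell_{i,0}", ell_left(QI)), ("ell_{0,i}", ell_right(QI))]:
        print(name)
        for row in M:
            print("   ", row)
```

In this sketch the commutator of the matrices for $i$ and $j$ comes out as twice the matrix for $k$, which is the matrix form of $ij - ji = 2k$.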
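We now construct an SO(4, $\mathbb{R}$) momentum mapping coming from the linear SO(4, $\mathbb{R}$) action

$$\varphi : \mathrm{SO}(4, \mathbb{R}) \times \mathbb{R}^4 \to \mathbb{R}^4 : (A, q) \mapsto Aq.$$

This action lifts to an SO(4, $\mathbb{R}$) action on $T^*\mathbb{R}^4$ given by

$$\tilde\varphi : \mathrm{SO}(4, \mathbb{R}) \times T^*\mathbb{R}^4 \to T^*\mathbb{R}^4 : (A, (q, p)) \mapsto (Aq, Ap), \tag{52}$$

which preserves the canonical 1-form $\theta_0 = \langle p, dq \rangle = \sum_{n=1}^4 p_n\,dq_n$ on $T^*\mathbb{R}^4$ and therefore the canonical 2-form $\omega = d\theta_0 = \sum_{n=1}^4 dp_n \wedge dq_n$. Hence $\tilde\varphi$ is a Hamiltonian action on $T^*\mathbb{R}^4$. Next we compute its momentum mapping. Consider the SU(2) action

$$\varphi^\ell : \mathrm{SU}(2) \times \mathbb{R}^4 \to \mathbb{R}^4 : (A, q) \mapsto \Phi^\ell(A)q. \tag{53}$$

Its infinitesimal action on $\mathbb{R}^4$ is generated by the vector fields

$$X_1(q) = \Bigl\langle \ell_{i,0}(q), \frac{\partial}{\partial q} \Bigr\rangle, \quad X_2(q) = \Bigl\langle \ell_{j,0}(q), \frac{\partial}{\partial q} \Bigr\rangle, \quad \text{and} \quad X_3(q) = \Bigl\langle \ell_{k,0}(q), \frac{\partial}{\partial q} \Bigr\rangle.$$

The lift $\tilde\varphi^\ell$ of $\varphi^\ell$ to $T^*\mathbb{R}^4$ is a Hamiltonian action of SU(2) on $T^*\mathbb{R}^4$, whose momentum mapping $J^\ell$ has components

$$\begin{aligned}
J_1(q, p) &= (X_1 \lrcorner\, \theta_0)(q, p) = (q_1p_2 - q_2p_1) + (q_3p_4 - q_4p_3) = \tfrac{1}{2}(S_{12} + S_{34}) \\
J_2(q, p) &= (q_1p_3 - q_3p_1) - (q_2p_4 - q_4p_2) = \tfrac{1}{2}(S_{13} - S_{24}) \\
J_3(q, p) &= (q_1p_4 - q_4p_1) + (q_2p_3 - q_3p_2) = \tfrac{1}{2}(S_{14} + S_{23}),
\end{aligned} \tag{54}$$

where $\frac{1}{2}S_{ij} = q_ip_j - q_jp_i$. Similarly, we have the SU(2) action

$$\varphi^r : \mathrm{SU}(2) \times \mathbb{R}^4 \to \mathbb{R}^4 : (A, q) \mapsto \Phi^r(A)q. \tag{55}$$

Its infinitesimal action on $\mathbb{R}^4$ is generated by the vector fields

$$X_4(q) = \Bigl\langle \ell_{0,i}(q), \frac{\partial}{\partial q} \Bigr\rangle, \quad X_5(q) = \Bigl\langle \ell_{0,j}(q), \frac{\partial}{\partial q} \Bigr\rangle, \quad \text{and} \quad X_6(q) = \Bigl\langle \ell_{0,k}(q), \frac{\partial}{\partial q} \Bigr\rangle.$$

The lift $\tilde\varphi^r$ of $\varphi^r$ to $T^*\mathbb{R}^4$ is a Hamiltonian action of SU(2) on $T^*\mathbb{R}^4$, whose momentum mapping $J^r$ has components

$$\begin{aligned}
J_4(q, p) &= -(q_1p_2 - q_2p_1) + (q_3p_4 - q_4p_3) = \tfrac{1}{2}(-S_{12} + S_{34}) \\
J_5(q, p) &= -(q_1p_3 - q_3p_1) - (q_2p_4 - q_4p_2) = -\tfrac{1}{2}(S_{13} + S_{24}) \\
J_6(q, p) &= -(q_1p_4 - q_4p_1) + (q_2p_3 - q_3p_2) = \tfrac{1}{2}(-S_{14} + S_{23}).
\end{aligned} \tag{56}$$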
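Each component in (54) and (56) is the pairing $\langle p, \ell q \rangle$ of the momentum with the corresponding infinitesimal generator applied to $q$. As a sanity check, the sketch below evaluates that pairing with the six generator matrices written out as literals (in the basis $1, i, j, k$, as reconstructed from $x \mapsto \xi x$ and $x \mapsto x\bar\eta$) and compares it with the explicit expressions. The function names are illustrative assumptions.

```python
# Generator matrices in the standard basis of R^4, indexed by the label of J_k.
ELL = {
    1: [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 0, -1], [0, 0, 1, 0]],   # x -> i x
    2: [[0, 0, -1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, -1, 0, 0]],   # x -> j x
    3: [[0, 0, 0, -1], [0, 0, -1, 0], [0, 1, 0, 0], [1, 0, 0, 0]],   # x -> k x
    4: [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, -1], [0, 0, 1, 0]],   # x -> x conj(i)
    5: [[0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, -1, 0, 0]],   # x -> x conj(j)
    6: [[0, 0, 0, 1], [0, 0, -1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]],   # x -> x conj(k)
}


def pairing(p, A, q):
    """Evaluate the pairing of p with A q, i.e. sum_r p_r (A q)_r."""
    return sum(p[r] * A[r][c] * q[c] for r in range(4) for c in range(4))


def momenta_from_generators(q, p):
    """J_1..J_6 computed as pairings with the generator matrices."""
    return [pairing(p, ELL[k], q) for k in range(1, 7)]


def momenta_explicit(q, p):
    """J_1..J_6 from the explicit formulas; s(i, j) = q_i p_j - q_j p_i = S_ij / 2."""
    def s(i, j):
        return q[i - 1] * p[j - 1] - q[j - 1] * p[i - 1]
    return [s(1, 2) + s(3, 4),
            s(1, 3) - s(2, 4),
            s(1, 4) + s(2, 3),
            -s(1, 2) + s(3, 4),
            -s(1, 3) - s(2, 4),
            -s(1, 4) + s(2, 3)]


if __name__ == "__main__":
    q, p = (1, 2, 3, 4), (5, -6, 7, 8)
    print(momenta_from_generators(q, p))
    print(momenta_explicit(q, p))
```

Integer inputs keep the comparison exact, so any sign slip in a matrix entry shows up as a mismatch rather than as rounding noise.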
Because $\mathfrak{so}(4, \mathbb{R})$ is isomorphic to $\mathfrak{su}(2) \times \mathfrak{su}(2)$, it follows that the momentum mapping of the SO(4, $\mathbb{R}$) action $\tilde\varphi$ (52) is

$$J : T^*\mathbb{R}^4 \to \mathfrak{so}(4, \mathbb{R}) : (q, p) \mapsto J^\ell(q, p) + J^r(q, p) = (J_1\ell_{i,0} + J_2\ell_{j,0} + J_3\ell_{k,0}) + (J_4\ell_{0,i} + J_5\ell_{0,j} + J_6\ell_{0,k}). \tag{57}$$

Lemma 4.2. The zero level set $J^{-1}(0)$ of the SO(4, $\mathbb{R}$)-momentum mapping $J$ (57) is the set of all vectors $(q, p) \in T^*\mathbb{R}^4 = \mathbb{R}^8$ such that $q$ and $p$ are linearly dependent. $J^{-1}(0)$ is a semialgebraic variety in $\mathbb{R}^6$ (with coordinates $J_j$ for $j = 1, \ldots, 6$) defined by

$$J_1^2 + J_2^2 + J_3^2 = J_4^2 + J_5^2 + J_6^2. \tag{58}$$

It is a smooth 5-dimensional manifold except when $J_j = 0$ for $j = 1, \ldots, 6$.

Proof. Consider the isomorphism

$$\iota : \Lambda^2\mathbb{R}^4 \to \mathfrak{so}(4, \mathbb{R}) : q \wedge p \mapsto \iota_{q,p}, \tag{59}$$

where $\iota_{q,p} : \mathbb{R}^4 \to \mathbb{R}^4 : x \mapsto \langle x, q \rangle p - \langle x, p \rangle q$. Composing the SO(4, $\mathbb{R}$) momentum mapping $J$ (57) with the inverse of the mapping $\iota$ (59) gives the map

$$\tilde{J} : T^*\mathbb{R}^4 \to \Lambda^2\mathbb{R}^4 : (q, p) \mapsto q \wedge p = \tfrac{1}{2}S_{12}\,e_1 \wedge e_2 + \tfrac{1}{2}S_{13}\,e_1 \wedge e_3 + \tfrac{1}{2}S_{14}\,e_1 \wedge e_4 + \tfrac{1}{2}S_{23}\,e_2 \wedge e_3 + \tfrac{1}{2}S_{24}\,e_2 \wedge e_4 + \tfrac{1}{2}S_{34}\,e_3 \wedge e_4 \tag{60}$$

using (54) and (56). We are now in a position to prove the lemma. By hypothesis $(q, p) \in J^{-1}(0)$. Consequently, $(q, p) \in \tilde{J}^{-1}(0)$, that is, $q \wedge p = 0$. Therefore $q$ and $p$ are linearly dependent. Because $q \wedge p$ is a decomposable 2-vector, its components satisfy Plücker's equation, namely

$$0 = \bigl(\tfrac{1}{2}S_{12}\bigr)\bigl(\tfrac{1}{2}S_{34}\bigr) - \bigl(\tfrac{1}{2}S_{13}\bigr)\bigl(\tfrac{1}{2}S_{24}\bigr) + \bigl(\tfrac{1}{2}S_{14}\bigr)\bigl(\tfrac{1}{2}S_{23}\bigr),$$

where $\frac{1}{2}S_{ij} = q_ip_j - q_jp_i$. In terms of the components of the momentum mapping $J$, Plücker's equation reads

$$0 = (J_1 - J_4)(J_1 + J_4) + (J_2 - J_5)(J_2 + J_5) + (J_3 - J_6)(J_3 + J_6) = J_1^2 + J_2^2 + J_3^2 - J_4^2 - J_5^2 - J_6^2 = F(J).$$

Thus (58) holds. Every value of the function $F$ on $\mathbb{R}^6$ except 0 is a regular value. $J^{-1}(0)$ is a smooth 5-dimensional manifold except at the origin of $\mathbb{R}^6$.

Let $\{\,,\}$ be the standard Poisson bracket on $C^\infty(T^*\mathbb{R}^4)$ associated to the canonical symplectic form $\omega$. Its structure matrix is given by

$$\{q_i, p_j\} = \delta_{ij}, \qquad \{q_i, q_j\} = 0 = \{p_i, p_j\}$$

for $i, j = 1, \ldots, 4$. The proof of the next lemma and its corollaries are straightforward.
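The passage from Plücker's equation to relation (58) in the proof of Lemma 4.2 rests on $J_1 - J_4 = S_{12}$, $J_1 + J_4 = S_{34}$, and the analogous identities for the other pairs. A minimal sketch of an exact check with integer inputs follows; the function names are ours, and `half_S` stands for $\frac{1}{2}S_{ij} = q_ip_j - q_jp_i$.

```python
def half_S(q, p, i, j):
    """q_i p_j - q_j p_i, i.e. one half of S_ij (indices are 1-based)."""
    return q[i - 1] * p[j - 1] - q[j - 1] * p[i - 1]


def momenta(q, p):
    """J_1..J_6 written in terms of the half_S components."""
    s = lambda i, j: half_S(q, p, i, j)
    return [s(1, 2) + s(3, 4), s(1, 3) - s(2, 4), s(1, 4) + s(2, 3),
            -s(1, 2) + s(3, 4), -s(1, 3) - s(2, 4), -s(1, 4) + s(2, 3)]


def F(J):
    """The quadratic function J_1^2 + J_2^2 + J_3^2 - J_4^2 - J_5^2 - J_6^2."""
    return J[0]**2 + J[1]**2 + J[2]**2 - J[3]**2 - J[4]**2 - J[5]**2


def pluecker(q, p):
    """Pluecker expression in the components of q wedge p."""
    s = lambda i, j: half_S(q, p, i, j)
    return s(1, 2) * s(3, 4) - s(1, 3) * s(2, 4) + s(1, 4) * s(2, 3)


if __name__ == "__main__":
    q, p = (1, 2, 3, 4), (5, -6, 7, 8)
    J = momenta(q, p)
    # F(J) equals 4 times the Pluecker expression, and both vanish identically.
    print(J, F(J), pluecker(q, p))
```

With this normalization $F(J)$ is four times the Plücker expression in the components $q_ip_j - q_jp_i$, so the two vanish together.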
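Lemma 4.3. For $i, j = 1, 2, 3$ the momenta $J_i$, $J_{3+j}$ form a Poisson algebra $B$ on $C^\infty(\mathbb{R}^6)$ under Poisson bracket $\{\,,\}$, which is isomorphic to $\mathfrak{su}(2) \times \mathfrak{su}(2)$. In particular

$$\{J_i, J_j\} = -2\sum_{k=1}^3 \epsilon_{ijk}J_k, \qquad \{J_{3+i}, J_{3+j}\} = -2\sum_{k=1}^3 \epsilon_{ijk}J_{3+k}, \qquad \{J_i, J_{3+j}\} = 0. \tag{61}$$

Corollary 4.4. The function $F(J) = J_1^2 + J_2^2 + J_3^2 - J_4^2 - J_5^2 - J_6^2$ is a Casimir for the Poisson algebra $B$.

Corollary 4.5. The functions

$$J_7 = \tfrac{1}{2}(\langle p, p \rangle + \langle q, q \rangle), \qquad J_8 = \tfrac{1}{2}(\langle p, p \rangle - \langle q, q \rangle), \qquad \text{and} \qquad J_9 = \langle q, p \rangle \tag{62}$$

are Casimirs for the Poisson algebra $B$.

4.1.2. Complex notation

In order to deal with the quantization of the left SU(2)-action and the decomposition of the associated representation, it is convenient to use a complex notation since we will work with the antiholomorphic polarization. On $\mathbb{C}^4$ introduce coordinates $z_n = \frac{1}{\sqrt{2}}(p_n + iq_n)$ for $n = 1, \ldots, 4$. Then $\bar{z}_n = \frac{1}{\sqrt{2}}(p_n - iq_n)$. Therefore $\frac{1}{2}iS_{k\ell} = z_k\bar{z}_\ell - z_\ell\bar{z}_k$. Using the variables $z_n$ and $\bar{z}_n$, the momentum functions and invariant functions considered above become

$$\begin{aligned}
J_1 &= -i[(z_1\bar{z}_2 - z_2\bar{z}_1) + (z_3\bar{z}_4 - z_4\bar{z}_3)] = \tfrac{1}{2}(S_{12} + S_{34}), \\
J_2 &= -i[(z_1\bar{z}_3 - z_3\bar{z}_1) - (z_2\bar{z}_4 - z_4\bar{z}_2)] = \tfrac{1}{2}(S_{13} - S_{24}), \\
J_3 &= -i[(z_1\bar{z}_4 - z_4\bar{z}_1) + (z_2\bar{z}_3 - z_3\bar{z}_2)] = \tfrac{1}{2}(S_{14} + S_{23}).
\end{aligned} \tag{63}$$

$$\begin{aligned}
J_4 &= -i[-(z_1\bar{z}_2 - z_2\bar{z}_1) + (z_3\bar{z}_4 - z_4\bar{z}_3)] = \tfrac{1}{2}(-S_{12} + S_{34}), \\
J_5 &= i[(z_1\bar{z}_3 - z_3\bar{z}_1) + (z_2\bar{z}_4 - z_4\bar{z}_2)] = -\tfrac{1}{2}(S_{13} + S_{24}), \\
J_6 &= -i[-(z_1\bar{z}_4 - z_4\bar{z}_1) + (z_2\bar{z}_3 - z_3\bar{z}_2)] = \tfrac{1}{2}(-S_{14} + S_{23}),
\end{aligned} \tag{64}$$

$$\begin{aligned}
J_7 &= z_1\bar{z}_1 + z_2\bar{z}_2 + z_3\bar{z}_3 + z_4\bar{z}_4, \\
J_8 &= \tfrac{1}{2}(z_1^2 + z_2^2 + z_3^2 + z_4^2 + \bar{z}_1^2 + \bar{z}_2^2 + \bar{z}_3^2 + \bar{z}_4^2), \\
J_9 &= \tfrac{1}{2i}(z_1^2 + z_2^2 + z_3^2 + z_4^2 - \bar{z}_1^2 - \bar{z}_2^2 - \bar{z}_3^2 - \bar{z}_4^2).
\end{aligned} \tag{65}$$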
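The complex expressions for $J_1, \ldots, J_9$ can be compared numerically with their real counterparts in (54), (56) and (62). The sketch below assumes only the change of variables $z_n = (p_n + iq_n)/\sqrt{2}$; the function names and the sample point are illustrative, and agreement is tested up to floating-point tolerance rather than exactly.

```python
import math


def to_complex(q, p):
    """z_n = (p_n + i q_n) / sqrt(2)."""
    return [(pn + 1j * qn) / math.sqrt(2) for qn, pn in zip(q, p)]


def real_invariants(q, p):
    """J_1..J_9 from the real formulas."""
    s = lambda i, j: q[i - 1] * p[j - 1] - q[j - 1] * p[i - 1]   # = S_ij / 2
    pp = sum(x * x for x in p)
    qq = sum(x * x for x in q)
    qp = sum(a * b for a, b in zip(q, p))
    return [s(1, 2) + s(3, 4), s(1, 3) - s(2, 4), s(1, 4) + s(2, 3),
            -s(1, 2) + s(3, 4), -s(1, 3) - s(2, 4), -s(1, 4) + s(2, 3),
            (pp + qq) / 2, (pp - qq) / 2, qp]


def complex_invariants(z):
    """J_1..J_9 from the formulas in the variables z_n and their conjugates."""
    zb = [w.conjugate() for w in z]
    d = lambda k, l: z[k - 1] * zb[l - 1] - z[l - 1] * zb[k - 1]  # = i S_kl / 2
    return [-1j * (d(1, 2) + d(3, 4)),
            -1j * (d(1, 3) - d(2, 4)),
            -1j * (d(1, 4) + d(2, 3)),
            -1j * (-d(1, 2) + d(3, 4)),
            1j * (d(1, 3) + d(2, 4)),
            -1j * (-d(1, 4) + d(2, 3)),
            sum(w * wb for w, wb in zip(z, zb)),
            0.5 * sum(w * w + wb * wb for w, wb in zip(z, zb)),
            sum(w * w - wb * wb for w, wb in zip(z, zb)) / 2j]


def max_discrepancy(q, p):
    """Largest absolute difference between the real and complex expressions."""
    real = real_invariants(q, p)
    cplx = complex_invariants(to_complex(q, p))
    return max(abs(a - b) for a, b in zip(real, cplx))


if __name__ == "__main__":
    print(max_discrepancy((1.0, 2.0, 3.0, 4.0), (5.0, -6.0, 7.0, 8.0)))
```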
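The Hamiltonian vector field $X_f$ of a real valued function $f$ is defined by $X_f \lrcorner\, \omega = -df$, as in Sec. 2.1. Writing $\partial_k$ and $\bar\partial_k$ for $\partial/\partial z_k$ and $\partial/\partial\bar{z}_k$, respectively, it follows that Hamilton's equations in $z_k$ and $\bar{z}_k$ variables for the Hamiltonian $f$ are

$$\dot{z}_n = i\,\frac{\partial f(z, \bar{z})}{\partial\bar{z}_n}, \qquad \dot{\bar{z}}_n = -i\,\frac{\partial f(z, \bar{z})}{\partial z_n} \qquad \text{for } n = 1, \ldots, 4.$$

Therefore the Hamiltonian vector fields associated to the functions $J_i$ for $i = 1, \ldots, 7$ are

$$\begin{aligned}
X_{J_1} &= -(z_2\partial_1 + \bar{z}_2\bar\partial_1) + (z_1\partial_2 + \bar{z}_1\bar\partial_2) - (z_4\partial_3 + \bar{z}_4\bar\partial_3) + (z_3\partial_4 + \bar{z}_3\bar\partial_4) \\
X_{J_2} &= -(z_3\partial_1 + \bar{z}_3\bar\partial_1) + (z_4\partial_2 + \bar{z}_4\bar\partial_2) + (z_1\partial_3 + \bar{z}_1\bar\partial_3) - (z_2\partial_4 + \bar{z}_2\bar\partial_4) \\
X_{J_3} &= -(z_4\partial_1 + \bar{z}_4\bar\partial_1) - (z_3\partial_2 + \bar{z}_3\bar\partial_2) + (z_2\partial_3 + \bar{z}_2\bar\partial_3) + (z_1\partial_4 + \bar{z}_1\bar\partial_4) \\
X_{J_4} &= (z_2\partial_1 + \bar{z}_2\bar\partial_1) - (z_1\partial_2 + \bar{z}_1\bar\partial_2) - (z_4\partial_3 + \bar{z}_4\bar\partial_3) + (z_3\partial_4 + \bar{z}_3\bar\partial_4) \\
X_{J_5} &= (z_3\partial_1 + \bar{z}_3\bar\partial_1) + (z_4\partial_2 + \bar{z}_4\bar\partial_2) - (z_1\partial_3 + \bar{z}_1\bar\partial_3) - (z_2\partial_4 + \bar{z}_2\bar\partial_4) \\
X_{J_6} &= (z_4\partial_1 + \bar{z}_4\bar\partial_1) - (z_3\partial_2 + \bar{z}_3\bar\partial_2) + (z_2\partial_3 + \bar{z}_2\bar\partial_3) - (z_1\partial_4 + \bar{z}_1\bar\partial_4) \\
X_{J_7} &= i(z_1\partial_1 + z_2\partial_2 + z_3\partial_3 + z_4\partial_4 - \bar{z}_1\bar\partial_1 - \bar{z}_2\bar\partial_2 - \bar{z}_3\bar\partial_3 - \bar{z}_4\bar\partial_4).
\end{aligned}$$

As we will not need $X_{J_8}$ or $X_{J_9}$ later on, we do not calculate them.

4.1.3. Reduction

Consider the left action of SU(2) $\subseteq$ SO(4, $\mathbb{R}$) on $T^*\mathbb{R}^4$. Its momentum mapping is

$$J^\ell : T^*\mathbb{R}^4 \to \mathfrak{su}(2) : (q, p) \mapsto J_1(q, p)\ell_{i,0} + J_2(q, p)\ell_{j,0} + J_3(q, p)\ell_{k,0}. \tag{66}$$

Since the zero level set $(J^\ell)^{-1}(0)$ is not a submanifold of $T^*\mathbb{R}^4$, having a conical singularity at $(0, 0)$, we cannot use regular reduction to remove the SU(2) symmetry on $(J^\ell)^{-1}(0)$. In this subsection we discuss singular reduction. By definition the singular reduced space $W$ is the space $(J^\ell)^{-1}(0)/\mathrm{SU}(2)$ of orbits of the left SU(2)-action on $(J^\ell)^{-1}(0) \subseteq T^*\mathbb{R}^4$. Because the SU(2) action is proper we may use invariant theory to construct $W$. Observe that the algebra of polynomials on $T^*\mathbb{R}^4$, which are invariant under the left SU(2) action, is generated by $J_4$, $J_5$, and $J_6$, see (56), and $J_7$, $J_8$, and $J_9$, see (62). The relations

$$J_1^2 + J_2^2 + J_3^2 = J_4^2 + J_5^2 + J_6^2 \tag{67}$$

and

$$\tfrac{1}{2}(J_1^2 + J_2^2 + J_3^2 + J_4^2 + J_5^2 + J_6^2) + J_9^2 = J_7^2 - J_8^2, \qquad J_7 \geq 0, \tag{68}$$

among the invariants, see (54) and (56) and (62), together with

$$J_1 = 0, \qquad J_2 = 0, \qquad \text{and} \qquad J_3 = 0, \tag{69}$$

which specify the 0-level set of $J^\ell$, define the singular reduced space $W$. In other words, $W$ is the semialgebraic variety of $\mathbb{R}^6$ with coordinates $(J_4, J_5, \ldots, J_9)$ defined by

$$J_4 = J_5 = J_6 = 0 \qquad \text{and} \qquad J_7^2 = J_8^2 + J_9^2, \qquad J_7 \geq 0. \tag{70}$$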
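Relation (68) is an instance of Lagrange's identity $|q|^2|p|^2 - \langle q, p \rangle^2 = \sum_{i<j}(q_ip_j - q_jp_i)^2$, and on the level set (69) it collapses to the cone in (70). A minimal sketch of an exact check follows, using rational arithmetic so that the factor $\frac{1}{2}$ causes no rounding; the function names and sample points are our own illustrative choices.

```python
from fractions import Fraction


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def invariants(q, p):
    """Return [J_1, ..., J_9] as exact rationals for integer vectors q, p."""
    s = lambda i, j: q[i - 1] * p[j - 1] - q[j - 1] * p[i - 1]   # = S_ij / 2
    J = [s(1, 2) + s(3, 4), s(1, 3) - s(2, 4), s(1, 4) + s(2, 3),
         -s(1, 2) + s(3, 4), -s(1, 3) - s(2, 4), -s(1, 4) + s(2, 3)]
    J7 = Fraction(dot(p, p) + dot(q, q), 2)
    J8 = Fraction(dot(p, p) - dot(q, q), 2)
    J9 = dot(q, p)
    return [Fraction(x) for x in J] + [J7, J8, Fraction(J9)]


def relation_68_defect(q, p):
    """Left side minus right side of the quadratic relation among J_1..J_9."""
    J = invariants(q, p)
    lhs = Fraction(1, 2) * sum(x * x for x in J[:6]) + J[8] ** 2
    rhs = J[6] ** 2 - J[7] ** 2
    return lhs - rhs


def cone_defect(q, p):
    """J_7^2 - J_8^2 - J_9^2, which should vanish when q and p are dependent."""
    J = invariants(q, p)
    return J[6] ** 2 - J[7] ** 2 - J[8] ** 2


if __name__ == "__main__":
    print(relation_68_defect((1, 2, 3, 4), (5, -6, 7, 8)))   # generic point
    print(cone_defect((1, 2, 3, 4), (3, 6, 9, 12)))          # p = 3 q
```

At the sample point $p = 3q$ with $q = (1, 2, 3, 4)$ the invariants are $J_7 = 150$, $J_8 = 120$, $J_9 = 90$, which indeed lie on the cone.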
To describe $W$ as a differential space, we need to construct its space of smooth functions. Consider the orbit space $V = T^*\mathbb{R}^4/\mathrm{SU}(2)$ of the left SU(2) action $\tilde\varphi^\ell$ on $T^*\mathbb{R}^4$. Let $\rho : T^*\mathbb{R}^4 \to V$ be its orbit map. The space $C^\infty(V)$ of smooth functions on $V$ is by definition the set of functions which are continuous in the quotient topology on $V$ and which pull back under $\rho$ to smooth SU(2)-invariant functions on $T^*\mathbb{R}^4$. Since SU(2) is a compact Lie group, which acts linearly on $T^*\mathbb{R}^4$, by a theorem of Schwarz [26] the algebra $C^\infty(T^*\mathbb{R}^4)^{\mathrm{SU}(2)}$ of smooth SU(2)-invariant functions on $T^*\mathbb{R}^4$ is $\{h(J_4, J_5, \ldots, J_9) \mid h \in C^\infty(\mathbb{R}^6)\}$. In other words, every smooth SU(2)-invariant function is a smooth function of SU(2)-invariant polynomials. The space $C^\infty(V)$ of smooth functions defines a differential structure on $V$ and the pair $(V, C^\infty(V))$ is a differential space. Because $(J^\ell)^{-1}(0)$ is a closed subset of $T^*\mathbb{R}^4$, which is saturated, that is, $\rho^{-1}(\rho((J^\ell)^{-1}(0))) = (J^\ell)^{-1}(0)$, we deduce that $W = \rho((J^\ell)^{-1}(0))$. Consequently, $W$ is a closed subset of $V$.$^c$ Hence the space $C^\infty(W)$ of smooth functions on $W$ is given by restricting smooth functions on $V$ to $W$. So $(W, C^\infty(W))$ is a differential subspace of $(V, C^\infty(V))$. Also, if $I^{\mathrm{SU}(2)}$ is the ideal of smooth SU(2)-invariant functions on $T^*\mathbb{R}^4$, whose restriction to $(J^\ell)^{-1}(0)$ vanishes, then $C^\infty(W) = C^\infty(T^*\mathbb{R}^4)/I^{\mathrm{SU}(2)}$. The action $\tilde\varphi^\ell$ of SU(2) on the complement of $(J^\ell)^{-1}(0)$ in $T^*\mathbb{R}^4$ is free. Hence, if $O$ is a non-zero co-adjoint orbit, then regular and singular reduction coincide.

$^c$ This is clear if one observes that $V$ is defined by (67) and (68).

4.2. Geometric quantization

In order to deal with the quantization of the SU(2) action $\tilde\varphi^\ell$ and the decomposition of the associated representation, it is convenient to use complex notation.

4.2.1. Complex notation

Identify $T^*\mathbb{R}^4$ with $\mathbb{C}^4$ using $z_n = \frac{1}{\sqrt{2}}(p_n + iq_n)$ for $n = 1, \ldots, 4$. In these coordinates the canonical 1-form $\theta_0$ is $-\frac{i}{2}\sum_{n=1}^4 (z_n + \bar{z}_n)\,d(z_n - \bar{z}_n)$ and the canonical symplectic form is $\omega = i\sum_{n=1}^4 dz_n \wedge d\bar{z}_n = d\theta_0$. If we identify $\alpha + j\bar\beta \in H_u$ with $A = \begin{pmatrix} \alpha & -\beta \\ \bar\beta & \bar\alpha \end{pmatrix} \in \mathrm{SU}(2)$ and $z_1 + jz_2 \in \mathbb{H}$ with $z = (z_1, z_2) \in \mathbb{C}^2$, then the action

$$H_u \times \mathbb{H} \to \mathbb{H} : (\alpha + j\bar\beta, z_1 + jz_2) \mapsto (\alpha + j\bar\beta)(z_1 + jz_2) = (\alpha z_1 - \beta z_2) + j(\bar\beta z_1 + \bar\alpha z_2)$$

becomes the natural action of SU(2) on $\mathbb{C}^2$, namely $\check\varphi : \mathrm{SU}(2) \times \mathbb{C}^2 \to \mathbb{C}^2 : (A, z) \mapsto Az$. Therefore the SU(2) action $\tilde\varphi^\ell$ on $T^*\mathbb{R}^4$ in complex notation is

$$\check\Phi : \mathrm{SU}(2) \times \mathbb{C}^4 \to \mathbb{C}^4 : \left(A, \begin{pmatrix} z \\ w \end{pmatrix}\right) \mapsto \begin{pmatrix} Az \\ Aw \end{pmatrix}. \tag{71}$$
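In complex coordinates the momentum functions $J_j$ for $j = 1, \ldots, 9$, see (54), (56), and (62), become

$$\begin{aligned}
J_1 &= \tfrac{1}{2}(S_{12} + S_{34}), & J_2 &= \tfrac{1}{2}(S_{13} - S_{24}), & J_3 &= \tfrac{1}{2}(S_{14} + S_{23}), \\
J_4 &= \tfrac{1}{2}(-S_{12} + S_{34}), & J_5 &= -\tfrac{1}{2}(S_{13} + S_{24}), & J_6 &= \tfrac{1}{2}(-S_{14} + S_{23}), \\
J_7 &= z_1\bar{z}_1 + z_2\bar{z}_2 + z_3\bar{z}_3 + z_4\bar{z}_4, \\
J_8 &= \tfrac{1}{2}(z_1^2 + z_2^2 + z_3^2 + z_4^2 + \bar{z}_1^2 + \bar{z}_2^2 + \bar{z}_3^2 + \bar{z}_4^2), \\
J_9 &= \tfrac{1}{2i}(z_1^2 + z_2^2 + z_3^2 + z_4^2 - \bar{z}_1^2 - \bar{z}_2^2 - \bar{z}_3^2 - \bar{z}_4^2),
\end{aligned} \tag{72}$$

where $\frac{1}{2}iS_{k,\ell} = z_k\bar{z}_\ell - z_\ell\bar{z}_k$.

Claim 4.6. In complex coordinates the zero level set of the SO(4, $\mathbb{R}$) momentum map $J$ (57) consists of all vectors $(z, \bar{z}) \in \mathbb{C}^4 \times \mathbb{C}^4$ where the non-zero components of $z$ and $\bar{z}$ each have the same argument. In other words, the set $J^{-1}(0) = \{(r_1e^{i\theta}, r_2e^{i\theta}, r_3e^{i\theta}, r_4e^{i\theta})\}$ such that $r_j \in \mathbb{R}_{\geq 0}$, $r_j \neq 0$ for some $j \in \{1, 2, 3, 4\}$, and $\theta \in \mathbb{R}$.

Proof. $(z, \bar{z}) \in J^{-1}(0)$ if and only if $\frac{1}{2}iS_{k,\ell} = 0$ for all $(k, \ell) \in I = \{(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)\}$. In other words, $z_k\bar{z}_\ell = \overline{z_k\bar{z}_\ell}$, that is, $z_k\bar{z}_\ell$ is real for all $(k, \ell) \in I$. Therefore all the non-zero components of the vectors $z$ and $\bar{z}$ each have the same argument.

The Hamiltonian vector field $X_f$ of a real valued function $f$ is defined by $X_f \lrcorner\, \omega = -df$. Hamilton's equations in $z_k$ and $\bar{z}_k$ variables for the Hamiltonian $f$ are

$$\dot{z}_n = i\,\frac{\partial f(z, \bar{z})}{\partial\bar{z}_n} \qquad \text{and} \qquad \dot{\bar{z}}_n = -i\,\frac{\partial f(z, \bar{z})}{\partial z_n} \qquad \text{for } n = 1, \ldots, 4.$$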
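Claim 4.6 can be illustrated numerically: a point whose components share a common phase makes every $z_k\bar{z}_\ell - z_\ell\bar{z}_k$ vanish, whereas a point with components of different phase does not. The sketch below uses floating-point arithmetic with a tolerance; the function names, the tolerance and the sample points are illustrative assumptions only.

```python
import cmath

PAIRS = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]


def half_i_S(z, k, l):
    """z_k conj(z_l) - z_l conj(z_k), which equals (i/2) S_{k,l}."""
    return z[k - 1] * z[l - 1].conjugate() - z[l - 1] * z[k - 1].conjugate()


def on_zero_level(z, tol=1e-12):
    """True when all six components (i/2) S_{k,l} vanish up to tol."""
    return all(not (abs(half_i_S(z, k, l)) > tol) for k, l in PAIRS)


def common_phase_point(r, theta):
    """The point (r_1 e^{i theta}, ..., r_4 e^{i theta})."""
    phase = cmath.exp(1j * theta)
    return [rj * phase for rj in r]


if __name__ == "__main__":
    z_good = common_phase_point([1.0, 2.5, 0.0, 3.0], 0.7)
    z_bad = [1 + 0j, 1j, 0j, 0j]       # components 1 and i differ in phase
    print(on_zero_level(z_good), on_zero_level(z_bad))
```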
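Writing $\partial_k$ and $\bar\partial_k$ for $\partial/\partial z_k$ and $\partial/\partial\bar{z}_k$, respectively, we can write the Hamiltonian vector fields associated to the functions $J_i$ for $i = 1, \ldots, 7$ in the form

$$\begin{aligned}
X_{J_1} &= -(z_2\partial_1 + \bar{z}_2\bar\partial_1) + (z_1\partial_2 + \bar{z}_1\bar\partial_2) - (z_4\partial_3 + \bar{z}_4\bar\partial_3) + (z_3\partial_4 + \bar{z}_3\bar\partial_4) \\
X_{J_2} &= -(z_3\partial_1 + \bar{z}_3\bar\partial_1) + (z_4\partial_2 + \bar{z}_4\bar\partial_2) + (z_1\partial_3 + \bar{z}_1\bar\partial_3) - (z_2\partial_4 + \bar{z}_2\bar\partial_4) \\
X_{J_3} &= -(z_4\partial_1 + \bar{z}_4\bar\partial_1) - (z_3\partial_2 + \bar{z}_3\bar\partial_2) + (z_2\partial_3 + \bar{z}_2\bar\partial_3) + (z_1\partial_4 + \bar{z}_1\bar\partial_4) \\
X_{J_4} &= (z_2\partial_1 + \bar{z}_2\bar\partial_1) - (z_1\partial_2 + \bar{z}_1\bar\partial_2) - (z_4\partial_3 + \bar{z}_4\bar\partial_3) + (z_3\partial_4 + \bar{z}_3\bar\partial_4) \\
X_{J_5} &= (z_3\partial_1 + \bar{z}_3\bar\partial_1) + (z_4\partial_2 + \bar{z}_4\bar\partial_2) - (z_1\partial_3 + \bar{z}_1\bar\partial_3) - (z_2\partial_4 + \bar{z}_2\bar\partial_4) \\
X_{J_6} &= (z_4\partial_1 + \bar{z}_4\bar\partial_1) - (z_3\partial_2 + \bar{z}_3\bar\partial_2) + (z_2\partial_3 + \bar{z}_2\bar\partial_3) - (z_1\partial_4 + \bar{z}_1\bar\partial_4) \\
X_{J_7} &= i(z_1\partial_1 + z_2\partial_2 + z_3\partial_3 + z_4\partial_4 - \bar{z}_1\bar\partial_1 - \bar{z}_2\bar\partial_2 - \bar{z}_3\bar\partial_3 - \bar{z}_4\bar\partial_4).
\end{aligned} \tag{73}$$

As we will not need $X_{J_8}$ or $X_{J_9}$ later on, we do not calculate them.

Claim 4.7. At each point of $J^{-1}(0)$ each of the vector fields $X_{J_4}$, $X_{J_5}$, and $X_{J_6}$ is a real linear combination of the vector fields $X_{J_1}$, $X_{J_2}$ and $X_{J_3}$.

Proof. Assume first that $J^{-1}(0)$ is smooth at $(z_1, z_2, z_3, z_4)$, that is, not all $z_j$ are zero. Using the description of $J^{-1}(0)$ given in Claim 4.6, each of the vector fields $X_{J_k}$, $k = 1, \ldots, 6$, when restricted to $J^{-1}(0) \setminus \{0\}$ is

$$\begin{aligned}
X_{J_1} &= -(r_2e^{i\theta}\partial_1 + r_2e^{-i\theta}\bar\partial_1) + (r_1e^{i\theta}\partial_2 + r_1e^{-i\theta}\bar\partial_2) - (r_4e^{i\theta}\partial_3 + r_4e^{-i\theta}\bar\partial_3) + (r_3e^{i\theta}\partial_4 + r_3e^{-i\theta}\bar\partial_4) \\
X_{J_2} &= -(r_3e^{i\theta}\partial_1 + r_3e^{-i\theta}\bar\partial_1) + (r_4e^{i\theta}\partial_2 + r_4e^{-i\theta}\bar\partial_2) + (r_1e^{i\theta}\partial_3 + r_1e^{-i\theta}\bar\partial_3) - (r_2e^{i\theta}\partial_4 + r_2e^{-i\theta}\bar\partial_4) \\
X_{J_3} &= -(r_4e^{i\theta}\partial_1 + r_4e^{-i\theta}\bar\partial_1) - (r_3e^{i\theta}\partial_2 + r_3e^{-i\theta}\bar\partial_2) + (r_2e^{i\theta}\partial_3 + r_2e^{-i\theta}\bar\partial_3) + (r_1e^{i\theta}\partial_4 + r_1e^{-i\theta}\bar\partial_4) \\
X_{J_4} &= (r_2e^{i\theta}\partial_1 + r_2e^{-i\theta}\bar\partial_1) - (r_1e^{i\theta}\partial_2 + r_1e^{-i\theta}\bar\partial_2) - (r_4e^{i\theta}\partial_3 + r_4e^{-i\theta}\bar\partial_3) + (r_3e^{i\theta}\partial_4 + r_3e^{-i\theta}\bar\partial_4) \\
X_{J_5} &= (r_3e^{i\theta}\partial_1 + r_3e^{-i\theta}\bar\partial_1) + (r_4e^{i\theta}\partial_2 + r_4e^{-i\theta}\bar\partial_2) - (r_1e^{i\theta}\partial_3 + r_1e^{-i\theta}\bar\partial_3) - (r_2e^{i\theta}\partial_4 + r_2e^{-i\theta}\bar\partial_4) \\
X_{J_6} &= (r_4e^{i\theta}\partial_1 + r_4e^{-i\theta}\bar\partial_1) - (r_3e^{i\theta}\partial_2 + r_3e^{-i\theta}\bar\partial_2) + (r_2e^{i\theta}\partial_3 + r_2e^{-i\theta}\bar\partial_3) - (r_1e^{i\theta}\partial_4 + r_1e^{-i\theta}\bar\partial_4).
\end{aligned} \tag{74}$$

Consider first $X_{J_4}$. We seek functions $c_1$, $c_2$, and $c_3$ such that, at each point of $J^{-1}(0) \setminus \{0\}$, $X_{J_4} = c_1X_{J_1} + c_2X_{J_2} + c_3X_{J_3}$. Comparing the components of the vector fields given in (74), we see that the $c_j$ have to satisfy

$$\begin{aligned}
-r_1 &= c_1r_1 + c_2r_4 - c_3r_3, & r_2 &= -c_1r_2 - c_2r_3 - c_3r_4, \\
-r_4 &= -c_1r_4 + c_2r_1 + c_3r_2, & r_3 &= c_1r_3 - c_2r_2 + c_3r_1.
\end{aligned} \tag{75}$$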
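System (75) has four equations for three unknowns, so its consistency is not automatic. One way to see what happens, sketched below under our own naming, is to read off the coefficient vectors of $e^{i\theta}\partial_1, \ldots, e^{i\theta}\partial_4$ in (74): those of $X_{J_1}, X_{J_2}, X_{J_3}$ are mutually orthogonal with squared length $r_1^2 + r_2^2 + r_3^2 + r_4^2$, so the $c_j$ can be obtained by orthogonal projection, and the residual confirms that the system is solved exactly. Rational arithmetic keeps the check exact.

```python
from fractions import Fraction


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def left_fields(r):
    """Coefficient vectors of X_{J_1}, X_{J_2}, X_{J_3} on the zero level set."""
    r1, r2, r3, r4 = r
    return [(-r2, r1, -r4, r3), (-r3, r4, r1, -r2), (-r4, -r3, r2, r1)]


def right_fields(r):
    """Coefficient vectors of X_{J_4}, X_{J_5}, X_{J_6} on the zero level set."""
    r1, r2, r3, r4 = r
    return [(r2, -r1, -r4, r3), (r3, r4, -r1, -r2), (r4, -r3, r2, -r1)]


def coefficients(r, w):
    """Project w onto the orthogonal vectors for X_{J_1}, X_{J_2}, X_{J_3}."""
    norm2 = dot(r, r)                      # common squared length of the three vectors
    return [Fraction(dot(w, v), norm2) for v in left_fields(r)]


def residual(r, w):
    """w minus its expansion c_1 v_1 + c_2 v_2 + c_3 v_3 (zero if the system is solved)."""
    c = coefficients(r, w)
    vs = left_fields(r)
    return [w[n] - sum(c[a] * vs[a][n] for a in range(3)) for n in range(4)]


if __name__ == "__main__":
    r = (1, 2, 3, 4)
    for name, w in zip(("X_J4", "X_J5", "X_J6"), right_fields(r)):
        print(name, coefficients(r, w), residual(r, w))
```

For $r = (1, 2, 3, 4)$ the projection gives $(c_1, c_2, c_3) = (2/3, -2/3, -1/3)$ for $X_{J_4}$, and substituting these values into the first equation of (75) returns $-r_1 = -1$.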
This system has the solution

$$c_1 = -\frac{r_1^2 + r_2^2 - r_3^2 - r_4^2}{r_1^2 + r_2^2 + r_3^2 + r_4^2}, \qquad c_2 = -\frac{2(r_2r_3 + r_1r_4)}{r_1^2 + r_2^2 + r_3^2 + r_4^2}, \qquad c_3 = -\frac{2(r_2r_4 - r_1r_3)}{r_1^2 + r_2^2 + r_3^2 + r_4^2} \tag{76}$$
as can readily be checked. Turning to $X_{J_5}$ and $X_{J_6}$ and following the same procedure, we get the corresponding systems

$$\begin{aligned}
-r_1 &= -c_1r_4 + c_2r_1 + c_3r_2, & r_3 &= -c_1r_2 - c_2r_3 - c_3r_4, \\
r_4 &= c_1r_1 + c_2r_4 - c_3r_3, & -r_2 &= c_1r_3 - c_2r_2 + c_3r_1,
\end{aligned} \tag{77}$$

for $X_{J_5}$, and

$$\begin{aligned}
-r_1 &= c_1r_3 - c_2r_2 + c_3r_1, & r_2 &= -c_1r_4 + c_2r_1 + c_3r_2, \\
-r_3 &= c_1r_1 + c_2r_4 - c_3r_3, & r_4 &= -c_1r_2 - c_2r_3 - c_3r_4
\end{aligned} \tag{78}$$

for $X_{J_6}$. These have solutions

$$c_1 = -\frac{2(r_2r_3 - r_1r_4)}{r_1^2 + r_2^2 + r_3^2 + r_4^2}, \qquad c_2 = -\frac{r_1^2 - r_2^2 + r_3^2 - r_4^2}{r_1^2 + r_2^2 + r_3^2 + r_4^2}, \qquad c_3 = -\frac{2(r_1r_2 + r_3r_4)}{r_1^2 + r_2^2 + r_3^2 + r_4^2}$$

and

$$c_1 = -\frac{2(r_1r_3 + r_2r_4)}{r_1^2 + r_2^2 + r_3^2 + r_4^2}, \qquad c_2 = -\frac{2(r_3r_4 - r_1r_2)}{r_1^2 + r_2^2 + r_3^2 + r_4^2}, \qquad c_3 = -\frac{r_1^2 - r_2^2 - r_3^2 + r_4^2}{r_1^2 + r_2^2 + r_3^2 + r_4^2},$$

respectively. Since not all of the $z_j$ are zero, not all the $r_j$ are zero. So all of these solutions exist. At the singular set of $J^{-1}(0)$, which is the point 0, all of the vector fields vanish. This establishes the claim.

4.2.2. Prequantization

Let $L = \mathbb{C}^4 \times \mathbb{C}$ be a trivial complex line bundle over $\mathbb{C}^4$ and let $\sigma_0 : \mathbb{C}^4 \to \mathbb{C}^4 \times \mathbb{C} : z \mapsto (z, 1)$ be a trivializing section of $L$. Note that every smooth complex valued section of $L$ can be written as $\psi(z, \bar{z})\sigma_0$ for some smooth complex valued function $\psi$ of $z$ and $\bar{z}$. On $L$ define a covariant derivative $\nabla_X$ of $\sigma_0$ along a vector field $X$ by

$$\nabla_X\sigma_0 = -i\hbar^{-1}(X \lrcorner\, \theta_0)\sigma_0,$$

which we may also write as $\nabla\sigma_0 = -i\hbar^{-1}\theta_0 \otimes \sigma_0$ (omitting the vector field $X$). This leads to the usual Schrödinger (position) representation. If $\langle z \mid w \rangle = z\bar{w}$ is the usual Hermitian inner product on $\mathbb{C}$, the inner product of two sections $\psi_1\sigma_0$ and $\psi_2\sigma_0$ of $L$ is given by

$$(\psi_1\sigma_0 \mid \psi_2\sigma_0) = \int_{T^*\mathbb{R}^4} \langle \psi_1 \mid \psi_2 \rangle\, \omega^4 = \int_{\mathbb{C}^4}\int_{\mathbb{C}^4} \psi_1(z, \bar{z})\bar\psi_2(z, \bar{z})\, d^4z\, d^4\bar{z}. \tag{79}$$
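Since we are going to use a Kähler polarization $F$ on $\mathbb{C}^4$ spanned by the antiholomorphic vectors $\operatorname{span}\{\bar\partial_1, \bar\partial_2, \bar\partial_3, \bar\partial_4\}$ it is more convenient to use another trivializing section of $L$, namely

$$\sigma_1 = \exp[-(4\hbar)^{-1}(\langle z, z \rangle - 2i\langle p, q \rangle)]\sigma_0. \tag{80}$$

Let $\theta_1 = -i\sum_{n=1}^4 \bar{z}_n\,dz_n$. Then $d\theta_1 = \omega$, and $\nabla\sigma_1 = -i\hbar^{-1}\theta_1 \otimes \sigma_1$, see [29, p. 144]. For $f \in C^\infty(\mathbb{C}^4)$, the prequantization operator $P_f$ is

$$P_f(\psi\sigma_1) = (-i\hbar\nabla_{X_f} + f)\psi\sigma_1 = -i\hbar X_f(\psi)\sigma_1 + (f - X_f \lrcorner\, \theta_1)\psi\sigma_1.$$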
A straightforward calculation using the expressions (72) for the momentum functions $J_j$ and the Hamiltonian vector fields (73) $X_{J_j}$ for $j = 1, \dots, 7$ gives $P_{J_j}(\psi\sigma_1) = -i\hbar X_{J_j}(\psi)\sigma_1$. For the Kähler polarization $F$ on $\mathbb{C}^4$ spanned by the antiholomorphic vectors $\mathrm{span}\{\bar\partial_1, \bar\partial_2, \bar\partial_3, \bar\partial_4\}$, the space $C^\infty(\mathbb{C}^4)^0_F$ of complex valued smooth functions annihilated by vectors in $F$ is the space of analytic functions of $z$.

Claim 4.8. A real valued function in $C^\infty_F(\mathbb{C}^4)$ is at most linear in both $z$ and $\bar z$. Conversely, any real valued polynomial at most linear in $z$ and $\bar z$ lies in $C^\infty_F(\mathbb{C}^4)$.

Proof. By definition, $C^\infty_F(\mathbb{C}^4)$ is the space of real valued functions whose Hamiltonian vector fields preserve $F$. Recall that $f \in C^\infty_F(\mathbb{C}^4)$ if for every $h \in C^\infty(\mathbb{C}^4)^0_F$, we have $\{f, h\} \in C^\infty(\mathbb{C}^4)^0_F$. Given a real valued function $f(z, \bar z)$, its Hamiltonian vector field is $X_f = i\sum_{n=1}^{4}\big(\frac{\partial f}{\partial \bar z_n}\partial_n - \frac{\partial f}{\partial z_n}\bar\partial_n\big)$. Therefore

$$\{f, h\} = X_f h = i\sum_{n=1}^{4}\Big(\frac{\partial f}{\partial \bar z_n}\frac{\partial h}{\partial z_n} - \frac{\partial f}{\partial z_n}\frac{\partial h}{\partial \bar z_n}\Big).$$

Since $h$ is analytic, the term $\frac{\partial f}{\partial z_n}\frac{\partial h}{\partial \bar z_n}$ is zero. If $f$ has terms higher than linear in any $\bar z_n$, then $\frac{\partial f}{\partial \bar z_n}\frac{\partial h}{\partial z_n}$ contains terms in $\bar z_n$, and thus will not be analytic. Since $f$ is real-valued, $\bar f = f$, and so $f$ can be at most linear in each of the $z_n$ as well. Finally, if $f$ is a polynomial at most linear in $z_n$ and $\bar z_n$, then $f$ lies in $C^\infty_F(\mathbb{C}^4)$.
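As an illustration of the mechanism in the proof of Claim 4.8, the following worked example (not part of the original argument) evaluates the bracket $\{f, h\}$ for two real valued test functions. The functions $f_1$ and $f_2$ are chosen here purely for illustration, and $h$ is assumed analytic.

```latex
% Assumption: h = h(z) is analytic, so \partial h / \partial \bar z_n = 0.
% Example 1: f_1 = z_1 \bar z_1 is real valued and linear in z_1 and in \bar z_1.
\{f_1, h\} = i\,\frac{\partial f_1}{\partial \bar z_1}\frac{\partial h}{\partial z_1}
           = i\, z_1 \frac{\partial h}{\partial z_1}
           \quad\text{(analytic in } z\text{)}.
% Example 2: f_2 = z_1^2 + \bar z_1^2 is real valued but quadratic in \bar z_1.
\{f_2, h\} = i\,\frac{\partial f_2}{\partial \bar z_1}\frac{\partial h}{\partial z_1}
           = 2i\, \bar z_1 \frac{\partial h}{\partial z_1}
           \quad\text{(not analytic unless } \partial h/\partial z_1 = 0\text{)}.
```

The first bracket stays in the space of analytic functions, while the second picks up a factor of $\bar z_1$, which is the obstruction used in the proof.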
From the above claim, we see that for $k = 1, 2, 3$ the components $J_k$ (72) of the momentum mapping $\check J$ for the action $\check\Phi$ (71) of SU(2) on $\mathbb{C}^4$ lie in $C^\infty_F(\mathbb{C}^4)$. Thus the polarization $F$ is preserved by the SU(2) action $\check\Phi$. The space $\Gamma^\infty_F(L)$ of smooth polarized sections of the complex line bundle $L$ is $\{\sigma \in \Gamma^\infty(L) \mid \nabla_X\sigma = 0 \text{ for every } X \in F\}$. This space is $\{\psi\sigma_1 \in \Gamma^\infty(L) \mid \psi(z) \text{ is analytic}\}$. Equations (79) and (80) give the following inner product $(\ \mid\ )$ of $\psi_1\sigma_1$ and $\psi_2\sigma_1$ in $\Gamma^\infty_F(L)$

$$(\psi_1\sigma_1 \mid \psi_2\sigma_1) = \int_{\mathbb{C}^4} \psi_1(z)\,\overline{\psi_2(z)}\,\exp(-|z|^2/(2\hbar))\, d^4z\, d^4\bar z. \tag{81}$$

Since $\psi_1$ and $\psi_2$ are analytic on $\mathbb{C}^4$, they do not depend explicitly on $\bar z$. Let $(H_F, (\ \mid\ ))$ be the Hilbert space of smooth sections of $L$, which are covariantly constant along the polarization $F$ of $\mathbb{C}^4$, and whose norm squared using the inner product $(\ \mid\ )$ is finite. $(H_F, (\ \mid\ ))$ is the representation space for the quantum SU(2) representation which we now construct. Note that for $\psi\sigma_1 \in H_F$ we have

$$\begin{aligned}
X_{J_1}(\psi\sigma_1) &= [(-z_2\partial_1 + z_1\partial_2 - z_4\partial_3 + z_3\partial_4)\psi](\sigma_1)\\
X_{J_2}(\psi\sigma_1) &= [(-z_3\partial_1 + z_4\partial_2 + z_1\partial_3 - z_2\partial_4)\psi](\sigma_1)\\
X_{J_3}(\psi\sigma_1) &= [(-z_4\partial_1 - z_3\partial_2 + z_2\partial_3 + z_1\partial_4)\psi](\sigma_1),
\end{aligned}$$
April 2, 2009 10:19 WSPC/148-RMP
J070-00363
Quantization of Singular Reduction
351
since $\frac{\partial\psi}{\partial \bar z_n} = 0$ for $n = 1, \dots, 4$, because $\psi$ is a holomorphic function of $z$. For $k = 1, \dots, 4$ the vector fields $X_{J_k}$ are complete and preserve the polarization $F$ of $\mathbb{C}^4$. Therefore we can define linear differential operators on the space $\mathbb{C}_\mu\{z\}$ of holomorphic functions of $z$, which are square integrable with respect to the measure $\mu = \exp(-|z|^2/(2\hbar))\, d^4z\, d^4\bar z$, as follows.

$$\begin{aligned}
L_1 &= -z_2\partial_1 + z_1\partial_2 - z_4\partial_3 + z_3\partial_4\\
L_2 &= -z_3\partial_1 + z_4\partial_2 + z_1\partial_3 - z_2\partial_4\\
L_3 &= -z_4\partial_1 - z_3\partial_2 + z_2\partial_3 + z_1\partial_4.
\end{aligned} \tag{82}$$

The corresponding quantum operators are

$$Q_{J_k}(\psi\sigma_1) = -i\hbar(L_k\psi)\sigma_1, \quad \text{for } k = 1, 2, 3. \tag{83}$$

The map

$$\rho : \mathrm{su}(2) \times H_F \to H_F : (J_k, \psi\sigma_1) \mapsto Q_{J_k}(\psi\sigma_1), \tag{84}$$

where $k = 1, 2, 3$, is a representation of the Lie algebra $\mathrm{su}(2) = \mathrm{span}\{J_1, J_2, J_3\}$ under Poisson bracket on the Hilbert space $(H_F, (\ \mid\ ))$ because

$$\begin{aligned}
Q_{\{J_i, J_j\}}(\psi\sigma_1) &= (-i\hbar)^{-1}(Q_{J_i}Q_{J_j} - Q_{J_j}Q_{J_i})(\psi\sigma_1)\\
&= (-i\hbar)[(L_iL_j - L_jL_i)(\psi)]\sigma_1\\
&= 2(-i\hbar)\Big[\sum_{k=1}^{3}\varepsilon_{ijk}L_k(\psi)\Big]\sigma_1 = 2\sum_{k=1}^{3}\varepsilon_{ijk}\,Q_{J_k}(\psi\sigma_1).
\end{aligned}$$

The third equality above follows because $[L_i, L_j] = 2\sum_{k=1}^{3}\varepsilon_{ijk}L_k$. For $j = 1, 2, 3$ the operators $Q_{J_j}$ (83) are skew adjoint. Hence their exponential $\exp Q_{J_j}$ generates a one parameter group of unitary operators on $(H_F, (\ \mid\ ))$. This gives rise to a unitary representation

$$R : \mathrm{SU}(2) \times H_F \to H_F \tag{85}$$

of SU(2) on $H_F$. Infinitesimalizing this representation gives the su(2) representation $\rho$ (84). The subspace of $H_F$ left invariant by the SU(2) representation $R$ (85) is equal to the subspace left invariant by the su(2) representation $\rho$ (84). This subspace is the same as the subspace of $H_F$ spanned by the vectors which lie in the kernel of $Q_{J_k}$ (83) for each $k = 1, 2, 3$. In particular we prove

Proposition 4.9. The space $\ker Q_J$ of vectors in $(H_F, (\ \mid\ ))$, which are invariant under the SU(2) representation $R$ (85), is $\{\psi\sigma_1 \in H_F \mid \psi(z) = \Psi(z^2)\}$, where $z^2 = \sum_{n=1}^{4} z_n^2$ and $\Psi(z^2)$ is an entire analytic function on $\mathbb{C}^4$, which is square integrable with respect to the measure $\mu = \exp(-|z|^2/(2\hbar))\, d^4z\, d^4\bar z$.

Consider the linear differential operators $L_j$ for $j = 1, 2, 3$ (82), which span the Lie algebra $\mathcal{L}$ that is isomorphic to su(2). Let $\mathbb{C}[[z]]$ be the space of formal power
series in $z = (z_1, z_2, z_3, z_4)$ (which is the basis of $(\mathbb{C}^4)^*$ dual to the standard basis of $\mathbb{C}^4$). The usual action of $\mathcal{L}$ on $\mathbb{C}[[z]]$ defines a representation $\rho$ of su(2) on $\mathbb{C}[[z]]$. We now describe the space of formal power series in $z$, which are invariant under the su(2) representation $\rho$. In other words, we find all the formal power series on $(\mathbb{C}^4)^*$ which lie in the kernel of $L_1$, $L_2$, and $L_3$ simultaneously. Towards this goal define new linear differential operators by

$$H = iL_1, \qquad E = -\tfrac{1}{2}(L_3 - iL_2) \qquad \text{and} \qquad F = \tfrac{1}{2}(L_3 + iL_2).$$

Then

$$[H, E] = 2E, \qquad [H, F] = -2F \qquad \text{and} \qquad [E, F] = H. \tag{86}$$

So $\mathcal{E} = \{H, E, F\}$ spans a Lie algebra of linear differential operators, which is isomorphic to sl(2, C). These operators acting on $\mathbb{C}[[z]]$ define a representation $\check\rho$ of sl(2, C). We now determine the set of formal power series in $z$, which lie in the kernel of $E$, $F$, and $H$ simultaneously. Choose a new basis $\{w_1 = z_1 + iz_2,\ w_2 = z_1 - iz_2,\ w_3 = z_3 + iz_4,\ w_4 = z_3 - iz_4\}$ of $(\mathbb{C}^4)^*$. Then with respect to this basis

$$H = -w_1\partial_1 + w_2\partial_2 - w_3\partial_3 + w_4\partial_4, \qquad E = w_4\partial_1 - w_2\partial_3, \qquad \text{and} \qquad F = -w_3\partial_2 + w_1\partial_4,$$

where $\partial_j = \frac{\partial}{\partial w_j}$ for $j = 1, \dots, 4$. Because $H$, $E$, and $F$ are linear differential operators, they preserve the degree of each term $M_n$ in a formal power series $M = \sum_{n \ge 0} M_n$ in $\mathbb{C}[[w]]$. Thus $M$ is sl(2, C)-invariant if and only if $M_n$ is sl(2, C)-invariant for every $n \in \mathbb{Z}_{\ge 0}$. A term $M_n = w_1^j w_2^k w_3^\ell w_4^m$ of degree $n$ is invariant under sl(2, C) if and only if $HM_n = EM_n = FM_n = 0$. In other words, $M_n$ lies in a 1-dimensional irreducible summand of the representation $\rho$ of $\mathcal{E}$ on $\mathbb{C}[[w]]$. For this to occur it is necessary and sufficient that $HM_n = 0$ and $EM_n = 0$. We now determine the kernel of $H$. A straightforward calculation gives $HM_n = (k + m - j - \ell)M_n$. Therefore $HM_n = 0$ if and only if $j + \ell + k + m = n$ and $j + \ell = k + m$. To determine which monomials $M_n$ satisfy the second condition we write $M_n$ as two lists

$$\underbrace{w_1 \cdots w_1}_{j}\ \underbrace{w_3 \cdots w_3}_{\ell} \qquad\qquad \underbrace{w_2 \cdots w_2}_{k}\ \underbrace{w_4 \cdots w_4}_{m}.$$

Because $j + \ell = k + m$, these lists have the same length. Therefore their entries can be paired off. This expresses $M_n$ as a product with repetitions of the quadratic polynomials $w_1w_2$, $w_1w_4$, $w_2w_3$, and $w_3w_4$. Consequently, $n$ is even. Therefore $\ker H$ is a subalgebra of $\mathbb{C}[[w]]$, which is generated by the preceding quadratic polynomials.
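The bracket relations (86) and the description of $\ker H$ can be checked mechanically. The sketch below is an illustration only: it assumes that a linear vector field $X = \sum_j (\sum_i A_{ji} w_i)\,\partial_j$ is encoded by its coefficient matrix $A$, in which case the commutator $[X_A, X_B]$ has matrix $BA - AB$. The helper names (`bracket`, `h_weight`, `zero_weight_monomials`) are hypothetical and not taken from the paper.

```python
from itertools import product

# A linear vector field X = sum_j (sum_i A[j][i] * w_i) d/dw_j is stored as the
# 4x4 integer matrix A, so that X(w_j) = sum_i A[j][i] * w_i (0-based indices).

def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def bracket(A, B):
    # Matrix of the commutator [X_A, X_B]; with the encoding above it is B*A - A*B.
    BA, AB = matmul(B, A), matmul(A, B)
    n = len(A)
    return [[BA[r][c] - AB[r][c] for c in range(n)] for r in range(n)]

def scale(s, A):
    return [[s * x for x in row] for row in A]

# H = -w1 d1 + w2 d2 - w3 d3 + w4 d4, E = w4 d1 - w2 d3, F = -w3 d2 + w1 d4
H = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]
E = [[0, 0, 0, 1], [0, 0, 0, 0], [0, -1, 0, 0], [0, 0, 0, 0]]
F = [[0, 0, 0, 0], [0, 0, -1, 0], [0, 0, 0, 0], [1, 0, 0, 0]]

def h_weight(exponents):
    # Eigenvalue of H on w1^j w2^k w3^l w4^m, where exponents = (j, k, l, m).
    return sum(e * H[i][i] for i, e in enumerate(exponents))

def zero_weight_monomials(degree):
    # Exponent tuples (j, k, l, m) of total degree `degree` that H annihilates.
    return [e for e in product(range(degree + 1), repeat=4)
            if sum(e) == degree and h_weight(e) == 0]

print(bracket(H, E) == scale(2, E), bracket(H, F) == scale(-2, F), bracket(E, F) == H)
print(zero_weight_monomials(2))
```

In degree 2 the list returned by `zero_weight_monomials` consists of the exponent tuples of $w_1w_2$, $w_1w_4$, $w_2w_3$ and $w_3w_4$, and in odd degree it is empty, in line with the pairing argument.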
Next we find the kernel of the operator $E$ restricted to $\ker H$. First we determine the kernel of $E$ on the vector space spanned by $w_1w_2$, $w_1w_4$, $w_2w_3$, and $w_3w_4$. Suppose that for some $a, b, c, d \in \mathbb{R}$

$$0 = E(aw_1w_2 + bw_1w_4 + cw_2w_3 + dw_3w_4) = (a - d)w_2w_4 - cw_2^2 + bw_4^2. \tag{87}$$

Because $\{w_2w_4, w_2^2, w_4^2\}$ are linearly independent quadratic polynomials in $w_1, w_2, w_3, w_4$, from (87) we deduce that $b = c = 0$ and $a = d$. In other words, the only linear polynomials in $\{v_1 = w_1w_2,\ v_2 = w_1w_4,\ v_3 = w_2w_3,\ v_4 = w_3w_4\}$ which lie in $\ker E$ are of the form $a(v_1 + v_4)$ for some $a \in \mathbb{R}$. We generalize this last result by showing that the algebra of formal power series, which lie in $\ker E \mid \ker H = \ker E \cap \ker H$, is generated by the polynomial $v_1 + v_4$. The argument proving this goes as follows. Let $G_p$ be a homogeneous polynomial of degree $p$ in $\{v_1, v_2, v_3, v_4\}$, which is a term in a formal power series in $\ker E \cap \ker H$. Then

$$\begin{aligned}
0 = EG_p(v_1, v_2, v_3, v_4) &= \frac{\partial G_p}{\partial v_1}Ev_1 + \frac{\partial G_p}{\partial v_2}Ev_2 + \frac{\partial G_p}{\partial v_3}Ev_3 + \frac{\partial G_p}{\partial v_4}Ev_4\\
&= \Big(\frac{\partial G_p}{\partial v_1} - \frac{\partial G_p}{\partial v_4}\Big)w_2w_4 + \frac{\partial G_p}{\partial v_2}w_4^2 - \frac{\partial G_p}{\partial v_3}w_2^2.
\end{aligned} \tag{88}$$

Because the variables $\{v_1, v_2, v_3, v_4\}$ and $\{w_2w_4, w_2^2, w_4^2\}$ are algebraically independent, from (88) it follows that

$$\frac{\partial G_p}{\partial v_1} - \frac{\partial G_p}{\partial v_4} = 0 \qquad \text{and} \qquad \frac{\partial G_p}{\partial v_2} = \frac{\partial G_p}{\partial v_3} = 0. \tag{89}$$

From the second equation in (89) it follows that $G_p$ is a polynomial in the variables $v_1$ and $v_4$, which is homogeneous of degree $p$ by hypothesis. Therefore $G_p(v_1, v_4) = \sum_{j=0}^{p} a_j v_1^{p-j}v_4^{j}$ for some $a_j \in \mathbb{C}$. Now

$$\frac{\partial G_p}{\partial v_1} = \sum_{j=0}^{p-1}(p - j)a_j v_1^{p-j-1}v_4^{j} = \sum_{j=1}^{p}(p - j + 1)a_{j-1}v_1^{p-j}v_4^{j-1}$$

and $\frac{\partial G_p}{\partial v_4} = \sum_{j=1}^{p} ja_j v_1^{p-j}v_4^{j-1}$. But $\frac{\partial G_p}{\partial v_1} = \frac{\partial G_p}{\partial v_4}$. So equating coefficients gives $ja_j = (p - j + 1)a_{j-1}$ for $j = 1, \dots, p$. Consequently, $a_j = \binom{p}{j}a_0$ for $j = 1, \dots, p$. In other words, $G_p(v_1, v_2, v_3, v_4) = a_0(v_1 + v_4)^p$. Thus we have proved

Fact 4.10. The algebra $\mathbb{C}[[w]]^{\mathrm{sl}(2,\mathbb{C})}$ of formal power series which are invariant under the sl(2, C) representation $\check\rho$ of $\{H, E, F\}$ on $\mathbb{C}[[w]]$, is generated by the polynomial $P(w) = w_1w_2 + w_3w_4$.

Translating the above result about the algebra of sl(2, C)-invariant formal power series back to su(2), we have shown

Lemma 4.11. The algebra $\mathbb{C}_\mu\{z\}^{\mathrm{su}(2)}$ of convergent power series, which are square integrable with respect to the measure $\mu$ and are invariant under the su(2) representation $\rho$ of $\{L_1, L_2, L_3\}$ on $\mathbb{C}_\mu\{z\}$, is generated by the polynomial $z_1^2 + z_2^2 + z_3^2 + z_4^2$.
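Fact 4.10 can be sanity-checked on low degrees with a few lines of polynomial arithmetic. The following sketch is only illustrative: polynomials in $w_1, \dots, w_4$ are stored as dictionaries from exponent tuples to integer coefficients, and a linear vector field is a list of terms `(coeff, i, j)` standing for `coeff * w_i * d/dw_j`. These encodings and the helper names are assumptions made here, not notation from the paper.

```python
from math import comb

def poly_mul(p, q):
    # Product of two polynomials stored as {exponent_tuple: coefficient}.
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            key = tuple(a + b for a, b in zip(m1, m2))
            out[key] = out.get(key, 0) + c1 * c2
    return {m: c for m, c in out.items() if c != 0}

def poly_pow(p, n):
    result = {(0, 0, 0, 0): 1}
    for _ in range(n):
        result = poly_mul(result, p)
    return result

def apply_field(field, p):
    # Apply sum of coeff * w_i * d/dw_j (0-based i, j) to the polynomial p.
    out = {}
    for mono, c in p.items():
        for coeff, i, j in field:
            if mono[j] == 0:
                continue
            new = list(mono)
            new[j] -= 1
            new[i] += 1
            key = tuple(new)
            out[key] = out.get(key, 0) + coeff * c * mono[j]
    return {m: c for m, c in out.items() if c != 0}

E = [(1, 3, 0), (-1, 1, 2)]                          # w4 d1 - w2 d3
F = [(-1, 2, 1), (1, 0, 3)]                          # -w3 d2 + w1 d4
H = [(-1, 0, 0), (1, 1, 1), (-1, 2, 2), (1, 3, 3)]   # -w1 d1 + w2 d2 - w3 d3 + w4 d4
P = {(1, 1, 0, 0): 1, (0, 0, 1, 1): 1}               # P(w) = w1 w2 + w3 w4

def coefficients_from_recursion(p, a0=1):
    # Solve j * a_j = (p - j + 1) * a_{j-1} starting from a_0 = a0.
    a = [a0]
    for j in range(1, p + 1):
        a.append(a[j - 1] * (p - j + 1) // j)
    return a

for p in range(5):
    Pp = poly_pow(P, p)
    print(p, apply_field(E, Pp), apply_field(F, Pp), apply_field(H, Pp))
```

Each printed dictionary is empty, meaning that $P^p$ is annihilated by $E$, $F$ and $H$ for the degrees tried, and `coefficients_from_recursion(p)` reproduces the binomial coefficients used in the proof. By contrast, `apply_field(E, {(1, 0, 0, 1): 1})` returns the single term $w_4^2$, so $w_1w_4$ alone is not invariant.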
Proof of Proposition 4.9. Let $f\sigma_1 \in H_F$. Then $f$ is a holomorphic function on $\mathbb{C}^4$, which is square integrable with respect to the measure $\mu = e^{-|z|^2/(2\hbar)}\, d^4z\, d^4\bar z$. In other words, $f \in \mathbb{C}_\mu\{z\}$. Let $f = \sum_{n=0}^{\infty} f_n$ be the Taylor series of $f$ about $0$. Here $f_n$ lies in $\mathbb{C}^n_\mu[z]$, which is the space of homogeneous polynomials of degree $n$ on $\mathbb{C}^4$, that are square integrable with respect to the measure $\mu$. This is just the space $\mathbb{C}^n[z]$ of homogeneous polynomials of degree $n$ on $\mathbb{C}^4$. Suppose that $f\sigma_1 \in \ker Q_{J_k} \subseteq H_F$ for $k = 1, 2, 3$. Then for every $L \in \mathrm{span}\{L_1, L_2, L_3\} = \mathcal{L}$, we have $0 = Lf = \sum_{n=0}^{\infty} Lf_n$, since $L$ is a linear operator. From (82), it follows that $Lf_n \in \mathbb{C}^n[z]$. Because $\sum_{n=0}^{\infty} Lf_n$ is a convergent power series, by uniqueness of the Taylor series of an analytic function we obtain $Lf_n = 0$ for every $n \ge 0$. Therefore $f_n \in \ker\mathcal{L} \cap \mathbb{C}^n[z] = \mathbb{C}^n[z]^{\mathrm{su}(2)}$. But then $n = 2m$ and $f_n = a_m(z_1^2 + z_2^2 + z_3^2 + z_4^2)^m$ for some $a_m \in \mathbb{C}$. In other words, $f(z) = F(z^2)$, where $z^2 = z_1^2 + z_2^2 + z_3^2 + z_4^2$ and $F(w)$ is the entire holomorphic function on $\mathbb{C}$ with Taylor series $\sum_{m=0}^{\infty} a_m w^m$ about $0$. By hypothesis $f$ is square integrable with respect to the measure $\mu$, which implies that the function $F$ is also.

For $i = 4, 5, 6$, applying the operators $Q_{J_i}$ to an su(2)-invariant section $\psi\sigma_1 \in H_F$, where $\psi(z, \bar z) = \Psi(z^2)$, gives

$$\begin{aligned}
Q_{J_4}(\psi\sigma_1) &= -i\hbar[(z_2\partial_1 - z_1\partial_2 - z_4\partial_3 + z_3\partial_4)\psi]\sigma_1 = 0\\
Q_{J_5}(\psi\sigma_1) &= -i\hbar[(z_3\partial_1 + z_4\partial_2 - z_1\partial_3 - z_2\partial_4)\psi]\sigma_1 = 0\\
Q_{J_6}(\psi\sigma_1) &= -i\hbar[(z_4\partial_1 - z_3\partial_2 + z_2\partial_3 - z_1\partial_4)\psi]\sigma_1 = 0.
\end{aligned}$$

In other words, the operators $Q_{J_j}$ for $j = 4, 5, 6$ vanish on the space of su(2)-invariant sections in $H_F$. However,

$$\hbar^{-1}Q_{J_7}(\Psi(z^2)\sigma_1) = [(z_1\partial_1 + z_2\partial_2 + z_3\partial_3 + z_4\partial_4)\Psi(z^2)]\sigma_1 = 2z^2\Psi'(z^2)\sigma_1,$$

so $Q_{J_7}$ does not vanish on the space of su(2)-invariant sections in $H_F$.

4.2.3. Decomposition of the sl(2, C) representation

We now decompose the sl(2, C) representation $\check\rho$ on the Hilbert space $\mathbb{C}_\mu\{w\}$ of holomorphic functions on $\mathbb{C}^4$, which are square integrable with respect to the measure $\mu = \frac{1}{16}e^{-|w|^2/(4\hbar)}\, d^4w\, d^4\bar w$, defined by the linear differential operators $E = w_4\partial_1 - w_2\partial_3$, $F = w_1\partial_4 - w_3\partial_2$, and $H = -w_1\partial_1 + w_2\partial_2 - w_3\partial_3 + w_4\partial_4$ acting on $\mathbb{C}_\mu\{w\}$, into a sum of irreducible sl(2, C) representations. Because $E$, $F$, and $H$ are linear operators, which for every $n \in \mathbb{Z}_{\ge 0}$ preserve the space $\mathbb{C}^n[w]$ of homogeneous polynomials of degree $n$ on $\mathbb{C}^4$, it follows that the sl(2, C) representation $\check\rho$ induces an sl(2, C) representation

$$\rho_n : \mathrm{sl}(2, \mathbb{C}) \to \mathrm{gl}(\mathbb{C}^n[w], \mathbb{C}) : f_n \mapsto \rho_n(\lambda)f_n = \check\rho(\lambda)f_n \tag{90}$$

for every $\lambda \in \mathrm{sl}(2, \mathbb{C})$ and every $f_n \in \mathbb{C}^n[w]$. Thus in order to decompose the sl(2, C) representation $\check\rho$ on $\mathbb{C}_\mu\{w\}$ it suffices to decompose the induced finite dimensional
representation $\rho_n$ on $\mathbb{C}^n[w]$ for every $n \in \mathbb{Z}_{\ge 0}$. To solve this problem we need to determine the top weight vectors of $\rho_n$ for all $n \in \mathbb{Z}_{\ge 0}$. In other words, we need to find $\ker E$.

Fact 4.12. $\ker E$ is a module over $\mathbb{C}^{e}_{\mu}\{w\}^{\mathrm{sl}(2,\mathbb{C})}$.

Proof. Let $p \in \ker E$ and $f \in \mathbb{C}^{e}_{\mu}\{w\}^{\mathrm{sl}(2,\mathbb{C})}$. Then $E(fp) = (Ef)p + f(Ep) = 0$, because $Ef = 0$ since $f \in \mathbb{C}^{e}_{\mu}\{w\}^{\mathrm{sl}(2,\mathbb{C})}$ and $Ep = 0$ since $p \in \ker E$.

Claim 4.13. The sl(2, C) representation $\check\rho$ of $\{H, E, F\}$ on $\mathbb{C}[[w]]$ is the symmetric tensor product of the sl(2, C) representation $\check\sigma$ on $\mathbb{C}[[w_2, w_3]]$ defined by $E_1 = w_3\partial_2$, $F_1 = w_2\partial_3$, and $H_1 = -w_2\frac{\partial}{\partial w_2} + w_3\frac{\partial}{\partial w_3}$, and the sl(2, C) representation $\check\tau$ on $\mathbb{C}[[w_1, w_4]]$ defined by $E_2 = -w_1\partial_4$, $F_2 = -w_4\partial_1$, and $H_2 = w_1\partial_1 - w_4\partial_4$. Note that $\mathbb{C}[[w]] = \mathbb{C}[[w_2, w_3]] \odot \mathbb{C}[[w_1, w_4]]$ and $E = E_1 + E_2$, $F = F_1 + F_2$, and $H = H_1 + H_2$.

Proof. Consider the generating function $G_1(t) = \sum_{k=0}^{\infty} \dim \mathbb{C}^k[w_2, w_3]\, t^k$ of the representation $\check\sigma$ on $\mathbb{C}[[w_2, w_3]]$. Observe that $\mathbb{C}^k[w_2, w_3]$ is the representation space of the standard irreducible sl(2, C) representation of dimension $k + 1$. Then

$$G_1(t) = \sum_{k=0}^{\infty}(k + 1)t^k = \frac{d}{dt}\sum_{k=0}^{\infty} t^{k+1} = \frac{d}{dt}\Big(\frac{t}{1 - t}\Big) = \frac{1}{(1 - t)^2}.$$

Similarly, the generating function of the sl(2, C) representation $\check\tau$ on $\mathbb{C}[[w_1, w_4]]$ is $G_2(t) = \sum_{\ell=0}^{\infty}(\ell + 1)t^\ell = \frac{1}{(1 - t)^2}$. Therefore the generating function of the symmetric tensor product $\check\sigma \odot \check\tau$ on $\mathbb{C}[[w]]$ is $G(t) = G_1(t)G_2(t)$. We compute $G(t)$ in two different ways. First, from the power series for $G_1(t)$ and $G_2(t)$ we get the power series

$$\sum_{k,\ell=0}^{\infty}(k + 1)(\ell + 1)t^{k+\ell} = \sum_{n=0}^{\infty}\Big(\sum_{k+\ell=n}(k + 1)(\ell + 1)\Big)t^n.$$

Second, from the fact that $G_1(t) = G_2(t) = \frac{1}{(1 - t)^2}$, we see that $G(t)$ is $\frac{1}{(1 - t)^4}$. To write $\frac{1}{(1 - t)^4}$ as a power series, we differentiate the identity $\frac{1}{1 - t} = \sum_{n=0}^{\infty} t^n$ three times and divide by $3!$, obtaining $\frac{1}{(1 - t)^4} = \sum_{n=0}^{\infty}\binom{n+3}{3}t^n$. Therefore for every $n \in \mathbb{Z}_{\ge 0}$ we have

$$\sum_{k+\ell=n}(k + 1)(\ell + 1) = \binom{n+3}{3}.$$

But $\dim \mathbb{C}^n[w] = \binom{n+3}{3}$ for every $n \in \mathbb{Z}_{\ge 0}$. Therefore the representation $\rho_n : \mathrm{sl}(2, \mathbb{C}) \to \mathrm{gl}(\mathbb{C}^n[w], \mathbb{C})$ is the induced sl(2, C) representation $(\check\sigma \odot \check\tau)_n$ on $\mathbb{C}^n[w]$ for every $n \in \mathbb{Z}_{\ge 0}$. In other words, the sl(2, C) representation $\check\rho$ on $\mathbb{C}[[w]]$ is the symmetric tensor product of the sl(2, C) representations $\check\sigma$ and $\check\tau$ on $\mathbb{C}[[w_2, w_3]]$ and $\mathbb{C}[[w_1, w_4]]$, respectively.
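The dimension count in the proof of Claim 4.13 is easy to confirm numerically. The sketch below (with hypothetical function names, written only as an illustration) compares the coefficient of $t^n$ in $G_1(t)G_2(t)$, the binomial coefficient $\binom{n+3}{3}$, and a brute-force count of the monomials of degree $n$ in four variables.

```python
from itertools import product
from math import comb

def convolution_coefficient(n):
    # Coefficient of t^n in G1(t) * G2(t): sum over k + l = n of (k + 1)(l + 1).
    return sum((k + 1) * (n - k + 1) for k in range(n + 1))

def count_monomials(n, variables=4):
    # Number of monomials of total degree n in the given number of variables.
    return sum(1 for e in product(range(n + 1), repeat=variables) if sum(e) == n)

for n in range(8):
    print(n, convolution_coefficient(n), comb(n + 3, 3), count_monomials(n))
```

All three columns agree for the values of $n$ tried, which matches the identity obtained from the two expansions of $G(t)$.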
Let $\mu_1 = \frac{1}{4}e^{-(|w_2|^2 + |w_3|^2)/(4\hbar)}\, dw_2\, dw_3\, d\bar w_2\, d\bar w_3$ be a measure on $\mathbb{C}^2$ with coordinates $(w_2, w_3)$ and let $\mu_2 = \frac{1}{4}e^{-(|w_1|^2 + |w_4|^2)/(4\hbar)}\, dw_1\, dw_4\, d\bar w_1\, d\bar w_4$ be a measure on
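$\mathbb{C}^2$ with coordinates $(w_1, w_4)$. Let $\mathbb{C}_{\mu_1}\{w_2, w_3\}$ be the space of holomorphic functions on $\mathbb{C}^4$ which are square integrable with respect to the measure $\mu_1$ and let $\mathbb{C}_{\mu_2}\{w_1, w_4\}$ be the space of holomorphic functions on $\mathbb{C}^4$ which are square integrable with respect to the measure $\mu_2$. Since $\mu = \mu_1\mu_2$ we obtain

Corollary 4.14. The sl(2, C) representation $\check\rho$ of $\{H, E, F\}$ on $\mathbb{C}_\mu\{w\}$ is the symmetric tensor product of the sl(2, C) representation $\check\sigma$ of $\{H_1, E_1, F_1\}$ on $\mathbb{C}_{\mu_1}\{w_2, w_3\}$ and the sl(2, C) representation $\check\tau$ of $\{H_2, E_2, F_2\}$ on $\mathbb{C}_{\mu_2}\{w_1, w_4\}$.

Corollary 4.15. As a $\mathbb{C}_\mu\{w\}^{\mathrm{sl}(2,\mathbb{C})}$-module, the kernel of $E$ on $\mathbb{C}_\mu\{w\}$ has a basis $\{w_1^k w_3^\ell \in \mathbb{C}_\mu\{w\} \mid (k, \ell) \in (\mathbb{Z}_{\ge 0})^2\}$.

Proof. This follows because as a vector space $\ker E_1$ is spanned by $\{w_3^k \in \mathbb{C}_{\mu_1}\{w_2, w_3\} \mid k \in \mathbb{Z}_{\ge 0}\}$, $\ker E_2$ is spanned by $\{w_1^\ell \in \mathbb{C}_{\mu_2}\{w_1, w_4\} \mid \ell \in \mathbb{Z}_{\ge 0}\}$, and $\ker E = \ker E_1 \odot \ker E_2$ on $\mathbb{C}_\mu\{w\}$.

We now translate the result of Claim 4.13 to the original su(2)-representation $\rho$. From the sl(2, C) representation $\check\sigma$ on $\mathbb{C}[[w_2, w_3]]$ given by $\{H_1, E_1, F_1\}$ we form the su(2) representation $\sigma$ on $\mathbb{C}[[z_1 - iz_2, z_3 + iz_4]]$ given by $\{\ell_1, \ell_2, \ell_3\}$ where

$$\begin{aligned}
\ell_1 &= \tfrac{1}{2}(z_2\partial_1 - z_1\partial_2 + z_4\partial_3 - z_3\partial_4) + \tfrac{1}{2}i(z_1\partial_1 + z_2\partial_2 - z_3\partial_3 - z_4\partial_4)\\
\ell_2 &= \tfrac{1}{2}(z_3\partial_1 - z_4\partial_2 - z_1\partial_3 + z_2\partial_4) + \tfrac{1}{2}i(z_4\partial_1 + z_3\partial_2 + z_2\partial_3 + z_1\partial_4)\\
\ell_3 &= \tfrac{1}{2}(z_4\partial_1 + z_3\partial_2 - z_2\partial_3 - z_1\partial_4) - \tfrac{1}{2}i(z_3\partial_1 - z_4\partial_2 + z_1\partial_3 - z_2\partial_4).
\end{aligned}$$

Similarly, from the sl(2, C) representation $\check\tau$ on $\mathbb{C}[[w_1, w_4]]$ given by $\{H_2, E_2, F_2\}$ we form the su(2) representation $\tau$ on $\mathbb{C}[[z_1 + iz_2, z_3 - iz_4]]$ given by $\{\lambda_1, \lambda_2, \lambda_3\}$ where

$$\begin{aligned}
\lambda_1 &= \tfrac{1}{2}(z_2\partial_1 - z_1\partial_2 + z_4\partial_3 - z_3\partial_4) - \tfrac{1}{2}i(z_1\partial_1 + z_2\partial_2 - z_3\partial_3 - z_4\partial_4)\\
\lambda_2 &= \tfrac{1}{2}(z_3\partial_1 - z_4\partial_2 - z_1\partial_3 + z_2\partial_4) - \tfrac{1}{2}i(z_4\partial_1 + z_3\partial_2 + z_2\partial_3 + z_1\partial_4)\\
\lambda_3 &= \tfrac{1}{2}(z_4\partial_1 + z_3\partial_2 - z_2\partial_3 - z_1\partial_4) + \tfrac{1}{2}i(z_3\partial_1 - z_4\partial_2 + z_1\partial_3 - z_2\partial_4).
\end{aligned}$$

Because $\mathbb{C}[[z]] = \mathbb{C}[[z_1 - iz_2, z_3 + iz_4]] \odot \mathbb{C}[[z_1 + iz_2, z_3 - iz_4]]$ and $L_1 = \ell_1 + \lambda_1$, $L_2 = \ell_2 + \lambda_2$, and $L_3 = \ell_3 + \lambda_3$, it follows that

Proposition 4.16. The su(2)-representation $\rho$ is the symmetric tensor product of the su(2)-representation $\sigma$ of $\{\ell_1, \ell_2, \ell_3\}$ on $\mathbb{C}[[z_1 - iz_2, z_3 + iz_4]]$ and the su(2)-representation $\tau$ of $\{\lambda_1, \lambda_2, \lambda_3\}$ on $\mathbb{C}[[z_1 + iz_2, z_3 - iz_4]]$.

Corollary 4.17. The su(2) representation $\rho$ of $\{L_1, L_2, L_3\}$ on $\mathbb{C}_\mu\{z\}$ is the symmetric tensor product of the su(2) representation $\sigma$ of $\{\ell_1, \ell_2, \ell_3\}$ on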
$\mathbb{C}_{\mu_1}\{z_1 - iz_2, z_3 + iz_4\}$ and the su(2) representation $\tau$ of $\{\lambda_1, \lambda_2, \lambda_3\}$ on $\mathbb{C}_{\mu_2}\{z_1 + iz_2, z_3 - iz_4\}$. Here $\mu_1 = \frac{1}{4}e^{-(|z_1 - iz_2|^2 + |z_3 + iz_4|^2)/(4\hbar)}\, d(z_1 - iz_2)\, d(z_3 + iz_4)$ and $\mu_2 = \frac{1}{4}e^{-(|z_1 + iz_2|^2 + |z_3 - iz_4|^2)/(4\hbar)}\, d(z_1 + iz_2)\, d(z_3 - iz_4)$.

4.3. Quantization of singular reduction at 0

We will show that our example satisfies the singular prequantization condition.

Theorem 4.18. AGJ's example satisfies the singular prequantization condition, namely, that $P_k\tau \in I\Gamma^\infty(L)$ for all $k \in I^{\mathrm{SU}(2)}$ and all $\tau$ such that $P_{J_\xi}\tau \in I\Gamma^\infty(L)$ for all $\xi \in \mathrm{su}(2)$.

Proof. First, note that $I^{\mathrm{SU}(2)}$ is generated by $J_4$, $J_5$, and $J_6$; while $J_1$, $J_2$, and $J_3$ make up the set $\{J_\xi,\ \xi \in \mathrm{su}(2)\}$. We will show that if $P_{J_j}\tau \in I\Gamma^\infty(L)$ for $j = 1, 2, 3$, then $P_{J_4}\tau \in I\Gamma^\infty(L)$. The argument for $J_5$ and $J_6$ is similar and is not given. Recall from Claim 4.7 that

$$X_{J_4} = c_1X_{J_1} + c_2X_{J_2} + c_3X_{J_3} \tag{91}$$

on $J^{-1}(0)$, where $c_1$, $c_2$, and $c_3$ are the functions defined in (76). Note that these functions are defined on all of $\mathbb{C}^4 \setminus \{0\}$. So the right-hand side of (91) is defined everywhere except the origin and is not necessarily equal to $X_{J_4}$ there. Write

$$X_{J_4} = c_1X_{J_1} + c_2X_{J_2} + c_3X_{J_3} + Y \tag{92}$$

where $Y$ is some vector field on $\mathbb{C}^4$, which vanishes on $J^{-1}(0)$. In Sec. 4.2.2, we showed that $P_{J_\ell}(\psi\sigma_1) = -i\hbar X_{J_\ell}(\psi)\sigma_1$ for $\ell = 1, \dots, 7$. Thus, using (92), we get

$$\begin{aligned}
P_{J_4}(\psi\sigma_1) = -i\hbar X_{J_4}(\psi)\sigma_1 &= -i\hbar c_1X_{J_1}(\psi)\sigma_1 - i\hbar c_2X_{J_2}(\psi)\sigma_1 - i\hbar c_3X_{J_3}(\psi)\sigma_1 - i\hbar Y(\psi)\sigma_1\\
&= c_1P_{J_1}(\psi\sigma_1) + c_2P_{J_2}(\psi\sigma_1) + c_3P_{J_3}(\psi\sigma_1) - i\hbar Y(\psi)\sigma_1.
\end{aligned} \tag{93}$$

The first three terms in (93) are in $I\Gamma^\infty(L)$ by hypothesis. In the fourth, we differentiate the function $\psi$ by a vector field that vanishes on $J^{-1}(0)$. So the resulting function vanishes on $J^{-1}(0)$. Thus $Y(\psi)\sigma_1$ is in $I\Gamma^\infty(L)$. Therefore $P_{J_4}(\psi\sigma_1)$ is in $I\Gamma^\infty(L)$.

Since the singular prequantization condition implies the singular quantization condition, it follows that AGJ's example satisfies the singular quantization condition.

The representation space $(\Gamma^\infty_F(L)/I\Gamma^\infty(L))^{\mathrm{SU}(2)}$ for quantization of singular reduction is isomorphic to the space $\Gamma^\infty_F(L)^{\mathrm{SU}(2)}$. We know that $\Gamma^\infty_F(L)^{\mathrm{SU}(2)}$ is isomorphic to $\{\Psi(z^2)\sigma_1 \mid \Psi \text{ analytic}\}$, where $z^2 = z_1^2 + z_2^2 + z_3^2 + z_4^2$. To complete the description of the quantization of singular reduction, we need the quantum operators for elements of $(C^\infty_F(P) \cap C^\infty(P)^{\mathrm{SU}(2)})/I^{\mathrm{SU}(2)}$. As shown
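in the preceding section, the action of $J_7$ on $\Gamma^\infty_F(L)^{\mathrm{SU}(2)}$ is

$$Q_{J_7}(\Psi(z^2)\sigma_1) = -i\hbar z^2\Psi'(z^2)\sigma_1.$$

4.4. Quantization of co-adjoint orbits of SU(2)

In order to describe quantization of singular reduction at quantizable co-adjoint orbits of SU(2), we need a quantization of these orbits. Non-trivial quantizable orbits $O_n$ of SU(2) are spheres in $\mathrm{su}(2)^*$ such that the cohomology class of $\omega_{O_n}$ is equal to $n \in \mathbb{Z}_{\ge 0}$. For the sake of simplicity of presentation we describe the corresponding complex line bundle $\pi_{O_n} : L_{O_n} \to O_n$ as a fiber bundle $\pi_n : L^n \to \mathbb{CP}^1$ associated to the principal $\mathbb{C}^\times$ bundle $\pi : L^\times \to \mathbb{CP}^1$.

Let $\mathbb{C}^\times = \mathbb{C} \setminus \{0\}$ be the multiplicative group of non-zero complex numbers and let $L^\times = \mathbb{C}^2 \setminus \{(0, 0)\}$. The $\mathbb{C}^\times$ action $\varphi : \mathbb{C}^\times \times L^\times \to L^\times : (c, (z_1, z_2)) = (c, z) \mapsto (cz_1, cz_2) = cz$ is free and proper. Hence $L^\times$ is a $\mathbb{C}^\times$-principal bundle with base $L^\times/\mathbb{C}^\times = \mathbb{CP}^1$ and bundle projection map $\pi : L^\times \to \mathbb{CP}^1$. We identify the Lie algebra of $\mathbb{C}^\times$ with $\mathbb{C}$ in such a way that $t \mapsto \exp(2\pi i\zeta t)$ is the one-parameter subgroup of $\mathbb{C}^\times$ corresponding to $\zeta \in \mathbb{C}$. For each $\zeta \in \mathbb{C}$, the vector field on $L^\times$ corresponding to $\zeta$, whose flow is $(t, z) \mapsto \exp(2\pi it\zeta)z$, is given by $X_\zeta = 2\pi i\zeta(z_1\partial_1 + z_2\partial_2)$. The complex 1-form

$$\vartheta(z) = \frac{1}{2\pi i}\,\frac{\bar z_1\, dz_1 + \bar z_2\, dz_2}{z_1\bar z_1 + z_2\bar z_2} = \frac{1}{2\pi i\langle z, z\rangle}\langle dz, z\rangle$$

on $L^\times$ is $\mathbb{C}^\times$-invariant, and $X_\zeta \,\lrcorner\, \vartheta(z) = \zeta$ for every $\zeta \in \mathbb{C}$. Hence, $\vartheta$ is a connection form on $L^\times$. Its exterior differential $\Omega = d\vartheta$ is the curvature form of this connection. The curvature form $\Omega$ corresponds to a symplectic form $\omega$ on $\mathbb{CP}^1$ such that the prequantization condition $\Omega = -(2\pi)^{-1}\omega$ is satisfied, see Appendix A for more details.

Let $\pi_n : L^n \to \mathbb{CP}^1$ be the line bundle associated to $L^\times$, which corresponds to the $\mathbb{C}^\times$ action $\mathbb{C}^\times \times \mathbb{C} \to \mathbb{C} : (c, x) \mapsto c^n x$. Sections $\sigma$ of $\pi_n$ correspond to maps $\sigma^\sharp : L^\times \to \mathbb{C}$ such that $\sigma^\sharp(cz) = c^n\sigma^\sharp(z)$ for each $z \in L^\times$ and each $c \in \mathbb{C}^\times$, see Appendix B for more details. The connection form $\vartheta$ on $L^\times$ gives rise to a connection $\nabla$ on sections $\sigma$ of $\pi_n$ such that $(\nabla_X\sigma)^\sharp = (\operatorname{lift} X) \,\lrcorner\, d\sigma^\sharp$ for every vector field $X$ on $\mathbb{CP}^1$. Here $\operatorname{lift} X$ is the horizontal lift of $X$ to $L^\times$. In other words, $\operatorname{lift} X$ is the unique vector field on $L^\times$ such that $T\pi \circ \operatorname{lift} X = X \circ \pi$ and $\operatorname{lift} X \,\lrcorner\, \vartheta = 0$. For every section $\sigma$ of $L^n$ we have

$$\begin{aligned}
\operatorname{ver} d\sigma^\sharp(X_\zeta(z)) = X_\zeta(z) \,\lrcorner\, d\sigma^\sharp(z) &= \frac{d}{dt}\Big|_{t=0}\sigma^\sharp(\exp(2\pi it\zeta)z)\\
&= \frac{d}{dt}\Big|_{t=0}\exp(2\pi int\zeta)\,\sigma^\sharp(z) = 2\pi in\zeta\,\sigma^\sharp(z)\\
&= 2\pi in\,(X_\zeta(z) \,\lrcorner\, \vartheta(z))\,\sigma^\sharp(z).
\end{aligned}$$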
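In other words, $\operatorname{ver} d\sigma^\sharp = 2\pi in\,\sigma^\sharp\vartheta$. Also

$$\begin{aligned}
(\nabla\sigma)^\sharp(z) = \operatorname{hor} d\sigma^\sharp(z) &= d\sigma^\sharp(z) - \operatorname{ver} d\sigma^\sharp(z) = d\sigma^\sharp(z) - 2\pi in\,\sigma^\sharp(z)\vartheta(z)\\
&= d\sigma^\sharp(z) - \sigma^\sharp(z)\frac{n}{\langle z, z\rangle}\langle dz, z\rangle,
\end{aligned}$$

that is, $d\sigma^\sharp(z) = (\nabla\sigma)^\sharp(z) + \sigma^\sharp(z)\frac{n}{\langle z, z\rangle}\langle dz, z\rangle$. For sections $\sigma_1$ and $\sigma_2$ of the bundle $\pi_n$ consider the Hermitian inner product $\langle\sigma_1(z), \sigma_2(z)\rangle = \frac{\bar\sigma_1^\sharp(z)\sigma_2^\sharp(z)}{\langle z, z\rangle^n}$. For each $c \in \mathbb{C}^\times$, we have

$$\langle\sigma_1(cz), \sigma_2(cz)\rangle = \frac{\bar c^n c^n\,\bar\sigma_1^\sharp(z)\sigma_2^\sharp(z)}{\bar c^n c^n\langle z, z\rangle^n} = \frac{\bar\sigma_1^\sharp(z)\sigma_2^\sharp(z)}{\langle z, z\rangle^n} = \langle\sigma_1, \sigma_2\rangle(z).$$

Hence, $\langle\sigma_1(z), \sigma_2(z)\rangle$ depends only on $\pi(z) \in \mathbb{CP}^1$. Moreover, a short calculation shows that $d\langle\sigma_1, \sigma_2\rangle(z) = \langle\nabla\sigma_1, \sigma_2\rangle(z) + \langle\sigma_1, \nabla\sigma_2\rangle(z)$. Hence $\langle\sigma_1(z), \sigma_2(z)\rangle$ is a Hermitian inner product on sections of $L^n$, which is invariant under the parallel transport defined by the connection $\nabla$. Since the Chern class of $L^n$ is $n$, $L^n$ with connection $\nabla$ and Hermitian inner product $\langle\sigma_1(z), \sigma_2(z)\rangle$ is a prequantization line bundle for $\mathbb{CP}^1$ with the symplectic form $n\omega$.

The complex structure on $L^\times$ gives rise to a polarization of $(\mathbb{CP}^1, n\omega)$ defined by $F = T\pi(\mathrm{span}\{\frac{\partial}{\partial \bar z_1}, \frac{\partial}{\partial \bar z_2}\})$. It is the distribution of antiholomorphic directions in the complex structure of $\mathbb{CP}^1$. We have an action of SU(2) on $L^\times$ given by

$$\mathrm{SU}(2) \times L^\times \to L^\times : (g, z) = \Big(g, \begin{pmatrix} z_1\\ z_2\end{pmatrix}\Big) \mapsto gz = g\begin{pmatrix} z_1\\ z_2\end{pmatrix}.$$

Since $(gz)c = g(zc)$ for all $g \in \mathrm{SU}(2)$ and $c \in \mathbb{C}^\times$, it follows that this SU(2) action induces an action of SU(2) on $\mathbb{CP}^1$. The connection form $\vartheta$ is clearly SU(2)-invariant. The action of SU(2) on $L^\times$ induces an action of SU(2) on $L^n$ and an SU(2) action on sections of $L^n$. For every section $\sigma$ of $L^n$ and every $g \in \mathrm{SU}(2)$ we have $(g\sigma)^\sharp(z) = \sigma^\sharp(g^{-1}z)$. Since $(\nabla\sigma)^\sharp = d\sigma^\sharp - 2\pi in\,\sigma^\sharp\vartheta$ and $\vartheta$ is SU(2)-invariant, it follows that $g(\nabla\sigma) = \nabla(g\sigma)$. So the connection $\nabla$ is SU(2)-invariant. Thus, the action $\sigma \mapsto g\sigma$ is the prequantization representation of SU(2) on the space of sections of $L^n$.

The action of SU(2) on $L^\times$ preserves its complex structure. Hence, the induced action of SU(2) on $L^n$ preserves the polarization $F$. Therefore, the quantization representation of SU(2) corresponding to the polarization $F$ is the restriction of the prequantization representation to the space $\Gamma^\infty_F(L^n)$ of sections of $L^n$ that are covariantly constant along $F$ under $\nabla$. Let $\sigma$ be a section of $L^n$. It is covariantly constant along $F$ if $\nabla_X\sigma = 0$ for every vector field $X$ on $\mathbb{CP}^1$ with values in $F$. Since $(\nabla_X\sigma)^\sharp = (\operatorname{lift} X)\,\sigma^\sharp$ and $\operatorname{lift} X$ has values in $\mathrm{span}\{\frac{\partial}{\partial \bar z_1}, \frac{\partial}{\partial \bar z_2}\}$, it follows that $\sigma \in \Gamma^\infty_F(L^n)$ if and only if $\sigma^\sharp$ is a holomorphic function of $z = (z_1, z_2)$.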
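Claim 4.19. In fact $\sigma^\sharp$ is a homogeneous polynomial of degree $n$ in $(z_1, z_2)$.

Proof. See Lemma B.2 in Appendix B.

Therefore the quantization representation of SU(2) on homogeneous polynomials $\mathbb{C}^n[z]$ of degree $n$ on $(\mathbb{C}^2)^*$ is given by

$$R_n : \mathrm{SU}(2) \times \mathbb{C}^n[z] \to \mathbb{C}^n[z] : (g, p_n) \mapsto g \cdot p_n, \tag{94}$$

where $g \cdot p_n(z) = p_n(g^{-1}z)$. Infinitesimalizing $R_n$ gives

$$\rho_n : \mathrm{su}(2) \times \mathbb{C}^n[z] \to \mathbb{C}^n[z] : (\xi, p_n) \mapsto \frac{d}{dt}\Big|_{t=0}((\exp -t\xi)z)^*p_n = -L_{X_\xi}p_n,$$

where $X_\xi$ is the vector field on $\mathbb{C}^2$ whose flow is $\mathbb{R} \times \mathbb{C}^2 \to \mathbb{C}^2 : (t, z) \mapsto (\exp t\xi)z$. Observe that $\rho_n$ is a representation of the Lie algebra su(2) on $\mathbb{C}^n[z]$. In greater detail, $\mathrm{su}(2) = \Big\{\begin{pmatrix} ix & -y - iz\\ y - iz & -ix\end{pmatrix} \in \mathrm{gl}(2, \mathbb{C}) \,\Big|\, (x, y, z) \in \mathbb{R}^3\Big\}$ has a basis

$$e_1 = \begin{pmatrix} i & 0\\ 0 & -i\end{pmatrix}, \qquad e_2 = \begin{pmatrix} 0 & -1\\ 1 & 0\end{pmatrix}, \qquad \text{and} \qquad e_3 = \begin{pmatrix} 0 & -i\\ -i & 0\end{pmatrix},$$

which satisfy the bracket relations

$$[e_1, e_2] = e_1e_2 - e_2e_1 = 2e_3, \qquad [e_2, e_3] = 2e_1 \qquad \text{and} \qquad [e_3, e_1] = 2e_2.$$

Moreover, the linear differential operators

$$L_1 = -i(z_1\partial_1 - z_2\partial_2), \qquad L_2 = -(-z_2\partial_1 + z_1\partial_2) \qquad \text{and} \qquad L_3 = i(z_2\partial_1 + z_1\partial_2)$$

satisfy the bracket relations

$$[L_1, L_2] = L_1L_2 - L_2L_1 = 2L_3, \qquad [L_2, L_3] = 2L_1 \qquad \text{and} \qquad [L_3, L_1] = 2L_2.$$

Let $H = z_1\partial_1 - z_2\partial_2$, $E = z_1\partial_2$, and $F = z_2\partial_1$. Then

$$[H, E] = 2E, \qquad [H, F] = -2F \qquad \text{and} \qquad [E, F] = H.$$

So $\{H, E, F\}$ defines a representation $\check\rho_n$ of sl(2, C) on $\mathbb{C}^n[z]$. Using the standard basis $\{z_1^{n-\ell}z_2^{\ell} \mid \ell = 0, 1, \dots, n\}$ for $\mathbb{C}^n[z]$, the $(n + 1) \times (n + 1)$ matrix representations of $H$, $E$, and $F$ are

$$\begin{pmatrix} n & & & & \\ & n-2 & & & \\ & & n-4 & & \\ & & & \ddots & \\ & & & & -n\end{pmatrix}, \qquad \begin{pmatrix} 0 & 1 & & & \\ & 0 & 2 & & \\ & & \ddots & \ddots & \\ & & & 0 & n\\ & & & & 0\end{pmatrix}, \qquad \text{and} \qquad \begin{pmatrix} 0 & & & & \\ n & 0 & & & \\ & n-1 & \ddots & & \\ & & \ddots & 0 & \\ & & & 1 & 0\end{pmatrix},$$

respectively. Since the only proper invariant subspace of the matrix representation of $E$ and $F$ on $\mathbb{C}^n[z]$ is spanned by $z_1^n$ and $z_2^n$, respectively, there is no proper subspace of $\mathbb{C}^n[z]$ which is invariant under $H$, $E$, and $F$. In other words, the sl(2, C) representation $\check\rho_n$ of $\{H, E, F\}$ on $\mathbb{C}^n[z]$ is irreducible. Consequently, the su(2) representation $\rho_n$ is irreducible. This shows that

Corollary 4.20. The quantization SU(2) representation $R_n$ on $\mathbb{C}^n[z]$, given by (94), is irreducible.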
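The matrix representations above can be generated and tested for small $n$. The sketch below is an illustration under the assumption that column $\ell$ of each matrix holds the image of the basis monomial $z_1^{n-\ell}z_2^{\ell}$, so that operator commutators become ordinary matrix commutators. The function names are hypothetical.

```python
def sl2_matrices(n):
    # Matrices of H = z1 d1 - z2 d2, E = z1 d2, F = z2 d1 on the basis
    # b_l = z1^(n-l) z2^l, l = 0..n. Column l is the image of b_l.
    size = n + 1
    H = [[0] * size for _ in range(size)]
    E = [[0] * size for _ in range(size)]
    F = [[0] * size for _ in range(size)]
    for l in range(size):
        H[l][l] = n - 2 * l          # H b_l = (n - 2l) b_l
        if l >= 1:
            E[l - 1][l] = l          # E b_l = l b_(l-1)
        if l < n:
            F[l + 1][l] = n - l      # F b_l = (n - l) b_(l+1)
    return H, E, F

def matmul(A, B):
    size = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(size)) for c in range(size)]
            for r in range(size)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    size = len(A)
    return [[AB[r][c] - BA[r][c] for c in range(size)] for r in range(size)]

def killed_basis_vectors(M):
    # Indices l whose basis monomial b_l is sent to zero by M (zero columns).
    size = len(M)
    return [c for c in range(size) if all(M[r][c] == 0 for r in range(size))]

H, E, F = sl2_matrices(3)
print(commutator(H, E) == [[2 * x for x in row] for row in E])
print(commutator(E, F) == H)
print(killed_basis_vectors(E), killed_basis_vectors(F))
```

For $n = 3$ the only basis monomial killed by $E$ is $z_1^3$ (index 0) and the only one killed by $F$ is $z_2^3$ (index 3), which is the feature of these matrices that the irreducibility argument relies on.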
4.5. Quantization of reduction at a non-zero co-adjoint orbit

The SU(2)-co-adjoint orbit $O_n$ is $(\mathbb{CP}^1, \omega_{O_n} = n\omega)$, where $n \in \mathbb{Z}$ and $\omega$ is the imaginary part of the Fubini–Study hermitian metric on complex one-dimensional projective space $\mathbb{CP}^1$. Let $\pi_n : L^n \to \mathbb{CP}^1$ be the prequantum line bundle with Chern class $n$ and metric covariant derivative $\nabla$. The polarization $F_{O_n} = T\pi(\mathrm{span}\{\frac{\partial}{\partial \bar z_1}, \frac{\partial}{\partial \bar z_2}\})$, where $\pi : L^\times = \mathbb{C}^2 \setminus \{(0, 0)\} \to \mathbb{CP}^1$ is the orbit map of the $\mathbb{C}^\times$ action $\mathbb{C}^\times \times L^\times \to L^\times : (c, z) \mapsto cz$, is a positive Kähler polarization on the complex Kähler manifold $O_n$. The space of smooth sections of $\pi_n$, which are covariantly constant under $\nabla$ along $F$ is $\mathbb{C}^n[z]$, the space of homogeneous polynomials on $(\mathbb{C}^2)^*$ of degree $n$. The linear action of SU(2) on $L^\times$ commutes with the $\mathbb{C}^\times$-action on $L^\times$ given above and thus induces the SU(2)-action

$$R_{O_n} : \mathrm{SU}(2) \times \mathbb{C}^n[z] \to \mathbb{C}^n[z] : (g, p_n) \mapsto g \cdot p_n,$$

where $g \cdot p_n(z) = p_n(g^{-1}z)$. The map $R_{O_n}$ is an $(n + 1)$-dimensional quantum representation of SU(2), which is irreducible. Its infinitesimalization is the irreducible su(2) representation

$$\rho_n : \mathrm{su}(2) \times \mathbb{C}^n[z] \to \mathbb{C}^n[z] : (\xi, p_n) \mapsto -L_{X_\xi}p_n.$$

By results of Sec. 3.2.4 reduction of the momentum map $J$ of the SU(2)-action $\varphi$ on $T^*\mathbb{R}^4$ at the SU(2)-co-adjoint orbit $O_n$ is the same as reduction at $0$ of the SU(2)-momentum mapping $J_{T^*\mathbb{R}^4 \times O_n}$ associated to the SU(2)-action

$$(A, ((q, p), \mu)) \mapsto ((\Phi(A)q, \Phi(A)p), \mathrm{Ad}^T_{\Phi(A)^{-1}}\mu)$$

on $(T^*\mathbb{R}^4 \times O_n,\ \pi_1^*\omega - \pi_2^*\omega_{O_n})$. The argument given in Sec. 3.2.4 shows that the representation space of the corresponding quantum su(2) representation is $H_F \otimes \overline{\mathbb{C}^n[z]}$, which is the space of sections of the prequantum line bundle over $T^*\mathbb{R}^4 \times O_n$, which are covariantly constant along the positive Kähler polarization $F \otimes F_{O_n}$. In addition, the corresponding quantum su(2) representation is $\rho \otimes \bar\rho_n$. The reduced quantum su(2) representation is the subspace of $H_F \otimes \overline{\mathbb{C}^n[z]}$ which is spanned by su(2)-invariant vectors, that is, $\ker\{Q_{J_j} \otimes L_{\bar X_{J_k}} \mid j, k = 1, 2, 3\}$.

We now determine the reduced quantum su(2) representation. Recall that Corollary 4.15 describes the highest weight module of the sl(2, C) representation $\check\rho$ corresponding to the su(2) representation $\rho$ as follows. As an $R = \mathbb{C}_\mu\{z\}^{\mathrm{sl}(2,\mathbb{C})}$ module, the highest weight vector module of $\check\rho$ has a basis $\{(z_1 + iz_2)^k(z_3 + iz_4)^\ell \mid (k, \ell) \in (\mathbb{Z}_{\ge 0})^2\}$. In other words, as an $R$ module $\check\rho = \sum_{m=0}^{\infty}(m + 1)\check\rho_m$, where $\check\rho_m$ is an irreducible sl(2, C) representation of dimension $m + 1$. Therefore

$$\check\rho \otimes \bar{\check\rho}_n = \sum_{m=0}^{\infty}(m + 1)\,\check\rho_m \otimes \bar{\check\rho}_n.$$
The Clebsch–Gordan formula states that if $m \le n$, then

$$\check\rho_m \otimes \bar{\check\rho}_n = \check\rho_{n+m} + \check\rho_{n+m-2} + \check\rho_{n+m-4} + \cdots + \check\rho_{n-m}. \tag{95}$$
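A quick consistency check of (95) is to compare dimensions on both sides, using the convention stated in the text that $\check\rho_m$ has dimension $m + 1$. The helper below is a hypothetical illustration, not part of the paper.

```python
def clebsch_gordan_dimensions(m, n):
    # Dimensions of the summands rho_(n+m), rho_(n+m-2), ..., rho_|n-m| in (95),
    # where rho_k has dimension k + 1.
    lo, hi = abs(n - m), n + m
    return [k + 1 for k in range(hi, lo - 1, -2)]

for m in range(4):
    for n in range(m, 6):
        dims = clebsch_gordan_dimensions(m, n)
        # Left side has dimension (m + 1)(n + 1); the trivial summand has dimension 1.
        print(m, n, dims, sum(dims) == (m + 1) * (n + 1), dims.count(1))
```

The dimension sums agree in every case tried, and a one-dimensional (trivial) summand appears exactly once when $m = n$ and not at all otherwise, which is the fact used next.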
The one-dimensional irreducible representation $\check\rho_0$ is the trivial representation $1$, which corresponds to the subspace $(\check\rho_m \otimes \bar{\check\rho}_n)^{\mathrm{sl}(2,\mathbb{C})}$ of sl(2, C) invariant elements of $H_F \otimes \overline{\mathbb{C}^n[z]}$. Using (95) we obtain

$$(\check\rho_m \otimes \bar{\check\rho}_n)^{\mathrm{sl}(2,\mathbb{C})} = \begin{cases} 1, & \text{if } n = m\\ 0, & \text{otherwise.}\end{cases}$$

Thus the $R$ module of sl(2, C) invariant elements of $\check\rho \otimes \bar{\check\rho}_n$ is $n + 1$ copies of $R$, that is, $\sum_{i=1}^{n+1} R$. This translates to

Claim 4.21. The su(2) reduced quantum representation $\rho \otimes \bar\rho_n$ on $H_F \otimes \overline{\mathbb{C}^n[z]}$ as a module over the ring $R = \mathbb{C}_\mu\{z_1^2 + z_2^2 + z_3^2 + z_4^2\}$ is isomorphic to $n + 1$ copies of $R$. Here $\mathbb{C}_\mu\{z_1^2 + z_2^2 + z_3^2 + z_4^2\}$ is the space of holomorphic functions on $\mathbb{C}^4$, which are square integrable with respect to the measure $\mu$.

Appendix A. A Principal $\mathbb{C}^\times$ Line Bundle over $\mathbb{CP}^1$

Let $L^\times = \mathbb{C}^2 \setminus \{(0, 0)\}$. Consider the $\mathbb{C}^\times$-action

$$\varphi : \mathbb{C}^\times \times L^\times \to L^\times : (c, (z_1, z_2)) = (c, z) \mapsto (cz_1, cz_2) = c \cdot z.$$

Fact A.1. The action $\varphi$ is free and proper.

Proof. To show that the action $\varphi$ is free, suppose that $(cz_1, cz_2) = (z_1, z_2)$. Then either $cz_1 = z_1$ and $z_1 \ne 0$ or $cz_2 = z_2$ and $z_2 \ne 0$. Consequently, $c = 1$. Thus the action is free. To show that the action $\varphi$ is proper, it suffices to show that the map

$$\Phi : \mathbb{C}^\times \times L^\times \to L^\times \times L^\times : (c, z) \mapsto (z, cz)$$

is proper, that is, if $K$ is a compact subset of $L^\times \times L^\times$, then $\Phi^{-1}(K)$ is a compact subset of $\mathbb{C}^\times \times L^\times$. Towards this goal suppose that $\{(z_n, c_nz_n)\}$ is a sequence in $K$. Because $K$ is compact, there is a subsequence $\{(z_{n_k}, c_{n_k}z_{n_k})\}$ which converges to $(u, v) = (u^1, u^2, v^1, v^2) \in K$. Therefore neither $u$ nor $v$ is equal to $0$. If $u^j \ne 0$ and $v^\ell \ne 0$ for some $j, \ell \in \{1, 2\}$, then

$$\lim_{k\to\infty} c_{n_k} = \lim_{k\to\infty}\frac{c_{n_k}z^{\ell}_{n_k}}{z^{j}_{n_k}} = \frac{v^\ell}{u^j} = c_* \ne 0.$$

So the sequence $\{(c_{n_k}, z_{n_k})\}$ converges to $(c_*, u)$. Also

$$v^1u^2 = \lim_{k\to\infty}(c_{n_k}z^1_{n_k})(z^2_{n_k}) = \lim_{k\to\infty}(c_{n_k}z^2_{n_k})(z^1_{n_k}) = v^2u^1. \tag{96}$$

To be concrete suppose that $j = \ell = 1$. (The other cases are handled in a similar fashion.) Then

$$(u, v) = (u^1, u^2, c_*u^1, c_*u^2) \ \text{(using (96))} \ = (u, c_*u) = \Phi(c_*, u).$$

Consequently, $(c_*, u) \in \Phi^{-1}(K)$. Hence $\Phi$ is a proper mapping.
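Let $\pi : L^\times \to \mathbb{CP}^1 : (z_1, z_2) \mapsto [z_1 : z_2]$ be the orbit mapping of the $\mathbb{C}^\times$-action $\varphi$. Here $[z_1 : z_2]$ are homogeneous coordinates on $\mathbb{CP}^1$, which is the orbit space $L^\times/\mathbb{C}^\times$. We want to calculate the Chern class of the $\mathbb{C}^\times$-principal bundle $\pi$. Towards this goal consider the 1-form

$$\vartheta(z) = \frac{1}{2\pi i\langle z, z\rangle}(\bar z_1\, dz_1 + \bar z_2\, dz_2) = \frac{1}{2\pi i\langle z, z\rangle}\langle dz, z\rangle,$$

where $\langle\ ,\ \rangle$ is the standard Hermitian inner product on $\mathbb{C}^2$ defined by $\langle z, w\rangle = z_1\bar w_1 + z_2\bar w_2$. The 1-form $\vartheta$ has the following properties:

(1) $\vartheta$ is invariant under the $\mathbb{C}^\times$-action $\varphi$, that is, for every $c \in \mathbb{C}^\times$, we have $\varphi_c^*\vartheta = \vartheta$.

(2) For every $\zeta \in \mathbb{C}$, the infinitesimal generator of the action $\varphi$ is the vector field $X_\zeta(z) = 2\pi i\zeta z$ and

$$X_\zeta \,\lrcorner\, \vartheta = \zeta. \tag{97}$$

(3) $\ker\vartheta(z) = \mathrm{span}_{\mathbb{C}}\{Y(z) \in \mathbb{C}^2 \mid \langle Y(z), z\rangle = 0\}$.

From the properties above it follows that

Fact A.2. $\vartheta$ is a connection 1-form on the $\mathbb{C}^\times$-principal bundle $\pi : L^\times \to \mathbb{CP}^1$.

Proof. To see this note that $T_zL^\times = \operatorname{hor} T_zL^\times \oplus \operatorname{ver} T_zL^\times$, where $\operatorname{hor} T_zL^\times = \ker\vartheta(z)$ and $\operatorname{ver} T_zL^\times = \mathrm{span}\{X_\zeta(z) \in \mathbb{C}^2 \mid \zeta \in \mathbb{C}\}$. Because $\vartheta$ is $\mathbb{C}^\times$-invariant, its kernel is also, that is, $\operatorname{hor} T_{cz}L^\times = \operatorname{hor} T_zL^\times$ for every $c \in \mathbb{C}^\times$. Also the normalization condition (97) holds. This shows that $\vartheta$ is a principal $\mathbb{C}^\times$ connection, provided that properties (1)–(3) hold. To prove (1) we compute

$$\varphi_c^*\vartheta(z) = \vartheta(cz) = \frac{1}{2\pi i\langle cz, cz\rangle}\langle d(cz), cz\rangle = \frac{1}{|c|^2}\,\frac{1}{2\pi i\langle z, z\rangle}\,c\,\langle dz, cz\rangle = \vartheta(z).$$

To show (2) we note that the exponential map $\exp : \mathbb{C} \to \mathbb{C}^\times : \zeta \mapsto e^{2\pi i\zeta}$ identifies the Lie algebra of $\mathbb{C}^\times$ with $\mathbb{C}$. Therefore the $\mathbb{C}^\times$-action $\varphi$ is given by $(e^{2\pi i\zeta}, z) \mapsto e^{2\pi i\zeta}z$. Consequently, the vector field

$$X_\zeta(z) = \frac{d}{dt}\Big|_{t=0}e^{2\pi it\zeta}z = 2\pi i\zeta z,$$

that is, $X_\zeta(z) = 2\pi i\zeta\big(z_1\frac{\partial}{\partial z_1} + z_2\frac{\partial}{\partial z_2}\big)$, is the infinitesimal generator of the $\mathbb{C}^\times$-action $\varphi$. So

$$X_\zeta(z) \,\lrcorner\, \vartheta(z) = \zeta\,\frac{1}{\langle z, z\rangle}\Big(z_1\frac{\partial}{\partial z_1} + z_2\frac{\partial}{\partial z_2}\Big) \,\lrcorner\, (\bar z_1\, dz_1 + \bar z_2\, dz_2) = \zeta\,\frac{1}{\langle z, z\rangle}(z_1\bar z_1 + z_2\bar z_2) = \zeta.$$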
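Finally, the vector field $Y(z) = Y_1 \frac{\partial}{\partial z_1} + Y_2 \frac{\partial}{\partial z_2}$ lies in $\ker \vartheta(z)$ if and only if $0 = Y(z) \mathbin{\lrcorner} \vartheta(z)$, that is, $0 = Y_1 \bar z_1 + Y_2 \bar z_2 = \langle Y(z), z\rangle$. Thus $\ker \vartheta(z) = \mathrm{hor}\, T_z L^{\times}$.

We now want to calculate the curvature $\Omega = d\vartheta$ of $\vartheta$. We compute the exterior derivative of $\vartheta$ as follows.
\begin{align*}
d\vartheta &= (2\pi i)^{-1}\left( d\langle z, z\rangle^{-1} \wedge \langle dz, z\rangle + \langle z, z\rangle^{-1} \langle dz, dz\rangle \right) \\
&= (2\pi i)^{-1}\left[ -\frac{1}{\langle z, z\rangle^{2}}\,(\langle dz, z\rangle + \langle z, dz\rangle) \wedge \langle dz, z\rangle + \frac{1}{\langle z, z\rangle}\,\langle dz, dz\rangle \right] \\
&= (2\pi i)^{-1}\,\frac{1}{\langle z, z\rangle^{2}}\,\left[ \langle dz, z\rangle \wedge \langle z, dz\rangle + \langle z, z\rangle \langle dz, dz\rangle \right] \\
&= (2\pi i)^{-1}\,\frac{1}{\langle z, z\rangle^{2}}\,\left[ (\bar z_1\, dz_1 + \bar z_2\, dz_2) \wedge (z_1\, d\bar z_1 + z_2\, d\bar z_2) - (z_1 \bar z_1 + z_2 \bar z_2)(dz_1 \wedge d\bar z_1 + dz_2 \wedge d\bar z_2) \right] \\
&= (2\pi i)^{-1}\,\frac{1}{\langle z, z\rangle^{2}}\,\left[ -z_2 \bar z_2\, dz_1 \wedge d\bar z_1 + z_2 \bar z_1\, dz_1 \wedge d\bar z_2 - z_1 \bar z_1\, dz_2 \wedge d\bar z_2 + z_1 \bar z_2\, dz_2 \wedge d\bar z_1 \right]. \tag{98}
\end{align*}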
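From
\[
X_\zeta \mathbin{\lrcorner} d\vartheta = \frac{1}{\langle z, z\rangle^{2}}\,\left( -z_1 z_2 \bar z_2\, d\bar z_1 + z_1 z_2 \bar z_1\, d\bar z_2 - z_2 z_1 \bar z_1\, d\bar z_2 + z_2 \bar z_2 z_1\, d\bar z_1 \right) = 0
\]
it follows that $d\vartheta$ pushes forward to a 2-form $\omega$ on $\mathbb{CP}^1$ under the $\mathbb{C}^{\times}$-orbit map $\pi$. On $\mathbb{CP}^1$ we define the 2-form $\omega$ by
\[
(\pi^{*}\omega)(z_1, z_2) = \frac{1}{(|z_1|^2 + |z_2|^2)^2}\,(z_2\, dz_1 - z_1\, dz_2) \wedge (\bar z_2\, d\bar z_1 - \bar z_1\, d\bar z_2). \tag{99}
\]
To check that $\omega$ is well defined, we use the charts
\[
\varphi_1 : U_1 = \{[z_1 : z_2] \in \mathbb{CP}^1 \mid z_2 \neq 0\} \to W_1 \subseteq \mathbb{C} : [z_1 : z_2] \mapsto \frac{z_1}{z_2} = w_1
\]
and
\[
\varphi_2 : U_2 = \{[z_1 : z_2] \in \mathbb{CP}^1 \mid z_1 \neq 0\} \to W_2 \subseteq \mathbb{C} : [z_1 : z_2] \mapsto \frac{z_2}{z_1} = w_2
\]
for $\mathbb{CP}^1$. The overlap map is
\[
\varphi_{12} = \varphi_2 \circ \varphi_1^{-1} : W_1 \cap W_2 \to W_1 \cap W_2 : w_1 \mapsto w_2 = w_1^{-1}.
\]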
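Let $V_1 = \pi^{-1}(U_1) \subseteq L^{\times}$ and $V_2 = \pi^{-1}(U_2) \subseteq L^{\times}$. Now
\[
\pi^{*}\omega \,|\, V_1 = \omega \,|\, U_1 = (\varphi_1^{-1})^{*}\, \frac{dw_1 \wedge d\bar w_1}{(1 + |w_1|^2)^2}\,\Big|\, U_1
\]
and
\[
\pi^{*}\omega \,|\, V_2 = \omega \,|\, U_2 = (\varphi_2^{-1})^{*}\, \frac{dw_2 \wedge d\bar w_2}{(1 + |w_2|^2)^2}\,\Big|\, U_2.
\]
So
\[
\varphi_{12}^{*}(\varphi_2^{-1})^{*}(\omega \,|\, (U_1 \cap U_2)) = \frac{dw_1^{-1} \wedge d\bar w_1^{-1}}{(1 + |w_1^{-1}|^2)^2}\,\Big|\, (W_1 \cap W_2) = \frac{dw_1 \wedge d\bar w_1}{(1 + |w_1|^2)^2}\,\Big|\, (W_1 \cap W_2) = (\varphi_1^{-1})^{*}\omega \,|\, (U_1 \cap U_2).
\]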
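The middle equality in the last display rests on the substitution $w_2 = w_1^{-1}$: since $d(w^{-1}) = -w^{-2}\,dw$, the 2-form $dw_2 \wedge d\bar w_2$ picks up the factor $|w_1|^{-4}$, which cancels against the change in the denominator. The short sketch below checks this cancellation numerically at a few sample points. The function names and sample values are illustrative assumptions, not part of the paper.

```python
def density(w):
    """Coefficient of dw ^ d(conj w) in the chart expression of omega."""
    return 1.0 / (1.0 + abs(w) ** 2) ** 2

def pulled_back_density(w):
    """Coefficient obtained after substituting w2 = 1/w in the second chart.

    d(1/w) = -dw / w**2, so dw2 ^ d(conj w2) contributes the factor |w|**-4.
    """
    return density(1.0 / w) * abs(w) ** -4

# Arbitrary nonzero sample points in W1 ∩ W2 (hypothetical values).
samples = [0.3 + 0.4j, -2.0 + 1.0j, 5.0j, 1e-2 - 3e-2j]

# Relative mismatch between the two chart expressions at each sample point.
max_relative_gap = max(
    abs(pulled_back_density(w) - density(w)) / density(w) for w in samples
)
print(max_relative_gap)
```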
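Thus $\omega$ is a non-zero, well-defined 2-form on $\mathbb{CP}^1$. Moreover, from (98) and (99) it follows that $\pi^{*}\omega = d\vartheta$. Thus we have proved

Claim A.3. The $\mathbb{C}^{\times}$-principal bundle $\pi : L^{\times} \to \mathbb{CP}^1$ has Chern class 1.

Appendix B. Associated Line Bundles

Let $L^{\times}$ be a principal $\mathbb{C}^{\times}$-bundle over a symplectic manifold $(\mathbb{CP}^1, \omega)$ with bundle projection map $\pi : L^{\times} \to \mathbb{CP}^1$. Using the exponential map $\exp : \mathbb{C} \to \mathbb{C}^{\times} : \zeta \mapsto e^{2\pi i \zeta}$ we may identify the Lie algebra of $\mathbb{C}^{\times}$ with $\mathbb{C}$. For each $z \in L^{\times}$ we have $\ker T_z\pi = \mathrm{span}\{X_\zeta(z) \mid \zeta \in \mathbb{C}\}$, where
\[
X_\zeta(z) = \left.\frac{d}{dt}\right|_{t=0} (\exp 2\pi i t \zeta) \cdot z = 2\pi i \zeta z.
\]
On $L^{\times}$ let $\mathrm{hor}\, TL^{\times}$ define a $\mathbb{C}^{\times}$-principal connection, that is, for every $z \in L^{\times}$

(1) $T_z L^{\times} = \mathrm{hor}\, T_z L^{\times} \oplus \mathrm{ver}\, T_z L^{\times}$, where $\mathrm{ver}\, T_z L^{\times} = \ker T_z\pi$;
(2) $T_z\pi(\mathrm{hor}\, T_z L^{\times}) = T_{\pi(z)}\mathbb{CP}^1$;
(3) $\mathrm{hor}\, T_{cz} L^{\times} = \mathrm{hor}\, T_z L^{\times}$, for every $c \in \mathbb{C}^{\times}$.

The principal $\mathbb{C}^{\times}$ connection on $L^{\times}$ is given by a principal connection 1-form $\Theta$, which vanishes on $\mathrm{hor}\, TL^{\times}$, is invariant under the $\mathbb{C}^{\times}$ action on $L^{\times}$, and satisfies the normalization condition
\[
(X_\zeta \mathbin{\lrcorner} \Theta)(z) = \zeta,
\]
for every $z \in L^{\times}$ and every $\zeta \in \mathbb{C}$.

We now construct the complex line bundle $L_n$ associated to the bundle $\pi : L^{\times} \to \mathbb{CP}^1$. Consider the $\mathbb{C}^{\times}$ action
\[
\phi : \mathbb{C}^{\times} \times (L^{\times} \times \mathbb{C}) \to L^{\times} \times \mathbb{C} : (c, (z, x)) \mapsto (cz, c^{n} x). \tag{100}
\]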
This action is free and proper. So its orbit space L× ×C× C is a complex line bundle over CP1 , which we denote by Ln . Let λn : L× × C → L× ×C× C be the orbit map
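of the $\mathbb{C}^{\times}$ action (100). Let $\rho : L^{\times} \times \mathbb{C} \to \mathbb{CP}^1 : (z, x) \mapsto \pi(z)$. For every $c \in \mathbb{C}^{\times}$, we have $\rho(cz, c^{n}x) = \pi(cz) = \pi(z)$, where the second equality follows because $\pi : L^{\times} \to \mathbb{CP}^1$ is the bundle projection map of a $\mathbb{C}^{\times}$-principal bundle. Therefore $\rho$ induces a map $\pi_n : L_n \to \mathbb{CP}^1$, which is the projection map of the complex line bundle $\pi_n : L_n \to \mathbb{CP}^1$.

We now find local trivializations for the complex line bundle $\pi_n : L_n \to \mathbb{CP}^1$. Consider the open sets $V_1 = \{(z_1, z_2, w) \in L^{\times} \times \mathbb{C} \mid z_2 \neq 0\}$ and $V_2 = \{(z_1, z_2, w) \in L^{\times} \times \mathbb{C} \mid z_1 \neq 0\}$. Here $L^{\times} = \mathbb{C}^2 \setminus \{(0, 0)\}$. Then $V_i$ for $i = 1, 2$ are invariant under the $\mathbb{C}^{\times}$ action $\phi$ (100) and $\{V_i\}_{i=1}^{2}$ is an open covering of $L^{\times} \times \mathbb{C}$. Therefore $\widetilde U_i = \lambda_n(V_i)$ for $i = 1, 2$ are open subsets of $L_n$ and $\{\widetilde U_i\}_{i=1}^{2}$ form an open covering of $L_n$. The maps
\[
\psi_1 : W_1 \times \mathbb{C} \subseteq \mathbb{C} \times \mathbb{C} \to \widetilde U_1 \subseteq L_n : (w_1, x) \mapsto \lambda_n(w_1, 1, x)
\]
and
\[
\psi_2 : W_2 \times \mathbb{C} \subseteq \mathbb{C} \times \mathbb{C} \to \widetilde U_2 \subseteq L_n : (w_2, x) \mapsto \lambda_n(1, w_2, x)
\]
are holomorphic parametrizations of $L_n$. As a consequence of our discussion of holomorphic sections of the complex line bundle $L_n \to \mathbb{CP}^1$ below, the overlap map is
\[
\psi_{12} = \psi_2^{-1} \circ \psi_1 : (W_1 \cap W_2) \times \mathbb{C} \to (W_1 \cap W_2) \times \mathbb{C} : (w_1, x) \mapsto (w_2, x') = (w_1^{-1}, w_1^{-n} x).
\]
Let $\tau_1 : W_1 \times \mathbb{C} \to W_1 : (w_1, u) \mapsto w_1$ and $\tau_2 : W_2 \times \mathbb{C} \to W_2 : (w_2, u) \mapsto w_2$. Since $\pi_n \circ \psi_i = \varphi_i^{-1} \circ \tau_i$ for $i = 1, 2$, it follows that $\tau_i$ for $i = 1, 2$ are local holomorphic trivializations of the complex line bundle $\pi_n : L_n \to \mathbb{CP}^1$.

Let $\sigma : \mathbb{CP}^1 \to L_n$ be a section. Then $\sigma$ lifts to a unique map $\Sigma : L^{\times} \to L^{\times} \times \mathbb{C}$, which has the following properties:

(1) It is a section of the bundle $\Pi_1 : L^{\times} \times \mathbb{C} \to L^{\times} : (z, w) \mapsto z$;
(2) for every $z \in L^{\times}$ it satisfies $(\lambda_n \circ \Sigma)(z) = (\sigma \circ \pi)(z)$;
(3) it intertwines the $\mathbb{C}^{\times}$ action on $L^{\times}$ with the $\mathbb{C}^{\times}$ action on $L^{\times} \times \mathbb{C}$, that is, for every $c \in \mathbb{C}^{\times}$ and every $z \in L^{\times}$ we have $\Sigma(cz) = c\,\Sigma(z)$.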
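Since $\Sigma$ is a section of the bundle $\Pi_1$, we may write $\Sigma(z) = (z, \sigma^{\sharp}(z))$ for a unique function $\sigma^{\sharp} : L^{\times} \to \mathbb{C}$. Because $\Sigma$ covers the section $\sigma$ we have $\lambda_n(\Sigma(z)) = \sigma(\pi(z))$ for every $z \in L^{\times}$. In other words, $\sigma$ assigns to $\pi(z)$ the unique $\mathbb{C}^{\times}$ orbit in $L^{\times} \times \mathbb{C}$ through $(z, \sigma^{\sharp}(z))$. Thus the section $\sigma$ uniquely determines the function $\sigma^{\sharp}$. Because $\Sigma$ intertwines the $\mathbb{C}^{\times}$ actions, for every $c \in \mathbb{C}^{\times}$ and every $z \in L^{\times}$ we have
\[
(cz, \sigma^{\sharp}(cz)) = \Sigma(cz) = c\,\Sigma(z) = (cz, c^{n}\sigma^{\sharp}(z)).
\]
Therefore $\sigma^{\sharp}(cz) = c^{n}\sigma^{\sharp}(z)$. This proves the first part of

Lemma B.1. Corresponding to every section $\sigma : \mathbb{CP}^1 \to L_n$ there is a unique mapping $\sigma^{\sharp} : L^{\times} \to \mathbb{C}$ such that $\sigma^{\sharp}(cz) = c^{n}\sigma^{\sharp}(z)$ for every $c \in \mathbb{C}^{\times}$ and every $z \in L^{\times}$, and conversely.

To prove the converse let $\sigma^{\sharp} : L^{\times} \to \mathbb{C}$ be a function such that $\sigma^{\sharp}(cz) = c^{n}\sigma^{\sharp}(z)$ for every $c \in \mathbb{C}^{\times}$ and every $z \in L^{\times}$. Let $\Sigma : L^{\times} \to L^{\times} \times \mathbb{C} : z \mapsto (z, \sigma^{\sharp}(z))$. Then $\Sigma(cz) = (cz, \sigma^{\sharp}(cz)) = (cz, c^{n}\sigma^{\sharp}(z)) = c\,\Sigma(z)$. Therefore $\Sigma$ induces a map $\sigma : \mathbb{CP}^1 \to L_n$. Since $\rho(\Sigma(cz)) = \pi(cz) = \pi(z)$, we get $(\pi_n \circ \sigma)(\pi(z)) = \pi(z)$. Therefore $\sigma$ is a section of the bundle $\pi_n : L_n \to \mathbb{CP}^1$.

Suppose that $\sigma : \mathbb{CP}^1 \to L_n$ is a holomorphic section of the complex line bundle $\pi_n : L_n \to \mathbb{CP}^1$. Then there is a unique function $\sigma^{\sharp} : L^{\times} = \mathbb{C}^2 \setminus \{(0, 0)\} \to \mathbb{C}$ such that $\sigma^{\sharp}(cz_1, cz_2) = c^{n}\sigma^{\sharp}(z_1, z_2)$ for every $c \in \mathbb{C}^{\times}$ and every $(z_1, z_2) \in L^{\times}$. In the local trivialization $\tau_i : W_i \times \mathbb{C} \to W_i : (w_i, x) \mapsto w_i$, for $i = 1, 2$, the section $\sigma$ becomes the section $\sigma_i : W_i \to W_i \times \mathbb{C} : w_i \mapsto (w_i, \sigma_i(w_i))$ for $i = 1, 2$. Here $\sigma_1(w_1) = \sigma^{\sharp}(w_1, 1)$ and $\sigma_2(w_2) = \sigma^{\sharp}(1, w_2)$. On $W_1 \cap W_2$ we have
\[
\sigma_2(w_2) = w_2^{n}\, \sigma_1\!\left(\frac{1}{w_2}\right). \tag{101}
\]
To see this, since $z_2 \neq 0$ on $V_1 \cap V_2 \subseteq L^{\times}$ we have
\[
\sigma^{\sharp}(z_1, z_2) = \sigma^{\sharp}\!\left(z_2\left(\frac{z_1}{z_2}, 1\right)\right) = z_2^{n}\,\sigma^{\sharp}(w_1, 1) = z_2^{n}\,\sigma_1(w_1).
\]
Since $z_1 \neq 0$ on $V_1 \cap V_2$ we have
\[
\sigma^{\sharp}(z_1, z_2) = \sigma^{\sharp}\!\left(z_1\left(1, \frac{z_2}{z_1}\right)\right) = z_1^{n}\,\sigma^{\sharp}(1, w_2) = z_1^{n}\,\sigma_2(w_2).
\]
But $w_1$ and $w_2$ are both non-zero on $W_1 \cap W_2$. Therefore
\[
\sigma_2(w_2) = \left(\frac{z_2}{z_1}\right)^{n} \sigma_1(w_1) = w_2^{n}\,\sigma_1\!\left(\frac{1}{w_2}\right).
\]
This proves (101).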
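As a concrete illustration of the homogeneity condition $\sigma^{\sharp}(cz) = c^{n}\sigma^{\sharp}(z)$ and of the transition rule (101), one can take $\sigma^{\sharp}$ to be a monomial $z_1^{a} z_2^{n-a}$ with $0 \le a \le n$. The sketch below checks both identities numerically for every such monomial. The degree `n = 3`, the sample points and the helper names are illustrative assumptions, not taken from the paper.

```python
def sigma_sharp(z1, z2, a, n):
    """Hypothetical test function: the monomial z1**a * z2**(n - a) of degree n."""
    return z1 ** a * z2 ** (n - a)

def sigma1(w1, a, n):
    """Local expression in the first chart: sigma1(w1) = sigma_sharp(w1, 1)."""
    return sigma_sharp(w1, 1.0, a, n)

def sigma2(w2, a, n):
    """Local expression in the second chart: sigma2(w2) = sigma_sharp(1, w2)."""
    return sigma_sharp(1.0, w2, a, n)

n = 3                                # illustrative degree
monomials = list(range(n + 1))       # exponents a = 0, ..., n of z1

# Arbitrary sample data (hypothetical values).
c = -0.5 + 2.0j
z1, z2 = 1.2 + 0.3j, -0.4 + 0.9j
w2 = 0.7 - 1.1j

homogeneity_gaps = []
transition_gaps = []
for a in monomials:
    # Homogeneity: sigma_sharp(c z) = c**n * sigma_sharp(z).
    lhs = sigma_sharp(c * z1, c * z2, a, n)
    rhs = c ** n * sigma_sharp(z1, z2, a, n)
    homogeneity_gaps.append(abs(lhs - rhs))
    # Transition rule (101): sigma2(w2) = w2**n * sigma1(1 / w2).
    transition_gaps.append(abs(sigma2(w2, a, n) - w2 ** n * sigma1(1.0 / w2, a, n)))

max_homogeneity_gap = max(homogeneity_gaps)
max_transition_gap = max(transition_gaps)
print(len(monomials), max_homogeneity_gap, max_transition_gap)
```

There are $n + 1$ such monomials, and they span the homogeneous polynomials of degree $n$ in two variables.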
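Since $\sigma_i$ are holomorphic functions on $W_1 \cap W_2$ for $i = 1, 2$, from (101) it follows that $\sigma_1$ is a polynomial in $w_1$ of degree at most $n$. Consequently, on $V_1 \cap V_2$ the holomorphic section $\sigma^{\sharp}$ is a homogeneous polynomial of degree at most $n$. But $\sigma^{\sharp}(cz_1, cz_2) = c^{n}\sigma^{\sharp}(z_1, z_2)$. Therefore $\sigma^{\sharp}$ is a homogeneous polynomial of degree $n$ on $V_1 \cap V_2$. Because $V_1 \cap V_2$ is an open subset of $L^{\times}$ and therefore of $\mathbb{C}^2$, it follows that $\sigma^{\sharp}$ is a homogeneous polynomial on $\mathbb{C}^2$ (with coordinates $(z_1, z_2)$) of degree $n$. Because the map $\sigma \mapsto \sigma^{\sharp}$ from the space of holomorphic sections of the line bundle $\pi_n : L_n \to \mathbb{CP}^1$ to the space of homogeneous polynomials of degree $n$ on $\mathbb{C}^2$ is a natural isomorphism of vector spaces, we have proved

Lemma B.2. The space of holomorphic sections of the complex line bundle $\pi_n : L_n \to \mathbb{CP}^1$ may be identified with the space of homogeneous polynomials of degree $n$ on $\mathbb{C}^2$.

The curvature 2-form $\Omega$ of the connection 1-form $\Theta$ is $\mathrm{hor}\, d\Theta$. We assume that $\Omega = \pi^{*}\omega$, where $\omega$ is the symplectic form on $\mathbb{CP}^1$. Given a vector field $X$ on $\mathbb{CP}^1$, its horizontal lift to $L^{\times}$ is the unique vector field $\mathrm{lift}\, X$ on $L^{\times}$ with values in $\mathrm{hor}\, TL^{\times}$ such that for every $z \in L^{\times}$
\[
T_z\pi\,(\mathrm{lift}\, X)(z) = X(\pi(z)).
\]
Given a section $\sigma : \mathbb{CP}^1 \to L_n$ and a vector field $X$ on $\mathbb{CP}^1$, the covariant derivative of $\sigma$ with respect to $X$ is the section $\nabla_X \sigma$ such that
\[
(\nabla_X \sigma)^{\sharp} = (\mathrm{lift}\, X)\,\sigma^{\sharp}.
\]
This is well-defined because from
\[
(\nabla_X \sigma)^{\sharp}(cz) = (\mathrm{lift}\, X)\,\sigma^{\sharp}(cz) = (\mathrm{lift}\, X)\,c^{n}\sigma^{\sharp}(z) = c^{n}(\mathrm{lift}\, X)\,\sigma^{\sharp}(z) = c^{n}((\nabla_X \sigma)^{\sharp})(z)
\]
it follows that $\nabla_X \sigma$ is a section of $\pi_n : L_n \to \mathbb{CP}^1$.

Given vector fields $X_1$ and $X_2$ on $\mathbb{CP}^1$, we have
\[
[\mathrm{lift}\, X_1, \mathrm{lift}\, X_2] = \mathrm{hor}[\mathrm{lift}\, X_1, \mathrm{lift}\, X_2] + \mathrm{ver}[\mathrm{lift}\, X_1, \mathrm{lift}\, X_2] = \mathrm{lift}[X_1, X_2] + \mathrm{ver}[\mathrm{lift}\, X_1, \mathrm{lift}\, X_2].
\]
But the 1-form $\Theta$ vanishes on horizontal vectors. So for $z \in L^{\times}$
\begin{align*}
\langle \Theta_z \mid \mathrm{ver}_z[\mathrm{lift}\, X_1, \mathrm{lift}\, X_2] \rangle
&= \langle \Theta_z \mid [\mathrm{lift}\, X_1, \mathrm{lift}\, X_2]_z \rangle - \mathrm{lift}_z X_1 \langle \Theta_z \mid \mathrm{lift}_z X_2 \rangle + \mathrm{lift}_z X_2 \langle \Theta_z \mid \mathrm{lift}_z X_1 \rangle \\
&= -d\Theta_z(\mathrm{lift}_z X_1, \mathrm{lift}_z X_2) = -\Omega_z(\mathrm{lift}_z X_1, \mathrm{lift}_z X_2) = \omega(x)(X_1(x), X_2(x)),
\end{align*}
where $x = \pi(z)$. Therefore using the normalization condition we get
\[
\mathrm{ver}_z[\mathrm{lift}\, X_1, \mathrm{lift}\, X_2] = X_{\omega(x)(X_1(x), X_2(x))}(z).
\]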
Hence,
\[
([\nabla_{X_1}, \nabla_{X_2}]\sigma)^{\sharp}(z) = [(\mathrm{lift}\, X_1), (\mathrm{lift}\, X_2)]\,\sigma^{\sharp}(z) = \mathrm{hor}[(\mathrm{lift}\, X_1), (\mathrm{lift}\, X_2)]\,\sigma^{\sharp}(z) + \mathrm{ver}[(\mathrm{lift}\, X_1), (\mathrm{lift}\, X_2)]\,\sigma^{\sharp}(z).
\]
We now focus on the second term in the preceding expression. We compute
\begin{align*}
\mathrm{ver}[(\mathrm{lift}\, X_1), (\mathrm{lift}\, X_2)]\,\sigma^{\sharp}(z)
&= (X_{\omega(x)(X_1(x), X_2(x))}\,\sigma^{\sharp})(z) \\
&= \left.\frac{d}{dt}\right|_{t=0} \sigma^{\sharp}\big(\exp(t\,\omega(x)(X_1(x), X_2(x)))\,z\big) \\
&= \left.\frac{d}{dt}\right|_{t=0} \big(\exp(t\,\omega(x)(X_1(x), X_2(x)))\big)^{n}\,\sigma^{\sharp}(z) \\
&= n\,\omega(x)(X_1(x), X_2(x))\,\sigma^{\sharp}(z).
\end{align*}
Since
\[
\mathrm{lift}[X_1, X_2]\,\sigma^{\sharp}(z) = (\nabla_{[X_1, X_2]}\sigma)^{\sharp}(z),
\]
we get
\[
[\nabla_{X_1}, \nabla_{X_2}]\sigma - \nabla_{[X_1, X_2]}\sigma = n\,\omega(X_1, X_2)\,\sigma.
\]
Thus we have proved

Proposition B.3. The Chern class of the complex line bundle $\pi_n : L_n \to \mathbb{CP}^1$ is $n$ times the Chern class of the $\mathbb{C}^{\times}$-bundle $\pi : L^{\times} \to \mathbb{CP}^1$, which is 1.

References

[1] R. Abraham, J. E. Marsden and T. Ratiu, Manifolds, Tensor Analysis and Applications, 2nd edn. (Springer-Verlag, New York, 1988).
[2] J. M. Arms, Reduction of Poisson algebras at non-zero momentum values, J. Geom. Phys. 21 (1996) 81–95.
[3] J. M. Arms, R. H. Cushman and M. J. Gotay, A universal reduction procedure for Hamiltonian group actions, in The Geometry of Hamiltonian Systems, ed. T. Ratiu, MSRI Publ., Vol. 20 (Springer Verlag, Berlin, 1991), pp. 33–51.
[4] J. M. Arms, M. J. Gotay and G. Jennings, Geometric and algebraic reduction for singular momentum maps, Adv. Math. 79 (1990) 43–103.
[5] J. M. Arms and D. C. Wilbour, Reduction procedures for Poisson manifolds, in Symplectic Geometry and Mathematical Physics (Aix-en-Provence, 1990), Progr. Math., Vol. 99 (Birkhäuser, Boston, MA, 1991), pp. 462–475.
[6] V. Bargmann, On a Hilbert space of analytic functions, Comm. Pure Appl. Math. 14 (1961) 187–214.
[7] L. Bates and E. Lerman, Proper group actions and symplectic stratified spaces, Pacific J. Math. 181 (1997) 201–229.
[8] R. Cushman, Reduction, Brouwer’s Hamiltonian and the critical inclination, Celest. Mech. 31 (1983) 401–429.
[9] R. Cushman and J. Śniatycki, Differential structure of orbit spaces, Canadian J. Math. 53 (2001) 715–755.
[10] P. A. M. Dirac, Generalized Hamiltonian dynamics, Canad. J. Math. 2 (1950) 129–148.
[11] C. Duval, J. Elhadad, M. Gotay, J. Śniatycki and G. Tuynman, The BRS method and geometric quantization: Some examples, Comm. Math. Phys. 126 (1990) 535–557.
[12] V. Guillemin, V. Ginzburg and Y. Karshon, Moment Maps, Cobordisms and Hamiltonian Group Actions (American Mathematical Society, 2002).
[13] V. Guillemin, E. Lerman and S. Sternberg, Symplectic Fibrations and Multiplicity Diagrams (Cambridge University Press, Cambridge, 1996).
[14] V. Guillemin and S. Sternberg, Geometric quantization and multiplicities of group representations, Invent. Math. 67 (1982) 515–538.
[15] V. Guillemin and S. Sternberg, The moment map and collective motion, Ann. Physics 127 (1980) 220–253.
[16] V. Guillemin and S. Sternberg, Symplectic Techniques in Physics (Cambridge University Press, 1984).
[17] J. Huebschmann, Poisson cohomology and quantization, J. Reine Angew. Math. 408 (1990) 57–113.
[18] J. Huebschmann, Kähler spaces, nilpotent orbits and singular reduction, Mem. Amer. Math. Soc. 172 (2004) No. 814, vi+96 pp.
[19] J. Huebschmann, Kähler quantization and reduction, J. Reine Angew. Math. 591 (2006) 75–109.
[20] T. Kimura, Generalized classical BRST cohomology and reduction of Poisson manifolds, Commun. Math. Phys. 151 (1993) 155–182.
[21] P. Libermann and C.-M. Marle, Symplectic Geometry and Analytical Mechanics, Mathematics and Its Applications, Vol. 35 (D. Reidel Publishing Co., Dordrecht, 1987); Translated from the French by Bertram Eugene Schwarzbach.
[22] J. E. Marsden and A. Weinstein, Reduction of symplectic manifolds with symmetry, Rep. Math. Phys. 5 (1974) 121–130.
[23] D. McDuff and D. Salamon, Introduction to Symplectic Topology, Oxford Mathematical Monographs, 2nd edn. (Oxford University Press, New York, 1998).
[24] E. Meinrenken and R. Sjamaar, Singular reduction and quantization, Topology 38 (1999) 699–762.
[25] K. Meyer, Symmetries and integrals in mechanics, in Dynamical Systems, Proc. Sympos., Univ. Bahia, Salvador, 1971 (Academic Press, New York, 1973), pp. 259–272.
[26] G. W. Schwarz, Smooth functions invariant under the action of a compact Lie group, Topology 14 (1975) 63–68.
[27] R. Sjamaar, Holomorphic slices, symplectic reduction and multiplicities of representations, Ann. Math. 141 (1995) 87–129.
[28] R. Sjamaar and E. Lerman, Stratified symplectic spaces and reduction, Ann. Math. 134 (1991) 375–422.
[29] J. Śniatycki, Geometric Quantization and Quantum Mechanics, Applied Mathematical Science, Vol. 30 (Springer Verlag, New York, 1980).
[30] J. Śniatycki, Orbits of families of vector fields on subcartesian spaces, Ann. Inst. Fourier (Grenoble) 53 (2003) 2257–2296.
[31] J. Śniatycki, Poisson algebras in reduction of symmetries, Rep. Math. Phys. 56 (2005) 53–73.
[32] J. Śniatycki, Geometric quantization of algebraic reduction, preprint; arXiv DG/0609727.
[33] J. Śniatycki, Geometric quantization, reduction and decomposition of group representations, J. Fixed Point Theory Appl. 3 (2008) 307–315.
[34] J. Śniatycki and A. Weinstein, Reduction and quantization for singular momentum mappings, Lett. Math. Phys. 7 (1983) 155–161.
[35] D. C. Wilbour, Poisson algebras and singular reduction of constrained Hamiltonian systems, Ph.D. Thesis, University of Washington (1993).
[36] N. J. Woodhouse, Geometric Quantization, Oxford Mathematical Monographs, 2nd edn. (Oxford University Press, New York, 1992).
April 2, 2009 10:25 WSPC/148-RMP
J070-00364
Reviews in Mathematical Physics, Vol. 21, No. 3 (2009) 373–437 © World Scientific Publishing Company
SPECTRAL AND SCATTERING THEORY FOR SOME ABSTRACT QFT HAMILTONIANS
C. GÉRARD∗ and A. PANATI

Laboratoire de mathématiques, Université de Paris XI, 91 405 Orsay Cedex, France
∗[email protected]

Received 26 June 2008
Revised 19 December 2008

We introduce an abstract class of bosonic QFT Hamiltonians and study their spectral and scattering theories. These Hamiltonians are of the form $H = d\Gamma(\omega) + V$ acting on the bosonic Fock space $\Gamma(\mathfrak{h})$, where $\omega$ is a massive one-particle Hamiltonian acting on $\mathfrak{h}$ and $V$ is a Wick polynomial $\mathrm{Wick}(w)$ for a kernel $w$ satisfying some decay properties at infinity. We describe the essential spectrum of $H$, prove a Mourre estimate outside a set of thresholds and prove the existence of asymptotic fields. Our main result is the asymptotic completeness of the scattering theory, which means that the CCR representations given by the asymptotic fields are of Fock type, with the asymptotic vacua equal to the bound states of $H$. As a consequence, $H$ is unitarily equivalent to a collection of second quantized Hamiltonians.

Keywords: Quantum field theory; scattering theory.

Mathematics Subject Classification 2000: 81T08, 47N50, 81Q10, 81T10
1. Introduction

1.1. Introduction

In recent years a lot of effort was devoted to the spectral and scattering theory of various models of Quantum Field Theory like models of non-relativistic matter coupled to quantized radiation or self-interacting relativistic models in dimension 1 + 1 (see among many others the papers [2, 4–7, 15, 16, 20] and references therein). Substantial progress was made by applying to these models methods originally developed in the study of $N$-particle Schrödinger operators, namely the Mourre positive commutator method and the method of propagation observables to study the behavior of the unitary group $e^{-itH}$ for large times.

Up to now, the most complete results (valid for example for arbitrary coupling constants) on the spectral and scattering theory for these models are available only for massive models and for localized interactions. (For results on massless models see, for example, [7] and references therein.)
It turns out that for this type of models, the details of the interaction are often irrelevant. The essential feature of the interaction is that it can be written as a Wick polynomial, with a symbol (see below) which decays sufficiently fast at infinity. The conjugate operator (for the Mourre theory), or the propagation observables (for the proof of propagation estimates), are chosen as second quantizations of corresponding operators on the one-particle space $\mathfrak{h}$. In applications the one-particle kinetic energy is usually the operator $(k^2 + m^2)^{1/2}$ acting on $L^2(\mathbb{R}^d, dk)$, which clearly has a nice spectral and scattering theory. Therefore the necessary one-particle operators are easy to construct.

Our goal in this paper is to describe an abstract class of bosonic QFT Hamiltonians to which the methods and results of [4, 5] can be naturally extended. Let us first briefly describe this class of models. We consider Hamiltonians of the form:
\[
H = H_0 + V, \quad \text{acting on the bosonic Fock space } \Gamma(\mathfrak{h}),
\]
where $H_0 = d\Gamma(\omega)$ is the second quantization of the one-particle kinetic energy $\omega$ and $V = \mathrm{Wick}(w)$ is a Wick polynomial. To define $H$ without ambiguity, we assume that $H_0 + V$ is essentially selfadjoint and bounded below on $D(H_0) \cap D(V)$. The Hamiltonian $H$ is assumed to be massive, namely we require that $\omega \geq m > 0$ and moreover that powers of the number operator $N^p$ for $p \in \mathbb{N}$ are controlled by sufficiently high powers of the resolvent $(H + b)^{-m}$. These bounds are usually called higher order estimates.

The interaction $V$ is supposed to be a Wick polynomial. If for example $\mathfrak{h} = L^2(\mathbb{R}^d, dk)$, this means that $V$ is a finite sum $V = \sum_{p,q \in I} \mathrm{Wick}(w_{p,q})$ where $\mathrm{Wick}(w_{p,q})$ is formally defined as:
\[
\mathrm{Wick}(w_{p,q}) = \int a^{*}(K)\, a(K')\, w_{p,q}(K, K')\, dK\, dK',
\]
for
\[
K = (k_1, \ldots, k_p), \quad K' = (k'_1, \ldots, k'_q), \quad a^{*}(K) = \prod_{i=1}^{p} a^{*}(k_i), \quad a(K') = \prod_{i=1}^{q} a(k'_i),
\]
and $w_{p,q}(K, K')$ is a scalar function separately symmetric in $K$ and $K'$. To define $\mathrm{Wick}(w)$ as an unbounded operator on $\Gamma(\mathfrak{h})$, the functions $w_{p,q}$ are supposed to be in $L^2(\mathbb{R}^{(p+q)d})$. The functions $w_{p,q}$ are then the distribution kernels of a Hilbert–Schmidt operator $w_{p,q}$ from $\otimes_s^{q}\mathfrak{h}$ into $\otimes_s^{p}\mathfrak{h}$. Putting together these operators we obtain a Hilbert–Schmidt operator $w$ on $\Gamma(\mathfrak{h})$ which is called the Wick symbol of the interaction $V$. In physical situations, this corresponds to an interaction which has both a space and an ultraviolet cutoff (in one space dimension, only a space cutoff is required).

As said above, it is necessary to assume that the one-particle energy $\omega$ has a nice spectral and scattering theory. It is possible to formulate the necessary properties of $\omega$ in a very abstract framework, based on the existence of only two auxiliary
Hamiltonians on $\mathfrak{h}$. The first one is a conjugate operator $a$ for $\omega$, in the sense of the Mourre method. The second one is a weight operator $\langle x\rangle$, which is used both to control the “order” of various operators on $\mathfrak{h}$ and as a way to localize bosons in $\mathfrak{h}$. Note that the one-particle energy $\omega$ may have bound states.

The first basic result on spectral theory that we obtain is the HVZ theorem, which describes the essential spectrum of $H$. If $\sigma_{\mathrm{ess}}(\omega) = [m_\infty, +\infty[$ for some $m_\infty \geq m > 0$, then we show that
\[
\sigma_{\mathrm{ess}}(H) = [\inf \sigma(H) + m_\infty, +\infty[,
\]
in particular $H$ always has a ground state.

We then consider the Mourre theory and prove that the second quantized Hamiltonian $A = d\Gamma(a)$ is a conjugate operator for $H$. In particular this proves the local finiteness of point spectrum outside of the set of thresholds, which is equal to
\[
\tau(H) = \sigma_{\mathrm{pp}}(H) + d\Gamma^{(1)}(\tau(\omega)),
\]
where $\tau(\omega)$ is the set of thresholds of $\omega$ for $a$ and $d\Gamma^{(1)}(E)$ for $E \subset \mathbb{R}$ is the set of all finite sums of elements of $E$.

The scattering theory for our abstract Hamiltonians follows the standard approach based on the asymptotic Weyl operators. These are defined as the limits:
\[
W^{\pm}(h) = \operatorname*{s-lim}_{t \to \pm\infty} e^{itH}\, W(h_t)\, e^{-itH}, \quad h \in \mathfrak{h}_c(\omega),
\]
where $\mathfrak{h}_c(\omega)$ is the continuous spectral subspace for $\omega$ and $h_t = e^{-it\omega}h$. The asymptotic Weyl operators define two CCR representations over $\mathfrak{h}_c(\omega)$. Due to the fact that the theory is massive, it is rather easy to see that these representations are of Fock type. The main problem of scattering theory is to describe their vacua, i.e. the spaces of vectors annihilated by the asymptotic annihilation operators $a^{\pm}(h)$ for $h \in \mathfrak{h}_c(\omega)$. The main result of this paper is that the vacua coincide with the bound states of $H$. As a consequence, one sees that $H$ is unitarily equivalent to the asymptotic Hamiltonian:
\[
H|_{\mathcal{H}_{\mathrm{pp}}(H)} \otimes 1 + 1 \otimes d\Gamma(\omega), \quad \text{acting on } \mathcal{H}_{\mathrm{pp}}(H) \otimes \Gamma(\mathfrak{h}_c(\omega)).
\]
This result is usually called the asymptotic completeness of wave operators. It implies that H is unitarily equivalent to a direct sum of Ei + dΓ(ω|hc (ω) ), where Ei are the eigenvalues of H. In more physical terms, asymptotic completeness means that for large times any initial state asymptotically splits into a bound state and a finite number of free bosons. We conclude the introduction by describing the examples of abstract QFT Hamiltonians to which our results apply. The first example is the space-cutoff P (ϕ)2 model with a variable metric, which corresponds to the quantization of a nonlinear Klein–Gordon equation with variable coefficients in one space dimension.
The one-particle space is $\mathfrak{h} = L^2(\mathbb{R}, dx)$ and the usual relativistic kinetic energy $(D^2 + m^2)^{1/2}$ is replaced by the square root $h^{1/2}$ of a second order differential operator $h = Da(x)D + c(x)$, where $a(x) \to 1$ and $c(x) \to m_\infty^2$ for $m_\infty > 0$ when $x \to \infty$. (It is also possible to treat functions $c$ having different limits $m_{\pm\infty}^2 > 0$ at $\pm\infty$.) The interaction is of the form:
\[
V = \int_{\mathbb{R}} g(x) : P(x, \varphi(x)) : dx,
\]
where $g \geq 0$ is a function on $\mathbb{R}$ decaying sufficiently fast at $\infty$, $P(x, \lambda)$ is a bounded below polynomial of even degree with variable coefficients, $\varphi(x) = \phi(\omega^{-1/2}\delta_x)$ is the relativistic field operator and $: \, :$ denotes the Wick ordering. This model is considered in detail in [12], applying the abstract arguments in this paper. Note that some conditions on the eigenfunctions and generalized eigenfunctions of $h$ are necessary in order to prove the higher order estimates.

The analogous model for constant coefficients was considered in [4]. Even in the constant coefficient case we improve the results in [4] by removing an unpleasant technical assumption on $g$, which excluded compactly supported $g$.

The second example is the generalization to higher dimensions. The one-particle energy $\omega$ is:
\[
\omega = \Big( \sum_{1 \leq i,j \leq d} D_i\, a_{ij}(x)\, D_j + c(x) \Big)^{1/2},
\]
where $h = \sum_{1 \leq i,j \leq d} D_i\, a_{ij}(x)\, D_j + c(x)$ is an elliptic second order differential operator converging to $D^2 + m_\infty^2$ when $x \to \infty$. The interaction is now
\[
\int_{\mathbb{R}^d} g(x)\, P(x, \varphi_\kappa(x))\, dx,
\]
where $P$ is as before and $\varphi_\kappa(x) = \phi(\omega^{-1/2} F(\omega \leq \kappa)\delta_x)$ is now the UV-cutoff relativistic field. Here because of the UV-cutoff, the Wick ordering is irrelevant. Again some conditions on eigenfunctions and generalized eigenfunctions of $h$ are necessary.

We believe that our set of hypotheses should be sufficiently general to consider also Klein–Gordon equations on other Riemannian manifolds, like for example manifolds equal to the union of a compact piece and a cylinder $\mathbb{R}^{+} \times M$, where the metric on $\mathbb{R}^{+} \times M$ is of product type.

1.2. Plan of the paper

We now describe briefly the plan of the paper. Section 2 is a collection of various auxiliary results needed in the rest of the paper. We first recall in Secs. 2.1 and 2.2 some arguments connected with the abstract Mourre theory and a convenient functional calculus formula. In Sec. 2.3, we fix some notation connected with one-particle operators. Standard results taken from [4, 5] on bosonic Fock spaces and Wick polynomials are recalled in Secs. 2.4 and 2.6.
The class of abstract QFT Hamiltonians that we will consider in the paper is described in Sec. 3. The results of the paper are summarized in Sec. 4. In Sec. 5, we give examples of abstract QFT Hamiltonians to which all our results apply, namely the space-cutoff $P(\varphi)_2$ model with a variable metric, and the analogous models in higher dimensions, where now an ultraviolet cutoff is imposed on the polynomial interaction.

Section 6 is devoted to the proof of commutator estimates needed in various localization arguments. The spectral theory of abstract QFT Hamiltonians is studied in Sec. 7. The essential spectrum is described in Sec. 7.1, the virial theorem and Mourre’s positive commutator estimate are proved in Secs. 7.2, 7.4 and 7.5. The results of Sec. 7 are related to those of [8], where abstract bosonic and fermionic QFT Hamiltonians are considered using a $C^{*}$-algebraic approach instead of the geometrical approach used in our paper. Our result on essential spectrum can certainly be deduced from the results in [8]. However, the Mourre theory in [8] requires that the one-particle Hamiltonian $\omega$ has no eigenvalues and also that $\omega$ is affiliated to an abelian $C^{*}$-algebra $\mathcal{O}$ such that $e^{ita}\mathcal{O}e^{-ita} = \mathcal{O}$, where $a$ is the one-particle conjugate operator. In concrete examples, this second assumption seems adapted to constant coefficient one-particle Hamiltonians and is not satisfied by the examples we describe in Sec. 5.

In Sec. 8, we describe the scattering theory for abstract QFT Hamiltonians. The existence of asymptotic Weyl operators and asymptotic fields is shown in Sec. 8.1. Other natural objects, like the wave operators and extended wave operators, are defined in Secs. 8.2 and 8.3.

Propagation estimates are shown in Sec. 9. The most important are the phase-space propagation estimates in Secs. 9.2 and 9.3 and the minimal velocity estimate in Sec. 9.4.

Finally asymptotic completeness is proved in Sec. 10. The main step is the proof of geometric asymptotic completeness in Sec. 10.4, identifying the vacua with the states for which no bosons escape to infinity. In Sec. 10.5, we show that states for which no bosons escape to infinity coincide with bound states of the Hamiltonian, thereby completing the proof of asymptotic completeness. Various technical proofs are collected in the Appendix.

2. Auxiliary Results

In this section, we collect various auxiliary results which will be used in the sequel.

2.1. Commutators

Let $A$ be a selfadjoint operator on a Hilbert space $\mathcal{H}$. If $B \in B(\mathcal{H})$ one says that $B$ is of class $C^1(A)$ [1] if the map
\[
\mathbb{R} \ni t \mapsto e^{itA} B e^{-itA} \in B(\mathcal{H})
\]
is $C^1$ for the strong topology.
If $H$ is selfadjoint on $\mathcal{H}$, one says that $H$ is of class $C^1(A)$ [1] if for some (and hence all) $z \in \mathbb{C} \setminus \sigma(H)$, $(H - z)^{-1}$ is of class $C^1(A)$. The classes $C^k(A)$ for $k \geq 2$ are defined similarly.

If $H$ is of class $C^1(A)$, the commutator $[H, iA]$ defined as a quadratic form on $D(A) \cap D(H)$ extends then uniquely as a bounded quadratic form on $D(H)$. The corresponding operator in $B(D(H), D(H)^{*})$ will be denoted by $[H, iA]_0$.

If $H$ is of class $C^1(A)$ then the virial relation holds (see [1]):
\[
1_{\{\lambda\}}(H)\,[H, iA]_0\, 1_{\{\lambda\}}(H) = 0, \quad \lambda \in \mathbb{R}.
\]
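In finite dimensions the virial relation reduces to elementary linear algebra: if $H\psi = \lambda\psi$ then $\langle \psi, [H, iA]\psi\rangle = i(\lambda\langle\psi, A\psi\rangle - \lambda\langle\psi, A\psi\rangle) = 0$. The sketch below illustrates this with $2 \times 2$ Hermitian matrices. It is only a toy model: the matrices `H` and `A` and the helper names are arbitrary illustrative choices, not objects from the paper, and no domain questions arise in this setting.

```python
import math

def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator_i(H, A):
    """Return the matrix [H, iA] = i (H A - A H)."""
    HA, AH = matmul(H, A), matmul(A, H)
    n = len(H)
    return [[1j * (HA[i][j] - AH[i][j]) for j in range(n)] for i in range(n)]

def expectation(psi, M):
    """Quadratic form <psi, M psi> for a vector psi and a matrix M."""
    n = len(psi)
    return sum(psi[i].conjugate() * M[i][j] * psi[j]
               for i in range(n) for j in range(n))

# Toy Hermitian matrices (hypothetical choices for illustration only).
H = [[2.0, 1.0], [1.0, 2.0]]
A = [[0.5, 1.0 - 2.0j], [1.0 + 2.0j, -1.5]]

# Normalized eigenvectors of H: eigenvalue 3 for (1, 1), eigenvalue 1 for (1, -1).
s = 1.0 / math.sqrt(2.0)
eigenvectors = [[s, s], [s, -s]]

commutator = commutator_i(H, A)
virial_values = [expectation(psi, commutator) for psi in eigenvectors]

# The commutator itself is nonzero, yet it vanishes on each eigenvector of H.
print(commutator, virial_values)
```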
An estimate of the form
\[
1_I(H)\,[H, iA]_0\, 1_I(H) \geq c_0\, 1_I(H) + K,
\]
where $I \subset \mathbb{R}$ is a compact interval, $c_0 > 0$ and $K$ a compact operator on $\mathcal{H}$, or:
\[
1_I(H)\,[H, iA]_0\, 1_I(H) \geq c_0\, 1_I(H),
\]
is called a (strict) Mourre estimate on $I$. An operator $A$ such that the Mourre estimate holds on $I$ is called a conjugate operator for $H$ (on $I$). Under an additional regularity condition of $H$ with respect to $A$ (for example if $H$ is of class $C^2(A)$), it has several important consequences like weighted estimates on $(H - \lambda \pm i0)^{-1}$ for $\lambda \in I$ (see e.g. [1]) or abstract propagation estimates (see e.g. [14]).

We now recall some useful machinery from [1] related to the best constant $c_0$ in the Mourre estimate. Let $H$ be a selfadjoint operator on a Hilbert space $\mathcal{H}$ and $B$ be a quadratic form with domain $D(H^M)$ for some $M \in \mathbb{N}$ such that the virial relation
\[
1_{\{\lambda\}}(H)\, B\, 1_{\{\lambda\}}(H) = 0, \quad \lambda \in \mathbb{R}, \tag{2.1}
\]
is satisfied. We set
\begin{align*}
\rho_H^B(\lambda) &:= \sup\{a \in \mathbb{R} \mid \exists\, \chi \in C_0^\infty(\mathbb{R}),\ \chi(\lambda) \neq 0,\ \chi(H) B \chi(H) \geq a \chi^2(H)\}, \\
\tilde\rho_H^B(\lambda) &:= \sup\{a \in \mathbb{R} \mid \exists\, \chi \in C_0^\infty(\mathbb{R}),\ \chi(\lambda) \neq 0,\ \exists\, K \text{ compact},\ \chi(H) B \chi(H) \geq a \chi^2(H) + K\}.
\end{align*}
The functions $\rho_H^B$, $\tilde\rho_H^B$ are lower semi-continuous and it follows from the virial relation that $\rho_H^B(\lambda) < \infty$ iff $\lambda \in \sigma(H)$, $\tilde\rho_H^B(\lambda) < \infty$ iff $\lambda \in \sigma_{\mathrm{ess}}(H)$ (see [1, Sec. 7.2]). One sets:
\[
\tau_B(H) := \{\lambda \mid \tilde\rho_H^B(\lambda) \leq 0\}, \quad \kappa_B(H) := \{\lambda \mid \rho_H^B(\lambda) \leq 0\},
\]
which are closed subsets of $\mathbb{R}$, and
\[
\mu_B(H) := \sigma_{\mathrm{pp}}(H) \setminus \tau_B(H).
\]
The virial relation and the usual argument show that the eigenvalues of $H$ in $\mu_B(H)$ are of finite multiplicity and are not accumulation points of eigenvalues. In the next lemma we collect several abstract results adapted from [1, 3].

Lemma 2.1. (i) If $\lambda \in \mu_B(H)$ then $\rho_H^B(\lambda) = 0$. If $\lambda \notin \mu_B(H)$ then $\rho_H^B(\lambda) = \tilde\rho_H^B(\lambda)$.
(ii) $\rho_H^B(\lambda) > 0$ iff $\tilde\rho_H^B(\lambda) > 0$ and $\lambda \notin \sigma_{\mathrm{pp}}(H)$, which implies that
\[
\kappa_B(H) = \tau_B(H) \cup \sigma_{\mathrm{pp}}(H).
\]
(iii) Let $\mathcal{H} = \mathcal{H}_1 \oplus \mathcal{H}_2$, $H = H_1 \oplus H_2$, $B = B_1 \oplus B_2$, where $H_i$, $B_i$, $H$, $B$ are as above and satisfy (2.1). Then
\[
\rho_H^B(\lambda) = \min(\rho_{H_1}^{B_1}(\lambda), \rho_{H_2}^{B_2}(\lambda)).
\]
(iv) Let $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$, $H = H_1 \otimes 1 + 1 \otimes H_2$, $B = B_1 \otimes 1 + 1 \otimes B_2$, where $H_i$, $B_i$, $H$, $B$ are as above, satisfy (2.1) and $H_i$ are bounded below. Then
\[
\rho_H^B(\lambda) = \inf_{\lambda_1 + \lambda_2 = \lambda} \left( \rho_{H_1}^{B_1}(\lambda_1) + \rho_{H_2}^{B_2}(\lambda_2) \right).
\]
Proof. (i), (ii) can be found in [1, Sec. 7.2], in the case $B = [H, iA]$ for $A$ a selfadjoint operator such that $H \in C^1(A)$. This hypothesis is only needed to ensure the virial relation (2.1). (iii) is easy and (iv) can be found in [3, Thm. 3.4] in the same framework. Again it is easy to see that the proof extends verbatim to our situation.

Assume now that $H$, $A$ are two selfadjoint operators on a Hilbert space $\mathcal{H}$ such that the quadratic form $[H, iA]$ defined on $D(H^M) \cap D(A)$ for some $M$ uniquely extends as a quadratic form $B$ on $D(H^M)$ and the virial relation (2.1) holds. Abusing notation we will in the rest of the paper denote by $\tilde\rho_H^A$, $\rho_H^A$, $\tau_A(H)$, $\kappa_A(H)$ the objects introduced above for $B = [H, iA]$. The set $\tau_A(H)$ is usually called the set of thresholds of $H$ for $A$.

2.2. Functional calculus

If $\chi \in C_0^\infty(\mathbb{R})$, we denote by $\tilde\chi \in C_0^\infty(\mathbb{C})$ an almost analytic extension of $\chi$, satisfying
\[
\tilde\chi|_{\mathbb{R}} = \chi, \quad |\partial_{\bar z}\tilde\chi(z)| \leq C_n |\mathrm{Im}\, z|^n, \quad n \in \mathbb{N}.
\]
We use the following functional calculus formula for $\chi \in C_0^\infty(\mathbb{R})$ and $A$ selfadjoint:
\[
\chi(A) = \frac{i}{2\pi} \int_{\mathbb{C}} \partial_{\bar z}\tilde\chi(z)\,(z - A)^{-1}\, dz \wedge d\bar z. \tag{2.2}
\]
380
J070-00364
C. G´ erard & A. Panati
2.3. Abstract operator classes In this subsection we introduce a poor man’s version of pseudodifferential calculus tailored to our abstract setup. It rests on two positive selfadjoint operators ω and x on the one-particle space h. Later ω will of course be the one-particle Hamiltonian. The operator x will have two purposes: first as a weight to control various operators, and second as an observable to localize particles in h. We fix selfadjoint operators ω, x on h such that: ω ≥ m > 0, x ≥ 1, there exists a dense subspace S ⊂ h such that ω, x : S → S. To understand the terminology below the reader familiar with the standard pseudodifferential calculus should think of the example h = L2 (Rd ),
1
1
x = (x2 + 1) 2
ω = (Dx2 + 1) 2 ,
and S = S(Rd ).
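To control various commutators later it is convenient to introduce the following classes of operators on $\mathfrak{h}$. If $a, b : \mathcal{S} \to \mathcal{S}$ we set $\mathrm{ad}_a b = [a, b]$ as an operator on $\mathcal{S}$.

Definition 2.2. For $m \in \mathbb{R}$, $0 \le \delta < \frac12$ and $k \in \mathbb{N}$ we set
\[ S^m_{(0)} = \{ b : \mathcal{S} \to \mathfrak{h} \mid \langle x\rangle^{s}\, b\, \langle x\rangle^{-s-m} \in B(\mathfrak{h}),\ s \in \mathbb{R} \}, \]
and for $k \ge 1$:
\[ S^m_{\delta,(k)} = \{ b : \mathcal{S} \to \mathcal{S} \mid \langle x\rangle^{-s}\, \mathrm{ad}^\alpha_{\langle x\rangle} \mathrm{ad}^\beta_{\omega}\, b\, \langle x\rangle^{s-m+(1-\delta)\beta-\delta\alpha} \in B(\mathfrak{h}),\ \alpha + \beta \le k,\ s \in \mathbb{R} \}, \]
where the multicommutators are considered as operators on $\mathcal{S}$.

The parameter $m$ controls the “order” of the operator: roughly speaking an operator in $S^m_{\delta,(k)}$ is controlled by $\langle x\rangle^m$. The parameter $k$ is the number of commutators of the operator with $\langle x\rangle$ and $\omega$ that are controlled. The lower index $\delta$ controls the behavior of multicommutators: one loses $\langle x\rangle^{\delta}$ for each commutator with $\langle x\rangle$ and gains $\langle x\rangle^{1-\delta}$ for each commutator with $\omega$.

The operator norms of the (weighted) multicommutators above can be used as a family of seminorms on $S^m_{\delta,(k)}$. The spaces $S^m_{\delta,(k)}$ for $\delta = 0$ will be denoted simply by $S^m_{(k)}$. We will use the following natural notation for operators depending on a parameter: if $b = b(R)$ belongs to $S^m_{\delta,(k)}$ for all $R \ge 1$ we will say that
\[ b \in O(R^\mu) S^m_{\delta,(k)}, \]
if the seminorms of $R^{-\mu} b(R)$ in $S^m_{\delta,(k)}$ are uniformly bounded in $R$. The following lemma is easy.

Lemma 2.3. (i) $S^{m_1}_{\delta,(k)} \times S^{m_2}_{\delta,(k)} \subset S^{m_1+m_2}_{\delta,(k)}$.
(ii) Let $b \in S^{m}_{(0)}$. Then $J(\frac{\langle x\rangle}{R})\, b\, \langle x\rangle^{s} \in O(R^{m+s})$ for $m + s \ge 0$ if $J \in C_0^\infty(\mathbb{R})$ and for all $s \in \mathbb{R}$ if $J \in C_0^\infty(]0, +\infty[)$.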
Spectral and Scattering Theory for Abstract QFT Hamiltonians
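Proof. (i) follows from Leibniz rule applied to the operators $\mathrm{ad}_{\langle x\rangle}$ and $\mathrm{ad}_\omega$. (ii) is immediate.

2.4. Fock spaces

In this subsection we recall various definitions on bosonic Fock spaces. We will also collect some bounds needed later.

Bosonic Fock spaces. If $\mathfrak{h}$ is a Hilbert space then
\[ \Gamma(\mathfrak{h}) := \bigoplus_{n=0}^{\infty} \otimes_s^n \mathfrak{h} \]
is the bosonic Fock space over $\mathfrak{h}$. $\Omega \in \Gamma(\mathfrak{h})$ will denote the vacuum vector. The number operator $N$ is defined as
\[ N|_{\otimes_s^n \mathfrak{h}} = n 1. \]
We define the space of finite particle vectors:
\[ \Gamma_{\rm fin}(\mathfrak{h}) := \{ u \in \Gamma(\mathfrak{h}) \mid \text{for some } n \in \mathbb{N},\ 1_{[0,n]}(N) u = u \}. \]
The creation-annihilation operators on $\Gamma(\mathfrak{h})$ are denoted by $a^*(h)$ and $a(h)$. We denote by
\[ \phi(h) := \frac{1}{\sqrt 2}\,(a^*(h) + a(h)), \qquad W(h) := e^{i\phi(h)}, \]
the field and Weyl operators.

dΓ operators. If $r : \mathfrak{h}_1 \to \mathfrak{h}_2$ is an operator one sets:
\[ d\Gamma(r) : \Gamma(\mathfrak{h}_1) \to \Gamma(\mathfrak{h}_2), \qquad d\Gamma(r)|_{\otimes_s^n \mathfrak{h}_1} := \sum_{j=1}^{n} 1^{\otimes(j-1)} \otimes r \otimes 1^{\otimes(n-j)}, \]
with domain $\Gamma_{\rm fin}(D(r))$. If $r$ is closeable, so is $d\Gamma(r)$.

Γ operators. If $q : \mathfrak{h}_1 \to \mathfrak{h}_2$ is bounded one sets:
\[ \Gamma(q) : \Gamma(\mathfrak{h}_1) \to \Gamma(\mathfrak{h}_2), \qquad \Gamma(q)|_{\otimes_s^n \mathfrak{h}_1} = q \otimes \cdots \otimes q. \]
$\Gamma(q)$ is bounded iff $\|q\| \le 1$ and then $\|\Gamma(q)\| = 1$.

dΓ(q, r) operators. If $r$, $q$ are as above one sets:
\[ d\Gamma(q, r) : \Gamma(\mathfrak{h}_1) \to \Gamma(\mathfrak{h}_2), \qquad d\Gamma(q, r)|_{\otimes_s^n \mathfrak{h}_1} := \sum_{j=1}^{n} q^{\otimes(j-1)} \otimes r \otimes q^{\otimes(n-j)}, \]
with domain $\Gamma_{\rm fin}(D(r))$. We refer the reader to [4, Secs. 3.5–3.7] for more details.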
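The three families of operators just defined are tied together by a few elementary identities, which can be checked sector by sector directly from the definitions. The following lines are a sketch of such checks, written for $\mathfrak{h}_1 = \mathfrak{h}_2 = \mathfrak{h}$ and bounded $q$, $r$, $\omega$ to avoid domain questions (these simplifications are assumptions made here, not hypotheses of the paper).

```latex
% Sketch: identities read off from the definitions on \otimes_s^n \mathfrak{h}.
% Boundedness of q, r, \omega is assumed here only to avoid domain questions.
\begin{align*}
d\Gamma(1) &= N,
  && \text{(each of the $n$ summands is the identity)}\\
d\Gamma(1, r) &= d\Gamma(r),
  && \text{(take $q = 1$ in the definition of $d\Gamma(q,r)$)}\\
\Gamma(q_1 q_2) &= \Gamma(q_1)\Gamma(q_2),
  && \text{(since $(q_1q_2)^{\otimes n} = q_1^{\otimes n} q_2^{\otimes n}$)}\\
[d\Gamma(\omega), \Gamma(q)]
  &= \sum_{j=1}^{n} q^{\otimes(j-1)} \otimes [\omega, q] \otimes q^{\otimes(n-j)}
   = d\Gamma(q, [\omega, q]).
  && \text{(Leibniz rule on each factor)}
\end{align*}
```

The last identity explains why $d\Gamma(q, r)$ is the natural object to introduce: it appears whenever $\Gamma(q)$ is differentiated or commuted with a second quantized operator.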
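Tensor products of Fock spaces. If $\mathfrak{h}_1$, $\mathfrak{h}_2$ are two Hilbert spaces, one denotes by $U : \Gamma(\mathfrak{h}_1) \otimes \Gamma(\mathfrak{h}_2) \to \Gamma(\mathfrak{h}_1 \oplus \mathfrak{h}_2)$ the canonical unitary map (see, e.g., [4, Sec. 3.8] for details). If $\mathcal{H} = \Gamma(\mathfrak{h})$, we set
\[ \mathcal{H}^{\rm ext} := \mathcal{H} \otimes \mathcal{H} \simeq \Gamma(\mathfrak{h} \oplus \mathfrak{h}). \]
The second copy of $\mathcal{H}$ will be the state space for bosons living near infinity in the spectral theory of a Hamiltonian $H$ acting on $\mathcal{H}$.

Let $H = d\Gamma(\omega) + V$ be an abstract QFT Hamiltonian as defined in Sec. 3.1. Then we set:
\[ \mathcal{H}^{\rm scatt} := \mathcal{H} \otimes \Gamma(\mathfrak{h}_c(\omega)). \]
The Hilbert space $\Gamma(\mathfrak{h}_c(\omega))$ will be the state space for free bosons in the scattering theory of a Hamiltonian $H$ acting on $\mathcal{H}$. We will need also:
\[ H^{\rm ext} := H \otimes 1 + 1 \otimes d\Gamma(\omega), \quad\text{acting on } \mathcal{H}^{\rm ext}. \]
Clearly $\mathcal{H}^{\rm scatt} \subset \mathcal{H}^{\rm ext}$ and $H^{\rm ext}$ preserves $\mathcal{H}^{\rm scatt}$. We will use the notation
\[ N_0 := N \otimes 1, \qquad N_\infty := 1 \otimes N, \]
as operators on $\mathcal{H}^{\rm ext}$ or $\mathcal{H}^{\rm scatt}$.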
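Identification operators. The identification operator is defined as
\[ I : \mathcal{H}^{\rm ext} \to \mathcal{H}, \qquad I := \Gamma(i)U, \]
where $U$ is defined as above for $\mathfrak{h}_1 = \mathfrak{h}_2 = \mathfrak{h}$ and:
\[ i : \mathfrak{h} \oplus \mathfrak{h} \to \mathfrak{h}, \qquad (h_0, h_\infty) \mapsto h_0 + h_\infty. \]
We have:
\[ I \prod_{i=1}^{n} a^*(h_i)\Omega \otimes \prod_{i=1}^{p} a^*(g_i)\Omega := \prod_{i=1}^{n} a^*(h_i) \prod_{i=1}^{p} a^*(g_i)\Omega, \qquad h_i \in \mathfrak{h},\ g_i \in \mathfrak{h}. \]
If $\omega$ is a selfadjoint operator as above, we denote by $I^{\rm scatt}$ the restriction of $I$ to $\mathcal{H}^{\rm scatt}$.

Note that $\|i\| = \sqrt 2$ so $\Gamma(i)$ and hence $I$, $I^{\rm scatt}$ are unbounded. As domain for $I$ (respectively, $I^{\rm scatt}$) we can choose for example $D(N^\infty) \otimes \Gamma_{\rm fin}(\mathfrak{h})$ (respectively, $D(N^\infty) \otimes \Gamma_{\rm fin}(\mathfrak{h}_c(\omega))$). We refer to [4, Sec. 3.9] for details.

Operators I(j) and dI(j, k). Let $j_0, j_\infty \in B(\mathfrak{h})$ and set $j = (j_0, j_\infty)$. We define
\[ I(j) : \Gamma_{\rm fin}(\mathfrak{h}) \otimes \Gamma_{\rm fin}(\mathfrak{h}) \to \Gamma_{\rm fin}(\mathfrak{h}), \qquad I(j) := I\,\Gamma(j_0) \otimes \Gamma(j_\infty). \]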
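The value of the norm of the map $i : (h_0, h_\infty) \mapsto h_0 + h_\infty$ quoted in the note above, and the resulting unboundedness of $\Gamma(i)$, follow from a short computation with the adjoint. A sketch, using only the definition of $i$:

```latex
% Sketch: adjoint and norm of i : \mathfrak{h} \oplus \mathfrak{h} \to \mathfrak{h}.
\begin{align*}
\big(i(h_0,h_\infty)\,\big|\,g\big) &= (h_0|g) + (h_\infty|g)
   = \big((h_0,h_\infty)\,\big|\,(g,g)\big)
   \;\Longrightarrow\; i^* g = (g, g),\\
i\,i^* g &= 2g
   \;\Longrightarrow\; \|i\|^2 = \|i\,i^*\| = 2 .
\end{align*}
% On the n-particle sector \Gamma(i) acts as i^{\otimes n}, and
% i^{\otimes n}(i^{*})^{\otimes n} = 2^n, so
\[
\big\|\Gamma(i)|_{\otimes_s^n(\mathfrak{h}\oplus\mathfrak{h})}\big\| = 2^{n/2}
\xrightarrow[n\to\infty]{} \infty .
\]
```

The growth $2^{n/2}$ is harmless on vectors with a bounded number of particles, which is why domains built from $\Gamma_{\rm fin}$ are natural here.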
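If we identify $j$ with the operator
\[ j : \mathfrak{h} \oplus \mathfrak{h} \to \mathfrak{h}, \qquad j(h_0 \oplus h_\infty) := j_0 h_0 + j_\infty h_\infty, \tag{2.3} \]
then we have
\[ I(j) = \Gamma(j)U. \]
We deduce from this identity that if $j_0 j_0^* + j_\infty j_\infty^* = 1$ (respectively, $j_0 j_0^* + j_\infty j_\infty^* \le 1$) then $I^*(j)$ is isometric (respectively, is a contraction).

Let $j = (j_0, j_\infty)$, $k = (k_0, k_\infty)$ be pairs of maps from $\mathfrak{h}$ to $\mathfrak{h}$. We define
\[ dI(j, k) : \Gamma_{\rm fin}(\mathfrak{h}) \otimes \Gamma_{\rm fin}(\mathfrak{h}) \to \Gamma_{\rm fin}(\mathfrak{h}) \]
as follows:
\[ dI(j, k) := I\big(d\Gamma(j_0, k_0) \otimes \Gamma(j_\infty) + \Gamma(j_0) \otimes d\Gamma(j_\infty, k_\infty)\big). \]
Equivalently, treating $j$ and $k$ as maps from $\mathfrak{h} \oplus \mathfrak{h}$ to $\mathfrak{h}$ as in (2.3), we can write
\[ dI(j, k) := d\Gamma(j, k)U. \]
We refer to [4, Secs. 3.10 and 3.11] for details.

Various bounds.

Proposition 2.4. (i) Let $a$, $b$ be two selfadjoint operators on $\mathfrak{h}$ with $b \ge 0$ and $a^2 \le b^2$. Then
\[ d\Gamma(a)^2 \le d\Gamma(b)^2. \]
(ii) Let $b \ge 0$, $1 \le \alpha$. Then:
\[ d\Gamma(b)^\alpha \le N^{\alpha-1} d\Gamma(b^\alpha). \]
(iii) Let $0 \le r$ and $0 \le q \le 1$. Then:
\[ d\Gamma(q, r) \le d\Gamma(r). \]
(iv) Let $r, r_1, r_2 \in B(\mathfrak{h})$ and $\|q\| \le 1$. Then:
\[ |(u_2 \,|\, d\Gamma(q, r_2 r_1) u_1)| \le \| d\Gamma(r_2 r_2^*)^{1/2} u_2 \|\, \| d\Gamma(r_1^* r_1)^{1/2} u_1 \|, \]
\[ \| N^{-1/2} d\Gamma(q, r) u \| \le \| d\Gamma(r^* r)^{1/2} u \|. \]
(v) Let $j_0 j_0^* + j_\infty j_\infty^* \le 1$, $k_0$, $k_\infty$ selfadjoint. Then:
\[ |(u_2 \,|\, dI^*(j, k) u_1)| \le \| d\Gamma(|k_0|)^{1/2} \otimes 1\, u_2 \|\, \| d\Gamma(|k_0|)^{1/2} u_1 \| + \| 1 \otimes d\Gamma(|k_\infty|)^{1/2} u_2 \|\, \| d\Gamma(|k_\infty|)^{1/2} u_1 \|, \]
for $u_1 \in \Gamma(\mathfrak{h})$, $u_2 \in \Gamma(\mathfrak{h}) \otimes \Gamma(\mathfrak{h})$, and
\[ \| (N_0 + N_\infty)^{-1/2} dI^*(j, k) u \| \le \| d\Gamma(k_0 k_0^* + k_\infty k_\infty^*)^{1/2} u \|, \qquad u \in \Gamma(\mathfrak{h}). \]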
Proof. (i) is proved in [10, Proposition 3.4]. The other statements can be found in [4, Sec. 3].
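2.5. Heisenberg derivatives

Let $H$ be a selfadjoint operator on $\Gamma(\mathfrak{h})$ such that $H = d\Gamma(\omega) + V$ on $D(H^m)$ for some $m \in \mathbb{N}$ where $\omega$ is selfadjoint and $V$ symmetric. We will use the following notations for various Heisenberg derivatives:
\[ d_0 = \frac{\partial}{\partial t} + [\omega, i\,\cdot\,] \quad\text{acting on } B(\mathfrak{h}), \]
\[ D_0 = \frac{\partial}{\partial t} + [H_0, i\,\cdot\,], \qquad D = \frac{\partial}{\partial t} + [H, i\,\cdot\,], \quad\text{acting on } B(\Gamma(\mathfrak{h})), \]
where the commutators on the right-hand sides are quadratic forms. If $\mathbb{R} \ni t \mapsto M(t) \in B(D(H), \mathcal{H})$ is of class $C^1$ then:
\[ D\,\chi(H) M(t) \chi(H) = \chi(H) D_0 M(t) \chi(H) + \chi(H) [V, iM(t)] \chi(H), \tag{2.4} \]
for $\chi \in C_0^\infty(\mathbb{R})$. If $\mathbb{R} \ni t \mapsto m(t) \in B(\mathfrak{h})$ is of class $C^1$ and $H_0 = d\Gamma(\omega)$ then:
\[ D_0\, d\Gamma(m(t)) = d\Gamma(d_0 m(t)). \]

2.6. Wick polynomials

In this subsection we recall some results from [4, Sec. 3.12]. We set
\[ B_{\rm fin}(\Gamma(\mathfrak{h})) := \{ B \in B(\Gamma(\mathfrak{h})) \mid \text{for some } n \in \mathbb{N},\ 1_{[0,n]}(N) B 1_{[0,n]}(N) = B \}. \]
Let $w \in B(\otimes_s^p \mathfrak{h}, \otimes_s^q \mathfrak{h})$. We define the operator
\[ \mathrm{Wick}(w) : \Gamma_{\rm fin}(\mathfrak{h}) \to \Gamma_{\rm fin}(\mathfrak{h}) \]
as follows:
\[ \mathrm{Wick}(w)|_{\otimes_s^n \mathfrak{h}} := \frac{\sqrt{n!\,(n + q - p)!}}{(n - p)!}\; w \otimes_s 1^{\otimes(n-p)}. \tag{2.5} \]
The operator $\mathrm{Wick}(w)$ is called a Wick monomial of order $(p, q)$. This definition extends to $w \in B_{\rm fin}(\Gamma(\mathfrak{h}))$ by linearity. The operator $\mathrm{Wick}(w)$ is called a Wick polynomial and the operator $w$ is called the symbol of the Wick polynomial $\mathrm{Wick}(w)$. If $w = \sum_{(p,q) \in I} w_{p,q}$ for $w_{p,q}$ of order $(p, q)$ and $I \subset \mathbb{N}^2$ finite, then
\[ \deg(w) := \sup_{(p,q) \in I} (p + q) \]
is called the degree of $\mathrm{Wick}(w)$. If $h_1, \ldots, h_p, g_1, \ldots, g_q \in \mathfrak{h}$ then:
\[ \mathrm{Wick}\big(|g_1 \otimes_s \cdots \otimes_s g_q)(h_p \otimes_s \cdots \otimes_s h_1|\big) = a^*(g_1) \cdots a^*(g_q)\, a(h_p) \cdots a(h_1). \]
We recall some basic properties of Wick polynomials.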
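Before these properties, a quick consistency check of the normalization in (2.5) may be useful: for a symbol of order $(1,1)$ the Wick monomial should reduce to the usual second quantization. The sketch below assumes that $w \otimes_s 1^{\otimes(n-p)}$ denotes the symmetrized operator $S_n (w \otimes 1^{\otimes(n-p)}) S_n$, with $S_n$ the orthogonal projection onto symmetric tensors and $b_{(j)}$ the operator $b$ acting on the $j$-th factor (both notations are introduced here for illustration).

```latex
% Sketch under the stated assumption on \otimes_s; S_n and b_{(j)} are
% notation introduced here. Take p = q = 1 and w = b \in B(\mathfrak{h}).
\begin{align*}
\frac{\sqrt{n!\,(n+q-p)!}}{(n-p)!}\Big|_{p=q=1} &= \frac{n!}{(n-1)!} = n,\\
S_n\big(b \otimes 1^{\otimes(n-1)}\big) u
  &= \frac{1}{n}\sum_{j=1}^{n} b_{(j)}\, u
  \qquad \text{for } u \in \otimes_s^n \mathfrak{h},\\
\mathrm{Wick}(b)\, u &= n \cdot \frac{1}{n}\sum_{j=1}^{n} b_{(j)}\, u = d\Gamma(b)\, u .
\end{align*}
% In particular \mathrm{Wick}(1_{\mathfrak{h}}) = d\Gamma(1) = N.
```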
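Lemma 2.5. (i) $\mathrm{Wick}(w)^* = \mathrm{Wick}(w^*)$ as an identity on $\Gamma_{\rm fin}(\mathfrak{h})$.
(ii) If $\mathrm{s\text{-}}\lim_s w_s = w$, for $w_s$, $w$ of order $(p, q)$, then for $k + m \ge (p + q)/2$:
\[ \mathrm{s\text{-}}\lim_{s}\, (N + 1)^{-k} \mathrm{Wick}(w_s) (N + 1)^{-m} = (N + 1)^{-k} \mathrm{Wick}(w) (N + 1)^{-m}. \]
(iii) $\| (N + 1)^{-k} \mathrm{Wick}(w) (N + 1)^{-m} \| \le C \|w\|_{B(\Gamma(\mathfrak{h}))}$, uniformly for $w$ of degree less than $p$ and $k + m \ge p/2$.

Most of the time the symbols of Wick polynomials will be Hilbert–Schmidt operators. Let us introduce some more notation in this context: we set
\[ B^2_{\rm fin}(\Gamma(\mathfrak{h})) := B^2(\Gamma(\mathfrak{h})) \cap B_{\rm fin}(\Gamma(\mathfrak{h})), \]
where $B^2(\mathcal{H})$ is the set of Hilbert–Schmidt operators on the Hilbert space $\mathcal{H}$. Recall that extending the map
\[ B^2(\mathcal{H}) \ni |u)(v| \mapsto u \otimes \bar v \in \mathcal{H} \otimes \bar{\mathcal{H}} \]
by linearity and density allows to unitarily identify $B^2(\mathcal{H})$ with $\mathcal{H} \otimes \bar{\mathcal{H}}$, where $\bar{\mathcal{H}}$ is the Hilbert space conjugate to $\mathcal{H}$. Using this identification, $B^2_{\rm fin}(\Gamma(\mathfrak{h}))$ is identified with $\Gamma_{\rm fin}(\mathfrak{h}) \otimes \Gamma_{\rm fin}(\bar{\mathfrak{h}})$ or equivalently to $\Gamma_{\rm fin}(\mathfrak{h} \oplus \bar{\mathfrak{h}})$. We will often use this identification in the sequel.

If $u \in \otimes_s^m \mathfrak{h}$, $v \in \otimes_s^n \mathfrak{h}$, $w \in B(\otimes_s^p \mathfrak{h}, \otimes_s^q \mathfrak{h})$ with $m \le p$, $n \le q$, then one defines the contracted symbols:
\[ (v|w := \big((v| \otimes_s 1^{\otimes(q-n)}\big) w \in B(\otimes_s^p \mathfrak{h}, \otimes_s^{q-n} \mathfrak{h}), \]
\[ w|u) := w \big(|u) \otimes_s 1^{\otimes(p-m)}\big) \in B(\otimes_s^{p-m} \mathfrak{h}, \otimes_s^{q} \mathfrak{h}), \]
\[ (v|w|u) := \big((v| \otimes_s 1^{\otimes(q-n)}\big) w \big(|u) \otimes_s 1^{\otimes(p-m)}\big) \in B(\otimes_s^{p-m} \mathfrak{h}, \otimes_s^{q-n} \mathfrak{h}). \]
If $a$ is selfadjoint on $\mathfrak{h}$ and $w \in B^2_{\rm fin}(\Gamma(\mathfrak{h}))$, we set
\[ \| d\Gamma(a) w \| = \sum_{1 \le i < \infty} \| (a)_i \otimes 1_{\Gamma(\bar{\mathfrak{h}})}\, w \|_{B^2_{\rm fin}(\Gamma(\mathfrak{h}))} + \sum_{1 \le i < \infty} \| 1_{\Gamma(\mathfrak{h})} \otimes (\bar a)_i\, w \|_{B^2_{\rm fin}(\Gamma(\mathfrak{h}))}, \]
where the sums are finite since $w \in B^2_{\rm fin}(\Gamma(\mathfrak{h})) \simeq \Gamma_{\rm fin}(\mathfrak{h}) \otimes \Gamma_{\rm fin}(\bar{\mathfrak{h}})$ and one uses the convention $\|au\| = +\infty$ if $u \notin D(a)$. We collect now some bounds on various commutators with Wick polynomials.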
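Proposition 2.6. (i) Let $b$ be a selfadjoint operator on $\mathfrak{h}$ and $w \in B_{\rm fin}(\Gamma(\mathfrak{h}))$. Then:
\[ [d\Gamma(b), \mathrm{Wick}(w)] = \mathrm{Wick}([d\Gamma(b), w]), \]
as quadratic form on $D(d\Gamma(b)) \cap D(N^{\deg(w)/2})$.
(ii) Let $q$ be a unitary operator on $\mathfrak{h}$ and $w \in B_{\rm fin}(\Gamma(\mathfrak{h}))$. Then
\[ \Gamma(q)\, \mathrm{Wick}(w)\, \Gamma(q)^{-1} = \mathrm{Wick}(\Gamma(q) w \Gamma(q)^{-1}). \]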
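(iii) Let $w \in B_{\rm fin}(\Gamma(\mathfrak{h}))$ of order $(p, q)$ and $h \in \mathfrak{h}$. Then:
\[ [\mathrm{Wick}(w), a^*(h)] = p\, \mathrm{Wick}(w|h)), \qquad [\mathrm{Wick}(w), a(h)] = q\, \mathrm{Wick}((h|w), \tag{2.6} \]
\[ W(h)\, \mathrm{Wick}(w)\, W(-h) = \sum_{s=0}^{p} \sum_{r=0}^{q} \frac{p!}{s!} \frac{q!}{r!} \Big(\frac{i}{\sqrt 2}\Big)^{p+q-r-s} \mathrm{Wick}(w_{s,r}), \tag{2.7} \]
where
\[ w_{s,r} = (h^{\otimes(q-r)} | w | h^{\otimes(p-s)}). \tag{2.8} \]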
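Proposition 2.7. (i) Let $q \in B(\mathfrak{h})$, $\|q\| \le 1$ and $w \in B^2_{\rm fin}(\Gamma(\mathfrak{h}))$. Then for $m + k \ge \deg(w)/2$:
\[ \| (N + 1)^{-m} [\Gamma(q), \mathrm{Wick}(w)] (N + 1)^{-k} \| \le C \| d\Gamma(1 - q) w \|. \tag{2.9} \]
(ii) Let $j = (j_0, j_\infty)$ with $j_0, j_\infty \in B(\mathfrak{h})$, $\| j_0^* j_0 + j_\infty^* j_\infty \| \le 1$. Then for $m + k \ge \deg(w)/2$:
\[ \| (N_0 + N_\infty + 1)^{-m} \big( I^*(j)\, \mathrm{Wick}(w) - (\mathrm{Wick}(w) \otimes 1) I^*(j) \big) (N + 1)^{-k} \| \le C \| d\Gamma(1 - j_0) w \| + C \| d\Gamma(j_\infty) w \|. \tag{2.10} \]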
3. Abstract QFT Hamiltonians

In this section, we define the class of abstract QFT Hamiltonians that we will consider in this paper.

3.1. Hamiltonians

Let $\omega$ be a selfadjoint operator on $\mathfrak{h}$ and $w \in B^2_{\rm fin}(\Gamma(\mathfrak{h}))$ such that $w = w^*$. We set
\[ H_0 := d\Gamma(\omega), \qquad V := \mathrm{Wick}(w). \]
Clearly $H_0$ is selfadjoint and $V$ symmetric on $D(N^n)$ for $n \ge \deg(w)/2$ by Lemma 2.5. We assume:
(H1) $\inf \sigma(\omega) = m > 0$,
(H2) $H_0 + V$ is essentially selfadjoint and bounded below on $D(H_0) \cap D(V)$.
We set $H := \overline{H_0 + V}$. In the sequel, we fix $b > 0$ such that $H + b \ge 1$. We assume:
(H3) $\forall n \in \mathbb{N}$, $\exists\, p \in \mathbb{N}$ such that $\| N^n H_0 (H + b)^{-p} \| < \infty$,
$\forall P \in \mathbb{N}$, $\exists\, P < M \in \mathbb{N}$ such that $\| N^M (H + b)^{-1} (N + 1)^{-P} \| < \infty$.
The bounds in (H3) are often called higher order estimates.

Definition 3.1. A Hamiltonian $H$ on $\Gamma(\mathfrak{h})$ satisfying (Hi) for $1 \le i \le 3$ will be called an abstract QFT Hamiltonian.
3.2. Hypotheses on the one-particle Hamiltonian

The study of the spectral and scattering theory of abstract QFT Hamiltonians relies heavily on corresponding statements for the one-particle Hamiltonian $\omega$. The now standard approach to such results is through the proof of a Mourre estimate and suitable propagation estimates on the unitary group $e^{-it\omega}$. Many of these results can be formulated in a completely abstract way. A convenient setup is based on the introduction of only three selfadjoint operators on the one-particle space $\mathfrak{h}$: the Hamiltonian $\omega$, a conjugate operator $a$ for $\omega$ and a weight operator $\langle x\rangle$. In this subsection we describe the necessary abstract hypotheses and collect various technical results used in the sequel. We will use the abstract operator classes introduced in Sec. 2.3.

Commutator estimates. We assume that there exists a selfadjoint operator $\langle x\rangle \ge 1$ for $\omega$ such that:
(G1 i) there exists a subspace $\mathcal{S} \subset \mathfrak{h}$ such that $\mathcal{S}$ is a core for $\omega$, $\omega^2$ and the operators $\omega$, $\langle x\rangle$, $(\langle x\rangle - z)^{-1}$ for $z \in \mathbb{C} \setminus \sigma(\langle x\rangle)$, and $F(\langle x\rangle)$ for $F \in C_0^\infty(\mathbb{R})$ preserve $\mathcal{S}$.
(G1 ii) $[\langle x\rangle, \omega]$ belongs to $S^0_{(3)}$.

Definition 3.2. An operator $\langle x\rangle$ satisfying (G1) will be called a weight operator for $\omega$.

Dynamical estimates. Particles living at time $t$ in $\langle x\rangle \ge ct$ for some $c > 0$ are interpreted as free particles. The following assumption says that states in $\mathfrak{h}_c(\omega)$ describe free particles:
(S) there exists a subspace $\mathfrak{h}_0$ dense in $\mathfrak{h}_c(\omega)$ such that for all $h \in \mathfrak{h}_0$ there exists $\epsilon > 0$ such that
\[ \Big\| 1_{[0,\epsilon]}\Big(\frac{\langle x\rangle}{|t|}\Big) e^{-it\omega} h \Big\| \in O(t^{-\mu}), \qquad \mu > 1. \]
(We recall that $\mathfrak{h}_c(\omega)$ is the continuous spectral subspace for $\omega$.) Note that (S) can be deduced from (G1), (M1) and (G4), assuming that $\omega \in C^3(a)$. The standard way to see this is to prove first a strong propagation estimate (see, e.g., [14]):
\[ F\Big(\frac{|a|}{|t|} \le \epsilon\Big) \chi(\omega) e^{-it\omega} (a + i)^{-2} \in O(t^{-2}), \]
in norm if $\chi \in C_0^\infty(\mathbb{R})$ is supported away from $\kappa_a(\omega)$, and then to obtain a corresponding estimate with $a$ replaced by $\langle x\rangle$ using (G4) and arguments similar to those in [11, Lemma A.3].

The operators $[\omega, i\langle x\rangle]$ and $[\omega, i[\omega, i\langle x\rangle]]$ are respectively the instantaneous velocity and acceleration for the weight $\langle x\rangle$. The following condition means roughly that
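the acceleration is positive:
(G2) there exists $0 < \epsilon < \frac12$ such that
\[ [\omega, i[\omega, i\langle x\rangle]] = \gamma^2 + r_{-1-\epsilon}, \]
where $\gamma = \gamma^* \in S^{-1/2}_{\epsilon,(2)}$ and $r_{-1-\epsilon} \in S^{-1-\epsilon}_{(0)}$.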
Mourre theory and local compactness. We now state hypotheses about the conjugate operator $a$:
(M1 i) $\omega \in C^1(a)$, $[\omega, ia]_0 \in B(\mathfrak{h})$.
(M1 ii) $\rho^a_\omega \ge 0$, $\tau_a(\omega)$ is a closed countable set.
We will also need the following condition, which allows one to localize the operator $[\omega, ia]_0$ using the weight operator $\langle x\rangle$.
(G3) $a$ preserves $\mathcal{S}$ and $[\langle x\rangle, [\omega, ia]_0]$ belongs to $S^0_{(0)}$.
Note that if $a$ preserves $\mathcal{S}$ then $[\omega, a]_0 = \omega a - a\omega$ on $\mathcal{S}$. Therefore $[\langle x\rangle, [\omega, a]_0]$ in (G3) is well defined as an operator on $\mathcal{S}$. We will also need some conditions which roughly say that $a$ is controlled by $\langle x\rangle$. This allows one to translate propagation estimates for $a$ into propagation estimates for $\langle x\rangle$.
(G4) $a$ belongs to $S^1_{(0)}$.
Note that by Lemma 2.3(i), $a^2 \in S^2_{(0)}$, hence $a\langle x\rangle^{-1}$ and $a^2\langle x\rangle^{-2}$ are bounded. We state also a hypothesis on local compactness:
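(G5) $\langle x\rangle^{-\epsilon} (\omega + 1)^{-\epsilon}$ is compact on $\mathfrak{h}$ for some $0 < \epsilon \le \frac12$.

Comparison operator. To get a sharp Mourre estimate for abstract QFT Hamiltonians, it is convenient to assume the existence of a comparison operator $\omega_\infty$ such that:
(C i) $C^{-1} \omega_\infty^2 \le \omega^2 \le C \omega_\infty^2$, for some $C > 0$,
(C ii) $\omega_\infty$ satisfies (G1), (M1), (G3) for the same $\langle x\rangle$ and $a$, and $\kappa^a_{\omega_\infty} \subset \tau^a_{\omega_\infty}$.
Note that the last condition in (C ii) is satisfied if $\omega_\infty$ has no eigenvalues.
(C iii) $\omega^{-1/2} (\omega - \omega_\infty) \omega^{-1/2} \langle x\rangle^{\epsilon}$ and $[\omega - \omega_\infty, ia]_0 \langle x\rangle^{\epsilon}$ are bounded for some $\epsilon > 0$.

Some consequences. We now state some standard consequences of (G1).

Lemma 3.3. Assume (H1), (G1). Then for $F \in C_0^\infty(\mathbb{R})$:
(i) $[F(\frac{\langle x\rangle}{R}), \mathrm{ad}^k_{\langle x\rangle} \omega] = R^{-1} F'(\frac{\langle x\rangle}{R}) [\langle x\rangle, \mathrm{ad}^k_{\langle x\rangle} \omega] + M(R)$, $k = 0, 1$, where $M(R) \in O(R^{-2}) S^0_{(0)} \cap O(R^{-1}) S^{-1}_{(0)}$.
(ii) $F(\frac{\langle x\rangle}{R}) : D(\omega) \to D(\omega)$ and $\omega F(\frac{\langle x\rangle}{R}) \omega^{-1} \in O(1)$,
(iii) $[F(\frac{\langle x\rangle}{R}), [\omega, \langle x\rangle]] \in O(R^{-1})$,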
(iv) $F(\frac{\langle x\rangle}{R}) [\omega, i\langle x\rangle] (1 - F_1)(\frac{\langle x\rangle}{R}) \in O(R^{-2})$, if $F_1 \in C_0^\infty(\mathbb{R})$ and $F F_1 = F$.
Assume (H1), (M1 i), (G3). Then for $F \in C_0^\infty(\mathbb{R})$:
(v) $[F(\frac{\langle x\rangle}{R}), [\omega, ia]_0] \in O(R^{-1})$.
Assume (H1), (G1), (G2). Then for $F \in C_0^\infty(\mathbb{R})$:
(vi) $F(\frac{\langle x\rangle}{R}) : D(\omega^2) \to D(\omega^2)$ and $[\omega^2, F(\frac{\langle x\rangle}{R})] \omega^{-1} \in O(R^{-1})$.
Let $b \in S^{-\mu}_{\delta,(1)}$ for $\mu \ge 0$ and $F \in C_0^\infty(\mathbb{R} \setminus \{0\})$. Then:
(vii) $[F(\frac{\langle x\rangle}{R}), b] \in O(R^{-\mu-1+\delta})$.
In (i) for $k = 0$ the commutator on the left-hand side is considered as a quadratic form on $D(\omega)$.

Lemma 3.4. Let $\omega_\infty$ be a comparison operator satisfying (C). Then for $F \in C^\infty(\mathbb{R})$ with $F \equiv 0$ near $0$, $F \equiv 1$ near $+\infty$ we have:
\[ \omega^{-1/2} (\omega - \omega_\infty) F\Big(\frac{\langle x\rangle}{R}\Big) \omega^{-1/2}, \quad [\omega - \omega_\infty, ia]\, F\Big(\frac{\langle x\rangle}{R}\Big) \in o(R^0). \]
The proof of Lemmas 3.3 and 3.4 will be given in the Appendix.

3.3. Hypotheses on the interaction

We now formulate the hypotheses on the interaction $V$. If $j \in C^\infty(\mathbb{R})$, we set for $R \ge 1$ $j^R = j(\frac{\langle x\rangle}{R})$. For the scattering theory of abstract QFT Hamiltonians, we will need the following decay hypothesis on the symbol of $V$:
(Is) $\| d\Gamma(j^R) w \| \in O(R^{-s})$, $s > 0$, if $j \equiv 0$ near $0$, $j \equiv 1$ near $\pm\infty$.
Note that if $w \in B^2_{\rm fin}(\Gamma(\mathfrak{h}))$ and $j$ is as above then
\[ \| d\Gamma(j^R) w \| \in o(R^0), \quad\text{when } R \to \infty. \tag{3.1} \]
Another type of hypothesis concerns the Mourre theory. We fix a conjugate operator $a$ for $\omega$ such that (M1) holds and set $A := d\Gamma(a)$. For the Mourre theory, we will impose:
(M2) $w \in D(A \otimes 1 - 1 \otimes \bar A)$.
If hypothesis (G4) holds then $a\langle x\rangle^{-1}$ is bounded. It follows that the condition
(D) $\| d\Gamma(\langle x\rangle^s) w \| < \infty$, for some $s > 1$
implies both (Is) for $s > 1$ and (M2).

4. Results

For the reader’s convenience, we summarize in this section the results of the paper. To simplify the situation we will assume that all the various hypotheses hold, i.e. we assume conditions (Hi), $1 \le i \le 3$, (Gi), $1 \le i \le 5$, (S), (M1), (C) and (D).
However, various parts of Theorem 4.1 hold under smaller sets of hypotheses; we refer the reader to later sections for precise statements. The notation $d\Gamma^{(1)}(E)$ for a set $E \subset \mathbb{R}$ is defined in Sec. 7.3.

Theorem 4.1. Let $H$ be an abstract QFT Hamiltonian. Then:
(1) if $\sigma_{\rm ess}(\omega) = [m_\infty, +\infty[$ then
\[ \sigma_{\rm ess}(H) = [\inf \sigma(H) + m_\infty, +\infty[. \]
(2) The Mourre estimate holds for $A = d\Gamma(a)$ on $\mathbb{R} \setminus \tau$, where
\[ \tau = \sigma_{\rm pp}(H) + d\Gamma^{(1)}(\tau_a(\omega)), \]
where $\tau_a(\omega)$ is the set of thresholds of $\omega$ for $a$ and $d\Gamma^{(1)}(E)$ for $E \subset \mathbb{R}$ is defined in (7.18).
(3) The asymptotic Weyl operators:
\[ W^\pm(h) := \mathrm{s\text{-}}\lim_{t \to \pm\infty} e^{itH} W(e^{-it\omega} h) e^{-itH} \quad\text{exist for all } h \in \mathfrak{h}_c(\omega), \]
and define two regular CCR representations over $\mathfrak{h}_c(\omega)$.
(4) There exist unitary operators $\Omega^\pm$, called the wave operators:
\[ \Omega^\pm : \mathcal{H}_{\rm pp}(H) \otimes \Gamma(\mathfrak{h}_c(\omega)) \to \Gamma(\mathfrak{h}) \]
such that
\[ W^\pm(h) = \Omega^\pm\, 1 \otimes W(h)\, \Omega^{\pm *}, \qquad h \in \mathfrak{h}_c(\omega), \]
\[ H = \Omega^\pm \big( H|_{\mathcal{H}_{\rm pp}(H)} \otimes 1 + 1 \otimes d\Gamma(\omega) \big) \Omega^{\pm *}. \]

Parts (1)–(4) are proved respectively in Theorems 7.1, 7.10, 8.1 and 10.6. Statement (1) is the familiar HVZ theorem, describing the essential spectrum of $H$. Statement (2) is the well-known Mourre estimate. Under additional conditions, it is possible to deduce from it resolvent estimates which imply in particular that the singular continuous spectrum of $H$ is empty. In our case this result follows from (4), provided we know that $\omega$ has no singular continuous spectrum. Statement (3) is rather easy. Statement (4) is the most important result of this paper, namely the asymptotic completeness of wave operators.

Remark 4.2. Assume that there exists another operator $\omega_\infty$ on $\mathfrak{h}$ such that $\omega|_{\mathfrak{h}_c(\omega)}$ is unitarily equivalent to $\omega_\infty$. Typically this follows from the construction of a nice scattering theory for the pair $(\omega, \omega_\infty)$. Then since $d\Gamma(\omega)$ restricted to $\Gamma(\mathfrak{h}_c(\omega))$ is unitarily equivalent to $d\Gamma(\omega_\infty)$, we can replace $\omega$ by $\omega_\infty$ in statement (4) of Theorem 4.1.

5. Examples

In this section, we give examples of QFT Hamiltonians to which we can apply Theorem 4.1. Our two examples are space-cutoff $P(\varphi)_2$ Hamiltonians for a variable
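metric, and similar $P(\varphi)_{d+1}$ models for $d \ge 2$ if the interaction term has also an ultraviolet cutoff. For $\mu \in \mathbb{R}$ we denote by $S^\mu(\mathbb{R}^d)$ the space of $C^\infty$ functions on $\mathbb{R}^d$ such that:
\[ \partial_x^\alpha f(x) \in O(\langle x\rangle^{-\mu-|\alpha|}), \quad \alpha \in \mathbb{N}^d, \qquad\text{where } \langle x\rangle = (1 + x^2)^{1/2}. \]

5.1. Space-cutoff $P(\varphi)_2$ models with variable metric

We fix a second order differential operator on $\mathfrak{h} = L^2(\mathbb{R})$:
\[ h := D a(x) D + c(x), \qquad D = -i\partial_x, \]
where $a(x) \ge c_0$, $c(x) \ge c_0$ for some $c_0 > 0$ and $a(x) - 1$, $c(x) - m_\infty^2 \in S^{-\mu}(\mathbb{R})$ for some $m_\infty, \mu > 0$. We set:
\[ \omega := h^{1/2} \]
and consider the free Hamiltonian
\[ H_0 = d\Gamma(\omega), \quad\text{acting on } \Gamma(\mathfrak{h}). \]
To define the interaction, we fix a real polynomial with $x$-dependent coefficients:
\[ P(x, \lambda) = \sum_{p=0}^{2n} a_p(x) \lambda^p, \qquad a_{2n}(x) \equiv a_{2n} > 0, \tag{5.1} \]
and a function $g \in L^1(\mathbb{R})$ with $g \ge 0$. For $x \in \mathbb{R}$, one sets
\[ \varphi(x) := \phi(\omega^{-1/2} \delta_x), \]
where $\delta_x$ is the Dirac distribution at $x$. The associated $P(\varphi)_2$ interaction is formally defined as:
\[ V := \int_{\mathbb{R}} g(x) :\!P(x, \varphi(x))\!: dx, \]
where $:\ :$ denotes the Wick ordering.

In [12], we prove the following theorem. Condition (B3) below is formulated in terms of a (generalized) basis of eigenfunctions of $h$. To be precise we say that the families $\{\psi_l(x)\}_{l \in I}$ and $\{\psi(x, k)\}_{k \in \mathbb{R}}$ form a generalized basis of eigenfunctions of $h$ if:
\[ \psi_l(\cdot) \in L^2(\mathbb{R}), \qquad \psi(\cdot, k) \in \mathcal{S}'(\mathbb{R}), \]
\[ h\psi_l = \epsilon_l \psi_l, \quad \epsilon_l \le m_\infty^2, \quad l \in I, \]
\[ h\psi(\cdot, k) = (k^2 + m_\infty^2)\psi(\cdot, k), \quad k \in \mathbb{R}, \]
\[ \sum_{l \in I} |\psi_l)(\psi_l| + \frac{1}{2\pi} \int_{\mathbb{R}} |\psi(\cdot, k))(\psi(\cdot, k)|\, dk = 1. \]

Theorem 5.1. Assume that:
(B1) $g a_p \in L^2(\mathbb{R})$, $0 \le p \le 2n$, $g \in L^1(\mathbb{R})$, $g \ge 0$, $g (a_p)^{2n/(2n-p)} \in L^1(\mathbb{R})$, $0 \le p \le 2n - 1$,
(B2) $\langle x\rangle^s g a_p \in L^2(\mathbb{R})$ $\forall\, 0 \le p \le 2n$, for some $s > 1$.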
Assume moreover that for a measurable function $M : \mathbb{R} \to \mathbb{R}^+$ with $M(x) \ge 1$ there exists a generalized basis of eigenfunctions of $h$ such that:
(B3) $\sum_{l \in I} \| M^{-1}(\cdot)\psi_l(\cdot) \|_\infty^2 < \infty$, $\| M^{-1}(\cdot)\psi(\cdot, k) \|_\infty \le C$, $k \in \mathbb{R}$.
(B4) $g a_p M^s \in L^2(\mathbb{R})$, $g (a_p M^s)^{2n/(2n-p+s)} \in L^1(\mathbb{R})$, $\forall\, 0 \le s \le p \le 2n - 1$.
Then the Hamiltonian
\[ H = d\Gamma(\omega) + \int_{\mathbb{R}} g(x) :\!P(x, \varphi(x))\!: dx \]
satisfies all the hypotheses of Theorem 4.1 for the weight operator $\langle x\rangle = (1 + x^2)^{1/2}$ and conjugate operator $a = \frac12 (x \langle D_x\rangle^{-1} D_x + {\rm hc})$.

Remark 5.2. If $g$ is compactly supported we can take $M(x) = +\infty$ outside $\mathrm{supp}\, g$, and the meaning of (B3) is that the sup norms $\|\cdot\|_\infty$ are taken only on $\mathrm{supp}\, g$.

Remark 5.3. Condition (B3) is discussed in detail in [12], where many sufficient conditions for its validity are given. As an example let us simply mention that if $a(x) - 1$, $c(x) - m_\infty^2$ and the coefficients $a_p$ are in the Schwartz class $\mathcal{S}(\mathbb{R})$, then all conditions in Theorem 5.1 are satisfied.

5.2. Higher-dimensional examples

We work now on $L^2(\mathbb{R}^d)$ for $d \ge 2$ and consider
\[ \omega = \Big( \sum_{1 \le i,j \le d} D_i a_{ij}(x) D_j + c(x) \Big)^{1/2} \]
where $a_{ij}$, $c$ are real, $[a_{ij}](x) \ge c_0 1$, $c(x) \ge c_0$ for some $c_0 > 0$ and $[a_{ij}] - 1 \in S^{-\mu}(\mathbb{R}^d)$, $c(x) - m_\infty^2 \in S^{-\mu}(\mathbb{R}^d)$ for some $m_\infty, \mu > 0$. The free Hamiltonian is as above
\[ H_0 = d\Gamma(\omega), \]
acting on the Fock space $\Gamma(L^2(\mathbb{R}^d))$. Since $d \ge 2$ it is necessary to add an ultraviolet cutoff to make sense out of the formal expression
\[ \int_{\mathbb{R}^d} g(x) P(x, \varphi(x))\, dx. \]
We set
\[ \varphi_\kappa(x) := \phi\Big( \omega^{-1/2} \chi\Big(\frac{\omega}{\kappa}\Big) \delta_x \Big), \]
where $\chi \in C_0^\infty([-1, 1])$ is a cutoff function equal to $1$ on $[-\frac12, \frac12]$ and $\kappa \gg 1$ is an ultraviolet cutoff parameter. Since $\omega^{-1/2} \chi(\frac{\omega}{\kappa}) \delta_x \in L^2(\mathbb{R}^d)$, $\varphi_\kappa(x)$ is a well defined selfadjoint operator on $\Gamma(L^2(\mathbb{R}^d))$. If $P(x, \lambda)$ is as in (5.1) and $g \in L^1(\mathbb{R}^d)$, then
\[ V := \int_{\mathbb{R}^d} g(x) P(x, \varphi_\kappa(x))\, dx \]
is a well-defined selfadjoint operator on $\Gamma(L^2(\mathbb{R}^d))$. We have then the following theorem. As before we consider a generalized basis $\{\psi_l(x)\}_{l \in I}$ and $\{\psi(x, k)\}_{k \in \mathbb{R}^d}$ of eigenfunctions of $h$.

Theorem 5.4. Assume that:
(B1) $g a_p \in L^2(\mathbb{R}^d)$, $0 \le p \le 2n$, $g \in L^1(\mathbb{R}^d)$, $g \ge 0$, $g (a_p)^{2n/(2n-p)} \in L^1(\mathbb{R}^d)$, $0 \le p \le 2n - 1$,
(B2) $\langle x\rangle^s g a_p \in L^2(\mathbb{R}^d)$ $\forall\, 0 \le p \le 2n$, for some $s > 1$.
Assume moreover that for a measurable function $M : \mathbb{R}^d \to \mathbb{R}^+$ with $M(x) \ge 1$ there exists a generalized basis of eigenfunctions of $h$ such that:
(B3) $\sum_{l \in I} \| M^{-1}(\cdot)\psi_l(\cdot) \|_\infty^2 < \infty$, $\| M^{-1}(\cdot)\psi(\cdot, k) \|_\infty \le C$, $k \in \mathbb{R}^d$.
(B4) $g a_p M^s \in L^2(\mathbb{R}^d)$, $g (a_p M^s)^{2n/(2n-p+s)} \in L^1(\mathbb{R}^d)$, $\forall\, 0 \le s \le p \le 2n - 1$.
Then the Hamiltonian
\[ H = d\Gamma(\omega) + \int_{\mathbb{R}^d} g(x) P(x, \varphi_\kappa(x))\, dx \]
satisfies all the hypotheses of Theorem 4.1 for the weight operator $\langle x\rangle = (1 + x^2)^{1/2}$ and conjugate operator $a = \frac12 (x \cdot \langle D_x\rangle^{-1} D_x + {\rm hc})$.

Remark 5.5. Sufficient conditions for (B3) to hold with $M(x) \equiv 1$ are given in [12].

6. Commutator Estimates

In this section, we collect various commutator estimates, needed in Sec. 7.

6.1. Number energy estimates

We recall first some notation from [4]: let an operator $B(t)$ depending on some parameter $t$ map $\cap_n D(N^n) \subset \mathcal{H}$ into itself. We will write
\[ B(t) \in (N + 1)^m O_N(t^p) \quad\text{for } m \in \mathbb{R} \text{ if} \]
\[ \| (N + 1)^{-m-k} B(t) (N + 1)^{k} \| \le C_k \langle t\rangle^p, \qquad k \in \mathbb{Z}. \tag{6.1} \]
If (6.1) holds for any $m \in \mathbb{R}$, then we will write
\[ B(t) \in (N + 1)^{-\infty} O_N(t^p). \]
Likewise, for an operator $C(t)$ that maps $\cap_n D(N^n) \subset \mathcal{H}$ into $\cap_n D((N_0 + N_\infty)^n) \subset \mathcal{H}^{\rm ext}$ we will write
\[ C(t) \in (N + 1)^m \check{O}_N(t^p) \quad\text{for } m \in \mathbb{R} \text{ if} \]
\[ \| (N_0 + N_\infty)^{-m-k} C(t) (N + 1)^{k} \| \le C_k \langle t\rangle^p, \qquad k \in \mathbb{Z}. \tag{6.2} \]
If (6.2) holds for any $m \in \mathbb{R}$, then we will write
\[ C(t) \in (N + 1)^{-\infty} \check{O}_N(t^p). \]
The notations $(N + 1)^m o_N(t^p)$, $(N + 1)^m \check{o}_N(t^p)$ are defined similarly.

Lemma 6.1. Let $H$ be an abstract QFT Hamiltonian. Then:
(i) For all $P \in \mathbb{N}$ there exists $\alpha > 0$ such that for all $0 \le s \le P$
\[ N^{s+\alpha} (H - z)^{-1} N^{-s} \in O(|\mathrm{Im}\, z|^{-1}), \quad\text{uniformly for } z \in \mathbb{C} \setminus \mathbb{R} \cap \{|z| \le R\}. \]
(ii) For $\chi \in C_0^\infty(\mathbb{R})$ we have
\[ \| N^m \chi(H) N^p \| < \infty, \qquad m, p \in \mathbb{N}. \]

Proof. (ii) follows directly from (H3). It remains to prove (i). Let us fix $P \in \mathbb{N}$ and $M > P$ such that
\[ N^M (H + b)^{-1} (N + 1)^{-P} \in B(\mathcal{H}). \tag{6.3} \]
We deduce also from (H3) and interpolation that there exists $\alpha > 0$ such that
\[ N^\alpha (H + b)^{-1} \in B(\mathcal{H}). \tag{6.4} \]
We can choose $\alpha > 0$ small enough such that $\delta = (M - \alpha)/P > 1$. Interpolating between (6.3) and (6.4) we obtain first that $N^{\alpha+\delta x} (H + b)^{-1} (N + 1)^{-x}$ is bounded for all $x \in [0, P]$. Since $\delta > 1$, we get that
\[ \| N^{\alpha(s+1)} (H + b)^{-1} (N + 1)^{-s\alpha} \| < \infty, \qquad s \in [0, P\alpha^{-1}]. \tag{6.5} \]
Without loss of generality, we can assume that $\alpha^{-1} \in \mathbb{N}$, and we will prove by induction on $s \in \mathbb{N}$ that
\[ N^{(s+1)\alpha} (H - z)^{-1} (N + 1)^{-s\alpha} \in O(|\mathrm{Im}\, z|^{-1}), \tag{6.6} \]
uniformly for $z \in \mathbb{C} \setminus \mathbb{R} \cap \{|z| \le R\}$ and $0 \le s \le P\alpha^{-1}$. For $s = 0$, (6.6) follows from the fact that $N^\alpha (H + b)^{-1}$ is bounded. Let us assume that (6.6) holds for $s - 1$. Then we write:
\[ N^{(s+1)\alpha} (H - z)^{-1} (N + 1)^{-s\alpha} = N^{(s+1)\alpha} (H + b)^{-1} N^{-s\alpha}\, N^{s\alpha} (H + b)(H - z)^{-1} (N + 1)^{-s\alpha} \]
\[ = N^{(s+1)\alpha} (H + b)^{-1} N^{-s\alpha}\, N^{s\alpha} \big(1 + (b + z)(H - z)^{-1}\big) (N + 1)^{-s\alpha}, \]
so (6.6) for $s$ follows from (6.5) and the induction hypothesis. We extend then (6.6) from integer $s \in [0, P\alpha^{-1}]$ to all $s \in [0, P\alpha^{-1}]$ by interpolation. Denoting $s\alpha$ by $s$ we obtain (i).
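Two small computations are used implicitly in this proof: the passage from the interpolated bound to (6.5), and the resolvent identity behind the induction step. They can be spelled out as follows; this is only a sketch of the bookkeeping, with the substitution $x = s\alpha$ being the one suggested by the statement of (6.5).

```latex
% Sketch of the two implicit steps; nothing here goes beyond elementary algebra.
% (a) Exponents: the endpoints x = 0 (exponent \alpha) and x = P (exponent M)
%     are joined linearly, which is where \delta = (M - \alpha)/P comes from:
\[
\alpha + \frac{M-\alpha}{P}\,x = \alpha + \delta x, \qquad x \in [0,P].
\]
%     Putting x = s\alpha with s \in [0, P\alpha^{-1}] and using \delta > 1:
\[
\alpha + \delta\, s\alpha \;\ge\; \alpha + s\alpha \;=\; \alpha(s+1),
\]
%     so the exponent on N in (6.5) is dominated by the interpolated one.
% (b) Resolvent identity used in the induction step:
\[
(H+b)(H-z)^{-1} = \big((H - z) + (b+z)\big)(H-z)^{-1} = 1 + (b+z)(H-z)^{-1},
\]
%     and since H is selfadjoint, \|(H-z)^{-1}\| \le |\mathrm{Im}\, z|^{-1}, hence
\[
\big\|(H+b)(H-z)^{-1}\big\| \le 1 + (b+R)\,|\mathrm{Im}\, z|^{-1}
\quad\text{for } |z| \le R .
\]
```

Step (b) is what produces the $O(|\mathrm{Im}\, z|^{-1})$ growth uniformly on bounded sets of $z$.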
6.2. Commutator estimates

Lemma 6.2. Let $H$ be an abstract QFT Hamiltonian and $\langle x\rangle$ a weight operator for $\omega$. Let $q \in C_0^\infty(\mathbb{R})$, $0 \le q \le 1$, $q \equiv 1$ near $0$. Set for $R \ge 1$ $q^R = q(\frac{\langle x\rangle}{R})$. Then for $\chi \in C_0^\infty(\mathbb{R})$:
\[ [\Gamma(q^R), \chi(H)] \in \begin{cases} (N + 1)^{-\infty} O_N(R^{-\inf(s,1)}) & \text{under hypothesis (Is)}, \\ (N + 1)^{-\infty} o_N(R^0) & \text{otherwise.} \end{cases} \]

Proof. Throughout the proof, $M$ and $P$ will denote integers chosen sufficiently large. We prove the lemma under hypothesis (Is), $s > 0$, the general case being handled by replacing hypothesis (Is) by the estimate (3.1). Clearly $\Gamma(q^R)$ preserves $D(N^n)$. We have
\[ [H_0, \Gamma(q^R)] = d\Gamma(q^R, [\omega, q^R]). \tag{6.7} \]
By Lemma 3.3(i), $[\omega, q^R] \in O(R^{-1})$ and hence $[H_0, \Gamma(q^R)](H_0 + 1)^{-1}$ is bounded. Therefore, $\Gamma(q^R)$ preserves $D(H_0)$. As in [4, Lemma 7.11] the following identity is valid as an operator identity on $D(H_0) \cap D(N^P)$:
\[ [H, \Gamma(q^R)] = [H_0, \Gamma(q^R)] + [V, \Gamma(q^R)] =: T. \]
From (6.7) and Proposition 2.4(iv) we get that $[\Gamma(q^R), H_0] \in (N + 1) O_N(R^{-1})$. Using Proposition 2.7(i) and hypothesis (Is), we get that
\[ [\Gamma(q^R), V] \in (N + 1)^n O_N(R^{-s}), \qquad n \ge \deg(w)/2, \]
which gives
\[ T \in (N + 1)^n O(R^{-\inf(s,1)}). \tag{6.8} \]
Let now
\[ T(z) := [\Gamma(q^R), (z - H)^{-1}] = -(z - H)^{-1} [\Gamma(q^R), H] (z - H)^{-1}. \]
By (H3), $D(H^M) \subset D(H_0) \cap D(N^P)$, so the following identity holds on $D(H^M)$:
\[ T(z) = (z - H)^{-1}\, T\, (z - H)^{-1}. \]
Let now $\chi_1 \in C_0^\infty(\mathbb{R})$ with $\chi_1 \chi = \chi$ and $\tilde\chi_1$, $\tilde\chi$ be almost analytic extensions of $\chi_1$, $\chi$. We write:
\[ N^m [\chi(H), \Gamma(q^R)] N^p = N^m \chi_1(H) [\chi(H), \Gamma(q^R)] N^p + N^m [\chi_1(H), \Gamma(q^R)] \chi(H) N^p \]
\[ = \frac{i}{2\pi} \int_{\mathbb{C}} \partial_{\bar z} \tilde\chi(z)\, N^m \chi_1(H) T(z) N^p\, dz \wedge d\bar z + \frac{i}{2\pi} \int_{\mathbb{C}} \partial_{\bar z} \tilde\chi_1(z)\, N^m T(z) \chi(H) N^p\, dz \wedge d\bar z. \]
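Using Lemma 6.1(i) and (6.8), we obtain that for all $n_1 \in \mathbb{N}$ there exists $n_2 \in \mathbb{N}$ such that
\[ N^{n_1} T(z) (N + 1)^{-n_2}, \quad (N + 1)^{-n_2} T(z) N^{n_1} \in O(|\mathrm{Im}\, z|^{-2}), \]
uniformly for $z \in \mathbb{C} \setminus \mathbb{R} \cap \{|z| \le R\}$. Using also Lemma 6.1(ii), we obtain that
\[ N^m [\chi(H), \Gamma(q^R)] N^p \in O(R^{-\inf(s,1)}), \]
which completes the proof of the lemma.

Let $j_0 \in C_0^\infty(\mathbb{R})$, $j_\infty \in C^\infty(\mathbb{R})$, $0 \le j_0$, $0 \le j_\infty$, $j_0^2 + j_\infty^2 \le 1$, $j_0 = 1$ near $0$ (and hence $j_\infty = 0$ near $0$). Set for $R \ge 1$ $j^R = (j_0(\frac{\langle x\rangle}{R}), j_\infty(\frac{\langle x\rangle}{R}))$.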
Lemma 6.3. Let $H$ be an abstract QFT Hamiltonian and $\langle x\rangle$ a weight operator for $\omega$. Then for $\chi \in C_0^\infty(\mathbb{R})$:
\[ \chi(H^{\rm ext}) I^*(j^R) - I^*(j^R) \chi(H) \in \begin{cases} (N + 1)^{-\infty} \check{O}(R^{-\inf(s,1)}) & \text{under hypothesis (Is)}, \\ (N + 1)^{-\infty} \check{o}(R^0) & \text{otherwise.} \end{cases} \]

Proof. Again we will only prove the lemma under hypothesis (Is). As in [4, Lemma 7.12], we have:
\[ H_0^{\rm ext} I^*(j^R) - I^*(j^R) H_0 \in (N + 1) \check{O}_N\big( \| [\omega, j_0^R] \| + \| [\omega, j_\infty^R] \| \big). \]
Writing $[\omega, j_\infty^R] = [(1 - j_\infty)^R, \omega]$, we obtain that $\| [\omega, j_0^R] \| + \| [\omega, j_\infty^R] \| \in O(R^{-1})$, hence:
\[ H_0^{\rm ext} I^*(j^R) - I^*(j^R) H_0 \in (N + 1) \check{O}_N(R^{-1}). \tag{6.9} \]
This implies that $I^*(j^R)$ sends $D(H_0)$ into $D(H_0^{\rm ext})$, and since $I^*(j^R) N = (N_0 + N_\infty) I^*(j^R)$, $I^*(j^R)$ sends also $D(N^n)$ into $D((N_0 + N_\infty)^n)$. Next by Proposition 2.7(ii) and condition (Is) we have
\[ (V \otimes 1) I^*(j^R) - I^*(j^R) V \in (N + 1)^n \check{O}_N(R^{-s}), \qquad n \ge \deg(w)/2. \tag{6.10} \]
This and (6.9) show that as an operator identity on $D(H_0) \cap D(N^n)$ we have
\[ H^{\rm ext} I^*(j^R) - I^*(j^R) H \in (N + 1)^n \check{O}_N(R^{-\min(1,s)}). \tag{6.11} \]
Using then (H3) and the fact that $I^*(j^R)$ sends $D(H_0)$ into $D(H_0^{\rm ext})$ and $D(N^n)$ into $D((N_0 + N_\infty)^n)$, we obtain the following operator identity on $D(H^M)$ for $M$ large enough:
\[ T(z) := (z - H^{\rm ext})^{-1} I^*(j^R) - I^*(j^R) (z - H)^{-1} = (z - H^{\rm ext})^{-1} \big( I^*(j^R) H - H^{\rm ext} I^*(j^R) \big) (z - H)^{-1}, \]
uniformly for $z \in \mathbb{C} \setminus \mathbb{R} \cap \{|z| \le R\}$.
Using then Lemma 6.1(i) (and its obvious extension for $H^{\rm ext}$), we obtain that for all $n_1 \in \mathbb{N}$ there exists $n_2 \in \mathbb{N}$ such that
\[ (N_0 + N_\infty)^{n_1} T(z) (N + 1)^{-n_2}, \quad (N_0 + N_\infty + 1)^{-n_2} T(z) N^{n_1} \in O(|\mathrm{Im}\, z|^{-2}) R^{-\inf(s,1)}. \tag{6.12} \]
Let us again pick $\chi_1 \in C_0^\infty(\mathbb{R})$ with $\chi_1 \chi = \chi$. We have:
\[ (N_0 + N_\infty)^m \big( \chi(H^{\rm ext}) I^*(j^R) - I^*(j^R) \chi(H) \big) N^m \]
\[ = (N_0 + N_\infty)^m \chi_1(H^{\rm ext}) \big( \chi(H^{\rm ext}) I^*(j^R) - I^*(j^R) \chi(H) \big) N^m + (N_0 + N_\infty)^m \big( \chi_1(H^{\rm ext}) I^*(j^R) - I^*(j^R) \chi_1(H) \big) \chi(H) N^m \]
\[ = \frac{i}{2\pi} \int_{\mathbb{C}} \partial_{\bar z} \tilde\chi(z)\, (N_0 + N_\infty)^m \chi_1(H^{\rm ext}) T(z) N^m\, dz \wedge d\bar z + \frac{i}{2\pi} \int_{\mathbb{C}} \partial_{\bar z} \tilde\chi_1(z)\, (N_0 + N_\infty)^m T(z) \chi(H) N^m\, dz \wedge d\bar z. \]
Using Lemma 6.1(i), (6.12), the above operator is $O(R^{-\inf(s,1)})$ as claimed.

7. Spectral Analysis of Abstract QFT Hamiltonians

In this section, we study the spectral theory of our abstract QFT Hamiltonians. The essential spectrum is described in Sec. 7.1. The Mourre estimate is proved in Sec. 7.4. An improved version with a smaller threshold set is proved in Sec. 7.5.

7.1. HVZ theorem and existence of a ground state

Theorem 7.1. Let $H$ be an abstract QFT Hamiltonian and let $\langle x\rangle$ be a weight operator for $\omega$. Assume hypotheses (G1), (G5). Then
(i) if $\sigma_{\rm ess}(\omega) \subset [m_\infty, +\infty[$ then $\sigma_{\rm ess}(H) \subset [\inf \sigma(H) + m_\infty, +\infty[$.
(ii) if $\sigma_{\rm ess}(\omega) = [m_\infty, +\infty[$ then $\sigma_{\rm ess}(H) = [\inf \sigma(H) + m_\infty, +\infty[$.

Proof. Let us pick functions $j_0, j_\infty \in C^\infty(\mathbb{R})$ with $0 \le j_0 \le 1$, $j_0 \in C_0^\infty(\mathbb{R})$, $j_0 \equiv 1$ near $0$ and $j_0^2 + j_\infty^2 = 1$. For $R \ge 1$, $j^R$ is defined as in Sec. 6.2 and we set $q^R = (j_0^R)^2$. From Sec. 2.4 we know that $I(j^R) I^*(j^R) = 1$.

We first prove (i). Let $\chi \in C_0^\infty(]-\infty, \inf \sigma(H) + m_\infty[)$. Using Lemma 6.3, we get:
\[ \chi(H) = \chi(H) I(j^R) I^*(j^R) = I(j^R) \chi(H^{\rm ext}) I^*(j^R) + o(R^0) = \sum_{k=0}^{M} I(j^R) 1_{\{k\}}(N_\infty) \chi(H^{\rm ext}) I^*(j^R) + o(R^0), \tag{7.1} \]
for some $M$, using the fact that $H$ is bounded below and $\omega \ge m > 0$. Using again Lemma 6.3, we have:
\[ I(j^R) 1_{\{0\}}(N_\infty) \chi(H^{\rm ext}) I^*(j^R) = I(j^R) 1_{\{0\}}(N_\infty) I^*(j^R) \chi(H) + o(R^0) = \Gamma(q^R) \chi(H) + o(R^0). \tag{7.2} \]
It remains to treat the other terms in (7.1). Because of the support of $\chi$ and using again Lemma 6.3, we have:
\[ I(j^R) 1_{\{k\}}(N_\infty) \chi(H^{\rm ext}) I^*(j^R) = I(j^R) 1_{\{k\}}(N_\infty)\, 1 \otimes F(d\Gamma(\omega) < m_\infty)\, \chi(H^{\rm ext}) I^*(j^R) \]
\[ = I(j^R) 1_{\{k\}}(N_\infty)\, 1 \otimes F(d\Gamma(\omega) < m_\infty)\, I^*(j^R) \chi(H) + o(R^0), \]
where $F(\lambda < m_\infty)$ is a cutoff function supported in $]-\infty, m_\infty[$. From hypothesis (H3), it follows that $1_{[P,+\infty[}(N) \chi(H)$ tends to $0$ in norm when $P \to +\infty$. Since $I^*(j^R)$ is isometric, we obtain:
\[ I(j^R) 1_{\{k\}}(N_\infty)\, 1 \otimes F(d\Gamma(\omega) < m_\infty)\, I^*(j^R) \chi(H) = I(j^R) 1_{\{k\}}(N_\infty)\, 1 \otimes F(d\Gamma(\omega) < m_\infty)\, I^*(j^R) 1_{[0,P]}(N) \chi(H) + o(R^0) + o(P^0), \]
where the error term $o(P^0)$ is uniform in $R$. Next we use the following identity from [5, Sec. 2.13]:
\[ 1_{\{k\}}(N_\infty) I^*(j^R) 1_{\{n\}}(N) = I_k \Big( \frac{n!}{(n - k)!\, k!} \Big)^{1/2} \underbrace{j_0^R \otimes \cdots \otimes j_0^R}_{n-k} \otimes \underbrace{j_\infty^R \otimes \cdots \otimes j_\infty^R}_{k}, \]
where $I_k$ is the natural isometry between $\otimes^n \mathfrak{h}$ and $\otimes^{n-k} \mathfrak{h} \otimes \otimes^{k} \mathfrak{h}$. We note next that if $F \in C_0^\infty(\mathbb{R})$ is supported in $]-\infty, m_\infty[$, $F(\omega)$ is compact on $\mathfrak{h}$, so $F(\omega) j_\infty^R$ tends to $0$ in norm when $R \to \infty$ since $\mathrm{s\text{-}}\lim_{R \to \infty} j_\infty^R = 0$. It follows from this remark that for each $k \ge 1$ and $n \le P$:
\[ I(j^R) 1_{\{k\}}(N_\infty)\, 1 \otimes F(d\Gamma(\omega) < m_\infty)\, I^*(j^R) 1_{\{n\}}(N) = o_P(R^0), \]
and hence
\[ I(j^R) 1_{\{k\}}(N_\infty) \chi(H^{\rm ext}) I^*(j^R) = o(P^0) + o(R^0) + o_P(R^0) = o(R^0), \tag{7.3} \]
if we choose first $P$ large enough and then $R$ large enough. Collecting (7.1)–(7.3) we finally get that
\[ \chi(H) = \Gamma(q^R) \chi(H) + o(R^0). \]
We use now that for each $R$, $\Gamma(q^R)(H_0 + 1)^{-1/2}$ is compact on $\Gamma(\mathfrak{h})$, which follows easily from (H1) and (G5) (see, e.g., [5, Lemma 4.2]). We obtain that $\chi(H)$ is compact as a norm limit of compact operators. Therefore $\sigma_{\rm ess}(H) \subset [\inf \sigma(H) + m_\infty, +\infty[$.

Let us now prove (ii). Note that it follows from (i) that $H$ admits a ground state. Let $\lambda = \inf \sigma(H) + \varepsilon$ for $\varepsilon > m_\infty$. Since $\varepsilon \in \sigma_{\rm ess}(\omega)$, there exist unit vectors $h_n \in D(\omega)$ such that $\lim_{n \to \infty} (\omega - \varepsilon) h_n = 0$ and $\mathrm{w\text{-}}\lim_{n \to \infty} h_n = 0$. Let $u \in \Gamma(\mathfrak{h})$ be a normalized ground state of $H$ and set $u_n = a^*(h_n) u$.
Since $u\in D(N)$ by (H3), $u_n$ is well-defined. Moreover since $\text{w-}\lim h_n = 0$, we obtain that $\lim\|u_n\| = 1$ and $\text{w-}\lim u_n = 0$. Since $u\in D(H^\infty)$, we know from (H3) that $u, Hu\in D(N^\infty)$ and hence the following identity is valid:
\[
H_0a^*(h_n)u = a^*(h_n)H_0u + a^*(\omega h_n)u = a^*(h_n)Hu - a^*(h_n)Vu + a^*(\omega h_n)u,
\]
which shows that $u_n = a^*(h_n)u\in D(H_0)$. Clearly $u_n\in D(N^\infty)$, so $u_n\in D(H)$ and
\[
\begin{aligned}
(H-\lambda)u_n = (H_0+V-\lambda)u_n &= a^*(h_n)(H-\lambda)u + a^*(\omega h_n)u + [V,a^*(h_n)]u\\
&= a^*((\omega-\varepsilon)h_n)u + [V,a^*(h_n)]u.
\end{aligned}
\]
We can compute the Wick symbol of $[V,a^*(h_n)]$ using Proposition 2.6. Using the fact that $h_n$ tends weakly to $0$ and Lemma 2.5(iii) we obtain that $[V,a^*(h_n)]u$ tends to $0$ in norm. Similarly the term $a^*((\omega-\varepsilon)h_n)u$ tends to $0$ in norm. Therefore $(u_n)$ is a Weyl sequence for $\lambda$.

7.2. Virial theorem

Let $H$ be an abstract QFT Hamiltonian. We fix a selfadjoint operator $a$ on $\mathfrak{h}$ such that hypothesis (M1 i) holds and set $A := d\Gamma(a)$. On the interaction $V$ we impose hypothesis (M2).

Lemma 7.2. Assume (M1 i) and set $\omega_t = e^{ita}\omega e^{-ita}$. Then:
(i) $e^{ita}$ induces a strongly continuous group on $D(\omega)$ and
\[
\sup_{|t|\le 1}\|\omega_t(\omega+1)^{-1}\| < \infty,\qquad \sup_{|t|\le 1}\|\omega(\omega_t+1)^{-1}\| < \infty.
\]
(ii) $\sup_{0<|t|\le 1}|t|^{-1}\|\omega-\omega_t\| < \infty$, $\text{s-}\lim_{t\to 0}t^{-1}(\omega-\omega_t) = -[\omega,ia]_0$.

Proof. The first statement of (i) follows from [9, Appendix]. This fact clearly implies the first bound in (i). The second follows from $\omega(\omega_t+1)^{-1} = e^{-ita}\omega_t(\omega+1)^{-1}e^{ita}$. We deduce then from (i) that
\[
\sup_{|t|,|s|\le 1}\|\omega_s(\omega_t+1)^{-1}\| < \infty.
\]
Since $\omega\in C^1(a)$ we have:
\[
(\omega_t+1)^{-1} - (\omega+1)^{-1} = \int_0^t e^{isa}(\omega+1)^{-1}[\omega,ia]_0(\omega+1)^{-1}e^{-isa}\,ds,
\]
as a strong integral, and hence:
\[
\begin{aligned}
(\omega-\omega_t) &= (\omega_t+1)\big((\omega_t+1)^{-1}-(\omega+1)^{-1}\big)(\omega+1)\\
&= \int_0^t(\omega_t+1)(\omega_s+1)^{-1}e^{isa}[\omega,ia]_0e^{-isa}(\omega_s+1)^{-1}(\omega+1)\,ds.
\end{aligned}
\tag{7.4}
\]
Using (7.4) we obtain (ii).
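The integral formula for the resolvent difference used in the proof above can be checked by a short formal computation. The following is only a sketch: it manipulates $a$ and $\omega$ as if all products were defined on a common domain, whereas the rigorous content is the hypothesis $\omega\in C^1(a)$.

```latex
% Formal check of the resolvent identity (domains ignored; sketch only).
\[
\frac{d}{ds}(\omega_s+1)^{-1}
= \frac{d}{ds}\,e^{isa}(\omega+1)^{-1}e^{-isa}
= e^{isa}\,\big[ia,(\omega+1)^{-1}\big]\,e^{-isa},
\]
\[
\big[ia,(\omega+1)^{-1}\big]
= (\omega+1)^{-1}\big((\omega+1)\,ia - ia\,(\omega+1)\big)(\omega+1)^{-1}
= (\omega+1)^{-1}[\omega,ia]_0(\omega+1)^{-1}.
\]
% Integrating in s from 0 to t gives
\[
(\omega_t+1)^{-1}-(\omega+1)^{-1}
= \int_0^t e^{isa}(\omega+1)^{-1}[\omega,ia]_0(\omega+1)^{-1}e^{-isa}\,ds .
\]
```

Since $e^{isa}(\omega+1)^{-1} = (\omega_s+1)^{-1}e^{isa}$, the integrand can equivalently be written as $(\omega_s+1)^{-1}e^{isa}[\omega,ia]_0e^{-isa}(\omega_s+1)^{-1}$, which is the form appearing in the second display of the proof.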
We set now
\[
A := d\Gamma(a),\qquad H_s = e^{isA}He^{-isA},\qquad H_{0,s} = e^{isA}H_0e^{-isA},\qquad V_s = e^{isA}Ve^{-isA},
\]
and introduce the quadratic forms $[H_0,iA]$, $[V,iA]$, $[H,iA]$ with domains $D(H_0)\cap D(A)$, $D(N^n)\cap D(A)$ and $D(H^m)\cap D(A)$ for $n\ge\deg w/2$ and $m$ large enough.

Proposition 7.3. Let $H$ be an abstract QFT Hamiltonian such that (M1 i), (M2) hold. Then:
(i) $[H_0,iA]$ extends uniquely as a bounded operator from $D(N)$ to $\mathcal{H}$, denoted by $[H_0,iA]_0$,
(ii) $[V,iA]$ extends uniquely as a bounded operator from $D(N^M)$ to $\mathcal{H}$ for $M$ large enough, denoted by $[V,iA]_0$,
(iii) $[H,iA]$ extends uniquely as a bounded operator from $D(H^P)$ to $\mathcal{H}$ for $P$ large enough, denoted by $[H,iA]_0$ and equal to $[H_0,iA]_0+[V,iA]_0$,
(iv) for $r$ large enough $(H+b)^{-r}$ is in $C^1(A)$ and the following identity is valid as an identity between bounded operators from $D(A)$ to $\mathcal{H}$:
\[
A(H+b)^{-r} = (H+b)^{-r}A + i\,\frac{d}{ds}(H_s+b)^{-r}\Big|_{s=0},
\tag{7.5}
\]
where
\[
\frac{d}{ds}(H_s+b)^{-r}\Big|_{s=0} = \sum_{j=0}^{r-1}(H+b)^{-r+j}\big([H_0,iA]_0+[V,iA]_0\big)(H+b)^{-j-1}
\tag{7.6}
\]
is a bounded operator on $\mathcal{H}$.

Proof. We have $[H_0,iA] = d\Gamma([\omega,ia])$, which using hypothesis (M1 i) and Proposition 2.4(i) implies that $[H_0,iA](N+1)^{-1}$ is bounded. The fact that the extension is unique follows from the fact that $D(a)\cap D(\omega)$ is dense in $\mathfrak{h}$ since $\omega\in C^1(a)$.

Let us now check (ii). Through the identification of $B^2_{\mathrm{fin}}(\mathfrak{h})$ with $\Gamma_{\mathrm{fin}}(\mathfrak{h})\otimes\Gamma_{\mathrm{fin}}(\bar{\mathfrak{h}})$, we get from Proposition 2.6 that
\[
[V,iA] = [\mathrm{Wick}(w),iA] = \mathrm{Wick}(w^{(1)}),
\]
where $w^{(1)} = (d\Gamma(a)\otimes 1 - 1\otimes d\Gamma(\bar a))w$. By (M2) $w^{(1)}\in B^2_{\mathrm{fin}}(\mathfrak{h})$, which implies that $[V,iA](N+1)^{-n}$ is bounded for $n\ge\deg w/2$ using Lemma 2.5. The fact that the extension is unique is obvious.

By the higher order estimates we have $[H,iA] = [H_0,iA]+[V,iA]$ on $D(A)\cap D(H^M)$ for $M$ large enough, so $[H,iA]_0(H+b)^{-M}$ is bounded, again by the higher order estimates. To prove that the extension is unique we need to show that $D(A)\cap D(H^M)$ is dense in $D(H^M)$ for $M$ large enough. Let $u = (H+b)^{-M}v\in D(H^M)$ and $u_\epsilon = (H+b)^{-M}(1+i\epsilon A)^{-1}v$. Clearly $u_\epsilon\to u$ in $D(H^M)$ when $\epsilon\to 0$. Next $u_\epsilon$ belongs to $D(H^M)$ and to $D(A)$ since $(H+b)^{-M}$ is in $C^1(A)$ by (iv). This completes the proof of (iii).
It remains to prove (iv). We start by proving some auxiliary properties of $H_s$. Since $H_{0,s} = d\Gamma(\omega_s)$, we obtain using Lemma 7.2(i) and Proposition 2.4(i) that
\[
\sup_{|s|\le 1}\|H_0(H_{0,s}+1)^{-1}\| < \infty.
\tag{7.7}
\]
The same arguments show also that $D(H_0) = D(H_{0,s})$, i.e. $e^{isA}$ preserves $D(H_0)$. Since $e^{isA}$ preserves $D(N^n)$ we obtain from the higher order estimates that
\[
H_s = H_{0,s}+V_s\quad\text{on } D(H^P).
\tag{7.8}
\]
Let us fix $n\ge\deg w/2$. Conjugating the bounds in (H3) by $e^{isA}$, we obtain that there exists $p\in\mathbb{N}$ such that
\[
N^{2n}H_{0,s}^2\le C(H_s+b)^{2p},\quad\text{uniformly in } |s|\le 1.
\]
Using also (7.7) we obtain
\[
N^{2n}H_0^2\le C(H_s+b)^{2p},\quad\text{uniformly in } |s|\le 1.
\tag{7.9}
\]
Let us show that for $r$ large enough:
\[
\|(H_s+b)^{-r}-(H+b)^{-r}\|\le C|s|,\qquad |s|\le 1.
\tag{7.10}
\]
Using (7.8), we can write for $P$ large enough:
\[
\begin{aligned}
\big((H_s+b)^{-r}-(H+b)^{-r}\big)(H+b)^{-P}
&= \sum_{j=0}^{r-1}(H_s+b)^{-r+j}(H-H_s)(H+b)^{-j-1}(H+b)^{-P}\\
&= \sum_{j=0}^{r-1}(H_s+b)^{-r+j}(H_0-H_{0,s}+V-V_s)(H+b)^{-j-1}(H+b)^{-P}.
\end{aligned}
\tag{7.11}
\]
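The first equality in (7.11) is an instance of a purely algebraic telescoping identity for powers of resolvents. As a sketch, write $X = H_s+b$ and $Y = H+b$ (these two letters are shorthand introduced only for this check) and ignore domain questions, which in the text are handled by the factor $(H+b)^{-P}$:

```latex
% Telescoping identity behind (7.11); X = H_s + b, Y = H + b are shorthand.
\[
X^{-r+j}\,(Y-X)\,Y^{-j-1} = X^{-r+j}\,Y^{-j} - X^{-r+j+1}\,Y^{-j-1},
\qquad 0\le j\le r-1,
\]
\[
\sum_{j=0}^{r-1}\Big(X^{-r+j}\,Y^{-j} - X^{-r+j+1}\,Y^{-j-1}\Big)
= X^{-r} - Y^{-r},
\]
% since the second term for index j cancels the first term for index j+1.
% With Y - X = H - H_s this is
\[
(H_s+b)^{-r}-(H+b)^{-r}
= \sum_{j=0}^{r-1}(H_s+b)^{-r+j}\,(H-H_s)\,(H+b)^{-j-1}.
\]
```

Dividing by $s$ and letting $s\to 0$ in the same sum is what produces the expression (7.6) for the derivative.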
Using that $H_{0,s}-H_0 = d\Gamma(\omega_s-\omega)$, Lemma 7.2(ii) and Proposition 2.4(i) we obtain that
\[
\|(H_{0,s}-H_0)(N+1)^{-1}\|\le C|s|,\qquad |s|\le 1.
\tag{7.12}
\]
If $r\ge 2p$ then for $0\le j\le r-1$ either $j+1\ge p$ or $r-j\ge p$. Using (7.9) and (7.12) we deduce that
\[
\|(H_s+b)^{-r+j}(H_{0,s}-H_0)(H+b)^{-j-1}\|\le C|s|,\qquad |s|\le 1.
\tag{7.13}
\]
Next from Proposition 2.6, we have:
\[
V_s = \mathrm{Wick}(e^{isA}we^{-isA}).
\]
Through the identification of $B^2_{\mathrm{fin}}(\mathfrak{h})$ with $\Gamma_{\mathrm{fin}}(\mathfrak{h})\otimes\Gamma_{\mathrm{fin}}(\bar{\mathfrak{h}})$, the symbol $e^{isA}we^{-isA}$ is identified with $e^{isA}\otimes e^{-is\bar A}w$. From hypothesis (M2) and Proposition 2.5, we
obtain that for $M\ge\deg(w)/2$:
\[
\|(V_s-V)(N+1)^{-M}\|\le C|s|,\qquad |s|\le 1.
\tag{7.14}
\]
By the same argument as above we obtain:
\[
\|(H_s+b)^{-r+j}(V-V_s)(H+b)^{-j-1}\|\le C|s|,\qquad |s|\le 1.
\tag{7.15}
\]
Combining (7.11), (7.15) and (7.13), we obtain (7.10). Next from (7.12) we obtain by considering first finite particle vectors that
\[
\text{s-}\lim_{s\to 0}s^{-1}(H_{0,s}-H_0)(N+1)^{-1}\ \text{exists}.
\]
We note next that by hypothesis (M2) we know that $s^{-1}(e^{isA}we^{-isA}-w)$ converges in $B^2_{\mathrm{fin}}(\Gamma(\mathfrak{h}))$ when $s\to 0$. Using then Lemma 2.5(ii), we obtain also that
\[
\text{s-}\lim_{s\to 0}s^{-1}(V_s-V)(N+1)^{-n}\ \text{exists}.
\]
From (7.11) we obtain that for $r\ge p$ and $P$ large enough:
\[
\text{s-}\lim_{s\to 0}s^{-1}\big((H_s+b)^{-r}-(H+b)^{-r}\big)\ \text{exists on } D(H^P).
\]
By (7.10) the strong limit exists on $\Gamma(\mathfrak{h})$, which shows that $(H+b)^{-r}$ is in $C^1(A)$.
Remark 7.4. The same proof as in Proposition 7.3(iv) shows that for $r$ large enough and $z_i\in\mathbb{C}\setminus\mathbb{R}$, the operator $\prod_{i=1}^{r}(z_i-H)^{-1}$ is in $C^1(A)$. Using the functional calculus formula (2.2), it is easy to deduce from this fact that $\chi(H)$ is in $C^1(A)$ for all $\chi\in C_0^\infty(\mathbb{R})$.

The following proposition is the main consequence of Proposition 7.3.

Proposition 7.5. Let $H$ be an abstract QFT Hamiltonian such that (M1 i), (M2) hold. Then the virial relation holds:
\[
1_{\{\lambda\}}(H)[H,iA]_0 1_{\{\lambda\}}(H) = 0,\qquad \lambda\in\mathbb{R}.
\tag{7.16}
\]
Proof. Let us fix $r$ large enough such that $(H+b)^{-r}\in C^1(A)$, so that $(H+b)^{-r}: D(A)\to D(A)$ and $[(H+b)^{-r},iA]$ extends as a bounded operator on $\mathcal{H}$ denoted by $[(H+b)^{-r},iA]_0$. Moreover from Proposition 7.3(iv) we have:
\[
[(H+b)^{-r},iA]_0 = -\sum_{j=0}^{r-1}(H+b)^{-r+j}[H,iA]_0(H+b)^{-j-1}.
\]
Let now $u_1,u_2\in\mathcal{H}$ be such that $Hu_i = \lambda u_i$. Since $(H+b)^{-r}\in C^1(A)$ and $u_i$ is an eigenvector of $(H+b)^{-r}$, we have the virial relation:
\[
\begin{aligned}
0 = (u_1,[(H+b)^{-r},iA]_0u_2)
&= -\sum_{j=0}^{r-1}\big(u_1,(H+b)^{-r+j}[H,iA]_0(H+b)^{-j-1}u_2\big)\\
&= -\sum_{j=0}^{r-1}(\lambda+b)^{-r-1}(u_1,[H,iA]_0u_2)\\
&= -r(\lambda+b)^{-r-1}(u_1,[H,iA]_0u_2),
\end{aligned}
\]
which proves the proposition.

7.3. Mourre estimate for second quantized Hamiltonians

In this subsection we will apply the abstract results in Sec. 2.1 to second quantized Hamiltonians. Let $\omega$, $a$ be two selfadjoint operators on $\mathfrak{h}$ such that (H1), (M1) hold. Note that it follows from Lemma 2.1 and the results recalled above it that (M1) implies also that
\[
\kappa_a(\omega)\ \text{is a closed countable set}.
\tag{7.17}
\]
Clearly $d\Gamma(\omega)\in C^1(d\Gamma(a))$ and $[d\Gamma(\omega),i\,d\Gamma(a)]_0 = d\Gamma([\omega,ia]_0)$. Since $d\Gamma(\omega)$ and $d\Gamma([\omega,ia]_0)$ commute with $N$, we can restrict them to each $n$-particle sector $\otimes_s^n\mathfrak{h}$. We denote by
\[
\rho^{d\Gamma(a)\,(1)}_{d\Gamma(\omega)}
\]
the corresponding restriction of $\rho^{d\Gamma(a)}_{d\Gamma(\omega)}$ to the range of $1_{[1,+\infty[}(N)$. Finally we introduce the following natural notation for $E\subset\mathbb{R}$:
\[
d\Gamma^{(1)}(E) = \bigcup_{n=1}^{+\infty}\underbrace{E+\cdots+E}_{n},\qquad d\Gamma(E) = \{0\}\cup d\Gamma^{(1)}(E).
\tag{7.18}
\]
Remark 7.6. As an example of use of this notation, note that if $b$ is a selfadjoint operator on $\mathfrak{h}$, then:
\[
\sigma(d\Gamma(b)) = d\Gamma(\sigma(b)).
\]
Note also that if $E$ is a closed countable set included in $[m,+\infty[$ for some $m>0$, $d\Gamma^{(1)}(E)$ is a closed countable set.

Lemma 7.7. Let $\omega$, $a$ be two selfadjoint operators on $\mathfrak{h}$ such that (M1) holds. Then:
(i) $\rho^{d\Gamma(a)}_{d\Gamma(\omega)}\ge 0$,
(ii) $\rho^{d\Gamma(a)\,(1)}_{d\Gamma(\omega)}(\lambda) = 0 \Rightarrow \lambda\in d\Gamma^{(1)}(\kappa_a(\omega))$.
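The set notation (7.18) used in the lemma can be made concrete for a finite set $E\subset[m,+\infty[$ with $m>0$: below any cutoff only finitely many sums $e_1+\cdots+e_n$ occur, so that part of $d\Gamma^{(1)}(E)$ can be enumerated. The sketch below is purely illustrative; the function names `d_gamma_1` and `d_gamma`, the `cutoff` parameter and the restriction to finite $E$ are assumptions made here, not part of the paper.

```python
from fractions import Fraction


def d_gamma_1(E, cutoff):
    """Elements of dGamma^(1)(E) that do not exceed `cutoff`.

    Assumption: E is a finite, non-empty set of strictly positive numbers,
    so only finitely many sums e_1 + ... + e_n (n >= 1) lie below the cutoff.
    """
    E = set(E)
    if not E or min(E) <= 0:
        raise ValueError("E must be a non-empty set of positive numbers")
    found = set()
    frontier = {e for e in E if e <= cutoff}  # sums with a single term
    while frontier:
        found |= frontier
        # extend each newly found sum by one more element of E
        frontier = {s + e for s in frontier for e in E if s + e <= cutoff} - found
    return sorted(found)


def d_gamma(E, cutoff):
    """Elements of dGamma(E) = {0} united with dGamma^(1)(E), up to `cutoff`."""
    return ([0] if cutoff >= 0 else []) + d_gamma_1(E, cutoff)
```

For instance, `d_gamma_1({Fraction(1), Fraction(3, 2)}, 3)` enumerates the sums $1$, $3/2$, $2$, $5/2$, $3$. Exact `Fraction` arithmetic is used in this example so that coinciding sums such as $1+1+1$ and $3/2+3/2$ are identified without floating-point rounding.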
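Proof. We have $[d\Gamma(\omega),i\,d\Gamma(a)] = d\Gamma([\omega,ia])$. Since $d\Gamma(\omega)\in C^1(d\Gamma(a))$ the virial relation is satisfied. Denote by $\rho_n$ the restriction of $\rho^{d\Gamma(a)}_{d\Gamma(\omega)}$ to $\otimes_s^n\mathfrak{h}$. Applying Lemma 2.1(iv) we obtain
\[
\rho_0(\lambda) = \begin{cases}0, & \lambda = 0,\\ +\infty, & \lambda\neq 0,\end{cases}
\qquad
\rho_n(\lambda) = \inf_{\lambda_1+\cdots+\lambda_n = \lambda}\big(\rho^a_\omega(\lambda_1)+\cdots+\rho^a_\omega(\lambda_n)\big)
\]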
for $n\ge 1$. We note next that since $\omega\ge m>0$, $\chi(d\Gamma(\omega))1_{[n,+\infty[}(N) = 0$ if $n$ is large enough, where $\chi\in C_0^\infty(\mathbb{R})$. Therefore only a finite number of $n$-particle sectors contribute to the computation of $\rho^{d\Gamma(a)}_{d\Gamma(\omega)}$ near an energy level $\lambda$. We can hence apply Lemma 2.1(iii) and obtain that $\rho^{d\Gamma(a)}_{d\Gamma(\omega)}\ge 0$.

Let us now prove the second statement of the lemma. Since $\rho^a_\omega(\lambda) = +\infty$ if $\lambda\notin\sigma(\omega)$, we have $\rho^a_\omega(\lambda) = +\infty$ for $\lambda<0$. Therefore
\[
\rho_n(\lambda) = \inf_{I_n(\lambda)}\big(\rho^a_\omega(\lambda_1)+\cdots+\rho^a_\omega(\lambda_n)\big),
\]
for $I_n(\lambda) = \{(\lambda_1,\dots,\lambda_n)\mid\lambda_1+\cdots+\lambda_n = \lambda,\ \lambda_i\ge 0\}$. The function $\rho^a_\omega(\lambda_1)+\cdots+\rho^a_\omega(\lambda_n)$ is lower semicontinuous on $\mathbb{R}^n$, hence attains its minimum on the compact set $I_n(\lambda)$. Therefore using also that $\rho^a_\omega\ge 0$, we see that $\rho_n(\lambda) = 0$ iff $\lambda\in\kappa_a(\omega)+\cdots+\kappa_a(\omega)$ ($n$ factors). Using Lemma 2.1(iii) as above, we obtain that $\rho^{d\Gamma(a)\,(1)}_{d\Gamma(\omega)}(\lambda) = 0$ implies that $\lambda\in d\Gamma^{(1)}(\kappa_a(\omega))$, which proves (ii).

7.4. Mourre estimate for abstract QFT Hamiltonians

In this subsection we prove the Mourre estimate for abstract QFT Hamiltonians. Let $H$ be an abstract QFT Hamiltonian and $a$ a selfadjoint operator on $\mathfrak{h}$ such that (M1) holds. Let also $x$ be a weight operator for $\omega$.

Theorem 7.8. Let $H$ be an abstract QFT Hamiltonian and $a$ a selfadjoint operator on $\mathfrak{h}$ such that (M1) and (M2) hold. Let $x$ be a weight operator for $\omega$ such that conditions (G1), (G3), (G5) hold. Set
\[
\tau := \sigma_{\mathrm{pp}}(H)+d\Gamma^{(1)}(\kappa_a(\omega))
\]
and $A = d\Gamma(a)$. Then:
(i) Let $\lambda\in\mathbb{R}\setminus\tau$. Then there exist $\epsilon>0$, $c_0>0$ and a compact operator $K$ such that
\[
1_{[\lambda-\epsilon,\lambda+\epsilon]}(H)[H,iA]_0 1_{[\lambda-\epsilon,\lambda+\epsilon]}(H)\ge c_0 1_{[\lambda-\epsilon,\lambda+\epsilon]}(H)+K.
\]
(ii) For all $\lambda_1\le\lambda_2$ such that $[\lambda_1,\lambda_2]\cap\tau = \emptyset$ one has:
\[
\dim 1_{[\lambda_1,\lambda_2]}(H) < \infty.
\]
Consequently $\sigma_{\mathrm{pp}}(H)$ can accumulate only at $\tau$, which is a closed countable set.
(iii) Let $\lambda\in\mathbb{R}\setminus(\tau\cup\sigma_{\mathrm{pp}}(H))$. Then there exist $\epsilon>0$ and $c_0>0$ such that
\[
1_{[\lambda-\epsilon,\lambda+\epsilon]}(H)[H,iA]_0 1_{[\lambda-\epsilon,\lambda+\epsilon]}(H)\ge c_0 1_{[\lambda-\epsilon,\lambda+\epsilon]}(H).
\]

Proof. We note first that $[H,iA]_0$ satisfies the virial relation by Proposition 7.5. Therefore we will be able to apply the abstract results in Lemma 2.1 in our situation. Recall that $H^{\mathrm{ext}} = H\otimes 1+1\otimes d\Gamma(\omega)$ and set $A^{\mathrm{ext}} = A\otimes 1+1\otimes A$.
By Proposition 7.3, $[H,iA]_0$ considered as an operator on $\mathcal{H}$ with domain $D(H^M)$ is equal to $H_1+V_1$, where $H_1 = d\Gamma([\omega,ia]_0)$, $V_1 = [V,iA]_0$. Note that by (M2) $V_1$ is a Wick polynomial with a symbol in $B^2_{\mathrm{fin}}(\mathfrak{h})$, and by (G3), $[x,[\omega,ia]]$ is bounded on $\mathfrak{h}$. Therefore using Lemma 3.3(v) we see that the analog of (6.11) holds for $[H,iA]_0$. We obtain:
\[
I^*(j^R)[H,iA]_0 = [H^{\mathrm{ext}},iA^{\mathrm{ext}}]_0I^*(j^R)+(N+1)^n\check{O}_N(R^0),
\]
for some $n$. We recall (7.2):
\[
\chi(H) = \Gamma(q^R)\chi(H)+I(j^R)\chi(H^{\mathrm{ext}})1_{[1,+\infty[}(N_\infty)I^*(j^R)+o(R^0),
\tag{7.19}
\]
for $q^R = (j_0^R)^2$. Using then Lemma 6.3 and the higher order estimates (which hold also for $H^{\mathrm{ext}}$ with the obvious modifications), we obtain that:
\[
\begin{aligned}
\chi(H)[H,iA]_0\chi(H) &= \Gamma(q^R)\chi(H)[H,iA]_0\chi(H)\\
&\quad+I(j^R)\chi(H^{\mathrm{ext}})[H^{\mathrm{ext}},iA^{\mathrm{ext}}]_0\chi(H^{\mathrm{ext}})1_{[1,+\infty[}(N_\infty)I^*(j^R)+o(R^0).
\end{aligned}
\tag{7.20}
\]
We will now prove by induction on $n\in\mathbb{N}$ the following statement $H(n)$:
(i) $\rho^A_H(\lambda)\ge 0$ for $\lambda\in\,]-\infty,\inf\sigma(H)+nm[$,
(ii) $\tau_A(H)\,\cap\,]-\infty,\inf\sigma(H)+nm[\ \subset\sigma_{\mathrm{pp}}(H)+d\Gamma(\kappa_a(\omega))$.

Statement $H(0)$ is clearly true since $\rho^A_H(\lambda) = +\infty$ for $\lambda<\inf\sigma(H)$. Let us assume that $H(n-1)$ holds. Let us denote by $\rho^{\mathrm{ext}\,(1)}$ the restriction of $\rho^{A^{\mathrm{ext}}}_{H^{\mathrm{ext}}}$ to the range of $1_{[1,+\infty[}(N_\infty)$. This function is well defined since $H^{\mathrm{ext}}$ and $[H^{\mathrm{ext}},iA^{\mathrm{ext}}]_0$ commute with $N_\infty$.

Let $\lambda\in\,]-\infty,\inf\sigma(H)+nm[$. Using Lemma 2.1(iv) and the fact that $\omega\ge m$ we obtain:
\[
\rho^{\mathrm{ext}\,(1)}(\lambda) = \inf_{(\lambda_1,\lambda_2)\in I^{(n)}(\lambda)}\big(\rho^A_H(\lambda_1)+\rho^{A\,(1)}_{H_0}(\lambda_2)\big),
\]
where
\[
I^{(n)}(\lambda) = \{(\lambda_1,\lambda_2)\mid\lambda_1+\lambda_2 = \lambda,\ \inf\sigma(H)\le\lambda_1\le\inf\sigma(H)+(n-1)m,\ 0\le\lambda_2\le-\inf\sigma(H)\},
\]
and the function $\rho^{A\,(1)}_{H_0}$ is defined in Sec. 7.3. Note that by $H(n-1)$ (i) and Lemma 7.7(i) the two functions $\rho^A_H(\lambda_1)$ and $\rho^{A\,(1)}_{H_0}(\lambda_2)$ are positive for $(\lambda_1,\lambda_2)\in I^{(n)}(\lambda)$. We deduce first from this fact that:
\[
\rho^{\mathrm{ext}\,(1)}(\lambda)\ge 0\quad\text{for }\lambda\in\,]-\infty,\inf\sigma(H)+nm[.
\tag{7.21}
\]
Moreover using that the lower semicontinuous function $\rho^A_H(\lambda_1)+\rho^{A\,(1)}_{H_0}(\lambda_2)$ attains its minimum on the compact set $I^{(n)}(\lambda)\subset\mathbb{R}^2$, we obtain that
\[
\rho^{\mathrm{ext}\,(1)}(\lambda) = 0,\ \lambda\in\,]-\infty,\inf\sigma(H)+nm[\ \Rightarrow\ \lambda = \lambda_1+\lambda_2,\ \text{where }(\lambda_1,\lambda_2)\in I^{(n)}(\lambda),\ \rho^A_H(\lambda_1) = \rho^{A\,(1)}_{H_0}(\lambda_2) = 0.
\tag{7.22}
\]
From $H(n-1)$ (ii) and Lemma 2.1(ii) we get that
\[
\rho^A_H(\lambda_1) = 0,\ \lambda_1\in\,]-\infty,\inf\sigma(H)+(n-1)m[\ \Rightarrow\ \lambda_1\in\sigma_{\mathrm{pp}}(H)+d\Gamma(\kappa_a(\omega)).
\]
From Lemma 7.7(ii) we know that
\[
\rho^{A\,(1)}_{H_0}(\lambda_2) = 0\ \Rightarrow\ \lambda_2\in d\Gamma^{(1)}(\kappa_a(\omega)).
\]
Using (7.22) we get that
\[
\rho^{\mathrm{ext}\,(1)}(\lambda) = 0,\ \lambda\in\,]-\infty,\inf\sigma(H)+nm[\ \Rightarrow\ \lambda\in\sigma_{\mathrm{pp}}(H)+d\Gamma^{(1)}(\kappa_a(\omega)).
\tag{7.23}
\]
The operators $\Gamma(q^R)\chi(H)$ and hence $\Gamma(q^R)\chi(H)[H,iA]_0\chi(H)$ are compact on $\mathcal{H}$. Choosing hence $R$ large enough in (7.20) we obtain using (7.19) and the fact that $I(j^R)I^*(j^R) = 1$ that
\[
\tilde\rho^A_H(\lambda)\ge\rho^{\mathrm{ext}\,(1)}(\lambda),\qquad\lambda\in\,]-\infty,\inf\sigma(H)+nm[.
\tag{7.24}
\]
By Lemma 2.1(i) this implies first that $\rho^A_H\ge 0$ on $]-\infty,\inf\sigma(H)+nm[$, i.e. $H(n)$ (i) holds. Using then (7.23) we obtain that
\[
\tilde\rho^A_H(\lambda) = 0,\ \lambda\in\,]-\infty,\inf\sigma(H)+nm[\ \Rightarrow\ \lambda\in\sigma_{\mathrm{pp}}(H)+d\Gamma^{(1)}(\kappa_a(\omega)),
\]
which proves $H(n)$ (ii). Since $H(n)$ holds for any $n$ we obtain statement (i) of the theorem.

The fact that $\dim 1_{[\lambda_1,\lambda_2]}(H) < \infty$ if $[\lambda_1,\lambda_2]\cap\tau = \emptyset$ follows from the abstract results recalled in Sec. 2.1. We saw in (7.17) that $\kappa_a(\omega)$ is a closed countable set. Using also Remark 7.6, this implies by induction on $n$ that $\tau\,\cap\,]-\infty,\inf\sigma(H)+nm[$ is a closed countable set for any $n$. Finally statement (iii) follows from Lemma 2.1. This completes the proof of the theorem.

7.5. Improved Mourre estimate

Theorem 7.8 can be rephrased as:
\[
\tau_A(H)\subset\sigma_{\mathrm{pp}}(H)+d\Gamma^{(1)}(\kappa_a(\omega)),
\]
which is sufficient for our purposes. Nevertheless a little attention shows that one should expect a better result, namely:
\[
\tau_A(H)\subset\sigma_{\mathrm{pp}}(H)+d\Gamma^{(1)}(\tau_a(\omega)),
\]
i.e. eigenvalues of $\omega$ away from $\tau_a(\omega)$ should not contribute to the set of thresholds of $H$. In this subsection we prove this result if there exists a comparison operator $\omega_\infty$ such that hypothesis (C) holds.

We fix a function $q\in C^\infty(\mathbb{R})$ such that
\[
0\le q\le 1,\qquad q\equiv 0\ \text{near } 0,\qquad q\equiv 1\ \text{near } 1.
\tag{7.25}
\]
Lemma 7.9. Assume (H1), (G1), (G3), (M1) for $\omega$ and $\omega_\infty$ and (C). Set $H_0 = d\Gamma(\omega)$, $H_\infty = d\Gamma(\omega_\infty)$. Let $q$ be as in (7.25) and $\chi\in C_0^\infty(\mathbb{R})$. Then:
\[
(\chi^2(H_0)-\chi^2(H_\infty))\Gamma(q^R)\in o(R^0),
\tag{7.26}
\]
\[
\chi(H_0)[H_0,iA]_0\chi(H_0)\Gamma(q^R) = \chi(H_\infty)[H_\infty,iA]_0\chi(H_\infty)\Gamma(q^R)+o(R^0).
\tag{7.27}
\]
Assume additionally (G5). Then
\[
\tilde\rho^a_\omega = \tilde\rho^a_{\omega_\infty}.
\tag{7.28}
\]

Proof. We will first prove the following estimates:
\[
[\chi(H_\epsilon),\Gamma(q^R)],\quad(\chi(H_0)-\chi(H_\infty))\Gamma(q^R)\in o(R^0),
\tag{7.29}
\]
\[
\begin{aligned}
&(H_{\epsilon_1}+i)^{-1}[H_0-H_\infty,iA]_0\Gamma(q^R)(H_{\epsilon_2}+i)^{-1}\in o(R^0),\\
&(H_{\epsilon_1}+i)^{-1}[[H_\infty,iA]_0,\Gamma(q^R)](H_{\epsilon_2}+i)^{-1}\in o(R^0),
\end{aligned}
\tag{7.30}
\]
for $\epsilon,\epsilon_1,\epsilon_2\in\{0,\infty\}$. If we use the identities
\[
[d\Gamma(b_i),\Gamma(q^R)] = d\Gamma(q^R,[b_i,q^R]),\qquad d\Gamma(b_1-b_2)\Gamma(q^R) = d\Gamma(q^R,(b_1-b_2)q^R),
\]
for $b_1 = \omega$, $b_2 = \omega_\infty$, Lemma 3.4, Lemma 3.3(i) and the bounds in Proposition 2.4, it is easy to see that uniformly in $z\in\mathbb{C}\setminus\mathbb{R}\cap\{|z|\le R\}$:
\[
[(z-H_\epsilon)^{-1},\Gamma(q^R)]\in O(R^{-1})|\mathrm{Im}\,z|^{-2},\qquad(z-H_{\epsilon_1})^{-1}(H_0-H_\infty)\Gamma(q^R)(z-H_{\epsilon_2})^{-1}\in o(R^0)|\mathrm{Im}\,z|^{-2}.
\]
Using the functional calculus formula (2.2) this implies (7.29). The proof of (7.30) is similar using Lemma 3.4 and Lemma 3.3(v). The proof of (7.27) is now easy: we move the operator $\Gamma(q^R)$ to the left, changing $H_0$ into $H_\infty$ along the way, and then move $\Gamma(q^R)$ back to the right. All error terms are $o(R^0)$, by (7.29), (7.30). (7.26) follows from (7.29).

If we restrict (7.26), (7.27) to the one-particle sector we obtain that
\[
(\chi^2(\omega)-\chi^2(\omega_\infty))q^R\in o(R^0),\qquad\chi(\omega)[\omega,ia]_0\chi(\omega)q^R = \chi(\omega_\infty)[\omega_\infty,ia]_0\chi(\omega_\infty)q^R+o(R^0).
\]
Using (G5) and the fact that $(1-q)\in C_0^\infty(\mathbb{R})$ we see that $\chi(H_\epsilon)(1-q)^R$ is compact for $\epsilon = 0,\infty$. Writing $1 = (1-q)^R+q^R$, we easily obtain (7.28).

Theorem 7.10. Let $H$ be an abstract QFT Hamiltonian satisfying the hypotheses of Theorem 7.8. Let $\omega_\infty$ be a comparison Hamiltonian on $\mathfrak{h}$ such that (C1) holds. Then the conclusions of Theorem 7.8 hold for
\[
\tau := \sigma_{\mathrm{pp}}(H)+d\Gamma^{(1)}(\tau_a(\omega)).
\]

Proof. We use the notation in the proof of Theorem 7.8. We pick a function $q_1$ satisfying (7.25) such that $q_1j_\infty = j_\infty$, so that
\[
I^*(j^R) = 1\otimes\Gamma(q_1^R)I^*(j^R).
\]
Therefore in (7.20) we can insert $1\otimes\Gamma(q_1^R)$ to the left of $I^*(j^R)$. If we set
\[
H^{\mathrm{ext}}_\infty := H\otimes 1+1\otimes H_\infty,
\]
then using the obvious extension of Lemma 7.9 to $H^{\mathrm{ext}}$ and $H^{\mathrm{ext}}_\infty$, we obtain instead of (7.20):
\[
\begin{aligned}
\chi(H)[H,iA]_0\chi(H) &= \Gamma(q^R)\chi(H)[H,iA]_0\chi(H)\\
&\quad+I(j^R)\chi(H^{\mathrm{ext}}_\infty)[H^{\mathrm{ext}}_\infty,iA^{\mathrm{ext}}]_0\chi(H^{\mathrm{ext}}_\infty)1_{[1,+\infty[}(N_\infty)I^*(j^R)+o(R^0).
\end{aligned}
\tag{7.31}
\]
Therefore in the later steps of the proof we can replace $\omega$ by $\omega_\infty$. By assumption $\kappa_a(\omega_\infty) = \tau_a(\omega_\infty)$ and by Lemma 7.9 $\tau_a(\omega_\infty) = \tau_a(\omega)$. This completes the proof of the theorem.

8. Scattering Theory for Abstract QFT Hamiltonians

In this section, we consider the scattering theory for our abstract QFT Hamiltonians. This theory is formulated in terms of asymptotic Weyl operators (see Theorem 8.1), which form regular CCR representations over $\mathfrak{h}_c(\omega)$. Using the fact that the theory is massive, it is rather easy to show that this representation is of Fock type (see Theorem 8.5). The basic question of scattering theory, namely the asymptotic completeness of wave operators, then amounts to proving that the spaces of vacua for the two asymptotic CCR representations coincide with the space of bound states for $H$. This will be shown in Theorem 10.6, using the propagation estimates of Sec. 9.

Throughout this section we only consider objects with superscript $+$, corresponding to $t\to+\infty$. The corresponding objects with superscript $-$, corresponding to $t\to-\infty$, have the same properties.

8.1. Asymptotic fields

For $h\in\mathfrak{h}$ we set $h_t := e^{-it\omega}h$. Recall that $\mathfrak{h}_c(\omega)\subset\mathfrak{h}$ is the continuous spectral subspace for $\omega$ and that by hypothesis (S) there exists a subspace $\mathfrak{h}_0$ dense in $\mathfrak{h}_c(\omega)$ such that for all $h\in\mathfrak{h}_0$ there exists $\epsilon>0$ such that
\[
\Big\|1_{[0,\epsilon]}\Big(\frac{x}{|t|}\Big)e^{-it\omega}h\Big\|\in O(t^{-\mu}),\qquad\mu>1.
\]

Theorem 8.1. Let $H$ be an abstract QFT Hamiltonian such that hypotheses (Is) for $s>1$ and (S) hold. Then:
(i) For all $h\in\mathfrak{h}_c(\omega)$ the strong limits
\[
W^+(h) := \text{s-}\lim_{t\to+\infty}e^{itH}W(h_t)e^{-itH}
\tag{8.1}
\]
exist. They are called the asymptotic Weyl operators. The asymptotic Weyl operators can also be defined using the norm limit:
\[
W^+(h)(H+b)^{-n} = \lim_{t\to+\infty}e^{itH}W(h_t)(H+b)^{-n}e^{-itH},
\tag{8.2}
\]
for $n$ large enough.
(ii) The map
\[
\mathfrak{h}_c(\omega)\ni h\mapsto W^+(h)
\tag{8.3}
\]
is strongly continuous and for $n$ large enough, the map
\[
\mathfrak{h}_c(\omega)\ni h\mapsto W^+(h)(H+b)^{-n}
\tag{8.4}
\]
is norm continuous.
(iii) The operators $W^+(h)$ satisfy the Weyl commutation relations:
\[
W^+(h)W^+(g) = e^{-\frac{i}{2}\mathrm{Im}(h|g)}W^+(h+g).
\]
(iv) The Hamiltonian preserves the asymptotic Weyl operators:
\[
e^{itH}W^+(h)e^{-itH} = W^+(h_{-t}).
\tag{8.5}
\]
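Two elementary consequences of the Weyl relations in (iii) are often used implicitly. The short computation below is a sketch that relies only on the relation stated in (iii) and on the antisymmetry $\mathrm{Im}(g|h) = -\mathrm{Im}(h|g)$ of the imaginary part of a scalar product.

```latex
% Exchange relation obtained by applying (iii) in both orders.
\[
W^+(g)W^+(h) = e^{-\frac{i}{2}\mathrm{Im}(g|h)}\,W^+(g+h)
             = e^{+\frac{i}{2}\mathrm{Im}(h|g)}\,W^+(h+g),
\]
\[
W^+(h)W^+(g) = e^{-\frac{i}{2}\mathrm{Im}(h|g)}\,W^+(h+g)
             = e^{-i\,\mathrm{Im}(h|g)}\,W^+(g)W^+(h).
\]
% Compatibility of (iii) with (iv): e^{it\omega} is unitary, hence
\[
\mathrm{Im}(h_{-t}|g_{-t}) = \mathrm{Im}(e^{it\omega}h\,|\,e^{it\omega}g) = \mathrm{Im}(h|g),
\]
% so h -> W^+(h_{-t}) satisfies the same Weyl relations as h -> W^+(h).
```

In particular $W^+(h)$ and $W^+(g)$ commute whenever $\mathrm{Im}(h|g) = 0$.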
Proof. The proof is almost identical to the proof of [4, Theorem 10.1], therefore we will only sketch it. We have:
\[
W(h_t) = e^{-itH_0}W(h)e^{itH_0},
\]
which implies that, as a quadratic form on $D(H_0)$, one has
\[
\partial_tW(h_t) = -[H_0,iW(h_t)].
\tag{8.6}
\]
Using (8.6) and the fact that for $n$ large enough $D(H^n)\subset D(H_0)\cap D(V)$, we have, as quadratic forms on $D(H^n)$:
\[
\partial_t\,e^{itH}W(h_t)e^{-itH} = e^{itH}[V,iW(h_t)]e^{-itH}.
\]
Integrating this relation we have as a quadratic form identity on $D(H^n)$
\[
e^{itH}W(h_t)e^{-itH}-W(h) = \int_0^te^{it'H}[V,iW(h_{t'})]e^{-it'H}\,dt'.
\tag{8.7}
\]
We claim that for $h\in\mathfrak{h}_0$ (see hypothesis (S)), and $p\ge\deg w/2$:
\[
\|[V,W(h_t)](N+1)^{-p}\|\in L^1(dt).
\tag{8.8}
\]
In fact writing $w$ as $\sum_{p+q\le\deg(w)}w_{p,q}$, where $w_{p,q}$ is of order $(p,q)$ and using Proposition 2.6, we obtain that
\[
[\mathrm{Wick}(w_{p,q}),W(h_t)] = W(h_t)\,\mathrm{Wick}(w_{p,q}(t)),
\]
where $w_{p,q}(t)$ is the sum of the symbols in the right-hand side of (2.7) for $(s,r)\neq(p,q)$. Using (Is) and (S) we obtain, writing $1 = 1_{[0,\epsilon]}(\frac{x}{t})+1_{]\epsilon,+\infty[}(\frac{x}{t})$, that $\|w_{p,q}(t)\|_{B^2(\mathfrak{h})}\in L^1(dt)$, which proves (8.8) using Lemma 2.5.
Using then the higher order estimates, we obtain that the identity (8.7) makes sense as an identity between bounded operators from $D(H^n)$ to $\mathcal{H}$ for $n$ large enough. It also proves that the norm limit (8.2) exists for $h\in\mathfrak{h}_0$. The rest of the proof is identical to [4, Theorem 10.1]. It relies on the bound
\[
\begin{aligned}
\big\|\big(e^{itH}W(h_t)e^{-itH}-e^{itH}W(g_t)e^{-itH}\big)(H+b)^{-n}\big\|
&\le\|(W(h)-W(g))(N+1)^{-1}\|\,\|(N+1)(H+b)^{-n}\|\\
&\le C\|h-g\|(\|h\|^2+\|g\|^2+1).
\end{aligned}
\]

Theorem 8.2. (i) For any $h\in\mathfrak{h}_c(\omega)$:
\[
\phi^+(h) := -i\,\frac{d}{ds}W^+(sh)\Big|_{s=0}
\]
defines a selfadjoint operator, called the asymptotic field, such that
\[
W^+(h) = e^{i\phi^+(h)}.
\]
(ii) The operators $\phi^+(h)$ satisfy in the sense of quadratic forms on $D(\phi^+(h_1))\cap D(\phi^+(h_2))$ the canonical commutation relations
\[
[\phi^+(h_2),\phi^+(h_1)] = i\,\mathrm{Im}(h_2|h_1).
\tag{8.9}
\]
(iii) $e^{itH}\phi^+(h)e^{-itH} = \phi^+(h_{-t})$.
(iv) For $p\in\mathbb{N}$, there exists $n\in\mathbb{N}$ such that for $h_i\in\mathfrak{h}_c(\omega)$, $1\le i\le p$, $D(H^n)\subset D(\prod_1^p\phi^+(h_i))$,
\[
\prod_{i=1}^p\phi^+(h_i)(H+i)^{-n} = \text{s-}\lim_{t\to+\infty}e^{itH}\prod_{i=1}^p\phi(h_{i,t})e^{-itH}(H+i)^{-n},
\]
and the map
\[
\mathfrak{h}_c(\omega)^p\ni(h_1,\dots,h_p)\mapsto\prod_{i=1}^p\phi^+(h_i)(H+i)^{-n}\in B(\mathcal{H})
\]
is norm continuous.

Proof. The proof is very similar to [4, Theorem 10.2] so we will only sketch it. Properties (i) and (ii) are standard consequences of the fact that the asymptotic Weyl operators define a regular CCR representation (see e.g. [4, Sec. 2]). Property (iii) follows from Theorem 8.1(iv). It remains to prove (iv). For fixed $p$ we pick $n\in\mathbb{N}$ such that $N^{p/2}(H+b)^{-n}$ is bounded. It follows that
\[
\sup_{t\in\mathbb{R}}\Big\|e^{itH}\prod_1^p\phi(h_{i,t})(H+b)^{-n}e^{-itH}\Big\| < \infty.
\tag{8.10}
\]
Let us first establish the existence of the strong limit
\[
\text{s-}\lim_{t\to+\infty}e^{itH}\prod_1^p\phi(h_{i,t})(H+b)^{-n}e^{-itH} =: R(h_1,\dots,h_p),\quad\text{for } h_i\in\mathfrak{h}.
\tag{8.11}
\]
If $m$ is large enough such that $H = H_0+V$ on $D(H^m)$, then as quadratic forms on $D(H^m)$ we have:
\[
D\prod_1^p\phi(h_{i,t})(H+b)^{-n} = \Big[V,i\prod_1^p\phi(h_{i,t})\Big](H+b)^{-n},
\]
where the Heisenberg derivative $D$ is defined in Sec. 2.5. Next:
\[
\Big[V,i\prod_1^p\phi(h_{i,t})\Big](H+b)^{-n} = \sum_{j=1}^p\prod_1^{j-1}\phi(h_{i,t})\,[V,i\phi(h_{j,t})]\prod_{j+1}^p\phi(h_{i,t})(H+b)^{-n},
\]
as an operator identity on $D(H^m)$. The term $[V,i\phi(h_t)]$ is by Proposition 2.6 a sum of Wick monomials with kernels of the form $w_{p,q}|h_t)$ or $(h_t|w_{p,q}$. Arguing as in the proof of Theorem 8.1 we see from hypotheses (S) and (Is) for $s>1$ that for $h\in\mathfrak{h}_0$
\[
\|[V,i\phi(h_t)](H+b)^{-n}\|\in L^1(dt).
\tag{8.12}
\]
This proves the existence of the limit (8.11) for $u\in D(H^m)$, $h_i\in\mathfrak{h}_0$. The fact that the map
\[
\mathfrak{h}^p\ni(h_1,\dots,h_p)\mapsto\prod_{j=1}^p\phi(h_j)(H+b)^{-n}\in B(\mathcal{H})
\tag{8.13}
\]
is norm continuous implies the existence of the limit for $u\in D(H^m)$ and $h_i\in\mathfrak{h}_c(\omega)$. The estimate (8.10) shows the existence of (8.11) for all $u\in\mathcal{H}$.

We prove now (iv). We recall that
\[
\sup_{|s|\le 1,\ \|h\|\le C}\Big\|\frac{W(sh)-1}{s}(N+1)^{-1}\Big\| < \infty,
\tag{8.14}
\]
and
\[
\lim_{s\to 0}\sup_{\|h\|\le C}\Big\|\Big(\frac{W(sh)-1}{s}-i\phi(h)\Big)(N+1)^{-1}\Big\| = 0.
\tag{8.15}
\]
We fix $P\in\mathbb{N}$ and $M$ large enough so that $N^{P+1}(H+b)^{-M}$ is bounded and prove (iv) by induction on $1\le p\le P$. We have to show that $D(H^M)\subset D(\prod_1^p\phi^+(h_i))$ and that $R(h_1,\dots,h_p) = \prod_1^p\phi^+(h_i)(H+b)^{-M}$. This amounts to showing that
\[
R(h_1,\dots,h_p) = \text{s-}\lim_{s\to 0}(is)^{-1}(W^+(sh_1)-1)\prod_2^p\phi^+(h_i)(H+b)^{-M}.
\]
Note that by the induction assumption $D(H^M)\subset D(\prod_2^p\phi^+(h_i))$ and
\[
\prod_2^p\phi^+(h_i)(H+b)^{-M} = \text{s-}\lim_{t\to+\infty}e^{itH}\prod_2^p\phi(h_{i,t})e^{-itH}(H+b)^{-M}.
\tag{8.16}
\]
Using (8.16) and the fact that $e^{itH}W(h_{1,t})e^{-itH}$ is uniformly bounded in $t$, we have:
\[
(is)^{-1}(W^+(sh_1)-1)\prod_2^p\phi^+(h_i)(H+b)^{-M} = \text{s-}\lim_{t\to+\infty}e^{itH}(is)^{-1}(W(sh_{1,t})-1)\prod_2^p\phi(h_{i,t})e^{-itH}(H+b)^{-M}.
\]
So to prove (iv), it suffices to check that
\[
\text{s-}\lim_{s\to 0}\ \text{s-}\lim_{t\to\infty}e^{itH}R(s,t)e^{-itH} = 0,
\tag{8.17}
\]
for
\[
R(s,t) = \Big(\frac{W(sh_{1,t})-1}{s}-i\phi(h_{1,t})\Big)\prod_2^p\phi(h_{i,t})(H+b)^{-M}.
\]
Using (8.14) and the higher order estimates, we see that $R(s,t)$ is uniformly bounded for $|s|\le 1$, $t\in\mathbb{R}$, and using then (8.15) we see that $\lim_{s\to 0}\sup_{t\in\mathbb{R}}\|R(s,t)u\| = 0$, for $u\in D(H^M)$. This shows (8.17). The norm continuity result in (iv) follows from the norm continuity of the map (8.13).

Finally the following theorem follows from Theorem 8.2 as in [4, Sec. 10.1].

Theorem 8.3. (i) For any $h\in\mathfrak{h}_c(\omega)$, the asymptotic creation and annihilation operators defined on $D(a^{+}(h)) := D(\phi^+(h))\cap D(\phi^+(ih))$ by
\[
a^{+*}(h) := \frac{1}{\sqrt 2}(\phi^+(h)-i\phi^+(ih)),\qquad a^{+}(h) := \frac{1}{\sqrt 2}(\phi^+(h)+i\phi^+(ih)),
\]
are closed.
(ii) The operators $a^{+}$ satisfy in the sense of quadratic forms on $D(a^{+\sharp}(h_1))\cap D(a^{+\sharp}(h_2))$ the canonical commutation relations
\[
[a^+(h_1),a^{+*}(h_2)] = (h_1|h_2)1,\qquad[a^+(h_2),a^+(h_1)] = [a^{+*}(h_2),a^{+*}(h_1)] = 0.
\]
(iii)
\[
e^{itH}a^{+}(h)e^{-itH} = a^{+}(h_{-t}).
\tag{8.18}
\]
(iv) For $p\in\mathbb{N}$, there exists $n\in\mathbb{N}$ such that for $h_i\in\mathfrak{h}_c(\omega)$, $1\le i\le p$, $D((H+i)^n)\subset D(\prod_1^pa^{+\sharp}(h_i))$ and
\[
\prod_1^pa^{+\sharp}(h_i)(H+b)^{-n} = \text{s-}\lim_{t\to\infty}e^{itH}\prod_1^pa^{\sharp}(h_{i,t})(H+b)^{-n}e^{-itH}.
\]
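The commutation relation for $a^+$ and $a^{+*}$ in (ii) can be recovered formally from the relation (8.9) for the asymptotic fields. The computation below is a sketch under two assumptions that are not spelled out in this section: the scalar product $(\cdot|\cdot)$ is taken antilinear in its first argument, and commutators are manipulated algebraically rather than as quadratic forms.

```latex
% Formal derivation of [a^+(h_1), a^{+*}(h_2)] from (8.9).
% Assumption: (.|.) is antilinear in the first argument.
% Write z = (h_1|h_2) and use [\phi^+(f),\phi^+(g)] = i\,\mathrm{Im}(f|g).
\[
\begin{aligned}
[a^+(h_1),a^{+*}(h_2)]
&= \tfrac12\Big([\phi^+(h_1),\phi^+(h_2)] - i[\phi^+(h_1),\phi^+(ih_2)]
   + i[\phi^+(ih_1),\phi^+(h_2)] + [\phi^+(ih_1),\phi^+(ih_2)]\Big)\\
&= \tfrac12\Big(i\,\mathrm{Im}\,z + \mathrm{Im}(iz) - \mathrm{Im}(-iz) + i\,\mathrm{Im}\,z\Big)\\
&= \tfrac12\Big(2i\,\mathrm{Im}\,z + 2\,\mathrm{Re}\,z\Big) = (h_1|h_2),
\end{aligned}
\]
% using (h_1|ih_2) = iz, (ih_1|h_2) = -iz, (ih_1|ih_2) = z,
% together with \mathrm{Im}(iz) = \mathrm{Re}\,z and \mathrm{Im}(-iz) = -\mathrm{Re}\,z.
```

The same bookkeeping with both operators of annihilation type gives $\tfrac12\big(i\,\mathrm{Im}\,z+\mathrm{Im}(iz)+\mathrm{Im}(-iz)-i\,\mathrm{Im}\,z\big) = 0$, in agreement with the second relation in (ii).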
8.2. Asymptotic spaces and wave operators

In this subsection we recall the construction of asymptotic vacuum spaces and wave operators taken from [4, Sec. 10.2] and adapted to our setup. We define the asymptotic vacuum space:
\[
\mathcal{K}^+ := \{u\in\mathcal{H}\mid a^+(h)u = 0,\ h\in\mathfrak{h}_c(\omega)\}.
\]
The asymptotic space is defined as
\[
\mathcal{H}^+ := \mathcal{K}^+\otimes\Gamma(\mathfrak{h}_c(\omega)).
\]
The proof of the following proposition is completely analogous to [4, Proposition 10.4].

Proposition 8.4. (i) $\mathcal{K}^+$ is a closed $H$-invariant space.
(ii) $\mathcal{K}^+$ is included in the domain of $\prod_1^pa^{+\sharp}(h_i)$ for $h_i\in\mathfrak{h}_c(\omega)$.
(iii) $\mathcal{H}_{\mathrm{pp}}(H)\subset\mathcal{K}^+$.

The asymptotic Hamiltonian is defined by
\[
H^+ := K^+\otimes 1+1\otimes d\Gamma(\omega),\quad\text{for } K^+ := H|_{\mathcal{K}^+}.
\]
We also define
\[
\Omega^+:\mathcal{H}^+\to\mathcal{H},\qquad\Omega^+\,\psi\otimes a^*(h_1)\cdots a^*(h_p)\Omega := a^{+*}(h_1)\cdots a^{+*}(h_p)\psi,\quad h_1,\dots,h_p\in\mathfrak{h}_c(\omega),\ \psi\in\mathcal{K}^+.
\tag{8.19}
\]
The map $\Omega^+$ is called the wave operator. The following theorem is analogous to [4, Theorem 10.5].

Theorem 8.5. $\Omega^+$ is a unitary map from $\mathcal{H}^+$ to $\mathcal{H}$ such that:
\[
a^{+\sharp}(h)\Omega^+ = \Omega^+\,1\otimes a^{\sharp}(h),\quad h\in\mathfrak{h}_c(\omega),\qquad H\Omega^+ = \Omega^+H^+.
\]

Proof. By general properties of regular CCR representations (see [4, Proposition 4.2]), the operator $\Omega^+$ is well-defined and isometric. To prove that it is unitary, it suffices to show that the CCR representation $\mathfrak{h}_c(\omega)\ni h\mapsto W^+(h)$ admits a densely defined number operator (see, e.g., [4, Sec. 4.2]). Let $n^+$ be the quadratic form associated to the CCR representation $W^+$. Let us show that $D(n^+)$ is dense in $\mathcal{H}$. We fix $n\in\mathbb{N}$ such that
\[
a^+(h)(H+b)^{-n} = \text{s-}\lim_{t\to+\infty}e^{itH}a(h_t)e^{-itH}(H+b)^{-n},\qquad h\in\mathfrak{h}_c(\omega).
\]
For each finite-dimensional space $\mathfrak{f}\subset\mathfrak{h}_c(\omega)$ set:
\[
n^+_{\mathfrak{f}}(u) = \sum_{i=1}^{\dim\mathfrak{f}}\|a^+(h_i)u\|^2,
\]
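for $\{h_i\}$ an orthonormal basis of $\mathfrak{f}$. We have for $u\in D(H^n)$:
\[
n^+_{\mathfrak{f}}(u) = \lim_{t\to+\infty}\sum_{i=1}^{\dim\mathfrak{f}}\|a(h_{i,t})e^{-itH}u\|^2 = \lim_{t\to+\infty}\big(e^{-itH}u\,\big|\,d\Gamma(P_{\mathfrak{f},t})e^{-itH}u\big),
\]
if $P_{\mathfrak{f},t}$ is the orthogonal projection on $e^{-it\omega}\mathfrak{f}$. But $d\Gamma(P_{\mathfrak{f},t})\le N$, so
\[
n^+_{\mathfrak{f}}(u)\le\sup_t\|N^{1/2}e^{-itH}u\|^2\le C\|(H+b)^pu\|^2,
\]
for some $p$, by the higher order estimates. Therefore $D(H^p)\subset D(n^+)$ for $p$ large enough, which implies that $D(n^+)$ is dense in $\mathcal{H}$.

8.3. Extended wave operator

In Sec. 2.4 we introduced the scattering Hilbert space $\mathcal{H}^{\mathrm{scatt}}\subset\mathcal{H}^{\mathrm{ext}}$. Clearly $\mathcal{H}^{\mathrm{scatt}}$ is preserved by $H^{\mathrm{ext}}$. We see that $\mathcal{H}^+$ is a subspace of $\mathcal{H}^{\mathrm{scatt}}$ and
\[
H^+ = H^{\mathrm{ext}}|_{\mathcal{H}^+}.
\]
We define the extended wave operator $\Omega^{\mathrm{ext},+}: D(\Omega^{\mathrm{ext},+})\to\mathcal{H}$ by:
\[
D(\Omega^{\mathrm{ext},+}) = D(H^\infty)\otimes\Gamma_{\mathrm{fin}}(\mathfrak{h}_c(\omega)),
\]
and
\[
\Omega^{\mathrm{ext},+}\,\psi\otimes a^*(h_1)\cdots a^*(h_p)\Omega := a^{+*}(h_1)\cdots a^{+*}(h_p)\psi,\qquad\psi\in D(H^\infty),\ h_i\in\mathfrak{h}_c(\omega).
\]
Note that $\Omega^{\mathrm{ext},+}:\mathcal{H}^{\mathrm{scatt}}\to\mathcal{H}$ is unbounded and:
\[
\Omega^+ = \Omega^{\mathrm{ext},+}|_{\mathcal{H}^+}.
\]
Considering $\Omega^+$ as a partial isometry equal to $0$ on $\mathcal{H}^{\mathrm{scatt}}\ominus\mathcal{H}^+$, we can rewrite this identity as:
\[
\Omega^+ = \Omega^{\mathrm{ext},+}1_{\mathcal{H}^+},
\tag{8.20}
\]
where $1_{\mathcal{H}^+}$ denotes the projection onto $\mathcal{H}^+$ inside the space $\mathcal{H}^{\mathrm{scatt}}$. Moreover using Theorem 8.3(iv), we obtain as in [4, Theorem 10.7] the following alternative expression for $\Omega^{\mathrm{ext},+}$.

Theorem 8.6. (i) Let $u\in D(\Omega^{\mathrm{ext},+})$. Then the limit
\[
\lim_{t\to+\infty}e^{itH}Ie^{-itH^{\mathrm{ext}}}u
\]
exists and equals $\Omega^{\mathrm{ext},+}u$.
(ii) Let $\chi\in C_0^\infty(\mathbb{R})$. Then $\mathrm{Ran}\,\chi(H^{\mathrm{ext}})\subset D(\Omega^{\mathrm{ext},+})$, $I\chi(H^{\mathrm{ext}})$ and $\Omega^{\mathrm{ext},+}\chi(H^{\mathrm{ext}})$ are bounded operators and
\[
\text{s-}\lim_{t\to+\infty}e^{itH}Ie^{-itH^{\mathrm{ext}}}\chi(H^{\mathrm{ext}}) = \Omega^{\mathrm{ext},+}\chi(H^{\mathrm{ext}}).
\tag{8.21}
\]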
9. Propagation Estimates

In this section, we consider an abstract QFT Hamiltonian $H$ and fix a weight operator $x$. We will prove various propagation estimates for $H$. The proof of the phase-space estimates will be more involved than in [4, 5]. In fact the operator playing the role of the acceleration $[\omega,i[\omega,ix]]$ vanishes in the situation considered in these papers.

9.1. Maximal velocity estimates

The following proposition shows that bosons cannot propagate in the region $x>v_{\max}t$ where $v_{\max} := \|[\omega,ix]\|$.

Proposition 9.1. Assume hypotheses (G1), (Is) for $s>1$. Let $\chi\in C_0^\infty(\mathbb{R})$. Then for $R'>R>v_{\max}$, one has:
\[
\int_1^\infty\Big\|d\Gamma\Big(1_{[R,R']}\Big(\frac{|x|}{t}\Big)\Big)^{1/2}\chi(H)e^{-itH}u\Big\|^2\frac{dt}{t}\le C\|u\|^2.
\]

Proof. The proof is almost identical to [4, Proposition 11.2] so we will only sketch it. We fix $G\in C_0^\infty(]v_{\max},+\infty[)$ with $G\ge 1_{[R,R']}$ and set $F(s) = \int_s^{+\infty}G^2(t)\,dt$. We use the propagation observable $\Phi(t) = \chi(H)d\Gamma(F(\frac{x}{t}))\chi(H)$. We use that
\[
\begin{aligned}
d_0F\Big(\frac{x}{t}\Big) &= t^{-1}G\Big(\frac{x}{t}\Big)\Big([\omega,ix]-\frac{x}{t}\Big)G\Big(\frac{x}{t}\Big)+O(t^{-2})\\
&\le-\frac{C_0}{t}G^2\Big(\frac{x}{t}\Big)+O(t^{-2})
\end{aligned}
\]
by Lemma 3.3. The term $\chi(H)[V,id\Gamma(F(\frac{x}{t}))]\chi(H)$ is $O(t^{-s})$ in norm by hypothesis (Is), Lemma 2.5 and the higher order estimates.

9.2. Phase space propagation estimates

Set $v := [\omega,ix]$, and recall from hypothesis (G2) that
\[
[\omega,iv] = \gamma^2+r_{-1-\epsilon},
\]
where $\gamma\in S^{-\frac12}_{\epsilon,(1)}$, $r_{-1-\epsilon}\in S^{-1-\epsilon}_{(0)}$ for some $\epsilon>0$. We will show that for free bosons the instantaneous velocity $v$ and the average velocity $\frac{x}{t}$ converge to each other when $t\to\pm\infty$.
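Proposition 9.2. Assume (G1), (G2) and (Is) for $s > 1$ and let $\chi \in C_0^\infty(\mathbb{R})$ and $0 < c_0 < c_1$. Then

(i) $\displaystyle\int_1^{+\infty} \Big\| d\Gamma\Big(\Big(\frac{\langle x\rangle}{t} - v\Big) 1_{[c_0,c_1]}\Big(\frac{\langle x\rangle}{t}\Big)\Big(\frac{\langle x\rangle}{t} - v\Big)\Big)^{1/2} \chi(H) e^{-itH} u \Big\|^2 \frac{dt}{t} \le C\|u\|^2$,

(ii) $\displaystyle\int_1^{+\infty} \Big\| d\Gamma\Big(\gamma\, 1_{[c_0,c_1]}\Big(\frac{\langle x\rangle}{t}\Big)\gamma\Big)^{1/2} \chi(H) e^{-itH} u \Big\|^2 dt \le C\|u\|^2$.

Proof. We follow the proof of [4, Proposition 11.3], [5, Proposition 6.2] with some modifications due to our abstract setting. It clearly suffices to prove Proposition 9.2 for $c_1 > v_{\max} + 1$, which we will assume in what follows. We fix a function $F \in C^\infty(\mathbb{R})$, with $F, F' \ge 0$, $F(s) = 0$ for $s \le c_0/2$, $F(s), F'(s) \ge d_1 > 0$ for $s \in [c_0, c_1]$. We set
\[
R_0(s) = \int_0^s F^2(t)\,dt,
\]
so that $R_0(s) = 0$ for $s \le c_0/2$, $R_0'(s), R_0''(s) \ge d_2 > 0$ for $s \in [c_0, c_1]$. Finally we fix another function $G \in C^\infty(\mathbb{R})$ with $G(s) = 1$ for $s \le c_1 + 1$, $G(s) = 0$ for $s \ge c_1 + 2$, and set:
\[
R(s) := G(s) R_0(s).
\]
The function $R$ belongs to $C_0^\infty(\mathbb{R})$ and satisfies:
\[
R(s) = 0 \ \text{in } [0, c_0/2], \qquad R'(s) \ge d_3 1_{[c_0,c_1]}(s) + \chi_1(s), \qquad R''(s) \ge d_3 1_{[c_0,c_1]}(s) + \chi_2(s), \tag{9.1}
\]
for $\chi_1, \chi_2 \in C_0^\infty(]v_{\max}, +\infty[)$ and $d_3 > 0$. We set
\[
b(t) := R\Big(\frac{\langle x\rangle}{t}\Big) - \frac12\Big( R'\Big(\frac{\langle x\rangle}{t}\Big)\Big(\frac{\langle x\rangle}{t} - v\Big) + \mathrm{h.c.}\Big),
\]
which satisfies $b(t) \in O(1)$ and use the propagation observable
\[
\Phi(t) = \chi(H)\, d\Gamma(b(t))\, \chi(H).
\]
Using Lemma 3.3 we obtain that:
\[
\partial_t b(t) = \frac1t R''\Big(\frac{\langle x\rangle}{t}\Big)\frac{\langle x\rangle^2}{t^2} - \frac12 \frac1t R''\Big(\frac{\langle x\rangle}{t}\Big)\frac{\langle x\rangle}{t}\, v - \frac12 \frac1t\, v\, R''\Big(\frac{\langle x\rangle}{t}\Big)\frac{\langle x\rangle}{t} + O(t^{-2}), \tag{9.2}
\]
and
\[
[\omega, i b(t)] = \frac1t\Big( v R''\Big(\frac{\langle x\rangle}{t}\Big) v - \frac12 \frac{\langle x\rangle}{t} R''\Big(\frac{\langle x\rangle}{t}\Big) v - \frac12 v R''\Big(\frac{\langle x\rangle}{t}\Big)\frac{\langle x\rangle}{t}\Big) + \frac12\Big( R'\Big(\frac{\langle x\rangle}{t}\Big)[\omega, iv] + \mathrm{h.c.}\Big) + O(t^{-2}). \tag{9.3}
\]
Adding (9.2) and (9.3) we obtain:
\[
d_0 b(t) = \frac1t\Big(\frac{\langle x\rangle}{t} - v\Big) R''\Big(\frac{\langle x\rangle}{t}\Big)\Big(\frac{\langle x\rangle}{t} - v\Big) + \frac12\Big( R'\Big(\frac{\langle x\rangle}{t}\Big)[\omega, iv] + \mathrm{h.c.}\Big) + O(t^{-2}).
\]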
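The cancellation behind this sum can be written out explicitly. The following is only a sketch of the bookkeeping: the shorthands $y := \langle x\rangle/t$ and $R'' := R''(y)$ are introduced here for readability (they are not notation from the paper), and the only fact used is that $y$ commutes with $R''(y)$.

```latex
\begin{align*}
(y - v)\, R''\, (y - v)
  &= y R'' y - y R'' v - v R'' y + v R'' v \\
  &= \underbrace{R'' y^2 - \tfrac12 R'' y\, v - \tfrac12 v\, y R''}_{t \,\times\, \text{leading part of (9.2)}}
   \;+\; \underbrace{v R'' v - \tfrac12 y R'' v - \tfrac12 v R'' y}_{t \,\times\, R''\text{-part of (9.3)}} .
\end{align*}
```

Dividing by $t$ and adding the remaining term $\frac12(R'(y)[\omega, iv] + \mathrm{h.c.})$ of (9.3) gives the expression for $d_0 b(t)$.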
Spectral and Scattering Theory for Abstract QFT Hamiltonians
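By hypothesis (G2), we have:
\[
[\omega, iv] = \gamma^2 + r_{-1-\epsilon},
\]
for $\gamma \in S^{-1/2}_{\epsilon,(1)}$, $r_{-1-\epsilon} \in S^{-1-\epsilon}_{(0)}$. Since $0 \notin \mathrm{supp}\, R'$, we know by Lemma 2.3 that
\[
R'\Big(\frac{\langle x\rangle}{t}\Big) r_{-1-\epsilon} \in O(t^{-1-\epsilon}).
\]
Using that $\gamma \in S^{-1/2}_{\epsilon,(1)}$, we get by Lemma 3.3(vii) that:
\[
\frac12\Big( R'\Big(\frac{\langle x\rangle}{t}\Big)\gamma^2 + \mathrm{h.c.}\Big) = \gamma R'\Big(\frac{\langle x\rangle}{t}\Big)\gamma + O(t^{-3/2+\epsilon}).
\]
Finally this gives:
\[
d_0 b(t) = \frac1t\Big(\frac{\langle x\rangle}{t} - v\Big) R''\Big(\frac{\langle x\rangle}{t}\Big)\Big(\frac{\langle x\rangle}{t} - v\Big) + \gamma R'\Big(\frac{\langle x\rangle}{t}\Big)\gamma + O(t^{-1-\epsilon_1}),
\]
for some $\epsilon_1 > 0$. We note that $R'$ and $R''$ are positive, except for the error terms due to $\chi_1, \chi_2$ in (9.1). To handle these terms we pick $\chi_3 \in C_0^\infty(]v_{\max}, +\infty[)$ such that $\chi_3\chi_i = \chi_i$, $i = 1, 2$. Then $[\frac{\langle x\rangle}{t} - v, \chi_3(\frac{\langle x\rangle}{t})] \in O(t^{-1})$ and $[\gamma, \chi_3(\frac{\langle x\rangle}{t})] \in O(t^{-3/2+\epsilon})$ by Lemma 3.3(i) and 3.3(vii). This yields:
\begin{align*}
\pm\frac1t\Big(\frac{\langle x\rangle}{t} - v\Big)\chi_2\Big(\frac{\langle x\rangle}{t}\Big)\Big(\frac{\langle x\rangle}{t} - v\Big)
&= \pm\frac1t \chi_3\Big(\frac{\langle x\rangle}{t}\Big)\Big(\frac{\langle x\rangle}{t} - v\Big)\chi_2\Big(\frac{\langle x\rangle}{t}\Big)\Big(\frac{\langle x\rangle}{t} - v\Big)\chi_3\Big(\frac{\langle x\rangle}{t}\Big) + O(t^{-2}) \\
&\le \frac{C}{t}\chi_3^2\Big(\frac{\langle x\rangle}{t}\Big) + O(t^{-2}), \\
\pm\gamma\chi_1\Big(\frac{\langle x\rangle}{t}\Big)\gamma
&= \pm\chi_3\Big(\frac{\langle x\rangle}{t}\Big)\gamma\chi_1\Big(\frac{\langle x\rangle}{t}\Big)\gamma\chi_3\Big(\frac{\langle x\rangle}{t}\Big) + O(t^{-3/2+\epsilon}) \\
&\le \frac{C}{t}\chi_3^2\Big(\frac{\langle x\rangle}{t}\Big) + O(t^{-3/2+\epsilon}),
\end{align*}
using that $\gamma \in S^{-1/2}_{(0)}$ and Lemma 2.3. Using again (9.1), we finally get:
\[
d_0 b(t) \ge \frac{C_1}{t}\Big(\frac{\langle x\rangle}{t} - v\Big) 1_{[c_0,c_1]}\Big(\frac{\langle x\rangle}{t}\Big)\Big(\frac{\langle x\rangle}{t} - v\Big) + C_1 \gamma 1_{[c_0,c_1]}\Big(\frac{\langle x\rangle}{t}\Big)\gamma - \frac{C_2}{t}\chi_3^2\Big(\frac{\langle x\rangle}{t}\Big) + O(t^{-1-\epsilon_1}), \tag{9.4}
\]
for some $C_1, \epsilon_1 > 0$.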
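To handle the commutator $[V, i\,d\Gamma(b(t))]$ we note that using Lemma 3.3(iv) and the fact that $0 \notin \mathrm{supp}\, R$, we have
\[
b(t) = 1_{[\epsilon,+\infty[}\Big(\frac{\langle x\rangle}{t}\Big) b(t)\, 1_{[\epsilon,+\infty[}\Big(\frac{\langle x\rangle}{t}\Big) + O(t^{-2})
\]
for some $\epsilon > 0$. Using also hypothesis (Is) for $s > 1$, this implies that if $V = \mathrm{Wick}(w)$ then $\|d\Gamma(b(t))w\| \in L^1(dt)$. Using the higher order estimates this implies that $\|\chi(H)[V, i\,d\Gamma(b(t))]\chi(H)\| \in L^1(dt)$. The rest of the proof is as in [4, Proposition 11.3].

9.3. Improved phase space propagation estimates

In this subsection we will prove improved propagation estimates. We will use the following lemma which is an analog of [5, Lemma 6.4] in our abstract setting. Its proof will be given in the Appendix.

Lemma 9.3. Assume (H1), (G1), (G2) and set $v = [\omega, i\langle x\rangle]$ which is a bounded operator on $\mathfrak{h}$. Let $c = (\frac{\langle x\rangle}{t} - v)^2 + t^{-\delta}$, $\delta > 0$ and set $\epsilon_0 = \inf(\delta, 1 - \delta/2)$. If $J \in C_0^\infty(\mathbb{R})$ then:

(i) $J(\frac{\langle x\rangle}{t})\, c^{1/2} \in O(1)$,

(ii) $[c^{1/2}, J(\frac{\langle x\rangle}{t})] \in O(t^{-1+\delta/2})$.

If $J \in C_0^\infty(\mathbb{R}\setminus\{0\})$ then for $\delta$ small enough:

(iii) $J(\frac{\langle x\rangle}{t})\, d_0 c^{1/2}\, J(\frac{\langle x\rangle}{t}) = -\frac1t J(\frac{\langle x\rangle}{t})\, c^{1/2} J(\frac{\langle x\rangle}{t}) + \gamma J(\frac{\langle x\rangle}{t}) M(t) J(\frac{\langle x\rangle}{t})\gamma + O(t^{-1-\epsilon_1})$, where $\epsilon_1 > 0$ and $M(t) \in O(1)$.

If $J, J_1 \in C_0^\infty(\mathbb{R})$ and $J_1 \equiv 1$ on $\mathrm{supp}\, J$, then:

(iv) $|J(\frac{\langle x\rangle}{t})(\frac{\langle x\rangle}{t} - v) + \mathrm{h.c.}| \le C J_1(\frac{\langle x\rangle}{t})\, c^{1/2} J_1(\frac{\langle x\rangle}{t}) + O(t^{-\epsilon_0/2})$.

If $J, J_1, J_2 \in C_0^\infty(\mathbb{R})$ with $J_2 \equiv 1$ on $\mathrm{supp}\, J$ and $\mathrm{supp}\, J_1$, then:

(v) $\pm\big(J(\frac{\langle x\rangle}{t})(\frac{\langle x\rangle}{t} - v)\, c^{1/2} J_1(\frac{\langle x\rangle}{t}) + \mathrm{h.c.}\big) \le C(\frac{\langle x\rangle}{t} - v) J_2^2(\frac{\langle x\rangle}{t})(\frac{\langle x\rangle}{t} - v) + O(t^{-\epsilon_0})$.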
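Proposition 9.4. Assume (G1), (G2), (Is) for $s > 1$. Let $J \in C_0^\infty(]c_0, c_1[)$ for $0 < c_0 < c_1$ and $\chi \in C_0^\infty(\mathbb{R})$. Then:
\[
\int_1^{+\infty} \Big\| d\Gamma\Big(\Big| J\Big(\frac{\langle x\rangle}{t}\Big)\Big(\frac{\langle x\rangle}{t} - v\Big) + \mathrm{h.c.}\Big|\Big)^{1/2} \chi(H) e^{-itH} u \Big\|^2 \frac{dt}{t} \le C\|u\|^2 .
\]

Proof. We fix $J_1 \in C_0^\infty(]c_0, c_1[)$ with $J_1 \equiv 1$ on $\mathrm{supp}\, J$ and set
\[
b(t) = J_1\Big(\frac{\langle x\rangle}{t}\Big)\, c^{1/2} J_1\Big(\frac{\langle x\rangle}{t}\Big), \qquad \text{for } c = \Big(\frac{\langle x\rangle}{t} - v\Big)^2 + t^{-\delta},
\]
and $\delta > 0$ will be chosen small enough later. We will use the propagation observable
\[
\Phi(t) = \chi(H)\, d\Gamma(b(t))\, \chi(H).
\]
Note that by Lemma 9.3(i) and the higher order estimates $b(t), \Phi(t) \in O(1)$. We first note that
\[
\chi(H)[V, i\,d\Gamma(b(t))]\chi(H) \in O(t^{-s}),
\]
using hypothesis (Is) and Lemma 9.3(i). Next
\[
D_0\, d\Gamma(b(t)) = d\Gamma(d_0 b(t)), \qquad
d_0 b(t) = \Big(d_0 J_1\Big(\frac{\langle x\rangle}{t}\Big)\Big) c^{1/2} J_1\Big(\frac{\langle x\rangle}{t}\Big) + \mathrm{h.c.} + J_1\Big(\frac{\langle x\rangle}{t}\Big)(d_0 c^{1/2}) J_1\Big(\frac{\langle x\rangle}{t}\Big).
\]
By Lemma 9.3(iii) we know that choosing $\delta$ small enough:
\[
J_1\Big(\frac{\langle x\rangle}{t}\Big)(d_0 c^{1/2}) J_1\Big(\frac{\langle x\rangle}{t}\Big) = -J_1\Big(\frac{\langle x\rangle}{t}\Big)\frac{c^{1/2}}{t} J_1\Big(\frac{\langle x\rangle}{t}\Big) + \gamma J_1\Big(\frac{\langle x\rangle}{t}\Big) M(t) J_1\Big(\frac{\langle x\rangle}{t}\Big)\gamma + O(t^{-1-\epsilon_1}),
\]
for some $\epsilon_1 > 0$ and $M(t) \in O(1)$. By Lemma 9.3(iv) we get then that
\[
-J_1\Big(\frac{\langle x\rangle}{t}\Big)(d_0 c^{1/2}) J_1\Big(\frac{\langle x\rangle}{t}\Big) \ge \frac{C}{t}\Big| J\Big(\frac{\langle x\rangle}{t}\Big)\Big(\frac{\langle x\rangle}{t} - v\Big) + \mathrm{h.c.}\Big| - C\gamma J_1^2\Big(\frac{\langle x\rangle}{t}\Big)\gamma - C t^{-1-\epsilon_1}
\]
for some $\epsilon_1 > 0$. Next by Lemma 3.3:
\[
d_0 J_1\Big(\frac{\langle x\rangle}{t}\Big) = -\frac{1}{2t} J_1'\Big(\frac{\langle x\rangle}{t}\Big)\Big(\frac{\langle x\rangle}{t} - v\Big) + O(t^{-2}),
\]
which by Lemma 9.3(v) gives for $J_2 \in C_0^\infty(]c_0, c_1[)$ and $J_2 \equiv 1$ on $\mathrm{supp}\, J_1$:
\[
\Big(d_0 J_1\Big(\frac{\langle x\rangle}{t}\Big)\Big) c^{1/2} J_1\Big(\frac{\langle x\rangle}{t}\Big) + \mathrm{h.c.} \ge -\frac{C}{t}\Big(\frac{\langle x\rangle}{t} - v\Big) J_2^2\Big(\frac{\langle x\rangle}{t}\Big)\Big(\frac{\langle x\rangle}{t} - v\Big) + O(t^{-1-\epsilon_1})
\]
for some $\epsilon_1 > 0$. Collecting the various estimates, we obtain finally
\[
-D\Phi(t) \ge \frac{C}{t}\chi(H)\, d\Gamma\Big(\Big| J\Big(\frac{\langle x\rangle}{t}\Big)\Big(\frac{\langle x\rangle}{t} - v\Big) + \mathrm{h.c.}\Big|\Big)\chi(H) - C R_1(t) - C R_2(t) + O(t^{-1-\epsilon_1}),
\]
where
\[
R_1(t) = \chi(H)\, d\Gamma\Big(\gamma J_1^2\Big(\frac{\langle x\rangle}{t}\Big)\gamma\Big)\chi(H), \qquad
R_2(t) = \frac1t\chi(H)\, d\Gamma\Big(\Big(\frac{\langle x\rangle}{t} - v\Big) J_2^2\Big(\frac{\langle x\rangle}{t}\Big)\Big(\frac{\langle x\rangle}{t} - v\Big)\Big)\chi(H)
\]
are integrable along the evolution by Proposition 9.2. We can then complete the proof as in [5, Proposition 6.3].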
9.4. Minimal velocity estimate

In this subsection we prove the minimal velocity estimate. It says that for states with energy away from thresholds and eigenvalues of $H$, at least one boson should escape to infinity. We recall that as in Sec. 7.4, $A = d\Gamma(a)$.

Lemma 9.5. Let $H$ be an abstract QFT Hamiltonian. Assume (G4). Let $k \in \mathbb{N}$, $m = 1, 2$ and $\chi \in C_0^\infty(\mathbb{R})$. Then there exists $C$ such that for any $\epsilon > 0$ and $q \in C_0^\infty([-2\epsilon, 2\epsilon])$ with $0 \le q \le 1$ one has:
\[
\Big\| N^k \frac{A^m}{t^m}\Gamma(q^t)\chi(H)\Big\| \le C\epsilon^m,
\]
where $q^t = q(\frac{\langle x\rangle}{t})$.

Proof. Applying Proposition 2.4(ii) we get
\[
(d\Gamma(a))^{2m} \le N^{2m-1} d\Gamma(a^{2m}). \tag{9.5}
\]
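For intuition, (9.5) can be tested on a toy model. The sketch below assumes that $a$ has been diagonalised, so that on an $n$-particle product of eigenvectors $d\Gamma(a)$ acts as the sum $x_1 + \dots + x_n$ of the corresponding eigenvalues, $d\Gamma(a^{2m})$ as $\sum_j x_j^{2m}$, and $N$ as $n$; the inequality then becomes the scalar convexity bound $(\sum_j x_j)^{2m} \le n^{2m-1}\sum_j x_j^{2m}$. This is a numerical sanity check under that assumption, not the proof of Proposition 2.4(ii); the function name is hypothetical.

```python
import random


def check_power_sum_bound(n, m, trials=1000, seed=0):
    """Check (x_1+...+x_n)^(2m) <= n^(2m-1) * (x_1^(2m)+...+x_n^(2m)).

    The x_j stand for eigenvalues of a (an assumption of this toy model),
    sampled uniformly from [-5, 5].
    """
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.uniform(-5.0, 5.0) for _ in range(n)]
        lhs = sum(xs) ** (2 * m)
        rhs = n ** (2 * m - 1) * sum(x ** (2 * m) for x in xs)
        # small slack so that floating-point rounding cannot cause a false alarm
        if lhs > rhs * (1.0 + 1e-12) + 1e-12:
            return False
    return True


if __name__ == "__main__":
    for n, m in [(1, 2), (3, 1), (5, 2)]:
        print(n, m, check_power_sum_bound(n, m))
```

Equality holds when all $x_j$ coincide, which is why the factor $N^{2m-1}$ cannot be improved in this scalar picture.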
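Next
\[
\Gamma(q^t)\, d\Gamma(a^{2m})\, \Gamma(q^t) = d\Gamma((q^t)^2, q^t a^{2m} q^t) \le d\Gamma(q^t a^{2m} q^t), \tag{9.6}
\]
by Proposition 2.4(iv). We write using (G4):
\[
q^t a^{2m} q^t = t^{2m} G^t \langle x\rangle^{-m} a^{2m} \langle x\rangle^{-m} G^t \le C t^{2m} (G^t)^2, \qquad m = 1, 2,
\]
for $G^t = G(\frac{\langle x\rangle}{t})$ and $G(s) = s^m q(s)$. Using that $|G(s)| \le C\epsilon^m$ we obtain that
\[
q^t a^{2m} q^t \le C\epsilon^{2m} t^{2m}, \qquad m = 1, 2. \tag{9.7}
\]
From (9.7) and (9.5), (9.6) we obtain
\[
\Gamma(q^t) N^{2k} d\Gamma(a)^{2m} \Gamma(q^t) \le C\epsilon^{2m} t^{2m} N^{2k+2m}. \tag{9.8}
\]
This implies the lemma using the higher order estimates.

Proposition 9.6. Let $H$ be an abstract QFT Hamiltonian. Assume hypotheses (Gi), for $1 \le i \le 5$, (M1), (M2), (Is) for $s > 1$. Let $\chi \in C_0^\infty(\mathbb{R})$ be supported in $\mathbb{R}\setminus(\tau \cup \sigma_{pp}(H))$. Then there exists $\epsilon > 0$ such that:
\[
\int_1^{+\infty} \Big\| \Gamma\Big(1_{[0,\epsilon]}\Big(\frac{|x|}{t}\Big)\Big)\chi(H) e^{-itH} u \Big\|^2 \frac{dt}{t} \le C\|u\|^2 .
\]

Proof. Let us first prove the proposition for $\chi$ supported near an energy level $\lambda \in \mathbb{R}\setminus(\tau \cup \sigma_{pp}(H))$. By Theorem 7.8, we can find $\chi \in C_0^\infty(\mathbb{R})$ equal to 1 near $\lambda$ such that for some $c_0 > 0$:
\[
\chi(H)[H, iA]_0 \chi(H) \ge c_0 \chi^2(H). \tag{9.9}
\]
Let $\epsilon > 0$ be a parameter which will be fixed later. Let $q \in C_0^\infty(|s| \le 2\epsilon)$, $0 \le q \le 1$, $q = 1$ near $\{|s| \le \epsilon\}$ and let $q^t = q(\frac{\langle x\rangle}{t})$.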
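We use the propagation observable
\[
\Phi(t) := \chi(H)\Gamma(q^t)\frac{A}{t}\Gamma(q^t)\chi(H).
\]
We fix cutoff functions $\tilde q \in C_0^\infty(\mathbb{R})$, $\tilde\chi \in C_0^\infty(\mathbb{R})$ such that
\[
\mathrm{supp}\, \tilde q \subset [-4\epsilon, 4\epsilon], \qquad 0 \le \tilde q \le 1, \qquad \tilde q q = q, \qquad \tilde\chi\chi = \chi.
\]
By Lemma 9.5 for $m = 1$ the observable $\Phi(t)$ is uniformly bounded. We have:
\begin{align*}
D\Phi(t) &= \chi(H)\, d\Gamma(q^t, d_0 q^t)\frac{A}{t}\Gamma(q^t)\chi(H) + \mathrm{h.c.} \\
&\quad + \chi(H)[V, i\Gamma(q^t)]\frac{A}{t}\Gamma(q^t)\chi(H) + \mathrm{h.c.} \\
&\quad + t^{-1}\chi(H)\Gamma(q^t)[H, iA]\Gamma(q^t)\chi(H) \\
&\quad - t^{-1}\chi(H)\Gamma(q^t)\frac{A}{t}\Gamma(q^t)\chi(H) \\
&=: R_1(t) + R_2(t) + R_3(t) + R_4(t). \tag{9.10}
\end{align*}
We have used the fact, shown in the proof of Lemma 6.2, that $\Gamma(q^t)$ preserves $D(H_0)$ and $D(N^n)$ to expand the commutator $[H, i\Phi(t)]$ in (9.10). Let us first estimate $R_2(t)$. By Proposition 2.7 and hypothesis (Is)
\[
[V, i\Gamma(q^t)] \in (N+1)^n O_N(t^{-s}), \qquad s > 1,
\]
for some $n$. Therefore by the higher order estimates and Lemma 9.5 for $m = 1$:
\[
R_2(t) \in O(t^{-s}), \qquad s > 1. \tag{9.11}
\]
We estimate now $R_1(t)$. By Lemma 3.3(i):
\[
d_0 q^t = -\frac{1}{2t}\Big(\Big(\frac{\langle x\rangle}{t} - v\Big) q'\Big(\frac{\langle x\rangle}{t}\Big) + \mathrm{h.c.}\Big) + r^t =: \frac1t g^t + r^t,
\]
where $r^t \in O(t^{-2})$. By the higher order estimates $\|\chi(H)\, d\Gamma(q^t, r^t)\| \in O(t^{-2})$, which using Lemma 9.5 for $m = 1$ yields
\[
\Big\|\chi(H)\, d\Gamma(q^t, r^t)\frac{A}{t}\Gamma(q^t)\chi(H)\Big\| \in O(t^{-2}).
\]
Then we set
\[
B_1 := \chi(H)\, d\Gamma(q^t, g^t)(N+1)^{-1/2}, \qquad B_2^* := (N+1)^{1/2}\frac{A}{t}\Gamma(q^t)\chi(H),
\]
and use the inequality
\[
\chi(H)\, d\Gamma(q^t, g^t)\frac{A}{t}\Gamma(q^t)\chi(H) + \mathrm{h.c.} = B_1 B_2^* + B_2 B_1^* \ge -B_1 B_1^* - B_2 B_2^*. \tag{9.12}
\]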
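We can write:
\begin{align*}
-B_2 B_2^* &= -\chi(H)\tilde\chi(H)\Gamma(q^t)\Gamma(\tilde q^t)\frac{A^2}{t^2}(N+1)\Gamma(\tilde q^t)\Gamma(q^t)\tilde\chi(H)\chi(H) \\
&= -\chi(H)\Gamma(q^t)\tilde\chi(H)\Gamma(\tilde q^t)\frac{A^2}{t^2}(N+1)\Gamma(\tilde q^t)\tilde\chi(H)\Gamma(q^t)\chi(H) + O(t^{-1}) \\
&\ge -\epsilon^2 C_1 \chi(H)\Gamma^2(q^t)\chi(H) + O(t^{-1}). \tag{9.13}
\end{align*}
In the first step we use that $[\tilde\chi(H), \Gamma(q^t)] \in O(t^{-1})$ by Lemma 6.2 and that $\frac{A^2}{t^2}(N+1)\Gamma(q^t)\chi(H) \in O(1)$ by Lemma 9.5 for $m = 2$. In the second step we use the following estimate analogous to (9.8):
\[
\Big\|\tilde\chi(H)\Gamma(\tilde q^t)\frac{A^2}{t^2}(N+1)\Gamma(\tilde q^t)\tilde\chi(H)\Big\| \le C_1\epsilon^2 .
\]
Next we use Proposition 2.4(iv) to obtain:
\[
B_1 B_1^* = \chi(H)\, d\Gamma(q^t, g^t)^2 (N+1)^{-1}\chi(H) \le \chi(H)\, d\Gamma((g^t)^2)\chi(H).
\]
By Proposition 9.2, we obtain
\[
\int_1^{+\infty} \|B_1^* e^{-itH} u\|^2 \frac{dt}{t} \le C\|u\|^2 . \tag{9.14}
\]
To handle $R_3(t)$, we write using Lemma 6.2:
\begin{align*}
R_3(t) &= t^{-1}\Gamma(q^t)\chi(H)[H, iA]\chi(H)\Gamma(q^t) + O(t^{-2}) \\
&\ge C_0 t^{-1}\Gamma(q^t)\chi^2(H)\Gamma(q^t) - C t^{-2} \\
&\ge C_0 t^{-1}\chi(H)\Gamma^2(q^t)\chi(H) - C t^{-2}. \tag{9.15}
\end{align*}
It remains to estimate $R_4(t)$. We write using Lemma 9.5:
\begin{align*}
R_4(t) &= -t^{-1}\chi(H)\Gamma(q^t)\frac{A}{t}\Gamma(q^t)\chi(H) \\
&= -t^{-1}\chi(H)\Gamma(q^t)\tilde\chi(H)\Gamma(\tilde q^t)\frac{A}{t}\Gamma(\tilde q^t)\tilde\chi(H)\Gamma(q^t)\chi(H) + O(t^{-2}) \\
&\ge -\epsilon C_2 t^{-1}\chi(H)\Gamma(q^t)^2\chi(H) + O(t^{-2}). \tag{9.16}
\end{align*}
Collecting (9.13), (9.15) and (9.16), we obtain
\[
-t^{-1} B_2(t) B_2^*(t) + R_3(t) + R_4(t) \ge (-\epsilon^2 C_1 + C_0 - \epsilon C_2)\, t^{-1}\chi(H)\Gamma(q^t)^2\chi(H) + O(t^{-2}). \tag{9.17}
\]
We pick now $\epsilon$ small enough so that $\tilde C_0 = -\epsilon^2 C_1 + C_0 - \epsilon C_2 > 0$. Using (9.11), (9.14) and (9.17) we conclude that
\[
D\Phi(t) \ge \frac{\tilde C_0}{t}\chi(H)\Gamma^2(q^t)\chi(H) - R(t) - C t^{-s}, \qquad s > 1,
\]
where $R(t)$ is integrable along the evolution. We finish the proof as in [4, Proposition 11.5].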
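The smallness condition on $\epsilon$ can be made explicit. The following sketch reads the constant as $\tilde C_0 = -\epsilon^2 C_1 + C_0 - \epsilon C_2$, with the $\epsilon$-dependence supplied by Lemma 9.5 for $m = 2$ and $m = 1$, and assumes $C_0 > 0$, $C_1 > 0$, $C_2 \ge 0$ independent of $\epsilon$; the symbol $\epsilon_*$ is introduced here only for illustration.

```latex
\[
  -\epsilon^2 C_1 + C_0 - \epsilon C_2 > 0, \quad \epsilon > 0
  \qquad\Longleftrightarrow\qquad
  0 < \epsilon < \epsilon_* := \frac{-C_2 + \sqrt{C_2^2 + 4 C_0 C_1}}{2 C_1}.
\]
% A cruder sufficient choice: if 0 < \epsilon \le 1 then \epsilon^2 \le \epsilon, hence
\[
  \epsilon = \min\Big(1, \frac{C_0}{2(C_1 + C_2)}\Big)
  \quad\Longrightarrow\quad
  \epsilon^2 C_1 + \epsilon C_2 \le \epsilon (C_1 + C_2) \le \frac{C_0}{2} < C_0 .
\]
```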
10. Asymptotic Completeness

In this section, we prove the asymptotic completeness of wave operators. The first step is the geometric asymptotic completeness, identifying the asymptotic vacua with the subspace of states living at large times $t$ in $\langle x\rangle \le \epsilon t$ for arbitrarily small $\epsilon > 0$. In the second step, using the minimal velocity estimate, one shows that these states have to be bound states of $H$.

10.1. Existence of asymptotic localizations

Theorem 10.1. Let $H$ be an abstract QFT Hamiltonian. Assume hypotheses (G1), (G2), (Is) for $s > 1$. Let $q \in C_0^\infty(\mathbb{R})$, $0 \le q \le 1$, $q = 1$ on a neighborhood of zero. Set $q^t = q(\frac{\langle x\rangle}{t})$. Then there exists
\[
\text{s-}\lim_{t\to\infty} e^{itH}\Gamma(q^t)e^{-itH} =: \Gamma^+(q). \tag{10.1}
\]
We have
\begin{align*}
&\Gamma^+(q\tilde q) = \Gamma^+(q)\Gamma^+(\tilde q), \tag{10.2} \\
&0 \le \Gamma^+(q) \le \Gamma^+(\tilde q) \le 1, \quad \text{if } 0 \le q \le \tilde q \le 1, \tag{10.3} \\
&[H, \Gamma^+(q)] = 0. \tag{10.4}
\end{align*}
The proof is completely similar to the proof of [4, Theorem 12.1], using Proposition 9.4. An analogous result is true for the free Hamiltonian $H_0$.

Proposition 10.2. Assume hypotheses (H1), (G1), (G2). Let $q \in C^\infty(\mathbb{R})$, $0 \le q \le 1$, $q \equiv 1$ near $\infty$. Then there exists
\[
\text{s-}\lim_{t\to\infty} e^{itH_0}\Gamma(q^t)e^{-itH_0} =: \Gamma^+_{\rm free}(q). \tag{10.5}
\]
Moreover if additionally $q \equiv 0$ near 0 then:
\[
\Gamma^+_{\rm free}(q) = \Gamma^+_{\rm free}(q)\Gamma(1_c(\omega)), \tag{10.6}
\]
where $1_c(\omega)$ is the projection on the continuous spectral subspace of $\omega$.

Proof. By density it suffices to prove the existence of the limit (10.5) on $\Gamma_{\rm fin}(\mathfrak{h})$. Using the identity (see, e.g., [4, Lemma 3.4]):
\[
\frac{d}{dt}\Gamma(r_t) = d\Gamma(r_t, r_t'),
\]
we obtain for $a, b \in B(\mathfrak{h})$:
\[
\Gamma(a) - \Gamma(b) = \int_0^1 d\Gamma(ta + (1-t)b, a - b)\,dt.
\]
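This integral identity can be checked by hand on the two-particle sector. The sketch below assumes the usual conventions (as in [4]) that $\Gamma(a)$ acts there as $a \otimes a$ and $d\Gamma(r, s)$ as $s \otimes r + r \otimes s$, and takes $\mathfrak{h} = \mathbb{C}^2$ with real matrices; the helper names are hypothetical. Since the integrand is affine in $t$, the midpoint rule reproduces the integral exactly up to rounding.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def gamma2(a):
    """Gamma(a) on the two-particle sector: a (x) a."""
    return kron(a, a)


def dgamma2(r, s):
    """dGamma(r, s) on the two-particle sector: s (x) r + r (x) s."""
    return add(kron(s, r), kron(r, s))


def integral_side(a, b, steps=8):
    """Midpoint approximation of int_0^1 dGamma(t a + (1 - t) b, a - b) dt."""
    diff = add(a, scale(-1.0, b))
    dim = len(a) ** 2
    total = [[0.0] * dim for _ in range(dim)]
    h = 1.0 / steps
    for k in range(steps):
        t = (k + 0.5) * h
        r_t = add(scale(t, a), scale(1.0 - t, b))
        total = add(total, scale(h, dgamma2(r_t, diff)))
    return total


def max_abs_diff(x, y):
    return max(abs(p - q) for rx, ry in zip(x, y) for p, q in zip(rx, ry))


if __name__ == "__main__":
    a = [[1.0, 2.0], [0.5, -1.0]]
    b = [[0.0, 1.0], [3.0, 2.0]]
    lhs = add(gamma2(a), scale(-1.0, gamma2(b)))
    print(max_abs_diff(lhs, integral_side(a, b)))
```

On the $n$-particle sector the same computation produces $n$ tensor factors, which is the reason a factor of $N$ has to be paid in the norm-continuity statement that follows.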
It follows then from Proposition 2.4 that
\[
B(\mathfrak{h}) \ni a \mapsto \Gamma(a)(N+1)^{-1} \in B(\Gamma(\mathfrak{h}))
\]
is norm continuous. This implies that it suffices to prove the existence of the limit for $q \in C^\infty(\mathbb{R})$, $0 \le q \le 1$ and $q \equiv 1$ near $\infty$, $q \equiv \mathrm{Cst}$ near 0. In particular $q' \in C_0^\infty(\mathbb{R}\setminus\{0\})$. We can then repeat the proof of [4, Theorem 12.1], noting that the only place where $q \equiv 1$ near 0 is needed is to control the commutator $[V, i\Gamma(q^t)]$ which is absent in our case. This proves (10.5). Restricting (10.5) to the one-particle sector we obtain the existence of
\[
q^+ := \text{s-}\lim_{t\to+\infty} e^{it\omega} q^t e^{-it\omega}. \tag{10.7}
\]
By Lemma 3.3(i), we see that $[\chi(\omega), q^+] = 0$ for each $\chi \in C_0^\infty(\mathbb{R})$ hence $q^+$ commutes with $\omega$. If $q \equiv 0$ near 0 then clearly
\[
1_{pp}(\omega) q^+ = q^+ 1_{pp}(\omega) = 0,
\]
and hence $q^+ = q^+ 1_c(\omega) = 1_c(\omega) q^+$. We note now that
\[
\Gamma^+_{\rm free}(q) = \Gamma(q^+),
\]
which implies (10.6).

10.2. The projection $P_0^+$

Theorem 10.3. Let $H$ be an abstract QFT Hamiltonian. Assume hypotheses (G1), (G2), (Is) for $s > 1$. Let $\{q_n\} \in C_0^\infty(\mathbb{R})$ be a decreasing sequence of functions such that $0 \le q_n \le 1$, $q_n \equiv 1$ on a neighborhood of 0 and $\cap_{n=1}^\infty \mathrm{supp}\, q_n = \{0\}$. Then
\[
P_0^+ := \text{s-}\lim_{n\to\infty} \Gamma^+(q_n) \ \text{exists}. \tag{10.8}
\]
$P_0^+$ is an orthogonal projection independent of the choice of the sequence $\{q_n\}$. Moreover: $[H, P_0^+] = 0$. Moreover if (S) holds:
\[
\mathrm{Ran}\, P_0^+ \subset \mathcal{K}^+. \tag{10.9}
\]
The range of $P_0^+$ can be interpreted as the space of states asymptotically containing no bosons away from the origin.

Proof. The proof is analogous to [4, Theorem 12.3]. We will only detail (10.9). Let $n \in \mathbb{N}$ such that $D(H^n) \subset D(a^{+*}(h))$ for all $h \in \mathfrak{h}_c(\omega)$. We will show that for $u \in \mathrm{Ran}\, P_0^+$:
\[
(H + b)^{-n} a^+(h) u = 0, \qquad h \in \mathfrak{h}_c(\omega).
\]
Since $h \mapsto (H + b)^{-n} a^+(h)$ is norm continuous by Theorem 8.2, we can assume that $h \in \mathfrak{h}_0$. By (S) and the fact that $u \in \mathrm{Ran}\, P_0^+$ we can choose $q \in C_0^\infty(\mathbb{R})$ with $0 \le q \le 1$ such that:
\[
u = \lim_{t\to+\infty} e^{itH}\Gamma(q^t)e^{-itH} u, \qquad q^t h_t \in o(1).
\]
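Then:
\begin{align*}
(H + b)^{-n} a^+(h) u &= \lim_{t\to+\infty} e^{itH}(H + b)^{-n} a(h_t)\Gamma(q^t) e^{-itH} u \\
&= \lim_{t\to+\infty} e^{itH}(H + b)^{-n}\Gamma(q^t) a(q^t h_t) e^{-itH} u \\
&= 0,
\end{align*}
using that $(N+1)^{-1} a(q^t h_t) \in o(1)$ and the higher order estimates.

10.3. Geometric inverse wave operators

Let $j_0 \in C_0^\infty(\mathbb{R})$, $j_\infty \in C^\infty(\mathbb{R})$, $0 \le j_0, j_\infty$, $j_0^2 + j_\infty^2 \le 1$, $j_0 = 1$ near 0 (and hence $j_\infty = 0$ near 0). Set $j := (j_0, j_\infty)$, $j^t = (j_0^t, j_\infty^t)$. As in Sec. 2.4, we introduce the operator $I(j^t) : \mathcal{H}^{\rm ext} \to \mathcal{H}$.

Theorem 10.4. Assume (G1), (G2), (Is) for $s > 1$. Then:

(i) The following limits exist:
\begin{align*}
&\text{s-}\lim_{t\to+\infty} e^{itH^{\rm ext}} I^*(j^t) e^{-itH}, \tag{10.10} \\
&\text{s-}\lim_{t\to+\infty} e^{itH} I(j^t) e^{-itH^{\rm ext}}. \tag{10.11}
\end{align*}
If we denote (10.10) by $W^+(j)$, then (10.11) equals $W^+(j)^*$ and $\|W^+(j)\| \le 1$.

(ii) For any bounded Borel function $F$ one has $W^+(j) F(H) = F(H^{\rm ext}) W^+(j)$.

(iii) Let $q_0, q_\infty \in C^\infty(\mathbb{R})$, $\nabla q_0, \nabla q_\infty \in C_0^\infty(\mathbb{R})$, $0 \le q_0, q_\infty \le 1$, $q_0 \equiv 1$ near 0 and $q_\infty \equiv 1$ near $\infty$. Set $\tilde j := (\tilde j_0, \tilde j_\infty) := (q_0 j_0, q_\infty j_\infty)$. Then
\[
\Gamma^+(q_0) \otimes \Gamma^+_{\rm free}(q_\infty)\, W^+(j) = W^+(\tilde j).
\]

(iv) Assume additionally that $j_0 + j_\infty = 1$. Then $\mathrm{Ran}\, W^+(j) \subset \mathcal{H}^{\rm scatt}$ and if $\chi \in C_0^\infty(\mathbb{R})$:
\[
\Omega^{{\rm ext},+}\chi(H^{\rm ext}) W^+(j) = \chi(H).
\]
Note that statement (iv) of Theorem 10.4 makes sense since $\mathrm{Ran}\, W^+(j) \subset \mathcal{H}^{\rm scatt}$ and $\chi(H^{\rm ext})$ preserves $\mathcal{H}^{\rm scatt}$.

Proof. Statements (i)–(iii) are proved exactly as in [4, Theorem 12.4]; we detail only (iv). We pick $q_\infty \in C^\infty(\mathbb{R})$ with $q_\infty \equiv 1$ near $\infty$, $q_\infty \equiv 0$ near 0 and $q_\infty j_\infty = j_\infty$. Applying (iii) for $q_0 \equiv 1$, we obtain that $1 \otimes \Gamma^+_{\rm free}(q_\infty) W^+(j) = W^+(j)$. Applying then (10.6) we get that $1 \otimes \Gamma(1_c(\omega)) W^+(j) = W^+(j)$, i.e. $\mathrm{Ran}\, W^+(j) \subset \mathcal{H}^{\rm scatt}$. The rest of the proof of (iv) is as in [4, Theorem 12.4].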
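10.4. Geometric asymptotic completeness

In this subsection we will show that $\mathrm{Ran}\, P_0^+ = \mathcal{K}^+$. We call this property geometric asymptotic completeness. It will be convenient to work in the scattering space $\mathcal{H}^{\rm scatt}$ and to treat $\Omega^+$ as a partial isometry $\Omega^+ : \mathcal{H}^{\rm scatt} \to \mathcal{H}$, as explained in Sec. 8.3.

Theorem 10.5. Assume (G1), (G2), (S), (Is) for $s > 1$. Let $j_n = (j_{0,n}, j_{\infty,n})$ satisfy the conditions of Sec. 10.3. Additionally, assume that $j_{0,n} + j_{\infty,n} = 1$ and that for any $\epsilon > 0$, there exists $m$ such that, for $n > m$, $\mathrm{supp}\, j_{0,n} \subset [-\epsilon, \epsilon]$. Then
\[
\Omega^{+*} = \text{w-}\lim_{n\to\infty} W^+(j_n).
\]
Besides $\mathcal{K}^+ = \mathrm{Ran}\, P_0^+$.

Proof. The proof is analogous to [4, Theorem 12.5]. Since it is an important step, we will give some details. If $q \in C_0^\infty(\mathbb{R})$ is such that $q = 1$ in a neighborhood of 0, $0 \le q \le 1$ then for sufficiently big $n$ we have $q j_{0,n} = j_{0,n}$. Therefore, for sufficiently big $n$ by Theorem 10.4(iii)
\[
(\Gamma^+(q) \otimes 1) W^+(j_n) - W^+(j_n) = 0.
\]
Hence
\[
\text{w-}\lim_{n\to\infty}\big((P_0^+ \otimes 1) W^+(j_n) - W^+(j_n)\big) = 0. \tag{10.12}
\]
Let $\chi \in C_0^\infty(\mathbb{R})$. We have
\begin{align*}
\Omega^{+*}\chi(H) &= \Omega^{+*}\Omega^{{\rm ext},+}\chi(H^{\rm ext}) W^+(j_n) && (1) \\
&= \text{w-}\lim_{n\to\infty} \Omega^{+*}\Omega^{{\rm ext},+}\chi(H^{\rm ext}) W^+(j_n) && (2) \\
&= \text{w-}\lim_{n\to\infty} \Omega^{+*}\Omega^{{\rm ext},+}\chi(H^{\rm ext})(P_0^+ \otimes 1) W^+(j_n) && (3) \\
&= \text{w-}\lim_{n\to\infty} (P_0^+ \otimes 1)\chi(H^{\rm ext}) W^+(j_n) && (4) \\
&= \text{w-}\lim_{n\to\infty} (P_0^+ \otimes 1) W^+(j_n)\chi(H) && (5) \\
&= \text{w-}\lim_{n\to\infty} W^+(j_n)\chi(H) && (6).
\end{align*}
We use Theorem 10.4 in step (1), (10.12) in step (3), $\mathrm{Ran}\, P_0^+ \subset \mathcal{K}^+$ in step (4), Theorem 10.4(ii) in step (5) and (10.12) again in step (6). Clearly this implies that:
\[
\Omega^{+*} = \text{w-}\lim_{n\to\infty} W^+(j_n).
\]
Therefore by (10.12)
\[
\mathrm{Ran}\, \Omega^{+*} \subset \mathrm{Ran}\, P_0^+ \otimes \Gamma(\mathfrak{h}) \subset \mathcal{K}^+ \otimes \Gamma(\mathfrak{h}).
\]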
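But by construction $\mathrm{Ran}\, \Omega^{+*} = \mathcal{K}^+ \otimes \Gamma(\mathfrak{h})$. Hence $\mathcal{K}^+ \otimes \Gamma(\mathfrak{h}) = \mathrm{Ran}\, P_0^+ \otimes \Gamma(\mathfrak{h})$, and therefore $\mathcal{K}^+ = \mathrm{Ran}\, P_0^+$.

10.5. Asymptotic completeness

In this subsection, we will prove asymptotic completeness.

Theorem 10.6. Assume hypotheses (Hi), $1 \le i \le 3$, (Gi), $1 \le i \le 5$, (Mi), $i = 1, 2$, (Is) for $s > 1$ and (S). Then:
\[
\mathcal{K}^+ = \mathcal{H}_{pp}(H).
\]

Proof. By Proposition 8.4 and geometric asymptotic completeness we already know that
\[
\mathcal{H}_{pp}(H) \subset \mathcal{K}^+ = \mathrm{Ran}\, P_0^+.
\]
It remains to prove that $P_0^+ \le 1_{pp}(H)$. Let $\chi \in C_0^\infty(\mathbb{R}\setminus(\tau \cup \sigma_{pp}(H)))$. We deduce from Proposition 9.6 in Sec. 9.4 that there exists $\epsilon > 0$ such that for $q \in C_0^\infty([-\epsilon, \epsilon])$ with $q(x) = 1$ for $|x| < \epsilon/2$ we have
\[
\int_1^{+\infty} \|\Gamma(q^t)\chi(H) e^{-itH} u\|^2 \frac{dt}{t} \le c\|u\|^2 .
\]
Since $\|\Gamma(q^t)\chi(H) e^{-itH} u\| \to \|\Gamma^+(q)\chi(H) u\|$, we have $\Gamma^+(q)\chi(H) = 0$. This implies that
\[
P_0^+ \le 1_{\tau \cup \sigma_{pp}}(H).
\]
Since $\tau$ is a closed countable set and $\sigma_{pp}(H)$ can accumulate only at $\tau$, we see that $1_{pp}(H) = 1_{\tau \cup \sigma_{pp}}(H)$. This completes the proof of the theorem.

Appendix A

A.1. Proof of Lemma 3.3

To prove (i) we restrict the quadratic form $[F(\frac{\langle x\rangle}{R}), \omega]$ to $\mathcal{S}$. Using (2.2), we get
\begin{align*}
\Big[F\Big(\frac{\langle x\rangle}{R}\Big), \omega\Big]
&= \frac{i}{2\pi R}\int_{\mathbb{C}} \partial_{\bar z}\tilde F(z)\Big(z - \frac{\langle x\rangle}{R}\Big)^{-1}[\langle x\rangle, \omega]\Big(z - \frac{\langle x\rangle}{R}\Big)^{-1} dz \wedge d\bar z \\
&= \frac{i}{2\pi R}\int_{\mathbb{C}} \partial_{\bar z}\tilde F(z)\Big(z - \frac{\langle x\rangle}{R}\Big)^{-2}[\langle x\rangle, \omega]\, dz \wedge d\bar z \\
&\quad + \frac{i}{2\pi R^2}\int_{\mathbb{C}} \partial_{\bar z}\tilde F(z)\Big(z - \frac{\langle x\rangle}{R}\Big)^{-2} \mathrm{ad}^2_{\langle x\rangle}\omega\, \Big(z - \frac{\langle x\rangle}{R}\Big)^{-1} dz \wedge d\bar z, \tag{A.1}
\end{align*}
where the right-hand sides are operators on $\mathcal{S}$. Since $\mathrm{ad}^2_{\langle x\rangle}\omega \in S^0_{(0)}$, we see that the last term belongs to $R^{-2} S^0_{(0)}$. Using the bound $\|\frac{\langle x\rangle}{R}(z - \frac{\langle x\rangle}{R})^{-1}\| = O(|\mathrm{Im}\, z|^{-1})$ for $z \in \mathrm{supp}\, \tilde F$, we see that the last term belongs also to $R^{-1} S^{-1}_{(0)}$. This proves (i) for $k = 0$. Replacing $\omega$ by $[\omega, \langle x\rangle]$ and using that $\mathrm{ad}^2_{\langle x\rangle}[\omega, \langle x\rangle] \in S^0_{(0)}$ we get also (i) for $k = 1$. (ii) follows from (i) for $k = 0$ since $\mathcal{S}$ is a core for $\omega$. (iii) and (iv) are proved similarly. (v) is proved as (i), replacing $\omega$ by $[\omega, ia]_0$ and using only the first line of (A.1). To prove (vi) we restrict again the quadratic form $[F(\frac{\langle x\rangle}{R}), \omega^2]$ to $\mathcal{S}$ and get:
\begin{align*}
\Big[F\Big(\frac{\langle x\rangle}{R}\Big), \omega^2\Big]
&= \frac{i}{2\pi R}\int_{\mathbb{C}} \partial_{\bar z}\tilde F(z)\Big(z - \frac{\langle x\rangle}{R}\Big)^{-1}[\langle x\rangle, \omega^2]\Big(z - \frac{\langle x\rangle}{R}\Big)^{-1} dz \wedge d\bar z \\
&= \frac{i}{2\pi R}\int_{\mathbb{C}} \partial_{\bar z}\tilde F(z)\Big(z - \frac{\langle x\rangle}{R}\Big)^{-1}\big(2[\langle x\rangle, \omega]\omega + [\omega, [\langle x\rangle, \omega]]\big)\Big(z - \frac{\langle x\rangle}{R}\Big)^{-1} dz \wedge d\bar z, \tag{A.2}
\end{align*}
where the right-hand sides are operators on $\mathcal{S}$. Note that $[\omega, [\langle x\rangle, \omega]]$ is bounded by (G2). We use next that $\omega(z - \frac{\langle x\rangle}{R})^{-1}\omega^{-1} \in O(|\mathrm{Im}\, z|^{-2})$ uniformly in $R \ge 1$ to obtain (vi). To prove (vii), we pick another function $F_1 \in C_0^\infty(\mathbb{R}\setminus\{0\})$ such that $F_1 F = F$ and note that
\[
\Big[F\Big(\frac{\langle x\rangle}{R}\Big), b\Big] = F\Big(\frac{\langle x\rangle}{R}\Big)\Big[F_1\Big(\frac{\langle x\rangle}{R}\Big), b\Big] + \Big[F\Big(\frac{\langle x\rangle}{R}\Big), b\Big] F_1\Big(\frac{\langle x\rangle}{R}\Big).
\]
Applying again (2.2), we get
\[
\Big[F\Big(\frac{\langle x\rangle}{R}\Big), b\Big] = \frac{i}{2\pi R}\int_{\mathbb{C}} \partial_{\bar z}\tilde F(z)\Big(z - \frac{\langle x\rangle}{R}\Big)^{-1}[\langle x\rangle, b]\Big(z - \frac{\langle x\rangle}{R}\Big)^{-1} dz \wedge d\bar z,
\]
and the analogous formula for $[F_1(\frac{\langle x\rangle}{R}), b]$. We use then that $[\langle x\rangle, b] \in S^{-\mu+\delta}_{(0)}$ and Lemma 2.3, moving powers of $\langle x\rangle$ through the resolvents either to the left or to the right to obtain (vii).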
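A.2. Proof of Lemma 3.4

We use the identity:
\[
\omega^{-1/2} = c_0\int_0^{+\infty} s^{-1/2}(\omega + s)^{-1} ds,
\]
to get:
\[
\omega^{1/2}\Big[F\Big(\frac{\langle x\rangle}{R}\Big), \omega^{-1/2}\Big] = c_0\int_0^{+\infty} s^{-1/2}\omega^{1/2}(\omega + s)^{-1}\Big[F\Big(\frac{\langle x\rangle}{R}\Big), \omega\Big](\omega + s)^{-1} ds \in O(R^{-1}),
\]
since $\omega \ge m > 0$. Hence
\begin{align*}
\omega^{-1/2}(\omega - \omega_\infty) F\Big(\frac{\langle x\rangle}{R}\Big)\omega^{-1/2}
&= \omega^{-1/2}(\omega - \omega_\infty)\omega^{-1/2} F\Big(\frac{\langle x\rangle}{R}\Big) + \omega^{-1/2}(\omega - \omega_\infty)\omega^{-1/2}\,\omega^{1/2}\Big[F\Big(\frac{\langle x\rangle}{R}\Big), \omega^{-1/2}\Big] \\
&= \omega^{-1/2}(\omega - \omega_\infty)\omega^{-1/2}\langle x\rangle^{\epsilon}\langle x\rangle^{-\epsilon} F\Big(\frac{\langle x\rangle}{R}\Big) + O(R^{-1}) \\
&= O(R^{-\epsilon}) + O(R^{-1}).
\end{align*}
The second statement of the lemma is obvious.

A.3. Proof of Lemma 9.3

Since by (G1) $[v, \langle x\rangle]$ extends from $\mathcal{S}$ as a bounded operator on $\mathfrak{h}$ and $\mathcal{S}$ is a core for $\langle x\rangle$, we get that $v$ preserves $D(\langle x\rangle)$. Since $\frac{\langle x\rangle}{t} - v$ is selfadjoint on $D(\langle x\rangle)$ we get
\[
D(c) = D\Big(\Big(\frac{\langle x\rangle}{t} - v\Big)^2\Big) = \Big\{u \in D(\langle x\rangle) \ \Big|\ \Big(\frac{\langle x\rangle}{t} - v\Big)u \in D(\langle x\rangle)\Big\} = D(\langle x\rangle^2),
\]
so $c$ is selfadjoint on $D(\langle x\rangle^2)$. Since $v \in S^0_{(0)}$ we get by Lemma 2.3 that $J(\frac{\langle x\rangle}{t})\, c\, J(\frac{\langle x\rangle}{t}) \in O(1)$ which proves (i).

Let us now prove (ii). We first consider the commutator $[c, J(\frac{\langle x\rangle}{t})]$ for $J \in C_0^\infty(\mathbb{R})$. We have
\begin{align*}
\Big[c, J\Big(\frac{\langle x\rangle}{t}\Big)\Big]
&= \Big(\frac{\langle x\rangle}{t} - v\Big)\Big[v, J\Big(\frac{\langle x\rangle}{t}\Big)\Big] + \Big[v, J\Big(\frac{\langle x\rangle}{t}\Big)\Big]\Big(\frac{\langle x\rangle}{t} - v\Big) \\
&= t^{-1}\Big(\frac{\langle x\rangle}{t} - v\Big) J'\Big(\frac{\langle x\rangle}{t}\Big)[v, \langle x\rangle] + t^{-1} J'\Big(\frac{\langle x\rangle}{t}\Big)[v, \langle x\rangle]\Big(\frac{\langle x\rangle}{t} - v\Big) \\
&\quad + \Big(\frac{\langle x\rangle}{t} - v\Big) M(t) + M(t)\Big(\frac{\langle x\rangle}{t} - v\Big),
\end{align*}
where $M(t) \in t^{-2} S^{-1}_{(0)} \cap t^{-1} S^0_{(0)}$ by Lemma 3.3(i). This implies that the last two terms in the right-hand side are $O(t^{-2})$. Using then that $[v, J'(\frac{\langle x\rangle}{t})] \in O(t^{-1})$ and $[[v, \langle x\rangle], \frac{\langle x\rangle}{t}] \in O(t^{-1})$ since $v \in S^0_{(3)}$, we see that
\begin{align*}
\Big(\frac{\langle x\rangle}{t} - v\Big) J'\Big(\frac{\langle x\rangle}{t}\Big)[v, \langle x\rangle] &= J'\Big(\frac{\langle x\rangle}{t}\Big) M_1(t) + O(t^{-1}), \\
J'\Big(\frac{\langle x\rangle}{t}\Big)[v, \langle x\rangle]\Big(\frac{\langle x\rangle}{t} - v\Big) &= J'\Big(\frac{\langle x\rangle}{t}\Big) M_2(t) + O(t^{-1}),
\end{align*}
where $M_i(t) \in O(1)$. This shows that:
\[
\Big[c, J\Big(\frac{\langle x\rangle}{t}\Big)\Big] = \frac1t J'\Big(\frac{\langle x\rangle}{t}\Big) O(1) + O(t^{-2}). \tag{A.3}
\]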
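We will use the following identities valid for $\lambda > 0$:
\[
\lambda^{-1/2} = c_0\int_0^{+\infty} s^{-1/2}(\lambda + s)^{-1} ds, \qquad
\lambda^{1/2} = c_0\int_0^{+\infty} s^{-1/2}\lambda(\lambda + s)^{-1} ds, \tag{A.4}
\]
and
\[
\lambda^{-3/2} = 2c_0\int_0^{+\infty} s^{-1/2}(\lambda + s)^{-2} ds, \tag{A.5}
\]
which follows by differentiating the first identity of (A.4) with respect to $\lambda$. A related obvious bound is:
\[
\int_0^{+\infty} s^{-1/2}(t^{-\delta} + s)^{-n} ds = O(t^{(n - \frac12)\delta}), \qquad n \ge 1. \tag{A.6}
\]
From (A.4) we obtain that
\[
c^{1/2} = c_0\int_0^{+\infty} s^{-1/2} c(c + s)^{-1} ds, \quad \text{as a strong integral on } D(c). \tag{A.7}
\]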
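The scalar identities (A.4), (A.5) and the scaling in (A.6) are easy to confirm numerically. The sketch below is only a sanity check: it assumes the normalisation $c_0 = 1/\pi$ (the value for which the first identity of (A.4) holds, since $\int_0^{+\infty} s^{-1/2}(\lambda+s)^{-1}ds = \pi\lambda^{-1/2}$), and the function names are hypothetical.

```python
import math

C0 = 1.0 / math.pi  # assumed normalisation making the first identity of (A.4) exact


def half_line_integral(f, n=20000):
    """Midpoint rule for int_0^inf s^(-1/2) f(s) ds.

    The substitution s = r^2 removes the s^(-1/2) singularity (the integral
    becomes int_0^inf 2 f(r^2) dr); r = x / (1 - x) then maps [0, inf) to [0, 1).
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        r = x / (1.0 - x)
        total += 2.0 * f(r * r) / (1.0 - x) ** 2
    return total * h


def a4_first(lam):
    """Right-hand side of the first identity in (A.4); should equal lam^(-1/2)."""
    return C0 * half_line_integral(lambda s: 1.0 / (lam + s))


def a4_second(lam):
    """Right-hand side of the second identity in (A.4); should equal lam^(1/2)."""
    return C0 * half_line_integral(lambda s: lam / (lam + s))


def a5(lam):
    """Right-hand side of (A.5); should equal lam^(-3/2)."""
    return 2.0 * C0 * half_line_integral(lambda s: 1.0 / (lam + s) ** 2)


def a6_rescaled(eps, n):
    """int_0^inf s^(-1/2) (eps + s)^(-n) ds times eps^(n - 1/2).

    With eps = t^(-delta) this quantity is independent of t, which is the
    content of the bound (A.6).
    """
    return half_line_integral(lambda s: 1.0 / (eps + s) ** n) * eps ** (n - 0.5)


if __name__ == "__main__":
    for lam in (0.5, 2.0, 7.0):
        print(lam, a4_first(lam) * math.sqrt(lam), a4_second(lam) / math.sqrt(lam),
              a5(lam) * lam ** 1.5)
```

The rescaled quantity in `a6_rescaled` equals $\pi$ for $n = 1$ and $\pi/2$ for $n = 2$, independently of `eps`.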
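Therefore
\[
\Big[c^{1/2}, J\Big(\frac{\langle x\rangle}{t}\Big)\Big] = c_0\int_0^{+\infty} s^{-1/2}\Big(\Big[c, J\Big(\frac{\langle x\rangle}{t}\Big)\Big](c + s)^{-1} - c(c + s)^{-1}\Big[c, J\Big(\frac{\langle x\rangle}{t}\Big)\Big](c + s)^{-1}\Big) ds.
\]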
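The integrand comes from the algebraic identity $[c(c+s)^{-1}, J] = [c, J](c+s)^{-1} - c(c+s)^{-1}[c, J](c+s)^{-1}$, which follows from $[(c+s)^{-1}, J] = -(c+s)^{-1}[c, J](c+s)^{-1}$. A minimal finite-dimensional check, with arbitrarily chosen $2 \times 2$ matrices standing in for $c$ and $J$ (an illustration only; the helper names are hypothetical):

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matsub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]


def inverse(a):
    """Inverse of an invertible 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def commutator(a, b):
    return matsub(matmul(a, b), matmul(b, a))


def resolvent_identity_defect(c, j, s):
    """Largest entry of [c(c+s)^-1, j] - ([c,j](c+s)^-1 - c(c+s)^-1 [c,j] (c+s)^-1)."""
    shifted = [[c[0][0] + s, c[0][1]], [c[1][0], c[1][1] + s]]
    res = inverse(shifted)          # (c + s)^(-1)
    c_res = matmul(c, res)          # c (c + s)^(-1)
    lhs = commutator(c_res, j)
    cj = commutator(c, j)
    rhs = matsub(matmul(cj, res), matmul(matmul(c_res, cj), res))
    return max(abs(lhs[i][k] - rhs[i][k]) for i in range(2) for k in range(2))


if __name__ == "__main__":
    c = [[2.0, 1.0], [1.0, 3.0]]    # positive definite, playing the role of c
    j = [[1.0, 2.0], [0.0, -1.0]]   # arbitrary, playing the role of J
    print(resolvent_identity_defect(c, j, 0.7))
```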
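We use the bounds
\[
\|c(c + s)^{-1}\| \le 1, \qquad \|(c + s)^{-1}\| \le (t^{-\delta} + s)^{-1}, \tag{A.8}
\]
and (A.3) to obtain
\[
\Big\|\Big[c^{1/2}, J\Big(\frac{\langle x\rangle}{t}\Big)\Big]\Big\| \le C t^{-1}\int_0^{+\infty} s^{-1/2}(t^{-\delta} + s)^{-1} ds = O(t^{-1+\delta/2}),
\]
by (A.4), which proves (ii).

To prove (iii) we first compute
\[
d_0 c = -\frac2t\Big(\frac{\langle x\rangle}{t} - v\Big)^2 - \Big([\omega, iv]\Big(\frac{\langle x\rangle}{t} - v\Big) + \mathrm{h.c.}\Big) - \delta t^{-\delta-1}. \tag{A.9}
\]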
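We first rewrite the second term in the right-hand side in a convenient way: by (G2), we have
\[
[\omega, iv] = \gamma^2 + r_{-1-\epsilon}, \qquad \gamma \in S^{-1/2}_{\epsilon,(1)}, \qquad r_{-1-\epsilon} \in S^{-1-\epsilon}_{(0)}.
\]
Since $v \in S^0_{(0)}$, we get first that:
\[
\Big(\frac{\langle x\rangle}{t} - v\Big) r_{-1-\epsilon} \in O(t^{-1}) S^{-\epsilon}_{(0)} + S^{-1-\epsilon}_{(0)}. \tag{A.10}
\]
We claim also that
\[
\Big[\gamma, \frac{\langle x\rangle}{t} - v\Big] \in O(t^{-1}) S^{-1/2+\epsilon}_{(0)} + S^{-3/2+2\epsilon}_{(0)}. \tag{A.11}
\]
Clearly $[\gamma, \langle x\rangle] \in S^{-1/2+\epsilon}_{(0)}$. To handle $[\gamma, v]$ we use the Lie identity and write:
\[
i[\gamma, v] = -[\gamma, [\omega, \langle x\rangle]] = [\omega, [\langle x\rangle, \gamma]] + [\langle x\rangle, [\omega, \gamma]] \in S^{-3/2+2\epsilon}_{(0)}, \tag{A.12}
\]
which proves (A.11). By Lemma 2.3(i), we get that
\[
\gamma\Big[\gamma, \frac{\langle x\rangle}{t} - v\Big], \ \Big[\gamma, \frac{\langle x\rangle}{t} - v\Big]\gamma \in t^{-1} S^{-1+\epsilon}_{(0)} + S^{-2+2\epsilon}_{(0)},
\]
and hence using that $0 < \epsilon < \frac12$:
\[
[\omega, iv]\Big(\frac{\langle x\rangle}{t} - v\Big) + \mathrm{h.c.} = 2\gamma\Big(\frac{\langle x\rangle}{t} - v\Big)\gamma + R_2(t),
\]
where $R_2(t) \in O(t^{-1}) S^{-\epsilon_1}_{(0)} + S^{-1-\epsilon_1}_{(0)}$, for some $\epsilon_1 > 0$. We set now:
\[
R_0(t) = -\frac{2c}{t}, \qquad R_1(t) = -(\delta - 2)t^{-\delta-1}, \qquad R_3(t) = -2\gamma\Big(\frac{\langle x\rangle}{t} - v\Big)\gamma,
\]
and rewrite (A.9) as
\[
d_0 c = \sum_{i=0}^{3} R_i(t).
\]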
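The two scalar-type terms $R_0(t)$ and $R_1(t)$ come from eliminating $(\langle x\rangle/t - v)^2$ in favour of $c$. A short worked step, using only the definition $c = (\langle x\rangle/t - v)^2 + t^{-\delta}$ and writing $y := \langle x\rangle/t$ (a shorthand introduced here):

```latex
\begin{align*}
-\frac{2}{t}\,(y - v)^2 - \delta\, t^{-\delta-1}
  &= -\frac{2}{t}\,\bigl(c - t^{-\delta}\bigr) - \delta\, t^{-\delta-1} \\
  &= -\frac{2c}{t} + (2 - \delta)\, t^{-\delta-1}
   \;=\; R_0(t) + R_1(t).
\end{align*}
```

The remaining term of (A.9), the one containing $[\omega, iv]$, is the one that is split into the $\gamma(\,\cdot\,)\gamma$ contribution and the remainder.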
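Using (A.7), we obtain as a strong integral on $D(c)$:
\begin{align*}
d_0 c^{1/2} &= c_0\int_0^{+\infty} s^{-1/2}\big(d_0 c\,(c + s)^{-1} - c(c + s)^{-1} d_0 c\,(c + s)^{-1}\big) ds \\
&= \sum_{i=0}^{3} c_0\int_0^{+\infty} s^{-1/2}\big(R_i(t)(c + s)^{-1} - c(c + s)^{-1} R_i(t)(c + s)^{-1}\big) ds \\
&=: \sum_{i=0}^{3} I_i(t).
\end{align*}
Using (A.4) we obtain
\[
I_0(t) = -\frac1t c^{1/2}, \qquad I_1(t) = C t^{-\delta-1} c^{-1/2} = O(t^{-\delta/2-1}).
\]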
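Both formulas can be recovered from the scalar identities (A.4) and (A.5), because $R_0(t)$ and $R_1(t)$ commute with $c$. The following sketch carries out the computation by functional calculus; it uses (A.5) in addition to (A.4), and the explicit constant $1 - \delta/2$ is derived here rather than quoted from the paper.

```latex
\begin{align*}
I_0(t) &= -\frac{2}{t}\, c_0 \int_0^{+\infty} s^{-1/2}
          \bigl( c\,(c+s)^{-1} - c^2 (c+s)^{-2} \bigr)\, ds
        = -\frac{2}{t} \Bigl( c^{1/2} - \tfrac12\, c^{2}\, c^{-3/2} \Bigr)
        = -\frac{1}{t}\, c^{1/2}, \\[4pt]
I_1(t) &= -(\delta - 2)\, t^{-\delta-1}\, c_0 \int_0^{+\infty} s^{-1/2}
          \bigl( (c+s)^{-1} - c\,(c+s)^{-2} \bigr)\, ds
        = \Bigl(1 - \frac{\delta}{2}\Bigr)\, t^{-\delta-1}\, c^{-1/2}.
\end{align*}
% Since c \ge t^{-\delta}, one has \| c^{-1/2} \| \le t^{\delta/2},
% hence I_1(t) \in O(t^{-1-\delta/2}).
```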
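It remains to handle the terms $J(\frac{\langle x\rangle}{t}) I_i(t) J(\frac{\langle x\rangle}{t})$ for $i = 2, 3$. We write them as:
\begin{align*}
J\Big(\frac{\langle x\rangle}{t}\Big) I_i(t) J\Big(\frac{\langle x\rangle}{t}\Big)
&= c_0\int_0^{+\infty} s^{-1/2} J\Big(\frac{\langle x\rangle}{t}\Big) R_i(t)(c + s)^{-1} J\Big(\frac{\langle x\rangle}{t}\Big) ds \\
&\quad - c_0\int_0^{+\infty} s^{-1/2} J\Big(\frac{\langle x\rangle}{t}\Big) c(c + s)^{-1} R_i(t)(c + s)^{-1} J\Big(\frac{\langle x\rangle}{t}\Big) ds.
\end{align*}
We will need to use the fact that $0 \notin \mathrm{supp}\, J$. To do this we claim that if $J, J_1 \in C_0^\infty(\mathbb{R})$ with $J_1 \equiv 1$ near $\mathrm{supp}\, J$ then:
\begin{align*}
J\Big(\frac{\langle x\rangle}{t}\Big)(c + s)^{-1}(1 - J_1)\Big(\frac{\langle x\rangle}{t}\Big) &\in O(t^{-2}(t^{-\delta} + s)^{-2}) + O(t^{-2}(t^{-\delta} + s)^{-3}), \tag{A.13} \\
J\Big(\frac{\langle x\rangle}{t}\Big) c(c + s)^{-1}(1 - J_1)\Big(\frac{\langle x\rangle}{t}\Big) &\in O(t^{-2}(t^{-\delta} + s)^{-1}) + O(t^{-2}(t^{-\delta} + s)^{-2}). \tag{A.14}
\end{align*}
We pick $T_1 \in C_0^\infty(\mathbb{R})$, $T_1 \equiv 1$ on $\mathrm{supp}\, J_1'$, $T_1 \equiv 0$ on $\mathrm{supp}\, J$. We write using (A.3):
\begin{align*}
J\Big(\frac{\langle x\rangle}{t}\Big)(c + s)^{-1}(1 - J_1)\Big(\frac{\langle x\rangle}{t}\Big)
&= J\Big(\frac{\langle x\rangle}{t}\Big)(c + s)^{-1}\Big[c, J_1\Big(\frac{\langle x\rangle}{t}\Big)\Big](c + s)^{-1} \\
&= J\Big(\frac{\langle x\rangle}{t}\Big)(c + s)^{-1} T_1\Big(\frac{\langle x\rangle}{t}\Big) O(t^{-1})(c + s)^{-1} + J\Big(\frac{\langle x\rangle}{t}\Big)(c + s)^{-1} O(t^{-2})(c + s)^{-1} \\
&= J\Big(\frac{\langle x\rangle}{t}\Big)(c + s)^{-1}\Big[T_1\Big(\frac{\langle x\rangle}{t}\Big), c\Big](c + s)^{-1} O(t^{-1})(c + s)^{-1} + J\Big(\frac{\langle x\rangle}{t}\Big)(c + s)^{-1} O(t^{-2})(c + s)^{-1} \\
&= J\Big(\frac{\langle x\rangle}{t}\Big)(c + s)^{-1} O(t^{-1})(c + s)^{-1} O(t^{-1})(c + s)^{-1} + J\Big(\frac{\langle x\rangle}{t}\Big)(c + s)^{-1} O(t^{-2})(c + s)^{-1}.
\end{align*}
We obtain (A.13) using the bound $\|(c + s)^{-1}\| \le (t^{-\delta} + s)^{-1}$. (A.14) follows from (A.13) using that $c(c + s)^{-1} = 1 - s(c + s)^{-1}$.

We hence fix a cutoff $J_1 \in C_0^\infty(\mathbb{R}\setminus\{0\})$ such that $J_1 \equiv 1$ on $\mathrm{supp}\, J$ and set
\[
\tilde R_i(t) = J_1\Big(\frac{\langle x\rangle}{t}\Big) R_i(t) J_1\Big(\frac{\langle x\rangle}{t}\Big),
\]
and denote by $\tilde I_i(t)$ the analogs of $I_i(t)$ for $R_i(t)$ replaced by $\tilde R_i(t)$. We claim that:
\[
J\Big(\frac{\langle x\rangle}{t}\Big)(I_i(t) - \tilde I_i(t)) J\Big(\frac{\langle x\rangle}{t}\Big) \in O(t^{-2+5\delta/2}), \qquad i = 2, 3. \tag{A.15}
\]
To prove (A.15), we note that $\tilde I_i(t)$ is obtained from $I_i(t)$ by inserting $J_1(\frac{\langle x\rangle}{t})$ to the left and right of $R_i(t)$ under the integral sign. The error terms under the integral sign coming from this insertion are estimated using (A.13), (A.14) and the fact that $R_i(t) \in O(1)$ for $i = 2, 3$, since $\gamma \in S^{-1/2}_{(0)}$. The integrals of these error terms are estimated using (A.6), which by a painful but straightforward computation gives (A.15).

By Lemma 2.3(ii), we know that $\tilde R_2(t) \in O(t^{-1-\epsilon_1})$ for some $\epsilon_1 > 0$ small enough, hence using the bounds (A.8) and (A.6), we obtain that for $\delta > 0$ small enough
\[
\tilde I_2(t) \ \text{and hence} \ J\Big(\frac{\langle x\rangle}{t}\Big) I_2(t) J\Big(\frac{\langle x\rangle}{t}\Big) \in O(t^{-1-\epsilon_2}), \qquad \epsilon_2 > 0.
\]
To treat $\tilde I_3(t)$, we use that
\[
\tilde R_3(t) = \gamma_t^*\Big(\frac{\langle x\rangle}{t} - v\Big)\gamma_t, \qquad \text{for } \gamma_t = \gamma J_1\Big(\frac{\langle x\rangle}{t}\Big).
\]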
(A.16)
Let us prove this claim. We write: ! ! x x x x − v γt , − v + γt , −v −v , [γt , c] = t t t t and
[γt , x] = [γ, x]J1
Now
x t
,
x x − v [γ, x]J1 , t t
[γt , v] = [γ, v]J1 [γ, x]J1
x t
− 1 +
This follows from the fact that [γ, x] ∈ S(0)2
x t
+ γ J1
x −v t
x t
! ,v .
1
∈ O(t− 2 + ).
, 0 ∈ supp J1 and Lemma 2.3(ii).
−3/2+2 S(0) ,
Similarly we saw in (A.12) that [γ, v] ∈ which implies that: x x x x − v [γ, v]J1 − v ∈ O(t−3/2+2 ). , [γ, v]J1 t t t t Finally using Lemma 3.3(i) we write: ! x x 1 −1 0 J1 ,v = J1 ∩ O(t−1 )S(0) . [x, v] + M (t), M (t) ∈ O(t−2 )S(0) t t t −1
0 Since γ ∈ S(0)2 and [x, v] ∈ S(0) , we get that 1 x x x x − v γJ1 − v ∈ O(t− 2 ), [x, v], γJ1 [x, v] t t t t −1 0 ∩ O(t−1 )S(0) : and since M (t) ∈ O(t−2 )S(0) x x − v γM (t), γM (t) − v ∈ O(t−2 ). t t
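Collecting the various estimates we obtain (A.16). From the estimate of $[\gamma_t, c]$ we obtain:
\begin{align*}
[\gamma_t, (c + s)^{-1}] &\in O(t^{-3/2+\epsilon}(t^{-\delta} + s)^{-2}), \tag{A.17} \\
[\gamma_t, c(c + s)^{-1}] &\in O(t^{-3/2+\epsilon}(t^{-\delta} + s)^{-1}). \tag{A.18}
\end{align*}
We now write:
\[
\tilde I_3(t) = c_0\int_0^{+\infty} s^{-1/2}\gamma_t^*\Big(\frac{\langle x\rangle}{t} - v\Big)\gamma_t (c + s)^{-1} ds - c_0\int_0^{+\infty} s^{-1/2} c(c + s)^{-1}\gamma_t^*\Big(\frac{\langle x\rangle}{t} - v\Big)\gamma_t (c + s)^{-1} ds.
\]
We first move $\gamma_t$ to the right in the two integrals using (A.17) and the fact that
\[
\gamma_t^*\Big(\frac{\langle x\rangle}{t} - v\Big) = J_1\Big(\frac{\langle x\rangle}{t}\Big)\gamma\Big(\frac{\langle x\rangle}{t} - v\Big) \in O(1),
\]
since $\gamma \in S^{-1/2}_{(0)}$. We obtain error terms of size $O(t^{-3/2+\epsilon+5\delta/2})$ using (A.6). We then move $\gamma_t^*$ to the left in the second integral using (A.18) and the fact that
\[
\Big\|\Big(\frac{\langle x\rangle}{t} - v\Big)(c + s)^{-1}\Big\| \le \|c^{1/2}(c + s)^{-1}\| \le t^{\delta/2}.
\]
We obtain error terms of size $O(t^{-3/2+\epsilon+\delta})$ using again (A.6). Hence for $\delta > 0$ small enough, we get:
\begin{align*}
\tilde I_3(t) &= c_0\,\gamma_t^*\int_0^{+\infty} s^{-1/2}\Big(\frac{\langle x\rangle}{t} - v\Big)(c + s)^{-1} ds\,\gamma_t \\
&\quad - c_0\,\gamma_t^*\int_0^{+\infty} s^{-1/2} c(c + s)^{-1}\Big(\frac{\langle x\rangle}{t} - v\Big)(c + s)^{-1} ds\,\gamma_t + O(t^{-1-\epsilon_1})
\end{align*}
for some $\epsilon_1 > 0$. The integrals can be computed exactly since $\frac{\langle x\rangle}{t} - v$ commutes with $c$ and are equal to $C_1(\frac{\langle x\rangle}{t} - v) c^{-1/2}$ for some constant $C_1$ and hence $O(1)$. This yields:
\begin{align*}
J\Big(\frac{\langle x\rangle}{t}\Big)\tilde I_3(t) J\Big(\frac{\langle x\rangle}{t}\Big) &= J\Big(\frac{\langle x\rangle}{t}\Big)\gamma_t^* M(t)\gamma_t J\Big(\frac{\langle x\rangle}{t}\Big) + O(t^{-1-\epsilon_1}) \\
&= J\Big(\frac{\langle x\rangle}{t}\Big)\gamma M(t)\gamma J\Big(\frac{\langle x\rangle}{t}\Big) + O(t^{-1-\epsilon_1}),
\end{align*}
for $M(t) \in O(1)$. Using also (A.15), the same equality holds for $J(\frac{\langle x\rangle}{t}) I_3(t) J(\frac{\langle x\rangle}{t})$. Finally we use that $\gamma J(\frac{\langle x\rangle}{t}) \in O(t^{-\frac12})$, by Lemma 2.3(ii) and $[\gamma, J(\frac{\langle x\rangle}{t})] \in O(t^{-3/2+\epsilon})$, to get:
\[
J\Big(\frac{\langle x\rangle}{t}\Big)\gamma M(t)\gamma J\Big(\frac{\langle x\rangle}{t}\Big) = \gamma J\Big(\frac{\langle x\rangle}{t}\Big) M(t) J\Big(\frac{\langle x\rangle}{t}\Big)\gamma + O(t^{-2+\epsilon}).
\]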
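Hence
\[
J\Big(\frac{\langle x\rangle}{t}\Big) I_3(t) J\Big(\frac{\langle x\rangle}{t}\Big) = \gamma J\Big(\frac{\langle x\rangle}{t}\Big) M(t) J\Big(\frac{\langle x\rangle}{t}\Big)\gamma + O(t^{-1-\epsilon_1}),
\]
which completes the proof of (iii).

Let us now prove (iv). Set
\[
B_0 = J\Big(\frac{\langle x\rangle}{t}\Big)\Big(\frac{\langle x\rangle}{t} - v\Big) + \mathrm{h.c.}, \qquad B_1 = J_1\Big(\frac{\langle x\rangle}{t}\Big) c^{1/2} J_1\Big(\frac{\langle x\rangle}{t}\Big).
\]
By Lemma 3.3 we have:
\begin{align*}
B_0^2 &= 4\Big(\frac{\langle x\rangle}{t} - v\Big) J^2\Big(\frac{\langle x\rangle}{t}\Big)\Big(\frac{\langle x\rangle}{t} - v\Big) + O(t^{-1}) \\
&\le C\Big(\frac{\langle x\rangle}{t} - v\Big) J_1^4\Big(\frac{\langle x\rangle}{t}\Big)\Big(\frac{\langle x\rangle}{t} - v\Big) + O(t^{-1}) \\
&= C J_1^2\Big(\frac{\langle x\rangle}{t}\Big)\Big(\frac{\langle x\rangle}{t} - v\Big)^2 J_1^2\Big(\frac{\langle x\rangle}{t}\Big) + O(t^{-1}) \\
&= C J_1^2\Big(\frac{\langle x\rangle}{t}\Big)\, c\, J_1^2\Big(\frac{\langle x\rangle}{t}\Big) + O(t^{-\delta}) \\
&= C J_1\Big(\frac{\langle x\rangle}{t}\Big) c^{1/2} J_1^2\Big(\frac{\langle x\rangle}{t}\Big) c^{1/2} J_1\Big(\frac{\langle x\rangle}{t}\Big) + O(t^{-\epsilon_0}) \\
&= C B_1^2 + O(t^{-\epsilon_0}),
\end{align*}
where we used (ii) in the last step. Applying then Heinz theorem we obtain that
\[
|B_0| \le C(B_1^2 + t^{-\epsilon_0})^{1/2} \le C B_1 + C t^{-\epsilon_0/2},
\]
which proves (iv).

To prove (v) we set
\[
B_2 = J\Big(\frac{\langle x\rangle}{t}\Big)\Big(\frac{\langle x\rangle}{t} - v\Big) c^{1/2} J_1\Big(\frac{\langle x\rangle}{t}\Big) + \mathrm{h.c.}
\]
Using (ii) and Lemma 3.3, we get:
\begin{align*}
\pm B_2 &= \pm\Big(\Big(\frac{\langle x\rangle}{t} - v\Big) J J_1\Big(\frac{\langle x\rangle}{t}\Big) c^{1/2} + \mathrm{h.c.}\Big) + O(t^{-1+\delta/2}) \\
&= \pm\Big(c^{1/2}\, c^{-1/2}\Big(\frac{\langle x\rangle}{t} - v\Big) J J_1\Big(\frac{\langle x\rangle}{t}\Big) c^{1/2} + \mathrm{h.c.}\Big) + O(t^{-1+\delta/2}) \\
&\le C c + O(t^{-1+\delta/2}) \\
&\le C\Big(\frac{\langle x\rangle}{t} - v\Big)^2 + O(t^{-\epsilon_0}),
\end{align*}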
since $(\frac{\langle x\rangle}{t} - v)\, c^{-1/2}$ is bounded with norm $O(1)$. Since $B_2 = J_2(\frac{\langle x\rangle}{t}) B_2 J_2(\frac{\langle x\rangle}{t})$ we get
\begin{align*}
\pm B_2 &\le C J_2\Big(\frac{\langle x\rangle}{t}\Big)\Big(\frac{\langle x\rangle}{t} - v\Big)^2 J_2\Big(\frac{\langle x\rangle}{t}\Big) + O(t^{-\epsilon_0}) \\
&= C\Big(\frac{\langle x\rangle}{t} - v\Big) J_2^2\Big(\frac{\langle x\rangle}{t}\Big)\Big(\frac{\langle x\rangle}{t} - v\Big) + O(t^{-\epsilon_0}),
\end{align*}
by Lemma 3.3.

References

[1] W. Amrein, A. Boutet de Monvel and V. Georgescu, $C_0$-Groups, Commutator Methods and Spectral Theory of N-Body Hamiltonians (Birkhäuser, Basel-Boston-Berlin, 1996).
[2] A. Arai, M. Hirokawa and F. Hiroshima, On the absence of eigenvectors of Hamiltonian in a class of massless quantum field model without infrared cutoff, J. Funct. Anal. 168 (1999) 470–497.
[3] A. Boutet de Monvel and V. Georgescu, Graded $C^*$-algebras and many-body perturbation theory II: The Mourre estimate, Astérisque 210 (1992) 75–97.
[4] J. Dereziński and C. Gérard, Spectral and scattering theory of spatially cut-off $P(\varphi)_2$ Hamiltonians, Comm. Math. Phys. 213 (2000) 39–125.
[5] J. Dereziński and C. Gérard, Asymptotic completeness in quantum field theory. Massive Pauli–Fierz Hamiltonians, Rev. Math. Phys. 11 (1999) 383–450.
[6] J. Fröhlich, M. Griesemer and B. Schlein, Rayleigh scattering at atoms with dynamical nuclei, Comm. Math. Phys. 271 (2007) 387–430.
[7] J. Fröhlich, M. Griesemer and I. M. Sigal, Spectral theory for the standard model of non-relativistic QED, preprint; arXiv:math-ph/0611013v2.
[8] V. Georgescu, On the spectral analysis of Quantum Field Hamiltonians, J. Funct. Anal. 245 (2007) 89–143.
[9] V. Georgescu and C. Gérard, On the virial theorem in quantum mechanics, Comm. Math. Phys. 208 (1999) 275–281.
[10] V. Georgescu, C. Gérard and J. Moeller, Spectral theory of massless Nelson models, Comm. Math. Phys. 249 (2004) 29–78.
[11] C. Gérard and F. Nier, Scattering theory for perturbations of periodic Schrödinger operators, J. Math. Kyoto Univ. 38 (1998) 595–634.
[12] C. Gérard and A. Panati, Spectral and scattering theory for the space-cutoff $P(\varphi)_2$ model with variable metric, Ann. Henri Poincaré 9 (2008) 1575–1629.
[13] L. Hörmander, The Analysis of Linear Partial Differential Operators, Vol. 3 (Springer Verlag, Berlin-Heidelberg-New York, 1985).
[14] W. Hunziker, I. M. Sigal and A. Soffer, Minimal escape velocities, Comm. Partial Differential Equations 24 (1999) 2279–2295.
[15] E. Lieb and M. Loss, Existence of atoms and molecules in non-relativistic quantum electrodynamics, Adv. Theor. Math. Phys. 7 (2003) 667–710.
[16] A. Pizzo, One-particle (improper) states and scattering states in Nelson's massless model, Ann. Henri Poincaré 4 (2003) 439–483.
[17] D. Robert, Propriétés spectrales d'opérateurs pseudo-différentiels, Comm. Partial Differential Equations 3 (1978) 755–826.
[18] L. Rosen, The $(\phi^{2n})_2$ Quantum field theory: Higher order estimates, Comm. Pure Appl. Math. 24 (1971) 417–457.
[19] B. Simon and R. Høgh-Krohn, Hypercontractive semigroups and two dimensional self-coupled Bose fields, J. Funct. Anal. 9 (1972) 121–180.
[20] H. Spohn, Dynamics of Charged Particles and Their Radiation Field (Cambridge University Press, Cambridge, 2004).
April 2, 2009 10:28 WSPC/148-RMP
J070-00365
Reviews in Mathematical Physics Vol. 21, No. 3 (2009) 439–457 c World Scientific Publishing Company
REPLICA CONDENSATION AND TREE DECAY
ARTHUR JAFFE Department of Physics, Harvard University, Cambridge, MA 02138, USA Arthur Jaff
[email protected] DAVID MOSER Department of Mathematics, Northeastern University, Boston, MA 02115, USA
[email protected] Received 7 June 2008 Revised 25 December 2008 We give an intuitive method — using local, cyclic replica symmetry — to isolate exponential tree decay in truncated (connected) correlations. We give an expansion and use the symmetry to show that all terms vanish, except those displaying replica condensation. The condensation property ensures exponential tree decay. We illustrate our method in a low-temperature Ising system, but expect that one can use a similar method in other random field and quantum field problems. While considering the illustration, we prove an elementary upper bound on the entropy of random lattice surfaces. Keywords: Replica symmetry; decay of correlations; lattice systems; entropy estimate. Mathematics Subject Classification 2000: 82B05, 82B26, 82B99
1. Introduction

Symmetry is used widely in physics to unify laws or simplify results. Global symmetries often arise and are characterized by Lie groups or their representations acting on a manifold. Some symmetries, such as gauge symmetry, are local; they are characterized by the action of a group on a bundle over a manifold. Global replica symmetry has been introduced as a symmetry of the Hamiltonian of certain interacting systems such as Ising models, random fields, and quantum fields, leading to valuable insights.

In Sec. 3, we study local replica symmetry. This is not a symmetry of the Hamiltonian in general, but it is a symmetry within certain spin configurations. This enables us to simplify our expansion of certain expectations in the low-temperature Ising system in order to exhibit a desired property: exponential tree decay of truncated correlations. This low-temperature expansion only serves to illustrate our method. We plan to investigate the use of our method in other high-temperature and low-temperature situations for random and quantum fields.

Consider the truncated expectations $\langle\sigma_{i_1}\sigma_{i_2}\cdots\sigma_{i_n}\rangle^T$, defined in Sec. 4.1. The Ising spins $\sigma_i$ are maps from the unit lattice $\mathbb{Z}^d$ in $d \ge 2$ dimensions to $\pm 1$. The Hamiltonian is $H = \frac{1}{2}\|\nabla\sigma\|^2$, and the Gibbs factor is $e^{-\beta H}$, where $\beta$ denotes the inverse temperature. Duneau, Iagolnitzer and Souillard [1] and Duneau–Souillard [2] have proved relations between analyticity in the temperature and decay of the truncated expectations $\langle\sigma_{i_1}\sigma_{i_2}\cdots\sigma_{i_n}\rangle^T$ with an exponential rate proportional to the length of the shortest tree connecting the points $i_1,\dots,i_n$. Other authors including Dobrushin and Shlosman have also analyzed these properties. In a recent work, Bertini, Cirillo and Olivieri established a tree decay by another method based on assumed convergent expansions [3]. Convergence results (or related analyticity) can serve as input to these arguments and can be established using cluster expansions for high or low temperature.

Here we reinvestigate the cluster properties of low-temperature connected correlation functions, relating them to symmetry in replicas, which we call multiple colors. In Sec. 7, we show that there are constants $a, b$, such that for $\delta_n = \beta - b - \ln n \ge 1$, the truncated expectations satisfy
\[
|\langle\sigma_{i_1}\sigma_{i_2}\cdots\sigma_{i_n}\rangle^T| \le a\,n^n\,e^{-\delta_n \tau(i_1,\dots,i_n)}, \tag{1.1}
\]
where τ (i1 , . . . , in ) is the length of the minimal tree connecting the n points i1 , . . . , in . Note the condition δn ≥ 1 requires that β ≥ βn , where βn grows at least as fast as O(ln n). It would be of interest to eliminate the n-dependence from the minimum value of β. Our method uses replica variables, comprising n identical, independent copies of the original system; one considers expectations in the replicated system that are product expectations for the individual systems. Replica symmetry is the symmetry of these expectations under a permutation of the copies. For a system in a finite volume Λ, with i1 , . . . , in ∈ Λ, the same estimate holds uniformly in Λ. Our method requires unbroken replica symmetry, so one must impose the same boundary conditions in each replica copy. We develop a low-temperature expansion, based on the intuitive idea that individual terms with less than the desired exponential tree-graph decay sum to zero (vanish) due to symmetry under the local cyclic replica group. In Sec. 7, we define and establish convergence of this expansion. The terms in the expansion are parametrized by replica continents. These replica continents are bounded by random surfaces. The convergence of our expansion relies on an interplay between energy and entropy estimates; in particular, we give entropy estimates bounding the number of random surfaces that occur in our expansion, as well as energy estimates showing that large islands are suppressed at a desired rate. The key to our method is the use of local cyclic replica symmetry, to show that all non-zero terms in our expansion display replica condensation, defined in Sec. 5. By this, we mean that all the lattice sites i1 , . . . , in must live on a single
continent. The size of the boundary of the continent must therefore be larger than $\tau(i_1,\dots,i_n)$; this is the source of the exponential tree decay.

1.1. The Ising model as illustration

The Ising system is the simplest example of a statistical mechanics interaction. We present our method for such a model on a unit cubic lattice $\mathbb{Z}^d$, with $d \ge 2$, although our methods clearly apply in more generality. The Ising Hamiltonian in volume $\Lambda \subset \mathbb{Z}^d$ is
\[
H_\Lambda = H_\Lambda(\sigma) = \frac{1}{2}\|\nabla\sigma\|^2_{\ell_2(\Lambda)} = \sum_{nn\in\Lambda} (1 - \sigma_i\sigma_j), \tag{1.2}
\]
where $\sigma_i$ takes the values $\pm 1$, and $\sum_{nn}$ denotes the sum over nearest-neighbor pairs of sites in the lattice, namely sites with $|i - j| = 1$. The partition function
\[
Z_{\Lambda,\beta} = \sum_{\sigma_i,\ i\in\Lambda} e^{-\beta H_\Lambda(\sigma)} \tag{1.3}
\]
normalizes statistical averages $\langle f\rangle_{\Lambda,\beta}$ of a function $f$,
\[
\langle f\rangle_{\Lambda,\beta} = \frac{1}{Z_{\Lambda,\beta}} \sum_{\sigma_i,\ i\in\Lambda} f(\sigma)\, e^{-\beta H_\Lambda(\sigma)}. \tag{1.4}
\]
Often $f$ is a monomial in spins, $f = \sigma_{i_1}\sigma_{i_2}\cdots\sigma_{i_n}$. The expectation $\langle\,\cdot\,\rangle_{\Lambda,\beta}$ is linear, so one can express the expectation of a general $f$ as a limit of finite linear combinations of expectations of the form $\langle\sigma_{i_1}\sigma_{i_2}\cdots\sigma_{i_n}\rangle_{\Lambda,\beta}$.

2. The Correspondence $\mathbb{Z}^d \leftrightarrow \mathbb{R}^d$

Each subset $X \subset \mathbb{Z}^d$ of sites in the lattice $\mathbb{Z}^d$ can be identified with a subset $X \subset \mathbb{R}^d$. Define the latter as the union of closed, unit $d$-cubes centered at the lattice sites $i \in X$, as we illustrate in the upper part of Fig. 1.

Connectedness: We say that $X \subset \mathbb{Z}^d$ is connected if any two sites in $X$ can be connected by a continuous path through nearest-neighbor lattice sites in the set $X$. This agrees with the notion that the interior of the set $X \subset \mathbb{R}^d$ is connected in the ordinary sense. Two unit cubes are connected if they share a unit $(d-1)$-cube, which we call a face. But they are disconnected if they only touch on a corner of dimension $\le (d-2)$.

Boundary: The boundary $\partial X \subset \mathbb{R}^d$ allows us to define the set $\partial X \subset \mathbb{Z}^d$ of boundary lattice sites. These boundary sites $\partial X \subset \mathbb{Z}^d$ are those lattice sites in $X$ lying in cubes that share a $(d-1)$-face with the boundary $\partial X \subset \mathbb{R}^d$.
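The correspondence just described can be made concrete in a few lines. The following is a minimal Python sketch, not part of the original argument: it represents $X \subset \mathbb{Z}^d$ as a set of integer tuples, counts the $(d-1)$-faces of $\partial X \subset \mathbb{R}^d$, lists the boundary lattice sites, and tests nearest-neighbor connectedness. The function names (`neighbors`, `boundary_area`, `boundary_sites`, `is_connected`) are illustrative choices.

```python
def neighbors(site):
    """Yield the 2d nearest neighbors of a lattice site in Z^d."""
    for axis in range(len(site)):
        for step in (-1, 1):
            yield site[:axis] + (site[axis] + step,) + site[axis + 1:]


def boundary_area(X):
    """Number of unit (d-1)-faces separating a cube of X from a cube outside X."""
    X = set(X)
    return sum(1 for site in X for nb in neighbors(site) if nb not in X)


def boundary_sites(X):
    """Sites of X whose cube shares a (d-1)-face with the boundary of X."""
    X = set(X)
    return {site for site in X if any(nb not in X for nb in neighbors(site))}


def is_connected(X):
    """True if any two sites of X are joined by a nearest-neighbor path inside X."""
    X = set(X)
    if not X:
        return True
    start = next(iter(X))
    seen, stack = {start}, [start]
    while stack:
        site = stack.pop()
        for nb in neighbors(site):
            if nb in X and nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return seen == X
```

For a $3\times 3$ block in $\mathbb{Z}^2$ this gives a boundary of 12 faces and 8 boundary sites, while two sites touching only at a corner are reported as disconnected, in line with the convention above.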
Fig. 1. An example of the correspondence between subsets of $\mathbb{Z}^d$ and $\mathbb{R}^d$, and their boundaries.
By |∂X|, we always refer to the area of the (d − 1)-surface ∂X ⊂ Rd and not the number of points in ∂X ⊂ Zd . In most instances, we will call this area the “length” of the boundary, but in some cases we will also call it the number of faces of the boundary surface. We illustrate the correspondence between the boundary lattice sites and the boundary of regions in Rd in the lower part of Fig. 1. Surface: More generally let a face in Rd denote a (d − 1)-cube; such a cube lies in the boundary of two d-cubes in Rd . A surface Y is a union of (d − 1)-faces, and its area |Y | is the number of (d − 1)-faces in Y . Lattice sites in Y may lie on either side of the surface Y , but could be limited by selecting an orientation to appropriate sets of faces in Y . Connected surface: Define two faces to be adjacent, if they share a (d − 2)-cube. Likewise, define Y to be connected if any two faces in Y can be reached by a continuous path through a sequence of adjacent faces in Y . 3. Replica Variables and Symmetry Choose n ∈ Z+ and consider n independent copies of a statistical-mechanical or quantum-field system; these are called n replicas. One can study the properties of expectations under the group of permutations of the replica variables (the replica
group). The $n$-element subgroup of cyclic permutations of all the copies is abelian, and it provides useful one-dimensional representations of replica symmetry.

3.1. Replica variables

We assume that the different replicas are identical and independent. They are defined on the same lattice, they have the same form of interaction, they are given identical boundary conditions, etc. We label the spin variable at the lattice site $i$ by $\sigma_i^{(\alpha)}$, where $\alpha = 1, 2, \dots, n$ denotes the index of the copy. We also consider the replica spins at site $i$ as a vector $\vec{\sigma}_i$ with the vector components $\sigma_i^{(\alpha)}$.

3.2. The global replica group

The global replica group is the symmetric group $S_n$ comprising elements $\pi \in S_n$ with action,
\[
\pi : (1,\dots,n) \to (\pi_1,\dots,\pi_n). \tag{3.1}
\]
The element $\pi \in S_n$ acts on the spins, giving a unitary representation,
\[
\sigma_i^{(\alpha)} \to (\pi\vec{\sigma}_i)^{(\alpha)} = \sigma_i^{(\pi^{-1}\alpha)}, \quad \text{for } \alpha = 1,\dots,n, \text{ and for all } i. \tag{3.2}
\]
The global cyclic replica group $S_n^c$ is the subgroup of cyclic permutations of $n$ objects, and is generated by the permutation $\pi^0$,
\[
\pi^0 : (1,\dots,n) \to (2,\dots,n,1). \tag{3.3}
\]
Treating the indices $\alpha$ modulo $n$, substitute $\alpha = n$ for $\alpha = 0$ and write
\[
\sigma_i^{(\alpha)} \to (\pi^0\vec{\sigma}_i)^{(\alpha)} = \sigma_i^{(\alpha-1)}, \quad \text{for } \alpha = 1,\dots,n, \text{ and for all } i. \tag{3.4}
\]
The matrix representation of (3.4) is $\vec{\sigma}_i \to \pi^0\vec{\sigma}_i$, where
\[
(\pi^0\vec{\sigma}_i)^{(\alpha)} = \sum_{\alpha'=1}^{n} \pi^0_{\alpha\alpha'}\,\sigma_i^{(\alpha')}, \quad \text{and} \quad \pi^0_{\alpha\alpha'} = \delta_{\alpha-1,\,\alpha'}. \tag{3.5}
\]

3.3. The local cyclic replica group

Let $K$ denote a subset of the lattice $\mathbb{Z}^d$. The local cyclic replica group $S_n^c(K)$ is a bundle over $S_n^c$ defined as the action of $S_n^c$ on the spins in $K$ and the identity on the complement. This group is generated by $\pi_K^0$, which has the representation on spins,
\[
\pi_K^0\vec{\sigma}_i =
\begin{cases}
\pi^0\vec{\sigma}_i, & \text{when } i \in K, \\
\vec{\sigma}_i, & \text{when } i \notin K.
\end{cases} \tag{3.6}
\]
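3.4. Irreducible representations

The cyclic replica group is abelian, so its irreducible representations are one-dimensional. We transform from $\vec{\sigma}_i$ to a set of coordinates $\vec{s}_i = U\vec{\sigma}_i$ to reduce the representation of $S_n^c$, so the matrix $U$ acts as Fourier transform in the replica space. In particular, let $\omega = e^{2\pi i/n}$, and define
\[
s_i^{(\alpha)} = \frac{1}{n^{1/2}} \sum_{\alpha'=1}^{n} \omega^{\alpha(\alpha'-1)}\,\sigma_i^{(\alpha')}, \quad \text{for } \alpha = 1,\dots,n. \tag{3.7}
\]
Note that for $n > 2$ the $s$-variables may be complex, even though the original $\sigma$-spins are real. The entries of the matrix $U$ are $U_{\alpha\alpha'} = n^{-1/2}\omega^{\alpha(\alpha'-1)}$.

Proposition 3.1. The matrix $U$ is unitary with eigenvalues $\omega^{\alpha}$, for $\alpha = 1,\dots,n$. Let $D$ be the diagonal matrix with $D_{\alpha\alpha'} = \omega^{\alpha}\delta_{\alpha\alpha'}$. Then
\[
\pi^0\vec{s}_i = D\vec{s}_i. \tag{3.8}
\]

Proof. For $\nu$ an integer (modulo $n$),
\[
\sum_{\alpha=1}^{n} \omega^{-\nu\alpha} = n\,\delta_{\nu 0}. \tag{3.9}
\]
Thus
\[
(UU^*)_{\alpha\alpha'} = \sum_{\beta=1}^{n} U_{\alpha\beta}\,\overline{U_{\alpha'\beta}} = \frac{1}{n}\sum_{\beta=1}^{n} \omega^{(\alpha-\alpha')(\beta-1)} = \delta_{\alpha\alpha'}. \tag{3.10}
\]
Since $\pi^0$ acts on the $\vec{\sigma}_i$ components according to (3.4), this means that
\[
(\pi^0\vec{s}_i)^{(\alpha)} = \omega^{\alpha}\,s_i^{(\alpha)} = \sum_{\alpha'=1}^{n} D_{\alpha\alpha'}\,s_i^{(\alpha')}, \tag{3.11}
\]
which is (3.8).

The inverse change of coordinates is
\[
\sigma_i^{(\gamma)} = \frac{1}{n^{1/2}} \sum_{\alpha=1}^{n} \omega^{-(\gamma-1)\alpha}\,s_i^{(\alpha)}, \quad \text{for } \gamma = 1,\dots,n. \tag{3.12}
\]
A further corollary of the unitarity of $U$ is the fact that for any $i, j$
\[
\sum_{\alpha=1}^{n} \sigma_i^{(\alpha)}\sigma_j^{(\alpha)} = \langle\vec{\sigma}_i, \vec{\sigma}_j\rangle_{\ell_2} = \langle U\vec{\sigma}_i, U\vec{\sigma}_j\rangle_{\ell_2} = \langle\vec{s}_i, \vec{s}_j\rangle_{\ell_2} = \sum_{\alpha=1}^{n} \overline{s_i^{(\alpha)}}\,s_j^{(\alpha)}. \tag{3.13}
\]
In particular, the expression on the right-hand side of this identity is always real. Furthermore, each individual term on the right is invariant under the elements of the local, cyclic replica group $S_n^c(K)$ as long as both $i, j \in K$ or both $i, j \notin K$.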
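The statements of this subsection lend themselves to a quick numerical check. The following is a minimal sketch in plain Python, under the assumption that a replica vector is stored as a list whose entry `a` holds the component $\alpha = a + 1$; the helper names (`fourier_matrix`, `apply`, `cyclic_shift`, `is_unitary`) are illustrative and do not appear in the text.

```python
import cmath
import math


def fourier_matrix(n):
    """U with entries U[alpha][alpha'] = n^{-1/2} omega^{alpha (alpha' - 1)}, as in (3.7)."""
    omega = cmath.exp(2j * math.pi / n)
    return [[omega ** (a * (ap - 1)) / math.sqrt(n) for ap in range(1, n + 1)]
            for a in range(1, n + 1)]


def apply(U, v):
    """Matrix-vector product U v."""
    return [sum(U[a][b] * v[b] for b in range(len(v))) for a in range(len(v))]


def cyclic_shift(v):
    """Action (3.4) of pi^0: component alpha of the result is component alpha - 1 of v (mod n)."""
    return v[-1:] + v[:-1]


def is_unitary(U, tol=1e-9):
    """Check (3.10): U U^* equals the identity up to rounding."""
    n = len(U)
    for a in range(n):
        for b in range(n):
            entry = sum(U[a][c] * U[b][c].conjugate() for c in range(n))
            if abs(entry - (1 if a == b else 0)) > tol:
                return False
    return True
```

With these helpers one can confirm, for instance, that `apply(U, cyclic_shift(sigma))` equals the entries of `apply(U, sigma)` multiplied by $\omega^{\alpha}$, which is (3.8), and that the constant vector $(+1,\dots,+1)$ is mapped to $(0,\dots,0,n^{1/2})$.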
3.5. Replica boundary conditions

We consider finite volume Hamiltonians that, along with their boundary conditions, have the global replica group as a symmetry. If one wished to investigate the breaking of the symmetry of the replica group in the infinite volume limit, then one might explicitly break replica symmetry in a finite volume by imposing different boundary conditions for different replica copies of the system. In order to simplify the discussion, we impose $+1$ boundary conditions in each replica copy: set
\[
\vec{\sigma}_i = (+1,\dots,+1), \quad \text{when } i \in \partial\Lambda. \tag{3.14}
\]
The resulting boundary conditions for $\vec{s}$ are
\[
\vec{s}_i = (0, 0, \dots, 0, n^{1/2}), \quad \text{when } i \in \partial\Lambda. \tag{3.15}
\]

3.6. Replica symmetry is global, not local

Define the total replica Hamiltonian $H_{\mathrm{replica}}$ as the sum of the Hamiltonians for the replica copies of the Hamiltonian in volume $\Lambda$,
\[
H_{\mathrm{replica}} = H_{\mathrm{replica}}(\vec{\sigma}) = \frac{1}{2}\|\nabla\vec{\sigma}\|^2_{\ell_2(\Lambda)} = \frac{1}{2}\sum_{nn\in\Lambda}\sum_{\alpha=1}^{n} \bigl(\sigma_i^{(\alpha)} - \sigma_j^{(\alpha)}\bigr)^2. \tag{3.16}
\]

Proposition 3.2. Consider the replica Hamiltonian (3.16).
(i) As a function of the variables $\vec{s}$, one has
\[
H_{\mathrm{replica}} = \frac{1}{2}\|\nabla\vec{\sigma}\|^2_{\ell_2(\Lambda)} = \frac{1}{2}\|\nabla\vec{s}\|^2_{\ell_2(\Lambda)} = \frac{1}{2}\sum_{nn\in\Lambda}\sum_{\alpha=1}^{n} \bigl|s_i^{(\alpha)} - s_j^{(\alpha)}\bigr|^2. \tag{3.17}
\]
(ii) The replica Hamiltonian (3.17) is invariant under a global replica permutation $\pi \in S_n$ defined in (3.2), namely
\[
H_{\mathrm{replica}}(\pi\vec{s}) = H_{\mathrm{replica}}(\vec{s}). \tag{3.18}
\]
(iii) In general, the replica Hamiltonian is not invariant under the local cyclic replica group $S_n^c(K)$ defined in (3.6).

Proof. The relation (3.13) shows that $H_{\mathrm{replica}}$ has the form (3.17). The invariance under the global replica group follows by considering the effect on $H_{\mathrm{replica}}$ expressed in the $\vec{\sigma}$ variables, where the transformation permutes the various terms $H_\Lambda(\sigma^{(\alpha)})$ in the first expression for $H_{\mathrm{replica}}$ in (3.16). In order to see that $H_{\mathrm{replica}}(\vec{\sigma})$ is not invariant under the local cyclic replica group, we give a configuration $\vec{\sigma}$ and set $K$ that provides a counterexample in the case $n = 2$. It is easiest to visualize this configuration by illustrating it; see the left
Fig. 2. A counter-example to local cyclic replica symmetry.
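side of Fig. 2. We choose $K$ to be the centermost square in the configuration (with $\sigma^{(1)} = +1$ and $\sigma^{(2)} = -1$), and choose $\pi_K \in S_n^c(K)$ to flip the spins in $K$. The action of $\pi_K$ produces the configuration on the right side of the figure, and it lowers the energy by $4|\partial K|$. In other words, $H_{\mathrm{replica}}(\vec{\sigma}) - H_{\mathrm{replica}}(\pi_K\vec{\sigma}) = 4|\partial K|$, showing that $H_{\mathrm{replica}}$ is not invariant under the action of $S_n^c(K)$.

4. Expectations

Define the expectation $\langle\!\langle\,\cdot\,\rangle\!\rangle_{\Lambda,\beta}$ for the replicated system as follows: for a function $F(\vec{\sigma})$, let
\[
\langle\!\langle F\rangle\!\rangle_{\Lambda,\beta} = \frac{1}{\mathcal{Z}} \sum_{\vec{\sigma}_i,\ i\in\Lambda} F(\vec{\sigma})\,e^{-\beta H_{\mathrm{replica}}(\vec{\sigma})}, \tag{4.1}
\]
where $\mathcal{Z} = Z^n$, with $Z = Z_{\Lambda,\beta}$ given in (1.3). In case that $F(\vec{\sigma}) = f(\sigma^{(\gamma)})$ only depends on one component $\sigma^{(\gamma)}$, the expectation $\langle\!\langle\,\cdot\,\rangle\!\rangle_{\Lambda,\beta}$ reduces to the expectation $\langle\,\cdot\,\rangle_{\Lambda,\beta}$. In this case
\[
\langle\!\langle f(\sigma^{(\gamma)})\rangle\!\rangle_{\Lambda,\beta} = \langle f(\sigma)\rangle_{\Lambda,\beta}, \quad \text{for } \gamma = 1,\dots,n. \tag{4.2}
\]
We now introduce the generating function $S(\mu)$ for expectations of products of spins. Let $\mu$ be a function from $\Lambda$ to $\mathbb{C}$ and let
\[
\sigma(\mu) = \sum_{i\in\Lambda} \mu_i\sigma_i, \quad \text{and correspondingly} \quad \sigma^{(\gamma)}(\mu) = \sum_{i\in\Lambda} \mu_i\sigma_i^{(\gamma)}. \tag{4.3}
\]
Then define
\[
S(\mu) = \langle e^{\sigma(\mu)}\rangle_{\Lambda,\beta} = \langle\!\langle e^{\sigma^{(\gamma)}(\mu)}\rangle\!\rangle_{\Lambda,\beta}. \tag{4.4}
\]
The expectations of $n$ spins are derivatives of the generating function,
\[
\langle\sigma_{i_1}\sigma_{i_2}\cdots\sigma_{i_n}\rangle_{\Lambda,\beta} = \left.\frac{\partial^n}{\partial\mu_{i_1}\partial\mu_{i_2}\cdots\partial\mu_{i_n}} S(\mu)\right|_{\mu_i=0} = \langle\!\langle\sigma_{i_1}^{(\gamma)}\sigma_{i_2}^{(\gamma)}\cdots\sigma_{i_n}^{(\gamma)}\rangle\!\rangle_{\Lambda,\beta}. \tag{4.5}
\]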
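The expectations (4.5) are $n$-multi-linear, symmetric functions of the spins,
\[
\langle\sigma(\mu)^n\rangle_{\Lambda,\beta} = \sum_{i_1,\dots,i_n=1}^{n} \mu_{i_1}\cdots\mu_{i_n}\,\langle\sigma_{i_1}\sigma_{i_2}\cdots\sigma_{i_n}\rangle_{\Lambda,\beta}. \tag{4.6}
\]
One can recover the expectation $\langle\sigma_{i_1}\sigma_{i_2}\cdots\sigma_{i_n}\rangle_{\Lambda,\beta}$ from the expectations of powers of $\sigma(\mu)$ by polarization,
\[
\langle\sigma_{i_1}\sigma_{i_2}\cdots\sigma_{i_n}\rangle_{\Lambda,\beta} = \frac{1}{2^n n!} \sum_{\epsilon_1,\dots,\epsilon_n=\pm 1} \epsilon_1\cdots\epsilon_n\,\bigl\langle(\epsilon_1\sigma_{i_1} + \cdots + \epsilon_n\sigma_{i_n})^n\bigr\rangle_{\Lambda,\beta}. \tag{4.7}
\]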
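The polarization formula rests on an algebraic identity for commuting variables, which the linearity of the expectation then carries over to (4.7). The following short Python sketch checks that identity for plain numbers; the function name `polarization` is an illustrative choice, and exact rational arithmetic is used so that integer inputs give exact results.

```python
from fractions import Fraction
from itertools import product
from math import factorial, prod


def polarization(xs):
    """Recover x_1 * ... * x_n from n-th powers of signed sums, as in (4.7)."""
    n = len(xs)
    total = 0
    for eps in product((1, -1), repeat=n):
        signed_sum = sum(e * x for e, x in zip(eps, xs))
        total += prod(eps) * signed_sum ** n
    return Fraction(total, 2 ** n * factorial(n))
```

Only the monomial in which every variable appears exactly once survives the signed sum over $\epsilon$, with multiplicity $2^n n!$, which is why the prefactor in (4.7) is $1/(2^n n!)$.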
4.1. Truncated expectations

The truncated expectation of a product of $n$ spins is a generalization of the correlation of two spins. The truncated expectation vanishes asymptotically as one translates any subset of the spin locations a large distance away from the others. The generating function of the connected expectations is
\[
G(\mu) = \ln S(\mu) = \ln\langle e^{\sigma(\mu)}\rangle_{\Lambda,\beta}. \tag{4.8}
\]
One defines the truncated (connected) expectations as
\[
\langle\sigma_{i_1}\sigma_{i_2}\sigma_{i_3}\cdots\sigma_{i_n}\rangle^T_{\Lambda,\beta} = \left.\frac{\partial^n}{\partial\mu_{i_1}\partial\mu_{i_2}\cdots\partial\mu_{i_n}} G(\mu)\right|_{\mu_i=0}. \tag{4.9}
\]
A standard representation of $\langle\sigma_{i_1}\sigma_{i_2}\sigma_{i_3}\cdots\sigma_{i_n}\rangle^T_{\Lambda,\beta}$ in terms of sums of products of expectations can be formulated in terms of the set of partitions $\mathcal{P}$ of $\{i_1, i_2, \dots, i_n\}$. Suppose that a set $P \in \mathcal{P}$ has cardinality $|P|$. Then
\[
\langle\sigma_{i_1}\sigma_{i_2}\sigma_{i_3}\cdots\sigma_{i_n}\rangle_{\Lambda,\beta} = \sum_{\mathcal{P}} \prod_{P\in\mathcal{P}} \langle\sigma^P\rangle^T_{\Lambda,\beta}. \tag{4.10}
\]
Like the expectations (4.5), the $n$-truncated expectations satisfy the $n$-multi-linear relations (4.6) and (4.7). Thus
\[
\langle\sigma(\mu)^n\rangle^T_{\Lambda,\beta} = \sum_{i_1,\dots,i_n=1}^{n} \mu_{i_1}\cdots\mu_{i_n}\,\langle\sigma_{i_1}\sigma_{i_2}\sigma_{i_3}\cdots\sigma_{i_n}\rangle^T_{\Lambda,\beta}, \tag{4.11}
\]
and
\[
\langle\sigma_{i_1}\sigma_{i_2}\cdots\sigma_{i_n}\rangle^T_{\Lambda,\beta} = \frac{1}{2^n n!} \sum_{\epsilon_1,\dots,\epsilon_n=\pm 1} \epsilon_1\cdots\epsilon_n\,\bigl\langle(\epsilon_1\sigma_{i_1} + \cdots + \epsilon_n\sigma_{i_n})^n\bigr\rangle^T_{\Lambda,\beta}. \tag{4.12}
\]

4.2. Truncated functions as replica expectations

The form of the replica variables $\vec{s}$ leads to an elementary representation of the truncated (connected) expectations of products of spins. Ultimately, we show that this yields exponential decay at low temperatures with a rate governed by the length of the shortest tree-graph connecting all the spins. (A similar argument presumably works at high temperature.) Our expansion method uses replica symmetry to arrange that each term in the expansion either exhibits the desired decay rate, or else it is canceled by other
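terms as a consequence of local cyclic replica symmetry. We begin by establishing a known representation of the connected correlation of $n$ spins as an expectation of $n$ replica variables introduced above. Cartier [4] lectured on, but did not publish, a replica representation of a product of spins; our presentation is based on Sylvester's treatment of a correlation inequality [5], in which he analyzed $s^{(1)}$. Let gcd denote the greatest common divisor.

Proposition 4.1. Let $\vec{s}$ be defined in (3.7) with $n$ replica copies, and let $\gamma \in \{1,\dots,n\}$ satisfy $\gcd(n,\gamma) = 1$. Then
\[
\langle\sigma_{i_1}\sigma_{i_2}\cdots\sigma_{i_n}\rangle^T_{\Lambda,\beta} = n^{(n-2)/2}\,\langle\!\langle s_{i_1}^{(\gamma)}s_{i_2}^{(\gamma)}\cdots s_{i_n}^{(\gamma)}\rangle\!\rangle_{\Lambda,\beta}. \tag{4.13}
\]

Lemma 4.1. For all $\gamma = 1,\dots,n$,
\[
\langle\!\langle s_{i_1}^{(\gamma)}s_{i_2}^{(\gamma)}\cdots s_{i_n}^{(\gamma)}\rangle\!\rangle^T_{\Lambda,\beta} = n^{-(n-2)/2}\,\langle\sigma_{i_1}\sigma_{i_2}\cdots\sigma_{i_n}\rangle^T_{\Lambda,\beta}. \tag{4.14}
\]

Proof. Using the multi-linearity (4.11), and its analog for the expectations $\langle\,\cdot\,\rangle_{\Lambda,\beta}$ and $\langle\!\langle\,\cdot\,\rangle\!\rangle_{\Lambda,\beta}$ of the truncated functions, we infer that
\[
\begin{aligned}
\langle\!\langle s_{i_1}^{(\gamma)}s_{i_2}^{(\gamma)}\cdots s_{i_n}^{(\gamma)}\rangle\!\rangle^T_{\Lambda,\beta}
&= n^{-n/2}\,\Bigl\langle\!\!\Bigl\langle \sum_{\alpha_1,\dots,\alpha_n=1}^{n} \omega^{\gamma\alpha_1+\cdots+\gamma\alpha_n-\gamma n}\,\sigma_{i_1}^{(\alpha_1)}\sigma_{i_2}^{(\alpha_2)}\cdots\sigma_{i_n}^{(\alpha_n)} \Bigr\rangle\!\!\Bigr\rangle^T_{\Lambda,\beta} \\
&= n^{-n/2} \sum_{\alpha_1,\dots,\alpha_n=1}^{n} \omega^{\gamma\alpha_1+\cdots+\gamma\alpha_n-\gamma n}\,\langle\!\langle\sigma_{i_1}^{(\alpha_1)}\sigma_{i_2}^{(\alpha_2)}\cdots\sigma_{i_n}^{(\alpha_n)}\rangle\!\rangle^T_{\Lambda,\beta}.
\end{aligned} \tag{4.15}
\]
Since the different components of $\vec{\sigma}_i$ are independent, the expectations on the right vanish unless $\alpha_1 = \cdots = \alpha_n$. In this case the phase $\omega^{\gamma n\alpha_1 - \gamma n}$ equals 1, the truncated expectation of each copy equals the truncated expectation of the original spins, and the sum yields $n$ such terms. Therefore (4.14) holds as claimed.

Lemma 4.2. Let $k\gamma \not\equiv 0$ (modulo $n$). Then
\[
\langle\!\langle s_{i_1}^{(\gamma)}s_{i_2}^{(\gamma)}\cdots s_{i_k}^{(\gamma)}\rangle\!\rangle_{\Lambda,\beta} = 0. \tag{4.16}
\]

Proof. Expand the expectation
\[
\begin{aligned}
\langle\!\langle s_{i_1}^{(\gamma)}s_{i_2}^{(\gamma)}\cdots s_{i_k}^{(\gamma)}\rangle\!\rangle_{\Lambda,\beta}
&= \frac{1}{n^{k/2}} \sum_{\alpha_1,\dots,\alpha_k=1}^{n} \omega^{\gamma\alpha_1+\cdots+\gamma\alpha_k-\gamma k}\,\langle\!\langle\sigma_{i_1}^{(\alpha_1)}\sigma_{i_2}^{(\alpha_2)}\cdots\sigma_{i_k}^{(\alpha_k)}\rangle\!\rangle_{\Lambda,\beta} \\
&= \frac{1}{n^{k/2}} \sum_{\alpha_1,\dots,\alpha_k=1}^{n} \omega^{\gamma\alpha_1+\cdots+\gamma\alpha_k-\gamma k}\,\langle\!\langle\sigma_{i_1}^{(\alpha_1-1)}\sigma_{i_2}^{(\alpha_2-1)}\cdots\sigma_{i_k}^{(\alpha_k-1)}\rangle\!\rangle_{\Lambda,\beta}.
\end{aligned} \tag{4.17}
\]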
In the second equality, we use the symmetry of the expectation $\langle\!\langle\,\cdot\,\rangle\!\rangle_{\Lambda,\beta}$ under the global cyclic replica group $S_n^c \ni \pi^0$. Therefore
\[
\langle\!\langle s_{i_1}^{(\gamma)}s_{i_2}^{(\gamma)}\cdots s_{i_k}^{(\gamma)}\rangle\!\rangle_{\Lambda,\beta} = \omega^{\gamma k}\,\langle\!\langle s_{i_1}^{(\gamma)}s_{i_2}^{(\gamma)}\cdots s_{i_k}^{(\gamma)}\rangle\!\rangle_{\Lambda,\beta}. \tag{4.18}
\]
As long as $\gamma k \not\equiv 0$ (modulo $n$), it is the case that $\omega^{\gamma k} \ne 1$. Therefore the expectation must vanish.

Proof of Proposition 4.1. The relation (4.10) also holds for the replica expectations,
\[
\langle\!\langle s_{i_1}^{(\gamma)}s_{i_2}^{(\gamma)}\cdots s_{i_n}^{(\gamma)}\rangle\!\rangle_{\Lambda,\beta} = \sum_{\mathcal{P}} \prod_{P\in\mathcal{P}} \langle\!\langle (s^{(\gamma)})^P\rangle\!\rangle^T_{\Lambda,\beta}. \tag{4.19}
\]
Because $\gcd(n,\gamma) = 1$, it is the case that $k\gamma \not\equiv 0$ (modulo $n$) for all $k = 1,\dots,n-1$. Thus we can apply Lemma 4.2 to each such $k$, and only the partition $\mathcal{P}$ with all $n$ elements in one set survives in (4.19). We infer
\[
\langle\!\langle s_{i_1}^{(\gamma)}s_{i_2}^{(\gamma)}s_{i_3}^{(\gamma)}\cdots s_{i_n}^{(\gamma)}\rangle\!\rangle_{\Lambda,\beta} = \langle\!\langle s_{i_1}^{(\gamma)}s_{i_2}^{(\gamma)}s_{i_3}^{(\gamma)}\cdots s_{i_n}^{(\gamma)}\rangle\!\rangle^T_{\Lambda,\beta}. \tag{4.20}
\]
Using Lemma 4.1 then completes the proof.

5. Replica Condensation

In this section, we investigate certain classes of configurations $\vec{\sigma}$ of the replica spins. We see that for each class of configurations, there is a local cyclic replica group (see Sec. 3.3) under which the Hamiltonian $H_{\mathrm{replica}}$ of (3.16) is invariant. This leads to the phenomenon of replica condensation in which all the spin localizations $i_1,\dots,i_n$ must be localized within a given region $K \subset \Lambda$ that we call a continent.

5.1. Continents

Each configuration of spins $\vec{\sigma}$ in the volume $\Lambda$ defines a sea $S(\vec{\sigma})$, surrounding a set of continents $\mathcal{K}(\vec{\sigma})$. The sea starts at the boundary $\partial\Lambda$ of the region $\Lambda$. The boundary of a continent appears if any one of the components of $\vec{\sigma}$ changes its value. Continents have a substructure arising from the different configurations of the individual components $\sigma^{(\alpha)}$ within the continent. We say more about this substructure when defining replica continent contours in Sec. 6.2. In the following we utilize the notion of "connectedness" introduced in Sec. 2.

Definition 5.1. Consider a configuration $\vec{\sigma}$. The replica sea $S(\vec{\sigma})$ is the connected component of the set $\{i \mid \vec{\sigma}_i = (+1,\dots,+1)\}$ that meets the boundary $\partial\Lambda$ of $\Lambda$. The continents $K_j$ are the connected components of the complementary set, $S^c(\vec{\sigma}) = K_1 \cup \cdots \cup K_r$. The set of continents $\mathcal{K}(\vec{\sigma})$ is
\[
\mathcal{K}(\vec{\sigma}) = \{K_1,\dots,K_r\}. \tag{5.1}
\]
We illustrate this definition in Fig. 3.
Fig. 3. The set of continents $\mathcal{K}(\vec{\sigma}) = \{K_1,\dots,K_5\}$ in the sea $S(\vec{\sigma})$.
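Definition 5.1 is algorithmic in nature, and a small sketch may help to fix ideas. The Python code below is an illustration under stated assumptions rather than part of the proof: a configuration is a dictionary mapping each site of $\Lambda$ (an integer tuple) to its tuple of $n$ replica spins, the boundary sites of $\Lambda$ are taken to be those with a nearest neighbor outside $\Lambda$, and the sea is computed as the union of all-plus components touching that boundary. The names `components` and `sea_and_continents` are hypothetical.

```python
def neighbors(site):
    """Yield the nearest neighbors of a lattice site in Z^d."""
    for axis in range(len(site)):
        for step in (-1, 1):
            yield site[:axis] + (site[axis] + step,) + site[axis + 1:]


def components(sites):
    """Split a set of sites into nearest-neighbor connected components."""
    remaining, result = set(sites), []
    while remaining:
        start = remaining.pop()
        comp, stack = {start}, [start]
        while stack:
            for nb in neighbors(stack.pop()):
                if nb in remaining:
                    remaining.remove(nb)
                    comp.add(nb)
                    stack.append(nb)
        result.append(comp)
    return result


def sea_and_continents(config):
    """Return the replica sea and the list of continents of a configuration."""
    volume = set(config)
    all_plus = {i for i, spins in config.items() if all(s == 1 for s in spins)}
    boundary = {i for i in volume if any(nb not in volume for nb in neighbors(i))}
    sea = set()
    for comp in components(all_plus):
        if comp & boundary:          # all-plus region reaching the boundary
            sea |= comp
    return sea, components(volume - sea)
```

Note that an all-plus "lake" enclosed by other spins does not reach the boundary, so in this sketch it belongs to the surrounding continent rather than to the sea.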
5.2. Local cyclic replica symmetry

In Sec. 3.6, we saw that a global replica symmetry transformation leaves $H_{\mathrm{replica}}(\vec{\sigma})$ invariant, and that a local replica symmetry transformation does not necessarily do so. We now recover local cyclic replica symmetry by choosing the localization $K$ in $S_n^c(K)$ to be a continent.

Proposition 5.1. Let $K \in \mathcal{K}(\vec{\sigma})$. Then the local cyclic replica group $S_n^c(K)$ defined in (3.6) preserves the continent $K$ and the Hamiltonian $H_{\mathrm{replica}}(\vec{\sigma})$. For $\pi_K \in S_n^c(K)$,
\[
H_{\mathrm{replica}}(\vec{\sigma}) = H_{\mathrm{replica}}(\pi_K(\vec{\sigma})). \tag{5.2}
\]

Proof. The action of $S_n^c(K)$ on $\vec{\sigma}$ leaves invariant spins $\vec{\sigma}_i = (+1,\dots,+1)$, so it changes neither the sea $S(\vec{\sigma})$ nor the definition of continents. Hence it also does not change the contribution of nearest neighbor spins to the energy either inside or outside the continent. The local permutation also does not alter the energy across the island boundary, because all the components outside the island have value $+1$ and are invariant under the permutation.

5.3. Symmetry ensures condensation

We now establish the property of condensation. We use the representation (4.13) for the truncated correlation function of $n$ spins. We may choose any $\gamma$ with $\gcd(n,\gamma) = 1$, so for simplicity we consider the case $\gamma = 1$.

Proposition 5.2 (Condensation). In the expectation $\langle\!\langle s_{i_1}^{(1)}\cdots s_{i_n}^{(1)}\rangle\!\rangle_{\Lambda,\beta}$, any configuration $\vec{\sigma}$ giving a non-zero contribution has all the sites $i_1,\dots,i_n \in K$ lying in a single continent $K \in \mathcal{K}(\vec{\sigma})$.

Lemma 5.1. Consider a given configuration $\vec{\sigma}$ and a continent $K \in \mathcal{K}(\vec{\sigma})$ containing at least one but not all the sites $i_1,\dots,i_n$. Let $\pi_K^k$ denote $\pi_K$ applied $k$ times. Then
\[
\begin{aligned}
&\sum_{k=0}^{n-1} \bigl(\pi_K^k s_{i_1}^{(1)}\bigr)\cdots\bigl(\pi_K^k s_{i_n}^{(1)}\bigr)\,e^{-\beta H_{\mathrm{replica}}(\pi_K^k(\vec{\sigma}))} \\
&\qquad = \sum_{k=0}^{n-1} s_{i_1}^{(1)}\bigl(\pi_K^k(\vec{\sigma})\bigr)\cdots s_{i_n}^{(1)}\bigl(\pi_K^k(\vec{\sigma})\bigr)\,e^{-\beta H_{\mathrm{replica}}(\pi_K^k(\vec{\sigma}))} = 0.
\end{aligned} \tag{5.3}
\]
Proof. From Proposition 5.1, we infer that the energy in the permuted configuration is unchanged by the permutation,
\[
H_{\mathrm{replica}}(\pi_K^k(\vec{\sigma})) = H_{\mathrm{replica}}(\vec{\sigma}). \tag{5.4}
\]
Therefore, we only need consider the changes to the spins $s_{i_k}^{(1)}$. Let $l = |\{k \mid i_k \in K\}|$ denote the number of sites $i_1,\dots,i_n$ that lie in $K$; clearly $1 \le l < n$. According to Proposition 3.1, the application of $\pi_K$ to $s_i^{(1)}$ gives a phase $\omega$ for $i \in K$. The sum equals
\[
\sum_{k=0}^{n-1} \omega^{kl}\,s_{i_1}^{(1)}s_{i_2}^{(1)}\cdots s_{i_n}^{(1)}\,e^{-\beta H_{\mathrm{replica}}(\vec{\sigma})} = s_{i_1}^{(1)}s_{i_2}^{(1)}\cdots s_{i_n}^{(1)}\,e^{-\beta H_{\mathrm{replica}}(\vec{\sigma})} \sum_{k=0}^{n-1} \omega^{kl} = 0. \tag{5.5}
\]
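The cancellation in (5.5) can be observed directly on a toy example. The sketch below, in plain Python, is only an illustration: it drops the Boltzmann factor, which by (5.4) is the same for every $k$, stores each replica vector as a tuple whose entry `a` is the component $\alpha = a + 1$, and takes the number of sites equal to the number of replicas $n$. The names `s1`, `cyclic_shift` and `orbit_sum` are hypothetical.

```python
import cmath
import math


def s1(vec):
    """The coordinate s^{(1)} of (3.7) for one replica spin vector."""
    n = len(vec)
    omega = cmath.exp(2j * math.pi / n)
    return sum(omega ** a * vec[a] for a in range(n)) / math.sqrt(n)


def cyclic_shift(vec):
    """Action (3.4) of pi^0 on one replica spin vector."""
    return vec[-1:] + vec[:-1]


def orbit_sum(spins, in_K):
    """Sum over k = 0..n-1 of the product of s^{(1)} after k local shifts on K."""
    n = len(spins)
    current = [tuple(v) for v in spins]
    total = 0
    for _ in range(n):
        term = 1
        for vec in current:
            term *= s1(vec)
        total += term
        # apply pi_K once more: shift only the spins at sites inside K
        current = [cyclic_shift(v) if inside else v
                   for v, inside in zip(current, in_K)]
    return total
```

When some but not all sites lie in $K$ the orbit sum vanishes up to rounding, whereas for all sites in $K$ it equals $n$ times the original product, which is the mechanism behind condensation.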
Proof of Proposition 5.2. The expectation is
\[
\langle\!\langle s_{i_1}^{(1)}\cdots s_{i_n}^{(1)}\rangle\!\rangle_{\Lambda,\beta} = \sum_{\vec{\sigma}} s_{i_1}^{(1)}\cdots s_{i_n}^{(1)}\,e^{-\beta H_{\mathrm{replica}}(\vec{\sigma})}/\mathcal{Z}. \tag{5.6}
\]
If $\vec{\sigma}$ is a configuration where some site $i_k$ lies in the sea, $i_k \in S(\vec{\sigma})$, then the spin has the value of the boundary, $s_{i_k}^{(1)} = 0$. We also have $s_{i_k}^{(1)} = 0$ if $i_k \in K$ and all the $\sigma^{(\alpha)}$ take the same values on $K$. Therefore, the only contributing configurations have all the sites $i_k$ lying in continents where $\pi_K$ actually yields new configurations. In this case, the sum in Lemma 5.1 is a sub-sum of (5.6). According to the lemma the sum is only non-zero if all or none of the $i_k$ lie in the continent $K$.

6. Contours and the Energy

6.1. Contours for vector spins $\vec{\sigma}$

For each component $\sigma^{(\alpha)}$ of the vector spin, we can define contours in the usual statistical mechanics sense. These contours are the boundaries between islands with different values of $\sigma^{(\alpha)}$, as defined in Sec. 2. We label the contours for different components by different colors. The contours in $\vec{\sigma}$ are just vectors of
Fig. 4. An illustration of contours and continents in the case $n = 2$: (a) the contours of $\sigma^{(1)}$; (b) the contours of $\sigma^{(2)}$; (c) the contours of $\vec{\sigma}$; (d) the set of continents $\mathcal{K}(\vec{\sigma}) = \{K_1, K_2\}$; (e) the continent contours $\vec{C}(K_1, \vec{\sigma})$.
contours in each component. We illustrate these contours in the case $n = 2$ in Figs. 4(a)–4(c).

6.2. Replica continent contours

We need to estimate $\Pr(r)$, the probability for the occurrence of a continent of length $r$. In order to obtain this bound, compare configurations $\vec{\sigma}$ containing the replica continent contour $\vec{C}(K, \vec{\sigma})$ to configurations $\vec{\sigma}^*$ with the contour removed. One defines $\vec{C}$ as follows:

Definition 6.1. For $K \in \mathcal{K}(\vec{\sigma})$ define the replica continent contour of $K$ in the configuration $\vec{\sigma}$ as the vector $\vec{C}(K, \vec{\sigma})$ with components
\[
C^{(\alpha)}(K, \vec{\sigma}) = \text{union of contours } C \text{ for } \sigma^{(\alpha)} \text{ with } |C \cap \partial K| \ne 0, \tag{6.1}
\]
where $|\cdot|$ is the measure of $(d-1)$-surfaces. This is the subset of contours for $\vec{\sigma}$ meeting the boundary of the continent $\partial K$.

We illustrate the replica continents and replica continent contours in Figs. 4(d) and 4(e). Several different configurations of the spin $\vec{\sigma}$ may have different contours, but a common continent $K$. Define the set of possible contours for the continent $K$ as
\[
\mathcal{C}(K) = \{\vec{C}(K, \vec{\sigma}) \mid \text{where } K \in \mathcal{K}(\vec{\sigma})\}. \tag{6.2}
\]
Finally, the length of any contour $\vec{C} \in \mathcal{C}(K)$ is just the sum over the length of the constituent contours,
\[
|\vec{C}| = \sum_{\alpha=1}^{n} |C^{(\alpha)}|. \tag{6.3}
\]
With these definitions it is obvious that removing $\vec{C}(K, \vec{\sigma})$ in the configuration $\vec{\sigma}$ is well-defined. We just remove the respective contours $C^{(\alpha)}(K, \vec{\sigma})$ for the components $\sigma^{(\alpha)}$, by flipping the sign of all the spins inside these contours.

Definition 6.2. For a configuration $\vec{\sigma}$ and a continent $K \in \mathcal{K}(\vec{\sigma})$, write $\vec{\sigma}^*$ for the configuration where the contour $\vec{C}(K, \vec{\sigma})$ for the continent has been removed as described above.

As a consequence of the removal of the replica continent contour the energy $H_{\mathrm{replica}}$ is decreased by two times the length of the removed contours. This is the generalization of the fact that for each component spin, the energy is given by two times the total length of the contours,
\[
H_{\mathrm{replica}}(\vec{\sigma}^*) = H_{\mathrm{replica}}(\vec{\sigma}) - 2|\vec{C}(K, \vec{\sigma})|. \tag{6.4}
\]
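The energy bookkeeping in (6.4) can be illustrated in the simplest possible situation. The Python sketch below assumes $n = 2$ replicas on a $6\times 6$ box in $\mathbb{Z}^2$, with one rectangular block of minus spins in the first replica and an all-plus second replica, so that the replica continent contour is just the perimeter of the block. The helper names (`nn_pairs`, `contour_length`, `energy`, `flip`) are illustrative, and the code does not attempt to construct contours in general.

```python
def nn_pairs(sites):
    """Yield each nearest-neighbor pair of sites inside the volume once."""
    sites = set(sites)
    for site in sites:
        for axis in range(len(site)):
            nb = site[:axis] + (site[axis] + 1,) + site[axis + 1:]
            if nb in sites:
                yield site, nb


def contour_length(component):
    """Number of faces separating unequal spins of one replica component."""
    return sum(1 for i, j in nn_pairs(component) if component[i] != component[j])


def energy(component):
    """Ising energy (1.2): sum over nearest-neighbor pairs of (1 - s_i s_j)."""
    return sum(1 - component[i] * component[j] for i, j in nn_pairs(component))


def replica_energy(config):
    """Replica Hamiltonian (3.16) as the sum of the component energies."""
    return sum(energy(component) for component in config)


def flip(component, region):
    """Flip the spins of one component inside a region (contour removal)."""
    return {i: (-s if i in region else s) for i, s in component.items()}


box = [(x, y) for x in range(6) for y in range(6)]
island = {(x, y) for x in (2, 3) for y in (2, 3, 4)}
replica_1 = {i: (-1 if i in island else 1) for i in box}
replica_2 = {i: 1 for i in box}

before = replica_energy([replica_1, replica_2])
after = replica_energy([flip(replica_1, island), replica_2])
removed_length = contour_length(replica_1) + contour_length(replica_2)
```

Here the block has perimeter 10, the energy before removal is 20, and flipping the block lowers the energy by exactly twice the removed contour length.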
7. Tree Decay

In this section we prove the decay bound for the truncated correlation functions. We base the proof on condensation. Starting from the representation (4.13), namely
\[
\langle\!\langle s_{i_1}^{(1)}\cdots s_{i_n}^{(1)}\rangle\!\rangle_{\Lambda,\beta} = n^{-(n-2)/2}\,\langle\sigma_{i_1}\cdots\sigma_{i_n}\rangle^T_{\Lambda,\beta}, \tag{7.1}
\]
we use the fact established in Proposition 5.2 that every non-vanishing contribution contains a continent $K$ with all the points $i_1,\dots,i_n$.

Proposition 7.1. There are constants $a, b$ depending on $d$, but independent of $\Lambda$, such that if $1 \le \delta = \beta - b - \ln n$ (hence requiring $\beta \ge \beta_n = O(\ln n)$), then the truncated correlation functions satisfy
\[
|\langle\sigma_{i_1}\cdots\sigma_{i_n}\rangle^T_{\Lambda,\beta}| \le a\,n^n\,e^{-\delta\tau(i_1,\dots,i_n)}. \tag{7.2}
\]
Here $\tau(i_1,\dots,i_n)$ is the length of the shortest tree connecting $i_1,\dots,i_n$.

7.1. Outline of the proof

We have shown in Proposition 5.2 that each non-vanishing contribution to the expectation (7.1) contains a condensate continent $K$ containing all the points $i_1,\dots,i_n$. As a consequence, every possible replica contour $\vec{C} \in \mathcal{C}(K)$ has minimal length $\tau(i_1,\dots,i_n)$. We formulate the sum over configurations
\[
\langle\sigma_{i_1}\cdots\sigma_{i_n}\rangle^T_{\Lambda,\beta} = n^{(n-2)/2}\,\frac{1}{\mathcal{Z}} \sum_{\vec{\sigma}} s_{i_1}^{(1)}\cdots s_{i_n}^{(1)}\,e^{-\beta H_{\mathrm{replica}}(\vec{\sigma})}, \tag{7.3}
\]
as a sum over configurations with contours $\vec{C}$ of length $r$ and a sum over $r$. We claim that the probability $\Pr(r)$ that a replica contour $\vec{C}$ occurs with $|\vec{C}| = r$ satisfies the bound
\[
\Pr(r) \le e^{-\beta|\vec{C}|} = e^{-\beta r}. \tag{7.4}
\]
To complete the proof we use a bound on the number of random, connected contours of length $r$, along with an estimate on the number of configurations that contain a given contour $\vec{C}$. These estimates, together with the fact that $|s_i^{(1)}| \le n^{1/2}$, yield the desired bound. We now break the proof into a sequence of elementary steps.

7.2. Details of the proof

Rewrite the sum. Consider the sum (7.3), with the restriction of Proposition 5.2. Recall that the replica continent borders $\vec{C} = \vec{C}(K, \vec{\sigma})$, and the set $\mathcal{C}(K) \ni \vec{C}(K, \vec{\sigma})$ of configurations containing such a replica continent, is given in Definition 6.1. One can rewrite the summation appearing in (7.3) as an iterated sum
\[
\sum_{r=\tau(i_1,\dots,i_n)}^{\infty}\ \sum_{K,\vec{C}}\ \sum_{\vec{\sigma}}{}'. \tag{7.5}
\]
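For fixed $K$ and $\vec{C}$, the sum $\sum'_{\vec{\sigma}}$ denotes the sum over configurations containing the continent $K \in \mathcal{K}(\vec{\sigma})$ with the continent border $\vec{C} = \vec{C}(K, \vec{\sigma})$,
\[
\sum_{\vec{\sigma}}{}' = \sum_{\substack{\vec{\sigma} \text{ with } K\in\mathcal{K}(\vec{\sigma}), \\ \vec{C} = \vec{C}(K,\vec{\sigma})}}. \tag{7.6}
\]
The sum $\sum_{K,\vec{C}}$ ranges over the possible continents $K$ containing the $n$ sites $i_1,\dots,i_n$, and their possible borders $\vec{C}$ of length $|\vec{C}| = r$. Thus
\[
\sum_{K,\vec{C}} = \sum_{K\supset\{i_1,\dots,i_n\}}\ \sum_{\substack{\vec{C}\in\mathcal{C}(K) \\ \text{with } |\vec{C}| = r}}. \tag{7.7}
\]
Finally we sum over $r$, which is bounded from below by the minimal size $\tau(i_1,\dots,i_n)$.

One interprets the sum $\sum'_{\vec{\sigma}}$ as the energy contribution to the sum, namely the probability
\[
\Pr(r) = \frac{1}{\mathcal{Z}} \sum_{\vec{\sigma}}{}'\, e^{-\beta H(\vec{\sigma})}, \tag{7.8}
\]
for the states $\vec{\sigma}$ with $K \in \mathcal{K}(\vec{\sigma})$. Likewise, one interprets the sum $\sum_{K,\vec{C}}$ as the entropy contribution to the sum. Define the entropy factor $N(r)$ by
\[
N(r) = \sum_{K,\vec{C}} 1. \tag{7.9}
\]
The entropy counts the number of different shapes for $\vec{C}$.

Using $|\sigma_i^{(\alpha)}| = 1$, one has $|s_i^{(1)}| \le n^{1/2}$. Thus we obtain the bound
\[
\begin{aligned}
|\langle\sigma_{i_1}\cdots\sigma_{i_n}\rangle^T_{\Lambda,\beta}|
&= n^{(n-2)/2}\,\bigl|\langle\!\langle s_{i_1}^{(1)}\cdots s_{i_n}^{(1)}\rangle\!\rangle_{\Lambda,\beta}\bigr| \\
&\le n^{(n-2)/2} \sum_{r=\tau}^{\infty}\ \sum_{K,\vec{C}}\ \frac{1}{\mathcal{Z}} \sum_{\vec{\sigma}}{}'\, |s_{i_1}^{(1)}\cdots s_{i_n}^{(1)}|\,e^{-\beta H(\vec{\sigma})} \\
&\le n^{(n-2)/2} \sum_{r=\tau}^{\infty} n^{n/2}\,N(r)\Pr(r) \\
&= n^{n-1} \sum_{r=\tau}^{\infty} N(r)\Pr(r).
\end{aligned} \tag{7.10}
\]
In the following, we prove bounds on $\Pr(r)$ and on $N(r)$ that depend only on $r$, on $\beta$, and on the dimension $d$.

Bound the entropy. We show that there are constants $A, B$ depending only on $d$ such that $N(r)$ satisfies the exponential bound,
\[
N(r) \le A B^r n^r. \tag{7.11}
\]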
We obtain this result by constructing the border contour ∂K and attaching satisfying the l colored sub-contours. In this way one constructs any possible C conditions above. The geometry of the contour (which must surround i1 ) requires must lie in a cube of side-length that the starting face we choose in constructing C (r − 1), centered at i1 . Such a cube contains at most drd possible starting faces. We estimate the number of possible continent boundary configurations ∂K using a bound on the number of random connected contours of length r. Such estimates have been derived in various contexts in references [6–8], while we use a recent improvement [9]. This states that the number of contours of length containing a fixed face is bounded by where kd = (9d)2/d .
kdr ,
(7.12)
We now construct the full contour C by attaching at least one and at most r subcontours to ∂K to obtain the total number of faces r. This can be done in a number of ways. For l sub-contours, the number of ways is bounded by the product of combinatorial factors:
• $r^{l}$ for the starting faces on ∂K,
• $k_d^{\,r}$ for the shapes,
• $n^{l}$ for the colors,
• $\binom{r-1}{l-1}$ for the lengths,
• $1/l!$ as the ordering of the subcontours is irrelevant.
Therefore
\[ N(r) \le d\,r^{d} k_d^{\,r} \sum_{l=1}^{r} r^{l} k_d^{\,r} n^{l} \binom{r-1}{l-1} \frac{1}{l!}. \tag{7.13} \]
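We use the elementary inequalities
\[ r^{d} \le d!\,e^{r} \quad\text{and}\quad \binom{r-1}{l-1} \le 2^{r}. \tag{7.14} \]
Then
\[ N(r) \le d\,r^{d} k_d^{\,2r} n^{r} \sum_{l=1}^{r} \frac{r^{l}}{l!} \binom{r-1}{l-1} \le d\,d!\,\bigl(2e^{2} k_d^{2}\, n\bigr)^{r}. \tag{7.15} \]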
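The chain of estimates from (7.13) to (7.15) can be spot-checked numerically. The following is a minimal sketch, not part of the original argument: the function names are illustrative, and the ranges of r, d and n are arbitrary small test values. It evaluates the right-hand side of (7.13) directly and compares it with the closed form (7.15); the comparison also relies on the standard facts $n^{l} \le n^{r}$ and $\sum_{l} r^{l}/l! \le e^{r}$, which are assumptions about how the intermediate step is meant to be carried out.

```python
from math import comb, exp, factorial


def k_d(d):
    """Contour-counting constant k_d = (9d)^(2/d) quoted in (7.12)."""
    return (9 * d) ** (2 / d)


def entropy_sum(r, d, n):
    """Right-hand side of (7.13), evaluated term by term."""
    k = k_d(d)
    inner = sum(
        r**l * k**r * n**l * comb(r - 1, l - 1) / factorial(l)
        for l in range(1, r + 1)
    )
    return d * r**d * k**r * inner


def entropy_bound(r, d, n):
    """Closed-form bound (7.15): d * d! * (2 e^2 k_d^2 n)^r."""
    return d * factorial(d) * (2 * exp(2) * k_d(d) ** 2 * n) ** r


def inequalities_hold(r, d, n):
    """Check both inequalities in (7.14) and the resulting bound (7.15)."""
    power_ok = r**d <= factorial(d) * exp(r)
    binom_ok = all(comb(r - 1, l - 1) <= 2**r for l in range(1, r + 1))
    bound_ok = entropy_sum(r, d, n) <= entropy_bound(r, d, n)
    return power_ok and binom_ok and bound_ok
```

A call such as `inequalities_hold(6, 3, 2)` checks a single triple (r, d, n); a numerical check of this kind illustrates the bound but is of course no substitute for the proof.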
This bound has the form (7.11) with $A = d\,d!$ and $B = 2e^{2}k_d^{2}$.

Bound the energy factor. The energy bound has the form
\[ \Pr(r) \le e^{-\beta r}, \tag{7.16} \]
where K (implicitly contained in C) is any fixed connected set with $\{i_1, \dots, i_n\} \subset K$ and C ∈ C(K) is any fixed extended border with |C| = r. The idea is to compare every summand in the numerator to a summand in the denominator. For any given σ with K ∈ K(σ) and C(K, σ) = C, we can take away the contours in C, obtaining
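the unique σ* as described in Definition 6.2. Because of the difference in energy, this gives an additional factor $e^{-\beta r}$ for the term in the numerator. As the procedure works for all the summands, we infer
\[
\Pr(r) = \frac{\sum^{K,C}_{\sigma} e^{-\beta H(\sigma)}}{\sum_{\sigma} e^{-\beta H(\sigma)}}
\le \frac{\sum^{K,C}_{\sigma} e^{-\beta H(\sigma^{*})}\, e^{-\beta r}}{\sum^{K,C}_{\sigma} e^{-\beta H(\sigma^{*})}}
= e^{-\beta r},
\tag{7.17}
\]
where $\sum^{K,C}_{\sigma}$ denotes the sum over the states σ with K ∈ K(σ) and C(K, σ) = C.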
Tree decay. The bound (7.2) now follows. Using (7.10), one has
\[
\langle \sigma_{i_1} \cdots \sigma_{i_n} \rangle^{T}_{\Lambda,\beta}
\le n^{n-1} \sum_{r=\tau}^{\infty} N(r) \Pr(r)
\le n^{n-1} \sum_{r=\tau}^{\infty} A B^{r} n^{r} e^{-\beta r}
= n^{n-1} A \sum_{r=\tau}^{\infty} e^{-(\beta - b - \ln n) r},
\tag{7.18}
\]
where b = ln B and where $\tau = \tau(i_1, \dots, i_n)$. The last sum converges for β > b + ln n. With $1 \le \delta_n = \beta - b - \ln n$, this gives
\[
\langle \sigma_{i_1} \cdots \sigma_{i_n} \rangle^{T}_{\Lambda,\beta}
\le n^{n-1} A\, (1 - e^{-\delta_n})^{-1} e^{-\delta_n \tau}.
\tag{7.19}
\]
For β sufficiently large (depending on n), the condition is valid and the geometric sum converges. This completes the proof of Proposition 7.1.

Acknowledgment

The authors thank an anonymous donor, whose gift enabled this collaboration.

References

[1] M. Duneau, D. Iagolnitzer and B. Souillard, Decrease properties of truncated correlation functions and analyticity properties for classical lattices and continuous systems, Comm. Math. Phys. 31 (1973) 191–208.
[2] M. Duneau and B. Souillard, Cluster properties of lattice and continuous systems, Comm. Math. Phys. 47 (1976) 155–166.
[3] L. Bertini, E. N. M. Cirillo and E. Olivieri, A combinatorial proof of tree decay of semi-invariants, J. Stat. Phys. 115 (2004) 395–413.
[4] P. Cartier, unpublished lecture (1974).
[5] G. Sylvester, Representations and inequalities for Ising model Ursell functions, Comm. Math. Phys. 42 (1975) 209–220.
[6] W. Holsztynski and J. Slawny, Phase transitions in ferromagnetic spin systems at low temperatures, Comm. Math. Phys. 66 (1979) 147–166.
[7] Ya. G. Sinai, Theory of Phase Transitions: Rigorous Results (Pergamon Press, London, 1982).
[8] J. L. Lebowitz and A. E. Mazel, Improved Peierls argument for high-dimensional Ising models, J. Stat. Phys. 90 (1998) 1051–1059.
[9] P. N. Balister and B. Bollobás, Counting regions with bounded surface area, Comm. Math. Phys. 273 (2007) 305–315.
May 12, 2009 13:21 WSPC/148-RMP
J070-00366
Reviews in Mathematical Physics Vol. 21, No. 4 (2009) 459–510 © World Scientific Publishing Company
EFFECTIVE DYNAMICS FOR SOLITONS IN THE NONLINEAR KLEIN–GORDON–MAXWELL SYSTEM AND THE LORENTZ FORCE LAW
EAMONN LONG and DAVID STUART∗
Centre for Mathematical Sciences, Wilberforce Road, Cambridge, CB3 0WA, UK
∗[email protected]

Received 11 August 2008
Revised 11 February 2008

We consider the nonlinear Klein–Gordon–Maxwell system derived from the Lagrangian
\[ \int \Bigl( -\tfrac14 F_{\mu\nu} F^{\mu\nu} + \tfrac12 \bigl\langle (\partial - ieA)_\mu \phi, (\partial - ieA)^\mu \phi \bigr\rangle - V(\phi) - e A_\mu J_B^{\mu} \Bigr) \]
on four-dimensional Minkowski space-time, where φ is a complex scalar field and $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ is the electromagnetic field. For appropriate nonlinear potentials V, the system admits soliton solutions which are gauge invariant generalizations of the non-topological solitons introduced and studied by Lee and collaborators for pure complex scalar fields. In this article, we develop a rigorous dynamical perturbation theory for these solitons in the small e limit, where e is the electromagnetic coupling constant. The main theorems assert the long time stability of the solitons with respect to perturbation by an external electromagnetic field produced by the background current $J_B$, and compute their effective dynamics to O(e). The effective dynamical equation is the equation of motion for a relativistic particle acted on by the Lorentz force law familiar from classical electrodynamics. The theorems are valid in a scaling regime in which the external electromagnetic fields are O(1), but vary slowly over space-time scales of O(1/δ), and $\delta = e^{1-k}$ for $k \in (0, \tfrac12)$ as e → 0. We work entirely in the energy norm, and the approximation is controlled in this norm for times of O(1/e).

Keywords: Soliton; Maxwell; nonlinear Klein–Gordon; solitary wave; effective equation; Lorentz force.

Mathematics Subject Classification 2000: 35Q51, 35Q60, 35Q75, 37K40
Contents

1. Statement of Results
   1.1. Introduction
   1.2. The external electromagnetic field and scaling
   1.3. Non-topological solitons
   1.4. The main theorems
2. Stability: Proof of Theorem 10
   2.1. Beginning of proof of Theorem 10
   2.2. Results from modulation theory
   2.3. The main growth estimate
   2.4. Completion of the proof of Theorem 10
3. Modulation Theory
   3.1. Preparation of the initial data
   3.2. Modulation equations and constraints
   3.3. A bound for $\dot\lambda$
4. The Lorentz Force Law: Proof of Theorem 12
5. Proof of the Main Growth Estimate
   5.1. Proof of Theorem 16, assuming Lemma 23
   5.2. Proof of Lemma 23
Appendix
   A.1. Further properties of the solitons
   A.2. Some estimates

1. Statement of Results

1.1. Introduction

In this article, we are interested in the effective dynamics of a class of solitary wave, or soliton, solutions to the nonlinear Klein–Gordon–Maxwell (nl-KGM) equations, in the presence of an external electromagnetic field. In this introduction we start by writing down the equations and giving a heuristic statement of, and motivation for, our results in Secs. 1.1.3 and 1.1.4. Then, in Secs. 1.2 and 1.3, we provide the necessary background for a precise formulation of the main results, Theorems 10 and 12, which appear in Sec. 1.4. These theorems are proved in the subsequent sections; a list of notation appears in Sec. 1.1.5 to facilitate reading of the article.

1.1.1. The equations

We study the following system of equations, called the nonlinear Klein–Gordon–Maxwell system, or (nl-KGM) system, which describes the interaction of a complex scalar field φ with an electromagnetic field $F_{\mu\nu}$ in the presence of an external space-time current $J_B$:
\[
\partial^{\mu} F_{\mu\nu} = e\langle i\phi, D_\nu \phi\rangle + e J_{B,\nu}, \qquad
D_\mu D^{\mu} \phi + V'(\phi) = 0.
\tag{1}
\]
Here φ is a complex function on Minkowski space-time $\mathbb{R}^{1+3}$, and $D_\mu = \partial_\mu - ieA_\mu$ is the covariant derivative associated to an electromagnetic potential $A_\mu\,dx^\mu = A_0\,dt + A_j\,dx^j$ with associated field $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. (The operator D determines an $S^1$ connection over $\mathbb{R}^{1+3}$ whose curvature is −iF.) We use standard relativistic notation
in which $\{x^\mu\}_{\mu=0}^{3}$ are coordinates, with Greek indices running over {0, 1, 2, 3}, $x^0 = t$ is the time coordinate, and $\{x^j\}_{j=1}^{3}$ are space coordinates with Latin indices running over {1, 2, 3}; the Minkowski metric is
\[ \eta_{\mu\nu}\,dx^\mu dx^\nu = dt^2 - (dx^1)^2 - (dx^2)^2 - (dx^3)^2, \]
and is used to raise/lower indices in the usual way. When the spatial part of a space-time vector or 1-form is considered separately, bold face will often be used for clarity, e.g. $\mathbf{x} = (x^1, x^2, x^3)$. We refer to e as the (electromagnetic) coupling constant: for the purposes of this article it is a small positive parameter. The current four-vector is of the form
\[ J_B = J_B^{\nu}\,\partial_\nu = \rho_B\,\partial_t + j_B^{k}\,\partial_k \]
and is conserved, i.e. $\partial_t \rho_B + \operatorname{div} \mathbf{j}_B = 0$. The quantity $\rho_B$ is called the (background) charge density, while $\mathbf{j}_B$ is referred to as the (background spatial) current density.

Throughout the paper we make the following hypotheses on the nonlinear potential function V:

(H1) Phase invariance: There exists $G : \mathbb{R} \to \mathbb{R}$ such that V(φ) = G(|φ|).
(H2) Positive mass: $V(\phi) = \frac{m^2}{2}|\phi|^2 + V_1(\phi)$, where m > 0 and $V_1(\phi) = -U(|\phi|)$ is smooth with $U(0) = U'(0) = U''(0) = 0$.
(H3) Sub-criticality: The third derivative $D^{(3)}V_1 = V_1'''$ satisfies a growth condition $|V_1'''(\phi)| \le c(1 + |\phi|^{p-3})$, for some p ∈ (3, 6).

The significance of 6 is that it is the critical Sobolev exponent for the embedding $H^1(\mathbb{R}^3) \hookrightarrow L^p(\mathbb{R}^3)$. The function V is subject to a number of additional more specialized hypotheses, which we detail in Sec. 1.3.2, in particular to ensure existence and uniqueness of soliton solutions with the properties described in Sec. 1.3.

1.1.2. Solitons

The research in this paper is built upon the existence results for solitons in semilinear wave equations given in [3, 4, 22]. These solitons are time-periodic solutions of the nonlinear Klein–Gordon equation
\[ \partial_\mu \partial^{\mu} \phi + V'(\phi) = 0, \]
which is obtained by putting e = 0 in (1) (i.e. when there is no electromagnetic coupling), and are of the form $\phi(t, \mathbf{x}) = e^{i\omega t} f_\omega(\mathbf{x})$. Lee emphasized that solutions of this type, which he called non-topological solitons, provide a way of circumventing the Derrick–Pohozaev non-existence results on static solitons in scalar field theories; see [15, Chap. 7] for a discussion of their properties from the physical point of view.
It is proved in the references [3, 4, 22] that, for certain potentials V, solutions of this form exist with $f_\omega$ positive and radial. Under further conditions these solutions are also known to be essentially unique ([18]) and dynamically stable ([10, 24]); see Sec. 1.3 and the Appendices for further details. For non-zero values of the coupling constant e, solutions to (1) of this type have been constructed directly in [2, 1], using a spherically symmetric ansatz, and perturbatively in [16, 17] for small e, using the e = 0 case as a starting point. For small e it is possible to use the information on stability for e = 0 from [24] to prove modulational stability of the solitons and their Lorentz boosts; see Sec. 1.3.4 and [17] for details. Much of the same information for the e = 0 case will also be used in the present article to study the stability of the solitons when subjected to external (background) electromagnetic fields.

1.1.3. Informal statement of results on interaction of solitons with electromagnetic field

Our main concern in this article is to understand the interaction of the solitons just described with an external electromagnetic field produced by the space-time current $J_B$. In order to be able to prove theorems giving precise information on the effect of this field on the soliton, we study (1) in a regime determined by two small parameters:
• The electromagnetic coupling constant e = o(1).
• The external electric and magnetic fields, $\mathbf{E}^{\delta}_{\mathrm{ext}}$ and $\mathbf{B}^{\delta}_{\mathrm{ext}}$, vary over scales which are O(1/δ), where δ = o(1). Thus the small parameter δ is the ratio of the size of the soliton to the length scale over which the external field varies.
The following is an informal version of our main theorems: The system (1) has solutions which are close, in energy norm, to solitons of the type described above and which, in an appropriate scaling regime, move according to the Lorentz force equation:
\[ \frac{d}{dt}(\gamma M_S \mathbf{u}) = e Q_S \bigl( \mathbf{E}^{\delta}_{\mathrm{ext}} + \mathbf{u} \times \mathbf{B}^{\delta}_{\mathrm{ext}} \bigr), \tag{2} \]
where the effective mass $M_S$ and charge $Q_S$ of the soliton are as in (60) and (61). The scaling $\delta = e^{1-k}$ for $k \in (0, \tfrac12)$ ensures that this holds for time intervals of length $T_0/e$ as e → 0. The precise formulation is in the two theorems stated in Sec. 1.4.

1.1.4. Motivation and related work

Our interest in this problem stems from the classical, but ongoing, controversy surrounding the classical equation of motion for a point charge in an external electromagnetic field. The difficulty arises in attempts to account for the “back reaction”
of the charge’s own field on itself. Attempts to derive an equation of motion lead to modifications of the Lorentz force law (2), most notably the Lorentz–Dirac equation ([21, Eq. (9.1)]). This equation is third order in time, and is difficult to interpret consistently without some further constraint on the type of solution allowed, due to the occurrence of runaway solutions and violations of causality, (see [6] and [7, Chap. 28]). Recent discussions of this problem have been given in [9] and the books [21, 28]. One natural and well-established approach to the problem of making sense of the back reaction is to start with a well-posed system of equations in which the point charge is explicitly replaced by a smooth bounded charge distribution, the Abraham model, or one of its generalizations like the Lorentz model, for example. One can then derive an equation of motion for the charge as an expansion, valid when the size of the charge distribution is small (compared with typical length scales set by the external fields), and show that this agrees with the Lorentz–Dirac equation at a certain order of approximation — see [14]. In this setting it turns out, however, that at the same order the Lorentz–Dirac equation can be approximated by a more conventional equation of motion which seems to be free of interpretational difficulties, (see [21, Eq. (9.10)], where the name Landau–Lifshitz equation is suggested for this effective equation of motion. The Landau–Lifshitz equation, which is second order in time, can be obtained formally from the Lorentz–Dirac equation by substituting for the third derivatives the expression obtained by differentiating the ordinary Lorentz force law (2) once in time). Our aim in studying soliton motion in the (nl-KGM) system is to attempt a similar analysis using a solitonic model for the particle (in place of the Abraham or Lorentz model). 
Our model has the virtue of being, in a very natural way, a Lorentz invariant system which is well posed (and so free of causality problems). Unfortunately, the calculations required even just to derive the equation of motion for the soliton to O(e) (i.e. the Lorentz force equation (2)) are long, and further work will be required to calculate additional corrections which may be compared with the Lorentz–Dirac equation in appropriate regimes. To achieve this, the starting point would be the equation of motion (116) for the soliton parameters derived from modulation theory. In Sec. 4, this equation is computed to highest order (i.e. to O(e)), and shown to give the Lorentz force law. A computation to the next order should give the Landau–Lifshitz equation ([21, Eq. (9.10)]). However it seems that some renormalization of the soliton mass and charge will have to be taken into account in this computation, and it is possible a refinement of the ansatz (62) will be needed to achieve this. It is to be hoped that at least in some simple cases such as one-dimensional motion of the soliton in an electric field Eδext = (0, 0, E(δt, δx)) of fixed direction it will be possible to carry this through, and make a comparison with the corresponding specialization of the Landau–Lifshitz equation ([21, Eq. (9.11)]). A corresponding theorem to our main result was proved for solitons in interaction with gravitational fields in the articles [25, 26]. The system treated there (Einstein’s equation coupled to a nonlinear Klein–Gordon equation) is in many ways more difficult than the one studied here (for example, it is quasi-linear).
Correspondingly, it is possible to carry out a more general analysis for the Klein–Gordon–Maxwell system under consideration here: in particular we emphasize that in the present article we are able to work entirely with the energy norm throughout (whereas for the Einstein system it was necessary to work with much stronger norms). There have also been theorems proved on effective dynamics for solitons moving under a potential in the nonlinear Schrödinger equation, see [12, 5, 27].

1.1.5. Notation

The following is a list of notations for important objects, with the section in which they are first introduced, for reference.

• $L^p(\mathbb{R}^3)$ is the Lebesgue space of (equivalence classes of) measurable functions with norm $\|f\|_{L^p}^{p} = \int_{\mathbb{R}^3} |f|^p\,dx < \infty$, and $H^k(\mathbb{R}^3)$ is the Sobolev space of (equivalence classes of) measurable functions with norm $\|f\|_{H^k} = \sum_{|\alpha|=0}^{k} \|\partial^\alpha f\|_{L^2} < \infty$, where $\partial^\alpha$ means the weak partial derivative determined by the multi-index α. We say $f \in H^k_{\mathrm{loc}}$ if $\chi f \in H^k$ for every smooth, compactly supported χ, and
\[ \dot H^1 = \{ f \in H^1_{\mathrm{loc}} \cap L^6 : \|\nabla f\|_{L^2} = \|f\|_{\dot H^1} < \infty \}. \tag{3} \]
Further we define $H^k_r$ to be the intersection of $H^k$ and the space of radial functions, i.e. functions of |x|, and similarly define $L^p_r$ and $\dot H^1_r$.
• Electromagnetic potential $A_\mu\,dx^\mu = A_0\,dt + A_j\,dx^j$, electromagnetic field $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, and covariant derivative $D_\mu = \partial_\mu - ieA_\mu$: Sec. 1.1.1.
• Complex scalar (soliton) field φ and its self-interaction potential V(φ) = G(|φ|) subject to hypotheses (H1)–(H3): Sec. 1.1.1.
• Additional hypotheses (SOL), (KER) and (POS): Secs. 1.3.2 and 1.3.4.
• (nl-KGM) is the nonlinear Klein–Gordon–Maxwell system: (1) and Sec. 1.4.
• $\Psi = (\phi, \psi, A_i, E_i)$ is the dependent variable in the Hamiltonian formulation: Sec. 1.4 (and Sec. 1.3.1 for zero external current case).
• e electromagnetic coupling constant, δ external field scaling parameter are both small: Secs. 1.1.3 and 1.2.2.
• Scaled external electromagnetic potentials $a^\delta_\mu$, and electric and magnetic fields $\mathbf{E}^{\delta}_{\mathrm{ext}}$ and $\mathbf{B}^{\delta}_{\mathrm{ext}}$ induced by external current $(\rho^\delta_B, \mathbf{j}^\delta_B)$: Sec. 1.2.2.
• $\Psi^{\delta}_{\mathrm{ext}} = (0, 0, \mathbf{a}^\delta, \mathbf{E}^{\delta}_{\mathrm{ext}})$ represents the external field in the Hamiltonian formulation, and $\Psi^{\delta,\chi}_{\mathrm{ext}} = (0, 0, \mathbf{a}^{\delta,\chi}, \mathbf{E}^{\delta}_{\mathrm{ext}})$, its gauge transform by χ: Secs. 1.4 and 2.1.
• $f_\omega$, $f_{\omega,e}$ are the soliton profile functions in (respectively) the e = 0 case and for non-zero e, while $\alpha_{\omega,e}$ is the $A_0$ component of the electromagnetic potential for soliton solutions: Sec. 1.3.
• $\Psi_{S,e}$ is the set of Lorentz transformed soliton solutions in Hamiltonian formalism, or $\Psi_{SC,e}$ in Coulomb gauge: Secs. 1.1.2 and 1.3.
• Gauge transformation to Coulomb gauge generated by ζ: Sec. 1.3.4.
• $\lambda = (\lambda_{-1}, \lambda_0, \lambda_1, \dots, \lambda_6) = (\omega, \theta, \boldsymbol{\xi}, \mathbf{u})$ are parameters for Lorentz transformed solitons: Sec. 1.3.
• $\gamma$, $P_u$, $Q_u$, $\Theta$, $\Theta_c$, $Z$, $V_0(\lambda)$, $N_\lambda$, $\tilde\Xi$ and ζ: Sec. 1.3.4.
• $\tilde O_{\mathrm{stab}} \subset \mathbb{R}^8$ is the stable region of soliton parameter space, where the Grillakis–Shatah–Strauss stability condition (39) holds and $\tilde\Xi$ is positive on the symplectic normal subspace $N_\lambda$: Sec. 1.3.4.
• $(\mathcal{H}_0, \Omega_0)$, $(\mathcal{H}, \Omega)$ and $\|\Psi\|_{\mathcal{H}}$ and $\|\Psi\|^2_{\mathcal{H}_s}$ are the symplectic phase spaces and norms: Secs. 1.3.1 and 1.2.1.
• W, K and Ξ are quadratic forms used in stability analysis: Sec. 2.3.
• $\tilde W$ is a quantity like W but including certain nonlinear interaction parts of the Hamiltonian: Sec. 5.
• $T_{\mathrm{loc}}$, $T_0$, $T_1$: Secs. 1.2.1, 1.4.1 and 2.2, respectively.
• $\partial_\lambda$: Sec. 2.2.

1.2. The external electromagnetic field and scaling

The external electromagnetic field $F^{\mathrm{ext}}_{\mu\nu}$ is induced by the space-time current $J_B = \rho_B\,\partial_t + j_B^k\,\partial_k$ according to Maxwell's equations, i.e. the first equation of (1) with φ set equal to zero. Introducing an external electromagnetic potential, written in lower case symbols, $a_\mu\,dx^\mu = a_0\,dt + a_j\,dx^j$, such that $F^{\mathrm{ext}}_{\mu\nu} = \partial_\mu a_\nu - \partial_\nu a_\mu$, and imposing the Coulomb condition $\nabla \cdot \mathbf{a} = 0$, these equations can be written:
\[ -\Delta a_0 = e\rho_B, \qquad \Box \mathbf{a} = \nabla \partial_t a_0 - e\,\mathbf{j}_B. \tag{4} \]
Here $\rho_B$ is the background charge density and $\mathbf{j}_B$ is the background current density. The associated electric field, $\mathbf{E}_{\mathrm{ext}}$, and magnetic field, $\mathbf{B}_{\mathrm{ext}}$, are given by
\[ \mathbf{E}_{\mathrm{ext}} = \frac{\partial \mathbf{a}}{\partial t} - \nabla a_0, \tag{5} \]
\[ \mathbf{B}_{\mathrm{ext}} = \nabla \times \mathbf{a}. \tag{6} \]
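As an illustration of (5) and (6), the fields can be computed from a given potential by finite differences. The sketch below is only a toy example under stated assumptions: the potential $a_0 = 0$, $\mathbf{a} = (0, \sin x \cos t, 0)$ is a hypothetical choice (it is bounded and divergence free, so it is compatible with the Coulomb condition), and the helper names and step size are illustrative rather than taken from the paper.

```python
from math import sin, cos

STEP = 1e-5  # finite-difference step; an illustrative choice


def a0(t, x, y, z):
    """Hypothetical scalar potential a_0 (taken to vanish here)."""
    return 0.0


def a(t, x, y, z):
    """Hypothetical spatial potential; div a = 0, so Coulomb gauge holds."""
    return (0.0, sin(x) * cos(t), 0.0)


def partial(f, point, axis, h=STEP):
    """Central difference of a scalar f(t, x, y, z) along axis 0..3."""
    up, down = list(point), list(point)
    up[axis] += h
    down[axis] -= h
    return (f(*up) - f(*down)) / (2 * h)


def component(i):
    """Scalar function returning the i-th component of a."""
    return lambda t, x, y, z: a(t, x, y, z)[i]


def fields(point):
    """Return (E_ext, B_ext) at point = (t, x, y, z) following (5) and (6)."""
    ax, ay, az = component(0), component(1), component(2)
    # (5): E = d a / d t - grad a_0
    electric = tuple(
        partial(component(i), point, 0) - partial(a0, point, i + 1)
        for i in range(3)
    )
    # (6): B = curl a
    magnetic = (
        partial(az, point, 2) - partial(ay, point, 3),
        partial(ax, point, 3) - partial(az, point, 1),
        partial(ay, point, 1) - partial(ax, point, 2),
    )
    return electric, magnetic
```

For this particular potential the exact fields are $E_y = -\sin x \sin t$ and $B_z = \cos x \cos t$, with all other components zero, so `fields((t, x, y, z))` can be compared directly against these expressions.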
We shall make the following assumptions on the external field:

(BG) The external electromagnetic potentials are smooth and satisfy:
\[ \max_{\substack{|\alpha| = j \\ \mu = 0,1,2,3}} \|\nabla^{\alpha}_{t,x} a_\mu\|_{L^\infty(\mathbb{R}^{1+3})} = L_j < \infty, \tag{7} \]
(using multi-index notation $\nabla^{\alpha}_{t,x}$ for arbitrary partial derivatives of order |α|).

It might appear that these assumptions are restrictive: in particular, the assumption that $\|\mathbf{a}\|_{L^\infty(\mathbb{R}^3 \times \mathbb{R}^+)} < \infty$ precludes the consideration of a constant magnetic field. However, since we shall scale so that the external electric and magnetic fields do not change appreciably over the spread of the soliton, which is exponentially localized, these conditions could probably be relaxed with some further work. A more important restriction in our study appears to arise in the consideration of the scaling of the external field, which we discuss below, after presenting results on local well-posedness for the (nl-KGM) system in the presence of an external field.
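1.2.1. The Cauchy problem for (nl-KGM) in an external field

Throughout this article, we make use of local well-posedness of the (nl-KGM) system in the energy norm. In the case that there is no external field and V ≡ 0, this was proved in [13]. In this section we give conditions under which this is true in the more general situation of (1) considered here. Since our assumptions on the external field do not require finite energy, it is convenient to subtract off the external field. Thus assume given an external electromagnetic potential $a_0\,dt + a_j\,dx^j$ as above, in Coulomb gauge $\nabla \cdot \mathbf{a} = 0$, which solves the inhomogeneous Maxwell equations (4) and verifies (7). Write the electromagnetic potential appearing in (1), denoted $\mathbb{A}_\mu$ in this section to distinguish it from the remainder, as $\mathbb{A}_\mu = a_\mu + A_\mu$. Then, requiring the Coulomb gauge condition $\nabla \cdot \mathbf{A} = 0$, as is always possible, (1) is equivalent to the following system:
\[
\begin{aligned}
\dot\phi &= \psi + ie(a_0 + A_0)\phi, \\
\dot\psi &= \Delta\phi - 2ie(\mathbf{A} + \mathbf{a}) \cdot \nabla\phi - e^2|\mathbf{A} + \mathbf{a}|^2\phi - V'(\phi) + ie(a_0 + A_0)\psi, \\
\Box \mathbf{A} &= e\langle i\phi, (\nabla - ie\mathbf{A})\phi\rangle + \nabla \dot A_0 - e^2|\phi|^2\mathbf{a}, \\
-\Delta A_0 &= e\langle i\phi, \psi\rangle,
\end{aligned}
\tag{8}
\]
where $\mathbf{A} = (A_1, A_2, A_3)$ is the spatial part of $A = \mathbb{A} - a$. We solve this system in the energy space $\mathcal{H} \equiv H^1 \times L^2 \times \dot H^1 \times L^2$, which is endowed with the energy norm $\|\Psi\|_{\mathcal{H}} = \|(\phi, \psi, \mathbf{A}, \mathbf{E})\|_{H^1 \times L^2 \times \dot H^1 \times L^2}$; see Sec. 1.1.5 for notation on standard norms. We also define corresponding higher energy norms indexed by $s \in \mathbb{N}$ by
\[ \|(\phi, \psi, \mathbf{A}, \mathbf{E})\|^2_{\mathcal{H}_s} \equiv \sum_{|\alpha|=0}^{s-1} \|\nabla^{\alpha}_{x}(\phi, \psi, \mathbf{A}, \mathbf{E})\|^2_{\mathcal{H}}, \tag{9} \]
with corresponding space denoted $\mathcal{H}_s$. We say that the Cauchy problem for (8) is locally well posed in $\mathcal{H}$ if the following two conditions hold:

(WP1) given initial data $(\phi(0), \psi(0), \mathbf{A}(0), \dot{\mathbf{A}}(0)) \in \mathcal{H}$ in Coulomb gauge (i.e. $\operatorname{div} \mathbf{A}(0) = 0$, $\operatorname{div} \dot{\mathbf{A}}(0) = 0$), satisfying
\[ \|(\phi(0), \psi(0), \mathbf{A}(0), \dot{\mathbf{A}}(0))\|_{\mathcal{H}} \le k_0, \tag{10} \]
there exists $T_{\mathrm{loc}} = T_{\mathrm{loc}}(k_0) > 0$ and a unique solution $(\phi(t), \psi(t), \mathbf{A}(t), \dot{\mathbf{A}}(t))$ such that
\[ (\phi(t), \psi(t), \mathbf{A}(t), \dot{\mathbf{A}}(t)) \in C([0, T_{\mathrm{loc}}); \mathcal{H}), \qquad \int_0^{T_{\mathrm{loc}}} \bigl( \|\Box \mathbf{A}\|_{L^2} + \|\Box \phi\|_{L^2} \bigr)\,dt < \infty. \]

(WP2) the solution is continuous with respect to the initial data in that, for another set of initial data $(\phi_1(0), \psi_1(0), \mathbf{A}_1(0), \dot{\mathbf{A}}_1(0))$, which are close in $\mathcal{H}$, and also satisfy (10), and the Coulomb gauge conditions, the following holds on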
the common domain of definition $[0, T_{\mathrm{loc}}]$, for some constant c > 0:
\[
\max_{[0, T_{\mathrm{loc}}]} \|(\phi - \phi_1, \psi - \psi_1, \mathbf{A} - \mathbf{A}_1, \dot{\mathbf{A}} - \dot{\mathbf{A}}_1)\|_{\mathcal{H}}
\le c\,\|(\phi(0) - \phi_1(0), \psi(0) - \psi_1(0), \mathbf{A}(0) - \mathbf{A}_1(0), \dot{\mathbf{A}}(0) - \dot{\mathbf{A}}_1(0))\|_{\mathcal{H}}.
\]
As remarked above, in the absence of the external field, and with V ≡ 0, the validity of (WP1) and (WP2) was proved in [13]. The general case was addressed in the thesis [17], where it was shown, using in addition Strichartz inequalities from [11, 23], that (WP1) and (WP2) hold if V is a smooth sub-critical nonlinearity:

Proposition 1. Suppose V is smooth and that there exists a positive number κ ∈ (0, 4) such that, for all φ, ϕ,
\[ |V'(\phi) - V'(\varphi)| \le C|\phi - \varphi|\,(1 + |\phi|^{4-\kappa} + |\varphi|^{4-\kappa}) \tag{11} \]
and that V'(0) = 0. Assume that the external potential is smooth and verifies (7) for every non-negative integer j. Then the Cauchy problem for (8) is well-posed in the sense of (WP1) and (WP2). Further, if the initial data lie in $\mathcal{H}_s$ for some s ≥ 2 then the solution exists for all time, and remains in $\mathcal{H}_s$, and is smooth if the initial data are smooth.

Remark 2. The Coulomb condition leaves a residual gauge invariance by functions χ(t, x) which are harmonic in x. (These are either constant or unbounded.) In particular, the system (8) is invariant under the transformation $a_\mu \to a_\mu + \partial_\mu \chi$, $(\phi, \psi) \to e^{ie\chi}(\phi, \psi)$ if $\chi = \alpha_0(t) + \alpha_j(t)x^j$ is linear in x and smooth in t. In this case, the map $(\phi, \psi) \to e^{i\chi}(\phi, \psi) = (\tilde\phi, \tilde\psi)$ is Lipschitz on $H^1 \times L^2$. It follows that Proposition 1 remains valid if the external potential is obtained from one satisfying (7) by gauge transformation by $\chi = \alpha_0(t) + \alpha_j(t)x^j$.

Remark 3. Notice that when the nonlinearity is determined by a smooth function V whose third derivative satisfies
\[ |D^{(3)}V(\phi)| \le c(1 + |\phi|^{3-\kappa}) \quad \text{for all } \phi, \tag{12} \]
for some c > 0, 0 < κ < 3, the conditions of Proposition 1 hold, and the Cauchy problem is well-posed. This assumption is also sufficient to estimate the nonlinear terms in the perturbation theory developed in Secs. 2, 3 and 5 of this article. Introduce $F(\phi) = V'(\phi) - m^2\phi = V_1'(\phi) = \beta(|\phi|)\phi$ as the nonlinear part of V'(φ), with V as in the introduction. Then (12) implies the inequality
\[ |F(f + v) - F(f)| \le c(1 + |f|^3)(|v| + |v|^4), \quad \text{where } c \text{ is a positive constant}, \tag{13} \]
which is convenient for our use. In fact, for our purposes it would be sufficient to make the following slightly more general assumption on F: For all f > 0 and for any v,
\[ |F(f + v) - F(f)| \le c(f^{r-1} + f^3)(|v| + |v|^4), \tag{14} \]
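where r, c are positive constants, see [16]. Of course, if V is a smooth potential satisfying (12) and F is as just defined, then (14) also holds with r = 1.

1.2.2. Scaling the external fields

As already mentioned, we require that the external electric and magnetic fields are approximately constant over the soliton. To ensure this, we introduce a scaled version of the external fields. Thus, we have
\[ a^\delta_0(t, \mathbf{x}) = \frac{1}{\delta}\,a_0(\delta t, \delta \mathbf{x}), \qquad \mathbf{a}^\delta(t, \mathbf{x}) = \frac{1}{\delta}\,\mathbf{a}(\delta t, \delta \mathbf{x}), \tag{15} \]
with the scaled external electric and magnetic fields given by:
\[ \mathbf{E}^{\delta}_{\mathrm{ext}} = \mathbf{E}_{\mathrm{ext}}(\delta t, \delta \mathbf{x}), \qquad \mathbf{B}^{\delta}_{\mathrm{ext}} = \mathbf{B}_{\mathrm{ext}}(\delta t, \delta \mathbf{x}). \tag{16} \]
Clearly these fields correspond to the following rescaled charge and current densities:
\[ \rho^\delta_B(t, \mathbf{x}) = \delta\,\rho_B(\delta t, \delta \mathbf{x}), \qquad \mathbf{j}^\delta_B(t, \mathbf{x}) = \delta\,\mathbf{j}_B(\delta t, \delta \mathbf{x}). \tag{17} \]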
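The relation between the scaling (15) of the potentials and the scaling (16) of the fields can be checked on a toy example. The sketch below is a minimal illustration under stated assumptions: it uses a single hypothetical potential component $a_y(t, x) = \sin x \cos t$ with $a_0 = 0$, so that $E_y = \partial_t a_y$ and $B_z = \partial_x a_y$, and all function names are illustrative. The factor 1/δ in (15) is cancelled by the factor δ produced by differentiating, which is why the scaled fields in (16) carry no prefactor.

```python
from math import sin, cos


def a_y(t, x):
    """Hypothetical unscaled potential component (not from the paper)."""
    return sin(x) * cos(t)


def scaled(f, delta):
    """Scaling (15): f^delta(t, x) = f(delta * t, delta * x) / delta."""
    return lambda t, x: f(delta * t, delta * x) / delta


def d_dt(f, t, x, h=1e-4):
    """Central difference in t."""
    return (f(t + h, x) - f(t - h, x)) / (2 * h)


def d_dx(f, t, x, h=1e-4):
    """Central difference in x."""
    return (f(t, x + h) - f(t, x - h)) / (2 * h)


def field_mismatch(delta, t, x):
    """Compare fields of the scaled potential with the rescaled fields (16)."""
    a_delta = scaled(a_y, delta)
    e_scaled = d_dt(a_delta, t, x)            # E_y built from a^delta
    b_scaled = d_dx(a_delta, t, x)            # B_z built from a^delta
    e_ref = d_dt(a_y, delta * t, delta * x)   # E_y(delta t, delta x)
    b_ref = d_dx(a_y, delta * t, delta * x)   # B_z(delta t, delta x)
    return abs(e_scaled - e_ref), abs(b_scaled - b_ref)
```

Both mismatches returned by `field_mismatch` should vanish up to finite-difference error, for any δ > 0.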
Henceforth, we shall almost always refer exclusively to the scaled fields. It remains to choose the length scale, 1/δ, over which the external fields change: this is determined by the analysis in Sec. 5, which bounds the deviation of the solution from the modulated soliton. This analysis seems to require two main conditions on the scaling of δ and e:
• From Lemma 21, it seems that we need
\[ \lim_{e \to 0} \frac{e}{\delta} = 0, \tag{18} \]
to bound the effect of the scaled external electromagnetic potential.
• Treatment of the last term in (137) seems to suggest that we need
\[ \lim_{e \to 0} \frac{\delta^2}{e} = 0. \tag{19} \]
This condition is used to ensure the deviation from the Lorentz force law is small for times of order 1/e.
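For a power-law scaling $\delta = e^{1-k}$, the two ratios in (18) and (19) become $e/\delta = e^{k}$ and $\delta^2/e = e^{1-2k}$, so both tend to zero exactly when 0 < k < 1/2. The short sketch below (the function name and sample values are illustrative, not from the paper) evaluates the ratios numerically.

```python
def scaling_ratios(e, k):
    """Return (e / delta, delta**2 / e) for the power law delta = e**(1 - k)."""
    delta = e ** (1 - k)
    return e / delta, delta**2 / e


if __name__ == "__main__":
    for k in (0.25, 0.5, 0.6):
        for e in (1e-2, 1e-4, 1e-8):
            first, second = scaling_ratios(e, k)
            print(f"k={k:4.2f} e={e:7.1e} e/delta={first:9.3e} delta^2/e={second:9.3e}")
```

For k = 0.25 both ratios decrease as e decreases; at the endpoint k = 0.5 the second ratio stays equal to 1, and for k > 0.5 it grows, so condition (19) fails there.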
We will consider the limit e → 0 with
\[ \delta = e^{1-k} \tag{20} \]
for some constant $k \in (0, \tfrac12)$, so that both of these conditions hold. It remains to be seen what are the optimal conditions for scaling e, δ under which the results of this paper hold.

1.3. Non-topological solitons

We now discuss existence and stability properties of non-topological solitons as solutions of (nl-KGM) in the absence of external fields. This means we are here concerned with the (nl-KGM) system with $\rho_B = 0 = \mathbf{j}_B$. We first discuss the Hamiltonian formulation of (nl-KGM), since that gives the appropriate context in which to introduce non-topological solitons.
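1.3.1. Hamiltonian formalism

It is useful to present the Hamiltonian formalism for the (nl-KGM), not least because it will give us a language which we shall use in proving the existence and long-time stability of the non-topological solutions. Indeed, as we shall see, from the Hamiltonian point of view, non-topological solitons are relative equilibria, and recognizing this fact leads to the identification of the appropriate quantities with which to work. In order to define the phase space we recall the standard function spaces defined in Sec. 1.1.5. To start with, consider the nonlinear wave equation in isolation
\[ \Box\phi + V'(\phi) = 0. \tag{21} \]
This can be written as a Hamiltonian system on the phase space $\mathcal{H}_0 \equiv \{(\phi, \psi) \in H^1 \times L^2\}$, with symplectic form $\Omega_0((\phi', \psi'), (\dot\phi, \dot\psi)) = \int \langle\phi', \dot\psi\rangle - \langle\psi', \dot\phi\rangle\,dx$, and Hamiltonian
\[ H_0(\phi, \psi) = \frac12 \int |\psi|^2 + |\nabla\phi|^2 + 2V(\phi). \tag{22} \]
The corresponding Hamiltonian evolution equations, equivalent to (21), are:
\[ \partial_t \begin{pmatrix} \phi \\ \psi \end{pmatrix} = \begin{pmatrix} \psi \\ \Delta\phi - V'(\phi) \end{pmatrix}. \tag{23} \]
Next, for (nl-KGM), introduce the phase space
\[ \mathcal{H} \equiv \{\Psi = (\phi, \psi, \mathbf{A}, \mathbf{E}) \in H^1 \times L^2 \times \dot H^1 \times L^2\}, \tag{24} \]
which is endowed with the norm $\|\Psi\|_{\mathcal{H}} = \|(\phi, \psi, \mathbf{A}, \mathbf{E})\|_{H^1 \times L^2 \times \dot H^1 \times L^2}$ and the (densely defined, weak) symplectic form
\[ \Omega(\Psi', \dot\Psi) = \int \langle\phi', \dot\psi\rangle - \langle\psi', \dot\phi\rangle + \mathbf{A}' \cdot \dot{\mathbf{E}} - \mathbf{E}' \cdot \dot{\mathbf{A}}\,dx, \tag{25} \]
where $\Psi' = (\phi', \psi', \mathbf{A}', \mathbf{E}')$ and similarly for $\dot\Psi$. The (nl-KGM) equations with $\rho_B = 0 = \mathbf{j}_B$ arise formally as the Hamiltonian flow on $\mathcal{H}$ associated to the Hamiltonian
\[ H(\phi, \psi, \mathbf{A}, \mathbf{E}) = \frac12 \int \bigl( |\mathbf{E}|^2 + |\nabla \times \mathbf{A}|^2 + |\psi|^2 + |\nabla_A \phi|^2 + 2V(\phi) \bigr), \tag{26} \]
and subject to the constraint:
\[ C_0 \equiv \operatorname{div} \mathbf{E} - e\langle i\phi, \psi\rangle = 0. \tag{27} \]
Here $\nabla_A \phi$ is the covariant derivative of φ given by $\nabla_A \phi = \nabla\phi - ie\mathbf{A}\phi$ and $\mathbf{A}$ is the spatial part of the gauge field. The equations of motion for the augmented Hamiltonian $H_1 = H - A_0 C_0$ are:
\[
\partial_t \begin{pmatrix} \phi \\ \psi \\ A_i \\ E_i \end{pmatrix}
= \begin{pmatrix} \psi + ieA_0\phi \\ \Delta_A \phi - V'(\phi) + ieA_0\psi \\ E_i + \nabla_i A_0 \\ \Delta A_i - \nabla_i(\operatorname{div} \mathbf{A}) + e\langle i\phi, (\nabla_A)_i \phi\rangle \end{pmatrix},
\tag{28}
\]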
where the “Lagrange multiplier” $A_0$ is identifiable with the temporal part of the gauge field, $\Delta_A \phi = \Delta\phi - 2ie\mathbf{A} \cdot \nabla\phi - ie(\operatorname{div} \mathbf{A})\phi - e^2|\mathbf{A}|^2\phi$, i = 1, ..., 3, and we have not yet imposed any gauge condition.

1.3.2. Existence of non-topological solitons: The e = 0 case

The class of solitary wave solutions of interest is that of non-topological solitons discussed in [15, Chap. 7]. These are examples of a special type of solution to a Hamiltonian system with symmetry called a relative equilibrium: this means that the time evolution is given by an orbit of a one parameter subgroup of the symmetry group. For (23), the Hamiltonian is invariant under the action of $S^1$ by phase rotation, as long as V(φ) = G(|φ|) is a function of |φ| only; the charge corresponding to this $S^1$ action is $Q(\phi, \psi) = \int \langle i\psi, \phi\rangle\,dx$. A relative equilibrium is then a solution of the form $(\phi, \psi) = e^{i\omega t}(f_\omega(\mathbf{x}), i\omega f_\omega)$, where $f_\omega$ is a real-valued function which satisfies an elliptic equation. These solutions are critical points of the functional $H_0 + \omega Q$, often called the augmented Hamiltonian in this context. We consider G of the form
\[ G(f) = \frac{m^2}{2} f^2 - U(f) \quad \text{with} \quad U(f) = \int_0^{|f|} t\beta(t)\,dt; \]
then the equation satisfied by $f_\omega$ is
\[ -\Delta f_\omega + (m^2 - \omega^2) f_\omega = \beta(f_\omega) f_\omega. \]
This equation typically has many solutions (see [3] and references therein), but we are only interested in positive, radially symmetric solutions because it is these which are dynamically stable: these are sometimes called the ground state solitons. Thus, crucial to our analysis is the following hypothesis on existence and uniqueness of the e = 0 ground state soliton:

(SOL) For $\omega^2 < m^2$, there exists a unique positive radial function $f_\omega \in H^4(\mathbb{R}^3)$ which solves $(-\Delta + m^2 - \omega^2) f_\omega = \beta(f_\omega) f_\omega$.

Theorem 4. The existence part of (SOL) holds under the following conditions:
\[ U'(f) = -U'(-f) \quad \text{and} \quad U' \in C^1(\mathbb{R}) \cap C^2((0, \infty)), \tag{29} \]
\[ U'(0) = U''(0) = 0 \quad \text{and} \quad \exists s \in (0, 1) : \lim_{f \to 0} f^{s}\,U'''(f) = 0, \tag{30} \]
\[ \exists \zeta > 0 : U(\zeta) > \frac{m^2 - \omega^2}{2}\,\zeta^2, \tag{31} \]
\[ \lim_{f \to \infty} \frac{U'(f)}{f^5} = 0. \tag{32} \]
The uniqueness part of (SOL) holds under the additional conditions:

(U1) $\exists\, l_1 > 0$ : $0 < f < l_1 \Rightarrow U'(f) < (m^2 - \omega^2) f$, and $l_1 < f < \infty \Rightarrow U'(f) > (m^2 - \omega^2) f$, and $U''(l_1) - (m^2 - \omega^2) > 0$,

and that

(U2) For $l_2 > l_1$, $\exists\,\lambda = \lambda(l_2) \in C[(l_1,\infty), \mathbb{R}^+]$ such that $2(m^2 - \omega^2) f + \lambda f\,U''(f) - (\lambda + 2)\,U'(f)$ is non-negative on $(0, l_2)$ and non-positive on $(l_2, \infty)$.
Proof. The existence part of this hypothesis was proved in [3] under the given conditions on the nonlinearity. It was shown in several articles (see, for example, [18], where further references are given) that these solutions are unique under the given additional conditions.

The following two operators, $L_\pm(\omega)$, which appear on linearizing (23) about the soliton solution, are crucial to an understanding of the stability and dynamical properties of the e = 0 soliton:

$$L_+(\omega) = -\Delta + m^2 - \omega^2 - \beta(f_\omega) - \beta'(f_\omega) f_\omega, \qquad L_-(\omega) = -\Delta + m^2 - \omega^2 - \beta(f_\omega). \tag{33}$$
We make the following hypothesis on $L_+(\omega)$:

(KER) The kernel of $L_+(\omega)$ is trivial in $H^2_r(\mathbb{R}^3)$.

(Recall that $H^s_r$ was defined as the space of radial Sobolev $H^s$ functions, immediately following (3).)

Theorem 5. The hypothesis (KER) is valid under the conditions (U1) and (U2).

Proof. See [18]: establishing (KER) is a crucial step in proving uniqueness of the positive function $f_\omega$.

The operators $L_\pm$ also determine stability properties of the soliton. For proving stability the following spectral assumption is used:

(S1) The subspace in which $L_+$ is strictly negative is one-dimensional.

This assumption is valid for the ground state solitons $f_\omega$ obtained by the constrained minimization technique of [3], because they are minimizers subject to a single constraint, see [24] (where a direct proof in the pure power case is also given). Some additional, more technical results on the solitons can be found in Sec. A.1.
1.3.3. Existence of non-topological solitons: The general case

We now show that for small values of the coupling constant $e$ the ground state solitons just discussed can be continued (via the implicit function theorem) to give soliton solutions of (28). The properties of the e = 0 soliton needed to achieve this were detailed already in Sec. 1.3.2. As shown in [1, 2] it is also possible to obtain soliton solutions for systems like (28) by variational techniques applied within the class of radial functions, but for present purposes we prefer to use the implicit function theorem so that we can carry over stability information from the e = 0 case, which seems to be hard to obtain otherwise. Generalizing the class of non-topological solitons to the case of the gauge invariant system (28) leads us to search for solutions to (28) of the form

$$\begin{pmatrix} \phi \\ \psi \\ A \\ E \end{pmatrix} = \begin{pmatrix} e^{i\omega t} f_{\omega,e} \\ e^{i\omega t}\, i(\omega - e\alpha_{\omega,e}) f_{\omega,e} \\ 0 \\ -\nabla\alpha_{\omega,e} \end{pmatrix}, \tag{34}$$
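where we have emphasized the dependence on the parameters $\omega$ and $e$; we will assume the functions $f_{\omega,e}$ and $\alpha_{\omega,e}$ to be radially symmetric. It can easily be checked that this gives a solution to (28) with $A_0 = \alpha_{\omega,e}$ as long as the functions $f_{\omega,e}$ and $\alpha_{\omega,e}$ satisfy

$$-\Delta\alpha_{\omega,e} + e^2 f_{\omega,e}^2\,\alpha_{\omega,e} - e\omega f_{\omega,e}^2 = 0, \tag{35}$$

$$-\Delta f_{\omega,e} - U'(f_{\omega,e}) + \bigl(m^2 - (\omega - e\alpha_{\omega,e})^2\bigr) f_{\omega,e} = 0. \tag{36}$$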
The first of these equations implies C0 = 0. It can readily be checked that if a gauge transformation is made to put the solution thus obtained into temporal gauge, $A_0 = 0$, then its time dependence amounts to the action of the one parameter group of gauge transformations $e^{i(\omega - e\alpha_{\omega,e})t}$, so that it is indeed a relative equilibrium solution as defined above.

Theorem 6 ([17]). Assume that the hypotheses (SOL) and (KER) hold for $\omega_0$ with $\omega_0^2 < m^2$. Then there exists a neighborhood $U$ of $\omega_0$ and, for each $\omega \in U$, a number $e(\omega) > 0$ such that for $\omega \in U$, $|e| < e(\omega)$, there exists $f_{\omega,e} \in H^2_r(\mathbb{R}^3)$ such that

$$-\Delta f_{\omega,e} + m^2 f_{\omega,e} - (\omega - e\alpha_{\omega,e})^2 f_{\omega,e} = \beta(f_{\omega,e}) f_{\omega,e}, \tag{37}$$

where $\alpha_{\omega,e} \in \dot H^1_r(\mathbb{R}^3)$ is a non-local function of $f_{\omega,e}$ uniquely determined by

$$-\Delta\alpha_{\omega,e} + e^2 f_{\omega,e}^2\,\alpha_{\omega,e} = \omega e f_{\omega,e}^2. \tag{38}$$

In addition the map $\omega \mapsto f_{\omega,e}$ is $C^2$ from $U$ to $H^2_r$.
1.3.4. Stability in the absence of an external field

The stability of the solutions to (23) of the form $e^{i\omega t} f_\omega(x)$ was first considered in [20, 10] where it was proved that the positive radial solution was stable, with respect to radially symmetric perturbations of the initial data, as long as

$$\partial_\omega\bigl(\omega\,\|f_\omega\|_{L^2}^2\bigr) < 0. \tag{39}$$
It was also shown that the solutions are unstable when this quantity is positive. In [24], an alternative, modulational, approach to stability was adopted along the lines of [27], with the aim both of generalizing previous stability results to prove stability of uniformly moving solutions with respect to arbitrary (non-symmetric) perturbations, and also of providing techniques which could provide useful information in dynamically non-trivial settings. The presence of external fields is an example of the latter circumstance, and so the analysis in this article is based on that in [24], which we will now summarize. It turns out that the condition (39) implies the strict positivity of the Hessian of the augmented Hamiltonian on the symplectic normal space to the space of solitons. To explain this properly in the generality needed it is necessary to consider the action of the Poincaré (or inhomogeneous Lorentz) group.

Action of the Poincaré group on the solitons. The equations (28) are Poincaré covariant. The action of the Poincaré group on the radial soliton (34) gives a family of functions depending smoothly on eight parameters $\{\lambda_A\}_{A=-1}^{6}$, with

$$\lambda = (\lambda_{-1}, \lambda_0, \lambda_1, \dots, \lambda_6) = (\omega, \theta, \xi, u) \tag{40}$$
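determining (respectively) the frequency, the phase, the centre and the velocity of the soliton. Explicitly:

$$\Psi_{S,e}(x;\lambda) = \begin{pmatrix} e^{i\Theta} f_{\omega,e}(Z) \\ e^{i\Theta}\bigl(i\gamma(\omega - e\alpha_{\omega,e}(Z)) f_{\omega,e}(Z) - \gamma u\cdot\nabla_Z f_{\omega,e}(Z)\bigr) \\ -\gamma u\,\alpha_{\omega,e}(Z) \\ -\bigl(\tfrac{1}{\gamma}P_u + \gamma Q_u\bigr)\nabla_Z\alpha_{\omega,e}(Z) \end{pmatrix}. \tag{41}$$

Here the projection operators $P_u : \mathbb{R}^3 \to \mathbb{R}^3$ and $Q_u : \mathbb{R}^3 \to \mathbb{R}^3$ are defined by $(P_u)_{ij} = \dfrac{u_i u_j}{|u|^2}$ and $Q_u = 1 - P_u$, and

$$Z(x,\lambda) = \gamma P_u(x - \xi) + Q_u(x - \xi), \tag{42}$$

$$\Theta(x,\lambda) = \theta - \omega\,u\cdot Z, \tag{43}$$

with $\gamma(u) = \dfrac{1}{\sqrt{1 - |u|^2}}$.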
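The boosted coordinate (42) and the phase (43) can be evaluated numerically. The following is a minimal sketch in plain Python, not part of the original analysis; the function names are hypothetical, and the convention $P_u = 0$ for $u = 0$ (where the formula for $P_u$ is undefined) is an assumption made here so that $Z = x - \xi$ for a soliton at rest.

```python
import math

def lorentz_factor(u):
    """gamma(u) = 1/sqrt(1 - |u|^2), defined only for |u| < 1."""
    speed_sq = sum(c * c for c in u)
    if speed_sq >= 1.0:
        raise ValueError("velocity must satisfy |u| < 1")
    return 1.0 / math.sqrt(1.0 - speed_sq)

def project_parallel(u, y):
    """P_u y, the component of y along u (assumed zero when u = 0)."""
    speed_sq = sum(c * c for c in u)
    if speed_sq == 0.0:
        return [0.0, 0.0, 0.0]
    coeff = sum(a * b for a, b in zip(u, y)) / speed_sq
    return [coeff * c for c in u]

def boosted_coordinate(x, xi, u):
    """Z = gamma P_u (x - xi) + Q_u (x - xi), with Q_u = 1 - P_u, as in (42)."""
    gamma = lorentz_factor(u)
    y = [a - b for a, b in zip(x, xi)]
    par = project_parallel(u, y)
    perp = [a - b for a, b in zip(y, par)]
    return [gamma * p + q for p, q in zip(par, perp)]

def soliton_phase(theta, omega, u, z):
    """Theta = theta - omega u . Z, as in (43)."""
    return theta - omega * sum(a * b for a, b in zip(u, z))
```

For a velocity along the first axis only the first component of $x - \xi$ is stretched by $\gamma$, which is the usual Lorentz contraction of the soliton profile in the direction of motion.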
The parameters are required to lie in the set $\tilde O \subset \mathbb{R}^8$ defined by

$$\tilde O \equiv \{(\omega, \theta, \xi, u) \in \mathbb{R}^8 : |u| < 1 \text{ and } \omega^2 < m^2\}. \tag{44}$$
The parameter range corresponding to stable solitons is

$$\tilde O_{\rm stab} \equiv \{(\omega, \theta, \xi, u) \in \tilde O : \text{condition (39) holds}\}. \tag{45}$$
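As an illustration of how condition (39) cuts out a frequency range, consider the pure power case. This is an assumption made only for this example: take $\beta(f) = |f|^{p-1}$ with $1 < p < 5$, so that the e = 0 profile equation reads $-\Delta f_\omega + (m^2 - \omega^2) f_\omega = f_\omega^p$. The shorthand $\mu$, the exponent $k$ and the reference profile $g$ (the positive radial solution for $m^2 - \omega^2 = 1$) are introduced here and do not appear in the text. A scaling argument then gives the following sketch:

```latex
% Assumption (not from the text): pure power nonlinearity beta(f) = |f|^{p-1}, 1 < p < 5.
\mu = m^2 - \omega^2, \qquad
f_\omega(x) = \mu^{\frac{1}{p-1}}\, g(\sqrt{\mu}\,x)
\quad\Longrightarrow\quad
\|f_\omega\|_{L^2(\mathbb{R}^3)}^2 = \mu^{k}\,\|g\|_{L^2}^2,
\qquad k = \frac{2}{p-1} - \frac{3}{2},
\\[4pt]
\partial_\omega\bigl(\omega\,\|f_\omega\|_{L^2}^2\bigr)
  = \mu^{k-1}\bigl(m^2 - (1+2k)\,\omega^2\bigr)\,\|g\|_{L^2}^2,
\\[4pt]
\text{so (39) holds}
\iff (1+2k)\,\omega^2 > m^2
\iff \frac{p-1}{6-2p} < \frac{\omega^2}{m^2} < 1 .
```

Under this assumption the stable range is non-empty only when $1 < p < 7/3$, since otherwise the lower bound is at least one (or $1 + 2k \le 0$).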
The Poincaré covariance of the equations of motion (28) implies that the solitons given by (41) form an eight parameter family of solutions $t \mapsto \Psi_{S,e}(x;\lambda(t))$ of (28) as long as $\frac{d}{dt}\lambda = V_0(\lambda)$, where $V_0$ is the vector field on $\tilde O$ defined by

$$V_0(\lambda) \equiv \Bigl(0, \frac{\omega}{\gamma}, u, 0\Bigr), \tag{46}$$

for $\lambda = (\omega, \theta, \xi, u)$. The case of the nonlinear wave equation (23) can be obtained by putting e = 0 in the first two components of the formulae just given. Simplifying to this case we obtain an eight parameter family of functions,

$$(\phi_{S,0}, \psi_{S,0})(x;\lambda) \equiv e^{i\Theta}\bigl(f_\omega(Z),\ i\gamma\omega f_\omega(Z) - \gamma u\cdot\nabla_Z f_\omega(Z)\bigr) \tag{47}$$

such that $t \mapsto (\phi_{S,0}, \psi_{S,0})(x;\lambda(t))$ solves (23), as long as $\frac{d}{dt}\lambda = V_0(\lambda)$, with $V_0$ as above.
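Stability for e = 0 (nonlinear Klein–Gordon). The starting point for stability analysis is the observation that $(\phi_{S,0}, \psi_{S,0})$ is a critical point of the augmented Hamiltonian

$$F_0(\phi, \psi; \lambda) = H_0(\phi, \psi) + u_i\,\Pi_i(\phi, \psi) + \frac{\omega}{\gamma}\,Q(\phi, \psi) \tag{48}$$

where $H_0$, $Q$ are the functionals defined above, and $\Pi_i$ are the momenta $\Pi_i(\phi, \psi) = \int \langle\psi, \partial_i\phi\rangle\,dx$. The Hessian of $F_0$ at $(\phi_{S,0}, \psi_{S,0})$ is a quadratic form depending upon $\lambda$:

$$\Xi(\tilde\phi, \tilde\psi; \lambda) \equiv D^2 F_0(\phi_{S,0}, \psi_{S,0}; \lambda)\bigl((\tilde\phi, \tilde\psi), (\tilde\phi, \tilde\psi)\bigr).$$

Introduce the subspace

$$N_\lambda \equiv \{(\tilde\phi, \tilde\psi) \in H^1 \times L^2 : \Omega_0\bigl((\tilde\phi, \tilde\psi), \partial_\lambda(\phi_{S,0}, \psi_{S,0})(\lambda)\bigr) = 0\}; \tag{49}$$

then the following hypothesis is crucial for stability:

(POS) For each compact $K \subset \tilde O_{\rm stab}$, $\exists\,\tau_* = \tau_*(K) > 0$ such that $\Xi(\tilde\phi, \tilde\psi; \lambda) \ge \tau_*\,\|(\tilde\phi, \tilde\psi)\|^2_{H^1 \times L^2}$ for all $(\tilde\phi, \tilde\psi) \in N_\lambda$.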
Remark 7. $\tilde O_{\rm stab}$ is the set of parameter values corresponding to stable solitons, which are obtained as Poincaré transforms of solitons $e^{i\omega t} f_\omega$ with $\omega$ such that (39) holds.

Theorem 8 ([24]). If the nonlinearity satisfies the conditions given in Sec. 1.3.2 then (POS) is true. Furthermore, solitons of (23) corresponding to frequencies $\omega$
such that (39) holds are modulationally stable with respect to small, arbitrary perturbations in energy norm. To be precise, consider the initial value problem for (23) with initial data close to a soliton $(\phi_{S,0}, \psi_{S,0})(\cdot;\lambda(0))$ with $\lambda(0) \in \tilde O_{\rm stab}$, in the sense that $\epsilon = \|(\phi(0,\cdot), \psi(0,\cdot)) - (\phi_{S,0}, \psi_{S,0})(\cdot;\lambda(0))\|_{H_0}$ is sufficiently small. Then there exists a global solution which satisfies:

$$\sup_{t\in\mathbb{R}} \|(\phi(t,\cdot), \psi(t,\cdot)) - (\phi_{S,0}, \psi_{S,0})(\cdot;\lambda(t))\|_{H_0} \le c\,\epsilon, \tag{50}$$

for some $C^1$ curve $t \mapsto \lambda(t) \in \tilde O_{\rm stab}$.

Stability for small e (nonlinear Klein–Gordon–Maxwell). It was shown in [17] that stability holds also for solitons in (28) under the condition (39), for sufficiently small values of the electromagnetic coupling constant $e$. This was proved using the Coulomb condition, so we first write down the soliton solutions (41) in Coulomb gauge. (The Coulomb condition is not invariant under Lorentz boosts; therefore, it is necessary to perform a gauge transformation to move the Lorentz boosted solitons into the Coulomb gauge.) The Lorentz boosted solitons $\Psi_{SC,e}$ in the Coulomb gauge have the form

$$\begin{pmatrix} \phi_{SC,e}(x) \\ \psi_{SC,e}(x) \\ A_{SC,e}(x) \\ E_{SC,e}(x) \end{pmatrix} = \begin{pmatrix} e^{i\Theta_C} f_{\omega,e}(Z) \\ e^{i\Theta_C}\bigl(i\gamma(\omega - e\alpha_{\omega,e}(Z)) f_{\omega,e}(Z) - \gamma u\cdot\nabla_Z f_{\omega,e}(Z)\bigr) \\ -\gamma u\,\alpha_{\omega,e}(Z) + \nabla\zeta \\ -\bigl(\tfrac{1}{\gamma}P_u + \gamma Q_u\bigr)\nabla_Z\alpha_{\omega,e}(Z) \end{pmatrix} \tag{51}$$

where $\Theta_C = \Theta + ie\zeta$, and $\zeta(x;\lambda)$ is a solution of

$$-\Delta\zeta = -\gamma u\cdot\nabla\alpha_{\omega,e}(Z). \tag{52}$$

It is a smooth function of $x$ and also depends smoothly on $\lambda$; requiring that $\nabla\zeta \in L^p$, $p > 3$, fixes it up to a constant. Some estimates for $\zeta$ are given in Sec. A.1.2. The temporal part of the gauge field is given by

$$(A_{SC,e})_0 = \gamma\alpha_{\omega,e}(Z) + \dot\zeta = \gamma\alpha_{\omega,e}(Z) + V_0(\lambda)\cdot\partial_\lambda\zeta.$$

Theorem 9 ([17]). In the situation of the previous theorem the solitons (51) of (28) corresponding to frequencies $\omega$ such that (39) holds are, for sufficiently small $|e|$, modulationally stable in Coulomb gauge with respect to small, arbitrary perturbations of the initial data in the energy norm $\|\cdot\|_H$ defined in (24). The stability is in the same sense as in the previous theorem, see [17] for full details.

1.4. The main theorems

We now explain and state our main results on the interaction of the solitons of Sec. 1.3 with the scaled external electromagnetic field of Sec. 1.2. We write the
total electromagnetic potential as $A = A_\mu\,dx^\mu$ (as described in Sec. 1.1.1) with corresponding electric field $E_j = \partial_t A_j - \partial_j A_0$. The potential $A$ will be formed from three constituents:

1. The external field, produced by a background charge $\rho^\delta_B$ and current $j^\delta_B$, and scaled as described in Sec. 1.2,
2. The soliton contribution, as described in Sec. 1.3 but with parameters $\lambda(t)$ varying in a dynamically determined way,
3. An additional component produced by interaction of the initial data with the two previous components. This component is not explicitly given, and must be estimated.

Similarly, the solitonic field will be made up of a component which is the moving soliton, and a remainder produced by interactions, which must be estimated. It is convenient to write the (nl-KGM) equations in first order form. Including the scaled background current density, the equations read:

$$\partial_t \begin{pmatrix} \phi \\ \psi \\ A_i \\ E_i \end{pmatrix} = \begin{pmatrix} \psi + ieA_0\phi \\ \Delta_A\phi - \mathcal{V}'(\phi) + ieA_0\psi \\ E_i + \nabla_i A_0 \\ \Delta A_i + \langle ie\phi, \nabla_A\phi\rangle - e\,j^\delta_B \end{pmatrix}, \tag{53}$$

with the Coulomb gauge condition imposed. These equations are to be solved with the Gauss law

$$\operatorname{div} E - \langle ie\phi, \psi\rangle = \rho^\delta_B, \tag{54}$$

as a constraint. We shall abbreviate a general solution by making use of the following definition:

$$\Psi = (\phi, \psi, A_i, E_i), \tag{55}$$

with $i \in \{1, 2, 3\}$. Using this Hamiltonian formulation with $\Psi$ as dynamical variable we write the external field $\Psi^\delta_{\rm ext} = (0, 0, a^\delta, E^\delta_{\rm ext})$. It will be convenient also to have the freedom of applying a gauge transformation $\chi(t, x)$ to this:

$$\Psi^{\delta,\chi}_{\rm ext} = (0, 0, a^{\delta,\chi}, E^\delta_{\rm ext}),$$

with $a^{\delta,\chi} = a^\delta + d\chi$. The aim is now to construct a solution $\Psi$ to (53) consisting of $\Psi^{\delta,\chi}_{\rm ext}$ with a soliton $\Psi_{SC,e}(\lambda)$ superimposed. We choose the gauge transformation $\chi$ so that the transformed external electromagnetic potentials vanish along the world-line of the soliton $x = \xi(t)$; in particular, at $t = 0$ we will choose $\chi(0, x) = \chi_0(x) = -(x - \xi(0))\cdot a^\delta(0, \xi(0))$ so that $a^{\delta,\chi_0}(0, x) = a^\delta(0, x) - a^\delta(0, \xi(0))$.
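The effect of the initial gauge choice $\chi_0$ can be checked numerically. The sketch below is illustrative only: `sample_potential` is a hypothetical smooth vector potential standing in for $a^\delta(0, \cdot)$, and the gradient of $\chi_0$ is taken by central finite differences. It confirms that the spatial part of the transformed potential equals $a^\delta(0, x) - a^\delta(0, \xi(0))$ and so vanishes at the soliton centre.

```python
def sample_potential(x):
    """Hypothetical smooth vector potential a(0, x); chosen for illustration only."""
    return [x[1] * x[2], -x[0], 0.5 * x[0] * x[0]]

def chi0(x, xi, a=sample_potential):
    """chi_0(x) = -(x - xi) . a(0, xi), which is linear in x."""
    a_xi = a(xi)
    return -sum((p - q) * c for p, q, c in zip(x, xi, a_xi))

def gradient(f, x, h=1e-6):
    """Central finite-difference gradient of a scalar function on R^3."""
    grad = []
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2.0 * h))
    return grad

def transformed_potential(x, xi, a=sample_potential):
    """Spatial part of a^{chi_0}(0, x) = a(0, x) + grad chi_0(x)."""
    g = gradient(lambda y: chi0(y, xi, a), x)
    return [p + q for p, q in zip(a(x), g)]
```

Because $\chi_0$ is linear in $x$ it is harmonic, so the transformation leaves the divergence of the potential unchanged.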
1.4.1. Stability in the presence of an external field

The following theorem asserts the long time stability, under the influence of an external field, of stable solitons to (53). Recall that the stable solitons are those parametrised by $\lambda \in \tilde O_{\rm stab}$, so that (39) and hence (POS) hold, and they are stable by Theorems 8 and 9 in the absence of an external field.

Theorem 10. Assume that the nonlinearity satisfies the hypotheses (H1)–(H3), and also is such that the hypotheses (SOL), (KER) and (POS) in Sec. 1.3.2 hold. In addition, assume that the external field satisfies the assumptions in Sec. 1.2. Suppose further that the scaling parameters satisfy $\delta^2 = o(e)$, $e = o(1)$ and $e = o(\delta)$.

(i) Consider initial data of the form $\Psi(0) = (\phi_{SC,e}(\lambda(0)), \psi_{SC,e}(\lambda(0)), A_i(0), E_i(0))$ where $\lambda(0) \in \tilde O_{\rm stab}$ corresponds to a stable soliton (which verifies (POS)). It follows that, if $e$ is sufficiently small and

$$\|\Psi(0) - \Psi^{\delta,\chi_0}_{\rm ext}(0) - \Psi_{SC,e}(\lambda(0))\|^2_H = o(e), \tag{56}$$

there exists

• a positive number $T_0 > 0$, independent of $e$,
• a $C^1$ gauge transformation $\chi(t, x)$ defined in (63), linear in $x$ at each time $t$, satisfying $\chi(0, x) = \chi_0(x)$,
• a curve $\lambda(t) \in C^1([0, T_0/|e|], \tilde O_{\rm stab})$, and
• a distributional solution $\Psi(t)$ of (53),

such that

$$\Psi(t) - \Psi^{\delta,\chi}_{\rm ext}(t) \in C\bigl([0, T_0/|e|]; H\bigr)$$

and

$$\sup_{t\in[0, T_0/|e|]} \|\Psi(t) - \Psi^{\delta,\chi}_{\rm ext}(t) - \Psi_{SC,e}(\lambda(t))\|^2_H = o(e). \tag{57}$$

Furthermore, $\lambda(t)$ satisfies a system of ordinary differential equations given by (116) with $|\partial_t\lambda - V_0(\lambda)| = O(e)$. The time component of the potential $A_0$ is determined by the Coulomb condition and the Gauss law, (71), and has properties detailed in Sec. 2.

(ii) More generally, the same conclusions hold for initial data sufficiently close to a stable soliton in an appropriate sense: see Sec. 2.4.3 for a precise statement.

This theorem is proved in Sec. 2.

Remark 11. As explained in Sec. 1.3.2, if the nonlinear potential satisfies (29)–(32), (U1), (U2), (S1) above in addition to (H1)–(H3) then the conditions (SOL), (KER) and (POS) all hold.
1.4.2. Motion in the presence of an external field: The Lorentz force

The previous theorem provides ordinary differential equations (116) which determine the evolution of the soliton parameters. A detailed investigation of these equations allows us to deduce an equation of motion for the soliton, which is expected to be the Lorentz force law for a moving charge, at least to highest order in $e$. As remarked earlier, if the analysis were carried out explicitly to higher order in $e$, corrections would be expected to appear, in particular due to the back reaction of the soliton's electromagnetic field on itself. However, these are not expected to appear in the $O(e)$ force law, and the following theorem validates this:

Theorem 12. Assume the hypotheses and conclusions of Theorem 10 hold, and let $\lambda = (\omega, \theta, \xi, u)$ be the parameters of the soliton $\Psi_{SC,e}(\lambda)$. Then, on the interval $[0, T_0/|e|]$, the center and velocity of the soliton evolve according to the equations:

$$\frac{d}{dt}\xi = u + o(e), \tag{58}$$

$$\frac{d}{dt}\bigl(M_S\,\gamma(u)\,u\bigr) = e\,Q_S\bigl(E^\delta_{\rm ext}(t, \xi) + u \times B^\delta_{\rm ext}(t, \xi)\bigr) + o(e), \tag{59}$$

where the mass of the soliton, $M_S$, is given by

$$M_S = \frac{1}{3}\|\nabla f_\omega\|^2_{L^2} + \omega^2\|f_\omega\|^2_{L^2}, \tag{60}$$

and the charge of the soliton is given by

$$Q_S = \int (\omega - e\alpha)\,f^2_{\omega,e}\,dx. \tag{61}$$
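To get a feel for the effective dynamics (58)–(59), one can integrate the leading-order equations numerically with the $o(e)$ remainders dropped. The sketch below is only an illustration under that assumption: it treats $M_S$ and the product $e Q_S$ as given constants (`mass`, `charge`), uses the momentum $p = M_S\gamma(u)u$ as the state variable, and advances $(\xi, p)$ with a classical fourth-order Runge–Kutta step. The function names and the field callables `e_field(t, x)` and `b_field(t, x)` are hypothetical.

```python
import math

def cross(a, b):
    """Vector product a x b in R^3."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def velocity_from_momentum(p, mass):
    """Invert p = M gamma(u) u, giving u = p / sqrt(M^2 + |p|^2)."""
    norm_sq = sum(c * c for c in p)
    return [c / math.sqrt(mass * mass + norm_sq) for c in p]

def momentum_from_velocity(u, mass):
    """p = M gamma(u) u for |u| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - sum(c * c for c in u))
    return [mass * gamma * c for c in u]

def rhs(t, xi, p, mass, charge, e_field, b_field):
    """Right-hand sides of (58)-(59) with the o(e) terms dropped."""
    u = velocity_from_momentum(p, mass)
    E, B = e_field(t, xi), b_field(t, xi)
    u_cross_b = cross(u, B)
    force = [charge * (E[i] + u_cross_b[i]) for i in range(3)]
    return u, force

def rk4_step(t, xi, p, dt, mass, charge, e_field, b_field):
    """One classical Runge-Kutta step for the pair (xi, p)."""
    def shifted(base, slope, h):
        return [b + h * s for b, s in zip(base, slope)]
    args = (mass, charge, e_field, b_field)
    k1x, k1p = rhs(t, xi, p, *args)
    k2x, k2p = rhs(t + dt / 2, shifted(xi, k1x, dt / 2), shifted(p, k1p, dt / 2), *args)
    k3x, k3p = rhs(t + dt / 2, shifted(xi, k2x, dt / 2), shifted(p, k2p, dt / 2), *args)
    k4x, k4p = rhs(t + dt, shifted(xi, k3x, dt), shifted(p, k3p, dt), *args)
    new_xi = [xi[i] + dt / 6 * (k1x[i] + 2 * k2x[i] + 2 * k3x[i] + k4x[i]) for i in range(3)]
    new_p = [p[i] + dt / 6 * (k1p[i] + 2 * k2p[i] + 2 * k3p[i] + k4p[i]) for i in range(3)]
    return new_xi, new_p

def integrate_lorentz(xi0, u0, mass, charge, e_field, b_field, t_end, steps):
    """Integrate from t = 0 to t_end; returns (xi, p, u) at the final time."""
    dt = t_end / steps
    xi, p, t = list(xi0), momentum_from_velocity(u0, mass), 0.0
    for _ in range(steps):
        xi, p = rk4_step(t, xi, p, dt, mass, charge, e_field, b_field)
        t += dt
    return xi, p, velocity_from_momentum(p, mass)
```

Working with $p$ rather than $u$ keeps $|u| < 1$ automatically, so the trajectory stays inside the parameter set (44) regardless of step size.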
This theorem is proved in Sec. 4.

Remark 13. Observe that, since we have scaled the external field so that $E^\delta_{\rm ext}$ and $B^\delta_{\rm ext}$ are independent of $e$, the soliton undergoes $O(1)$ motion on the time interval $[0, T_0/|e|]$ according to the Lorentz force law.

2. Stability: Proof of Theorem 10

In this section, we explain the proof of Theorem 10, making use of results which are proved separately in Secs. 3 and 5. Throughout this section, the hypotheses of Theorem 10 are understood to hold without explicit mention. Also we may assume, without loss of generality, that the solution is smooth in the course of the following calculations: since finite energy solutions can be approximated by smooth ones by (WP2) in Sec. 1.2.1, and all the bounds we use depend only on the energy norm, this implies the result for finite energy initial data as in Theorem 10.

2.1. Beginning of proof of Theorem 10

2.1.1. Ansatz for the solution

We make an ansatz for a solution $\Psi(t) = \Psi^{\delta,\chi}_{\rm ext}(t) + \Psi_{SC,e}(\lambda(t)) + \text{Perturbation}$, which is close to a soliton with time varying (modulating) parameters $\lambda(t)$, in the
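background external field $\Psi^{\delta,\chi}_{\rm ext}(t)$. Explicitly the ansatz reads:

$$\begin{pmatrix} \phi(t, x) \\ \psi(t, x) \\ A_\mu(t, x) \\ E_j(t, x) \end{pmatrix} = \begin{pmatrix} \phi_{SC,e}(\lambda(t)) + e^{i\Theta_C} v \\ \psi_{SC,e}(\lambda(t)) + e^{i\Theta_C} w \\ (A_{SC,e})_\mu(\lambda(t)) + a^{\delta,\chi}_\mu + \tilde A_\mu \\ (E_{SC,e})_j(\lambda(t)) + (E^\delta_{\rm ext})_j + \tilde E_j \end{pmatrix}. \tag{62}$$

Notice that we have included here an ansatz for the temporal part of the potential $A_0$. Since we have imposed the Coulomb gauge throughout, it follows that $\operatorname{div}\tilde A = 0$. The choice of the gauge transformation $\chi$ is:

$$\chi(t, x) = -(x - \xi)\cdot a^\delta(t, \xi) - \int_0^t \bigl(a^\delta_0(s, \xi(s)) + \dot\xi(s)\cdot a^\delta(s, \xi)\bigr)\,ds. \tag{63}$$

This is chosen so that the gauge transformed external potentials vanish along the world line of the soliton:

$$a^{\delta,\chi}_\mu = a^\delta_\mu + \partial_\mu\chi, \qquad a^{\delta,\chi}_\mu(t, \xi(t)) = 0. \tag{64}$$

These imply

$$a^{\delta,\chi}_0(t, x) = a^\delta_0(t, x) - a^\delta_0(t, \xi(t)) - (x - \xi(t))\cdot\dot a^\delta(t, \xi(t)), \qquad a^{\delta,\chi}(t, x) = a^\delta(t, x) - a^\delta(t, \xi(t)), \tag{65}$$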
exhibiting the claimed vanishing of $a^{\delta,\chi}_\mu$ along the soliton's world line. This allows certain quantities to be proved to be bounded in the course of the proof. Notice that $\chi$ is linear (and so harmonic) in $x$, and so preserves the Coulomb condition (see Remark 2). There is clearly a redundancy in our ansatz, in that $\lambda(t)$ is so far completely undetermined. The appropriate choice of $\lambda(t)$ is dictated by the requirement that the solution be close to a soliton determined by the parameters $\lambda(t)$, i.e. by the requirement that we have good bounds for the field perturbation $(v, w, \tilde A, \tilde E)$. This is carried out in Sec. 3, with the main results summarized next in Sec. 2.2. First we write explicitly the equations for $(v, w, \tilde A, \tilde E)$, and give some bounds for the inhomogeneous terms in these equations.

2.1.2. Equations for the perturbations of the fields

$$\partial_t v + i(\omega\gamma + h)v = w + \jmath_1, \tag{66}$$

$$\partial_t w + i(\omega\gamma + h)w = -M_\lambda v + \jmath_2 + \mathcal{N}, \tag{67}$$

$$\partial_t\tilde A = \tilde E + \jmath_3, \tag{68}$$

$$\partial_t\tilde E = \Delta\tilde A + \jmath_4, \tag{69}$$
where the inhomogeneous terms $h, \jmath_1, \dots, \jmath_4$ and $\mathcal{N}$ are defined in Sec. 2.1.3, and $M_\lambda$ is the operator

$$M_\lambda v = (-\Delta_x + m^2 + \gamma^2\omega^2|u|^2)v + 2i\omega\gamma\,u\cdot\nabla_x v - \beta(f_\omega)v - f_\omega\beta'(f_\omega)\,\Re v. \tag{70}$$

The last two terms have been chosen to depend on the e = 0 profile function $f_\omega$, rather than $f_{\omega,e}$, so that it is possible to make direct use of the stability assumption (POS) in Sec. 1.3.4. (This choice is reflected in the expression for the inhomogeneous term $\mathcal{N}$ in (76) and its corresponding estimate in (95).) In addition to these evolution equations, the fields are constrained to satisfy the Gauss law (27), which takes the form:

$$\operatorname{div}\tilde E = -\Delta\tilde A_0 = \langle ei\,e^{-i\Theta_C}\phi_{SC,e}, w\rangle + \langle eiv, e^{-i\Theta_C}\psi_{SC,e} + w\rangle. \tag{71}$$

Under finite energy assumptions this equation has a unique solution with $\tilde A_0 \in \dot H^1$; this defines uniquely $\tilde A_0$ as a non-local function of $v, w, \lambda$ at each time. Estimates for $\tilde A_0$ are given in Lemma 39.

2.1.3. Inhomogeneous terms in the field perturbation equations (66)–(69)

The following quantity appears in both (66) and (67):

$$h = \dot\Theta_C - \omega\gamma - e(A_{SC,e})_0 - e\,a^{\delta,\chi}_0 - e\tilde A_0. \tag{72}$$
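The inhomogeneous term in (66) is $\jmath_1 = \jmath_1^{I} + \jmath_1^{II} + \jmath_1^{0}$, where

$$\jmath_1^{I} = -(\dot\lambda - V_0(\lambda))\cdot e^{-i\Theta_C}\partial_\lambda\phi_{SC,e}, \tag{73}$$

$$\jmath_1^{II} = ie\,a^{\delta,\chi}_0 f_{\omega,e}, \tag{74}$$

$$\jmath_1^{0} = ie\tilde A_0 f_{\omega,e}. \tag{75}$$

The inhomogeneous terms in (67) are

$$\mathcal{N}(f_{\omega,e}, f_\omega, v) = \beta(|f_{\omega,e} + v|)(f_{\omega,e} + v) - \beta(|f_{\omega,e}|)f_{\omega,e} - \beta(|f_\omega|)v - f_\omega\beta'(|f_\omega|)\,\Re v, \tag{76}$$

and $\jmath_2 = \jmath_2^{I} + \jmath_2^{II} + \jmath_2^{III} + \jmath_2^{IV} + \jmath_2^{0}$ where

$$\jmath_2^{I} = -(\dot\lambda - V_0(\lambda))\cdot e^{-i\Theta_C}\partial_\lambda\psi_{SC,e}, \qquad \jmath_2^{II} = eRf_{\omega,e} + ie\,a^{\delta,\chi}_0 e^{-i\Theta_C}\psi_{SC,e}, \tag{77}$$

$$\jmath_2^{III} = eRv + Sv, \tag{78}$$

$$\jmath_2^{IV} = e^{-i\Theta_C}\bigl[\Delta_A(e^{i\Theta_C}v) + (\Delta_A - \Delta_{A_{SC,e}})\phi_{SC,e}\bigr] - \jmath_2^{II} - \jmath_2^{III}, \tag{79}$$

$$\jmath_2^{0} = ie\tilde A_0 e^{-i\Theta_C}\psi_{SC,e}. \tag{80}$$

Here, the operators $R$, $S$ are given by

$$Rv = 2i(a^{\delta,\chi})\cdot\bigl(i\gamma(\omega - e\alpha_{\omega,e})u - \nabla\bigr)v - e|a^{\delta,\chi}|^2 v,$$

$$Sv = 2ie\alpha_{\omega,e}\gamma\,u\cdot\nabla v + ie\gamma(u\cdot\nabla\alpha_{\omega,e})v + 2e\gamma^2|u|^2\omega\alpha_{\omega,e}v - e^2(\gamma\alpha_{\omega,e}|u|)^2 v. \tag{81}$$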
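(In verifying these formulas, it is helpful to note that by the exact solutions in Sec. 1.3.4

$$e^{-i\Theta_C}(\nabla - ieA_{SC,e} - ie\,a^{\delta,\chi})e^{i\Theta_C}v = \nabla v - i\bigl(\gamma(\omega - e\alpha_{\omega,e})u + e\,a^{\delta,\chi}\bigr)v,$$

and a similar formula for the second derivatives.) The inhomogeneous term in (68) is $\jmath_3 = \jmath_3^{I} + \jmath_3^{0}$ where

$$\jmath_3^{I} = -(\dot\lambda - V_0(\lambda))\cdot\partial_\lambda A_{SC,e}, \qquad \jmath_3^{0} = \nabla\tilde A_0, \tag{82}$$

and in (69) we have $\jmath_4 = \jmath_4^{I} + \jmath_4^{II} + \jmath_4^{III} + \jmath_4^{IV} + \jmath_4^{0}$, with $\jmath_4^{0} = 0$ and

$$\jmath_4^{I} = -(\dot\lambda - V_0(\lambda))\cdot\partial_\lambda E_{SC,e}, \tag{83}$$

$$\jmath_4^{II} = -e^2|f_{\omega,e}|^2 a^{\delta,\chi}, \tag{84}$$

$$\jmath_4^{III} = \langle ei\,e^{i\Theta_C}v, \nabla_A(\phi_{SC,e} + e^{i\Theta_C}v)\rangle - e^2|f_{\omega,e}|^2\tilde A, \tag{85}$$

$$\jmath_4^{IV} = \langle ei\,\phi_{SC,e}, \nabla_A(e^{i\Theta_C}v)\rangle. \tag{86}$$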
To clarify the structure of these terms it is helpful to insert the ansatz (62) into the Hamiltonian (26) and write $H - \mathcal{V} = \sum_{n=0}^{4}\hat H^{(n)}$, where $\hat H^{(n)}$ has homogeneity $n$ in $(v, \tilde A)$. (The terms of degree larger than two arise solely from $\frac{1}{2}|\nabla_A\phi|^2$.) Then the pieces of $\jmath_2$ (respectively, $\jmath_4$) which are of degree $n \in \{1, 2, 3\}$ in $(v, \tilde A)$ arise, respectively, as the Fréchet derivatives $-D_v\hat H^{(n+1)}$ (respectively, $-D_{\tilde A}\hat H^{(n+1)}$). The nonlinear potential $\mathcal{V}$ only appears through $M_\lambda v$ and $\mathcal{N}$ in (67). With this understood we now introduce notation for the various terms arising in (67) and (69), organized according to their homogeneity. Let $\hat H = \sum_{n=2}^{4}\hat H^{(n)}$; then in (67) the corresponding terms are

$$-D_v\hat H = -M^0_\lambda v + \jmath_2^{III} + \jmath_2^{IV},$$

where

$$M^0_\lambda v = (-\Delta_x + \gamma^2\omega^2|u|^2)v + 2i\omega\gamma\,u\cdot\nabla_x v = -e^{-i\Theta}\Delta(e^{i\Theta}v) \tag{87}$$

with $\Theta$ the soliton phase factor in (43). Notice that the operator $M^0_\lambda$ consists of those terms in (70) which do not arise from the $\mathcal{V}$ term in the energy, because we have so far excluded this term in our expansion (which is of $H - \mathcal{V}$). However, it is convenient to put back in the quadratic parts of the Taylor expansion of $\mathcal{V}$, but expanded around $f_\omega$ (the e = 0 soliton), so as to obtain the $M_\lambda$ operator which appears in (67). Thus we let

$$\tilde H = \hat H + \tfrac{1}{2}D^2\mathcal{V}(f_\omega)(v, v) = \hat H + \tfrac{1}{2}\bigl[m^2|v|^2 - \beta(f_\omega)|v|^2 - f_\omega\beta'(f_\omega)(\Re v)^2\bigr],$$

so that, using the same notation for the homogeneous components of $\tilde H$ as for $\hat H$, we have $\tilde H^{(n)} = \hat H^{(n)}$ for $n > 2$ and $\tilde H^{(2)} - \hat H^{(2)} = \frac{1}{2}D^2\mathcal{V}(f_\omega)(v, v)$. In (69) the
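corresponding terms are

$$-D_{\tilde A}\tilde H = \Delta\tilde A + \jmath_4^{III} + \jmath_4^{IV}.$$

To write these terms explicitly we introduce a multilinear notation as follows:

$$(-D_v\tilde H, -D_{\tilde A}\tilde H) = B^{(1)}(v, \tilde A) + B^{(2)}(v, \tilde A) + B^{(3)}(v, \tilde A), \tag{88}$$

where $B^{(n)}(v, \tilde A)$ is a homogeneous degree $n$ function of $(v, \tilde A)$, as indicated by the superscript. We will define $B^{(1)} : H^1(\mathbb{R}^3; \mathbb{C}) \oplus \dot H^1(\mathbb{R}^3; \mathbb{R}^3) \to H^{-1}(\mathbb{R}^3; \mathbb{C}) \oplus \dot H^{-1}(\mathbb{R}^3; \mathbb{R}^3)$, where by $H^{-1}$ (respectively, $\dot H^{-1}$) we mean the dual space of $H^1$ (respectively, $\dot H^1$). Explicitly:

$$B^{(1)}(v, \tilde A) = (B_{11}v + B_{12}\tilde A,\ B_{21}v + B_{22}\tilde A), \tag{89}$$

where

$$B_{11}v = -M_\lambda v + eRv + Sv, \tag{90}$$

and the operators $R$ and $S$ are as just defined. Next

$$B_{12}\tilde A = -2ef_{\omega,e}\bigl(\gamma(\omega - e\alpha_{\omega,e})u + e\,a^{\delta,\chi}\bigr)\cdot\tilde A - 2ie\tilde A\cdot\nabla f_{\omega,e},$$

$$B_{21}v = -2ef_{\omega,e}\bigl(\gamma(\omega - e\alpha_{\omega,e})u + e\,a^{\delta,\chi}\bigr)\Re v + \langle iev, \nabla f_{\omega,e}\rangle + \langle ief_{\omega,e}, \nabla v\rangle,$$

and finally, $B_{22}\tilde A = \Delta\tilde A - e^2 f^2_{\omega,e}\tilde A$. Since $\operatorname{div}\tilde A = 0$ integration by parts yields $\langle\tilde A, B_{21}v\rangle_{L^2} = \langle v, B_{12}\tilde A\rangle_{L^2}$, and

$$\tilde H^{(2)} = -\tfrac{1}{2}\langle(v, \tilde A), B^{(1)}(v, \tilde A)\rangle_{L^2} = \frac{1}{2}\int\Bigl[|\nabla\tilde A|^2 + e^2|f_{\omega,e}\tilde A|^2 + \bigl|\nabla v - i(\gamma(\omega - e\alpha_{\omega,e})u + e\,a^{\delta,\chi})v\bigr|^2 + \bigl(m^2|v|^2 - \beta(f_\omega)|v|^2 - f_\omega\beta'(f_\omega)(\Re v)^2\bigr) + \bigl\langle -2ie\tilde Af_{\omega,e},\ \nabla v - i(\gamma(\omega - e\alpha_{\omega,e})u + e\,a^{\delta,\chi})v\bigr\rangle + \bigl\langle -2ie\tilde Av,\ \nabla f_{\omega,e} - i(\gamma(\omega - e\alpha_{\omega,e})u + e\,a^{\delta,\chi})f_{\omega,e}\bigr\rangle\Bigr]dx. \tag{91}$$

Next, the quadratic terms in the equations can be expressed in terms of a rank three symmetric tensor $B^{(2)} : (H^1(\mathbb{R}^3; \mathbb{C}) \oplus \dot H^1(\mathbb{R}^3; \mathbb{R}^3))^2 \to H^{-1}(\mathbb{R}^3; \mathbb{C}) \oplus \dot H^{-1}(\mathbb{R}^3; \mathbb{R}^3)$ which is given explicitly by

$$B^{(2)}(v, \tilde A) = \begin{pmatrix} B_{111}[v, v] + B_{112}[v, \tilde A] + B_{121}[\tilde A, v] + B_{122}[\tilde A, \tilde A] \\ B_{211}[v, v] + B_{212}[v, \tilde A] + B_{221}[\tilde A, v] + B_{222}[\tilde A, \tilde A] \end{pmatrix},$$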
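where $B_{111} = B_{222} = 0$, and

$$B_{112}[v, \tilde A] = -ev\bigl(\gamma(\omega - e\alpha_{\omega,e})u + e\,a^{\delta,\chi}\bigr)\cdot\tilde A - ie\nabla v\cdot\tilde A,$$

$$B_{121}[\tilde A, v] = -ev\bigl(\gamma(\omega - e\alpha_{\omega,e})u + e\,a^{\delta,\chi}\bigr)\cdot\tilde A - ie\nabla v\cdot\tilde A,$$

$$B_{122}[\tilde A, \tilde A] = -e^2 f_{\omega,e}|\tilde A|^2,$$

and

$$B_{211}[v, v] = -e\bigl(\gamma(\omega - e\alpha_{\omega,e})u + e\,a^{\delta,\chi}\bigr)|v|^2 + \langle iev, \nabla v\rangle,$$

along with

$$B_{221}[\tilde A, v] = B_{212}[v, \tilde A] = -e^2\langle f_{\omega,e}, v\rangle\tilde A.$$

These terms are obtained by differentiation of the cubic part of the expanded Hamiltonian, which, by homogeneity of degree three, is

$$\tilde H^{(3)} = -\tfrac{1}{3}\langle(v, \tilde A), B^{(2)}(v, \tilde A)\rangle_{L^2} = \bigl\langle\nabla v - i\gamma u(\omega - e\alpha_{\omega,e})v - ie\,a^{\delta,\chi}v,\ -ie\tilde Av\bigr\rangle_{L^2} + e^2\langle\tilde Af_{\omega,e}, \tilde Av\rangle_{L^2}.$$

Finally the cubic terms in the equations arise by differentiation of the quartic part of the Hamiltonian

$$\tilde H^{(4)} = -\tfrac{1}{4}\langle(v, \tilde A), B^{(3)}(v, \tilde A)\rangle_{L^2} = \frac{e^2}{2}\int|\tilde A|^2|v|^2\,dx,$$

and are determined by a rank-four tensor, $B^{(3)} : (H^1(\mathbb{R}^3; \mathbb{C}) \oplus \dot H^1(\mathbb{R}^3; \mathbb{R}^3))^3 \to H^{-1}(\mathbb{R}^3; \mathbb{C}) \oplus \dot H^{-1}(\mathbb{R}^3; \mathbb{R}^3)$ which, using an identical notation to the rank three case, has as its only non-zero entries

$$B_{1122} = \frac{-e^2}{3} \tag{92}$$

and the other entries obtained by permuting the indices.

2.1.4. Some bounds for the inhomogeneous terms

We record here some simple bounds for the quantities defined above:

• $\|\jmath_1^{II}\|_{L^p} + \|\jmath_2^{II}\|_{L^p} = O(e)$ and $\|\jmath_4^{II}\|_{L^p} = O(e^2)$ for every $p \in [1, \infty]$ by (188) and (189),
• $\|hf_{\omega,e}\|_{L^p} = O(e + |\dot\lambda - V_0| + e\|\tilde A_0\|_{L^q})$, for any $q > 3$, which can be read off from (72), using results from Secs. A.1.2, A.2.2 and A.2.1, and the assumptions on the applied fields. $\tilde A_0$ can be bounded in $L^q$, $q > 3$, by Sec. A.2.2.
• It is possible to write $h = h_1 - e\tilde A_0$ with $\|\nabla h_1\|_{L^\infty} = O(e + |\dot\lambda - V_0|)$ and $\nabla\tilde A_0$ bounded in $L^p$, $p \in (3/2, 3]$, by Sec. A.2.2.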
Finally, consider $\mathcal{N}$: by Lemma 35 we can write

$$\mathcal{N}(f_{\omega,e}, f_\omega, v) = \beta(|f_{\omega,e} + v|)(f_{\omega,e} + v) - \beta(|f_{\omega,e}|)f_{\omega,e} - \beta(|f_{\omega,e}|)v - f_{\omega,e}\beta'(|f_{\omega,e}|)\,\Re v + O(e^2|v|) = \mathcal{N}(f_{\omega,e}, f_{\omega,e}, v) + O(e^2|v|). \tag{93}$$

Using the condition (12), or more generally (13), and the fundamental theorem of calculus, we can estimate

$$|\mathcal{N}(f, f, v)| \le c(1 + |f|^3)(|v|^2 + |v|^5), \tag{94}$$

for any $f$. Therefore, choosing $f = f_{\omega,e}$, which is bounded, and using (93) we have

$$|\mathcal{N}(f_{\omega,e}, f_\omega, v)| \le c_1(|v|^2 + |v|^5) + c_2 e^2|v|. \tag{95}$$
2.2. Results from modulation theory

The assumptions on the nonlinearity under which we are working ensure that the Cauchy problem for (53) is locally well-posed in the sense of (WP1) and (WP2), see Sec. 1.2.1. Since so far $\chi$ is unknown (since $\lambda(t)$ and hence $\xi(t)$ are not yet determined) we cannot solve directly for $\Psi = (\phi, \psi, A_j, E_j)$ in the background potential $a^{\delta,\chi}_\mu$. Instead we exploit gauge invariance and solve for

$$\hat\Psi = (\hat\phi, \hat\psi, \hat A_j, E_j) \equiv (e^{-ie\chi}\phi, e^{-ie\chi}\psi, A_j - \partial_j\chi, E_j) = e^{-ie\chi}\cdot\Psi \tag{96}$$

in the potential $a^\delta_\mu$, which is known. (Since $\chi(t, x)$ is harmonic in $x$ this gauge transformation preserves both the Eqs. (53) and the Coulomb gauge condition (see Remark 2).) By Proposition 1 on local well-posedness, there exists a time $T_{\rm loc} > 0$ and a unique solution to (28) with

$$(\hat\Psi - \Psi^\delta_{\rm ext}) \in C([0, T_{\rm loc}]; H), \tag{97}$$

with initial data

$$\hat\Psi(0) = (e^{-ie\chi_0}\phi(0), e^{-ie\chi_0}\psi(0), A_j(0) - \partial_j\chi_0, E_j(0)). \tag{98}$$

Once $\lambda(t) = (\omega(t), \theta(t), \xi(t), u(t))$, and hence $\chi(t)$, is determined, then $\Psi(t)$ is obtained from $\hat\Psi(t)$ by the above relation. As remarked previously, by Proposition 1 these solutions can be approximated in energy norm by smooth solutions evolving in any of the spaces $H_s$ of (9) (after subtracting off the background field). Thus, although the statement and proof of Theorem 10 involve only the energy norm, it is permissible to assume smoothness of the solutions throughout the proof. We now state a theorem which asserts that it is possible to choose the soliton parameters $\lambda(t)$ in such a way that the quantity $W$ defined in (103) is equivalent to the energy norm. This is achieved by choosing $\lambda(t)$ in such a way that the pair $(v, w)$ satisfies some conditions which are equivalent to those in (49) (after adjusting the phase).
Theorem 14. (a) Let $\hat\Psi$ be a solution to the Cauchy problem for (53) satisfying (97) with initial data (98) with $\Psi(0)$ as described in Theorem 10. Then, for sufficiently small $e$, there exist $T_1 > 0$ and $\lambda \in C^1([0, T_1]; \tilde O_{\rm stab})$ with the following properties. On the interval $[0, T_1]$ define $\Psi(t) = (\phi(t), \psi(t), A_j(t), E_j(t))$ by (63) and (96). Then it is possible to write $\Psi$ in the form (62) where $v, w$ are constrained to satisfy

$$\Omega_0\bigl((v, w), \widetilde{\partial_\lambda}(\phi_{S,0}, \psi_{S,0})\bigr) = 0, \tag{99}$$

where we define

$$\widetilde{\partial_\lambda}\phi_{S,0} = e^{-i\Theta}\partial_\lambda\phi_{S,0}, \tag{100}$$

and likewise for $\widetilde{\partial_\lambda}\psi_{S,0}$. Furthermore, the function $t \mapsto \lambda(t)$ solves a system of differential equations (116). The condition (99) is equivalent to requiring $(\phi - \phi_{SC,e}, \psi - \psi_{SC,e}) \in N_\lambda$.

(b) If $e$ and $\|(v, w, \tilde A, \tilde E)\|_H$ are sufficiently small, then

$$|\dot\lambda - V_0(\lambda)| = O\bigl(e + \|(v, w, \tilde A, \tilde E)\|^2_H\bigr), \tag{101}$$

so that, in particular, if $\|(v, w, \tilde A, \tilde E)\|^2_H = O(e)$ then

$$|\dot\lambda - V_0(\lambda)| = O(e). \tag{102}$$

Proof. This is a consequence of the lemmas in Sec. 3.

2.3. The main growth estimate

As discussed in Sec. 1.3.4, the natural quantity for stability and perturbation analyses of the solitons (51) is the Hessian of the augmented Hamiltonian. Here we modify this quantity to take account of the phase shifts in (62), and discard terms which are formally $O(e)$, leading us to the introduction of the following quadratic form:

$$W(v, w, \tilde A, \tilde E; \lambda) = K + \Xi, \tag{103}$$

where

$$K(\tilde A, \tilde E; \lambda) = \tfrac{1}{2}\bigl(\|\tilde E\|^2_{L^2} + \|\nabla\times\tilde A\|^2_{L^2} + 2\langle\tilde E, (u\cdot\nabla)\tilde A\rangle_{L^2}\bigr), \tag{104}$$

and

$$\Xi(v, w; \lambda) = \tfrac{1}{2}\bigl(\|w - i\gamma\omega v\|^2_{L^2} + \langle v, (M_\lambda - \gamma^2\omega^2)v\rangle_{L^2} + 2\langle w, u\cdot\nabla v\rangle_{L^2}\bigr), \tag{105}$$

where $M_\lambda$ is as defined in (70).

Theorem 15 (Equivalence of W and Energy Norm). Suppose that the nonlinearity is such that (H1)–(H3) and (SOL), (KER) and (POS) hold. Suppose further that $\lambda$ lies in a compact subset, $K$, of $\tilde O_{\rm stab}$. Then the quadratic form $W$ just
defined, is equivalent uniformly on $K$ to $\|(v, w, \tilde A, \tilde E)\|^2_H$ provided that $(v, w)$ satisfy the constraints (99).

Proof. This is essentially [24, Theorem 2.7]. Since there is no coupling in $W$ between $(v, w)$ and $(\tilde A, \tilde E)$, it is only necessary to show separately the equivalence of $\Xi$ and $K$ to the corresponding parts of $\|(v, w, \tilde A, \tilde E)\|^2_H$. For $K$ this can be achieved by completing the square (since $\|\nabla\times\tilde A\|_{L^2} = \|\tilde A\|_{\dot H^1}$ by the Coulomb condition), while for $\Xi$ it is an immediate consequence of (POS).

Theorem 16 (Main Growth Estimate). Assume given a solution to the Cauchy problem for (28) for which Theorem 14 applies on an interval $[0, T_2/|e|]$ for some fixed positive $T_2$. Assume that $\lambda(t) \in K$, a compact subset of $\tilde O_{\rm stab}$, so that by Theorem 15 there exists $c_1 > 0$ such that

$$\frac{1}{c_1}W \le \|(v, w, \tilde A, \tilde E)\|^2_H \le c_1 W, \tag{106}$$

on $[0, T_2/|e|]$. Assume further that there exist $c_2 > 0$, $c_3 > 0$ such that $\delta^2 \le c_2|e|$ and $W \le c_3|e|$, and that $e = o(\delta)$. It follows that, for sufficiently small $e$, there exists $c_4 > 0$ such that, on $[0, T_2/|e|]$,

$$W(t) \le c_4\bigl(W(0) + e^2 + \delta^2\bigr)\exp(c_4|e|t). \tag{107}$$
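The point of the exponent $c_4|e|t$ in (107) is that it stays bounded on the long time scale $t \sim 1/|e|$. The following short worked consequence is a sketch under the hypotheses of Theorem 16; the constant $C_{T_2}$ is a name introduced here for illustration only.

```latex
% On [0, T_2/|e|] we have |e| t <= T_2, so the exponential factor in (107) is bounded:
\sup_{0 \le t \le T_2/|e|} W(t)
  \;\le\; c_4\, e^{c_4 T_2}\,\bigl(W(0) + e^2 + \delta^2\bigr)
  \;=:\; C_{T_2}\,\bigl(W(0) + e^2 + \delta^2\bigr).
% If in addition W(0) = o(e) and \delta^2 = o(e), then, since e^2 = o(e),
\sup_{0 \le t \le T_2/|e|} W(t) = o(e) \qquad (e \to 0).
```

In particular the bound is uniform in $e$, which is what allows a continuation argument to extend a short-time solution to the whole interval.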
Proof. See Sec. 5.

2.4. Completion of the proof of Theorem 10

2.4.1. Local solution verifying constraints

For simplicity of exposition we first prove part (i) of the theorem, i.e. we consider initial data $\Psi(0)$ consisting of an exact soliton as in (51) determined by parameters $\lambda(0) = (\theta(0), \omega(0), u(0), \xi(0)) \in \tilde O_{\rm stab}$, with $\omega(0)$ satisfying the stability condition. On account of the applied fields there will be a non-trivial evolution starting from this initial value. Applying the local existence Theorem 1 and Theorem 14 as in Sec. 2.2, we deduce the existence of a positive time $T_1 > 0$ such that on the interval $[0, T_1]$ there is a solution to the Cauchy problem which can be written as in (62) where $v(0) = 0 = w(0)$, and $v(t), w(t)$ satisfy the constraints (109) (or (99)), and $t \mapsto \lambda(t)$ solves (116). We may assume that $\lambda(t) \in K$, a fixed compact subset of $\tilde O_{\rm stab}$, so that (106) holds.

2.4.2. Growth of the energy norm

Since we have a local solution satisfying the constraints (99) we can assume that the conclusions of Theorem 15 hold. Furthermore, by continuity we may assume (making $T_1$ smaller if need be) that on this interval $W(t) \le c_3|e|$, and (106) holds.
May 12, 2009 13:21 WSPC/148-RMP
J070-00366
Effective Dynamics for Solitons
487
Now apply the growth estimate in Theorem 16:
\[ W(t) \le c_4 \bigl(W(0) + e^2 + \delta^2\bigr) \exp(c_4 |e| t), \]
to deduce by a standard continuation argument, since W(0) = 0 and $\delta^2 = o(e)$, that there exists an interval $[0, T_0/|e|]$, with $T_0 > 0$ fixed (independent of e, δ), on which
\[ W(t) \le c_5 (e^2 + \delta^2) = o(e), \]
which completes the proof of Theorem 10 for the case of exact soliton initial data, that is, part (i) of Theorem 10.

2.4.3. General initial data

Part (ii) of Theorem 10 says that the behavior described in part (i) also holds for nearby initial data: for a precise formulation it is necessary to consider the initial data for the gauge transform $\hat\Psi$:

Theorem 17. Under the same assumptions as Theorem 10, let $\hat\Psi$ be a solution to the Cauchy problem for (53) with $(\hat\Psi - \Psi^{\delta}_{\mathrm{ext}}) \in C(\mathbb{R}; H)$ and initial data $\hat\Psi(0) = (\hat\phi(0), \hat\psi(0), \hat A_j(0), \hat E_j(0))$ having the following property. There exists $\tilde\lambda = (\tilde\theta, \tilde\omega, \tilde u, \tilde\xi) \in \tilde O_{\mathrm{stab}}$ such that if we define $\tilde\chi(x) = -(x - \tilde\xi) \cdot a^{\delta}(0, \tilde\xi)$, then
\[ \kappa_0 \equiv \bigl\| e^{-ie\tilde\chi} \cdot \hat\Psi(0) - \Psi^{\delta,\tilde\chi}_{\mathrm{ext}}(0) - \Psi_{SC,e}(\tilde\lambda) \bigr\|_H = o(e^{1/2}). \tag{108} \]
It follows that, if e is sufficiently small there exists $T_0 > 0$, χ(t, x) and $\lambda(t) \in C^1([0, T_0/|e|], \tilde O_{\mathrm{stab}})$, all as in Theorem 10, such that if Ψ(t) is defined as in (96) it satisfies all the conclusions of part (i) of Theorem 10.

Proof. It is only necessary to argue, as in the proof of Lemma 18, that under the stated conditions there exists $\lambda(0) \in \tilde O_{\mathrm{stab}}$ with $|\lambda(0) - \tilde\lambda| = o(e^{1/2})$ such that $\Psi(0) = (\phi(0), \psi(0), A_j(0), E_j(0)) \equiv e^{-ie\chi_0} \cdot \hat\Psi(0)$ can be written as
\[ \Psi(0) = \bigl(\phi_{SC,e}(\lambda(0)) + \tilde\phi(0),\ \psi_{SC,e}(\lambda(0)) + \tilde\psi(0),\ A_i(0),\ E_i(0)\bigr), \quad\text{with}\quad (\tilde\phi(0), \tilde\psi(0)) \in N_{\lambda(0)}, \]
where $N_{\lambda(0)}$ is the symplectic normal subspace, of codimension eight, defined in (49). This is a simple consequence of the implicit function theorem, as is Lemma 18. There is only a slight modification required in that $\phi(0) = e^{-ie\chi_0} \hat\phi(0)$ depends on λ(0), and so does ψ(0), unlike the case considered in that lemma. However, for small e, this has no effect on the non-degeneracy condition required to apply the implicit function theorem. (Also the fact that $\chi_0$ grows linearly in x can easily be handled using the exponential decay in x of $\phi_{SC,e}$, $\psi_{SC,e}$ and their derivatives.)

Now using $|\lambda(0) - \tilde\lambda| = o(e^{1/2})$ we can deduce from (108) that W(0) = o(e). Indeed, for the electromagnetic components, this is immediate since the gauge transformation leaves the electric field unchanged, and only shifts $A_j$ by $\partial_j \chi_0$,
E. Long & D. Stuart
and this shift is put onto the background potential (and so does not contribute to W(0) since $\tilde A$ is unchanged). The change of the electromagnetic components of the soliton induced by the change of $\tilde\lambda$ to λ(0) is easily estimated in energy norm as $O(|\tilde\lambda - \lambda(0)|)$ by Lemmas 33 and 34. For the other components we just use phase invariance to estimate, e.g.
\[ \begin{aligned}
\| e^{-ie\chi_0} \hat\phi(0) - \phi_{SC,e}(\lambda(0)) \|_{L^2}
&= \| \hat\phi(0) - e^{ie\chi_0} \phi_{SC,e}(\lambda(0)) \|_{L^2} \\
&\le \| \hat\phi(0) - e^{ie\tilde\chi} \phi_{SC,e}(\tilde\lambda) \|_{L^2} + \| e^{ie\tilde\chi} \phi_{SC,e}(\tilde\lambda) - e^{ie\chi_0} \phi_{SC,e}(\lambda(0)) \|_{L^2} \\
&\le \kappa_0 + O(|\lambda(0) - \tilde\lambda|) = o(e^{1/2}).
\end{aligned} \]
From this point on, the argument can be completed as before: since $(\tilde\phi(0), \tilde\psi(0)) \in N_{\lambda(0)}$ is equivalent to the conditions (99), Theorems 14 and 16 can be applied to produce a local solution satisfying the growth estimate in Sec. 2.4.2.
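The continuation step used here and in Sec. 2.4.2 can be spelled out as a short bootstrap. The following is only a sketch, assuming that the hypothesis $W \le c_3|e|$ of Theorem 16 is the sole condition that could fail before time $T_0/|e|$, and writing $t_*$ (a symbol not used in the original) for any time up to which that hypothesis holds.

```latex
% Bootstrap hypothesis on [0, t_*]:  W(t) \le c_3 |e|  (needed to apply Theorem 16).
% For 0 \le t \le \min(t_*, T_0/|e|), estimate (107) gives
W(t) \le c_4 \bigl( W(0) + e^2 + \delta^2 \bigr) e^{c_4 |e| t}
     \le c_4 e^{c_4 T_0} \bigl( W(0) + e^2 + \delta^2 \bigr) = o(e),
% using W(0) = o(e) and \delta^2 = o(e).
% For |e| small the right-hand side is at most \tfrac12 c_3 |e|,
% so the hypothesis cannot fail before t = T_0/|e|.
```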
3. Modulation Theory

In this section, we state and prove some theorems which imply Theorem 14, which is needed in the proof of the main results (Theorems 10 and 12). The proofs are a direct application of the developments in [24], and so the presentation will be brief and reference made to [24,16] for some of the calculations. The crucial point is that the conditions (99) are equivalent to a locally well-posed set of ordinary differential equations. Recall from (47) that, for e = 0, the soliton solutions are of the form
\[ (\phi_{S,0}, \psi_{S,0})(x; \lambda) \equiv e^{i\Theta} \bigl( f_\omega(Z),\ i\gamma\omega f_\omega(Z) - \gamma u \cdot \nabla_Z f_\omega(Z) \bigr) \]
with λ(t) an integral curve of the vector field $V_0$. Explicitly, the conditions (99) read
\[ \langle v, \partial_{\lambda_A} \psi_{S,0}(\lambda) \rangle_{L^2} - \langle w, \partial_{\lambda_A} \phi_{S,0}(\lambda) \rangle_{L^2} = 0 \tag{109} \]
for A = −1, 0, . . . , 6. In the next two subsections we state two lemmas which prove that these constraints can be enforced throughout a time interval:
• The first shows that by an appropriate choice of λ(0), they can be assumed to hold in an open neighborhood of the set of stable solitons in the phase space H. This shows that the class of initial data considered in part (ii) of Theorem 10 forms an open set containing the stable solitons.
• The second shows that an appropriate choice of $\partial_t \lambda$ implies that they are preserved for later times.

3.1. Preparation of the initial data

Lemma 18. Suppose that there exists $\tilde\lambda = (\tilde\theta, \tilde\omega, \tilde u, \tilde\xi) \in \tilde O_{\mathrm{stab}}$ (so that (39) holds with $\omega = \tilde\omega$). Then, there exist $e(\tilde\lambda)$, $\kappa(\tilde\lambda, e)$, such that, if $|e| < e(\tilde\lambda)$ and
\[ \kappa_1 = \| \phi(0) - \phi_{SC,e}(\tilde\lambda) \|_{H^1} + \| \psi(0) - \psi_{SC,e}(\tilde\lambda) \|_{L^2} < \kappa, \tag{110} \]
there exists $\lambda(0) \in \tilde O_{\mathrm{stab}}$ depending differentiably upon (φ(0), ψ(0)) such that (v(0), w(0)), determined by the first two equations of (62) at t = 0, satisfy the constraints (109) with λ = λ(0). Furthermore, there exists $c_1 > 0$ such that
\[ |\lambda(0) - \tilde\lambda| + \| \phi(0) - \phi_{SC,e}(\lambda(0)) \|_{H^1} + \| \psi(0) - \psi_{SC,e}(\lambda(0)) \|_{L^2} < c_1 \kappa_1. \tag{111} \]
Proof. The condition in (39) allows this to be deduced from the implicit function theorem, see [24, §2.3] or [16] for details.

3.2. Modulation equations and constraints

Lemma 19. Let $\lambda(0) \in \tilde O_{\mathrm{stab}}$ and (v(0), w(0)) be as given in the conclusions of Lemma 18. Let $\hat\Psi$ be a solution to the Cauchy problem for (53) on the time interval $[0, T_{\mathrm{loc}}]$ with regularity as in (97), and such that
\[ \sup_{[0, T_{\mathrm{loc}}]} \| \hat\Psi(t) - \Psi^{\delta}_{\mathrm{ext}}(t) \|_H < N_0. \tag{112} \]
Fix a compact subset K of the stable parameter set $\tilde O_{\mathrm{stab}}$, which is the closure of an open neighborhood of λ(0). Then, there exist $\kappa_2 > 0$ and $T_1 > 0$ such that, if $\|(v(0), w(0))\|_{H^1 \oplus L^2} < \kappa_2$, there exists $\lambda(t) \in C^1([0, T_1]; K)$ such that the constraints (109) are satisfied for $0 \le t \le T_1$, where v, w are as in (62) with Ψ obtained from $\hat\Psi$ via (63) and (96). The function $t \mapsto \lambda(t)$ is a solution of a system of ordinary differential equations (116).

Proof. The proof of this is essentially the same as [24, §2.5]. For clarity, it is divided into three stages.

3.2.1. Beginning of proof of Lemma 19

Equations (66) and (67) define a linear operator $\tilde M_\lambda$ in an obvious way:
\[ \tilde M_\lambda (v, w) = (-\partial_t v - i\omega\gamma v + w,\ -\partial_t w - i\omega\gamma w - M_\lambda v) \tag{113} \]
and let $\tilde M_\lambda^*$ be the formal $L^2(dx\,dt)$ adjoint of this operator. Then, by [24, §2.5], there exists an 8 × 8 matrix $D_{AB}$ such that
\[ \tilde M_\lambda^* (-\partial_{\lambda_A} \psi_{S,0},\ \partial_{\lambda_A} \phi_{S,0}) = \sum_B D_{AB} (-\partial_{\lambda_B} \psi_{S,0},\ \partial_{\lambda_B} \phi_{S,0}) + (\tilde{\mathcal I}^1_A, \tilde{\mathcal I}^2_A) \tag{114} \]
where the inhomogeneous terms $\tilde{\mathcal I}^j_A$ are proportional to $\dot\lambda - V_0(\lambda)$:
\[ \tilde{\mathcal I}^j_A = \tilde I^j_{AB} (\dot\lambda - V_0(\lambda))_B \]
with $\tilde I^j_{AB}$ smooth functions of x, which are exponentially decreasing as $|x| \to \infty$; the precise formulae, which are unimportant here, can be found in [24, §2.5]. A simple integration by parts then shows that the constraints in (109) are satisfied on an interval containing the initial time, if they hold at that initial time and if the
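following is true
\[ -\langle \partial_{\lambda_A} \psi_{S,0}, j_1 \rangle_{L^2} + \langle \partial_{\lambda_A} \phi_{S,0}, j_2 + \mathcal N \rangle_{L^2} + \langle \tilde{\mathcal I}^1_A - ih\, \partial_{\lambda_A} \psi_{S,0}, v \rangle_{L^2} + \langle \tilde{\mathcal I}^2_A + ih\, \partial_{\lambda_A} \phi_{S,0}, w \rangle_{L^2} = 0, \tag{115} \]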
for all A = −1, 0, . . . , 6, and at each time in the interval. A calculation as in [24], which is reviewed in the next stage of the proof in Sec. 3.2.2, shows that these latter conditions are equivalent to the following system of differential equations
\[ \bigl( M(e)_{AB} + \gimel_{AB}(v, w, \lambda) \bigr) (\dot\lambda - V_0(\lambda))_B = F_A(e, \Psi^{\delta}_{\mathrm{ext}}, \Psi, \lambda), \tag{116} \]
where $M(e)_{AB}$ is defined in (117), $\gimel_{AB}$ is defined in (118), $F_A$ is given by (122) and where the indices A, B ∈ {−1, 0, 1, . . . , 6}, and we sum over the repeated index B.

3.2.2. Explicit computation of the modulational equation (116)

We write out explicitly the various terms in the conditions (115). The first thing to note is that the overall expression is affine in $(\dot\lambda - V_0(\lambda))$ so we divide into the inertial terms, which are proportional to this quantity (and give rise to the left-hand side of (116)), and the remaining force terms, which give rise to the right-hand side of (116). The dominant contribution to the inertial terms arises from $j_1^{I}$, $j_2^{I}$, while that to the force terms arises from $j_1^{II}$, $j_2^{II}$. To describe the inertial terms we need the following matrix, which, to highest order, describes the mass of the soliton:
\[ M_{AB}(e) = \langle \partial_{\lambda_A} \psi_{S,0},\ e^{-i\Theta_c} \partial_{\lambda_B} \phi_{SC,e} \rangle_{L^2} - \langle \partial_{\lambda_A} \phi_{S,0},\ e^{-i\Theta_c} \partial_{\lambda_B} \psi_{SC,e} \rangle_{L^2}. \tag{117} \]
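Then the dominant inertial term is
\[ -\langle \partial_{\lambda_A} \psi_{S,0}, j_1^{I} \rangle_{L^2} + \langle \partial_{\lambda_A} \phi_{S,0}, j_2^{I} \rangle_{L^2} = M_{AB}(e) (\partial_t \lambda - V_0(\lambda))_B. \]
Next, we have the following matrices, which may be thought of as corrections (owing to the presence of the perturbations v and w) to the “inertia” matrix above:
\[ \gimel_{AB} = \langle v,\ (\tilde I^1_{AB} - i \partial_{\lambda_B} \Theta_c\, \partial_{\lambda_A} \psi_{S,0}) \rangle_{L^2} - \langle w,\ (\tilde I^2_{AB} + i \partial_{\lambda_B} \Theta_c\, \partial_{\lambda_A} \phi_{S,0}) \rangle_{L^2}. \tag{118} \]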
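We now present the abbreviations for the force terms appearing in the modulational equation. Firstly, we have what is effectively the Lorentz force term.
\[ \begin{aligned}
F^L_A &= \langle \partial_{\lambda_A} \psi_{S,0}, j_1^{II} \rangle_{L^2} - \langle \partial_{\lambda_A} \phi_{S,0}, j_2^{II} \rangle_{L^2} \\
&= \langle \partial_{\lambda_A} \psi_{S,0},\ ie a_0^{\delta,\chi} f_{\omega,e} \rangle_{L^2} - \langle \partial_{\lambda_A} \phi_{S,0},\ ie a_0^{\delta,\chi} (i\gamma(\omega - e\alpha_{\omega,e}) - u \cdot \nabla) f_{\omega,e} + eR f_{\omega,e} \rangle_{L^2}.
\end{aligned} \tag{119} \]
We also have a force $F^n_A + F^p_A$ due to the nonlinear interactions, where
\[ F^n_A = -\langle \partial_{\lambda_A} \phi_{S,0}, \mathcal N \rangle_{L^2}, \tag{120} \]
\[ \begin{aligned}
F^p_A = {} & \langle \partial_{\lambda_A} \psi_{S,0},\ j_1^{0} + ie(\gamma\alpha_{\omega,e} + a_0^{\delta,\chi} + \tilde A_0) v \rangle_{L^2} \\
& - \langle \partial_{\lambda_A} \phi_{S,0},\ j_2^{0} + j_2^{III} + j_2^{IV} + ie(\gamma\alpha_{\omega,e} + a_0^{\delta,\chi} + \tilde A_0) w \rangle_{L^2}.
\end{aligned} \tag{121} \]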
We abbreviate the total force as follows:
\[ F_A = F^L_A + F^n_A + F^p_A. \tag{122} \]
Bound for the inertia matrix. It follows from the definition of $\gimel_{AB}$ that
\[ |\gimel_{AB}| = O(\|(v, w, \tilde A, \tilde E)\|_H). \tag{123} \]
Bounds for the forces. Firstly, the main force term can be bounded as
\[ F^L_A = O(e), \tag{124} \]
because of (188), (189) and (155). For some values of A there are better bounds:
\[ F^L_0 = O(e^3). \tag{125} \]
Referring to (119), and using Lemmas A.1.3 and 34, we deduce that
\[ F^L_0 = O(e^3) - \langle i f_\omega, eR f_\omega \rangle_{L^2} + \langle \partial_\theta \psi_{S,0},\ ie a_0^{\delta,\chi} f_\omega \rangle_{L^2} - \langle \partial_\theta \phi_{S,0},\ ie a_0^{\delta,\chi} (i\omega\gamma f_\omega - u \cdot \nabla f_\omega) \rangle_{L^2}. \]
By the reality of $f_\omega$ and the Coulomb condition, the last three terms vanish, proving the bound (125). Also, for A = 3 + j we have an improvement:
\[ F^L_{3+j} = O(e^2 + e\delta). \tag{126} \]
To establish this, we first argue as above that
\[ F^L_{3+j} = O(e^3) - \langle \partial_{u_j} \phi_{S,0}, eR f_\omega \rangle_{L^2} + \langle \partial_{u_j} \psi_{S,0},\ ie a_0^{\delta,\chi} f_\omega \rangle_{L^2} - \langle \partial_{u_j} \phi_{S,0},\ ie a_0^{\delta,\chi} (i\omega\gamma f_\omega - u \cdot \nabla f_\omega) \rangle_{L^2}. \]
Now referring to the formulae in A.1.4 we see that $\partial_{u_j} \phi_{S,0}$ = even + i odd, while $\partial_{u_j} \psi_{S,0}$ = odd + i even where even (respectively, odd) means a real valued function which is even (respectively, odd) as a function of Z. The bound asserted then follows by inspection and use of Lemma 37. Next, (95) implies, by (151), (152), (155) and by the Hölder and Sobolev inequalities, that
\[ |F^n_A| = O\bigl(e^2 \|(v, w, \tilde A, \tilde E)\|_H\bigr) + O\bigl(\|(v, w, \tilde A, \tilde E)\|_H^2 + \|(v, w, \tilde A, \tilde E)\|_H^5\bigr). \]
Finally
\[ |F^p_A| = O\bigl(e \|(v, w, \tilde A, \tilde E)\|_H + e \|(v, w, \tilde A, \tilde E)\|_H^2 + e^2 \|(v, w, \tilde A, \tilde E)\|_H^3\bigr). \tag{127} \]
This is obtained directly from the formula above by means of the Sobolev and Hölder inequalities and using the bounds in Secs. A.2.1 and A.2.2.

3.2.3. Completion of proof of Lemma 19

The matrix $M(e)_{AB}$ is invertible for small e on account of the stability condition (39) and Lemma 35. Also the matrix $\gimel_{AB}$ is small when (v, w) is small, so that in this
case the system of evolution equations (116) can be manipulated, as in the proof of [24, Theorem 2.6], to form a system of equations of the form
\[ \dot\lambda = V_0(\lambda) + V_1(e, \Psi^{\delta}_{\mathrm{ext}}, \hat\Psi, \lambda). \]
This is almost a locally well-posed system of ordinary differential equations. There is a slight modification of the standard proof from [24] required: $\hat\Psi$ is known to exist already, but (v, w), determined as in the statement, depend on λ(t) through the gauge transformation (63), which is non-local in the ξ component of λ, and so $V_1$ is similarly non-local. To allow for this, it is necessary to augment λ by the non-local quantity appearing in (63), which is in fact χ(t, ξ). Call Λ = (λ, χ(t, ξ(t))), then there is a locally well-posed system of ordinary differential equations of the form
\[ \dot\Lambda = W(\Lambda, e, \Psi^{\delta}_{\mathrm{ext}}, \hat\Psi), \]
allowing the proof of Lemma 19 to be completed in the same way as in [24].

3.3. A bound for $\dot\lambda$

Lemma 20. In the situation of the previous lemma,
\[ |\dot\lambda - V_0(\lambda)| = O\bigl(e + \|(v, w, \tilde A, \tilde E)\|_H^2 + e \|(v, w, \tilde A, \tilde E)\|_H\bigr) \]
in the limit of e going to zero.

Proof. The function λ(t) is obtained as a solution of the modulation equations (116). Referring to the bounds for the inertial matrix and forces in Sec. 3.2.2, it is immediate that for e, $\|(v, w, \tilde A, \tilde E)\|_H$ sufficiently small the bound claimed holds.
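The step called immediate in the proof of Lemma 20 can be written out as follows. This is a sketch only: the shorthand $N = \|(v, w, \tilde A, \tilde E)\|_H$ is introduced here for brevity and is not in the original, and the matrix norm is any fixed norm on 8 × 8 matrices.

```latex
% Shorthand (not in the original): N = \|(v, w, \tilde A, \tilde E)\|_H.
% Neumann series: if \|M(e)^{-1}\| \, \|\gimel\| \le 1/2, then
\|(M(e) + \gimel)^{-1}\|
  \le \frac{\|M(e)^{-1}\|}{1 - \|M(e)^{-1}\| \, \|\gimel\|}
  \le 2 \|M(e)^{-1}\| .
% Hence, from (116) and (122),
|\dot\lambda - V_0(\lambda)|
  \le 2 \|M(e)^{-1}\| \bigl( |F^L| + |F^n| + |F^p| \bigr).
% Inserting (124), the bound for F^n, and (127), with |e| \le 1 and N \le 1:
|F^L| + |F^n| + |F^p|
  = O(e) + O(e^2 N + N^2 + N^5) + O(e N + e N^2 + e^2 N^3)
  = O(e + N^2 + e N).
```

The smallness of $\gimel$ needed in the first line is supplied by (123), and the bounded inverse of $M(e)$ by the stability condition (39).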
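4. The Lorentz Force Law: Proof of Theorem 12

The starting point is (116). Define
\[ M_{AB}(0) = \langle \partial_{\lambda_A} \psi_{S,0}, \partial_{\lambda_B} \phi_{S,0} \rangle_{L^2} - \langle \partial_{\lambda_A} \phi_{S,0}, \partial_{\lambda_B} \psi_{S,0} \rangle_{L^2}, \tag{128} \]
and observe that by Lemmas 35 and 34 $M_{AB}(e) - M_{AB}(0) = O(e^2)$. Using this, and referring to the decomposition of $F_A$ in equation (122), and the associated bounds following it, we infer that
\[ \bigl( M(0)_{AB} + O(e^2 + \tilde W^{1/2}) \bigr) (\dot\lambda - V_0)_B = F^L_A + O(e \tilde W^{1/2} + \tilde W), \tag{129} \]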
where $F^L_A$ is as in (119). Since the right-hand side is known, up to the stated error term, it is now just a matter of calculation to obtain explicit forms for the left-hand side of these equations, and thence to deduce Theorem 12. The calculation is done in [24, §A.7], using a set of functions defined in Sec. A.1.4 which are convenient linear combinations of the $\partial_{\lambda_A} (\phi_{S,0}, \psi_{S,0})$. We now record the conclusions. Using (102), the A = 0 component of (129) reads:
\[ \partial_\omega \bigl( \omega \|f_\omega\|_{L^2}^2 \bigr) \dot\omega = F^L_0 + O(e^2) + O(\tilde W), \]
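with a formula for $F^L_0$ given in (119) which indicates that $F^L_0 = O(e^3)$ (see Sec. 3.2.2), and all together:
\[ \partial_\omega \bigl( \omega \|f_\omega\|_{L^2}^2 \bigr) \dot\omega = O(e^2) + O(\tilde W). \tag{130} \]
Similarly, the bound (126) for $F^L_{3+j}$ implies the following equation for the center of the soliton:
\[ \dot\xi = u + O(e^2) + O(\tilde W) + O(e\delta). \tag{131} \]
Next, using (130) and (102), the A = i ∈ {1, 2, 3} component of (129) reads
\[ \partial_t \Bigl( \bigl( \tfrac13 \|\nabla f_\omega\|_{L^2}^2 + \omega^2 \|f_\omega\|_{L^2}^2 \bigr) \gamma u_i \Bigr) = F^L_i + O(\tilde W) + O(e^2), \tag{132} \]
again with $F^L_i$ given in (119) as:
\[ F^L_i = \langle \partial_{\xi_i} \psi_{S,0},\ ie a_0^{\delta,\chi} f_{\omega,e} \rangle_{L^2} - \langle \partial_{\xi_i} \phi_{S,0},\ ie a_0^{\delta,\chi} (i\gamma(\omega - e\alpha_{\omega,e}) - u \cdot \nabla) f_{\omega,e} + eR f_{\omega,e} \rangle_{L^2}, \tag{133} \]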
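where the operator R is defined in (81). Here, on the left-hand side, $\|f_\omega\|_{L^2}^2 = \int f_\omega(Z)^2\, d^3Z$ and by the Lorentz transformation (42) $d^3Z = \gamma\, d^3x$. The inner products on the right-hand side are in $L^2(d^3x)$. It remains to simplify this expression for $F^L_i$: firstly,
\[ \begin{aligned}
& \langle \partial_{\xi_i} \psi_{S,0},\ ie a_0^{\delta,\chi} f_{\omega,e} \rangle_{L^2} - \langle \partial_{\xi_i} \phi_{S,0},\ ie a_0^{\delta,\chi} (i\gamma(\omega - e\alpha_{\omega,e}) - u \cdot \nabla) f_{\omega,e} \rangle_{L^2} \\
&\quad = \langle \partial_{\xi_i} \psi_{S,0},\ ie a_0^{\delta,\chi} f_\omega \rangle_{L^2} - \langle \partial_{\xi_i} \phi_{S,0},\ ie a_0^{\delta,\chi} (i\gamma\omega - u \cdot \nabla) f_\omega \rangle_{L^2} + O(e^3) && \text{by Lemma 35,} \\
&\quad = \langle (i\gamma\omega - u \cdot \nabla) f_\omega,\ ie \nabla_i a_0^{\delta,\chi} f_\omega \rangle_{L^2} + O(e^3) && \text{by integration by parts,} \\
&\quad = e\omega \|f_\omega\|_{L^2}^2 \bigl[ \nabla_i a_0^{\delta}(t, \xi) - \dot a_i^{\delta}(t, \xi) - u \cdot \nabla a_i^{\delta}(t, \xi) \bigr] + O(e\delta + e^3),
\end{aligned} \]
by (65) and Lemma 38. (Again, $\|f_\omega\|_{L^2}^2 = \int f_\omega(Z)^2\, d^3Z$.) But also, referring to (81),
\[ -\langle \partial_{\xi_j} \phi_{S,0}, eR f_{\omega,e} \rangle_{L^2} = \gamma\omega e \int f_\omega^2(Z)\, \nabla_j\, u \cdot a^{\delta}(t, x)\, dx = \omega e \|f_\omega\|_{L^2}^2\, u_l \nabla_j a_l^{\delta}(t, \xi) + O(e\delta), \]
again using Lemma 38. Adding together these contributions, we end up with
\[ F^L = e\omega \|f_\omega\|_{L^2}^2 \bigl( \nabla a_0^{\delta} - \partial_t a^{\delta} + u \times (\nabla \times a^{\delta}) \bigr)(t, \xi) + O(e^3 + e\delta), \]
which is the required form of the Lorentz force law, as given in Theorem 12, once we note that $\omega f_\omega^2 = (\omega - e\alpha) f_{\omega,e}^2 + O(e^2)$.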
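The final addition of the two contributions rests on a standard vector identity. A componentwise check is sketched below, assuming summation over repeated indices and writing $a$ for $a^{\delta}$ evaluated at (t, ξ); the Levi-Civita symbol $\epsilon_{ijk}$ is notation introduced here, not taken from the original.

```latex
% Componentwise form of u \times (\nabla \times a), summation over repeated indices:
\bigl( u \times (\nabla \times a) \bigr)_i
  = \epsilon_{ijk}\, u_j\, \epsilon_{klm} \nabla_l a_m
  = (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl})\, u_j \nabla_l a_m
  = u_l \nabla_i a_l - u_l \nabla_l a_i .
% Hence the bracket from the first computation plus the R-term combine as
\nabla_i a_0 - \partial_t a_i - u \cdot \nabla a_i + u_l \nabla_i a_l
  = \bigl( \nabla a_0 - \partial_t a + u \times (\nabla \times a) \bigr)_i .
```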
5. Proof of the Main Growth Estimate

In this section, we are concerned with the proof of Theorem 16. In order to control W it is helpful to introduce a quantity $\tilde W$ which allows us to take advantage of certain cancellations occurring in the energy identity to handle some of the nonlinear interaction terms which would otherwise be difficult to estimate directly. The direct nonlinear interactions between v and $\tilde A$ arise from terms in the Hamiltonian obtained by expanding the expression $\frac12 |(\nabla - ieA)\phi|^2$ in terms of v, $\tilde A$ by means of (62). (There are also indirect interactions mediated by $\tilde A_0$ via the Gauss law, but these are easier to estimate.) In Sec. 2.1.3, this expansion of $\frac12 |(\nabla - ieA)\phi|^2$ is carried out explicitly, and, including also the quadratic part of the Taylor expansion of the potential V, leads to the introduction of the quantity:
\[ \tilde H(v, \tilde A) = \sum_{n=2}^{4} \tilde H^{(n)} = -\frac12 \sum_{n=2}^{4} \langle (v, \tilde A),\ B^{(n-1)}(v, \tilde A) \rangle_{L^2}, \]
where the superscript n (respectively, n − 1) indicates the homogeneity in v, $\tilde A$ of the term $\tilde H^{(n)}$ in the expanded Hamiltonian (respectively, of the term $B^{(n-1)}$ in the expanded evolution equations (67), (69)); see Sec. 2.1.3 for explicit expressions and explanations. Using these definitions we have an alternative form for the expanded evolution: equations (66), (68) can be written in the form
\[ \partial_t (v, \tilde A) = (w, \tilde E) - (i(\gamma\omega + h) v, 0) - (\partial_t \lambda - V_0(\lambda)) \cdot (\partial_\lambda \phi_{SC,e}, \partial_\lambda A_{SC,e}) + (j_1^{0}, j_3^{0}) + (\Phi_{11}, 0), \tag{134} \]
with $\Phi_{11} = j_1^{II}$. The remaining two equations (67), (69) can be written:
\[ \partial_t (w, \tilde E) = (-D_v \tilde H, -D_{\tilde A} \tilde H) - (i(\gamma\omega + h) w, 0) - (\partial_t \lambda - V_0(\lambda)) \cdot (\partial_\lambda \psi_{SC,e}, \partial_\lambda E_{SC,e}) + (j_2^{0}, 0) + (\Phi_{21}, \Phi_{22}), \tag{135} \]
where h is defined in (72), and $\Phi_{21} = j_2^{II} + \mathcal N$, and $\Phi_{22} = j_4^{II}$ are given in terms of the inhomogeneous terms defined in Sec. 2.1.3; notice that the inhomogeneous terms $j_2^{III}$, $j_2^{IV}$, $j_4^{III}$, $j_4^{IV}$ are included in the first term on the right-hand side of (135). To study these equations it will turn out that the following quantity is useful:
\[ \tilde W = \frac12 \|w - i\gamma\omega v\|_{L^2}^2 - \frac12 \gamma^2 \omega^2 \|v\|_{L^2}^2 + \frac12 \|\tilde E\|_{L^2}^2 + \langle (w, \tilde E),\ u \cdot \nabla (v, \tilde A) \rangle_{L^2} + \tilde H(v, \tilde A). \]
We can think of $\tilde W$ as follows: it is formed by adding to the Hessian of the augmented Hamiltonian W those terms arising in the expanded Hamiltonian (when we input the perturbed solution ansatz (62)) which describe the interactions of the fields $(v, \tilde A)$ with themselves and with the external electromagnetic field. An important
reason for introducing $\tilde W$ is that the following two lemmas imply a long time bound for W, and hence a stability estimate in energy norm.

Lemma 21. In the situation of Theorem 15, so that
• λ lies in a compact subset, $K \subset \tilde O_{\mathrm{stab}}$,
• (v, w) satisfy the constraints (99), and
• W is equivalent (uniformly on K) to $\|(v, w, \tilde A, \tilde E)\|_H^2$,
assume that W < 1, and that e = o(1) and e = o(δ). Then, there exists a constant c(K) > 0 such that, for all λ ∈ K,
\[ c W \le \tilde W \le \frac{1}{c} W. \tag{136} \]
Proof. Referring to the formulae in Sec. 2.1.3 for the $\tilde H^{(n)}$ which occur in the definition of $\tilde H$, it is a straightforward consequence of the Hölder inequality, Lemma 27 and the assumptions on the external field in Sec. 1.2 that
\[ \tilde W = W + O\Bigl( \frac{e}{\delta} W \Bigr) + O\Bigl( \frac{e^2}{\delta^2} W \Bigr) + O\Bigl( \frac{e^2}{\delta} W^{3/2} \Bigr) + O(e^2 W) + O(e W^{3/2}) + O(e^2 W^2). \]
The lemma follows immediately.

Notation 22. In the following we write $f = \frac{d}{dt}(O(A) + o(B))$ if there exist $C^1$ functions g, h such that $f = \frac{d}{dt}(g + h)$ and g = O(A) and h = o(B).
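Lemma 23. Assume the hypotheses of Theorem 16. It follows that,
\[ \frac{d}{dt} \tilde W = \frac{d}{dt} \bigl( O(e \tilde W^{1/2}) + o(\tilde W) \bigr) + O\Bigl( e^4 + \Bigl( e + \frac{e^2}{\delta} \Bigr) \tilde W + (e^2 + e\delta) \tilde W^{1/2} \Bigr), \tag{137} \]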
in the limit of e and $\tilde W$ going to zero.

Proof. See Sec. 5.2.

5.1. Proof of Theorem 16, assuming Lemma 23

Proof. Integrating up Eq. (137), and using the Cauchy–Schwarz inequality, $2e\delta \tilde W^{1/2} \le e\delta^2 + e\tilde W$, we infer the existence of a constant c > 0 such that, for $t \in [0, T_2/e]$,
\[ |\tilde W(t) - \tilde W(0)| \le c \Bigl( e^2 + \delta^2 + |e| \int_0^t \tilde W(s)\, ds \Bigr), \tag{138} \]
as long as e = O(δ). By Gronwall’s inequality and Lemma 23, for |e| sufficiently small there exists a constant c > 0 such that, on $[0, T_2/e]$,
\[ \tilde W(t) \le c \bigl( \tilde W(0) + e^2 + \delta^2 \bigr) \exp[c|e|t]. \tag{139} \]
By Lemma 21, the result is proved.
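The two elementary steps in this proof can be made explicit. The sketch below assumes $\tilde W \ge 0$ on the interval (which follows from (136) when $W \ge 0$), and uses the shorthand $a$, which is not in the original.

```latex
% Elementary inequality used before integrating (137):
|e| \bigl( \delta - \tilde W^{1/2} \bigr)^2 \ge 0
  \;\Longrightarrow\;
  2 |e| \delta\, \tilde W^{1/2} \le |e| \delta^2 + |e| \tilde W .
% Shorthand (not in the original): a = \tilde W(0) + c (e^2 + \delta^2).
% From (138), for 0 \le t \le T_2/|e|:
\tilde W(t) \le a + c |e| \int_0^t \tilde W(s)\, ds .
% Gronwall's inequality (integral form, constant a \ge 0) then gives
\tilde W(t) \le a \exp(c |e| t)
            \le \max(1, c)\, \bigl( \tilde W(0) + e^2 + \delta^2 \bigr) \exp(c |e| t),
% which is (139) after renaming the constant. On the interval in question
% |e| t \le T_2, so the exponential factor is at most \exp(c T_2).
```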
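5.2. Proof of Lemma 23

5.2.1. Beginning of proof of Lemma 23

By the assumptions of Theorem 16 we have a solution of Eqs. (134) and (135) satisfying the conclusions of Theorems 14 and 15, so that the constraints (109) hold and W = O(e). Then, by Lemma 21 and Theorem 15, there exists c > 0 such that
\[ \frac{1}{c} \tilde W \le \|(v, w, \tilde A, \tilde E)\|_H^2 \le c \tilde W. \]
Also since W = O(e) the bound (102) holds, and will be used in the course of the proof. The estimate for $\tilde W$ will be obtained as a consequence of the energy identity for (134) and (135), so the next stage is to write that identity down and separate the terms out in a way that allows them to be usefully estimated.

5.2.2. The energy identity for (66)–(69)

\[ \begin{aligned}
\frac{d}{dt} \tilde W = {} & \langle \partial_t (w, \tilde E),\ (w, \tilde E) + u \cdot \nabla (v, \tilde A) - i\gamma\omega (v, 0) \rangle_{L^2} \\
& - \langle \partial_t (v, \tilde A),\ (-D_v \tilde H, -D_{\tilde A} \tilde H) - u \cdot \nabla (w, \tilde E) + i\gamma\omega (w, 0) \rangle_{L^2} \\
& - \partial_t(\gamma\omega) \langle iv, w \rangle_{L^2} + \langle \partial_t u \cdot \nabla (v, \tilde A),\ (w, \tilde E) \rangle_{L^2} + \int \partial_t \tilde h(v, \tilde A)\, dx.
\end{aligned} \tag{140} \]
Here we have introduced a notation $\tilde h(v, \tilde A)$ for the integrand defining $\tilde H$, i.e.
\[ \tilde H(v, \tilde A) = \int \tilde h(v, \tilde A)\, dx = -\frac12 \sum_{n=2}^{4} \int \langle (v, \tilde A),\ B^{(n-1)}(v, \tilde A) \rangle\, dx. \tag{141} \]
Explicit expressions for the nonlinear operators $B^{(n-1)}(v, \tilde A)$ show that they depend on t, x, and the $\partial_t \tilde h$ in the final line of (140) refers to differentiation with $(v, \tilde A)$ held fixed; similar conventions will be understood below. Substituting for the time derivatives from (134) and (135), and noting the usual cancellations which occur in the derivation of the energy identity, we obtain the following expression:
\[ \begin{aligned}
\frac{d}{dt} \tilde W = {} & Q_1 + Q_2 + Q_3 - \langle D_v \tilde H, ihv \rangle_{L^2} + \int (\partial_t + u \cdot \nabla) \tilde h(v, \tilde A)\, dx \\
& - \langle iv, (u \cdot \nabla h) w \rangle_{L^2} - \partial_t(\gamma\omega) \langle iv, w \rangle_{L^2} + \langle \partial_t u \cdot \nabla (v, \tilde A),\ (w, \tilde E) \rangle_{L^2},
\end{aligned} \tag{142} \]
where
\[ \begin{aligned}
Q_1 = {} & \langle (\Phi_{21}, \Phi_{22}),\ (w, \tilde E) + u \cdot \nabla (v, \tilde A) - i\gamma\omega (v, 0) \rangle_{L^2} \\
& - \langle (\Phi_{11}, 0),\ (-D_v \tilde H, -D_{\tilde A} \tilde H) + u \cdot \nabla (w, \tilde E) - i\gamma\omega (w, 0) \rangle_{L^2},
\end{aligned} \]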
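\[ \begin{aligned}
Q_2 = {} & \langle (j_2^{0}, 0),\ (w, \tilde E) + u \cdot \nabla (v, \tilde A) - i\gamma\omega (v, 0) \rangle_{L^2} \\
& - \langle (j_1^{0}, j_3^{0}),\ (-D_v \tilde H, -D_{\tilde A} \tilde H) - u \cdot \nabla (w, \tilde E) + i\gamma\omega (w, 0) \rangle_{L^2}
\end{aligned} \]
and $Q_3 = -(\partial_t \lambda - V_0(\lambda)) \cdot \tilde Q_3$, where
\[ \begin{aligned}
\tilde Q_3 = {} & \langle (\partial_\lambda \psi_{SC,e}, \partial_\lambda E_{SC,e}),\ (w, \tilde E) + u \cdot \nabla (v, \tilde A) - i\gamma\omega (v, 0) \rangle_{L^2} \\
& - \langle (\partial_\lambda \phi_{SC,e}, \partial_\lambda A_{SC,e}),\ (-D_v \tilde H, -D_{\tilde A} \tilde H) - u \cdot \nabla (w, \tilde E) + i\gamma\omega (w, 0) \rangle_{L^2}.
\end{aligned} \]
We control $Q_1$, $Q_2$, $Q_3$ in the next three subsections before completing the proof of Lemma 23. In the course of estimating the various terms we will use bounds for $\mathcal N$, h and the Φ’s (which may be read off from those in Sec. 2.1.4), and the bounds for $\tilde A_0$ in Sec. A.2.2.

5.2.3. Estimation of $Q_1$

The following proposition is the main result about $Q_1$ needed for the basic growth estimate:

Proposition 24. In the situation of Lemma 23
\[ Q_1 = \partial_t \bigl( o(\tilde W) + O(e \tilde W^{1/2}) \bigr) + O\bigl( e^4 + e \tilde W + e^2 \tilde W^{1/2} + e\delta \tilde W^{1/2} \bigr). \]
Proof. Substituting from (134) and (135) we obtain:
\[ \begin{aligned}
Q_1 = {} & (\partial_t \lambda - V_0) \cdot \langle \partial_\lambda A_{SC,e}, \Phi_{22} \rangle_{L^2} \\
& + (\partial_t \lambda - V_0) \cdot \bigl[ \langle \partial_\lambda \phi_{SC,e}, \Phi_{21} \rangle_{L^2} - \langle \partial_\lambda \psi_{SC,e}, \Phi_{11} \rangle_{L^2} \bigr] \\
& + \langle ie \tilde A_0 (i\gamma(\omega - e\alpha_{\omega,e}) - u \cdot \nabla) f_{\omega,e},\ \Phi_{11} \rangle_{L^2} \\
& - \langle (ie \tilde A_0 f_{\omega,e}, \nabla \tilde A_0),\ (\Phi_{21}, \Phi_{22}) \rangle_{L^2} \\
& + \langle (\partial_t + u \cdot \nabla) v, \Phi_{21} \rangle_{L^2} + \langle ihv, \Phi_{21} \rangle_{L^2} - \langle ihw, \Phi_{11} \rangle_{L^2} \\
& - \langle (\partial_t + u \cdot \nabla) w, \Phi_{11} \rangle_{L^2} + \langle (\partial_t + u \cdot \nabla) \tilde A, \Phi_{22} \rangle_{L^2},
\end{aligned} \tag{143} \]
since $\Phi_{12} = 0$.

Estimation of the first line in $Q_1$. The first line of $Q_1$ is easily seen to be small, since $\Phi_{22} = -e^2 a^{\delta,\chi} f_{\omega,e}$ is $O(e^2)$ in every $L^p$ by the bounds in Sec. 2.1.4. Together with the fact that $\|\partial_\lambda A_{SC,e}\|_{L^p} = O(e)$ for p > 3, by (51) and the results of Sec. A.1.2, this implies that $\langle \partial_\lambda A_{SC,e}, \Phi_{22} \rangle_{L^2} = O(e^3)$, and so by (102) the first line is $O(e^4)$.

Estimation of the second line in $Q_1$. The second line is smaller than appears due to a cancellation which is a consequence of the modulation equations, (115) or (116). To see this, we refer to the decomposition of the force on the right-hand side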
of (116) given in Sec. 3.2.2, and using the definitions of the $\Phi_{IJ}$ in (134) and (135), we see that
\[ \begin{aligned}
\langle \partial_{\lambda_A} \phi_{SC,e}, \Phi_{21} \rangle_{L^2} - \langle \partial_{\lambda_A} \psi_{SC,e}, \Phi_{11} \rangle_{L^2}
&= -F^L_A - F^n_A + \mathrm{Err}_A \\
&= -(M(e)_{AB} + \gimel_{AB}) (\dot\lambda - V_0)_B + F^p_A + \mathrm{Err}_A
\end{aligned} \]
where
\[ \mathrm{Err}_A = \langle \partial_{\lambda_A} \phi_{SC,e} - \partial_{\lambda_A} \phi_{S,0},\ \Phi_{21} \rangle_{L^2} - \langle \partial_{\lambda_A} \psi_{SC,e} - \partial_{\lambda_A} \psi_{S,0},\ \Phi_{11} \rangle_{L^2}. \]
Using Lemma 35, the bound (95) for $\mathcal N$, and the fact that from Sec. 2.1.4 $\Phi_{11} = j_1^{II}$ and $\Phi_{21} - \mathcal N = j_2^{II}$ are O(e), we deduce that $|\mathrm{Err}_A| \le c e^2 (e + \tilde W + \tilde W^{5/2} + e^2 \tilde W^{1/2})$. Next notice that Lemma 35 implies that $M(e)_{AB} - M(0)_{AB} = O(e^2)$. Therefore since $M(0)_{AB} = -M(0)_{BA}$ the largest term drops out and the second line of $Q_1$ can be rewritten as
\[ (M(e)_{AB} - M(0)_{AB} + \gimel_{AB}) (\dot\lambda - V_0)_A (\dot\lambda - V_0)_B - (F^p_A + \mathrm{Err}_A) (\dot\lambda - V_0)_A \]
which, by the above and (123) and (127) is $O(e^4 + e^2 \tilde W^{1/2})$, for small e and $\tilde W$.

Estimation of the third and fourth lines in $Q_1$. Using Lemma 39, (95), the bounds in Sec. 2.1.4 and the properties of $f_{\omega,e}$ in Sec. A.1.1, the third and fourth lines can be estimated immediately to be $O(e^3 \tilde W^{1/2} + e^2 \tilde W^{3/2})$.

Estimation of the fifth and sixth line in $Q_1$. This requires care because h is unbounded as a function of x. This makes it essential to separate the nonlinear term $\mathcal N$ in $\Phi_{21}$ from the other terms (which are exponentially decreasing in x and can thus absorb the unboundedness of h). Therefore we estimate first of all the quantity
\[ \langle ihv,\ \Phi_{21} - \mathcal N(f_{\omega,e}, f_\omega, v) \rangle_{L^2} - \langle ihw, \Phi_{11} \rangle_{L^2} = O(e^2 \tilde W^{1/2}), \tag{144} \]
by (102) and the bounds for h recorded in Sec. 2.1.4. Next, write the first term on line five, together with the missing piece $\langle ihv, \mathcal N \rangle_{L^2}$ from the previous estimation, as the sum of two quantities:
\[ \langle (\partial_t + ih + u \cdot \nabla) v, \mathcal N \rangle_{L^2} + \mathrm{Rem}, \]
where $\mathrm{Rem} = \langle (\partial_t + u \cdot \nabla) v,\ \Phi_{21} - \mathcal N \rangle_{L^2}$. It is shown in Lemma 40 that the first of these quantities is $\partial_t(o(\tilde W)) + O(e \tilde W + e^3 \tilde W^{1/2})$. To complete the proof of Proposition 24 we need to estimate the sixth line and the quantity Rem defined above. This is done by means of the integration by parts identity (196), and taking advantage of the fact that
\[ (\partial_t + u \cdot \nabla) f_{\omega,e} = (\dot\lambda - V_0(\lambda)) \cdot \partial_\lambda f_{\omega,e} \tag{145} \]
is O(e) by (102). Together with (192), this implies that
\[ \|(\partial_t + u \cdot \nabla) \Phi_{IJ}\|_{L^p} = O(e(e + \delta)) \tag{146} \]
for all p and all IJ except for IJ = 21; but in that case (146) holds instead for $\Phi_{21} - \mathcal N = j_2^{II}$ (which is what is actually needed to estimate Rem). Putting this
information into (196), we infer that the sixth line and Rem are $\partial_t(O(e \tilde W^{1/2})) + O(e(e + \delta) \tilde W^{1/2})$, which is sufficient to complete the proof of the proposition.

5.2.4. Estimation of $Q_2$

The terms in $Q_2$ arising from $j_1^{0}$, $j_2^{0}$ can be estimated in a straightforward way by the Hölder and Sobolev inequalities, because of the exponential decay of $f_{\omega,e}$, and using Lemma 39 to bound $\tilde A_0$. For example,
\[ \langle w - i\gamma\omega v + u \cdot \nabla v,\ ie \tilde A_0 (i\gamma(\omega - e\alpha_{\omega,e}) - u \cdot \nabla) f_{\omega,e} \rangle_{L^2} = O(e^2 \tilde W) \tag{147} \]
by Hölder’s inequality, since $f_{\omega,e}$ and $\nabla f_{\omega,e}$ are bounded in every $L^p$ norm and $\|\tilde A_0\|_{L^p} = O(e \tilde W^{1/2})$ for 3 < p < ∞. For the terms involving $j_3^{0} = \nabla \tilde A_0$ we can estimate,
\[ \langle u \cdot \nabla \tilde E, \nabla \tilde A_0 \rangle_{L^2} = \langle \operatorname{div} \tilde E,\ u \cdot \nabla \tilde A_0 \rangle_{L^2} = O(e^2 \tilde W), \tag{148} \]
since $\|\operatorname{div} \tilde E\|_{L^{3/2}} = O(e \tilde W^{1/2})$ and $\|\nabla \tilde A_0\|_{L^3} = O(e \tilde W^{1/2})$. Consider next the terms $\langle j_3^{0}, -D_{\tilde A} \tilde H \rangle_{L^2}$. Referring to the explicit expressions for $D_{\tilde A} \tilde H$ given in Sec. 2.1.3, starting with (88), we see that the resulting terms can all be estimated in a straightforward way (using the bounds for $\nabla \tilde A_0$ in Sec. A.2.2) to be $O(e^2 \tilde W)$, except for one, namely:
\[ \langle \Delta \tilde A, \nabla \tilde A_0 \rangle_{L^2}, \]
but this vanishes by the Coulomb condition, and so $Q_2 = O(e^2 \tilde W)$.

5.2.5. Estimation of $Q_3$

The quantity $\tilde Q_3$ is smaller than it appears due to the constraints. To see this first recall that, as used above already, $\|\partial_\lambda A_{SC,e}\|_{L^p} = O(e)$ for p > 3, and $\|\partial_\lambda E_{SC,e}\|_{L^p} = O(e)$ for p > 3/2, by (51) and the results of Secs. A.1.2 and A.1.3. Referring to the expressions for $D_{\tilde A} \tilde H$ in Sec. 2.1.3, this means that the electromagnetic contributions to $\tilde Q_3$ can be bounded as $O(e \tilde W^{1/2})$. But also, the expressions for $D_v \tilde H$ in Sec. 2.1.3 imply that
\[ \langle \partial_\lambda \phi_{SC,e},\ -D_v \tilde H + M_\lambda v \rangle_{L^2} = O(e \tilde W^{1/2}). \]
Therefore, up to $O(e \tilde W^{1/2})$, we deduce that $\tilde Q_3$ is equal to
\[ \langle u \cdot \nabla w - i\omega\gamma w - M_\lambda v,\ \partial_\lambda \phi_{SC,e} \rangle_{L^2} - \langle u \cdot \nabla v - i\omega\gamma v + w,\ \partial_\lambda \psi_{SC,e} \rangle_{L^2}. \]
Now the identities in Sec. A.1.4 and the constraints (109) imply that this expression vanishes if $\phi_{SC,e}$, $\psi_{SC,e}$ are replaced by $\phi_{S,0}$, $\psi_{S,0}$. But by Lemma 35, this can be done at the expense of an $O(e^2 \tilde W^{1/2})$ error. Therefore, since $(\dot\lambda - V_0) = O(e)$ by (102), we deduce that $Q_3 = O(e^2 \tilde W^{1/2})$.
5.2.6. Completion of proof of Lemma 23

The previous subsections have provided the requisite information on the Q’s, and so it now suffices to control the remaining quantities in (142) appearing after the Q’s. The following two propositions treat the two quantities on the first line of (142).

Proposition 25. Assume the hypotheses of Lemma 23. It follows that,
\[ \int (\partial_t + u \cdot \nabla) \tilde h(v, \tilde A)\, dx = -\frac12 \sum_{n=2}^{4} \int \langle (v, \tilde A),\ (\partial_t + u \cdot \nabla) B^{(n-1)}(v, \tilde A) \rangle\, dx = O\Bigl( e \tilde W + \frac{e^2}{\delta} \tilde W \Bigr). \]
Proof. Observe
• the fact that $a^{\delta,\chi}$ is pointwise $O(\frac{1}{\delta})$, but its derivatives are O(1), in particular $\|(\partial_t + u \cdot \nabla) a^{\delta,\chi}\|_{L^\infty} \le 2(\|\dot a\|_{L^\infty} + \|\nabla a\|_{L^\infty})$.
• the identity $(\partial_t + u \cdot \nabla) f_{\omega,e} = (\dot\lambda - V_0) \cdot \partial_\lambda f_{\omega,e}$, which shows that the left-hand side is O(e) in every $L^p$, by (102) and the exponential decay properties in Sec. A.1. Similarly, $\|(\partial_t + u \cdot \nabla) \alpha_{\omega,e}\|_{W^{1,\infty}}$ is $O(e^2)$ by (102) and the bounds for $\alpha_{\omega,e}$ in Sec. A.1.2.
To prove the proposition now, just use these observations to estimate with Hölder’s inequality each of the terms arising from differentiation of the expressions for $B^{(n)}$ in Sec. 2.1.3.

Proposition 26. Assume the hypotheses of Lemma 23. It follows that
\[ \langle D_v \tilde H, ihv \rangle_{L^2} = O\Bigl( e \tilde W + \frac{e^2}{\delta} \tilde W \Bigr). \tag{149} \]
Proof. Using the notation in (88) for the Fréchet derivative $D_v \tilde H$, we have
\[ |\langle D_v \tilde H, ihv \rangle_{L^2}| = |\langle B(v, \tilde A), (ihv, 0) \rangle_{L^2}| \tag{150} \]
and we can estimate term by term, but some care is needed since h is unbounded as a function of x, see (72). In addition to the first point in the proof of the previous proposition, we use the bounds for h recorded in Sec. 2.1.4. Those terms in (150) arising from $B^{(3)}$ vanish identically, while of those arising from $B^{(2)}$ the only non-zero ones are proportional to $e \langle hv, \nabla v \cdot \tilde A \rangle_{L^2}$. By the Coulomb condition and the bound for ∇h from Sec. 2.1.4, this term is $O(e^2 \tilde W^{3/2})$. It remains to bound those terms arising from $B^{(1)}$. Of these, it is straightforward to bound those arising from $B_{12}$ as $O(e \tilde W)$ by the second fact just mentioned, and the same goes for those arising from $M_\lambda$ in $B_{11} = -M_\lambda + eR + S$. However, there is a single non-zero term
arising from eRv which is proportional to
\[ \langle hv,\ a^{\delta,\chi} \cdot \nabla v \rangle \]
which, with an integration by parts, can be bounded as $O(\frac{e^2}{\delta} \tilde W)$, but, again, only after taking into account the Coulomb condition $\nabla \cdot a^{\delta,\chi} = 0$. Finally for the terms arising from S we see from (81) that
\[ \langle ihv, Sv \rangle_{L^2} = e\gamma \int \bigl[ 2\alpha_{\omega,e}\, v\, u \cdot \nabla v + u \cdot \nabla \alpha_{\omega,e} |v|^2 \bigr]\, dx = 0, \]
so that $\langle ihv, Sv \rangle_{L^2} = 0$, and the proof of the proposition is completed.

The remaining terms on the second line of formula (142) are easily estimated as $O(e \tilde W)$ by (102), and the proof of Lemma 23 is completed.

Appendix

A.1. Further properties of the solitons

A.1.1. Exponential decay properties of the solitons

The e = 0 solitons in the nonlinear Klein–Gordon equation (23) are exponentially localized: to be precise we have the following estimates for the profile functions $f_\omega$, $g_\omega$:
\[ \limsup_{|x| \to \infty} \sum_{|\alpha| \le 3} |\nabla^\alpha f_\omega| \operatorname{Exp}\bigl[ |x| (\sqrt{m^2 - \omega^2} - \varepsilon) \bigr] < \infty \quad \forall \varepsilon \in (0, \sqrt{m^2 - \omega^2}), \tag{151} \]
together with
\[ \limsup_{|x| \to \infty} \sum_{|\alpha| \le 3} |\nabla^\alpha g_\omega| \operatorname{Exp}\bigl[ |x| (\sqrt{m^2 - \omega^2} - \varepsilon) \bigr] < \infty \quad \forall \varepsilon \in (0, \sqrt{m^2 - \omega^2}), \tag{152} \]
and
\[ \lim_{|x| \to \infty} \frac{f_\omega'}{f_\omega} = -\sqrt{m^2 - \omega^2}, \tag{153} \]
while ∀ε > 0, there exists c(ε) > 0 such that
\[ f_\omega(|x|) > c(\varepsilon) \operatorname{Exp}\bigl[ -|x| (\sqrt{m^2 - \omega^2} + \varepsilon) \bigr]. \tag{154} \]
(See [24, Theorem 1.4]). Exponential decay also holds for the solitons coupled to electromagnetism for small e:

Lemma 27. Suppose that $|e| < e_1$, for some $e_1 > 0$. Under conditions (29)–(32) on U,
\[ |D^\alpha f_{\omega,e}(x)| \le C \operatorname{Exp}[-\kappa |x|] \tag{155} \]
for positive constants C and κ, and where α is any multi-index with |α| ≤ 2. Furthermore, the constants C and κ are independent of the coupling constant e.

Proof. See [17].
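The uniform $L^p$ bounds on $f_{\omega,e}$ and its derivatives used in Sec. 5 follow from (155) by a direct computation. A sketch, taking the constants C and κ exactly as in Lemma 27 and assuming nothing beyond the pointwise bound:

```latex
% For 1 \le p < \infty and |\alpha| \le 2, the pointwise bound (155) gives
\| D^\alpha f_{\omega,e} \|_{L^p}^p
  \le C^p \int_{\mathbb{R}^3} e^{-p\kappa |x|}\, dx
  = 4\pi C^p \int_0^\infty r^2 e^{-p\kappa r}\, dr
  = \frac{8\pi C^p}{(p\kappa)^3},
% so that
\| D^\alpha f_{\omega,e} \|_{L^p} \le C \Bigl( \frac{8\pi}{(p\kappa)^3} \Bigr)^{1/p},
\qquad
\| D^\alpha f_{\omega,e} \|_{L^\infty} \le C,
% with both bounds independent of e.
```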
A.1.2. Some estimates of the soliton electromagnetic potential α

Lemma 28. For each $f \in H^2_r(\mathbb{R}^3)$, there exists a unique $\alpha \in \dot H^1_r(\mathbb{R}^3)$ such that
\[ -\Delta\alpha + e^2 f^2 \alpha = \omega e f^2. \tag{156} \]
Furthermore, the map $A : H^2(\mathbb{R}^3) \to \dot H^1(\mathbb{R}^3)$ defined by A(f) = α is continuously Fréchet-differentiable.

Proof. This follows from standard arguments.

Lemma 29. Suppose that $f \in H^1(\mathbb{R}^3)$. Suppose further that α solves
\[ -\Delta\alpha + e^2 f^2 \alpha = e\omega f^2. \tag{157} \]
It follows that $\nabla\alpha, \nabla_i \nabla_j \alpha \in L^2(\mathbb{R}^3)$ for any i, j ∈ {1, 2, 3}. Furthermore, $\|\nabla_i \nabla_j \alpha\|_{L^2}$, $\|\nabla\alpha\|_{L^2}$, $\|\alpha\|_{L^\infty} = O(e)$.

Proof.
\[ \int |\nabla\alpha|^2 + e^2 f^2 \alpha^2 = e\omega \int f^2 \alpha \tag{158} \]
from which it easily follows via Sobolev’s inequality that
\[ \|\nabla\alpha\|_{L^2} \le c e \|f\|_{L^2} \|f\|_{L^3}. \tag{159} \]
2
Next, since − α = e(ω − eα)f , we have αL2 ≤ e(ωf 2L4 + eαω,e L6 f 2L6 ).
(160)
By the Calderon–Zygmund inequality, we have that for any i, j ∈ (1, 2, 3), ∇i ∇j αL2 = O(e). By Sobolev’s inequality, we have thus shown that α ∈ W inequality, αL∞ = O(e).
(161) 1,6
and hence by Morrey’s
Corollary 30. Suppose that fω,e ∈ H 2 (R3 ) solves where αω,e
(162) − fω,e + m2 fω,e − (ω − eαω,e )2 fω,e = β(fω,e )fω,e , ∈ H˙ r1 (R3 ) is a non-local function of fω,e uniquely determined by 2 2 − αω,e + e2 fω,e αω,e = ωefω,e .
(163)
Then, fω,e ∈ H 4 (R3 ). Proof. Differentiate the equation for fω,e and apply the Calderon–Zygmund inequality. This leads naturally to the following lemma. Lemma 31. Suppose that f ∈ H 4 (R3 ) and that α solves − α + e2 f 2 α = eωf 2 . It follows that ∇α ∈ W 3,p (R3 ) for any p ∈
( 32 , ∞).
(164)
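Proof. Differentiate (164), and apply the Calderón–Zygmund inequality (using the Hölder and Sobolev inequalities if necessary) to get the result.

Lemma 32. Suppose that $f \in H^2(\mathbb{R}^3)$ and that $\alpha$ solves
\[ -\Delta\alpha + e^2 f^2\alpha = e\omega f^2. \tag{165} \]
It follows that
\[ 0 \le \mathrm{sgn}\Big(\frac{\omega}{e}\Big)\alpha \le \Big|\frac{\omega}{e}\Big|, \]
where $\mathrm{sgn}(x) = x/|x|$ for $x \ne 0$ and $\mathrm{sgn}(0) = 0$.

Proof. Assume that $f \in C_c^\infty(\mathbb{R}^3)$. Define $\alpha_+ = \max(\alpha, 0)$ and $\alpha_- = \max(-\alpha, 0)$. Suppose $\omega/e > 0$, then by a weak maximum principle ([8, Theorem 8.1]), $\alpha > 0$. Now, $A_0 = \alpha - \omega/e$ solves $-\Delta A_0 + e^2|f|^2 A_0 = 0$, therefore $A_0 \le 0$ by the same weak maximum principle. Hence, $0 \le \alpha \le \omega/e$. Similarly, if $-\omega/e > 0$, then $0 \ge \alpha \ge -|\omega/e|$ so that $\|\alpha\|_{L^\infty} \le |\omega/e|$. The lemma follows by approximation.

Lemma 33. Suppose that $f_{\omega,e}$ and $\alpha_{\omega,e}$ are as given in Theorem 6. Then,
\[ \Big\|\nabla^i\nabla^j\frac{d\alpha_{\omega,e}}{d\lambda}\Big\|_{L^p} = O(e) \tag{166} \]
for $p \in (1, \infty)$, and $i, j = 1, 2, 3$. In addition, $\big\|\nabla\frac{d\alpha_{\omega,e}}{d\lambda}\big\|_{W^{2,p}} = O(e)$ for any $p \in (\tfrac{3}{2}, \infty)$.

Proof. From Lemma 28 and Theorem 6, $\frac{d\alpha_{\omega,e}}{d\lambda}$ is a well-defined object. We note that
\[ -\Delta\frac{d\alpha_{\omega,e}}{d\lambda_A} + e^2 f_{\omega,e}^2\frac{d\alpha_{\omega,e}}{d\lambda_A} = e f_{\omega,e}^2\,\delta_{-1}^{A} + 2e f_{\omega,e}(\omega - e\alpha_{\omega,e})\frac{df_{\omega,e}}{d\lambda_A} \tag{167} \]
from which $\big\|\Delta\frac{d\alpha_{\omega,e}}{d\lambda_A}\big\|_{L^p} = O(e)$ for $p \in (1, \infty)$ follows immediately. The lemma follows trivially from repeated differentiation, the Calderón–Zygmund inequality and the Hölder and Sobolev inequalities.

Let $\zeta(x; \lambda)$ be the unique solution in $\dot H^1$ of (52), $-\Delta\zeta = -\gamma u\cdot\nabla\alpha_{\omega,e}(Z)$, which takes the Lorentz transformed solitons into Coulomb gauge. Then

Lemma 34. $\|\nabla_i\nabla_j\zeta\|_{L^p} = O(e)$, $\|\nabla_i\nabla_j\partial_\lambda\zeta\|_{L^p} = O(e)$, for $p \in (\tfrac{3}{2}, \infty)$ and $i, j = 1, 2, 3$.

Proof. By (52), and its derivative
\[ -\Delta\frac{d\zeta}{d\lambda_A} = -\gamma u\cdot\nabla\frac{d\alpha_{\omega,e}}{d\lambda_A} - \frac{d(\gamma u)}{d\lambda_A}\cdot\nabla\alpha_{\omega,e}, \tag{168} \]
the result follows by means of Lemmas 29 and 33.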
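For the reader's convenience, here is a short sketch of how identity (167) in the proof of Lemma 33 arises. It assumes, as the index conventions of Sec. A.1.4 suggest but this appendix does not state explicitly, that the parameter $\lambda_{-1}$ is the frequency $\omega$, so that $\partial\omega/\partial\lambda_A = \delta_{-1}^{A}$; the shorthand $\partial_A = d/d\lambda_A$ is introduced here only for illustration.

```latex
% Sketch, assuming \lambda_{-1} = \omega. Start from the equation for \alpha_{\omega,e}:
%   -\Delta\alpha + e^2 f^2 \alpha = e\,\omega f^2 .
% Differentiate both sides with respect to \lambda_A:
\begin{align*}
-\Delta\,\partial_A\alpha + e^2 f^2\,\partial_A\alpha + 2e^2 f\,\alpha\,\partial_A f
  &= e\,\delta_{-1}^{A} f^2 + 2e\,\omega f\,\partial_A f .
\end{align*}
% Move the term 2 e^2 f \alpha \partial_A f to the right-hand side:
\begin{align*}
-\Delta\,\partial_A\alpha + e^2 f^2\,\partial_A\alpha
  &= e f^2\,\delta_{-1}^{A} + 2e f\,(\omega - e\alpha)\,\partial_A f ,
\end{align*}
% which is (167). Every term on the right carries an explicit factor of e.
```

Since each term on the right-hand side carries a factor of $e$, the $O(e)$ bounds then follow once the $L^p$ norms of $f_{\omega,e}$, $\alpha_{\omega,e}$ and $df_{\omega,e}/d\lambda_A$ are controlled, as the proof indicates.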
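A.1.3. Differentiability

Lemma 35. Let $f_{\omega,e} \in H^2$ be given by Theorem 6. Then it is a differentiable function of $\omega$ and satisfies, for small $e$:
\[ \|f_{\omega,e} - f_\omega\|_{H^2} + \|\partial_\omega f_{\omega,e} - \partial_\omega f_\omega\|_{H^2} = O(e^2). \tag{169} \]

Proof. See [17].

Lemma 36. Let $\tilde h_\omega = h_\omega - \omega q_\omega$, where $h_\omega = H(\Phi_{S,e}(0, \omega, 0, 0))$ while $q_\omega = Q(\Phi_{S,e}(0, \omega, 0, 0))$. Then
\[ \frac{d}{d\omega}\tilde h_\omega = -q_\omega. \tag{170} \]

Proof. Following the argument given in [10], we note that
\[ \frac{d}{d\omega}\tilde h_\omega = -q_\omega + \Big\langle H'(\Phi_{S,e}(\lambda_0)) - \omega Q'(\Phi_{S,e}(\lambda_0)), \frac{d\Phi_{S,e}(\lambda_0)}{d\omega}\Big\rangle_{L^2}, \tag{171} \]
where $\lambda_0 = (\omega, 0, 0, 0)$. The result follows from the fact that $H'(\Phi_{S,e}(\lambda_0)) - \omega Q'(\Phi_{S,e}(\lambda_0)) = 0$.

A.1.4. Some identities involving $(\partial_\lambda\phi_{S,0}, \partial_\lambda\psi_{S,0})$

The explicit calculation of the modulation equations can be carried out by making use of the following functions $(a_A(Z(x, \lambda); \lambda), b_A(Z(x, \lambda); \lambda))$ from [24]:
\[ b_{-1}(Z; \lambda) = g_\omega - iu\cdot Z f_\omega, \tag{172} \]
\[ b_0(Z; \lambda) = i f_\omega, \tag{173} \]
\[ b_i(Z; \lambda) = \nabla_Z^i f_\omega(Z), \tag{174} \]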
b3+i (Z; λ) = ζji ∇jZ fω (Z) − iωγ((γPu + Qu )Z)i fω (Z),
(175)
a−1 (Z; λ) = −γ −1 b0 + (γu · ∇Z − iγω)b−1 ,
(176)
while
a0 (Z; λ) = (γu · ∇Z − iγω)b0
(177)
ai (Z; λ) = (γu · ∇Z − iγω)bi ,
(178)
a3+i (Z; λ) = (γPu + Qu )Z)ij bj + (γu · ∇Z − iγω)b3+i , where i, j = 1, 2, 3, gω =
d dω fω ,
(179)
and
ζji = γ 2 (u · Z)(Pu )ji +
γ−1 γ−1 (u · Z)(Qu )ji + (Qu Z)i uj . 2 γ|u| |u|2
These are convenient for computation of the modulation equations because the linear span of the ∂ λA (φS,0 , ψS,0 ) is the same as the linear span of the (bA , −aA ).
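(To be precise: except for $A = j \in \{1, 2, 3\}$, we have $\partial_{\lambda_A}(\phi_{S,0}, \psi_{S,0}) = (b_A, -a_A)$, and for $A = j$ we have $\partial_{\xi^j}(\phi_{S,0}, \psi_{S,0}) = -(\gamma P_u + Q_u)_{jk}(b_k, -a_k) + \omega\gamma u^j(b_0, a_0)$.)

The following identities are equivalent to Lemma 2.2 in [24], and can be obtained by differentiating the Euler–Lagrange equation $F_0' = 0$, where $F_0$ is the augmented Hamiltonian (48):
\[ (i\gamma\omega - u\cdot\nabla)\partial_{\lambda_0}\phi_{S,0} - \partial_{\lambda_0}\psi_{S,0} = 0, \tag{180} \]
\[ (i\gamma\omega - u\cdot\nabla)\partial_{\lambda_0}\psi_{S,0} - M_\lambda\partial_{\lambda_0}\phi_{S,0} = 0, \tag{181} \]
\[ (i\gamma\omega - u\cdot\nabla)\partial_{\lambda_j}\phi_{S,0} - \partial_{\lambda_j}\psi_{S,0} = 0, \tag{182} \]
\[ (i\gamma\omega - u\cdot\nabla)\partial_{\lambda_j}\psi_{S,0} - M_\lambda\partial_{\lambda_j}\phi_{S,0} = 0, \tag{183} \]
\[ (i\gamma\omega - u\cdot\nabla)\partial_{\lambda_{-1}}\phi_{S,0} - \partial_{\lambda_{-1}}\psi_{S,0} = -\frac{1}{\gamma}\partial_{\lambda_0}\phi_{S,0}, \tag{184} \]
\[ (i\gamma\omega - u\cdot\nabla)\partial_{\lambda_{-1}}\psi_{S,0} - M_\lambda\partial_{\lambda_{-1}}\phi_{S,0} = -\frac{1}{\gamma}\partial_{\lambda_0}\psi_{S,0}, \tag{185} \]
\[ (i\gamma\omega - u\cdot\nabla)\partial_{\lambda_{3+j}}\phi_{S,0} - \partial_{\lambda_{3+j}}\psi_{S,0} = -\partial_{\lambda_j}\phi_{S,0} - \gamma\omega u_j\partial_{\lambda_0}\phi_{S,0}, \tag{186} \]
\[ (i\gamma\omega - u\cdot\nabla)\partial_{\lambda_{3+j}}\psi_{S,0} - M_\lambda\partial_{\lambda_{3+j}}\phi_{S,0} = -\partial_{\lambda_j}\psi_{S,0} - \gamma\omega u_j\partial_{\lambda_0}\psi_{S,0}, \tag{187} \]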
where the index $j$ runs from 1 to 3.

A.2. Some estimates

A.2.1. Estimates related to the external field

Lemma 37. Let $f$ be a measurable function with $(1 + |x|)f \in L^1$. Then if $a^{\delta,\chi}$ is as in (64)
\[ \|e a_0^{\delta,\chi} f\|_{L^p} \le ceL_1\|(1 + |x - \xi|)f\|_{L^p}, \tag{188} \]
and
\[ \|e a^{\delta,\chi} f\|_{L^p} \le ceL_1\|(1 + |x - \xi|)f\|_{L^p} \tag{189} \]
for $p \in [1, \infty]$. If in addition $f_{even}$ is an even function of $(x - \xi)$ and $(1 + |x|)^2 f_{even} \in L^1$ then
\[ \Big|\int a_\mu^{\delta,\chi} f_{even}\,d^3x\Big| \le cL_2\delta\|(1 + |x - \xi|)^2 f_{even}\|_{L^1} \tag{190} \]
with $L_1$, $L_2$ as in (7).

Proof. Recall (64) and (65). Writing
\[ a_0^\delta(t, x) - a_0^\delta(t, \xi) = (x - \xi)\cdot\int_0^1\nabla a_0^\delta(t, \xi + s(x - \xi))\,ds \tag{191} \]
etc., by the fundamental theorem of calculus, the result then follows, using the fact that the gradients of $a_0^\delta$, $a^\delta$ are bounded independent of $\delta$ by assumption (see
Sec. 1.2). For the proof of (190), it suffices to use the identity for $\nabla a_\mu^\delta$ corresponding to (191), and then substitute this back into (191) and use the fact that $\int(x - \xi)f_{even} = 0$. Similarly, we have the following bounds:

Lemma 38.
\[ \|(1 + |x - \xi|)^{-1}(\partial_t + u\cdot\nabla)a_\mu^{\delta,\chi}\|_{L^\infty} \le C_1(|\delta| + |e|), \tag{192} \]
\[ \int_{\mathbb{R}^3} f(x)\,|\nabla_{t,x}a_\mu^\delta(t, x) - (\nabla_{t,x}a_\mu^\delta)(t, \xi)|\,dx \le C_2|\delta|, \tag{193} \]
where we use (102), (7), and $C_1 = C_1(L_1, L_2)$ and $C_2 = C_2(L_2, \|(1 + |x|)f\|_{L^1})$.

A.2.2. Estimates for the time component of the electromagnetic potential

Lemma 39. Given $(v, w) \in H^1 \times L^2$ and $\lambda \in \tilde O$ there exists a unique $\tilde A_0 \in \dot H^1$ solving (71) such that
\[ \|\nabla\tilde A_0\|_{L^p} = O\big(e\|(v, w, \tilde A, \tilde E)\|_H + e\|(v, w, \tilde A, \tilde E)\|_H^2\big), \tag{194} \]
for $p \in (\tfrac{3}{2}, 3]$. Consequently $\|\tilde A_0\|_{L^q}$ satisfies the same bound for $3 < q < \infty$ by Sobolev's inequality.

Proof. From Gauss's law (71), we have explicitly
\[ -\Delta\tilde A_0 = e\langle if_{\omega,e}, w\rangle + e\langle iv, (i\gamma(\omega - e\alpha_{\omega,e}) - u\cdot\nabla)f_{\omega,e} + w\rangle. \tag{195} \]
By Sobolev's and Hölder's respective inequalities,
\[ \|\Delta\tilde A_0\|_{L^q} = O\big(e\|(v, w, \tilde A, \tilde E)\|_H + e\|(v, w, \tilde A, \tilde E)\|_H^2\big) \]
for $q \in [1, \tfrac{3}{2}]$. The lemma follows from the Sobolev inequality and from the Calderón–Zygmund inequality, [8, Sec. 9.4].

A.2.3. Integration by parts and simple averaging

First we recall the phenomenon of averaging in the context of ordinary differential equations, in the simplest possible case of the perturbed harmonic oscillator. Let $g$ be a $C^1$ function of $t \in \mathbb{R}$, with $|g| \le M$ and $|\dot g| \le N$. For $0 < \varepsilon \ll 1$ let $y^\varepsilon$ be the solution of $\ddot y + y = \varepsilon g(\varepsilon t)$ with initial data $y^\varepsilon(0) = y_0$, $\dot y^\varepsilon(0) = y_1$ (fixed independent of $\varepsilon$). Then $y^\varepsilon - y^0$ is $O(\varepsilon)$ in $C^1([-T, T])$ norm for times of $T = O(1/\varepsilon)$. One way to prove this is to define $f = \varepsilon^{-1}(y^\varepsilon - y^0)$, which solves $\ddot f + f = g(\varepsilon t)$ with zero initial data. Let $E(t) = (f^2 + \dot f^2)/2$ be the energy; it satisfies $E(0) = 0$ and
$\dot E(t) = \dot f(t)g(\varepsilon t)$. Now an integration by parts gives
\[ \Big|\int_0^T\dot f(t)g(\varepsilon t)\,dt\Big| \le M|f(T)| + \varepsilon N\int_0^T|f(t)|\,dt \le M|f(T)| + \frac{\varepsilon TN^2}{2} + \frac{\varepsilon}{2}\int_0^T|f(t)|^2\,dt, \]
which, by Gronwall's inequality, implies $E(t) = O(1)$ for $t = O(1/\varepsilon)$ as claimed. To conclude, this simple fact (that a small slowly varying inhomogeneous term $\varepsilon g(\varepsilon t)$ only influences a simple harmonic oscillator to $O(\varepsilon)$ on time scales of $O(1/\varepsilon)$) expresses a weak averaging effect, and can be proved by integration by parts. Of course, this argument can be modified to give information about perturbed oscillators on longer time scales of $O(1/\varepsilon^a)$, $a < 2$, and many different generalizations are possible. A simple generalization, which is useful for the study of slow motion of solitons, can be obtained by integrating the identity
\[ \langle(\partial_t + u\cdot\nabla)F, G\rangle_{L^2} = \partial_t\langle F, G\rangle_{L^2} - \langle F, (\partial_t + u\cdot\nabla)G\rangle_{L^2} \tag{196} \]
where $F$, $G$ are sufficiently regular functions of $t$, $x$ but $u = u(t)$ depends on $t$ only and the inner product is $L^2(dx)$. This is often useful because in perturbation theory for solitons functions often arise with $(\partial_t + u\cdot\nabla)G$ small; see (146). The following result, used in the proof of Proposition 24, is a more complicated version of this idea:

Proposition 40. In the situation of Lemma 23,
\[ \langle(\partial_t + ih + u\cdot\nabla)v, N(f_{\omega,e}, f_\omega, v)\rangle_{L^2} = \partial_t(o(\tilde W)) + O(e\tilde W + e^3\tilde W^{1/2}), \]
where a function $f$ satisfies $f = d/dt(o(\tilde W))$ if there exists a $C^1$ function $g = o(\tilde W)$ and $f = \frac{d}{dt}g$.

Proof. We work mostly with the potential $V_1(\phi) = -U(|\phi|)$ which determines $N$: recall that $V_1'(\phi) = -\beta(|\phi|)\phi$, and (being slightly cavalier with notation) (76) can be rewritten $N(f_{\omega,e}, f_\omega, v) = -V_1'(f_{\omega,e} + v) + V_1'(f_{\omega,e}) + V_1''(f_\omega)(v)$. Define
\[ \bar\Theta = \int_0^t h\,ds, \tag{197} \]
\[ f_{\omega,e}^* = \mathrm{Exp}[i\bar\Theta]f_{\omega,e} \tag{198} \]
and
\[ v^* = \mathrm{Exp}[i\bar\Theta]v. \tag{199} \]
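Then, as with (93), and using the fact that $\|\partial_t v^*\|_{L^2} = O(e + \tilde W^{1/2})$ by (66), we have
\[ \langle\partial_t v + ihv, N(f_{\omega,e}, f_\omega, v)\rangle_{L^2} = -\langle\partial_t v^*, V_1'(f_{\omega,e}^* + v^*) - V_1'(f_{\omega,e}^*) - V_1''(f_{\omega,e}^*)[v^*]\rangle_{L^2} + O(e^3\tilde W^{1/2} + e^2\tilde W). \tag{200} \]
But,
\[ \langle\partial_t v^*, V_1'(f_{\omega,e}^* + v^*) - V_1'(f_{\omega,e}^*) - V_1''(f_{\omega,e}^*)[v^*]\rangle_{L^2} = \partial_t\int\Big(V_1(f_{\omega,e}^* + v^*) - V_1(f_{\omega,e}^*) - V_1'(f_{\omega,e}^*)[v^*] - \frac{1}{2}V_1''(f_{\omega,e}^*)[v^*]^2\Big)dx - \Big\langle\partial_t f_{\omega,e}^*, V_1'(f_{\omega,e}^* + v^*) - V_1'(f_{\omega,e}^*) - V_1''(f_{\omega,e}^*)[v^*] - \frac{1}{2}V_1^{(3)}(f_{\omega,e}^*)[v^*]^2\Big\rangle_{L^2}. \tag{201} \]
Hence,
\[ \langle\partial_t v^*, V_1'(f_{\omega,e}^* + v^*) - V_1'(f_{\omega,e}^*) - V_1''(f_{\omega,e}^*)[v^*]\rangle_{L^2} = \partial_t\int\Big(V_1(f_{\omega,e} + v) - V_1(f_{\omega,e}) - V_1'(f_{\omega,e})[v] - \frac{1}{2}V_1''(f_{\omega,e})[v]^2\Big)dx - \langle(\partial_t + ih)f_{\omega,e}, V_1'(f_{\omega,e} + v) - V_1'(f_{\omega,e}) - V_1''(f_{\omega,e})[v]\rangle_{L^2} + \frac{1}{2}\langle(\partial_t + ih)f_{\omega,e}, V_1^{(3)}(f_{\omega,e})[v]^2\rangle_{L^2}. \tag{202} \]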
Now, (3) ihfω,e , V1 (fω,e )[v]2 L2
≤c
|fω,e h|(1 + |fω,e |3 )|v|2 dx,
(203)
by condition (13). Additionally, ihfω,e , V1 (fω,e + v) − V1 (fω,e ) − V1 (fω,e )[v]L2 1 = (1 − s)ihfω,e , (V1 (fω,e + sv) − V1 (fω,e ))[v]L2 , 0
≤ c
|fω,e h|(1 + |fω,e |3 )(|v|2 + |v|5 )dx,
(204)
by condition (13). Therefore, by the exponential decay of fω,e and the fact that |fω,e h|Lp = O(e) by the bounds of Sec. 2.1.4, 1 (3) 2 ˜ ). = O(eW ihfω,e , V1 (fω,e + v) − V1 (fω,e ) − V1 (fω,e )[v] − V1 (fω,e )[v] 2 L2
Integration by parts and Lemma 35 imply that
\[ \langle u\cdot\nabla v, N(f_{\omega,e}, f_{\omega,0}, v)\rangle = O(e^2\tilde W) + \Big\langle u\cdot\nabla f_{\omega,e}, V_1'(f_{\omega,e} + v) - V_1'(f_{\omega,e}) - V_1''(f_{\omega,e})[v] - \frac{1}{2}V_1^{(3)}(f_{\omega,e})[v]^2\Big\rangle. \]
Next notice that the quantity
\[ \Big\langle(\partial_t + u\cdot\nabla)f_{\omega,e}, V_1'(f_{\omega,e} + v) - V_1'(f_{\omega,e}) - V_1''(f_{\omega,e})[v] - \frac{1}{2}V_1^{(3)}(f_{\omega,e})[v]^2\Big\rangle \]
can be estimated to be $O(e\tilde W)$ in the same way as the bounds (203), (204) once we note that, for every $p \in [1, \infty]$,
\[ \|\partial_t f_{\omega,e} + u\cdot\nabla f_{\omega,e}\|_{L^p} = \|(\dot\lambda - V_0(\lambda))\cdot\partial_\lambda f_{\omega,e}\|_{L^p} = O(e), \tag{205} \]
by (102). The proof is now completed by noticing that Taylor's theorem and (13) imply that the quantity
\[ \int\Big(V_1(f_{\omega,e}^* + v^*) - V_1(f_{\omega,e}^*) - V_1'(f_{\omega,e}^*)[v^*] - \frac{1}{2}V_1''(f_{\omega,e}^*)[v^*]^2\Big)dx \]
is $O(\tilde W^{3/2}) + O(\tilde W^3) = o(\tilde W)$, since $\tilde W$ is small by assumption.

Acknowledgment

Both authors were supported by EPSRC.

References

[1] T. D'Aprile and D. Mugnai, Solitary waves for nonlinear Klein–Gordon–Maxwell and Schrödinger–Maxwell equations, Proc. Roy. Soc. Edinburgh Sect. A 134(5) (2004) 893–906.
[2] V. Benci and D. Fortunato, Solitary waves of the nonlinear Klein–Gordon equation coupled with the Maxwell equations, Rev. Math. Phys. 14(4) (2002) 409–420.
[3] H. Berestycki and P. L. Lions, Nonlinear scalar field equations. I. Existence of a ground state, Arch. Ration. Mech. Anal. 82 (1983) 313–345.
[4] H. Berestycki, P. L. Lions and L. Peletier, An ODE approach to existence of positive semilinear solutions for semilinear problems in $\mathbb{R}^n$, Indiana Univ. Math. J. 30 (1983) 141–157.
[5] J. C. Bronski and R. L. Jerrard, Soliton dynamics in a potential, Math. Res. Lett. 7 (2000) 329–342.
[6] P. A. M. Dirac, Classical theory of radiating electrons, Proc. Roy. Soc. London A 167 (1938) 148–169.
[7] R. Feynman, R. Leighton and M. Sands, The Feynman Lectures on Physics, Vol. II (Addison–Wesley, Reading, Mass, 1981).
[8] D. Gilbarg and N. S. Trudinger, Elliptic Partial Differential Equations of Second Order (Springer-Verlag, Berlin, 1998).
[9] H.-P. Gittel, J. Kijowski and E. Zeidler, The relativistic dynamics of the combined particle-field system in renormalized classical electrodynamics, Comm. Math. Phys. 198 (1998) 711–736.
[10] M. Grillakis, J. Shatah and W. Strauss, Stability theory of solitary waves in the presence of symmetry, I, J. Funct. Anal. 74 (1987) 160–197.
[11] M. Grillakis, Regularity for the wave equation with a critical non-linearity, Comm. Pure Appl. Math. 45(6) (1992) 749–774.
[12] B. Jonsson, J. Fröhlich, S. Gustafson and I. M. Sigal, Long time motion of NLS solitary waves in a confining potential, Ann. Henri Poincaré 7 (2006) 621–660.
[13] S. Klainerman and M. Machedon, On the Maxwell–Klein–Gordon equation with finite energy, Duke Math. J. 74(1) (1994) 19–44.
[14] M. Kunze and H. Spohn, Adiabatic limit for the Maxwell–Lorentz equations, Ann. Henri Poincaré 1(4) (2000) 625–653.
[15] T. D. Lee, Particle Physics and Introduction to Field Theory (Harwood, New York, 1981).
[16] E. Long, On charged solitons and electromagnetism, Doctoral Thesis, University of Cambridge (2006).
[17] E. Long, Existence and stability of solitary waves in nonlinear Klein–Gordon–Maxwell equations, Rev. Math. Phys. 18 (2006) 747–779.
[18] K. McLeod, Uniqueness of positive radial solutions of $\Delta u + f(u) = 0$ in $\mathbb{R}^n$, Trans. Amer. Math. Soc. 339(3) (1993) 495–505.
[19] M. Reed and B. Simon, Methods of Modern Mathematical Physics, Vol. 1 (Academic Press, New York, 1972).
[20] J. Shatah, Stable standing waves of nonlinear Klein–Gordon equations, Comm. Math. Phys. 91 (1983) 313–327.
[21] H. Spohn, Dynamics of Charged Particles and Their Radiation Field (Cambridge University Press, Cambridge, 2004).
[22] W. Strauss, Existence of solitary waves in higher dimensions, Comm. Math. Phys. 55 (1977) 149–162.
[23] R. S. Strichartz, Restrictions of Fourier transforms to quadratic surfaces and decay of solutions of wave equations, Duke Math. J. 44 (1977) 705–714.
[24] D. M. A. Stuart, Modulational approach to stability of non-topological solitons in semilinear wave equations, J. Math. Pures Appl. 80(1) (2001) 51–83.
[25] D. M. A. Stuart, The geodesic hypothesis and non-topological solitons on pseudo-Riemannian manifolds, Ann. Sci. École Norm. Sup. 37(4) (2004) 312–362.
[26] D. M. A. Stuart, Geodesics and the Einstein nonlinear wave system, J. Math. Pures Appl. 83(9) (2004) 541–587.
[27] M. Weinstein, Modulational stability of ground states of nonlinear Schrödinger equations, SIAM J. Math. Anal. 16(3) (1985) 472–491.
[28] A. D. Yaghjian, Relativistic Dynamics of a Charged Sphere: Updating the Lorentz–Abraham Model, Lecture Notes in Physics, Vol. 686 (Springer, Heidelberg, 2006).
May 12, 2009 14:51 WSPC/148-RMP
J070-00368
Reviews in Mathematical Physics Vol. 21, No. 4 (2009) 511–548 c World Scientific Publishing Company
ON SPECTRAL RENORMALIZATION GROUP
JÜRG FRÖHLICH∗, MARCEL GRIESEMER† and ISRAEL MICHAEL SIGAL‡,§
∗Institute for Theoretical Physics, ETH Zurich, Switzerland and IHES, Bures-sur-Yvette, France
†Department of Mathematics, University of Stuttgart, D-70569 Stuttgart, Germany
‡Department of Mathematics, University of Toronto, Toronto, ON M5S 2E4, Canada
§[email protected]
Received 4 December 2008 Revised 6 April 2009 The operator-theoretic renormalization group (RG) methods are powerful analytic tools to explore spectral properties of field-theoretical models such as quantum electrodynamics (QED) with non-relativistic matter. In this paper, these methods are extended and simplified. In a companion paper, our variant of operator-theoretic RG methods is applied to establishing the limiting absorption principle in non-relativistic QED near the ground state energy. Keywords: Renormalization group; quantum electrodynamics; renormalization flow; Feshbach–Schur map; stable and unstable manifolds; limiting absorption principle; ground state; ground state energy; resonances; spectrum. Mathematics Subject Classification 2000: 81T17, 47A55, 81V10
1. Introduction

This paper is devoted to the nuts and bolts of the spectral (operator-theoretic) renormalization group (RG) method introduced in [8, 9] and developed further in [3, 20]. This method has been used successfully in order to describe the spectral structure of non-relativistic quantum electrodynamics (QED) with confining potentials and of Nelson's model with a "subcritical" interaction [8, 9, 12, 4, 15, 25] (see [21] for a book exposition and [5, 6, 17] for an alternative multiscale technique). The RG technique developed in this paper is a variant of the one presented in [3], where the smooth Feshbach–Schur map was introduced. It is simpler than that of [3] and similar to that of [20]. In this paper, we apply the RG technique to prove existence of eigenvalues and to describe continuous spectra for operators on Fock spaces appearing in massless quantum field theories for which standard techniques do not work. The latter
results are complementary to those of papers [3, 20] which deal only with eigenvalues. The results obtained here are used in subsequent papers to prove existence of the ground state and resonances for non-relativistic QED without the confinement assumption ([25], see also [5]) and to prove local decay near the ground state energy ([18], see also [17]). This paper is self-contained, except for the proof of the combinatorial Theorem A.1 (Wick Ordering), for which we refer to [9, Theorem A.4].

The class of Hamiltonians and the problems we consider here originate in non-relativistic QED. This theory deals with the interactions of non-relativistic matter with the quantized electromagnetic field. (See [13, 14, 21, 26] for background.) The dynamics of non-relativistic matter is generated by the Schrödinger operator
\[ H_p := -\sum_{j=1}^n\frac{1}{2m_j}\Delta_{x_j} + V(x), \tag{1.1} \]
where $\Delta_{x_j}$ is the Laplacian in the variable $x_j$, $x = (x_1, \dots, x_n)$, and $V(x)$ is the potential energy of the particle system. This operator acts on the Hilbert space $\mathcal{H}_p$, which is either $L^2(\mathbb{R}^{3n})$ or a subspace of this space determined by a symmetry group of the particle system. We assume that $V(x)$ is real and such that the operator $H_p$ is self-adjoint. The quantized electromagnetic field is described by the quantized vector potential
\[ A(y) = \int\big(e^{iky}a(k) + e^{-iky}a^*(k)\big)\chi(k)\frac{d^3k}{\sqrt{|k|}} \tag{1.2} \]
in the Coulomb gauge ($\mathrm{div}\,A(x) = 0$). Here $\chi$ is an ultraviolet cut-off: $\chi(k) = \frac{1}{\sqrt{(2\pi)^3 2}}$ in a neighborhood of $k = 0$, and $\chi$ vanishes rapidly at infinity. The dynamics of the quantized electromagnetic field is given by the quantum Hamiltonian
\[ H_f = \int d^3k\,\omega(k)a^*(k)\cdot a(k). \tag{1.3} \]
The operators $A(y)$ and $H_f$ act on the Fock space $\mathcal{H}_f \equiv \mathcal{F}$. Above, $\omega(k) = |k|$ is the dispersion law connecting the energy, $\omega(k)$, of the field quantum with its wave vector $k$, and $a^*(k)$ and $a(k)$ denote the creation and annihilation operators on $\mathcal{F}$. The latter are operator-valued generalized, transverse vector fields:
\[ a^\#(k) := \sum_{\lambda\in\{0,1\}} e_\lambda(k)a_\lambda^\#(k), \]
where $e_\lambda(k)$ are polarization vectors, i.e. orthonormal vectors in $\mathbb{R}^3$ satisfying $k\cdot e_\lambda(k) = 0$, and $a_\lambda^\#(k)$ are scalar creation and annihilation operators satisfying canonical commutation relations. The right-hand side of (1.3) can be understood as a weak integral. See Supplement D for a brief review of definitions of the Fock space, the creation and annihilation operators and the operator $H_f$.
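As a small illustration of the transversality condition on the polarization vectors, the following sketch builds, for a given non-zero wave vector $k$, two orthonormal vectors orthogonal to $k$. The construction (a cross product with the coordinate axis least aligned with $k$) and the function names are choices made here for illustration only; the paper does not prescribe a particular choice of $e_\lambda(k)$.

```python
import math

def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(a):
    """Rescale a non-zero 3-vector to unit length."""
    norm = math.sqrt(dot(a, a))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return (a[0] / norm, a[1] / norm, a[2] / norm)

def polarization_vectors(k):
    """Return (e0, e1): orthonormal vectors in R^3, both orthogonal to k != 0.

    Hypothetical helper: one admissible choice among many.
    """
    n = normalize(k)
    # Use the coordinate axis least aligned with k, so the cross product
    # below is never close to zero.
    axes = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
    axis = min(axes, key=lambda a: abs(dot(a, n)))
    e0 = normalize(cross(n, axis))
    e1 = cross(n, e0)  # already unit length since n and e0 are orthonormal
    return e0, e1
```

Any rotation of the pair $(e_0, e_1)$ about the direction of $k$ gives an equally valid choice, which is why no canonical pair is singled out here.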
The Hamiltonian of the total system, matter and radiation field, is given by
\[ H_g = \sum_{j=1}^n\frac{1}{2m_j}\big(-i\nabla_{x_j} + gA(x_j)\big)^2 + V(x) + H_f \tag{1.4} \]
acting on the Hilbert space $\mathcal{H} := \mathcal{H}_p \otimes \mathcal{H}_f$. Here the coupling constant $g$ is related to the fine-structure constant $\alpha = \frac{e^2}{4\pi\hbar c} \approx \frac{1}{137}$. (See [10, 17, 25] for a discussion of the definition of $H_g$ and units involved.) This model describes emission and absorption of radiation by systems of matter, such as atoms and molecules, as well as other processes of interaction of quantized radiation with matter. It has been extensively studied in the last decade; see [25, 26] for references to earlier contributions. For a large class of potentials $V(x)$, including Coulomb potentials, and for an ultraviolet cut-off in $A(x)$, the operator $H_g$ is self-adjoint.

The key problem of non-relativistic QED is to establish the spectral and resonance structure of $H_g$ and, in particular, to prove existence (and uniqueness) of the ground state and of resonances of $H_g$ corresponding to excited states of the atomic Hamiltonian. One verifies that $H_f$ defines a positive, self-adjoint operator on $\mathcal{F}$ with purely absolutely continuous spectrum, except for a simple eigenvalue 0 corresponding to the vacuum eigenvector $\Omega$ (see Supplement D). Thus, for $g = 0$, the low-energy spectrum of the Hamiltonian $H_0$ of the decoupled system consists of branches $[\epsilon_i^{(p)}, \infty)$ of absolutely continuous spectrum, where $\epsilon_i^{(p)}$ are the isolated eigenvalues of the particle Hamiltonian $H_p$, and of the eigenvalues $\epsilon_i^{(p)}$ sitting at the "thresholds" of the continuous spectrum. The absence of gaps between the eigenvalues and thresholds is a consequence of the fact that the photons are massless. This leads to hard and subtle problems in perturbation theory, known collectively as the infrared problem.

The first step in tackling the problem of ground states and resonances in the framework of the RG approach is to perform a certain canonical transformation and then apply to the resulting Hamiltonian a specially designed RG map in order to project out the particle- and high-photon-energy degrees of freedom ([25], cf. [8]).
As a result, one arrives at a Hamiltonian on Fock space of the form $H := T + W$, where $T := w_{0,0}[H_f]$, with $w_{0,0} : [0, \infty) \to \mathbb{C}$ continuous ($w_{0,0}[H_f]$ is defined by the operator calculus), and
\[ W := \sum_{m+n\ge 1}\chi_1\int_{B_1^{m+n}}\prod_{j=1}^{m+n}\frac{dk_j}{|k_j|^{1/2}}\prod_{j=1}^{m}a^*(k_j)\,w_{m,n}[H_f; k_1, \dots, k_{m+n}]\prod_{j=m+1}^{m+n}a(k_j)\,\chi_1. \tag{1.5} \]
Here $w_{m,n} : I \times B_1^{m+n} \to \mathbb{C}$, $m + n > 0$, $B_1^r$ denotes the Cartesian product of $r$ unit balls in $\mathbb{R}^3$, $I := [0, 1]$ and $\chi_1 := \chi_1(H_f)$ with $\chi_1(r)$ a smooth cut-off function such that $\chi_1 = 1$ for $r \le 9/10$, $\chi_1 = 0$ for $r \ge 1$ and $0 \le \chi_1(r) \le 1$. See Sec. 3 for
more details concerning notation. Operators on Fock space of the form above will be said to be in generalized normal (or Wick) form. Note that, in order to be able to apply our theory to the analysis of resonances of $H_g$, the operators $H = T + W$, introduced above, are allowed to be non-self-adjoint.

Our goal in this paper is to describe the spectrum of the operator $H$ near 0. We assume that the function $w_{0,0}(r)$, defining the operator $T := w_{0,0}[H_f]$, satisfies
\[ w_{0,0}(0) = 0, \qquad \sup_{r\in[0,\infty)}|w_{0,0}'(r) - 1| \le \beta_0. \tag{1.6} \]
We consider the operator $W$ (see (1.5)) as a perturbation of the operator $T := w_{0,0}[H_f]$, whose spectrum is explicitly known. It consists of the essential spectrum $w_{0,0}(\mathbb{R}^+)$ and an eigenvalue 0 at its tip with the eigenvector $\Omega$. We propose to determine the effect of the perturbation $W$ on the spectrum of $T$ near 0 and, in particular, to determine the fate of the eigenvalue 0 of $T$. If the operator $H$ has an eigenvalue near 0, we call it the ground state energy of $H$.

We denote by $\mathcal{D}^s$ the set of operators of the form $H = T + W$, where $T$ and $W$ are described above, such that (1.6) holds and $\|w_1\|_{\mu,s,\xi} \le \gamma_0$, where $w_1 := (w_{m,n})_{m+n\ge 1}$, and $\|w_1\|_{\mu,s,\xi}$ is a norm defined in Sec. 3. We define a subset $S$ of the complex plane by
\[ S := \Big\{w \in \mathbb{C} \;\Big|\; \mathrm{Re}\,w \ge 0,\ |\mathrm{Im}\,w| \le \frac{1}{3}\mathrm{Re}\,w\Big\}. \tag{1.7} \]
Recall that a complex function $f$ on an open set $\mathcal{D}$ in a complex Banach space $\mathcal{B}$ is said to be analytic if $\forall H \in \mathcal{D}$ and $\forall\xi \in \mathcal{B}$, $f(H + \tau\xi)$ is analytic in the complex variable $\tau$ for $|\tau|$ sufficiently small (or equivalently, $f$ is Gâteaux-differentiable, see [11]; a stronger notion of analyticity, requiring in addition that $f$ is locally bounded, is used in [22]). In the next theorem, $\mathcal{B}$ is the space of $H_f$-bounded operators on $\mathcal{F}$ (i.e. the space of closed operators $A$ with $A(H_f + 1)^{-1}$ bounded). We are now prepared to state the main result of this paper.

Theorem 1.1. Assume that $\beta_0$ and $\gamma_0$ are sufficiently small. Then there is an analytic map $e : \mathcal{D}^s \to \mathbb{C}$ such that $e(H) \in \mathbb{R}$, for $H = H^*$, and for $H \in \mathcal{D}^s$ the number $e(H)$ is a simple eigenvalue of the operator $H$ and $\sigma(H) \subset e(H) + S$.

Note that our approach also provides an effective way to compute the eigenvalue $e(H)$ and the corresponding eigenvector. Theorem 1.1 is used in [25, 18]. Besides, our main technical result, Theorem 5.1 formulated in Sec. 5, furnishes a key technical step in an RG proof of local decay, see [18]. Combining results of this paper with those of [1] one obtains estimates on the resolvent of $H$ near the eigenvalue $e(H)$: For each $\Psi$ and $\Phi$ from a dense set of
vectors, the matrix element $\langle\Psi, (H - z)^{-1}\Phi\rangle$ near the eigenvalue $e \equiv e(H)$ of $H$ is of the form
\[ \langle\Psi, (H - z)^{-1}\Phi\rangle = (e - z)^{-1}p(\Psi, \Phi) + r(z, \Psi, \Phi), \tag{1.8} \]
where $p$ and $r(z)$ are sesquilinear forms in $\Psi$ and $\Phi$ with $r(z)$ analytic in $z \in Q := \mathbb{C}\setminus(e(H) + S)$ and bounded on the intersection of a neighborhood of $e$ with $Q$ as
\[ |r(z, \Psi, \Phi)| \le C_{\Psi,\Phi}|e - z|^{-\gamma} \quad\text{for some } \gamma < 1. \]
Such estimates are needed in an analysis of the long time dynamics of resonances in QED; see [1]. This will be described in more detail elsewhere.

Next, we explain the main ideas of the spectral renormalization group method. Our goal is to describe the spectral structure near 0 of an operator $H$ from the set $\mathcal{D}^s$ introduced above. Denote by $D(0, \alpha)$ the disc in $\mathbb{C}$ centered at 0 and of radius $\alpha$. For $\alpha_0$ sufficiently small, we construct a renormalization transformation, $\mathcal{R}_\rho$, defined on $\mathcal{D} := D(0, \alpha_0)\cdot\mathbf{1} + \mathcal{D}^s$, with the following properties:
• $\mathcal{R}_\rho$ is "isospectral" and "preserves" the limiting absorption principle;
• $\mathcal{R}_\rho$ removes the photon degrees of freedom related to energies $\ge \rho$.
We then consider the discrete semi-flow, $\mathcal{R}_\rho^n$, $n \ge 1$, generated by the renormalization transformation, $\mathcal{R}_\rho$ (called renormalization group) and relate the dynamics of this flow to spectral properties of individual Hamiltonians in $\mathcal{D}^s$. We show that the flow, $\mathcal{R}_\rho^n$, has the fixed-point manifold $\mathcal{M}_{fp} := \mathbb{C}H_f$, an unstable manifold $\mathcal{M}_u := \mathbb{C}\mathbf{1}$, and a (complex) co-dimension 1 stable manifold $\mathcal{M}_s$ for $\mathcal{M}_{fp}$ foliated by (complex) co-dimension 2 stable manifolds for each fixed point. We show that $H - \lambda$ is in the domain of $\mathcal{R}_\rho^n$, provided the parameter $\lambda$ is adjusted appropriately, so that $H - \lambda$ is, roughly, in a $\rho^n$-neighborhood of the stable manifold $\mathcal{M}_s$ (see Fig. 1). Thus, for $n$ sufficiently large, the operators $H_\lambda^{(n)} := \mathcal{R}_\rho^n(H - \lambda)$ are close to the operator $wH_f$, for some $w \in \mathbb{C}$ with $\mathrm{Re}\,w > 0$, and their spectra can be easily analyzed. Since the renormalization map is "isospectral", we can pass this
Fig. 1. Stable and unstable manifolds.
spectral information to the operator $H_\lambda^{(n-1)}$, and so forth, until we obtain the desired spectral information for the initial operator $H$.

Our paper is organized as follows. In Sec. 2, we describe the Feshbach–Schur map, which is the main ingredient of the renormalization map introduced in Sec. 4. In Sec. 3, we define the Banach spaces on which the renormalization map acts. The renormalization group approach is presented in Sec. 5 where the main technical results implying Theorem 1.1 are proven. In Appendix A we present the proof of a key technical result describing properties of the renormalization map. This proof is close to the proof of a similar result in [3] and is presented here for the reader's convenience. In Appendix B, we present a result on the construction of eigenvalues and eigenvectors, similar to a corresponding result of [3]. Finally, in Supplement D, we collect some relevant facts on Fock space and creation and annihilation operators.

2. The Smooth Feshbach–Schur Map

In this section, we review the method of isospectral decimation maps acting on operators, introduced in [8, 9] and refined in [3]. At the origin of this method is the isospectral smooth Feshbach–Schur map$^a$ acting on a set of closed operators and mapping a given operator to one acting on a subspace of the original Hilbert space.

Let $\chi$, $\bar\chi$ be a partition of unity on a separable Hilbert space $\mathcal{H}$, i.e. $\chi$ and $\bar\chi$ are positive operators on $\mathcal{H}$ whose norms are bounded by one, $0 \le \chi, \bar\chi \le 1$, and $\chi^2 + \bar\chi^2 = 1$. We assume that $\chi$ and $\bar\chi$ are non-zero. Let $\tau$ be a (linear) projection acting on closed operators on $\mathcal{H}$ with the property that operators in its image commute with $\chi$ and $\bar\chi$. We also assume that $\tau(\mathbf{1}) = \mathbf{1}$. Let $\bar\tau := 1 - \tau$ and define
\[ H_{\tau,\chi^\#} := \tau(H) + \chi^\#\bar\tau(H)\chi^\#. \]
(2.1)
where $\chi^\#$ stands for either $\chi$ or $\bar\chi$. Given $\chi$ and $\tau$ as above, we denote by $D_{\tau,\chi}$ the space of closed operators, $H$, on $\mathcal{H}$ which belong to the domain of $\tau$ and satisfy the following three conditions:
(i) $\tau$ and $\chi$ (and therefore also $\bar\tau$ and $\bar\chi$) leave the domain $D(H)$ of $H$ invariant:
\[ D(\tau(H)) = D(H) \quad\text{and}\quad \chi D(H) \subset D(H), \tag{2.2} \]
(ii)
\[ H_{\tau,\bar\chi} \text{ is (bounded) invertible on } \mathrm{Ran}\,\bar\chi, \tag{2.3} \]
and (iii)
\[ \bar\tau(H)\chi \text{ and } \chi\bar\tau(H) \text{ extend to bounded operators on } \mathcal{H}. \tag{2.4} \]
(For more general conditions see [3, 19].)

$^a$ In [8, 9, 3] this map is called the Feshbach map. As was pointed out to us by Klopp and Simon, the invertibility procedure at the heart of this map was introduced by Schur in 1917; it appeared implicitly in an independent work of Feshbach on the theory of nuclear reactions, in 1958, where the problem of perturbations of operator eigenvalues was considered. See [19] for further extensions and historical remarks.
The smooth Feshbach–Schur map (SFM) maps operators on $\mathcal{H}$ belonging to $D_{\tau,\chi}$ to operators on $\mathcal{H}$ by $H \mapsto F_{\tau,\chi}(H)$, where
\[ F_{\tau,\chi}(H) := H_0 + \chi W\chi - \chi W\bar\chi H_{\tau,\bar\chi}^{-1}\bar\chi W\chi. \tag{2.5} \]
Here $H_0 := \tau(H)$ and $W := \bar\tau(H)$. Note that $H_0$ and $W$ are closed operators on $\mathcal{H}$ with coinciding domains, $D(H_0) = D(W) = D(H)$, and $H = H_0 + W$. We remark that the domains of $\chi W\chi$, $\bar\chi W\bar\chi$, $H_{\tau,\chi}$, and $H_{\tau,\bar\chi}$ all contain $D(H)$.

Remarks.
• The definition of the smooth Feshbach map given above differs somewhat from the one given in [3]. In [3], the map $F_{\tau,\chi}(H)$ is denoted by $F_\chi(H, \tau(H))$, and the pair of operators $(H, T)$ are referred to as a Feshbach pair.
• The usual Feshbach–Schur map is obtained as a special case of the smooth Feshbach–Schur map by choosing $\chi$ = projection, and, usually, $\tau = 0$.
• Typically the operator $\chi$ is taken to be of the form $\chi := \chi(A)$ for some self-adjoint operator $A$ on $\mathcal{H}$. For the Feshbach map, $\chi$ has to be a projection and therefore we would have to take $\chi := \chi(A)$ to be a characteristic function of the operator $A$, while in the smooth Feshbach–Schur map we are allowed to take $\chi := \chi(A)$ to be a smooth approximation of the characteristic function of an interval in $\mathbb{R}$. This explains the adjective "smooth" in the definition.
• In [3] a semi-group property of $F_{\tau,\chi}(H)$ is exhibited.

Next, we introduce some maps appearing in various identities involving the Feshbach–Schur map:
\[ Q_{\tau,\chi}(H) := \chi - \bar\chi H_{\tau,\bar\chi}^{-1}\bar\chi W\chi, \tag{2.6} \]
\[ Q_{\tau,\chi}^\#(H) := \chi - \chi W\bar\chi H_{\tau,\bar\chi}^{-1}\bar\chi. \tag{2.7} \]
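The definitions (2.5)–(2.7) can be tried out in finite dimensions. The following sketch is only an illustration under assumptions that are not taken from the paper: the Hilbert space is $\mathbb{C}^3$, $\chi$ and $\bar\chi$ are diagonal matrices with rational entries satisfying $\chi^2 + \bar\chi^2 = 1$, and $\tau$ is taken to be the map sending a matrix to its diagonal part (so that $\tau(\mathbf{1}) = \mathbf{1}$ and the image of $\tau$ commutes with $\chi$, $\bar\chi$). All function names are hypothetical. Exact rational arithmetic is used so that the operator identities of Theorem 2.1 can be checked without rounding error.

```python
from fractions import Fraction as Fr

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def prod(*mats):
    """Ordered product of several matrices."""
    out = mats[0]
    for m in mats[1:]:
        out = matmul(out, m)
    return out

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def diag(values):
    n = len(values)
    return [[Fr(values[i]) if i == j else Fr(0) for j in range(n)] for i in range(n)]

def inverse(A):
    """Gauss-Jordan inverse in exact rational arithmetic (A assumed invertible)."""
    n = len(A)
    M = [[Fr(x) for x in row] + [Fr(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)  # pivot row
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [x / piv for x in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

def feshbach_schur(H, chi_values, chibar_values):
    """Return (F, Q, Qs, R, chi, chibar, H0) for diagonal chi, chibar.

    Assumed choice: tau(H) is the diagonal part of H, so H0 = diag(H), W = H - H0.
    R is the inverse of H_{tau,chibar} = H0 + chibar W chibar.
    """
    n = len(H)
    chi, chibar = diag(chi_values), diag(chibar_values)
    H0 = diag([H[i][i] for i in range(n)])
    W = sub(H, H0)
    R = inverse(add(H0, prod(chibar, W, chibar)))
    F = sub(add(H0, prod(chi, W, chi)), prod(chi, W, chibar, R, chibar, W, chi))  # (2.5)
    Q = sub(chi, prod(chibar, R, chibar, W, chi))                                 # (2.6)
    Qs = sub(chi, prod(chi, W, chibar, R, chibar))                                # (2.7)
    return F, Q, Qs, R, chi, chibar, H0

# Example data: Pythagorean fractions give chi^2 + chibar^2 = 1 exactly.
H_EXAMPLE = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
CHI = [Fr(3, 5), Fr(4, 5), Fr(5, 13)]
CHIBAR = [Fr(4, 5), Fr(3, 5), Fr(12, 13)]
```

With this data one can verify, for instance, that `matmul(H_EXAMPLE, Q)` equals `matmul(chi, F)` (the intertwining relation behind the kernel correspondence) and that the inverse of `H_EXAMPLE` is recovered from the inverse of `F` through the resolvent identity stated in Theorem 2.1. In this toy setting $\chi$ is invertible, so $\mathrm{Ran}\,\chi$ is the whole space; the interesting infinite-dimensional case, where $\chi$ is a genuine cut-off, is not captured by the sketch.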
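Note that $Q_{\tau,\chi}(H) \in \mathcal{B}(\mathrm{Ran}\,\chi, \mathcal{H})$ and $Q_{\tau,\chi}^\#(H) \in \mathcal{B}(\mathcal{H}, \mathrm{Ran}\,\chi)$. The smooth Feshbach–Schur map of $H$ is isospectral to $H$ in the sense of the following theorem.

Theorem 2.1. Let $\chi$ and $\tau$ be as above, and assume that $H \in D_{\tau,\chi}$ so that $F_{\tau,\chi}(H)$ is well defined. Then
(i) $0 \in \rho(H) \Leftrightarrow 0 \in \rho(F_{\tau,\chi}(H))$, i.e. $H$ is bounded invertible on $\mathcal{H}$ if and only if $F_{\tau,\chi}(H)$ is bounded invertible on $\mathrm{Ran}\,\chi$.
(ii) If $\psi \in \mathcal{H}\setminus\{0\}$ solves $H\psi = 0$ then $\varphi := \chi\psi \in \mathrm{Ran}\,\chi\setminus\{0\}$ solves $F_{\tau,\chi}(H)\varphi = 0$.
(iii) If $\varphi \in \mathrm{Ran}\,\chi\setminus\{0\}$ solves $F_{\tau,\chi}(H)\varphi = 0$ then $\psi := Q_{\tau,\chi}(H)\varphi \in \mathcal{H}\setminus\{0\}$ solves $H\psi = 0$.
(iv) The multiplicity of the spectral value $\{0\}$ is conserved under the Feshbach–Schur map in the sense that $\dim\mathrm{Ker}\,H = \dim\mathrm{Ker}\,F_{\tau,\chi}(H)$.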
May 12, 2009 14:51 WSPC/148-RMP
518
J070-00368
J. Fr¨ ohlich, M. Griesemer & I. M. Sigal
(v) If one of the inverses, $H^{-1}$ or $F_{\tau,\chi}(H)^{-1}$, exists then so does the other, and these inverses are related by

$$H^{-1} = Q_{\tau,\chi}(H)\, F_{\tau,\chi}(H)^{-1}\, Q_{\tau,\chi}(H)^{\#} + \bar\chi\, H_{\tau,\bar\chi}^{-1}\,\bar\chi. \tag{2.8}$$

Moreover if $\tau(H)$ is invertible, then $F_{\tau,\chi}(H)^{-1} = \chi H^{-1}\chi + \bar\chi\,\tau(H)^{-1}\bar\chi$.

This theorem is proven in [3]; see [19] for further extensions. In comparison with the original use of the Feshbach projection method as a tool in the analytic perturbation theory of eigenvalues, the smooth Feshbach–Schur map has two new features:
• Flexibility in the choice of the projection; in particular, "dressing" the eigenspace corresponding to some eigenvalue with vectors from the continuous spectrum subspace, and relaxing the projection property altogether;
• Viewing the Feshbach–Schur procedure as a map on a space of operators, rather than a tool in the analysis of a fixed operator.

Our operator-theoretic renormalization group is based on an iterative composition of Feshbach–Schur maps, decimating the degrees of freedom of the system under investigation.

3. A Banach Space of Hamiltonians

We construct a Banach space of Hamiltonians on which our renormalization transformation will be defined. In order not to complicate matters unnecessarily, we will think of the creation and annihilation operators used below as scalar operators, neglecting the helicity of photons. We explain at the end of Supplement D how to reinterpret our expressions for the photon creation and annihilation operators.

Recall that $B_1^r$ denotes the Cartesian product of $r$ unit balls in $\mathbb{R}^3$, $I := [0, 1]$ and $m, n \ge 0$. Given functions $w_{0,0} : [0, \infty) \to \mathbb{C}$ and $w_{m,n} : I \times B_1^{m+n} \to \mathbb{C}$, $m + n > 0$, we consider monomials, $W_{m,n} \equiv W_{m,n}[w_{m,n}]$, in the creation and annihilation operators defined as follows: $W_{0,0}[w_{0,0}] := w_{0,0}[H_f]$ (defined by the functional calculus), and

$$W_{m,n}[w_{m,n}] := \int_{B_1^{m+n}} \frac{dk_{(m,n)}}{|k_{(m,n)}|^{1/2}}\; a^*(k_{(m)})\, w_{m,n}[H_f; k_{(m,n)}]\, a(\tilde k_{(n)}), \tag{3.1}$$

for $m + n > 0$. Here we are using the notation

$$k_{(m)} := (k_1, \dots, k_m) \in \mathbb{R}^{3m}, \qquad a^*(k_{(m)}) := \prod_{i=1}^{m} a^*(k_i), \tag{3.2}$$

$$k_{(m,n)} := (k_{(m)}, \tilde k_{(n)}), \qquad dk_{(m,n)} := \prod_{i=1}^{m} d^3k_i \prod_{i=1}^{n} d^3\tilde k_i, \tag{3.3}$$

$$|k_{(m,n)}| := |k_{(m)}| \cdot |\tilde k_{(n)}|, \qquad |k_{(m)}| := |k_1| \cdots |k_m|. \tag{3.4}$$

The notation $W_{m,n}[w_{m,n}]$ stresses the dependence of $W_{m,n}$ on $w_{m,n}$. Note that $W_{0,0}[w_{0,0}] := w_{0,0}[H_f]$. We also denote $T \equiv W_{0,0}[w_{0,0}]$.
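As a worked illustration of the notation (3.1)–(3.4), and nothing more than an unpacking of those definitions, the two lowest-order monomials with $m + n > 0$ read as follows (with $k, \tilde k \in B_1 \subset \mathbb{R}^3$):

$$W_{1,0}[w_{1,0}] = \int_{B_1} \frac{d^3k}{|k|^{1/2}}\; a^*(k)\, w_{1,0}[H_f; k], \qquad W_{1,1}[w_{1,1}] = \int_{B_1^{2}} \frac{d^3k\, d^3\tilde k}{|k|^{1/2}|\tilde k|^{1/2}}\; a^*(k)\, w_{1,1}[H_f; k, \tilde k]\, a(\tilde k).$$

In both cases the creation operators stand to the left of the kernel and the annihilation operators to the right, which is the ordering that the generalized normal form keeps track of.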
May 12, 2009 14:51 WSPC/148-RMP
J070-00368
On Spectral Renormalization Group
519
We assume that, for every $m$ and $n$ with $m + n > 0$, the function $w_{m,n}[r; k_{(m,n)}]$ is measurable in $k_{(m,n)} \in B_1^{m+n}$ and $s$ times continuously differentiable in $r \in I$, for some $s \ge 1$, and for almost every $k_{(m,n)} \in B_1^{m+n}$. As a function of $k_{(m,n)}$, it is totally symmetric with respect to the variables $k_{(m)} = (k_1, \dots, k_m)$ and $\tilde k_{(n)} = (\tilde k_1, \dots, \tilde k_n)$ and obeys the norm bound

$$\|w_{m,n}\|_{\mu,s} := \sum_{n=0}^{s} \|\partial_r^n w_{m,n}\|_{\mu} < \infty, \tag{3.5}$$

where

$$\|w_{m,n}\|_{\mu} := \max_j \sup_{r \in I,\; k_{(m,n)} \in B_1^{m+n}} \big|\, |k_j|^{-\mu}\, w_{m,n}[r; k_{(m,n)}] \big| \tag{3.6}$$

for some $\mu \ge 0$. Here and in what follows, $k_j$ is one of the 3-vectors in the variable $k_{(m,n)}$. Recall that $|k_{(m,n)}|^{-1/2}$ is absorbed in the integration measure in the definition of $W_{m,n}$. For $m + n = 0$ the variable $r$ ranges over $[0, \infty)$, and we assume that the following norm is finite:

$$\|w_{0,0}\|_{\mu,s} := |w_{0,0}(0)| + \sum_{1 \le n \le s}\; \sup_{r \in [0,\infty)} |\partial_r^n w_{0,0}(r)|. \tag{3.7}$$
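(This norm is independent of $\mu$, but we keep this index for notational convenience.) The Banach space of functions $w_{m,n}$ of this type is denoted by $\mathcal{W}^{\mu,s}_{m,n}$. We fix three numbers $\mu$, $0 < \xi < 1$ and $s \ge 0$ and define the Banach space

$$\mathcal{W}^{\mu,s} \equiv \mathcal{W}^{\mu,s}_{\xi} := \bigoplus_{m+n \ge 0} \mathcal{W}^{\mu,s}_{m,n}, \tag{3.8}$$

with the norm

$$\|w\|_{\mu,s,\xi} := \sum_{m+n \ge 0} \xi^{-(m+n)} \|w_{m,n}\|_{\mu,s} < \infty. \tag{3.9}$$

Clearly, $\mathcal{W}^{\mu',s'}_{\xi'} \subset \mathcal{W}^{\mu,s}_{\xi}$ if $\mu' \ge \mu$, $s' \ge s$ and $\xi' \le \xi$.

Let $\chi_1(r) \equiv \chi_{r \le 1}$ be a smooth cut-off function such that $\chi_1 = 1$ for $r \le 9/10$, $\chi_1 = 0$ for $r \ge 1$, $0 \le \chi_1(r) \le 1$ and $\sup|\partial_r^n \chi_1(r)| \le 30$ $\forall r$ and for $n = 1, 2$. We define $\chi_\rho(r) \equiv \chi_{r \le \rho} := \chi_1(r/\rho) \equiv \chi_{r/\rho \le 1}$ and $\chi_\rho \equiv \chi_{H_f \le \rho}$.

The following basic bound, proven in [3], links the norm defined in (3.6) to the operator norm on $B[\mathcal{F}]$.

Theorem 3.1. Fix $m, n \in \mathbb{N}_0$ such that $m + n \ge 1$. Suppose that $w_{m,n} \in \mathcal{W}^{\mu,s}_{m,n}$, and let $W_{m,n} \equiv W_{m,n}[w_{m,n}]$ be as defined in (3.1). Then for all $\lambda > 0$

$$\|(H_f + \lambda)^{-m/2}\, W_{m,n}\, (H_f + \lambda)^{-n/2}\| \le \|w_{m,n}\|_{0}, \tag{3.10}$$

and therefore

$$\|\chi_\rho W_{m,n} \chi_\rho\| \le \frac{\rho^{(m+n)(1+\mu)}}{\sqrt{m!\,n!}}\, \|w_{m,n}\|_{0}, \tag{3.11}$$

where $\|\cdot\|$ denotes the operator norm on $B[\mathcal{F}]$.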
Theorem 3.1 says that the finiteness of $\|w_{m,n}\|_0$ ensures that $\chi_1 W_{m,n} \chi_1$ defines a bounded operator on $B[\mathcal{F}]$.

With a sequence $w := (w_{m,n})_{m+n \ge 0}$ in $\mathcal{W}^{\mu,s}$ we associate an operator by setting

$$H(w) := W_{0,0}[w] + \sum_{m+n \ge 1} \chi_1 W_{m,n}[w] \chi_1, \tag{3.12}$$

where we write $W_{m,n}[w] := W_{m,n}[w_{m,n}]$. These operators are said to be in generalized normal (or Wick) form and are called generalized Wick-ordered operators. Theorem 3.1 shows that the series in (3.12) converges in the operator norm and obeys the estimate

$$\|H(w) - W_{0,0}(w)\| \le \xi\, \|w_1\|_{\mu,0,\xi}, \tag{3.13}$$

for arbitrary $w = (w_{m,n})_{m+n \ge 0} \in \mathcal{W}^{\mu,0}$ and any $\mu > -1/2$. Here $w_1 = (w_{m,n})_{m+n \ge 1}$. Hence we have the linear map

$$H : w \mapsto H(w) \tag{3.14}$$

from $\mathcal{W}^{\mu,0}$ into the set of closed operators on Fock space $\mathcal{F}$. The following result is proven in [3].

Theorem 3.2. For any $\mu \ge 0$ and $0 < \xi < 1$, the map $H : w \mapsto H(w)$, given in (3.12), is injective.

Next, we decompose the Banach space $\mathcal{W}^{\mu,s}$ into components having, as we will establish below, distinct scaling properties. We define the Banach spaces

$$\mathcal{T} := \{ f \in \mathcal{W}^{\mu,s}_{0,0} \mid f(0) = 0 \} \tag{3.15}$$

and

$$\mathcal{W}_1^{\mu,s} := \bigoplus_{m+n \ge 1} \mathcal{W}^{\mu,s}_{m,n}, \tag{3.16}$$

to consist of all sequences $w_1 := (w_{m,n})_{m+n \ge 1}$ obeying

$$\|w_1\|_{\mu,s,\xi} := \sum_{m+n \ge 1} \xi^{-(m+n)} \|w_{m,n}\|_{\mu,s} < \infty. \tag{3.17}$$

We observe that there is a natural bijection

$$\mathcal{W}^{\mu,s}_{0,0} \to \mathbb{C} \oplus \mathcal{T}, \qquad w_{0,0} \mapsto w_{0,0}[0] \oplus (w_{0,0} - w_{0,0}[0]).$$

We shall henceforth not distinguish between $\mathcal{W}^{\mu,s}_{0,0}$ and $\mathbb{C} \oplus \mathcal{T}$. We rewrite our Banach space $\mathcal{W}^{\mu,s}$ as

$$\mathcal{W}^{\mu,s} = \mathbb{C} \oplus \mathcal{T} \oplus \mathcal{W}_1^{\mu,s}. \tag{3.18}$$

We define the spaces $\mathcal{W}^{\mu,s}_{op} := H(\mathcal{W}^{\mu,s})$, $\mathcal{W}^{\mu,s}_{1,op} := H(\mathcal{W}_1^{\mu,s})$ and $\mathcal{W}^{\mu,s}_{mn,op} := H(\mathcal{W}^{\mu,s}_{mn})$. Sometimes we display the parameter $\xi$, as in $\mathcal{W}^{\mu,s}_{op,\xi} := H(\mathcal{W}^{\mu,s}_{\xi})$. Theorem 3.2 implies that $H(\mathcal{W}^{\mu,s})$ is a Banach space with norm $\|H(w)\|_{\mu,s,\xi} := \|w\|_{\mu,s,\xi}$.
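To make the role of the weight $\xi$ in the norm (3.9) concrete, here is a minimal numerical sketch. It assumes that the component norms $\|w_{m,n}\|_{\mu,s}$ are already known and simply supplied as numbers; the function name `weighted_norm` and the sample values are hypothetical and not taken from the paper.

```python
def weighted_norm(component_norms, xi):
    """Return sum over (m, n) of xi**-(m+n) * ||w_{m,n}||_{mu,s}, as in (3.9).

    component_norms: dict mapping (m, n) to a nonnegative number standing in
    for ||w_{m,n}||_{mu,s} (hypothetical inputs, not computed from kernels).
    """
    if not 0 < xi < 1:
        raise ValueError("xi must lie in (0, 1)")
    return sum(norm * xi ** (-(m + n))
               for (m, n), norm in component_norms.items())


# Hypothetical component norms for a sequence with four nonzero entries.
norms = {(0, 0): 1.0, (1, 0): 0.1, (0, 1): 0.1, (1, 1): 0.01}
print(weighted_norm(norms, 0.5))
print(weighted_norm(norms, 0.25))
```

A smaller $\xi$ penalizes high-order terms more strongly, so the norm grows as $\xi$ decreases; this is the numerical counterpart of the inclusion between the spaces for $\xi' \le \xi$ noted after (3.9).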
Corresponding to (3.18), operators in $\mathcal{W}^{\mu,s}_{op}$ can be represented as

$$H(w) = E\mathbf{1} + T + W, \tag{3.19}$$

where $E \in \mathbb{C}$ is a complex number, $T = T[H_f]$, with $T[\cdot] \in \mathcal{T}$, and $W \in \mathcal{W}_1^{\mu,s}$. Indeed, let

$$E := w_{0,0}[0], \qquad T := w_{0,0}[H_f] - w_{0,0}[0] \qquad \text{and} \qquad W := \sum_{m+n \ge 1} \chi_1 W_{m,n}[w] \chi_1. \tag{3.20}$$

Then Eq. (3.19) holds.

Remark 3.3. In this paper we need only $s = 1$. We introduce the more general spaces for the sake of future references. Indeed, in our proof of the limiting absorption principle (LAP) in [18] we need $s = 2$. More precisely, we have to use more sophisticated Banach spaces where the operator $\partial_r^n$ in (3.5) is replaced by the operator $\partial_r^n (k\partial_k)^q$ (cf. (A.16)). Here $q := (q_1, \dots, q_{M+N})$, $(k\partial_k)^q := \prod_{j=1}^{M+N} (k_j \cdot \nabla_{k_j})^{q_j}$, with $k_{m+j} := \tilde k_j$, and the indices $n$ and $q$ satisfy $0 \le n + |q| \le s$ with $s = 2$.

4. The Renormalization Transformation $R_\rho$

In this section we introduce an operator-theoretic renormalization transformation based on the smooth Feshbach–Schur map, which is closely related to the one introduced in [3] and [8, 9]. We fix the index $\mu$ in our Banach spaces at some positive value, $\mu > 0$.

The renormalization transformation is homothetic to an isospectral map defined on a polydisc in a suitable Banach space of Hamiltonians. It has a certain contraction property ensuring that (upon appropriate tuning of the spectral parameter) the image of any Hamiltonian in the polydisc under a large number of iterations of the renormalization transformation approaches a fixed-point Hamiltonian, $wH_f$, whose spectral analysis is particularly simple. Thanks to the isospectrality of the renormalization map, certain properties of the spectrum of the initial Hamiltonian can be derived from the corresponding properties of the limiting Hamiltonian.

The renormalization map is defined below as a composition of a decimation map, $F_\rho$, and two rescaling maps, $S_\rho$ and $A_\rho$. Here $\rho$ is a positive parameter (the photon energy scale) which will be chosen later.

The decimation of degrees of freedom is accomplished by the smooth Feshbach map, $F_{\tau,\chi}$, with the operators $\tau$ and $\chi$ chosen as

$$\tau(H) = W_{00} := w_{00}(H_f) \qquad \text{and} \qquad \chi = \chi_\rho \equiv \chi_{H_f \le \rho}, \tag{4.1}$$

where $H = H(w)$ is given in Eq. (3.12). With $\tau$ and $\chi$ identified in this way we will use the notation

$$F_\rho \equiv F_{\tau,\chi_\rho}. \tag{4.2}$$

The decimation map acts on the Banach space $\mathcal{W}^{s}_{op}$.
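To see what decimation by a Feshbach–Schur map does in the simplest possible setting, the following sketch works out the projection case mentioned in the Remarks of Sec. 2 ($\chi$ a projection, $\tau = 0$) for a $2 \times 2$ matrix. This is only a finite-dimensional toy, not the smooth map $F_\rho$ itself; the helper names `feshbach_2x2` and `lift` are hypothetical.

```python
from fractions import Fraction as Fr


def feshbach_2x2(a, b, c, d):
    """Schur complement of H = [[a, b], [c, d]] onto the first coordinate.

    Projection Feshbach-Schur map with chi = diag(1, 0) and tau = 0:
    F = a - b * d**-1 * c, which requires d != 0.
    """
    if d == 0:
        raise ZeroDivisionError("H restricted to Ran(1 - chi) is not invertible")
    return a - b * c / d


def lift(phi, c, d):
    """Toy analogue of Q(H) phi: returns psi = (phi, -d**-1 * c * phi)."""
    return (phi, -c * phi / d)


# A singular matrix: its Schur complement vanishes, as in Theorem 2.1(i).
a, b, c, d = Fr(2), Fr(4), Fr(3), Fr(6)
F = feshbach_2x2(a, b, c, d)
psi = lift(Fr(1), c, d)
# Apply H to psi to check that the lifted vector is a null vector of H.
H_psi = (a * psi[0] + b * psi[1], c * psi[0] + d * psi[1])
print(F, psi, H_psi)
```

In this toy case $\det H = d \cdot F$, so $H$ is invertible exactly when $F$ is, and a null vector of $F$ lifts to a null vector of $H$, which mirrors parts (i) and (iii) of Theorem 2.1.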
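Let $\bar\chi_\rho$ be defined so that $\chi_\rho \equiv \chi_{H_f \le \rho}$ and $\bar\chi_\rho \equiv \chi_{H_f \ge \rho}$ form a smooth partition of unity, $\chi_\rho^2 + \bar\chi_\rho^2 = 1$. The lemma below shows that the domain of this map contains the following polydisc in $\mathcal{W}^{\mu,s}_{op}$:

$$\mathcal{D}^{\mu,s}(\alpha, \beta, \gamma) := \Big\{ H(w) \in \mathcal{W}^{\mu,s}_{op} \;\Big|\; |E| \le \alpha,\; \sup_{r \in [0,\infty)} |T'[r] - 1| \le \beta,\; \|w_1\|_{\mu,s,\xi} \le \gamma \Big\}, \tag{4.3}$$

for appropriate $\alpha, \beta, \gamma > 0$. Here $H(w) = E + T + W$, where $E$, $T$ and $W$ are given in (3.20) and $w_1 := (w_{m,n})_{m+n \ge 1}$.

Lemma 4.1. Fix $0 < \rho < 1$, $\mu > 0$, $s \ge 1$, and $0 < \xi < 1$. Then it follows that the polydisc $\mathcal{D}^{\mu,s}(\rho/8, 1/8, \rho/8)$ is in the domain of the Feshbach map $F_\rho$.

Proof. Let $H(w) \in \mathcal{D}^{\mu,s}(\rho/8, 1/8, \rho/8)$. We remark that $W := H(w) - E - T$ defines a bounded operator on $\mathcal{F}$, and we only need to check the invertibility of $H(w)_{\tau,\bar\chi_\rho}$ on $\mathrm{Ran}\,\bar\chi_\rho$. Now the operator $E + T = W_{0,0}[w]$ is invertible on $\mathrm{Ran}\,\bar\chi_\rho$ since for all $r \in [3\rho/4, \infty)$

$$\mathrm{Re}\,T[r] + \mathrm{Re}\,E \ge r - |T[r] - r| - |E| \ge r\Big(1 - \sup_r |T'[r] - 1|\Big) - |E| \ge \frac{3\rho}{4}\Big(1 - \frac{1}{8}\Big) - \frac{\rho}{8} \ge \frac{\rho}{2} \tag{4.4}$$

and $T := T[H_f]$. Equation (4.4) implies also that $\|(E + T)^{-1}\| \le 2/\rho$. On the other hand, by (3.11), $\|W\| \le \xi\rho/8 \le \rho/8$. Hence $\|\bar\chi_\rho W \bar\chi_\rho (E + T)^{-1}\| \le 1/4$ and therefore $H(w)_{\tau,\bar\chi_\rho} = [1 + \bar\chi_\rho W \bar\chi_\rho (E + T)^{-1}](E + T)$ is invertible on $\mathrm{Ran}\,\bar\chi_\rho$.

The last part of the proof above gives the estimate

$$\|(H(w)_{\tau,\bar\chi_\rho})^{-1}\| \le \frac{8}{3\rho}. \tag{4.5}$$

We introduce the scaling transformation $S_\rho : B[\mathcal{F}] \to B[\mathcal{F}]$, by

$$S_\rho(\mathbf{1}) := \mathbf{1}, \qquad S_\rho(a^{\#}(k)) := \rho^{-3/2}\, a^{\#}(\rho^{-1}k), \tag{4.6}$$

where $a^{\#}(k)$ is either $a(k)$ or $a^*(k)$ and $k \in \mathbb{R}^3$. On the domain of the decimation map $F_\rho$ we define the renormalization map $R_\rho$ as

$$R_\rho := \rho^{-1} S_\rho \circ F_\rho. \tag{4.7}$$

Remark 4.2. The renormalization map above is different from the one defined in [3]. The map in [3] contains an additional change of the spectral parameter $\lambda := -\langle H \rangle_\Omega$.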
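The arithmetic behind (4.4) and (4.5) is easy to get wrong by a factor, so here is a small sketch that reproduces it in exact rational arithmetic. It only re-evaluates the bounds quoted in the proof of Lemma 4.1 for given $\rho$ and $\xi$; the function name `lemma_4_1_bounds` is hypothetical.

```python
from fractions import Fraction as Fr


def lemma_4_1_bounds(rho, xi):
    """Re-evaluate the numerical bounds used in the proof of Lemma 4.1.

    rho, xi: rationals in (0, 1). The radii rho/8, 1/8, rho/8 are those of
    the polydisc in the lemma; everything else follows from them.
    """
    rho, xi = Fr(rho), Fr(xi)
    if not (0 < rho < 1 and 0 < xi < 1):
        raise ValueError("rho and xi must lie in (0, 1)")
    alpha, beta, gamma = rho / 8, Fr(1, 8), rho / 8
    # (4.4): lower bound on Re(T[r] + E) at the smallest r, r = 3*rho/4.
    lower = Fr(3, 4) * rho * (1 - beta) - alpha
    # Bound on ||(E + T)^{-1}|| quoted in the text (weaker than 1/lower).
    inv_h0 = 2 / rho
    # ||W|| <= xi * gamma, so the Neumann-series ratio is at most xi / 4.
    neumann = xi * gamma * inv_h0
    # Geometric series with the worst-case ratio 1/4 gives (4.5).
    resolvent = inv_h0 / (1 - Fr(1, 4))
    return lower, neumann, resolvent


lower, neumann, resolvent = lemma_4_1_bounds(Fr(1, 4), Fr(1, 2))
print(lower, neumann, resolvent)
```

For any admissible $\rho$ the first value equals $17\rho/32$, which is where the final inequality $\ge \rho/2$ in (4.4) comes from, and the last value equals $8/(3\rho)$ as in (4.5).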
We mention here some properties of the scaling transformation. It is easy to check that $S_\rho(H_f) = \rho H_f$, and hence

$$S_\rho(\chi_\rho) = \chi_1 \qquad \text{and} \qquad \rho^{-1} S_\rho(H_f) = H_f, \tag{4.8}$$

which means that the operator $H_f$ is a fixed point of $\rho^{-1} S_\rho$. Further note that $E \cdot \mathbf{1}$ is expanded under the scaling map, $\rho^{-1} S_\rho(E \cdot \mathbf{1}) = \rho^{-1} E \cdot \mathbf{1}$, at a rate $\rho^{-1}$. (To control this expansion it is necessary to suitably restrict the spectral parameter.)

Next, we show that the interaction $W$ contracts under the scaling transformation. To this end we remark that the scaling map $S_\rho$ restricted to $\mathcal{W}^{\mu,s}_{op}$ induces a scaling map $s_\rho$ on $\mathcal{W}^{\mu,s}$ by

$$\rho^{-1} S_\rho(H(w)) =: H(s_\rho(w)). \tag{4.9}$$

It is easy to verify that $s_\rho(w) := (s_\rho(w_{m,n}))_{m+n \ge 0}$ and, for all $(m, n) \in \mathbb{N}_0^2$,

$$s_\rho(w_{m,n})[r, k_{(m,n)}] = \rho^{m+n-1}\, w_{m,n}[\rho r, \rho k_{(m,n)}]. \tag{4.10}$$

We note that by Theorem 3.1, the operator norm of $W_{m,n}[s_\rho(w_{m,n})]$ is controlled by the norm

$$\|s_\rho(w_{m,n})\|_{\mu} = \max_j \sup_{r \in I,\; k \in B_1^{m+n}} \rho^{m+n-1}\, \frac{|w_{m,n}[\rho r, \rho k_{(m,n)}]|}{|k_j|^{\mu}} \le \rho^{m+n+\mu-1}\, \|w_{m,n}\|_{\mu}.$$

Hence, for $m + n \ge 1$, we have that

$$\|s_\rho(w_{m,n})\|_{\mu} \le \rho^{\mu}\, \|w_{m,n}\|_{\mu}. \tag{4.11}$$

Since $\mu > 0$, this estimate shows that $S_\rho$ contracts $\|w_{m,n}\|_{\mu}$ by at least a factor of $\rho^{\mu} < 1$. The next result shows that this contraction is actually a property of the renormalization map $R_\rho$ along the "stable" directions. Recall that $\chi_1$ is the cut-off function introduced at the beginning of Sec. 3. Define the constant
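$$C_\chi := \frac{4}{3}\Big( \sum_{n=0}^{s} \sup|\partial_r^n \chi_1| + \sup|\partial_r \chi_1|^2 \Big) \le 200. \tag{4.12}$$

Clearly, for, say, $s = 1$, $C_\chi \ge 4/3$. We keep the constant $C_\chi$ below in order to relate the analysis of this paper to that of [3].

Theorem 4.3. Let $\epsilon_0 : H \to \langle H \rangle_\Omega$ and $\mu > 0$. Then for the absolute constant $C_\chi$ given in (4.12) and for any $s \ge 1$, $0 < \rho < 1/2$, $\alpha, \beta \le \frac{\rho}{8}$ and $\gamma \le \frac{\rho}{8C_\chi}$ we have that

$$R_\rho - \rho^{-1}\epsilon_0 : \mathcal{D}^{\mu,s}(\alpha, \beta, \gamma) \to \mathcal{D}^{\mu,s}(\alpha', \beta', \gamma'), \tag{4.13}$$

continuously, with $\xi := \frac{\sqrt{\rho}}{4C_\chi}$ (in the definition of the polydiscs, see (4.3)) and

$$\alpha' = 3C_\chi \big(\gamma^2/2\rho\big), \qquad \beta' = \beta + 3C_\chi \big(\gamma^2/2\rho\big), \qquad \gamma' = 256\, C_\chi^2\, \rho^{\mu} \gamma. \tag{4.14}$$

With some modifications, this theorem follows from [3, Theorem 3.8] and its proof; especially Eqs. (3.104), (3.107) and (3.109). For the sake of completeness, we present a proof of this theorem in Appendix A.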
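The parameter map (4.14) can be iterated numerically to see how the polydisc radii evolve. The sketch below does this for hypothetical values of $\rho$, $\mu$, $C_\chi$ and the initial radii, chosen only so that the contraction factor $256\,C_\chi^2\rho^\mu$ is below 1; the function names are illustrative and not part of the paper.

```python
import math


def step(beta, gamma, rho, mu, c_chi):
    """One application of (4.14): returns (alpha', beta', gamma')."""
    quad = 3 * c_chi * gamma ** 2 / (2 * rho)
    return quad, beta + quad, 256 * c_chi ** 2 * rho ** mu * gamma


def iterate(n, beta0, gamma0, rho, mu, c_chi):
    """Apply (4.14) n times, starting from (beta0, gamma0)."""
    beta, gamma = beta0, gamma0
    history = []
    for _ in range(n):
        alpha, beta, gamma = step(beta, gamma, rho, mu, c_chi)
        history.append((alpha, beta, gamma))
    return history


# Hypothetical parameters: contraction factor 256 * (4/3)**2 * 1e-3 ~ 0.455.
history = iterate(5, beta0=1e-4, gamma0=1e-5, rho=1e-3, mu=1.0, c_chi=4 / 3)
for alpha, beta, gamma in history:
    print(f"alpha={alpha:.3e}  beta={beta:.6e}  gamma={gamma:.3e}")
```

With these sample values $\gamma$ shrinks geometrically while $\beta$ increases only by a summable amount and stays below $\rho/8$, which is the qualitative behaviour the theorem is used for in Sec. 5.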
Remark 4.4. Subtracting the term $\rho^{-1}\epsilon_0$ from $R_\rho$ allows us to control the expanding direction during the iteration of the map $R_\rho$. In [3], such control was achieved by using a change of the spectral parameter $\lambda$, which controls $\langle H \rangle_\Omega$.

5. Renormalization Group

In this section, we describe some dynamical properties of the iterations $R_\rho^n$, $n \ge 1$, of the renormalization map $R_\rho$. A closely related iteration scheme is used in [3]. First, we observe that, $\forall \tau \in \mathbb{C}$,

$$R_\rho(\tau H_f) = \tau H_f \qquad \text{and} \qquad R_\rho(\tau \mathbf{1}) = \frac{1}{\rho}\,\tau \mathbf{1}.$$

Hence we define $M_{fp} := \mathbb{C} H_f$ and $M_u := \mathbb{C}\mathbf{1}$ as candidates for the manifold of fixed points of $R_\rho$ and the unstable manifold. The next result identifies the stable manifold of $M_{fp}$, which turns out to be of (complex) codimension 1 and is foliated by (complex) codimension 2 stable manifolds, one for each fixed point in $M_{fp}$. This implies, in particular, that, in a vicinity of $M_{fp}$, there are no other fixed points, and that $M_u$ is the entire unstable manifold of $M_{fp}$ (see Fig. 1).

We introduce some definitions. Recall that $D(\lambda, r) := \{z \in \mathbb{C} \mid |z - \lambda| \le r\}$, a disc in the complex plane. As an initial set of operators we take

$$\mathcal{D} := \mathcal{D}^{\mu,s}(\alpha_0, \beta_0, \gamma_0),$$

with $\alpha_0, \beta_0, \gamma_0 \ll 1$ and $s \ge 1$. We also let

$$\mathcal{D}_s := \mathcal{D}^{\mu,s}(0, \beta_0, \gamma_0).$$

(The subindex $s$ stands for "stable", not to be confused with the smoothness index $s$, which, in this section, is denoted $s'$.) For $H \in \mathcal{D}$ we write

$$H_u := \langle H \rangle_\Omega \qquad \text{and} \qquad H_s := H - \langle H \rangle_\Omega \mathbf{1}$$

(the unstable- and stable-central-space components of $H$, respectively). Note that $H_s \in \mathcal{D}_s$. We fix the scale $\rho$ so that

$$\alpha_0, \beta_0, \gamma_0 \ll \rho \le \frac{1}{2}. \tag{5.1}$$

Below, we use the $n$th iteration of the numbers $\alpha_0$, $\beta_0$ and $\gamma_0$ under the map (4.14):

$$\alpha_n := c\rho^{-1}(c\rho^{\mu})^{2(n-1)}\gamma_0^2, \qquad \beta_n = \beta_0 + \frac{c\gamma_0^2}{\rho} \sum_{j=0}^{n-1} (c\rho^{\mu})^{2j}, \qquad \gamma_n = (c\rho^{\mu})^n \gamma_0.$$

Recall that a vector-function $f$ from an open set $D$ in a complex Banach space $B_1$ into a complex Banach space $B_2$ is said to be analytic iff $\forall H \in D$ and
$\forall \xi \in B_1$, $f(H + \tau\xi)$ is analytic in the complex variable $\tau$ for $|\tau|$ sufficiently small (see [11]). One can show that $f$ is analytic iff it is Gâteaux-differentiable ([11, 22]). A stronger notion of analyticity, requiring in addition that $f$ is locally bounded, is used in [22]. Furthermore, if $f$ is analytic in $D$ and $g$ is an analytic vector-function from an open set $\Omega$ in $\mathbb{C}$ into $D$, then the composite function $f \circ g$ is analytic on $\Omega$. In what follows $B_1$ is the space of $H_f$-bounded operators on $\mathcal{F}$ and $B_2$ is either $\mathbb{C}$ or $B(\mathcal{F})$. For a Banach space $X$ the symbol $O_X(\alpha)$ will stand for an element of $X$ bounded in its norm by const $\alpha$.

Theorem 5.1. Let $\delta_n := \nu_n \rho^n$ with $4\alpha_n \le \nu_n \le \frac{1}{18}$. There is an analytic map $e : \mathcal{D}_s \to D(0, 4\alpha_0)$ such that $e(H) \in \mathbb{R}$ for $H = H^*$, and

$$U_{\delta_n} \subset D(R_\rho^n) \qquad \text{and} \qquad R_\rho^n(U_{\delta_n}) \subset \mathcal{D}^{\mu,s}\Big(\frac{\rho}{8}, \beta_n, \gamma_n\Big), \tag{5.2}$$

where $U_\delta := \{H \in \mathcal{D} \mid |e(H_s) + H_u| \le \delta\}$. Moreover, $\forall H \in U_{\delta_n}$ and $\forall n \ge 1$, there are $E_n \in \mathbb{C}$ and $\tau_n(r) \in \mathbb{C}$ such that $|E_n| \le 2\nu_n$, $|\tau_n(r) - 1| \le \beta_n$, $\tau_n$ is $C^s$,

$$R_\rho^n(H) = E_n + \tau_n(H_f) H_f + O_{\mathcal{W}^{\mu,s}_{op}}(\gamma_n), \tag{5.3}$$

(the spaces $\mathcal{W}^{\mu,s}_{op}$ are defined in Sec. 3), $E_n$ and $\tau_n(r)$ are real if $H$ is self-adjoint and, as $n \to \infty$, $\tau_n(r)$ converge in $L^\infty$ to some number (constant function) $\tau \in \mathbb{C}$.

This theorem implies that $M_{fp} := \mathbb{C} H_f$ is (locally) a manifold of fixed points of $R_\rho$ and $M_u := \mathbb{C}\mathbf{1}$ is the unstable manifold, and the set

$$M_s := \bigcap_n U_{\delta_n} = \{H \in \mathcal{D} \mid e(H_s) = -H_u\} \tag{5.4}$$

is a local stable manifold for the fixed point manifold $M_{fp}$ in the sense that, $\forall H \in M_s$, $\exists \tau \in \mathbb{C}$ such that

$$R_\rho^n(H) \to \tau H_f \quad \text{in the norm of } \mathcal{W}^{\mu,s}_{op}, \tag{5.5}$$

as $n \to \infty$ (see Fig. 2). Moreover, $M_s$ is an invariant manifold for $R_\rho$: $M_s \subset D(R_\rho)$ and $R_\rho(M_s) \subset M_s$, though we do not need this property here and thus we will not prove it.
Fig. 2. Characterization of $M_s$ in terms of $H_u$ and $e(H_s)$.
Fig. 3. The RG-flow on $M_s$.
The next result reveals the spectral significance of the map $e$:

Theorem 5.2. Let $H_s \in \mathcal{D}_s$. Then the number $e(H_s)$ is an eigenvalue of the operator $H_s$ and $\sigma(H_s) \subset e(H_s) + S$ where

$$S := \Big\{ w \in \mathbb{C} \;\Big|\; \mathrm{Re}\, w \ge 0,\; |\mathrm{Im}\, w| \le \frac{1}{3} \mathrm{Re}\, w \Big\}. \tag{5.6}$$

This theorem implies Theorem 1.1 formulated in the introduction. We begin with some preliminary results, collected in Proposition 5.3 below, from which we derive Theorems 5.1 and 5.2.

Proposition 5.3. Let $V_{-1} \equiv \mathcal{D}$ and $e_{-1}(H_s) = 0$ $\forall H_s$. The triples $(V_n, E_n, e_n)$, $n = 0, 1, \dots$, where $V_n$ is a subset of $\mathcal{D}$, $E_n$ is a map of $V_{n-1}$ into $\mathbb{C}$, and $e_n$ is a map of $\mathcal{D}_s$ into $\mathbb{C}$, are defined inductively in $n \ge 0$ by the formulae

$$V_n := \Big\{ H \in \mathcal{D} \;\Big|\; |H_u + e_{n-1}(H_s)| \le \frac{1}{12}\rho^{n+1} \Big\}, \tag{5.7}$$

$$E_n(H) := (R_\rho^n(H))_u, \tag{5.8}$$

$$e_n(H_s) \text{ is the unique zero of the function } E_n(H_s - \lambda) \tag{5.9}$$

in the disc $D(e_{n-1}(H_s), \frac{1}{12}\rho^{n+1})$. Moreover, these objects have the following properties:

$$V_n \subset V_{n-1} \qquad \text{and} \qquad V_n \subset D(R_\rho^{n+1}), \tag{5.10}$$

$E_n(H_s - \lambda)$ is analytic in $\lambda \in D(e_{n-1}(H_s), \frac{1}{12}\rho^{n+1})$ and in $H_s \in \mathcal{D}_s$, $e_n(H_s) \in \mathbb{R}$ if $H = H^*$, and

$$|e_n(H_s) - e_{n-1}(H_s)| \le 2\alpha_n \rho^n. \tag{5.11}$$

Proof. We proceed by induction in the index $n$. For $n = 0$ the proposition is trivially true. We assume that the statements of the proposition hold for all $0 \le n \le j - 1$ and prove them for $n = j$. Let $e_n(H_s)$ and $E_n(H_s - \lambda)$, $0 \le n \le j - 1$, be
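as defined in the proposition. Since $e_{j-1}(H_s)$ is defined by (5.9) with $n = j - 1$ we can define $V_j$ using (5.7) with $n = j$. Next, by (5.10) with $n = j - 1$, $V_{j-1} \subset D(R_\rho^j)$ and therefore the map $E_j$ is well defined.

Let $H \in V_{j-1}$ and denote $\lambda := -H_u$ so that $H := H_s - \lambda$. Let $H^{(j)}(\lambda) := R_\rho^j(H^{(0)}(\lambda))$ with $H^{(0)}(\lambda) := H_s - \lambda$ (we suppress the dependence of $H^{(j)}(\lambda)$ on $H_s$). Write inductively $H^{(j)}(\lambda) := R_\rho(H^{(j-1)}(\lambda))$. We claim that $H^{(j)}(\lambda)$ is analytic (in the sense specified in the paragraph preceding Theorem 5.1) in $\lambda \in D(e_{j-1}(H_s), \frac{1}{12}\rho^{j+1})$ and in $H_s \in \mathcal{D}_s$. We prove this statement by induction in $j$. Clearly, $H^{(0)}(\lambda) = H_s - \lambda$ is analytic in $\lambda \in D(e_{-1}(H_s), \frac{1}{12}\rho)$ and in $H_s \in \mathcal{D}_s$. Now, assume that $H^{(j-1)}(\lambda)$ is analytic in $\lambda \in D(e_{j-2}(H_s), \frac{1}{12}\rho^{j})$ and in $H_s \in \mathcal{D}_s$. Then by Proposition C.1, Appendix C, $H_0^{(j-1)}(\lambda) := E^{(j-1)}(\lambda) + T^{(j-1)}(\lambda)$ and $W^{(j-1)}(\lambda)$ are analytic. By the properties of $T^{(j-1)}(\lambda)$, the inverse $H_0^{(j-1)}(\lambda)^{-1}\bar\chi_\rho$ is well-defined and is analytic and therefore so is

$$\bar\chi_\rho\, H^{(j-1)}_{\tau,\bar\chi_\rho}(\lambda)^{-1}\, \bar\chi_\rho = \sum_{n=0}^{\infty} \bar\chi_\rho \big( -H_0^{(j-1)}(\lambda)^{-1}\, \bar\chi_\rho W^{(j-1)}(\lambda) \bar\chi_\rho \big)^n\, H_0^{(j-1)}(\lambda)^{-1}\, \bar\chi_\rho.$$

By the definition of the decimation map, (4.1)–(4.2),

$$F_\rho(H^{(j-1)}(\lambda)) = H_0^{(j-1)}(\lambda) + \chi_\rho W^{(j-1)}(\lambda) \chi_\rho - \chi_\rho W^{(j-1)}(\lambda)\, \bar\chi_\rho\, H^{(j-1)}_{\tau,\bar\chi_\rho}(\lambda)^{-1}\, \bar\chi_\rho\, W^{(j-1)}(\lambda) \chi_\rho$$

is analytic. Hence, by the definition of the renormalization map $R_\rho$ in (4.6)–(4.7), $R_\rho(H^{(j-1)}(\lambda))$ is analytic as well. This implies that $E_j(H_s - \lambda)$ is analytic in $\lambda \in D(e_{j-1}, \frac{1}{12}\rho^{j+1})$ and in $H_s \in \mathcal{D}_s$.

In the remaining part of the proof we will use the shorthand $e_n \equiv e_n(H_s)$ and (abusing notation) $E_n(\lambda) \equiv E_n(H_s - \lambda)$. Now, we prove (5.9) and (5.11) with $n = j$. We begin with some preliminary estimates. Let $H \in V_{j-1}$. For $1 \le n \le j$ denote

$$\Delta_n E(\lambda) := E_n(\lambda) - \rho^{-1} E_{n-1}(\lambda). \tag{5.12}$$

Since $R_\rho^n(H) = R_\rho(R_\rho^{n-1}(H))$, we have, by Theorem 4.3, that $|\Delta_n E(\lambda)| \le \alpha_n$. This and the analyticity of $\Delta_n E(\lambda)$ in $D(e_{n-1}, \frac{1}{12}\rho^{n+1})$ together with the Cauchy formula imply that

$$|\partial_\lambda^m \Delta_n E(\lambda)| \le \Big( \frac{1}{12}\rho^{n+1} \Big)^{-m} \alpha_n \qquad \text{for } n \le j \text{ and } m = 0, 1. \tag{5.13}$$

Iterating (5.12) we find for $i \le j$

$$E_i(\lambda) = \rho^{-i}\big( E_{0i}(\lambda) - \lambda \big), \tag{5.14}$$

where

$$E_{0i}(\lambda) := \sum_{n=1}^{i} \rho^n \Delta_n E(\lambda). \tag{5.15}$$

By the estimate (5.13) with $m = 1$ we have for $i \le j$

$$|\partial_\lambda E_{0i}(\lambda)| \le \sum_{n=1}^{i} \rho^n |\partial_\lambda \Delta_n E(\lambda)| \le c \sum_{n=1}^{i} c^{2n-1} \rho^{2\mu(n-1)-2} \gamma_0^2,$$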
which, by the conditions on the parameters, (5.1), implies

$$|\partial_\lambda E_{0i}(\lambda)| \le c\rho^{-2}\gamma_0^2 \le \frac{1}{5} \tag{5.16}$$

for $0 < i \le j$.

Now, we are ready to show the existence and properties of $e_j$, stated in (5.9) and (5.11) with $n = j$, i.e. to show that $E_j(\lambda)$ has a unique zero, $e_j$, in every disc $D(e_{j-1}, r\rho^j)$ with $2\alpha_j \le r \le \frac{1}{12}\rho$. The latter is equivalent to showing that $e_j$ is a fixed point of the map $\lambda \to E_{0j}(\lambda)$ in the discs $D(e_{j-1}, r\rho^j)$. Using the equations $e_{j-1} = E_{0,j-1}(e_{j-1})$ and (5.15) with $i = j - 1, j$ and using the triangle inequality we obtain

$$|E_{0j}(\lambda) - e_{j-1}| \le \rho^j |\Delta_j E(\lambda)| + |E_{0,j-1}(\lambda) - E_{0,j-1}(e_{j-1})|.$$

Now, remembering the estimate (5.13) (with $m = 0$ and $n = j$) and the estimate (5.16) (with $i = j - 1$) and using the mean-value theorem we arrive at the inequality

$$|E_{0j}(\lambda) - e_{j-1}| \le \rho^j \alpha_j + \frac{1}{5} |\lambda - e_{j-1}|, \tag{5.17}$$

and therefore, $|E_{0j}(\lambda) - e_{j-1}| \le r\rho^j$, provided $|\lambda - e_{j-1}| \le r\rho^j$ (remember that $\alpha_j \le \alpha_0 \ll \rho \ll 1$). This inequality together with Eq. (5.16) with $i = j$ implies that the map $\lambda \to E_{0j}(\lambda)$ has a unique fixed point, $e_j$, in the disc $D(e_{j-1}, r\rho^j)$. For $r = \frac{1}{12}\rho$ this gives (5.9) with $n = j$. Taking $r = 2\alpha_j$ we arrive at (5.11) with $n = j$.

If $H$ is self-adjoint, then so is the operator $R_\rho(H)$, and, consequently, $R_\rho^j(H) = R_\rho^j(H)^*$. Hence $E_j(\lambda)$ and $e_j$ are real in this case.

Next, we show the first inclusion in (5.10) for $n = j$. Let $H \in V_j$ and hence $|\lambda - e_{j-1}| \le \frac{1}{12}\rho^{j+1}$. Then, by the induction assumption (5.11) for $n = j - 1$, we have that $|\lambda - e_{j-2}| \le \frac{1}{12}\rho^{j+1} + 2\alpha_{j-1}\rho^{j-1} \le \frac{1}{12}\rho^{j}$ and therefore $H \in V_{j-1}$, as claimed.

We proceed to show the second inclusion in (5.10) for $n = j$. Let $H \in V_j$ and keep the notation as above. Since $E_{j-1}(e_{j-1}) = 0$, we have that

$$|E_j(\lambda)| \le |\Delta_j E(\lambda)| + \rho^{-1} |E_{j-1}(\lambda) - E_{j-1}(e_{j-1})|,$$

which by (5.13), (5.14) and (5.16) with $i = j - 1$ gives $|E_j(\lambda)| \le \alpha_j + \frac{6}{5}\rho^{-j} |\lambda - e_{j-1}|$. Hence, since $\alpha_j \le \alpha_0$ and by (5.1),

$$|E_j(\lambda)| \le \frac{1}{8}\rho, \tag{5.18}$$

provided $|\lambda - e_{j-1}| \le \frac{1}{12}\rho^{j+1}$. Thus, using Theorem 4.3 and (5.18) we conclude that, for $n := j$,

$$R_\rho^n(V_n) \subset \mathcal{D}^{\mu,1}(\rho/8, \beta_n, \gamma_n) \tag{5.19}$$

with the numbers $\beta_n$ and $\gamma_n$ given inductively by $\beta_n = \beta_{n-1} + 3C_\chi \frac{\gamma_{n-1}^2}{2\rho}$ and $\gamma_n = 256\, C_\chi^2 \rho^{\mu} \gamma_{n-1}$ and, in final form, in the paragraph preceding Theorem A.1. Clearly, $\beta_n, \gamma_n \le \frac{\rho}{8}$. For example, $\beta_n \le \beta_0 + c\,\frac{\gamma_0^2}{\rho}\big(1 - (c\rho^{\mu})^2\big)^{-1} < \frac{\rho}{8}$. Hence, by Lemma 4.1, $R_\rho^j(V_j) \subset D(R_\rho)$. Thus (5.10) is proven for $n = j$.
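The existence step in the proof above is a contraction argument: a map with derivative bounded by $1/5$ that sends a small disc into itself has a unique fixed point there. The sketch below illustrates only this mechanism with a hypothetical toy map `g` standing in for $\lambda \to E_{0j}(\lambda)$; the constants `c0` (playing the role of $e_{j-1}$) and `a` (playing the role of $\rho^j\alpha_j$) are made up for the illustration.

```python
def fixed_point(g, z0, tol=1e-14, max_iter=200):
    """Iterate z -> g(z) from z0 until successive iterates differ by < tol."""
    z = z0
    for _ in range(max_iter):
        z_next = g(z)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    raise RuntimeError("fixed-point iteration did not converge")


c0 = 0.3 + 0.1j      # hypothetical previous approximation (role of e_{j-1})
a = 0.01 - 0.005j    # hypothetical small shift (role of rho**j * alpha_j)


def g(z):
    """Toy map whose derivative is at most 1/5 in modulus near c0."""
    w = z - c0
    return c0 + a + 0.1 * w + 0.1 * w * w


z_star = fixed_point(g, c0)
print(z_star, abs(z_star - c0), 2 * abs(a))
```

For this toy map the disc of radius $2|a|$ around `c0` is mapped into itself, so the fixed point stays within distance $2|a|$ of `c0`, in the same way that taking $r = 2\alpha_j$ yields (5.11).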
Proof of Theorem 5.1. By (5.11), the limit $e(H_s) := \lim_{j \to \infty} e_j(H_s)$ exists pointwise for $H \in \mathcal{D}$. Iterating Eq. (5.11) we find the estimate

$$|e_n(H_s) - e(H_s)| \le 3\alpha_{n+1}\rho^{n+1}. \tag{5.20}$$

Given that $\alpha_0 \le \frac{\rho}{108}$ (this is a condition on the (bare) coupling constant $g$), this inequality implies that

$$V_n \subset U_{\delta_n} \subset V_{n-1} \tag{5.21}$$

where $\delta_n := \frac{1}{18}\rho^n$.

To prove the analyticity of $e(H_s)$ we note that, since $E_j(\lambda, H_s)$ is analytic in $H_s \in \mathcal{D}_s$, then so is $e_j(H_s)$. By (5.11) the limit $e(H_s) := \lim_{j \to \infty} e_j(H_s)$ is also analytic in $H_s \in \mathcal{D}_s$. Equations (5.10) and (5.21) imply the first part of (5.2). The second part of (5.2) follows from Theorem 4.3 and (5.18).

Now we prove the last statement of Theorem 5.1. Let $H \in U_{\delta_n} \subset V_n \subset D(R_\rho^{n+1})$. According to (3.19), $H^{(n)} := R_\rho^n(H)$ can be written as

$$H^{(n)} = E_n \mathbf{1} + T_n + W_n, \tag{5.22}$$

where $T_n \equiv T_n(H_f)$ with $T_n(r) \in C^1$ and $T_n(0) = 0$. Hence the function $\tau_n(r) := T_n(r)/r$ is well defined. By (5.19) we have $|\partial_r T_n(r) - 1| \le \beta_n$ and $\|W_n\|_{\mathcal{W}^{s}_{op}} \le \gamma_n$. This gives the desired estimates for the last two terms in (5.3). Let $E_n(\lambda) \equiv E_n$ for $\lambda := -H_u$. To prove the bound on the first term on the right-hand side of (5.3) we use the relation $E_n(e_n) = 0$ and Eqs. (5.14) and (5.16) to obtain

$$|E_n(\lambda)| = |E_n(\lambda) - E_n(e_n)| \le \frac{6}{5}\rho^{-n} |\lambda - e_n|. \tag{5.23}$$

This inequality together with (5.20) implies $|E_n(\lambda)| \le \frac{6}{5}\nu_n + \frac{18}{5}\alpha_{n+1}\rho \le 2\nu_n$, provided $|\lambda - e| \le \nu_n\rho^n$ and $4\alpha_n \le \nu_n$. Finally, if $H$ is self-adjoint, then so is $R_\rho^n(H)$ and therefore $E_n$ and $\tau_n(r) := T_n(r)/r$ are real.

To complete the proof of Theorem 5.1 it remains to show that, as $n \to \infty$, the functions $\tau_n(r)$ converge in $L^\infty$ to a constant function, $\tau$. Proving this property requires representing the operators $T^{(n)}$ as sums of the $j$th step corrections,

$$\Delta_n T(r) := T_n(r) - \rho^{-1} T_{n-1}(\rho r), \tag{5.24}$$

similarly to (5.14) and (5.15). In fact, this analysis gives that $\tau = \lim_{n \to \infty} \tau_n(0)$. We omit the details here but refer the reader to [3].

Proof of Theorem 5.2. It is shown in Appendix B, Theorem B.1, that $e(H_s)$ is an eigenvalue of $H_s$ (cf. [8, 9, 3]). Here we show the second statement of the theorem regarding the spectrum of $H_s$. As above, we omit the reference to $H_s$ and set $e \equiv e(H_s)$ and $e_n \equiv e_n(H_s)$.

We first consider the case of a self-adjoint operator $H_s$. Let $H^{(n)}(\lambda) := R_\rho^n(H_s - \lambda)$ and, recall, $E_n(\lambda) := H^{(n)}(\lambda)_u$. Equations (5.14) and (5.16) imply the estimate
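$\partial_\lambda E_n(\lambda) \le -\frac{4}{5}\rho^{-n}$. Using the equation $E_n(e_n) = 0$, the mean value theorem and the estimate above, we obtain that $E_n(\lambda) \ge -\frac{4}{5}\rho^{-n}(\lambda - e_n)$, provided $\lambda \le e_n$. Hence, if $\lambda \le e_n - \theta_n$, with $\theta_n \gg \gamma_n\rho^n$ and $\theta_n \to 0$ as $n \to \infty$, then $H^{(n)}(\lambda) \ge \frac{4}{5}\rho^{-n}\theta_n - O(\gamma_n) \ge \frac{1}{2}\gamma_n$. This implies $0 \in \rho(H^{(n)}(\lambda))$ and therefore, by Theorem 2.1, $0 \in \rho(H_s - \lambda)$ or $\lambda \in \rho(H_s)$. Since $e_n \to e$ and $\theta_n \to 0$ as $n \to \infty$, this implies that $\sigma(H_s) \subset [e, \infty)$, which is the second statement of the theorem for self-adjoint operators.

Now we consider a non-self-adjoint operator $H_s$. For all $n \ge 0$, we have shown that if $H_s \in \mathcal{D}_s$, $e = e(H_s)$ and if $|\lambda - e| \le \delta_n$, where $\delta_n = \nu_n\rho^n$, then $H_s - \lambda \in \mathrm{dom}(R_\rho^n)$ and $H^{(n)}(\lambda) := R_\rho^n(H_s - \lambda) \in \mathcal{D}^{\mu,1}(\rho/8, \beta_n, \gamma_n)$ (Theorem 5.1). By Theorem 2.1 we have that

$$\lambda \in \sigma(H_s) \Leftrightarrow 0 \in \sigma(H^{(n)}(\lambda)), \tag{5.25}$$

if $|\lambda - e| \le \delta_n$. By Theorem 5.1, we can decompose

$$H^{(n)}(\lambda) = E_n(\lambda) + \tau_n(H_f, \lambda) H_f + W_n(\lambda), \tag{5.26}$$

with $\|W_n(\lambda)\| \le \gamma_n$ on $\mathrm{Ran}\,\chi_{H_f \le \rho}$. Hence

$$0 \in \sigma(H^{(n)}(\lambda)) \Rightarrow \exists r \in [0, \rho] : |E_n(\lambda) + \tau_n(r, \lambda) r| \le \gamma_n. \tag{5.27}$$

Using that $E_n(e_n) = 0$ ($e_n \equiv e_n(H_s)$) and the integral of derivative formula we find

$$E_n(\lambda) = (\lambda - e_n) g(\lambda) \tag{5.28}$$

with $g(\lambda) := \int_0^1 E_n'(e_n + s(\lambda - e_n))\, ds$. Note that $\bar\lambda := e_n + s(\lambda - e_n)$ satisfies $|\bar\lambda - e| \le \delta_n$ for $0 \le s \le 1$. This and (5.14), (5.16) and $\rho^{-1}\gamma_0 \ll 1$ imply that

$$|g(\lambda) + \rho^{-n}| \le \frac{1}{5}\rho^{-n}. \tag{5.29}$$

In addition, below we use the estimate (5.20) which we rewrite as:

$$|e_n - e| \le 3\alpha_{n+1}\rho^{n+1}. \tag{5.30}$$

We also use the estimates $|\tau_n - 1| \le \beta_n$ (see Theorem 5.1). We denote $\mu := \lambda - e$ so that $E_n(\lambda) = g(\lambda)(\mu + e - e_n)$. We consider separately two cases.

(a) $\mathrm{Re}\,\mu \le -\theta$ and $|\mathrm{Im}\,\mu| \le 3\theta$ with $\theta \ge 36\alpha_{n+1}\rho^{n+1}$. Using

$$\mathrm{Re}(E_n + \tau_n r) = \mathrm{Re}\,g\,\mathrm{Re}\,\mu - \mathrm{Im}\,g\,\mathrm{Im}\,\mu + \mathrm{Re}(g(e - e_n)) + \mathrm{Re}\,\tau_n r$$

and using (5.29), we obtain

$$\mathrm{Re}(E_n + \tau_n r) \ge \frac{4}{5}\rho^{-n}\theta - \frac{3}{5}\rho^{-n}\theta - \frac{6}{5}\rho^{-n}\, 3\alpha_{n+1}\rho^{n+1} + (1 - \beta_n) r. \tag{5.31}$$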
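Since $\theta \ge 36\alpha_{n+1}\rho^{n+1}$ this gives

$$\mathrm{Re}(E_n + \tau_n r) \ge \frac{1}{10}\rho^{-n}\theta + (1 - \beta_n) r. \tag{5.32}$$

(b) $|\mathrm{Im}\,\mu| \ge \theta$ and $|\mathrm{Re}\,\mu| \le 3\theta$. If $r \le 10\theta\rho^{-n}$, then

$$|\mathrm{Im}(E_n + \tau_n r)| = |\mathrm{Re}\,g\,\mathrm{Im}\,\mu + \mathrm{Im}\,g\,\mathrm{Re}\,\mu + \mathrm{Im}(g(e - e_n)) + \mathrm{Im}\,\tau_n r| \ge \frac{4}{5}\rho^{-n}\theta - \frac{3}{5}\rho^{-n}\theta - \frac{6}{5}\rho^{-n}\, 3\alpha_{n+1}\rho^{n+1} - \beta_n\, 10\theta\rho^{-n}.$$

This gives

$$|\mathrm{Im}(E_n + \tau_n r)| \ge \frac{1}{10}\theta\rho^{-n}, \tag{5.33}$$

provided $\theta \ge 72\alpha_{n+1}\rho^{n+1}$ and $\beta_n \le 10^{-3}$.

Now, if $r \ge 10\theta\rho^{-n}$, then we estimate by (5.31) and (5.29)

$$|E_n + \tau_n r| \ge |g\mu + \tau_n r| - \frac{6}{5}\rho^{-n}\, 3\alpha_{n+1}\rho^{n+1}. \tag{5.34}$$

Furthermore, we have

$$|g\mu + \tau_n r|^2 = (\mathrm{Re}\,g\,\mathrm{Re}\,\mu - \mathrm{Im}\,g\,\mathrm{Im}\,\mu + \mathrm{Re}\,\tau_n r)^2 + (\mathrm{Re}\,g\,\mathrm{Im}\,\mu + \mathrm{Im}\,g\,\mathrm{Re}\,\mu + \mathrm{Im}\,\tau_n r)^2$$

$$\ge (\mathrm{Re}\,g\,\mathrm{Re}\,\mu + \mathrm{Re}\,\tau_n r)^2 - \Big( \frac{1}{5}\rho^{-n}\,\mathrm{Im}\,\mu \Big)^2 + \Big( \frac{4}{5}\rho^{-n}|\mathrm{Im}\,\mu| - \frac{3}{5}\rho^{-n}|\mathrm{Im}\,\mu| \Big)^2 - (\beta_n r)^2.$$

Since $\mathrm{Re}\,g\,\mathrm{Re}\,\mu + \frac{1}{2}\mathrm{Re}\,\tau_n r \ge 0$, we have $|g\mu + \tau_n r|^2 \ge (\frac{1}{2}\mathrm{Re}\,\tau_n r)^2 - (\beta_n r)^2$ which gives, for $\theta \le 10^{-3}$,

$$|g\mu + \tau_n r| \ge \frac{1}{2}(1 - 2\beta_n) r \ge 5(1 - 2\beta_n)\theta\rho^{-n} \ge 2\theta\rho^{-n}.$$

This together with (5.34) yields $|E_n + \tau_n r| \ge \theta\rho^{-n}$, provided $\theta \ge 4\alpha_{n+1}\rho^{n+1}$. This together with (5.33) gives for the case (b)

$$|E_n + \tau_n r| \ge \frac{1}{10}\theta\rho^{-n}, \tag{5.35}$$

provided $\theta \ge 72\alpha_{n+1}\rho^{n+1}$ and $\theta \le 10^{-3}$.

The inequalities (5.32) and (5.35) and relations (5.25) and (5.27) show that $\lambda \in \rho(H_s)$ if either $\mathrm{Re}\,\mu \le -\theta$ and $|\mathrm{Im}\,\mu| \le 3\theta$, or $|\mathrm{Im}\,\mu| \ge \theta$ and $|\mathrm{Re}\,\mu| \le 3\theta$, with $\mu := \lambda - e$, provided

$$\theta \ge \max(20\rho^n\gamma_n,\; 72\alpha_{n+1}\rho^{n+1}) \qquad \text{and} \qquad \beta_n \le 10^{-3}. \tag{5.36}$$

This can be written as

$$\Omega_\theta^{(1)},\; \Omega_\theta^{(2)} \subset \rho(H_s) \tag{5.37}$$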
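where $\theta$ satisfies (5.36) and

$$\Omega_\theta^{(1)} := \{\lambda \in \mathbb{C} \mid \mathrm{Re}\,\mu \le -\theta \text{ and } |\mathrm{Im}\,\mu| \le 3\theta\}$$

and

$$\Omega_\theta^{(2)} := \{\lambda \in \mathbb{C} \mid |\mathrm{Im}\,\mu| \ge \theta \text{ and } |\mathrm{Re}\,\mu| \le 3\theta\}.$$

Define the new subset $\Omega_\theta^{(3)} := \{\lambda \in \mathbb{C} \mid \mathrm{Re}\,\mu \le -\theta\}$. We claim that

$$\Omega_\theta^{(3)} \subset \bigcup_{n=0}^{\infty} \big( \Omega_{3^n\theta}^{(1)} \cup \Omega_{3^{n+1}\theta}^{(2)} \big). \tag{5.38}$$

Indeed, $\Omega_\theta^{(3)} \setminus (\Omega_\theta^{(1)} \cup \Omega_{3\theta}^{(2)}) \cap \Omega_\theta^{(3)} \subset \{\lambda \in \mathbb{C} \mid \mathrm{Re}\,\mu \le -3\theta,\; |\mathrm{Im}\,\mu| \ge 3\theta\} \subset \Omega_{3\theta}^{(3)}$ and therefore $\Omega_\theta^{(3)} \subset \Omega_\theta^{(1)} \cup \Omega_{3\theta}^{(2)} \cup \Omega_{3\theta}^{(3)}$. Iterating the last inclusion we arrive at the desired relation. Equations (5.37) and (5.38) imply that

$$\Omega_\theta^{(2)} \cup \Omega_\theta^{(3)} \subset \rho(H_s)$$

for any $\theta$ satisfying (5.36).

Now assume $\lambda \notin e + S$, where $S$ is defined in (5.6). Then either $\mathrm{Re}\,\mu < 0$, or $\mathrm{Re}\,\mu \ge 0$ and $|\mathrm{Im}\,\mu| > \frac{1}{3}\mathrm{Re}\,\mu$. In the first case $\exists n$ such that $\mathrm{Re}\,\mu < -\theta_n$ where $\theta_n := \max(20\rho^n\delta_n, 72\delta_{n+1}\rho^{n+1})$, and therefore $\lambda \in \Omega_{\theta_n}^{(3)} \subset \rho(H_s)$. In the second case, assuming $\mu > 0$, we choose $n$ such that $\mathrm{Re}\,\mu \approx 3\theta_n$. Then $|\mathrm{Im}\,\mu| \ge \theta_n$ and $|\mathrm{Re}\,\mu| \le 3\theta_n$ so that $\lambda \in \Omega_{\theta_n}^{(2)} \subset \rho(H_s)$. Hence $\mathbb{C} \setminus \{e + S\} \subset \rho(H_s)$ which implies $\sigma(H_s) \subset e(H_s) + S$.

Remark 5.4. Define $E_{0\infty}(e(H_s), H_s) := \lim_{j \to \infty} E_{0j}(e(H_s), H_s)$. Then

$$E_{0\infty}(e(H_s), H_s) = \sum_{i=1}^{\infty} \rho^i \Delta_i E(e(H_s), H_s), \tag{5.39}$$

where the series on the right-hand side converges absolutely by estimate (5.13), and $e(H_s)$ satisfies the relation

$$e(H_s) = E_{0\infty}(e(H_s), H_s). \tag{5.40}$$

Indeed, Eqs. (5.14) and (5.18) yield $|E_{0j}(\lambda, H_s) - \lambda| \le \frac{1}{8}\rho^{j}$, provided $|\lambda - e_{j-1}(H_s)| \le \frac{1}{12}\rho^{j+1}$, which together with (5.20) implies (5.40). Equations (5.39), (5.40), (5.14) and (5.13) (with $m = 0$) imply that

$$|E_{0n}(\lambda) - e| \le |E_{0n}(\lambda) - E_{0n}(e)| + |E_{0n}(e) - E_{0\infty}(e)| \le \sup_{\lambda \in A_{\delta_n}} \big( |E_{0n}'(\lambda)| \big) |\lambda - e| + \sum_{i=n+1}^{\infty} \rho^i \alpha_i. \tag{5.41}$$

Now, using Eq. (5.16) and the definition of $\alpha_i$ we obtain, furthermore, that

$$|E_{0n}(\lambda) - e| \le \frac{1}{5} |\lambda - e| + (1 - \rho)^{-1}\rho^{n+1}\alpha_{n+1}. \tag{5.42}$$

This estimate is used in our further work, [18].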
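The passage from (5.41) to (5.42) rests on bounding the tail $\sum_{i > n} \rho^i\alpha_i$ by a geometric series, using that $\alpha_i$ is nonincreasing in $i$ when $c\rho^\mu < 1$. The following sketch checks this numerically for hypothetical values of $c$, $\rho$, $\mu$ and $\gamma_0$ (chosen so that $c\rho^\mu < 1$); the function names are illustrative only, and the infinite sum is truncated.

```python
def alpha(n, c, rho, mu, gamma0):
    """alpha_n = c * rho**-1 * (c * rho**mu)**(2*(n-1)) * gamma0**2 (Sec. 5)."""
    return c / rho * (c * rho ** mu) ** (2 * (n - 1)) * gamma0 ** 2


def tail_sum(n, c, rho, mu, gamma0, terms=50):
    """Truncated sum of rho**i * alpha_i over i = n+1, ..., n+terms."""
    return sum(rho ** i * alpha(i, c, rho, mu, gamma0)
               for i in range(n + 1, n + 1 + terms))


def tail_bound(n, c, rho, mu, gamma0):
    """Last term of (5.42): (1 - rho)**-1 * rho**(n+1) * alpha_{n+1}."""
    return rho ** (n + 1) * alpha(n + 1, c, rho, mu, gamma0) / (1 - rho)


# Hypothetical parameters with c * rho**mu = 455 / 4096 < 1.
params = dict(c=455.0, rho=0.25, mu=6.0, gamma0=1e-3)
for n in range(4):
    print(n, tail_sum(n, **params), tail_bound(n, **params))
```

Since the first term of the tail is exactly $\rho^{n+1}\alpha_{n+1}$, the factor $(1 - \rho)^{-1}$ is all that is lost in replacing the sum by the bound.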
Acknowledgments

A part of this work was done while the third author was visiting ETH Zürich, ESI Vienna and IAS Princeton. He is grateful to these institutions for hospitality. He is supported by NSERC Grant No. NA7901.

Appendix A. Proof of Theorem 4.3

The proof below is similar to the proof of [3, Theorem 3.8]. We proceed in two steps. First we determine $\hat w$ such that $H(\hat w) =: R_\rho(H(w))$. In fact, we find explicit formulae expressing $\hat w$ in terms of $w$. Then, using these formulae, we estimate $\hat w$.

Let $H(w) \in \mathcal{D}^{\mu,0}(\rho/8, 1/8, \rho/8)$. We write this operator as $H(w) = H_0 + W$ where $H_0 := E + T$. According to the definition (Eqs. (2.3) and (4.2)) of the smooth Feshbach map, $F_\rho$, we have that

$$F_\rho(H(w)) = H_0 + \chi_\rho W \chi_\rho - \chi_\rho W \bar\chi_\rho \big( H_0 + \bar\chi_\rho W \bar\chi_\rho \big)^{-1} \bar\chi_\rho W \chi_\rho. \tag{A.1}$$

Here, recall, the cut-off operators $\chi_\rho \equiv \chi_{H_f \le \rho}$ are defined in Sec. 3 and $\bar\chi_\rho := 1 - \chi_\rho$. Note that, because of $H(w) \in \mathcal{D}^{\mu,0}(\rho/8, 1/8, \rho/8)$ and of (3.13),

$$\|H_0^{-1}\bar\chi_\rho^2\| \le \frac{2}{\rho} \qquad \text{and} \qquad \|W\| \le \frac{\xi\rho}{8}. \tag{A.2}$$

Equation (A.2) implies that the Neumann series expansion in $W_{\bar\chi_\rho} := \bar\chi_\rho W \bar\chi_\rho$ of the resolvent in (A.1) is norm convergent and yields

$$F_\rho(H(w)) = H_0 + \sum_{L=1}^{\infty} (-1)^{L-1}\, \chi_\rho W \big( H_0^{-1}\bar\chi_\rho^2 W \big)^{L-1} \chi_\rho. \tag{A.3}$$

To write the Neumann series on the right-hand side of (A.3) in the generalized normal form we use Wick's theorem, which we formulate now. We begin with some notation. We introduce the operator families

$$W^{m,n}_{p,q}[w \mid r; k_{(m,n)}] := \chi_1 \int_{B_1^{p+q}} \frac{dx_{(p,q)}}{|x_{(p,q)}|^{1/2}}\; a^*(x_{(p)})\, w_{m+p,n+q}[H_f + r; k_{(m)}, x_{(p)}, \tilde k_{(n)}, \tilde x_{(q)}]\, a(\tilde x_{(q)})\, \chi_1, \tag{A.4}$$

for $m + n \ge 0$ and a.e. $k_{(m,n)} \in B_1^{m+n}$. Here we use the notation for $x_{(p,q)}$, $x_{(p)}$, $\tilde x_{(q)}$, etc. similar to the one introduced in Eqs. (3.2)–(3.4). For $m = 0$ and/or $n = 0$, the variables $k_{(0)}$ and/or $\tilde k_{(0)}$ are dropped. Denote by $S_m$ the group of permutations of $m$ elements. Define the symmetrization operation as

$$w^{(sym)}_{m,n}[r; k_{(m,n)}] := \frac{1}{m!\,n!} \sum_{\pi \in S_m} \sum_{\tilde\pi \in S_n} w_{m,n}[r; k_{\pi(1)}, \dots, k_{\pi(m)}; \tilde k_{\tilde\pi(1)}, \dots, \tilde k_{\tilde\pi(n)}]. \tag{A.5}$$
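The symmetrization (A.5) is a plain average over the two permutation groups, which the following sketch spells out. For simplicity it assumes that the momenta are represented by numbers rather than 3-vectors and that the kernel is an ordinary Python function; `symmetrize` and the sample kernel `w` are hypothetical names used only for this illustration.

```python
from itertools import permutations
from math import factorial


def symmetrize(w, m, n):
    """Return the symmetrized kernel of (A.5) for a kernel w(r, ks, kts).

    ks has length m and kts has length n; the result averages w over all
    permutations of ks and, independently, all permutations of kts.
    """
    def w_sym(r, ks, kts):
        if len(ks) != m or len(kts) != n:
            raise ValueError("wrong number of momentum arguments")
        total = sum(w(r, p, q)
                    for p in permutations(ks)
                    for q in permutations(kts))
        return total / (factorial(m) * factorial(n))
    return w_sym


def w(r, ks, kts):
    """A deliberately non-symmetric sample kernel with m = 2, n = 1."""
    return r + ks[0] + 2 * ks[1] + kts[0]


w_sym = symmetrize(w, 2, 1)
print(w(1, (1, 3), (5,)), w(1, (3, 1), (5,)), w_sym(1, (1, 3), (5,)))
```

The sample kernel takes different values when its first two arguments are swapped, while the symmetrized kernel returns their average for either ordering.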
May 12, 2009 14:51 WSPC/148-RMP
J070-00368
J. Fr¨ ohlich, M. Griesemer & I. M. Sigal
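Finally, below we will use the notation
$$\Sigma[k_{(m)}] := |k_1| + \cdots + |k_m|, \tag{A.6}$$
$$k_{(M,N)} = (k^{(1)}_{(m_1,n_1)}, \dots, k^{(L)}_{(m_L,n_L)}), \qquad k^{(\ell)}_{(m_\ell,n_\ell)} = (k^{(\ell)}_{(m_\ell)}, \tilde k^{(\ell)}_{(n_\ell)}), \tag{A.7}$$
$$r_\ell := \Sigma[\tilde k^{(1)}_{(n_1)}] + \cdots + \Sigma[\tilde k^{(\ell-1)}_{(n_{\ell-1})}] + \Sigma[k^{(\ell+1)}_{(m_{\ell+1})}] + \cdots + \Sigma[k^{(L)}_{(m_L)}], \tag{A.8}$$
$$\tilde r_\ell := \Sigma[\tilde k^{(1)}_{(n_1)}] + \cdots + \Sigma[\tilde k^{(\ell)}_{(n_\ell)}] + \Sigma[k^{(\ell+1)}_{(m_{\ell+1})}] + \cdots + \Sigma[k^{(L)}_{(m_L)}], \tag{A.9}$$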
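with $r_\ell = 0$ if $n_1 = \cdots = n_{\ell-1} = m_{\ell+1} = \cdots = m_L = 0$, and similarly for $\tilde r_\ell$, and $m_1 + \cdots + m_L = M$, $n_1 + \cdots + n_L = N$.

Theorem A.1 (Wick Ordering). Let $w = (w_{m,n})_{m+n\ge1} \in \mathcal{W}_1^s$ and $F_j \equiv F_j(H_f)$, $j = 0, \dots, L$, where the functions $F_j(r)$ are $C^s$ and are bounded together with their derivatives. Write $W := \sum_{m+n\ge1} W_{m,n}$ with $W_{m,n} := W_{m,n}[w_{m,n}]$. Then
$$F_0 W F_1 W \cdots W F_{L-1} W F_L = \tilde W, \tag{A.10}$$
where $\tilde W := W[\tilde w]$, $\tilde w := (\tilde w^{(sym)}_{M,N})_{M+N\ge0}$ with $\tilde w^{(sym)}_{M,N}$ given by the symmetrization, with respect to $k_{(M)}$ and $\tilde k_{(N)}$, of the coupling functions
$$\begin{aligned}
\tilde w_{M,N}[r; k_{(M,N)}] = {} & \sum_{\substack{m_1+\cdots+m_L=M,\\ n_1+\cdots+n_L=N}} \; \sum_{\substack{p_1,q_1,\dots,p_L,q_L:\\ m_\ell+p_\ell+n_\ell+q_\ell\ge1}} \; \prod_{\ell=1}^{L} \binom{m_\ell+p_\ell}{p_\ell}\binom{n_\ell+q_\ell}{q_\ell} \\
& \times F_0[r+\tilde r_0]\, \big\langle \Omega \,\big|\, \tilde W_1[r+r_1; k^{(1)}_{(m_1,n_1)}]\, F_1[H_f + r + \tilde r_1] \\
& \times \tilde W_2[r+r_2; k^{(2)}_{(m_2,n_2)}] \cdots F_{L-1}[H_f + r + \tilde r_{L-1}] \\
& \times \tilde W_L[r+r_L; k^{(L)}_{(m_L,n_L)}]\, \Omega \big\rangle\, F_L[r+\tilde r_L],
\end{aligned} \tag{A.11}$$
with
$$\tilde W_\ell[r; k_{(m_\ell,n_\ell)}] := W^{m_\ell,n_\ell}_{p_\ell,q_\ell}[w \,|\, r; k_{(m_\ell,n_\ell)}]. \tag{A.12}$$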
For a proof of this theorem see [9, Theorem A.4]. Here we sketch the idea of this proof. Substituting the expansion $W := \sum_{m+n\ge1} W_{m,n}$ into (A.10) we find
$$\tilde W = \sum_{\substack{m_1,n_1,\dots,m_L,n_L:\\ m_\ell+n_\ell\ge1}} F_0 \prod_{i=1}^{L} (W_{m_i,n_i} F_i).$$
Now we want to transform each product on the right-hand side to the generalized normal form, see Eq. (3.12). Each factor has the creation and annihilation operators entering it explicitly and through the operators Hf . We do not touch the latter and reshuffle the former.
We pull the annihilation operators, $a$, entering the $W_{m_i,n_i}$'s explicitly, to the right and the creation operators, $a^*$, to the left. The creation and annihilation operators interchange positions according to the formula
$$a(k)a^*(k') = a^*(k')a(k) + \delta(k - k').$$
Thus they either pass through each other without a change or produce the $\delta$-function (contract with each other). Furthermore, they pass through functions of the photon Hamiltonian operator $H_f$ according to the pull-through formulae
$$a(k)F[H_f] = F[H_f + |k|]\,a(k), \qquad F[H_f]\,a^*(k) = a^*(k)F[H_f + |k|], \tag{A.13}$$
which hold on $\mathcal{H}_{red}$ in the sense of operator-valued distributions for every measurable function $F$. Indeed, by the operator calculus it suffices to prove this formula for the resolvent: $a(k)(H_f - z)^{-1} = (H_f + \omega(k) - z)^{-1}a(k)$. The latter equation follows readily from the commutation relation $(H_f + \omega(k))a(k) = a(k)H_f$. Another way ([9, Lemma A.1]) to prove this formula, which is important for us, is to observe that for any $n$, $F[H_f]a^*(k_1)\cdots a^*(k_n)\Omega = F[\omega(k_1) + \cdots + \omega(k_n)]\,a^*(k_1)\cdots a^*(k_n)\Omega$ and use the previous formula.

Some of the creation and annihilation operators reach the extreme left and right positions, while the remaining ones contract. The terms with $M$ creation operators reaching the extreme left positions and $N$ annihilation operators reaching the extreme right positions contribute to the $(M,N)$-formfactor, $\tilde w_{M,N}$, of the operator $\tilde W$.

This is the standard way of proving the Wick theorem on the reduction of operators on Fock spaces to their normal (or Wick) forms, modified by the presence of $H_f$-dependent factors. The problem here is that the number of terms generated by various contractions, which is the number of pairs which can be formed by creation and annihilation operators, is, very roughly, of order $L!$ for a product of $L$ terms. Therefore a simple majorization of the series for $\tilde w_{M,N}$ will diverge badly. Thus we have to re-sum this series in order to take advantage of possible cancellations. The latter is done by, roughly, representing, for a given $M$ and $N$, the sum over all contractions by a vacuum expectation which affects only the “contracting” creation and annihilation operators and does not apply to the “external” ones, i.e. those which reached the extreme positions on the left and right.

As a direct consequence of Theorem A.1 and Eqs. (4.7), (4.9), (4.10) and (A.3), we find a sequence $\hat w$ such that $H(\hat w) = R_\rho(H(w)) = S_\rho(F_\rho(H(w)))$ as follows.
Theorem A.2. Let $H(w) \in \mathcal{D}^{\mu,s}(\rho/8, \rho/8, \rho/8)$. Then $R_\rho(H(w)) = H(\hat w)$, where $\hat w = (\hat w^{(sym)}_{M,N})_{M+N\ge0}$ with $\hat w^{(sym)}_{M,N}$, the symmetrization with respect to $k_{(M)}$ and $\tilde k_{(N)}$ (as in Eq. (A.5)) of the kernels
$$\hat w_{M,N}[r; k_{(M,N)}] = \rho^{M+N-1} \sum_{L=1}^{\infty} (-1)^{L-1} \sum_{\substack{m_1+\cdots+m_L=M,\\ n_1+\cdots+n_L=N}} \; \sum_{\substack{p_1,q_1,\dots,p_L,q_L:\\ m_\ell+p_\ell+n_\ell+q_\ell\ge\delta_L}} \; \prod_{\ell=1}^{L} \binom{m_\ell+p_\ell}{p_\ell}\binom{n_\ell+q_\ell}{q_\ell}\; V_{m,p,n,q}[r; k_{(M,N)}], \tag{A.14}$$
for $M + N \ge 1$, and
$$\hat w_{0,0}[r] = r + \rho^{-1} \sum_{L=2}^{\infty} (-1)^{L-1} \sum_{\substack{p_1,q_1,\dots,p_L,q_L:\\ p_\ell+q_\ell\ge1}} V_{0,p,0,q}[r], \tag{A.15}$$
for $M = N = 0$. Here $m, p, n, q := (m_1, p_1, n_1, q_1, \dots, m_L, p_L, n_L, q_L) \in \mathbb{N}_0^{4L}$, and
$$V_{m,p,n,q}[r; k_{(M,N)}] := \Big\langle \Omega,\; F_0[H_f + r] \prod_{\ell=1}^{L} \big\{ \tilde W_\ell[\rho(r + r_\ell); \rho k^{(\ell)}_{(m_\ell,n_\ell)}]\, F_\ell[H_f + r] \big\}\, \Omega \Big\rangle, \tag{A.16}$$
with $M := m_1 + \cdots + m_L$, $N := n_1 + \cdots + n_L$, $F_0[r] := \chi_1[r + \tilde r_0]$, $F_L[r] := \chi_1[r + \tilde r_L]$ and
$$F_\ell[r] := \frac{\bar\chi_1[r + \tilde r_\ell]^2}{T[\rho(r + \tilde r_\ell)] + E}, \quad \text{for } \ell = 1, \dots, L-1.$$
Here the notation introduced in Eqs. (A.4)–(A.9) and (A.12) is used.

We remark that Theorem A.2 determines $\hat w$ from $w \in \mathcal{W}^{\mu,s}$ only as a sequence of integral kernels that define an operator in $\mathcal{B}[\mathcal{F}]$. Now we show that $\hat w \in \mathcal{W}^{\mu,s}$, i.e. $\|\hat w\|_{\mu,s,\xi} < \infty$. In what follows we use the notation introduced in Eqs. (A.4)–(A.9) and (A.12). To estimate $\hat w$, we start with the following preparatory lemma.

Lemma A.3. For fixed $L \in \mathbb{N}$ and $m, p, n, q \in \mathbb{N}_0^{4L}$, we have $V_{m,p,n,q} \in \mathcal{W}^{\mu,s}_{M,N}$ and
$$\|V_{m,p,n,q}\|_{\mu,s} \le 4 C_\chi^2\, \rho^{\mu} L^s \Big(\frac{C_\chi}{\rho}\Big)^{L-1} \prod_{\ell=1}^{L} \frac{\|w_{m_\ell+p_\ell,\,n_\ell+q_\ell}\|_{\mu,s}}{\sqrt{p_\ell^{p_\ell}\, q_\ell^{q_\ell}}}, \tag{A.17}$$
with the convention that $p^p := 1$ for $p = 0$. Here the constant $C_\chi$ is given by (4.12).

This lemma is proven in [3] (Lemma III.10) for the $L^2$-version of the norms (3.5) and (3.7) with $s = 1$. The extension of this lemma to the norms (3.5) and (3.7) with $s = 2$, used in this paper, is straightforward. We present here the proof for $s = 0$ and point out how it extends to the $s > 0$ case in order to illustrate its simple structure and for references needed later.
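Remark A.4. The proof of Lemma A.3 requires taking derivatives of $\chi_1[r]$ and $\bar\chi_1[r]$. Here the main advantage of using the smooth Feshbach map, rather than the (projection) Feshbach map, becomes manifest. If $\chi_1[r]$ and $\bar\chi_1[r]$ were projections, i.e. characteristic functions of intervals, we would inevitably encounter $\delta$-distributions. In fact, the appearance of these $\delta$-distributions is the reason for using (a rather involved mixture of) supremum and $L^1$-norms in [8, 9]. In contrast, the proof of Lemma A.3 is quite straightforward and merely requires summation of geometric series.

Proof. First we note that by the definition of the cut-off function $\chi_1(r) \equiv \chi_{r\le1}$ (see the paragraph after (3.9)), $|F_i[r]| \le 1$, $i = 0, L$. Moreover, since $T(r) \ge \frac{7}{8}r$, $\operatorname{supp}\bar\chi_1 \subset \{r \ge 1\}$ and $|E| \le \frac{1}{8}\rho$, we have that, for $\ell = 1, \dots, L-1$,
$$|F_\ell[r]| \le \frac{\bar\chi_1^2[r + \tilde r_\ell]}{T[\rho(r + \tilde r_\ell)] - E} \le \frac{4}{3\rho}. \tag{A.18}$$
Now, we estimate $|V_{m,p,n,q}|$, using that $|\langle \Omega, A\Omega\rangle| \le \|A\|_{op}$, for any $A \in \mathcal{B}[\mathcal{H}_{red}]$. We have that
$$|V_{m,p,n,q}[r; k_{(M,N)}]| \le \prod_{\ell=0}^{L} \|F_\ell[H_f + r]\|_{op} \prod_{\ell=1}^{L} \|\tilde W_\ell[\rho(r + r_\ell); \rho k^{(\ell)}_{(m_\ell,n_\ell)}]\|_{op}. \tag{A.19}$$
Using (3.17) and letting $\ell_j$ be defined by the property that the vector $k^{(\ell_j)}_{(m_{\ell_j},n_{\ell_j})}$ contains $k_j$ among its 3-dimensional components, we arrive at
$$\begin{aligned}
\|V_{m,p,n,q}\|_{\mu} = {} & \max_j \sup_{r\in I,\; k_{(M,N)}\in B_1^{M+N}} \big|\, |k_j|^{-\mu}\, V_{m,p,n,q}[r; k_{(M,N)}] \big| \\
\le {} & \Big(\frac{4}{3\rho}\Big)^{L-1} \max_j \prod_{\ell\ne\ell_j}^{1,L} \; \sup_{r\in I,\; k^{(\ell)}_{(m_\ell,n_\ell)}\in B_1^{m_\ell+n_\ell}} \|\tilde W_\ell[\rho r; \rho k^{(\ell)}_{(m_\ell,n_\ell)}]\|_{op} \\
& \times \sup_{r\in I,\; k^{(\ell_j)}_{(m_{\ell_j},n_{\ell_j})}\in B_1^{m_{\ell_j}+n_{\ell_j}}} |k_j|^{-\mu}\, \|\tilde W_{\ell_j}[\rho r; \rho k^{(\ell_j)}_{(m_{\ell_j},n_{\ell_j})}]\|_{op} \\
\le {} & \Big(\frac{4}{3\rho}\Big)^{L-1} \rho^{\mu} \max_j \prod_{\ell\ne\ell_j}^{1,L} \; \sup_{r\in I,\; k^{(\ell)}_{(m_\ell,n_\ell)}\in B_1^{m_\ell+n_\ell}} \|\tilde W_\ell[r; k^{(\ell)}_{(m_\ell,n_\ell)}]\|_{op} \\
& \times \sup_{r\in I,\; k^{(\ell_j)}_{(m_{\ell_j},n_{\ell_j})}\in B_1^{m_{\ell_j}+n_{\ell_j}}} |k_j|^{-\mu}\, \|\tilde W_{\ell_j}[r; k^{(\ell_j)}_{(m_{\ell_j},n_{\ell_j})}]\|_{op}.
\end{aligned} \tag{A.20}$$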
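We now convert the operator norms on the right-hand side of (A.20) into the coupling function norms. To this end we use, pointwise in $k^{(\ell)}_{(m_\ell,n_\ell)}$ a.e., inequality (3.11) in Theorem 3.1 to obtain for any $\mu \ge 0$
$$\begin{aligned}
\max_j \sup_{r\in I,\; k^{(\ell)}_{(m_\ell,n_\ell)}\in B_1^{m_\ell+n_\ell}} |k_j|^{-\mu}\, \|\tilde W_\ell[r; k^{(\ell)}_{(m_\ell,n_\ell)}]\|_{op}
\le {} & \frac{1}{\sqrt{p_\ell^{p_\ell} q_\ell^{q_\ell}}} \max_j \sup_{r\in I,\; k^{(\ell)}_{(m_\ell,n_\ell)}\in B_1^{m_\ell+n_\ell}} |k_j|^{-\mu}\, \|w_{m_\ell+p_\ell,n_\ell+q_\ell}[\,\cdot\,; k^{(\ell)}_{(m_\ell)}, \cdot\,; \tilde k^{(\ell)}_{(n_\ell)}, \cdot\,]\|_{0} \\
\le {} & \frac{1}{\sqrt{p_\ell^{p_\ell} q_\ell^{q_\ell}}}\, \|w_{m_\ell+p_\ell,n_\ell+q_\ell}\|_{\mu}.
\end{aligned}$$
This estimate with $\mu = 0$ if $\ell \ne \ell_j$ and $\mu \ge 0$ if $\ell = \ell_j$, inserted into the $\ell$th factor on the right-hand side of (A.20), yields (A.17) with $s = 0$.

To estimate the norm $\|V_{m,p,n,q}\|_{\mu,s}$ with $s = 1, 2$ we need the bounds
$$|\partial_r^s F_\ell[r]| \le \frac{C_\chi}{\rho}, \tag{A.21}$$
where the constant $C_\chi$ is given in (4.12). These bounds are obtained similarly to (A.18), using the inequality
$$|\partial_r F_\ell[r]| \le \left| \frac{2\bar\chi_1[r + \tilde r_\ell]\, \partial_r \bar\chi_1[r + \tilde r_\ell]}{T[\rho(r + \tilde r_\ell)] - E} \right| + \left| \frac{\bar\chi_1^2[r + \tilde r_\ell]\, \rho\, \partial_r T[z; \rho(r + \tilde r_\ell)]}{(T[\rho(r + \tilde r_\ell)] - E)^2} \right|$$
and a similar inequality for $|\partial_r^2 F_\ell[r]|$. To estimate $\|V_{m,p,n,q}\|_{\mu,s}$ with $s = 1, 2$ we apply the operator $\partial_r^n$ to (A.16) and use the Leibniz rule of differentiation of products $s$ times to obtain (A.17).

We are now prepared to prove the estimates in Theorem 4.3. Recall that we assume $\rho \le 1/2$ and we choose $\xi = 1/4$. First, we apply Lemma A.3 to (A.14) and use that $\binom{m+p}{p} \le 2^{m+p}$. This yields
$$\|\hat w_{M,N}\|_{\mu,s} \le \sum_{L=1}^{\infty} 4 C_\chi\, \rho^{\mu} L^s \Big(\frac{C_\chi}{\rho}\Big)^{L} (2\rho)^{M+N} \sum_{\substack{m_1+\cdots+m_L=M,\\ n_1+\cdots+n_L=N}} \; \sum_{\substack{p_1,q_1,\dots,p_L,q_L:\\ m_\ell+p_\ell+n_\ell+q_\ell\ge1}} \; \prod_{\ell=1}^{L} \Big(\frac{2}{\sqrt{p_\ell}}\Big)^{p_\ell} \Big(\frac{2}{\sqrt{q_\ell}}\Big)^{q_\ell} \|w_{m_\ell+p_\ell,n_\ell+q_\ell}\|_{\mu,s}. \tag{A.22}$$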
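Using the definition (3.17) and the inequality $2\rho \le 1$, we derive the following bound for $\hat w_1 := (\hat w_{M,N})_{M+N\ge1}$:
$$\begin{aligned}
\|\hat w_1\|_{\mu,s,\xi} := {} & \sum_{M+N\ge1} \xi^{-(M+N)} \|\hat w_{M,N}\|_{\mu,s} \\
\le {} & 8 C_\chi\, \rho^{1+\mu} \sum_{L=1}^{\infty} L^s \Big(\frac{C_\chi}{\rho}\Big)^{L} \sum_{M+N\ge1} \; \sum_{\substack{m_1+\cdots+m_L=M,\\ n_1+\cdots+n_L=N}} \; \sum_{\substack{p_1,q_1,\dots,p_L,q_L:\\ m_\ell+p_\ell+n_\ell+q_\ell\ge1}} \\
& \times \prod_{\ell=1}^{L} \Big(\frac{2\xi}{\sqrt{p_\ell}}\Big)^{p_\ell} \Big(\frac{2\xi}{\sqrt{q_\ell}}\Big)^{q_\ell} \xi^{-(m_\ell+p_\ell+n_\ell+q_\ell)}\, \|w_{m_\ell+p_\ell,n_\ell+q_\ell}\|_{\mu,s} \\
\le {} & 8 C_\chi\, \rho^{1+\mu} \sum_{L=1}^{\infty} L^s \Big(\frac{C_\chi}{\rho}\Big)^{L} \Bigg\{ \sum_{m+n\ge1} \Big[ \sum_{p=0}^{m} \Big(\frac{2\xi}{\sqrt{p}}\Big)^{p} \Big] \Big[ \sum_{q=0}^{n} \Big(\frac{2\xi}{\sqrt{q}}\Big)^{q} \Big] \xi^{-(m+n)} \|w_{m,n}\|_{\mu,s} \Bigg\}^{L}.
\end{aligned}$$
Using the assumption $\xi = 1/4$ and the estimate $\sum_{p=0}^{m} (2\xi/\sqrt{p})^{p} \le \sum_{p=0}^{\infty} (2\xi)^{p} = \frac{1}{1-2\xi}$, and recalling the definitions $w_1 := (w_{m,n})_{m+n\ge1}$ and $\|w_1\|_{\mu,s,\xi} := \sum_{m+n\ge1} \xi^{-(m+n)} \|w_{m,n}\|_{\mu,s}$, we obtain
$$\|\hat w_1\|_{\mu,s,\xi} \le 8 C_\chi\, \rho^{\mu+1} \sum_{L=1}^{\infty} L^s B^L, \tag{A.23}$$
where
$$B := \frac{C_\chi}{\rho(1-2\xi)^2}\, \|w_1\|_{\mu,s,\xi}. \tag{A.24}$$
Note that in (A.23) we have dropped the factor $p^{-p/2}$ gained in Theorem 3.1. Our assumption, $\gamma \le (8C_\chi)^{-1}\rho$, also ensures that
$$B \le \frac{4 C_\chi \gamma}{\rho} \le \frac{1}{2}. \tag{A.25}$$
Thus the geometric series in the last line of (A.23) is convergent. We obtain for $s = 0, 1, 2$
$$\sum_{L=1}^{\infty} L^s B^L \le 8B. \tag{A.26}$$
Inserting (A.26) into (A.23), we see that the right-hand side of (A.23) is bounded by $64 C_\chi \rho^{1+\mu} B$ which, remembering the definition of $B$, gives
$$\|\hat w_1\|_{\mu,s,\xi} \le 256\, C_\chi^2\, \rho^{\mu}\, \|w_1\|_{\mu,s,\xi}. \tag{A.27}$$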
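Next, we estimate $\hat w_{0,0}$. We analyze the expression (A.15). Using estimate (A.17) with $m = 0$, $n = 0$ (and consequently, $M = 0$, $N = 0$), we find
$$\rho^{-1} \|V_{0,p,0,q}\|_{\mu,s} \le 2 L^s\, C_\chi^{L+1} \rho^{-L} \prod_{\ell=1}^{L} \frac{\|w_{p_\ell,q_\ell}\|_{\mu,s}}{\sqrt{p_\ell^{p_\ell} q_\ell^{q_\ell}}}. \tag{A.28}$$
In fact, examining the proof of Lemma A.3 more carefully we see that the following, slightly stronger estimate is true:
$$\rho^{-1} \|\partial_r^s V_{0,p,0,q}\|_{\mu,0} \le 2 L^s\, C_\chi^{L+1} \rho^{-L+s} \prod_{\ell=1}^{L} \frac{\|w_{p_\ell,q_\ell}\|_{\mu,s}}{\sqrt{p_\ell^{p_\ell} q_\ell^{q_\ell}}}. \tag{A.29}$$
Now, using (A.29) and $\sum_{p+q\ge1} \|w_{p,q}\|_{\mu,s} \le \xi \sum_{p+q\ge1} \xi^{-p-q} \|w_{p,q}\|_{\mu,s} = \xi\, \|w_1\|_{\mu,s,\xi}$, where, recall, $w_1 := (w_{m,n})_{m+n\ge1}$, we obtain
$$\begin{aligned}
\rho^{-1} \sum_{L=2}^{\infty} \sum_{\substack{p_1,q_1,\dots,p_L,q_L:\\ p_\ell+q_\ell\ge1}} \sup_{r\in I} |\partial_r^s V_{0,p,0,q}[r]|
\le {} & 2 C_\chi\, \rho^{s} \sum_{L=2}^{\infty} L^s \Big(\frac{C_\chi}{\rho}\Big)^{L} \Big\{ \sum_{p+q\ge1} \|w_{p,q}\|_{\mu,s} \Big\}^{L} \\
\le {} & 2 C_\chi\, \rho^{s} \sum_{L=2}^{\infty} L^s D^L,
\end{aligned} \tag{A.30}$$
where $D := C_\chi \rho^{-1} \xi\, \|w_1\|_{\mu,s,\xi}$. Now, since $D \le C_\chi \xi \rho^{-1} \gamma \le \xi/8 = 1/16$, we have, similarly to (A.26), that $\sum_{L=2}^{\infty} L^s D^L \le 12 D^2$ for $s = 0, 1, 2$. Hence we find
$$\rho^{-1} \sum_{L=2}^{\infty} \sum_{\substack{p_1,q_1,\dots,p_L,q_L:\\ p_\ell+q_\ell\ge1}} \sup_{r\in I} |\partial_r^s V_{0,p,0,q}[r]| \le 24 C_\chi\, \rho^{s} \Big( \frac{C_\chi \xi}{\rho}\, \|w_1\|_{\mu,s,\xi} \Big)^{2}, \tag{A.31}$$
for $s = 0, 1, 2$.

We set $\hat E := \hat w_{0,0}[0]$. Since $E = w_{0,0}[0]$, Eqs. (A.15) and (A.31) yield
$$|\hat E - \rho^{-1} E| \le 24 C_\chi \Big( \frac{C_\chi \xi}{\rho}\, \|w_1\|_{\mu,0,\xi} \Big)^{2}. \tag{A.32}$$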
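The elementary series bound $\sum_{L\ge2} L^s D^L \le 12 D^2$ for $0 < D \le 1/16$ and $s = 0, 1, 2$, used to pass from (A.30) to (A.31), can be checked numerically. The following is a minimal sketch, not part of the original argument; the function name and the grid of test values are our own choices, and the infinite series is truncated (the neglected tail is far below floating-point resolution for $D \le 1/16$).

```python
def power_geometric_sum(x, s, lower=1, terms=200):
    """Partial sum of sum_{L >= lower} L**s * x**L, valid for 0 <= x < 1.

    With x <= 1/16 the neglected tail is smaller than 16**(-200),
    so the truncation does not affect the comparison below.
    """
    return sum(L ** s * x ** L for L in range(lower, lower + terms))


D_MAX = 1.0 / 16.0
# Grid of test values 0 < D <= 1/16 (an arbitrary choice for illustration).
grid = [D_MAX * j / 200 for j in range(1, 201)]

# Largest value of (sum_{L>=2} L^s D^L) / D^2 over the grid and s = 0, 1, 2.
worst_ratio = max(
    power_geometric_sum(D, s, lower=2) / D ** 2
    for s in (0, 1, 2)
    for D in grid
)
```

On this grid `worst_ratio` stays well below the constant 12; the largest value is attained at $D = 1/16$, $s = 2$, since $\sum_{L\ge2} L^s D^{L-2}$ is increasing in both $D$ and $s$.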
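Next, writing $\hat T[r] := \hat w_{0,0}[r] - \hat w_{0,0}[0]$, we find furthermore that
$$\sup_{r\in[0,\infty)} |\hat T'[r] - 1| = \sup_{r\in[0,\infty)} |\partial_r \hat w_{0,0}[r] - 1| \le \sup_{r\in[0,\infty)} |T'[r] - 1| + 24 C_\chi\, \rho \Big( \frac{C_\chi \xi}{\rho}\, \|w_1\|_{\mu,1,\xi} \Big)^{2}. \tag{A.33}$$
Now, recall that $|T'[r] - 1| \le \beta$ and $\|w_1\|_{\mu,s,\xi} \le \gamma$. Hence Eqs. (A.32), (A.33), and (A.27) give (4.13) with $\alpha' = 24 C_\chi \big(\frac{C_\chi \xi \gamma}{\rho}\big)^2$, $\beta' = \beta + 24 C_\chi \big(\frac{C_\chi \xi \gamma}{\rho}\big)^2$ and $\gamma' = 256\, C_\chi^2 \rho^{\mu} \gamma$. Remembering that $\xi = \sqrt{\rho}/(4C_\chi)$ we conclude that the statement of Theorem 4.3 holds.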
Remark A.5. In the proof of the limiting absorption principle (LAP) in [18], to estimate $\|V_{m,p,n,q}\|_{\mu,s}$ with $s = 1, 2$ (see Lemma A.3), instead of the operator $\partial_r^n$ we apply the operator $\partial_r^n (k\partial_k)^q$ to (A.16). Here $q := (q_1, \dots, q_{M+N})$, $(k\partial_k)^q := \prod_{j=1}^{M+N} (k_j \cdot \nabla_{k_j})^{q_j}$, with $k_{m+j} := \tilde k_j$, and the indices $n$ and $q$ satisfy $0 \le n + |q| \le s$.

Remark A.6. For the proof of the limiting absorption principle in [18] we also need the following estimate (here we use that $\hat T''[r] = \partial_r^2 \hat w_{0,0}[r]$):
$$\sup_{r\in[0,\infty)} |\hat T''[r]| \le \rho \sup_{r\in[0,\infty)} |T''[r]| + 24 C_\chi\, \rho^2 \Big( \frac{C_\chi \xi}{\rho}\, \|w_1\|_{\mu,2,\xi} \Big)^{2}. \tag{A.34}$$

Appendix B. Construction of Eigenvalues and Eigenvectors

In this Appendix, we prove that the value $E := e(H_s) + H_u$ we constructed in Sec. 5 is the ground state energy of the Hamiltonian $H$ under consideration (see Theorem 5.3) and we construct the corresponding ground state. We use the definitions of Sec. 5. We follow closely [3].

Theorem B.1. Let $H \in \mathcal{D}$. Then the value $E := e(H_s) + H_u$, where $e(H_s)$ is given in Theorem 5.1, is a simple eigenvalue of the operator $H$. The corresponding eigenfunction is given constructively in Eq. (B.13) below.

Proof. Let $H^{(0)} := H - E \in \mathcal{M}_s$. We define a sequence of operators $(H^{(n)})_{n=0}^{\infty}$ in $\mathcal{W}^{\mu,s}_{op} \subseteq \mathcal{B}(\mathcal{H}_{red})$ by $H^{(n)} := R_\rho^n(H^{(0)})$. We will also need the following representation for $S_\rho$:
$$S_\rho(A) =: \Gamma_\rho A \Gamma_\rho^*, \tag{B.1}$$
where $\Gamma_\rho$ is the unitary dilatation on $\mathcal{F}$ defined by this formula and $\Gamma_\rho \Omega = \Omega$. Then the definition (4.13) of $R_\rho$ implies that, for all integers $n \ge 0$,
$$H^{(n)} = \frac{1}{\rho}\, \Gamma_\rho \big(F_\rho(H^{(n-1)})\big) \Gamma_\rho^*, \tag{B.2}$$
where, recall, $F_\rho := F_{\tau\chi_\rho}$ with $\tau(H) := W_{0,0}$ (see Eq. (4.1)). We will use the operators $Q_{\tau\chi}$ defined in (2.6). It is easy to show (see [3]) that these operators satisfy the identity $H Q_{\tau\chi} = \chi F_{\tau\chi}(H)$. Let
$$Q^{(n)} := Q_{\tau\chi_\rho}(H^{(n)}). \tag{B.3}$$
Then the equation $H^{(n)} Q^{(n)} = \chi_\rho F_\rho(H^{(n)})$, together with (B.2), implies the intertwining property
$$H^{(n-1)} Q^{(n-1)} \Gamma_\rho^* = \rho\, \Gamma_\rho^* \chi_1 H^{(n)}. \tag{B.4}$$
Equation (B.4) is the key identity for the proof of the existence of an eigenvector with the eigenvalue $e$.
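For the construction of this eigenvector, for non-negative integers $k$ we define vectors $\Psi_k$ in $\mathcal{H}_{red}$ by setting $\Psi_0 := \Omega$ and
$$\Psi_k := Q^{(0)} \Gamma_\rho^* Q^{(1)} \Gamma_\rho^* \cdots Q^{(k-1)} \Omega. \tag{B.5}$$
We first show that this sequence is convergent, as $k \to \infty$. To this end, we observe that $\Omega = \Gamma_\rho^* \chi_\rho \Omega$ and hence
$$\Psi_{k+1} - \Psi_k = Q^{(0)} \Gamma_\rho^* Q^{(1)} \Gamma_\rho^* \cdots Q^{(k-1)} \Gamma_\rho^* (Q^{(k)} - \chi_\rho) \Omega. \tag{B.6}$$
Since $\|\chi_\rho\| \le 1$, this implies that
$$\|\Psi_{k+1} - \Psi_k\| \le \|Q^{(k)} - \chi_\rho\|_{op} \prod_{j=0}^{k-1} \big\{ 1 + \|Q^{(j)} - \chi_\rho\|_{op} \big\}. \tag{B.7}$$
To estimate the terms on the right-hand side we consider the $j$th step Hamiltonian $H^{(j)}$. As in the proof of Proposition A.5 we write $H^{(j)}$ as
$$H^{(j)} = E_j \cdot 1 + T_j + W_j, \tag{B.8}$$
with
$$|E_j| \le 8\alpha_j \quad\text{and}\quad \|W_j\|_{op} \le \gamma_j \le \frac{\rho}{16}. \tag{B.9}$$
Recalling the definition (2.6) of $Q^{(j)}$, we have
$$\chi_\rho - Q^{(j)} = \bar\chi_\rho\, (E_j + T_j + \bar\chi_\rho W_j \bar\chi_\rho)^{-1}\, \bar\chi_\rho W_j \chi_\rho. \tag{B.10}$$
By (B.9), for all $j \in \mathbb{N}$, we may estimate
$$\|\chi_\rho - Q^{(j)}\|_{op} \le \Big( \frac{\rho}{8} - \|W_j\|_{op} \Big)^{-1} \|W_j\|_{op} \le \frac{16\gamma_j}{\rho}. \tag{B.11}$$
Inserting this estimate into (B.7) and using that $\prod_{j=0}^{\infty} (1 + \lambda_j) \le \exp\big[\sum_{j=0}^{\infty} \lambda_j\big]$, for $\lambda_j \ge 0$, we obtain
$$\|\Psi_{k+1} - \Psi_k\| \le \frac{16\gamma_k}{\rho} \prod_{j=0}^{k-1} \Big( 1 + \frac{16\gamma_j}{\rho} \Big) \le \frac{16\gamma_k}{\rho} \exp[32\gamma_0 \rho^{-1}], \tag{B.12}$$
where we have used that $\sum_{j=0}^{\infty} \gamma_j \le 2\gamma_0$ (recall the definition of $\gamma_j$ after Eq. (5.1)). Since $\sum_{j=0}^{\infty} \gamma_j < \infty$, we see that the sequence $(\Psi_k)_{k\in\mathbb{N}_0}$ of vectors in $\mathcal{H}_{red}$ is convergent, and its limit
$$\Psi_\infty := \lim_{k\to\infty} \Psi_k \tag{B.13}$$
satisfies the estimate
$$\|\Psi_\infty - \Omega\| = \|\Psi_\infty - \Psi_0\| \le \frac{32\gamma_0}{\rho} \exp[32\gamma_0 \rho^{-1}], \tag{B.14}$$
which guarantees that $\Psi_\infty \ne 0$.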
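The two elementary facts behind (B.12) and (B.14), namely $\prod_j (1 + \lambda_j) \le \exp[\sum_j \lambda_j]$ for $\lambda_j \ge 0$ and the summability of the increments, can be illustrated with a small numerical sketch. The sequence $\gamma_j$ is defined after Eq. (5.1), which is not reproduced here, so the geometric decay $\gamma_j = \gamma_0 2^{-j}$ used below is purely a hypothetical stand-in satisfying $\sum_j \gamma_j \le 2\gamma_0$; the values of `rho` and `gamma0` are likewise arbitrary.

```python
import math


def product_and_exp_bound(lams):
    """Return (prod_j (1 + lam_j), exp(sum_j lam_j)) for non-negative lam_j."""
    prod = 1.0
    for lam in lams:
        prod *= 1.0 + lam
    return prod, math.exp(sum(lams))


rho = 0.5                 # hypothetical value with rho <= 1/2
gamma0 = rho / 16.0       # saturates the constraint gamma_j <= rho/16 of (B.9)
# Hypothetical decay law; the paper's gamma_j is defined after Eq. (5.1).
gammas = [gamma0 * 2.0 ** (-j) for j in range(60)]

lams = [16.0 * g / rho for g in gammas]
prod, exp_bound = product_and_exp_bound(lams)

# Right-hand side of (B.12) for each k, and their sum, which bounds
# the norm of Psi_infinity - Psi_0 as in (B.14).
uniform_factor = math.exp(32.0 * gamma0 / rho)
increments = [16.0 * g / rho * uniform_factor for g in gammas]
total_increment = sum(increments)
rhs_B14 = 32.0 * gamma0 / rho * uniform_factor
```

With these choices `prod <= exp_bound <= uniform_factor` and `total_increment <= rhs_B14`, mirroring the chain of inequalities in the text.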
The vector $\Psi_\infty$ constructed above is an element of the kernel of $H^{(0)}$, as we will now demonstrate. Observe that, thanks to (B.4),
$$\begin{aligned}
H^{(0)} \Psi_k &= (H^{(0)} Q^{(0)} \Gamma_\rho^*)(Q^{(1)} \Gamma_\rho^* \cdots Q^{(k-1)} \Omega) \\
&= \rho\, \Gamma_\rho^* \chi_1 (H^{(1)} Q^{(1)} \Gamma_\rho^*)(Q^{(2)} \Gamma_\rho^* \cdots Q^{(k-1)} \Omega) \\
&\;\;\vdots \\
&= \rho^k (\Gamma_\rho^* \chi_1)^k H^{(k)} \Omega.
\end{aligned} \tag{B.15}$$
Equation (B.8) together with the estimate (B.9) and the relation $T_k \Omega = 0$ implies that
$$\|H^{(k)} \Omega\| = \|(W_k + E_k)\Omega\| \le \gamma_k + 8\alpha_k^2 \le 2\gamma_k. \tag{B.16}$$
Summarizing (B.15)–(B.16) and using that the operator norm of $\Gamma_\rho^* \chi_1$ is bounded by 1, we arrive at
$$\|H^{(0)} \Psi_k\| \le 2\gamma_k \to 0 \tag{B.17}$$
as $k \to \infty$. Since $H^{(0)} \in \mathcal{B}(\mathcal{H}_{red})$ is continuous, (B.17) implies that
$$H^{(0)} \Psi_\infty = \lim_{k\to\infty} (H^{(0)} \Psi_k) = 0. \tag{B.18}$$
Thus 0 is an eigenvalue of the operator $H^{(0)} := H - E$, i.e. $E$ is an eigenvalue of the operator $H$, with the eigenfunction $\Psi_\infty$.

Appendix C. Analyticity of all Parts of H(w)

Let $S$ be an open set in a Banach space $\mathcal{B}$. Below the analyticity is understood in the sense described in the paragraph preceding Theorem 5.1.

Proposition C.1 ([20]). Suppose that $\lambda \mapsto H(w^\lambda)$ is analytic in $\lambda \in S$ and that $H(w^\lambda)$ belongs to some polydisc $\mathcal{D}(\alpha, \beta, \gamma)$ for all $\lambda \in S$. Then
$$\lambda \mapsto w^\lambda_{0,0}(H_f) \quad\text{and}\quad \lambda \mapsto W(w^\lambda)$$
are analytic in $\lambda \in S$.

Proof. Recall that $B_1 = \{k \in \mathbb{R}^3 : |k| \le 1\}$ and that an operator $A$ is called $H_f$-bounded iff the operator $A(H_f + 1)^{-1}$ is bounded. Let $P_1$ denote the projection onto the one-boson subspace of $\mathcal{F}$, which is isomorphic to $L^2(\mathbb{R}^3)$. Then $P_1 H(w^\lambda) P_1$, like $H(w^\lambda)$, is analytic. We write
$$\begin{aligned}
P_1 H(w^\lambda)(H_f + 1)^{-1} P_1 &= P_1 w^\lambda_{0,0}(H_f)(H_f + 1)^{-1} P_1 + P_1 W_{1,1}(w^\lambda)(H_f + 1)^{-1} P_1 \\
&= D_\lambda + K_\lambda,
\end{aligned} \tag{C.1}$$
where $D_\lambda$ denotes multiplication with $w^\lambda_{0,0}(\omega)(\omega + 1)^{-1}$, $\omega := |k|$, and $K_\lambda$ is the Hilbert–Schmidt operator with kernel
$$M_\lambda(k, \tilde k) = w^\lambda_{1,1}(0, k, \tilde k)(\tilde\omega + 1)^{-1},$$
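whose support belongs to $B_1 \times B_1$. In what follows, if an operator family has a factor $(H_f + 1)^{-1}$ standing on its right, then the analyticity is understood in the operator norm. Our strategy is to show first that $K_\lambda$ and hence
$$P_1 w^\lambda_{0,0}(H_f)(H_f + 1)^{-1} P_1 = P_1 H(w^\lambda)(H_f + 1)^{-1} P_1 - K_\lambda$$
are analytic. Then we show that $\lambda \mapsto w^\lambda_{0,0}(H_f)$ is analytic. The analyticity of $\lambda \mapsto W(w^\lambda) = H(w^\lambda) - w^\lambda_{0,0}(H_f)$ then follows.

Step 1. $K_\lambda$ is analytic.

For each $n \in \mathbb{N}$ let $\{Q_i^{(n)}\}_i$ be a collection of $n$ measurable subsets of $B_1$ such that
$$B_1 = \bigcup_{i=1}^{n} Q_i^{(n)}, \qquad Q_i^{(n)} \cap Q_j^{(n)} = \emptyset, \quad i \ne j, \tag{C.2}$$
and
$$|Q_i^{(n)}| \le \frac{\text{const}}{n}. \tag{C.3}$$
Let $\chi_i^{(n)}$ denote the operator on $L^2(B_1)$ of multiplication with $\chi_{Q_i^{(n)}}$. Then for $i \ne j$, $\chi_i^{(n)} D_\lambda \chi_j^{(n)} = 0$ because $\chi_i^{(n)}$ and $\chi_j^{(n)}$ have disjoint support and commute with $D_\lambda$. Together with (C.1) this implies that
$$\chi_i^{(n)} K_\lambda \chi_j^{(n)} = \chi_i^{(n)} P_1 H(w^\lambda)(H_f + 1)^{-1} P_1 \chi_j^{(n)}, \quad \text{for } i \ne j.$$
Since the right-hand side is analytic, so is the left-hand side and hence
$$K_\lambda^{(n)} = \sum_{i \ne j} \chi_i^{(n)} K_\lambda \chi_j^{(n)}$$
is analytic. It follows that $\lambda \mapsto \langle \varphi, K_\lambda^{(n)} \psi \rangle$ is analytic for all $\varphi, \psi$ in $L^2(B_1)$. Now let $\varphi, \psi \in C(B_1)$. Then
$$\begin{aligned}
|\langle \varphi, K_\lambda \psi \rangle - \langle \varphi, K_\lambda^{(n)} \psi \rangle| &= \Big| \int_{B_1 \times B_1} \overline{\varphi(x)}\, \psi(y)\, M_\lambda(x, y) \sum_{i=1}^{n} \chi_i^{(n)}(x) \chi_i^{(n)}(y)\, dx\, dy \Big| \\
&\le \|\varphi\|_\infty \|\psi\|_\infty \|K_\lambda\|_{HS} \Big( \sum_{i=1}^{n} |Q_i^{(n)}|^2 \Big)^{1/2} \to 0 \quad (n \to \infty),
\end{aligned}$$
uniformly in $\lambda$, because the Hilbert–Schmidt norm $\|K_\lambda\|_{HS}$ is bounded uniformly in $\lambda$ (in fact, it is bounded by $\gamma$). This proves that $\langle \varphi, K_\lambda \psi \rangle$ is analytic for all $\varphi, \psi \in C(B_1)$. Since $C(B_1)$ is dense in $L^2(B_1)$, another approximation argument using $\sup_\lambda \|K_\lambda\| < \infty$ shows that $\langle \varphi, K_\lambda \psi \rangle$ is analytic for all $\varphi, \psi \in L^2(B_1)$. Therefore $\lambda \mapsto K_\lambda$ is analytic [23].

Step 2. For each $k \in \mathbb{R}^3$, $w^\lambda_{0,0}(|k|)(\omega + 1)^{-1}$ is an analytic function of $\lambda$.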
For each $n \in \mathbb{N}$ let $f_{k,n} \in L^2(B_1)$ denote a multiple of the characteristic function of $B_{1/n}(k)$ with $\|f_{k,n}\| = 1$. By the continuity of $w^\lambda_{0,0}(|k|)$ as a function of $k$,
$$\begin{aligned}
w^\lambda_{0,0}(|k|)(\omega + 1)^{-1} &= \lim_{n\to\infty} \int_{\mathbb{R}^3} |f_{k,n}(x)|^2\, w^\lambda_{0,0}(|x|)(|x| + 1)^{-1}\, dx \\
&= \lim_{n\to\infty} \langle a^*(f_{k,n})\Omega,\; w^\lambda_{0,0}(H_f)(H_f + 1)^{-1} a^*(f_{k,n})\Omega \rangle.
\end{aligned} \tag{C.4}$$
Since $a^*(f_{k,n})\Omega \in P_1 \mathcal{F}$, the expression $\langle \cdots \rangle$, before taking the limit, is an analytic function of $\lambda$. By assumption on $w^\lambda_{0,0}$, this function is Lipschitz continuous with respect to $|k|$ uniformly in $\lambda$. Therefore the convergence in (C.4) is uniform in $\lambda$ and hence $w^\lambda_{0,0}(|k|)(\omega + 1)^{-1}$ is analytic by the Weierstrass convergence theorem from complex analysis (a locally uniform limit of analytic functions is analytic).

Step 3. $w^\lambda_{0,0}(H_f)$ is analytic.

By the spectral theorem
$$\langle \varphi, w^\lambda_{0,0}(H_f)(H_f + 1)^{-1} \varphi \rangle = \int_{[0,\infty)} w^\lambda_{0,0}(x)(x + 1)^{-1}\, d\mu_\varphi(x).$$
By an application of Lebesgue's dominated convergence theorem, using $\sup_\lambda |w^\lambda_{0,0}(x)(x + 1)^{-1}| < \infty$, we see that the right-hand side, which we call $\varphi(\lambda)$, is a continuous function of $\lambda$. Therefore
$$\int_\Gamma \varphi(\lambda)\, d\lambda = \int_{[0,1]} \Big( \int_\Gamma w^\lambda_{0,0}(x)(x + 1)^{-1}\, d\lambda \Big)\, d\mu_\varphi(x)$$
for all closed loops $\Gamma : t \mapsto \lambda(t)$ in $S$. The analyticity of $\lambda \mapsto \varphi(\lambda)$ now follows from the analyticity of $w^\lambda_{0,0}(x)(x + 1)^{-1}$ and the theorems of Cauchy and Morera. By polarization, $w^\lambda_{0,0}(H_f)(H_f + 1)^{-1}$ is weakly analytic and hence analytic.

Supplement D. Background on the Fock Space, etc.

Let $\mathfrak{h}$ be either $L^2(\mathbb{R}^3, \mathbb{C}, d^3k)$ or $L^2(\mathbb{R}^3, \mathbb{C}^2, d^3k)$. In the first case we consider $\mathfrak{h}$ as the Hilbert space of one-particle states of a scalar boson or a phonon, and in the second case, of a photon. The variable $k \in \mathbb{R}^3$ is the wave vector or momentum of the particle. (Recall that throughout this paper, the velocity of light, $c$, and Planck's constant, $\hbar$, are set equal to 1.) The bosonic Fock space, $\mathcal{F}$, over $\mathfrak{h}$ is defined by
$$\mathcal{F} := \bigoplus_{n=0}^{\infty} S_n\, \mathfrak{h}^{\otimes n}, \tag{D.1}$$
where $S_n$ is the orthogonal projection onto the subspace of totally symmetric $n$-particle wave functions contained in the $n$-fold tensor product $\mathfrak{h}^{\otimes n}$ of $\mathfrak{h}$, and $S_0 \mathfrak{h}^{\otimes 0} := \mathbb{C}$. The vector $\Omega := 1 \oplus \bigoplus_{n=1}^{\infty} 0$ is called the vacuum vector in $\mathcal{F}$. Vectors $\Psi \in \mathcal{F}$ can be identified with sequences $(\psi_n)_{n=0}^{\infty}$ of $n$-particle wave functions, which are totally symmetric in their $n$ arguments, and $\psi_0 \in \mathbb{C}$. In the first case these functions are of the form $\psi_n(k_1, \dots, k_n)$, while in the second case they are of the form $\psi_n(k_1, \lambda_1, \dots, k_n, \lambda_n)$, where $\lambda_j \in \{-1, 1\}$ are the polarization variables.
In what follows we present some key definitions in the first case only, limiting ourselves to remarks at the end of this appendix on how these definitions have to be modified for the second case. The scalar product of two vectors $\Psi$ and $\Phi$ is given by
$$\langle \Psi, \Phi \rangle := \sum_{n=0}^{\infty} \int \prod_{j=1}^{n} d^3k_j\; \overline{\psi_n(k_1, \dots, k_n)}\, \varphi_n(k_1, \dots, k_n). \tag{D.2}$$
Given a one-particle dispersion relation $\omega(k)$, the energy of a configuration of $n$ non-interacting field particles with wave vectors $k_1, \dots, k_n$ is given by $\sum_{j=1}^{n} \omega(k_j)$. We define the free-field Hamiltonian, $H_f$, giving the field dynamics, by
$$(H_f \Psi)_n(k_1, \dots, k_n) = \Big( \sum_{j=1}^{n} \omega(k_j) \Big) \psi_n(k_1, \dots, k_n), \tag{D.3}$$
for $n \ge 1$ and $(H_f \Psi)_n = 0$ for $n = 0$. Here $\Psi = (\psi_n)_{n=0}^{\infty}$ (to be sure that the right-hand side makes sense we can assume that $\psi_n = 0$, except for finitely many $n$, for which $\psi_n(k_1, \dots, k_n)$ decrease rapidly at infinity). Clearly, the operator $H_f$ has the single eigenvalue 0 with the eigenvector $\Omega$ and the rest of the spectrum is absolutely continuous.

With each function $\varphi \in \mathfrak{h}$ one associates an annihilation operator $a(\varphi)$ defined as follows. For $\Psi = (\psi_n)_{n=0}^{\infty} \in \mathcal{F}$ with the property that $\psi_n = 0$, for all but finitely many $n$, the vector $a(\varphi)\Psi$ is defined by
$$(a(\varphi)\Psi)_n(k_1, \dots, k_n) := \sqrt{n+1} \int d^3k\; \overline{\varphi(k)}\, \psi_{n+1}(k, k_1, \dots, k_n). \tag{D.4}$$
These equations define a closable operator $a(\varphi)$ whose closure is also denoted by $a(\varphi)$. Equation (D.4) implies the relation
$$a(\varphi)\Omega = 0. \tag{D.5}$$
The creation operator $a^*(\varphi)$ is defined to be the adjoint of $a(\varphi)$ with respect to the scalar product defined in Eq. (D.2). Since $a(\varphi)$ is anti-linear, and $a^*(\varphi)$ is linear in $\varphi$, we write formally
$$a(\varphi) = \int d^3k\; \overline{\varphi(k)}\, a(k), \qquad a^*(\varphi) = \int d^3k\; \varphi(k)\, a^*(k), \tag{D.6}$$
where $a(k)$ and $a^*(k)$ are unbounded, operator-valued distributions. The latter are well known to obey the canonical commutation relations (CCR):
$$[a^\#(k), a^\#(k')] = 0, \qquad [a(k), a^*(k')] = \delta^3(k - k'), \tag{D.7}$$
where $a^\# = a$ or $a^*$. Now, using this one can rewrite the quantum Hamiltonian $H_f$ in terms of the creation and annihilation operators, $a$ and $a^*$, as
$$H_f = \int d^3k\; a^*(k)\, \omega(k)\, a(k), \tag{D.8}$$
acting on the Fock space $\mathcal{F}$.
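The relations (D.5), (D.7) and (D.8), together with the pull-through formula (A.13), can be made concrete in a toy setting. The sketch below is an illustration under strong simplifying assumptions, not the construction of the paper: it replaces the continuum of modes by a single mode of frequency `OMEGA`, so that $H_f = \omega\, a^* a$, and truncates the Fock space to the number states $|0\rangle, \dots, |N-1\rangle$. All names and numerical values are hypothetical, and the canonical commutation relation necessarily fails in the last truncated level.

```python
import math

N = 8          # truncation level (assumption): states |0>, ..., |N-1>
OMEGA = 0.7    # frequency of the single mode (hypothetical value)


def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    size = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(size)) for j in range(size)]
            for i in range(size)]


def transpose(A):
    return [list(row) for row in zip(*A)]


def diag(values):
    size = len(values)
    return [[values[i] if i == j else 0.0 for j in range(size)]
            for i in range(size)]


# Annihilation operator: a|n> = sqrt(n)|n-1>, so a[n-1][n] = sqrt(n).
a = [[math.sqrt(j) if j == i + 1 else 0.0 for j in range(N)] for i in range(N)]
a_dag = transpose(a)

# Single-mode analogue of (D.8): H_f = omega * a^* a, diagonal with entries omega*n.
Hf = [[OMEGA * x for x in row] for row in matmul(a_dag, a)]


def F(r):
    """An arbitrary bounded test function of the field energy."""
    return 1.0 / (1.0 + r * r)


# F[H_f] and F[H_f + omega] are diagonal in the number basis.
F_Hf = diag([F(OMEGA * n) for n in range(N)])
F_Hf_shifted = diag([F(OMEGA * n + OMEGA) for n in range(N)])

# Pull-through formula (A.13): a F[H_f] = F[H_f + omega] a.
lhs = matmul(a, F_Hf)
rhs = matmul(F_Hf_shifted, a)
pull_through_error = max(abs(lhs[i][j] - rhs[i][j])
                         for i in range(N) for j in range(N))

# CCR [a, a^*] = 1, checked away from the truncation edge (last level excluded).
aa = matmul(a, a_dag)
a_a = matmul(a_dag, a)
ccr_error = max(abs(aa[i][j] - a_a[i][j] - (1.0 if i == j else 0.0))
                for i in range(N - 1) for j in range(N - 1))

# Vacuum relations: a Omega = 0 and H_f Omega = 0, with Omega = |0>.
vacuum_norm = math.sqrt(sum(a[i][0] ** 2 + Hf[i][0] ** 2 for i in range(N)))
```

In this truncated model the pull-through identity holds exactly (up to rounding), because $a$ only connects $|n\rangle$ to $|n-1\rangle$ and $F[H_f]$ is diagonal; this is the same mechanism as in the second proof of (A.13) sketched in Appendix A.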
More generally, for any operator, $t$, on the one-particle space $\mathfrak{h}$ we define the operator $T$ on the Fock space $\mathcal{F}$ by the following formal expression $T := \int a^*(k)\, t\, a(k)\, dk$, where the operator $t$ acts on the $k$-variable ($T$ is the second quantization of $t$). The precise meaning of the latter expression can be obtained by using a basis $\{\phi_j\}$ in the space $\mathfrak{h}$ to rewrite it as $T := \sum_j a^*(\phi_j)\, a(t^*\phi_j)$.

To modify the above definitions to the case of photons, one replaces the variable $k$ by the pair $(k, \lambda)$ and adds to the integrals in $k$ also the sums over $\lambda$. In particular, the creation and annihilation operators now have two variables: $a^\#_\lambda(k) \equiv a^\#(k, \lambda)$; they satisfy the commutation relations
$$[a^\#_\lambda(k), a^\#_{\lambda'}(k')] = 0, \qquad [a_\lambda(k), a^*_{\lambda'}(k')] = \delta_{\lambda,\lambda'}\, \delta^3(k - k'). \tag{D.9}$$
One can also introduce the operator-valued transverse vector fields by
$$a^\#(k) := \sum_{\lambda \in \{-1,1\}} e_\lambda(k)\, a^\#_\lambda(k),$$
where $e_\lambda(k) \equiv e(k, \lambda)$ are polarization vectors, i.e. orthonormal vectors in $\mathbb{R}^3$ satisfying $k \cdot e_\lambda(k) = 0$. Then, in order to reinterpret the expressions in this paper for the vector (photon) case, one either adds the variable $\lambda$ as was mentioned above or replaces, in appropriate places, the usual product of scalar functions or scalar functions and scalar operators by the dot product of vector-functions or vector-functions and operator-valued vector-functions.

References

[1] W. Abou Salem, J. Faupin, J. Fröhlich and I. M. Sigal, On theory of resonances in non-relativistic QED, to appear in Adv. Appl. Math.
[2] L. Amour, B. Grébert and J.-C. Guillot, The dressed mobile atoms and ions, J. Math. Pures Appl. (9) 86(3) (2006) 177–200.
[3] V. Bach, Th. Chen, J. Fröhlich and I. M. Sigal, Smooth Feshbach map and operator-theoretic renormalization group methods, J. Funct. Anal. 203 (2003) 44–92.
[4] V. Bach, Th. Chen, J. Fröhlich and I. M. Sigal, The renormalized electron mass in non-relativistic quantum electrodynamics, J. Funct. Anal. 243 (2007) 426–535.
[5] V. Bach, J. Fröhlich and A. Pizzo, Infrared-finite algorithms in QED: The groundstate of an atom interacting with the quantized radiation field, Comm. Math. Phys. 264(1) (2006) 145–165.
[6] V. Bach, J. Fröhlich and A. Pizzo, An infrared-finite algorithm for Rayleigh scattering amplitudes, and Bohr's frequency condition, Comm. Math. Phys. 274(2) (2007) 457–486.
[7] V. Bach, J. Fröhlich and A. Pizzo, Infrared-finite algorithms in QED II. The expansion of the groundstate of an atom interacting with the quantized radiation field, Adv. Math. 220(4) (2009) 1023–1074.
[8] V. Bach, J. Fröhlich and I. M. Sigal, Quantum electrodynamics of confined nonrelativistic particles, Adv. Math. 137 (1998) 299–395.
[9] V. Bach, J. Fröhlich and I. M. Sigal, Renormalization group analysis of spectral problems in quantum field theory, Adv. Math. 137 (1998) 205–298.
[10] V. Bach, J. Fröhlich and I. M. Sigal, Spectral analysis for systems of atoms and molecules coupled to the quantized radiation field, Comm. Math. Phys. 207(2) (1999) 249–290.
[11] M. Berger, Nonlinearity and Functional Analysis. Lectures on Nonlinear Problems in Mathematical Analysis, Pure and Applied Mathematics (Academic Press, New York-London, 1977).
[12] T. Chen, Infrared renormalization in non-relativistic QED and scaling criticality, J. Funct. Anal. 254(10) (2008) 2555–2647.
[13] C. Cohen-Tannoudji, J. Dupont-Roc and G. Grynberg, Photons and Atoms: Introduction to Quantum Electrodynamics (John Wiley, New York, 1991).
[14] C. Cohen-Tannoudji, J. Dupont-Roc and G. Grynberg, Atom-Photon Interactions: Basic Processes and Applications (John Wiley, New York, 1992).
[15] J. Faupin, Resonances of the confined hydrogen atom and the Lamb–Dicke effect in non-relativistic QED, Ann. Henri Poincaré 9(4) (2008) 743–773.
[16] H. Feshbach, Unified theory of nuclear reactions, Ann. Phys. 5 (1958) 357–390.
[17] J. Fröhlich, M. Griesemer and I. M. Sigal, Spectral theory for the standard model of non-relativistic QED, Comm. Math. Phys. 283 (2008) 613–646.
[18] J. Fröhlich, M. Griesemer and I. M. Sigal, Local decay in the standard model of non-relativistic quantum electrodynamics, arXiv:0904.1014v1 [math-ph].
[19] M. Griesemer and D. Hasler, On the smooth Feshbach-Schur map, J. Funct. Anal. 254(9) (2008) 2329–2335.
[20] M. Griesemer and D. Hasler, Analytic perturbation theory and renormalization analysis of matter coupled to quantized radiation, to appear in Ann. Henri Poincaré; arXiv:0801.4458.
[21] S. Gustafson and I. M. Sigal, Mathematical Concepts of Quantum Mechanics, 2nd edn. (Springer, 2006).
[22] E. Hille and R. S. Phillips, Functional Analysis and Semi-Groups (Amer. Math. Soc., 1957).
[23] T. Kato, Perturbation Theory for Linear Operators (Springer, 1976).
[24] J. Schur, Über Potenzreihen die im Inneren des Einheitskreises beschränkt sind, J. Reine Angew. Math. 147 (1917) 205–232.
[25] I. M. Sigal, Ground state and resonances in the standard model of the non-relativistic quantum electrodynamics, to appear in J. Statist. Phys.
[26] H. Spohn, Dynamics of Charged Particles and Their Radiation Field (Cambridge University Press, Cambridge, 2004).
May 12, 2009 13:34 WSPC/148-RMP
J070-00369
Reviews in Mathematical Physics Vol. 21, No. 4 (2009) 549–585 c World Scientific Publishing Company
LARGE DEVIATION GENERATING FUNCTION FOR CURRENTS IN THE PAULI–FIERZ MODEL
WOJCIECH DE ROECK∗ Institute for Theoretical Physics, K. U. Leuven, Celestijnenlaan 200D, B3001 Leuven, Belgium and Institute for Theoretical Physics, ETH Zurich, Schafmattstr. 32, 8093 Zurich, Switzerland
[email protected]

Received 5 January 2009
Revised 8 April 2009

We consider a finite quantum system coupled to quasifree thermal reservoirs at different temperatures. We construct the statistics of energy transport between the reservoirs and we show that the corresponding large deviation generating function exists and is analytic on a compact set. This result is valid for small coupling and exponentially decaying reservoir correlation functions. Our technique consists of a diagrammatic expansion that uses the Markovian limit of the system as a reference. As a corollary, we derive the Gallavotti–Cohen fluctuation relation for the entropy production.

Keywords: Gallavotti–Cohen symmetry; nonequilibrium statistical mechanics; spin-boson model.

Mathematics Subject Classification 2000: 82C10, 82C70
∗ Postdoctoral Fellow FWO-Flanders.

1. Introduction

1.1. Fluctuations in open quantum systems

Recently, the physics community has shown considerable interest in current fluctuations in nonequilibrium quantum systems. We mention two interesting points of view:

(1) Starting with [18, 22], it has become clear that nonequilibrium systems, both classical and quantum, exhibit a symmetry in the fluctuations of entropy production. This symmetry, dubbed the “Gallavotti–Cohen Fluctuation Theorem”, holds arbitrarily far from equilibrium. Discussions of the Gallavotti–Cohen symmetry (and of the related “Jarzynski” equality) in quantum systems can be found in [30, 41, 28, 42, 46, 39, 17].
(2) Starting with [33–35], the idea was developed that shot noise between metallic contacts shows distinct signs of Fermi statistics and that it provides a way to determine the charge of the charge carriers. This idea allowed to “observe” fractional charges [9]. We refer to [29] for an elementary derivation of the characteristic function of full counting statistics. From the point of view of mathematical physics, it is instructive to have a setup where the above points can be studied rigorously in microscopic models of quantum systems. Partially, this has been achieved in [3], where the authors considered a free fermion junction in the thermodynamics limit and the characteristic function of charge transport was constructed via a regularization argument. This setup was aimed primarily at point (2) above. The present paper considers a “spinboson” type model with boson reservoirs at different temperatures and constructs the characteristic function of energy transport by taking a thermodynamic limit, starting from the expression for finite reservoirs. In this sense, our approach is more elementary than that in [3]. Once the characteristic function is constructed, we investigate its large time-limit. In particular, we prove the existence of the large deviation generating function of energy transport in a compact set and we verify the Galavotti–Cohen fluctuation theorem. As a corollary, we obtain a central limit theorem for the energy currents between the reservoirs. 1.2. Open quantum systems with finite reservoirs Our model describes a small quantum system (an atom, in what follows called “system” S) interacting with a quantum system with many degrees of freedom (a reservoir). We choose the reservoir as simple as possible: a free field of bosons, although fermions would do just as well.a The system is coupled to the reservoirs through an interaction term, which is linear in the field creation and annihilation operators. 
Models of this type are known as Pauli–Fierz models, or, in the simplest case, the spin-boson model. They arise as toy models in solid state physics, where the bosons are lattice phonons, or through the dipole approximation in QED, where the bosons are photons; see [45] for more background. To make the statements mathematically sharp, we consider this field in the thermodynamic limit, or equivalently, in the limit where the modes form a continuum. However, for the sake of distilling the right physical question addressed in this paper, we start from a finite-volume setup.
(Footnote a: In fact, fermions would simplify the technical work.)
1.2.1. Setup
Fix a finite-dimensional Hilbert space HS with self-adjoint Hamiltonian HS . We imagine there are m heat baths at respective temperatures 1/βk , k = 1, . . . , m. In what follows, the heat baths will be assumed to be large but finite, with the
Large Deviation Generating Function for Currents in the Pauli–Fierz Model
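parameter $n \in \mathbb{N}$ controlling their “size” (the number of degrees of freedom increases as n increases). To each k = 1, . . . , m, we associate
(1) A finite-dimensional Hilbert space $\mathcal{H}^n_{R_k}$ and a positive Hamiltonian $H^n_{R_k}$ on $\mathcal{H}^n_{R_k}$.
(2) A self-adjoint coupling operator $C^n_k \in \mathcal{B}(\mathcal{H}_S \otimes \mathcal{H}^n_{R_k})$.
(3) The Gibbs state $\rho^n_{R_k}$ on $\mathcal{B}(\mathcal{H}^n_{R_k})$ at inverse temperature $\beta_k$, given by
\[ \rho^n_{R_k}[A] = \frac{\mathrm{Tr}[e^{-\beta_k H^n_{R_k}} A]}{\mathrm{Tr}[e^{-\beta_k H^n_{R_k}}]}, \qquad A \in \mathcal{B}(\mathcal{H}^n_{R_k}). \tag{1.1} \]
We define the total interacting Hamiltonian on $\mathcal{H} := \mathcal{H}_S \otimes \bigotimes_k \mathcal{H}^n_{R_k}$ as
\[ H^n = H_S \otimes 1 + \sum_{k=1}^{m} 1 \otimes H^n_{R_k} + \sum_{k=1}^{m} C^n_k. \tag{1.2} \]
We take the initial state to be decoupled, i.e. of the form
\[ \rho_S \otimes \rho^n_R, \qquad \rho^n_R := \bigotimes_{k=1}^{m} \rho^n_{R_k}, \tag{1.3} \]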
corresponding to initially decorrelated thermal reservoir states and an arbitrary state $\rho_S$ on $\mathcal{B}(\mathcal{H}_S)$.
1.2.2. Transport fluctuations
We introduced the finite volume systems in order to pick the right expression for transport fluctuations, and hence now that all tools are in place, we ask what we mean by transport fluctuations in the finite-volume models. Note that the reservoir Hamiltonians $H^n_{R_k}$ mutually commute and that they have discrete spectrum (since we assumed the Hilbert spaces to be finite-dimensional). Hence one can measure them simultaneously in the beginning and at the end of an experiment. We will be concerned with the differences between the outcomes of these measurements. Let $P_y$ be the joint spectral projections of $H^n_R \equiv (H^n_{R_k})_{k=1,\ldots,m}$ corresponding to the eigenvalues $y \equiv (y_k)$; in particular,
\[ e^{i(\gamma, H^n_R)} = \sum_{y} e^{i(\gamma, y)} P_y, \qquad \text{for } \gamma \in \mathbb{C}^m, \tag{1.4} \]
where $(\cdot,\cdot)$ stands for the scalar product in $\mathbb{C}^m$ and, likewise, $(\gamma, H^n_R)$ is shorthand for $\sum_k \gamma_k H^n_{R_k}$. We define the probabilities
\[ \mathbb{P}^n_t(\Delta y) := \sum_{y, y' :\, y' - y = \Delta y} \rho_S \otimes \rho^n_R \big[ P_y\, e^{itH^n} P_{y'}\, e^{-itH^n} P_y \big] \tag{1.5} \]
for observing energy differences $\Delta y \in \mathbb{R}^m$, when measuring the energy twice, before and after the interaction has acted during a time-span t. The “measurement” in formula (1.5) is manifest through the projections $P_y$, $P_{y'}$. The Fourier–Laplace transform $\chi^n_t(\gamma)$ of this measure has a nice expression which is better suited for taking the thermodynamic limit: using that (the density matrix corresponding to) $\rho^n_R$ commutes with the spectral projections $P_y$, one calculates
\[ \chi^n_t(\gamma) := \sum_{\Delta y} \mathbb{P}^n_t(\Delta y)\, e^{-i(\gamma, \Delta y)} = \rho_S \otimes \rho^n_R \big[ e^{i(\gamma, H^n_R)} e^{itH^n} e^{-i(\gamma, H^n_R)} e^{-itH^n} \big], \tag{1.6} \]
where the sum over ∆y is over all differences of y’s, i.e. over all energy level spacings of the Hamiltonians HRnk . In this paper, we study the infinite-volume limit of (1.6) for a specific model. This model is specified by taking the reservoirs to be quasifree boson fields and the coupling Ckn to be linear in creation and annihilation operators. The thermodynamical limit of this model is introduced in Sec. 2 and the finite-volume model is introduced in Sec. 4.
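The equality of the two-measurement sum (1.5) with the operator expression (1.6) can be checked numerically on a small example. The following is only a sketch under stated assumptions: a single reservoir (m = 1), a two-level system, a three-level "reservoir" with nondegenerate spectrum, and a random self-adjoint coupling. All matrices and parameter values are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import numpy as np

def two_measurement_check(gamma=0.7, t=1.3, beta=2.0, seed=0):
    """Compare the two-measurement sum (1.5) with the operator formula (1.6)."""
    rng = np.random.default_rng(seed)
    d_S, d_R = 2, 3
    h_S = np.diag([0.0, 0.8])              # toy system Hamiltonian (assumed)
    levels = np.array([0.0, 1.0, 2.5])     # toy reservoir spectrum (assumed)
    H_R = np.kron(np.eye(d_S), np.diag(levels))
    # Random self-adjoint coupling C on the full space (assumed)
    M = rng.normal(size=(d_S * d_R,) * 2) + 1j * rng.normal(size=(d_S * d_R,) * 2)
    C = 0.3 * (M + M.conj().T)
    H = np.kron(h_S, np.eye(d_R)) + H_R + C

    # Decoupled initial state rho_S (x) Gibbs, as in (1.3)
    rho_S = np.diag([0.25, 0.75])
    gibbs = np.exp(-beta * levels)
    gibbs /= gibbs.sum()
    rho = np.kron(rho_S, np.diag(gibbs))

    # U = exp(itH) from the spectral decomposition of H
    w, v = np.linalg.eigh(H)
    U = (v * np.exp(1j * t * w)) @ v.conj().T
    U_dag = U.conj().T

    # Spectral projections P_y of the reservoir Hamiltonian
    P = [np.kron(np.eye(d_S), np.diag(np.eye(d_R)[j])) for j in range(d_R)]

    chi_sum, total_prob = 0.0, 0.0
    for y, P_y in zip(levels, P):
        for y2, P_y2 in zip(levels, P):
            # Probability of measuring y first and y2 after time t
            p = np.trace(rho @ P_y @ U @ P_y2 @ U_dag @ P_y).real
            total_prob += p
            chi_sum += p * np.exp(-1j * gamma * (y2 - y))

    # Operator expression: rho[G(gamma) U G(-gamma) U^*]
    G = np.diag(np.exp(1j * gamma * np.diag(H_R)))
    chi_op = np.trace(rho @ G @ U @ G.conj().T @ U_dag)
    return chi_sum, chi_op, total_prob

chi_sum, chi_op, total_prob = two_measurement_check()
print(abs(chi_sum - chi_op), total_prob)
```

The agreement relies on the initial state commuting with the projections $P_y$, which is the property used in the text; with an initial state that does not commute with the reservoir Hamiltonian the two expressions would in general differ.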
1.2.3. Other approaches
The approach to “current fluctuations” of Sec. 1.2.2 has been used since [30, 41, 28] for fluctuations of heat and work and, most widespread, since [33,35] for fluctuations of charge transport (“full counting statistics”), made mathematically transparent in [29, 3]. However, it is not entirely clear that this is what one measures “in a realistic experiment”. One can imagine different approaches and we outline the most obvious of those now. The alternative approach starts from the idea that it is the operator
\[ e^{itH^n_\lambda} H^n_{R_k} e^{-itH^n_\lambda} - H^n_{R_k} \tag{1.7} \]
which determines the transported energy. Hence, the characteristic function is, in this approach, defined by
\[ \tilde{\chi}^n_t(\gamma) := \exp\Big\{ -i \sum_{k=1}^{m} \gamma_k \big( e^{itH^n_\lambda} H^n_{R_k} e^{-itH^n_\lambda} - H^n_{R_k} \big) \Big\}. \tag{1.8} \]
The drawback of this formula is that it has no obvious operational interpretation, i.e. it is intuitively not clear how to devise a natural experimental setup for measuring (1.8). It is important to remark that both $\chi^n_t(\gamma)$ and $\tilde{\chi}^n_t(\gamma)$ determine the same first and second moments, at least if the initial state is chosen as in Sec. 1.2.1. Indeed,
by straightforward calculation, one checks that, at $\gamma = 0$ (where the derivatives give the moments),
\[ \frac{\partial}{\partial \gamma} \chi^n_t(\gamma) = \frac{\partial}{\partial \gamma} \tilde{\chi}^n_t(\gamma), \tag{1.9} \]
\[ \frac{\partial^2}{\partial \gamma^2} \chi^n_t(\gamma) = \frac{\partial^2}{\partial \gamma^2} \tilde{\chi}^n_t(\gamma), \tag{1.10} \]
where we used the fact that $H^n_{R_i}$ commutes with the initial density matrix. Note that the difference between the moments of order three and higher determined by $\chi^n_t(\gamma)$ and $\tilde{\chi}^n_t(\gamma)$ is due to the fact that, for operators, a product of exponentials is not equal to the exponential of the sum. For a more extended discussion of the differences between $\chi^n_t(\gamma)$ and $\tilde{\chi}^n_t(\gamma)$, we refer to [16, 5]. In [5], one also finds the description of yet another definition of current fluctuations.
1.3. Gallavotti–Cohen symmetry
In the previous section, we focused our attention on the characteristic function $\chi_t(\gamma)$, $\gamma \in \mathbb{R}^m$. We will now discuss how one can obtain the so-called Gallavotti–Cohen fluctuation theorem for the entropy production from $\chi_t(\gamma)$, see Sec. 1.1 for references.
1.3.1. Large deviation-generating function
We start by remarking that, if $\chi_t(\gamma)$ is the characteristic function of an $\mathbb{R}^m$-valued random variable $X_t$, i.e.
\[ \chi_t(\gamma) = \mathbb{E}(e^{-i(\gamma, X_t)}), \qquad \gamma \in \mathbb{R}^m, \quad t > 0, \tag{1.11} \]
then the limit (provided it exists),
\[ F(\kappa) := \lim_{t \uparrow \infty} \frac{1}{t} \log \chi_t(i\kappa), \qquad \kappa \in \mathbb{R}^m, \tag{1.12} \]
is the large-deviation generating function. Whenever $F(\kappa)$ is differentiable on $\mathbb{R}^m$, one deduces that the family of random variables $X_t$, $t \in \mathbb{R}^+$, satisfies a large deviation principle with rate function $I(x)$ and speed t, given by
\[ I(x) := \sup_{\kappa \in \mathbb{R}^m} \big( (\kappa, x) - F(\kappa) \big), \qquad x \in \mathbb{R}^m. \tag{1.13} \]
Loosely speaking, this means that
\[ \mathrm{Prob}(X_t \approx xt) \approx e^{-tI(x)}, \tag{1.14} \]
where I(x) ≥ 0 and, in the typical case, there is a single x∗ for which I(x∗ ) = 0. For a thorough and rigorous discussion of large deviation principles, see [10]. If F (κ) is analytic in a neighborhood of 0 ∈ Cm , then one concludes [7] that the random variable Xt satisfies a central limit theorem (CLT), with mean and variance given by, respectively, the first and second derivative of F (κ) in κ = 0. This is exploited in Corollary 3.4.
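As an illustration of the Legendre transform (1.13), consider a toy case that is not part of the paper: if $X_t$ were a Poisson counting variable with rate r (an assumption made only for this sketch), then $\chi_t(i\kappa) = \mathbb{E}(e^{\kappa X_t}) = \exp(rt(e^{\kappa} - 1))$, so that $F(\kappa) = r(e^{\kappa} - 1)$ and the rate function is known in closed form, $I(x) = x \log(x/r) - x + r$. The snippet below approximates the supremum in (1.13) on a grid of $\kappa$ values and compares it with this formula.

```python
import numpy as np

def rate_function(F, x, kappas):
    """Approximate I(x) = sup_kappa (kappa * x - F(kappa)) on a grid (m = 1)."""
    return np.max(kappas * x - F(kappas))

r = 2.0                                     # hypothetical Poisson rate
F = lambda kappa: r * (np.exp(kappa) - 1)   # toy generating function
kappas = np.linspace(-5.0, 5.0, 200001)     # grid containing the maximizers

xs = [0.5, 2.0, 4.0]
numeric = [rate_function(F, x, kappas) for x in xs]
exact = [x * np.log(x / r) - x + r for x in xs]
for x, a, b in zip(xs, numeric, exact):
    print(x, a, b)
```

In this toy case the rate function vanishes only at x = r, the mean rate, in line with the remark that there is typically a single $x^*$ with $I(x^*) = 0$.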
1.3.2. Symmetries of the characteristic function for finite systems
We begin by stating a transient version of the GC fluctuation theorem, which will be helpful in the derivation. Assume there is an antiunitary operator $\Theta$ on $\mathcal{H}$, satisfying $\Theta^{-1}\Theta = \Theta\Theta^{-1} = 1$ (i.e. $\Theta$ is an involution) and
\[ \Theta H^n \Theta^{-1} = H^n, \qquad \Theta H^n_{R_k} \Theta^{-1} = H^n_{R_k}, \qquad \text{for } k = 1, \ldots, m. \tag{1.15} \]
In what follows, we abbreviate
\[ G^n(\gamma) := \exp\Big\{ i \sum_{k} \gamma_k H^n_{R_k} \Big\}, \qquad U^n_t := e^{itH^n} \tag{1.16} \]
and we write $\beta \equiv (\beta_1, \ldots, \beta_m) \in \mathbb{R}^m$. If we choose the initial state of the small system to be the trace state
\[ \rho_S(S) := (\dim \mathcal{H}_S)^{-1}\, \mathrm{Tr}[S], \qquad S \in \mathcal{B}(\mathcal{H}_S), \tag{1.17} \]
then the characteristic function $\chi^n_t(\gamma)$ can be manipulated as follows (to keep the expressions transparent, we drop the dependence on n in $G^n(\gamma)$ and $U^n_t$):
\begin{align}
\chi^n_t(\gamma)\, \mathrm{Tr}[G(i\beta)] &= \mathrm{Tr}[G(i\beta) G(\gamma) U_t G(-\gamma) U_{-t}] \tag{1.18} \\
&= \mathrm{Tr}[G(i\beta) G(-\gamma) U_{-t} G(\gamma) U_t] \tag{1.19} \\
&= \mathrm{Tr}[G(i\beta) G(\gamma - i\beta) U_t G(-(\gamma - i\beta)) U_{-t}] \tag{1.20} \\
&= \chi^n_t(\gamma - i\beta)\, \mathrm{Tr}[G(i\beta)]. \tag{1.21}
\end{align}
To obtain the second equality we inserted $\Theta\Theta^{-1} = 1$ and we used (1.15). The third equality follows from the group property $G(\gamma)G(\gamma') = G(\gamma + \gamma')$ and the cyclicity of the trace. Hence, we obtain the exact identity
\[ \chi^n_t(\gamma) = \chi^n_t(\gamma - i\beta). \tag{1.22} \]
This relation is sometimes called the “transient fluctuation theorem”. It is a straightforward consequence of the KMS-condition. Remark that it depends on the initial state of the small system through our choice (1.17). The idea is, however, that for an arbitrary initial state $\rho_S$, the correction to (1.22) is of order $e^{o(t^0)}$, as $t \uparrow \infty$, and the symmetry can be restored by taking the log and dividing by t, as $t \uparrow \infty$. To get a non-trivial long-time limit, one must perform the thermodynamic limit $n \uparrow \infty$ first. Summarizing: if the limit
\[ F(\kappa) := \lim_{t \uparrow \infty} \frac{1}{t} \log \lim_{n \uparrow \infty} \chi^n_t(i\kappa), \qquad \kappa \in \mathbb{R}^m \tag{1.23} \]
exists, then one obtains from (1.22) that
\[ F(\kappa) = F(-\kappa - \beta). \tag{1.24} \]
Moreover, F (κ) can often be proven to be independent of the initial state ρS , in contrast to the characteristic function χnt (γ). To make the connection with thermodynamic entropy production, we recall that for a macroscopic reservoir at
temperature $1/\beta$, the change of entropy associated to a change of energy $\Delta E$ is given by $\beta \Delta E$. Hence, if $X_t = ((X_t)_k)_{k=1,\ldots,m}$ is the (vector-valued) random variable that represents the changes of energy in the different reservoirs, then $S_t := \sum_k \beta_k (X_t)_k$ represents the entropy production. Denote by $\mathbb{E}^n$ the expectation such that $\mathbb{E}^n(e^{-i(\gamma, X_t)}) = \chi^n_t(\gamma)$; then it follows that
\[ \mathbb{E}^n(e^{-i\nu S_t}) = \chi^n_t(\beta_1 \nu, \ldots, \beta_m \nu) \tag{1.25} \]
and hence, if the initial state is chosen as above, then (1.22) translates into
\[ \mathbb{E}^n(e^{\nu S_t}) = \mathbb{E}^n(e^{(-1-\nu) S_t}), \qquad \text{for } \nu \in \mathbb{R}. \tag{1.26} \]
Assuming that $\lim_{t \uparrow \infty} \lim_{n \uparrow \infty} t^{-1} \log \mathbb{E}^n(e^{\kappa S_t})$ exists, this leads, via reasoning as in Sec. 1.3.1, in particular (1.13) and (1.14), to the large deviation symmetry
\[ t^{-1} \log \frac{\mathrm{Prob}(S_t \approx ta)}{\mathrm{Prob}(S_t \approx -ta)} \xrightarrow[t \uparrow \infty]{} a, \qquad \text{for } a \in \mathbb{R}. \tag{1.27} \]
The relation (1.27) is often described as a refinement of the second law. It states that the probability to witness a positive entropy production is exponentially larger than the probability to witness a negative entropy production. A basic consequence is that the mean entropy production is positive. Mathematically, this follows from (1.27) by Jensen’s inequality. For a review of the different fluctuation relations and more explanation about their meaning (in particular, the link with entropy production), we refer to [36]. In this paper, the existence of F (κ) will be proven for κ in some neighborhood of 0 ∈ Cm . We also obtain the independence of F (κ) from the initial state ρS . The symmetry (1.24) then follows trivially from the reasoning above.
1.4. The non-commutative theory of large deviations
In classical statistical mechanics, the existence of the large deviation generating function can usually be established through a convexity argument, see e.g. [44]. A similar general understanding is lacking in quantum statistical mechanics (see however [37,32,24,38] for partial results). Another, even conceptual, problem in quantum statistical mechanics is how to describe joint large deviations of several noncommuting variables. Remark that it was exactly to solve such a conceptual problem that the framework of the fluctuation algebra was constructed [23] to describe quantum central limit theorems. We consider a setup where the random variable Xt , see Sec. 1.3, corresponds to the total heat transport into reservoirs. Hence the setup is somewhat different from that in [37,32,24,38], since, in contrast to those works, the expectation E(e−i(γ,Xt ) ) is not naturally given in the form ρ(e−iγA ) where ρ is a state and A a self-adjoint operator. Rather, the definition of the expectation E and random variable Xt relies explicitly on two measurements.
The problem of joint distributions for noncommuting observables does not even appear in this context since the different reservoir Hamiltonians do mutually commute. This is discussed more extensively in [16].
1.5. Outline We introduce the model in abstract terms in Sec. 2, immediately followed by the result in Sec. 3. The physical justification of this model is given in Sec. 4.1, where it is explained how it emerges from the quantities discussed in Secs. 1.2.1 and 1.2.2. In Sec. 4.2, we discuss related results in the literature. The final Sec. 5 contains the proofs. 2. The Model As outlined in the introduction, we study a small system in contact with heat reservoirs. In the following sections, we introduce these concepts in the thermodynamic limit. The connection with the finite-volume setup will be visible throughout, but it will be made explicit in Sec. 4. 2.1. The small system The small system is described by a finite-dimensional Hilbert space HS . Its dynamics is generated by a self-adjoint Hamiltonian HS on HS . To describe the coupling of the system to the different reservoirs k = 1, . . . , m, we introduce self-adjoint operators Vk ∈ B(HS ). Obviously, to see the effect of the heat baths, we need that the operators Vk do not commute with HS , at least not for all k. This will be effectively ensured by Assumption 2. 2.2. The reservoirs at zero-temperature The reservoirs, interacting with the small system, are assumed to consist of free bosons. For each k = 1, . . . , m, we define a one-particle space hk with a positive, selfadjoint operator ωk which generates the dynamics of a single reservoir boson. For concreteness, we fix ωk to have absolutely continuous spectrum. The most obvious example is to choose hk := L2 (Rd , dq) with q ∈ Rd the momentum of a boson. Then ωk is simply the operator that acts on L2 (Rd , dq) by multiplication with the dispersion c|q| (where c is the “speed of light”). Let Γs (hk ) be the symmetric Fock space built on the one-particle space hk , see e.g. [11]. The “full” reservoir space is then given by the tensor products of these Fock spaces HR := ⊗k Γs (hk ) = Γs (h),
h := ⊕k hk .
(2.1)
The free reservoir Hamiltonian for the kth reservoir, HRk is defined to be the second quantization of ωk , i.e. HRk = dΓ(ωk ).
(2.2)
We also write HRk for the operator that equals dΓ(ωk ) on the kth factor of the tensor product and unity on the other factors.
On $\Gamma_s(h_k)$, we define creation/annihilation operators $a^*(\varphi)/a(\varphi)$ for $\varphi \in h$. They satisfy the commutation relations
\[ [a(\varphi), a^*(\varphi')] = \langle \varphi, \varphi' \rangle_h, \qquad [a^\#(\varphi), a^\#(\varphi')] = 0, \tag{2.3} \]
where $a^\#$ stands for $a$ or $a^*$ and $\langle \cdot, \cdot \rangle_h$ is the scalar product on h. To describe the coupling with the system, we choose a “form factor”, i.e. a function $\phi_k \in h_k$ for each k. We also write $\phi_k$ for the vector in $h = \oplus_k h_k$ which equals $\phi_k$ on $h_k$ and 0 on $h_{k'}$, $k' \neq k$. The interaction between the small system and the kth reservoir is given by
\[ H_{SR_k} := V_k \otimes \Psi(\phi_k), \qquad \text{with } \Psi(\varphi) = a(\varphi) + a^*(\varphi) \text{ for } \varphi \in h. \tag{2.4} \]
The total Hamiltonian that generates the (zero-temperature) dynamics of system and reservoirs is formally given as
\[ H_\lambda = H_S + \lambda \sum_{k=1}^{m} H_{SR_k} + \sum_{k=1}^{m} H_{R_k} \tag{2.5} \]
for some coupling strength $\lambda \in \mathbb{R}$, which will be chosen small. The rigorous definition of the Hamiltonian (2.5) is standard in the literature; the following proposition appears, e.g., in [12] in a related context.
Proposition 2.1. Assume that $\omega_k \geq 0$ and
\[ \langle \phi_k, (\omega_k)^{-1} \phi_k \rangle_{h_k} < \infty, \qquad k = 1, \ldots, m. \tag{2.6} \]
Then $H_{SR_k}$ is relatively bounded with respect to $H_{R_k}$ and hence $H_\lambda$ is self-adjoint on the domain of $H_{\lambda=0}$.
The condition (2.6) is implied by our upcoming assumptions, in particular Assumption 1. However, we do not need Proposition 2.1 since our objects of interest will be defined by a convergent perturbation series.
2.3. The reservoirs at positive temperature
We put the tools in place to describe the positive temperature state of the reservoirs. Let $\mathcal{C}$ be the ∗-algebra consisting of polynomials in $a(\varphi)$, $a^*(\varphi')$ with $\varphi, \varphi' \in h$. We introduce the positive operators $T_k$ on $h_k$ and $T$ on $h = \oplus_k h_k$ by
\[ T_k := (e^{\beta_k \omega_k} - 1)^{-1}, \qquad T = \oplus_k T_k, \tag{2.7} \]
where $\beta_k$ should be thought of as the inverse temperature of reservoir k. We let $\rho_R$ be a quasi-free state defined on $\mathcal{C}$. It is fully specified (see footnote b) by
(1) Gauge-invariance
\[ \rho_R[a^*(\varphi)] = \rho_R[a(\varphi)] = 0. \tag{2.8} \]
(Footnote b: The reason why, in models like ours, it is enough to know the state on $\mathcal{C}$, has been explained in many places, e.g. [2, 6, 14, 20].)
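(2) Two-point correlation functions
\[ \begin{pmatrix} \rho_R[a^*(\varphi) a(\varphi')] & \rho_R[a^*(\varphi) a^*(\varphi')] \\ \rho_R[a(\varphi) a(\varphi')] & \rho_R[a(\varphi) a^*(\varphi')] \end{pmatrix} = \begin{pmatrix} \langle \varphi', T\varphi \rangle_h & 0 \\ 0 & \langle \varphi, (1+T)\varphi' \rangle_h \end{pmatrix}. \tag{2.9} \]
(3) Quasi-freeness, i.e. the higher-point correlation functions are expressed in terms of the two-point function by
\[ \rho_R[a^\#(\varphi_1) \cdots a^\#(\varphi_{2n})] = \sum_{\text{pairings } \pi} \; \prod_{(r,s) \in \pi} \rho_R[a^\#(\varphi_r) a^\#(\varphi_s)], \tag{2.10} \]
\[ \rho_R[a^\#(\varphi_1) \cdots a^\#(\varphi_{2n+1})] = 0, \tag{2.11} \]
where a pairing $\pi$ is a partition of $\{1, \ldots, 2n\}$ into n (unordered) pairs and the product is over these pairs (r, s) (we use the convention that r < s).
A quantity that will play an important role in our analysis is the reservoir correlation function, defined as, for k = 1, . . . , m,
\[ \psi_k(t) := \rho_R[\Psi(e^{it\omega_k}\phi_k)\Psi(\phi_k)] = \langle \phi_k, T_k e^{it\omega_k} \phi_k \rangle + \langle \phi_k, (1 + T_k) e^{-it\omega_k} \phi_k \rangle. \tag{2.12} \]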
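The sum over pairings in (2.10) can be made concrete with a short enumeration. The sketch below is illustrative only: `pairings` and `wick_expectation` are hypothetical helper names, and the two-point function is passed in as an arbitrary callable rather than computed from the state $\rho_R$. Each pair is evaluated in the order r < s, matching the convention stated above, which matters because $a$ and $a^*$ do not commute.

```python
def pairings(indices):
    """Yield all partitions of `indices` (even length) into pairs (r, s), r < s."""
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def wick_expectation(two_point, labels):
    """Sum over pairings of products of two-point values, cf. (2.10)-(2.11)."""
    if len(labels) % 2 == 1:
        return 0.0                       # odd correlation functions vanish
    total = 0.0
    for pairing in pairings(list(range(len(labels)))):
        term = 1.0
        for r, s in pairing:
            term *= two_point(labels[r], labels[s])
        total += term
    return total

# Toy symmetric two-point function on four labels (hypothetical numbers)
cov = [[2.0, 0.5, 0.1, 0.0],
       [0.5, 1.0, 0.3, 0.2],
       [0.1, 0.3, 1.5, 0.4],
       [0.0, 0.2, 0.4, 1.0]]
four_point = wick_expectation(lambda a, b: cov[a][b], [0, 1, 2, 3])
n_pairings_6 = len(list(pairings(list(range(6)))))
print(four_point, n_pairings_6)
```

For 2n points there are (2n − 1)!! pairings, so the number of terms grows quickly with n; this is the combinatorial structure behind a diagrammatic expansion.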
The following assumption requires the reservoir to have exponential decay of correlations.
Assumption 1. There is an open set $D_A \subset \mathbb{C}^m$ containing 0 and such that, for all $\gamma \in D_A$,
\[ \sup_{t \in \mathbb{R}} \{ |\psi_k(t + \gamma_k)| \exp(g_R |t|) \} \leq c < \infty, \qquad k = 1, \ldots, m \tag{2.13} \]
for some positive constant c and decay rate $g_R > 0$.
Via the relation (2.12), Assumption 1 implies a condition on the form factors $\phi_k$. Let, for concreteness, $h_k = L^2(\mathbb{R}^d, dq)$ and $\omega_k(q) \equiv |q|$, then $\hat{\psi}_k$, the Fourier transform of $\psi_k$, is given by
\[ \hat{\psi}_k(\xi) := \int_{\mathbb{R}^d} dq\, |\phi_k(q)|^2 \begin{cases} \dfrac{1}{e^{\beta_k \omega_k(q)} - 1}\, \delta(\omega_k(q) - \xi), & \xi > 0, \\[2mm] \dfrac{1}{1 - e^{-\beta_k \omega_k(q)}}\, \delta(\omega_k(q) + \xi), & \xi \leq 0, \end{cases} \tag{2.14} \]
and Assumption 1 demands that $\mathbb{R} \ni \xi \mapsto \hat{\psi}_k(\xi) e^{i\gamma_k \xi}$ is analytic in a strip of width $g_R$ such that
\[ \sup_{-\delta < \eta < \delta} \int_{i\eta + \mathbb{R}} d\xi\, |\hat{\psi}_k(\xi) e^{i\gamma_k \xi}| < \infty, \qquad \text{for any } \delta < g_R. \tag{2.15} \]
Note in particular that Assumption 1 can only be satisfied if all temperatures $1/\beta_k$ are strictly positive. Assumption 1 will be used heavily to set up a diagrammatic expansion in Sec. 5.
2.4. The reduced dynamics
Up to now, we have defined the zero-temperature Hamiltonian of the coupled system and the thermal states of the reservoirs. The initial state for the coupled system is chosen of the form $\rho_S \otimes \rho_R$ where $\rho_S$ is any state on $\mathcal{B}(\mathcal{H}_S)$ and $\rho_R$ has been defined in Sec. 2.3. Recalling the discussion in Sec. 1.2.2, we are primarily interested in the characteristic function of energy transport, which we could guess to be given by
\[ \rho_S \otimes \rho_R [G(\gamma) e^{itH_\lambda} G(-\gamma) e^{-itH_\lambda}], \tag{2.16} \]
where $G(\gamma) = \exp\{i \sum_k \gamma_k H_{R_k}\}$, in close analogy to the expression (1.6). However, as it stands, there are two problems with this expression. First, one could doubt whether (2.16), even if it were well-defined, is the correct thermodynamic limit of the characteristic function. A proposition which states that this is indeed the case is contained in Sec. 4. The second problem is that (2.16) is a priori ill-defined. Indeed, the state $\rho_R$ has been defined on $\mathcal{C}$ and thus $\rho_S \otimes \rho_R$ is defined on $\mathcal{B}(\mathcal{H}_S) \otimes \mathcal{C}$. Hence, in order to make sense of (2.16), one would need that $G(\gamma) e^{itH_\lambda} G(-\gamma) e^{-itH_\lambda} \in \mathcal{B}(\mathcal{H}_S) \otimes \mathcal{C}$, which is definitely not true. In the mathematical literature on open quantum systems, these matters have been analyzed in great detail and standard solutions are available. We refer to [2, 11, 6]. In the present paper, we prefer to define (2.16) by a series expansion. While mathematically less elegant, this has the advantage that it allows us, at all stages of the analysis, to check that our expressions can be obtained as thermodynamic limits of finite volume quantities, for which definitions like (2.16) do make sense.
The aim of this section is hence to define (2.16) rigorously. It turns out that it is useful to introduce a more general object, namely the “γ-deformed reduced dynamics” $Z^t_{\lambda,\gamma}$, formally defined as an operator on $\mathcal{B}(\mathcal{H}_S)$ by
\[ \rho_S[Z^t_{\lambda,\gamma}(S)] := \rho_S \otimes \rho_R [G(\gamma/2) e^{itH_\lambda} G(-\gamma/2) (S \otimes 1) G(-\gamma/2) e^{-itH_\lambda} G(\gamma/2)] \tag{2.17} \]
for all $S \in \mathcal{B}(\mathcal{H}_S)$ and states $\rho_S$ on $\mathcal{B}(\mathcal{H}_S)$. Note that for S = 1 this expression reduces to (2.16). Moreover, for $\gamma = 0$, the operator $Z^t_{\lambda,\gamma}$ corresponds to the Heisenberg dynamics of observables of the small system, after “tracing out” the reservoir, hence the name “reduced dynamics”.
The next lemma defines the “γ-deformed reduced dynamics” $Z^t_{\lambda,\gamma}$ as a bounded operator on $\mathcal{B}(\mathcal{H}_S)$.
Lemma 2.2. Assume that Assumption 1 holds with a given $D_A$ and let $\gamma \in \mathbb{R}^m \cup D_A$. We define a linear map $J^t_\gamma$ on $\mathcal{B}(\mathcal{H}_S) \otimes \mathcal{C}$, given by
\[ J^t_\gamma(A) := i \sum_{k=1}^{m} \Big[ \big( V_k(t) \otimes \Psi(e^{-i(t - \frac{\gamma_k}{2})\omega_k} \phi_k) \big) A - A \big( V_k(t) \otimes \Psi(e^{-i(t + \frac{\gamma_k}{2})\omega_k} \phi_k) \big) \Big] \tag{2.18} \]
with $V_k(t) := e^{-itH_S} V_k e^{itH_S}$ and $A \in \mathcal{B}(\mathcal{H}_S) \otimes \mathcal{C}$. The series
\[ \rho_S(Z^t_{\lambda,\gamma}(S)) = \sum_{n \geq 0} \lambda^{2n} \int_{0 \leq t_1 \leq \cdots \leq t_n \leq t} dt_1 \cdots dt_n\, \rho_S \otimes \rho_R [J^{t_n}_\gamma J^{t_{n-1}}_\gamma \cdots J^{t_1}_\gamma (S)] \tag{2.19} \]
is well defined for any $\lambda \in \mathbb{R}$, $\gamma \in \mathbb{R}^m \cup D_A$, $S \in \mathcal{B}(\mathcal{H}_S)$ and state $\rho_S$ on $\mathcal{B}(\mathcal{H}_S)$, i.e. sum and integrals on the right-hand side converge absolutely. The thus-defined operator $Z^t_{\lambda,\gamma}$ on $\mathcal{B}(\mathcal{H}_S)$ has the expected properties
\[ Z^t_{\lambda,0}(1) = 1, \tag{2.20} \]
\[ \| Z^t_{\lambda,\gamma}(S) \| \leq \| S \| \qquad \text{for } \mathrm{Im}\, \gamma = 0. \tag{2.21} \]
Moreover, for any $\lambda \in \mathbb{R}$ and $t \geq 0$, the function
\[ \mathbb{R}^m \ni \gamma \mapsto \rho_S(Z^t_{\lambda,\gamma}(1)) \tag{2.22} \]
is positive definite.
Heuristically, (2.20) follows by the fact that, formally, $(S \otimes 1) \mapsto e^{itH_\lambda} (S \otimes 1) e^{-itH_\lambda}$ is an automorphism. Likewise, (2.21) and (2.22) follow from the fact that $G(\gamma/2) e^{itH_\lambda} G(-\gamma/2)$ and $G(-\gamma/2) e^{-itH_\lambda} G(\gamma/2)$ are unitary operators. By the positive-definiteness of (2.22), the normalization (2.20) and Bochner’s Theorem, the function $\gamma \mapsto \rho_S[Z^t_{\lambda,\gamma}(1)]$ is the characteristic function of a (t-dependent) random variable. This indirectly defined random variable will be called $X_t$ in Sec. 3. One should imagine that it describes the statistics of energy transport between the reservoirs, as described in Sec. 1.2. To make this connection convincing, we will invoke a thermodynamic limit in Sec. 4. First, we proceed with the statement of our results.
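2.5. Deformed Lindblad operators
In this section, we introduce some tools to study the weak coupling limit of our model. This is the scaling limit in which $\lambda \to 0$ while the time is rescaled as $t \to \lambda^{-2} t$. In this limit, the dynamics of the system can be described by a Markovian semigroup, which is introduced below. The elements of the following set are called “Bohr frequencies”:
\[ \mathrm{sp}([H_S, \cdot]) = \{ \varepsilon \in \mathbb{R} \mid \exists\, e, e' \in \mathrm{sp}\, H_S : \varepsilon = e - e' \}, \tag{2.23} \]
where $[H_S, \cdot]$ is the operator on $\mathcal{B}(\mathcal{H}_S)$ acting as $S \mapsto [H_S, S]$. Define the transition rates $c_k(\varepsilon)$ and energy shifts $\Delta_k(\varepsilon)$ by
\[ c_k(\varepsilon) := \int_{\mathbb{R}} dt\, \psi_k(t) e^{it\varepsilon} = 2\pi \hat{\psi}_k(\varepsilon), \tag{2.24} \]
\[ \Delta_k(\varepsilon) := \mathrm{Im} \int_{\mathbb{R}^+} dt\, \psi_k(t) e^{i\varepsilon t}, \qquad \varepsilon \in \mathrm{sp}[H_S, \cdot]. \tag{2.25} \]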
Note that $c_k(\varepsilon) \geq 0$ by Bochner’s theorem, using that $\psi_k(t)$ was defined in (2.12) as a time-correlation function. Finally, we define the transition operators for reservoir k and Bohr frequency $\varepsilon$,
\[ V_{k,\varepsilon} := \sum_{\substack{e, e' \in \mathrm{sp}\, H_S \\ e - e' = \varepsilon}} 1_e(H_S)\, V_k\, 1_{e'}(H_S), \qquad \varepsilon \in \mathrm{sp}[H_S, \cdot] \tag{2.26} \]
where $1_e(H_S)$ stands for the spectral projection of $H_S$ on $e \in \mathrm{sp}\, H_S$. The next assumption, often called “Fermi’s Golden Rule” condition, ensures that the small system is sufficiently well-coupled to the reservoir.
Assumption 2 (Fermi’s Golden Rule). The set of matrices
\[ B_V := \{ c_k(\varepsilon) V_{k,\varepsilon} \mid \varepsilon \in \mathrm{sp}[H_S, \cdot],\; k = 1, \ldots, m \} \tag{2.27} \]
generates the whole algebra $\mathcal{B}(\mathcal{H}_S)$. This means that any operator S that commutes with all elements of $B_V$ is a multiple of the identity, S = c1 for some $c \in \mathbb{C}$.
Next, we define the generator of the Markovian semigroup. In fact, we define a family of operators on $\mathcal{B}(\mathcal{H}_S)$, parametrized by $\gamma \in \mathbb{C}^m$, such that the generator of the semigroup corresponds to $\gamma = 0$. We call those operators “deformed Lindbladians”. Let
\[ L_\gamma(S) := i[\Upsilon, S] + \sum_{k,\varepsilon} c_k(\varepsilon) \Big( e^{i\gamma_k \varepsilon} V_{k,\varepsilon} S V^*_{k,\varepsilon} - \frac{1}{2} \big( V_{k,\varepsilon} V^*_{k,\varepsilon} S + S V_{k,\varepsilon} V^*_{k,\varepsilon} \big) \Big), \qquad S \in \mathcal{B}(\mathcal{H}_S), \tag{2.28} \]
where $\Upsilon$ is an effective self-adjoint Hamiltonian, given by
\[ \Upsilon := \sum_{k,\varepsilon} \Delta_k(\varepsilon) V_{k,\varepsilon} V^*_{k,\varepsilon}, \qquad \text{note that } [\Upsilon, H_S] = 0. \tag{2.29} \]
One easily checks that, for $\mathrm{Re}\, \gamma = 0$, the family $e^{tL_\gamma}$ is a semigroup of completely positive maps. Moreover, $L_{\gamma=0}$ is unity preserving. Hence
\[ e^{tL_0}(1) = 1, \tag{2.30} \]
\[ e^{tL_\gamma}(S) \geq 0, \qquad \text{for } S \geq 0 \text{ and } \mathrm{Re}\, \gamma = 0. \tag{2.31} \]
In fact, $L_0$ is a Lindblad generator, see e.g. [1]. Physically, this means that the semigroup $e^{tL_0}$ has all properties of a “real” Heisenberg dynamics. For an extended discussion of the operators (2.28), we refer to [31, 16]. Here, we restrict ourselves to noticing that the map
\[ S \mapsto V_{k,\varepsilon} S V^*_{k,\varepsilon} \tag{2.32} \]
induces an energy transition $-\varepsilon$ in the system (the energy is transferred to the kth reservoir). By multiplying this term with $e^{i\gamma_k \varepsilon}$, we are able to keep track of the
energy currents into the reservoir. Indeed, the operator $L_\gamma$ turns out to describe the characteristic function of energy transport, cf. the setup in Sec. 1.2. The relation
\[ c_k(\varepsilon) = e^{\beta_k \varepsilon} c_k(-\varepsilon) \tag{2.33} \]
expresses that the transition rates associated to the kth bath satisfy a detailed balance condition at inverse temperature $\beta_k$. The relation (2.33) is easily checked from (2.24) and (2.12); ultimately it is a consequence of the fact that the reservoir states are thermal. The following proposition is essentially a consequence of a theorem by Frigerio [19], Assumption 2 and a non-commutative version of the Perron–Frobenius theorem, see e.g. [42] for a proof. Remark also that Statement (3) follows from Statements (1) and (2) by spectral perturbation theory. Let $L^*_\gamma$ be the adjoint of $L_\gamma$, defined by duality:
\[ \mathrm{Tr}[S' L_\gamma(S)] = \mathrm{Tr}[L^*_\gamma(S') S], \qquad S, S' \in \mathcal{B}(\mathcal{H}_S). \tag{2.34} \]
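To see the objects of this section at work, the following sketch builds the transition operators (2.26) and a deformed Lindbladian of the form (2.28) for a toy three-level system coupled to a single reservoir (m = 1). Several ingredients are assumptions made for illustration and are not taken from the paper: the matrices `H_S` and `V`, the function `rate` (an ohmic-type choice constructed only so that the detailed balance relation (2.33) holds), and the omission of the energy-shift term (the effective Hamiltonian is set to zero). The script then checks that the components sum to V, that $L_0(1) = 0$, that $L_\gamma$ commutes with $[H_S, \cdot]$, and that the top of the spectrum of $L_0$ is 0.

```python
import numpy as np

def bohr_components(H_S, V, digits=9):
    """Return {eps: V_eps}, V_eps = sum over e - e' = eps of 1_e V 1_e', cf. (2.26)."""
    evals, evecs = np.linalg.eigh(H_S)
    projections = {}
    for e, vec in zip(evals, evecs.T):
        key = round(float(e), digits)
        projections[key] = projections.get(key, 0) + np.outer(vec, vec.conj())
    comps = {}
    for e, P_e in projections.items():
        for e2, P_e2 in projections.items():
            eps = round(e - e2, digits)
            comps[eps] = comps.get(eps, 0) + P_e @ V @ P_e2
    return comps

def rate(eps, beta):
    """Hypothetical transition rate obeying c(eps) = exp(beta * eps) * c(-eps)."""
    w = abs(eps)
    if w < 1e-12:
        return 1.0 / beta
    n = 1.0 / np.expm1(beta * w)          # Bose occupation number
    return w * (1.0 + n) if eps > 0 else w * n

def deformed_lindbladian(S, comps, beta, gamma=0.0):
    """Apply L_gamma of (2.28) to S, with the energy-shift term omitted."""
    out = np.zeros_like(S, dtype=complex)
    for eps, V_eps in comps.items():
        VVd = V_eps @ V_eps.conj().T
        jump = np.exp(1j * gamma * eps) * (V_eps @ S @ V_eps.conj().T)
        out += rate(eps, beta) * (jump - 0.5 * (VVd @ S + S @ VVd))
    return out

def superoperator_matrix(apply, d):
    """Matrix of a linear map on d x d matrices in the basis of matrix units."""
    cols = []
    for j in range(d * d):
        E = np.zeros(d * d, dtype=complex)
        E[j] = 1.0
        cols.append(apply(E.reshape(d, d)).reshape(-1))
    return np.array(cols).T

beta = 1.5
H_S = np.diag([0.0, 1.0, 2.5])
V = np.array([[0.0, 1.0, 0.5], [1.0, 0.0, 1.0], [0.5, 1.0, 0.0]])
comps = bohr_components(H_S, V)

sum_err = np.abs(sum(comps.values()) - V).max()
unit_err = np.abs(deformed_lindbladian(np.eye(3), comps, beta)).max()

rng = np.random.default_rng(1)
S = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
comm = lambda A, B: A @ B - B @ A
L = lambda X: deformed_lindbladian(X, comps, beta, gamma=0.4)
comm_err = np.abs(L(comm(H_S, S)) - comm(H_S, L(S))).max()

M0 = superoperator_matrix(lambda X: deformed_lindbladian(X, comps, beta), 3)
top = max(np.linalg.eigvals(M0).real)
print(sum_err, unit_err, comm_err, top)
```

The commutation check reflects the identity $[H_S, V_{k,\varepsilon}] = \varepsilon V_{k,\varepsilon}$, which holds for any choice of rates; the value 0 for the top of the spectrum at $\gamma = 0$ corresponds to the eigenvector 1.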
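Proposition 2.3. For all $\gamma \in \mathbb{C}^m$, the operators $L_\gamma$ commute with $[H_S, \cdot]$, i.e.
\[ \sum_{\varepsilon \in \mathrm{sp}[H_S, \cdot]} 1_\varepsilon([H_S, \cdot])\, L_\gamma\, 1_\varepsilon([H_S, \cdot]) = L_\gamma \tag{2.35} \]
where $1_\varepsilon([H_S, \cdot])$ is the spectral projection of the Liouville operator $[H_S, \cdot]$ on the Bohr frequency $\varepsilon$. Further, assume that Assumption 2 holds and let $\gamma \in \mathbb{C}^m$ satisfy $\mathrm{Re}\, \gamma = 0$. Then the following statements hold.
(1) The operators $L_\gamma$ and $L^*_\gamma$ have a simple real eigenvalue $f_L(\gamma)$ at the top of the spectrum
\[ \sup \mathrm{Re}\, \mathrm{sp}\, L_\gamma = f_L(\gamma). \tag{2.36} \]
For $\gamma = 0$, $f_L(0) = 0$ and the associated eigenvector of $L_\gamma$ is $1 \in \mathcal{B}(\mathcal{H}_S)$. In general, we write $\zeta_{L_\gamma}$ for the associated eigenvector of $L_\gamma$ and $\tilde{\zeta}_{L_\gamma}$ for the associated eigenvector of $L^*_\gamma$. Both $\zeta_{L_\gamma}$ and $\tilde{\zeta}_{L_\gamma}$ can be chosen to be strictly positive operators satisfying
\[ [H_S, \zeta_{L_\gamma}] = 0, \qquad [H_S, \tilde{\zeta}_{L_\gamma}] = 0 \tag{2.37} \]
and such that the spectral projection $P_{L_\gamma}$ of $L_\gamma$, corresponding to the eigenvalue $f_L(\gamma)$, acts as
\[ P_{L_\gamma}(S) = \mathrm{Tr}[\tilde{\zeta}_{L_\gamma} S]\, \zeta_{L_\gamma}, \qquad S \in \mathcal{B}(\mathcal{H}_S). \tag{2.38} \]
(2) The eigenvalue $f_L(\gamma)$ is elevated above the rest of the spectrum by a gap $g_L(\gamma) > 0$, i.e.
\[ \mathrm{sp}\, L_\gamma = \{ f_L(\gamma) \} \cup \Omega_\gamma \qquad \text{and} \qquad \mathrm{dist}(f_L(\gamma), \mathrm{Re}\, \Omega_\gamma) = g_L(\gamma). \tag{2.39} \]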
(3) There is an open set $D_L \subset \mathbb{C}^m$ with $i\mathbb{R}^m \subset D_L$ such that for $\gamma \in D_L$ (i.e. not necessarily $\mathrm{Re}\, \gamma = 0$), the above statements (1) and (2) still hold, that is, $L_\gamma$ and $L^*_\gamma$ still have a maximal eigenvalue (as to the real part), $f_L(\gamma)$. The eigenvalue $f_L(\gamma)$ is separated by a gap $g_L(\gamma) > 0$ from the rest of the spectrum and it is analytic in $\gamma$. The associated eigenvectors $\zeta_{L_\gamma}$ and $\tilde{\zeta}_{L_\gamma}$ can be chosen strictly positive and they commute with $H_S$.
The relevance of Proposition 2.3 lies in the following observation. Anticipating the result of Proposition 3.1 in Sec. 3, the quantity
\[ \rho_S[e^{t(i[H_S, \cdot] + \lambda^2 L_\gamma)}(1)] \tag{2.40} \]
is a good approximation, for small coupling $\lambda$ and times of order $\lambda^{-2}$, of the characteristic function $\rho_S[Z^t_{\lambda,\gamma}(1)]$ that was defined in Lemma 2.2. One can argue (see [42]) that the function (2.40) is precisely the characteristic function of a probability measure. The statements of Proposition 2.3 imply that the associated large deviation function exists and is analytic. Indeed, we calculate for $\gamma \in D_L$:
\begin{align}
\frac{1}{t} \log \rho_S[e^{t(i[H_S, \cdot] + \lambda^2 L_\gamma)}(1)] &= \frac{1}{t} \log \big( e^{\lambda^2 f_L(\gamma) t} \rho_S(P_{L_\gamma}(1)) + O(e^{\lambda^2 (f_L(\gamma) - g_L(\gamma)) t}) \big) \tag{2.41} \\
&= \lambda^2 f_L(\gamma) + \frac{1}{t} \log \big( \mathrm{Tr}[\tilde{\zeta}_{L_\gamma}]\, \rho_S(\zeta_{L_\gamma}) + O(e^{-\lambda^2 g_L(\gamma) t}) \big), \qquad t \uparrow \infty. \tag{2.42}
\end{align}
Since $\zeta_{L_\gamma}$ and $\tilde{\zeta}_{L_\gamma}$ are strictly positive operators, the first term between brackets is strictly positive and the limit $t \uparrow \infty$ can be performed, yielding $\lambda^2 f_L(\gamma)$. This calculation is valid for $\gamma$ in a neighborhood of $(i\mathbb{R})^m$ and $f_L(\gamma)$ is analytic, hence differentiable. Consequently, the large deviation principle can be deduced, see Sec. 1.3.1. Our main result, Theorem 3.2, shows that one can apply the same reasoning, albeit for a restricted set of $\gamma$, to the fully interacting model, and not only to its Markovian approximation.
3. Results
The reduced dynamics $Z^t_{\lambda,\gamma}$, acting on $\mathcal{B}(\mathcal{H}_S)$, was defined via Lemma 2.2. Formally, it satisfies
\[ \rho_S[Z^t_{\lambda,\gamma}(S)] = \rho_S \otimes \rho_R \Big[ G\Big(\frac{\gamma}{2}\Big) e^{itH_\lambda} G\Big(-\frac{\gamma}{2}\Big) (S \otimes 1)\, G\Big(-\frac{\gamma}{2}\Big) e^{-itH_\lambda} G\Big(\frac{\gamma}{2}\Big) \Big] \tag{3.1} \]
where $G(\gamma) := e^{i(\gamma, H_R)} = \exp\{i \sum_k \gamma_k H_{R_k}\}$ and $\rho_S$ is a state on $\mathcal{B}(\mathcal{H}_S)$. For $\gamma = 0$, $Z^t_{\lambda,\gamma}$ describes the reduced Heisenberg evolution of observables of the small system, i.e. after tracing out the reservoirs.
We first state that in the weak coupling limit, the reduced evolution $Z^t_{\lambda,\gamma}$ is Markovian.
Proposition 3.1. Assume Assumption 1, then, for all $T < \infty$ and $\gamma \in D_A$ (as in Assumption 1),
\[ \sup_{0 \leq t \leq \lambda^{-2} T} \big| \rho_S[Z^t_{\lambda,\gamma}(S)] - \rho_S[e^{t(i[H_S, \cdot] + \lambda^2 L_\gamma)}(S)] \big| \xrightarrow[\lambda \to 0]{} 0 \tag{3.2} \]
for all $S \in \mathcal{B}(\mathcal{H}_S)$.
Recall that the second term of the left-hand side of (3.2) was the subject of the discussion at the end of Sec. 2.5. Proposition 3.1 is not necessary for the proof of our main results and we will not prove it in this paper, although to do so, it would suffice to adapt only slightly the proof of our main result, Theorem 3.2. Proposition 3.1 remains true under less restrictive assumptions on $\psi_k(t)$ than the exponential decay in Assumption 1 (see [16]) and statements like Eq. (3.2) have been exhaustively discussed in the literature, at least for $\gamma = 0$, see [8, 31, 15]. In Lemma 2.2, we showed that for all $\lambda \in \mathbb{R}$ and initial states $\rho_S$, the function $\gamma \mapsto \rho_S(Z^t_{\lambda,\gamma}(1))$ can be interpreted as a characteristic function of a random variable. In other words, there are random variables $X_t \in \mathbb{R}^m$ and probability measures $\mathbb{P}^\lambda_{\rho_S}$ (with the expectation denoted by $\mathbb{E}^\lambda_{\rho_S}$) such that
\[ \mathbb{E}^\lambda_{\rho_S}(e^{-i(\gamma, X_t)}) = \rho_S[Z^t_{\lambda,\gamma}(1)]. \tag{3.3} \]
One should keep in mind that the kth component of the random variable $X_t$ is the amount of energy that has flown out of the kth reservoir. This will be justified via Proposition 4.1, where it is shown that the characteristic function $\mathbb{E}^\lambda_{\rho_S}(e^{-i(\gamma, X_t)})$ is the limit of similar characteristic functions in a finite-volume setup. The following result states the existence and analyticity of the large deviation generating function for the family of random variables $X_t$, in a neighborhood of the origin.
Theorem 3.2. Assume Assumptions 1 and 2. There is a $\lambda_0 > 0$ and an open domain $D \subset \mathbb{C}^m$, $0 \in D$, such that for $0 < |\lambda| \leq \lambda_0$ and $\gamma \in D$,
\[ f(\lambda, \gamma) := \lim_{t \uparrow \infty} t^{-1} \log \rho_S[Z^t_{\lambda,\gamma}(1)] \tag{3.4} \]
exists, is independent of $\rho_S$ and analytic in $\gamma$. Moreover,
\[ f(\lambda, \gamma) = \lambda^2 (f_L(\gamma) + O(\lambda^2)) \tag{3.5} \]
where the function $f_L(\gamma)$ has been defined in Proposition 2.3.
The domain D is a subset of $D_A \cap D_L$ (see Assumption 1 and Proposition 2.3). It can in general be taken larger as $\lambda \to 0$, but we can never take D to be a neighborhood of $(i\mathbb{R})^m$, as would be required to obtain the full large deviation principle. Note that the large deviation generating function, as defined in Sec. 1.3, is related to the function $f(\lambda, \gamma)$ by a rotation in the complex plane, i.e. $F(\lambda, \kappa) := f(\lambda, i\kappa)$ with $\kappa \in \mathbb{R}^m$ is a large deviation generating function.
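As the main corollary, we state the Gallavotti–Cohen fluctuation theorem for the entropy production. This requires an additional assumption.

Assumption 3 (Time-Reversal Invariance). Assume there are anti-unitary involutions $\theta_S$ on $\mathcal{H}_S$ and $\theta_{R_k}$ on $\mathfrak{h}_k$, such that
\[ \theta_S H_S \theta_S^{-1} = H_S, \qquad \theta_S V_k \theta_S^{-1} = V_k, \qquad \theta_{R_k} \omega_k \theta_{R_k}^{-1} = \omega_k, \qquad \theta_{R_k} \phi_k = \phi_k. \]
From Assumption 3, it follows that the anti-unitary involution $\Theta := \theta_S \otimes \Gamma_s(\oplus_k \theta_{R_k})$ on $\mathcal{H}$ satisfies (cf. the conditions in (1.15))
\[ \Theta^{-1} H_\lambda \Theta = H_\lambda \quad \text{and} \quad \Theta^{-1} H_{R_k} \Theta = H_{R_k}. \tag{3.6} \]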
In fact, the condition in (3.6) is sufficient for the next result, Theorem 3.3 and one can easily imagine examples in which (3.6) holds but the condition θRk φk = φk above is violated. For the sake of constructiveness, we stick however to Assumption 3. Theorem 3.3. Assume Assumptions 1–3. Let F (λ, κ) := f (λ, iκ) with f (·, ·) as in Theorem 3.2 and define β = (β1 , β2 , . . . , βm ) ∈ Rm .
(3.7)
For κ such that κ ∈ iD and −β − κ ∈ iD, F (λ, κ) = F (λ, −β − κ).
(3.8)
The meaning of formula (3.8), which is the rigorous analogue of (1.24), has been explained in Sec. 1.3. Another corollary of Theorem 3.2 is the central limit theorem for the random variables $X_t$. Since we argued that $(X_t)_k$ represents the amount of energy that has flown into the $k$th reservoir, the quantity
\[ x_\lambda := \lim_{t \uparrow \infty} \frac{1}{t}\, \mathbb{E}^\lambda_{\rho_S}(X_t), \qquad x_\lambda \equiv (x_\lambda)_k \tag{3.9} \]
is the mean energy current into the reservoirs. Corollary 3.4 states that the fluctuations of $X_t$ around its mean value $x_\lambda$ are Gaussian, as $t \nearrow \infty$.

Corollary 3.4. Assume the assumptions of Theorem 3.2. The limit in (3.9) is well-defined and independent of $\rho_S$. Moreover,
\[ x_\lambda = i\, \frac{\partial}{\partial \gamma} f(\lambda, \gamma)\Big|_{\gamma = 0}. \tag{3.10} \]
Define the $m \times m$ matrix (recall that $m$ is the number of reservoirs)
\[ \sigma_\lambda := -\frac{\partial^2}{\partial \gamma^2} f(\lambda, \gamma)\Big|_{\gamma = 0}. \tag{3.11} \]
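W. De Roeck
Then, the $\mathbb{R}^m$-valued random variables $\frac{X_t - t x_\lambda}{\sqrt{t}}$ converge in distribution to a Gaussian measure with covariance $\sigma_\lambda$, i.e.
\[ \mathbb{E}^\lambda_{\rho_S}\Big[e^{-i\left(\gamma,\, \frac{1}{\sqrt{t}}(X_t - t x_\lambda)\right)}\Big] \;\xrightarrow[t \uparrow \infty]{}\; e^{-\frac{1}{2}(\gamma, \sigma_\lambda \gamma)}, \qquad \gamma \in \mathbb{R}^m. \tag{3.12} \]
Moreover, the moments of $\frac{X_t - t x_\lambda}{\sqrt{t}}$ converge to the moments of the Gaussian, e.g.
\[ \frac{1}{t}\, \mathbb{E}^\lambda_{\rho_S}\big[(X_t - t x_\lambda)_k (X_t - t x_\lambda)_{k'}\big] \;\xrightarrow[t \uparrow \infty]{}\; (\sigma_\lambda)_{k,k'}. \tag{3.13} \]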
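As a rough illustration of how (3.10) and (3.11) extract the mean current and the covariance from the generating function, the following sketch evaluates both derivatives by central finite differences for a toy scalar example ($m = 1$). The toy function `f_toy` is an assumption made purely for illustration: it is the generating function of a Poisson process transferring quanta of fixed size, not the function $f(\lambda, \gamma)$ of Theorem 3.2, and the names `rate` and `quantum` are hypothetical.

```python
import cmath

def f_toy(gamma, rate=2.0, quantum=0.5):
    """Toy generating function f(gamma) = lim t^-1 log E[exp(-i*gamma*X_t)]
    for X_t = quantum * (Poisson counting process with the given rate).
    This is an illustrative stand-in, not the f(lambda, gamma) of the paper."""
    return rate * (cmath.exp(-1j * gamma * quantum) - 1.0)

def mean_current(f, h=1e-4):
    """x = i * df/dgamma at gamma = 0, cf. (3.10), via a central difference."""
    return 1j * (f(h) - f(-h)) / (2.0 * h)

def variance(f, h=1e-4):
    """sigma = -d^2 f / dgamma^2 at gamma = 0, cf. (3.11), via a central difference."""
    return -(f(h) - 2.0 * f(0.0) + f(-h)) / (h * h)

x = mean_current(f_toy)    # close to rate * quantum
sigma = variance(f_toy)    # close to rate * quantum**2
```

For this toy process the exact values are `rate * quantum` for the mean and `rate * quantum**2` for the variance, which the finite differences reproduce up to discretization and rounding error.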
Corollary 3.4 follows from Theorem 3.2 by a general argument by Bryc [7]. The Gaussian fluctuations of the currents are particularly important in close-to-equilibrium thermodynamics, as they figure in the Green–Kubo relation and Onsager–Machlup theory. We refer to [26] and [25] for further rigorous results on the central limit theorem in a very similar context.

4. Discussion

4.1. Finite-volume approximations

As announced in the previous sections, we show that the expression $\rho_S[Z^t_{\lambda,\gamma}(1)]$ is a thermodynamic limit of characteristic functions of finite-volume models. We start by introducing the finite-volume models. Instead of the one-particle reservoir spaces $\mathfrak{h}_k$, we introduce finite-dimensional spaces $\mathfrak{h}^n_k$ with self-adjoint Hamiltonians $\omega^n_k$ and structure factors $\phi^n_k \in \mathfrak{h}^n_k$. The full reservoir space is now defined as
\[ \mathcal{H}^n_R := \otimes_k \Gamma_s(\mathfrak{h}^n_k) = \Gamma_s(\mathfrak{h}^n), \qquad \mathfrak{h}^n := \oplus_k \mathfrak{h}^n_k, \tag{4.1} \]
and the reservoir Hamiltonians are given by
\[ H^n_{R_k} := d\Gamma(\omega^n_k) \quad \text{on } \Gamma_s(\mathfrak{h}^n_k), \tag{4.2} \]
completely analogous to the setup in Sec. 2.2. The small system is unchanged: its Hilbert space is $\mathcal{H}_S$ and its Hamiltonian is $H_S$, as before. The full zero-temperature Hamiltonian on $\mathcal{H}_S \otimes \mathcal{H}^n_R$ is defined as
\[ H^n_\lambda := H_S + \sum_{k=1}^m H^n_{R_k} + \lambda \sum_{k=1}^m V_k \otimes \big(a(\phi^n_k) + a^*(\phi^n_k)\big) \tag{4.3} \]
where, as before, we write $H_S$ for $H_S \otimes 1$ and $H^n_{R_k}$ for $1 \otimes H^n_{R_k}$. The thermal state can now be defined straightforwardly. Even though the spaces $\Gamma_s(\mathfrak{h}^n_k)$ are not finite-dimensional, the operators $e^{-\beta_k H^n_{R_k}}$ are trace-class, provided that $\beta_k > 0$ and $\omega^n_k > 0$, and one can define
\[ \rho^n_{R_k}(A) := \frac{1}{Z_k(\beta_k)} \operatorname{Tr}\big[e^{-\beta_k H^n_{R_k}} A\big], \qquad Z_k(\beta_k) := \operatorname{Tr}\big[e^{-\beta_k H^n_{R_k}}\big]. \tag{4.4} \]
Notice that the setup which we introduced above is reminiscent of the objects in Sec. 1.2, the differences being (1) we are more specific about the choice of spaces
and Hamiltonians, since we want to connect them to our infinite-volume model, and (2) we do not ask that the full reservoir space $\mathcal{H}^n_R$ is finite-dimensional. Up to this point, we have not made any connection between the objects introduced above and our model. This is done through the upcoming Assumption 4. Let $T^n_k := (e^{\beta_k \omega^n_k} - 1)^{-1}$ and define the finite-volume reservoir correlation function
\[ \psi^n_k(t) := \big\langle \phi^n_k,\, T^n_k e^{-it\omega^n_k} \phi^n_k \big\rangle_{\mathfrak{h}^n_k} + \big\langle \phi^n_k,\, (1 + T^n_k) e^{it\omega^n_k} \phi^n_k \big\rangle_{\mathfrak{h}^n_k}. \tag{4.5} \]

Assumption 4. We assume that the finite-volume correlation functions $\psi^n_k$ converge pointwise to $\psi_k$, as $n \uparrow \infty$. The convergence is assumed to be uniform on compacts in a strip around the real axis, i.e.
\[ \sup_{t \in K} |\psi^n_k(t) - \psi_k(t)| \;\xrightarrow[n \uparrow \infty]{}\; 0 \tag{4.6} \]
for all compact sets $K \subset \{t \in \mathbb{C},\ |\operatorname{Im} t| \le \delta\}$, for some $\delta > 0$.

Note that we do not ask for $\psi^n_k$ to converge to $\psi_k$ in any $L^p$-space. That would be unrealistic since the function $\psi^n_k(t)$ is quasiperiodic for any finite $n \in \mathbb{N}$ (footnote c). Nevertheless, the convergence assumed in (4.6) yields convergence of the finite-volume characteristic functions to their infinite-volume counterparts, as will be stated in Proposition 4.1.

One can always devise a "finite-volume" approximation to our model. Consider for example the simple case where $\mathfrak{h}_k \sim L^2(\mathbb{R}^+, d\omega_k)$ and $\omega_k$ acts on $L^2(\mathbb{R}^+, d\omega_k)$ by multiplication with the variable $\omega_k$. This is the case of no internal degrees of freedom for the reservoir particles, i.e. a reservoir particle is only characterized by its energy. Given Assumption 1 and its implication (2.15), the function $\psi_k$ and its Fourier transform $\hat{\psi}_k$ are exponentially decaying as $|t| \nearrow \infty$ and $|\xi| \nearrow \infty$, respectively. Finding a finite-volume approximation essentially amounts to finding a sequence of step functions (the steps correspond to the base states of $\mathfrak{h}^n_k$) with compact support such that their Fourier transforms $\psi^n_k$ approximate $\psi_k$ in the sense of Assumption 4. This can obviously be done. The following proposition states the convergence of the finite-volume dynamics and characteristic functions to their infinite-volume analogues.

Proposition 4.1. Assume Assumption 4 as stated above and let $\gamma \in \mathbb{C}^m$ satisfy $|\operatorname{Im} \gamma_k| < \delta$, then, for all $t \in \mathbb{R}$,
n γ γ −itHλn n γ n n γ itHλ n n t ρS ⊗ ρR G G − G (S)] e (S ⊗ 1)G − e → ρS [Zλ,γ n↑∞ 2 2 2 2 (4.7)
where $G^n(\gamma) := e^{i \sum_k \gamma_k H^n_{R_k}}$.

Footnote c: This is nothing else than a manifestation of Poincaré recurrences in finite-volume systems. In this context, it warns us that we should not attempt to take the time to infinity before taking the thermodynamic limit.
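The recurrence phenomenon mentioned in footnote c can be made concrete with a small numerical sketch of the finite-volume correlation function (4.5) for a single reservoir. Everything specific here is an assumption chosen for illustration and is not taken from the paper: the equally spaced frequency grid, the cutoff `omega_max`, and the made-up spectral density $J(\omega) = \omega e^{-\omega}$ that fixes the weights $|\phi_j|^2$. With equally spaced frequencies, the function is not merely quasiperiodic but exactly periodic, with a period that grows as the grid is refined.

```python
import cmath
import math

def psi_n(t, n=50, omega_max=10.0, beta=1.0):
    """Finite-volume correlation function of the form (4.5) with n modes.

    Illustrative assumptions: frequencies omega_j = j * delta for j = 1..n,
    and weights |phi_j|^2 = J(omega_j) * delta with the hypothetical
    spectral density J(omega) = omega * exp(-omega).
    """
    delta = omega_max / n
    total = 0.0 + 0.0j
    for j in range(1, n + 1):
        omega = j * delta
        weight = omega * math.exp(-omega) * delta     # |phi_j|^2
        occupation = 1.0 / math.expm1(beta * omega)   # T = (e^{beta*omega} - 1)^{-1}
        total += weight * (occupation * cmath.exp(-1j * t * omega)
                           + (1.0 + occupation) * cmath.exp(1j * t * omega))
    return total

def recurrence_time(n=50, omega_max=10.0):
    """All frequencies are integer multiples of delta, so psi_n has period 2*pi/delta."""
    return 2.0 * math.pi * n / omega_max

t0 = 0.7
drift = abs(psi_n(t0 + recurrence_time()) - psi_n(t0))  # zero up to rounding
```

Since `psi_n` returns to its initial values after `recurrence_time()`, it cannot decay as $t \nearrow \infty$ at fixed $n$, which is why convergence is only required on compact sets in Assumption 4. Increasing `n` at fixed `omega_max` pushes the recurrence time out to infinity.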
The proof of this proposition is given in Sec. 5.2. Recall that the left-hand side of (4.7) was constructed as the characteristic function (called $\chi^n_t(\gamma)$) of a random variable describing the transport of energy between the reservoirs. The right-hand side of (4.7) was shown to be the characteristic function of a random variable in Lemma 2.2. Hence, Proposition 4.1 states that the random variable describing energy transport in finite volume converges in distribution to a random variable in infinite volume. Since the characteristic functions are assumed to be analytic, we also obtain convergence of moments.

4.2. Related results

There has lately been a lot of work on spin-boson and spin-fermion models, or more generally, Pauli–Fierz models. We feel our work is technically closest to [27], in which one considers the (equilibrium) spin-boson model and one proves that the generator of the positive-temperature dynamics (the "Liouvillian") has absolutely continuous spectrum for $\lambda \neq 0$, except for one eigenvalue which corresponds to the stationary state. The other eigenvalues of the system at $\lambda = 0$ turn into resonances whose location is, in first nonvanishing order, predicted by the Lindblad generator. In a later paper, [25], the authors prove the Green–Kubo relation and the Onsager reciprocity relations for the nonequilibrium spin-boson model (actually, for technical reasons, they treat the spin-fermion model).

Let $\hat{\psi}_k$ be the Fourier transform of the time-correlation function $\psi_k$, defined in (2.12). Then, the basic assumption of [27] reads that the function $\hat{\psi}_k$ is analytic in a strip $\{|\operatorname{Im} z| \le \delta\}$ and
\[ \sup_{-\delta < \eta < \delta} \int_{i\eta + \mathbb{R}} d\omega\, |\hat{\psi}_k(\omega)| < \infty. \tag{4.8} \]
Actually, in the mentioned paper [27] the subscript $k$ is not present since there is only one reservoir. In [25], the assumption is somewhat stronger, requiring that
\[ \sup_{-\delta < \eta < \delta} \int_{i\eta + \mathbb{R}} d\omega\, |e^{\gamma_k \omega} \hat{\psi}_k(\omega)| < \infty \tag{4.9} \]
for $\gamma$ in some neighborhood of $0 \in \mathbb{C}^d$. This assumption is obviously equivalent to Assumption 1 in the present paper, see the discussion following Assumption 1.

The technique of [27] consists of a spectral deformation of the Liouvillian, the generator of the Heisenberg dynamics in the GNS-representation. We employ time-dependent perturbation theory and we rewrite the Dyson expansion as a one-dimensional polymer model. The same technique has also been used in [40]. Nevertheless, the proofs are similar to that of [27] in that both rely heavily on the exponential decay of reservoir correlations. Assumption 1 cannot be weakened without changing the method drastically. Note that one cannot assume Assumption 1 with $D_A = \mathbb{C}^m$ since that would imply that $\psi_k(t)$ is bounded and analytic for all $t \in \mathbb{C}$, hence constant.
Results that need weaker regularity properties of $\psi_k$ and $\hat{\psi}_k$ are e.g. [4, 13, 12, 20]. In those works one employs Mourre theory or renormalization group techniques to prove return to equilibrium, or approach to a nonequilibrium steady state. A different approach to the same question is through scattering theory, see [43, 21]. Other papers that study fluctuations of currents in a rigorous setup are [26, 3].

5. Proofs

Our proofs fall naturally into a few steps.

• First Step. In Sec. 5.1, we develop a diagrammatic representation of the reduced evolution $Z^t_{\lambda,\gamma}$. The main results are formulas (5.11) (integral over diagrams) and (5.15) (integral over sequences of irreducible diagrams). One might get a feeling for the diagrammatic representation by looking at Fig. 1.
• Second Step. In Sec. 5.2, we estimate the integral over diagrams that were constructed in the First Step. The result is contained in Lemma 5.1. This is the only place in the article where the diagrams are really used.
• Third Step. In Sec. 5.3, we briefly refer to the diagrammatic representation obtained in the First Step to split the Laplace transform of the reduced evolution $Z^t_{\lambda,\gamma}$ into a term coming from the "ladders" and one coming from the "excitations". We then translate the bounds from the Second Step into bounds on these two terms. This is accomplished in Lemma 5.2.
• Fourth Step. In Sec. 5.4, we state an auxiliary result, Theorem 5.3, that leads very directly to the proof of our main results. The auxiliary result, Theorem 5.3, relies entirely and exclusively on Lemma 5.2 and its straightforward proof is fully independent of the diagrammatic approach. To stress this fact, we postpone the proof to the abstract Appendix.

5.1. Dyson expansion

In this section, we set up a convenient notation to handle the Dyson expansion, which has been introduced in Lemma 2.2. This will be done in two steps, corresponding to Secs. 5.1.1 and 5.1.2. In Sec. 5.1.1, we separate the contributions to $Z^t_{\lambda,\gamma}$ into an operator part (acting on $\mathcal{B}(\mathcal{H}_S)$) and a scalar part that originates from the reservoir correlation functions. This is best visible in (5.4), where $\zeta_\gamma(\pi, \underline{t}, \underline{k}, \underline{l})$ is a product of reservoir correlation functions and the operators $U$ and $I$ act on $\mathcal{B}(\mathcal{H}_S)$. In Sec. 5.1.2, we rewrite the expansion as an integral over diagrams, i.e. collections of ordered time-pairs. The result is displayed in (5.11). The main advantages of this formalism are (1) that one can render explicit a factorization property of the reduced evolution $Z^t_{\lambda,\gamma}$, as displayed in (5.15), and (2) that it allows us to derive estimates on the Dyson expansion in a rather intuitive way, as will be exploited in Sec. 5.2.

We warn the reader of the following fact: the reduced evolution $Z^t_{\lambda,\gamma}$ was defined in Lemma 2.2 by a series expansion, whose absolute convergence was stated but not
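yet proved. Throughout Sec. 5.1, this expansion is manipulated, but the proof of its absolute convergence appears only at the end of Sec. 5.2 by virtue of the bounds in Lemma 5.1. Hence, strictly speaking, the equalities in the present section should be considered as order-by-order (in powers of $\lambda$) equalities between series.

5.1.1. Expansion in pairings

Define the group $U_t$ on $\mathcal{B}(\mathcal{H}_S)$ by
\[ U_t(S) := e^{itH_S} S e^{-itH_S}, \qquad S \in \mathcal{B}(\mathcal{H}_S), \tag{5.1} \]
and the operators $I_{k,l}$, with $k \in \{1, \ldots, m\}$ and $l \in \{L, R\}$ ($L$, $R$ stand for "left" and "right"), as
\[ I_{k,l}(S) := \begin{cases} i V_k S & \text{if } l = L \\ -i S V_k & \text{if } l = R, \end{cases} \qquad S \in \mathcal{B}(\mathcal{H}_S). \tag{5.2} \]
Elements in $\mathbb{R}^{2n}$, $\{1, \ldots, m\}^{2n}$, $\{L, R\}^{2n}$ are denoted by $\underline{t}, \underline{k}, \underline{l}$, with $t_i, k_i, l_i$ their respective components for $i = 1, \ldots, 2n$. Using the operators $U_t$ and $I_{k,l}$ as defined above, we evaluate (2.19) formally by using (2.9) and (2.10), (2.11):
\[ Z^t_{\lambda,\gamma} = \sum_{n \in \mathbb{Z}^+} \lambda^{2n} \sum_{\pi \in \mathcal{P}_n} \int_{0 < t_1 < \cdots < t_{2n} < t} dt_1 \cdots dt_{2n}\, V^{[0,t]}_\gamma(\pi, \underline{t}) \tag{5.3} \]
where
\[ V^{[0,t]}_\gamma(\pi, \underline{t}) := \sum_{\underline{k}, \underline{l}} \zeta_\gamma(\pi, \underline{t}, \underline{k}, \underline{l})\, U_{t - t_{2n}} I_{k_{2n}, l_{2n}} \cdots I_{k_2, l_2} U_{t_2 - t_1} I_{k_1, l_1} U_{t_1} \tag{5.4} \]
and
\[ \zeta_\gamma(\pi, \underline{t}, \underline{k}, \underline{l}) := \prod_{(r,s) \in \pi} \delta_{k_r, k_s} \begin{cases} \psi_{k_r}(-(t_s - t_r)) & l_r = l_s = L \\ \psi_{k_r}(t_s - t_r) & l_r = l_s = R \\ \psi_{k_r}(t_s - t_r + \gamma) & l_r = L,\ l_s = R \\ \psi_{k_r}(-(t_s - t_r - \gamma)) & l_r = R,\ l_s = L \end{cases} \tag{5.5} \]
with the correlation function $\psi_k$ as defined in (2.12) and the pairings $\pi$ as in (2.10), (2.11). For $n = 0$, the integral in (5.3) is meant to be equal to $U_t$. This expansion is formal since we have not yet proven any convergence properties. This will however be done at the end of Sec. 5.2, following Lemma 5.1.

In close analogy to $V^{[0,t]}_\gamma(\pi, \underline{t})$, we also define $V^I_\gamma(\pi, \underline{t})$ for a closed interval $I := [s, s']$ such that $t_1, \ldots, t_{2n} \in I$ by
\[ V^I_\gamma(\pi, \underline{t}) := \sum_{\underline{k}, \underline{l}} \zeta_\gamma(\pi, \underline{t}, \underline{k}, \underline{l})\, U_{s' - t_{2n}} I_{k_{2n}, l_{2n}} \cdots I_{k_2, l_2} U_{t_2 - t_1} I_{k_1, l_1} U_{t_1 - s}. \tag{5.6} \]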
Next, we introduce some combinatorial concepts to deal with the pairings π ∈ Pn that were used in the above formulas. For convenience, we will replace the variables (π, t) ∈ Pn × [0, t]2n by a single variable σ which carries the same information and which will be called a “diagram”.
5.1.2. Diagrams σ

Let $\Sigma^1_I$ be the set of pairs of times in the closed interval $I$. The smaller time-coordinate is called $u$ and the larger time-coordinate is called $v$. The set $\Sigma^n_I$ is defined as the set of collections of $n$ pairs of times in $I$. That is, each $\sigma \in \Sigma^n_I$ consists of $n$ pairs, whose time-coordinates are parametrized by $(u_i, v_i)$ for $i = 1, \ldots, n$ and with the convention that $u_i \le v_i$ and $u_i \le u_{i+1}$. The elements $\sigma$ are called diagrams. As announced, there is clearly a one-to-one relation between a diagram $\sigma \in \Sigma^n_I$ and a couple $(\pi, \underline{t})$ with $t_1, \ldots, t_{2n} \in I$, as used in Sec. 5.1.1. We define the domain of a diagram $\sigma$ as
\[ \operatorname{Dom} \sigma := \bigcup_{i=1}^n [u_i, v_i] \subset I, \qquad \text{for } \sigma \in \Sigma^n_I. \tag{5.7} \]
We call $\sigma \in \Sigma^n_I$ irreducible (notation: irr) whenever its domain $\operatorname{Dom} \sigma$ is a connected set. In other words, it is irreducible whenever there are no two (sub)diagrams $\sigma_1 \in \Sigma^{n_1}_I$, $\sigma_2 \in \Sigma^{n_2}_I$ with $n_1 + n_2 = n$ such that
\[ \sigma = \sigma_1 \cup \sigma_2 \quad \text{and} \quad \operatorname{Dom} \sigma_1 \cap \operatorname{Dom} \sigma_2 = \emptyset. \tag{5.8} \]
For any diagram $\sigma \in \Sigma^n_I$ that is not irreducible, we can thus find a unique (up to the order) sequence of diagrams $\sigma_1, \ldots, \sigma_m$ such that $\sigma_1, \ldots, \sigma_m$ are irreducible and
\[ \sigma = \sigma_1 \cup \cdots \cup \sigma_m. \tag{5.9} \]
We fix the order of $\sigma_1, \ldots, \sigma_m$ by requiring that $\max \operatorname{Dom} \sigma_i \le \min \operatorname{Dom} \sigma_{i+1}$ and we call the sequence $(\sigma_1, \ldots, \sigma_m)$ obtained in this way the decomposition of $\sigma$ into irreducible components.

We let $\Sigma^n_I(\mathrm{irr}) \subset \Sigma^n_I$ stand for the set of irreducible diagrams $\sigma$ which satisfy $\operatorname{Dom} \sigma = I$, that is, $u_1 = s$ and $\max_i v_i = s'$ where $I = [s, s']$. An irreducible diagram $\sigma \in \Sigma^n_I(\mathrm{irr})$ is called minimally irreducible whenever it has the following property: for any subdiagram $\sigma' \subset \sigma$, $\sigma' \in \Sigma^{n'}_I$, the diagram $\sigma \setminus \sigma'$ does not belong to $\Sigma^{n - n'}_I(\mathrm{irr})$. Intuitively, this means that either the subdiagram $\sigma'$ contains a boundary point ($s$ or $s'$), or the diagram $\sigma \setminus \sigma'$ is not irreducible. The set of minimally irreducible diagrams with $n$ pairs in the interval $I$ is denoted by $\Sigma^n_I(\mathrm{min.irr})$. The concepts of irreducible and minimally irreducible diagrams are illustrated in Fig. 1.

On the set $\Sigma^n_I$, we define the Lebesgue measure $d\sigma := \prod_{i=1}^n du_i\, dv_i$. On the set $\Sigma^n_I(\mathrm{irr})$ with $I = [s, s']$, the definition of the measure has to be modified as
\[ d\sigma := \prod_{i \neq 1} du_i \prod_{i \neq j} dv_i, \qquad \text{where } j \text{ is defined by } \max_i v_i = v_j = s'. \tag{5.10} \]
That is, the fixed times $u_1$ and $v_j$ are not integrated over.
Fig. 1. A diagram $\sigma \in \Sigma^6_{[0,t]}$ with domain $[u_1, v_6]$. It can be decomposed into two irreducible components $\sigma_1$ and $\sigma_2$ with respective domains $[u_1, v_2]$ and $[u_3, v_6]$. The first subdiagram is minimally irreducible on its domain, i.e. $\sigma_1 \in \Sigma^2_{[u_1, v_2]}(\mathrm{min.irr})$, while the second subdiagram is not, i.e. $\sigma_2 \notin \Sigma^4_{[u_3, v_6]}(\mathrm{min.irr})$. Indeed, one can remove the pair $(u_4, v_4)$ without destroying the irreducibility of that subdiagram. Alternatively, one could remove $(u_5, v_5)$.
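The notions of domain, irreducible component and minimal irreducibility from Sec. 5.1.2 can be sketched in a few lines of code. The block below is only an illustration: the function names and the numerical times are hypothetical, chosen to mimic the situation described in the caption of Fig. 1 (six pairs, two irreducible components, the first minimally irreducible on its domain and the second not). The minimality check only tries single-pair removals, which suffices because putting removed pairs back into an irreducible diagram that already covers the full domain keeps it irreducible.

```python
def decompose(pairs):
    """Split a diagram (list of (u, v) pairs with u <= v) into its irreducible
    components, ordered in time. Pairs whose closed intervals [u, v] overlap
    or touch lie in the same connected piece of Dom(sigma)."""
    components = []
    right_end = None
    for u, v in sorted(pairs):
        if right_end is None or u > right_end:
            components.append([(u, v)])      # gap in the domain: start a new component
            right_end = v
        else:
            components[-1].append((u, v))    # still connected to the current component
            right_end = max(right_end, v)
    return components

def domain(pairs):
    """Endpoints (min u, max v) of the smallest interval containing the diagram."""
    return (min(u for u, _ in pairs), max(v for _, v in pairs))

def is_irreducible(pairs):
    """A nonempty diagram is irreducible when its domain is connected."""
    return len(pairs) > 0 and len(decompose(pairs)) == 1

def is_minimally_irreducible(pairs):
    """True if removing any single pair either changes the domain or
    destroys irreducibility (single-pair removals are sufficient)."""
    if not is_irreducible(pairs):
        return False
    full = domain(pairs)
    for i in range(len(pairs)):
        rest = pairs[:i] + pairs[i + 1:]
        if rest and is_irreducible(rest) and domain(rest) == full:
            return False
    return True

# Hypothetical times (u_i, v_i), i = 1..6, arranged as in the caption of Fig. 1.
sigma = [(0.0, 1.0), (0.5, 2.0),
         (3.0, 4.5), (3.5, 5.5), (4.0, 6.0), (5.0, 7.0)]
sigma_1, sigma_2 = decompose(sigma)
```

Here `sigma_1` consists of the first two pairs and is minimally irreducible, while `sigma_2` consists of the last four pairs and is not, since dropping either its second or its third pair leaves an irreducible diagram with the same domain.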
5.1.3. Representation of the reduced evolution

Remark first that we can write $V^I_\gamma(\sigma)$ instead of $V^I_\gamma(\pi, \underline{t})$, as defined in (5.4), since by the construction in Sec. 5.1.2 there is a one-to-one mapping between $\sigma$ and $(\underline{t}, \pi)$. Hence, by copying (5.3), we can represent the reduced evolution as
\[ Z^t_{\lambda,\gamma} = \sum_{n \in \mathbb{Z}^+} \lambda^{2n} \int_{\Sigma^n_{[0,t]}} d\sigma\, V^{[0,t]}_\gamma(\sigma) \tag{5.11} \]
where the term corresponding to $n = 0$ on the right-hand side is defined to equal $U_t$. Next, we use the notion of diagrams $\sigma$ to decompose the operators $V^{[0,t]}_\gamma(\sigma)$ into products. Let $(\sigma_1, \ldots, \sigma_p)$ be the decomposition of a diagram $\sigma \in \Sigma^n_{[0,t]}$ into irreducible components. Define the times $t_1, \ldots, t_{2p}$ to be the boundaries of the domains of the irreducible components, i.e. $[t_{2i-1}, t_{2i}] = \operatorname{Dom} \sigma_i$ for $i = 1, \ldots, p$. Then
\[ V^{[0,t]}_\gamma(\sigma) = U_{t - t_{2p}} V^{[t_{2p-1}, t_{2p}]}_\gamma(\sigma_p)\, U_{t_{2p-1} - t_{2p-2}} \cdots U_{t_3 - t_2} V^{[t_1, t_2]}_\gamma(\sigma_1)\, U_{t_1}, \tag{5.12} \]
as can be checked from (5.4)–(5.5). Here, the essential observation is that due to the absence of any pairing between the times in $\sigma_i$ and $\sigma_{i+1}$, the correlation function in (5.5) factorizes. We can now, still formally, rewrite (5.3) as a sum over collections of irreducible diagrams. We introduce first
\[ W^t_{\lambda,\gamma} := \sum_{n \ge 1} \lambda^{2n} \int_{\Sigma^n_{[0,t]}(\mathrm{irr})} d\sigma\, V^{[0,t]}_\gamma(\sigma), \tag{5.13} \]
and we remark that the definition of $W^t_{\lambda,\gamma}$ allows for a shift of time on the right-hand side, that is
\[ W^t_{\lambda,\gamma} = \sum_{n \ge 1} \lambda^{2n} \int_{\Sigma^n_I(\mathrm{irr})} d\sigma\, V^I_\gamma(\sigma), \qquad \text{for any } I = [s, s + t],\ s \in \mathbb{R}. \tag{5.14} \]
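Then, by this time-translation invariance and (5.12), the expression (5.3) gets rewritten as
\[ Z^t_{\lambda,\gamma} = \sum_{m \in 2\mathbb{Z}^+} \int_{0 \le t_1 \le \cdots \le t_m \le t} dt_1 \cdots dt_m\, \Big( U_{t - t_m} W^{t_m - t_{m-1}}_{\lambda,\gamma} U_{t_{m-1} - t_{m-2}} \cdots U_{t_3 - t_2} W^{t_2 - t_1}_{\lambda,\gamma} U_{t_1} \Big). \tag{5.15} \]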
Indeed, instead of summing over all diagrams, we now sum over all sequences of irreducible diagrams. The term on the right-hand side of (5.15) corresponding to $m = 0$ is again understood to be equal to $U_t$.

5.2. Estimates on the Dyson expansion

In this section, we prove some useful a priori estimates on the Dyson expansion. We will establish a bound on the integral over all diagrams up to a given time $t$ (Statement 2 of Lemma 5.1). This allows us to justify all expansions that have been stated up to now. In particular, it proves Lemma 2.2 and Proposition 4.1. We also state a bound on the integral over irreducible diagrams (Statement 3 of Lemma 5.1) that will be crucial in Sec. 5.3. In the language of Sec. 5.3, Statement 3 of Lemma 5.1 bounds the contributions of non-ladder diagrams, showing that they are $O(\lambda^4)$ and exponentially decaying in time.

Lemma 5.1. Define
\[ h_\gamma(t) := 4 \sum_{k=1}^m \|V_k\| \max\{|\psi_k(t)|, |\psi_k(t + \gamma)|, |\psi_k(t - \gamma)|\}, \qquad t \ge 0, \tag{5.16} \]
\[ C^1_\gamma(z) := \int_{\mathbb{R}^+} dw\, h_\gamma(w) e^{-w \operatorname{Re} z}, \qquad C^2_\gamma(z) := \int_{\mathbb{R}^+} dw \int_{\mathbb{R}^+} dw'\, h_\gamma(w + w') e^{-w \operatorname{Re} z}, \qquad z \in \mathbb{C}. \tag{5.17} \]
By Assumption 1, the function $h_\gamma(t)$ decays as $e^{-g_R |t|}$ as $|t| \nearrow \infty$ and $C^{1,2}_\gamma(z)$ are finite as long as $\operatorname{Re} z > -g_R$. The following estimates hold:

(1) For all $\sigma \in \Sigma^n_{[0,t]}$,
\[ \|V^{[0,t]}_\gamma(\sigma)\| \le \prod_{i=1}^n h_\gamma(v_i - u_i) \tag{5.18} \]
where $(u_i, v_i)$ are the pairs of the diagram $\sigma$, as described in Sec. 5.1.2.

(2) For all $t > 0$,
\[ \int_{\Sigma^n_{[0,t]}} d\sigma\, \|V^{[0,t]}_\gamma(\sigma)\| \le \frac{t^n}{n!} (\|h_\gamma\|_1)^n, \qquad \|h_\gamma\|_1 := \int_{\mathbb{R}^+} dw\, h_\gamma(w). \tag{5.19} \]

(3) Let $z \in \mathbb{C}$, then
\[ \Big\| \int_{\mathbb{R}^+} dt\, e^{-tz} \sum_{n \ge 2} \lambda^{2n} \int_{\Sigma^n_{[0,t]}(\mathrm{irr})} d\sigma\, V^{[0,t]}_\gamma(\sigma) \Big\| \le \lambda^4\, \frac{C^1_\gamma(a)\, C^2_\gamma(a)}{1 - \lambda^2 C^2_\gamma(a)}, \qquad a := \operatorname{Re} z - \lambda^2 \|h_\gamma\|_1, \tag{5.20} \]
provided that $|\lambda^2 C^2_\gamma(a)| < 1$ and $a > -g_R$.
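Proof. Statement (1) is immediate from expression (5.4) and the estimates
\[ \|U_t\| \le 1, \qquad \|I_{k,l}\| \le \|V_k\|_{\mathcal{B}(\mathcal{H}_S)}. \tag{5.21} \]
The sum over $\underline{k}, \underline{l}$ produces the factor $(2m)^{2n}$ that has been absorbed into the definition of $h_\gamma$. (Recall that $m$ is the number of reservoirs.) To show Statement (2), we employ Statement (1) to estimate
\begin{align}
\int_{\Sigma^n_{[0,t]}} d\sigma\, \|V^{[0,t]}_\gamma(\sigma)\| \tag{5.22} \\
&\le \int_{0 < u_1 < \cdots < u_n < t} du_1 \cdots du_n \int_{v_i > u_i} dv_1 \cdots dv_n \prod_{i=1}^n h_\gamma(v_i - u_i) \tag{5.23} \\
&\le \int_{0 < u_1 < \cdots < u_n < t} du_1 \cdots du_n\, (\|h_\gamma\|_1)^n = \frac{t^n}{n!} (\|h_\gamma\|_1)^n. \tag{5.24}
\end{align}
Statement (3) is checked as follows. First, we use Statement (1) to bound
\[ \int_{\Sigma^n_{[0,t]}(\mathrm{irr})} d\sigma\, \lambda^{2n} \|V^{[0,t]}_\gamma(\sigma)\| \le \int_{\Sigma^n_{[0,t]}(\mathrm{irr})} d\sigma\, \chi(\sigma) \tag{5.25} \]
where $\chi(\sigma) := \lambda^{2n} \prod_{i=1}^n h_\gamma(v_i - u_i)$. Next, we note that for each irreducible diagram $\sigma \in \Sigma^n_{[0,t]}(\mathrm{irr})$, we can find a subdiagram $\sigma' \subset \sigma$ such that $\sigma'$ is minimally irreducible, i.e. $\sigma' \in \Sigma^{n'}_{[0,t]}(\mathrm{min.irr})$ with $n' \le n$. Note that the choice of subdiagram $\sigma'$ is not necessarily unique. Conversely, given a minimally irreducible diagram $\sigma' \in \Sigma^{n'}_{[0,t]}(\mathrm{min.irr})$, we can add any set of pairs $\sigma'' \in \Sigma^{n''}_{[0,t]}$ to $\sigma'$, thereby creating a new irreducible diagram $\sigma := \sigma' \cup \sigma'' \in \Sigma^{n' + n''}_{[0,t]}(\mathrm{irr})$. By these considerations, we easily deduce
\[ \sum_{n \ge 2} \int_{\Sigma^n_{[0,t]}(\mathrm{irr})} d\sigma\, \chi(\sigma) \le \Big( \sum_{n \ge 2} \int_{\Sigma^n_{[0,t]}(\mathrm{min.irr})} d\sigma\, \chi(\sigma) \Big) \Big( 1 + \sum_{p \in \mathbb{N}} \int_{\Sigma^p_{[0,t]}} d\sigma'\, \chi(\sigma') \Big). \tag{5.26} \]
The second factor in (5.26) is bounded by $e^{t \lambda^2 \|h_\gamma\|_1}$, by Statement (2). To deal with the first factor in (5.26), we will prove the bound
\[ \int_{\mathbb{R}^+} dt\, e^{-t \operatorname{Re} z} \int_{\Sigma^n_{[0,t]}(\mathrm{min.irr})} d\sigma\, \chi_t(\sigma) \le \lambda^2 C^1_\gamma(\operatorname{Re} z) \big(\lambda^2 C^2_\gamma(\operatorname{Re} z)\big)^{n-1}, \tag{5.27} \]
such that the proof of Statement (3) is obtained by summing (5.27) over $n \ge 2$ and changing $\operatorname{Re} z$ to $a = \operatorname{Re} z - \lambda^2 \|h_\gamma\|_1$ to compensate for the second factor in (5.26). The proof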
of (5.27) is an explicit calculation, which uses the fact that for $\sigma \in \Sigma^n_{[0,t]}(\mathrm{min.irr})$, the relative order of the times $u_i, v_i$ is fixed as follows:
\[ 0 = u_1 \le u_2 \le v_1 \le u_3 \le v_2 \le u_4 \le \cdots \le v_{n-2} \le u_n \le v_{n-1} \le v_n = t. \]
We have hence
\[ \int_{\mathbb{R}^+} dt\, e^{-tz} \int_{\Sigma^n_{[0,t]}(\mathrm{min.irr})} d\sigma\, \chi_t(\sigma) = \lambda^{2n} \int_0^\infty dv_1\, h_\gamma(v_1 - u_1) e^{-z(v_1 - u_1)} \int_0^{v_1} du_2 \int_{v_1}^\infty dv_2 \cdots \tag{5.28} \]
\[ \cdots \int_{v_{n-3}}^{v_{n-2}} du_{n-1} \int_{v_{n-2}}^\infty dv_{n-1}\, e^{-z(v_{n-1} - v_{n-2})} h_\gamma(v_{n-1} - u_{n-1}) \times \int_{v_{n-2}}^{v_{n-1}} du_n \int_{v_{n-1}}^\infty dv_n\, e^{-z(v_n - v_{n-1})} h_\gamma(v_n - u_n). \tag{5.29} \]
Performing the change of variables $w_i = v_i - v_{i-1}$ and $w_i' = v_{i-1} - u_i$ (for $i > 1$) and extending the range of integration of $w_i'$ to $\mathbb{R}^+$, the above expression factorizes and one obtains the bound (5.27).

At this moment, we are finally ready to prove Lemma 2.2.

Proof of Lemma 2.2 and Proposition 4.1. By Statement (2) of Lemma 5.1, it is clear that the series on the right-hand side of (2.19), rewritten as (5.11), converges absolutely. Furthermore, for a finite-volume approximation to our model, as described in Sec. 4, we can associate an analogous expansion as in (2.19) and (5.3). Indeed, the series in (5.3) depends on the reservoirs only through the functions $\psi_k$ and hence the analogous expression for finite-volume reservoirs is obtained by replacing $\psi_k$ with $\psi^n_k$. Then, Proposition 4.1 follows from Assumption 4 by the dominated convergence theorem. The claims in (2.20), (2.21) and (2.22) can also be established from Proposition 4.1. Indeed, they hold trivially for the finite-volume approximations, as consequences of the unitarity of the dynamics on the full system.

5.3. The resolvent expansion

In this section, we employ the formula (5.15) to calculate the Laplace transform of $Z^t_{\lambda,\gamma}$. Further, we also classify the diagrams as follows: in the expression (5.12) we view the irreducible (sub)diagrams $\sigma_i \in \Sigma^n_{[0,t]}(\mathrm{irr})$ with $n \ge 2$ as "excitations". In contrast, if all $\sigma_i$ consist of a single pair, hence $n = 1$, then $\sigma = \cup_i \sigma_i$ is called a "ladder" diagram. The operators $V^{[0,t]}_\gamma(\sigma)$ corresponding to these ladder diagrams provide the leading contribution to the dynamics for small times. They are the only terms that do not vanish in the weak-coupling limit, cf. Sec. 2.5 and Proposition 3.1.
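We define separately the Laplace transforms of the irreducible "excitation" diagrams ($R^{ex}_{\lambda,\gamma}$) and the irreducible "ladder" diagram ($M_\gamma$):
\[ R^{ex}_{\lambda,\gamma}(z) := \int_{\mathbb{R}^+} dt\, e^{-tz} \sum_{n \ge 2} \lambda^{2n} \int_{\Sigma^n_{[0,t]}(\mathrm{irr})} d\sigma\, V^{[0,t]}_\gamma(\sigma), \tag{5.30} \]
\[ M_\gamma(z) := \int_{\mathbb{R}^+} dt\, e^{-tz}\, V^{[0,t]}_\gamma(0, t), \tag{5.31} \]
where the pair $\sigma \equiv (0, t)$ in (5.31) is the only element of $\Sigma^1_{[0,t]}(\mathrm{irr})$. In other words, the sum of $R^{ex}_{\lambda,\gamma}(z)$ and $\lambda^2 M_\gamma(z)$ is the Laplace transform of the sum of irreducible diagrams $W^t_{\lambda,\gamma}$, defined in (5.13):
\[ \int_{\mathbb{R}^+} dt\, e^{-tz}\, W^t_{\lambda,\gamma} = \lambda^2 M_\gamma(z) + R^{ex}_{\lambda,\gamma}(z). \tag{5.32} \]
A priori, these formulas are only valid for large $\operatorname{Re} z$, but in the proof of Lemma 5.2, we will extend them by analytic continuation. The following lemma shows that $R^{ex}_{\lambda,\gamma}(z)$ is "small" with respect to $\lambda^2 M_\gamma(z)$. Its proof relies almost solely on the estimates obtained in Lemma 5.1.

Lemma 5.2. Suppose that Assumption 1 in Sec. 2 holds. Then,

(1) There are positive constants $\delta_\gamma > 0$, $\delta_\lambda > 0$ and $g > 0$, such that the operators $M_\gamma(z)$ and $R^{ex}_{\lambda,\gamma}(z)$, originally defined in (5.30), (5.31) for $\operatorname{Re} z$ large enough, can be extended (footnote d) to analytic functions of $(z, \lambda, \gamma)$ in the region defined by $\operatorname{Re} z > -g$, $|\lambda| \le \delta_\lambda$, $|\gamma| \le \delta_\gamma$, and
\[ \sup_{\gamma, z} \|M_\gamma(z)\| = C < \infty, \tag{5.33} \]
\[ \sup_{\gamma, z} \|R^{ex}_{\lambda,\gamma}(z)\| = O(\lambda^4) \quad \text{as } \lambda \searrow 0, \tag{5.34} \]
where the sup is over the region $|\gamma| \le \delta_\gamma$, $\operatorname{Re} z > -g$.

(2) For $(z, \lambda, \gamma)$ satisfying $\|(z - i[H_S, \cdot])^{-1}\| \times \|\lambda^2 M_\gamma(z) + R^{ex}_{\lambda,\gamma}(z)\| < 1$,
\[ R_{\lambda,\gamma}(z) = \big(z - i[H_S, \cdot] - \lambda^2 M_\gamma(z) - R^{ex}_{\lambda,\gamma}(z)\big)^{-1}. \tag{5.35} \]

(3) Recall the deformed Lindblad generators $L_\gamma$ as defined in Sec. 2.5. They satisfy
\[ L_\gamma = \sum_{\varepsilon \in \operatorname{sp}[H_S, \cdot]} 1_\varepsilon([H_S, \cdot])\, M_\gamma(i\varepsilon)\, 1_\varepsilon([H_S, \cdot]) \tag{5.36} \]
where $1_\varepsilon([H_S, \cdot])$ are spectral projections, as in (2.35).

In general, $g$ can be made bigger by restricting $\delta_\gamma$ in Statement (1), since necessarily $g \le g_R$, with $g_R$ as defined in Assumption 1. This freedom will not be used, which means that we will not try to optimize the region $D$ in Theorem 3.2.

Footnote d: This extension is actually not unique in $\lambda$, because our definitions involve only $\lambda^2$ with $\lambda \in \mathbb{R}$.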
Proof. Let $h_\gamma$, $C^1_\gamma(\cdot)$, $C^2_\gamma(\cdot)$ and $a \equiv a(z, \lambda)$ be as in Lemma 5.1, then
\[ \|M_\gamma(z)\| \le C^1_\gamma(\operatorname{Re} z), \qquad \|R^{ex}_{\lambda,\gamma}(z)\| \le \lambda^4\, \frac{C^1_\gamma(a)\, C^2_\gamma(a)}{1 - \lambda^2 C^2_\gamma(a)}, \tag{5.37} \]
where the bound on $M_\gamma(z)$ follows from Statement (1) of Lemma 5.1 and the bound on $R^{ex}_{\lambda,\gamma}(z)$ follows from Statement (3) of Lemma 5.1. The bounds (5.33) and (5.34) are now immediate and the analyticity follows from the Vitali convergence theorem. We proceed to Statement (2). To simplify the following calculations, we abbreviate
\[ R^{irr}_{\lambda,\gamma}(z) := R^{ex}_{\lambda,\gamma}(z) + \lambda^2 M_\gamma(z), \qquad R_S(z) := (z - i[H_S, \cdot])^{-1}. \tag{5.38} \]
Then
\begin{align}
R_{\lambda,\gamma}(z) &:= \int_{\mathbb{R}^+} dt\, e^{-tz}\, Z^t_{\lambda,\gamma} \tag{5.39} \\
&= \sum_{n \in \mathbb{Z}^+} R_S(z) \big(R^{irr}_{\lambda,\gamma}(z) R_S(z)\big)^n \tag{5.40} \\
&= R_S(z) \big(1 - R^{irr}_{\lambda,\gamma}(z) R_S(z)\big)^{-1} = \big(z - i[H_S, \cdot] - R^{irr}_{\lambda,\gamma}(z)\big)^{-1} \tag{5.41} \\
&= \big(z - i[H_S, \cdot] - \lambda^2 M_\gamma(z) - R^{ex}_{\lambda,\gamma}(z)\big)^{-1}. \tag{5.42}
\end{align}
The second equality follows by Laplace transforming (5.15). Note in particular that $R^{irr}_{\lambda,\gamma}(z)$ is the Laplace transform of $W^t_{\lambda,\gamma}$. The third equality follows by summing a geometric series, using the assumption that $\|R^{irr}_{\lambda,\gamma}(z) R_S(z)\| < 1$. Hence, Statement (2) of Lemma 5.2 is proven. The proof of Statement (3) is an explicit calculation; starting from the expressions (5.4)–(5.5) and (5.31), we obtain
\[ M_\gamma(z) = \sum_{k = 1, \ldots, m}\ \sum_{l, l' \in \{L, R\}} \int_{\mathbb{R}^+} dt\, e^{-tz}\, I_{k,l'}\, U_t\, I_{k,l} \begin{cases} \psi_k(-t) & l = l' = L \\ \psi_k(t) & l = l' = R \\ \psi_k(t + \gamma) & l = L,\ l' = R \\ \psi_k(-(t - \gamma)) & l = R,\ l' = L, \end{cases} \tag{5.43} \]
from which Statement (3) follows upon using the explicit expressions for $L_\gamma$ in Sec. 2.5.

5.4. Spectral analysis of Laplace-transformed reduced evolution

In this section, we collect the proof of our main results, Theorems 3.2 and 3.3. We begin by stating an auxiliary result, Theorem 5.3, which follows from Lemma 5.2 by a standard application of perturbation theory and the Laplace transform. To separate this reasoning from the preceding estimates, we postpone it to the Appendix. Recall the definition of the eigenvalue $f_L(\gamma)$ and its associated spectral projector $P_{L_\gamma}$ from Proposition 2.3.
Theorem 5.3. Suppose that Assumptions 1 and 2 in Sec. 2 hold. Then, there is a domain $D \subset \mathbb{C} \times \mathbb{C}^m$ with $(0, 0) \in D$ such that there is a function $f(\lambda, \gamma)$ and a rank 1 operator $P^{\lambda,\gamma}$, both analytic on $D$ and satisfying
\[ P^{\lambda,\gamma} = P_{L_\gamma} + O(\lambda^2), \qquad f(\lambda, \gamma) = \lambda^2 f_L(\gamma) + O(\lambda^4) \tag{5.44} \]
and
\[ Z^t_{\lambda,\gamma} - e^{t f(\lambda,\gamma)} P^{\lambda,\gamma} = O\big(e^{(f(\lambda,\gamma) - \lambda^2 g) t}\big), \qquad t \nearrow \infty,\ \lambda \downarrow 0, \tag{5.45} \]
for some decay rate $g > 0$.

Once one accepts the conclusions of Lemma 5.2 and Proposition 2.3, the proof of Theorem 5.3 follows by standard analytic perturbation theory and the inverse Laplace transform. For the sake of clarity, this step has been abstracted in Lemma A.1 in the Appendix. To apply this lemma in our context, one should substitute
\begin{align}
A(z) &\to R_{\lambda,\gamma}(z) \tag{5.46} \\
A_2(z) &\to M_\gamma(z) \tag{5.47} \\
A_{>2}(z, \lambda) &\to R^{ex}_{\lambda,\gamma}(z) \tag{5.48} \\
N &\to L_\gamma. \tag{5.49}
\end{align}
Note that almost identical reasoning was used in [40]. Our main result, Theorem 3.2, follows immediately from Theorem 5.3.

Proof of Theorem 3.2. We calculate, using Theorem 5.3, as $\lambda \searrow 0$ and $t \uparrow \infty$,
\begin{align}
\frac{1}{t} \log \rho_S[Z^t_{\lambda,\gamma}(1)] &= \frac{1}{t} \log \rho_S\big[e^{t f(\lambda,\gamma)} P^{\lambda,\gamma}(1) + O\big(e^{(f(\lambda,\gamma) - \lambda^2 g) t}\big)\big] \tag{5.50} \\
&= f(\lambda, \gamma) + \frac{1}{t} \log \rho_S\big[P^{\lambda,\gamma}(1) + O\big(e^{-\lambda^2 g t}\big)\big] \tag{5.51} \\
&= f(\lambda, \gamma) + \frac{1}{t} \log \rho_S\big[P_{L_\gamma}(1) + O(\lambda^2) + O\big(e^{-\lambda^2 g t}\big)\big]. \tag{5.52}
\end{align}
By Proposition 2.3 and the comments following it, $P_{L_\gamma}(1)$ is a strictly positive operator. Hence, the argument of the log in (5.52) is bounded away from zero, uniformly in $t$. This concludes the proof of Theorem 3.2.

Proof of Theorem 3.3. One can easily deduce Theorem 3.3 from Theorem 3.2 and the KMS-condition for the decoupled reservoir states. However, to remain in the style of the present paper, we deduce it from the finite-volume approximations to our model, as presented in Sec. 4.
Let $\beta \equiv (\beta_1, \ldots, \beta_m) \in \mathbb{R}^m$ and choose $\rho_S$ to be the trace state, cf. (1.17), then we argue that
\[ \rho_S(Z^t_{\lambda,\gamma}(1)) = \rho_S(Z^t_{\lambda,-\gamma - i\beta}(1)). \tag{5.53} \]
Indeed, for the finite-volume approximations, this statement is Eq. (1.22), and hence, by Proposition 4.1, it is also true for our model. Theorem 3.3 now follows from the existence of the limit $t^{-1} \log \rho_S(Z^t_{\lambda,\gamma}(1))$, as $t \uparrow \infty$, and its independence of $\rho_S$ (Theorem 3.2). The alternative proof proceeds via checking (5.53) through the KMS-relation (footnote e).
Appendix. Spectral Analysis

Consider a function $\mathbb{R}^+ \ni t \mapsto V(t)$ where $V(t)$ are elements of a Banach space and $\sup_t e^{-tm} \|V(t)\| < \infty$ for some $m > 0$. The Laplace transform
\[ A(z) := \int_{\mathbb{R}^+} dt\, e^{-tz}\, V(t) \tag{A.1} \]
is well-defined for $\operatorname{Re} z > m$ and it follows that
\[ V(t) = \frac{1}{2\pi i} \int_{\Gamma_\to} dz\, e^{zt} A(z), \qquad \text{with } \Gamma_\to := m' + i\mathbb{R} \text{ for any } m' > m, \tag{A.2} \]
where the integral is in the sense of improper Riemann integrals. We will state certain assumptions which allow us to continue $A(z)$ downwards in the complex plane and obtain bounds on $V(t)$.

Lemma A.1. Let, for $\operatorname{Re} z$ high enough,
\[ A(z) := \big(z - iB - \lambda^2 A_2(z) - A_{>2}(z, \lambda)\big)^{-1} \tag{A.3} \]
and assume the following conditions are fulfilled:

(1) $B$ is bounded and has purely discrete spectrum consisting of semisimple eigenvalues on the real axis, including the eigenvalue 0.

(2) The operators $A_2(z)$ and $A_{>2}(z, \lambda)$ are analytic functions in the domain $\operatorname{Re} z > -g_A$ for some $g_A > 0$. Moreover
\[ \sup_{\operatorname{Re} z > -g_A} \|A_{>2}(z, \lambda)\| = O(\lambda^4), \qquad \lambda \searrow 0, \tag{A.4} \]
\[ \sup_{\operatorname{Re} z > -g_A} \|A_2(z)\| = C < \infty. \tag{A.5} \]

Footnote e: Of course, the two possible proofs are essentially the same proof. To derive (1.22), one also uses the KMS condition, though only in finite volume.
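(3) Let
$$N := \bigoplus_{b \in \mathrm{sp}B} N_b, \qquad N_b := 1_b(B) A_2(ib) 1_b(B), \tag{A.6}$$
and assume that the operator $N_0$ has a simple eigenvalue $f_N$, elevated by a gap $g_N$ above the rest of the spectrum of $N_0$ and $N$:
$$\mathrm{sp}N = \{f_N\} \cup \Omega_N \quad \text{and} \quad \mathrm{dist}(\mathrm{Re}\, f_N, \mathrm{Re}\, \Omega_N) \geq g_N. \tag{A.7}$$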
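Then, there is a $\lambda_0$ such that for $|\lambda| \leq \lambda_0$, there is a complex number $f \equiv f(\lambda)$, a rank-one operator $P \equiv P(\lambda)$, bounded operators $R(t) \equiv R(t, \lambda)$ and a decay rate $g > 0$, such that
$$V(t) = P e^{ft} + R(t) e^{(f - g\lambda^2)t} \tag{A.8}$$
where, as $|\lambda| \searrow 0$,
$$f(\lambda) - \lambda^2 f_N = O(\lambda^4) \tag{A.9}$$
$$P(\lambda) - 1_{f_N}(N) = O(\lambda^2) \tag{A.10}$$
$$\sup_{t \in \mathbb{R}^+} \|R(t, \lambda)\| = O(1) \tag{A.11}$$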
with $1_{f_N}(N)$ the spectral projection of $N$ associated to the eigenvalue $f_N$. If, in addition, $A_2$ and $A_{>2}$ depend analytically on a parameter $\gamma$ in a complex domain $D \subset \mathbb{C}$, such that the estimates (A.4), (A.5) and (A.7) hold uniformly in $\gamma \in D$, then (A.8) holds with $f$, $P$ and $R$ analytic in $\gamma$ and the estimates (A.9)–(A.11) are satisfied uniformly in $\gamma \in D$.

We prove Lemma A.1 below.

Lemma A.2. The singular points of $A(z)$ in the domain $\mathrm{Re}\, z \geq -g_A$ lie within a distance of order $O(\lambda^4)$ of the spectrum of $iB + \lambda^2 N$ (provided that there are any singular points at all).

Proof. Standard perturbation theory implies that the spectrum of the operator
$$iB + \lambda^2 A_2(z) + A_{>2}(z, \lambda) \tag{A.12}$$
lies at a distance of $O(\lambda^2)$ from the spectrum of $iB$. Here and in what follows, the estimates in powers of $\lambda$ are uniform for $\mathrm{Re}\, z \geq -g_A$. Let $1^0_b \equiv 1_b(B)$ be the spectral projections of $B$ on the eigenvalue $b$. As long as $\lambda$ is small enough, there is an invertible operator $U \equiv U(\lambda, z)$ satisfying $U - 1 = O(\lambda^2)$ and such that the projections
$$1_b := U 1^0_b U^{-1}, \qquad b \in \mathrm{sp}B \tag{A.13}$$
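are spectral projections of the operator (A.12). It follows that the spectral problem for (A.12) is equivalent to the spectral problem for
$$\sum_b U^{-1} 1_b \left( iB + \lambda^2 A_2(z) + A_{>2}(z, \lambda) \right) 1_b U \tag{A.14}$$
$$= \sum_b \left( ib 1^0_b + \lambda^2 N_b + A_{ex,b}(z, \lambda) \right) \tag{A.15}$$
where
$$\begin{aligned}
A_{ex,b}(z, \lambda) := {} & 1^0_b U^{-1} (iB) U 1^0_b - ib 1^0_b && (O(\lambda^4)) \\
& + \lambda^2 1^0_b U^{-1} N U 1^0_b - \lambda^2 N_b && (O(\lambda^4)) \\
& + \lambda^2 1^0_b U^{-1} (A_2(ib) - N) U 1^0_b && (O(\lambda^4)) \\
& + \lambda^2 1^0_b U^{-1} (A_2(z) - A_2(ib)) U 1^0_b && (O(\lambda^2 |z - ib|)) \\
& + 1^0_b U^{-1} A_{>2}(z, \lambda) U 1^0_b && (O(\lambda^4)).
\end{aligned} \tag{A.16}$$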
The estimates in powers of $\lambda$ are obtained by using $U - 1 = O(\lambda^2)$, the bounds (A.4), (A.5) and the analyticity of $A_2$. When $z$ is chosen at a distance $O(\lambda^2)$ from $ib$, then all terms in (A.16) are $O(\lambda^4)$. The claim now follows by simple perturbation theory applied to the expression in (A.15).
Lemma A.3. The function $A(z)$ has exactly one singularity at a distance $O(\lambda^4)$ from $\lambda^2 f_N$. This singularity is called $f \equiv f(\lambda)$. The corresponding residue $P$ is a rank-one operator satisfying
$$P - 1_{f_N}(N) = O(\lambda^2). \tag{A.17}$$
Proof. By Lemma A.2, there can be at most one singularity. We prove below that there is at least one. By the reasoning in the proof of Lemma A.2 and the fact that the eigenvector corresponding to $f_N$ belongs to $\mathrm{Ran}\, 1^0_{b=0}$ (see Condition (3) of Lemma A.1), it suffices to study the singularities of the function
$$(z - \lambda^2 N_0 - A_{ex,0}(z, \lambda))^{-1}. \tag{A.18}$$
Consider the contour $\Gamma_f \equiv \Gamma_f(\lambda)$, which is a circle with center $\lambda^2 f_N$ and radius $\lambda^2 r$, with $g_N > 2r > 0$. Clearly, for $\lambda$ small enough, all spectrum of $\lambda^2 N_0$ lies outside the contour $\Gamma_f$, except for the eigenvalue $\lambda^2 f_N$. The contour integral of $(z - \lambda^2 N)^{-1}$ along $\Gamma_f$ equals the spectral projection corresponding to $f_N$. We estimate
$$\int_{\Gamma_f} dz \left[ (z - \lambda^2 N_0 - A_{ex,0}(z, \lambda))^{-1} - (z - \lambda^2 N_0)^{-1} \right] \tag{A.19}$$
$$= \int_{\Gamma_f} dz\, (z - \lambda^2 N_0)^{-1} A_{ex,0}(z, \lambda) (z - \lambda^2 N_0 - A_{ex,0}(z, \lambda))^{-1} \tag{A.20}$$
$$= \int_{\Gamma_f} dz\, (\lambda^{-2} c(r))^2 O(\lambda^4), \qquad c(r) := \sup_{|z - f_N| = r} \|(z - N_0)^{-1}\|, \quad \lambda \searrow 0. \tag{A.21}$$
The last estimate is in norm sense and it follows from the bound in (A.16). The expression (A.21) is $O(\lambda^2)$ as $\lambda \searrow 0$ since the circumference of the contour $\Gamma_f$ is
$2\pi r \lambda^2$. From the fact that the contour integral of (A.18) does not vanish, we conclude that $A(z)$ has at least one singularity inside $\Gamma_f$.

The claim about the residue is most easily seen in an abstract setting. Let $F(z)$ be a Banach-space valued analytic function in some open domain containing 0, and such that $0 \in \mathrm{sp}F(0)$ is an isolated eigenvalue. We have hence the Taylor expansion
$$F(z) = \sum_n z^n F_n, \qquad F_n := \frac{1}{n!} F^{(n)}(0), \qquad 0 \in \mathrm{sp}F_0. \tag{A.22}$$
If $F_1 - 1$ is small enough, then also $F_1^{-1} F_0$ has 0 as an isolated eigenvalue. We denote the corresponding spectral projection by $1_0(F_1^{-1} F_0)$ and we calculate
$$\mathrm{Res}(F(z)^{-1}) = \mathrm{Res}(F_0 + z F_1)^{-1} = (\mathrm{Res}(F_1^{-1} F_0 + z)^{-1}) F_1^{-1} = 1_0(F_1^{-1} F_0) F_1^{-1}. \tag{A.23}$$
The last expression is clearly a rank-one operator. In the case at hand, $F_1^{-1} = 1 + O(\lambda^2)$, which yields (A.17).

We proceed to the proof of Lemma A.1. First, we fix closed contours $\Gamma_f$ and $\Gamma_b$, for $b \in \mathrm{sp}B$, and a horizontal contour $\Gamma'_\to$ (see also Fig. 2):

• The contour $\Gamma_f$ is as described in Lemma A.3, with $r \equiv g_N/3$. In particular, it encircles the point $f$ but no other singular points of $A(z)$.
• The contours $\Gamma_b$ are such that, for $b \neq 0$,
$$|\lambda|^2 g_N/4 \leq \mathrm{dist}(\Gamma_b, ib + \lambda^2 \mathrm{sp}N_b) \leq |\lambda|^2 g_N/3 \tag{A.24}$$
and, for $b = 0$,
$$|\lambda|^2 g_N/4 \leq \mathrm{dist}(\Gamma_b, \lambda^2 (\mathrm{sp}N_0 \setminus \{f_N\})) \leq |\lambda|^2 g_N/3. \tag{A.25}$$
• The contour $\Gamma'_\to$ is given by $\Gamma'_\to := -g_A + i\mathbb{R}$.

We assume $\lambda$ to be small enough such that the contour $\Gamma'_\to$ lies entirely below the contours $\Gamma_f$ and $\Gamma_b$. By Lemma A.2, we know that all singularities of $A(z)$ in the region $\mathrm{Re}\, z > -g_A$ lie in the interior of the contours $\Gamma_f$ and $\Gamma_b$. Hence, we can deform contours as follows:
$$V(t) = \frac{1}{2\pi i} \int_{\Gamma_\to} dz\, e^{zt} A(z) \tag{A.26}$$
$$= \frac{1}{2\pi i} \int_{\Gamma_f} dz\, e^{zt} A(z) + \sum_b \frac{1}{2\pi i} \int_{\Gamma_b} dz\, e^{zt} A(z) + \frac{1}{2\pi i} \int_{\Gamma'_\to} dz\, e^{zt} A(z). \tag{A.27}$$
The first term in (A.27) yields $e^{tf} P$. For the second term, we obtain
$$\text{second term of (A.27)} = O\left( e^{\lambda^2 t (f_N - (g_N + O(\lambda^2)))} \right), \qquad \lambda \searrow 0, \quad t \uparrow \infty \tag{A.28}$$
by using Lemma A.2 and straightforward estimates.
Fig. 2. The (rotated) complex plane. The black dots indicate the spectrum of $iB + \lambda^2 N$ (which need not be discrete). The upper dot is the eigenvalue $f_N$. In the picture, we have assumed that the spectrum of $B$ consists of three semisimple eigenvalues: $0$, $b_1$, $b_{-1}$. The gray regions contain the possible singularities of the function $A(z)$. These singularities lie at $O(\lambda^4)$ from the spectrum of $iB + \lambda^2 N$ and in the region $\mathrm{Re}\, z < -g_A$. The integration contours $\Gamma_\to$, $\Gamma'_\to$, $\Gamma_f$ and $\Gamma_b$ are drawn in dashed lines.
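Before turning to the third term, it may help to see the decomposition (A.8) in the simplest possible situation. The following worked example is only an illustrative sketch under assumptions that are not taken from the paper: $B = 0$ on $\mathbb{C}^2$, a constant $A_2(z) \equiv N$ with real diagonal entries, and $A_{>2} \equiv 0$. In this toy case all $O(\lambda^4)$ corrections vanish and (A.8) holds exactly with $f = \lambda^2 f_N$, $R(t) = 1 - P$ and $g = g_N$.

```latex
% Toy case (assumption for illustration only):
% B = 0 on C^2, A_2(z) = N constant, A_{>2} = 0, f_N real, g_N > 0.
\[
  N = \begin{pmatrix} f_N & 0 \\ 0 & f_N - g_N \end{pmatrix},
  \qquad
  P = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},
\]
% The resolvent (A.3) has exactly two simple poles:
\[
  A(z) = (z - \lambda^2 N)^{-1}
       = \frac{P}{z - \lambda^2 f_N}
       + \frac{1 - P}{z - \lambda^2 (f_N - g_N)},
\]
% and inverting the Laplace transform as in (A.2) picks up both residues:
\[
  V(t) = e^{t \lambda^2 N}
       = P\, e^{\lambda^2 f_N t}
       + (1 - P)\, e^{(\lambda^2 f_N - g_N \lambda^2) t}.
\]
```

Here the residue at the pole $z = \lambda^2 f_N$ is the rank-one projection $P = 1_{f_N}(N)$, in line with (A.10), and the remainder is suppressed by the relative factor $e^{-g_N \lambda^2 t}$.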
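The third term of (A.27), i.e. the integral along the lower contour $\Gamma'_\to = -g_A + i\mathbb{R}$, is split as follows:
$$\int_{\Gamma'_\to} dz\, e^{zt} A(z) \tag{A.29}$$
$$= \int_{\Gamma'_\to} dz\, e^{zt} (z - iB - \lambda^2 N)^{-1} \tag{A.30}$$
$$\quad + \int_{\Gamma'_\to} dz\, e^{zt} (z - iB - \lambda^2 N)^{-1} (A_{>2} + \lambda^2 A_2 - \lambda^2 N) \times (z - iB - \lambda^2 N - (A_{>2} + \lambda^2 A_2 - \lambda^2 N))^{-1}. \tag{A.31}$$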
The integration contour in (A.30) can be closed in the lower half-plane since the spectrum of $iB + \lambda^2 N$ lies above the lower contour $\Gamma'_\to$. Hence, it is equal to 0. The integrand of (A.31) decays as $|z|^{-2}$ for $|z| \nearrow \infty$. This is seen by using that, for any bounded operator $M$,
$$(z - M)^{-1} = O\left( \frac{1}{|z|} \right), \qquad |z| \nearrow \infty. \tag{A.32}$$
By extracting $e^{t \mathrm{Re}\, z}$, it follows that the integral (A.31) is $e^{-\frac{1}{2} g_A t} \times O(\lambda^2)$. Hence, Lemma A.1 is proven.

References

[1] R. Alicki and M. Fannes, Quantum Dynamical Systems (Oxford University Press, 2001).
[2] H. Araki and E. J. Woods, Representations of the canonical commutation relations describing a nonrelativistic infinite free Bose gas, J. Math. Phys. 4 (1963) 637–662.
[3] J. E. Avron, S. Bachmann, G. M. Graf and I. Klich, Fredholm determinants and the statistics of charge transport (2007); arXiv:0705.0099.
[4] V. Bach, J. Fröhlich and I. Sigal, Return to equilibrium, J. Math. Phys. 41 (2000) 3985–4060.
[5] S. Bachmann and G. M. Graf, Charge transport and determinants (2008); http://arxiv.org/abs/0808.0560.
[6] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Vol. 2, 2nd edn. (Springer-Verlag, Berlin, 1996).
[7] W. Bryc, A remark on the connection between the large deviation principle and the central limit theorem, Statist. Probab. Lett. 18 (1993) 253–256.
[8] E. B. Davies, Markovian master equations, Comm. Math. Phys. 39 (1974) 91–110.
[9] R. de Picciotto, M. Reznikov, M. Heiblum, V. Umansky, G. Bunin and D. Mahalu, Direct observation of a fractional charge, Nature 389(6647) (1997) 162–164.
[10] A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications (Springer, Berlin, 1993).
[11] J. Dereziński, Introduction to Representations of Canonical Commutation and Anticommutation Relations, Lecture Notes in Physics, Vol. 695 (Springer-Verlag, Berlin, 2006).
[12] J. Dereziński and V. Jakšić, Spectral theory of Pauli–Fierz operators, J. Funct. Anal. 180 (2001) 241–327.
[13] J. Dereziński and V. Jakšić, Return to equilibrium for Pauli–Fierz systems, Ann. H. Poincaré 4 (2003) 739–793.
[14] J. Dereziński, V. Jakšić and C.-A. Pillet, Perturbation theory of W∗-dynamics, Liouvilleans and KMS-states, Rev. Math. Phys. 15 (2003) 447–489.
[15] J. Dereziński and W. De Roeck, Extended weak coupling limit for Pauli–Fierz operators, Comm. Math. Phys. 279(1) (2008) 1–30.
[16] J. Dereziński, W. De Roeck and C. Maes, Fluctuations of quantum currents and unravelings of master equations, J. Statist. Phys. 131(2) (2008) 341–356.
[17] M. Esposito, U. Harbola and S. Mukamel, Nonequilibrium fluctuations, fluctuation theorems, and counting statistics in quantum systems (2008); arXiv:0811.3717.
[18] D. J. Evans, E. G. D. Cohen and G. P. Morriss, Probability of second law violations in steady flows, Phys. Rev. Lett. 71 (1993) 2401–2404.
[19] A. Frigerio, Stationary states of quantum dynamical semigroups, Comm. Math. Phys. 63(3) (1978) 269–276.
[20] J. Fröhlich and M. Merkli, Another return of ‘return to equilibrium’, Comm. Math. Phys. 251 (2004) 235–262.
[21] J. Fröhlich, M. Merkli and D. Ueltschi, Dissipative transport: Thermal contacts and tunnelling junctions, Ann. Henri Poincaré 4(5) (2004) 897–945.
[22] G. Gallavotti and E. G. D. Cohen, Dynamical ensembles in nonequilibrium statistical mechanics, Phys. Rev. Lett. 74 (1995) 2694–2697.
[23] D. Goderis, A. Verbeure and P. Vets, Dynamics of fluctuations for quantum lattice systems, Comm. Math. Phys. 128(3) (1990) 533–549.
[24] F. Hiai, M. Mosonyi and O. Tomohiro, Large deviations and Chernoff bound for certain correlated states on the spin chain (2007); arXiv:0706.2141.
[25] V. Jakšić, Y. Ogata and C.-A. Pillet, The Green–Kubo formula for the spin-fermion system, Comm. Math. Phys. 268(2) (2006) 369–401.
[26] V. Jakšić, Y. Pautrat and C.-A. Pillet, Central limit theorem for locally interacting Fermi gas (2007); mp-arc 07-256.
[27] V. Jakšić and C.-A. Pillet, On a model for quantum friction. III: Ergodic properties of the spin-boson system, Comm. Math. Phys. 178 (1996) 627–651.
[28] C. Jarzynski and D. Wojcik, Classical and quantum fluctuation theorems for heat exchange, Phys. Rev. Lett. 92 (2004) 230602, 4 pp.
[29] I. Klich, Full counting statistics: An elementary derivation of Levitov’s formula, in Quantum Noise, ed. Y. V. Nazarov (Kluwer, 2003), pp. 397–402.
[30] J. Kurchan, Quantum fluctuation theorem (2000); arXiv cond-mat/0007360v2.
[31] J. Lebowitz and H. Spohn, Irreversible thermodynamics for quantum systems weakly coupled to thermal reservoirs, Adv. Chem. Phys. 39 (1978) 109–142.
[32] M. Lenci and L. Rey-Bellet, Large deviations in quantum lattice systems: One-phase region, J. Statist. Phys. 119 (2005) 715–746.
[33] G. B. Lesovik, Excess quantum shot noise in 2D ballistic point contacts, JETP Lett. 49 (1989) 592–594.
[34] G. B. Lesovik and L. S. Levitov, Charge distribution in quantum shot noise, JETP Lett. 58 (1993) 225–230.
[35] L. S. Levitov, H. Lee and G. B. Lesovik, Electron counting statistics and coherent states of electric current, J. Math. Phys. 37 (1996) 4845–4866.
[36] C. Maes, On the origin and the use of fluctuation relations for the entropy, in Poincaré Seminar, eds. J. Dalibard, B. Duplantier and V. Rivasseau (Birkhäuser, Basel, 2003), pp. 145–191.
[37] K. Netočný and F. Redig, Large deviations for quantum spin systems, J. Statist. Phys. 117 (2004) 521–547.
[38] Y. Ogata, Large deviations in quantum spin chain (2008); arXiv:0803.0113.
[39] W. De Roeck, Quantum fluctuation theorem: Can we go from micro to meso? Comptes Rendus Physique 8 (2007) 674–783.
[40] W. De Roeck, J. Fröhlich and A. Pizzo, Quantum Brownian motion in a simple model system, submitted to Comm. Math. Phys. (2008); arXiv:0810.4537.
[41] W. De Roeck and C. Maes, A quantum version of free energy–irreversible work relations, Phys. Rev. E 69(2) (2004) 026115, 6 pp.
[42] W. De Roeck and C. Maes, Fluctuations of the dissipated heat in a quantum stochastic model, Rev. Math. Phys. 18 (2006) 619–653.
[43] D. Ruelle, Natural nonequilibrium states in quantum statistical mechanics, J. Statist. Phys. 98 (2000) 57–75.
[44] B. Simon, The Statistical Mechanics of Lattice Gases (Princeton University Press, Princeton, 1993).
[45] H. Spohn, Dynamics of Charged Particles and their Radiation Field (Cambridge University Press, Cambridge, 2004).
[46] P. Talkner, E. Lutz and P. Hanggi, Fluctuation theorems: Work is not an observable, Phys. Rev. E 75 (2007) 050102(R), 2 pp.
June 3, 2009 10:59 WSPC/148-RMP
J070-00367
Reviews in Mathematical Physics Vol. 21, No. 5 (2009) 587–613
© World Scientific Publishing Company
FULL REGULARITY FOR A C∗ -ALGEBRA OF THE CANONICAL COMMUTATION RELATIONS
HENDRIK GRUNDLING∗ and KARL-HERMANN NEEB†
∗School of Mathematics and Statistics, University of New South Wales, Sydney, New South Wales 2052, Australia
[email protected]
†Department of Mathematics, Technical University, Darmstadt, Germany
[email protected]
Received 15 July 2008
Revised 18 November 2008
The Weyl algebra, the usual C*-algebra employed to model the canonical commutation relations (CCRs), has a well-known defect: it has a large number of representations which are not regular, and these cannot model physical fields. Here, we construct explicitly a C*-algebra which can reproduce the CCRs of a countably dimensional symplectic space $(S, B)$ and such that its representation set is exactly the full set of regular representations of the CCRs. This construction uses Blackadar’s version of infinite tensor products of nonunital C*-algebras, and it produces a “host algebra” (i.e. a generalized group algebra, explained below) for the $\sigma$-representation theory of the Abelian group $S$, where $\sigma(\cdot, \cdot) := e^{iB(\cdot, \cdot)/2}$. As an easy application, it then follows that for every regular representation of $\Delta(S, B)$ on a separable Hilbert space, there is a direct integral decomposition of it into irreducible regular representations (a known result).

Keywords: Canonical commutation relations; C*-algebra; regular representation; host algebra; Weyl algebra; infinite tensor product; group algebra; infinite dimensional group; symplectic space; quantum field.

Mathematics Subject Classification 2000: 43A10, 43A40, 22A25, 46N50, 81T05
0. Introduction

In the description of quantum systems, one typically deals with a set of operators satisfying canonical commutation relations. This means that there is a real linear map $\varphi$ from a given symplectic space $(S, B)$ to a linear space of selfadjoint operators on some common dense invariant core $\mathcal{D}$ in a Hilbert space $\mathcal{H}$, satisfying the relations
$$[\varphi(f), \varphi(g)] = iB(f, g)\mathbf{1}, \qquad \varphi(f)^* = \varphi(f) \quad \text{on } \mathcal{D}.$$
If $\{q_i, p_i \mid i \in I\} \subset S$ is a symplectic basis for $S$, i.e. $0 = B(q_i, q_j) = B(p_i, p_j) = B(p_i, q_j) - \delta_{ij}$, then $\varphi(q_i)$ and $\varphi(p_i)$ are interpreted as quantum mechanical position and momentum operators. If $S$ consists of Schwartz functions on a space-time manifold, we can take $\varphi$ to be a bosonic quantum field. As is known, if $(S, B)$ is non-degenerate then the operators $\varphi(f)$ cannot all be bounded, so it is natural to go from the polynomial algebra $\mathcal{P}$ generated by $\{\varphi(f) \mid f \in S\}$ to a C*-algebra encoding the same algebraic information.

The obvious way to do this is to form suitable bounded functions of the fields $\varphi(f)$. Following Weyl, we consider the C*-algebra generated by the set of unitaries $\{\exp(i\varphi(f)) \mid f \in S\}$, and this C*-algebra is simple. It can be defined abstractly as the C*-algebra generated by a set of unitaries $\{\delta_f \mid f \in S\}$ subject to the relations $\delta_f^* = \delta_{-f}$ and $\delta_f \delta_g = e^{-iB(f,g)/2} \delta_{f+g}$. This is the familiar Weyl (or CCR) algebra, often denoted $\Delta(S, B)$ (cf. [20]). A different C*-algebra for the CCRs was defined in [6] based on the resolvents of the fields.

By its definition, $\Delta(S, B)$ has a representation in which the unitaries $\delta_f$ can be identified with the exponentials $e^{i\varphi(f)}$, and hence we can obtain the concrete algebra $\mathcal{P}$ back from these. Such representations $\pi : \Delta(S, B) \to \mathcal{B}(\mathcal{H})$, i.e. those for which the one-parameter groups $\lambda \mapsto \pi(\delta_{\lambda f})$ are strong operator continuous for all $f \in S$, are called regular, and states are regular if their GNS-representations are. Since for physical situations the quantum fields are defined as the generators of the one-parameter groups $\lambda \mapsto \pi(\delta_{\lambda f})$, the representations of interest are required to be regular. (Note that the ray-continuity of $s \mapsto \pi(\delta_s)$ implies continuity on all finite dimensional subspaces of $S$.)
Unfortunately, $\Delta(S, B)$ has a large number of nonregular representations, and so one can object that it is not satisfactory, since analysis of physical objects can lead to nonphysical ones, e.g., w*-limits of regular states can be nonregular. Nonregular representations are interpreted as situations where the field $\varphi(f)$ can have “infinite field strength”. Whilst this is useful for some nonphysical idealizations, e.g., plane waves (cf. [1]), or for quantum constraints (cf. [17]), for physical situations one wants to exclude such representations. The resolvent algebra of [6] also has nonregular representations (although far fewer than the Weyl algebra).

Our aim here is to construct a C*-algebra for the CCRs of a countably dimensional $(S, B)$ such that its representation space consists of exactly the regular representations of the CCRs, in a sense to be made precise below. This will demonstrate that the regular representation theory of the Weyl algebra is isomorphic to the full representation theory of a C*-algebra, and hence it is subject to the usual structure theory for the full representation theory of C*-algebras. The existence of such an algebra has already been shown in [15], but here we want to obtain an explicit construction of it.

In the case that $S$ is finite dimensional, there is an immediate solution. Regard $S$ with its addition as an Abelian group, then $\sigma(\cdot, \cdot) := \exp[iB(\cdot, \cdot)/2]$ is a 2-cocycle of $S$, and $\Delta(S, B)$ is just the $\sigma$-twisted group algebra of $S$ with the discrete topology
(cf. [23]). Define the C*-algebra $\mathcal{L}$ as the C*-envelope of the twisted convolution algebra of $S$, where the latter consists of $L^1(S)$ equipped with the multiplication and involution
$$f * g(x) = \int_S f(y) g(x - y) \sigma(y, x)\, d\mu(y), \qquad f^*(x) = \overline{f(-x)},$$
where $\mu$ is a Haar measure on $S$, i.e. $\mathcal{L}$ is the $\sigma$-twisted group algebra of $S$. This algebra $\mathcal{L}$ is known to be isomorphic to the compacts $\mathcal{K}(L^2(S))$ (cf. [26] and [4, p. 206]). Then we have an embedding of $\Delta(S, B)$ into the multiplier algebra of $\mathcal{L}$, $\Delta(S, B) \subset M(\mathcal{L})$, by the action $\delta_x \cdot f(y) = \sigma(x, y) f(y - x)$. The unique extensions of representations on $\mathcal{L}$ to $\Delta(S, B)$ produce a bijection from the representations of $\mathcal{L}$ onto the regular representations of $\Delta(S, B)$, and the bijection respects direct sums and takes irreducibles to irreducibles. So $\mathcal{L}$ is the desired C*-algebra with full regularity.

For the case that $S$ is infinite dimensional, since regular representations $\pi$ are characterized by requiring the maps $s \mapsto \pi(\delta_s)$ to be continuous on all finite dimensional subspaces of $S$, this means that we require these maps to be strong operator continuous with respect to the inductive limit topology, where the inductive limit is the one consisting of all finite dimensional subspaces of $S$ under inclusion. This inductive limit topology on $S$ is only a group topology with respect to addition in the case that $S$ is a countably dimensional space; cf. [13]. Hence in this case the regular representation theory of $\Delta(S, B)$ is the $\sigma$-representation theory of the topological group $S$, but not otherwise. Henceforth, we will always take $(S, B)$ to be countably dimensional, equipped with the (locally convex) inductive limit topology.

The problem now becomes the one of how to define a $\sigma$-twisted group algebra for $S$. The usual theory fails in this case, since $S$ is not locally compact, hence there is no Haar measure. We see that there is a need to generalize the notion of a (twisted) group algebra to topological groups which are not locally compact. Such a generalization, called a full host algebra, has been proposed in [16].
Briefly, it is a C*-algebra $\mathcal{A}$ which has in its multiplier algebra $M(\mathcal{A})$ a homomorphism $\eta : G \to U(M(\mathcal{A}))$, such that the (unique) extension of the representation theory of $\mathcal{A}$ to $M(\mathcal{A})$ pulls back via $\eta$ to the continuous (unitary) representation theory of $G$. There is also an analogous concept for unitary $\sigma$-representations, where $\sigma$ is a continuous $\mathbb{T}$-valued 2-cocycle on $G$. Thus, given a full host algebra $\mathcal{A}$, the continuous representation theory of $G$ can be analyzed on $\mathcal{A}$ with a large arsenal of C*-algebraic tools.

Our main result in this paper is an explicit construction of a full host algebra for the $\sigma$-representations of an infinite dimensional topological linear space $S$, regarded as a group, where $S$ will be a countably dimensional symplectic space with symplectic form $B$, equipped with the (locally convex) inductive limit topology. We demonstrate the usefulness of this construction by proving that for every regular representation of $\Delta(S, B)$ on a separable Hilbert space, there is a direct
integral decomposition of it into irreducible regular representations. This last result is already known by different means (cf. [18, 25]). This paper is structured as follows. In Sec. 1, we state the notation and definitions necessary for the subsequent material, and in Sec. 2, we discuss existence and uniqueness issues for host algebras. In Sec. 3, we construct the host algebra for the pair (S, σ) mentioned above, do the direct integral decomposition mentioned, and in the Appendix we add general results concerning host algebras and the strict topology which are required for our proofs. These results are of independent interest for the general structure theory of host algebras. The reader in a hurry can skip Sec. 2.
1. Definitions and Notation

We will need the following notation and concepts for our main results:

• In the following, we write $M(\mathcal{A})$ for the multiplier algebra of a C*-algebra $\mathcal{A}$ and, if $\mathcal{A}$ has a unit, $U(\mathcal{A})$ for its unitary group. We have an injective morphism of C*-algebras $\iota_{\mathcal{A}} : \mathcal{A} \to M(\mathcal{A})$ and will simply write $\mathcal{A}$ for its image in $M(\mathcal{A})$. Then $\mathcal{A}$ is dense in $M(\mathcal{A})$ with respect to the strict topology, which is the locally convex topology defined by the seminorms
$$p_a(m) := \|m \cdot a\| + \|a \cdot m\|, \qquad a \in \mathcal{A}, \quad m \in M(\mathcal{A})$$
(cf. [29]).

• For a complex Hilbert space $\mathcal{H}$, we write $\mathrm{Rep}(\mathcal{A}, \mathcal{H})$ for the set of non-degenerate representations of $\mathcal{A}$ on $\mathcal{H}$. Note that the collection $\mathrm{Rep}\, \mathcal{A}$ of all non-degenerate representations of $\mathcal{A}$ is not a set, but a (proper) class in the sense of von Neumann–Bernays–Gödel set theory, cf. [27], and in this framework we can consistently manipulate the object $\mathrm{Rep}\, \mathcal{A}$. However, to avoid set-theoretical subtleties, we will express our results below concretely, i.e. in terms of $\mathrm{Rep}(\mathcal{A}, \mathcal{H})$ for given Hilbert spaces $\mathcal{H}$. We have an injection
$$\mathrm{Rep}(\mathcal{A}, \mathcal{H}) \to \mathrm{Rep}(M(\mathcal{A}), \mathcal{H}), \qquad \pi \mapsto \tilde{\pi} \quad \text{with } \tilde{\pi} \circ \iota_{\mathcal{A}} = \pi,$$
which identifies the non-degenerate representation $\pi$ of $\mathcal{A}$ with that representation $\tilde{\pi}$ of its multiplier algebra which extends $\pi$ and is continuous with respect to the strict topology on $M(\mathcal{A})$ and the topology of pointwise convergence on $\mathcal{B}(\mathcal{H})$.

• For topological groups $G$ and $H$, we write $\mathrm{Hom}(G, H)$ for the set of continuous group homomorphisms $G \to H$. We also write $\mathrm{Rep}(G, \mathcal{H})$ for the set of all (strong operator) continuous unitary representations of $G$ on $\mathcal{H}$. Endowing $U(\mathcal{H})$ with the strong operator topology turns it into a topological group, denoted $U(\mathcal{H})_s$, so that $\mathrm{Rep}(G, \mathcal{H}) = \mathrm{Hom}(G, U(\mathcal{H})_s)$.
• Let $\mathbb{T} \subseteq \mathbb{C}^\times$ denote the unit circle, viewed as a multiplicative subgroup, and let $\sigma : G \times G \to \mathbb{T}$ be a continuous 2-cocycle, i.e.
$$\sigma(1, x) = \sigma(x, 1) = 1, \qquad \sigma(x, y)\sigma(xy, z) = \sigma(x, yz)\sigma(y, z) \quad \text{for } x, y, z \in G.$$
We then form the topological group
$$G_\sigma := \mathbb{T} \times G, \qquad (t, g)(t', g') := (tt'\sigma(g, g'), gg')$$
and note that the projection $q : G_\sigma \to G$ defines a central extension of $G$ by $\mathbb{T}$. A continuous unitary representation $(\pi, \mathcal{H})$ of $G_\sigma$ is called a $\sigma$-representation of $G$ if $\pi(t, 1) = t\mathbf{1}$ holds for each $t \in \mathbb{T}$. Then
$$G \to U(\mathcal{H}), \qquad g \mapsto \pi(1, g)$$
is continuous with respect to the strong operator topology, but $\pi(1, g)\pi(1, g') = \sigma(g, g')\pi(1, gg')$ for $g, g' \in G$. We write $\mathrm{Rep}((G, \sigma), \mathcal{H})$ for the set of all continuous $\sigma$-representations of $G$ on $\mathcal{H}$.

Definition 1.1. Let $G$ be a topological group and $\sigma : G \times G \to \mathbb{T}$ a continuous 2-cocycle. A host algebra for the pair $(G, \sigma)$ is a pair $(\mathcal{L}, \eta)$ where $\mathcal{L}$ is a C*-algebra and $\eta : G_\sigma \to U(M(\mathcal{L}))$ is a homomorphism such that for each complex Hilbert space $\mathcal{H}$ the corresponding map
$$\eta^* : \mathrm{Rep}(\mathcal{L}, \mathcal{H}) \to \mathrm{Rep}((G, \sigma), \mathcal{H}), \qquad \pi \mapsto \tilde{\pi} \circ \eta$$
is injective. We then write $\mathrm{Rep}(G, \mathcal{H})_\eta \subseteq \mathrm{Rep}(G, \mathcal{H})$ for the range of $\eta^*$. We say that $(G, \sigma)$ has a full host algebra if it has a host algebra for which $\eta^*$ is surjective for each Hilbert space $\mathcal{H}$.

In the case that $\sigma = 1$, we simply speak of a host algebra for $G$. In this case, $G_\sigma = G \times \mathbb{T}$ is a direct product, so that a host algebra for $G$ is a pair $(\mathcal{L}, \eta)$, where $\eta : G \to U(M(\mathcal{L}))$ is a homomorphism into the unitary group of $M(\mathcal{L})$ such that for each complex Hilbert space $\mathcal{H}$ the corresponding map
$$\eta^* : \mathrm{Rep}(\mathcal{L}, \mathcal{H}) \to \mathrm{Rep}(G, \mathcal{H}), \qquad \pi \mapsto \tilde{\pi} \circ \eta$$
is injective. We then write $\mathrm{Rep}(G, \mathcal{H})_\eta \subseteq \mathrm{Rep}(G, \mathcal{H})$ for the range of $\eta^*$. We say that $G$ has a full host algebra if it has a host algebra for which $\eta^*$ is surjective for each Hilbert space $\mathcal{H}$.

Note that by the universal property of (twisted) group algebras, the homomorphism $\eta : G_\sigma \to U(M(\mathcal{L}))$ extends uniquely to the $\sigma$-twisted group algebra of $G$ with the discrete topology, i.e. we have a $*$-homomorphism $\eta : C^*_\sigma(G_d) \to M(\mathcal{L})$ (still denoted by $\eta$).

Remark. (1) It is well known that for each locally compact group $G$, the group C*-algebra $C^*(G)$ and the natural map $\eta_G : G \to M(C^*(G))$ provide a full host algebra ([11, Sec. 13.9]) and for each pair $(G, \sigma)$, where $G$ is locally compact, the
corresponding twisted group C*-algebra $C^*(G, \sigma)$, which is isomorphic to an ideal of $C^*(G_\sigma)$, is a full host algebra for the pair $(G, \sigma)$. This is most easily seen by decomposition of representations of $G_\sigma$ into isotypic summands with respect to the action of the central subgroup $\mathbb{T} \times \{1\}$ (apply [10, 23] with L = C). The map $\eta_G : G \to M(C^*(G, \sigma))$ is continuous with respect to the strict topology of $M(C^*(G, \sigma))$.^a

(2) Note that the map $\eta^*$ preserves direct sums, unitary conjugation, subrepresentations, and for full host algebras, irreducibility (cf. [16]), so that this notion of isomorphism between $\mathrm{Rep}((G, \sigma), \mathcal{H})$ and $\mathrm{Rep}(\mathcal{L}, \mathcal{H})$ involves strong structural correspondences.

(3) Whilst the concept of a host algebra is a natural extension of the concept of a group C*-algebra (and it easily generalizes to other algebraic objects, cf. [16]), it has so far had a troubled history. It was first used in [15], though not under this name. There, the existence of host algebras was proven for groups which are inductive limits of locally compact groups, though the proof was not constructive enough to allow much further structural analysis of these host algebras. Then in [16] the concept was generalized to algebraic objects other than topological groups, and a general existence and uniqueness theorem was given, though unfortunately this turned out to be wrong (see the erratum, and the counterexample below). Since then, host algebras have been constructed in [22] for complex semigroups. Our aim in Sec. 3 is to provide an explicit construction of a host algebra, more useful than that of [15], for the regular representations of the canonical commutation relations.

2. Existence and Uniqueness Issues

For general topological groups, there are serious existence and uniqueness questions for their host algebras (as mentioned, the existence and uniqueness theorem in [16] is wrong).
From the structural “isomorphism” between the $\sigma$-representation theory of $G$ and the representation theory of its full host algebra noted above, it becomes easy to find examples of topological groups without full host algebras. For instance, in [24, Example 5.2], there is an abelian topological group with a faithful continuous unitary representation, but no continuous irreducible representations. Hence this group cannot have a host algebra, whether full or not. In [14], it is shown in particular that for any non-atomic measure space $(X, \mu)$, such as the unit interval $[0, 1]$ with Lebesgue measure, the unitary group of the W*-algebra $L^\infty(X, \mu)$, endowed with the weak topology, has no non-trivial continuous characters, hence no non-zero host algebra. It is therefore an important open problem to characterize those pairs $(G, \sigma)$ for which full host algebras exist.

^a This is an easy consequence of the fact that $\mathrm{im}(\eta_G)$ is bounded and that the action on the corresponding $L^1$-algebra is continuous.
Concerning the issue of uniqueness, the following simple counterexample shows that if a host algebra exists, then it need not be unique. Let $G := \mathbb{Z}$. Then its character group is $\hat{G} \cong \mathbb{T}$, which is a compact group with respect to the topology of pointwise convergence. Since $G$ is locally compact, $C^*(G) \cong C(\mathbb{T})$ is a full host algebra for $G$. Let $\mathcal{L} := C_0([0, 1))$ and define a homomorphism $\eta : \mathbb{Z} \to U(M(C_0([0, 1)))) \cong C([0, 1), \mathbb{T})$ by $\eta(n)(x) := e^{2\pi i n x}$. Then $\eta(1) : [0, 1) \to \mathbb{T}$ is a continuous bijection, which implies that $\eta(\mathbb{Z})$ separates the points, hence by Lemma A.1 below the C*-algebra generated by this set is strictly dense in $M(\mathcal{L})$. Since the unique extensions of representations of $\mathcal{L}$ to $M(\mathcal{L})$ are continuous in the strict topology, it follows that $\eta^*$ is injective. Further, $\mathbb{Z}$ is discrete, so that continuity of representations $\eta^*\pi$ is trivially satisfied, and thus $(\mathcal{L}, \eta)$ is a host algebra. This host algebra is full because the representations of $\mathbb{Z}$ are in one-to-one correspondence with Borel spectral measures on $\mathbb{T}$ and $\eta(1)$ is a Borel isomorphism. Note in particular that this full host algebra $\mathcal{L} \not\cong C^*(G)$ is not unital, although $G$ is a discrete group.

This issue also needs further analysis, e.g., one needs to find what structural properties are shared by host algebras for the same pair $(G, \sigma)$, and to explore the properties of the set of host algebras. In the Appendix, we list more host algebra properties, e.g., those relating to products and homomorphisms of groups.

3. A Construction of a Full Host Algebra for $(S, \sigma)$

Here we want to present an example of a host algebra for an infinite-dimensional group. Let $(S, B)$ be a countably dimensional (nondegenerate) symplectic space. Then by Lemma A.8, we know that there is a complex structure and a hermitian inner product $(\cdot, \cdot)$ on $S$ such that $B(v, w) = \mathrm{Im}(v, w)$ for all $v, w \in S$. Moreover, with respect to the inner product $(\cdot, \cdot)$, $S$ has an orthonormal basis $(e_n)_{n \in \mathbb{N}}$.
We consider $S \cong \mathbb{C}^{(\mathbb{N})}$ as an inductive limit of the subspaces $S_n := \mathrm{span}\{e_1, \ldots, e_n\}$ and endow it with the inductive limit topology, which turns it into an abelian topological group with respect to addition (which is only true for countably dimensional spaces; cf. [13]). Moreover, the symplectic form $B(v, w) = \mathrm{Im}(v, w)$ defines a group two-cocycle $\sigma(v, w) := \exp[iB(v, w)/2]$ on $S$. Let $S_\sigma$ denote the corresponding central extension of $S$ by $\mathbb{T}$ (cf. above Definition 1.1). In the rest of this section we will prove that:

Theorem 3.1. The pair $(S, \sigma)$ has a full host algebra.

Recall that $\mathcal{A} := \Delta(S, B)$ is the discrete twisted $\sigma$-group algebra of $S$, i.e. it is the unique (simple) C*-algebra generated by a collection of unitaries $\{\delta_s \mid s \in S\}$ satisfying the (Weyl) relations $\delta_{s_1} \delta_{s_2} = \sigma(s_1, s_2) \delta_{s_1 + s_2}$ ([8, Theorem 5.2.8]). Let
$$R(\mathcal{H}) := \{\pi \in \mathrm{Rep}(\mathcal{A}, \mathcal{H}) \mid t \in \mathbb{R} \mapsto \pi(\delta_{tx}) \text{ is strong operator continuous } \forall x \in S\}$$
denote the set of regular representations on the Hilbert space $\mathcal{H}$. Through the identification $\pi(s) := \pi(\delta_s)$, $R(\mathcal{H})$ corresponds exactly with the $\sigma$-representations of $S$ on $\mathcal{H}$, i.e. with $\mathrm{Rep}((S, \sigma), \mathcal{H})$.
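As a quick sanity check on the claim that $\sigma(v, w) = \exp[iB(v, w)/2]$ with $B(v, w) = \mathrm{Im}(v, w)$ is a 2-cocycle, the Python sketch below tests the cocycle identity of Sec. 1, written additively for the group $S$, on random vectors of a finite-dimensional subspace $S_n \cong \mathbb{C}^n$. The convention that the inner product is conjugate-linear in the first argument, and all function names, are assumptions made for illustration only.

```python
import cmath
import random


def inner(v, w):
    """Hermitian inner product on C^n, assumed conjugate-linear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(v, w))


def B(v, w):
    """Symplectic form B(v, w) = Im (v, w)."""
    return inner(v, w).imag


def sigma(v, w):
    """Candidate 2-cocycle sigma(v, w) = exp(i B(v, w) / 2)."""
    return cmath.exp(0.5j * B(v, w))


def add(v, w):
    """Group operation of S: vector addition."""
    return [a + b for a, b in zip(v, w)]


def cocycle_defect(x, y, z):
    """|sigma(x, y) sigma(x + y, z) - sigma(x, y + z) sigma(y, z)|."""
    lhs = sigma(x, y) * sigma(add(x, y), z)
    rhs = sigma(x, add(y, z)) * sigma(y, z)
    return abs(lhs - rhs)


def random_vector(n, rng):
    """A random element of S_n = span{e_1, ..., e_n}, in coordinates."""
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    x, y, z = (random_vector(4, rng) for _ in range(3))
    print(cocycle_defect(x, y, z))
```

The identity holds because $B$ is real-bilinear: both sides have the phase $\exp\{i[B(x, y) + B(x, z) + B(y, z)]/2\}$. The normalization $\sigma(0, x) = \sigma(x, 0) = 1$ follows from $B(0, x) = 0$.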
June 3, 2009 10:59 WSPC/148-RMP
J070-00367
H. Grundling & K.-H. Neeb
Lemma 3.2. With the notation above, we have $\mathcal{A} = \bigotimes_{n=1}^{\infty} \mathcal{A}_n$ with the spatial (minimal) tensor norms, where $\mathcal{A}_n := C^*\{\delta_{z e_n} \mid z \in \mathbb{C}\}$.
Proof. This follows directly from Proposition 11.4.3 of Kadison and Ringrose [19]; we only need to verify that its conditions hold in the present context. For this, observe that $\mathcal{A} = C^*\{\bigcup_{n=1}^{\infty} \mathcal{A}_n\}$, $\mathbf{1} \in \mathcal{A}_n$, and $[\mathcal{A}_n, \mathcal{A}_m] = \{0\}$ when $n \neq m$. Moreover, the linear maps $\psi_k : \mathcal{A}_1 \otimes \cdots \otimes \mathcal{A}_k \to \mathcal{A}$ defined by
\[ \psi_k(A_1 \otimes \cdots \otimes A_k) := A_1 A_2 \cdots A_k \]
are $*$-monomorphisms because each image subalgebra $C^*\{\bigcup_{n=1}^{k} \mathcal{A}_n\}$ is the unique C*-algebra generated by the unitaries $\{\delta_{z e_i} \mid z \in \mathbb{C},\ i = 1, \dots, k\}$, and this is also true for $\mathcal{A}_1 \otimes \cdots \otimes \mathcal{A}_k$. This is enough to apply the proposition loc. cit.
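Observe that each $\mathcal{A}_n$ is just the discrete $\sigma$-group algebra of the subgroup $\mathbb{C}e_n \subset S$, and as the latter is locally compact, we can construct its $\sigma$-twisted group algebra which we denote by $\mathcal{L}_n$ (recall that $\mathcal{L}_n$ is just the enveloping C*-algebra of $L^1(\mathbb{C})$, equipped with $\sigma$-twisted convolution). It is well known that $\mathcal{L}_n \cong \mathcal{K}(L^2(\mathbb{R}))$ (cf. [26]). Note that for each finite subset $F \subset \mathbb{N}$, the algebra
\[ \bigotimes_{n\in F} \mathcal{L}_n \cong \mathcal{K}\Big(\bigotimes_{n\in F} L^2(\mathbb{R})\Big) \cong \mathcal{K}(L^2(\mathbb{R}^F)) \]
is a host algebra for the regular representations of $\bigotimes_{n\in F} \mathcal{A}_n = C^*\{\delta_{z e_n} \mid z \in \mathbb{C},\ n \in F\}$, i.e. for the $\sigma$-representations of $\operatorname{span}\{e_n \mid n \in F\} \subset S$.

It is natural to try some infinite tensor product $\bigotimes_{n=1}^{\infty} \mathcal{L}_n$ for a host algebra, but because the algebras $\mathcal{L}_n$ are non-unital, the definition of the infinite tensor product needs some care [2]. For each $n \in \mathbb{N}$, choose a nonzero projection $P_n \in \mathcal{L}_n \cong \mathcal{K}(\mathcal{H})$ and define C*-embeddings
\[ \Psi_{\ell k} : \mathcal{L}^{(k)} \to \mathcal{L}^{(\ell)} \quad\text{by}\quad \Psi_{\ell k}(A_1 \otimes \cdots \otimes A_k) := A_1 \otimes \cdots \otimes A_k \otimes P_{k+1} \otimes \cdots \otimes P_\ell, \]
where $k < \ell$ and $\mathcal{L}^{(k)} := \bigotimes_{n=1}^{k} \mathcal{L}_n$. Then the inductive limit makes sense, so we define
\[ \mathcal{L} := \bigotimes_{n=1}^{\infty} \mathcal{L}_n := \varinjlim\, \{\mathcal{L}^{(n)}, \Psi_{\ell k}\} \]
and write $\Psi_k : \mathcal{L}^{(k)} \to \mathcal{L}$ for the corresponding embeddings, satisfying $\Psi_k \circ \Psi_{kj} = \Psi_j$ for $j \le k$. Since each $\mathcal{L}_n$ is simple, so are the finite tensor products $\mathcal{L}^{(k)}$ ([28, Proposition T.6.25]), and as inductive limits of simple C*-algebras are simple ([19, Proposition 11.4.2]), so is $\mathcal{L}$. It is also clear that $\mathcal{L}$ is separable. Since $\Psi_{k+n,k}(L_k) = L_k \otimes P_{k+1} \otimes \cdots \otimes P_{k+n}$, where $L_k \in \mathcal{L}^{(k)}$, this means that we can consider $\mathcal{L}$ to be built up out of elementary tensors of the form
\[ \Psi_k(L_1 \otimes \cdots \otimes L_k) = L_1 \otimes L_2 \otimes \cdots \otimes L_k \otimes P_{k+1} \otimes P_{k+2} \otimes \cdots, \quad\text{where } L_i \in \mathcal{L}_i, \qquad (3.1) \]
i.e. eventually they are of the form $\cdots \otimes P_k \otimes P_{k+1} \otimes \cdots$. We will use this picture below, and generally will not indicate the maps $\Psi_k$.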
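The embeddings $\Psi_{\ell k}$ are multiplicative precisely because the padding factors $P_n$ are projections, and they compose as $\Psi_{\ell k}\circ\Psi_{kj} = \Psi_{\ell j}$. The sketch below illustrates both facts with small integer matrices standing in for the compact operators; the Kronecker product plays the role of $\otimes$. All names (`matmul`, `kron`, `psi_21`, and so on) and the choice of $2\times 2$ matrices are assumptions made for illustration only.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def kron(X, Y):
    """Kronecker product, a finite-dimensional stand-in for X (x) Y."""
    return [[x * y for x in row_x for y in row_y]
            for row_x in X for row_y in Y]

# Hypothetical projections playing the roles of P_2 and P_3.
P2 = [[1, 0], [0, 0]]
P3 = [[0, 0], [0, 1]]

def psi_21(A):
    """Psi_{21}: A -> A (x) P_2."""
    return kron(A, P2)

def psi_32(A):
    """Psi_{32}: A -> A (x) P_3."""
    return kron(A, P3)

def psi_31(A):
    """Psi_{31}: A -> A (x) P_2 (x) P_3."""
    return kron(A, kron(P2, P3))

A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -1]]

# Multiplicativity uses P_2 * P_2 = P_2.
multiplicative = psi_21(matmul(A, B)) == matmul(psi_21(A), psi_21(B))

# Compatibility of the connecting maps: Psi_32 o Psi_21 = Psi_31.
compatible = psi_32(psi_21(A)) == psi_31(A)

# Padding with a non-idempotent element breaks multiplicativity.
Q = [[2, 0], [0, 0]]
fails_without_projection = kron(matmul(A, B), Q) != matmul(kron(A, Q), kron(B, Q))
```

Padding with the identity is not available here because the $\mathcal{L}_n$ are non-unital, which is why a choice of projections is needed in the first place.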
Full Regularity for a C ∗ -Algebra of the Canonical Commutation Relations
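Lemma 3.3.
(i) With respect to componentwise multiplication, we have an inclusion
\[ \mathcal{A} = \bigotimes_{n=1}^{\infty} \mathcal{A}_n \subset M(\mathcal{L}) = M\Big(\bigotimes_{n=1}^{\infty} \mathcal{L}_n\Big). \]
(ii) There is a natural embedding $\iota_n : M(\mathcal{L}^{(n)}) \to M(\mathcal{L})$. This is a topological embedding on each bounded subset of $M(\mathcal{L}^{(n)})$. Moreover, $\mathcal{L}^{(n)}$ is dense in $M(\mathcal{L}^{(n)})$ with respect to the restriction of the strict topology of $M(\mathcal{L})$.
(iii) Let $\pi \in \operatorname{Rep}(\mathcal{L}, \mathcal{H})$, and let $\pi_n$ denote the unique representation which it induces on $\mathcal{L}^{(n)} \subset M(\mathcal{L}^{(n)}) \subset M(\mathcal{L})$ by strict extension. Then
\[ \pi(L_1 \otimes L_2 \otimes \cdots) = \operatorname{s-lim}_{n\to\infty} \pi_n(L_1 \otimes \cdots \otimes L_n) \]
for all $L_1 \otimes L_2 \otimes \cdots \in \mathcal{L}$ as in (3.1).

Proof. (i) For each $k$ we obtain a homomorphism $\Theta_k : \bigotimes_{n=1}^{k} \mathcal{A}_n \to M(\mathcal{L})$ by componentwise multiplication in the first $k$ entries of $\mathcal{L}$, leaving all entries further up invariant. By simplicity of its domain, each $\Theta_k$ is a monomorphism. From $\Theta_k(\bigotimes_{n=1}^{k} \mathcal{A}_n) \subset M(\mathcal{L})$ for each $k \in \mathbb{N}$, we obtain all the generating unitaries $\delta_s$ in $M(\mathcal{L})$, and they then generate $\mathcal{A}$ in $M(\mathcal{L})$ by uniqueness of the C*-algebra of the canonical commutation relations.

(ii) Now $\mathcal{L} = \mathcal{L}^{(n)} \otimes \mathcal{B}$ for a C*-algebra $\mathcal{B}$ (cf. [2, p. 315]), and $M(\mathcal{L}^{(n)})$ embeds in $M(\mathcal{L})$ as $M(\mathcal{L}^{(n)}) \otimes \mathbf{1}$. Therefore (ii) follows from Lemma A.2.

(iii) Note that $U_n := \Psi_n(\mathbf{1}) = \mathbf{1} \otimes \cdots \otimes \mathbf{1} \otimes P_{n+1} \otimes P_{n+2} \otimes \cdots \in M(\mathcal{L})$ converges strictly to $\mathbf{1}$. Recall that $L = L_1 \otimes L_2 \otimes \cdots \in \mathcal{L}$ as in (3.1) is of the form $A_1 \otimes A_2 \otimes \cdots \otimes A_k \otimes P_{k+1} \otimes P_{k+2} \otimes \cdots$, where $A_i \in \mathcal{L}_i$, so for $n \ge k$ we get for all $\psi \in \mathcal{H}_\pi$ that for the strictly continuous extension $\tilde\pi$ of $\pi$ to $M(\mathcal{L})$:
\[
\begin{aligned}
\|\tilde\pi(L - L_1 \otimes \cdots \otimes L_n \otimes \mathbf{1} \otimes \mathbf{1} \otimes \cdots)\psi\|
&= \|\tilde\pi(L_1 \otimes \cdots \otimes L_n \otimes (P_{n+1} \otimes P_{n+2} \otimes \cdots - \mathbf{1}))\psi\| \\
&= \|\tilde\pi(L_1 \otimes \cdots \otimes L_n \otimes \mathbf{1} \otimes \cdots)\cdot\tilde\pi(U_n - \mathbf{1})\psi\| \\
&\le C \cdot \|\tilde\pi(U_n - \mathbf{1})\psi\| \to 0
\end{aligned}
\]
as $n \to \infty$, where $C > 0$ is chosen such that $\|L_1 \otimes \cdots \otimes L_n\| \le C$ for all $n$, and this is possible because $\|P_{k+1} \otimes P_{k+2} \otimes \cdots\| = 1$. But this is exactly the claim we needed to prove.

Let $\pi \in \operatorname{Rep}(\mathcal{A}, \mathcal{H})$ be regular. Observe that $\pi$ is regular on all $\mathcal{A}_n$, hence there are unique $\pi^n \in \operatorname{Rep}(\mathcal{L}_n, \mathcal{H})$ which extend (on $\mathcal{H}$) to $\pi{\restriction}\mathcal{A}_n$ by the host algebra property of $\mathcal{L}_n$. For the distinguished projections $P_n \in \mathcal{L}_n$, we simplify the notation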
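to $\pi(P_n) := \pi^n(P_n)$. Observe that the projections $\pi(P_j)$ all commute, and so the strong limit
\[ \mathbb{P}_k := \operatorname{s-lim}_{n\to\infty} \pi(P_k) \cdots \pi(P_n) \]
exists, and it is the projection onto the intersection of the ranges of all $\pi(P_j)$, $j \ge k$. Since $\mathbb{P}_k = \pi(P_k)\,\mathbb{P}_{k+1}$ we have $\mathbb{P}_{k+1} \ge \mathbb{P}_k$ and so also $\operatorname{s-lim}_{k\to\infty} \mathbb{P}_k \le \mathbf{1}$ exists. We will use the notation $\mathcal{A}^{(n)} := \bigotimes_{j=1}^{n} \mathcal{A}_j$ below.

Proposition 3.4. Define a monomorphism $\eta : S_\sigma \to U(M(\mathcal{L}))$ by $\eta((s, t)) := t\delta_s \in \mathcal{A} \subset M(\mathcal{L})$ (by Lemma 3.3(i)). Then $\eta$ is continuous with respect to the strict topology on $M(\mathcal{L})$ and $\mathcal{L}$ is a host algebra of $(S, \sigma)$, i.e. the maps $\eta^* : \operatorname{Rep}(\mathcal{L}, \mathcal{H}) \to \operatorname{Rep}((S, \sigma), \mathcal{H})$ are injective. The range of $\eta^*$ consists of those $\pi \in \operatorname{Rep}((S, \sigma), \mathcal{H})$ for which $\operatorname{s-lim}_{k\to\infty} \mathbb{P}_k = \mathbf{1}$.

Proof. Let $\pi$ be a representation of $\mathcal{L}$ and $\tilde\pi$ its strictly continuous extension to $M(\mathcal{L})$. To see that the representation $\eta^*\pi = \tilde\pi \circ \eta$ of $S_\sigma$ is continuous, we show that $\eta$ is continuous with respect to the strict topology on $M(\mathcal{L})$. Since $S_\sigma$ is a topological direct limit of the subgroups $S_{m,\sigma}$, where $S_m = \operatorname{span}_{\mathbb{C}}\{e_1, \dots, e_m\}$, it suffices to show that $\eta$ is continuous on each subgroup $S_{m,\sigma}$. Recall that the twisted group algebra $C^*(S_m, \sigma) \cong \mathcal{L}^{(m)}$ is a full host algebra for $(S_m, \sigma)$ and that the corresponding strictly continuous homomorphism $\eta_m : S_{m,\sigma} \to M(\mathcal{L}^{(m)})$ is compatible with the embedding $\iota_m : M(\mathcal{L}^{(m)}) \to M(\mathcal{L})$ in the sense that $\eta{\restriction}S_{m,\sigma} = \iota_m \circ \eta_m$. Since $\iota_m$ restricts to an embedding on the unitary group (Lemma 3.3(ii)), the continuity of $\eta_m$ implies the continuity of $\eta$ on $S_{m,\sigma}$, which in turn implies the continuity of $\eta$. As a consequence, $\tilde\pi \circ \eta$ is a continuous unitary representation of $S_\sigma$ for each strictly continuous representation $\tilde\pi$ of $M(\mathcal{L})$.

To see that $\eta^*$ is injective, we have to show that two representations $\pi_1, \pi_2$ of $\mathcal{L}$ for which $\eta^*\pi_1 = \eta^*\pi_2$ are equal. If $\eta^*\pi_1 = \eta^*\pi_2$, then we obtain for each $m \in \mathbb{N}$ the relation $\eta_m^*\pi_1 = \eta_m^*\pi_2$ on $S_{m,\sigma}$. This means that the corresponding unitary representations of the group $S_{m,\sigma}$ coincide. In view of Lemma 3.3(iii), it suffices to argue that the two non-degenerate representations $\pi_{1,m}$ and $\pi_{2,m}$ of $\mathcal{L}^{(m)}$ coincide (cf. Lemma A.3 for the non-degeneracy), which in turn follows from the host algebra property of $\mathcal{L}^{(m)}$ for $S_{m,\sigma}$.

To characterize the range of $\eta^*$, let $\pi \in \operatorname{Rep}(\mathcal{A}, \mathcal{H})$ be the restriction to $\mathcal{A}$ of the strictly continuous extension of a $\pi_0 \in \operatorname{Rep}(\mathcal{L}, \mathcal{H})$. Then, by Lemma 3.3(iii), it must satisfy
\[ \pi_0(L_1 \otimes L_2 \otimes \cdots) = \operatorname{s-lim}_{n\to\infty} \pi_n(L_1 \otimes \cdots \otimes L_n) \]
for all $L_1 \otimes L_2 \otimes \cdots \in \mathcal{L}$. Now we have
\[ \pi_n(L_1 \otimes \cdots \otimes L_{n-1} \otimes P_n) = \tilde\pi_n(L_1 \otimes \cdots \otimes L_{n-1} \otimes \mathbf{1})\,\tilde\pi_n(\mathbf{1} \otimes \cdots \otimes \mathbf{1} \otimes P_n), \]
where $\tilde\pi_n$ denotes the strictly continuous extension to $M(\mathcal{L}^{(n)})$, and it is obvious that these two operators commute. From the algebra relations
\[ \mathcal{A}^{(n)} \supset \mathcal{A}^{(n-1)} \subset M(\mathcal{L}^{(n-1)}) \subset M(\mathcal{L}^{(n)}), \]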
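and the host algebra properties we get that $\tilde\pi_n(L_1 \otimes \cdots \otimes L_{n-1} \otimes \mathbf{1}) = \pi_{n-1}(L_1 \otimes \cdots \otimes L_{n-1})$ and $\tilde\pi_n(\mathbf{1} \otimes \cdots \otimes \mathbf{1} \otimes P_n) = \pi(P_n)$, so
\[ \pi_n(L_1 \otimes \cdots \otimes L_{n-1} \otimes P_n) = \pi_{n-1}(L_1 \otimes \cdots \otimes L_{n-1})\,\pi(P_n). \]
Thus, for $L = L_1 \otimes L_2 \otimes \cdots = A_1 \otimes A_2 \otimes \cdots \otimes A_k \otimes P_{k+1} \otimes P_{k+2} \otimes \cdots \in \mathcal{L}$, we get for $n > k$ that
\[ \pi_n(L_1 \otimes \cdots \otimes L_n) = \pi_k(A_1 \otimes \cdots \otimes A_k)\,\pi(P_{k+1}) \cdots \pi(P_n). \]
Using the fact that the projections $\pi(P_j)$ all commute, and writing $\mathbb{P}_k$ for the strong limit of the products $\pi(P_k) \cdots \pi(P_n)$ as $n \to \infty$, we obtain
\[ \pi_0(L_1 \otimes L_2 \otimes \cdots) = \operatorname{s-lim}_{n\to\infty} \pi_n(L_1 \otimes \cdots \otimes L_n) = \pi_k(A_1 \otimes \cdots \otimes A_k)\,\mathbb{P}_{k+1}. \]
Since $\pi_0$ is non-degenerate, and all $\pi_k{\restriction}\mathcal{L}^{(k)}$ are non-degenerate, it follows that $\operatorname{s-lim}_{k\to\infty} \mathbb{P}_k = \mathbf{1}$.

Conversely, if we start from a regular representation $\pi$ of $\mathcal{A}$ which satisfies $\operatorname{s-lim}_{k\to\infty} \mathbb{P}_k = \mathbf{1}$, we will define a representation $\pi_0$ on $\mathcal{L}$ by
\[ \pi_0(L) := \pi_k(A_1 \otimes \cdots \otimes A_k)\,\mathbb{P}_{k+1} \quad\text{for}\quad L = A_1 \otimes A_2 \otimes \cdots \otimes A_k \otimes P_{k+1} \otimes P_{k+2} \otimes \cdots, \]
where $\pi_k \in \operatorname{Rep}\mathcal{L}^{(k)}$ is obtained from $\pi{\restriction}\mathcal{A}^{(k)}$, using the host algebra property of $\mathcal{L}^{(k)}$. To see that this can be done, note that for $A \in \mathcal{L}^{(k)}$ we have $\pi_k(A)\,\mathbb{P}_{k+1} = \pi_{k+1}(\Psi_{k+1,k}(A))\,\mathbb{P}_{k+2}$. Therefore the universal property of the direct limit algebra $\mathcal{L}$ implies the existence of a representation $\pi_0$ of $\mathcal{L}$, satisfying
\[ \pi_0(\Psi_k(A)) = \pi_k(A)\,\mathbb{P}_{k+1} \quad\text{for } A \in \mathcal{L}^{(k)}. \]
That it is non-degenerate follows from the fact that each $\pi_k$ is non-degenerate, and that $\operatorname{s-lim}_{k\to\infty} \mathbb{P}_k = \mathbf{1}$. To see that $\tilde\pi_0{\restriction}\mathcal{A} = \pi$, recall that $\pi_k$ is the representation obtained from $\pi{\restriction}\mathcal{A}^{(k)}$, using the host algebra property of $\mathcal{L}^{(k)}$. Let $B \in \mathcal{A}^{(k)}$, then for $A \in \mathcal{L}^{(k)}$ we have
\[ \tilde\pi_0(B)\,\pi_0(\Psi_k(A)) = \pi_0(B \cdot \Psi_k(A)) = \pi_k(B \cdot A)\,\mathbb{P}_{k+1} = \pi(B)\,\pi_k(A)\,\mathbb{P}_{k+1} = \pi(B)\,\pi_0(\Psi_k(A)), \]
from which it follows that $\tilde\pi_0{\restriction}\mathcal{A} = \pi$.

Thus for every family of projections $P_k \in \mathcal{L}_k$ we get a host algebra. Now recall that $\mathcal{L}_k \cong \mathcal{K}(\ell^2(\mathbb{N}))$, and that there is a (countable) approximate identity $(E_n)_{n\in\mathbb{N}}$ in $\mathcal{K}(\ell^2(\mathbb{N}))$ consisting of a strictly increasing sequence of projections $E_n$ with $\dim(E_n \ell^2(\mathbb{N})) = n$. For each $k$, choose such an approximate identity $(E_n^{(k)})_{n\in\mathbb{N}} \subset \mathcal{L}_k$, then for each sequence $\mathbf{n} = (n_1, n_2, \dots) \in \mathbb{N}^\infty := \mathbb{N}^{\mathbb{N}}$, we have a sequence of projections $(E^{(1)}_{n_1}, E^{(2)}_{n_2}, \dots)$ from which we can construct an infinite tensor product as above, and we will denote it by $\mathcal{L}[\mathbf{n}]$. For the elementary tensors, we streamline the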
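notation to:
\[ A_1 \otimes \cdots \otimes A_k \otimes E[\mathbf{n}]_{k+1} := A_1 \otimes \cdots \otimes A_k \otimes E^{(k+1)}_{n_{k+1}} \otimes E^{(k+2)}_{n_{k+2}} \otimes \cdots \in \mathcal{L}[\mathbf{n}], \]
where $A_i \in \mathcal{L}_i$, and their closed span is the simple C*-algebra $\mathcal{L}[\mathbf{n}]$.

Next we want to define componentwise multiplication between different C*-algebras $\mathcal{L}[\mathbf{n}]$ and $\mathcal{L}[\mathbf{m}]$. This can of course be done in the algebraic infinite tensor product of the algebras $\mathcal{L}_k$ (cf. [5, p. 470]) using suitable closures of subalgebras, but it is faster to proceed as follows. Note that for componentwise multiplication, the sequences give:
\[ (E^{(1)}_{n_1}, E^{(2)}_{n_2}, \dots) \cdot (E^{(1)}_{m_1}, E^{(2)}_{m_2}, \dots) = (E^{(1)}_{p_1}, E^{(2)}_{p_2}, \dots), \qquad (3.2) \]
where $p_j := \min(n_j, m_j)$, i.e. multiplication reduces the entries, and hence the sequence $(E^{(1)}_1, E^{(2)}_1, E^{(3)}_1, \dots)$ is invariant under such multiplication. So we define an embedding $\mathcal{L}[\mathbf{n}] \subseteq M(\mathcal{L}[\mathbf{1}])$ for all $\mathbf{n}$, where $\mathbf{1} := (1, 1, \dots)$, by
\[
(A_1 \otimes \cdots \otimes A_k \otimes E[\mathbf{n}]_{k+1}) \cdot (B_1 \otimes \cdots \otimes B_n \otimes E[\mathbf{1}]_{n+1}) :=
\begin{cases}
A_1 B_1 \otimes \cdots \otimes A_n B_n \otimes A_{n+1} E^{(n+1)}_1 \otimes \cdots \otimes A_k E^{(k)}_1 \otimes E[\mathbf{1}]_{k+1} & \text{if } n \le k, \\[4pt]
A_1 B_1 \otimes \cdots \otimes A_k B_k \otimes E^{(k+1)}_{n_{k+1}} B_{k+1} \otimes \cdots \otimes E^{(n)}_{n_n} B_n \otimes E[\mathbf{1}]_{n+1} & \text{if } n \ge k
\end{cases}
\]
for the left action, and similarly for the right action on $\mathcal{L}[\mathbf{1}]$. To see that this is an embedding as claimed, choose a faithful representation $\pi_i$ of each $\mathcal{L}_i \cong \mathcal{K}(\mathcal{H})$ on a Hilbert space $\mathcal{H}_i$ and let $\psi_n$ be a unit vector in $E^{(n)}_1 \mathcal{H}_n$. Construct the infinite tensor product Hilbert space $\bigotimes_{n=1}^{\infty} \mathcal{H}_n$ with respect to the sequence $(\psi_1, \psi_2, \dots)$, and note that for each $\mathcal{L}[\mathbf{n}]$, the tensor representation $\bigotimes_{n=1}^{\infty} \pi_n$ on $\bigotimes_{n=1}^{\infty} \mathcal{H}_n$ is faithful (since it is faithful on the C*-algebras of which they are inductive limits). Then it is obvious that the given multiplication above is concretely realised on this Hilbert space, and by faithfulness of the representations we realise the embeddings $\mathcal{L}[\mathbf{n}] \subseteq M(\mathcal{L}[\mathbf{1}])$ for all $\mathbf{n}$. Then
\[ \mathcal{L}[\mathbf{n}] \cdot \mathcal{L}[\mathbf{m}] \subseteq \mathcal{L}[\mathbf{p}], \qquad (3.3) \]
where $p_j := \min(n_j, m_j)$, and in fact
\[ \mathcal{L}[\mathbf{n}] \subset M(\mathcal{L}[\mathbf{p}]) \supset \mathcal{L}[\mathbf{m}]. \qquad (3.4) \]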
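The reduction rule behind (3.2) and (3.3) is simply that nested projections multiply to the smaller one. As a hedged illustration, the sketch below models each $E^{(k)}_n$ by the diagonal of the rank-$n$ projection onto the first $n$ basis vectors of a finite-dimensional space, and multiplies finite initial segments of two sequences slot by slot. The truncation dimension `DIM`, the diagonal model and the function names are all assumptions made for this example.

```python
DIM = 6  # assumed truncation dimension for each tensor slot

def E(n, dim=DIM):
    """Diagonal of the rank-n projection onto the first n basis vectors."""
    return [1 if i < n else 0 for i in range(dim)]

def multiply(p, q):
    """Product of two diagonal operators, given by their diagonals."""
    return [a * b for a, b in zip(p, q)]

def sequence_product(ns, ms, dim=DIM):
    """Slot-by-slot product (E_{n_1}, E_{n_2}, ...) * (E_{m_1}, E_{m_2}, ...)."""
    return [multiply(E(n, dim), E(m, dim)) for n, m in zip(ns, ms)]

ns = (3, 1, 4, 2)
ms = (2, 5, 4, 1)
ps = tuple(min(n, m) for n, m in zip(ns, ms))

product_of_sequences = sequence_product(ns, ms)
expected = [E(p) for p in ps]

# The constant sequence (1, 1, ...) absorbs every other sequence.
ones = (1, 1, 1, 1)
absorbed = sequence_product(ns, ones) == [E(1) for _ in ones]
```

Here `product_of_sequences` agrees with `expected`, which is the finite-slot shadow of $p_j = \min(n_j, m_j)$, and `absorbed` reflects why $\mathcal{L}[\mathbf{1}]$ is a natural home for all the products.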
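Using the embedding $\mathcal{L}[\mathbf{n}] \subseteq M(\mathcal{L}[\mathbf{1}])$ for all $\mathbf{n}$, we define the C*-algebra in $M(\mathcal{L}[\mathbf{1}])$ generated by all $\mathcal{L}[\mathbf{n}]$, and denote it by $\mathcal{L}[E]$. By (3.3), this is just the closed span of all $\mathcal{L}[\mathbf{n}]$ and hence the closure of the dense $*$-subalgebra $\mathcal{L}_0 \subset \mathcal{L}[E]$, where
\[ \mathcal{L}_0 := \sum_{\mathbf{n}\in\mathbb{N}^\infty} \mathcal{L}[\mathbf{n}]_0 \ \text{(finite sums)} \quad\text{and}\quad \mathcal{L}[\mathbf{n}]_0 := \bigcup_{k\in\mathbb{N}} \mathcal{L}^{(k)} \otimes E[\mathbf{n}]_{k+1}. \]
We still have $\mathcal{A} \subset M(\mathcal{L}[E]) \supset \mathcal{L}^{(n)}$ for each $n \in \mathbb{N}$. Note that if two sequences $\mathbf{n}$ and $\mathbf{m}$ differ only in a finite number of entries, then $\mathcal{L}[\mathbf{n}] = \mathcal{L}[\mathbf{m}]$, and hence we actually have that the correct index set for the algebras $\mathcal{L}[\mathbf{n}]$ is not the sequences $\mathbb{N}^\infty$, but the set of equivalence classes $\mathbb{N}^\infty/{\sim}$ where $\mathbf{n} \sim \mathbf{m}$ if they differ only in finitely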
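many entries. Some of the structures of $\mathbb{N}^\infty$ will factor through to $\mathbb{N}^\infty/{\sim}$, e.g., we have a partial ordering of equivalence classes defined by $[\mathbf{n}] \ge [\mathbf{m}]$ if for any representatives $\mathbf{n}$ and $\mathbf{m}$ respectively, we have that there is an $N$ (depending on the representatives) such that $n_k \ge m_k$ for all $k > N$. In particular, we note that products reduce sequences, i.e. we have $\mathcal{L}[\mathbf{n}] \cdot \mathcal{L}[\mathbf{p}] \subseteq \mathcal{L}[\mathbf{q}]$ for $q_i = \min(n_i, p_i)$, so $[\mathbf{n}] \ge [\mathbf{q}] \le [\mathbf{p}]$. Let $\varphi : \mathbb{N}^\infty/{\sim} \to \mathbb{N}^\infty$ be a section of the factor map. Then $\mathcal{L}[E]$ is the C*-algebra generated in $M(\mathcal{L}[\mathbf{1}])$ by $\{\mathcal{L}[\varphi(\gamma)] \mid \gamma \in \mathbb{N}^\infty/{\sim}\}$, and it is the closure of the span of the elementary tensors in this generating set.

Below we will prove that $\mathcal{L}[E]$ is a full host algebra for $(S, \sigma)$, and so it is of some interest to explore its algebraic structure. From the reducing property of products, we already know that $\mathcal{L}[E]$ has the ideal $\mathcal{L}[\mathbf{1}]$ (we will show that it is proper), hence that it is not simple. However, it has in fact infinitely many proper ideals and each of the generating algebras $\mathcal{L}[\mathbf{n}]$ is contained in such an ideal:

Proposition 3.5. For the C*-algebra $\mathcal{L}[E]$, we have the following:
(i) $\mathcal{L}[E]$ is nonseparable.
(ii) Define $\mathcal{I}[\mathbf{n}_1, \dots, \mathbf{n}_k]$ to be the closed span of $\{\mathcal{L}[\mathbf{q}]_0 \mid [\mathbf{q}] \le [\mathbf{n}_\ell] \text{ for some } \ell = 1, \dots, k\}$. Let $[\mathbf{p}] > [\mathbf{n}_\ell]$ strictly for all $\ell \in \{1, \dots, k\}$, then $\mathcal{L}[\mathbf{p}] \cap \mathcal{I}[\mathbf{n}_1, \dots, \mathbf{n}_k] = \{0\}$.
(iii) $\mathcal{I}[\mathbf{n}_1, \dots, \mathbf{n}_k]$ is a proper closed two-sided ideal of $\mathcal{L}[E]$.
(iv) Define $\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k] := C^*(\mathcal{L}[\mathbf{n}_1] \cup \cdots \cup \mathcal{L}[\mathbf{n}_k])$. Then $\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k] \subset \mathcal{I}[\mathbf{n}_1, \dots, \mathbf{n}_k]$ and
\[ C^*(\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k] \cdot \mathcal{L}[\mathbf{n}_{k+1}]) \subseteq \mathcal{L}[\mathbf{q}_1, \dots, \mathbf{q}_k], \quad\text{where } (\mathbf{q}_j)_\ell = \min((\mathbf{n}_j)_\ell, (\mathbf{n}_{k+1})_\ell). \]

Proof. (i) $\mathcal{L}[E] \supset Q := \{E[\mathbf{n}]_1 := E^{(1)}_{n_1} \otimes E^{(2)}_{n_2} \otimes \cdots \mid \mathbf{n} \in \mathbb{N}^\infty\}$. If $\mathbf{n} \neq \mathbf{p}$, there is some $k$ for which $E^{(k)}_{n_k} \neq E^{(k)}_{p_k}$ and as the approximate identity is linearly increasing, one of these must be larger than the other, so take $E^{(k)}_{n_k} > E^{(k)}_{p_k}$ strictly. Group the remaining parts of the tensor product together, i.e. write
\[ E[\mathbf{n}]_1 = E^{(k)}_{n_k} \otimes A \quad\text{and}\quad E[\mathbf{p}]_1 = E^{(k)}_{p_k} \otimes B, \]
where $A$ and $B$ are projections, then choose a product representation $\pi = \pi_1 \otimes \pi_2$ in which $\pi_1$ is faithful on $\mathcal{L}_k$ and $\pi_2$ is faithful on the C*-algebra generated by $A$ and $B$. Thus there is a unit vector $\psi \in \mathcal{H}_1$ such that $\|\pi_1(E^{(k)}_{n_k})\psi\| = 1$ and $\pi_1(E^{(k)}_{p_k})\psi = 0$. For any unit vector $\varphi \in \mathcal{H}_2$ we get
\[
\begin{aligned}
\|E[\mathbf{n}]_1 - E[\mathbf{p}]_1\| &\ge \|(\pi_1 \otimes \pi_2)(E^{(k)}_{n_k} \otimes A - E^{(k)}_{p_k} \otimes B)(\psi \otimes \varphi)\| \\
&= \|\pi_1(E^{(k)}_{n_k})\psi \otimes \pi_2(A)\varphi\| = \|\pi_1(E^{(k)}_{n_k})\psi\| \cdot \|\pi_2(A)\varphi\| = \|\pi_2(A)\varphi\|
\end{aligned}
\]
and by letting $\varphi$ range over the unit ball we get that $\|E[\mathbf{n}]_1 - E[\mathbf{p}]_1\| \ge \|A\| = 1$. Thus, since $Q$ is uncountable and its elements far apart, $\mathcal{L}[E]$ cannot be separable.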
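The separation estimate in (i) can be checked by hand on a finite truncation. In the sketch below, each $E^{(k)}_n$ is again modelled (as an assumption of this example) by a diagonal rank-$n$ projection, so a finite tensor product of such projections is a diagonal operator whose norm is the largest absolute diagonal entry. The function names and the truncation dimension are hypothetical and chosen only for illustration.

```python
from itertools import product

def tensor_diagonal(ns, dim):
    """Diagonal of E_{n_1} (x) ... (x) E_{n_k} in the product basis.

    The entry at multi-index (i_1, ..., i_k) is 1 exactly when i_j < n_j
    for every slot j, and 0 otherwise.
    """
    return [int(all(i < n for i, n in zip(idx, ns)))
            for idx in product(range(dim), repeat=len(ns))]

def norm_of_difference(ns, ps, dim=4):
    """Operator norm of the difference of two diagonal tensor projections."""
    d_n = tensor_diagonal(ns, dim)
    d_p = tensor_diagonal(ps, dim)
    return max(abs(a - b) for a, b in zip(d_n, d_p))

# Sequences differing in one slot, with a comparable entry there.
distance_comparable = norm_of_difference((2, 3, 1), (2, 1, 1))

# Sequences that are incomparable slot by slot are still at distance 1.
distance_incomparable = norm_of_difference((2, 1), (1, 2))

# Identical sequences give distance 0.
distance_equal = norm_of_difference((2, 3, 1), (2, 3, 1))
```

In this model any two distinct finite sequences of nonzero projections are at norm distance exactly 1, matching the lower bound $\|E[\mathbf{n}]_1 - E[\mathbf{p}]_1\| \ge 1$ obtained in the proof.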
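(ii) Here we adapt the argument in (i) as follows. It suffices to show that for $\mathbf{q}_1, \dots, \mathbf{q}_d$ with $[\mathbf{q}_i] \le [\mathbf{n}_j]$ for some $j$, the norm distance between $\sum_{i=1}^{d} \mathcal{L}[\mathbf{q}_i]_0$ and any $C \in \mathcal{L}[\mathbf{p}]_0$ is always $\ge \|C\|$. Let $C \in \mathcal{L}[\mathbf{p}]_0$ be nonzero and consider a sum $\sum_{i=1}^{d} C_i$ with $C_i \in \mathcal{L}[\mathbf{q}_i]_0$ and $[\mathbf{p}] > [\mathbf{n}_j]$ for all $j$, which implies $[\mathbf{p}] > [\mathbf{q}_i]$ for all $i$. Choose an $M > 0$ large enough so that all $C$ and $C_i$ can be expressed in the form:
\[ C_i = C_i^{(0)} \otimes E[\mathbf{q}_i]_M, \quad\text{for } C_i^{(0)} \in \mathcal{L}^{(M-1)}. \]
Then by $[\mathbf{p}] > [\mathbf{q}_i]$ there is an entry of the tensor products, say for $j > M$, which consists only of elements of the approximate identity $(E^{(j)}_n)_{n=1}^{\infty} \subset \mathcal{L}_j$ and for which $B > B_i$ for all $i$, where $B$ (respectively $B_i$) is the $j$th entry of $C$ (respectively $C_i$). Denote the remaining parts of the tensor products by $A$ (respectively $A_i$), i.e.
\[ C = A \otimes B, \quad C_i = A_i \otimes B_i, \quad\text{where } B > B_i\ \forall\, i \]
and $B$, $B_i$ consist of commuting projections. Then
\[ C - \sum_{i=1}^{d} C_i = A \otimes B - \sum_{i=1}^{d} (A_i \otimes B_i). \]
Choose a product representation $\pi = \pi_1 \otimes \pi_2$ such that $\pi_1$ is faithful on $\mathcal{L}[\mathbf{p}]$ and $\pi_2$ is faithful on the C*-algebra generated by $(E^{(j)}_n)_{n=1}^{\infty} \subset \mathcal{L}_j$. Thus there is a unit vector $\varphi \in \mathcal{H}_{\pi_2}$ such that $\|\pi_2(B)\varphi\| = 1$ and $\pi_2(B_i)\varphi = 0$ for all $i$ (which exists because $B > B_i$ for all $i$). Then we have for any unit vector $\psi \in \mathcal{H}_{\pi_1}$ that
\[
\begin{aligned}
\Big\|C - \sum_{i=1}^{d} C_i\Big\| &\ge \Big\|(\pi_1 \otimes \pi_2)\Big(A \otimes B - \sum_{i=1}^{d} A_i \otimes B_i\Big)(\psi \otimes \varphi)\Big\| \\
&= \|\pi_1(A)\psi \otimes \pi_2(B)\varphi\| = \|\pi_1(A)\psi\| \cdot \|\pi_2(B)\varphi\| = \|\pi_1(A)\psi\|
\end{aligned}
\]
and by letting $\psi$ range over the unit ball of $\mathcal{H}_{\pi_1}$, we find that $\|C - \sum_{i=1}^{d} C_i\| \ge \|A\| = \|C\|$ since $\|B\| = 1$. This establishes the claim.

(iii) It is obvious from the reduction property $\mathcal{L}[\mathbf{n}] \cdot \mathcal{L}[\mathbf{p}] \subseteq \mathcal{L}[\mathbf{q}]$ for $q_j = \min(n_j, p_j)$, that $\mathcal{I}[\mathbf{n}_1, \dots, \mathbf{n}_k]$ is a two-sided ideal (hence a $*$-algebra). To see that it is proper, note that $[\mathbf{p}] > [\mathbf{n}_i]$ strictly for all $i$ where $p_j = \max((\mathbf{n}_1)_j, \dots, (\mathbf{n}_k)_j) + 1$. Thus, by (ii) we see that $\mathcal{L}[\mathbf{p}] \cap \mathcal{I}[\mathbf{n}_1, \dots, \mathbf{n}_k] = \{0\}$ and hence that $\mathcal{I}[\mathbf{n}_1, \dots, \mathbf{n}_k]$ is proper.

(iv) $\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k] \subset \mathcal{I}[\mathbf{n}_1, \dots, \mathbf{n}_k]$ because $\mathcal{I}[\mathbf{n}_1, \dots, \mathbf{n}_k]$ is a C*-algebra which contains all the generating elements $\mathcal{L}[\mathbf{n}_i]$ of $\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k]$. Next we need to prove that $C^*(\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k] \cdot \mathcal{L}[\mathbf{n}_{k+1}]) \subseteq \mathcal{L}[\mathbf{q}_1, \dots, \mathbf{q}_k]$, where $(\mathbf{q}_j)_\ell = \min((\mathbf{n}_j)_\ell, (\mathbf{n}_{k+1})_\ell)$. By definition, $C^*(\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k] \cdot \mathcal{L}[\mathbf{n}_{k+1}])$ is the closed linear span of monomials $\prod_{i=1}^{N} L_i$, where $L_i$ can be either of the form $A_i B_i$ or $B_i A_i$, where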
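$A_i \in \mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k]$ and $B_i \in \mathcal{L}[\mathbf{n}_{k+1}]$. So it suffices to show that $AB \in \mathcal{L}[\mathbf{q}_1, \dots, \mathbf{q}_k]$ for $A \in \mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k]$ and $B \in \mathcal{L}[\mathbf{n}_{k+1}]$ (since then $BA \in \mathcal{L}[\mathbf{q}_1, \dots, \mathbf{q}_k]$ by involution). Since $\mathcal{L}[\mathbf{n}]_0$ is dense in $\mathcal{L}[\mathbf{n}]$, it suffices to prove this for $A = A_1 A_2 \cdots A_p$ where $A_i = C_i \otimes E[\mathbf{n}_{k_i}]_{r_i+1}$ and $C_i \in \mathcal{L}^{(r_i)}$, $k_i \in \{1, \dots, k\}$, and $B = D \otimes E[\mathbf{n}_{k+1}]_{r+1}$, where $D \in \mathcal{L}^{(r)}$. Now $A_p B = F \otimes E[\mathbf{q}_{k_p}]_{s+1} \in \mathcal{L}[\mathbf{q}_{k_p}]$ for some $F \in \mathcal{L}^{(s)}$, $s \ge \max(r_p, r)$. Then
\[ A_{p-1} A_p B = (C_{p-1} \otimes E[\mathbf{n}_{k_{p-1}}]_{r_{p-1}+1})(F \otimes E[\mathbf{q}_{k_p}]_{s+1}) = G \otimes E[\mathbf{m}]_{t+1}, \]
where $t \ge \max(r_{p-1}, s)$ and
\[
\begin{aligned}
m_i &= \min((\mathbf{n}_{k_{p-1}})_i, (\mathbf{q}_{k_p})_i) = \min((\mathbf{n}_{k_{p-1}})_i, \min((\mathbf{n}_{k_p})_i, (\mathbf{n}_{k+1})_i)) \\
&= \min(\min((\mathbf{n}_{k_{p-1}})_i, (\mathbf{n}_{k+1})_i), \min((\mathbf{n}_{k_p})_i, (\mathbf{n}_{k+1})_i)) = \min((\mathbf{q}_{k_{p-1}})_i, (\mathbf{q}_{k_p})_i)
\end{aligned}
\]
and so we have in fact that
\[ A_{p-1} A_p B = (\tilde{C} \otimes E[\mathbf{q}_{k_{p-1}}]_{t+1})(\tilde{F} \otimes E[\mathbf{q}_{k_p}]_{t+1}) \in \mathcal{L}[\mathbf{q}_{k_{p-1}}] \cdot \mathcal{L}[\mathbf{q}_{k_p}], \]
where $\tilde{C}, \tilde{F} \in \mathcal{L}^{(t)}$. Hence $A_{p-1} A_p B \in \mathcal{L}[\mathbf{q}_{k_{p-1}}, \mathbf{q}_{k_p}]$. We continue the process to get $AB = A_1 A_2 \cdots A_p B \in \mathcal{L}[\mathbf{q}_1, \dots, \mathbf{q}_k]$.

For each strictly increasing sequence $([\mathbf{n}_1], [\mathbf{n}_2], \dots) \subset \mathbb{N}^\infty/{\sim}$ we get from part (ii) a strictly increasing chain of proper ideals $\mathcal{J}_k := \mathcal{I}[\mathbf{n}_1, \dots, \mathbf{n}_k]$. Now we want to prove our main theorem in this section.

Theorem 3.6. The monomorphism $\eta : S_\sigma \to U(M(\mathcal{L}[E]))$ from above, defined by $\eta((s, t)) := t\delta_s \in \mathcal{A} \subset M(\mathcal{L}[E])$, is continuous with respect to the strict topology on $M(\mathcal{L}[E])$ and $\mathcal{L}[E]$ is a host algebra, i.e. the map $\eta^* : \operatorname{Rep}(\mathcal{L}[E], \mathcal{H}) \to \operatorname{Rep}((S, \sigma), \mathcal{H})$ is injective. The range of $\eta^*$ is exactly $\mathcal{R}(\mathcal{H})$.

Proof. First we show that $\eta$ is continuous with respect to the strict topology on $M(\mathcal{L}[E])$. This implies that for each $\pi \in \operatorname{Rep}(\mathcal{L}[E], \mathcal{H})$ the representation $\tilde\pi{\restriction}\mathcal{A} \in \operatorname{Rep}(\mathcal{A}, \mathcal{H})$ is regular, hence $\eta^*(\operatorname{Rep}(\mathcal{L}[E], \mathcal{H})) \subseteq \mathcal{R}(\mathcal{H})$. Since $\operatorname{im}(\eta)$ is bounded, it suffices to show that the set $\{L \in \mathcal{L}[E] \mid g \mapsto \eta(g)L \text{ is norm continuous in } g \in S_\sigma\}$ spans a dense subspace of $\mathcal{L}[E]$.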
This reduces the assertion to the corresponding result for the action of Sσ on L[n] for each n, which follows from the continuity of the corresponding map Sσ → M (L[n]) (Proposition 3.4).
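To prove that $\eta^*$ is injective we show that $\mathcal{A}$ separates $\operatorname{Rep}(\mathcal{L}[E], \mathcal{H})$ for all $\mathcal{H}$. Let $\pi \in \operatorname{Rep}(\mathcal{L}[E], \mathcal{H})$, then by Proposition 3.4 we know that the values which $\tilde\pi(\mathcal{A})$ takes on $\mathcal{H}_{\mathbf{n}}$ uniquely determine the values of $\pi(\mathcal{L}[\mathbf{n}])$ on its essential subspace $\mathcal{H}_{\mathbf{n}}$, hence on all $\mathcal{H}$, as $\pi(\mathcal{L}[\mathbf{n}])$ is zero on the orthogonal complement of $\mathcal{H}_{\mathbf{n}}$. This holds for all $\mathbf{n}$, hence $\tilde\pi(\mathcal{A})$ uniquely determines the values of $\pi$ on $\mathcal{L}[E]$, i.e. $\eta^*$ is injective.

It remains to prove that $\eta^*(\operatorname{Rep}(\mathcal{L}[E], \mathcal{H})) = \mathcal{R}(\mathcal{H})$. Start from a $\pi \in \operatorname{Rep}(\mathcal{A}, \mathcal{H})$ which is regular. Then we have to show how to obtain a $\pi_0 \in \operatorname{Rep}\mathcal{L}[E]$ such that $\tilde\pi_0{\restriction}\mathcal{A} = \pi$. Observe that $\pi$ is regular on all $\mathcal{A}^{(n)}$, hence there are unique $\pi_n \in \operatorname{Rep}(\mathcal{L}^{(n)}, \mathcal{H})$ which extend (on $\mathcal{H}$) to coincide with $\pi{\restriction}\mathcal{A}^{(n)}$ by the host algebra property of $\mathcal{L}^{(n)}$. For each $\mathbf{n}$ define the projections
\[ \mathbb{E}^{\mathbf{n}}_k := \operatorname{s-lim}_{m\to\infty} \pi(E^{(k)}_{n_k}) \cdots \pi(E^{(m)}_{n_m}) \quad\text{and}\quad \mathbb{E}^{\mathbf{n}} := \operatorname{s-lim}_{k\to\infty} \mathbb{E}^{\mathbf{n}}_k. \]
Now each $\pi_n(\mathcal{L}^{(n)})$ commutes with the projections $\mathbb{E}^{\mathbf{n}}_k$ for $k > n$, and in particular preserves the space $\mathcal{H}_{\mathbf{n}} := \mathbb{E}^{\mathbf{n}}\mathcal{H}$, and hence so does $\pi(\mathcal{A}^{(n)})$. Then by Proposition 3.4 we know that we can define a (non-degenerate) representation $\pi_0^{\mathbf{n}} : \mathcal{L}[\mathbf{n}] \to \mathcal{B}(\mathcal{H}_{\mathbf{n}})$ by
\[ \pi_0^{\mathbf{n}}(L) = \pi_k(A_1 \otimes \cdots \otimes A_k)\,\mathbb{E}^{\mathbf{n}}_{k+1} \]
for $L = A_1 \otimes \cdots \otimes A_k \otimes E^{(k+1)}_{n_{k+1}} \otimes E^{(k+2)}_{n_{k+2}} \otimes \cdots \in \mathcal{L}[\mathbf{n}]$ such that $\tilde\pi_0^{\mathbf{n}}{\restriction}\mathcal{A}$ is $\pi(\mathcal{A})$, restricted to $\mathcal{H}_{\mathbf{n}}$. We extend $\pi_0^{\mathbf{n}}$ to all of $\mathcal{H}$, by putting it to zero on the orthogonal complement of $\mathcal{H}_{\mathbf{n}}$. Note that $\mathbf{n} \le \mathbf{m} \Rightarrow \mathcal{H}_{\mathbf{n}} \subseteq \mathcal{H}_{\mathbf{m}}$.

We now argue that these representations $\pi_0^{\mathbf{n}}$ combine into a single representation of $\mathcal{L}[E]$. First, we want to extend by linearity the maps $\pi_0^{\mathbf{n}} : \mathcal{L}[\mathbf{n}] \to \mathcal{B}(\mathcal{H})$ to define a linear map $\pi_0$ from the dense $*$-subalgebra $\mathcal{L}_0 \subset \mathcal{L}[E]$ to $\mathcal{B}(\mathcal{H})$, where we recall that $\mathcal{L}_0 := \sum_{\mathbf{n}\in\mathbb{N}^\infty} \mathcal{L}[\mathbf{n}]_0$ (finite sums). This linear extension $\pi_0$ is possible if the sum of the spaces $\mathcal{L}[\mathbf{n}]_0$ is direct for different $\mathbf{n} \in \varphi(\mathbb{N}^\infty/{\sim})$, i.e. if $0 = \sum_{k=1}^{m} B_k$ for $B_k \in \mathcal{L}[\mathbf{n}_k]_0$, where $\mathbf{n}_k \not\sim \mathbf{n}_\ell$ if $k \neq \ell$, implies that $B_k = 0$ for all $k$. Let us prove this implication, so assume $0 = \sum_{k=1}^{m} B_k$ as above. Choose an $M > 0$ large enough so that for all $k$, the $B_k$ can be expressed in the form $B_k = A_k \otimes E[\mathbf{n}_k]_M$ for $A_k \in \mathcal{L}^{(M-1)}$, define the projections $P_k := \mathbf{1} \otimes \cdots \otimes \mathbf{1} \otimes E[\mathbf{1}]_k$ (there are $k - 1$ factors of $\mathbf{1}$), and note that $P_\ell$ commutes with all $B_k$ for $\ell \ge M$. In fact, for $B_k$ as above, we have (simplifying notation to $\mathbf{n}_k = \mathbf{n}$):
\[ B_k P_\ell = A_k \otimes E^{(M)}_{n_M} \otimes \cdots \otimes E^{(\ell-1)}_{n_{\ell-1}} \otimes E[\mathbf{1}]_\ell \in \mathcal{L}^{(\ell-1)} \otimes E[\mathbf{1}]_\ell \]
and so multiplication by $P_\ell$ for $\ell \ge M$ maps the $B_k$ to elementary tensors of the form $A_k \otimes E^{(M)}_{n_M} \otimes \cdots \otimes E^{(\ell-1)}_{n_{\ell-1}}$ in $\mathcal{L}^{(\ell-1)}$ (after identifying $\mathcal{L}^{(\ell-1)} \otimes E[\mathbf{1}]_\ell$ with $\mathcal{L}^{(\ell-1)}$). Now a set of elementary tensors (in a finite tensor product) will be linearly independent if the entries in a fixed slot are linearly independent, so it suffices to find $\ell > M$ such that the pieces $E^{(M)}_{n_M} \otimes \cdots \otimes E^{(\ell-1)}_{n_{\ell-1}}$ are linearly independent for $\mathbf{n} \in N := \{\mathbf{n}_k \mid k = 1, \dots, m\}$. Since the approximate identities $(E^{(k)}_n)_{n=1}^{\infty} \subset \mathcal{L}_k$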
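consist of strictly increasing projections, their terms are linearly independent, from which it follows that tensor products of these with distinct entries are linearly independent. Thus we only have to identify an $\ell$ large enough so that the portions of the sequences $\mathbf{n}_k$ between the entries $M$ and $\ell$ can distinguish all the sequences in $N$, and this is always possible since the $\mathbf{n}_k$ are representatives of distinct equivalence classes in $\mathbb{N}^\infty/{\sim}$. Thus $\{B_1 P_\ell, \dots, B_m P_\ell\}$ is linearly independent for this $\ell$, so $0 = \sum_{k=1}^{m} B_k P_\ell$ implies that all $B_k = 0$. We conclude that the linear extension $\pi_0$ exists.

That $\pi_0$ respects involution is clear. To see that it is a homomorphism, consider two elementary tensors $L = A_1 \otimes A_2 \otimes \cdots \otimes A_k \otimes E[\mathbf{n}]_{k+1} \in \mathcal{L}[\mathbf{n}]$ and $M = B_1 \otimes B_2 \otimes \cdots \otimes B_m \otimes E[\mathbf{p}]_{m+1} \in \mathcal{L}[\mathbf{p}]$ where $m > k$ and $\mathbf{n} \not\sim \mathbf{p} \in \mathbb{N}^\infty$. Then, with $\mathbb{E}^{\mathbf{n}}_k$ denoting the strong limit of the products $\pi(E^{(k)}_{n_k}) \cdots \pi(E^{(m)}_{n_m})$ as $m \to \infty$,
\[
\begin{aligned}
\pi_0(L)\,\pi_0(M) &= \pi_k(A_1 \otimes \cdots \otimes A_k)\,\mathbb{E}^{\mathbf{n}}_{k+1}\,\pi_m(B_1 \otimes \cdots \otimes B_m)\,\mathbb{E}^{\mathbf{p}}_{m+1} \\
&= \pi_m\big(A_1 \otimes \cdots \otimes A_k \otimes E^{(k+1)}_{n_{k+1}} \otimes \cdots \otimes E^{(m)}_{n_m}\big)\,\mathbb{E}^{\mathbf{n}}_{m+1}\,\pi_m(B_1 \otimes \cdots \otimes B_m)\,\mathbb{E}^{\mathbf{p}}_{m+1} \\
&= \pi_m\big(A_1 B_1 \otimes \cdots \otimes A_k B_k \otimes E^{(k+1)}_{n_{k+1}} B_{k+1} \otimes \cdots \otimes E^{(m)}_{n_m} B_m\big)\,\mathbb{E}^{\mathbf{n}}_{m+1}\,\mathbb{E}^{\mathbf{p}}_{m+1}.
\end{aligned}
\]
Now recall that the operator product is jointly continuous on bounded sets in the strong operator topology, hence
\[
\begin{aligned}
\mathbb{E}^{\mathbf{n}}_k\,\mathbb{E}^{\mathbf{p}}_k &= \operatorname{s-lim}_{m\to\infty} \pi(E^{(k)}_{n_k}) \cdots \pi(E^{(m)}_{n_m}) \cdot \operatorname{s-lim}_{r\to\infty} \pi(E^{(k)}_{p_k}) \cdots \pi(E^{(r)}_{p_r}) \\
&= \operatorname{s-lim}_{m\to\infty} \pi(E^{(k)}_{n_k}) \cdots \pi(E^{(m)}_{n_m})\,\pi(E^{(k)}_{p_k}) \cdots \pi(E^{(m)}_{p_m}) \\
&= \operatorname{s-lim}_{m\to\infty} \pi(E^{(k)}_{q_k}) \cdots \pi(E^{(m)}_{q_m}) = \mathbb{E}^{\mathbf{q}}_k,
\end{aligned}
\]
where $q_j := \min(n_j, p_j)$. Thus we get exactly that $\pi_0(L)\,\pi_0(M) = \pi_0(LM)$.

We now verify that $\pi_0$ is bounded. For this, we first need to prove the following:

Claim. Recall that $\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k] = C^*(\mathcal{L}[\mathbf{n}_1] \cup \cdots \cup \mathcal{L}[\mathbf{n}_k])$. Then for each $k \ge 1$ and $k$-tuple $(\mathbf{n}_1, \dots, \mathbf{n}_k)$ such that $\mathbf{n}_i \not\sim \mathbf{n}_j$ if $i \neq j$, the map $\pi_0$ on $\mathcal{L}_0 \cap \mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k]$ extends to a representation of the C*-algebra $\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k]$.

Proof. Note that the claim implies the compatibility of the representations, i.e. on intersections $\mathcal{L}[\mathbf{p}_1, \dots, \mathbf{p}_\ell] \cap \mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k]$, the representations produced by $\pi_0$ on $\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k]$ and $\mathcal{L}[\mathbf{p}_1, \dots, \mathbf{p}_\ell]$ coincide. This is because $\pi_0$ is given as a consistent map on the dense space $\mathcal{L}_0$. We now prove the claim by induction on $k$. We already have by definition that $\pi_0$ is the representation $\pi_0^{\mathbf{n}}$ on $\mathcal{L}[\mathbf{n}]$ for each $\mathbf{n}$, hence the claim is true for $k = 1$.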
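Assume the claim is true for all values of $k$ up to a fixed $k \ge 1$, then we now prove it for $k + 1$. Observe that $\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_{k+1}]$ contains the closed two-sided ideals
\[ \mathcal{J}_1 := C^*(\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k] \cdot \mathcal{L}[\mathbf{n}_{k+1}]) \subset \mathcal{J}_2 \cap \mathcal{J}_3, \]
where $\mathcal{J}_2 := \mathcal{J}_1 + \mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k]$ and $\mathcal{J}_3 := \mathcal{J}_1 + \mathcal{L}[\mathbf{n}_{k+1}]$, and that $\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_{k+1}] = \mathcal{J}_2 + \mathcal{J}_3$. We will prove below that $\mathcal{J}_1$ is proper (hence that the ideal structure above is nontrivial). Consider the factorization $\xi : \mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_{k+1}] \to \mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_{k+1}]/\mathcal{J}_1$. Then
\[ \xi(\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_{k+1}]) = \xi(\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k]) + \xi(\mathcal{L}[\mathbf{n}_{k+1}]) \quad\text{and}\quad \xi(\mathcal{J}_2) \cdot \xi(\mathcal{J}_3) = 0. \]
If $\mathcal{J}_1$ is not proper, then $\mathcal{L}[\mathbf{n}_{k+1}] \subset \mathcal{J}_1 \supset \mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k]$. By Proposition 3.5(iv), we have that $\mathcal{J}_1 \subset \mathcal{L}[\mathbf{q}_1, \dots, \mathbf{q}_k] \subset \mathcal{I}[\mathbf{q}_1, \dots, \mathbf{q}_k]$ for $(\mathbf{q}_j)_\ell = \min((\mathbf{n}_j)_\ell, (\mathbf{n}_{k+1})_\ell)$, and hence $\mathcal{L}[\mathbf{n}_{k+1}] \subset \mathcal{J}_1 \subset \mathcal{I}[\mathbf{q}_1, \dots, \mathbf{q}_k]$. Thus, by Proposition 3.5(ii) we conclude that $[\mathbf{n}_{k+1}]$ cannot be strictly greater than all the $[\mathbf{q}_i]$, i.e. there is one member of the set $\{\mathbf{q}_1, \dots, \mathbf{q}_k\}$, say $\mathbf{q}_j$, which satisfies $[\mathbf{q}_j] = [\mathbf{n}_{k+1}]$, and so by definition of $\mathbf{q}_j$, we have that eventually $(\mathbf{n}_{k+1})_\ell = \min((\mathbf{n}_j)_\ell, (\mathbf{n}_{k+1})_\ell)$, i.e. $[\mathbf{n}_j] \ge [\mathbf{n}_{k+1}]$. Likewise, the inclusion $\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k] \subset C^*(\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k] \cdot \mathcal{L}[\mathbf{n}_{k+1}]) = \mathcal{J}_1$ implies that no $\mathbf{n}_j$, $j = 1, \dots, k$, is reduced through multiplication by $\mathbf{n}_{k+1}$, i.e. eventually $(\mathbf{n}_j)_\ell = \min((\mathbf{n}_j)_\ell, (\mathbf{n}_{k+1})_\ell)$ for all $j$, i.e. $[\mathbf{n}_j] \le [\mathbf{n}_{k+1}]$. So, together with the previous inequality, we see that there must be a $j \in \{1, \dots, k\}$ such that $[\mathbf{n}_j] = [\mathbf{n}_{k+1}]$. This contradicts the initial assumption that all $[\mathbf{n}_\ell]$ are distinct, and so $\mathcal{J}_1$ must be proper.

Now consider $\pi_0$ on $\mathcal{L}_0 \cap \mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_{k+1}]$. By the induction assumption, $\pi_0$ on $\mathcal{L}_0 \cap \mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k]$ is the restriction of a representation on $\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k]$; we denote the projection onto its essential subspace by $\mathbb{E}[\mathbf{n}_1, \dots, \mathbf{n}_k]$. Note that $\mathbb{E}[\mathbf{n}_{k+1}]$ commutes with $\mathbb{E}[\mathbf{n}_1, \dots, \mathbf{n}_k]$ because it commutes with all the generating elements $\pi_0(L_i) = \pi_0^{\mathbf{n}_i}(L_i)$, $L_i \in \mathcal{L}[\mathbf{n}_i]$. Thus we have an orthogonal decomposition $\mathcal{H} = \mathcal{H}_1 \oplus \mathcal{H}_2 \oplus \mathcal{H}_3 \oplus \mathcal{H}_4$, where
\[
\begin{aligned}
\mathcal{H}_1 &:= \mathbb{E}[\mathbf{n}_1, \dots, \mathbf{n}_k]\,\mathbb{E}[\mathbf{n}_{k+1}]\,\mathcal{H}, &
\mathcal{H}_2 &:= \mathbb{E}[\mathbf{n}_1, \dots, \mathbf{n}_k]\,(\mathbf{1} - \mathbb{E}[\mathbf{n}_{k+1}])\,\mathcal{H}, \\
\mathcal{H}_3 &:= \mathbb{E}[\mathbf{n}_{k+1}]\,(\mathbf{1} - \mathbb{E}[\mathbf{n}_1, \dots, \mathbf{n}_k])\,\mathcal{H}, &
\mathcal{H}_4 &:= (\mathbf{1} - \mathbb{E}[\mathbf{n}_{k+1}])\,(\mathbf{1} - \mathbb{E}[\mathbf{n}_1, \dots, \mathbf{n}_k])\,\mathcal{H}
\end{aligned}
\]
and $\pi_0$ preserves these subspaces. Now by Proposition 3.5(iv) and the induction assumption, $\pi_0$ extends from $\mathcal{L}_0 \cap \mathcal{J}_1$ to a representation on $\mathcal{J}_1$,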
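and as $\mathcal{J}_1 = C^*(\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k] \cdot \mathcal{L}[\mathbf{n}_{k+1}])$, the essential projection for $\pi_0{\restriction}\mathcal{J}_1$ is $\mathbb{E}[\mathbf{n}_1, \dots, \mathbf{n}_k]\,\mathbb{E}[\mathbf{n}_{k+1}]$, i.e. its essential subspace is $\mathcal{H}_1$. But since $\mathcal{J}_1$ is a closed two-sided ideal of $\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_{k+1}]$, its non-degenerate representations extend uniquely to $\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_{k+1}]$. Thus on $\mathcal{H}_1$, $\pi_0$ extends from $\mathcal{L}_0 \cap \mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_{k+1}]$ to a representation on $\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_{k+1}]$.

Next observe that on $\mathcal{H}_1^\perp = \mathcal{H}_2 \oplus \mathcal{H}_3 \oplus \mathcal{H}_4$ we have $\{0\} = \pi_0(\mathcal{J}_1)$. We show that one can define a consistent representation of $\xi(\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_{k+1}])$ by $\rho(\xi(A)) := \pi_0(A){\restriction}\mathcal{H}_1^\perp$, for $A \in \mathcal{L}[\mathbf{n}_{k+1}] + \mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k]$, using the structure of $\xi(\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_{k+1}])$ above. First observe that $\rho$ is well-defined on $\xi(\mathcal{L}[\mathbf{n}_{k+1}])$ and $\xi(\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k])$ separately, because if $A_1 - A_2 \in \mathcal{J}_1$, then $\pi_0(A_1 - A_2){\restriction}\mathcal{H}_1^\perp = 0$. Next, $\rho$ is well-defined on the set $\xi(\mathcal{L}[\mathbf{n}_{k+1}] + \mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k])$ by the induction assumption, and the consistency of the extensions of $\pi_0$. To see that $\rho$ is well-defined on the algebra $\xi(\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_{k+1}]) = \xi(\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k]) + \xi(\mathcal{L}[\mathbf{n}_{k+1}])$, it suffices by the direct sum decomposition to check it on $\mathcal{H}_2$, $\mathcal{H}_3$ and $\mathcal{H}_4$ separately. On $\mathcal{H}_2$, $\pi_0$ vanishes on $\mathcal{L}[\mathbf{n}_{k+1}]$, so since $\xi(\mathcal{L}[\mathbf{n}_{k+1}])$ is an ideal of $\xi(\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_{k+1}])$ (and $\xi(\mathcal{J}_2) \cdot \xi(\mathcal{J}_3) = \{0\}$), it follows that we can extend $\rho(\xi(A)){\restriction}\mathcal{H}_2$ by linearity, i.e. $\rho(\xi(A) + \xi(B)) = \rho(\xi(A))$ for $A \in \mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k]$, $B \in \mathcal{L}[\mathbf{n}_{k+1}]$, to define a representation on $\xi(\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_{k+1}])$. Likewise, on $\mathcal{H}_3$, $\pi_0$ vanishes on $\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_k]$, so we can show $\rho$ defines a representation of $\xi(\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_{k+1}])$, and on $\mathcal{H}_4$, $\rho$ is zero. Then $\rho$ lifts to a representation of $\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_{k+1}]$ on $\mathcal{H}_1^\perp$ which coincides with $\pi_0$ on $\mathcal{L}_0 \cap \mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_{k+1}]$. Taking the direct sum of this with the representation we obtained on $\mathcal{H}_1$ produces a representation of $\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_{k+1}]$ on all of $\mathcal{H}$ which coincides with $\pi_0$ on $\mathcal{L}_0 \cap \mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_{k+1}]$. Thus, we have proven the claim for $k + 1$, which completes the induction.

That $\pi_0$ is bounded on $\mathcal{L}_0$ now follows immediately from the claim, because any $A \in \mathcal{L}_0$ is of the form $A = \sum_{k=1}^{m} B_k$ for $B_k \in \mathcal{L}[\mathbf{n}_k]_0$, where $\mathbf{n}_k \not\sim \mathbf{n}_\ell$ if $k \neq \ell$. But this is an element of $\mathcal{L}[\mathbf{n}_1, \dots, \mathbf{n}_m]$ and by the claim $\pi_0$ extends as a representation to it, hence $\|\pi_0(A)\| \le \|A\|$. We conclude that $\pi_0$ is a bounded representation, hence extends to all of $\mathcal{L}[E]$. To see that $\pi_0$ is non-degenerate, recall that $\{E^{(k)}_n\} \subset \mathcal{L}_k$ is an approximate identity of increasing projections. Thus we can find a sequence $\mathbf{n}$ such that $\operatorname{s-lim}_{m\to\infty} \pi(E^{(m)}_{n_m}) = \mathbf{1}$, and hence $\mathbb{E}^{\mathbf{n}} = \mathbf{1}$ by $\pi(E^{(m)}_{n_m}) \le \mathbb{E}^{\mathbf{n}} \le \mathbf{1}$ for all $m$, where $\mathbb{E}^{\mathbf{n}}$ is the strong-limit projection attached to $\mathbf{n}$. Since the essential subspace of $\pi_0{\restriction}\mathcal{L}[\mathbf{n}]$ is $\mathbb{E}^{\mathbf{n}}\mathcal{H}$, it follows that $\pi_0$ is non-degenerate. It then follows from Proposition 3.4 applied to $\mathcal{L}[\mathbf{n}]$ that $\tilde\pi_0{\restriction}\mathcal{A} = \pi$.

Finally, we apply the structures above to decompose regular representations as direct integrals of irreducible regular representations. First observe that given any representation $\pi \in \operatorname{Rep}((S, \sigma), \mathcal{H})$, where $\mathcal{H}$ is separable, then as $(E_n)_{n\in\mathbb{N}}$ is an approximate identity for $\mathcal{K}(\ell^2(\mathbb{N}))$, there is a sequence $\mathbf{n}$ such that
\[ \operatorname{s-lim}_{k\to\infty}\, \operatorname{s-lim}_{\ell\to\infty} \pi(E^{(k)}_{n_k}) \cdots \pi(E^{(\ell)}_{n_\ell}) = \mathbf{1}, \]
and thus by Proposition 3.4 there is a unique $\pi_0 \in \operatorname{Rep}(\mathcal{L}[\mathbf{n}], \mathcal{H})$ such that $\eta^*\pi_0 = \pi$. Fix a choice of maximally commutative subalgebra $\mathcal{C} \subset \pi_0(\mathcal{L}[\mathbf{n}])'$. Then, since $\mathcal{L}[\mathbf{n}]$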
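is separable, there is an extremal decomposition of $\pi_0$ (cf. [7, Corollary 4.4.8]), i.e. there is a standard measure space $(Z, \mu)$ with $\mu$ a positive bounded measure, a measurable family $z \mapsto \mathcal{H}(z)$ of Hilbert spaces, a measurable family $z \mapsto \pi_z \in \operatorname{Rep}(\mathcal{L}[\mathbf{n}], \mathcal{H}(z))$ of representations which are almost all irreducible, and a unitary $U : \mathcal{H} \to \int_Z^{\oplus} \mathcal{H}(z)\,d\mu(z)$ such that $U\mathcal{C}U^{-1}$ is the algebra of diagonalizable operators, and
\[ U\pi_0(A)U^{-1} = \int_Z^{\oplus} \pi_z(A)\,d\mu(z) \quad \forall\, A \in \mathcal{L}[\mathbf{n}]. \]
Then for $\psi, \varphi \in \int_Z^{\oplus} \mathcal{H}(z)\,d\mu(z)$ with decompositions $\psi = \int_Z^{\oplus} \psi_z\,d\mu(z)$ and $\varphi = \int_Z^{\oplus} \varphi_z\,d\mu(z)$, we have for $s \in S$ and any countable approximate identity $(F_k)$ of $\mathcal{L}[\mathbf{n}]$ that
\[
\begin{aligned}
(\varphi, U\pi(s)U^{-1}\psi) &= (\varphi, U\eta^*\pi_0(s)U^{-1}\psi) = \lim_{k\to\infty} (\varphi, U\pi_0(\delta_s F_k)U^{-1}\psi) \\
&= \lim_{k\to\infty} \int_Z (\varphi_z, \pi_z(\delta_s F_k)\psi_z)\,d\mu(z) \\
&= \int_Z \lim_{k\to\infty} (\varphi_z, \pi_z(\delta_s F_k)\psi_z)\,d\mu(z) \\
&= \int_Z (\varphi_z, \eta^*\pi_z(s)\psi_z)\,d\mu(z) \\
&= \Big(\varphi, \int_Z^{\oplus} \eta^*\pi_z(s)\,d\mu(z)\,\psi\Big),
\end{aligned}
\]
where the use of the Dominated Convergence Theorem in interchanging the limit and the integral is justified by $|(\varphi_z, \pi_z(\delta_s F_k)\psi_z)| \le \|\varphi_z\|\,\|\psi_z\|$, as both $z \mapsto \|\varphi_z\|$ and $z \mapsto \|\psi_z\|$ are square integrable with respect to $\mu$. Hence
\[ U\pi(s)U^{-1} = \int_Z^{\oplus} \eta^*\pi_z(s)\,d\mu(z) \quad \forall\, s \in S. \]
Since $\eta^*$ preserves irreducibility, almost all $\eta^*\pi_z$ are irreducible, and hence we obtain the promised decomposition.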
Acknowledgments

The first author gratefully acknowledges the support of the Sonderforschungsbereich TR12, “Symmetries and Universality in Mesoscopic Systems”, which generously supported his visit to Germany in the Summer of 2005. The second author wishes to express his appreciation for the generous support he received from the Australian Research Council for his visit to the University of New South Wales in May 2004.
Appendix

A.1. Host algebras and the strict topology

Lemma A.1. Let $X$ be a locally compact space.
(a) On each bounded subset of $M(C_0(X)) \cong C_b(X)$, the strict topology coincides with the topology of compact convergence, i.e. the compact open topology. This holds in particular for the subgroup $C(X, \mathbb{T}) \cong U(C_b(X))$.
(b) A unital $*$-subalgebra $S \subseteq C_b(X)$ is strictly dense if and only if it separates the points of $X$.

Proof. (a) ([3, Ex. 12.1.1(b)]) Let $B \subseteq C_b(X)$ be a bounded subset with $\|f\| \le C$ for each $f \in B$. For each $\varphi \in C_0(X)$ and $\varepsilon > 0$ we now find a compact subset $K \subseteq X$ with $|\varphi| \le \varepsilon$ outside $K$. For $f_i \to f$ in $B$ with respect to the compact open topology, we then have
\[ \|(f - f_i)\varphi\| \le \|(f - f_i)|_K\|\,\|\varphi\| + \varepsilon\|f - f_i\| \le \varepsilon\|\varphi\| + 2\varepsilon C \]
for sufficiently large $i$. Therefore the maps $B \to C_0(X)$, $f \mapsto f\varphi$ are continuous if $B$ carries the compact open topology. This means that the strict topology on $B$ is coarser than the compact open topology. If, conversely, $K \subseteq X$ is a compact subset and $h \in C_0(X)$ with $h|_K = 1$, then $\|(f - f_i)|_K\| \le \|(f - f_i)h\|$ shows that the strict topology on $C_b(X)$ is finer than the compact open topology. This proves (a).

(b) If $S$ is strictly dense, then it obviously separates the points of $X$ because the point evaluations are strictly continuous. Suppose, conversely, that $S$ separates the points of $X$. Replacing $S$ by its norm closure, we may without loss of generality assume that $S$ is norm closed. Let $K \subseteq X$ be compact. Since $S$ separates the points of $K$, the Stone–Weierstraß Theorem implies that $S|_K = C(K)$. For any $f \in C_b(X)$ we therefore find some $f_K \in S$ with $\|f_K\| \le 2\|f\|$ and $f_K|_K = f|_K$ because the restriction map is a quotient morphism of C∗-algebras. Since the net $(f_K)$ is bounded and converges to $f$ in the compact open topology, (a) implies that it also converges in the strict topology. Therefore $S$ is strictly dense in $C_b(X)$.

A.2. Tensor products of C∗-algebras

Let $\mathcal{A}$ and $\mathcal{B}$ be C∗-algebras and $\mathcal{A} \otimes \mathcal{B}$ their spatial C∗-tensor product (defined by the minimal cross norm) ([12]), which is a suitable completion of the algebraic
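tensor product $\mathcal{A} \otimes \mathcal{B}$, turning it into a C∗-algebra. We then have homomorphisms
\[ i_{\mathcal{A}} : M(\mathcal{A}) \to M(\mathcal{A} \otimes \mathcal{B}), \qquad i_{\mathcal{B}} : M(\mathcal{B}) \to M(\mathcal{A} \otimes \mathcal{B}), \]
uniquely determined by
\[ i_{\mathcal{A}}(\varphi)(A \otimes B) = (\varphi \cdot A) \otimes B, \qquad i_{\mathcal{B}}(\varphi)(A \otimes B) = A \otimes (\varphi \cdot B). \]
Moreover, for each complex Hilbert space $H$, we have
\[ \mathrm{Rep}(\mathcal{A} \otimes \mathcal{B}, H) \cong \{(\alpha, \beta) \in \mathrm{Rep}(\mathcal{A}, H) \times \mathrm{Rep}(\mathcal{B}, H) : [\alpha(\mathcal{A}), \beta(\mathcal{B})] = \{0\}\}. \]
This correspondence is established by assigning to each pair $(\alpha, \beta)$ with commuting range the representation
\[ \pi := \alpha \otimes \beta : \mathcal{A} \otimes \mathcal{B} \to B(H), \qquad a \otimes b \mapsto \alpha(a)\beta(b). \]
Note that this representation of $\mathcal{A} \otimes \mathcal{B}$ is non-degenerate if $\alpha$ and $\beta$ are non-degenerate.

Lemma A.2. The following assertions hold for the embedding $i_{\mathcal{A}} : M(\mathcal{A}) \to M(\mathcal{A} \otimes \mathcal{B})$:
(1) The map
\[ i_{\mathcal{A}}^{-1} : M(\mathcal{A}) \otimes 1 \to M(\mathcal{A}), \qquad m \otimes \mathrm{id}_{\mathcal{B}} \mapsto m \]
is continuous with respect to the strict topology on its domain obtained from $\mathcal{A} \otimes \mathcal{B}$ and the strict topology on its range obtained from $\mathcal{A}$.
(2) Its restriction to bounded subsets is a homeomorphism.
(3) $i_{\mathcal{A}}(\mathcal{A})$ is dense in $M(\mathcal{A}) \otimes 1$ with respect to the strict topology on $M(\mathcal{A} \otimes \mathcal{B})$.

Proof. (1) The strict topology on $M(\mathcal{A})$ is defined by the seminorms $p_a(m) = \|m \cdot a\| + \|a \cdot m\|$, satisfying $p_a \circ i_{\mathcal{A}}^{-1} = p_{a \otimes 1}$, which shows immediately that $i_{\mathcal{A}}^{-1}$ is continuous.

(2) Since the embedding $i_{\mathcal{A}}$ is isometric, it suffices to show that for each bounded subset $M \subseteq M(\mathcal{A})$, the restriction of $i_{\mathcal{A}}$ to $M$ is continuous. Since $i_{\mathcal{A}}$ is linear, it suffices to show that for each bounded net $(M_\nu)$ with $\lim M_\nu = 0$ in the strict topology of $M(\mathcal{A})$, we also have $\lim i_{\mathcal{A}}(M_\nu) = 0$ in $M(\mathcal{A} \otimes \mathcal{B})$. For $A \in \mathcal{A}$ and $B \in \mathcal{B}$ we have
\[ \|M_\nu(A \otimes B)\| = \|M_\nu A \otimes B\| = \|M_\nu A\|\,\|B\| \to 0 \]
and likewise $(A \otimes B)M_\nu \to 0$. Since the elementary tensors span a dense subset of $\mathcal{A} \otimes \mathcal{B}$, the boundedness of the net $(M_\nu)$ implies that $i_{\mathcal{A}}(M_\nu) \to 0$ holds in the strict topology of $M(\mathcal{A} \otimes \mathcal{B})$ (cf. Wegge-Olsen [28, Lemma 2.3.6]).

(3) Let $\{E_\alpha\}$ be any approximate identity of $\mathcal{A}$, satisfying $\|E_\alpha\| \le 1$. Then for any $A \in M(\mathcal{A})$, the net $\{AE_\alpha\} \subset M(\mathcal{A})$ is bounded by $\|A\|$ and converges to $A$ in the strict topology of $M(\mathcal{A})$, and hence in the strict topology of $M(\mathcal{A} \otimes \mathcal{B})$ by (2). This proves (3).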
Lemma A.3. For each non-degenerate representation $\pi \in \mathrm{Rep}(\mathcal{A} \otimes \mathcal{B}, H)$ the representations $\pi_1(a) := \tilde{\pi}(a \otimes 1)$ and $\pi_2(b) := \tilde{\pi}(1 \otimes b)$ are non-degenerate, where $\tilde{\pi}$ denotes the unique extension of $\pi$ from $\mathcal{A} \otimes \mathcal{B}$ to $M(\mathcal{A} \otimes \mathcal{B})$. Moreover, the corresponding extensions $\tilde{\pi}_1 \in \mathrm{Rep}(M(\mathcal{A}), H)$ and $\tilde{\pi}_2 \in \mathrm{Rep}(M(\mathcal{B}), H)$ from $\pi_1, \pi_2$ on $\mathcal{A}, \mathcal{B}$ respectively, satisfy
\[ \tilde{\pi}_1 = \tilde{\pi} \circ i_{\mathcal{A}} \qquad \text{and} \qquad \tilde{\pi}_2 = \tilde{\pi} \circ i_{\mathcal{B}}. \]
In particular, the representations $\tilde{\pi} \circ i_{\mathcal{A}}$ and $\tilde{\pi} \circ i_{\mathcal{B}}$ are continuous with respect to the strict topology on $M(\mathcal{A})$, $M(\mathcal{B})$ respectively, and the topology of pointwise convergence on $B(H)$.

Proof. To see that $\pi_1$ is non-degenerate, we observe that for $a \otimes b \in \mathcal{A} \otimes \mathcal{B}$ we have
\[ \pi(a \otimes b) = \pi_1(a)\pi_2(b) = \pi_2(b)\pi_1(a), \]
so that any vector annihilated by $\pi_1(\mathcal{A})$ is also annihilated by $\mathcal{A} \otimes \mathcal{B}$, hence zero. The same argument proves non-degeneracy of $\pi_2$. For $m \in M(\mathcal{A})$, we have
\[ \tilde{\pi}(m \otimes 1)\pi_1(a) = \tilde{\pi}(m \otimes 1)\tilde{\pi}(a \otimes 1) = \tilde{\pi}(ma \otimes 1) = \pi_1(ma) = \tilde{\pi}_1(m)\tilde{\pi}_1(a), \]
so that the non-degeneracy of $\pi_1$ implies $\tilde{\pi} \circ i_{\mathcal{A}} = \tilde{\pi}_1$, and likewise $\tilde{\pi} \circ i_{\mathcal{B}} = \tilde{\pi}_2$. The last assertion follows from the general fact that for a non-degenerate representation of $\mathcal{A}$, the corresponding extension to $M(\mathcal{A})$ is continuous with respect to the strict topology on $M(\mathcal{A})$ and the topology of pointwise convergence on $B(H)$; similarly for $\mathcal{B}$.

Lemma A.4. Let $G_1, G_2$ be topological groups and suppose that $(\mathcal{A}_1, \eta_1)$, respectively, $(\mathcal{A}_2, \eta_2)$ are full host algebras for $G_1$, respectively, $G_2$. Then
\[ \eta : G_1 \times G_2 \to M(\mathcal{A}_1 \otimes \mathcal{A}_2), \qquad (g_1, g_2) \mapsto i_{\mathcal{A}_1}(\eta_1(g_1))\, i_{\mathcal{A}_2}(\eta_2(g_2)) \]
defines a full host algebra of $G_1 \times G_2$.

Proof. This follows from the observation that unitary representations of the direct product group $G := G_1 \times G_2$ can be viewed as pairs of commuting representations $\pi_j : G_j \to U(H)$, and we have the same picture on the level of non-degenerate representations of C∗-algebras. We only have to observe that both pictures are compatible. In fact, let $\pi_j$ be commuting unitary representations of $G_j$, $j = 1, 2$, and $\tilde{\pi}_j$ the corresponding representations of the host algebras $\mathcal{A}_j$. Then we have
\[
\begin{aligned}
(\eta^*(\tilde{\pi}_1 \otimes \tilde{\pi}_2))(g_1, g_2) &= (\tilde{\pi}_1 \otimes \tilde{\pi}_2)(\eta_1(g_1) \otimes \eta_2(g_2)) \\
&= \tilde{\pi}_1(\eta_1(g_1))\,\tilde{\pi}_2(\eta_2(g_2)) \\
&= \pi_1(g_1)\pi_2(g_2).
\end{aligned}
\]
Corollary A.7 below provides a converse to this lemma.
A.3. Ideals of multiplier algebras

Let $\mathcal{A}$ be a C∗-algebra and $M(\mathcal{A})$ its multiplier algebra. We are interested in the relation between the ideals of $\mathcal{A}$ and $M(\mathcal{A})$.

Lemma A.5. (a) Each strictly closed ideal $J \subseteq M(\mathcal{A})$ coincides with the strict closure of the ideal $J \cap \mathcal{A}$ of $\mathcal{A}$, which is norm-closed.
(b) For each norm closed ideal $I \trianglelefteq \mathcal{A}$, its strict closure $\tilde{I}$ satisfies $\tilde{I} \cap \mathcal{A} = I$.
(c) The map $J \mapsto J \cap \mathcal{A}$ induces a bijection from the set of strictly closed ideals of $M(\mathcal{A})$ onto the set of norm-closed ideals of $\mathcal{A}$.

Proof. (a) Let $(u_i)_{i \in I}$ be an approximate identity in $\mathcal{A}$ and $\mu \in J$. Then $\mu u_i \in J \cap \mathcal{A}$ converges to $\mu$ in the strict topology, and the assertion follows. Since on $\mathcal{A}$ the norm topology is finer than the strict topology, the ideal $J \cap \mathcal{A}$ of $\mathcal{A}$ is norm-closed.

(b) The ideal $I$ is automatically $*$-invariant ([11, Proposition 1.8.2]), so that $\mathcal{A}/I$ is a C∗-algebra. Let $q : \mathcal{A} \to \mathcal{A}/I$ denote the quotient homomorphism. The existence of an approximate identity in $\mathcal{A}$ implies that $I$ is invariant under the left and right action of the multiplier algebra, so that we obtain a natural homomorphism $M(q) : M(\mathcal{A}) \to M(\mathcal{A}/I)$, which is strictly continuous ([9, Proposition 3.8]). Then $\tilde{I} := \ker M(q) \trianglelefteq M(\mathcal{A})$ is a strictly closed ideal satisfying $\tilde{I} \cap \mathcal{A} = I$, and (a) implies that $\tilde{I}$ is the strict closure of $I$.

(c) Follows from (a) and (b).

The following proposition shows that for each closed normal subgroup $N$ of a topological group $G$ with a host algebra, the quotient group $G/N$ also has a host algebra:

Proposition A.6. Let $G$ be a topological group and suppose that $\mathcal{A}$ is a host algebra for $G$ with respect to the homomorphism $\eta_G : G \to M(\mathcal{A})$. Let $N \trianglelefteq G$ be a closed normal subgroup, $\tilde{I}_N \trianglelefteq M(\mathcal{A})$ the strictly closed ideal generated by $\eta_G(N) - 1$, and $I_N := \mathcal{A} \cap \tilde{I}_N$. Then $\eta_G$ factors through a homomorphism $\eta_{G/N} : G/N \to M(\mathcal{A}/I_N)$, turning $\mathcal{A}/I_N$ into a host algebra for the quotient group $G/N$. If, in addition, $\mathcal{A}$ is a full host algebra of $G$, then $\mathcal{A}/I_N$ is a full host algebra of $G/N$.

Proof. If $\pi$ is a unitary representation of $G$, then we write $\pi_{\mathcal{A}}$ for the corresponding representation of $\mathcal{A}$ and $\tilde{\pi}_{\mathcal{A}}$ for the extension to $M(\mathcal{A})$ with $\tilde{\pi}_{\mathcal{A}} \circ \eta_G = \pi$. Further, let $q_G : G \to G/N$ denote the quotient map.
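We consider the C∗-algebra $\mathcal{B} := \mathcal{A}/I_N$ and recall that the quotient morphism $q : \mathcal{A} \to \mathcal{B}$ induces a strictly continuous morphism $M(q) : M(\mathcal{A}) \to M(\mathcal{B})$ ([9, Proposition 3.8]). In view of $I_N = \ker q = (\ker M(q)) \cap \mathcal{A}$, Lemma A.5 implies that $\ker M(q) = \tilde{I}_N$. Next we observe that $\eta_G(N) - \mathrm{id}_{\mathcal{A}} \subseteq \tilde{I}_N$ implies that $N$ acts by trivial multipliers on the algebra $\mathcal{B} = \mathcal{A}/I_N$. We therefore obtain a group homomorphism
\[ \eta_{G/N} : G/N \to U(M(\mathcal{B})) \qquad \text{with} \qquad \eta_{G/N} \circ q_G = M(q) \circ \eta_G. \]
To see that $\eta_{G/N}$ turns $\mathcal{B}$ into a host algebra for the quotient group $G/N$, we first note that every non-degenerate representation $\pi : \mathcal{B} \to B(H)$ can be viewed as a non-degenerate representation $\pi_{\mathcal{A}} : \mathcal{A} \to B(H)$ with $\pi_{\mathcal{A}} := \pi \circ q$. The corresponding representations of the multiplier algebras satisfy
\[ \tilde{\pi} \circ M(q) = \tilde{\pi}_{\mathcal{A}} : M(\mathcal{A}) \to B(H). \]
This leads to
\[ \tilde{\pi} \circ \eta_{G/N} \circ q_G = \tilde{\pi} \circ M(q) \circ \eta_G = \tilde{\pi}_{\mathcal{A}} \circ \eta_G, \]
showing that the unitary representation $\tilde{\pi} \circ \eta_{G/N}$ of $G/N$ is continuous. We thus obtain a map
\[ \eta^*_{G/N} : \mathrm{Rep}(\mathcal{B}) \to \mathrm{Rep}(G/N), \qquad \pi \mapsto \tilde{\pi} \circ \eta_{G/N}. \]
If two representations $\pi$ and $\gamma$ of $\mathcal{B}$ lead to the same representation of $G/N$, i.e.,
\[ \eta^*_{G/N}(\pi) = \tilde{\pi} \circ \eta_{G/N} = \tilde{\gamma} \circ \eta_{G/N} = \eta^*_{G/N}(\gamma), \]
then the corresponding representations of $G$ coincide, i.e. $\tilde{\pi}_{\mathcal{A}} \circ \eta_G = \tilde{\gamma}_{\mathcal{A}} \circ \eta_G$, but since $\mathcal{A}$ is a host algebra for $G$, we have $\pi_{\mathcal{A}} = \gamma_{\mathcal{A}}$, i.e., $\pi \circ q = \gamma \circ q$, and as $q$ is surjective, we get $\pi = \gamma$.

If, in addition, $\eta^*_G$ is surjective, then every continuous unitary representation $\pi$ of $G/N$ pulls back to a continuous unitary representation of $G$ which defines a unique representation $\rho_{\mathcal{A}}$ of $\mathcal{A}$ which in turn extends to the representation $\tilde{\rho}_{\mathcal{A}}$ of $M(\mathcal{A})$ satisfying $\tilde{\rho}_{\mathcal{A}} \circ \eta_G = \pi \circ q_G$. Further, $\tilde{I}_N \subseteq \ker \tilde{\rho}_{\mathcal{A}}$ implies $I_N \subseteq \ker \rho_{\mathcal{A}}$, so that $\tilde{\rho}_{\mathcal{A}}$ factors via $M(q) : M(\mathcal{A}) \to M(\mathcal{B})$ through a strictly continuous representation $\tilde{\pi}_{\mathcal{B}}$ of $M(\mathcal{B})$, satisfying $\tilde{\pi}_{\mathcal{B}} \circ \eta_{G/N} = \pi$. This implies that $\eta^*_{G/N}$ is also surjective.

Corollary A.7. Let $G_1, G_2$ be topological groups and $G := G_1 \times G_2$. If $G$ has a full host algebra $(\mathcal{A}, \eta)$, then $G_1$ and $G_2$ have full host algebras $(\mathcal{A}_1, \eta_1)$ and $(\mathcal{A}_2, \eta_2)$ with $\mathcal{A} \cong \mathcal{A}_1 \otimes \mathcal{A}_2$.

Proof. The existence of host algebras of $G_1 \cong G/(\{1\} \times G_2)$ and $G_2 \cong G/(G_1 \times \{1\})$ follows directly from the last statement in Proposition A.6. Now Lemma A.4 applies.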
A.4. Symplectic space

Lemma A.8. In each countably dimensional symplectic vector space $(S, B)$, there exists a basis $(p_n, q_n)_{n \in \mathbb{N}}$ with
\[ B(p_n, q_m) = \delta_{nm} \qquad \text{and} \qquad B(p_n, p_m) = B(q_n, q_m) = 0 \qquad \text{for } n, m \in \mathbb{N}. \]
Then $Ip_n := q_n$ and $Iq_n = -p_n$ defines a complex structure on $S$ for which $(v, w) := B(Iv, w)$ is positive definite and hence defines a (sesquilinear) inner product on $S$ by $\langle v, w \rangle := (v, w) + iB(v, w)$. Moreover $\{q_n \mid n \in \mathbb{N}\}$ is a complex orthonormal basis of $S$ with respect to $\langle \cdot, \cdot \rangle$.

Proof. Let $(e_n)_{n \in \mathbb{N}}$ be a linear basis of $S$. We construct the basis elements $p_n, q_n$ inductively as follows. If $p_1, \ldots, p_k$ and $q_1, \ldots, q_k$ are already chosen, pick a minimal $m$ with $e_m \notin \mathrm{span}\{p_1, \ldots, p_k, q_1, \ldots, q_k\}$ and put
\[ p_{k+1} := e_m - \sum_{i=1}^{k} \big( B(e_m, q_i)p_i + B(p_i, e_m)q_i \big) \]
to ensure that this element is $B$-orthogonal to all previous ones. Then pick $\ell$ minimal, such that $B(p_{k+1}, e_\ell) \neq 0$, put
\[ \tilde{q}_{k+1} := e_\ell - \sum_{i=1}^{k} \big( B(e_\ell, q_i)p_i + B(p_i, e_\ell)q_i \big) \]
and pick $q_{k+1} \in \mathbb{R}\tilde{q}_{k+1}$ with $B(p_{k+1}, q_{k+1}) = 1$. This process can be repeated ad infinitum and produces the required bases of $S$ because for each $k$, the span of $p_1, \ldots, p_k, q_1, \ldots, q_k$ contains at least $e_1, \ldots, e_k$. That $\{q_n \mid n \in \mathbb{N}\}$ is a complex orthonormal basis with respect to $\langle \cdot, \cdot \rangle$ follows from the definitions.

References

[1] F. Acerbi, G. Morchio and F. Strocchi, Nonregular representations of CCR algebras and algebraic fermion bosonization, Proceedings of the XXV Symposium on Mathematical Physics (Toruń, 1992), Rep. Math. Phys. 33(1–2) (1993) 7–19.
[2] B. Blackadar, Infinite tensor products of C∗-algebras, Pacific J. Math. 77 (1977) 313–334.
[3] B. Blackadar, K-Theory for Operator Algebras, 2nd edn. (Cambridge University Press, 1998).
[4] B. Blackadar, Operator Algebras, Encyclopaedia of Mathematical Sciences, Vol. 122 (Springer-Verlag, Berlin, 2006).
[5] N. Bourbaki, Elements of Mathematics. Algebra I, Chapters 1–3 (Springer-Verlag, 1992); Reprint of 1974 edition.
[6] D. Buchholz and H. Grundling, The resolvent algebra: A new approach to canonical quantum systems, J. Funct. Anal. 254 (2008) 2725–2779.
[7] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics 1, Texts and Monographs in Physics, 2nd edn. (Springer-Verlag, 2003).
[8] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics 2, Texts and Monographs in Physics, 2nd edn. (Springer-Verlag, 1997).
[9] R. C. Busby, Double centralizers and extensions of C∗-algebras, Trans. Amer. Math. Soc. 132 (1968) 79–99.
[10] R. C. Busby and H. A. Smith, Representations of twisted group algebras, Trans. Amer. Math. Soc. 149(2) (1970) 503–537.
[11] J. Dixmier, Les C∗-algèbres et leurs Représentations (Gauthier-Villars, Paris, 1964).
[12] P. A. Fillmore, A User’s Guide to Operator Algebras (Wiley, New York, 1996).
[13] H. Glöckner, Direct limit Lie groups and manifolds, J. Math. Kyoto Univ. 43 (2003) 1–26.
[14] H. Glöckner and K.-H. Neeb, Minimally almost periodic abelian groups and commutative W∗-algebras, in Nuclear Groups and Lie Groups, eds. E. M. Peinador et al., Research and Exposition in Math., Vol. 24 (Heldermann Verlag, 2001), pp. 163–186.
[15] H. Grundling, A group algebra for inductive limit groups, Continuity problems of the canonical commutation relations, Acta Appl. Math. 46 (1997) 107–145.
[16] H. Grundling, Generalising group algebras, J. London Math. Soc. 72 (2005) 742–762; Erratum, ibid. 77 (2008) 270–271.
[17] H. Grundling and C. A. Hurst, A note on regular states and supplementary conditions, Lett. Math. Phys. 15 (1988) 205–212; Errata, ibid. 17 (1989) 173–174.
[18] G. C. Hegerfeldt, Decomposition into irreducible representations for the canonical commutation relations, Nuovo Cimento Soc. Ital. Fis. B 4 (1971) 225–244.
[19] R. V. Kadison and J. R. Ringrose, Fundamentals of the Theory of Operator Algebras II (Academic Press, 1983).
[20] J. Manuceau, C∗-algèbre de relations de commutation, Ann. Inst. Henri Poincaré 8 (1968) 139–161.
[21] J. Manuceau, M. Sirugue, D. Testard and A. Verbeure, The smallest C∗-algebra for the canonical commutation relations, Commun. Math. Phys. 32 (1973) 231–243.
[22] K.-H. Neeb, A complex semigroup approach to group algebras of infinite dimensional Lie groups, Semigroup Forum 77 (2008) 5–35.
[23] J. Packer and I. Raeburn, Twisted crossed products of C∗-algebras, Math. Proc. Camb. Phil. Soc. 106 (1989) 293–311.
[24] V. Pestov, Abelian topological groups without irreducible Banach representations, in Abelian Groups, Module Theory and Topology (Padua, 1997), eds. D. Dikrajan and L. Salce, Lecture Notes in Pure and Appl. Math., Vol. 201 (Dekker, New York, 1998), pp. 343–349.
[25] R. Schaflitzel, Decompositions of regular representations of the canonical commutation relations, Publ. Res. Inst. Math. Sci. 26 (1990) 1019–1047.
[26] I. E. Segal, Representations of the canonical commutation relations, in Cargèse Lectures in Theoretical Physics: Applications of Mathematics to Problems in Theoretical Physics (Cargèse, 1965) (Gordon Breach Science Publ., 1967), pp. 107–170.
[27] G. Takeuti and W. M. Zaring, Introduction to Axiomatic Set Theory (Springer-Verlag, 1975).
[28] N. E. Wegge-Olsen, K-Theory and C∗-Algebras (Oxford Science Publications, 1993).
[29] S. L. Woronowicz, C∗-algebras generated by unbounded elements, Rev. Math. Phys. 7 (1995) 481–521.
June 2, 2009 18:35 WSPC/148-RMP
J070-00371
Reviews in Mathematical Physics Vol. 21, No. 5 (2009) 615–674 c World Scientific Publishing Company
REPORT ON THE DETAILED CALCULATION OF THE EFFECTIVE POTENTIAL IN SPACETIMES WITH $S^1 \times R^d$ TOPOLOGY AND AT FINITE TEMPERATURE
V. K. OIKONOMOU Department of Theoretical Physics, Aristotle University of Thessaloniki, Thessaloniki 541 24, Greece
[email protected]

Received 1 July 2008
Revised 8 April 2009

In this paper, we review the calculations that are needed to obtain the bosonic and fermionic effective potential at finite temperature and volume (at one loop). The calculations at finite volume correspond to $S^1 \times R^d$ topology. These calculations appear in the calculation of the Casimir energy and of the effective potential of extra dimensional theories. In the case of finite volume corrections, we impose twisted boundary conditions and obtain semi-analytic results. We mainly focus on the details and validity of the results. The zeta function regularization method is used to regularize the infinite summations. The dimensional regularization method is also used in order to renormalize the UV singularities of the integrations over momentum space. The approximations and expansions are carried out within the perturbative limits. At the end of each section, we briefly present applications associated with the calculations, in particular the calculation of the effective potential at finite temperature for the standard model fields, the effective potential for warped and large extra dimensions, and the topological mass creation. In the end, we discuss the convergence and validity of one of the obtained semi-analytic results.

Keywords: Effective potential; zeta regularization; Casimir energy; finite temperature; extra dimensions.

Mathematics Subject Classification 2000: 81Q99, 81R40, 81T13, 81T60, 81V99
1. Introduction

During the development of quantum field theory, many quantitative methods have been developed. Some of the most frequently used techniques are one-dimensional infinite lattice sums [3, 35]. In this article, we shall review the calculations associated with these summations, which appear in many important branches of quantum field theory, three of which are the physics of extra dimensions [64–68, 52, 91], the Casimir effect [3, 4, 54, 75, 85, 74, 58, 93, 92], and field theories at
finite temperature [57, 55, 71, 63, 3, 4, 85, 17, 35, 51]. In all three cases, we shall compute the effective potential. The method we shall use involves the expansion of the potential in Bessel series and zeta regularization [3, 4, 35, 12]. We focus on the details of the calculation, and readers who want to study these theories will find this paper a useful tool.

1.1. Effective potential in theories with large extra dimensions

In theories with large extra dimensions [64–68, 52, 91], the fields entering the Lagrangian are expanded in the eigenfunctions of the extra dimensions. Let us focus on theories with one extra dimension with the topology of a circle, namely of the type $S^1 \times M^4$ ($M^4$ stands for the 4-dimensional Minkowski space). In the following, we shall also discuss the orbifold compactification apart from the circle compactification we describe here. For circle compactifications, the harmonic expansion of the fields reads,
\[ \phi(x, y) = \sum_{n=-\infty}^{\infty} \phi_n(x)\, e^{\frac{i2\pi n y}{L}}, \tag{1} \]
where $x$ stands for the 4-dimensional Minkowski space coordinates, $y$ for the extra dimension and $L$ the radius of the extra dimension. We note that fields are periodic in the extra dimension $y$, namely, $\phi(x, y) = \phi(x, y + 2\pi R)$. One of the ways to break supersymmetry is the Scherk–Schwarz compactification mechanism. This is based on the introduction of a phase $q$. For fermions we denote it $q_F$ and for bosons $q_B$. Now the harmonic expansions for fermion and boson fields read,
\[ \phi(x, y) = \sum_{n=-\infty}^{\infty} \phi_n(x)\, e^{\frac{i2\pi (n + q_F) y}{L}}, \tag{2} \]
for fermions and,
\[ \phi(x, y) = \sum_{n=-\infty}^{\infty} \phi_n(x)\, e^{\frac{i2\pi (n + q_B) y}{L}}, \tag{3} \]
for bosons. We can observe that the initial periodicity condition is changed. Using Eqs. (2) and (3) we can find that the effective potential at one loop is equal to,
\[ V(\phi) = \frac{1}{2}\,\mathrm{Tr} \sum_{n=-\infty}^{\infty} \int \frac{d^4p}{(2\pi)^4}\, \ln\left[ \frac{p^2 + \dfrac{(n + q_B)^2}{L^2} + M^2(\phi)}{p^2 + \dfrac{(n + q_F)^2}{L^2} + M^2(\phi)} \right]. \tag{4} \]
Note that fermions and bosons contribute to the effective potential with opposite signs. This is due to the fact that fermions are described by anti-commuting Grassmann fields. Also $M^2(\phi)$ is an $n$-independent term and depends on the way that spontaneous symmetry breaking occurs. We shall not be concerned with its particular form, and we focus on the general calculation of terms like the one in Eq. (4).
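The change in the periodicity condition can be illustrated numerically. The following is a minimal sketch, assuming a truncated mode sum with made-up amplitudes (the dictionary `coeffs` and the values of `L`, `q`, `y` are illustrative and not taken from the text). With the exponent written as in Eqs. (2) and (3), a shift $y \to y + L$ multiplies the field by the phase $e^{i2\pi q}$, and $q = 0$ gives back an untwisted field.

```python
import cmath
import math

def twisted_field(y, coeffs, q, L):
    """Truncated mode sum: sum_n phi_n * exp(i*2*pi*(n + q)*y / L).

    `coeffs` maps the integer mode number n to a hypothetical amplitude phi_n.
    """
    return sum(c * cmath.exp(2j * math.pi * (n + q) * y / L)
               for n, c in coeffs.items())

# Illustrative amplitudes; any finite set of modes works.
coeffs = {-2: 0.3 - 0.1j, -1: 1.0, 0: 0.5j, 1: -0.7, 2: 0.2 + 0.4j}
L, q, y = 2.5, 0.3, 0.8

# Twisted case: phi(y + L) should equal exp(i*2*pi*q) * phi(y).
lhs = twisted_field(y + L, coeffs, q, L)
rhs = cmath.exp(2j * math.pi * q) * twisted_field(y, coeffs, q, L)
twist_error = abs(lhs - rhs)

# Untwisted case q = 0: phi(y + L) should equal phi(y).
periodic_error = abs(twisted_field(y + L, coeffs, 0.0, L)
                     - twisted_field(y, coeffs, 0.0, L))
print(twist_error, periodic_error)
```

Both printed differences should be at the level of floating-point rounding.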
1.2. The Casimir energy

One of the most interesting phenomena in quantum field theory is the Casimir effect (for a review, see [3, 4, 11, 23, 35, 31]). It expresses the quantum fluctuations of the vacuum of a quantum field. It originates from the “confinement” of a field in finite volume. Many studies have been done since Casimir’s original work [2]. The Casimir energy, usually calculated in these studies, is closely related to the boundary conditions of the fields under consideration [27, 30, 14, 15, 3, 4, 37, 39]. Boundary conditions influence the nature of the so-called Casimir force, which is generated from the vacuum energy. In this paper, we shall concentrate on the computation of the effective potential (Casimir energy) of bosonic and fermionic fields in a spacetime with the topology $S^1 \times R^d$ [3, 4, 11, 22, 26, 28, 29, 35]. Fermionic and bosonic fields in spaces with non-trivial topology are allowed to be either periodic or anti-periodic in the compact dimension. The forms of the potential to be studied are the bosonic one,
\[ \frac{1}{L} \int \frac{dk^d}{(2\pi)^d} \sum_{n=-\infty}^{\infty} \ln\left[ \frac{4\pi^2 n^2}{L^2} + k^2 + m^2 \right], \tag{5} \]
and the fermionic one,
\[ \frac{1}{L} \int \frac{dk^d}{(2\pi)^d} \sum_{n=-\infty}^{\infty} \ln\left[ \frac{(2n + 1)^2 \pi^2}{L^2} + k^2 + m^2 \right]. \tag{6} \]
We shall study them also in the cases $d = 2$ and $d = 3$, which are of particular importance in physics since they correspond to three and four total dimensions. Both have many applications in solid state physics and cosmology [11, 3]. We shall also generalize to the case of fermions and bosons obeying general boundary conditions in $d + 1$ dimensions. From a calculational point of view, this is identical to the effective potential of theories with extra dimensions [52, 67], so computing one of the two gives simultaneously the other. The expression that is going to be studied thoroughly is,
\[
\begin{aligned}
\frac{1}{L} \int \frac{dk^d}{(2\pi)^d} \sum_{n=-\infty}^{\infty} \ln\left[ (n + \omega)^2 \left(\frac{2\pi}{L}\right)^2 + k^2 + m^2 \right]
&= \int \frac{dk^{d+1}}{(2\pi)^{d+1}} \ln[k^2 + a^2]
+ \frac{1}{L} \int \frac{dk^d}{(2\pi)^d} \ln\left[1 - e^{-2\left(\frac{aL}{2} - i\pi\omega\right)}\right] \\
&\quad + \frac{1}{L} \int \frac{dk^d}{(2\pi)^d} \ln\left[1 - e^{-2\left(\frac{aL}{2} + i\pi\omega\right)}\right].
\end{aligned}
\tag{7}
\]
The calculations shall be done in $d + 1$ dimensions, quite generally, and the application to any particular dimension can then be done easily. The only distinction shall be whether $d$ is even or odd. We shall make that clear in the corresponding sections and treat both cases in detail.
1.3. Field theories at finite temperature

The calculations used in finite temperature field theories are based on the imaginary time formalism [55, 57, 3, 35, 4]:
\[ t \to i\beta, \tag{8} \]
with $\beta = \frac{1}{T}$. The eigenfrequencies of the fields that appear in the propagators are discrete and are summed in the partition function. These are affected by the boundary conditions used for fermions and bosons [3, 4]. Bosons obey only periodic and fermions antiperiodic boundary conditions at finite temperature, as we shall see (this is restricted and dictated by the KMS relations [57]). Indeed for bosons the boundary conditions are:
\[ \varphi(x, 0) = \varphi(x, \beta), \tag{9} \]
where $x$ stands for space coordinates, and the fermionic boundary conditions are,
\[ \psi(x, 0) = -\psi(x, \beta). \tag{10} \]
In most calculations involving bosons, we are confronted with the following expression:
\[ T \int \frac{dk^3}{(2\pi)^3} \sum_{n=-\infty}^{\infty} \ln[4\pi^2 n^2 T^2 + k^2 + m^2], \tag{11} \]
while the fermionic contribution is,
\[ T \int \frac{dk^3}{(2\pi)^3} \sum_{n=-\infty}^{\infty} \ln[(2n + 1)^2 \pi^2 T^2 + k^2 + m^2], \tag{12} \]
and $k$ stands for the Euclidean momentum:
\[ k^2 = k_1^2 + k_2^2 + k_3^2, \tag{13} \]
while $m$ is the field mass. In the next sections, we deal with the two above contributions in $d + 1$ dimensions and we specify the results for $d = 3$ and $d = 2$.

2. Bosonic Contribution at Finite Temperature

We will compute the following expression,
\[ S_1 = T \int \frac{dk^3}{(2\pi)^3} \sum_{n=-\infty}^{\infty} \ln[4\pi^2 n^2 T^2 + k^2 + m^2]. \tag{14} \]
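In the following we generalize to $d$ dimensions. This will give us the opportunity to deal with other cases apart from $d = 4$. Consider the sum:
\[ S_o = \sum_{n=-\infty}^{\infty} \frac{1}{4\pi^2 n^2 T^2 + a^2} = \frac{1}{4\pi^2 T^2} \sum_{n=-\infty}^{\infty} \frac{1}{n^2 + \dfrac{a^2}{4\pi^2 T^2}}, \tag{15} \]
where,
\[ a^2 = k^2 + m^2. \tag{16} \]
Integrating over $a^2$,
\[ \sum_{n=-\infty}^{\infty} \frac{1}{4\pi^2 n^2 T^2 + a^2}, \tag{17} \]
we get:
\[ \sum_{n=-\infty}^{\infty} \int \frac{da^2}{4\pi^2 n^2 T^2 + a^2} = \sum_{n=-\infty}^{\infty} \ln[4\pi^2 n^2 T^2 + a^2]. \tag{18} \]
Now,
\[ \sum_{n=-\infty}^{\infty} \frac{1}{4\pi^2 n^2 T^2 + a^2} = \frac{2}{4aT} \coth\left(\frac{a}{2T}\right), \tag{19} \]
thus Eq. (18) becomes,
\[ \sum_{n=-\infty}^{\infty} \int \frac{da^2}{4\pi^2 n^2 T^2 + a^2} = \int \frac{2}{4aT} \coth\left(\frac{a}{2T}\right) da^2 = 2 \ln\left[\sinh\left(\frac{a}{2T}\right)\right]. \tag{20} \]
Using the relation [1],
\[ \ln(\sinh x) = \ln\left[\frac{1}{2}\left(e^{x} - e^{-x}\right)\right] = x + \ln[1 - e^{-2x}] - \ln[2], \tag{21} \]
with $x = \frac{a}{2T}$, we have,
\[ \ln\left[\sinh\left(\frac{a}{2T}\right)\right] = \frac{a}{2T} + \ln[1 - e^{-\frac{a}{T}}] - \ln[2], \tag{22} \]
and,
\[ \ln\left[\sinh\left(\frac{a}{2T}\right)\right] = \frac{a}{2T} + \ln[1 - e^{-\frac{a}{T}}] - \ln[2]. \tag{23} \]
Summing Eqs. (22) and (23) we obtain,
\[ \sum_{n=-\infty}^{\infty} \int \frac{da^2}{4\pi^2 n^2 T^2 + a^2} = 2 \ln\left[\sinh\left(\frac{a}{2T}\right)\right] = \frac{a}{T} + 2 \ln[1 - e^{-\frac{a}{T}}] - 2 \ln[2]. \tag{24} \]
Finally the result is [55, 57, 3, 35]:
\[ \sum_{n=-\infty}^{\infty} \ln[4\pi^2 n^2 T^2 + a^2] = \frac{a}{T} + 2 \ln[1 - e^{-\frac{a}{T}}] - 2 \ln[2]. \tag{25} \]
Upon using,
\[ \ln\left[ \frac{(n + \omega)^2 4\pi^2 T^2 + a^2}{(n + \omega)^2 4\pi^2 T^2 + b^2} \right] = 2(a - b), \tag{26} \]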
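Eq. (25) becomes,
\[ \sum_{n=-\infty}^{\infty} \ln[4\pi^2 n^2 T^2 + a^2] = \frac{1}{2\pi T} \int_{-\infty}^{\infty} dx \ln[x^2 + a^2] + 2 \ln[1 - e^{-\frac{a}{T}}]. \tag{27} \]
Finally we have,
\[ T \int \frac{dk^3}{(2\pi)^3} \sum_{n=-\infty}^{\infty} \ln[(2\pi n T)^2 + k^2 + m^2] = \int \frac{dk^3}{(2\pi)^3} \int_{-\infty}^{\infty} \frac{dx}{2\pi} \ln[x^2 + a^2] + 2T \int \frac{dk^3}{(2\pi)^3} \ln[1 - e^{-\frac{a}{T}}]. \tag{28} \]
Remembering that,
\[ a^2 = k^2 + m^2, \tag{29} \]
the first integral of Eq. (28) is the one loop contribution to the effective potential at zero temperature. The 4-momentum is:
\[ K^2 = k^2 + x^2. \tag{30} \]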
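The closed form (19) for the frequency sum and the hyperbolic identity behind Eq. (24) are easy to check numerically. The sketch below is only a sanity check under assumed, illustrative values of `a` and `T` (they are not taken from the text); the truncation `n_max` is likewise an arbitrary choice, and the truncated sum differs from the infinite one by a tail of order $1/(2\pi^2 T^2 n_{\max})$.

```python
import math

def matsubara_sum(a, T, n_max=500_000):
    """Truncated sum over n = -n_max..n_max of 1/(4 pi^2 n^2 T^2 + a^2)."""
    total = 1.0 / a**2  # n = 0 term
    for n in range(1, n_max + 1):
        # the terms for +n and -n are equal, hence the factor 2
        total += 2.0 / (4.0 * math.pi**2 * n**2 * T**2 + a**2)
    return total

def coth(x):
    return math.cosh(x) / math.sinh(x)

a, T = 1.3, 0.7  # illustrative values only

# Eq. (19): the sum equals (2 / (4 a T)) * coth(a / (2 T)).
direct = matsubara_sum(a, T)
closed = 2.0 / (4.0 * a * T) * coth(a / (2.0 * T))
sum_error = abs(direct - closed)

# Eq. (24): 2 ln sinh(a/2T) = a/T + 2 ln(1 - exp(-a/T)) - 2 ln 2.
lhs = 2.0 * math.log(math.sinh(a / (2.0 * T)))
rhs = a / T + 2.0 * math.log(1.0 - math.exp(-a / T)) - 2.0 * math.log(2.0)
identity_error = abs(lhs - rhs)
print(sum_error, identity_error)
```

The first difference is limited by the truncation of the sum, the second only by rounding.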
Writing the above in $d + 1$ dimensions (in the end we take $d = 3$ to come back to four dimensions) we get,
\[ T \int \frac{dk^d}{(2\pi)^d} \sum_{n=-\infty}^{\infty} \ln[4\pi^2 n^2 T^2 + k^2 + m^2] = \int \frac{dk^{d+1}}{(2\pi)^{d+1}} \ln[k^2 + a^2] + 2T \int \frac{dk^d}{(2\pi)^d} \ln[1 - e^{-\frac{a}{T}}]. \tag{31} \]
The temperature dependent part has singularities stemming from the infinite summations. These singularities are poles of the form [3, 35, 4]:
\[ \frac{1}{\epsilon}, \tag{32} \]
where $\epsilon \to 0$ is the dimensional regularization variable ($d = 4 + \epsilon$). As we shall see, by using the zeta regularization [3, 4, 85, 35, 12] these will be erased. In the remainder of this section, we focus on the calculation of the temperature dependent part. Let,
\[ V_{boson} = 2T \int \frac{dk^d}{(2\pi)^d} \ln[1 - e^{-\frac{a}{T}}]. \tag{33} \]
By using [1],
\[ \ln[1 - e^{-\frac{a}{T}}] = -\sum_{q=1}^{\infty} \frac{e^{-\frac{a}{T} q}}{q}, \tag{34} \]
we obtain,
\[
\begin{aligned}
V_{boson} &= 2T \int \frac{dk^d}{(2\pi)^d} \ln[1 - e^{-\frac{a}{T}}] \\
&= -2T \int \frac{dk^d}{(2\pi)^d} \sum_{q=1}^{\infty} \frac{e^{-\frac{a}{T} q}}{q} \\
&= -2 \sum_{q=1}^{\infty} T \int \frac{dk^d}{(2\pi)^d} \frac{e^{-\frac{a}{T} q}}{q},
\end{aligned}
\tag{35}
\]
and remembering,
\[ a = \sqrt{k^2 + m^2}, \tag{36} \]
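by integrating over the angles we get,
\[
\begin{aligned}
V_{boson} &= -2 \sum_{q=1}^{\infty} T \int \frac{dk^d}{(2\pi)^d} \frac{e^{-\frac{\sqrt{k^2 + m^2}}{T} q}}{q} \\
&= -2 \sum_{q=1}^{\infty} T \int_{-\infty}^{\infty} \frac{dk}{(2\pi)^d}\, k^{d-1}\, \frac{(2\pi)^{\frac{d}{2}}}{\Gamma\left(\frac{d}{2}\right)} \frac{e^{-\frac{\sqrt{k^2 + m^2}}{T} q}}{q} \\
&= -2 \sum_{q=1}^{\infty} T\, \frac{(2\pi)^{\frac{d}{2}}}{\Gamma\left(\frac{d}{2}\right) q\, (2\pi)^d} \int_{-\infty}^{\infty} dk\, k^{d-1} e^{-\frac{\sqrt{k^2 + m^2}}{T} q}.
\end{aligned}
\tag{37}
\]
The integral,
\[ \int_{-\infty}^{\infty} dk\, k^{d-1} e^{-\frac{\sqrt{k^2 + m^2}}{T} q}, \tag{38} \]
equals to [1],
\[ \int_{-\infty}^{\infty} dk\, k^{d-1} e^{-\frac{\sqrt{k^2 + m^2}}{T} q} = 2^{\frac{d}{2} - 1} (\sqrt{\pi})^{-1} \left(\frac{q}{T}\right)^{\frac{1}{2} - \frac{d}{2}} m^{\frac{d+1}{2}}\, \Gamma\left(\frac{d}{2}\right) K_{\frac{d+1}{2}}\left(\frac{mq}{T}\right). \tag{39} \]
So $V_{boson}$ can be written as:
\[
\begin{aligned}
V_{boson} &= -2 \sum_{q=1}^{\infty} \frac{(2\pi)^{\frac{d}{2}}}{(2\pi)^d}\, \frac{2^{\frac{d}{2} - 1}}{\sqrt{\pi}}\, m^{d+1} \left(\frac{T}{mq}\right)^{\frac{d+1}{2}} K_{\frac{d+1}{2}}\left(\frac{mq}{T}\right) \\
&= -\sum_{q=1}^{\infty} \frac{1}{(2\pi)^d} (2\pi)^{\frac{d-1}{2}} m^{d+1}\, \frac{K_{\frac{d+1}{2}}\left(\frac{mq}{T}\right)}{\left(\frac{mq}{2T}\right)^{\frac{d+1}{2}}}.
\end{aligned}
\tag{40}
\]
The function [1],
\[ \frac{K_\nu(z)}{\left(\frac{z}{2}\right)^{\nu}} = \frac{1}{2} \int_0^{\infty} \frac{e^{-t - \frac{z^2}{4t}}}{t^{\nu + 1}}\, dt, \tag{41} \]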
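is even under the transformation $z \to -z$. Thus, Eq. (40) becomes:
\[
\begin{aligned}
V_{boson} &= -\sum_{q=1}^{\infty} \frac{1}{(2\pi)^d} (2\pi)^{\frac{d-1}{2}} m^{d+1}\, \frac{K_{\frac{d+1}{2}}\left(\frac{mq}{T}\right)}{\left(\frac{mq}{2T}\right)^{\frac{d+1}{2}}} \\
&= -\frac{1}{2} {\sum_{q=-\infty}^{\infty}}' \frac{1}{(2\pi)^d} (2\pi)^{\frac{d-1}{2}} m^{d+1}\, \frac{K_{\frac{d+1}{2}}\left(\frac{mq}{T}\right)}{\left(\frac{mq}{2T}\right)^{\frac{d+1}{2}}}.
\end{aligned}
\tag{42}
\]
(The symbol $'$ in the summation denotes omission of the zero mode term $q = 0$.) By using,
\[ \frac{K_\nu(z)}{\left(\frac{z}{2}\right)^{\nu}} = \frac{1}{2} \int_0^{\infty} \frac{e^{-t - \frac{z^2}{4t}}}{t^{\nu + 1}}\, dt, \tag{43} \]
we get,
\[ V_{boson} = -\frac{1}{4} \frac{1}{(2\pi)^d} (2\pi)^{\frac{d-1}{2}} m^{d+1} \int_0^{\infty} dt\, e^{-t}\, \frac{{\sum\limits_{q=-\infty}^{\infty}}' e^{-\frac{\left(\frac{mq}{T}\right)^2}{4t}}}{t^{\frac{d+1}{2} + 1}}. \tag{44} \]
Let $\lambda = \frac{\left(\frac{m}{T}\right)^2}{4t}$. Using the Poisson summation formula [12, 35, 3, 4] we have,
\[ \sum_{q=-\infty}^{\infty} e^{-\lambda q^2} = \sqrt{\frac{\pi}{\lambda}} \sum_{k=-\infty}^{\infty} e^{-\frac{4\pi^2 k^2}{4\lambda}}, \tag{45} \]
and omitting the zero modes we obtain:
\[ 1 + {\sum_{q=-\infty}^{\infty}}' e^{-\lambda q^2} = \sqrt{\frac{\pi}{\lambda}} \left( 1 + {\sum_{k=-\infty}^{\infty}}' e^{-\frac{4\pi^2 k^2}{4\lambda}} \right). \tag{46} \]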
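The Poisson resummation in Eq. (45) trades a slowly converging Gaussian sum (small $\lambda$) for a rapidly converging one, and vice versa. A minimal numerical check is sketched below; the function names, the cutoff `k_max` and the sample values of `lam` are illustrative assumptions, not part of the original calculation.

```python
import math

def gaussian_sum(lam, k_max=50):
    """Left-hand side of Eq. (45): sum over q = -k_max..k_max of exp(-lam q^2)."""
    return sum(math.exp(-lam * q * q) for q in range(-k_max, k_max + 1))

def poisson_dual(lam, k_max=50):
    """Right-hand side of Eq. (45): sqrt(pi/lam) * sum_k exp(-4 pi^2 k^2 / (4 lam))."""
    s = sum(math.exp(-4.0 * math.pi**2 * k * k / (4.0 * lam))
            for k in range(-k_max, k_max + 1))
    return math.sqrt(math.pi / lam) * s

# Compare both sides for a small, a moderate and a larger value of lambda.
errors = {lam: abs(gaussian_sum(lam) - poisson_dual(lam))
          for lam in (0.1, 1.0, 5.0)}
print(errors)
```

For these values both truncated sums have converged to machine precision, so the two sides should agree up to rounding.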
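Finally,
\[ {\sum_{q=-\infty}^{\infty}}' e^{-\lambda q^2} = \sqrt{\frac{\pi}{\lambda}} \left( 1 + {\sum_{k=-\infty}^{\infty}}' e^{-\frac{4\pi^2 k^2}{4\lambda}} \right) - 1, \tag{47} \]
and replacing in $V_{boson}$, we obtain
\[ V_{boson} = -\frac{1}{4} \frac{1}{(2\pi)^d} (2\pi)^{\frac{d-1}{2}} m^{d+1} \int_0^{\infty} dt\, e^{-t}\, \frac{\sqrt{\dfrac{\pi}{\lambda}} \left( 1 + {\sum\limits_{k=-\infty}^{\infty}}' e^{-\frac{4\pi^2 k^2}{4\lambda}} \right) - 1}{t^{\frac{d+1}{2} + 1}}. \tag{48} \]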
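Set,
\[ \nu = \frac{d+1}{2}, \tag{49} \]
and Eq. (48) reads,
\[
\begin{aligned}
V_{boson} &= -\frac{1}{4} \frac{1}{(2\pi)^d} (2\pi)^{\frac{d-1}{2}} m^{d+1} \int_0^{\infty} dt\, e^{-t}\, \frac{\sqrt{\frac{\pi}{\lambda}}}{t^{\nu + 1}} \\
&\quad - \frac{1}{4} \frac{1}{(2\pi)^d} (2\pi)^{\frac{d-1}{2}} m^{d+1} \int_0^{\infty} dt\, e^{-t}\, \frac{\sqrt{\frac{\pi}{\lambda}}\, {\sum\limits_{k=-\infty}^{\infty}}' e^{-\frac{4\pi^2 k^2}{4\lambda}}}{t^{\nu + 1}} \\
&\quad + \frac{1}{4} \frac{1}{(2\pi)^d} (2\pi)^{\frac{d-1}{2}} m^{d+1} \int_0^{\infty} dt\, e^{-t}\, \frac{1}{t^{\nu + 1}}.
\end{aligned}
\tag{50}
\]
Also by setting,
\[ a = \frac{m}{T}, \tag{51} \]
Eq. (50) becomes (with $\lambda = \frac{a^2}{4t}$),
\[
\begin{aligned}
V_{boson} &= -\frac{1}{4} \frac{1}{(2\pi)^d} (2\pi)^{\frac{d-1}{2}} m^{d+1} \int_0^{\infty} dt\, e^{-t}\, \frac{2\sqrt{\pi t}}{a\, t^{\nu + 1}} \\
&\quad - \frac{1}{4} \frac{1}{(2\pi)^d} (2\pi)^{\frac{d-1}{2}} m^{d+1} \int_0^{\infty} dt\, e^{-t}\, \frac{2\sqrt{\pi t}\, {\sum\limits_{k=-\infty}^{\infty}}' e^{-\frac{4\pi^2 k^2 t}{a^2}}}{a\, t^{\nu + 1}} \\
&\quad + \frac{1}{4} \frac{1}{(2\pi)^d} (2\pi)^{\frac{d-1}{2}} m^{d+1} \int_0^{\infty} dt\, e^{-t}\, \frac{1}{t^{\nu + 1}}.
\end{aligned}
\tag{52}
\]
From this, after some calculations, we obtain:
\[
\begin{aligned}
V_{boson} &= -\frac{1}{2} \frac{1}{(2\pi)^d} (2\pi)^{\frac{d-1}{2}} m^{d+1}\, \frac{\sqrt{\pi}}{a} \int_0^{\infty} dt\, e^{-t}\, t^{-\nu - \frac{1}{2}} \\
&\quad - \frac{1}{2} \frac{1}{(2\pi)^d} (2\pi)^{\frac{d-1}{2}} m^{d+1}\, \frac{\sqrt{\pi}}{a} \int_0^{\infty} dt\, e^{-t}\, \frac{{\sum\limits_{k=-\infty}^{\infty}}' e^{-\frac{4\pi^2 k^2 t}{a^2}}}{t^{\nu + \frac{1}{2}}} \\
&\quad + \frac{1}{4} \frac{1}{(2\pi)^d} (2\pi)^{\frac{d-1}{2}} m^{d+1} \int_0^{\infty} dt\, e^{-t}\, \frac{1}{t^{\nu + 1}}.
\end{aligned}
\tag{53}
\]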
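By using [1],
\[ \frac{1}{(x^2 + a^2)^{\mu + 1}} = \frac{1}{\Gamma(\mu + 1)} \int_0^{\infty} dt\, e^{-(x^2 + a^2)t}\, t^{\mu}, \tag{54} \]
we finally have:
\[
\begin{aligned}
V_{boson} &= -\frac{1}{2} \frac{1}{(2\pi)^d} (2\pi)^{\frac{d-1}{2}} m^{d+1}\, \frac{\sqrt{\pi}}{a}\, \Gamma\left(-\nu - \frac{1}{2} + 1\right) \\
&\quad - \frac{1}{2} \frac{1}{(2\pi)^d} (2\pi)^{\frac{d-1}{2}} m^{d+1}\, \frac{\sqrt{\pi}}{a}\, \Gamma\left(-\nu - \frac{1}{2} + 1\right) {\sum_{k=-\infty}^{\infty}}' \left( 1 + \left(\frac{2\pi k}{a}\right)^2 \right)^{\nu + \frac{1}{2} - 1} \\
&\quad + \frac{1}{4} \frac{1}{(2\pi)^d} (2\pi)^{\frac{d-1}{2}} m^{d+1}\, \Gamma(-\nu).
\end{aligned}
\tag{55}
\]
The sum,
\[ {\sum_{k=-\infty}^{\infty}}' \left( 1 + \left(\frac{2\pi k}{a}\right)^2 \right)^{\nu + \frac{1}{2} - 1}, \tag{56} \]
is invariant under the transformation $k \to -k$. Thus we change the summation to,
\[ 2 \sum_{k=1}^{\infty} \left( 1 + \left(\frac{2\pi k}{a}\right)^2 \right)^{\nu + \frac{1}{2} - 1}. \tag{57} \]
Replacing the above in $V_{boson}$, after some calculations we get:
\[
\begin{aligned}
V_{boson} &= -\frac{1}{2} \frac{1}{(2\pi)^d} (2\pi)^{\frac{d-1}{2}} m^{d+1}\, \frac{\sqrt{\pi}}{a}\, \Gamma\left(-\nu - \frac{1}{2} + 1\right) \\
&\quad - \frac{1}{(2\pi)^d} (2\pi)^{\frac{d-1}{2}} m^{d+1}\, \frac{\sqrt{\pi}}{a}\, \Gamma\left(-\nu - \frac{1}{2} + 1\right) (a^2)^{\frac{1}{2} - \nu} \sum_{k=1}^{\infty} \left( a^2 + 4\pi^2 k^2 \right)^{\nu + \frac{1}{2} - 1} \\
&\quad + \frac{1}{4} \frac{1}{(2\pi)^d} (2\pi)^{\frac{d-1}{2}} m^{d+1}\, \Gamma(-\nu).
\end{aligned}
\tag{58}
\]
We use the binomial expansion (in the case that $d$ is even) or the Taylor expansion (in the case $d$ odd) [1]:
\[ (a^2 + b^2)^{\nu - \frac{1}{2}} = \sum_{l=0}^{\sigma} \frac{\left(\nu - \frac{1}{2}\right)!}{\left(\nu - \frac{1}{2} - l\right)!\, l!}\, (a^2)^{l}\, (b^2)^{\nu - \frac{1}{2} - l}. \tag{59} \]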
If d is even, then σ equals,
$$\sigma = \nu - \frac12. \tag{60}$$
If d is odd, then σ ∈ N*. We shall deal with both cases. Substituting the expansion into V_boson, we get
$$V_{boson} = -\frac12\frac{\sqrt{\pi}}{(2\pi)^d a}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma\!\left(-\nu-\tfrac12+1\right) + \frac14\frac{1}{(2\pi)^d}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma(-\nu) - \frac{\sqrt{\pi}}{(2\pi)^d a}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma\!\left(-\nu-\tfrac12+1\right)(a^2)^{\frac12-\nu}\sum_{k=1}^{\infty}\sum_{l=0}^{\sigma}\frac{\left(\nu-\frac12\right)!}{\left(\nu-\frac12-l\right)!\,l!}\,(a^2)^l\,\left((2\pi)^2\right)^{\nu-\frac12-l}(k^2)^{\nu-\frac12-l}. \tag{61}$$
The last expression shall be the starting point for the following two subsections. A much more elegant computation involves the analytic continuation of the Epstein zeta function [3, 12, 4, 35, 54, 53, 75, 30, 51, 28]. In a following section, we shall present the Epstein zeta functions in much more detail. In our case, relation (58) can be written in a much more elegant way, using the one-dimensional Epstein zeta function,
$$Z_1^{m^2}(\nu, w, \alpha) = \sum_{n=1}^{\infty}\left[w(n+\alpha)^2 + m^2\right]^{-\nu}. \tag{62}$$
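In our case, α = 0. In particular, one can rewrite the sum,
$$\sum_{k=1}^{\infty}\left(a^2+4\pi^2k^2\right)^{\nu+\frac12-1}, \tag{63}$$
in terms of the one-dimensional Epstein zeta function (62).

2.0.1. The Chowla–Selberg formula

It is worth mentioning at this point a very important formula related to the Bessel sums [3, 4, 35] of the relation,
$$V_{boson} = -\frac{1}{(2\pi)^d}(2\pi)^{\frac{d-1}{2}} m^{d+1}\sum_{q=1}^{\infty}\frac{K_{\frac{d+1}{2}}\!\left(\frac{mq}{T}\right)}{\left(\frac{mq}{2T}\right)^{\frac{d+1}{2}}} = -\frac12\frac{1}{(2\pi)^d}(2\pi)^{\frac{d-1}{2}} m^{d+1}{\sum_{q=-\infty}^{\infty}}'\frac{K_{\frac{d+1}{2}}\!\left(\frac{mq}{T}\right)}{\left(\frac{mq}{2T}\right)^{\frac{d+1}{2}}}. \tag{64}$$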
Apart from the inhomogeneous Epstein zeta [3, 12, 4, 35, 54, 53, 75, 30, 51, 28], there exists in the literature a generalization of the inhomogeneous Epstein zeta function, namely, the extended Chowla–Selberg formula [3], which we briefly describe at this point. We start with a two-dimensional generalization of the Epstein zeta function,
$$E(s; a, b, c; q) = \sum_{n,m\in\mathbb{Z}}\left(am^2 + bmn + cn^2 + q\right)^{-s}. \tag{65}$$
In the following Q is equal to,
$$Q(m, n) = am^2 + bmn + cn^2, \tag{66}$$
and also ∆ is,
$$\Delta = 4ac - b^2. \tag{67}$$
Following [3], relation (65), can be written as, √ 22s πas−1 −s √ Γ(s − 1/2)ζEH (s − 1/2, 4aq/∆) E(s; a, b, c; q) = 2ζEH (s, q/a)a + Γ(s) a
1/4−s/2 √ ∞ 4aq 22s πas−1 s−1/2 √ n cos(nπb/a) d1−2s ∆ + 2 + d Γ(s) a n=1 × Ks−1/2
πn a
4aq d2
∆+
d/n
.
(68)
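In the above relation, the summation over d|n runs over the 1 − 2s powers of the divisors of n. Also ζ_EH stands for,
$$\zeta_{EH}(s; p) = -\frac{p^{-s}}{2} + \frac{\sqrt{\pi}\,\Gamma(s-1/2)}{2\Gamma(s)}\,p^{-s+1/2} + \frac{2\pi^{s}\,p^{-s/2+1/4}}{\Gamma(s)}\sum_{n=1}^{\infty} n^{s-1/2} K_{s-1/2}\!\left(2\pi n\sqrt{p}\right). \tag{69}$$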
Relation (68) has very attractive features, most importantly its exponential convergence. We mention it here for completeness and because of its importance. For more details, see the detailed description in [3]. Our case is a special case of the extended Chowla–Selberg formula.

2.0.2. The case d odd

As stated before, in the d odd case σ ∈ N*. Then V_boson is:
$$V_{boson} = -\frac12\frac{\sqrt{\pi}}{(2\pi)^d a}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma\!\left(-\nu-\tfrac12+1\right) + \frac14\frac{1}{(2\pi)^d}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma(-\nu) - \frac{\sqrt{\pi}}{(2\pi)^d a}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma\!\left(-\nu-\tfrac12+1\right)(a^2)^{\frac12-\nu}\sum_{k=1}^{\infty}\sum_{l=0}^{\sigma}\frac{\left(\nu-\frac12\right)!}{\left(\nu-\frac12-l\right)!\,l!}\,(a^2)^l\,\left((2\pi)^2\right)^{\nu-\frac12-l}(k^2)^{\nu-\frac12-l}. \tag{70}$$
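Using the analytic continuation of the Riemann zeta function [3, 4, 35, 56, 12],
$$\zeta(s) = \sum_{n=1}^{\infty} n^{-s}, \tag{71}$$
to negative integers, V_boson becomes:
$$V_{boson} = -\frac12\frac{\sqrt{\pi}}{(2\pi)^d a}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma\!\left(-\nu-\tfrac12+1\right) + \frac14\frac{1}{(2\pi)^d}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma(-\nu) - \frac{\sqrt{\pi}}{(2\pi)^d a}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma\!\left(-\nu-\tfrac12+1\right)(a^2)^{\frac12-\nu}\sum_{l=0}^{\sigma}\frac{\left(\nu-\frac12\right)!}{\left(\nu-\frac12-l\right)!\,l!}\,(a^2)^l\,\left((2\pi)^2\right)^{\nu-\frac12-l}\zeta(-2\nu+1+2l). \tag{72}$$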
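The zeta values entering (72) at negative integer arguments can be generated exactly from Bernoulli numbers through the standard relation ζ(−n) = −B_{n+1}/(n+1), n ≥ 1. The sketch below is an illustration under that assumption; the helper names are hypothetical. For d = 3 (ν = 2) it lists the arguments −2ν + 1 + 2l and shows that l = 2 lands on s = 1, where ζ(s) has its pole.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(m_max):
    """Bernoulli numbers B_0..B_m_max (convention B_1 = -1/2) by the usual recurrence."""
    B = [Fraction(1)]
    for m in range(1, m_max + 1):
        acc = sum(comb(m + 1, j) * B[j] for j in range(m))
        B.append(-acc / (m + 1))
    return B

def zeta_negative_integer(n):
    """Exact zeta(-n) for an integer n >= 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return -bernoulli_numbers(n + 1)[n + 1] / (n + 1)

def zeta_arguments(d, l_max):
    """Arguments -2*nu + 1 + 2*l of the zeta function in (72), with nu = (d+1)/2."""
    return [-d + 2 * l for l in range(l_max + 1)]

for s in zeta_arguments(3, 2):
    if s < 0:
        print(s, zeta_negative_integer(-s))
    else:
        print(s, "pole of zeta(s) at s = 1" if s == 1 else "not covered by this sketch")
```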
This is the final form of the bosonic contribution to the effective potential for d odd. In the following, we compute the above in the case d = 3. This is done by Taylor expanding the last expression in powers of ε (with d = 3 + ε) as ε → 0. Let us explicitly show how the poles cancel. In the case d = 3, two terms of V_boson have poles. The first pole appears in Γ(−ν) (recall ν = (d+1)/2) and the other is contained in ζ(−2ν + 1 + 2l) for the value l = 2, which gives the pole of ζ(s) at s = 1. Expanded around d = 3 + ε, in the limit ε → 0, the first of these terms reads:
$$\frac14\frac{1}{(2\pi)^d}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma(-\nu) = -\frac{m^4}{16\pi^2\varepsilon} + \frac{3m^4}{64\pi^2} - \frac{\gamma m^4}{32\pi^2} + \frac{m^4\ln(2)}{32\pi^2} - \frac{m^4\ln(m)}{16\pi^2} + \frac{m^4\ln(\pi)}{32\pi^2} + O(\varepsilon) \tag{73}$$
(where γ is the Euler–Mascheroni constant), in which a pole appears,
$$-\frac{m^4}{16\pi^2\varepsilon}. \tag{74}$$
Regarding the other pole-containing term (for d = 3 + ε, ε → 0),
√ d−1 1 π 1 2 md+1 Γ (2π) + 1 (a2 ) 2 −ν −ν − − d (2π) a 2
1 2 ν− 12 −2 ) ν − ! ((2π) 2
× (a2 )2 ζ(−2ν + 1 + 4) 1 ν − − 2 !l! 2 =
−(γm4 ) m4 ln(2) m4 ln(m) m4 ln(π) m4 + + + + 2 16π ε 16π 2 32π 2 16π 2 32π 2 1 5 m4 ψ m4 ψ m4 ln(α2 ) 2 2 + O(ε), − − + 32π 2 32π 2 32π 2
(75)
with ψ the digamma function. Summing the above expressions we observe that the poles are naturally erased as a consequence of the zeta regularization method. We expand Vboson keeping the most dominant terms in the high temperature limit [3, 35, 55, 57]: √ m4 α2 −m4 + 4 4 4 4 2 4 2 16π 2 α + 3m − γm − m − m π + m Vboson = 16π 2 2 4 ε 64π 32π 6πα 45α 12α2
3 4 ψ − m γm4 m4 ln(2) m4 ln(π) m4 ln(α2 ) 2 − + + − − 16π 2 16π 2 16π 2 32π 2 32π 2 1 5 m4 ψ m4 ψ 2 2 + O(ε) − + (76) 32π 2 32π 2
Vboson
m T
we get: −m4 m4 + 4 4 3 4 2 2 2 16π 2 + 3m − γm − m T − γm + m T = 16π 64π 2 ε 32π 2 6π 16π 2 12
and substituting α =
3 4 ψ − m m4 ln(2) m4 ln(π) m4 ln(α2 ) π2 T 4 2 + + − − − 45 16π 2 16π 2 32π 2 32π 2 1 5 m4 ψ m4 ψ 2 2 + O(ε). − + (77) 32π 2 32π 2
In Eq. (77), we kept terms of order ∼ T . For σ = 8, we have additionally,
m m 7m ζ(5) m m9 ζ(7) 7m11 ζ(9) T T T − + − 4096π 6 T 3 32768π 8T 5 1572864π 10T 7 m m 3m13 ζ(11) 33m15 ζ(13) T T + − . 4194304π 12T 9 268435456π 14T 11
(78)
2.0.3. The case d even

In the case d even, σ takes a limited number of values, namely all the integer values up to σ = ν − 1/2. Before proceeding, we comment on the values that d can take. If d > 2, that is d = 4, 6, . . . , the theory ceases to be renormalizable and UV regulators must be used in order to cure the UV singularities [57, 55, 11]. We shall not deal with these problems, which usually appear in extra dimensional models. Now V_boson in the d even case becomes:
$$V_{boson} = -\frac12\frac{\sqrt{\pi}}{(2\pi)^d a}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma\!\left(-\nu-\tfrac12+1\right) + \frac14\frac{1}{(2\pi)^d}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma(-\nu) - \frac{\sqrt{\pi}}{(2\pi)^d a}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma\!\left(-\nu-\tfrac12+1\right)(a^2)^{\frac12-\nu}\sum_{k=1}^{\infty}\sum_{l=0}^{\nu-\frac12}\frac{\left(\nu-\frac12\right)!}{\left(\nu-\frac12-l\right)!\,l!}\,(a^2)^l\,\left((2\pi)^2\right)^{\nu-\frac12-l}(k^2)^{\nu-\frac12-l}, \tag{79}$$
and using the zeta regularization [3, 4, 35, 12] we get:
$$V_{boson} = -\frac12\frac{\sqrt{\pi}}{(2\pi)^d a}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma\!\left(-\nu-\tfrac12+1\right) + \frac14\frac{1}{(2\pi)^d}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma(-\nu) - \frac{\sqrt{\pi}}{(2\pi)^d a}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma\!\left(-\nu-\tfrac12+1\right)(a^2)^{\frac12-\nu}\sum_{l=0}^{\nu-\frac12}\frac{\left(\nu-\frac12\right)!}{\left(\nu-\frac12-l\right)!\,l!}\,(a^2)^l\,\left((2\pi)^2\right)^{\nu-\frac12-l}\zeta(-2\nu+1+2l). \tag{80}$$
We compute, for example, the above in the case d = 2. The poles are contained in the two terms proportional to Γ(−ν − 1/2 + 1). Expanding for ε → 0 (d = 2 + ε), the first pole-containing term is:
√ d−1 π 1 1 d+1 (2π) 2 m Γ −ν − + 1 − 2 (2π)d a 2 2
2 2 m T −(m T ) γm T m2 T ln(2) m2 T ln(m) m2 T ln(π) √ − √ √ √ √ = √ + + − + , 2 2πε 4 2π 4 2π 4 2π 2 2π 4 2π (81) and the other one reads:
√ d−1 1 π 1 2 md+1 Γ (2π) + 1 (a2 ) 2 −ν −ν − − d (2π) a 2
1 1 2 ν− −l 1 2 ν− 2 ((2π) ) ν − ! 2
× (a2 )l ζ(−2ν + 1 + 2l) 1 l=0 ν − − l !l! 2 =
γm2 T m2 T ln(2) m2 T ln(m) m2 T ln(π) m2 T √ √ √ √ √ + + + + 2 2πε 4 2π 4 2π 2 2π 4 2π 2 m 2 m T ln √ m2 T ln(2π) T2 √ √ − − + 2 2πT 3 ζ (−2) . 2 2π 4 2π
(82)
Adding Eqs. (81) and (82) we can see that the poles are erased naturally and Vboson becomes (d = 2): m3 m2 T m2 T ln(2) m2 T ln(π) Vboson = 6√2π + 4√2π + 2√2π + 2√2π 2 m 2 T ln m √ m2 T ln(2π) T2 √ √ − + 2 2πT 3 ζ (−2) − . 2 2π 4 2π
(83)
2.1. Fermionic contribution at finite temperature

In this section we will compute the fermionic contribution to the effective potential:
$$T\sum_{n=-\infty}^{\infty}\int\frac{dk^3}{(2\pi)^3}\ln\!\left[(2n+1)^2\pi^2T^2 + k^2 + m^2\right]. \tag{84}$$
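Following the same procedures as in the bosonic case we obtain [35, 3, 4, 57, 55]:
$$T\sum_{n=-\infty}^{\infty}\int\frac{dk^3}{(2\pi)^3}\ln\!\left[(2n+1)^2\pi^2T^2 + k^2 + m^2\right] = \int\frac{dk^3}{(2\pi)^3}\int_{-\infty}^{\infty}\frac{dx}{2\pi}\ln\!\left[x^2+a^2\right] + 2T\int\frac{dk^3}{(2\pi)^3}\ln\!\left[1+e^{-\left(\frac{a}{2T}\right)}\right]. \tag{85}$$
As before, the first term on the right-hand side is the effective potential at zero temperature. We shall dwell on the temperature dependent contribution, which in d + 1 dimensions is written,
$$T\sum_{n=-\infty}^{\infty}\int\frac{dk^d}{(2\pi)^d}\ln\!\left[(2n+1)^2\pi^2T^2 + k^2 + m^2\right] = \int\frac{dk^{d+1}}{(2\pi)^{d+1}}\ln\!\left[k^2+a^2\right] + 2T\int\frac{dk^d}{(2\pi)^d}\ln\!\left[1+e^{-\left(\frac{a}{2T}\right)}\right]. \tag{86}$$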
Let,
$$V_{fermion} = 2T\int\frac{dk^d}{(2\pi)^d}\ln\!\left[1+e^{-\frac{a}{2T}}\right]. \tag{87}$$
By using [1],
$$\ln\!\left[1+e^{-\frac{a}{2T}}\right] = -\sum_{q=1}^{\infty}\frac{(-1)^q e^{-\frac{a}{2T}q}}{q}, \tag{88}$$
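V_fermion becomes,
$$V_{fermion} = 2T\int\frac{dk^d}{(2\pi)^d}\ln\!\left[1+e^{-\frac{a}{2T}}\right] = -2T\int\frac{dk^d}{(2\pi)^d}\sum_{q=1}^{\infty}\frac{(-1)^q e^{-\frac{a}{2T}q}}{q} = -2\sum_{q=1}^{\infty}T\int\frac{dk^d}{(2\pi)^d}\frac{(-1)^q e^{-\frac{a}{2T}q}}{q}. \tag{89}$$
Recall that,
$$a = \sqrt{k^2+m^2}, \tag{90}$$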
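and so,
$$V_{fermion} = -2\sum_{q=1}^{\infty}T\int\frac{dk^d}{(2\pi)^d}\frac{(-1)^q e^{-\frac{\sqrt{k^2+m^2}}{2T}q}}{q} = -2\sum_{q=1}^{\infty}T\,\frac{(2\pi)^{\frac d2}}{\Gamma\!\left(\frac d2\right)(2\pi)^d}\int_{-\infty}^{\infty}dk\,k^{d-1}\,\frac{(-1)^q e^{-\frac{\sqrt{k^2+m^2}}{2T}q}}{q} = -2\sum_{q=1}^{\infty}T\,\frac{(2\pi)^{\frac d2}(-1)^q}{\Gamma\!\left(\frac d2\right)q\,(2\pi)^d}\int_{-\infty}^{\infty}dk\,k^{d-1}\,e^{-\frac{\sqrt{k^2+m^2}}{2T}q}. \tag{91}$$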
The integral,
∞
dk k d−1 e−
√
k2 +m2 2T
q
,
(92)
−∞
equals to [1],
∞
dk k
d−1 −
√
k2 +m2 2T
e
q
=2
d 2 −1
√ ( π)−1
−∞
q 2T
12 − d2 m
d+1 2
So Vfermion reads as,
Vfermion = −2
d ∞ 2 2 −1 (−1)q
(2π)d
q=1
=−
d mq Γ K d+1 . 2 2 2T (93)
∞ q=1
K d+1 (2π)
d+1 2
md+1
2
mq 2T
1
2T mq
d+1 2
mq 2 d−1 (−1)q 2T (2π) 2 md+1 .
d+1 (2π)d mq 2 4T K d+1
(94)
Using the relation [1, 3, 4, 35]:
$$\sum_{q=1}^{\infty}(-1)^q f(q) = 2\sum_{q=1}^{\infty}f(2q) - \sum_{q=1}^{\infty}f(q), \tag{95}$$
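Relation (95) simply separates even and odd terms: the alternating sum equals (even terms) − (odd terms) = 2 (even terms) − (all terms). The sketch below checks the finite version of this statement, which holds exactly when the alternating sum runs to 2N and the even sum to N. The test function f is an arbitrary decaying example and is not taken from the paper.

```python
import math

def alternating_sum(f, n_pairs):
    """sum_{q=1}^{2N} (-1)^q f(q)."""
    return sum((-1) ** q * f(q) for q in range(1, 2 * n_pairs + 1))

def rearranged_sum(f, n_pairs):
    """2 * sum_{q=1}^{N} f(2q) - sum_{q=1}^{2N} f(q)."""
    even_part = sum(f(2 * q) for q in range(1, n_pairs + 1))
    full_part = sum(f(q) for q in range(1, 2 * n_pairs + 1))
    return 2.0 * even_part - full_part

def sample_f(q):
    # Arbitrary decaying test function (an assumption for illustration)
    return math.exp(-0.3 * q) / q ** 2

print(alternating_sum(sample_f, 500), rearranged_sum(sample_f, 500))
```

In the text, f(q) is the Bessel-function summand, so f(2q) produces the sum with argument mq/T and f(q) the one with argument mq/2T.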
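we get,
$$\sum_{q=1}^{\infty}\frac{(-1)^q K_{\frac{d+1}{2}}\!\left(\frac{mq}{2T}\right)}{\left(\frac{mq}{4T}\right)^{\frac{d+1}{2}}} = 2\sum_{q=1}^{\infty}\frac{K_{\frac{d+1}{2}}\!\left(\frac{mq}{T}\right)}{\left(\frac{mq}{2T}\right)^{\frac{d+1}{2}}} - \sum_{q=1}^{\infty}\frac{K_{\frac{d+1}{2}}\!\left(\frac{mq}{2T}\right)}{\left(\frac{mq}{4T}\right)^{\frac{d+1}{2}}}, \tag{96}$$
and upon substituting into V_fermion we obtain:
$$V_{fermion} = -\sum_{q=1}^{\infty}(-1)^q\,\frac{(2\pi)^{\frac{d-1}{2}} m^{d+1}}{(2\pi)^d}\,\frac{K_{\frac{d+1}{2}}\!\left(\frac{mq}{2T}\right)}{\left(\frac{mq}{4T}\right)^{\frac{d+1}{2}}} = -\frac{(2\pi)^{\frac{d-1}{2}} m^{d+1}}{(2\pi)^d}\left[2\sum_{q=1}^{\infty}\frac{K_{\frac{d+1}{2}}\!\left(\frac{mq}{T}\right)}{\left(\frac{mq}{2T}\right)^{\frac{d+1}{2}}} - \sum_{q=1}^{\infty}\frac{K_{\frac{d+1}{2}}\!\left(\frac{mq}{2T}\right)}{\left(\frac{mq}{4T}\right)^{\frac{d+1}{2}}}\right]. \tag{97}$$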
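The function,
$$\frac{K_\nu(z)}{\left(\frac z2\right)^{\nu}} = \frac12\int_0^\infty\frac{e^{-t-\frac{z^2}{4t}}}{t^{\nu+1}}\,dt, \tag{98}$$
is even under the transformation z → −z. Thus the above becomes:
$$V_{fermion} = -\frac{(2\pi)^{\frac{d-1}{2}} m^{d+1}}{(2\pi)^d}\left[2\cdot\frac12{\sum_{q=-\infty}^{\infty}}'\frac{K_{\frac{d+1}{2}}\!\left(\frac{mq}{T}\right)}{\left(\frac{mq}{2T}\right)^{\frac{d+1}{2}}} - \frac12{\sum_{q=-\infty}^{\infty}}'\frac{K_{\frac{d+1}{2}}\!\left(\frac{mq}{2T}\right)}{\left(\frac{mq}{4T}\right)^{\frac{d+1}{2}}}\right] = -\frac{(2\pi)^{\frac{d-1}{2}} m^{d+1}}{(2\pi)^d}\left[{\sum_{q=-\infty}^{\infty}}'\frac{K_{\frac{d+1}{2}}\!\left(\frac{mq}{T}\right)}{\left(\frac{mq}{2T}\right)^{\frac{d+1}{2}}} - \frac12{\sum_{q=-\infty}^{\infty}}'\frac{K_{\frac{d+1}{2}}\!\left(\frac{mq}{2T}\right)}{\left(\frac{mq}{4T}\right)^{\frac{d+1}{2}}}\right], \tag{99}$$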
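where the prime on the sums denotes omission of the zero mode q = 0 in the summation. Using,
$$\frac{K_\nu(z)}{\left(\frac z2\right)^{\nu}} = \frac12\int_0^\infty\frac{e^{-t-\frac{z^2}{4t}}}{t^{\nu+1}}\,dt, \tag{100}$$
the two Bessel sums are written as:
$$-\frac{(2\pi)^{\frac{d-1}{2}} m^{d+1}}{(2\pi)^d}{\sum_{q=-\infty}^{\infty}}'\frac{K_{\frac{d+1}{2}}\!\left(\frac{mq}{T}\right)}{\left(\frac{mq}{2T}\right)^{\frac{d+1}{2}}} = -\frac12\frac{1}{(2\pi)^d}(2\pi)^{\frac{d-1}{2}} m^{d+1}\int_0^\infty dt\, e^{-t}\,\frac{{\sum_{q=-\infty}^{\infty}}' e^{-\frac{\left(\frac{mq}{T}\right)^2}{4t}}}{t^{\frac{d+1}{2}+1}}. \tag{101}$$
Setting λ = (m/T)²/(4t) and using the Poisson summation formula [3, 4, 35] we obtain:
$${\sum_{q=-\infty}^{\infty}}' e^{-\lambda q^2} = \sqrt{\frac{\pi}{\lambda}}\left(1 + {\sum_{k=-\infty}^{\infty}}' e^{-\frac{4\pi^2k^2}{4\lambda}}\right) - 1. \tag{102}$$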
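Both ingredients used here can be checked numerically. The sketch below, under stated assumptions, (i) compares the two sides of the Poisson resummation (102) for a sample λ and (ii) evaluates the integral representation (98)/(100) for ν = 1/2, where the closed form K_{1/2}(z) = sqrt(π/(2z)) e^{−z} is available for comparison. Function names, truncation limits and sample values are illustrative choices, not taken from the paper.

```python
import math

def poisson_lhs(lam, q_max=200):
    """Primed sum over q != 0 of exp(-lam * q^2)."""
    return 2.0 * sum(math.exp(-lam * q * q) for q in range(1, q_max + 1))

def poisson_rhs(lam, k_max=200):
    """sqrt(pi/lam) * (1 + primed sum over k of exp(-pi^2 k^2 / lam)) - 1."""
    dual = 2.0 * sum(math.exp(-math.pi ** 2 * k * k / lam) for k in range(1, k_max + 1))
    return math.sqrt(math.pi / lam) * (1.0 + dual) - 1.0

def bessel_k_from_integral(nu, z, n=4000, u_min=-20.0, u_max=6.0):
    """K_nu(z) = (1/2) (z/2)^nu * int_0^inf exp(-t - z^2/(4t)) / t^(nu+1) dt, with t = exp(u)."""
    h = (u_max - u_min) / n
    total = 0.0
    for i in range(n + 1):
        u = u_min + i * h
        weight = 0.5 if i in (0, n) else 1.0
        # integrand in u: exp(-e^u - (z^2/4) e^-u) * e^(-nu*u)
        total += weight * math.exp(-math.exp(u) - 0.25 * z * z * math.exp(-u) - nu * u)
    return 0.5 * (0.5 * z) ** nu * total * h

lam = 0.37
print(poisson_lhs(lam), poisson_rhs(lam))

z = 1.3
closed_form = math.sqrt(math.pi / (2.0 * z)) * math.exp(-z)
print(bessel_k_from_integral(0.5, z), closed_form)
```

For small λ the left-hand side of (102) converges slowly while the dual sum converges extremely fast, which is the practical reason for performing the resummation.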
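Upon substituting we get:
$$-\frac{(2\pi)^{\frac{d-1}{2}} m^{d+1}}{(2\pi)^d}{\sum_{q=-\infty}^{\infty}}'\frac{K_{\frac{d+1}{2}}\!\left(\frac{mq}{T}\right)}{\left(\frac{mq}{2T}\right)^{\frac{d+1}{2}}} = -\frac12\frac{1}{(2\pi)^d}(2\pi)^{\frac{d-1}{2}} m^{d+1}\int_0^\infty dt\, e^{-t}\,\frac{\sqrt{\frac{\pi}{\lambda}}\left(1+{\sum_{k=-\infty}^{\infty}}' e^{-\frac{4\pi^2k^2}{4\lambda}}\right) - 1}{t^{\frac{d+1}{2}+1}}. \tag{103}$$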
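Set,
$$\nu = \frac{d+1}{2}, \tag{104}$$
and thus,
$$-\frac{(2\pi)^{\frac{d-1}{2}} m^{d+1}}{(2\pi)^d}{\sum_{q=-\infty}^{\infty}}'\frac{K_{\frac{d+1}{2}}\!\left(\frac{mq}{T}\right)}{\left(\frac{mq}{2T}\right)^{\frac{d+1}{2}}} = -\frac12\frac{1}{(2\pi)^d}(2\pi)^{\frac{d-1}{2}} m^{d+1}\int_0^\infty dt\, e^{-t}\,\frac{\sqrt{\frac{\pi}{\lambda}}}{t^{\nu+1}} - \frac12\frac{1}{(2\pi)^d}(2\pi)^{\frac{d-1}{2}} m^{d+1}\int_0^\infty dt\, e^{-t}\,\frac{\sqrt{\frac{\pi}{\lambda}}\,{\sum_{k=-\infty}^{\infty}}' e^{-\frac{4\pi^2k^2}{4\lambda}}}{t^{\nu+1}} + \frac12\frac{1}{(2\pi)^d}(2\pi)^{\frac{d-1}{2}} m^{d+1}\int_0^\infty dt\, e^{-t}\,\frac{1}{t^{\nu+1}}. \tag{105}$$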
Also,
$$a = \frac mT, \tag{106}$$
and finally (with λ = a²/(4t)),
$$-\frac{(2\pi)^{\frac{d-1}{2}} m^{d+1}}{(2\pi)^d}{\sum_{q=-\infty}^{\infty}}'\frac{K_{\frac{d+1}{2}}\!\left(\frac{mq}{T}\right)}{\left(\frac{mq}{2T}\right)^{\frac{d+1}{2}}} = -\frac{\sqrt{\pi}}{(2\pi)^d a}(2\pi)^{\frac{d-1}{2}} m^{d+1}\int_0^\infty dt\, e^{-t}\, t^{-\nu-\frac12} - \frac{\sqrt{\pi}}{(2\pi)^d a}(2\pi)^{\frac{d-1}{2}} m^{d+1}\int_0^\infty dt\, e^{-t}\,\frac{{\sum_{k=-\infty}^{\infty}}' e^{-\frac{4\pi^2k^2}{a^2}t}}{t^{\nu+\frac12}} + \frac12\frac{1}{(2\pi)^d}(2\pi)^{\frac{d-1}{2}} m^{d+1}\int_0^\infty dt\, e^{-t}\,\frac{1}{t^{\nu+1}}. \tag{107}$$
By using [1],
$$\frac{1}{(x^2+a^2)^{\mu+1}} = \frac{1}{\Gamma(\mu+1)}\int_0^\infty dt\, e^{-(x^2+a^2)t}\, t^{\mu}, \tag{108}$$
we obtain the equation:
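$$-\frac{(2\pi)^{\frac{d-1}{2}} m^{d+1}}{(2\pi)^d}{\sum_{q=-\infty}^{\infty}}'\frac{K_{\frac{d+1}{2}}\!\left(\frac{mq}{T}\right)}{\left(\frac{mq}{2T}\right)^{\frac{d+1}{2}}} = -\frac{\sqrt{\pi}}{(2\pi)^d a}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma\!\left(-\nu-\tfrac12+1\right) - \frac{\sqrt{\pi}}{(2\pi)^d a}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma\!\left(-\nu-\tfrac12+1\right){\sum_{k=-\infty}^{\infty}}'\left[1+\left(\frac{2\pi k}{a}\right)^2\right]^{\nu+\frac12-1} + \frac12\frac{1}{(2\pi)^d}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma(-\nu). \tag{109}$$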
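The sum,
$${\sum_{k=-\infty}^{\infty}}'\left[1+\left(\frac{2\pi k}{a}\right)^2\right]^{\nu+\frac12-1}, \tag{110}$$
is invariant under the transformation k → −k. Thus we change the sum to,
$$2\sum_{k=1}^{\infty}\left[1+\left(\frac{2\pi k}{a}\right)^2\right]^{\nu+\frac12-1}. \tag{111}$$
Substituting again we get:
$$-\frac{(2\pi)^{\frac{d-1}{2}} m^{d+1}}{(2\pi)^d}{\sum_{q=-\infty}^{\infty}}'\frac{K_{\frac{d+1}{2}}\!\left(\frac{mq}{T}\right)}{\left(\frac{mq}{2T}\right)^{\frac{d+1}{2}}} = -\frac{\sqrt{\pi}}{(2\pi)^d a}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma\!\left(-\nu-\tfrac12+1\right) - \frac{2\sqrt{\pi}}{(2\pi)^d a}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma\!\left(-\nu-\tfrac12+1\right)(a^2)^{\frac12-\nu}\sum_{k=1}^{\infty}\left(a^2+4\pi^2k^2\right)^{\nu+\frac12-1} + \frac12\frac{1}{(2\pi)^d}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma(-\nu). \tag{112}$$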
Using the binomial expansion (in the case d even) or Taylor expansion (in the case d odd) [1]:
$$(a^2+b^2)^{\nu-\frac12} = \sum_{l=0}^{\sigma}\frac{\left(\nu-\frac12\right)!}{\left(\nu-\frac12-l\right)!\,l!}\,(a^2)^l\,(b^2)^{\nu-\frac12-l}. \tag{113}$$
For d even, σ equals,
$$\sigma = \nu - \frac12. \tag{114}$$
If d is odd then σ is a positive integer. By Taylor expanding, we obtain:
mq d−1 ∞ K d+1 2 (2π) 2 md+1 T −
d+1 d (2π) mq 2 q=−∞ 2T
√ d−1 1 π 2 md+1 Γ = − −ν − (2π) + 1 (2π)d a 2 d−1 1 1 (2π) 2 md+1 Γ(−ν) 2 (2π)d
√ d−1 1 1 2 π 2 md+1 Γ (2π) + 1 (a2 ) 2 −ν −ν − − (2π)d a 2
1 2 ν− 12 −l ((2π) ) ν − ! σ ∞ 2 2 l 2 ν− 12 −l
(a × ) (k ) . 1 k=1 l=0 ν − − l !l! 2
+
Following the previous techniques we get for the second sum of Eq. (101):
mq d−1 ∞ K d+1 2 1 (2π) 2 md+1 2T
d+1 d 2 (2π) mq 2 q=−∞ 4T
√ d−1 π 1 1 2 md+1 Γ + 1 = (2π) −ν − 2 (2π)d a1 2
(115)
Calculation of Effective Potential in Spacetimes with S 1 × Rd Topology d−1 1 1 (2π) 2 md+1 Γ(−ν) d 4 (2π)
√ d−1 π 1 1 1 d+1 2 + (2π) m Γ −ν − + 1 (a21 ) 2 −ν 2 (2π)d a1 2
1 2 ν− 12 −l ((2π) ) ν − ! σ ∞ 1 2
(a21 )l (k 2 )ν− 2 −l × , 1 k=1 l=0 ν − − l !l! 2
−
(116)
with, α1 =
m . 2T
(117)
Finally, adding the resulting expressions, we get:
mq mq d−1 ∞ K d+1 ∞ K d+1 2 2 (2π) 2 md+1 1 T 2T Vfermion = − − d+1 d+1
(2π)d 2 q=−∞ mq 2 mq 2 q=−∞ 2T 4T
√ d−1 π 1 =− (2π) 2 md+1 Γ −ν − + 1 d (2π) a 2 d−1 1 1 (2π) 2 md+1 Γ(−ν) d 2 (2π)
√ d−1 1 2 π 1 d+1 (2π) 2 m Γ −ν − + 1 (a2 ) 2 −ν =− d (2π) a 2
1 2 ν− 12 −l ((2π) ) ν − ! σ ∞ 1 2
(a2 )l (k 2 )ν− 2 −l × 1 k=1 l=0 ν − − l !l! 2
√ d−1 π 1 1 2 md+1 Γ + 1 (2π) −ν − + 2 (2π)d a1 2
+
d−1 1 1 (2π) 2 md+1 Γ(−ν) 4 (2π)d
√ d−1 1 π 1 1 2 md+1 Γ + 1 (a21 ) 2 −ν (2π) −ν − + 2 (2π)d a1 2
1 2 ν− 12 −l ((2π) ) ν − ! σ ∞ 1 2
× (a21 )l (k 2 )ν− 2 −l , 1 k=1 l=0 ν − − l !l! 2
−
(118)
with α = m T and α1 = we obtain,
m 2T
. Using the zeta regularization technique [3, 4, 12, 35, 46]
d−1
Vfermion
(2π) 2 md+1 =− (2π)d
mq mq ∞ K d+1 ∞ K d+1 2 2 1 T 2T − d+1 q=−∞ mq 2 2 q=−∞ mq d+1 2
2T
d−1 1 π d+1 2 =− (2π) m Γ −ν − + 1 (2π)d a 2 √
4T
d−1 1 1 (2π) 2 md+1 Γ(−ν) d 2 (2π)
√ d−1 1 1 2 π 2 md+1 Γ (2π) + 1 (a2 ) 2 −ν −ν − − d (2π) a 2
1 2 ν− 12 −l ((2π) ) ν − ! σ 2
× (a2 )l ζ(−2ν + 1 + 2l) 1 l=0 ν − − l !l! 2
√ d−1 π 1 1 2 md+1 Γ + 1 + (2π) −ν − 2 (2π)d a1 2
+
d−1 1 1 (2π) 2 md+1 Γ(−ν) 4 (2π)d
√ d−1 π 1 1 1 2 md+1 Γ + 1 (a21 ) 2 −ν + (2π) −ν − 2 (2π)d a1 2
1 2 ν− 12 −l ((2π) ) ν − ! σ 2
(a21 )l ζ(−2ν + 1 + 2l) × (119) . 1 l=0 ν − − l !l! 2 We kept the above expression without simplifying in order to have a clear picture of the terms appearing (compare with the bosonic case). In the case d = 3, in relation (119) the same poles appear that we found in the bosonic case. Again we Taylor expand around d = 3 + for → 0. As in the bosonic case, we can write the fermionic contribution at finite temperature more elegantly using the analytic continuation of the Epstein zeta function [3, 12, 4, 35, 54, 53, 75, 30, 51, 28]. In this case, the sums of the form, ∞ 1 (a2 + 4π 2 (2k + 1)2 )ν+ 2 −1 ], (120)
−
k=1
can be written in terms of the one-dimensional Epstein zeta function,
$$Z_1^{m^2}(\nu, w, \alpha) = \sum_{n=1}^{\infty}\left[w(n+\alpha)^2 + m^2\right]^{-\nu}, \tag{121}$$
with α = 12 and so on. We postpone the detailed presentation of the Epstein zeta functions in the section in which we study the twisted boundary conditions effective potential. 2.1.1. Case d odd For the case d = 3, keeping terms ∼ T we have: m4 −m4 + 4 4 2 2 2 4 2 16π 2 + 3m − 3γm − m T + 14π T Vfermion = 16π 64π 2 ε 32π 2 6 45 m4 ln
m2 T2
3 1 m4 ψ − m4 ψ 2 2 − − 32π 2 32π 2
m4 ln(π) − 16π 2 32π 2 5 m4 ψ 7m6 ζ(3) 31m8 ζ(5) 2 . + + − 2 4 2 32π 1536π T 65536π 6T 4 +
(122)
There are terms which are inverse powers of the temperature which in the high temperature limit (which we use) are negligible. 2.1.2. Case d even The calculation is the same as in the bosonic case. We only quote the case d = 2,
√ m3 m2 T ln(2) 3 √ √ − Vfermion = (123) − 12 2πT ζ (−2) . 6 2π 2π We observe that the results contain a finite number of terms and is not an infinite sum as in the case d odd. 2.2. Some applications on finite temperature field theories 2.2.1. The standard model at finite temperature Let us now present the 1-loop correction for the effective potential of standard model fields [63]. The calculations of the final results are based on relations (119) and (61), of the previous sections. We start with a scalar boson described by the Lagrangian, L=
1 µ ∂ φ∂µ φ − V0 (φ), 2
(124)
1 2 2 λ 4 m φ + φ , 2 4!
(125)
with tree level potential, V0 =
or in the case of N_s complex scalar fields,
$$\mathcal{L} = \frac12\,\partial^\mu\phi^\alpha\,\partial_\mu\phi^\dagger_\alpha - V_0(\phi^\alpha, \phi^\dagger_\alpha), \tag{126}$$
and in the following,
$$(M_s^2)^\alpha_b \equiv V^\alpha_b = \frac{\partial^2 V}{\partial\phi^\dagger_\alpha\,\partial\phi^b}. \tag{127}$$
Note that Tr M_s² = 2V^α_α, where the factor 2 comes from the two degrees of freedom of every complex scalar field. Also Tr I = 2N_s. Now regarding the fermion fields we have,
$$\mathcal{L} = i\bar\psi_\alpha\,\gamma\cdot\partial\,\psi^\alpha - \bar\psi_\alpha (M_f)^\alpha_b\,\psi^b, \tag{128}$$
where the mass matrix (M_f)^α_b(φ_c^i) is a function of the scalar fields, linear in φ_c^i:
$$(M_f)^\alpha_b = \Gamma^\alpha_{bi}\,\phi_c^i. \tag{129}$$
It is assumed that a Higgs mechanism gives mass to the fermions. Finally, consider the SU(N) gauge invariant Lagrangian,
$$\mathcal{L} = -\frac14\,\mathrm{Tr}(F_{\mu\nu}F^{\mu\nu}) + \frac12\,\mathrm{Tr}\,(D_\mu\phi_\alpha)^\dagger(D^\mu\phi^\alpha) + \ldots, \tag{130}$$
describing the gauge boson-Higgs interactions. In the following,
$$(M_{gb})^2_{\alpha\beta}(\phi_c) = g_\alpha g_\beta\,\mathrm{Tr}\!\left[(T^{i}_{\alpha l}\,\phi_i)^\dagger\, T^{l}_{\beta j}\,\phi_j\right], \tag{131}$$
are the gauge boson masses, and T_α are the SU(N) generators in the adjoint representation. For the case of scalar bosons the effective potential including the 1-loop correction is,
$$V^\beta_{eff}(\phi_c) = V_0(\phi_c) + V_1^\beta(\phi_c), \tag{132}$$
with V_0(φ_c) the tree order effective potential and the loop correction,
$$V_1^\beta(\phi_c) = \frac{1}{2\beta}\sum_{n=-\infty}^{\infty}\int\frac{d^3p}{(2\pi)^3}\ln\!\left[\omega_n^2 + \omega^2(\phi_c)\right], \tag{133}$$
where:
$$\omega_n = 2n\pi\beta^{-1}, \tag{134}$$
and also
$$\omega^2 = p^2 + m^2(\phi_c). \tag{135}$$
In the above, m²(φ_c) is given in relation (127). Relation (133) was the starting point of our calculation for the boson case, see relation (14). Now in the fermion case,
$$V^\beta_{eff}(\phi_c) = V_0(\phi_c) + V_1^\beta(\phi_c) \tag{136}$$
where as before V_0(φ_c) is the tree level potential and V_1^β(φ_c) the 1-loop correction. The latter equals
$$V_1^\beta(\phi_c) = -\frac{2\lambda}{2\beta}\sum_{n=-\infty}^{\infty}\int\frac{d^3p}{(2\pi)^3}\ln\!\left[\omega_n^2 + \omega^2(\phi_c)\right], \tag{137}$$
with ω_n the fermionic Matsubara frequencies:
$$\omega_n = (2n+1)\pi\beta^{-1}. \tag{138}$$
Also,
$$\omega^2 = p^2 + M_f^2(\phi_c). \tag{139}$$
Relation (137) was the starting point for the fermion effective potential calculation, relation (84). Finally, for the gauge bosons case the 1-loop correction reads,
$$V_1^\beta(\phi_c) = \mathrm{Tr}\,\Delta\left\{\frac12\int\frac{d^4p}{(2\pi)^4}\ln\!\left[p^2 + M_{gb}^2(\phi_c)\right] + \frac{1}{2\pi^2\beta^4}\,J_B\!\left[M_{gb}^2(\phi_c)\beta^2\right]\right\}, \tag{140}$$
where Tr ∆ = 3. Notice that:
$$J_B[m^2\beta^2] = \int_0^\infty dx\, x^2\ln\!\left[1 - e^{-\sqrt{x^2+\beta^2m^2}}\right], \tag{141}$$
and as before:
$$(M_{gb})^2_{\alpha\beta}(\phi_c) = g_\alpha g_\beta\,\mathrm{Tr}\!\left[(T^{i}_{\alpha l}\,\phi_i)^\dagger\, T^{l}_{\beta j}\,\phi_j\right]. \tag{142}$$
Relation (141) was obtained from relation (14).

2.3. Supersymmetric effective potential at finite temperature

It is very useful to extend our analysis for scalar bosons, fermions and gauge bosons to the supersymmetric case. Consider an N = 1, d = 4 supersymmetric Lagrangian with an SU(N) gauge symmetry. After that we give a general formula for the supersymmetric potential at finite temperature. We shall use the DR renormalization scheme [90]. The chiral superfield in components reads,
$$\Phi(x, \theta, \bar\theta) = A(x) + i\theta\sigma^\mu\bar\theta\,\partial_\mu A - \frac14\,\theta^2\bar\theta^2\,\Box A + \sqrt2\,\theta\psi(x) - \frac{i}{\sqrt2}\,\theta\theta\,\partial_\mu\psi\,\sigma^\mu\bar\theta + \theta\theta F(x), \tag{143}$$
and the vector hypermultiplet is described by the chiral superfield,
i µ ν α α α α 2 µ α ¯ Wa = T ¯ θ)a Fµν + θ σ Dµ λ , −λa + θa D − (σ σ 2
(143)
(144)
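with,
$$F^\alpha_{\mu\nu} = \partial_\mu A^\alpha_\nu - \partial_\nu A^\alpha_\mu + f^{\alpha bc}A^b_\mu A^c_\nu, \tag{145}$$
and also,
$$D_\mu\bar\lambda^\alpha = \partial_\mu\bar\lambda^\alpha + f^{\alpha bc}A^b_\mu\bar\lambda^c. \tag{146}$$
The N = 1 Lagrangian is,
$$\mathcal{L} = \frac{1}{8\pi}\,\mathrm{Im}\!\left[\tau\,\mathrm{Tr}\int d^2\theta\, W^\alpha W_\alpha\right] + \int d^2\theta\, d^2\bar\theta\,\Phi^\dagger e^{-2V}\Phi + \int d^2\theta\, W + \int d^2\bar\theta\,\bar W \tag{147}$$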
which in components is written, L=−
θ i 1 α αµν α αµν ¯α + 1 Dα Dα F Fµν F + Fµν − 2 λα σ µ Dµ λ 2 2 4g 32π g 2g 2
α † α α ¯σ µ (∂µ ψ − iAα T α ψ) + (∂µ A − iAα µ T A) (∂µ A − iAµ T A) − iψ¯ µ √ † α α √ † α † α α α ¯ Aλ ¯ + F Fi − D A T A − i 2A T λ ψ + i 2ψT i
+
¯ ¯ † 1 ∂W ∂W ∂W 1 ∂W Fi + Fi − ψi ψj − ψ¯i ψ¯j . † † ∂Ai 2 ∂Ai ∂Aj 2 ∂Ai ∂A†j ∂Ai
(148)
The computation of the finite temperature effective potential can be done easily. The general potential up to one loop at finite temperature is [90], 1 (VT =0 + VT =0 ). (149) 64π 2 In the above, V0 is the tree order potential (appearing in the Lagrangian). Also VT =0 is the one loop effective potential at T = 0. It is given by:
m2 Mj2 3 3 i VT =0 = − − ln ln +3 Q2 2 Q2 2 i j V = V0 +
−2
M2 3 k − ln . Q2 2
(150)
k
Finally, VT =0 , is given by: VT =0
√ 2 2 − k +m i d3 k T = 2T ln 1 − e 3 (2π) i √ 2 2 k +M d3 k j T +3 2T ln 1 − e 3 (2π) j
−2
k
√ 2 2 + k +M i d3 k 2T 2T ln 1 + e . 3 (2π)
(151)
The above is our final formula. Notice that relation (151) contains integrals we computed in the previous sections, both for bosons and for fermions, see, for example, relations (31) and (85). The first term corresponds to the scalar boson part, the second to the gauge bosons and the third to the fermion part. The same correspondence applies to relation (150). The masses that appear in relations (151) and (150) are model dependent and can be found in the same way as in (127), (129) and (131). All the above are invaluable to the theories of phase transitions at finite temperature. See for example reference [63] and references therein.

In conclusion, the generalization of the above to any dimension is straightforward. In general, apart from the phase transition application, a theory at finite temperature offers the possibility to connect a d-dimensional theory with the (d + 1)-dimensional theory at finite temperature. Let us discuss this briefly. One could say that the calculations we obtained actually correspond to a three-dimensional theory in the case of an initial d = 4 theory. However, one should be really cautious, since the argument that a d-dimensional field theory corresponds to the same theory in d − 1 dimensions has been proven true [86] only for the φ⁴ theory (always within the limits of perturbation theory). This also holds true for supersymmetric theories. On the contrary, this does not hold for QCD and Yang–Mills theories. Actually, QCD3 resembles more QCD4 and not QCD4 at finite temperature! It would be more correct to say that a d-dimensional theory at finite temperature resembles more the same theory with one dimension compactified to a circle, in the limit R → 0, where R is the magnitude of the compact dimension. We shall report on these issues elsewhere [89].

3. Calculation of Effective Potential in Spacetime Topology S¹ × R^d

In this section, we will compute the fermionic and bosonic contributions to the effective potential of field theories quantized in spacetime topologies S¹ × R^d [39, 40, 49, 4, 3, 35, 26–29]. The calculations are done in Euclidean time by making a Wick rotation in the time coordinate. In this way, we obtain static (time-independent) results. In spacetimes with non trivial topology the fields can have periodic or antiperiodic boundary conditions without the restrictions that we had in the temperature case [3, 54] (that is, bosons must obey only periodic and fermions only antiperiodic boundary conditions). We shall deal with periodic bosons and antiperiodic fermions. The boundary conditions for bosons are,
$$\varphi(x, 0) = \varphi(x, L), \tag{152}$$
L denoting the compact (circle) dimension, while the fermion boundary conditions are,
$$\psi(x, 0) = -\psi(x, L). \tag{153}$$
Another more general set of boundary conditions that can be used is the so-called twisted boundary conditions of the form:
$$\varphi(x, 0) = e^{-iw}\varphi(x, L), \tag{154}$$
for bosons and,
$$\psi(x, 0) = -e^{i\rho}\psi(x, L), \tag{155}$$
for fermions.

3.1. Periodic bosons and antiperiodic fermions

Using,
$$\varphi(x, 0) = \varphi(x, L), \tag{156}$$
for bosons and,
$$\psi(x, 0) = -\psi(x, L), \tag{157}$$
for fermions, we shall compute the bosonic contribution,
$$\frac1L\int\frac{dk^3}{(2\pi)^3}\sum_{n=-\infty}^{\infty}\ln\!\left[\frac{4\pi^2n^2}{L^2} + k^2 + m^2\right], \tag{158}$$
and also the fermionic one,
$$\frac1L\int\frac{dk^3}{(2\pi)^3}\sum_{n=-\infty}^{\infty}\ln\!\left[\frac{(2n+1)^2\pi^2}{L^2} + k^2 + m^2\right]. \tag{159}$$
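Following the techniques developed in the previous sections (roughly, we substitute T → 1/L),
$$\frac1L\int\frac{dk^d}{(2\pi)^d}\sum_{n=-\infty}^{\infty}\ln\!\left[\frac{4\pi^2n^2}{L^2} + k^2 + m^2\right] = -\frac12\frac{\sqrt{\pi}}{(2\pi)^d a}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma\!\left(-\nu-\tfrac12+1\right) + \frac14\frac{1}{(2\pi)^d}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma(-\nu) - \frac{\sqrt{\pi}}{(2\pi)^d a}(2\pi)^{\frac{d-1}{2}} m^{d+1}\,\Gamma\!\left(-\nu-\tfrac12+1\right)(a^2)^{\frac12-\nu}\sum_{l=0}^{\nu-\frac12}\frac{\left(\nu-\frac12\right)!}{\left(\nu-\frac12-l\right)!\,l!}\,(a^2)^l\,\left((2\pi)^2\right)^{\nu-\frac12-l}\zeta(-2\nu+1+2l), \tag{160}$$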
for the boson case, with α = mL and,
∞ (2n + 1)2 π 2 1 dk d 2 2 ln + k + m L (2π)d n=−∞ L2
mqL d−1 ∞ ∞ K d+1 (mqL) 2 (2π) 2 md+1 1 2 K d+1 2 = − − d+1 d+1
q=−∞ mqL 2 (2π)d 2 q=−∞ mqL 2 2 4
√ d−1 π 1 = − (2π) 2 md+1 Γ −ν − + 1 d (2π) a2 2
d−1 1 1 (2π) 2 md+1 Γ(−ν) d 2 (2π)
√ d−1 1 1 2 π d+1 (2π) 2 m Γ −ν − + 1 (a22 ) 2 −ν − d (2π) a2 2
1 2 ν− 12 −l ((2π) ) ν − ! σ 2
× (a22 )l ζ(−2ν + 1 + 2l) 1 l=0 ν − − l !l! 2
√ d−1 π 1 1 2 md+1 Γ + 1 + (2π) −ν − 2 (2π)d a1 2
+
d−1 1 1 (2π) 2 md+1 Γ(−ν) 4 (2π)d
√ d−1 π 1 1 1 2 md+1 Γ + 1 (a21 ) 2 −ν + (2π) −ν − 2 (2π)d a1 2
1 2 ν− 12 −l ((2π) ) ν − ! σ 2
(a21 )l ζ(−2ν + 1 + 2l) × , 1 l=0 ν − − l !l! 2
−
for the fermion case, with α2 = mL and α1 = mL 2 . For the case d = 3, the bosonic contribution is: 2 2
∞ 4π n dk 3 1 2 2 ln +k +m L (2π)3 n=−∞ L2
(161)
m4 −m4 + 2 4 4 4 3 2 2 16π 2 + m + 3m − γm − γm − m − π = 16π 12L2 64π 2 ε 32π 2 16Lπ 2 6Lπ 45L4
+
m4 ln(2) m4 ln(2) m4 ln(m) m4 ln(m) m4 ln(L2 m2 ) + − + − 32π 2 32Lπ 2 16π 2 16Lπ 2 32π 2
3 1 4 4 m m ψ − ψ 4 4 m ln(π) m ln(π) 2 2 + + − − 32π 2 32Lπ 2 32π 2 32π 2 5 4 m ψ L2 m6 ζ(3) L4 m8 ζ(5) 2 . + + − 32π 2 384π 4 4096π 6
(162)
In Eq. (162), we omitted terms of higher order in L. This is because we are interested in the limit L → 0. The fermionic contribution for d = 3 is: 1 L
∞ (2n + 1)2 π 2 dk 3 2 2 ln + k + m (2π)3 n=−∞ L2
−m4 m4 + 2 4 4 4 2 2 16π 2 + − m + 3m − γm − γm + 14π = 16π 2 2 2 2 ε 6L 64π 32π 16Lπ 45L4
3 1 4 4 ψ − ψ m m 4 2 2 4 m ln(L m ) m ln(π) 2 2 − + − − 32π 2 16π 2 32π 2 32π 2 5 m4 ψ 7m6 L2 ζ(3) 31L4 m8 ζ(5) 2 . + + − 32π 2 1536π 4 65536π 6
(163)
In the case d = 2, the bosonic contribution reads: 1 L
2 2
∞ 4π n dk 2 2 2 ln + k + m (2π)2 n=−∞ L2 m2 m3 m2 ln(2) m2 ln(L2 m2 ) √ √ = + √ + √ − 4 2Lπ 6 2π 2 2Lπ 4 2Lπ
m2 ln(π) m2 ln(2π) ζ (−2) − √ + , + √ L3 2 2Lπ 2 2Lπ
(164)
and the fermionic contribution: 1 L
∞ (2n + 1)2 π 2 m2 ln(2) ζ (−2) m3 dk 2 2 2 − ln +k +m = √ − √ . 2 2 (2π) n=−∞ L L3 6 2π 2Lπ (165)
3.2. Some applications I

3.2.1. Topological symmetry breaking in self interacting field theories

We now discuss some applications of the periodic boson and anti-periodic fermion effective potential at finite volume. It is well known that field theory at finite volume plays an important role in topological symmetry breaking or restoration and in topological mass generation [39, 40, 49, 4, 3, 35, 26–29, 22, 58, 74, 93, 92]. Apart from the known influence of the topology on the boundary conditions of the sections of the fiber bundles studied, on the effective mass and on particle creation [3], the need for studying field theories at finite volume is that the universe might exhibit non trivial topology as a whole [75, 37, 39, 27, 3]. Now, we briefly present the topological mass generation. When spacetime has non trivial topology, a massless field with periodic boundary conditions can acquire mass through loop corrections, in a dynamical way. Indeed, the one loop potential reads,
$$V^1(\phi) = \frac{1}{\mathrm{vol}(M)}\sum_n\ln(a_n/\mu^2), \tag{166}$$
where vol(M) is the volume of the spacetime under study and a_n are the eigenvalues of the Laplace operator on this spacetime. A regularized form of the above involves the zeta function [35],
$$\zeta(s) = \sum_n a_n^{-s}. \tag{167}$$
The one-loop potential is written as,
$$V^1(\phi) = -\frac{1}{\mathrm{vol}(M)}\left[\zeta'(0) + \zeta(0)\ln\mu^2\right], \tag{168}$$
with µ a dimensional regularization parameter that can be removed in the renormalization process. The topological mass is equal to,
$$m^2 = \frac{d^2V(\phi)}{d\phi^2}, \tag{169}$$
at φ = 0. In the above relation, V(φ) is equal to,
$$V(\phi) = \frac{\lambda}{4!}\phi^4 - \frac{1}{\mathrm{vol}(M)}\left[\zeta'(0) + \zeta(0)\ln\mu^2\right]. \tag{170}$$
Now for the spacetime S¹ × R³ the eigenvalues a_n are,
$$a_n = \frac{\lambda}{2}\phi^2 + \left(\frac{2\pi n}{L}\right)^2 + k_1^2 + k_2^2 + k_3^2. \tag{171}$$
Also the zeta function ζ(s) reads,
$$\zeta(s) = \frac{L_1}{2\pi}\int d^3k_i\sum_{n=-\infty}^{\infty}\left[\frac{\lambda}{2}\phi^2 + \frac{4\pi^2n^2}{L^2} + k_1^2 + k_2^2 + k_3^2\right]^{-s}. \tag{172}$$
The calculation of the above can be done with the techniques we presented in the previous sections. Now at φ = 0 the potential is,
$$V(\phi = 0) = -\frac{\pi^2}{90L_1^4}. \tag{173}$$
The above is just the Casimir energy for a real scalar field that satisfies periodic boundary conditions instead of Dirichlet. The topologically generated mass in this case is,
$$m^2 = \frac{\lambda}{24L_1^2}. \tag{174}$$
These techniques can be useful to determine the vacuum stability of the theory under consideration [37, 39, 93, 3, 74]. In the case of the periodic scalar field, the mass is positive, thus the φ = 0 vacuum is stable. Let us now study the same setup in S¹ × R³ but with the scalar field satisfying anti-periodic boundary conditions along the compact dimension. This case resembles the calculations of a fermion field at finite volume we presented previously. The only vacuum expectation value that is allowed is φ = 0 [72]. The zeta function now reads,
$$\zeta(s) = \frac{L_1}{2\pi}\int d^3k_i\sum_{n=-\infty}^{\infty}\left[\frac{\lambda}{2}\phi^2 + \frac{\pi^2(2n+1)^2}{L^2} + k_1^2 + k_2^2 + k_3^2\right]^{-s}, \tag{175}$$
and in this case, at φ = 0 the potential is,
$$V(\phi = 0) = \frac{7\pi^2}{720L_1^4}. \tag{176}$$
The above is just the Casimir energy for a real scalar field that satisfies anti-periodic boundary conditions instead of Dirichlet. The topologically generated mass now reads,
$$m^2 = -\frac{\lambda}{48L_1^2}. \tag{177}$$
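The periodic and anti-periodic results quoted above are related by a simple mode-counting argument, the same one that underlies relation (95): the anti-periodic modes (2n+1)π/L are all the modes of a circle of length 2L minus those of a circle of length L, which for an energy density gives V_anti(L) = 2 V_per(2L) − V_per(L). The sketch below applies this relation to the coefficients of (173) and (174). It is an illustrative consistency check under that assumption, and the helper name is hypothetical.

```python
from fractions import Fraction

def antiperiodic_coefficient(periodic_coefficient, power):
    """Given V_per(L) = c / L**power, return c' such that V_anti(L) = c' / L**power,
    using V_anti(L) = 2 * V_per(2L) - V_per(L)."""
    return periodic_coefficient * (Fraction(2, 2 ** power) - 1)

# Casimir energy: V_per = -(pi^2 / 90) / L^4, coefficient in units of pi^2
casimir_anti = antiperiodic_coefficient(Fraction(-1, 90), 4)
# Topological mass: m^2_per = (lambda / 24) / L^2, coefficient in units of lambda
mass_anti = antiperiodic_coefficient(Fraction(1, 24), 2)

print(casimir_anti)  # coefficient of pi^2 / L^4, compare with (176)
print(mass_anti)     # coefficient of lambda / L^2, compare with (177)
```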
The negative sign indicates an instability in this theory [74, 3, 37]. 3.2.2. Casimir effect the effective potential and extra dimensions The calculations for finite volume field theories with a toroidal compact dimension are useful for field theories with one compact extra dimension. We shall present some cases here. Also these are special cases of the effective potential with a twist in the fields boundary conditions that we describe in the next section. Let us start with a scalar field in the Randall–Sundrum1 (RS1) model [94]. The line element is given by, ds2 = e−2krc φ ηµν dxµ dxν − rc2 dφ2 .
(178)
The theory is quantized on the orbifold S 1 /Z2 and thus the points (xµ , φ) and (xµ , −φ) are identified. The exponential factor is the most appealing feature of the
June 2, 2009 18:35 WSPC/148-RMP
J070-00371
Calculation of Effective Potential in Spacetimes with S 1 × Rd Topology
649
RS1 model. Actually the hierarchy problem can be solved within this scenario since a Tev mass scale can be produced from a Plank mass scale [94]. One of the most interesting problems appearing in models with extra compact dimensions is related with the size and stability of the compact dimension. Particularly, the problem is two fold. First, one must find a way to shrink the extra dimensions. This is a very serious feature since the visible spatial dimensions of our world inflated in the past. Also their size exponentially increases during inflation. So firstly, the extra dimensions must shrink. Secondly, the extra dimensions must be stabilized and not collapse into the Planck scale. One indicator to solve the first problem is the existence of a negative energy in the bulk, that is, the Casimir energy of the bulk scalar field must be negative. In the context of string theory, there are setups such as orientifolds planes and other structures [24]. In some cases, field theory corrections can be supplemented by string structures but we shall not discuss this here. Consider a free scalar in the bulk, with Lagrangian density, L = GAB ∂A Φ∂B Φ − m2 Φ2 .
(179)
The harmonic expansion of the scalar field is,
\[ \Phi(x^\mu,\phi) = \sum_n \psi_n(x^\mu)\,\frac{y_n(\phi)}{\sqrt{R}}. \tag{180} \]
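Solving the equations of motion for the RS metric, one obtains,
\[ y_n(\phi) \sim e^{2kR\phi}\left[ J_\nu\!\left(\frac{M_n e^{kR\phi}}{k}\right) + Y_\nu\!\left(\frac{M_n e^{kR\phi}}{k}\right) \right], \tag{181} \]
and in order for the field to satisfy the orbifold boundary conditions, $M_n$ must satisfy,
\[ \frac{M_n e^{kR\phi}}{k} \sim \pi\left(N + \frac{1}{4}\right). \tag{182} \]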
It is clear that the Casimir energy is significant due to the quantum fluctuations in the extra dimensions. For the bulk scalar field we obtain,
\[ V^{+} = \frac{1}{2}\sum_{n=-\infty}^{\infty}\int\frac{d^4k}{(2\pi)^4}\,\ln\left[k^2 + \left(\frac{n\pi}{r_c}\right)^2 + M_n^2\right], \tag{183} \]
with rc the compact dimension radius. Notice that relation (183) is identical with relation (158) for the case of five dimensions. The calculation and generalization is straightforward, and we can find the result in closed form, in terms of the polylogarithm functions. This calculation is similar to the finite temperature one for d even, see relations (115) and (80).
V. K. Oikonomou
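For a more general calculation see the next section. In the case of a massless scalar, relation (183) is modified to,
\[ V_1^{+} = \frac{1}{2}\sum_{n=-\infty}^{\infty}\int\frac{d^4k}{(2\pi)^4}\,\ln\left[k^2 + \left(\frac{n\pi}{r_c}\right)^2\right], \tag{184} \]
which is calculated to be,
\[ V_1^{+} = -\frac{3\zeta(5)}{64\pi^4 r_c^4}, \tag{185} \]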
which is clearly negative, and thus results in a shrinking of the compact dimension. Also, the Casimir force in terms of the compact dimensions is repulsive, which leads to a stabilization of the extra dimension. The calculations for fermions are straightforward. The existence of a minimum in the effective potential is also an indicator of stabilization of the extra dimensions. Finally, let us mention that Casimir calculations have been done for de Sitter and anti-de Sitter brane worlds, see [30, 47, 45]. Additionally, the same results hold for other 5-dimensional setups, such as large extra dimensions and universal extra dimensions. We shall briefly present some applications related to them after the next section.

3.3. The case of twisted boundary conditions

We shall study only the twisted boson case, since the other case is similar [3, 35, 4]. The twisted boundary conditions for bosons are:
\[ \varphi(x,0) = e^{-iw}\,\varphi(x,L), \tag{186} \]
while for fermions:
\[ \psi(x,0) = -e^{i\rho}\,\psi(x,L), \tag{187} \]
or equivalently,
\[ \psi(x,0) = e^{i(\rho+\pi)}\,\psi(x,L). \tag{188} \]
We Fourier expand $\varphi$:
\[ \sum_n \int dp^3\, e^{ipx} = e^{-iw}\sum_n \int dp^3\, e^{ipx + i w_n L}, \tag{189} \]
from which we obtain,
\[ w_n L = 2\pi n + w \;\rightarrow\; w_n = \frac{1}{L}(2\pi n + w), \tag{190} \]
with $G = w_n^2 + k_1^2 + m^2$. Proceeding as in the previous sections, with the difference
\[ w_n = \frac{1}{L}(2\pi n + w) = \frac{2\pi}{L}(n+\omega), \tag{191} \]
with $\omega = \frac{w}{2\pi}$, we will compute [3, 35, 4],
\[ \frac{1}{L}\sum_{n=-\infty}^{\infty}\int\frac{dk^3}{(2\pi)^3}\,\ln\left[(n+\omega)^2\left(\frac{2\pi}{L}\right)^2 + k^2 + m^2\right]. \tag{192} \]
Consider the sum:
\[ \sum_{n=-\infty}^{\infty}\frac{1}{\left(\frac{2\pi}{L}\right)^2 (n+\omega)^2 + a^2} = \frac{1}{\left(\frac{2\pi}{L}\right)^2}\sum_{n=-\infty}^{\infty}\frac{1}{(n+\omega)^2 + \frac{a^2}{\left(\frac{2\pi}{L}\right)^2}}, \tag{193} \]
with,
\[ a^2 = k^2 + m^2. \tag{194} \]
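Integrating,
\[ \sum_{n=-\infty}^{\infty}\frac{1}{\left(\frac{2\pi}{L}\right)^2 (n+\omega)^2 + a^2}, \tag{195} \]
over $a^2$, we get,
\[ \sum_{n=-\infty}^{\infty}\int\frac{da^2}{\left(\frac{2\pi}{L}\right)^2 (n+\omega)^2 + a^2} = \sum_{n=-\infty}^{\infty}\ln\left[\left(\frac{2\pi}{L}\right)^2(n+\omega)^2 + a^2\right]. \tag{196} \]
Also,
\[ \sum_{n=-\infty}^{\infty}\frac{1}{\left(\frac{2\pi}{L}\right)^2 (n+\omega)^2 + a^2} = \frac{L}{4a}\left[\coth\left(\frac{aL}{2} - i\pi\omega\right) + \coth\left(\frac{aL}{2} + i\pi\omega\right)\right], \tag{197} \]
and consequently,
\[ \sum_{n=-\infty}^{\infty}\int\frac{da^2}{\left(\frac{2\pi}{L}\right)^2 (n+\omega)^2 + a^2} = \int\frac{L}{4a}\left[\coth\left(\frac{aL}{2} - i\pi\omega\right) + \coth\left(\frac{aL}{2} + i\pi\omega\right)\right]da^2 \tag{198} \]
\[ = \ln\sinh\left(\frac{aL}{2} - i\pi\omega\right) + \ln\sinh\left(\frac{aL}{2} + i\pi\omega\right). \tag{199} \]
Using [1],
\[ \ln(\sinh x) = \ln\left[\frac{1}{2}\left(e^{x} - e^{-x}\right)\right] = x + \ln\left(1 - e^{-2x}\right) - \ln 2, \tag{200} \]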
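and summing,
\[ \ln\sinh\left(\frac{aL}{2} - i\pi\omega\right) = \frac{aL}{2} - i\pi\omega + \ln\left[1 - e^{-2\left(\frac{aL}{2} - i\pi\omega\right)}\right] - \ln 2, \tag{201} \]
and,
\[ \ln\sinh\left(\frac{aL}{2} + i\pi\omega\right) = \frac{aL}{2} + i\pi\omega + \ln\left[1 - e^{-2\left(\frac{aL}{2} + i\pi\omega\right)}\right] - \ln 2, \tag{202} \]
we get,
\[ \begin{aligned} \sum_{n=-\infty}^{\infty}\int\frac{da^2}{\left(\frac{2\pi}{L}\right)^2 (n+\omega)^2 + a^2} &= \ln\sinh\left(\frac{aL}{2} - i\pi\omega\right) + \ln\sinh\left(\frac{aL}{2} + i\pi\omega\right) \\ &= aL + \ln\left[1 - e^{-2\left(\frac{aL}{2} - i\pi\omega\right)}\right] + \ln\left[1 - e^{-2\left(\frac{aL}{2} + i\pi\omega\right)}\right] - 2\ln 2. \end{aligned} \tag{203} \]
After some calculations [3, 35, 4, 13]:
\[ \sum_{n=-\infty}^{\infty}\ln\left[\left(\frac{2\pi}{L}\right)^2(n+\omega)^2 + a^2\right] = aL + \ln\left[1 - e^{-2\left(\frac{aL}{2} - i\pi\omega\right)}\right] + \ln\left[1 - e^{-2\left(\frac{aL}{2} + i\pi\omega\right)}\right] - 2\ln 2. \tag{204} \]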
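The closed form (197) for the mode sum can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names and the truncation `n_max` are illustrative assumptions, and the truncated sum only approximates the infinite one (the neglected tail is of order $(L/2\pi)^2\,2/n_{\max}$).

```python
import cmath
import math


def mode_sum(a, L, omega, n_max=200_000):
    """Truncated left-hand side of relation (197): sum over n = -n_max..n_max."""
    c = (2.0 * math.pi / L) ** 2
    return sum(1.0 / (c * (n + omega) ** 2 + a * a)
               for n in range(-n_max, n_max + 1))


def coth(z):
    """Hyperbolic cotangent for complex arguments."""
    return cmath.cosh(z) / cmath.sinh(z)


def closed_form(a, L, omega):
    """Right-hand side of relation (197)."""
    x = a * L / 2.0
    y = math.pi * omega
    value = (L / (4.0 * a)) * (coth(x - 1j * y) + coth(x + 1j * y))
    # The two coth terms are complex conjugates, so the imaginary parts cancel.
    return value.real


if __name__ == "__main__":
    print(mode_sum(1.3, 2.0, 0.3), closed_form(1.3, 2.0, 0.3))
```

The two printed numbers should agree to within the truncation error of the sum.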
Using the identity [1],
\[ \ln\left[\frac{(n+\omega)^2\,4\pi^2T^2 + a^2}{(n+\omega)^2\,4\pi^2T^2 + b^2}\right] = 2(a-b), \tag{205} \]
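the relation (204) becomes,
\[ \sum_{n=-\infty}^{\infty}\ln\left[(n+\omega)^2\left(\frac{2\pi}{L}\right)^2 + a^2\right] = L\int_{-\infty}^{\infty}\frac{dx}{2\pi}\,\ln\left[x^2 + a^2\right] + \ln\left[1 - e^{-2\left(\frac{aL}{2} - i\pi\omega\right)}\right] + \ln\left[1 - e^{-2\left(\frac{aL}{2} + i\pi\omega\right)}\right]. \tag{206} \]
Thus,
\[ \begin{aligned} \frac{1}{L}\sum_{n=-\infty}^{\infty}\int\frac{dk^3}{(2\pi)^3}\,\ln\left[(n+\omega)^2\left(\frac{2\pi}{L}\right)^2 + k^2 + m^2\right] ={}& \int\frac{dk^3}{(2\pi)^3}\int_{-\infty}^{\infty}\frac{dx}{2\pi}\,\ln\left[x^2 + a^2\right] + \frac{1}{L}\int\frac{dk^3}{(2\pi)^3}\,\ln\left[1 - e^{-2\left(\frac{aL}{2} - i\pi\omega\right)}\right] \\ &+ \frac{1}{L}\int\frac{dk^3}{(2\pi)^3}\,\ln\left[1 - e^{-2\left(\frac{aL}{2} + i\pi\omega\right)}\right], \end{aligned} \tag{207} \]
with,
\[ a^2 = k^2 + m^2. \tag{208} \]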
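The first integral is the one loop correction to the effective potential for L = 0. In d + 1 dimensions relation (207) reads [3, 4, 35, 28, 33, 21, 19]:
\[ \begin{aligned} \frac{1}{L}\sum_{n=-\infty}^{\infty}\int\frac{dk^d}{(2\pi)^d}\,\ln\left[(n+\omega)^2\left(\frac{2\pi}{L}\right)^2 + k^2 + m^2\right] ={}& \int\frac{dk^{d+1}}{(2\pi)^{d+1}}\,\ln\left[k^2 + a^2\right] + \frac{1}{L}\int\frac{dk^d}{(2\pi)^d}\,\ln\left[1 - e^{-2\left(\frac{aL}{2} - i\pi\omega\right)}\right] \\ &+ \frac{1}{L}\int\frac{dk^d}{(2\pi)^d}\,\ln\left[1 - e^{-2\left(\frac{aL}{2} + i\pi\omega\right)}\right]. \end{aligned} \tag{209} \]
In the following, we consider only the L dependent part,
\[ V_{twisted} = \frac{1}{L}\int\frac{dk^d}{(2\pi)^d}\,\ln\left[1 - e^{-2\left(\frac{aL}{2} - i\pi\omega\right)}\right] + \frac{1}{L}\int\frac{dk^d}{(2\pi)^d}\,\ln\left[1 - e^{-2\left(\frac{aL}{2} + i\pi\omega\right)}\right]. \tag{210} \]
Let,
\[ V_1 = \frac{1}{L}\int\frac{dk^d}{(2\pi)^d}\,\ln\left[1 - e^{-2\left(\frac{aL}{2} - i\pi\omega\right)}\right], \tag{211} \]
and
\[ V_2 = \frac{1}{L}\int\frac{dk^d}{(2\pi)^d}\,\ln\left[1 - e^{-2\left(\frac{aL}{2} + i\pi\omega\right)}\right], \tag{212} \]
so relation (210) reads,
\[ V_{twisted} = V_1 + V_2. \tag{213} \]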
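At fixed $a$ the two logarithms entering $V_1$ and $V_2$ are complex conjugates of each other, so their sum is real. The sketch below illustrates this under stated assumptions: the function names are hypothetical, and the combined real form $\ln[1 - 2e^{-aL}\cos(2\pi\omega) + e^{-2aL}]$ is an elementary rewriting added here for illustration, not a formula quoted from the text.

```python
import cmath
import math


def log_terms(a, L, omega):
    """The two logarithms of relations (211) and (212) at fixed a = sqrt(k^2 + m^2)."""
    first = cmath.log(1.0 - cmath.exp(-2.0 * (a * L / 2.0 - 1j * math.pi * omega)))
    second = cmath.log(1.0 - cmath.exp(-2.0 * (a * L / 2.0 + 1j * math.pi * omega)))
    return first, second


def combined_real_form(a, L, omega):
    """ln[(1 - z)(1 - conj(z))] with z = exp(-aL + 2*pi*i*omega), written with real numbers only."""
    x = math.exp(-a * L)
    return math.log(1.0 - 2.0 * x * math.cos(2.0 * math.pi * omega) + x * x)


if __name__ == "__main__":
    first, second = log_terms(0.8, 1.5, 0.3)
    print(first + second, combined_real_form(0.8, 1.5, 0.3))
```

The imaginary part of the printed sum should vanish up to rounding, which is why the integrand of $V_{twisted}$ is real for real $\omega$.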
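The calculation of $V_1$ and of $V_2$ is equivalent, since their analytic properties are the same, so we calculate only $V_2$. We have,
\[ V_2 = \frac{1}{L}\int\frac{dk^d}{(2\pi)^d}\,\ln\left[1 - e^{-2\left(\frac{aL}{2} + i\pi\omega\right)}\right] = \frac{1}{L}\int\frac{dk^d}{(2\pi)^d}\,\ln\left[1 - e^{-aL - 2i\pi\omega}\right]. \tag{214} \]
Using,
\[ \ln\left[1 - e^{-aL - i2\pi\omega}\right] = -\sum_{q=1}^{\infty}\frac{e^{-aLq - 2\pi i\omega q}}{q}. \tag{215} \]
Now $V_2$ becomes,
\[ \begin{aligned} V_2 &= \frac{1}{L}\int\frac{dk^d}{(2\pi)^d}\,\ln\left[1 - e^{-aL - 2i\pi\omega}\right] \\ &= -\frac{1}{L}\int\frac{dk^d}{(2\pi)^d}\sum_{q=1}^{\infty}\frac{e^{-aLq - 2\pi i\omega q}}{q} \\ &= -\frac{1}{L}\sum_{q=1}^{\infty}\int\frac{dk^d}{(2\pi)^d}\,\frac{e^{-aLq - 2\pi i\omega q}}{q} \\ &= -\frac{1}{L}\sum_{q=1}^{\infty}\int\frac{dk^d}{(2\pi)^d}\,\frac{e^{-\sqrt{k^2+m^2}\,qL - 2\pi i\omega q}}{q} \\ &= -\frac{1}{L}\int_{-\infty}^{\infty}dk\,k^{d-1}\,\frac{(2\pi)^{\frac{d}{2}}}{\Gamma\left(\frac{d}{2}\right)(2\pi)^d}\sum_{q=1}^{\infty}\frac{e^{-\sqrt{k^2+m^2}\,qL}\,e^{-2\pi i\omega q}}{q} \\ &= -\frac{1}{L}\sum_{q=1}^{\infty}\frac{(2\pi)^{\frac{d}{2}}}{\Gamma\left(\frac{d}{2}\right)q\,(2\pi)^d}\int_{-\infty}^{\infty}dk\,k^{d-1}\,e^{-\sqrt{k^2+m^2}\,qL}\,e^{-2\pi i\omega q}, \end{aligned} \tag{216} \]
where we used $a = \sqrt{k^2 + m^2}$. The integral,
\[ \int_{-\infty}^{\infty}dk\,k^{d-1}\,e^{-\sqrt{k^2+m^2}\,qL}, \tag{217} \]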
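equals [1],
\[ \int_{-\infty}^{\infty}dk\,k^{d-1}\,e^{-\sqrt{k^2+m^2}\,qL} = 2^{\frac{d}{2}-1}\left(\sqrt{\pi}\right)^{-1}(qL)^{\frac{1}{2}-\frac{d}{2}}\,m^{\frac{d+1}{2}}\,\Gamma\left(\frac{d}{2}\right)K_{\frac{d+1}{2}}(mqL), \tag{218} \]
thus $V_2$ is written:
\[ \begin{aligned} V_2 &= -\frac{(2\pi)^{\frac{d}{2}}}{(2\pi)^d}\,2^{\frac{d}{2}-1}\,\frac{1}{\sqrt{\pi}}\,m^{d+1}\sum_{q=1}^{\infty}\frac{K_{\frac{d+1}{2}}(mqL)}{(mqL)^{\frac{d+1}{2}}}\,e^{-2\pi i\omega q} \\ &= -\frac{1}{2}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\sum_{q=1}^{\infty}\frac{K_{\frac{d+1}{2}}(mqL)}{(mqL)^{\frac{d+1}{2}}}\,e^{-2\pi i\omega q}. \end{aligned} \tag{219} \]
Equivalently, $V_1$ equals:
\[ V_1 = -\frac{1}{2}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\sum_{q=1}^{\infty}\frac{K_{\frac{d+1}{2}}(mqL)}{(mqL)^{\frac{d+1}{2}}}\,e^{+2\pi i\omega q}. \tag{220} \]
Summing $V_1$ and $V_2$,
\[ V_1 + V_2 = -\frac{1}{2}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\sum_{q=1}^{\infty}\frac{K_{\frac{d+1}{2}}(mqL)}{(mqL)^{\frac{d+1}{2}}}\left(e^{+2\pi i\omega q} + e^{-2\pi i\omega q}\right), \tag{221} \]
and using,
\[ \cos x = \frac{1}{2}\left(e^{-ix} + e^{ix}\right), \tag{222} \]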
we get:
\[ V_1 + V_2 = -\sum_{q=1}^{\infty}(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{K_{\frac{d+1}{2}}(mqL)}{(mqL)^{\frac{d+1}{2}}}\cos(2\pi\omega q). \tag{223} \]
The function,
\[ \frac{K_{\frac{d+1}{2}}(mqL)}{(mqL)^{\frac{d+1}{2}}}\cos(2\pi\omega q), \tag{224} \]
is invariant under the transformation $q \to -q$ and relation (223) is written,
\[ V_1 + V_2 = -\sum_{q=1}^{\infty}(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{K_{\frac{d+1}{2}}(mqL)}{(mqL)^{\frac{d+1}{2}}}\cos(2\pi\omega q), \tag{225} \]
and finally,
\[ V_1 + V_2 = -\frac{1}{2}\sum_{q=-\infty}^{\infty}{}'\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{K_{\frac{d+1}{2}}(mqL)}{(mqL)^{\frac{d+1}{2}}}\cos(2\pi\omega q). \tag{226} \]
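Again the symbol $\sum{}'$ means omission of the zero modes. By breaking the cosine function into exponentials, we introduce $F_1$ and $F_2$ with $V_{twisted} = F_1 + F_2$, where,
\[ F_1 = -\frac{1}{4}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\sum_{q=-\infty}^{\infty}{}'\,\frac{K_{\frac{d+1}{2}}(mqL)}{(mqL)^{\frac{d+1}{2}}}\,e^{-2\pi i\omega q}, \tag{227} \]
and,
\[ F_2 = -\frac{1}{4}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\sum_{q=-\infty}^{\infty}{}'\,\frac{K_{\frac{d+1}{2}}(mqL)}{(mqL)^{\frac{d+1}{2}}}\,e^{2\pi i\omega q}. \tag{228} \]
We compute $F_1$ only, since the computation of the other is similar. We have:
\[ \frac{K_\nu(z)}{\left(\frac{z}{2}\right)^{\nu}} = \frac{1}{2}\int_0^{\infty}\frac{e^{-t-\frac{z^2}{4t}}}{t^{\nu+1}}\,dt, \tag{229} \]
and $F_1$ becomes:
\[ F_1 = -\frac{1}{8}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\int_0^{\infty}dt\,e^{-t}\sum_{q=-\infty}^{\infty}{}'\,\frac{e^{-\frac{(mqL)^2}{4t}}}{t^{\frac{d+1}{2}+1}}\,e^{-2\pi i\omega q}. \tag{230} \]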
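The integral representation (229) can be tested against the elementary closed forms $K_{1/2}(z) = \sqrt{\pi/(2z)}\,e^{-z}$ and $K_{3/2}(z) = \sqrt{\pi/(2z)}\,e^{-z}(1 + 1/z)$. The sketch below is an illustration only: the substitution $t = e^{u}$, the integration window and the step count are assumptions chosen for numerical convenience, and the function name is hypothetical.

```python
import math


def bessel_k_from_229(nu, z, u_min=-30.0, u_max=6.0, steps=20000):
    """K_nu(z) from relation (229), using t = exp(u) and the trapezoidal rule in u."""
    h = (u_max - u_min) / steps
    total = 0.0
    for i in range(steps + 1):
        u = u_min + i * h
        t = math.exp(u)
        weight = 0.5 if i in (0, steps) else 1.0
        # e^{-t - z^2/(4t)} t^{-nu-1} dt becomes e^{-t - z^2/(4t) - nu*u} du
        total += weight * math.exp(-t - z * z / (4.0 * t) - nu * u)
    return 0.5 * (z / 2.0) ** nu * total * h


if __name__ == "__main__":
    z = 1.0
    print(bessel_k_from_229(0.5, z), math.sqrt(math.pi / (2.0 * z)) * math.exp(-z))
```

Half-integer orders are the ones that occur for even $d$, since the index in (227) is $(d+1)/2$.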
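Using the Poisson identity [3, 4, 35],
\[ \sum_{n=-\infty}^{\infty}f(n) = \sum_{k=-\infty}^{\infty}\int_{-\infty}^{\infty}f(x_1)\,e^{-2\pi i k x_1}\,dx_1, \tag{231} \]
with,
\[ f(x) = e^{-\frac{(mxL)^2}{4t}}\,e^{-2\pi i\omega x}, \tag{232} \]
and $\lambda = \frac{(mL)^2}{4t}$, $\beta = 2\pi\omega$, we get [50]:
\[ \begin{aligned} \sum_{q=-\infty}^{\infty}e^{-\lambda q^2}e^{-i\beta q} &= \sum_{k=-\infty}^{\infty}\int_{-\infty}^{\infty}e^{-\lambda x^2}e^{-i\beta x}e^{-2\pi i k x}\,dx \\ &= \sqrt{2\pi}\sum_{k=-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-\lambda x^2}e^{-i\beta x}e^{-2\pi i k x}\,dx \\ &= \sqrt{2\pi}\sum_{k=-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-\lambda x^2}e^{ix(-\beta-2\pi k)}\,dx. \end{aligned} \tag{233} \]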
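The Fourier transformation of the function $e^{-\lambda x^2}$ is:
\[ \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-\lambda x^2}e^{ix(-\beta-2\pi k)}\,dx = \frac{e^{-\frac{(\beta+2\pi k)^2}{4\lambda}}}{\sqrt{2}\sqrt{\lambda}}, \tag{234} \]
and finally,
\[ \begin{aligned} \sum_{q=-\infty}^{\infty}e^{-\lambda q^2}e^{-i\beta q} &= \sqrt{2\pi}\sum_{k=-\infty}^{\infty}\frac{e^{-\frac{(\beta+2\pi k)^2}{4\lambda}}}{\sqrt{2}\sqrt{\lambda}} \\ &= \frac{\sqrt{\pi}}{\sqrt{\lambda}}\sum_{k=-\infty}^{\infty}e^{-\frac{(\beta+2\pi k)^2}{4\lambda}} \\ &= \sqrt{\frac{\pi}{\lambda}}\sum_{k=-\infty}^{\infty}e^{-\frac{(\beta+2\pi k)^2}{4\lambda}}. \end{aligned} \tag{235} \]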
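The resummation (235) trades a slowly converging sum at small $\lambda$ for a rapidly converging one, and it is easy to verify numerically. The following is a minimal sketch with hypothetical function names and truncation limits; the imaginary parts of the left-hand side cancel between $q$ and $-q$, so only cosines are summed.

```python
import math


def direct_sum(lam, beta, q_max=200):
    """Left-hand side of (235): sum over q of exp(-lam q^2) exp(-i beta q), real part only."""
    return sum(math.exp(-lam * q * q) * math.cos(beta * q)
               for q in range(-q_max, q_max + 1))


def resummed(lam, beta, k_max=200):
    """Right-hand side of (235): sqrt(pi/lam) times the sum over k of exp(-(beta + 2 pi k)^2 / (4 lam))."""
    return math.sqrt(math.pi / lam) * sum(
        math.exp(-(beta + 2.0 * math.pi * k) ** 2 / (4.0 * lam))
        for k in range(-k_max, k_max + 1))


if __name__ == "__main__":
    for lam in (0.05, 5.0):
        print(lam, direct_sum(lam, 1.1), resummed(lam, 1.1))
```

For small $\lambda$ (large $t$ in (230)) a handful of terms of the resummed form already reproduces the direct sum, which is the regime the expansion in the text relies on.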
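Neglecting the zero modes, we get:
\[ \sum_{q=-\infty}^{\infty}e^{-\lambda q^2}e^{-i\beta q} = \sqrt{\frac{\pi}{\lambda}}\sum_{k=-\infty}^{\infty}e^{-\frac{(\beta+2\pi k)^2}{4\lambda}}, \tag{236} \]
or equivalently,
\[ 1 + \sum_{q=-\infty}^{\infty}{}'\,e^{-\lambda q^2}e^{-i\beta q} = \sqrt{\frac{\pi}{\lambda}}\left[e^{-\frac{\beta^2}{4\lambda}} + \sum_{k=-\infty}^{\infty}{}'\,e^{-\frac{(\beta+2\pi k)^2}{4\lambda}}\right], \tag{237} \]
from which,
\[ \sum_{q=-\infty}^{\infty}{}'\,e^{-\lambda q^2}e^{-i\beta q} = \sqrt{\frac{\pi}{\lambda}}\left[e^{-\frac{\beta^2}{4\lambda}} + \sum_{k=-\infty}^{\infty}{}'\,e^{-\frac{(\beta+2\pi k)^2}{4\lambda}}\right] - 1. \tag{238} \]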
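Replacing in $F_1$ we obtain,
\[ F_1 = -\frac{1}{8}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\int_0^{\infty}dt\,e^{-t}\,\frac{\sqrt{\frac{\pi}{\lambda}}\left[e^{-\frac{\beta^2}{4\lambda}} + \sum_{k=-\infty}^{\infty}{}'\,e^{-\frac{(\beta+2\pi k)^2}{4\lambda}}\right] - 1}{t^{\frac{d+1}{2}+1}}. \tag{239} \]
Setting,
\[ \nu = \frac{d+1}{2}, \tag{240} \]
the above becomes,
\[ \begin{aligned} F_1 ={}& -\frac{1}{8}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\int_0^{\infty}dt\,e^{-t}\,\frac{\sqrt{\frac{\pi}{\lambda}}\,e^{-\frac{\beta^2}{4\lambda}}}{t^{\nu+1}} \\ &- \frac{1}{8}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\int_0^{\infty}dt\,e^{-t}\,\frac{\sqrt{\frac{\pi}{\lambda}}\sum_{k=-\infty}^{\infty}{}'\,e^{-\frac{(\beta+2\pi k)^2}{4\lambda}}}{t^{\nu+1}} \\ &+ \frac{1}{8}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\int_0^{\infty}dt\,e^{-t}\,\frac{1}{t^{\nu+1}}. \end{aligned} \tag{241} \]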
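Substituting $a = mL$, the above relation is written ($\lambda = \frac{a^2}{4t}$),
\[ \begin{aligned} F_1 ={}& -\frac{1}{8}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\int_0^{\infty}dt\,e^{-t}\,\frac{\sqrt{\pi t}\,2\,e^{-\frac{\beta^2}{a^2}t}}{a\,t^{\nu+1}} \\ &- \frac{1}{8}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\int_0^{\infty}dt\,e^{-t}\,\frac{\sqrt{\pi t}\,2\sum_{k=-\infty}^{\infty}{}'\,e^{-\frac{(\beta+2\pi k)^2}{a^2}t}}{a\,t^{\nu+1}} \\ &+ \frac{1}{8}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\int_0^{\infty}dt\,e^{-t}\,\frac{1}{t^{\nu+1}}. \end{aligned} \tag{242} \]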
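After some calculations we get:
\[ \begin{aligned} F_1 ={}& -\frac{1}{4}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{\sqrt{\pi}}{a}\int_0^{\infty}dt\,e^{-\left(\frac{\beta^2}{a^2}+1\right)t}\,t^{-\nu-\frac{1}{2}} \\ &- \frac{1}{4}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{\sqrt{\pi}}{a}\int_0^{\infty}dt\,e^{-t}\,\frac{\sum_{k=-\infty}^{\infty}{}'\,e^{-\frac{(\beta+2\pi k)^2}{a^2}t}}{t^{\nu+\frac{1}{2}}} \\ &+ \frac{1}{8}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\int_0^{\infty}dt\,e^{-t}\,\frac{1}{t^{\nu+1}}. \end{aligned} \tag{243} \]
Finally using the following,
\[ \frac{1}{(x^2+a^2)^{\mu+1}} = \frac{1}{\Gamma(\mu+1)}\int_0^{\infty}dt\,e^{-(x^2+a^2)t}\,t^{\mu}, \tag{244} \]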
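we have:
\[ \begin{aligned} F_1 ={}& -\frac{1}{4}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{\sqrt{\pi}}{a}\,\Gamma\left(-\nu-\frac{1}{2}+1\right)\left(\frac{\beta^2}{a^2}+1\right)^{\nu+\frac{1}{2}-1} \\ &- \frac{1}{4}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{\sqrt{\pi}}{a}\,\Gamma\left(-\nu-\frac{1}{2}+1\right)\sum_{k=-\infty}^{\infty}{}'\left[1+\left(\frac{\beta+2\pi k}{a}\right)^2\right]^{\nu+\frac{1}{2}-1} \\ &+ \frac{1}{8}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\Gamma(-\nu). \end{aligned} \tag{245} \]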
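The integral identity (244) used in the last step can be checked numerically in the regime where the integral converges, $\mu > -1$. In (245) it is applied with negative arguments of the Gamma function, which relies on analytic continuation and is not covered by this check. The sketch below is an illustration under these assumptions; the function name, the midpoint rule and the cutoff are choices made here, not taken from the text.

```python
import math


def schwinger_integral(c, mu, steps=200_000, cutoff=60.0):
    """Midpoint-rule estimate of the integral of exp(-c t) t^mu over 0 < t < cutoff / c."""
    upper = cutoff / c
    h = upper / steps
    return h * sum(math.exp(-c * (i + 0.5) * h) * ((i + 0.5) * h) ** mu
                   for i in range(steps))


if __name__ == "__main__":
    x, a = 0.7, 1.2
    c = x * x + a * a          # plays the role of x^2 + a^2 in (244)
    for mu in (0.5, 2.0):
        lhs = 1.0 / c ** (mu + 1.0)
        rhs = schwinger_integral(c, mu) / math.gamma(mu + 1.0)
        print(mu, lhs, rhs)
```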
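Adding $F_2$ (with $-\beta + 2\pi k$) we have,
\[ \begin{aligned} V_{twisted} ={}& -\frac{1}{2}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{\sqrt{\pi}}{a}\,\Gamma\left(-\nu-\frac{1}{2}+1\right)\left(\frac{\beta^2}{a^2}+1\right)^{\nu+\frac{1}{2}-1} \\ &- \frac{1}{4}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{\sqrt{\pi}}{a}\,\Gamma\left(-\nu-\frac{1}{2}+1\right) \\ &\times\sum_{k=-\infty}^{\infty}{}'\left\{\left[1+\left(\frac{\beta+2\pi k}{a}\right)^2\right]^{\nu+\frac{1}{2}-1} + \left[1+\left(\frac{-\beta+2\pi k}{a}\right)^2\right]^{\nu+\frac{1}{2}-1}\right\} \\ &+ \frac{1}{4}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\Gamma(-\nu). \end{aligned} \tag{246} \]
The sum,
\[ \sum_{k=-\infty}^{\infty}{}'\left\{\left[1+\left(\frac{\beta+2\pi k}{a}\right)^2\right]^{\nu+\frac{1}{2}-1} + \left[1+\left(\frac{-\beta+2\pi k}{a}\right)^2\right]^{\nu+\frac{1}{2}-1}\right\}, \tag{247} \]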
is invariant under $k \to -k$, thus:
\[ 2\sum_{k=1}^{\infty}\left\{\left[1+\left(\frac{\beta+2\pi k}{a}\right)^2\right]^{\nu+\frac{1}{2}-1} + \left[1+\left(\frac{-\beta+2\pi k}{a}\right)^2\right]^{\nu+\frac{1}{2}-1}\right\}. \tag{248} \]
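So we obtain:
\[ \begin{aligned} V_{twisted} ={}& -\frac{1}{2}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{\sqrt{\pi}}{a}\,\Gamma\left(-\nu-\frac{1}{2}+1\right)\left(\frac{\beta^2}{a^2}+1\right)^{\nu+\frac{1}{2}-1} \\ &- \frac{1}{2}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{\sqrt{\pi}}{a}\,\Gamma\left(-\nu-\frac{1}{2}+1\right)(a^2)^{\frac{1}{2}-\nu} \\ &\times\sum_{k=1}^{\infty}\left\{\left(a^2+(\beta+2\pi k)^2\right)^{\nu+\frac{1}{2}-1} + \left(a^2+(-\beta+2\pi k)^2\right)^{\nu+\frac{1}{2}-1}\right\} \\ &+ \frac{1}{4}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\Gamma(-\nu). \end{aligned} \tag{249} \]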
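Depending on whether d is even or odd we can Taylor expand or use the binomial expansion for the sum [1]:
\[ (a^2+b^2)^{\nu-\frac{1}{2}} = \sum_{l=0}^{\sigma}\frac{\left(\nu-\frac{1}{2}\right)!}{\left(\nu-\frac{1}{2}-l\right)!\,l!}\,(a^2)^l\,(b^2)^{\nu-\frac{1}{2}-l}. \tag{250} \]
If d is even then $\sigma = \nu - \frac{1}{2}$. If d is odd, then $\sigma$ is a positive integer. For d odd, we make a Taylor expansion:
\[ \begin{aligned} V_{twisted} ={}& -\frac{1}{2}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{\sqrt{\pi}}{a}\,\Gamma\left(-\nu-\frac{1}{2}+1\right)\left(\frac{\beta^2}{a^2}+1\right)^{\nu+\frac{1}{2}-1} + \frac{1}{4}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\Gamma(-\nu) \\ &- \frac{1}{2}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{\sqrt{\pi}}{a}\,\Gamma\left(-\nu-\frac{1}{2}+1\right)(a^2)^{\frac{1}{2}-\nu} \\ &\times\Bigg[\sum_{k=1}^{\infty}\sum_{l=0}^{\sigma}\frac{\left(\nu-\frac{1}{2}\right)!}{\left(\nu-\frac{1}{2}-l\right)!\,l!}\,(a^2)^l\left((\beta+2\pi k)^2\right)^{\nu-\frac{1}{2}-l} \\ &\quad+ \sum_{k=1}^{\infty}\sum_{l=0}^{\sigma}\frac{\left(\nu-\frac{1}{2}\right)!}{\left(\nu-\frac{1}{2}-l\right)!\,l!}\,(a^2)^l\left((-\beta+2\pi k)^2\right)^{\nu-\frac{1}{2}-l}\Bigg], \end{aligned} \tag{251} \]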
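and after calculations,
\[ \begin{aligned} V_{twisted} ={}& -\frac{1}{2}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{\sqrt{\pi}}{a}\,\Gamma\left(-\nu-\frac{1}{2}+1\right)\left(\frac{\beta^2}{a^2}+1\right)^{\nu+\frac{1}{2}-1} + \frac{1}{4}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\Gamma(-\nu) \\ &- \frac{1}{2}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{\sqrt{\pi}}{a}\,\Gamma\left(-\nu-\frac{1}{2}+1\right)(a^2)^{\frac{1}{2}-\nu}\left((2\pi)^2\right)^{\nu-\frac{1}{2}-l} \\ &\times\Bigg[\sum_{k=1}^{\infty}\sum_{l=0}^{\sigma}\frac{\left(\nu-\frac{1}{2}\right)!}{\left(\nu-\frac{1}{2}-l\right)!\,l!}\,(a^2)^l\left(\left(\frac{\beta}{2\pi}+k\right)^2\right)^{\nu-\frac{1}{2}-l} \\ &\quad+ \sum_{k=1}^{\infty}\sum_{l=0}^{\sigma}\frac{\left(\nu-\frac{1}{2}\right)!}{\left(\nu-\frac{1}{2}-l\right)!\,l!}\,(a^2)^l\left(\left(-\frac{\beta}{2\pi}+k\right)^2\right)^{\nu-\frac{1}{2}-l}\Bigg]. \end{aligned} \tag{252} \]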
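We use zeta regularization, expressed in terms of the Hurwitz zeta function [3, 4, 35, 56, 12, 54]:
\[ \zeta(s,\upsilon) = \sum_{k=0}^{\infty}\frac{1}{(k+\upsilon)^s} \;\rightarrow\; \sum_{k=1}^{\infty}\frac{1}{(k+\upsilon)^s} = \zeta(s,\upsilon) - \frac{1}{\upsilon^s}, \tag{253} \]
which is defined for $0 < \upsilon \le 1$, and the term $k + \upsilon = 0$ is omitted. In our case, $\upsilon$ is $\beta/2\pi$, which contains the phase appearing in the boundary conditions, so $\omega$ must be positive ($\beta = 2\pi\omega$). Using the Hurwitz zeta function [3, 4, 35, 56, 12, 54]:
\[ \begin{aligned} V_{twisted} ={}& -\frac{1}{2}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{\sqrt{\pi}}{a}\,\Gamma\left(-\nu-\frac{1}{2}+1\right)\left(\frac{\beta^2}{a^2}+1\right)^{\nu+\frac{1}{2}-1} + \frac{1}{4}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\Gamma(-\nu) \\ &- \frac{1}{2}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{\sqrt{\pi}}{a}\,\Gamma\left(-\nu-\frac{1}{2}+1\right)(a^2)^{\frac{1}{2}-\nu}\left((2\pi)^2\right)^{\nu-\frac{1}{2}-l} \\ &\times\Bigg[\sum_{l=0}^{\sigma}\frac{\left(\nu-\frac{1}{2}\right)!}{\left(\nu-\frac{1}{2}-l\right)!\,l!}\,(a^2)^l\left(\zeta\left(-2\nu+1+2l,\frac{\beta}{2\pi}\right) - \left(\frac{\beta}{2\pi}\right)^{2\nu-1-2l}\right) \\ &\quad+ \sum_{l=0}^{\sigma}\frac{\left(\nu-\frac{1}{2}\right)!}{\left(\nu-\frac{1}{2}-l\right)!\,l!}\,(a^2)^l\left(\zeta\left(-2\nu+1+2l,-\frac{\beta}{2\pi}\right) - \left(-\frac{\beta}{2\pi}\right)^{2\nu-1-2l}\right)\Bigg]. \end{aligned} \tag{254} \]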
The objective now is to make the $\beta$ dependence clear. For this we use the expansion of the Hurwitz zeta function [1]:
\[ \zeta(z,q) = \frac{2\Gamma(1-z)}{(2\pi)^{1-z}}\left[\sin\frac{\pi z}{2}\sum_{n=1}^{\infty}\frac{\cos 2\pi q n}{n^{1-z}} + \cos\frac{\pi z}{2}\sum_{n=1}^{\infty}\frac{\sin 2\pi q n}{n^{1-z}}\right]. \tag{255} \]
Also the $\zeta(z,-q)$ expansion can be found using [35],
\[ \zeta(1-s,a) = \frac{\Gamma(s)}{(2\pi)^s}\left(e^{-\frac{i\pi s}{2}}F(s,a) + e^{\frac{i\pi s}{2}}F(s,-a)\right), \tag{256} \]
where
\[ F(s,a) = \sum_{n=1}^{\infty}\frac{e^{2i\pi n a}}{n^s}, \tag{257} \]
which is valid if Re z < 0 and $0 < q \le 1$. In our case $z = -2\nu + 1 + 2l$. Note that for d = 3, we have $-2\nu = -4$ and $-2\nu + 1 + 2l$ is negative for l = 0, 1. For l = 2 we use the Hurwitz zeta expansion, $\zeta(s,a)$, around s = 1, where a pole exists,
\[ \lim_{s\to 1}\left[\zeta(s,a) - \frac{1}{s-1}\right] = -\psi_0(a). \tag{258} \]
Thus we can compute $V_{twisted}$ as an expansion up to order $L^{-2}$. By using dimensional regularization we Taylor expand the d dependent terms around $d + \varepsilon$, $\varepsilon \to 0$ as before. Also for d = 3 the expression $-2\nu + 1 + 2l$ is always an odd number for all l. So the terms $\left(\frac{\beta}{2\pi}\right)^{2\nu-1-2l}$ are omitted. Below we quote the terms for l = 0, 1, 2:
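\[ \begin{aligned} V_{twisted} ={}& -\frac{1}{2}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{\sqrt{\pi}}{a}\,\Gamma\left(-\nu-\frac{1}{2}+1\right)\left(\frac{\beta^2}{a^2}+1\right)^{\nu+\frac{1}{2}-1} + \frac{1}{4}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\Gamma(-\nu) \\ &- \frac{1}{2}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{\sqrt{\pi}}{a}\,\Gamma\left(-\nu-\frac{1}{2}+1\right)(a^2)^{\frac{1}{2}-\nu} \\ &\times\Bigg\{\left((2\pi)^2\right)^{\nu-\frac{1}{2}}\frac{2\Gamma(2\nu)}{(2\pi)^{2\nu}}\Bigg[\sin\frac{\pi(1-2\nu)}{2}\sum_{n=1}^{\infty}\frac{\cos\beta n}{n^{2\nu}} + \cos\frac{\pi(1-2\nu)}{2}\sum_{n=1}^{\infty}\frac{\sin\beta n}{n^{2\nu}} \\ &\qquad+ \sin\frac{\pi(1-2\nu)}{2}\sum_{n=1}^{\infty}\frac{\cos\beta n}{n^{2\nu}} - \cos\frac{\pi(1-2\nu)}{2}\sum_{n=1}^{\infty}\frac{\sin\beta n}{n^{2\nu}}\Bigg] \\ &\quad+ \frac{\left(\nu-\frac{1}{2}\right)!}{\left(\nu-\frac{1}{2}-1\right)!}\,a^2\left((2\pi)^2\right)^{\nu-\frac{1}{2}-1}\frac{2\Gamma(2\nu-2)}{(2\pi)^{2\nu-2}}\Bigg[\sin\frac{\pi(3-2\nu)}{2}\sum_{n=1}^{\infty}\frac{\cos\beta n}{n^{2\nu-2}} + \cos\frac{\pi(3-2\nu)}{2}\sum_{n=1}^{\infty}\frac{\sin\beta n}{n^{2\nu-2}} \\ &\qquad+ \sin\frac{\pi(3-2\nu)}{2}\sum_{n=1}^{\infty}\frac{\cos\beta n}{n^{2\nu-2}} - \cos\frac{\pi(3-2\nu)}{2}\sum_{n=1}^{\infty}\frac{\sin\beta n}{n^{2\nu-2}}\Bigg] \\ &\quad+ \frac{\left(\nu-\frac{1}{2}\right)!}{\left(\nu-\frac{1}{2}-2\right)!\,2!}\,a^4\left((2\pi)^2\right)^{\nu-\frac{1}{2}-2}\left(\frac{2}{\varepsilon} + \psi_0\left(\frac{\beta}{2\pi}\right) + \psi_0\left(-\frac{\beta}{2\pi}\right)\right)\Bigg\}, \end{aligned} \tag{259} \]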
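(with $\psi_0$ the digamma function) which after calculations is written:
\[ \begin{aligned} V_{twisted} ={}& -\frac{1}{2}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{\sqrt{\pi}}{a}\,\Gamma\left(-\nu-\frac{1}{2}+1\right)\left(\frac{\beta^2}{a^2}+1\right)^{\nu+\frac{1}{2}-1} + \frac{1}{4}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\Gamma(-\nu) \\ &- \frac{1}{2}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{\sqrt{\pi}}{a}\,\Gamma\left(-\nu-\frac{1}{2}+1\right)(a^2)^{\frac{1}{2}-\nu} \\ &\times\Bigg\{\left((2\pi)^2\right)^{\nu-\frac{1}{2}}\frac{2\Gamma(2\nu)}{(2\pi)^{2\nu}}\,2\sin\frac{\pi(1-2\nu)}{2}\sum_{n=1}^{\infty}\frac{\cos\beta n}{n^{2\nu}} \\ &\quad+ \frac{\left(\nu-\frac{1}{2}\right)!}{\left(\nu-\frac{1}{2}-1\right)!}\,a^2\left((2\pi)^2\right)^{\nu-\frac{1}{2}-1}\frac{2\Gamma(2\nu-2)}{(2\pi)^{2\nu-2}}\,2\sin\frac{\pi(3-2\nu)}{2}\sum_{n=1}^{\infty}\frac{\cos\beta n}{n^{2\nu-2}} \\ &\quad+ \frac{\left(\nu-\frac{1}{2}\right)!}{\left(\nu-\frac{1}{2}-2\right)!\,2!}\,a^4\left((2\pi)^2\right)^{\nu-\frac{1}{2}-2}\left(\frac{2}{\varepsilon} + \psi_0\left(\frac{\beta}{2\pi}\right) + \psi_0\left(-\frac{\beta}{2\pi}\right)\right) + O(\varepsilon, \varepsilon^2 \text{ and higher})\Bigg\}, \end{aligned} \tag{260} \]
with $\beta = 2\pi\omega$, $\nu = \frac{d+1}{2}$, $a = mL$. The sums appearing above are:
\[ \sum_{n=1}^{\infty}\frac{\cos\beta n}{n^{2\nu}} = \frac{1}{2}\left(\mathrm{Li}_{2\nu}(e^{-i\beta}) + \mathrm{Li}_{2\nu}(e^{i\beta})\right), \tag{261} \]
and
\[ \sum_{n=1}^{\infty}\frac{\cos\beta n}{n^{2\nu-2}} = \frac{1}{2}\left(\mathrm{Li}_{2\nu-2}(e^{-i\beta}) + \mathrm{Li}_{2\nu-2}(e^{i\beta})\right). \tag{262} \]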
Let us see how the poles cancel in the above expressions. In the case d = 3, one of the poles is contained in the Hurwitz zeta function, and is of the form $\frac{2}{\varepsilon}$ with $\varepsilon \to 0$. The other pole is contained in the expression $\frac{1}{4}(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\Gamma(-\nu)$. Thus we have:
Vtwisted
3 β2 2 m4 −m4 4 √ m 1+ 2 3m4 + 2m4 α2 cos(β) 2 2 γm4 α 16π 16π + + = − ε 32π 2 6πα π 2 α5 64π 2
3 m ψ − m4 ln(2) m4 ln(π) m4 ln(π) m4 ln(α2 ) 2 + + + − − 16π 2 32π 2 32π 2 32π 2 32π 2
1 5 −β β m4 ψ m4 ψ m4 ψ m4 ψ 2 2 2π 2π − + + + . 32π 2 32π 2 32π 2 32π 2
4
(263)
We can see how the poles cancel. The last expression is the vacuum energy in the case that arbitrary phases appear.

3.4. Some applications II

3.4.1. Extra dimensional models with twisted boundary conditions

Let us now briefly present an application of the twisted potential case we computed above. In models with large extra dimensions, supersymmetry can be broken in the
bulk by the Scherk–Schwarz mechanism, as we described briefly in the introduction. Consider the immediate extra dimensional extension of the MSSM in five dimensions on the orbifold $S^1/Z_2$ [64, 67, 68]. Assume that supersymmetry breaking occurs in the bulk through the Scherk–Schwarz mechanism [66]. Thus the fields have the following boundary conditions,
\[ \Phi(x^\mu, y + 2\pi R) = e^{2\pi i q_\Phi}\,\Phi(x^\mu, y). \tag{264} \]
The Scherk–Schwarz mechanism consists in using different parameters $q_\Phi$ for fermions and bosons belonging to the same hypermultiplet. The harmonic expansion of the fields for circle compactification is,
\[ \Phi(x^\mu, y) = \sum_{n=-\infty}^{\infty}\Phi_n(x)\,e^{\frac{i2\pi(n+q_\Phi)y}{R}}. \tag{265} \]
In the case of the $S^1/Z_2$ orbifold compactification, the $Z_2$ even fields have harmonic expansion,
\[ \Phi(x^\mu, y) = \sum_{n=-\infty}^{\infty}\Phi_n(x)\cos\frac{2\pi(n+q_\Phi)y}{R}, \tag{266} \]
while the $Z_2$ odd fields,
\[ \Phi(x^\mu, y) = \sum_{n=-\infty}^{\infty}\Phi_n(x)\sin\frac{2\pi(n+q_\Phi)y}{R}. \tag{267} \]
The $Z_2$ even fields have zero modes and produce the 4-dimensional MSSM, while the $Z_2$ odd fields do not have zero modes. The Kaluza–Klein modes within each hypermultiplet have masses,
\[ m_B^2 = \frac{(n+q_B)^2}{R^2}, \tag{268} \]
for the boson case, and for the fermion case the mass reads,
\[ m_F^2 = \frac{(n+q_F)^2}{R^2}. \tag{269} \]
In the orbifold extra dimensional extension, the electroweak symmetry breaking occurs through radiative corrections to the Higgs mass. So it is necessary to include one loop corrections to the appropriate mass eigenstate Higgs scalar field mass (for more details see [68, 67]). The one loop corrected mass is induced by a tower of KK states and is equal to,
\[ m_\phi^2(\phi = 0) = \frac{d^2V(\phi)}{d\phi^2}, \tag{270} \]
with $V(\phi)$ given by,
\[ V(\phi) = \frac{1}{2}\,\mathrm{Tr}\sum_{n=-\infty}^{\infty}\int\frac{d^4p}{(2\pi)^4}\,\ln\left[\frac{p^2 + \frac{(n+q_B)^2}{R^2} + M^2(\phi)}{p^2 + \frac{(n+q_F)^2}{R^2} + M^2(\phi)}\right]. \tag{271} \]
In the above, $M^2(\phi)$ is the $\phi$-dependent mass of the KK states, which is model dependent. It is obvious that the effective potential (271) is identical to (192), which was computed in the previous section. Thus the Scherk–Schwarz phases act like twists in the boundary conditions. The calculation follows as described above. See also [35, 3].

3.5. An alternative elegant approach. Epstein zeta functions

In this section, we briefly present a more elegant computation method for the effective potential. Consider a massive scalar field quantized in $T^N \times R^n$ with periodic boundary conditions in each of the tori, that is,
\[ \phi(x_i) = \phi(x_i + L_i), \tag{272} \]
with $x_i$ the coordinates describing the tori and $L_i$ the tori radii. The zeta function corresponding to this setup is [35, 3, 4, 50, 51, 28],
\[ \zeta(s, L_i) = (2\pi)^{-n}\sum_{n_1\cdots n_N=-\infty}^{\infty}\int d^nk\left[\left(\frac{2\pi n_1}{L_1}\right)^2 + \cdots + \left(\frac{2\pi n_N}{L_N}\right)^2 + k^2 + M^2\right]^{-s}. \tag{273} \]
The general summations can be written in terms of the Epstein zeta function. Indeed, after performing the integration in relation (273), we obtain,
\[ \zeta(s, w_i) = \left(\frac{\sqrt{\pi}}{L_1}\right)^{n}\frac{\Gamma(s-n/2)}{\Gamma(s)}\left(\frac{L_1}{2\pi}\right)^{2s}Z_N^{v^2}(s-n/2;\,w_1,\ldots,w_N), \tag{274} \]
with $w_i = (L_1/L_i)^2$. In the above we used the generalized Epstein zeta function,
\[ Z_N^{v^2}(s-n/2;\,w_1,\ldots,w_N) = \sum_{n_1\cdots n_N=-\infty}^{\infty}\left[w_1n_1^2 + \cdots + w_Nn_N^2 + v^2\right]^{n/2-s}. \tag{275} \]
The interested reader can consult the references [28, 51, 3, 35], where the subject is developed in greater detail.

3.6. Twisted sections and non trivial topology

One question that one might ask is whether there is a criterion, or more correctly a way, to know which boundary conditions are allowed for a field in a specific topology. The answer can be given in terms of the allowed sections of the fiber bundles that the spacetime topology corresponds to. Non trivial topology affects the fields entering the Lagrangian (twisted fields) (see for example [72, 75, 74, 58]). In our case, the topological properties of $S^1 \times R^3$ are classified by the first Stiefel class $H^1(S^1 \times R^3, Z_{\tilde{2}})$, which is isomorphic to the singular (simplicial) cohomology group $H_1(S^1 \times R^3, Z_2)$ because of the triviality of the $Z_{\tilde{2}}$ sheaf. It is known that $H^1(S^1 \times R^3, Z_{\tilde{2}}) = Z_2$ classifies the twisting of a bundle. Specifically, it describes and classifies the orientability of a bundle globally.
In our case, the classification group is $Z_2$ and we have two locally equivalent bundles, which are however different globally (as in the case of the cylinder and the Möbius strip, which both locally resemble $S^1 \times R$). The underlying mathematical problem is to find the sections that correspond to these two fiber bundles, which are classified by $Z_2$ [72]. The sections we used are real scalar fields and Majorana or Dirac spinor fields. These carry a topological number called moebiosity (twist), which distinguishes between twisted and untwisted fields. The twisted fields obey anti-periodic boundary conditions, while untwisted fields obey periodic ones in the compact dimension. In the finite temperature case, one takes scalar fields to obey periodic and fermion fields anti-periodic boundary conditions, disregarding all other configurations that may arise from non trivial topology. We shall consider all these configurations. Let $\varphi_u$, $\varphi_t$ and $\psi_u$, $\psi_t$ denote the untwisted and twisted scalar fields and the untwisted and twisted spinor fields, respectively. The boundary conditions in the $S^1$ dimension read,
\[ \varphi_u(x,0) = \varphi_u(x,L), \tag{276} \]
and
\[ \varphi_t(x,0) = -\varphi_t(x,L), \tag{277} \]
for scalar fields and
\[ \psi_u(x,0) = \psi_u(x,L), \tag{278} \]
and
\[ \psi_t(x,0) = -\psi_t(x,L), \tag{279} \]
for fermion fields, where x stands for the remaining two spatial and one time dimension, which are not affected by the boundary conditions. Spinors (both Dirac and Majorana) still remain Grassmann quantities. The untwisted fields are assigned twist $h_0$ (the trivial element of $Z_2$) and the twisted fields twist $h_1$ (the non trivial element of $Z_2$). Recall that $h_0 + h_0 = h_0$ (0 + 0 = 0), $h_1 + h_1 = h_0$ (1 + 1 = 0), $h_1 + h_0 = h_1$ (1 + 0 = 1). We require the Lagrangian to be scalar under $Z_2$, and thus to have $h_0$ moebiosity. Thus the topological charges flowing at the interaction vertices must sum to $h_0$ under $H^1(S^1 \times R^3, Z_{\tilde{2}})$. For supersymmetric models, supersymmetry transformations impose some restrictions on the twist assignments of the superfield component fields [75]. No field configuration other than the untwisted scalars is allowed to take a non zero vacuum expectation value. This is due to the Grassmann nature of the vacuum, or to the space dependent vacuum solutions, that other configurations imply. In the general case when the spacetime has topology $(S^1)^q \times R^{4-q}$, the topologically allowed field configurations are classified by the representations of $H^1\left((S^1)^q \times R^{4-q}, Z_{\tilde{2}}\right) = Z_2^q$. Thus the number of different inequivalent twists that can be assigned is $2^q$. This means that we can have $2^q$ topologically inequivalent spin 0 scalars, spin 1/2 Majorana fermions and spin 3/2 Majorana fermions (the latter for supergravity); in our case q = 1. It is worth mentioning that equivalent mathematical setups exist in the literature. Twisted fields have frequently been used, for example, as we have seen, in the Scherk–Schwarz mechanism [66] for supersymmetry breaking in our 4-dimensional world, where the harmonic expansion of the fields is of the form:
\[ \phi(x,y) = e^{imy}\sum_{n=-\infty}^{\infty}\phi_n(x)\,e^{\frac{i2\pi ny}{L}}. \tag{280} \]
The “m” parameter incorporates the twist mentioned above. This treatment is closely related to automorphic field theory [92] in more than 4 dimensions (which is an alternative to the one used by us). Concerning the automorphic field theory, due to the compact dimension we can use generic boundary conditions for bosons and fermions in the compact dimension, which are,
\[ \begin{aligned} \varphi_i(x_2,x_3,\tau,x_1) &= e^{i\pi n_1\alpha}\,\varphi_i(x_2,x_3,\tau,x_1+L), \\ \Psi(x_2,x_3,\tau,x_1) &= e^{i\pi n_1\delta}\,\Psi(x_2,x_3,\tau,x_1+L), \end{aligned} \tag{281} \]
with $0 < \alpha, \delta < 1$, $i = 1, 2$, $n_1 = 1, 2, 3, \ldots$. The values $\alpha = 0, 1$ correspond to periodic and anti-periodic bosons respectively, while $\delta = 0, 1$ corresponds to periodic and anti-periodic fermions [92].

3.7. The validity of approximations. Numerical tests

Let us check numerically one of our results. We focus on the bosonic contribution at high temperature. We shall study the convergence properties of our approximation and how the semi-analytic results behave in comparison to the numerical evaluation of the potential. As we have seen, before the high temperature limit is taken, the bosonic contribution is given by:
\[ V_{boson} = -\sum_{q=1}^{\infty}(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{K_{\frac{d+1}{2}}\left(\frac{mq}{T}\right)}{\left(\frac{mq}{2T}\right)^{\frac{d+1}{2}}}. \tag{282} \]
After the high temperature limit is taken, the effective potential is given by the semi-analytic approximation:
\[ \begin{aligned} V_{boson} ={}& -\frac{1}{2}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{\sqrt{\pi}}{a}\,\Gamma\left(-\nu-\frac{1}{2}+1\right) + \frac{1}{4}\,(2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\Gamma(-\nu) \\ &- (2\pi)^{\frac{d-1}{2}}\frac{1}{(2\pi)^d}\,m^{d+1}\,\frac{\sqrt{\pi}}{a}\,\Gamma\left(-\nu-\frac{1}{2}+1\right)(a^2)^{\frac{1}{2}-\nu} \\ &\times\sum_{l=0}^{\sigma}\frac{\left(\nu-\frac{1}{2}\right)!}{\left(\nu-\frac{1}{2}-l\right)!\,l!}\left((2\pi)^2\right)^{\nu-\frac{1}{2}-l}(a^2)^l\,\zeta(-2\nu+1+2l). \end{aligned} \tag{283} \]
The convergence of (283) to (282) is quite fast. The two relations describe the same physics and agree numerically, as can be checked, even if we keep only a few terms of (283). We have checked this for values of m/T for which our approximation is valid, that is m/T < 1, and for several dimensions. Let us study the finite temperature limit of a 5-dimensional theory, that is for d = 4. In Fig. 1, we plot $V_{boson}/m^{d+1}$ as a function of m/T, where $V_{boson}$ is given by the Bessel sum of relation (282). A numerical calculation is done for the sum over the Bessel functions. In Fig. 2, we plot $V_{boson}/m^{d+1}$ as a function of m/T, with $V_{boson}$ given by the semi-analytic approximation of relation (283).

Fig. 1. Plot of the dependence of $V_{boson}/m^{d+1}$ as a function of m/T. Numerical approximation of Bessel sum. 5-dimensional bosonic theory at finite temperature. [Panel "Numerical"; horizontal axis m/T from 0.002 to 0.01.]

Fig. 2. Plot of the dependence of $V_{boson}/m^{d+1}$ as a function of m/T. Semi-analytic approximation. 5-dimensional bosonic theory at finite temperature. [Panel "Semianalytic"; horizontal axis m/T from 0.002 to 0.01.]

Fig. 3. Comparison of numerical and corresponding semi-analytic approximation. 5-dimensional bosonic theory at finite temperature. [Panel "Comparison"; horizontal axis m/T from 0.002 to 0.01.]

Fig. 4. Plot of the dependence of $V_{boson}/m^{d+1}$ as a function of m/T. Numerical approximation of Bessel sum. 4-dimensional bosonic theory at finite temperature. [Panel "Numerical"; horizontal axis m/T from 0.002 to 0.01.]

Fig. 5. Plot of the dependence of $V_{boson}/m^{d+1}$ as a function of m/T. Semi-analytic approximation. 4-dimensional bosonic theory at finite temperature. [Panel "Semianalytic"; horizontal axis m/T from 0.002 to 0.01.]

Fig. 6. Comparison of numerical and corresponding semi-analytic approximation. 4-dimensional bosonic theory at finite temperature. [Panel "Comparison"; horizontal axis m/T from 0.002 to 0.01.]

In addition, in Fig. 3, we compare the above results. As we can see, the two results are identical for a large range of the expansion parameter m/T. This shows that in the high temperature limit (m/T < 1) the semi-analytic expressions we obtained are in complete agreement with the numerical values. This holds
regardless of the number of terms of the semi-analytic expansion we keep. Thus the expansion is perturbative and valid. The same analysis can be done for the d = 3 case. We present the results in Figs. 4–6. Thus, within the perturbative limits, the semi-analytic approximation is valid and exponentially converging, as expected (see also [3]).

Acknowledgments

The author would like to thank the referee of Reviews in Mathematical Physics for invaluable comments and suggestions that significantly improved the quality and appearance of the paper.

References

[1] I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals Series and Products (Academic Press, 1965). [2] H. Casimir, On the attraction between two perfectly conducting plates, Proc. Kon. Nederl. Akad. Wet. 51 (1948) 793–795. [3] E. Elizalde, Ten Physical Applications of Spectral Zeta Functions (Springer, 1995). [4] E. Elizalde, S. D. Odintsov, A. Romeo and A. A. Bytsenko, Zeta Regularization Techniques and Applications (World Scientific, 1994). [5] E. Elizalde, S. Leseduarte and A. Romeo, Sum rules for zeros of Bessel functions and an application to spherical Aharonov–Bohm quantum bags, J. Phys. A 26 (1993) 2409–2419. [6] M. Bordag and K. Kirsten, Vacuum energy in a spherically symmetric background field, Phys. Rev. D 53 (1996) 5753–5760. [7] M. Bordag, K. Kirsten and J. S. Dowker, Heat kernels and functional determinants on the generalized cone, Commun. Math. Phys. 182 (1996) 371–394. [8] M. Bordag, B. Geyer, K. Kirsten and E. Elizalde, Zeta function determinant of the Laplace operator on the D-dimensional ball, Commun. Math. Phys. 179 (1996) 215–234. [9] M. Bordag, E. Elizalde and K. Kirsten, Heat kernel coefficients of the Laplace operator on the D-dimensional ball, J. Math. Phys. 37 (1996) 895–916.
[10] G. Lambiase, V. V. Nesterenko and M. Bordag, Casimir energy of a ball and cylinder in the zeta function technique, J. Math. Phys. 40 (1999) 6254–6265. [11] M. Bordag, U. Mohideen and V. M. Mostepanenko, New developments in Casimir effect, Phys. Rep. 353 (2001) 1–205. [12] E. Elizalde, Zeta function methods and quantum fluctuations, J. Phys. A 41 (2008) 304040. [13] E. Elizalde, Uses of zeta regularization in QFT with boundary conditions: A cosmotopological Casimir effect, J. Phys. A 39 (2006) 6299–6307. [14] E. Elizalde, Analytical regularisation for confined quantum fields between parallel surfaces, J. Phys. A 39 (2006) 6725–6732. [15] E. Elizalde and A. C. Tort, A note on the Casimir energy of a massive scalar field in positive curvature space, Mod. Phys. Lett. A 19 (2004) 111–116. [16] E. Elizalde, F. C. Santos and A. C. Tort, Confined quantum fields under the influence of a uniform magnetic field, J. Phys. A 35 (2002) 7403–7414. [17] E. Elizalde and A. C. Tort, Thermal energy of a scalar field in a one-dimensional compact space, Phys. Rev. D 66 (2002) 045033, 6 pp. [18] G. Cognola, E. Elizalde and K. Kirsten, Casimir energies for spherically symmetric cavities, J. Phys. A 34 (2001) 7311–7327. [19] E. Elizalde, M. Bordag and K. Kirsten, Casimir energy for a massive fermionic quantum field with a spherical boundary, J. Phys. A 31 (1998) 1743–1759. [20] E. Elizalde, Multidimensional extension of the generalized Chowla–Selberg formula, Commun. Math. Phys. 198 (1998) 83–95. [21] M. Bordag, E. Elizalde, K. Kirsten and S. Leseduarte, Casimir energies for massive fields in the bag, Phys. Rev. D 56 (1997) 4896–4904. [22] K. Kirsten and E. Elizalde, Casimir energy of a massive field in a genus 1 surface, Phys. Lett. B 365 (1996) 72–78. [23] G. Plunien, B. Muller and W. Greiner, The Casimir effect, Phys. Rep. 134 (1986) 87–193. [24] R. Obousy and G. Cleaver, Casimir energies and brane stability (2008); arXiv:0810.1096. [25] E. Ponton and E. 
Poppitz, Casimir energy and radius stabilization in five and six dimensional orbifolds, J. High Energy Phys. 0106 (2001) 019. [26] E. Elizalde, K. Kirsten and Yu. Kubyshin, On the instability of the vacuum in multidimensional scalar theories, Z. Phys. C 70 (1996) 159–172. [27] E. Elizalde, The vacuum energy density for spherical and cylindrical universes, J. Math. Phys. 35 (1994) 3308–3321. [28] E. Elizalde and K. Kirsten, Topological symmetry breaking in selfinteracting theories on toroidal space-time, J. Math. Phys. 35 (1994) 1260–1273. [29] E. Elizalde, The spectrum of the Casimir effect on a torus, Z. Phys. C 44 (1989) 471–492. [30] E. Elizalde, S. Nojiri, S. D. Odintsov and S. Ogushi, Casimir effect in de Sitter and anti-de Sitter braneworlds, Phys. Rev. D 67 (2003) 063515. [31] K. A. Milton, The Casimir effect: Recent controversies and progress, J. Phys. A 37 (2004) R209–R277. [32] K. A. Milton, Calculating Casimir energies in renormalizable quantum field theory, Phys. Rev. D 68 (2003) 065020. [33] I. Brevik, K. A. Milton, S. D. Odintsov and K. E. Osetrin, Dynamical Casimir effect and quantum cosmology, Phys. Rev. D 62 (2000) 064005, 8 pp. [34] R. Kantowski and K. A. Milton, Casimir energies in M (4)×S(N ) for even N. Green’s function and zeta function techniques, Phys. Rev. D 36 (1987) 3712–3721.
June 2, 2009 18:35 WSPC/148-RMP
672
J070-00371
V. K. Oikonomou
[35] K. Kirsten, Spectral Functions in Mathematics and Physics (Chapman Hall/CRC, 2001). [36] E. Elizalde, S. Naftulin and S. D. Odintsov, Covariant effective action and one loop renormalization of 2-D dilaton gravity with fermionic matter, Phys. Rev. D 49 (1994) 2852–2861. [37] I. L. Buchbinder and S. D. Odintsov, Spontaneous supersymmetry breaking and effective action in supersymmetrical Kaluza–Klein theories and strings, Int. J. Mod. Phys. A 4 (1989) 4337–4351. [38] I. L. Buchbinder and S. D. Odintsov, Effective action in multidimensional (super)gravities and spontaneous compactification (quantum aspects of Kaluza–Klein theories), Fortshrt. Phys. 37 (1989) 225–259. [39] S. D. Odintsov, Compactification and spontaneous symmetry breaking in the lambda phi**4 theory with Kaluza–Klein background, Sov. Phys. J. 31 (1988) 695–710. [40] E. Elizalde, S. D. Odintsov and S. Leseduarte, Chiral symmetry breaking in the Nambu–Jona–Lasinio model in curved space-time with nontrivial topology, Phys. Rev. D 49 (1994) 5551–5558. [41] I. Brevik, K. Milton, S. Nojiri and S. D. Odintsov, Quantum (in)stability of a brane world AdS(5) universe at nonzero temperature, Nucl. Phys. B 599 (2001) 305–318. [42] I. L. Buchbinder and S. D. Odintsov, Effective potential in a curved space-time, Sov. Phys. J. 27 (1984) 554–558. [43] I. L. Buchbinder and S. D. Odintsov, One loop renormalization of the Yang–Mills field theory in a curved space-time, Sov. Phys. J. 26 (1983) 359–361. [44] S. D. Odintsov, Casimir effect in multidimensional quantum supergravities and supersymmetry breaking, Mod. Phys. Lett. A 3 (1988) 1391–1399. [45] S. D. Odintsov, Two loop effective potential in quantum field theory in curved spacetime, Phys. Lett. B 306 (1993) 233–236. [46] E. Elizalde, S. D. Odintsov and A. Romeo, Zeta regularization of the O(N ) nonlinear sigma model in D-dimensions, J. Math. Phys. 37 (1996) 1128–1147. [47] E. Elizalde, S. D. Odintsov and A. 
Romeo, Effective potential for a covariantly constant gauge field in curved space-time, Phys. Rev. D 54 (1996) 4152–4159. [48] S. D. Odintsov, Effective actions in quantum gravity theories, Sov. J. Nucl. Phys. 46 (1987) 932–936. [49] K. Kirsten, Topological gauge field mass generation by toroidal space-time, J. Phys. A 26 (1993) 2421–2435. [50] K. Kirsten, Connections between Kelvin functions and zeta functions with applications, J. Phys. A 25 (1992) 6297–6306. [51] K. Kirsten, Casimir effect at finite temperature, J. Phys. A 24 (1991) 3281–3298. [52] V. Di Clemente and Yu. A. Kubyshin, Effective potential and KK renormalization scheme in a 5D supersymmetric theory, Nucl. Phys. B 636 (2002) 115–131. [53] K. Kirsten, Generalized multidimensional Epstein zeta functions, J. Math. Phys. 35 (1994) 459–470. [54] K. Kirsten, Inhomogeneous multidimensional Epstein zeta functions, J. Math. Phys. 32 (1991) 3008–3014. [55] J. I. Kapusta, Finite Temperature Field Theory, Cambridge Monographs on Mathematical Physics (Cambridge University Press, 1989). [56] E. C. Titchmarsh, The Theory of the Riemann Function (Oxford at the Clarendon Press, 1951). [57] A. Das, Finite Temperature Field Theory (World Scientific, 1997). [58] G. Denardo and E. Spallucci, Dynamical mass generations in S(1)×R(3), Nucl. Phys. B 169 (1980) 514–526.
June 2, 2009 18:35 WSPC/148-RMP
J070-00371
Calculation of Effective Potential in Spacetimes with S 1 × Rd Topology
673
[59] L. Van Hove, Quantum field theory at positive temperature, Phys. Rep. 137 (1988) 11–20. [60] D. Bailin and A. Love, Supersymmetric Gauge Field Theory and String Theory (Institute of Physics Publishing, 2003). [61] M. Quiros, Introduction to extra dimensions (2006); hep-ph/0606153. [62] M. Quiros, New ideas in symmetry breaking (2003); hep-ph/0302189. [63] M. Quiros, Finite temperature field theory and phase transitions (1999); hepph/9901312. [64] I. Antoniadis, A possible new dimension at a few TeV, Phys. Lett. B 246 (1990) 377–384. [65] I. Antoniadis, N. Arkani-Hamed, S. Dimopoulos and G. R. Dvali, New dimensions at a millimeter to a Fermi and superstrings at a TeV, Phys. Lett. B 436 (1998) 257–263. [66] J. Scherk and J. H. Schwarz, How to get masses from extra dimensions, Nucl. Phys. B 153 (1979) 61–88. [67] A. Pomarol and M. Quiros, The standard model from extra dimensions, Phys. Lett. B 438 (1998) 255–260. [68] A. Delgado, A. Pomarol and M. Quiros, Supersymmetry and electroweak breaking from extra dimensions at the TeV scale, Phys. Rev. D 60 (1999) 095008, 13 pp. [69] G. F. R. Ellis, Topology and cosmology, Gen. Relativity Gravitation 2 (1971) 7–21. [70] C. W. Bernard, Feynman rules for gauge theories at finite temperature, Phys. Rev. D 9 (1974) 3312–3320. [71] L. Dolan and R. Jackiw, Symmetry behavior at finite temperature, Phys. Rev. D 9 (1974) 3320–3341. [72] S. J. Avis and C. J. Isham, Generalized spin structures on four dimensional spacetimes, Commun. Math. Phys. 72 (1980) 103–118. [73] C. J. Isham, Spinor fields in four-dimensional space-time, Proc. R. Soc. London. A 364 (1978) 591–599. [74] L. H. Ford, Vacuum polarization in a nonsimply connected spacetime, Phys. Rev. D 21 (1980) 933–948. [75] Yu. P. Goncharov and A. A. Bytsenko, Topological violation of supersymmetry, Phys. Lett. B 163 (1985) 155–160. [76] Yu. P. Goncharov and A. A. Bytsenko, Topological violation of supersymmetry at finite temperature, Phys. Lett. B 168 (1986) 239–244. [77] Yu. P. 
Goncharov and A. A. Bytsenko, The supersymmetric Casimir effect and quantum creation of the universe with nontrivial topology, 2, Phys. Lett. B 169 (1986) 171–176. [78] Yu. P. Goncharov and A. A. Bytsenko, The supersymmetric Casimir effect and quantum creation of the universe with nontrivial topology, Phys. Lett. B 160 (1985) 385–390. [79] Yu. P. Goncharov and A. A. Bytsenko, Topological Casimir effect for a class of hyperbolic four-dimensional Clifford–Klein space-times, Class. Quant. Grav. 8 (1991) L211–L218. [80] Yu. P. Goncharov and A. A. Bytsenko, Topological Casimir effect for a class of hyperbolic three-dimensional Clifford-Klein space-times, Class. Quant. Grav. 8 (1991) 2269–2275. [81] Yu. P. Goncharov and A. A. Bytsenko, Casimir effect in supergravity theories and quantum birth of the universe with nontrival topology, Class. Quant. Grav. 4 (1987) 555–571.
June 2, 2009 18:35 WSPC/148-RMP
674
J070-00371
V. K. Oikonomou
[82] Yu. P. Goncharov and A. A. Bytsenko, Space-time topology, temperature and the vanishing of vacuum energies in dimensionally reduced supersymmetric theories, Nucl. Phys. B 271 (1986) 726–748. [83] J. S. Dowker and R. Banach, Quantum field theory on Clifford–Klein space-times. The effective Lagrangian and vacuum stress energy tensor, J. Phys. A 11 (1978) 2255–2284. [84] J. S. Dowker and R. Banach, Automorphic field theory: Some mathematical issues, J. Phys. A 12 (1979) 2527–2543. [85] A. A. Bytsenko, E. Elizalde and S. Zerbini, Effective finite temperature partition function for fields on noncommutative flat manifolds, Phys. Rev. D 64 (2001) 105024, 7 pp. [86] N. P. Landsman, Limitations to dimensional reduction at high temperature, Nucl. Phys. B 322 (1989) 498–530. [87] V. K. Oikonomou, Study of temperature inversion symmetry for the twisted Wess– Zumino, J. Phys. A 40 (2007) 5725–5731. [88] V. K. Oikonomou, Non trivial spacetime effects in a supersymmetric model, J. Phys. A 40 (2007) 9929–9939. [89] V. K. Oikonomou, work in preparation. [90] S. P. Martin, Two-loop effective potential for a general renormalizable theory and softly broken supersymmetry, Phys. Rev. D 65 (2002) 116003. [91] G. D. Kribs, TASI 2004 lectures on the phenomenology of extra dimensions (2006); hep-ph/0605325. [92] B. Alles, J. Soto and J. Taron, On the physics of lambda phi**4 in a R**2 × S**1 space, Z. Phys. C 39 (1988) 489–498. [93] E. J. Ferrer, V. de la Incera and A. Romeo, Photon propagation in space-time with a compactified spatial dimension, Phys. Lett. B 515 (2001) 341–347. [94] L. Randall and R. Sundrum, An alternative to compactification, Phys. Rev. Lett. 83 (1999) 3370–3373. [95] L. Randall and R. Sundrum, A large mass hierarchy from a small extra dimension, Phys. Rev. Lett. 83 (1999) 4690–4693.
June 2, 2009 18:35 WSPC/148-RMP
J070-00370
Reviews in Mathematical Physics Vol. 21, No. 5 (2009) 675–708 © World Scientific Publishing Company
TIME DELAY FOR DISPERSIVE SYSTEMS IN QUANTUM SCATTERING THEORY
RAFAEL TIEDRA DE ALDECOA
Facultad de Matemáticas, Pontificia Universidad Católica de Chile, Av. Vicuña Mackenna 4860, Santiago, Chile
[email protected]
Received 8 January 2009
Revised 9 April 2009

We consider time delay and symmetrized time delay (defined in terms of sojourn times) for quantum scattering pairs $\{H_0 = h(P), H\}$, where $h(P)$ is a dispersive operator of hypoelliptic type. For instance, $h(P)$ can be one of the usual elliptic operators such as the Schrödinger operator $h(P) = P^2$ or the square-root Klein–Gordon operator $h(P) = \sqrt{1 + P^2}$. We show under general conditions that the symmetrized time delay exists for all smooth even localization functions. It is equal to the Eisenbud–Wigner time delay plus a contribution due to the non-radial component of the localization function. If the scattering operator $S$ commutes with some function of the velocity operator $\nabla h(P)$, then the time delay also exists and is equal to the symmetrized time delay. As an illustration of our results, we consider the case of a one-dimensional Friedrichs Hamiltonian perturbed by a finite rank potential. Our study puts into evidence an integral formula relating the operator of differentiation with respect to the kinetic energy $h(P)$ to the time evolution of localization operators.

Keywords: Time delay; scattering theory; pseudodifferential operators.

Mathematics Subject Classification 2000: 46N50, 81Q10, 35Q40, 35S05
1. Introduction and Main Results

One can find a large literature on the identity of Eisenbud–Wigner time delay and time delay in quantum scattering defined in terms of sojourn times (see [3, 7, 8, 12, 19, 23, 24, 30–34, 38, 39, 49] and references therein). However, most of the papers treat scattering processes where the free dynamics is given by some Schrödinger operator. The mathematical articles where different scattering processes are considered (such as [23, 30, 31, 38]) only furnish explicit applications in the Schrödinger case. The purpose of the present paper is to fill in this gap by proving the existence of time delay and its relation to Eisenbud–Wigner time delay for a general class of dispersive quantum systems. Using a symmetrization argument introduced in [9, 31, 44] for $N$-body scattering, and rigorously applied
in [5, 17, 29, 47, 48], we shall treat any scattering process with free dynamics given by a regular enough pseudodifferential operator of hypoelliptic type.

Given a real Euclidean space $X$ of dimension $d \ge 1$, we consider in $\mathcal H(X) := L^2(X)$ the dispersive operator $H_0 := h(P)$, where $h : X \to \mathbb R$ is some hypoelliptic function and $P \equiv (P_1, \ldots, P_d)$ is the vector momentum operator in $\mathcal H(X)$. We also consider a selfadjoint perturbation $H$ of $H_0$ such that the wave operators $W_\pm := \operatorname*{s-lim}_{t \to \pm\infty} e^{itH} e^{-itH_0}$ exist and are complete (so that the scattering operator $S := W_+^* W_-$ is unitary). We define the usual time delay and the symmetrized time delay for the quantum scattering system $\{H_0, H\}$ as follows. Take a function $f \in L^\infty(X)$ decaying to zero sufficiently fast at infinity, and such that $f = 1$ on some neighborhood $\Sigma$ of the origin. Define for $r > 0$ and some state $\varphi \in \mathcal H(X)$ the numbers
\[ T_r^0(\varphi) := \int_{\mathbb R} dt\, \big\langle e^{-itH_0}\varphi, f(Q/r)\, e^{-itH_0}\varphi \big\rangle \]
and
\[ T_r(\varphi) := \int_{\mathbb R} dt\, \big\langle e^{-itH} W_-\varphi, f(Q/r)\, e^{-itH} W_-\varphi \big\rangle, \]
where $Q \equiv (Q_1, \ldots, Q_d)$ is the vector position operator in $\mathcal H(X)$. The operator $f(Q/r)$ is approximately the projection onto the states of $\mathcal H(X)$ localized in $r\Sigma := \{x \in X \mid x/r \in \Sigma\}$. So, if $\varphi$ is normalized to one, $T_r^0(\varphi)$ can be roughly interpreted as the time spent by the freely evolving state $e^{-itH_0}\varphi$ inside the region $r\Sigma$. Similarly, $T_r(\varphi)$ can be roughly interpreted as the time spent by the associated scattering state $e^{-itH} W_-\varphi$ inside $r\Sigma$. In consequence, $\tau_r^{\rm in}(\varphi) := T_r(\varphi) - T_r^0(\varphi)$ is approximately the time delay in $r\Sigma$ of the scattering process $\{H_0, H\}$ with incoming state $\varphi$, and
\[ \tau_r(\varphi) := T_r(\varphi) - \tfrac12 \big[ T_r^0(\varphi) + T_r^0(S\varphi) \big] \tag{1.1} \]
is the corresponding symmetrized time delay.

In the case of the Schrödinger operator ($h(x) = x^2$) it is known that the existence (and the value) of $\tau_r^{\rm in}(\varphi)$ and $\tau_r(\varphi)$ as $r \to \infty$ depend on the choice of the localization function $f$. The limit $\lim_{r\to\infty} \tau_r^{\rm in}(\varphi)$ exists only if $f$ is radial, in which case it is equal to Eisenbud–Wigner time delay [43]. On the other hand, it has been shown in [17] that the limit $\lim_{r\to\infty} \tau_r(\varphi)$ does exist for all characteristic functions $f = \chi_\Sigma$ with $\Sigma = -\Sigma$ regular enough. In such a case, the limit $\lim_{r\to\infty} \tau_r(\varphi)$ is the sum of the Eisenbud–Wigner time delay plus a term depending on the boundary $\partial\Sigma$ of $\Sigma$.

Our goal in this paper is to present a unified picture for these phenomena by treating all scattering pairs $\{H_0 \equiv h(P), H\}$, with $h$ in some natural class of hypoelliptic functions containing $h(x) = x^2$ as a particular case (see Assumption 4.6). In Sec. 4, Theorem 4.3, we prove under general assumptions on $H$ and $\varphi$ the existence
of the symmetrized time delay for all smooth even functions $f$ (for us, $f$ is even if $f(x) = f(-x)$ for a.e. $x \in X$). We show that
\[ \lim_{r\to\infty} \tau_r(\varphi) = \tfrac12 \big\langle \varphi, S^* [A_f, S] \varphi \big\rangle, \]
where $A_f$ is some explicit operator depending on $h$ and $f$ defined in Sec. 3. If $f$ is radial, then $A_f$ reduces in some sense to the operator $A = -2i \frac{d}{dh(P)}$, and $\lim_{r\to\infty} \tau_r(\varphi)$ is equal to Eisenbud–Wigner time delay. So, if $H_0$ is purely absolutely continuous and the scattering matrix $S(\lambda)$ is strongly continuously differentiable in the spectral representation of $H_0$, then
\[ \lim_{r\to\infty} \tau_r(\varphi) = \int_{\sigma(H_0)} d\lambda\, \Big\langle (U\varphi)(\lambda), -iS(\lambda)^* \frac{dS(\lambda)}{d\lambda} (U\varphi)(\lambda) \Big\rangle_{\mathcal H_\lambda}, \tag{1.2} \]
where $U : \mathcal H(X) \to \int^\oplus_{\sigma(H_0)} d\lambda\, \mathcal H_\lambda$ is a spectral transformation for $H_0$ (see Remark 4.4 for a precise statement). If $f$ is not radial, the limit $\lim_{r\to\infty} \tau_r(\varphi)$ is the sum of the Eisenbud–Wigner time delay and the contribution of the non-radial component of the localization function $f$ (see Remark 4.5).

In Theorem 4.8, we show that the free sojourn times $T_r^0(\varphi)$ and $T_r^0(S\varphi)$ before and after the scattering satisfy
\[ \lim_{r\to\infty} \big[ T_r^0(S\varphi) - T_r^0(\varphi) \big] = 0 \]
if the scattering operator $S$ commutes with some appropriate function of the velocity operator $h'(P) \equiv \nabla h(P)$. Under this circumstance, the usual time delay $\lim_{r\to\infty} \tau_r^{\rm in}(\varphi)$ also exists and is equal to $\lim_{r\to\infty} \tau_r(\varphi)$ (Theorem 4.10). In Corollary 4.11, we exhibit two classes of functions $h$ for which the commutation assumption is satisfied. Basically, these two classes of functions are the radial functions and the polynomials of degree 1. So, in particular, our results cover and shed a new light on the case of the Schrödinger operator $h(x) = x^2$.

In Sec. 5, we consider as an illustration of our approach the simple, but instructive, case of the one-dimensional Friedrichs Hamiltonian $H_0 = Q$ ($H_0$ is of the form $h(P)$ after a Fourier transformation). We verify all the assumptions of Sec. 4 when $H$ is a regular enough finite rank perturbation of $H_0$. The main difficulty consists in showing (as in the Schrödinger case [4, 26]) that the scattering operator maps some dense set into itself. Essentially this reduces to proving that the scattering matrix $S(x)$ is sufficiently differentiable on $\mathbb R \setminus \sigma_{\rm pp}(H)$, which is achieved by proving a stationary formula for $S(x)$ and by using higher order commutators methods (see Lemmas 5.9–5.12). All these results are collected in Theorem 5.14, where the formula
\[ \lim_{r\to\infty} \tau_r^{\rm in}(\varphi) = \lim_{r\to\infty} \tau_r(\varphi) = -i \int_{\mathbb R} dx\, |\varphi(x)|^2\, \overline{S(x)}\, S'(x) \tag{1.3} \]
is proved for finite rank perturbations. Some comments on the relation between Eq. (1.3) and the Birman–Krein formula are given in Remark 5.7. The differentiability properties of the restriction operator appearing in the stationary formula for $S(x)$ are recalled in the Appendix.
In principle, our techniques may be applied to many physical examples such as the square-root Klein–Gordon operator, the Klein–Gordon equation, the Pauli operator, or the Dirac operator. We hope that these cases will be considered in future publications.

Let us note that our approach relies crucially on the proof in Sec. 3 of the integral formula
\[ \lim_{r\to\infty} \int_0^\infty dt\, \Big\langle \varphi, \Big[ e^{ith(P)} f\big(\tfrac{Q}{r}\big) e^{-ith(P)} - e^{-ith(P)} f\big(\tfrac{Q}{r}\big) e^{ith(P)} \Big] \varphi \Big\rangle = \langle \varphi, A_f \varphi \rangle. \tag{1.4} \]
The proof of (1.4) relies in some sense on the equation
\[ e^{ith(P)} f\big(\tfrac{Q}{r}\big) e^{-ith(P)} = f\Big( \frac{Q + t h'(P)}{r} \Big), \]
which replaces the Alsholm–Kato formula [1, Eq. (2.1)]
\[ e^{itP^2} f\big(\tfrac{Q}{r}\big) e^{-itP^2} = e^{-iQ^2/2t} f\big(\tfrac{tP}{r}\big) e^{iQ^2/2t} \]
of the Schrödinger case. We think that Formula (1.4) is interesting on its own, since it relates (when $f$ is radial) the time evolution of the localization operator $f(Q/r)$ to the operator of differentiation with respect to the kinetic energy $h(P)$.

As a last comment, we would like to emphasize that this paper shows that the Eisenbud–Wigner operator $-iS(\lambda)^* \frac{dS(\lambda)}{d\lambda}$ is the on-shell value of a time delay operator (symmetrized or not), not only for Schrödinger-type scattering systems, but for a large class of scattering pairs $\{H_0, H\}$. This was not so clear from the very beginning. We finally mention the papers [10, 45] for recent works on time delay.

2. Averaged Localization Functions

In this section, we collect results on a class of averaged localization functions which appears naturally when dealing with quantum time delay. We start by fixing the notations which will be freely used throughout the paper.

We write $|\cdot|$ for the norm in $X$, set $\langle \cdot \rangle := (1 + |\cdot|^2)^{1/2}$, and use $\underline{\mathrm d}x := (2\pi)^{-d/2} dx$ as measure on $X$ ($dx$ is the usual Euclidean measure on $X$). We denote by $x \cdot y$ the scalar product of $x, y \in X$. Sometimes we identify $X$ with $\mathbb R^d$ by choosing in $X$ an orthonormal basis $V := \{v_1, \ldots, v_d\}$. Given a function $g \in C^1(X; \mathbb C)$, we write $g'(x)$ for the derivative of $g$ at $x$, i.e. $g(x + h) = g(x) + h \cdot g'(x) + o(|h|)$ for $h \in X$ with $|h|$ sufficiently small. For higher order derivatives, we use the multi-index notation. A multi-index $\alpha$ is a $d$-tuple $(\alpha_1, \ldots, \alpha_d)$ of integers $\alpha_j \ge 0$ such that
\[ |\alpha| := \alpha_1 + \cdots + \alpha_d, \qquad \alpha! := \alpha_1! \cdots \alpha_d!, \qquad \partial^\alpha := \partial_1^{\alpha_1} \cdots \partial_d^{\alpha_d}, \]
and
\[ x^\alpha := x_1^{\alpha_1} \cdots x_d^{\alpha_d} \quad \text{if } x = x_1 v_1 + \cdots + x_d v_d \in X \quad (x_j \in \mathbb R). \]
The Hilbert space $\mathcal H(X) = L^2(X)$ is endowed with its usual norm $\|\cdot\|$ and scalar product $\langle \cdot, \cdot \rangle$. The $j$th components of $P$ and $Q$ with respect to $V$ act as $(P_j \varphi)(x) := -i(\partial_j \varphi)(x)$ and $(Q_j \varphi)(x) := x_j \varphi(x)$ in $\mathcal H(X)$.

Assumption 2.1. The function $f \in L^\infty(X)$ satisfies the following conditions:
(i) There exists $\rho > 0$ such that $|f(x)| \le \mathrm{Const.}\, \langle x \rangle^{-\rho}$ for a.e. $x \in X$.
(ii) $f = 1$ on a neighborhood of $0$.

It is clear that $\operatorname*{s-lim}_{r\to\infty} f(Q/r) = 1$ if $f$ satisfies Assumption 2.1. Furthermore, one has for each $x \in X \setminus \{0\}$
\[ \Big| \int_0^\infty \frac{d\mu}{\mu} \big[ f(\mu x) - \chi_{[0,1]}(\mu) \big] \Big| \le \int_0^1 \frac{d\mu}{\mu}\, |f(\mu x) - 1| + \mathrm{Const.} \int_1^{+\infty} d\mu\, \mu^{-(1+\rho)} < \infty. \]
Therefore the function $R_f : X \setminus \{0\} \to \mathbb C$ given by
\[ R_f(x) := \int_0^{+\infty} \frac{d\mu}{\mu} \big[ f(\mu x) - \chi_{[0,1]}(\mu) \big] \]
is well-defined (see [17, Sec. 2] and [48, Sec. 2] for a similar definition).

In the next lemma, we establish some differentiability properties of $R_f$. The symbol $\mathcal S(X)$ stands for the Schwartz space on $X$.

Lemma 2.2. Let $f$ satisfy Assumption 2.1.
(a) Assume that $(\partial_j f)(x)$ exists for all $j \in \{1, 2, \ldots, d\}$ and $x \in X$, and satisfies $|(\partial_j f)(x)| \le \mathrm{Const.}\, \langle x \rangle^{-(1+\rho)}$. Then $R_f$ is differentiable on $X \setminus \{0\}$, and its derivative is given by
\[ R_f'(x) = \int_0^\infty d\mu\, f'(\mu x). \tag{2.1} \]
Moreover, $R_f$ belongs to $C^\infty(X \setminus \{0\})$ if $f \in \mathcal S(X)$.
(b) Assume that $R_f$ belongs to $C^m(X \setminus \{0\})$ for some $m \ge 1$. Then one has for each $x \in X \setminus \{0\}$ and $t > 0$ the homogeneity properties
\[ x \cdot R_f'(x) = -1, \tag{2.2} \]
\[ t^{|\alpha|} (\partial^\alpha R_f)(tx) = (\partial^\alpha R_f)(x), \tag{2.3} \]
where $\alpha$ is a multi-index with $1 \le |\alpha| \le m$.
(c) Assume that $f$ is radial, i.e. there exists $f_0 \in L^\infty(\mathbb R)$ such that $f(x) = f_0(|x|)$ for a.e. $x \in X$. Then $R_f$ belongs to $C^\infty(X \setminus \{0\})$, and $R_f'(x) = -x^{-2} x$.

Proof. (a) The claim is a consequence of standard results on differentiation under the integral (see e.g. [28, Chap. 13, Lemma 2.2]).
(b) Let $x \in X \setminus \{0\}$ and $t > 0$. Then one has
\[ R_f(tx) = \int_0^\infty \frac{d\mu}{\mu} \big[ f(\mu t x) - \chi_{[0,1]}(\mu) \big] = \int_0^\infty \frac{d\mu}{\mu} \big[ f(\mu x) - \chi_{[0,1]}(\mu) \big] + \int_0^\infty \frac{d\mu}{\mu} \big[ \chi_{[0,1]}(\mu) - \chi_{[0,t]}(\mu) \big] = R_f(x) - \ln t, \tag{2.4} \]
and (2.2) follows by taking the derivative with respect to $t$ and by putting $t = 1$. Equation (2.3) follows by taking derivatives with respect to $x$.
(c) For $x \in X \setminus \{0\}$, one gets $R_{f_0}(1) = R_f(x) + \ln |x|$ by putting $t = |x|^{-1}$ in Eq. (2.4). This implies the claim.

In the sequel, we shall also need the function $F_f : X \setminus \{0\} \to \mathbb C$ defined by
\[ F_f(x) := \int_{\mathbb R} d\mu\, f(\mu x). \]
The function $F_f$ shares several properties with $R_f$. Here we only note that $F_f$ is well-defined if $f$ satisfies Assumption 2.1(i) with $\rho > 1$, and that
\[ F_f(x) = t F_f(tx) \tag{2.5} \]
for each $t > 0$ and each $x \in X \setminus \{0\}$. Physically, if $p \in \mathbb R^d$ and $f \ge 0$, then the number $F_f(p) \equiv \int_{\mathbb R} dt\, f(tp)$ can be seen as the sojourn time in the region defined by the localization function $f$ of a free classical particle moving along the trajectory $\mathbb R \ni t \mapsto x(t) := tp$.
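The scaling relations (2.4) and (2.5) can be checked numerically. The following is a minimal sketch, not part of the original argument: the localization function `f` (equal to 1 on the unit ball, with a Gaussian tail outside), the quadrature helper `simpson`, and the truncation `cutoff` are all assumptions chosen for illustration.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def norm(x):
    return math.sqrt(sum(c * c for c in x))

def f(x):
    """Hypothetical radial localization function: 1 on the unit ball, Gaussian tail outside."""
    r = norm(x)
    return 1.0 if r <= 1.0 else math.exp(-(r - 1.0) ** 2)

def scaled(mu, x):
    return [mu * c for c in x]

def R(x, cutoff=60.0):
    """R_f(x) = int_0^inf (dmu/mu) [f(mu x) - chi_[0,1](mu)], split at mu = 1."""
    def inner(mu):
        # The integrand vanishes near mu = 0 because f = 1 near the origin.
        return 0.0 if mu == 0.0 else (f(scaled(mu, x)) - 1.0) / mu
    def outer(mu):
        return f(scaled(mu, x)) / mu
    return simpson(inner, 0.0, 1.0, 2000) + simpson(outer, 1.0, cutoff, 20000)

def F(x, cutoff=60.0):
    """F_f(x) = int_R dmu f(mu x), truncated to [-cutoff, cutoff]."""
    return simpson(lambda mu: f(scaled(mu, x)), -cutoff, cutoff, 40000)

x = [1.2, -1.6]          # |x| = 2
t = 3.0
tx = scaled(t, x)
print("R_f(tx) - (R_f(x) - ln t) =", R(tx) - (R(x) - math.log(t)))
print("F_f(x) - t F_f(tx)        =", F(x) - t * F(tx))
```

Both printed differences should be small up to quadrature error. For this particular radial `f`, the sojourn time has the closed form $F_f(x) = (2 + \sqrt{\pi})/|x|$, which reflects the classical picture: a faster particle spends proportionally less time in the localization region.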
3. Integral Formula for $H_0 = h(P)$

Given a function $h \in C^1(X; \mathbb R)$, we denote by $\kappa(h)$ the set of critical values of $h$, i.e.
\[ \kappa(h) := \{ \lambda \in \mathbb R \mid \exists x \in X \text{ such that } h(x) = \lambda \text{ and } h'(x) = 0 \}. \]
The size and the topology of $\kappa(h)$ depend on the regularity and the behavior of the function $h$. Here we only recall some properties of $\kappa(h)$ (see [2, Sec. 7.6.2] for more details):
1. $H_0 = h(P)$, whose spectrum is $\sigma(H_0) = \overline{h(X)}$, has purely absolutely continuous spectrum in $\sigma(H_0) \setminus \kappa(h)$.
2. $H_0$ is purely absolutely continuous if $h^{-1}(\kappa(h))$ has measure zero.
3. $\kappa(h)$ has measure zero if $h \in C^d(X; \mathbb R)$, with $d$ the dimension of $X$.
4. $\kappa(h)$ is finite if $h$ is a polynomial.
5. $\kappa(h)$ is closed if $|h(x)| + |h'(x)| \to \infty$ as $|x| \to \infty$.
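As a small illustration of property 4, the set $\kappa(h)$ can be approximated numerically in dimension $d = 1$. The sketch below is only an example under stated assumptions: the polynomial $h(x) = x^4 - 2x^2$, the search window $[-3, 3]$, and the helper names `bisect_root` and `critical_values` are hypothetical choices, not objects from the paper.

```python
def bisect_root(g, a, b, tol=1e-13):
    """Root of g in [a, b], assuming g(a) and g(b) have opposite signs or one of them vanishes."""
    ga, gb = g(a), g(b)
    if ga == 0.0:
        return a
    if gb == 0.0:
        return b
    while b - a > tol:
        m = 0.5 * (a + b)
        gm = g(m)
        if gm == 0.0:
            return m
        if (ga < 0) == (gm < 0):
            a, ga = m, gm
        else:
            b = m
    return 0.5 * (a + b)

def critical_values(h, dh, lo, hi, steps=6001):
    """Approximate kappa(h) restricted to [lo, hi] for a one-dimensional symbol h."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    values = set()
    for a, b in zip(grid, grid[1:]):
        if dh(a) * dh(b) <= 0.0:          # sign change (or zero) of h' on [a, b]
            x0 = bisect_root(dh, a, b)
            values.add(round(h(x0), 9))   # rounding merges duplicates of the same value
    return sorted(values)

def h(x):
    return x**4 - 2 * x**2               # hypothetical example symbol

def dh(x):
    return 4 * x**3 - 4 * x              # its derivative h'

kappa = critical_values(h, dh, -3.0, 3.0)
print("approximate critical values:", kappa)
```

Here $h'$ vanishes at $x = -1, 0, 1$, so the critical values are $h(\pm 1) = -1$ and $h(0) = 0$. For this example symbol, energies away from these two values are the ones on which the cutoff functions $\eta \in C_c^\infty(\mathbb R \setminus \kappa(h))$ used below are supported.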
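In the sequel, we assume that $h$ satisfies the following.

Assumption 3.1. The function $h : X \to \mathbb R$ is of class $C^m$ for some $m \ge 2$, and satisfies the following conditions:
(i) $|h(x)| \to \infty$ as $|x| \to \infty$.
(ii) $\sum_{|\alpha| \le m} |(\partial^\alpha h)(x)| \le \mathrm{Const.}\, (1 + |h(x)|)$.

For each $s, t \in \mathbb R$, we denote by $\mathcal H^s_t(X)$ the usual weighted Sobolev space over $X$, namely, the completion of $\mathcal S(X)$ for the norm $\|\varphi\|_{\mathcal H^s_t(X)} := \| \langle P \rangle^s \langle Q \rangle^t \varphi \|$. We also set $\mathcal H^s(X) := \mathcal H^s_0(X)$ and $\mathcal H_t(X) := \mathcal H^0_t(X)$, and for each $t \ge 0$ we define
\[ \mathcal D_t^0(X) := \big\{ \varphi \in \mathcal H_t(X) \mid \eta(h(P))\varphi = \varphi \text{ for some } \eta \in C_c^\infty\big(\mathbb R \setminus \kappa(h)\big) \big\}. \]
The set $\mathcal D_t^0(X)$ is included in the subspace $\mathcal H_{\rm ac}(H_0)$ of absolute continuity of $H_0$, $\mathcal D_t^0(X)$ is dense in $\mathcal H(X)$ if $h^{-1}(\kappa(h))$ has measure zero, and $\mathcal D_{t_1}^0(X) \subset \mathcal D_{t_2}^0(X)$ if $t_1 \ge t_2$.

Lemma 3.2. Let $f$ satisfy Assumption 2.1, assume that $R_f$ belongs to $C^2(X \setminus \{0\})$, and let $h$ satisfy Assumption 3.1. Then the operator given by the formal expression
\[ A_f := Q \cdot R_f'(h'(P)) + R_f'(h'(P)) \cdot Q \tag{3.1} \]
is well-defined on $\mathcal D_1^0(X)$. In particular, $\{A_f, \mathcal D_1^0(X)\}$ is symmetric if $f$ is real and $h^{-1}(\kappa(h))$ has measure zero.

Proof. Let $\varphi \in \mathcal D_1^0(X)$ and choose $\eta \in C_c^\infty\big(\mathbb R \setminus \kappa(h)\big)$ such that $\eta(h(P))\varphi = \varphi$. Then there exists $c > 0$ such that $|h'(x)| > c$ for all $x \in h^{-1}(\operatorname{supp} \eta)$, due to Assumption 3.1(i) (see the discussion after [2, Proposition 7.6.6] for details). This together with Assumption 3.1(ii) implies that
\[ \big\| |h'(P)|^{-2} \eta(h(P)) (\partial^\alpha h)(P) \big\| < \infty \quad \text{and} \quad \big\| |h'(P)|^{-1} \eta(h(P)) \big\| < \infty \tag{3.2} \]
for any multi-index $\alpha$ with $|\alpha| \le 2$. Furthermore, the operator $(\partial^\alpha R_f)\big( \frac{h'(P)}{|h'(P)|} \big)$ is also bounded for $\alpha$ with $|\alpha| \le 2$, due to the compactness of $(\partial^\alpha R_f)(\mathbb S^{d-1})$. Therefore, using formula (2.3) with $t = |x|^{-1}$, we get the estimate
\[ \| A_f \varphi \| = \Big\| \Big\{ i \sum_{j \le d} (\partial_j h)'(P) \cdot (\partial_j R_f)'(h'(P)) + 2 R_f'(h'(P)) \cdot Q \Big\} \eta(h(P)) \varphi \Big\| \]
\[ \le \Big\| \sum_{j \le d} |h'(P)|^{-2} \eta(h(P)) (\partial_j h)'(P) \cdot (\partial_j R_f)'\Big( \frac{h'(P)}{|h'(P)|} \Big) \varphi \Big\| + \mathrm{Const.}\, \Big\| |h'(P)|^{-1} \eta(h(P)) R_f'\Big( \frac{h'(P)}{|h'(P)|} \Big) \cdot Q \varphi \Big\| \]
\[ \le \mathrm{Const.}\, \| \langle Q \rangle \varphi \|, \]
which implies the claim.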
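There are at least two cases where the operator $A_f$ takes a simple form. First, suppose that $h$ is a polynomial of degree 1, i.e. $h(x) = v_0 + v \cdot x$ for some $v_0 \in \mathbb R$ and $v \in X \setminus \{0\}$. Then the operator $R_f'(h'(P))$ reduces to the constant vector $R_f'(v)$, and $A_f = 2 R_f'(v) \cdot Q$. Secondly, suppose that $f$ is radial. Then one has $R_f'(x) = -x^{-2} x$ due to Lemma 2.2(c), and $A_f$ reduces to the operator
\[ A := -\Big( Q \cdot \frac{h'(P)}{h'(P)^2} + \frac{h'(P)}{h'(P)^2} \cdot Q \Big). \tag{3.3} \]
For instance, in the particular case where $h(x) = h_0(|x|)$ with $h_0' \ge 0$, one gets
\[ A_0 := -\Big( Q \cdot \frac{P}{|P|\, h_0'(|P|)} + \frac{P}{|P|\, h_0'(|P|)} \cdot Q \Big). \tag{3.4} \]
The next theorem is somewhat related to the usual result on the asymptotic velocity for Hamiltonians $H_0 = h(P)$ (see e.g. [2, Sec. 7.C], [20, Theorem 7.1.29], [22], and [40, Sec. 2]). The symbol $\mathcal F$ stands for the Fourier transformation.

Theorem 3.3. Let $f \in \mathcal S(X)$ be an even function such that $f = 1$ on a neighborhood of $0$. Let $h$ satisfy Assumption 3.1 with $m \ge 3$. Then we have for each $\varphi \in \mathcal D_2^0(X)$
\[ \lim_{r\to\infty} \int_0^\infty dt\, \Big\langle \varphi, \Big[ e^{ith(P)} f\big(\tfrac{Q}{r}\big) e^{-ith(P)} - e^{-ith(P)} f\big(\tfrac{Q}{r}\big) e^{ith(P)} \Big] \varphi \Big\rangle = \langle \varphi, A_f \varphi \rangle. \tag{3.5} \]

Proof. (i) Let $\varphi \in \mathcal D_2^0(X)$, take a real $\eta \in C_c^\infty\big(\mathbb R \setminus \kappa(h)\big)$ such that $\eta(h(P))\varphi = \varphi$, and set $\eta_t(P) := e^{ith(P)} \eta(h(P))$. Then we have
\[ \Big\langle \varphi, \Big[ e^{ith(P)} f\big(\tfrac{Q}{r}\big) e^{-ith(P)} - e^{-ith(P)} f\big(\tfrac{Q}{r}\big) e^{ith(P)} \Big] \varphi \Big\rangle \]
\[ = \int_X \underline{\mathrm d}x\, (\mathcal F f)(x) \Big\langle \varphi, \Big[ \eta_t(P) e^{i\frac{x}{r}\cdot Q} \eta_{-t}(P) - \eta_{-t}(P) e^{i\frac{x}{r}\cdot Q} \eta_t(P) \Big] \varphi \Big\rangle \]
\[ = \int_X \underline{\mathrm d}x\, (\mathcal F f)(x) \Big\langle \varphi, \Big[ e^{i\frac{x}{r}\cdot Q} \eta_t\big(P + \tfrac{x}{r}\big) \eta_{-t}(P) - \eta_{-t}(P) \eta_t\big(P - \tfrac{x}{r}\big) e^{i\frac{x}{r}\cdot Q} \Big] \varphi \Big\rangle \]
\[ = \int_X \underline{\mathrm d}x\, (\mathcal F f)(x) \Big\langle \varphi, \Big\{ \big( e^{i\frac{x}{r}\cdot Q} - 1 \big) \eta_t\big(P + \tfrac{x}{r}\big) \eta_{-t}(P) + \eta_{-t}(P) \Big[ \eta_t\big(P + \tfrac{x}{r}\big) - \eta_t\big(P - \tfrac{x}{r}\big) \Big] - \eta_{-t}(P) \eta_t\big(P - \tfrac{x}{r}\big) \big( e^{i\frac{x}{r}\cdot Q} - 1 \big) \Big\} \varphi \Big\rangle. \tag{3.6} \]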
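Since $f$ is even, $\mathcal F f$ is also even, and
\[ \int_X \underline{\mathrm d}x\, (\mathcal F f)(x) \Big\langle \varphi, \eta_{-t}(P) \Big[ \eta_t\big(P + \tfrac{x}{r}\big) - \eta_t\big(P - \tfrac{x}{r}\big) \Big] \varphi \Big\rangle = 0. \]
Thus Formula (3.6) and the change of variables $\mu := t/r$, $\nu := 1/r$, give
\[ \lim_{r\to\infty} \int_0^\infty dt\, \Big\langle \varphi, \Big[ e^{ith(P)} f\big(\tfrac{Q}{r}\big) e^{-ith(P)} - e^{-ith(P)} f\big(\tfrac{Q}{r}\big) e^{ith(P)} \Big] \varphi \Big\rangle = \lim_{\nu \searrow 0} \int_0^\infty d\mu \int_X \underline{\mathrm d}x\, K(\nu, \mu, x), \tag{3.7} \]
where
\[ K(\nu, \mu, x) := (\mathcal F f)(x) \Big\langle \varphi, \Big\{ \tfrac{1}{\nu} \big( e^{i\nu x \cdot Q} - 1 \big) \eta(h(P + \nu x))\, e^{i\frac{\mu}{\nu}[h(P + \nu x) - h(P)]} - \eta(h(P - \nu x))\, e^{i\frac{\mu}{\nu}[h(P - \nu x) - h(P)]}\, \tfrac{1}{\nu} \big( e^{i\nu x \cdot Q} - 1 \big) \Big\} \varphi \Big\rangle. \]

(ii) To prove the statement, we shall show that one may interchange the limit and the integrals in (3.7), by invoking Lebesgue's dominated convergence theorem. This will be done in (iii) below. If one assumes that these interchanges are justified for the moment, then direct calculations using the symmetry of $f$, Lemma 2.2(a), and Lemma 3.2 give
\[ \lim_{r\to\infty} \int_0^\infty dt\, \Big\langle \varphi, \Big[ e^{ith(P)} f\big(\tfrac{Q}{r}\big) e^{-ith(P)} - e^{-ith(P)} f\big(\tfrac{Q}{r}\big) e^{ith(P)} \Big] \varphi \Big\rangle \]
\[ = i \int_0^\infty d\mu \int_X \underline{\mathrm d}x\, (\mathcal F f)(x) \Big\{ \big\langle (x \cdot Q)\varphi, e^{i\mu x \cdot h'(P)} \varphi \big\rangle - \big\langle \varphi, e^{-i\mu x \cdot h'(P)} (x \cdot Q) \varphi \big\rangle \Big\} \]
\[ = \sum_{j \le d} \int_0^\infty d\mu \int_X \underline{\mathrm d}x\, [\mathcal F(\partial_j f)](x) \Big[ \big\langle Q_j \varphi, e^{i\mu x \cdot h'(P)} \varphi \big\rangle + \big\langle \varphi, e^{i\mu x \cdot h'(P)} Q_j \varphi \big\rangle \Big] \]
\[ = \sum_{j \le d} \int_0^\infty d\mu\, \Big[ \big\langle Q_j \varphi, (\partial_j f)(\mu h'(P)) \varphi \big\rangle + \big\langle \varphi, (\partial_j f)(\mu h'(P)) Q_j \varphi \big\rangle \Big] \]
\[ = \langle \varphi, A_f \varphi \rangle. \]

(iii) To interchange the limit $\nu \searrow 0$ and the integration over $\mu$ in (3.7), one has to bound $\int_X \underline{\mathrm d}x\, K(\nu, \mu, x)$ uniformly in $\nu$ by a function in $L^1((0, \infty), d\mu)$. We begin with the first term of $\int_X \underline{\mathrm d}x\, K(\nu, \mu, x)$:
\[ K_1(\nu, \mu) := \int_X \underline{\mathrm d}x\, (\mathcal F f)(x) \Big\langle \langle Q \rangle^2 \varphi, \tfrac{1}{\nu} \big( e^{i\nu x \cdot Q} - 1 \big) \langle Q \rangle^{-2}\, \eta(h(P + \nu x))\, e^{i\frac{\mu}{\nu}[h(P + \nu x) - h(P)]} \varphi \Big\rangle. \]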
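One has
\[ \Big\| \tfrac{1}{\nu} \big( e^{i\nu x \cdot Q} - 1 \big) \langle Q \rangle^{-2} \Big\| \le \mathrm{Const.}\, |x| \tag{3.8} \]
due to the spectral theorem and the mean value theorem. Since $\mathcal F f \in \mathcal S(X)$ it follows that
\[ |K_1(\nu, \mu)| \le \mathrm{Const.}, \tag{3.9} \]
and thus $K_1(\nu, \mu)$ is bounded uniformly in $\nu$ by a function in $L^1((0, 1], d\mu)$. For the case $\mu > 1$ we recall that there exists $c > 0$ such that $|h'(x)| > c$ for all $x \in h^{-1}(\operatorname{supp} \eta)$, due to Assumption 3.1(i). Therefore, the operator
\[ A_{j,\nu}(x) := (\mathcal F f)(x)\, \tfrac{1}{\nu} \big( e^{i\nu x \cdot Q} - 1 \big) \langle Q \rangle^{-2}\, \frac{\eta(h(P + \nu x)) (\partial_j h)(P + \nu x)}{|h'(P + \nu x)|^2} \]
satisfies for any integer $k \ge 1$ the bound $\| A_{j,\nu}(x) \| \le \mathrm{Const.}\, \langle x \rangle^{-k}$, due to Eqs. (3.2) and (3.8) and the rapid decay of $\mathcal F f$. So $K_1(\nu, \mu)$ can be written as
\[ K_1(\nu, \mu) = -i\mu^{-1} \sum_{j \le d} \int_X \underline{\mathrm d}x\, \big\langle \langle Q \rangle^2 \varphi, A_{j,\nu}(x) (\partial_j B_{\nu,\mu})(x) \varphi \big\rangle, \]
with $B_{\nu,\mu}(x) := e^{i\frac{\mu}{\nu}[h(P + \nu x) - h(P)]}$. Moreover, lengthy but direct calculations using Eq. (3.8) and Assumption 3.1(ii) show that $\| (\partial_j A_{j,\nu})(x) \| \le \mathrm{Const.}\, (1 + |\nu|) \langle x \rangle^{-k}$ and
\[ \Big\| \partial_\ell \Big\{ (\partial_j A_{j,\nu})(x)\, \frac{(\partial_\ell h)(P + \nu x)}{|h'(P + \nu x)|^2} \Big\} \Big\| \le \mathrm{Const.}\, \big( 1 + |\nu| + \nu^2 \big) \langle x \rangle^{-k} \tag{3.10} \]
for any integer $k \ge 1$. Therefore, one can perform two successive integrations by parts (with vanishing boundary contributions) and obtain
\[ K_1(\nu, \mu) = i\mu^{-1} \sum_{j \le d} \int_X \underline{\mathrm d}x\, \big\langle \langle Q \rangle^2 \varphi, (\partial_j A_{j,\nu})(x) B_{\nu,\mu}(x) \varphi \big\rangle \]
\[ = -\mu^{-2} \sum_{j,\ell \le d} \int_X \underline{\mathrm d}x\, \Big\langle \langle Q \rangle^2 \varphi, \partial_\ell \Big\{ (\partial_j A_{j,\nu})(x)\, \frac{(\partial_\ell h)(P + \nu x)}{|h'(P + \nu x)|^2} \Big\} B_{\nu,\mu}(x) \varphi \Big\rangle. \]
This together with Formula (3.10) implies that
\[ |K_1(\nu, \mu)| \le \mathrm{Const.}\, \mu^{-2} \quad \text{for each } \nu < 1 \text{ and each } \mu > 1. \tag{3.11} \]
The combination of the bounds (3.9) and (3.11) shows that $K_1(\nu, \mu)$ is bounded uniformly for $\nu < 1$ by a function in $L^1((0, \infty), d\mu)$. Since similar arguments show that the same holds for the second term of $\int_X \underline{\mathrm d}x\, K(\nu, \mu, x)$, one can interchange the limit $\nu \searrow 0$ and the integration over $\mu$ in (3.7).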
The interchange of the limit $\nu \searrow 0$ and the integration over $x$ in (3.7) is justified by the bound $|K(\nu, \mu, x)| \le \mathrm{Const.}\, |x (\mathcal F f)(x)|$, which follows from Formula (3.8).

Remark 3.4. We strongly believe that Formula (3.5) remains true for a large class of non-smooth even localization functions $f$ (such as characteristic functions, for instance). In the particular cases of the Schrödinger operator $h(x) = x^2$ and the one-dimensional Friedrichs model $h(x) = x$, similar results suggest that $f$ only has to decay to $0$ sufficiently fast at infinity (see [17, Proposition 4.5] and Sec. 5.1). Unfortunately, in the general situation, we have not been able to extend the proof of Theorem 3.3 to such a class of functions.

The next result follows directly from Lemma 2.2(c) and Theorem 3.3.

Corollary 3.5. Let $f \in \mathcal S(X)$ be a radial function such that $f = 1$ on a neighborhood of $0$. Let $h$ satisfy Assumption 3.1 with $m \ge 3$. Then we have for each $\varphi \in \mathcal D_2^0(X)$
\[ \lim_{r\to\infty} \int_0^\infty dt\, \Big\langle \varphi, \Big[ e^{ith(P)} f\big(\tfrac{Q}{r}\big) e^{-ith(P)} - e^{-ith(P)} f\big(\tfrac{Q}{r}\big) e^{ith(P)} \Big] \varphi \Big\rangle = \langle \varphi, A \varphi \rangle, \tag{3.12} \]
with $A$ defined by (3.3).

The rest of the section is devoted to the interpretation of Formula (3.12). We consider first the operator $A$ on the right-hand side. One has for each $\varphi \in \mathcal D_1^0(X)$
\[ [A, h(P)] \varphi = -2i \varphi, \tag{3.13} \]
which suggests that $A = -2i \frac{d}{dh(P)}$, with a slight abuse of notation. Thus, formally, $\frac{i}{2} A$ can be seen as the operator of differentiation with respect to the kinetic energy $h(P)$. In fact, this affirmation could be turned into a rigorous statement in many concrete situations. As an example, we present two particular cases where rigorous formulas can be easily obtained.

Case 1. Suppose that $f$ is radial and that $h$ is a polynomial of degree 1 satisfying the hypotheses of Corollary 3.5. Then $h(x) = v_0 + v \cdot x$ for some $v_0 \in \mathbb R$, $v \in X \setminus \{0\}$, and we have $h(X) = \mathbb R$ and $\kappa(h) = \emptyset$. So $H_0$ has purely absolutely continuous spectrum $\sigma(H_0) = \sigma_{\rm ac}(H_0) = \mathbb R$. Moreover, the operators $A \equiv -2 \frac{v}{v^2} \cdot Q$ and $h(P) \equiv v_0 + v \cdot P$ are selfadjoint, and have $\mathcal S(X)$ as a common core. The associated unitary groups $U(t) := e^{itA}$ and $V(s) := e^{ish(P)}$ are strongly continuous, and satisfy the Weyl relations
\[ U(t/2) V(s) = e^{its}\, V(s)\, U(t/2). \]
It follows by the Stone–von Neumann theorem [37, VIII.14] that there exists a unitary operator $U_1 : \mathcal H(X) \to L^2(\mathbb R; \mathbb C^N, d\lambda)$, with $N$ finite or infinite, such that $U_1 U(t/2) U_1^*$ is the group of translation to the left by $t$, and $U_1 V(s) U_1^*$ is the group of multiplication by $e^{is\lambda}$. In terms of the generators, this implies the following. We have $U_1 h(P) U_1^* = \lambda$, where "$\lambda$" stands for the multiplication operator by $\lambda$ in $L^2(\mathbb R; \mathbb C^N, d\lambda)$, and we have for each $\varphi \in \mathcal H(X)$ and $\phi \in \mathcal D_1^0(X)$
\[ \langle \varphi, A \phi \rangle = \int_{\mathbb R} d\lambda\, \Big\langle (U_1 \varphi)(\lambda), -2i\, \frac{d(U_1 \phi)}{d\lambda}(\lambda) \Big\rangle_{\mathbb C^N}, \tag{3.14} \]
where $\frac{d}{d\lambda}$ denotes the distributional derivative. For instance, in the case of the one-dimensional Friedrichs model ($h(x) = x$), one has $N = 1$, and $U_1$ reduces to the one-dimensional Fourier transform.
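For the one-dimensional model $h(x) = x$ of Case 1, Formula (3.12) can be tested numerically, since $e^{itP} f(Q/r) e^{-itP} = f((Q + t)/r)$ and $A = -2Q$. The sketch below is an illustration under explicit assumptions: the even function `f`, the Gaussian state with density `density`, the center `x0`, and the scale `r = 20` are hypothetical choices. A Gaussian does not belong to $\mathcal D_2^0(X)$ strictly speaking, and `f` is only $C^1$, so this is a plausibility check rather than an instance of the theorem.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def f(s):
    """Hypothetical even localization function on R: 1 on [-1, 1], Gaussian tails."""
    s = abs(s)
    return 1.0 if s <= 1.0 else math.exp(-(s - 1.0) ** 2)

def density(x, x0):
    """|phi(x)|^2 for a normalized Gaussian state centered at x0 (an assumption)."""
    return math.exp(-(x - x0) ** 2) / math.sqrt(math.pi)

def sojourn_difference(x, r, t_max):
    """int_0^t_max dt [f((x + t)/r) - f((x - t)/r)] at a fixed position x."""
    return simpson(lambda t: f((x + t) / r) - f((x - t) / r), 0.0, t_max, 2000)

def lhs(r, x0):
    """Left-hand side of (3.12) for h(x) = x, before taking the limit r -> infinity."""
    t_max = 10.0 * r                      # the integrand is negligible beyond this time
    g = lambda x: density(x, x0) * sojourn_difference(x, r, t_max)
    return simpson(g, x0 - 6.0, x0 + 6.0, 60)

x0 = 1.5
value = lhs(20.0, x0)
expected = -2.0 * x0                      # <phi, A phi> with A = -2 Q
print("left-hand side :", value)
print("<phi, A phi>   :", expected)
```

The agreement is already very close at finite `r` here: whenever $|x| \le r$, the time integral equals $-r \int_{-x/r}^{x/r} f(s)\, ds = -2x$ exactly for this choice of `f`, and the state is concentrated well inside that range.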
Case 2. Suppose that h is radial and satisfies the hypotheses of Corollary 3.5. Then there exists a function $h_0 \in C^3(\mathbb{R}; \mathbb{R})$ such that $h(x) = h_0(|x|)$ for each $x \in X$, and we have
\[ κ_0 := κ(h) = \{λ \in \mathbb{R} \mid \exists\, ρ \in [0,\infty) \text{ such that } h_0(ρ) = λ \text{ and } h_0'(ρ) = 0\}. \]
In particular, $κ_0$ is closed, and it has measure zero due to Sard's Theorem. We also assume that $h_0' \ge 0$ on $[0,\infty)$ (so that $h_0^{-1}(λ)$ is unique for each $λ \in h_0([0,\infty))\setminus κ_0$) and that $h_0^{-1}(κ_0)$ has measure zero. These assumptions are satisfied by many physical Hamiltonians such as the Schrödinger operator ($h_0(ρ) = ρ^2$) or the square-root Klein–Gordon operator ($h_0(ρ) = \sqrt{1+ρ^2}$). Taking advantage of the spherical coordinates, one can derive a spectral transformation $\mathscr{U}_0$ for $h(P) \equiv h_0(|P|)$.

Lemma 3.6. Let $h_0$ be as above. Then the mapping $\mathscr{U}_0 : \mathcal{H}(X) \to \int^{\oplus}_{h_0([0,\infty))} dλ\, L^2(\mathbb{S}^{d-1})$ defined by
\[ (\mathscr{U}_0ϕ)(λ,ω) := \left(\frac{\big(h_0^{-1}(λ)\big)^{d-1}}{h_0'\big(h_0^{-1}(λ)\big)}\right)^{1/2} (\mathscr{F}ϕ)\big(h_0^{-1}(λ)\,ω\big) \tag{3.15} \]
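for each $ϕ \in \mathcal{H}(X)$, $λ \in h_0([0,\infty))\setminus κ_0$, and $ω \in \mathbb{S}^{d-1}$, is unitary and satisfies
\[ \mathscr{U}_0\, h_0(|P|)\, \mathscr{U}_0^* = \int^{\oplus}_{h_0([0,\infty))} dλ\, λ. \tag{3.16} \]
Moreover, one has for each $ϕ \in \mathcal{H}(X)$ and $φ \in \mathscr{D}_1^0(X)$
\[ \langle ϕ, A_0φ\rangle = \int_{h_0([0,\infty))} dλ\, \Big\langle (\mathscr{U}_0ϕ)(λ,\cdot),\, -2i\,\frac{d(\mathscr{U}_0φ)}{dλ}(λ,\cdot)\Big\rangle_{L^2(\mathbb{S}^{d-1})}, \tag{3.17} \]
where $\frac{d}{dλ}$ denotes the distributional derivative.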
Time Delay for Dispersive Systems in Quantum Scattering Theory
Note that Formula (3.16) (or the fact that $h_0^{-1}(κ_0)$ has measure zero) implies that $h(P) = h_0(|P|)$ has purely absolutely continuous spectrum. In the case $h_0(ρ) = ρ^2$, $\mathscr{U}_0$ reduces to the usual spectral transformation for the Schrödinger operator [24, Sec. 2]:
\[ (\mathscr{U}_0ϕ)(λ,ω) = 2^{-1/2}\, λ^{(d-2)/4}\, (\mathscr{F}ϕ)(λ^{1/2}ω). \]
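As a numerical sanity check of the isometry property in the Schrödinger case, the following sketch integrates $|(\mathscr{U}_0ϕ)(λ,ω)|^2$ over λ and ω for a normalized Gaussian. The dimension d = 3, the Gaussian test vector, and all function names are illustrative assumptions and do not come from the text.

```python
import numpy as np

D = 3  # space dimension (illustrative choice)

def fourier_gaussian_sq(rho):
    """|(F phi)(xi)|^2 at |xi| = rho for the L^2(R^3)-normalized Gaussian
    (F phi)(xi) = pi^(-3/4) exp(-|xi|^2 / 2); it is radial, so omega plays no role."""
    return np.pi ** (-1.5) * np.exp(-rho ** 2)

def u0_norm_squared(lam_max=60.0, n=600_001):
    """Riemann sum for the integral over lambda and over the sphere S^2 of
    |(U0 phi)(lambda, omega)|^2, with
    (U0 phi)(lambda, omega) = 2^(-1/2) lambda^((d-2)/4) (F phi)(lambda^(1/2) omega)."""
    lam = np.linspace(0.0, lam_max, n)
    dlam = lam[1] - lam[0]
    sphere_area = 4.0 * np.pi  # surface measure of S^2 for d = 3
    density = 0.5 * lam ** ((D - 2) / 2.0) * fourier_gaussian_sq(np.sqrt(lam))
    return sphere_area * density.sum() * dlam

norm_sq = u0_norm_squared()
print(norm_sq)  # expected to be close to the squared norm of phi, i.e. 1
```

The substitution $ρ = λ^{1/2}$ turns the λ-integral into the usual radial integral $\int_0^\infty ρ^{d-1}\,dρ$, which is why the printed value should agree with $\|ϕ\|^2$ up to discretization error.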
Proof. A direct calculation using the spherical coordinates and the fact that $κ_0$ and $h_0^{-1}(κ_0)$ have measure zero shows that $\|\mathscr{U}_0ϕ\|^2 = \|ϕ\|^2$ for each $ϕ \in \mathcal{H}(X)$. Thus $\mathscr{U}_0$ is an isometry. Furthermore, for each $ψ \in \int^{\oplus}_{h_0([0,\infty))} dλ\, L^2(\mathbb{S}^{d-1})$ and $ξ \in X\setminus\{0\}$, one can check that
\[ \mathscr{U}_0^*ψ = \mathscr{F}^{-1}\tilde{ψ}, \quad\text{where}\quad \tilde{ψ}(ξ) := \left(\frac{h_0'(|ξ|)}{|ξ|^{d-1}}\right)^{1/2} ψ\Big(h_0(|ξ|), \frac{ξ}{|ξ|}\Big). \tag{3.18} \]
Thus $\mathscr{U}_0\mathscr{U}_0^* = 1$, and $\mathscr{U}_0$ is unitary. Formulas (3.16) and (3.17) follow by using (3.15), (3.18), and the definition (3.4) of $A_0$.

Formulas (3.14) and (3.17) provide (at least when h is radial or a polynomial of degree 1) a rigorous meaning to the right-hand side of Formula (3.12). They imply that A acts in the spectral representation of h(P) as $-2i\frac{d}{dλ}$, where λ is the spectral variable. What about the left-hand side of Formula (3.12)? For r fixed, it can be interpreted as the difference of times spent by the evolving state $e^{-ith(P)}ϕ$ in the past (t ≤ 0) and in the future (t ≥ 0) within the region defined by the localization operator f(Q/r). Thus, Formula (3.12) shows (at least when h is radial or a polynomial of degree 1) that this difference of times tends as r → ∞ to the expectation value in ϕ of the operator $-2i\frac{d}{dλ}$ in the spectral representation of h(P).

4. Time Delay

In this section, we prove the existence of time delay for scattering systems with free Hamiltonian $H_0 = h(P)$ and full Hamiltonian H. The function $h : X \to \mathbb{R}$ satisfies Assumption 3.1, and the full Hamiltonian H can be any selfadjoint operator in $\mathcal{H}(X)$ satisfying Assumption 4.1 below. Given two Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$, we write $\mathscr{B}(\mathcal{H}_1, \mathcal{H}_2)$ for the set of bounded operators from $\mathcal{H}_1$ to $\mathcal{H}_2$, and put $\mathscr{B}(\mathcal{H}_1) := \mathscr{B}(\mathcal{H}_1, \mathcal{H}_1)$. The definition of complete wave operators is given in [36, Sec. XI.3].

Assumption 4.1. The wave operators $W_\pm$ exist and are complete, and any operator $T \in \mathscr{B}\big(\mathcal{H}_{-ρ}(X), \mathcal{H}(X)\big)$, with $ρ > \frac12$, is locally H-smooth on $\mathbb{R}\setminus\{κ(h) \cup σ_{\rm pp}(H)\}$.

Under Assumption 3.1 it is known that each operator $T \in \mathscr{B}\big(\mathcal{H}_{-ρ}(X), \mathcal{H}(X)\big)$, with $ρ > \frac12$, is locally h(P)-smooth on $\mathbb{R}\setminus κ(h)$ (see [2, Proposition 7.6.6] and [2, Theorem 3.4.3(a)]). Therefore, if r > 0 and $ϕ \in \mathscr{D}_0^0(X)$, then $T_r^0(ϕ)$ is finite for each function f satisfying Assumption 2.1(i) with ρ > 1. The number $T_r(ϕ)$ is finite
under similar conditions. Indeed, define for each t ≥ 0
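\[ \mathscr{D}_t(X) := \big\{ϕ \in \mathcal{H}_t(X) \mid η(h(P))ϕ = ϕ \text{ for some } η \in C_c^\infty\big(\mathbb{R}\setminus\{κ(h) \cup σ_{\rm pp}(H)\}\big)\big\}. \]
Then $T_r(ϕ)$, with $ϕ \in \mathscr{D}_0(X)$, is finite for each function f satisfying Assumption 2.1(i) with ρ > 1 due to Assumption 4.1. Obviously, the set $\mathscr{D}_t(X)$ satisfies properties similar to those of $\mathscr{D}_t^0(X)$: $\mathscr{D}_t(X) \subset \mathcal{H}_{\rm ac}(H_0)$, $\mathscr{D}_t(X)$ is dense in $\mathcal{H}(X)$ if $h^{-1}(κ(h) \cup σ_{\rm pp}(H))$ has measure zero, and $\mathscr{D}_{t_1}(X) \subset \mathscr{D}_{t_2}(X)$ if $t_1 \ge t_2$. For each r > 0, we define the number
\[ τ_r^{\rm free}(ϕ) := \frac12 \int_0^\infty dt\, \Big\langle ϕ,\, S^*\Big[e^{itH_0} f\Big(\frac{Q}{r}\Big) e^{-itH_0} - e^{-itH_0} f\Big(\frac{Q}{r}\Big) e^{itH_0},\, S\Big]ϕ\Big\rangle, \tag{4.1} \]
which is finite for all $ϕ \in \mathscr{D}_0^0(X)$. We refer the reader to [5, Eqs. (93) and (96)], [17, Eq. (4.1)], and [47, Sec. 2.1] for similar definitions when $H_0$ is the free Schrödinger operator. The usual definition can be found in [3, Eq. (3)], [24, Eq. (6.2)], and [30, Eq. (5)]. We recall that the symmetrized time delay $τ_r(ϕ)$ is defined in Eq. (1.1). The symbol $\mathbb{R}_\pm$ stands for $\mathbb{R}_\pm := \{x \in \mathbb{R} \mid \pm x \ge 0\}$.

Lemma 4.2. Let f ≥ 0 satisfy Assumption 2.1 with ρ > 1. Suppose that Assumption 4.1 holds. Let $ϕ \in \mathscr{D}_0(X)$ be such that
\[ \big\|(W_- - 1)e^{-itH_0}ϕ\big\| \in L^1(\mathbb{R}_-, dt) \tag{4.2} \]
and
\[ \big\|(W_+ - 1)e^{-itH_0}Sϕ\big\| \in L^1(\mathbb{R}_+, dt). \tag{4.3} \]
Then
\[ \lim_{r\to\infty}\big[τ_r(ϕ) - τ_r^{\rm free}(ϕ)\big] = 0. \]

Proof. One has for $ϕ \in \mathscr{D}_0(X)$
\[
\begin{aligned}
τ_r(ϕ) - τ_r^{\rm free}(ϕ) ={}& \int_0^\infty dt\, \Big[\big\|f(Q/r)^{1/2} e^{-itH}W_-ϕ\big\|^2 - \big\|f(Q/r)^{1/2} e^{-itH_0}Sϕ\big\|^2\Big] \\
&+ \int_{-\infty}^0 dt\, \Big[\big\|f(Q/r)^{1/2} e^{-itH}W_-ϕ\big\|^2 - \big\|f(Q/r)^{1/2} e^{-itH_0}ϕ\big\|^2\Big].
\end{aligned}
\tag{4.4}
\]
Using the inequality
\[ \big|\,\|ϕ\|^2 - \|φ\|^2\big| \le \|ϕ - φ\|\cdot\big(\|ϕ\| + \|φ\|\big), \qquad ϕ, φ \in \mathcal{H}(X), \]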
the completeness of $W_\pm$, and the fact that $ϕ \in \mathcal{H}_{\rm ac}(H_0)$, we obtain the estimates
\[ \Big|\,\big\|f(Q/r)^{1/2} e^{-itH}W_-ϕ\big\|^2 - \big\|f(Q/r)^{1/2} e^{-itH_0}ϕ\big\|^2\Big| \le \text{Const.}\, g_-(t)\,\|ϕ\| \tag{4.5} \]
\[ \Big|\,\big\|f(Q/r)^{1/2} e^{-itH}W_-ϕ\big\|^2 - \big\|f(Q/r)^{1/2} e^{-itH_0}Sϕ\big\|^2\Big| \le \text{Const.}\, g_+(t)\,\|ϕ\|, \tag{4.6} \]
where
\[ g_-(t) := \big\|(W_- - 1)e^{-itH_0}ϕ\big\| \quad\text{and}\quad g_+(t) := \big\|(W_+ - 1)e^{-itH_0}Sϕ\big\|. \]
Since $\text{s-}\lim_{r\to\infty} f(Q/r)^{1/2} = 1$, the scalars on the left-hand side of (4.5)–(4.6) converge (for t fixed) to zero as r → ∞. Furthermore, we know from Hypotheses (4.2)–(4.3) that $g_\pm \in L^1(\mathbb{R}_\pm, dt)$. Therefore the claim follows from (4.4) and Lebesgue's dominated convergence theorem.

The next theorem shows the existence of symmetrized time delay. It is a direct consequence of Lemma 4.2, Definition (4.1), and Theorem 3.3.

Theorem 4.3. Let f ≥ 0 be an even function in $\mathscr{S}(X)$ such that f = 1 on a neighborhood of 0. Let h satisfy Assumption 3.1 with m ≥ 3. Suppose that Assumption 4.1 holds. Let $ϕ \in \mathscr{D}_2(X)$ satisfy $Sϕ \in \mathscr{D}_2(X)$ and (4.2)–(4.3). Then one has
\[ \lim_{r\to\infty} τ_r(ϕ) = \frac12 \big\langle ϕ, S^*[A_f, S]ϕ\big\rangle, \tag{4.7} \]
with $A_f$ defined by (3.1).

Remark 4.4. The result of Theorem 4.3 is of particular interest when the localization function f is radial. In such a case $A_f = A$ due to Lemma 2.2(c), and (4.7) reduces to
\[ \lim_{r\to\infty} τ_r(ϕ) = \frac12 \big\langle ϕ, S^*[A, S]ϕ\big\rangle. \tag{4.8} \]
Since A is formally equal to $-2i\frac{d}{dH_0}$, this equation expresses the identity of symmetrized time delay (defined in terms of sojourn times) and Eisenbud–Wigner time delay for dispersive Hamiltonians $H_0 = h(P)$. To show this more rigorously, let us suppose that $H_0$ is purely absolutely continuous. In such a case there exist Hilbert spaces $\{\mathcal{H}_λ\}_{λ \in σ(H_0)}$ and a unitary operator $\mathscr{U} : \mathcal{H}(X) \to \int^{\oplus}_{σ(H_0)} dλ\, \mathcal{H}_λ$ such that $\mathscr{U}H_0\mathscr{U}^* = \int^{\oplus}_{σ(H_0)} dλ\, λ$ and $\mathscr{U}S\mathscr{U}^* = \int^{\oplus}_{σ(H_0)} dλ\, S(λ)$, with S(λ) unitary in $\mathcal{H}_λ$ (see e.g. [6, Proposition 5.29]). Assume, by analogy to (3.14) and (3.17), that A satisfies
for each $ϕ \in \mathcal{H}(X)$ and $φ \in \mathscr{D}_1^0(X)$
\[ \langle ϕ, Aφ\rangle = \int_{σ(H_0)} dλ\, \Big\langle (\mathscr{U}ϕ)(λ),\, -2i\,\frac{d(\mathscr{U}φ)}{dλ}(λ)\Big\rangle_{\mathcal{H}_λ}. \tag{4.9} \]
Assume also that the scattering matrix $σ(H_0) \ni λ \mapsto S(λ) \in \mathscr{B}(\mathcal{H}_λ)$ is strongly continuously differentiable on the support of $\mathscr{U}ϕ$. Then (4.8) can be rewritten as
\[ \lim_{r\to\infty} τ_r(ϕ) = \int_{σ(H_0)} dλ\, \Big\langle (\mathscr{U}ϕ)(λ),\, -iS(λ)^*\,\frac{dS(λ)}{dλ}\,(\mathscr{U}ϕ)(λ)\Big\rangle_{\mathcal{H}_λ}. \]

Remark 4.5. One can exhibit the Eisenbud–Wigner contribution to symmetrized time delay even if the localization function f is not radial. Indeed, by using Formula (2.4), one gets that $A_f = A + \tilde{A}_f$, where
\[ \tilde{A}_f := Q\cdot \tilde{R}_f'(h'(P)) + \tilde{R}_f'(h'(P))\cdot Q \quad\text{and}\quad \tilde{R}_f(x) := R_f\Big(\frac{x}{|x|}\Big) \]
for each $x \in X\setminus\{0\}$. Thus Formula (4.7) always implies that
\[ \lim_{r\to\infty} τ_r(ϕ) = \frac12 \big\langle ϕ, S^*[A, S]ϕ\big\rangle + \frac12 \big\langle ϕ, S^*[\tilde{A}_f, S]ϕ\big\rangle. \]
As noted in Remark 4.4, the first term corresponds to the usual Eisenbud–Wigner time delay. The second term corresponds to the contribution of the non-radial component of the localization function f. Due to Eq. (2.2), one has on $\mathscr{D}_1^0(X)$
\[ e^{itH_0}\,\tilde{A}_f\, e^{-itH_0} = \tilde{A}_f. \]
Basically, this means that $\tilde{A}_f$ (and thus $S^*[\tilde{A}_f, S]$) is decomposable in the spectral representation of $H_0$. If h is radial and satisfies the hypotheses of Lemma 3.6, one can even determine the restriction $\tilde{A}_f(λ)$ to the fiber at energy λ by using the spectral transformation $\mathscr{U}_0$ ($\tilde{A}_f(λ)$ is a symmetric first order differential operator on $\mathbb{S}^{d-1}$ with non-constant coefficients). To sum up, the operator $S^*[\tilde{A}_f, S]$ is always decomposable in the spectral representation of $H_0$ under some technical assumptions, but its restriction to the fiber at energy λ is a much more complicated operator than $-iS(λ)^*\frac{dS(λ)}{dλ}$. Some information on this matter can be found in [17, Sec. D] in the particular case of the Schrödinger operator ($h(x) = x^2$).

Now, we give conditions under which one has
\[ \lim_{r\to\infty}\big[T_r^0(Sϕ) - T_r^0(ϕ)\big] = 0. \tag{4.10} \]
This implies the equality of time delay and symmetrized time delay as r → ∞:
\[ \lim_{r\to\infty}\big[τ_r^{\rm in}(ϕ) - τ_r(ϕ)\big] = 0. \]
Physically, (4.10) means that the freely evolving states $e^{-itH_0}ϕ$ and $e^{-itH_0}Sϕ$ tend to spend the same time within the region defined by the localization function f(Q/r) as r → ∞. Formally, the proof of (4.10) goes as follows. Suppose that $F_f(h'(P))$, with $F_f$ defined in Sec. 2, commutes with the scattering operator S. Then, using the change of variables µ := t/r, ν := 1/r, and the symmetry of f, one gets
\[
\begin{aligned}
\lim_{r\to\infty}\big[T_r^0(Sϕ) - T_r^0(ϕ)\big]
&= \lim_{r\to\infty} \int_{\mathbb{R}} dt\, \Big\langle ϕ, S^*\Big[e^{ith(P)} f\Big(\frac{Q}{r}\Big) e^{-ith(P)}, S\Big]ϕ\Big\rangle - \big\langle ϕ, S^*[F_f(h'(P)), S]ϕ\big\rangle \\
&= \lim_{ν\searrow 0} \int_{\mathbb{R}} dµ\, \Big\langle ϕ, S^*\Big[\frac1ν\big\{f(νQ + µh'(P)) - f(µh'(P))\big\}, S\Big]ϕ\Big\rangle \\
&= \int_{\mathbb{R}} dµ\, \big\langle ϕ, S^*[Q\cdot f'(µh'(P)), S]ϕ\big\rangle \\
&= 0.
\end{aligned}
\]
The rigorous proof will be given in Theorem 4.8 below. Before this we introduce assumptions on h slightly stronger than Assumption 3.1, and we prove a technical lemma.

Assumption 4.6. The function $h : X \to \mathbb{R}$ is of class $C^m$ for some m ≥ 2, and satisfies the following conditions:
(i) $|h(x)| \to \infty$ as $|x| \to \infty$.
(ii) $\sum_{|α| \le m} |(\partial^α h)(x)| \le \text{Const.}\,(1 + |h(x)|)$.
(iii) $\sum_{|α| = m} |(\partial^α h)(x)| \le \text{Const.}$

Assumption 4.6 appears naturally when one studies the spectral and scattering theory of pairs $\{H_0 = h(P), H\}$ using commutator methods (see e.g. [2, Sec. 7.6.3] and [42, Sec. 2.1]). Assumption 4.6(i) is related to the closedness of κ(h), whereas Assumptions 4.6(ii), (iii) are related to the polynomial growth of the group $\{e^{ix\cdot Q}\}$ in $D(H_0)$ and $D(|H_0|^{1/2})$. We say that functions h satisfying Assumption 4.6 are of hypoelliptic type, by reference to hypoelliptic polynomials of degree m which also satisfy Assumption 4.6 (see [21, Theorem 11.1.3]). A typical example one should keep in mind is the case where h is an elliptic symbol of degree s > 0, i.e. $h \in C^\infty(X; \mathbb{R})$, $|(\partial^α h)(x)| \le c_α \langle x\rangle^{s-|α|}$ for each multi-index α, and $|h(x)| \ge c|x|^s$, for some c > 0, outside a compact set.

Lemma 4.7. Let h satisfy Assumption 4.6, and take $η \in C_c^\infty\big(\mathbb{R}\setminus κ(h)\big)$. Then one has for each $µ \in \mathbb{R}$, $x \in X$, and $|ν| < 1$
\[ \Big\|\frac1ν\Big\{η(h(P+νx))\, e^{i\frac{µ}{ν}[h(P+νx) - h(P)]} - η(h(P))\, e^{iµx\cdot h'(P)}\Big\}\Big\| \le \text{Const.}\,(1 + |µ|)\langle x\rangle^{m+2}. \]
Proof. Due to the spectral theorem and the mean value theorem, one has
\[ \Big\|\frac1ν\Big\{η(h(P+νx))\, e^{i\frac{µ}{ν}[h(P+νx) - h(P)]} - η(h(P))\, e^{iµx\cdot h'(P)}\Big\}\Big\| \le \sup_{y \in X,\ ξ \in [0,1]} \big|g_y'(ξν)\big|, \tag{4.11} \]
where
\[ g_y(ν) := η(h(y+νx))\, e^{i\frac{µ}{ν}[h(y+νx) - h(y)]} = η(h(y+νx)) \exp\Big\{iµ \sum_{|α|=1} x^α \int_0^1 dt\, (\partial^α h)(y + tνx)\Big\}. \]
Direct calculations using Assumption 4.6(ii) show that
\[ \sup_{ξ \in [0,1]} \big|g_y'(ξν)\big| \le \text{Const.}\,|x| + \text{Const.}\,\langle x\rangle^2 |µ| \sup_{ξ,t \in [0,1]} \big|η(h(y+ξνx))\big|\,\big(1 + |h(y+tξνx)|\big). \tag{4.12} \]
Then one can use Taylor's Formula [2, Eq. (1.1.8)]
\[
\begin{aligned}
h(y+tξνx) ={}& \sum_{|α| < m} \frac{[(t-1)ξν]^{|α|}\, x^α}{α!}\, (\partial^α h)(y+ξνx) \\
&+ m[(t-1)ξν]^m \sum_{|α| = m} \frac{x^α}{α!} \int_0^1 dτ\, (\partial^α h)\big(y + ξνx + τ(t-1)ξνx\big)(1-τ)^{m-1}
\end{aligned}
\]
to get a bound for $|h(y+tξνx)|$ in terms of $|h(y+ξνx)|$. Indeed, using the formula above and Assumptions 4.6(ii)–(iii), one obtains that
\[ |h(y+tξνx)| \le \text{Const.}\,\langle ν\rangle^{m-1}\langle x\rangle^{m-1}\big(1 + |h(y+ξνx)|\big) + \text{Const.}\,|ν|^m |x|^m. \]
This, together with the bounds (4.11), (4.12) and Assumption 4.6(ii), implies the claim.

Theorem 4.8. Let $f \in \mathscr{S}(X)$ be even, let h satisfy Assumption 4.6 with m ≥ 3, and suppose that Assumption 4.1 holds. If $ϕ \in \mathscr{D}_2^0(X)$ satisfies $Sϕ \in \mathscr{D}_2^0(X)$ and
\[ [F_f(h'(P)), S]ϕ = 0, \tag{4.13} \]
then one has
\[ \lim_{r\to\infty}\big[T_r^0(Sϕ) - T_r^0(ϕ)\big] = 0. \tag{4.14} \]
In particular, time delay and symmetrized time delay satisfy
\[ \lim_{r\to\infty}\big[τ_r^{\rm in}(ϕ) - τ_r(ϕ)\big] = 0. \tag{4.15} \]
The left-hand side in (4.13) is well-defined due to Eq. (2.5). Indeed, one has
\[ [F_f(h'(P)), S]ϕ = \Big[|h'(P)|^{-1}\, η(h(P))\, F_f\Big(\frac{h'(P)}{|h'(P)|}\Big), S\Big]ϕ \]
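for some $η \in C_c^\infty\big(\mathbb{R}\setminus κ(h)\big)$, and thus $[F_f(h'(P)), S]ϕ \in \mathcal{H}(X)$ by (3.2) and the compactness of $F_f(\mathbb{S}^{d-1})$.

Proof. Let $ϕ \in \mathscr{D}_2^0(X)$, take a real $η \in C_c^\infty\big(\mathbb{R}\setminus κ(h)\big)$ such that $η(h(P))ϕ = ϕ$, and set $η_t(P) := e^{ith(P)}η(h(P))$. Using (4.13) and the change of variables µ := t/r, ν := 1/r, one gets
\[
\begin{aligned}
T^0_{1/ν}(Sϕ) - T^0_{1/ν}(ϕ)
={}& \int_{\mathbb{R}} dµ\, \Big\langle ϕ, S^*\Big[\frac1ν\big\{η_{µ/ν}(P)\, f(νQ)\, η_{-µ/ν}(P) - f(µh'(P))\big\}, S\Big]ϕ\Big\rangle \\
={}& \int_{\mathbb{R}} dµ \int_X dx\, (\mathscr{F}f)(x)\, \Big\langle ϕ, S^*\Big[\frac1ν\big\{e^{iνx\cdot Q}\, η_{µ/ν}(P+νx)\, η_{-µ/ν}(P) - e^{iµx\cdot h'(P)}\big\}, S\Big]ϕ\Big\rangle \\
={}& \int_{\mathbb{R}} dµ \int_X dx\, (\mathscr{F}f)(x)\, \Big\langle ϕ, S^*\Big[\frac1ν\big(e^{iνx\cdot Q} - 1\big)\, η(h(P+νx))\, e^{i\frac{µ}{ν}[h(P+νx) - h(P)]}, S\Big]ϕ\Big\rangle \\
&+ \int_{\mathbb{R}} dµ \int_X dx\, (\mathscr{F}f)(x)\, \Big\langle ϕ, S^*\Big[\frac1ν\Big\{η(h(P+νx))\, e^{i\frac{µ}{ν}[h(P+νx) - h(P)]} - η(h(P))\, e^{iµx\cdot h'(P)}\Big\}, S\Big]ϕ\Big\rangle.
\end{aligned}
\tag{4.16}
\]
To prove the statement, it is sufficient to show that the limit as $ν \searrow 0$ of each of these two terms is equal to zero. This is done in points (i) and (ii) below.

(i) One can adapt the method of Theorem 3.3 (point (iii) of the proof) in order to apply Lebesgue's dominated convergence theorem to the first term of (4.16). So one gets
\[
\begin{aligned}
&\lim_{ν\searrow 0} \int_{\mathbb{R}} dµ \int_X dx\, (\mathscr{F}f)(x)\, \Big\langle ϕ, S^*\Big[\frac1ν\big(e^{iνx\cdot Q} - 1\big)\, η(h(P+νx))\, e^{i\frac{µ}{ν}[h(P+νx) - h(P)]}, S\Big]ϕ\Big\rangle \\
&\quad = i \int_{\mathbb{R}} dµ \int_X dx\, (\mathscr{F}f)(x)\, \Big\{\big\langle (x\cdot Q)Sϕ,\, e^{iµx\cdot h'(P)}Sϕ\big\rangle - \big\langle (x\cdot Q)ϕ,\, e^{iµx\cdot h'(P)}ϕ\big\rangle\Big\},
\end{aligned}
\]
and the change of variables µ' := −µ, x' := −x, together with the symmetry of f, implies that this expression is equal to zero.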
(ii) We have to show that the limit
\[ \lim_{ν\searrow 0} \int_{\mathbb{R}} dµ \int_X dx\, (\mathscr{F}f)(x)\, \Big\langle ϕ, S^*\Big[\frac1ν\Big\{η(h(P+νx))\, e^{i\frac{µ}{ν}[h(P+νx) - h(P)]} - η(h(P))\, e^{iµx\cdot h'(P)}\Big\}, S\Big]ϕ\Big\rangle \tag{4.17} \]
is equal to zero. For the moment, let us assume that we can interchange the limit and the integrals in (4.17), by invoking Lebesgue's dominated convergence theorem. Since
\[ \frac{d}{dν}\Big[η(h(P+νx))\, e^{i\frac{µ}{ν}[h(P+νx) - h(P)]}\Big]\Big|_{ν=0} = x\cdot h'(P)\, η'(h(P))\, e^{iµx\cdot h'(P)} + \frac{iµ}{2}\, η(h(P)) \sum_{|α|=2} x^α (\partial^α h)(P)\, e^{iµx\cdot h'(P)}, \]
one gets in such a case that the limit (4.17) is equal to
\[
\begin{aligned}
&\int_{\mathbb{R}} dµ \int_X dx\, (\mathscr{F}f)(x)\, \big\langle ϕ, S^*[x\cdot h'(P)\, η'(h(P))\, e^{iµx\cdot h'(P)}, S]ϕ\big\rangle \\
&\quad + \frac{i}{2} \sum_{|α|=2} \int_{\mathbb{R}} dµ\, µ \int_X dx\, x^α (\mathscr{F}f)(x)\, \big\langle ϕ, S^*[(\partial^α h)(P)\, e^{iµx\cdot h'(P)}, S]ϕ\big\rangle.
\end{aligned}
\]
Then the change of variables µ' := −µ, x' := −x, together with the symmetry of f, implies that this expression is equal to zero.

It remains to show that one can apply Lebesgue's dominated convergence theorem to (4.17). Since ϕ and Sϕ belong to the same set $\mathscr{D}_2^0(X)$ it is enough to treat the limit $\lim_{ν\searrow 0} \int_{\mathbb{R}} dµ\, L(ν,µ)$, where
\[ L(ν,µ) := \int_X dx\, (\mathscr{F}f)(x)\, \Big\langle ϕ, \frac1ν\Big\{η(h(P+νx))\, e^{i\frac{µ}{ν}[h(P+νx) - h(P)]} - η(h(P))\, e^{iµx\cdot h'(P)}\Big\}ϕ\Big\rangle. \]
Using Lemma 4.7 and the fact that $\mathscr{F}f \in \mathscr{S}(X)$, one gets that $|L(ν,µ)| \le \text{Const.}\,(1 + |µ|)$ for all |ν| < 1. Therefore L(ν,µ) is bounded uniformly for |ν| < 1 by a function in $L^1([-1,1], dµ)$. For the case |µ| > 1 we recall that there exists c > 0 such that $|h'(x)| > c$ for all $x \in h^{-1}(\operatorname{supp} η)$, due to Assumption 4.6(i). So L(ν,µ) can be rewritten as
\[
L(ν,µ) = \sum_{j \le d} \int_X dx\, (\mathscr{F}f)(x)\, \Big\langle ϕ, \frac1ν\Big\{\frac{η(h(P+νx))\,(\partial_j h)(P+νx)}{iµ\,|h'(P+νx)|^2}\, \partial_j e^{i\frac{µ}{ν}[h(P+νx) - h(P)]} - \frac{η(h(P))\,(\partial_j h)(P)}{iµ\,|h'(P)|^2}\, \partial_j e^{iµx\cdot h'(P)}\Big\}ϕ\Big\rangle,
\]
and one can perform an integration by parts (with vanishing boundary contributions) with respect to $x_j$. We do not give the details since the calculations are very
similar to those of Theorem 3.3 (point (iii) of the proof). We only give the result obtained after three successive integrations by parts:
\[
\begin{aligned}
L(ν,µ) ={}& O(|µ|^{-2}) - iµ^{-3} \sum_{j,k \le d} \int_X dx\, \big[\partial_k \partial_j^2 (\mathscr{F}f)(x)\big] \\
&\times \Big\langle ϕ, \frac1ν\Big\{\frac{η(h(P+νx))\,(\partial_k h)(P+νx)}{|h'(P+νx)|^4}\, e^{i\frac{µ}{ν}[h(P+νx) - h(P)]} - \frac{η(h(P))\,(\partial_k h)(P)}{|h'(P)|^4}\, e^{iµx\cdot h'(P)}\Big\}ϕ\Big\rangle,
\end{aligned}
\tag{4.18}
\]
where $O(|µ|^{-2})$ are terms (containing derivatives $\partial^α h$ with |α| ≤ 3) bounded in norm by $\text{Const.}\,|µ|^{-2}$. Now, one shows as in Lemma 4.7 that
\[ \Big\|\frac1ν\Big\{\frac{η(h(P+νx))\,(\partial_k h)(P+νx)}{|h'(P+νx)|^4}\, e^{i\frac{µ}{ν}[h(P+νx) - h(P)]} - \frac{η(h(P))\,(\partial_k h)(P)}{|h'(P)|^4}\, e^{iµx\cdot h'(P)}\Big\}\Big\| \le \text{Const.}\,(1 + |µ|)\langle x\rangle^{m+2} \]
for each $µ \in \mathbb{R}$, $x \in X$, and |ν| < 1. It follows by (4.18) that $|L(ν,µ)| \le \text{Const.}\,|µ|^{-2}$ for each |ν| < 1. This bound, together with our previous estimate for |µ| ≤ 1, shows that L(ν,µ) is bounded uniformly for |ν| < 1 by a function in $L^1(\mathbb{R}, dµ)$. So one can interchange the limit $ν \searrow 0$ and the integration over µ in (4.17). The interchange of the limit $ν \searrow 0$ and the integration over x in (4.17) is justified by the bound
\[ \Big|(\mathscr{F}f)(x)\, \Big\langle ϕ, \frac1ν\Big\{η(h(P+νx))\, e^{i\frac{µ}{ν}[h(P+νx) - h(P)]} - η(h(P))\, e^{iµx\cdot h'(P)}\Big\}ϕ\Big\rangle\Big| \le \text{Const.}\,(1 + |µ|)\,|(\mathscr{F}f)(x)|\,\langle x\rangle^{m+2}, \]
which follows from Lemma 4.7.

In physical terms, the commutation condition (4.13) expresses roughly the conservation of the observable $F_f(h'(P))$ by the scattering process. Since $h'(P)$ is the free velocity operator for the scattering process, $F_f(h'(P))$ is a quantum analogue of the classical sojourn time $F_f(p)$, with momentum $p \in \mathbb{R}$, described at the end of Sec. 2. Therefore it is not completely surprising that the sojourn times $T_r^0(Sϕ)$ and $T_r^0(ϕ)$ are equal (in the sense of (4.14)) if (4.13) is satisfied.

Remark 4.9. There are many situations where the commutation Assumption (4.13) is satisfied. Here we present two of them. The first one occurs when h is a polynomial of degree 1, i.e. $h(x) = v_0 + v\cdot x$ for some $v_0 \in \mathbb{R}$ and $v \in X\setminus\{0\}$. In such a case the operator $F_f(h'(P))$ reduces to the scalar $F_f(v)$, and thus (4.13) is clearly satisfied. The second one occurs when both f and h are radial, namely when
$f(x) = f_0(|x|)$ and $h(x) = h_0(|x|)$ with, say, $h_0$ as in Lemma 3.6. In such a case $F_f(h'(P))$ is diagonalizable in the spectral representation of $H_0 \equiv h(P)$, namely
\[ \mathscr{U}_0\, F_f(h'(P))\, \mathscr{U}_0^* = \int^{\oplus}_{h_0([0,\infty))} dλ\, F_f\big(h_0'(h_0^{-1}(λ))\big), \tag{4.19} \]
where $\mathscr{U}_0$ is the spectral transformation (3.15) for h(P). We also know that S is decomposable in the spectral representation of $H_0$. Thus (4.13) is satisfied, since diagonalizable operators commute with decomposable operators.

We are now in a position to state our main theorem on the existence of time delay. It is a direct consequence of Theorems 4.3 and 4.8.

Theorem 4.10. Let f ≥ 0 be an even function in $\mathscr{S}(X)$ such that f = 1 on a neighborhood of 0. Let h satisfy Assumption 4.6 with m ≥ 3. Suppose that Assumption 4.1 holds. Let $ϕ \in \mathscr{D}_2(X)$ satisfy $Sϕ \in \mathscr{D}_2(X)$, (4.13), and (4.2)–(4.3). Then one has
\[ \lim_{r\to\infty} τ_r^{\rm in}(ϕ) = \lim_{r\to\infty} τ_r(ϕ) = \frac12 \big\langle ϕ, S^*[A_f, S]ϕ\big\rangle, \tag{4.20} \]
with $A_f$ defined by (3.1).

The comments of Remarks 4.4 and 4.5 concerning the symmetrized time delay $τ_r(ϕ)$ remain valid in the case of the time delay $τ_r^{\rm in}(ϕ)$. The right-hand side of (4.20) can always be written as the sum of the Eisenbud–Wigner time delay and the time delay associated with the non-radial component of the localization function f. In particular, if f is radial, one has
\[ \lim_{r\to\infty} τ_r^{\rm in}(ϕ) = \int_{σ(H_0)} dλ\, \Big\langle (\mathscr{U}ϕ)(λ),\, -iS(λ)^*\,\frac{dS(λ)}{dλ}\,(\mathscr{U}ϕ)(λ)\Big\rangle_{\mathcal{H}_λ} \tag{4.21} \]
under the assumptions of Remark 4.4. Formula (4.21) is the main result of this paper: it expresses the identity of time delay (defined in terms of sojourn times) and Eisenbud–Wigner time delay for dispersive Hamiltonians $H_0 = h(P)$. However, a priori, (4.21) holds only if the conditions (4.9) and (4.13) are satisfied. As we have seen in Cases 1 and 2 of Sec. 3 and Remark 4.9, this occurs for instance when h is a polynomial of degree 1 or radial. These two classes of functions provide a much larger supply of examples than what can be found in the literature, since only the Schrödinger Hamiltonian ($h(x) = x^2$) has been explicitly treated before. We collect the preceding remarks in a corollary to Theorem 4.10.

Corollary 4.11. Let f ≥ 0 be an even function in $\mathscr{S}(X)$ such that f = 1 on a neighborhood of 0. Let h satisfy Assumption 4.6 with m ≥ 3. Suppose that Assumption 4.1 holds. Let $ϕ \in \mathscr{D}_2(X)$ satisfy $Sϕ \in \mathscr{D}_2(X)$ and (4.2)–(4.3). Then

(a) Suppose that $h(x) = v_0 + v\cdot x$ for some $v_0 \in \mathbb{R}$ and $v \in X\setminus\{0\}$. Then one has
\[ \lim_{r\to\infty} τ_r^{\rm in}(ϕ) = \int_{\mathbb{R}} dλ\, \Big\langle (\mathscr{U}_1ϕ)(λ),\, -iS(λ)^*\,\frac{dS(λ)}{dλ}\,(\mathscr{U}_1ϕ)(λ)\Big\rangle_{\mathbb{C}^N} \]
if the scattering matrix $\mathbb{R} \ni λ \mapsto S(λ) \in \mathscr{B}(\mathbb{C}^N)$ is strongly continuously differentiable on the support of $\mathscr{U}_1ϕ$.

(b) Let f be radial, and suppose that h is radial and satisfies the hypotheses of Lemma 3.6. Then one has
\[ \lim_{r\to\infty} τ_r^{\rm in}(ϕ) = \int_{h_0([0,\infty))} dλ\, \Big\langle (\mathscr{U}_0ϕ)(λ,\cdot),\, -iS(λ)^*\,\frac{dS(λ)}{dλ}\,(\mathscr{U}_0ϕ)(λ,\cdot)\Big\rangle_{L^2(\mathbb{S}^{d-1})} \]
if the scattering matrix $h_0([0,\infty)) \ni λ \mapsto S(λ) \in \mathscr{B}\big(L^2(\mathbb{S}^{d-1})\big)$ is strongly continuously differentiable on the support of $\mathscr{U}_0ϕ$.

5. Friedrichs Model

As an illustration of our results, we treat in this section the case of a one-dimensional Friedrichs Hamiltonian $H_0$ perturbed by a finite rank operator V. For historical reasons [16] we define the Friedrichs Hamiltonian as the position operator $H_0 := Q$ in the Hilbert space $\mathcal{H}(\mathbb{R}) := L^2(\mathbb{R})$. The operator $H_0$ satisfies $\mathscr{F}H_0\mathscr{F}^{-1} = -P$. So, after a Fourier transformation, we can apply the results of Sec. 4 with $h(x) = -x$ and $κ(h) = \emptyset$. Since h is a polynomial of degree 1, we only have to check the hypotheses of Corollary 4.11(a) in order to prove the existence of the limits $\lim_{r\to\infty} τ_r^{\rm in}(ϕ)$ and $\lim_{r\to\infty} τ_r(ϕ)$, and their identity with Eisenbud–Wigner time delay. However, the model is very explicit, so we will add some more remarks to this result.

5.1. Preliminaries

For the moment, we do not specify the selfadjoint perturbation H of $H_0 = Q$. We only assume, by analogy to Assumption 4.1, that

Assumption 5.1. The wave operators $W_\pm$ exist and are complete, and any operator $T \in \mathscr{B}\big(\mathcal{H}^{-s}(\mathbb{R}), \mathcal{H}(\mathbb{R})\big)$, with $s > \frac12$, is locally H-smooth on $\mathbb{R}\setminus σ_{\rm pp}(H)$.

Since $H_0 = Q$ the propagation of the states $ϕ \in \mathcal{H}(\mathbb{R})$ takes place in the space of momenta. Therefore the quantities $T_r^0(ϕ)$, $T_r(ϕ)$, $τ_r^{\rm in}(ϕ)$, and $τ_r(ϕ)$ are defined with respect to a localization operator f(P/r):
\[
\begin{aligned}
T_r^0(ϕ) &:= \int_{\mathbb{R}} dt\, \Big\langle e^{-itH_0}ϕ,\, f\Big(\frac{P}{r}\Big) e^{-itH_0}ϕ\Big\rangle, \\
T_r(ϕ) &:= \int_{\mathbb{R}} dt\, \Big\langle e^{-itH}W_-ϕ,\, f\Big(\frac{P}{r}\Big) e^{-itH}W_-ϕ\Big\rangle, \\
τ_r^{\rm in}(ϕ) &:= T_r(ϕ) - T_r^0(ϕ), \\
τ_r(ϕ) &:= T_r(ϕ) - \frac12\big[T_r^0(ϕ) + T_r^0(Sϕ)\big].
\end{aligned}
\]
The sets $\mathscr{D}_t^0(X)$ and $\mathscr{D}_t(X)$ of Secs. 3 and 4 are replaced, for s ≥ 0, by
\[ \mathscr{D}_0^s(\mathbb{R}) := \{ϕ \in \mathcal{H}^s(\mathbb{R}) \mid η(Q)ϕ = ϕ \text{ for some } η \in C_c^\infty(\mathbb{R})\} \]
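and
\[ \mathscr{D}^s(\mathbb{R}) := \big\{ϕ \in \mathcal{H}^s(\mathbb{R}) \mid η(Q)ϕ = ϕ \text{ for some } η \in C_c^\infty\big(\mathbb{R}\setminus σ_{\rm pp}(H)\big)\big\}. \]
Theorem 3.3 implies that
\[ \lim_{r\to\infty} \int_0^\infty dt\, \Big\langle ϕ, \Big[e^{itQ} f\Big(\frac{P}{r}\Big) e^{-itQ} - e^{-itQ} f\Big(\frac{P}{r}\Big) e^{itQ}\Big]ϕ\Big\rangle = 2\langle ϕ, Pϕ\rangle \tag{5.1} \]
for each $ϕ \in \mathscr{D}_0^2(\mathbb{R})$ and each even function $f \in \mathscr{S}(\mathbb{R})$ such that f = 1 on a neighborhood of 0. Using the formula
\[ e^{itQ}\, g\Big(\frac{P}{r}\Big)\, e^{-itQ} = g\Big(\frac{P - t}{r}\Big), \qquad g \in L^\infty(\mathbb{R}), \tag{5.2} \]
one can even show that (5.1) remains true for all $ϕ \in \mathcal{H}^s(\mathbb{R})$, s > 1, and all f satisfying the following assumption [46, Sec. 2].

Assumption 5.2. The function $f \in L^\infty(\mathbb{R})$ is even, f = 1 on a neighborhood of 0, and there exists ρ > 1 such that $|f(x)| \le \text{Const.}\,\langle x\rangle^{-ρ}$ for a.e. $x \in \mathbb{R}$.

The typical example of a function f one should keep in mind is the following.

Example 5.3. Let $f = χ_J$, where $J \subset \mathbb{R}$ is bounded, symmetric (i.e. J = −J), and contains an interval (−δ, δ) for some δ > 0. Then f satisfies Assumption 5.2, and f(P/r) is the orthogonal projection onto the set of states with momentum localized in rJ.

Formula (5.2) and the symmetry of f give for each r > 0 and $ϕ \in \mathcal{H}(\mathbb{R})$
\[ T_r^0(ϕ) = \int_{\mathbb{R}} dt \int_{\mathbb{R}} dk\, |(\mathscr{F}ϕ)(k)|^2\, f\Big(\frac{t - k}{r}\Big). \]
Then Fubini's theorem (which is applicable due to Assumption 5.2) and the change of variable $x := \frac{t-k}{r}$ imply that
\[ T_r^0(ϕ) = r\,\|ϕ\|^2 \int_{\mathbb{R}} dx\, f(x), \tag{5.3} \]
and thus that
\[ T_r^0(Sϕ) = T_r^0(ϕ) \quad\text{and}\quad τ_r^{\rm in}(ϕ) = τ_r(ϕ). \tag{5.4} \]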
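The identity (5.3) can be checked numerically. The sketch below evaluates the double integral for $T_r^0(ϕ)$ by a Riemann sum and compares it with $r\|ϕ\|^2\int f$. The Gaussian momentum density, the choice $f = χ_J$ with J = [−1, 1], the value r = 3, and the grid sizes are illustrative assumptions, not taken from the text.

```python
import numpy as np

def free_sojourn_time(r, k, density, t, f):
    """Riemann sum for T_r^0(phi) = int dt int dk |(F phi)(k)|^2 f((t - k) / r)."""
    dk = k[1] - k[0]
    dt = t[1] - t[0]
    # Broadcasting builds the integrand on the (t, k) grid.
    integrand = density[None, :] * f((t[:, None] - k[None, :]) / r)
    return integrand.sum() * dk * dt

def indicator(x):
    """f = chi_J for the assumed symmetric interval J = [-1, 1]."""
    return (np.abs(x) <= 1.0).astype(float)

# Assumed test data: Gaussian momentum density of a state with norm 1.
k = np.linspace(-8.0, 8.0, 401)
density = np.exp(-k ** 2) / np.sqrt(np.pi)
t = np.linspace(-40.0, 40.0, 4001)

r = 3.0
lhs = free_sojourn_time(r, k, density, t, indicator)
rhs = r * 1.0 * 2.0  # r * ||phi||^2 * int f(x) dx = r * |J|
print(lhs, rhs)
```

The two printed numbers should agree up to the grid resolution in t, in line with the interpretation of $T_r^0(ϕ)$ as the time needed to cross rJ at unit speed.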
So Eqs. (4.14) and (4.15) of Theorem 4.8 are true here not only as r → ∞, but for each r > 0. This can be explained as follows. The "velocity" operator associated with the free evolution group $e^{itQ}$ is not only constant (which guarantees that Theorem 4.8 is applicable), but equal to −1:
\[ \frac{d}{dt}\big(e^{itQ} P e^{-itQ}\big) = -1. \]
Therefore the propagation speed of a state $e^{itQ}ϕ$ in the space of momenta is equal to −1. In that respect Formulas (5.3)–(5.4) are natural. For instance, if $\|ϕ\| = 1$ and $f = χ_J$ is as in Example 5.3, then $T_r^0(ϕ) = r|J|$, where |J| is the Lebesgue measure of J. So $T_r^0(ϕ)$ is nothing else but the sojourn time in rJ (in the space of momenta) of the state $e^{itQ}ϕ$ propagating at speed −1. The next lemma follows from what precedes and Theorem 4.10.

Lemma 5.4. Let f ≥ 0 satisfy Assumption 5.2. Suppose that Assumption 5.1 holds. For some s > 1, let $ϕ \in \mathscr{D}^s(\mathbb{R})$ satisfy (4.2)–(4.3) and $Sϕ \in \mathscr{D}^s(\mathbb{R})$. Then
\[ \lim_{r\to\infty} τ_r^{\rm in}(ϕ) = \lim_{r\to\infty} τ_r(ϕ) = \big\langle ϕ, S^*[P, S]ϕ\big\rangle. \tag{5.5} \]

Remark 5.5. Formula (5.5) shows that $\lim_{r\to\infty} τ_r^{\rm in}(ϕ)$ is zero if the commutator [P, S] vanishes (which happens if and only if the scattering operator S is a constant). We give an example of a Hamiltonian H for which this occurs. Let $\tilde{H}_0 := P$ with domain $D(\tilde{H}_0) := \mathcal{H}^1(\mathbb{R})$, and for $q \in L^1(\mathbb{R}; \mathbb{R})$ let $\tilde{H} := \tilde{H}_0 + q(Q)$ with domain $D(\tilde{H}) := \{ϕ \in \mathcal{H}^1(\mathbb{R}) \mid \tilde{H}ϕ \in \mathcal{H}(\mathbb{R})\}$. It is known [51, Sec. 2.4.3] that $\tilde{H}$ is selfadjoint, that the wave operators $\tilde{W}_\pm := \text{s-}\lim_{t\to\pm\infty} e^{it\tilde{H}} e^{-it\tilde{H}_0}$ exist and are complete, and that $\tilde{S} := \tilde{W}_+^*\tilde{W}_- = e^{-i\int_{\mathbb{R}} dx\, q(x)}$ is a constant. Therefore $H := \mathscr{F}\tilde{H}\mathscr{F}^{-1} = H_0 + q(-P)$ is selfadjoint on $D(H) := \mathscr{F}D(\tilde{H})$, the wave operators $W_\pm = \mathscr{F}\tilde{W}_\pm\mathscr{F}^{-1}$ exist and are complete, and $S = \tilde{S}$.

Remark 5.6. Suppose that the assumptions of Lemma 5.4 are verified, and for a.e. $x \in \mathbb{R}$ let $S(x) \in \mathbb{C}$ be the component at energy x of the scattering matrix associated with the scattering operator S. Then, Eq. (5.5) can be rewritten as
\[ \lim_{r\to\infty} τ_r^{\rm in}(ϕ) = \lim_{r\to\infty} τ_r(ϕ) = -i \int_{\mathbb{R}} dx\, |ϕ(x)|^2\, \overline{S(x)}\, S'(x) \tag{5.6} \]
if the function $x \mapsto S(x)$ is continuously differentiable on the support of ϕ (note that Eq. (5.6) does not follow from [30] or [6, Chap. 7.2], since we do not require f(P/r) to be an orthogonal projection or $x \mapsto S(x)$ to be twice differentiable on the whole real line). Formula (5.6) holds for the general class of functions f ≥ 0 satisfying Assumption 5.2. However, if $\|ϕ\| = 1$ and $f = χ_J$ is as in Example 5.3, then we know that the scalars $T_r^0(ϕ)$ and $T_r(ϕ)$ can be interpreted as sojourn times. Therefore in such a case Formula (5.6) expresses exactly the identity of the usual and symmetrized time delays with the Eisenbud–Wigner time delay for the Friedrichs model.

Remark 5.7. Let $R_0(\cdot)$ and $R(\cdot)$ be the resolvent families of $H_0$ and H, and suppose that $R(i) - R_0(i)$ is trace class. Then, at least formally, we get from the Birman–Krein formula [51, Theorem 8.7.2] that
\[ \overline{S(x)}\, S'(x) = -2πi\, ξ'(x; H, H_0), \tag{5.7} \]
where $ξ'(x; H, H_0)$ is the derivative of the spectral shift function for the pair $\{H_0, H\}$. Therefore, one has
\[ \lim_{r\to\infty} τ_r^{\rm in}(ϕ) = -2π \int_{\mathbb{R}} dx\, |ϕ(x)|^2\, ξ'(x; H, H_0), \tag{5.8} \]
and the number $-2πξ'(x; H, H_0)$ may be interpreted as the component at energy x of the time delay operator for the Friedrichs model. However, Eqs. (5.7)–(5.8) turn out to be difficult to prove rigorously in this form. We refer to [23], [31, Sec. III.b], and [38, Sec. 3] for general theories on this issue, and to [11,13,35,50] for related works in the case of the Friedrichs–Faddeev model.

5.2. Finite rank perturbation

Here we apply the theory of Sec. 5.1 to finite rank perturbations of $H_0 = Q$. Given $u, v \in \mathcal{H}(\mathbb{R})$ we write $P_{u,v}$ for the rank one operator $P_{u,v} := \langle u, \cdot\rangle v$, and we set $P_v := P_{v,v}$. The full Hamiltonian H we consider is defined as follows.

Assumption 5.8. Fix an integer N ≥ 0 and take µ ≥ 0. For $j, k \in \{1, \ldots, N\}$, let $v_j \in \mathcal{H}^µ(\mathbb{R})$ satisfy $\langle v_j, v_k\rangle = δ_{jk}$, and let $λ_j \in \mathbb{R}$. Then $H := H_0 + V$, where $V := \sum_{j=1}^N λ_j P_{v_j}$.

Many functions $v_j$ (such as the Hermite functions [37, p. 142]) satisfy the requirements of Assumption 5.8. Under Assumption 5.8, the perturbation V is bounded from $\mathcal{H}^{-µ}(\mathbb{R})$ to $\mathcal{H}^µ(\mathbb{R})$, H is selfadjoint on $D(H) = D(H_0)$, and the wave operators $W_\pm$ exist and are complete [36, Theorem XI.8]. In the next lemma we establish some of the spectral properties of H, we prove a limiting absorption principle for H, and we give a class of locally H-smooth operators. The limiting absorption principle is expressed in terms of the Besov space $\mathscr{K} := (\mathcal{H}^1(\mathbb{R}), \mathcal{H}(\mathbb{R}))_{1/2,1} \equiv \mathcal{H}^{1/2,1}(\mathbb{R})$ defined by real interpolation [2, Sec. 3.4.1]. We recall that for each s > 1/2 we have the continuous embeddings
\[ \mathcal{H}^s(\mathbb{R}) \subset \mathscr{K} \subset \mathcal{H}(\mathbb{R}) \subset \mathscr{K}^* \subset \mathcal{H}^{-s}(\mathbb{R}). \]
We refer the reader to [2, Sec. 6.2.1] for the definition of the regularity classes $C^k(A)$ and to [2, Sec. 7.2.2] for the definition of a (strict) Mourre estimate. The symbol $\mathbb{C}_\pm$ stands for the half-plane $\mathbb{C}_\pm := \{z \in \mathbb{C} \mid \pm\operatorname{Im}(z) > 0\}$.

Lemma 5.9. Let H satisfy Assumption 5.8 with µ ≥ 2. Then
(a) H has at most a finite number of eigenvalues, and each of these eigenvalues is of finite multiplicity.
(b) The map $z \mapsto (H - z)^{-1} \in \mathscr{B}(\mathscr{K}, \mathscr{K}^*)$, which is holomorphic on $\mathbb{C}_\pm$, extends to a weak* continuous function on $\mathbb{C}_\pm \cup \{\mathbb{R}\setminus σ_{\rm pp}(H)\}$. In particular, H has no singularly continuous spectrum.
(c) If T belongs to $\mathscr{B}\big(\mathcal{H}^{-s}(\mathbb{R}), \mathcal{H}(\mathbb{R})\big)$ for some s > 1/2, then T is locally H-smooth on $\mathbb{R}\setminus σ_{\rm pp}(H)$.
The spectral results of points (a) and (b) on the finiteness of the singular spectrum of H are not surprising; they are known in the more general setting where V is an integral operator with Hölder continuous kernel (see e.g. [14, Theorem 1] and [15, Lemma 3.10]). A proof is given here for completeness. Note that point (a) implies that the sets $\mathscr{D}^s(\mathbb{R})$ are dense in $\mathcal{H}(\mathbb{R})$ for each s ≥ 0.

Proof. (a) Let A := −P, then $e^{-itA} H_0 e^{itA} = H_0 + t$ for each $t \in \mathbb{R}$. Thus $H_0$ is of class $C^\infty(A)$ and satisfies a strict Mourre estimate on $\mathbb{R}$ [2, Sec. 7.6.1]. Furthermore, the quadratic form
\[ D(A) \ni ϕ \mapsto \langle ϕ, iVAϕ\rangle - \langle Aϕ, iVϕ\rangle \]
extends uniquely to the bounded form defined by the rank 2N operator $F_1 := \sum_{j=1}^N λ_j\big(P_{v_j', v_j} + P_{v_j, v_j'}\big)$. This means that V is of class $C^1(A)$. Thus H is of class $C^1(A)$ and since $F_1$ is compact, H satisfies a Mourre estimate on $\mathbb{R}$. The claim then follows by [2, Corollary 7.2.11].

(b) The quadratic form
\[ D(A) \ni ϕ \mapsto \langle ϕ, iF_1Aϕ\rangle - \langle Aϕ, iF_1ϕ\rangle \]
extends uniquely to the bounded form defined by the rank 3N operator $F_2 := -\sum_{j=1}^N λ_j\big(P_{v_j'', v_j} + 2P_{v_j', v_j'} + P_{v_j, v_j''}\big)$. This, together with [2, Theorems 7.2.9 and 7.2.13] and the proof of point (a), implies that H is of class $C^2(A)$ and that H satisfies a strict Mourre estimate on $\mathbb{R}\setminus σ_{\rm pp}(H)$. It follows by [41, Theorem 0.1] (which applies to operators without spectral gap) that the map $z \mapsto (H - z)^{-1} \in \mathscr{B}(\mathscr{K}, \mathscr{K}^*)$ extends to a weak* continuous function on $\mathbb{C}_\pm \cup \{\mathbb{R}\setminus σ_{\rm pp}(H)\}$. In particular, H has no singularly continuous spectrum in $\mathbb{R}\setminus σ_{\rm pp}(H)$. Since continuous Borel measures on $\mathbb{R}$ have no pure points [37, p. 22] and since $σ_{\rm pp}(H)$ is finite by point (a), we even get that H has no singularly continuous spectrum at all.

(c) Since T belongs to $\mathscr{B}\big(D(H), \mathcal{H}(\mathbb{R})\big)$ and $T^*\mathcal{H}(\mathbb{R}) \subset \mathcal{H}^s(\mathbb{R}) \subset \mathscr{K}$, the claim is a consequence of [2, Proposition 7.1.3(b)] and the discussion that follows.

We now study the differentiability of the function $x \mapsto S(x)$, which relies on the differentiability of the boundary values of the resolvent of H.

Lemma 5.10. Let H satisfy Assumption 5.8 with µ ≥ n + 1 for some integer n ≥ 1. Let $I \subset \{\mathbb{R}\setminus σ_{\rm pp}(H)\}$ be a relatively compact interval, and take s > n − 1/2. Then for each $x \in I$ the limits
\[ R^n(x \pm i0) := \lim_{ε\searrow 0} (H - x \mp iε)^{-n} \]
exist in the norm topology of $\mathscr{B}\big(\mathcal{H}^s(\mathbb{R}), \mathcal{H}^{-s}(\mathbb{R})\big)$ and are Hölder continuous. Furthermore, $x \mapsto R(x \pm i0)$ is n − 1 times (Hölder continuously) differentiable as a
map from I to B Hs (R), H−s (R) , and dn−1 R(x ± i0) = (n − 1)!Rn (x ± i0). dxn−1 Proof. The claims follow from [25, Theorem 2.2(iii)] applied to our situation. We only have to verify the hypotheses of that theorem, namely that H is n-smooth with respect to A = −P in the sense of [25, Definition 2.1]. This is done in points (a), (b), (cn ), (dn ), and (e) that follow. (a) D(A) ∩ D(H) ⊃ S (R) is a core for H. (b) Let ϕ ∈ H1 (R) and θ ∈ R. Then one has eiθA ϕ H1 (R) = Q + θϕ ≤ Q + θQ−1 · ϕ H1 (R) 1
1
≤ 2− 2 (2 + |θ|) 2 ϕ H1 (R) . In particular, eiθA maps D(H) into D(H), and sup|θ|≤1 HeiθA ϕ < ∞ for each ϕ ∈ D(H). (cn )–(dn ) Due to the proof Lemma 5.9(a) the quadratic form D(A) ∩ D(H) ϕ → Hϕ, iAϕ − Aϕ, iHϕ extends uniquely to the bounded form defined by the operator iB1 := 1 + F1 , N λ where F1 = j=1 j Pvj ,vj + Pvj ,vj . Similarly, for j = 2, 3, . . . , n + 1 the quadratic form D(A) ∩ D(H) ϕ → (iBj−1 )∗ ϕ, iAϕ − Aϕ, i(iBj−1 )ϕ extends uniquely to a bounded form defined by an operator iBj := Fj , where Fj is a linear combination of the rank one operators Pv(j−k) ,v(k) , k = 0, 1, . . . , j. (e) Due to the proof Lemma 5.9(a), H satisfies a Mourre estimate on R. m For m = 1, 2, . . . , N let Vm := j=1 λj Pvj and Hm := H0 + Vm . Then it is known that the scattering matrix S(x) factorizes for a.e. x ∈ R as [51, Eq. (8.4.2)] ! ! S(x) = S! N (x) · · · S2 (x)S1 (x),
(5.9)
where $\tilde S_m(x)$ is unitarily equivalent to the scattering matrix $S_m(x)$ associated with the pair $\{H_m, H_{m-1}\}$. Since the difference $H_m - H_{m-1}$ is of rank one, one can even obtain an explicit expression for $S_m(x)$ (see [51, Eq. (6.7.9)]). For instance, one has the following simple formula for $S_1(x)$ [51, Eq. (8.4.1)], [18, Eq. (66a)]:
\[ S_1(x) = \frac{1 + \lambda_1 F(x - i0)}{1 + \lambda_1 F(x + i0)}, \]
where
\[ F(x \pm i0) := \lim_{\varepsilon \searrow 0} \big\langle v_1, (H_0 - x \mp i\varepsilon)^{-1} v_1 \big\rangle. \]
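As a small numerical illustration of the rank-one formula for $S_1(x)$, the sketch below checks that the quotient has modulus one. It is only a sketch under stated assumptions: $H_0$ is taken to act as multiplication by the variable on $L^2(\mathbb{R})$, the vector `v` is a hypothetical Gaussian, the integral defining $F$ is replaced by a plain Riemann sum, and the limit $\varepsilon \searrow 0$ is replaced by a small fixed `eps`. None of the function names come from the paper.

```python
import math


def F(z, v, ys, h):
    """Riemann-sum approximation of <v, (H0 - z)^{-1} v> = int |v(y)|^2 / (y - z) dy.

    Assumption: H0 is multiplication by y on L^2(R).
    """
    return h * sum(abs(v(y)) ** 2 / (y - z) for y in ys)


def S1(x, lam1, v, eps=0.05, half_width=20.0, n=4001):
    """Approximate S_1(x) = (1 + lam1 F(x - i0)) / (1 + lam1 F(x + i0)) with a fixed eps."""
    h = 2 * half_width / (n - 1)
    ys = [-half_width + k * h for k in range(n)]
    f_plus = F(complex(x, eps), v, ys, h)    # stands in for F(x + i0)
    f_minus = F(complex(x, -eps), v, ys, h)  # stands in for F(x - i0)
    return (1 + lam1 * f_minus) / (1 + lam1 * f_plus)


def gaussian(y):
    """Hypothetical choice of the vector v_1."""
    return math.exp(-y * y / 2)


value = S1(0.3, 0.5, gaussian)
modulus = abs(value)
```

Since $F(x - i\varepsilon)$ is the complex conjugate of $F(x + i\varepsilon)$ for real $x$ and $\lambda_1$, the numerator and denominator are conjugate to each other, which is why the modulus equals one whenever the denominator does not vanish.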
June 2, 2009 18:35 WSPC/148-RMP
J070-00370
Time Delay for Dispersive Systems in Quantum Scattering Theory
703
Formula (5.9) is not very convenient for studying the differentiability of the function $x \mapsto S(x)$. This is why we prove the usual formula for $S(x)$ in the next lemma. Given $\tau \in \mathbb{R}$, we let $\gamma(\tau) : \mathscr{S}(\mathbb{R}) \to \mathbb{C}$ be the restriction operator defined by $\gamma(\tau)\varphi := \varphi(\tau)$. Some of the regularity properties of $\gamma(\tau)$ are collected in the Appendix. Here we only recall that $\gamma(\tau)$ extends uniquely to an element of $B(\mathcal{H}^s(\mathbb{R}), \mathbb{C})$ for each $s > 1/2$.

Lemma 5.11. Let $H$ satisfy Assumption 5.8 with $\mu \ge 2$. Then for each $x \in \mathbb{R}\setminus\sigma_{\rm pp}(H)$ one has the equality
\[ S(x) = 1 - 2\pi i\, \gamma(x)\,[1 - V R(x + i0)]\,V \gamma(x)^*. \tag{5.10} \]
Proof. The claim is a consequence of the stationary method for trace class perturbations [51, Theorem 7.6.4] applied to the pair $\{H_0, H\}$. The perturbation $V$ can be written as a product $V = G^* G_0$, with $G := \sum_{j=1}^N \lambda_j P_{v_j}$ and $G_0 := \sum_{j=1}^N P_{v_j}$. Since the operators $G$ and $G_0$ are selfadjoint and belong to the Hilbert–Schmidt class, all the hypotheses of [51, Theorem 7.6.4] (and thus of [51, Theorem 5.7.1]) are trivially satisfied. Therefore one has for a.e. $x \in \mathbb{R}$ the equality
\[ S(x) = 1 - 2\pi i\, \gamma(x)\, G\, \big[1 - \tilde B(x + i0)\big]\, G_0\, \gamma(x)^*, \tag{5.11} \]
where $\tilde B(x + i0)$ is the norm limit defined by the condition
\[ \lim_{\varepsilon \searrow 0} \big\| G_0 (H - x - i\varepsilon)^{-1} G - \tilde B(x + i0) \big\| = 0. \]
On the other hand, we know from Lemma 5.10 that the limit $R(x + i0)$ exists in the norm topology of $B(\mathcal{H}^s(\mathbb{R}), \mathcal{H}^{-s}(\mathbb{R}))$ for each $x \in \mathbb{R}\setminus\sigma_{\rm pp}(H)$ and each $s > 1/2$. Since we also have $G_0, G \in B(\mathcal{H}^{-\mu}(\mathbb{R}), \mathcal{H}^{\mu}(\mathbb{R}))$, we get the identity $\tilde B(x + i0) = G_0 R(x + i0) G$. This together with Formula (5.11) implies the claim.

We are in a position to show the differentiability of the scattering matrix.

Lemma 5.12. Let $H$ satisfy Assumption 5.8 with $\mu \ge n+1$ for some integer $n \ge 1$. Then $x \mapsto S(x)$ is $n-1$ times (Hölder continuously) differentiable from $\mathbb{R}\setminus\sigma_{\rm pp}(H)$ to $\mathbb{C}$.

Proof. Due to Formula (5.10) it is sufficient to prove that the terms
\[ A(x) := \Big( \frac{d^{\ell_1}}{dx^{\ell_1}} \gamma(x) \Big) V \Big( \frac{d^{\ell_2}}{dx^{\ell_2}} \gamma(x)^* \Big) \]
and
\[ B(x) := \Big( \frac{d^{\ell_1}}{dx^{\ell_1}} \gamma(x) \Big) V \Big( \frac{d^{\ell_2}}{dx^{\ell_2}} R(x + i0) \Big) V \Big( \frac{d^{\ell_3}}{dx^{\ell_3}} \gamma(x)^* \Big) \]
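exist and are locally Hölder continuous on $\mathbb{R}\setminus\sigma_{\rm pp}(H)$ for all non-negative integers $\ell_1, \ell_2, \ell_3$ satisfying $\ell_1 + \ell_2 + \ell_3 \le n - 1$. The factors in $B(x)$ satisfy
\[
\begin{aligned}
\frac{d^{\ell_3}}{dx^{\ell_3}} \gamma(x)^* &\in B\big(\mathbb{C}, \mathcal{H}^{-s_3}(\mathbb{R})\big) && \text{for } s_3 > \ell_3 + \tfrac12, \\
V &\in B\big(\mathcal{H}^{-s_3}(\mathbb{R}), \mathcal{H}^{s_2}(\mathbb{R})\big) && \text{for } s_2, s_3 \in [0, \mu], \\
\frac{d^{\ell_2}}{dx^{\ell_2}} R(x + i0) &\in B\big(\mathcal{H}^{s_2}(\mathbb{R}), \mathcal{H}^{-s_2}(\mathbb{R})\big) && \text{for } s_2 > \ell_2 + \tfrac12, \\
V &\in B\big(\mathcal{H}^{-s_2}(\mathbb{R}), \mathcal{H}^{s_1}(\mathbb{R})\big) && \text{for } s_1, s_2 \in [0, \mu], \\
\frac{d^{\ell_1}}{dx^{\ell_1}} \gamma(x) &\in B\big(\mathcal{H}^{s_1}(\mathbb{R}), \mathbb{C}\big) && \text{for } s_1 > \ell_1 + \tfrac12,
\end{aligned}
\]
and are locally Hölder continuous due to Lemmas 5.10 and A.1. Therefore, if the $s_j$'s above are chosen so that $s_j \in (\ell_j + 1/2, \mu]$ for $j = 1, 2, 3$, then $B(x)$ is finite and locally Hölder continuous on $\mathbb{R}\setminus\sigma_{\rm pp}(H)$. Since similar arguments apply to the term $A(x)$, the claim is proved.

Lemma 5.13. Let $H$ satisfy Assumption 5.8 with $\mu > 2$. Then one has for each $\varphi \in \mathcal{H}^s(\mathbb{R})$, $s > 2$,
\[ \big\| (W_- - 1) e^{-itH_0} \varphi \big\| \in L^1(\mathbb{R}_-, dt) \tag{5.12} \]
and
\[ \big\| (W_+ - 1) e^{-itH_0} \varphi \big\| \in L^1(\mathbb{R}_+, dt). \tag{5.13} \]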
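Proof. For $\varphi \in \mathcal{H}^s(\mathbb{R})$ and $t \in \mathbb{R}$, we have
\[ (W_- - 1) e^{-itH_0} \varphi = -i\, e^{-itH} \int_{-\infty}^{t} d\tau\, e^{i\tau H} V e^{-i\tau H_0} \varphi, \]
where the integral is strongly convergent. Hence to prove (5.12) it is enough to show that
\[ \int_{-\infty}^{-\delta} dt \int_{-\infty}^{t} d\tau\, \big\| V e^{-i\tau H_0} \varphi \big\| < \infty \tag{5.14} \]
for some $\delta > 0$. Let $\zeta := \min\{\mu, s\}$; then $\|\langle P \rangle^{\zeta} \varphi\|$ and $\|V \langle P \rangle^{\zeta}\|$ are finite by hypothesis. If $|\tau|$ is big enough, it follows by (5.2) that
\[ \big\| V e^{-i\tau H_0} \varphi \big\| \le \mathrm{Const.}\, \big\| \langle P \rangle^{-\zeta} e^{-i\tau Q} \langle P \rangle^{-\zeta} \big\| = \mathrm{Const.}\, \big\| \langle P - \tau \rangle^{-\zeta} \langle P \rangle^{-\zeta} \big\| \le \mathrm{Const.}\, |\tau|^{-\zeta}. \]
Since $\zeta > 2$, this implies (5.14), and thus (5.12). The proof of (5.13) is similar.

In the next theorem we prove Formula (5.6) for Hamiltonians $H$ satisfying Assumption 5.8 with $\mu \ge 5$.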
Theorem 5.14. Let $f \ge 0$ satisfy Assumption 5.2, and let $H$ satisfy Assumption 5.8 with $\mu \ge 5$. Then one has for each $\varphi \in \mathscr{D}_3(\mathbb{R})$ the identity
\[ \lim_{r \to \infty} \tau_r^{\rm in}(\varphi) = \lim_{r \to \infty} \tau_r(\varphi) = -i \int_{\mathbb{R}} dx\, |\varphi(x)|^2\, \overline{S(x)}\, S'(x). \]

Proof. Let $\varphi \in \mathscr{D}_3(\mathbb{R})$. Then $S\varphi \in \mathscr{D}_3(\mathbb{R})$ by Lemma 5.12, and conditions (4.2)–(4.3) are verified by Lemma 5.13. Therefore all the hypotheses of Theorem 5.4 and Remark 5.6 are satisfied, and so the claim is proved.

Acknowledgments

The author thanks the Swiss National Science Foundation, the Núcleo Científico ICM P07-027-F "Mathematical Theory of Quantum and Classical Magnetic Systems", and the Chilean Science Foundation Fondecyt under the Grants 1090008 and 1085162 for financial support. This work was completed while the author was visiting the University of Chile and the Pontifical Catholic University of Chile. He would like to thank Professors M. Măntoiu and G. Raikov for their kind hospitality.

Appendix

We collect in this appendix some facts on the restriction operator $\gamma(\tau)$ of Lemma 5.11. We consider the general case with configuration space $\mathbb{R}^d$, $d \ge 1$. Given $\tau \in \mathbb{R}$, we let $\gamma(\tau) : \mathscr{S}(\mathbb{R}^d) \to L^2(\mathbb{R}^{d-1})$ be the restriction operator defined by $\gamma(\tau)\varphi := \varphi(\tau, \cdot)$. We know from [27, Theorem 2.4.2] that $\gamma(\tau)$ extends uniquely to an element of $B(\mathcal{H}^s(\mathbb{R}^d), L^2(\mathbb{R}^{d-1}))$ for each $s > 1/2$. Furthermore $\gamma(\tau)$ is Hölder continuous in $\tau$ with respect to the operator norm, namely for all $\tau, \tau' \in \mathbb{R}$ there exists a constant $c$ such that
\[
\big\| \gamma(\tau) - \gamma(\tau') \big\|_{B(\mathcal{H}^s(\mathbb{R}^d), L^2(\mathbb{R}^{d-1}))} \le c
\begin{cases}
|\tau - \tau'|^{s - 1/2} & \text{if } s \in (\tfrac12, \tfrac32), \\
|\tau - \tau'| \cdot \big|\ln|\tau - \tau'|\big| & \text{if } s = \tfrac32 \text{ and } |\tau - \tau'| < \tfrac12, \\
|\tau - \tau'| & \text{if } s > \tfrac32.
\end{cases}
\tag{A.1}
\]
Finally $\gamma(\tau)$ has the following differentiability property.

Lemma A.1. Let $s > k + \tfrac12$ with $k \ge 0$ integer. Then $\gamma$ is $k$ times (Hölder continuously) differentiable as a map from $\mathbb{R}$ to $B(\mathcal{H}^s(\mathbb{R}^d), L^2(\mathbb{R}^{d-1}))$.

Proof. We adapt the proof of [24, Lemma 3.3]. Consider first $s > k + \tfrac12$ with $k = 1$. The obvious guess for the derivative at $\tau$ of $\gamma$ is $(D\gamma)(\tau) := \gamma(\tau)\partial_1$, where
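$\partial_1$ stands for the partial derivative with respect to the first variable. Thus one has for $\varphi \in \mathscr{S}(\mathbb{R}^d)$ and $\delta \in \mathbb{R}$ with $|\delta| \in (0, 1/2)$
\[ \Big\{ \frac{1}{\delta} [\gamma(\tau + \delta) - \gamma(\tau)] - (D\gamma)(\tau) \Big\} \varphi = \frac{1}{\delta} \int_0^{\delta} d\xi\, \big[ (\partial_1 \varphi)(\tau + \xi, \cdot) - (\partial_1 \varphi)(\tau, \cdot) \big]. \]
In particular, using (A.1), we get for some $\mu > 0$
\[
\begin{aligned}
\Big\| \Big\{ \frac{1}{\delta} [\gamma(\tau + \delta) - \gamma(\tau)] - (D\gamma)(\tau) \Big\} \varphi \Big\|_{L^2(\mathbb{R}^{d-1})}
&\le \frac{1}{|\delta|} \int_0^{|\delta|} d\xi\, \big\| (\partial_1 \varphi)(\tau + \mathrm{sgn}(\delta)\xi, \cdot) - (\partial_1 \varphi)(\tau, \cdot) \big\|_{L^2(\mathbb{R}^{d-1})} \\
&\le \| \partial_1 \varphi \|_{\mathcal{H}^{s-1}(\mathbb{R}^d)} \frac{1}{|\delta|} \int_0^{|\delta|} d\xi\, \big\| \gamma(\tau + \mathrm{sgn}(\delta)\xi) - \gamma(\tau) \big\|_{B(\mathcal{H}^{s-1}(\mathbb{R}^d), L^2(\mathbb{R}^{d-1}))} \\
&\le \mathrm{Const.}\, \| \varphi \|_{\mathcal{H}^{s}(\mathbb{R}^d)} \frac{1}{|\delta|} \int_0^{|\delta|} d\xi\, |\xi|^{\mu} \\
&\le \mathrm{Const.}\, \| \varphi \|_{\mathcal{H}^{s}(\mathbb{R}^d)}\, |\delta|^{\mu}.
\end{aligned}
\]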
Since S (Rd ) is dense in Hs (Rd ) and Dγ : R → B Hs (Rd ), L2 (Rd−1 ) is H¨older continuous, this proves the result for k = 1. The result for k > 1 follows then easily by using the expression for (Dγ)(τ ). References [1] P. Alsholm and T. Kato, Scattering with long range potentials, in Partial Differential Equations (Proc. Sympos. Pure Math., Vol. XXIII, Univ. California, Berkeley, California, 1971), (Amer. Math. Soc., Providence, R.I., 1973), pp. 393–399. [2] W. O. Amrein, A. Boutet de Monvel and V. Georgescu, C0 -Groups, Commutator Methods and Spectral Theory of N -Body Hamiltonians, Progress in Math., Vol. 135 (Birkh¨ auser, Basel, 1996). [3] W. O. Amrein and M. B. Cibils, Global and Eisenbud–Wigner time delay in scattering theory, Helv. Phys. Acta 60 (1987) 481–500. [4] W. O. Amrein, M. B. Cibils and K. B. Sinha, Configuration space properties of the S-matrix and time delay in potential scattering, Ann. Inst. Henri Poincar´e 47 (1987) 367–382. [5] W. O. Amrein and Ph. Jacquet, Time delay for one-dimensional quantum systems with steplike potentials, Phys. Rev. A 75(2) (2007) 022106, 20 pp. [6] W. O. Amrein, J. M. Jauch and K. B. Sinha, Scattering Theory in Quantum Mechanics (Benjamin, Reading, 1977). [7] W. O. Amrein and K. B. Sinha, Time delay and resonances in potential scattering, J. Phys. A 39(29) (2006) 9231–9254. [8] D. Boll´e, F. Gesztesy and H. Grosse, Time delay for long-range interactions, J. Math. Phys. 24(6) (1983) 1529–1541. [9] D. Boll´e and T. A. Osborn, Time delay in N -body scattering, J. Math. Phys. 20 (1979) 1121–1134. [10] V. Buslaev and Pushnitski A, The scattering matrix and associated formulas in hamiltonian mechanics, preprint (2008); arXiv:0805.4172.
[11] V. S. Buslaev, Spectral identities and the trace formula in the Friedrichs model, in Spectral Theory and Wave Processes, ed. M. Sh. Birman (Consultants Bureau Plenum Publishing Corporation, New York, 1971), pp. 43–54. [12] C. A. A. de Carvalho and H. M. Nussenzveig, Time delay, Phys. Rep. 364(2) (2002) 83–174. [13] T. Dreyfus, The determinant of the scattering matrix and its relation to the number of eigenvalues, J. Math. Anal. Appl. 64(1) (1978) 114–134. [14] E. M. Dyn’kin, S. N. Naboko and S. I. Yakovlev, A finiteness bound for the singular spectrum in a selfadjoint Friedrichs model, Algebra i Analiz 3(2) (1991) 77–90. [15] L. D. Faddeev, On a model of Friedrichs in the theory of perturbations of the continuous spectrum, Trudy Mat. Inst. Steklov 73 (1964) 292–313. ¨ [16] K. Friedrichs, Uber die Spektralzerlegung eines Integraloperators, Math. Ann. 115(1) (1938) 249–272. [17] C. G´erard and R. Tiedra de Aldecoa, Generalized definition of time delay in scattering theory, J. Math. Phys. 48(12) (2007) 122101, 15 pp. [18] M. A. Grubb and D. B. Pearson, Derivation of the wave and scattering operators for an interaction of rank one, J. Math. Phys. 11 (1970) 2415–2424. [19] K. Gustafson and K. Sinha, On the Eisenbud–Wigner formula for time-delay, Lett. Math. Phys. 4(5) (1980) 381–385. [20] L. H¨ ormander, The Analysis of Linear Partial Differential Operators. I, Distribution Theory and Fourier Analysis, Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], Vol. 256 (Springer-Verlag, Berlin, 1983). [21] L. H¨ ormander, The Analysis of Linear Partial Differential Operators. II, Differential Operators with Constant Coefficients, Classics in Mathematics (Springer-Verlag, Berlin, 2005); Reprint of the 1983 orginal. [22] J. M. Jauch, R. Lavine and R. G. Newton, Scattering into cones, Helv. Phys. Acta 45 (1972/73) 325–330. [23] J. M. Jauch, K. B. Sinha and B. N. Misra, Time-delay in scattering processes, Helv. Phys. 
Acta 45 (1972) 398–426. [24] A. Jensen, Time-delay in potential scattering theory, Comm. Math. Phys. 82 (1981) 435–456. [25] A. Jensen, E. Mourre and P. Perry, Multiple commutator estimates and resolvent smoothness in quantum scattering theory, Ann. Inst. H. Poincar´ e Phys. Th´eor. 41(2) (1984) 207–225. [26] A. Jensen and S. Nakamura, Mapping properties of wave and scattering operators for two-body Schr¨ odinger operators, Lett. Math. Phys. 24 (1992) 295–305. [27] S. T. Kuroda, An Introduction to Scattering Theory, Lecture Notes Series, Vol. 51 (Aarhus Universitet Matematisk Institut, Aarhus, 1978). [28] S. Lang, Real Analysis, Addison-Wesley Publishing Company Advanced Book Program, 2nd edn. (Addison-Wesley Publishing Co., 1983). [29] P. A. Martin, Scattering theory with dissipative interactions and time delay, Nuovo Cimento B 30 (1975) 217–238. [30] P. A. Martin, On the time-delay of simple scattering systems, Comm. Math. Phys. 47(3) (1976) 221–227. [31] P. A. Martin, Time delay in quantum scattering processes, Acta Phys. Austriaca Suppl. XXIII (1981) 157–208. [32] A. Mohapatra, K. B. Sinha and W. O. Amrein, Configuration space properties of the scattering operator and time delay for potentials decaying like |x|−α , α > 1, Ann. Inst. H. Poincar´ e Phys. Th´eor. 57(1) (1992) 89–113.
[33] S. Nakamura, Time-delay and Lavine’s formula, Comm. Math. Phys. 109(3) (1987) 397–415. [34] H. Narnhofer, Time delay and dilation properties in scattering theory, J. Math. Phys. 25(4) (1984) 987–991. [35] A. E. Oganjan, The virial theorem and the trace formula in the Friedrichs model, in Mathematical Analysis and Probability Theory (in Russian), ed. V. S. Koroljuk (“Naukova Dumka”, Kiev, 1978), pp. 127–131, 218. [36] M. Reed and B. Simon, Methods of Modern Mathematical Physics. III, Scattering Theory (Academic Press [Harcourt Brace Jovanovich Publishers], New York, 1979). [37] M. Reed and B. Simon, Methods of Modern Mathematical Physics. I, Functional Analysis, 2nd edn. (Academic Press Inc. [Harcourt Brace Jovanovich Publishers], New York, 1980). [38] D. Robert, Relative time-delay for perturbations of elliptic operators and semiclassical asymptotics, J. Funct. Anal. 126(1) (1994) 36–82. [39] D. Robert and X. P. Wang, Existence of time-delay operators for Stark Hamiltonians, Comm. Partial Differential Equations 14(1) (1989) 63–98. [40] D. W. Robinson, Propagation properties in scattering theory, J. Austral. Math. Soc. Ser. B 21(4) (1979/80) 474–485. [41] J. Sahbani, The conjugate operator method for locally regular Hamiltonians, J. Operator Theory 38(2) (1997) 297–322. [42] J. Sahbani, Propagation theorems for some classes of pseudo-differential operators, J. Math. Anal. Appl. 211(2) (1997) 481–497. [43] M. Sassoli de Bianchi and P. A. Martin, On the definition of time delay in scattering theory, Helv. Phys. Acta 65(8) (1992) 1119–1126. [44] F. T. Smith, Lifetime matrix in collision theory, Phys. Rev. 118 (1960) 349–356. [45] H. Tamura, Time delay in scattering by potentials and by magnetic fields with two supports at large separation, J. Funct. Anal. 254(7) (2008) 1735–1775. [46] R. Tiedra de Aldecoa, Time delay for dispersive systems in quantum scattering theory. I. The Friedrichs model, preprint (2008); arXiv:0804.1349. [47] R. 
Tiedra de Aldecoa, Time delay and short-range scattering in quantum waveguides, Ann. Henri Poincar´e 7(1) (2006) 105–124. [48] R. Tiedra de Aldecoa, Anisotropic Lavine’s formula and symmetrised time delay in scattering theory, Math. Phys. Anal. Geom. 11(2) (2008) 155–173. [49] X. P. Wang, Phase-space description of time-delay in scattering theory, Comm. Partial Differential Equations 13(2) (1988) 223–259. [50] D. R. Yafaev, On the trace formula in multichannel Friedrichs model, Proc. Steklov Inst. Math. 2 (1981) 205–213. [51] D. R. Yafaev, Mathematical Scattering Theory, General Theory, Translations of Mathematical Monographs, Vol. 105 (American Mathematical Society, Providence, RI, 1992); Translated from the Russian by J. R. Schulenberger.
July 8, 2009 10:14 WSPC/148-RMP
J070-00372
Reviews in Mathematical Physics Vol. 21, No. 6 (2009) 709–733 c World Scientific Publishing Company
ABSOLUTELY CONTINUOUS SPECTRUM FOR A RANDOM POTENTIAL ON A TREE WITH STRONG TRANSVERSE CORRELATIONS AND LARGE WEIGHTED LOOPS
RICHARD FROESE
Department of Mathematics, University of British Columbia, Vancouver, British Columbia, V6T 1Z2 Canada

DAVID HASLER
Department of Mathematics, The College of William and Mary, Williamsburg, Virginia, 23187 USA

WOLFGANG SPITZER
Institut für Theoretische Physik, Universität Erlangen-Nürnberg, Erlangen, 91058 Germany

Received 14 January 2009
Revised 11 May 2009

We consider random Schrödinger operators on tree graphs and prove absolutely continuous spectrum at small disorder for two models. The first model is the usual binary tree with certain strongly correlated random potentials. These potentials are of interest since for complete correlation they exhibit localization at all disorders. In the second model, we change the tree graph by adding all possible edges to the graph inside each sphere, with weights proportional to the number of points in the sphere.

Keywords: Absolutely continuous spectrum; Bethe lattice with spherical mean field Laplacian; strongly correlated random potential.

Mathematics Subject Classification 2000: 82B44
1. Introduction

Proving the existence of absolutely continuous spectrum for random Schrödinger operators at weak disorder remains a challenging problem. The extended states conjecture, asserting the existence of absolutely continuous spectrum at low disorder for the Anderson model on $\mathbb{Z}^d$, $d \ge 3$, remains the most important open problem in the field. When $\mathbb{Z}^d$ is replaced by the Bethe lattice (or tree graph), this conjecture has been proved by Klein [13], extended and reproved by Aizenman, Sims and Warzel [2–5], and given yet another proof by the present authors [11].
July 8, 2009 10:14 WSPC/148-RMP
710
J070-00372
R. Froese, D. Hasler & W. Spitzer
Our proof, which only applied to binary trees, has been simplified and extended by Halasan [12] to cover trees with higher branching number, and with additional vertices. (See also Spitzer [15].) Recent work on trees includes level statistics by Aizenman and Warzel [6] and localization respectively singular continuous spectrum by Breuer [7] and [8], and by Breuer and Frank [9]. There is a large gap between the known results for the tree and the open problem on Zd . This present paper is an attempt to address some of the problems that would come up on Zd in simpler models. The paper has two parts. In the first part, we consider a binary tree with a transversely 2-periodic random potential. The potential is defined by choosing two values of the potential at random, independently for each sphere or level (that is, a set of vertices a fixed distance in the graph from the origin) in the tree. These two values are then repeated periodically across the sphere. The point of this model is that although the underlying graph is still a tree, we have negated some of the advantage of the exponential spreading of the tree. In fact, such two-periodic potentials can exhibit either dense point spectrum or absolutely continuous spectrum. In our previous paper [10], the values (q1 , q2 ) were chosen close to (δ, −δ) for δ > 0. In this case, we obtained a deterministic result proving existence of absolutely continuous spectrum. On the other hand, if (q1 , q2 ) are chosen randomly on the diagonal q1 = q2 then the potential is radial, and this model is equivalent to a one-dimensional Anderson model that exhibits localization at all disorders. We will prove that if the potentials (q1 , q2 ) are sufficiently uncorrelated (see assumption (8) below) then there will be some absolutely continuous spectrum, as is the case for the Anderson model. However, since in some sense this model is so close to being one-dimensional, the proof has some features not appearing in [11]. 
In both [11] and the present paper, the proofs follow from an estimate of an average over potential values $q$ of functions $\mu(z, q)$, similar in both models, that measure the contraction of a relevant map of the plane. We seek an estimate of the form $\int \mu(z, q)\, d\nu(q) < 1$ for $z$ near the boundary at infinity. In [11], we use the independence of the potentials across the sphere in proving that $\mu(z, 0)$ is already less than one. Then small values of $q$ in the integral are handled by semicontinuity. In the present situation, $\mu(z, q)$ for $q = 0$ is identically equal to one, and perturbations in $q$ send it in both directions. Thus we must use cancellations in the integral over $q$ in an essential way. Our method extends to the case where the joint distributions are not identical, as long as they are all centered and satisfy certain uniform bounds. This is significant since in this case we lose the self-similarity that has been used in previous proofs. Another obvious way that $\mathbb{Z}^d$ differs from the tree is in the presence of arbitrarily large loops. In the second part of this paper, we show how to introduce (weighted) loops with unbounded size into the model from the first part. We introduce connections between every pair of vertices in a given sphere, weighted to make
July 8, 2009 10:14 WSPC/148-RMP
J070-00372
Absolutely Continuous Spectrum for a Random Potential on a Tree
711
the total weight of the added edges equal to one in each sphere. This is a sort of mean field interaction. These connections mean that when we remove the interior of some ball from the graph, the resulting exterior domain does not consist of disconnected pieces equivalent to the original graph, as is the case for the tree. Nevertheless, we can prove absolutely continuous spectrum for this model using results from the first part of this paper in a two-step procedure. To reduce the technical complication, we will only consider a Bernoulli distribution for the potentials in this section. In the next section, we review the basic set-up for calculating a diagonal matrix element of the Green’s function for discrete random Schr¨ odinger operators, using a decomposition of the graph and the corresponding sequence of forward Green’s functions. In Sec. 3, we specialize to a tree model with a strongly transversely correlated random potential and present Theorem 2, the first main theorem. The bounds on the moment required in the proof of this theorem are given in Sec. 4 but the proof of the main technical Lemma 4 is postponed to Sec. 6. Section 5 deals with extensions and open problems related to our method of proof. The last two sections are devoted to the mean field tree model. Theorem 9 is our second main result. A proof of the main technical Lemma 12 needed for this theorem is relegated to Sec. 8.
2. Review of Basic Setup

Let $(V, E)$ be a graph with vertex set $V$ and edges $E \subseteq V \times V$, and let $\gamma : E \to \mathbb{R}_+$ be a bounded symmetric function. Let $L$ be the Laplacian with matrix elements given by
\[ L_{v,w} = \begin{cases} \gamma((v, w)) & \text{if } (v, w) \in E, \\ 0 & \text{otherwise.} \end{cases} \]
We assume that the number of edges joining a vertex is uniformly bounded. Then $L$ is a bounded, self-adjoint operator on $\ell^2(V)$. Given a potential $q : V \to \mathbb{R}$, let $Q$ be the operator of multiplication by $q$ with matrix elements $Q_{v,w} = q(v)\delta_{v,w}$. We are interested in the spectrum of the discrete Schrödinger operator
\[ H = L + Q \]
acting in $\ell^2(V)$. Let $0 \in V$ denote a distinguished vertex. We will study the spectral measure for $H$ for the vector $\delta_0 \in \ell^2(V)$ given by
\[ \delta_0(v) = \begin{cases} 1 & \text{if } v = 0, \\ 0 & \text{otherwise,} \end{cases} \]
through its Borel transform given by the Green's function
\[ G_0(\lambda) = \big\langle \delta_0, (H - \lambda)^{-1} \delta_0 \big\rangle. \]
Our approach is based on a decomposition of $V$ as a disjoint union and the corresponding direct sum decomposition of $\ell^2(V)$:
\[ V = \bigcup_{n=0}^{\infty} S_n, \qquad \ell^2(V) = \bigoplus_{n=0}^{\infty} \ell^2(S_n). \]
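We assume that $S_0 = \{0\}$ and that vertices in $S_n$ are only connected to vertices in $S_{n-1}$, $S_n$ and $S_{n+1}$. (We will take the sets $S_n$ to be spheres containing all vertices a distance $n$ in the graph from $0$.) Then the block matrix forms of $L$ and $H$ have zeros away from the diagonal and first off-diagonal blocks:
\[
L = \begin{bmatrix}
D_0 & E_0^T & 0 & 0 & \cdots \\
E_0 & D_1 & E_1^T & 0 & \cdots \\
0 & E_1 & D_2 & E_2^T & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{bmatrix},
\qquad
H = \begin{bmatrix}
D_0 + Q_0 & E_0^T & 0 & 0 & \cdots \\
E_0 & D_1 + Q_1 & E_1^T & 0 & \cdots \\
0 & E_1 & D_2 + Q_2 & E_2^T & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{bmatrix}.
\]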
According to the formula for $L$, the matrix $D_n$ is the Laplacian for the sphere $S_n$, while $E_n$ has non-zero entries corresponding to the connections between $S_n$ and $S_{n+1}$. Let $P_n$ denote the projection of $\ell^2(V)$ onto $\ell^2(S_n)$ and define $P_{n,\infty} = \sum_{k=n}^{\infty} P_k$. Define
\[ H_n = P_{n,\infty} H P_{n,\infty} \]
and the forward Green's functions
\[ G_n(\lambda) = P_n (H_n - \lambda)^{-1} P_n. \]
Each $G_n(\lambda)$ is a $d_n \times d_n$ matrix, where $d_n$ is the number of vertices in $S_n$, and lies in the Siegel upper half space $\mathbb{SH}_{d_n}$, that is, the space of symmetric $d_n \times d_n$ matrices with positive definite imaginary part; $\mathbb{H} := \mathbb{SH}_1$ is the usual complex upper half plane. The forward Green's functions are related by the formula
\[ G_n(\lambda) = \Phi_n(G_{n+1}, Q_n, \lambda), \tag{1} \]
where $\Phi_n : \mathbb{SH}_{d_{n+1}} \times S_{d_n} \times \mathbb{H} \to \mathbb{SH}_{d_n}$ is given by
\[ \Phi_n(G_{n+1}, Q_n, \lambda) = -\big( E_n^T G_{n+1} E_n - D_n - Q_n + \lambda \big)^{-1}. \]
Here $S_d$ is the set of $d \times d$ real symmetric matrices. To see this, note that $G_n(\lambda)$ is the top left corner block of
\[
\begin{bmatrix}
D_n + Q_n - \lambda & \begin{bmatrix} E_n^T & 0 & 0 & \cdots \end{bmatrix} \\[4pt]
\begin{bmatrix} E_n \\ 0 \\ \vdots \end{bmatrix} & H_{n+1} - \lambda
\end{bmatrix}^{-1}.
\]
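Thus, according to Schur's formula
\[
\begin{bmatrix} A & B^T \\ B & C \end{bmatrix}^{-1}
=
\begin{bmatrix}
(A - B^T C^{-1} B)^{-1} & (B^T C^{-1} B - A)^{-1} B^T C^{-1} \\
C^{-1} B (B^T C^{-1} B - A)^{-1} & (C - B A^{-1} B^T)^{-1}
\end{bmatrix}
\tag{2}
\]
for the inverse of a symmetric block matrix, we have
\[
G_n(\lambda) = -\left( \begin{bmatrix} E_n^T & 0 & \cdots \end{bmatrix} (H_{n+1} - \lambda)^{-1} \begin{bmatrix} E_n \\ 0 \\ \vdots \end{bmatrix} - D_n - Q_n + \lambda \right)^{-1},
\]
which implies (1).

Now suppose that the potential is chosen at random, independently for every sphere $S_n$ according to a probability distribution $N_n$ on $\mathbb{R}^{d_n}$. Then the matrices $G_n(\lambda)$ are random variables, distributed according to some measure $R_{n,\lambda}$ on $\mathbb{SH}_{d_n}$, and (1) implies that $R_{n,\lambda}$ is the push-forward of $R_{n+1,\lambda} \times N_n$ under $\Phi_n$. This means that for every integrable function $f$ on $\mathbb{SH}_{d_n}$
\[
\int_{\mathbb{SH}_{d_n}} f(Z)\, dR_{n,\lambda}(Z) = \int_{\mathbb{SH}_{d_{n+1}} \times \mathbb{R}^{d_n}} f(\Phi_n(Z, Q, \lambda))\, dN_n(Q)\, dR_{n+1,\lambda}(Z).
\tag{3}
\]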
The measure in which we really are interested is $R_{0,\lambda}$, the distribution for $G_0$, which is a probability measure on $\mathbb{H}$. In our examples, we will use formula (3) to prove a bound of the form
\[
\sup_{|\mathrm{Re}(\lambda)| \le \lambda_0,\ 0 < \mathrm{Im}(\lambda) \le \epsilon} \int_{\mathbb{H}} w^{1+\alpha}(z)\, dR_{0,\lambda}(z) < \infty,
\tag{4}
\]
where $\alpha > 0$ and $w(z)$ is a weight function satisfying
\[ \mathrm{Im}(z) \le C w(z) \tag{5} \]
for $z$ in a neighborhood of the boundary at infinity $\partial_\infty \mathbb{H}$. In the upper half plane model of hyperbolic space $\mathbb{H}$, the boundary at infinity is $\mathbb{R} \cup \{i\infty\}$. A neighborhood of $\partial_\infty \mathbb{H}$ is the complement of a closed bounded set in $\mathbb{H} \cup \partial_\infty \mathbb{H}$. Here and throughout the paper, $C$ denotes a generic constant that may change from line to line. Notice that the integral in formula (4) is the expectation $E[w^{1+\alpha}(G_0(\lambda))]$.

Lemma 1. Suppose that (4) holds for some $\alpha > 0$ and some weight function $w(z)$ satisfying (5). Then the spectral measure $\mu_0$ of which $G_0(\lambda)$ is the Borel transform is almost surely purely absolutely continuous in $(-\lambda_0, \lambda_0)$.
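Proof (Following Klein [13] and Simon [14]). By Fatou's lemma and (4),
\[
E\Big[ \liminf_{\epsilon \downarrow 0} \int_{-\lambda_0}^{\lambda_0} w^{1+\alpha}(G_0(x + i\epsilon))\, dx \Big]
\le \liminf_{\epsilon \downarrow 0} \int_{-\lambda_0}^{\lambda_0} E\big( w^{1+\alpha}(G_0(x + i\epsilon)) \big)\, dx < C.
\]
This implies that for almost every choice of potential
\[
\liminf_{\epsilon \downarrow 0} \int_{-\lambda_0}^{\lambda_0} \big( \mathrm{Im}(G_0(x + i\epsilon)) \big)^{1+\alpha} dx
\le C \liminf_{\epsilon \downarrow 0} \int_{-\lambda_0}^{\lambda_0} \big( w(G_0(x + i\epsilon)) \big)^{1+\alpha} dx < C.
\]
So, for such a potential, there exists a sequence $\epsilon_n \downarrow 0$ such that
\[
\sup_n \int_{-\lambda_0}^{\lambda_0} \big( \mathrm{Im}(G_0(x + i\epsilon_n)) \big)^{1+\alpha} dx < C.
\]
Then, since $\pi^{-1} \mathrm{Im}\, G_0(x + i\epsilon)\, dx$ converges to $d\mu_0(x)$ weakly (see [14]) as $\epsilon \downarrow 0$, we find that for any compactly supported continuous function $f$
\[
\begin{aligned}
\Big| \int_{-\lambda_0}^{\lambda_0} f(x)\, d\mu_0(x) \Big|
&= \lim_{n \to \infty} \pi^{-1} \Big| \int_{-\lambda_0}^{\lambda_0} f(x)\, \mathrm{Im}\, G_0(x + i\epsilon_n)\, dx \Big| \\
&\le \limsup_{n \to \infty} \pi^{-1} \Big( \int_{-\lambda_0}^{\lambda_0} |f(x)|^q dx \Big)^{1/q} \Big( \int_{-\lambda_0}^{\lambda_0} \big( \mathrm{Im}\, G_0(x + i\epsilon_n) \big)^{1+\alpha} dx \Big)^{1/(1+\alpha)} \\
&\le C \|f\|_q.
\end{aligned}
\]
Here $q$ is the dual exponent to $1 + \alpha$ in Hölder's inequality. This implies that $d\mu_0(x) = g(x)\, dx$ for some $g \in L^{1+\alpha}$ and completes the proof.

3. A Binary Tree with Transversely 2-Periodic Potentials

We now specialize to a binary tree (Fig. 1).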
Fig. 1. Rooted binary tree with transversely 2-periodic potential (spheres $S_0$, $S_1$, $S_2$, $S_3$, ...).
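For a tree, the forward Green's functions are diagonal, and with $G_{n+1}(\lambda) = \mathrm{diag}[z_1, z_2, \ldots, z_{2^{n+1}}]$, $Q_n = \mathrm{diag}[q_1, q_2, \ldots, q_{2^n}]$, we have
\[
\Phi_n(G_{n+1}, Q_n, \lambda) = \mathrm{diag}\left[ \frac{-1}{z_1 + z_2 + \lambda - q_1}, \ldots, \frac{-1}{z_{2^{n+1}-1} + z_{2^{n+1}} + \lambda - q_{2^n}} \right].
\]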
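The diagonal recursion above can be checked numerically on a finite truncation of the tree. The following is a minimal sketch, not the authors' code: it assumes all edge weights equal one, truncates the binary tree at a finite depth (so the leaves play the role of the last sphere), stores vertices in heap order (vertex `k` has children `2k+1` and `2k+2`), and uses hypothetical helper names. It compares the root value obtained from the recursion with the $(0,0)$ entry of $(H - \lambda)^{-1}$ computed by a direct linear solve.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def two_periodic_potential(depth, pairs, root):
    """Potential in heap order: sphere n alternates the values pairs[n-1] = (q1, q2)."""
    pot = [root]
    for n in range(1, depth + 1):
        q1, q2 = pairs[n - 1]
        pot.extend(q1 if j % 2 == 0 else q2 for j in range(2 ** n))
    return pot


def green_direct(depth, pot, lam):
    """(0,0) entry of (H - lam)^{-1} for the truncated tree, via a dense solve."""
    n = 2 ** (depth + 1) - 1
    A = [[0j] * n for _ in range(n)]
    for k in range(n):
        A[k][k] = pot[k] - lam
        for child in (2 * k + 1, 2 * k + 2):
            if child < n:
                A[k][child] = A[child][k] = 1.0
    e0 = [1.0] + [0.0] * (n - 1)
    return solve(A, e0)[0]


def green_recursive(depth, pot, lam):
    """Root value from z_parent = -1 / (z_child1 + z_child2 + lam - q_parent)."""
    n = 2 ** (depth + 1) - 1
    z = [0j] * n
    for k in reversed(range(n)):
        children = [c for c in (2 * k + 1, 2 * k + 2) if c < n]
        z[k] = -1.0 / (sum(z[c] for c in children) + lam - pot[k])
    return z[0]


pot = two_periodic_potential(3, [(0.3, -0.2), (0.1, 0.4), (-0.5, 0.25)], 0.05)
lam = 0.7 + 0.1j
g_direct = green_direct(3, pot, lam)
g_recursive = green_recursive(3, pot, lam)
```

In heap order each vertex has one child at an even position and one at an odd position within the next sphere, so every vertex sees one $q_1$-child and one $q_2$-child, which matches the alternating pattern described for the figure.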
To define a two-periodic potential we choose for each sphere (except the root) two potential values $q = (q_1, q_2)$ at random, independently for each sphere, according to an identical joint distribution $\nu$. In the diagram, the spheres are outlined by boxes. For each sphere (except the first), after choosing $q = (q_1, q_2)$, we set the potential at all the black vertices equal to $q_1$ and the potential at all the white vertices equal to $q_2$. The potential value at $0$ is chosen according to some single site distribution $\nu_{(0)}$.

We make the following assumptions about this distribution $\nu$. The distribution has bounded support:
\[ \nu \text{ is supported in } \{ q = (q_1, q_2) : |q_1| \le 1,\ |q_2| \le 1 \}. \tag{6} \]
The distribution is centered on zero:
\[ \int (q_1 + q_2)\, d\nu(q) = 0. \tag{7} \]
Let $c_{ij} = \int q_i q_j\, d\nu(q)$. Then
\[ c = c_{11} + c_{22} > 0 \quad \text{and} \quad \delta = \frac{2 c_{12}}{c_{11} + c_{22}} < 1/2. \tag{8} \]
The first inequality in (8) simply says that $q$ is not identically zero. The second is a bound on the correlation. Completely correlated potentials (that is, the one-dimensional case where the spectrum is localized) would correspond to $\delta = 1$.

To adjust the disorder, we multiply the potential by a coupling constant $a > 0$ and study the Schrödinger operator
\[ H_a = L + a Q. \]
This amounts to replacing $\nu$ with the scaled distribution $\nu_a$ satisfying $\int f(q)\, d\nu_a(q) = \int f(aq)\, d\nu(q)$. The scaled distribution $\nu_a$ is supported in $\{ q = (q_1, q_2) : |q_1| \le a,\ |q_2| \le a \}$. We can now formulate the main theorem for this section.

Theorem 2. Let $\nu_{(0)}$ be a probability measure of bounded support for the potential at the root, let $\nu$ be a probability measure on $\mathbb{R}^2$ satisfying (6)–(8) and let $H_a$ be the random discrete Schrödinger operator on the binary tree corresponding to the transversely two-periodic potential defined by the scaled distribution $\nu_a$. There exists $\lambda_0 \in (0, 2\sqrt{2})$ such that for sufficiently small $a$ the spectral measure for $H_a$ corresponding to $\delta_0$ has purely absolutely continuous spectrum in $(-\lambda_0, \lambda_0)$.
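To make assumptions (7) and (8) concrete, the following sketch evaluates the centering integral, $c$ and $\delta$ for a discrete distribution $\nu$ given as a list of atoms. The representation of $\nu$ as `(probability, q1, q2)` triples and the function name are illustrative assumptions, not notation from the paper.

```python
def correlation_parameters(atoms):
    """Return (mean of q1 + q2, c, delta) for a discrete distribution.

    atoms: list of (probability, q1, q2) with probabilities summing to one.
    c = c11 + c22 and delta = 2 * c12 / (c11 + c22), where cij = E[qi * qj].
    """
    mean_sum = sum(p * (q1 + q2) for p, q1, q2 in atoms)
    c11 = sum(p * q1 * q1 for p, q1, q2 in atoms)
    c22 = sum(p * q2 * q2 for p, q1, q2 in atoms)
    c12 = sum(p * q1 * q2 for p, q1, q2 in atoms)
    c = c11 + c22
    return mean_sum, c, 2 * c12 / c


# Independent symmetric signs: q1, q2 = +-1 with all four sign patterns equally likely.
independent = [(0.25, s1, s2) for s1 in (1.0, -1.0) for s2 in (1.0, -1.0)]
# Completely correlated (radial) potential: q1 = q2 = +-1.
radial = [(0.5, 1.0, 1.0), (0.5, -1.0, -1.0)]
# Perfectly anti-correlated potential: q2 = -q1.
anti = [(0.5, 1.0, -1.0), (0.5, -1.0, 1.0)]

params_independent = correlation_parameters(independent)
params_radial = correlation_parameters(radial)
params_anti = correlation_parameters(anti)
```

With these choices the independent case gives $\delta = 0$ and the anti-correlated case gives $\delta = -1$, both within the range allowed by (8), while the radial case gives $\delta = 1$ and is excluded, in line with the remark that complete correlation corresponds to the one-dimensional localized situation.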
For a two-periodic potential, the formula (3) can be simplified. In this case, the measure $N_n$ is independent of $n$ and concentrated on the two-dimensional hyperplane where $q_1 = q_3 = q_5 = \cdots$ and $q_2 = q_4 = q_6 = \cdots$. Thus, introducing a coupling constant $a$, the measure $N_n$ is a product of $\nu_a$ with delta functions for the hyperplane. For these potentials the diagonal entries of $G_n(\lambda)$ exhibit the same symmetry as the potentials, so the probability distribution for $G_n(\lambda)$ is determined by the joint distribution $r_{a,\lambda}$ for $(z_1, z_2)$, which also is independent of $n$. With this notation, the formula (3) can be written
\[
\int_{\mathbb{H} \times \mathbb{H}} f(z_1, z_2)\, dr_{a,\lambda}(z_1, z_2)
= \int_{\mathbb{H} \times \mathbb{H} \times \mathbb{R}^2} f\Big( -\frac{1}{z_1 + z_2 + \lambda - q_1}, -\frac{1}{z_1 + z_2 + \lambda - q_2} \Big)\, d\nu_a(q)\, dr_{a,\lambda}(z_1, z_2).
\]
It is convenient to introduce a new random variable $u = z_1 + z_2 + \lambda$ for every sphere except the first. Let $\rho_{a,\lambda}$ denote the distribution on $\mathbb{H}$ for $u$. Then, taking $f(z_1, z_2) = g(z_1 + z_2 + \lambda)$ in the formula above we obtain our main recursion formula
\[
\int_{\mathbb{H}} g(u)\, d\rho_{a,\lambda}(u) = \int_{\mathbb{H} \times \mathbb{R}^2} g(\phi_{q,\lambda}(u))\, d\nu_a(q)\, d\rho_{a,\lambda}(u),
\tag{9}
\]
where
\[
\phi_{q,\lambda}(u) = -\frac{1}{u - q_1} - \frac{1}{u - q_2} + \lambda.
\tag{10}
\]
A source of difficulty is the singular behavior of $\phi_{q,\lambda}$ near the diagonal of $q$. When $q_1 = q_2$ (and $\mathrm{Im}(\lambda) \ge 0$), $\phi_{q,\lambda}$ is a linear fractional transformation that defines an injective map from $\mathbb{H}$ to $\mathbb{H}$. In fact, if $\lambda \in \mathbb{R}$ the map is a hyperbolic isometry. However, as soon as $q_1 \ne q_2$ the map $\phi_{q,\lambda}$ covers $\mathbb{H}$ twice. This can be seen even when we only consider real values of $u$. In this case $\phi_{q,\lambda}(u)$ ranges over all of $\mathbb{R}$ for $u$ in the interval $(q_1, q_2)$ (supposing for the moment that $q_1 < q_2$). This interval shrinks and then disappears as $q_1$ approaches $q_2$.

We now introduce the weight function $\mathrm{cd}(u)$. For $\lambda \in (-2\sqrt{2}, 2\sqrt{2})$ the fixed point solution of $u \mapsto \phi_{0,\lambda}(u)$ is $u_\lambda = \lambda/2 + i\sqrt{2 - \lambda^2/4}$. Define
\[
\mathrm{cd}(u) = \frac{|u - u_\lambda|^2}{\mathrm{Im}(u)}.
\tag{11}
\]
Our goal is to bound the moment
\[
M_{a,\alpha,\lambda} = \int_{\mathbb{H}} \mathrm{cd}(u)^{1+\alpha}\, d\rho_{a,\lambda}(u).
\tag{12}
\]
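The objects entering the recursion can be explored numerically. The sketch below (with illustrative function names that are not from the paper) implements the map $\phi_{q,\lambda}$ of (10), the fixed point $u_\lambda$ and the weight $\mathrm{cd}$ of (11) for real $\lambda$ with $|\lambda| < 2\sqrt{2}$, and checks three statements made in the text: $u_\lambda$ is fixed by $\phi_{0,\lambda}$, the map sends the upper half plane into itself, and for real $u$ in $(q_1, q_2)$ the values of $\phi_{q,\lambda}(u)$ run from very negative to very positive.

```python
import math


def u_lambda(lam):
    """Fixed point of u -> -2/u + lam in the upper half plane (real lam, |lam| < 2*sqrt(2))."""
    return complex(lam / 2, math.sqrt(2 - lam * lam / 4))


def phi(u, q1, q2, lam):
    """The map phi_{q,lam}(u) = -1/(u - q1) - 1/(u - q2) + lam."""
    return -1 / (u - q1) - 1 / (u - q2) + lam


def cd(u, lam):
    """Weight function cd(u) = |u - u_lam|^2 / Im(u) for u in the upper half plane."""
    return abs(u - u_lambda(lam)) ** 2 / u.imag


lam = 1.0
ul = u_lambda(lam)
fixed_point_error = abs(phi(ul, 0.0, 0.0, lam) - ul)

# Image of a point of the upper half plane under a map with q1 != q2.
image = phi(1 + 0.5j, 0.1, -0.1, lam)

# Real values of u just inside the interval (q1, q2) = (-0.1, 0.1).
near_left = phi(-0.1 + 1e-6, -0.1, 0.1, lam)
near_right = phi(0.1 - 1e-6, -0.1, 0.1, lam)
```

The last two values have opposite signs and large magnitude, which illustrates why $\phi_{q,\lambda}$ restricted to the real interval $(q_1, q_2)$ already covers the whole real line once $q_1 \ne q_2$.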
Given Lemma 1, such a bound for $R_{0,\lambda}$ in place of $\rho_{a,\lambda}$ will provide a proof of Theorem 2. This is done in the following lemma.

Lemma 3. Let $\nu_{(0)}$ be a probability measure of bounded support for the potential at the root, and suppose that
\[ \sup_{|\mathrm{Re}\,\lambda| \le \lambda_0,\ 0 < \mathrm{Im}\,\lambda \le \epsilon} M_{a,\alpha,\lambda} < C \]
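for some positive $a$, $\alpha$ and $\epsilon$. Then the spectral measure for $\delta_0$ corresponding to the transversely two-periodic random potential with coupling constant $a$ has purely absolutely continuous spectrum in $[-\lambda_0, \lambda_0]$.

Proof. Let $w(z) = |z - i|^2 / \mathrm{Im}(z)$. The recursion formula (3) for the first level implies
\[
\begin{aligned}
\int_{\mathbb{H}} w(z)^{1+\alpha}\, dR_{0,\lambda}(z)
&= \int_{\mathbb{H} \times \mathbb{R}} w\big( -(u - q)^{-1} \big)^{1+\alpha}\, d\nu_{(0)}(q)\, d\rho_{a,\lambda}(u) \\
&\le \bigg[ \sup_{u \in \mathbb{H}} \int_{\mathbb{R}} \Big( \frac{w(-(u - q)^{-1})}{\mathrm{cd}(u) + 1} \Big)^{1+\alpha} d\nu_{(0)}(q) \bigg] (M_{a,\alpha,\lambda} + 1),
\end{aligned}
\]
so the lemma follows from Lemma 1 and the bound
\[
\sup_{u \in \mathbb{H}} \int_{\mathbb{R}} \Big( \frac{w(-(u - q)^{-1})}{\mathrm{cd}(u) + 1} \Big)^{1+\alpha} d\nu_{(0)}(q) \le C,
\]
which holds by our assumption on $\nu_{(0)}$.

4. Bounding $M_{a,\alpha,\lambda}$

Lemma 3 shows that our main theorem follows from a bound for $M_{a,\alpha,\lambda}$. We now explain how we can obtain such a bound. Beginning with (12) we introduce a cutoff function $\chi$, $0 \le \chi \le 1$, with support in a neighborhood of the boundary at infinity of $\mathbb{H}$, and with $\chi = 1$ near infinity. Since $\mathrm{cd}$ is bounded on the compact support of $1 - \chi$,
\[
M_{a,\alpha,\lambda} \le \int_{\mathbb{H}} \chi(u)\, \mathrm{cd}(u)^{1+\alpha}\, d\rho_{a,\lambda}(u) + C,
\]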
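where $C$ only depends on the support of $\chi$. Now we apply the recursion formula (9) to conclude
\[ M_{a,\alpha,\lambda} \le \int_{\mathbb{H}} \int_{\mathbb{R}^2} \chi(\phi_{q,\lambda}(u))\, \mathrm{cd}(\phi_{q,\lambda}(u))^{1+\alpha}\, d\nu_a(q)\, d\rho_{a,\lambda}(u) + C. \]
Since the image of $\phi_{q,\lambda}(u)$ as $q$ ranges over the support of $\nu_a$, $\lambda$ ranges over the rectangle $|\mathrm{Re}(\lambda)| \le \lambda_0$, $0 \le \mathrm{Im}(\lambda) \le \epsilon$, and $u$ ranges over the support of $1 - \chi$ is compact, the function $\mathrm{cd}(\phi_{q,\lambda}(u))$ is bounded there, and we may again insert a cutoff and conclude
\[ M_{a,\alpha,\lambda} \le \int_{\mathbb{H}} \int_{\mathbb{R}^2} \chi(u)\, \chi(\phi_{q,\lambda}(u))\, \mathrm{cd}(\phi_{q,\lambda}(u))^{1+\alpha}\, d\nu_a(q)\, d\rho_{a,\lambda}(u) + C. \tag{13} \]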
The constant $C$ is different from the previous equation, but can still be taken to be independent of $\lambda$ in the range of values we are considering. Here is the essential idea of our argument. Introduce
\[ \mu_{q,\lambda}(u) = \frac{\mathrm{cd}(\phi_{q,\lambda}(u))}{\mathrm{cd}(u)} = \frac{|(2u - q_1 - q_2)u_\lambda - 2(u - q_1)(u - q_2)|^2}{2(|u - q_1|^2 + |u - q_2|^2)\,|u - u_\lambda|^2} \tag{14} \]
and the averaged version
\[ \mu_{a,\alpha,\lambda}(u) = \int_{\mathbb{R}^2} \mu_{q,\lambda}^{1+\alpha}(u)\, d\nu_a(q). \]
Then (13) implies
\[ M_{a,\alpha,\lambda} \le \int_{\mathbb{H}} \chi(u)\, \mu_{a,\alpha,\lambda}(u)\, \mathrm{cd}(u)^{1+\alpha}\, d\rho_{a,\lambda}(u) + C. \]
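The closed form in (14) can be compared numerically with the ratio $\mathrm{cd}(\phi_{q,\lambda}(u))/\mathrm{cd}(u)$. The sketch below does this at random sample points, assuming $\lambda$ is real (the closed form uses $\lambda - u_\lambda = \bar{u}_\lambda = 2/u_\lambda$). The helper names and the sampling ranges are illustrative choices, not part of the paper.

```python
import math
import random

def phi(u, q1, q2, lam):
    """Map (10)."""
    return -1.0 / (u - q1) - 1.0 / (u - q2) + lam

def fixed_point(lam):
    """u_lambda for real lambda with |lambda| < 2*sqrt(2)."""
    return complex(lam / 2.0, math.sqrt(2.0 - lam * lam / 4.0))

def cd(u, lam):
    """Weight (11)."""
    return abs(u - fixed_point(lam)) ** 2 / u.imag

def mu_formula(u, q1, q2, lam):
    """Right-hand side of (14)."""
    ul = fixed_point(lam)
    num = abs((2 * u - q1 - q2) * ul - 2 * (u - q1) * (u - q2)) ** 2
    den = 2 * (abs(u - q1) ** 2 + abs(u - q2) ** 2) * abs(u - ul) ** 2
    return num / den

def max_relative_gap(trials=500, seed=0):
    """Largest relative difference between cd(phi(u))/cd(u) and formula (14)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        lam = rng.uniform(-2.5, 2.5)
        u = complex(rng.uniform(-3.0, 3.0), rng.uniform(0.1, 3.0))
        q1, q2 = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        ratio = cd(phi(u, q1, q2, lam), lam) / cd(u, lam)
        formula = mu_formula(u, q1, q2, lam)
        worst = max(worst, abs(ratio - formula) / max(1.0, abs(formula)))
    return worst

print(max_relative_gap())
print(mu_formula(complex(1.0, 1.0), 0.0, 0.0, 1.0))  # q = 0 gives mu = 1
```

For $q = 0$ the formula reduces to 1 identically, which is the reason averaging over $q$ is needed to gain a factor strictly below 1.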
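So if we knew that $\mu_{a,\alpha,\lambda}(u) \le 1 - \epsilon_1$ on $\mathrm{supp}(\chi)$ for a suitable range of $\lambda$, then we would obtain $M_{a,\alpha,\lambda} \le (1 - \epsilon_1) M_{a,\alpha,\lambda} + C$, which gives the desired bound on $M_{a,\alpha,\lambda}$. Notice that the averaging over $q$ is essential for obtaining such a bound, since $\mu_{0,\lambda}(u) = 1$. Also note that $\mu_{q,\lambda}(u)$ is continuous as $u$ and $\lambda$ approach the real axis, except at $u = q_1 = q_2$. This includes $u = i\infty$, by which we mean continuity as $w \to 0$ when we set $u = -1/w$. At the singular point, we can define $\mu_{q,\lambda}(u)$ to be the supremum of all possible limits. In this way, we can extend $\mu_{q,\lambda}(u)$ to an upper semi-continuous function whose domain includes real values of $u$ and $\lambda$.

Here is the bound for $\mu_{a,\alpha,\lambda}(u)$. This is the main technical result in the first part of the paper.

Lemma 4. Suppose that $\nu$ is a probability measure on $\mathbb{R}^2$ satisfying (6)–(8). Assume $u$ and $\lambda$ are real, $|\lambda| < 2\sqrt{2}$, and $a$ and $R$ are positive real numbers satisfying $R \ge 2$ and $aR \le 1/4$. Then there exist positive constants $C_i$ such that, with $c$ and $\delta$ defined by (8),
\[
\mu_{a,\alpha,\lambda}(u) \le
\begin{cases}
1 - \dfrac{c(1 - \delta)}{20R^2} + C_1 aR + C_2 \alpha & \text{for } |u| \le aR, \\[2ex]
1 - \dfrac{a^2 c}{|u|^2} \left( \dfrac{p(u, \lambda, \delta)}{2|u - u_\lambda|^2} - \dfrac{C_3}{R} - C_4 \alpha \right) & \text{for } |u| \ge aR,
\end{cases}
\tag{15}
\]
where $p(u, \lambda, \delta) = (1 - 2\delta)u^2 - (1 - \delta)\lambda u + 1 - \delta$.

This lemma is proved in a separate section. When $|u| \to \infty$ the bound tends to 1, so this bound alone is not sufficient. To proceed we must iterate the procedure leading to (13). Starting with (13) (with $q$ replaced by $q_1$) we apply (9) to arrive at
\[
\begin{aligned}
M_{a,\alpha,\lambda}
&\le \int_{\mathbb{H}} \int_{\mathbb{R}^2} \int_{\mathbb{R}^2} \chi(u)\, \chi(\phi_{q_1,\lambda}(u))\, \mathrm{cd}(\phi_{q_1,\lambda} \circ \phi_{q_2,\lambda}(u))^{1+\alpha}\, d\nu_a(q_1)\, d\nu_a(q_2)\, d\rho_{a,\lambda}(u) + C \\
&= \int_{\mathbb{H}} \int_{\mathbb{R}^2} \int_{\mathbb{R}^2} \chi(u)\, \mu_{q_1,\lambda}^{1+\alpha}(u)\, \chi(\phi_{q_1,\lambda}(u))\, \mu_{q_2,\lambda}^{1+\alpha}(\phi_{q_1,\lambda}(u))\, \mathrm{cd}(u)^{1+\alpha}\, d\nu_a(q_1)\, d\nu_a(q_2)\, d\rho_{a,\lambda}(u) + C.
\end{aligned}
\tag{16}
\]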
In view of Lemma 3, the following lemma will complete the proof of Theorem 2.
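Lemma 5. Suppose that $\nu$ satisfies (6)–(8). Then there exists $\lambda_0 \in (0, 2\sqrt{2})$ such that for small enough $a$ and $\epsilon$
\[ \sup_{|\mathrm{Re}\,\lambda| \le \lambda_0,\ 0 < \mathrm{Im}\,\lambda \le \epsilon} M_{a,\alpha,\lambda} < C. \]

Proof. Let
\[ m_{a,\alpha,\lambda}(u) = \chi(u) \int_{\mathbb{R}^2} \int_{\mathbb{R}^2} \mu_{q_1,\lambda}^{1+\alpha}(u)\, \chi(\phi_{q_1,\lambda}(u))\, \mu_{q_2,\lambda}^{1+\alpha}(\phi_{q_1,\lambda}(u))\, d\nu_a(q_1)\, d\nu_a(q_2). \]
Given (16), it suffices to show that there exist $\lambda_0 \in (0, 2\sqrt{2})$ and $\epsilon_1 > 0$ so that
\[ m_{a,\alpha,\lambda}(u) \le 1 - \epsilon_1 \tag{17} \]
for all $u$ in a neighborhood of infinity in $\mathbb{H}$ and for $a$ and $\alpha$ sufficiently small. An obvious estimate for $m_{a,\alpha,\lambda}(u)$ is
\[ m_{a,\alpha,\lambda}(u) \le \chi(u)\, \mu_{a,\alpha,\lambda}(u) \sup_{q_1 \in \mathrm{supp}(\nu_a)} \Big[ \chi(\phi_{q_1,\lambda}(u))\, \mu_{a,\alpha,\lambda}(\phi_{q_1,\lambda}(u)) \Big]. \tag{18} \]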
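We begin by choosing $\lambda_0$ with
\[ \lambda_0 < 2\sqrt{\frac{1 - 2\delta}{1 - \delta}}. \]
Then a simple calculation shows that the polynomial $p(u, \lambda, \delta)$ in Lemma 4 is bounded below, $p(u, \lambda, \delta) \ge p_0 > 0$, for all $u \in \mathbb{R}$ and $\lambda$ with $|\lambda| \le \lambda_0$. Choosing $R$ sufficiently large and $\alpha$ sufficiently small we can simplify the estimate in Lemma 4 to read
\[
\mu_{a,\alpha,\lambda}(u) \le
\begin{cases}
1 - 2\epsilon_2 + C_1 aR & \text{for } |u| \le aR, \\[1ex]
1 - \dfrac{a^2 \epsilon_3}{|u|^2} & \text{for } |u| \ge aR
\end{cases}
\]
for some $\epsilon_2, \epsilon_3 > 0$ and for all $u \in \partial_\infty \mathbb{H} = \mathbb{R} \cup \{i\infty\}$. Then, choosing $a$ small (depending on $R$) we obtain
\[
\mu_{a,\alpha,\lambda}(u) \le
\begin{cases}
1 - \epsilon_2 & \text{for } |u| \le aR, \\[1ex]
1 - \dfrac{a^2 \epsilon_3}{|u|^2} & \text{for } |u| \ge aR.
\end{cases}
\tag{19}
\]
In particular, $\mu_{a,\alpha,\lambda}(u) \le 1$ for all $u \in \partial_\infty \mathbb{H}$. By upper semi-continuity of $\mu$, we can extend this estimate to a neighborhood of $\partial_\infty \mathbb{H}$ to conclude
\[ \chi(u)\, \mu_{a,\alpha,\lambda}(u) \le 1 + \epsilon_4, \tag{20} \]
where $\epsilon_4$ can be made arbitrarily small by shrinking the support of $\chi$.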
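The "simple calculation" behind the choice of $\lambda_0$ is that the quadratic $p(u,\lambda,\delta)$ has negative discriminant exactly when $\lambda^2 < 4(1-2\delta)/(1-\delta)$. The following sketch makes this explicit; it assumes $0 \le \delta < 1/2$ so that the leading coefficient is positive, and the function names are illustrative.

```python
import math

def p(u, lam, delta):
    """Polynomial from Lemma 4: (1 - 2 delta) u^2 - (1 - delta) lam u + 1 - delta."""
    return (1 - 2 * delta) * u * u - (1 - delta) * lam * u + 1 - delta

def lambda0_upper(delta):
    """Upper limit 2*sqrt((1 - 2 delta)/(1 - delta)) for lambda_0 (assumes delta < 1/2)."""
    return 2.0 * math.sqrt((1 - 2 * delta) / (1 - delta))

def p_min(lam, delta):
    """Minimum of p over real u, attained at the vertex of the parabola."""
    a = 1 - 2 * delta
    b = -(1 - delta) * lam
    c = 1 - delta
    return c - b * b / (4 * a)

delta = 0.2  # illustrative correlation parameter
bound = lambda0_upper(delta)
print(bound, p_min(0.99 * bound, delta), p_min(1.01 * bound, delta))
```

Just below the limit the minimum of $p$ is still positive, and just above it $p$ changes sign, which matches the remark in Section 5 that the estimate for $\mu_{a,\alpha,\lambda}$ exceeds 1 for some $u$ once $\lambda_0$ is too large.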
To estimate the right side of (18) we consider $u$ in two regions. The first region consists of the points near $u \in \mathbb{R}$ with $|u| \le C$. For these points, the estimate (19) and upper semi-continuity of $\mu_{a,\alpha,\lambda}(u)$ imply $\mu_{a,\alpha,\lambda}(u) \le 1 - \epsilon_5$ for some $\epsilon_5 > 0$. This, combined with (20), where we have shrunk the support of $\chi$ to make $\epsilon_4$ sufficiently small, proves (17) for these values of $u$. On the other hand, if $u$ is in the region near $u \in \mathbb{R}$ with $|u| \ge C$ (including $i\infty$) then $u$ is bounded away from the singularity of $\phi_{q_1,\lambda}(u)$ for $q_1 \in \mathrm{supp}(\nu_a)$, so for these values of $u$ and small $q_1$, the values of $\phi_{q_1,\lambda}(u)$ are close to $\phi_{0,\lambda}(u)$ and therefore $|\phi_{q_1,\lambda}(u)|$ is uniformly bounded. This means we can exchange the roles of the two factors in (18) and obtain (17) for these values of $u$ as well.

5. Extensions and Open Problems

For $\delta = 0$, that is, when the random variables $q_1$ and $q_2$ are independent, our result gives $\lambda_0 = 2$. An obvious question is "How large can $\lambda_0$ actually be?". When $\lambda_0$ is larger than 2 the polynomial $p(u, \lambda, \delta)$ in (15) changes sign, so the estimate for $\mu_{a,\alpha,\lambda}(u)$ goes above 1 for some values of $u$. However, the product on the right-hand side of (18) remains bounded below 1 if $\lambda_0$ is only slightly larger than 2, since the second term in the product compensates. So, our proof can accommodate $\lambda_0$ slightly larger than 2. To push $\lambda_0$ even higher, we can consider iterating the procedure leading to (16) an arbitrary number of times. This would presumably allow even larger values of $\lambda_0$ at the expense of more complicated proofs. The determination of the exact range of absolutely continuous spectrum (as indeed the question of band-edge localization for this model) remains open.

At first glance, the assumption that the distributions $\nu_a$ are identical for each sphere seems essential. Dropping it means that we lose self-similarity in the tree.
However, it is in fact possible to handle the case where the distribution $\nu_{a,n}$ for the $n$th sphere can depend on $n$, provided that each distribution satisfies the assumptions (6)–(8). Then the distributions $\rho_{a,\lambda,n}$ and the moments $M_{a,\alpha,\lambda,n}$ also vary from sphere to sphere. In this setup we are interested in $M_{a,\alpha,\lambda,1}$. The methods in this paper (with two iterations) can then be used to show that for suitable $a$, $\alpha$ and $\lambda$
\[ M_{\lambda,n} \le (1 - \epsilon) M_{\lambda,n+2} + C. \tag{21} \]
(We have dropped the $a$ and $\alpha$ subscripts.) Here $\epsilon$ and $C$ are positive constants that are independent of $n$ and $\lambda$. Iterating this bound $N$ times gives
\[
\begin{aligned}
M_{\lambda,1} &\le (1 - \epsilon)^N M_{\lambda,1+2N} + C \sum_{k=0}^{N-1} (1 - \epsilon)^k \\
&\le (1 - \epsilon)^N M_{\lambda,1+2N} + \frac{C}{\epsilon}.
\end{aligned}
\]
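The iteration of (21) is a geometric-series argument. The sketch below applies the worst case of the recursion (equality at every step) and compares it with the closed-form bound; the numerical values of $\epsilon$, $C$ and the starting bound are arbitrary illustrations.

```python
def iterate_bound(eps, C, M_far, N):
    """Apply M_n <= (1 - eps) * M_{n+2} + C exactly N times, starting from a bound M_far on M_{1+2N}."""
    bound = M_far
    for _ in range(N):
        bound = (1.0 - eps) * bound + C
    return bound

def closed_form(eps, C, M_far, N):
    """Closed-form estimate (1 - eps)^N * M_far + C / eps."""
    return (1.0 - eps) ** N * M_far + C / eps

eps, C, M_far = 0.1, 1.0, 1.0e6  # illustrative values only
for N in (10, 50, 200):
    print(N, iterate_bound(eps, C, M_far, N), closed_form(eps, C, M_far, N))
```

However large the starting bound is, its contribution decays like $(1-\epsilon)^N$, so only the uniform term $C/\epsilon$ survives as $N \to \infty$.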
This estimate may appear useless, but for $\mathrm{Im}(\lambda) > 0$ we actually have an $n$-independent (but $\lambda$-dependent!) bound on $M_{\lambda,n}$, because the support of $\rho_{a,\lambda,n}$ is contained in a $\lambda$-dependent compact set. Hence we obtain
\[ M_{\lambda,1} \le (1 - \epsilon)^N C_\lambda + \frac{C}{\epsilon} \]
and we may send $N \to \infty$ to obtain the desired bound on $M_{\lambda,1}$.

6. Proof of Lemma 4

The goal of this section is to prove the estimates in Lemma 4 on $\mu$ defined by (14) for $u$ and $\lambda$ real. Notice that when $\lambda \in \mathbb{R}$ and $|\lambda| < 2\sqrt{2}$ then $\mathrm{Im}(u_\lambda) > 0$ and $|u_\lambda|^2 = 2$. We will blow up the singularity on the diagonal by introducing polar co-ordinates $r$ and $\omega_i$, $i = 1, 2$, defined by
\[ u - q_1 = r\omega_1, \qquad u - q_2 = r\omega_2, \qquad \omega_1^2 + \omega_2^2 = 1. \]
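We begin with the estimate for $|u|$ small.

Lemma 6. Suppose $|\lambda| < 2\sqrt{2}$, $|q_i| \le a$ and $|u| \le aR$ where $R \ge 2$ and $aR \le 1/4$. Then
\[ \mu_{q,\lambda}(u) \le \frac{|\omega_1 + \omega_2|^2}{2} + CaR. \]

Proof. Substituting the polar co-ordinates into (14), we can write
\[ \mu_{q,\lambda}(u) = \frac{|(\omega_1 + \omega_2)u_\lambda - 2r\omega_1\omega_2|^2}{2|u - u_\lambda|^2}. \]
We have
\[ \frac{2}{|u - u_\lambda|^2} = \frac{2}{2 - \lambda u + u^2} \le \frac{1}{1 - \lambda u/2} \le 1 + |\lambda||u| \]
since $|\lambda u/2| \le 1/2$ and $(1 - x)^{-1} \le 1 + 2|x|$ for $|x| \le 1/2$. Next, we have
\[ \frac{|(\omega_1 + \omega_2)u_\lambda - 2r\omega_1\omega_2|^2}{4} \le \frac{|\omega_1 + \omega_2|^2}{2} + r + r^2/4, \]
since $|u_\lambda|^2 = 2$, $|\omega_1 + \omega_2| \le \sqrt{2}$ and $|\omega_1\omega_2| \le 1/2$. With our bounds on $q_i$ and $R$ we have $r^2 = |u - q_1|^2 + |u - q_2|^2 \le 2a^2(1 + R)^2$. Combining these estimates completes the proof.

Now we turn to the estimate for large $|u|$.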
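Lemma 7. Suppose $|\lambda| < 2\sqrt{2}$, $|q_i| \le a$ and $|u| \ge aR$ where $R \ge 2$. Then
\[ \mu_{q,\lambda}(u) \le 1 + \frac{1}{u}\langle l, q\rangle - \frac{1}{u^2}\langle q, (Q - C/R)q\rangle \]
with
\[
q = \begin{bmatrix} q_1 \\ q_2 \end{bmatrix}, \qquad
l = \frac{-2u^2 + \lambda u}{2|u - u_\lambda|^2} \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad
Q = \frac{1}{2|u - u_\lambda|^2}
\begin{bmatrix}
u^2 - \lambda u + 1 & -2u^2 + \lambda u - 1 \\
-2u^2 + \lambda u - 1 & u^2 - \lambda u + 1
\end{bmatrix}.
\]
The constant $C = C_1/(1 - \lambda^2/8) + C_2$ where $C_1$ and $C_2$ are some (explicitly computable positive) numbers.

Proof. Let $\delta_i = q_i/u$ and note that $|\delta_i| \le 1/R$. We can write
\[ \mu_{q,\lambda}(u) = \frac{1}{4|u - u_\lambda|^2}\, \big|2(u - u_\lambda) - (\delta_1 + \delta_2)(2u - u_\lambda) + 2\delta_1\delta_2 u\big|^2\, \frac{2}{|1 - \delta_1|^2 + |1 - \delta_2|^2}. \tag{22} \]
The third term on the right can be written
\[ \frac{2}{|1 - \delta_1|^2 + |1 - \delta_2|^2} = \frac{1}{1 - (\delta_1 + \delta_2) + (\delta_1^2 + \delta_2^2)/2}. \]
If $x \le \delta < 1$ then $(1 - x)^{-1} \le 1 + x + (1 + \delta/(1 - \delta))x^2$. Using this with $x = (\delta_1 + \delta_2) - (\delta_1^2 + \delta_2^2)/2$ and $\delta = (2R - 1)/R^2$, which implies $\delta/(1 - \delta) \le 6/R$, we find, after some calculation, that this term can be estimated by
\[ \frac{2}{|1 - \delta_1|^2 + |1 - \delta_2|^2} \le 1 + (\delta_1 + \delta_2) + 2\delta_1\delta_2 + \left(\frac{1}{2} + \frac{40}{R}\right)(\delta_1^2 + \delta_2^2). \]
We now turn to the middle term on the right-hand side of (22). Multiplying out the square, using $\mathrm{Re}(u_\lambda) = \lambda/2$, and making some simple estimates, we arrive at
\[
\begin{aligned}
\big|2(u - u_\lambda) - (\delta_1 + \delta_2)(2u - u_\lambda) + 2\delta_1\delta_2 u\big|^2
\le{}& 4|u - u_\lambda|^2 - 4(\delta_1 + \delta_2)\big(|u - u_\lambda|^2 + u(u - \lambda/2)\big) \\
&+ 2\delta_1\delta_2\big(|2u - u_\lambda|^2 + 4u(u - \lambda/2)\big) \\
&+ (\delta_1^2 + \delta_2^2)\big(|2u - u_\lambda|^2 + R^{-1}(9|u|^2 + 2|\lambda u|)\big).
\end{aligned}
\]
We now combine these estimates. In the error terms, we can control quadratic terms in $u$ using
\[ |u|^2 \le \frac{1}{1 - \lambda^2/8}\, |u - u_\lambda|^2. \]
A straightforward calculation completes the proof.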
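Two ingredients of this proof lend themselves to a numerical spot check: the rescaled expression (22) should agree with the closed form (14), and the quadratic control $|u|^2 \le |u - u_\lambda|^2/(1 - \lambda^2/8)$ should hold for real $u$. The sketch below tests both on random samples; the sampling ranges ($a = 0.1$, $R = 2$) and all function names are assumptions made for illustration.

```python
import math
import random

def fixed_point(lam):
    """u_lambda for real lambda with |lambda| < 2*sqrt(2)."""
    return complex(lam / 2.0, math.sqrt(2.0 - lam * lam / 4.0))

def mu_14(u, q1, q2, lam):
    """Closed form (14)."""
    ul = fixed_point(lam)
    num = abs((2 * u - q1 - q2) * ul - 2 * (u - q1) * (u - q2)) ** 2
    return num / (2 * (abs(u - q1) ** 2 + abs(u - q2) ** 2) * abs(u - ul) ** 2)

def mu_22(u, q1, q2, lam):
    """Rescaled form (22), written in terms of delta_i = q_i / u."""
    ul = fixed_point(lam)
    d1, d2 = q1 / u, q2 / u
    middle = abs(2 * (u - ul) - (d1 + d2) * (2 * u - ul) + 2 * d1 * d2 * u) ** 2
    third = 2.0 / (abs(1 - d1) ** 2 + abs(1 - d2) ** 2)
    return middle * third / (4 * abs(u - ul) ** 2)

def quadratic_margin(u, lam):
    """|u - u_lambda|^2 / (1 - lambda^2/8) - |u|^2 for real u; should be >= 0."""
    return abs(u - fixed_point(lam)) ** 2 / (1 - lam * lam / 8.0) - u * u

def spot_check(trials=2000, seed=0, a=0.1, R=2.0):
    """Return (largest relative gap between (14) and (22), smallest quadratic margin)."""
    rng = random.Random(seed)
    worst_gap, min_margin = 0.0, float("inf")
    for _ in range(trials):
        lam = rng.uniform(-2.5, 2.5)
        u = rng.choice((-1.0, 1.0)) * rng.uniform(a * R, 5.0)  # real u with |u| >= aR
        q1, q2 = rng.uniform(-a, a), rng.uniform(-a, a)
        m14, m22 = mu_14(u, q1, q2, lam), mu_22(u, q1, q2, lam)
        worst_gap = max(worst_gap, abs(m14 - m22) / max(1.0, m14))
        min_margin = min(min_margin, quadratic_margin(rng.uniform(-10.0, 10.0), lam))
    return worst_gap, min_margin

print(spot_check())
```

The quadratic margin equals $(\lambda u/(2\sqrt{2}) - \sqrt{2})^2/(1 - \lambda^2/8)$, so it can reach zero (when $\lambda u = 4$) but never becomes negative.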
In preparation for the proof of Lemma 4 we prove the following lemma. Recall that $\omega_1$ and $\omega_2$ are functions of $u$ and $q$. Explicitly,
\[ \omega_i(u, q) = \frac{u - q_i}{\sqrt{(u - q_1)^2 + (u - q_2)^2}}, \]
so that $\omega_i(u, aq) = \omega_i(u/a, q)$. Also, with the notation of (8) we have
\[ \int_{\mathbb{R}^2} (q_1 - q_2)^2\, d\nu(q) = c_{11} + c_{22} - 2c_{12} = c(1 - \delta). \]

Lemma 8. For $R \ge 2$ and $|u| \le aR$,
\[ \int_{\mathbb{R}^2} \frac{|\omega_1 + \omega_2|^2}{2}\, d\nu_a(q) \le 1 - \frac{c(1 - \delta)}{20R^2}. \tag{23} \]

Proof. We begin with a scaling argument. The scaling properties of $\omega_i(u, q)$ and $\nu_a$ imply that bounding the left-hand side of (23) for $|u| \le aR$ is equivalent to bounding
\[ \int_{\mathbb{R}^2} \frac{|\omega_1 + \omega_2|^2}{2}\, d\nu(q) \]
for $|u| \le R$. Referring to Fig. 2, we have $\omega_1 = -\cos(\theta + \pi/4)$ and $\omega_2 = -\sin(\theta + \pi/4)$. Then $|\omega_1 + \omega_2|^2/2 = (1 + 2\omega_1\omega_2)/2 = (1 + \cos(2\theta))/2$. From this we see that the maximum occurs at an endpoint for $\theta$, when $(u, u) = (R, R)$ or $(u, u) = (-R, -R)$. This leads to
\[ \frac{|\omega_1 + \omega_2|^2}{2} \le \frac{|{\pm}R - \bar q|^2}{|{\pm}R - \bar q|^2 + \tilde q^2} = 1 - \frac{\tilde q^2}{|R \mp \bar q|^2 + \tilde q^2}, \]
where $\bar q = (q_1 + q_2)/2$ and $\tilde q = (q_1 - q_2)/2$. Since $|\bar q| \le 1$ and $R \ge 2$ we have $|R \pm \bar q| \le 2R$. This implies
\[ \frac{|\omega_1 + \omega_2|^2}{2} \le 1 - \frac{\tilde q^2}{4R^2 + 1} \le 1 - \frac{(q_1 - q_2)^2}{20R^2}. \]
Integrating this formula completes the proof.
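The pointwise bound used in this proof can be checked numerically. The sketch below samples $|u| \le R$ and $q_1, q_2 \in [-1, 1]$ (the range of $q$ is an assumption, chosen to match the condition $|\bar q| \le 1$ used in the proof) and records the smallest margin between the bound $1 - (q_1 - q_2)^2/(20R^2)$ and $|\omega_1 + \omega_2|^2/2$. Function names are illustrative.

```python
import math
import random

def omega(u, q1, q2):
    """Polar co-ordinates (omega_1, omega_2) of (u - q1, u - q2)."""
    r = math.hypot(u - q1, u - q2)
    return (u - q1) / r, (u - q2) / r

def half_square_sum(u, q1, q2):
    """|omega_1 + omega_2|^2 / 2."""
    w1, w2 = omega(u, q1, q2)
    return (w1 + w2) ** 2 / 2.0

def pointwise_bound(q1, q2, R):
    """Right-hand side 1 - (q1 - q2)^2 / (20 R^2) of the pointwise estimate."""
    return 1.0 - (q1 - q2) ** 2 / (20.0 * R * R)

def worst_margin(R=2.0, trials=20000, seed=1):
    """Smallest value of bound - |omega_1 + omega_2|^2/2 over random samples."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        u = rng.uniform(-R, R)
        q1, q2 = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        if u == q1 and u == q2:
            continue  # the polar co-ordinates are undefined at the singular point
        worst = min(worst, pointwise_bound(q1, q2, R) - half_square_sum(u, q1, q2))
    return worst

print(worst_margin())
```

The margin is smallest when $q_1$ is close to $q_2$, which reflects the fact that the gain in Lemma 8 comes entirely from the variance $c(1-\delta)$ of $q_1 - q_2$.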
Fig. 2. Co-ordinates ω1, ω2 relative to (u, u).
We are now ready to give the proof of Lemma 4.

Proof of Lemma 4. The estimates of Lemmas 6 and 7 and the estimate $(1 + x)^{1+\alpha} \le 1 + (1 + \alpha)x + \alpha(1 + \alpha)x^2$ for $x > -1$ can be used to show
\[
\mu_{q,\lambda}(u)^{1+\alpha} \le
\begin{cases}
\dfrac{|\omega_1 + \omega_2|^2}{2} + C_1 aR + C_2 \alpha & \text{for } |u| \le aR, \\[2ex]
1 + \dfrac{1 + \alpha}{u}\langle l, q\rangle - \dfrac{1}{u^2}\langle q, (Q - C_3/R - C_4\alpha)q\rangle & \text{for } |u| \ge aR.
\end{cases}
\]
We now integrate this estimate with respect to $\nu_a$. For $|u| \le aR$, we use Lemma 8. When we integrate the estimate for $|u| \ge aR$, the linear term vanishes, thanks to (7). The quadratic term gives the estimate on the right-hand side in (15).

7. A Mean Field Model

In this section, we add a weighted complete graph to every sphere in the tree. Since the weights are chosen to make the total added weights the same in each sphere, this is a sort of mean field model. Pick a number $\gamma > 0$. Each added edge (dotted line in Fig. 3) in the $n$th sphere $S_n$ is given the weight $\gamma 2^{-n}$. We call this graph the mean field binary tree. The spectrum of the free Laplacian on the mean field tree is the union of two intervals $[-2\sqrt{2} + \gamma, 2\sqrt{2} + \gamma] \cup [-2\sqrt{2}, 2\sqrt{2}]$ and is purely absolutely continuous. This can be seen by diagonalizing the Laplacian using a Haar basis, as in [1].

The (normalized) Haar basis $\{e_0, e_1, \ldots, e_{2^n - 1}\}$ for $\mathbb{C}^{2^n}$ is defined as follows. Let $(e_0)(j) = 2^{-n/2}$, $j = 1, \ldots, 2^n$. For $k = 0, 1, \ldots, n - 1$ we set $(e_{2^k})(j) = 2^{-(n-k)/2}$ if $j = 1, \ldots, 2^{n-k-1}$, $(e_{2^k})(j) = -2^{-(n-k)/2}$ if $j = 2^{n-k-1} + 1, \ldots, 2^{n-k}$, and 0 otherwise. Finally, we define the non-zero components of $e_i$ for $2^k \le i < 2^{k+1}$ to be $(e_i)(j) = (e_{2^k})(j - i + 2^k)$. Figure 4 shows a diagram of the Haar basis for $\ell^2(S_n) = \mathbb{C}^{2^n}$ with $n = 3$. Each vector is normalized to make the basis orthonormal. This basis has a natural tree structure determined by the supports of the vectors. The highest level is the constant vector, and the lowest level consists of vectors with two non-zero entries of $\pm 2^{-1/2}$.
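The Haar basis described above can be built explicitly and its orthonormality verified. The sketch below is a minimal construction under one assumption: the vectors at level $k$ are taken to be translates of $e_{2^k}$ by whole multiples of its support length $2^{n-k}$, which is what disjoint supports and orthonormality require; the shift written in the text may use a different indexing convention. Function names are illustrative.

```python
def haar_basis(n):
    """Return the 2**n normalized Haar vectors on C^(2**n) as lists (0-based positions)."""
    size = 2 ** n
    basis = [[2.0 ** (-n / 2.0)] * size]  # e_0: the constant vector
    for k in range(n):
        block = 2 ** (n - k)               # support length at level k
        amp = 2.0 ** (-(n - k) / 2.0)      # entries are +amp, then -amp
        for m in range(2 ** k):            # assumed: translates by multiples of the block length
            v = [0.0] * size
            start = m * block
            for j in range(start, start + block // 2):
                v[j] = amp
            for j in range(start + block // 2, start + block):
                v[j] = -amp
            basis.append(v)
    return basis

def gram_defect(basis):
    """Largest deviation of the Gram matrix from the identity."""
    worst = 0.0
    for a, va in enumerate(basis):
        for b, vb in enumerate(basis):
            dot = sum(x * y for x, y in zip(va, vb))
            worst = max(worst, abs(dot - (1.0 if a == b else 0.0)))
    return worst

basis = haar_basis(3)
print(len(basis), gram_defect(basis))
```

With $n = 3$ this produces one constant vector, one vector supported on all eight sites, two on four sites and four on two sites, which is the tree structure by supports mentioned in the text.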
Fig. 3. Rooted binary tree with mean-field edges inside spheres and transversely 2-periodic potential.

Fig. 4. Haar basis for $\ell^2(S_3)$.
To simplify the calculations, we will consider this model when the transversely two-periodic potential is defined by the product of two independent Bernoulli distributions for $q_1$ and $q_2$,
\[ \nu = \frac{1}{4}\big(\delta(q_1 - 1) + \delta(q_1 + 1)\big)\big(\delta(q_2 - 1) + \delta(q_2 + 1)\big). \]
Theorem 9. Let $\nu^{(0)}$ be a probability measure of bounded support for the potential at the root, let $\nu$ be the product of Bernoulli distributions defined above, and let $H_{a,\gamma}$ be the random discrete Schrödinger operator on the mean field binary tree corresponding to the transversely two-periodic potential defined by the scaled distribution $\nu_a$ and weight $\gamma$. There exist $0 < \lambda_0, \lambda_1 < 2\sqrt{2}$ such that for sufficiently small $a$ the spectral measure for $H_{a,\gamma}$ corresponding to $\delta_0$ has purely absolutely continuous spectrum in $\{\lambda : |\lambda| \le \lambda_0,\ |\lambda - \gamma| \le \lambda_1\}$.

In this theorem, the constant $\lambda_0$ has the same value as in the first part of the paper, while $\lambda_1$ can be taken to be any positive number less than $2\sqrt{2}$.

The forward Green's functions $G_n$ are not diagonal. In the basic recursion formula (1) for the forward Green's functions on the mean field tree the matrices $E_n$ and $Q_n$ are unchanged from the binary tree, but the matrices $D_n$ are now $2^{-n}\gamma$ times the Laplace operator for the complete graph on $S_n$. This Laplace operator is a $2^n \times 2^n$ matrix with each diagonal entry equal to zero and each off-diagonal entry equal to 1. Thus
\[ D_n = \gamma(P - 2^{-n} I), \]
where $P$ projects onto $2^{-n/2}[1, 1, \ldots, 1]^T$. Introduce the $d_n \times d_n$ matrix
\[ U_n = E_n^T G_{n+1} E_n - D_n + \lambda = E_n^T G_{n+1} E_n - \gamma P + \lambda_n, \]
where
\[ \lambda_n = \lambda + \gamma 2^{-n}. \tag{24} \]
Then the basic recursion formula reads
\[ U_{n-1} = -E_{n-1}^T (U_n - Q_n)^{-1} E_{n-1} - \gamma P + \lambda_{n-1}. \]
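The identity $D_n = \gamma(P - 2^{-n}I)$ and the resulting shift $\lambda_n = \lambda + \gamma 2^{-n}$ can be confirmed entrywise for a small sphere. The sketch below uses plain nested lists; the helper names are illustrative and the values of $n$ and $\gamma$ are arbitrary.

```python
def complete_graph_term(n, gamma):
    """2^{-n} * gamma times the matrix with zero diagonal and ones off the diagonal."""
    size = 2 ** n
    w = gamma * 2.0 ** (-n)
    return [[0.0 if i == j else w for j in range(size)] for i in range(size)]

def projection_form(n, gamma):
    """gamma * (P - 2^{-n} I), where P has all entries 2^{-n} (projection onto constants)."""
    size = 2 ** n
    p = 1.0 / size
    return [[gamma * (p - (2.0 ** (-n) if i == j else 0.0)) for j in range(size)]
            for i in range(size)]

def max_abs_diff(a, b):
    """Largest entrywise difference between two square matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def shifted_energy(lam, gamma, n):
    """lambda_n = lambda + gamma * 2^{-n}, as in (24)."""
    return lam + gamma * 2.0 ** (-n)

print(max_abs_diff(complete_graph_term(3, 0.7), projection_form(3, 0.7)))
print(shifted_energy(1.0, 0.7, 3))
```

Since $-D_n + \lambda = -\gamma P + (\lambda + \gamma 2^{-n})$, the mean-field edges act as a rank-one term plus a small, $n$-dependent shift of the energy.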
The range of $P$ is the span of the first vector in the Haar basis. Since the representation of a two-periodic potential in this basis is not too complicated, it is natural to change to this basis to simplify the problem. Let $V_n$ be the $2^n \times 2^n$ orthogonal change of basis matrix to the Haar basis, whose columns consist of the Haar basis vectors.

Lemma 10.
(i) $V_n^T P V_n = \mathrm{diag}[1, 0, 0, \ldots]$.
(ii) $V_n^T E_n^T V_{n+1} = \sqrt{2}\,[I, 0]$.
(iii) Let $Q = \mathrm{diag}[q_1, q_2, q_1, q_2, \ldots]$ be a two-periodic potential. Setting $\bar q = (q_1 + q_2)/2$ and $\tilde q = (q_1 - q_2)/2$ we have
\[ V_n^T Q V_n = \bar q I + \tilde q \begin{bmatrix} 0 & V_{n-1}^T \\ V_{n-1} & 0 \end{bmatrix}. \]

The proof of this lemma is a straightforward computation, which we omit. Now we write the matrix $U_n$ in the Haar basis. Define
\[ \tilde U_n = V_n^T U_n V_n. \]
In view of Lemma 10, the recursion formula for $\tilde U_n$ reads
\[ \tilde U_{n-1} = -2\,[I, 0] \left( \tilde U_n - \bar q - \tilde q \begin{bmatrix} 0 & V_{n-1}^T \\ V_{n-1} & 0 \end{bmatrix} \right)^{-1} \begin{bmatrix} I \\ 0 \end{bmatrix} - \gamma\, \mathrm{diag}[1, 0, 0, \ldots] + \lambda_{n-1}, \tag{25} \]
where $\lambda_n$ is given by (24). This recursion formula preserves matrices of the form $\mathrm{diag}[u_1, u_2, u_2, \ldots]$.

Lemma 11. Suppose that $\tilde U_n = \mathrm{diag}[u_1, u_2, u_2, \ldots]$. Then $\tilde U_{n-1}$, defined by the recursion formula above, has the form
\[ \tilde U_{n-1} = \mathrm{diag}[\psi_{q,\lambda,\gamma,n-1}(u_1, u_2),\ \phi_{q,\lambda,n-1}(u_2),\ \phi_{q,\lambda,n-1}(u_2), \ldots], \]
where
\[
\begin{aligned}
\psi_{q,\lambda,\gamma,n}(u_1, u_2) &= -\frac{2}{u_1 - \bar q - \tilde q^2 (u_2 - \bar q)^{-1}} + \lambda_n - \gamma, \\
\phi_{q,\lambda,n}(u_2) &= -\frac{2}{u_2 - \bar q - \tilde q^2 (u_2 - \bar q)^{-1}} + \lambda_n,
\end{aligned}
\tag{26}
\]
and $\lambda_n$ is given by (24).
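The map $\phi_{q,\lambda,n}$ in (26) is the map (10) with $\lambda$ replaced by $\lambda_n$, rewritten in the variables $\bar q$ and $\tilde q$. A short numerical comparison makes the algebra concrete; the sample values and function names below are illustrative assumptions.

```python
def phi_10(u, q1, q2, lam):
    """Map (10): -1/(u - q1) - 1/(u - q2) + lambda."""
    return -1.0 / (u - q1) - 1.0 / (u - q2) + lam

def phi_26(u2, q1, q2, lam_n):
    """Second formula in (26), written with qbar and qtilde."""
    qbar, qtil = (q1 + q2) / 2.0, (q1 - q2) / 2.0
    return -2.0 / (u2 - qbar - qtil ** 2 / (u2 - qbar)) + lam_n

def psi_26(u1, u2, q1, q2, lam_n, gamma):
    """First formula in (26)."""
    qbar, qtil = (q1 + q2) / 2.0, (q1 - q2) / 2.0
    return -2.0 / (u1 - qbar - qtil ** 2 / (u2 - qbar)) + lam_n - gamma

u = complex(0.4, 0.9)                     # arbitrary point in the upper half plane
q1, q2, lam_n, gamma = 0.3, -0.2, 1.1, 0.5
gap_phi = abs(phi_10(u, q1, q2, lam_n) - phi_26(u, q1, q2, lam_n))
# When u1 = u2 the two components differ only by the shift gamma.
gap_psi = abs(psi_26(u, u, q1, q2, lam_n, gamma) - (phi_26(u, q1, q2, lam_n) - gamma))
print(gap_phi, gap_psi)
```

The agreement rests on $\frac{1}{u - q_1} + \frac{1}{u - q_2} = \frac{2(u - \bar q)}{(u - \bar q)^2 - \tilde q^2}$, which is why the recursion for $u_2$ coincides with the one studied in the first part of the paper.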
Proof. We have
\[ \left( \tilde U_n - \bar q - \tilde q \begin{bmatrix} 0 & V_{n-1}^T \\ V_{n-1} & 0 \end{bmatrix} \right)^{-1} = \begin{bmatrix} A & B^T \\ B & C \end{bmatrix}^{-1}, \]
where $A = \mathrm{diag}[u_1 - \bar q, u_2 - \bar q, u_2 - \bar q, \ldots]$, $B = -\tilde q V_{n-1}$, $C = (u_2 - \bar q)I$. The top left block of this inverse is given by Schur's formula $(A - B^T C^{-1} B)^{-1}$. Since $B^T C^{-1} B = \tilde q^2 (u_2 - \bar q)^{-1} V_{n-1}^T V_{n-1} = \tilde q^2 (u_2 - \bar q)^{-1} I$, the result is a diagonal matrix with $(u_1 - \bar q - \tilde q^2 (u_2 - \bar q)^{-1})^{-1}$ in the upper left corner and $(u_2 - \bar q - \tilde q^2 (u_2 - \bar q)^{-1})^{-1}$ in the other diagonal positions. The recursion formula picks out this block, multiplies by $-2$ and then adds $-\gamma\, \mathrm{diag}[1, 0, 0, \ldots] + \lambda_{n-1}$. This gives the formulas (26).

The fact that the recursion formula for $\tilde U_n$ preserves diagonal matrices having the form $\mathrm{diag}[u_1, u_2, u_2, \ldots]$ means that $\tilde U_n$ must actually have this form. This follows from the limit formula for the forward Green's functions proved in [10], which implies that these matrices will lie in any set that is preserved by the recursion flow. Thus, there are two random variables $u_1$ and $u_2$ for each sphere that describe the forward Green's function. For the $n$th sphere, they are distributed according to some joint measure $\rho_{a,\lambda,\gamma,n}$ for $(u_1, u_2)$. Since the variables for adjacent spheres are related by (26) the recursion formula for these measures reads
\[ \int_{\mathbb{H}\times\mathbb{H}} w(u_1, u_2)\, d\rho_{a,\lambda,\gamma,n}(u_1, u_2) = \int_{\mathbb{H}\times\mathbb{H}} \int_{\mathbb{R}^2} w(\psi_{q,\lambda,\gamma,n}(u_1, u_2), \phi_{q,\lambda,n}(u_2))\, d\nu_a(q)\, d\rho_{a,\lambda,\gamma,n+1}(u_1, u_2). \]
Define the moments
\[ M_{a,\alpha,\lambda,\gamma,n} = \int_{\mathbb{H}\times\mathbb{H}} \mathrm{cd}_{1,n}(u_1)^{1+\alpha}\, d\rho_{a,\lambda,\gamma,n}(u_1, u_2), \]
where
\[ \mathrm{cd}_{1,n}(u_1) = \frac{|u_1 - u_{\lambda_n - \gamma}|^2}{\mathrm{Im}(u_1)} \]
and $u_\lambda$ is the same fixed point as in (11). Our goal is to bound $M_{a,\alpha,\lambda,\gamma,0}$ for $a$ and $\alpha$ small and $\lambda$ and $\gamma$ in some range. When $n = 0$ then $\tilde U_0 = U_0 = [u_1] = E_0^T G_1 E_0 + \lambda$. Since $G_0 = -(E_0^T G_1 E_0 + \lambda - q_0)^{-1}$ we can use the argument of Lemma 3 to prove the existence of absolutely continuous spectrum from such a bound.

Observe now that the recursion for $u_2$ is the same as the formula for $u$ in the first part of the paper, except that $\lambda$ is replaced by $\lambda_n$. Explicitly,
\[ \phi_{q,\lambda_n}(u) = \phi_{q,\lambda,n}(u), \]
where the $\phi$ is given on the left by (10) and on the right by (26). We claim this implies that
\[ M^{(2)}_{a,\alpha,\lambda,\gamma,n} = \int_{\mathbb{H}\times\mathbb{H}} \mathrm{cd}_{2,n}(u_2)^{1+\alpha}\, d\rho_{a,\lambda,\gamma,n}(u_1, u_2) \le C, \tag{27} \]
provided $|\lambda| < \lambda_0$. Here
\[ \mathrm{cd}_{2,n}(u_2) = \frac{|u_2 - u_{\lambda_n}|_+^2}{\mathrm{Im}(u_2)}. \]
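The function $|z|_+$ is equal to $|z|$ except near $z = 0$ where it has been modified to be bounded away from zero. This makes no difference to the growth properties, but will allow us to make a needed lower bound in the next section. For large $n$, the bound (27) follows from the results in the first part of the paper (extended to distributions that vary from sphere to sphere) since the small perturbations $\gamma 2^{-n}$ of $\lambda$ are easily absorbed in the proof. The result for large $n$ suffices, since it is easy to iterate the bound (27) a finite number of steps. All that is required is an upper bound $\mu_{q,\lambda_n}(u) \le C$, for $\mu$ given by (14).

Similarly, it is enough to bound $M_{a,\alpha,\lambda,\gamma,n}$ for large $n$. We follow the same basic steps as before to begin the proof of such a bound. Let $\chi(u_1)$ be a cutoff with support where $u_1$ is in a neighborhood of $\partial_\infty \mathbb{H}$. Then, one iteration gives
\[
\begin{aligned}
M_{a,\alpha,\lambda,\gamma,n}
&= \int_{\mathbb{H}\times\mathbb{H}} \mathrm{cd}_{1,n}(u_1)^{1+\alpha}\, d\rho_{a,\lambda,\gamma,n}(u_1, u_2) \\
&\le \int_{\mathbb{H}\times\mathbb{H}} \int_{\mathbb{R}^2} \mathrm{cd}_{1,n}(\psi_{q,\lambda,\gamma,n}(u_1, u_2))^{1+\alpha}\, \chi(u_1)\, d\nu_a(q)\, d\rho_{a,\lambda,\gamma,n+1}(u_1, u_2) + C \\
&= \int_{\mathbb{H}\times\mathbb{H}} \int_{\mathbb{R}^2} \big(\mathrm{cd}_{1,n}(\psi_{q,\lambda,\gamma,n}(u_1, u_2)) - C_1 \mathrm{cd}_{2,n}(u_2) + C_1 \mathrm{cd}_{2,n}(u_2)\big)^{1+\alpha}\, \chi(u_1)\, d\nu_a(q)\, d\rho_{a,\lambda,\gamma,n+1}(u_1, u_2) + C \\
&\le 2^\alpha \int_{\mathbb{H}\times\mathbb{H}} \int_{\mathbb{R}^2} \big[\mathrm{cd}_{1,n}(\psi_{q,\lambda,\gamma,n}(u_1, u_2)) - C_1 \mathrm{cd}_{2,n}(u_2)\big]_+^{1+\alpha}\, \chi(u_1)\, d\nu_a(q)\, d\rho_{a,\lambda,\gamma,n+1}(u_1, u_2) + 2^\alpha C_1^{1+\alpha} M^{(2)}_{a,\alpha,\lambda,\gamma,n} + C.
\end{aligned}
\]
The notation $[x]_+$ denotes $\max\{0, x\}$, not to be confused with $|z|_+$. Here we used the convexity of $x \mapsto x^{1+\alpha}$. The positive constant $C_1$ can be chosen as large as we please. Now we define
\[ \mu_{q,\lambda,\gamma,n}(u_1, u_2) = \frac{\mathrm{cd}_{1,n}(\psi_{q,\lambda,\gamma,n}(u_1, u_2)) - C_1 \mathrm{cd}_{2,n}(u_2)}{\mathrm{cd}_{1,n+1}(u_1)}, \tag{28} \]
and the averaged version
\[ \mu_{a,\alpha,\lambda,\gamma,n}(u_1, u_2) = \int_{\mathbb{R}^2} [\mu_{q,\lambda,\gamma,n}]_+^{1+\alpha}(u_1, u_2)\, d\nu_a(q). \]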
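Then, provided $|\lambda| \le \lambda_0$ so that $M^{(2)}_{a,\alpha,\lambda,\gamma,n}$ is bounded, we can rewrite the estimate above as
\[ M_{a,\alpha,\lambda,\gamma,n} \le 2^\alpha \int_{\mathbb{H}\times\mathbb{H}} \mu_{a,\alpha,\lambda,\gamma,n}(u_1, u_2)\, \chi(u_1)\, \mathrm{cd}_{1,n+1}(u_1)^{1+\alpha}\, d\rho_{a,\lambda,\gamma,n+1}(u_1, u_2) + C. \]
A second iteration results in
\[
\begin{aligned}
M_{a,\alpha,\lambda,\gamma,n} \le{}& 2^{2\alpha} \int_{\mathbb{H}\times\mathbb{H}} \int_{\mathbb{R}^2} \mu_{a,\alpha,\lambda,\gamma,n}(\psi_{q,\lambda,\gamma,n+1}(u_1, u_2), \phi_{q,\lambda,n+1}(u_2)) \\
&\times \chi(\psi_{q,\lambda,\gamma,n+1}(u_1, u_2)) \cdot [\mu_{q,\lambda,\gamma,n+1}(u_1, u_2)]_+^{1+\alpha} \\
&\times \chi(u_1)\, d\nu_a(q)\, \mathrm{cd}_{1,n+2}^{1+\alpha}(u_1)\, d\rho_{a,\lambda,\gamma,n+2}(u_1, u_2) + C.
\end{aligned}
\tag{29}
\]

Lemma 12. There exist $0 < \lambda_0, \lambda_1 < 2\sqrt{2}$ such that for $|\lambda| \le \lambda_0$, $|\lambda - \gamma| \le \lambda_1$, $a$, $\alpha$ sufficiently small, $n$ sufficiently large and $\chi$ supported sufficiently near $\partial_\infty \mathbb{H}$, there is $\epsilon > 0$ such that
\[ 2^{2\alpha} \int_{\mathbb{R}^2} \mu_{a,\alpha,\lambda,\gamma,n}(\psi_{q,\lambda,\gamma,n+1}(u_1, u_2), \phi_{q,\lambda,n+1}(u_2))\, \chi(\psi_{q,\lambda,\gamma,n+1}(u_1, u_2))\, [\mu_{q,\lambda,\gamma,n+1}(u_1, u_2)]_+^{1+\alpha}\, \chi(u_1)\, d\nu_a(q) \le 1 - \epsilon. \]

This lemma, proved below, implies the main result for the mean field model.

Proof of Theorem 9. Inserting the estimate of Lemma 12 into (29) gives
\[ M_{a,\alpha,\lambda,\gamma,n} \le (1 - \epsilon) M_{a,\alpha,\lambda,\gamma,n+2} + C \]
for $n$ large. This is the same estimate as (21), so we can follow the argument given there to bound $M_{a,\alpha,\lambda,\gamma,n}$ for $n$ large. As noted above, this is sufficient to prove the theorem.

8. Proof of Lemma 12

The function $\mu_{q,\lambda,\gamma,n}(u_1, u_2)$ is the rational function given by
\[ \mu_{q,\lambda,\gamma,n}(u_1, u_2) = \frac{|(u_2 - \bar q)u_{\lambda_n - \gamma} - (u_1 - \bar q)(u_2 - \bar q) + \tilde q^2|^2\, \mathrm{Im}(u_1)}{\big(|u_2 - \bar q|^2\, \mathrm{Im}(u_1) + \tilde q^2\, \mathrm{Im}(u_2)\big)\, |u_1 - u_{\lambda_{n+1} - \gamma}|^2} - C_1 \frac{\mathrm{Im}(u_1)\, |u_2 - u_{\lambda_n}|_+^2}{\mathrm{Im}(u_2)\, |u_1 - u_{\lambda_{n+1} - \gamma}|^2}. \]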
For $|\lambda - \gamma| \le \lambda_1 < 2\sqrt{2}$, the fixed point $u_{\lambda_n - \gamma}$ lies in the upper half plane for $n$ sufficiently large, and is bounded away from $\partial_\infty \mathbb{H}$. The function $\mu_{q,\lambda,\gamma,n}(u_1, u_2)$ always appears with a cutoff function $\chi(u_1)$ that ensures that $u_1$ is in a neighborhood of $\partial_\infty \mathbb{H}$ and thus that, for $n$ sufficiently large, $|u_1 - u_{\lambda_{n+1} - \gamma}|$ is bounded below by a positive constant. The variable $u_2$ can range over all of $\mathbb{H}$.
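Introduce polar co-ordinates $r$, $\omega_1$ and $\omega_2$ for $\mathrm{Im}(u_1)$ and $\mathrm{Im}(u_2)$ as
\[ \mathrm{Im}(u_1) = r\omega_1, \qquad \mathrm{Im}(u_2) = r\omega_2, \qquad \omega_1^2 + \omega_2^2 = 1. \]
Then
\[ \mu_{q,\lambda,\gamma,n}(u_1, u_2) = \frac{|(u_2 - \bar q)u_{\lambda_n - \gamma} - (u_1 - \bar q)(u_2 - \bar q) + \tilde q^2|^2\, \omega_1}{\big(|u_2 - \bar q|^2 \omega_1 + \tilde q^2 \omega_2\big)\, |u_1 - u_{\lambda_{n+1} - \gamma}|^2} - C_1 \frac{\omega_1 |u_2 - u_{\lambda_n}|_+^2}{\omega_2 |u_1 - u_{\lambda_{n+1} - \gamma}|^2}. \]
With a Bernoulli distribution, the potential takes on four possible values $(\pm a, \pm a)$. The corresponding values of $\mu$ are as follows.
\[ \mu^{++}_{a,\lambda,\gamma,n}(u_1, u_2) = \frac{|u_1 - u_{\lambda_n - \gamma} - a|^2}{|u_1 - u_{\lambda_{n+1} - \gamma}|^2} - C_1 \frac{\omega_1 |u_2 - u_{\lambda_n}|_+^2}{\omega_2 |u_1 - u_{\lambda_{n+1} - \gamma}|^2}. \]
The formula for $\mu^{--}$ is identical, except that $-a$ is replaced with $a$. For the other two values, we have $\mu^{+-} = \mu^{-+}$, with
\[ \mu^{+-}_{a,\lambda,\gamma,n}(u_1, u_2) = \frac{|u_2(u_1 - u_{\lambda_n - \gamma}) - a^2|^2\, \omega_1}{\big(|u_2|^2 \omega_1 + a^2 \omega_2\big)\, |u_1 - u_{\lambda_{n+1} - \gamma}|^2} - C_1 \frac{\omega_1 |u_2 - u_{\lambda_n}|_+^2}{\omega_2 |u_1 - u_{\lambda_{n+1} - \gamma}|^2}. \]

Lemma 13. For $u_1$ in a neighborhood of $\partial_\infty \mathbb{H}$ and $a$ bounded,
\[ \mu^{++}_{a,\lambda,\gamma,n}(u_1, u_2) \le 1 + C\, \frac{a + 2^{-n}}{|u_1 - u_{\lambda_{n+1} - \gamma}|}. \]

Proof. Dropping the second term we have
\[ \mu^{++}_{a,\lambda,\gamma,n}(u_1, u_2) \le \left| 1 + \frac{u_{\lambda_{n+1} - \gamma} - u_{\lambda_n - \gamma} - a}{u_1 - u_{\lambda_{n+1} - \gamma}} \right|^2. \]
Expanding the square, using that $|u_{\lambda_n - \gamma} - u_{\lambda_{n+1} - \gamma}| \le C 2^{-n}$ and that $|(a + C 2^{-n})/(u_1 - u_{\lambda_{n+1} - \gamma})|$ is bounded, since $a$ is bounded and $u_1$ is bounded away from $u_{\lambda_{n+1} - \gamma}$ near $\partial_\infty \mathbb{H}$, completes the proof.

The following lemma is the most involved estimate in this section.

Lemma 14. Suppose that $u_1$ lies in a sufficiently small neighborhood of infinity. Then for $C_1$ and $n$ sufficiently large and $a$ sufficiently small, there exists a positive constant $C$ such that
\[ \mu^{+-}_{a,\lambda,\gamma,n}(u_1, u_2) \le 1 - \frac{C\sqrt{C_1}\,(a - 2^{-n})}{|u_1 - u_{\lambda_{n+1} - \gamma}|}. \tag{30} \]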
Proof. To simplify the appearance of the formulas, we introduce the notation
\[ A_n = u_1 - u_{\lambda_n - \gamma}, \qquad B_n = u_2 - u_{\lambda_n}. \]
We begin by establishing the inequality
\[ \mu^{+-}_{a,\lambda,\gamma,n}(u_1, u_2) \le \frac{\big[\, |u_2 A_n - a^2| - a\sqrt{C_1}\, |B_n|_+ \big]_+^2}{|u_2|^2 |A_{n+1}|^2}. \tag{31} \]
Let $x = \omega_2/\omega_1 \in [0, \infty]$. We must maximize
\[ \mu^{+-}_{a,\lambda,\gamma,n}(u_1, u_2) = \frac{|u_2 A_n - a^2|^2}{(|u_2|^2 + x a^2)\, |A_{n+1}|^2} - C_1 \frac{|B_n|_+^2}{x\, |A_{n+1}|^2} \]
over $x$. We will assume without loss that $a > 0$. Differentiating with respect to $x$ we obtain the following equation for the critical point:
\[ \frac{|u_2 A_n - a^2|^2 a^2}{(|u_2|^2 + x a^2)^2} = \frac{C_1 |B_n|_+^2}{x^2}, \]
or
\[ |u_2 A_n - a^2|\, a x = \pm \sqrt{C_1}\, |B_n|_+ (|u_2|^2 + x a^2). \]
Since $x$ is non-negative we must choose $\pm = +$. This results in the critical point
\[ x = \frac{\sqrt{C_1}\, |B_n|_+ |u_2|^2}{a\big(|u_2 A_n - a^2| - a\sqrt{C_1}\, |B_n|_+\big)}. \]
The critical point will lie in $[0, \infty]$ provided
\[ |u_2 A_n - a^2| \ge a\sqrt{C_1}\, |B_n|_+, \tag{32} \]
in which case a calculation shows that the critical value is
\[ \frac{\big(|u_2 A_n - a^2| - a\sqrt{C_1}\, |B_n|_+\big)^2}{|u_2|^2 |A_{n+1}|^2}. \tag{33} \]
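The maximization over $x$ can be reproduced numerically. In the sketch below the quantities $T = |u_2 A_n - a^2|$, $S = \sqrt{C_1}\,|B_n|_+$, $U = |u_2|^2$ and $|A_{n+1}|^2$ are treated as fixed positive numbers (the values are arbitrary illustrations satisfying (32), and the short names are not from the paper). The code compares the value at the critical point with (33) and with a brute-force maximum over a grid.

```python
def mu_plus_minus(x, T, S, U, a, A_next_sq):
    """T^2 / ((U + x a^2) |A_{n+1}|^2) - S^2 / (x |A_{n+1}|^2), as a function of x > 0."""
    return T * T / ((U + x * a * a) * A_next_sq) - S * S / (x * A_next_sq)

def critical_point(T, S, U, a):
    """x = S U / (a (T - a S)); requires T > a S, which is condition (32)."""
    return S * U / (a * (T - a * S))

def critical_value(T, S, U, a, A_next_sq):
    """Claimed maximum (33): (T - a S)^2 / (U |A_{n+1}|^2)."""
    return (T - a * S) ** 2 / (U * A_next_sq)

T, S, U, a, A_next_sq = 3.0, 2.0, 1.5, 0.5, 4.0  # illustrative values with T > a*S
x_star = critical_point(T, S, U, a)
value_at_star = mu_plus_minus(x_star, T, S, U, a, A_next_sq)
claimed = critical_value(T, S, U, a, A_next_sq)
grid_max = max(mu_plus_minus(0.01 * k, T, S, U, a, A_next_sq) for k in range(1, 100001))
print(x_star, value_at_star, claimed, grid_max)
```

Because the function tends to $-\infty$ as $x \to 0$ and to 0 as $x \to \infty$, a single interior critical point with a positive value must be the global maximum, which is what the grid search reflects.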
At the endpoint $x = 0$, we find that $\mu^{+-}$ tends to $-\infty$ while the limit as $x \to \infty$ is 0. This implies that when (32) holds, then the maximum occurs at the critical value, and otherwise the maximum is 0. This proves (31).

Now we can proceed with the proof of estimate (30). We may assume that (32) holds, because otherwise $\mu^{+-}$ is zero and the desired estimate is true. This implies that for some $\epsilon > 0$ (e.g., $\epsilon = \frac{a}{\sqrt{C_1}\, |B_n|_+}$)
\[ |u_2 A_n| \ge a(1 - \epsilon)\sqrt{C_1}\, |B_n|_+. \]
Here we use that $|B_n|_+ \ge C$. Thus we may assume
\[ \frac{a\sqrt{C_1}\, |B_n|_+}{|u_2 A_n|} \le 1 + 2\epsilon \tag{34} \]
provided $0 < \epsilon < 1/2$ and use this in estimating (33). Expanding the square in (33) we end up with an estimate for $\mu^{+-}$ given by
\[
\begin{aligned}
\mu^{+-}_{a,\lambda,\gamma,n}(u_1, u_2) \le{}& \frac{|A_n|^2}{|A_{n+1}|^2} + \frac{a |B_n|_+}{|u_2| |A_{n+1}|} \Bigg( \frac{2a |A_n|}{|A_{n+1}| |B_n|_+} + \frac{a^3}{|u_2| |A_{n+1}| |B_n|_+} \\
&+ \frac{2\sqrt{C_1}\, a^2}{|u_2| |A_{n+1}|} + \frac{C_1 a |B_n|_+}{|u_2| |A_{n+1}|} - \frac{2\sqrt{C_1}\, |A_n|}{|A_{n+1}|} \Bigg).
\end{aligned}
\]
Now we may use (34), $|B_n|_+ \ge C$ and $|A_n|/|A_{n+1}| \le 1 + C 2^{-n}/|A_{n+1}|$ to arrive at the estimate
\[ \mu^{+-}_{a,\lambda,\gamma,n}(u_1, u_2) \le 1 + C 2^{-n}/|A_{n+1}| - \frac{a |B_n|_+}{|u_2| |A_{n+1}|}\big((1 - 2\epsilon)\sqrt{C_1} - Ca\big). \]
Finally, the bound $|B_n|_+/|u_2| \ge C$ completes the proof.
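Proof of Lemma 12. With the Bernoulli distribution, the average defining $\mu$ has four terms, so, dropping the subscripts and using the estimates from this section we have
\[
\begin{aligned}
\mu(u_1, u_2) &= \frac{1}{4}\big([\mu^{++}]_+^{1+\alpha} + [\mu^{--}]_+^{1+\alpha} + [\mu^{+-}]_+^{1+\alpha} + [\mu^{-+}]_+^{1+\alpha}\big) \\
&\le \frac{1}{2}\left[ \left(1 + C\, \frac{a + 2^{-n}}{|u_1 - u_{\lambda_{n+1} - \gamma}|}\right)^{1+\alpha} + \left(1 - C\sqrt{C_1}\, \frac{a - 2^{-n}}{|u_1 - u_{\lambda_{n+1} - \gamma}|}\right)^{1+\alpha} \right].
\end{aligned}
\]
For $a$ small and $n$ large, both terms inside the square brackets are a small perturbation of 1. But since we are free to take $C_1$ large, we may assume that the relative size of the term with the good (negative) sign is much larger. This leads to the estimate
\[ \mu(u_1, u_2) \le 1 - C\sqrt{C_1}\, \frac{a - 2^{-n}}{|u_1 - u_{\lambda_{n+1} - \gamma}|} + C(a + 2^{-n})^2 \]
for $a$, $\alpha$ small and $n$, $C_1$ large. To prove the lemma we must estimate the expression (again dropping most subscripts)
\[ \frac{1}{4} \sum_{q \in (\pm a, \pm a)} 2^{2\alpha} \mu(\psi_q(u_1, u_2), \phi_q(u_1, u_2))\, \chi(\psi_q(u_1, u_2))\, [\mu_q(u_1, u_2)]_+^{1+\alpha}\, \chi(u_1). \]
When $|u_1| \le C$ we can estimate $\mu$ by $1 + C(a + 2^{-n})^2$ and pull it out of the sum. What results is another copy of $\mu$ evaluated at bounded $u_1$. This can be estimated by $1 - \epsilon$. Since for small $\alpha$ the quantity $2^{2\alpha}$ is close to 1, we end up with the desired bound of $1 - \epsilon$ for $a$, $\alpha$ small and $n$, $C_1$ large. For $u_1$ near infinity, we estimate the occurrences of $\mu$ in the sum by the bound for $\mu^{++}$ which is slightly greater than one. Then we just need to guarantee that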
one of the $\mu$ terms will be evaluated with $\psi_q(u_1, u_2)$ bounded. This happens when $q = (a, a)$ since in this case $\psi_q(u_1, u_2) = -2/(u_1 - \bar q) + \lambda_n - \gamma$ independently of $u_2$.

Acknowledgments

We would like to thank the University of Erlangen-Nürnberg (R.F.), the University of British Columbia and the Erwin Schrödinger Institute (D.H. and W.S.), and Jacobs University (W.S.) for hospitality and financial support.

References

[1] C. Allard and R. Froese, A Mourre estimate for a Schrödinger operator on a binary tree, Rev. Math. Phys. 12 (2000) 1655–1667; mp-arc:98-497.
[2] M. Aizenman, R. Sims and S. Warzel, Stability of the absolutely continuous spectrum of random Schrödinger operators on tree graphs, Probab. Theory Related Fields 136(3) (2006) 363–394; arXiv:math-phys/0502006.
[3] M. Aizenman, R. Sims and S. Warzel, Absolutely continuous spectra of quantum tree graphs with weak disorder, Commun. Math. Phys. 264 (2006) 371–389; arXiv:math-ph/0504039.
[4] M. Aizenman, R. Sims and S. Warzel, Fluctuation based proof of the stability of ac spectra of random operators on tree graphs, in Quantum Graphs and Their Applications, Proceedings AMS-IMA-SIAM Joint Res. Conf., Snowbird 2005, eds. G. Berkolaiko, R. Carlson, S. A. Fulling and P. Kuchment, AMS Contemporary Mathematics Series, Vol. 415 (Amer. Math. Soc., 2006); arXiv:math-ph/0510069.
[5] M. Aizenman and S. Warzel, Persistence under weak disorder of AC spectra of quasiperiodic Schrödinger operators on trees graphs, Moscow Math. J. 5(3) (2005) 499–506; arXiv:math-ph/0504084.
[6] M. Aizenman and S. Warzel, The canopy graph and level statistics for random operators on trees, Math. Phys. Anal. Geom. 9 (2006) 291; arXiv:math-ph/0607021.
[7] J. Breuer, Localization for the Anderson model on trees with finite dimensions, Ann. Henri Poincaré 8(8) (2007) 1507–1520; arXiv:math/0609474.
[8] J. Breuer, Singular continuous spectrum for the Laplacian on certain sparse trees, Commun. Math. Phys. 269(3) (2006) 851–857; arXiv:math/0608159.
[9] J. Breuer and R. L. Frank, Singular spectrum for radial trees; arXiv:0806.0649.
[10] R. Froese, D. Hasler and W. Spitzer, Transfer matrices, hyperbolic geometry and absolutely continuous spectrum for some discrete Schrödinger operators on graphs, J. Funct. Anal. 230(1) (2005) 184–221; mp-arc:04-244.
[11] R. Froese, D. Hasler and W. Spitzer, Absolutely continuous spectrum for the Anderson model on a tree: A geometric proof of Klein's theorem, Commun. Math. Phys. 269(1) (2007) 239–257; mp-arc:05-388.
[12] F. Halasan, Thesis in preparation, University of British Columbia (2009).
[13] A. Klein, Extended states in the Anderson model on the Bethe Lattice, Adv. Math. 133 (1998) 163–184; mp-arc:94-236.
[14] B. Simon, Lp Norms of the Borel transform and the decomposition of measures, Proc. Amer. Math. Soc. 123(12) (1995) 3749–3755.
[15] W. Spitzer, Absolutely continuous spectrum on some tree graphs, Oberwolfach Report No. 12/2007, Transport in multi-dimensional random-Schrödinger operators (2007).
July 8, 2009 10:16 WSPC/148-RMP J070-00373
Reviews in Mathematical Physics Vol. 21, No. 6 (2009) 735–780 © World Scientific Publishing Company
MASSLESS SCALAR FREE FIELD IN 1+1 DIMENSIONS I: WEYL ALGEBRAS PRODUCTS AND SUPERSELECTION SECTORS
FABIO CIOLLI Dipartimento di Matematica, Università di Roma "Tor Vergata", Via della Ricerca Scientifica I-00133, Roma, Italy
[email protected] Received 18 December 2008 Revised 13 May 2009 This is the first of two papers on the superselection sectors of the conformal model in the title, in a time zero formulation. A classification of the sectors of the net of observables as restrictions of solitonic (twisted) and non-solitonic (untwisted) sector automorphisms of proper extensions of the observable net is given. All of them are implemented by the elements of a field net in a non-regular vacuum representation and the existence of a global compact Abelian gauge group is proved. A non-trivial center in the fixed-point net of this gauge group appears, but in an unphysical representation and reducing to the identity in the physical one. The completeness of the described superselection structure, to which the second paper is devoted, is shown in terms of Roberts’ net cohomology. Some general features of physical field models defined by twisted cross products of Weyl algebras in non-regular representations are also presented. Keywords: Weyl algebras; massless scalar free field; superselection sectors; conformal models; solitonic sectors; twisted crossed products; non-regular representations. Mathematics Subject Classification 2000: 81T05, 81T10, 81T40, 46L60, 46N50
Contents
1. Introduction
2. Weyl Algebras
2.1. Isomorphisms and twisted crossed products of Weyl algebras
2.2. Representations
2.3. Elementary Weyl algebra in non-regular representation
3. The Streater and Wilde Model
3.1. Weyl algebras for the Streater and Wilde model
3.2. Defining representations for the Streater and Wilde model
4. Local Theory, DHR Sectors and Gauge Group
4.1. Nets for the Streater and Wilde model
4.2. Chiral versus time zero formulation
4.3. DHR sectors for the observable net A_I
4.4. Gauge symmetry group
1. Introduction

The general theory of superselection sectors in low dimensional Quantum Field Theory is still lacking, so the study of special models is of great interest. Relevant progress has been achieved in the past years using Algebraic Quantum Field Theory for various classes of models such as loop groups, orbifold models and coset models, see [23, 49] also for a historical review. In this approach, a complete classification of rational models (i.e. with a finite number of sectors) with Virasoro central charge c < 1, see [34, 32], and of the local extensions of compact type of the Virasoro net on the circle at c = 1 has been attained, see [11] for details. In purely massive theories, the triviality of sectors has been proved in [39]. Two notable features of the theory are the presence of topological sectors of solitonic origin, see [35, 42], and the following dichotomy between the rational and non-rational case established in [37]: if in a model all irreducible sectors have conjugates, then it is either rational or has uncountably many different irreducible sectors.

In this paper, we deal with a non-rational c = 1 model, the massless scalar free field, also called the Streater and Wilde model after its first formulation using the ideas of local nets in [51]. Our main goal is to understand better the interplay between the chiral and time zero formulations of the superselection structure in 1+1-dimensional theories, the nature of solitonic and non-solitonic sectors, and the relation between DHR sectors and the presence of a quantum global internal symmetry, usually reducing to a global compact gauge group. The results obtained in this paper, and in its sequel [14], are largely adaptable to other theories based on Weyl algebras and give a strong further application of Roberts’ net cohomology for discussing superselection sectors in a general spacetime context, see [48] for reference.
The observables of the Streater and Wilde model are obtained from the quantized derivatives of the classical fields by imposing a constraint condition on the solutions of the two-dimensional wave equation. This choice avoids infrared divergences, see [51] and references therein, and for this reason the model is also called the theory of the potential of the field. The existence of a (physical, separable) Hilbert space representation for the observable net is obtained in the usual way by the Fock space second quantization procedure for Weyl algebras. For the same net, the existence of a continuous, i.e. uncountable, family of DHR superselection sectors is known from the original description in [51] (see [26, 48] for general references on the Doplicher, Haag, Roberts theory of superselection sectors). They are labeled by pairs of real numbers, i.e. by the elements of R_d^2 (here the subscript d means discrete topology) and
realized as inner automorphisms by some left/right movers (solitons) of a field net extension. Other relevant features of the model were discussed in [29], namely the Tomita–Takesaki modular structure, spacelike and timelike duality. The usual chiral net formulation of the Streater and Wilde model is treated in [10] and the above cited extension may be classified according to [11, Definition 3.2] as being of compact type.

A relevant step in the description of Weyl algebra models was the introduction of non-regular representations, see [55] and [1, 2]: for such a representation π of W(V, σ_V), the Weyl algebra on the symplectic space (V, σ_V), there is a subspace of elements v ∈ V such that, for λ ∈ R, the map λ ↦ π(W(λv)) ∈ B(H_π) is not weakly continuous. The papers just cited pointed out the utility of such Hilbert space representations in the presence of a theory with uncountably many sectors. This avoids the use of inner product spaces with indefinite metrics for the (unphysical) representation of charged fields. However, the same papers do not attack the problem of a local net theory in a non-regular representation, nor the full description of its DHR sectors and associated gauge group.

The task of this first paper is to collect the known results on the Streater and Wilde model and reconstruct the DHR sectors theory using a putative field net F (that does not satisfy the locality condition) and a compact group G of gauge symmetries, such that the observable net A is the fixed-point net under the action of G, restricted to the representation Hilbert space H_a of the observables, i.e. A = F^G ↾ H_a.
In the second paper, we legitimize the net F as the complete field net of A, and the description of the superselection structure is hence similar to the higher dimensional one of a field system with gauge symmetry, see the celebrated [22] for definitions and results, apart from the presence of braid symmetry instead of permutation symmetry.^a To construct such a field net, together with the gauge group structure, we use the abstract tool of the 2-cocycle twisted crossed product of Weyl algebras, i.e. the reinterpretation of the Weyl algebra of fields as an extension of that of the observables by the cocycle twisted action of the charge group. This simple current extension is defined by a (generalized) 2-cocycle derived from the symplectic form of the Weyl algebras, in its observable and charge-gauge group components, already partially studied in the physical literature, see for example [28].^b
^a It should be said that the superselection theory of the analogous model in 1+3 dimensions is known to be trivial. Here a constraint condition on the symplectic space (not introduced to avoid the infrared divergences as in the 1+1 case) distinguishes the observable net A from its Haag dual net A^d ⊃ A, and the failing of Haag duality for the observable net A, i.e. A^d ≠ A, denotes the presence of a spontaneous breaking of the gauge group. The above mentioned triviality is due to the absence of sector (or solitonic) automorphisms for A^d = F (the dual net equals the field net), and is proved by net cohomology in [9].
^b More general examples of simple current extensions, derived from loop groups, orbifold models or vertex operator algebras, may be considered. See, for example, [33].
The extension of the symplectic spaces in the model has the structure V_a ⊂ V_f = V_a ⊕ (N ⊕ C), where the symplectic form splits as σ_f = σ_a ⊕ σ_{N⊕C}. Here, the space C is the discrete Abelian charge group. The structure of the corresponding Weyl algebras is then

\[ \mathcal{W}(V_f, \sigma_f) = \mathcal{W}(V_a, \sigma_a) \otimes \mathcal{W}(N, \sigma_N) \rtimes_z \mathcal{U}(C). \tag{1.1} \]

Here U(C) denotes the Abelian group C written multiplicatively. Note that the factor W(N, σ_N) commutes with W(V_a, σ_a), but is acted upon by the charge group through the 2-cocycle z, reflecting the symplectic interaction between N and C. Such a construction, in a time zero formulation, allows one to classify the sectors labeled by R_d^2 as restrictions of solitonic (twisted) and non-solitonic (untwisted) sector automorphisms of two different simple current extensions of the net of the observables. Hence these sectors accord with the definitions in [42, 35], but this classification reflects a different perspective with respect to the equivalent nature of the left/right solitons of the chiral formulation. Moreover, the time zero approach makes evident the presence of a non-trivial center F^G ∩ (F^G)′ of the fixed-point net under the action of the global compact gauge group G, weakly continuously represented on the unphysical Hilbert space. Namely, the construction by a simple current extension gives a six-term diagram of inclusions of (localized nets of) symplectic spaces, Weyl algebras and von Neumann algebras with the action of the charge and gauge groups.

To introduce the test function spaces used to define the time zero symplectic spaces of the model, we denote by S the Schwartz space of real valued rapidly decreasing functions on the real line, by ∂S the space of functions that are derivatives of functions in S, and by ∂^{-1}S the C^∞-functions whose derivative is in S.
Referring to the classical theory of the quantum massless field in 1+1 dimensions, if ϕ denotes the field and π = ϕ̇ its conjugate momentum field, the current extension corresponds to two different test function space extensions: the codimension 1 extension from ∂S to S, which corresponds to lifting the condition that test functions for the massless time-zero field ϕ should vanish at zero wave number in Fourier space, and the codimension 2 extension from S to ∂^{-1}S, which corresponds to admitting as test functions for the time-zero conjugate momentum field π both constant functions and odd functions tending to constants at infinity. These extensions of test function spaces, together with the extension of the corresponding symplectic form

\[ \sigma_a(F, G) = \int_{\mathbb{R}} (f_0 g_1 - f_1 g_0)\, dx, \]

where F = f_0 ⊕ f_1 and G = g_0 ⊕ g_1 are two different couples of test functions for field and momentum, respectively, give the extension from the algebra of observables to the algebra of fields. We hence denote by C the space of quotient classes f̃_0(0) for ϕ and by Q the space of quotient classes f_1(+∞) − f_1(−∞) for π. Together, they form a two-dimensional real space of charges C ⊕ Q, furnished with the discrete
topology, i.e. C ≅ Q ≅ R_d, and playing the role of the charge group of the model, denoted only by C in the more generic formula (1.1). Omitting some intermediate terms, a reduced version of the cited diagram of inclusions for the (nets of) symplectic spaces and the von Neumann algebras of the observables and putative fields is

\[ V_a := \partial S \oplus S \;\subset\; V_b := (\partial S \oplus S) \oplus N \;\subset\; ((\partial S \oplus S) \oplus N) \oplus C \oplus Q \cong S \oplus \partial^{-1} S =: V_f, \]
\[ \mathcal{A} \;\subset\; \mathcal{B} := \mathcal{A} \otimes Z_b = \mathcal{F}^G \;\subset\; (\mathcal{A} \otimes Z_b) \rtimes \mathcal{U}(C) \rtimes \mathcal{U}(Q) =: \mathcal{F}. \]

In these diagrams, N ≅ R denotes the space of constant test functions for π, which plays the same role as N in (1.1). The charge group acts non-trivially only on the non-trivial central tensor factor Z_b, the Abelian von Neumann algebra generated by the representation of N.

A major task, which we postpone to the second paper [14], is the question of the completeness of the R_d^2-labeled superselection theory. A positive answer is given by a careful choice of the index sets defining the nets and by the very effective theory of net cohomology of Roberts. Actually, we determine the sectors for a large class of models given by an extension of Weyl algebras. In the second paper, the non-trivial center, or relative commutant A′ ∩ F, is discussed more deeply. This feature is considered for example in [5]. It will also be pointed out that it is related to the R-graded commutation rules of the non-local field net F. Moreover, the relation with the superselection theory in the presence of constraints, see [3], and further structural properties of the nets of the Streater and Wilde model, such as duality properties with respect to different index sets and split properties, will also be considered.

The structure of this paper is as follows: In Sec. 2, we recall known material and add new results on Weyl algebras, at the abstract algebraic, C*- and von Neumann algebraic level. In particular, in Sec. 2.1, a definition of the twisted crossed product of Weyl algebras is given, and used to describe the observable-charge coupling in a physical model. In Sec. 2.2, we present some useful independence requirements on the states on the Weyl algebras of a twisted crossed product, necessary for physically interesting representations on a non-separable Hilbert space. A detailed account of non-regular representations of Weyl algebras is presented in Sec. 2.3. The attention is focused on the non-regular representation of the elementary Weyl algebra on the symplectic space L ≅ R ⊕ R, presented for example in [55], to be used as a building block for the general twisted product case.^c

^c The same algebraic characterization may be used to study the superselection structure of models presenting electromagnetic charges and interaction. In this line, the analysis of the Stückelberg–Kibble QED_2 model will be presented elsewhere.

Section 3 is devoted to the twisted crossed product formulation of the Streater and Wilde model with initial data on the time zero line. Fixing any charged element
in the symplectic space gives a symplectic isomorphism that exponentiates to an isomorphism of Weyl algebras. The defining representations for some intermediate algebras and for the larger putative field algebra are also introduced through this isomorphism, in an essentially unique way.

The local net theory of the model is presented in Sec. 4, in the usual approach: the time zero observable net A on the index set of the open bounded intervals of the time zero line is defined, so that, if I is such an interval and the base of the double cone O, i.e. I = O, then A(I) = A(O). Similarly, four more intermediate nets and the putative field net F are defined. The net F realizes the cited simple current extensions. In Sec. 4.2, the relation between the chiral and the time zero formulation is discussed. The usual d’Alembert formula gives an isomorphic correspondence between the symplectic spaces and the charges in the two cases. In Sec. 4.3, we present a detailed description of the twisted and untwisted automorphisms describing the sectors. Finally, in Sec. 4.4, the global compact Abelian gauge group G is derived as the Bohr compactification of a subspace, isomorphic to R^2, of the symplectic space of the fields. Further details on the braided tensor category of the DHR sectors of A will be presented in [14].
2. Weyl Algebras

We recall in this section some essential results of the theory of Weyl algebras and fix the general notation, referring mainly to [7, Sec. 5.2] and [38, 50]. For V a separable real (topological) vector space, we denote by σ_V a (continuous) symplectic form on it, i.e. an R-bilinear, antisymmetric, real valued, (continuous) form on V × V, and by (V, σ_V) the associated real symplectic space. The abstract *-algebra on C generated by the elements W(v), v ∈ V, with product and involution defined respectively by

\[ W(v) W(v') = e^{-\frac{i}{2}\sigma_V(v, v')}\, W(v + v'), \qquad W(v)^* = W(-v) \tag{2.1} \]

for v, v′ ∈ V is called the Weyl algebra of (V, σ_V) and indicated by W(V, σ_V), or by W_V if no confusion arises. We note that, as long as we consider abstract non-represented Weyl algebras, there is no need to specify the topology on V, i.e. we can use the discrete one. Passing to the representations on a Hilbert space, the topology on the (support of the) symplectic space will play its role: Fock representations and, more generally, regular representations are typical examples, as we shall see in the sequel. The relations (2.1) imply that W(0) = I and W(v)^{-1} = W(−v) = W(v)^*, i.e. the generators of the Weyl algebra are formal unitaries, and

\[ W(v) W(v') = e^{-i\sigma_V(v, v')}\, W(v') W(v), \qquad v, v' \in V. \tag{2.2} \]
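The relations (2.1) and their consequences can be checked mechanically on formal finite sums of generators. The following Python sketch is only an illustration under stated assumptions: V = R^2 with the standard symplectic form is an arbitrary choice, and the helper names (sigma, W, mul, star, scale, close) are hypothetical, not notation from the text.

```python
import cmath
from collections import defaultdict

def sigma(v, w):
    """Standard symplectic form on R^2 (illustrative choice of sigma_V)."""
    return v[0] * w[1] - v[1] * w[0]

def W(v):
    """Generator W(v), stored as a formal finite sum {vector: coefficient}."""
    return {tuple(v): 1.0 + 0.0j}

def mul(a, b):
    """Product extended bilinearly from W(v)W(v') = exp(-i sigma(v,v')/2) W(v+v')."""
    out = defaultdict(complex)
    for v, x in a.items():
        for w, y in b.items():
            phase = cmath.exp(-0.5j * sigma(v, w))
            out[(v[0] + w[0], v[1] + w[1])] += x * y * phase
    return dict(out)

def star(a):
    """Involution extended antilinearly from W(v)^* = W(-v)."""
    return {(-v[0], -v[1]): x.conjugate() for v, x in a.items()}

def scale(c, a):
    """Multiply a formal sum by the complex scalar c."""
    return {v: c * x for v, x in a.items()}

def close(a, b, tol=1e-12):
    """Compare two formal sums coefficient by coefficient."""
    keys = set(a) | set(b)
    return all(abs(a.get(k, 0) - b.get(k, 0)) < tol for k in keys)
```

With these helpers, `mul(W(v), star(W(v)))` can be compared with `W((0.0, 0.0))` to confirm that the generators are formal unitaries, and `mul(W(v), W(w))` can be compared with `scale(cmath.exp(-1j * sigma(v, w)), mul(W(w), W(v)))` to confirm (2.2).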
The algebra W(V, σ_V) is a unital, generally non-commutative *-algebra, which is simple iff the symplectic form σ_V is non-degenerate, i.e. if σ_V(v, v′) = 0 for all v′ ∈ V implies v = 0. In [38, 50], a well-established standard theory associates to any Weyl algebra a canonical C*-norm, called the minimal regular norm. The symplectic form σ_V is non-degenerate on V iff a unique C*-norm on W(V, σ_V) exists, hence coinciding with the minimal regular one. We denote by C*(V, σ_V) the C*-algebra generated by the Weyl algebra W(V, σ_V) in the minimal regular norm, and call it the (unique) C*-algebra associated with W(V, σ_V). We term

\[ N_V := \{ v \in V : \sigma_V(v, v') = 0 \ \ \forall v' \in V \} \tag{2.3} \]

the degeneracy subspace of V, so that W(N_V, σ_V) ≅ W(N_V, 0) ⊆ W(V, σ_V) is the Abelian *-subalgebra generated by N_V. Its completion in the minimal regular norm on W(V, σ_V), denoted by C*(N_V), constitutes the center of C*(V, σ_V), i.e.

\[ Z_V := C^*(V, \sigma_V) \cap C^*(V, \sigma_V)' = C^*(N_V). \tag{2.4} \]

C*(V, σ_V) is simple iff N_V = {0}, and in the degenerate case, i.e. N_V ≠ {0}, the minimal regular norm on W_V is not the only C*-norm on W_V. Clearly, if V_0 := V/N_V and σ_{V_0} := σ_V ↾ V_0 (we use the notation σ_H := σ_V ↾ H for the restriction of the symplectic form to a subspace H ⊂ V), the pair (V_0, σ_{V_0}) is a non-degenerate symplectic space and the C*-algebra C*(V_0, σ_{V_0}) it generates is simple. The degenerate case is treated in [38], also when V is replaced by an Abelian topological group.

If σ_V is non-degenerate and V has a complexification, i.e. an operator J such that σ_V(·, J·) is a positive definite form and

\[ \sigma_V(Jv, v') = -\sigma_V(v, Jv'), \qquad J^2 = -1, \qquad v, v' \in V, \tag{2.5} \]

we immediately get a pre-Hilbert space structure for V, whose inner product is defined from the symplectic form by (·, ·)_V := σ_V(·, J·) + iσ_V(·, ·). Actually such a correspondence between the pair σ_V, J and (·, ·)_V is bijective, up to isomorphism, and the complexification is necessary to obtain a pure quasi-free state and a Fock representation for W_V (see, e.g., [7]). This is the usual method, necessary in some sense, to obtain a definite metric Hilbert space representation for the (observable) algebra of a physical model.

2.1. Isomorphisms and twisted crossed products of Weyl algebras

We focus in the sequel on two relevant symplectic structures: isomorphisms and twisted compositions of symplectic spaces; these respectively give rise functorially to isomorphisms and twisted crossed products of Weyl and C*-algebras. A symplectic morphism between symplectic spaces is given as a (continuous) map on the spaces, preserving the symplectic forms. An invertible morphism, i.e. an
isomorphism, may be defined also in the case of degeneracy as follows.

Definition 2.1. Given two symplectic spaces (V_1, σ_1) and (V_2, σ_2), a symplectic space isomorphism ψ : (V_1, σ_1) → (V_2, σ_2) is a continuous isomorphism between V_1 and V_2 as real topological vector spaces that preserves the symplectic form, i.e.

\[ \sigma_2(\psi(x), \psi(y)) = \sigma_1(x, y), \qquad x, y \in V_1. \]

A symplectic isomorphism exponentiates functorially to a Weyl algebra isomorphism between W(V_1, σ_1) and W(V_2, σ_2) and to a center-preserving C*-algebra isomorphism denoted by

\[ \Psi : C^*(V_1, \sigma_1) \to C^*(V_2, \sigma_2). \tag{2.6} \]

Consider now
• Weyl := (W(V, σ_V), ϕ), the category of all the Weyl algebras as objects and all the (purely algebraic) isomorphisms between them as morphisms.

The above discussion may be formalized by saying that there exists a Weyl exponentiation functor W that realizes an isomorphism of categories between
• Symp := ((V, σ_V), ψ), the category of symplectic spaces as objects and symplectic isomorphisms as morphisms;
• W(Symp), the subcategory of Weyl, where the morphisms are only W(ψ) for ψ a symplectic isomorphism as in Definition 2.1.

Using the C*-closure of Weyl algebras in the uniquely defined minimal regular norm, both of these categories are isomorphic to the following one in the C*-context, from which W(Symp) is obtained by a forgetful topology functor:
• C*(W(Symp)), the subcategory of Weyl C*-algebras as objects and the unit preserving isomorphisms W(ψ) of W(Symp) as morphisms, extended to the C*-closure.

Notice that the objects in the previously listed isomorphic categories have different algebraic and topological structures, although defined in a natural way starting from Symp. A similar natural, physically motivated definition of the representations for the Weyl algebra models is also pursued in the sequel: a typical example is the Fock representation, at least for a Weyl subalgebra and its extension to the whole Weyl algebra.

We may introduce a fourth category, isomorphic to the three above, that is handier from the point of view of crossed product theory. The objects of this category are called Weyl algebra groups and defined as follows: to any (also degenerate) symplectic space furnished with the discrete topology, a Weyl group U(V, σ_V) is associated such that

\[ 1 \to \mathbb{T} \to \mathcal{U}(V, \sigma_V) \to \mathcal{U}(V) \to 1 \tag{2.7} \]
is a short exact sequence.^d This means that the discrete twisted crossed product U(V, σ_V) := T ⋊_{(ι,y)} U(V) is an extension of the Abelian formal symbols group U(V) on the symplectic space V, by the torus group T and the 2-cocycle (see [47])

\[ z = (\beta, y) : (\mathcal{U}(V),\ \mathcal{U}(V) \times \mathcal{U}(V)) \to (\operatorname{Aut} \mathbb{T},\ \mathbb{T}) \tag{2.10} \]

where the action is trivial, β ≡ ι, and the function y(v, v′) := e^{-(i/2)σ_V(v,v′)} is defined by the symplectic form. Hence, the above announced fourth category is defined by
• U(Symp) := (U(V, σ_V), Ψ), the category of the Weyl algebra groups as objects and the symplectic derived group isomorphisms as morphisms, i.e. Ψ = W(ψ) for ψ as in Definition 2.1.

A Weyl algebra is hence simply recovered as a discrete crossed product W(V, σ_V) = C ⋊_{(ι,y)} U(V). Observe that this is not a semidirect product, possibly defined by a non-trivial action β, but a crossed product twisted by the non-trivial function y. A useful decomposition in the case of degeneracy is also possible, where Eq. (2.7) is better replaced by an extension making the degeneracy explicit:

\[ 1 \to \mathbb{T} \times \mathcal{U}(N_V) \to \mathcal{U}(V, \sigma_V) \to \mathcal{U}(V/N_V) \to 1. \tag{2.11} \]

Here T × U(N_V) ≅ Z(U(V, σ_V)) is the center of the group U(V, σ_V), and we have U(V/N_V) ≅ U(V)/U(N_V). This extension may be read as the discrete twisted crossed product

\[ \mathcal{U}(V, \sigma_V) = (\mathbb{T} \times \mathcal{U}(N_V)) \rtimes_{(\iota, y)} \mathcal{U}(V/N_V), \tag{2.12} \]

where y takes values in the T-part of the normal Abelian subgroup T × U(N_V), and the Weyl algebra is also written as W(V, σ_V) = W(N_V) ⋊_{(ι,y)} U(V/N_V).

^d Given a group G, an extension E of it by another group N is described by the short exact sequence

\[ 1 \to N \to E \to G \to 1 \tag{2.8} \]

where E is the set of pairs (n, s) ∈ N × G with multiplication law

\[ (n, s)(m, t) := (n\, \beta_s(m)\, y(s, t),\ st), \qquad (n, s), (m, t) \in N \times G, \]

for z = (β, y) : (G, G × G) → (Aut N, N) the non-Abelian 2-cocycle of the extension satisfying the equations

\[ y(s, t) \in (\beta_{st},\ \beta_s \circ \beta_t), \qquad s, t \in G, \]
\[ \beta_r(y(s, t))\, y(r, st) = y(r, s)\, y(rs, t), \qquad r, s, t \in G. \tag{2.9} \]

The first equation means that y intertwines the action of β_{st} and of β_s ∘ β_t, i.e. y(s, t)β_{st}(n) = β_s(β_t(n))y(s, t), for every s, t ∈ G and n ∈ N; the second relation is a multiplicative non-Abelian 2-cocycle equation. The extensions E are classified, up to isomorphism, by the 2-cohomology of G, with values in the 2-category (Z(N), Aut(N), N), where elements in Z(N), the center of N, implement the identity of Aut(N) of the above described cocycles, see [46, 47].
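For the Weyl group U(V, σ_V) the action β is trivial, so the cocycle equation (2.9) reduces to y(s, t) y(r, s + t) = y(r, s) y(r + s, t) with y(v, v′) = e^{-(i/2)σ_V(v,v′)}. The following Python sketch checks this numerically on random triples; it assumes V = R^2 with the standard symplectic form (an illustrative choice), and the function names are hypothetical.

```python
import cmath
import random

def sigma(v, w):
    """Standard symplectic form on R^2 (illustrative choice of sigma_V)."""
    return v[0] * w[1] - v[1] * w[0]

def y(v, w):
    """Torus-valued function y(v, v') = exp(-i sigma(v, v') / 2)."""
    return cmath.exp(-0.5j * sigma(v, w))

def add(v, w):
    """Group law of the symbol group U(V), written additively on V."""
    return (v[0] + w[0], v[1] + w[1])

def cocycle_defect(r, s, t):
    """|y(s,t) y(r,s+t) - y(r,s) y(r+s,t)|: Eq. (2.9) with trivial action beta."""
    lhs = y(s, t) * y(r, add(s, t))
    rhs = y(r, s) * y(add(r, s), t)
    return abs(lhs - rhs)

def max_cocycle_defect(trials=100, seed=0):
    """Largest defect over randomly sampled triples (r, s, t) in [-2, 2]^2."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        r, s, t = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(3)]
        worst = max(worst, cocycle_defect(r, s, t))
    return worst
```

The defect vanishes up to floating-point rounding because σ_V is bilinear; the antisymmetry of σ_V is not needed for the cocycle identity itself.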
Another simple example of extension is obtained from a direct sum of symplectic spaces (H, σ_H) and (L, σ_L) defined by

\[ (V, \sigma_V) := (H \oplus L,\ \sigma_H + \sigma_L). \tag{2.13} \]

Here we mean that the symplectic form σ_V decomposes as σ_V = σ_H ⊕ σ_L, i.e. σ_H = σ_V ↾ H and σ_L = σ_V ↾ L, such that (V, σ_V) ≅ (H, σ_H) ⊕ (L, σ_L) is an obvious symplectic isomorphism that at the Weyl algebra level gives

\[ \mathcal{W}(V, \sigma_V) \cong \mathcal{W}(H, \sigma_H) \otimes \mathcal{W}(L, \sigma_L). \tag{2.14} \]

The definition of the C*-maximal tensor product of two C*-algebras gives the C*-algebra isomorphism

\[ \Psi : C^*(V, \sigma_V) \to C^*(H, \sigma_H) \otimes_{\max} C^*(L, \sigma_L). \tag{2.15} \]

This is easy to obtain because, denoting by ‖·‖_H, ‖·‖_L and ‖·‖_V the minimal regular norms on W(H, σ_H), W(L, σ_L) and W(H ⊕ L, σ_H ⊕ σ_L) respectively, for given a ∈ W(H) and b ∈ W(L), on a generic elementary tensor a ⊗ b ∈ W(H) ⊗ W(L) it holds

\[ \| a \otimes b \|_{\max} = \| ab \|_V \ \ge\ \| \Psi(a) \|_{\max}\, \| \Psi(b) \|_{\max} = \| a \|_H\, \| b \|_L, \]

since ‖Ψ(a)‖_max = ‖a ⊗ I‖_max = ‖a‖_H and similarly for b ∈ W(L). We call such a kind of isomorphism for symplectic spaces, or Weyl and associated C*-algebras, a splitting isomorphism, and a direct sum as in Eq. (2.13) may also be called a splitting decomposition of the symplectic space V.

Remark 2.2. Observe that if both (H, σ_H) and (L, σ_L) are non-degenerate symplectic spaces, the minimal regular norms ‖·‖_H and ‖·‖_L are unique and the C*-subcross norms on the algebraic tensor product C*(H, σ_H) ⊗ C*(L, σ_L) all coincide, so that in this case it holds (see e.g. [50, 38])

\[ C^*(V, \sigma_V) = C^*(H, \sigma_H) \otimes_{\max} C^*(L, \sigma_L) = C^*(H, \sigma_H) \otimes_{\min} C^*(L, \sigma_L). \tag{2.16} \]

The splitting isomorphisms are trivial examples of the following general construction: let (H, σ_H) and (L, σ_L) be two symplectic spaces, with symbol Abelian groups U(H) and U(L), and let U(H, σ_H), U(L, σ_L) be their Weyl algebra groups, defined as in the above Eq. (2.7). Consider the 2-cocycle

\[ (\beta, y) : (\mathcal{U}(L),\ \mathcal{U}(L) \times \mathcal{U}(L)) \to (\operatorname{Aut} \mathcal{U}(H, \sigma_H),\ \mathcal{U}(H, \sigma_H)), \]

defined for the elements s = W(l), s′ = W(l′) ∈ U(L) and t = (ζ, W(h)) ∈ U(H, σ_H) by

\[ \beta_s(t) = \beta_s((\zeta, W(h))) = (\zeta e^{-i\alpha(h, l)},\ W(h)) \tag{2.17} \]

where the action β is given by a (continuous) real valued, R-bilinear form α, defined on H × L, such that α(h, 0) = α(0, l) = 0, and the function y is defined by

\[ y(s, s') = (e^{-i\sigma_L(l, l')/2},\ I_H). \tag{2.18} \]
To the pair of groups U(H, σ_H) and U(L) is associated the extension

\[ e \to \mathcal{U}(H, \sigma_H) \to \mathcal{U}(H \oplus L, \sigma_V) \to \mathcal{U}(L) \to e \]

where σ_V is a symplectic form on V := H ⊕ L, defined by

\[ \sigma_V((h, l), (h', l')) = \sigma_H(h, h') + \sigma_L(l, l') + \alpha(h, l') - \alpha(h', l) \tag{2.19} \]

so that

\[ \sigma_{H,L}((h, l), (h', l')) := \alpha(h, l') - \alpha(h', l) \tag{2.20} \]

represents the interacting content of the non-splitting sum. An extension group is defined from the 2-cocycle z := (β, y) as above, i.e. in other notation

\[ \mathcal{U}(H \oplus L, \sigma_V) = \mathcal{U}(H, \sigma_H) \rtimes_{(\beta, y)} \mathcal{U}(L) = \mathcal{U}(H, \sigma_H) \rtimes_z \mathcal{U}(L). \tag{2.21} \]
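As a numerical sanity check of the group law that the 2-cocycle z = (β, y) of Eqs. (2.17) and (2.18) induces on triples (ζ, h, l), the following Python sketch verifies associativity and the inverse formula. It is a sketch under stated assumptions: H = L = R^2 with the standard symplectic form and the particular bilinear form alpha below are arbitrary illustrative choices, and all function names are hypothetical.

```python
import cmath

def sigma(v, w):
    """Standard symplectic form on R^2, used here for both sigma_H and sigma_L."""
    return v[0] * w[1] - v[1] * w[0]

def alpha(h, l):
    """An arbitrary real bilinear form on H x L (hypothetical choice)."""
    return h[0] * l[0] + 2.0 * h[1] * l[1]

def add(v, w):
    return (v[0] + w[0], v[1] + w[1])

def neg(v):
    return (-v[0], -v[1])

def mul(a, b):
    """(zeta,h,l)(zeta',h',l') with phase exp(-i[alpha(h',l) + sigma_L(l,l')/2 + sigma_H(h,h')/2])."""
    (z1, h1, l1), (z2, h2, l2) = a, b
    phase = alpha(h2, l1) + 0.5 * sigma(l1, l2) + 0.5 * sigma(h1, h2)
    return (z1 * z2 * cmath.exp(-1j * phase), add(h1, h2), add(l1, l2))

def inv(a):
    """Inverse (zeta^{-1} exp(-i alpha(h,l)), -h, -l)."""
    z, h, l = a
    return (cmath.exp(-1j * alpha(h, l)) / z, neg(h), neg(l))

IDENTITY = (1.0 + 0.0j, (0.0, 0.0), (0.0, 0.0))

def close(a, b, tol=1e-12):
    """Compare two group elements component by component."""
    (z1, h1, l1), (z2, h2, l2) = a, b
    vecs = zip(h1 + l1, h2 + l2)
    return abs(z1 - z2) < tol and all(abs(x - y) < tol for x, y in vecs)
```

Associativity holds here because the total phase is a bilinear function of the pairs (h, l), (h′, l′), so it satisfies the 2-cocycle identity automatically; `close(mul(a, inv(a)), IDENTITY)` can be used to confirm the inverse.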
Explicitly, for generic elements t = (ζ, W(h)), t′ = (ζ′, W(h′)) ∈ U(H, σ_H) and s = W(l), s′ = W(l′) ∈ U(L), the extension group is defined by the product

\[ ((\zeta, W(h)), W(l))\,((\zeta', W(h')), W(l')) = ((\zeta, W(h))\, \beta_s(\zeta', W(h')),\ W(l) W(l')) \]
\[ = ((\zeta \zeta' e^{-i\alpha(h', l)} e^{-i\sigma_L(l, l')/2} e^{-i\sigma_H(h, h')/2},\ W(h + h')),\ W(l + l')), \]

by the identity e = ((1, I_H), I_L) and the passage to the inverse given by

\[ ((\zeta, W(h)), W(l))^{-1} = ((\zeta^{-1} e^{-i\alpha(h, l)},\ W(-h)),\ W(-l)). \]

In this generality, we can introduce the following:

Definition 2.3. The algebra on C associated as group algebra to the extension group U(H ⊕ L, σ_V) = U(H, σ_H) ⋊_{(β,y)} U(L), where the symplectic form σ_V and the 2-cocycle (β, y) are defined as above, is called the twisted crossed product algebra of the Weyl algebras W(H, σ_H) and W(L, σ_L). This algebra may also be defined as the Weyl algebra on the symplectic space (V := H ⊕ L, σ_V).^e

^e Such a construction from two symplectic spaces is also called the semidirect product of Weyl algebras in the literature, see, e.g., [28].

In particular cases, such a twisted crossed product of Weyl algebras may be derived from a non-splitting decomposition of a symplectic space, as explained in the sequel. If (V, σ_V) is a (degenerate) symplectic space and H is a real subspace of it, we denote by

\[ H' := \{ v \in V : \sigma_V(v, h) = 0,\ h \in H \} \tag{2.22} \]

the symplectic complement of H in V and by

\[ H^{\perp_{\sigma_V}} := \{ S \subset V \text{ linear space} : \sigma_V(s, h) = 0,\ s \in S,\ h \in H \} \tag{2.23} \]

the partially ordered set of the symplectic subspaces of V disjoint to H. The set H^{⊥σ_V} has maximal element H′ and obviously contains the (possibly non-trivial)
degeneracy subspace NV . The decomposition seen in Eq. (2.13) holds iff σV (l, h) vanishes for all l ∈ L and h ∈ H, i.e. introducing the symbol ⊥σV called the symplectic disjunction in (V, σV ), iff L⊥σV H. To construct examples of Weyl algebras products, suppose given a space decomposition V = B⊕C such that the symplectic form is not splitting, i.e. σV = σB +σC , and there exists a decomposition of one of the addend as B = H ⊕ N , with N ⊥σV H and C⊥σV H, i.e. (N ⊕ C)⊥σV H. In such a situation we have for the interacting part of the symplectic form σB,C = σH⊕N,C = σN,C and the non-splitting contents of such a decomposition of V is confined in the subspace L := C ⊕N ∼ = N ⊕C. The Weyl algebra associated to (V, σV ) is isomorphic to a twisted cross product, in accordance with the following Proposition 2.4. Given a symplectic space (V, σV ) with decomposition V = H ⊕ N ⊕ C,
with
B =H ⊕N
and
L = C ⊕ N ⊥σV H
there exists a 2-cocycle z = (β, y) : (U(C), U(C) × U(C)) → (Aut (U(N, σN ), U(N, σN ))) as in Eq. (2.9), such that for fixed elements s = W (c) and s = W (c ) in U(C) an action β : U(C) → Aut(U(N, σN )), is defined by βs (m) = (e−iσL (c,n) ζ, W (n)) = ad s(m),
(2.24)
for the element m = (ζ, W (n)) ∈ U(N, σN ), and where y : U(C)×U(C) → U(N, σN ) can be written as
y(s, s ) := (e−iσC (c,c )/2 , I) ∈ U(N, σN ). Such a 2-cocycle gives a twisted crossed product decomposition of the Weyl algebra as W(V, σV ) = W(H ⊕ N, σH ⊕ σN ) (β,y) U(C) = W(H, σH ) ⊗ W(N, σN ) (β,y) U(C). Proof. The subspaces H and L in Definition 2.3 have to be, respectively, identified with H ⊕ N and C in the case at hand. According to this, in Eq. (2.19) we have to read α(h, c) = α(n, c) = σL (n, c) for h ⊕ n ∈ H ⊕ N, c ∈ C, and the symplectic form σV decomposes as σV = σH ⊕ σL , where L = C ⊕ N and σL = σV L. Observe that the subspace N may be thought, is some sense, as being in common between the symplectic subspaces B and L, and that the Weyl elements defined from the subspace C have a non-trivial action on Weyl elements defined from N ⊂ B, by the evaluation of σL . We end this section with some broad ideas about the formalization of physical model by Weyl algebras. In all generality, a simple current extension of Weyl
algebras is essentially described by a crossed product of Weyl algebras, along the following scheme. The charge carrying fields are defined starting from a symplectic space

\[ (V_f, \sigma_f) = (V_a \oplus N \oplus C,\ \sigma_f). \tag{2.25} \]

Here, we have to read the subspace decomposition V = B ⊕ C = (V_a ⊕ N) ⊕ C as in Proposition 2.4 above, with H = V_a. The symplectic form σ_f may not split in the sum σ_B ⊕ σ_C, for σ_B = σ_f ↾ B and σ_C = σ_f ↾ C, but if N ∈ V_a^{⊥σ_f} the field Weyl algebra can be written isomorphically as

\[ \mathcal{W}(V_f, \sigma_f) = \mathcal{W}(V_a \oplus N, \sigma_a \oplus \sigma_N) \rtimes_{(\beta, y)} \mathcal{U}(C). \tag{2.26} \]

In these decompositions, V_a has the meaning of the symplectic space for the observable algebra, for which a (regular positive metric) Fock space representation π_a exists. The representation π_f for the field algebra W_f is in general a non-regular extension of π_a, as happens in a non-rational model, and U(C) plays the role of the charge group of the theory. Observe that this description is more general than the one treated in Proposition 2.4, where also C ∈ V_a^{⊥σ_f} was assumed. Hence the Weyl algebra models may be classified on the basis of the different specific properties in the above space decomposition (2.25) and the algebraic ones in Eq. (2.26), such as the dimension of C and N as real linear spaces, the evaluation of σ_V when restricted to C and N, and so on. We will see two different examples below.

As a final remark, observe that the purely algebraic constructions above are shown to entail some general functorial features passing to representations, which are also relevant for the nets of von Neumann algebras defined from localized symplectic subspaces of a given symplectic space.

2.2. Representations

We summarize some general results about the representation theory of a Weyl algebra W(V, σ_V) and its associated C*-algebra C*(V, σ_V), see [50, 38] for details:
• every positive linear functional on W(V, σ_V) is continuous with respect to the minimal regular norm and extends to a unique positive, continuous linear functional on C*(V, σ_V);
• every representation π of the Weyl algebra W(V, σ_V) on a Hilbert space H_π extends to a representation of C*(V, σ_V), on the same Hilbert space;
• every *-automorphism on W(V, σ_V) extends uniquely to a *-automorphism of C*(V, σ_V).

In the sequel, we show the relation between the twisted crossed product characterization of Weyl algebras, introduced in the last section, and some factorization properties of their representations. We begin from the simplest situation, the split
July 8, 2009 10:16 WSPC/148-RMP
748
J070-00373
F. Ciolli
case of Eq. (2.14) or (2.15), by the following:

Lemma 2.5. Let $(V = H \oplus L, \sigma_V = \sigma_H \oplus \sigma_L)$ be a direct sum of symplectic spaces with Weyl algebra $W(V, \sigma_V) \cong W(H, \sigma_H) \otimes W(L, \sigma_L)$ as above. Then
(i) if $(\pi_{\omega_H}, H_{\omega_H}, \Omega_H)$ and $(\pi_{\omega_L}, H_{\omega_L}, \Omega_L)$ are the GNS representations associated to the states $\omega_H$ and $\omega_L$ on $W_H$ and $W_L$ respectively, then the unique product state $\omega$ and its GNS representation $\pi_\omega$ are canonically defined for the Weyl algebra $W_V = W(H \oplus L, \sigma_H \oplus \sigma_L)$ by the (spatial) tensor product as
\[ (\pi_\omega, H_\omega, \Omega) = (\pi_{\omega_H}, H_{\omega_H}, \Omega_H) \otimes (\pi_{\omega_L}, H_{\omega_L}, \Omega_L); \]
(ii) if $(H, \sigma_H)$ and $(L, \sigma_L)$ are non-degenerate symplectic spaces or if (for example) $\sigma_V{\restriction}L = \sigma_L$ vanishes, i.e. if $W(L, \sigma_L) = W(L)$ is Abelian, then
\[ \big(\pi_\omega(C^*(H, \sigma_H) \otimes_{\max} C^*(L, \sigma_L))\big)'' = \pi_{\omega_H}(C^*(H, \sigma_H))'' \otimes \pi_{\omega_L}(C^*(L, \sigma_L))'', \]
where the latter means the tensor product of the von Neumann algebras $\pi_{\omega_H}(C^*(H, \sigma_H))''$ and $\pi_{\omega_L}(C^*(L, \sigma_L))''$.

Proof. (i) $\pi_\omega$ is obtained as the GNS representation of the product state $\omega := \omega_H \otimes \omega_L$ on the C*-algebra $C^*(H, \sigma_H) \otimes_{\max} C^*(L, \sigma_L)$, i.e. from the state defined by
\[ \omega(A \otimes_{\max} B) = \omega_H(A)\,\omega_L(B), \qquad A \in C^*(H, \sigma_H), \quad B \in C^*(L, \sigma_L). \]
Here $\otimes_{\max}$ assures for the product state $\omega$ a well behaved passage to the representation $\pi_\omega$ of $C^*(H, \sigma_H) \otimes C^*(L, \sigma_L)$ on the Hilbert space $H_\omega = H_{\omega_H} \otimes H_{\omega_L}$, obtained as the spatial tensor product of the GNS representations $\pi_{\omega_H}$ of $C^*(H, \sigma_H)$ and $\pi_{\omega_L}$ of $C^*(L, \sigma_L)$ (see e.g. [52, Theorem IV.4.9] or [31, Proposition 11.1.1] for details). (ii) The results follow directly from [52, Theorem IV.4.13] and the identity $C^*(H, \sigma_H) \otimes_{\max} C^*(L, \sigma_L) = C^*(H, \sigma_H) \otimes_{\min} C^*(L, \sigma_L)$. This equality, in the case of non-degenerate subspaces, is given by Eq. (2.16). In the second case, if $C^*(L, \sigma_L)$ is Abelian, hence nuclear, it is a well known consequence.

An elementary example of item (ii) in Lemma 2.5 above is given by a splitting isomorphism of symplectic spaces with $L = N_V$, the degenerate subspace of $V$, and $H = V/N_V$. Passing to the non-splitting situation, the factorization of representations we are interested in is described by the following general result, also related to one in [28].

Proposition 2.6. Let $(V = H \oplus L, \sigma_V = \sigma_H + \sigma_L + \sigma_{H,L})$ be a symplectic space decomposition as in Definition 2.3 and Eq. (2.21), such that the Weyl algebra $W(V, \sigma_V)$ is not splitting, i.e. the real form $\alpha$ that defines through Eq. (2.20) the interacting part $\sigma_{H,L}$ of the symplectic form $\sigma_V$ is non-trivial. Then, for given $\omega_H$
and $\omega_L$ two states on $W(H, \sigma_H)$ and $W(L, \sigma_L)$ respectively, the linear functional on $W(V, \sigma_V)$ defined for $v = h \oplus l \in H \oplus L = V$ by
\[ \omega(W(v)) := \omega_H(W(h))\,\omega_L(W(l)) \tag{2.27} \]
is positive, i.e. is a state on $W(V, \sigma_V)$, if $\omega_L(W(l)) = 0$ for $l \notin H^\perp \cap L$. In particular, if $H^\perp \cap L = \{0\}$, i.e. if $\alpha$ is non-trivial on any subspace of $L$, such a condition is also necessary, i.e. if $H^\perp \cap L = \{0\}$, $\omega$ is a state on $W(V, \sigma_V)$ iff
\[ \omega_L(W(l)) = 0 \quad \text{for } l \neq 0. \tag{2.28} \]
If $H^\perp \cap L = \{0\}$ and the condition (2.28) holds, the state $\omega$ is faithful iff so is the state $\omega_H$. The analogous statements hold with the roles of $L$ and $H$ interchanged.

Proof. To verify the hermiticity of $\omega$, we may restrict to an element $A = W(l)W(h)$ for $l \in L$ and $h \in H$, so that such a property for $\omega$ holds iff
\[ \omega(A) = \omega_H(W(h))\,\omega_L(W(l)) = \overline{\omega(A^*)} = \overline{\omega\big(W(l)^* \beta_{W(l)}(W(h)^*)\big)} = \omega_H\big(\beta_{W(l)}(W(h))\big)\,\omega_L(W(l)), \]
i.e. by the definition of the action of $\beta$ in Eq. (2.24), iff
\[ \omega_H(W(h))\,\omega_L(W(l))\,\big(1 - e^{i\alpha(h,l)}\big) = 0. \]
Hence the hermiticity holds if $\omega_L(W(l)) = 0$ for $l \notin H^\perp \cap L$. In particular, if $H^\perp \cap L = \{0\}$, the hermiticity holds iff the condition (2.28) is satisfied. To show the positivity, observe that any element $A \in W_V$ is written, for $l_i \in L$, $h_i \in H$ and $a_i \in \mathbb{C}$ with $l_i \neq l_j$ for $i \neq j$, as a finite sum
\[ A = \sum_{1 \le i \le n} a_i\, W(h_i) W(l_i). \tag{2.29} \]
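Hence we obtain
\[ \begin{aligned} AA^* &= \sum_{1 \le i \le n} |a_i|^2 + \sum_{1 \le i < j \le n} \big( a_i \bar{a}_j\, W(h_i) W(l_i) W(l_j)^* W(h_j)^* + \mathrm{adj} \big) \\ &= \sum_{1 \le i \le n} |a_i|^2 + \sum_{1 \le i < j \le n} \big( a_i \bar{a}_j\, e^{i\alpha_{ij}}\, W(h_i - h_j) W(l_i - l_j) + \mathrm{adj} \big) \end{aligned} \tag{2.30} \]
where $\alpha_{ij} = \frac{1}{2}[\sigma_H(h_i, h_j) + \sigma_L(l_i, l_j) + \sigma_V(h_j, l_i - l_j)]$. Applying the functional $\omega$ gives
\[ \begin{aligned} \omega(AA^*) &= \sum_{1 \le i \le n} |a_i|^2 + \sum_{1 \le i < j \le n} 2\Re\big\{ a_i \bar{a}_j\, e^{i\alpha_{ij}}\, \omega_H(W(h_i - h_j))\, \omega_L(W(l_i - l_j)) \big\} \\ &= \sum_{1 \le i \le n} |a_i|^2 + \sum_{\substack{1 \le i < j \le n \\ l_i - l_j \in H^\perp \cap L}} 2\Re\big\{ a_i \bar{a}_j\, e^{i\beta_{ij}} \big\} \\ &\ge \sum_{\substack{1 \le i < j \le n \\ l_i - l_j \in H^\perp \cap L}} |a_i|^2 + \sum_{\substack{1 \le i < j \le n \\ l_i - l_j \in H^\perp \cap L}} 2\Re\big\{ a_i \bar{a}_j\, e^{i\beta_{ij}} \big\} = \Big| \sum_{\substack{1 \le i < j \le n \\ l_i - l_j \in H^\perp \cap L}} b_i \Big|^2 \ge 0 \end{aligned} \tag{2.31} \]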
where we used the condition $\omega_L(W(l)) = 0$ for $l \notin H^\perp \cap L$, redefined $\beta_{ij}$ from the phases that arise from the evaluation of the states, and the last equality holds because we may set $b_i = a_i e^{i\beta_i}$ for some $\beta_i$ such that $\beta_{ij} = \beta_i - \beta_j$. If we suppose that the property (2.28) holds true for $L$, we only have $\omega(AA^*) = \sum_{1 \le i \le n} |a_i|^2$, so the faithfulness trivially follows in this case.

Observe that the above defined state $\omega$ is a product state on the tensor product algebra $W_H \otimes W_L$ in the sense of Lemma 2.5 only if the linear form $\alpha$ trivializes, i.e. if $H^\perp \cap L = L$. In particular, Proposition 2.6 applies to the symplectic space decomposition as in Proposition 2.4. For this case, given the two states $\omega_H$ on $W(H, \sigma_H)$ and $\omega_L$ on $W(L, \sigma_L)$, we define in a canonical way, as in Lemma 2.5, the product state
\[ \omega_p := \omega_H \otimes \omega_L \tag{2.32} \]
on the tensor product Weyl algebra $W(H, \sigma_H) \otimes W(L, \sigma_L)$ and denote by $\pi_p := \pi_H \otimes \pi_L$ its GNS representation on the Hilbert space $H_p = H_H \otimes H_L$. Now let $\omega_{H \oplus N}$ be an extension of the state $\omega_H$ from the Weyl algebra $W(H, \sigma_H)$ to the Weyl algebra $W(H \oplus N, \sigma_{H \oplus N})$, for an extension $\sigma_{H \oplus N}$ of the symplectic form $\sigma_H$ of $H$ to $H \oplus N$. For $L = N \oplus C$, let us denote by $\omega_C$ the restriction of a state $\omega_L$ of $W(L, \sigma_L)$ to the subalgebra $W(C, \sigma_L{\restriction}C)$. A result relating the states on a twisted crossed product decomposition (Proposition 2.6) and a splitting Weyl algebra (Lemma 2.5) is obtained by confining the interacting part $\sigma_{H,L}$ of the symplectic form $\sigma_V$ to the subspace $N$, which is in common between $H$ and $L$. Hence we give the following result, where indeed, referring to Proposition 2.6, $H$ is replaced by $H \oplus N$ and $L$ by $C$:

Proposition 2.7. Consider $V = H \oplus L = H \oplus N \oplus C$ as in the notation above, so that the state $\omega$ in Eq. (2.27) is well defined. Then the following are equivalent:
(i)
\[ \begin{aligned} \omega_{H \oplus N}(W(x)) &= \omega_H(W(h)), & x &= h \oplus n \in H \oplus N, \\ \omega_L(W(l)) &= \omega_C(W(c)), & l &= n \oplus c \in N \oplus C = L; \end{aligned} \]
(ii) the states $\omega_p$ and $\omega$, the first defined as in (2.32) with extension and restriction as above, and $\omega$ defined as in Proposition 2.6, coincide on $W(V, \sigma_V)$ and on its C*-algebra $C^*(V, \sigma_V)$.

Proof. (i) ⇒ (ii) is obtained by a simple calculation from the definitions, for $v = h \oplus l = h \oplus n \oplus c \in H \oplus N \oplus C = V$ and $x = h \oplus n \in H \oplus N$:
\[ \omega_p(W(v)) := \omega_H(W(h)) \otimes \omega_L(W(l)) = \omega_{H \oplus N}(W(x))\, \omega_C(W(c)) =: \omega(W(v)). \]
(ii) ⇒ (i) It suffices to use $c = 0$ in the preceding calculation to obtain back the first relation in (i) and analogously, using $h = 0$, for the second.

Notice that for the element $h = 0$ in $H$ we obtain from the condition (i) in the proposition above that the state $\omega_N := \omega_{H \oplus N}{\restriction}W_N$ of $W_N$ is such that $\omega_N(W(n)) = 1$ for every $n \in N$.
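The positivity step in the proof of Proposition 2.6 can be checked numerically on a toy case. The sketch below is only an illustration under assumptions of ours: we take $V = \mathbb{R}^2$ with $\sigma((c,n),(c',n')) = cn' - c'n$, the $n$-axis in the role of $H$ with $\omega_H \equiv 1$, and the $c$-axis in the role of $L$ with $\omega_L(W(c)) = \delta_{c,0}$, so that $\omega(W((c,n))) = \delta_{c,0}$ and condition (2.28) holds. The function names and sample data are hypothetical. The double sum for $\omega(AA^*)$ is compared with the sum of squares obtained from $b_i = a_i e^{i\beta_i}$, here with $\beta_i = -c_i n_i/2$.

```python
import cmath
from collections import defaultdict


def sigma(l1, l2):
    """Toy symplectic form sigma((c,n),(c',n')) = c*n' - c'*n."""
    (c1, n1), (c2, n2) = l1, l2
    return c1 * n2 - c2 * n1


def omega(l):
    """Toy state: omega(W((c,n))) = 1 if c == 0, else 0."""
    return 1.0 if l[0] == 0 else 0.0


def omega_AAstar(coeffs, points):
    """omega(A A*) for A = sum_i a_i W(l_i).

    Uses W(l)W(l') = exp(-i sigma(l,l')/2) W(l+l') and W(l)* = W(-l),
    so that W(l_i) W(l_j)* = exp(+i sigma(l_i,l_j)/2) W(l_i - l_j).
    """
    total = 0j
    for a_i, l_i in zip(coeffs, points):
        for a_j, l_j in zip(coeffs, points):
            diff = (l_i[0] - l_j[0], l_i[1] - l_j[1])
            phase = cmath.exp(0.5j * sigma(l_i, l_j))
            total += a_i * a_j.conjugate() * phase * omega(diff)
    return total


def sum_of_squares(coeffs, points):
    """Same quantity via b_i = a_i exp(-i c_i n_i / 2), grouped by c_i."""
    groups = defaultdict(complex)
    for a, (c, n) in zip(coeffs, points):
        groups[c] += a * cmath.exp(-0.5j * c * n)
    return sum(abs(b) ** 2 for b in groups.values())
```

Grouping by the value of $c$ plays the role of the condition $l_i - l_j \in H^\perp \cap L$ in (2.31): only pairs with equal charge contribute, and within each group the phases factorize as $\beta_{ij} = \beta_i - \beta_j$.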
Remark 2.8. (1) Consider a symplectic space decomposition as in Proposition 2.4 and refer to the discussion of the physical model around Eq. (2.26). If $C$ represents the charge group for $V_f$ and $H \oplus N$ is related to the symplectic space of observables, then the triviality of $\alpha$, i.e. of $\sigma_{N,C}$, is equivalent to a trivial action of the sector (auto)morphisms on the observables, which means a trivial superselection structure of the model. The condition in Eq. (2.28) instead defines the state $\omega_L$ in a unique way as a non-regular state (see Definition 2.9 below).
(2) The results obtained actually extend those of Herdegen in [28], where a crossed product of a Weyl algebra by a CAR algebra is treated: in fact, as observed by Slawny in [50, Sec. 3.10], a CAR algebra may always be written as a Weyl algebra with a non-degenerate symplectic form.
(3) Proposition 2.7 suggests that, for twisted crossed products as above, it is the non-regular representations of $W(V, \sigma_V)$ that are of interest, with $C$ the non-regular subspace and $N$ the regular subspace of $V$, see Definition 2.9 below. This always leads to non-rational models, with uncountably many superselection sectors, in the sense of [37]. Further general results for twisted crossed products and non-regular representations of CCR algebras of a generic locally compact group, along the lines of [25], are also possible, see [13].

2.3. Elementary Weyl algebra in non-regular representation

In the above subsection, we pointed out the utility of non-regular representations when a field algebra model is written as a crossed product of Weyl algebras. A regular (Fock space) representation is used instead on the observable part of the model, also called in the literature the physical part, and a non-regular one on the charge part, see e.g. [1]. We hence recall the following basic definition:

Definition 2.9. A representation $\pi$ of a Weyl algebra $W = W(V, \sigma_V)$ is said to be a regular representation if for all $v \in V$ the one-parameter group
\[ \lambda \mapsto \pi(W(\lambda v)), \qquad \lambda \in \mathbb{R}, \quad v \in V \tag{2.33} \]
is weakly (and strongly) continuous. If there exists a non-trivial subspace $V_1 \subset V$ such that the map in Eq. (2.33) is not weakly continuous for $v \in V_1$, then the representation $\pi$ is said to be non-regular. The maximal such subspace is called the non-regular subspace of $\pi$. A state $\omega$ whose GNS representation $\pi_\omega$ is regular (non-regular) is also said to be regular (non-regular).

Observe that in general a Weyl algebra is not norm continuous, since $\|W(v_1) - W(v_2)\| = 2$ if $v_1 \neq v_2$ for $v_1, v_2 \in V$. The strong continuity of the regular representations of the Weyl algebras, which is needed for their relevant physical content, is in some sense also necessary to obtain a good representation of the associated CCR algebra, see [47, Sec. 3.3] for details.
We now discuss in detail the simplest, but relevant, case of a non-regular representation for the symplectic space $L \cong \mathbb{R}_d^2$, in order to use it as a building block for the superselection theory of physical models. Let $C \cong N \cong \mathbb{R}_d$ be copies of the additive group of reals, furnished with the discrete topology; consider the symplectic space $(L, \sigma_L)$ where $L = C \oplus N \cong \mathbb{R}_d^2$ and the symplectic form is the usual non-degenerate one on $\mathbb{R}_d^2$, i.e. for a pair of elements $l = (c, n)$, $l' = (c', n') \in L$ we have
\[ W_L(l) W_L(l') = e^{-\frac{i}{2}\sigma_L(l, l')}\, W_L(l + l'), \qquad \sigma_L(l, l') = (cn' - c'n). \]
The Weyl algebra $W_L$ associated to $(L, \sigma_L)$ is hence that of a quantum system with one degree of freedom; its irreducible regular representations, and those of the associated C*-algebra $C^*(L, \sigma_L)$, are well described by the Stone–von Neumann uniqueness theorem, see, e.g., [7]. To consider a particular non-regular representation of $W_L$, following for example [55, Sec. 2], let $\omega_L$ be the functional on $W_L$ defined by
\[ \omega_L(W_L(l)) = \omega_L(W_L((c, n))) = \delta_{c,0}, \qquad l = (c, n) \in L. \tag{2.34} \]
This is the unique tracial state on $W(L, \sigma_L)$, up to the change of $C$ and $N$, and it turns out to be not faithful because
\[ \omega_L\big( (I - W_L((0, n)))\,(I - W_L^*((0, n))) \big) = 0, \qquad (0, n) \in L. \]
However, if we denote the GNS triple of $\omega_L$ by $(\pi_L, H_L, \Omega_L)$, we observe that $\pi_L$ is faithful because $W_L$ is simple, as the degeneracy subspace of $L$ is trivial, i.e. $N_L = \{0\}$. The GNS Hilbert space $H_L$ is non-separable and can be derived from the total set of vectors obtained as
\[ |c, n\rangle := \pi_L(W_L((c, n)))\,\Omega_L, \qquad (c, n) \in L. \]
The action of a represented Weyl element $\pi_L(W_L((c, n)))$ on the generic vector $|c', n'\rangle \in H_L$, for $(c, n), (c', n') \in L$, reads as
\[ \pi_L(W_L((c, n)))\,|c', n'\rangle = e^{-\frac{i}{2}\sigma_L((c,n),(c',n'))}\, |c + c', n + n'\rangle. \tag{2.35} \]
Moreover, if $|c, n\rangle$ and $|c', n'\rangle$ are two elements of $H_L$, their scalar product is equal to
\[ \langle c, n | c', n' \rangle = \delta_{c,c'}\, e^{\frac{i}{2} c (n' - n)}, \tag{2.36} \]
so that $\langle 0, n | 0, n' \rangle = 1$. It is hence possible to identify the vectors $|c, n\rangle$ for every $n \in N$ with the vector $|c, 0\rangle$, up to a phase, giving $|c, n\rangle = e^{\frac{i}{2} cn} |c, 0\rangle$ and $\langle c, n | c', n' \rangle = e^{\frac{i}{2} c (n' - n)} \delta_{c,c'}$. Because of this identification, we simply write $|c\rangle := |c, 0\rangle$. In particular this is possible for $c = 0$, so that the following relations hold
\[ \Omega_L = |0, 0\rangle = |0, n\rangle =: |0\rangle, \qquad \text{for all } n \in N. \tag{2.37} \]
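The relations (2.34)–(2.37) can be made concrete on finitely supported vectors. The following is a minimal sketch, not part of the construction above: vectors are stored as Python dictionaries `{x: amplitude}` standing for $\sum_x a_x |x\rangle$ with $|x\rangle := |x, 0\rangle$, and the helper names `sigma`, `weyl`, `inner` and `state` are ours. Combining (2.35) with $|c, n\rangle = e^{\frac{i}{2}cn}|c, 0\rangle$ gives $\pi_L(W_L((c,n)))|x\rangle = e^{\frac{i}{2}(2x + c)n}|x + c\rangle$, which is what `weyl` implements.

```python
import cmath


def sigma(l1, l2):
    """Symplectic form sigma_L((c,n),(c',n')) = c*n' - c'*n."""
    (c1, n1), (c2, n2) = l1, l2
    return c1 * n2 - c2 * n1


def weyl(l, vec):
    """Apply pi_L(W_L((c,n))) to a finitely supported vector.

    vec is a dict {x: amplitude} representing sum_x amplitude * |x>.
    The basis vector |x> is sent to exp(i (2x + c) n / 2) |x + c>.
    """
    c, n = l
    out = {}
    for x, amp in vec.items():
        phase = cmath.exp(0.5j * (2 * x + c) * n)
        out[x + c] = out.get(x + c, 0) + phase * amp
    return out


def inner(u, v):
    """Inner product, conjugate-linear in the first argument."""
    return sum(amp.conjugate() * v[x] for x, amp in u.items() if x in v)


def state(l):
    """omega_L(W_L(l)) = <0| pi_L(W_L(l)) |0>, cf. Eq. (2.34)."""
    vacuum = {0.0: 1.0 + 0j}
    return inner(vacuum, weyl(l, vacuum))
```

With these definitions `state((c, 0.0))` vanishes for every nonzero `c`, however small, while `state((0.0, n))` equals 1 for all `n`: the first coordinate is the non-regular direction, and the vacuum is invariant under the second, as in (2.37).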
From the above discussion, the Hilbert space $H_L$ may also be written as
\[ H_L = \bigoplus_{c \in C} \mathbb{C}\,|c\rangle \cong \ell^2(C) \cong \ell^2(\mathbb{R}_d), \tag{2.38} \]
$\ell^2(\mathbb{R}_d)$ being the space of square summable maps from $\mathbb{R}_d$ into $\mathbb{C}$, i.e. an element $\xi \in \ell^2(\mathbb{R}_d)$ is a function supported on a countable subset $D_\xi$ of $\mathbb{R}$ such that $\|\xi\|^2 := \sum_{d \in D_\xi} |\xi(d)|^2 < \infty$. According to Eq. (2.36), the inner product on $\ell^2(\mathbb{R}_d)$ is equivalently given by
\[ \langle \xi, \eta \rangle = \sum_{d \in D_\xi \cap D_\eta} \overline{\xi(d)}\,\eta(d), \qquad \xi, \eta \in \ell^2(\mathbb{R}_d). \]
Denoting for every $c \in \mathbb{R}$ by $\delta_c \in \ell^2(\mathbb{R}_d)$ the characteristic function of the subset $\{c\} \subset \mathbb{R}$, identifying $|c, 0\rangle$ with $\delta_c$ gives the isomorphism in Eq. (2.38) above. The representation $\pi_L$ turns out to be irreducible and non-regular, not being strongly continuous in the first coordinate $c \in C$ of the elements $W_L(c, n)$ but only in the second. The von Neumann algebra $\mathcal{L} := \pi_L(C^*(L, \sigma_L))''$ is irreducibly represented on $H_L$ by $\pi_L$. Moreover, we may consider the Weyl elements $W_L(c, 0)$, $c \in C$ and $W_L(0, n)$, $n \in N$, and the Abelian C*-subalgebras of $C^*(L, \sigma_L)$ they generate, respectively $C^*(C, 0) =: C^*(C)$ and $C^*(N, 0) =: C^*(N)$; finally, if $\mathcal{C} := \pi_L(C^*(C))''$ and $\mathcal{N} := \pi_L(C^*(N))''$ are the associated Abelian von Neumann subalgebras of $\mathcal{L}$ they define, the properties of these unitaries and algebras are collected in the following:

Proposition 2.10. Under the above definitions we have:
(i) if $c \neq 0$ and $n \neq 0$ then the spectrum of $W_L((c, 0))$ is $\mathbb{T}$ and equals both the spectrum of $W_L((0, n))$ and the discrete spectrum of $\pi_L(W_L(0, n))$;
(ii) there is a self-adjoint operator $\Phi_N$ on $H_L$ such that for any $n \in N$ we have $\pi_L(W_L(0, n)) = \exp\{i n \Phi_N\}$, i.e. $\Phi_N$ is the generator of the one-parameter group associated to $N$; there is no self-adjoint operator $\Phi_C$ on $H_L$ generating a one-parameter group associated to $C$. In particular, for all $c \in C$ we have
\[ \Phi_N |c, 0\rangle = -i \lim_{n \to 0} n^{-1} \big( \pi_L(W(0, n)) - I_L \big) |c, 0\rangle = -i \lim_{n \to 0} n^{-1} \big( e^{inc} - 1 \big) |c, 0\rangle = c\,|c, 0\rangle; \]
(iii) $C^*(C) \cong C(bC) \cong C(b\mathbb{R}) \cong C(bN) \cong C^*(N)$, where $C(b\mathbb{R})$ indicates the C*-algebra of the continuous functions on the Bohr compactification $b\mathbb{R}$ of $\mathbb{R}$, i.e. the algebra of the almost periodic functions on $\mathbb{R}$, i.e. the group C*-algebra of the compact group $b\mathbb{R}$;
(iv) $C^*(L, \sigma_L) = C^*(N) \rtimes_{(\iota, y)} U(C)$, i.e. $C^*(L, \sigma_L)$ is the twisted crossed product of $C^*(N)$ by the discrete additive group $C$, with 2-cocycle $(\iota, y)$, defined from the symplectic form $\sigma_L$;
(v) every non-zero vector in $H_L$ is cyclic for the von Neumann subalgebra $\mathcal{C}$; $\mathcal{N}$ is a spectrally multiplicity-free von Neumann subalgebra of $\mathcal{L}$. In particular, $\mathcal{C}$ and $\mathcal{N}$ are maximal Abelian in $B(H_L)$;
(vi) $\mathcal{L} := \pi_L(C^*(L, \sigma_L))''$ is a type II$_1$ factor.

Proof. (i)–(iii) are standard results, see, for example, [50] and also [27]. For (iv), see Proposition 2.4 above. (v) Considering the total set of vectors $|c\rangle$, $c \in C$, and the action of $\mathcal{C}$ on them as read off from Eq. (2.35), cyclicity for $\mathcal{C}$ is trivial, and the maximality is a classical result. For the subalgebra $\mathcal{N}$, consider for every $c \in C$ the spectral measures $\mu_c$ defined on $\widehat{C^*(N)} \cong bN$, the Gelfand spectrum of $C^*(N)$, and the Baire set $X_c = bN \setminus c$. These data satisfy the hypothesis of [12, Corollary 5.2] that characterizes $\pi_L$ as a spectrally multiplicity-free representation of $C^*(L, \sigma_L)$, a condition stronger than $\mathcal{N}$ being maximal Abelian in $B(H_L)$. (vi) The von Neumann algebra $\mathcal{L}$ is equal to the represented discrete crossed product algebra $\mathcal{N} \rtimes_\alpha \Gamma$, where $\Gamma := \pi_L(U(C))$ denotes the discrete group $U(C)$ represented on the Hilbert space $H_L$. The automorphic action $\alpha$ defined by $\alpha_c := \mathrm{ad}\, W(c)$, $c \in C$, on the maximal Abelian *-algebra $\mathcal{N}$ is given by the 2-cocycle $(\iota, y)$, i.e. through the symplectic form $\sigma_L$. It is easy to see that $\alpha$ acts ergodically on $\mathcal{N}$, so that $\mathcal{L}$ is a factor. Moreover, the state $\omega_L$ is a tracial state on $\mathcal{L}$, so that the algebra is of finite type, and it is of type II$_1$, not being finite dimensional.

Remark 2.11. (1) The elements $\pi(W((c, 0)))$ for $c \in C$ may be considered as operators carrying the charge $c$. The elements $\pi(W((0, n)))$ for $n \in N$ act as different phases on the distinct charge subspaces, in the charge decomposition of $H_L$ given by Eq. (2.38).
(2) The physical idea is to make use of a symplectic space $L = C \oplus N$ with $C \cong \mathbb{R}_d^n \cong N$ for some natural $n$, such that $C$ represents the additive group of the charges and its associated dual group $\widehat{C} \cong b\mathbb{R} \cong bN$ the internal (gauge) symmetry group of the model. The non-regular representation of $W_L$ with $C$ the non-regular subspace of $L$, obtained as the $n$th tensor product of the representation $\pi_L$ above, may be used to manage the twisted crossed product of Weyl algebras in a defining state satisfying the requirements of Propositions 2.6 and 2.7.
(3) From a mathematical point of view, a similar construction may also be given for the Weyl algebras $W(L, \sigma_L)$ where now $L = C \oplus N$, for $C$ an Abelian discrete group and $N$ the locally compact group such that its Bohr compactification $bN$ is equal to $\widehat{C}$, the dual group of $C$, i.e. the space of all arbitrary characters of $C$, see [38, 50, 25, 13].
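Proposition 2.10(ii) and the non-regularity in the $C$ direction can be illustrated numerically. The sketch below is ours and only indicative: vectors are dictionaries `{x: amplitude}` for $\sum_x a_x |x\rangle$, `act_N` and `act_C` encode the actions $\pi_L(W_L((0,n)))|x\rangle = e^{inx}|x\rangle$ and $\pi_L(W_L((c,0)))|x\rangle = |x + c\rangle$ that follow from Eq. (2.35), and all function names are hypothetical.

```python
import cmath


def act_N(n, vec):
    """pi_L(W_L((0,n))): multiplies each |x> by exp(i n x)."""
    return {x: cmath.exp(1j * n * x) * a for x, a in vec.items()}


def act_C(c, vec):
    """pi_L(W_L((c,0))): shifts each |x> to |x + c>."""
    return {x + c: a for x, a in vec.items()}


def norm_sq_diff(u, v):
    """Squared norm of u - v for finitely supported vectors."""
    keys = set(u) | set(v)
    return sum(abs(u.get(x, 0) - v.get(x, 0)) ** 2 for x in keys)


def phi_N_quotient(x, n):
    """Coefficient of |x> in -i n^{-1} (pi_L(W(0,n)) - I)|x>."""
    return -1j * (cmath.exp(1j * n * x) - 1) / n
```

For small `n` the quotient `phi_N_quotient(x, n)` approaches `x`, in agreement with $\Phi_N |c, 0\rangle = c\,|c, 0\rangle$, whereas `norm_sq_diff(act_C(c, v), v)` stays equal to 2 for a basis vector `v` and any nonzero `c`, so no difference quotient in the $C$ direction can converge.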
3. The Streater and Wilde Model

We begin in this section to analyze the scalar massless free Bosonic quantum field on the 1+1-dimensional Minkowski spacetime, along the lines of the twisted crossed products described in Sec. 2. General references are [51, 29] and a recent more physical introduction is [15]. With respect to the original formulation on chiral left/right light lines of the Streater and Wilde paper, we prefer a time zero formulation, i.e. giving Cauchy data on the time zero real line, because the classification of the superselection sectors, e.g. their different (solitonic) origin, is clearer in this approach, as will be shown in Sec. 4.3.

In contrast to the same model in higher dimension, it is well known that in the 1+1-dimensional case infrared infinities occur, so that the observable algebra may be defined in a physical Fock space representation only after furnishing some supplementary constraint conditions reducing the algebra of non-observable fields. If we write $S := S_{\mathbb{R}}(\mathbb{R})$ for the Schwartz space of real valued rapidly decreasing functions on the real line, the cited constraints are realized in the Weyl formulation as a restriction of the test function space $S$ to a subspace of functions with vanishing Fourier transform at zero momentum, i.e. $\tilde{f}(0) := \int f(x)\,dx = 0$, for $f \in S$.$^{f}$ This condition is also written for short as $f = \partial g$, for some $g \in S$, and is required for smearing the functions of the observable fields at time zero, not for their Lagrangian associated time zero momentum. For this reason, the observable theory is also called in the literature the theory of the potential of the field.

3.1. Weyl algebras for the Streater and Wilde model

The symplectic spaces of the model are introduced by the following restriction and extensions of the function space $S$:
\[ \begin{aligned} \partial S &:= \{ f \in S : f = \partial g,\ g \in S \}, \\ \partial^{-1} S &:= \{ f \in C^\infty(\mathbb{R}) : \partial f \in S \}, \\ \partial_0^{-1} S &:= \Big\{ f \in \partial^{-1} S : \lim_{x \to -\infty} f(x) = \lim_{x \to +\infty} f(x) \Big\}, \\ \partial_q^{-1} S &:= \Big\{ f \in \partial^{-1} S : \lim_{x \to -\infty} f(x) = -\lim_{x \to +\infty} f(x) \Big\}. \end{aligned} \tag{3.1} \]
Denoting by $\mathbb{R}_d$ the constant functions on the time zero line, we have the following additive group quotients
\[ S = \partial_0^{-1} S / \mathbb{R}_d \quad \text{and} \quad \partial_q^{-1} S = \partial^{-1} S / \mathbb{R}_d, \tag{3.2} \]
and the inclusions
\[ \partial S \subset S, \qquad S \subset \partial_0^{-1} S \subset \partial^{-1} S \quad \text{and} \quad S \subset \partial_q^{-1} S \subset \partial^{-1} S. \tag{3.3} \]
f Considering localized subspaces of the test function space as defining local algebras, we may also replace $S$ by the space of smooth functions with compact support.
From the linear spaces in Eq. (3.1) and because of the inclusions in (3.3) we define properly included (symplectic) spaces as follows:
\[ \begin{array}{ccccc} V_a := \partial S \oplus S & \subset & V_b := \partial S \oplus \partial_0^{-1} S & \subset & V_c := S \oplus \partial_0^{-1} S \\ \cap & & \cap & & \cap \\ V_q := \partial S \oplus \partial_q^{-1} S & \subset & V_e := \partial S \oplus \partial^{-1} S & \subset & V_f := S \oplus \partial^{-1} S. \end{array} \tag{3.4} \]
Moreover, because of the quotient maps in Eq. (3.2) we have, for $N \cong \mathbb{R}_d$,
\[ V_b = V_a \oplus N \quad \text{and} \quad V_e = V_q \oplus N. \tag{3.5} \]
The symplectic spaces associated to the above support spaces are defined as follows: for $F = f_0 \oplus f_1$ and $G = g_0 \oplus g_1$ elements in $V_a$, a (non-degenerate) symplectic form $\sigma_a$ is defined by
\[ \sigma_a(F, G) = \int_{\mathbb{R}} (f_0 g_1 - f_1 g_0)\,dx. \tag{3.6} \]
The following standard procedure gives a pre-Hilbert space from the symplectic space $V_a$, defining it on the mass shell (i.e. the positive light cone in momentum space), and leads to a Fock space representation of the associated observable algebras. Let $T_a$ be the map defined by
\[ T_a : V_a \to \mathcal{H}_a := L^2(\mathbb{R}, dp), \qquad F = f_0 \oplus f_1 \mapsto \omega^{-1/2} \tilde{f}_0 + i\,\omega^{1/2} \tilde{f}_1, \qquad \text{for } \omega := |p|, \tag{3.7} \]
from the real space $V_a$ to the real subspace $T_a(V_a)$ of $\mathcal{H}_a$, whose closure in $\mathcal{H}_a$ constitutes the real subspace $H_a := \overline{T_a(V_a)}$ of $\mathcal{H}_a$, i.e. the one particle Hilbert space of the observables of the model, see [51, 29] for details.$^{g}$ The scalar product on $H_a$ is explicitly given, for two elements obtained from $F, G \in V_a$, by
\[ (T_a F, T_a G)_{H_a} := \int \big( \omega^{-1}\, \overline{\tilde{f}_0}\, \tilde{g}_0 + \omega\, \overline{\tilde{f}_1}\, \tilde{g}_1 \big)\,dp + i \int (f_0 g_1 - f_1 g_0)\,dx. \tag{3.8} \]
For $F = f_0 \oplus f_1 \in V_a$, a quasi-free state on the algebra $W(V_a, \sigma_a)$ is given by
\[ \omega_a(W_a(F)) := \exp\Big( -\frac{1}{4} \|T_a(F)\|^2_{H_a} \Big) = \exp\Big( -\frac{1}{4} \int \big( \omega^{-1} |\tilde{f}_0|^2 + \omega |\tilde{f}_1|^2 \big)\,dp \Big). \tag{3.9} \]
The GNS representation of $\omega_a$, whose triple we denote by $(\pi_a, H_a, \Omega_a)$, is the Fock (vacuum) representation of the algebra $W(V_a, \sigma_a)$. This Fock space construction is possible only for the symplectic space $(V_a, \sigma_a)$, see, e.g., [1], but a construction with a canonical non-regular representation exists instead for all the Weyl algebras associated to the symplectic spaces in diagram (3.4), as illustrated below.
g Because the map $T_a$ is an isomorphism of symplectic spaces, the pair $(T_a(V_a), \sigma_a)$ itself is often regarded as the symplectic space of the observables of the model.
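The role of the constraint $\tilde{f}_0(0) = 0$ in Eq. (3.9) can be seen in a small numerical experiment. The sketch below rests on assumptions of ours: we use the unitary Fourier convention $\tilde{f}(p) = (2\pi)^{-1/2} \int f(x) e^{-ipx}\,dx$, under which $g(x) = e^{-x^2/2}$ has $|\tilde{g}(p)|^2 = e^{-p^2}$ and $f_0 = \partial g$ has $|\tilde{f}_0(p)|^2 = p^2 e^{-p^2}$; the infrared cutoff `eps` and all function names are hypothetical. The code evaluates $\int_{|p| > \varepsilon} \omega^{-1} |\tilde{f}_0|^2\,dp$ with the substitution $p = e^u$.

```python
import math


def infrared_integral(ftilde_sq, eps, p_max=12.0, steps=20000):
    """Approximate 2 * int_eps^p_max |ftilde(p)|^2 / p dp.

    Assumes |ftilde|^2 is even in p (real test function).  With p = exp(u)
    one has dp / p = du, and the midpoint rule is applied in u.
    """
    a, b = math.log(eps), math.log(p_max)
    h = (b - a) / steps
    total = sum(ftilde_sq(math.exp(a + (k + 0.5) * h)) for k in range(steps))
    return 2.0 * h * total


def gauss_sq(p):
    """|g~(p)|^2 for g(x) = exp(-x^2/2): nonzero at p = 0."""
    return math.exp(-p * p)


def dgauss_sq(p):
    """|f0~(p)|^2 for f0 = dg/dx: vanishes at p = 0."""
    return p * p * math.exp(-p * p)


def fock_value(norm_sq):
    """exp(-norm_sq / 4), the form of the state in Eq. (3.9)."""
    return math.exp(-0.25 * norm_sq)
```

For `dgauss_sq` the integral stays close to 1 as `eps` decreases, so with $f_1 = 0$ the state value is about `fock_value(1.0)`. For `gauss_sq` the integral grows like $2\ln(1/\varepsilon)$, which is the infrared divergence that excludes test functions with $\tilde{f}_0(0) \neq 0$ from $V_a$.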
Denoting by $\sigma_k$ the extension of $\sigma_a$ to the space $V_k$, for $V_k$ each of the real spaces in (3.4), the pairs $(V_k, \sigma_k)$ are real symplectic spaces, properly included as in the same diagram. Only the symplectic spaces $(V_b, \sigma_b)$ and $(V_e, \sigma_e)$ are degenerate, with the same degenerate subspace
\[ N := \{ F = f_0 \oplus f_1 \in V_b : f_0 \oplus f_1 = 0 \oplus n,\ n \in \mathbb{R} \} \cong \mathbb{R}_d, \tag{3.10} \]
and because of the results on degenerate symplectic forms as in Eq. (2.4), there exists a common non-trivial center of the C*-algebras $C^*(V_b, \sigma_b)$ and $C^*(V_e, \sigma_e)$ given by
\[ C^*(N) = Z_b := C^*(V_b, \sigma_b) \cap C^*(V_b, \sigma_b)' = C^*(V_e, \sigma_e) \cap C^*(V_e, \sigma_e)' =: Z_e. \tag{3.11} \]
We now introduce some symplectic isomorphisms to discuss the model in terms of twisted crossed products of Weyl algebras as in Sec. 2.1. To any function $f_1 \in \partial^{-1} S$, we associate three real values
\[ F_+ := \lim_{x \to +\infty} f_1(x), \qquad F_- := \lim_{x \to -\infty} f_1(x), \qquad F_\infty := \frac{1}{2}(F_+ + F_-). \tag{3.12} \]
Observe that subtracting the real value $F_\infty$ from the second coordinate function $f_1$ of an element $F = f_0 \oplus f_1 \in V_b, V_e$ realizes the quotient maps in Eq. (3.2). Hence we state the following:

Lemma 3.1. Given an element $F = f_0 \oplus f_1$ in $V_b$ (or $V_e$), we denote by $[f_1] := f_1 - F_\infty$ the element in the quotient group $S = \partial_0^{-1} S / \mathbb{R}_d$ (or $\partial_q^{-1} S = \partial^{-1} S / \mathbb{R}_d$) as in the quotient maps of Eq. (3.2). Then
(i) there exists a splitting isomorphism of degenerate symplectic spaces $\psi_\infty$ from $(V_b, \sigma_b)$ to $(V_a, \sigma_a) \oplus (N, \sigma_N)$, with $\sigma_N \equiv 0$, defined as
\[ \psi_\infty : V_b \to V_a \oplus N = (\partial S \oplus S) \oplus N, \qquad F \mapsto (f_0 \oplus [f_1]) \oplus F_\infty. \tag{3.13} \]
The Weyl functor gives an isomorphism of C*-algebras
\[ \Psi_b : C^*(V_b, \sigma_b) \to C^*(V_a, \sigma_a) \otimes_{\max} C^*(N) = C^*(V_a, \sigma_a) \otimes_{\min} C^*(N); \]
(ii) there exists a splitting isomorphism of degenerate symplectic spaces from $(V_e, \sigma_e)$ to $(V_q, \sigma_q) \oplus (N, \sigma_N)$ with $\sigma_N \equiv 0$, also indicated by $\psi_\infty$ and defined by
\[ \psi_\infty : V_e \to V_q \oplus N = (\partial S \oplus \partial_q^{-1} S) \oplus N, \qquad F \mapsto (f_0 \oplus [f_1]) \oplus F_\infty. \tag{3.14} \]
The corresponding isomorphism of C*-algebras given by the Weyl functor is
\[ \Psi_e : C^*(V_e, \sigma_e) \to C^*(V_q, \sigma_q) \otimes_{\max} C^*(N) = C^*(V_q, \sigma_q) \otimes_{\min} C^*(N). \]

Proof. $\psi_\infty$ is an isomorphism between the real discrete vector spaces $V_b$ and $V_a \oplus N$, so it trivially preserves the symplectic form and the image of the degeneracy subspace $N$ of $V_b$. The last equality about tensor products holds because $C^*(N)$ is Abelian. The same holds for the $V_e$ case.
A more general symplectic space isomorphism extending $\psi_\infty$ is also possible for the larger symplectic space $(V_f, \sigma_f)$ itself, once we have picked an element in it. This isomorphism turns out to be essentially unique. We begin this construction by defining the charges of a generic element $F = f_0 \oplus f_1 \in V_f$. These are the two real quantities canonically associated to $F$ by
\[ F_c := \int_{\mathbb{R}} f_0\,dx, \qquad F_q := \int_{\mathbb{R}} \partial f_1\,dx = F_+ - F_-. \tag{3.15} \]
A complete and useful re-parametrization of the symplectic spaces in diagram (3.4) is obtained by the choice of a regularizing element $T \in V_f$ with non-vanishing charges, i.e. such that $T_c \neq 0 \neq T_q$. A useful choice for such an element of $V_f$ is the following: given a function $t \in \partial^{-1} S$ such that
\[ \lim_{x \to +\infty} t(x) = -\lim_{x \to -\infty} t(x) = \frac{1}{2} \tag{3.16} \]
and denoting by $\partial t \in S$ its derivative, we choose $T = \partial t \oplus t \in S \oplus \partial^{-1} S$. For example, we may take $t(x) = \frac{1}{\pi} \arctan x$, or a half-line or bounded smooth localization obtained from a composition of it with a $C^\infty$ function of compact support. Note that $\int_{\mathbb{R}} t(x)\,\partial t(x)\,dx = 0$ and the charges associated to $T$ both equal 1, in fact
\[ T_c = \int_{\mathbb{R}} \partial t(x)\,dx = \lim_{x \to +\infty} t(x) - \lim_{x \to -\infty} t(x) = T_q = 1. \]
Now, because of the fixing of the regularizing element, we may associate to every element $F \in V_f$ two more real numbers
\[ F_n := \int_{\mathbb{R}} f_1\,\partial t\,dx, \qquad F_r := \int_{\mathbb{R}} f_0\,t\,dx, \tag{3.17} \]
and functions
\[ f_0^{\partial t} := f_0 - F_c\,\partial t \in \partial S, \qquad f_1^{t} := f_1 - F_q\,t - F_\infty \in S. \tag{3.18} \]
In this way, denoting by $(L = N \oplus C, \sigma_L)$ and $(M = R \oplus Q, \sigma_M)$ two copies of the elementary symplectic space as in Sec. 2.3, i.e. $N \cong C \cong R \cong Q \cong \mathbb{R}_d$, and setting
\[ F_t = f_0^{\partial t} \oplus f_1^{t} \in V_a, \qquad F_l = F_c \oplus F_n \in L, \qquad F_m = F_r \oplus F_q \in M, \tag{3.19} \]
it is possible to state the following:

Proposition 3.2. For every choice of an element $T \in V_f$ as above, for example as in (3.16):
(i) there exists a non-splitting isomorphism of symplectic spaces between $(V_f, \sigma_f)$ and $(\psi_T(V_f), \sigma_f) \subsetneq (V_a \oplus L \oplus M, \sigma_a \oplus \sigma_L \oplus \sigma_M)$, defined by
\[ \psi_T : V_f \to \psi_T(V_f) \subsetneq V_a \oplus L \oplus M, \qquad F \mapsto F_t \oplus F_l \oplus F_m. \tag{3.20} \]
$\psi_T$ extends the splitting isomorphisms $\psi_\infty$ of Lemma 3.1 for $V_b$ and $V_e$ and reduces to the isomorphisms $(V_k, \sigma_k) \to (\psi_T(V_k), \sigma_k)$ of symplectic spaces when
applied to $V_k$, $k = c, q$ respectively, i.e. such that:
• $\psi_T$ is an $\mathbb{R}$-linear continuous bijection onto its image, preserving symplectic forms and degeneracy subspaces;
• for elements $F = f_0 \oplus f_1$ and $G = g_0 \oplus g_1$ in $V_f$ we have
\[ \sigma_f(F, G) = \int (f_0 g_1 - f_1 g_0)\,dx = \sigma_a(F_t, G_t) + \sigma_L(F_l, G_l) + \sigma_M(F_m, G_m); \]
• $\psi_T$ maps $V_c$ into elements with $F_q = 0$, $V_q$ into elements with $F_c = F_n = 0$ and $V_e$ into elements with $F_c = 0$.
(ii) the subspace $N \subset V_b$ is invariant under the action of $\psi_T$, i.e. for any $F = (0, F_\infty) \in N$ it holds $F \mapsto (0, 0) \oplus (0, F_\infty) \oplus (0, 0)$, and the non-trivial center of $C^*(V_b, \sigma_b)$, denoted $Z_b$, equals its relative commutant in $C^*(V_f, \sigma_f)$ and $C^*(V_c, \sigma_c)$, i.e. for $k = c, f$ we have
\[ Z_b = C^*(V_b, \sigma_b)^c := C^*(V_b, \sigma_b)' \cap C^*(V_k, \sigma_k). \]
Similarly for $C^*(V_e, \sigma_e)$ instead of $C^*(V_b, \sigma_b)$;
(iii) for a different choice of an element $T' \in V_f$ that defines the symplectic isomorphism $\psi_{T'}$, there exists a symplectic isomorphism $\varphi_{T'T}$ between $\psi_T(V_f)$ and $\psi_{T'}(V_f)$ such that $\psi_{T'} = \varphi_{T'T} \circ \psi_T$. The isomorphism $\varphi_{T'T}$ reduces to the identity on $C \oplus Q$ iff $T' - T \in V_b$, i.e. iff $T$ and $T'$ have the same charge content. Moreover, it exponentiates to a Weyl algebra and C*-algebra isomorphism.

Proof. The existence of the isomorphisms with the properties described in (i) and (ii) follows from easy calculations on the above definitions. The essential uniqueness up to the symplectic isomorphism in (iii) follows because $\varphi_{T'T}$ is defined from the function $T' - T \in V_f$, so that the property of invariance of the charges holds iff $T' - T$ has zero charge content.

We will refer to the additive group $C \oplus Q$ as the charge group of the theory.

Remark 3.3. (1) $\psi_T$ being non-splitting, its exponentiation $\Psi_T$ gives a Weyl algebra $W(\psi_T(V_f)) = \Psi_T(W_f) \cong W_f$ that is only properly contained in the algebraic tensor product $W_a \otimes W_L \otimes W_M$. The associated C*-algebra is isomorphic to $C^*(V_f, \sigma_f)$ and only properly contained in the respective (maximal) tensor product as well. Hence, because of the results in Sec. 2, we have
\[ C^*(V_f, \sigma_f) \cong \big( (C^*(V_a, \sigma_a) \otimes Z_b) \rtimes_{\sigma_L} U(C) \big) \rtimes_{\sigma_M} U(Q). \tag{3.21} \]
(2) Observe that the isomorphism $\psi_\infty$ is introduced to quotient out from an element in $V_b$ or $V_e$ its degenerate part in $N$. Instead, the isomorphism $\psi_T$ is tailored to extract from an element in $V_f$ its charge content, i.e. to fix its equivalence class in the additive group $C \oplus Q$ of the charges, dividing out the remaining components in $V_a$, $N$ and $R$, according to the choice of $T \in V_f$.
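The "easy calculations" behind Proposition 3.2(i) can be cross-checked numerically. The following sketch is not the construction of the paper but a test of the identity $\sigma_f(F, G) = \sigma_a(F_t, G_t) + \sigma_L(F_l, G_l) + \sigma_M(F_m, G_m)$ under assumptions of ours: the regularizing function is taken to be $t(x) = \frac{1}{2}\tanh x$ (chosen because $\partial t$ decays exponentially, so a finite grid suffices), integrals are trapezoidal sums on $[-30, 30]$, and both $\sigma_L$ and $\sigma_M$ are taken as $((u, v), (u', v')) \mapsto uv' - u'v$ on the pairs in Eq. (3.19), which is the sign convention under which the identity closes with (3.17)–(3.19) as printed. All names are hypothetical.

```python
import math

# Uniform grid on [-X, X]; every integrand used below decays fast enough.
X, M = 30.0, 6000
DX = 2 * X / M
GRID = [-X + k * DX for k in range(M + 1)]


def integrate(fn):
    """Trapezoidal rule on GRID."""
    vals = [fn(x) for x in GRID]
    return DX * (sum(vals) - 0.5 * (vals[0] + vals[-1]))


def t(x):
    """Assumed regularizing function: t(+inf) = -t(-inf) = 1/2."""
    return 0.5 * math.tanh(x)


def dt(x):
    """Derivative of t; integrates to 1."""
    return 0.5 / math.cosh(x) ** 2


def make_element(f0, f1, f_plus, f_minus):
    """F = f0 (+) f1 together with the limits of f1 at +inf and -inf."""
    return {"f0": f0, "f1": f1, "plus": f_plus, "minus": f_minus}


def decompose(F):
    """Return (f0_dt, f1_t, F_l, F_m) following Eqs. (3.15)-(3.19)."""
    Fc = integrate(F["f0"])
    Fq = F["plus"] - F["minus"]
    Finf = 0.5 * (F["plus"] + F["minus"])
    Fn = integrate(lambda x: F["f1"](x) * dt(x))
    Fr = integrate(lambda x: F["f0"](x) * t(x))
    f0_dt = lambda x: F["f0"](x) - Fc * dt(x)
    f1_t = lambda x: F["f1"](x) - Fq * t(x) - Finf
    return f0_dt, f1_t, (Fc, Fn), (Fr, Fq)


def sigma_fn(f0, f1, g0, g1):
    """Integral of f0*g1 - f1*g0, the form of Eq. (3.6)."""
    return integrate(lambda x: f0(x) * g1(x) - f1(x) * g0(x))


def sigma_pair(a, b):
    """Elementary form (u,v),(u',v') -> u*v' - u'*v (assumed for L and M)."""
    return a[0] * b[1] - b[0] * a[1]


def check_decomposition(F, G):
    """Return (sigma_f(F,G), sigma_a + sigma_L + sigma_M) for comparison."""
    Ft0, Ft1, Fl, Fm = decompose(F)
    Gt0, Gt1, Gl, Gm = decompose(G)
    lhs = sigma_fn(F["f0"], F["f1"], G["f0"], G["f1"])
    rhs = sigma_fn(Ft0, Ft1, Gt0, Gt1) + sigma_pair(Fl, Gl) + sigma_pair(Fm, Gm)
    return lhs, rhs
```

The check relies only on $\int \partial t\,dx = 1$ and $\int t\,\partial t\,dx = 0$, the two properties of $T$ noted after Eq. (3.16), so any other admissible choice of $t$ could be substituted.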
3.2. Defining representations for the Streater and Wilde model

We define a reference representation for each Weyl and C*-algebra that functorially derives from the six term diagram (3.4). These extend the Fock representation of the observable algebra and are non-regular representations that satisfy the properties of factorization indicated in Proposition 2.6. The twisted crossed product in (3.21) explains the observable and charge content of each of the algebras associated to the symplectic spaces in diagram (3.4). We begin by noticing that Eq. (3.21) gives an interpretation of the non-regular representations introduced for different models of this kind in [1, 2]. In these papers, the representation of $W(V_f, \sigma_f)$ is obtained as the GNS representation associated to the non-regular state
\[ \omega(W_f(F)) = \exp\Big( -\frac{1}{4}\, q(F) \Big), \qquad F \in V_f, \tag{3.22} \]
where $q(F)$ is a so-called generalized quadratic form, defined by
\[ q(F) = \begin{cases} \displaystyle\int \big( \omega^{-1} |\tilde{f}_0|^2 + \omega |\tilde{f}_1|^2 \big)\,dp & \text{if } F = f_0 \oplus f_1 \in V_a, \\ +\infty & \text{if } F \notin V_a. \end{cases} \tag{3.23} \]
If $\omega_a$ is the Fock state on $W_a$ and for two symplectic spaces with supports $L \cong M \cong \mathbb{R}_d^2$ we take two copies $\omega_L$ and $\omega_M$ of the non-regular elementary state introduced in Sec. 2.3, respectively, on the algebras $W_L$ and $W_M$, similarly to Eq. (3.22) we give the following:

Definition 3.4. Given a charge regularizing element $T \in V_f$ as in Sec. 3.1, with notation as in Proposition 3.2 and $\delta$ being the Kronecker delta, we define a state on the Weyl algebra $W_f$ by
\[ \begin{aligned} \omega_f(W_f(F)) &:= \exp\Big( -\frac{1}{4} \|T_a(F_t)\|^2_{H_a} \Big)\, \delta_{F_c, 0}\, \delta_{F_q, 0} \\ &= \omega_a(W_a(F_t))\, \delta_{F_c, 0}\, \delta_{F_q, 0} \\ &= \omega_a(W_a(F_t))\, \omega_L(W_L(F_l))\, \omega_M(W_M(F_m)). \end{aligned} \]
Observe that the state in Definition 3.4 differs from the one in Eq. (3.22) only in restriction to elements $W_b(F) \in W_b$, with $F \in V_b$ and $F_\infty \neq 0$: the GNS representation of the state $\omega_f$, unlike that of $\omega$, maps them into elements different from the identity, as seen in Sec. 2.3 for the representation $\pi_L$. This means that $\omega_f$, unlike $\omega$, turns out to be faithful on the Weyl elements obtained from the subspace $N$, hence it is preferred in the sequel for obtaining the physical content also for this subspace. For different regularizing elements, the above definition gives different states, whose GNS representations are actually related by the following
Proposition 3.5. (i) The GNS representations πf , associated to the states ωf for different choices T1 , T2 ∈ Vf of the non vanishing charge regularizing element in Definition 3.4, are unitary equivalent. The unitary operator on Hf giving this equivalence by adjoint action is πf (W (F )) for F = T1 − T2 ; moreover, such a F ∈ Vb if the charges of T1 and T2 are the same; (ii) the representation πf is unitary equivalent to the representations πa ⊗ πL ⊗ πM Wf and, denoting by H(c,q) ∼ = Ha the Hilbert space at fixed charges (c, q) ∈ C ⊕ Q and π(c,q) ∼ = πa the relative representation, also to the representation ⊕(c,q)∈C⊕Q π(c,q) . πf acts on the Hilbert space Hf ∼ H(c,q) . (3.24) = Ha ⊗ HL ⊗ HM ∼ = Ha ⊗ 2 (C) ⊗ 2 (Q) ∼ = (c,q)∈C⊕Q
Equivalently, we have πb = πc ≅ πa ⊗ πL, πq ≅ πa ⊗ πM and πe ≅ πf, where it is meant πb ≅ (πa ⊗ πL)↾Wb, and similarly for the others. The representations πa and πb are the only regular ones; (iii) the GNS vector Ωf ∈ Hf of the state ωf is the vector |0, 0⟩ ∈ H(0,0) ⊂ Hf. It is cyclic for πf(Wf) and is identified by the isomorphisms in Eq. (3.24) with the vectors Ωa ⊗ ΩL ⊗ ΩM and Ωa ⊗ δ0 ⊗ δ0, respectively. Similarly in the other cases, with Ωb and Ωe not being cyclic for Wb and We, respectively.

Proof. (i) The equivalence for different T is a consequence of Proposition 3.2, because of the isomorphism of the Weyl and C*-algebras and because the states ωf satisfy the factorization condition (2.28) in Proposition 2.6. (ii) It suffices to observe that the state ωf reduces to the state ωa on Wa and that if F ∈ Vf has at least one non vanishing charge, then ωf(W(F)) = 0 because of the Kronecker deltas. (iii) is trivial.

4. Local Theory, DHR Sectors and Gauge Group

In this section we treat the net approach of AQFT to the Streater and Wilde model. As general references we use [26, 48], and for the low dimensional theories also [34]. Let M be the 1+1-dimensional Minkowski spacetime and consider R as its time zero axis. We recall some notation and fix some new notation concerning Open(M) and Open(R), the sets of open non-void subsets of M and R, respectively, partially ordered by inclusion. A causal disjointness relation ⊥ is induced on Open(M) from the Minkowski metric. A triple consisting of a subset P ⊂ Open(R), the inclusion partial order ⊂ and the disjointness relation ⊥ will be called an index set and shortly indicated as (P, ⊂, ⊥). If P ∈ P, we denote by P⊥ the causally disjoint set of P, i.e. the set of the elements of P disjoint from P. On the Minkowski spacetime the preferred index set is the set of causally complete bounded connected regions, indicated by K and called the set of open double
F. Ciolli
cones on M, defined as follows: for given x, y ∈ M with y ∈ Vx+ ⊂ M, the open future cone of x, and for Vy− ⊂ M the open past cone of y, a double cone O ∈ K is defined as the intersection Vx+ ∩ Vy−. The relation ⊥ on M reduces to the set theoretic disjointness relation on the time zero line R, i.e. for A, B ∈ Open(R) we have A ⊥ B ⇔ A ∩ B = ∅. The only index set we work with in this paper is the following: I := {non-empty, open, bounded intervals of R}.
(4.1)
A generic abstract net N on the index set P is defined by an inclusion preserving map N : P → N(P ),
P ∈ P.
(4.2)
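As a minimal illustration of the index set (4.1) and of the inclusion preserving property required in (4.2), the following Python sketch models bounded open intervals as pairs (a, b) and uses a toy net that assigns to each interval the set of integer points it contains. The names `is_subinterval`, `are_disjoint` and `toy_net`, and the choice of integer points as image objects, are illustrative assumptions and are not taken from the paper.

```python
def is_subinterval(i, j):
    """True if the open interval i = (a, b) is contained in j = (c, d)."""
    (a, b), (c, d) = i, j
    return c <= a and b <= d


def are_disjoint(i, j):
    """Disjointness on the time zero line: i and j have empty intersection."""
    (a, b), (c, d) = i, j
    return b <= c or d <= a


def toy_net(i):
    """Toy image object (an assumption): integers strictly inside the interval i."""
    a, b = i
    candidates = range(int(a) - 1, int(b) + 2)
    return frozenset(n for n in candidates if a < n < b)
```

In this toy model, inclusion of intervals implies inclusion of the assigned sets, and disjoint intervals are assigned disjoint sets, which mimics isotony and locality only at the level of sets.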
A net hence takes its image elements in the objects of a generic category, also furnished with the structures of partial order and disjointness. To distinguish nets defined on different index set and with the same image category, we use the notation
NP and define a maximal element of the net NP by NP(M) := ∨P∈P NP(P),
where the symbol ∨ has to be properly understood relative to the category of the image elements. A relevant property of a net is locality, i.e. N(P1) ⊥ N(P2) for P1, P2 ∈ P and P1 ⊥ P2; it distinguishes Bosonic (i.e. local) from non Bosonic nets. Given a symplectic space (V, σV) and an appropriate notion of localization of an element of V in an element of an index set P, a category of symplectic subspaces with order and disjointness relation (Sub(V(P)), ⊂, ⊥σV) is naturally defined. The partial order is inherited from the one of P, by the defined localization, and the disjointness of two symplectic subspaces V1 and V2 in this category means that V2 ∈ V1^{⊥σV}, i.e. σ(v1, v2) = 0 for any element v1 ∈ V1 and v2 ∈ V2. Hence (Sub(V(P)), ⊂, ⊥σV) is a first example of the image category of a net that, according to (4.2), we indicate by VP and call the net of the symplectic subspaces of the symplectic space (V, σV) on the index set (P, ⊥). Notice that the notion of localization of the elements of V on P also accounts for the locality of VP. The Weyl functor preserves the net structure on P, so that a net of Weyl algebras WP : P → W(V(P), σV) and a net of C*-algebras NP are also defined, by NP : P → N(P) := W((V(P), σ))−,
P ∈ P.
(4.3)
Here the completion is understood with respect to the minimal regular norm. Once a defining representation πn is fixed, a net NP of von Neumann algebras is canonically associated to a Weyl algebra net, by NP : P → N(P) := πn(W(V(P), σ))'',
P ∈ P.
(4.4)
If A ∈ Open(R) then NP(A) is the C*-algebra generated by the von Neumann algebras NP(P), P ⊂ A, P ∈ P, so that we may define N P(A) := NP(A)''.
Hence, NP(A) and N P(A) are the C* and von Neumann algebras generated by additivity on the index set P.h In particular, NI(R) is the C*-algebra generated by all the local von Neumann algebras NI(I), for I ∈ I. It is called the C*-algebra of quasi local elements of the net NI and is the C*-algebra referred to in studying the DHR superselection sectors of the net, with respect to the index set I. The net and this C*-algebra are usually both indicated by the same symbol N.

4.1. Nets for the Streater and Wilde model

To define useful nets for the study of the Streater and Wilde model, a proper definition of localization is now in order: for j = a, b, c, q, e, f we define the localization of a generic element F = f0 ⊕ f1 ∈ Vj as
loc F := supp f0 ∪ supp ∂f1, f0 ∈ S, f1 ∈ ∂⁻¹S. (4.5)
According to this definition, the various nets of symplectic subspaces Vj,P ⊆ Vf,P with range category (Sub(Vj (P)), ⊂, ⊥σj ) are defined: explicitly Vj,I is given by VjI : I ∈ I → VjI (I) := {F ∈ Vj : loc F ⊂ I},
I ∈ I.
Observe that the elements F = 0 ⊕ n ∈ Vf, with n ∈ N ≅ Rd, have vanishing localization according to the definition in Eq. (4.5), so that the corresponding Weyl unitaries may be thought of as localized in any interval of the time zero line. A possible first result for the model is a complete characterization, as simple current extensions, of the nets of von Neumann algebras derived from the nets of symplectic subspaces in the six term inclusion of symplectic spaces in diagram (3.4), in the representations defined in Sec. 3. To obtain such a result, we show the existence of local symplectic isomorphisms obtained from a natural local choice of the regularizing element T ∈ Vf. Such a characterization will be used in Sec. 4.4 to discuss the existence of the relative gauge symmetry groups. The following result collects the local properties of the symplectic isomorphism ψT, according to the relative position of loc T and of the localized algebras. Observe that for T = ∂t ⊕ t we have loc T = supp ∂t. We omit for brevity the reference to the symplectic form of the symplectic subspaces involved, and always suppose that the nets are defined on a fixed index set P.
h For example, the additivity property for the von Neumann algebras net NP means that N(A) = ∨P⊂A NP(P). For the general properties of abstract nets we refer to [48], and to [14] for further discussions.
Proposition 4.1. Let P ⊂ R be a generic non-void open subset of the time zero real line and ψT be a symplectic isomorphism defined as in Proposition 3.2 for T = ∂t ⊕ t ∈ Vf(loc T). Letting v = F ⊕ l ⊕ m = (f0 ⊕ f1) ⊕ (c ⊕ n) ⊕ (r ⊕ q) be the
generic element in Va ⊕ L ⊕ M, we have:
(i) if loc T ⊥ P then ψT(Va(P)) = Va(P) ⊕ 0 ⊕ 0; for generic P and loc T it holds
ψT(Vb(P)) = ψT(Va(P)) ⊕ N, ψT(Ve(P)) = ψT(Vq(P)) ⊕ N;
(ii) if loc T ⊂ P then
ψT(Vc(P)) = ψT(Vb(P)) ⊕ C = ψT(Va(P)) ⊕ (C ⊕ N),
ψT(Vq(P)) = ψT(Va(P)) ⊕ Q,
ψT(Ve(P)) = ψT(Vb(P)) ⊕ Q = ψT(Vq(P)) ⊕ N = ψT(Va(P)) ⊕ N ⊕ Q,
ψT(Vf(P)) = ψT(Vc(P)) ⊕ Q = ψT(Ve(P)) ⊕ C = ψT(Va(P)) ⊕ (C ⊕ N) ⊕ Q.

Proof. (i) The case loc T ⊥ P is trivial. If loc T and P are not disjoint, then
\[ \psi_T(V_a(P)) = \Big\{ (f_0, f_1) \oplus \Big(0, \int_{\mathbb{R}} f_1\,\partial t\,dx\Big) \oplus \Big(\int_{\mathbb{R}} f_0\,t\,dx,\ 0\Big) \in V_a \oplus L \oplus M \,:\, f_0 \in \partial S(P),\ f_1 \in S(P) \Big\} \]
is a proper subset of Va ⊕ N ⊕ R that does not split into a symplectic sum. Moreover
\[ \psi_T(V_b(P)) = \Big\{ (f_0, f_1) \oplus (0, n) \oplus \Big(\int_{\mathbb{R}} f_0\,t\,dx,\ 0\Big) \in V_a \oplus L \oplus M \,:\, f_0 \in \partial S(P),\ f_1 \in S(P) \Big\} = \psi_T(V_a(P)) \oplus N. \]
The decompositions for ψT(Vq(P)) and ψT(Ve(P)) are in the proof of (ii) below. (ii) The definition of ψT as in Proposition 3.2 for generic loc T and domain P gives
\[ \psi_T(V_f(P)) = \Big\{ (f_0, f_1) \oplus (c, n) \oplus \Big(\int_{\mathbb{R}} f_0\,t\,dx,\ q\Big) \in V_a \oplus L \oplus M \,:\, f_0 \in \partial S(P \cup \mathrm{loc}\,T),\ f_0 + c\,\partial t \in S(P),\ f_1 \in S(P \cup \mathrm{loc}\,T),\ f_1 + q\,t \in \partial^{-1} S(P) \Big\} \]
and similarly for the other local symplectic subspaces. Taking loc T ⊂ P, the results follow.
Observe that whatever loc T and P are, with loc T ⊥ P, the local algebra A(Va(P)) defined as in Eq. (4.4) satisfies
A(Va(P)) ≅ πa ⊗ πL ⊗ πM (W(ψT(Va(P)))) = A(Va(P)) ⊗ IL ⊗ IM. (4.6)
Similarly we define from the symplectic local subspaces of Vb, Vc, Vq, Ve and Vf the nets B, C, Q, E and F, respectively. Remembering that we also indicated by C and Q the additive groups of the charges, we denote without ambiguity by U(C) and U(Q) their unitary Weyl representations on HL and HM, respectively. Recalling that in Remark 3.3, for chosen T ∈ Vf, we denoted by ΨT the isomorphism at the Weyl algebra level obtained from the symplectic space isomorphism ψT by the Weyl functor, we use the same symbol ΨT for its extension to the von Neumann algebras, so that from the representation πf to the representation πa ⊗ πL ⊗ πM, it holds A(Va(P)) ⊗ IL ⊗ IM = ΨT(A(Va(P))), for every T ∈ Vf.i Hence, denoting the Abelian algebra of the symplectic subspace N in representation πb by
Zb := πb(Wb(N))'' ≅ πL(WL(N))'', (4.7)
the above proposition and the isomorphism ΨT give the following characterization of the nets involved, as simple current extensions from the net of observables:

Proposition 4.2. For given P ⊆ R a generic non-void subset of the line and for notations as above we have: (i) for generic P and loc T: B(Vb(P)) = A(Va(P)) ⊗ Zb, E(Ve(P)) = Q(Vq(P)) ⊗ Zb; (ii) if loc T ⊂ I ∈ I then
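\[
\begin{array}{ccccc}
A(I) & \subset & B(I) = A(I) \otimes Z_b & \subset & C(I) = (A(I) \otimes Z_b) \rtimes_{\sigma_L} U(C) \\
\cap & & \cap & & \cap \\
Q(I) = A(I) \rtimes_{\sigma_M} U(Q) & \subset & E(I) = (A(I) \otimes Z_b) \rtimes_{\sigma_M} U(Q) & \subset & F(I) = (A(I) \otimes Z_b) \rtimes_{\sigma_M} U(Q) \rtimes_{\sigma_L} U(C).
\end{array}
\tag{4.8}
\]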
Proof. From Proposition 4.1, by functoriality and the splitting of the representation.
i Such an isomorphism ΨT turns out to be weakly continuous only for the closed linear subspaces of the von Neumann algebras with fixed charge, see [14] for details.
4.2. Chiral versus time zero formulation

We introduce in the sequel the chiral formulation of the scalar massless free field nets: this allows one to relate the time zero approach to the original one of Streater and Wilde and to the conformal chiral one given by Buchholz, Mack and Todorov in [10]. To fix notation, let A = AK be the observable net of the Streater and Wilde model on the index set K of the double cones of the 1+1-dimensional Minkowski spacetime M. If I ∈ I is a time zero open interval and O ∈ K is the open double cone it generates by causal completion, i.e. the interior of the causal closure of the interval I, then the net AK has time zero generating net AI, i.e. it holds AK(O) = AI(I). A double cone, wherever based, is described as O = I+ × I−, for I± ∈ I±, the index sets of the open bounded intervals on the right/left light ray lines, see e.g. [44]. The original Streater and Wilde model in [51] is formulated in the left/right chiral fields formalism and, for every chiral component, it gives a family labeled by Rd of DHR automorphisms for the net AK. Disliking this, we preferred to write the symplectic spaces and nets in a time zero formulation and then, using the symplectic isomorphism ψT defined in Proposition 3.2, we extracted the charge content. To relate the two approaches, and to discuss the geometric covariance properties, we may use a second symplectic isomorphism constructed as in the sequel. In the chiral approach, the symplectic spaces are given by the smearing test functions of the left/right mover solutions θ± of the classical wave equation, i.e. (∂t ± ∂x)θ± = 0. To consider the charge carrying fields, we have to restrict to the functions θ±(x) ∈ C∞_R(R) with ∂θ± ∈ S, and a general solution of the wave equation can be written asj Θ(x, t) := θ−(x − t) + θ+(x + t).
To give a symplectic isomorphism between the two formulations, we start from a direct application of the d'Alembert formula: if F = f0 ⊕ f1 ∈ Vf, we have
\[ \Theta(x,t) = \frac{1}{2}\Big( f_1(x+t) + f_1(x-t) + \int_{x-t}^{x+t} f_0(y)\,dy \Big). \tag{4.9} \]
Denoting by V := {θ ∈ C∞_R(R) : ∂θ ∈ S} a chiral symplectic space support, and by
\[ \sigma_\pm(\theta,\varphi) = \pm \int_{\mathbb{R}} (\varphi\,\partial\theta - \theta\,\partial\varphi)\,dx \]
two symplectic forms on V, we can hence define the left/right chiral symplectic spaces of the fields by V+ := (V, σ+) and V− := (V, σ−).
j Actually, in the original formulation of Streater and Wilde the unnecessary requirement lim y→−∞ θ±(y) = 0 is made. We omit such a choice, which does not reveal the presence of Zb.
Consider now the isomorphism ψ of real linear spaces defined by
\[ \psi : V_f \to \psi(V_f) = V \oplus V, \qquad F = f_0 \oplus f_1 \mapsto \psi(F) := \Theta^F := \theta^F_+ \oplus \theta^F_-, \tag{4.10} \]
where the direct and inverse transformations are explicitly given by
\[ \theta^F_\pm(x) = \frac{1}{2}\Big( f_1(x) \pm \int_{-\infty}^{x} f_0(y)\,dy \Big), \qquad f_0(x) = \partial\big(\theta^F_+(x) - \theta^F_-(x)\big), \quad f_1(x) = \theta^F_+(x) + \theta^F_-(x). \tag{4.11} \]
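The relation between Eqs. (4.9) and (4.11) can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the initial data `f0` (a Gaussian) and `f1` (a smooth kink), and all function names, are hypothetical choices made here, picked so that f0 ∈ S and ∂f1 ∈ S and so that the integral of f0 has a closed form.

```python
import math


def f0(y):
    """Assumed momentum datum: a Gaussian, which is a Schwartz function."""
    return math.exp(-y * y)


def f1(x):
    """Assumed field datum: a kink with limits 0 and 1 at -inf and +inf."""
    return 0.5 * (1.0 + math.tanh(x))


def integral_f0(x):
    """Closed form of the integral of exp(-y^2) over (-inf, x]."""
    return 0.5 * math.sqrt(math.pi) * (1.0 + math.erf(x))


def theta_plus(x):
    """Chiral component theta_+ as in Eq. (4.11)."""
    return 0.5 * (f1(x) + integral_f0(x))


def theta_minus(x):
    """Chiral component theta_- as in Eq. (4.11)."""
    return 0.5 * (f1(x) - integral_f0(x))


def big_theta(x, t):
    """General solution theta_-(x - t) + theta_+(x + t)."""
    return theta_minus(x - t) + theta_plus(x + t)


def dalembert(x, t):
    """Right-hand side of the d'Alembert formula (4.9)."""
    return 0.5 * (f1(x + t) + f1(x - t)
                  + integral_f0(x + t) - integral_f0(x - t))


def derivative(g, x, h=1e-5):
    """Central finite difference, used only to test f0 = d(theta_+ - theta_-)/dx."""
    return (g(x + h) - g(x - h)) / (2.0 * h)
```

Evaluating `big_theta` and `dalembert` at the same point should give the same value up to rounding, and the inverse relations of Eq. (4.11) can be tested with `derivative`.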
Define moreover for every Θ = (θ+, θ−) ∈ V ⊕ V the pair of real constants
\[ \Theta(+\infty) := (\theta_+(+\infty), \theta_-(+\infty)) \in \mathbb{R}^2_d, \quad \text{where} \quad \theta_\pm(+\infty) := \lim_{x \to +\infty} \theta_\pm(x). \tag{4.12} \]
Observe that for F ∈ Vf and Θ = ψ(F), this pair of constants is given by
\[ \theta^F_\pm(+\infty) = \lim_{x \to +\infty} \frac{1}{2}\Big( f_1(x) \pm \int_{-\infty}^{x} f_0(y)\,dy \Big). \tag{4.13} \]
Introducing the (continuous) real valued R-bilinear form α : R2 → R such that α(a, b) = ab, we may denote by σ∞ the usual symplectic form on R2, defined by
\[ \sigma_\infty((a_1,b_1),(a_2,b_2)) = \alpha(a_1,b_2) - \alpha(b_1,a_2) = a_1 b_2 - b_1 a_2, \tag{4.14} \]
and finally state the following

Lemma 4.3. There exists a non-splitting symplectic isomorphism ψ∞, coinciding with the isomorphism ψ of Eq. (4.10) as a real linear space isomorphism, defined by
\[ \psi_\infty : (V_f, \sigma_f) \to \psi_\infty((V_f, \sigma_f)) = (V_+ \oplus V_-,\ \sigma_+ + \sigma_- + \sigma_\infty), \qquad F = f_0 \oplus f_1 \mapsto \Theta^F := (\theta^F_+, \theta^F_-), \tag{4.15} \]
such that for the symplectic forms it holds
\[ \sigma_f(F,G) = \sigma_+(\theta^F_+, \theta^G_+) + \sigma_-(\theta^F_-, \theta^G_-) + \sigma_\infty(\Theta^F(+\infty), \Theta^G(+\infty)), \qquad F, G \in V_f. \]
The corresponding Weyl algebras isomorphism is
\[ \Psi_\infty : W(V_f, \sigma_f) \to \Psi_\infty(W(V_f, \sigma_f)) = W(V_+, \sigma_+) \rtimes_{z_-} U(V_-), \tag{4.16} \]
where in the 2-cocycle z− = (β, y), the action β is defined by the α appearing in Eq. (4.14) and the T-valued function y is defined by the symplectic form σ−.

Proof. The proof is trivial: it is enough to refer to the general case treated in Sec. 2.1, in particular Eqs. (2.17) and (2.18) for the definition of z−. Observe that the part σ∞ in the symplectic form corresponds to the interacting part of σf (actually denoted by σV+,V− according to the notation of Sec. 2.1).
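The elementary properties of the form σ∞ of Eq. (4.14) used in Lemma 4.3 can be illustrated with a few lines of Python. This is a sketch only: the function name `sigma_inf` and the use of pairs of real numbers for the boundary values are assumptions made here for illustration.

```python
def sigma_inf(u, v):
    """Symplectic form of Eq. (4.14) on pairs u = (a1, b1), v = (a2, b2)."""
    (a1, b1), (a2, b2) = u, v
    return a1 * b2 - b1 * a2
```

The form is antisymmetric, and it vanishes as soon as one of its arguments is the pair (0, 0).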
Various observations are now in order:
• Picking a point at infinity ∞ ∈ R ∪ {±∞} in the limits of the integrals appearing in Eq. (4.11) is necessary in order to define ψ∞ and σ∞. The choice ∞ = −∞ gives θF−(−∞) = θF+(−∞) = ½ f1(−∞), so that the interacting part σ∞ of the symplectic form depends only on the limit value of the functions at +∞.
• The space ψ∞(Vf) is not a direct sum because ψ∞ is not splitting. As a consequence, the Weyl algebra Ψ∞(W(Vf, σf)) is not a tensor product but a twisted crossed product of Weyl algebras. Exchanging the roles of V+ and V− in Eq. (4.16) we also have W(Vf, σf) ≅ W(V−, σ−) ⋊z+ U(V+), where now the 2-cocycle z+ is given in terms of α and σ+. To make evident the symmetry behind this construction, and its physical significance, we use the following notation W(Vf , σf ) ∼ = W(V+ , σ+ )W(V− , σ− ), ∞
(4.17)
where the symbol with subscript ∞ accounts for the twisted interaction operated by σ∞ between the two chiral field algebras at the chosen point ∞.
• For S, the Schwartz space of functions on the chiral line, and Va ≅ S, we define the two chiral symplectic spaces of observables as Va± := (Va, σ±) ⊂ (V±, σ±). Choosing ∞ = −∞ we have Θ(+∞) = (0, 0) ∈ Rd, for any Θ ∈ Va+ ⊕ Va−. Hence σ∞ vanishes, independently of the choice of ∞ we made, when at least one of its arguments is in Va+ or Va−. We may interpret this as an independence relation, given by the vanishing of the commutator at ∞ between any field of one chirality and any observable of the other chirality.
• The isomorphism ψ∞ splits when restricted to the observable symplectic subspace, i.e. it holds ψ∞(Va, σa) = (Va+, σ+) ⊕ (Va−, σ−), and we have W(Va, σa) ≅ W(Va+, σ+) ⊗ W(Va−, σ−). The (vacuum) states of the algebras W(Va±, σ±), i.e. the quasi-free states on the chiral observables algebras W(Va±, σ±), are defined for θ ∈ Va± and Ta±(θ)(p) := |p|^{1/2} θ̃(p) by
\[ \omega_{a\pm}(W(\theta)) := \exp\Big(-\frac{1}{2}\,\|T_{a\pm}(\theta)\|^2_{H_{a\pm}}\Big) = \exp\Big(-\frac{1}{2}\int_{\mathbb{R}} |p|\,|\tilde\theta|^2\,dp\Big). \tag{4.18} \]
Similarly to the time zero formulation, the GNS construction from these states gives Fock space representations on the Hilbert spaces Ha±, with chiral one particle spaces Ha± := Ta±(Va±)−, and GNS (vacuum) vectors Ωa±, see, for example, [10] for details. From these representations of the chiral observable algebras, we can obtain the non-regular representations of the chiral field algebras, using the regularizing elements tool of Sec. 3.1 as follows. Initially we observe that a simple relation between the charges carried by a field in the two formulations is also obtained from the d'Alembert formula (4.9). The
charges carried by Θ = (θ+, θ−) ∈ V+ ⊕ V− are the pair (c+, c−) ∈ C+ ⊕ C− ≅ Rd ⊕ Rd, where
\[ c_\pm := \lim_{y \to +\infty} \theta_\pm(y) - \lim_{y \to -\infty} \theta_\pm(y). \tag{4.19} \]
If F ∈ Vf and if c± are the charges of ψ∞(F) = ΘF, we have
\[ F_c = c_+ - c_- \quad \text{and} \quad F_q = c_+ + c_-. \tag{4.20} \]
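The change of variables (4.20) between chiral and time zero charges, together with its inverse, can be written out as a short Python sketch. The function names are hypothetical and exact rational arithmetic is used only to make the round trip exact; nothing here is taken from the paper beyond Eq. (4.20) itself.

```python
from fractions import Fraction


def time_zero_charges(c_plus, c_minus):
    """Eq. (4.20): returns the pair (F_c, F_q) from the chiral charges."""
    return c_plus - c_minus, c_plus + c_minus


def chiral_charges(F_c, F_q):
    """Inverse of Eq. (4.20): returns (c_+, c_-) = ((F_q + F_c)/2, (F_q - F_c)/2)."""
    return (F_q + F_c) / 2, (F_q - F_c) / 2
```

Since the map is linear and invertible, (F_c, F_q) vanishes exactly when (c_+, c_-) does.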
In particular we have that (Fc, Fq) = (0, 0) iff (c+, c−) = (0, 0). Consider now two copies of the elementary symplectic space, defined in Sec. 2.3, that we denote by (L±, σL), with L± = C± ⊕ N± ≅ R²_d. Let moreover ωL± be the non-regular states on the Weyl algebras W(L±, σL) respectively, defined as in Eq. (2.34) by ωL±(W((c±, n±))) = δc±,0, for (c±, n±) ∈ L± and δ the Kronecker delta. A regularizing element for the left field movers (and similarly for the right movers) is a non vanishing charge element S+, i.e. S+ ∈ V+/Va+. We can choose it such that ∫R S+ ∂S+ dx = 0 for simplicity. Then, as in Proposition 3.2, there exists a regularizing isomorphism ψS+ : (V+ , σ+ ) → ψS+ ((V+ , σ+ )) (Va+ , σ+ ) ⊕ (L+ , σL ), θ ↦ (θ − c+S+) ⊕ (c+, n+)
(4.21)
where the charge is c+ = lim y→+∞ θ(y) − lim y→−∞ θ(y) and n+ := ∫R (θ∂S+ − S+∂θ) dx. Observe that c+ = 0 if θ ∈ Va+ ⊂ V+. Similarly, we obtain (c−, n−) ∈ L− and, as in Proposition 2.6, it is easy to see that two non-regular states ω± on W(V±, σ±), respectively, are defined by ω±(W(θ)) = ωa±(W(θ − c±S±)) ωL±((c±, n±)),
θ ∈ V± ,
(4.22)
with non-separable GNS representation spaces H± ∼ = Ha± ⊗ HL± , respectively. A state for the algebra W(V+ , σ+ )W(V− , σ− ) is defined by ∞
ω∞ (W (Θ)) := ωa+ (W (θ+ − c+ S+ ))ωa− (W (θ− − c− S− ))δc+ ,0 δc− ,0 ,
(4.23)
for the element W(Θ) ∈ W(V+, σ+)W(V−, σ−) associated to the pair Θ = θ+ ⊕
θ− ∈ V+ ⊕ V− , with charges (c+ , c− ) ∈ C+ ⊕ C− . Observe that the definition of the state ω∞ above is well posed, in fact once we write the obvious algebraic isomorphism W(V+ , σ+ )W(V− , σ− ) ∼ = W(Va+ ⊕ N+ , σ+ ) ⊗ W(Va− ⊕ N− , σ− ) U(C+ ⊕ C− ), ∞
the condition (2.28) of Proposition 2.6 is satisfied because for the state ωC = ωL+ ⊕L− C+ ⊕C− it holds ωC (W (c+ , c− )) = δ(0,0),(c+,c− ) . Moreover, the choice of the point at infinity ∞ (note that the values of the functions at the point at infinity are not involved in the definition of ω∞ ) and of the regularizing elements S± ∈ V± , defines the state ω∞ uniquely, up to isomorphism, as in point (i) of Proposition 3.5.
From the GNS representations π± := (πa± ⊗ πL±) ◦ ψS± of the Weyl algebras W(V±, σ±), respectively, we obtain for the non-regular GNS representation of the state ω∞: π∞ ≅ π+ ⊗ π− := ((πa+ ⊗ πL+) ◦ ψS+) ⊗ ((πa− ⊗ πL−) ◦ ψS−). For the Hilbert space H∞ of this representation, it holds H∞ ≅ H+ ⊗ H− ≅ Hf, i.e. π∞ is unitarily equivalent to the representation πf. In fact, for F = (f0 ⊕ f1) ∈ Va we have
\[ \|T_a(F)\|^2_{H_a} = \int_{\mathbb{R}} \big( |p|^{-1} |\tilde f_0|^2 + |p|\,|\tilde f_1|^2 \big)\,dp = \int_{\mathbb{R}} \big( 2|p|\,|\tilde\theta^F_+|^2 + 2|p|\,|\tilde\theta^F_-|^2 \big)\,dp = 2\,\|T_{a+}(\theta^F_+)\|^2_{H_{a+}} + 2\,\|T_{a-}(\theta^F_-)\|^2_{H_{a-}}. \tag{4.24} \]
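The middle equality in Eq. (4.24) is a pointwise parallelogram identity in momentum space, once f0 = ∂(θ+ − θ−) and f1 = θ+ + θ− from Eq. (4.11) are inserted. The Python sketch below checks this identity at a single momentum p. It assumes the Fourier convention in which ∂ acts as multiplication by ip (only the modulus matters here), and the function names are hypothetical.

```python
def time_zero_density(p, th_plus, th_minus):
    """Integrand |p|^-1 |f0~|^2 + |p| |f1~|^2 at momentum p != 0."""
    f0_hat = 1j * p * (th_plus - th_minus)  # assumed convention: derivative -> i p
    f1_hat = th_plus + th_minus
    return abs(f0_hat) ** 2 / abs(p) + abs(p) * abs(f1_hat) ** 2


def chiral_density(p, th_plus, th_minus):
    """Integrand 2|p| |theta_+~|^2 + 2|p| |theta_-~|^2 at momentum p."""
    return 2.0 * abs(p) * (abs(th_plus) ** 2 + abs(th_minus) ** 2)
```

The two densities agree for any complex values of the transforms of θ±, because |a − b|² + |a + b|² = 2|a|² + 2|b|².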
We pass now to define the chiral and 1+1-dimensional nets of the fields and their relation with the time zero one. For a time zero based double cone O = I = I+ × I− , with I ∈ I and I± ∈ I± , we have: F(O) = F (I) ∼ = F+ (I+ )F− (I− ). ∞
(4.25)
Here denotes the twisted crossed product for the point at infinity obtained in ∞
Eq. (4.17) and F± : I± → π±(W(V±(I±)))'' are the chiral field nets on the non separable Hilbert spaces H± defined, for I± ∈ I± the light ray intervals, from the localized symplectic spaces V±(I±) = {θ ∈ V± : supp ∂θ ⊂ I±}. The observable chiral nets are defined on the separable Hilbert Fock spaces Ha± by A± := I± → πa±(W(Va±(I±)))'',
I± ∈ I± ,
and the following equality holds for O as above A(O) = A(I) ∼ = A+ (I+ ) ⊗ A− (I− ).
(4.26)
As a last observation, notice that if N = {0 ⊕ n, n ∈ R} ⊂ Vf is the time zero subspace defined in Sec. 3.1 above, then ψ∞(N) = N ⊕ N ⊂ V+ ⊕ V−, because ψ∞(n) = n/2 ⊕ n/2. Hence ψ∞(N) generates the diagonal constant elements in the field, and it holds Zb ≅ π+(W(ψ∞(N))) ⊗ π−(W(ψ∞(N))) ≅ ∩I±∈I± F±(I±), i.e. the common Abelian von Neumann subalgebra of the local left and right movers is isomorphic to the non-trivial center Zb of the time zero net B. It is well known that the 1+1-dimensional observable net AK, in the vacuum representation πa ≅ πa+ ⊗ πa−, satisfies (see [51, 29, 8] for this model and [32] for a general formulation):
• covariance under the action of PSL(2, R) × PSL(2, R), the universal covering group of the Möbius group on M, implemented by a continuous, unitary, positive energy representation that we denote by Ua = Ua+ ⊗ Ua−;
• Reeh–Schlieder property for AK relative to the cyclic and separating vector Ωa ∈ Ha (and also for the net AI, relative to the same vector, and for A± relative to Ωa± ∈ Ha±); and
• the local algebras AK(O) (AI(I) and A±(I±)) appear as type III1 factors.
Moreover, as a consequence of the modular theory for the local algebras and of the Möbius covariance, it is also shown in [29] that we have
• Haag duality for the net AK, with respect to the double cone index set K (respectively, for AI with respect to the index set I and for the nets A± with respect to the index sets I±);
• timelike duality for the net AK, with respect to the index set K; and
• unitary equivalence of the local algebras AK(O) with any local algebras defined by additivity on bounded simply connected regions or wedges in M (respectively, of the algebra AI(I) and the algebra defined by additivity on the half lines, in time zero formulation).

4.3. DHR sectors for the observable net AI

We begin this section by recalling some basic facts about DHR superselection theory, see, e.g., [48]. Given a directed index set P and a von Neumann algebras net NP in representation πn, it holds NP ⊂ N P := (πn , πn ) . The DHR superselection sectors of NP are described by the W∗-category Rep⊥ NP defined as follows:
• objects: the representations π of NP such that
- π is a representation in N P, i.e. π(NP(P)) ⊂ N P for P ∈ P;
- π satisfies the DHR superselection criterion, i.e. π↾P⊥ ≅ πn↾P⊥,
P ∈ P;
and
(4.27)
• arrows: the intertwiners between these representations, i.e. operators T ∈ N P such that for any P ∈ P and π1 , π2 ∈ Rep⊥ NP we have T π1 (A) = π2 (A)T,
A ∈ NP (P ).
(4.28)
By [18, 20], it is possible to analyze Rep⊥ NP in terms of the tensor W∗-category Tt of localized transportable endomorphisms of the net NP (if P is directed this is actually an equivalence of W∗-categories). The correspondence functor is given on the objects of Tt by π = πn ◦ ρ, for ρ ∈ Tt and π ∈ Rep⊥ NP.k
k The endomorphism ρ is said to be localized in P ∈ P, where P is the element appearing in Eq. (4.27); moreover it is said to be transportable if there exists a field of endomorphisms P ∋ a ↦ ρa such that ρa = ρ if a = P, the localization region of ρ, and if for every b ∈ P such that a, P ⊂ b, there exists a unitary operator u(b) ∈ U(Hπ) such that ρa = ad u(b) ◦ ρ. For a general approach, also on non-directed sets, see [48].
This functor allows
one to reduce the study of all the representations to the Hilbert space Hn, with intertwiners defined as in Eq. (4.28). In our case, using the above recalled identification of Rep⊥ AI and Tt, we define for any F ∈ Vf(I) that carries the charges (Fc, Fq) ∈ G a DHR representation for the nets AI and AK, obtained by the adjoint action automorphism ρIF localized in I, such that πa ◦ ad πf(Wf(F)) := πa ◦ ρIF.
(4.29)
For the representations in Rep⊥ NP, the covariance property under the action of a geometrical symmetry group of the spacetime is a standard requirement. In our model, in order to define and discuss such a covariance and the positivity of the energy for the net AI and for the introduced subsidiary nets, we have to pass to the 1+1-dimensional theory. In fact it is well known that if NK is a (Poincaré or Möbiusl) translation covariant net on the index set K of the double cones in a 1+1-dimensional Minkowski spacetime M, its time zero restriction net NI is only a space-translation covariant net, without spectrum condition, see, for example, [34]. A representation π of a net NK is said to be Möbius covariant if there exists a unitary representation Uπ of the group PSL(2, R) × PSL(2, R) on the Hilbert space Hπ such that, for any double cone O = I+ × I− and element g = (g+, g−) ∈ U ⊂ PSL(2, R) × PSL(2, R) in the connected neighborhood U of the identity element of the covering of the Möbius group, we have ad Uπ(g)(π(N(I+ × I−))) = π(N(g+I+ × g−I−)).
(4.30)
Returning to the Streater and Wilde model, if we take F ∈ Vf(I) implementing the sector automorphism ρIF and the representation πIF := πa ◦ ρIF as in Eq. (4.29), labeled by the charges (Fc, Fq) ∈ G, we may pass to the chiral formulation using the symplectic isomorphism ψ∞ of Lemma 4.3. Given the chiral charges (cF+, cF−) = ((Fq + Fc)/2, (Fq − Fc)/2), see Eq. (4.20), we have a unitary representation of PSL(2, R) × PSL(2, R) on the subspace HcF+ ⊗ HcF− ⊂ H+ ⊗ H− ≅ (⊕c+∈C+ Hc+) ⊗ (⊕c−∈C− Hc−). For every chiral charge c± ∈ C±, and for Ha± the separable Hilbert space of the observables, we have the Hilbert space equivalence Hc± ≅ Ha±, but carrying different representations. If this equivalence is given by the unitaries Y± : Ha± → Hc± and if Ua± are the representations of the Möbius group on Ha±, respectively, then we may define by Uc± := ad Y±(Ua±) the unitary representations of the Möbius group on the Hilbert spaces Hc±. The representations of the same group on Hc+ ⊗ Hc− are defined by Uc+,c− := Uc+ ⊗ Uc−.
l The name conformal covariance is better reserved for covariance under the diffeomorphisms group, which will not appear in this paper.
Denote now by ξ± ∈ I± the coordinates on the light ray lines, and take an element Θ = (θ+, θ−) ∈ V+(I+) ⊕ V−(I−), i.e. with supp Θ = I+ × I− = O, and charges equal
to (c+, c−). For a given DHR representation πc+ ⊗ πc−, and for any g = (g+, g−) ∈ U ⊂ PSL(2, R) × PSL(2, R), the symmetry group representation Uc+,c− acts as
\[ \mathrm{ad}\,U_{c_+,c_-}(g)\big(\pi_{c_+}(W(\theta_+(\xi_+))) \otimes \pi_{c_-}(W(\theta_-(\xi_-)))\big) = \mathrm{ad}\,U_{c_+}(g_+)\big(\pi_{c_+}(W(\theta_+(\xi_+)))\big) \otimes \mathrm{ad}\,U_{c_-}(g_-)\big(\pi_{c_-}(W(\theta_-(\xi_-)))\big) = \pi_{c_+}(W(\theta_+(g_+^{-1}\xi_+))) \otimes \pi_{c_-}(W(\theta_-(g_-^{-1}\xi_-))). \]
(4.31)
Observe that it is not necessary to define the action of the Möbius group on the two left/right chirality points at infinity ∞±, which do not even appear explicitly in the action of the representation Uc+,c− for any (c+, c−) ∈ C+ ⊕ C−. In fact it is possible to choose ∞+ not contained in the left support I+ of Θ, and take the open connected neighborhood U such that ∞+ ∉ g+I+ for every (g+, g−) ∈ U, and similarly for ∞−. Moreover, the defined representation Uc+,c− may also be used in restriction to the elements of the chiral observable net AK = A+ ⊗ A− in any representation πIF ≅ πc+ ⊗ πc−. Finally, the covariance of the vacuum representation π∞ ≅ π+ ⊗ π− of the field net FK is also easily obtained, defining the Möbius group representation by U := ⊕(c+,c−)∈C+⊕C− Uc+,c− = (⊕c+∈C+ Uc+) ⊗ (⊕c−∈C− Uc−),
(4.32)
under the proper notion of convergence, i.e. summing up on finite sets of charges Λ ⊂ C+ ⊕ C−. The same result holds for the time zero formulation, under the isomorphisms seen above. After this excursion on the chiral formulation and the representation of the Möbius group, the reason why we chose the time zero description will become clear in a moment. Solitonic sectors appeared as part of AQFT in [45]; for a general formulation see also [24, 43]. In the Streater and Wilde model, this notion emerges for the sectors of the intermediate net CI, which turn out to be similar to a class studied in [40] for massive and conformal field theories. For I ∈ I, we define its left and right causally disjoint sets Il⊥, Ir⊥ ⊂ I, by Il⊥ ∪ Ir⊥ = I⊥ and Il⊥ ∩ Ir⊥ = ∅, and use the following

Definition 4.4. Let NI be a net in the defining representation πn, and let K be a group with a local action α on the net NI. A translation covariant representation π of the net NI is said to be a K-solitonic representation with support I ∈ I if there exist h, k ∈ K such that for any I1 ∈ Il⊥ and I2 ∈ Ir⊥ we have πN (I1 ) = πn ◦ αh (N (I1 )),
πN (I2 ) = πn ◦ αk (N (I2 )).
A K-solitonic automorphism ρIk,h of the net NI is a net automorphism such that π = πn ◦ ρIk,h is a solitonic representation with support I ∈ I, i.e. ρIk,h NI (I1 ) = αk and ρIk,h NI (I2 ) = αh .
We recall other notions, presented e.g. in [41, 42], that are also useful for the case at hand. If K is a (finite) compact group and RI := NI^K is the fixed-point net of NI under the action of K, so that the DHR sectors of the net RI are described by K-solitonic automorphisms of the net NI, we speak of an orbifold model; moreover, such a model is said to be holomorphic if the net NI has only trivial superselection sectors. Starting from a 1+1-dimensional net NK, with time zero restriction NI, the solitonic automorphisms being not locally normal at infinity, it is only as representations of the fixed-point subnet RK = NK^K, with time zero restriction RI, that they represent true positive energy DHR sectors. In the Streater and Wilde model, the properties of the sector automorphisms of the observable net AI (and AK), implemented by the Weyl elements of the auxiliary nets, are collected in the following result, where geometric covariance properties are also discussed.

Proposition 4.5. In the above notation, we have (i) for F = f0 ⊕ f1 ∈ Ve(I), the adjoint action of the represented Weyl element πf(Wf(F)) implements an N-solitonic transportable automorphism ρIF of the net CI, localized in I ∈ I, i.e. such that for G ∈ Vc we have
ρIF(πf(Wf(G))) := ad πf(Wf(F))(πf(Wf(G))) = e^{iσf(F,G)} πf(Wf(G)). (4.33)
Moreover, if lim x→±∞ f1(x) = F± ∈ N and if loc G ∈ Ir⊥, we have ρIF(πf(Wf(G))) = αF+(Wf(G)) = e^{−iF+Gc} πf(Wf(G)), or if loc G ∈ Il⊥ we have ρIF(πf(Wf(G))) = αF−(Wf(G)) = e^{−iF−Gc} πf(Wf(G)). The elements in CI intertwine solitonic automorphisms with the same values of Fq = F+ − F−, preserving supports, if localized with the same support as the automorphisms.
Such automorphisms turn out to be DHR sector automorphisms in restriction to the nets $B_I$ and $A_I$, localized in $I$, with charge $F_q \in Q$ and intertwiners in $B_I$ and $A_I$, respectively;

(ii) for $F \in V_c(I)$ the automorphism $\rho^I_F := \mathrm{ad}\,\pi_f(W_f(F))$ is a DHR automorphism of the net $E_I$, localized in $I$, i.e. such that for $G \in V_e$ equality (4.33) holds and $\rho^I_F \restriction E(I_1) = \iota$ if $I_1 \perp I$. The elements in $E_I$ intertwine automorphisms with the same charge, preserving supports if localized as the automorphisms are. The restriction to the nets $Q_I$ and $A_I$ gives DHR sector automorphisms for these nets as well, with intertwiners in $Q_I$ and $A_I$, respectively;

(iii) for any $F \in V_f(I)$ the automorphism $\rho^I_F$ defined as in equality (4.33) is equivalent to a positive energy, Möbius covariant DHR sector automorphism $\rho^O_F$ for the 1+1-dimensional net $A$, with charges $(F_c, F_q) \in G$, localized in $O = I''$. The intertwiners of two such automorphisms are given by elements in $A$.
Proof. (i) The transportability follows from a simple calculation, see, for example, [51], using the correspondence of the given automorphism sectors on the time zero axis. Moreover $\pi := \pi_c \circ \rho^I_F$ satisfies Definition 4.4 for the group $N$, since $F_\pm \mapsto I_a \otimes \pi_L(W_L((0, F_\pm))) \otimes I_M \in U(H_f)$ is a strongly continuous unitary representation of $N$ on the Hilbert space $H_f$, acting locally on the net $C_I$. (ii) and (iii) It is easy to check the requirements of the DHR superselection criterion for such automorphisms, relative to the indicated nets. In particular, to show the equivalence in (iii) it is enough to observe that the map
\[ \rho^I_F := \mathrm{ad}(\pi_f(W(F))) \mapsto \mathrm{ad}(\pi_+(W(\psi_\infty(F))) \otimes \pi_-(W(\psi_\infty(F)))) =: \rho^O_F \]
defines an automorphism of the net $A$ with the required properties. Positivity of the energy and Poincaré covariance of $\rho^O_F$ are proved in [51]. The Möbius covariance is proved using the chiral formulation and the representation of the Möbius group given in Eq. (4.31).

Remark 4.6. (1) Notice that the time zero formulation of Proposition 4.5 distinguishes between the different natures of the two families of DHR automorphisms obtained for the net $A$: the ones in (i) are the restrictions of solitonic automorphisms of the bigger net $C$; the ones in (ii) are instead the restrictions of DHR automorphisms of the bigger net $Q$. Similarly for the net $A$ because of (iii). In the following Sec. 4.4, the net $A$ is characterized as the fixed-point subnet of $Q$ under the action of a compact gauge group $G_q$, i.e. $A = Q^{G_q}$. Using the discussion and terminology in [35], we say that the automorphisms in (i) give twisted representations of $A$ and the ones in (ii) untwisted ones, relative to the representation of the net $Q$. In contrast, the chiral approach describes the sectors, without distinction, as restrictions of solitonic automorphisms on the light lines.
(2) In the general situation treated in [40], the split property for the net is required in order to construct disorder operators that implement the solitonic automorphisms; in the Streater and Wilde case such operators are directly given as Weyl elements in the larger algebras.

4.4. Gauge symmetry group

We consider in the sequel the gauge automorphisms defined on the net $F_I$. The general theory in [18,19], the construction of the sector automorphisms of the Weyl algebras and the structure of simple current extensions in (ii) of Proposition 4.2 suggest looking for gauge symmetries in the dual of the charge group $G$. We also recall the results (iii) and (iv) in Proposition 2.10, where the properties of the elementary Weyl algebra on the symplectic space $L = C \oplus N \cong \mathbb{R}_d^2$, in a non-regular representation, are displayed. Of particular relevance is the Bohr compactification of $N$.
Consider the two Abelian groups introduced so far: the charge group $G := C \oplus Q \cong \mathbb{R}_d^2$, furnished with the discrete topology, and the additive group $G_0 := N \oplus R$, as defined in the symplectic isomorphism in Sec. 3.1 and furnished with the usual topology of $\mathbb{R}^2$. Consider moreover the Abelian group of all characters on $G$, i.e. the group of all the functions $\chi : G \to \mathbb{T}$ such that $\chi(s + s') = \chi(s)\chi(s')$ for $s, s' \in G$, with the identity function as neutral element and complex conjugation as inverse. If we furnish this group with the compact-open topology, which coincides with the topology of pointwise convergence, $G$ being discrete, we obtain a compact Abelian group, which we denote by $G$, because of the Pontryagin duality theorem. Moreover, the group $G$ is isomorphic to the Bohr compactification of $G_0$, i.e. the closure of the group $G_0$ in the compact-open topology defined above. Observe that the group $G_0$ is identified with the subgroup of the characters of $G$ that correspond to strongly continuous one-dimensional representations of $G$ with the usual topology. This means that under the group morphism
\[ \chi : G_0 \to G, \qquad (n, r) \mapsto \chi_{(n,r)}, \]
the elements of $G_0$ are the elements $\chi_{(n,r)}$ of $G$ with the property that the map $\lambda \mapsto \chi_{(n,r)}(\lambda(c, q)) = e^{i\lambda(nc+rq)}$, $\lambda \in \mathbb{R}$, $(c, q) \in G$, is continuous. The properties collected in the following result justify calling $G$ the gauge group of the net $A$, at least relative to the net $F$.

Proposition 4.7. The map defined by
\[ V : G_0 \to U(H_f), \qquad (n, r) \mapsto V(n, r) := I_a \otimes \pi_L(W_L((0, n))) \otimes \pi_M(W_M((r, 0))) \]
satisfies:

(i) $V$ is a strongly continuous unitary representation of $G_0$ on $H_f$;

(ii) the invariant Hilbert subspace of $V(G_0)$ is $H_a$. Moreover, the representation $V$ leaves invariant $\Omega_f = \Omega_a \otimes \Omega_L \otimes \Omega_M$, the vacuum vector of $F_I$;

(iii) on the net $F_I$ the adjoint action of $V$ implements an automorphism group, which we indicate by the same symbol $G_0$, such that for $(n, r) \in G_0$ and $F \in V_f$ we have
\[ \alpha_{(n,r)}(\pi_f(W_f(F))) = \mathrm{ad}\, V(n, r)(\pi_f(W_f(F))) = e^{-i(nF_c + rF_q)}\, \pi_f(W_f(F)). \]
This automorphism group acts strictly locally on the net $F_I$ and on the 1+1-dimensional net $F = F_+ \otimes F_-$, commuting with the action of the Möbius group on the net $F$, represented on $H_f \cong H_+ \otimes H_-$ as in Eq. (4.32) and acting on the subspace of fixed charges $(c_+, c_-)$ as in Eq. (4.31).

Moreover, for the compact Abelian group $G$ defined above, we have:

(iv) $G \cong G_c \times G_q$, where $G_c$ and $G_q$ are the Bohr compactifications of $N$ and $R$ respectively. The group $G$ is strongly continuously represented on $H_f$, extending the representation $V$ and preserving the properties in (ii) and (iii) of the subgroup $G_0$;

(v) the dual group of $G$ is $G$, the charge group;

(vi) it holds that $F_I^{G_q} = C_I$, $F_I^{G_c} = E_I$ and $F_I^{G} = B_I$.

Proof. Strong continuity in (i) and (iv) holds because the group $G_0$ is represented by $V$ using only the regular subspace $N \oplus R \subset L \oplus M$ of the representation $\pi_L \otimes \pi_M$ and, by density, the representation extends to elements in $G$ because of the universal property of the Bohr compactification, see [16, Chap. 16]. (ii) The invariance of the subspace $H_a$ and of the vector $\Omega_f$ holds because of the factorization property of the state $\omega_f = \omega_a \otimes \omega_L \otimes \omega_M$, and of its GNS representation, as illustrated in Proposition 2.6. (iii) For the elements in $G_0$ this is trivial, because the action is implemented by Weyl elements. The commutation of the action with the action of the Möbius group derives from the invariance of charges and the $\mathbb{T}$-valued action of the latter group. (iv) results from the discussion above. In particular, the locality of the action for elements of $G$ not in $G_0$ holds because it is $\mathbb{T}$-valued. (v) holds by the Pontryagin duality theorem. (vi) follows from the general case in [18], averaging over the compact group $G$, or the indicated subgroups, with respect to the Haar measure, in the one-dimensional representation fixed by specifying the charges of the Weyl generators.

Remark 4.8. Observe that the previous introduction of a non-splitting isomorphism, i.e. of a regularizing element $T \in V_f$, has also been useful to simplify the proofs of Propositions 4.5 and 4.7. Actually this isomorphism corresponds to a section $G \ni (c, q) \mapsto \Delta_c / \mathrm{Inn}\, A$, where $\Delta_c$ is the group of the covariant DHR automorphisms and $\mathrm{Inn}\, A$ that of the inner automorphisms of $A$. This section gives a group homomorphism, as said in [19, Sec. IV], and a similar choice is also made in a non-trivial center situation, as illustrated in [4].
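Having established the existence of the superselection automorphisms and their relative gauge groups, the following diagram gives the full meaning of the simple current extension diagram seen in (4.8):
\[
\begin{array}{ccccc}
A = Q^{G_q} & \subset & B = E^{G_q} = C^{G_c} = F^{G} & \subset & C = F^{G_q} \\
\cap & & \cap & & \cap \\
Q & \subset & E = F^{G_c} & \subset & F
\end{array}
\tag{4.34}
\]
[Diagram (4.34): in the printed layout the nets also carry the extension labels $A \otimes Z_b$, $A\ U(Q)$, $B\ U(Q)$, $B\ U(C)$ and $B\ U(Q)\ U(C)$.]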
Notice that
• the vertical lines describe fixed-point restrictions under the action of the compact subgroup $G_q$; dually they account for (twisted) DHR sectors of $A_I$ and $B_I$ and solitonic (automorphism) sectors for $C_I$;
• the horizontal lines describe fixed-point restrictions under the action of the compact subgroup $G_c$; dually they account for (untwisted) DHR sectors of the nets $A_I$, $B_I$, $Q_I$ and $E_I$.

To conclude, it is worth recalling that diagrams of this kind appear in other situations where the sectors of a net are obtained along two different extension procedures, through partial field nets; see e.g. the square of nets in [43, 40]. For a general explanation of these features in terms of braided crossed G-categories, for G a finite gauge group, see [42]. Regarding the last cited paper, we observe that in the Streater and Wilde model only the discrete subgroup $N \cong \mathbb{R}_d$ is present in the description of solitonic sectors, and not the entire gauge group.

Acknowledgments

This paper is mainly based on my PhD thesis, written under the supervision of John Roberts, to whom I am grateful for his continuing interest and support. A partial list of people to whom I am indebted for useful conversations and hints about the topics of this paper includes Sebastiano Carpi, Roberto Conti, Roberto Longo, Michael Müger, Fernando Lledó, Gerardo Morsella, Gherardo Piacitelli, Giuseppe Ruzzi and Ezio Vasselli.

References

[1] F. Acerbi, G. Morchio and F. Strocchi, Infrared singular fields and nonregular representations of canonical commutation relation algebras, J. Math. Phys. 34 (1993) 899–914.
[2] F. Acerbi, G. Morchio and F. Strocchi, Theta vacua, charge confinement and charged sectors from nonregular representations of CCR algebras, Lett. Math. Phys. 27 (1993) 1–11.
[3] H. Baumgaertel and H. Grundling, Superselection in the presence of constraints, J. Math. Phys. 46(8) (2005), 082303, 34 pp.
[4] H. Baumgaertel and F. Lledo, Dual group actions on C*-algebras and their description by Hilbert extensions, Math. Nachr. 239/240 (2002) 11–27.
[5] H. Baumgaertel and F. Lledo, Duality of compact groups and Hilbert C*-systems for C*-algebras with a nontrivial center, Internat. J. Math. 15 (2004) 759–812.
[6] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, I, 2nd edn. (Springer, New York, 1987).
[7] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, II, 2nd edn. (Springer, New York, 1997).
[8] R. Brunetti, D. Guido and R. Longo, Modular structure and duality in conformal quantum field theory, Commun. Math. Phys. 156 (1993) 201–219.
[9] D. Buchholz, S. Doplicher, R. Longo and J. E. Roberts, A new look at Goldstone's theorem, Rev. Math. Phys. 4 (Special Issue No. 1) (1992) 49–83.
[10] D. Buchholz, G. Mack and I. Todorov, The current algebra on the circle as a germ of local field theories, Nucl. Phys. B (Proc. Suppl.) 5B (1988) 20–56.
[11] S. Carpi, On the representation theory of Virasoro nets, Commun. Math. Phys. 244 (2004) 261–284.
[12] S. Cavallaro, Multiplicity-free representations of commutative C*-algebras and spectral properties, arXiv:math.OA/9804090.
[13] F. Ciolli, Sui settori di Superselezione del Campo Libero Scalare a Massa nulla in 1+1 dimensioni, PhD thesis, Università di Messina (2003).
[14] F. Ciolli, Massless scalar free field in 1+1 dimensions II: Net cohomology and completeness of superselection sectors, arXiv:0811.4673.
[15] J. Derezinski and K. A. Meissner, Quantum Massless Field in 1+1 Dimensions, Lecture Notes in Phys., Vol. 690 (Springer, Berlin, 2006), pp. 107–127.
[16] J. Dixmier, Les C*-algèbres et leurs Représentations (Gauthier-Villars, Paris, 1964).
[17] J. Dixmier, C*-Algebras (North Holland, 1977).
[18] S. Doplicher, R. Haag and J. E. Roberts, Fields, observables and gauge transformation I, Commun. Math. Phys. 13 (1969) 1–23.
[19] S. Doplicher, R. Haag and J. E. Roberts, Fields, observables and gauge transformation II, Commun. Math. Phys. 15 (1969) 173–200.
[20] S. Doplicher, R. Haag and J. E. Roberts, Local observables and particle statistics I, Commun. Math. Phys. 23 (1971) 199–230.
[21] S. Doplicher and R. Longo (eds.), Noncommutative Geometry, Martina Franca, Italy (2000), Lecture Notes in Mathematics, Vol. 1831 (Springer-Verlag, 2004).
[22] S. Doplicher and J. E. Roberts, Why there is a field algebra with a compact gauge group describing the superselection structure in particle physics, Commun. Math. Phys. 131 (1990) 51–107.
[23] D. E. Evans and Y. Kawahigashi, Quantum Symmetries on Operator Algebras (Oxford University Press, 1998).
[24] J. Fröhlich, New super-selection sectors ('Soliton-states') in two-dimensional Bose quantum field models, Commun. Math. Phys. 47 (1976) 269–310.
[25] H. Grundling, A group algebra for inductive limit groups. Continuity problems of the canonical commutation relations, Acta Appl. Math. 42 (1997) 107–145.
[26] R. Haag, Local Quantum Physics, Texts and Monographs in Physics, 2nd edn. (Springer, 1996).
[27] H. Halvorson, Complementarity of representations in quantum mechanics, Stud. Hist. Philos. Mod. Phys. 35 (2004) 45–56.
[28] A. Herdegen, Semidirect product of CCR and CAR algebras and asymptotic states in quantum electrodynamics, J. Math. Phys. 39 (1998) 1788–1817.
[29] P. D. Hislop and R. Longo, Modular structure of the local algebras associated with the free massless scalar field theory, Commun. Math. Phys. 84 (1982) 71–85.
[30] R. V. Kadison and J. R. Ringrose, Fundamentals of the Theory of Operator Algebras, Vol. I, Elementary Theory, Graduate Studies in Mathematics (American Mathematical Society, Providence, 1997).
[31] R. V. Kadison and J. R. Ringrose, Fundamentals of the Theory of Operator Algebras, Vol. II, Advanced Theory, Graduate Studies in Mathematics (American Mathematical Society, Providence, 1997).
[32] Y. Kawahigashi and R. Longo, Classification of two-dimensional local conformal nets with c < 1 and 2-cohomology vanishing for tensor categories, Commun. Math. Phys. 244 (2004) 63–97.
[33] Y. Kawahigashi and R. Longo, Local conformal nets arising from framed vertex operator algebras, Adv. Math. 206 (2006) 729–751.
[34] Y. Kawahigashi, R. Longo and M. Müger, Multi-interval subfactors and modularity of representations in conformal field theory, Commun. Math. Phys. 219 (2001) 631–669.
[35] V. G. Kac, R. Longo and F. Xu, Solitons in affine and permutation orbifolds, Commun. Math. Phys. 253 (2004) 723–764.
[36] R. Longo (ed.), Mathematical Physics in Mathematics and Physics. Quantum and Operator Algebraic Aspects, Fields Institute Communications, Vol. 30 (American Mathematical Society, 2001).
[37] R. Longo and F. Xu, Topological sectors and a dichotomy in conformal field theory, Commun. Math. Phys. 251 (2004) 321–364.
[38] J. Manuceau, M. Sirugue, D. Testard and A. Verbeure, The smallest C*-algebra for canonical commutation relations, Commun. Math. Phys. 32 (1973) 231–243.
[39] M. Müger, Superselection structure of massive quantum field theories in (1+1)-dimensions, Rev. Math. Phys. 10 (1998) 1147–1170.
[40] M. Müger, On Soliton automorphisms in massive and conformal theories, Rev. Math. Phys. 11 (1999) 337–359.
[41] M. Müger, Conformal field theory and Doplicher–Roberts reconstruction, in Mathematical Physics in Mathematics and Physics. Quantum and Operator Algebraic Aspects, ed. R. Longo, Fields Institute Communications, Vol. 30 (American Mathematical Society, 2001), pp. 297–319.
[42] M. Müger, Conformal orbifold theories and braided crossed G-categories, Commun. Math. Phys. 260 (2005) 727–762.
[43] K.-H. Rehren, Spin-statistics and CPT for solitons, Lett. Math. Phys. 46 (1998) 95–110.
[44] K.-H. Rehren, Chiral observables and modular invariants, Commun. Math. Phys. 208 (2000) 689–712.
[45] J. E. Roberts, Local cohomology and superselection rules, Commun. Math. Phys. 51 (1976) 107–119.
[46] J. E. Roberts, Mathematical aspects of local cohomology, in Proceedings of the Colloquium on Operator Algebras and their Applications to Mathematical Physics (CNRS, Marseille, 1977), pp. 321–332.
[47] J. E. Roberts, Lecture notes of 1997/98 course held at Dipartimento di Matematica, Università di Roma 2 Tor Vergata, Roma (1998).
[48] J. E. Roberts, More lectures on algebraic quantum field theory, in Noncommutative Geometry, eds. S. Doplicher and R. Longo, Lecture Notes in Mathematics, Vol. 1831 (Springer-Verlag, 2004), pp. 263–342.
[49] B. Schroer, Two-dimensional models as testing ground for principles and concepts of local quantum physics, Ann. Phys. 321 (2006) 435–479.
[50] J. Slawny, On factor representations and the C*-algebra of canonical commutation relations, Commun. Math. Phys. 24 (1972) 151–170.
[51] R. F. Streater and I. F. Wilde, Fermion states of a Boson field, Nucl. Phys. B 24 (1970) 561–575.
[52] M. Takesaki, Theory of Operator Algebras, I (Springer, 2001).
[53] M. Takesaki, Theory of Operator Algebras, II (Springer, 2002).
[54] M. Takesaki, Theory of Operator Algebras, III (Springer, 2002).
[55] W. Thirring and N. Narnhofer, Covariant QED without indefinite metric, Rev. Math. Phys. 4 (Special Issue No. 1) (1992) 197–211.
July 10, 2009 14:13 WSPC/148-RMP
J070-00375
Reviews in Mathematical Physics Vol. 21, No. 6 (2009) 781–820 c World Scientific Publishing Company
RENORMALIZATION OF SPONTANEOUSLY BROKEN SU(2) YANG–MILLS THEORY WITH FLOW EQUATIONS
CHRISTOPH KOPPER
Centre de Physique Théorique, CNRS, UMR 7644, Ecole Polytechnique, F-91128 Palaiseau, France
[email protected]

VOLKHARD F. MÜLLER
Fachbereich Physik, Technische Universität Kaiserslautern, D-67653 Kaiserslautern, Germany
[email protected]

Received 21 January 2009
Revised 26 May 2009

We present a renormalizability proof for spontaneously broken SU(2) gauge theory based on flow equations. It is a conceptually and technically simplified version of the earlier paper [5], including some extensions. The proof of [5] was also incomplete, since an important assumption made implicitly in the proof of Lemma 2 there was not verified. So the present paper is also a corrected version of [5].

Keywords: Yang–Mills theory; spontaneously broken gauge theory; renormalization; flow equations.

Mathematical Subject Classification 2000: 81T13, 81T15, 81T17
1. Introduction

The differential flow equations [12] of the renormalization group [10, 11] offer a powerful tool for a unified approach to the analysis of systems with infinitely many degrees of freedom. Although first conceived for an analysis of such systems beyond perturbation theory, it was realized by Polchinski [8] that these equations also paved the way for a new elegant approach to perturbative renormalization theory. (Wilson himself had already remarked in the late sixties that this should be possible, as we learned from E. Brézin.) Local gauge theories, however, present particular difficulties in this approach because the momentum space regulator violates gauge invariance. Thus dimensional renormalization is in practice the most popular scheme for renormalizing such theories in perturbation theory. But at the same time this scheme is restricted to Feynman graphs. Not only does it defy being given a rigorous meaning in path integral formulations, it does not even directly apply in a mathematical sense to perturbative Green
functions as a whole without splitting them into graphs. Thus, in some sense it is farthest away from nonperturbative analysis, and it does not allow to address a number of interesting conceptual, mathematical and quantitative questions. The authors analyzed spontaneously broken SU(2)-Yang–Mills theory with flow equations in [5]. This analysis was simplified in [7]. In an endeavor to further simplify and clarify the analysis, which was also caused by lecturing on the subject several times, we came across an error in [5], which reappeared in [7] by quotation. In fact [5, Lemma 2] cannot be proven without an assumption made implicitly in its proof, which did not take into account the presence of irrelevant boundary terms in the bare action. These terms have been “forgotten” because the context of the proof had changed in the progress of our work, after the lemma had been written. Since we have found quite a number of further simplifications in the mean time, since the subject is important in physics, and since a correction of [5] required quite a lot of changes, even if the line of argument stays the same, we preferred to write a self-contained modern and (hopefully!) mathematically correct version of our previous paper. The strategy of proof remains that of [5]. The (ultraviolet) power counting part of the flow equation renormalization proof is universal and simple for all renormalizable theories. For gauge theories, we have to show that gauge invariance can be restored when the cutoffs are taken away. On the level of the Green functions (which are not gauge invariant) this means that we have to verify the Slavnov–Taylor identities (STI) of the theory. They then allow to argue that physical quantities such as the S-matrix are gauge-invariant [13, 14]. On analyzing the flow equations (FE) for a gauge theory one realizes that the restoration of the STI depends on the choice of the renormalization conditions chosen and cannot be true in general. 
More precisely, since gauge invariance is violated in the regularized theory, the renormalization group flow will generally produce nonvanishing contributions to all those relevant parameters of the theory which are forbidden by gauge invariance, e.g. a noninvariant gauge field self-coupling of the form $(A^2)^2$. The question is then: Can we use the freedom in adjusting the renormalization conditions such that the STI are nevertheless restored in the end? To answer this question a first observation is crucial: The violation of the STI in the regularized theory can be expressed through Green functions carrying an operator insertion, which depends on the regulators. FE theory for such insertions tells us that these Green functions will vanish once the cutoffs are removed, if we achieve renormalization conditions on the noninserted Green functions such that the inserted ones, which are calculated from those, have vanishing renormalization conditions for all relevant terms, i.e. up to the dimension of the insertion (which is 5 in our case). Comparing the number of relevant terms for the SU(2) theory, which is 37 (see Appendix A), with that for the insertion, which is 53 (see Appendix C), we realize that it is not possible to make 53 terms vanish by adjusting 37 free parameters, unless there are linear interdependences. These interdependences are revealed in the analysis of the present paper. As compared
to [5] we also include the proof of the validity of the equation of the anti-ghost in the renormalized theory for suitable renormalization conditions.

This paper is organized as follows. In Sec. 2, we introduce the classical action of the model and the BRST-transformations [1, 9]. In Sec. 3, we introduce the concepts from FE theory and recall the statements on renormalizability we need. In particular, we introduce the above-mentioned operator insertions. When using FE it is natural to analyze the generating functional of free propagator amputated Schwinger functions. The analysis of the STI is however technically simpler for one-particle irreducible vertex functions, so that we introduce the generating functionals of both, together with the corresponding renormalizability statements. In Sec. 4, we derive the violated Slavnov–Taylor identities (VSTI) for the regularized theory in various forms for the bare and the renormalized functionals. Sections 1–4 follow closely the line of [5]. In Sec. 5, we present the new tool required in view of the fact that [5, Lemma 2] has become obsolete. Namely, the generating functional of the vertex functions is expanded not only with respect to fields and momenta, but also with respect to the mass parameters, as far as their presence indicates an improvement of UV power counting. The corresponding redefinition of relevant renormalization constants permits a complete analysis of the relevant part of the STI in terms of the renormalization conditions. We no longer need to jump from bare to renormalized functionals and vice versa. It is then possible to show that for suitable renormalization conditions the inserted functional describing the violation of the STI has no relevant part. This result, together with an obvious bound on its irrelevant part at the regularization scale $\Lambda_0$, following directly from the properties of the regulator, permits us to prove that the violation disappears for $\Lambda_0 \to \infty$, so that the STI hold in this limit.

This proof finally elucidates the fact that the validity of the STI can directly and fully be settled by analyzing the (large) system of equations describing its relevant part at the renormalization point. This aim was not achieved in [5]. We reproduce the appendices of [5] with slight notational changes. In Appendix A we list all 37 relevant terms allowed by the global symmetries of SU(2)-Yang–Mills theory. In Appendix B, the 7 relevant terms appearing in the inserted functionals describing the BRST-transformations are listed. In Appendix C, we list the 53 equations corresponding to the relevant contributions to the inserted functional describing the violation of the STI. By analysis of this system of equations we show restoration of gauge symmetry in the (properly) renormalized theory. A reader familiar with the power counting results following from the flow equations can skip the major part of Sec. 3, using it only to find some notation also used in later sections and to get acquainted with the mass expansion of the Schwinger functions, which is used for the first time in this paper. It is described in the last part of Sec. 3.1 (from (55) onwards) and on the last page of Sec. 3.2 (from (88) onwards).
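The counting argument of the Introduction (53 conditions against 37 adjustable parameters) rests on a general fact: an overdetermined linear system can still be solvable when its equations are linearly dependent. The following minimal sketch illustrates only this mechanism on a toy system of 3 equations in 2 unknowns; the matrix `A` and the right-hand sides are invented for illustration and are not taken from Appendices A or C, and the actual system of the paper need not have this simple form.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # find a pivot in column c among the rows not yet used
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def is_solvable(A, b):
    """A x = b is solvable iff rank(A) equals the rank of the augmented matrix."""
    augmented = [row + [rhs] for row, rhs in zip(A, b)]
    return rank(A) == rank(augmented)

# Toy data (hypothetical): the third equation is the sum of the first two.
A = [[1, 2], [3, -1], [4, 1]]
b_consistent = [5, 1, 6]      # respects the dependence: 5 + 1 = 6
b_inconsistent = [5, 1, 7]    # violates it

print(rank(A), is_solvable(A, b_consistent), is_solvable(A, b_inconsistent))
```

In this toy case two free parameters satisfy three conditions precisely because the right-hand side respects the same dependence as the rows.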
2. The Classical Action

Following closely the monograph of Faddeev and Slavnov [2], we collect some basic properties of the classical Euclidean SU(2) Yang–Mills–Higgs model on four-dimensional Euclidean space-time. The fields of the model are a triplet $\{A^a_\mu\}_{a=1,2,3}$ of real vector fields and the complex scalar doublet $\{\phi_\alpha\}_{\alpha=1,2}$. The classical action has the form
\[ S_{\rm inv} = \int dx \left\{ \frac{1}{4} F^a_{\mu\nu} F^a_{\mu\nu} + \frac{1}{2} (\nabla_\mu \phi)^* \nabla_\mu \phi + \lambda (\phi^*\phi - \rho^2)^2 \right\}, \tag{1} \]
with the field strength tensor
\[ F^a_{\mu\nu}(x) = \partial_\mu A^a_\nu(x) - \partial_\nu A^a_\mu(x) + g \epsilon^{abc} A^b_\mu(x) A^c_\nu(x) \tag{2} \]
and the covariant derivative
\[ \nabla_\mu = \partial_\mu + g \frac{1}{2i} \sigma^a A^a_\mu(x) \tag{3} \]
acting on the SU(2)-spinor $\phi$. The parameters $g, \lambda, \rho$ are real and positive, $\epsilon^{abc}$ is totally skew symmetric with $\epsilon^{123} = +1$, and $\{\sigma^a\}_{a=1,2,3}$ are the standard Pauli matrices. The action (1) is invariant under local gauge transformations of the fields
\[ \frac{1}{2i} \sigma^a A^a_\mu(x) \to u(x) \frac{1}{2i} \sigma^a A^a_\mu(x) u^*(x) + g^{-1} u(x) \partial_\mu u^*(x), \qquad \phi(x) \to u(x)\phi(x), \tag{4} \]
with $u : \mathbb{R}^4 \to SU(2)$ smooth. The choice of a stable equilibrium point of the action (1) leads to spontaneous symmetry breaking, dealt with by reparametrizing the complex scalar doublet as
\[ \phi(x) = \begin{pmatrix} B^2(x) + iB^1(x) \\ \rho + h(x) - iB^3(x) \end{pmatrix}, \tag{5} \]
where $\{B^a(x)\}_{a=1,2,3}$ is a real triplet and $h(x)$ the real Higgs field. Moreover, in place of the parameters $\rho, \lambda$ the masses
\[ m = \frac{1}{2} g\rho, \qquad M = (8\lambda\rho^2)^{1/2} \tag{6} \]
are used. Aiming at a quantized theory, pure gauge degrees of freedom have to be eliminated. We choose the 't Hooft gauge fixing, with $\alpha \in \mathbb{R}_+$,
\[ S_{\rm g.f.} = \frac{1}{2\alpha} \int dx\, (\partial_\mu A^a_\mu - \alpha m B^a)^2. \tag{7} \]
With regard to functional integration this condition is implemented by introducing anticommuting Faddeev–Popov ghost and anti-ghost fields $\{c^a\}_{a=1,2,3}$ and $\{\bar c^a\}_{a=1,2,3}$, respectively, and forming with these six independent scalar fields the
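additional term in the action
\[ S_{\rm gh} = -\int dx\, \bar c^a \left\{ (-\partial_\mu \partial_\mu + \alpha m^2)\delta^{ab} + \frac{1}{2}\alpha g m\, h\, \delta^{ab} + \frac{1}{2}\alpha g m\, \epsilon^{acb} B^c - g\, \partial_\mu \epsilon^{acb} A^c_\mu \right\} c^b. \tag{8} \]
Hence, we have the total "classical action"
\[ S_{\rm BRS} = S_{\rm inv} + S_{\rm g.f.} + S_{\rm gh}, \tag{9} \]
which is decomposed as
\[ S_{\rm BRS} = \int dx\, \{ L_{\rm quad}(x) + L_{\rm int}(x) \} \tag{10} \]
into its quadratic part, where $\Delta \equiv \partial_\mu \partial_\mu$,
\[ L_{\rm quad} = \frac{1}{4} (\partial_\mu A^a_\nu - \partial_\nu A^a_\mu)^2 + \frac{1}{2\alpha} (\partial_\mu A^a_\mu)^2 + \frac{1}{2} m^2 A^a_\mu A^a_\mu + \frac{1}{2} h(-\Delta + M^2)h + \frac{1}{2} B^a(-\Delta + \alpha m^2)B^a - \bar c^a(-\Delta + \alpha m^2)c^a, \tag{11} \]
and into its interaction part
\begin{align}
L_{\rm int} ={}& g \epsilon^{abc} (\partial_\mu A^a_\nu) A^b_\mu A^c_\nu + \frac{1}{4} g^2 (\epsilon^{abc} A^b_\mu A^c_\nu)^2 \notag\\
&+ \frac{1}{2} g \{ (\partial_\mu h) A^a_\mu B^a - h A^a_\mu \partial_\mu B^a - \epsilon^{abc} A^a_\mu (\partial_\mu B^b) B^c \} \notag\\
&+ \frac{1}{8} g A^a_\mu A^a_\mu \{ 4mh + g(h^2 + B^a B^a) \} \notag\\
&+ \frac{1}{4} g \frac{M^2}{m} h(h^2 + B^a B^a) + \frac{1}{32} g^2 \left( \frac{M}{m} \right)^2 (h^2 + B^a B^a)^2 \notag\\
&- \frac{1}{2} \alpha g m\, \bar c^a \{ h\delta^{ab} + \epsilon^{acb} B^c \} c^b - g \epsilon^{acb} (\partial_\mu \bar c^a) A^c_\mu c^b. \tag{12}
\end{align}
Inspecting the quadratic part (11) we recognize two favorable consequences of the particular gauge fixing (7): this part is diagonal in the fields (no coupling $A^a_\mu \partial_\mu B^a$ appears) and all fields are massive. As a prerequisite to state the symmetries of $S_{\rm BRS}$ (10), composite classical fields are introduced as follows:
\begin{align}
\psi^a_\mu(x) &= \{ \partial_\mu \delta^{ab} + g \epsilon^{arb} A^r_\mu(x) \} c^b(x), \notag\\
\psi(x) &= -\frac{1}{2} g B^a(x) c^a(x), \notag\\
\psi^a(x) &= \left\{ \left( m + \frac{1}{2} g h(x) \right) \delta^{ab} + \frac{1}{2} g \epsilon^{arb} B^r(x) \right\} c^b(x), \notag\\
\Omega^a(x) &= \frac{1}{2} g \epsilon^{apq} c^p(x) c^q(x). \tag{13}
\end{align}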
We can then write (8) in the form
\[ S_{\rm gh} = -\int dx\, \bar c^a \{ -\partial_\mu \psi^a_\mu + \alpha m \psi^a \}. \tag{14} \]
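As a consistency check, one can verify term by term that (14) reproduces (8). The short derivation below is a worked sketch that only inserts the definitions of $\psi^a_\mu$ and $\psi^a$ from (13); it assumes, as in (8), that $\partial_\mu$ acts on everything to its right.

```latex
\begin{align*}
-\partial_\mu \psi^a_\mu
  &= -\partial_\mu \partial_\mu c^a - g\, \partial_\mu \bigl( \epsilon^{arb} A^r_\mu c^b \bigr), \\
\alpha m\, \psi^a
  &= \alpha m^2 c^a + \tfrac{1}{2} \alpha g m\, h\, c^a
     + \tfrac{1}{2} \alpha g m\, \epsilon^{arb} B^r c^b, \\
-\partial_\mu \psi^a_\mu + \alpha m\, \psi^a
  &= \Bigl\{ (-\partial_\mu \partial_\mu + \alpha m^2)\, \delta^{ab}
     + \tfrac{1}{2} \alpha g m\, h\, \delta^{ab}
     + \tfrac{1}{2} \alpha g m\, \epsilon^{arb} B^r
     - g\, \partial_\mu \epsilon^{arb} A^r_\mu \Bigr\} c^b .
\end{align*}
```

After renaming the summation index $r \to c$, the bracket in the last line is exactly the operator between $\bar c^a$ and $c^b$ in (8).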
The classical action $S_{\rm BRS}$, (10), shows the following symmetries:

(i) Euclidean invariance: $S_{\rm BRS}$ is an O(4)-scalar.

(ii) Rigid SO(3)-isosymmetry: The fields $\{A^a_\mu\}$, $\{B^a\}$, $\{c^a\}$, $\{\bar c^a\}$ are isovectors and $h$ is an isoscalar; $S_{\rm BRS}$ is invariant under spacetime independent SO(3)-transformations.

(iii) BRS-invariance: The BRS-transformations of the basic fields [1] are defined as
\begin{align}
A^a_\mu(x) &\to A^a_\mu(x) - \psi^a_\mu(x)\varepsilon, \notag\\
h(x) &\to h(x) - \psi(x)\varepsilon, \notag\\
B^a(x) &\to B^a(x) - \psi^a(x)\varepsilon, \notag\\
c^a(x) &\to c^a(x) - \Omega^a(x)\varepsilon, \notag\\
\bar c^a(x) &\to \bar c^a(x) - \frac{1}{\alpha} (\partial_\nu A^a_\nu(x) - \alpha m B^a(x))\varepsilon \tag{15}
\end{align}
with the composite fields (13); here $\varepsilon$ is a Grassmann element not depending on space-time, which commutes with the fields $\{A^a_\mu, h, B^a\}$ but anticommutes with the (anti-)ghosts $\{c^a, \bar c^a\}$.

To show the BRS-invariance of the total classical action (9) one first observes that the composite classical fields (13) are themselves invariant under the BRS-transformations (15). From this, and using (14), it follows easily that the sum $S_{\rm g.f.} + S_{\rm gh}$ is invariant under the transformation (15). Finally, on $S_{\rm inv}$ only the BRS-transformations of the fields $A^a_\mu, B^a, h$ act, and these amount to local gauge transformations. We observe that upon scaling the composite fields (13) entering the BRS-transformations as well as $S_{\rm gh}$ (14) by a factor of $\lambda$, the corresponding $S_{\rm BRS}$ remains invariant under such BRS-transformations.

3. Renormalization without Slavnov–Taylor Identities

3.1. The flow equations for the Schwinger functions

Quantization of the theory by means of functional integration in the realm of (formal) power series is based on a Gaussian measure related to the quadratic part (11) of $S_{\rm BRS}$ (10). Denoting the differential operators appearing there by
\[ D_{\mu\nu} := (-\Delta + m^2)\delta_{\mu\nu} - \frac{1-\alpha}{\alpha} \partial_\mu \partial_\nu, \qquad \tilde D := -\Delta + M^2, \qquad D := -\Delta + \alpha m^2, \tag{16} \]
we write
\[ \int dx\, L_{\rm quad}(x) = \frac{1}{2} \langle A^a_\mu, D_{\mu\nu} A^a_\nu \rangle + \frac{1}{2} \langle h, \tilde D h \rangle + \frac{1}{2} \langle B^a, D B^a \rangle - \langle \bar c^a, D c^a \rangle. \tag{17} \]
To these differential operators (16) are associated the (free) propagators
\[ C_{\mu\nu}(x,y) = \frac{1}{(2\pi)^4}\int dk\, e^{ik(x-y)} C_{\mu\nu}(k), \tag{18} \]
and similarly in the other cases, with
\[ C_{\mu\nu}(k) = \frac{1}{k^2+m^2}\Big(\delta_{\mu\nu} - (1-\alpha)\frac{k_\mu k_\nu}{k^2+\alpha m^2}\Big), \quad C(k) = \frac{1}{k^2+M^2}, \quad S(k) = \frac{1}{k^2+\alpha m^2}. \tag{19} \]
A Gaussian product measure, the covariances of which are a regularized version of the propagators (18), (19), forms the point of departure. We choose the cutoff function, improving slightly the former one of [7],
\[ \sigma_\Lambda(k^2) = \exp\Big(-\frac{(k^2+m^2)(k^2+\alpha m^2)(k^2+M^2)(k^2)^2}{\Lambda^{10}}\Big). \tag{20} \]
It is positive, invertible and analytic, and has the property
\[ \frac{d}{dk^2}\sigma_\Lambda(k^2)\Big|_{k^2=0} = 0 \tag{21} \]
which will be helpful in the analysis of the relevant part of the STI later on. Employing this cutoff function we define the regularized propagators, with UV-cutoff $\Lambda_0 < \infty$ and a flow parameter $\Lambda$ satisfying $0 \le \Lambda \le \Lambda_0$,
\[ C^{\Lambda,\Lambda_0}_{\mu\nu}(k) \equiv C_{\mu\nu}(k)\,\sigma_{\Lambda,\Lambda_0}(k^2) := C_{\mu\nu}(k)(\sigma_{\Lambda_0}(k^2) - \sigma_\Lambda(k^2)) \tag{22} \]
and similarly for $C(k)$, $S(k)$. The particular choice (20) implies
\[ \partial_\Lambda C^{\Lambda,\Lambda_0}_{\mu\nu}(k) = -\frac{10}{\Lambda^3}\cdot\frac{(k^2+\alpha m^2)\delta_{\mu\nu} - (1-\alpha)k_\mu k_\nu}{\Lambda^2}\cdot\frac{(k^2+M^2)(k^2)^2}{\Lambda^6}\,\sigma_\Lambda(k^2), \]
and similarly in the other cases. Herefrom follow the bounds, using $C^{\Lambda,\Lambda_0}(k)$ as a collective symbol for the propagators considered,
\[ |\partial^w \partial_\Lambda C^{\Lambda,\Lambda_0}(k)| \le \begin{cases} c_{|w|}\,\sigma_{2\Lambda}(k^2) & \text{for } 0 \le \Lambda \le m,\\[2pt] \Lambda^{-3-|w|}\,P_{|w|}\big(\tfrac{|k|}{\Lambda}\big)\,\sigma_\Lambda(k^2) & \text{for } \Lambda > m. \end{cases} \tag{23} \]
On the left-hand side $\partial^w$ denotes a $|w|$-fold partial momentum derivative (see below (39)). Moreover, the polynomials $P_{|w|}$ have nonnegative coefficients, which, as well as the constants $c_{|w|}$, depend on $\alpha, m, M, |w|$ only. Considering $\sigma_\Lambda(k^2)$, (20), as a function of $(\Lambda, k^2)$, it cannot be extended continuously to $(0, 0)$. We set $\sigma_0(0) := \lim_{k^2\to 0}\sigma_0(k^2) = 0$, and hence $\sigma_{0,\Lambda_0}(0) = \sigma_{\Lambda_0}(0) = 1$.

It is convenient to introduce a short collective notation for the various fields and their sources:

(i) We denote the bosonic fields and the corresponding sources, respectively, by
\[ \varphi_\tau = (A^a_\mu, h, B^a), \quad J_\tau = (j^a_\mu, s, b^a), \tag{24} \]
(ii) and all fields and their respective sources by
\[ \Phi = (\varphi_\tau, c^a, \bar c^a), \quad K = (J_\tau, \bar\eta^a, \eta^a). \tag{25} \]
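As a purely illustrative numerical sketch of the cutoff function (20) and of the regularizing factor in (22), the following Python is not part of the original analysis. The function names `sigma`, `sigma_flow` and `derivative_at_zero`, as well as the parameter values m = 1, M = 2, α = 1, are assumptions chosen only for this example. It evaluates σ_Λ(k²), implements the convention σ_0(0) := 0 stated above, and estimates the k²-derivative at k² = 0 by a finite difference, which should be negligibly small in line with (21).

```python
import math


def sigma(lam, k2, m=1.0, M=2.0, alpha=1.0):
    """Cutoff function (20) at flow parameter lam and momentum square k2.

    The default masses and gauge parameter are illustrative assumptions.
    """
    if lam == 0.0:
        # sigma_0(k^2) vanishes for k^2 > 0; the text sets sigma_0(0) := 0 too.
        return 0.0
    numerator = (k2 + m**2) * (k2 + alpha * m**2) * (k2 + M**2) * k2**2
    return math.exp(-numerator / lam**10)


def sigma_flow(lam, lam0, k2, m=1.0, M=2.0, alpha=1.0):
    """Factor sigma_{lam,lam0}(k^2) = sigma_{lam0}(k^2) - sigma_lam(k^2) of (22)."""
    return sigma(lam0, k2, m, M, alpha) - sigma(lam, k2, m, M, alpha)


def derivative_at_zero(lam, h=1e-6):
    """One-sided finite-difference estimate of d sigma / d k^2 at k^2 = 0."""
    return (sigma(lam, h) - sigma(lam, 0.0)) / h
```

With these definitions, `sigma_flow(lam0, lam0, k2)` vanishes identically, reflecting that the regularized propagators (22) vanish at Λ = Λ0.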
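The sources $\eta^a$ and $\bar\eta^a$ are Grassmann elements and have ghost number $+1$ and $-1$, respectively. In the sequel, we exclusively use left derivatives with respect to these quantities. The characteristic functional of the Gaussian product measure with the covariances $\hbar C^{\Lambda,\Lambda_0}$ from (22), (19) is then given by
\[ \int d\mu_{\Lambda,\Lambda_0}(\Phi)\, e^{\frac{1}{\hbar}\langle\Phi,K\rangle} = e^{\frac{1}{\hbar}P^{\Lambda,\Lambda_0}(K)}, \tag{26} \]
where
\[ \langle\Phi,K\rangle := \int dx\,\Big(\sum_\tau \varphi_\tau(x)J_\tau(x) + \bar c^a(x)\eta^a(x) + \bar\eta^a(x)c^a(x)\Big), \tag{27} \]
\[ P^{\Lambda,\Lambda_0}(K) = \frac{1}{2}\langle j^a_\mu, C^{\Lambda,\Lambda_0}_{\mu\nu}j^a_\nu\rangle + \frac{1}{2}\langle s, C^{\Lambda,\Lambda_0}s\rangle + \frac{1}{2}\langle b^a, S^{\Lambda,\Lambda_0}b^a\rangle - \langle\bar\eta^a, S^{\Lambda,\Lambda_0}\eta^a\rangle. \tag{28} \]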
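Aiming at a quantized descendant of the classical theory, we consider the generating functional $L^{\Lambda,\Lambda_0}(\Phi)$ of the connected amputated Schwinger functions (CAS)
\[ e^{-\frac{1}{\hbar}(L^{\Lambda,\Lambda_0}(\Phi) + I^{\Lambda,\Lambda_0})} = \int d\mu_{\Lambda,\Lambda_0}(\Phi')\, e^{-\frac{1}{\hbar}L^{\Lambda_0,\Lambda_0}(\Phi'+\Phi)}, \tag{29} \]
\[ L^{\Lambda,\Lambda_0}(0) = 0. \tag{30} \]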
The constant $I^{\Lambda,\Lambda_0}$ is the vacuum part of the theory, which is proportional to the volume because of translation invariance. One therefore has to consider the theory at first in a finite volume $\Omega \subset \mathbb{R}^4$. For details see [6]. Since the regularization necessarily violates the local gauge symmetry, the bare functional
\[ L^{\Lambda_0,\Lambda_0}(\Phi) = \int dx\, L_{int}(x) + L^{\Lambda_0,\Lambda_0}_{c.t.}(\Phi) \tag{31} \]
in a first stage has to be chosen sufficiently general in order to allow for the restoration of the Slavnov–Taylor identities at the end. Therefore, we add to the interaction part (12) of classical origin counter terms $L^{\Lambda_0,\Lambda_0}_{c.t.}$, which a priori include all local terms of mass dimension $\le 4$ permitted by the unbroken global symmetries, i.e. Euclidean O(4)-invariance and SO(3)-isosymmetry. There are 37 such terms, by definition all at least of order $O(\hbar)$. The general bare functional is presented in Appendix A. From (29) the corresponding flow equation follows upon differentiation with respect to the flow parameter $\Lambda$,
\[ \partial_\Lambda e^{-\frac{1}{\hbar}(L^{\Lambda,\Lambda_0}(\Phi) + I^{\Lambda,\Lambda_0})} = \hbar\,\dot\Delta_{\Lambda,\Lambda_0}\, e^{-\frac{1}{\hbar}(L^{\Lambda,\Lambda_0}(\Phi) + I^{\Lambda,\Lambda_0})}, \tag{32} \]
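where the right-hand side is obtained upon differentiating the Gaussian measure $d\mu_{\Lambda,\Lambda_0}(\Phi')$ and observing that the integrand is a function of $\Phi' + \Phi$. The “dot” appearing on the functional Laplace operator
\[ \Delta_{\Lambda,\Lambda_0} = \frac{1}{2}\Big\langle\frac{\delta}{\delta A^a_\mu}, C^{\Lambda,\Lambda_0}_{\mu\nu}\frac{\delta}{\delta A^a_\nu}\Big\rangle + \frac{1}{2}\Big\langle\frac{\delta}{\delta h}, C^{\Lambda,\Lambda_0}\frac{\delta}{\delta h}\Big\rangle + \frac{1}{2}\Big\langle\frac{\delta}{\delta B^a}, S^{\Lambda,\Lambda_0}\frac{\delta}{\delta B^a}\Big\rangle + \Big\langle\frac{\delta}{\delta c^a}, S^{\Lambda,\Lambda_0}\frac{\delta}{\delta\bar c^a}\Big\rangle \tag{33} \]
denotes differentiation with respect to $\Lambda$. Hence, we arrive at the flow equation
\begin{align}
\partial_\Lambda(L^{\Lambda,\Lambda_0}(\Phi) + I^{\Lambda,\Lambda_0}) ={}& \frac{\hbar}{2}\Big\{\sum_\tau\Big\langle\frac{\delta}{\delta\varphi_\tau}, \dot C^{\Lambda,\Lambda_0}_\tau\frac{\delta}{\delta\varphi_\tau}\Big\rangle + 2\Big\langle\frac{\delta}{\delta c^a}, \dot S^{\Lambda,\Lambda_0}\frac{\delta}{\delta\bar c^a}\Big\rangle\Big\} L^{\Lambda,\Lambda_0}(\Phi) \notag\\
& - \frac{1}{2}\sum_\tau\Big\langle\frac{\delta L^{\Lambda,\Lambda_0}}{\delta\varphi_\tau}, \dot C^{\Lambda,\Lambda_0}_\tau\frac{\delta L^{\Lambda,\Lambda_0}}{\delta\varphi_\tau}\Big\rangle - \Big\langle\frac{\delta L^{\Lambda,\Lambda_0}}{\delta c^a}, \dot S^{\Lambda,\Lambda_0}\frac{\delta L^{\Lambda,\Lambda_0}}{\delta\bar c^a}\Big\rangle. \tag{34}
\end{align}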
Since we restrict to perturbation theory, the generating functional will be considered within a formal loop expansion
\[ L^{\Lambda,\Lambda_0}(\Phi) = \sum_{l=0}^{\infty}\hbar^l L^{\Lambda,\Lambda_0}_l(\Phi). \tag{35} \]
Furthermore, decomposing into particular n-point Schwinger functions we use a multiindex $n$, the components of which denote the number of each source field species appearing:
\[ n = (n_A, n_h, n_B, n_{\bar c}, n_c), \quad |n| = n_A + n_h + n_B + n_{\bar c} + n_c. \tag{36} \]
Because of (12) there will not appear 1- and 2-point functions at the tree level ($l = 0$). If we do not regard the vacuum part, we can study the flow of the n-point functions in the infinite volume limit $\Omega \to \mathbb{R}^4$. Due to translation invariance, it is convenient to consider also the Fourier transformed source field $\hat\Phi$; the conventions used are
\[ \int_p := \int_{\mathbb{R}^4}\frac{d^4p}{(2\pi)^4}, \quad \Phi(x) = \int_p e^{ipx}\,\hat\Phi(p), \quad \delta_{\Phi(x)} := \frac{\delta}{\delta\Phi(x)} = (2\pi)^4\int_p e^{-ipx}\,\delta_{\hat\Phi(p)}. \tag{37} \]
Given these conventions, the momentum representation of the n-point function with multiindex $n$, (36), at loop order $l$ is obtained as an $|n|$-fold functional derivative
\[ (2\pi)^{4(|n|-1)}\,\delta^n_{\hat\Phi(p)} L^{\Lambda,\Lambda_0}_l(\Phi)\big|_{\Phi=0} = \delta(p_1 + \cdots + p_{|n|})\, L^{\Lambda,\Lambda_0}_{l,n}(p_1, \ldots, p_{|n|}). \tag{38} \]
For the sake of a slim appearance, the notation does not reveal how the momenta are assigned to the multiindex $n$, and in addition, the O(4)- and SO(3)-tensor structure remains hidden. By definition the n-point function is completely symmetric (antisymmetric) if the variables that belong to each of the bosonic (fermionic) species occurring are permuted. As momentum derivatives of n-point functions have to be considered, too, we also introduce the shorthand notation
\[ w = (w_{1,1}, \ldots, w_{n-1,4}), \quad \partial^w := \prod_{i=1}^{n-1}\prod_{\mu=1}^{4}\Big(\frac{\partial}{\partial p_{i,\mu}}\Big)^{w_{i,\mu}}, \quad w_{i,\mu} \in \mathbb{N}_0, \quad |w| = \sum_{i,\mu} w_{i,\mu}. \tag{39} \]
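The system of flow equations (FE) for the connected amputated Schwinger functions (CAS) then follows from (34), using (35), (38), and finally performing the momentum derivatives (39)
\begin{align}
\partial_\Lambda\partial^w L^{\Lambda,\Lambda_0}_{l,n}(p_1, \ldots, p_{|n|}) ={}& \sum_{n',\,|n'|=|n|+2} c_{n'-n}\int_k (\partial_\Lambda C^{\Lambda,\Lambda_0}(k))\,\partial^w L^{\Lambda,\Lambda_0}_{l-1,n'}(k, -k, p_1, \ldots, p_{|n|}) \notag\\
& - \sum_{\substack{l_1+l_2=l,\\ w_1+w_2+w_3=w}}\ \sum_{\substack{n_1, n_2,\\ |n_1|+|n_2|=|n|+2}} c_{\{w_i\}}\Big[c_{n_1,n_2}\,\partial^{w_1}L^{\Lambda,\Lambda_0}_{l_1,n_1}(p_1, \ldots, p_{|n_1|-1}, p') \notag\\
& \qquad\cdot(\partial^{w_3}\partial_\Lambda C^{\Lambda,\Lambda_0}(p'))\,\partial^{w_2}L^{\Lambda,\Lambda_0}_{l_2,n_2}(-p', \ldots, p_{|n|})\Big]_{s,a}. \tag{40}
\end{align}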
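The weights $c_{\{w_i\}}$ in the bilinear term of (40) arise from distributing the momentum derivative $\partial^w$ over three factors by the Leibniz rule, so they are multinomial coefficients taken componentwise in the multiindex $w$. The following Python sketch is an illustration only; the helper names `multi_factorial`, `leibniz_coefficient` and `decompositions` are hypothetical and not taken from the paper. It enumerates all splittings $w_1 + w_2 + w_3 = w$ and computes $w!/(w_1!\,w_2!\,w_3!)$ with $w! = \prod_{i,\mu} w_{i,\mu}!$.

```python
from itertools import product
from math import factorial, prod


def multi_factorial(w):
    """w! for a multiindex w, i.e. the product of the factorials of its components."""
    return prod(factorial(x) for x in w)


def leibniz_coefficient(w1, w2, w3):
    """Multinomial weight w! / (w1! w2! w3!) for the splitting w = w1 + w2 + w3."""
    w = tuple(a + b + c for a, b, c in zip(w1, w2, w3))
    denominator = multi_factorial(w1) * multi_factorial(w2) * multi_factorial(w3)
    return multi_factorial(w) // denominator


def decompositions(w):
    """Yield all triples (w1, w2, w3) of multiindices with w1 + w2 + w3 = w."""
    per_component = [
        [(a, b, x - a - b) for a in range(x + 1) for b in range(x - a + 1)]
        for x in w
    ]
    for choice in product(*per_component):
        if choice:
            w1, w2, w3 = (tuple(part) for part in zip(*choice))
        else:
            w1, w2, w3 = (), (), ()
        yield w1, w2, w3
```

As a consistency check, summing the weights over all splittings of a given $w$ gives $3^{|w|}$, the number of ways of assigning each single derivative to one of the three factors.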
The field assignment of the propagators $C^{\Lambda,\Lambda_0}$ on the right-hand side is not written; it is implicit in the multiindices $n', n_1, n_2$ related to $n$. In the linear term the integrated momentum $k$ refers to that of the fields from $n' - n$ and the factor $c_{n'-n}$ has the value 1/2 and 1 in the case of bosons and fermions, respectively. In the bilinear terms we have $-p' = p_1 + \cdots + p_{|n_1|-1}$. Furthermore the subscripts $s, a$ indicate a residual (anti)symmetrization according to the statistics of the various fields. Given a propagator with its attached fields appearing in the pair of CAS joined by this propagator, see (34), the residual (anti)symmetrization with respect to the $|n|$ external fields restricts to those permutations which exchange such fields between the pair of CAS while leaving the order of the fields within one CAS invariant.$^b$ The constants $c_{n_1,n_2}$ take the values 1/2 (for a boson propagator) or 1 (for a fermion propagator). The combinatoric coefficients $c_{\{w_i\}}$ stem from the Leibniz rule and have the values $c_{\{w_i\}} = \frac{w!}{w_1!\,w_2!\,w_3!}$, where $w! = \prod_{i,\mu} w_{i,\mu}!$.

$^b$ For details see [7, Eq. (2.28)].

To end up with Schwinger functions fulfilling the Slavnov–Taylor identities (STI), we have to consider Schwinger functions with a composite field inserted, too. Two kinds of such insertions have to be dealt with: local insertions implementing the BRS-variations, and a space-time integrated insertion representing the intermediate violation of the STI. The classical composite BRS-fields (13) all have mass dimension 2 and transform as vector-isovector, scalar-isoscalar, scalar-isovector and scalar-isovector, respectively. Moreover, the first three have ghost number 1, whereas the last one has ghost number 2. Hence, adding counterterms, we introduce the bare composite fields
\begin{align}
(\psi^a_\mu)^{0,\Lambda_0}(x) &= R^0_1\,\partial_\mu c^a(x) + R^0_2\, g\,\epsilon^{arb}A^r_\mu(x)c^b(x), \tag{41a}\\
(\psi)^{0,\Lambda_0}(x) &= -R^0_3\,\frac{1}{2}\,g\,B^a(x)c^a(x), \tag{41b}\\
(\psi^a)^{0,\Lambda_0}(x) &= R^0_4\, m\,c^a(x) + R^0_5\,\frac{1}{2}\,g\,h(x)c^a(x) + R^0_6\,\frac{1}{2}\,g\,\epsilon^{arb}B^r(x)c^b(x), \tag{41c}\\
(\Omega^a)^{0,\Lambda_0}(x) &= R^0_7\,\frac{1}{2}\,g\,\epsilon^{apq}c^p(x)c^q(x), \tag{41d}
\end{align}
keeping the notation from (13) but using it henceforth exclusively according to (41a)–(41d). We set
\[ R^0_i = 1 + O(\hbar), \tag{42} \]
thus viewing the counterterms again as formal power series in $\hbar$; the tree order $\hbar^0$ provides the classical terms (13). Observe that for $l > 0$ the field products appearing in the classical composite fields $\psi^a_\mu$ and $\psi^a$ of (13) do require $R^0_1$ and $R^0_4$, respectively, as counterterms. Moreover, it is important to note that the modified composite fields (41a)–(41d) remain invariant under the BRS-transformations (15) upon assuming the conditions
\[ R^0_6 = R^0_7 = R^0_2, \quad R^0_3 R^0_5 = (R^0_2)^2 \tag{43} \]
and employing the generalized composite fields (41a)–(41d) in place of the original ones, (13). To deal with Schwinger functions showing one insertion, the bare interaction (31) is modified adding the composite fields (41a)–(41d) coupled to corresponding sources
\[ \tilde L^{\Lambda_0,\Lambda_0}(\xi; \Phi) := L^{\Lambda_0,\Lambda_0}(\Phi) + L^{\Lambda_0,\Lambda_0}(\xi), \tag{44} \]
\[ L^{\Lambda_0,\Lambda_0}(\xi) = \int dx\,\{\gamma^a_\mu(x)\psi^a_\mu(x) + \gamma(x)\psi(x) + \gamma^a(x)\psi^a(x) + \omega^a(x)\Omega^a(x)\}. \tag{45} \]
According to the properties of these composite fields, the sources $\gamma^a_\mu, \gamma, \gamma^a$ are Grassmann elements, they all have canonical dimension 2 and ghost number $-1$, whereas $\omega^a$ has canonical dimension 2 and ghost number $-2$. For the insertions and their respective sources we also introduce a short collective notation
\[ \psi_\tau = (\psi^a_\mu, \psi, \psi^a), \quad \gamma_\tau = (\gamma^a_\mu, \gamma, \gamma^a), \quad \xi = (\gamma_\tau, \omega^a). \tag{46} \]
Using now (44) in place of $L^{\Lambda_0,\Lambda_0}$ as the bare action in the representation (29) provides the functional $\tilde L^{\Lambda,\Lambda_0}(\xi; \Phi)$, from which the generating functional of the regularized CAS with one insertion $\psi(x)$ follows as
\[ L^{\Lambda,\Lambda_0}_\gamma(x; \Phi) := \frac{\delta}{\delta\gamma(x)}\tilde L^{\Lambda,\Lambda_0}(\xi; \Phi)\Big|_{\xi=0}, \tag{47} \]
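and similarly for the other insertions from (45). In the infinite volume limit, and performing a Fourier transform of the insertion position we obtain
\[ \hat L^{\Lambda,\Lambda_0}_\gamma(q; \Phi) = \int dx\, e^{iqx} L^{\Lambda,\Lambda_0}_\gamma(x; \Phi). \tag{48} \]
After a loop expansion the n-point function with one insertion $\psi$ is obtained as
\[ \delta(q + p_1 + \cdots + p_{|n|})\, L^{\Lambda,\Lambda_0}_{\gamma;l,n}(q; p_1, \ldots, p_{|n|}) := (2\pi)^{4(|n|-1)}\,\delta^n_{\hat\Phi(p)}\hat L^{\Lambda,\Lambda_0}_{\gamma;l}(q; \Phi)\big|_{\Phi=0}, \tag{49} \]
and similarly as regards the other insertions. Starting from the analog of (34) for the modified generating functional $\tilde L^{\Lambda,\Lambda_0}(\xi; \Phi)$, which emerges from the bare action (44), and restricting to one insertion by the operation (47), leads to a linear flow equation for $L^{\Lambda,\Lambda_0}_\gamma(x; \Phi)$. Proceeding then as before in the derivation of (40), yields the system of differential FE for the CAS with one insertion $\psi$
\begin{align}
\partial_\Lambda\partial^w L^{\Lambda,\Lambda_0}_{\gamma;l,n}(q; p_1, \ldots, p_{|n|}) ={}& \sum_{n',\,|n'|=|n|+2} c_{n'-n}\int_k (\partial_\Lambda C^{\Lambda,\Lambda_0}(k))\,\partial^w L^{\Lambda,\Lambda_0}_{\gamma;l-1,n'}(q; k, -k, p_1, \ldots, p_{|n|}) \notag\\
& - \sum_{\substack{l_1+l_2=l,\\ w_1+w_2+w_3=w}}\ \sum_{\substack{n_1, n_2,\\ |n_1|+|n_2|=|n|+2}} c_{\{w_i\}}\Big[c^{(1)}_{n_1,n_2}\,\partial^{w_1}L^{\Lambda,\Lambda_0}_{\gamma;l_1,n_1}(q; p_1, \ldots, p_{|n_1|-1}, p') \notag\\
& \qquad\cdot(\partial^{w_3}\partial_\Lambda C^{\Lambda,\Lambda_0}(p'))\,\partial^{w_2}L^{\Lambda,\Lambda_0}_{l_2,n_2}(-p', \ldots, p_{|n|})\Big]_{s,a}. \tag{50}
\end{align}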
The notation is that of (40), with $-p' = q + p_1 + \cdots + p_{|n_1|-1}$, however. Since ghost and anti-ghost in (34) do not appear symmetrically, the $\bar c$ ($c$)-derivative appears once in $n_1$ ($n_2$) and once in $n_2$ ($n_1$). It is obvious that each of the other insertions (45) leads to a similar system of flow equations.

As will turn out in Sec. 4, the initial regularization, necessarily violating the STI, leads to a bare space-time integrated insertion of the form
\[ L^{\Lambda_0,\Lambda_0}_1(\Phi) = \int dx\, N(x), \quad N(x) = Q(x) + Q'(x; \Lambda_0^{-1}). \tag{51} \]
The individual terms of $N(x)$ involve at most five fields and have ghost number 1. Furthermore, $Q(x)$ is a local polynomial in the fields and their derivatives, having canonical mass dimension $D = 5$, whereas $Q'(x; \Lambda_0^{-1})$ is nonpolynomial in the field momenta but suppressed by powers of $\Lambda_0^{-1}$. To obtain the generating functional $L^{\Lambda,\Lambda_0}_1(\Phi)$ with one (bare) insertion (51) we can resort to the local case, considering the bare local insertion
\[ L^{\Lambda_0,\Lambda_0}(\varrho) = \int dx\,\varrho(x)N(x) \tag{52} \]
and proceed as before. Observing (47), (48) we obtain
\[ L^{\Lambda,\Lambda_0}_1(\Phi) = \int dx\,\frac{\delta}{\delta\varrho(x)}\tilde L^{\Lambda,\Lambda_0}(\varrho; \Phi)\Big|_{\varrho=0} = \int dx\, L^{\Lambda,\Lambda_0}_\varrho(x; \Phi) = \hat L^{\Lambda,\Lambda_0}_\varrho(0; \Phi). \tag{53} \]
Performing again a loop expansion, the CAS n-point function with one insertion (51) is obtained as
\[ \delta(p_1 + \cdots + p_{|n|})\, L^{\Lambda,\Lambda_0}_{1;l,n}(p_1, \ldots, p_{|n|}) := (2\pi)^{4(|n|-1)}\,\delta^n_{\hat\Phi(p)} L^{\Lambda,\Lambda_0}_{1;l}(\Phi)\big|_{\Phi=0}. \tag{54} \]
These CAS again satisfy a system of linear FE. According to the preceding treatment of the integrated insertion we only have to take (50) at the fixed momentum value $q = 0$ of the insertion, and then replace each symbol $L^{\Lambda,\Lambda_0}_{\gamma;l,n}(0; \cdots)$ by the new symbol $L^{\Lambda,\Lambda_0}_{1;l,n}(\cdots)$.

Polchinski realized that the flow equations (40) open the way for a simple inductive proof of renormalizability. The mathematical proof was carried through in [4], further simplifying Polchinski’s argument. The FE for composite operators (50) were introduced and analyzed in [3]. For a recent presentation see [7]. The analysis of the STI, however, as will be shown in Sec. 4, requires tracing in the perturbative expansion the effect of the super-renormalizable three-point couplings present in the interaction. To this end we scale in the tree-level part (12) of (31) the mass parameters appearing in the three-point couplings, as well as in the BRS-insertions the part proportional to $m$, see (41c), by a common factor of $\lambda > 0$:
\[ m \to \lambda m, \quad M \to \lambda M. \tag{55} \]
Note however that we do not scale the mass parameters which are present in the regularized propagators appearing in the flow equations. All CAS will then depend smoothly on $\lambda$, and we expand them as
\[ L^{\Lambda,\Lambda_0}_{l,n}(\lambda; \vec p\,) = \sum_{\nu=0}^{\infty}(m\lambda)^\nu L^{(\nu),\Lambda,\Lambda_0}_{l,n}(\vec p\,), \quad \vec p = (p_1, \ldots, p_{|n|}), \tag{56} \]
\[ L^{\Lambda,\Lambda_0}_{\gamma;l,n}(\lambda; q; \vec p\,) = \sum_{\nu=0}^{\infty}(m\lambda)^\nu L^{(\nu),\Lambda,\Lambda_0}_{\gamma;l,n}(q; \vec p\,), \tag{57} \]
where for suitable renormalization schemes$^c$ the sum is finite, its size depending on $l$ and $n$, as will be shown below. We adopt the following

Renormalization scheme. Relevant terms are those which satisfy
$|n| + |w| + \nu \le 4$ in case of the functional $L^{\Lambda,\Lambda_0}$,
$|n| + |w| + \nu \le 2$ in case of $L^{\Lambda,\Lambda_0}_\gamma$,
in agreement with the bounds to be derived below.

$^c$ i.e. schemes which do not explicitly introduce infinite sums in the boundary terms at $\Lambda = 0$ by hand.
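The power-counting criterion of the renormalization scheme can be restated as a small classification routine. The Python below is a minimal sketch under the assumption that a term is labeled only by its multiindex $n$ of (36), the number $|w|$ of momentum derivatives and the power $\nu$ of the scaled mass; the names `MultiIndex` and `is_relevant` are hypothetical and introduced here purely for illustration.

```python
from typing import NamedTuple


class MultiIndex(NamedTuple):
    """Multiindex n = (n_A, n_h, n_B, n_cbar, n_c) of (36)."""
    n_A: int = 0
    n_h: int = 0
    n_B: int = 0
    n_cbar: int = 0
    n_c: int = 0

    @property
    def total(self) -> int:
        """|n|, the total number of fields."""
        return sum(self)


def is_relevant(n: MultiIndex, w_abs: int, nu: int, insertion: bool = False) -> bool:
    """Relevance criterion of the scheme.

    |n| + |w| + nu <= 4 without insertion, <= 2 with one local BRS insertion.
    """
    bound = 2 if insertion else 4
    return n.total + w_abs + nu <= bound
```

For instance, a four-point function without derivatives and without mass factor is relevant, whereas a three-point function with $|w| = 1$ and $\nu = 1$ is already irrelevant in this counting.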
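At tree level we then have$^d$
\[ (\partial^w L^{(\nu),\Lambda,\Lambda_0}_{0,n})(\vec 0\,) = 0, \quad \text{if } |n| + |w| + \nu < 4. \tag{58} \]
For $l \ge 1$, we use renormalization conditions on the relevant terms as follows: we impose
\[ (\partial^w L^{(\nu),0,\Lambda_0}_{l,n})(\vec 0\,) \overset{!}{=} 0, \quad \text{if } |n| + |w| + \nu < 4, \tag{59} \]
whereas if $|n| + |w| + \nu = 4$, on the right-hand side a free constant $r^{(\nu)}_{l,n}$ can be chosen. Correspondingly, in the case of an insertion, we have at the tree level
\[ (\partial^w L^{(\nu),\Lambda,\Lambda_0}_{\gamma;0,n})(0; \vec 0\,) = 0, \quad \text{if } |n| + |w| + \nu < 2, \tag{60} \]
and employ renormalization conditions
\[ (\partial^w L^{(\nu),0,\Lambda_0}_{\gamma;l,n})(0; \vec 0\,) \overset{!}{=} 0, \quad \text{if } |n| + |w| + \nu < 2, \tag{61} \]
but if $|n| + |w| + \nu = 2$, on the right-hand side again a free constant can be chosen. Because of the expansions (56) and (57) the FE (40) and (50) have to be adjusted, attributing a superscript $(\nu)$ to the CAS and summing over $\nu_1 + \nu_2 = \nu$, in complete analogy to the loop index $l$. Using these extended FE the following bounds can be deduced.

Proposition 1. Let $l \in \mathbb{N}_0$ and $0 \le \Lambda \le \Lambda_0$, then
\[ |\partial^w L^{(\nu),\Lambda,\Lambda_0}_{l,n}(\vec p\,)| \le (\Lambda + m)^{4-|n|-|w|-\nu}\, P_1\Big(\log\frac{\Lambda + m}{m}\Big)\, P_2\Big(\frac{|\vec p\,|}{\Lambda + m}\Big), \tag{62} \]
\[ |\partial^w L^{(\nu),\Lambda,\Lambda_0}_{\gamma;l,n}(q; \vec p\,)| \le (\Lambda + m)^{2-|n|-|w|-\nu}\, P_1\Big(\log\frac{\Lambda + m}{m}\Big)\, P_2\Big(\frac{|q, \vec p\,|}{\Lambda + m}\Big). \tag{63} \]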
In these bounds $P_i$, $i = 1, 2$, denote (each time they appear possibly new) polynomials with nonnegative coefficients independent of $\Lambda, \Lambda_0, \vec p, q, m$. The coefficients may depend on $n, l, \nu, w$, and the other free parameters of the theory $\alpha, M/m, g$. These bounds are uniform in $\Lambda_0$. The proof is solely based on power counting for renormalizable theories; it does not involve the symmetry structure of the Yang–Mills theory.

$^d$ Notice that for $l = 0$ there are no CAS with $|n| \le 2$.

Proof. To prove (62) one proceeds by induction as follows: ascending in $N := 2l + |n|$, for given $N$ ascending in $l$, for given $N, l$ ascending in $\nu$, and for given $N, l, \nu$ descending in $|w|$. Given $n$, the irrelevant cases $|n| + |w| + \nu > 4$ are treated first, integrating from the initial point $\Lambda = \Lambda_0$ “downwards” with initial conditions equal to zero. In contrast, the relevant ones, i.e. $|n| + |w| + \nu \le 4$, choosing the particular momentum value $\vec p = 0$, are integrated from the initial point $\Lambda = 0$ “upwards” with initial conditions (59) and the remaining ones chosen freely; hereafter this result has to be extended to general $\vec p$ via the Taylor formula
\[ f(\vec p\,) = f(\vec 0\,) + \vec p\cdot\int_0^1 (\partial f)(t\vec p\,)\,dt. \]
Descending in $|w|$, the integrand in the respective remainder of the Taylor extension has already been bounded previously. A derivative by induction provides another factor of $(\Lambda + m)^{-1}$, which can be combined with the momentum factor of the remainder to increase the degree of the bounding polynomial. A key to this induction is the property that in the tree order there are no CAS with $|n| \le 2$. Bounding the linear term$^e$ of the FE
\begin{align}
\Big|\sum_{n',\,|n'|=|n|+2} c_{n'-n}&\int_k (\partial_\Lambda C^{\Lambda,\Lambda_0}(k))\,\partial^w L^{(\nu),\Lambda,\Lambda_0}_{l-1,n'}(k, -k, \vec p\,)\Big| \notag\\
&\le \Lambda\sum_{n',\,|n'|=|n|+2}\int_{k'} |\Lambda^3\partial_\Lambda C^{\Lambda,\Lambda_0}(\Lambda k')|\,|\partial^w L^{(\nu),\Lambda,\Lambda_0}_{l-1,n'}(\Lambda k', -\Lambda k', \vec p\,)| \notag\\
&\le \Lambda\sum_{n',\,|n'|=|n|+2}(\Lambda + m)^{4-|n'|-|w|-\nu}\, P_1\Big(\log\frac{\Lambda + m}{m}\Big)\, P_2\Big(\frac{|\vec p\,|}{\Lambda + m}\Big) \notag\\
&\le (\Lambda + m)^{4-|n|-|w|-\nu-1}\, P_3\Big(\log\frac{\Lambda + m}{m}\Big)\, P_4\Big(\frac{|\vec p\,|}{\Lambda + m}\Big), \notag
\end{align}
after a change of the integration variable $k = \Lambda k'$ one uses the bounds (23) and (62) and then performs the $k'$-integration. The proof of (63) is analogous to the proof of (62): One has to observe the inherent demarcation between relevant and irrelevant, and to employ the bound (62) required to treat the bilinear term on the right-hand side of the FE (50).

Our renormalization scheme implies
\[ L^{(\nu),\Lambda,\Lambda_0}_{l,n}(\vec p\,) \equiv 0, \ \text{if } \nu > 2l + |n| - 2, \qquad L^{(\nu),\Lambda,\Lambda_0}_{\gamma;l,n}(q, \vec p\,) \equiv 0, \ \text{if } \nu > 2l + |n| - 1. \tag{64} \]
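The thresholds in (64) can be cross-checked against elementary graph counting. The Python sketch below is illustrative only; the function names are hypothetical. It encodes the two thresholds and, as an independent check, solves the standard relations for a connected graph built exclusively from trivalent vertices, $3V = 2I + |n|$ and $l = I - V + 1$ (with $V$ vertices and $I$ internal lines), whose solution is $V = 2l + |n| - 2$.

```python
def max_nu(l, n_abs, insertion=False):
    """Largest power nu not excluded by (64) at loop order l with |n| fields."""
    return 2 * l + n_abs - (1 if insertion else 2)


def vanishes_by_scheme(nu, l, n_abs, insertion=False):
    """True if (64) forces the coefficient of (m*lambda)**nu to vanish."""
    return nu > max_nu(l, n_abs, insertion)


def trivalent_vertex_count(l, n_abs):
    """Number of vertices of a connected graph with only trivalent vertices.

    Solves 3V = 2I + |n| together with l = I - V + 1 by direct search;
    returns None if no such graph exists.
    """
    for v in range(2 * l + n_abs + 1):
        internal = l + v - 1
        if internal >= 0 and 3 * v == 2 * internal + n_abs:
            return v
    return None
```

In this counting every trivalent tree-level vertex contributes at most one factor of the scaled mass, and a BRS-insertion at most one more, which matches the shift by one between the two thresholds.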
These statements follow inductively from the FE, once they hold for the terms fixed by the boundary conditions. Note that the first of these relations can be understood in terms of Feynman graphs as following from the upper bound on the number of trivalent vertices at a given loop-order. The second one takes into account additionally that the BRS-insertions (41c) also include one factor of $m$.

To also prove convergence for $\Lambda_0 \to \infty$ (which is plausible noticing the uniformity of the bounds (62), (63) and the nonoscillatory nature of the quantities we consider) one has to analyze the FE, differentiated with respect to $\Lambda_0$, using the same inductive technique. It is then possible to prove [7] that
\[ |\partial_{\Lambda_0}\partial^w L^{(\nu),\Lambda,\Lambda_0}_{l,n}(\vec p\,)| \le \Lambda_0^{-2}(\Lambda + m)^{5-|n|-|w|-\nu}\, P_1\Big(\log\frac{\Lambda_0}{m}\Big)\, P_2\Big(\frac{|\vec p\,|}{\Lambda + m}\Big), \tag{65} \]
\[ |\partial_{\Lambda_0}\partial^w L^{(\nu),\Lambda,\Lambda_0}_{\gamma;l,n}(q; \vec p\,)| \le \Lambda_0^{-2}(\Lambda + m)^{3-|n|-|w|-\nu}\, P_1\Big(\log\frac{\Lambda_0}{m}\Big)\, P_2\Big(\frac{|q, \vec p\,|}{\Lambda + m}\Big), \tag{66} \]
for $\Lambda_0$ large enough. Herefrom we can infer the existence of the limits $\Lambda_0 \to \infty$ at fixed value of $\Lambda$.

$^e$ This term generates a new loop.

3.2. The flow equations for the proper vertex functions

Our analysis of the Slavnov–Taylor identities (STI) and the proof of their restoration will be based on a representation in terms of proper vertex functions (1PI), since the extraction of relevant parts from the STI is simpler and more transparent in terms of those than in terms of the CAS. To present their relation with the CAS considered so far, we introduce the shorthand notation
\[ \tilde L(\xi; \Phi) := \tilde L^{\Lambda,\Lambda_0}(\xi; \varphi_\tau, c, \bar c), \quad C_\tau := C^{\Lambda,\Lambda_0}_\tau, \quad S := S^{\Lambda,\Lambda_0}, \tag{67} \]
for the generating functional of the CAS with insertion (45) and for the regularized propagators. From $\tilde L(\xi; \Phi)$ we define the “classical fields” $\underline\Phi \equiv (\underline\varphi_\tau, \underline c, \underline{\bar c})$ by
\begin{align}
\underline\varphi_\tau(x) &= \varphi_\tau(x) - \int dy\, C_\tau(x - y)\frac{\delta\tilde L(\xi; \Phi)}{\delta\varphi_\tau(y)}, \notag\\
\underline c^a(x) &= c^a(x) + \int dy\, S(x - y)\frac{\delta\tilde L(\xi; \Phi)}{\delta\bar c^a(y)}, \tag{68}\\
\underline{\bar c}^a(x) &= \bar c^a(x) - \int dy\, S(x - y)\frac{\delta\tilde L(\xi; \Phi)}{\delta c^a(y)}. \notag
\end{align}
The generating functional of the proper vertex functions $\tilde\Gamma(\xi; \underline\Phi) \equiv \tilde\Gamma^{\Lambda,\Lambda_0}(\xi; \underline\varphi_\tau, \underline c, \underline{\bar c})$ is then given by the transform$^f$
\begin{align}
\tilde\Gamma(\xi; \underline\Phi) ={}& \tilde L(\xi; \Phi) - \frac{1}{2}\sum_\tau\langle\varphi_\tau, C^{-1}_\tau\varphi_\tau\rangle + \langle\bar c, S^{-1}c\rangle \notag\\
& + \sum_\tau\langle\underline\varphi_\tau, C^{-1}_\tau\varphi_\tau\rangle - \langle\underline{\bar c}, S^{-1}c\rangle - \langle\bar c, S^{-1}\underline c\rangle, \tag{69}
\end{align}
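with $\Phi = \Phi(\underline\Phi)$ on the right-hand side, according to (68). Since we are only interested in the kernels to be derived from the generating functional $\Gamma$ we may always assume the field variables to be sufficiently regular so that the application of the inverted regularized propagators makes sense. By functional derivation we deduce the relations
\begin{align}
\frac{\delta\tilde\Gamma(\xi; \underline\Phi)}{\delta\underline\varphi_\tau(x)} &= \int dy\, C^{-1}_\tau(x - y)\varphi_\tau(y), \notag\\
\frac{\delta\tilde\Gamma(\xi; \underline\Phi)}{\delta\underline c^a(x)} &= \int dy\, S^{-1}(x - y)\bar c^a(y), \tag{70}\\
\frac{\delta\tilde\Gamma(\xi; \underline\Phi)}{\delta\underline{\bar c}^a(x)} &= -\int dy\, S^{-1}(x - y)c^a(y), \notag
\end{align}
forming the inverse of the relations (68). Moreover, acting on the “classical fields” (68) with the respective inverse propagators $C^{-1}_\tau$ and $S^{-1}$, and then using (70), provides the crucial relations between the generating functionals $\tilde L(\xi; \Phi)$ and $\tilde\Gamma(\xi; \underline\Phi)$
\begin{align}
(2\pi)^{-4}C^{-1}_\tau(p)\,\underline\varphi_\tau(-p) &= \frac{\delta\tilde\Gamma(\xi; \underline\Phi)}{\delta\underline\varphi_\tau(p)} - \frac{\delta\tilde L(\xi; \Phi)}{\delta\varphi_\tau(p)}, \notag\\
(2\pi)^{-4}S^{-1}(p)\,\underline c^a(-p) &= -\frac{\delta\tilde\Gamma(\xi; \underline\Phi)}{\delta\underline{\bar c}^a(p)} + \frac{\delta\tilde L(\xi; \Phi)}{\delta\bar c^a(p)}, \tag{71}\\
(2\pi)^{-4}S^{-1}(p)\,\underline{\bar c}^a(-p) &= \frac{\delta\tilde\Gamma(\xi; \underline\Phi)}{\delta\underline c^a(p)} - \frac{\delta\tilde L(\xi; \Phi)}{\delta c^a(p)}, \notag
\end{align}
written in terms of Fourier transformed fields. Functional derivation of (69) with respect to the source $\gamma(x)$ at fixed $\underline\Phi$ leads to
\[ \frac{\delta\tilde\Gamma(\xi; \underline\Phi)}{\delta\gamma(x)}\Big|_{\xi=0} = \frac{\delta\tilde L(\xi; \Phi)}{\delta\gamma(x)}\Big|_{\xi=0}, \tag{72} \]

$^f$ This transform corresponds to the familiar Legendre transform of the connected (non-amputated) Schwinger functions.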
and to analogous equations as regards the other sources $\gamma^a_\mu, \gamma^a, \omega^a$. Restricting again to perturbation theory we consider the proper vertex functions which correspond to the various types of CAS dealt with up to now. Hence, we define proper vertex functions without insertion, with one local insertion as in (47), (48), and with a global one as in (53), keeping the same notations. Since by definition $\tilde\Gamma(\xi; \underline\Phi)$ has no vacuum part, we can extend to infinite volume and use Fourier transformed “classical fields” (68), with the conventions (37) (but omitting the “hat” by abuse of notation). Hence, from the generating functionals $\Gamma^{\Lambda,\Lambda_0}_l$, $\Gamma^{\Lambda,\Lambda_0}_{\gamma;l}$, $\Gamma^{\Lambda,\Lambda_0}_{1;l}$ we obtain the corresponding n-point proper vertex functions of loop order $l$ in analogy with (38), (49), (54),
\[ (2\pi)^{4(|n|-1)}\,\delta^n_{\underline\Phi(p)}\Gamma^{\Lambda,\Lambda_0}_l(\underline\Phi)\big|_{\underline\Phi\equiv 0} = \delta(p_1 + \cdots + p_{|n|})\,\Gamma^{\Lambda,\Lambda_0}_{l,n}(p_1, \ldots, p_{|n|}), \tag{73} \]
\[ (2\pi)^{4(|n|-1)}\,\delta^n_{\underline\Phi(p)}\Gamma^{\Lambda,\Lambda_0}_{\gamma;l}(q; \underline\Phi)\big|_{\underline\Phi\equiv 0} = \delta(q + p_1 + \cdots + p_{|n|})\,\Gamma^{\Lambda,\Lambda_0}_{\gamma;l,n}(q; p_1, \ldots, p_{|n|}), \tag{74} \]
\[ (2\pi)^{4(|n|-1)}\,\delta^n_{\underline\Phi(p)}\Gamma^{\Lambda,\Lambda_0}_{1;l}(\underline\Phi)\big|_{\underline\Phi\equiv 0} = \delta(p_1 + \cdots + p_{|n|})\,\Gamma^{\Lambda,\Lambda_0}_{1;l,n}(p_1, \ldots, p_{|n|}). \tag{75} \]
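The interplay of the relations (68), (69) and (70) can be made concrete in a deliberately oversimplified toy setting. The Python below is a sketch under strong assumptions that are not part of the paper: a single bosonic variable in zero dimensions, a constant "propagator" `C`, and a made-up polynomial `L` with arbitrary couplings. In this setting (68) reduces to $\underline\varphi = \varphi - C\,L'(\varphi)$, the bosonic part of (69) to $\Gamma(\underline\varphi) = L(\varphi) - \varphi^2/(2C) + \underline\varphi\,\varphi/C$, and (70) to $d\Gamma/d\underline\varphi = \varphi/C$, which the code checks by finite differences.

```python
def L(phi, g3=0.3, g4=0.1):
    """Toy stand-in for the generating functional L (couplings are arbitrary)."""
    return g3 * phi**3 + g4 * phi**4


def dL(phi, g3=0.3, g4=0.1):
    """Derivative of the toy functional with respect to phi."""
    return 3 * g3 * phi**2 + 4 * g4 * phi**3


def classical_field(phi, C):
    """Zero-dimensional analogue of (68): underlined phi as a function of phi."""
    return phi - C * dL(phi)


def gamma(phi, C):
    """Zero-dimensional analogue of (69), parametrized by phi."""
    u = classical_field(phi, C)
    return L(phi) - phi**2 / (2 * C) + u * phi / C


def dgamma_du(phi, C, h=1e-6):
    """Central-difference estimate of d Gamma / d (underlined phi)."""
    du = classical_field(phi + h, C) - classical_field(phi - h, C)
    dg = gamma(phi + h, C) - gamma(phi - h, C)
    return dg / du
```

In this toy model `dgamma_du(phi, C)` agrees with `phi / C` up to discretization error wherever the map from `phi` to the classical field is locally invertible, mirroring the statement that (70) inverts (68).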
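The FE for the $\tilde L$-functional implies a corresponding flow equation for the proper vertex functional $\tilde\Gamma$. Performing the $\Lambda$-derivative of the transform (69)$^g$ and observing that the classical fields $\underline\Phi$, (68), themselves depend on $\Lambda$ due to (70), eventually yields
\begin{align}
(\partial_\Lambda\tilde\Gamma)(\xi; \underline\Phi) ={}& \partial_\Lambda\tilde L(\xi; \Phi) - \frac{1}{2}\sum_\tau\langle\varphi_\tau, \partial_\Lambda C^{-1}_\tau\varphi_\tau\rangle + \langle\bar c, \partial_\Lambda S^{-1}c\rangle \notag\\
& + \sum_\tau\langle\underline\varphi_\tau, \partial_\Lambda C^{-1}_\tau\varphi_\tau\rangle - \langle\underline{\bar c}, \partial_\Lambda S^{-1}c\rangle - \langle\bar c, \partial_\Lambda S^{-1}\underline c\rangle, \tag{76}
\end{align}
where $(\partial_\Lambda\tilde\Gamma)$ denotes the derivative of the functional $\tilde\Gamma$ itself. Inserting now the flow equation for $\tilde L(\xi; \Phi)$, which has the same form as (34), and eliminating in its bilinear terms the functionals $\delta_\Phi\tilde L$ using the Eqs. (68), provides the flow equation of the vertex functional
\[ (\partial_\Lambda\tilde\Gamma)(\xi; \underline\Phi) + (\partial_\Lambda\tilde I)(\xi) - \frac{1}{2}\sum_\tau\langle\underline\varphi_\tau, \partial_\Lambda C^{-1}_\tau\underline\varphi_\tau\rangle + \langle\underline{\bar c}, \partial_\Lambda S^{-1}\underline c\rangle = \hbar\,\dot\Delta_{\Lambda,\Lambda_0}\tilde L(\xi; \Phi), \tag{77} \]
where one should remember the dependence on the parameters $\Lambda, \Lambda_0$ from (67) and the definition (33). At this stage the fields $\underline\Phi$ can be considered as autonomous (test) functions of the functional $\tilde\Gamma$, not depending on $\Lambda$. On the left-hand side the second term is the vacuum part, since $\tilde\Gamma(\xi; 0) = 0$, and the subsequent terms subtract the (regularized) two-point tree order from $(\partial_\Lambda\tilde\Gamma)(\xi; \underline\Phi)$. The resulting functional still has to be expressed in terms of proper vertex functions. Performing a loop expansion and functional derivatives with respect to the fields we obtain from (77) for $|n| \ge 1$
\[ \delta^n_{\underline\Phi}\big|_{\underline\Phi\equiv 0}:\quad (\partial_\Lambda\tilde\Gamma_l)(\xi; \underline\Phi) = \dot\Delta_{\Lambda,\Lambda_0}\tilde L_{l-1}(\xi; \Phi), \quad l \ge 1. \tag{78} \]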
Since the vacuum part has disappeared we can now pass to the infinite volume limit. On the right-hand side the functional $\tilde L_{l-1}(\xi; \Phi)$ is first acted upon by two particular $\Phi$-derivatives from the functional Laplace operator, then followed by an n-fold functional derivative with respect to (the classical field) $\underline\Phi$. The resulting object has to be expressed in terms of proper vertex functions. To determine this right-hand side of (78) we use the crucial relation (71) relating the L-functional to the Γ-functional.$^h$ The presence of various fields increases the combinatorial complexity. To indicate the procedure we employ a collective notation.

$^g$ Again to be viewed on finite volume before passing to correlation functions.

$^h$ There is a closed expression for the right-hand side of the FE for proper vertex functions in terms of the functional inverse of the second functional derivative of the generating functional $\Gamma$, see, e.g., [7, Eq. (3.49)]. This functional inverse, however, would still have to be evaluated within a systematic perturbative treatment.

We perform a $\underline\Phi$-derivative of the relation (71), as regards $\tilde L(\xi; \Phi)$ via the chain rule together
with (70), and hereafter consider the outcome within a loop expansion,i ˜ l (ξ; Φ) δ2Γ δ(p + q)δl,0 = 4 (2π) CΦ,Φ (p) δΦ(p)δΦ (q) 8 − (2π) Φ l1 +l2 =l
k
˜ l1 (ξ; Φ) ˜ l2 (ξ; Φ) δ2Γ δ2 L Φ (k) . C Φ δΦ(p)δΦ (k) δΦ (−k)δΦ (q) (79)
This identity forms the point of departure to relate successively n-point functions of the L- and the Γ-functional. We have to deal with it in the case without insertion, setting ξ ≡ 0, as well as in the case of one local insertion. In the latter one, (79) has to be derived with respect to the source at zero source, cf. (47), (48). By this operation, both the L-functional with and without insertion appear, 0 0 0 δ 2 LΛ,Λ δ 2 ΓΛ,Λ (Φ) δ 2 ΓΛ,Λ γ;l (q; Φ) γ;l1 (q; Φ) Λ,Λ0 l2 8 C Φ (k) = (2π) Φ (k) δΦ(p)δΦ δΦ(p)δΦ (p ) δΦ (−k)δΦ (p ) k Φ l1 +l2 =l
+ k
0 0 δ 2 ΓΛ,Λ δ 2 LΛ,Λ (Φ) Λ,Λ0 γ;l2 (q; Φ) l1 C (k) δΦ(p)δΦ (k) Φ Φ δΦ (−k)δΦ (p )
.
(80)
Taking (80) at momentum $q = 0$ and replacing the subscript $\gamma$ by the subscript 1 provides the relation in the case of the integrated insertion. From (79) without insertion, considered at loop order $l = 0$ and at $\Phi = \underline\Phi \equiv 0$, follows in the first step, because of the key property $L^{\Lambda,\Lambda_0}_{0,n}(k, -k) \equiv 0$, if $|n| = 2$,
\[ 1 = C^{\Lambda,\Lambda_0}_{\Phi,\Phi'}(p)\,\Gamma^{\Lambda,\Lambda_0}_{0,n}(p, -p), \quad n \mathrel{\hat=} (\Phi, \Phi'). \tag{81} \]
Before returning to the flow equation we note that in order to obtain from (79) with $\xi \equiv 0$ or from (80) the relation between the various n-point functions of the L- and the Γ-functional, we have to act upon these equations repeatedly by $\underline\Phi$-derivation, to be performed on the L-functional via the chain rule. The chain rule derivatives $\delta\Phi/\delta\underline\Phi$ can be read from (70). In particular, on account of the propagators $C^{\Lambda,\Lambda_0}(k)$ vanishing at $\Lambda = \Lambda_0$ and observing (81), one realizes, ascending with $|n|$,
\[ \Gamma^{\Lambda_0,\Lambda_0}_{l,n}(p_1, \ldots, p_{|n|}) = L^{\Lambda_0,\Lambda_0}_{l,n}(p_1, \ldots, p_{|n|}), \quad (l, |n|) \neq (0, 2), \tag{82} \]
\[ \Gamma^{\Lambda_0,\Lambda_0}_{1;l,n}(p_1, \ldots, p_{|n|}) = L^{\Lambda_0,\Lambda_0}_{1;l,n}(p_1, \ldots, p_{|n|}), \tag{83} \]
and similarly in the case of the local insertions.

$^i$ Here Φ is determined by Φ, cf. (17).

We now return to the FE (78) and first treat the case without insertion, thus we set there $\xi \equiv 0$. Performing in addition the momentum derivatives (39) we obtain the system, for $|n| \ge 1$, $(l, |n|) \neq (0, 2)$,
\[ \partial_\Lambda\partial^w\Gamma^{\Lambda,\Lambda_0}_{l,n}(p_1, \ldots, p_{|n|}) = \frac{1}{2}\sum_{|n'|=|n|+2}\int_k (\partial_\Lambda C^{\Lambda,\Lambda_0}(k))\,\partial^w\mathcal{L}^{\Lambda,\Lambda_0}_{l-1,n'}(k, -k; p_1, \ldots, p_{|n|}). \tag{84} \]
The summation extends over the various propagators as stated in (79), not distinguished here notationally; the corresponding pair of fields together with $n$ determine $n'$. Moreover, the momentum derivative $\partial^w$ concerns the momenta $p_1, \ldots, p_{|n|}$ of the configuration $n$. To generate the functions on the right-hand side of (84) we have to act on (79), after setting $\xi \equiv 0$, with $\delta^n_{\underline\Phi}|_{\underline\Phi\equiv 0}$, and these derivatives are directly applied on the L-functional. Hence the functions $\mathcal{L}^{\Lambda,\Lambda_0}_{l,n}$ in (84), differing from the CAS $L^{\Lambda,\Lambda_0}_{l,n}$. The vanishing 2-point CAS in the tree order, together with its correspondence (81), then allow to express inductively the functions $\mathcal{L}^{\Lambda,\Lambda_0}_{l,n}$ on the right-hand side of (84) in terms of proper vertex functions, ascending in $l$, and for fixed $l$ ascending in $|n|$. The right-hand side of (84) then emerges in the form
\[ \mathcal{L}^{\Lambda,\Lambda_0}_{l-1,n'}(k, -k; p_1, \ldots, p_{|n|}) = \Gamma^{\Lambda,\Lambda_0}_{l-1,n'}(k, -k, p_1, \ldots, p_{|n|}) + \cdots, \tag{85} \]
where the dots represent chains ΓCΓ and higher iterations, formed of proper vertex functions $\Gamma^{\Lambda,\Lambda_0}_{l',n''}$ with $(l', n'')$ prior to $(l - 1, n')$, joined via (free) propagators.

In the case of one local insertion, Eq. (78) has to be differentiated with respect to the source at zero source, cf. (47), (48). Performing again the momentum derivation leads to the system of flow equations for proper vertex functions with one local insertion, $|n| \ge 1$,
\[ \partial_\Lambda\partial^w\Gamma^{\Lambda,\Lambda_0}_{\gamma;l,n}(q; p_1, \ldots, p_{|n|}) = \frac{1}{2}\sum_{|n'|=|n|+2}\int_k (\partial_\Lambda C^{\Lambda,\Lambda_0}(k))\,\partial^w\mathcal{L}^{\Lambda,\Lambda_0}_{\gamma;l-1,n'}(q; k, -k; p_1, \ldots, p_{|n|}). \tag{86} \]
The right-hand side of (86) is obtained in complete analogy to the case without insertion; it is extracted inductively from (80) in place of (79). By this operation, both the L-functions with and without insertion appear. Proceeding inductively as before, and using the already determined L-functions without insertion, provides the function on the right-hand side of the system (86), as
\[ \mathcal{L}^{\Lambda,\Lambda_0}_{\gamma;l-1,n'}(q; k, -k; p_1, \ldots, p_{|n|}) = \Gamma^{\Lambda,\Lambda_0}_{\gamma;l-1,n'}(q; k, -k, p_1, \ldots, p_{|n|}) + \cdots, \tag{87} \]
where the dots again represent a sum of chains, each of which contains exactly one inserted factor $\Gamma^{\Lambda,\Lambda_0}_{\gamma;l',n''}$, which has already been determined previously in the inductive procedure. Finally, in the case of an integrated insertion, we obtain the system (86) at the particular momentum value $q \equiv 0$.
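Once (85) and (87) have been inductively fixed, we can again perform the mass scaling (55) in the tree-level interaction and insertions. It then leads to expansions corresponding to (56), (57) for the vertex functions
\[ \Gamma^{\Lambda,\Lambda_0}_{l,n}(\lambda; \vec p\,) = \sum_{\nu=0}^{\infty}(m\lambda)^\nu\,\Gamma^{(\nu),\Lambda,\Lambda_0}_{l,n}(\vec p\,), \quad \vec p = (p_1, \ldots, p_{|n|}), \tag{88} \]
\[ \Gamma^{\Lambda,\Lambda_0}_{\gamma;l,n}(\lambda; q; \vec p\,) = \sum_{\nu=0}^{\infty}(m\lambda)^\nu\,\Gamma^{(\nu),\Lambda,\Lambda_0}_{\gamma;l,n}(q; \vec p\,). \tag{89} \]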
We first consider the tree level $l = 0$. In the case of (88) the scaling (55) of the interaction results in
\[ (\partial^w\Gamma^{(\nu),0,\Lambda_0}_{0,n})(\vec 0\,) = 0, \quad |n| = 3, \quad |w| + \nu \neq 1. \tag{90} \]
Whereas there is no $|n| = 1$ content, the 2-point functions are fixed by the regularized propagators (81) (the masses of which are not scaled). The vertex functions with insertion (89) satisfy
\[ (\partial^w\Gamma^{(\nu),0,\Lambda_0}_{\gamma;0,n})(0; \vec 0\,) = 0, \quad |n| + |w| + \nu < 2. \tag{91} \]
Owing to the expansions (88) and (89), in both FE (84) and (86) a superscript $(\nu)$ has to be attached to the respective $n$-point function on the left-hand side and on the $n'$-point functions present on the right-hand side. We then use the same inductive scheme which leads to the bounds (62), (63) on the CAS and may deduce renormalizability of the proper vertex functions. For the relevant terms the choice of the renormalization conditions is as follows, $l \ge 1$,
\[
(\partial^w \Gamma^{(\nu),0,\Lambda_0}_{l,n})(\vec{0}) \stackrel{!}{=} 0, \qquad \text{if } |n| + |w| + \nu < 4, \tag{92}
\]
but if $|n| + |w| + \nu = 4$, a nonvanishing constant can be chosen on the right-hand side, whereas in the case of an insertion
\[
(\partial^w \Gamma^{(\nu),0,\Lambda_0}_{\gamma;l,n})(0;\vec{0}) \stackrel{!}{=} 0, \qquad \text{if } |n| + |w| + \nu < 2, \tag{93}
\]
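but if $|n| + |w| + \nu = 2$, again a nonvanishing constant on the right-hand side may be imposed. Proceeding inductively as indicated we obtain the bounds:

Proposition 2.
\[
|\partial^w \Gamma^{(\nu),\Lambda,\Lambda_0}_{l,n}(\vec{p})| \le (\Lambda+m)^{4-|n|-|w|-\nu}\, P_1\!\left(\log\frac{\Lambda+m}{m}\right) P_2\!\left(\frac{|\vec{p}|}{\Lambda+m}\right), \qquad (l,|n|) \neq (0,2), \tag{94}
\]
\[
|\partial^w \Gamma^{(\nu),\Lambda,\Lambda_0}_{\gamma;l,n}(q;\vec{p})| \le (\Lambda+m)^{2-|n|-|w|-\nu}\, P_1\!\left(\log\frac{\Lambda+m}{m}\right) P_2\!\left(\frac{|q,\vec{p}|}{\Lambda+m}\right). \tag{95}
\]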
The notations are those from (62), (63). Moreover, we can also obtain the bounds (65) and (66) in the case of proper vertex functions derived with respect to Λ0 .
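The power counting encoded in the bounds (94) and (95) can be illustrated by a minimal sketch. It assumes only what is stated above: the exponent of $(\Lambda + m)$ is $D - |n| - |w| - \nu$, with $D = 4$ for vertex functions without insertion and $D = 2$ for those with one insertion, and a term counts as relevant when this exponent is nonnegative. The function names are hypothetical and serve illustration only.

```python
def scaling_exponent(n_fields, n_derivs, nu, dim=4):
    """Exponent of (Lambda + m) in the bounds (94) (dim=4) and (95) (dim=2).

    n_fields plays the role of |n|, n_derivs of |w|, nu of the mass order.
    """
    return dim - n_fields - n_derivs - nu


def is_relevant(n_fields, n_derivs, nu, dim=4):
    """A term is treated as relevant when |n| + |w| + nu <= dim."""
    return scaling_exponent(n_fields, n_derivs, nu, dim) >= 0


# A three-point coupling carrying one explicit mass factor (nu = 1)
# is counted like a four-point coupling: one unit is gained.
print(scaling_exponent(3, 0, 0), scaling_exponent(3, 0, 1))
```

Each explicit mass factor lowers the exponent by one unit, in the same way as an additional field or momentum derivative does.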
C. Kopper & V. F. Müller
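4. Violated Slavnov–Taylor Identities

To examine the violation of the STI produced by the UV cutoff $\Lambda_0$ we start from the generating functional of the regularized Schwinger functions at the physical value $\Lambda = 0$ of the flow parameter,$^{j}$
\[
Z^{0,\Lambda_0}(K) = \int d\mu_{0,\Lambda_0}(\Phi)\, e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\Phi) + \frac{1}{\hbar}\langle \Phi, K\rangle}. \tag{96}
\]
The Gaussian measure $d\mu_{0,\Lambda_0}(\Phi)$ corresponds to the quadratic form $\frac{1}{\hbar} Q^{0,\Lambda_0}(\Phi)$, cf. (26),
\[
Q^{0,\Lambda_0}(\Phi) = \frac{1}{2}\langle A^a_\mu, (C^{0,\Lambda_0}_{\mu\nu})^{-1} A^a_\nu\rangle + \frac{1}{2}\langle h, (C^{0,\Lambda_0})^{-1} h\rangle + \frac{1}{2}\langle B^a, (S^{0,\Lambda_0})^{-1} B^a\rangle - \langle \bar{c}^a, (S^{0,\Lambda_0})^{-1} c^a\rangle. \tag{97}
\]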
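We define regularized BRS-variations (15), (41a)–(41d) of the fields by
\[
\delta_{BRS}\varphi_\tau(x) = -(\sigma_{0,\Lambda_0}\psi_\tau)(x)\,\varepsilon, \qquad
\delta_{BRS} c^a(x) = -(\sigma_{0,\Lambda_0}\Omega^a)(x)\,\varepsilon, \qquad
\delta_{BRS}\bar{c}^a(x) = -\left(\sigma_{0,\Lambda_0}\Big(\frac{1}{\alpha}\partial_\nu A^a_\nu - m B^a\Big)\right)(x)\,\varepsilon. \tag{98}
\]
The BRS-variation of the Gaussian measure has the form
\[
d\mu_{0,\Lambda_0}(\Phi) \to d\mu_{0,\Lambda_0}(\Phi)\left(1 - \frac{1}{\hbar}\,\delta_{BRS} Q^{0,\Lambda_0}(\Phi)\right), \tag{99}
\]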
and inspecting (97) we observe that the factor $\sigma_{0,\Lambda_0}$ of the variations (98) just cancels its inverse entering the inverted propagators. Hence, the BRS-variation of the Gaussian measure has mass dimension $D = 5$. Requiring the regularized generating functional $Z^{0,\Lambda_0}(K)$, (96), to be invariant under the BRS-variations (98) of the integration variables, provides the violated Slavnov–Taylor identities (VSTI)
\[
0 \stackrel{!}{=} \int d\mu_{0,\Lambda_0}(\Phi)\, e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\Phi) + \frac{1}{\hbar}\langle \Phi, K\rangle} \left(\langle \delta_{BRS}\Phi, K\rangle - \delta_{BRS}(Q^{0,\Lambda_0} + L^{\Lambda_0,\Lambda_0})\right). \tag{100}
\]
The BRS-variations appearing in (100) can be dealt with, considering corresponding modified generating functionals:

(i) With the modified bare interaction (44) we define
\[
\tilde{Z}^{0,\Lambda_0}(K,\xi) := \int d\mu_{0,\Lambda_0}(\Phi)\, e^{-\frac{1}{\hbar}\tilde{L}^{\Lambda_0,\Lambda_0}(\xi;\Phi) + \frac{1}{\hbar}\langle \Phi, K\rangle}, \tag{101}
\]

j Again one should stay in finite volume as long as the vacuum part is involved.
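and introduce a regularized BRS-operator
\[
D_{\Lambda_0} = \sum_\tau \Big\langle J_\tau, \sigma_{0,\Lambda_0}\frac{\delta}{\delta\gamma_\tau}\Big\rangle + \Big\langle \bar{\eta}^a, \sigma_{0,\Lambda_0}\frac{\delta}{\delta\omega^a}\Big\rangle + \Big\langle \frac{1}{\alpha}\partial_\nu\frac{\delta}{\delta j^a_\nu} - m\frac{\delta}{\delta b^a}, \sigma_{0,\Lambda_0}\eta^a\Big\rangle. \tag{102}
\]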
(ii) The BRS-variations of the bare interaction and of the Gaussian measure
\[
L_1^{\Lambda_0,\Lambda_0}\,\varepsilon := -\delta_{BRS}(Q^{0,\Lambda_0} + L^{\Lambda_0,\Lambda_0}) = \int dx\, N(x)\,\varepsilon \tag{103}
\]
form a space-time integrated insertion with ghost number 1. The variation of $L^{\Lambda_0,\Lambda_0}$, however, keeps the regularizing factor $\sigma_{0,\Lambda_0}$ of (98), thus the integrand $N(x)$ is no longer a polynomial in the fields and their derivatives. We can initially treat the integrand $N(x)$ as a local insertion with a source $\rho(x)$, cf. (44), (52). Introducing the corresponding bare action $\tilde{L}^{\Lambda_0,\Lambda_0}(\rho;\Phi)$ similarly to (44),$^{k}$ we define the functional $\tilde{Z}^{0,\Lambda_0}(K,\rho)$ in analogy to (101).

In terms of these modified Z-functionals the VSTI (100) can now be written
\[
D_{\Lambda_0}\tilde{Z}^{0,\Lambda_0}(K,\xi)\big|_{\xi=0} = \int dx\, \frac{\delta}{\delta\rho(x)}\tilde{Z}^{0,\Lambda_0}(K,\rho)\big|_{\rho=0}. \tag{104}
\]
The modified Z-functional (101) is related to the corresponding generating functional of modified CAS by$^{l}$
\[
\tilde{Z}^{0,\Lambda_0}(K,\xi) = e^{\frac{1}{\hbar}P^{0,\Lambda_0}(K)}\, e^{-\frac{1}{\hbar}\left(\tilde{L}^{0,\Lambda_0}(\xi;\varphi_\tau,c,\bar{c}) + I^{0,\Lambda_0}\right)}, \tag{105}
\]
and analogously in case of $\tilde{Z}^{0,\Lambda_0}(K,\rho)$. Furthermore, the variables of the Z- and the L-functional satisfy
\[
\varphi_\tau(x) = \int dy\, C^{0,\Lambda_0}_\tau(x-y) J_\tau(y), \qquad
c^a(x) = -\int dy\, S^{0,\Lambda_0}(x-y)\eta^a(y), \qquad
\bar{c}^a(x) = -\int dy\, S^{0,\Lambda_0}(x-y)\bar{\eta}^a(y). \tag{106}
\]
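From (104), via (105) and the analogous relation for $\tilde{Z}^{0,\Lambda_0}(K,\rho)$, we derive, using the definitions (47), (53) and denoting the differential operators (16) by $D_\tau$ in accord with $\varphi_\tau$, the violated Slavnov–Taylor identities of the CAS:
\[
\Big\langle c^a, D\Big(\frac{1}{\alpha}\partial_\nu A^a_\nu - m B^a\Big)\Big\rangle - \Big\langle c^a, \sigma_{0,\Lambda_0}\Big(\partial_\nu\frac{\delta L^{0,\Lambda_0}}{\delta A^a_\nu} - m\frac{\delta L^{0,\Lambda_0}}{\delta B^a}\Big)\Big\rangle
+ \sum_\tau \langle \varphi_\tau, D_\tau L^{0,\Lambda_0}_{\gamma_\tau}\rangle - \langle \bar{c}^a, D L^{0,\Lambda_0}_{\omega^a}\rangle = L^{0,\Lambda_0}_1. \tag{107}
\]

k Abusing notation we let the variables $\rho$ and $\xi$, respectively, denote different functions.
l The vacuum part $I^{0,\Lambda_0}$ is the same as in the case without insertion, since the latter has nonzero ghost number.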
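Starting from the relations (72) between the generating functionals of the vertex- and Schwinger-functions we can convert (107) at the (physical) value $\Lambda = 0$ into the violated Slavnov–Taylor identities for proper vertex functions, on substituting there the fields $\Phi$ according to (70), and employing (71), (72),
\[
\sum_\tau \Big\langle \frac{\delta\Gamma^{0,\Lambda_0}}{\delta\varphi_\tau}, \sigma_{0,\Lambda_0}\Gamma^{0,\Lambda_0}_{\gamma_\tau}\Big\rangle - \Big\langle \frac{\delta\Gamma^{0,\Lambda_0}}{\delta c^a}, \sigma_{0,\Lambda_0}\Gamma^{0,\Lambda_0}_{\omega^a}\Big\rangle - \Big\langle \frac{1}{\alpha}\partial_\nu A^a_\nu - m B^a, \sigma_{0,\Lambda_0}\frac{\delta\Gamma^{0,\Lambda_0}}{\delta\bar{c}^a}\Big\rangle = \Gamma^{0,\Lambda_0}_1(\varphi_\tau, c^a, \bar{c}^a), \tag{108}
\]
with
\[
\Gamma^{0,\Lambda_0}_1(\varphi_\tau, c^a, \bar{c}^a) = L^{0,\Lambda_0}_1(\varphi_\tau, c^a, \bar{c}^a). \tag{109}
\]
In the analysis of the STI it will turn out that we need the form of their explicit violation "on the bare side", $\Gamma^{\Lambda_0,\Lambda_0}_1(\Phi)$, too. From the definition (103) we directly determine the bare functional $L^{\Lambda_0,\Lambda_0}_1(\Phi)$, using (44) and (45),
\[
\begin{aligned}
L^{\Lambda_0,\Lambda_0}_1(\Phi) ={}& \Big\langle c^a, D\Big(\frac{1}{\alpha}\partial_\nu A^a_\nu - m B^a\Big)\Big\rangle + \sum_\tau \langle \varphi_\tau, D_\tau L^{\Lambda_0,\Lambda_0}_{\gamma_\tau}\rangle - \langle \bar{c}^a, D L^{\Lambda_0,\Lambda_0}_{\omega^a}\rangle \\
&+ \Big\langle \frac{1}{\alpha}\partial_\nu A^a_\nu - m B^a, \sigma_{0,\Lambda_0}\frac{\delta L^{\Lambda_0,\Lambda_0}}{\delta\bar{c}^a}\Big\rangle - \sum_\tau \Big\langle \frac{\delta L^{\Lambda_0,\Lambda_0}}{\delta\varphi_\tau}, \sigma_{0,\Lambda_0} L^{\Lambda_0,\Lambda_0}_{\gamma_\tau}\Big\rangle
- \Big\langle \frac{\delta L^{\Lambda_0,\Lambda_0}}{\delta c^a}, \sigma_{0,\Lambda_0} L^{\Lambda_0,\Lambda_0}_{\omega^a}\Big\rangle.
\end{aligned} \tag{110}
\]
The functional $L^{\Lambda_0,\Lambda_0}_1(\Phi)$ generates $n$-point functions with $2 \le |n| \le 5$. Moreover, we observe that only the terms emerging from the BRS-variation of the bare interaction $L^{\Lambda_0,\Lambda_0}$ have mass dimension greater than $D = 5$, because of the cutoff function $\sigma_{0,\Lambda_0}(k^2)$ (cf. remark after (99)). Given the functional $L^{\Lambda_0,\Lambda_0}_1$, its $n$-point functions coincide with those of the functional $\Gamma^{\Lambda_0,\Lambda_0}_1$, due to the identity (83).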
5. Restoration of the Slavnov–Taylor Identities

5.1. Mass expansions of vertex functionals

To restore the STI, it is in particular necessary to make vanish the relevant part of the violating functional $\Gamma^{0,\Lambda_0}_1$. It will then turn out that this is also sufficient in the limit $\Lambda_0 \to \infty$. Namely, the irrelevant contributions to this functional at the bare scale, $\Gamma^{\Lambda_0,\Lambda_0}_1$, which stem from the regulating function $\sigma_{0,\Lambda_0}$, are sufficiently bounded in terms of inverse powers of $\Lambda_0$ so that we may apply Proposition 3 providing the bound (119).

The freedom we dispose of to achieve this task is the freedom of choosing the renormalization conditions for the relevant terms appearing in the functions $\Gamma^{0,\Lambda_0}_{l,n}$ and $\Gamma^{0,\Lambda_0}_{\gamma;l,n}$. On inspection of the VSTI (108) one realizes that there is an obstacle on this way of proceeding: Since the insertion defining the functional $\Gamma^{\Lambda,\Lambda_0}_1$ is of
dimension 5, we have to apply up to 5 field- and momentum-derivatives on (108) in order to exhaust all relevant terms. We first notice that momentum derivatives of the cutoff function $\sigma_{0,\Lambda_0}(k^2) = \sigma_{\Lambda_0}(k^2)$ do not contribute to the relevant terms looked for,$^{m}$ cf. (21). Hence, in the terms generated from (108) by these field- or momentum-derivatives there apply $d_1$ (field or momentum)-derivatives to the factors of the form $\delta\Gamma/\delta\varphi$ in (108), and $d_2$ (field or momentum)-derivatives apply to the factors of the form $\Gamma_\gamma$, $\partial A^a$, or $mB^a$, where $d_1 + d_2 \le 5$. If $d_2 \ge 3$ derivatives apply to the functionals $\Gamma^{0,\Lambda_0}_{\gamma;l,n}$, they generate irrelevant contributions, since the insertions in $\Gamma^{0,\Lambda_0}_{\gamma;l,n}$ are of dimension 2. In our earlier paper [5] such contributions to the VSTI were denoted by "irr" in its Appendix C. They hampered the analysis of the relevant part of the VSTI at the renormalization scale in our previous efforts since they cannot be controlled explicitly in terms of the renormalization conditions.

The only way out can be that the relevant terms from $\Gamma^{0,\Lambda_0}_{l,n}$ multiplying these irrelevant terms can always be made to vanish, so that the a priori unknown irrelevant terms do not appear. One then realizes however that there are contributions in $\Gamma^{0,\Lambda_0}_{l,n}$, present already at the tree level $l = 0$, which do not satisfy this criterion, namely the nonvanishing super-renormalizable three-point couplings, as well as the mass term of the 2-point functions (see Appendix A).

We present the following solution to this problem: The functions $\Gamma^{0,\Lambda_0}_{\gamma;l,n}$ and $\Gamma^{0,\Lambda_0}_{l,n}$ are expanded at zero momentum not only with respect to the fields and the momenta but also with respect to the number of super-renormalizable vertices, or otherwise stated with respect to the number of mass parameters appearing in these couplings, see Sec. 3.2, (88) and (89).
The degree of divergence then diminishes with this number; in fact the corresponding bounds (94) and (95) show that the presence of an explicit mass term produces a gain in power counting by one unit. Disposing then of all relevant terms in this new sense, we will realize that there do not remain uncontrollable contributions to the VSTI of the form mentioned above. One should note that counting a power of a mass parameter as a power of a field is intuitively in accord with the fact that these mass parameters stem from the vacuum expectation value of the scalar field.

We start introducing the expansion of the functionals $L^{\Lambda,\Lambda_0}_1$ and $\Gamma^{\Lambda,\Lambda_0}_1$ inherited from the mass scaling (55),
\[
L^{\Lambda,\Lambda_0}_{1;l,n}(\lambda;\vec{p}) = \sum_{\nu=0}^{\infty} (m\lambda)^\nu\, L^{(\nu),\Lambda,\Lambda_0}_{1;l,n}(\vec{p}), \qquad \vec{p} = (p_1,\dots,p_{|n|}), \tag{111}
\]
\[
\Gamma^{\Lambda,\Lambda_0}_{1;l,n}(\lambda;\vec{p}) = \sum_{\nu=0}^{\infty} (m\lambda)^\nu\, \Gamma^{(\nu),\Lambda,\Lambda_0}_{1;l,n}(\vec{p}). \tag{112}
\]
m This property is at the origin of our particular choice of the cutoff function.

Since we aim at a consistent mass expansion of the VSTI, (108), we first observe that we also have to perform the mass scaling (55) of the BRS-variation $\frac{1}{\alpha}(\partial_\nu A^a_\nu(x) - \alpha m B^a(x))$ of the anti-ghost appearing, cf. (15), in accord with our
treatment of the BRS-insertions. We then want to determine via (108) the relevant part of the functional $\Gamma^{0,\Lambda_0}_1$, given by the values $(\partial^w \Gamma^{(\nu),0,\Lambda_0}_{1;l,n})(\vec{0})$, $|n| + |w| + \nu \le 5$. It is important to note that irrelevant contributions only emerge from the functionals containing a BRS-insertion. Requiring the vertex functions in (108) to satisfy the boundary conditions, $l \in \mathbb{N}_0$,
\[
(\partial^w \Gamma^{(\nu),0,\Lambda_0}_{l,n})(\vec{0}) \stackrel{!}{=} 0, \qquad \text{if } |n| + |w| + \nu < 4, \tag{113}
\]
then irrelevant contributions from the functionals $\Gamma^{(\nu),0,\Lambda_0}_{\gamma_\tau}$, $\Gamma^{(\nu),0,\Lambda_0}_{\omega}$ are annihilated by multiplication, and only contributions of these functionals with $|n_2| + |w_2| + \nu_2 \le 2$ field-, momentum- and mass-derivatives, i.e. relevant terms, do appear. The condition (113) is satisfied for $l \ge 1$ by the renormalization conditions (92), and in the tree order, if $|n| = 3$, by (90). Here, we remind the reader that we do not apply the mass expansion to the free propagator, but only to the boundary terms appearing in the FE. Now the inverted free propagators $\Gamma^{0,\Lambda_0}_{0,n}$, $|n| = 2$, appear in (108) as boundary terms at $\Lambda = 0$ for the functions $\Gamma^{\Lambda,\Lambda_0}_{1;l,n}$, and they are then mass expanded, (55), thus satisfying (113), too. Therefore it is important to remember that the FE and the VSTI are derived before mass expanding. Afterwards we consistently apply the mass expansion to all boundary terms and make the corresponding statement on the bounds for the vertex functions, which is verified inductively. The renormalization conditions (92) imposed on (a subset of) the relevant terms of the vertex functions imply zero renormalization conditions for the leading contributions to all the two-point functions:
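\[
\delta m^{2(\nu)} = 0, \qquad \Sigma^{\bar{c}c(\nu)}(0) = 0, \qquad \Sigma^{BB(\nu)}(0) = 0, \qquad \Sigma^{hh(\nu)}(0) = 0 \qquad \text{for } \nu \le 1, \tag{114}
\]
and also
\[
\Sigma^{AB(\nu)}(0) = 0 \quad \text{for } \nu = 0; \qquad \kappa^{(\nu)} = 0 \quad \text{for } \nu \le 2. \tag{115}
\]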
Here we use the notations of Appendix A. The respective relevant parts of the inserted functionals $\Gamma_\gamma$ are collected in Appendix B. The restricted set of renormalization conditions (93) is automatically satisfied, even in the nonvoid case with $|n| = 1$,
\[
n \equiv c^a: \qquad \Gamma^{0,\Lambda_0}_{\gamma^a;n}(0;0) = m R_4, \tag{116}
\]
due to the explicit factor of $m$ to be scaled according to (55).

The functionals $L^{\Lambda,\Lambda_0}_1(\Phi)$, $\Gamma^{\Lambda,\Lambda_0}_1(\Phi)$ serve to control the violation of the STI. They contain irrelevant boundary terms at $\Lambda = \Lambda_0$, in contrast to the functionals without insertion or with a BRS-insertion. These boundary terms are due to the presence of the factors $\sigma_{0,\Lambda_0}$, cf. the remarks after (110). They are proportional to $\sigma_{0,\Lambda_0}(p) - 1 = O((p^2)^2/\Lambda_0^4)$, as follows from (20), since the terms proportional to $\sigma_{0,\Lambda_0}(0) = 1$ are relevant.
We first assert the bound on the bare functional $\Gamma^{\Lambda_0,\Lambda_0}_1$, valid for $l \in \mathbb{N}_0$,
\[
|\partial^w \Gamma^{(\nu),\Lambda_0,\Lambda_0}_{1;l,n}(\vec{p})| \le (\Lambda_0+m)^{5-|n|-|w|-\nu} \left(\log\frac{\Lambda_0}{m}\right)^{r} P\!\left(\frac{|\vec{p}|}{\Lambda_0}\right), \tag{117}
\]
and trivially satisfied, unless $2 \le |n| \le 5$. Because of the identity (83) we can establish the corresponding bound on $L^{\Lambda_0,\Lambda_0}_1$, making use of (110). We employ the previous bounds on $\partial^w L^{(\nu),\Lambda,\Lambda_0}_{l,n}$, (62), and on $\partial^w L^{(\nu),\Lambda,\Lambda_0}_{\gamma;l,n}$, (63), at the value $\Lambda = \Lambda_0$. For $\sigma_{0,\Lambda_0}(k^2) = \sigma_{\Lambda_0}(k^2)$ we use the bounds
\[
|\partial^w \sigma_{\Lambda_0}(k^2)| \le \Lambda_0^{-|w|}\, P_{|w|}\!\left(\frac{|k|}{\Lambda_0}\right),
\]
which are an easy consequence of (20), the polynomials $P_{|w|}$ having nonnegative coefficients not depending on $k$. With these ingredients we prove (117).

The bound on the functional $\Gamma^{\Lambda,\Lambda_0}_1$ stated below in (119) does not follow from the choice of standard renormalization conditions for insertions. We rather assume its relevant part at the physical value $\Lambda = 0$ of the flow parameter to vanish, $l \in \mathbb{N}_0$,
\[
(\partial^w \Gamma^{(\nu),0,\Lambda_0}_{1;l,n})(\vec{0}) = 0, \qquad |n| + |w| + \nu \le 5. \tag{118}
\]
In Sec. 5.3, we will be able to verify these conditions from the VSTI (108), choosing for the functionals entering the left-hand side suitable renormalization conditions within the class (92), (93) considered. Assuming (118), we want to show that the corresponding irrelevant part vanishes upon shifting the UV-cutoff to infinity:

Proposition 3. Given (118), then for $l \in \mathbb{N}_0$, $|n| \ge 2$ and $0 \le \Lambda \le \Lambda_0$,
\[
|\partial^w \Gamma^{(\nu),\Lambda,\Lambda_0}_{1;l,n}(\vec{p})| \le \frac{1}{\Lambda_0}(\Lambda+m)^{5+1-|n|-|w|-\nu} \left(\log\frac{\Lambda_0}{m}\right)^{r} P\!\left(\frac{|\vec{p}|}{\Lambda+m}\right), \tag{119}
\]
with a positive integer $r$ depending on $n$, $l$, $w$, and a polynomial $P$ as in (62), (63).

Proof. We first notice that the bound (119) at $\Lambda = \Lambda_0$ agrees with the bound (117), and at $\Lambda < \Lambda_0$ majorizes this bound, if $|n| + |w| + \nu > 5$. The functions $\partial^w \Gamma^{(\nu),\Lambda,\Lambda_0}_{1;l,n}$ with flow parameter $0 \le \Lambda \le \Lambda_0$ are bounded integrating inductively the FE (86), adapted, however, to an integrated insertion and to the $\lambda$-expansion as stated. We proceed in the inductive order as in the proof of Proposition 1, but observing that the relevant terms of the functional treated here satisfy $|n| + |w| + \nu \le 5$. Considering the tree order first we notice that the right-hand side of the FE does vanish. Hence, this order is already fixed by its boundary value at $\Lambda = \Lambda_0$. If $|n| = 2$, the boundary value even vanishes and thus the function itself, satisfying (119) trivially. Proceeding, for given $n$ in the irrelevant cases $|n| + |w| + \nu > 5$ the bound (119) follows from the bound on their boundary values. Integrating the relevant cases $|n| + |w| + \nu \le 5$ with initial values (118) yields $\partial^w \Gamma^{(\nu),\Lambda,\Lambda_0}_{1;0,n}(\vec{0}) = 0$. Descending in $|w|$, the integrand in the respective remainder of the Taylor extension
has already been bounded before, providing the bound for general value $\vec{p}$. Hence, the assertion is established in the tree order. Proceeding for $l > 0$ inductively as indicated, the L-functions appearing on the right-hand side of the FE (86) have to be determined within this inductive process via (80), as expounded in presenting the FE and supplemented in the text after (87), leading to Proposition 2. Therefore, to bound the right-hand side one also needs the bound (94) on the vertex functions without insertions, to be dealt with independently before. As a result the bound deduced on $|\partial^w L^{(\nu),\Lambda,\Lambda_0}_{1;l-1,n'}|$ essentially coincides with the bound on $|\partial^w \Gamma^{(\nu),\Lambda,\Lambda_0}_{1;l-1,n'}|$, cf. (87), i.e. has the same form and power behavior of $\Lambda + m$. This bound allows one to estimate the right-hand side of the FE and hereafter the integrations "downwards" with initial conditions (117), and "upwards" with initial conditions (118), of the irrelevant and relevant cases, respectively. Extending finally the relevant cases via the Taylor formula to general $\vec{p}$ completes the proof.

Thus, given the condition (118), the bound (119) implies that the Slavnov–Taylor identities are restored in the limit $\Lambda_0 \to \infty$.

5.2. Equation of motion of the anti-ghost

Renormalization theory for nonabelian gauge theories in gauge invariant renormalization schemes is generally based on the STI, complemented by the equation of motion of the anti-ghost [13, 14, 2]. In our scheme we rather start from a derivation of this equation from the functional integral. In Sec. 5.3, we will then show that this equation is satisfied for renormalization conditions compatible with the STI if in addition the renormalization condition for the longitudinal part of the gauge field propagator is fixed uniquely to vanish at zero momentum.

The field equation follows from the representation (29). After functional derivation of (29) with respect to $\bar{c}^a(x)$ we reexpress the right-hand side as
\[
\frac{\delta L^{\Lambda,\Lambda_0}(\Phi)}{\delta\bar{c}^a(x)}\, e^{-\frac{1}{\hbar}(L^{\Lambda,\Lambda_0}(\Phi) + I^{\Lambda,\Lambda_0})}
= \frac{\delta}{\delta\zeta^a(x)} \int d\mu_{\Lambda,\Lambda_0}(\Phi')\, e^{-\frac{1}{\hbar}\left(L^{\Lambda_0,\Lambda_0}(\Phi'+\Phi) + L^{\Lambda_0,\Lambda_0}(\zeta;\Phi'+\Phi)\right)}\Big|_{\zeta=0}
\]
on extending the original bare interaction $L^{\Lambda_0,\Lambda_0}(\Phi)$ by the insertion
\[
L^{\Lambda_0,\Lambda_0}(\zeta;\Phi) = \int dx\, \zeta^a(x)\frac{\delta L^{\Lambda_0,\Lambda_0}(\Phi)}{\delta\bar{c}^a(x)}. \tag{120}
\]
The source $\zeta^a(x)$ is a Grassmann element carrying ghost number $-1$. Treating now the right-hand side analogously as in (44)–(47), we obtain the field equation of the anti-ghost
\[
\frac{\delta L^{\Lambda,\Lambda_0}(\Phi)}{\delta\bar{c}^a(x)} = L^{\Lambda,\Lambda_0}_{\zeta^a}(x;\Phi), \tag{121}
\]
employing the notation introduced there. On the right-hand side appears the generating functional of the CAS with one local insertion corresponding to (120). The classical BRS-invariant action (9) satisfies the classical field equation $\frac{\delta}{\delta\bar{c}^a(x)} S_{BRS} = \partial_\mu\psi^a_\mu(x) - \alpha m\psi^a(x)$, observing (14). The aim is to show that the relation following from the classical action at the tree level for the physical value $\Lambda = 0$ of the flow parameter
\[
\frac{\delta L^{0,\Lambda_0}(\Phi)}{\delta\bar{c}^a(x)} = \partial_\mu L^{0,\Lambda_0}_{\gamma^a_\mu}(x;\Phi)\big|_{\rm mod} - \alpha m L^{0,\Lambda_0}_{\gamma^a}(x;\Phi)\big|_{\rm mod}, \tag{122}
\]
still holds in the renormalized theory. The label "mod" is to signal that we have to replace in the bare insertions (41a)–(41d) $R^0_i \to \tilde{R}^0_i = O(\hbar)$ for $i = 1, 4$ since the respective tree order is absent on the left-hand side. We can write (122) in terms of proper vertex functions. Fourier transforming (122), using our conventions (37), (48), and employing the relations (72), (71) yields
\[
(2\pi)^4 \frac{\delta\Gamma^{0,\Lambda_0}(\Phi)}{\delta\bar{c}^a(q)} = -\frac{q^2+\alpha m^2}{\sigma_{0,\Lambda_0}(q^2)}\, c^a(-q) - i q_\mu \Gamma^{0,\Lambda_0}_{\gamma^a_\mu}(q;\Phi)\big|_{\rm mod} - \alpha m \Gamma^{0,\Lambda_0}_{\gamma^a}(q;\Phi)\big|_{\rm mod}. \tag{123}
\]
The first term on the right-hand side is the tree level 2-point function. Restricting (123) to its relevant part, $\sigma_{0,\Lambda_0}(q^2)$ is replaced by $\sigma_{0,\Lambda_0}(0) = 1$ due to (21); the first term then provides the tree order of $R_1$ and $R_4$ excluded in the insertions as indicated by the label mod, cf. (122). The proof of (123) or equivalently (122) consists in two steps of the same nature as those employed in the previous section. We may consider the (regularized) inserted functional
\[
\Gamma^{\Lambda,\Lambda_0}_{c^a}(q;\Phi) := (2\pi)^4 \frac{\delta\Gamma^{\Lambda,\Lambda_0}(\Phi)}{\delta\bar{c}^a(q)} + \frac{q^2+\alpha m^2}{\sigma_{0,\Lambda_0}(q^2)}\, c^a(-q) + i q_\mu \Gamma^{\Lambda,\Lambda_0}_{\gamma^a_\mu}(q;\Phi)\big|_{\rm mod} + \alpha m \Gamma^{\Lambda,\Lambda_0}_{\gamma^a}(q;\Phi)\big|_{\rm mod}. \tag{124}
\]
In the mass expansion scheme it corresponds to an operator insertion of dimension 3, where we take into account also the momentum and mass factors in front of the last three terms. Since the flow equations for inserted functionals are linear, the new functional obeys again a linear flow equation obtained from those for the functionals on the right-hand side by superposition. Note that the second term on the right-hand side, being a tree level contribution, does not flow.

If we can choose renormalization conditions such that all relevant contributions to $\Gamma^{\Lambda,\Lambda_0}_{c^a}(q;\Phi)$ vanish, we can prove by induction on the linear flow equation (the solution of which is unique for specified boundary conditions) that $\Gamma^{\Lambda,\Lambda_0}_{c^a}(q;\Phi) \equiv 0$. Note that for this functional there are no irrelevant boundary contributions at $\Lambda = \Lambda_0$, since such terms only appear in the first two terms on the right-hand side at the tree level and cancel exactly. So the situation is simpler than that of the functional $\Gamma_1$ analyzed in the previous section.
At the end of the next section it is shown explicitly that the relevant contributions to (124) can be made to vanish for suitable renormalization conditions so that the equation of motion for the anti-ghost (123) or (122) holds at the quantum level.

5.3. Analysis of the relevant part of the Slavnov–Taylor identities and of the equation for the anti-ghost

We now require the relevant part of the functional $\Gamma^{0,\Lambda_0}_1$ to vanish in accord with the VSTI (108). This requirement amounts to satisfying the 53 equations presented in Appendix C. It is satisfied in the tree order. Noticing that the normalization constants of the BRS-insertions behave as $R_i = 1 + O(\hbar)$, $i = 1, \dots, 7$, we first analyze the equations IX to XXIX, but take already into account the equations VII$_d$, VIII$_c$, the latter ones providing
\[
r_2^{hBA} = r_2^{\bar{c}cA} \stackrel{!}{=} 0. \tag{125}
\]
In proceeding we use conditions determined before, if needed. From XIV$_b$, XIV$_e$, XV$_{1b}$, XXIII directly follow
\[
r_1^{AA\bar{c}c} = r_2^{AA\bar{c}c} = r_1^{BB\bar{c}c} = r_2^{AABB} \stackrel{!}{=} 0, \tag{126}
\]
and then, from XIV$_{a+c}$, XVII$_b$, XVIII$_c$, XXVIII, XXIX,
\[
r_2^{AAAA} = r^{hh\bar{c}c} = r^{\bar{c}c\bar{c}c} = r^{hB\bar{c}c} = r_2^{BB\bar{c}c} \stackrel{!}{=} 0. \tag{127}
\]
XVI$_a$, XVIII$_a$, and XV$_{2a}$ combined with XVI$_b$, respectively, require
\[
R_2 \stackrel{!}{=} R_6 \stackrel{!}{=} R_7, \qquad R_3 R_5 \stackrel{!}{=} (R_2)^2. \tag{128}
\]
\[
\text{XIV}_c:\quad 2F_1^{AAAA} R_1 \stackrel{!}{=} -F^{AAA} g R_2, \tag{129}
\]
\[
\text{XI}:\quad F^{\bar{c}cB(1)} R_5 \stackrel{!}{=} -F^{\bar{c}ch(1)} R_2. \tag{130}
\]
From X, XX, XIX, IX follow for the self-coupling of the scalar field
\[
8F^{BBBB} R_4 \stackrel{!}{=} F^{BBh(1)} g R_3, \tag{131}
\]
\[
4F^{BBhh} R_4 \stackrel{!}{=} F^{BBh(1)} g R_5, \tag{132}
\]
\[
8F^{hhhh} R_4 R_3 \stackrel{!}{=} F^{BBh(1)} g (R_5)^2, \tag{133}
\]
\[
F^{hhh(1)} R_3 \stackrel{!}{=} F^{BBh(1)} R_5, \tag{134}
\]
and from XVI$_b$, XVII$_a$, XXI, XIII$_2$ for the scalar-vector coupling
\[
2F^{BBA} R_5 \stackrel{!}{=} -F_1^{hBA} R_2, \tag{135}
\]
\[
4F^{AAhh} R_1 \stackrel{!}{=} F_1^{hBA} g R_5, \tag{136}
\]
\[
4F_1^{AABB} R_1 \stackrel{!}{=} F_1^{hBA} g R_3, \tag{137}
\]
\[
F^{AAh(1)} R_1 \stackrel{!}{=} F_1^{hBA} R_4. \tag{138}
\]
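As a consistency check, one can confirm that the relations (128)–(138) hold in the tree order, as the text asserts. The sketch below inserts the tree-level couplings listed in Appendix A (all $r$-parameters set to zero, $R_i = 1$) and uses exact rational arithmetic. It assumes the convention $F = m\,F^{(1)}$ for couplings carrying one explicit mass factor, as suggested by the expansion (88) at $\lambda = 1$; the numerical values of $g$, $\alpha$, $m$, $M$ are arbitrary test inputs and not taken from the paper.

```python
from fractions import Fraction as Fr

# Arbitrary rational test values (assumptions for illustration only).
g, alpha, m, M = Fr(3, 2), Fr(5, 7), Fr(2), Fr(3)
ratio2 = (M / m) ** 2

# Tree-level couplings from Appendix A, with all r-parameters set to zero.
F_AAA, F_BBA, F1_hBA = -g / 2, -g / 4, g / 2
F1_AAAA, F1_AABB, F_AAhh = g**2 / 4, g**2 / 8, g**2 / 8
F_hhhh = g**2 * ratio2 / 32
F_BBhh = g**2 * ratio2 / 16
F_BBBB = g**2 * ratio2 / 32

# Couplings with one explicit mass factor; assumed convention F = m * F^(1).
F_AAh_1 = (m * g / 2) / m
F_BBh_1 = (g * M**2 / (4 * m)) / m
F_hhh_1 = (g * M**2 / (4 * m)) / m
F_ccB_1 = (alpha * g * m / 2) / m
F_cch_1 = (-alpha * g * m / 2) / m

# Tree level: all normalization constants of the BRS-insertions equal 1.
R = {i: Fr(1) for i in range(1, 8)}

# Each entry is the truth value of the relation with that equation number.
checks = {
    128: R[2] == R[6] == R[7] and R[3] * R[5] == R[2] ** 2,
    129: 2 * F1_AAAA * R[1] == -F_AAA * g * R[2],
    130: F_ccB_1 * R[5] == -F_cch_1 * R[2],
    131: 8 * F_BBBB * R[4] == F_BBh_1 * g * R[3],
    132: 4 * F_BBhh * R[4] == F_BBh_1 * g * R[5],
    133: 8 * F_hhhh * R[4] * R[3] == F_BBh_1 * g * R[5] ** 2,
    134: F_hhh_1 * R[3] == F_BBh_1 * R[5],
    135: 2 * F_BBA * R[5] == -F1_hBA * R[2],
    136: 4 * F_AAhh * R[1] == F1_hBA * g * R[5],
    137: 4 * F1_AABB * R[1] == F1_hBA * g * R[3],
    138: F_AAh_1 * R[1] == F1_hBA * R[4],
}
all_hold = all(checks.values())
print(all_hold)
```

Exact fractions are used instead of floating point so that each relation is tested as an identity rather than up to rounding.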
One easily verifies that the remaining equations of IX to XXIX are satisfied due to these conditions (125)–(138). At this stage, all those relevant couplings with $|n| = 3, 4$ not appearing already in the tree order are required to vanish: (125)–(127). All other couplings involving four fields are determined by particular couplings with $|n| = 3$: (129), (131)–(133), (136), (137). In addition, there are 4 conditions relating couplings with $|n| = 3$: (130), (134), (135) and (138). Moreover, the normalization constants of the BRS-insertions are required to satisfy the three conditions (128).

There are still $18 - 2$ equations among I to VIII to be considered. They contain the relevant parameters of $\Gamma^{0,\Lambda_0}$ with $|n| = 1, 2, 3$, except $F^{hhh}$, together with the normalization constants of the BRS-insertions. Since 2 of these parameters have been fixed before, (125), there remain 26 to be dealt with. ($F^{hhh}$ will then be determined by (134).) These parameters in addition have to obey the conditions derived before: We first observe that the condition (138) is identical to equation VI$_b$. There remain the 5 conditions to be satisfied: 3 conditions (128), together with (130), (135). All these conditions generate 4 linear relations among the equations still to be considered: denoting by $\{X\}$ the content of the bracket $\{\cdots\}$ appearing in equation X, we find [7, (4.94-97)]
\[
0 = \alpha^{-1}\{\text{VIII}_b\} + gR_2\{\text{I}_b\} + R_1(\{\text{III}_a\} + \{\text{III}_b\}), \tag{139}
\]
\[
0 = gR_2\{\text{II}_b\} - \{\text{VIII}_b\} + R_1\{\text{IV}_b\} - 2R_4\{\text{V}\}, \tag{140}
\]
\[
0 = R_2\{\text{IV}_a\} - R_3(\{\text{VI}_a\} - \{\text{VI}_b\}), \tag{141}
\]
\[
0 = R_2\{\text{V}\} - R_3\{\text{VII}_c\}. \tag{142}
\]
Hence, the 26 parameters in question are constrained by $16 + 5 - 4 = 17$ equations. As renormalization conditions we then fix $\kappa^{(3)} = 0$ and let
\[
\Sigma_{\rm trans},\ \Sigma_{\rm long},\ \Sigma^{AB(1)},\ \dot{\Sigma}^{\bar{c}c},\ \dot{\Sigma}^{BB},\ F^{AAA},\ F^{BBh(1)},\ R_3 \tag{143}
\]
be chosen freely. These parameters correspond to the number of wave function renormalizations (including one for the BRS sector) and coupling constant renormalizations of the theory. Thus, there are $26 - 9$ parameters left, together with 17 equations. These parameters are now determined successively in terms of (143) and possibly parameters determined before in proceeding. We list them in this order, writing in brackets the particular equation fulfilled:
\[
\begin{aligned}
& R_1\ (\text{I}_b),\quad R_4\ (\text{II}_b),\quad R_2\ (\text{III}_b) \to R_6, R_7, R_5 \ \text{ due to (128)},\\
& F_1^{\bar{c}cA}\ (\text{III}_a),\quad F^{BBA}\ (\text{V}) \to F_1^{hBA} \ \text{ due to (135)},\\
& F^{AAh(1)}\ (\text{VI}_b),\quad F^{\bar{c}cB(1)}\ (\text{IV}_a) \to F^{\bar{c}ch(1)} \ \text{ due to (130)},\\
& \Sigma^{\bar{c}c(2)}\ (\text{VIII}_a),\quad \Sigma^{BB(2)}\ (\text{II}_a),\quad \delta m^{2(2)}\ (\text{I}_a),\quad \Sigma^{hh(2)}\ (\text{VII}_a),\quad \dot{\Sigma}^{hh}\ (\text{VII}_{b+c}).
\end{aligned} \tag{144}
\]
Now all parameters are determined, without using the equations IV$_b$, VI$_a$, VII$_c$, VIII$_b$. These equations, however, are satisfied because of the relations (139)–(142).
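The parameter bookkeeping of this paragraph can be retraced by a short tally. The sketch below only restates the counts given in the text: $18 - 2$ equations among I to VIII, 5 additional conditions, 4 linear relations, one parameter fixed by $\kappa^{(3)} = 0$, the 8 free parameters of (143), and the quantities determined in (144). The string labels are ad hoc names chosen for illustration.

```python
# Free renormalization constants, cf. (143).
free = ["Sigma_trans", "Sigma_long", "Sigma_AB(1)", "dSigma_cc",
        "dSigma_BB", "F_AAA", "F_BBh(1)", "R3"]

# Fixed a priori as a renormalization condition.
fixed = ["kappa(3)"]

# Parameters determined successively, cf. (144).
determined = [
    "R1", "R4", "R2", "R6", "R7", "R5",
    "F1_ccA", "F_BBA", "F1_hBA",
    "F_AAh(1)", "F_ccB(1)", "F_cch(1)",
    "Sigma_cc(2)", "Sigma_BB(2)", "dm2(2)", "Sigma_hh(2)", "dSigma_hh",
]

n_parameters = len(free) + len(fixed) + len(determined)

# 18 - 2 equations among I to VIII, plus 5 conditions, minus 4 linear relations.
n_equations = (18 - 2) + 5 - 4

# Unknowns left once the free and fixed parameters are chosen.
n_unknowns = n_parameters - len(free) - len(fixed)

print(n_parameters, n_equations, n_unknowns)
```

The number of remaining unknowns matches the number of independent equations, which is why the list (144) can be worked through without leftover freedom.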
Finally, the relevant couplings with $|n| = 4$, as well as $F^{hhh(1)}$, then are explicitly given by (129), (131)–(134), (136) and (137).

We have not yet implemented the field equation of the anti-ghost (123). Performing the mass scaling as before and then extracting the local content $|n| + |w| + \nu \le 4$ leads to the relations
\[
1 + \dot{\Sigma}^{\bar{c}c} = R_1, \tag{145}
\]
\[
\alpha + \Sigma^{\bar{c}c(2)} = \alpha R_4, \tag{146}
\]
\[
F_1^{\bar{c}cA} = gR_2, \tag{147}
\]
\[
F^{\bar{c}cB(1)} = \frac{\alpha}{2} gR_6, \tag{148}
\]
\[
F^{\bar{c}ch(1)} = -\frac{\alpha}{2} gR_5. \tag{149}
\]
Fixing now the hitherto free renormalization constant $\Sigma_{\rm long}$ at the particular value $\Sigma_{\rm long} = 0$, we claim these relations to be satisfied: (145) and (147) follow at once from I$_b$ and III$_{a+b}$, respectively; (148) follows from $2\{\text{IV}_a\} - \{\text{IV}_b\}$, due to (147) and (128); and herefrom follow (149) due to (130), and (146) because of VIII$_a$, thus establishing the claim.

Given these additional relations (145)–(149) we can adjust the procedure (144) choosing now a reduced set of free renormalization conditions (143) in which $\Sigma_{\rm long}$ is excluded. Proceeding similarly as before we find
\[
\text{I}_b:\ \Sigma_{\rm long} = 0, \qquad \text{II}_a:\ \Sigma^{BB(2)} = 0, \tag{150}
\]
\[
\text{III}_b:\ gR_2 = -2F^{AAA}\,\frac{1+\dot{\Sigma}^{\bar{c}c}}{1+\Sigma_{\rm trans}} \ \to\ R_6, R_7, R_5 \ \text{ due to (128)}, \tag{151}
\]
\[
\text{II}_b:\ R_4 = \frac{1+\dot{\Sigma}^{\bar{c}c}}{1+\dot{\Sigma}^{BB}}\,(1+\Sigma^{AB(1)}), \tag{152}
\]
\[
\text{I}_a:\ 1 + \delta m^{2(2)} = \frac{1}{1+\dot{\Sigma}^{BB}}\,(1+\Sigma^{AB(1)})^2, \tag{153}
\]
\[
\text{V}:\ 2F^{BBA} = F^{AAA}\,\frac{1+\dot{\Sigma}^{BB}}{1+\Sigma_{\rm trans}} \ \to\ F_1^{hBA} \ \to\ F^{AAh(1)} \ \text{ due to (135), (138)}, \tag{154}
\]
\[
\text{VII}_a:\ \left(\frac{M}{m}\right)^2 + \Sigma^{hh(2)} = \frac{4}{g}\,F^{BBh(1)}\,\frac{R_4}{R_3}, \tag{155}
\]
\[
\text{VII}_{b+c}:\ 1 + \dot{\Sigma}^{hh} = (1+\dot{\Sigma}^{BB})\,\frac{R_5}{R_3}. \tag{156}
\]
To summarize, the following task has been achieved: we first treated the functional $\Gamma^{0,\Lambda_0}$ and its ancillary functionals $\Gamma^{0,\Lambda_0}_{\gamma_\tau}$, $\Gamma^{0,\Lambda_0}_{\omega}$ with a BRS-insertion, disregarding the STI. There appear $37 + 7$ relevant parameters. Fixing among these parameters a priori $\kappa = 0$ (no tadpoles) and $\Sigma_{\rm long} = 0$ (due to the field equation of the antighost), and regarding the set (143) without $\Sigma_{\rm long}$, as renormalization constants to
be chosen freely, we can uniquely determine the remaining relevant parameters upon requiring the relevant part of the functional $\Gamma^{0,\Lambda_0}_1$ to vanish, (118), on account of the VSTI (108). Finally, since the relevant part of the functional $\Gamma^{0,\Lambda_0}_1$ vanishes, due to Proposition 3, (119), its irrelevant part vanishes in the limit $\Lambda_0 \to \infty$, too. Thus, perturbatively, the functional $\Gamma^{0,\infty}$ and its ancillary functionals $\Gamma^{0,\infty}_{\gamma_\tau}$, $\Gamma^{0,\infty}_{\omega}$ are finite and satisfy the STI, i.e. Eq. (108) for $\Lambda_0 \to \infty$ with the right-hand side vanishing.

Acknowledgment

Both authors have been lecturing at ESI, Vienna, about the subject of this paper; the ensuing discussions were important for its genesis; hospitality of ESI is therefore gratefully acknowledged.

Appendix A

The bare functional $L^{\Lambda_0,\Lambda_0}$ and the relevant part of the generating functional $\Gamma^{0,\Lambda_0}$ for the proper vertex functions have the same general form. We present the latter and give the tree order explicitly. At the end we state the modification to obtain the bare functional $L^{\Lambda_0,\Lambda_0}$. Writing
\[
\Gamma^{0,\Lambda_0}(A,h,B,\bar{c},c) = \sum_{|n|=1}^{4} \Gamma_{|n|} + \Gamma_{(|n|>4)},
\]
where $|n|$ counts the number of fields, we extracted the relevant part, i.e. its local field content with mass dimension not greater than four. Moreover, in the sequel we do not underline the field variables though all arguments in the $\Gamma$-functional should appear underlined, of course.

(1) One-point function
\[
\Gamma_1 = \kappa\,\hat{h}(0).
\]
(2) Two-point functions
\[
\Gamma_2 = \int_p \Big\{ \frac{1}{2} A^a_\mu(p)A^a_\nu(-p)\Gamma^{AA}_{\mu\nu}(p) + \frac{1}{2} h(p)h(-p)\Gamma^{hh}(p) + \frac{1}{2} B^a(p)B^a(-p)\Gamma^{BB}(p) - \bar{c}^a(p)c^a(-p)\Gamma^{\bar{c}c}(p) + A^a_\mu(p)B^a(-p)\Gamma^{AB}_\mu(p) \Big\},
\]
\[
\Gamma^{AA}_{\mu\nu}(p) = \delta_{\mu\nu}(m^2 + \delta m^2) + (p^2\delta_{\mu\nu} - p_\mu p_\nu)(1 + \Sigma_{\rm trans}(p^2)) + \frac{1}{\alpha}\, p_\mu p_\nu (1 + \Sigma_{\rm long}(p^2)),
\]
\[
\Gamma^{hh}(p) = p^2 + M^2 + \Sigma^{hh}(p^2), \qquad \Gamma^{BB}(p) = p^2 + \alpha m^2 + \Sigma^{BB}(p^2),
\]
\[
\Gamma^{\bar{c}c}(p) = p^2 + \alpha m^2 + \Sigma^{\bar{c}c}(p^2), \qquad \Gamma^{AB}_\mu(p) = i p_\mu \Sigma^{AB}(p^2).
\]
Besides the unregularized tree order explicitly stated, there emerge 10 relevant parameters from the various self-energies:
\[
\delta m^2,\ \Sigma_{\rm trans}(0),\ \Sigma_{\rm long}(0),\ \Sigma^{hh}(0),\ \dot{\Sigma}^{hh}(0),\ \Sigma^{BB}(0),\ \dot{\Sigma}^{BB}(0),\ \Sigma^{\bar{c}c}(0),\ \dot{\Sigma}^{\bar{c}c}(0),\ \Sigma^{AB}(0),
\]
where the notation $\dot{\Sigma}(0) \equiv (\partial_{p^2}\Sigma)(0)$ has been used. We note that because of the regularization, the inverse of the regularized propagators (22) actually appears as the tree order ($l = 0$) of the 2-point functions. Due to the property (21), however, the regularizing factor $(\sigma_{0,\Lambda_0}(p^2))^{-1}$ does not contribute to the relevant part.

(3) Three-point functions

We only present the relevant part explicitly. A relevant parameter vanishing in the tree order is denoted by $r \in O(\hbar)$, otherwise it is denoted by $F$. Moreover, we indicate an irrelevant part by a symbol $O_n$, $n \in \mathbb{N}$, reminding that this part vanishes like an $n$th power of a momentum when all momenta tend to zero homogeneously.
\[
\begin{aligned}
\Gamma_3 = \int_p\int_q \Big\{ & \epsilon^{rst} A^r_\mu(p)A^s_\nu(q)A^t_\lambda(-p-q)\Gamma^{AAA}_{\mu\nu\lambda}(p,q) + A^r_\mu(p)A^r_\nu(q)h(-p-q)\Gamma^{AAh}_{\mu\nu}(p,q) \\
& + \epsilon^{rst} B^r(p)B^s(q)A^t_\mu(-p-q)\Gamma^{BBA}_\mu(p,q) + h(p)B^r(q)A^r_\mu(-p-q)\Gamma^{hBA}_\mu(p,q) \\
& + \epsilon^{rst} \bar{c}^r(p)c^s(q)A^t_\mu(-p-q)\Gamma^{\bar{c}cA}_\mu(p,q) + B^r(p)B^r(q)h(-p-q)\Gamma^{BBh}(p,q) \\
& + h(p)h(q)h(-p-q)\Gamma^{hhh}(p,q) + \bar{c}^r(p)c^r(q)h(-p-q)\Gamma^{\bar{c}ch}(p,q) \\
& + \epsilon^{rst} \bar{c}^r(p)c^s(q)B^t(-p-q)\Gamma^{\bar{c}cB}(p,q) \Big\},
\end{aligned}
\]
\[
\begin{aligned}
\Gamma^{AAA}_{\mu\nu\lambda}(p,q) &= \delta_{\mu\nu}\, i(p-q)_\lambda F^{AAA} + O_3, & F^{AAA} &= -\tfrac{1}{2} g + r^{AAA}, \\
\Gamma^{AAh}_{\mu\nu}(p,q) &= \delta_{\mu\nu} F^{AAh} + O_2, & F^{AAh} &= \tfrac{1}{2} mg + r^{AAh}, \\
\Gamma^{BBA}_\mu(p,q) &= i(p-q)_\mu F^{BBA} + O_3, & F^{BBA} &= -\tfrac{1}{4} g + r^{BBA}, \\
\Gamma^{hBA}_\mu(p,q) &= i(p-q)_\mu F_1^{hBA} + i(p+q)_\mu r_2^{hBA} + O_3, & F_1^{hBA} &= \tfrac{1}{2} g + r_1^{hBA}, \\
\Gamma^{\bar{c}cA}_\mu(p,q) &= i p_\mu F_1^{\bar{c}cA} + i q_\mu r_2^{\bar{c}cA} + O_3, & F_1^{\bar{c}cA} &= g + r_1^{\bar{c}cA}, \\
\Gamma^{BBh}(p,q) &= F^{BBh} + O_2, & F^{BBh} &= \tfrac{1}{4} g \tfrac{M^2}{m} + r^{BBh}, \\
\Gamma^{hhh}(p,q) &= F^{hhh} + O_2, & F^{hhh} &= \tfrac{1}{4} g \tfrac{M^2}{m} + r^{hhh}, \\
\Gamma^{\bar{c}ch}(p,q) &= F^{\bar{c}ch} + O_2, & F^{\bar{c}ch} &= -\tfrac{1}{2} \alpha g m + r^{\bar{c}ch}, \\
\Gamma^{\bar{c}cB}(p,q) &= F^{\bar{c}cB} + O_2, & F^{\bar{c}cB} &= \tfrac{1}{2} \alpha g m + r^{\bar{c}cB}.
\end{aligned}
\]
The 3-point functions $AAB$ and $BBB$ have no relevant local content.
(4) Four-point functions

Defining as before parameters $r$ and $F$, then
\[
\begin{aligned}
\Gamma_4|_{\rm rel} = \int_k\int_p\int_q \Big\{ & \epsilon^{abc}\epsilon^{ars} A^b_\mu(k)A^c_\nu(p)A^r_\mu(q)A^s_\nu(-k-p-q) F_1^{AAAA} + A^r_\mu(k)A^r_\mu(p)A^s_\nu(q)A^s_\nu(-k-p-q)\, r_2^{AAAA} \\
& + A^a_\mu(k)A^b_\mu(p)\bar{c}^r(q)c^s(-k-p-q)(\delta^{ab}\delta^{rs} r_1^{AA\bar{c}c} + \delta^{ar}\delta^{bs} r_2^{AA\bar{c}c}) \\
& + A^a_\mu(k)A^b_\mu(p)B^r(q)B^s(-k-p-q)(\delta^{ab}\delta^{rs} F_1^{AABB} + \delta^{ar}\delta^{bs} r_2^{AABB}) \\
& + B^a(k)B^b(p)\bar{c}^r(q)c^s(-k-p-q)(\delta^{ab}\delta^{rs} r_1^{BB\bar{c}c} + \delta^{ar}\delta^{bs} r_2^{BB\bar{c}c}) \\
& + h(k)h(p)h(q)h(-k-p-q) F^{hhhh} + B^r(k)B^r(p)h(q)h(-k-p-q) F^{BBhh} \\
& + B^r(k)B^r(p)B^s(q)B^s(-k-p-q) F^{BBBB} + A^r_\mu(k)A^r_\mu(p)h(q)h(-k-p-q) F^{AAhh} \\
& + h(k)h(p)\bar{c}^r(q)c^r(-k-p-q)\, r^{hh\bar{c}c} + \bar{c}^a(k)c^a(p)\bar{c}^r(q)c^r(-k-p-q)\, r^{\bar{c}c\bar{c}c} \\
& + \epsilon^{rst} h(k)B^r(p)\bar{c}^s(q)c^t(-k-p-q)\, r^{hB\bar{c}c} \Big\},
\end{aligned}
\]
\[
\begin{aligned}
F_1^{AAAA} &= \tfrac{1}{4} g^2 + r_1^{AAAA}, & F_1^{AABB} &= \tfrac{1}{8} g^2 + r_1^{AABB}, \\
F^{hhhh} &= \tfrac{1}{32} g^2 \left(\tfrac{M}{m}\right)^2 + r^{hhhh}, & F^{BBhh} &= \tfrac{1}{16} g^2 \left(\tfrac{M}{m}\right)^2 + r^{BBhh}, \\
F^{BBBB} &= \tfrac{1}{32} g^2 \left(\tfrac{M}{m}\right)^2 + r^{BBBB}, & F^{AAhh} &= \tfrac{1}{8} g^2 + r^{AAhh}.
\end{aligned}
\]
Hence, $\Gamma^{0,\Lambda_0}$ in total involves $1 + 10 + 11 + 15 = 37$ relevant parameters. We now obtain the form of the bare functional $L^{\Lambda_0,\Lambda_0}$, together with its order $l = 0$ explicitly given, upon deleting in the two-point functions the contributions of the order $l = 0$, i.e. keeping there only the 10 parameters which appear in the various self-energies.

Appendix B

Analyzing the STI, vertex functions (72) with one operator insertion, generated by the BRS-variations, have to be considered, too. These insertions have mass dimension $D = 2$. We recall the notation (47) and (48) of the corresponding Fourier transform, presenting the respective relevant part of these four vertex functions with one insertion,
\[
\hat{\Gamma}^{0,\Lambda_0}_{\gamma^a_\mu}(q;\Phi)\big|_{\rm rel} = -i q_\mu c^a(-q) R_1 + \epsilon^{arb} \int_k A^r_\mu(k) c^b(-q-k)\, g R_2,
\]
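\[
\begin{aligned}
\hat\Gamma^{0,\Lambda_0}_{\gamma}(q;\Phi)|_{\rm rel} &= -\tfrac12 g \int_k B^r(k) c^r(-q-k)\, R_3, \\
\hat\Gamma^{0,\Lambda_0}_{\gamma^a}(q;\Phi)|_{\rm rel} &= m c^a(-q) R_4 + \int_k h(k) c^a(-q-k)\, \tfrac12 g R_5 + \epsilon^{arb} \int_k B^r(k) c^b(-q-k)\, \tfrac12 g R_6, \\
\hat\Gamma^{0,\Lambda_0}_{\omega^a}(q;\Phi)|_{\rm rel} &= \epsilon^{ars} \int_k c^r(k) c^s(-q-k)\, \tfrac12 g R_7.
\end{aligned}
\]

There appear 7 relevant parameters

\[ R_i = 1 + r_i, \qquad r_i = O(\hbar), \qquad i = 1, \ldots, 7. \]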
All the other two-point functions, and the higher ones, of course, are of irrelevant type.

Appendix C

As a consequence of the expansion in the mass parameters the conditions following from the fact that the relevant part of the functional $\Gamma_1$ should vanish,

\[ \Gamma_1(A, h, B, \bar c, c)|_{\dim \le 5} \overset{!}{=} 0, \]

can be reordered according to the value of $\nu$ which appears. We get contributions for $0 \le \nu \le 3$. The value of $\nu$ in the various relevant couplings is indicated as a superscript in parentheses if $\nu > 0$. We explicitly indicate the momentum and the power of $m$ in front of each STI. The power of $m$ indicates the value of $\nu$ in the corresponding contribution to $\Gamma_1$.

Two fields

(I) $\delta_{A^a_\mu(q)}\delta_{c^r(k)}\Gamma_1|_0$
(a) $0 \overset{!}{=} m^2 q_\mu \{-(1 + \delta m^{2(2)}) R_1 + \Sigma^{AB(1)} R_4 + 1 + \tfrac{1}{\alpha}\Sigma^{\bar c c(2)}\}$,
(b) $0 \overset{!}{=} q^2 q_\mu \{-\tfrac{1}{\alpha}(1 + \Sigma^{\rm long}) R_1 + \tfrac{1}{\alpha}(1 + \dot\Sigma^{\bar c c})\}$.

(II) $\delta_{B^a(q)}\delta_{c^r(k)}\Gamma_1|_0$
(a) $0 \overset{!}{=} m^3 \{(\alpha + \Sigma^{BB(2)}) R_4 - (\alpha + \Sigma^{\bar c c(2)}) - \tfrac{g}{2}\kappa^{(3)} R_3\}$,
(b) $0 \overset{!}{=} m q^2 \{-\Sigma^{AB(1)} R_1 + (1 + \dot\Sigma^{BB}) R_4 - (1 + \dot\Sigma^{\bar c c})\}$.

Three fields

(III) $\delta_{A^r_\mu(p)}\delta_{A^s_\nu(q)}\delta_{c^t(k)}\Gamma_1|_0$
(a) $0 \overset{!}{=} (p_\mu p_\nu - q_\mu q_\nu)\{-2F^{AAA} R_1 - \tfrac{1}{\alpha}(F_1^{\bar c c A} - r_2^{\bar c c A}) + [\tfrac{1}{\alpha}(1 + \Sigma^{\rm long}) - (1 + \Sigma^{\rm trans})]\, g R_2\}$,
(b) $0 \overset{!}{=} (p^2 - q^2)\delta_{\mu\nu}\{2F^{AAA} R_1 + (1 + \Sigma^{\rm trans})\, g R_2\}$.
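(IV) $\delta_{A^r_\mu(p)}\delta_{B^s(q)}\delta_{c^t(k)}\Gamma_1|_0$
(a) $0 \overset{!}{=} m p_\mu \{2F^{BBA} R_4 + \tfrac12 g \Sigma^{AB(1)} R_6 + \tfrac{1}{\alpha} F^{\bar c c B,(1)} - r_2^{\bar c c A}\}$,
(b) $0 \overset{!}{=} m q_\mu \{g \Sigma^{AB(1)} R_2 + 4F^{BBA} R_4 + (F_1^{\bar c c A} - r_2^{\bar c c A})\}$.

(V) $\delta_{B^r(p)}\delta_{B^s(q)}\delta_{c^t(k)}\Gamma_1|_0$
$0 \overset{!}{=} (p^2 - q^2)\{2R_1 F^{BBA} + (1 + \dot\Sigma^{BB})\tfrac{g}{2} R_6\}$.

(VI) $\delta_{A^r_\mu(p)}\delta_{h(q)}\delta_{c^t(k)}\Gamma_1|_0$
(a) $0 \overset{!}{=} m p_\mu \{-2R_1 F^{AAh(1)} + R_4 (F_1^{hBA} - r_2^{hBA}) + \Sigma^{AB(1)} \tfrac12 g R_5 - \tfrac{1}{\alpha} F^{\bar c c h(1)}\}$,
(b) $0 \overset{!}{=} m q_\mu \{-2R_1 F^{AAh(1)} + 2R_4 F_1^{hBA}\}$.

(VII) $\delta_{h(p)}\delta_{B^s(q)}\delta_{c^t(k)}\Gamma_1|_0$
(a) $0 \overset{!}{=} m^2 \{(\tfrac{M^2}{m^2} + \Sigma^{hh(2)})(-\tfrac12 g R_3) + 2F^{BBh(1)} R_4 + F^{\bar c c h(1)} + (\alpha + \Sigma^{BB(2)})\tfrac12 g R_5\}$,
(b) $0 \overset{!}{=} p^2 \{F_1^{hBA} R_1 - (1 + \dot\Sigma^{hh})\tfrac12 g R_3\}$,
(c) $0 \overset{!}{=} q^2 \{-F_1^{hBA} R_1 + (1 + \dot\Sigma^{BB})\tfrac12 g R_5\}$,
(d) $0 \overset{!}{=} k^2 \{r_2^{hBA} R_1\}$.

(VIII) $\delta_{c^t(q)}\delta_{c^s(p)}\delta_{\bar c^r(k)}\Gamma_1|_0$
(a) $0 \overset{!}{=} m^2 \{2F^{\bar c c B(1)} R_4 - (\alpha + \Sigma^{\bar c c(2)})\, g R_7\}$,
(b) $0 \overset{!}{=} k^2 \{F_1^{\bar c c A} R_1 - r_2^{\bar c c A} R_1 - (1 + \dot\Sigma^{\bar c c})\, g R_7\}$,
(c) $0 \overset{!}{=} (p^2 + q^2)\{r_2^{\bar c c A} R_1\}$.

Four fields

(IX) $\delta_{h(p)}\delta_{h(q)}\delta_{B^1(k)}\delta_{c^1(l)}\Gamma_1|_0$
$0 \overset{!}{=} m \{6F^{hhh,(1)}(-\tfrac12 g R_3) + 4F^{BBhh} R_4 + 2F^{BBh,(1)} g R_5 + 2r^{hh\bar c c}\}$.

(X) $\delta_{B^1(k)}\delta_{B^1(p)}\delta_{B^2(q)}\delta_{c^2(l)}\Gamma_1|_0$
$0 \overset{!}{=} m \{-F^{BBh,(1)} g R_3 + 8F^{BBBB} R_4 + (2r_1^{BB\bar c c} + r_2^{BB\bar c c})\}$.

(XI) $\delta_{h(l)}\delta_{\bar c^3(k)}\delta_{c^1(p)}\delta_{c^2(q)}\Gamma_1|_0$
$0 \overset{!}{=} m \{2r^{hB\bar c c} R_4 + F^{\bar c c B(1)} g R_5 + F^{\bar c c h,(1)} g R_7\}$.

(XII) $\delta_{c^2(k)}\delta_{\bar c^2(l)}\delta_{c^1(p)}\delta_{B^1(q)}\Gamma_1|_0$
$0 \overset{!}{=} m \{F^{\bar c c h(1)}(-\tfrac12 g R_3) + (2r_1^{BB\bar c c} - r_2^{BB\bar c c}) R_4 + F^{\bar c c B(1)}(\tfrac12 g R_6 - g R_7) + 2r^{\bar c c \bar c c}\}$.

(XIII)$_1$ $\delta_{A^1_\mu(k)}\delta_{A^2_\nu(p)}\delta_{B^1(q)}\delta_{c^2(l)}\Gamma_1|_0$
$0 \overset{!}{=} 2r_2^{AABB} R_4 + r_2^{AA\bar c c}$.

(XIII)$_2$ $\delta_{A^1_\mu(k)}\delta_{A^1_\nu(p)}\delta_{B^2(q)}\delta_{c^2(l)}\Gamma_1|_0$
$0 \overset{!}{=} m \{-F^{AAh(1)} g R_3 + 4F_1^{AABB} R_4 + 2r_1^{AA\bar c c}\}$.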
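(XIV) $\delta_{A^1_\mu(p)}\delta_{A^1_\nu(q)}\delta_{A^2_\rho(k)}\delta_{c^2(l)}\Gamma_1|_0$
(a) $0 \overset{!}{=} 2\delta_{\mu\nu} l_\rho \{4(F_1^{AAAA} + r_2^{AAAA}) R_1 + 2F^{AAA} g R_2 + \tfrac{1}{\alpha} r_1^{AA\bar c c}\}$,
(b) $0 \overset{!}{=} \delta_{\mu\nu}(p_\rho + q_\rho)\{\tfrac{2}{\alpha} r_1^{AA\bar c c}\}$,
(c) $0 \overset{!}{=} (\delta_{\mu\rho} l_\nu + \delta_{\nu\rho} l_\mu)\{-4F_1^{AAAA} R_1 - 2F^{AAA} g R_2\}$,
(d) $0 \overset{!}{=} (\delta_{\mu\rho} p_\nu + \delta_{\nu\rho} q_\mu)\{0\}$,
(e) $0 \overset{!}{=} (\delta_{\mu\rho} q_\nu + \delta_{\nu\rho} p_\mu)\{-\tfrac{1}{\alpha} r_2^{AA\bar c c}\}$.

(XV)$_1$ $\delta_{B^1(p)}\delta_{B^1(q)}\delta_{A^2_\mu(k)}\delta_{c^2(l)}\Gamma_1|_0$
(a) $0 \overset{!}{=} l_\mu \{4F_1^{AABB} R_1 + 2F^{BBA} g R_6\}$,
(b) $0 \overset{!}{=} k_\mu \{r_1^{BB\bar c c}\}$.

(XV)$_2$ $\delta_{B^1(p)}\delta_{B^2(q)}\delta_{A^1_\mu(k)}\delta_{c^2(l)}\Gamma_1|_0$
(a) $0 \overset{!}{=} p_\mu \{-2r_2^{AABB} R_1 + 2F^{BBA} g R_2 + F_1^{hBA} g R_3\}$,
(b) $0 \overset{!}{=} q_\mu \{-2r_2^{AABB} R_1 - 2F^{BBA} g R_2 + 2F^{BBA} g R_6\}$,
(c) $0 \overset{!}{=} k_\mu \{-2r_2^{AABB} R_1 + F_1^{hBA} \tfrac12 g R_3 + r_2^{hBA} \tfrac12 g R_3 + F^{BBA} g R_6 - \tfrac{1}{\alpha} r_2^{BB\bar c c}\}$.

(XVI) $\delta_{h(p)}\delta_{A^1_\mu(k)}\delta_{B^2(q)}\delta_{c^3(l)}\Gamma_1|_0$
(a) $0 \overset{!}{=} p_\mu \{F_1^{hBA} g (R_6 - R_2) - r_2^{hBA} g R_2\}$,
(b) $0 \overset{!}{=} q_\mu \{F_1^{hBA} g R_2 - r_2^{hBA} g R_2 + 2F^{BBA} g R_5\}$,
(c) $0 \overset{!}{=} k_\mu \{F_1^{hBA} \tfrac12 g R_6 - r_2^{hBA} \tfrac12 g R_6 + F^{BBA} g R_5 - \tfrac{1}{\alpha} r^{hB\bar c c}\}$.

(XVII) $\delta_{h(p)}\delta_{h(q)}\delta_{A^1_\mu(k)}\delta_{c^1(l)}\Gamma_1|_0$
(a) $0 \overset{!}{=} l_\mu \{4F^{AAhh} R_1 - F_1^{hBA} g R_5\}$,
(b) $0 \overset{!}{=} k_\mu \{r_2^{hBA} g R_5 + \tfrac{2}{\alpha} r^{hh\bar c c}\}$.

(XVIII) $\delta_{A^2_\mu(k)}\delta_{c^2(p)}\delta_{c^1(q)}\delta_{\bar c^1(l)}\Gamma_1|_0$
(a) $0 \overset{!}{=} l_\mu \{F_1^{\bar c c A} g (R_2 - R_7) + \tfrac{2}{\alpha} r^{\bar c c \bar c c}\}$,
(b) $0 \overset{!}{=} p_\mu \{2r_1^{AA\bar c c} R_1 + r_2^{\bar c c A} g (R_2 - R_7) + \tfrac{2}{\alpha} r^{\bar c c \bar c c}\}$,
(c) $0 \overset{!}{=} q_\mu \{-r_2^{AA\bar c c} R_1 - r_2^{\bar c c A} g R_7 + \tfrac{2}{\alpha} r^{\bar c c \bar c c}\}$.

Five fields

(XIX) $\delta_{h(p)}\delta_{h(q)}\delta_{h(k)}\delta_{B^1(l)}\delta_{c^1(l')}\Gamma_1|_0$
$0 \overset{!}{=} -2F^{hhhh} R_3 + F^{hhBB} R_5$.

(XX) $\delta_{h(p)}\delta_{B^1(q)}\delta_{B^1(k)}\delta_{B^2(l)}\delta_{c^2(l')}\Gamma_1|_0$
$0 \overset{!}{=} -F^{BBhh} R_3 + 2F^{BBBB} R_5$.

(XXI) $\delta_{A^1_\mu(k)}\delta_{A^1_\nu(p)}\delta_{h(k)}\delta_{B^2(l)}\delta_{c^2(l')}\Gamma_1|_0$
$0 \overset{!}{=} -F^{AAhh} R_3 + F_1^{AABB} R_5$.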
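(XXII) $\delta_{A^1_\mu(k)}\delta_{B^1(p)}\delta_{c^1(l')}\delta_{A^2_\nu(q)}\delta_{B^3(l)}\Gamma_1|_0$
$0 \overset{!}{=} r_2^{AABB}(R_6 - 2R_2)$.

(XXIII) $\delta_{A^1_\mu(k)}\delta_{B^1(q)}\delta_{A^2_\nu(p)}\delta_{c^2(l')}\delta_{h(l)}\Gamma_1|_0$
$0 \overset{!}{=} r_2^{AABB} R_5$.

(XXIV) $\delta_{A^3_\mu(k)}\delta_{A^3_\nu(p)}\delta_{\bar c^2(q)}\delta_{c^3(l)}\delta_{c^1(l')}\Gamma_1|_0$
$0 \overset{!}{=} r_2^{AA\bar c c} R_2 + r_1^{AA\bar c c} R_7$.

(XXV) $\delta_{A^3_\mu(k)}\delta_{\bar c^3(q)}\delta_{A^2_\nu(p)}\delta_{c^3(l)}\delta_{c^1(l')}\Gamma_1|_0$
$0 \overset{!}{=} r_2^{AA\bar c c}(3R_2 - R_7)$.

(XXVI) $\delta_{B^1(p)}\delta_{B^1(q)}\delta_{\bar c^1(k)}\delta_{c^2(l)}\delta_{c^3(l')}\Gamma_1|_0$
$0 \overset{!}{=} r_2^{BB\bar c c}(R_6 - R_7) - r_1^{BB\bar c c} R_7$.

(XXVII) $\delta_{B^1(p)}\delta_{\bar c^1(k)}\delta_{B^2(q)}\delta_{c^3(l)}\delta_{c^1(l')}\Gamma_1|_0$
$0 \overset{!}{=} -r^{hB\bar c c} R_3 + r_2^{BB\bar c c}(3R_6 - 2R_7)$.

(XXVIII) $\delta_{h(p)}\delta_{h(q)}\delta_{\bar c^1(k)}\delta_{c^2(l)}\delta_{c^3(l')}\Gamma_1|_0$
$0 \overset{!}{=} r^{hB\bar c c} R_5 + r^{hh\bar c c} R_7$.

(XXIX) $\delta_{h(p)}\delta_{B^1(q)}\delta_{c^1(l)}\delta_{\bar c^2(k)}\delta_{c^2(l')}\Gamma_1|_0$
$0 \overset{!}{=} 2r^{hh\bar c c} R_3 - 2r_1^{BB\bar c c} R_5 + r_2^{BB\bar c c} R_5 + r^{hB\bar c c}(-R_6 + 2R_7)$.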
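As a quick plausibility check of the identities above, one can insert the tree-order values of the couplings from Appendix A and verify that several of the conditions are satisfied identically. The sketch below does this in exact rational arithmetic. It assumes that at tree order all $r$-parameters vanish and all $R_i = 1$; the variable `mu2` is a name introduced here for $(M/m)^2$, and only identities that involve neither self-energies nor $\nu$-superscripted couplings are included.

```python
from fractions import Fraction


def tree_level_residuals(g, mu2):
    """Evaluate a few Appendix C identities at tree order.

    Assumptions of this sketch: every r-parameter is 0, every R_i is 1,
    g is the gauge coupling, and mu2 stands for (M/m)^2.
    Each returned value is the content of the curly brackets of one identity.
    """
    R = 1  # R_i = 1 + r_i with r_i = 0 at tree order
    F_AAA = -g / 2
    F_BBA = -g / 4
    F1_hBA = g / 2
    F1_AAAA = g**2 / 4
    F1_AABB = g**2 / 8
    F_AAhh = g**2 / 8
    F_hhhh = g**2 * mu2 / 32
    F_BBhh = g**2 * mu2 / 16
    F_BBBB = g**2 * mu2 / 32
    return {
        "XIV(c)": -4 * F1_AAAA * R - 2 * F_AAA * g * R,
        "XV_1(a)": 4 * F1_AABB * R + 2 * F_BBA * g * R,
        "XVII(a)": 4 * F_AAhh * R - F1_hBA * g * R,
        "XIX": -2 * F_hhhh * R + F_BBhh * R,
        "XX": -F_BBhh * R + 2 * F_BBBB * R,
        "XXI": -F_AAhh * R + F1_AABB * R,
    }


# Arbitrary rational test values; the residuals should vanish for any choice.
residuals = tree_level_residuals(Fraction(3, 7), Fraction(5, 2))
for name, value in residuals.items():
    print(name, value)
```

Exact fractions are used so that a vanishing residual is an exact statement rather than a floating-point coincidence.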
References

[1] C. Becchi, A. Rouet and R. Stora, Renormalization of gauge theories, Ann. Phys. (N.Y.) 98 (1976) 287–321.
[2] L. D. Faddeev and A. A. Slavnov, Gauge Fields: Introduction to Quantum Theory (Benjamin, Reading, MA, 1980).
[3] G. Keller and Ch. Kopper, Perturbative renormalization of composite operators via flow equations I, Comm. Math. Phys. 148 (1992) 445–467.
[4] G. Keller, Ch. Kopper and M. Salmhofer, Perturbative renormalization and effective Lagrangians in $\Phi^4_4$, Helv. Phys. Acta 65 (1991) 32–52.
[5] Ch. Kopper and V. F. Müller, Renormalization proof for spontaneously broken Yang–Mills theory with flow equations, Comm. Math. Phys. 209 (2000) 477–516.
[6] Ch. Kopper, V. F. Müller and Th. Reisz, Temperature independent renormalization of finite temperature field theory, Ann. Henri Poincaré 2 (2001) 387–402.
[7] V. F. Müller, Perturbative renormalization by flow equations, Rev. Math. Phys. 15 (2005) 491–558.
[8] J. Polchinski, Renormalization and effective Lagrangians, Nucl. Phys. B 231 (1984) 269–295.
[9] I. V. Tyutin, Gauge invariance in field theory and statistical mechanics, Lebedev Institute, Report No: FIAN-39, preprint (1975).
[10] K. Wilson, Renormalization group and critical phenomena I. Renormalization group and the Kadanoff scaling picture, Phys. Rev. B 4 (1971) 3174–3183.
[11] K. Wilson, Renormalization group and critical phenomena II. Phase cell analysis of critical behaviour, Phys. Rev. B 4 (1971) 3184–3205.
[12] F. Wegner and A. Houghton, Renormalization group equations for critical phenomena, Phys. Rev. A 8 (1973) 401–412.
[13] J. Zinn-Justin, Quantum Field Theory and Critical Phenomena, 3rd edn. (Clarendon Press, Oxford, 1997), Chap. 21.
[14] J. Zinn-Justin, Renormalization of gauge theories, in Trends in Elementary Particle Theory, eds. H. Rollnik and K. Dietz, Lecture Notes in Physics, Vol. 37 (Springer-Verlag, 1975), pp. 2–40.
August 12, 2009 3:58 WSPC/148-RMP
J070-00374
Reviews in Mathematical Physics Vol. 21, No. 7 (2009) 821–876 © World Scientific Publishing Company
THE NON-RELATIVISTIC LIMIT OF THE EULER–NORDSTRÖM SYSTEM WITH COSMOLOGICAL CONSTANT∗
JARED SPECK Department of Mathematics, Rutgers University, Hill Center, 110, Frelinghuysen Road, Piscataway, NJ 08854, USA
[email protected]
Received 28 October 2008
Revised 14 May 2009

In this paper, we study the singular limit $c \to \infty$ of the family of Euler–Nordström systems indexed by the parameters $\kappa^2$ and $c$ ($\mathrm{EN}^c_\kappa$), where $\kappa^2 > 0$ is the cosmological constant and $c$ is the speed of light. Using Christodoulou’s techniques to generate energy currents, we develop Sobolev estimates that show initial data belonging to an appropriate Sobolev space launch unique solutions to the $\mathrm{EN}^c_\kappa$ system that converge to corresponding unique solutions of the Euler–Poisson system with the cosmological constant $\kappa^2$ as $c$ tends to infinity.

Keywords: Cosmological constant; energy current; Euler equations; Euler–Poisson; hyperbolic PDEs; Newtonian limit; non-relativistic limit; Gunnar Nordström; relativistic fluid; scalar gravity; singular limit; Vlasov–Nordström.

Mathematics Subject Classification 2000: 35L80, 35M99, 83C55, 83D05
1. Introduction

The Euler–Nordström system models the evolution of a relativistic perfect fluid with self-interaction mediated by Nordström’s theory of scalar gravity. In [22], we introduced the system in dimensionless units and showed that the Cauchy problem is locally well-posed in the Sobolev space$^{a}$ $H^N$ for $N \ge 3$. In this article, we study the non-relativistic (also known as the “Newtonian”) limit of the family of Euler–Nordström systems indexed by the parameters $\kappa$ and $c$ ($\mathrm{EN}^c_\kappa$), where $\kappa^2$ is the cosmological constant$^{b}$ and $c$ is the speed of light. The limit $c \to \infty$ is singular because the $\mathrm{EN}^c_\kappa$ system is hyperbolic for all finite $c$, while the limiting system, namely the Euler–Poisson system with a cosmological constant ($\mathrm{EP}_\kappa$), is not hyperbolic. Using Christodoulou’s techniques [8] to generate energy currents, together with elementary harmonic analysis, we develop Sobolev estimates and use them to study the singular limit $c \to \infty$.

Before introducing our main theorem, we place this article in context by mentioning some related works. We remark that our list of references is not exhaustive. In [13], Klainerman and Majda study singular limits in quasilinear symmetric hyperbolic systems, and in particular the incompressible limit (as the Mach number tends to 0) of compressible fluids. In [20], Rendall studies the singular limit $c \to \infty$ of the Vlasov–Einstein system and proves that a class of data launches solutions to this system that converge to corresponding solutions of the Vlasov–Poisson system as $c \to \infty$, thereby obtaining the first rigorous existence proof for the $c \to \infty$ limit of the Einstein equations coupled to a matter field. In [6], Calogero and Lee study the singular limit $c \to \infty$ of the Vlasov–Nordström system and prove that a class of data launches solutions to this system that converge to corresponding solutions of the Vlasov–Poisson system at the rate $O(c^{-1})$, a result analogous to our main theorem. In [2], Bauer improves the rate of convergence to $O(c^{-4})$, which is known as a “1.5 post-Newtonian approximation.” In [3], Bauer, Kunze, Rein, and Rendall study the Vlasov–Maxwell and Vlasov–Nordström systems and obtain a formula that relates the radiation flux at infinity to the motion of matter and that is analogous to the Einstein quadrupole formula (see, e.g., [23]) in general relativity. In [17], Oliynyk studies the singular limit $c \to \infty$ of the Euler–Einstein system. He exhibits a class of data that launches solutions that converge to corresponding solutions of the Euler–Poisson system as $c \to \infty$, while in [18], he improves the rate of convergence by showing that the “first post-Newtonian expansion” is valid.

Our main theorem is in the spirit of the above results. We state it loosely here, and we state and prove it rigorously as Theorem 11.2:

Main Theorem. Let $N \ge 4$ be an integer, and assume that $\kappa^2 > 0$. Then initial data belonging to a suitable affine shift of the Sobolev space $H^N$ launch unique solutions to the $\mathrm{EN}^c_\kappa$ system that converge uniformly on a spacetime slab $[0, T] \times \mathbb{R}^3$ to corresponding unique solutions of the $\mathrm{EP}_\kappa$ system as the speed of light $c$ tends to infinity.

We remark that although we explicitly discuss only the $\mathrm{EN}^c_\kappa$ system in this article, the techniques we apply can be generalized under suitable hypotheses to study singular limits of hyperbolic systems that derive from a Lagrangian and that feature a small parameter.$^{c}$

∗ This article was finalized while the author was a postdoctoral researcher in the Princeton University Math. Department.
$^{a}$ More precisely, we showed local well-posedness in a suitable affine shift of $H^N$ for $N \ge 3$, where by “affine shift” of $H^N$ we mean the collection of all functions $F$ such that $\|F - \bar V\|_{H^N} < \infty$, where $\bar V$ is a fixed constant array; see Sec. 2 for further discussion of this function space.
$^{b}$ The parameter $\kappa^2 > 0$ is fixed throughout this article. Remark 4.1 contains an explanation of why our proof breaks down in the case $\kappa^2 = 0$.
$^{c}$ The small parameter is $c^{-2}$ in the case of the $\mathrm{EN}^c_\kappa$ system.
As discussed in [22], we consider the $\mathrm{EN}^c_\kappa$ system to be a mathematical scalar caricature of the Euler–Einstein system with cosmological constant ($\mathrm{EE}^c_\kappa$). We now provide some justification for this point of view. First of all, like the $\mathrm{EE}^c_\kappa$ system, the $\mathrm{EN}^c_\kappa$ system is a metric theory of gravity featuring gravitational waves that propagate along null cones. Second, the main theorem stated above shows that if $\kappa^2 > 0$, then the Newtonian limit of the $\mathrm{EN}^c_\kappa$ system is the $\mathrm{EP}_\kappa$ system. Furthermore, as previously mentioned, Oliynyk’s work [17] shows that the Newtonian limit of the $\mathrm{EE}^c_0$ system is the $\mathrm{EP}_0$ system. Based on these considerations, we therefore expect$^{d}$ that achieving an understanding of the evolution of solutions to the $\mathrm{EN}^c_\kappa$ system will provide insight into the behavior of solutions to the vastly more complicated $\mathrm{EE}^c_\kappa$ system.

$^{d}$ We temper this expectation by recalling that our proof does not work in the case $\kappa^2 = 0$ and that in contrast to the initial value problem studied here, Oliynyk considers the case $\kappa^2 = 0$ with compactly supported data under an adiabatic equation of state. This special class of equations of state allows one to make a “Makino” change of variables that regularizes the equations and overcomes the singularities that typically occur in the equations in regions where the proper energy density vanishes. Furthermore, this change of variables enables one to write the relativistic Euler equations in symmetric hyperbolic form. See [14, 19] for additional examples of this change of variables in the context of various fluid models.

1.1. Outline of the structure of the paper

Before proceeding, we outline the structure of this article. In Sec. 2, we introduce some notation that we use throughout our discussion. In Sec. 3, we derive the $\mathrm{EN}^c_\kappa$ equations with the parameter $c$ and then rewrite the equations using Newtonian state-space variables, a change of variables that is essential for comparing the relativistic system $\mathrm{EN}^c_\kappa$ to the non-relativistic system $\mathrm{EP}_\kappa$. In Sec. 4, we provide for convenience the $\mathrm{EN}^c_\kappa$ and $\mathrm{EP}_\kappa$ systems in the form used for the remainder of the article. From this form, it is clear that formally, $\lim_{c\to\infty} \mathrm{EN}^c_\kappa = \mathrm{EP}_\kappa$. In Sec. 5, we introduce standard PDE matrix notation and discuss the Equations of Variation ($\mathrm{EOV}^c_\kappa$), which are the linearization of the $\mathrm{EN}^c_\kappa$ and $\mathrm{EP}_\kappa$ systems. In Sec. 6, we provide an extension of the Sobolev–Moser calculus that is useful for bookkeeping powers of $c$. We also introduce some hypotheses on the $c$-dependence of the equation of state that are sufficient to prove our main theorem. We then apply the calculus to the $\mathrm{EN}^c_\kappa$ system by proving several preliminary lemmas that are useful in the technical estimates that appear later. Roughly speaking, the lemmas describe the $c \to \infty$ asymptotics of the $\mathrm{EN}^c_\kappa$ equations. In Sec. 7, we introduce the energy currents that are used to control the Sobolev norms of the solutions. One of the essential features of the currents that we use is that they have a positivity property that is uniform for all large $c$. In Sec. 8, we describe a class of initial data for which our main theorem holds, and in Sec. 9, we smooth the initial data for technical reasons. In Sec. 10, we recall the local existence result [22] for the $\mathrm{EN}^c_\kappa$ system and prove an important precursor to our main theorem. Namely, we prove that solutions to the $\mathrm{EN}^c_\kappa$ system exist on a common interval of time $[0, T]$ for all large $c$. This proof is separated into two parts. The first part is a continuous induction argument based on some technical lemmas. The second part is the proof of these technical lemmas, which are a series of energy estimates derived with the aid of the calculus developed in Sec. 6. The two basic tools we use for generating the energy estimates are energy currents and the estimate $\|f\|_{H^2} \le C \cdot \|(\Delta - \kappa^2) f\|_{L^2}$, for $f \in H^2$. In Sec. 11, we state and prove our main theorem.

2. Remarks on the Notation

We introduce here some notation that is used throughout this article, some of which is non-standard. We assume that the reader is familiar with standard notation for the $L^p$ spaces and the Sobolev spaces $H^k$. Unless otherwise stated, the symbols $L^p$ and $H^k$ refer to $L^p(\mathbb{R}^3)$ and $H^k(\mathbb{R}^3)$, respectively.

2.1. Notation regarding differential operators

If $F$ is a scalar or finite-dimensional array-valued function on $\mathbb{R}^{1+3}$, then $D^{(a)}F$ denotes the array consisting of all $a$th-order spacetime coordinate partial derivatives (including partial derivatives with respect to time) of every component of $F$, while $\partial^{(a)}F$ denotes the array consisting of all $a$th-order spatial coordinate partial derivatives of every component of $F$. We write $DF$ and $\partial F$, respectively, instead of $D^{(1)}F$ and $\partial^{(1)}F$. $\nabla$ denotes the Levi–Civita connection corresponding to the spacetime metric $g$ defined in (3.4).

2.2. Index conventions

We adopt Einstein’s convention that diagonally repeated Latin indices are summed from 1 to 3, while diagonally repeated Greek indices are summed from 0 to 3. Indices are raised and lowered using the spacetime metric $g$, which is defined in (3.4), or the Minkowski metric $\underline{g}$, depending on context.

2.3. Notation regarding norms and function spaces

If $E \subset \mathbb{R}^3$ and $\bar V \in \mathbb{R}^n$ is a constant array, we use the notation

\[ \|F\|_{L^p_{\bar V}(E)} \overset{\rm def}{=} \|F - \bar V\|_{L^p(E)}, \tag{2.1} \]

and we denote the set of all (array-valued) Lebesgue measurable functions $F$ such that $\|F\|_{L^p_{\bar V}(E)} < \infty$ by $L^p_{\bar V}(E)$. We also define the $H^j_{\bar V}(E)$ norm of $F$ by

\[ \|F\|_{H^j_{\bar V}(E)} \overset{\rm def}{=} \Big( \sum_{|\vec\alpha| \le j} \|\partial_{\vec\alpha}(F - \bar V)\|^2_{L^2(E)} \Big)^{1/2}, \tag{2.2} \]

where $\partial_{\vec\alpha}$ is a multi-indexed operator representing repeated partial differentiation with respect to spatial coordinates. Unless we indicate otherwise, we assume that $E = \mathbb{R}^3$ when the set $E$ is not explicitly written.
Remark 2.1. Technically speaking, the $\|\cdot\|_{H^j_{\bar V}}$ are not norms in general, since for example $\|0\|_{H^j_{\bar V}} = \infty$ unless $\bar V = 0$. This is not a problem because in this article, we only study the $\|\cdot\|_{H^j_{\bar V}}$ “norm” of functions $F$ that by design feature $\|F\|_{H^j_{\bar V}} < \infty$.

If $F$ is a map from $[0, T]$ into the normed function space $X$, we use the notation

\[ |||F|||_{X,T} \overset{\rm def}{=} \sup_{t \in [0,T]} \|F(t)\|_X. \tag{2.3} \]

We also use the notation $C^j([0, T], X)$ to denote the set of $j$-times continuously differentiable maps from $(0, T)$ into $X$ that, together with their derivatives up to order $j$, extend continuously to $[0, T]$. If $D \subset \mathbb{R}^n$, then $C^j_b(D)$ denotes the set of $j$-times continuously differentiable functions (either scalar or array-valued, depending on context) on $\mathrm{Int}(D)$ with bounded derivatives up to order $j$ that extend continuously to the closure of $D$. The norm of a function $F \in C^j_b(D)$ is defined by

\[ |F|_{j,D} \overset{\rm def}{=} \sum_{|\vec I| \le j} \sup_{z \in D} |\partial_{\vec I} F(z)|, \tag{2.4} \]

where $\partial_{\vec I}$ is a multi-indexed operator representing repeated partial differentiation with respect to the arguments $z$ of $F$, which may be either spacetime coordinates or state-space variables depending on context.

2.4. Notation for c-independent inequalities

If $A_c$ is a quantity that depends on the parameter $c$, and $X$ is a quantity such that $A_c \le X$ holds for all large $c$, then we indicate this by writing

\[ A_c \lesssim X. \tag{2.5} \]
2.5. Notation regarding constants

We use the symbol $C$ to denote a generic constant in the estimates below which is free to vary from line to line. If the constant depends on quantities such as real numbers $N$, subsets $D$ of $\mathbb{R}^n$, functions $F$ of the state-space variables, etc., that are peripheral to the argument at hand, we sometimes indicate this dependence by writing $C(N, D, F)$, etc. We explicitly show the dependence on such quantities when it is (in our judgment) illuminating, but we often omit the dependence on such quantities when it overburdens the notation without being illuminating. Occasionally, we shall use additional symbols such as $\Lambda_1$, $Z$, $L_2$, etc., to denote constants that play a distinguished role in the discussion.

3. The Origin of the $\mathrm{EN}^c_\kappa$ System

In this section, we insert both the speed of light $c$ and Newton’s universal gravitational constant $G$ into the Euler–Nordström system with a cosmological constant and perform a Newtonian change of variables, which brings the system into the form (4.1)–(4.8). A similar analysis for the Vlasov–Nordström system$^{e}$ is carried out in [6].

3.1. Deriving the equations with c as a parameter

We assume that spacetime is a four-dimensional Lorentzian manifold $M$ and furthermore, that there is a global rectangular (inertial) coordinate system on $M$. We use the notation

\[ x = (x^0, x^1, x^2, x^3) \tag{3.1} \]
to denote the components of a spacetime point $x$ in this fixed coordinate system, and for this preferred time-space splitting, we identify $t = x^0$ with time and $s = (x^1, x^2, x^3)$ with space. Note that we are breaking with the usual convention, which is $x^0 = ct$. The components of the Minkowski metric and its inverse in the inertial coordinate system are given by

\[ \underline{g}_{\mu\nu} = \mathrm{diag}(-c^2, 1, 1, 1), \tag{3.2} \]
\[ \underline{g}^{\mu\nu} = \mathrm{diag}(-c^{-2}, 1, 1, 1), \tag{3.3} \]

respectively. We adopt Nordström’s postulate, namely that the spacetime metric $g$ is related to the Minkowski metric by a conformal scaling factor:

\[ g_{\mu\nu} = e^{2\phi}\, \underline{g}_{\mu\nu}. \tag{3.4} \]

In (3.4), $\phi$ is the dimensionless cosmological-Nordström potential, a scalar quantity.

We now briefly introduce the notion of a relativistic perfect fluid. Readers may consult [1] or [7] for more background. For a perfect fluid model, the components of the energy-momentum-stress density tensor (which is commonly called the “energy-momentum tensor” in the literature) of matter read

\[ T^{\mu\nu} = c^{-2}(\rho + p) u^\mu u^\nu + p\, g^{\mu\nu} = c^{-2}(\rho + p) u^\mu u^\nu + e^{-2\phi} p\, \underline{g}^{\mu\nu}, \tag{3.5} \]

where $\rho$ is the proper energy density of the fluid, $p$ is the pressure (this “proper” quantity is defined in a local rest frame), and $u$ is the four-velocity, which is subject to the normalization constraint

\[ g_{\mu\nu} u^\mu u^\nu = e^{2\phi}\, \underline{g}_{\mu\nu} u^\mu u^\nu = -c^2. \tag{3.6} \]

The Euler equations for a perfect fluid are (see, e.g., [7])

\[ \nabla_\mu T^{\mu\nu} = 0 \qquad (\nu = 0, 1, 2, 3), \tag{3.7} \]
\[ \nabla_\mu (n u^\mu) = 0, \tag{3.8} \]

where $n$ is the proper number density and $\nabla$ denotes the covariant derivative induced by the spacetime metric $g$.

$^{e}$ The Vlasov–Nordström (VN) model describes a particle density function $f$ on physical space × momentum space that evolves due to self-interaction mediated by Nordström’s theory of gravity. Various aspects of this system are studied, for example, in [4, 5].
Nordström’s theory$^{f}$ [16] provides the following evolution equation$^{g}$ for $\phi$: we define an auxiliary energy-momentum-stress density tensor

\[ T^{\mu\nu}_{\rm aux} \overset{\rm def}{=} e^{6\phi} T^{\mu\nu} = c^{-2} e^{6\phi}(\rho + p) u^\mu u^\nu + e^{4\phi} p\, \underline{g}^{\mu\nu}, \tag{3.9} \]

and postulate that $\phi$ is a solution to

\[ \Box\phi - \kappa^2\phi = -4\pi c^{-4} G e^{4\phi}\, \mathrm{tr}_g T = -4\pi c^{-4} G\, \underline{g}_{\mu\nu} T^{\mu\nu}_{\rm aux} = 4\pi c^{-4} G e^{4\phi}(\rho - 3p). \tag{3.10} \]

Note that

\[ \Box\phi \overset{\rm def}{=} \underline{g}^{\mu\nu}\partial_\mu\partial_\nu\phi = -c^{-2}\partial_t^2\phi + \Delta\phi \tag{3.11} \]

is the wave operator on flat spacetime applied to $\phi$. The virtue of the postulate equation (3.10), as we shall see, is that it provides us with continuity equations (3.25) for an energy-momentum-stress density tensor $\Theta$ in Minkowski space.

We also introduce the entropy per particle, a thermodynamic variable that we denote by $\eta$, and we close the system by supplying an equation of state, which may depend on $c$. A “physical” equation of state for a perfect fluid satisfies the following criteria (see, e.g., [10]):

(1) $\rho \ge 0$ is a function of $n \ge 0$ and $\eta \ge 0$.

(2) $p \ge 0$ is defined by
\[ p = n \left.\frac{\partial\rho}{\partial n}\right|_\eta - \rho, \tag{3.12} \]
where the notation $|_{\cdot}$ indicates partial differentiation with $\cdot$ held constant.

(3) A perfect fluid satisfies
\[ \left.\frac{\partial\rho}{\partial n}\right|_\eta > 0, \qquad \left.\frac{\partial p}{\partial n}\right|_\eta > 0, \qquad \left.\frac{\partial\rho}{\partial\eta}\right|_n \ge 0 \ \text{ with “=” iff } \eta = 0. \tag{3.13} \]
As a consequence, we have that $\sigma$, the speed of sound in the fluid, is always real for $\eta > 0$:
\[ \sigma^2 \overset{\rm def}{=} c^2 \left.\frac{\partial p}{\partial\rho}\right|_\eta = c^2\, \frac{\partial p/\partial n|_\eta}{\partial\rho/\partial n|_\eta} > 0. \tag{3.14} \]

(4) We also demand that the speed of sound is positive and less than the speed of light whenever $n > 0$ and $\eta > 0$:
\[ n > 0 \text{ and } \eta > 0 \implies 0 < \sigma < c. \tag{3.15} \]

$^{f}$ Nordström’s theory of gravity, although shown to be physically wrong through experiment, was the first metric theory of gravitation.
$^{g}$ Nordström considered only the case $\kappa = 0$.
Postulates (1)–(3) express the laws of thermodynamics and fundamental thermodynamic assumptions, while Postulate (4) ensures that at each $x \in M$, vectors that are causal with respect to the sound cone in $T_x M$ are necessarily causal with respect to the gravitational null cone in $T_x M$; see Sec. 7.2.

Remark 3.1. We note that the assumptions $\rho \ge 0$, $p \ge 0$ together imply that the energy-momentum-stress density tensor (3.5) satisfies both the weak energy condition ($T_{\mu\nu} X^\mu X^\nu \ge 0$ holds whenever $X$ is timelike and future-directed with respect to the gravitational null cone) and the strong energy condition ($[T_{\mu\nu} - \tfrac12 g^{\alpha\beta} T_{\alpha\beta}\, g_{\mu\nu}] X^\mu X^\nu \ge 0$ holds whenever $X$ is timelike and future-directed with respect to the gravitational null cone). Furthermore, if we assume that the equation of state is such that $p = 0$ when $\rho = 0$, then (3.14) and (3.15) guarantee that $p \le \rho$. It is then easy to check that $0 \le p \le \rho$ implies the dominant energy condition ($-T^{\mu}{}_{\nu} X^\nu$ is causal and future-directed whenever $X$ is causal and future-directed with respect to the gravitational null cone).

By (3.13), we can solve for $\sigma^2$ and $c^{-2}\rho$ as $c$-indexed functions $\mathfrak{S}^2_c$ and $\mathfrak{R}_c$ respectively of $\eta$ and $p$:

\[ \sigma^2 \overset{\rm def}{=} \mathfrak{S}^2_c(\eta, p), \tag{3.16} \]
\[ c^{-2}\rho \overset{\rm def}{=} \mathfrak{R}_c(\eta, p). \tag{3.17} \]
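To make criteria (1)–(4) concrete, the following sketch evaluates (3.12), (3.14) and (3.15) numerically for one equation of state. Everything specific in it is an assumption made for illustration only and is not taken from the article: the energy density $\rho(n,\eta) = m_0 c^2 n + (1+\eta^2)\, n^{\gamma}/(\gamma-1)$, the exponent $\gamma = 5/3$, and the numerical values of $c$ and $m_0$. For this choice the pressure from (3.12) reduces to $(1+\eta^2) n^\gamma$.

```python
import math

C = 10.0            # hypothetical speed of light (arbitrary units)
M0 = 1.0            # hypothetical rest mass per particle
GAMMA = 5.0 / 3.0   # hypothetical adiabatic exponent


def entropy_factor(eta):
    # Illustrative choice: its eta-derivative vanishes only at eta = 0, cf. (3.13).
    return 1.0 + eta * eta


def rho(n, eta):
    """Assumed proper energy density rho(n, eta)."""
    return M0 * C**2 * n + entropy_factor(eta) * n**GAMMA / (GAMMA - 1.0)


def drho_dn(n, eta):
    return M0 * C**2 + GAMMA * entropy_factor(eta) * n**(GAMMA - 1.0) / (GAMMA - 1.0)


def pressure(n, eta):
    """Pressure from (3.12): p = n * d(rho)/dn at fixed eta, minus rho."""
    return n * drho_dn(n, eta) - rho(n, eta)


def dp_dn(n, eta):
    return GAMMA * entropy_factor(eta) * n**(GAMMA - 1.0)


def sound_speed(n, eta):
    """Speed of sound from (3.14)."""
    return C * math.sqrt(dp_dn(n, eta) / drho_dn(n, eta))


for n in (0.1, 1.0, 10.0):
    for eta in (0.5, 2.0):
        print(n, eta, pressure(n, eta), sound_speed(n, eta) / C)
```

For this particular family the ratio $\sigma/c$ stays strictly between 0 and 1 for all $n > 0$, so condition (3.15) holds; other equations of state would have to be checked separately.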
We also will make use of the following identity implied by (3.14), (3.16), and (3.17):

\[ \left.\frac{\partial\mathfrak{R}_c}{\partial p}\right|_\eta (\eta, p) = \mathfrak{S}^{-2}_c(\eta, p). \tag{3.18} \]

Remark 3.2. Note that $c^{-2}\rho$ has the dimensions of mass density. As we will see in Sec. 6, $\lim_{c\to\infty} \mathfrak{R}_c(\eta, p)$ will be identified with the Newtonian mass density.

We summarize by stating that the Eqs. (3.4)–(3.8), (3.10), (3.12), and (3.17) constitute the $\mathrm{EN}^c_\kappa$ system.

3.2. A reformulation of the $\mathrm{EN}^c_\kappa$ system in Newtonian variables

In this section, we reformulate the $\mathrm{EN}^c_\kappa$ system as a fixed background theory in flat Minkowski space and introduce a Newtonian change of state-space variables. The resulting system (4.1)–(4.8) is an equivalent formulation of the $\mathrm{EN}^c_\kappa$ system. We remark that for the remainder of this article, all indices are raised and lowered with the Minkowski metric $\underline{g}$, so that $\partial^\lambda\phi = \underline{g}^{\mu\lambda}\partial_\mu\phi$. To begin, we use the form of the metric (3.4) to compute that in our inertial coordinate system, the continuity equation (3.7) for the energy-momentum-stress density tensor (3.5) is given by

\[
\begin{aligned}
0 = \nabla_\mu T^{\mu\nu} &= \partial_\mu T^{\mu\nu} + 6 T^{\mu\nu}\partial_\mu\phi - \underline{g}_{\alpha\beta} T^{\alpha\beta}\, \partial^\nu\phi \\
&= \partial_\mu T^{\mu\nu} + 6 T^{\mu\nu}\partial_\mu\phi - e^{-6\phi}\, \underline{g}_{\alpha\beta} T^{\alpha\beta}_{\rm aux}\, \partial^\nu\phi \qquad (\nu = 0, 1, 2, 3),
\end{aligned}
\tag{3.19}
\]
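where $T^{\mu\nu}_{\rm aux}$ is defined in (3.9). For this calculation we made use of the explicit form of the Christoffel symbols of $g$ in our rectangular coordinate system:

\[ \Gamma^\alpha_{\mu\nu} = \delta^\alpha_\nu \partial_\mu\phi + \delta^\alpha_\mu \partial_\nu\phi - \underline{g}_{\mu\nu}\, \underline{g}^{\alpha\beta}\partial_\beta\phi. \tag{3.20} \]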
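The passage from (3.20) to (3.19) amounts to two contractions of the Christoffel symbols with a symmetric tensor. The sketch below checks this numerically at a single point: it builds the symbols of (3.20) from a given gradient of $\phi$, confirms that $\Gamma^\mu_{\mu\lambda} = 4\partial_\lambda\phi$, and compares the connection terms $\Gamma^\mu_{\mu\lambda}T^{\lambda\nu} + \Gamma^\nu_{\mu\lambda}T^{\mu\lambda}$ with the closed form $6T^{\mu\nu}\partial_\mu\phi - \underline{g}_{\alpha\beta}T^{\alpha\beta}\partial^\nu\phi$. The numerical values of $c$, of the gradient and of the tensor components are arbitrary test inputs chosen here, not data from the article.

```python
import math

C = 2.0          # hypothetical value of the speed of light, used only for the check
IDX = range(4)

# Minkowski metric (3.2) and its inverse (3.3) in the inertial coordinates.
G_LOW = [[(-C**2 if a == 0 else 1.0) if a == b else 0.0 for b in IDX] for a in IDX]
G_UP = [[(-C**-2 if a == 0 else 1.0) if a == b else 0.0 for b in IDX] for a in IDX]


def raise_index(covector):
    """Raise an index with the inverse Minkowski metric."""
    return [sum(G_UP[a][b] * covector[b] for b in IDX) for a in IDX]


def christoffel(dphi):
    """Return gamma[a][m][n] = Gamma^a_{mn} from (3.20), given dphi[m] = d_m phi."""
    dphi_up = raise_index(dphi)
    return [[[(dphi[m] if a == n else 0.0)
              + (dphi[n] if a == m else 0.0)
              - G_LOW[m][n] * dphi_up[a]
              for n in IDX] for m in IDX] for a in IDX]


def connection_terms(T, dphi):
    """Gamma^m_{ml} T^{l nu} + Gamma^nu_{ml} T^{ml}, for each nu."""
    gam = christoffel(dphi)
    return [sum(gam[m][m][l] * T[l][nu] for m in IDX for l in IDX)
            + sum(gam[nu][m][l] * T[m][l] for m in IDX for l in IDX)
            for nu in IDX]


def closed_form(T, dphi):
    """6 T^{m nu} d_m phi - (g_{ab} T^{ab}) d^nu phi, as in (3.19)."""
    trace = sum(G_LOW[a][b] * T[a][b] for a in IDX for b in IDX)
    dphi_up = raise_index(dphi)
    return [6.0 * sum(T[m][nu] * dphi[m] for m in IDX) - trace * dphi_up[nu]
            for nu in IDX]


dphi = [0.1, -0.2, 0.3, 0.4]                       # arbitrary gradient of phi
T = [[1.0, 2.0, 3.0, 4.0], [2.0, 5.0, 6.0, 7.0],
     [3.0, 6.0, 8.0, 9.0], [4.0, 7.0, 9.0, 10.0]]  # arbitrary symmetric tensor
print(connection_terms(T, dphi))
print(closed_form(T, dphi))
```

The check uses the symmetry of $T^{\mu\nu}$; for a non-symmetric array the two expressions would differ.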
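Using the postulated equation (3.10) for $\phi$, (3.19) can be rewritten as

\[ 0 = e^{6\phi}\nabla_\mu T^{\mu\nu} = \partial_\mu \left[ T^{\mu\nu}_{\rm aux} + \frac{c^4}{4\pi G}\left( \partial^\mu\phi\,\partial^\nu\phi - \frac12\, \underline{g}^{\mu\nu}\partial^\alpha\phi\,\partial_\alpha\phi - \frac12\, \underline{g}^{\mu\nu}\kappa^2\phi^2 \right) \right]. \tag{3.21} \]

Let us denote the terms from (3.21) that are inside the square brackets as $\Theta^{\mu\nu}$. Since the coordinate-divergence of $\Theta$ vanishes, we are provided with local conservation laws in Minkowski space, and we regard $\Theta$ as an energy-momentum-stress density tensor.

We also introduce the following state-space variables that play a mathematical role$^{h}$ in the sequel:

\[ R_c \overset{\rm def}{=} c^{-2}\rho\, e^{4\phi} = e^{4\phi}\, \mathfrak{R}_c(\eta, p), \tag{3.22} \]
\[ P \overset{\rm def}{=} p\, e^{4\phi}. \tag{3.23} \]

After we make this change of variables, the components of $\Theta$ read

\[ \Theta^{\mu\nu} \overset{\rm def}{=} [R_c + c^{-2}P]\, e^{2\phi} u^\mu u^\nu + P\, \underline{g}^{\mu\nu} + \frac{c^4}{4\pi G}\left( \partial^\mu\phi\,\partial^\nu\phi - \frac12\, \underline{g}^{\mu\nu}\partial^\alpha\phi\,\partial_\alpha\phi - \frac12\, \underline{g}^{\mu\nu}\kappa^2\phi^2 \right), \tag{3.24} \]

and we replace (3.7) with the equivalent equation

\[ \partial_\mu \Theta^{\mu\nu} = 0 \qquad (\nu = 0, 1, 2, 3). \tag{3.25} \]

We also expand the covariant differentiation from (3.8) in terms of coordinate derivatives and the Christoffel symbols (3.20), arriving at the equation

\[ \partial_\mu (n e^{4\phi} u^\mu) = 0. \tag{3.26} \]

Our goal is to obtain the system $\mathrm{EN}^c_\kappa$ in the form (4.1)–(4.8) below. To this end, we project (3.25) onto the orthogonal complement$^{i}$ of $u$ and in the direction of $u$. We therefore introduce the rank 3 tensor $\Pi$, which has the following components in our inertial coordinate system:

\[ \Pi^{\mu\nu} \overset{\rm def}{=} c^{-2} e^{2\phi} u^\mu u^\nu + \underline{g}^{\mu\nu}. \tag{3.27} \]

$\Pi$ is the projection onto the orthogonal complement of $u$:

\[ \Pi^{\mu\nu} u^\lambda\, \underline{g}_{\lambda\mu} = 0 \qquad (\nu = 0, 1, 2, 3). \tag{3.28} \]

$^{h}$ The “physical” quantities are $\mathfrak{R}_c$ and $p$.
$^{i}$ We are referring here to the orthogonal complement defined by the Minkowski metric $\underline{g}$.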
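The projection property (3.28) follows from the normalization (3.6) alone, and this can be checked numerically. The sketch below picks spatial components of $u$ freely, solves (3.6) for $u^0$, builds $\Pi^{\mu\nu}$ from (3.27), and evaluates both $\Pi^{\mu\nu}u^\lambda\underline{g}_{\lambda\mu}$ and the trace $\Pi^{\mu\nu}\underline{g}_{\mu\nu}$, which should equal 3 for a rank 3 projection. The value of $c$, the value of $\phi$ and the spatial components are arbitrary test inputs assumed here.

```python
import math

C = 2.0          # hypothetical value of the speed of light, used only for the check
IDX = range(4)

# Minkowski metric (3.2) and its inverse (3.3).
G_LOW = [[(-C**2 if a == 0 else 1.0) if a == b else 0.0 for b in IDX] for a in IDX]
G_UP = [[(-C**-2 if a == 0 else 1.0) if a == b else 0.0 for b in IDX] for a in IDX]


def four_velocity(phi, u_spatial):
    """Solve the normalization (3.6) for u^0 > 0, given the spatial components."""
    s = sum(x * x for x in u_spatial)
    u0 = math.sqrt((C**2 * math.exp(-2.0 * phi) + s) / C**2)
    return [u0] + list(u_spatial)


def projection(phi, u):
    """Components Pi^{mu nu} from (3.27)."""
    factor = C**-2 * math.exp(2.0 * phi)
    return [[factor * u[m] * u[n] + G_UP[m][n] for n in IDX] for m in IDX]


def projected_velocity(phi, u):
    """Left-hand side of (3.28): Pi^{mu nu} u^lambda g_{lambda mu}, for each nu."""
    pi = projection(phi, u)
    u_low = [sum(G_LOW[l][m] * u[l] for l in IDX) for m in IDX]
    return [sum(pi[m][n] * u_low[m] for m in IDX) for n in IDX]


def projection_trace(phi, u):
    """Trace Pi^{mu nu} g_{mu nu}; a rank 3 projection should give 3."""
    pi = projection(phi, u)
    return sum(pi[m][n] * G_LOW[m][n] for m in IDX for n in IDX)


phi = 0.3
u = four_velocity(phi, [0.4, -0.7, 1.1])
print(projected_velocity(phi, u))
print(projection_trace(phi, u))
```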
We now introduce the following Newtonian change of state-space variablesj def
v j = uj /u0
(j = 1, 2, 3),
def
Φ = c2 φ,
(3.29) (3.30)
where v = (v 1 , v 2 , v 3 ) is the Newtonian velocity and Φ is the cosmologicalNordstr¨ om potential. Relation (3.29) can be inverted to give u0 = e−φ γc ,
(3.31)
uj = e−φ γc v j ,
(3.32)
where def
γc (v) =
c . (c2 − |v|2 )1/2
(3.33)
Remark 3.3. We provide here a brief elaboration on the Newtonian change of variables. Equation (3.29) provides the standard relationship between the Newtonian velocity $v$ and the four-velocity $u$: if $x^\nu(t)$ ($\nu = 0, 1, 2, 3$) are the rectangular components of a timelike curve in $M$ parameterized by $x^0 = t$, and $\tau$ denotes the proper time parameter, then we have that $v^j = \partial_t x^j = (\partial\tau/\partial t)\cdot u^j = u^j/u^0$ ($j = 1, 2, 3$). Dimensional analysis suggests the approximate identification (for large $c$) of the cosmological-Nordström potential $\Phi$ from (3.30) and (4.4) with the cosmological-Newtonian potential $\Phi_\infty$, where $\Phi_\infty$ is the solution$^{k}$ to the non-relativistic equation (4.12): $\Phi_\infty$ has the dimensions of $c^2$, which suggests that when considering the limit $c \to \infty$, we should rescale the dimensionless cosmological-Nordström potential $\phi$ as we did in (3.30). Indeed, our main result, which is Theorem 11.2, shows that with an appropriate formulation of the initial value problems for the $\mathrm{EN}^c_\kappa$ and $\mathrm{EP}_\kappa$ systems, we have that $\lim_{c\to\infty}\Phi = \Phi_\infty$. Dimensional analysis also suggests the formal identification of $R_\infty$ from (4.10)–(4.14) with $\lim_{c\to\infty} R_c = \lim_{c\to\infty}\mathfrak{R}_c(\eta, p)$ (for now assuming that this limit exists), where $\mathfrak{R}_c(\eta, p)$ is defined in (3.17).

Furthermore, these changes of variables can be justified through a formal expansion $c^{-2}\Phi \overset{\mathrm{def}}{=} \phi = \phi_{(0)} + c^{-2}\phi_{(1)} + \cdots$, $R_c = R_{(0)} + c^{-2}R_{(1)} + \cdots$, in powers of $c^{-2}$ in Eq. (4.4): equating the coefficients of powers of $c^{-2}$ on each side of the equation implies the formal identifications$^{l}$ $\phi_{(0)} = 0$ and $(\Delta - \kappa^2)\phi_{(1)} = 4\pi G R_{(0)}$. If we also consider Eq. (4.12), which reads $(\Delta - \kappa^2)\Phi_\infty = 4\pi G R_\infty$, then we are led to the formal identifications $R_{(0)} \approx R_\infty$ and $\Phi \overset{\mathrm{def}}{=} c^2\phi \approx \phi_{(1)} \approx \Phi_\infty$. A similar analysis for the Vlasov–Nordström system is carried out in [6].

$^{j}$ As suggested by Remark 3.2, even though $R_c$ is not a state-space variable, Eq. (3.22) also represents a Newtonian change of variables.
$^{k}$ We use the symbol $\Phi_\infty$ here to denote the solution to (4.12) in order to distinguish the cosmological-Newtonian potential from the cosmological-Nordström potential.
$^{l}$ Upon expansion, the formal equation satisfied by $\phi_{(0)}$ is $(\Delta - \kappa^2)\phi_{(0)} = 0$, and by imposing vanishing boundary conditions at infinity, we conclude that $\phi_{(0)} = 0$.
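The order-by-order matching described in Remark 3.3 can be written out explicitly. The following is only a formal sketch: it assumes that $P$ stays of order $c^0$, so that the term $3c^{-2}P$ in (4.4) does not contribute at leading order, and it makes no claim about convergence of the expansion.

```latex
% Substitute \Phi = c^2 \phi into (4.4):
%   -c^{-2}\partial_t^2\Phi + \Delta\Phi - \kappa^2\Phi = 4\pi G (R_c - 3c^{-2}P).
\[
  -\partial_t^2 \phi + c^{2}(\Delta - \kappa^2)\phi = 4\pi G\,(R_c - 3c^{-2}P).
\]
% Insert \phi = \phi_{(0)} + c^{-2}\phi_{(1)} + \cdots and
% R_c = R_{(0)} + c^{-2}R_{(1)} + \cdots, then collect powers of c:
\begin{align*}
  O(c^{2}):\quad & (\Delta - \kappa^2)\phi_{(0)} = 0, \\
  O(c^{0}):\quad & -\partial_t^2\phi_{(0)} + (\Delta - \kappa^2)\phi_{(1)} = 4\pi G\,R_{(0)}.
\end{align*}
% With vanishing boundary conditions at infinity, the first line gives
% \phi_{(0)} = 0, and the second line reduces to
\[
  (\Delta - \kappa^2)\phi_{(1)} = 4\pi G\,R_{(0)},
\]
% which has the same form as (4.12) for \Phi_\infty.
```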
August 12, 2009 3:58 WSPC/148-RMP
J070-00374
The Non-Relativistic Limit of the Euler–Nordström System
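Upon making the substitutions (3.29)–(3.30) and lowering an index with $g$, the components of $\Pi$ in our inertial coordinate system read (for $1 \le j, k \le 3$):
\[ \Pi^{0}{}_{0} = -c^{-2}\gamma_c^2 |v|^2, \tag{3.34} \]
\[ \Pi^{0}{}_{j} = c^{-2}\gamma_c^2 v_j, \tag{3.35} \]
\[ \Pi^{j}{}_{0} = -\gamma_c^2 v^j, \tag{3.36} \]
\[ \Pi^{j}{}_{k} = c^{-2}\gamma_c^2 v^j v_k + \delta^j_k. \tag{3.37} \]
Furthermore, we will also make use of the relation
\[ \partial_\lambda \gamma_c = c^{-2}(\gamma_c)^3 v_k \partial_\lambda v^k \qquad (\lambda = 0, 1, 2, 3). \tag{3.38} \]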
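Relation (3.38) follows directly from definition (3.33). The short computation below is a sketch that uses only (3.33) and the convention that spatial indices are raised and lowered with the Euclidean metric, so that $|v|^2 = v_k v^k$.

```latex
\begin{align*}
  \partial_\lambda \gamma_c
    &= \partial_\lambda \Bigl[ c\,(c^2 - |v|^2)^{-1/2} \Bigr] \\
    &= c \cdot \bigl(-\tfrac{1}{2}\bigr)(c^2 - |v|^2)^{-3/2}
       \cdot \bigl(-2\, v_k \partial_\lambda v^k\bigr) \\
    &= c\,(c^2 - |v|^2)^{-3/2}\, v_k \partial_\lambda v^k \\
    &= c^{-2} \Bigl[ \frac{c}{(c^2 - |v|^2)^{1/2}} \Bigr]^{3} v_k \partial_\lambda v^k
     = c^{-2} (\gamma_c)^3\, v_k \partial_\lambda v^k .
\end{align*}
```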
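Considering first the projection of (3.25) in the direction of $u$, we remark that one may use (3.8) and (3.12) to conclude that for $C^1$ solutions, $u_\nu \partial_\mu \Theta^{\mu\nu} = 0$ is equivalent to Eq. (4.1). We now project (3.25) onto the orthogonal complement of $u$, which, with the aid of (3.10), gives the three equations $\Pi^{j}{}_{\nu}\partial_\mu\Theta^{\mu\nu} = 0$, $j = 1, 2, 3$:
\begin{align*}
0 = \Pi^{j}{}_{\nu}\partial_\mu\Theta^{\mu\nu}
  &= \Pi^{j}{}_{\nu}[R_c + c^{-2}P](e^{\phi}u^{\mu})\partial_\mu(e^{\phi}u^{\nu}) + (\Pi^{j}{}_{\nu}\partial^{\nu}\phi)\,\frac{c^4}{4\pi G}(\square\phi - \kappa^2\phi) \\
  &= \Pi^{j}{}_{\nu}[R_c + c^{-2}P](e^{\phi}u^{\mu})\partial_\mu(e^{\phi}u^{\nu}) + (\Pi^{j}{}_{\nu}\partial^{\nu}\Phi)(R_c - 3c^{-2}P). \tag{3.39}
\end{align*}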
After making the substitutions (3.30)–(3.33), and using relation (3.38), it follows that for $C^1$ solutions, (3.39) is equivalent to (4.3). We also introduce the nameless quantity $Q_c$ and make use of (3.12), (3.14), (3.16)–(3.18), (3.22), (3.23), and (3.30) to express it in the following form:
\[ Q_c \overset{\mathrm{def}}{=} n\,\frac{\partial P}{\partial n}\bigg|_{\eta,\phi} = \frac{\partial P}{\partial(\rho/c^2)}\bigg|_{\eta,\phi}\cdot n\,\frac{\partial(\rho/c^2)}{\partial n}\bigg|_{\eta} = \mathfrak{Q}_c(\eta, p, \Phi), \tag{3.40} \]
where
\[ \mathfrak{Q}_c(\eta, p, \Phi) \overset{\mathrm{def}}{=} \mathfrak{S}_c^2(\eta, p)\,e^{4\Phi/c^2}[\mathfrak{R}_c(\eta, p) + c^{-2}p] = \mathfrak{S}_c^2(\eta, p)[R_c + c^{-2}P]. \tag{3.41} \]
Then we use the chain rule together with (3.8), (4.1), and (3.40) to derive
\[ e^{\phi}u^{\mu}\partial_\mu P + Q_c\,\partial_\mu(e^{\phi}u^{\mu}) = (4P - 3Q_c)\,e^{\phi}u^{\mu}\partial_\mu\phi, \tag{3.42} \]
which we may use in place of (3.8). Upon making the substitutions (3.22), (3.23), (3.30)–(3.32), and using the relation (3.38), it follows that for $C^1$ solutions, (3.42) is equivalent to (4.2).

4. The Formal Limit c → ∞ of the $\mathrm{EN}^c_\kappa$ System

For convenience, in this section we list the final form of the $\mathrm{EN}^c_\kappa$ system as derived in Secs. 3.1 and 3.2. We also take the formal limit $c \to \infty$ to arrive at the $\mathrm{EP}_\kappa$ system and introduce the equations of variation ($\mathrm{EOV}^c_\kappa$).
J. Speck
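4.1. A recap of the $\mathrm{EN}^c_\kappa$ system

The $\mathrm{EN}^c_\kappa$ system is given by
\[ \partial_t\eta + v^k\partial_k\eta = 0, \tag{4.1} \]
\[ \partial_t P + v^k\partial_k P + Q_c\,\partial_k v^k + c^{-2}(\gamma_c)^2 Q_c\, v_k(\partial_t v^k + v^a\partial_a v^k) = (4P - 3Q_c)[c^{-2}\partial_t\Phi + c^{-2}v^k\partial_k\Phi], \tag{4.2} \]
\begin{align*}
(\gamma_c)^2(R_c + c^{-2}P)&[\partial_t v^j + v^k\partial_k v^j + c^{-2}(\gamma_c)^2 v^j v_k(\partial_t v^k + v^a\partial_a v^k)] + \partial_j P + c^{-2}(\gamma_c)^2 v^j(\partial_t P + v^k\partial_k P) \\
  &= (3c^{-2}P - R_c)\bigl(\partial_j\Phi + (\gamma_c)^2 v^j[c^{-2}\partial_t\Phi + c^{-2}v^k\partial_k\Phi]\bigr), \tag{4.3}
\end{align*}
\[ -c^{-2}\partial_t^2\Phi + \Delta\Phi - \kappa^2\Phi = 4\pi G(R_c - 3c^{-2}P), \tag{4.4} \]
where $j = 1, 2, 3$,
\[ \gamma_c = \gamma_c(v) \overset{\mathrm{def}}{=} \frac{c}{(c^2 - |v|^2)^{1/2}}, \tag{4.5} \]
\[ R_c \overset{\mathrm{def}}{=} e^{4\Phi/c^2}\mathfrak{R}_c(\eta, p), \tag{4.6} \]
\begin{align*}
Q_c \overset{\mathrm{def}}{=} \mathfrak{Q}_c(\eta, p, \Phi)
  &\overset{\mathrm{def}}{=} \mathfrak{S}_c^2(\eta, p)\,e^{4\Phi/c^2}[\mathfrak{R}_c(\eta, p) + c^{-2}p] \\
  &= \left(\frac{\partial\mathfrak{R}_c}{\partial p}\bigg|_{\eta}(\eta, p)\right)^{-1} e^{4\Phi/c^2}[\mathfrak{R}_c(\eta, p) + c^{-2}p], \tag{4.7}
\end{align*}
\[ P \overset{\mathrm{def}}{=} e^{4\Phi/c^2}p, \tag{4.8} \]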
$c$ denotes the speed of light, $\mathfrak{S}_c(\eta, p)$, which is defined in (3.18), is the speed of sound, and the functions $\mathfrak{R}_c$ and $\mathfrak{S}_c$ derive from a $c$-indexed equation of state as discussed in Sec. 3.1. The variables $\eta$, $p$, $v = (v^1, v^2, v^3)$, and $\Phi$ denote the entropy per particle, pressure, (Newtonian) velocity, and cosmological-Nordström potential, respectively. Section 6 contains a detailed discussion of the $c$-dependence of the $\mathrm{EN}^c_\kappa$ system.

4.2. The $\mathrm{EP}_\kappa$ system as a formal limit

Taking the formal limit $c \to \infty$ in the $\mathrm{EN}^c_\kappa$ system gives the Euler–Poisson system with a cosmological constant:
\[ \partial_t\eta + v^k\partial_k\eta = 0, \tag{4.9} \]
\[ \partial_t p + v^k\partial_k p + Q_\infty\partial_k v^k = 0, \tag{4.10} \]
\[ \partial_t R_\infty + \partial_k(R_\infty v^k) = 0, \tag{4.10$'$} \]
\[ R_\infty(\partial_t v^j + v^k\partial_k v^j) + \partial_j p = -R_\infty\partial_j\Phi \qquad (j = 1, 2, 3), \tag{4.11} \]
\[ \Delta\Phi - \kappa^2\Phi = 4\pi G R_\infty, \tag{4.12} \]
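where
\[ R_\infty \overset{\mathrm{def}}{=} \mathfrak{R}_\infty(\eta, p), \tag{4.13} \]
\[ Q_\infty \overset{\mathrm{def}}{=} \mathfrak{Q}_\infty(\eta, p) \overset{\mathrm{def}}{=} \mathfrak{S}_\infty^2(\eta, p)\,\mathfrak{R}_\infty(\eta, p) = \left(\frac{\partial\mathfrak{R}_\infty}{\partial p}\bigg|_{\eta}(\eta, p)\right)^{-1}\mathfrak{R}_\infty(\eta, p), \tag{4.14} \]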
$\mathfrak{R}_\infty(\eta, p)$ and $\mathfrak{S}_\infty^2(\eta, p)$ are the limits as $c \to \infty$ of $\mathfrak{R}_c(\eta, p)$ and $\mathfrak{S}_c^2(\eta, p)$, respectively (see (6.12)–(6.14)), and $R_\infty$ is the Newtonian mass density. Since Eqs. (3.18) and (6.14) imply that $\partial\mathfrak{R}_\infty(\eta, p)/\partial p = \mathfrak{S}_\infty^{-2}(\eta, p)$, it then follows with the aid of the chain rule that for $C^1$ solutions, Eqs. (4.10) and (4.10$'$) are equivalent. We refer to the solution variable $\Phi$ from Eq. (4.12) as the cosmological-Newtonian potential.

An introduction to the $\mathrm{EP}_\kappa$ system can be found in [12]. In that article, Kiessling assumes an isothermal equation of state ($p = c_s^2 \cdot R_\infty$, where the constant $c_s$ denotes the speed of sound), and derives the Jeans dispersion relation that arises from linearizing (4.10$'$), (4.11), (4.12) about a static state in which the background Newtonian mass density $\bar{R}_\infty$ is positive, followed by taking the limit $\kappa \to 0$. It is a standard result that the solution to (4.12) is given by
\[ \Phi(t, s) = \bar{\Phi}_\infty - G\int_{\mathbb{R}^3}\frac{e^{-\kappa|s - s'|}}{|s - s'|}\bigl[\mathfrak{R}_\infty(\eta(t, s'), p(t, s')) - \mathfrak{R}_\infty(\bar{\eta}, \bar{p})\bigr]\,d^3 s', \tag{4.15} \]
where the constants $\bar{\Phi}_\infty$, $\bar{\eta}$, and $\bar{p}$, which are the values of $\Phi$, $\eta$, and $p$, respectively, in a constant background state, are discussed in Sec. 8. The boundary conditions leading to this solution are that $\Phi(t, \cdot) - \bar{\Phi}_\infty$ vanishes at $\infty$, and we view $\Phi(t, s)$ as a (not necessarily small) perturbation of the constant potential $\bar{\Phi}_\infty$.

Remark 4.1. Consider the kernel $K(s) = -Ge^{-\kappa|s|}/|s|$ appearing in (4.15). An easy computation gives that $K(s), \partial K(s) \in L^1(\mathbb{R}^3)$. Therefore, a basic result from harmonic analysis (Young's inequality) implies that the map $f \to K * f$, where $*$ denotes convolution, is a bounded linear map$^{m}$ from $L^2(\mathbb{R}^3)$ to $H^1(\mathbb{R}^3)$. From this fact and Remark B.2 (alternatively consult Lemma 6.1), it then follows that $\Phi(t, \cdot) \in H^{N+1}_{\bar{\Phi}_\infty}(\mathbb{R}^3)$ whenever $(\eta(t, \cdot), p(t, \cdot)) \in H^N_{\bar{\eta}}(\mathbb{R}^3) \times H^N_{\bar{p}}(\mathbb{R}^3)$. By applying Lemma A.1, we can further conclude that $\Phi(t, \cdot) \in H^{N+2}_{\bar{\Phi}_\infty}(\mathbb{R}^3)$ whenever $(\eta(t, \cdot), p(t, \cdot)) \in H^N_{\bar{\eta}}(\mathbb{R}^3) \times H^N_{\bar{p}}(\mathbb{R}^3)$.

$^{m}$ Our proof breaks down at this point in the case $\kappa = 0$.

5. The Equations of Variation ($\mathrm{EOV}^c_\kappa$)

The $\mathrm{EOV}^c_\kappa$ are formed by linearizing the $\mathrm{EN}^c_\kappa$ system ($\mathrm{EP}_\kappa$ system if $c = \infty$) around a background solution (BGS) $\widetilde{\mathbf{V}}$ of the form $\widetilde{\mathbf{V}} = (\widetilde{\eta}, \widetilde{P}, \widetilde{v}^1, \ldots, \widetilde{\Phi}_2, \widetilde{\Phi}_3)$. Given such a $\widetilde{\mathbf{V}}$
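and inhomogeneous terms $f, g, h^{(1)}, h^{(2)}, h^{(3)}, l$, we define the $\mathrm{EOV}^c_\kappa$ by
\[ \partial_t\dot{\eta} + \widetilde{v}^k\partial_k\dot{\eta} = f, \tag{5.1} \]
\[ \partial_t\dot{P} + \widetilde{v}^k\partial_k\dot{P} + \widetilde{Q}_c\,\partial_k\dot{v}^k + c^{-2}(\widetilde{\gamma}_c)^2\widetilde{Q}_c\,\widetilde{v}_k(\partial_t\dot{v}^k + \widetilde{v}^a\partial_a\dot{v}^k) = g, \tag{5.2} \]
\begin{align*}
(\widetilde{\gamma}_c)^2(\widetilde{R}_c + c^{-2}\widetilde{P})&[\partial_t\dot{v}^j + \widetilde{v}^k\partial_k\dot{v}^j + c^{-2}(\widetilde{\gamma}_c)^2\widetilde{v}^j\widetilde{v}_k(\partial_t\dot{v}^k + \widetilde{v}^a\partial_a\dot{v}^k)] \\
  &+ \partial_j\dot{P} + c^{-2}(\widetilde{\gamma}_c)^2\widetilde{v}^j(\partial_t\dot{P} + \widetilde{v}^k\partial_k\dot{P}) = h^{(j)}, \tag{5.3}
\end{align*}
\[ -c^{-2}\partial_t^2\dot{\Phi} + \Delta\dot{\Phi} - \kappa^2\dot{\Phi} = l, \tag{5.4} \]
where $\widetilde{\gamma}_c \overset{\mathrm{def}}{=} c/(c^2 - |\widetilde{v}|^2)^{1/2}$, $\widetilde{R}_c \overset{\mathrm{def}}{=} e^{4\widetilde{\Phi}/c^2}\mathfrak{R}_c(\widetilde{\eta}, \widetilde{p})$, etc. The unknowns are the components of $\dot{\mathbf{W}} \overset{\mathrm{def}}{=} (\dot{\eta}, \dot{P}, \dot{v}^1, \dot{v}^2, \dot{v}^3)$ and $\dot{\Phi}$.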
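Remark 5.1. We place parentheses around the superscripts of the inhomogeneous terms $h^{(j)}$ in order to emphasize that we are merely labeling them, and that in general, we do not associate any transformation properties to them under changes of coordinates.

5.1. PDE matrix/vector notation

Let us now provide a few remarks on our notation. We find it useful to analyze both the dependent variable $p$ and the dependent variable $P$ when discussing solutions to (4.1)–(4.4). Therefore, we will make use of all four of the following arrays:
\[ \mathbf{W} \overset{\mathrm{def}}{=} (\eta, P, v^1, v^2, v^3), \tag{5.5} \]
\[ \mathbf{V} \overset{\mathrm{def}}{=} (\eta, P, v^1, v^2, v^3, \Phi, \partial_t\Phi, \partial_1\Phi, \partial_2\Phi, \partial_3\Phi), \tag{5.6} \]
\[ \mathcal{W} \overset{\mathrm{def}}{=} (\eta, p, v^1, v^2, v^3), \tag{5.7} \]
\[ \mathcal{V} \overset{\mathrm{def}}{=} (\eta, p, v^1, v^2, v^3, \Phi, \partial_t\Phi, \partial_1\Phi, \partial_2\Phi, \partial_3\Phi), \tag{5.8} \]
where $P = e^{4\Phi/c^2}p$. When discussing a BGS $\widetilde{\mathbf{V}} \overset{\mathrm{def}}{=} (\widetilde{\eta}, \widetilde{P}, \widetilde{v}^1, \ldots, \widetilde{\Phi}_2, \widetilde{\Phi}_3)$ that defines the coefficients of the unknowns in the $\mathrm{EOV}^c_\kappa$, we also use notation similar to that used in (5.5)–(5.8), including $\widetilde{\mathcal{V}} \overset{\mathrm{def}}{=} (\widetilde{\eta}, \widetilde{p}, \widetilde{v}^1, \ldots, \partial_3\widetilde{\Phi})$, $\widetilde{\mathbf{W}} \overset{\mathrm{def}}{=} (\widetilde{\eta}, \widetilde{P}, \widetilde{v}^1, \widetilde{v}^2, \widetilde{v}^3)$, where $\widetilde{p} \overset{\mathrm{def}}{=} e^{-4\widetilde{\Phi}/c^2}\widetilde{P}$, etc. When $c = \infty$, we may also refer to $\widetilde{\mathcal{W}} \overset{\mathrm{def}}{=} (\widetilde{\eta}, \widetilde{p}, \widetilde{v}^1, \widetilde{v}^2, \widetilde{v}^3)$ as the BGS, since in this case, the left-hand sides of (5.1)–(5.4) do not depend on $\widetilde{\Phi}$, and furthermore, $\widetilde{\mathbf{W}} = \widetilde{\mathcal{W}}$. Additionally, we may refer to the unknowns in the $\mathrm{EOV}^c_\kappa$ as $\dot{\mathcal{W}} \overset{\mathrm{def}}{=} (\dot{\eta}, \dot{p}, \dot{v}^1, \dot{v}^2, \dot{v}^3)$ when $c = \infty$; in this article, $\dot{\Phi}$ will always vanish at infinity, and in the case $c = \infty$, rather than considering $\dot{\Phi}$ to be an "unknown," we assume that the solution variable $\dot{\Phi}$ has been constructed via the convolution $\dot{\Phi} = K * l$, where the kernel $K(s)$ is defined in Remark 4.1, and $l$ is the right-hand side of (5.4).

We frequently adopt standard PDE matrix/vector notation. For example, we may write (4.1)–(4.3) as
\[ {}_{c}A^{\mu}(\mathbf{W}, \Phi)\,\partial_\mu\mathbf{W} = \mathbf{b}, \tag{5.9} \]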
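where each ${}_{c}A^{\nu}(\cdot)$ is a $5 \times 5$ matrix with entries that are functions of $\mathbf{W}$ and $\Phi$, while $\mathbf{b} = (f, g, \ldots, h^{(3)})$ is the 5-component column array on the right-hand side of (4.1)–(4.3). It is instructive to see the form of the ${}_{c}A^{\nu}(\cdot)$, $\nu = 0, 1, 2, 3$, for we will soon concern ourselves with their large-$c$ asymptotic behavior. Abbreviating $\alpha_c \overset{\mathrm{def}}{=} (\gamma_c)^2(R_c + c^{-2}P)$, $\beta_c^{(i)} \overset{\mathrm{def}}{=} c^{-2}(\gamma_c)^2 v^i$, $\beta_c^{(i,j)} \overset{\mathrm{def}}{=} c^{-2}(\gamma_c)^2 v^i v^j$, we have that
\[
{}_{c}A^{0}(\mathbf{W}, \Phi) =
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & Q_c\beta_c^{(1)} & Q_c\beta_c^{(2)} & Q_c\beta_c^{(3)} \\
0 & \beta_c^{(1)} & \alpha_c(1 + \beta_c^{(1,1)}) & \alpha_c\beta_c^{(1,2)} & \alpha_c\beta_c^{(1,3)} \\
0 & \beta_c^{(2)} & \alpha_c\beta_c^{(2,1)} & \alpha_c(1 + \beta_c^{(2,2)}) & \alpha_c\beta_c^{(2,3)} \\
0 & \beta_c^{(3)} & \alpha_c\beta_c^{(3,1)} & \alpha_c\beta_c^{(3,2)} & \alpha_c(1 + \beta_c^{(3,3)})
\end{pmatrix},
\tag{5.10}
\]
\[
{}_{\infty}A^{0}(\mathcal{W}) =
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & R_\infty & 0 & 0 \\
0 & 0 & 0 & R_\infty & 0 \\
0 & 0 & 0 & 0 & R_\infty
\end{pmatrix},
\tag{5.11}
\]
\[
{}_{c}A^{1}(\mathbf{W}, \Phi) =
\begin{pmatrix}
v^1 & 0 & 0 & 0 & 0 \\
0 & v^1 & Q_c(1 + \beta_c^{(1,1)}) & Q_c\beta_c^{(1,2)} & Q_c\beta_c^{(1,3)} \\
0 & 1 + \beta_c^{(1,1)} & \alpha_c v^1(1 + \beta_c^{(1,1)}) & \alpha_c v^1\beta_c^{(1,2)} & \alpha_c v^1\beta_c^{(1,3)} \\
0 & \beta_c^{(2,1)} & \alpha_c v^1\beta_c^{(2,1)} & \alpha_c v^1(1 + \beta_c^{(2,2)}) & \alpha_c v^1\beta_c^{(2,3)} \\
0 & \beta_c^{(3,1)} & \alpha_c v^1\beta_c^{(3,1)} & \alpha_c v^1\beta_c^{(3,2)} & \alpha_c v^1(1 + \beta_c^{(3,3)})
\end{pmatrix},
\tag{5.12}
\]
\[
{}_{\infty}A^{1}(\mathcal{W}) =
\begin{pmatrix}
v^1 & 0 & 0 & 0 & 0 \\
0 & v^1 & Q_\infty & 0 & 0 \\
0 & 1 & R_\infty v^1 & 0 & 0 \\
0 & 0 & 0 & R_\infty v^1 & 0 \\
0 & 0 & 0 & 0 & R_\infty v^1
\end{pmatrix},
\tag{5.13}
\]
and similarly for ${}_{c}A^{2}(\mathbf{W}, \Phi)$, ${}_{\infty}A^{2}(\mathcal{W})$, ${}_{c}A^{3}(\mathbf{W}, \Phi)$, and ${}_{\infty}A^{3}(\mathcal{W})$.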
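To make the large-$c$ behavior of the coefficient matrices concrete, the following sketch reads off the second row of ${}_{c}A^{0}$ and ${}_{c}A^{1}$ from Eq. (4.2) and takes a purely formal limit with $\eta$, $p$, $v$, $\Phi$ held fixed. It assumes that $\mathfrak{R}_c \to \mathfrak{R}_\infty$ and $\mathfrak{S}_c^2 \to \mathfrak{S}_\infty^2$ pointwise; the quantitative statement is the content of Lemma 6.6.

```latex
% Coefficients of \partial_t(\eta, P, v^1, v^2, v^3) in (4.2):
\[
  \bigl(0,\ 1,\ Q_c\beta_c^{(1)},\ Q_c\beta_c^{(2)},\ Q_c\beta_c^{(3)}\bigr),
  \qquad \beta_c^{(k)} = c^{-2}(\gamma_c)^2 v^k .
\]
% Coefficients of \partial_1(\eta, P, v^1, v^2, v^3) in (4.2):
\[
  \bigl(0,\ v^1,\ Q_c(1+\beta_c^{(1,1)}),\ Q_c\beta_c^{(1,2)},\ Q_c\beta_c^{(1,3)}\bigr),
  \qquad \beta_c^{(1,k)} = c^{-2}(\gamma_c)^2 v^1 v^k .
\]
% Formal limits for fixed (\eta, p, v, \Phi):
\begin{align*}
  \gamma_c &\to 1, &
  \beta_c^{(i)},\ \beta_c^{(i,j)} &\to 0, \\
  \alpha_c = (\gamma_c)^2 (R_c + c^{-2}P) &\to R_\infty, &
  Q_c &\to Q_\infty .
\end{align*}
% Hence the two rows tend to (0, 1, 0, 0, 0) and (0, v^1, Q_\infty, 0, 0),
% which are the second rows of {}_\infty A^0 and {}_\infty A^1
% obtained from (4.10).
```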
6. On the c-Dependence of the $\mathrm{EN}^c_\kappa$ System

In addition to appearing directly as the term $c^{-2}$, the constant $c$ appears in Eqs. (4.1)–(4.4) through four terms: (i) $P = e^{4\Phi/c^2}p$, (ii) $\gamma_c = c/(c^2 - |v|^2)^{1/2}$, (iii) $R_c = e^{4\Phi/c^2}\mathfrak{R}_c(\eta, p)$, and (iv) $Q_c = \mathfrak{S}_c^2(\eta, p)\,e^{4\Phi/c^2}[\mathfrak{R}_c(\eta, p) + c^{-2}p]$. Because we want to recover the $\mathrm{EP}_\kappa$ system in the large-$c$ limit, the first obvious requirement we have is that the function $\mathfrak{R}_c(\eta, p)$ has a limit $\mathfrak{R}_\infty(\eta, p)$ as $c \to \infty$. For mathematical reasons, we will demand convergence in the norm $|\cdot|_{N+1,\mathfrak{C}}$ (see definition (2.4))
at a rate of order $c^{-2}$, where $\mathfrak{C}$ is a compact subset of $\mathbb{R}^+ \times \mathbb{R}^+$ that depends on the Newtonian initial data $\mathcal{V}_\infty$ defined in (8.1); see (6.12) and (6.13). Although a construction of $\mathfrak{C}$ is described in detail in Sec. 8.2, let us now provide a preliminary description that is sufficient for our current purposes: for given initial data, we will prove the existence of compact sets $\bar{\mathcal{O}}_2$, $\bar{\mathfrak{O}}_2$, $[-a, a]^5$, $\mathcal{K} \overset{\mathrm{def}}{=} \bar{\mathcal{O}}_2 \times [-a, a]^5$, $\mathfrak{K} \overset{\mathrm{def}}{=} \bar{\mathfrak{O}}_2 \times [-a, a]^5$, and a time interval $[0, T]$ so that for all large $c$, the ($c$-dependent) solutions$^{n}$ $\mathbf{V}$ ($\mathcal{V}$) to the $\mathrm{EN}^c_\kappa$ system launched by the initial data exist on $[0, T] \times \mathbb{R}^3$ and satisfy $\mathbf{W}([0, T] \times \mathbb{R}^3) \subset \bar{\mathcal{O}}_2$, $\mathcal{W}([0, T] \times \mathbb{R}^3) \subset \bar{\mathfrak{O}}_2$, $\mathbf{V}([0, T] \times \mathbb{R}^3) \subset \mathcal{K}$, and $\mathcal{V}([0, T] \times \mathbb{R}^3) \subset \mathfrak{K}$. See Sec. 8.2 for a detailed description of $\bar{\mathcal{O}}_2$ and $\bar{\mathfrak{O}}_2$, and (10.29), (10.30) for the construction of $\mathcal{K}$ and $\mathfrak{K}$.

The set $\mathfrak{C}$ from above, then, is the projection of $\bar{\mathfrak{O}}_2$ onto the first two axes (which are the $\eta, p$ components of $\mathcal{V}$). Intuitively, we would like the aforementioned four functions of the state-space variables to converge to $p$, $1$, $R_\infty$, and $Q_\infty$, respectively, when their domains are restricted to an appropriate compact subset. In this section, we will develop and then assume hypotheses on the $c$-indexed equation of state that will allow us to prove useful versions of these kinds of convergence results.

6.1. Functions with c-independent properties: the definitions

The main technical difficulty that we must confront is ensuring that the Sobolev estimates provided by the propositions appearing in Appendix B can be made independently of all large $c$. By examining these propositions, one could anticipate that this amounts to analyzing the $C_b^j$ norms (see definition (2.4)) of various $c$-indexed families of functions $F_c$ appearing in the family of $\mathrm{EN}^c_\kappa$ systems. We therefore introduce here some machinery that will allow us to easily discuss uniform-in-$c$ estimates.
Following this, we use this machinery to prove some preliminary lemmas that will be used in the proofs of Theorems 10.2 and 11.2, which are the two main theorems of this article. Before proceeding, we refer the reader to the notation defined in (2.5), which will be used frequently in the discussion that follows.

Definition 6.1. Let $y^1, \ldots, y^n$ denote Cartesian coordinates on $\mathbb{R}^n$, and let $D \subset \mathbb{R}^n$ be a compact convex set. We define $\mathcal{R}^j(c^k; D; y^1, \ldots, y^n)$ to be the ring consisting of all $c$-indexed families of functions $F_c(y^1, \ldots, y^n)$ such that for all large $c$, $F_c \in C_b^j(D)$, and such that the following estimate holds:
\[ |F_c|_{j,D} \lesssim c^k \cdot C(D). \tag{6.1} \]
We emphasize that the constant $C(D)$ is allowed to depend on the family $F_c$ and the set $D$, but within a given family and on a fixed set, $C(D)$ must be independent of all large $c$.

$^{n}$ Recall the notation (5.5)–(5.8), which defines the arrays $\mathbf{W}$, $\mathbf{V}$, $\mathcal{W}$, and $\mathcal{V}$, respectively.
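As an illustration of Definition 6.1, the following sketch checks directly that the one-variable family $F_c(y) = e^{4y/c^2} - 1$ gains a factor $c^{-2}$ on a compact interval. It assumes that the norm $|\cdot|_{j,D}$ of definition (2.4), which is not reproduced in this section, is controlled by the suprema over $D$ of the derivatives of order $0, \ldots, j$; the interval $D = [-a, a]$ and the restriction $c \ge 1$ are choices made only for this example.

```latex
% Family: F_c(y) = e^{4y/c^2} - 1 on D = [-a, a], with c >= 1.
% Zeroth order, using |e^x - 1| <= |x| e^{|x|}:
\[
  \sup_{|y| \le a} |F_c(y)|
    \le \frac{4a}{c^2}\, e^{4a/c^2}
    \le c^{-2} \cdot 4a\, e^{4a}.
\]
% Derivatives of order 1 <= i <= j, using c^{-2i} <= c^{-2} for c >= 1:
\[
  \sup_{|y| \le a} \bigl|\partial_y^{\,i} F_c(y)\bigr|
    = \sup_{|y| \le a} \Bigl(\frac{4}{c^2}\Bigr)^{i} e^{4y/c^2}
    \le c^{-2} \cdot 4^{i} e^{4a}.
\]
% Summing over i = 0, ..., j gives a bound c^{-2} C(a, j) with C(a, j)
% independent of c, i.e. F_c belongs to R^j(c^{-2}; [-a, a]; y).
```

The same computation with $y$ replaced by $c^{m} z$ for $m \le 2$ yields a gain of $c^{m-2}$ in the rescaled variable $z$.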
Definition 6.2. Let $D \subset \mathbb{R}^n$ be a compact convex set. Let $q_1, \ldots, q_n$ be functions such that $(q_1, \ldots, q_n) \in H^j_{\bar{q}_1}(\mathbb{R}^3) \times \cdots \times H^j_{\bar{q}_n}(\mathbb{R}^3)$ (see definition (2.2)) and such that $\{(q_1(s), q_2(s), \ldots, q_n(s)) \mid s \in \mathbb{R}^3\} \subset D$, where $\bar{q}_1, \bar{q}_2, \ldots, \bar{q}_n$ are constants such that $(\bar{q}_1, \bar{q}_2, \ldots, \bar{q}_n) \in D$. We define $\mathcal{R}^j(c^k; D; q_1, \ldots, q_n)$ to be the ring consisting of all $c$-indexed expressions that can be written as the composition of an element of $\mathcal{R}^j(c^k; D; y^1, \ldots, y^n)$ with $(q_1, \ldots, q_n)$. If $F_c$ is such an expression, then we indicate this by writing
\[ F_c(q_1, \ldots, q_n) \in \mathcal{R}^j(c^k; D; q_1, \ldots, q_n) \tag{6.2} \]
or
\[ F_c \in \mathcal{R}^j(c^k; D; q_1, \ldots, q_n). \tag{6.3} \]
We remark that the notations (6.2), (6.3) also carry with them the implication that the functions $(q_1, \ldots, q_n)$ have the aforementioned properties.

Remark 6.1. The notation $F_c \in \mathcal{R}^j(c^k; D; q_1, \ldots, q_n)$ represents an abuse of notation in the sense that in Definition 6.1, the arguments of the function $F_c(y^1, \ldots, y^n)$ are fixed, while in Definition 6.2, we are allowing ourselves the freedom to shift the point of view as to what are the arguments of the expression $F_c$ by allowing ourselves to "shift around powers of $c$." At the beginning of Sec. 6.3, we explain why this freedom can be useful. As a simple example, if $\partial_t\Phi \in H^2$, $\|\partial_t\Phi\|_{L^\infty} \le 1$, and $F_c = c^{-2}\partial_t\Phi$, then we have that $F_c \in \mathcal{R}^2(c^{-2}; [-1, 1]; \partial_t\Phi)$ and also that $F_c \in \mathcal{R}^2(c^{-1}; [-1, 1]; c^{-1}\partial_t\Phi)$.

Definition 6.3. Let $D$, $q_1, \ldots, q_n$, and $\bar{q}_1, \bar{q}_2, \ldots, \bar{q}_n$ be as in Definition 6.2. Then we define $\mathcal{I}^j(c^k; D; q_1, \ldots, q_n)$ to be the sub-ring contained in $\mathcal{R}^j(c^k; D; q_1, \ldots, q_n)$ consisting of all such $c$-indexed expressions $F_c$ such that the following estimate holds:
\[ \|F_c\|_{H^j} \lesssim c^k \cdot C\bigl(D; \|q_1\|_{H^j_{\bar{q}_1}}, \ldots, \|q_n\|_{H^j_{\bar{q}_n}}\bigr). \tag{6.4} \]
If $F_c$ is such an expression, then we indicate this by writing
\[ F_c(q_1, \ldots, q_n) \in \mathcal{I}^j(c^k; D; q_1, \ldots, q_n) \tag{6.5} \]
or
\[ F_c \in \mathcal{I}^j(c^k; D; q_1, \ldots, q_n) \quad \text{or} \quad F_c = \mathcal{O}^j(c^k; D; q_1, \ldots, q_n). \tag{6.6} \]

Remark 6.2. This definition is highly motivated by the inequality (B.6) of Appendix B.

Remark 6.3. We also emphasize that in our applications below, the functions $q_i$ and constants $\bar{q}_i$ may themselves depend on the parameter $c$, even though we do not always explicitly indicate this dependence. Typically, the $q_i$ will be quantities
related to solutions of the $\mathrm{EN}^c_\kappa$ system, and the $\bar{q}_i$ will be equal to the components of either (8.2), (8.10), or (8.11), perhaps scaled by a power of $c$.

Remark 6.4. In the notation $\mathcal{R}(\cdots)$, $\mathcal{I}(\cdots)$, and $\mathcal{O}^j(\cdots)$, we often omit the argument $D$. In this case, it is understood that there is an implied set $D$ that is to be inferred from context; frequently $D$ is to be inferred from $L^\infty$ estimates on the $q_i$ that follow from Sobolev embedding. Also, we omit the argument $c^k$ when $k = 0$. Furthermore, we have chosen to omit dependence on the constants $\bar{q}_i$ since, as will be explained at the beginning of Sec. 6.3, their definitions will be clear from context. We will occasionally omit additional arguments when the context is clear.

6.2. Functions with c-independent properties: Useful lemmas

The following three lemmas provide the core structure for analyzing the Sobolev norms of terms appearing in the $\mathrm{EN}^c_\kappa$ system. They are especially useful for keeping track of powers of $c$. Their proofs are based on the Sobolev–Moser estimates that are stated as propositions in Appendix B. We assume throughout this section that the functions $q_1, \ldots, q_n$ have the properties stated in Definition 6.2.

Lemma 6.1. If $j \ge 2$ and $F_c(y^1, \ldots, y^n) \in \mathcal{R}^j(c^k; D; y^1, \ldots, y^n)$, then
\[ F_c \circ (q_1, \ldots, q_n) - F_c \circ (\bar{q}_1, \ldots, \bar{q}_n) \in \mathcal{I}^j(c^k; D; q_1, \ldots, q_n). \tag{6.7} \]

Proof. We emphasize that the conclusion of Lemma 6.1 is exactly the statement that $\|F_c \circ (q_1, \ldots, q_n) - F_c \circ (\bar{q}_1, \ldots, \bar{q}_n)\|_{H^j} \lesssim c^k \cdot C\bigl(\|q_1\|_{H^j_{\bar{q}_1}}, \ldots, \|q_n\|_{H^j_{\bar{q}_n}}\bigr)$. Its proof follows from Definitions 6.1–6.3, and from (B.6).

Lemma 6.2. Suppose that $F_c \in \mathcal{R}^j(c^{k_1}; D; q_1, \ldots, q_n)$, $G_c \in \mathcal{R}^j(c^{k_2}; D; q_1, \ldots, q_n)$, and $H_c \in \mathcal{I}^j(c^{k_3}; D; q_1, \ldots, q_n)$. Then
\[ F_c \cdot G_c \in \mathcal{R}^j(c^{k_1 + k_2}; D; q_1, \ldots, q_n) \quad \text{if } j \ge 0 \tag{6.8} \]
and
\[ F_c \cdot H_c \in \mathcal{I}^j(c^{k_1 + k_3}; D; q_1, \ldots, q_n) \quad \text{if } j \ge 2. \tag{6.9} \]

Proof. Lemma 6.2 follows from the product rule for derivatives and (B.3).

Remark 6.5. Lemma 6.2 shows that for $k \le 0$, $\mathcal{R}^j(c^k; D; q_1, \ldots, q_n)$ is a ring, i.e., it is closed under products. We frequently use this property in this article without explicitly mentioning it.

Remark 6.6. Lemma 6.2 can easily be used to show that if $F_c(y^1, \ldots, y^n) \in \mathcal{R}^j(c^0; D; y^1, \ldots, y^n)$ and if there exists a constant $C(D) > 0$ such that $C(D) \lesssim \inf_{(y^1, \ldots, y^n) \in D}|F_c(y^1, \ldots, y^n)|$, then $1/F_c \circ (q_1, \ldots, q_n) \in \mathcal{R}^j(c^0; D; q_1, \ldots, q_n)$.
Remark 6.7. Lemma 6.2 shows that if $F_c(y^1, \ldots, y^n) \in \mathcal{R}^j(c^0; D; y^1, \ldots, y^n)$ and $F_c \circ (\bar{q}_1, \ldots, \bar{q}_n) = 0$, then $F_c \circ (q_1, \ldots, q_n) \in \mathcal{I}^j(c^k; D; q_1, \ldots, q_n)$. In particular, if $\bar{q} = 0$, then any monomial $q^k$ for $k > 0$ is an element of $\mathcal{I}^j(q)$.

Remark 6.8. Lemma 6.2 shows in particular that for $k \le 0$, $\mathcal{I}^j(c^k; D; q_1, \ldots, q_n)$ is an ideal in $\mathcal{R}^j(D; q_1, \ldots, q_n)$.

Remark 6.9. If $k \le 0$ and there exists a fixed function $F_\infty \in \mathcal{R}^j(D; y^1, \ldots, y^n)$ such that $F_c - F_\infty \in \mathcal{R}^j(c^k; D; y^1, \ldots, y^n)$, then it follows that $|F_c|_{j,D} \lesssim |F_\infty|_{j,D} + 1$, so that the family of functions $F_c$ is uniformly bounded in the norm $|\cdot|_{j,D}$ for all large $c$. A similar remark using the $\|\cdot\|_{H^j}$ norm applies if $F_\infty \in \mathcal{I}^j(D; q_1, \ldots, q_n)$ and $F_c - F_\infty \in \mathcal{I}^j(c^k; D; q_1, \ldots, q_n)$. We often make use of these observations in this article without explicitly mentioning it.

Lemma 6.3. Suppose that $j \ge 3$, $k_1 + k_2 = k_0$, and that $F_c \in \mathcal{R}^j(c^{k_0}; D_1; q_1, \ldots, q_n)$. Assume further that for $1 \le i \le n$, we have that $q_i \in \bigcap_{k=0}^{k=1} C^k([0, T], H^{j-k}_{\bar{q}_i})$ and that for all large $c$, $c^{k_2}(\partial_t q_1, \ldots, \partial_t q_n)([0, T] \times \mathbb{R}^3) \subset D_2$. Then on $[0, T]$, we have that
\[ \partial_t(F_c) \in \mathcal{I}^{j-1}(c^{k_1}; D_1 \times D_2; q_1, \ldots, q_n, c^{k_2}\partial_t q_1, \ldots, c^{k_2}\partial_t q_n). \tag{6.10} \]

Proof. Lemma 6.3 follows from the chain rule, Lemma 6.2, and Remark 6.7. We emphasize that the constant term associated to $c^{k_2}\partial_t q_i$ is 0, so that on the right-hand side of the definition (6.4) of $\mathcal{I}^{j-1}(\cdots)$, we are measuring $c^{k_2}\partial_t q_i$ in the $H^{j-1}$ norm.

Corollary 6.1. Let $\partial_a$ be a first-order spatial coordinate derivative operator. Suppose that $j \ge 3$, $k_1 + k_2 = k_0$, and that $F_c \in \mathcal{R}^j(c^{k_0}; D_1; q_1, \ldots, q_n)$. Assume that for all large $c$, we have that $c^{k_2}(\partial_a q_1, \ldots, \partial_a q_n)([0, T] \times \mathbb{R}^3) \subset D_2$. Then on $[0, T]$, we have that
\[ \partial_a(F_c) \in \mathcal{I}^{j-1}(c^{k_1}; D_1 \times D_2; q_1, \ldots, q_n, c^{k_2}\partial_a q_1, \ldots, c^{k_2}\partial_a q_n). \tag{6.11} \]

Proof. The proof of Corollary 6.1 is virtually identical to the proof of Lemma 6.3.
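The bookkeeping of powers of $c$ in Lemma 6.3 can be sketched in the simplest case of a single argument ($n = 1$). This is only an outline of how the chain rule, (6.9), and Remark 6.7 combine; it writes $F_c'$ for the derivative of $F_c$ with respect to its argument and is not a substitute for the proof.

```latex
% Chain rule, with the power c^{k_2} moved onto \partial_t q:
\[
  \partial_t \bigl( F_c(q) \bigr)
    = F_c'(q)\, \partial_t q
    = \underbrace{\bigl[ c^{-k_2} F_c'(q) \bigr]}_{\text{(I)}}
      \cdot
      \underbrace{\bigl[ c^{k_2} \partial_t q \bigr]}_{\text{(II)}} .
\]
% (I):  F_c in R^j(c^{k_0}) costs one derivative, so F_c' is in
%       R^{j-1}(c^{k_0}), and therefore
%       c^{-k_2} F_c' is in R^{j-1}(c^{k_0 - k_2}) = R^{j-1}(c^{k_1}).
% (II): c^{k_2} \partial_t q has associated constant term 0, so by
%       Remark 6.7 it is an element of I^{j-1}(c^{k_2} \partial_t q).
% Product: (6.9) applies because j - 1 >= 2, giving
\[
  \partial_t \bigl( F_c(q) \bigr)
    \in \mathcal{I}^{j-1}\bigl( c^{k_1};\ D_1 \times D_2;\ q,\ c^{k_2}\partial_t q \bigr).
\]
```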
6.3. Applications to the $\mathrm{EN}^c_\kappa$ system

We will now apply these lemmas to the $\mathrm{EN}^c_\kappa$ system. Let us first make a few remarks about our use of the norms $\|\cdot\|_{H^j_{\bar{q}_i}}$ that appear on the right-hand side of (6.4) and the constant term $\bar{q}_i$ associated to $q_i$. For the remainder of this article, it is to be understood that the constant term associated to $c^k\mathbf{V}$ is $c^k\bar{\mathbf{V}}_c$, that the constant term associated to $c^k\mathcal{V}$ is $c^k\bar{\mathcal{V}}_c$, and the constant term associated to both $D\mathbf{V}$ and $D\mathcal{V}$ is 0, where $\bar{\mathbf{V}}_c$ and $\bar{\mathcal{V}}_c$ are defined in (8.10) and (8.11), respectively. In other words, when estimating $c^k\mathbf{V}$ using a $j$th order Sobolev norm, it is understood that we are using the norm $\|\cdot\|_{H^j_{c^k\bar{\mathbf{V}}_c}}$, and similarly for the other state-space arrays.
The relationship between the arrays $\mathbf{V}$ and $\mathcal{V}$ is always understood to be the one implied by (5.6) and (5.8). We furthermore emphasize that $\mathbf{V}$ (or $\mathcal{V}$) will represent a solution array to the $\mathrm{EN}^c_\kappa$ system, and therefore will implicitly depend on $c$ through the $c$-dependent initial data $\mathring{\mathbf{V}}_c$ (see (8.9)) and through the $c$-dependence of the $\mathrm{EN}^c_\kappa$ system itself. The fact that the constant arrays $\bar{\mathbf{V}}_c$ and $\bar{\mathcal{V}}_c$ depend on the parameter $c$ does not pose any difficulty. For as we shall see, $\bar{\mathbf{V}}_c$ is contained in the fixed compact set $\mathcal{K}$ for all large $c$, and $\bar{\mathcal{V}}_c$ is contained in the fixed compact set $\mathfrak{K}$ for all large $c$, where the sets $\mathcal{K}$ and $\mathfrak{K}$ were introduced at the beginning of Sec. 6. Therefore, the $L^\infty$ estimates of the constants $\bar{\mathbf{V}}_c$ and $\bar{\mathcal{V}}_c$ that we will need can be made independently of all large $c$.

In addition to the above remarks, we add that we will have available a priori estimates that guarantee that $\mathbf{V} \in \bigcap_{k=0}^{k=2} C^k([0, T], H^{N-k}_{\bar{\mathbf{V}}_c})$ for a fixed integer$^{o}$ $N \ge 4$ on our time interval $[0, T]$ of interest, which are hypotheses that are relevant for Lemma 6.3 and Corollary 6.1. Our a priori estimates will also ensure that all of the relevant quantities are contained in an appropriate fixed compact convex set, so that the "hypotheses on the $q_i$" described in Definition 6.2 will always be satisfied. Consequently, we will often omit the dependence of the running constants $C(\cdots)$ on such sets. The relevant a priori estimates ("Induction Hypotheses") are described in detail in Sec. 10.3.1.

Let us now provide a clarifying example and also elaborate upon the idea that it is sometimes useful to shift the point of view as to what are the arguments of a family $F_c(\cdots)$. For example, consider the expression $F_c \overset{\mathrm{def}}{=} c^{-2}\partial_t\Phi$, where $\Phi$ is a solution variable in the $\mathrm{EN}^c_\kappa$ system depending on $c$ through the initial data $\mathring{\mathbf{V}}_c$ and through the $c$-dependence of the system itself. If it is known that $\|c^{-1}\partial_t\Phi\|_{H^3}$ is uniformly bounded by $L$ for all large $c$, then we have that $F_c \in \mathcal{I}^3(c^{-1}; c^{-1}\partial_t\Phi)$ since $c^{-1}\|c^{-1}\partial_t\Phi\|_{H^3} \lesssim c^{-1}L$. If it also turns out that $\|\partial_t\Phi\|_{H^3}$ is uniformly bounded for all large $c$, then we have that $F_c \in \mathcal{I}^3(c^{-2}; \partial_t\Phi)$. If both estimates are true, then we indicate this by writing $F_c \in \mathcal{I}^3(c^{-1}; c^{-1}\partial_t\Phi) \cap \mathcal{I}^3(c^{-2}; \partial_t\Phi)$ or $F_c = \mathcal{O}^3(c^{-1}; c^{-1}\partial_t\Phi) \cap \mathcal{O}^3(c^{-2}; \partial_t\Phi)$. These kinds of estimates will enter into our continuous induction argument in Sec. 10.2, in which we will first prove a bound for $c^{-1}\partial_t\Phi$, and then use it to obtain a bound for $\partial_t\Phi$; see (10.25) and (10.27).

Remark 6.10. For simplicity, we are not always optimal in our estimates.

The following four lemmas, which provide an analysis of the $c$-dependence of the terms appearing in the $\mathrm{EN}^c_\kappa$ system, will be used heavily in Sec. 10.3, which contains most of our technical estimates. Before providing the lemmas, we first restate our hypotheses on the equation of state using our new notation.

Hypotheses on the c-Dependence of the Equation of State.
\[ \mathfrak{R}_c(\eta, p),\ \mathfrak{R}_\infty(\eta, p) \in \mathcal{R}^{N+1}(\mathfrak{C}; \eta, p), \tag{6.12} \]
\[ \mathfrak{R}_c(\eta, p) - \mathfrak{R}_\infty(\eta, p) \in \mathcal{R}^{N+1}(c^{-2}; \mathfrak{C}; \eta, p). \tag{6.13} \]

$^{o}$ The relevance of $N \ge 4$ is described in Sec. 8.
Recall that the set $\mathfrak{C}$ was introduced at the beginning of Sec. 6 and is described in detail in Sec. 8.2. We also assume that $\mathfrak{R}_\infty(\eta, p)$ and $\mathfrak{S}_\infty^2(\eta, p)$ are "physical" as defined in Sec. 3.1, and in particular that whenever $\eta, p > 0$, we have that $0 < \mathfrak{R}_\infty(\eta, p)$ and $0 < \mathfrak{S}_\infty^2(\eta, p)$. Additionally, we note the following simple consequence of (3.18), (6.12), and (6.13):
\[ \mathfrak{S}_c^2(\eta, p) - \mathfrak{S}_\infty^2(\eta, p) \in \mathcal{R}^N(c^{-2}; \mathfrak{C}; \eta, p). \tag{6.14} \]
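One way to see why (6.14) loses one order of regularity relative to (6.13) is sketched below. It assumes, as in (4.7), that $\mathfrak{S}_c^2 = (\partial\mathfrak{R}_c/\partial p|_\eta)^{-1}$, and that $\partial_p\mathfrak{R}_\infty$ is bounded from below by a positive constant on the compact set $\mathfrak{C}$; the latter is an assumption of this sketch, suggested by the positivity of $\mathfrak{S}_\infty^2$ but not verified here.

```latex
% Difference of reciprocals:
\[
  \mathfrak{S}_c^2 - \mathfrak{S}_\infty^2
    = \frac{1}{\partial_p \mathfrak{R}_c} - \frac{1}{\partial_p \mathfrak{R}_\infty}
    = \frac{\partial_p \bigl( \mathfrak{R}_\infty - \mathfrak{R}_c \bigr)}
           {\partial_p \mathfrak{R}_c \cdot \partial_p \mathfrak{R}_\infty } .
\]
% Numerator: by (6.13), \mathfrak{R}_c - \mathfrak{R}_\infty is in
%   R^{N+1}(c^{-2}; \mathfrak{C}; \eta, p); one derivative in p leaves an
%   element of R^{N}(c^{-2}; \mathfrak{C}; \eta, p).
% Denominator: by (6.12), both factors are in R^{N}(c^{0}); if
%   \partial_p \mathfrak{R}_\infty >= \delta > 0 on \mathfrak{C}, then (6.13)
%   gives \partial_p \mathfrak{R}_c >= \delta/2 for all large c, and
%   Remark 6.6 puts both reciprocals in R^{N}(c^{0}).
% Product rule (6.8): R^N(c^{-2}) * R^N(c^0) * R^N(c^0) is contained in
%   R^N(c^{-2}), which is (6.14).
```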
Remark 6.11. At the end of this section, we provide an example of a well-known family of equations of state, namely the polytropic equations of state, that satisfy the above hypotheses.

Hypothesis (6.12) ensures that the terms appearing in the $\mathrm{EN}^c_\kappa$ and $\mathrm{EP}_\kappa$ systems are sufficiently differentiable functions of $\mathbf{V}$, thus enabling us to apply the Sobolev–Moser type inequalities appearing in Appendix B. It is strong enough to imply Theorems 10.1 and 11.1. Hypothesis (6.13) is used in our proof of Theorems 10.2 and 11.2. Although a weakened version of Hypothesis (6.13) is sufficient to prove a convergence theorem, we do not pursue this matter here since we are not striving for optimal results.

Lemma 6.4. Let $\gamma_c$, $R_c$, $R_\infty$, $Q_c$, $Q_\infty$, $\mathbf{W}$, and $\mathcal{W}$ be the quantities defined in (4.5), (4.6), (4.13), (4.7), (4.14), (5.5), and (5.7), respectively. Then for $m = 0, 1, 2$ and $\nu = t, 1, 2, 3$ we have the following estimates for all large $c$, including $c = \infty$:
\[ (\gamma_c)^2 - 1 \in \mathcal{R}^{N+1}(c^{-2}; v), \tag{6.15} \]
\[ e^{\pm 4\Phi/c^2} - 1 \in \mathcal{R}^{N+1}(c^{m-2}; c^{-m}\Phi), \tag{6.16} \]
\[ R_c - R_\infty = e^{4\Phi/c^2}\mathfrak{R}_c(\eta, p) - \mathfrak{R}_\infty(\eta, p) \in \mathcal{R}^{N+1}(c^{m-2}; \eta, p, c^{-m}\Phi), \tag{6.17} \]
\[ Q_c - Q_\infty = \mathfrak{Q}_c(\eta, p, \Phi) - \mathfrak{Q}_\infty(\eta, p) \in \mathcal{R}^{N}(c^{m-2}; \eta, p, c^{-m}\Phi), \tag{6.18} \]
\[ \mathbf{W} - \mathcal{W} \in \mathcal{R}^{N}(c^{m-2}; P, c^{-m}\Phi), \tag{6.19} \]
\[ \mathcal{W} \in \mathcal{R}^{N}(\mathbf{W}, c^{-m}\Phi), \tag{6.20} \]
\[ \partial_\nu\mathbf{W} - \partial_\nu\mathcal{W} \in \mathcal{I}^{N-1}(c^{m-2}; P, \partial_\nu P, c^{-m}\Phi, c^{-m}\partial_\nu\Phi), \tag{6.21} \]
\[ \partial_\nu\mathcal{W} \in \mathcal{I}^{N-1}(\mathbf{W}, \partial\mathbf{W}, c^{-m}\Phi, c^{-m}\partial_\nu\Phi). \tag{6.22} \]
Proof. (6.15) and (6.16) are easy Taylor estimates. (6.17) follows from Lemma 6.2, (6.12), (6.13), and (6.16). (6.18) then follows from (3.18), (3.41), (4.14), Lemma 6.2, (6.14), and (6.17). Since $P - p = (1 - e^{-4\Phi/c^2})P$, (6.19) follows from (6.16), Lemma 6.2, and the fact that $\mathbf{W}$ and $\mathcal{W}$ differ only in that the second component of $\mathcal{W}$ is $p$, while the second component of $\mathbf{W}$ is $P$. (6.20) is a simple consequence
of (6.19). (6.21) follows from (6.19), Lemma 6.3, and Corollary 6.1. (6.22) then follows easily from (6.21).

The next lemma connects the $c$-asymptotic behavior of an expression written in terms of the state-space array $\mathcal{W}$ to the $c$-asymptotic behavior of the same expression written in terms of the state-space array $\mathbf{W}$.

Lemma 6.5. If $0 \le j \le N$ and $F_c \in \mathcal{R}^j(c^k; \mathcal{W})$, then for $m = 0, 1, 2$, we have that
\[ F_c \in \mathcal{R}^j(c^k; \mathbf{W}, c^{-m}\Phi). \tag{6.23} \]

Proof. Lemma 6.5 follows easily from expressing $\mathcal{W}$ in terms of $\mathbf{W}$ and $c^{-m}\Phi$ via (6.20) and applying the chain rule.

Lemma 6.6. Let ${}_{c}A^{\nu}(\mathbf{W}, \Phi)$, $\nu = 0, 1, 2, 3$, denote the matrix-valued functions of $\mathbf{W}$ and $\Phi$ introduced in Sec. 5. Let the $c$-dependent relationship between $\mathcal{W}$ and $\mathbf{W}, \Phi$ be defined by (5.5) and (5.7). Then for all large $c$ including $c = \infty$, and for $m = 0, 1, 2$, we have that
\[ {}_{\infty}A^{\nu}(\mathcal{W}),\ ({}_{\infty}A^{0}(\mathcal{W}))^{-1} \in \mathcal{R}^N(\mathcal{W}) \cap \mathcal{R}^N(\mathbf{W}, c^{-m}\Phi), \tag{6.24} \]
\[ {}_{c}A^{\nu}(\mathbf{W}, \Phi),\ ({}_{c}A^{0}(\mathbf{W}, \Phi))^{-1} \in \mathcal{R}^N(\mathbf{W}, c^{-m}\Phi) \cap \mathcal{R}^N(\mathcal{W}, c^{-m}\Phi), \tag{6.25} \]
\[ {}_{c}A^{\nu}(\mathbf{W}, \Phi) - {}_{\infty}A^{\nu}(\mathcal{W}) \in \mathcal{R}^N(c^{m-2}; \mathbf{W}, c^{-m}\Phi) \cap \mathcal{R}^N(c^{m-2}; \mathcal{W}, c^{-m}\Phi), \tag{6.26} \]
\[ ({}_{c}A^{0}(\mathbf{W}, \Phi))^{-1} - ({}_{\infty}A^{0}(\mathcal{W}))^{-1} \in \mathcal{R}^N(c^{m-2}; \mathbf{W}, c^{-m}\Phi) \cap \mathcal{R}^N(c^{m-2}; \mathcal{W}, c^{-m}\Phi). \tag{6.27} \]

Proof. (6.24)–(6.27) follow from (5.10)–(5.13), Remark 6.6, Lemma 6.2, Lemma 6.4, Lemma 6.5, the determinant-adjoint formula for the inverse of a matrix, and the hypotheses (6.12), (6.13) on the equation of state.
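Lemma 6.7. Let $\mathbf{B}_\infty(\mathcal{W}, \partial\Phi) \overset{\mathrm{def}}{=} (0, 0, -\mathfrak{R}_\infty(\eta, p)\partial_1\Phi, -\mathfrak{R}_\infty(\eta, p)\partial_2\Phi, -\mathfrak{R}_\infty(\eta, p)\partial_3\Phi)$ denote the right-hand side of the $\mathrm{EP}_\kappa$ equations (4.9)–(4.11), and let $\mathbf{B}_c(\mathbf{W}, \Phi, D\Phi)$ denote the right-hand side of the $\mathrm{EN}^c_\kappa$ equations (4.1)–(4.3). Let the $c$-dependent relationship between $\mathcal{W}$ and $\mathbf{W}, \Phi$ be defined by (5.5) and (5.7). Then for all large $c$ including $c = \infty$, and for $m = 0, 1, 2$ and $n = 0, 1$, we have that
\[ \mathbf{B}_\infty(\mathcal{W}, \partial\Phi) \in \mathcal{I}^N(c^n; \mathcal{W}, c^{-n}\partial\Phi) \cap \mathcal{I}^N(c^n; \mathbf{W}, c^{-m}\Phi, c^{-n}\partial\Phi), \tag{6.28} \]
\[ \mathbf{B}_c(\mathbf{W}, \Phi, D\Phi) \in \mathcal{I}^N(\mathbf{W}, c^{-m}\Phi, \partial\Phi, c^{-m}\partial_t\Phi) \cap \mathcal{I}^N(\mathcal{W}, c^{-m}\Phi, \partial\Phi, c^{-m}\partial_t\Phi), \tag{6.29} \]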
and
\[ \mathbf{B}_c(\mathbf{W}, \Phi, D\Phi) = \mathbf{B}_\infty(\mathcal{W}, \partial\Phi) + \mathcal{O}^N(c^{m-2}; \mathbf{W}, c^{-m}\Phi, c^{-m}D\Phi). \tag{6.30} \]

Proof. (6.28)–(6.30) all follow from combining the facts $\mathbf{B}_\infty(\bar{\mathcal{W}}_\infty, 0) = 0$ and $\mathbf{B}_c(\bar{\mathbf{W}}_c, \bar{\Phi}_c, 0) = 0$ with Remark 6.7, Lemma 6.2, Lemma 6.4, and Lemma 6.5.

Remark 6.12. The fact that $\mathbf{B}_\infty(\mathcal{W}, \partial\Phi) \in \mathcal{I}^N(c^1; \mathbf{W}, c^{-m}\Phi, c^{-1}\partial\Phi)$ plays a distinguished role in the proof of Lemma 10.2; $\mathbf{B}_\infty(\mathcal{W}, \partial\Phi)$ will be one of the factors in the "worst error term" because it can grow like $c^1$ if we only have control over the size of $c^{-1}\partial\Phi$.

Remark 6.13. Many of the above lemmas are valid for other values of $m$ and $n$; we stated the lemmas for the values of $m$ and $n$ that we plan to use later.

Example 6.1. As an enlightening example, we discuss the non-relativistic limit of polytropic equations of state, that is, equations of state of the form $\rho = m_0 c^2 n + \frac{A_c(\eta)}{\gamma - 1}n^{\gamma}$, where $m_0$ denotes the rest mass of a fluid element, $n$ denotes the proper number density, and $\gamma > 1$. Let us assume that $A_c, A_\infty \in \mathcal{R}^{N+1}(\Pi_1(\mathfrak{C}); \eta)$, that $A_\infty > 0$ on $\Pi_1(\mathfrak{C})$, and that $A_c - A_\infty \in \mathcal{R}^{N+1}(c^{-2}; \Pi_1(\mathfrak{C}); \eta)$, where $\Pi_1(\mathfrak{C})$ is the projection of the set $\mathfrak{C}$ introduced at the beginning of Sec. 6 onto the first axis. Some omitted calculations show that Hypotheses (6.12) and (6.13) then hold, and that
\[ R_c = e^{4\Phi/c^2}\mathfrak{R}_c(\eta, p) = \frac{m_0 P^{1/\gamma}e^{(4\Phi/c^2)(1 - 1/\gamma)}}{A_c^{1/\gamma}(\eta)} + \frac{P}{c^2(\gamma - 1)}, \tag{6.31} \]
\[ Q_c = \mathfrak{Q}_c(\eta, p, \Phi) = \gamma P, \tag{6.32} \]
\[ R_\infty = \mathfrak{R}_\infty(\eta, p) = \frac{m_0 p^{1/\gamma}}{A_\infty^{1/\gamma}(\eta)}, \tag{6.33} \]
\[ Q_\infty = \mathfrak{Q}_\infty(\eta, p) = \gamma p. \tag{6.34} \]
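The identity (6.32) can be checked against the general formula (4.7) using only (6.31). The computation below is a sketch of one of the omitted calculations; it reads off $\mathfrak{R}_c(\eta, p)$ from (6.31) by substituting $P = e^{4\Phi/c^2}p$.

```latex
% From (6.31) with P = e^{4\Phi/c^2} p:
\[
  \mathfrak{R}_c(\eta, p)
    = \frac{m_0\, p^{1/\gamma}}{A_c^{1/\gamma}(\eta)} + \frac{p}{c^2(\gamma - 1)} .
\]
% Derivative with respect to p at fixed \eta:
\[
  \frac{\partial \mathfrak{R}_c}{\partial p}
    = \frac{m_0}{\gamma}\, \frac{p^{1/\gamma - 1}}{A_c^{1/\gamma}(\eta)}
      + \frac{1}{c^2(\gamma - 1)} .
\]
% The bracket in (4.7), using 1/(\gamma - 1) + 1 = \gamma/(\gamma - 1):
\[
  \mathfrak{R}_c + c^{-2} p
    = \frac{m_0\, p^{1/\gamma}}{A_c^{1/\gamma}(\eta)} + \frac{\gamma\, p}{c^2(\gamma - 1)}
    = \gamma p\, \frac{\partial \mathfrak{R}_c}{\partial p} .
\]
% Therefore, by (4.7) and (4.8):
\[
  Q_c = \Bigl( \frac{\partial \mathfrak{R}_c}{\partial p} \Bigr)^{-1}
        e^{4\Phi/c^2} \bigl[ \mathfrak{R}_c + c^{-2} p \bigr]
      = \gamma\, e^{4\Phi/c^2} p = \gamma P .
\]
```

Dropping the $c^{-2}$ terms in the same computation gives $Q_\infty = \gamma p$ from (4.14) and (6.33).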
In the isentropic case $\eta(t, s) \equiv \bar{\eta}$, (6.33) can be rewritten in the familiar form $p = C \cdot (R_\infty)^{\gamma}$, where $C$ is a constant.

7. Energy Currents

In this section we provide energy currents and discuss two key properties: (i) for a fixed $c$, they are positive definite in the variations $\dot{\mathbf{W}}$ when contracted against certain covectors, and (ii) their divergence is lower order in the variations. In Sec. 8.3, we will see that the positivity property is uniform for all large $c$. A general framework for the construction of energy currents for hyperbolic systems derivable from a Lagrangian is developed in [8]. The role of energy currents is to replace the energy principle available for symmetric hyperbolic systems by providing integral identities, or more generally, integral inequalities, that enable one to control Sobolev
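norms of solutions$^{p}$ to the $\mathrm{EOV}^c_\kappa$. This technique will be used in our proofs of Lemma 10.10 and Theorem 11.2.

7.1. The definition of an energy current

Given a variation $\dot{W} : M \to \mathbb{R}^5$ and a BGS$^{q}$ $\widetilde{V} : M \to \mathbb{R}^{10}$ as defined in Sec. 5, we define the energy current to be the vectorfield ${}^{(c)}\dot{J}$ with components ${}^{(c)}\dot{J}^0$, ${}^{(c)}\dot{J}^j$, $j = 1, 2, 3$, in the global rectangular coordinate system given by
\[
\begin{aligned}
{}^{(c)}\dot{J}^0 :={}& \dot{\eta}^2 + \frac{\dot{P}^2}{\widetilde{\mathfrak{Q}}_c} + 2c^{-2}(\tilde{\gamma}_c)^2(\tilde{v}_k \dot{v}^k)\dot{P} \\
&+ (\tilde{\gamma}_c)^2[\widetilde{\mathfrak{R}}_c + c^{-2}\widetilde{P}] \cdot [\dot{v}_k \dot{v}^k + c^{-2}(\tilde{\gamma}_c)^2(\tilde{v}_k \dot{v}^k)^2], \\
{}^{(c)}\dot{J}^j :={}& \tilde{v}^j \dot{\eta}^2 + \frac{\tilde{v}^j}{\widetilde{\mathfrak{Q}}_c}\dot{P}^2 + 2[\dot{v}^j + c^{-2}(\tilde{\gamma}_c)^2 \tilde{v}^j \tilde{v}_k \dot{v}^k] \cdot \dot{P} \\
&+ (\tilde{\gamma}_c)^2 \tilde{v}^j [\widetilde{\mathfrak{R}}_c + c^{-2}\widetilde{P}] \cdot [\dot{v}_k \dot{v}^k + c^{-2}(\tilde{\gamma}_c)^2(\tilde{v}_k \dot{v}^k)^2].
\end{aligned}
\tag{7.1}
\]
In the case $c = \infty$, we define for $j = 1, 2, 3$:
\[
\begin{aligned}
{}^{(\infty)}\dot{J}^0 :={}& \dot{\eta}^2 + \frac{\dot{p}^2}{\widetilde{\mathfrak{Q}}_\infty} + \widetilde{\mathfrak{R}}_\infty \dot{v}_k \dot{v}^k, \\
{}^{(\infty)}\dot{J}^j :={}& \tilde{v}^j \dot{\eta}^2 + \frac{\tilde{v}^j}{\widetilde{\mathfrak{Q}}_\infty}\dot{p}^2 + 2\dot{v}^j \dot{p} + \widetilde{\mathfrak{R}}_\infty \tilde{v}^j \dot{v}_k \dot{v}^k.
\end{aligned}
\tag{7.2}
\]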
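As a numerical sanity check on how (7.1) relates to (7.2), the sketch below evaluates the time components of both currents for one fixed set of variations and compares them as c grows. It rests on assumptions not made in this section: $\tilde{\gamma}_c$ is taken to be the Lorentz factor $(1 - |\tilde{v}|^2/c^2)^{-1/2}$, the coefficients $\widetilde{\mathfrak{R}}$, $\widetilde{\mathfrak{Q}}$, $\widetilde{P}$ are held fixed (ignoring their own c-dependence), and all numerical values are illustrative.

```python
def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def J0_inf(eta_dot, p_dot, v_dot, R, Q):
    """Time component of the c = infinity current, formula (7.2)."""
    return eta_dot**2 + p_dot**2 / Q + R * dot(v_dot, v_dot)


def J0_c(eta_dot, P_dot, v_dot, v, R, Q, P, c):
    """Time component of the finite-c current, formula (7.1).

    gamma2 is the squared Lorentz factor of the background velocity v
    (an assumption about gamma_c, which is defined outside this section).
    """
    gamma2 = 1.0 / (1.0 - dot(v, v) / c**2)
    vv = dot(v, v_dot)
    return (eta_dot**2 + P_dot**2 / Q
            + 2.0 * gamma2 * vv * P_dot / c**2
            + gamma2 * (R + P / c**2) * (dot(v_dot, v_dot) + gamma2 * vv**2 / c**2))


# Illustrative variations and background values (hypothetical numbers).
eta_dot, P_dot, v_dot = 0.3, -0.2, (0.5, 0.1, -0.4)
v, R, Q, P = (0.3, -0.2, 0.1), 1.2, 2.0, 1.0

limit = J0_inf(eta_dot, P_dot, v_dot, R, Q)
gaps = {c: abs(J0_c(eta_dot, P_dot, v_dot, v, R, Q, P, c) - limit)
        for c in (10.0, 100.0, 1000.0)}
for c, gap in gaps.items():
    print(f"c = {c:7.1f}   |J0_c - J0_inf| = {gap:.3e}")
```

With the coefficients frozen, every c-dependent term in (7.1) carries at least one factor of $c^{-2}$, so the gap should shrink by roughly a factor of 100 per decade in c.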
We note that formally, $\lim_{c \to \infty} {}^{(c)}\dot{J} = {}^{(\infty)}\dot{J}$, a fact that will be rigorously justified in Sec. 8.3. The energy current (7.1) is very closely related to the energy current $\dot{J}$ introduced in [22], where the following changes have been made. First, we have dropped the terms from $\dot{J}$ corresponding to the variations of the potential $\dot{\Phi}$ and its derivatives, for we will bound these terms in a Sobolev norm using a separate argument. Second, the expression for ${}^{(c)}\dot{J}$ is constructed using the velocity state-space variable $v$ (3.29) and variations $\dot{v}$, as opposed to the variables $U^j := e^{\phi} u^j$ and variations $\dot{U}^j$ that appear in the expression for $\dot{J}$. Finally, we emphasize that the formula for ${}^{(c)}\dot{J}^\nu$ applies in a rectangular coordinate system with $x^0 = t$, whereas in the formula for $\dot{J}^\nu$ provided in [22], the rectangular coordinate system is such that $x^0 = ct$, even though c was set equal to unity in [22].

Remark 7.1. A similar current was used by Christodoulou in [9] to analyze the motion of a relativistic fluid evolving in Minkowski space.

$^{p}$ As we shall see, the energy currents ${}^{(c)}\dot{J}$ do not control the variations $\dot{\Phi}$ or $D\dot{\Phi}$; these terms are controlled through a separate argument based on the lemmas and propositions of Appendix A.
$^{q}$ Recall that we also refer to $\widetilde{W}$ as the BGS when $c = \infty$.
7.2. The positive definiteness of $\xi_\mu {}^{(c)}\dot{J}^\mu$ for $\xi \in I_x^{s*+}$
As discussed in detail in [22], for $\xi$ belonging to a certain subset of the cotangent space at x, which we denote by $T_x^* M$, the quadratic form$^{r}$ $\xi_\mu {}^{(c)}\dot{J}^\mu(\dot{W}, \dot{W})$ is positive definite in $\dot{W}$ if $\widetilde{P} > 0$. To elaborate upon this, we follow Christodoulou [9] and introduce the reciprocal acoustical metric $\tilde{h}^{-1}$, a quadratic form on $T_x^* M$ with components that read (for $j, k = 1, 2, 3$)
\[ (\tilde{h}^{-1})^{00} := -c^{-2} - (\tilde{\gamma}_c)^2 [S_c^{-2}(\tilde{\eta}, \tilde{p}) - c^{-2}], \tag{7.3} \]
\[ (\tilde{h}^{-1})^{0j} = (\tilde{h}^{-1})^{j0} := -(\tilde{\gamma}_c)^2 [S_c^{-2}(\tilde{\eta}, \tilde{p}) - c^{-2}]\, \tilde{v}^j, \tag{7.4} \]
\[ (\tilde{h}^{-1})^{jk} := \delta^{jk} - (\tilde{\gamma}_c)^2 [S_c^{-2}(\tilde{\eta}, \tilde{p}) - c^{-2}]\, \tilde{v}^j \tilde{v}^k, \tag{7.5} \]
in the global rectangular coordinate system. Recall that the function $S_c$ is defined in (3.16).

Recall that for a hyperbolic system of PDEs, the characteristic subset$^{s}$ of $T_x^* M$ is the union of several sheets. If we restrict our attention to the truncated$^{t}$ $\mathrm{EOV}^c_\kappa$ (5.1)–(5.3), then omitted calculations imply that the inner sheet is the sound cone at x, which can be described in coordinates as $\{\zeta \in T_x^* M \mid (\tilde{h}^{-1})^{\mu\nu}\zeta_\mu \zeta_\nu = 0\}$. The interior of the positive component of the sound cone, which we denote by $I_x^{s*+}$, can be described in coordinates as
\[ I_x^{s*+} := \{\zeta \in T_x^* M \mid (\tilde{h}^{-1})^{\mu\nu}\zeta_\mu \zeta_\nu < 0 \text{ and } \zeta_0 > 0\}. \tag{7.6} \]
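The membership test (7.6) is easy to evaluate once the components (7.3)–(7.5) are assembled. The following sketch does so for a constant background; it assumes that $\tilde{\gamma}_c$ is the Lorentz factor of $\tilde{v}$ and that `S` stands for the value $S_c(\tilde{\eta}, \tilde{p})$ with $0 < S < c$, and the sample numbers are hypothetical.

```python
def reciprocal_acoustical_metric(v, S, c):
    """4x4 nested list of (h^{-1})^{mu nu} from (7.3)-(7.5); index 0 is time."""
    # Squared Lorentz factor of the background velocity (assumption on gamma_c).
    gamma2 = 1.0 / (1.0 - sum(x * x for x in v) / c**2)
    k = gamma2 * (S**-2 - c**-2)
    h = [[0.0] * 4 for _ in range(4)]
    h[0][0] = -c**-2 - k                                   # (7.3)
    for j in range(3):
        h[0][j + 1] = h[j + 1][0] = -k * v[j]              # (7.4)
        for m in range(3):
            delta = 1.0 if j == m else 0.0
            h[j + 1][m + 1] = delta - k * v[j] * v[m]      # (7.5)
    return h


def in_positive_sound_cone_interior(zeta, v, S, c):
    """Check the two conditions in (7.6) for a covector zeta = (zeta_0, ..., zeta_3)."""
    h = reciprocal_acoustical_metric(v, S, c)
    q = sum(h[m][n] * zeta[m] * zeta[n] for m in range(4) for n in range(4))
    return q < 0.0 and zeta[0] > 0.0


v, S = (0.1, 0.0, 0.0), 0.5   # hypothetical subsonic background
time_like = [in_positive_sound_cone_interior((1.0, 0.0, 0.0, 0.0), v, S, c)
             for c in (10.0, 100.0)]
space_like = in_positive_sound_cone_interior((0.0, 1.0, 0.0, 0.0), v, S, 10.0)
print(time_like, space_like)
```

For the covector $(1, 0, 0, 0)$ the quadratic form reduces to $(\tilde{h}^{-1})^{00}$, which is negative whenever $S < c$; a purely spatial covector fails the test because its time component vanishes.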
We remark that the characteristic subset of $T_x^* M$ in the complete $\mathrm{EOV}^c_\kappa$ system (5.1)–(5.4) features an additional sheet: the light cone (also known as the “gravitational null cone”), which is contained inside the sound cone.$^{u}$

It follows from the general construction of energy currents as presented in [8] that $\xi_\mu {}^{(c)}\dot{J}^\mu(\dot{W}, \dot{W})$ is positive definite whenever $\widetilde{P} > 0$ and $\xi$ belongs to the interior of the positive component of the sound cone in $T_x^* M$:
\[ \xi_\mu {}^{(c)}\dot{J}^\mu(\dot{W}, \dot{W}) > 0 \quad \text{if } |\dot{W}| > 0,\ \widetilde{P} > 0, \text{ and } \xi \in I_x^{s*+}. \tag{7.7} \]
The inequality (7.7) allows us to use the quadratic form $\xi_\mu {}^{(c)}\dot{J}^\mu(\dot{W}, \dot{W})$ to estimate the $L^2$ norms of the variations $\dot{W}$, provided that we estimate the BGS $\widetilde{V}$. We will discuss this issue further in Sec. 8.3.

$^{r}$ We write “$\xi_\mu {}^{(c)}\dot{J}^\mu(\dot{W}, \dot{W})$” to emphasize the point of view that $\xi_\mu {}^{(c)}\dot{J}^\mu$ is a quadratic form in $\dot{W}$.
$^{s}$ [22] contains a detailed discussion of the notion of the characteristic subset of $T_x^* M$ in the context of the $\mathrm{EN}^{c=1}_\kappa$ system.
$^{t}$ By “truncated $\mathrm{EOV}^c_\kappa$” we mean the system that results upon deleting the variable $\dot{\Phi}$ and Eq. (5.4) that it satisfies.
$^{u}$ As discussed in [22], one can also define the sound cone and light cone subsets of the tangent space at x, which we denote by $T_x M$, by introducing the notion of the dual to a sheet of the characteristic subset of $T_x^* M$. The duality reverses the aforementioned containment so that in $T_x M$, the sound cone is contained inside of the light cone. This is perhaps the more familiar picture, for it corresponds to our intuitive notion of sound traveling more slowly than light.
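In contrast, the energy current $\dot{J}$ from [22] has the property that $\xi_\mu \dot{J}^\mu$ is a positive definite quadratic form in $\dot{V}$ only for $\xi$ belonging to the interior of the positive component of the light cone in $T_x^* M$; $\xi_\mu {}^{(c)}\dot{J}^\mu(\dot{W}, \dot{W})$ is positive definite for a larger set of $\xi$ than is $\xi_\mu \dot{J}^\mu(\dot{V}, \dot{V})$ because ${}^{(c)}\dot{J}$ does not contain terms involving the variations of the potential $\dot{\Phi}$, and therefore does not control the propagation of gravitational waves.

Remark 7.2. Because $\lim_{c \to \infty} S_c^{-2}(\tilde{\eta}, \tilde{p}) = S_\infty^{-2}(\tilde{\eta}, \tilde{p}) > 0$, it follows that for all large c, the covector with coordinates $(1, 0, 0, 0)$ is an element of $I_x^{s*+}$. Therefore, ${}^{(c)}\dot{J}^0(\dot{W}, \dot{W})$ is positive definite for all large c. We also observe that ${}^{(\infty)}\dot{J}^0(\dot{W}, \dot{W})$, which is defined in (7.2), is manifestly positive definite in the variations if $\tilde{p} > 0$, for by our fundamental assumptions on the equation of state, $\tilde{p} > 0 \Rightarrow \widetilde{\mathfrak{R}}_\infty > 0$ and $\widetilde{\mathfrak{Q}}_\infty > 0$.

7.3. The divergence of the energy current

As described in [22], if the variations $\dot{W}$ are solutions of the $\mathrm{EOV}^c_\kappa$ (5.1)–(5.3), then we can compute $\partial_\mu({}^{(c)}\dot{J}^\mu)$ and use Eqs. (5.1)–(5.3) for substitution to eliminate the terms$^{v}$ containing the derivatives of $\dot{W}$:
\[
\begin{aligned}
\partial_\mu({}^{(c)}\dot{J}^\mu) ={}& \Big[\partial_t\Big(\frac{1}{\widetilde{\mathfrak{Q}}_c}\Big) + \partial_j\Big(\frac{\tilde{v}^j}{\widetilde{\mathfrak{Q}}_c}\Big)\Big] \cdot \dot{P}^2 \\
&+ 2c^{-2}(\tilde{\gamma}_c)^2 \cdot \{\partial_t \tilde{v}_k + \tilde{v}_k \partial_j \tilde{v}^j + \tilde{v}^j \partial_j \tilde{v}_k + 2c^{-2}(\tilde{\gamma}_c)^2 \tilde{v}_k(\tilde{v}_j \partial_t \tilde{v}^j + \tilde{v}^j \tilde{v}_a \partial_j \tilde{v}^a)\} \cdot \dot{P}\dot{v}^k \\
&+ \{\partial_t[(\tilde{\gamma}_c)^2(\widetilde{\mathfrak{R}}_c + c^{-2}\widetilde{P})] + \partial_j[(\tilde{\gamma}_c)^2(\widetilde{\mathfrak{R}}_c + c^{-2}\widetilde{P})\tilde{v}^j]\} \cdot \{\dot{v}_k \dot{v}^k + c^{-2}(\tilde{\gamma}_c)^2(\tilde{v}_k \dot{v}^k)^2\} \\
&+ 2c^{-2}(\tilde{\gamma}_c)^4 \{\widetilde{\mathfrak{R}}_c + c^{-2}\widetilde{P}\} \cdot \{\tilde{v}_k \dot{v}^k \dot{v}_j \partial_t \tilde{v}^j + \tilde{v}_k \dot{v}^k \dot{v}_a \tilde{v}^j \partial_j \tilde{v}^a + c^{-2}(\tilde{\gamma}_c)^2(\tilde{v}_k \dot{v}^k)^2(\tilde{v}_j \partial_t \tilde{v}^j + \tilde{v}_a \tilde{v}^j \partial_j \tilde{v}^a)\} \\
&+ 2\dot{\eta} f + 2\frac{\dot{P}}{\widetilde{\mathfrak{Q}}_c} g + 2\dot{v}_j h^{(j)}.
\end{aligned}
\tag{7.8}
\]
We observe here that in the case $c = \infty$, (7.8) reduces to the more palatable expression
\[
\partial_\mu({}^{(\infty)}\dot{J}^\mu) = \Big[\partial_t\Big(\frac{1}{\widetilde{\mathfrak{Q}}_\infty}\Big) + \partial_j\Big(\frac{\tilde{v}^j}{\widetilde{\mathfrak{Q}}_\infty}\Big)\Big] \cdot \dot{p}^2 + \{\partial_t(\widetilde{\mathfrak{R}}_\infty) + \partial_j(\widetilde{\mathfrak{R}}_\infty \tilde{v}^j)\} \cdot \dot{v}_k \dot{v}^k + 2\dot{\eta} f + 2\frac{\dot{p}}{\widetilde{\mathfrak{Q}}_\infty} g + 2\dot{v}_j h^{(j)}.
\tag{7.9}
\]

$^{v}$ Showing this via a calculation is an arduous task. The lower-order divergence property is a generic feature of energy currents constructed in the manner described in [8], but we require its explicit form in order to analyze its c-dependence.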
8. The Initial Data and the Uniform-in-c Positivity of the Energy Currents

In this section we describe a class of initial data for which our energy methods allow us to rigorously take the limit $c \to \infty$ in the $\mathrm{EN}^c_\kappa$ system. The Cauchy surface we consider is $\{(t, s) \in M \mid t = 0\}$.

8.1. An $H^N$ perturbation of a uniform quiet fluid

Initial data for the $\mathrm{EP}_\kappa$ system are denoted by
\[ \mathring{V}_\infty(s) := (\mathring{\eta}, \mathring{p}, \mathring{v}^1, \mathring{v}^2, \mathring{v}^3, \mathring{\Phi}_\infty, \mathring{\Psi}_0, \mathring{\Psi}_1, \mathring{\Psi}_2, \mathring{\Psi}_3), \tag{8.1} \]
where $\mathring{\Psi}_0(s) := \partial_t \Phi(0, s)$ and $\mathring{\Psi}_j := \partial_j \mathring{\Phi}_\infty(s)$. We assume that $\mathring{V}_\infty$ is an $H^N$ perturbation of the constant state $\bar{V}_\infty$, where
\[ \bar{V}_\infty := (\bar{\eta}, \bar{p}, 0, 0, 0, \bar{\Phi}_\infty, 0, 0, 0, 0), \tag{8.2} \]
$\bar{\eta}, \bar{p}$ are positive constants, and the constant $\bar{\Phi}_\infty$ is the unique solution to
\[ \kappa^2 \bar{\Phi}_\infty + 4\pi G R_\infty(\bar{\eta}, \bar{p}) = 0. \tag{8.3} \]
The constraint (8.3) must be satisfied in order for Eq. (4.12) to be satisfied by $\bar{V}_\infty$. By an $H^N$ perturbation, we mean that
\[ \|\mathring{W}_\infty\|_{H^N_{\bar{W}_\infty}} < \infty, \tag{8.4} \]
where we use the notation $\mathring{W}_\infty$ and $\bar{W}_\infty$ to refer to the first 5 components of $\mathring{V}_\infty$ and $\bar{V}_\infty$, respectively. We emphasize that a further positivity restriction on the initial data $\mathring{p}$ and $\mathring{\eta}$ is introduced in Sec. 8.2, and that throughout this article, N is a fixed integer satisfying
\[ N \geq 4. \tag{8.5} \]

Remark 8.1. We require $N \geq 4$ so that Corollary B.1 and Remark B.1 can be applied to conclude that $l \in \bigcap_{k=0}^{k=2} C^k([0, T], H^{N-k})$, where $l$ is defined in (10.14); this is a necessary hypothesis for Proposition A.3, which we use in our proof of Theorem 10.2.

Although we refer to $\mathring{\Phi}_\infty$ and $\mathring{\Psi}_\nu$, $\nu = 0, 1, 2, 3$, as “data”, in the $\mathrm{EP}_\kappa$ system, these 5 quantities are determined by $\mathring{\eta}, \mathring{p}, \mathring{v}^1, \mathring{v}^2, \mathring{v}^3$ through Eqs. (4.10′), (4.12), and (8.3), together with vanishing conditions at infinity on $\mathring{\Phi}_\infty - \bar{\Phi}_\infty$ and $\mathring{\Psi}_0$:
\[ \Delta \mathring{\Phi}_\infty - \kappa^2(\mathring{\Phi}_\infty - \bar{\Phi}_\infty) = 4\pi G[R_\infty(\mathring{\eta}, \mathring{p}) - R_\infty(\bar{\eta}, \bar{p})], \tag{8.6} \]
\[ \Delta \mathring{\Psi}_0 - \kappa^2 \mathring{\Psi}_0 = 4\pi G\, \partial_t|_{t=0}(R_\infty(\eta, p)) = -4\pi G\, \partial_k(R_\infty(\mathring{\eta}, \mathring{p})\mathring{v}^k), \tag{8.7} \]
where the integral kernel from (4.15) can be used to compute $\mathring{\Phi}_\infty - \bar{\Phi}_\infty$ and $\mathring{\Psi}_0$. We will nevertheless refer to the array $\mathring{V}_\infty$ as the “data” for the $\mathrm{EP}_\kappa$ system.
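Equations (8.6) and (8.7) are both of the screened Poisson form $(\Delta - \kappa^2)u = f$ with decay at infinity, which can be solved by convolving the source with a Green's function. The kernel (4.15) itself is not reproduced in this section, so the sketch below uses a one-dimensional analog as an illustration only: on the real line, the decaying solution of $u'' - \kappa^2 u = f$ is $u(x) = -\int e^{-\kappa|x-y|}/(2\kappa)\, f(y)\, dy$. The function name, the grid, and the test source are all hypothetical.

```python
import math


def solve_screened_poisson_1d(f, xs, kappa):
    """Approximate the decaying solution of u'' - kappa^2 u = f on the grid xs.

    Uses the 1D Green's function -exp(-kappa |x - y|) / (2 kappa) and the
    trapezoid rule; this is a 1D stand-in for the 3D kernel, not the paper's (4.15).
    """
    dx = xs[1] - xs[0]
    f_vals = [f(y) for y in xs]
    weights = [dx] * len(xs)
    weights[0] = weights[-1] = dx / 2.0
    solution = []
    for x in xs:
        total = 0.0
        for y, fy, wy in zip(xs, f_vals, weights):
            total += wy * math.exp(-kappa * abs(x - y)) * fy
        solution.append(-total / (2.0 * kappa))
    return solution


kappa = 1.5
xs = [-8.0 + 0.02 * i for i in range(801)]


def u_exact(x):
    """Manufactured solution used to build a source with a known answer."""
    return math.exp(-x * x)


def source(x):
    """f = u'' - kappa^2 u for u = exp(-x^2)."""
    return (4.0 * x * x - 2.0 - kappa**2) * math.exp(-x * x)


u_num = solve_screened_poisson_1d(source, xs, kappa)
max_error = max(abs(a - u_exact(x)) for a, x in zip(u_num, xs))
print(f"max error = {max_error:.2e}")
```

The manufactured-solution check only confirms that the convolution inverts the operator on this grid; the three-dimensional problem would use the corresponding radial kernel instead.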
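Remark 8.2. Remark 4.1 implies that $\mathring{\Phi}_\infty \in H^{N+2}_{\bar{\Phi}_\infty}$ and $\mathring{\Psi}_\nu \in H^{N+1}$ for $\nu = 0, 1, 2, 3$.

We now construct data for the $\mathrm{EN}^c_\kappa$ system from $\mathring{V}_\infty$. Depending on which set of state-space variables we are working with, we denote the data for the $\mathrm{EN}^c_\kappa$ system by
\[ \mathring{\mathcal{V}}_c := (\mathring{\eta}, \mathring{p}, \mathring{v}^1, \mathring{v}^2, \mathring{v}^3, \mathring{\Phi}_c, \mathring{\Psi}_0, \mathring{\Psi}_1, \mathring{\Psi}_2, \mathring{\Psi}_3), \tag{8.8} \]
\[ \text{or} \quad \mathring{V}_c := (\mathring{\eta}, e^{4\mathring{\Phi}_c/c^2}\mathring{p}, \mathring{v}^1, \mathring{v}^2, \mathring{v}^3, \mathring{\Phi}_c, \mathring{\Psi}_0, \mathring{\Psi}_1, \mathring{\Psi}_2, \mathring{\Psi}_3), \tag{8.9} \]
where unlike in the $\mathrm{EP}_\kappa$ case, $\mathring{\Phi}_c, \mathring{\Psi}_0, \mathring{\Psi}_1, \mathring{\Psi}_2$, and $\mathring{\Psi}_3$ are data in the sense that the $\mathrm{EN}^c_\kappa$ system is under-determined if they are not prescribed. We have chosen the data $\mathring{\eta}, \mathring{p}, \mathring{v}^1, \mathring{v}^2, \mathring{v}^3, \mathring{\Psi}_0, \mathring{\Psi}_1, \mathring{\Psi}_2, \mathring{\Psi}_3$ for the $\mathrm{EN}^c_\kappa$ system to be the same as the data for the $\mathrm{EP}_\kappa$ system, but for technical reasons described below and indicated in (8.12) and (8.14), our requirement that there exists a constant background state typically constrains the datum $\mathring{\Phi}_c$ so that it differs from $\mathring{\Phi}_\infty$ by a small constant that vanishes as $c \to \infty$.

As in the $\mathrm{EP}_\kappa$ system, we assume that $\mathring{V}_c$ is an $H^N$ perturbation of the constant state of the form (depending on which collection of state-space variables we are working with)
\[ \bar{\mathcal{V}}_c := (\bar{\eta}, \bar{p}, 0, 0, 0, \bar{\Phi}_c, 0, 0, 0, 0), \tag{8.10} \]
\[ \text{or} \quad \bar{V}_c := (\bar{\eta}, \bar{P}_c, 0, 0, 0, \bar{\Phi}_c, 0, 0, 0, 0), \tag{8.11} \]
where $\bar{\eta}$ and $\bar{p}$ are the same constants appearing in $\bar{V}_\infty$, $\bar{\Phi}_c$ is the unique solution to
\[ \kappa^2 \bar{\Phi}_c + 4\pi G e^{4\bar{\Phi}_c/c^2}[R_c(\bar{\eta}, \bar{p}) - 3c^{-2}\bar{p}] = 0, \tag{8.12} \]
and $\bar{P}_c := e^{4\bar{\Phi}_c/c^2}\bar{p}$. The constraint (8.12) must be satisfied in order for Eq. (4.4) to be satisfied by $\bar{p}$, $\bar{\eta}$, and $\bar{\Phi}_c$. Although the background potential $\bar{\Phi}_c$ for the $\mathrm{EN}^c_\kappa$ system is not in general equal to the background potential $\bar{\Phi}_\infty$ for the $\mathrm{EP}_\kappa$ system, it follows from the hypotheses (6.12) and (6.13) on the c-dependence of $R_c$ that
\[ \lim_{c \to \infty} \bar{\Phi}_c = \bar{\Phi}_\infty. \tag{8.13} \]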
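The limit (8.13) can be illustrated by solving the scalar constraint (8.12) numerically for increasing c and comparing with the explicit solution of (8.3). The sketch below assumes a polytropic $R_c$ of the form (6.31) at the constant background with a c-independent coefficient $A$; the constants `m0`, `A`, `gamma`, `kappa`, `G`, and `p_bar` are illustrative, and bisection is simply one convenient root finder.

```python
import math


def R_c(p, c, m0=1.0, A=1.0, gamma=2.0):
    """Polytropic R_c(eta, p) with frozen entropy; c = math.inf gives R_infinity."""
    relativistic = 0.0 if math.isinf(c) else p / (c**2 * (gamma - 1.0))
    return m0 * (p / A) ** (1.0 / gamma) + relativistic


def background_potential(c, p_bar=1.0, kappa=1.0, G=1.0):
    """Solve kappa^2 Phi + 4 pi G exp(4 Phi / c^2) [R_c - 3 p_bar / c^2] = 0 by bisection.

    Assumes R_c - 3 p_bar / c^2 > 0, so the left-hand side is strictly
    increasing in Phi and changes sign on [-a / kappa^2, 0].
    """
    inv_c2 = 0.0 if math.isinf(c) else 1.0 / c**2
    a = 4.0 * math.pi * G * (R_c(p_bar, c) - 3.0 * inv_c2 * p_bar)

    def g(phi):
        return kappa**2 * phi + a * math.exp(4.0 * phi * inv_c2)

    lo, hi = -a / kappa**2, 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


phi_inf = background_potential(math.inf)
gaps = {c: abs(background_potential(c) - phi_inf) for c in (10.0, 100.0, 1000.0)}
for c, gap in gaps.items():
    print(f"c = {c:7.1f}   |Phi_c - Phi_inf| = {gap:.3e}")
```

For $c = \infty$ the exponential factor equals 1 and the root is $-4\pi G R_\infty(\bar{\eta}, \bar{p})/\kappa^2$, in agreement with (8.3).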
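We now define the initial datum $\mathring{\Phi}_c$ appearing in the arrays (8.8) and (8.9) by
\[ \mathring{\Phi}_c := \mathring{\Phi}_\infty - \bar{\Phi}_\infty + \bar{\Phi}_c, \tag{8.14} \]
which ensures that the deviation of $\mathring{\Phi}_c$ from the background potential $\bar{\Phi}_c$ matches the deviation of $\mathring{\Phi}_\infty$ from the background potential $\bar{\Phi}_\infty$. We denote the first 5 components of $\mathring{\mathcal{V}}_c$, $\bar{\mathcal{V}}_c$, $\mathring{V}_c$, and $\bar{V}_c$ by $\mathring{\mathcal{W}}_c$, $\bar{\mathcal{W}}_c$, $\mathring{W}_c$, and $\bar{W}_c$, respectively.

Remark 8.3. We could weaken the hypotheses by allowing the initial data for the $\mathrm{EN}^c_\kappa$ system to deviate from the initial data for the $\mathrm{EP}_\kappa$ system by an $H^N$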
perturbation that decays to 0 rapidly enough as $c \to \infty$. For simplicity, we will not pursue this analysis here.

8.2. The sets $O$, $O_2$, $\mathfrak{O}_2$, $\mathfrak{C}$, $K$, and $\mathfrak{K}$

In order to avoid studying the free boundary problem, and in order to avoid singularities in the energy currents (7.1) and (7.2), we assume that the initial pressure, energy density, and speed of sound are uniformly bounded from below by a positive constant. According to our assumptions (3.13) on the equation of state, to achieve this uniform bound, it is sufficient to make the following further assumption on the initial data: that $\mathring{W}_\infty(\mathbb{R}^3)$ is contained in a compact subset of the following open subset $O$ of the state-space $\mathbb{R}^5$, the admissible subset of truncated state-space, defined by
\[ O := \{W = (\eta, p, v^1, v^2, v^3) \in \mathbb{R}^5 \mid \eta > 0,\ p > 0\}. \tag{8.15} \]
Therefore, we assume that $\mathring{W}_\infty(\mathbb{R}^3) \subset O_1$ and $\bar{W}_\infty \in O_1$, where $O_1$ is a precompact open set with $O_1 \Subset O$, and “$\Subset$” means that “the closure is compact and contained in the interior of”. We then fix convex precompact open subsets $O_2$ and $\mathfrak{O}_2$ with $O_1 \Subset O_2 \Subset \mathfrak{O}_2 \Subset O$, and define $\mathfrak{C}$ to be the projection of $\bar{\mathfrak{O}}_2$ onto the first two axes, where $\bar{\mathfrak{O}}_2$ denotes the closure of $\mathfrak{O}_2$. We assume that with this definition of $\mathfrak{C}$, hypotheses (6.12) and (6.13) are satisfied by the equation of state. Consequently, property (8.13) shows that for all large c including $c = \infty$, $\mathring{W}_c(\mathbb{R}^3) \Subset O_2$ and $\bar{W}_c \in O_2$; also note that for all c, $\mathring{\mathcal{W}}_c = \mathring{\mathcal{W}}_\infty = \mathring{W}_\infty$.

We now address the variables $(\Phi, \partial_t \Phi, \partial_1 \Phi, \partial_2 \Phi, \partial_3 \Phi)$. In Sec. 10, we will use energy estimates to prove the existence of an interval $[0, T]$ and a cube of the form $[-a, a]^5$ such that for all large c including $c = \infty$, we have $(\Phi, \partial_t \Phi, \partial_1 \Phi, \partial_2 \Phi, \partial_3 \Phi)([0, T] \times \mathbb{R}^3) \subset [-a, a]^5$. Furthermore, it will follow from the discussion in Sec. 10 that for all large c including $c = \infty$, we have $(\mathring{\Phi}_c, \mathring{\Psi}_0, \mathring{\Psi}_1, \mathring{\Psi}_2, \mathring{\Psi}_3)(\mathbb{R}^3) \Subset \mathrm{Int}([-a, a]^5)$. The compact convex set $K$, then, as given in (10.29) below, will be defined to be $\bar{O}_2 \times [-a, a]^5$. It follows from the above discussion that for all large c including $c = \infty$, we have $\mathring{V}_c(\mathbb{R}^3) \Subset \mathrm{Int}(K)$ and $\bar{V}_c \in \mathrm{Int}(K)$. Our goal will be to show that the solution $V_c$ to (4.1)–(4.8) launched by the initial data $\mathring{V}_c$ exists on a time interval $[0, T]$ that is independent of (all large) c and remains in $K$.

We now discuss the simple construction of $\mathfrak{K}$: based on the above construction, it follows from definitions (5.5)–(5.8) that for all large c including $c = \infty$, we have $V \in K \Rightarrow \mathcal{W} \in \bar{\mathfrak{O}}_2$. As given in (10.30), we will then define the compact convex$^{w}$ set $\mathfrak{K}$ by $\mathfrak{K} := \bar{\mathfrak{O}}_2 \times [-a, a]^5$, so that for all large c including $c = \infty$, we also have that $V \in K \Rightarrow \mathcal{V} \in \mathfrak{K}$. As in the previous discussion, it follows that for all large c including $c = \infty$, we have $\mathring{\mathcal{V}}_c(\mathbb{R}^3) \Subset \mathrm{Int}(\mathfrak{K})$ and $\bar{\mathcal{V}}_c \in \mathrm{Int}(\mathfrak{K})$.

$^{w}$ Proposition B.2 requires the convexity of $K$ and $\mathfrak{K}$, and the estimate (B.6) also requires that $\bar{V}_c \in K$ and $\bar{\mathcal{V}}_c \in \mathfrak{K}$. In practice, $K$ and $\mathfrak{K}$ can be chosen to be cubes.
8.3. The uniform-in-c positive definiteness of ${}^{(c)}\dot{J}^0$

As mentioned at the beginning of Sec. 7, we will use the quantity $\|{}^{(c)}\dot{J}^0(t)\|_{L^1}$ to control $\|\dot{W}(t)\|^2_{L^2}$, where ${}^{(c)}\dot{J}$ is an energy current for the variation $\dot{W}$ with coefficients defined by a BGS $\widetilde{V}$. Since we seek estimates that are uniform in c, we will show that under some simple assumptions on the BGS $\widetilde{V}$, ${}^{(c)}\dot{J}^0$ is uniformly positive definite in $\dot{W}$ for all large c. Let us now formulate this precisely as a lemma.

Lemma 8.1. Let ${}^{(c)}\dot{J}$ be the energy current (7.1) for the variation $\dot{W}$ defined by the BGS $\widetilde{V}$. Assume that $\widetilde{W}(t, s) \in \bar{O}_2$ and that $|\widetilde{\Phi}(t, s)| \leq Z$, where $\widetilde{W}$ denotes the first 5 components of $\widetilde{V}$, and $\bar{O}_2$ is defined in Sec. 8.2. Then there exists a constant $C_{\bar{O}_2, Z}$ with $0 < C_{\bar{O}_2, Z} < 1$ such that
\[ C_{\bar{O}_2, Z}|\dot{W}|^2 \leq {}^{(c)}\dot{J}^0(\dot{W}, \dot{W}) \leq C^{-1}_{\bar{O}_2, Z}|\dot{W}|^2 \tag{8.16} \]
holds for all large c including $c = \infty$.

Proof. It is sufficient to prove inequality (8.16) when $|\dot{W}| = 1$ since it is invariant under any rescaling of $\dot{W}$. Let $\widetilde{\mathcal{W}}, \widetilde{\mathcal{V}}$ be the arrays related to the arrays $\widetilde{W}, \widetilde{V}$ as defined in (5.5)–(5.8). Our assumptions imply the existence of a compact set $D$ depending only on $\bar{O}_2$ and $Z$ such that for all large c, $\widetilde{\mathcal{V}}(t, s) \in D$.

Recall that ${}^{(\infty)}\dot{J}$ is defined in (7.2) and that ${}^{(\infty)}\dot{J}^0$ is manifestly positive definite in the variations$^{x}$ $\dot{W}$ if $\tilde{p} > 0$. If we view ${}^{(\infty)}\dot{J}^0$ as a function of $(\dot{W}, \widetilde{\mathcal{W}})$, then by uniform continuity, there is a constant $0 < C(D) < 1$ such that $C(D)|\dot{W}|^2 \leq {}^{(\infty)}\dot{J}^0 \leq C(D)^{-1}|\dot{W}|^2$ holds on the compact set $\{|\dot{W}| = 1\} \times \Pi_5(D)$, where $\Pi_5(D)$ is the projection of $D$ onto the first five axes. Furthermore, if we also view ${}^{(c)}\dot{J}^0$ as a function of $(\dot{W}, \widetilde{\mathcal{V}})$, then by Lemma 6.2, Lemma 6.4, (7.1), and (7.2) we have that ${}^{(c)}\dot{J}^0 = {}^{(\infty)}\dot{J}^0 + F_c \cdot |\dot{W}|^2$, where $F_c \in R^N(c^{-2}; D; \widetilde{\mathcal{V}})$. (8.16) now easily follows: $C_{\bar{O}_2, Z}$ can be any positive number that is strictly smaller than $C(D)$.
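The two-sided bound (8.16) can be probed numerically on a model problem. The sketch below is a Monte Carlo check, not a proof, and relies on several assumptions: a polytropic equation of state as in Example 6.1 with frozen entropy, $\widetilde{\Phi} = 0$ (so $\widetilde{P} = \tilde{p}$), $\tilde{\gamma}_c$ taken to be the Lorentz factor, and a hypothetical compact box for the background values. The constant `C_D` is read off from the eigenvalues $1$, $1/\widetilde{\mathfrak{Q}}_\infty$, $\widetilde{\mathfrak{R}}_\infty$ of the diagonal form (7.2).

```python
import math
import random

GAMMA_AD, M0, A = 2.0, 1.0, 1.0        # hypothetical polytropic constants
P_MIN, P_MAX, V_MAX = 0.5, 2.0, 0.5    # hypothetical compact box for the background


def R_inf(p):
    """R_infinity from (6.33) with frozen entropy."""
    return M0 * (p / A) ** (1.0 / GAMMA_AD)


def J0(c, w_dot, p, v):
    """(c)J^0 from (7.1) with Phi = 0; c = math.inf reproduces (7.2)."""
    eta_d, P_d, v_d = w_dot[0], w_dot[1], w_dot[2:]
    inv_c2 = 0.0 if math.isinf(c) else 1.0 / c**2
    R = R_inf(p) + inv_c2 * p / (GAMMA_AD - 1.0)        # (6.31) at Phi = 0
    Q = GAMMA_AD * p                                    # (6.32)
    g2 = 1.0 / (1.0 - inv_c2 * sum(x * x for x in v))   # Lorentz factor squared (assumption)
    vv = sum(a * b for a, b in zip(v, v_d))
    return (eta_d**2 + P_d**2 / Q + 2.0 * inv_c2 * g2 * vv * P_d
            + g2 * (R + inv_c2 * p) * (sum(x * x for x in v_d) + inv_c2 * g2 * vv**2))


def random_unit_vector(n):
    """Uniformly distributed point on the unit sphere in R^n."""
    g = [random.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]


# Extreme eigenvalues of the c = infinity form over the box give C(D).
lam_min = min(1.0, 1.0 / (GAMMA_AD * P_MAX), R_inf(P_MIN))
lam_max = max(1.0, 1.0 / (GAMMA_AD * P_MIN), R_inf(P_MAX))
C_D = min(lam_min, 1.0 / lam_max)
C = 0.5 * C_D   # any positive number strictly smaller than C(D)

random.seed(0)
bounds_hold = True
for c in (10.0, 100.0, math.inf):
    for _ in range(2000):
        w_dot = random_unit_vector(5)
        p = random.uniform(P_MIN, P_MAX)
        v = [random.uniform(-V_MAX, V_MAX) for _ in range(3)]
        value = J0(c, w_dot, p, v)
        bounds_hold = bounds_hold and (C <= value <= 1.0 / C)
print("C(D) =", C_D, " bounds hold on all samples:", bounds_hold)
```

Choosing `C` strictly below `C_D` leaves room for the $O(c^{-2})$ correction term, mirroring the last sentence of the proof.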
Remark 8.4. If $c = \infty$, then the coefficients of the quadratic form ${}^{(\infty)}\dot{J}^0$ are independent of $\widetilde{\Phi}$. It follows that in this case, the constant $C_{\bar{O}_2}$ from (8.16) is independent of $Z$.

$^{x}$ To be consistent with the notation used in formula (7.2), it would be “more correct” to use the symbol $\dot{\mathcal{W}}$ to denote the variations appearing as arguments in ${}^{(\infty)}\dot{J}(\cdot, \cdot)$. However, for the purposes of this proof, there is no harm in identifying $\dot{W} = \dot{\mathcal{W}}$ since in this context, these placeholder variables merely represent the arguments of ${}^{(\infty)}\dot{J}$ when viewed as a quadratic form in the variations.

9. Smoothing the Initial Data

For technical reasons, we need to smooth the initial data. Without smoothing, the terms on the right-hand sides of (10.8)–(10.10) involving the derivatives of the initial
data could be unbounded in the $H^N$ norm. To begin, we fix a Friedrichs mollifier $\chi(s)$; i.e. $\chi \in C_c^\infty(\mathbb{R}^3)$, $\mathrm{supp}(\chi) \subset \{s \mid |s| \leq 1\}$, $\chi \geq 0$, and $\int \chi\, d^3 s = 1$. For $\epsilon > 0$, we set $\chi_\epsilon(s) := \epsilon^{-3}\chi(s/\epsilon)$. We smooth the first 5 components $\mathring{W}_\infty$ of the data $\mathring{V}_\infty$ defined in (8.1) with $\chi_\epsilon$, defining $\chi_\epsilon \mathring{W}_\infty \in C^\infty$ by
\[ \chi_\epsilon \mathring{W}_\infty(s) := \int_{\mathbb{R}^3} \chi_\epsilon(s - s')\mathring{W}_\infty(s')\, d^3 s'. \tag{9.1} \]
Note that we do not smooth the data $(\mathring{\Phi}_c, \mathring{\Psi}_0) \in H^{N+2}_{\bar{\Phi}_c} \times H^{N+1}$ because by Remark 8.2 and definition (8.14), they already have sufficient regularity. The following property of such a mollification is well known:
\[ \lim_{\epsilon \to 0^+} \|\chi_\epsilon \mathring{W}_\infty - \mathring{W}_\infty\|_{H^N} = 0. \tag{9.2} \]
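We will choose below an $\epsilon_0 > 0$. Once chosen, we define
\[ {}^{(\epsilon_0)}\mathring{\mathcal{W}} := ({}^{(\epsilon_0)}\mathring{\eta}, {}^{(\epsilon_0)}\mathring{p}, {}^{(\epsilon_0)}\mathring{v}) := \chi_{\epsilon_0}\mathring{W}_\infty, \tag{9.3} \]
\[ {}^{(\epsilon_0)}\mathring{W}_c := ({}^{(\epsilon_0)}\mathring{\eta},\ e^{4\mathring{\Phi}_c/c^2} \cdot {}^{(\epsilon_0)}\mathring{p},\ {}^{(\epsilon_0)}\mathring{v}), \tag{9.4} \]
where $\mathring{\Phi}_c$ is defined in (8.14). By Sobolev embedding, the assumptions on the initial data $\mathring{W}_c$, which are the first 5 components of the data $\mathring{V}_c$ defined in (8.9), by Lemma 6.2, by (6.16), and by the mollification property (9.2), $\exists \{\Lambda_1 > 0 \wedge \epsilon_0 > 0\}$ such that for all large c,
\[ \|W - {}^{(\epsilon_0)}\mathring{W}_c\|_{H^N} \leq \Lambda_1 \Rightarrow W \in \bar{O}_2, \tag{9.5} \]
\[ \|{}^{(\epsilon_0)}\mathring{W}_c - \mathring{W}_c\|_{H^N} \leq C_{\bar{O}_2, Z} \cdot \frac{\Lambda_1}{2}, \tag{9.6} \]
where $\bar{O}_2$ is defined in Sec. 8.2, and $C_{\bar{O}_2, Z}$ is the constant from (8.16). Here, $Z$ is a fixed constant that will serve as an upper bound for $\|\Phi(t)\|_{L^\infty}$ on a certain time interval, where $\Phi$ will be a solution variable in the $\mathrm{EN}^c_\kappa$ system. We explain this fixed value of $Z$, given in expression (10.35) below, in detail in Sec. 10.3. Note that according to this reasoning, $\Lambda_1 = \Lambda_1(\bar{O}_2; Z)$.

Remark 9.1. Because $\|\mathring{V}_c\|_{L^\infty}$, $\|{}^{(\epsilon_0)}\mathring{V}_c\|_{L^\infty}$, $\|\mathring{V}_c\|_{H^N_{\bar{V}_c}}$, and $\|{}^{(\epsilon_0)}\mathring{V}_c\|_{H^{N+1}_{\bar{V}_c}}$ enter into our Sobolev estimates below, it is an important fact that these quantities are uniformly bounded for all large c. By (8.13), (8.14), definition (9.4), and Sobolev embedding, to obtain uniform bounds for ${}^{(\epsilon_0)}\mathring{V}_c$, we only need to show that $\|e^{4\mathring{\Phi}_c/c^2} \cdot {}^{(\epsilon_0)}\mathring{p}\|_{H^{N+1}_{e^{4\bar{\Phi}_c/c^2} \cdot \bar{p}}}$ is uniformly bounded for all large c. This fact follows from Lemma 6.1, Lemma 6.2, and (6.16). Such a uniform bound is used, for example, in the estimate (10.76). We can similarly obtain the uniform bounds for $\mathring{V}_c$; we use such a bound, for example, in the proof of (10.50).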
10. Uniform-in-Time Local Existence for $\mathrm{EN}^c_\kappa$

In this section we prove our first important theorem, namely that there is a uniform time interval $[0, T]$ on which solutions to the $\mathrm{EN}^c_\kappa$ system having the initial data $\mathring{V}_c$ exist, as long as c is large enough.

10.1. Local existence and uniqueness for $\mathrm{EN}^c_\kappa$ revisited

Let us first recall the following local existence result proved in [22], in which it was not yet shown that the time interval of existence can be chosen independently of all large c.

Theorem 10.1 ($\mathrm{EN}^c_\kappa$ Local Existence Revisited). Let $\mathring{V}_c(s)$ be initial data (8.8) for the $\mathrm{EN}^c_\kappa$ system (4.1)–(4.8) that are subject to the conditions described in Sec. 8. Assume that the equation of state is “physical” as described in Sec. 3.1. Then for all large (finite) c, there exists a $T_c > 0$ such that (4.1)–(4.8) has a unique classical solution $V \in C_b^2([0, T_c] \times \mathbb{R}^3)$ of the form $V = (\eta, P, v^1, v^2, v^3, \Phi, \partial_t \Phi, \partial_1 \Phi, \partial_2 \Phi, \partial_3 \Phi)$ with $V(0, s) = \mathring{V}_c(s)$. The solution satisfies $V([0, T_c] \times \mathbb{R}^3) \subset K$, where the (c-independent) compact convex set $K$ is defined in (10.29). Furthermore, $V \in \bigcap_{k=0}^{k=2} C^k([0, T_c], H^{N-k}_{\bar{V}_c})$ and $\Phi \in C_b^3([0, T_c] \times \mathbb{R}^3) \cap \bigcap_{k=0}^{k=3} C^k([0, T_c], H^{N+1-k}_{\bar{\Phi}_c})$, where the constants $\bar{V}_c$ and $\bar{\Phi}_c$ are defined by (8.11) and (8.12) respectively.

Remark 10.1. Although they are not explicitly proved in [22], the facts that $V \in C_b^2([0, T_c] \times \mathbb{R}^3)$ and that $V$ is twice differentiable in t as a map from $[0, T_c]$ to $H^{N-2}_{\bar{V}_c}$ follow from our assumption that $N \geq 4$ (i.e. for $N \geq 4$, it can be shown that $V \in C_b^{N-2}([0, T_c] \times \mathbb{R}^3) \cap \bigcap_{k=0}^{k=N-2} C^k([0, T_c], H^{N-k}_{\bar{V}_c})$). Also, by Corollary B.1, we have that $p \in \bigcap_{k=0}^{k=2} C^k([0, T_c], H^{N-k}_{\bar{p}})$, since $p = P e^{-4\Phi/c^2}$. The proof of the claim that $T_c$ can be chosen such that $V([0, T_c] \times \mathbb{R}^3) \subset K$ is based on the fact $\mathring{V}_c(\mathbb{R}^3) \Subset \mathrm{Int}(K)$ (see Sec. 8.2), together with the continuity result from the theorem and Sobolev embedding.

Remark 10.2. The case $c = \infty$ is discussed separately in Theorem 11.1.

Remark 10.3. The local existence theorem in [22] was proved using the relativistic state-space variables $U^\nu := e^{\phi} u^\nu$. However, the form of the Newtonian change of variables made in Secs. 3.1 and 3.2, together with Corollary B.1, allows us to conclude Sobolev regularity in one set of state-space variables if the same regularity is known in the other set of variables.

The following corollary, which slightly extends the lifespan of the solution and also allows us to conclude stronger regularity properties from weaker regularity assumptions, will soon be used in our proof of Proposition 10.1.

Corollary 10.1. Let $V(t, s)$ be a solution to the $\mathrm{EN}^c_\kappa$ system (4.1)–(4.8) that has the regularity properties $V \in C_b^1([0, T] \times \mathbb{R}^3) \cap L^\infty([0, T], H^N_{\bar{V}_c})$. Let $O$ be the admissible subset of truncated state-space defined in (8.15), and let $\Pi_5 : \mathbb{R}^{10} \to \mathbb{R}^5$
denote projection onto the first 5 axes. Assume that $V([0, T] \times \mathbb{R}^3) \subset K$ and that $\bar{V}_c \in \mathrm{Int}(K)$, where $K \subset \mathbb{R}^{10}$ is a compact convex set such that $\Pi_5(K) \Subset O$. Then there exists an $\epsilon > 0$ such that
\[ V \in C_b^2([0, T + \epsilon] \times \mathbb{R}^3) \cap \bigcap_{k=0}^{k=2} C^k([0, T + \epsilon], H^{N-k}_{\bar{V}_c}). \tag{10.1} \]

Proof. We apply Theorem 10.1 to conclude$^{y}$ that for each $T' \in [0, T]$, there exists an $\epsilon > 0$, depending on $T'$, and a solution $\widetilde{V}$ to the $\mathrm{EN}^c_\kappa$ system such that $\widetilde{V} \in C_b^2([T' - \epsilon, T' + \epsilon] \times \mathbb{R}^3) \cap \bigcap_{k=0}^{k=2} C^k([T' - \epsilon, T' + \epsilon], H^{N-k}_{\bar{V}_c})$ and such that $\widetilde{V}(T') = V(T')$. Furthermore, the uniqueness argument from [22], which is based on local energy estimates, can be easily modified to show that solutions to the $\mathrm{EN}^c_\kappa$ system are unique in the class $C^1([T' - \epsilon, T' + \epsilon] \times \mathbb{R}^3)$. Therefore $V \equiv \widetilde{V}$ on their common slab of spacetime existence. Corollary 10.1 thus follows.

In addition to Theorem 10.1, our proof of Theorem 10.2 also requires an additional key ingredient, namely the following continuation principle for Sobolev norm-bounded solutions:

Proposition 10.1 (Continuation Principle). Let $\mathring{V}_c(s)$ be initial data (8.8) for the $\mathrm{EN}^c_\kappa$ system (4.1)–(4.8) that are subject to the conditions described in Sec. 8, and let $T > 0$. Assume that $V \in C^1([0, T) \times \mathbb{R}^3) \cap \bigcap_{k=0}^{k=1} C^k([0, T), H^{N-k}_{\bar{V}_c})$ is the unique classical solution existing on $[0, T)$ launched by $\mathring{V}_c(s)$. Let $O$ be the admissible subset of truncated state-space defined in (8.15), and let $\Pi_5 : \mathbb{R}^{10} \to \mathbb{R}^5$ denote projection onto the first 5 axes. Assume that there are constants $M_1, M_2 > 0$, a compact set $K \subset \mathbb{R}^{10}$ with $\Pi_5(K) \Subset O$, and a set $U \Subset \mathrm{Int}(K)$ such that the following three estimates hold for any $T' \in [0, T)$:
1. $|||V|||_{H^N_{\bar{V}_c}, T'} \leq M_1$.
2. $|||\partial_t V|||_{H^{N-1}, T'} \leq M_2$.
3. $V([0, T'] \times \mathbb{R}^3) \subset U$.
Then there exists an $\epsilon > 0$ such that
\[ V \in C_b^2([0, T + \epsilon] \times \mathbb{R}^3) \cap \bigcap_{k=0}^{k=2} C^k([0, T + \epsilon], H^{N-k}_{\bar{V}_c}) \quad \text{and} \quad V([0, T + \epsilon] \times \mathbb{R}^3) \subset K. \tag{10.2} \]

Remark 10.4. Hypothesis 2. is redundant; it can be deduced from Hypothesis 1. by using the equations to solve for $\partial_t V$ and then applying (B.3).

$^{y}$ Theorem 10.1 can be easily modified to obtain a solution that exists both “forward” and “backward” in time.
Proof. We will first show that there exists a $V^* \in H^N_{\bar{V}_c}$ such that
\[ \lim_{n \to \infty} \|V(T_n) - V^*\|_{H^{N-1}} = 0 \tag{10.3} \]
holds for any sequence $\{T_n\}$ of time values converging to $T$ from below. If $\{T_n\}$ is such a sequence, then Hypothesis 2. implies that $\|V(T_j) - V(T_k)\|_{H^{N-1}} \leq M_2|T_j - T_k|$. By the completeness of $H^{N-1}$, there exists a $V^* \in H^{N-1}_{\bar{V}_c}$ such that (10.3) holds, and it is easy to check that $V^*$ does not depend on the sequence $\{T_n\}$. By Hypothesis 1. we also have that $\{V(T_n)\}$ converges weakly in $H^N_{\bar{V}_c}$ to $V^*$ as $n \to \infty$ and that $\|V^*\|_{H^N_{\bar{V}_c}} \leq M_1$. We now fix a number $N'$ with $5/2 < N' < N$. By Proposition B.4, we have that $\lim_{n \to \infty} \|V(T_n) - V^*\|_{H^{N'}} = 0$. Consequently, if we define $V(T) := V^*$, it follows that $V \in C^0([0, T], H^{N'}_{\bar{V}_c}) \cap L^\infty([0, T], H^N_{\bar{V}_c})$. Using the fact that $N' > 5/2$, together with the embedding of $H^{N'}(\mathbb{R}^3)$ into appropriate Hölder spaces, it can be shown that $V \in C^0([0, T], H^{N'}_{\bar{V}_c}) \Rightarrow V, \partial V \in C_b^0([0, T] \times \mathbb{R}^3)$; i.e. we can continuously extend $V, \partial V$ to the slab $[0, T] \times \mathbb{R}^3$.

To conclude that $V \in C_b^1([0, T] \times \mathbb{R}^3)$, we will show that $\partial_t V$ extends continuously to $[0, T] \times \mathbb{R}^3$. To this end, we use the $\mathrm{EN}^c_\kappa$ equations to solve for $\partial_t V$:
\[ \partial_t V = F(V, \partial V), \tag{10.4} \]
where $F \in C^N$. Since $V, \partial V \in C_b^0([0, T] \times \mathbb{R}^3)$, the right-hand side of (10.4) has been shown to extend continuously so that it is an element of $C_b^0([0, T] \times \mathbb{R}^3)$. Furthermore, since $V \in C^1([0, T) \times \mathbb{R}^3)$ by assumption, it follows from the previous conclusions and elementary analysis that $\partial_t V$ exists classically on $[0, T] \times \mathbb{R}^3$ and that $\partial_t V \in C_b^0([0, T] \times \mathbb{R}^3)$, thus implying that $V \in C_b^1([0, T] \times \mathbb{R}^3)$. The additional conclusions in (10.2) now follow from Corollary 10.1 and continuity.

Remark 10.5. Proposition 10.1 shows that if the solution $V$ blows up at time $T$, then either $\lim_{T' \uparrow T} |||V|||_{H^N_{\bar{V}_c}, T'} = \infty$, $\lim_{T' \uparrow T} |||\partial_t V|||_{H^{N-1}, T'} = \infty$, or $V(T', \mathbb{R}^3)$ escapes$^{z}$ every compact subset of $O \times \mathbb{R}^5$ as $T' \uparrow T$, where $O$ is defined in (8.15).

Remark 10.6. Although the main theorems in this article require that $N \geq 4$, Corollary 10.1 and Proposition 10.1 are also valid for $N = 3$, except that the conclusion $V \in C_b^2([0, T + \epsilon] \times \mathbb{R}^3)$ must be replaced with $V \in C_b^1([0, T + \epsilon] \times \mathbb{R}^3)$, and the conclusion $V \in C^2([0, T + \epsilon], H^{N-2}_{\bar{V}_c})$ does not hold.

$^{z}$ We are assuming here that on the set $\{(\eta, p) \mid \eta > 0, p > 0\}$, the function $R_c$ is “physical” as described in Sec. 6.3 and is sufficiently regular.

10.2. The uniform-in-time local existence theorem

We now state and prove the uniform time of existence theorem.

Theorem 10.2 (Uniform Time of Existence). Let $\mathring{V}_\infty$ denote initial data (8.1) for the $\mathrm{EP}_\kappa$ system (4.9)–(4.14) that are subject to the conditions described
in Sec. 8. Let $\mathring{V}_c$ denote the corresponding initial data (8.9) for the $\mathrm{EN}^c_\kappa$ system (4.1)–(4.8) constructed from $\mathring{V}_\infty$ as described in Sec. 8, and let ${}^{(\epsilon_0)}\mathring{W}_c$ denote the smoothing (9.4) of the first 5 components of $\mathring{V}_c$ as described in Sec. 9. Assume that the c-indexed equation of state satisfies the hypotheses (6.12) and (6.13) and is “physical” as described in Secs. 3.1 and 6.3, and let $K$ be the fixed compact subset of $\mathbb{R}^{10}$ defined in (10.29). Then there exist $c_0 > 0$ and $T > 0$, with $T$ not depending on c, such that for $c \geq c_0$, $\mathring{V}_c$ launches a unique classical solution $V$ to (4.1)–(4.8) that exists on the slab $[0, T] \times \mathbb{R}^3$ and that has the properties $V(0, s) = \mathring{V}_c(s)$ and $V([0, T] \times \mathbb{R}^3) \subset K$. The solution is of the form $V = (\eta, P, v^1, v^2, v^3, \Phi, \partial_t \Phi, \partial_1 \Phi, \partial_2 \Phi, \partial_3 \Phi)$ and has the regularity properties $V \in C_b^2([0, T] \times \mathbb{R}^3) \cap \bigcap_{k=0}^{k=2} C^k([0, T], H^{N-k}_{\bar{V}_c})$ and $\Phi \in C_b^3([0, T] \times \mathbb{R}^3) \cap \bigcap_{k=0}^{k=3} C^k([0, T], H^{N+1-k}_{\bar{\Phi}_c})$, where the constants $\bar{V}_c$ and $\bar{\Phi}_c$ are defined by (8.11) and (8.12) respectively. Furthermore, with $p := P e^{-4\Phi/c^2}$, there exist constants $\Lambda_1, \Lambda_2, L_1, L_2, L_3, L_4 > 0$ such that
\[ |||W - {}^{(\epsilon_0)}\mathring{W}_c|||_{H^N, T} \leq \Lambda_1, \tag{10.5a} \]
\[ |||\Phi - \mathring{\Phi}_c|||_{H^{N+1}, T} \leq \Lambda_2, \tag{10.5b} \]
\[ |||\partial_t W|||_{H^{N-1}, T} \leq L_1, \tag{10.5c} \]
\[ |||\partial_t \Phi|||_{H^N, T} \leq L_2, \tag{10.5d} \]
\[ |||\partial_t^2 \eta|||_{H^{N-2}, T},\ |||\partial_t^2 p|||_{H^{N-2}, T} \leq L_3, \tag{10.5e} \]
\[ c^{-1}|||\partial_t^2 \Phi|||_{H^{N-1}, T} \leq L_4. \tag{10.5f} \]
10.2.1. Outline of the structure of the proof of Theorem 10.2

We prove Theorem 10.2 via the method of continuous induction (“bootstrapping”). After defining the constants $\Lambda_1$, $\Lambda_2$, $L_2'$, and $L_4$, we make the assumptions (10.31)–(10.34). These assumptions hold at $\tau = 0$ and therefore, by Theorem 10.1, there exists a maximal interval $\tau \in [0, T_c^{\max})$ on which the solution exists and on which the assumptions hold. Based on these estimates, we use a collection of technical lemmas derived from energy estimates to conclude that the bounds (10.21)–(10.27) hold for $\tau \in [0, T_c^{\max})$. It is important that the constants appearing on the right-hand sides of (10.21)–(10.27) do not depend on c, if c is large enough. We can therefore apply Proposition 10.1 to conclude that for all large c, the solution can be extended to a uniform interval $[0, T]$. The closing of the induction argument is largely due to the fact that the source term for the Klein–Gordon equation satisfied by $\Phi$, which is the right-hand side of (4.4), “depends on $\Phi$ only through $c^{-2}\Phi$.”

10.2.2. Proof of Theorem 10.2

To begin, we remark that for the remainder of this article, we indicate dependence of the running constants on $\|\mathring{W}_c\|_{H^N_{\bar{W}_c}}$, $\|{}^{(\epsilon_0)}\mathring{W}_c\|_{H^{N+1}_{\bar{W}_c}}$, $\|\mathring{\Phi}_c\|_{H^{N+1}_{\bar{\Phi}_c}}$, and $\|\mathring{\Psi}_0\|_{H^N}$ by
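writing $C(\mathrm{id})$. By Remark 9.1, any constant $C(\mathrm{id})$ can be chosen to be independent of all large c.

We now introduce some notation that will be used throughout the proof, and also in the following section, where we have placed the proofs of the technical lemmas. Let $V$ denote the local in time solution to the $\mathrm{EN}^c_\kappa$ system (4.1)–(4.8) launched by the initial data $\mathring{V}_c$ as furnished by Theorem 10.1. With $W$ denoting the first 5 components of $V$, we suggestively define
\[ \dot{W}(t, s) := W(t, s) - {}^{(\epsilon_0)}\mathring{W}_c(s), \tag{10.6} \]
\[ \dot{\Phi} := \Phi - \mathring{\Phi}_c, \tag{10.7} \]
where $\mathring{\Phi}_c$ is defined in (8.14) and ${}^{(\epsilon_0)}\mathring{W}_c(s)$ is defined in (9.4) with the help of (10.35). We remark that this choice of ${}^{(\epsilon_0)}\mathring{W}_c(s)$ is explained in more detail below.

It follows from the fact that $W$ is a solution to (4.1)–(4.3) that $\dot{W}$ is a solution to the $\mathrm{EOV}^c_\kappa$ (5.1)–(5.3) defined by the BGS $V$ with initial data $\dot{W}(0, s) = \mathring{W}_c(s) - {}^{(\epsilon_0)}\mathring{W}_c(s)$. The inhomogeneous terms in the $\mathrm{EOV}^c_\kappa$ satisfied by $\dot{W}$ are given by $b = (f, g, \ldots, h^{(3)})$, where for $j = 1, 2, 3$
\[ f = -v^k \partial_k[{}^{(\epsilon_0)}\mathring{\eta}], \tag{10.8} \]
\[
\begin{aligned}
g ={}& (4P - 3\mathfrak{Q}_c)[\partial_t(c^{-2}\Phi) + v^k \partial_k(c^{-2}\Phi)] - v^k \partial_k[e^{4\mathring{\Phi}_c/c^2} \cdot {}^{(\epsilon_0)}\mathring{p}] \\
&- \mathfrak{Q}_c \partial_k[{}^{(\epsilon_0)}\mathring{v}^k] - c^{-2}(\gamma_c)^2 \mathfrak{Q}_c v_k v^a \partial_a[{}^{(\epsilon_0)}\mathring{v}^k],
\end{aligned}
\tag{10.9}
\]
\[
\begin{aligned}
h^{(j)} ={}& (3c^{-2}P - \mathfrak{R}_c)\big(\partial_j \Phi + (\gamma_c)^{-2} v^j[\partial_t(c^{-2}\Phi) + v^k \partial_k(c^{-2}\Phi)]\big) \\
&- (\gamma_c)^2(\mathfrak{R}_c + c^{-2}P)\big(v^k \partial_k[{}^{(\epsilon_0)}\mathring{v}^j] + c^{-2}(\gamma_c)^2 v^j v_k v^a \partial_a[{}^{(\epsilon_0)}\mathring{v}^k]\big) \\
&- \partial_j[e^{4\mathring{\Phi}_c/c^2} \cdot {}^{(\epsilon_0)}\mathring{p}] - c^{-2}(\gamma_c)^2 v^j v^k \partial_k[e^{4\mathring{\Phi}_c/c^2} \cdot {}^{(\epsilon_0)}\mathring{p}].
\end{aligned}
\tag{10.10}
\]

In order to show that the hypotheses of Proposition 10.1 are satisfied, we will need to estimate $\partial_{\vec{\alpha}}\dot{W}$ in $L^2$. Therefore, we study the equation that $\partial_{\vec{\alpha}}\dot{W}$ satisfies: for $0 \leq |\vec{\alpha}| \leq N$, we differentiate the $\mathrm{EOV}^c_\kappa$ defined by the BGS $V$ with inhomogeneous terms $b$ to which $\dot{W}$ is a solution, obtaining that $\partial_{\vec{\alpha}}\dot{W}$ satisfies
\[ {}_c A^\mu(W, \Phi)\partial_\mu(\partial_{\vec{\alpha}}\dot{W}) = b_{\vec{\alpha}}, \tag{10.11} \]
where (suppressing the dependence of the ${}_c A^\nu(\cdot)$ on $W$ and $\Phi$)
\[ b_{\vec{\alpha}} := {}_c A^0 \partial_{\vec{\alpha}}(({}_c A^0)^{-1} b) + k_{\vec{\alpha}} \tag{10.12} \]
and
\[ k_{\vec{\alpha}} := {}_c A^0[({}_c A^0)^{-1} {}_c A^k \partial_k(\partial_{\vec{\alpha}}\dot{W}) - \partial_{\vec{\alpha}}(({}_c A^0)^{-1} {}_c A^k \partial_k \dot{W})]. \tag{10.13} \]
Thus, each $\partial_{\vec{\alpha}}\dot{W}$ is a solution to the $\mathrm{EOV}^c_\kappa$ defined by the same BGS $V$ with inhomogeneous terms $b_{\vec{\alpha}}$. Furthermore, $\dot{\Phi}$ is a solution to the $\mathrm{EOV}^c_\kappa$ equation (5.4) with $\dot{\Phi}(0, s) = 0$, and the inhomogeneous term $l$ on the right-hand side of (5.4) is
\[ l := (\kappa^2 - \Delta)\mathring{\Phi}_c + 4\pi G(\mathfrak{R}_c - 3c^{-2}P). \tag{10.14} \]
We will return to these facts in Sec. 10.3, where we will use them in the proofs of some technical lemmas.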
August 12, 2009 3:58 WSPC/148-RMP
J070-00374
The Non-Relativistic Limit of the Euler–Nordström System
857
As an intermediate step in our proof of (10.5a)–(10.5f), we will prove the following weaker version of (10.5d):
\[ c^{-1}\,|||\partial_t\Phi|||_{H^N,T} \lesssim L_2. \tag{10.5d$'$} \]
We now define the constants $\Lambda_1$, $\Lambda_2$, $L_2$, and $L_4$. We will then use a variety of energy estimates to define $L_1$, $L_2$, and $L_3$ in terms of these four constants and to show that (10.5a)–(10.5f) are satisfied if $T$ is small enough. First, to motivate our definitions of $L_2$, $L_4$, and $\Lambda_2$, see inequalities (A.4) and (A.6) of Proposition A.2 and inequality (A.19) of Corollary A.1, and let $C_0(\kappa)$ denote the constant that appears throughout the lemma and its corollary. By a non-optimal application of Lemma 10.3, we have that
\[ C_0(\kappa)\bigl(c^{-1}\|\mathring{\Psi}_0\|_{H^N} + \|l(0)\|_{H^{N-1}}\bigr) \lesssim 1/2 \overset{\mathrm{def}}{=} L_2/2, \tag{10.15} \]
\[ C_0(\kappa)\bigl(c\,\|l(0)\|_{H^{N-1}} + \|(\Delta - \kappa^2)\mathring{\Psi}_0 - \partial_t l(0)\|_{H^{N-2}}\bigr) \lesssim 1 \overset{\mathrm{def}}{=} L_4. \tag{10.16} \]
Note also the trivial (and not optimal) estimate $(C_0(\kappa))^2 c^{-2}\|\mathring{\Psi}_0\|^2_{H^N} \lesssim 1/4 \overset{\mathrm{def}}{=} (\Lambda_2)^2/4$. With these considerations in mind, we have thus defined
\[ \Lambda_2 \overset{\mathrm{def}}{=} 1, \tag{10.17} \]
\[ L_2 \overset{\mathrm{def}}{=} 1, \tag{10.18} \]
\[ L_4 \overset{\mathrm{def}}{=} 1. \tag{10.19} \]
To define $\Lambda_1$, we first define $Z = Z(\mathrm{id}; \Lambda_2)$ to be the constant appearing in (10.35). Using this value of $Z$, which we emphasize depends only on $\Lambda_2$ and the initial data $\mathring{W}_\infty$ for the $\mathrm{EP}_\kappa$ system, we then choose $\Lambda_1$ so that (9.5) and (9.6) hold. Note that it is exactly at this step in the proof that the smoothing ${}^{(0)}\mathring{W}_c$, which is defined in (9.4), of the initial data $\mathring{W}_c$, which are the first 5 components of (8.9), is fixed.

We find it illuminating to display the dependence of other constants that will appear below on $\Lambda_1$, $\Lambda_2$, $L_2$, $L_4$. Therefore, we continue to refer to (10.17)–(10.19) by the symbols $\Lambda_2$, $L_2$, and $L_4$ respectively, even though they are equal to 1.

We now carry out the continuous induction in detail. Let $T_c^{\max}$ be the maximal time for which the solution $V$ exists and satisfies the estimates (10.5a), (10.5b), (10.5d$'$), and (10.5f); i.e.
\[ T_c^{\max} \overset{\mathrm{def}}{=} \sup\Bigl\{ T \Bigm| V \in \bigcap_{k=0}^{k=2} C^k\bigl([0,T], H^{N-k}_{\bar{V}_c}\bigr),\ \text{and (10.5a), (10.5b), (10.5d$'$), and (10.5f) hold} \Bigr\}. \tag{10.20} \]
Note that the set we are taking the sup of necessarily contains positive values of T since for all large c, the relevant bounds are satisfied at T = 0, and therefore by Theorem 10.1, also for short times. Lemmas 10.10, 10.2, 10.4, 10.8, 10.6, and inequalities (10.61) and (10.60) of Lemma 10.7 supply the following estimates which
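are valid for $0 \le \tau < T_c^{\max}$:
\[ |||\dot{W}|||_{H^N,\tau} \lesssim \bigl[\Lambda_1/2 + \tau\cdot C(\Lambda_1,\Lambda_2,L_1,L_2)\bigr]\cdot\exp\bigl(\tau\cdot C(\Lambda_1,\Lambda_2,L_1,L_2)\bigr), \tag{10.21} \]
\[ |||\partial_t W|||_{H^{N-1},\tau} \lesssim L_1(\Lambda_1,\Lambda_2,L_2), \tag{10.22} \]
\[ |||\partial_t^2\eta|||_{H^{N-2},\tau},\ |||\partial_t^2 p|||_{H^{N-2},\tau} \lesssim L_3(\Lambda_1,\Lambda_2,L_1,L_2,L_4), \tag{10.23} \]
\[ |||\dot{\Phi}|||^2_{H^{N+1},\tau} \lesssim \frac{(\Lambda_2)^2}{4} + \tau\cdot C(\Lambda_1,\Lambda_2,L_2) + \tau^2\cdot C(\Lambda_1,\Lambda_2,L_1,L_2,L_3,L_4), \tag{10.24} \]
\[ c^{-1}\,|||\partial_t\Phi|||_{H^N,\tau} \lesssim L_2/2 + \tau\cdot C(\Lambda_1,\Lambda_2,L_1,L_2), \tag{10.25} \]
\[ c^{-1}\,|||\partial_t^2\Phi|||_{H^{N-1},\tau} \lesssim L_4/2 + \tau\cdot C(\Lambda_1,\Lambda_2,L_1,L_2,L_3,L_4), \tag{10.26} \]
\[ |||\partial_t\Phi|||_{H^N,\tau} \lesssim L_2(\Lambda_1,\Lambda_2,L_1,L_2)/2 + \tau\cdot C(\Lambda_1,\Lambda_2,L_1,L_2,L_3,L_4). \tag{10.27} \]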
We apply the following sequence of reasoning to interpret the above inequalities: first $L_1$ in (10.22) is determined through the known constants $\Lambda_1$, $\Lambda_2$, and $L_2$. Then $L_3$ in (10.23) is determined through the known constants $\Lambda_1$, $\Lambda_2$, $L_1$, $L_2$, and $L_4$. Then $L_2$ in (10.27) is determined through $\Lambda_1$, $\Lambda_2$, $L_1$, and $L_2$. Finally, the remaining constants $C(\cdots)$ in (10.21)–(10.26) are all determined through $\Lambda_1$, $\Lambda_2$, $L_1$, $L_2$, $L_3$, $L_4$.

By Sobolev embedding and (8.13), there exists a cube $[-a,a]^5$ (depending on the initial data, $\Lambda_1$, and $L_2$) such that for all large $c$, the assumptions $\|\dot{\Phi}\|_{H^N} \le \Lambda_1$ and $\|\partial_t\Phi\|_{H^N} \le L_2$ together imply that
\[ (\dot{\Phi}, \partial_1\dot{\Phi}, \partial_2\dot{\Phi}, \partial_3\dot{\Phi}, \partial_t\Phi)\bigl([0,T]\times\mathbb{R}^3\bigr) \subset [-a,a]^5. \tag{10.28} \]
Motivated by these considerations, we define both for use now and use later in the article the following compact sets: def ¯ 5 K = O 2 × [−a, a] def
¯ 2 × [−a, a]5 . K = O
(10.29) (10.30)
Here, O2 and O2 are the sets defined in Sec. 8.2. We now choose T so that when 0 ≤ τ ≤ T, it algebraically follows that the righthand sides of (10.21) and (10.24)–(10.27) are strictly less than Λ1 , (Λ2 )2 , L2 , L4 , and L2 respectively. Note that T may be chosen independently of (all large) c. We now show that Tcmax ≤ T is impossible. Assume that Tcmax ≤ T. Then observe that the right-hand sides of (10.21) and (10.24)–(10.27) are strictly less than Λ1 , (Λ2 )2 , L2 , L4 , and L2 , respectively, when τ = Tcmax . Therefore, by the construction of the set K described above, by (9.5), and by Sobolev embedding, we conclude that for all large c, V([0, Tcmax ) × R3 ) is contained in the interior of K. Consequently, we may apply Proposition 10.1 to extend the solution in time beyond Tcmax, thus contradicting the definition of Tcmax .
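To make the choice of $T$ concrete, the following is a rough sketch of one sufficient (and far from optimal) smallness condition. It assumes that a single constant $C > 0$ (a hypothetical name, not used in the text) dominates every constant $C(\cdots)$ appearing in (10.21) and (10.24)–(10.26), that the implicit "for all large $c$" in each estimate has been absorbed, and that $0 \le \tau \le 1$.

```latex
% Sketch under the stated assumptions: one C > 0 bounds all C(...),
% and 0 <= tau <= 1 so that tau^2 <= tau.
\begin{align*}
% (10.25) and (10.26): linear in tau
\frac{L_2}{2} + \tau C < L_2
  &\iff \tau < \frac{L_2}{2C}, \\
\frac{L_4}{2} + \tau C < L_4
  &\iff \tau < \frac{L_4}{2C}, \\
% (10.24): quadratic in tau, reduced to linear using tau^2 <= tau
\frac{(\Lambda_2)^2}{4} + \tau C + \tau^2 C
  \le \frac{(\Lambda_2)^2}{4} + 2\tau C < (\Lambda_2)^2
  &\Longleftarrow \tau < \frac{3(\Lambda_2)^2}{8C}, \\
% (10.21): Gronwall-type bound with exponential factor
\Bigl[\frac{\Lambda_1}{2} + \tau C\Bigr] e^{\tau C}
  < \frac{3\Lambda_1}{4}\cdot\frac{4}{3} = \Lambda_1
  &\Longleftarrow \tau C < \min\Bigl\{\frac{\Lambda_1}{4},\ \ln\frac{4}{3}\Bigr\}.
\end{align*}
```

The estimate (10.27) can be treated in the same way as (10.25). Since none of these thresholds involves $c$, any $T$ below their minimum works uniformly for all large $c$, which is the point of the remark that $T$ may be chosen independently of $c$.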
Note that this argument also shows that $V([0,T]\times\mathbb{R}^3) \subset K$. This completes the proof of Theorem 10.2.

10.3. The technical lemmas

We now state and prove the technical lemmas quoted in the proof of Theorem 10.2. We will require some auxiliary lemmas along the way. Throughout this section, we assume the hypotheses of Theorem 10.2 and we use the notation from Sec. 10.2.2; i.e. $V$ denotes the solution, $W$ denotes its first 5 components, the relationship between the two state arrays denoted $W$ (one with second component $P$, the other with second component $p$) is given by (5.5) and (5.7), $\dot{W}$ and $\dot{\Phi}$ are defined in (10.6) and (10.7) respectively, $l$ is defined in (10.14), and so forth. All of the estimates in this section hold on the time interval $\tau \in [0, T_c^{\max})$, where $T_c^{\max}$ is defined in (10.20).

10.3.1. The induction hypotheses

By the definition of $T_c^{\max}$, we have the following bounds, where $\Lambda_2$, $L_2$, and $L_4$ are defined in (10.17)–(10.19) respectively, and we will soon elaborate on our choice of $\Lambda_1$:
\[ |||W - {}^{(0)}\mathring{W}_c|||_{H^N,\tau} \lesssim \Lambda_1, \tag{10.31} \]
\[ |||\Phi - \mathring{\Phi}_c|||_{H^{N+1},\tau} \lesssim \Lambda_2, \tag{10.32} \]
\[ c^{-1}\,|||\partial_t\Phi|||_{H^N,\tau} \lesssim L_2, \tag{10.33} \]
\[ c^{-1}\,|||\partial_t^2\Phi|||_{H^{N-2},\tau} \lesssim L_4. \tag{10.34} \]
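We note the following easy consequence of (8.14) and (10.32):
\[ |||\Phi - \bar{\Phi}_c|||_{H^N,\tau} \le |||\Phi - \mathring{\Phi}_c|||_{H^N,\tau} + |||\mathring{\Phi}_c - \bar{\Phi}_c|||_{H^N,\tau} \lesssim \Lambda_2 + C(\mathrm{id}) \overset{\mathrm{def}}{=} C(\mathrm{id};\Lambda_2). \tag{10.32$'$} \]
It then follows from (8.13), (10.32$'$), and Sobolev embedding that
\[ |||\Phi|||_{L^\infty,\tau} \lesssim Z(\mathrm{id};\Lambda_2). \tag{10.35} \]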
Let us recall how Λ1 was chosen: using the value of Z in (10.35), which depends ˚ ∞ for the EPκ system and the known constant Λ2 , we have only on the data W chosen a constant Λ1 > 0 such that (9.5) and (9.6) hold. As discussed in Secs. 9 ˚ of W ˚ ∞, and 10.2.2, such a choice of Λ1 also involves fixing the smoothing (0) W (0) ˚ which then defines Wc via equation (9.4). We emphasize that it is this choice of (0) ˚ Wc and Λ1 that appear in (10.21), (10.5a), and (10.31). By (9.6) and (10.31), we also have that ¯ c ||| N ≤ |||W − (0) W ˚ c ||| N + |||(0) W ˚c−W ˚ c ||| N |||W − W H ,τ
H
,τ
H
,τ
˚c−W ¯ c ||| N + |||W H ,τ def
Λ1 + C(id; Λ1 ) + C(id) = C(id; Λ1 ).
(10.31)
Furthermore, by Lemma 6.1, (6.20) with m = 0, and (10.31 ), we have that ¯ c ||| N C(id; Λ1 , Λ2 ). |||W − W (10.36) H ,τ
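We also observe that (9.5), (10.31), and the definition of $\bar{O}_2$ given in Sec. 8.2 together imply that for all large $c$, we have that $W([0,T_c^{\max})\times\mathbb{R}^3) \subset \bar{O}_2$ for each of the two state arrays denoted $W$ (with second component $P$ and $p$ respectively) and the corresponding one of the two sets $\bar{O}_2$ defined in Sec. 8.2.

In our discussion below, we will refer to (10.31)–(10.36), (10.31$'$), and (10.32$'$) as the induction hypotheses. Sobolev embedding and the induction hypotheses, which for all large $c$ are satisfied at $\tau = 0$, together imply that $W$, $\partial W$ (for both arrays), $\Phi$, $\partial\Phi$, $c^{-1}\partial_t\Phi$, $c^{-1}\partial_t^2\Phi$ are each contained in a compact convex set (depending only on the initial data, $\Lambda_1$, $\Lambda_2$, $L_2$, and $L_4$) on $[0,T_c^{\max})\times\mathbb{R}^3$. As stated in Remark 6.4, we will make use of this fact without explicitly mentioning it every time.

10.3.2. Proofs of the technical lemmas

Lemma 10.1. Consider the quantity $l$ defined in (10.14). Then for $m = 0, 1, 2$, we have that
\[ (4\pi G)^{-1}\, l = R_\infty(\eta,p) - R_\infty(\mathring{\eta},\mathring{p}) + F_c, \tag{10.37} \]
\[ (4\pi G)^{-1}\,\partial_t l = \partial_t\bigl(R_\infty(\eta,p)\bigr) + G_c, \tag{10.38} \]
\[ (4\pi G)^{-1}\,\partial_t^2 l = \partial_t^2\bigl(R_\infty(\eta,p)\bigr) + H_c, \tag{10.39} \]
where
\[ F_c \in \mathcal{I}^N(c^{m-2};\ \eta, p, c^{-m}\Phi), \tag{10.40} \]
\[ G_c \in \mathcal{I}^{N-1}(c^{m-2};\ \eta, p, c^{-m}\Phi, \partial_t\eta, \partial_t p, c^{-m}\partial_t\Phi), \tag{10.41} \]
\[ H_c \in \mathcal{I}^{N-2}(c^{m-2};\ \eta, p, c^{-m}\Phi, \partial_t\eta, \partial_t p, c^{-m}\partial_t\Phi, \partial_t^2\eta, \partial_t^2 p, c^{-m}\partial_t^2\Phi). \tag{10.42} \]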
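Proof. It follows from the discussion in Sec. 8 that
\[ (4\pi G)^{-1}\, l = \bigl(e^{4\Phi/c^2} R_c(\eta,p) - e^{4\bar{\Phi}_c/c^2} R_c(\bar{\eta},\bar{p})\bigr) + 3c^{-2}\bigl(e^{4\bar{\Phi}_c/c^2}\,\bar{p} - e^{4\Phi/c^2}\,p\bigr) + R_\infty(\bar{\eta},\bar{p}) - R_\infty(\mathring{\eta},\mathring{p}). \tag{10.43} \]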
Therefore, (10.37) + (10.40) follows from Lemma 6.1, Lemma 6.2, and Lemma 6.4. (10.38) + (10.41) and (10.39) + (10.42) then follow from Lemma 6.3. Lemma 10.2. |||∂t W|||H N −1 ,τ , |||∂t W|||H N −1 ,τ C(id; Λ1 , Λ2 , L2 ) = L1 (id; Λ1 , Λ2 , L2 ). def
(10.44)
Proof. By using the ENcκ equations (4.1)–(4.3) to solve for ∂t W and applying Lemma 6.2, (6.21) in the cases ν = 1, 2, 3, Lemma 6.5, Lemma 6.6, Lemma 6.7, and Remark 6.12, we have that ∂t W = (c A0 (W, Φ))−1 [−c Ak (W, Φ)∂k W + Bc (W, Φ, DΦ)] = (∞ A0 (W))−1 [−∞ Ak (W)∂k W + B∞ (W, ∂Φ)] + ON −1 (W, ∂W, c−1 Φ, c−1 DΦ) ∩ ON −1 (c−2 ; W, ∂W, Φ, DΦ).
(10.45)
The bound for |||∂t W|||H N −1 ,τ now follows from Lemma 6.2, (6.24), (6.28), the induction hypotheses, (10.45), and the definition (6.6) of ON −1 (W, ∂W, c−1 Φ, c−1 DΦ). The bound for |||∂t W|||H N −1 ,τ then follows from the bound for |||∂t W|||H N −1 ,τ , (6.21) in the case ν = t, m = 1, and the induction hypotheses. We remark that we have written the “intersection term” on the right-hand side of (10.45) in a form that will be useful in our proofs of Lemma 10.3, and Lemma 10.4; the “c−2 decay” is used in Lemma 10.3 and Corollary 11.1, while the “dependence on c−1 DΦ” is used in Lemma 10.4. Similar comments apply to Corollary 10.2 and equation (10.48) below. The following indispensable corollary shows that for large c, the ENcκ system can be written as a small perturbation of the EPκ system. See also Corollary 11.1. Corollary 10.2 (ENcκ ≈ EPκ for Large c). ∂t W = (∞ A0 (W))−1 [−∞ Ak (W)∂k W + B∞ (W, ∂Φ)] + ON −1 (W, ∂W, c−1 Φ, c−1 DΦ) ∩ ON −1 (c−2 ; W, ∂W, Φ, DΦ).
(10.46)
Proof. Recall that ∂t W and ∂t W differ only in that the second component of ∂t W is ∂t P, while the second component of ∂t W is ∂t p. Therefore, it follows trivially from (10.45) that (10.46) holds for all the components of ∂t W except for the second component ∂t p. To handle the component ∂t p, we first observe that the second component of the array −(∞ A0(W))−1 [−∞ Ak (W)∂k W + B∞ (W, ∂Φ)] is equal to −v k ∂k p − Q∞ (η, p)∂k v k . It thus follows directly from considering the second component of (10.45) that ∂t P = −v k ∂k p − Q∞ (η, p)∂k v k + ON −1 (W, ∂W, c−1 Φ, c−1 DΦ) ∩ ON −1 (c−2 ; W, ∂W, Φ, DΦ). 2
(10.47)
2
Therefore, since ∂t p − ∂t P = (e−4Φ/c − 1)∂t P − 4(c−2 ∂t Φ)e−4Φ/c P, we use Lemma 6.2, (6.16), (6.21), (6.22), Lemma 6.5, and (10.47) to conclude that ∂t p = −v k ∂k p − Q∞ (η, p)∂k v k + ON −1 (W, ∂W, c−1 Φ, c−1 DΦ) ∩ ON −1 (c−2 ; W, ∂W, Φ, DΦ).
(10.48)
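Lemma 10.3. There exists a constant $C(\mathrm{id}) > 0$ such that
\[ \|l(0)\|_{H^N} \lesssim c^{-2}\, C(\mathrm{id}), \tag{10.49} \]
\[ \|(\Delta - \kappa^2)\mathring{\Psi}_0 - \partial_t l(0)\|_{H^{N-1}} \lesssim c^{-2}\, C(\mathrm{id}). \tag{10.50} \]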
Proof. The estimate (10.49) follows from the estimate (10.37) for l(t) at t = 0 and (10.40) in the case m = 0.
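To obtain the estimate (10.50), first recall that by assumption (8.7) and the chain rule, we have that
\[ (4\pi G)^{-1}(\kappa^2 - \Delta)\mathring{\Psi}_0 = \partial_k\bigl(R_\infty(\mathring{\eta},\mathring{p})\,\mathring{v}^k\bigr) = \frac{\partial R_\infty}{\partial\eta}(\mathring{\eta},\mathring{p})\,\mathring{v}^k\partial_k\mathring{\eta} + \frac{\partial R_\infty}{\partial p}(\mathring{\eta},\mathring{p})\,\mathring{v}^k\partial_k\mathring{p} + R_\infty(\mathring{\eta},\mathring{p})\,\partial_k\mathring{v}^k. \tag{10.51} \]
Furthermore, by Lemma 6.2, (10.38) at $t = 0$, (10.41) in the case $m = 0$, the chain rule, (4.1), (10.48), and (3.18) + (3.41) in the case $c = \infty$, we have that
\[ (4\pi G)^{-1}\,\partial_t l(0) = -\frac{\partial R_\infty}{\partial\eta}(\mathring{\eta},\mathring{p})\,\mathring{v}^k\partial_k\mathring{\eta} - \frac{\partial R_\infty}{\partial p}(\mathring{\eta},\mathring{p})\,\mathring{v}^k\partial_k\mathring{p} - R_\infty(\mathring{\eta},\mathring{p})\,\partial_k\mathring{v}^k + O^{N-1}(c^{-2};\ \mathrm{id}). \tag{10.52} \]
The estimate (10.50) now follows from (10.51) and (10.52).

Lemma 10.4.
\[ |||\partial_t^2\eta|||_{H^{N-2},\tau},\ |||\partial_t^2 p|||_{H^{N-2},\tau} \lesssim C(\mathrm{id};\Lambda_1,\Lambda_2,L_1,L_2,L_4) \overset{\mathrm{def}}{=} L_3(\mathrm{id};\Lambda_1,\Lambda_2,L_1,L_2,L_4). \tag{10.53} \]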
Proof. To obtain the bound for ∂t2 p, differentiate each side of the expression (10.48) with respect to t, and then apply Lemma 6.3 to conclude that ∂t2 p = −∂t [v k ∂k p + Q∞ (η, p)∂k v k ] + Fc ,
(10.54)
where $F_c \in \mathcal{I}^{N-2}(W, DW, \partial\partial_t W, c^{-1}\Phi, c^{-1}D\Phi, c^{-1}\partial\partial_t\Phi, c^{-1}\partial_t^2\Phi)$. We now use Lemma 6.2, the induction hypotheses, the previously established bounds (10.44) on $|||\partial_t W|||_{H^{N-1},\tau}$ (for both arrays denoted $W$), and the definition of $\mathcal{I}^{N-2}(\cdots)$ to conclude the estimate (10.53) for $|||\partial_t^2 p|||_{H^{N-2},\tau}$. The estimate for $\partial_t^2\eta$ is similar, and in fact much simpler: use equation (4.1) to solve for $\partial_t\eta$, and then differentiate with respect to $t$ and reason as above.

Lemma 10.5.
\[ |||l|||_{H^N,\tau} \lesssim C(\mathrm{id};\Lambda_1,\Lambda_2), \tag{10.55} \]
\[ |||\partial_t l|||_{H^{N-1},\tau} \lesssim C(\mathrm{id};\Lambda_1,\Lambda_2,L_1,L_2), \tag{10.56} \]
\[ |||\partial_t^2 l|||_{H^{N-2},\tau} \lesssim C(\mathrm{id};\Lambda_1,\Lambda_2,L_1,L_2,L_3,L_4). \tag{10.57} \]
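Proof. To prove (10.55), we first consider the formula for $l$ given in (10.37) + (10.40). By Lemma 6.1 and (10.36), we have that
\[ |||R_\infty(\eta,p) - R_\infty(\mathring{\eta},\mathring{p})|||_{H^N,\tau} \le |||R_\infty(\eta,p) - R_\infty(\bar{\eta},\bar{p})|||_{H^N,\tau} + |||R_\infty(\bar{\eta},\bar{p}) - R_\infty(\mathring{\eta},\mathring{p})|||_{H^N,\tau} \lesssim C(\mathrm{id};\Lambda_1,\Lambda_2). \tag{10.58} \]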
To estimate $|||F_c|||_{H^N,\tau}$, where $F_c$ is from (10.37), simply use (10.40) in the case $m = 0$ together with (10.32$'$) and (10.36). The proofs of (10.56) and (10.57) follow similarly from the expressions (10.38), (10.39), (10.41) in the case $m = 1$, and (10.42) in the case $m = 1$, together with Lemma 6.2 and the bounds supplied by the induction hypotheses, Lemma 10.2, and Lemma 10.4.

Lemma 10.6.
\[ c^{-1}\,|||\partial_t\Phi|||_{H^N,\tau} \lesssim 1/2 + \tau\cdot C(\mathrm{id};\Lambda_1,\Lambda_2,L_1,L_2) \overset{\mathrm{def}}{=} L_2/2 + \tau\cdot C(\mathrm{id};\Lambda_1,\Lambda_2,L_1,L_2). \tag{10.59} \]

Proof. (10.59) follows from definition (10.18), Lemma 10.3, inequality (10.56) of Lemma 10.5, and inequality (A.4) of Proposition A.2.

Lemma 10.7.
\[ |||\partial_t\Phi|||_{H^N,\tau} \lesssim C(\mathrm{id};\Lambda_1,\Lambda_2,L_1,L_2) + \tau\cdot C(\mathrm{id};\Lambda_1,\Lambda_2,L_1,L_2,L_3,L_4) \overset{\mathrm{def}}{=} L_2(\mathrm{id};\Lambda_1,\Lambda_2,L_1,L_2)/2 + \tau\cdot C(\mathrm{id};\Lambda_1,\Lambda_2,L_1,L_2,L_3,L_4), \tag{10.60} \]
\[ c^{-1}\,|||\partial_t^2\Phi|||_{H^{N-1},\tau} \lesssim 1/2 + \tau\cdot C(\mathrm{id};\Lambda_1,\Lambda_2,L_1,L_2,L_3,L_4) \overset{\mathrm{def}}{=} L_4/2 + \tau\cdot C(\mathrm{id};\Lambda_1,\Lambda_2,L_1,L_2,L_3,L_4). \tag{10.61} \]
Proof. The estimate (10.60) follows from Lemma 10.3, inequalities (10.56) and (10.57) of Lemma 10.5, and inequality (A.24) of Proposition A.3. The estimate (10.61) follows from definition (10.19), Lemma 10.3, inequality (10.57) of Lemma 10.5, and inequality (A.6) of Proposition A.2.

Lemma 10.8.
\[ |||\dot{\Phi}|||^2_{H^{N+1},\tau} \lesssim \frac{(\Lambda_2)^2}{4} + \tau\cdot C(\mathrm{id};\Lambda_1,\Lambda_2,L_2) + \tau^2\cdot C(\mathrm{id};\Lambda_1,\Lambda_2,L_1,L_2,L_3,L_4). \tag{10.62} \]

Proof. Inequality (10.62) follows from definition (10.17), (10.55), (10.60), and inequality (A.19) of Corollary A.1.

Lemma 10.9. Let ${}^{(c)}\dot{J}$ be the energy current (7.1) for the variation $\dot{W}$ defined by the BGS $V$, and let $b \overset{\mathrm{def}}{=} (f, g, \ldots, h^{(3)})$, where $f, g, \ldots, h^{(3)}$ are the inhomogeneous terms from the $\mathrm{EOV}^c_\kappa$ satisfied by $\dot{W}$ that are defined in (10.8)–(10.10) and that also appear in the expression (7.8) for the divergence of ${}^{(c)}\dot{J}$. Then on $[0, T_c^{\max})$, we have that
\[ \|\partial_\mu({}^{(c)}\dot{J}^\mu)\|_{L^1} \lesssim C(\mathrm{id};\Lambda_1,\Lambda_2,L_1,L_2)\cdot\bigl[\|\dot{W}\|^2_{L^2} + \|\dot{W}\|_{L^2}\|b\|_{L^2}\bigr]. \tag{10.63} \]
Proof. We separate the terms on the right-hand side of (7.8) into two types: those that depend quadratically on the variations, and those that depend linearly on the variations. We first bound (for all large $c$) the $L^1$ norm of the terms that depend quadratically on the variations by $C(\mathrm{id};\Lambda_1,\Lambda_2,L_1,L_2)\cdot\|\dot{W}\|^2_{L^2}$. This bound follows directly from the fact that the coefficients of the quadratic variation terms can be bounded in $L^\infty$ by $C(\mathrm{id};\Lambda_1,\Lambda_2,L_1,L_2)$; such an $L^\infty$ bound may be obtained by combining Remark 6.6, Lemma 6.4 in the case $m = 1$, Remark 6.9, the induction hypotheses, (10.44), and Sobolev embedding. We similarly bound the $L^1$ norm of the terms that depend linearly on the variations by $C(\mathrm{id};\Lambda_1,\Lambda_2)\cdot\|\dot{W}\|_{L^2}\|b\|_{L^2}$, but for these terms, we also make use of the Cauchy–Schwarz inequality for integrals.

We also state here the following corollary that will be used in the proof of Theorem 11.2.

Corollary 10.3. Let $V \in C_b^1([0,T]\times\mathbb{R}^3) \cap \bigcap_{k=0}^{k=1} C^k([0,T], H^{N-k}_{\bar{V}_c})$, and assume that $V([0,T]\times\mathbb{R}^3) \subset K$, where $K$ is defined in (10.30). Let $\dot{W}$ be a solution to the $\mathrm{EOV}^\infty_\kappa$ (5.1)–(5.3) defined by the BGS $W$ with inhomogeneous terms $b = (f, g, \ldots, h^{(3)})$, where $W$ denotes the first 5 components of $V$. Let ${}^{(\infty)}\dot{J}$ be the energy current (7.2) for the variation $\dot{W}$ defined by the BGS $W$. Then on $[0,T]$, we have that
\[ \|\partial_\mu({}^{(\infty)}\dot{J}^\mu)\|_{L^1} \le C(K;\ |||W|||_{L^\infty,T},\ |||\partial_t W|||_{L^\infty,T})\cdot\bigl[\|\dot{W}\|^2_{L^2} + \|\dot{W}\|_{L^2}\|b\|_{L^2}\bigr]. \tag{10.64} \]

Proof. We do not give any details since Corollary 10.3 can be proved by arguing as we did in our proof of Lemma 10.9. In fact, the proof of Corollary 10.3 is simpler: $c$ does not enter into the estimates.

Lemma 10.10.
\[ |||\dot{W}|||_{H^N,\tau} \lesssim \bigl[\Lambda_1/2 + \tau\cdot C(\mathrm{id};\Lambda_1,\Lambda_2,L_1,L_2)\bigr]\cdot\exp\bigl(\tau\cdot C(\mathrm{id};\Lambda_1,\Lambda_2,L_1,L_2)\bigr). \tag{10.65} \]
Proof. Our proof of Lemma 10.10 follows from a Gronwall estimate in the $H^N$ norm of the variation $\dot{W}$ defined in (10.6). Rather than directly estimating the $H^N$ norm of $\dot{W}$, we instead estimate the $L^1$ norm of ${}^{(c)}\dot{J}^0_{\vec{\alpha}}$, where ${}^{(c)}\dot{J}_{\vec{\alpha}}$ is the energy current for the variation $\partial_{\vec{\alpha}}\dot{W}$ defined by the BGS $V$. This is favorable because of property (7.7) and because by (7.8), the divergence of ${}^{(c)}\dot{J}$ is lower order in $\dot{W}$. We follow the method of proof of local existence from [22]; the only difficulty is checking that our estimates are independent of all large $c$. An important ingredient in our proof is showing that for $0 \le |\vec{\alpha}| \le N$, we have the bound
\[ \|b_{\vec{\alpha}}\|_{L^2} \lesssim C(\mathrm{id};\Lambda_1,\Lambda_2,L_2)\,\bigl(1 + \|\dot{W}\|_{H^N}\bigr), \tag{10.66} \]
where bα is defined in (10.12). Let us assume (10.66) for the moment; we will provide a proof at the end of the proof of the lemma.
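We now let ${}^{(c)}\dot{J}_{\vec{\alpha}}$ denote the energy current (7.1) for the variation $\partial_{\vec{\alpha}}\dot{W}$ defined by the BGS $V$, and abbreviating $\dot{J}_{\vec{\alpha}} \overset{\mathrm{def}}{=} {}^{(c)}\dot{J}_{\vec{\alpha}}$ to ease the notation, we define $E(t) \ge 0$ by
\[ E^2(t) \overset{\mathrm{def}}{=} \sum_{|\vec{\alpha}|\le N}\int_{\mathbb{R}^3}\dot{J}^0_{\vec{\alpha}}(t,s)\,d^3s. \tag{10.67} \]
By (8.16), and the Cauchy–Schwarz inequality for sums, we have that
\[ C^{-1}_{\bar{O}_2,Z}\,\|\dot{W}\|^2_{H^N} \lesssim E^2(t) \lesssim C_{\bar{O}_2,Z}\,\|\dot{W}\|^2_{H^N}. \tag{10.68} \]
Here, the value of $Z = Z(\mathrm{id};\Lambda_2)$ is given by (10.35).

Then by Lemma 10.9, (10.66), (10.68), with $C \overset{\mathrm{def}}{=} C(\mathrm{id};\Lambda_1,\Lambda_2,L_1,L_2)$, we have
\[ 2E\,\frac{d}{dt}E = \sum_{|\vec{\alpha}|\le N}\int_{\mathbb{R}^3}\partial_\mu\dot{J}^\mu_{\vec{\alpha}}\,d^3s \lesssim C\cdot\sum_{|\vec{\alpha}|\le N}\bigl(\|\partial_{\vec{\alpha}}\dot{W}\|^2_{L^2} + \|\partial_{\vec{\alpha}}\dot{W}\|_{L^2}\|b_{\vec{\alpha}}\|_{L^2}\bigr) \lesssim C\cdot\bigl(\|\dot{W}\|^2_{H^N} + \|\dot{W}\|_{H^N}\bigr) \lesssim C\cdot(E^2 + E). \tag{10.69} \]
We now apply Gronwall's inequality to (10.69), concluding that
\[ E(t) \lesssim \bigl[E(0) + Ct\bigr]\cdot\exp(Ct). \tag{10.70} \]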
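For readers who want the Gronwall step spelled out, the following is a minimal sketch of how a bound of the form (10.70) can be extracted from (10.69). It assumes that (10.69) holds with "$\le$" and a single constant $C > 0$, and that $E(t) > 0$ so that division by $E$ is legitimate (the case $E = 0$ would need a standard regularization such as replacing $E$ by $\sqrt{E^2 + \varepsilon}$, which is not carried out here).

```latex
% Sketch under the stated assumptions: E > 0 and C > 0 is a fixed constant.
\begin{align*}
2E\,\frac{dE}{dt} \le C\,(E^2 + E)
  &\implies \frac{dE}{dt} \le \frac{C}{2}\,(E + 1) \\
  &\implies \frac{d}{dt}\Bigl[(E+1)\,e^{-Ct/2}\Bigr] \le 0 \\
  &\implies E(t) + 1 \le \bigl(E(0) + 1\bigr)\,e^{Ct/2}.
\end{align*}
% Elementary inequality used next: e^{x} - 1 <= x e^{x} for x >= 0, with x = Ct/2.
\begin{align*}
E(t) \le E(0)\,e^{Ct/2} + \bigl(e^{Ct/2} - 1\bigr)
     \le E(0)\,e^{Ct} + \tfrac{Ct}{2}\,e^{Ct/2}
     \le \bigl[E(0) + Ct\bigr]\,e^{Ct}.
\end{align*}
```

The final form is deliberately wasteful: it trades the sharper exponent $Ct/2$ for an expression that matches the shape of (10.65), where only the structure "small initial term plus a term linear in time, times an exponential" matters.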
Using (10.68) again, it follows from (10.70) that −1 ˙ ˙ W(t) H N CO ¯ 2 ,Z W(0)H N + Ct · exp(Ct).
(10.71)
Recalling that $\dot{W}(0) = \mathring{W}_c - {}^{(0)}\mathring{W}_c$ and taking into account inequality (9.6), the estimate (10.65) now follows.

It remains to show (10.66). Our proof is based on the Sobolev–Moser propositions stated in Appendix B and the $c$-independent estimates of Sec. 6. With the 5 components of the array $b$ defined by (10.8)–(10.10), we first claim that the term ${}_cA^0\,\partial_{\vec{\alpha}}\bigl(({}_cA^0)^{-1}b\bigr)$ from (10.12) satisfies
\[ \|{}_cA^0\,\partial_{\vec{\alpha}}\bigl(({}_cA^0)^{-1}b\bigr)\|_{L^2} \lesssim C(\mathrm{id};\Lambda_1,\Lambda_2,L_2). \tag{10.72} \]
Because (6.25) and the induction hypotheses together imply that $\|{}_cA^0(W,\Phi)\|_{L^\infty} \lesssim C(\mathrm{id};\Lambda_1,\Lambda_2)$, it suffices to bound the $H^N$ norm of $({}_cA^0)^{-1}b$ by the right-hand side of (10.72). To this end, we use the induction hypotheses, (6.25), Proposition B.1, and Remark B.1, with $({}_cA^0(W,\Phi))^{-1}$ playing the role of $F$ in the proposition and $b$ playing the role of $G$, to conclude that
\[ \|({}_cA^0)^{-1}b\|_{H^N} \lesssim C(\mathrm{id};\Lambda_1,\Lambda_2)\,\|b\|_{H^N}. \tag{10.73} \]
To estimate $\|b\|_{H^N}$, we first split the array $b$ into two arrays:
\[ b = B_c(W,\Phi,D\Phi) + I_c(\mathrm{id},W,\Phi), \tag{10.74} \]
where $B_c$ is defined in Lemma 6.7 and the 5-component array $I_c$ comprises the terms from the right-hand sides of (10.8)–(10.10) containing at least one factor of the smoothed initial data. By Lemma 6.2, Lemma 6.4, Remark 6.9, and Remark 9.1, we have that
\[ I_c \in \mathcal{I}^N(\mathrm{id},W,\Phi), \tag{10.75} \]
and from (10.75) and the induction hypotheses, it follows that
\[ \|I_c(\mathrm{id},W,\Phi)\|_{H^N} \lesssim C(\mathrm{id};\Lambda_1,\Lambda_2). \tag{10.76} \]
Furthermore, by (6.29) in the case $m = 1$ and the induction hypotheses, we have that
\[ \|B_c(W,\Phi,D\Phi)\|_{H^N} \lesssim C(\mathrm{id};\Lambda_1,\Lambda_2,L_2). \tag{10.77} \]
Combining (10.74), (10.76), and (10.77), we have that
\[ \|b\|_{H^N} \lesssim C(\mathrm{id};\Lambda_1,\Lambda_2,L_2). \tag{10.78} \]
We now observe that (10.73) and (10.78) together imply (10.72).

We next claim that the $k_{\vec{\alpha}}$ terms (10.13) satisfy
\[ \|k_{\vec{\alpha}}\|_{L^2} \lesssim C(\mathrm{id};\Lambda_1,\Lambda_2)\,\|\dot{W}\|_{H^N}. \tag{10.79} \]
Since $\|{}_cA^0(W,\Phi)\|_{L^\infty} \lesssim C(\mathrm{id};\Lambda_1,\Lambda_2)$, to prove (10.79), it suffices to control the $L^2$ norm of $({}_cA^0)^{-1}\,{}_cA^k\,\partial_k(\partial_{\vec{\alpha}}\dot{W}) - \partial_{\vec{\alpha}}\bigl(({}_cA^0)^{-1}\,{}_cA^k\,\partial_k\dot{W}\bigr)$. By the induction hypotheses, (6.25), Proposition B.3, and Remark B.3, with $({}_cA^0)^{-1}\,{}_cA^k = \bigl(({}_cA^0)^{-1}\,{}_cA^k\bigr)(W,\Phi)$ playing the role of $F$ in the proposition, and $\partial_k\dot{W}$ playing the role of $G$, we have (for $0 \le |\vec{\alpha}| \le N$) that
\[ \|({}_cA^0)^{-1}\,{}_cA^k\,\partial_{\vec{\alpha}}(\partial_k\dot{W}) - \partial_{\vec{\alpha}}\bigl(({}_cA^0)^{-1}\,{}_cA^k\,\partial_k\dot{W}\bigr)\|_{L^2} \lesssim C(\mathrm{id};\Lambda_1,\Lambda_2)\,\|\partial\dot{W}\|_{H^{N-1}}, \tag{10.80} \]
from which (10.79) readily follows. This concludes the proof of (10.66), and therefore also the proof of Lemma 10.10.

11. The Non-Relativistic Limit of the $\mathrm{EN}^c_\kappa$ System

In this section, we state and prove our main theorem regarding the non-relativistic limit of the $\mathrm{EN}^c_\kappa$ system. Before stating our main theorem, we first state and prove a corollary of Theorem 10.2 that will be used in the proof of Theorem 11.2, and we also briefly discuss local existence for the $\mathrm{EP}_\kappa$ system.

11.1. $\mathrm{EN}^c_\kappa$ well-approximates $\mathrm{EP}_\kappa$ for large $c$

The following corollary, which is an extension of Corollary 10.2, shows that for large $c$, solutions to the $\mathrm{EN}^c_\kappa$ system are "almost" solutions to the $\mathrm{EP}_\kappa$ system.
Corollary 11.1. For all large $c$, the solutions $V = (W,\Phi,D\Phi)$ to the $\mathrm{EN}^c_\kappa$ system (4.1)–(4.8) furnished by Theorem 10.2 satisfy
\[ {}_\infty A^\mu(W)\,\partial_\mu W = B_\infty(W,\partial\Phi) + E_{1c}, \tag{11.1} \]
\[ \Delta(\Phi - \mathring{\Phi}_c) - \kappa^2(\Phi - \mathring{\Phi}_c) = 4\pi G\bigl[R_\infty(\eta,p) - R_\infty(\mathring{\eta},\mathring{p})\bigr] + E_{2c}, \tag{11.2} \]
where
\[ |||E_{1c}|||_{H^{N-1},T} \lesssim c^{-2}\,C(\mathrm{id};\Lambda_1,\Lambda_2,L_2), \tag{11.3} \]
\[ |||E_{2c}|||_{H^{N-1},T} \lesssim c^{-1}\,C(\mathrm{id};\Lambda_1,\Lambda_2,L_4), \tag{11.4} \]
and $T$ is from Theorem 10.2.

Proof. The estimate (11.3) follows from multiplying each side of (10.46) by ${}_\infty A^0(W)$, and then combining Proposition B.1, Remark B.1, (10.5d), and the induction hypotheses from Sec. 10.3.1, which are valid on $[0,T]$; we remark that we are making use of the $O^{N-1}(c^{-2}; W, \partial W, \Phi, D\Phi)$ estimate on the right-hand side of (10.46). Similarly, the estimate (11.4) follows from the fact that $\Delta(\Phi - \mathring{\Phi}_c) - \kappa^2(\Phi - \mathring{\Phi}_c) = c^{-2}\partial_t^2\Phi + l$, where $l$ is given by (10.43), together with (10.37), (10.40) in the case $m = 0$, (10.5f), and the induction hypotheses.
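The origin of the weaker rate $c^{-1}$ in (11.4), compared with $c^{-2}$ in (11.3), can be sketched as follows. The sketch assumes that membership $F_c \in \mathcal{I}^N(c^{-2};\ldots)$ in (10.40) with $m = 0$ may be read as an $H^N$ bound of size $c^{-2}$ times a constant (this is how the class is used in this section; its definition in Sec. 6 is not restated here), and that $c \ge 1$.

```latex
% Sketch under the stated assumptions. Comparing (11.2) with the identity
%   \Delta(\Phi-\mathring\Phi_c) - \kappa^2(\Phi-\mathring\Phi_c) = c^{-2}\partial_t^2\Phi + l
% and with (10.37) identifies the error term:
\begin{align*}
E_{2c} &= c^{-2}\,\partial_t^2\Phi + 4\pi G\,F_c, \\
|||E_{2c}|||_{H^{N-1},T}
  &\le c^{-1}\cdot\bigl(c^{-1}\,|||\partial_t^2\Phi|||_{H^{N-1},T}\bigr)
     + 4\pi G\,|||F_c|||_{H^{N},T} \\
  &\lesssim c^{-1} L_4 + c^{-2}\,C(\mathrm{id};\Lambda_1,\Lambda_2)
   \lesssim c^{-1}\,C(\mathrm{id};\Lambda_1,\Lambda_2,L_4).
\end{align*}
```

In other words, only the combination $c^{-1}\partial_t^2\Phi$ is controlled uniformly in $c$ (compare (10.26)), so one factor of $c^{-1}$ is spent on that bound and only one remains as decay.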
11.2. Local existence for $\mathrm{EP}_\kappa$

In this section, we briefly discuss local existence for the $\mathrm{EP}_\kappa$ system.

Theorem 11.1 (Local Existence for $\mathrm{EP}_\kappa$). Let $\mathring{V}_\infty$ denote initial data (8.1) for the $\mathrm{EP}_\kappa$ system (4.9)–(4.14) that are subject to the conditions described in Sec. 8. Assume further that the equation of state is "physical" as described in Secs. 3.1 and 6.3. Then there exists a $T_\infty > 0$ such that (4.9)–(4.14) has a unique classical solution $V_\infty \in C_b^2([0,T_\infty]\times\mathbb{R}^3)$ of the form $V_\infty \overset{\mathrm{def}}{=} (\eta_\infty, p_\infty, v^1_\infty, \ldots, \partial_3\Phi_\infty)$, and such that $V_\infty(0,s) = \mathring{V}_\infty(s)$. Additionally, $T_\infty$ can be chosen such that $V_\infty([0,T_\infty]\times\mathbb{R}^3) \subset K$, where the compact convex set $K$ is defined in (10.30). Finally, $V_\infty \in \bigcap_{k=0}^{k=2} C^k([0,T_\infty], H^{N-k}_{\bar{V}_\infty})$ and $\Phi_\infty \in C_b^3([0,T_\infty]\times\mathbb{R}^3) \cap \bigcap_{k=0}^{k=3} C^k([0,T_\infty], H^{N+1-k}_{\bar{\Phi}_\infty})$.

Proof. Theorem 11.1 can be proved by an iteration scheme based on the method of energy currents: energy currents ${}^{(\infty)}\dot{J}$ can be used to control $\|W_\infty\|_{H^N_{\bar{W}_\infty}}$, while $\|\Phi_\infty\|_{H^{N+1}_{\bar{\Phi}_\infty}}$ can be controlled using the estimate $\|f\|_{H^2} \le C\|(\Delta - \kappa^2)f\|_{L^2}$ for $f \in H^2$. These methods are employed in the proof of Theorem 11.2 below, so we do not provide a proof here. Similar techniques are used by Makino in [14]. We remark that these methods apply in particular to the system studied by Kiessling (as described in Sec. 4.2) in [12].
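The elliptic estimate $\|f\|_{H^2} \le C\|(\Delta - \kappa^2)f\|_{L^2}$ quoted in the proof can be checked on the Fourier side. The following is a sketch only, assuming $\kappa > 0$, $f \in H^2(\mathbb{R}^3)$, the Fourier-side norm $\|f\|^2_{H^2} = \int (1+|\xi|^2)^2|\hat{f}(\xi)|^2\,d\xi$ (equivalent to the derivative-based norm), and a unitary normalization of the Fourier transform so that Plancherel's identity holds with constant 1.

```latex
% Sketch under the stated assumptions (kappa > 0, unitary Fourier transform).
\begin{align*}
\widehat{(\Delta-\kappa^2)f}(\xi) &= -\bigl(|\xi|^2+\kappa^2\bigr)\,\hat f(\xi), \\
1+|\xi|^2 &\le \max\{1,\kappa^{-2}\}\,\bigl(\kappa^2+|\xi|^2\bigr), \\
\|f\|_{H^2}^2
  &= \int_{\mathbb{R}^3}\bigl(1+|\xi|^2\bigr)^2\,|\hat f(\xi)|^2\,d\xi
   \le \max\{1,\kappa^{-2}\}^2\int_{\mathbb{R}^3}\bigl(\kappa^2+|\xi|^2\bigr)^2\,|\hat f(\xi)|^2\,d\xi \\
  &= \max\{1,\kappa^{-2}\}^2\,\|(\Delta-\kappa^2)f\|_{L^2}^2 .
\end{align*}
```

With this normalization one may take $C = \max\{1, \kappa^{-2}\}$. The constant blows up as $\kappa \to 0$, which is one way to see why the argument uses $\kappa > 0$.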
11.3. Statement and proof of the main theorem

Theorem 11.2 (The Non-Relativistic Limit of $\mathrm{EN}^c_\kappa$). Let $\mathring{V}_\infty$ denote initial data (8.1) for the $\mathrm{EP}_\kappa$ system (4.9)–(4.14) that are subject to the conditions described in Sec. 8. Let $\mathring{V}_c$ denote the corresponding initial data (8.8) for the $\mathrm{EN}^c_\kappa$ system (4.1)–(4.8) constructed from $\mathring{V}_\infty$ as described in Sec. 8, and assume that the $c$-indexed equation of state satisfies the hypotheses (6.12) and (6.13) and is "physical" as described in Secs. 3.1 and 6.3. Let $V_\infty \overset{\mathrm{def}}{=} (\eta_\infty, p_\infty, v^1_\infty, \ldots, \partial_3\Phi_\infty)$ ($V_c \overset{\mathrm{def}}{=} (\eta_c, p_c, v^1_c, \ldots, \partial_3\Phi_c)$) denote the solution to the $\mathrm{EP}_\kappa$ ($\mathrm{EN}^c_\kappa$) system launched by $\mathring{V}_\infty$ ($\mathring{V}_c$) as furnished by Theorem 11.1 (Theorem 10.2). By Theorems 11.1 and 10.2, we may assume that for all large $c$, $V_\infty$ and $V_c$ exist on a common spacetime slab $[0,T]\times\mathbb{R}^3$, where $T$ is the minimum of the two times from the conclusions of the theorems. Let $W_\infty$ and $W_c$ denote the first 5 components of $V_\infty$ and $V_c$ respectively. Then there exists a constant $C > 0$ such that
\[ |||W_\infty - W_c|||_{H^{N-1},T} \lesssim c^{-1}\cdot C, \tag{11.5} \]
\[ |||(\Phi_\infty - \bar{\Phi}_\infty) - (\Phi_c - \bar{\Phi}_c)|||_{H^{N+1},T} \lesssim c^{-1}\cdot C, \tag{11.6} \]
\[ \lim_{c\to\infty}|\bar{\Phi}_\infty - \bar{\Phi}_c| = 0, \tag{11.7} \]
where the constants $\bar{\Phi}_\infty$ and $\bar{\Phi}_c$ are defined through the initial data by (8.3) and (8.12), respectively.

Remark 11.1. (11.5)–(11.7), and Sobolev embedding imply that $W_c \to W_\infty$ uniformly and $\Phi_c \to \Phi_\infty$ uniformly on $[0,T]\times\mathbb{R}^3$ as $c \to \infty$. Furthermore, the interpolation estimate (B.9), together with the uniform bound $|||W_c|||_{H^N_{\bar{W}_c},T} \lesssim C$ that follows from combining (6.20), (10.5a), and (10.5b), collectively imply that $\lim_{c\to\infty}|||W_\infty - W_c|||_{H^{N'},T} = 0$ for any $N' < N$. The reason that we cannot use our argument to obtain the $H^N$ norm on the left-hand side of (11.5) instead of the $H^{N-1}$ norm is that the expression (11.12) for $b$ already involves one derivative of $W$, and therefore can only be controlled in the $H^{N-1}$ norm.

Proof. Throughout the proof, we refer to the constants $\Lambda_1$, $\Lambda_2$, etc., from the conclusion of Theorem 10.2. To ease the notation, we drop the subscripts $c$ from the solution $V_c$ and its first 5 components $W_c$, setting $V \overset{\mathrm{def}}{=} V_c$, $W \overset{\mathrm{def}}{=} W_c$, etc. We then define with the aid of (8.14)
\[ \dot{W} \overset{\mathrm{def}}{=} W_\infty - W, \tag{11.8} \]
\[ \dot{\Phi} \overset{\mathrm{def}}{=} (\Phi_\infty - \bar{\Phi}_\infty) - (\Phi - \bar{\Phi}_c) = (\Phi_\infty - \mathring{\Phi}_\infty) - (\Phi - \mathring{\Phi}_c). \tag{11.9} \]
Our proof of Theorem 11.2 is similar to our proof of Lemma 10.10; we use energy currents and elementary harmonic analysis (i.e. Lemma A.1) to obtain a Gronwall estimate for the $H^{N-1}$ norm of the variation $\dot{W}$ defined in (11.8). It will also follow from our proof that the $H^{N+1}$ norm of $\dot{\Phi}$ is controlled in terms of $\|\dot{W}\|_{H^{N-1}}$ plus
a small remainder. We remark that all of the estimates in this proof are valid on the interval $[0,T]$, where $T$ is as in the statement of Theorem 11.2.

From definitions (11.8) and (11.9), it follows that $\dot{W}$, $\dot{\Phi}$ are solutions to the following $\mathrm{EOV}^\infty_\kappa$ defined by the BGS $W_\infty$:
\[ {}_\infty A^\mu(W_\infty)\,\partial_\mu\dot{W} = b, \tag{11.10} \]
\[ (\Delta - \kappa^2)\dot{\Phi} = l, \tag{11.11} \]
where
\[ b \overset{\mathrm{def}}{=} B_\infty(W_\infty,\partial\Phi_\infty) - B_\infty(W,\partial\Phi) + \bigl[{}_\infty A^\mu(W) - {}_\infty A^\mu(W_\infty)\bigr]\partial_\mu W - E_{1c}, \tag{11.12} \]
\[ l \overset{\mathrm{def}}{=} 4\pi G\bigl[R_\infty(\eta_\infty,p_\infty) - R_\infty(\eta,p)\bigr] - E_{2c}, \tag{11.13} \]
$B_\infty$ is defined in Lemma 6.7, and $E_{1c}$, $E_{2c}$ are defined in Corollary 11.1. Note that the definition of $l$ in (11.13) differs from the definition (10.43) of $l$ that is used in the proof of Corollary 11.1. By comparing (8.1) and (8.8), we see that the initial condition satisfied by $\dot{W}$ is
\[ \dot{W}(0) = 0. \tag{11.14} \]
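The formula (11.12) for $b$ can be checked by a direct subtraction. The sketch below assumes that $W_\infty$ solves the $\mathrm{EP}_\kappa$ system written in the form ${}_\infty A^\mu(W_\infty)\partial_\mu W_\infty = B_\infty(W_\infty,\partial\Phi_\infty)$ (that is, the analogue of (11.1) with no error term; this form is an assumption made here for illustration) and that $W$ satisfies (11.1).

```latex
% Sketch under the stated assumptions: subtract (11.1) from the EP_kappa equation
% after adding and subtracting {}_\infty A^\mu(W)\partial_\mu W.
\begin{align*}
{}_\infty A^\mu(W_\infty)\,\partial_\mu \dot W
 &= {}_\infty A^\mu(W_\infty)\,\partial_\mu W_\infty
    - {}_\infty A^\mu(W_\infty)\,\partial_\mu W \\
 &= B_\infty(W_\infty,\partial\Phi_\infty)
    - {}_\infty A^\mu(W)\,\partial_\mu W
    + \bigl[{}_\infty A^\mu(W) - {}_\infty A^\mu(W_\infty)\bigr]\partial_\mu W \\
 &= B_\infty(W_\infty,\partial\Phi_\infty) - B_\infty(W,\partial\Phi)
    + \bigl[{}_\infty A^\mu(W) - {}_\infty A^\mu(W_\infty)\bigr]\partial_\mu W
    - E_{1c}.
\end{align*}
```

The commutator-like term $[{}_\infty A^\mu(W) - {}_\infty A^\mu(W_\infty)]\partial_\mu W$ is the one that contains a derivative of $W$, which is why $b$ is only controlled in $H^{N-1}$, as noted in Remark 11.1.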
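Differentiating Eq. (11.10) with the spatial multi-index operator $\partial_{\vec{\alpha}}$, we have that
\[ {}_\infty A^\mu(W_\infty)\,\partial_\mu(\partial_{\vec{\alpha}}\dot{W}) = b_{\vec{\alpha}}, \tag{11.15} \]
where (suppressing the dependence of ${}_\infty A^\nu(\cdot)$ on $W_\infty$ for $\nu = 0, 1, 2, 3$)
\[ b_{\vec{\alpha}} \overset{\mathrm{def}}{=} {}_\infty A^0\,\partial_{\vec{\alpha}}\bigl(({}_\infty A^0)^{-1}b\bigr) + k_{\vec{\alpha}} \tag{11.16} \]
and
\[ k_{\vec{\alpha}} \overset{\mathrm{def}}{=} {}_\infty A^0\bigl[({}_\infty A^0)^{-1}\,{}_\infty A^k\,\partial_{\vec{\alpha}}(\partial_k\dot{W}) - \partial_{\vec{\alpha}}\bigl(({}_\infty A^0)^{-1}\,{}_\infty A^k\,\partial_k\dot{W}\bigr)\bigr]. \tag{11.17} \]
As an intermediate step, we will show that for $0 \le |\vec{\alpha}| \le N - 1$, we have that
\[ \|b_{\vec{\alpha}}\|_{L^2} \lesssim C\bigl(\mathrm{id};\ |||W_\infty|||_{H^N_{\bar{W}_\infty},T},\ \Lambda_1,\Lambda_2,L_1,L_2,L_4\bigr)\cdot\bigl(\|\dot{W}\|_{H^{N-1}} + c^{-1}\bigr). \tag{11.18} \]
Let us assume (11.18) for the moment and proceed as in Lemma 10.10: we let ${}^{(\infty)}\dot{J}_{\vec{\alpha}}$ denote the energy current (7.2) for $\partial_{\vec{\alpha}}\dot{W}$ defined by the BGS $W_\infty$, and define $E(t) \ge 0$ by
\[ E^2(t) \overset{\mathrm{def}}{=} \sum_{|\vec{\alpha}|\le N-1}\int_{\mathbb{R}^3}\dot{J}^0_{\vec{\alpha}}(t,s)\,d^3s, \tag{11.19} \]
where we have dropped the superscript $(\infty)$ on $\dot{J}$ to ease the notation. By (8.16), Remark 8.4, and the Cauchy–Schwarz inequality for sums, we have that
\[ C^{-1}_{\bar{O}_2}\,\|\dot{W}\|^2_{H^{N-1}} \lesssim E^2(t) \lesssim C_{\bar{O}_2}\,\|\dot{W}\|^2_{H^{N-1}}. \tag{11.20} \]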
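Then by Corollary 10.3 + Sobolev embedding, (11.18), and (11.20), with $C \overset{\mathrm{def}}{=} C\bigl(\mathrm{id};\ |||W_\infty|||_{H^N_{\bar{W}_\infty},T},\ |||\partial_t W_\infty|||_{H^{N-1},T},\ \Lambda_1,\Lambda_2,L_1,L_2,L_4\bigr)$, we have that
\[ 2E\,\frac{d}{dt}E = \sum_{|\vec{\alpha}|\le N-1}\int_{\mathbb{R}^3}\partial_\mu\dot{J}^\mu_{\vec{\alpha}}\,d^3s \lesssim C\cdot\sum_{|\vec{\alpha}|\le N-1}\bigl(\|\partial_{\vec{\alpha}}\dot{W}\|^2_{L^2} + \|\partial_{\vec{\alpha}}\dot{W}\|_{L^2}\|b_{\vec{\alpha}}\|_{L^2}\bigr) \lesssim C\cdot\|\dot{W}\|^2_{H^{N-1}} + c^{-1}C\cdot\|\dot{W}\|_{H^{N-1}} \lesssim C\cdot E^2 + c^{-1}C\cdot E. \tag{11.21} \]
Taking into account (11.14), which implies that $E(0) = 0$, we apply Gronwall's inequality to (11.21), concluding that for $t \in [0,T]$,
\[ E(t) \lesssim c^{-1}\,C\cdot t\cdot\exp(C\cdot t). \tag{11.22} \]
From (11.20) and (11.22), it follows that
\[ |||\dot{W}|||_{H^{N-1},T} \lesssim c^{-1}\,C\cdot T\cdot\exp(T\cdot C), \tag{11.23} \]
which implies (11.5).

We now return to the proof of (11.18). To prove (11.18), we show only that the following bound holds, where for the remainder of this proof, we abbreviate $C = C\bigl(\mathrm{id};\ |||W_\infty|||_{H^{N-1}_{\bar{W}_\infty},T},\ \Lambda_1,\Lambda_2,L_1,L_2,L_4\bigr)$:
\[ \|b\|_{H^{N-1}} \lesssim C\cdot\|\dot{W}\|_{H^{N-1}} + c^{-1}C. \tag{11.24} \]
The remaining details, which we leave up to the reader, then follow as in the proof of Lemma 10.10. By (10.36), which is valid for $\tau = T$, and by (B.5), we have that
\[ \|R_\infty(\eta_\infty,p_\infty) - R_\infty(\eta,p)\|_{H^{N-1}} \lesssim C\cdot\|\dot{W}\|_{H^{N-1}}, \tag{11.25} \]
and combining (11.4), (11.11), (11.13), (11.25), and Lemma A.1, it follows that
\[ \|\dot{\Phi}\|_{H^{N+1}} \lesssim C\cdot\|l\|_{H^{N-1}} \lesssim C\cdot\|\dot{W}\|_{H^{N-1}} + c^{-1}C. \tag{11.26} \]
Similarly, taking into account (11.26), we have that
\[ \|B_\infty(W_\infty,\partial\Phi_\infty) - B_\infty(W,\partial\Phi)\|_{H^{N-1}} \lesssim C\cdot\bigl(\|\dot{W}\|_{H^{N-1}} + \|\partial\dot{\Phi}\|_{H^{N-1}}\bigr) \lesssim C\cdot\|\dot{W}\|_{H^{N-1}} + c^{-1}C. \tag{11.27} \]
Finally, by (10.36) and (10.44), which are both valid for $\tau = T$, by (B.3), and by (B.5), we have that
\[ \|\bigl[{}_\infty A^\mu(W) - {}_\infty A^\mu(W_\infty)\bigr]\partial_\mu W\|_{H^{N-1}} \lesssim C\cdot\|\dot{W}\|_{H^{N-1}}. \tag{11.28} \]
Inequality (11.24) now follows from (11.3), (11.12), (11.27), and (11.28). The estimate (11.6) then follows from (11.9), (11.23), and (11.26), while (11.7) is merely a restatement of (8.13).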
Acknowledgments

The author would like to thank Michael Kiessling and A. Shadi Tahvildar-Zadeh for discussing this project and for providing comments that were helpful in the revision of the earlier drafts. The author would also like to thank the anonymous referee for providing suggestions that helped to clarify certain points and for providing some of the references. This work was supported by NSF Grant DMS-0406951. Any opinions, conclusions, or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the NSF.

Appendix A. Inhomogeneous Linear Klein–Gordon Estimates

In this appendix, we collect together some standard energy estimates for the linear Klein–Gordon equation with an inhomogeneous term. We provide some proofs for convenience. Throughout this appendix, we abbreviate $L^p = L^p(\mathbb{R}^d)$ and $H^j = H^j(\mathbb{R}^d)$.

Proposition A.1. Let $l \in C^0([0,T], H^N)$ and $\mathring{\Psi}_0(s) \in H^N$, where $N \in \mathbb{N}$. Then there is a unique solution $\dot{\Phi}(t,s) : \mathbb{R}\times\mathbb{R}^d \to \mathbb{R}$ to the equation
\[ -c^{-2}\partial_t^2\dot{\Phi} + \Delta\dot{\Phi} - \kappa^2\dot{\Phi} = l \tag{A.1} \]
with initial data $\dot{\Phi}(0,s) = 0$, $\partial_t\dot{\Phi}(0,s) = \mathring{\Psi}_0(s)$, where $\Delta \overset{\mathrm{def}}{=} \sum_{i=1}^d \partial_i^2$. The solution has the regularity property $\dot{\Phi} \in \bigcap_{k=0}^{k=1} C^k([0,T], H^{N+1-k})$.

Proof. This is a standard result; consult [21] for a proof.

Proposition A.2. Assume the hypotheses of Proposition A.1. Assume further that $l \in \bigcap_{k=0}^{k=2} C^k([0,T], H^{N-k})$. Then there exists a constant $C_0(\kappa) > 0$ such that
\[ |||\dot{\Phi}|||_{H^{N+1},T} \le C_0(\kappa)\cdot\bigl\{c^{-1}\|\mathring{\Psi}_0\|_{H^N} + cT\,|||l|||_{H^N,T}\bigr\}, \tag{A.2} \]
\[ |||\partial_t\dot{\Phi}|||_{H^N,T} \le C_0(\kappa)\cdot\bigl\{\|\mathring{\Psi}_0\|_{H^N} + c^2T\,|||l|||_{H^N,T}\bigr\}, \tag{A.3} \]
\[ |||\partial_t\dot{\Phi}|||_{H^N,T} \le C_0(\kappa)\cdot\bigl\{\|\mathring{\Psi}_0\|_{H^N} + c\,\|l(0)\|_{H^{N-1}} + cT\,|||\partial_t l|||_{H^{N-1},T}\bigr\}, \tag{A.4} \]
\[ |||\partial_t^2\dot{\Phi}|||_{H^{N-1},T} \le C_0(\kappa)\cdot\bigl\{c\,\|\mathring{\Psi}_0\|_{H^N} + c^2\|l(0)\|_{H^{N-1}} + c^2T\,|||\partial_t l|||_{H^{N-1},T}\bigr\}, \tag{A.5} \]
\[ |||\partial_t^2\dot{\Phi}|||_{H^{N-1},T} \le C_0(\kappa)\cdot\bigl\{c^2\|l(0)\|_{H^{N-1}} + c\,\|(\Delta - \kappa^2)\mathring{\Psi}_0 - \partial_t l(0)\|_{H^{N-2}} + cT\,|||\partial_t^2 l|||_{H^{N-2},T}\bigr\}, \tag{A.6} \]
\[ |||\partial_t^3\dot{\Phi}|||_{H^{N-2},T} \le C_0(\kappa)\cdot\bigl\{c^3\|l(0)\|_{H^{N-1}} + c^2\|(\Delta - \kappa^2)\mathring{\Psi}_0 - \partial_t l(0)\|_{H^{N-2}} + c^2T\,|||\partial_t^2 l|||_{H^{N-2},T}\bigr\}. \]

Proof. Because $\partial^{(k)}\dot{\Phi}$ is a solution to the Klein–Gordon equation
\[ -c^{-2}\partial_t^2\,\partial^{(k)}\dot{\Phi} + \Delta\,\partial^{(k)}\dot{\Phi} - \kappa^2\,\partial^{(k)}\dot{\Phi} = \partial^{(k)}l, \tag{A.7} \]
we will use standard energy estimates for the linear Klein–Gordon equation to estimate $|||\dot{\Phi}|||_{H^{N+1},T}$. Thus, for $0 \le k \le N$, we define $E_k(t) \ge 0$ by
$$E_k^2(t) \overset{\text{def}}{=} \|\kappa\,\partial^{(k)}\dot{\Phi}(t)\|_{L^2}^2 + \|\partial^{(k+1)}\dot{\Phi}(t)\|_{L^2}^2 + \|c^{-1}\partial^{(k)}\partial_t\dot{\Phi}(t)\|_{L^2}^2. \tag{A.8}$$
We now multiply each side of the equation satisfied by $\partial^{(k)}\dot{\Phi}$ by $-\partial^{(k)}\partial_t\dot{\Phi}$, integrate by parts over $\mathbb{R}^d$, and use Hölder's inequality to arrive at the following chain of inequalities:
$$E_k(t)\frac{d}{dt}E_k(t) = \frac{1}{2}\frac{d}{dt}\big(E_k^2(t)\big) = \int_{\mathbb{R}^d}(-\partial^{(k)}\partial_t\dot{\Phi}) \cdot \partial^{(k)}l \; d^ds \le \|\partial^{(k)}\partial_t\dot{\Phi}(t)\|_{L^2}\,\|\partial^{(k)}l(t)\|_{L^2}, \tag{A.9}$$
where $(-\partial^{(k)}\partial_t\dot{\Phi}) \cdot (\partial^{(k)}l)$ denotes the array-valued quantity formed by taking the component by component product of the two arrays $-\partial^{(k)}\partial_t\dot{\Phi}$ and $\partial^{(k)}l$.

If we now define $E(t) \ge 0$ by
$$E^2(t) \overset{\text{def}}{=} \sum_{k=0}^{N}E_k^2(t) = \kappa^2\|\dot{\Phi}(t)\|_{H^N}^2 + \|\partial\dot{\Phi}(t)\|_{H^N}^2 + c^{-2}\|\partial_t\dot{\Phi}(t)\|_{H^N}^2, \tag{A.10}$$
it follows from (A.9) and the Cauchy–Schwarz inequality for sums that
$$E(t)\frac{d}{dt}E(t) = \frac{1}{2}\frac{d}{dt}\big(E^2(t)\big) \le \|\partial_t\dot{\Phi}\|_{H^N}\,\|l(t)\|_{H^N} \le cE(t)\,\|l(t)\|_{H^N}, \tag{A.11}$$
and so
$$\frac{d}{dt}E(t) \le c\,\|l(t)\|_{H^N}. \tag{A.12}$$
Integrating (A.12) over time, we have the following inequality, valid for $t \in [0,T]$:
$$E(t) \le E(0) + ct\,|||l|||_{H^N,T}. \tag{A.13}$$
From the definition of $E(t)$ and the initial condition $\dot{\Phi}(0) = 0$, we have that
$$\|\dot{\Phi}(t)\|_{H^{N+1}} \le C(\kappa)E(t), \tag{A.14}$$
$$\|\partial_t\dot{\Phi}(t)\|_{H^N} \le cE(t), \tag{A.15}$$
$$E(0) = c^{-1}\|\mathring{\Psi}_0\|_{H^N}. \tag{A.16}$$
Combining (A.13)–(A.16), and taking the sup over $t \in [0,T]$ proves (A.2) and (A.3). To prove (A.4)–(A.7), we differentiate the Klein–Gordon equation with respect to $t$ (twice to prove (A.6) and (A.7)) and argue as above, taking into account the initial conditions
$$\partial_t^2\dot{\Phi}(0) = -c^2 l(0), \tag{A.17}$$
$$\partial_t^3\dot{\Phi}(0) = c^2\big[(\Delta - \kappa^2)\mathring{\Psi}_0 - \partial_t l(0)\big]. \tag{A.18}$$
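As a brief sketch of the step "combining (A.13)–(A.16)", the chain below spells out how (A.2) and (A.3) arise for fixed $t \in [0,T]$ before the supremum is taken. It uses only the inequalities already stated; the choice $C_0(\kappa) = \max\{C(\kappa), 1\}$ is an assumption made here for illustration.

```latex
\begin{align*}
\|\dot{\Phi}(t)\|_{H^{N+1}}
  &\le C(\kappa)\,E(t)
   \le C(\kappa)\bigl(E(0) + ct\,|||l|||_{H^N,T}\bigr)
   \le C(\kappa)\bigl(c^{-1}\|\mathring{\Psi}_0\|_{H^N} + cT\,|||l|||_{H^N,T}\bigr), \\
\|\partial_t\dot{\Phi}(t)\|_{H^N}
  &\le c\,E(t)
   \le c\bigl(c^{-1}\|\mathring{\Psi}_0\|_{H^N} + cT\,|||l|||_{H^N,T}\bigr)
   = \|\mathring{\Psi}_0\|_{H^N} + c^2T\,|||l|||_{H^N,T}.
\end{align*}
```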
Corollary A.1. Assume the hypotheses of Proposition A.2, and let $C_0(\kappa)$ be the constant appearing in the conclusions of the proposition. Then
$$|||\dot{\Phi}|||_{H^{N+1},T}^2 \le (C_0(\kappa))^2 \cdot \big\{c^{-2}\|\mathring{\Psi}_0\|_{H^N}^2 + 2T \cdot |||\partial_t\dot{\Phi}|||_{H^N,T} \cdot |||l|||_{H^N,T}\big\}. \tag{A.19}$$

Proof. Inequality (A.11) gives that $\frac{1}{2}\frac{d}{dt}(E^2(t)) \le \|\partial_t\dot{\Phi}\|_{H^N}\|l(t)\|_{H^N}$. Taking into account (A.14) and (A.16), the proof of (A.19) easily follows.

Lemma A.1. Let $N \in \mathbb{N}$, and $I \in H^{N-1}$. Suppose that $\dot{\Phi} \in L^2$ and that $\Delta\dot{\Phi} - \kappa^2\dot{\Phi} = I$. Then $\dot{\Phi} \in H^{N+1}$ and
$$\|\dot{\Phi}\|_{H^{N+1}} \le C(N,\kappa)\|I\|_{H^{N-1}}. \tag{A.20}$$

Proof. For use in this argument, we define the Fourier transform through its action on integrable functions $F$ by $\hat{F}(\xi) \overset{\text{def}}{=} \int_{\mathbb{R}^d}F(s)e^{-2\pi i\xi\cdot s}\,d^ds$. The following chain of inequalities uses standard results from Fourier analysis, including Plancherel's theorem:
$$\|\dot{\Phi}\|_{H^2}^2 \le C\|(1 + |2\pi\xi|^2)\hat{\dot{\Phi}}(\xi)\|_{L^2}^2 \le C(\kappa)\int_{\mathbb{R}^d}(\kappa^2 + |2\pi\xi|^2)^2|\hat{\dot{\Phi}}(\xi)|^2\,d^d\xi = C(\kappa)\|(\kappa^2 - \Delta)\dot{\Phi}\|_{L^2}^2 = C(\kappa)\|I\|_{L^2}^2, \tag{A.21}$$
and this proves (A.20) in the case $N = 1$. To estimate $L^2$ norms of the $k$th order derivatives of $\dot{\Phi}$ for $k \ge 1$, we differentiate the equation $k$ times to arrive at the equation $\Delta\partial^{(k)}\dot{\Phi} - \kappa^2\partial^{(k)}\dot{\Phi} = \partial^{(k)}I$, and argue as above to conclude that
$$\|\partial^{(k)}\dot{\Phi}\|_{H^2}^2 \le C(\kappa)\|\partial^{(k)}I\|_{L^2}^2. \tag{A.22}$$
Now we add the estimate (A.21) to the estimates (A.22) for $1 \le k \le N-1$ to conclude (A.20).

Remark A.1. The hypothesis $\dot{\Phi} \in L^2$ does not follow from the remaining assumptions. For example, consider $g(x) = e^x$. Then $g - \frac{d^2}{dx^2}g = 0 \in L^2(\mathbb{R})$, but $g \notin H^2(\mathbb{R})$.

Proposition A.3. Assume the hypotheses of Proposition A.1. Assume further that $l \in \bigcap_{k=0}^{k=2}C^k([0,T], H^{N-k})$. Then
$$|||\dot{\Phi}|||_{H^{N+1},T} \le C(N,\kappa) \cdot \{c^{-1}\|\mathring{\Psi}_0\|_{H^N} + \|l(0)\|_{H^{N-1}} + |||l|||_{H^{N-1},T} + T\,|||\partial_t l|||_{H^{N-1},T}\} \tag{A.23}$$
and
$$|||\partial_t\dot{\Phi}|||_{H^N,T} \le C(N,\kappa) \cdot \{c\|l(0)\|_{H^{N-1}} + \|(\Delta - \kappa^2)\mathring{\Psi}_0 - \partial_t l(0)\|_{H^{N-2}} + |||\partial_t l|||_{H^{N-2},T} + T\,|||\partial_t^2 l|||_{H^{N-2},T}\}. \tag{A.24}$$

Proof. Define $I \overset{\text{def}}{=} l + c^{-2}\partial_t^2\dot{\Phi}$ and observe that $\dot{\Phi}$ is a solution to
$$\Delta\dot{\Phi} - \kappa^2\dot{\Phi} = I. \tag{A.25}$$
By inequality (A.5) of Proposition A.2, Lemma A.1, and the triangle inequality, we have that
$$|||\dot{\Phi}|||_{H^{N+1},T} \le C(N,\kappa)\,|||l + c^{-2}\partial_t^2\dot{\Phi}|||_{H^{N-1},T} \le C(N,\kappa) \cdot \{c^{-1}\|\mathring{\Psi}_0\|_{H^N} + \|l(0)\|_{H^{N-1}} + |||l|||_{H^{N-1},T} + T\,|||\partial_t l|||_{H^{N-1},T}\}, \tag{A.26}$$
which proves (A.23).

Because $\partial_t\dot{\Phi}$ satisfies the equation $-c^{-2}\partial_t^3\dot{\Phi} + \Delta(\partial_t\dot{\Phi}) - \kappa^2\partial_t\dot{\Phi} = \partial_t l$, we may use a similar argument to prove (A.24); we leave the simple modification, which makes use of (A.7), up to the reader.

Appendix B. Sobolev–Moser Estimates

In this appendix, we use notation that is as consistent as possible with our use of notation in the body of the paper. To conserve space, we refer the reader to the literature instead of providing proofs: Propositions B.1 and B.2 are similar to propositions proved in [11, Chap. 6], while Proposition B.3 is proved in [13]. The corollaries and remarks below are straightforward extensions of the propositions. With the exception of Proposition B.4, which is a standard Sobolev interpolation inequality, the proofs of the propositions given in the literature are commonly based on the following version of the Gagliardo–Nirenberg inequality [15], together with repeated use of Hölder's inequality and/or Sobolev embedding, where throughout this appendix, we abbreviate $L^p = L^p(\mathbb{R}^d)$, $H^j = H^j(\mathbb{R}^d)$, and $H^j_{\bar{V}} = H^j_{\bar{V}}(\mathbb{R}^d)$:

Lemma B.1. If $i, k \in \mathbb{N}$ with $0 \le i \le k$, and $V$ is a scalar-valued or array-valued function on $\mathbb{R}^d$ satisfying $V \in L^\infty$ and $\|\partial^{(k)}V\|_{L^2} < \infty$, then
$$\|\partial^{(i)}V\|_{L^{2k/i}} \le C(k)\,\|V\|_{L^\infty}^{1-\frac{i}{k}}\,\|\partial^{(k)}V\|_{L^2}^{\frac{i}{k}}. \tag{B.1}$$

Proposition B.1. Let $K \subset \mathbb{R}^n$ be a compact set, and let $j, d \in \mathbb{N}$ with $j > \frac{d}{2}$. Let $V : \mathbb{R}^d \to \mathbb{R}^n$ be an element of $H^j$, and assume that $V \subset K$. Let $F \in C_b^j(K)$ be a $q \times q$ matrix-valued function, and let $G \in H^j$ be a $q \times q$ ($q \times 1$) matrix-valued (array-valued) function. Then the $q \times q$ ($q \times 1$) matrix-valued (array-valued) function $(F \circ V)G$ is an element of $H^j$ and
$$\|(F \circ V)G\|_{H^j} \le C(j,d)\,|F|_{j,K}\,(1 + \|V\|_{H^j}^j)\,\|G\|_{H^j}. \tag{B.2}$$

Corollary B.1. Assume the hypotheses of Proposition B.1 with the following changes: $V, G \in C^0([0,T], H^j)$. Then the $q \times q$ ($q \times 1$) matrix-valued (array-valued) function $(F \circ V)G$ is an element of $C^0([0,T], H^j)$.

Remark B.1. We often make use of a slight modification of Proposition B.1 in which the assumption $V \in H^j$ is replaced with the assumption $V \in H^j_{\bar{V}}$, where
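$\bar{V} \in \mathbb{R}^n$ is a constant array. Under this modified assumption, the conclusion of Proposition B.1 is modified as follows:
$$\|(F \circ V)G\|_{H^j} \le C(j,d)\,|F|_{j,K}\,(1 + \|V\|_{H^j_{\bar{V}}}^j)\,\|G\|_{H^j}. \tag{B.3}$$
A similar modification can be made to Corollary B.1.

Proposition B.2. Let $K \subset \mathbb{R}^n$ be a compact convex set, and let $j, d \in \mathbb{N}$ with $j > \frac{d}{2}$. Let $F \in C_b^j(K)$ be a scalar or array-valued function. Let $V, \widetilde{V} : \mathbb{R}^d \to \mathbb{R}^n$, and assume that $V, \widetilde{V} \in H^j$. Assume further that $V, \widetilde{V} \subset K$. Then $F \circ V - F \circ \widetilde{V} \in H^j$ and
$$\|F \circ V - F \circ \widetilde{V}\|_{H^j} \le C(j, d, \|V\|_{H^j}, \|\widetilde{V}\|_{H^j})\,|F|_{j+1,K}\,\|V - \widetilde{V}\|_{H^j}. \tag{B.4}$$

Remark B.2. As in Remark B.1, we may replace the hypotheses $V, \widetilde{V} \in H^j$ from Proposition B.2 with the hypotheses $V, \widetilde{V} \in H^j_{\bar{V}}$, in which case the conclusion of the proposition is:
$$\|(F \circ V) - (F \circ \widetilde{V})\|_{H^j} \le C(j, d, \|V\|_{H^j_{\bar{V}}}, \|\widetilde{V}\|_{H^j_{\bar{V}}})\,|F|_{j+1,K}\,\|V - \widetilde{V}\|_{H^j}. \tag{B.5}$$
Furthermore, a careful analysis of the special case $\widetilde{V} = \bar{V}$, where $\bar{V} \in K$ is a constant array, gives the bound
$$\|F \circ V - F \circ \bar{V}\|_{H^j} \le C(j,d)\,|\partial F/\partial V|_{j-1,K}\,(1 + \|V\|_{H^j_{\bar{V}}}^{j-1})\,(\|V\|_{H^j_{\bar{V}}}), \tag{B.6}$$
in which we require less regularity of $F$ than we do in the general case.

Proposition B.3. Assume the hypotheses of Proposition B.1 with the following two changes: (1) Assume $j > \frac{d}{2} + 1$. (2) Assume that $G \in H^{j-1}$. Let $\vec{\alpha}$ be a spatial derivative multi-index such that $1 \le |\vec{\alpha}| \le j$. Then
$$\|\partial_{\vec{\alpha}}[(F \circ V)G] - (F \circ V)\partial_{\vec{\alpha}}G\|_{L^2} \le C(j,d)\,|\partial F/\partial V|_{j-1,K}\,(\|V\|_{H^j} + \|V\|_{H^j}^j)\,\|G\|_{H^{j-1}}. \tag{B.7}$$

Remark B.3. As in Remark B.1, we may replace the assumption $V \in H^j$ in Proposition B.3 with the assumption $V \in H^j_{\bar{V}}$, where $\bar{V}$ is a constant array, in which case we obtain
$$\|\partial_{\vec{\alpha}}[(F \circ V)G] - (F \circ V)\partial_{\vec{\alpha}}G\|_{L^2} \le C(j,d)\,|\partial F/\partial V|_{j-1,K}\,(\|V\|_{H^j_{\bar{V}}} + \|V\|_{H^j_{\bar{V}}}^j)\,\|G\|_{H^{j-1}}. \tag{B.8}$$

Proposition B.4. Let $N', N \in \mathbb{R}$ be such that $0 \le N' \le N$, and assume that $F \in H^N$. Then
$$\|F\|_{H^{N'}} \le C(N', d)\,\|F\|_{L^2}^{1 - N'/N}\,\|F\|_{H^N}^{N'/N}. \tag{B.9}$$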
References

[1] N. Andersson and G. L. Comer, Relativistic fluid dynamics: Physics for many different scales, Living Rev. Relativity 10 (2007) lrr-2007-1, 83 pp.; http://relativity.livingreviews.org/Articles/lrr-2007-1/.
[2] S. Bauer, Post-Newtonian approximation of the Vlasov–Nordström system, Comm. Partial Differential Equations 30 (2005) 957–985.
[3] S. Bauer, M. Kunze, G. Rein and A. D. Rendall, Multipole radiation in a collisionless gas coupled to electromagnetism or scalar gravitation, Comm. Math. Phys. 266 (2006) 267–288.
[4] S. Calogero, Spherically symmetric steady states of galactic dynamics in scalar gravity, Classical Quantum Gravity 20 (2003) 1729–1741.
[5] S. Calogero, Global classical solutions to the 3D Nordström–Vlasov system, Comm. Math. Phys. 266 (2006) 343–353.
[6] S. Calogero and H. Lee, The non-relativistic limit of the Nordström–Vlasov system, Commun. Math. Sci. 2 (2004) 19–34.
[7] D. Christodoulou, Self-gravitating relativistic fluids: A two-phase model, Arch. Ration. Mech. Anal. 130 (1995) 343–400.
[8] D. Christodoulou, The Action Principle and Partial Differential Equations (Princeton University Press, Princeton, NJ, 2000).
[9] D. Christodoulou, The Formation of Shocks in 3-Dimensional Fluids (European Mathematical Society, Zürich, Switzerland, 2007).
[10] Y. Guo and S. Tahvildar-Zadeh, Formation of singularities in relativistic fluid dynamics and in spherically symmetric plasma dynamics, Contemp. Math. 238 (1999) 151–161.
[11] L. Hörmander, Lectures on Nonlinear Hyperbolic Differential Equations (Springer-Verlag, Berlin, Heidelberg, New York, 1997).
[12] M. Kiessling, The “Jeans swindle”: A true story — mathematically speaking, Adv. Appl. Math. 31 (2003) 132–149.
[13] S. Klainerman and A. Majda, Singular limits of quasilinear hyperbolic systems with large parameters and the incompressible limit of compressible fluids, Comm. Pure Appl. Math. 34 (1981) 481–524.
[14] T. Makino, On a local existence theorem for the evolution equation of gaseous stars, Stud. Math. Appl. 18 (1986) 459–479.
[15] L. Nirenberg, On elliptic partial differential equations, Ann. Scuola Norm. Sup. Pisa (3) 13 (1959) 115–162.
[16] G. Nordström, Zur Theorie der Gravitation vom Standpunkt des Relativitätsprinzips, Ann. Phys. 42 (1913) 533–554.
[17] T. A. Oliynyk, The Newtonian limit for perfect fluids, Comm. Math. Phys. 276 (2007) 131–188.
[18] T. A. Oliynyk, Post-Newtonian expansions for perfect fluids, to appear in Comm. Math. Phys.; arXiv.org:0810.3752 (2008).
[19] A. D. Rendall, The initial value problem for a class of general relativistic fluid bodies, J. Math. Phys. 33 (1992) 1047–1053.
[20] A. D. Rendall, The Newtonian limit for asymptotically flat solutions of the Vlasov–Einstein system, Comm. Math. Phys. 163 (1994) 89–112.
[21] C. D. Sogge, Lectures on Nonlinear Wave Equations, Monographs in Analysis, II (International Press, Boston, MA, 1995).
[22] J. Speck, Well-posedness for the Euler–Nordström system with cosmological constant, J. Hyperbolic Differ. Equ. 6(2) (2009) 313–358.
[23] N. Straumann, General Relativity and Relativistic Astrophysics, Texts and Monographs in Physics (Springer-Verlag, Berlin, 1984).
August 12, 2009 3:57 WSPC/148-RMP
J070-00376
Reviews in Mathematical Physics Vol. 21, No. 7 (2009) 877–928
© World Scientific Publishing Company

SU(3)-GOODMAN–DE LA HARPE–JONES SUBFACTORS AND THE REALIZATION OF SU(3) MODULAR INVARIANTS

DAVID E. EVANS∗ and MATHEW PUGH†
School of Mathematics, Cardiff University, Senghennydd Road, Cardiff, CF24 4AG, Wales, United Kingdom
∗EvansDE@cardiff.ac.uk
†PughMJ@cardiff.ac.uk

Received 15 April 2009

We complete the realization by braided subfactors, announced by Ocneanu, of all SU(3)-modular invariant partition functions previously classified by Gannon.

Keywords: Subfactors; modular invariants; SU(3).

Mathematics Subject Classification 2000: 46L37, 46L60, 81T40
1. Introduction

In [24] Goodman, de la Harpe and Jones constructed a subfactor $B \subset C$ given by the embedding of the Temperley–Lieb algebra in the AF-algebra for an SU(2) ADE Dynkin diagram. We will present an SU(3) analogue of this construction, where we embed the SU(3)-Temperley–Lieb or Hecke algebra in an AF path algebra of the SU(3) ADE graphs. Using this construction, we are able to realize all the SU(3) modular invariants by subfactors.

The algebraic structures behind the integrable statistical mechanical $SU(N)$-models are the Hecke algebras $H_n(q)$ of type $A_{n-1}$, for $q \in \mathbb{C}$, since the Boltzmann weights lie in $(\bigotimes_{\mathbb{N}} M_N)^{SU(N)}$ or $(\bigotimes_{\mathbb{N}} M_N)^{SU(N)_q}$. The Hecke algebra $H_n(q)$ is the algebra generated by unitary operators $g_j$, $j = 1, 2, \ldots, n-1$, satisfying the relations
$$(q^{-1} - g_j)(q + g_j) = 0, \tag{1}$$
$$g_ig_j = g_jg_i, \quad |i - j| > 1, \tag{2}$$
$$g_ig_{i+1}g_i = g_{i+1}g_ig_{i+1}. \tag{3}$$
When $q = 1$, the first relation becomes $g_j^2 = 1$, so that $H_n(1)$ reduces to the group ring of the symmetric, or permutation, group $S_n$, where $g_j$ represents a transposition
$(j, j+1)$. Writing $g_j = q^{-1} - U_j$ where $|q| = 1$, and setting $\delta = q + q^{-1}$, these relations are equivalent to the self-adjoint operators $1, U_1, U_2, \ldots, U_{n-1}$ satisfying the relations

H1: $U_i^2 = \delta U_i$,
H2: $U_iU_j = U_jU_i$, $|i - j| > 1$,
H3: $U_iU_{i+1}U_i - U_i = U_{i+1}U_iU_{i+1} - U_{i+1}$.
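The equivalence between relations (1), (3) and H1, H3 can be checked by direct substitution. The short computation below is a sketch of that check, writing $a = q^{-1}$ as a shorthand introduced here for illustration, so that $g_i = a - U_i$ and $a\delta - a^2 = 1$.

```latex
\begin{align*}
(q^{-1} - g_j)(q + g_j) &= U_j\,(q + q^{-1} - U_j) = \delta U_j - U_j^2,
  \quad\text{so (1) holds if and only if } U_j^2 = \delta U_j. \\
g_i g_{i+1} g_i &= a^3 - a^2(2U_i + U_{i+1}) + a(U_iU_{i+1} + U_{i+1}U_i) + aU_i^2 - U_iU_{i+1}U_i, \\
g_{i+1} g_i g_{i+1} &= a^3 - a^2(2U_{i+1} + U_i) + a(U_iU_{i+1} + U_{i+1}U_i) + aU_{i+1}^2 - U_{i+1}U_iU_{i+1}. \\
\intertext{Using $U_i^2 = \delta U_i$ and subtracting, relation (3) becomes}
U_iU_{i+1}U_i - U_{i+1}U_iU_{i+1} &= (a\delta - a^2)(U_i - U_{i+1}) = U_i - U_{i+1},
\end{align*}
```

which is H3 after rearranging. Relation (2) translates into H2 immediately, since the constant term commutes with everything.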
To any $\sigma$ in the permutation group $S_n$, decomposed into transpositions of nearest neighbors $\sigma = \prod_{i \in I_\sigma}\tau_{i,i+1}$, we associate the operator
$$g_\sigma = \prod_{i \in I_\sigma}g_i,$$
which is well defined because of the braiding relation (3). Then the commutant of the quantum group $SU(N)_q$ is obtained from the Hecke algebra by imposing an extra condition, which is the vanishing of the $q$-antisymmetrizer
$$\sum_{\sigma \in S_{N+1}}(-q)^{|I_\sigma|}g_\sigma = 0. \tag{4}$$
For SU(2) it reduces to the Temperley–Lieb condition
$$U_iU_{i\pm1}U_i - U_i = 0, \tag{5}$$
and for SU(3) it is
$$(U_i - U_{i+2}U_{i+1}U_i + U_{i+1})(U_{i+1}U_{i+2}U_{i+1} - U_{i+1}) = 0. \tag{6}$$
We will say that a family of operators $\{U_m\}$ satisfies the SU(3)-Temperley–Lieb relations if they satisfy the Hecke relations H1–H3 and the extra condition (6).

The Temperley–Lieb algebra has diagrammatic representations due to Kauffman [25]. There are similar diagrammatic representations for the SU(3)-Temperley–Lieb algebra based on the spider relations of Kuperberg, which we will exploit in a later sequel [20, 21] going into SU(3)-planar algebras. However, for our purposes here, namely to construct SU(3)-Goodman–de la Harpe–Jones subfactors, it is enough to work algebraically. We will embed the SU(3)-Temperley–Lieb algebra in the path algebra of the candidate nimrep graphs for the SU(3) modular invariants, using the Boltzmann weights we constructed in [19]. This is with the exception of the graph $\mathcal{E}_4^{(12)}$, for which we did not derive the Ocneanu cells which permitted the derivation of the Boltzmann weights. However, this is still enough to realize all SU(3)-modular invariants, and compute their nimrep graphs with the exception of $\mathcal{E}_4^{(12)}$, which we will do in this paper, after first outlining the theory of modular invariants from α-induction in the next section.

2. ADE Graphs

We start with the SU(3) modular invariants. The list below of all SU(3) modular invariants was shown to be complete by Gannon [23]. Let P^(n) = {µ = (µ_1, µ_2) ∈
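ℤ² | µ_1, µ_2 ≥ 0; µ_1 + µ_2 ≤ n − 3}. These $\mu$ are the admissible representations of the Kač–Moody algebra $su(3)^\wedge$ at level $k = n - 3$. We define the automorphism $A$ of order 3 on the weights $\mu \in P^{(n)}$ by $A(\mu_1, \mu_2) = (n - 3 - \mu_1 - \mu_2, \mu_1)$. There are four infinite series of SU(3) modular invariants: the identity (or diagonal) invariant at level $n - 3$ is
$$Z_{\mathcal{A}^{(n)}} = \sum_{\mu \in P_+^{(n)}}|\chi_\mu|^2, \quad n \ge 4, \tag{7}$$
and its orbifold invariant is given by
$$Z_{\mathcal{D}^{(3k)}} = \frac{1}{3}\sum_{\substack{\mu \in P_+^{(3k)} \\ \mu_1 - \mu_2 \equiv 0 \bmod 3}}|\chi_\mu + \chi_{A\mu} + \chi_{A^2\mu}|^2, \quad k \ge 2, \tag{8}$$
$$Z_{\mathcal{D}^{(n)}} = \sum_{\mu \in P_+^{(n)}}\chi_\mu\,\chi^*_{A^{(n-3)(\mu_1 - \mu_2)}\mu}, \quad n \ge 5, \quad n \not\equiv 0 \bmod 3. \tag{9}$$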
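To make the index set and the order-3 map $A$ concrete, the following minimal Python sketch enumerates the admissible weights for a given $n$ and groups them into orbits under $A$. The function names (`admissible_weights`, `apply_A`, `orbits_under_A`) are hypothetical helpers introduced here for illustration; only the defining formulas come from the text.

```python
def admissible_weights(n):
    """Weights (mu1, mu2) with mu1, mu2 >= 0 and mu1 + mu2 <= n - 3."""
    level = n - 3
    return [(m1, m2) for m1 in range(level + 1) for m2 in range(level + 1 - m1)]


def apply_A(mu, n):
    """The map A(mu1, mu2) = (n - 3 - mu1 - mu2, mu1) from the text."""
    m1, m2 = mu
    return (n - 3 - m1 - m2, m1)


def orbits_under_A(n):
    """Partition the admissible weights into orbits {mu, A mu, A^2 mu}."""
    seen, orbits = set(), []
    for mu in admissible_weights(n):
        if mu in seen:
            continue
        orbit = {mu, apply_A(mu, n), apply_A(apply_A(mu, n), n)}
        seen |= orbit
        orbits.append(sorted(orbit))
    return orbits


if __name__ == "__main__":
    n = 6  # level 3, the case relevant to the orbifold invariant at n = 6
    print(len(admissible_weights(n)), "admissible weights at level", n - 3)
    for orbit in orbits_under_A(n):
        print(orbit)
```

Applying `apply_A` three times returns the original weight, which is the statement that $A$ has order 3; an orbit of size one corresponds to a fixed point of $A$.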
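Two other infinite series are given by their conjugate invariants. The conjugate invariant $Z_{\mathcal{A}^{(n)*}} = C$ and the conjugate orbifold invariants $Z_{\mathcal{D}^{(n)*}} = Z_{\mathcal{D}^{(n)}}C$ are
$$Z_{\mathcal{A}^{(n)*}} = \sum_{\mu \in P_+^{(n)}}\chi_\mu\,\chi^*_{\bar{\mu}}, \quad n \ge 4, \tag{10}$$
$$Z_{\mathcal{D}^{(3k)*}} = \frac{1}{3}\sum_{\substack{\mu \in P_+^{(3k)} \\ \mu_1 - \mu_2 \equiv 0 \bmod 3}}(\chi_\mu + \chi_{A\mu} + \chi_{A^2\mu})(\chi^*_{\bar{\mu}} + \chi^*_{\overline{A\mu}} + \chi^*_{\overline{A^2\mu}}), \quad k \ge 2, \tag{11}$$
$$Z_{\mathcal{D}^{(n)*}} = \sum_{\mu \in P_+^{(n)}}\chi_\mu\,\chi^*_{\overline{A^{(n-3)(\mu_1 - \mu_2)}\mu}}, \quad n \ge 5, \quad n \not\equiv 0 \bmod 3. \tag{12}$$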
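There are also exceptional invariants, i.e. invariants which are not diagonal, orbifold, or their conjugates:
$$Z_{\mathcal{E}^{(8)}} = |\chi_{(0,0)} + \chi_{(2,2)}|^2 + |\chi_{(0,2)} + \chi_{(3,2)}|^2 + |\chi_{(2,0)} + \chi_{(2,3)}|^2 + |\chi_{(2,1)} + \chi_{(0,5)}|^2 + |\chi_{(3,0)} + \chi_{(0,3)}|^2 + |\chi_{(1,2)} + \chi_{(5,0)}|^2, \tag{13}$$
$$Z_{\mathcal{E}^{(8)*}} = |\chi_{(0,0)} + \chi_{(2,2)}|^2 + (\chi_{(0,2)} + \chi_{(3,2)})(\chi^*_{(2,0)} + \chi^*_{(2,3)}) + (\chi_{(2,0)} + \chi_{(2,3)})(\chi^*_{(0,2)} + \chi^*_{(3,2)}) + (\chi_{(2,1)} + \chi_{(0,5)})(\chi^*_{(1,2)} + \chi^*_{(5,0)}) + |\chi_{(3,0)} + \chi_{(0,3)}|^2 + (\chi_{(1,2)} + \chi_{(5,0)})(\chi^*_{(2,1)} + \chi^*_{(0,5)}), \tag{14}$$
$$Z_{\mathcal{E}^{(12)}} = |\chi_{(0,0)} + \chi_{(0,9)} + \chi_{(9,0)} + \chi_{(4,4)} + \chi_{(4,1)} + \chi_{(1,4)}|^2 + 2|\chi_{(2,2)} + \chi_{(2,5)} + \chi_{(5,2)}|^2, \tag{15}$$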
$$Z_{\mathcal{E}_{MS}^{(12)}} = |\chi_{(0,0)} + \chi_{(0,9)} + \chi_{(9,0)}|^2 + |\chi_{(2,2)} + \chi_{(2,5)} + \chi_{(5,2)}|^2 + 2|\chi_{(3,3)}|^2 + |\chi_{(0,3)} + \chi_{(6,0)} + \chi_{(3,6)}|^2 + |\chi_{(3,0)} + \chi_{(0,6)} + \chi_{(6,3)}|^2 + |\chi_{(4,4)} + \chi_{(4,1)} + \chi_{(1,4)}|^2 + (\chi_{(1,1)} + \chi_{(1,7)} + \chi_{(7,1)})\chi^*_{(3,3)} + \chi_{(3,3)}(\chi^*_{(1,1)} + \chi^*_{(1,7)} + \chi^*_{(7,1)}), \tag{16}$$
$$Z_{\mathcal{E}_{MS}^{(12)*}} = |\chi_{(0,0)} + \chi_{(0,9)} + \chi_{(9,0)}|^2 + |\chi_{(2,2)} + \chi_{(2,5)} + \chi_{(5,2)}|^2 + 2|\chi_{(3,3)}|^2 + (\chi_{(0,3)} + \chi_{(6,0)} + \chi_{(3,6)})(\chi^*_{(3,0)} + \chi^*_{(0,6)} + \chi^*_{(6,3)}) + (\chi_{(3,0)} + \chi_{(0,6)} + \chi_{(6,3)})(\chi^*_{(0,3)} + \chi^*_{(6,0)} + \chi^*_{(3,6)}) + |\chi_{(4,4)} + \chi_{(4,1)} + \chi_{(1,4)}|^2 + (\chi_{(1,1)} + \chi_{(1,7)} + \chi_{(7,1)})\chi^*_{(3,3)} + \chi_{(3,3)}(\chi^*_{(1,1)} + \chi^*_{(1,7)} + \chi^*_{(7,1)}), \tag{17}$$
$$Z_{\mathcal{E}^{(24)}} = |\chi_{(0,0)} + \chi_{(4,4)} + \chi_{(6,6)} + \chi_{(10,10)} + \chi_{(21,0)} + \chi_{(0,21)} + \chi_{(13,4)} + \chi_{(4,13)} + \chi_{(10,1)} + \chi_{(1,10)} + \chi_{(9,6)} + \chi_{(6,9)}|^2 + |\chi_{(15,6)} + \chi_{(6,15)} + \chi_{(15,0)} + \chi_{(0,15)} + \chi_{(10,7)} + \chi_{(7,10)} + \chi_{(10,4)} + \chi_{(4,10)} + \chi_{(7,4)} + \chi_{(4,7)} + \chi_{(6,0)} + \chi_{(0,6)}|^2, \tag{18}$$
where $Z_{\mathcal{E}^{(12)}}$ and $Z_{\mathcal{E}^{(24)}}$ are self-conjugate, $Z_{\mathcal{E}^{(8)*}} = Z_{\mathcal{E}^{(8)}}C$ and $Z_{\mathcal{E}_{MS}^{(12)*}} = Z_{\mathcal{E}_{MS}^{(12)}}C$.

The modular invariants arising from $SU(3)_k$ conformal embeddings are (see [14]):
• $\mathcal{D}^{(6)}$: $SU(3)_3 \subset SO(8)_1$, also realized as an orbifold $SU(3)_3/\mathbb{Z}_3$,
• $\mathcal{E}^{(8)}$: $SU(3)_5 \subset SU(6)_1$, plus its conjugate,
• $\mathcal{E}^{(12)}$: $SU(3)_9 \subset (E_6)_1$,
• $\mathcal{E}^{(24)}$: $SU(3)_{21} \subset (E_7)_1$.

The Moore–Seiberg invariant $\mathcal{E}_{MS}^{(12)}$ [28], an automorphism of the orbifold invariant $\mathcal{D}^{(12)} = SU(3)_9/\mathbb{Z}_3$, is the SU(3) analogue of the $E_7$ invariant for SU(2), which is an automorphism of the orbifold invariant $D_{10} = SU(2)_{16}/\mathbb{Z}_2$ (see [9, Sec. 5.3] for a realization by a braided subfactor).

In the statistical mechanical models underlying this theory, the vertices and edges of the underlying graph are used to describe bonds on a two dimensional lattice, together with some Hamiltonian or family of Boltzmann weights. In the conformal field theory, or subfactor theory, the vertices of the graph appear as primary fields or endomorphisms of a type III factor. The simplest case of the diagonal invariant only involves the Verlinde algebra, whose fusion rules are determined by the graph $\mathcal{A}^{(n)}$. The infinite graph $\mathcal{A}^{(\infty)}$ is illustrated in Fig. 1, whilst for finite $n$, the graphs $\mathcal{A}^{(n)}$ are the subgraphs of $\mathcal{A}^{(\infty)}$, given by all the vertices $(\lambda_1, \lambda_2)$ such that $\lambda_1 + \lambda_2 \le n - 3$, and all the edges in $\mathcal{A}^{(\infty)}$ which connect these vertices.

The Verlinde algebra of SU(3) at level $k = n - 3$ will be represented by a finite system ${}_N\mathcal{X}_N$ of irreducible inequivalent endomorphisms of a type III factor $N$ [33]
Fig. 1. The infinite graph $\mathcal{A}^{(\infty)}$.
which possesses a non-degenerate braiding, with unitary operator $\varepsilon(\lambda, \mu)$ intertwining $\lambda\mu$ and $\mu\lambda$, called a braiding operator, which satisfy the Braiding Fusion Equations [8, Definition 2.2]. For every braiding $\varepsilon^+ \equiv \varepsilon$ there is an opposite braiding $\varepsilon^-$ obtained by reversing the crossings. If we have an inclusion $\iota : N \to M$ of type III factors together with a non-degenerately braided finite system ${}_N\mathcal{X}_N$ such that the dual canonical endomorphism $\theta = \bar{\iota}\iota$ decomposes as a sum of elements of ${}_N\mathcal{X}_N$ then we call $N \subset M$ a braided subfactor. The α-induced morphisms $\alpha_\lambda^\pm \in \mathrm{End}(M)$, which extend $\lambda \in {}_N\mathcal{X}_N$, are defined by the Longo–Rehren formula [27] $\alpha_\lambda^\pm = \bar{\iota}^{-1} \circ \mathrm{Ad}(\varepsilon^\pm(\lambda, \theta)) \circ \lambda \circ \bar{\iota}$. A coupling matrix $Z$ can be defined [8] by $Z_{\lambda,\mu} = \langle\alpha_\lambda^+, \alpha_\mu^-\rangle$, where $\lambda, \mu \in {}_N\mathcal{X}_N$, normalized so that $Z_{0,0} = 1$. By [6, 14] this matrix $Z$ commutes with the modular $S$- and $T$-matrices, and therefore $Z$ is a modular invariant. The right action of the $N$-$N$ system ${}_N\mathcal{X}_N$ on the $M$-$N$ system ${}_M\mathcal{X}_N$ yields a representation of the Verlinde algebra or a nimrep $G_\lambda$, of the original $N$-$N$ fusion rules, i.e. a matrix representation where all the matrix entries are nonnegative integers. These nimreps give multiplicity graphs associated to the modular invariants (or at least associated to the inclusion, as a modular invariant may be represented by wildly differing inclusions). The matrix $G_\nu$ has spectrum $S_{\lambda,\nu}/S_{\lambda,0}$ with multiplicity $Z_{\lambda,\lambda}$. In particular, the spectrum of the nimrep is determined by the diagonal part of the modular invariant and provides an automatic connection between the modular invariant and fusion graphs, which in the SU(2) case reduces to the classification by Capelli–Itzykson–Zuber [10] of modular invariants by ADE graphs. As $M$-$N$ sectors cannot be multiplied among themselves there is no associated fusion rule algebra to decompose. Nevertheless, when chiral locality does hold [5, 6] the nimrep graph ${}_M\mathcal{X}_N$ can be canonically identified with both chiral graphs ${}_M\mathcal{X}_M^\pm$, the systems induced by the images of α-induction, by $\beta \mapsto \beta \circ \iota$, $\beta \in {}_M\mathcal{X}_M^\pm$.

The question then arises whether or not every SU(3) modular invariant can be realized by a subfactor. This was claimed and announced by Ocneanu [31] in his
bimodule setting. Most of these invariants are understood in the literature. Xu [36] (see also [3–5]) looked at the conformal embedding invariants in the loop group setting of [33], taking α-induction as the principal tool. These conformal inclusions are local or type I. In particular, the chiral graphs for the $\mathcal{D}^{(6)}$, $\mathcal{E}^{(8)}$, $\mathcal{E}^{(12)}$ and $\mathcal{E}^{(24)}$ SU(3) invariants were computed. Since these inclusions are type I, the chiral graphs coincide with their nimreps, with corresponding graphs $\mathcal{D}^{(6)}$, $\mathcal{E}^{(8)}$, $\mathcal{E}_1^{(12)}$ and $\mathcal{E}^{(24)}$, respectively. These graphs are illustrated in [19, Figs. 10, 13, 14 and 16]. Note that by the spectral theory of nimreps developed in [8, 9] and described above, these graphs and the other candidate graphs of di Francesco and Zuber will now automatically have spectra described by the diagonal part of the modular invariant.

The infinite series of orbifold invariants $\mathcal{D}^{(3k)}$ were considered by Böckenhauer and Evans in [4], yielding nimreps which produce the graphs $\mathcal{D}^{(3k)}$, which are the $\mathbb{Z}_3$-orbifolds of the graphs $\mathcal{A}^{(3k)}$. Böckenhauer and Evans [4] produced a method for analyzing conjugates of conformal embedding invariants by taking an orbifold of the extended system of the level one theory of the ambient group. In [7], Böckenhauer and Evans realized all modular invariants for cyclic $\mathbb{Z}_n$ theories, in particular, charge conjugation. The conformal embedding modular invariant $\mathcal{E}^{(8)}$: $SU(3)_5 \subset SU(6)_1$ produces the $\mathcal{E}^{(8)}$ invariant and the nimrep graph $\mathcal{E}^{(8)}$. Then taking the extension $SU(6)_1 \subset SU(6)_1 \rtimes \mathbb{Z}_3$ describes charge conjugation on the cyclic $\mathbb{Z}_6$ system for $SU(6)_1$. Then the inclusion $SU(3)_5 \subset SU(6)_1 \rtimes \mathbb{Z}_3$ produces its orbifold $\mathcal{E}^{(8)}/\mathbb{Z}_3$ for the conjugate modular invariant (see Fig. 5). This procedure could be used to understand and realize $SU(3)_9 \subset (E_6)_1$, with two nimreps. One was $\mathcal{E}_1^{(12)}$, through of course the standard conformal embedding $SU(3)_9 \subset (E_6)_1$, and another the orbifold $\mathcal{E}_2^{(12)} = \mathcal{E}_1^{(12)}/\mathbb{Z}_3$ obtained from the subfactor $SU(3)_9 \subset (E_6)_1 \rtimes \mathbb{Z}_3$. The extension $(E_6)_1 \subset (E_6)_1 \rtimes \mathbb{Z}_3$ describes charge conjugation on the cyclic $\mathbb{Z}_6$ system for $(E_6)_1$. The conformal embedding inclusion is always local and so type I, but its orbifold here is not local, so this particular modular invariant $\mathcal{E}^{(12)}$ is type I for one subfactor realization and type II for another, $\mathcal{E}_1^{(12)}$ and its $\mathbb{Z}_3$-orbifold $\mathcal{E}_2^{(12)}$ (see Fig. 6).

We now realize the remaining SU(3) modular invariants $\mathcal{A}^*$, $\mathcal{D}^*$ and $\mathcal{E}_{MS}^{(12)}$ by subfactors, using an SU(3) analogue of the Goodman–de la Harpe–Jones construction of a subfactor, where we embed the SU(3)-Temperley–Lieb or Hecke algebra in an AF path algebra of the SU(3) ADE graphs. These subfactors yield nimreps which produce the graphs $\mathcal{A}^{(n)*}$, $\mathcal{D}^{(n)*}$ and $\mathcal{E}_5^{(12)}$ (see Figs. 9, 10 and 8, respectively). We can also realize the conjugate invariant of the Moore–Seiberg invariant $\mathcal{E}_{MS}^{(12)}$ by a subfactor, since this is now a product of two modular invariants (the Moore–Seiberg and conjugate) which can both be realized by subfactors, and so by [18, Theorem 3.6] their product is also realized by an inclusion. However, we have not yet been able to compute its nimrep as we have been unable to determine the cells for the graph $\mathcal{E}_4^{(12)}$ which would enable a direct computation of the desired nimrep graph using the SU(3)-Goodman–de la Harpe–Jones subfactor, or alternatively, compute the nimrep in the alternative inclusion given by the braided product of the Moore–Seiberg inclusion and the conjugate inclusion.
Almost all the ADE graphs mentioned above were proposed by di Francesco and Zuber [11] by looking for graphs whose spectrum reproduced the diagonal part of the modular invariant, aided to some degree by first listing the graphs and spectra of fusion graphs of the finite subgroups of SU(3). At that time, they proposed looking for three-colorable graphs. They succeeded, for SU(3), in finding graphs and nimreps for the orbifold invariants, and the exceptional invariants (with three candidates for the conformal embedding $SU(3)_9 \subset (E_6)_1$ invariant). All these graphs were three-colorable, and they conjectured this to be the case for all SU(3) modular invariants. Böckenhauer and Evans [2] understood that nimrep graphs for the conjugate SU(3) modular invariants were not three-colorable. This was also realized simultaneously by Behrend, Pearce, Petkova and Zuber [1] and Ocneanu [31]. Indeed Ocneanu announced in Bariloche [31] that all SU(3) modular invariants can be realized by subfactors, and the classification of their associated nimreps. He ruled out the third candidate $\mathcal{E}_3^{(12)}$ for the $\mathcal{E}^{(12)}$ modular invariant by asserting that it did not support a valid cell system. This graph was ruled out as a natural candidate in [13, Sec. 5.2].

We now list the ADE graphs: four infinite series of graphs $\mathcal{A}^{(n)}$, $\mathcal{D}^{(n)}$, $\mathcal{A}^{(n)*}$ and $\mathcal{D}^{(n)*}$, $n \le \infty$, and seven exceptional graphs $\mathcal{E}^{(8)}$, $\mathcal{E}^{(8)*}$, $\mathcal{E}_1^{(12)}$, $\mathcal{E}_2^{(12)}$, $\mathcal{E}_4^{(12)}$, $\mathcal{E}_5^{(12)}$ and $\mathcal{E}^{(24)}$. We note that all the graphs are three-colorable, except for the graphs $\mathcal{D}^{(n)}$, $n \not\equiv 0 \bmod 3$, $\mathcal{A}^{(n)*}$, $n \le \infty$, and $\mathcal{E}^{(8)*}$. For the $\mathcal{A}$ graphs, the vertices are labeled by Dynkin labels $(\lambda_1, \lambda_2)$, $\lambda_1, \lambda_2 \ge 0$. We define the color of a vertex $(\lambda_1, \lambda_2)$ of $\mathcal{A}^{(n)}$, $n < \infty$, to be $\lambda_1 - \lambda_2 \bmod 3$. There is a natural conjugation on the graph defined by $\overline{(\lambda_1, \lambda_2)} = (\lambda_2, \lambda_1)$ for all $\lambda_1, \lambda_2 \ge 0$. This conjugation interchanges the vertices of color 1 with those of color 2, but leaves the set of all vertices of color 0 invariant. For all the other three-colorable graphs there is also a conjugation. The vertices of these graphs are colored such that the conjugation again leaves the set of all vertices of color 0 invariant. We use the convention that the edges on the graph are always from a vertex of color $j$ to a vertex of color $j + 1$ (mod 3). For the non-three-colorable graphs, we will not distinguish between the color of vertices, so that all the vertices have color $j$ for any $j \in \{1, 2, 3\}$.

In this paper, we will consider the finite graphs, i.e. $\mathcal{A}^{(n)}$, $\mathcal{D}^{(n)}$, $\mathcal{A}^{(n)*}$ and $\mathcal{D}^{(n)*}$, $n < \infty$, and the exceptional $\mathcal{E}$ graphs. The figures for the complete list of the ADE graphs are given in [1, 19].

3. Ocneanu Cells

We will construct a representation of a Hecke algebra in the path algebra of an ADE graph. For more details on path algebras see [17]. This construction is not as straightforward as for SU(2) where one only needs the Perron–Frobenius eigenvector for the ADE Dynkin diagram. The McKay graph $\mathcal{G}$ of SU(3) is made of triangles, which are paths of length 3 on the graph such that the start and end vertices are the same. This corresponds to the fact that the fundamental representation $\rho$, which along with its conjugate representation $\bar{\rho}$ generates the irreducible representations of SU(3), satisfies
$\rho \otimes \rho \otimes \rho \ni 1$. To every triangle on $\mathcal{G}$ one can assign a complex number, called an Ocneanu cell. More details are given in [19]. These cells are axiomatized in the context of an arbitrary graph $\mathcal{G}$ whose adjacency matrix has Perron–Frobenius eigenvalue $[3] = [3]_q$, although in practice it will be any one of the ADE graphs. Here the quantum number $[m]_q$ is defined by $[m]_q = (q^m - q^{-m})/(q - q^{-1})$. We will frequently denote the quantum number $[m]_q$ simply by $[m]$, for $m \in \mathbb{N}$. Now $[3]_q = q^2 + 1 + q^{-2}$, so that $q$ is easily determined from the eigenvalue of $\mathcal{G}$. The quantum number $[2] = [2]_q$ is then simply $q + q^{-1}$. If $\mathcal{G}$ is an ADE graph, the Coxeter number $n$ of $\mathcal{G}$ is the number in parentheses in the notation for the graph $\mathcal{G}$, e.g. the exceptional graph $\mathcal{E}^{(8)}$ has Coxeter number 8, and $q = e^{\pi i/n}$.

We define a type I frame in an arbitrary $\mathcal{G}$ to be a pair of edges $\alpha$, $\alpha'$ which have the same start and endpoint. A type II frame will be given by four edges $\alpha_i$, $i = 1, 2, 3, 4$, such that $s(\alpha_1) = s(\alpha_4)$, $s(\alpha_2) = s(\alpha_3)$, $r(\alpha_1) = r(\alpha_2)$ and $r(\alpha_3) = r(\alpha_4)$.

Definition 3.1 ([31]). Let $\mathcal{G}$ be an arbitrary graph with Perron–Frobenius eigenvalue $[3]$ and Perron–Frobenius eigenvector $(\phi_i)$. A cell system $W$ on $\mathcal{G}$ is a map that associates to each oriented triangle $\triangle_{ijk}^{(\alpha\beta\gamma)}$ in $\mathcal{G}$ a complex number $W(\triangle_{ijk}^{(\alpha\beta\gamma)})$ with the following properties:

(i) for any type I frame in $\mathcal{G}$ we have
[Equation (19): diagrammatic identity, displayed as a figure in the original.]

(ii) for any type II frame in $\mathcal{G}$ we have
[Equation (20): diagrammatic identity, displayed as a figure in the original.]
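The quantum numbers that enter Definition 3.1 are easy to evaluate numerically. The sketch below, with hypothetical helper names `q_for_coxeter_number` and `quantum_number`, computes $[m]_q$ from the defining formula for $q = e^{\pi i/n}$; it is meant only as an illustration of the definitions in the text, not as part of the construction.

```python
import cmath
import math


def q_for_coxeter_number(n):
    """q = exp(pi*i/n), as in the text for an ADE graph with Coxeter number n."""
    return cmath.exp(1j * math.pi / n)


def quantum_number(m, q):
    """[m]_q = (q^m - q^(-m)) / (q - q^(-1)); requires q different from +1 and -1."""
    return (q**m - q**(-m)) / (q - q**(-1))


if __name__ == "__main__":
    n = 8  # Coxeter number of the exceptional graph with superscript (8)
    q = q_for_coxeter_number(n)
    two, three = quantum_number(2, q), quantum_number(3, q)
    # For q on the unit circle both values are real up to rounding error.
    print("[2] =", two.real, " [3] =", three.real)
```

For $q = e^{\pi i/n}$ the formula reduces to $[m]_q = \sin(m\pi/n)/\sin(\pi/n)$, so that $[3] = 1 + 2\cos(2\pi/n)$ is the Perron–Frobenius eigenvalue one would compare against the adjacency matrix of the graph.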
Ocneanu cells for the ADE graphs were constructed in [19], with the exception (12) of the graph E4 . Using these cells we define the connection ρ1
,ρ2 Xρρ31,ρ 4
l −→ i ρ2 = ρ3 −→ k ρ4 j
August 12, 2009 3:57 WSPC/148-RMP
J070-00376
SU(3)-Goodman–de la Harpe–Jones Subfactors and SU(3) Modular Invariants
885
for the ADE graph G by 2
1
,ρ2 ,ρ2 = q 3 δρ1 ,ρ3 δρ2 ,ρ4 − q − 3 Uρρ31,ρ , Xρρ31,ρ 4 4
(21)
,ρ2 where Uρρ31,ρ is given by the representation of the Hecke algebra, and is defined by 4 ,ρ2 = Uρρ31,ρ 4
−1 3 φ−1 s(ρ1 ) φr(ρ2 ) W (j,l,k
(λ,ρ ,ρ4 )
(λ,ρ ,ρ2 )
)W (j,l,i 1
).
(22)
λ
A representation U of the Hecke algebra corresponds to a picture in the $A_2$ web space. It will be proved in [20] that a diagrammatic algebra generated by these pictures indeed gives a representation of the Hecke algebra. More details on the relation between the $A_2$ web space of Kuperberg and the Ocneanu cells are given in [19]. The above connection corresponds to the natural braid generator $g_i$, up to a choice of phase. It was claimed in [30] and proven in [19] that the connection satisfies the unitarity property of connections
$$\sum_{\rho_3,\rho_4} X^{\rho_1,\rho_2}_{\rho_3,\rho_4}\,\overline{X^{\rho_1',\rho_2'}_{\rho_3,\rho_4}} = \delta_{\rho_1,\rho_1'}\,\delta_{\rho_2,\rho_2'}, \tag{23}$$
and the Yang–Baxter equation
$$\sum_{\sigma_1,\sigma_2,\sigma_3} X^{\sigma_1,\sigma_2}_{\rho_1,\rho_2}\, X^{\rho_3,\rho_4}_{\sigma_1,\sigma_3}\, X^{\sigma_3,\rho_5}_{\sigma_2,\rho_6} = \sum_{\sigma_1,\sigma_2,\sigma_3} X^{\rho_3,\sigma_2}_{\rho_1,\sigma_1}\, X^{\sigma_1,\sigma_3}_{\rho_2,\rho_6}\, X^{\rho_4,\rho_5}_{\sigma_2,\sigma_3}, \tag{24}$$
provided that the cells $W(\triangle)$ satisfy (19), (20).

4. General Construction

In this section, we will construct the SU(3)-Goodman–de la Harpe–Jones subfactors. We first present some results that will be needed for this construction. Let $U_1, U_2, \ldots, U_{m-1}$ be operators which satisfy H1–H3 with parameter δ. We let
$$F_i := U_iU_{i+1}U_i - U_i = U_{i+1}U_iU_{i+1} - U_{i+1}, \tag{25}$$
for i = 1, 2, . . . , m − 2. These operators $F_i$ correspond to the picture in the $A_2$ web space.
Lemma 4.1. With $F_i$ defined as above, $F_iF_{i+1}F_i = \delta^2 F_i$ if and only if the $U_i$ satisfy the extra SU(3) relation (6).

Proof. The condition (6) can be written as
$$U_{i+2}U_{i+1}U_iU_{i+1}U_{i+2}U_{i+1} + U_iU_{i+1} - U_iU_{i+1}U_{i+2}U_{i+1} - U_{i+2}U_{i+1}U_iU_{i+1} = \delta(U_{i+1}U_{i+2}U_{i+1} - U_{i+1}). \tag{26}$$
We have
$$\begin{aligned}
F_iF_{i+1}F_i &= (U_{i+1}U_iU_{i+1} - U_{i+1})(U_{i+1}U_{i+2}U_{i+1} - U_{i+1})(U_{i+1}U_iU_{i+1} - U_{i+1}) \\
&= (U_{i+1}U_i - 1)(U_{i+1}^2U_{i+2}U_{i+1}^2 - U_{i+1}^3)(U_iU_{i+1} - 1) \\
&= (U_{i+1}U_iU_{i+1} - U_{i+1})(\delta^2U_{i+2} - \delta 1)(U_{i+1}U_iU_{i+1} - U_{i+1}) \\
&= \delta(U_iU_{i+1}U_i - U_i)(\delta U_{i+2} - 1)(U_iU_{i+1}U_i - U_i) \\
&= \delta(\delta U_iU_{i+1}U_iU_{i+2}U_iU_{i+1}U_i - \delta U_iU_{i+1}U_iU_{i+2}U_i - \delta U_iU_{i+2}U_iU_{i+1}U_i + \delta U_iU_{i+2}U_i \\
&\qquad - U_iU_{i+1}U_i^2U_{i+1}U_i + U_iU_{i+1}U_i^2 + U_i^2U_{i+1}U_i - U_i^2).
\end{aligned}$$
In the following we use relation H3 to transform each expression, and we indicate which terms have been replaced at each stage by enclosing them within square brackets [ ]. Since $U_i$, $U_{i+2}$ commute by H1, we have
$$\begin{aligned}
&\delta^2(\delta U_iU_{i+1}U_{i+2}[U_iU_{i+1}U_i] - \delta U_iU_{i+1}U_{i+2}U_i - \delta U_{i+2}U_iU_{i+1}U_i + \delta U_iU_{i+2} - U_i[U_{i+1}U_iU_{i+1}]U_i + 2U_iU_{i+1}U_i - U_i) \\
&= \delta^2(\delta U_i[U_{i+1}U_{i+2}U_{i+1}]U_iU_{i+1} - \delta U_iU_{i+1}U_{i+2}U_{i+1} + \delta U_iU_{i+1}U_{i+2}U_i - \delta U_iU_{i+1}U_{i+2}U_i - \delta U_{i+2}U_iU_{i+1}U_i \\
&\qquad + \delta U_iU_{i+2} - U_i^2U_{i+1}U_i^2 - U_iU_{i+1}U_i + U_i^3 + 2U_iU_{i+1}U_i - U_i) \\
&= \delta^2(\delta U_{i+2}[U_iU_{i+1}U_i]U_{i+2}U_{i+1} - \delta U_iU_{i+2}U_iU_{i+1} + \delta U_i[U_{i+1}U_iU_{i+1}] - \delta U_iU_{i+1}U_{i+2}U_{i+1} - \delta U_{i+2}[U_iU_{i+1}U_i] \\
&\qquad + \delta U_iU_{i+2} - (\delta^2 - 1)(U_iU_{i+1}U_i - U_i)) \\
&= \delta^2(\delta U_{i+2}U_{i+1}U_iU_{i+1}U_{i+2}U_{i+1} - \delta[U_{i+2}U_{i+1}U_{i+2}]U_{i+1} + \delta U_{i+2}U_iU_{i+2}U_{i+1} - \delta^2U_{i+2}U_iU_{i+1} + \delta U_i^2U_{i+1}U_i - \delta U_i^2 \\
&\qquad + \delta U_iU_{i+1} - \delta U_iU_{i+1}U_{i+2}U_{i+1} - \delta U_{i+2}U_{i+1}U_iU_{i+1} + \delta U_{i+2}U_{i+1} - \delta U_{i+2}U_i + \delta U_iU_{i+2} - (\delta^2 - 1)(U_iU_{i+1}U_i - U_i)) \\
&= \delta^2(\delta(U_{i+2}U_{i+1}U_iU_{i+1}U_{i+2}U_{i+1} + U_iU_{i+1} - U_iU_{i+1}U_{i+2}U_{i+1} - U_{i+2}U_{i+1}U_iU_{i+1}) \\
&\qquad - \delta^2U_{i+1}U_{i+2}U_{i+1} + \delta^2U_{i+1} - \delta U_{i+2}U_{i+1} + \delta U_{i+2}U_{i+1} + U_iU_{i+1}U_i - U_i) \\
&= \delta^2(\delta^2(U_{i+1}U_{i+2}U_{i+1} - U_{i+1}) - \delta^2(U_{i+1}U_{i+2}U_{i+1} - U_{i+1}) + U_iU_{i+1}U_i - U_i) \\
&= \delta^2F_i,
\end{aligned}$$
where the penultimate equality follows from (26).
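A minimal sanity check of Lemma 4.1, under a deliberately crude assumption: every $U_i$ is replaced by the scalar δ, which satisfies H1–H3 trivially. The sketch compares the two conditions of the lemma in that one-dimensional setting, reading (26) with the term $U_iU_{i+1}$ entering with a plus sign, as it does in the last displayed steps of the proof. The function names are hypothetical and the scalar choice is only an illustration, not the representation used in the paper.

```python
import math


def lemma_4_1_defect(delta: float) -> float:
    """F_i F_{i+1} F_i - delta^2 F_i for the scalar choice U_i = delta."""
    f = delta**3 - delta  # F_i = U_i U_{i+1} U_i - U_i
    return f**3 - delta**2 * f


def relation_26_defect(delta: float) -> float:
    """Left side minus right side of (26) for the scalar choice U_i = delta."""
    u = delta
    lhs = u**6 + u**2 - u**4 - u**4
    rhs = delta * (u**3 - u)
    return lhs - rhs


for delta in (math.sqrt(2.0), 1.5):
    print(delta, lemma_4_1_defect(delta), relation_26_defect(delta))
```

In this scalar setting both defects vanish together at δ = √2 (the value of [2] for n = 4) and are both non-zero at δ = 1.5, consistent with the equivalence stated in the lemma.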
Note that if the condition (6) is satisfied, $\mathrm{alg}(1, F_i \mid i = 1, \ldots, m - 1)$ is not the Temperley–Lieb algebra, since although $F_iF_j = F_jF_i$ for |i − j| > 2, it is not the case for |i − j| = 2, indeed $F_iF_{i+2}F_i = \delta F_iU_{i+3}$ so that $F_i$, $F_{i+2}$ do not commute.

We will now define a representation of the Hecke operators $U_k$ as elements of the path algebra for ADE graphs. Let G be a finite ADE graph with Coxeter number n < ∞. Let $M_0 = \mathbb{C}^{n_0}$ where $n_0$ is the number of 0-colored vertices of G, and let $M_0 \subset M_1 \subset M_2 \subset \cdots$ be finite dimensional von Neumann algebras, with the Bratteli diagram for the inclusion $M_j \subset M_{j+1}$ given by the graph G, j ≥ 0. Let $(\mu, \mu')$ be matrix units indexed by paths $\mu$, $\mu'$ on G, and denote by $E_G$, $V_G$ the edges, vertices of G respectively. We define maps $s, r : E_G \to V_G$, where for an edge $\gamma \in E_G$, s(γ) denotes the source vertex of γ and r(γ) its range vertex. We define operators $U_k \in M_{k+1}$, for k = 1, 2, . . ., by
$$U_k = \sum_{\sigma,\beta_i,\gamma_i} U^{\beta_2,\gamma_2}_{\beta_1,\gamma_1}\,(\sigma\cdot\beta_1\cdot\gamma_1,\ \sigma\cdot\beta_2\cdot\gamma_2), \tag{27}$$
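where the summation is over all paths σ of length k − 1 and edges $\beta_1, \beta_2, \gamma_1, \gamma_2$ of G such that $r(\sigma) = s(\beta_1) = s(\beta_2)$, $s(\gamma_i) = r(\beta_i)$ for i = 1, 2, and $r(\gamma_1) = r(\gamma_2)$, and with $U^{\beta_2,\gamma_2}_{\beta_1,\gamma_1}$ defined in (22). We will use the notation $W_{\rho_1,\rho_2,\rho_3}$ for $W(\triangle_{i_1,i_2,i_3}^{(\rho_1,\rho_2,\rho_3)})$, where $i_l = s(\rho_l)$, l = 1, 2, 3.

Lemma 4.2. With $U_k \in M_{k+1}$ given as in (27), the operator $F_k \in M_{k+2}$ defined in (25) is given by
$$F_k = \sum_{\sigma,\beta_i,\gamma_i} \frac{1}{\phi_{r(\beta_3)}^2}\,\overline{W_{\gamma_1,\gamma_2,\gamma_3}}\,W_{\beta_1,\beta_2,\beta_3}\,(\sigma\cdot\beta_1\cdot\beta_2\cdot\beta_3,\ \sigma\cdot\gamma_1\cdot\gamma_2\cdot\gamma_3), \tag{28}$$
where the summation is over all paths σ of length k − 1 and edges $\beta_i$, $\gamma_i$ of G, i = 1, 2, 3, such that $s(\beta_1) = s(\gamma_1) = r(\beta_3) = r(\gamma_3)$.

Proof. We have
$$\begin{aligned}
U_kU_{k+1}U_k &= \sum_{\sigma_i,\beta_i,\gamma_i,\mu_i} U^{\beta_2,\gamma_2}_{\beta_1,\gamma_1}U^{\beta_4,\gamma_4}_{\beta_3,\gamma_3}U^{\beta_6,\gamma_6}_{\beta_5,\gamma_5}\,(\sigma_1\cdot\beta_1\cdot\gamma_1\cdot\mu_1,\ \sigma_1\cdot\beta_2\cdot\gamma_2\cdot\mu_1) \\
&\qquad \times (\sigma_2\cdot\mu_2\cdot\beta_3\cdot\gamma_3,\ \sigma_2\cdot\mu_2\cdot\beta_4\cdot\gamma_4)(\sigma_3\cdot\beta_5\cdot\gamma_5\cdot\mu_3,\ \sigma_3\cdot\beta_6\cdot\gamma_6\cdot\mu_3) \\
&= \sum_{\sigma_i,\beta_i,\gamma_i,\mu_i} U^{\beta_2,\beta_3}_{\beta_1,\gamma_1}U^{\beta_4,\mu_3}_{\beta_3,\mu_1}U^{\beta_6,\gamma_6}_{\beta_2,\beta_4}\,(\sigma_1\cdot\beta_1\cdot\gamma_1\cdot\mu_1,\ \sigma_1\cdot\beta_6\cdot\gamma_6\cdot\mu_3) \\
&= \sum_{\sigma_i,\beta_i,\gamma_i,\mu_i,\lambda_i} \frac{1}{\phi_{s(\beta_6)}\phi_{r(\gamma_6)}\phi_{s(\beta_4)}\phi_{r(\mu_1)}\phi_{s(\beta_1)}\phi_{r(\gamma_1)}}\,\overline{W_{\beta_6,\gamma_6,\lambda_1}}\,W_{\beta_2,\beta_4,\lambda_1}\,\overline{W_{\beta_4,\mu_3,\lambda_2}}\,W_{\beta_3,\mu_1,\lambda_2}\,\overline{W_{\beta_2,\beta_3,\lambda_3}} \\
&\qquad \times W_{\beta_1,\gamma_1,\lambda_3}\,(\sigma_1\cdot\beta_1\cdot\gamma_1\cdot\mu_1,\ \sigma_1\cdot\beta_6\cdot\gamma_6\cdot\mu_3) \\
&= \sum_{\sigma_i,\beta_i,\gamma_i,\mu_i,\lambda_i} \frac{1}{\phi_{s(\beta_6)}\phi_{r(\gamma_6)}\phi_{r(\mu_1)}\phi_{s(\beta_1)}\phi_{r(\gamma_1)}}\,\overline{W_{\beta_6,\gamma_6,\lambda_1}}\,W_{\beta_1,\gamma_1,\lambda_3}\,\bigl(\delta_{\lambda_1,\mu_3}\delta_{\lambda_3,\mu_1}\phi_{s(\mu_3)}\phi_{r(\mu_3)}\phi_{s(\mu_1)} \\
&\qquad + \delta_{\lambda_1,\lambda_3}\delta_{\mu_1,\mu_3}\phi_{r(\lambda_1)}\phi_{s(\mu_3)}\phi_{r(\mu_3)}\bigr)(\sigma_1\cdot\beta_1\cdot\gamma_1\cdot\mu_1,\ \sigma_1\cdot\beta_6\cdot\gamma_6\cdot\mu_3) \qquad (29) \\
&= \sum_{\sigma,\beta_i,\gamma_i,\mu_i} \frac{1}{\phi_{r(\mu_1)}^2}\,\overline{W_{\beta_6,\gamma_6,\mu_3}}\,W_{\beta_1,\gamma_1,\mu_1}\,(\sigma_1\cdot\beta_1\cdot\gamma_1\cdot\mu_1,\ \sigma_1\cdot\beta_6\cdot\gamma_6\cdot\mu_3) + U_k,
\end{aligned}$$
where we obtain (29) by Ocneanu’s type II equation (20).

Note that if p is a minimal projection in $M_k$ corresponding to a vertex (v, k) of the Bratteli diagram $\widehat{G}$ of G, then $F_{k+1}p$ is a projection in $M_{k+3}$ corresponding to the vertex (v, k + 3) of $\widehat{G}$, since from (28) we see that the last three edges in any pairs of paths in $F_{k+1}$ form a closed loop of length 3 and hence the pairs of paths in $F_{k+1}p \in M_{k+3}$ must have the same end vertex as $p \in M_k$.

Lemma 4.3. The operators $U_k$ defined in (27) satisfy the SU(3)-Temperley–Lieb relations.

Proof. These operators satisfy the Hecke relations H1–H3 since the connection defined in (21) satisfies the Yang–Baxter equation. We are left to show that they satisfy (6). By Lemma 4.1, we need only show that $F_kF_{k+1}F_k = [2]^2F_k$. We have $F_kF_{k+1}F_k$ =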
1
Wγ7 ,γ8 ,γ9 Wβ7 ,β8 ,β9 φ2 φ2 φ2 σi ,βi , r(β3 ) r(β6 ) r(β9 ) γi ,µi
× Wγ4 ,γ5 ,γ6 Wβ4 ,β5 ,β6 Wγ1 ,γ2 ,γ3 Wβ1 ,β2 ,β3 × (σ1 · β1 · β2 · β3 · µ1 , σ1 · γ1 · γ2 · γ3 · µ1 ) × (σ2 · µ2 · β4 · β5 · β6 , σ2 · µ2 · γ4 · γ5 · γ6 ) × (σ3 · β7 · β8 · β9 · µ3 , σ3 · γ7 · γ8 · γ9 · µ3 ) 1 = Wγ7 ,γ8 ,γ9 Wβ7 ,β8 ,β9 φ2r(β3 ) φ2r(µ1 ) φ2s(µ3 ) σ1 ,βi , γi ,µi
× Wβ8 ,β9 ,µ3 Wβ4 ,β5 ,µ1 Wβ7 ,β4 ,β5 Wβ1 ,β2 ,β3 × (σ1 · β1 · β2 · β3 · µ1 , σ1 · γ7 · γ8 · γ9 · µ3 ) φs(µ1 ) φr(µ1 ) φs(µ3 ) φr(µ3 ) Wγ7 ,γ8 ,γ9 Wβ1 ,β2 ,β3 δµ1 ,β7 δµ1 ,µ3 = [2]2 φ2r(β3 ) φ2r(µ1 ) φ2s(µ3 ) σ1 ,βi , γi ,µi
× (σ1 · β1 · β2 · β3 · µ1 , σ1 · γ7 · γ8 · γ9 · µ3 )
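$$= [2]^2 \sum_{\sigma_1,\beta_i,\gamma_i,\mu_1} \frac{1}{\phi_{r(\beta_3)}^2}\,\overline{W_{\gamma_7,\gamma_8,\gamma_9}}\,W_{\beta_1,\beta_2,\beta_3}\,(\sigma_1\cdot\beta_1\cdot\beta_2\cdot\beta_3\cdot\mu_1,\ \sigma_1\cdot\gamma_7\cdot\gamma_8\cdot\gamma_9\cdot\mu_1) = [2]^2F_k.$$

By [12, Theorem 6.1] there is a unique normalized faithful trace on $\bigcup_k M_k$, defined as in [16] by
$$\mathrm{tr}((\sigma_1, \sigma_2)) = \delta_{\sigma_1,\sigma_2}\,[3]^{-k}\phi_{r(\sigma_1)}, \tag{30}$$
for paths $\sigma_i$ of length k, i = 1, 2, k = 0, 1, . . . . The conditional expectation of $M_k$ onto $M_{k-1}$ with respect to the trace is given by
$$E((\sigma_1\cdot\sigma_1',\ \sigma_2\cdot\sigma_2')) = \delta_{\sigma_1',\sigma_2'}\,[3]^{-1}\,\frac{\phi_{r(\sigma_1')}}{\phi_{r(\sigma_1)}}\,(\sigma_1, \sigma_2),$$
for paths $\sigma_i$ of length k − 1, and $\sigma_i'$ of length 1, i = 1, 2, k = 1, 2, . . . (see e.g. [17, Lemma 11.7]).

Lemma 4.4. For an ADE graph G, let $M_0 = \mathbb{C}^{n_0}$ where $n_0$ is the number of 0-colored vertices of G. Let $M_0 \subset M_1 \subset M_2 \subset \cdots$ be a sequence of finite dimensional von Neumann algebras with normalized trace. Then for the operator $U_k \in M_{k+1}$ defined in (27), tr is a Markov trace in the sense that $\mathrm{tr}(xU_k) = [2][3]^{-1}\mathrm{tr}(x)$ for any $x \in M_k$, k ≥ 1.

Proof. Let $x \in M_k$ be the matrix unit $(\alpha_1\cdot\alpha_1',\ \alpha_2\cdot\alpha_2')$. Then
$$\begin{aligned}
xU_k &= \sum_{\sigma,\beta_i,\gamma_i,\mu} U^{\beta_2,\gamma_2}_{\beta_1,\gamma_1}\,(\alpha_1\cdot\alpha_1'\cdot\mu,\ \alpha_2\cdot\alpha_2'\cdot\mu)\cdot(\sigma\cdot\beta_1\cdot\gamma_1,\ \sigma\cdot\beta_2\cdot\gamma_2) \\
&= \sum_{\sigma,\beta_i,\gamma_i,\mu} U^{\beta_2,\gamma_2}_{\beta_1,\gamma_1}\,\delta_{\alpha_2,\sigma}\delta_{\alpha_2',\beta_1}\delta_{\mu,\gamma_1}\,(\alpha_1\cdot\alpha_1'\cdot\mu,\ \sigma\cdot\beta_2\cdot\gamma_2) \\
&= \sum_{\beta_2,\gamma_2,\mu} U^{\beta_2,\gamma_2}_{\alpha_2',\mu}\,(\alpha_1\cdot\alpha_1'\cdot\mu,\ \alpha_2\cdot\beta_2\cdot\gamma_2),
\end{aligned}$$
and, since the matrix units in $M_{k+1}$ are indexed by paths of length k + 1,
$$\begin{aligned}
\mathrm{tr}(xU_k) &= \sum_{\beta_2,\gamma_2,\mu} U^{\beta_2,\gamma_2}_{\alpha_2',\mu}\,\mathrm{tr}((\alpha_1\cdot\alpha_1'\cdot\mu,\ \alpha_2\cdot\beta_2\cdot\gamma_2)) \\
&= \sum_{\beta_2,\gamma_2,\mu} U^{\beta_2,\gamma_2}_{\alpha_2',\mu}\,\delta_{\alpha_1,\alpha_2}\delta_{\alpha_1',\beta_2}\delta_{\mu,\gamma_2}\,[3]^{-k-1}\phi_{r(\mu)} \\
&= \delta_{\alpha_1,\alpha_2}\,[3]^{-k-1}\sum_{\mu} U^{\alpha_1',\mu}_{\alpha_2',\mu}\,\phi_{r(\mu)} \\
&= \delta_{\alpha_1,\alpha_2}\,[3]^{-k-1}\sum_{\mu,\lambda} \frac{1}{\phi_{s(\alpha_1')}\phi_{r(\mu)}}\,\overline{W_{\lambda,\alpha_1',\mu}}\,W_{\lambda,\alpha_2',\mu}\,\phi_{r(\mu)} \\
&= \delta_{\alpha_1,\alpha_2}\,[3]^{-k-1}\,\frac{1}{\phi_{s(\alpha_1')}}\,[2]\,\phi_{s(\alpha_1')}\phi_{r(\alpha_1')}\,\delta_{\alpha_1',\alpha_2'} = [2][3]^{-1}\mathrm{tr}(x),
\end{aligned}$$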
where we have used Ocneanu’s type I equation (19) in the penultimate equality. The result for any $x \in M_k$ follows by linearity of the trace.

Then we have $\mathrm{tr}(U_k) = [2]/[3]$, and the conditional expectation of $U_k \in M_{k+1}$ onto $M_k$ is $E(U_k) = [2]1_k/[3]$, for all k ≥ 1. We will need the following result:

Lemma 4.5. Let $F_i \in M_{i+2}$ be as above and tr a Markov trace on the $M_i$, i = 1, 2, . . . , then $\mathrm{tr}(F_{k+1}x) = [2][3]^{-2}\mathrm{tr}(x)$, for $x \in M_k$, $k \in \mathbb{N}$.

Proof. Now $\mathrm{tr}(U_{k+1}U_{k+2}U_{k+1}x) = \mathrm{tr}(U_{k+2}U_{k+1}xU_{k+1}) = [2][3]^{-1}\mathrm{tr}(U_{k+1}xU_{k+1})$, since tr is a Markov trace. Then $\mathrm{tr}(U_{k+1}xU_{k+1}) = \mathrm{tr}(U_{k+1}^2x) = [2]\,\mathrm{tr}(U_{k+1}x) = [2]^2[3]^{-1}\mathrm{tr}(x)$. We also have $\mathrm{tr}(U_{k+1}x) = [2][3]^{-1}\mathrm{tr}(x)$, so that
$$\mathrm{tr}((U_{k+1}U_{k+2}U_{k+1} - U_{k+1})x) = \left(\frac{[2]^3}{[3]^2} - \frac{[2]}{[3]}\right)\mathrm{tr}(x) = \frac{[2]}{[3]^2}\,\mathrm{tr}(x),$$
where the last equality uses $[2]^2 = [3] + 1$.
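The final simplification in the proof of Lemma 4.5 can be checked numerically. The sketch below assumes $q = e^{\pi i/n}$, so that $[m] = \sin(m\pi/n)/\sin(\pi/n)$, and compares the coefficient $[2]^3/[3]^2 - [2]/[3]$ with $[2]/[3]^2$ for a few Coxeter numbers. The helper names and the sampled values of n are illustrative assumptions.

```python
import math


def quantum_number(m: int, n: int) -> float:
    """[m]_q for q = exp(pi*i/n), via the closed form sin(m*pi/n)/sin(pi/n)."""
    return math.sin(m * math.pi / n) / math.sin(math.pi / n)


def trace_constant(n: int) -> float:
    """Coefficient c with tr(F_{k+1} x) = c * tr(x), as it first appears in the proof."""
    two, three = quantum_number(2, n), quantum_number(3, n)
    return two**3 / three**2 - two / three


for n in (4, 5, 6, 8, 12):
    two, three = quantum_number(2, n), quantum_number(3, n)
    print(n, trace_constant(n), two / three**2)
```

The two printed columns agree because $[2]^2 - [3] = 1$ for every q.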
Proposition 4.6. With Uk ∈ Mk+1 as above and x ∈ Mk , k = 1, 2, . . . , x commutes with Uk if and only if x ∈ Mk−1 , i.e. Mk−1 = {Uk } ∩ Mk . ∩ Mk , it is clear that x ∈ Mk−1 commutes with Uk . Proof. Since Uk ∈ Mk−1 We now check the converse. Let x = αi ,αi λα1 ·α2 ,α1 ·α2 (α1 · α2 , α1 · α2 ) ∈ Mk , where the summation is over all |αi | = k − 1, |αi | = 1, i = 1, 2. Assume that x commutes with Uk . We have the inclusion of x in Mk+1 given by x = αi ,αi ,µ λα1 ·α2 ,α1 ·α2 (α1 · α2 · µ, α1 · α2 · µ). Since x commutes with Uk we have Uk2 x = Uk xUk , and taking the conditional expectation onto Mk we have
[2]E(Uk x) = E(Uk xUk ).
(31)
By the Markov property of the trace on the Mk , the left hand side gives [2]E(Uk x) = [2]E(Uk )x = [2]2 x/[3], since x ∈ Mk . For the right-hand side of (31) we have E(Uk xUk ) ,γ4 =E Uββ34,γ (σ · β3 · γ3 , σ · β4 · γ4 ) 3 σ,β3 ,β4 , γ3 ,γ4
×
αi ,αi ,β2 , γ2 ,µ
,γ2 Uαβ2,µ λα1 ·α2 ,α1 ·α2 (α1 · α2 · µ, α1 · β2 · γ2 ) 2
β4 ,γ4 β2 ,γ2 =E Uβ3 ,γ3 Uα ,µ λα1 ·α2 ,α1 ·α2 δα1 ,σ δβ4 ,α2 δγ4 ,µ (σ · β3 · γ3 , α1 · β2 · γ2 ) 2
σ,αi ,αi βi ,γi ,µ
= [3]−1
,γ2 Uβα32,γ,µ3 Uαβ2,µ λα1 ·α2 ,α1 ·α2 E((α1 · β3 · γ3 , α1 · β2 · γ2 )) 2
αi ,αi ,βi , γi ,µ
= [3]−1
α ,µ β ,γ φr(γ2 ) (α1 · β3 , α · β2 ) Uβ32,γ3 Uα2,µ2 λα1 ·α2 ,α1 ·α2 δγ2 ,γ3 1 2 φr(β )
α1 ,α1 β3 ,β2
= α−1
3
α2 ,α2 γi ,µ
bα1 ·β1 ,α1 ·β2 (α1 · β1 , α1 · β2 ),
α1 ,α1 β1 ,β2
where bα1 ·β1 ,α1 ·β2 =
α2 ,α2 γ,µ
φ
,µ β2 ,γ r(γ) Uβα12,γ Uα ,µ λα1 ·α2 ,α1 ·α2 φr(β . Then for any paths α1 , ) 2
1
α1 and edges β1 , β2 on G we have bα1 ·β1 ,α1 ·β2 =
α2 ,α2 ,γ µ,ζi
1 1 Wβ1 γζ1 Wα2 µζ1 φs(α1 ) φr(γ) φs(α2 ) φr(γ)
× Wα2 µζ2 Wβ2 γζ2 λα1 ·α2 ,α1 ·α2 =
α2 ,α2
1 λα ·α ,α ·α φ2s(α2 ) φr(β1 ) 1 2 1 2
×
γ,µ,ζi
=
α2 ,α2
=
φr(γ) φr(β1 )
1 Wβ1 γζ1 Wα2 µζ1 Wα2 µζ2 Wβ2 γζ2 φr(γ)
1 λα ·α ,α ·α (φr(α2 ) φs(α2 ) φr(β1 ) δα2 ,α2 δβ1 ,β2 φ2s(α2 ) φr(β1 ) 1 2 1 2
+ φs(α2 ) φr(β1 ) φs(α2 ) δα2 ,β1 δα2 ,β2 )
(32)
φr(α2 )
(33)
α2
φs(α2 )
λα1 ·α2 ,α1 ·α2 δβ1 ,β2 + λα1 ·β1 ,α1 ·β2 ,
where equality (32) follows by Ocneanu’s type II equation (20). Since β1 = β2 in the first term in (33), here r(α1 ) = r(α1 ). We define λr(α1 ) :=
β
δs(β ),r(α1 )
φr(β ) λα ·β ,α1 ·β , φr(α1 ) 1
which only depends on the range of the path α1 (which is equal to the range of α1 ). Then we have for the right-hand side of (31) φr(α2 ) E(Uk xUk ) = [3]−1 λα1 ·α2 ,α1 ·α2 δβ1 ,β2 (α1 · β1 , α1 · β2 ) φ s(α ) 2 β ,β , 1
2
αi ,α1
+
λα1 ·β1 ,α1 ·β2 (α1 · β1 , α1 · β2 )
β1 ,β2 ,
α1 ,α1
= [3]−1 λs(β) (α1 · β, α1 · β) + λα1 ·β1 ,α1 ·β2 (α1 · β1 , α1 · β2 ) β,α1 ,α1
β1 ,β2 ,
α1 ,α1
$= [3]^{-1}(w + x)$, where $w = \sum_{\alpha_1,\alpha_1'} \lambda_{r(\alpha_1)}(\alpha_1, \alpha_1') \in M_{k-1}$. Then (31) gives $([2]^2 - 1)x = w$, and since $[2]^2 - 1 = [3] \neq 0$ we see that $x \in M_{k-1}$.
Remark. The above proof was motivated by the following pictorial argument, which uses concepts which will be introduced in [20]. Let ȷ be the inclusion of $M_{k-1}$ in $M_k$ and ı the inclusion of $M_k$ in $M_{k+1}$. For $x \in M_{k-1}$, we have the embedding ı(x) of x into $M_{k+1}$, and $U_1 \in M_{k+1}$ given by the tangles:
Then inserting x and $U_1$ into the discs of the multiplication tangle $M_{0,k+1}$, we have
and clearly $U_1ı(x) = ı(x)U_1$.
Fig. 2. ı(x) for $x \in M_k$.
Conversely, if $x \in M_k$ we have $ı(x) \in M_{k+1}$ as in Fig. 2. Let $U_1ı(x) = ı(x)U_1$, then we have the following equality of tangles:
Let T be the tangle
We enclose both sides of $U_1ı(x) = ı(x)U_1$ by the tangle T. Now $T(U_1ı(x)) = \delta^2ı(x)$, whilst $T(ı(x)U_1)$ is
i.e. $T(ı(x)U_1) = x + ȷ(v)$, where $v = E_{M_{k-1}}(x) \in M_{k-1}$. So $\delta^2x = x + ȷ(v)$ which gives $x = (\delta^2 - 1)^{-1}ȷ(v)$, i.e. $x \in M_{k-1}$.

We define the depth of the graph G to be $d_G = \max_{v,v' \in V_G} d_{v,v'}$, where $d_{v,v'}$ is the length of the shortest path between any two vertices $v, v' \in V_G$.

Lemma 4.7. Let G be an SU(3) ADE graph (except $\mathcal{D}^{(n)}$ for n ≡ 0 mod 3, and $\mathcal{E}_4^{(12)}$). Then with $U_j \in M_{j+1}$ as above, any element of $M_{m+1}$ can be written as a linear combination of elements of the form $aU_mb$ and c for $a, b, c \in M_m$, $m \geq d_G + 3$.

Proof. Let $a = (\lambda_1\cdot\lambda_2,\ \zeta_1\cdot\zeta_2)$, $b = (\zeta_1\cdot\zeta_2',\ \nu_1\cdot\nu_2) \in M_m$ such that $\lambda_1, \zeta_1, \nu_1$ are paths of length m − 1 on G starting from one of the 0-colored vertices of
G, and λ2 , ζ2 , ζ2 , ν2 are edges on G. Then with Um as in (27), and embedding a, b in Mm+1 , we have ,γ2 (λ1 · λ2 · µ, ν1 · ν2 · µ ) Uνν12,γ δ δ δ δ δ aUm b = 1 ζ1 ,σ ζ2 ,ν1 µ,γ1 ν2 ,ζ2 γ2 ,µ σ,βi ,γi ,µ,µ
=
µ,µ
=
ζ ,µ
Uζ22,µ (λ1 · λ2 · µ, ν1 · ν2 · µ )
µ,µ ,ξ
1 W ((ξ,ζ2 ,µ) )W ((ξ,ζ2 ,ν) )(λ1 · λ2 · µ, ν1 · ν2 · µ ). φs(ζ2 ) φr(µ) (34)
The proof for each graph is similar, so we illustrate the general method by (12) considering the graph E1 , illustrated in Fig. 3, which contains double edges. The proof for graphs without double edges is simpler. Let m ≥ dG + 3 be a fixed integer. We denote by B the set of all linear combinations of elements of the form aUm b and c for a, b, c ∈ Mm . We will write elements in Mm+1 in the form x = (λ1 · λ2 · λ3 , ν1 · ν2 · ν3 )
(35)
where λ1 , ν1 are paths of length m − 1 on G with s(λ1 ) = s(ν1 ), and λ1 , λ2 , ν1 , ν2 are edges of G with r(λ3 ) = r(ν3 ). Since the choice of the pair of paths λ1 · λ2 , ν1 · ν2 in a, b is arbitrary, the proof will depend on specific choices of ζ2 , ζ2 in (34) in order to obtain the desired element. We label the vertices and some of the edges (12) (12) as in Fig. 3. For the other edges, let γv,v denote the edge on E1 from of E1 vertex v to v . We first consider any element (35) where r(λ2 ) = r(ν2 ). For any such pair (λ1 · λ2 , ν1 · ν2 ) with r(λ2 ) = il , l ∈ {1, 2, 3}, there is only one element x, which is given by the embedding of x = (λ1 · λ2 , ν1 · ν2 ) ∈ Mm in Mm+1 . If r(λ2 ) = il ,
Fig. 3. The SU(3) graph $\mathcal{E}_1^{(12)}$.
l ∈ {1, 2, 3}, there are two possibilities for the edges λ3 = ν3 . If we choose ζ2 = ζ2 = (1) (1) γil ,jl then (34) gives xl = (λ1 · λ2 · γjl ,kl , ν1 · ν2 · γjl ,kl ), so that xl ∈ B, l = 1, 2, 3. (1) Embedding x in Mm+1 we obtain (λ1 · λ2 · γjl ,r , ν1 · ν2 · γjl ,r ) = x − xl ∈ B, for l = 1, 2, 3. A similar method gives the result for the case when r(λ2 ) = r(ν2 ) = kl , l = 1, 2, 3. For any pair (λ1 · λ2 , ν1 · ν2 ) with r(λ2 ) = r(ν2 ) = p, there are seven possibilities (2) for λ3 , ν3 . We denote these elements by xl , x(ξ,ξ ) , for l = 1, 2, 3, ξ, ξ ∈ {β, β }, where xl = (λ1 · λ2 · γp,jl , ν1 · ν2 · γp,jl ), x(ξ,ξ ) = (λ1 · λ2 · ξ, ν1 · ν2 · ξ ). First, choosing ζ2 = ζ2 = α, Eq. (34) gives (2)
y0 =
1 1 1 (2) (2) (2) |Wp,j1 ,r(α) |2 x1 + |Wp,j2 ,r(α) |2 x2 + |Wp,j3 ,r(α) |2 x3 φr φj1 φr φj2 φr φj3 +
1 |Wp,q,r(αβ ) |2 x(β ,β ) , φr φq (12)
where y0 is an element in B. Using the solution W + for the cells of E1 in [19, Theorem 12.1], we obtain y1 = [2]r1+ (x1 + x2 + x3 ) + [4]r2− x(β ,β ) , (2)
(2)
(2)
given
(36)
where r1± = ([2][4] ± [2][4]), r2± = ([2]2 ± [2][4]) and y1 ∈ B. Similarly, the choices ζ2 = ζ2 = α , ζ2 = α, ζ2 = α and ζ2 = α , ζ2 = α give y2 = [2]r1− (x1 + x2 + x3 ) + [4]r2+ x(β,β) , (2) (2) (2) y3 = [2] r1+ r1− (x1 + ωx2 + ωx3 ) + [4] r2+ r2− x(β ,β) , (2) (2) + − (2) y4 = [2] r1 r1 (x1 + ωx2 + ωx3 ) + [4] r2+ r2− x(β,β ) , (2)
(2)
(2)
(37) (38) (39)
where ω = e2πi/3 and yj ∈ B, j = 2, 3, 4. We can obtain three more equations by choosing ζ2 = ζ2 = γkl ,p for l = 1, 2, 3. Then (34) gives (l)
(2)
(2)
+ l
[2]2 [3][4]2
(2)
y5 = x1 + x2 + x3 +
(l)
[2]2 [2]2 − r1 x(β,β) + l 2 [3][4] [3][4]2
[2]2 + r1+ r1− x(β ,β) + r x(β ,β ) , [3][4]2 1
r1+ r1− x(β,β ) (40)
where l = ω l−1 and y5 ∈ B, l = 1, 2, 3. Equations (36)–(40) are linearly indepen(2) (l) dent, and hence we can find xl , x(ξ,ξ ) in terms of yj , j = 1, . . . , 4, and y5 , for (2) l = 1, 2, 3, ξ, ξ ∈ {β, β }; i.e. xl , x(ξ,ξ ) ∈ B. For any pair (λ1 · λ2 , ν1 · ν2 ) with r(λ2 ) = r(ν2 ) = q, there are four possibilities (3) (3) for λ3 , ν3 . We denote these elements by xl , xr , for l = 1, 2, 3, where xl =
(λ1 · λ2 · γq,kl , ν1 · ν2 · γq,kl ), xr = (λ1 · λ2 · γ, ν1 · ν2 · γ). Choosing ζ2 = ζ2 = β, Eq. (34) gives y6 = [2]r1− (x1 + x2 + x3 ) + [4]r2+ xr , (3)
(3)
(3)
(41)
where y6 ∈ B. Similarly, the choices ζ2 = ζ2 = β , ζ2 = β, ζ2 = β and ζ2 = β , ζ2 = β give y7 = [2]r1+ (x1 + x2 + x3 ) + [4]r2− xr , (3) (3) (3) y8 = [2] r1+ r1− (x1 + ωx2 + ωx3 ) + [4] r2+ r2− xr , (3) (3) (3) y9 = [2] r1+ r1− (x1 + ωx2 + ωx3 ) + [4] r2+ r2− xr , (3)
(3)
(3)
(42) (43) (44)
where yj ∈ B, j = 7, 8, 9. Equations (41)–(44) are linearly independent, and we (3) find xl , xr ∈ B for l = 1, 2, 3. For any pair (λ1 ·λ2 , ν1 ·ν2 ) with r(λ2 ) = r(ν2 ) = r, there are four possibilities for λ3 , ν3 , and we denote these elements by x(ξ,ξ ) = (λ1 ·λ2 ·ξ, ν1 ·ν2 ·ξ ), ξ, ξ ∈ {α, α }. Choosing ζ2 = ζ2 = γ, Eq. (34) gives y10 = r2− x(α,α) + r2+ x(α ,α ) ,
(45)
where y10 ∈ B. We obtain three more equations by choosing ζ2 = ζ2 = γjl ,r , l = 1, 2, 3: (l) y11 = r1+ x(α,α) + l r1+ r1− x(α,α ) + l r1+ r1− x(α ,α) + r1− x(α ,α ) , (46) (l)
where y11 ∈ B, l = 1, 2, 3. So from (45) and (46) for l = 1, 2, 3, we find that x(ξ,ξ ) ∈ B for ξ, ξ ∈ {α, α }. We now consider any element x in (35) where r(λ2 ) = r(ν2 ). When r(λ2 ) = il , r(ν2 ) = p, there is only one possibility for λ3 , ν3 , which is λ3 = γil ,jl , ν3 = γp,jl , l = 1, 2, 3, given by choosing ζ2 = γkl ,il , ζ2 = γkl ,p . Then x = (λ1 · λ2 · γil ,jl , ν1 · ν2 · γp,jl ) ∈ B. When r(λ2 ) = jl , r(ν2 ) = jl+1 , l = 1, 2, 3, there is again only one possibility for λ3 , ν3 . So x ∈ B. Similarly when r(λ2 ) = kl , r(ν2 ) = kl+1 , l = 1, 2, 3. Consider the pair (λ1 · λ2 , ν1 · ν2 ) where r(λ2 ) = jl , l = 1, 2, 3, and r(ν2 ) = q. For each l = 1, 2, 3, there are two possibilities for λ3 , ν3 . We denote these by (4) (5) xl = (λ1 · λ2 · γjl ,kl , ν1 · ν2 · γq,kl ), xl = (λ1 · λ2 · γjl ,r , ν1 · ν2 · γ). Choosing ζ2 = γp,jl , γ2 = β, we obtain (l) (4) (5) y12 = [3][4]xl − [2] r2+ xl , (47) where y12 ∈ B, l = 1, 2, 3. Similarly, choosing ζ2 = γp,jl , γ2 = β , we obtain (l) (4) (5) y13 = [3][4]xl + [2] r2− xl , (l)
(l)
(48)
where y13 ∈ B, l = 1, 2, 3. Then for each l = 1, 2, 3, from (47), (48) we find that (4) (5) xl , xl ∈ B.
We now consider the pair (λ1 · λ2 , ν1 · ν2 ) where r(λ2 ) = kl , l = 1, 2, 3, and r(ν2 ) = r. For each l = 1, 2, 3, there are two possibilities for λ3 , ν3 . We denote these by x(ξ),l = (λ1 · λ2 · γkl ,p , ν1 · ν2 · ξ), ξ ∈ {α, α }. Then for each l = 1, 2, 3, choosing ζ2 = γjl ,kl , γ2 = γjl ,r , we obtain (l) (49) y14 = l r1+ x(α),l + l r1− x(α ),l , where y14 ∈ B, l = 1, 2, 3. Similarly, choosing ζ2 = γq,kl , γ2 = γ, we obtain (l) y15 = r2− x(α),l − r2+ x(α ),l , (l)
(50)
where $y_{15}^{(l)} \in B$, l = 1, 2, 3. Then for each l = 1, 2, 3, from (49), (50) we find that $x_{(\alpha),l}, x_{(\alpha'),l} \in B$. All the other elements in $M_{m+1}$ are in B, since $y^* \in B$ if $y \in B$.
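The bound $m \geq d_G + 3$ in Lemma 4.7 depends on the depth $d_G$ of the graph. A minimal sketch of how such a depth could be computed by breadth-first search is given below. It makes two assumptions that are not stated in this section: paths are taken to be directed, and the test graph is $\mathcal{A}^{(n)}$ with vertices (a, b), a + b ≤ n − 3, and edges from (a, b) to (a+1, b), (a−1, b+1) and (a, b−1), a convention read off from the relation for $V\Delta_A$ in the proof of Proposition 4.10. The function names are hypothetical.

```python
from collections import deque


def a_graph_edges(n):
    """Directed edges of A^(n) under the assumed convention described above."""
    verts = [(a, b) for a in range(n - 2) for b in range(n - 2 - a)]
    vset = set(verts)
    return {
        (a, b): [w for w in ((a + 1, b), (a - 1, b + 1), (a, b - 1)) if w in vset]
        for (a, b) in verts
    }


def graph_depth(edges):
    """Largest shortest-path length over all ordered pairs of vertices."""
    depth = 0
    for start in edges:
        dist = {start: 0}
        queue = deque([start])
        while queue:  # breadth-first search from `start`
            v = queue.popleft()
            for w in edges[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        if len(dist) < len(edges):
            raise ValueError("graph is not strongly connected")
        depth = max(depth, max(dist.values()))
    return depth


for n in (4, 5, 6):
    print(n, graph_depth(a_graph_edges(n)))
```

If the paper's shortest paths are meant to ignore orientation, the adjacency lists would have to be symmetrised first, and the resulting depths would be smaller.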
The following lemma is an SU(3) version of Skau’s lemma. The proof is similar to the proof of Skau’s lemma given in [24, Theorem 4.4.3].

Lemma 4.8. For an ADE graph G, let $M_0 = \mathbb{C}^{n_0}$ where $n_0$ is the number of 0-colored vertices of G, and let $M_0 \subset M_1 \subset M_2 \subset \cdots$ be a tower of finite dimensional von Neumann algebras with Markov trace tr on the $M_i$, with the inclusions $M_j \subset M_{j+1}$ given by an SU(3) ADE graph G (except $\mathcal{E}_4^{(12)}$), and operators $U_m \in M_{m+1}$, m ≥ 1, which satisfy the relations H1–H3 for δ ≤ 2, and such that $U_m$ commutes with $M_{m-1}$. Let $M_\infty$ be the GNS-completion of $\bigcup_{j \geq 0} M_j$ with respect to the trace. Then $\{U_1, U_2, \ldots\}' \cap M_\infty = M_0$.

Proof. The first inclusion $M_0 \subset \{U_1, U_2, \ldots\}' \cap M_\infty$ is obvious, since $M_0$ commutes with $U_m$ for all m ≥ 1. We now show the opposite inclusion $M_0 \supset \{U_1, U_2, \ldots\}' \cap M_\infty$. For each k ≥ 1 let $F_k$ be the conditional expectation of $M_\infty$ onto $\{U_k, U_{k+1}, \ldots\}' \cap M_\infty$ with respect to the trace. Note that $F_kF_l = F_{\min(k,l)}$. So we want to show $F_1(M_\infty) \subset M_0$. We first show $F_2(M_\infty) \subset M_m$ for some sufficiently large m. By [24], the diagram
$$\begin{array}{ccc}
\{U_{k+1}, U_{k+2}, \ldots\}' \cap M_\infty & \subset & M_\infty \\
\cup & & \cup \\
\{U_{k+1}, U_{k+2}, \ldots\}' \cap \{U_k, U_{k+1}, \ldots\}'' & \subset & \{U_k, U_{k+1}, \ldots\}''
\end{array}$$
is a commuting square, for k ≥ 1. Since $\{U_{k+1}, U_{k+2}, \ldots\}'' \subset \{U_k, U_{k+1}, \ldots\}''$ is isomorphic to $R_2 \subset R_1$, where $R_1 = \{1, U_1, U_2, \ldots\}''$, $R_2 = \{1, U_2, U_3, \ldots\}''$, we may write the commuting square as
$$\begin{array}{ccc}
R_2' \cap M_\infty & \subset & M_\infty \\
\cup & & \cup \\
R_2' \cap R_1 & \subset & R_1.
\end{array}$$
Let E denote the conditional expectation from $R_1$ onto $R_2' \cap R_1$ with respect to the trace. Since $F_{k+1}$ is the conditional expectation from $M_\infty$ onto $R_2' \cap M_\infty$ and $U_k \in R_1$, we have $F_{k+1}(U_k) = E(U_k)$. Since by [16, Corollary 3.4] the principal graph of $R_2 \subset R_1$ is the 01-part of $\mathcal{A}^{(n)}$, and there is only one vertex joined to the distinguished vertex ∗ of $\mathcal{A}^{(n)}$, the relative commutant $R_2' \cap R_1$ is trivial for α ≤ 3 (which corresponds to δ ≤ 2), and E is just the trace. Thus $F_{k+1}(U_k) \in \mathbb{C}$ for each k ≥ 1. By Lemma 4.7, for sufficiently large m, any element of $M_{m+1}$ can be written as $aU_mb$ for $a, b \in M_m$, and we have $F_2(aU_mb) = F_2F_{m+1}(aU_mb) = F_2(aF_{m+1}(U_m)b) = F_{m+1}(U_m)F_2(ab) = F_2(\lambda ab) \in F_2(M_m)$, where $\lambda \in \mathbb{C}$. So $F_2(M_{m+1}) \subset F_2(M_m)$, for sufficiently large m, and by induction we have $F_2(M_\infty) \subset F_2(M_r)$, where r is the smallest integer such that Lemma 4.7 holds. Then certainly $F_2(M_\infty) \subset F_{r+1}(M_r)$, and by Proposition 4.6, with k = r, any element x in $M_r$ commutes with $U_r$ if and only if $x \in M_{r-1}$, so $F_rF_{r+1}(M_r) \subset F_r(M_{r-1})$. Then by inductive use of Proposition 4.6 we obtain $F_2(M_\infty) \subset F_2(M_1) = M_1$, and so $F_1(M_\infty) = F_1F_2(M_\infty) \subset F_1(M_1) = M_0$, by Proposition 4.6.

We now construct the SU(3)-Goodman–de la Harpe–Jones subfactor for an SU(3) ADE graph G, following the idea of Goodman, de la Harpe and Jones for the ADE Dynkin diagrams [24]. Let n be the Coxeter number for G, $*_G$ a distinguished vertex and let $n_0$ be the number of 0-colored vertices of G. Let $A_0$ be the von Neumann algebra $\mathbb{C}^{n_0}$, and form a sequence of finite dimensional von Neumann algebras $A_0 \subset A_1 \subset A_2 \subset \cdots$ such that the Bratteli diagram for the inclusion $A_{l-1} \subset A_l$ is given by (part of) the graph G. There are operators $U_m \in A_{m+1}$ which satisfy the Hecke relations H1–H3. Let $\widetilde{C}$ be the GNS-completion of $\bigcup_{m \geq 0} A_m$ with respect to the trace, and $\widetilde{B}$ its von Neumann subalgebra generated by $\{U_m\}_{m \geq 1}$. We have $\widetilde{B}' \cap \widetilde{C} = A_0$ by Lemma 4.8. Then for q the minimal projection in $A_0$ corresponding to the distinguished vertex $*_G$ of G, we have an SU(3)-Goodman–de la Harpe–Jones subfactor $B = q\widetilde{B} \subset q\widetilde{C}q = C$ for the graph G. With $B_m = q\widetilde{B}_m$ and $C_m = q\widetilde{C}_mq$, the sequence $\{B_m \subset C_m\}_m$ is a periodic sequence of commuting squares of period 3, in the sense of Wenzl in [34], that is, for large enough m the Bratteli diagrams for the inclusions $B_m \subset B_{m+1}$, $C_m \subset C_{m+1}$ are the same as those for $B_{m+3} \subset B_{m+4}$, $C_{m+3} \subset C_{m+4}$, and the Bratteli diagrams for the inclusions $B_m \subset C_m$ and $B_{m+3} \subset C_{m+3}$ are the same. For such m the graph of the Bratteli diagram for $B_{3m} \subset C_{3m}$ is the intertwining graph, given by the intertwining matrix V computed in Proposition 4.10, whose rows are indexed by the vertices of G and columns are indexed by the vertices of $\mathcal{A}^{(n)}$, such that $V\Delta_A = \Delta_GV$.

For sufficiently large m, we can make a basic construction $B_m \subset C_m \subset D_m$. Then with $D = \bigvee_m D_m$, $B \subset C \subset D$ is also a basic construction. The graph of the Bratteli diagram for $C_m \subset D_m$ is the reflection of the graph for $B_m \subset C_m$, which is the intertwining graph. Then we can extend the definition of $D_m$ to small m so that the graph $C_m \subset D_m$ is still given by the reflection of the intertwining graph. We
see that $D_0 = \bigoplus_{\mu \in \mathcal{A}^{(n)}} VV^*(*_A, \mu)\,\mathbb{C}$, where $*_A$ is the distinguished vertex (0, 0) of $\mathcal{A}^{(n)}$. The minimal projections in $D_0$ correspond to the vertices $\mu'$ of $\mathcal{A}^{(n)}$ such that
$$VV^*(*_A, \mu') > 0, \tag{51}$$
and the Bratteli diagram for the inclusion Dm−1 ⊂ Dm is given by (part of) the graph A(n) . Each algebra Bm is generated by the U1 , . . . , Um−1 in Dm . Now λ(1,0) (N ) ⊂ N ∼ = P ⊂ Q, where P ⊂ Q is Wenzl’s subfactor with (n) principal graph given by the 01-part A01 of A(n) (see [16, Corollary 3.4]). Then d/2 (λ(1,0) λ(1,0) ) (N ) ∼ = P ⊂ Qd , where P ⊂ Q ⊂ Q1 ⊂ · · · is the Jones tower. For (n) any 0-colored vertex µ of A01 let dµ be the minimum number of edges in any path (n) from (0, 0) to µ on A01 , and let d = max{dµ − 2 | V V ∗ (∗A , µ) > 0}. Note that ∗ each dµ is even since µ is a 0-colored vertex. Let [θ] = µ∈A(n) V V (∗A , µ)[λµ ]. Now [(λ(1,0) λ(1,0) )d/2 ] decomposes into irreducibles as µ nµ [λµ ], where µ are the 0-colored vertices of A(n) and nµ ∈ N. Then θ(N ) ⊂ N is a restricted version of (λ(1,0) λ(1,0) )d/2 (N ), so that θ(N ) ⊂ N ∼ = qP ⊂ q(Qd )q where q ∈ P ∩ Qd is a sum of minimal projections corresponding to the vertices µ such that [θ] ⊃ [λµ ]. We will show that qP ⊂ q(Qd )q is isomorphic to a subfactor obtained by a basic construction. Following the example in [9, Lemma A.1] for E7 in the SU(2) case, we now do the same construction for the graph A(n) , where q is the projection corresponding to the distinguished vertex ∗A . We get a periodic sequence {Em ⊂ Fm }m of commuting squares of period 3. Then the resulting subfactor E ⊂ F , where E = m Em , F = m Fm , is Wenzl’s subfactor [34]. If we make basic constructions of Em ⊂ Fm for d−1 times then we get a periodic sequence {Em ⊂ Gm }m of commuting squares, and each Em is generated by the Hecke operators in Gm . Let q be a sum of the minimal projections corresponding to m = qEm and G m = qGm q, and obtain a the vertices µ in G0 given by (51). We set E periodic sequence of commuting squares of period 3 such that the resulting subfactor m }m is is isomorphic to qP ⊂ q(Qd )q. 
The Bratteli diagram for the sequence {G r the same as that for {Dm }m since D0 = G0 = C where the r minimal projections correspond to the vertices µ of (51), where r is the number of such vertices µ , and the rest of the Bratteli diagram is given by the 01-part of the graph A(n) . Each m is generated by the Hecke operators U1 , . . . , Um−1 ∈ G m . Then the sequence E of commuting squares {Bm ⊂ Dm }m is isomorphic to the sequence of commuting m }m , and so the subfactors B ⊂ D and qP ⊂ q(Qd )q are also m ⊂ G squares {E isomorphic. Since B ⊂ D is a basic construction of B ⊂ C, then the subfactor qP ⊂ q(Qd )q is also the basic construction of some subfactor. Since θ(N ) ⊂ N is isomorphic to qP ⊂ q(Qd )q, [θ] = V V ∗ (∗A , µ)[λµ ] (52) µ∈A(n)
can be realized as the dual canonical endomorphism of some subfactor.
4.1. Computing the intertwining graphs

Let V(G) denote the free module over $\mathbb{Z}$ generated by the vertices of G, identifying an element $a \in V(G)$ as $a = (a_v)$, $a_v \in \mathbb{Z}$, $v \in V_G$. For graphs $G_1$, $G_2$, a map $V : V(G_1) \to V(G_2)$ is positive if $V_{ij} \geq 0$ for all $i \in V_{G_2}$, $j \in V_{G_1}$. Let A(G) be the path algebra where the embeddings on the Bratteli diagram are given by the graph G, and we will denote the finite dimensional algebra at the k-th level of the Bratteli diagram by $A(G)_k$. The following lemma and proposition are the SU(3) versions of [15, Proposition 4.5 and Corollary 4.7] (see also [17, Lemma 11.26 and Proposition 11.27]).

Lemma 4.9. Suppose that $G_1$, $G_2$ are locally finite connected graphs with Coxeter number n, adjacency matrices $\Delta_{G_1}$, $\Delta_{G_2}$ respectively and distinguished vertices $*_1$, $*_2$, respectively. Let $(U_m)_{m \in \mathbb{N}}$, $(W_m)_{m \in \mathbb{N}}$ denote canonical families of operators in $A(G_1)$ and $A(G_2)$ respectively, which satisfy the SU(3)-Temperley–Lieb relations such that $U_m^2 = [2]_qU_m$, $W_m^2 = [2]_qW_m$ for all $m \in \mathbb{N}$, $q = e^{\pi i/n}$. Let $\pi : A(G_1) \to A(G_2)$ be a unital embedding such that:
(a) The diagram
$$\begin{array}{ccc}
A(G_1)_m & \xrightarrow{\ \pi_m\ } & A(G_2)_m \\
{\scriptstyle \iota_m}\downarrow & & \downarrow{\scriptstyle ȷ_m} \\
A(G_1)_{m+1} & \xrightarrow{\ \pi_{m+1}\ } & A(G_2)_{m+1}
\end{array}$$
commutes for all m, where $\pi_m = \pi|_{A(G_1)_m}$, and $\iota_m$, $ȷ_m$ are standard inclusions.
(b) $\mathrm{tr}_2 \circ \pi_m = \mathrm{tr}_1$, where $\mathrm{tr}_i$ is a Markov trace on $A(G_i)$, i = 1, 2.
(c) $\pi(U_m) = W_m$ for all m ≥ 1 (so $\pi_{m+1}(U_m) = W_m$).
Then there exists a positive linear map $V : V(G_1) \to V(G_2)$ such that:
(1) $V\Delta_{G_1} = \Delta_{G_2}V$,
(2) V has no zero rows or columns,
(3) $V*_1 = *_2$.

Proof. We denote by $\widehat{G}$ the Bratteli diagram of G. The vertex (i, m) of $\widehat{G}$ will be the vertex $i \in V_G$ at level m of the Bratteli diagram. Let $p_i^m$ denote a minimal projection in $A(G_1)_m$ corresponding to the vertex (i, m) of the Bratteli diagram $\widehat{G}_1$ of $G_1$. Then $\pi_m(p_i^m)$ is a projection in $A(G_2)_m$, and so there are families of equivalent minimal projections $\{q^m_{j,k(j)} \mid k(j) = 1, \ldots, b^m_{ji}\}$ in $A(G_2)_m$ corresponding to vertices (j, m) in $\widehat{G}_2$, such that
$$\pi_m(p_i^m) = \sum_j \sum_{k(j)=1}^{b^m_{ji}} q^m_{j,k(j)}. \tag{53}$$
The numbers $\{b^m_{ji}\}_j$ are non-negative, are independent of the choice of $p_i^m$ and are not all zero, since $\pi_m$ is injective. Let $F_m^{(1)} = [2]^{-1}[3]^{-1}(U_mU_{m+1}U_m - U_m)$ in $A(G_1)$, and $F_m^{(2)} = [2]^{-1}[3]^{-1}(W_mW_{m+1}W_m - W_m)$ in $A(G_2)$. Now multiplying (53) on the left by $F_{m+1}^{(2)}$, we have
$$F_{m+1}^{(2)}\pi_m(p_i^m) = \sum_j \sum_{k(j)=1}^{b^m_{ji}} F_{m+1}^{(2)}q^m_{j,k(j)},$$
but by (a) and (c), $F_{m+1}^{(2)}\pi_m(p_i^m) = \pi_{m+3}(F_{m+1}^{(1)})\pi_m(p_i^m) = \pi_{m+3}(F_{m+1}^{(1)}p_i^m)$, so we have
$$\pi_{m+3}(F_{m+1}^{(1)}p_i^m) = \sum_j \sum_{k(j)=1}^{b^m_{ji}} F_{m+1}^{(2)}q^m_{j,k(j)}. \tag{54}$$
Since $\mathrm{tr}_1$ and $\mathrm{tr}_2$ are Markov traces, by Lemma 4.5 we have $\mathrm{tr}_1(F_{m+1}^{(1)}p_i^m) = [3]^{-3}\mathrm{tr}_1(p_i^m)$, and $\mathrm{tr}_2(F_{m+1}^{(2)}q^m_{j,k(j)}) = [3]^{-3}\mathrm{tr}_2(q^m_{j,k(j)})$. Since $p_i^m$, $q^m_{j,k(j)}$ are minimal projections, they have trace $[3]^{-m}\phi_i$, $[3]^{-m}\phi_j$, respectively. Then $F_{m+1}^{(1)}p_i^m$ has trace $[3]^{-m-3}\phi_i$, which shows that $F_{m+1}^{(1)}p_i^m$ is a minimal projection in $A(G_1)_{m+3}$ corresponding to vertex (i, m + 3) of $\widehat{G}_1$, and similarly $F_{m+1}^{(2)}q^m_{j,k(j)}$ is a minimal projection in $A(G_2)_{m+3}$ corresponding to vertex (j, m + 3) of $\widehat{G}_2$. It follows from (53) and (54) that the coefficients occurring in the decomposition of a minimal projection as in (53) corresponding to vertex (i, m) of $\widehat{G}_1$, m ≥ 1, are independent of the level m, i.e. $b^m_{ji} = b^l_{ji} =: b_{ji}$ for all m, l ≥ 0. Now put $V = (b_{ji})_{i \in V_{G_1}, j \in V_{G_2}}$, then since $A(G_1)_0 \cong \mathbb{C} \cong A(G_2)_0$, and $\pi_0 : A(G_1)_0 \to A(G_2)_0$ we see that $V*_1 = *_2$. Note that since π is unital, the rows of V are non-zero. We need to show $V\Delta_{G_1} = \Delta_{G_2}V$.

Let $\Delta_{G_k}(m)$, k = 1, 2, be the finite submatrix of $\Delta_{G_k}$, whose rows and columns are labeled by the vertices $v \in G_k^{(0)}$ with d(v) ≤ m + 1, where d(v) is the distance of vertex v from $*_k$, i.e. the length of the shortest path on $G_k$ from $*_k$ to v. Similarly let V(m) denote the finite submatrix of V whose rows are labeled by $j \in V_{G_2}$ with d(j) ≤ m + 1, and whose columns are labeled by $i \in V_{G_1}$ with d(i) ≤ m + 1. It follows from (a) that for each m we have
$$K_0(ȷ_m)K_0(\pi_m) = K_0(\pi_{m+1})K_0(\iota_m). \tag{55}$$
Let $M_1$, $M_2$ be two multi-matrix algebras, with the embedding ϕ of $M_1$ in $M_2$ given by a matrix Λ, with $p_1$ columns corresponding to the minimal central projections in $M_1$ and $p_2$ rows corresponding to the minimal central projections in $M_2$. Then $K_0(M_i) = \mathbb{Z}^{p_i}$, i = 1, 2, and $K_0(\varphi) : \mathbb{Z}^{p_1} \to \mathbb{Z}^{p_2}$ is given by multiplication by the matrix Λ. For m of color j, we see that $K_0(\iota_m)$ is the submatrix of $\Delta_{G_1}(m)$ mapping vertices of color j to vertices of color j + 1, and $K_0(ȷ_m)$ is the submatrix of $\Delta_{G_2}(m)$ mapping vertices of color j to vertices of color j + 1. Similarly, $K_0(\pi_m)$ is the submatrix of V(m) mapping vertices of $G_1$ of color j to vertices of $G_2$ of color j. Then (55) implies $\Delta_{G_2}(m)V(m - 1) = V(m)\Delta_{G_1}(m)$ holds for all m. Hence $V\Delta_{G_1} = \Delta_{G_2}V$.
We define polynomials $S_\nu(x,y)$, for $\nu$ the vertices of $A^{(n)}$, by $S_{(0,0)}(x,y) = 1$, and $x S_\nu(x,y) = \sum_\mu \Delta_A(\nu,\mu) S_\mu(x,y)$, $y S_\nu(x,y) = \sum_\mu \Delta_A^T(\nu,\mu) S_\mu(x,y)$. For concrete values of the first few $S_\mu(x,y)$ see [17, p. 610].

Proposition 4.10. Let $G$ be a finite SU(3)-ADE graph with distinguished vertex $*_G$ and Coxeter number $n < \infty$. Let $\{U_m\}_{m \geq 0}$, $\{W_m\}_{m \geq 0}$ be the canonical family of operators satisfying the Hecke relations in $A(A^{(n)})$, $A(G)$, respectively. We can identify $A(A^{(n)})$ with the algebra generated by $\{1, W_1, W_2, \ldots\}$. If we define $\pi : A(A^{(n)}) \to A(G)$ by $\pi(1) = 1$, $\pi(U_m) = W_m$, then $\pi$ is a unital embedding, and there exists a positive linear map $V : V(A^{(n)}) \to V(G)$ such that:
(a) $V \Delta_A = \Delta_G V$,
(b) $V$ has no zero rows or columns,
(c) $V *_A = *_G$, where $*_A = (0,0)$ is the distinguished vertex of $A^{(n)}$.
Let $V_{(0,0)}$ be the vector corresponding to the distinguished vertex $*_G$, and for the other vertices define $V_{(\lambda_1,\lambda_2)} \in V(G)$ by $V_{(\lambda_1,\lambda_2)} = S_{(\lambda_1,\lambda_2)}(\Delta_G^T, \Delta_G) V_{(0,0)}$, for all vertices $(\lambda_1,\lambda_2)$ of $A^{(n)}$. Then $V = (V_{(0,0)}, V_{(1,0)}, V_{(0,1)}, V_{(2,0)}, \ldots, V_{(0,n-3)})$.

Proof. Now $\pi : A(A^{(n)}) \to A(G)$ defined by $\pi(1) = 1$, $\pi(U_m) = W_m$ is a unital embedding which satisfies the condition of Lemma 4.9 with $*_1 = (0,0)$ and $*_2 = *_G$. Hence when $m$ is finite there exists $V = (V_{(\lambda_1,\lambda_2)})$, for $(\lambda_1,\lambda_2)$ the vertices of $A^{(n)}$, with the required properties. Now $V \Delta_A = (V_{(\lambda_1-1,\lambda_2)} + V_{(\lambda_1+1,\lambda_2-1)} + V_{(\lambda_1,\lambda_2+1)})_{(\lambda_1,\lambda_2)}$, where $V_{(\lambda_1,\lambda_2)}$ is understood to be zero if $(\lambda_1,\lambda_2)$ is off the graph $A^{(n)}$. Thus $V \Delta_A = \Delta_G V$ implies that $\Delta_G V_{(\lambda_1,\lambda_2)} = V_{(\lambda_1-1,\lambda_2)} + V_{(\lambda_1+1,\lambda_2-1)} + V_{(\lambda_1,\lambda_2+1)}$. Then $V_{(\lambda_1,\lambda_2)} = S_{(\lambda_1,\lambda_2)}(\Delta_G^T, \Delta_G) V_{(0,0)}$, since
\[ \Delta_G V_{(\lambda_1,\lambda_2)} = \Delta_G S_{(\lambda_1,\lambda_2)}(\Delta_G^T, \Delta_G) V_{(0,0)} = \sum_{(\mu_1,\mu_2)} \Delta_A^T((\lambda_1,\lambda_2),(\mu_1,\mu_2)) S_{(\mu_1,\mu_2)}(\Delta_G^T, \Delta_G) V_{(0,0)} = V_{(\lambda_1-1,\lambda_2)} + V_{(\lambda_1+1,\lambda_2-1)} + V_{(\lambda_1,\lambda_2+1)}, \]
and $V_{(0,0)} = S_{(0,0)}(\Delta_G^T, \Delta_G) V_{(0,0)}$.

For any ADE graph $G$ the matrix $V$ is the adjacency matrix of a (possibly disconnected) graph. By [5, Theorem 4.2] the connected component of $*_A$ of this graph gives the principal graph of the SU(3)-Goodman–de la Harpe–Jones subfactor. For the graph $E^{(8)}$ with vertex $i_1$ chosen as the distinguished vertex this is the graph illustrated in Fig. 4, which was shown to be the principal graph for this subfactor in [35].

5. Modular Invariants Associated to the Dual Canonical Endomorphisms

Let $N \subset M$ be the SU(3)-GHJ subfactor for the finite ADE graph $G$, where the distinguished vertex $*_G$ is the vertex with lowest Perron–Frobenius weight. Then
Fig. 4. Principal graph for the SU(3)-Goodman–de la Harpe–Jones subfactor for $E^{(8)}$.
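the dual canonical endomorphism $\theta$ for $N \subset M$ is given by (52) where $V$ is now determined in Proposition 4.10. We list these $\theta$'s below for the ADE graphs, where we use the same notation for the ADE graphs as in [19]. We must point out that as we have been unable to explicitly construct the Ocneanu cells $W$ for $E_4^{(12)}$, the existence of the SU(3)-Goodman–de la Harpe–Jones subfactor which realizes the candidate for the dual canonical endomorphism for $E_4^{(12)}$ is not shown here.
\[ A^{(n)} : [\theta] = [\lambda_{(0,0)}], \tag{56} \]
\[ D^{(n)} : [\theta] = [\lambda_{(0,0)}] \oplus [\lambda_{A(0,0)}] \oplus [\lambda_{A^2(0,0)}], \tag{57} \]
\[ A^{(n)*} : [\theta] = \bigoplus_{\mu \in A^{(n)}} [\lambda_\mu], \tag{58} \]
\[ D^{(2k)*} : [\theta] = \bigoplus_{\mu \in A^{(2k)} :\, \tau(\mu)=0} [\lambda_\mu], \tag{59} \]
\[ D^{(2k+1)*} : [\theta] = \bigoplus_{\mu = (2\mu_1,2\mu_2) \in A^{(2k+1)} :\, \tau(\mu)=0} [\lambda_\mu], \tag{60} \]
\[ E^{(8)} : [\theta] = [\lambda_{(0,0)}] \oplus [\lambda_{(2,2)}], \tag{61} \]
\[ E^{(8)*} : [\theta] = [\lambda_{(0,0)}] \oplus [\lambda_{(2,1)}] \oplus [\lambda_{(1,2)}] \oplus [\lambda_{(2,2)}] \oplus [\lambda_{(5,0)}] \oplus [\lambda_{(0,5)}], \tag{62} \]
\[ E_1^{(12)} : [\theta] = [\lambda_{(0,0)}] \oplus [\lambda_{(4,1)}] \oplus [\lambda_{(1,4)}] \oplus [\lambda_{(4,4)}] \oplus [\lambda_{(9,0)}] \oplus [\lambda_{(0,9)}], \tag{63} \]
\[ E_2^{(12)} : [\theta] = [\lambda_{(0,0)}] \oplus 2[\lambda_{(2,2)}] \oplus [\lambda_{(4,1)}] \oplus [\lambda_{(1,4)}] \oplus 2[\lambda_{(5,2)}] \oplus 2[\lambda_{(2,5)}] \oplus [\lambda_{(4,4)}] \oplus [\lambda_{(9,0)}] \oplus [\lambda_{(0,9)}], \tag{64} \]
\[ E_4^{(12)} : [\theta] = [\lambda_{(0,0)}] \oplus [\lambda_{(2,2)}] \oplus [\lambda_{(4,1)}] \oplus [\lambda_{(1,4)}] \oplus [\lambda_{(5,2)}] \oplus [\lambda_{(2,5)}] \oplus [\lambda_{(4,4)}] \oplus [\lambda_{(9,0)}] \oplus [\lambda_{(0,9)}], \tag{65} \]
\[ E_5^{(12)} : [\theta] = [\lambda_{(0,0)}] \oplus [\lambda_{(3,3)}] \oplus [\lambda_{(9,0)}] \oplus [\lambda_{(0,9)}], \tag{66} \]
\[ E^{(24)} : [\theta] = [\lambda_{(0,0)}] \oplus [\lambda_{(4,4)}] \oplus [\lambda_{(10,1)}] \oplus [\lambda_{(1,10)}] \oplus [\lambda_{(6,6)}] \oplus [\lambda_{(9,6)}] \oplus [\lambda_{(6,9)}] \oplus [\lambda_{(13,4)}] \oplus [\lambda_{(4,13)}] \oplus [\lambda_{(10,10)}] \oplus [\lambda_{(21,0)}] \oplus [\lambda_{(0,21)}]. \tag{67} \]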
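The vectors $V_{(\lambda_1,\lambda_2)}$ that determine these $\theta$'s are built from the polynomials $S_\nu(x,y)$ of Proposition 4.10. As a minimal sketch (not the authors' code), the defining recursion can be unrolled in Python for the first few tiers. The sketch assumes the adjacency rule $\Delta_A(\nu,\mu) = 1$ for $\mu \in \{\nu+(1,0), \nu+(-1,1), \nu+(0,-1)\}$, which is read off from the relation for $\Delta_G V_{(\lambda_1,\lambda_2)}$ in the proof, treats $S_\nu$ as zero when a label is negative, and ignores the truncation at the boundary of $A^{(n)}$, so it is only meaningful for tiers well below $n-3$. The helper names `shift`, `subtract` and `s_polynomials` are hypothetical.

```python
# Polynomials are dicts {(i, j): coeff} standing for sum coeff * x^i * y^j.

def shift(p, dx, dy):
    """Multiply polynomial p by x^dx * y^dy."""
    return {(i + dx, j + dy): c for (i, j), c in p.items()}

def subtract(p, q):
    """Return p - q, dropping zero coefficients."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) - c
    return {m: c for m, c in out.items() if c != 0}

def s_polynomials(max_tier):
    """S_(a,b) for a + b <= max_tier (assumption: no boundary truncation)."""
    S = {(0, 0): {(0, 0): 1}}

    def get(a, b):
        # Off-graph labels (negative entries) contribute zero.
        return S.get((a, b), {}) if a >= 0 and b >= 0 else {}

    for tier in range(max_tier):
        for a in range(tier + 1):
            b = tier - a
            # x S_(a,b) = S_(a+1,b) + S_(a-1,b+1) + S_(a,b-1)
            rhs = subtract(shift(S[(a, b)], 1, 0), get(a - 1, b + 1))
            S[(a + 1, b)] = subtract(rhs, get(a, b - 1))
        # y S_(0,t) = S_(0,t+1) + S_(1,t-1), since S_(-1,t) = 0
        S[(0, tier + 1)] = subtract(shift(S[(0, tier)], 0, 1), get(1, tier - 1))
    return S

S = s_polynomials(3)
print(S[(1, 1)])   # x*y - 1
print(S[(2, 0)])   # x^2 - y
print(S[(3, 0)])   # x^3 - 2*x*y + 1
```

Substituting the matrices $\Delta_G^T$ and $\Delta_G$ for $x$ and $y$ and applying the result to $V_{(0,0)}$ would then give the columns of $V$; that step needs the adjacency matrix of the particular graph $G$ and is not attempted here.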
Note that these dual canonical endomorphisms depend only on the existence of a cell system $W$ for each graph $G$, but not on the choice of cell system, since Lemma 4.9 and Proposition 4.10 did not depend on this choice. Where we have found two inequivalent solutions, the computations below show that either choice will give the same $M$-$N$ graph, since the computations in these particular cases only depend on the dual canonical endomorphism $\theta$. Similarly, even if there exist other solutions for the cells $W$ for the $D$, $D^*$ and $E_1^{(12)}$ graphs, these will not give any new $M$-$N$ graphs either. It is conceivable however that in certain situations, for SU($n$), $n > 3$, the $M$-$N$ graph will depend on the connection and not just on the GHJ graph.

Remark. For SU(2) it was shown in [13] that the modular invariant $Z$ can be realized from a subfactor with a dual canonical endomorphism of the form
\[ [\theta] = \bigoplus_\mu Z_{\mu,\bar\mu} [\mu], \tag{68} \]
where the direct summation is over all $\mu$ even. This raises the question of whether all the SU(3) modular invariants can be realized from some subfactor with dual canonical endomorphism $\theta$ of the form (68), where we now allow $\mu$ to be of any color. For the $A^{(n)*}$ graphs the $\theta$ given in (58) is automatically in the form (68), where $Z$ is the conjugate modular invariant $Z_{A^{(n)*}} = C$. For the $A^{(n)}$ graphs, if we choose the $M$-$N$ morphism $[a]$ to be $[\iota\lambda_{(p,0)}]$, where $p = (n-3)/2$, the sector $[a\bar a]$ gives $[\lambda_{(0,0)}] \oplus [\lambda_{(1,1)}] \oplus [\lambda_{(2,2)}] \oplus \cdots \oplus [\lambda_{(p,p)}]$, and we obtain a dual canonical endomorphism $[\theta] = [a\bar a] = \bigoplus_\mu Z_{\mu,\bar\mu}[\mu]$, where the direct summation is over all $\mu$ (of any color) and $Z$ is the identity modular invariant $Z_{A^{(n)}} = I$.

For each of the ADE graphs (with the exception of $E_4^{(12)}$) we have shown the existence of a braided subfactor $N \subset M$ with dual canonical endomorphisms $\theta$ given by (56)–(67). By the $\alpha$-induction of [3–5], a matrix $Z$ can be defined by $Z_{\lambda,\mu} = \langle \alpha^+_\lambda, \alpha^-_\mu \rangle$, $\lambda, \mu \in {}_N\mathcal{X}_N$. If the braiding is non-degenerate, $Z$ is a modular invariant mass matrix. For the dual canonical endomorphisms $\theta$ in (56)–(67), what is the corresponding $M$-$N$ system or Cappelli–Itzykson–Zuber graph which classifies the modular invariant? And what is the corresponding modular invariant? For $A^{(n)}$ the $M$-$M$, $M$-$N$ and $N$-$N$ systems are all equal since $N = M$. Subfactors given by conformal inclusions were considered in [4, 5]. Those conformal inclusions which have SU(3) invariants give identical dual canonical endomorphisms $\theta$ to those computed above. The $M$-$N$ system was computed for conformal inclusions with corresponding modular invariants associated to the graphs $D^{(6)}$ and $E^{(8)}$ in [4], and to $E_1^{(12)}$ and $E^{(24)}$ in [5]. The $M$-$N$ system was also computed in [4] for the inclusion with the $D^{(n)}$ dual canonical endomorphism (57) for $n \equiv 0 \bmod 3$, and in [7] for the inclusion with the $E_2^{(12)}$ dual canonical endomorphism (64), which do not come from conformal inclusions.
For each of these graphs, the graph of the M -N system and the α-graph can both be identified with the original graph itself, and the modular invariant is that associated with the original graph. We compute the M -N graph
for the remaining $\theta$'s. The proof for the case of $E_2^{(12)}$ was not published in [7], so we produce a proof using our method here. Knowledge of the dual canonical endomorphism $\theta$ is not usually sufficient to determine the $M$-$N$ graph, but we can utilize the fact that the list of SU(3) modular invariants is complete. For an ADE graph $G$ with Coxeter number $n$, the basic method is to compute $\langle \iota\lambda, \iota\mu \rangle$ for representations $\lambda$, $\mu$ on $A^{(n)}$, and decompose into irreducibles. Sometimes there is an ambiguity about the decomposition, e.g. if $\langle \iota\lambda, \iota\lambda \rangle = 4$ then we could have $\iota\lambda = 2\iota\lambda^{(1)}$ or $\iota\lambda = \iota\lambda^{(1)} + \iota\lambda^{(2)} + \iota\lambda^{(3)} + \iota\lambda^{(4)}$ where $\iota\lambda^{(i)}$, $i = 1, 2, 3, 4$, are irreducible sectors. By [8, Corollary 6.13], $\#{}_M\mathcal{X}_N = \mathrm{tr}(Z)$ for some modular invariant $Z$, and therefore, since we have a complete list of SU(3) modular invariants, we can eliminate any particular decomposition if the total number of irreducible sectors obtained does not agree with the trace of any of the modular invariants (7)–(18). We compute the trace for all the modular invariants at level $k$ in the following lemma:

Lemma 5.1. The traces of the level $k$ modular invariants $Z$ are
\[ \mathrm{tr}(Z_{A^{(k+3)}}) = \tfrac{1}{2}(k+1)(k+2), \tag{69} \]
\[ \mathrm{tr}(Z_{D^{(k+3)}}) = \tfrac{1}{6}(k+1)(k+2) + c_k, \tag{70} \]
\[ \mathrm{tr}(Z_{A^{(k+3)*}}) = \left\lfloor \tfrac{k+2}{2} \right\rfloor, \tag{71} \]
\[ \mathrm{tr}(Z_{D^{(k+3)*}}) = 3\left\lfloor \tfrac{k+2}{2} \right\rfloor, \tag{72} \]
\[ \mathrm{tr}(Z_{E^{(8)}}) = 12, \tag{73} \]
\[ \mathrm{tr}(Z_{E^{(8)*}}) = 4, \tag{74} \]
\[ \mathrm{tr}(Z_{E^{(12)}}) = 12, \tag{75} \]
\[ \mathrm{tr}(Z_{E_{MS}^{(12)*}}) = 11, \tag{76} \]
\[ \mathrm{tr}(Z_{E_{MS}^{(12)}}) = 17, \tag{77} \]
\[ \mathrm{tr}(Z_{E^{(24)}}) = 24, \tag{78} \]
where $c_k = 0$ if $k \not\equiv 0 \bmod 3$, $c_{3m} = 2/3$ for $m \in \mathbb{N}$, and $\lfloor x \rfloor$ denotes the largest integer less than or equal to $x$.

Proof. For the $A$ graphs, $\mathrm{tr}(Z_{A^{(k+3)}})$ is given by the number of vertices of $A^{(k+3)}$, which is $1 + 2 + 3 + \cdots + (k+1) = (k+1)(k+2)/2$. For $k \not\equiv 0 \bmod 3$, the diagonal terms in $Z_{D^{(k+3)}}$ are given by the 0-colored vertices of $A^{(k+3)}$, so $\mathrm{tr}(Z_{D^{(k+3)}})$ is $\mathrm{tr}(Z_{A^{(k+3)}})/3$. For $k \equiv 0 \bmod 3$ the 0-colored vertices of $A^{(k+3)}$ again give the diagonal terms in $Z_{D^{(k+3)}}$ but the number of 0-colored vertices of $A^{(k+3)}$ is now one greater than the number of 1,2-colored vertices. The trace of $Z_{A^{(k+3)*}}$ is given by the number of "diagonal" elements $\mu = \bar\mu$ of $A^{(k+3)}$, which is $\lfloor (k+2)/2 \rfloor$. For the $D^*$ graphs, when $k \not\equiv 0 \bmod 3$, the trace is given by the number of vertices
$\mu = (\mu_1, \mu_2)$ of $A^{(k+3)}$ such that $A^{(n-3)(\mu_1-\mu_2)}\mu = \bar\mu$. For the 0-colored vertices this is the number of diagonal elements, whilst for the 1,2-colored vertices this is where $A\mu = \bar\mu$ or $A^2\mu = \bar\mu$, depending on the parity of $n$. In each case the number of such vertices is $\lfloor (k+2)/2 \rfloor$. For $k \equiv 0 \bmod 3$ the trace is again given by a third of the number of vertices of $A^{(k+3)}$ which satisfy each of the following: $\mu = \bar\mu$, $A\mu = \overline{A^2\mu}$, $A^2\mu = \overline{A\mu}$, $\mu = \overline{A\mu}$, $A\mu = \bar\mu$, $A^2\mu = \overline{A^2\mu}$, $\mu = \overline{A^2\mu}$, $A\mu = \overline{A\mu}$ and $A^2\mu = \bar\mu$. The first three equalities are satisfied when $\mu = \bar\mu$, the second three when $A\mu = \bar\mu$ and the last three when $A^2\mu = \bar\mu$. So we have $\mathrm{tr}(Z_{D^{(k+3)*}}) = 3\lfloor (k+2)/2 \rfloor$ also. The computation of $\mathrm{tr}(Z_E)$ for the exceptional invariants is clear from inspection of the modular invariant.

Lemma 5.2. The traces of the modular invariants at level $k$ are all different.

Proof. For level 5 we have $\mathrm{tr}(A^{(8)}) = 21$, $\mathrm{tr}(D^{(8)}) = 7$, $\mathrm{tr}(A^{(8)*}) = 3$ and $\mathrm{tr}(D^{(8)*}) = 9$, and compare these with (73) and (74). For level 9, $\mathrm{tr}(A^{(12)}) = 55$, $\mathrm{tr}(D^{(12)}) = 19$, $\mathrm{tr}(A^{(12)*}) = 5$ and $\mathrm{tr}(D^{(12)*}) = 15$, and compare these with (75)–(77). For level 21 we compare $\mathrm{tr}(A^{(24)}) = 253$, $\mathrm{tr}(D^{(24)}) = 85$, $\mathrm{tr}(A^{(24)*}) = 11$ and $\mathrm{tr}(D^{(24)*}) = 33$ with (78). For all other levels we need to compare the modular invariants for the $A$, $D$, $A^*$ and $D^*$ graphs.

Comparing the $A$ and $D$ modular invariants, the traces can only be equal if $3(k+1)(k+2) = (k+1)(k+2) + 6c_k$. For $k \equiv 0 \bmod 3$, this gives $k = 0, -3$, whilst if $k \not\equiv 0 \bmod 3$ we obtain $k = -1, -2$. So these traces cannot be equal except when $k = 0$, but the graphs $A^{(3)}$ and $D^{(3)}$ are both a single vertex. Comparing $A$-$A^*$, the traces are only equal if $(k+1)(k+2) = 2\lfloor (k+2)/2 \rfloor$. For even $k$ this gives solutions $k = 0, -4$, but when $k = 0$ the graph $A^{(3)*}$ is also just a single vertex, so identical to the graph $A^{(3)}$. For $k$ odd we have $k = -1$. Next, comparing $A$-$D^*$, the traces are only equal if $(k+1)(k+2) = 6\lfloor (k+2)/2 \rfloor$. For $k$ even this gives solutions $k = \pm 2$, but for $k = 2$ the graph $D^{(5)*}$ is identical to $A^{(5)}$. For $k$ odd we obtain solutions $k = -3, 1$, but we again have for $k = 1$ that the graphs $D^{(4)*}$ and $A^{(4)}$ are the same.

We now compare $D$-$A^*$. When $k \equiv 0 \bmod 3$, the traces are equal only if $(k+1)(k+2) + 4 = 6\lfloor (k+2)/2 \rfloor = 6\lfloor k/2 \rfloor + 6$, so we have the quadratic $k^2 + 3(k - 2\lfloor k/2 \rfloor) = 0$. When $k$ is even we have only the solution $k = 0$, whilst when $k$ is odd this gives $k^2 = -3$. When $k \not\equiv 0 \bmod 3$, we obtain instead the quadratic $k^2 + 3(k - 2\lfloor k/2 \rfloor) - 4 = 0$. For even $k$ this gives the solutions $k = \pm 2$, but we notice that the graphs $D^{(5)}$ and $A^{(5)*}$ are the same, whilst for odd $k$ we have the solutions $k = \pm 1$, but we again see that the graphs $D^{(4)}$ and $A^{(4)*}$ are the same. Comparing $D$-$D^*$ we now obtain the quadratic equations $k^2 + 3(k - 6\lfloor k/2 \rfloor) - 14 = 0$, $k^2 + 3(k - 6\lfloor k/2 \rfloor) - 18 = 0$ for $k \equiv 0 \bmod 3$, $k \not\equiv 0 \bmod 3$, respectively. Neither of these equations has integer solutions for odd or even $k$. Finally, comparing the $A^*$ and $D^*$ modular invariants, the traces are only equal if $\lfloor (k+2)/2 \rfloor = 3\lfloor (k+2)/2 \rfloor$, giving $\lfloor (k+2)/2 \rfloor = 0$ which has solutions $k = -2, -3$.

Since the traces of the modular invariants at any level are all different, once we have found the number of irreducible sectors, we can identify the corresponding
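modular invariant. There may however still be an ambiguity with regard to the fusion rules that these irreducible sectors satisfy, with different seemingly possible fusion rules giving different nimrep graphs for the $M$-$N$ system. However, we know that the nimrep must have spectrum $S_{\lambda,\nu}/S_{\lambda,0}$ with multiplicity determined by the diagonal part $Z_{\lambda,\lambda}$ of the modular invariant. It turns out that the consideration of the trace and the eigenvalues is sufficient to compute the $M$-$N$ graphs for $A^{(12)*}$, $D^{(12)*}$, $E_2^{(12)}$, $E_4^{(12)}$ and $E_5^{(12)}$, and identify the corresponding modular invariant. The results are summarized in Table 1. We will say that an irreducible sector $[\iota\lambda_{(\mu_1,\mu_2)}]$ such that $\mu_1 + \mu_2 = m$ appears at tier $m$.

5.1. $E^{(8)*}$

For the graph $E^{(8)*}$, we have $[\theta] = [\lambda_{(0,0)}] \oplus [\lambda_{(2,1)}] \oplus [\lambda_{(1,2)}] \oplus [\lambda_{(2,2)}] \oplus [\lambda_{(5,0)}] \oplus [\lambda_{(0,5)}]$. Then computing $\langle \iota\lambda, \iota\mu \rangle = \langle \lambda, \theta\mu \rangle$ (by Frobenius reciprocity) for $\lambda$, $\mu$ on $A^{(8)}$, we find $\langle \iota\lambda, \iota\lambda \rangle = 1$ and $\langle \iota\lambda, \iota\mu \rangle = 0$ for $\lambda \neq \mu$ in $\{\lambda_{(0,0)}, \lambda_{(1,0)}, \lambda_{(0,1)}\}$. At tier 2, we have $\langle \iota\lambda_{(2,0)}, \iota\lambda_{(2,0)} \rangle = 2$, $\langle \iota\lambda_{(2,0)}, \iota\lambda_{(1,0)} \rangle = 1$ and $\langle \iota\lambda_{(2,0)}, \iota\mu \rangle = 0$ for $\mu = \lambda_{(0,1)}, \lambda_{(0,0)}$. So $[\iota\lambda_{(2,0)}] = [\iota\lambda_{(1,0)}] \oplus [\iota\lambda^{(1)}_{(2,0)}]$. Since $\langle \iota\lambda_{(0,2)}, \iota\lambda_{(0,2)} \rangle = \langle \iota\lambda_{(0,2)}, \iota\lambda_{(2,0)} \rangle = 2$ we have $[\iota\lambda_{(0,2)}] = [\iota\lambda_{(2,0)}]$. Lastly, at tier 2, we have $\langle \iota\lambda_{(1,1)}, \iota\lambda_{(1,1)} \rangle = 2$ and $\langle \iota\lambda_{(1,1)}, \iota\lambda_{(1,0)} \rangle = \langle \iota\lambda_{(1,1)}, \iota\lambda_{(0,1)} \rangle = 1$, giving $[\iota\lambda_{(1,1)}] = [\iota\lambda_{(1,0)}] \oplus [\iota\lambda_{(0,1)}]$. At tier 3, we have $\langle \iota\lambda_{(3,0)}, \iota\lambda_{(3,0)} \rangle = \langle \iota\lambda_{(3,0)}, \iota\lambda_{(0,2)} \rangle = 2$, so $[\iota\lambda_{(3,0)}] = [\iota\lambda_{(0,2)}]$. Similarly $[\iota\lambda_{(0,3)}] = [\iota\lambda_{(2,0)}]$. For $\iota\lambda_{(2,1)}$, we find $\langle \iota\lambda_{(2,1)}, \iota\lambda_{(2,1)} \rangle = 2$ and $\langle \iota\lambda_{(2,1)}, \iota\lambda_{(0,0)} \rangle = \langle \iota\lambda_{(2,1)}, \iota\lambda_{(1,0)} \rangle = 1$, giving $[\iota\lambda_{(2,1)}] = [\iota\lambda_{(0,0)}] \oplus [\iota\lambda_{(1,0)}]$ and similarly $[\iota\lambda_{(1,2)}] = [\iota\lambda_{(0,0)}] \oplus [\iota\lambda_{(0,1)}]$. So no new irreducibles appear at tier 3. No new irreducible sectors appear at the other tiers either, so we have 4 irreducible sectors $[\iota\lambda_{(0,0)}]$, $[\iota\lambda_{(1,0)}]$, $[\iota\lambda_{(0,1)}]$ and $[\iota\lambda^{(1)}_{(2,0)}]$.

We now compute the sector products of these irreducible sectors with the $M$-$N$ sector $[\rho] = [\lambda_{(1,0)}]$. It is easy to compute $[\iota\lambda_{(0,0)}][\rho] = [\iota\lambda_{(1,0)}]$, $[\iota\lambda_{(1,0)}][\rho] = [\iota\lambda_{(0,1)}] \oplus [\iota\lambda_{(2,0)}] = [\iota\lambda_{(0,1)}] \oplus [\iota\lambda_{(1,0)}] \oplus [\iota\lambda^{(1)}_{(2,0)}]$ and $[\iota\lambda_{(0,1)}][\rho] = [\iota\lambda_{(0,0)}] \oplus [\iota\lambda_{(1,0)}] \oplus [\iota\lambda_{(0,1)}]$. We can invert these formulae to obtain $[\iota\lambda^{(1)}_{(2,0)}] = [\iota\lambda_{(2,0)}] \ominus [\iota\lambda_{(1,0)}]$, and so $[\iota\lambda^{(1)}_{(2,0)}][\rho] = [\iota\lambda_{(1,1)}] \oplus [\iota\lambda_{(3,0)}] \ominus ([\iota\lambda_{(2,0)}] \oplus [\iota\lambda_{(0,1)}]) = [\iota\lambda_{(0,1)}]$. Then we see that the multiplication graph for $[\rho]$ is the original graph $E^{(8)*}$ itself, illustrated in Fig. 5, and the modular invariant associated to $\theta$ is $Z_{E^{(8)*}}$.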
Fig. 5. $M$-$N$ graph for the $E^{(8)*}$ SU(3)-GHJ subfactor.
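The identification step used above, matching the number of irreducible sectors against the traces of Lemma 5.1, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `classical_traces` and `candidates` and the string labels for the invariants are hypothetical, and the exceptional traces are simply copied from (73)–(78) together with the levels 5, 9 and 21 at which Lemma 5.2 compares them.

```python
def classical_traces(k):
    """Traces (69)-(72) of the A, D, A*, D* invariants at level k."""
    vertices = (k + 1) * (k + 2) // 2      # number of vertices of A^(k+3)
    # (k+1)(k+2)/6 + c_k, with c_k = 2/3 when 3 | k and 0 otherwise
    tr_d = (vertices + 2) // 3 if k % 3 == 0 else vertices // 3
    half = (k + 2) // 2                    # floor((k + 2) / 2)
    return {"A": vertices, "D": tr_d, "A*": half, "D*": 3 * half}

# Exceptional traces (73)-(78), keyed by level k = n - 3.
EXCEPTIONAL = {
    5: {"E(8)": 12, "E(8)*": 4},
    9: {"E(12)": 12, "E_MS(12)*": 11, "E_MS(12)": 17},
    21: {"E(24)": 24},
}

def candidates(k, n_sectors):
    """Invariants at level k whose trace equals the number of sectors."""
    table = dict(classical_traces(k))
    table.update(EXCEPTIONAL.get(k, {}))
    return [name for name, tr in table.items() if tr == n_sectors]

print(classical_traces(5))   # {'A': 21, 'D': 7, 'A*': 3, 'D*': 9}
print(candidates(5, 4))      # ['E(8)*']
print(candidates(9, 13))     # []
```

With four irreducible sectors at level 5 the only match is $E^{(8)*}$, in agreement with the conclusion of this subsection; a count with no match, such as 13 at level 9, rules the corresponding decomposition out.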
5.2. $E_2^{(12)}$
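For the graph $E_2^{(12)}$, we have $[\theta] = [\lambda_{(0,0)}] \oplus 2[\lambda_{(2,2)}] \oplus [\lambda_{(4,1)}] \oplus [\lambda_{(1,4)}] \oplus 2[\lambda_{(5,2)}] \oplus 2[\lambda_{(2,5)}] \oplus [\lambda_{(4,4)}] \oplus [\lambda_{(9,0)}] \oplus [\lambda_{(0,9)}]$. We have $\langle \iota\lambda, \iota\lambda \rangle = 1$ and $\langle \iota\lambda, \iota\mu \rangle = 0$ for all $\lambda \neq \mu$ in $\{\lambda_{(0,0)}, \lambda_{(1,0)}, \lambda_{(0,1)}\}$. At tier 2, we have $\langle \iota\lambda, \iota\lambda \rangle = 3$ and $\langle \iota\lambda, \iota\mu \rangle = 0$ for $\lambda = \lambda_{(2,0)}, \lambda_{(1,1)}, \lambda_{(0,2)}$, $\mu = \lambda_{(0,0)}, \lambda_{(1,0)}, \lambda_{(0,1)}$. Then $\iota\lambda_{(2,0)}$, $\iota\lambda_{(1,1)}$, $\iota\lambda_{(0,2)}$ decompose into irreducibles as
\[ [\iota\lambda_{(2,0)}] = [\iota\lambda^{(1)}_{(2,0)}] \oplus [\iota\lambda^{(2)}_{(2,0)}] \oplus [\iota\lambda^{(3)}_{(2,0)}], \tag{79} \]
\[ [\iota\lambda_{(1,1)}] = [\iota\lambda^{(1)}_{(1,1)}] \oplus [\iota\lambda^{(2)}_{(1,1)}] \oplus [\iota\lambda^{(3)}_{(1,1)}], \tag{80} \]
\[ [\iota\lambda_{(0,2)}] = [\iota\lambda^{(1)}_{(0,2)}] \oplus [\iota\lambda^{(2)}_{(0,2)}] \oplus [\iota\lambda^{(3)}_{(0,2)}]. \tag{81} \]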
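At tier 3, we find $\langle \iota\lambda_{(3,0)}, \iota\lambda_{(3,0)} \rangle = \langle \iota\lambda_{(3,0)}, \iota\lambda_{(1,1)} \rangle = 3$ so that $[\iota\lambda_{(3,0)}] = [\iota\lambda_{(1,1)}]$, and similarly $[\iota\lambda_{(0,3)}] = [\iota\lambda_{(1,1)}]$. From $\langle \iota\lambda_{(2,1)}, \iota\lambda_{(2,1)} \rangle = 7$, $\langle \iota\lambda_{(2,1)}, \iota\lambda_{(1,0)} \rangle = 2$ and $\langle \iota\lambda_{(2,1)}, \iota\lambda_{(0,2)} \rangle = 3$, and similarly for $\iota\lambda_{(1,2)}$, we obtain
\[ [\iota\lambda_{(2,1)}] = 2[\iota\lambda_{(1,0)}] \oplus [\iota\lambda^{(1)}_{(0,2)}] \oplus [\iota\lambda^{(2)}_{(0,2)}] \oplus [\iota\lambda^{(3)}_{(0,2)}], \tag{82} \]
\[ [\iota\lambda_{(1,2)}] = 2[\iota\lambda_{(0,1)}] \oplus [\iota\lambda^{(1)}_{(2,0)}] \oplus [\iota\lambda^{(2)}_{(2,0)}] \oplus [\iota\lambda^{(3)}_{(2,0)}], \]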
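and no new irreducible sectors appear at tier 3. Then we have twelve irreducible sectors $[\iota\lambda_{(0,0)}]$, $[\iota\lambda_{(1,0)}]$, $[\iota\lambda_{(0,1)}]$, $[\iota\lambda^{(i)}_{(2,0)}]$, $[\iota\lambda^{(i)}_{(1,1)}]$, $[\iota\lambda^{(i)}_{(0,2)}]$ for $i = 1, 2, 3$, and the corresponding modular invariant must be $Z_{E^{(12)}}$ since $\mathrm{tr}(Z_{E^{(12)}}) = 12$.

We now look at the fusion rules that these irreducible sectors satisfy. With $\rho = \lambda_{(1,0)}$, we have $[\iota\lambda_{(0,0)}][\rho] = [\iota\lambda_{(1,0)}]$,
\[ [\iota\lambda_{(1,0)}][\rho] = [\iota\lambda_{(0,1)}] \oplus [\iota\lambda_{(2,0)}] = [\iota\lambda_{(0,1)}] \oplus [\iota\lambda^{(1)}_{(2,0)}] \oplus [\iota\lambda^{(2)}_{(2,0)}] \oplus [\iota\lambda^{(3)}_{(2,0)}], \tag{83} \]
and similarly $[\iota\lambda_{(0,1)}][\rho] = [\iota\lambda_{(0,0)}] \oplus [\iota\lambda^{(1)}_{(1,1)}] \oplus [\iota\lambda^{(2)}_{(1,1)}] \oplus [\iota\lambda^{(3)}_{(1,1)}]$. Since $[\iota\lambda_{(2,0)}][\rho] = [\iota\lambda_{(1,1)}] \oplus [\iota\lambda_{(3,0)}] = 2[\iota\lambda^{(1)}_{(1,1)}] \oplus 2[\iota\lambda^{(2)}_{(1,1)}] \oplus 2[\iota\lambda^{(3)}_{(1,1)}]$, we obtain $([\iota\lambda^{(1)}_{(2,0)}][\rho]) \oplus ([\iota\lambda^{(2)}_{(2,0)}][\rho]) \oplus ([\iota\lambda^{(3)}_{(2,0)}][\rho]) = 2[\iota\lambda^{(1)}_{(1,1)}] \oplus 2[\iota\lambda^{(2)}_{(1,1)}] \oplus 2[\iota\lambda^{(3)}_{(1,1)}]$.

We now use a similar argument to that in [4, §2.4]. The statistical dimension of the positive energy representation $(\mu_1, \mu_2)$ of SU(3)$_9$ is given by the Perron–Frobenius eigenvector for the graph $A^{(12)}$: $d_{(\mu_1,\mu_2)} = [\mu_1+1][\mu_2+1][\mu_1+\mu_2+2]/[2]$. Then from (83) we obtain $d^{(1)}_{(2,0)} + d^{(2)}_{(2,0)} + d^{(3)}_{(2,0)} = d^2_{(1,0)} - d_{(1,0)} = [3]^2 - [3] = [3][4]/[2]$, where $d^{(i)}_{(2,0)} = d_{\iota\lambda^{(i)}_{(2,0)}}$. We may then assume without loss of generality that $d^{(1)}_{(2,0)} < [3][4]/(3[2]) = [2][3]/[4]$. Then since $([2][3]/[4])^2 \approx 2.488 < 3$, $[\overline{\iota\lambda^{(1)}_{(2,0)}}][\iota\lambda^{(1)}_{(2,0)}]$ decomposes into at most two irreducible $N$-$N$ sectors. Then $\langle \iota\lambda^{(1)}_{(2,0)} \circ \rho, \iota\lambda^{(1)}_{(2,0)} \circ \rho \rangle = \langle \rho \circ \bar\rho, \overline{\iota\lambda^{(1)}_{(2,0)}} \circ \iota\lambda^{(1)}_{(2,0)} \rangle \leq 2$. So $[\iota\lambda^{(1)}_{(2,0)}][\rho]$ cannot contain an irreducible sector with multiplicity greater than one. Since, by (79) and (82), $\langle \iota\lambda^{(1)}_{(2,0)} \circ \rho, \iota\lambda_{(1,1)} \rangle = \langle \iota\lambda^{(1)}_{(2,0)}, \iota\lambda_{(1,1)} \circ \bar\rho \rangle = \langle \iota\lambda^{(1)}_{(2,0)}, \iota\lambda_{(0,1)} + \iota\lambda_{(2,0)} + \iota\lambda_{(1,2)} \rangle = 2$,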
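using (80) we may assume, again without loss of generality, that
\[ [\iota\lambda^{(1)}_{(2,0)}][\rho] = [\iota\lambda^{(1)}_{(1,1)}] \oplus [\iota\lambda^{(2)}_{(1,1)}]. \]
Since $[\iota\lambda_{(1,0)}][\rho] \supset [\iota\lambda^{(1)}_{(2,0)}]$ and $\langle \iota\lambda_{(1,0)}, \iota\lambda^{(1)}_{(2,0)} \circ \bar\rho \rangle = \langle \iota\lambda_{(1,0)} \circ \rho, \iota\lambda^{(1)}_{(2,0)} \rangle > 0$, then $[\iota\lambda^{(1)}_{(2,0)}][\bar\rho] \supset [\iota\lambda_{(1,0)}]$. Then since $\langle \iota\lambda^{(1)}_{(2,0)} \circ \bar\rho, \iota\lambda^{(1)}_{(2,0)} \circ \bar\rho \rangle = \langle \iota\lambda^{(1)}_{(2,0)} \circ \rho, \iota\lambda^{(1)}_{(2,0)} \circ \rho \rangle = 2$, we have $[\iota\lambda^{(1)}_{(2,0)}][\bar\rho] = [\iota\lambda_{(1,0)}] \oplus [\iota\lambda^{(j)}_{(0,2)}]$, for $j \in \{1, 2, 3\}$. By a similar argument we may also assume that $[\iota\lambda^{(1)}_{(2,0)}]$ has statistical dimension $< [2][3]/[4]$, and using $[\bar\rho]$ instead of $[\rho]$, we find $[\iota\lambda^{(1)}_{(0,2)}][\rho] = [\iota\lambda_{(0,1)}] \oplus [\iota\lambda^{(j')}_{(2,0)}]$, and have the freedom to set $j' = 3$. Then we also have $[\iota\lambda^{(j)}_{(0,2)}][\rho] \supset [\iota\lambda_{(0,1)}]$ for $j = 2, 3$ and $([\iota\lambda^{(2)}_{(0,2)}] \oplus [\iota\lambda^{(3)}_{(0,2)}])[\rho] = 2[\iota\lambda_{(0,1)}] \oplus [\iota\lambda^{(1)}_{(2,0)}] \oplus [\iota\lambda^{(2)}_{(2,0)}]$. From $[\iota\lambda_{(1,1)}][\rho]$ we obtain $([\iota\lambda^{(1)}_{(1,1)}] \oplus [\iota\lambda^{(2)}_{(1,1)}] \oplus [\iota\lambda^{(3)}_{(1,1)}])[\rho] = 3[\iota\lambda_{(1,0)}] \oplus 2[\iota\lambda^{(1)}_{(0,2)}] \oplus 2[\iota\lambda^{(2)}_{(0,2)}] \oplus 2[\iota\lambda^{(3)}_{(0,2)}]$ and since $[\iota\lambda_{(1,0)}][\bar\rho] = [\iota\lambda_{(0,0)}] \oplus [\iota\lambda_{(1,1)}] = [\iota\lambda_{(0,0)}] \oplus [\iota\lambda^{(1)}_{(1,1)}] \oplus [\iota\lambda^{(2)}_{(1,1)}] \oplus [\iota\lambda^{(3)}_{(1,1)}]$ then $\langle \iota\lambda^{(j)}_{(1,1)} \circ \rho, \iota\lambda_{(1,0)} \rangle = \langle \iota\lambda^{(j)}_{(1,1)}, \iota\lambda_{(1,0)} \circ \bar\rho \rangle = 1$ and $[\iota\lambda^{(j)}_{(1,1)}][\rho] \supset [\iota\lambda_{(1,0)}]$ for $j = 1, 2, 3$.

There is still some ambiguity surrounding the decompositions of $[\iota\lambda^{(j)}_{(2,0)}][\rho]$, $[\iota\lambda^{(j)}_{(1,1)}][\rho]$ and $[\iota\lambda^{(j)}_{(0,2)}][\rho]$, for $j = 2, 3$. Computing the eigenvalues of the nimrep graphs for the different possibilities, we find that the only nimrep graph which has eigenvalues $S_{\rho\mu}/S_{0\mu}$ with multiplicities given by the diagonal entry $Z_{\mu,\mu}$ of the modular invariant is that for: $[\iota\lambda^{(j)}_{(2,0)}][\rho] = [\iota\lambda^{(j)}_{(1,1)}] \oplus [\iota\lambda^{(j+1)}_{(1,1)}]$, $[\iota\lambda^{(j)}_{(1,1)}][\rho] = [\iota\lambda^{(l)}_{(0,2)}] \oplus [\iota\lambda^{(l+1)}_{(0,2)}]$ and $[\iota\lambda^{(j)}_{(0,2)}][\rho] = [\iota\lambda_{(0,1)}] \oplus [\iota\lambda^{(j+1)}_{(2,0)}]$ for $j = 1, 2, 3$, $l \in \{1, 2, 3\}$. The nimrep graph is the same for any choice of $l = 1, 2, 3$, up to a relabeling of the irreducible representations $[\iota\lambda^{(j)}_{(2,0)}]$, $[\iota\lambda^{(j)}_{(1,1)}]$ and $[\iota\lambda^{(j)}_{(0,2)}]$, and the graph is just the graph $E_2^{(12)}$ itself, illustrated in Fig. 6. The associated modular invariant is $Z_{E^{(12)}}$.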
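The statistical-dimension estimate used in this subsection is easy to check numerically. The following is a minimal sketch, assuming the usual quantum integers $[m] = \sin(m\pi/n)/\sin(\pi/n)$ with $n = 12$ for SU(3)$_9$; this normalization is not restated in the text above and is taken here as an assumption, as are the helper names `qint` and `stat_dim`.

```python
import math

def qint(m, n=12):
    """Quantum integer [m] at q = exp(i*pi/n) (assumed normalization)."""
    return math.sin(m * math.pi / n) / math.sin(math.pi / n)

def stat_dim(mu1, mu2, n=12):
    """d_(mu1,mu2) = [mu1+1][mu2+1][mu1+mu2+2]/[2]."""
    return qint(mu1 + 1, n) * qint(mu2 + 1, n) * qint(mu1 + mu2 + 2, n) / qint(2, n)

d10 = stat_dim(1, 0)                 # equals [3]
total = d10 ** 2 - d10               # d^(1) + d^(2) + d^(3) for iota*lambda_(2,0)
bound = total / 3                    # the smallest summand is at most this

print(abs(total - qint(3) * qint(4) / qint(2)) < 1e-12)   # [3]^2 - [3] = [3][4]/[2]
print(abs(bound - qint(2) * qint(3) / qint(4)) < 1e-12)   # [3][4]/(3[2]) = [2][3]/[4]
print(round(bound ** 2, 3))                               # 2.488, which is below 3
```

Since the squared bound is below 3, a sector of dimension below the bound composed with its conjugate has room for at most two irreducible $N$-$N$ sectors, which is the step the argument relies on.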
5.3. $E_4^{(12)}$
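Warning. The existence of the SU(3)-Goodman–de la Harpe–Jones subfactor which gives the dual canonical endomorphism for $E_4^{(12)}$ has not been shown yet by us.

For $E_4^{(12)}$, we suppose $[\theta] = [\lambda_{(0,0)}] \oplus [\lambda_{(2,2)}] \oplus [\lambda_{(4,1)}] \oplus [\lambda_{(1,4)}] \oplus [\lambda_{(5,2)}] \oplus [\lambda_{(2,5)}] \oplus [\lambda_{(4,4)}] \oplus [\lambda_{(9,0)}] \oplus [\lambda_{(0,9)}]$. Then computing $\langle \iota\lambda, \iota\mu \rangle = \langle \lambda, \theta\mu \rangle$ for $\lambda$, $\mu$ on $A^{(12)}$, we find $\langle \iota\lambda, \iota\lambda \rangle = 1$ for $\lambda = \lambda_{(0,0)}, \lambda_{(1,0)}, \lambda_{(0,1)}$. At tier 2, we have $\langle \iota\lambda, \iota\lambda \rangle = 2$ and $\langle \iota\lambda, \iota\mu \rangle = 0$ for $\lambda = \lambda_{(2,0)}, \lambda_{(1,1)}, \lambda_{(0,2)}$, $\mu = \lambda_{(0,0)}, \lambda_{(1,0)}, \lambda_{(0,1)}$. Then $[\iota\lambda_{(2,0)}]$, $[\iota\lambda_{(1,1)}]$, $[\iota\lambda_{(0,2)}]$ decompose into irreducibles as
\[ [\iota\lambda_{(2,0)}] = [\iota\lambda^{(1)}_{(2,0)}] \oplus [\iota\lambda^{(2)}_{(2,0)}], \tag{84} \]
\[ [\iota\lambda_{(1,1)}] = [\iota\lambda^{(1)}_{(1,1)}] \oplus [\iota\lambda^{(2)}_{(1,1)}], \tag{85} \]
\[ [\iota\lambda_{(0,2)}] = [\iota\lambda^{(1)}_{(0,2)}] \oplus [\iota\lambda^{(2)}_{(0,2)}]. \tag{86} \]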
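At tier 3, $\langle \iota\lambda_{(3,0)}, \iota\lambda_{(3,0)} \rangle = \langle \iota\lambda_{(3,0)}, \iota\lambda_{(1,1)} \rangle = 2$ and similarly for $\iota\lambda_{(0,3)}$, so that $[\iota\lambda_{(3,0)}] = [\iota\lambda_{(0,3)}] = [\iota\lambda_{(1,1)}]$. From $\langle \iota\lambda_{(2,1)}, \iota\lambda_{(2,1)} \rangle = 5$, $\langle \iota\lambda_{(2,1)}, \iota\lambda_{(1,0)} \rangle = 1$ and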
Fig. 6. $M$-$N$ graph for the $E_2^{(12)}$ SU(3)-GHJ subfactor.
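$\langle \iota\lambda_{(2,1)}, \iota\lambda_{(0,2)} \rangle = 2$, we have two possibilities for the decomposition of $[\iota\lambda_{(2,1)}]$:
\[ [\iota\lambda_{(2,1)}] = \begin{cases} [\iota\lambda_{(1,0)}] \oplus 2[\iota\lambda^{(j)}_{(0,2)}] & \text{case (i)}, \\ [\iota\lambda_{(1,0)}] \oplus [\iota\lambda^{(1)}_{(0,2)}] \oplus [\iota\lambda^{(2)}_{(0,2)}] \oplus [\iota\lambda^{(1)}_{(2,1)}] \oplus [\iota\lambda^{(2)}_{(2,1)}] & \text{case (ii)}, \end{cases} \tag{87} \]
where we may assume $j = 1$ without loss of generality. Similarly,
\[ [\iota\lambda_{(1,2)}] = \begin{cases} [\iota\lambda_{(0,1)}] \oplus 2[\iota\lambda^{(1)}_{(2,0)}] & \text{case (i}'\text{)}, \\ [\iota\lambda_{(0,1)}] \oplus [\iota\lambda^{(1)}_{(2,0)}] \oplus [\iota\lambda^{(2)}_{(2,0)}] \oplus [\iota\lambda^{(1)}_{(1,2)}] \oplus [\iota\lambda^{(2)}_{(1,2)}] & \text{case (ii}'\text{)}. \end{cases} \tag{88} \]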
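At tier 4, we have $\langle \iota\lambda_{(4,0)}, \iota\lambda_{(4,0)} \rangle = 3$, $\langle \iota\lambda_{(4,0)}, \iota\lambda_{(1,0)} \rangle = 1$ and $\langle \iota\lambda_{(4,0)}, \iota\lambda_{(0,2)} \rangle = 2$, and similarly for $\iota\lambda_{(0,4)}$, giving $[\iota\lambda_{(4,0)}] = [\iota\lambda_{(1,0)}] \oplus [\iota\lambda^{(1)}_{(0,2)}] \oplus [\iota\lambda^{(2)}_{(0,2)}]$, $[\iota\lambda_{(0,4)}] = [\iota\lambda_{(0,1)}] \oplus [\iota\lambda^{(1)}_{(2,0)}] \oplus [\iota\lambda^{(2)}_{(2,0)}]$. From $\langle \iota\lambda_{(3,1)}, \iota\lambda_{(3,1)} \rangle = 8$, $\langle \iota\lambda_{(3,1)}, \iota\lambda_{(0,1)} \rangle = 2$, $\langle \iota\lambda_{(3,1)}, \iota\lambda_{(2,0)} \rangle = 2$ and $\langle \iota\lambda_{(3,1)}, \iota\lambda_{(1,2)} \rangle = 6$ we have
\[ [\iota\lambda_{(3,1)}] = \begin{cases} 2[\iota\lambda_{(0,1)}] \oplus 2[\iota\lambda^{(1)}_{(2,0)}] & \text{for case (i}'\text{)}, \\ 2[\iota\lambda_{(0,1)}] \oplus [\iota\lambda^{(1)}_{(2,0)}] \oplus [\iota\lambda^{(2)}_{(2,0)}] \oplus [\iota\lambda^{(1)}_{(1,2)}] \oplus [\iota\lambda^{(2)}_{(1,2)}] & \text{for case (ii}'\text{)}, \end{cases} \]
\[ [\iota\lambda_{(1,3)}] = \begin{cases} 2[\iota\lambda_{(1,0)}] \oplus 2[\iota\lambda^{(1)}_{(0,2)}] & \text{for case (i)}, \\ 2[\iota\lambda_{(1,0)}] \oplus [\iota\lambda^{(1)}_{(0,2)}] \oplus [\iota\lambda^{(2)}_{(0,2)}] \oplus [\iota\lambda^{(1)}_{(2,1)}] \oplus [\iota\lambda^{(2)}_{(2,1)}] & \text{for case (ii)}. \end{cases} \]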
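We have $\langle \iota\lambda_{(2,2)}, \iota\lambda_{(2,2)} \rangle = 11$, $\langle \iota\lambda_{(2,2)}, \iota\lambda_{(0,0)} \rangle = 1$ and $\langle \iota\lambda_{(2,2)}, \iota\lambda_{(1,1)} \rangle = 4$, giving
\[ [\iota\lambda_{(2,2)}] = \begin{cases} [\iota\lambda_{(0,0)}] \oplus 3[\iota\lambda^{(j)}_{(1,1)}] \oplus [\iota\lambda^{(3-j)}_{(1,1)}] & \text{case I}, \\ [\iota\lambda_{(0,0)}] \oplus 2[\iota\lambda^{(1)}_{(1,1)}] \oplus 2[\iota\lambda^{(2)}_{(1,1)}] \oplus [\iota\lambda^{(1)}_{(2,2)}] \oplus [\iota\lambda^{(2)}_{(2,2)}] & \text{case II}, \end{cases} \tag{89} \]
where $j \in \{1, 2\}$. Again, without loss of generality, we may assume that $j = 1$, and we see that for case I nothing new appears at tier 4. For case II, at tier 5 we find $[\iota\lambda_{(5,0)}] = [\iota\lambda_{(0,4)}]$, $[\iota\lambda_{(0,5)}] = [\iota\lambda_{(4,0)}]$, $[\iota\lambda_{(4,1)}] = [\iota\lambda_{(1,4)}] = [\iota\lambda_{(0,0)}] \oplus 2[\iota\lambda^{(1)}_{(1,1)}] \oplus 2[\iota\lambda^{(2)}_{(1,1)}]$ and
\[ [\iota\lambda_{(3,2)}] = \begin{cases} 2[\iota\lambda_{(1,0)}] \oplus 3[\iota\lambda^{(1)}_{(0,2)}] \oplus [\iota\lambda^{(2)}_{(0,2)}] & \text{for case (i)}, \\ 2[\iota\lambda_{(1,0)}] \oplus 2[\iota\lambda^{(1)}_{(0,2)}] \oplus 2[\iota\lambda^{(2)}_{(0,2)}] \oplus [\iota\lambda^{(1)}_{(2,1)}] \oplus [\iota\lambda^{(2)}_{(2,1)}] & \text{for case (ii)}, \end{cases} \]
\[ [\iota\lambda_{(2,3)}] = \begin{cases} 2[\iota\lambda_{(0,1)}] \oplus 3[\iota\lambda^{(1)}_{(2,0)}] \oplus [\iota\lambda^{(2)}_{(2,0)}] & \text{for case (i}'\text{)}, \\ 2[\iota\lambda_{(0,1)}] \oplus 2[\iota\lambda^{(1)}_{(2,0)}] \oplus 2[\iota\lambda^{(2)}_{(2,0)}] \oplus [\iota\lambda^{(1)}_{(1,2)}] \oplus [\iota\lambda^{(2)}_{(1,2)}] & \text{for case (ii}'\text{)}, \end{cases} \]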
and nothing new appears at tier 5. Then the total number of irreducible sectors for case I(i)(i$'$) is 9, for cases I(i)(ii$'$), I(ii)(i$'$), II(i)(i$'$) we have 11, for cases I(ii)(ii$'$), II(i)(ii$'$), II(ii)(i$'$) we have 13 and for case II(ii)(ii$'$) we have 15. The values of $\mathrm{tr}(Z)$ at level 12 are $\mathrm{tr}(Z_{A^{(12)}}) = 55$, $\mathrm{tr}(Z_{D^{(12)}}) = 19$, $\mathrm{tr}(Z_{A^{(12)*}}) = 5$, $\mathrm{tr}(Z_{D^{(12)*}}) = 15$, $\mathrm{tr}(Z_{E^{(12)}}) = 12$, $\mathrm{tr}(Z_{E_{MS}^{(12)*}}) = 11$ and $\mathrm{tr}(Z_{E_{MS}^{(12)}}) = 17$. So we see that the only possible cases are I(i)(ii$'$), I(ii)(i$'$), II(i)(i$'$) which have corresponding modular invariant $Z_{E_{MS}^{(12)*}}$, and II(ii)(ii$'$) associated with the modular invariant $Z_{D^{(12)*}}$.

For case II(i)(i$'$), where we again use the notation $\rho = \lambda_{(1,0)}$, we have $[\iota\lambda_{(1,2)}][\rho] = [\iota\lambda_{(1,1)}] \oplus [\iota\lambda_{(0,3)}] \oplus [\iota\lambda_{(2,2)}]$ and $[\iota\lambda_{(1,2)}][\rho] = ([\iota\lambda_{(0,1)}] \oplus 2[\iota\lambda^{(1)}_{(2,0)}])[\rho] = [\iota\lambda_{(0,0)}] \oplus [\iota\lambda_{(1,1)}] \oplus 2([\iota\lambda^{(1)}_{(2,0)}][\rho])$, giving $2[\iota\lambda^{(1)}_{(2,0)}][\rho] = 3[\iota\lambda^{(1)}_{(1,1)}] \oplus 3[\iota\lambda^{(2)}_{(1,1)}] \oplus [\iota\lambda^{(1)}_{(2,2)}] \oplus [\iota\lambda^{(2)}_{(2,2)}]$, which is impossible since $[\iota\lambda^{(1)}_{(2,0)}][\rho]$ must have integer coefficients. Note that case II(ii)(i$'$) is the conjugate of case II(i)(ii$'$), where we replace $\iota\lambda_{(\mu_1,\mu_2)} \leftrightarrow \iota\lambda_{(\mu_2,\mu_1)}$. So we need to only consider cases I(i)(ii$'$) and II(ii)(ii$'$).

Consider first the case I(i)(ii$'$). From $[\iota\lambda_{(2,1)}][\rho] = [\iota\lambda_{(2,0)}] \oplus [\iota\lambda_{(1,2)}] \oplus [\iota\lambda_{(3,1)}]$ (1)
and (87) we find [ιλ(2,1) ][ρ] = [ιλ(2,0) ] ⊕ [ιλ(1,2) ] ⊕ [ιλ(3,1) ] ([ιλ(0,1) ] ⊕ [ιλ(2,0) ]) = (1)
(2)
(1)
(2)
[ιλ(0,1) ] ⊕ [ιλ(2,1) ] ⊕ [ιλ(2,1) ] ⊕ [ιλ(1,2) ] ⊕ [ιλ(1,2) ]. Then by [ιλ(0,2) ][ρ] = [ιλ(0,1) ] ⊕ (2)
[ιλ(1,2) ] and (86), [ιλ(0,2) ][ρ] = [ιλ(0,1) ]. From [ιλ(1,1) ][ρ] = [ιλ(1,0) ] ⊕ [ιλ(0,2) ] ⊕ [ιλ(2,1) ] and (85) we obtain (1)
(2)
(1)
(2)
([ιλ(1,1) ] ⊕ [ιλ(1,1) ])[ρ] = 2[ιλ(1,0) ] ⊕ 3[ιλ(0,2) ] ⊕ [ιλ(0,2) ],
(90)
whilst from [ιλ(2,2) ][ρ] = [ιλ(2,1) ] ⊕ [ιλ(1,3) ] ⊕ [ιλ(3,2) ] and (89) we have (1)
(2)
(1)
(2)
(3[ιλ(1,1) ] ⊕ [ιλ(1,1) ])[ρ] = 4[ιλ(1,0) ] ⊕ 7[ιλ(0,2) ] ⊕ [ιλ(0,2) ].
(91)
Then from (90) and (91) we find (1)
(1)
[ιλ(1,1) ][ρ] = [ιλ(1,0) ] ⊕ 2[ιλ(0,2) ],
(2)
(1)
(2)
[ιλ(1,1) ][ρ] = [ιλ(1,0) ] ⊕ [ιλ(0,2) ] ⊕ [ιλ(0,2) ].
In the same manner, by considering [ιλ(2,0) ][ρ] = [ιλ(1,1) ]⊕[ιλ(3,0) ] and [ιλ(1,2) ][ρ] = [ιλ(1,1) ] ⊕ [ιλ(0,3) ] ⊕ [ιλ(2,2) ], and using (84) and (88), we have (1)
(2)
(1)
(2)
(1)
(2)
([ιλ(2,0) ] ⊕ [ιλ(2,0) ])[ρ] = 2[ιλ(1,1) ] ⊕ 2[ιλ(1,1) ], (1)
(2)
(92)
(1)
(2)
([ιλ(2,0) ] ⊕ [ιλ(2,0) ] ⊕ [ιλ(1,2) ] ⊕ [ιλ(1,2) ])[ρ] = [ιλ(0,0) ] ⊕ 5[ιλ(1,1) ] ⊕ 3[ιλ(1,1) ] ([ιλ(0,0) ] ⊕ [ιλ(1,1) ]). (1)
(2)
(93) (1)
Then from (92), (90) and (85), we have ([ιλ(1,2) ][ρ])⊕([ιλ(1,2) ][ρ]) = 2[ιλ(1,1) ] giving (j)
(1)
[ιλ(1,2) ][ρ] = [ιλ(1,1) ] for j = 1, 2. From [ιλ(2,2) ][ρ] = [ιλ(1,2) ] ⊕ [ιλ(3,1) ] ⊕ [ιλ(2,3) ] and (89) we have (1)
(1)
(2)
([ιλ(1,1) ] ⊕ 2[ιλ(1,1) ])[ρ] = 4[ιλ(0,1) ] ⊕ 4[ιλ(2,0) ] ⊕ 4[ιλ(2,0) ] (1)
(2)
⊕ 3[ιλ(1,2) ] ⊕ 3[ιλ(1,2) ], (1)
(1)
(2)
(94)
(1)
(2)
giving 2[ιλ(1,1) ][ρ] = 2[ιλ(0,1) ] ⊕ 2[ιλ(2,0) ] ⊕ 2[ιλ(2,0) ] ⊕ 2[ιλ(1,2) ] ⊕ 2[ιλ(1,2) ]. Then (j)
(1)
(j)
(1)
ιλ(2,0) ◦ ρ, λ(1,1) = ιλ(2,0) , λ(1,1) ◦ ρ = 1 for j = 1, 2, and the decompo(1)
(2)
(1)
sitions of [ιλ(2,0) ][ρ] and [ιλ(2,0) ][ρ] both contain the irreducible sector [ιλ(1,1) ]. (2)
(1)
(1)
(2)
Then [ιλ(1,1) ][ρ] = ([ιλ(1,1) ][ρ]) ([ιλ(1,1) ][ρ]) = [ιλ(0,1) ] ⊕ [ιλ(2,0) ] ⊕ [ιλ(2,0) ] and (1)
(2)
(2)
[ιλ(2,0) ][ρ] and [ιλ(2,0) ][ρ] both also contain [ιλ(1,1) ]. Then from (92), we have (j)
(1)
(2)
[ιλ(2,0) ][ρ] = [ιλ(1,1) ] ⊕ [ιλ(1,1) ]. The nimrep graph for multiplication by [ρ] for the case I(i)(ii ) is then seen to be just the graph E4 . Now consider the case II(ii)(ii ), which has corresponding modular invariant ZD(12)∗ . We obtain the following sector products: (12)
(1)
(2)
(1)
(2)
(1)
(2)
(1)
(2)
(1)
(2)
(1)
(2)
(1)
(2)
(1)
(2)
(1)
(2)
(1)
(2)
([ιλ(2,0) ] ⊕ [ιλ(2,0) ])[ρ] = 2[ιλ(1,1) ] ⊕ 2[ιλ(1,1) ], (1)
(2)
(1)
(2)
([ιλ(1,1) ] ⊕ [ιλ(1,1) ])[ρ] = 2[ιλ(1,0) ] ⊕ 2[ιλ(0,2) ] ⊕ 2[ιλ(0,2) ] ⊕ [ιλ(2,1) ] ⊕ [ιλ(2,1) ], (1)
(2)
(1)
(2)
([ιλ(0,2) ] ⊕ [ιλ(0,2) ])[ρ] = 2[ιλ(0,1) ] ⊕ [ιλ(2,0) ] ⊕ [ιλ(2,0) ] ⊕ [ιλ(1,2) ] ⊕ [ιλ(1,2) ], ([ιλ(2,1) ] ⊕ [ιλ(2,1) ])[ρ] = [ιλ(2,0) ] ⊕ [ιλ(2,0) ] ⊕ [ιλ(1,2) ] ⊕ [ιλ(1,2) ], ([ιλ(1,2) ] ⊕ [ιλ(1,2) ])[ρ] = [ιλ(1,1) ] ⊕ [ιλ(1,1) ] ⊕ [ιλ(2,2) ] ⊕ [ιλ(2,2) ], (1)
(2)
(1)
(2)
and from ([ιλ(2,2) ] ⊕ [ιλ(2,2) ])[ρ] = [ιλ(2,1) ] ⊕ [ιλ(2,1) ] we may choose without loss of (j)
(j)
generality [ιλ(2,2) ][ρ] = [ιλ(2,1) ] for j = 1, 2. Then there are four different possibilities (j)
(j)
(j)
(j)
for [ιλ(1,1) ][ρ], three for [ιλ(2,0) ][ρ], six for [ιλ(0,2) ][ρ] and six for [ιλ(2,1) ][ρ], j = 1, 2.
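From these, the only nimrep graph which has eigenvalues $S_{\rho,\mu}/S_{0,\mu}$ with multiplicities given by the diagonal entry $Z_{\mu,\mu}$ of the modular invariant for $D^{(12)*}$ is that for the following sector products:
\[ [\iota\lambda^{(j)}_{(2,0)}][\rho] = 2[\iota\lambda^{(j)}_{(1,1)}], \]
\[ [\iota\lambda^{(j)}_{(1,1)}][\rho] = [\iota\lambda_{(1,0)}] \oplus 2[\iota\lambda^{(j)}_{(0,2)}] \oplus [\iota\lambda^{(j)}_{(2,1)}], \]
\[ [\iota\lambda^{(j)}_{(0,2)}][\rho] = [\iota\lambda_{(0,1)}] \oplus [\iota\lambda^{(j)}_{(2,0)}] \oplus [\iota\lambda^{(j)}_{(1,2)}], \]
\[ [\iota\lambda^{(j)}_{(2,1)}][\rho] = [\iota\lambda^{(j)}_{(2,0)}] \oplus [\iota\lambda^{(j)}_{(1,2)}], \]
\[ [\iota\lambda^{(j)}_{(1,2)}][\rho] = [\iota\lambda^{(j)}_{(1,1)}] \oplus [\iota\lambda^{(3-j)}_{(2,2)}], \]
for $j = 1, 2$. For any $\lambda \in {}_M\mathcal{X}_N$, let $[\lambda][\rho] = \bigoplus_{\mu \in {}_M\mathcal{X}_N} a_\mu [\mu]$, $a_\mu \in \mathbb{C}$. Then $\langle \mu \circ \bar\rho, \lambda \rangle = \langle \mu, \lambda \circ \rho \rangle = a_\mu$ for all $\mu \in {}_M\mathcal{X}_N$, so $[\mu][\bar\rho] \supset a_\mu [\lambda]$. Then if $G$ is the multiplication matrix for $[\rho]$, $G^T$ is the multiplication matrix for $[\bar\rho]$. This graph cannot be the nimrep graph since $GG^T \neq G^T G$, which means $[\iota\lambda][\rho][\bar\rho] \neq [\iota\lambda][\bar\rho][\rho]$. Then the only possibility for the nimrep graph for the $M$-$N$ system is the graph $E_4^{(12)}$, illustrated in Fig. 7, and the associated modular invariant is $Z_{E_{MS}^{(12)*}}$, assuming that $\theta$ is as expressed in (65).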
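The failure of normality that rules out case II(ii)(ii$'$) can be checked directly. The sketch below is an illustration only: it transcribes the sector products listed above for $j = 1, 2$ into a $15 \times 15$ matrix $G$, with the short labels `o`, `d`, `b`, `c`, `a`, `g`, `h`, `e`, `f` (all hypothetical names) standing for $\iota\lambda_{(0,0)}$, $\iota\lambda_{(1,0)}$, $\iota\lambda_{(0,1)}$, $\iota\lambda^{(j)}_{(2,0)}$, $\iota\lambda^{(j)}_{(1,1)}$, $\iota\lambda^{(j)}_{(0,2)}$, $\iota\lambda^{(j)}_{(2,1)}$, $\iota\lambda^{(j)}_{(1,2)}$, $\iota\lambda^{(j)}_{(2,2)}$. The first three rows are an assumption taken from the fusion rules with $\rho = \lambda_{(1,0)}$ together with (84) and (85), and the row for $\iota\lambda^{(j)}_{(2,2)}$ uses the choice $[\iota\lambda^{(j)}_{(2,2)}][\rho] = [\iota\lambda^{(j)}_{(2,1)}]$ made earlier in this subsection.

```python
def multiplication_matrix():
    """Matrix G of right multiplication by [rho] for case II(ii)(ii')."""
    rules = {
        "o": {"d": 1},
        "d": {"b": 1, "c1": 1, "c2": 1},
        "b": {"o": 1, "a1": 1, "a2": 1},
    }
    for j in (1, 2):
        rules[f"c{j}"] = {f"a{j}": 2}
        rules[f"a{j}"] = {"d": 1, f"g{j}": 2, f"h{j}": 1}
        rules[f"g{j}"] = {"b": 1, f"c{j}": 1, f"e{j}": 1}
        rules[f"h{j}"] = {f"c{j}": 1, f"e{j}": 1}
        rules[f"e{j}"] = {f"a{j}": 1, f"f{3 - j}": 1}
        rules[f"f{j}"] = {f"h{j}": 1}
    names = list(rules)
    index = {name: i for i, name in enumerate(names)}
    G = [[0] * len(names) for _ in names]
    for src, targets in rules.items():
        for dst, mult in targets.items():
            G[index[src]][index[dst]] = mult
    return names, G

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    cols = transpose(B)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]

names, G = multiplication_matrix()
GGt = matmul(G, transpose(G))
GtG = matmul(transpose(G), G)
i = names.index("c1")
print(GGt == GtG)              # False: G is not normal
print(GGt[i][i], GtG[i][i])    # 4 3
```

Already the diagonal entry for $\iota\lambda^{(1)}_{(2,0)}$ differs, 4 in $GG^T$ against 3 in $G^TG$, so under these assumptions $G$ cannot be a nimrep matrix.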
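5.4. $E_5^{(12)}$

For the graph $E_5^{(12)}$, we have $[\theta] = [\lambda_{(0,0)}] \oplus [\lambda_{(3,3)}] \oplus [\lambda_{(9,0)}] \oplus [\lambda_{(0,9)}]$. Then computing $\langle \iota\lambda, \iota\mu \rangle = \langle \lambda, \theta\mu \rangle$ for $\lambda$, $\mu$ on $A^{(12)}$, we find $\langle \iota\lambda, \iota\lambda \rangle = 1$ for $\lambda = \lambda_{(\mu_1,\mu_2)}$ such that $\mu_1 + \mu_2 \leq 2$. At tier 3, we have $\langle \iota\lambda, \iota\lambda \rangle = 2$ and $\langle \iota\lambda, \iota\mu \rangle = 0$ for $\lambda = \lambda_{(3,0)}, \lambda_{(2,1)}, \lambda_{(1,2)}, \lambda_{(0,3)}$, $\mu = \lambda_{(\mu_1,\mu_2)}$ such that $\mu_1 + \mu_2 \leq 2$. We also have $\langle \iota\lambda_{(3,0)}, \iota\lambda_{(0,3)} \rangle = 0$. Then $\iota\lambda_{(3,0)}$, $\iota\lambda_{(2,1)}$, $\iota\lambda_{(1,2)}$, $\iota\lambda_{(0,3)}$ decompose into irreducibles as
\[ [\iota\lambda_{(3,0)}] = [\iota\lambda^{(1)}_{(3,0)}] \oplus [\iota\lambda^{(2)}_{(3,0)}], \tag{95} \]
\[ [\iota\lambda_{(2,1)}] = [\iota\lambda^{(1)}_{(2,1)}] \oplus [\iota\lambda^{(2)}_{(2,1)}], \tag{96} \]
\[ [\iota\lambda_{(1,2)}] = [\iota\lambda^{(1)}_{(1,2)}] \oplus [\iota\lambda^{(2)}_{(1,2)}], \tag{97} \]
\[ [\iota\lambda_{(0,3)}] = [\iota\lambda^{(1)}_{(0,3)}] \oplus [\iota\lambda^{(2)}_{(0,3)}]. \tag{98} \]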
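The inner products $\langle \iota\lambda, \iota\mu \rangle = \langle \lambda, \theta\mu \rangle$ quoted above reduce to SU(3)$_9$ fusion coefficients, $\langle \lambda, \theta\mu \rangle = \sum_\nu \theta_\nu N_{\nu\mu}^{\lambda}$. The following is a minimal sketch of one way to reproduce them, not the authors' method: it assumes the Kac–Peterson form of the modular $S$-matrix (an alternating sum over the Weyl group, normalized numerically here) and the Verlinde formula, neither of which is spelled out in the text. All function names are hypothetical.

```python
import cmath
import math

def weights(k):
    """Dynkin labels (a, b) of the level-k SU(3) weights, a + b <= k."""
    return [(a, b) for a in range(k + 1) for b in range(k + 1 - a)]

def weyl_images(w):
    """The six Weyl-group images of w = (a, b), each with its sign."""
    s1 = lambda v: (-v[0], v[0] + v[1])
    s2 = lambda v: (v[0] + v[1], -v[1])
    return [(w, 1), (s1(w), -1), (s2(w), -1),
            (s1(s2(w)), 1), (s2(s1(w)), 1), (s1(s2(s1(w))), -1)]

def form(u, v):
    """Invariant bilinear form in the basis of fundamental weights."""
    return (2 * u[0] * v[0] + u[0] * v[1] + u[1] * v[0] + 2 * u[1] * v[1]) / 3

def s_matrix(k):
    """S-matrix up to a phase, which cancels in the Verlinde formula."""
    n = k + 3
    P = weights(k)
    S = {}
    for lam in P:
        images = weyl_images((lam[0] + 1, lam[1] + 1))
        for mu in P:
            m = (mu[0] + 1, mu[1] + 1)
            S[lam, mu] = sum(sgn * cmath.exp(-2j * math.pi * form(w, m) / n)
                             for w, sgn in images)
    # Rows of a unitary matrix have norm one; fix the modulus that way.
    norm = math.sqrt(sum(abs(S[(0, 0), mu]) ** 2 for mu in P))
    return P, {key: val / norm for key, val in S.items()}

def fusion(P, S, a, b, c):
    """Verlinde formula for the fusion coefficient N_{ab}^c."""
    total = sum(S[a, s] * S[b, s] * S[c, s].conjugate() / S[(0, 0), s] for s in P)
    return round(total.real)

def pairing(theta, lam, mu, P, S):
    """<iota lam, iota mu> = sum_nu theta_nu * N_{nu mu}^{lam}."""
    return sum(mult * fusion(P, S, nu, mu, lam) for nu, mult in theta.items())

P, S = s_matrix(9)
theta_E5 = {(0, 0): 1, (3, 3): 1, (9, 0): 1, (0, 9): 1}
print(pairing(theta_E5, (1, 1), (1, 1), P, S))   # 1
print(pairing(theta_E5, (3, 0), (3, 0), P, S))   # 2
print(pairing(theta_E5, (3, 0), (0, 3), P, S))   # 0
```

These values agree with the tier 2 and tier 3 statements above; the same `pairing` helper can be pointed at any of the $\theta$'s in (56)–(67) by changing the level and the dictionary.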
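At tier 4, we have $\langle \iota\lambda_{(4,0)}, \iota\lambda_{(4,0)} \rangle = 2$, $\langle \iota\lambda_{(4,0)}, \iota\lambda_{(2,1)} \rangle = 1$ and $\langle \iota\lambda_{(4,0)}, \iota\mu \rangle = 0$ for $\mu = \lambda_{(1,0)}, \lambda_{(0,2)}$. Then $[\iota\lambda_{(4,0)}] = [\iota\lambda^{(j)}_{(2,1)}] \oplus [\iota\lambda^{(1)}_{(4,0)}]$ for $j \in \{1, 2\}$. We have the freedom to choose $j = 1$ without loss of generality. Similarly for $\iota\lambda_{(0,4)}$. Then
\[ [\iota\lambda_{(4,0)}] = [\iota\lambda^{(1)}_{(2,1)}] \oplus [\iota\lambda^{(1)}_{(4,0)}], \tag{99} \]
\[ [\iota\lambda_{(0,4)}] = [\iota\lambda^{(1)}_{(1,2)}] \oplus [\iota\lambda^{(1)}_{(0,4)}]. \tag{100} \]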
Fig. 7. $M$-$N$ graph for the $E_4^{(12)}$ SU(3)-GHJ subfactor.
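From $\langle \iota\lambda_{(3,1)}, \iota\lambda_{(3,1)} \rangle = 3$, $\langle \iota\lambda_{(3,1)}, \iota\lambda_{(2,0)} \rangle = 1$, $\langle \iota\lambda_{(3,1)}, \iota\lambda_{(1,2)} \rangle = 1$ and $\langle \iota\lambda_{(3,1)}, \iota\lambda_{(0,4)} \rangle = 1$, we have two possibilities for the decomposition of $[\iota\lambda_{(3,1)}]$:
\[ [\iota\lambda_{(3,1)}] = \begin{cases} [\iota\lambda_{(2,0)}] \oplus [\iota\lambda^{(1)}_{(1,2)}] \oplus [\iota\lambda^{(1)}_{(3,1)}] & \text{case (i)}, \\ [\iota\lambda_{(2,0)}] \oplus [\iota\lambda^{(2)}_{(1,2)}] \oplus [\iota\lambda^{(1)}_{(0,4)}] & \text{case (ii)}. \end{cases} \tag{101} \]
Similarly,
\[ [\iota\lambda_{(1,3)}] = \begin{cases} [\iota\lambda_{(0,2)}] \oplus [\iota\lambda^{(1)}_{(2,1)}] \oplus [\iota\lambda^{(1)}_{(1,3)}] & \text{case (i}'\text{)}, \\ [\iota\lambda_{(0,2)}] \oplus [\iota\lambda^{(2)}_{(2,1)}] \oplus [\iota\lambda^{(1)}_{(4,0)}] & \text{case (ii}'\text{)}. \end{cases} \tag{102} \]
Since $\langle \iota\lambda_{(2,2)}, \iota\lambda_{(2,2)} \rangle = 3$, $\langle \iota\lambda_{(2,2)}, \iota\lambda_{(1,1)} \rangle = 1$, $\langle \iota\lambda_{(2,2)}, \iota\lambda_{(3,0)} \rangle = 1$ and $\langle \iota\lambda_{(2,2)}, \iota\lambda_{(0,3)} \rangle = 1$, we have $[\iota\lambda_{(2,2)}] = [\iota\lambda_{(1,1)}] \oplus [\iota\lambda^{(j_1)}_{(3,0)}] \oplus [\iota\lambda^{(j_2)}_{(0,3)}]$ for $j_1, j_2 \in \{1, 2\}$.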
We again have the freedom to choose, without loss of generality, j1 = j2 = 1, so that (1)
(1)
[ιλ(2,2) ] = [ιλ(1,1) ] ⊕ [ιλ(3,0) ] ⊕ [ιλ(0,3) ].
(103)
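At tier 5, $\langle \iota\lambda_{(5,0)}, \iota\lambda_{(5,0)} \rangle = \langle \iota\lambda_{(5,0)}, \iota\lambda_{(0,4)} \rangle = 2$ giving $[\iota\lambda_{(5,0)}] = [\iota\lambda_{(0,4)}]$, and similarly $[\iota\lambda_{(0,5)}] = [\iota\lambda_{(4,0)}]$. Since $\langle \iota\lambda_{(3,2)}, \iota\lambda_{(3,2)} \rangle = 4$, $\langle \iota\lambda_{(3,2)}, \iota\lambda_{(1,0)} \rangle = 1$, $\langle \iota\lambda_{(3,2)}, \iota\lambda_{(0,2)} \rangle = 1$ and $\langle \iota\lambda_{(3,2)}, \iota\lambda_{(2,1)} \rangle = 2$, we have $[\iota\lambda_{(3,2)}] = [\iota\lambda_{(1,0)}] \oplus [\iota\lambda_{(0,2)}] \oplus [\iota\lambda^{(1)}_{(2,1)}] \oplus [\iota\lambda^{(2)}_{(2,1)}]$, and similarly $[\iota\lambda_{(2,3)}] = [\iota\lambda_{(0,1)}] \oplus [\iota\lambda_{(2,0)}] \oplus [\iota\lambda^{(1)}_{(1,2)}] \oplus [\iota\lambda^{(2)}_{(1,2)}]$. We have $\langle \iota\lambda_{(4,1)}, \iota\lambda_{(4,1)} \rangle = \langle \iota\lambda_{(4,1)}, \iota\lambda_{(1,4)} \rangle = \langle \iota\lambda_{(1,4)}, \iota\lambda_{(1,4)} \rangle = 3$ so that $[\iota\lambda_{(4,1)}] = [\iota\lambda_{(1,4)}]$. Since $\langle \iota\lambda_{(4,1)}, \iota\lambda_{(1,1)} \rangle = 1$, $\langle \iota\lambda_{(4,1)}, \iota\lambda_{(2,2)} \rangle = 2$, $\langle \iota\lambda_{(4,1)}, \iota\lambda_{(3,0)} \rangle = 1$ and $\langle \iota\lambda_{(4,1)}, \iota\lambda_{(0,3)} \rangle = 1$, we have two possibilities for the decomposition of $[\iota\lambda_{(4,1)}]$:
\[ [\iota\lambda_{(4,1)}] = \begin{cases} [\iota\lambda_{(1,1)}] \oplus [\iota\lambda^{(1)}_{(3,0)}] \oplus [\iota\lambda^{(2)}_{(0,3)}] & \text{case I}, \\ [\iota\lambda_{(1,1)}] \oplus [\iota\lambda^{(2)}_{(3,0)}] \oplus [\iota\lambda^{(1)}_{(0,3)}] & \text{case II}. \end{cases} \tag{104} \]
Then we see that no new irreducible sectors appear at tier 5. We also have at tier 6, $\langle \iota\lambda_{(5,1)}, \iota\lambda_{(5,1)} \rangle = \langle \iota\lambda_{(5,1)}, \iota\lambda_{(1,3)} \rangle = 3$ giving $[\iota\lambda_{(5,1)}] = [\iota\lambda_{(1,3)}]$.
Case (i)(i′) gives 16 irreducible sectors, whilst case (ii)(ii′) gives 18 irreducibles, and therefore by looking at tr(Z) for the level 12 modular invariants Z we see that neither of these cases is possible. Case (ii)(i′) is the "conjugate" of case (i)(ii′), that is, we replace each irreducible sector $[\iota\lambda]$ in case (i)(ii′) by $[\iota\overline{\lambda}]$ in case (ii)(i′). We therefore only need to consider case (i)(ii′), which has seventeen irreducible sectors: $[\lambda_{(0,0)}]$, $[\lambda_{(1,0)}]$, $[\lambda_{(0,1)}]$, $[\lambda_{(2,0)}]$, $[\lambda_{(1,1)}]$, $[\lambda_{(0,2)}]$, $[\lambda^{(1)}_{(3,0)}]$, $[\lambda^{(2)}_{(3,0)}]$, $[\lambda^{(1)}_{(0,3)}]$, $[\lambda^{(2)}_{(0,3)}]$, $[\lambda^{(1)}_{(2,1)}]$, $[\lambda^{(2)}_{(2,1)}]$, $[\lambda^{(1)}_{(1,2)}]$, $[\lambda^{(2)}_{(1,2)}]$, $[\lambda^{(1)}_{(4,0)}]$, $[\lambda^{(1)}_{(0,4)}]$ and $[\lambda^{(1)}_{(3,1)}]$.
We now consider the sector products for these irreducible sectors, where we again denote by $[\rho]$ the irreducible N-N sector $[\lambda_{(1,0)}]$. The products $[\iota\lambda][\rho]$ are inherited from those for the N-N system for $\lambda = \lambda_{(\mu_1,\mu_2)}$ such that $\mu_1 + \mu_2 \leq 2$, and we use (95)–(98) to decompose into irreducibles where necessary, e.g.
\[ [\iota\lambda_{(0,2)}][\rho] = [\iota\lambda_{(0,1)}] \oplus [\iota\lambda_{(1,2)}] = [\iota\lambda_{(0,1)}] \oplus [\iota\lambda^{(1)}_{(1,2)}] \oplus [\iota\lambda^{(2)}_{(1,2)}]. \tag{105} \]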
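From $[\iota\lambda_{(2,1)}][\rho] = [\iota\lambda_{(2,0)}] \oplus [\iota\lambda_{(1,2)}] \oplus [\iota\lambda_{(3,1)}]$ and (96) we obtain
\[ ([\iota\lambda^{(1)}_{(2,1)}] \oplus [\iota\lambda^{(2)}_{(2,1)}])[\rho] = 2[\iota\lambda_{(2,0)}] \oplus 2[\iota\lambda^{(1)}_{(1,2)}] \oplus [\iota\lambda^{(2)}_{(1,2)}] \oplus [\iota\lambda^{(1)}_{(3,1)}]. \tag{106} \]
Similarly, by considering $[\iota\lambda_{(1,3)}][\rho]$ and $[\iota\lambda_{(4,0)}][\rho]$, and using (102) and (99) we have
\[ ([\iota\lambda^{(2)}_{(2,1)}] \oplus [\iota\lambda^{(1)}_{(4,0)}])[\rho] = [\iota\lambda_{(2,0)}] \oplus 2[\iota\lambda^{(1)}_{(1,2)}] \oplus [\iota\lambda^{(2)}_{(1,2)}] \oplus [\iota\lambda^{(1)}_{(0,4)}], \tag{107} \]
\[ ([\iota\lambda^{(1)}_{(2,1)}] \oplus [\iota\lambda^{(1)}_{(4,0)}])[\rho] = [\iota\lambda_{(2,0)}] \oplus 2[\iota\lambda^{(1)}_{(1,2)}] \oplus [\iota\lambda^{(1)}_{(3,1)}] \oplus [\iota\lambda^{(1)}_{(0,4)}]. \tag{108} \]
Then from (106)–(108) we find
\[ [\iota\lambda^{(1)}_{(2,1)}][\rho] = [\iota\lambda_{(2,0)}] \oplus [\iota\lambda^{(1)}_{(1,2)}] \oplus [\iota\lambda^{(1)}_{(3,1)}], \tag{109} \]
\[ [\iota\lambda^{(2)}_{(2,1)}][\rho] = [\iota\lambda_{(2,0)}] \oplus [\iota\lambda^{(1)}_{(1,2)}] \oplus [\iota\lambda^{(2)}_{(1,2)}], \tag{110} \]
\[ [\iota\lambda^{(1)}_{(4,0)}][\rho] = [\iota\lambda^{(1)}_{(1,2)}] \oplus [\iota\lambda^{(1)}_{(0,4)}]. \tag{111} \]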
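Now we focus on case I. From $[\iota\lambda_{(3,0)}][\rho] = [\iota\lambda_{(4,0)}] \oplus [\iota\lambda_{(2,1)}]$ and (95) we obtain
\[ ([\iota\lambda^{(1)}_{(3,0)}] \oplus [\iota\lambda^{(2)}_{(3,0)}])[\rho] = 2[\iota\lambda^{(1)}_{(2,1)}] \oplus [\iota\lambda^{(2)}_{(2,1)}] \oplus [\iota\lambda^{(1)}_{(4,0)}]. \tag{112} \]
Similarly by considering $[\iota\lambda_{(0,3)}][\rho]$ we have
\[ ([\iota\lambda^{(1)}_{(0,3)}] \oplus [\iota\lambda^{(2)}_{(0,3)}])[\rho] = 2[\iota\lambda_{(0,2)}] \oplus [\iota\lambda^{(2)}_{(2,1)}] \oplus [\iota\lambda^{(1)}_{(4,0)}]. \tag{113} \]
From $[\iota\lambda_{(2,2)}][\rho] = [\iota\lambda_{(2,1)}] \oplus [\iota\lambda_{(1,3)}] \oplus [\iota\lambda_{(3,2)}]$ and (103), we find
\[ ([\iota\lambda^{(1)}_{(3,0)}] \oplus [\iota\lambda^{(1)}_{(0,3)}])[\rho] = [\iota\lambda_{(0,2)}] \oplus [\iota\lambda^{(1)}_{(2,1)}] \oplus 2[\iota\lambda^{(2)}_{(2,1)}] \oplus [\iota\lambda^{(1)}_{(4,0)}], \tag{114} \]
whilst from $[\iota\lambda_{(4,1)}][\rho] = [\iota\lambda_{(4,0)}] \oplus [\iota\lambda_{(3,2)}] \oplus [\iota\lambda_{(5,1)}]$ and (104), we find
\[ ([\iota\lambda_{(1,1)}] \oplus [\iota\lambda^{(1)}_{(3,0)}] \oplus [\iota\lambda^{(2)}_{(0,3)}])[\rho] = [\iota\lambda_{(1,0)}] \oplus 2[\iota\lambda_{(0,2)}] \oplus 2[\iota\lambda^{(1)}_{(2,1)}] \oplus 2[\iota\lambda^{(2)}_{(2,1)}] \oplus 2[\iota\lambda^{(1)}_{(4,0)}]. \tag{115} \]
Then from (112)–(115), together with $[\iota\lambda_{(1,1)}][\rho] = [\iota\lambda_{(1,0)}] \oplus [\iota\lambda_{(0,2)}] \oplus [\iota\lambda_{(2,1)}]$ and (96), we obtain
\[ [\iota\lambda^{(1)}_{(3,0)}][\rho] = [\iota\lambda^{(1)}_{(2,1)}] \oplus [\iota\lambda^{(2)}_{(2,1)}] \oplus [\iota\lambda^{(1)}_{(4,0)}], \tag{116} \]
\[ [\iota\lambda^{(2)}_{(3,0)}][\rho] = [\iota\lambda^{(1)}_{(2,1)}], \tag{117} \]
\[ [\iota\lambda^{(1)}_{(0,3)}][\rho] = [\iota\lambda_{(0,2)}] \oplus [\iota\lambda^{(2)}_{(2,1)}], \tag{118} \]
\[ [\iota\lambda^{(2)}_{(0,3)}][\rho] = [\iota\lambda_{(0,2)}] \oplus [\iota\lambda^{(1)}_{(4,0)}]. \tag{119} \]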
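Next, by considering $[\iota\lambda][\rho]$ for $\lambda = \lambda_{(1,2)}, \lambda_{(3,1)}, \lambda_{(0,4)}$, and (97), (101) and (100) we obtain
\[ ([\iota\lambda^{(1)}_{(1,2)}] \oplus [\iota\lambda^{(2)}_{(1,2)}])[\rho] = 2[\iota\lambda_{(1,1)}] \oplus [\iota\lambda^{(1)}_{(3,0)}] \oplus 2[\iota\lambda^{(1)}_{(0,3)}] \oplus [\iota\lambda^{(2)}_{(0,3)}], \tag{120} \]
\[ ([\iota\lambda^{(1)}_{(1,2)}] \oplus [\iota\lambda^{(1)}_{(3,1)}])[\rho] = [\iota\lambda_{(1,1)}] \oplus 2[\iota\lambda^{(1)}_{(3,0)}] \oplus [\iota\lambda^{(1)}_{(0,3)}] \oplus [\iota\lambda^{(2)}_{(0,3)}], \tag{121} \]
\[ ([\iota\lambda^{(1)}_{(1,2)}] \oplus [\iota\lambda^{(1)}_{(0,4)}])[\rho] = [\iota\lambda_{(1,1)}] \oplus [\iota\lambda^{(1)}_{(3,0)}] \oplus [\iota\lambda^{(1)}_{(0,3)}] \oplus 2[\iota\lambda^{(2)}_{(0,3)}]. \tag{122} \]
We see from (120)–(122) that $[\iota\lambda^{(1)}_{(1,2)}][\rho] \subset [\iota\lambda_{(1,1)}] \oplus [\iota\lambda^{(1)}_{(3,0)}] \oplus [\iota\lambda^{(1)}_{(0,3)}] \oplus [\iota\lambda^{(2)}_{(0,3)}]$. From (105) and (109)–(111), we see that $[\iota\lambda^{(1)}_{(1,2)}][\overline{\rho}] = [\iota\lambda_{(0,2)}] \oplus [\iota\lambda^{(1)}_{(2,1)}] \oplus [\iota\lambda^{(2)}_{(2,1)}] \oplus [\iota\lambda^{(1)}_{(4,0)}]$, since $\langle \iota\lambda^{(1)}_{(1,2)} \circ \overline{\rho}, \iota\lambda \rangle = \langle \iota\lambda^{(1)}_{(1,2)}, \iota\lambda \circ \rho \rangle = 1$ for $\lambda = \lambda_{(0,2)}, \lambda^{(1)}_{(2,1)}, \lambda^{(2)}_{(2,1)}, \lambda^{(1)}_{(4,0)}$. Then $\langle \iota\lambda^{(1)}_{(1,2)} \circ \rho, \iota\lambda^{(1)}_{(1,2)} \circ \rho \rangle = \langle \iota\lambda^{(1)}_{(1,2)} \circ \overline{\rho}, \iota\lambda^{(1)}_{(1,2)} \circ \overline{\rho} \rangle = 4$ implies that we must have $[\iota\lambda^{(1)}_{(1,2)}][\rho] = [\iota\lambda_{(1,1)}] \oplus [\iota\lambda^{(1)}_{(3,0)}] \oplus [\iota\lambda^{(1)}_{(0,3)}] \oplus [\iota\lambda^{(2)}_{(0,3)}]$. Then from (120)–(122) we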
obtain (2)
(1)
[ιλ(1,2) ][ρ] = [ιλ(1,1) ] ⊕ [ιλ(0,3) ], (2)
(2)
(1)
(1)
[ιλ(3,0) ][ρ] = [ιλ(0,3) ], [ιλ(0,3) ][ρ] = [ιλ(3,0) ]. It is easy to check that the nimrep graph for multiplication by [ρ] obtained in case (12) I is just the graph E5 .
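For case II, we again have (120), and by considering $[\iota\lambda_{(3,1)}][\rho] = [\iota\lambda_{(3,0)}] \oplus [\iota\lambda_{(2,2)}] \oplus [\iota\lambda_{(4,1)}]$ and (95), (103) and (104) we obtain
\[ ([\iota\lambda^{(1)}_{(1,2)}] \oplus [\iota\lambda^{(1)}_{(3,1)}])[\rho] = [\iota\lambda_{(1,1)}] \oplus [\iota\lambda^{(1)}_{(3,0)}] \oplus [\iota\lambda^{(2)}_{(3,0)}] \oplus 2[\iota\lambda^{(1)}_{(0,3)}], \tag{123} \]
and similarly from $[\iota\lambda_{(0,4)}][\rho]$, (100), (98) and (104) we obtain
\[ ([\iota\lambda^{(1)}_{(1,2)}] \oplus [\iota\lambda^{(1)}_{(0,4)}])[\rho] = [\iota\lambda_{(1,1)}] \oplus [\iota\lambda^{(2)}_{(3,0)}] \oplus 2[\iota\lambda^{(1)}_{(0,3)}] \oplus [\iota\lambda^{(2)}_{(0,3)}]. \tag{124} \]
Then from (120), (123) and (124) we see that $[\iota\lambda^{(1)}_{(1,2)}][\rho] \subset [\iota\lambda_{(1,1)}] \oplus 2[\iota\lambda^{(1)}_{(0,3)}]$. Since $\langle \iota\lambda^{(1)}_{(1,2)} \circ \rho, \iota\lambda^{(1)}_{(1,2)} \circ \rho \rangle = \langle \iota\lambda^{(1)}_{(1,2)} \circ \overline{\rho}, \iota\lambda^{(1)}_{(1,2)} \circ \overline{\rho} \rangle = 4$, we must have $[\iota\lambda^{(1)}_{(1,2)}][\rho] = 2[\iota\lambda^{(1)}_{(0,3)}]$. Then from (120) we obtain $[\iota\lambda^{(2)}_{(1,2)}][\rho] = 2[\iota\lambda_{(1,1)}] \oplus [\iota\lambda^{(1)}_{(3,0)}] \oplus [\iota\lambda^{(2)}_{(0,3)}]$, and we have $\langle \iota\lambda^{(2)}_{(1,2)} \circ \overline{\rho}, \iota\lambda^{(2)}_{(1,2)} \circ \overline{\rho} \rangle = \langle \iota\lambda^{(2)}_{(1,2)} \circ \rho, \iota\lambda^{(2)}_{(1,2)} \circ \rho \rangle = 6$. From (105) and (109)–(111), we see that $[\iota\lambda^{(2)}_{(1,2)}][\overline{\rho}] = [\iota\lambda_{(0,2)}] \oplus [\iota\lambda^{(2)}_{(2,1)}]$, giving $\langle \iota\lambda^{(2)}_{(1,2)} \circ \overline{\rho}, \iota\lambda^{(2)}_{(1,2)} \circ \overline{\rho} \rangle = 2 \neq 6$, which is a contradiction. Then we reject case II.
Then the only possibility for the graph of the M-N system is $E_5^{(12)}$, illustrated in Fig. 8, and the modular invariant for $\theta$ is $Z_{E_{MS}^{(12)}}$.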
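Fig. 8. M-N graph for the $E_5^{(12)}$ SU(3)-GHJ subfactor.
5.5. $A^{(n)*}$
We compute the nimrep graph for the case n = 12. It appears that the results will carry over to all other n, however we have not been able to show this in general. For the graph $A^{(12)*}$, we have $[\theta] = \bigoplus_\mu [\lambda_\mu]$, where the direct sum is over all representations $\mu$ on $A^{(12)}$. Then computing $\langle \iota\lambda, \iota\mu \rangle = \langle \lambda, \theta\mu \rangle$ for $\lambda, \mu$ on $A^{(12)}$, we find that $\langle \iota\lambda_{(\mu_2,\mu_1)}, \iota\lambda_{(\mu_2,\mu_1)} \rangle = \langle \iota\lambda_{(\mu_2,\mu_1)}, \iota\lambda_{(\mu_1,\mu_2)} \rangle$ so we have $[\iota\lambda_{(\mu_2,\mu_1)}] = [\iota\lambda_{(\mu_1,\mu_2)}]$ for all $(\mu_1, \mu_2)$ on $A^{(12)}$. At tier 0, we have $\langle \iota\lambda_{(0,0)}, \iota\lambda_{(0,0)} \rangle = 1$. At tier 1, $\langle \iota\lambda_{(1,0)}, \iota\lambda_{(1,0)} \rangle = 2$ and $\langle \iota\lambda_{(1,0)}, \iota\lambda_{(0,0)} \rangle = 1$, giving
\[ [\iota\lambda_{(1,0)}] = [\iota\lambda_{(0,0)}] \oplus [\iota\lambda^{(1)}_{(1,0)}]. \tag{125} \]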
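At tier 2, we have $\langle \iota\lambda_{(2,0)}, \iota\lambda_{(2,0)} \rangle = 3$ and $\langle \iota\lambda_{(2,0)}, \iota\lambda_{(1,0)} \rangle = 2$, so $[\iota\lambda_{(2,0)}] = [\iota\lambda_{(0,0)}] \oplus [\iota\lambda^{(1)}_{(1,0)}] \oplus [\iota\lambda^{(1)}_{(2,0)}]$. We also have $\langle \iota\lambda_{(1,1)}, \iota\lambda_{(1,1)} \rangle = 6$, $\langle \iota\lambda_{(1,1)}, \iota\lambda_{(0,0)} \rangle = 1$, $\langle \iota\lambda_{(1,1)}, \iota\lambda_{(1,0)} \rangle = 3$ and $\langle \iota\lambda_{(1,1)}, \iota\lambda_{(2,0)} \rangle = 4$, giving $[\iota\lambda_{(1,1)}] = [\iota\lambda_{(0,0)}] \oplus 2[\iota\lambda^{(1)}_{(1,0)}] \oplus [\iota\lambda^{(1)}_{(2,0)}]$.
At tier 3, we have $\langle \iota\lambda_{(3,0)}, \iota\lambda_{(3,0)} \rangle = 4$ and $\langle \iota\lambda_{(3,0)}, \iota\lambda_{(2,0)} \rangle = 3$, so $[\iota\lambda_{(3,0)}] = [\iota\lambda_{(0,0)}] \oplus [\iota\lambda^{(1)}_{(1,0)}] \oplus [\iota\lambda^{(1)}_{(2,0)}] \oplus [\iota\lambda^{(1)}_{(3,0)}]$. We also have $\langle \iota\lambda_{(2,1)}, \iota\lambda_{(2,1)} \rangle = 10$, $\langle \iota\lambda_{(2,1)}, \iota\lambda_{(0,0)} \rangle = 1$, $\langle \iota\lambda_{(2,1)}, \iota\lambda_{(1,0)} \rangle = 3$, $\langle \iota\lambda_{(2,1)}, \iota\lambda_{(2,0)} \rangle = 5$ and $\langle \iota\lambda_{(2,1)}, \iota\lambda_{(3,0)} \rangle = 6$, giving $[\iota\lambda_{(2,1)}] = [\iota\lambda_{(0,0)}] \oplus 2[\iota\lambda^{(1)}_{(1,0)}] \oplus 2[\iota\lambda^{(1)}_{(2,0)}] \oplus [\iota\lambda^{(1)}_{(3,0)}]$. Similarly, at tier 4, we find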
(1)
(1)
(1)
[ιλ(4,0) ] = [ιλ(0,0) ] ⊕ [ιλ(1,0) ] ⊕ [ιλ(2,0) ] ⊕ [ιλ(3,0) ] ⊕ [ιλ(4,0) ], (1)
(1)
(1)
(1)
(1)
(1)
(1)
(1)
[ιλ(3,1) ] = [ιλ(0,0) ] ⊕ 2[ιλ(1,0) ] ⊕ 2[ιλ(2,0) ] ⊕ 2[ιλ(3,0) ] ⊕ [ιλ(4,0) ], [ιλ(2,2) ] = [ιλ(0,0) ] ⊕ 2[ιλ(1,0) ] ⊕ 3[ιλ(2,0) ] ⊕ 2[ιλ(3,0) ] ⊕ [ιλ(4,0) ], and at tier 5: [ιλ(5,0) ] = [ιλ(4,0) ], (1)
(1)
(1)
(1)
(1)
(1)
(1)
(1)
(1)
(1)
[ιλ(4,1) ] = [ιλ(0,0) ] ⊕ 2[ιλ(1,0) ] ⊕ 2[ιλ(2,0) ] ⊕ 2[ιλ(3,0) ] ⊕ 2[ιλ(4,0) ] ⊕ [ιλ(5,0) ], [ιλ(3,2) ] = [ιλ(0,0) ] ⊕ 2[ιλ(1,0) ] ⊕ 3[ιλ(2,0) ] ⊕ 3[ιλ(3,0) ] ⊕ 2[ιλ(4,0) ] ⊕ [ιλ(5,0) ]. (1)
(1)
(1)
(1)
Then we have six irreducible sectors [ιλ(0,0) ], [ιλ(1,0) ], [ιλ(2,0) ], [ιλ(3,0) ], [ιλ(4,0) ] and (1)
[ιλ(5,0) ]. We now compute the sector products. We have [ιλ(0,0) ][ρ] = [ιλ(1,0) ] = [ιλ(0,0) ]⊕ (1)
(1)
(1)
[ιλ(1,0) ]. From [ιλ(1,0) ][ρ] = [ιλ(2,0) ] ⊕ [ιλ(0,1) ] = 2[ιλ(0,0) ] ⊕ 2[ιλ(1,0) ] ⊕ [ιλ(2,0) ] and (1)
(1)
(1)
(1)
(125) we find [ιλ(1,0) ][ρ] = 2[ιλ(0,0) ] ⊕ 2[ιλ(1,0) ] ⊕ [ιλ(2,0) ] ([ιλ(0,0) ] ⊕ [ιλ(1,0) ]) = (1)
(1)
[ιλ(0,0) ] ⊕ [ιλ(1,0) ] ⊕ [ιλ(2,0) ]. Similarly, we find (1)
(1)
(1)
(1)
(1)
(1)
(1)
(1)
(1)
(1)
(1)
[ιλ(2,0) ][ρ] = [ιλ(1,0) ] ⊕ [ιλ(2,0) ] ⊕ [ιλ(3,0) ], [ιλ(3,0) ][ρ] = [ιλ(2,0) ] ⊕ [ιλ(3,0) ] ⊕ [ιλ(4,0) ], [ιλ(4,0) ][ρ] = [ιλ(3,0) ] ⊕ [ιλ(4,0) ], and the nimrep graph is A(12)∗ . The labeled nimrep graph is illustrated in Fig. 9. The associated modular invariant is ZA(12)∗ . In the case above, since n = 12 is even, we have [ιλ(5,0) ] = [ιλ(4,0) ] and so (1)
[ιλ(4,0) ][ρ] = [ιλ(5,0) ] ⊕ [ιλ(3,1) ] = [ιλ(4,0) ] ⊕ [ιλ(3,1) ]. This leads to [ιλ(4,0) ][ρ] ⊃
$[\iota\lambda^{(1)}_{(4,0)}]$, and there is a loop from $[\iota\lambda^{(1)}_{(4,0)}]$ to itself in the nimrep graph. However, when n is odd, e.g. for n = 11, we have instead $[\iota\lambda_{(5,0)}] = [\iota\lambda_{(3,0)}]$ so $[\iota\lambda_{(4,0)}][\rho] = [\iota\lambda_{(5,0)}] \oplus [\iota\lambda_{(3,1)}] = [\iota\lambda_{(3,0)}] \oplus [\iota\lambda_{(3,1)}]$. This causes $[\iota\lambda^{(1)}_{(4,0)}][\rho] \not\supset [\iota\lambda^{(1)}_{(4,0)}]$, hence there is no loop from $[\iota\lambda^{(1)}_{(4,0)}]$ to itself in the nimrep graph for the n = 11 case.
Fig. 9. M-N graph for the $A^{(12)*}$ SU(3)-GHJ subfactor.
5.6. $D^{(n)*}$
We compute the nimrep graph for the case n = 12. For the graph $D^{(12)*}$, we have $[\theta] = \bigoplus_\mu [\lambda_\mu]$, where the direct sum is over all representations $\mu$ of color 0 on $A^{(12)}$. At tier 0 we have $\langle \iota\lambda_{(0,0)}, \iota\lambda_{(0,0)} \rangle = 1$. At tier 1, $\langle \iota\lambda_{(1,0)}, \iota\lambda_{(1,0)} \rangle = 2$ and $\langle \iota\lambda_{(1,0)}, \iota\lambda_{(0,0)} \rangle = 0$, and similarly for $\iota\lambda_{(0,1)}$, giving
\[ [\iota\lambda_{(1,0)}] = [\iota\lambda^{(1)}_{(1,0)}] \oplus [\iota\lambda^{(2)}_{(1,0)}], \tag{126} \]
\[ [\iota\lambda_{(0,1)}] = [\iota\lambda^{(1)}_{(0,1)}] \oplus [\iota\lambda^{(2)}_{(0,1)}]. \tag{127} \]
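At tier 2, we have $\langle \iota\lambda_{(2,0)}, \iota\lambda_{(2,0)} \rangle = 3$ and $\langle \iota\lambda_{(2,0)}, \iota\lambda_{(0,1)} \rangle = 2$, and similarly for $\iota\lambda_{(0,2)}$, so we have
\[ [\iota\lambda_{(2,0)}] = [\iota\lambda^{(1)}_{(0,1)}] \oplus [\iota\lambda^{(2)}_{(0,1)}] \oplus [\iota\lambda^{(1)}_{(2,0)}], \tag{128} \]
\[ [\iota\lambda_{(0,2)}] = [\iota\lambda^{(1)}_{(1,0)}] \oplus [\iota\lambda^{(2)}_{(1,0)}] \oplus [\iota\lambda^{(1)}_{(0,2)}]. \tag{129} \]
For $\iota\lambda_{(1,1)}$ we have $\langle \iota\lambda_{(1,1)}, \iota\lambda_{(1,1)} \rangle = 6$ and $\langle \iota\lambda_{(1,1)}, \iota\lambda_{(0,0)} \rangle = 1$, so there are two possibilities for the decomposition of $[\iota\lambda_{(1,1)}]$ as irreducible sectors:
\[ [\iota\lambda_{(1,1)}] = \begin{cases} [\iota\lambda_{(0,0)}] \oplus 2[\iota\lambda^{(1)}_{(1,1)}] \oplus [\iota\lambda^{(2)}_{(1,1)}] & \text{case I}, \\ [\iota\lambda_{(0,0)}] \oplus [\iota\lambda^{(1)}_{(1,1)}] \oplus [\iota\lambda^{(2)}_{(1,1)}] \oplus [\iota\lambda^{(3)}_{(1,1)}] \oplus [\iota\lambda^{(4)}_{(1,1)}] \oplus [\iota\lambda^{(5)}_{(1,1)}] & \text{case II}. \end{cases} \tag{130} \]
At tier 3, we have $\langle \iota\lambda_{(3,0)}, \iota\lambda_{(3,0)} \rangle = 4$, $\langle \iota\lambda_{(3,0)}, \iota\lambda_{(1,1)} \rangle = 4$ and $\langle \iota\lambda_{(3,0)}, \iota\lambda_{(0,0)} \rangle = 1$, giving
\[ [\iota\lambda_{(3,0)}] = \begin{cases} [\iota\lambda_{(0,0)}] \oplus [\iota\lambda^{(1)}_{(1,1)}] \oplus [\iota\lambda^{(2)}_{(1,1)}] \oplus [\iota\lambda^{(1)}_{(3,0)}] & \text{for case I}, \\ [\iota\lambda_{(0,0)}] \oplus [\iota\lambda^{(1)}_{(1,1)}] \oplus [\iota\lambda^{(2)}_{(1,1)}] \oplus [\iota\lambda^{(3)}_{(1,1)}] & \text{for case II}. \end{cases} \tag{131} \]
Then we see that for case II $[\iota\lambda_{(1,1)}] \supset [\iota\lambda_{(3,0)}]$. However, this contradicts the following values of the inner products at tier 6, $\langle \iota\lambda_{(3,3)}, \iota\lambda_{(1,1)} \rangle = 8$ and $\langle \iota\lambda_{(3,3)}, \iota\lambda_{(3,0)} \rangle = 10$. So we reject case II.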
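The case distinctions above follow a counting pattern: if a sector decomposes as $\bigoplus_i m_i [\beta_i]$ with distinct irreducibles $\beta_i$, its self inner product is $\sum_i m_i^2$. The sketch below enumerates the multiplicity patterns compatible with a given norm. The function name `multiplicity_patterns` is a hypothetical helper introduced for illustration; it only lists the arithmetic possibilities and does not use the other inner products that the text relies on to decide between them.

```python
from math import isqrt


def multiplicity_patterns(norm, largest=None):
    """List non-increasing tuples (m_1, m_2, ...) of positive integers
    whose squares sum to `norm`."""
    if norm == 0:
        return [()]
    if largest is None:
        largest = isqrt(norm)
    patterns = []
    # Try the largest multiplicity first, then recurse on the remainder.
    for m in range(min(largest, isqrt(norm)), 0, -1):
        for rest in multiplicity_patterns(norm - m * m, m):
            patterns.append((m,) + rest)
    return patterns


# Norm 6, as for <iota lambda_(1,1), iota lambda_(1,1)> = 6:
# either multiplicities (2, 1, 1) or six distinct irreducibles.
print(multiplicity_patterns(6))
# Norm 2 forces two distinct irreducibles.
print(multiplicity_patterns(2))
```

For norm 6 the two patterns correspond to case I and case II of (130), where the multiplicity-one summand $[\iota\lambda_{(0,0)}]$ is already fixed by $\langle \iota\lambda_{(1,1)}, \iota\lambda_{(0,0)} \rangle = 1$.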
Continuing at tier 3 we have ιλ(0,3) , ιλ(0,3) = ιλ(0,3) , ιλ(3,0) = 4, so that [ιλ(0,3) ] = [ιλ(3,0) ]. From ιλ(2,1) , ιλ(2,1) = 10, ιλ(2,1) , ιλ(1,0) = 3 and ιλ(2,1) , ιλ(0,2) = 5, and similarly for ιλ(1,2) , we have (1)
(2)
(1)
(1)
(132)
(1)
(2)
(1)
(1)
(133)
[ιλ(2,1) ] = 2[ιλ(1,0) ] ⊕ [ιλ(1,0) ] ⊕ 2[ιλ(0,2) ] ⊕ [ιλ(2,1) ], [ιλ(1,2) ] = 2[ιλ(0,1) ] ⊕ [ιλ(0,1) ] ⊕ 2[ιλ(2,0) ] ⊕ [ιλ(1,2) ].
Next, at tier 4, we have ιλ(4,0) , ιλ(4,0) = 5, ιλ(4,0) , ιλ(1,0) = 2, ιλ(4,0) , ιλ(0,2) = 3 and ιλ(4,0) , ιλ(2,1) = 6, so there are two possibilities for the decomposition of [ιλ(4,0) ], and similarly for [ιλ(0,4) ]:
[ιλ(4,0) ] =
[ιλ(0,4) ] =
(1) (1) (2) (1) [ιλ(1,0) ] ⊕ [ιλ(2) (1,0) ] ⊕ [ιλ(0,2) ] ⊕ [ιλ(0,2) ] ⊕ [ιλ(4,0) ] case (i), 2[ιλ(1) ] ⊕ [ιλ(1) ] (1,0) (0,2)
(1) (1) (2) (1) [ιλ(0,1) ] ⊕ [ιλ(2) (0,1) ] ⊕ [ιλ(2,0) ] ⊕ [ιλ(2,0) ] ⊕ [ιλ(0,4) ] case (i ), 2[ιλ(1) ] ⊕ [ιλ(1) ] (0,1) (2,0)
case (ii ).
Since ιλ(3,1) , ιλ(3,1) = 14, ιλ(3,1) , ιλ(0,1) = ιλ(3,1) , ιλ(1,2) = 11 and ιλ(3,1) , ιλ(0,4) = 8, then
[ιλ(3,1) ] =
(134)
case (ii),
3, ιλ(3,1) , ιλ(2,0)
(135)
=
5,
(2) (1) (1) (1) 2[ιλ(1) (0,1) ] ⊕ [ιλ(0,1) ] ⊕ 2[ιλ(2,0) ] ⊕ 2[ιλ(1,2) ] ⊕ [ιλ(0,4) ] for case (i ), 3[ιλ(1) ] ⊕ 2[ιλ(1) ] ⊕ [ιλ(1) ] (0,1) (2,0) (1,2)
for case (ii ). (136)
Similarly, for [ιλ(1,3) ],
[ιλ(1,3) ] =
(2) (1) (1) (1) 2[ιλ(1) (1,0) ] ⊕ [ιλ(1,0) ] ⊕ 2[ιλ(0,2) ] ⊕ 2[ιλ(2,1) ] ⊕ [ιλ(4,0) ] for case (i), 3[ιλ(1) ] ⊕ 2[ιλ(1) ] ⊕ [ιλ(1) ] (1,0) (0,2) (2,1)
for case (ii). (137)
From ιλ(2,2) , ιλ(2,2) = 19, ιλ(2,2) , ιλ(0,0) = 1, ιλ(2,2) , ιλ(1,1) = 8 and ιλ(2,2) , ιλ(3,0) = 8, we must have (1)
(2)
(1)
(1)
[ιλ(1,3) ] = [ιλ(0,0) ] ⊕ 2[ιλ(1,1) ] ⊕ 3[ιλ(1,1) ] ⊕ 2[ιλ(3,0) ] ⊕ [ιλ(2,2) ].
(138)
At tier 5, we have ιλ(5,0) , ιλ(5,0) = ιλ(5,0) , ιλ(0,4) = 5, giving [ιλ(5,0) ] = [ιλ(0,4) ], and similarly [ιλ(0,5) ] = [ιλ(4,0) ]. From ιλ(3,2) , ιλ(3,2) = 27, ιλ(3,2) , ιλ(1,0) = 3,
ιλ(3,2) , ιλ(0,2) = 6, ιλ(3,2) , ιλ(2,1) = 14 and ιλ(3,2) , ιλ(1,3) = 19 we must have (2) (1) (1) (1) 2[ιλ(1) (1,0) ] ⊕ [ιλ(1,0) ] ⊕ 3[ιλ(0,2) ] ⊕ 3[ιλ(2,1) ] ⊕ 2[ιλ(4,0) ] for case (i), [ιλ(3,2) ] = 3[ιλ(1) ] ⊕ 3[ιλ(1) ] ⊕ 2[ιλ(1) ] ⊕ 2[ιλ(1) ] ⊕ [ιλ(1) ] for case (ii). (1,0) (0,2) (2,1) (4,0) (3,2) (139) However, case (ii) does not satisfy ιλ(3,2) , ιλ(4,0) = 11, and hence we discard it. Similarly we discard case (ii ) since no possible decomposition of [ιλ(2,3) ] exists for that case. Then we are left with only the one case (i)(i ). We have (1)
(2)
(1)
(1)
(1)
[ιλ(2,3) ] = 2[ιλ(0,1) ] ⊕ [ιλ(0,1) ] ⊕ 3[ιλ(2,0) ] ⊕ 3[ιλ(1,2) ] ⊕ 2[ιλ(0,4) ].
(140)
From ιλ(4,1) , ιλ(4,1) = 17, ιλ(4,1) , ιλ(0,0) = 1, ιλ(4,1) , ιλ(1,1) = 7, ιλ(4,1) , ιλ(3,0) = 7 and ιλ(4,1) , ιλ(2,2) = 17, we have (1)
(2)
(1)
(1)
[ιλ(4,1) ] = [ιλ(0,0) ] ⊕ 2[ιλ(1,1) ] ⊕ 2[ιλ(1,1) ] ⊕ 2[ιλ(3,0) ] ⊕ 2[ιλ(2,2) ],
(141)
and since ιλ(1,4) , ιλ(1,4) = ιλ(1,4) , ιλ(4,1) = 17, [ιλ(1,4) ] = [ιλ(4,1) ]. We see that no new irreducible sectors appear at tier 5, so the M -N system contains 15 irreducible sectors. We also have the following decompositions at tier 6: [ιλ(6,0) ] = [ιλ(0,6) ] = [ιλ(3,0) ],
(142)
(1)
(2)
(1)
(1)
(1)
(2)
(1)
(1)
(1)
(2)
(1)
(1)
(1)
[ιλ(5,1) ] = 2[ιλ(1,0) ] ⊕ [ιλ(1,0) ] ⊕ 2[ιλ(0,2) ] ⊕ 2[ιλ(2,1) ] ⊕ [ιλ(4,0) ], (1)
[ιλ(4,2) ] = 2[ιλ(0,1) ] ⊕ [ιλ(0,1) ] ⊕ 3[ιλ(2,0) ] ⊕ 3[ιλ(1,2) ] ⊕ 2[ιλ(0,4) ], (1)
[ιλ(1,5) ] = 2[ιλ(0,1) ] ⊕ [ιλ(0,1) ] ⊕ 2[ιλ(2,0) ] ⊕ 2[ιλ(1,2) ] ⊕ [ιλ(0,4) ].
(143) (144) (145)
We now find the sector products of the irreducible sectors with the N -N sector (1) (2) [ρ] = [λ(1,0) ]. We have [ιλ(0,0) ][ρ] = [ιλ(1,0) ] = [ιλ(1,0) ] ⊕ [ιλ(1,0) ]. From [ιλ(1,1) ][ρ] = [ιλ(1,0) ] ⊕ [ιλ(0,2) ] ⊕ [ιλ(2,1) ] and (130) we have (1)
(2)
(1)
(2)
(1)
(1)
(1)
(2)
(1)
(1)
(2[ιλ(1,1) ] ⊕ [ιλ(1,1) ])[ρ] = 4[ιλ(1,0) ] ⊕ 3[ιλ(1,0) ] ⊕ 3[ιλ(0,2) ] ⊕ [ιλ(2,1) ] ([ιλ(0,0) ][ρ]) = 3[ιλ(1,0) ] ⊕ 2[ιλ(1,0) ] ⊕ 3[ιλ(0,2) ] ⊕ [ιλ(2,1) ].
(146)
Similarly, by considering [ιλ(3,0) ][ρ], [ιλ(2,2) ][ρ] and [ιλ(4,1) ][ρ], and using (131), (138) and (141), we have the following: (1)
(2)
(1)
(1)
(2)
(1)
([ιλ(1,1) ] ⊕ [ιλ(1,1) ] ⊕ [ιλ(3,0) ])[ρ] = 2[ιλ(1,0) ] ⊕ [ιλ(1,0) ] ⊕ 3[ιλ(0,2) ] (1)
(1)
⊕ 2[ιλ(2,1) ] ⊕ [ιλ(4,0) ], (1)
(2)
(1)
(1)
(1)
(2)
(147) (1)
(2[ιλ(1,1) ] ⊕ 3[ιλ(1,1) ] ⊕ 2[ιλ(3,0) ] ⊕ [ιλ(2,2) ])[ρ] = 5[ιλ(1,0) ] ⊕ 2[ιλ(1,0) ] ⊕ 7[ιλ(0,2) ] (1)
(1)
⊕ 6[ιλ(2,1) ] ⊕ 3[ιλ(4,0) ], (1)
(2)
(1)
(1)
(1)
(2)
(148) (1)
(2[ιλ(1,1) ] ⊕ 2[ιλ(1,1) ] ⊕ 2[ιλ(3,0) ] ⊕ 2[ιλ(2,2) ])[ρ] = 4[ιλ(1,0) ] ⊕ 2[ιλ(1,0) ] ⊕ 6[ιλ(0,2) ] (1)
(1)
⊕ 6[ιλ(2,1) ] ⊕ 4[ιλ(4,0) ].
(149)
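Then from (146)–(149) we obtain the following sector products:
\[ [\iota\lambda^{(1)}_{(1,1)}][\rho] = [\iota\lambda^{(1)}_{(1,0)}] \oplus [\iota\lambda^{(2)}_{(1,0)}] \oplus [\iota\lambda^{(1)}_{(0,2)}], \qquad [\iota\lambda^{(2)}_{(1,1)}][\rho] = [\iota\lambda^{(1)}_{(1,0)}] \oplus [\iota\lambda^{(1)}_{(0,2)}] \oplus [\iota\lambda^{(1)}_{(2,1)}], \]
\[ [\iota\lambda^{(1)}_{(3,0)}][\rho] = [\iota\lambda^{(1)}_{(0,2)}] \oplus [\iota\lambda^{(1)}_{(2,1)}] \oplus [\iota\lambda^{(1)}_{(4,0)}], \qquad [\iota\lambda^{(1)}_{(2,2)}][\rho] = [\iota\lambda^{(1)}_{(2,1)}] \oplus [\iota\lambda^{(1)}_{(4,0)}]. \]
Next, from $[\iota\lambda_{(1,0)}][\rho] = [\iota\lambda_{(0,1)}] \oplus [\iota\lambda_{(2,0)}]$ and (126) we have
\[ ([\iota\lambda^{(1)}_{(1,0)}] \oplus [\iota\lambda^{(2)}_{(1,0)}])[\rho] = 2[\iota\lambda^{(1)}_{(0,1)}] \oplus 2[\iota\lambda^{(2)}_{(0,1)}] \oplus [\iota\lambda^{(1)}_{(2,0)}]. \tag{150} \]
By considering $[\iota\lambda_{(0,2)}][\rho] = [\iota\lambda_{(0,1)}] \oplus [\iota\lambda_{(1,2)}]$ and (129) we obtain $([\iota\lambda^{(1)}_{(1,0)}] \oplus [\iota\lambda^{(2)}_{(1,0)}] \oplus [\iota\lambda^{(1)}_{(0,2)}])[\rho] = 3[\iota\lambda^{(1)}_{(0,1)}] \oplus 2[\iota\lambda^{(2)}_{(0,1)}] \oplus 2[\iota\lambda^{(1)}_{(2,0)}] \oplus [\iota\lambda^{(1)}_{(1,2)}]$. Then from (150) we see that
\[ [\iota\lambda^{(1)}_{(0,2)}][\rho] = [\iota\lambda^{(1)}_{(0,1)}] \oplus [\iota\lambda^{(1)}_{(2,0)}] \oplus [\iota\lambda^{(1)}_{(1,2)}]. \tag{151} \]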
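The step from (150) to (151) is a subtraction of multiplicities: the product of a reducible sector with $[\rho]$ is known, the contribution of some of its summands is known, and the remainder is the product of the last summand. A minimal sketch of this bookkeeping with `collections.Counter` follows. The string labels such as `"(0,1)^1"` are an ad hoc encoding chosen for this sketch, standing for $[\iota\lambda^{(1)}_{(0,1)}]$ and so on.

```python
from collections import Counter

# Right-hand side of (150): product of [il^(1)_(1,0)] + [il^(2)_(1,0)] with [rho].
known = Counter({"(0,1)^1": 2, "(0,1)^2": 2, "(2,0)^1": 1})

# Product of [il^(1)_(1,0)] + [il^(2)_(1,0)] + [il^(1)_(0,2)] with [rho],
# as stated in the text just before (151).
total = Counter({"(0,1)^1": 3, "(0,1)^2": 2, "(2,0)^1": 2, "(1,2)^1": 1})

# The subtraction is only meaningful if `known` is contained in `total`.
assert all(total[label] >= mult for label, mult in known.items())

# Counter subtraction drops labels whose multiplicity falls to zero.
remainder = total - known

print(dict(remainder))
```

The remainder has multiplicity one on `"(0,1)^1"`, `"(2,0)^1"` and `"(1,2)^1"`, which is the right-hand side of (151).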
From [ιλ(2,1) ][ρ], (132) and (151) we find (1)
(2)
(1)
(1)
(2)
(1)
(2[ιλ(1,0) ] ⊕ [ιλ(1,0) ] ⊕ [ιλ(2,1) ])[ρ] = 3[ιλ(0,1) ] ⊕ 3[ιλ(0,1) ] ⊕ 3[ιλ(2,0) ] (1)
(1)
⊕ [ιλ(1,2) ] ⊕ [ιλ(0,4) ].
(152)
Similarly, by considering [ιλ(1,3) ][ρ] and [ιλ(0,5) ][ρ], and using (134), (137) and [ιλ(0,5) ] = [ιλ(4,0) ], we have the following: (1)
(2)
(1)
(1)
(1)
(2)
(1)
(2[ιλ(1,0) ] ⊕ [ιλ(1,0) ] ⊕ 2[ιλ(2,1) ] ⊕ [ιλ(4,0) ])[ρ] = 3[ιλ(0,1) ] ⊕ 3[ιλ(0,1) ] ⊕ 4[ιλ(2,0) ] (1)
(1)
⊕ 3[ιλ(1,2) ] ⊕ 2[ιλ(0,4) ], (1)
(2)
(1)
(1)
(1)
(2)
(153) (1)
([ιλ(1,0) ] ⊕ [ιλ(1,0) ] ⊕ [ιλ(2,1) ] ⊕ [ιλ(4,0) ])[ρ] = 2[ιλ(0,1) ] ⊕ 2[ιλ(0,1) ] ⊕ 2[ιλ(2,0) ] (1)
(1)
⊕ 2[ιλ(1,2) ] ⊕ 2[ιλ(0,4) ]. Then from (150), (152)–(154) we obtain the following sector products: (1)
(1)
(2)
(2)
(1)
(2)
(1)
(1)
(1)
(1)
(1)
(1)
(1)
[ιλ(1,0) ][ρ] = [ιλ(0,1) ] ⊕ [ιλ(0,1) ] ⊕ [ιλ(2,0) ], [ιλ(1,0) ][ρ] = [ιλ(0,1) ] ⊕ [ιλ(0,1) ], (1)
[ιλ(2,1) ][ρ] = [ιλ(2,0) ] ⊕ [ιλ(1,2) ] ⊕ [ιλ(0,4) ], [ιλ(4,0) ][ρ] = [ιλ(1,2) ] ⊕ [ιλ(0,4) ].
(154)
Next, since [ιλ(0,1) ][ρ] = [ιλ(0,0) ] ⊕ [ιλ(1,1) ], from (127) we have (1)
(2)
(1)
(2)
([ιλ(0,1) ] ⊕ [ιλ(0,1) ])[ρ] = 2[ιλ(0,0) ] ⊕ 2[ιλ(1,1) ] ⊕ [ιλ(1,1) ].
(155) (1)
By considering [ιλ(2,0) ][ρ] = [ιλ(1,1) ] ⊕ [ιλ(3,0) ] and (128) we obtain ([ιλ(0,1) ] ⊕ (2)
(1)
(1)
(2)
(1)
[ιλ(0,1) ] ⊕ [ιλ(2,0) ])[ρ] = 2[ιλ(0,0) ] ⊕ 3[ιλ(1,1) ] ⊕ 2[ιλ(1,1) ] ⊕ [ιλ(3,0) ]. Then from (155) we see that (1)
(1)
(2)
(1)
[ιλ(2,0) ][ρ] = [ιλ(1,1) ] ⊕ [ιλ(1,1) ] ⊕ [ιλ(3,0) ].
(156)
From [ιλ(1,2) ][ρ], (133) and (156) we obtain (1)
(2)
(1)
(1)
(2)
(2[ιλ(0,1) ] ⊕ [ιλ(0,1) ] ⊕ [ιλ(1,2) ])[ρ] = 3[ιλ(0,0) ] ⊕ 3[ιλ(1,1) ] ⊕ 3[ιλ(1,1) ] (1)
(1)
⊕ [ιλ(3,0) ] ⊕ [ιλ(2,2) ].
(157)
Similarly, by considering [ιλ(3,1) ][ρ] and [ιλ(0,4) ][ρ], and using (136) and (135), we have the following: (1)
(2)
(1)
(1)
(1)
(2)
(2[ιλ(0,1) ] ⊕ [ιλ(0,1) ] ⊕ 2[ιλ(1,2) ] ⊕ [ιλ(0,4) ])[ρ] = 3[ιλ(0,0) ] ⊕ 3[ιλ(1,1) ] ⊕ 4[ιλ(1,1) ] (1)
(1)
⊕ 3[ιλ(3,0) ] ⊕ 3[ιλ(2,2) ], (1)
(2)
(1)
(1)
(1)
(158) (2)
([ιλ(0,1) ] ⊕ [ιλ(0,1) ] ⊕ [ιλ(1,2) ] ⊕ [ιλ(0,4) ])[ρ] = 2[ιλ(0,0) ] ⊕ 2[ιλ(1,1) ] ⊕ 2[ιλ(1,1) ] (1)
(1)
⊕ 2[ιλ(3,0) ] ⊕ 2[ιλ(2,2) ].
(159)
Then from (155), (157)–(159) we obtain the following sector products:
\[ [\iota\lambda^{(1)}_{(0,1)}][\rho] = [\iota\lambda_{(0,0)}] \oplus [\iota\lambda^{(1)}_{(1,1)}] \oplus [\iota\lambda^{(2)}_{(1,1)}], \qquad [\iota\lambda^{(2)}_{(0,1)}][\rho] = [\iota\lambda_{(0,0)}] \oplus [\iota\lambda^{(1)}_{(1,1)}], \]
\[ [\iota\lambda^{(1)}_{(1,2)}][\rho] = [\iota\lambda^{(2)}_{(1,1)}] \oplus [\iota\lambda^{(1)}_{(3,0)}] \oplus [\iota\lambda^{(1)}_{(2,2)}], \qquad [\iota\lambda^{(1)}_{(0,4)}][\rho] = [\iota\lambda^{(1)}_{(3,0)}] \oplus [\iota\lambda^{(1)}_{(2,2)}]. \]
We thus obtain the graph $D^{(12)*}$ as the nimrep graph for the M-N system, illustrated in Fig. 10, and the associated modular invariant is $Z_{D^{(12)*}}$.
Fig. 10. M-N graph for the $D^{(12)*}$ SU(3)-GHJ subfactor.
5.7. The type I parent
Thus we have constructed subfactors which realize all of the SU(3) modular invariants, except for the $E_4^{(12)}$ case, since the existence of this subfactor is not yet shown. However, for the modular invariant associated to the graph $E_4^{(12)}$, we have $Z_{E_{MS}^{(12)*}} = Z_{E_{MS}^{(12)}} C$, where $C$ is the modular invariant associated to the graph $A^{(12)*}$. Since both $Z_{E_{MS}^{(12)}}$, $C$ are shown to be realized by subfactors, the result of [18, Theorem 3.6] shows that the modular invariant $Z_{E_{MS}^{(12)*}}$ is also realized by a subfactor.
The M-N graph $G$ of a subfactor $N \subset M$ is defined by the matrix $\Delta_\rho$ which gives the decomposition of the M-N sectors with respect to multiplication by the fundamental representation $\rho$. Similarly, multiplication by the conjugate representation defines the matrix $\Delta_{\overline{\rho}} = \Delta_\rho^T$ which is the adjacency matrix of the conjugate graph of $G$. Then since ${}_N X_N$ is commutative, the matrices $\Delta_\rho$ and $\Delta_\rho^T$ commute, i.e. $\Delta_\rho$ is normal. This provides a proof that the adjacency matrices of the ADE graphs are all normal, since each of the ADE graphs appears as the M-N graph for a subfactor $N \subset M$.
The zero-column of the modular invariant $Z$ associated with the subfactor $N \subset M$ determines $\langle \alpha_j^+, \alpha_{j'}^+ \rangle$ since $\alpha^+$ preserves the sector product
\[ \langle \alpha_j^+, \alpha_{j'}^+ \rangle = \langle \alpha_j^+ \alpha_{\overline{j'}}^+, \mathrm{id} \rangle = \sum_{j''} N_{j,\overline{j'}}^{j''} \langle \alpha_{j''}^+, \mathrm{id} \rangle = \sum_{j''} N_{j,\overline{j'}}^{j''} Z_{j'',0}, \tag{160} \]
and similarly the zero-row determines $\langle \alpha_j^-, \alpha_{j'}^- \rangle$. Then for all modular invariants with the same zero-column, the sectors $[\alpha_1^\pm]$ satisfy the same equation (160) and hence have the same nimrep graphs.
Let $v$ be an isometry which intertwines the identity and the canonical endomorphism $\gamma = \iota\overline{\iota}$. [6, Proposition 3.2] states that the following conditions are equivalent:
1. $Z_{\lambda,0} = \langle \theta, \lambda \rangle$ for all $\lambda \in {}_N X_N$.
2. $Z_{0,\lambda} = \langle \theta, \lambda \rangle$ for all $\lambda \in {}_N X_N$.
3. Chiral locality holds: $\varepsilon^+(\theta, \theta) v^2 = v^2$.
The chiral locality condition, which can be expressed in terms of the single inclusion $N \subset M$ and the braiding, expresses local commutativity (locality) of the extended net, if $N \subset M$ arises from a net of subfactors [27]. Chiral locality holds if and only if the dual canonical endomorphism is visible in the vacuum row, $[\theta] = \bigoplus_\lambda Z_{0,\lambda} [\lambda]$ (and hence in the vacuum column also).
We will call the inclusion $N \subset M$ type I if and only if one of the above equivalent conditions 1–3 holds. Otherwise we will call the inclusion type II. Note that the inclusions obtained for the $E_1^{(12)}$ and $E_2^{(12)}$ graphs realize the same modular invariant $Z_{E^{(12)}}$, but the inclusion for $E_1^{(12)}$ is type I whilst the inclusion for $E_2^{(12)}$ is type II. This shows that it is possible for a type I modular invariant to be realized by a type II inclusion, and suggests that care needs to be taken with the type I, II labeling of modular invariants.
The nimrep graph of $[\alpha_1^\pm]$ for the identity modular invariant is the fusion graph of the original N-N system, whilst the nimrep graphs of $[\alpha_1^\pm]$ for the modular invariants associated to $D^{(3k+3)}$ and $E^{(8)}$ were computed in [4], and for $E_1^{(12)}$ and $E^{(24)}$ in [5]. In these cases we have $Z_{\lambda,0} = \langle \theta, \lambda \rangle$ for all $\lambda \in {}_N X_N$, for $\theta$ given in (56)–(67). The principal graph of the inclusion $\alpha_1^\pm(N) \subset N$ is then the nimrep graph of $[\alpha_1^\pm]$. The other modular invariants all have the same zero-column as one of these modular invariants, and hence the nimrep graph of $[\alpha_1^\pm]$ for these modular invariants must be the graph given by the type I parent of $Z$, that is, the type I modular invariant which has the same first column as $Z$.
The results are summarized in Table 1, where "Type" refers to the type of the inclusion $N \subset M$ given by the SU(3)-GHJ construction, where the distinguished vertex $*_G$ is the vertex with lowest Perron–Frobenius weight.^a
For $E_4^{(12)}$, we do not show the existence of the Ocneanu cells, and hence do not have a GHJ subfactor here. However, we have shown that the $Z_{E_{MS}^{(12)*}}$ modular invariant is realized as a braided subfactor. The corresponding nimrep is not computed here, but if (65) is a dual canonical endomorphism, then its nimrep graph is shown to be $E_4^{(12)}$. This would be the case if $E_4^{(12)}$ carries a cell system.
^a Note that we have only shown the $A^*$ and $D^*$ cases for n = 12. We have not done any computations for the $D^{(n)}$ graphs, $n \not\equiv 0 \bmod 3$.
Table 1. The SU(3) modular invariants realized by SU(3)-GHJ subfactors.

GHJ graph                                  Modular invariant                                Type   M-N graph        Type I parent
$A^{(n)}$                                  $Z_{A^{(n)}}$                                    I      $A^{(n)}$        $A^{(n)}$
$A^{(n)*}$                                 $Z_{A^{(n)*}} = C$                               II     $A^{(n)*}$       $A^{(n)}$
$D^{(3k)}$                                 $Z_{D^{(3k)}}$                                   I      $D^{(3k)}$       $D^{(3k)}$
$D^{(n)}$ ($n \not\equiv 0 \bmod 3$)       $Z_{D^{(n)}}$                                    II     ?                $A^{(n)}$
$D^{(3k)*}$                                $Z_{D^{(3k)*}} = Z_{D^{(3k)}} C$                 II     $D^{(3k)*}$      $D^{(3k)}$
$D^{(n)*}$ ($n \not\equiv 0 \bmod 3$)      $Z_{D^{(n)*}} = Z_{D^{(n)}} C$                   II     $D^{(n)*}$       $A^{(n)}$
$E^{(8)}$                                  $Z_{E^{(8)}}$                                    I      $E^{(8)}$        $E^{(8)}$
$E^{(8)*}$                                 $Z_{E^{(8)*}} = Z_{E^{(8)}} C$                   II     $E^{(8)*}$       $E^{(8)}$
$E_1^{(12)}$                               $Z_{E^{(12)}} = Z_{E^{(12)}} C$                  I      $E_1^{(12)}$     $E_1^{(12)}$
$E_2^{(12)}$                               $Z_{E^{(12)}} = Z_{E^{(12)}} C$                  II     $E_2^{(12)}$     $E_1^{(12)}$
$E_3^{(12)}$                               -                                                -      -                -
$E_4^{(12)}$                               $Z_{E_{MS}^{(12)*}} = Z_{E_{MS}^{(12)}} C$       II     $E_4^{(12)}$     $D^{(12)}$
$E_5^{(12)}$                               $Z_{E_{MS}^{(12)}}$                              II     $E_5^{(12)}$     $D^{(12)}$
$E^{(24)}$                                 $Z_{E^{(24)}} = Z_{E^{(24)}} C$                  I      $E^{(24)}$       $E^{(24)}$
Acknowledgments
This paper is based on work in [32]. The first author was partially supported by the EU-NCG network in Non-Commutative Geometry MRTN-CT-2006-031962, and the second author was supported by a scholarship from the School of Mathematics, Cardiff University.

References
[1] R. E. Behrend, P. A. Pearce, V. B. Petkova and J.-B. Zuber, Boundary conditions in rational conformal field theories, Nucl. Phys. B 579 (2000) 707–773.
[2] J. Böckenhauer, Modular invariants and subfactors II, Lecture at Warwick Workshop on Modular Invariants, Operator Algebras and Quotient Singularities, University of Warwick (September 1999).
[3] J. Böckenhauer and D. E. Evans, Modular invariants, graphs and α-induction for nets of subfactors. I, Comm. Math. Phys. 197 (1998) 361–386.
[4] J. Böckenhauer and D. E. Evans, Modular invariants, graphs and α-induction for nets of subfactors. II, Comm. Math. Phys. 200 (1999) 57–103.
[5] J. Böckenhauer and D. E. Evans, Modular invariants, graphs and α-induction for nets of subfactors. III, Comm. Math. Phys. 205 (1999) 183–228.
[6] J. Böckenhauer and D. E. Evans, Modular invariants from subfactors: Type I coupling matrices and intermediate subfactors, Comm. Math. Phys. 213 (2000) 267–289.
[7] J. Böckenhauer and D. E. Evans, Modular invariants and subfactors, in Mathematical Physics in Mathematics and Physics (Siena, 2000), ed. R. Longo, Fields Inst. Commun. Vol. 30 (Amer. Math. Soc., Providence, RI, 2001), pp. 11–37.
[8] J. Böckenhauer, D. E. Evans and Y. Kawahigashi, On α-induction, chiral generators and modular invariants for subfactors, Comm. Math. Phys. 208 (1999) 429–487.
[9] J. Böckenhauer, D. E. Evans and Y. Kawahigashi, Chiral structure of modular invariants for subfactors, Comm. Math. Phys. 210 (2000) 733–784.
[10] A. Cappelli, C. Itzykson and J.-B. Zuber, The A-D-E classification of minimal and $A_1^{(1)}$ conformal invariant theories, Comm. Math. Phys. 113 (1987) 1–26.
[11] P. Di Francesco and J.-B. Zuber, SU(N) lattice integrable models associated with graphs, Nucl. Phys. B 338 (1990) 602–646.
[12] E. G. Effros, Dimensions and C*-Algebras, CBMS Regional Conference Series in Mathematics, Vol. 46 (Conference Board of the Mathematical Sciences, Washington, D.C., 1981).
[13] D. E. Evans, Fusion rules of modular invariants, Rev. Math. Phys. 14 (2002) 709–731.
[14] D. E. Evans, Critical phenomena, modular invariants and operator algebras, in Operator Algebras and Mathematical Physics (Constanţa, 2001), The Theta Foundation, Bucharest (2003), pp. 89–113.
[15] D. E. Evans and J. D. Gould, Dimension groups and embeddings of graph algebras, Internat. J. Math. 5 (1994) 291–327.
[16] D. E. Evans and Y. Kawahigashi, Orbifold subfactors from Hecke algebras, Comm. Math. Phys. 165 (1994) 445–484.
[17] D. E. Evans and Y. Kawahigashi, Quantum Symmetries on Operator Algebras, Oxford Mathematical Monographs (The Clarendon Press, Oxford University Press, New York, 1998).
[18] D. E. Evans and P. R. Pinto, Subfactor realization of modular invariants, Comm. Math. Phys. 237 (2003) 309–363.
[19] D. E. Evans and M. Pugh, Ocneanu cells and Boltzmann weights for the SU(3) ADE graphs, to appear in Münster J. Math.; arXiv:0906.4307 [math.OA].
[20] D. E. Evans and M. Pugh, $A_2$-planar algebras I, preprint; arXiv:0906.4225 [math.OA].
[21] D. E. Evans and M. Pugh, $A_2$-planar algebras II: Planar modules, preprint; arXiv:0906.4311 [math.OA].
[22] D. E. Evans and M. Pugh, Spectral measures and generating series for nimrep graphs in subfactor theory, to appear in Comm. Math. Phys.; arXiv:0906.4314 [math.OA].
[23] T. Gannon, The classification of affine SU(3) modular invariant partition functions, Comm. Math. Phys. 161 (1994) 233–263.
[24] F. M. Goodman, P. de la Harpe and V. F. R. Jones, Coxeter Graphs and Towers of Algebras, MSRI Publications, Vol. 14 (Springer-Verlag, New York, 1989).
[25] L. H. Kauffman, State models and the Jones polynomial, Topology 26 (1987) 395–407.
[26] G. Kuperberg, Spiders for rank 2 Lie algebras, Comm. Math. Phys. 180 (1996) 109–151.
[27] R. Longo and K.-H. Rehren, Nets of subfactors, Rev. Math. Phys. 7 (1995) 567–597.
[28] G. Moore and N. Seiberg, Naturality in conformal field theory, Nucl. Phys. B 313 (1989) 16–40.
[29] A. Ocneanu, Paths on Coxeter diagrams: From platonic solids and singularities to minimal models and subfactors (Notes recorded by S. Goto), in Lectures on Operator Theory, eds. B. V. Rajarama Bhat et al., Fields Institute Monographs, Vol. 13 (Amer. Math. Soc., Providence, RI, 2000), pp. 243–323.
[30] A. Ocneanu, Higher Coxeter systems (2000), Talk given at MSRI; http://www.msri.org/publications/ln/msri/2000/subfactors/ocneanu.
[31] A. Ocneanu, The classification of subgroups of quantum SU(N), in Quantum Symmetries in Theoretical Physics and Mathematics (Bariloche, 2000), Contemp. Math., Vol. 294 (Amer. Math. Soc., Providence, RI, 2002), pp. 133–159.
[32] M. Pugh, The Ising model and beyond, PhD thesis, Cardiff University (2008).
[33] A. Wassermann, Operator algebras and conformal field theory. III. Fusion of positive energy representations of LSU(N) using bounded operators, Invent. Math. 133 (1998) 467–538.
[34] H. Wenzl, Hecke algebras of type $A_n$ and subfactors, Invent. Math. 92 (1988) 349–383.
[35] F. Xu, Generalized Goodman–Harpe–Jones construction of subfactors. I, II, Comm. Math. Phys. 184 (1997) 475–491, 493–508.
[36] F. Xu, New braided endomorphisms from conformal inclusions, Comm. Math. Phys. 192 (1998) 349–403.
August 12, 2009 3:59 WSPC/148-RMP
J070-00377
Reviews in Mathematical Physics Vol. 21, No. 7 (2009) 929–945 c World Scientific Publishing Company
SINGULAR SPECTRUM FOR RADIAL TREES
JONATHAN BREUER Mathematics 253-37, California Institute of Technology, Pasadena, CA 91125, USA
[email protected] RUPERT L. FRANK Department of Mathematics, Princeton University, Washington Road, Princeton, NJ 08544, USA
[email protected] Received 25 March 2009 Revised 10 July 2009
We prove several results showing that absolutely continuous spectrum for the Laplacian on radial trees is a rare event. In particular, we show that metric trees with unbounded edges have purely singular spectrum and that, generically (in the sense of Baire), radial trees have purely singular continuous spectrum.
Keywords: Schrödinger operators; quantum graphs; trees; singular spectrum; reflectionless property.
Mathematics Subject Classification 2000: 34L05, 34L40, 35Q40, 47B36
1. Introduction
1.1. Overview
Trees have provided a popular setting for many spectral theoretic works. This is due to various fascinating features they have as well as the fact that they have some multi-dimensional properties that may be studied with the help of one-dimensional methods. Recent interest in metric trees is connected to the popularity general quantum graphs have enjoyed (for a review, see [13]), and has fueled a significant number of studies in the past decade. As a review of spectral theory on trees is beyond the scope of this introduction, we give here only a partial list of recent works that are relevant to our work [1, 2, 6–8, 10–12, 17, 26]. This paper deals with the absence of absolutely continuous spectrum for the Laplacian on radial trees. In particular, our purpose is to demonstrate that the existence of absolutely continuous spectrum imposes rather stringent restrictions
on the structure of the tree, so that generally, the occurrence of absolutely continuous spectrum is a rather exceptional event. To be specific, Theorem 2 shows that any sparse radial metric tree has purely singular spectrum. Moreover, we shall demonstrate that, in some natural sense, most metric radial trees have purely singular continuous spectrum. To the best of our knowledge, this is the first theorem of this type to be proven for metric trees.

Examples of Schrödinger operators (operators of the form $-\Delta + V$) with nondiscrete spectrum that have no absolutely continuous spectral measures have been known for several decades. One (by now classical) example is that of the Anderson model for which, in the localized regime, the spectral measures are pure point, but the spectrum is a union of intervals. Moreover, various works in the 1980s and 1990s have shown that singular continuous spectral measures are the rule rather than the exception (for a review of “exotic” spectra see [15]). This picture is in some contrast to the “standard” physics textbook picture (where a quantum particle is either in a bound state, corresponding to an eigenfunction, or is free and thus in the absolutely continuous regime) and so took some time to establish.

The first explicit example of a Schrödinger operator with singular continuous spectrum was constructed by Pearson in 1978 [18]. The potential in Pearson’s example consists of a sequence of bumps such that the distance between two consecutive bumps increases to infinity. Potentials of this type came to be known as “sparse potentials” and have been extensively studied since (see [14] for a review). Almost-periodic and quasi-periodic potentials have also provided various examples of Schrödinger operators with singular continuous and other types of “exotic” spectra and have been extensively studied for this and various other reasons.
One of the realizations that grew out of the research on the spectral properties of the operators described above was that, in a certain natural sense, a Schrödinger operator with a generic potential has singular continuous spectrum. In particular, Simon’s celebrated Wonderland Theorem [22] states that for a dense $G_\delta$ set of potentials in $C_\infty(\mathbb{R}^d)$ (continuous functions vanishing at infinity with the supremum norm), the operator $-\Delta + V$ has purely singular continuous spectrum in $(0, \infty)$.

Recently, the picture described above has been complemented by a remarkable theorem of Remling’s [20, 21], following the work of Breimesser and Pearson [4, 5]. Remling’s Theorem deals with the consequences of absolutely continuous spectrum for one-dimensional Schrödinger operators and Jacobi matrices. In particular, it imposes various explicit restrictions on one-dimensional (discrete and continuous) operators with absolutely continuous spectrum.

In all the works described above, the singular spectrum is a result of an interaction with a potential. That some of the ideas mentioned above may be applied to construct spaces for which the Laplace operator (with no added potential) has unusual spectral properties was first realized by Simon [24], who considered a family of ladder-type graphs and showed that a generic graph in this family has singular continuous spectrum. Other examples of infinite graphs with singular spectrum include [11, 16, 27].
In the case of tree graphs, these ideas were applied in [6, 7] to discrete radial sparse trees (trees with edge lengths growing to infinity), where it was shown that by controlling the various parameters defining the tree it is possible to control also the “degree of continuity” of the spectral measures. In particular, examples with purely singular (continuous or point) spectrum were presented there.

We remark that while our results show that the essential spectrum of the Laplacian on a sparse tree graph is radically different from that of the Laplacian on Euclidean space, the discrete spectrum of perturbations of the Laplacian is qualitatively similar on trees and on Euclidean space; see [10, 17] for estimates on eigenvalues of Schrödinger operators on metric trees.

1.2. Main results

While our primary concern is in metric trees, let us begin with a first result in the discrete setting. Let $\Gamma_d$ be a radial discrete tree, that is, a rooted, radially symmetric tree graph. Denote the root by $O$ and, for any vertex $x$, the branching number of $x$ (that is, the number of forward nearest neighbors of $x$) by $b(x)$. The symmetry implies that $b(x)$ is a function of the distance of $x$ to the root. We only consider infinite trees. Therefore, there exists a sequence of natural numbers, $\{b_n\}_{n=1}^\infty$, such that if $n = d(O, x)$ then $b(x) = b_n$. We have

Theorem 1. Assume $\{b_n\}_{n=1}^\infty$ is bounded. Let $\Delta$ be the discrete Laplacian on $\Gamma_d$. Then, if $\Delta$ has nonempty absolutely continuous spectrum, the sequence $\{b_n\}_{n=1}^\infty$ is eventually periodic. That is, there exists $N \in \mathbb{N}$ such that $\{b_n\}_{n=N}^\infty$ is a (one-sided) periodic sequence.

Remark. Different authors use different definitions for $\Delta$. The theorem holds both for
\[ (\Delta f)(x) = \sum_{y \sim x} f(y) \tag{1.1} \]
(also known as the adjacency matrix), and for
\[ (\Delta f)(x) = \begin{cases} (b(x) + 1) f(x) - \sum_{y \sim x} f(y), & x \neq O, \\ b(x) f(x) - \sum_{y \sim x} f(y), & x = O, \end{cases} \tag{1.2} \]
where $y \sim x$ means $y$ is a nearest neighbor of $x$.

Regarding radial metric trees, a moment’s reflection shows that such a result cannot hold since the edge lengths are now continuous parameters. We can, however, show that unbounded edge lengths do rule out absolutely continuous spectrum. Explicitly, let $\Gamma_c$ be a radial metric tree with root $O$. As before, let $b(x)$ be the branching number of a vertex $x$. We assume $b(O) = 1$ and $b(x) > 1$ for any other
vertex. If $x$ is a vertex with $n + 1$ vertices on the unique geodesic connecting it to $O$ (including the endpoints), we denote $d(O, x) = t_n$ and $b(x) = b_n$. The parameters $b_n$ and $t_n$ are well defined because of the radial symmetry. We shall assume that

(1) $\inf_n (t_{n+1} - t_n) > 0$.

Remark. In the metric tree literature, the trees we consider here are usually called regular trees [17, 26]. Since this paper also has a theorem about discrete trees, where regularity usually means that every vertex has the same number of neighbors, we chose the term radial for both settings as a unifying compromise.

Let $-\Delta$ be the operator in $L^2(\Gamma_c)$ defined through the quadratic form
\[ \int_{\Gamma_c} |u'(x)|^2 \, dx \tag{1.3} \]
for functions $u \in H_0^1(\Gamma_c)$, the Sobolev space of functions continuous along the edges and satisfying $\int_{\Gamma_c} (|u'(x)|^2 + |u(x)|^2) \, dx < \infty$, $u(O) = 0$. Functions in the operator domain of $-\Delta$ satisfy Dirichlet boundary conditions at the root and Kirchhoff boundary conditions at the vertices. We shall prove

Theorem 2. Under assumption (1) above, if
\[ \limsup_{n \to \infty} (t_{n+1} - t_n) = \infty, \tag{1.4} \]
then the spectrum of $-\Delta$ is purely singular, in the sense that any spectral measure for $-\Delta$ is supported on a set of Lebesgue measure zero.

Put differently, this theorem says that any subsequence of unbounded edges destroys absolutely continuous spectrum, no matter what happens between these unbounded edges. The trees described in Theorem 2 are sparse in the sense that their branchings become sparse as the distance from the root increases (at least along some sequence). We find it remarkable that such a small class of radial metric trees has absolutely continuous spectrum. The next two theorems show that pure point spectrum does not occur too often either.

Theorem 3. Assume that the sequence $\{b_n\}$ is bounded. Then, if
\[ \limsup_{n \to \infty} \frac{t_{n+1} - t_n}{n^{2n}} > 0, \tag{1.5} \]
the spectrum of $-\Delta$ coincides with $[0, \infty)$ and is purely singular continuous.

For any $\varepsilon > 0$, $C > 0$ we consider the set $\mathcal{T}^{\varepsilon,C}$ of radial trees whose defining parameter sequences, $\{t_n, b_n\}$, satisfy

$(1)_\varepsilon$ $\inf_n (t_{n+1} - t_n) \geq \varepsilon$, $t_1 \geq \varepsilon$.
$(2)_C$ $\sup_n b_n \leq C$.
Moreover, we allow $\{t_n, b_n\}$ to be a finite (possibly empty) sequence. This means that we also consider trees which have only a finite number of vertices and which contain half-lines. Identifying sequences $\{t_n, b_n\}$ with measures $\sum_n \beta_n \delta_{t_n}$ on $\mathbb{R}_+$ (with $\beta_n = \frac{\sqrt{b_n} + 1}{\sqrt{b_n} - 1}$) we can consider $\mathcal{T}^{\varepsilon,C}$ as a (compact) metric space with convergence being induced from weak convergence of measures. This convergence is natural for us since, as we shall see, convergence of the trees implies strong resolvent convergence of the corresponding Laplacians (up to a natural unitary transformation taking the different Hilbert spaces into account). From Theorem 3 we shall deduce the somewhat surprising

Theorem 4. In the space $\mathcal{T}^{\varepsilon,C}$, with the topology of weak convergence for the corresponding measures, the set of trees whose spectrum on $[0, \infty)$ is purely singular continuous is a dense $G_\delta$ set.

Recall that a $G_\delta$ set is a countable intersection of open sets.

As with other works dealing with radial trees (see e.g. [6–8, 17, 26]), the radial symmetry reduces the analysis to a one-dimensional problem. Thus, the exclusion of eigenvalues in Theorem 3 follows by a Simon–Stolz type argument [25] and Theorem 4 follows from Theorem 3 with the help of results from Simon’s Wonderland paper [22] applied to the corresponding families of one-dimensional operators.

The new ingredient which enters in the proofs of Theorems 1 and 2 is Remling’s Theorem [20, 21], mentioned in the previous subsection. As noted above, Remling’s Theorem leads to various constraints on one-dimensional discrete and continuous Schrödinger operators with absolutely continuous spectrum. The structural restrictions on trees with absolutely continuous spectrum are a consequence of these constraints. In particular, Theorem 1 is an immediate corollary of [20, Theorem 1.1], given [6, Theorem 2.4]. In contrast with the discrete case, Theorem 2 is not an immediate consequence of the results of [20] or its continuous counterpart [21].
The difficulty lies in the fact that the objects appearing in the direct sum decomposition of $-\Delta$ (see, e.g., [8, 17]) are not “standard” Schrödinger operators, but rather Sturm–Liouville operators on weighted $L^2$ spaces with rather singular weights. The better part of the rest of this paper is devoted to demonstrating the applicability of Remling’s Theorem to these operators.

A crucial point in the analysis is the proof that for a whole-line potential that is reflectionless (see Sec. 4.1 for the definition) on a set of positive Lebesgue measure, the part of the potential lying to the left of 0 uniquely determines the part lying to the right of 0. In the context of Jacobi matrices and Schrödinger operators with measure valued potentials this is, indeed, a simple realization relying on classical results. In our case, however, this seems to be a new result (in particular, see Proposition 12). In order to prove this, we have made use of the Kreĭn formula for the difference of the resolvents of two different self-adjoint extensions of a closed, densely defined, symmetric operator, as it appears in [19] (see Sec. 3). We are not aware of any previous application of
this formula in the spectral theory of Schrödinger operators on radial trees, and we believe that it may be useful also beyond the context of the present paper.

To the best of our knowledge, the theorems above are the first of their kind in terms of the generality in which they hold. In particular, we do not know of another Wonderland-type theorem for trees. Interestingly enough, it is not clear how we could formulate an interesting analogue of Theorem 4 for discrete trees. The reason is that in order to exclude eigenvalues, one needs a “free operator” which approximates the tree. The natural free operator in the discrete case is the discrete Laplacian which has spectrum in $[-2, 2]$. Since, in general, discrete trees might have a significant portion of their spectrum outside $[-2, 2]$, the exclusion of eigenvalues there does not have the implications of Theorem 3.

The rest of this paper is structured as follows. Section 2 describes the reduction of the above theorems to theorems for one-dimensional Schrödinger operators with point interactions. Section 3 proves a resolvent formula and a uniqueness result for such operators. Section 4 completes the proof of the theorems. As noted above, Theorem 1 is a direct consequence of [20, Theorem 1.1]. Thus, no additional discussion will be devoted to its proof.

2. Reduction to the One-Dimensional Case

Using the radial symmetry of the tree we shall deduce our main theorems from results about one-dimensional operators. In this section, we describe this reduction and state the corresponding theorems in the one-dimensional context.

As in the previous section, let $\Gamma_c$ be a radial metric tree associated with parameters $\{(t_n, b_n)\}_{n=1}^\infty$, which are assumed to satisfy (1). We put $(t_0, b_0) := (0, 0)$.
For any integer $k \geq 0$, we introduce the self-adjoint operator $A_k^+$ in $L^2(t_k, \infty)$ defined by $(A_k^+ f)(r) = -f''(r)$ for $r \in \bigcup_{n=k}^\infty (t_n, t_{n+1})$ with domain consisting of all functions $f \in H^2\bigl(\bigcup_{n=k}^\infty (t_n, t_{n+1})\bigr)$ satisfying $f(t_k) = 0$ and
\[ f(t_n+) = \sqrt{b_n}\, f(t_n-), \qquad f'(t_n+) = \frac{1}{\sqrt{b_n}}\, f'(t_n-) \tag{2.1} \]
for $n > k$. These operators appear naturally in the direct sum decomposition of the Laplacian [8, 17, 26]. Indeed, one has

Proposition 5 ([26]). $-\Delta$ is unitarily equivalent to the direct sum
\[ -\Delta \cong A_0^+ \oplus \bigoplus_{k=1}^\infty \bigl( A_k^+ \otimes I_{\mathbb{C}^{b_1 \cdots b_{k-1}(b_k - 1)}} \bigr). \tag{2.2} \]

It follows from this proposition that our main results, Theorems 2–4, will be proved if we can show the corresponding results for any of the operators $A_k^+$, $k \geq 0$. Since we consider general sequences $\{(t_n, b_n)\}_{n=1}^\infty$ we may, without loss of generality, restrict our attention to $A_0^+$. To simplify notation we denote this operator from now on by $A^+$. Moreover, when studying these operators we need no longer assume that
the $b_n$’s are integer-valued. All we need is

(2) $\inf_{n \geq 1} b_n > 1$.

Using boundary conditions (2.1) and integrating by parts one easily finds that
\[ (f, A^+ f) = \int_0^\infty |f'|^2 \, dt \tag{2.3} \]
for $f \in \operatorname{dom} A^+$. In particular, $A^+$ is a non-negative operator. We now state our results concerning the operators $A^+$ which will imply Theorems 2–4.

Theorem 6. Let $\{(t_n, b_n)\}_{n=1}^\infty$ satisfy assumptions (1) and (2) and let $A^+$ be the associated operator. Then, if $\limsup_{n \to \infty} (t_{n+1} - t_n) = \infty$, the absolutely continuous spectrum of the operator $A^+$ is empty.

One easily sees that if $\limsup_{n \to \infty} (t_{n+1} - t_n) = \infty$, then the spectrum of $A^+$ coincides with the interval $[0, \infty)$. According to Theorem 6, this spectrum might have a singular continuous and a pure point component. However, if a subsequence of the differences $t_{n+1} - t_n$ grows sufficiently fast, we can rule out the existence of eigenvalues following an argument of Simon and Stolz [25] and we obtain

Theorem 7. Let $\{(t_n, b_n)\}_{n=1}^\infty$ be given with $\sup_n b_n < \infty$ and let $A^+$ be the associated operator. Then, if $\limsup_{n \to \infty} n^{-2n} (t_{n+1} - t_n) > 0$, the spectrum of $A^+$ coincides with $[0, \infty)$ and is purely singular continuous.
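The growth condition of Theorem 7 can be made concrete on a finite prefix of a gap sequence. The following is a minimal sketch, not part of the paper: the helper `growth_ratios` and both gap sequences are hypothetical illustrations, and a finite prefix can only suggest, never decide, the value of a lim sup.

```python
from fractions import Fraction

def growth_ratios(gaps):
    """Return n^(-2n) * (t_{n+1} - t_n) for n = 1, 2, ...,
    where gaps[n-1] = t_{n+1} - t_n (exact rational arithmetic)."""
    return [Fraction(gap, n ** (2 * n)) for n, gap in enumerate(gaps, start=1)]

# Hypothetical gap sequences, chosen only for illustration.
super_sparse = [n ** (2 * n) for n in range(1, 8)]  # ratios stay equal to 1
exponential = [2 ** n for n in range(1, 8)]         # ratios decay rapidly

ratios_sparse = growth_ratios(super_sparse)
ratios_exp = growth_ratios(exponential)
```

Gaps growing like $n^{2n}$ keep the ratio bounded away from zero, so they are of the kind the theorem asks for, whereas merely exponential gaps give ratios that shrink along the prefix.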
Our next result states that singular continuous spectrum is indeed the “generic” situation. In order to define what we mean by “generic” we will introduce a natural topology on the sequences $\{(t_n, b_n)\}_{n=1}^\infty$ as above. It will be convenient to identify such a sequence with a measure $\mu = \sum_n \beta_n \delta_{t_n}$ on $\mathbb{R}_+$ where $\beta_n = (\sqrt{b_n} + 1)/(\sqrt{b_n} - 1)$. For later use, we will consider at once the case of measures on the whole line. For any $\varepsilon > 0$, we denote by $\mathcal{M}_a^\varepsilon$ the set of all non-negative atomic measures $\mu$ on $\mathbb{R}$ of the form $\mu = \sum_{n \in J} \beta_n \delta_{t_n}$ where $\beta_n \in [1, \infty)$ and where $t_n$ are real numbers satisfying $|t_n - t_m| \geq \varepsilon$ for all $n \neq m$. The index set $J$ may be finite, infinite or empty. Moreover, we denote by $\mathcal{M}_a^{\varepsilon,+}$ the subset consisting of all $\mu \in \mathcal{M}_a^\varepsilon$ with $\operatorname{supp} \mu \subset [\varepsilon, \infty)$, and we put
\[ \mathcal{M}_a^0 := \bigcup_{\varepsilon > 0} \mathcal{M}_a^\varepsilon, \qquad \mathcal{M}_a^{0,+} := \bigcup_{\varepsilon > 0} \mathcal{M}_a^{\varepsilon,+}. \]
Finally, for $C \geq 2$ let $\mathcal{M}_a^{\varepsilon,C,+}$ be the subset consisting of all $\mu \in \mathcal{M}_a^{\varepsilon,+}$ with $1 + C^{-1} \leq \beta_n \leq C$ for all $n \in J$.

With any measure $\mu = \sum_{n \in J} \beta_n \delta_{t_n} \in \mathcal{M}_a^{0,+}$ we associate an operator $A_\mu^+$ in $L^2(\mathbb{R}_+)$ acting as $A_\mu^+ f = -f''$ in $\mathbb{R}_+ \setminus \operatorname{supp} \mu$ on functions satisfying $f(0) = 0$ and (2.1) for all $n \in J$ where $\beta_n = (\sqrt{b_n} + 1)/(\sqrt{b_n} - 1)$ with $b_n \in (1, \infty]$. (For $b_n = \infty$, (2.1) is interpreted as $f(t_n-) = 0$ and $f'(t_n+) = 0$.) If $J$ is infinite and all $b_n$’s are finite, this is precisely the operator $A^+$ defined above.
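The identification of branching numbers with atom weights, and the constraints defining $\mathcal{M}_a^{\varepsilon,C,+}$, can be sketched in a few lines. This is an illustrative sketch only: the function names `beta_from_b`, `b_from_beta` and `in_M_eps_C_plus`, and the representation of a measure as a list of `(t_n, beta_n)` pairs, are assumptions made here and do not come from the paper.

```python
import math

def beta_from_b(b):
    """Atom weight beta = (sqrt(b) + 1) / (sqrt(b) - 1) for b in (1, inf].
    The value b = inf corresponds to beta = 1."""
    if b == math.inf:
        return 1.0
    if b <= 1:
        raise ValueError("b must exceed 1")
    s = math.sqrt(b)
    return (s + 1.0) / (s - 1.0)

def b_from_beta(beta):
    """Inverse map: sqrt(b) = (beta + 1) / (beta - 1); beta = 1 gives b = inf."""
    if beta < 1:
        raise ValueError("beta must be at least 1")
    if beta == 1:
        return math.inf
    return ((beta + 1.0) / (beta - 1.0)) ** 2

def in_M_eps_C_plus(atoms, eps, C):
    """Check the three defining constraints for a finite list of (t_n, beta_n):
    support in [eps, inf), mutual separation >= eps, and 1 + 1/C <= beta_n <= C."""
    ts = sorted(t for t, _ in atoms)
    supported = all(t >= eps for t in ts)
    separated = all(t2 - t1 >= eps for t1, t2 in zip(ts, ts[1:]))
    weights_ok = all(1.0 + 1.0 / C <= beta <= C for _, beta in atoms)
    return supported and separated and weights_ok
```

Note that the map $\sqrt{b} \mapsto \beta$ is its own inverse, which is why the two conversion functions have the same shape; the empty list passes the membership test, matching the convention that $J$ may be empty.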
The one-dimensional analog of Theorem 4 is

Theorem 8. In the space $\mathcal{M}_a^{\varepsilon,C,+}$ with the topology of weak convergence, the set of $\mu$’s for which the spectrum of $A_\mu^+$ is purely singular continuous is a dense $G_\delta$ set.

Remark. The proof will show that the same is true if we restrict the $b_n$’s to be integers. This is what we need when we deduce Theorem 4. Moreover, to deduce Theorem 4, we also use that a countable intersection of dense $G_\delta$’s is a dense $G_\delta$ by Baire’s Category Theorem.

3. The Resolvent and the m-Function

3.1. A resolvent formula

In this subsection we derive a convenient expression for the resolvent of the operator $A^+ = A_\mu^+$ for $\mu \in \mathcal{M}_a^{0,+}$. We write
\[ \mu = \sum_{n \in J} \frac{\sqrt{b_n} + 1}{\sqrt{b_n} - 1}\, \delta_{t_n}, \qquad 0 < t_1 < t_2 < \cdots, \qquad 1 < b_n \leq \infty, \tag{3.1} \]
where $J$ is either of the form $\{1, 2, \ldots, \#\operatorname{supp} \mu\}$ if there is a finite number of atoms, or $J = \mathbb{N}$ if there are infinitely many atoms. Let $A_0^+ := -d^2/dt^2$ be the Dirichlet Laplacian in $L^2(\mathbb{R}_+)$ and recall that its resolvent $(A_0^+ - z)^{-1}$, $z \in \mathbb{C} \setminus [0, \infty)$, has integral kernel
\[ g_z(t, u) := \frac{i}{2k} \bigl( e^{ik|t-u|} - e^{ik(t+u)} \bigr), \qquad z = k^2, \qquad \operatorname{Im} k > 0. \]
Put $\mathcal{H} := \ell^2(J, \mathbb{C}^2)$ and let $\gamma$ be the trace operator from $L^2(\mathbb{R}_+)$ to $\mathcal{H}$ with domain $\operatorname{dom} \gamma = \operatorname{dom} A_0^+$, that is,
\[ (\gamma f)_n := \begin{pmatrix} f(t_n) \\ f'(t_n) \end{pmatrix}. \]
For any $z \in \mathbb{C} \setminus [0, \infty)$ we define an operator $T(z)$ in $\mathcal{H}$ by
\[ T(z)_{nm} := \begin{pmatrix} \dfrac{1}{2ik} \bigl( e^{ik|t_n - t_m|} - e^{ik(t_n + t_m)} \bigr) & \dfrac{1}{2} \bigl( \sigma_{mn} e^{ik|t_n - t_m|} - e^{ik(t_n + t_m)} \bigr) \\[2ex] \dfrac{1}{2} \bigl( \sigma_{nm} e^{ik|t_n - t_m|} - e^{ik(t_n + t_m)} \bigr) & -\dfrac{ik}{2} \bigl( e^{ik|t_n - t_m|} + e^{ik(t_n + t_m)} \bigr) \end{pmatrix} \]
where $\sigma_{mn} := \operatorname{sgn}(t_m - t_n)$, with the convention $\operatorname{sgn}(0) = 0$. Finally, we define the multiplication operator $B$ in $\mathcal{H}$ by
\[ B_{nm} := \delta_{nm}\, \frac{1}{2}\, \frac{\sqrt{b_n} + 1}{\sqrt{b_n} - 1} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \]
The following expression for the resolvent of the operator $A^+ = A_\mu^+$ will be very useful for us.
Lemma 9. For any $z \in \mathbb{C} \setminus [0, \infty)$ one has
\[ (A^+ - z)^{-1} = (A_0^+ - z)^{-1} + \bigl( \gamma (A_0^+ - \bar{z})^{-1} \bigr)^* (T(z) + B)^{-1} \gamma (A_0^+ - z)^{-1}. \tag{3.2} \]

Proof. Obviously, $T(\bar{z}) = T(z)^*$. For any function $f \in L^2(\mathbb{R}_+) \cap H^2(\mathbb{R}_+ \setminus \{t_n\}_{n \in J})$ we introduce
\[ (\gamma_\pm f)_n := \begin{pmatrix} f(t_n \pm) \\ f'(t_n \pm) \end{pmatrix}. \]
It is straightforward to check that
\[ \gamma_\pm \bigl( \gamma (A_0^+ - \bar{z})^{-1} \bigr)^* = -T(z) \pm \frac{1}{2} J, \qquad J_{nm} := \delta_{nm} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \]
Therefore, the resolvent formula implies that
\[ T(z) - T(\zeta) = (\zeta - z)\, \gamma_\pm (A_0^+ - \zeta)^{-1} \bigl( \gamma (A_0^+ - \bar{z})^{-1} \bigr)^* \]
and so
\[ T(z) - T(\zeta) = (\zeta - z)\, \gamma (A_0^+ - \zeta)^{-1} \bigl( \gamma (A_0^+ - \bar{z})^{-1} \bigr)^*. \]
Hence, by the abstract result of Posilicano [19] there exists a self-adjoint operator $G$, say, with $(G - z)^{-1}$ given by the right-hand side of (3.2). We need to prove that $G = A^+$.

Now assume that $f \in \operatorname{dom} G$. From (3.2) one sees that $f \in H^2(\mathbb{R}_+ \setminus \{t_n\}_{n \in J})$ and $Gf = -f''$. Let $a_\pm := \gamma_\pm f$ and $c := \gamma (A_0^+ - z)^{-1} (G - z) f$. (Note that $(A_0^+ - z)^{-1} (G - z) f$ and its derivative are continuous.) Applying (3.2) to $(G - z) f$ we learn that
\[ a_\pm = c + \Bigl( -T(z) \pm \frac{1}{2} J \Bigr) (T(z) + B)^{-1} c = \Bigl( B \pm \frac{1}{2} J \Bigr) (T(z) + B)^{-1} c. \tag{3.3} \]
Decomposing $J = \hat{J} \mathbin{\dot{\cup}} \check{J}$ where $\hat{J} := \{n \in J : b_n < \infty\}$ and accordingly $\mathcal{H} = \hat{\mathcal{H}} \oplus \check{\mathcal{H}}$ and $a_\pm = \hat{a}_\pm + \check{a}_\pm$, we see from (3.3) after eliminating $c$ that $\hat{a}_+ = (B + \frac{1}{2} J)(B - \frac{1}{2} J)^{-1} \hat{a}_-$. (Note that the inverse $(B - \frac{1}{2} J)^{-1}$ is well-defined on $\hat{\mathcal{H}}$.) Calculating the product of the two matrices we find jump condition (2.1) for $f$. For $n \in \check{J}$, (3.3) says that the second component of $a_{+,n}$ and the first component of $a_{-,n}$ are zero, which again are the claimed boundary conditions for $f$.

Our first application of the resolvent formula is to prove that weak convergence of measures implies strong resolvent convergence of the associated operators.

Proposition 10. Let $\{\mu^{(j)}\}_{j=1}^\infty \subset \mathcal{M}_a^{\varepsilon,+}$ for some $\varepsilon > 0$ and assume that $\mu^{(j)} \to \mu$ weakly. Then $A_{\mu^{(j)}}^+ \to A_\mu^+$ in strong resolvent sense.
Proof. Let $f \in L^2(0, \infty)$ with compact support. The assertion will follow if we can prove that for some sufficiently large $\kappa > 0$ one has
\[ \bigl( f, (A_{\mu^{(j)}}^+ + \kappa^2)^{-1} f \bigr) \to \bigl( f, (A_\mu^+ + \kappa^2)^{-1} f \bigr). \]
(Here we use that weak resolvent convergence is the same as strong resolvent convergence and that the operators $A_{\mu^{(j)}}^+$ and $A_\mu^+$ are all non-negative, so that it suffices to verify the convergence at a single point $-\kappa^2$ of the resolvent set.) We introduce the operators $T^{(j)}(-\kappa^2)$, $B^{(j)}$ and $\gamma^{(j)}$ in the obvious way and write $a^{(j)} := \gamma^{(j)} (A_0^+ + \kappa^2)^{-1} f$ and $a := \gamma (A_0^+ + \kappa^2)^{-1} f$. In view of the resolvent formula (3.2), we need to show that
\[ \bigl( a^{(j)}, (T^{(j)}(-\kappa^2) + B^{(j)})^{-1} a^{(j)} \bigr) \to \bigl( a, (T(-\kappa^2) + B)^{-1} a \bigr). \tag{3.4} \]
We shall assume that $J^{(j)} = J = \mathbb{N}$ for any $j$, the other cases being similar. We claim that
\[ a^{(j)} \to a \quad \text{in } \ell^2(\mathbb{N}, \mathbb{C}^2). \tag{3.5} \]
Indeed, the weak convergence implies that $t_n^{(j)} \to t_n$ and hence, since $(A_0^+ + \kappa^2)^{-1} f$ is $C^1$, that $a_n^{(j)} \to a_n$ for each $n$. Moreover, for all $t_n^{(j)} \geq \sup \operatorname{supp} f =: M$, one has $a_n^{(j)} = c\, e^{-\kappa t_n^{(j)}} (1, -\kappa)$ with a constant $c$ depending on $f$ but not on $j$ or $n$. Since $t_n^{(j)} \geq \varepsilon n$, it follows that $|a_n^{(j)}|^2 \leq |c|^2 (1 + \kappa^2) e^{-2\kappa\varepsilon n}$ for $n \geq M/\varepsilon$ and similarly for $a$. From this one easily deduces (3.5).

In the proof of Lemma 13 below, we show that for sufficiently large $\kappa$, the operators $(T^{(j)}(-\kappa^2) + B^{(j)})^{-1}$ are uniformly bounded in $j$. Moreover, the weak convergence of the measures implies that $T^{(j)}(-\kappa^2) + B^{(j)} \to T(-\kappa^2) + B$ strongly. With the help of the resolvent identity one deduces that $(T^{(j)}(-\kappa^2) + B^{(j)})^{-1} \to (T(-\kappa^2) + B)^{-1}$ strongly. This, together with (3.5), implies (3.4).
Our second application of (3.2) will be to derive an expression for the m-function. Let us recall the definition. For later purposes we consider whole-line operators. As in Sec. 2 we can associate to each measure $\mu \in \mathcal{M}_a^0$ on $\mathbb{R}$ a whole-line operator $A = A_\mu$ acting as the Laplacian away from $\operatorname{supp} \mu$ on functions satisfying “jump conditions” (2.1) for $t_n \in \operatorname{supp} \mu$ and $(\sqrt{b_n} + 1)/(\sqrt{b_n} - 1) = \mu(\{t_n\})$ (with the same modification as before if $b_n = \infty$). This defines a self-adjoint, non-negative operator in $L^2(\mathbb{R})$. Therefore, for any $z \in \mathbb{C} \setminus [0, \infty)$ there exist functions $f_\pm(z; \cdot)$ solving $-f'' = zf$ in $\mathbb{R} \setminus \operatorname{supp} \mu$, satisfying “jump conditions” (2.1) and lying in $L^2$ at $\pm\infty$. (For example, choose $f_+(z; t) = ((A - z)^{-1} g)(t)$ where $g$ is supported near $-\infty$ and $t$ is to the right of $\operatorname{supp} g$, and continue $f$ to the left.) Since $f_\pm$ is defined uniquely only up to a multiplicative constant, it is natural to consider
\[ m_\pm(z; t) = \pm \frac{f_\pm'(z; t)}{f_\pm(z; t)}, \]
the m-functions of $A$. Note that if $\mu(\{0\}) = 0$ and $t \geq 0$, then $m_+(z; t)$ depends only on the restriction of $\mu$ to $\mathbb{R}_+$, and therefore we will also speak of the m-function of $A^+$.
The promised formula is

Corollary 11. Let $\mu \in \mathcal{M}_a^{0,+}$ as in (3.1). Then for all $z = k^2 \in \mathbb{C} \setminus [0, \infty)$, $\operatorname{Im} k > 0$,
\[ m_+(k^2; 0) = ik + \sum_{n,m} e^{ik(t_n + t_m)} \begin{pmatrix} 1 \\ ik \end{pmatrix}^{T} (T(k^2) + B)^{-1}_{n,m} \begin{pmatrix} 1 \\ ik \end{pmatrix}. \tag{3.6} \]
Here $(T(k^2) + B)^{-1}_{n,m}$ is the $(n, m)$-entry (a $2 \times 2$-matrix) of the operator $(T(k^2) + B)^{-1}$.

Proof. Since
\[ m_+(z; 0) = \frac{\partial^2}{\partial t\, \partial u} (A^+ - z)^{-1}(t, u) \Big|_{(t,u) = (0,0)}, \]
this follows from (3.2).

3.2. Uniqueness

Our goal in this subsection is to prove that the m-function of $A^+$ uniquely determines the measure $\mu$.

Proposition 12. Let $\mu, \tilde{\mu} \in \mathcal{M}_a^{0,+}$ with corresponding operators $A^+ = A_\mu^+$, $\tilde{A}^+ = A_{\tilde{\mu}}^+$. Assume that the corresponding m-functions satisfy $m_+(z; t) = \tilde{m}_+(z; t)$ for some $0 \leq t < \min\{\inf \operatorname{supp} \mu, \inf \operatorname{supp} \tilde{\mu}\}$ and all $z \in \mathbb{C}^+$. Then $\mu = \tilde{\mu}$.

This is an analog of the famous Borg–Marchenko result in the Schrödinger case. It has been generalized to perturbations by measures in [3], but the result seems to be new for perturbations by boundary conditions (2.1). Our proof below relies on the expression (3.6) from which we will derive

Lemma 13. Let $0 \neq \mu \in \mathcal{M}_a^{0,+}$ and put $t_1 := \inf \operatorname{supp} \mu$ and $\mu(\{t_1\}) =: (\sqrt{b_1} + 1)/(\sqrt{b_1} - 1)$. Then for large, real $\kappa$,
\[ m_+(-\kappa^2; 0) + \kappa = -\frac{b_1 - 1}{b_1 + 1}\, 2\kappa e^{-2\kappa t_1} (1 + o(1)). \tag{3.7} \]

Of course, if $b_1 = \infty$ then $\frac{b_1 - 1}{b_1 + 1} = 1$. Accepting Lemma 13 for the moment, we turn to the

Proof of Proposition 12. By translation invariance, we may assume that $t = 0$. Hence by Lemma 13 either both $\mu$ and $\tilde{\mu}$ are zero, or else they are both not and then $t_1 := \inf \operatorname{supp} \mu = \inf \operatorname{supp} \tilde{\mu}$ and $\beta_1 := \mu(\{t_1\}) = \tilde{\mu}(\{t_1\})$. Now choose $t_1 < s < \min\{\inf \operatorname{supp}(\mu - \beta_1 \delta_{t_1}), \inf \operatorname{supp}(\tilde{\mu} - \beta_1 \delta_{t_1})\}$. Solving the equation $-f_+'' = z f_+$ explicitly on the interval $[0, s]$, we can write $m_+(z; s)$ as a fractional linear function of $m_+(z; 0)$ with coefficients depending only on $s$, $z$, $t_1$, $\beta_1$. Hence $m_+(z; s) = \tilde{m}_+(z; s)$ for all $z$. Now iterate.
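For a measure with a single atom, both the formula (3.6) and the leading term (3.7) can be checked numerically against a direct computation. The sketch below is an illustration under stated assumptions rather than anything from the paper: the function names are hypothetical, the single-atom block of $T(k^2) + B$ is written out by hand from the definitions in Sec. 3.1 (with $n = m = 1$, so $\sigma_{11} = 0$), and the direct computation propagates the decaying solution $e^{ikt}$ back through the jump (2.1) to $t = 0$.

```python
import cmath
import math

def m_plus_transfer(k, t1, b1):
    """m_+(k^2; 0) for one atom at t1, obtained by propagating e^{ikt} back to 0."""
    s = math.sqrt(b1)
    # Data just right of the atom (the common factor e^{i k t1} cancels in f'/f).
    f_right, df_right = 1.0, 1j * k
    # Undo the jump (2.1): f(t1+) = sqrt(b) f(t1-), f'(t1+) = f'(t1-) / sqrt(b).
    f_left, df_left = f_right / s, s * df_right
    # Free propagation of -f'' = k^2 f from t1 back to 0.
    c, sn = cmath.cos(k * t1), cmath.sin(k * t1)
    f0 = c * f_left - sn / k * df_left
    df0 = k * sn * f_left + c * df_left
    return df0 / f0

def m_plus_formula(k, t1, b1):
    """Right-hand side of (3.6) specialised to a single atom."""
    s = math.sqrt(b1)
    beta = (s + 1.0) / (s - 1.0)
    q = cmath.exp(2j * k * t1)
    # Entries of the symmetric 2x2 block T(k^2)_{11} + B_{11}.
    a11 = (1.0 - q) / (2j * k)
    a12 = (beta - q) / 2.0
    a22 = -1j * k * (1.0 + q) / 2.0
    det = a11 * a22 - a12 * a12
    v1, v2 = 1.0, 1j * k
    # v^T M^{-1} v, using M^{-1} = [[a22, -a12], [-a12, a11]] / det.
    quad = (a22 * v1 * v1 - 2.0 * a12 * v1 * v2 + a11 * v2 * v2) / det
    return 1j * k + q * quad

# Illustrative parameter values (assumptions, not taken from the paper).
kappa, t1, b1 = 5.0, 1.0, 3.0
m_exact = m_plus_transfer(1j * kappa, t1, b1).real
leading = -(b1 - 1.0) / (b1 + 1.0) * 2.0 * kappa * math.exp(-2.0 * kappa * t1)
ratio = (m_exact + kappa) / leading  # should be close to 1 for large kappa
```

In this one-atom case both routes reduce to $m_+(k^2; 0) = ik\,(1 + r q)/(1 - r q)$ with $q = e^{2ikt_1}$ and $r = (b_1 - 1)/(b_1 + 1)$, so the agreement is exact up to rounding, and the ratio to the leading term of (3.7) equals $1/(1 - rq)$.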
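Proof of Lemma 13. We shall use the expression for $m_+(-\kappa^2; 0)$ from Corollary 11. In order to calculate the asymptotics as $\kappa \to \infty$, we decompose
\[ T(-\kappa^2) = T^0(-\kappa^2) + T^R(-\kappa^2), \qquad T^0(-\kappa^2)_{nm} := \delta_{nm} \begin{pmatrix} -\dfrac{1}{2\kappa} & 0 \\ 0 & \dfrac{\kappa}{2} \end{pmatrix}. \]
One easily estimates for all large $\kappa$
\[ \| T^R(-\kappa^2)_{n,m} \|_{\mathbb{C}^2 \to \mathbb{C}^2} \leq \begin{cases} \mathrm{const}\, \kappa e^{-2\kappa t_n} & \text{if } n = m, \\ \mathrm{const}\, \kappa e^{-\kappa |t_n - t_m|} & \text{if } n \neq m. \end{cases} \]
Hence by a matrix-valued version of Schur’s lemma
\[ \| T^R(-\kappa^2) \|_{\mathcal{H} \to \mathcal{H}} \leq \sup_m \sum_n \| T^R(-\kappa^2)_{n,m} \|_{\mathbb{C}^2 \to \mathbb{C}^2} \leq \mathrm{const}\, \kappa \bigl( e^{-2\kappa t_1} + e^{-\kappa\varepsilon} \bigr), \]
where we used that $\varepsilon := \inf_{n \neq m} |t_n - t_m| > 0$ and hence $|t_n - t_m| \geq \varepsilon |n - m|$. On the other hand, the eigenvalues of $T^0(-\kappa^2) + B$ are easily calculated and one finds that the smallest (in absolute value) eigenvalue is bounded away from zero by a constant times $\kappa^{-1}$ independently of the $b_n$. (To be a bit more explicit, the positive eigenvalue of $(T^0(-\kappa^2) + B)_{nn}$ is larger than $\kappa/2$ and the negative eigenvalue is smaller than $-1/(2\kappa)$.) Hence both $T^0(-\kappa^2) + B$ and $T(-\kappa^2) + B = T^0(-\kappa^2) + B + T^R(-\kappa^2)$ are invertible, and the norms of their inverses are bounded from above by a constant times $\kappa$. We conclude that
\[ \| (T(-\kappa^2) + B)^{-1} - (T^0(-\kappa^2) + B)^{-1} \| = \| (T(-\kappa^2) + B)^{-1}\, T^R(-\kappa^2)\, (T^0(-\kappa^2) + B)^{-1} \| \leq \mathrm{const}\, \kappa^3 \bigl( e^{-2\kappa t_1} + e^{-\kappa\varepsilon} \bigr), \]
and so by (3.6)
\begin{align*}
m_+(-\kappa^2; 0) &= -\kappa + \sum_n e^{-2\kappa t_n} \begin{pmatrix} 1 \\ -\kappa \end{pmatrix}^{*} (T^0(-\kappa^2) + B)^{-1}_{n,n} \begin{pmatrix} 1 \\ -\kappa \end{pmatrix} + O\bigl( \kappa^5 e^{-2\kappa t_1} ( e^{-2\kappa t_1} + e^{-\kappa\varepsilon} ) \bigr) \\
&= -\kappa - \frac{b_1 - 1}{b_1 + 1}\, 2\kappa e^{-2\kappa t_1} + O\bigl( \kappa^5 e^{-2\kappa t_1} ( e^{-2\kappa t_1} + e^{-\kappa\varepsilon} ) \bigr),
\end{align*}
as claimed.

Remark. Looking at the previous proof, we see that the assumption $\inf_{n \neq m} |t_n - t_m| > 0$ can be significantly relaxed to $\sum_{n : n \neq m} e^{-\kappa |t_n - t_m|} = o(\kappa^{-4})$ uniformly in $m$.

4. Proof of Theorems 6–8

4.1. A Remling-type theorem

Roughly speaking, Remling’s theorem states that for a given one-dimensional Schrödinger operator, $A$, (with a natural boundedness assumption on the potential) any right-limit of $A$ is reflectionless on the absolutely continuous spectrum of $A$.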
Clearly, the two central notions behind this theorem are that of right-limit and that of a reflectionless operator. We proceed to define these notions in our setting and formulate the version of Remling’s Theorem that will be useful for us.

Definition 14. Let $\mu, \hat{\mu} \in \mathcal{M}_a^0$. The measure $\hat{\mu}$ is said to be a right-limit of $\mu$ if there exists a strictly increasing sequence, $s_j \to \infty$, such that for every continuous, compactly supported function $f$ on $\mathbb{R}$,
\[ \lim_{j \to \infty} \int_{\mathbb{R}} f(t - s_j) \, d\mu(t) = \int_{\mathbb{R}} f(t) \, d\hat{\mu}(t). \tag{4.1} \]

Let $\mu \in \mathcal{M}_a^{0,+}$ with $\mu(\{0\}) = 0$ and let $A$ and $A^+$ be the corresponding whole-line and half-line operators. Recall that we have introduced the m-functions $m_\pm(z; t)$ before Corollary 11. Of course, $m_+(z; 0)$ depends only on the restriction of $\mu$ to $[0, \infty)$. Since $m_+(z; 0)$ is a Herglotz function of $z \in \mathbb{C}^+$, its boundary values on the real line exist a.e. We denote by $\Sigma_{ac}(A^+)$ the set
\[ \Sigma_{ac}(A^+) = \{ E \in \mathbb{R} \mid 0 < \operatorname{Im} m_+(E + i0; 0) < \infty \}. \tag{4.2} \]
$\Sigma_{ac}(A^+)$ is an essential support of the absolutely continuous spectrum of $A^+$. In particular, $A^+$ has absolutely continuous spectrum iff $\Sigma_{ac}(A^+)$ has positive Lebesgue measure (see, e.g., [23]).

Definition 15. Fix $\Lambda \subseteq \mathbb{R}$ and let $\mu \in \mathcal{M}_a^0$. The operator $A_\mu$ is called reflectionless on $\Lambda$ if for all $t \in \mathbb{R} \setminus \operatorname{supp} \mu$ and for almost every $E \in \Lambda$, $m_+(E + i0; t) = -\overline{m_-(E + i0; t)}$.

We are now in a position to formulate the version of Remling’s Theorem appropriate for our setting.

Theorem 16. Let $A^+ = A_\mu^+$ for some $\mu \in \mathcal{M}_a^{0,+}$ and let $\hat{\mu}$ be a right-limit of $\mu$. Then $A_{\hat{\mu}}$ is reflectionless on $\Sigma_{ac}(A^+)$.
Proof. The proof of this theorem is essentially the same as the proof of [21, Theorem 4.1], but let us make a few remarks. As stated in the Introduction, Remling’s Theorem follows from a result of Breimesser–Pearson ([4, Theorem 1]) which states that, on $\Sigma_{ac}(A_\mu^+)$, the value distribution of $-\frac{v'(E + i0; t)}{v(E + i0; t)}$ and $-\overline{m_+(E + i0; t)}$ are asymptotically equal (as $t \to \infty$). Here $v(z; t)$ is the Dirichlet solution to the formal equation $A_\mu^+ v = zv$ (generally, $v \notin L^2$), and $m_+(z; t)$ is the m-function for $A_\mu^+$. (For the concept of value distribution see [4, 5].)

Remling’s Theorem follows from this if we can prove that if $\hat{\mu}$ is a right-limit of $\mu$ then $\lim_{j \to \infty} m_+(z; s_j) = \hat{m}_+(z; 0)$ and $\lim_{j \to \infty} -\frac{v'(z; s_j)}{v(z; s_j)} = \hat{m}_-(z; 0)$ uniformly on compact subsets of $\mathbb{C}^+$, where $\{s_j\}$ is the sequence from Definition 14 and $\hat{m}_\pm$ are the m-functions for $A_{\hat{\mu}}$. To show this, consider $\hat{m}_+(z; 0)$ and $A_{\hat{\mu}}^+$, the right half-line restriction of $A_{\hat{\mu}}$. First, note that Green’s formula (see [9]):
\[ \int_{t_1}^{t_2} (-f''(t))\, \bar{g}(t) \, dt - \int_{t_1}^{t_2} f(t)\, (-\bar{g}''(t)) \, dt = W(f, \bar{g})(t_2) - W(f, \bar{g})(t_1) \tag{4.3} \]
holds in our case, by integration by parts along intervals which contain no atoms of µ ˆ and then by summing up the resulting telescoping sum. Therefore, the Weyl nested disk construction (see, e.g., [9]) works in our setting to show that, for any δ > 0 there exists N > 0 such that, if µ ˜ agrees with µ ˆ on (0, N ), then m ˜ + (z; 0), the m-function , lies in a disk of radius no bigger than δ which also contains m ˆ + (z; 0). for A+ µ ˜ obius transformation given Explicitly, this disk is the image of C+ ∪ R under the M¨ by the matrix Tz (N )−1 , where Tz (N ) is defined by
f (0) f (N ) Tz (N ) = f (0) f (N ) for any formal solution, f , of A+ µ ˆ f = zf . Using (2.1) for the atoms and the solutions of the free equation along the intervals between them, it is possible to write this matrix as a product of simple matrices, and so note that its entries are continuous functions of the parameters defining the restriction of µ ˆ to (0, N ). Thus, the center and radius of the disk are continuous functions of these parameters. Recalling the ˆ + (z; 0). The definition of right-limit, this implies the convergence of m+ (z; sj ) to m v (z;sj ) convergence of − v(z;sj ) to m ˆ − (z; 0) is established in a similar way, by considering − Aµˆ , the restriction of Aµˆ to (−∞, 0) with a Dirichlet boundary condition. As for the applicability of Theorem 1 of Breimesser–Pearson [4], the only thing that actually depends on the particular properties of the model is Lemma 3 there and its corollary. Once again, since Green’s formula holds in our case it is easy to see that the proof goes through with no change. 4.2. Proof of Theorem 6 Assume, by contradiction, that lim supn→∞ (tn+1 − tn ) = ∞ and that Σac (A+ ) has positive Lebesgue measure. Note that A+ is non-negative, so Σac (A+ ) ⊂ [0, ∞). Let µ = ∞ n=1 βn δtn . By the hypothesis on tn and a compactness argument, µ has ˆ(−∞, 0] = 0 while µ ˆ(0, ∞) = 0. a right-limit µ ˆ ∈ M0a such that µ ˆ + (E + Now, Theorem 16 implies√that Aµˆ is reflectionless on Σac (A+ ). Hence m + ¯ / supp µ ˆ. Since a i0; t) = −m ˆ − (E + i0; t) = i E for all E ∈ Σac (A ) and all t ∈ Herglotz function is uniquely determined by its values on a set of positive measure (see, e.g. [28, Appendix B]), one has m ˆ + (k 2 ; t) = ik for all k with Im k > 0 and all t∈ / supp µ ˆ , and hence by Proposition 12, µ ˆ(0, ∞) = 0, a contradiction. 4.3. Proof of Theorem 7 First note that in view of (2.3), 0 is not an eigenvalue of A+ . 
Hence we need to show that any function $f$ satisfying $f(0) = 0$ and (2.1) and solving
\[ -f''(x) = E f(x) \quad \text{in } \mathbb{R}_+ \setminus \{t_n\}_{n=1}^{\infty} \tag{4.4} \]
for some $E \in (0, \infty)$ is not in $L^2$.
Any solution of (4.4) satisfies
\[ \begin{pmatrix} f(x) \\ f'(x) \end{pmatrix} = T(x, y, E) \begin{pmatrix} f(y) \\ f'(y) \end{pmatrix} \tag{4.5} \]
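for a certain $2 \times 2$ matrix $T(x, y, E)$ of determinant 1. It follows from [25, Theorem 2.1] that if
\[ \int_0^{\infty} \frac{dx}{\|T(x, 0, E)\|^2} = \infty \tag{4.6} \]
then (4.4) has no solution in $L^2(0, \infty)$. If $x, y \in (t_n, t_{n+1})$ for some fixed $n$ then
\[ \begin{pmatrix} f(x) \\ f'(x) \end{pmatrix} = \begin{pmatrix} \cos k(x - y) & k^{-1} \sin k(x - y) \\ -k \sin k(x - y) & \cos k(x - y) \end{pmatrix} \begin{pmatrix} f(y) \\ f'(y) \end{pmatrix} \tag{4.7} \]
where $k = \sqrt{E} > 0$. The norm of this matrix is bounded by $\max(k, k^{-1})$. Furthermore, the jump condition is taken into account by the jump matrix
\[ S_n = \begin{pmatrix} \sqrt{b_n} & 0 \\ 0 & \frac{1}{\sqrt{b_n}} \end{pmatrix}. \tag{4.8} \]
Thus, if $x \in (t_n, t_{n+1})$,
\[ \|T(x, 0, E)\| \le (\max(k, k^{-1}))^{n+1} \prod_{m=1}^{n} \sqrt{b_m} \le n^n, \tag{4.9} \]
for sufficiently large $n$ (and any fixed $E \in (0, \infty)$). This implies
\[ \int_{t_n}^{t_{n+1}} \frac{dx}{\|T(x, 0, E)\|^2} \ge \frac{t_{n+1} - t_n}{n^{2n}}, \tag{4.10} \]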
which, by (1.5), implies the result.
4.4. Proof of Theorem 8
It is easy to see (cf. also [21]) that $\mathcal{M}_a^{\varepsilon,C,+}$ is a complete (indeed, compact) metric space where the topology coincides with that of weak convergence of measures. According to Proposition 10 weak convergence of measures implies convergence in the strong resolvent sense for the corresponding operators. Thus, it follows from [22, Theorem 1.1] that the set of operators in $\mathcal{M}_a^{\varepsilon,C,+}$ with no eigenvalues in $[0, \infty)$ is a $G_\delta$ set. Moreover, [22, Theorem 1.2] implies that the set of operators with no absolutely continuous spectrum in $(0, \infty)$ (and so also in $[0, \infty)$) in $\mathcal{M}_a^{\varepsilon,C,+}$ is a $G_\delta$ set. To complete the proof it suffices to note that the set of measures in $\mathcal{M}_a^{\varepsilon,C,+}$ with $\{t_n\}$ satisfying (1.5) is a dense set (since, for any given measure one may take a measure coinciding with it on a fixed bounded set, but satisfying condition (1.5)), and for such measures the spectrum of the corresponding operator is purely singular continuous by Theorem 7.
Acknowledgments We are grateful to Barry Simon for useful discussions. RF appreciates the warm hospitality of Caltech, where part of this work has been done, and acknowledges support through DAAD grant D/06/49117.
References
[1] M. Aizenman, R. Sims and S. Warzel, Absolutely continuous spectra of quantum tree graphs with weak disorder, Comm. Math. Phys. 264 (2006) 371–389.
[2] C. Allard and R. Froese, A Mourre estimate for a Schrödinger operator on a binary tree, Rev. Math. Phys. 12 (2000) 1655–1667.
[3] A. Ben Amor and C. Remling, Direct and inverse spectral theory of one-dimensional Schrödinger operators with measures, Integral Equations Operator Theory 52 (2005) 395–417.
[4] S. V. Breimesser and D. B. Pearson, Asymptotic value distribution for solutions of the Schrödinger equation, Math. Phys. Anal. Geom. 3 (2000) 385–403.
[5] S. V. Breimesser and D. B. Pearson, Geometrical aspects of spectral theory and value distribution for Herglotz functions, Math. Phys. Anal. Geom. 6 (2003) 29–57.
[6] J. Breuer, Singular continuous spectrum for the Laplacian on certain sparse trees, Comm. Math. Phys. 269 (2007) 851–857.
[7] J. Breuer, Singular continuous and dense point spectrum for sparse trees with finite dimensions, in Probability and Mathematical Physics: A Volume in Honor of Stanislav Molchanov, eds. D. Dawson, V. Jakšić and B. Vainberg, CRM Proc. and Lecture Notes, Vol. 27 (Amer. Math. Soc., 2007), pp. 65–84.
[8] R. Carlson, Nonclassical Sturm–Liouville problems and Schrödinger operators on radial trees, Electron. J. Differential Equations 71 (2000) 24 pp. (electronic).
[9] E. A. Coddington and N. Levinson, Theory of Ordinary Differential Equations (McGraw-Hill, New York, 1955).
[10] T. Ekholm, R. L. Frank and H. Kovarik, Eigenvalue estimates for Schrödinger operators on metric trees, preprint (2007); arXiv:0710.5500v1.
[11] K. Fujiwara, The Laplacian on rapidly branching trees, Duke Math. J. 83 (1996) 191–202.
[12] P. D. Hislop and O. Post, Anderson localization for radial tree-like random quantum graphs, preprint (2006); arXiv:math-ph/0611022.
[13] P. Kuchment, Quantum graphs: An introduction and a brief survey, in Analysis on Graphs and Its Applications, eds. P. Exner, J. P. Keating, T. Sunada and A. Teplyaev, Proc. Symp. Pure Math., Vol. 77 (Amer. Math. Soc., 2008), pp. 291–314.
[14] Y. Last, Spectral theory of Sturm–Liouville operators on infinite intervals: A review of recent developments, in Sturm–Liouville Theory: Past and Present, eds. W. O. Amrein, A. M. Hinz and D. B. Pearson (Birkhäuser Verlag, Basel, 2005), pp. 99–120.
[15] Y. Last, Exotic spectra: A review of Barry Simon's central contributions, in Spectral Theory and Mathematical Physics: A Festschrift in Honor of Barry Simon's 60th Birthday, eds. F. Gesztesy, P. Deift, C. Galvez, P. Perry and W. Schlag (American Mathematical Society, Providence, RI, 2007), pp. 697–712.
[16] L. Malozemov and A. Teplyaev, Pure point spectrum of the Laplacians on fractal graphs, J. Funct. Anal. 129 (1995) 390–405.
[17] K. Naimark and M. Solomyak, Eigenvalue estimates for the weighted Laplacian on metric trees, Proc. London Math. Soc. (3) 80 (2000) 690–724.
[18] D. B. Pearson, Singular continuous measure in scattering theory, Comm. Math. Phys. 60 (1978) 13–36.
[19] A. Posilicano, A Kreĭn-like formula for singular perturbations of self-adjoint operators and applications, J. Funct. Anal. 183 (2001) 109–147.
[20] C. Remling, The absolutely continuous spectrum of Jacobi matrices, preprint (2007); arXiv:math-sp/0706.1101.
[21] C. Remling, The absolutely continuous spectrum of one-dimensional Schrödinger operators, Math. Phys. Anal. Geom. 10 (2007) 359–373.
[22] B. Simon, Operators with singular continuous spectrum. I. General operators, Ann. of Math. (2) 141(1) (1995) 131–145.
[23] B. Simon, Spectral analysis of rank one perturbations and applications, in Proc. Mathematical Quantum Theory, II: Schrödinger Operators, eds. J. Feldman, R. Froese and L. Rosen, CRM Proceedings and Lecture Notes, Vol. 8 (Amer. Math. Soc., Providence, RI, 1995), pp. 109–149.
[24] B. Simon, Operators with singular continuous spectrum, VI: Graph Laplacians and Laplace–Beltrami operators, Proc. Amer. Math. Soc. 124 (1996) 1177–1182.
[25] B. Simon and G. Stolz, Operators with singular continuous spectrum. V. Sparse potentials, Proc. Amer. Math. Soc. 124(7) (1996) 2073–2080.
[26] M. Solomyak, On the spectrum of the Laplacian on regular metric trees, Waves Random Media 14 (2004) S155–S171.
[27] A. Teplyaev, Spectral analysis on infinite Sierpiński gaskets, J. Funct. Anal. 159 (1998) 537–567.
[28] G. Teschl, Jacobi Operators and Completely Integrable Nonlinear Lattices, Mathematical Surveys and Monographs, Vol. 72 (American Mathematical Society, Providence, RI, 2000).
August 12, 2009 4:0 WSPC/148-RMP
J070-00380
Reviews in Mathematical Physics Vol. 21, No. 7 (2009) 947–948 c World Scientific Publishing Company
ERRATA CONTINUITY OF A CLASS OF ENTROPIES AND RELATIVE ENTROPIES
[Reviews in Mathematical Physics, Vol. 16, No. 6 (2004) 809–822]
JAN NAUDTS Departement Natuurkunde, Universiteit Antwerpen UIA, Universiteitsplein 1, 2610 Antwerpen, Belgium
[email protected]
Conditions are given under which the inequality (B.1) of the cited paper is valid.
The inequality (B.1) of [1, Appendix B] is not generally valid under the given assumptions. It must be replaced by the following statement:
Theorem 1. Let $g(x)$ be a concave increasing function on $[0, 1]$ satisfying $g(0) = 0$. Let $\phi(x) = -1/g''(x)$. If $\phi(x)$ satisfies $\phi(\lambda u) \ge \lambda \phi(u)$, then for all $x, \lambda$ in $[0, 1]$
\[ g(\lambda) g(x) \ge g(1) g(\lambda x). \tag{1} \]
Proof. Let
\[ f(\lambda) = g(\lambda) g(x) - g(1) g(\lambda x). \tag{2} \]
One calculates, using $x g''(\lambda x) \ge g''(\lambda)$, $g(x) \ge 0$, $g''(\lambda x) < 0$, and $g(x) > x g(1)$,
\[ \frac{d^2 f}{d\lambda^2} = g''(\lambda) g(x) - x^2 g(1) g''(\lambda x) \le x g''(\lambda x) [g(x) - x g(1)] \le 0. \tag{3} \]
Hence, $f(\lambda)$ is concave. Because it satisfies $f(0) = f(1) = 0$, one concludes that $f(\lambda) \ge 0$ for all $\lambda$ in $[0, 1]$. This proves (1). Application of this theorem in the way described in [1] reproduces (45) of [1] under the additional assumption that $\phi(\lambda u) \ge \lambda \phi(u)$ for all $u, \lambda$ in $(0, 1)$. As a consequence, also (48) and (49) of [1] have been proved only under the additional assumption.
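As an illustration only, the inequality of Theorem 1 can be checked numerically on a grid. The choice g(x) = log(1 + x) below is an assumption made for this sketch and does not come from the erratum; for it, φ(x) = (1 + x)^2 and the condition φ(λu) ≥ λφ(u) reduces to (1 − λ)(1 − λu^2) ≥ 0 on [0, 1].

```python
import math

def g(x):
    """Illustrative choice (an assumption, not from the erratum): g(x) = log(1 + x)."""
    return math.log1p(x)

def phi(x):
    """phi = -1/g''. Here g''(x) = -1/(1 + x)^2, hence phi(x) = (1 + x)^2."""
    return (1.0 + x) ** 2

def f(lam, x):
    """f(lambda) = g(lambda) g(x) - g(1) g(lambda x), as in Eq. (2)."""
    return g(lam) * g(x) - g(1.0) * g(lam * x)

grid = [i / 50 for i in range(51)]

# Hypothesis of the theorem: phi(lambda * u) >= lambda * phi(u) on [0, 1]^2.
hypothesis_holds = all(
    phi(lam * u) >= lam * phi(u) - 1e-12 for lam in grid for u in grid
)

# Conclusion (1): f >= 0 on the grid, up to rounding.
min_f = min(f(lam, x) for lam in grid for x in grid)
```

A grid check of this kind cannot replace the proof; it only guards against sign or transcription slips in (1) to (3).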
Acknowledgement The author is grateful to Prof. A. El Kaabouchi for pointing out the mistake in [1]. Reference [1] J. Naudts, Continuity of a class of entropies and relative entropies, Rev. Math. Phys. 16(6) (2004) 809–822.
September 16, 2009 9:47 WSPC/148-RMP
J070-00378
Reviews in Mathematical Physics Vol. 21, No. 8 (2009) 949–979 c World Scientific Publishing Company
AN INTRODUCTION TO QUANTITATIVE POINCARÉ RECURRENCE IN DYNAMICAL SYSTEMS
BENOIT SAUSSOL Université Européenne de Bretagne, France and Université de Brest, CNRS, UMR 6205 Laboratoire de Mathématiques, ISSTB, 6 Av. Le Gorgeu, 29238 Brest cedex, France
[email protected] Received 25 March 2009 Revised 3 July 2009 We present some recurrence results in the context of ergodic theory and dynamical systems. The main focus will be on smooth dynamical systems, in particular, those with some chaotic/hyperbolic behavior. The aim is to compute recurrence rates, limiting distributions of return times, and short returns. We choose to give the full proofs of the results directly related to recurrence, avoiding as much as possible to hide the ideas behind technical details. This drove us to consider as our basic dynamical system a one-dimensional expanding map of the interval. We note, however, that most of the arguments still apply to higher dimensional or less uniform situations, so that most of the statements continue to hold. Some basic notions from the thermodynamic formalism and the dimension theory of dynamical systems will be recalled. Keywords: Return time; dimension; entropy; mixing; hyperbolic dynamics. Mathematics Subject Classification 2000: 37B20, 37C45, 37D50, 60F05, 94A17
1. Classical Recurrence Results in Ergodic Theory
In this section, we briefly present some classical results on recurrence in the general context of ergodic theory. Most of them are of a qualitative nature, and the main purpose here will be to give some quantitative refinement to them. From now on, we have a measure preserving dynamical system $(X, \mathcal{A}, T, \mu)$: $X$ is a space, $\mathcal{A}$ is a σ-algebra on $X$, $T : X \to X$ is a measurable map and $\mu$ a probability measure on $(X, \mathcal{A})$, such that $\mu(T^{-1}A) = \mu(A)$ for all $A \in \mathcal{A}$. We say that the system is ergodic if the invariant sets are trivial: $T^{-1}A = A$ implies $\mu(A) = 0$ or $\mu(A) = 1$.
1.1. Some examples of dynamical systems
(1) on the unit circle, the angle α-rotation $Tx = x + \alpha \bmod 1$;
(2) on the unit circle, the doubling map $Tx = 2x \bmod 1$;
(3) on the 2-torus, the cat map $Tx = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} x$.
These maps share a convenient property: they all preserve the Lebesgue measure.
(4) full shift on two symbols, $X = \{0, 1\}^{\mathbb{N}}$, $Tx = (x_{n+1})_n$ preserves the infinite product of a Bernoulli measure.
(5) shift of a stationary process: $W = (W_n)$ a stationary real valued process and $X = \mathbb{R}^{\mathbb{N}}$ with the shift map again, and the probability measure $P_W$.
1.2. Hitting and return time
Given a point $x \in X$ the sequence of iterations $x, Tx, T^2 x, \ldots, T^n x, \ldots$ is called its (forward) orbit. Given a set $A$ and an initial point $x$, the basic object of study here will be the (first) hitting time of the orbit of $x$ to the set $A$. We denote it by $\tau_A(x)$, defined by $\tau_A(x) = \min\{n : T^n x \in A, n = 1, 2, \ldots\}$ or $\tau_A(x) = +\infty$ if the (forward) orbit never enters $A$. When $x \in A$ we usually call $\tau_A(x)$ the (first) return time. The first theorem here could hardly be anything other than the famous Poincaré recurrence theorem itself:
Theorem 1. Let $A \in \mathcal{A}$ be a measurable set. Then for µ-almost all $x \in A$, the forward orbit $T^n x$, $n = 1, 2, \ldots$ belongs to $A$ infinitely often. We will call these points A-recurrent.
Proof. Let $n \ge 1$ be an integer. The disjoint union $\{\tau_A \le n\} = \{\tau_A \circ T \le n - 1\} \cup T^{-1}(A \cap \{\tau_A > n - 1\})$ gives, using the invariance of the measure,
\[ \mu(\tau_A = n) = \mu(A \cap \{\tau_A \ge n\}). \tag{1} \]
In particular we have $\mu(\tau_A = n) \ge \mu(A \cap \{\tau_A = +\infty\})$. The sets $\{\tau_A = n\}$, $n = 1, 2, \ldots$, are disjoint in a finite measure space, thus the left-hand side is summable in $n$. So is the right-hand side, hence $A \cap \{\tau_A = +\infty\}$ is a null set. Therefore
\[ \bigcup_{n \ge 0} T^{-n}(A \cap \{\tau_A = +\infty\}) \]
is again a null set. Every point of $A$ outside this union returns to $A$ infinitely often. Poincaré recurrence theorem tells in particular that $\tau_A(x) < +\infty$ for µ-almost every $x \in A$. We emphasize that this theorem is valid for any finite measure preserving dynamical system and any measurable set. Obviously, for zero measure sets, the statement is empty. Note that this statement only concerns return times, since the initial point $x$ needs to be in the set $A$.
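A minimal sketch of the hitting time and of relation (1), under the assumption that the system is a permutation of a finite set with the uniform measure (a toy measure preserving system chosen for this illustration; the names `hitting_time`, `perm` and the sizes are hypothetical). In this setting every quantity is an integer count, so (1) can be verified exactly.

```python
import random

def hitting_time(perm, A, x):
    """First n >= 1 with T^n(x) in A, or None when the orbit of x never enters A.

    perm is a permutation of range(len(perm)) playing the role of T.
    """
    y, n = perm[x], 1
    while y not in A:
        if y == x:          # the cycle of x closed up without meeting A
            return None
        y, n = perm[y], n + 1
    return n

# Hypothetical example: a fixed pseudo-random permutation of 200 points.
rng = random.Random(0)
N = 200
perm = list(range(N))
rng.shuffle(perm)
A = set(range(20))

tau = {x: hitting_time(perm, A, x) for x in range(N)}

def count_hitting_equal(n):
    """N * mu(tau_A = n), counted over the whole space."""
    return sum(1 for x in range(N) if tau[x] == n)

def count_return_at_least(n):
    """N * mu(A intersected with {tau_A >= n}); points of A always return here."""
    return sum(1 for x in A if tau[x] >= n)

relation_1_holds = all(
    count_hitting_equal(n) == count_return_at_least(n) for n in range(1, N + 1)
)
every_point_of_A_returns = all(tau[x] is not None for x in A)
```

For a bijection of a finite set recurrence is automatic, so the sketch only illustrates the bookkeeping behind (1), not the measure theoretic content of the theorem.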
When $X$ is a separable metric space (i.e. there is a dense countable subset) we obtain a corollary of topological nature, somewhat more concrete:
Theorem 2. Assume $(X, d)$ is a separable metric space and that $\mu$ is a Borel $T$-invariant measure. Then the orbit of almost any initial point returns arbitrarily close to the initial point: for µ-almost every $x$, there exists a subsequence $n_k$ such that $T^{n_k} x \to x$.
Proof. Let $\{B, B \in \mathcal{B}\}$ be a countable basis of $X$ (e.g., balls of rational radius centered at a dense sequence). By Poincaré recurrence theorem, for each set $B$ there exists a negligible $N_B \subset B$ such that any point in $B \setminus N_B$ is $B$-recurrent. The set $N = \cup_B N_B$ is negligible. Let $x \in X \setminus N$. Let $B_i \in \mathcal{B}$ be a sequence of sets with diameter going to zero and containing $x$. Since $x$ is $B_i$-recurrent, there exists an integer $n_i$ such that $T^{n_i} x \in B_i$. Therefore $T^{n_i} x \to x$.
1.3. Mean behavior of return times
We have just seen that the function $\tau_A$ is almost surely finite on $A$. If we denote by $\mu_A = \frac{1}{\mu(A)} \mu|_A$ the conditional measure on $A$, then we can look at the expectation of $\tau_A$:
Theorem 3 (Kac's Lemma [37]). Let $A \in \mathcal{A}$ be such that $\mu(A) > 0$. We have
\[ \int_A \tau_A \, d\mu = \mu(\{\tau_A < +\infty\}). \]
In particular, when the system is ergodic we have $\int_A \tau_A \, d\mu_A = \frac{1}{\mu(A)}$, i.e. the mean return time is equal to the inverse of the measure.
An elegant proof uses towers, however it requires the map to be bi-measurable.
Proof. We recall the relation (1) between hitting and return times: $\mu(\tau_A = n) = \mu(A \cap \{\tau_A \ge n\})$. Summing up over $n$ yields
\[ \mu(\tau_A < +\infty) = \sum_{n=1}^{\infty} \mu(A \cap \{\tau_A \ge n\}) = \int_A \tau_A \, d\mu. \]
For the last statement, observe that the set $\{\tau_A < +\infty\}$ is invariant by Poincaré recurrence theorem, hence of full measure when the system is ergodic and $\mu(A) > 0$. If $x \in A$ is such that $\tau_A(x) < +\infty$, then the iterate $T_A(x) = T^{\tau_A(x)} x$ is well defined and belongs to $A$ again. This defines (almost everywhere) an induced map on $A$, called the first return map to $A$.
Theorem 4. The system $(A, T_A, \mu_A)$ is a well defined measure preserving dynamical system. It is ergodic if the original system is ergodic.
Proof. Let $B \subset A$ be a measurable set. To prove the invariance of $\mu_A$ it is sufficient to prove that $\mu(T_A^{-1} B) = \mu(B)$. First,
\[ \mu(T_A^{-1} B) = \sum_{n=1}^{\infty} \mu(A \cap \{\tau_A = n\} \cap T^{-n} B). \]
We then refine Eq. (1) starting from the disjoint union
\[ \{\tau_A \le n\} \cap T^{-n-1} B = T^{-1}(\{\tau_A \le n - 1\} \cap T^{-n} B) \cup T^{-1}(A \cap \{\tau_A = n\} \cap T^{-n} B). \]
This gives by invariance of the measure $\mu(A \cap \{\tau_A = n\} \cap T^{-n} B) = \mu(B_n) - \mu(B_{n-1})$ where $B_n = \{\tau_A \le n\} \cap T^{-n-1} B$. We have $\mu(B_n) \to \mu(B)$ as $n \to \infty$, thus $\mu(T_A^{-1} B) = \lim \mu(B_n) = \mu(B)$. Let us now assume the ergodicity of the original system. Let $B \subset A$ be a measurable $T_A$-invariant subset. For any $x \in B$, the first iterate $T^n x$ ($n \ge 1$) that belongs to $A$ also belongs to $B$, which means that $\tau_B = \tau_A$ on $B$. But if $\mu(B) \neq 0$, Kac's lemma gives that
\[ \int_B \tau_B \, d\mu = 1 = \int_A \tau_A \, d\mu, \]
which implies that $\mu(A \setminus B) = 0$, proving ergodicity.
We will invoke several times the classical Birkhoff ergodic theorem [8], which we recall without proof.
Theorem 5. Let $\varphi$ be an integrable function. The time average $\frac{1}{n} S_n \varphi$ converges a.e. pointwise and in $L^1$ to some function $\tilde\varphi$. If the measure $\mu$ is ergodic then $\tilde\varphi$ is a.e. equal to the space average $\int \varphi \, d\mu$.
In an ergodic system, the ergodic theorem gives quantitative information on the recurrence property in a given set. More precisely, if $A \in \mathcal{A}$ then we get that
\[ \frac{\operatorname{card}\{1 \le k \le n : T^k x \in A\}}{n} \to \mu(A) \quad \text{for µ-a.e. } x. \]
If we define inductively the $n$th return time by $\tau_A^{(n)}(x) = \tau_A^{(n-1)}(x) + \tau_A(T_A^{n-1}(x))$, we get by the Birkhoff ergodic theorem again (but on the induced map) that
\[ \frac{1}{n} \tau_A^{(n)}(x) \to \frac{1}{\mu(A)} \quad \text{for } \mu_A\text{-a.e. } x \]
when the system is ergodic and $\mu(A) > 0$.
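The convergence of the averaged return times can be observed numerically. The sketch below assumes the circle rotation by the golden mean (an ergodic system for the Lebesgue measure) and the set A = [0, 0.1); both choices, and the starting point 0.05, are made for this illustration only. The mean of the first 5000 successive return times should be close to 1/µ(A) = 10.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2   # golden-mean rotation number (irrational)
A_LEFT, A_RIGHT = 0.0, 0.1       # A = [0, 0.1), of Lebesgue measure 0.1

def T(x):
    """Rotation of the circle R/Z by the angle ALPHA."""
    return (x + ALPHA) % 1.0

def in_A(x):
    return A_LEFT <= x < A_RIGHT

def return_times(x, count):
    """Successive return times tau_A(x), tau_A(T_A x), ... for a start x in A."""
    times = []
    for _ in range(count):
        n, x = 1, T(x)
        while not in_A(x):
            n, x = n + 1, T(x)
        times.append(n)
    return times

times = return_times(0.05, 5000)
# The sum of the first n return times is the n-th return time tau_A^{(n)}(x).
mean_return_time = sum(times) / len(times)
```

Floating point rounding is harmless here because the rotation does not amplify errors; the same experiment with the doubling map would need exact arithmetic.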
2. Thermodynamic Formalism for Expanding Maps of the Interval
Sensitivity to initial conditions, i.e. separation of nearby orbits at an exponential speed, is at the origin of deterministic chaos. A possible mathematical formalization of this exponential separation is hyperbolic dynamics. This geometric property implies some randomness in the statistical properties of the system, which behaves much like an i.i.d. process. Some results about recurrence that we want to present in this review are only known in the low-dimensional case, or under sufficiently strong mixing conditions. To give a unified presentation of these results, we decided to work with a class of dynamical systems which possess all these features. The choice is thus to consider Markov maps of the interval together with an equilibrium state.
Remark 6. Our aim is to consider the dynamical system with its natural metric. For example, we will be mainly interested in return times to sets which are natural (e.g., balls). Therefore, the connection with symbolic dynamics is deliberately kept to a minimum. We warn the reader that the existence of a Markov partition is not essential. Roughly speaking, it makes many geometric and measure theoretic estimates uniform, which makes our life easier. This simplifying assumption allows us to give a self-contained proof of the Ruelle–Perron–Frobenius theorem, from which we get precise estimates on decorrelation. Finally, the choice to consider expanding maps instead of real hyperbolic maps (with expanding and contracting directions, e.g., Anosov, or Axiom A) is made on purpose to keep the technicalities at a low level. We refer the reader to [38] for a complete presentation of hyperbolic dynamics, and also to [48] for the dimensional theory of conformal dynamics.
2.1. Coding and geometry
We assume that $X$ is the interval $[0, 1]$ and that $T$ is a piecewise $C^{1+\alpha}$ expanding map on $X$:
(E) there exists some constant $\beta > 1$ such that $|T'(x)| \ge \beta$ for every $x \in X$.
There exists a collection $\mathcal{J} = \{J_1, \ldots, J_p\}$ such that each $J_i$ is a closed interval and
(M1) $T$ is a $C^{1+\alpha}$ diffeomorphism from $\operatorname{int} J_i$ onto its image;
(M2) $X = \cup_i J_i$ and $\operatorname{int} J_i \cap \operatorname{int} J_j = \emptyset$ unless $i = j$;
(M3) $T(J_i) \supset J_j$ whenever $T(\operatorname{int} J_i) \cap \operatorname{int} J_j \neq \emptyset$.
$\mathcal{J}$ is called a Markov partition.
Remark 7. For real hyperbolic systems, the notion of Markov partition involves stable and unstable manifolds. The definition here is considerably simpler, although
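it is consistent with the general one. The simplification is due to the fact that the local stable manifold is trivially reduced to a point for expanding maps. Such Markov maps of the interval can be modeled by symbolic systems as follows. Define a $p \times p$ matrix $A = (a_{ij})$ by $a_{ij} = 1$ if $T(J_i) \supset J_j$ and $a_{ij} = 0$ otherwise. Let $\mathcal{A} = \{1, \ldots, p\}$ and $\Sigma_A \subset \mathcal{A}^{\mathbb{N}}$ be the set of sequences $\omega$ such that $a_{\omega_i, \omega_{i+1}} = 1$ for any $i \in \mathbb{N}$. Denote by $\sigma : \Sigma_A \to \Sigma_A$ the shift map defined by $\sigma(\omega)_i = \omega_{i+1}$ for any $i \in \mathbb{N}$. Setting $\chi(\omega) = \cap_{i=0}^{\infty} T^{-i} J_{\omega_i}$ gives the symbolic coding of the interval map $(X, T)$ by $(\Sigma_A, \sigma)$:
\[ \begin{array}{ccc} \Sigma_A & \xrightarrow{\ \sigma\ } & \Sigma_A \\ \chi \downarrow & & \downarrow \chi \\ X & \xrightarrow{\ T\ } & X \end{array} \]
Let $\partial \mathcal{J} := \cup_i \partial J_i$. The map $\chi$ is one-to-one except on the set $S := \cup_{n=0}^{\infty} T^{-n} \partial \mathcal{J}$, where it is at most $p^2$-to-one. For $\omega \in \Sigma_A$ we denote by $C_n(\omega)$ the $n$th cylinder of $\omega$, that is the set of sequences $\omega' \in \Sigma_A$ such that $\omega'_i = \omega_i$ for any $i = 0, \ldots, n - 1$. When $x = \chi(\omega) \notin S$ we let $J_n(x) = \chi(C_n(\omega))$.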
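The construction of the transition matrix and of the coding can be sketched on a concrete map. The three-branch piecewise affine map below is a hypothetical example chosen for this illustration (it is not taken from the text); exact rational arithmetic avoids rounding problems when iterating an expanding map.

```python
from fractions import Fraction as F

# Hypothetical piecewise affine Markov map on [0, 1], one branch per interval J_i:
# (left endpoint, right endpoint, slope, offset), with T(x) = slope * x + offset.
BRANCHES = [
    (F(0), F(1, 3), F(3), F(0)),        # J_1 = [0, 1/3],   T(J_1) = [0, 1]
    (F(1, 3), F(2, 3), F(3), F(-1)),    # J_2 = [1/3, 2/3], T(J_2) = [0, 1]
    (F(2, 3), F(1), F(2), F(-4, 3)),    # J_3 = [2/3, 1],   T(J_3) = [0, 2/3]
]

def transition_matrix(branches):
    """a[i][j] = 1 if T(J_i) contains J_j, else 0 (slopes are positive here)."""
    matrix = []
    for left, right, slope, offset in branches:
        lo, hi = slope * left + offset, slope * right + offset
        matrix.append([1 if lo <= l and r <= hi else 0 for l, r, _, _ in branches])
    return matrix

def itinerary(x, n, branches):
    """0-based indices of the intervals visited by x, T x, ..., T^{n-1} x.

    A point on a partition boundary is assigned to the first interval that
    contains it; this is the ambiguity carried by the exceptional set S.
    """
    word = []
    for _ in range(n):
        for i, (left, right, slope, offset) in enumerate(branches):
            if left <= x <= right:
                word.append(i)
                x = slope * x + offset
                break
    return word

A = transition_matrix(BRANCHES)
word = itinerary(F(1, 7), 6, BRANCHES)
# Every itinerary must be an admissible sequence of the subshift Sigma_A.
admissible = all(A[i][j] == 1 for i, j in zip(word, word[1:]))
```

The point 1/7 is periodic of period 5 for this map and its orbit avoids the partition boundaries, so its coding is unambiguous.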
Lemma 8. Let $\psi : X \to \mathbb{R}$ be α-Hölder continuous. For any $x, y$ in the same $n$-cylinder, we have
\[ |S_n \psi(x) - S_n \psi(y)| \le |\psi|_\alpha \, \delta^\alpha \frac{1}{\beta^\alpha - 1}. \]
Proof. For any $k = 0, \ldots, n - 1$, $T^k x$ and $T^k y$ are in the same element of the partition. By the expanding property and an immediate induction we get that
\[ d(T^k x, T^k y) \le \beta^{k-n} d(T^n x, T^n y) \le \beta^{k-n}. \tag{2} \]
Therefore
\[ S_n \psi(x) - S_n \psi(y) = \sum_{k=0}^{n-1} \psi(T^k x) - \psi(T^k y) \le \sum_{k=0}^{n-1} |\psi|_\alpha \, d(T^k x, T^k y)^\alpha \le |\psi|_\alpha \sum_{k=0}^{n-1} (\delta \beta^{k-n})^\alpha. \]
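The last sum is a finite geometric sum, and bounding it is the only step left to reach the constant in the statement of Lemma 8. Written out, with $\delta$ and $\beta$ as in the lemma and the substitution $j = n - k$ (introduced here for the computation):

```latex
\[
\sum_{k=0}^{n-1} \bigl(\delta \beta^{k-n}\bigr)^{\alpha}
  = \delta^{\alpha} \sum_{j=1}^{n} \beta^{-j\alpha}
  \le \delta^{\alpha} \, \frac{\beta^{-\alpha}}{1 - \beta^{-\alpha}}
  = \frac{\delta^{\alpha}}{\beta^{\alpha} - 1},
\]
```

where the inequality uses $\beta^{-\alpha} < 1$, which holds because $\beta > 1$ and $\alpha > 0$. The bound is uniform in $n$.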
Proposition 9. There exist two constants $c_0, c_1$ such that for any $x \notin S$, any integer $n$,
\[ c_0 |(T^n)'(x)|^{-1} \le \operatorname{diam} J_n(x) \le c_1 |(T^n)'(x)|^{-1}. \]
Proof. The function $x \mapsto \log |T'(x)|$ is α-Hölder continuous on $X$. Thus by Lemma 8, there exists some constant $D$ such that for each $n \in \mathbb{N}$ and $x, y$ in the same $n$-cylinder,
\[ \frac{|(T^n)'(x)|}{|(T^n)'(y)|} \le D. \]
The restriction of $T^n$ to the interval $J_n(x)$ is a diffeomorphism, so we can apply the mean value theorem: there exists $y \in J_n(x)$ such that $\operatorname{diam} T^n(J_n(x)) = |(T^n)'(y)| \operatorname{diam} J_n(x)$. This together with the distortion estimate proves the upper bound with $c_1 = D \operatorname{diam} X$. We now prove the lower bound. Let $\rho = \min \operatorname{diam} J_i > 0$. Since $T^n(J_n(x))$ is a union of some of the $J_i$'s, $\operatorname{diam} T^n(J_n(x)) \ge \rho$. This together with the distortion estimate proves the lower bound with $c_0 = \rho D^{-1}$.
Remark 10. We emphasize that this picture is true in the more general situation of conformal repellers. $X$ is a compact invariant subset of a $C^{1+\alpha}$ map of a Riemannian manifold $M$, such that
(i) $T$ is expanding on $X$: $\|d_x T v\| \ge \beta \|v\|$ for all $v \in T_x M$, for all $x \in X$;
(ii) there exists an open set $V \subset M$ such that $X = \cap_n T^{-n} V$.
In this case there exist Markov partitions of arbitrarily small diameter. Under the additional assumption
(iii) $T$ is conformal: $d_x T$ is a multiple of an isometry,
for each cylinder $J_n(x)$ the inner and outer diameter are in some range $\|d_x T^n\|^{-1} [c_0, c_1]$.
We assume for simplicity that the map is topologically mixing, that is, for any two open sets $A$ and $B$, there exists $N$ such that for all $n > N$, $A \cap T^{-n} B \neq \emptyset$. This is equivalent to assuming that $\Sigma_A$ is irreducible and aperiodic, i.e. there exists $m_0$ such that $A^{m_0}$ has only nonzero entries.
2.2. Dimension, entropy and Lyapunov exponent
We now briefly review some essential notions coming from the thermodynamic formalism of expanding maps, as well as their relation with dimensions and Lyapunov exponents. We emphasize that most of the results are known in much more general situations. It is beyond the scope of this note to present them in full generality, with the weakest hypotheses.
2.2.1. Dimensions
We now introduce some dimensions for sets and measures. The ambient space is $\mathbb{R}^N$. Let $\alpha \ge 0$ and define a set function $H^\alpha(\cdot)$ by
\[ H^\alpha(A) = \lim_{\delta \to 0} \inf \sum_i (\operatorname{diam} V_i)^\alpha \]
where the infimum is taken among all countable covers of $A$ by sets $V_i$ with $\operatorname{diam} V_i \le \delta$ (the limit exists by monotonicity). $H^\alpha(A)$ is called the Hausdorff measure of dimension $\alpha$ of the set $A$. We define the Hausdorff dimension of the set $A$, denoted by $\dim_H A$, as the unique number such that $H^\alpha(A) = +\infty$ if $\alpha < \dim_H A$ and $H^\alpha(A) = 0$ if $\alpha > \dim_H A$.
Recall that for all $\alpha \ge 0$, $H^\alpha$ is an exterior measure, for which Borel sets are measurable. We recover (a multiple of) the Lebesgue measure when $\alpha = N$. For a countable collection of sets $(A_n)$ we have $\dim_H \cup_{n \in \mathbb{N}} A_n = \sup_{n \in \mathbb{N}} \dim_H A_n$. Given a Radon measure (Borel measure finite on compact sets) $\mu$ its Hausdorff dimension is defined by $\dim_H \mu = \inf\{\dim_H A : A^c \ \mu\text{-negligible}\}$. The pointwise dimensions of a measure $\mu$ are defined by
\[ \underline{d}_\mu(x) = \liminf_{r \to 0} \frac{\log \mu(B(x, r))}{\log r} \quad \text{and} \quad \overline{d}_\mu(x) = \limsup_{r \to 0} \frac{\log \mu(B(x, r))}{\log r}. \]
Theorem 11. For any Radon measure $\mu$ we have $\dim_H \mu = \operatorname{esssup} \underline{d}_\mu$.
Proof. Fix $\alpha > \operatorname{esssup} \underline{d}_\mu$ and an integer $n \ge 1$. Let $A_n = \{x \in B(0, n) : \mu(B(x, r)) \ge r^\alpha, \forall r < 1/n\}$. Let $\delta \in (0, 1/n)$. By Vitali's lemma, there exists a countable family $(x_i, r_i)$ with $x_i \in A_n$ and $r_i < \delta$ such that the balls $B(x_i, r_i)$ are disjoint and $V_i := B(x_i, 3r_i)$ covers $A_n$. Moreover,
\[ \sum_i \operatorname{diam}(V_i)^\alpha \le \sum_i (6 r_i)^\alpha \le 6^\alpha \sum_i \mu(B(x_i, r_i)) \le 6^\alpha \mu(B(0, n + 1/n)). \]
This gives $H^\alpha(A_n) < \infty$, therefore $\dim_H A_n \le \alpha$. Since $\{\underline{d}_\mu < \alpha\} = \cup_n A_n$ has full measure, we get $\dim_H \mu \le \alpha$. We now prove the lower bound. Let $\alpha < \operatorname{esssup} \underline{d}_\mu$. Let $Y$ be a measurable set with $\mu(Y^c) = 0$. There exists $n \ge 1$ such that $Z := \{x \in Y : \mu(B(x, r)) \le r^\alpha, \forall r < 1/n\}$ has positive measure. Let $\delta < 1/2n$ and consider a δ-cover $(V_i)$ of $Z$. Let $x_i \in V_i \cap Z$ (if the intersection is empty we simply discard this set). We have
\[ \sum_i (\operatorname{diam} V_i)^\alpha \ge \sum_i \mu(B(x_i, \operatorname{diam} V_i)) \ge \mu(\cup_i V_i) \ge \mu(Z). \]
This proves that $\dim_H Y \ge \dim_H Z \ge \alpha$. Therefore $\dim_H \mu \ge \alpha$.
2.2.2. Entropy
From now on and until the end of Sec. 2.2, $\mu$ will denote an invariant measure. Its entropy with respect to a finite measurable partition $\xi$ is defined by
\[ h_\mu(T, \xi) = \lim_{n \to \infty} -\frac{1}{n} \int \log \mu(\xi_n(x)) \, d\mu(x), \]
where $\xi_n = \xi \vee T^{-1}\xi \vee \cdots \vee T^{-n+1}\xi$ and $x \in \xi_n(x) \in \xi_n$. We will invoke several times the Shannon–McMillan–Breiman theorem, which we recall without proof (but see Proposition 17).
Theorem 12. The limit of $-\frac{1}{n} \log \mu(\xi_n(x))$ exists for µ-a.e. $x$. It is called the local entropy at $x$, denoted by $h_\mu(x)$. If the measure $\mu$ is ergodic then the limit is a.e. equal to the entropy $h_\mu(T, \xi)$.
The entropy $h_\mu(T)$ of the map is defined as the supremum of the metric entropies $h_\mu(T, \xi)$, taken among all finite measurable partitions. It can be proven that any generating partition achieves this supremum. In particular for our Markov partition $\mathcal{J}$ the entropy is maximal. In the case of the full shift on $m$ symbols endowed with the Bernoulli measure $\mu_p$ with weights $p = (p_1, \ldots, p_m)$, the entropy is
\[ h_{\mu_p}(\sigma) = -\sum_{i=1}^{m} p_i \log p_i, \]
which is maximal for the uniform measure $p_i = \frac{1}{m}$ for all $i$'s. The supremum $\log m$ is equal to the topological entropy. This is a special case of the variational principle (see Sec. 2.3 below). In the case of the shift on $m$ symbols with a Markov measure $\mu_{P,\pi}$, where $P$ is the stochastic matrix giving the transition probabilities and $\pi$ is the left eigenvector $\pi P = \pi$, the entropy is simply
\[ h_{\mu_{P,\pi}} = -\sum_{i,j=1}^{m} \pi_i P_{i,j} \log P_{i,j}. \]
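Both entropy formulas are easy to evaluate. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the stationary vector is obtained by plain power iteration (which presumes that the iteration converges, e.g. for a primitive stochastic matrix), and the example matrices are chosen only for the demonstration.

```python
import math

def bernoulli_entropy(p):
    """Entropy -sum_i p_i log p_i of the Bernoulli measure with weights p."""
    return -sum(q * math.log(q) for q in p if q > 0)

def stationary_vector(P, iterations=2000):
    """Left eigenvector pi with pi P = pi, by power iteration.

    Assumes P is a stochastic matrix for which the iteration converges
    (for instance a primitive matrix).
    """
    m = len(P)
    pi = [1.0 / m] * m
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(m)) for j in range(m)]
    return pi

def markov_entropy(P):
    """Entropy -sum_{i,j} pi_i P_ij log P_ij of the Markov measure defined by P."""
    pi = stationary_vector(P)
    m = len(P)
    return -sum(
        pi[i] * P[i][j] * math.log(P[i][j])
        for i in range(m) for j in range(m) if P[i][j] > 0
    )

uniform_entropy = bernoulli_entropy([1 / 3] * 3)          # expected: log 3
weights = [0.2, 0.3, 0.5]
iid_as_markov = markov_entropy([weights, weights, weights])
golden_entropy = markov_entropy([[0.5, 0.5], [1.0, 0.0]])  # pi = (2/3, 1/3)
```

When all rows of P are equal the Markov measure is a Bernoulli measure, so the two formulas must agree; for the last matrix the stationary vector is (2/3, 1/3) and the entropy is (2/3) log 2.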
2.2.3. Lyapunov exponents
Let $x \notin S$. A small interval $I \ni x$ is mapped by the map $T^n$ to some larger interval $T^n(I)$; as long as $I \subset J_n(x)$, $T^n$ expands the length of $I$ by a factor $|(T^n)'(x)|$ up to a multiplicative correction $e^{\pm D}$. The average expansion factor of $T$ is thus
\[ |(T^n)'(x)|^{1/n} = \exp\left( \frac{1}{n} \log |(T^n)'(x)| \right) \]
up to a correction $e^{\pm D/n}$. The limit
\[ \lambda(x) = \lim \frac{1}{n} \log |(T^n)'(x)| = \lim \frac{1}{n} S_n \log |T'|(x), \]
if it exists, is called the Lyapunov exponent of $T$ at the point $x$.
Proposition 13. For µ-a.e. $x \in X$ the Lyapunov exponent exists and if the measure is ergodic then
\[ \lim \frac{1}{n} \log |(T^n)'(x)| = \int_X \log |T'| \, d\mu. \]
Proof. By the ergodic theorem this limit exists µ-a.e. when $\mu$ is an invariant measure. If, furthermore, this measure is ergodic then it is constant and equal to the Lyapunov exponent of the measure
\[ \lambda_\mu = \int_X \log |T'| \, d\mu. \]
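As a small illustration, the Birkhoff average of log|T'| can be computed exactly along a periodic orbit. The piecewise affine map below is a hypothetical example (not from the text), and the ergodic measure implicitly used is the uniform measure on the period-5 orbit of 1/7, whose points see the slopes 3, 3, 3, 2, 3.

```python
import math
from fractions import Fraction as F

# Hypothetical piecewise affine expanding map on [0, 1]:
# (left endpoint, right endpoint, slope, offset), with T(x) = slope * x + offset.
BRANCHES = [
    (F(0), F(1, 3), F(3), F(0)),
    (F(1, 3), F(2, 3), F(3), F(-1)),
    (F(2, 3), F(1), F(2), F(-4, 3)),
]

def step(x):
    """Return (T(x), |T'(x)|) for the first branch whose interval contains x."""
    for left, right, slope, offset in BRANCHES:
        if left <= x <= right:
            return slope * x + offset, slope
    raise ValueError("x must lie in [0, 1]")

def lyapunov_estimate(x, n):
    """(1/n) S_n log|T'|(x), equal to (1/n) log|(T^n)'(x)| by the chain rule."""
    total = 0.0
    for _ in range(n):
        x, slope = step(x)
        total += math.log(float(slope))
    return total / n

# 500 steps cover the period-5 orbit of 1/7 exactly 100 times.
estimate = lyapunov_estimate(F(1, 7), 500)
expected = (4 * math.log(3) + math.log(2)) / 5
```

For a measure that is not supported on a periodic orbit the same average only converges as n grows, and floating point iteration of an expanding map then needs some care.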
Remark 14. In higher dimension: the image of a ball in $\mathbb{R}^d$ by a linear map is in general an ellipsoid, with axes of different lengths and directions. This picture remains in a loose sense in the nonlinear case: approximation of $T^n$ in a vicinity of $x$ by its differential $d_x T^n$ shows that the iterate by the map $T^n$ of a small ball $B(x, r)$ looks like an ellipsoid, with axes $E_{n,i}(x)$ and length $e^{\lambda_{n,i}(x) n} r$, $i = 1, \ldots, d$. By Oseledets' theorem [45], for µ-a.e. $x$ the directions $E_{n,i}(x)$ and the exponents $\lambda_{n,i}(x)$ converge to some asymptotic value $E_i(x)$ and $\lambda_i(x)$; moreover these values are constant a.e. if the measure $\mu$ is ergodic.
2.2.4. Their relation
These three quantities attached to a measure preserving map of the interval are linked through the following relation:
Theorem 15. The pointwise dimension $d_\mu(x)$ exists a.e. and
\[ d_\mu(x) = \frac{h_\mu(x)}{\lambda(x)} \quad \text{a.e.} \]
If the measure is ergodic, for µ-a.e. $x$ we have
\[ d_\mu(x) = \frac{h_\mu(T)}{\lambda_\mu(T)} = \dim_H \mu. \]
Proof. Let $\varepsilon > 0$. Let $x \in X$ be such that $\mu(J_n(x)) \ge e^{-n(h_\mu(x) + \varepsilon)}$ and $\operatorname{diam} J_n(x) \le e^{-n(\lambda(x) - \varepsilon)}$ for any $n$ sufficiently large. This concerns a.e. point by the Shannon–McMillan–Breiman theorem (Theorem 12), Proposition 9 and Proposition 13. Given $r > 0$ sufficiently small, we take $n$ the smallest integer such that $e^{-n(\lambda(x) - \varepsilon)} < r$. Since $J_n(x) \subset B(x, r)$ we have
\[ \frac{\log \mu(B(x, r))}{\log r} \le \frac{\log \mu(J_n(x))}{\log r}. \]
This proves $\overline{d}_\mu(x) \le \frac{h_\mu(x) + \varepsilon}{\lambda(x) - \varepsilon}$. Let us define the set
\[ G_\varepsilon(m) = \{x \in X : \forall n > m, \ \mu(J_n(x)) \le e^{-n(h_\mu(x) - \varepsilon)} \text{ and } \operatorname{diam}(J_n(x)) \ge e^{-n(\lambda(x) + \varepsilon)}\}. \]
Let $x$ be a density point of $G_\varepsilon(m)$. Given $r > 0$ we take the largest $n = n(r)$ such that $e^{-n(\lambda(x) + \varepsilon)} > r$. For any $r$ sufficiently small (so that $n > m$) we have $\mu(B(x, r)) \le 2\mu(G_\varepsilon(m) \cap B(x, r)) \le 4 e^{-n(h_\mu(x) - \varepsilon)}$, since the ball $B(x, r)$ can intersect at most two cylinders from $G_\varepsilon(m)$. This proves $\underline{d}_\mu(x) \ge \frac{h_\mu(x) - \varepsilon}{\lambda(x) + \varepsilon}$. In the ergodic case the last identity follows from Theorem 11.
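A worked instance of the formula dimension = entropy / Lyapunov exponent, under the assumption that the system is the doubling map Tx = 2x mod 1 with the measure making the binary digits of x independent with probabilities (p, 1 − p); this example and the function names are chosen for illustration. Here |T'| = 2 everywhere, so the Lyapunov exponent is log 2, and the entropy is the Bernoulli entropy.

```python
import math

def bernoulli_entropy(p):
    """Entropy -p log p - (1 - p) log(1 - p) of the (p, 1 - p) Bernoulli measure."""
    return -sum(q * math.log(q) for q in (p, 1.0 - p) if q > 0)

def dimension_of_bernoulli_measure(p):
    """h / lambda for the doubling map, whose Lyapunov exponent is log 2."""
    return bernoulli_entropy(p) / math.log(2)

dim_lebesgue = dimension_of_bernoulli_measure(0.5)   # Lebesgue measure: dimension 1
dim_biased = dimension_of_bernoulli_measure(0.25)    # strictly between 0 and 1
```

The value for p = 1/2 recovers the dimension of the Lebesgue measure on the interval, while any p different from 1/2 gives a measure of dimension strictly less than 1.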
Remark 16. The existence of the pointwise dimension has been proved by Young [58] in the case of $C^2$ surface diffeomorphisms with nonzero entropy. Then Ledrappier and Young [42], and finally Barreira, Pesin and Schmeling [5] extended the result to arbitrary dimension for $C^{1+\alpha}$ diffeomorphisms, for measures without zero Lyapunov exponents.
2.3. Equilibrium states or Gibbs measures
An important class of invariant measures in the ergodic theory of smooth dynamical systems is equilibrium states. This notion comes from thermodynamics, via symbolic dynamics, and is central in the thermodynamic formalism of dynamical systems. In our setting, they are also Gibbs measures: the behavior at small scale of these measures is precisely controlled by a function, the potential. In particular, natural measures (e.g., absolutely continuous or physical measures) are equilibrium states. The second interest of these measures lies in the fact that they possess strong statistical properties.
2.3.1. Gibbs measures
Let $\varphi : X \to \mathbb{R}$ be a Hölder continuous function. An invariant measure $\mu$ is called a Gibbs measure for the potential $\varphi$ if there exists a constant $P_T(\varphi) \in \mathbb{R}$, called the pressure, such that for some $\kappa_\varphi \ge 1$, for any $x$ and any $n$, we have
\[ \frac{1}{\kappa_\varphi} \le \frac{\mu(J_n(x))}{\exp(S_n \varphi(x) - n P_T(\varphi))} \le \kappa_\varphi. \tag{3} \]
The case of Markov measures is recovered if one takes the potential $\varphi(x) = \log P_{x_0, x_1}$ where $P$ is the stochastic transition matrix. The potential $\varphi = -\log |T'|$ gives the absolutely continuous invariant measure. Note that this measure is Markov essentially in the case of piecewise affine maps.
Proposition 17. For a Gibbs measure $\mu$, the statement of the Shannon–McMillan–Breiman theorem with the partition $\mathcal{J}$ follows immediately from Birkhoff ergodic theorem and $P_T(\varphi) = h_\mu(T) + \int \varphi \, d\mu$.
Proof. Indeed, we have $-\frac{1}{n} \log \mu(J_n(x)) \sim -\frac{1}{n} S_n \varphi(x) + P_T(\varphi)$ which converges a.e. by Birkhoff ergodic theorem. The dominated convergence theorem proves the identity.
Remark 18. A measure which attains the supremum
\[ \sup_\nu \left( h_\nu(T) + \int \varphi \, d\nu \right) \]
among all $T$-invariant measures $\nu$, is called an equilibrium measure. According to the variational principle, the supremum is indeed the pressure $P_T(\varphi)$.
It turns out that for Markov expanding maps of the interval this supremum is attained at an unique measure µϕ , which is also the (unique) gibbs measure. For a general account on equilibrium states, and a proof of these results, we refer to [10, 40]. Our Gibbs measure has the following mixing property: there exist some constants c > 0 and θ ∈ (0, 1) such that for any cylinder set A of rank n and any measurable set B we have |µϕ (A ∩ T − B) − µϕ (A)µϕ (B)| ≤ cθ −n µ(A)µ(B). This is called the ψ-mixing property (with exponential rate). Observe that in particular such a measure µ is mixing, hence ergodic. The mixing property can also be stated in a different and weaker way, which will be sufficient for the sequel. If f is a Lipschitz function and g an integrable function we have f g ◦ T dµϕ − f dµϕ g dµϕ ≤ cθ f LipgL∞ . (4) For the sake of completeness, we provide in Sec. 2.3.2 below a proof of the existence of Gibbs measures and the computation of the rate of decay of correlation (4). Remark 19. Such a result on decay of correlations holds in a large variety of settings, where one has some expanding behavior [43]. This includes some dynamical systems with singularities, or without Markov partition, in any dimensions. For some nonuniformly expanding systems, the decay rate is polynomial. In the case of invertible maps (e.g. hyperbolic diffeomorphisms), the second function g has to be regular also [10, 58]. 2.3.2. Ruelle–Perron–Frobenius theorem We closely follow the presentation of the monograph [10] (see also [46]). Suppose that the potential ϕ : X → R is α-H¨older. For convenience we will work on the subshift of finite type (ΣA , σ) defined in Sec. 2.1, instead of working directly on the interval map (X, T ). Let δ = β −α . We endow ΣA with the metric d(ω, ω ) = δ n if n is the largest integer such that Cn (ω) = Cn (ω ). Note that this makes the potential on the symbolic space ϕ ◦ π Lipschitz. To simplify notations we still denote it by ϕ. 
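Let $|\varphi|_{\mathrm{Lip}}$ be its Lipschitz constant. For all $a \in \mathcal{A}$, if $a\omega$ and $a\omega'$ exist then $d(a\omega, a\omega') \le \delta\, d(\omega, \omega')$ ($\sigma$ is locally expanding, its inverse contracting). We assume that for some integer $N$, $A^N$ has only positive entries. The Ruelle–Perron–Frobenius operator is defined by
$$L_\varphi(f)(\omega) = \sum_{a \in \mathcal{A},\ A_{a\omega_0} = 1} e^{\varphi(a\omega)} f(a\omega).$$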
$L_\varphi$ acts on continuous functions. It is a bounded operator on the space of continuous functions, as well as on the space of Lipschitz functions. We can iterate it:
$$L_\varphi^n f(\omega) = \sum_{\sigma^n \omega' = \omega} e^{S_n\varphi(\omega')} f(\omega').$$
September 16, 2009 9:47 WSPC/148-RMP
Recurrence in Dynamical Systems
Lemma 20. There exists a probability measure $\nu$ and a constant $\lambda > 0$ such that $L_\varphi^* \nu = \lambda \nu$.

Proof. The dual $L_\varphi^*$ acts on probability measures. The map defined on the convex compact set $M(\Sigma_A)$ of probability measures by
$$m \mapsto T(m) = \frac{L_\varphi^* m}{L_\varphi^* m(1)}$$
has a fixed point by the Schauder–Tychonoff theorem. Putting $\lambda = L_\varphi^* \nu(1)$, this fixed point $\nu$ satisfies
$$\int_{\Sigma_A} L_\varphi f \, d\nu = \int_{\Sigma_A} f \, d(L_\varphi^* \nu) = \lambda \int_{\Sigma_A} f \, d\nu.$$
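Without loss of generality, we assume that $\lambda = 1$, replacing if necessary $\varphi$ by $\varphi - \log \lambda$. Let $b > 0$ be such that $b \ge \delta(|\varphi|_{\mathrm{Lip}} + b)$, where $|\varphi|_{\mathrm{Lip}}$ is the Lipschitz constant of $\varphi$, and
$$C_b = \{ f \ge 0 : f(\omega) \le f(\omega') e^{b\, d(\omega, \omega')} \text{ if } \omega_0 = \omega'_0, \text{ and } \nu(f) = 1 \}.$$

Lemma 21. There exists $c \in (0, 1)$ such that if $f \in C_b$ then $c \le L_\varphi^N f$ and $f \le c^{-1}$.

Proof. Let $\omega$ and $\omega'$ in $\Sigma$. There exists $\omega''$ such that $\omega''_0 = \omega_0$ and $\sigma^N \omega'' = \omega'$. Thus
$$L_\varphi^N f(\omega') \ge e^{S_N\varphi(\omega'')} f(\omega'') \ge (\inf e^{S_N\varphi})\, e^{-b} f(\omega).$$
This shows that $\inf L_\varphi^N f \ge \frac{\inf e^{S_N\varphi}}{e^b} \sup f$. The conclusion follows from the remark that $\nu(f) = 1$ and $\nu(L_\varphi^N f) = 1$.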
Notice that if $f \in C_b$ then $\log f$, hence $f$, are Lipschitz. Moreover the Lipschitz norm $\|f\| := \|f\|_\infty + |f|_{\mathrm{Lip}} \le M$, for some constant $M = c^{-1} \max(3, b + 1)$.

Lemma 22. There exists $h \in C_b$ such that $L_\varphi h = h$ and $h > 0$.

Proof. The set $C_b$ is relatively compact in the set of continuous functions by Ascoli–Arzelà. Moreover, it is clearly closed, thus compact, and convex. In addition, whenever $f \in C_b$ and $\omega_0 = \omega'_0$ we have
$$L_\varphi f(\omega) = \sum_a e^{\varphi(a\omega)} f(a\omega) \le \sum_a e^{\varphi(a\omega) - \varphi(a\omega')} e^{\varphi(a\omega')} f(a\omega') e^{b\, d(a\omega, a\omega')} \le e^{\delta[|\varphi|_{\mathrm{Lip}} + b]\, d(\omega, \omega')} L_\varphi f(\omega'),$$
where $|\varphi|_{\mathrm{Lip}}$ is the Lipschitz constant of $\varphi$. This shows that $L_\varphi(C_b) \subset C_b$. The Schauder–Tychonoff theorem applies again and shows the existence of a fixed point $h \in C_b$, which satisfies $L_\varphi h = h$. By Lemma 21, we have $h = L_\varphi^N h \ge c > 0$.

Note that the measure $\mu = h\nu$ is invariant. Without loss of generality, we assume that $h = 1$, replacing if necessary $\varphi$ by $\varphi + \log h - \log h \circ \sigma$, and $\nu$ by $h\nu$.
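The construction of $\lambda$ and $h$ can be illustrated numerically in a simplified situation. The sketch below is not part of the argument above: it assumes that the potential depends only on the first two symbols, so that $L_\varphi$ preserves functions of the first symbol and reduces to a nonnegative matrix, and it approximates the leading eigenvalue and eigenfunction by power iteration. The names `transfer_matrix` and `leading_eigenpair` are hypothetical helpers introduced for this illustration only.

```python
import math

def transfer_matrix(A, phi):
    """Matrix of L_phi acting on functions of the first symbol.

    Assumption: phi(omega) = phi[omega_0][omega_1], so that
    (L f)(b) = sum over a with A[a][b] == 1 of exp(phi[a][b]) * f(a).
    """
    n = len(A)
    return [[A[a][b] * math.exp(phi[a][b]) for a in range(n)] for b in range(n)]

def leading_eigenpair(M, iterations=200):
    """Power iteration: returns (lam, h) with M h ~ lam * h and sum(h) == 1."""
    n = len(M)
    h = [1.0 / n] * n
    lam = 0.0
    for _ in range(iterations):
        image = [sum(M[b][a] * h[a] for a in range(n)) for b in range(n)]
        lam = sum(image)  # equals the eigenvalue once h is an eigenvector of sum 1
        h = [value / lam for value in image]
    return lam, h

# Golden mean shift (the word "11" is forbidden) with the zero potential.
A = [[1, 1], [1, 0]]
phi = [[0.0, 0.0], [0.0, 0.0]]
lam, h = leading_eigenpair(transfer_matrix(A, phi))
```

With the zero potential, `log(lam)` is the topological entropy of the subshift; convergence of the iteration relies on the assumption that some power of the transition matrix is positive.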
Theorem 23. The measure $\mu$ is a Gibbs measure for the potential $\varphi$.

Proof. Let $\omega \in \Sigma_A$ and $n$ an integer. Let $f = 1_{C^n(\omega)}$ be the indicator function of the $n$-cylinder about $\omega$. We have
$$\mu(C^n(\omega)) = \mu(L_\varphi^n f) \le \sup_{C^n(\omega)} e^{S_n\varphi} \le \kappa e^{S_n\varphi(\omega)}$$
for some constant $\kappa$ only depending on $\varphi$. The argument in Lemma 21 gives $L_\varphi^n f \ge \inf_{C^n(\omega)} e^{S_n\varphi}\, 1_{[\omega_0]}$, and $L_\varphi^{n+N} f \ge (\inf e^{S_N\varphi})(\inf e^{S_n\varphi})$. The previous computation yields
$$\mu(C^n(\omega)) = \nu(L_\varphi^{n+N} f) \ge \frac{1}{\kappa} e^{S_n\varphi(\omega)},$$
changing the constant $\kappa$ if necessary.

Despite its extreme simplicity, the following lemma is the core of the estimate on decay of correlations. The interpretation is that after each iteration by $T^N$, at least a proportion $\eta$ of the remaining density is split off and follows the invariant measure. The exponential convergence will follow immediately.

Lemma 24. There exists $\eta \in (0, 1)$ such that for any $f \in C_b$, $L_\varphi^N f = \eta + (1 - \eta) f'$ with $f' \in C_b$.
Proof. Let $\eta = c^2 < 1$. Let $f \in C_b$. Put $g = L_\varphi^N f$. Write $g = \eta 1 + (g - \eta)$. Since $g - \eta \ge c - \eta c^{-1} \ge 0$ and both $g \in C_b$ and $1 \in C_b$, we have $g - \eta h \in \mathbb{R} C_b$.

Lemma 25. There exist constants $C > 0$ and $\theta \in (0, 1)$ such that $\|L_\varphi^n f - 1\| \le C\theta^n$ for all $f \in C_b$ and $n \ge 0$, where $\|\cdot\|$ stands for the Lipschitz norm.

Proof. By applying Lemma 24 successively, we obtain that for all $p \ge 1$,
$$L_\varphi^{pN} f = (1 - (1 - \eta)^p) + (1 - \eta)^p f_p$$
for some $f_p \in C_b$. Write $n = pN + r$. We have
$$\|L_\varphi^n f - 1\| \le |||L_\varphi^r|||\, (1 - \eta)^p \left( \|1\| + \|f_p\| \right).$$
Putting $\theta = (1 - \eta)^{1/N}$ and $C = \sup_r$
Proof. Let $f$ be a Lipschitz function. Taking $u = (1 + b^{-1})\|f\|$ ensures that $f + u \in \mathbb{R} C_b$. By Lemma 25, this gives the result.

Note. From now on, we fix a Gibbs measure $\mu$ of a Hölder potential. We will prove quantitative recurrence results for the dynamical system $(X, T, \mu)$.

3. Recurrence Rate

By the topological version of the Poincaré recurrence theorem, i.e. Theorem 2, a.e. point $x$ returns arbitrarily close to itself under iteration by $T$. A very natural question is the behavior when $r \to 0$ of the first return time
$$\tau_r(x) = \tau_{B(x,r)}(x) = \min\{ n \ge 1 : d(T^n x, x) < r \},$$
that is, the first time that the orbit of $x$ is back in the $r$-neighborhood of the point $x$. This is quantified by the following notion.

Definition 27. We define the lower and upper recurrence rate of a point $x$ by the limits
$$\underline{R}(x) := \liminf_{r \to 0} \frac{\log \tau_r(x)}{-\log r}, \qquad \overline{R}(x) := \limsup_{r \to 0} \frac{\log \tau_r(x)}{-\log r}.$$
When the limit exists we denote it by $R(x)$.

Theorem 28. The recurrence rate is a.e. equal to the dimension of the measure: $R(x) = \dim_H \mu$ for µ-a.e. $x$.

Recurrence rates were introduced and studied in [55] in the case of interval maps, and in [6, 7] in the case of axiom A diffeomorphisms and some class of repellers. The corresponding results for hitting or waiting times were then considered by Galatolo [25, 26]; see also [27, 28] for subsequent developments.

Remark 29. We emphasize that some assumption on the system is necessary, since there exist examples of dynamical systems where the conclusion of Theorem 28 is false (e.g., rotations with special diophantine type [6]).

3.1. The rapid mixing method

In this section, we prove Theorem 28 using the method developed in [52].

Remark 30. The theorem is indeed valid in a more general situation than Markov expanding maps of the interval. The core assumption is the decay of correlations for Lipschitz functions (4) with a superpolynomial rate. Without assuming the existence of the pointwise dimension, one still gets the identities $\underline{R} = \underline{d}_\mu$ and $\overline{R} = \overline{d}_\mu$ µ-a.e.

Let $\delta = \dim_H \mu$.
The following lemma is a direct consequence of Theorem 15 about the existence of the pointwise dimension and Egorov theorem.
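Lemma 31 (Good set for the pointwise dimension). For any $\varepsilon > 0$ there exists $r_0 > 0$ such that the measurable set $K_\varepsilon \subset X$ defined by
$$K_\varepsilon = \{ x \in X : \forall r < r_0,\ r^{\delta + \varepsilon} \le \mu(B(x, r)) \le r^{\delta - \varepsilon} \}$$
has measure at least $1 - \varepsilon$.

Lemma 32. For µ-a.e. $x \in K_\varepsilon$, we have
$$\liminf_{r \to 0} \frac{\log \tau_r(x)}{-\log r} \ge \delta - 4\varepsilon.$$

Proof. Fix $\varepsilon > 0$ and let $K_\varepsilon$ be as in Lemma 31. Take $\alpha = \frac{1}{\delta - 4\varepsilon}$, set $r_n = \frac{1}{n^\alpha}$ and define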
$$L_n = \{ x \in K_\varepsilon : d(T^n x, x) < r_n \}.$$
For a ball $B = B(x, r)$ and a constant $a > 0$, we denote by $aB$ the ball $B(x, ar)$. Let $(B_i)_i$ be a collection of balls of radius $r_n$ centered at points in $K_\varepsilon$ that covers $L_n$ and such that the collection of balls $(\frac12 B_i)_i$ is disjoint. We have
$$\mu(L_n) = \mu(\cup_i B_i \cap L_n) \le \sum_i \mu(B_i \cap L_n) \le \sum_i \mu(B_i \cap T^{-n} 2B_i).$$
For each $i$, define the function $\phi_i(x) = \max(0, 1 - r_n^{-1} d(x, 2B_i))$. We remark that $\phi_i$ is $r_n^{-1}$-Lipschitz, and $1_{2B_i} \le \phi_i \le 1_{3B_i}$. We have
$$\mu(B_i \cap T^{-n} 2B_i) \le \int \phi_i \, \phi_i \circ T^n \, d\mu \le \left( \int \phi_i \, d\mu \right)^2 + c\theta^n \|\phi_i\|_{\mathrm{Lip}} \le \mu(3B_i)^2 + c\theta^n r_n^{-1}$$
using the decay of correlations formula (4). Since the balls are centered on $K_\varepsilon$ we have $\mu(3B_i) \le (3r_n)^{\delta - \varepsilon}$ and $(\frac12 r_n)^{\delta + \varepsilon} \le \mu(\frac12 B_i)$. This last inequality and the fact that the balls are disjoint imply that their number is bounded by $(\frac12 r_n)^{-\delta - \varepsilon}$. Therefore,
$$\sum_i \mu(B_i \cap T^{-n} 2B_i) \le \left( \tfrac12 r_n \right)^{-\delta - \varepsilon} \left[ (3r_n)^{2\delta - 2\varepsilon} + c\theta^n r_n^{-1} \right].$$
Thus we have $\sum_n \mu(L_n) < +\infty$. By the Borel–Cantelli lemma, for µ-a.e. $x$ there exists $n_0(x)$ such that for any $n > n_0(x)$, $x \notin L_n$, thus $d(x, T^n x) \ge r_n$. Therefore, for any $r$ and $n$ such that $r \le r_n$ and $r < \min\{ d(x, T^j x) : j = 1, \ldots, n_0(x) \}$ we have $\tau_r(x) > n$. Hence, since the set of periodic points has zero measure,
$$\liminf_{r \to 0} \frac{\log \tau_r(x)}{-\log r} \ge \lim_{n \to \infty} \frac{\log n}{-\log r_n} = \frac{1}{\alpha} = \delta - 4\varepsilon$$
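for µ-a.e. $x \in K_\varepsilon$.

Lemma 33. For µ-a.e. $x \in K_\varepsilon$, we have
$$\limsup_{r \to 0} \frac{\log \tau_r(x)}{-\log r} \le \delta + 2\varepsilon.$$

Proof. Define $M_r = \{ x \in K_\varepsilon : \tau_{2r}(x) \ge r^{-\delta - 2\varepsilon} \}$.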
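Let $(B_i)_i$ be a family of balls of radius $r$ centered at points of $K_\varepsilon$ that covers $M_r$ and such that the balls $\frac12 B_i$ are disjoint. We have
$$\mu(M_r) = \mu(\cup_i B_i \cap M_r) \le \sum_i \mu(B_i \cap M_r).$$
But observe that by the triangle inequality
$$\mu(B_i \cap M_r) \le \mu(B_i \cap \{ \tau_{B_i} \ge r^{-\delta - 2\varepsilon} \}).$$
By Kac's lemma and the Markov inequality, this is bounded by
$$r^{\delta + 2\varepsilon} \int_{B_i} \tau_{B_i} \, d\mu = r^{\delta + 2\varepsilon}.$$
Since the number of balls $B_i$ is bounded by $(\frac12 r)^{-\delta - \varepsilon}$, we end up with
$$\mu(M_r) \le 2^{\delta + \varepsilon} r^{\varepsilon}.$$
Therefore, the sequence $r_n = e^{-n}$ satisfies $\sum_n \mu(M_{r_n}) < +\infty$. By the Borel–Cantelli lemma, for µ-a.e. $x \in K_\varepsilon$ there exists $n_1(x)$ such that for any $n > n_1(x)$, $x \notin M_{r_n}$. Hence $\tau_{2r_n}(x) < r_n^{-\delta - 2\varepsilon}$, therefore
$$\limsup_n \frac{\log \tau_{2r_n}(x)}{-\log 2r_n} \le \delta + 2\varepsilon.$$
The conclusion follows since $\frac{\log r_n}{\log r_{n+1}}$ converges to 1.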
Theorem 28 follows from Lemmas 31–33.

3.2. Repetition times, minimal distance

In the symbolic setting, a result comparable to Theorem 28 on recurrence rate exists. Let $\xi$ be a finite or countable measurable partition of $X$. We define the first repetition time of the first $n$ symbols by
$$R_n(x, \xi) = \min\{ k \ge 1 : \xi_n(x) = \xi_n(T^k x) \}.$$

Theorem 34 ([44]). Let $(X, T, \mu)$ be any ergodic measure preserving dynamical system. Let $\xi$ be a finite measurable partition of $X$. Then for µ-a.e. $x$ we have
$$\lim_{n \to \infty} \frac{1}{n} \log R_n(x, \xi) = h_\mu(T, \xi).$$
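The repetition time is easy to compute on a finite sample of a symbolic sequence. The following sketch is only an illustration under stated assumptions: the function name `repetition_time` is hypothetical, the sample is a finite window (so the function returns `None` when no repetition is seen inside it), and the fair-coin sequence with an arbitrary seed stands in for a typical point of a Bernoulli measure of entropy $\log 2$. The flag `non_overlapping` imposes $k \ge n$ in the minimum.

```python
import math
import random

def repetition_time(seq, n, non_overlapping=False):
    """Smallest k >= 1 (k >= n if non_overlapping) with seq[k:k+n] == seq[:n].

    Returns None when no repetition occurs inside the finite sample.
    """
    start = n if non_overlapping else 1
    block = seq[:n]
    for k in range(start, len(seq) - n + 1):
        if seq[k:k + n] == block:
            return k
    return None

# Illustration on a fair-coin sequence (entropy log 2); the seed is arbitrary.
rng = random.Random(0)
sample = [rng.randint(0, 1) for _ in range(200_000)]
for n in (4, 8, 12):
    k = repetition_time(sample, n)
    if k is not None:
        # (1/n) log R_n is expected to approach log 2 only slowly as n grows.
        print(n, k, math.log(k) / n)
```

For small $n$ the printed ratios fluctuate considerably; the theorem is an asymptotic statement, and a finite sample can only suggest the trend.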
The initial statement was indeed for the non-overlapping return time $R_n^{\mathrm{no}}$ (we impose that $k \ge n$ in the definition of $R_n$) but Quas observed in [50] that they are a.e. eventually equal when the entropy is positive. There it is also shown that the result holds for countable partitions. Although our interest is more on smooth
dynamical systems than on symbolic systems, we provide a proof of the theorem in Sec. 3.2.1 below.

Remark 35. Theorem 34 can be applied with the Markov partition $\mathcal{J}$ and a Gibbs measure $\mu$. Using the approximation argument of balls by cylinders present for example in Lemma 51, we recover the statement of Theorem 28. However, this strategy, which was successful in the one-dimensional case [55], does not survive in higher dimensional systems, while the method presented in the previous section does not depend on the dimension. The analogy with recurrence rates is also made precise if one takes the pseudodistance $d(x, y) = e^{-n}$ whenever $n$ is the largest integer such that $\xi_k(x) = \xi_k(y)$ for any $k < n$. In this case one has $R_n(x, \xi) = \tau_{e^{-n}}(x)$, while the Hausdorff dimension of the ergodic measure $\mu$ is equal to the entropy $h_\mu(T, \xi)$.

Remark 36. At the same time and independently, Boshernitzan in [9] established a quantitative version of the topological version of the Poincaré recurrence theorem, that is Theorem 2. The statement of this elegant result is the following. If the α-dimensional Hausdorff measure is σ-finite on $X$ then
$$\liminf_{n \to \infty} n^{1/\alpha} d(T^n x, x) < \infty \quad \text{for µ-a.e. } x. \qquad (5)$$
Recurrence rates are concerned with the time necessary to achieve a certain distance, while this result considers the distance as a function of time. These results are of course related, and in fact it is an exercise to see that the statement (5) implies that $\underline{R} \le \alpha$ almost surely. The converse is slightly weaker and reads as follows: $\underline{R} < \alpha$ implies the statement (5). We do not reproduce here the proof of Boshernitzan's theorem; although much more general, it is close in spirit to our Lemma 33, where the core argument lies in Kac's lemma.

3.2.1. Repetition time and entropy

The proof of Theorem 34 is based on the Shannon–McMillan–Breiman theorem and some combinatorial arguments that we extract in the two lemmas below. To simplify the notations we work directly on a space $\Sigma = \{1, \ldots, p\}^{\mathbb{N}}$ for some integer $p$, endowed with the shift map $\sigma$ and an ergodic invariant measure $\mu$ with entropy $h_\mu$. We call interval a set of consecutive integers, denote them by $[|m, n|] = \{ k \in \mathbb{N} : m \le k \le n \}$ and denote the singleton $[|m, m|]$ by $[|m|]$. Given $\omega = (\omega_i)_{i \ge 0} \in \Sigma$ and $m \le n$ we denote the word $\omega_m \omega_{m+1} \cdots \omega_n$ by $\omega_{[|m,n|]}$. The $n$-repetition time reads
$$R_n(\omega) = \min\{ k \ge 1 : \omega_{k + [|0, n-1|]} = \omega_{[|0, n-1|]} \}.$$
We shall prove now that the exponential growth of $R_n$ and $R_n^{\mathrm{no}}$ is governed by the entropy. Indeed, these two quantities are asymptotically equal when the entropy is positive.

Lemma 37 ([50]). If the entropy $h_\mu > 0$ then $R_n(\omega) = R_n^{\mathrm{no}}(\omega)$ eventually a.e.
Proof. Observe that $R_n(\omega) \ne R_n^{\mathrm{no}}(\omega)$ iff $R_n(\omega) < n$. Suppose this is the case, and let $k < n$ be such that $R_n(\omega) = k$, where $\omega_{[|0, n-1|]} = \omega_{k + [|0, n-1|]}$. Hence $\omega_{[|0, k-1|]} = \omega_{[|k, 2k-1|]}$. Let $\varepsilon \in (0, h_\mu/3)$ and consider, for some integer $N$, the set
$$\Gamma = \Gamma(N) := \left\{ \omega \in \Sigma : \forall k \ge N,\ \left| \frac{1}{k} \log \mu(\omega_{[|0, k-1|]}) + h_\mu \right| < \varepsilon \right\}. \qquad (6)$$
We can estimate the measure of the set $\Gamma_k := \{ \omega \in \Gamma : \omega_{[|0, k-1|]} = \omega_{[|k, 2k-1|]} \}$. Indeed, if we denote by $ss$ the concatenation of a finite sequence $s$ of length $|s| = k$ with itself, we have
$$\mu(\Gamma_k) = \sum_{|s| = k} \mu(\omega \in \Gamma_k : \omega_{[|0, k-1|]} = s) \le \sum_{|s| = k} \mu(\Gamma \cap ss).$$
Remark that if $|s| = k$ and $\Gamma \cap ss \ne \emptyset$ then $\mu(ss) \le e^{-2k(h_\mu - \varepsilon)}$ and $s \cap \Gamma \ne \emptyset$, which imply that $\mu(s) \ge e^{-k(h_\mu + \varepsilon)}$. Hence, there can be at most $e^{k(h_\mu + \varepsilon)}$ such $s$. Thus,
$$\mu(\Gamma_k) \le e^{-k(h_\mu - 3\varepsilon)}.$$
Consequently, $\sum_k \mu(\Gamma_k) < \infty$. By the Borel–Cantelli lemma, for µ-a.e. $\omega \in \Gamma$, there exists $k_\omega$ such that for all $k > k_\omega$ we have $\omega \notin \Gamma_k$. If in addition $\omega$ is not periodic then $R_n(\omega) \to \infty$ as $n \to \infty$, hence for $n$ sufficiently large $R_n(\omega) > k_\omega$, which implies that $R_n(\omega) = R_n^{\mathrm{no}}(\omega)$. The conclusion follows then from the Shannon–McMillan–Breiman theorem, since $\mu(\cup_N \Gamma(N)) = 1$, and the measure $\mu$ is aperiodic (does not give any mass to the set of periodic points).

Given an integer $L$ we call pattern a partition $\mathcal{S}$ of the interval $[|0, L-1|]$ by disjoint sub-intervals, $[|0, L-1|] = \cup_{S \in \mathcal{S}} S$. If $S = [|m, n|]$ is an element of a pattern we denote its length by $|S| := n - m + 1$. For integers $1 < M < N < L$ and reals $b > 0$ and $\varepsilon \in (0, 1)$ we say that a sequence $\omega = \omega_{[|0, L-1|]} \in \{1, \ldots, p\}^L$ follows the pattern $\mathcal{S}$ if
• each $S \in \mathcal{S}$ is either
  - a singleton,
  - or an interval of length $|S| \in [|M, N|]$ such that for some $t \in [|M, e^{bN}|]$, $\omega_{t+S} = \omega_S$; in this case we say that $S$ is a long interval;
• and the interval $[|0, L-1|]$ is almost filled by long intervals:
$$\sum_{\text{long } S \in \mathcal{S}} |S| \ge (1 - \varepsilon) L.$$
Lemma 38. For any $\delta > 0$, there exists $\varepsilon > 0$ such that for any $M > 1/\varepsilon$, $N$ and $L$, the number of admissible patterns is bounded by $e^{\delta L}$.

Proof. A number $j$ of singletons can be in $\binom{L}{j}$ different positions in $[|0, L-1|]$. Moreover, we have $0 \le j \le \varepsilon L$. Hence, there are at most $\sum_{j \le \varepsilon L} \binom{L}{j}$ choices for the position of the singletons.
Each long interval $S$ has a length at least equal to $M$. Therefore, there are at most $L/M \le \varepsilon L$ long intervals. Once the configuration of singletons is fixed, the position and the length of the long intervals are determined by the position of the left extremity, thus there are at most $\binom{L}{k}$ choices for them when there are $k$ long intervals. Hence, there are at most $\sum_{k \le \varepsilon L} \binom{L}{k}$ choices for the configuration of the long intervals. We then use the simple estimate
$$\sum_{j \le \varepsilon L} \binom{L}{j} \le \varepsilon^{-\varepsilon L} \sum_{j \le \varepsilon L} \binom{L}{j} \varepsilon^j \le \left( \frac{1 + \varepsilon}{\varepsilon^\varepsilon} \right)^L.$$
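As a quick sanity check of this estimate (an illustration only, with hypothetical function names and arbitrarily chosen values of $L$ and $\varepsilon$), one can compare both sides numerically and look at the exponential growth rate $\log((1+\varepsilon)/\varepsilon^\varepsilon)$, which tends to zero with $\varepsilon$:

```python
import math

def singleton_position_count(L, eps):
    """Sum of binomial(L, j) over 0 <= j <= eps * L."""
    return sum(math.comb(L, j) for j in range(int(eps * L) + 1))

def entropy_type_bound(L, eps):
    """Right-hand side ((1 + eps) / eps**eps) ** L of the estimate."""
    return ((1 + eps) / eps ** eps) ** L

# Arbitrary illustrative values.
L, eps = 200, 0.05
lhs = singleton_position_count(L, eps)
rhs = entropy_type_bound(L, eps)
# Growth rate of the bound per unit length; it tends to 0 as eps -> 0.
exponent = math.log((1 + eps) / eps ** eps)
```

Since the growth rate can be made smaller than any prescribed positive number by taking $\varepsilon$ small, the count of configurations grows subexponentially at any prescribed rate.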
For any $\delta > 0$, if $\varepsilon$ is sufficiently small the number of different admissible patterns is bounded by $e^{\delta L}$.

Lemma 39. There are at most $p^{\varepsilon L} e^{bL}$ admissible sequences of length $L$ following the same pattern.

Proof. Fix a pattern $\mathcal{S}$. We first fill the singletons. They are at most $\varepsilon L$, which gives at most $p^{\varepsilon L}$ possibilities. Once the configuration of singletons is fixed, we fill the long intervals, from right to left. For the first $S$, there exists a time $t \le e^{b|S|}$ such that $\omega_S = \omega_{t+S}$. Since $\omega_{t+S}$ is on the right, it is already determined. Hence $\omega_S$ is one of the $\omega_{j+S}$, $j \in \{N, \ldots, e^{b|S|}\}$. This leaves at most $e^{b|S|}$ different choices for $\omega_S$. We proceed similarly for the second interval, and so on. Finally, there are at most
$$\prod_{S \in \mathcal{S}} e^{b|S|} \le e^{bL}$$
different ways of filling the long intervals.

Proof of Theorem 34. Let
$$\underline{R}(\omega) = \liminf_{n \to \infty} \frac{\log R_n^{\mathrm{no}}(\omega)}{n} \quad \text{and} \quad \overline{R}(\omega) = \limsup_{n \to \infty} \frac{\log R_n(\omega)}{n}.$$
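First we claim that $\overline{R}(\omega) \le h_\mu$ for µ-a.e. $\omega$. Indeed, let $\varepsilon > 0$ and $h > h_\mu + \varepsilon$. For $n \ge N$ we have
$$\mu(\Gamma(N) \cap \{ R_n \ge e^{nh} \}) \le e^{-nh} \int_{\Gamma(N)} R_n \, d\mu \le e^{-nh} \sum_{|C| = n} \int_{C \cap \Gamma(N)} \tau_C \, d\mu \le e^{-nh} e^{n(h_\mu + \varepsilon)}$$
by Kac's lemma. This upper bound is summable in $n$, hence by the Borel–Cantelli lemma we have $R_n < e^{nh}$ eventually a.e. Hence $\overline{R} \le h_\mu$ µ-a.e. Assume that $h_\mu > 0$, otherwise the proof is finished. By Lemma 37, it suffices to prove that $\underline{R}(\omega) \ge h_\mu$ for µ-a.e. $\omega$.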
Since $R_n^{\mathrm{no}}(\sigma\omega) \le R_{n+1}^{\mathrm{no}}(\omega)$, we have $\underline{R}(\sigma\omega) \le \underline{R}(\omega)$ for all $\omega$. This implies that $\underline{R} \circ \sigma = \underline{R}$ µ-a.e., hence $\underline{R}$ is equal to a constant $b_0$ a.e. Suppose for a contradiction that $b_0 < h_\mu$, and fix $b < h \in (b_0, h_\mu)$. Let $\delta \in (0, h - b)$, take $\varepsilon > 0$ given by Lemma 38 such that $p^\varepsilon e^{b + \delta} < e^h$, and fix $M > 1/\varepsilon$. Let
$$A_N = \left\{ \omega \in \Sigma : \exists n \in [|M, N|],\ \frac{\log R_n^{\mathrm{no}}(\omega)}{n} < b \right\}.$$
If $N$ is sufficiently large then $\mu(A_N) > 1 - \varepsilon/2$. Let
$$B_L = \left\{ \omega \in \Sigma : \frac{1}{L} \sum_{k=0}^{L} 1_{A_N}(\sigma^k \omega) > 1 - \varepsilon/2 \right\}.$$
If $L$ is sufficiently large, it follows from the Birkhoff ergodic theorem that $\mu(B_L) > 0.1$ and furthermore $e^{bN} < \varepsilon L/2$. We now count the number of cylinders of length $L$ which may contain an $\omega \in B_L$. Let $\omega \in B_L$. We see a pattern $\mathcal{S}$ in $\omega_{[|0, L-1|]}$ in the following way. If $\omega \notin A_N$ then the first element of $\mathcal{S}$ is the singleton $[|0|]$, otherwise we take $[|0, n-1|]$ where $n$ is between $M$ and $N$, and $R_n^{\mathrm{no}}(\omega) < e^{bn}$. If the pattern is already constructed up to the position $k - 1 \le L - e^{bN}$, then the next element will be the singleton $[|k|]$ if $\sigma^k \omega \notin A_N$, otherwise the interval $[|k, k+n-1|]$ where $n$ is between $M$ and $N$, and $R_n^{\mathrm{no}}(\sigma^k \omega) < e^{bn}$. The remaining part of the pattern is made of singletons $[|k|]$, with $k \in [|L - e^{bN}, L-1|]$. There are at most $e^{bN} + \varepsilon L/2 < \varepsilon L$ singletons, hence the sequence $\omega_{[|0, L-1|]}$ follows the pattern $\mathcal{S}$. By Lemma 39, there are at most $p^{\varepsilon L} e^{bL}$ sequences following the pattern $\mathcal{S}$. In addition, by Lemma 38, there are at most $e^{\delta L}$ patterns, hence the number of sequences is bounded by $p^{\varepsilon L} e^{\delta L} e^{bL} < e^{hL}$. The contradiction comes from the fact that $h < h_\mu$, and $\mu(B_L) > 0.1$ for some $L$ arbitrarily large.

4. Fluctuation of the Return Time

The literature on this subject is vast and still growing rapidly in different directions. Again, we will focus here on a particular aspect. We refer the reader to the reviews on this subject by Coelho [15] and Abadi and Galves [1]. They are certainly an excellent starting point for a broader and also for a historical exposition of the field.

4.1. Exponential law

The exponential law and the Poisson distributions are often called laws of rare events. Indeed the time before the first occurrence of an event in an i.i.d. process has a
geometric law; in the limit of rare events (i.e. the probability of the event is small), geometric laws are well approximated by exponential laws.

Theorem 40. For µ-a.e. $x_0$, the random variable $\mu(B(x_0, r))\, \tau_{B(x_0, r)}(\cdot)$ converges in distribution, under the laws of $\mu$ or $\mu_{B(x_0, r)}$, to an exponential with parameter one.

We note that most of the works on this subject consider cylinder sets of a symbolic dynamics. A few works considering return times to balls or natural sets [16, 17, 35, 20, 19] are emerging. If $x_0$ is a periodic point of period $p$, a large proportion of points in $B(x_0, r)$ should be back in the ball after $p$ iterations (this depends on the measure also). Therefore the statement for return times should be false for periodic points. An extra Dirac mass at the origin of the distribution should appear in the limiting law, if it exists. Indeed, an exponential approximation for the hitting time is often valid, but with a different normalization [29].

The first approach to establish a version of the theorem was to discard points with short recurrence: e.g. the exponential law is proved for cylinders which do not recur before half of their length, and one then proves that this concerns almost all cylinders. The novel approach presented here is to consider Lebesgue density points for the property that recurrence rate and dimension coincide. The advantage is that it allows one to give a very short and simple proof, not based on the symbolic dynamics. This is essential when looking at return times to balls, especially in higher dimensional systems.

The next lemma exploits the basic idea that the geometric distribution appears when there is a loss of memory.

Lemma 41. Let $A$ be a measurable set with $\mu(A) > 0$ and let
$$\delta(A) := \sup_k |\mu(\tau_A > k) - \mu_A(\tau_A > k)|.$$
Then for any integer $n$, we have
$$|\mu(\tau_A > n) - (1 - \mu(A))^n| \le \delta(A).$$

Proof. For any integer $k \ge 0$, the same argument as in (1) gives
$$\mu(\tau_A > k+1) = \mu(\tau_A > k) - \mu(A)\mu_A(\tau_A > k) = (1 - \mu(A))\mu(\tau_A > k) + \mu(A)\left( \mu(\tau_A > k) - \mu_A(\tau_A > k) \right)$$
by invariance of the measure. Thus
$$|\mu(\tau_A > k+1) - (1 - \mu(A))\mu(\tau_A > k)| \le \mu(A)\delta(A).$$
Therefore, by an immediate induction, for any integer $n$,
$$|\mu(\tau_A > n) - (1 - \mu(A))^n| \le \sum_{k=0}^{n-1} (1 - \mu(A))^k \delta(A)\mu(A) \le \frac{1}{\mu(A)} \delta(A)\mu(A) \le \delta(A).$$
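In the memoryless situation where $\delta(A) = 0$, the tail of the waiting time is exactly geometric, and the rescaling by the probability of the event produces the exponential law in the limit of rare events. A minimal numerical sketch of this approximation follows; the function names are hypothetical, and the i.i.d. setting is an assumption made only for this illustration.

```python
import math

def geometric_tail(p, n):
    """P(tau > n) for the waiting time of an event of probability p in an i.i.d. process."""
    return (1.0 - p) ** n

def rescaled_tail_error(p, t):
    """|P(p * tau > t) - exp(-t)| for the geometric law with parameter p."""
    # tau is integer valued, so {p * tau > t} = {tau > floor(t / p)}.
    n = math.floor(t / p)
    return abs(geometric_tail(p, n) - math.exp(-t))

# The error shrinks as the event becomes rarer.
for p in (1e-1, 1e-2, 1e-3, 1e-4):
    print(p, rescaled_tail_error(p, 1.0))
```

The same rescaling by the measure of the set is what appears in the statement of the exponential law for return and hitting times.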
The point is now to estimate the distance $\delta(B(x_0, r))$ between the distribution of return and entrance times in $B(x_0, r)$. Clearly this has to do with the mixing property. However, for short returns the mixing may not be strong enough, therefore one has to take care of them in a special manner. For short returns, this is done by a direct application of the recurrence rate result Theorem 28.

Lemma 42. For any $d \in (0, \dim_H \mu)$, we have
$$\mu_{B(x_0, r)}(\tau_{B(x_0, r)} \le r^{-d}) \to 0 \quad \text{as } r \to 0 \qquad (7)$$
for µ-a.e. $x_0 \in X$. We call such $x_0$ a non-sticky point.

Proof. Let $L = \{ x \in X : \forall r < r_0,\ \tau_{2r}(x) > r^{-d} \}$. Let $x_0 \in L$ be a Lebesgue density point of $L$, that is $\mu_{B(x_0, r)}(L) \to 1$ as $r \to 0$. Let $r < r_0$. If $x \in B(x_0, r)$ and $\tau_{B(x_0, r)}(x) \le r^{-d}$ we have $\tau_{2r}(x) \le r^{-d}$ as well, hence $x \in L^c$. Therefore
$$\mu_{B(x_0, r)}(\tau_{B(x_0, r)} \le r^{-d}) \le \mu_{B(x_0, r)}(L^c).$$
The conclusion follows by taking $r_0 \to 0$, which by Theorem 28 ensures that $\mu(L) \to 1$, and the Lebesgue density theorem which says that density points form a set of full measure.

Lemma 43. For any $d \in (0, \dim_H \mu)$, one has for µ-a.e. $x_0 \in X$
$$\mu(\tau_{B(x_0, r)} \le r^{-d}) \to 0 \quad \text{as } r \to 0.$$

Proof. This is a direct consequence of the inequality $\mu(\tau_A \le n) \le n\mu(A)$, valid for any measurable set $A$, and the existence of the pointwise dimension.

Inevitably, one will need a geometric measure theoretic hypothesis of the form:

Hypothesis A. $x_0$ is such that there exist $a > 0$ and $b \ge 0$ such that
$$\mu(B(x_0, r) \setminus B(x_0, r - \rho)) \le r^{-b} \rho^a \qquad (8)$$
for any $r > 0$ sufficiently small.
Lemma 44. In our setting, Hypothesis A is satisfied for any point $x_0$.

Proof. The Gibbs property and uniform expansion imply that there exist two constants $a, b > 0$ such that for any integer $k$
$$\mu(J_k(x)) \le e^{-ak} \quad \text{and} \quad e^{-bk} \le \mathrm{diam}(J_k(x)).$$
Therefore, any interval of length $e^{-bk}$ has non-empty intersection with at most two cylinders, hence its measure is bounded by $2e^{-ak}$. Hence there exist $c, d > 0$ such that any interval $I$ has a measure
$$\mu(I) \le c\, \mathrm{diam}(I)^d. \qquad (9)$$
For large times, and around such a point $x_0$, one can use the mixing property to estimate $\delta(B(x_0, r))$.

Lemma 45. For µ-a.e. $x_0$ we have $\delta(B(x_0, r)) \to 0$ as $r \to 0$.

Proof. Let $d \in (0, \dim_H \mu)$ and let $x_0$ be a non-sticky point. Write for simplicity $A = B(x_0, r)$ and $E_n = \{ \tau_A \ge n \}$. Let $A' = B(x_0, r - \rho)$, let $g \le n$ be an integer and define the function $\phi(x) = \max(0, 1 - \rho^{-1} d(x, A'))$. We remark that $\phi$ is $\rho^{-1}$-Lipschitz and that $1_{A'} \le \phi \le 1_A$. We make several approximations:
$$|\mu(A \cap E_n) - \mu(A \cap T^{-g} E_{n-g})| \le \mu(A \cap \{ \tau_A \le g \}) \qquad (10)$$
$$\left| \mu(A \cap T^{-g} E_{n-g}) - \int \phi\, 1_{E_{n-g}} \circ T^g \, d\mu \right| \le \mu(A \setminus A') \qquad (11)$$
$$\left| \int \phi\, 1_{E_{n-g}} \circ T^g \, d\mu - \mu(E_{n-g}) \int \phi \, d\mu \right| \le c\theta^g \rho^{-1} \qquad (12)$$
$$\left| \int \phi \, d\mu - \mu(A) \right| \le \mu(A \setminus A') \qquad (13)$$
$$|\mu(T^{-g} E_{n-g}) - \mu(E_n)| \le \mu(\tau_A \le g). \qquad (14)$$
Putting together all these estimates gives
$$|\mu_A(E_n) - \mu(E_n)| = \frac{1}{\mu(A)} |\mu(A \cap E_n) - \mu(A)\mu(E_n)| \le \mu_A(\tau_A \le g) + \frac{2\mu(A \setminus A')}{\mu(A)} + \frac{c}{\mu(A)} \theta^g \rho^{-1} + \mu(\tau_A \le g).$$
Observe that this upper bound holds even for $n \le g$, so that it is also an upper bound for $\delta(B(x_0, r))$. Now we choose $g = r^{-d}$ and $\rho = \theta^{g/2}$. The first term goes to zero by Lemma 42, the last one by Lemma 43. The measure $\mu(A \setminus A')$ is bounded by $r^{-b} \theta^{ag/2}$ by (8) thanks to Lemma 44. This proves $\delta(B(x_0, r)) \to 0$.
Proof of Theorem 40. Fix $t > 0$. We still write $A = B(x_0, r)$. Taking $n = \lfloor t/\mu(A) \rfloor$, since $\mu(\mu(A)\tau_A > t) = \mu(\tau_A > n)$, we get by Lemma 41 that
$$|\mu(\mu(A)\tau_A > t) - e^{-t}| \le \delta(A) + |(1 - \mu(A))^n - e^{-t}|.$$
By Lemma 45, it suffices to show that the last term goes to zero when $r \to 0$. It is bounded by
$$\left| (1 - \mu(A))^n - \left( 1 - \frac{t}{n} \right)^n \right| + \left| \left( 1 - \frac{t}{n} \right)^n - e^{-t} \right|.$$
It is well known that $(1 - \frac{t}{n})^n$ converges to $e^{-t}$ as $n \to \infty$, and the other term is bounded by $n|\mu(A) - \frac{t}{n}| \le \frac{t}{n}$. This shows that the hitting time, rescaled by the measure of the ball, converges in distribution to an exponential. The statement for the return time follows since the two distributions differ by $\delta(B(x_0, r))$, which goes to zero as $r \to 0$.

Remark 46. It was shown by Lacroix [41] that if instead of a ball $B(x, r)$ one allows any type of neighborhoods, then any possible limiting distribution can appear; see also [21]. On the other hand, if a limiting distribution exists for the return time, then it also exists for the hitting time, and the two are related by an integral relation [31]. The only fixed point of this relation is, not by chance, the exponential distribution. For successive return times, one expects a Poisson limit distribution [17, 34, 30, 49, 54], or more generally a compound Poisson distribution.

5. Smallest Return Time in Balls

We now investigate the first return time of a set $A \subset X$:
$$\tau(A) := \min\{ n \ge 1 : A \cap T^{-n} A \ne \emptyset \} = \inf_{x \in A} \tau_A(x).$$
This quantity arose in two different contexts. First, it is the basic ingredient in the definition of the dimension for Poincaré recurrence introduced by Afraimovich [2]. The second motivation was the proof of the exponential law [18, 35]. As already said before, for cylinders with a short periodic orbit the distribution of return times is not exponential. It is also related to the speed of approximation to the exponential law [51].

5.1. Rate of recurrence for cylinders under positive entropy

We show now that in a symbolic system with positive entropy, the first return time of a cylinder is at least of the order of its size. This result was established in [55]. The proof presented here is from [3].

Theorem 47. Let $\xi$ be a finite measurable partition with strictly positive entropy $h_\mu(T, \xi)$. Then the lower rate of Poincaré recurrences for cylinders is almost surely larger than one, i.e. for µ-a.e. $x \in X$ one has
$$\liminf_{n \to \infty} \frac{\tau(\xi_n(x))}{n} \ge 1.$$

Proof. We keep the notations from Sec. 3.2.1 and write the proof in the symbolic representation. Fix $\varepsilon \in (0, h_\mu/3)$. We choose $N$ so large that $\Gamma = \Gamma(N)$ (see (6))
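has a measure at least $1 - \varepsilon$. We can choose $c$ so large that for any $\omega \in \Gamma$ and any positive integer $n$
$$c^{-1} e^{[-nh_\mu - n\varepsilon]} \le \mu(C_n(\omega)) \le c\, e^{[-nh_\mu + n\varepsilon]}. \qquad (15)$$
Let $\delta = 1 - \frac{3}{h_\mu}\varepsilon$ and set
$$A_n := \{ \omega \in \Gamma : \tau(C_n(\omega)) \le \delta n \}.$$
Obviously
$$A_n = \bigcup_{1 \le k \le \delta n} P_n(k) \quad \text{where} \quad P_n(k) := \{ \omega \in \Gamma : \tau(C_n(\omega)) = k \}.$$
We shall prove that $\sum_n \mu(A_n) < \infty$. Let $n$ be a positive integer and $1 \le k \le n$. If the return time of the cylinder $C = [\omega_0 \cdots \omega_{n-1}]$ is equal to $k$, i.e. $\tau(C) = k$, then it can be readily checked that $\omega_{j+k} = \omega_j$ for all $0 \le j \le n - k - 1$. This means that any block made of $k$ consecutive symbols completely determines the cylinder $C$. Let
$$\mathcal{Z} = \{ C_k(\omega) : \omega \in P_n(k) \}.$$
Because of the structure of the cylinders under consideration, for any cylinder $Z \in \mathcal{Z}$ there is a unique cylinder $C_Z$ of length $n$ such that $C_Z \subset Z$ and one has $Z \cap P_n(k) \subset C_Z$. This implies
$$\mu(P_n(k)) = \sum_{Z \in \mathcal{Z}} \mu(Z \cap P_n(k)) \le \sum_{Z \in \mathcal{Z}} \mu(C_Z).$$
But for each $Z \in \mathcal{Z}$, we have $Z \cap \Gamma \ne \emptyset$ and $C_Z \cap \Gamma \ne \emptyset$, thus there exists $\omega \in \Gamma$ such that $Z = C_k(\omega)$ and $C_Z = C_n(\omega)$. Using (15) we get
$$\mu(C_n(\omega)) \le c \exp[-nh_\mu + n\varepsilon] \quad \text{and} \quad 1 \le c\, \mu(C_k(\omega)) \exp[kh_\mu + k\varepsilon].$$
Multiplying these inequalities we get
$$\mu(C_Z) \le c^2 \exp[-nh_\mu + n\varepsilon] \exp[kh_\mu + k\varepsilon]\, \mu(Z).$$
Summing up on $Z \in \mathcal{Z}$ we get (recall that $k \le n$)
$$\mu(P_n(k)) \le c^2 \exp[-(n - k)h_\mu + 2n\varepsilon].$$
This implies that
$$\mu(A_n) = \sum_{1 \le k \le \delta n} \mu(P_n(k)) \le c^2 \frac{e^{h_\mu}}{e^{h_\mu} - 1} \exp[-n(h_\mu - \delta h_\mu - 2\varepsilon)].$$
Since $h_\mu - \delta h_\mu - 2\varepsilon = h_\mu - (1 - \frac{3}{h_\mu}\varepsilon)h_\mu - 2\varepsilon = \varepsilon > 0$, we get that
$$\sum_{n \ge 1} \mu(A_n) < +\infty.$$
In view of the Borel–Cantelli lemma, we finally get that for µ-almost every $\omega \in \Gamma$, $\tau(C_n(\omega)) \ge (1 - \frac{3}{h_\mu}\varepsilon)n$, except for finitely many integers $n$. Since in addition $\mu(\Gamma) > 1 - \varepsilon$, the arbitrariness of $\varepsilon$ implies the desired result.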
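The periodicity constraint used in the proof can be made concrete. The sketch below assumes the full shift (every transition between symbols is allowed), in which case the smallest return time of a cylinder is the smallest shift under which its defining word overlaps itself; the function names are hypothetical and serve only this illustration.

```python
def cylinder_return_time(word):
    """Smallest k >= 1 such that word[j + k] == word[j] for all 0 <= j <= n - k - 1.

    Assumption: full shift (every transition allowed), so this k equals
    tau(C) for the cylinder C = [word]; k = len(word) always qualifies.
    """
    n = len(word)
    for k in range(1, n + 1):
        if word[k:] == word[:n - k]:
            return k
    return n  # only reached for the empty word

def determined_by_prefix(word, k):
    """Rebuild a word of the same length from its first k symbols, repeated with period k."""
    return [word[j % k] for j in range(len(word))]

word = [0, 1, 0, 1, 0]
k = cylinder_return_time(word)
# When tau(C) = k, the first k symbols determine the whole cylinder.
rebuilt = determined_by_prefix(word, k)
```

A short return time therefore forces the word to be a repetition of a short block, which is why such cylinders carry little total measure when the entropy is positive.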
In the special case of a Markov partition, the other inequality is easy:

Proposition 48. For the Markov partition $\mathcal{J}$ we have for any $x \in X$
$$\limsup_{n \to \infty} \frac{\tau(J_n(x))}{n} \le 1.$$

Proof. By the Markov property, any cylinder $J_n(x)$ contains a periodic point of period at most $n + m_0$. Therefore, $\tau(J_n(x)) \le n + m_0$.

5.2. Local rate of return for balls

These symbolic recurrence rates can be translated to estimate the return time of balls, that is, to estimate $\tau(B(x, r))$.

Definition 49. We call a point $x \in X$ super-regular with respect to a partition $\xi$ if its orbit does not approach exponentially fast the boundary of the partition: $\lim \frac{1}{n} \log d(T^n x, \partial\xi) = 0$.

Lemma 50. Let $\mu$ be any Gibbs measure of a Hölder potential. Then almost every point is super-regular with respect to the Markov partition $\mathcal{J}$.

Proof. The boundary of the partition is composed of a finite number $p$ of points, therefore by Eq. (9) in the proof of Lemma 44 we get $\mu(x : d(x, \partial\xi) < \varepsilon) \le pc\varepsilon^d$ for any $\varepsilon > 0$. By invariance of the measure this implies that for any $\nu > 0$
$$\sum_n \mu(x : d(T^n x, \partial\xi) < e^{-\nu n}) < \infty.$$
The conclusion follows then by the Borel–Cantelli lemma.

Lemma 51. If $x$ is super-regular with respect to the partition $\mathcal{J}$ and $\lambda > \bar\lambda(x) := \limsup \frac{1}{n} \log |(T^n)'(x)|$, we have $B(x, e^{-\lambda n}) \subset J_n(x)$ for any $n$ sufficiently large.

Proof. Let $\nu > 0$ be such that $\bar\lambda(x) + \nu < \lambda$. By super-regularity, there exists $c > 0$ such that for any integer $k$,
$$d(T^k x, \partial\xi) \ge c e^{-\nu k}. \qquad (16)$$
Let $n_0$ be such that for any $n > n_0$,
$$|(T^n)'(x)| \le \frac{c}{D} e^{(\lambda - \nu)n}. \qquad (17)$$
We claim that for any $n > n_0$, and any $k \le n$, $B(x, e^{-\lambda n}) \subset J_k(x)$. Indeed this is true for $k = 0$ since $e^{-\lambda n} \le c$. Moreover, if this holds for some integer $k < n$ then we get
$$T^k(B(x, e^{-\lambda n})) \subset B(T^k x, D|(T^k)'(x)|e^{-\lambda n}) \subset J_0(T^k x),$$
by (16) and (17). Therefore the ball $B(x, e^{-\lambda n})$ is contained in $J_{k+1}(x)$. This proves the claim by induction.

Theorem 52. Let $\mu$ be a Gibbs measure of a Hölder potential. Then for µ-a.e. $x$ we have
$$\lim_{r \to 0} \frac{\tau(B(x, r))}{|\log r|} = \frac{1}{\lambda_\mu}.$$

Proof. Let $x$ be a point which has a Lyapunov exponent $\lambda(x)$ equal to $\lambda_\mu$, which is super-regular with respect to the Markov partition and such that the lower recurrence rate for cylinders $J_n(x)$ is at least equal to one. This holds for a.e. point by Lemma 50 and Theorem 47. By Lemma 51, for any $\lambda > \lambda_\mu$ we have
$$\liminf_{n \to \infty} \frac{\tau(B(x, e^{-\lambda n}))}{|\log e^{-\lambda n}|} \ge \frac{1}{\lambda} \liminf_{n \to \infty} \frac{\tau(J_n(x))}{n}.$$
This proves the lower bound. By Proposition 9, we have $\mathrm{diam}(J_n(x)) \le c_1 |(T^n)'(x)|^{-1}$. Taking $n = n(r)$ the smallest integer such that the upper bound is less than $r$, we get that $J_n(x) \subset B(x, r)$. The conclusion follows now by Proposition 48.

Remark 53. The upper bound $\limsup_{r \to 0} \frac{\tau(B(x, r))}{|\log r|} \le \frac{1}{\lambda_\mu}$ still holds in higher dimension for (non-conformal) expanding maps, under some Markov assumption. The lower bound in the first part of the proof may be generalized to higher dimensional dynamical systems [56], under a weak regularity condition (of the type in [39] which ensures the existence of Lyapunov charts). In that case one has to replace $\lambda_\mu$ by the largest Lyapunov exponent $\Lambda_\mu := \lim \frac{1}{n} \int \log \|d_x T^n\| \, d\mu$. Unfortunately these two inequalities only give a range of possible values for the local rate of return for balls. There are examples [56] where the bounds are attained, and also where these are not sharp. This suggests that establishing the existence of the local rate of return in the non-conformal case, and computing it, is still out of reach.
5.3. Dimension for Poincaré recurrence

These rates of return for balls are the basic ingredient of the definition of the dimension for Poincaré recurrence, or the Afraimovich–Pesin dimension [2, 47]. Define for $A \subset X$, $q \in \mathbb{R}$ and $\alpha \in \mathbb{R}$ the quantity
$$M(A, q, \alpha) = \lim_{\varepsilon \to 0} \inf_{\{B_i\}} \sum_i e^{-q\tau(B_i)} (\mathrm{diam}\, B_i)^\alpha$$
where the infimum is taken among all countable covers of $A$ by balls $B_i$ of diameter at most $\varepsilon$. Let $\alpha(A, q)$ denote the transition point of $M(A, q, \alpha)$ from $+\infty$ to zero. The spectrum $\alpha(\cdot, \cdot)$ is a generalization of the Hausdorff dimension and has been introduced and computed for some geometric constructions and Markov maps of the interval [4, 23].
The behavior of $\tau(B(x, r))/|\log r|$ is closely related to a corresponding pointwise dimension [14, 3]. This spectrum has also been computed for surface diffeomorphisms [56] and for a general class of interval maps [36].

Acknowledgment

It is a great pleasure to thank Jean-René Chazottes for the good remarks on a preliminary version of this paper, and the organizers of the workshop on hitting, returning and matching in dynamical systems, information theory and mathematical biology, EURANDOM, Eindhoven 2008. I acknowledge the referee for valuable comments and suggestions.

References

[1] M. Abadi and A. Galves, Inequalities for the occurrence times of rare events in mixing processes. The state of art, Markov Process. Related Fields 7 (2001) 97–112.
[2] V. Afraimovich, Pesin's dimension for Poincaré recurrences, Chaos 7 (1997) 12–20.
[3] V. Afraimovich, J.-R. Chazottes and B. Saussol, Pointwise dimensions for Poincaré recurrence associated with maps and special flows, Discrete Contin. Dyn. Syst. 9 (2003) 263–280.
[4] V. Afraimovich, J. Schmeling, E. Ugalde and J. Urias, Spectra of dimensions for Poincaré recurrences, Discrete Contin. Dyn. Syst. 6 (2000) 901–914.
[5] L. Barreira, Ya. Pesin and J. Schmeling, Dimension and product structure of hyperbolic measures, Ann. Math. 149 (1999) 755–783.
[6] L. Barreira and B. Saussol, Hausdorff dimension of measures via Poincaré recurrence, Comm. Math. Phys. 219 (2001) 443–463.
[7] L. Barreira and B. Saussol, Product structure of Poincaré recurrence, Ergodic Theory Dynam. Systems 22 (2002) 33–61.
[8] G. Birkhoff, Proof of the ergodic theorem, Proc. Natl. Acad. Sci. USA 17 (1931) 650–655.
[9] M. Boshernitzan, Quantitative recurrence results, Invent. Math. 113 (1993) 617–631.
[10] R. Bowen, Equilibrium States and the Ergodic Theory of Anosov Diffeomorphisms, Lecture Notes in Mathematics, Vol. 470, 2nd edn. (Springer-Verlag, Berlin-New York, 2008).
[11] X. Bressaud and R. Zweimüller, Non exponential law of entrance times in asymptotically rare events for intermittent maps with infinite invariant measure, Ann. Inst. H. Poincaré 2 (2001) 1–12.
[12] H. Bruin, B. Saussol, S. Troubetzkoy and S. Vaienti, Statistics of return time via inducing, Ergodic Theory Dynam. Systems 23 (2003) 991–1013.
[13] H. Bruin and S. Vaienti, Return time statistics for unimodal maps, Fund. Math. 176 (2003) 77–94.
[14] J.-R. Chazottes and B. Saussol, On pointwise dimensions and spectra of measures, C. R. Acad. Sci. Paris Sér. I Math. 333 (2001) 719–723.
[15] Z. Coelho, Asymptotic laws for symbolic dynamical systems, in Topics in Dynamical Systems and Applications, eds. F. Blanchard, A. Maass and A. Nogueira, London Math. Soc. Lecture Notes Series, Vol. 279 (Cambridge University Press, 2000), pp. 123–165.
[16] P. Collet, Statistics of closest return for some non-uniformly hyperbolic systems, Ergodic Theory Dynam. Systems 21 (2001) 401–420.
[17] P. Collet, A. Galves and B. Schmitt, Unpredictability of the occurrence time of a long laminar period in a model of temporal intermittency, Ann. Inst. H. Poincaré Phys. Théor. 57 (1992) 319–331.
[18] P. Collet, A. Galves and B. Schmitt, Repetition time for gibbsian source, Nonlinearity 12 (1999) 1225–1237.
[19] M. Denker, M. Gordin and A. Sharova, A Poisson limit theorem for toral endomorphisms, Illinois J. Math. 48 (2004) 1–20.
[20] D. Dolgopyat, Limit theorems for partially hyperbolic systems, Trans. Amer. Math. Soc. 356 (2004) 1637–1689.
[21] F. Durand and A. Maass, A note on limit laws for minimal Cantor systems, Discrete Contin. Dyn. Syst. 9 (2003) 745–750.
[22] D. J. Feng and J. Wu, The Hausdorff dimension of recurrent set in symbolic spaces, Nonlinearity 14 (2001) 81–85.
[23] B. Fernandez, E. Ugalde and J. Urias, Spectrum of dimensions for Poincaré recurrences of Markov maps, Discrete Contin. Dyn. Syst. 8 (2002) 835–849.
[24] H. Furstenberg, Recurrence in Ergodic Theory and Combinatorial Number Theory (Princeton University Press, 1981).
[25] S. Galatolo, Hitting time and dimension in axiom A systems, generic interval exchanges and an application to Birkoff sums, J. Stat. Phys. 123 (2006) 111–124.
[26] S. Galatolo, Dimension and hitting time in rapidly mixing systems, Math. Res. Lett. 14 (2007) 797–805.
[27] S. Galatolo and D.-H. Kim, The dynamical Borel–Cantelli lemma and the waiting time problems, Indag. Math. (N.S.) 18 (2007) 421–434.
[28] S. Galatolo, D.-H. Kim and K. Park, The recurrence time for ergodic systems with infinite invariant measures, Nonlinearity 19 (2006) 2567–2580.
[29] A. Galves and B. Schmitt, Inequalities for hitting times in mixing dynamical systems, Random Comput. Dyn. 5 (1997) 337–348.
[30] N. Haydn, Statistical properties of equilibrium states for rational maps, Ergodic Theory Dynam. Systems 20 (2000) 1371–1390.
[31] N. Haydn, Y. Lacroix and S. Vaienti, Hitting and return times in ergodic dynamical systems, Ann. Probab. 33 (2005) 2040–2050.
[32] N. Haydn, J. Luevano, G. Mantica and S. Vaienti, Multifractal properties of return time statistics, Phys. Rev. Lett. 88 (2002) 224502.
[33] M. Hirata, Poisson law for Axiom A diffeomorphism, Ergodic Theory Dynam. Systems 13 (1993) 533–556.
[34] M. Hirata, Poisson law for the dynamical systems with the "self-mixing" conditions, in Dynamical Systems and Chaos (Hachioji, 1994), Vol. 1 (World Sci. Publishing, 1995), pp. 87–96.
[35] M. Hirata, B. Saussol and S. Vaienti, Statistics of return times: A general framework and new applications, Comm. Math. Phys. 206 (1999) 33–55.
[36] F. Hofbauer, The recurrence dimension for piecewise monotonic maps of the interval, Ann. Sc. Norm. Super Pisa Cl. Sci. (5) 4 (2005) 439–449.
[37] M. Kac, On the notion of recurrence in discrete stochastic processes, Bull. Amer. Math. Soc. 53 (1947) 1002–1010.
[38] A. Katok and B. Hasselblatt, Introduction to the Modern Theory of Dynamical Systems, Encyclopedia of Mathematics and Its Application, Vol. 54 (Cambridge University Press, Cambridge, 1995).
[39] A. Katok, J. M. Strelcyn, F. Ledrappier and F. Przytycki, Invariant Manifolds, Entropy and Billiards; Smooth Maps with Singularities, Lectures Notes in Mathematics, Vol. 1222 (Springer-Verlag, Berlin, 1986).
[40] G. Keller, Equilibrium States in Ergodic Theory, London Mathematical Society Student Texts, Vol. 42 (Cambridge University Press, Cambridge, 1998).
[41] Y. Lacroix, Possible limit laws for entrance times of an ergodic aperiodic dynamical system, Israel J. Math. 132 (2002) 253–264.
[42] F. Ledrappier and L. S. Young, The metric entropy of diffeomorphisms. Part II: Relations between entropy, exponents and dimension, Ann. Math. 122 (1985) 540–574.
[43] S. Luzzatto, Stochastic-like behaviour in nonuniformly expanding maps, in Handbook of Dynamical Systems, eds. B. Hasselblatt and A. Katok, Vol. 1B (Elsevier B. V., Amsterdam, 2006), pp. 265–326.
[44] D. Ornstein and B. Weiss, Entropy and data compression schemes, IEEE Trans. Inform. Theory 39 (1993) 78–83.
[45] V. Oseledecs, A multiplicative ergodic theorem. Lyapunov characteristic numbers for dynamical systems, Tr. Mosk. Mat. Obs. 19 (1968) 179–210; Trans. Moscow Math. Soc. 19 (1968) 197–221 (English translation).
[46] W. Parry and M. Pollicott, Zeta functions and the periodic orbit structure of hyperbolic dynamics, Astérisque 187–188 (1990) 1–268.
[47] V. Penné, B. Saussol and S. Vaienti, Dimensions for recurrence times: Topological and dynamical properties, Discrete Contin. Dyn. Syst. 5 (1999) 783–798.
[48] Ya. B. Pesin, Dimension Theory in Dynamical Systems. Contemporary Views and Applications, Chicago Lectures in Mathematics (University of Chicago Press, 1997).
[49] B. Pitskel, Poisson law for Markov chains, Ergodic Theory Dynam. Systems 11 (1991) 501–513.
[50] A. Quas, An entropy estimator for a class of infinite alphabet processes, Theor. Veroyatnost. i Primenen. 43 (1998) 61–621.
[51] B. Saussol, On fluctuations and the exponential statistics of return times, Nonlinearity 14 (2001) 179–191.
[52] B. Saussol, Recurrence rate in rapidly mixing dynamical systems, Discrete Contin. Dyn. Syst. 15 (2006) 259–267.
[53] B. Saussol and J. Wu, Recurrence spectrum in smooth dynamical system, Nonlinearity 16 (2003) 1991–2001.
[54] B. A. Sevast'yanov, Poisson limit law for a scheme of sums of independent random variables, Theory Probab. Appl. 17 (1972) 695–699.
[55] B. Saussol, S. Troubetzkoy and S. Vaienti, Recurrence, dimensions and Lyapunov exponents, J. Statist. Phys. 106 (2002) 623–634.
[56] B. Saussol, S. Troubetzkoy and S. Vaienti, Recurrence and Lyapunov exponents, Moscow Math. J. 3 (2003) 189–203.
[57] M. Urbanski, Recurrence rates for loosely Markov dynamical systems, J. Aust. Math. Soc. 82 (2007) 39–57.
[58] L. S. Young, Dimension, Entropy and Lyapunov exponents, Ergodic Theory Dynam. Systems 2 (1982) 109–124.
[59] L. S. Young, Statistical properties of dynamical systems with some hyperbolicity, Ann. Math. 147 (1998) 585–650.
September 14, 2009 15:31 WSPC/148-RMP
J070-00379
Reviews in Mathematical Physics Vol. 21, No. 8 (2009) 981–1044 © World Scientific Publishing Company
A RIGOROUS TREATMENT OF THE PERTURBATION THEORY FOR MANY-ELECTRON SYSTEMS
YOHEI KASHIMA
Institut für Theoretische Physik, Universität Heidelberg, Philosophenweg 19, 69120 Heidelberg, Germany
[email protected] Received 6 April 2009 Revised 7 July 2009
Four point correlation functions for many electrons at finite temperature in periodic lattice of dimension $d$ ($\ge 1$) are analyzed by the perturbation theory with respect to the coupling constant. The correlation functions are characterized as a limit of finite dimensional Grassmann integrals. A lower bound on the radius of convergence and an upper bound on the perturbation series are obtained by evaluating the Taylor expansion of logarithm of the finite dimensional Grassmann Gaussian integrals. The perturbation series up to second-order is numerically implemented along with the volume-independent upper bounds on the sum of the higher order terms in the 2-dimensional case.

Keywords: Fermionic Fock space; Hubbard model; Grassmann integral formulation; perturbation theory; numerical analysis.

Mathematics Subject Classification 2000: 81T25, 41A58, 65Z05
Contents

1. Introduction
2. The Perturbation Theory
2.1. The Hubbard model
2.2. The correlation function
2.3. The perturbation series
3. Grassmann Gaussian Integral Formulation
3.1. Discretization of the integral over [0, β]
3.2. The Grassmann Gaussian integral
4. Upper Bound on the Perturbation Series
4.1. The connected part of the exponential of Laplacian operator
4.2. Evaluation of upper bounds
5. Numerical Results in 2D
5.1. The decay constant for d = 2
5.2. The second-order perturbation
5.3. Numerical values
Appendix A. The Fermionic Fock Space
Appendix B. The Temperature-Ordered Perturbation Series
Appendix C. Diagonalization of the Covariance Matrix
References
1. Introduction

The thermal average of an observable $O$ for many electrons in a solid is expressed as $\mathrm{Tr}(e^{-\beta H} O)/\mathrm{Tr}\, e^{-\beta H}$, where $H$ is a Hamiltonian representing the total energy of the system, $\beta$ is the inverse temperature and the trace operation $\mathrm{Tr}$ is taken over the Fermionic Fock space, the Hilbert space of all the possible states of electrons. If the movements of electrons are confined to finitely many lattice sites under periodic boundary conditions, the Fermionic Fock space becomes finite dimensional. The thermal average $\mathrm{Tr}(e^{-\beta H} O)/\mathrm{Tr}\, e^{-\beta H}$ is then defined as a quotient of finite sums over an orthonormal basis spanning the space. Though the expectation value $\mathrm{Tr}(e^{-\beta H} O)/\mathrm{Tr}\, e^{-\beta H}$ has a clear mathematical meaning in this setting, rigorously controlling its behavior for interacting electrons poses a challenge. The purpose of this paper is to analyze the thermal expectation value for 4-point functions modeling paired electrons' condensation by means of the perturbation theory.

In the earlier article [13], Koma and Tasaki rigorously proved upper bounds on 2-point and 4-point correlation functions for the Hubbard model and concluded the decay properties in 1- and 2-dimensional cases. In an abstract general context, on the other hand, Feldman, Knörrer and Trubowitz gave a concise representation of the Schwinger functionals formulating the correlation functions via Grassmann integral and established upper bounds of the Schwinger functionals in [6]. Let us also mention the intensive renormalization group study by the same authors in [9], which analyzes the Grassmann integral formulation corresponding to the temperature zero limit of the correlation function for the momentum distribution function. The work [9] was presented as the 11th paper in the series of Feldman, Knörrer and Trubowitz's 2D Fermi liquid construction. A flow chart showing the hierarchical relation between these 11 papers is found in the digest [8].

In this paper, we focus on the correlation function $\mathrm{Tr}(e^{-\beta H} O)/\mathrm{Tr}\, e^{-\beta H}$ for the 4-point functions
\[ O = \psi^*_{\mathbf{x}_1\uparrow} \psi^*_{\mathbf{x}_2\downarrow} \psi_{\mathbf{y}_2\downarrow} \psi_{\mathbf{y}_1\uparrow} + \psi^*_{\mathbf{y}_1\uparrow} \psi^*_{\mathbf{y}_2\downarrow} \psi_{\mathbf{x}_2\downarrow} \psi_{\mathbf{x}_1\uparrow} \]
and the Hubbard model $H$ defined on a finite lattice. We expand the 4-point correlation function as a perturbation series with respect to the coupling constant and study the properties of the perturbation series. We especially aim at establishing upper bounds on the sum of higher order terms of the perturbation series so that one can numerically measure the error between the correlation function and the low order terms of the
perturbation series. More precisely, our goal is set to (1) find a constant $r > 0$ such that for any $U \in \mathbb{R}$ with $|U| \le r$
\[ \frac{\mathrm{Tr}(e^{-\beta H} O)}{\mathrm{Tr}\, e^{-\beta H}} = \sum_{n=0}^{\infty} a_n U^n, \]
where $U$ denotes the coupling constant and $a_n \in \mathbb{R}$ ($\forall n \in \mathbb{N} \cup \{0\}$), and to (2) establish an inequality of the form that for any $U \in \mathbb{R}$ with $|U| \le r$ and $m \in \mathbb{N} \cup \{0\}$
\[ \left| \frac{\mathrm{Tr}(e^{-\beta H} O)}{\mathrm{Tr}\, e^{-\beta H}} - \sum_{n=0}^{m} a_n U^n \right| \le R_{m+1}(|U|), \]
where $R_{m+1}(|U|) = O(|U|^{m+1})$ as $|U| \searrow 0$. The inequality claimed in (2) is proved in Theorem 4.10 as our main result and a volume-independent $r$ required in (1) and (2) is obtained in Proposition 5.1 for the 2-dimensional case.

Our strategy is based on the discretization of the integrals over the interval of temperature appearing in the temperature-ordered perturbation series. By replacing the integrals by finite Riemann sums, we obtain a fully discrete analog of the perturbation series in which all the variables run in finite sets. The discretized perturbation series is formulated in a finite dimensional Grassmann Gaussian integral, which is rigorously defined as a linear functional on the finite dimensional linear space of Grassmann algebras. See [22] for another approach to the finite dimensional Grassmann integral formulation based on the Lie–Trotter type formula. We then rewrite the 4-point correlation function as the Taylor series expansion of logarithm of the Grassmann Gaussian integral. By evaluating the partial derivatives of logarithm of the Grassmann Gaussian integral, which were characterized as the tree expansion by Salmhofer and Wieczerkowski in [23], and passing the parameter defining the Riemann sum to infinity, we obtain an upper bound on each term of the perturbation series of the original correlation function. For completeness of the paper and convenience for readers, the derivation of the temperature-ordered perturbation series is presented in the appendices.

As a key lemma, we make use of the volume- and temperature-independent upper bound on the determinant of the covariance matrix recently established by Pedra and Salmhofer in [19].
Pedra–Salmhofer's determinant bound enables us to find a numerical upper bound on the Fermionic perturbation theory in a simple argument. As one aim, this paper intends to show a practical application of Pedra–Salmhofer's determinant bound.

Let us note that the lower bound on the radius of convergence of the perturbation series proved in Theorem 4.10 and Proposition 5.1 below for the 2-dimensional case is proportional to $\beta^{-3}$. By applying advanced multi-scale, renormalization
techniques to the correlation functions of the 2-dimensional Hubbard model, Rivasseau [20] and Afchain, Magnen and Rivasseau [1] proved that a lower bound on the radius of convergence is proportional to $(\log \beta)^{-2}$, which is larger than our lower bound for large $\beta$, i.e., low temperature. In this article, however, we emphasize calculating the quantities in a simple manner so that readers can verify the construction of the theory by themselves, rather than improving the temperature dependence of the convergence of the perturbation theory via heavy machinery.

Our motivation to implement numerically the perturbation theory for many electrons with rigorous error estimates grew out of active research on numerical analysis for high temperature superconductivity. The macroscopic behavior of electromagnetic fields around a type-II superconductor is governed by a system of nonlinear Maxwell equations called the macroscopic critical-state models. Prigozhin initiated the variational formulation of the Bean critical-state model for type-II superconductivity and reported numerical simulations by the finite element method in [18]. Following Prigozhin's work [18], finite element approximations of various macroscopic models have been studied rigorously up to the present. See [2, 12] for the latest developments on this subject. On a smaller length scale, the density of superconducting charge carriers, the induced magnetic field and the motions of the quantized vortices in a type-II superconductor under an applied magnetic field can be simulated by solving the mesoscopic Ginzburg–Landau models. Numerical approximation schemes for the Ginzburg–Landau models such as the finite element method, the finite difference method and the finite volume method are summarized in the review article [5], which also explains extensions of the Ginzburg–Landau models to describe high temperature superconductivity characterized by d-wave pairing symmetry.
We now turn our attention to microscopic models governing many electrons in a solid and try to approximate the 4-point correlation functions, which are believed to exhibit the off-diagonal long-range order, as explained by Yang in [25], if superconductivity occurs in the system. However, the concept of error estimate for the numerical computation of the correlation functions formulated in the Fermionic Fock space has not yet appeared in the mathematical literature, in contrast to the macroscopic critical-state models and the mesoscopic Ginzburg–Landau models. Hence, in this paper we attempt to propose an error analysis for the numerical approximation of the correlation functions defined in microscopic quantum theory and implement our numerical scheme in practice.

The contents of this paper are outlined as follows. In Sec. 2, the model Hamiltonian and the correlation function of our interest are defined. The perturbation series of the correlation function is derived. In Sec. 3, the temperature-ordered perturbation series of the partition function is discretized and the discretized partition function is formulated in a finite dimensional Grassmann Gaussian integral. In Sec. 4, each coefficient of the perturbation series of the correlation function is evaluated and upper bounds on the sum over higher order terms are obtained as our main result. In Sec. 5, the perturbation series up to second order is numerically implemented together with the error estimates between the second-order perturbation
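and the correlation function in the 2-dimensional case. In Appendix A, the standard properties of the Fermionic Fock space are reviewed. A self-contained proof of the temperature-ordered perturbation series expansion is presented in Appendix B. Finally, the temperature-discrete covariance matrix is diagonalized and its determinant is calculated in Appendix C.

2. The Perturbation Theory

In this section, we define the Hamiltonian operator, formulate the 4-point correlation function governed by the Hamiltonian under finite temperature and expand the correlation function as a power series of the coupling constant. The main purpose of this paper is to analyze the properties of the power series of the 4-point correlation function derived in this section.

2.1. The Hubbard model

First of all we define the Hubbard model $H$ as the field Hamiltonian operator on the Fermionic Fock space along with various notations and parameters treated in this paper. The spatial lattice $\Gamma$ is defined by $\Gamma := \mathbb{Z}^d/(L\mathbb{Z})^d$, where $L\ (\in \mathbb{N})$ is the length of one edge of the rectangular lattice and $d\ (\in \mathbb{N})$ stands for the space dimension. On any set $S$ we define Kronecker's delta $\delta_{x,y}$ ($x, y \in S$) by $\delta_{x,y} := 1$ if $x$ is identical to $y$ in $S$, and $\delta_{x,y} := 0$ otherwise. For example, $\delta_{(0,0),(L,L)} = 1$ for $(0,0), (L,L) \in \mathbb{Z}^2/(L\mathbb{Z})^2$. For any proposition $A$ the function $1_A$ is defined by $1_A := 1$ if $A$ is true, and $1_A := 0$ otherwise.

Using the annihilation operator $\psi_{\mathbf{x}\sigma}$ and the creation operator $\psi^*_{\mathbf{x}\sigma}$, which is the adjoint operator of $\psi_{\mathbf{x}\sigma}$, at site $\mathbf{x} \in \Gamma$ and spin $\sigma \in \{\uparrow, \downarrow\}$, the free part $H_0$ and the interacting part $V$ of the Hubbard model $H$ are defined as follows.
\[ H_0 := \sum_{\mathbf{x},\mathbf{y}\in\Gamma} \sum_{\sigma,\tau\in\{\uparrow,\downarrow\}} F(\mathbf{x}\sigma, \mathbf{y}\tau)\, \psi^*_{\mathbf{x}\sigma} \psi_{\mathbf{y}\tau}, \qquad V := U \sum_{\mathbf{x}\in\Gamma} \psi^*_{\mathbf{x}\uparrow} \psi^*_{\mathbf{x}\downarrow} \psi_{\mathbf{x}\downarrow} \psi_{\mathbf{x}\uparrow}, \tag{2.1} \]
where
\[ F(\mathbf{x}\sigma, \mathbf{y}\tau) := \delta_{\sigma,\tau} \Biggl( -t \sum_{j=1}^{d} (\delta_{\mathbf{x},\mathbf{y}-\mathbf{e}_j} + \delta_{\mathbf{x},\mathbf{y}+\mathbf{e}_j}) - t' \cdot 1_{d\ge 2} \cdot \sum_{\substack{j,k=1 \\ j<k}}^{d} (\delta_{\mathbf{x},\mathbf{y}-\mathbf{e}_j-\mathbf{e}_k} + \delta_{\mathbf{x},\mathbf{y}-\mathbf{e}_j+\mathbf{e}_k} + \delta_{\mathbf{x},\mathbf{y}+\mathbf{e}_j-\mathbf{e}_k} + \delta_{\mathbf{x},\mathbf{y}+\mathbf{e}_j+\mathbf{e}_k}) - \mu\, \delta_{\mathbf{x},\mathbf{y}} \Biggr), \tag{2.2} \]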
the vectors $\mathbf{e}_j \in \Gamma$ ($j \in \{1, \ldots, d\}$) are given by $\mathbf{e}_j(l) = \delta_{j,l}$ for all $j, l \in \{1, \ldots, d\}$. The parameters $t, t', \mu, U \in \mathbb{R}$ are called the nearest neighbor hopping amplitude, the next to nearest neighbor hopping amplitude, the chemical potential and the coupling constant, respectively. Note that the term representing the next to nearest neighbor hopping in $F(\mathbf{x}\sigma, \mathbf{y}\tau)$ is effective only for $d \ge 2$.

The Hubbard model $H$ is defined by $H := H_0 + V$ and is a self-adjoint operator on the Fermionic Fock space $F_f(L^2(\Gamma \times \{\uparrow,\downarrow\}; \mathbb{C}))$. We summarize the definitions and the basic properties of the Fermionic Fock space and of the annihilation and creation operators in Appendix A. Here we note the fact that $\dim F_f(L^2(\Gamma \times \{\uparrow,\downarrow\}; \mathbb{C})) = 2^{2L^d} < +\infty$, which means that any linear operator on $F_f(L^2(\Gamma \times \{\uparrow,\downarrow\}; \mathbb{C}))$ can be considered as a matrix.

Let us prepare some more notations used in this paper. For any linear operator $A : F_f(L^2(\Gamma \times \{\uparrow,\downarrow\}; \mathbb{C})) \to F_f(L^2(\Gamma \times \{\uparrow,\downarrow\}; \mathbb{C}))$, $\mathrm{Tr}\, A$ is defined by
\[ \mathrm{Tr}\, A := \sum_{l=1}^{2^{2L^d}} \langle \phi_l, A\phi_l \rangle_{F_f}, \]
where $\langle \cdot, \cdot \rangle_{F_f}$ is the inner product of $F_f(L^2(\Gamma \times \{\uparrow,\downarrow\}; \mathbb{C}))$ (see Appendix A) and $\{\phi_l\}_{l=1}^{2^{2L^d}}$ is any orthonormal system of $F_f(L^2(\Gamma \times \{\uparrow,\downarrow\}; \mathbb{C}))$. The correlation function $\langle A \rangle$ under the finite temperature $T$ is defined by
\[ \langle A \rangle := \frac{\mathrm{Tr}(e^{-\beta H} A)}{\mathrm{Tr}\, e^{-\beta H}}, \]
where $\beta := 1/(k_B T) > 0$ with the Boltzmann constant $k_B > 0$. The momentum lattice $\Gamma^*$ is defined by $\Gamma^* := (2\pi\mathbb{Z}/L)^d/(2\pi\mathbb{Z})^d$.
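As an illustration of the definition (2.2), the kernel $F(\mathbf{x}\sigma, \mathbf{y}\tau)$ can be tabulated directly on a small periodic lattice. The sketch below is not part of the original construction: the function names, the string labels "up" and "down" for the spins, and the sample parameter values are all assumptions made for the example. Sites are represented as integer tuples reduced modulo $L$, which realizes the quotient $\mathbb{Z}^d/(L\mathbb{Z})^d$.

```python
import itertools


def shift(site, steps, L):
    """Return site + sum_j steps[j] * e_j in (Z / L Z)^d."""
    return tuple((s + step) % L for s, step in zip(site, steps))


def hopping_kernel(x, sigma, y, tau, L, t, t_prime, mu):
    """F(x sigma, y tau) of (2.2); the dimension d is read off from len(x)."""
    if sigma != tau:
        return 0.0
    d = len(x)
    value = -mu * (1 if x == y else 0)
    # Nearest neighbor hopping: delta_{x, y - e_j} + delta_{x, y + e_j}.
    for j in range(d):
        for sign in (-1, 1):
            steps = [0] * d
            steps[j] = sign
            value -= t * (1 if x == shift(y, steps, L) else 0)
    # Next to nearest neighbor hopping, present only for d >= 2 (pairs j < k).
    for j, k in itertools.combinations(range(d), 2):
        for sign_j in (-1, 1):
            for sign_k in (-1, 1):
                steps = [0] * d
                steps[j], steps[k] = sign_j, sign_k
                value -= t_prime * (1 if x == shift(y, steps, L) else 0)
    return value


def hopping_matrix(L, d, t, t_prime, mu):
    """Matrix (F(x sigma, y tau)) indexed by Gamma x {up, down}."""
    sites = list(itertools.product(range(L), repeat=d))
    index = [(x, s) for x in sites for s in ("up", "down")]
    matrix = [[hopping_kernel(x, s, y, r, L, t, t_prime, mu) for (y, r) in index]
              for (x, s) in index]
    return index, matrix


index, F = hopping_matrix(L=3, d=2, t=1.0, t_prime=0.3, mu=0.5)
print(len(index))  # number of (site, spin) pairs, 2 * L**d
```

The resulting matrix is real symmetric, in line with the self-adjointness of $H_0$. For $L = 2$ the two neighbors $\mathbf{y} \pm \mathbf{e}_j$ coincide, so the corresponding entry is counted twice, exactly as the sum of Kronecker deltas in (2.2) prescribes.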
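For any vectors $\alpha, \gamma$ of algebra of length $n$, let $\langle \alpha, \gamma \rangle$ denote $\sum_{l=1}^{n} \alpha(l)\gamma(l)$. Let $\langle \cdot, \cdot \rangle_{\mathbb{C}^n}$ denote the inner product of $\mathbb{C}^n$ defined by $\langle u, v \rangle_{\mathbb{C}^n} := \sum_{l=1}^{n} \overline{u(l)}\, v(l)$ for any $u, v \in \mathbb{C}^n$. For any finite set $S$, $\sharp S$ stands for the number of elements contained in $S$. Let $S_n$ denote the set of all the permutations on $n$ elements for $n \in \mathbb{N}$.

2.2. The correlation function

Our goal is to analyze the 4-point correlation function $\langle \psi^*_{\mathbf{x}_1\uparrow} \psi^*_{\mathbf{x}_2\downarrow} \psi_{\mathbf{y}_2\downarrow} \psi_{\mathbf{y}_1\uparrow} + \psi^*_{\mathbf{y}_1\uparrow} \psi^*_{\mathbf{y}_2\downarrow} \psi_{\mathbf{x}_2\downarrow} \psi_{\mathbf{x}_1\uparrow} \rangle$ by means of the perturbation method with respect to the coupling constant $U$. The correlation function of our interest can be derived from the logarithm of the partition function. Let us substitute real parameters $\{\lambda_{\mathbf{x},\mathbf{y},\mathbf{z},\mathbf{w}}\}_{\mathbf{x},\mathbf{y},\mathbf{z},\mathbf{w}\in\Gamma}\ (\subset \mathbb{R})$ into our Hamiltonian $H$ and define the parametrized Hamiltonian $H_\lambda$ by
\[ H_\lambda := H_0 + V_\lambda, \qquad V_\lambda := \sum_{\mathbf{x},\mathbf{y},\mathbf{z},\mathbf{w}\in\Gamma} U_{\mathbf{x},\mathbf{y},\mathbf{z},\mathbf{w}}\, \psi^*_{\mathbf{x}\uparrow} \psi^*_{\mathbf{y}\downarrow} \psi_{\mathbf{w}\downarrow} \psi_{\mathbf{z}\uparrow}, \tag{2.3} \]
where we set
\[ U_{\mathbf{x},\mathbf{y},\mathbf{z},\mathbf{w}} := U \delta_{\mathbf{x},\mathbf{y}} \delta_{\mathbf{z},\mathbf{w}} \delta_{\mathbf{x},\mathbf{z}} + \lambda_{\mathbf{x},\mathbf{y},\mathbf{z},\mathbf{w}} + \lambda_{\mathbf{z},\mathbf{w},\mathbf{x},\mathbf{y}}, \tag{2.4} \]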
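for all $\mathbf{x}, \mathbf{y}, \mathbf{z}, \mathbf{w} \in \Gamma$. Note that $H_\lambda$ still keeps the self-adjoint property and that $H_\lambda|_{\lambda_{\mathbf{x},\mathbf{y},\mathbf{z},\mathbf{w}}=0,\ \forall \mathbf{x},\mathbf{y},\mathbf{z},\mathbf{w}\in\Gamma} = H$. To simplify notations, let $X$ represent a vector in $\Gamma^4$ in our argument unless otherwise stated. From now, we fix 4 sites $\mathbf{x}_1, \mathbf{x}_2, \mathbf{y}_1, \mathbf{y}_2 \in \Gamma$ to define the correlation function $\langle \psi^*_{\mathbf{x}_1\uparrow} \psi^*_{\mathbf{x}_2\downarrow} \psi_{\mathbf{y}_2\downarrow} \psi_{\mathbf{y}_1\uparrow} + \psi^*_{\mathbf{y}_1\uparrow} \psi^*_{\mathbf{y}_2\downarrow} \psi_{\mathbf{x}_2\downarrow} \psi_{\mathbf{x}_1\uparrow} \rangle$ and write $\tilde{X}_1 = (\mathbf{x}_1, \mathbf{x}_2, \mathbf{y}_1, \mathbf{y}_2)$ and $\tilde{X}_2 = (\mathbf{y}_1, \mathbf{y}_2, \mathbf{x}_1, \mathbf{x}_2)$.

Lemma 2.1. The following equality holds.
\[ \langle \psi^*_{\mathbf{x}_1\uparrow} \psi^*_{\mathbf{x}_2\downarrow} \psi_{\mathbf{y}_2\downarrow} \psi_{\mathbf{y}_1\uparrow} + \psi^*_{\mathbf{y}_1\uparrow} \psi^*_{\mathbf{y}_2\downarrow} \psi_{\mathbf{x}_2\downarrow} \psi_{\mathbf{x}_1\uparrow} \rangle = -\frac{1}{\beta} \frac{\partial}{\partial \lambda_{\tilde{X}_1}} \log\left( \frac{\mathrm{Tr}\, e^{-\beta H_\lambda}}{\mathrm{Tr}\, e^{-\beta H_0}} \right) \Bigg|_{\lambda_X = 0,\ \forall X \in \Gamma^4}. \tag{2.5} \]

Remark 2.2. Since $H_\lambda$ is self-adjoint, its spectrum $\sigma(H_\lambda)$ is a subset of $\mathbb{R}$. The spectral mapping theorem (see, e.g., [26, Sec. VIII-7, Corollary 1]) shows that $\{e^{-\beta x}\}_{x\in\sigma(H_\lambda)}$ is the spectrum of $e^{-\beta H_\lambda}$. Thus, $\mathrm{Tr}\, e^{-\beta H_\lambda} > 0$. For the same reason, the inequality $\mathrm{Tr}\, e^{-\beta H_0} > 0$ holds. Therefore, $\log(\mathrm{Tr}\, e^{-\beta H_\lambda}/\mathrm{Tr}\, e^{-\beta H_0})$ is well-defined.

Let $\mathcal{L}(F_f(L^2(\Gamma \times \{\uparrow,\downarrow\}; \mathbb{C})))$ denote the space of linear operators on $F_f(L^2(\Gamma \times \{\uparrow,\downarrow\}; \mathbb{C}))$. The proof of Lemma 2.1 is based on the following lemma.

Lemma 2.3. Let $(a, b)$ be an interval of $\mathbb{R}$. Assume that $A : (a, b) \to \mathcal{L}(F_f(L^2(\Gamma \times \{\uparrow,\downarrow\}; \mathbb{C})))$ is an operator-valued $C^1$-class function. The following equality holds. For all $s \in (a, b)$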
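\[ \frac{d}{ds} e^{A(s)} = \int_0^1 e^{(1-t)A(s)}\, \frac{d}{ds}A(s)\, e^{tA(s)}\, dt. \]

Proof. Fix any $s \in (a, b)$ and take small $\varepsilon > 0$ such that $[s - \varepsilon, s + \varepsilon] \subset (a, b)$. For any $s' \in (s - \varepsilon, s + \varepsilon)$
\begin{align*}
e^{A(s)} - e^{A(s')} &= \bigl[ -e^{(1-t)A(s)} e^{tA(s')} \bigr]_{t=0}^{t=1} \\
&= -\int_0^1 \frac{d}{dt} \bigl( e^{(1-t)A(s)} e^{tA(s')} \bigr)\, dt \\
&= \int_0^1 e^{(1-t)A(s)} (A(s) - A(s'))\, e^{tA(s')}\, dt.
\end{align*}
Moreover, we see that
\begin{align*}
\frac{d}{ds} e^{A(s)} &= \lim_{\substack{s' \to s \\ s' \in (s-\varepsilon, s+\varepsilon)}} \frac{e^{A(s)} - e^{A(s')}}{s - s'} \\
&= \lim_{\substack{s' \to s \\ s' \in (s-\varepsilon, s+\varepsilon)}} \int_0^1 e^{(1-t)A(s)}\, \frac{A(s) - A(s')}{s - s'}\, e^{tA(s')}\, dt \\
&= \int_0^1 e^{(1-t)A(s)}\, \frac{d}{ds}A(s)\, e^{tA(s)}\, dt,
\end{align*}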
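where we have used the inequality
\[ \left\| \frac{A(s) - A(s')}{s - s'}\, e^{tA(s')} \right\| \le \sup_{\theta \in [s-\varepsilon, s+\varepsilon]} \left\| \frac{d}{ds}A(\theta) \right\| e^{\sup_{\theta \in [s-\varepsilon, s+\varepsilon]} \|A(\theta)\|} \]
with the operator norm $\|\cdot\|$ and Lebesgue's dominated convergence theorem to exchange the order of the limit operation and the integral.

Proof of Lemma 2.1. Since the operator-valued function $\lambda_{\tilde{X}_1} \mapsto H_\lambda$ is continuously differentiable on any interval containing 0 inside, we can apply Lemma 2.3 to have
\begin{align*}
-\frac{1}{\beta} \frac{\partial}{\partial \lambda_{\tilde{X}_1}} \log\left( \frac{\mathrm{Tr}\, e^{-\beta H_\lambda}}{\mathrm{Tr}\, e^{-\beta H_0}} \right) \Bigg|_{\lambda_X = 0,\ \forall X \in \Gamma^4}
&= -\frac{1}{\beta} \frac{ \mathrm{Tr} \displaystyle\int_0^1 e^{(1-t)(-\beta H)}\, \frac{\partial}{\partial \lambda_{\tilde{X}_1}} (-\beta H_\lambda) \Big|_{\lambda_X = 0,\ \forall X \in \Gamma^4}\, e^{t(-\beta H)}\, dt }{ \mathrm{Tr}\, e^{-\beta H} } \\
&= \int_0^1 \frac{ \mathrm{Tr}\bigl( e^{(1-t)(-\beta H)} ( \psi^*_{\mathbf{x}_1\uparrow} \psi^*_{\mathbf{x}_2\downarrow} \psi_{\mathbf{y}_2\downarrow} \psi_{\mathbf{y}_1\uparrow} + \psi^*_{\mathbf{y}_1\uparrow} \psi^*_{\mathbf{y}_2\downarrow} \psi_{\mathbf{x}_2\downarrow} \psi_{\mathbf{x}_1\uparrow} )\, e^{t(-\beta H)} \bigr) }{ \mathrm{Tr}\, e^{-\beta H} }\, dt \\
&= \langle \psi^*_{\mathbf{x}_1\uparrow} \psi^*_{\mathbf{x}_2\downarrow} \psi_{\mathbf{y}_2\downarrow} \psi_{\mathbf{y}_1\uparrow} + \psi^*_{\mathbf{y}_1\uparrow} \psi^*_{\mathbf{y}_2\downarrow} \psi_{\mathbf{x}_2\downarrow} \psi_{\mathbf{x}_1\uparrow} \rangle,
\end{align*}
where we have used the equality that $\mathrm{Tr}(AB) = \mathrm{Tr}(BA)$ for any operators $A, B$.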
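The derivative formula of Lemma 2.3 can be checked numerically on a small example. The sketch below is only a sanity check under assumptions of our own: the $2 \times 2$ matrix family $A(s)$ is an arbitrary choice whose derivative does not commute with it, the matrix exponential is computed by a truncated Taylor series, the left-hand side is approximated by a central finite difference and the integral by Simpson's rule. None of these names or choices come from the paper.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_axpy(a, b, c=1.0):
    """Return a + c * b entrywise."""
    return [[x + c * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_scale(a, c):
    return [[c * x for x in row] for row in a]


def mat_exp(a, terms=60):
    """Truncated Taylor series of exp(a); adequate for small matrices of moderate norm."""
    n = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result, term = identity, identity
    for m in range(1, terms):
        term = mat_scale(mat_mul(term, a), 1.0 / m)
        result = mat_axpy(result, term)
    return result


def A(s):
    """Hypothetical test family; A(s) and dA(s) do not commute."""
    return [[s, 1.0], [s * s, -s]]


def dA(s):
    return [[1.0, 0.0], [2.0 * s, -1.0]]


def lhs_finite_difference(s, h=1e-5):
    """Central difference approximation of d/ds exp(A(s))."""
    diff = mat_axpy(mat_exp(A(s + h)), mat_exp(A(s - h)), -1.0)
    return mat_scale(diff, 1.0 / (2.0 * h))


def rhs_integral(s, panels=200):
    """Simpson's rule for int_0^1 exp((1-t)A) dA exp(tA) dt (panels must be even)."""
    def integrand(t):
        left = mat_exp(mat_scale(A(s), 1.0 - t))
        right = mat_exp(mat_scale(A(s), t))
        return mat_mul(mat_mul(left, dA(s)), right)

    step = 1.0 / panels
    total = mat_axpy(integrand(0.0), integrand(1.0))
    for i in range(1, panels):
        total = mat_axpy(total, integrand(i * step), 4.0 if i % 2 == 1 else 2.0)
    return mat_scale(total, step / 3.0)


def max_abs_difference(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


print(max_abs_difference(lhs_finite_difference(0.7), rhs_integral(0.7)))
```

The printed discrepancy should be at the level of the discretization errors of the two approximations. In the proof of Lemma 2.1 the same formula is applied under the trace, where cyclicity removes the integral over $t$.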
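2.3. The perturbation series

The partition function $\mathrm{Tr}\, e^{-\beta H_\lambda}/\mathrm{Tr}\, e^{-\beta H_0}$ can be expanded as a power series of the parameter $\{U_X\}_{X\in\Gamma^4}$. We give the derivation of the temperature-ordered perturbation series in Appendix B. Here we only state the result.

Proposition 2.4. For any $U \in \mathbb{R}$ and $\{\lambda_X\}_{X\in\Gamma^4} \subset \mathbb{R}$,
\[ \frac{\mathrm{Tr}\, e^{-\beta H_\lambda}}{\mathrm{Tr}\, e^{-\beta H_0}} = 1 + \sum_{n=1}^{\infty} \frac{1}{n!} \prod_{j=1}^{n} \Biggl( - \sum_{\substack{\mathbf{x}_{2j-1}, \mathbf{x}_{2j}, \mathbf{y}_{2j-1}, \mathbf{y}_{2j} \in \Gamma \\ \sigma_{2j-1}, \sigma_{2j} \in \{\uparrow,\downarrow\}}} \delta_{\sigma_{2j-1},\uparrow}\, \delta_{\sigma_{2j},\downarrow} \int_0^\beta dx_{2j-1}\, U_{\mathbf{x}_{2j-1},\mathbf{x}_{2j},\mathbf{y}_{2j-1},\mathbf{y}_{2j}} \Biggr) \cdot \det\bigl( C(\mathbf{x}_j \sigma_j x_j, \mathbf{y}_k \sigma_k x_k) \bigr)_{1 \le j,k \le 2n} \Big|_{x_{2j} = x_{2j-1},\ \forall j \in \{1, 2, \ldots, n\}}, \tag{2.6} \]
where the constraint $x_{2j} = x_{2j-1}$ requires the variable $x_{2j}$ to take the same value as $x_{2j-1}$ for all $j \in \{1, 2, \ldots, n\}$ and each component of the covariance matrix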
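$(C(\mathbf{x}_j \sigma_j x_j, \mathbf{y}_k \sigma_k x_k))_{1 \le j,k \le 2n}$ is defined by
\[ C(\mathbf{x}\sigma x, \mathbf{y}\tau y) := \frac{\delta_{\sigma,\tau}}{L^d} \sum_{\mathbf{k} \in \Gamma^*} e^{i\langle \mathbf{k}, \mathbf{y} - \mathbf{x} \rangle}\, e^{-(y-x)E_{\mathbf{k}}} \left( \frac{1_{y-x \le 0}}{1 + e^{\beta E_{\mathbf{k}}}} - \frac{1_{y-x > 0}}{1 + e^{-\beta E_{\mathbf{k}}}} \right) \tag{2.7} \]
with the dispersion relation
\[ E_{\mathbf{k}} := -2t \sum_{j=1}^{d} \cos(\langle \mathbf{k}, \mathbf{e}_j \rangle) - 4t' \cdot 1_{d \ge 2} \sum_{\substack{j,k=1 \\ j<k}}^{d} \cos(\langle \mathbf{k}, \mathbf{e}_j \rangle) \cos(\langle \mathbf{k}, \mathbf{e}_k \rangle) - \mu. \tag{2.8} \]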
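The covariance (2.7) and the dispersion relation (2.8) are finite sums and can be evaluated directly. The following sketch does so in plain Python. The function names and the sample parameters are hypothetical, momenta are represented by the representatives $2\pi m/L$ with $m \in \{0, \ldots, L-1\}^d$, and no attempt is made to guard against floating-point overflow for very large $\beta$.

```python
import cmath
import itertools
import math


def momentum_lattice(L, d):
    """Representatives of Gamma^* = (2 pi Z / L)^d / (2 pi Z)^d."""
    return [tuple(2.0 * math.pi * m / L for m in ms)
            for ms in itertools.product(range(L), repeat=d)]


def dispersion(k, t, t_prime, mu):
    """E_k of (2.8); the t' term appears only for d >= 2 (pairs j < l)."""
    d = len(k)
    energy = -2.0 * t * sum(math.cos(kj) for kj in k) - mu
    for j, l in itertools.combinations(range(d), 2):
        energy -= 4.0 * t_prime * math.cos(k[j]) * math.cos(k[l])
    return energy


def covariance(x_site, sigma, x_time, y_site, tau, y_time, L, beta, t, t_prime, mu):
    """C(x sigma x, y tau y) of (2.7) for sites in (Z/LZ)^d and times in [0, beta)."""
    if sigma != tau:
        return 0j
    d = len(x_site)
    dt = y_time - x_time
    total = 0j
    for k in momentum_lattice(L, d):
        e_k = dispersion(k, t, t_prime, mu)
        phase = cmath.exp(1j * sum(kj * (yj - xj)
                                   for kj, yj, xj in zip(k, y_site, x_site)))
        if dt <= 0:
            weight = math.exp(-dt * e_k) / (1.0 + math.exp(beta * e_k))
        else:
            weight = -math.exp(-dt * e_k) / (1.0 + math.exp(-beta * e_k))
        total += phase * weight
    return total / L ** d


params = dict(L=4, beta=2.0, t=1.0, t_prime=0.0, mu=0.0)
equal_time = covariance((0,), "up", 0.5, (0,), "up", 0.5, **params)
just_after = covariance((0,), "up", 0.5, (0,), "up", 0.5 + 1e-9, **params)
print(equal_time, just_after)
```

Two consistency checks follow from (2.7) alone: at equal sites the value jumps by 1 as $y - x$ crosses 0, since $1/(1 + e^{a}) + 1/(1 + e^{-a}) = 1$, and for the sample parameters above (one dimension, $L = 4$, $t' = \mu = 0$) the equal-time value at coinciding sites is $1/2$ because the spectrum is symmetric about zero.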
Lemma 2.1 indicates that we can construct the power series of $\langle \psi^*_{\mathbf{x}_1\uparrow} \psi^*_{\mathbf{x}_2\downarrow} \psi_{\mathbf{y}_2\downarrow} \psi_{\mathbf{y}_1\uparrow} + \psi^*_{\mathbf{y}_1\uparrow} \psi^*_{\mathbf{y}_2\downarrow} \psi_{\mathbf{x}_2\downarrow} \psi_{\mathbf{x}_1\uparrow} \rangle$ by substituting the series (2.6) into the Taylor series expansion of the function $\log(x)$ around $x = 1$. Since the radius of convergence of the Taylor series of $\log(x)$ around 1 is 1, we need to know beforehand when the inequality $|\mathrm{Tr}\, e^{-\beta H_\lambda}/\mathrm{Tr}\, e^{-\beta H_0} - 1| < 1$ holds. An answer to this question will be given in Proposition 2.7 below.

It will be more convenient for our analysis to generalize the problems so that the variables $U$, $\{\lambda_X\}_{X\in\Gamma^4}$, $\{U_X\}_{X\in\Gamma^4}$ are allowed to be complex. We will then recover the statements on our original problem by restricting the variables to be real. For $\{U_X\}_{X\in\Gamma^4} \subset \mathbb{C}$ we define a function $P(\{U_X\}_{X\in\Gamma^4})$ by the power series of the right-hand side of (2.6). Let us recall that the real function $\log(x)$ ($x > 0$) is extended to be the complex analytic function $\log(z)$ in the domain $\{z \in \mathbb{C} \mid |z - 1| < 1\}$ by the power series
\[ \log(z) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} (z - 1)^n. \]
In our argument to clarify when the inequality
\[ |P(\{U_X\}_{X\in\Gamma^4}) - 1| < 1 \tag{2.9} \]
holds as well as in the proofs of other lemmas in this paper, the following lemma on the determinant bound on the covariance matrix plays essential roles.

Lemma 2.5 [19, Theorem 2.4]. For any $n \in \mathbb{N}$, $(\mathbf{x}_j, \sigma_j, x_j), (\mathbf{y}_j, \tau_j, y_j) \in \Gamma \times \{\uparrow,\downarrow\} \times [0, \beta)$ ($\forall j \in \{1, \ldots, n\}$),
\[ \sup_{\substack{u_j, v_j \in \mathbb{C}^n \text{ with } \|u_j\|_{\mathbb{C}^n}, \|v_j\|_{\mathbb{C}^n} \le 1 \\ \forall j \in \{1, \ldots, n\}}} \bigl| \det\bigl( \langle u_j, v_k \rangle_{\mathbb{C}^n}\, C(\mathbf{x}_j \sigma_j x_j, \mathbf{y}_k \tau_k y_k) \bigr)_{1 \le j,k \le n} \bigr| \le 4^n, \]
where $\|u\|_{\mathbb{C}^n} := \langle u, u \rangle_{\mathbb{C}^n}^{1/2}$ for all $u \in \mathbb{C}^n$.

Remark 2.6. The statement of [19, Theorem 2.4] is on the determinant bound of the covariance matrices independent of the spin coordinate. It is, however,
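straightforward to derive the bound claimed in Lemma 2.5 on our spin-dependent covariance matrix from [19, Theorem 2.4].

We can expand $-\frac{1}{\beta} \frac{\partial}{\partial \lambda_{\tilde{X}_1}} \log(P(\{U_X\}_{X\in\Gamma^4}))|_{\lambda_X = 0,\ \forall X \in \Gamma^4}$ as a power series of $U$ as follows.

Proposition 2.7. Assume that $U \in \mathbb{C}$ satisfies $|U| < \log 2/(16\beta L^{4d})$. Then there exists $\varepsilon > 0$ such that if $\{\lambda_X\}_{X\in\Gamma^4}$ satisfies $|\lambda_X| \le \varepsilon$ for all $X \in \Gamma^4$, the inequality (2.9) holds. Moreover, we have
\[ -\frac{1}{\beta} \frac{\partial}{\partial \lambda_{\tilde{X}_1}} \log(P(\{U_X\}_{X\in\Gamma^4})) \Big|_{\lambda_X = 0,\ \forall X \in \Gamma^4} = \sum_{n=0}^{\infty} a_n U^n, \tag{2.10} \]
where the coefficients $\{a_n\}_{n=0}^{\infty}$ are given by
\[ a_n := -\frac{1}{\beta} \frac{\partial}{\partial \lambda_{\tilde{X}_1}} \Biggl( \sum_{j=1}^{n+1} \frac{(-1)^{j-1}}{j} \sum_{\substack{m_1 + \cdots + m_j = n+1 \\ m_k \ge 1,\ \forall k \in \{1, \ldots, j\}}} \prod_{k=1}^{j} G_{m_k} \Biggr) \Bigg|_{\lambda_X = 0,\ \forall X \in \Gamma^4}, \tag{2.11} \]
with $\{G_n\}_{n=1}^{\infty}$ defined by
\[ G_n := \frac{1}{n!} \prod_{j=1}^{n} \Biggl( - \sum_{\substack{\mathbf{x}_{2j-1}, \mathbf{x}_{2j}, \mathbf{y}_{2j-1}, \mathbf{y}_{2j} \in \Gamma \\ \sigma_{2j-1}, \sigma_{2j} \in \{\uparrow,\downarrow\}}} \int_0^\beta dx_{2j-1}\, \delta_{\sigma_{2j-1},\uparrow}\, \delta_{\sigma_{2j},\downarrow} \cdot \bigl( \delta_{\mathbf{x}_{2j-1},\mathbf{x}_{2j}} \delta_{\mathbf{y}_{2j-1},\mathbf{y}_{2j}} \delta_{\mathbf{x}_{2j-1},\mathbf{y}_{2j-1}} + \lambda_{\mathbf{x}_{2j-1},\mathbf{x}_{2j},\mathbf{y}_{2j-1},\mathbf{y}_{2j}} + \lambda_{\mathbf{y}_{2j-1},\mathbf{y}_{2j},\mathbf{x}_{2j-1},\mathbf{x}_{2j}} \bigr) \Biggr) \cdot \det\bigl( C(\mathbf{x}_j \sigma_j x_j, \mathbf{y}_k \sigma_k x_k) \bigr)_{1 \le j,k \le 2n} \Big|_{x_{2j} = x_{2j-1},\ \forall j \in \{1, 2, \ldots, n\}}. \tag{2.12} \]

Proof. Let us fix $U \in \mathbb{C}$ with $|U| < \log 2/(16\beta L^{4d})$. Take any $\varepsilon \in (0, \log 2/(32\beta L^{4d}) - |U|/2)$ and assume that $\{\lambda_X\}_{X\in\Gamma^4}$ satisfies $|\lambda_X| \le \varepsilon$ for all $X \in \Gamma^4$. Then, we see that for all $X \in \Gamma^4$
\[ |U_X| < \frac{\log 2}{16\beta L^{4d}}. \tag{2.13} \]
By using the inequality (2.13) and Lemma 2.5, we observe that
\[ |P(\{U_X\}_{X\in\Gamma^4}) - 1| < \sum_{n=1}^{\infty} \frac{1}{n!} \left( \beta L^{4d} \cdot \frac{\log 2}{16\beta L^{4d}} \cdot 16 \right)^n = e^{\log 2} - 1 = 1. \tag{2.14} \]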
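The estimate (2.14) is a term-by-term majorization: in the $n$th term of (2.6) the sum over sites contributes at most $L^{4d}$ per factor, the time integral contributes $\beta$, and the $2n \times 2n$ determinant is bounded by $4^{2n} = 16^n$ through Lemma 2.5. The short sketch below, with hypothetical function names, evaluates the resulting majorant $\sum_{n \ge 1} (16\beta L^{4d} |U|)^n / n!$ and the threshold $\log 2/(16\beta L^{4d})$, mainly to make visible how strongly this crude radius depends on the volume.

```python
import math


def crude_radius(beta, L, d):
    """Threshold log 2 / (16 beta L^(4d)) appearing in Proposition 2.7."""
    return math.log(2.0) / (16.0 * beta * L ** (4 * d))


def majorant(abs_u, beta, L, d, terms=80):
    """Partial sum of sum_{n>=1} (16 beta L^(4d) |U|)^n / n!, which bounds |P - 1|."""
    z = 16.0 * beta * L ** (4 * d) * abs_u
    total, term = 0.0, 1.0
    for n in range(1, terms + 1):
        term *= z / n  # term now equals z^n / n!
        total += term
    return total


beta, d = 1.0, 2
for L in (2, 4, 8):
    r = crude_radius(beta, L, d)
    print(L, r, majorant(r, beta, L, d))
```

At the threshold the majorant equals $e^{\log 2} - 1 = 1$ for every $L$, while the threshold itself shrinks like $L^{-4d}$. This volume dependence is the motivation for the volume-independent bounds pursued in the later sections.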
The inequality (2.14) allows us to consider $\log(P(\{U_X\}_{X\in\Gamma^4}))$ as an analytic function of the multi-variable $\{U_X\}_{X\in\Gamma^4}$ in the domain (2.13). Moreover,
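we have
\[
-\frac{1}{\beta}\frac{\partial}{\partial \lambda_{\tilde X_1}} \log(P(\{U_X\}_{X \in \Gamma^4}))\Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}}
= -\frac{1}{\beta} \sum_{m=0}^{\infty} (-1)^m \Big( P(\{U_X\}_{X \in \Gamma^4})\Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}} - 1 \Big)^m \cdot \frac{\partial}{\partial \lambda_{\tilde X_1}} P(\{U_X\}_{X \in \Gamma^4})\Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}}, \tag{2.15}
\]
where we used the equality that $d \log(z)/dz = \sum_{m=0}^{\infty} (-1)^m (z - 1)^m$ ($\forall z \in \mathbb{C}$ with $|z - 1| < 1$). Furthermore, we can write
\[
P(\{U_X\}_{X \in \Gamma^4})\Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}} - 1 = \sum_{n=1}^{\infty} G_n \Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}} U^n, \qquad
\frac{\partial}{\partial \lambda_{\tilde X_1}} P(\{U_X\}_{X \in \Gamma^4})\Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}} = \sum_{n=1}^{\infty} \frac{\partial}{\partial \lambda_{\tilde X_1}} G_n \Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}} U^{n-1}, \tag{2.16}
\]
where $G_n$ ($n \in \mathbb{N}$) is defined in (2.12). By substituting (2.16) into (2.15), we obtain
\[
-\frac{1}{\beta}\frac{\partial}{\partial \lambda_{\tilde X_1}} \log(P(\{U_X\}_{X \in \Gamma^4}))\Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}}
= -\frac{1}{\beta} \sum_{m=0}^{\infty} \Big( - \sum_{n=1}^{\infty} G_n \Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}} U^n \Big)^m \sum_{n=1}^{\infty} \frac{\partial}{\partial \lambda_{\tilde X_1}} G_n \Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}} U^{n-1}. \tag{2.17}
\]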
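Again by using Lemma 2.5, we can show that for $U \in \mathbb{C}$ with $|U| < \log 2/(16\beta L^{4d})$
\[
\sum_{n=1}^{\infty} \Big| G_n \Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}} \Big| |U|^n < 1, \qquad
\sum_{n=1}^{\infty} \Big| \frac{\partial}{\partial \lambda_{\tilde X_1}} G_n \Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}} \Big| |U|^{n-1} < \infty. \tag{2.18}
\]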
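Since the radius of convergence of the power series $\sum_{m=0}^{\infty} z^m$ is 1, the inequalities (2.18) provide a sufficient condition to reorder the right-hand side of (2.17) (see, e.g., [15, Theorems 3.1 and 3.4] for products and compositions of convergent power series) to deduce
\[
\begin{aligned}
-\frac{1}{\beta}\frac{\partial}{\partial \lambda_{\tilde X_1}} \log(P(\{U_X\}_{X \in \Gamma^4}))\Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}}
= {} & -\frac{1}{\beta}\frac{\partial}{\partial \lambda_{\tilde X_1}} G_1 \Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}} \\
& - \frac{1}{\beta} \sum_{n=1}^{\infty} \Bigg( \sum_{l=1}^{n} \frac{\partial}{\partial \lambda_{\tilde X_1}} G_l \Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}} \sum_{j=1}^{n+1-l} \sum_{\substack{m_1 + \cdots + m_j = n+1-l \\ m_k \ge 1,\ \forall k \in \{1, \ldots, j\}}} \prod_{k=1}^{j} (-G_{m_k}) \Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}} \\
& \qquad\qquad + \frac{\partial}{\partial \lambda_{\tilde X_1}} G_{n+1} \Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}} \Bigg) U^n.
\end{aligned} \tag{2.19}
\]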
Arranging (2.19) yields (2.11).

By restricting $U$ to be real in (2.10), we obtain the power series expansion of the correlation function $\langle \psi^*_{\mathbf{x}_1 \uparrow} \psi^*_{\mathbf{x}_2 \downarrow} \psi_{\mathbf{y}_2 \downarrow} \psi_{\mathbf{y}_1 \uparrow} + \psi^*_{\mathbf{y}_1 \uparrow} \psi^*_{\mathbf{y}_2 \downarrow} \psi_{\mathbf{x}_2 \downarrow} \psi_{\mathbf{x}_1 \uparrow} \rangle$. At this point, however, we only know that the series $\sum_{n=0}^{\infty} a_n U^n$ converges for $U \in \mathbb{C}$ with $|U| < \log 2/(16\beta L^{4d})$, a radius that depends heavily on the volume factor $L^d$. With the aim of enlarging the radius of convergence and finding upper bounds of the power series $\sum_{n=0}^{\infty} a_n U^n$, we will construct our theory in the following sections.

3. Grassmann Gaussian Integral Formulation

In this section, we discretize the integrals over $[0, \beta]$ contained in the perturbation series $P(\{U_X\}_{X \in \Gamma^4})$ so that the discretized perturbation series can be formulated as a Grassmann Gaussian integral involving only finite dimensional Grassmann algebras. Moreover, by showing that the discrete analog of $P$ uniformly converges to the original $P$, we characterize our partition function $\mathrm{Tr}\, e^{-\beta H}/\mathrm{Tr}\, e^{-\beta H_0}$ and the 4-point function $\langle \psi^*_{\mathbf{x}_1 \uparrow} \psi^*_{\mathbf{x}_2 \downarrow} \psi_{\mathbf{y}_2 \downarrow} \psi_{\mathbf{y}_1 \uparrow} + \psi^*_{\mathbf{y}_1 \uparrow} \psi^*_{\mathbf{y}_2 \downarrow} \psi_{\mathbf{x}_2 \downarrow} \psi_{\mathbf{x}_1 \uparrow} \rangle$ as a limit of finite dimensional Grassmann integrals. The finite dimensional Grassmann Gaussian integral formulation will then enable us to apply the tree formula for the connected part of the exponential of the Laplacian operator of the Grassmann left derivatives to express each term of the discretized perturbation series as a finite sum over trees in Sec. 4.
3.1. Discretization of the integral over [0, β]
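We define the fully discrete perturbation series by replacing the integral $\int_0^{\beta} dx$ in $P(\{U_X\}_{X \in \Gamma^4})$ by the Riemann sum. Let us introduce finite sets $[0, \beta)_h$ and $[-\beta, \beta)_h$ parametrized by $h \in \mathbb{N}/\beta$ as follows.
\[
[0, \beta)_h := \left\{ 0, \frac{1}{h}, \frac{2}{h}, \ldots, \beta - \frac{1}{h} \right\}, \qquad
[-\beta, \beta)_h := \left\{ -\beta, -\beta + \frac{1}{h}, -\beta + \frac{2}{h}, \ldots, \beta - \frac{1}{h} \right\}.
\]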
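As a concrete picture of these grids, the sketch below builds $[0, \beta)_h$ and $[-\beta, \beta)_h$ with exact rational arithmetic and forms the Riemann sum $\frac{1}{h}\sum_{x \in [0,\beta)_h} f(x)$ that replaces $\int_0^\beta dx$. The function names and the sample values of $\beta$ and $h$ are assumptions chosen for illustration only.

```python
from fractions import Fraction


def _steps(beta, h):
    """Return beta*h as an int, enforcing the condition h in N/beta."""
    steps = beta * h
    if steps.denominator != 1 or steps <= 0:
        raise ValueError("beta * h must be a positive integer")
    return int(steps)


def grid_0_beta(beta, h):
    """The finite set [0, beta)_h = {0, 1/h, ..., beta - 1/h}."""
    return [Fraction(k) / h for k in range(_steps(beta, h))]


def grid_minus_beta_beta(beta, h):
    """The finite set [-beta, beta)_h = {-beta, -beta + 1/h, ..., beta - 1/h}."""
    return [-beta + Fraction(k) / h for k in range(2 * _steps(beta, h))]


def riemann_sum(f, beta, h):
    """(1/h) * sum of f over [0, beta)_h, the discrete stand-in for the integral."""
    return sum(f(x) for x in grid_0_beta(beta, h)) / h


# Hypothetical sample values: beta = 3/2 and h = 4/3, so beta * h = 2.
beta_example = Fraction(3, 2)
h_example = Fraction(4, 3)
grid_a = grid_0_beta(beta_example, h_example)
grid_b = grid_minus_beta_beta(beta_example, h_example)
```

Using `Fraction` keeps the grid points exact, so the cardinalities $\beta h$ and $2\beta h$ can be read off from `len(grid_a)` and `len(grid_b)` without rounding concerns.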
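Note that $\sharp [0, \beta)_h = \beta h$ and $\sharp [-\beta, \beta)_h = 2\beta h$. We define the function $P_h(\{U_X\}_{X \in \Gamma^4})$ of the multi-variable $\{U_X\}_{X \in \Gamma^4}$ ($\subset \mathbb{C}$) by
\[
\begin{aligned}
P_h(\{U_X\}_{X \in \Gamma^4}) := 1 + \sum_{n=1}^{L^d \beta h} \frac{1}{n!} \prod_{j=1}^{n} \Bigg( & - \frac{1}{h} \sum_{\substack{\mathbf{x}_{2j-1}, \mathbf{x}_{2j}, \mathbf{y}_{2j-1}, \mathbf{y}_{2j} \in \Gamma \\ \sigma_{2j-1}, \sigma_{2j} \in \{\uparrow, \downarrow\} \\ x_{2j-1}, x_{2j} \in [0, \beta)_h}} \delta_{\sigma_{2j-1}, \uparrow} \delta_{\sigma_{2j}, \downarrow} \delta_{x_{2j-1}, x_{2j}} U_{\mathbf{x}_{2j-1}, \mathbf{x}_{2j}, \mathbf{y}_{2j-1}, \mathbf{y}_{2j}} \Bigg) \\
& \cdot \det(C(\mathbf{x}_j \sigma_j x_j, \mathbf{y}_k \sigma_k x_k))_{1 \le j,k \le 2n}.
\end{aligned} \tag{3.1}
\]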
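Note that if $n > L^d \beta h$, $\det(C(\mathbf{x}_j \sigma_j x_j, \mathbf{y}_k \tau_k y_k))_{1 \le j,k \le 2n} = 0$ for any $(\mathbf{x}_j, \sigma_j, x_j), (\mathbf{y}_j, \tau_j, y_j) \in \Gamma \times \{\uparrow, \downarrow\} \times [0, \beta)_h$ ($j \in \{1, \ldots, 2n\}$), since $\sharp\, \Gamma \times \{\uparrow, \downarrow\} \times [0, \beta)_h = 2L^d \beta h$.

Let us summarize the properties of the function $P_h$ in the same manner as in Proposition 2.7.

Lemma 3.1. Assume that $U \in \mathbb{C}$ satisfies $|U| < \log 2/(16\beta L^{4d})$. The following statements hold.

(i) There exists $\varepsilon > 0$ such that for any $\{\lambda_X\}_{X \in \Gamma^4}$ ($\subset \mathbb{C}$) with $|\lambda_X| \le \varepsilon$ ($\forall X \in \Gamma^4$) and any $h \in \mathbb{N}/\beta$, the inequality $|P_h(\{U_X\}_{X \in \Gamma^4}) - 1| < 1$ holds.

(ii) For any $h \in \mathbb{N}/\beta$
\[
-\frac{1}{\beta}\frac{\partial}{\partial \lambda_{\tilde X_1}} \log(P_h(\{U_X\}_{X \in \Gamma^4}))\Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}} = \sum_{n=0}^{\infty} a_{h,n} U^n,
\]
where the coefficients $\{a_{h,n}\}_{n=0}^{\infty}$ are given by
\[
a_{h,n} := -\frac{1}{\beta}\frac{\partial}{\partial \lambda_{\tilde X_1}} \sum_{j=1}^{n+1} \frac{(-1)^{j-1}}{j} \sum_{\substack{m_1 + \cdots + m_j = n+1 \\ m_k \ge 1,\ \forall k \in \{1, \ldots, j\}}} \prod_{k=1}^{j} G_{h,m_k} \Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}}, \tag{3.2}
\]
with $\{G_{h,n}\}_{n=1}^{\infty}$ defined by
\[
\begin{aligned}
G_{h,n} := \frac{1}{n!} \prod_{j=1}^{n} \Bigg( & - \frac{1}{h} \sum_{\substack{\mathbf{x}_{2j-1}, \mathbf{x}_{2j}, \mathbf{y}_{2j-1}, \mathbf{y}_{2j} \in \Gamma \\ \sigma_{2j-1}, \sigma_{2j} \in \{\uparrow, \downarrow\} \\ x_{2j-1}, x_{2j} \in [0, \beta)_h}} \delta_{\sigma_{2j-1}, \uparrow} \delta_{\sigma_{2j}, \downarrow} \delta_{x_{2j-1}, x_{2j}} \\
& \cdot (\delta_{\mathbf{x}_{2j-1}, \mathbf{x}_{2j}} \delta_{\mathbf{y}_{2j-1}, \mathbf{y}_{2j}} \delta_{\mathbf{x}_{2j-1}, \mathbf{y}_{2j-1}} + \lambda_{\mathbf{x}_{2j-1}, \mathbf{x}_{2j}, \mathbf{y}_{2j-1}, \mathbf{y}_{2j}} + \lambda_{\mathbf{y}_{2j-1}, \mathbf{y}_{2j}, \mathbf{x}_{2j-1}, \mathbf{x}_{2j}}) \Bigg) \\
& \cdot \det(C(\mathbf{x}_j \sigma_j x_j, \mathbf{y}_k \sigma_k x_k))_{1 \le j,k \le 2n}.
\end{aligned} \tag{3.3}
\]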
(iii) For all $n \in \mathbb{N} \cup \{0\}$, $\lim_{h \to +\infty,\, h \in \mathbb{N}/\beta} a_{h,n} = a_n$, where $\{a_n\}_{n=0}^{\infty}$ is defined in (2.11) and (2.12).

Proof. The proofs for the claims (i) and (ii) are parallel to that of Proposition 2.7, based on Lemma 2.5. By the definition (2.7), $\det(C(\mathbf{x}_j \sigma_j x_j, \mathbf{y}_k \sigma_k x_k))_{1 \le j,k \le 2n}$ is piecewise smooth with respect to the variables $\{x_j\}_{j=1}^{2n}$, which implies that the Riemann sums over $[0, \beta)_h$ in $G_{h,n}$ all converge to the corresponding integrals in $G_n$ as $h \to +\infty$. Thus, the claim (iii) is true.

Lemma 3.1(iii) tells us that establishing an $h$-dependent upper bound on $|a_{h,n}|$ and showing that the upper bound converges as $h \to +\infty$ lead to a bound on $|a_n|$. This goal will be achieved in Sec. 4.

The main aim of this section is to formulate $P_h$ as a finite dimensional Grassmann Gaussian integral, which will be used in the characterization of the coefficients $\{a_{h,n}\}_{n=0}^{\infty}$ in Sec. 4. Though it is not directly required in our search for the upper bound on $\sum_{n=0}^{\infty} a_n U^n$, representing the original partition function $P$ and the 4-point function $\langle \psi^*_{\mathbf{x}_1 \uparrow} \psi^*_{\mathbf{x}_2 \downarrow} \psi_{\mathbf{y}_2 \downarrow} \psi_{\mathbf{y}_1 \uparrow} + \psi^*_{\mathbf{y}_1 \uparrow} \psi^*_{\mathbf{y}_2 \downarrow} \psi_{\mathbf{x}_2 \downarrow} \psi_{\mathbf{x}_1 \uparrow} \rangle$ as a limit of the finite dimensional Grassmann integrals is also of interest to us. The following uniform convergence property of $P_h$ provides a framework for this purpose. The following proposition will also be referred to in the proof of our main theorem, Theorem 4.10.

Proposition 3.2. For any $r > 0$
\[
\lim_{\substack{h \to +\infty \\ h \in \mathbb{N}/\beta}} \sup_{\substack{U_X \in \mathbb{C} \text{ with } |U_X| \le r \\ \forall X \in \Gamma^4}} |P_h(\{U_X\}_{X \in \Gamma^4}) - P(\{U_X\}_{X \in \Gamma^4})| = 0. \tag{3.4}
\]
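The mechanism behind this kind of convergence is that a left Riemann sum of a smooth integrand on the grid of mesh $1/h$ differs from the integral by a quantity of order $1/h$. The sketch below is a toy numerical illustration with the stand-in integrand $e^{-x}$, which is an assumption made for the example and is not the determinant appearing in $P_h$; the function name `left_riemann_sum` is likewise hypothetical.

```python
import math


def left_riemann_sum(f, beta, steps):
    """Left Riemann sum of f over [0, beta) with `steps` cells, i.e. h = steps / beta."""
    width = beta / steps
    return width * sum(f(k * width) for k in range(steps))


beta = 2.0
exact = 1.0 - math.exp(-beta)  # integral of exp(-x) over [0, beta]

# Absolute error for increasingly fine grids; it should shrink roughly like 1/h.
errors = {
    steps: abs(left_riemann_sum(lambda x: math.exp(-x), beta, steps) - exact)
    for steps in (10, 100, 1000)
}

# Rescaled errors h * error stay bounded, which is the 1/h rate in disguise.
scaled = {steps: err * steps / beta for steps, err in errors.items()}
```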
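Remark 3.3. For the same reason as for the convergence property Lemma 3.1(iii), each term of the series $P_h(\{U_X\}_{X \in \Gamma^4})$ converges to the corresponding term of $P(\{U_X\}_{X \in \Gamma^4})$ as $h \to +\infty$. By using this fact and Lebesgue's dominated convergence theorem for $l^1$-space, the convergence property (3.4) can be shown. Below we present an elementary proof without employing the convergence theorem of the Lebesgue integration theory.

Proof of Proposition 3.2. By using Lemma 2.5 and the inequality that $|U_X| \le r$ ($\forall X \in \Gamma^4$), we have
\[
\begin{aligned}
& |P(\{U_X\}_{X \in \Gamma^4}) - P_h(\{U_X\}_{X \in \Gamma^4})| \\
& \le \sum_{n=\beta h+1}^{\infty} \frac{2}{n!} (r\beta L^{4d})^n 4^{2n} + \sum_{n=2}^{\beta h} \frac{1}{n!} \prod_{j=1}^{n} \Bigg( \sum_{\substack{\mathbf{x}_{2j-1}, \mathbf{x}_{2j}, \mathbf{y}_{2j-1}, \mathbf{y}_{2j} \in \Gamma \\ \sigma_{2j-1}, \sigma_{2j} \in \{\uparrow, \downarrow\}}} \delta_{\sigma_{2j-1}, \uparrow} \delta_{\sigma_{2j}, \downarrow} |U_{\mathbf{x}_{2j-1}, \mathbf{x}_{2j}, \mathbf{y}_{2j-1}, \mathbf{y}_{2j}}| \Bigg) \\
& \qquad \cdot \Bigg| \prod_{j=1}^{n} \Bigg( \int_0^{\beta} ds_{2j-1} \Bigg) \det(C(\mathbf{x}_j \sigma_j s_j, \mathbf{y}_k \sigma_k s_k))_{1 \le j,k \le 2n} \Big|_{\substack{s_{2j} = s_{2j-1} \\ \forall j \in \{1, \ldots, n\}}} \\
& \qquad\quad - \prod_{j=1}^{n} \Bigg( \frac{1}{h} \sum_{x_{2j-1} \in [0, \beta)_h} \Bigg) \det(C(\mathbf{x}_j \sigma_j x_j, \mathbf{y}_k \sigma_k x_k))_{1 \le j,k \le 2n} \Big|_{\substack{x_{2j} = x_{2j-1} \\ \forall j \in \{1, \ldots, n\}}} \Bigg| \\
& \le \sum_{n=\beta h+1}^{\infty} \frac{2}{n!} (r\beta L^{4d})^n 4^{2n} + \sum_{n=2}^{\beta h} \frac{1}{n!} (rL^{4d})^n \sup_{\substack{\mathbf{x}_j, \mathbf{y}_j \in \Gamma,\ \sigma_j \in \{\uparrow, \downarrow\} \\ \forall j \in \{1, \ldots, 2n\}}} \prod_{j=1}^{n} \Bigg( \sum_{l_{2j-1}=0}^{\beta h - 1} \int_{l_{2j-1}/h}^{(l_{2j-1}+1)/h} ds_{2j-1} \Bigg) \\
& \qquad \cdot \Bigg| \det(C(\mathbf{x}_j \sigma_j s_j, \mathbf{y}_k \sigma_k s_k))_{1 \le j,k \le 2n} \Big|_{\substack{s_{2j} = s_{2j-1} \\ \forall j \in \{1, \ldots, n\}}} - \det(C(\mathbf{x}_j \sigma_j l_j/h, \mathbf{y}_k \sigma_k l_k/h))_{1 \le j,k \le 2n} \Big|_{\substack{l_{2j} = l_{2j-1} \\ \forall j \in \{1, \ldots, n\}}} \Bigg|.
\end{aligned} \tag{3.5}
\]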
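We especially need to show that the second term of the right-hand side of the inequality (3.5) converges to 0 as $h \to +\infty$. Let us fix $n \in \{2, 3, \ldots, \beta h\}$ and $\mathbf{x}_j, \mathbf{y}_j \in \Gamma$, $\sigma_j \in \{\uparrow, \downarrow\}$ ($\forall j \in \{1, \ldots, 2n\}$). There exists a function $g : (-\beta, \beta)^{n(n-1)/2} \to \mathbb{R}$, $g \in C^{\infty}(((-\beta, \beta) \setminus \{0\})^{n(n-1)/2})$ such that for all $s_{2j-1} \in [0, \beta)$ ($\forall j \in \{1, \ldots, n\}$)
\[
g(s_1 - s_3, s_1 - s_5, \ldots, s_1 - s_{2n-1}, s_3 - s_5, \ldots, s_3 - s_{2n-1}, \ldots, s_{2n-3} - s_{2n-1}) = \det(C(\mathbf{x}_j \sigma_j s_j, \mathbf{y}_k \sigma_k s_k))_{1 \le j,k \le 2n} \Big|_{\substack{s_{2j} = s_{2j-1} \\ \forall j \in \{1, \ldots, n\}}}.
\]
Note that by using the property that $E_{\mathbf{k}} = E_{-\mathbf{k}}$ for all $\mathbf{k} \in \Gamma^*$, we can show $C(\mathbf{x}\sigma x, \mathbf{y}\tau y) \in \mathbb{R}$ for all $(\mathbf{x}, \sigma, x), (\mathbf{y}, \tau, y) \in \Gamma \times \{\uparrow, \downarrow\} \times [0, \beta)_h$. Thus, the function $g$ can be chosen to be real-valued. Then we see that
\[
\begin{aligned}
& \prod_{j=1}^{n} \Bigg( \sum_{l_{2j-1}=0}^{\beta h - 1} \int_{l_{2j-1}/h}^{(l_{2j-1}+1)/h} ds_{2j-1} \Bigg) \cdot \Bigg| \det(C(\mathbf{x}_j \sigma_j s_j, \mathbf{y}_k \sigma_k s_k))_{1 \le j,k \le 2n} \Big|_{\substack{s_{2j} = s_{2j-1} \\ \forall j \in \{1, \ldots, n\}}} - \det(C(\mathbf{x}_j \sigma_j l_j/h, \mathbf{y}_k \sigma_k l_k/h))_{1 \le j,k \le 2n} \Big|_{\substack{l_{2j} = l_{2j-1} \\ \forall j \in \{1, \ldots, n\}}} \Bigg| \\
& = \prod_{j=1}^{n} \Bigg( \sum_{l_{2j-1}=0}^{\beta h - 1} \int_{l_{2j-1}/h}^{(l_{2j-1}+1)/h} ds_{2j-1} \Bigg) (\chi_l + (1 - \chi_l)\chi_{l,s}) \\
& \qquad \cdot |g(s_1 - s_3, \ldots, s_{2n-3} - s_{2n-1}) - g(l_1/h - l_3/h, \ldots, l_{2n-3}/h - l_{2n-1}/h)|,
\end{aligned} \tag{3.6}
\]
where the functions $\chi_l$, $\chi_{l,s}$ are defined by
\[
\chi_l := \begin{cases} 1 & \text{if there exist } j, k \in \{1, \ldots, n\} \text{ such that } j \ne k \text{ and } l_{2j-1} = l_{2k-1}, \\ 0 & \text{otherwise,} \end{cases}
\]
and
\[
\chi_{l,s} := \begin{cases} 1 & \text{if } s_{2j-1} - s_{2k-1} \ne l_{2j-1}/h - l_{2k-1}/h \text{ for all } j, k \in \{1, \ldots, n\} \text{ with } j \ne k, \\ 0 & \text{otherwise.} \end{cases}
\]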
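Let us fix $l = (l_1, l_3, \ldots, l_{2n-1})$ and $s = (s_1, s_3, \ldots, s_{2n-1})$ with $l_{2j-1} \in \{0, 1, \ldots, \beta h - 1\}$, $s_{2j-1} \in (l_{2j-1}/h, (l_{2j-1} + 1)/h)$ for all $j \in \{1, \ldots, n\}$ satisfying $\chi_l = 0$ and $\chi_{l,s} = 1$. In this case $l_{2j-1} \ne l_{2k-1}$ and $s_{2j-1} - s_{2k-1} \ne l_{2j-1}/h - l_{2k-1}/h$ for all $j, k \in \{1, \ldots, n\}$ with $j \ne k$. Note that if $l_{2j-1} < l_{2k-1}$, then $l_{2j-1}/h - l_{2k-1}/h,\ s_{2j-1} - s_{2k-1} \in (-\beta, 0)$. If $l_{2j-1} > l_{2k-1}$, then $l_{2j-1}/h - l_{2k-1}/h,\ s_{2j-1} - s_{2k-1} \in (0, \beta)$. Let us set the interval $I(b, c)$ for $b, c \in \mathbb{R}$ with $b \ne c$ by
\[
I(b, c) := \begin{cases} [b, c] & \text{if } b < c, \\ [c, b] & \text{if } b > c. \end{cases}
\]
Then we see that $I(s_{2j-1} - s_{2k-1}, l_{2j-1}/h - l_{2k-1}/h) \subset (-\beta, \beta) \setminus \{0\}$ for all $j, k \in \{1, \ldots, n\}$ with $j \ne k$. Since $g \in C^{\infty}(((-\beta, \beta) \setminus \{0\})^{n(n-1)/2})$, the mean value theorem ensures that for any $j, k \in \{1, \ldots, n\}$ with $j < k$ there exists $\theta_{2j-1,2k-1} \in I(s_{2j-1} - s_{2k-1}, l_{2j-1}/h - l_{2k-1}/h)$ such that
\[
\begin{aligned}
& g(s_1 - s_3, \ldots, s_{2n-3} - s_{2n-1}) - g(l_1/h - l_3/h, \ldots, l_{2n-3}/h - l_{2n-1}/h) \\
& = \langle \nabla g(\theta_{1,3}, \ldots, \theta_{2n-3,2n-1}), (s_1 - s_3 - (l_1/h - l_3/h), \ldots, s_{2n-3} - s_{2n-1} - (l_{2n-3}/h - l_{2n-1}/h))^t \rangle,
\end{aligned}
\]
which leads to
\[
|g(s_1 - s_3, \ldots, s_{2n-3} - s_{2n-1}) - g(l_1/h - l_3/h, \ldots, l_{2n-3}/h - l_{2n-1}/h)| \le \frac{1}{h} \left( \frac{n(n-1)}{2} \right)^{1/2} \sup_{s \in ((-\beta, \beta) \setminus \{0\})^{n(n-1)/2}} |\nabla g(s)|. \tag{3.7}
\]
Moreover, by using Lemma 2.5, we see that for $j < k$,
\[
\begin{aligned}
& \left| \frac{\partial}{\partial (s_{2j-1} - s_{2k-1})} g(s_1 - s_3, \ldots, s_{2n-3} - s_{2n-1}) \right| \\
& \le \sum_{p_1=0}^{1} \sum_{p_2=0}^{1} \Bigg( \left| \frac{\partial}{\partial (s_{2j-1} - s_{2k-1})} C(\mathbf{x}_{2j-1+p_1} \sigma_{2j-1+p_1} s_{2j-1}, \mathbf{y}_{2k-1+p_2} \sigma_{2k-1+p_2} s_{2k-1}) \right| \\
& \qquad + \left| \frac{\partial}{\partial (s_{2j-1} - s_{2k-1})} C(\mathbf{x}_{2k-1+p_1} \sigma_{2k-1+p_1} s_{2k-1}, \mathbf{y}_{2j-1+p_2} \sigma_{2j-1+p_2} s_{2j-1}) \right| \Bigg) 4^{2n-1} \\
& \le 8 \cdot 4^{2n-1} \sup_{\mathbf{x} \in \Gamma,\ x \in (-\beta, \beta) \setminus \{0\}} \left| \frac{\partial}{\partial x} C(\mathbf{x}{\uparrow}x, \mathbf{0}{\uparrow}0) \right|.
\end{aligned} \tag{3.8}
\]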
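By (3.7) and (3.8), we have
\[
\begin{aligned}
& \prod_{j=1}^{n} \Bigg( \sum_{l_{2j-1}=0}^{\beta h - 1} \int_{l_{2j-1}/h}^{(l_{2j-1}+1)/h} ds_{2j-1} \Bigg) (1 - \chi_l)\chi_{l,s} \cdot |g(s_1 - s_3, \ldots, s_{2n-3} - s_{2n-1}) - g(l_1/h - l_3/h, \ldots, l_{2n-3}/h - l_{2n-1}/h)| \\
& \le \frac{n(n-1)\beta^n 4^{2n}}{h} \sup_{\mathbf{x} \in \Gamma,\ x \in (-\beta, \beta) \setminus \{0\}} \left| \frac{\partial}{\partial x} C(\mathbf{x}{\uparrow}x, \mathbf{0}{\uparrow}0) \right|.
\end{aligned} \tag{3.9}
\]
On the other hand, note that
\[
\sharp \{ l/h \in [0, \beta)_h^n \mid \chi_l = 1 \} = (\beta h)^n - \sharp \{ l/h \in [0, \beta)_h^n \mid \chi_l = 0 \} = (\beta h)^n - \binom{\beta h}{n} n! \le n^2 (\beta h)^{n-1}, \tag{3.10}
\]
where we used the inequality
\[
N^n - \binom{N}{n} n! \le n^2 N^{n-1},
\]
which holds for all $N \in \mathbb{N}$ and $n \in \{0, 1, \ldots, N\}$. By using Lemma 2.5 and (3.10), we obtain
\[
\begin{aligned}
& \prod_{j=1}^{n} \Bigg( \sum_{l_{2j-1}=0}^{\beta h - 1} \int_{l_{2j-1}/h}^{(l_{2j-1}+1)/h} ds_{2j-1} \Bigg) \chi_l \cdot |g(s_1 - s_3, \ldots, s_{2n-3} - s_{2n-1}) - g(l_1/h - l_3/h, \ldots, l_{2n-3}/h - l_{2n-1}/h)| \\
& \le 2h^{-n} n^2 (\beta h)^{n-1} 4^{2n} = \frac{2}{h} n^2 \beta^{n-1} 4^{2n}.
\end{aligned} \tag{3.11}
\]
Combining (3.6), (3.9), (3.11) with (3.5) shows
\[
\begin{aligned}
& \sup_{\substack{U_X \in \mathbb{C} \text{ with } |U_X| \le r \\ \forall X \in \Gamma^4}} |P(\{U_X\}_{X \in \Gamma^4}) - P_h(\{U_X\}_{X \in \Gamma^4})| \\
& \le \sum_{n=\beta h+1}^{\infty} \frac{2}{n!} (r\beta L^{4d})^n 4^{2n} + \frac{1}{h} \sum_{n=2}^{\beta h} \frac{1}{n!} (rL^{4d})^n \cdot \Bigg( n(n-1)\beta^n 4^{2n} \sup_{\mathbf{x} \in \Gamma,\ x \in (-\beta, \beta) \setminus \{0\}} \left| \frac{\partial}{\partial x} C(\mathbf{x}{\uparrow}x, \mathbf{0}{\uparrow}0) \right| + 2n^2 \beta^{n-1} 4^{2n} \Bigg) \\
& \to 0,
\end{aligned}
\]
as $h \to +\infty$, $h \in \mathbb{N}/\beta$.

Corollary 3.4. For all $U \in \mathbb{R}$
\[
\langle \psi^*_{\mathbf{x}_1 \uparrow} \psi^*_{\mathbf{x}_2 \downarrow} \psi_{\mathbf{y}_2 \downarrow} \psi_{\mathbf{y}_1 \uparrow} + \psi^*_{\mathbf{y}_1 \uparrow} \psi^*_{\mathbf{y}_2 \downarrow} \psi_{\mathbf{x}_2 \downarrow} \psi_{\mathbf{x}_1 \uparrow} \rangle
= -\frac{1}{\beta} \lim_{\substack{h \to +\infty \\ h \in \mathbb{N}/\beta}} \frac{\frac{\partial}{\partial \lambda_{\tilde X_1}} P_h(\{U_X\}_{X \in \Gamma^4})}{P_h(\{U_X\}_{X \in \Gamma^4})} \Bigg|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}}. \tag{3.12}
\]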
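Proof. The relation (2.4) and Cauchy's integral formula ensure that for any $\{\tilde U_X\}_{X \in \Gamma^4} \subset \mathbb{C}$ and $r > 0$
\[
\begin{aligned}
\frac{\partial}{\partial \lambda_{\tilde X_1}} (P_h - P)(\{\tilde U_X\}_{X \in \Gamma^4})
& = \left( \frac{\partial}{\partial U_{\tilde X_1}} + \frac{\partial}{\partial U_{\tilde X_2}} \right) (P_h - P)(\{\tilde U_X\}_{X \in \Gamma^4}) \\
& = \frac{1}{2\pi i} \Bigg( \oint_{|U_{\tilde X_1} - \tilde U_{\tilde X_1}| = r} dU_{\tilde X_1} \frac{(P_h - P)(\{U_X\}_{X \in \Gamma^4})}{(U_{\tilde X_1} - \tilde U_{\tilde X_1})^2} + \oint_{|U_{\tilde X_2} - \tilde U_{\tilde X_2}| = r} dU_{\tilde X_2} \frac{(P_h - P)(\{U_X\}_{X \in \Gamma^4})}{(U_{\tilde X_2} - \tilde U_{\tilde X_2})^2} \Bigg) \Bigg|_{\substack{U_X = \tilde U_X \\ \forall X \in \Gamma^4}}.
\end{aligned} \tag{3.13}
\]
By applying Proposition 3.2 to (3.13) we can show that for any $\tilde r > 0$ and any $\{\tilde U_X\}_{X \in \Gamma^4}$ with $|\tilde U_X| \le \tilde r$ ($\forall X \in \Gamma^4$)
\[
\left| \frac{\partial}{\partial \lambda_{\tilde X_1}} (P_h - P)(\{\tilde U_X\}_{X \in \Gamma^4}) \right| \le \frac{2}{r} \sup_{\substack{\{U_X\}_{X \in \Gamma^4} \subset \mathbb{C} \\ |U_X| \le r + \tilde r,\ \forall X \in \Gamma^4}} |P_h(\{U_X\}_{X \in \Gamma^4}) - P(\{U_X\}_{X \in \Gamma^4})| \to 0 \tag{3.14}
\]
as $h \to +\infty$, $h \in \mathbb{N}/\beta$. Combining (3.14) with Lemma 2.1 yields (3.12).

3.2. The Grassmann Gaussian integral

Dealing with the discretized partition function $P_h$ rather than $P$ is advantageous since the variables run in the finite set $\Gamma \times \{\uparrow, \downarrow\} \times [0, \beta)_h$ in every term of the power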
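series $P_h$. Accordingly, we can formulate $P_h$ as a Grassmann Gaussian integral on finite Grassmann algebras. Elementary calculus on finite Grassmann algebras has been summarized in the books [7, 21].

For convenience of calculation, especially in order to refer to Proposition C.7 shown in Appendix C, we assume that $h \in 2\mathbb{N}/\beta$ from now on. Let us number the elements of the set $\Gamma \times \{\uparrow, \downarrow\} \times [0, \beta)_h$ so that we can write $\Gamma \times \{\uparrow, \downarrow\} \times [0, \beta)_h = \{(\mathbf{x}_j, \sigma_j, x_j) \mid j \in \{1, \ldots, N\}\}$ with $N := 2L^d \beta h$. We then introduce a set of Grassmann algebras denoted by $\{\psi_{\mathbf{x}_j \sigma_j x_j}, \bar\psi_{\mathbf{x}_j \sigma_j x_j} \mid j \in \{1, \ldots, N\}\}$. Recall that the Grassmann algebra $\{\psi_{\mathbf{x}_j \sigma_j x_j}, \bar\psi_{\mathbf{x}_j \sigma_j x_j} \mid j \in \{1, \ldots, N\}\}$ satisfies the anti-commutation relations
\[
\psi_{\mathbf{x}_j \sigma_j x_j} \psi_{\mathbf{x}_k \sigma_k x_k} = -\psi_{\mathbf{x}_k \sigma_k x_k} \psi_{\mathbf{x}_j \sigma_j x_j}, \quad
\psi_{\mathbf{x}_j \sigma_j x_j} \bar\psi_{\mathbf{x}_k \sigma_k x_k} = -\bar\psi_{\mathbf{x}_k \sigma_k x_k} \psi_{\mathbf{x}_j \sigma_j x_j}, \quad
\bar\psi_{\mathbf{x}_j \sigma_j x_j} \bar\psi_{\mathbf{x}_k \sigma_k x_k} = -\bar\psi_{\mathbf{x}_k \sigma_k x_k} \bar\psi_{\mathbf{x}_j \sigma_j x_j}
\]
for all $j, k \in \{1, \ldots, N\}$. Let $\mathbb{C}[\psi_{\mathbf{x}_j \sigma_j x_j}, \bar\psi_{\mathbf{x}_j \sigma_j x_j} \mid j \in \{1, \ldots, N\}]$ denote the complex linear space spanned by all the monomials consisting of $\{\psi_{\mathbf{x}_j \sigma_j x_j}, \bar\psi_{\mathbf{x}_j \sigma_j x_j} \mid j \in \{1, \ldots, N\}\}$. As a linear functional on $\mathbb{C}[\psi_{\mathbf{x}_j \sigma_j x_j}, \bar\psi_{\mathbf{x}_j \sigma_j x_j} \mid j \in \{1, \ldots, N\}]$, the Grassmann integral $\int \cdot\, d\bar\psi_{\mathbf{x}_N \sigma_N x_N} \cdots d\bar\psi_{\mathbf{x}_1 \sigma_1 x_1} d\psi_{\mathbf{x}_N \sigma_N x_N} \cdots d\psi_{\mathbf{x}_1 \sigma_1 x_1}$ is defined as follows.
\[
\int \psi_{\mathbf{x}_1 \sigma_1 x_1} \cdots \psi_{\mathbf{x}_N \sigma_N x_N} \bar\psi_{\mathbf{x}_1 \sigma_1 x_1} \cdots \bar\psi_{\mathbf{x}_N \sigma_N x_N}\, d\bar\psi_{\mathbf{x}_N \sigma_N x_N} \cdots d\bar\psi_{\mathbf{x}_1 \sigma_1 x_1} d\psi_{\mathbf{x}_N \sigma_N x_N} \cdots d\psi_{\mathbf{x}_1 \sigma_1 x_1} := 1,
\]
\[
\int \psi_{\mathbf{x}_{j_1} \sigma_{j_1} x_{j_1}} \cdots \psi_{\mathbf{x}_{j_n} \sigma_{j_n} x_{j_n}} \bar\psi_{\mathbf{x}_{k_1} \sigma_{k_1} x_{k_1}} \cdots \bar\psi_{\mathbf{x}_{k_m} \sigma_{k_m} x_{k_m}}\, d\bar\psi_{\mathbf{x}_N \sigma_N x_N} \cdots d\bar\psi_{\mathbf{x}_1 \sigma_1 x_1} d\psi_{\mathbf{x}_N \sigma_N x_N} \cdots d\psi_{\mathbf{x}_1 \sigma_1 x_1} := 0
\]
if $n \ne N$ or $m \ne N$, and linearly extended onto the whole space.

Let us simply write the vectors of the Grassmann algebras $(\psi_{\mathbf{x}_1 \sigma_1 x_1}, \ldots, \psi_{\mathbf{x}_N \sigma_N x_N})$, $(\bar\psi_{\mathbf{x}_1 \sigma_1 x_1}, \ldots, \bar\psi_{\mathbf{x}_N \sigma_N x_N})$ as
\[
\psi_X = (\psi_{\mathbf{x}_1 \sigma_1 x_1}, \ldots, \psi_{\mathbf{x}_N \sigma_N x_N}), \qquad \bar\psi_X = (\bar\psi_{\mathbf{x}_1 \sigma_1 x_1}, \ldots, \bar\psi_{\mathbf{x}_N \sigma_N x_N}).
\]
In order to indicate the dependency on the parameter $h$, we write the covariance matrix as $C_h := (C(\mathbf{x}_j \sigma_j x_j, \mathbf{x}_k \sigma_k x_k))_{1 \le j,k \le N}$ and define a $2N \times 2N$ skew symmetric matrix $\mathbf{C}_h$ by
\[
\mathbf{C}_h := \begin{pmatrix} 0 & C_h \\ -C_h^t & 0 \end{pmatrix}.
\]
The diagonalization of $C_h$ is presented in Appendix C. Here we note the fact that $\det C_h \ne 0$, proved in Proposition C.7, to see that $\mathbf{C}_h$ is invertible.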
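The block structure of the $2N \times 2N$ skew symmetric matrix built from $C_h$ and $-C_h^t$ can be checked on a small example. The sketch below uses an arbitrary $3 \times 3$ toy matrix in place of $C_h$ (an assumption; the true covariance matrix is not computed here) and exact rational arithmetic. It relies on the elementary linear-algebra fact that the determinant of such a block matrix equals $(\det C)^2$, so the block matrix is invertible exactly when $C$ is. The helper names `determinant` and `skew_block` are hypothetical.

```python
from fractions import Fraction


def determinant(matrix):
    """Exact determinant by Gaussian elimination over Fractions."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero pivot.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def skew_block(c):
    """Build the 2N x 2N matrix [[0, C], [-C^t, 0]] from an N x N matrix C."""
    n = len(c)
    top = [[0] * n + list(c[i]) for i in range(n)]
    bottom = [[-c[j][i] for j in range(n)] + [0] * n for i in range(n)]
    return top + bottom


toy_c = [[2, 1, 0], [0, 1, 3], [1, 0, 1]]  # arbitrary invertible toy matrix
big = skew_block(toy_c)
is_skew = all(big[i][j] == -big[j][i] for i in range(6) for j in range(6))
```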
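For any $f(\psi_X, \bar\psi_X) \in \mathbb{C}[\psi_{\mathbf{x}_j \sigma_j x_j}, \bar\psi_{\mathbf{x}_j \sigma_j x_j} \mid j \in \{1, \ldots, N\}]$, $e^{f(\psi_X, \bar\psi_X)}$ is defined by
\[
e^{f(\psi_X, \bar\psi_X)} := e^{f(0,0)} \sum_{n=0}^{2N} \frac{1}{n!} (f(\psi_X, \bar\psi_X) - f(0, 0))^n,
\]
where $f(0, 0)$ denotes the constant part of $f(\psi_X, \bar\psi_X)$. Let us also write in short $d\bar\psi_X = d\bar\psi_{\mathbf{x}_N \sigma_N x_N} \cdots d\bar\psi_{\mathbf{x}_1 \sigma_1 x_1}$, $d\psi_X = d\psi_{\mathbf{x}_N \sigma_N x_N} \cdots d\psi_{\mathbf{x}_1 \sigma_1 x_1}$.

Definition 3.5. As a linear functional on $\mathbb{C}[\psi_{\mathbf{x}_j \sigma_j x_j}, \bar\psi_{\mathbf{x}_j \sigma_j x_j} \mid j \in \{1, \ldots, N\}]$, the Grassmann Gaussian integral $\int \cdot\, d\mu_{C_h}(\psi_X, \bar\psi_X)$ is defined by
\[
\int f(\psi_X, \bar\psi_X)\, d\mu_{C_h}(\psi_X, \bar\psi_X) := \frac{\int f(\psi_X, \bar\psi_X)\, e^{-\frac{1}{2} \langle (\psi_X, \bar\psi_X)^t, \mathbf{C}_h^{-1} (\psi_X, \bar\psi_X)^t \rangle}\, d\bar\psi_X d\psi_X}{\int e^{-\frac{1}{2} \langle (\psi_X, \bar\psi_X)^t, \mathbf{C}_h^{-1} (\psi_X, \bar\psi_X)^t \rangle}\, d\bar\psi_X d\psi_X}, \tag{3.15}
\]
for all $f(\psi_X, \bar\psi_X) \in \mathbb{C}[\psi_{\mathbf{x}_j \sigma_j x_j}, \bar\psi_{\mathbf{x}_j \sigma_j x_j} \mid j \in \{1, \ldots, N\}]$.

Remark 3.6. The denominator of (3.15) is non-zero. In fact, a direct calculation and the assumption $h \in 2\mathbb{N}/\beta$ show
\[
\int e^{-\frac{1}{2} \langle (\psi_X, \bar\psi_X)^t, \mathbf{C}_h^{-1} (\psi_X, \bar\psi_X)^t \rangle}\, d\bar\psi_X d\psi_X = (\det C_h)^{-1} (-1)^{N(N-1)/2} = (\det C_h)^{-1},
\]
which takes a positive value independent of $h$ by Proposition C.7.

The Grassmann Gaussian integral representation of $P_h$ is as follows.

Proposition 3.7. Assume that $\{U_X\}_{X \in \Gamma^4}$ ($\subset \mathbb{C}$) satisfies the equality that for all $\mathbf{x}, \mathbf{y}, \mathbf{z}, \mathbf{w} \in \Gamma$
\[
U_{\mathbf{x}, \mathbf{y}, \mathbf{z}, \mathbf{w}} = U_{\mathbf{z}, \mathbf{w}, \mathbf{x}, \mathbf{y}}. \tag{3.16}
\]
The following equality holds.
\[
P_h(\{U_X\}_{X \in \Gamma^4}) = \int e^{\sum_{\mathbf{x}, \mathbf{y}, \mathbf{z}, \mathbf{w} \in \Gamma} U_{\mathbf{x}, \mathbf{y}, \mathbf{z}, \mathbf{w}} V_{h, \mathbf{x}, \mathbf{y}, \mathbf{z}, \mathbf{w}}(\psi_X, \bar\psi_X)}\, d\mu_{C_h}(\psi_X, \bar\psi_X),
\]
where
\[
V_{h, \mathbf{x}, \mathbf{y}, \mathbf{z}, \mathbf{w}}(\psi_X, \bar\psi_X) := -\frac{1}{h} \sum_{x \in [0, \beta)_h} \bar\psi_{\mathbf{x}{\uparrow}x} \bar\psi_{\mathbf{y}{\downarrow}x} \psi_{\mathbf{w}{\downarrow}x} \psi_{\mathbf{z}{\uparrow}x}.
\]
Proof. By substituting the equalities $\int 1\, d\mu_{C_h}(\psi_X, \bar\psi_X) = 1$ and
\[
\int \psi_{\mathbf{x}_{j_n} \sigma_{j_n} x_{j_n}} \cdots \psi_{\mathbf{x}_{j_1} \sigma_{j_1} x_{j_1}} \bar\psi_{\mathbf{x}_{k_1} \sigma_{k_1} x_{k_1}} \cdots \bar\psi_{\mathbf{x}_{k_n} \sigma_{k_n} x_{k_n}}\, d\mu_{C_h}(\psi_X, \bar\psi_X) = \det(C_h(\mathbf{x}_{j_p} \sigma_{j_p} x_{j_p}, \mathbf{x}_{k_q} \sigma_{k_q} x_{k_q}))_{1 \le p,q \le n}
\]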
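for any $j_1, \ldots, j_n, k_1, \ldots, k_n \in \{1, 2, \ldots, N\}$ (see [7, Problem I.13]) into (3.1), we have
\[
\begin{aligned}
& P_h(\{U_X\}_{X \in \Gamma^4}) \\
& = 1 + \sum_{n=1}^{L^d \beta h} \frac{1}{n!} \prod_{j=1}^{n} \Bigg( -\frac{1}{h} \sum_{\substack{\mathbf{x}_{2j-1}, \mathbf{x}_{2j}, \mathbf{y}_{2j-1}, \mathbf{y}_{2j} \in \Gamma \\ \sigma_{2j-1}, \sigma_{2j} \in \{\uparrow, \downarrow\} \\ x_{2j-1}, x_{2j} \in [0, \beta)_h}} \delta_{\sigma_{2j-1}, \uparrow} \delta_{\sigma_{2j}, \downarrow} \delta_{x_{2j-1}, x_{2j}} U_{\mathbf{x}_{2j-1}, \mathbf{x}_{2j}, \mathbf{y}_{2j-1}, \mathbf{y}_{2j}} \Bigg) \\
& \qquad \cdot \int \psi_{\mathbf{x}_{2n} \sigma_{2n} x_{2n}} \cdots \psi_{\mathbf{x}_1 \sigma_1 x_1} \bar\psi_{\mathbf{y}_1 \sigma_1 x_1} \cdots \bar\psi_{\mathbf{y}_{2n} \sigma_{2n} x_{2n}}\, d\mu_{C_h}(\psi_X, \bar\psi_X) \\
& = \int \Bigg( 1 + \sum_{n=1}^{L^d \beta h} \frac{1}{n!} \prod_{j=1}^{n} \Bigg( -\frac{1}{h} \sum_{\substack{\mathbf{x}_{2j-1}, \mathbf{x}_{2j}, \mathbf{y}_{2j-1}, \mathbf{y}_{2j} \in \Gamma \\ \sigma_{2j-1}, \sigma_{2j} \in \{\uparrow, \downarrow\} \\ x_{2j-1}, x_{2j} \in [0, \beta)_h}} \delta_{\sigma_{2j-1}, \uparrow} \delta_{\sigma_{2j}, \downarrow} \delta_{x_{2j-1}, x_{2j}} U_{\mathbf{x}_{2j-1}, \mathbf{x}_{2j}, \mathbf{y}_{2j-1}, \mathbf{y}_{2j}} \\
& \qquad \cdot \psi_{\mathbf{x}_{2j} \sigma_{2j} x_{2j}} \psi_{\mathbf{x}_{2j-1} \sigma_{2j-1} x_{2j-1}} \bar\psi_{\mathbf{y}_{2j-1} \sigma_{2j-1} x_{2j-1}} \bar\psi_{\mathbf{y}_{2j} \sigma_{2j} x_{2j}} \Bigg) \Bigg)\, d\mu_{C_h}(\psi_X, \bar\psi_X) \\
& = \int e^{-\frac{1}{h} \sum_{\mathbf{x}, \mathbf{y}, \mathbf{z}, \mathbf{w} \in \Gamma} \sum_{x \in [0, \beta)_h} U_{\mathbf{x}, \mathbf{y}, \mathbf{z}, \mathbf{w}} \bar\psi_{\mathbf{x}{\uparrow}x} \bar\psi_{\mathbf{y}{\downarrow}x} \psi_{\mathbf{w}{\downarrow}x} \psi_{\mathbf{z}{\uparrow}x}}\, d\mu_{C_h}(\psi_X, \bar\psi_X).
\end{aligned} \tag{3.17}
\]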
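To obtain the last equality of (3.17) we used the equality (3.16).

As a corollary, our original partition functions and the correlation function are represented as a limit of the finite dimensional Grassmann integrals.

Corollary 3.8. For any $U \in \mathbb{R}$ and $\{\lambda_X\}_{X \in \Gamma^4} \subset \mathbb{R}$, the following equalities hold:
\[
\frac{\mathrm{Tr}\, e^{-\beta H_\lambda}}{\mathrm{Tr}\, e^{-\beta H_0}} = \lim_{\substack{h \to +\infty \\ h \in 2\mathbb{N}/\beta}} \int e^{\sum_{\mathbf{x}, \mathbf{y}, \mathbf{z}, \mathbf{w} \in \Gamma} U_{\mathbf{x}, \mathbf{y}, \mathbf{z}, \mathbf{w}} V_{h, \mathbf{x}, \mathbf{y}, \mathbf{z}, \mathbf{w}}(\psi_X, \bar\psi_X)}\, d\mu_{C_h}(\psi_X, \bar\psi_X), \tag{3.18}
\]
\[
\frac{\mathrm{Tr}\, e^{-\beta H}}{\mathrm{Tr}\, e^{-\beta H_0}} = \lim_{\substack{h \to +\infty \\ h \in 2\mathbb{N}/\beta}} \int e^{-\frac{U}{h} \sum_{\mathbf{x} \in \Gamma} \sum_{x \in [0, \beta)_h} \bar\psi_{\mathbf{x}{\uparrow}x} \bar\psi_{\mathbf{x}{\downarrow}x} \psi_{\mathbf{x}{\downarrow}x} \psi_{\mathbf{x}{\uparrow}x}}\, d\mu_{C_h}(\psi_X, \bar\psi_X), \tag{3.19}
\]
\[
\begin{aligned}
& \langle \psi^*_{\mathbf{x}_1 \uparrow} \psi^*_{\mathbf{x}_2 \downarrow} \psi_{\mathbf{y}_2 \downarrow} \psi_{\mathbf{y}_1 \uparrow} + \psi^*_{\mathbf{y}_1 \uparrow} \psi^*_{\mathbf{y}_2 \downarrow} \psi_{\mathbf{x}_2 \downarrow} \psi_{\mathbf{x}_1 \uparrow} \rangle \\
& = \lim_{\substack{h \to +\infty \\ h \in 2\mathbb{N}/\beta}} \frac{1}{\beta h} \frac{\int \sum_{x \in [0, \beta)_h} (\bar\psi_{\mathbf{x}_1{\uparrow}x} \bar\psi_{\mathbf{x}_2{\downarrow}x} \psi_{\mathbf{y}_2{\downarrow}x} \psi_{\mathbf{y}_1{\uparrow}x} + \bar\psi_{\mathbf{y}_1{\uparrow}x} \bar\psi_{\mathbf{y}_2{\downarrow}x} \psi_{\mathbf{x}_2{\downarrow}x} \psi_{\mathbf{x}_1{\uparrow}x}) \cdot e^{-\frac{U}{h} \sum_{\mathbf{x} \in \Gamma} \sum_{x \in [0, \beta)_h} \bar\psi_{\mathbf{x}{\uparrow}x} \bar\psi_{\mathbf{x}{\downarrow}x} \psi_{\mathbf{x}{\downarrow}x} \psi_{\mathbf{x}{\uparrow}x}}\, d\mu_{C_h}(\psi_X, \bar\psi_X)}{\int e^{-\frac{U}{h} \sum_{\mathbf{x} \in \Gamma} \sum_{x \in [0, \beta)_h} \bar\psi_{\mathbf{x}{\uparrow}x} \bar\psi_{\mathbf{x}{\downarrow}x} \psi_{\mathbf{x}{\downarrow}x} \psi_{\mathbf{x}{\uparrow}x}}\, d\mu_{C_h}(\psi_X, \bar\psi_X)}.
\end{aligned} \tag{3.20}
\]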
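Proof. Since the relation (2.4) implies the condition (3.16), we can apply Proposition 3.2 and Proposition 3.7 to deduce (3.18). The equality (3.19) is (3.18) for $\lambda_X = 0$ ($\forall X \in \Gamma^4$). Note the fact that
\[
\begin{aligned}
\frac{\partial}{\partial \lambda_{\tilde X_1}} e^{\sum_{X \in \Gamma^4} U_X V_{h,X}(\psi_X, \bar\psi_X)} \Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}}
= {} & -\frac{1}{h} \sum_{x \in [0, \beta)_h} (\bar\psi_{\mathbf{x}_1{\uparrow}x} \bar\psi_{\mathbf{x}_2{\downarrow}x} \psi_{\mathbf{y}_2{\downarrow}x} \psi_{\mathbf{y}_1{\uparrow}x} + \bar\psi_{\mathbf{y}_1{\uparrow}x} \bar\psi_{\mathbf{y}_2{\downarrow}x} \psi_{\mathbf{x}_2{\downarrow}x} \psi_{\mathbf{x}_1{\uparrow}x}) \\
& \cdot e^{-\frac{U}{h} \sum_{\mathbf{x} \in \Gamma} \sum_{x \in [0, \beta)_h} \bar\psi_{\mathbf{x}{\uparrow}x} \bar\psi_{\mathbf{x}{\downarrow}x} \psi_{\mathbf{x}{\downarrow}x} \psi_{\mathbf{x}{\uparrow}x}},
\end{aligned} \tag{3.21}
\]
where the differential operator $\partial/\partial \lambda_{\tilde X_1}$ is defined to act on every coefficient of Grassmann monomials in the expansion of $e^{\sum_{X \in \Gamma^4} U_X V_{h,X}(\psi_X, \bar\psi_X)}$ (see [7, Problem I.3]). Moreover, by expanding $e^{\sum_{X \in \Gamma^4} U_X V_{h,X}(\psi_X, \bar\psi_X)}$ one can verify the equality
\[
\frac{\partial}{\partial \lambda_{\tilde X_1}} \int e^{\sum_{X \in \Gamma^4} U_X V_{h,X}(\psi_X, \bar\psi_X)}\, d\mu_{C_h}(\psi_X, \bar\psi_X) = \int \frac{\partial}{\partial \lambda_{\tilde X_1}} e^{\sum_{X \in \Gamma^4} U_X V_{h,X}(\psi_X, \bar\psi_X)}\, d\mu_{C_h}(\psi_X, \bar\psi_X). \tag{3.22}
\]
The equality (3.20) follows from Corollary 3.4, Proposition 3.7, (3.21) and (3.22).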
4. Upper Bound on the Perturbation Series
In this section, we calculate upper bounds on our perturbation series $\sum_{n=0}^{\infty} a_n U^n$ by evaluating the tree formula for the connected part of the exponential of the Laplacian operator. In order to employ the Grassmann Gaussian integral formulation of $P_h$ developed in Sec. 3.2, we assume that $h \in 2\mathbb{N}/\beta$ throughout this section.
4.1. The connected part of the exponential of Laplacian operator
Our approach to finding an upper bound on $|a_n|$ of our perturbation series $\sum_{n=0}^{\infty} a_n U^n$ is based on the characterization of the connected part of the exponential of the Laplacian operator of Grassmann left derivatives reported in [23]. Let us construct our argument step by step to reveal the structure of the problem.
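The Grassmann integral $\int e^{\sum_{X \in \Gamma^4} U_X V_{h,X}(\psi_X, \bar\psi_X)}\, d\mu_{C_h}(\psi_X, \bar\psi_X)$ can be viewed as an analytic function of the multi-variable $\{U_X\}_{X \in \Gamma^4}$. Since
\[
\int e^{\sum_{X \in \Gamma^4} U_X V_{h,X}(\psi_X, \bar\psi_X)}\, d\mu_{C_h}(\psi_X, \bar\psi_X) \Big|_{\substack{U_X = 0 \\ \forall X \in \Gamma^4}} = 1,
\]
if $|U_X|$ is sufficiently small for all $X \in \Gamma^4$, the inequality
\[
\left| \int e^{\sum_{X \in \Gamma^4} U_X V_{h,X}(\psi_X, \bar\psi_X)}\, d\mu_{C_h}(\psi_X, \bar\psi_X) - 1 \right| < 1
\]
holds. Thus, we can define a function $W_h(\{U_X\}_{X \in \Gamma^4})$ by
\[
W_h(\{U_X\}_{X \in \Gamma^4}) := \log \left( \int e^{\sum_{X \in \Gamma^4} U_X V_{h,X}(\psi_X, \bar\psi_X)}\, d\mu_{C_h}(\psi_X, \bar\psi_X) \right),
\]
which is analytic in a neighborhood of 0 in $\mathbb{C}^{L^{4d}}$.

Lemma 4.1. For all $h \in 2\mathbb{N}/\beta$ and $n \in \mathbb{N} \cup \{0\}$, the following equality holds.
\[
a_{h,n} = -\frac{1}{\beta n!} \sum_{\mathbf{z}_1, \ldots, \mathbf{z}_n \in \Gamma} \Bigg( \frac{\partial^{n+1} W_h}{\partial U_{\mathbf{x}_1, \mathbf{x}_2, \mathbf{y}_1, \mathbf{y}_2} \partial U_{\mathbf{z}_1, \mathbf{z}_1, \mathbf{z}_1, \mathbf{z}_1} \cdots \partial U_{\mathbf{z}_n, \mathbf{z}_n, \mathbf{z}_n, \mathbf{z}_n}} \Bigg|_{\substack{U_X = 0 \\ \forall X \in \Gamma^4}} + \frac{\partial^{n+1} W_h}{\partial U_{\mathbf{y}_1, \mathbf{y}_2, \mathbf{x}_1, \mathbf{x}_2} \partial U_{\mathbf{z}_1, \mathbf{z}_1, \mathbf{z}_1, \mathbf{z}_1} \cdots \partial U_{\mathbf{z}_n, \mathbf{z}_n, \mathbf{z}_n, \mathbf{z}_n}} \Bigg|_{\substack{U_X = 0 \\ \forall X \in \Gamma^4}} \Bigg), \tag{4.1}
\]
where $a_{h,n}$ was defined in (3.2) and (3.3).

Proof. The Taylor expansion of $W_h(\{U_X\}_{X \in \Gamma^4})$ around 0 is given by
\[
W_h(\{U_X\}_{X \in \Gamma^4}) = W_h \Big|_{\substack{U_X = 0 \\ \forall X \in \Gamma^4}} + \sum_{n=1}^{\infty} \frac{1}{n!} \sum_{X_1, \ldots, X_n \in \Gamma^4} \frac{\partial^n W_h}{\partial U_{X_1} \cdots \partial U_{X_n}} \Bigg|_{\substack{U_X = 0 \\ \forall X \in \Gamma^4}} U_{X_1} \cdots U_{X_n}. \tag{4.2}
\]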
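Fix any $U \in \mathbb{C}$ with $|U| < \log 2/(16\beta L^{4d})$. Let $\varepsilon > 0$ be the constant claimed in Lemma 3.1(i). By using a parameter $\{\lambda_X\}_{X \in \Gamma^4}$ ($\subset \mathbb{C}$) with $|\lambda_X| \le \varepsilon$ ($\forall X \in \Gamma^4$) we define the variable $\{U_X\}_{X \in \Gamma^4}$ by the equality (2.4). Then the inequality $|P_h(\{U_X\}_{X \in \Gamma^4}) - 1| < 1$ holds by Lemma 3.1(i) and the condition (3.16) is satisfied. Thus, Proposition 3.7 ensures that $W_h(\{U_X\}_{X \in \Gamma^4}) = \log(P_h(\{U_X\}_{X \in \Gamma^4}))$ for this $\{U_X\}_{X \in \Gamma^4}$. Moreover, by the equalities (2.4) and (4.2) we have that
\[
\begin{aligned}
& -\frac{1}{\beta}\frac{\partial}{\partial \lambda_{\tilde X_1}} \log(P_h) \Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}}
= -\frac{1}{\beta}\frac{\partial}{\partial \lambda_{\tilde X_1}} W_h \Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}}
= -\frac{1}{\beta} \left( \frac{\partial}{\partial U_{\tilde X_1}} + \frac{\partial}{\partial U_{\tilde X_2}} \right) W_h \Big|_{\substack{\lambda_X = 0 \\ \forall X \in \Gamma^4}} \\
& = -\frac{1}{\beta} \sum_{n=1}^{\infty} \frac{U^{n-1}}{(n-1)!} \sum_{\mathbf{z}_1, \ldots, \mathbf{z}_{n-1} \in \Gamma} \Bigg( \frac{\partial^n W_h}{\partial U_{\tilde X_1} \partial U_{\mathbf{z}_1, \mathbf{z}_1, \mathbf{z}_1, \mathbf{z}_1} \cdots \partial U_{\mathbf{z}_{n-1}, \mathbf{z}_{n-1}, \mathbf{z}_{n-1}, \mathbf{z}_{n-1}}} \Bigg|_{\substack{U_X = 0 \\ \forall X \in \Gamma^4}} \\
& \qquad + \frac{\partial^n W_h}{\partial U_{\tilde X_2} \partial U_{\mathbf{z}_1, \mathbf{z}_1, \mathbf{z}_1, \mathbf{z}_1} \cdots \partial U_{\mathbf{z}_{n-1}, \mathbf{z}_{n-1}, \mathbf{z}_{n-1}, \mathbf{z}_{n-1}}} \Bigg|_{\substack{U_X = 0 \\ \forall X \in \Gamma^4}} \Bigg).
\end{aligned}
\]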
By uniqueness of Taylor series and Lemma 3.1(ii) we obtain (4.1).

Lemma 4.1 shows that upper bounds on $|a_{h,n}|$ can be obtained by characterizing the partial derivatives of $W_h$ at $U_X = 0$ ($\forall X \in \Gamma^4$), which is the route we follow from now on. Since $|a_{h,0}|$ can be evaluated directly from (3.2) by using Lemma 2.5, let us study the equality (4.1) for $n \ge 1$. Fix any $n \ge 1$ and use the simplified notations defined as follows.
\[
Z_j := (\mathbf{z}_j, \mathbf{z}_j, \mathbf{z}_j, \mathbf{z}_j) \in \Gamma^4 \ \text{ for } \mathbf{z}_j \in \Gamma\ (\forall j \in \{1, \ldots, n\}), \qquad Z_0 := \tilde X_1 \in \Gamma^4. \tag{4.3}
\]
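Set $\mathbb{N}_{n+1} := \{0, 1, \ldots, n\}$. By noting that
\[
\prod_{j \in Q} \frac{\partial}{\partial U_{Z_j}} \int e^{\sum_{X \in \Gamma^4} U_X V_{h,X}(\psi_X, \bar\psi_X)}\, d\mu_{C_h}(\psi_X, \bar\psi_X) \Big|_{\substack{U_X = 0 \\ \forall X \in \Gamma^4}}
= \prod_{j \in Q} \frac{\partial}{\partial U_{Z_j}} \int \prod_{j \in \mathbb{N}_{n+1}} (1 + U_{Z_j} V_{h,Z_j}(\psi_X, \bar\psi_X))\, d\mu_{C_h}(\psi_X, \bar\psi_X) \Big|_{\substack{U_{Z_j} = 0 \\ \forall j \in \mathbb{N}_{n+1}}}
\]
for any $Q \subset \mathbb{N}_{n+1}$, we see that
\[
\begin{aligned}
\prod_{j \in \mathbb{N}_{n+1}} \frac{\partial}{\partial U_{Z_j}} W_h \Big|_{\substack{U_X = 0 \\ \forall X \in \Gamma^4}}
& = \prod_{j \in \mathbb{N}_{n+1}} \frac{\partial}{\partial U_{Z_j}} \log \Bigg( \int \prod_{j \in \mathbb{N}_{n+1}} (1 + U_{Z_j} V_{h,Z_j}(\psi_X, \bar\psi_X))\, d\mu_{C_h}(\psi_X, \bar\psi_X) \Bigg) \Bigg|_{\substack{U_{Z_j} = 0 \\ \forall j \in \mathbb{N}_{n+1}}} \\
& = \prod_{j \in \mathbb{N}_{n+1}} \frac{\partial}{\partial U_{Z_j}} \log \Bigg( 1 + \sum_{\substack{Q \subset \mathbb{N}_{n+1} \\ Q \ne \emptyset}} \int \prod_{j \in Q} V_{h,Z_j}(\psi_X, \bar\psi_X)\, d\mu_{C_h}(\psi_X, \bar\psi_X) \prod_{j \in Q} U_{Z_j} \Bigg) \Bigg|_{\substack{U_{Z_j} = 0 \\ \forall j \in \mathbb{N}_{n+1}}}.
\end{aligned} \tag{4.4}
\]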
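The Grassmann Gaussian integral contained in the right-hand side of (4.4) can be rewritten as follows.

Lemma 4.2. Introduce Grassmann algebras $\{\psi^q_{\mathbf{x}_j \sigma_j x_j}, \bar\psi^q_{\mathbf{x}_j \sigma_j x_j} \mid j \in \{1, \ldots, N\}\}$ indexed by $q \in \mathbb{N}_{n+1}$ and write
\[
\psi^q_X = (\psi^q_{\mathbf{x}_1 \sigma_1 x_1}, \ldots, \psi^q_{\mathbf{x}_N \sigma_N x_N}), \qquad \bar\psi^q_X = (\bar\psi^q_{\mathbf{x}_1 \sigma_1 x_1}, \ldots, \bar\psi^q_{\mathbf{x}_N \sigma_N x_N})
\]
for all $q \in \mathbb{N}_{n+1}$. Let $\partial/\partial \psi^q_X$, $\partial/\partial \bar\psi^q_X$ be the vectors of left derivatives associated with $\psi^q_X$, $\bar\psi^q_X$, respectively. Then, the following equality holds. For all $Q \subset \mathbb{N}_{n+1}$ with $Q \ne \emptyset$
\[
\int \prod_{q \in Q} V_{h,Z_q}(\psi_X, \bar\psi_X)\, d\mu_{C_h}(\psi_X, \bar\psi_X) = e^{\Delta} \prod_{q \in Q} V_{h,Z_q}(\psi^q_X, \bar\psi^q_X) \Big|_{\substack{\psi^q_X = \bar\psi^q_X = 0 \\ \forall q \in Q}}, \tag{4.5}
\]
where the Laplacian operator $\Delta$ and its exponential $e^{\Delta}$ are defined by
\[
\Delta := -\sum_{p, q \in \mathbb{N}_{n+1}} \left\langle \left( \frac{\partial}{\partial \psi^p_X} \right)^t, C_h \left( \frac{\partial}{\partial \bar\psi^q_X} \right)^t \right\rangle, \qquad e^{\Delta} := \sum_{l=0}^{2N(n+1)} \frac{1}{l!} \Delta^l.
\]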
Remark 4.3. When we introduce another set of Grassmann algebras, we regard the complex linear space spanned by monomials of all the Grassmann algebras introduced up to this point as being equipped with a multiplication satisfying the anti-commutation relations between these algebras. The notion of the Grassmann integral $\int \cdot\, d\bar\psi_X d\psi_X$ is naturally extended to be a linear map from the enlarged linear space of all the algebras to the subspace without $\psi_X$, $\bar\psi_X$. For a monomial $\phi_{j_1} \cdots \phi_{j_n}$ of Grassmann algebras $\{\phi_l\}_{l=1}^{m}$, the left derivative $(\partial/\partial \phi_l)\phi_{j_1} \cdots \phi_{j_n}$ ($l \in \{1, \ldots, m\}$) is defined by
\[
\frac{\partial}{\partial \phi_l} \phi_{j_1} \cdots \phi_{j_n} := \begin{cases} (-1)^{k-1} \phi_{j_1} \cdots \phi_{j_{k-1}} \phi_{j_{k+1}} \cdots \phi_{j_n} & \text{if there uniquely exists } k \in \{1, \ldots, n\} \text{ s.t. } l = j_k, \\ 0 & \text{otherwise.} \end{cases}
\]
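The sign rule in this definition can be made concrete with a few lines of code. The sketch below represents a monomial $\phi_{j_1} \cdots \phi_{j_n}$ by the tuple of its generator indices and a polynomial by a dictionary from such tuples to coefficients. This encoding and the function names are assumptions made for illustration; they are not taken from the paper or from any library.

```python
def left_derivative(l, monomial):
    """Apply d/d(phi_l) to the monomial phi_{j_1} ... phi_{j_n}.

    `monomial` is a tuple (j_1, ..., j_n) of generator indices.
    Returns (sign, remaining_monomial); sign 0 encodes the zero element.
    """
    positions = [k for k, j in enumerate(monomial) if j == l]
    if len(positions) != 1:
        # phi_l is absent, or does not occur at a unique position.
        return 0, ()
    k = positions[0]  # 0-based position, so the sign (-1)^(k-1) becomes (-1)**k
    return (-1) ** k, monomial[:k] + monomial[k + 1:]


def left_derivative_poly(l, poly):
    """Linear extension to polynomials given as {monomial_tuple: coefficient}."""
    result = {}
    for monomial, coeff in poly.items():
        sign, rest = left_derivative(l, monomial)
        if sign != 0:
            result[rest] = result.get(rest, 0) + sign * coeff
    return result
```

For instance, differentiating $\phi_1 \phi_2 \phi_3$ with respect to $\phi_2$ moves the derivative past one generator and therefore picks up a factor $-1$.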
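Then, the left derivative $\partial/\partial \phi_l$ is extended to be a linear map on the linear space of monomials of the algebras $\{\phi_l\}_{l=1}^{m}$. The concepts of Grassmann integrals and left derivatives are generally defined as operators on Grassmann algebra with coefficients in a superalgebra (see [7, Chap. I]).

Proof of Lemma 4.2. We define another Grassmann algebra $\{\eta^q_X, \bar\eta^q_X\}$ indexed by the sets $\Gamma \times \{\uparrow, \downarrow\} \times [0, \beta)_h$ and $\mathbb{N}_{n+1}$ and the associated left derivative $\{\partial/\partial \eta^q_X, \partial/\partial \bar\eta^q_X\}$ in the same way as $\{\psi^q_X, \bar\psi^q_X\}$ and $\{\partial/\partial \psi^q_X, \partial/\partial \bar\psi^q_X\}$. Then, we see that for any subset $Q \subset \mathbb{N}_{n+1}$ with $Q \ne \emptyset$
\[
\begin{aligned}
\int \prod_{q \in Q} V_{h,Z_q}(\psi_X, \bar\psi_X)\, d\mu_{C_h}(\psi_X, \bar\psi_X)
& = \int \prod_{q \in Q} V_{h,Z_q}\left( \frac{\partial}{\partial \eta^q_X}, \frac{\partial}{\partial \bar\eta^q_X} \right) e^{\sum_{q \in Q} \langle (\eta^q_X, \bar\eta^q_X)^t, (\psi_X, \bar\psi_X)^t \rangle}\, d\mu_{C_h}(\psi_X, \bar\psi_X) \Big|_{\substack{\eta^q_X = \bar\eta^q_X = 0 \\ \forall q \in Q}} \\
& = \prod_{q \in Q} V_{h,Z_q}\left( \frac{\partial}{\partial \eta^q_X}, \frac{\partial}{\partial \bar\eta^q_X} \right) \int e^{\sum_{q \in Q} \langle (\eta^q_X, \bar\eta^q_X)^t, (\psi_X, \bar\psi_X)^t \rangle}\, d\mu_{C_h}(\psi_X, \bar\psi_X) \Big|_{\substack{\eta^q_X = \bar\eta^q_X = 0 \\ \forall q \in Q}} \\
& = \prod_{q \in Q} V_{h,Z_q}\left( \frac{\partial}{\partial \eta^q_X}, \frac{\partial}{\partial \bar\eta^q_X} \right) e^{-\sum_{p, q \in Q} \langle (\eta^p_X)^t, C_h (\bar\eta^q_X)^t \rangle} \Big|_{\substack{\eta^q_X = \bar\eta^q_X = 0 \\ \forall q \in Q}} \\
& = e^{\Delta} \prod_{q \in Q} V_{h,Z_q}(\psi^q_X, \bar\psi^q_X) \Big|_{\substack{\psi^q_X = \bar\psi^q_X = 0 \\ \forall q \in Q}},
\end{aligned} \tag{4.6}
\]
where we have used the equality that
\[
\int e^{\sum_{q \in Q} \langle (\eta^q_X, \bar\eta^q_X)^t, (\psi_X, \bar\psi_X)^t \rangle}\, d\mu_{C_h}(\psi_X, \bar\psi_X) = e^{-\sum_{p, q \in Q} \langle (\eta^p_X)^t, C_h (\bar\eta^q_X)^t \rangle}
\]
(see [7, Problem I.13]). To verify the equalities (4.6) in more detail, see the books [7, 21] for the properties of left derivatives.

By combining (4.4) with (4.5) we obtain
\[
\prod_{j \in \mathbb{N}_{n+1}} \frac{\partial}{\partial U_{Z_j}} W_h \Big|_{\substack{U_X = 0 \\ \forall X \in \Gamma^4}}
= \prod_{j \in \mathbb{N}_{n+1}} \frac{\partial}{\partial U_{Z_j}} \log \Bigg( 1 + \sum_{\substack{Q \subset \mathbb{N}_{n+1} \\ Q \ne \emptyset}} e^{\Delta} \prod_{q \in Q} V_{h,Z_q}(\psi^q_X, \bar\psi^q_X) \Big|_{\substack{\psi^q_X = \bar\psi^q_X = 0 \\ \forall q \in Q}} \prod_{p \in Q} U_{Z_p} \Bigg) \Bigg|_{\substack{U_{Z_q} = 0 \\ \forall q \in \mathbb{N}_{n+1}}}. \tag{4.7}
\]
In order to characterize the right-hand side of (4.7), let us review the general theory developed in [23] by translating it into our setting. Consider a map $\alpha$ from the power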
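set $\mathcal{P}(\mathbb{N}_{n+1})$ of $\mathbb{N}_{n+1}$ to $\mathbb{C}$ defined by $\alpha(\emptyset) := 1$ and for $Q \in \mathcal{P}(\mathbb{N}_{n+1}) \setminus \{\emptyset\}$
\[
\alpha(Q) := e^{\Delta} \prod_{q \in Q} V_{h,Z_q}(\psi^q_X, \bar\psi^q_X) \Big|_{\substack{\psi^q_X = \bar\psi^q_X = 0 \\ \forall q \in Q}}.
\]
By [23, Lemma 1] there uniquely exists a map $\alpha_c : \mathcal{P}(\mathbb{N}_{n+1}) \to \mathbb{C}$ such that for all $Q \in \mathcal{P}(\mathbb{N}_{n+1}) \setminus \{\emptyset\}$
\[
\alpha(Q) = \sum_{\substack{Q_0 \subset Q \\ \min Q \in Q_0}} \alpha_c(Q_0)\, \alpha(Q \setminus Q_0),
\]
where $\min Q$ stands for the smallest number contained in $Q$. In [23, Lemma 2] it was proved that the right-hand side of (4.7) is equal to $\alpha_c(\mathbb{N}_{n+1})$, which is called the connected part of the operator $e^{\Delta}$. The formula for $\alpha_c(\mathbb{N}_{n+1})$ was given in [23, Theorem 3]. We summarize the result below.

Lemma 4.4 [23, Theorem 3]. The following equality holds.
\[
\begin{aligned}
\prod_{j \in \mathbb{N}_{n+1}} \frac{\partial}{\partial U_{Z_j}} W_h \Big|_{\substack{U_X = 0 \\ \forall X \in \Gamma^4}}
= {} & \sum_{T \in \mathbb{T}(\mathbb{N}_{n+1})} \prod_{\{q, q'\} \in T} (\Delta_{q,q'} + \Delta_{q',q}) \cdot \int_{[0,1]^n} ds \sum_{\pi \in S_{n+1}(T)} \phi(T, \pi, s)\, e^{\Delta(M(T, \pi, s))} \\
& \cdot \prod_{q \in \mathbb{N}_{n+1}} V_{h,Z_q}(\psi^q_X, \bar\psi^q_X) \Big|_{\substack{\psi^q_X = \bar\psi^q_X = 0 \\ \forall q \in \mathbb{N}_{n+1}}},
\end{aligned} \tag{4.8}
\]
where $\mathbb{T}(\mathbb{N}_{n+1})$ is the set of all the trees (connected graphs without loop) on $\mathbb{N}_{n+1}$,
\[
\Delta_{q,q'} := -\left\langle \left( \frac{\partial}{\partial \psi^q_X} \right)^t, C_h \left( \frac{\partial}{\partial \bar\psi^{q'}_X} \right)^t \right\rangle,
\]
$s := (s_1, \ldots, s_n)$, $S_{n+1}(T)$ is a subset of $S_{n+1}$ depending on $T$, $\phi(T, \pi, s)$ is a real-valued non-negative function of $s$ depending on $T$ and $\pi$ with the property that
\[
\int_{[0,1]^n} ds \sum_{\pi \in S_{n+1}(T)} \phi(T, \pi, s) = 1, \tag{4.9}
\]
$M(T, \pi, s)$ is an $(n+1) \times (n+1)$ real symmetric non-negative matrix depending on $T, \pi, s$ satisfying $M(T, \pi, s)_{q,q} = 1$ for all $q \in \mathbb{N}_{n+1}$, and the operator $\Delta(M(T, \pi, s))$ is defined by
\[
\Delta(M(T, \pi, s)) := \sum_{p, q \in \mathbb{N}_{n+1}} M(T, \pi, s)_{p,q} \Delta_{p,q}.
\]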
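In order to bound $|a_{h,n}|$, the tree formula (4.8) will be evaluated in the rest of this section.

4.2. Evaluation of upper bounds

Here we evaluate the tree expansion given in Lemma 4.4. Let us first prepare some necessary tools. The following lemma essentially uses the determinant bound Lemma 2.5.

Lemma 4.5. For any $l \in \mathbb{N}$ and any $p_m, q_m \in \mathbb{N}_{n+1}$, $(\mathbf{x}_{j_m}, \sigma_{j_m}, x_{j_m}), (\mathbf{x}_{k_m}, \sigma_{k_m}, x_{k_m}) \in \Gamma \times \{\uparrow, \downarrow\} \times [0, \beta)_h$ ($\forall m \in \{1, \ldots, l\}$)
\[
\left| e^{\Delta(M(T, \pi, s))} \psi^{p_1}_{\mathbf{x}_{j_1} \sigma_{j_1} x_{j_1}} \cdots \psi^{p_l}_{\mathbf{x}_{j_l} \sigma_{j_l} x_{j_l}} \bar\psi^{q_1}_{\mathbf{x}_{k_1} \sigma_{k_1} x_{k_1}} \cdots \bar\psi^{q_l}_{\mathbf{x}_{k_l} \sigma_{k_l} x_{k_l}} \Big|_{\substack{\psi^q_X = \bar\psi^q_X = 0 \\ \forall q \in \mathbb{N}_{n+1}}} \right| \le 4^l.
\]
Proof. Since $M(T, \pi, s)$ is a non-negative real symmetric matrix, there are constants $\gamma_q \ge 0$ ($q \in \mathbb{N}_{n+1}$) and projection matrices $P_q$ ($q \in \mathbb{N}_{n+1}$) satisfying $P_p P_q = 0$ for all $p, q \in \mathbb{N}_{n+1}$ with $p \ne q$ such that $M(T, \pi, s) = \sum_{q=0}^{n} \gamma_q P_q$. Define an $(n+1) \times (n+1)$ real matrix $\tilde M$ by $\tilde M := \sum_{q=0}^{n} \sqrt{\gamma_q} P_q$. Then we see that
\[
M(T, \pi, s) = \tilde M^t \tilde M. \tag{4.10}
\]
By writing $\tilde M = (v_0, \ldots, v_n)$ with vectors $v_q \in \mathbb{R}^{n+1}$ ($q \in \mathbb{N}_{n+1}$), the equality (4.10) implies that $M(T, \pi, s)_{p,q} = \langle v_p, v_q \rangle$ for all $p, q \in \mathbb{N}_{n+1}$. The property that $M(T, \pi, s)_{q,q} = 1$ ($\forall q \in \mathbb{N}_{n+1}$) ensures that for all $q \in \mathbb{N}_{n+1}$
\[
|v_q| = 1. \tag{4.11}
\]
Then we observe that
\[
\begin{aligned}
& e^{\Delta(M(T, \pi, s))} \psi^{p_1}_{\mathbf{x}_{j_1} \sigma_{j_1} x_{j_1}} \cdots \psi^{p_l}_{\mathbf{x}_{j_l} \sigma_{j_l} x_{j_l}} \bar\psi^{q_1}_{\mathbf{x}_{k_1} \sigma_{k_1} x_{k_1}} \cdots \bar\psi^{q_l}_{\mathbf{x}_{k_l} \sigma_{k_l} x_{k_l}} \Big|_{\substack{\psi^q_X = \bar\psi^q_X = 0 \\ \forall q \in \mathbb{N}_{n+1}}} \\
& = (-1)^{l(l-1)/2} \frac{1}{l!} \Bigg( \sum_{p, q \in \mathbb{N}_{n+1}} \langle v_p, v_q \rangle \Delta_{p,q} \Bigg)^l \cdot \prod_{m=1}^{l} \psi^{p_m}_{\mathbf{x}_{j_m} \sigma_{j_m} x_{j_m}} \bar\psi^{q_m}_{\mathbf{x}_{k_m} \sigma_{k_m} x_{k_m}} \Big|_{\substack{\psi^q_X = \bar\psi^q_X = 0 \\ \forall q \in \mathbb{N}_{n+1}}} \\
& = (-1)^{l(l-1)/2} \det(\langle v_{p_s}, v_{q_t} \rangle C_h(\mathbf{x}_{j_s} \sigma_{j_s} x_{j_s}, \mathbf{x}_{k_t} \sigma_{k_t} x_{k_t}))_{1 \le s,t \le l}.
\end{aligned} \tag{4.12}
\]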
By noting (4.11) we can apply Lemma 2.5 to (4.12) to deduce the desired inequality.

One point that needs careful treatment in the evaluation of the right-hand side of (4.8) is the combinatorial factor, which arises in the expansion of $\prod_{\{q, q'\} \in T} (\Delta_{q,q'} + \Delta_{q',q}) \prod_{q \in \mathbb{N}_{n+1}} V_{h,Z_q}(\psi^q_X, \bar\psi^q_X)$. In order to count the combinatorial factor explicitly, we need to prepare some notions concerning trees. Take any $T \in \mathbb{T}(\mathbb{N}_{n+1})$ and for any $q \in \mathbb{N}_{n+1}$ let $d_q$ ($\in \mathbb{N}$) denote the incidence number of the vertex $q$ (the number of lines connected to the vertex $q$). From now on, we always regard any tree in $\mathbb{T}(\mathbb{N}_{n+1})$ as starting from the vertex 0. For any $q \in \mathbb{N}_{n+1}$ let $L_q(T)$ ($\subset T$) be the set of lines from the vertex $q$ to the vertices of the later generation. We see that
\[
\sharp L_0(T) = d_0, \qquad \sharp L_q(T) = d_q - 1, \quad \forall q \in \mathbb{N}_{n+1} \setminus \{0\}.
\]
We define the combinatorial factor $N(T)$ we want to calculate as follows.

Definition 4.6. For any $T \in \mathbb{T}(\mathbb{N}_{n+1})$ the combinatorial factor $N(T)$ ($\in \mathbb{N}$) is defined as the total number of monomials appearing in the expansion of
\[
\prod_{\{q, q'\} \in T} (\Delta_{q,q'} + \Delta_{q',q}) \prod_{q \in \mathbb{N}_{n+1}} \bar\psi^q_{\mathbf{z}_{q_1}{\uparrow}x_{q_1}} \bar\psi^q_{\mathbf{z}_{q_2}{\downarrow}x_{q_2}} \psi^q_{\mathbf{z}_{q_3}{\uparrow}x_{q_3}} \psi^q_{\mathbf{z}_{q_4}{\downarrow}x_{q_4}}. \tag{4.13}
\]
Note that $N(T)$ is independent of how to choose $\mathbf{z}_{q_j} \in \Gamma$, $x_{q_j} \in [0, \beta)_h$ ($\forall j \in \{1, 2, 3, 4\}$, $\forall q \in \mathbb{N}_{n+1}$). The combinatorial factor $N(T)$ is counted as follows.

Lemma 4.7. For $T \in \mathbb{T}(\mathbb{N}_{n+1})$ let $d_q$ ($q \in \mathbb{N}_{n+1}$) denote the incidence number of the vertex $q$ in $T$. If there is $q \in \mathbb{N}_{n+1}$ such that $d_q > 4$, $N(T) = 0$. Otherwise,
\[
N(T) = 4 \prod_{q \in \mathbb{N}_{n+1}} \binom{3}{d_q - 1} (d_q - 1)!.
\]
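Proof. Set $W := \prod_{q \in \mathbb{N}_{n+1}} \bar\psi^q_{\mathbf{z}_{q_1}{\uparrow}x_{q_1}} \bar\psi^q_{\mathbf{z}_{q_2}{\downarrow}x_{q_2}} \psi^q_{\mathbf{z}_{q_3}{\uparrow}x_{q_3}} \psi^q_{\mathbf{z}_{q_4}{\downarrow}x_{q_4}}$. If there is $p \in \mathbb{N}_{n+1}$ such that $d_p > 4$, every term in the expansion of the product $\prod_{\{q, q'\} \in T} (\Delta_{q,q'} + \Delta_{q',q})$ contains more than 4 derivatives with respect to the Grassmann algebras indexed by $p$. Since the number of the Grassmann algebras with index $p$ in $W$ is 4, (4.13) must vanish.

Let us consider the case that $d_q \in \{1, 2, 3, 4\}$ for all $q \in \mathbb{N}_{n+1}$. The operator $\Delta_{q,q'}$ can be decomposed as $\Delta_{q,q'} = \sum_{\sigma \in \{\uparrow, \downarrow\}} \Delta^{\sigma}_{q,q'}$, where
\[
\Delta^{\sigma}_{q,q'} := -\sum_{\substack{\mathbf{x}, \mathbf{y} \in \Gamma \\ x, y \in [0, \beta)_h}} C_h(\mathbf{x}\sigma x, \mathbf{y}\sigma y) \frac{\partial}{\partial \psi^q_{\mathbf{x}\sigma x}} \frac{\partial}{\partial \bar\psi^{q'}_{\mathbf{y}\sigma y}} \tag{4.14}
\]
for $\sigma \in \{\uparrow, \downarrow\}$. Note that for any $\sigma \in \{\uparrow, \downarrow\}$ and $p, p', p'' \in \mathbb{N}_{n+1}$
\[
\Delta^{\sigma}_{p,p'} \Delta^{\sigma}_{p,p''} W = \Delta^{\sigma}_{p',p} \Delta^{\sigma}_{p'',p} W = 0. \tag{4.15}
\]
By changing the numbering of vertices if necessary we may assume the following condition on $T$ without loss of generality.

($\clubsuit$) The distance between the vertex $p$ and the initial vertex 0 is less than or equal to that between the vertex $q$ and the vertex 0 for all $p, q \in \mathbb{N}_{n+1} \setminus \{0\}$ with $p \le q$.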
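Note that
$$\prod_{\{q,q'\}\in T}(\Delta_{q,q'}+\Delta_{q',q})W=\prod_{\substack{q\in N_{n+1}\\ L_q(T)\ne\emptyset}}\ \prod_{\{q,p\}\in L_q(T)}(\Delta^\uparrow_{q,p}+\Delta^\downarrow_{q,p}+\Delta^\uparrow_{p,q}+\Delta^\downarrow_{p,q})W. \tag{4.16}$$
Let us count $N(T)$ recursively with respect to $q\in N_{n+1}$ as follows. The expansion of the product $\prod_{\{0,p\}\in L_0(T)}(\Delta^\uparrow_{0,p}+\Delta^\downarrow_{0,p}+\Delta^\uparrow_{p,0}+\Delta^\downarrow_{p,0})$ is a sum of $4^{\sharp L_0(T)}$ terms, each of which is a product of $\sharp L_0(T)$ Laplacians. By the property (4.15) any term containing the products $\Delta^\sigma_{0,q}\Delta^\sigma_{0,q'}$ or $\Delta^\sigma_{q,0}\Delta^\sigma_{q',0}$ for some $\sigma\in\{\uparrow,\downarrow\}$, $\{0,q\},\{0,q'\}\in L_0(T)$ does not contribute to the number of remaining monomials in (4.16) and thus can be eliminated. Therefore, we only need to count
$$\binom{4}{\sharp L_0(T)}\,\sharp L_0(T)!$$
terms in the expansion of $\prod_{\{0,p\}\in L_0(T)}(\Delta^\uparrow_{0,p}+\Delta^\downarrow_{0,p}+\Delta^\uparrow_{p,0}+\Delta^\downarrow_{p,0})$.

Take any $q\in N_{n+1}\setminus\{0\}$ with $L_q(T)\ne\emptyset$. By the condition (♣) there uniquely exists $q'\in N_{n+1}$ with $q'<q$ such that $\{q',q\}\in L_{q'}(T)$. Thus, every term in the expansion of the product
$$\prod_{\substack{j\in N_{n+1},\,L_j(T)\ne\emptyset\\ j<q}}\ \prod_{\{j,p\}\in L_j(T)}(\Delta^\uparrow_{j,p}+\Delta^\downarrow_{j,p}+\Delta^\uparrow_{p,j}+\Delta^\downarrow_{p,j})$$
contains one of the Laplacians of the form $\Delta^\sigma_{q,q'}$, $\Delta^\sigma_{q',q}$ ($\sigma\in\{\uparrow,\downarrow\}$). Therefore, by the property (4.15) only
$$\binom{3}{\sharp L_q(T)}\,\sharp L_q(T)!$$
terms in the expansion of the product $\prod_{\{q,p\}\in L_q(T)}(\Delta^\uparrow_{q,p}+\Delta^\downarrow_{q,p}+\Delta^\uparrow_{p,q}+\Delta^\downarrow_{p,q})$ need to be counted. By repeating this argument recursively for all $q\in N_{n+1}\setminus\{0\}$ we can calculate $N(T)$ as follows.
$$\begin{aligned}
N(T)&=\binom{4}{\sharp L_0(T)}\sharp L_0(T)!\prod_{\substack{q\in N_{n+1}\setminus\{0\}\\ L_q(T)\ne\emptyset}}\binom{3}{\sharp L_q(T)}\sharp L_q(T)!\\
&=\binom{4}{d_0}d_0!\prod_{\substack{q\in N_{n+1}\setminus\{0\}\\ d_q\ge2}}\binom{3}{d_q-1}(d_q-1)!\\
&=4\prod_{q\in N_{n+1}}\binom{3}{d_q-1}(d_q-1)!,
\end{aligned}$$
where we used the fact that $\binom{3}{0}0!=1$.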
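The counting argument of Lemma 4.7 can be checked by brute force on small trees. The following Python sketch is an illustration only and is not part of the original argument: it models each Laplacian $\Delta^\sigma_{q,p}$ as occupying one "slot" (spin and position of the index) at each of its two vertices, and treats a term as vanishing, in the sense of (4.15), as soon as a slot is used twice. The function names and the list of test trees are hypothetical choices made here.

```python
from itertools import product
from math import comb, factorial


def count_surviving_terms(edges):
    """Brute-force count of the terms that are not eliminated by rule (4.15).

    Each line {q, p} picks one of the four operators Delta^sigma_{q,p} or
    Delta^sigma_{p,q}, sigma in {up, down}.  A vertex v then occupies the slot
    (v, sigma, position), where position records whether v is the first or the
    second index of the operator.  A term vanishes if a slot is used twice.
    """
    choices = [(sigma, flip) for sigma in ("up", "down") for flip in (False, True)]
    surviving = 0
    for assignment in product(choices, repeat=len(edges)):
        used = set()
        ok = True
        for (q, p), (sigma, flip) in zip(edges, assignment):
            first, second = (p, q) if flip else (q, p)
            for slot in ((first, sigma, "first"), (second, sigma, "second")):
                if slot in used:
                    ok = False
                used.add(slot)
        surviving += ok
    return surviving


def lemma_4_7(num_vertices, edges):
    """Closed formula of Lemma 4.7, computed from the incidence numbers."""
    degree = [0] * num_vertices
    for q, p in edges:
        degree[q] += 1
        degree[p] += 1
    if max(degree) > 4:
        return 0
    value = 4
    for d in degree:
        value *= comb(3, d - 1) * factorial(d - 1)
    return value


# A few small trees on the vertex set {0, 1, ..., n} (illustrative choices).
TREES = {
    "single line": (2, [(0, 1)]),
    "path 0-1-2": (3, [(0, 1), (1, 2)]),
    "branching": (4, [(0, 1), (0, 2), (1, 3)]),
    "star, 4 leaves": (5, [(0, 1), (0, 2), (0, 3), (0, 4)]),
    "star, 5 leaves": (6, [(0, k) for k in range(1, 6)]),
}

for name, (num_vertices, edges) in TREES.items():
    print(name, count_surviving_terms(edges), lemma_4_7(num_vertices, edges))
```

The enumeration grows like $4^n$, so it is only meant as a sanity check for trees with a handful of lines.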
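Let $D_h$ denote the $L^1$-norm of the covariance matrix $C_h$, i.e.,
$$D_h:=\frac1h\sum_{\mathbf x\in\Gamma}\ \sum_{x\in[-\beta,\beta)_h}|C_h(\mathbf x\uparrow x,\mathbf 0\uparrow0)|.$$
Now we can find an upper bound on $|a_{h,n}|$.

Lemma 4.8. For any $h\in2\mathbb{N}/\beta$ and for all $n\in\mathbb{N}\cup\{0\}$ the following inequality holds.
$$|a_{h,n}|\le\frac{128}{3n+4}\binom{3n+4}{n}(4D_h)^n. \tag{4.17}$$
Proof. By using Lemma 2.5 and (3.2) and (3.3) we can directly evaluate $|a_{h,0}|$ to obtain $|a_{h,0}|\le32$, which is (4.17) for $n=0$. Let us show (4.17) for $n\ge1$. By (4.1) and (4.8), we need to evaluate
$$\frac{1}{\beta n!}\Bigg|\sum_{T\in\mathbb{T}(N_{n+1})}\prod_{\{q,q'\}\in T}(\Delta_{q,q'}+\Delta_{q',q})\int_{[0,1]^n}ds\sum_{\pi\in S_{n+1}(T)}\phi(T,\pi,s)\,e^{\Delta(M(T,\pi,s))}\cdot V_{h,Z_0}(\psi^0_X,\bar\psi^0_X)\prod_{q=1}^n\sum_{\mathbf z_q\in\Gamma}V_{h,Z_q}(\psi^q_X,\bar\psi^q_X)\Big|_{\substack{\psi^q_X=\bar\psi^q_X=0\\ \forall q\in N_{n+1}}}\Bigg|. \tag{4.18}$$
By using the property (4.9), we have
$$(4.18)\le\frac{1}{\beta n!}\sum_{T\in\mathbb{T}(N_{n+1})}\ \sup_{\substack{s\in[0,1]^n\\ \pi\in S_{n+1}(T)}}\Bigg|e^{\Delta(M(T,\pi,s))}\prod_{\{q,q'\}\in T}(\Delta_{q,q'}+\Delta_{q',q})\cdot V_{h,Z_0}(\psi^0_X,\bar\psi^0_X)\prod_{q=1}^n\sum_{\mathbf z_q\in\Gamma}V_{h,Z_q}(\psi^q_X,\bar\psi^q_X)\Big|_{\substack{\psi^q_X=\bar\psi^q_X=0\\ \forall q\in N_{n+1}}}\Bigg|. \tag{4.19}$$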
Take any $T\in\mathbb{T}(N_{n+1})$. If $T$ contains a vertex whose incidence number is larger than 4, by Lemma 4.7 the tree $T$ does not contribute to the sum in (4.19). Thus, we may assume that the incidence numbers $d_0,d_1,\ldots,d_n$ of $T$ are less than or equal to 4. Moreover, as in the proof of Lemma 4.7, without loss of generality we may assume the condition (♣) on $T$. Let $q_1,q_2,\ldots,q_l\in N_{n+1}\setminus\{0\}$ be such that $q_1<q_2<\cdots<q_l$ and $\{q_1,q_2,\ldots,q_l\}=\{q\in N_{n+1}\setminus\{0\}\mid L_q(T)\ne\emptyset\}$. Every term of the expansion of the product $\prod_{\{q,q'\}\in T}(\Delta_{q,q'}+\Delta_{q',q})$ has the form
$$\prod_{\{0,p_0\}\in L_0(T)}\Delta^\sigma_{\{0,p_0\}}\ \prod_{j=1}^l\ \prod_{\{q_j,p_j\}\in L_{q_j}(T)}\Delta^\sigma_{\{q_j,p_j\}}, \tag{4.20}$$
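where by using the notation (4.14), $\Delta^\sigma_{\{q,p\}}\in\{\Delta^\tau_{q,p},\Delta^\tau_{p,q}\mid\tau\in\{\uparrow,\downarrow\}\}$ for all $\{q,p\}\in L_q(T)$ with $q\in N_{n+1}$ satisfying $L_q(T)\ne\emptyset$. For any $\{0,p\}\in L_0(T)$ we set
$$C^\sigma_{\{0,p\}}(x_0,\mathbf z_px_p):=\begin{cases}|C_h(\mathbf z_p\uparrow x_p,\mathbf x_1\uparrow x_0)|&\text{if }\Delta^\sigma_{\{0,p\}}=\Delta^\uparrow_{p,0},\\ |C_h(\mathbf z_p\downarrow x_p,\mathbf x_2\downarrow x_0)|&\text{if }\Delta^\sigma_{\{0,p\}}=\Delta^\downarrow_{p,0},\\ |C_h(\mathbf y_1\uparrow x_0,\mathbf z_p\uparrow x_p)|&\text{if }\Delta^\sigma_{\{0,p\}}=\Delta^\uparrow_{0,p},\\ |C_h(\mathbf y_2\downarrow x_0,\mathbf z_p\downarrow x_p)|&\text{if }\Delta^\sigma_{\{0,p\}}=\Delta^\downarrow_{0,p}\end{cases}$$
for all $x_0,x_p\in[0,\beta)_h$ and $\mathbf z_p\in\Gamma$. For any $j\in\{1,2,\ldots,l\}$ and any $\{q_j,p\}\in L_{q_j}(T)$ we define
$$C^\sigma_{\{q_j,p\}}(\mathbf z_{q_j}x_{q_j},\mathbf z_px_p):=\begin{cases}|C_h(\mathbf z_{q_j}\tau x_{q_j},\mathbf z_p\tau x_p)|&\text{if }\Delta^\sigma_{\{q_j,p\}}=\Delta^\tau_{q_j,p}\text{ for some }\tau\in\{\uparrow,\downarrow\},\\ |C_h(\mathbf z_p\tau x_p,\mathbf z_{q_j}\tau x_{q_j})|&\text{if }\Delta^\sigma_{\{q_j,p\}}=\Delta^\tau_{p,q_j}\text{ for some }\tau\in\{\uparrow,\downarrow\}\end{cases}$$
for all $x_p,x_{q_j}\in[0,\beta)_h$ and $\mathbf z_p,\mathbf z_{q_j}\in\Gamma$. By noting that (4.20) is the product of $n$ Laplacians and using Lemma 4.5 and the condition (♣), we observe that
$$\begin{aligned}
&\Bigg|e^{\Delta(M(T,\pi,s))}\prod_{\{0,p_0\}\in L_0(T)}\Delta^\sigma_{\{0,p_0\}}\prod_{j=1}^l\prod_{\{q_j,p_j\}\in L_{q_j}(T)}\Delta^\sigma_{\{q_j,p_j\}}\cdot V_{h,Z_0}(\psi^0_X,\bar\psi^0_X)\prod_{q=1}^n\sum_{\mathbf z_q\in\Gamma}V_{h,Z_q}(\psi^q_X,\bar\psi^q_X)\Big|_{\substack{\psi^q_X=\bar\psi^q_X=0\\ \forall q\in N_{n+1}}}\Bigg|\\
&\le\frac{4^{n+2}}{h}\sum_{x_0\in[0,\beta)_h}\prod_{q=1}^n\Bigg(\frac1h\sum_{x_q\in[0,\beta)_h}\sum_{\mathbf z_q\in\Gamma}\Bigg)\prod_{\{0,p_0\}\in L_0(T)}C^\sigma_{\{0,p_0\}}(x_0,\mathbf z_{p_0}x_{p_0})\cdot\prod_{j=1}^l\prod_{\{q_j,p_j\}\in L_{q_j}(T)}C^\sigma_{\{q_j,p_j\}}(\mathbf z_{q_j}x_{q_j},\mathbf z_{p_j}x_{p_j})\\
&=\frac{4^{n+2}}{h}\sum_{x_0\in[0,\beta)_h}\prod_{\{0,p_0\}\in L_0(T)}\frac1h\sum_{x_{p_0}\in[0,\beta)_h}\sum_{\mathbf z_{p_0}\in\Gamma}C^\sigma_{\{0,p_0\}}(x_0,\mathbf z_{p_0}x_{p_0})\\
&\quad\cdot\prod_{\{q_1,p_1\}\in L_{q_1}(T)}\frac1h\sum_{x_{p_1}\in[0,\beta)_h}\sum_{\mathbf z_{p_1}\in\Gamma}C^\sigma_{\{q_1,p_1\}}(\mathbf z_{q_1}x_{q_1},\mathbf z_{p_1}x_{p_1})\\
&\quad\cdot\prod_{\{q_2,p_2\}\in L_{q_2}(T)}\frac1h\sum_{x_{p_2}\in[0,\beta)_h}\sum_{\mathbf z_{p_2}\in\Gamma}C^\sigma_{\{q_2,p_2\}}(\mathbf z_{q_2}x_{q_2},\mathbf z_{p_2}x_{p_2})\\
&\quad\cdots\prod_{\{q_l,p_l\}\in L_{q_l}(T)}\frac1h\sum_{x_{p_l}\in[0,\beta)_h}\sum_{\mathbf z_{p_l}\in\Gamma}C^\sigma_{\{q_l,p_l\}}(\mathbf z_{q_l}x_{q_l},\mathbf z_{p_l}x_{p_l})\\
&\le4^{n+2}\beta\prod_{\{0,p_0\}\in L_0(T)}\sup_{x_0\in[0,\beta)_h}\frac1h\sum_{x_{p_0}\in[0,\beta)_h}\sum_{\mathbf z_{p_0}\in\Gamma}C^\sigma_{\{0,p_0\}}(x_0,\mathbf z_{p_0}x_{p_0})\\
&\quad\cdot\prod_{\{q_1,p_1\}\in L_{q_1}(T)}\sup_{x_{q_1}\in[0,\beta)_h,\,\mathbf z_{q_1}\in\Gamma}\frac1h\sum_{x_{p_1}\in[0,\beta)_h}\sum_{\mathbf z_{p_1}\in\Gamma}C^\sigma_{\{q_1,p_1\}}(\mathbf z_{q_1}x_{q_1},\mathbf z_{p_1}x_{p_1})\\
&\quad\cdot\prod_{\{q_2,p_2\}\in L_{q_2}(T)}\sup_{x_{q_2}\in[0,\beta)_h,\,\mathbf z_{q_2}\in\Gamma}\frac1h\sum_{x_{p_2}\in[0,\beta)_h}\sum_{\mathbf z_{p_2}\in\Gamma}C^\sigma_{\{q_2,p_2\}}(\mathbf z_{q_2}x_{q_2},\mathbf z_{p_2}x_{p_2})\\
&\quad\cdots\prod_{\{q_l,p_l\}\in L_{q_l}(T)}\sup_{x_{q_l}\in[0,\beta)_h,\,\mathbf z_{q_l}\in\Gamma}\frac1h\sum_{x_{p_l}\in[0,\beta)_h}\sum_{\mathbf z_{p_l}\in\Gamma}C^\sigma_{\{q_l,p_l\}}(\mathbf z_{q_l}x_{q_l},\mathbf z_{p_l}x_{p_l})\\
&\le4^{n+2}\beta D_h^n.
\end{aligned}\tag{4.21}$$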
To obtain the last inequality in (4.21) we have used the fact that for all $\{0,p_0\}\in L_0(T)$ and all $\{q_j,p_j\}\in L_{q_j}(T)$ ($\forall j\in\{1,2,\ldots,l\}$)
$$\sup_{x_0\in[0,\beta)_h}\frac1h\sum_{x_{p_0}\in[0,\beta)_h}\sum_{\mathbf z_{p_0}\in\Gamma}C^\sigma_{\{0,p_0\}}(x_0,\mathbf z_{p_0}x_{p_0})\le D_h,\qquad\sup_{x_{q_j}\in[0,\beta)_h,\,\mathbf z_{q_j}\in\Gamma}\frac1h\sum_{x_{p_j}\in[0,\beta)_h}\sum_{\mathbf z_{p_j}\in\Gamma}C^\sigma_{\{q_j,p_j\}}(\mathbf z_{q_j}x_{q_j},\mathbf z_{p_j}x_{p_j})\le D_h.$$
By combining (4.21) with (4.19), we have
$$(4.18)\le\frac{16(4D_h)^n}{n!}\sum_{T\in\mathbb{T}(N_{n+1})}N(T), \tag{4.22}$$
where $N(T)$ is defined in Definition 4.6. By repeating the same argument as above we can show the inequality (4.22) for the case that $Z_0=\tilde X_2$. Thus, by recalling (4.1) we arrive at
$$|a_{h,n}|\le\frac{32(4D_h)^n}{n!}\sum_{T\in\mathbb{T}(N_{n+1})}N(T). \tag{4.23}$$
To complete the proof we calculate the sum $\sum_{T\in\mathbb{T}(N_{n+1})}N(T)$ in (4.23). As characterized in Lemma 4.7, the number $N(T)$ only depends on the incidence numbers of $T$. By using Lemma 4.7 and Cayley's theorem on the number of trees with fixed incidence numbers (see, e.g., [24, Corollary 2.2.4]) we have
$$\begin{aligned}
\sum_{T\in\mathbb{T}(N_{n+1})}N(T)&=\sum_{\substack{d_q\in\{1,2,3,4\}\\ \forall q\in N_{n+1}}}1_{\sum_{q\in N_{n+1}}d_q=2n}\cdot\frac{(n-1)!}{\prod_{q\in N_{n+1}}(d_q-1)!}\cdot4\prod_{q\in N_{n+1}}\binom{3}{d_q-1}(d_q-1)!\\
&=4(n-1)!\sum_{\substack{d_q\in\{1,2,3,4\}\\ \forall q\in N_{n+1}}}1_{\sum_{q\in N_{n+1}}d_q=2n}\cdot\prod_{q\in N_{n+1}}\binom{3}{d_q-1}.
\end{aligned}$$
Moreover, by using Cauchy's integral formula and the residue theorem we see that for any $r>0$
$$\begin{aligned}
\sum_{T\in\mathbb{T}(N_{n+1})}N(T)&=\frac{4(n-1)!}{(2n)!}\left(\frac{d}{dz}\right)^{2n}\Bigg(\sum_{d=1}^4\binom{3}{d-1}z^d\Bigg)^{n+1}\Bigg|_{z=0}\\
&=\frac{4(n-1)!}{2\pi i}\oint_{|z|=r}\frac{z^{n+1}(1+z)^{3(n+1)}}{z^{2n+1}}\,dz=4(n-1)!\operatorname{Res}_{z=0}\frac{(1+z)^{3(n+1)}}{z^n}\\
&=4(n-1)!\binom{3n+3}{n-1}=\frac{4\,n!}{3n+4}\binom{3n+4}{n}.
\end{aligned}\tag{4.24}$$
Combining (4.24) with (4.23) yields the result.

The inequality (4.17) motivates us to study the properties of the power series $f(x)$ defined by
$$f(x):=\sum_{n=0}^\infty\frac{4}{3n+4}\binom{3n+4}{n}x^n. \tag{4.25}$$
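The identity (4.24) is easy to confirm for small $n$ by enumerating the incidence numbers directly. The Python sketch below is a numerical cross-check only; the function names are hypothetical, and the enumeration over $\{1,2,3,4\}^{n+1}$ is practical only for small $n$.

```python
from itertools import product
from math import comb, factorial


def sum_over_incidence_numbers(n):
    """4 (n-1)! * sum over (d_0, ..., d_n) in {1,...,4}^(n+1) with
    d_0 + ... + d_n = 2n of the product of C(3, d_q - 1)."""
    total = 0
    for degrees in product(range(1, 5), repeat=n + 1):
        if sum(degrees) == 2 * n:
            weight = 1
            for d in degrees:
                weight *= comb(3, d - 1)
            total += weight
    return 4 * factorial(n - 1) * total


def closed_form(n):
    """4 (n-1)! C(3n+3, n-1), the value obtained from the residue in (4.24)."""
    return 4 * factorial(n - 1) * comb(3 * n + 3, n - 1)


for n in range(1, 7):
    # Second closed form of (4.24); the division is exact.
    alternative = 4 * factorial(n) * comb(3 * n + 4, n) // (3 * n + 4)
    print(n, sum_over_incidence_numbers(n), closed_form(n), alternative)
```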
As the last lemma before our main theorem, the properties of $f(x)$ are summarized.

Lemma 4.9. The radius of convergence of the power series $f(x)$ defined in (4.25) is $4/27$ and $f(4/27)=81/16$. Moreover, for any $x\in(0,4/27]$ the following equality holds.
$$f(x)=\frac{16}{9x^2}\cos^4\left(\frac{\tan^{-1}\Big(\sqrt{\frac{4}{27x}-1}\Big)+\pi}{3}\right), \tag{4.26}$$
where the function $\tan^{-1}(\cdot)$ is defined as a bijective map from $\mathbb{R}$ to $(-\pi/2,\pi/2)$ satisfying $\tan^{-1}(\tan(\theta))=\theta$ for all $\theta\in(-\pi/2,\pi/2)$.

Proof. As a topic in generating functions the power series (4.25) is commonly studied (see, e.g., [10, pp. 200–201]). However, we give a proof of the statements for completeness. Let us analyze the cubic equation
$$X=1+xX^3 \tag{4.27}$$
for $x\in(0,4/27)$. We see that for any $x\in(0,4/27)$ and $z\in\mathbb{C}$ satisfying $|z-1|=1/2$, the inequality $|xz^3|<|z-1|$ holds. Thus, the Lagrange inversion theorem (see, e.g., [14, Theorem 2.3.1]) implies that in the domain $\{z\in\mathbb{C}\mid|z-1|<1/2\}$ there is exactly one root $X=v(x)$ of (4.27) and
$$v(x)^4=1+\sum_{n=1}^\infty\frac{x^n}{n!}\left(\frac{d}{dz}\right)^{n-1}(4z^3z^{3n})\Big|_{z=1}=f(x). \tag{4.28}$$
On the other hand, by algebraically solving (4.27) and specifying a root contained in the domain $\{z\in\mathbb{C}\mid|z-1|<1/2\}$ we can determine the explicit form of $v(x)$ as follows.
$$v(x)=\frac{2}{\sqrt{3x}}\cos\left(\frac{\tan^{-1}\Big(\sqrt{\frac{4}{27x}-1}\Big)+\pi}{3}\right), \tag{4.29}$$
where $\tan^{-1}(\cdot)$ is defined as stated in Lemma 4.9. The equalities (4.28) and (4.29) give (4.26) for $x\in(0,4/27)$. The ratio test shows that the radius of convergence of $f$ is $4/27$. Moreover, by continuity we have $\lim_{x\nearrow4/27}f(x)=v(4/27)^4=81/16$, which completes the proof.

Define a constant $D>0$ by
$$D:=\lim_{h\to+\infty,\,h\in\mathbb{N}/\beta}D_h=\sum_{\mathbf x\in\Gamma}\int_{-\beta}^{\beta}dx\,|C(\mathbf x\uparrow x,\mathbf 0\uparrow0)|. \tag{4.30}$$
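The statements of Lemma 4.9 lend themselves to a quick numerical check. The following Python sketch compares partial sums of (4.25) with the closed form (4.26) and verifies that (4.29) solves the cubic equation (4.27). It is an illustration only: the function names, the number of terms and the sample points are choices made here, and the partial sum converges slowly as $x$ approaches $4/27$.

```python
from math import atan, comb, cos, pi, sqrt


def f_series(x, terms=200):
    """Partial sum of the power series (4.25)."""
    return sum(4 * comb(3 * n + 4, n) / (3 * n + 4) * x**n for n in range(terms))


def v_closed(x):
    """Root (4.29) of X = 1 + x X^3 for 0 < x <= 4/27."""
    # max(...) guards against a tiny negative argument caused by rounding at x = 4/27
    s = sqrt(max(0.0, 4.0 / (27.0 * x) - 1.0))
    return 2.0 / sqrt(3.0 * x) * cos((atan(s) + pi) / 3.0)


def f_closed(x):
    """Closed form (4.26), i.e. v(x)^4."""
    return v_closed(x) ** 4


for x in (0.01, 0.05, 0.1):
    v = v_closed(x)
    print(x, f_series(x), f_closed(x), v - (1 + x * v**3))

print(f_closed(4 / 27), 81 / 16)
```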
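Our main result is stated as follows.

Theorem 4.10. For any $\mathbf x_1,\mathbf x_2,\mathbf y_1,\mathbf y_2\in\Gamma$ and $m\in\mathbb{N}\cup\{0\}$ and any $U\in\mathbb{R}$ with $|U|\le1/(27D)$, the following equality and inequalities hold.
$$\langle\psi^*_{\mathbf x_1\uparrow}\psi^*_{\mathbf x_2\downarrow}\psi_{\mathbf y_2\downarrow}\psi_{\mathbf y_1\uparrow}+\psi^*_{\mathbf y_1\uparrow}\psi^*_{\mathbf y_2\downarrow}\psi_{\mathbf x_2\downarrow}\psi_{\mathbf x_1\uparrow}\rangle=\sum_{n=0}^\infty a_nU^n, \tag{4.31}$$
$$|\langle\psi^*_{\mathbf x_1\uparrow}\psi^*_{\mathbf x_2\downarrow}\psi_{\mathbf y_2\downarrow}\psi_{\mathbf y_1\uparrow}+\psi^*_{\mathbf y_1\uparrow}\psi^*_{\mathbf y_2\downarrow}\psi_{\mathbf x_2\downarrow}\psi_{\mathbf x_1\uparrow}\rangle|\le R(|U|), \tag{4.32}$$
$$\Bigg|\langle\psi^*_{\mathbf x_1\uparrow}\psi^*_{\mathbf x_2\downarrow}\psi_{\mathbf y_2\downarrow}\psi_{\mathbf y_1\uparrow}+\psi^*_{\mathbf y_1\uparrow}\psi^*_{\mathbf y_2\downarrow}\psi_{\mathbf x_2\downarrow}\psi_{\mathbf x_1\uparrow}\rangle-\sum_{n=0}^ma_nU^n\Bigg|\le R(|U|)-\sum_{n=0}^m\frac{128}{3n+4}\binom{3n+4}{n}(4D|U|)^n, \tag{4.33}$$
where $\{a_n\}_{n=0}^\infty$ is given in (2.11) and (2.12) and
$$R(|U|):=\begin{cases}32&\text{if }U=0,\\[4pt] \dfrac{32}{9D^2|U|^2}\cos^4\left(\dfrac{\tan^{-1}\Big(\sqrt{\frac{1}{27D|U|}-1}\Big)+\pi}{3}\right)&\text{if }0<|U|\le\dfrac{1}{27D},\end{cases} \tag{4.34}$$
with the function $\tan^{-1}(\cdot):\mathbb{R}\to(-\pi/2,\pi/2)$ satisfying $\tan^{-1}(\tan\theta)=\theta$ for all $\theta\in(-\pi/2,\pi/2)$.

Proof. Since by Lemmas 3.1(iii) and 4.8,
$$|a_n|=\lim_{h\to+\infty,\,h\in2\mathbb{N}/\beta}|a_{h,n}|\le\frac{128}{3n+4}\binom{3n+4}{n}(4D)^n, \tag{4.35}$$
Lemma 4.9 implies that for all $U\in[-1/(27D),1/(27D)]$
$$\sum_{n=0}^\infty|a_n||U|^n\le R(|U|), \tag{4.36}$$
where $R(|U|)$ is defined in (4.34). The inequalities (4.32) and (4.33) follow from (4.31), (4.35) and (4.36).

We show that the equality (4.31) holds for $U\in\mathbb{R}$ with $|U|\le1/(27D)$. Let us fix any $\varepsilon\in(0,1/(27D))$. Since $P|_{\lambda_X=0,\forall X\in\Gamma^4}=\operatorname{Tr}e^{-\beta H}/\operatorname{Tr}e^{-\beta H_0}>0$ for all $U\in\mathbb{R}$, Proposition 3.2 implies that there exists $N_0\in\mathbb{N}$ such that $|P_h|_{\lambda_X=0,\forall X\in\Gamma^4}|>0$ for all $h\in2\mathbb{N}/\beta$ with $h\ge N_0/\beta$ and all $U\in\mathbb{R}$ with $|U|\le1/(27D)-\varepsilon$. Moreover, since $P_h|_{\lambda_X=0,\forall X\in\Gamma^4}$ is a polynomial in $U$ we can take a simply connected domain $O_h$ $(\subset\mathbb{C})$ containing the interval $[-1/(27D)+\varepsilon,1/(27D)-\varepsilon]$ such that $|P_h|_{\lambda_X=0,\forall X\in\Gamma^4}|>0$ for all $U\in O_h$. Thus, we see that $(\partial P_h/\partial\lambda_{\tilde X_1})/P_h|_{\lambda_X=0,\forall X\in\Gamma^4}$ defines an analytic function of $U$ in the domain $O_h$. By Lemmas 4.8 and 4.9, the series $\sum_{n=0}^\infty a_{h,n}U^n$ converges for all $U\in\mathbb{C}$ with $|U|\le1/(27D_h)$. By choosing $N_0$ sufficiently large we may assume that $1/(27D)-\varepsilon\le1/(27D_h)$ for all $h\in2\mathbb{N}/\beta$ with $h\ge N_0/\beta$. Therefore, Lemma 3.1(ii) and the identity theorem for analytic functions ensure that
$$-\frac1\beta\,\frac{\dfrac{\partial}{\partial\lambda_{\tilde X_1}}P_h(\{U_X\}_{X\in\Gamma^4})}{P_h(\{U_X\}_{X\in\Gamma^4})}\Bigg|_{\substack{\lambda_X=0\\ \forall X\in\Gamma^4}}=\sum_{n=0}^\infty a_{h,n}U^n \tag{4.37}$$
for all $U\in[-1/(27D)+\varepsilon,1/(27D)-\varepsilon]$. Note that Lemma 4.8 implies
$$|a_{h,n}U^n|\le\frac{128}{3n+4}\binom{3n+4}{n}\left(\frac{4}{27}\right)^n \tag{4.38}$$
for all $U\in[-1/(27D)+\varepsilon,1/(27D)-\varepsilon]$ and the right-hand side of (4.38) is summable over $\mathbb{N}\cup\{0\}$. Thus, by Lemma 3.1(iii), Corollary 3.4 and Lebesgue's dominated convergence theorem for $l^1(\mathbb{N}\cup\{0\})$ we can pass to the limit $h\to+\infty$ in (4.37) to deduce the equality (4.31) for all $U\in[-1/(27D)+\varepsilon,1/(27D)-\varepsilon]$. Then, by sending $\varepsilon\searrow0$ and using continuity we obtain (4.31) for all $U\in[-1/(27D),1/(27D)]$.

Remark 4.11. In Proposition 5.1, we give a volume-independent upper bound on the decay constant $D$ in the 2-dimensional case. One can straightforwardly extend the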
calculation of Proposition 5.1 to derive volume-independent upper bounds on $D$ in any dimension. By replacing $D$ by these upper bounds, Theorem 4.10 provides volume-independent upper bounds on the perturbation series $\sum_{n=0}^\infty a_nU^n$.
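The combination described in Remark 4.11 can be sketched numerically. The Python code below evaluates the right-hand side of the bound (5.1) of Proposition 5.1, the resulting lower estimate $1/(27D)$ of the radius of convergence, and the right-hand side of (4.33). It is a minimal sketch rather than the implementation used for the reported computations: the function names are hypothetical, and the parameter values $t=t'=\mu=0.01$, $\beta=1$ and $D=92.04$ are the ones quoted in Sec. 5.3.

```python
from math import atan, comb, cos, exp, pi, sqrt


def decay_bound(t, t_prime, mu, beta):
    """Right-hand side of (5.1) for d = 2."""
    xi = 4 * abs(t) + 4 * abs(t_prime) + abs(mu)
    a = (2 * abs(t) + 4 * abs(t_prime)) * beta * exp(beta * xi) / (1 + exp(beta * xi))
    prefactor = 16 / beta**2 + 32 * pi**2 / (3 * sqrt(3) * beta) + 16 * pi**3 / (3 * sqrt(3))
    return prefactor * (beta**3 / 2 + a + 3 * a**2 + a**3)


def majorant(u, D):
    """R(|U|) of (4.34); meaningful for |U| <= 1/(27 D)."""
    u = abs(u)
    if u == 0:
        return 32.0
    s = sqrt(max(0.0, 1 / (27 * D * u) - 1))  # guard against rounding at the endpoint
    return 32 / (9 * D**2 * u**2) * cos((atan(s) + pi) / 3) ** 4


def error_bound(u, D, m):
    """Right-hand side of (4.33) for the truncation order m."""
    partial = sum(
        128 / (3 * n + 4) * comb(3 * n + 4, n) * (4 * D * abs(u)) ** n
        for n in range(m + 1)
    )
    return majorant(u, D) - partial


D = decay_bound(0.01, 0.01, 0.01, 1.0)
print("bound on D:", D)
print("1/(27 D):", 1 / (27 * D))
for u in (1.0e-6, 5.0e-6, 1.0e-5, 5.0e-5):
    print(u, error_bound(u, 92.04, 2))
```

Close to $|U|=1/(27D)$ the majorant has a square-root singularity, so the computed error bound becomes sensitive to the number of digits kept in $D$.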
5. Numerical Results in 2D

In this section, we compute the perturbation series of the correlation function $\langle\psi^*_{\mathbf x_1\uparrow}\psi^*_{\mathbf x_2\downarrow}\psi_{\mathbf y_2\downarrow}\psi_{\mathbf y_1\uparrow}+\psi^*_{\mathbf y_1\uparrow}\psi^*_{\mathbf y_2\downarrow}\psi_{\mathbf x_2\downarrow}\psi_{\mathbf x_1\uparrow}\rangle$ up to the second-order term in the 2-dimensional case. We also implement the upper bound obtained in Theorem 4.10 and report the error between the correlation function and the second-order perturbation. Throughout this section, it is assumed that $d=2$.

5.1. The decay constant for d = 2

In order to estimate the radius of convergence of the perturbation series $\sum_{n=0}^\infty a_nU^n$ and compute the upper bound on the sum of the higher order terms numerically, first we need to evaluate the decay constant $D$ defined in (4.30). The result is presented in the following proposition.

Proposition 5.1. The following inequality holds.
$$D\le\left(\frac{16}{\beta^2}+\frac{32\pi^2}{3\sqrt3\,\beta}+\frac{16\pi^3}{3\sqrt3}\right)\left(\frac{\beta^3}{2}+\frac{(2|t|+4|t'|)\beta e^{\beta\xi}}{1+e^{\beta\xi}}+3\left(\frac{(2|t|+4|t'|)\beta e^{\beta\xi}}{1+e^{\beta\xi}}\right)^2+\left(\frac{(2|t|+4|t'|)\beta e^{\beta\xi}}{1+e^{\beta\xi}}\right)^3\right), \tag{5.1}$$
where $\xi:=4|t|+4|t'|+|\mu|$.

The derivation of the inequality (5.1) needs the following estimate.

Lemma 5.2. The following inequality holds.
$$\sum_{\mathbf x\in\Gamma}\frac{1}{1+\sum_{l=1}^2|e^{i2\pi x_l/L}-1|^3L^3/(8\pi^3\beta^3)}\le4+\frac{8\pi^2\beta}{3\sqrt3}+\frac{4\pi^3\beta^2}{3\sqrt3},$$
where $\mathbf x=(x_1,x_2)\in\Gamma$.

Proof. For any $y\in\mathbb{R}$, let $\lfloor y\rfloor$ denote the largest integer which does not exceed $y$. By using the inequality that $|e^{i\theta}-1|\ge2|\theta|/\pi$ for any $\theta\in[-\pi,\pi]$, we
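see that
$$\begin{aligned}
\sum_{\mathbf x\in\Gamma}\frac{1}{1+\sum_{l=1}^2|e^{i2\pi x_l/L}-1|^3L^3/(8\pi^3\beta^3)}
&\le4\sum_{x_1=0}^{\lfloor L/2\rfloor}\sum_{x_2=0}^{\lfloor L/2\rfloor}\frac{1}{1+\sum_{l=1}^2|e^{i2\pi x_l/L}-1|^3L^3/(8\pi^3\beta^3)}\\
&\le4\sum_{x_1=0}^\infty\sum_{x_2=0}^\infty\frac{1}{1+8x_1^3/(\pi^3\beta^3)+8x_2^3/(\pi^3\beta^3)}\\
&=4+8\sum_{x_1=1}^\infty\frac{1}{1+8x_1^3/(\pi^3\beta^3)}+4\sum_{x_1=1}^\infty\sum_{x_2=1}^\infty\frac{1}{1+8x_1^3/(\pi^3\beta^3)+8x_2^3/(\pi^3\beta^3)}\\
&\le4+8\int_0^\infty dx_1\frac{1}{1+8x_1^3/(\pi^3\beta^3)}+4\int_0^\infty dx_1\int_0^\infty dx_2\frac{1}{1+8x_1^3/(\pi^3\beta^3)+8x_2^3/(\pi^3\beta^3)}\\
&=4+4\pi\beta\int_0^\infty dx\frac{1}{1+x^3}+\pi^2\beta^2\int_0^\infty dx\frac{1}{1+x^3}\int_0^\infty dx\frac{1}{(1+x^3)^{2/3}}\\
&\le4+(4\pi\beta+2\pi^2\beta^2)\int_0^\infty dx\frac{1}{1+x^3},
\end{aligned}$$
where we used the inequality
$$\int_0^\infty dx\frac{1}{(1+x^3)^{2/3}}\le1+\int_1^\infty dx\frac{1}{x^2}=2.$$
Then, by using the equality
$$\int_0^\infty dx\frac{1}{1+x^3}=\frac{2\pi}{3\sqrt3}$$
(see [10]), we obtain the desired inequality.

Proof of Proposition 5.1. Let us define a linear operator $d_{l,L}:C^\infty(\mathbb{R}^2)\to C^\infty(\mathbb{R}^2)$ ($l=1,2$) by
$$(d_{l,L}f)(\mathbf k):=\frac{f(\mathbf k+2\pi\mathbf e_l/L)-f(\mathbf k)}{2\pi/L}$$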
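for any $f\in C^\infty(\mathbb{R}^2)$. Then, the mean value theorem shows that for all $\mathbf k=(k_1,k_2)\in\mathbb{R}^2$
$$|(d_{1,L})^3f(k_1,k_2)|\le\sup_{\hat k_1\in[k_1,k_1+6\pi/L]}\left|\frac{\partial^3}{\partial k_1^3}f(\hat k_1,k_2)\right|,\qquad|(d_{2,L})^3f(k_1,k_2)|\le\sup_{\hat k_2\in[k_2,k_2+6\pi/L]}\left|\frac{\partial^3}{\partial k_2^3}f(k_1,\hat k_2)\right|. \tag{5.2}$$
Define a function $F(\mathbf k,x):\mathbb{R}^2\times\mathbb{R}\to\mathbb{R}$ by $F(\mathbf k,x):=e^{xE_{\mathbf k}}/(1+e^{\beta E_{\mathbf k}})$. We see that for any $\mathbf x=(x_1,x_2)\in\Gamma$, $x\in[-\beta,\beta]$ and $l\in\{1,2\}$
$$\left(\frac{e^{i2\pi x_l/L}-1}{2\pi/L}\right)^3C(\mathbf x\uparrow x,\mathbf 0\uparrow0)=\frac{1}{L^d}\sum_{\mathbf k\in\Gamma^*}e^{-i\langle\mathbf k,\mathbf x\rangle}\big((d_{l,L})^3F(\mathbf k,x)1_{x\ge0}-(d_{l,L})^3F(\mathbf k,x+\beta)1_{x<0}\big). \tag{5.3}$$
By (5.2) and (5.3), we have that for any $\mathbf x=(x_1,x_2)\in\Gamma$, $x\in[-\beta,\beta]$ and $l\in\{1,2\}$
$$\left|\left(\frac{e^{i2\pi x_l/L}-1}{2\pi/L}\right)^3C(\mathbf x\uparrow x,\mathbf 0\uparrow0)\right|\le\sup_{x\in[0,\beta]}\ \sup_{\mathbf k\in[0,2\pi+6\pi/L]\times[0,2\pi+6\pi/L]}\left|\frac{\partial^3}{\partial k_l^3}F(\mathbf k,x)\right|. \tag{5.4}$$
Note that for $l=1,2$
$$\frac{\partial^3}{\partial k_l^3}F(\mathbf k,x)=\frac{\partial^3E_{\mathbf k}}{\partial k_l^3}\frac{\partial}{\partial E_{\mathbf k}}\left(\frac{e^{xE_{\mathbf k}}}{1+e^{\beta E_{\mathbf k}}}\right)+3\frac{\partial E_{\mathbf k}}{\partial k_l}\frac{\partial^2E_{\mathbf k}}{\partial k_l^2}\frac{\partial^2}{\partial E_{\mathbf k}^2}\left(\frac{e^{xE_{\mathbf k}}}{1+e^{\beta E_{\mathbf k}}}\right)+\left(\frac{\partial E_{\mathbf k}}{\partial k_l}\right)^3\frac{\partial^3}{\partial E_{\mathbf k}^3}\left(\frac{e^{xE_{\mathbf k}}}{1+e^{\beta E_{\mathbf k}}}\right) \tag{5.5}$$
and
$$|E_{\mathbf k}|\le4|t|+4|t'|+|\mu|,\qquad\left|\frac{\partial E_{\mathbf k}}{\partial k_l}\right|,\ \left|\frac{\partial^2E_{\mathbf k}}{\partial k_l^2}\right|,\ \left|\frac{\partial^3E_{\mathbf k}}{\partial k_l^3}\right|\le2|t|+4|t'|. \tag{5.6}$$
Moreover, we can prove that for $m\in\{1,2,3\}$ and any $E\in\mathbb{R}$
$$\sup_{x\in[0,\beta]}\left|\left(\frac{d}{dE}\right)^m\frac{e^{xE}}{1+e^{\beta E}}\right|\le\left(\frac{\beta e^{\beta|E|}}{1+e^{\beta|E|}}\right)^m. \tag{5.7}$$
By the equalities $e^{xE}/(1+e^{\beta E})=e^{(\beta-x)(-E)}/(1+e^{\beta(-E)})$ and $d/dE=-d/d(-E)$ we may assume $E\ge0$ without loss of generality in the following argument. First let us study the case $m=1$. For $x\in[0,\beta]$,
$$\left|\frac{d}{dE}\frac{e^{xE}}{1+e^{\beta E}}\right|=\frac{e^{xE}}{1+e^{\beta E}}\left|x-\frac{\beta e^{\beta E}}{1+e^{\beta E}}\right|\le\frac{\beta e^{\beta E}}{1+e^{\beta E}},$$
which is (5.7) for $m=1$.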
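Next, consider the case that $m=2$. We see that
$$\left(\frac{d}{dE}\right)^2\frac{e^{xE}}{1+e^{\beta E}}=\frac{e^{xE}}{1+e^{\beta E}}g_1(x),$$
with
$$g_1(x):=\left(x-\frac{\beta e^{\beta E}}{1+e^{\beta E}}\right)^2-\frac{\beta^2e^{\beta E}}{(1+e^{\beta E})^2}.$$
Note that $|g_1(0)|,|g_1(\beta e^{\beta E}/(1+e^{\beta E}))|,|g_1(\beta)|\le\beta^2e^{\beta E}/(1+e^{\beta E})$. Thus, for any $x_0\in[0,\beta]$ we have that
$$\left|\left(\frac{d}{dE}\right)^2\frac{e^{x_0E}}{1+e^{\beta E}}\right|\le\frac{e^{\beta E}}{1+e^{\beta E}}\max\left\{|g_1(0)|,\left|g_1\left(\frac{\beta e^{\beta E}}{1+e^{\beta E}}\right)\right|,|g_1(\beta)|\right\}\le\left(\frac{\beta e^{\beta E}}{1+e^{\beta E}}\right)^2.$$
Finally, analyze the case $m=3$. A calculation shows that
$$\left(\frac{d}{dE}\right)^3\frac{e^{xE}}{1+e^{\beta E}}=\frac{e^{xE}}{1+e^{\beta E}}g_2(x), \tag{5.8}$$
where
$$g_2(x):=x^3-\frac{3\beta e^{\beta E}}{1+e^{\beta E}}x^2+\frac{3\beta^2(e^{\beta E}-1)e^{\beta E}}{(1+e^{\beta E})^2}x-\frac{\beta^3e^{\beta E}}{(1+e^{\beta E})^3}(e^{2\beta E}-4e^{\beta E}+1).$$
The roots $x_1,x_2$ of the equation $g_2'(x)=0$ are given by
$$x_1=\frac{\beta e^{\beta E/2}}{1+e^{\beta E}}(e^{\beta E/2}-1),\qquad x_2=\frac{\beta e^{\beta E/2}}{1+e^{\beta E}}(e^{\beta E/2}+1).$$
Since $0\le x_1<\beta\le x_2$, $\max_{x\in[0,\beta]}|g_2(x)|=\max\{|g_2(0)|,|g_2(x_1)|,|g_2(\beta)|\}$. Moreover, we see that
$$\begin{aligned}
|g_2(0)|&=\frac{\beta^3e^{\beta E}}{(1+e^{\beta E})^3}|e^{2\beta E}-4e^{\beta E}+1|\le\frac{\beta^3e^{\beta E}}{(1+e^{\beta E})^3}(e^{2\beta E}+e^{\beta E})=\frac{\beta^3e^{2\beta E}}{(1+e^{\beta E})^2},\\
|g_2(\beta)|&=\frac{\beta^3}{(1+e^{\beta E})^3}|e^{2\beta E}-4e^{\beta E}+1|\le|g_2(0)|,\\
|g_2(x_1)|&=\frac{\beta^3e^{\beta E}}{(1+e^{\beta E})^3}|e^{\beta E}+2e^{\beta E/2}-1|\le\frac{\beta^3e^{\beta E}}{(1+e^{\beta E})^3}(e^{2\beta E}+e^{\beta E})=\frac{\beta^3e^{2\beta E}}{(1+e^{\beta E})^2},
\end{aligned}$$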
which imply that
$$\max_{x_0\in[0,\beta]}|g_2(x_0)|\le\frac{\beta^3e^{2\beta E}}{(1+e^{\beta E})^2}. \tag{5.9}$$
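The estimate (5.7) can be probed numerically on a grid. The sketch below evaluates the derivatives through the closed forms with $x-\beta e^{\beta E}/(1+e^{\beta E})$, $g_1$ and $g_2$ given above and compares them with the right-hand side of (5.7). It is an illustrative check under chosen sample values of $\beta$, $E$ and $x$, not a proof, and the function names are hypothetical.

```python
from math import exp


def derivative(m, x, E, beta):
    """(d/dE)^m of e^{xE} / (1 + e^{beta E}) for m = 1, 2, 3, from the closed forms."""
    e = exp(beta * E)
    u = beta * e / (1 + e)
    base = exp(x * E) / (1 + e)
    if m == 1:
        g = x - u
    elif m == 2:
        g = (x - u) ** 2 - beta**2 * e / (1 + e) ** 2  # g_1(x)
    elif m == 3:
        g = (
            x**3
            - 3 * u * x**2
            + 3 * beta**2 * (e - 1) * e / (1 + e) ** 2 * x
            - beta**3 * e * (e * e - 4 * e + 1) / (1 + e) ** 3
        )  # g_2(x)
    else:
        raise ValueError("only m = 1, 2, 3 are covered")
    return base * g


def bound(m, E, beta):
    """Right-hand side of (5.7)."""
    e = exp(beta * abs(E))
    return (beta * e / (1 + e)) ** m


worst_ratio = 0.0
for beta in (0.5, 1.0, 2.0):
    for i in range(-20, 21):
        E = i / 4
        for k in range(11):
            x = beta * k / 10
            for m in (1, 2, 3):
                ratio = abs(derivative(m, x, E, beta)) / bound(m, E, beta)
                worst_ratio = max(worst_ratio, ratio)
print("largest ratio |derivative| / bound on the grid:", worst_ratio)
```

On this grid the ratio reaches 1 only at $E=0$ with $m=3$ and $x\in\{0,\beta\}$, where (5.7) holds with equality.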
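Combining (5.8) with (5.9) yields (5.7) for $m=3$.

Then by (5.4)–(5.7) we have for $l=1,2$
$$\left|\left(\frac{e^{i2\pi x_l/L}-1}{2\pi/L}\right)^3C(\mathbf x\uparrow x,\mathbf 0\uparrow0)\right|\le\frac{(2|t|+4|t'|)\beta e^{\beta\xi}}{1+e^{\beta\xi}}+3\left(\frac{(2|t|+4|t'|)\beta e^{\beta\xi}}{1+e^{\beta\xi}}\right)^2+\left(\frac{(2|t|+4|t'|)\beta e^{\beta\xi}}{1+e^{\beta\xi}}\right)^3, \tag{5.10}$$
with $\xi:=4|t|+4|t'|+|\mu|$. By using the inequality $|C(\mathbf x\uparrow x,\mathbf 0\uparrow0)|\le1$, (5.10) and Lemma 5.2 we can derive the inequality (5.1).

5.2. The second-order perturbation

As a preparation for the numerical implementation of the second-order perturbation $a_0+a_1U+a_2U^2$, we rewrite the formula (2.11) for $a_n$ in a form more suitable for practical computation. By decomposing the determinant of the covariance matrix into sums over permutations, $G_n$ defined in (2.12) can be written as follows.
$$G_n=\frac{1}{n!}\sum_{\pi,\tau\in S_n}\operatorname{sgn}(\pi)\operatorname{sgn}(\tau)\,g_n(\pi,\tau), \tag{5.11}$$
with
$$g_n(\pi,\tau):=\prod_{j=1}^n\Bigg(-\sum_{\mathbf x^j_1,\mathbf x^j_2,\mathbf y^j_1,\mathbf y^j_2\in\Gamma}\int_0^\beta dx_j\,\big(\delta_{\mathbf x^j_1,\mathbf x^j_2}\delta_{\mathbf y^j_1,\mathbf y^j_2}\delta_{\mathbf x^j_1,\mathbf y^j_1}+\lambda_{\mathbf x^j_1,\mathbf x^j_2,\mathbf y^j_1,\mathbf y^j_2}+\lambda_{\mathbf y^j_1,\mathbf y^j_2,\mathbf x^j_1,\mathbf x^j_2}\big)\cdot C(\mathbf x^j_1\uparrow x_j,\mathbf y^{\pi(j)}_1\uparrow x_{\pi(j)})\,C(\mathbf x^j_2\downarrow x_j,\mathbf y^{\tau(j)}_2\downarrow x_{\tau(j)})\Bigg).$$
The formula (2.11) for $a_0,a_1,a_2$ is
$$\begin{aligned}
a_0&=-\frac1\beta\frac{\partial}{\partial\lambda_{\tilde X_1}}G_1\Big|_{\substack{\lambda_X=0\\ \forall X\in\Gamma^4}},\\
a_1&=-\frac1\beta\frac{\partial}{\partial\lambda_{\tilde X_1}}\Big(G_2-\frac12G_1^2\Big)\Big|_{\substack{\lambda_X=0\\ \forall X\in\Gamma^4}},\\
a_2&=-\frac1\beta\frac{\partial}{\partial\lambda_{\tilde X_1}}\Big(G_3-G_1G_2+\frac13G_1^3\Big)\Big|_{\substack{\lambda_X=0\\ \forall X\in\Gamma^4}}.
\end{aligned}\tag{5.12}$$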
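By substituting (5.11) into (5.12) and canceling, we can simplify the expressions for $a_1,a_2$ as follows.
$$a_1=-\frac1\beta\frac{\partial}{\partial\lambda_{\tilde X_1}}\frac12\Bigg(g_2\left(\begin{pmatrix}1&2\\2&1\end{pmatrix},\begin{pmatrix}1&2\\2&1\end{pmatrix}\right)-g_2\left(\mathrm{Id}_2,\begin{pmatrix}1&2\\2&1\end{pmatrix}\right)-g_2\left(\begin{pmatrix}1&2\\2&1\end{pmatrix},\mathrm{Id}_2\right)\Bigg)\Bigg|_{\substack{\lambda_X=0\\ \forall X\in\Gamma^4}},$$
$$\begin{aligned}
a_2=-\frac1\beta\frac{\partial}{\partial\lambda_{\tilde X_1}}\frac13\Bigg(&g_3\left(\begin{pmatrix}1&2&3\\2&3&1\end{pmatrix},\begin{pmatrix}1&2&3\\2&3&1\end{pmatrix}\right)+g_3\left(\begin{pmatrix}1&2&3\\2&3&1\end{pmatrix},\begin{pmatrix}1&2&3\\3&1&2\end{pmatrix}\right)\\
&+g_3\left(\mathrm{Id}_3,\begin{pmatrix}1&2&3\\2&3&1\end{pmatrix}\right)+g_3\left(\begin{pmatrix}1&2&3\\2&3&1\end{pmatrix},\mathrm{Id}_3\right)\\
&-3g_3\left(\begin{pmatrix}1&2&3\\2&3&1\end{pmatrix},\begin{pmatrix}1&2&3\\2&1&3\end{pmatrix}\right)-3g_3\left(\begin{pmatrix}1&2&3\\2&1&3\end{pmatrix},\begin{pmatrix}1&2&3\\2&3&1\end{pmatrix}\right)\\
&+3g_3\left(\begin{pmatrix}1&2&3\\1&3&2\end{pmatrix},\begin{pmatrix}1&2&3\\2&1&3\end{pmatrix}\right)\Bigg)\Bigg|_{\substack{\lambda_X=0\\ \forall X\in\Gamma^4}},
\end{aligned}$$
where
$$\mathrm{Id}_2=\begin{pmatrix}1&2\\1&2\end{pmatrix},\ \begin{pmatrix}1&2\\2&1\end{pmatrix}\in S_2,\qquad\mathrm{Id}_3=\begin{pmatrix}1&2&3\\1&2&3\end{pmatrix},\ \begin{pmatrix}1&2&3\\2&3&1\end{pmatrix},\ \begin{pmatrix}1&2&3\\3&1&2\end{pmatrix},\ \begin{pmatrix}1&2&3\\2&1&3\end{pmatrix},\ \begin{pmatrix}1&2&3\\1&3&2\end{pmatrix}\in S_3.$$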
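The cancellation leading to these expressions can be reproduced by a short enumeration. The sketch below rests on two assumptions that are not spelled out in the text above and are stated here only for illustration: that $g_n(\pi,\tau)$ factorizes over the orbits of the group generated by $\pi$ and $\tau$, and that it is invariant under a simultaneous relabeling of the indices $j$. Under these assumptions only the pairs $(\pi,\tau)$ acting transitively on $\{1,\ldots,n\}$ survive in $G_2-\frac12G_1^2$ and $G_3-G_1G_2+\frac13G_1^3$, and the code collects their signed weights by relabeling class. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def sign(p):
    """Sign of a permutation given as a tuple of images of 0, ..., n-1."""
    inversions = sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))
    return -1 if inversions % 2 else 1


def is_connected(pi, tau):
    """True if the group generated by pi and tau acts transitively on {0, ..., n-1}."""
    n = len(pi)
    neighbours = {j: set() for j in range(n)}
    for j in range(n):
        for k in (pi[j], tau[j]):
            neighbours[j].add(k)
            neighbours[k].add(j)
    seen, stack = {0}, [0]
    while stack:
        j = stack.pop()
        for k in neighbours[j]:
            if k not in seen:
                seen.add(k)
                stack.append(k)
    return len(seen) == n


def conjugate(eta, p):
    """Relabel j -> eta[j]; returns eta o p o eta^{-1}."""
    q = [0] * len(p)
    for j in range(len(p)):
        q[eta[j]] = eta[p[j]]
    return tuple(q)


def connected_coefficients(n):
    """Map each relabeling class of connected pairs (pi, tau) to the sum of
    sgn(pi) sgn(tau) / n! over the class."""
    perms = list(permutations(range(n)))
    coeffs = {}
    for pi in perms:
        for tau in perms:
            if not is_connected(pi, tau):
                continue
            key = min((conjugate(eta, pi), conjugate(eta, tau)) for eta in perms)
            coeffs[key] = coeffs.get(key, 0) + Fraction(sign(pi) * sign(tau), factorial(n))
    return coeffs


for n in (2, 3):
    for (pi, tau), c in sorted(connected_coefficients(n).items()):
        # print with indices 1, ..., n as in the text
        print(n, tuple(j + 1 for j in pi), tuple(j + 1 for j in tau), c)
```

For $n=2$ this yields three classes with weights $\frac12,-\frac12,-\frac12$, and for $n=3$ seven classes with weights $\frac13$ (four times), $-1$ (twice) and $1$, in agreement with the prefactors $\frac12$ and $\frac13$ in the expressions for $a_1$ and $a_2$.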
In practice, we need to implement the permutation-dependent terms $\partial/\partial\lambda_{\tilde X_1}g_n(\pi,\tau)|_{\lambda_X=0,\forall X\in\Gamma^4}$. Let us describe the implementation below.

Note that the multiple integral $\prod_{j=1}^n\int_0^\beta dx_j\,f(x_1,\ldots,x_n)$ of any integrable function $f$ can be decomposed into a sum of $n!$ integrals as follows.
$$\begin{aligned}
\prod_{j=1}^n\int_0^\beta dx_j\,f(x_1,\ldots,x_n)&=\prod_{j=1}^n\int_0^\beta dx_j\sum_{\eta\in S_n}1_{x_{\eta(1)}>x_{\eta(2)}>\cdots>x_{\eta(n)}}f(x_1,\ldots,x_n)\\
&=\sum_{\eta\in S_n}\prod_{j=1}^n\int_0^{x_{\eta(j-1)}}dx_{\eta(j)}\,f(x_1,\ldots,x_n),
\end{aligned}\tag{5.13}$$
where $x_{\eta(0)}:=\beta$. By substituting the expression (2.7) of the covariance matrix and decomposing the integral $\prod_{j=1}^n\int_0^\beta dx_j$ as in (5.13), we can expand $\partial/\partial\lambda_{\tilde X_1}g_n(\pi,\tau)|_{\lambda_X=0,\forall X\in\Gamma^4}$
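as sums over the momentum space $\Gamma^*$ as follows. For $n\in\mathbb{N}$,
$$\begin{aligned}
\frac{\partial}{\partial\lambda_{\tilde X_1}}g_n(\pi,\tau)\Big|_{\substack{\lambda_X=0\\ \forall X\in\Gamma^4}}
=\ &\frac{(-1)^n}{L^{(n+1)d}}\sum_{\mathbf k_1,\ldots,\mathbf k_n,\mathbf p_1,\ldots,\mathbf p_n\in\Gamma^*}\ \sum_{j=1}^n\Big(\cos\big(\langle\mathbf k_j,\mathbf x_1\rangle+\langle\mathbf p_j,\mathbf x_2\rangle-\langle\mathbf k_{\pi^{-1}(j)},\mathbf y_1\rangle-\langle\mathbf p_{\tau^{-1}(j)},\mathbf y_2\rangle\big)\\
&\qquad+\cos\big(\langle\mathbf k_j,\mathbf y_1\rangle+\langle\mathbf p_j,\mathbf y_2\rangle-\langle\mathbf k_{\pi^{-1}(j)},\mathbf x_1\rangle-\langle\mathbf p_{\tau^{-1}(j)},\mathbf x_2\rangle\big)\Big)\cdot\prod_{\substack{l=1\\ l\ne j}}^n\delta_{\mathbf k_l+\mathbf p_l,\,\mathbf k_{\pi^{-1}(l)}+\mathbf p_{\tau^{-1}(l)}}\\
&\cdot\sum_{\eta\in S_n}\prod_{j=1}^n\int_0^{x_{\eta(j-1)}}dx_{\eta(j)}\,e^{x_{\eta(j)}\big(E_{\mathbf k_{\eta(j)}}+E_{\mathbf p_{\eta(j)}}-E_{\mathbf k_{\pi^{-1}(\eta(j))}}-E_{\mathbf p_{\tau^{-1}(\eta(j))}}\big)}\\
&\cdot\prod_{j=1}^n\left(\frac{1_{x_{\pi(j)}-x_j\le0}}{1+e^{\beta E_{\mathbf k_j}}}-\frac{1_{x_{\pi(j)}-x_j>0}}{1+e^{-\beta E_{\mathbf k_j}}}\right)\left(\frac{1_{x_{\tau(j)}-x_j\le0}}{1+e^{\beta E_{\mathbf p_j}}}-\frac{1_{x_{\tau(j)}-x_j>0}}{1+e^{-\beta E_{\mathbf p_j}}}\right),
\end{aligned}\tag{5.14}$$
where $x_{\eta(0)}=\beta$.

Let us sketch how to implement (5.14). We prepare beforehand a function of real variables $\tilde E_1,\ldots,\tilde E_n$ returning the exact value of $\prod_{j=1}^n\int_0^{x_{j-1}}dx_j\,e^{x_j\tilde E_j}$ with $x_0=\beta$. Then we loop over the variables $\mathbf k_1,\ldots,\mathbf k_n,\mathbf p_1,\ldots,\mathbf p_n\in\Gamma^*$. For fixed $\mathbf k_1,\ldots,\mathbf k_n,\mathbf p_1,\ldots,\mathbf p_n\in\Gamma^*$ we loop over the permutations $\eta\in S_n$. For each $\eta\in S_n$ we substitute the variables $\tilde E_j=E_{\mathbf k_{\eta(j)}}+E_{\mathbf p_{\eta(j)}}-E_{\mathbf k_{\pi^{-1}(\eta(j))}}-E_{\mathbf p_{\tau^{-1}(\eta(j))}}$ ($j=1,\ldots,n$) into the function $\prod_{j=1}^n\int_0^{x_{j-1}}dx_j\,e^{x_j\tilde E_j}$ and its return value is then multiplied by the constant
$$1_{x_{\eta(1)}>\cdots>x_{\eta(n)}}\prod_{j=1}^n\left(\frac{1_{x_{\pi(j)}-x_j\le0}}{1+e^{\beta E_{\mathbf k_j}}}-\frac{1_{x_{\pi(j)}-x_j>0}}{1+e^{-\beta E_{\mathbf k_j}}}\right)\left(\frac{1_{x_{\tau(j)}-x_j\le0}}{1+e^{\beta E_{\mathbf p_j}}}-\frac{1_{x_{\tau(j)}-x_j>0}}{1+e^{-\beta E_{\mathbf p_j}}}\right).$$

5.3. Numerical values

Here we display our numerical results. In our computation, we fix the physical parameters $t,t',\mu,\beta$ to satisfy $t=t'=\mu=0.01$, $\beta=1$. In this configuration, the upper bound on $D$ obtained in Proposition 5.1 is 92.04. Thus, by Theorem 4.10 the radius of convergence of our perturbation series $\sum_{n=0}^\infty a_nU^n$ is estimated to be at least $1/(27\times92.04)\approx4.024\times10^{-4}$. The errors between the correlation function and the 2nd order perturbation for various $|U|$ less than $4.024\times10^{-4}$ are exhibited in Table 1, where Error is defined by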
the right-hand side of the inequality (4.33) for $m=2$ and $D=92.04$ and satisfies
$$|\langle\psi^*_{\mathbf x_1\uparrow}\psi^*_{\mathbf x_2\downarrow}\psi_{\mathbf y_2\downarrow}\psi_{\mathbf y_1\uparrow}+\psi^*_{\mathbf y_1\uparrow}\psi^*_{\mathbf y_2\downarrow}\psi_{\mathbf x_2\downarrow}\psi_{\mathbf x_1\uparrow}\rangle-a_0-a_1U-a_2U^2|\le\text{Error}.$$

Table 1. Errors between the correlation function and the second-order perturbation.

    |U|             Error
    1.0 × 10^-6     1.408 × 10^-7
    5.0 × 10^-6     1.773 × 10^-5
    1.0 × 10^-5     1.433 × 10^-4
    5.0 × 10^-5     1.942 × 10^-2
    1.0 × 10^-4     1.739 × 10^-1
    2.0 × 10^-4     1.842
    3.0 × 10^-4     9.454
    4.0 × 10^-4     7.307 × 10

Let us fix $U=1.0\times10^{-5}$. According to Table 1, the error between the correlation function and the second-order perturbation is estimated as $1.433\times10^{-4}$. Table 2 shows the values of $a_0,a_1,a_2$ and $a_0+a_1U+a_2U^2$ in the case that $\mathbf x_1=\mathbf x_2=\mathbf y_1=\mathbf y_2=(l,l)$ ($l\in\{0,1,\ldots,5\}$) for various lattice sizes $L$ from 10 up to 18. We observe that each of $a_0,a_1,a_2$ takes the same value for any $L\in\{10,\ldots,18\}$ and $l\in\{0,\ldots,5\}$.

Table 2. Second-order perturbation in the case that x_1 = x_2 = y_1 = y_2 = (l, l) (l ∈ {0, 1, . . . , 5}).

    L                 a0              a1               a2              a0 + a1 U + a2 U^2
    10, 11, ..., 18   5.050 × 10^-1   −3.774 × 10^-1   9.339 × 10^-2   5.050 × 10^-1

Table 3 shows the values of $a_0,a_1,a_2$ and $a_0+a_1U+a_2U^2$ in the case that $\mathbf x_1=\mathbf y_1=(0,0)$, $\mathbf x_2=\mathbf y_2=(l,l)$ ($l\in\{1,\ldots,5\}$) for various lattice sizes $L$ from 10 to 18. Again we see that each of $a_0,a_1,a_2$ takes the same value for any $L\in\{10,\ldots,18\}$ and $l\in\{1,\ldots,5\}$. Since we have fixed a small $U$ so that Error becomes sufficiently small, the 1st and 2nd order terms do not contribute much to the sum $a_0+a_1U+a_2U^2$ in these numerical simulations.

Table 3. Second-order perturbation in the case that x_1 = y_1 = (0, 0), x_2 = y_2 = (l, l) (l ∈ {1, . . . , 5}).

    L                 a0              a1               a2              a0 + a1 U + a2 U^2
    10, 11, ..., 18   5.050 × 10^-1   −2.524 × 10^-1   9.402 × 10^-2   5.050 × 10^-1

We also computed $a_0,a_1,a_2$ in the case that $\mathbf x_1=\mathbf x_2=(0,0)$, $\mathbf y_1=\mathbf y_2=(l,l)$ for $l\in\{1,\ldots,5\}$ and $L=10,11,\ldots,18$. The result shows that $|a_0|,|a_1|,|a_2|\le1.5\times10^{-5}$ for any $l\in\{1,\ldots,5\}$ and $L\in\{10,11,\ldots,18\}$. In this case the values of $|a_0|,|a_1|,|a_2|$ are much smaller than those presented in Tables 2 and 3. This result indicates that the 4-point correlation function takes small values if $\mathbf x_1=\mathbf x_2$, $\mathbf y_1=\mathbf y_2$ and $|\mathbf x_1-\mathbf y_1|$ is large, and agrees with the decaying property of the 4-point correlation function for the 2-dimensional Hubbard model proved in [13].

Appendices

In this section, we review the definitions of the Fermionic Fock space and the annihilation and creation operators, prove Proposition 2.4 and show that the covariance matrix $C_h$ has a non-zero determinant independent of the parameter $h$.

We write a matrix $M$ indexed by finite sets $S,S'$ with $\sharp S=\sharp S'=n$ as $M=(M(s,s'))_{s\in S,s'\in S'}$. In this notation we regard each element of $S$ and $S'$ as having already been given a number from 1 to $n$, and $M$ is defined by $M=(M(s_j,s'_k))_{1\le j,k\le n}$ even if the numbering of $S$ and $S'$ is not specified in the context. The main results of Propositions 2.4 and C.7, concluded after some arguments involving such matrices in these appendices, are independent of how the index sets are numbered. For any finite set $B$, let $L^2(B;\mathbb{C})$ denote the complex linear space consisting of complex-valued functions on $B$, even when we do not introduce an inner product in $L^2(B;\mathbb{C})$.

Appendix A. The Fermionic Fock Space

The first part of the appendices reviews the definitions of the Fermionic Fock space on the lattice $\Gamma\times\{\uparrow,\downarrow\}$ and the annihilation and creation operators $\psi_{\mathbf x\sigma}$, $\psi^*_{\mathbf x\sigma}$.

For any $n\in\mathbb{N}$ we consider the linear space $L^2((\Gamma\times\{\uparrow,\downarrow\})^n;\mathbb{C})$ as a Hilbert space equipped with the inner product $\langle\cdot,\cdot\rangle_{L^2((\Gamma\times\{\uparrow,\downarrow\})^n;\mathbb{C})}$ defined by
$$\langle\phi_1,\phi_2\rangle_{L^2((\Gamma\times\{\uparrow,\downarrow\})^n;\mathbb{C})}:=\sum_{\mathbf x_1,\ldots,\mathbf x_n\in\Gamma}\ \sum_{\sigma_1,\ldots,\sigma_n\in\{\uparrow,\downarrow\}}\overline{\phi_1(\mathbf x_1\sigma_1,\ldots,\mathbf x_n\sigma_n)}\,\phi_2(\mathbf x_1\sigma_1,\ldots,\mathbf x_n\sigma_n).$$
By convention, we set $L^2((\Gamma\times\{\uparrow,\downarrow\})^0;\mathbb{C}):=\mathbb{C}$.

For $n\in\mathbb{N}$ the anti-symmetrization operator $A_n:L^2((\Gamma\times\{\uparrow,\downarrow\})^n;\mathbb{C})\to L^2((\Gamma\times\{\uparrow,\downarrow\})^n;\mathbb{C})$ is defined by
$$(A_n\phi)(\mathbf x_1\sigma_1,\ldots,\mathbf x_n\sigma_n):=\frac{1}{n!}\sum_{\pi\in S_n}\operatorname{sgn}(\pi)\,\phi(\mathbf x_{\pi(1)}\sigma_{\pi(1)},\ldots,\mathbf x_{\pi(n)}\sigma_{\pi(n)}).$$
The operator $A_0$ is defined as the identity map on $\mathbb{C}$, i.e., $A_0z:=z$ for all $z\in\mathbb{C}$. The subspace $A_n(L^2((\Gamma\times\{\uparrow,\downarrow\})^n;\mathbb{C}))$ of $L^2((\Gamma\times\{\uparrow,\downarrow\})^n;\mathbb{C})$ is called the Fermionic $n$-particle space and is a Hilbert space equipped with the inner product of $L^2((\Gamma\times\{\uparrow,\downarrow\})^n;\mathbb{C})$. Note that by anti-symmetry, for any $n>2L^d$, $A_n(L^2((\Gamma\times\{\uparrow,\downarrow\})^n;\mathbb{C}))=\{0\}$. The Fermionic Fock space $F_f(L^2(\Gamma\times\{\uparrow,\downarrow\};\mathbb{C}))$ is defined as the direct sum of $A_n(L^2((\Gamma\times\{\uparrow,\downarrow\})^n;\mathbb{C}))$ ($n=0,\ldots,2L^d$) as follows.
$$F_f(L^2(\Gamma\times\{\uparrow,\downarrow\};\mathbb{C})):=\bigoplus_{n=0}^{2L^d}A_n(L^2((\Gamma\times\{\uparrow,\downarrow\})^n;\mathbb{C})).$$
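The space $F_f(L^2(\Gamma\times\{\uparrow,\downarrow\};\mathbb{C}))$ is a Hilbert space with the inner product $\langle\cdot,\cdot\rangle_{F_f}$ defined by
$$\langle\phi_1,\phi_2\rangle_{F_f}:=\sum_{n=0}^{2L^d}\langle\phi_{1,n},\phi_{2,n}\rangle_{L^2((\Gamma\times\{\uparrow,\downarrow\})^n;\mathbb{C})},$$
for any vectors $\phi_1=(\phi_{1,0},\phi_{1,1},\ldots,\phi_{1,2L^d})$, $\phi_2=(\phi_{2,0},\phi_{2,1},\ldots,\phi_{2,2L^d})\in F_f(L^2(\Gamma\times\{\uparrow,\downarrow\};\mathbb{C}))$.

Define a set of functions $\{\phi_{\mathbf k\sigma}\}_{(\mathbf k,\sigma)\in\Gamma^*\times\{\uparrow,\downarrow\}}\subset L^2(\Gamma\times\{\uparrow,\downarrow\};\mathbb{C})$ by $\phi_{\mathbf k\sigma}(\mathbf x\tau):=\delta_{\sigma,\tau}e^{-i\langle\mathbf k,\mathbf x\rangle}/L^{d/2}$. We then define a function $\phi_{\mathbf k_1\sigma_1}\cdots\phi_{\mathbf k_n\sigma_n}\in L^2((\Gamma\times\{\uparrow,\downarrow\})^n;\mathbb{C})$ by
$$\phi_{\mathbf k_1\sigma_1}\cdots\phi_{\mathbf k_n\sigma_n}(\mathbf x_1\tau_1,\ldots,\mathbf x_n\tau_n):=\phi_{\mathbf k_1\sigma_1}(\mathbf x_1\tau_1)\cdots\phi_{\mathbf k_n\sigma_n}(\mathbf x_n\tau_n).$$
An orthonormal basis of $F_f(L^2(\Gamma\times\{\uparrow,\downarrow\};\mathbb{C}))$ is given by $\bigcup_{n=0}^{2L^d}B_n$, where
$$B_0:=\{1\}\ (\subset\mathbb{C}),\qquad B_n:=\Bigg\{\sqrt{n!}\,A_n\prod_{(\mathbf k,\sigma)\in\Gamma^*\times\{\uparrow,\downarrow\}}\phi_{\mathbf k\sigma}^{n_{\mathbf k\sigma}}\ \Bigg|\ n_{\mathbf k\sigma}\in\{0,1\},\ \sum_{\mathbf k\in\Gamma^*}\sum_{\sigma\in\{\uparrow,\downarrow\}}n_{\mathbf k\sigma}=n\Bigg\} \tag{A.1}$$
for $n\in\{1,2,\ldots,2L^d\}$. Thus, we see that $\dim F_f(L^2(\Gamma\times\{\uparrow,\downarrow\};\mathbb{C}))=\sum_{n=0}^{2L^d}\sharp B_n=2^{2L^d}$.

The annihilation operator $\psi_{\mathbf x\sigma}:F_f(L^2(\Gamma\times\{\uparrow,\downarrow\};\mathbb{C}))\to F_f(L^2(\Gamma\times\{\uparrow,\downarrow\};\mathbb{C}))$ ($\mathbf x\in\Gamma$, $\sigma\in\{\uparrow,\downarrow\}$) is defined in the following steps. For any $n\in\mathbb{N}\cup\{0\}$ and any $\phi\in A_{n+1}(L^2((\Gamma\times\{\uparrow,\downarrow\})^{n+1};\mathbb{C}))$, $\psi_{\mathbf x\sigma}\phi\in A_n(L^2((\Gamma\times\{\uparrow,\downarrow\})^n;\mathbb{C}))$ is defined by
$$(\psi_{\mathbf x\sigma}\phi)(\mathbf x_1\sigma_1,\ldots,\mathbf x_n\sigma_n):=\sqrt{n+1}\,\phi(\mathbf x\sigma,\mathbf x_1\sigma_1,\ldots,\mathbf x_n\sigma_n).$$
For any $z\in A_0(L^2((\Gamma\times\{\uparrow,\downarrow\})^0;\mathbb{C}))$, $\psi_{\mathbf x\sigma}z:=0$. The domain of the operator $\psi_{\mathbf x\sigma}$ is then extended to the whole space $F_f(L^2(\Gamma\times\{\uparrow,\downarrow\};\mathbb{C}))$ by linearity.

The creation operator $\psi^*_{\mathbf x\sigma}:F_f(L^2(\Gamma\times\{\uparrow,\downarrow\};\mathbb{C}))\to F_f(L^2(\Gamma\times\{\uparrow,\downarrow\};\mathbb{C}))$ is the adjoint operator of $\psi_{\mathbf x\sigma}$ and is characterized as follows. For any $n\in\mathbb{N}\cup\{0\}$ and any $\phi\in A_n(L^2((\Gamma\times\{\uparrow,\downarrow\})^n;\mathbb{C}))$, $\psi^*_{\mathbf x\sigma}\phi\in A_{n+1}(L^2((\Gamma\times\{\uparrow,\downarrow\})^{n+1};\mathbb{C}))$ and
$$(\psi^*_{\mathbf x\sigma}\phi)(\mathbf x_1\sigma_1,\ldots,\mathbf x_{n+1}\sigma_{n+1})=\frac{1}{\sqrt{n+1}}\sum_{l=1}^{n+1}(-1)^{l-1}\delta_{\mathbf x,\mathbf x_l}\delta_{\sigma,\sigma_l}\,\phi(\mathbf x_1\sigma_1,\ldots,\widehat{\mathbf x_l\sigma_l},\ldots,\mathbf x_{n+1}\sigma_{n+1}),$$
where the notation "$\widehat{\mathbf x_l\sigma_l}$" stands for the omission of the variable $\mathbf x_l\sigma_l$.

For any operators $A,B$ let $\{A,B\}$ denote $AB+BA$. The operators $\psi_{\mathbf x\sigma}$, $\psi^*_{\mathbf x\sigma}$ ($\mathbf x\in\Gamma$, $\sigma\in\{\uparrow,\downarrow\}$) satisfy the following canonical anti-commutation relations.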
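For all $\mathbf x,\mathbf y\in\Gamma$, $\sigma,\tau\in\{\uparrow,\downarrow\}$,
$$\{\psi_{\mathbf x\sigma},\psi_{\mathbf y\tau}\}=\{\psi^*_{\mathbf x\sigma},\psi^*_{\mathbf y\tau}\}=0,\qquad\{\psi_{\mathbf x\sigma},\psi^*_{\mathbf y\tau}\}=\delta_{\mathbf x,\mathbf y}\delta_{\sigma,\tau}. \tag{A.2}$$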
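The relations (A.2) can be verified mechanically in a small example. The following Python sketch does not use the antisymmetrized function spaces defined above; as an assumption made for illustration, it works in the equivalent occupation-number picture, where each pair $(\mathbf x,\sigma)$ is one "mode", a basis vector is a bit pattern of occupied modes, and the sign of an operator counts the occupied modes of lower index. All names are hypothetical.

```python
def annihilate(mode):
    """Annihilation operator of one mode, acting on vectors {bit pattern: coefficient}."""
    def op(vec):
        out = {}
        for bits, c in vec.items():
            if (bits >> mode) & 1:
                lower = bin(bits & ((1 << mode) - 1)).count("1")
                key = bits ^ (1 << mode)
                out[key] = out.get(key, 0) + (-c if lower % 2 else c)
        return out
    return op


def create(mode):
    """Creation operator of one mode (the adjoint of annihilate(mode))."""
    def op(vec):
        out = {}
        for bits, c in vec.items():
            if not (bits >> mode) & 1:
                lower = bin(bits & ((1 << mode) - 1)).count("1")
                key = bits | (1 << mode)
                out[key] = out.get(key, 0) + (-c if lower % 2 else c)
        return out
    return op


def anticommutator(a, b):
    """Returns the operator {a, b} = ab + ba."""
    def op(vec):
        out = dict(a(b(vec)))
        for key, c in b(a(vec)).items():
            out[key] = out.get(key, 0) + c
        return {key: c for key, c in out.items() if c != 0}
    return op


def check_car(num_modes):
    """Check (A.2) on every basis vector of the 2^num_modes dimensional space."""
    for bits in range(1 << num_modes):
        basis_vector = {bits: 1}
        for p in range(num_modes):
            for q in range(num_modes):
                if anticommutator(annihilate(p), annihilate(q))(basis_vector):
                    return False
                if anticommutator(create(p), create(q))(basis_vector):
                    return False
                expected = basis_vector if p == q else {}
                if anticommutator(annihilate(p), create(q))(basis_vector) != expected:
                    return False
    return True


print(check_car(3))
```

With `num_modes` playing the role of $2L^d$, the number of basis vectors `1 << num_modes` matches the dimension $2^{2L^d}$ of the Fock space.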
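See, e.g., [3, 4] for more detailed definitions of the Fermionic Fock space and the operators on it.

Appendix B. The Temperature-Ordered Perturbation Series

In this section we present the derivation of the perturbation series (2.6). The propositions claimed here are standard tools in many-body theory (see, e.g., [17, Chaps. 2 and 3]). This part of the appendices is devoted to showing them in a mathematical context.

Let us fix the notation used in the analysis below. Let $H_0$, $V_\lambda$ be the operators defined in (2.1), (2.2) and (2.3), respectively. In our argument in this section, however, we do not use the relation (2.4) or the condition (3.16) imposed on the parameter $\{U_X\}_{X\in\Gamma^4}$. One can consider more general $V_\lambda$ of the form (2.3) parametrized by any complex multi-variable $\{U_X\}_{X\in\Gamma^4}$ in this section.

Define the operators $V_\lambda(s)$, $\psi_{\mathbf x\sigma}(s)$, $\psi^*_{\mathbf x\sigma}(s)$ ($s\in\mathbb{R}$, $\mathbf x\in\Gamma$, $\sigma\in\{\uparrow,\downarrow\}$) by
$$V_\lambda(s):=e^{sH_0}V_\lambda e^{-sH_0},\qquad\psi_{\mathbf x\sigma}(s):=e^{sH_0}\psi_{\mathbf x\sigma}e^{-sH_0},\qquad\psi^*_{\mathbf x\sigma}(s):=e^{sH_0}\psi^*_{\mathbf x\sigma}e^{-sH_0}.$$
For $a\in\{0,1\}$ the operator $\psi_{\mathbf x\sigma a}(s)$ is defined by
$$\psi_{\mathbf x\sigma a}(s):=\begin{cases}\psi^*_{\mathbf x\sigma}(s)&\text{if }a=1,\\ \psi_{\mathbf x\sigma}(s)&\text{if }a=0.\end{cases}$$
Next we define the ordering operators $T_1,T_2$.

Definition B.1. Consider linear operators $M(s_1),\ldots,M(s_n):F_f(L^2(\Gamma\times\{\uparrow,\downarrow\};\mathbb{C}))\to F_f(L^2(\Gamma\times\{\uparrow,\downarrow\};\mathbb{C}))$ parametrized by $s_1,\ldots,s_n\in\mathbb{R}$. Assume that $s_j\ne s_k$ for any $j,k\in\{1,\ldots,n\}$ with $j\ne k$. The operator $T_1(M(s_1)\cdots M(s_n))$ is defined by
$$T_1(M(s_1)\cdots M(s_n)):=M(s_{\pi(1)})\cdots M(s_{\pi(n)}),$$
where $\pi\in S_n$ is uniquely determined by the condition that $s_{\pi(1)}>s_{\pi(2)}>\cdots>s_{\pi(n)}$.

Let us define a relation "$\sim$" in the set $\{\psi_{\mathbf x\sigma a}(s)\}_{(\mathbf x,\sigma,a,s)\in\Gamma\times\{\uparrow,\downarrow\}\times\{0,1\}\times\mathbb{R}}$ as follows.
$$\psi_{\mathbf x\sigma a_1}(s_1)\sim\psi_{\mathbf y\tau a_2}(s_2)\quad\text{if }a_1=a_2\text{ and }s_1=s_2.$$
We see that "$\sim$" is an equivalence relation in $\{\psi_{\mathbf x\sigma a}(s)\}_{(\mathbf x,\sigma,a,s)\in\Gamma\times\{\uparrow,\downarrow\}\times\{0,1\}\times\mathbb{R}}$. Let $[\psi_{\mathbf x\sigma a}(s)]$ denote the equivalence class represented by an element $\psi_{\mathbf x\sigma a}(s)$. We define relations "$\succ$" and "$\succeq$" in the quotient set $\{\psi_{\mathbf x\sigma a}(s)\}_{(\mathbf x,\sigma,a,s)\in\Gamma\times\{\uparrow,\downarrow\}\times\{0,1\}\times\mathbb{R}}/\!\sim$ as follows.
$$\begin{aligned}
&[\psi_{\mathbf x\sigma a_1}(s_1)]\succ[\psi_{\mathbf y\tau a_2}(s_2)]\quad\text{if }s_1>s_2,\text{ or }s_1=s_2\text{ and }a_1>a_2,\\
&[\psi_{\mathbf x\sigma a_1}(s_1)]\succeq[\psi_{\mathbf y\tau a_2}(s_2)]\quad\text{if }[\psi_{\mathbf x\sigma a_1}(s_1)]\succ[\psi_{\mathbf y\tau a_2}(s_2)]\text{ or }[\psi_{\mathbf x\sigma a_1}(s_1)]=[\psi_{\mathbf y\tau a_2}(s_2)].
\end{aligned}$$
The set $\{\psi_{\mathbf x\sigma a}(s)\}_{(\mathbf x,\sigma,a,s)\in\Gamma\times\{\uparrow,\downarrow\}\times\{0,1\}\times\mathbb{R}}/\!\sim$ is totally ordered under the relation "$\succeq$" and the relation "$\succ$" is a strict order in this quotient set.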
Definition B.2. For any $\psi_{\mathbf x_j\sigma_ja_j}(s_j)$ ($j=1,\ldots,n$) the operator $T_2(\psi_{\mathbf x_1\sigma_1a_1}(s_1)\cdots\psi_{\mathbf x_n\sigma_na_n}(s_n))$ is defined by
$$T_2(\psi_{\mathbf x_1\sigma_1a_1}(s_1)\cdots\psi_{\mathbf x_n\sigma_na_n}(s_n)):=\operatorname{sgn}(\pi)\,\psi_{\mathbf x_{\pi(1)}\sigma_{\pi(1)}a_{\pi(1)}}(s_{\pi(1)})\cdots\psi_{\mathbf x_{\pi(n)}\sigma_{\pi(n)}a_{\pi(n)}}(s_{\pi(n)}),$$
where $\pi\in S_n$ is uniquely determined by the conditions that
$$[\psi_{\mathbf x_{\pi(1)}\sigma_{\pi(1)}a_{\pi(1)}}(s_{\pi(1)})]\succeq[\psi_{\mathbf x_{\pi(2)}\sigma_{\pi(2)}a_{\pi(2)}}(s_{\pi(2)})]\succeq\cdots\succeq[\psi_{\mathbf x_{\pi(n)}\sigma_{\pi(n)}a_{\pi(n)}}(s_{\pi(n)})],$$
and if there exist $l_1,l_2\in\{1,\ldots,n\}$ with $l_1<l_2$ such that $[\psi_{\mathbf x_{l_1}\sigma_{l_1}a_{l_1}}(s_{l_1})]=[\psi_{\mathbf x_{l_2}\sigma_{l_2}a_{l_2}}(s_{l_2})]\ne[\psi_{\mathbf x_j\sigma_ja_j}(s_j)]$ ($\forall j\in\{l_1+1,\ldots,l_2-1\}$) and $\pi(m)=l_1$ with $m\in\{1,\ldots,n\}$, then $\pi(m+1)=l_2$.

Using the ordering operator $T_1$ we have the following expansion.

Lemma B.3. For any $t_1,t_2\in\mathbb{R}$ with $t_1<t_2$,
$$e^{-(t_2-t_1)(H_0+V_\lambda)}=e^{-(t_2-t_1)H_0}+e^{-t_2H_0}\sum_{n=1}^\infty\frac{(-1)^n}{n!}\int_{[t_1,t_2]^n}ds_1\cdots ds_n\,T_1(V_\lambda(s_1)\cdots V_\lambda(s_n))\,e^{t_1H_0}.$$

Remark B.4. Though the operator $T_1(V_\lambda(s_1)\cdots V_\lambda(s_n))$ is defined only for $(s_1,\ldots,s_n)$ with $s_j\ne s_k$ ($j\ne k$), we can consider $T_1(V_\lambda(s_1)\cdots V_\lambda(s_n))$ as a Bochner integrable function over $[t_1,t_2]^n$ since the Lebesgue measure of the set $\{(s_1,\ldots,s_n)\in[t_1,t_2]^n\mid\exists j,k\in\{1,\ldots,n\}\text{ s.t. }j\ne k\text{ and }s_j=s_k\}$ is zero.

Proof of Lemma B.3. Since the operator-valued function $\xi\mapsto e^{-(t_2-t_1)(H_0+\xi V_\lambda)}$ is analytic, we have
$$e^{-(t_2-t_1)(H_0+V_\lambda)}=\sum_{n=0}^\infty\frac{1}{n!}\left(\frac{d}{d\xi}\right)^ne^{-(t_2-t_1)(H_0+\xi V_\lambda)}\Big|_{\xi=0}. \tag{B.1}$$
It is sufficient to show that for all $n\in\mathbb{N}$ and all $\xi\in\mathbb{R}$
$$\left(\frac{d}{d\xi}\right)^ne^{-(t_2-t_1)(H_0+\xi V_\lambda)}=(-1)^ne^{-t_2(H_0+\xi V_\lambda)}\int_{[t_1,t_2]^n}ds_1\cdots ds_n\,T_1(V_{\lambda,\xi}(s_1)\cdots V_{\lambda,\xi}(s_n))\,e^{t_1(H_0+\xi V_\lambda)}, \tag{B.2}$$
where $V_{\lambda,\xi}(s):=e^{s(H_0+\xi V_\lambda)}V_\lambda e^{-s(H_0+\xi V_\lambda)}$. In fact, substituting (B.2) for $\xi=0$ into (B.1) gives the result. We show (B.2) by induction on $n$.
By Lemma 2.3, we have
d −(t2 −t1 )(H0 +ξVλ ) e = dξ
1
0
= −e
ds e−(1−s)(t2 −t1 )(H0 +ξVλ ) (t1 − t2 )Vλ e−s(t2 −t1 )(H0 +ξVλ )
−t2 (H0 +ξVλ )
t2
ds Vλ,ξ (s)et1 (H0 +ξVλ ) ,
t1
which is (B.2) for n = 1. Let us assume that (B.2) is true for n − 1 (n ≥ 2).
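Then
\[
\begin{aligned}
\Big(\frac{d}{d\xi}\Big)^{n-1} e^{-(t_2-t_1)(H_0+\xi V_\lambda)}
&= (-1)^{n-1}(n-1)!\, e^{-t_2(H_0+\xi V_\lambda)}\int_{[t_1,t_2]^{n-1}} ds_1\, ds_2\cdots ds_{n-1}\\
&\quad\cdot 1_{s_1>s_2>\cdots>s_{n-1}}\, V_{\lambda,\xi}(s_1)V_{\lambda,\xi}(s_2)\cdots V_{\lambda,\xi}(s_{n-1})\, e^{t_1(H_0+\xi V_\lambda)}\\
&= (-1)^{n-1}(n-1)!\int_{t_1}^{t_2} ds_1\int_{t_1}^{s_1} ds_2\cdots\int_{t_1}^{s_{n-2}} ds_{n-1}\, e^{-(t_2-s_1)(H_0+\xi V_\lambda)}V_\lambda\\
&\quad\cdot e^{-(s_1-s_2)(H_0+\xi V_\lambda)}V_\lambda\cdots e^{-(s_{n-2}-s_{n-1})(H_0+\xi V_\lambda)}V_\lambda\, e^{-(s_{n-1}-t_1)(H_0+\xi V_\lambda)}.
\end{aligned}
\]
By writing $t_2 = s_0$, $t_1 = s_n$ and using Lemma 2.3 again, we observe that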
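\[
\begin{aligned}
\Big(\frac{d}{d\xi}\Big)^{n} e^{-(t_2-t_1)(H_0+\xi V_\lambda)}
&= (-1)^{n-1}(n-1)!\int_{s_n}^{s_0} ds_1\cdots\int_{s_n}^{s_{n-2}} ds_{n-1}\sum_{j=0}^{n-1} e^{-(s_0-s_1)(H_0+\xi V_\lambda)}V_\lambda\cdots\\
&\quad\cdot\Big(\frac{d}{d\xi}e^{-(s_j-s_{j+1})(H_0+\xi V_\lambda)}\Big)V_\lambda\cdots e^{-(s_{n-2}-s_{n-1})(H_0+\xi V_\lambda)}V_\lambda\, e^{-(s_{n-1}-s_n)(H_0+\xi V_\lambda)}\\
&= (-1)^{n}(n-1)!\int_{s_n}^{s_0} ds_1\cdots\int_{s_n}^{s_{n-2}} ds_{n-1}\sum_{j=0}^{n-1} e^{-(s_0-s_1)(H_0+\xi V_\lambda)}V_\lambda\cdots\\
&\quad\cdot\int_{s_{j+1}}^{s_j} ds\, e^{-(s_j-s)(H_0+\xi V_\lambda)}V_\lambda\, e^{-(s-s_{j+1})(H_0+\xi V_\lambda)}V_\lambda\cdots e^{-(s_{n-2}-s_{n-1})(H_0+\xi V_\lambda)}V_\lambda\, e^{-(s_{n-1}-s_n)(H_0+\xi V_\lambda)}
\end{aligned}
\]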
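\[
\begin{aligned}
&= (-1)^{n}(n-1)!\sum_{j=0}^{n-1}\int_{s_n}^{s_0} ds_1\cdots\int_{s_n}^{s_{n-2}} ds_{n-1}\int_{s_{j+1}}^{s_j} ds\, 1_{s_0>s_1>\cdots>s_j>s>s_{j+1}>\cdots>s_n}\\
&\quad\cdot e^{-(s_0-s_1)(H_0+\xi V_\lambda)}V_\lambda\cdots e^{-(s_j-s)(H_0+\xi V_\lambda)}V_\lambda\, e^{-(s-s_{j+1})(H_0+\xi V_\lambda)}V_\lambda\cdots e^{-(s_{n-2}-s_{n-1})(H_0+\xi V_\lambda)}V_\lambda\, e^{-(s_{n-1}-s_n)(H_0+\xi V_\lambda)}.
\end{aligned}
\]
Then by changing the index of the variables $\{s_j, s \mid j = 0,\dots,n\}$, we obtain
\[
\begin{aligned}
\Big(\frac{d}{d\xi}\Big)^{n} e^{-(t_2-t_1)(H_0+\xi V_\lambda)}
&= (-1)^n n!\, e^{-t_2(H_0+\xi V_\lambda)}\int_{[t_1,t_2]^n} ds_1\cdots ds_n\, 1_{s_1>\cdots>s_n}\, V_{\lambda,\xi}(s_1)\cdots V_{\lambda,\xi}(s_n)\, e^{t_1(H_0+\xi V_\lambda)}\\
&= (-1)^n e^{-t_2(H_0+\xi V_\lambda)}\int_{[t_1,t_2]^n} ds_1\cdots ds_n\, T_1(V_{\lambda,\xi}(s_1)\cdots V_{\lambda,\xi}(s_n))\, e^{t_1(H_0+\xi V_\lambda)},
\end{aligned}
\]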
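which completes the proof.

Next we prepare some properties of the operators $\psi^*_{x\sigma}(s)$ and $\psi_{x\sigma}(s)$. Using the matrix $\{F(x\sigma, y\tau)\}_{(x,\sigma),(y,\tau)\in\Gamma\times\{\uparrow,\downarrow\}}$ defined in (2.2), we define the matrices $F(a)$ $(a = 0, 1)$ by
\[
F(a) := \begin{cases} F & \text{if } a = 0,\\ -F^t & \text{if } a = 1.\end{cases}
\]
Lemmas B.5 and B.6 below follow [15, Lemmas 3.2.1 and 3.2.2]. However, we give the proof to make this section self-contained.

Lemma B.5. The following equalities hold.
(i) For any $(x,\sigma,a,s)\in\Gamma\times\{\uparrow,\downarrow\}\times\{0,1\}\times\mathbb{R}$
\[
\psi_{x\sigma a}(s) = \sum_{y\in\Gamma}\sum_{\tau\in\{\uparrow,\downarrow\}} e^{-sF(a)}(x\sigma, y\tau)\,\psi_{y\tau a}.
\]
(ii) For any $(x,\sigma,s), (y,\tau,t)\in\Gamma\times\{\uparrow,\downarrow\}\times\mathbb{R}$
\[
\{\psi^*_{x\sigma}(s), \psi^*_{y\tau}(t)\} = \{\psi_{x\sigma}(s), \psi_{y\tau}(t)\} = 0,\qquad \{\psi^*_{x\sigma}(s), \psi_{y\tau}(t)\} = e^{(s-t)F}(y\tau, x\sigma).
\]

Proof. We see that for $a\in\{0,1\}$,
\[
\frac{d}{ds}\psi_{x\sigma a}(s) = e^{sH_0}(H_0\psi_{x\sigma a} - \psi_{x\sigma a}H_0)e^{-sH_0}. \tag{B.3}
\]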
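By using (A.2), we can show that for $a\in\{0,1\}$
\[
H_0\psi_{x\sigma a} = -\sum_{y\in\Gamma}\sum_{\tau\in\{\uparrow,\downarrow\}} F(a)(x\sigma, y\tau)\,\psi_{y\tau a} + \psi_{x\sigma a}H_0. \tag{B.4}
\]
By combining (B.3) with (B.4), we obtain a differential equation
\[
\frac{d}{ds}\psi_{x\sigma a}(s) = -\sum_{y\in\Gamma}\sum_{\tau\in\{\uparrow,\downarrow\}} F(a)(x\sigma, y\tau)\,\psi_{y\tau a}(s),
\]
for $a\in\{0,1\}$, which gives (i). By using (i) and (A.2), the first equalities of (ii) can be proved. Moreover, we see that
\[
\begin{aligned}
\{\psi^*_{x\sigma}(s), \psi_{y\tau}(t)\} &= \sum_{x_1,x_2\in\Gamma}\sum_{\sigma_1,\sigma_2\in\{\uparrow,\downarrow\}} e^{sF}(x_1\sigma_1, x\sigma)\, e^{-tF}(y\tau, x_2\sigma_2)\,\{\psi^*_{x_1\sigma_1}, \psi_{x_2\sigma_2}\}\\
&= \sum_{x_1\in\Gamma}\sum_{\sigma_1\in\{\uparrow,\downarrow\}} e^{-tF}(y\tau, x_1\sigma_1)\, e^{sF}(x_1\sigma_1, x\sigma) = e^{(s-t)F}(y\tau, x\sigma).
\end{aligned}
\]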
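For any linear operator $A : F_f(L^2(\Gamma\times\{\uparrow,\downarrow\};\mathbb{C}))\to F_f(L^2(\Gamma\times\{\uparrow,\downarrow\};\mathbb{C}))$, let $\langle A\rangle_0$ denote $\mathrm{Tr}(e^{-\beta H_0}A)/\mathrm{Tr}\, e^{-\beta H_0}$. For a set of the operators $\{\psi_{x_j\sigma_j a_j}(s_j)\}_{j=1}^n$, let
\[
\psi_{x_1\sigma_1 a_1}(s_1)\cdots\widehat{\psi_{x_j\sigma_j a_j}(s_j)}\cdots\psi_{x_n\sigma_n a_n}(s_n)
\]
denote the product obtained by eliminating $\psi_{x_j\sigma_j a_j}(s_j)$ from the product $\psi_{x_1\sigma_1 a_1}(s_1)\cdots\psi_{x_n\sigma_n a_n}(s_n)$.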
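Lemma B.6. If $n\in\mathbb{N}$ is odd, for any $(x_j,\sigma_j,a_j,s_j)\in\Gamma\times\{\uparrow,\downarrow\}\times\{0,1\}\times\mathbb{R}$ $(j = 1,\dots,n)$
\[
\langle\psi_{x_1\sigma_1 a_1}(s_1)\cdots\psi_{x_n\sigma_n a_n}(s_n)\rangle_0 = 0.
\]
If $n\in\mathbb{N}$ is even, for any $(x_j,\sigma_j,a_j,s_j)\in\Gamma\times\{\uparrow,\downarrow\}\times\{0,1\}\times\mathbb{R}$ $(j = 1,\dots,n)$
\[
\begin{aligned}
\langle\psi_{x_1\sigma_1 a_1}(s_1)\cdots\psi_{x_n\sigma_n a_n}(s_n)\rangle_0 &= \sum_{j=2}^{n}(-1)^j\langle\psi_{x_1\sigma_1 a_1}(s_1)\psi_{x_j\sigma_j a_j}(s_j)\rangle_0\\
&\quad\cdot\langle\psi_{x_2\sigma_2 a_2}(s_2)\cdots\widehat{\psi_{x_j\sigma_j a_j}(s_j)}\cdots\psi_{x_n\sigma_n a_n}(s_n)\rangle_0.
\end{aligned} \tag{B.5}
\]
Moreover,
\[
\langle\psi_{x_1\sigma_1 a_1}(s_1)\psi_{x_2\sigma_2 a_2}(s_2)\rangle_0 = \sum_{y\in\Gamma}\sum_{\tau\in\{\uparrow,\downarrow\}}(I + e^{-\beta F(a_1)})^{-1}(x_1\sigma_1, y\tau)\,\{\psi_{y\tau a_1}(s_1), \psi_{x_2\sigma_2 a_2}(s_2)\}. \tag{B.6}
\]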
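Proof. By using the orthonormal basis $\bigcup_{m=0}^{2L^d} B_m$ defined in (A.1), we can write
\[
\mathrm{Tr}(e^{-\beta H_0}\psi_{x_1\sigma_1 a_1}(s_1)\cdots\psi_{x_n\sigma_n a_n}(s_n)) = \sum_{m=0}^{2L^d}\sum_{\phi\in B_m}\langle\phi, e^{-\beta H_0}\psi_{x_1\sigma_1 a_1}(s_1)\cdots\psi_{x_n\sigma_n a_n}(s_n)\phi\rangle_{F_f}. \tag{B.7}
\]
Since for all $s\in\mathbb{R}$ and $m\in\{0,1,\dots,2L^d\}$
\[
e^{sH_0}(A_m(L^2((\Gamma\times\{\uparrow,\downarrow\})^m;\mathbb{C})))\subset A_m(L^2((\Gamma\times\{\uparrow,\downarrow\})^m;\mathbb{C})),
\]
we see that if $n$ is odd, for any $m\in\{0,1,\dots,2L^d\}$ and $\phi\in B_m$
\[
e^{-\beta H_0}\psi_{x_1\sigma_1 a_1}(s_1)\cdots\psi_{x_n\sigma_n a_n}(s_n)\phi = 0,
\]
or
\[
e^{-\beta H_0}\psi_{x_1\sigma_1 a_1}(s_1)\cdots\psi_{x_n\sigma_n a_n}(s_n)\phi\in A_l(L^2((\Gamma\times\{\uparrow,\downarrow\})^l;\mathbb{C}))\quad\text{with } l\neq m,
\]
which implies that
\[
\langle\phi, e^{-\beta H_0}\psi_{x_1\sigma_1 a_1}(s_1)\cdots\psi_{x_n\sigma_n a_n}(s_n)\phi\rangle_{F_f} = 0. \tag{B.8}
\]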
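The first statement follows from (B.7) and (B.8). Let us assume that $n$ is even. We see that
\[
\begin{aligned}
&\psi_{x_1\sigma_1 a_1}(s_1)\psi_{x_2\sigma_2 a_2}(s_2)\cdots\psi_{x_n\sigma_n a_n}(s_n)\\
&\quad= \{\psi_{x_1\sigma_1 a_1}(s_1), \psi_{x_2\sigma_2 a_2}(s_2)\}\psi_{x_3\sigma_3 a_3}(s_3)\cdots\psi_{x_n\sigma_n a_n}(s_n)\\
&\qquad- \psi_{x_2\sigma_2 a_2}(s_2)\psi_{x_1\sigma_1 a_1}(s_1)\psi_{x_3\sigma_3 a_3}(s_3)\cdots\psi_{x_n\sigma_n a_n}(s_n)\\
&\quad= \{\psi_{x_1\sigma_1 a_1}(s_1), \psi_{x_2\sigma_2 a_2}(s_2)\}\psi_{x_3\sigma_3 a_3}(s_3)\cdots\psi_{x_n\sigma_n a_n}(s_n)\\
&\qquad- \psi_{x_2\sigma_2 a_2}(s_2)\{\psi_{x_1\sigma_1 a_1}(s_1), \psi_{x_3\sigma_3 a_3}(s_3)\}\psi_{x_4\sigma_4 a_4}(s_4)\cdots\psi_{x_n\sigma_n a_n}(s_n)\\
&\qquad+ \psi_{x_2\sigma_2 a_2}(s_2)\psi_{x_3\sigma_3 a_3}(s_3)\psi_{x_1\sigma_1 a_1}(s_1)\psi_{x_4\sigma_4 a_4}(s_4)\cdots\psi_{x_n\sigma_n a_n}(s_n)\\
&\quad= \sum_{j=2}^{n}\{\psi_{x_1\sigma_1 a_1}(s_1), \psi_{x_j\sigma_j a_j}(s_j)\}(-1)^j\psi_{x_2\sigma_2 a_2}(s_2)\cdots\widehat{\psi_{x_j\sigma_j a_j}(s_j)}\cdots\psi_{x_n\sigma_n a_n}(s_n)\\
&\qquad- (-1)^n\psi_{x_2\sigma_2 a_2}(s_2)\cdots\psi_{x_n\sigma_n a_n}(s_n)\psi_{x_1\sigma_1 a_1}(s_1),
\end{aligned}
\]
which yields
\[
\begin{aligned}
&\langle\psi_{x_1\sigma_1 a_1}(s_1)\psi_{x_2\sigma_2 a_2}(s_2)\cdots\psi_{x_n\sigma_n a_n}(s_n)\rangle_0 + \langle\psi_{x_2\sigma_2 a_2}(s_2)\cdots\psi_{x_n\sigma_n a_n}(s_n)\psi_{x_1\sigma_1 a_1}(s_1)\rangle_0\\
&\quad= \sum_{j=2}^{n}\{\psi_{x_1\sigma_1 a_1}(s_1), \psi_{x_j\sigma_j a_j}(s_j)\}(-1)^j\langle\psi_{x_2\sigma_2 a_2}(s_2)\cdots\widehat{\psi_{x_j\sigma_j a_j}(s_j)}\cdots\psi_{x_n\sigma_n a_n}(s_n)\rangle_0.
\end{aligned} \tag{B.9}
\]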
On the other hand, by Lemma B.5(i),
\[
\psi_{x\sigma a}(s+\beta) = \sum_{y\in\Gamma}\sum_{\tau\in\{\uparrow,\downarrow\}} e^{-\beta F(a)}(x\sigma, y\tau)\,\psi_{y\tau a}(s). \tag{B.10}
\]
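By using (B.10) and the equality that $\mathrm{Tr}(AB) = \mathrm{Tr}(BA)$ for any operators $A$, $B$, we observe that
\[
\begin{aligned}
\langle\psi_{x_2\sigma_2 a_2}(s_2)\cdots\psi_{x_n\sigma_n a_n}(s_n)\psi_{x_1\sigma_1 a_1}(s_1)\rangle_0 &= \langle\psi_{x_1\sigma_1 a_1}(s_1+\beta)\psi_{x_2\sigma_2 a_2}(s_2)\cdots\psi_{x_n\sigma_n a_n}(s_n)\rangle_0\\
&= \sum_{y\in\Gamma}\sum_{\tau\in\{\uparrow,\downarrow\}} e^{-\beta F(a_1)}(x_1\sigma_1, y\tau)\,\langle\psi_{y\tau a_1}(s_1)\psi_{x_2\sigma_2 a_2}(s_2)\cdots\psi_{x_n\sigma_n a_n}(s_n)\rangle_0.
\end{aligned} \tag{B.11}
\]
By substituting (B.11) into (B.9), we obtain
\[
\begin{aligned}
&\sum_{y\in\Gamma}\sum_{\tau\in\{\uparrow,\downarrow\}}(\delta_{x_1,y}\delta_{\sigma_1,\tau} + e^{-\beta F(a_1)}(x_1\sigma_1, y\tau))\,\langle\psi_{y\tau a_1}(s_1)\psi_{x_2\sigma_2 a_2}(s_2)\cdots\psi_{x_n\sigma_n a_n}(s_n)\rangle_0\\
&\quad= \sum_{j=2}^{n}\{\psi_{x_1\sigma_1 a_1}(s_1), \psi_{x_j\sigma_j a_j}(s_j)\}(-1)^j\langle\psi_{x_2\sigma_2 a_2}(s_2)\cdots\widehat{\psi_{x_j\sigma_j a_j}(s_j)}\cdots\psi_{x_n\sigma_n a_n}(s_n)\rangle_0.
\end{aligned} \tag{B.12}
\]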
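Let us define a unitary matrix $M = (M(k\tau, x\sigma))_{(k,\tau)\in\Gamma^*\times\{\uparrow,\downarrow\},(x,\sigma)\in\Gamma\times\{\uparrow,\downarrow\}}$ by
\[
M(k\tau, x\sigma) := \frac{\delta_{\sigma,\tau}}{L^{d/2}}\, e^{-i\langle k,x\rangle}.
\]
Then we have for all $(k,\tau), (\hat k,\hat\tau)\in\Gamma^*\times\{\uparrow,\downarrow\}$
\[
MFM^*(k\tau, \hat k\hat\tau) = \overline{M}F^tM^t(k\tau, \hat k\hat\tau) = \delta_{k,\hat k}\,\delta_{\tau,\hat\tau}\,E_k, \tag{B.13}
\]
where $E_k$ is defined in (2.8). The equality (B.13) implies that
\[
\det(I + e^{-\beta F}) = \det(I + e^{-\beta MFM^*})\neq 0,\qquad \det(I + e^{\beta F^t}) = \det(I + e^{\beta\overline{M}F^tM^t})\neq 0.
\]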
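Thus, for $a = 0, 1$ the matrix $I + e^{-\beta F(a)}$ is invertible. The equality (B.12) leads to
\[
\begin{aligned}
\langle\psi_{x_1\sigma_1 a_1}(s_1)\psi_{x_2\sigma_2 a_2}(s_2)\cdots\psi_{x_n\sigma_n a_n}(s_n)\rangle_0 &= \sum_{j=2}^{n}\sum_{y\in\Gamma}\sum_{\tau\in\{\uparrow,\downarrow\}}(I + e^{-\beta F(a_1)})^{-1}(x_1\sigma_1, y\tau)\,\{\psi_{y\tau a_1}(s_1), \psi_{x_j\sigma_j a_j}(s_j)\}(-1)^j\\
&\quad\cdot\langle\psi_{x_2\sigma_2 a_2}(s_2)\cdots\widehat{\psi_{x_j\sigma_j a_j}(s_j)}\cdots\psi_{x_n\sigma_n a_n}(s_n)\rangle_0.
\end{aligned} \tag{B.14}
\]
The equality (B.6) is (B.14) for $n = 2$. Then, by substituting (B.6) into (B.14), we obtain (B.5).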
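From now, we show some lemmas involving the ordering operator $T_2$. To simplify notations, let $\psi_j$ denote $\psi_{x_j\sigma_j a_j}(s_j)$ for fixed variables $(x_j,\sigma_j,a_j,s_j)\in\Gamma\times\{\uparrow,\downarrow\}\times\{0,1\}\times\mathbb{R}$ $(j = 1,\dots,n)$.

Lemma B.7. For any $\pi\in S_n$,
\[
T_2(\psi_1\psi_2\cdots\psi_n) = \mathrm{sgn}(\pi)\,T_2(\psi_{\pi(1)}\psi_{\pi(2)}\cdots\psi_{\pi(n)}). \tag{B.15}
\]

Proof. It is sufficient to show (B.15) for any transposition $\pi$ as any permutation is a product of transpositions. Let us assume that $\pi = (j,k)$, $1\le j<k\le n$. Let $\tau,\eta\in S_n$ be the unique permutations associated with the definitions of $T_2(\psi_1\cdots\psi_n)$ and $T_2(\psi_{\pi(1)}\cdots\psi_{\pi(n)})$, respectively:
\[
T_2(\psi_1\cdots\psi_n) = \mathrm{sgn}(\tau)\,\psi_{\tau(1)}\cdots\psi_{\tau(n)}, \tag{B.16}
\]
\[
T_2(\psi_{\pi(1)}\cdots\psi_{\pi(n)}) = \mathrm{sgn}(\eta)\,\psi_{\pi(\eta(1))}\cdots\psi_{\pi(\eta(n))}. \tag{B.17}
\]
First consider the case that $[\psi_j]\neq[\psi_k]$. Let $A, B\subset\{j+1,\dots,k-1\}$ satisfy that $[\psi_j] = [\psi_\alpha]$ for any $\alpha\in A$, $[\psi_k] = [\psi_\gamma]$ for any $\gamma\in B$ and $[\psi_j], [\psi_k]\neq[\psi_p]$ for any $p\in\{j+1,\dots,k-1\}\setminus(A\cup B)$. If $A, B\neq\emptyset$, we can write $A = \{\alpha_1,\dots,\alpha_l\}$, $B = \{\gamma_1,\dots,\gamma_m\}$ with $j+1\le\alpha_1<\cdots<\alpha_l\le k-1$, $j+1\le\gamma_1<\cdots<\gamma_m\le k-1$. By the definition of $T_2$ the product $\psi_{\pi(\eta(1))}\cdots\psi_{\pi(\eta(n))}$ is obtained by replacing $\psi_j\psi_{\alpha_1}\cdots\psi_{\alpha_l}$ and $\psi_{\gamma_1}\cdots\psi_{\gamma_m}\psi_k$ by $\psi_{\alpha_1}\cdots\psi_{\alpha_l}\psi_j$ and $\psi_k\psi_{\gamma_1}\cdots\psi_{\gamma_m}$, respectively, in the product $\psi_{\tau(1)}\cdots\psi_{\tau(n)}$. Thus, if we define cycles $\zeta_1,\zeta_2\in S_n$ by
\[
\zeta_1 = \begin{pmatrix} j & \alpha_1 & \cdots & \alpha_{l-1} & \alpha_l\\ \alpha_1 & \alpha_2 & \cdots & \alpha_l & j\end{pmatrix},\qquad
\zeta_2 = \begin{pmatrix} \gamma_1 & \gamma_2 & \cdots & \gamma_m & k\\ k & \gamma_1 & \cdots & \gamma_{m-1} & \gamma_m\end{pmatrix},
\]
the permutation $\eta$ is written as
\[
\eta = \pi^{-1}\zeta_1\zeta_2\tau. \tag{B.18}
\]
On the other hand, Lemma B.5(ii) ensures that
\[
\psi_{\alpha_1}\cdots\psi_{\alpha_l}\psi_j = (-1)^l\psi_j\psi_{\alpha_1}\cdots\psi_{\alpha_l},\qquad \psi_k\psi_{\gamma_1}\cdots\psi_{\gamma_m} = (-1)^m\psi_{\gamma_1}\cdots\psi_{\gamma_m}\psi_k. \tag{B.19}
\]
By (B.16)–(B.19), we see that
\[
\begin{aligned}
T_2(\psi_{\pi(1)}\cdots\psi_{\pi(n)}) &= \mathrm{sgn}(\pi^{-1}\zeta_1\zeta_2\tau)\,\psi_{\zeta_1(\zeta_2(\tau(1)))}\cdots\psi_{\zeta_1(\zeta_2(\tau(n)))}\\
&= (-1)^{1+l+m}\,\mathrm{sgn}(\tau)\,(-1)^{l+m}\,\psi_{\tau(1)}\cdots\psi_{\tau(n)} = -T_2(\psi_1\cdots\psi_n).
\end{aligned} \tag{B.20}
\]
If $A = \emptyset$ or $B = \emptyset$, by setting $\zeta_1 = \mathrm{Id}$ and $l = 0$ or $\zeta_2 = \mathrm{Id}$ and $m = 0$, respectively, we see that the equalities (B.18) and (B.20) hold true.

Next consider the case that $[\psi_j] = [\psi_k]$. Let $\tilde A\subset\{j+1,\dots,k-1\}$ be such that $[\psi_j] = [\psi_q]$ for any $q\in\tilde A$ and $[\psi_j]\neq[\psi_q]$ for any $q\in\{j+1,\dots,k-1\}\setminus\tilde A$.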
If $\tilde A\neq\emptyset$, we write $\tilde A$ as $\tilde A = \{q_1,\dots,q_r\}$ with $j+1\le q_1<\cdots<q_r\le k-1$. By the definition of $T_2$ the product $\psi_{\pi(\eta(1))}\cdots\psi_{\pi(\eta(n))}$ is obtained by replacing $\psi_j\psi_{q_1}\cdots\psi_{q_r}\psi_k$ by $\psi_k\psi_{q_1}\cdots\psi_{q_r}\psi_j$ in the product $\psi_{\tau(1)}\cdots\psi_{\tau(n)}$. Thus, the permutation $\eta$ satisfies the equality
\[
\eta = \tau. \tag{B.21}
\]
By Lemma B.5(ii), the following equality holds:
\[
\psi_k\psi_{q_1}\cdots\psi_{q_r}\psi_j = -\psi_j\psi_{q_1}\cdots\psi_{q_r}\psi_k. \tag{B.22}
\]
By combining (B.16) and (B.17) with (B.21) and (B.22), we have
\[
T_2(\psi_{\pi(1)}\cdots\psi_{\pi(n)}) = \mathrm{sgn}(\tau)\,\psi_{\pi(\tau(1))}\cdots\psi_{\pi(\tau(n))} = -\mathrm{sgn}(\tau)\,\psi_{\tau(1)}\cdots\psi_{\tau(n)} = -T_2(\psi_1\cdots\psi_n). \tag{B.23}
\]
By repeating the same argument as above without the term $\psi_{q_1}\cdots\psi_{q_r}$ we can prove the equalities (B.23) for the case that $\tilde A = \emptyset$, which completes the proof.

Lemma B.8. Assume that $n\in\mathbb{N}$ is even and $[\psi_1]\succeq[\psi_j]$ $(\forall j\in\{2,3,\dots,n\})$. The following equality holds:
\[
\langle T_2(\psi_1\cdots\psi_n)\rangle_0 = \sum_{j=2}^{n}(-1)^j\langle T_2(\psi_1\psi_j)\rangle_0\,\langle T_2(\psi_2\cdots\widehat{\psi_j}\cdots\psi_n)\rangle_0. \tag{B.24}
\]
Proof. For $n = 2$, the equality (B.24) is trivial. Assume that $n\ge 4$. Let $\tau\in S_n$ be the unique permutation associated with the definition of $T_2(\psi_1\cdots\psi_n)$:
\[
T_2(\psi_1\cdots\psi_n) = \mathrm{sgn}(\tau)\,\psi_{\tau(1)}\cdots\psi_{\tau(n)}.
\]
By assumption $\tau(1) = 1$. Moreover, Lemma B.6 ensures that
\[
\langle T_2(\psi_1\cdots\psi_n)\rangle_0 = \mathrm{sgn}(\tau)\sum_{j=2}^{n}(-1)^j\langle\psi_1\psi_{\tau(j)}\rangle_0\,\langle\psi_{\tau(2)}\cdots\widehat{\psi_{\tau(j)}}\cdots\psi_{\tau(n)}\rangle_0. \tag{B.25}
\]
Let us fix $j\in\{2,3,\dots,n\}$. Let $\pi\in S_n$ be such that
\[
(\pi(1),\pi(2),\pi(3),\dots,\pi(n)) = (1,\tau(j),\tau(2),\dots,\widehat{\tau(j)},\dots,\tau(n)), \tag{B.26}
\]
where "$\widehat{\tau(j)}$" stands for the omission of the number $\tau(j)$ from the row $(\tau(2),\tau(3),\dots,\tau(n))$. Then, we have
\[
\mathrm{sgn}(\pi) = (-1)^{j-2}\,\mathrm{sgn}(\tau) = (-1)^j\,\mathrm{sgn}(\tau). \tag{B.27}
\]
On the other hand, we can write $\{1,\dots,n\}\setminus\{1,\tau(j)\} = \{l_1,\dots,l_{n-2}\}$ with $2\le l_1<l_2<\cdots<l_{n-2}\le n$. There exists $\eta\in S_{n-2}$ such that
\[
(l_{\eta(1)}, l_{\eta(2)},\dots,l_{\eta(n-2)}) = (\tau(2),\tau(3),\dots,\widehat{\tau(j)},\dots,\tau(n)). \tag{B.28}
\]
By (B.26) and (B.28), we obtain
\[
(\pi(1),\pi(2),\pi(3),\dots,\pi(n)) = (1,\tau(j),l_{\eta(1)},l_{\eta(2)},\dots,l_{\eta(n-2)}),
\]
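which implies that
\[
\mathrm{sgn}(\pi) = (-1)^{\tau(j)-2}\,\mathrm{sgn}(\eta) = (-1)^{\tau(j)}\,\mathrm{sgn}(\eta). \tag{B.29}
\]
By (B.27) and (B.29), we have
\[
(-1)^j\,\mathrm{sgn}(\tau) = (-1)^{\tau(j)}\,\mathrm{sgn}(\eta). \tag{B.30}
\]
Note the equalities that
\[
\langle T_2(\psi_1\psi_{\tau(j)})\rangle_0 = \langle\psi_1\psi_{\tau(j)}\rangle_0,\qquad
\langle T_2(\psi_2\cdots\widehat{\psi_{\tau(j)}}\cdots\psi_n)\rangle_0 = \mathrm{sgn}(\eta)\,\langle\psi_{\tau(2)}\cdots\widehat{\psi_{\tau(j)}}\cdots\psi_{\tau(n)}\rangle_0. \tag{B.31}
\]
By substituting (B.30) and (B.31) into (B.25), we see that
\[
\begin{aligned}
\langle T_2(\psi_1\cdots\psi_n)\rangle_0 &= \sum_{j=2}^{n}(-1)^{\tau(j)}\langle T_2(\psi_1\psi_{\tau(j)})\rangle_0\,\langle T_2(\psi_2\cdots\widehat{\psi_{\tau(j)}}\cdots\psi_n)\rangle_0\\
&= \sum_{j=2}^{n}(-1)^{j}\langle T_2(\psi_1\psi_j)\rangle_0\,\langle T_2(\psi_2\cdots\widehat{\psi_j}\cdots\psi_n)\rangle_0,
\end{aligned}
\]
which is (B.24).

Lemma B.9. For all $x_j, y_j\in\Gamma$, $\sigma_j,\tau_j\in\{\uparrow,\downarrow\}$, $s_j, t_j\in\mathbb{R}$ $(j = 1,2,\dots,n)$,
\[
\langle T_2(\psi^*_{x_1\sigma_1}(s_1)\psi_{y_1\tau_1}(t_1)\cdots\psi^*_{x_n\sigma_n}(s_n)\psi_{y_n\tau_n}(t_n))\rangle_0 = \det\big(\langle T_2(\psi^*_{x_j\sigma_j}(s_j)\psi_{y_k\tau_k}(t_k))\rangle_0\big)_{1\le j,k\le n}. \tag{B.32}
\]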
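Proof. We show (B.32) by induction on $n$. The equality (B.32) is obviously true when $n = 1$. Let us assume that (B.32) is true for $n-1$ $(n\ge 2)$. Lemma B.7 implies that for all $\pi\in S_n$
\[
\begin{aligned}
&\langle T_2(\psi^*_{x_1\sigma_1}(s_1)\psi_{y_1\tau_1}(t_1)\cdots\psi^*_{x_n\sigma_n}(s_n)\psi_{y_n\tau_n}(t_n))\rangle_0\\
&\quad= \langle T_2(\psi^*_{x_{\pi(1)}\sigma_{\pi(1)}}(s_{\pi(1)})\psi_{y_{\pi(1)}\tau_{\pi(1)}}(t_{\pi(1)})\cdots\psi^*_{x_{\pi(n)}\sigma_{\pi(n)}}(s_{\pi(n)})\psi_{y_{\pi(n)}\tau_{\pi(n)}}(t_{\pi(n)}))\rangle_0\\
&\quad= (-1)^n\langle T_2(\psi_{y_{\pi(1)}\tau_{\pi(1)}}(t_{\pi(1)})\psi^*_{x_{\pi(1)}\sigma_{\pi(1)}}(s_{\pi(1)})\cdots\psi_{y_{\pi(n)}\tau_{\pi(n)}}(t_{\pi(n)})\psi^*_{x_{\pi(n)}\sigma_{\pi(n)}}(s_{\pi(n)}))\rangle_0,
\end{aligned} \tag{B.33}
\]
and
\[
\begin{aligned}
\det\big(\langle T_2(\psi^*_{x_j\sigma_j}(s_j)\psi_{y_k\tau_k}(t_k))\rangle_0\big)_{1\le j,k\le n}
&= \det\big(\langle T_2(\psi^*_{x_{\pi(j)}\sigma_{\pi(j)}}(s_{\pi(j)})\psi_{y_{\pi(k)}\tau_{\pi(k)}}(t_{\pi(k)}))\rangle_0\big)_{1\le j,k\le n}\\
&= (-1)^n\det\big(\langle T_2(\psi_{y_{\pi(k)}\tau_{\pi(k)}}(t_{\pi(k)})\psi^*_{x_{\pi(j)}\sigma_{\pi(j)}}(s_{\pi(j)}))\rangle_0\big)_{1\le j,k\le n}.
\end{aligned} \tag{B.34}
\]
The equalities (B.33) and (B.34) enable us to assume that $[\psi^*_{x_1\sigma_1}(s_1)]\succeq[\psi^*_{x_j\sigma_j}(s_j)], [\psi_{y_j\tau_j}(t_j)]$ $(\forall j\in\{1,\dots,n\})$ without losing generality in the following argument.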
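By using Lemmas B.7 and B.8, the hypothesis of induction, and the fact that $\langle\psi^*_{x\sigma}(t)\psi^*_{x'\sigma'}(t')\rangle_0 = 0$, we have
\[
\begin{aligned}
&\langle T_2(\psi^*_{x_1\sigma_1}(s_1)\psi_{y_1\tau_1}(t_1)\cdots\psi^*_{x_n\sigma_n}(s_n)\psi_{y_n\tau_n}(t_n))\rangle_0\\
&\quad= \sum_{j=1}^{n}\langle T_2(\psi^*_{x_1\sigma_1}(s_1)\psi_{y_j\tau_j}(t_j))\rangle_0\\
&\qquad\cdot\langle T_2(\psi_{y_1\tau_1}(t_1)\psi^*_{x_2\sigma_2}(s_2)\psi_{y_2\tau_2}(t_2)\cdots\psi^*_{x_j\sigma_j}(s_j)\widehat{\psi_{y_j\tau_j}(t_j)}\cdots\psi^*_{x_n\sigma_n}(s_n)\psi_{y_n\tau_n}(t_n))\rangle_0\\
&\quad= \sum_{j=1}^{n}(-1)^{j-1}\langle T_2(\psi^*_{x_1\sigma_1}(s_1)\psi_{y_j\tau_j}(t_j))\rangle_0\,\det\big(\langle T_2(\psi^*_{x_l\sigma_l}(s_l)\psi_{y_k\tau_k}(t_k))\rangle_0\big)_{\substack{1\le l,k\le n\\ l\neq 1,\,k\neq j}}\\
&\quad= \det\big(\langle T_2(\psi^*_{x_j\sigma_j}(s_j)\psi_{y_k\tau_k}(t_k))\rangle_0\big)_{1\le j,k\le n},
\end{aligned}
\]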
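which concludes the proof.

Lemma B.10. For all $\mathbf{x},\mathbf{y}\in\Gamma$, $\sigma,\tau\in\{\uparrow,\downarrow\}$, $x, y\in\mathbb{R}$
\[
\langle T_2(\psi^*_{\mathbf{x}\sigma}(x)\psi_{\mathbf{y}\tau}(y))\rangle_0 = C(\mathbf{x}\sigma x, \mathbf{y}\tau y), \tag{B.35}
\]
where $C(\mathbf{x}\sigma x, \mathbf{y}\tau y)$ is defined in (2.7).

Proof. By the definition of $T_2$, we have
\[
\langle T_2(\psi^*_{\mathbf{x}\sigma}(x)\psi_{\mathbf{y}\tau}(y))\rangle_0 = \langle\psi^*_{\mathbf{x}\sigma}(x)\psi_{\mathbf{y}\tau}(y)\rangle_0\,1_{x-y\ge 0} - \langle\psi_{\mathbf{y}\tau}(y)\psi^*_{\mathbf{x}\sigma}(x)\rangle_0\,1_{x-y<0}. \tag{B.36}
\]
By Lemma B.5(ii), (B.6) and (B.13), we see that
\[
\begin{aligned}
\langle\psi^*_{\mathbf{x}\sigma}(x)\psi_{\mathbf{y}\tau}(y)\rangle_0
&= \sum_{\mathbf{x}'\in\Gamma}\sum_{\sigma'\in\{\uparrow,\downarrow\}}(I + e^{\beta F^t})^{-1}(\mathbf{x}\sigma, \mathbf{x}'\sigma')\,\{\psi^*_{\mathbf{x}'\sigma'}(x), \psi_{\mathbf{y}\tau}(y)\}\\
&= \sum_{\mathbf{x}'\in\Gamma}\sum_{\sigma'\in\{\uparrow,\downarrow\}}(I + e^{\beta F^t})^{-1}(\mathbf{x}\sigma, \mathbf{x}'\sigma')\,e^{(x-y)F}(\mathbf{y}\tau, \mathbf{x}'\sigma')\\
&= \big((I + e^{\beta F^t})^{-1}e^{(x-y)F^t}\big)(\mathbf{x}\sigma, \mathbf{y}\tau)\\
&= \big(M^t(I + e^{\beta\overline{M}F^tM^t})^{-1}e^{(x-y)\overline{M}F^tM^t}\overline{M}\big)(\mathbf{x}\sigma, \mathbf{y}\tau)\\
&= \frac{\delta_{\sigma,\tau}}{L^d}\sum_{\mathbf{k},\hat{\mathbf{k}}\in\Gamma^*}\delta_{\mathbf{k},\hat{\mathbf{k}}}\,e^{-i\langle\mathbf{x},\mathbf{k}\rangle}e^{i\langle\mathbf{y},\hat{\mathbf{k}}\rangle}\frac{e^{(x-y)E_{\hat{\mathbf{k}}}}}{1 + e^{\beta E_{\hat{\mathbf{k}}}}}
= \frac{\delta_{\sigma,\tau}}{L^d}\sum_{\mathbf{k}\in\Gamma^*}e^{i\langle\mathbf{k},\mathbf{y}-\mathbf{x}\rangle}\frac{e^{-(y-x)E_{\mathbf{k}}}}{1 + e^{\beta E_{\mathbf{k}}}},
\end{aligned} \tag{B.37}
\]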
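\[
\begin{aligned}
\langle\psi_{\mathbf{y}\tau}(y)\psi^*_{\mathbf{x}\sigma}(x)\rangle_0
&= \sum_{\mathbf{y}'\in\Gamma}\sum_{\tau'\in\{\uparrow,\downarrow\}}(I + e^{-\beta F})^{-1}(\mathbf{y}\tau, \mathbf{y}'\tau')\,\{\psi_{\mathbf{y}'\tau'}(y), \psi^*_{\mathbf{x}\sigma}(x)\}\\
&= \sum_{\mathbf{y}'\in\Gamma}\sum_{\tau'\in\{\uparrow,\downarrow\}}(I + e^{-\beta F})^{-1}(\mathbf{y}\tau, \mathbf{y}'\tau')\,e^{(x-y)F}(\mathbf{y}'\tau', \mathbf{x}\sigma)\\
&= \big((I + e^{-\beta F})^{-1}e^{(x-y)F}\big)(\mathbf{y}\tau, \mathbf{x}\sigma)\\
&= \big(M^*(I + e^{-\beta MFM^*})^{-1}e^{(x-y)MFM^*}M\big)(\mathbf{y}\tau, \mathbf{x}\sigma)\\
&= \frac{\delta_{\sigma,\tau}}{L^d}\sum_{\mathbf{k},\hat{\mathbf{k}}\in\Gamma^*}\delta_{\mathbf{k},\hat{\mathbf{k}}}\,e^{i\langle\mathbf{y},\mathbf{k}\rangle}e^{-i\langle\mathbf{x},\hat{\mathbf{k}}\rangle}\frac{e^{(x-y)E_{\hat{\mathbf{k}}}}}{1 + e^{-\beta E_{\hat{\mathbf{k}}}}}
= \frac{\delta_{\sigma,\tau}}{L^d}\sum_{\mathbf{k}\in\Gamma^*}e^{i\langle\mathbf{k},\mathbf{y}-\mathbf{x}\rangle}\frac{e^{-(y-x)E_{\mathbf{k}}}}{1 + e^{-\beta E_{\mathbf{k}}}}.
\end{aligned} \tag{B.38}
\]
By combining (B.37) and (B.38) with (B.36), we obtain (B.35).

We have prepared all the lemmas necessary to prove Proposition 2.4.

Proof of Proposition 2.4. By applying Lemma B.3 for $t_1 = 0$, $t_2 = \beta$, we have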
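\[
\begin{aligned}
e^{-\beta H_\lambda} &= e^{-\beta H_0} + e^{-\beta H_0}\sum_{n=1}^{\infty}\frac{(-1)^n}{n!}\int_{[0,\beta]^n} ds_1\cdots ds_n\, T_1(V_\lambda(s_1)\cdots V_\lambda(s_n))\\
&= e^{-\beta H_0} + e^{-\beta H_0}\sum_{n=1}^{\infty}(-1)^n\int_{[0,\beta]^n} ds_1\cdots ds_n\, 1_{s_1>\cdots>s_n}\, V_\lambda(s_1)\cdots V_\lambda(s_n).
\end{aligned} \tag{B.39}
\]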
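By (B.39), Lemma B.5(ii), the definition of $T_2$ and Lemma B.7, we see that
\[
\begin{aligned}
\frac{\mathrm{Tr}\, e^{-\beta H_\lambda}}{\mathrm{Tr}\, e^{-\beta H_0}}
&= 1 + \sum_{n=1}^{\infty}\prod_{j=1}^{n}\Big(-\sum_{\mathbf{x}_j,\mathbf{y}_j,\mathbf{z}_j,\mathbf{w}_j\in\Gamma}\int_0^\beta ds_j\, U_{\mathbf{x}_j,\mathbf{y}_j,\mathbf{z}_j,\mathbf{w}_j}\Big)1_{s_1>\cdots>s_n}\\
&\quad\cdot\langle\psi^*_{\mathbf{x}_1\uparrow}(s_1)\psi^*_{\mathbf{y}_1\downarrow}(s_1)\psi_{\mathbf{w}_1\downarrow}(s_1)\psi_{\mathbf{z}_1\uparrow}(s_1)\cdots\psi^*_{\mathbf{x}_n\uparrow}(s_n)\psi^*_{\mathbf{y}_n\downarrow}(s_n)\psi_{\mathbf{w}_n\downarrow}(s_n)\psi_{\mathbf{z}_n\uparrow}(s_n)\rangle_0\\
&= 1 + \sum_{n=1}^{\infty}\prod_{j=1}^{n}\Big(-\sum_{\mathbf{x}_{2j-1},\mathbf{x}_{2j},\mathbf{y}_{2j-1},\mathbf{y}_{2j}\in\Gamma}\int_0^\beta dx_{2j-1}\, U_{\mathbf{x}_{2j-1},\mathbf{x}_{2j},\mathbf{y}_{2j-1},\mathbf{y}_{2j}}\Big)1_{x_1>x_3>\cdots>x_{2n-1}}\\
&\quad\cdot(-1)^n\langle\psi^*_{\mathbf{x}_1\uparrow}(x_1)\psi^*_{\mathbf{x}_2\downarrow}(x_1)\psi_{\mathbf{y}_1\uparrow}(x_1)\psi_{\mathbf{y}_2\downarrow}(x_1)\cdots\\
&\qquad\cdot\psi^*_{\mathbf{x}_{2n-1}\uparrow}(x_{2n-1})\psi^*_{\mathbf{x}_{2n}\downarrow}(x_{2n-1})\psi_{\mathbf{y}_{2n-1}\uparrow}(x_{2n-1})\psi_{\mathbf{y}_{2n}\downarrow}(x_{2n-1})\rangle_0\\
&= 1 + \sum_{n=1}^{\infty}\prod_{j=1}^{n}\Big(-\sum_{\mathbf{x}_{2j-1},\mathbf{x}_{2j},\mathbf{y}_{2j-1},\mathbf{y}_{2j}\in\Gamma}\int_0^\beta dx_{2j-1}\, U_{\mathbf{x}_{2j-1},\mathbf{x}_{2j},\mathbf{y}_{2j-1},\mathbf{y}_{2j}}\Big)1_{x_1>x_3>\cdots>x_{2n-1}}\\
&\quad\cdot\langle T_2(\psi^*_{\mathbf{x}_1\uparrow}(x_1)\psi_{\mathbf{y}_1\uparrow}(x_1)\psi^*_{\mathbf{x}_2\downarrow}(x_1)\psi_{\mathbf{y}_2\downarrow}(x_1)\cdots\\
&\qquad\cdot\psi^*_{\mathbf{x}_{2n-1}\uparrow}(x_{2n-1})\psi_{\mathbf{y}_{2n-1}\uparrow}(x_{2n-1})\psi^*_{\mathbf{x}_{2n}\downarrow}(x_{2n-1})\psi_{\mathbf{y}_{2n}\downarrow}(x_{2n-1}))\rangle_0\\
&= 1 + \sum_{n=1}^{\infty}\frac{1}{n!}\prod_{j=1}^{n}\Big(-\sum_{\mathbf{x}_{2j-1},\mathbf{x}_{2j},\mathbf{y}_{2j-1},\mathbf{y}_{2j}\in\Gamma}\ \sum_{\sigma_{2j-1},\sigma_{2j}\in\{\uparrow,\downarrow\}}\int_0^\beta dx_{2j-1}\,\delta_{\sigma_{2j-1},\uparrow}\,\delta_{\sigma_{2j},\downarrow}\,U_{\mathbf{x}_{2j-1},\mathbf{x}_{2j},\mathbf{y}_{2j-1},\mathbf{y}_{2j}}\Big)\\
&\quad\cdot\langle T_2(\psi^*_{\mathbf{x}_1\sigma_1}(x_1)\psi_{\mathbf{y}_1\sigma_1}(x_1)\psi^*_{\mathbf{x}_2\sigma_2}(x_2)\psi_{\mathbf{y}_2\sigma_2}(x_2)\cdots\psi^*_{\mathbf{x}_{2n-1}\sigma_{2n-1}}(x_{2n-1})\\
&\qquad\cdot\psi_{\mathbf{y}_{2n-1}\sigma_{2n-1}}(x_{2n-1})\psi^*_{\mathbf{x}_{2n}\sigma_{2n}}(x_{2n})\psi_{\mathbf{y}_{2n}\sigma_{2n}}(x_{2n}))\rangle_0\Big|_{\substack{x_{2j}=x_{2j-1}\\ \forall j\in\{1,\dots,n\}}}.
\end{aligned}
\]
Then by using Lemmas B.9 and B.10 we obtain the series (2.6).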
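Appendix C. Diagonalization of the Covariance Matrix

In this part of appendices we diagonalize the covariance matrix $(C_h(\mathbf{x}\sigma x, \mathbf{y}\tau y))_{(\mathbf{x},\sigma,x),(\mathbf{y},\tau,y)\in\Gamma\times\{\uparrow,\downarrow\}\times[0,\beta)_h}$ and calculate its determinant. The fact that the determinant of the covariance matrix is non-zero, which is to be proved in Proposition C.7, verifies the well-posedness of the Grassmann Gaussian integral defined in Definition 3.5.

For convenience of calculation we assume that $h\in 2\mathbb{N}/\beta$. Define the sets $W_h$ and $M_h$ by
\[
W_h := \Big\{\omega\in\frac{\pi}{\beta}\mathbb{Z}\ \Big|\ -\pi h\le\omega<\pi h\Big\},\qquad
M_h := \Big\{\omega\in\frac{\pi}{\beta}(2\mathbb{Z}+1)\ \Big|\ -\pi h<\omega<\pi h\Big\}.
\]
Note that $\sharp W_h = 2\beta h$ and $\sharp M_h = \beta h$. The assumption that $h\in 2\mathbb{N}/\beta$ ensures the equality
\[
M_h = W_h\setminus\frac{2\pi\mathbb{Z}}{\beta}. \tag{C.1}
\]
The set $M_h$ is seen as a set of the Matsubara frequencies with cut-off. For $f\in L^2([-\beta,\beta)_h;\mathbb{C})$ we define $\hat f\in L^2(W_h;\mathbb{C})$ by
\[
\hat f(\omega) := \frac{1}{h}\sum_{t\in[-\beta,\beta)_h}e^{-i\omega t}f(t).
\]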
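Lemma C.1. For any $f\in L^2([-\beta,\beta)_h;\mathbb{C})$
\[
f(t) = \frac{1}{2\beta}\sum_{\omega\in W_h}e^{i\omega t}\hat f(\omega),\quad\forall t\in[-\beta,\beta)_h.
\]

Proof. If $t = -\beta + s/h$ with $s\in\{0,\dots,2\beta h-1\}$,
\[
\begin{aligned}
\frac{1}{2\beta}\sum_{\omega\in W_h}e^{i\omega t}\hat f(\omega)
&= \frac{1}{2\beta h}\sum_{\omega\in W_h}\sum_{u\in[-\beta,\beta)_h}e^{i\omega t}e^{-i\omega u}f(u)\\
&= \frac{1}{2\beta h}\sum_{m=0}^{2\beta h-1}\sum_{l=0}^{2\beta h-1}e^{i(-\pi h+\pi m/\beta)(s/h-l/h)}f\Big(-\beta+\frac{l}{h}\Big)\\
&= \frac{1}{2\beta h}\sum_{m=0}^{2\beta h-1}\sum_{l=0}^{2\beta h-1}e^{-i\pi(s-l)}e^{i\pi m(s-l)/(\beta h)}f\Big(-\beta+\frac{l}{h}\Big)\\
&= \sum_{l=0}^{2\beta h-1}e^{-i\pi(s-l)}\delta_{s,l}\,f\Big(-\beta+\frac{l}{h}\Big) = f\Big(-\beta+\frac{s}{h}\Big) = f(t).
\end{aligned}
\]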
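The inversion formula of Lemma C.1 can be checked numerically on a small grid. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and the values $\beta = 2$, $h = 4$ used in the check are arbitrary choices satisfying $h\in 2\mathbb{N}/\beta$.

```python
import cmath
import math
import random


def time_grid(beta, h):
    """Points of [-beta, beta)_h, i.e. -beta + l/h for l = 0, ..., 2*beta*h - 1."""
    n = round(2 * beta * h)
    return [-beta + l / h for l in range(n)]


def frequencies_w(beta, h):
    """The set W_h: multiples of pi/beta lying in [-pi*h, pi*h)."""
    bh = round(beta * h)
    return [math.pi * m / beta for m in range(-bh, bh)]


def forward(values, grid, freqs, h):
    """hat f(omega) = (1/h) * sum_t exp(-i*omega*t) f(t)."""
    return [sum(cmath.exp(-1j * w * t) * v for t, v in zip(grid, values)) / h
            for w in freqs]


def backward(coeffs, grid, freqs, beta):
    """f(t) = (1/(2*beta)) * sum_omega exp(i*omega*t) hat f(omega)."""
    return [sum(cmath.exp(1j * w * t) * c for w, c in zip(freqs, coeffs)) / (2 * beta)
            for t in grid]


def inversion_error(beta, h, seed=0):
    """Largest deviation between a random f and its reconstruction."""
    rng = random.Random(seed)
    grid = time_grid(beta, h)
    freqs = frequencies_w(beta, h)
    values = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in grid]
    recovered = backward(forward(values, grid, freqs, h), grid, freqs, beta)
    return max(abs(a - b) for a, b in zip(values, recovered))
```

Calling `inversion_error(2.0, 4)` should return a value at the level of floating-point rounding, since the identity is exact on the grid.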
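Lemma C.2. If $f\in L^2([-\beta,\beta)_h;\mathbb{C})$ satisfies $f(t) = -f(t+\beta)$ for all $t\in[-\beta,\beta)_h$ with $t<0$,
\[
f(t) = \frac{1}{2\beta}\sum_{\omega\in M_h}e^{i\omega t}\hat f(\omega),\quad\forall t\in[-\beta,\beta)_h. \tag{C.2}
\]

Proof. Take any $\omega\in W_h\cap 2\pi\mathbb{Z}/\beta$. By assumption, we see that
\[
\begin{aligned}
\hat f(\omega) &= \frac{1}{h}\sum_{t\in[-\beta,\beta)_h\setminus[0,\beta)_h}e^{-i\omega t}f(t) + \frac{1}{h}\sum_{t\in[0,\beta)_h}e^{-i\omega t}f(t)\\
&= -\frac{1}{h}\sum_{t\in[-\beta,\beta)_h\setminus[0,\beta)_h}e^{-i\omega t}f(t+\beta) + \frac{1}{h}\sum_{t\in[0,\beta)_h}e^{-i\omega t}f(t)\\
&= -\frac{1}{h}\sum_{t\in[0,\beta)_h}e^{-i\omega(t-\beta)}f(t) + \frac{1}{h}\sum_{t\in[0,\beta)_h}e^{-i\omega t}f(t) = 0.
\end{aligned} \tag{C.3}
\]
Then, by (C.1), (C.3) and Lemma C.1, we obtain (C.2).

Let us define $g_{\mathbf{k}}\in L^2([-\beta,\beta)_h;\mathbb{C})$ $(\mathbf{k}\in\Gamma^*)$ by
\[
g_{\mathbf{k}}(t) := e^{tE_{\mathbf{k}}}\Big(\frac{1_{t\ge 0}}{1 + e^{\beta E_{\mathbf{k}}}} - \frac{1_{t<0}}{1 + e^{-\beta E_{\mathbf{k}}}}\Big).
\]
Note that the function $g_{\mathbf{k}}$ satisfies the anti-periodic property $g_{\mathbf{k}}(t) = -g_{\mathbf{k}}(t+\beta)$ for all $t\in[-\beta,\beta)_h$ with $t<0$.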
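As a numerical illustration of the function $g_{\mathbf{k}}$ just defined, the sketch below checks its anti-periodicity and compares it with the sum over $M_h$ stated in Lemma C.3 below. This is only a sanity check under assumptions: the energy value passed in stands for a single $E_{\mathbf{k}}$ and is an arbitrary number, and all function names are hypothetical.

```python
import cmath
import math


def g(t, energy, beta):
    """g_k(t) for one fixed energy E_k, following the definition in the text."""
    if t >= 0:
        return math.exp(t * energy) / (1 + math.exp(beta * energy))
    return -math.exp(t * energy) / (1 + math.exp(-beta * energy))


def matsubara(beta, h):
    """The set M_h: odd multiples of pi/beta lying in (-pi*h, pi*h)."""
    bh = round(beta * h)  # even, because h is assumed to lie in 2N/beta
    return [math.pi * n / beta for n in range(-bh + 1, bh, 2)]


def g_from_series(t, energy, beta, h):
    """Right-hand side of (C.4): a finite sum over the frequencies in M_h."""
    total = sum(cmath.exp(1j * w * t) / (h * (1 - cmath.exp((-1j * w + energy) / h)))
                for w in matsubara(beta, h))
    return total / beta


def check_g(beta=2.0, h=4, energy=0.7):
    """Return (anti-periodicity defect, defect of the series representation)."""
    bh = round(beta * h)
    grid = [-beta + l / h for l in range(2 * bh)]
    anti = max(abs(g(t, energy, beta) + g(t + beta, energy, beta))
               for t in grid if t < 0)
    series = max(abs(g_from_series(t, energy, beta, h) - g(t, energy, beta))
                 for t in grid)
    return anti, series
```

Both numbers returned by `check_g()` should be of the order of rounding error for any real energy, because (C.4) is an exact identity on the grid $[-\beta,\beta)_h$.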
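Lemma C.3. For all $t\in[-\beta,\beta)_h$
\[
g_{\mathbf{k}}(t) = \frac{1}{\beta}\sum_{\omega\in M_h}\frac{e^{i\omega t}}{h(1 - e^{-i\omega/h+E_{\mathbf{k}}/h})}. \tag{C.4}
\]

Proof. By Lemma C.2,
\[
g_{\mathbf{k}}(t) = \frac{1}{2\beta}\sum_{\omega\in M_h}e^{i\omega t}\hat g_{\mathbf{k}}(\omega). \tag{C.5}
\]
Moreover, we observe that for $\omega\in M_h$
\[
\begin{aligned}
\hat g_{\mathbf{k}}(\omega) &= -\frac{1}{h}\sum_{t\in[-\beta,\beta)_h\setminus[0,\beta)_h}e^{-i\omega t}\frac{e^{tE_{\mathbf{k}}}}{1 + e^{-\beta E_{\mathbf{k}}}} + \frac{1}{h}\sum_{t\in[0,\beta)_h}e^{-i\omega t}\frac{e^{tE_{\mathbf{k}}}}{1 + e^{\beta E_{\mathbf{k}}}}\\
&= -\frac{1}{h}\sum_{t\in[0,\beta)_h}e^{-i\omega(t-\beta)}\frac{e^{tE_{\mathbf{k}}}}{1 + e^{\beta E_{\mathbf{k}}}} + \frac{1}{h}\sum_{t\in[0,\beta)_h}e^{-i\omega t}\frac{e^{tE_{\mathbf{k}}}}{1 + e^{\beta E_{\mathbf{k}}}}\\
&= \frac{2}{h}\sum_{t\in[0,\beta)_h}e^{-i\omega t}\frac{e^{tE_{\mathbf{k}}}}{1 + e^{\beta E_{\mathbf{k}}}} = \frac{2}{h(1 + e^{\beta E_{\mathbf{k}}})}\sum_{t\in[0,\beta)_h}e^{t(-i\omega+E_{\mathbf{k}})}\\
&= \frac{2}{h(1 - e^{-i\omega/h+E_{\mathbf{k}}/h})}.
\end{aligned} \tag{C.6}
\]
The equality (C.4) follows from (C.5) and (C.6).

By substituting the characterization of $g_{\mathbf{k}}$ given in Lemma C.3 into
\[
C_h(\mathbf{x}\sigma x, \mathbf{y}\tau y) = \frac{\delta_{\sigma,\tau}}{L^d}\sum_{\mathbf{k}\in\Gamma^*}e^{i\langle\mathbf{k},\mathbf{y}-\mathbf{x}\rangle}g_{\mathbf{k}}(x-y),
\]
we obtain:

Lemma C.4. For any $(\mathbf{x},\sigma,x), (\mathbf{y},\tau,y)\in\Gamma\times\{\uparrow,\downarrow\}\times[0,\beta)_h$,
\[
C_h(\mathbf{x}\sigma x, \mathbf{y}\tau y) = \frac{\delta_{\sigma,\tau}}{\beta L^d}\sum_{\mathbf{k}\in\Gamma^*}\sum_{\omega\in M_h}\frac{e^{i\langle\mathbf{k},\mathbf{y}-\mathbf{x}\rangle}e^{-i\omega(y-x)}}{h(1 - e^{-i\omega/h+E_{\mathbf{k}}/h})}. \tag{C.7}
\]

In order to diagonalize $C_h$, we define a matrix $Y = (Y(\mathbf{k}\tau\omega, \mathbf{x}\sigma x))_{(\mathbf{k},\tau,\omega)\in\Gamma^*\times\{\uparrow,\downarrow\}\times M_h,(\mathbf{x},\sigma,x)\in\Gamma\times\{\uparrow,\downarrow\}\times[0,\beta)_h}$ by
\[
Y(\mathbf{k}\tau\omega, \mathbf{x}\sigma x) := \frac{\delta_{\tau,\sigma}}{\sqrt{\beta hL^d}}\,e^{i\langle\mathbf{k},\mathbf{x}\rangle}e^{-i\omega x}.
\]

Lemma C.5. The matrix $Y$ is unitary.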
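Proof. Assume that $\omega = -\pi h + \pi/\beta + 2\pi m/\beta$, $\hat\omega = -\pi h + \pi/\beta + 2\pi\hat m/\beta$ with $m,\hat m\in\{0,1,\dots,\beta h-1\}$. Then we observe that
\[
\begin{aligned}
YY^*(\mathbf{k}\tau\omega, \hat{\mathbf{k}}\hat\tau\hat\omega) &= \frac{\delta_{\tau,\hat\tau}}{\beta hL^d}\sum_{\mathbf{x}\in\Gamma}\sum_{x\in[0,\beta)_h}e^{i\langle\mathbf{x},\mathbf{k}-\hat{\mathbf{k}}\rangle}e^{-ix(\omega-\hat\omega)}\\
&= \frac{\delta_{\tau,\hat\tau}\delta_{\mathbf{k},\hat{\mathbf{k}}}}{\beta h}\sum_{l=0}^{\beta h-1}e^{-i2\pi l(m-\hat m)/(\beta h)}
= \delta_{\tau,\hat\tau}\delta_{\mathbf{k},\hat{\mathbf{k}}}\delta_{m,\hat m} = \delta_{\tau,\hat\tau}\delta_{\mathbf{k},\hat{\mathbf{k}}}\delta_{\omega,\hat\omega}.
\end{aligned}
\]
Let $x = s/h$, $\hat x = \hat s/h$ with $s,\hat s\in\{0,1,\dots,\beta h-1\}$. Then
\[
\begin{aligned}
Y^*Y(\mathbf{x}\sigma x, \hat{\mathbf{x}}\hat\sigma\hat x) &= \frac{\delta_{\sigma,\hat\sigma}}{\beta hL^d}\sum_{\mathbf{k}\in\Gamma^*}\sum_{\omega\in M_h}e^{-i\langle\mathbf{k},\mathbf{x}-\hat{\mathbf{x}}\rangle}e^{i\omega(x-\hat x)}\\
&= \frac{\delta_{\sigma,\hat\sigma}\delta_{\mathbf{x},\hat{\mathbf{x}}}}{\beta h}\sum_{m=0}^{\beta h-1}e^{i(-\pi h+\pi/\beta+2\pi m/\beta)(s/h-\hat s/h)}\\
&= \frac{\delta_{\sigma,\hat\sigma}\delta_{\mathbf{x},\hat{\mathbf{x}}}}{\beta h}\,e^{i(-\pi h+\pi/\beta)(s/h-\hat s/h)}\sum_{m=0}^{\beta h-1}e^{i2\pi m(s-\hat s)/(\beta h)}\\
&= \delta_{\sigma,\hat\sigma}\delta_{\mathbf{x},\hat{\mathbf{x}}}\delta_{s,\hat s}\,e^{i(-\pi h+\pi/\beta)(s/h-\hat s/h)} = \delta_{\sigma,\hat\sigma}\delta_{\mathbf{x},\hat{\mathbf{x}}}\delta_{x,\hat x}.
\end{aligned}
\]

By using the matrix $Y$ and (C.7) we can diagonalize $C_h$ as follows:

Lemma C.6. For all $(\mathbf{k},\tau,\omega), (\hat{\mathbf{k}},\hat\tau,\hat\omega)\in\Gamma^*\times\{\uparrow,\downarrow\}\times M_h$,
\[
(YC_hY^*)(\mathbf{k}\tau\omega, \hat{\mathbf{k}}\hat\tau\hat\omega) = \delta_{\tau,\hat\tau}\delta_{\mathbf{k},\hat{\mathbf{k}}}\delta_{\omega,\hat\omega}\frac{1}{1 - e^{-i\omega/h+E_{\mathbf{k}}/h}}.
\]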
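Lemmas C.5 and C.6 can be illustrated on a small example. The sketch below is restricted to a single spin sector (the matrices are diagonal in the spin index) and to $d = 1$ with $\Gamma = \{0,\dots,L-1\}$ and $\Gamma^* = \{2\pi j/L\}$. The dispersion $E_k = -2\cos k - \mu$ is a hypothetical stand-in for the $E_{\mathbf{k}}$ of (2.8), which is not reproduced in this appendix, and all function names are illustrative.

```python
import cmath
import math


def g(t, energy, beta):
    """g_k(t) for one fixed energy, as defined before Lemma C.3."""
    if t >= 0:
        return math.exp(t * energy) / (1 + math.exp(beta * energy))
    return -math.exp(t * energy) / (1 + math.exp(-beta * energy))


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]


def dagger(a):
    """Conjugate transpose."""
    return [[z.conjugate() for z in col] for col in zip(*a)]


def diagonalization_errors(L=3, beta=1.0, h=4, mu=0.3):
    """Return (max |YY* - I|, max |Y C_h Y* - diag(1/(1 - e^{(-i w + E_k)/h}))|)."""
    bh = round(beta * h)  # beta*h must be an even integer
    momenta = [2 * math.pi * j / L for j in range(L)]
    energy = [-2 * math.cos(k) - mu for k in momenta]  # hypothetical dispersion
    freqs = [math.pi * n / beta for n in range(-bh + 1, bh, 2)]  # the set M_h
    pos = [(x, s / h) for x in range(L) for s in range(bh)]  # (site, time)
    mom = [(j, w) for j in range(L) for w in freqs]  # (momentum index, frequency)
    # Covariance matrix C_h in one spin sector, built from its definition via g_k.
    cov = [[sum(cmath.exp(1j * momenta[j] * (y - x)) * g(tx - ty, energy[j], beta)
                for j in range(L)) / L
            for (y, ty) in pos]
           for (x, tx) in pos]
    norm = math.sqrt(beta * h * L)
    Y = [[cmath.exp(1j * (momenta[j] * x - w * t)) / norm for (x, t) in pos]
         for (j, w) in mom]
    Yd = dagger(Y)
    gram = matmul(Y, Yd)
    diag = matmul(matmul(Y, cov), Yd)
    n = len(mom)

    def eigenvalue(a):
        j, w = mom[a]
        return 1 / (1 - cmath.exp((-1j * w + energy[j]) / h))

    unit_err = max(abs(gram[a][b] - (1 if a == b else 0))
                   for a in range(n) for b in range(n))
    diag_err = max(abs(diag[a][b] - (eigenvalue(a) if a == b else 0))
                   for a in range(n) for b in range(n))
    return unit_err, diag_err
```

With the default parameters the matrices are $12\times 12$, and both returned errors should be at rounding level. Since the covariance is built directly from $g_k$ rather than from (C.7), the check also exercises Lemma C.3 indirectly.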
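Finally, we calculate the determinant of the covariance matrix $C_h$.

Proposition C.7. For any $h\in 2\mathbb{N}/\beta$,
\[
\det C_h = \prod_{\mathbf{k}\in\Gamma^*}\frac{1}{(1 + e^{\beta E_{\mathbf{k}}})^2}.
\]

Proof. Since $\{e^{-i\omega/h+E_{\mathbf{k}}/h}\mid\omega\in M_h\}$ is the set of all the $\beta h$th roots of $-e^{\beta E_{\mathbf{k}}}$,
\[
z^{\beta h} + e^{\beta E_{\mathbf{k}}} = \prod_{\omega\in M_h}(z - e^{-i\omega/h+E_{\mathbf{k}}/h})
\]
for all $z\in\mathbb{C}$. Especially,
\[
\prod_{\omega\in M_h}(1 - e^{-i\omega/h+E_{\mathbf{k}}/h}) = 1 + e^{\beta E_{\mathbf{k}}}. \tag{C.8}
\]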
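By Lemmas C.5, C.6 and (C.8), we see that
\[
\det C_h = \det(YC_hY^*) = \prod_{\mathbf{k}\in\Gamma^*}\prod_{\sigma\in\{\uparrow,\downarrow\}}\prod_{\omega\in M_h}\frac{1}{1 - e^{-i\omega/h+E_{\mathbf{k}}/h}} = \prod_{\mathbf{k}\in\Gamma^*}\frac{1}{(1 + e^{\beta E_{\mathbf{k}}})^2}.
\]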
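The product identity (C.8), on which the determinant formula rests, is easy to confirm numerically. The following sketch uses hypothetical function names and arbitrary sample energies; it evaluates the product over $M_h$ and the resulting closed form for $\det C_h$ given a list of energies $E_{\mathbf{k}}$.

```python
import cmath
import math


def matsubara(beta, h):
    """The set M_h: odd multiples of pi/beta lying in (-pi*h, pi*h)."""
    bh = round(beta * h)  # even, because h is assumed to lie in 2N/beta
    return [math.pi * n / beta for n in range(-bh + 1, bh, 2)]


def matsubara_product(energy, beta, h):
    """Left-hand side of (C.8) for a single energy E_k."""
    prod = 1 + 0j
    for w in matsubara(beta, h):
        prod *= 1 - cmath.exp((-1j * w + energy) / h)
    return prod


def det_covariance(energies, beta):
    """Closed form of det C_h from Proposition C.7; note it does not depend on h."""
    det = 1.0
    for e in energies:
        det /= (1 + math.exp(beta * e)) ** 2
    return det
```

For example, `matsubara_product(0.0, 2.0, 4)` should be numerically equal to $2$, and `det_covariance([0.0], 1.0)` equals $1/4$, which is strictly positive as required for the well-posedness of the Grassmann Gaussian integral.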
Acknowledgments

The author would like to thank M. Salmhofer for discussions and an important hint for the proof of Lemma 4.8, as well as for support during the completion of this work. The author also wishes to thank the referees for their careful reading of the manuscript.

References

[1] S. Afchain, J. Magnen and V. Rivasseau, Renormalization of the 2-point function of the Hubbard model at half-filling, Ann. Henri Poincaré 6 (2005) 399–448.
[2] J. W. Barrett and L. Prigozhin, A quasi-variational inequality problem in superconductivity, preprint (2009).
[3] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Vol. 1, 2nd edn. (Springer, Berlin-Heidelberg-New York, 2003).
[4] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Vol. 2, 2nd edn. (Springer, Berlin-Heidelberg-New York, 2003).
[5] Q. Du, Numerical approximations of the Ginzburg–Landau models for superconductivity, J. Math. Phys. 46 (2005) 095109.
[6] J. Feldman, H. Knörrer and E. Trubowitz, A representation for Fermionic correlation functions, Comm. Math. Phys. 195 (1998) 465–493.
[7] J. Feldman, H. Knörrer and E. Trubowitz, Fermionic Functional Integrals and the Renormalization Group, CRM Monograph Series, Vol. 16 (American Mathematical Society, Providence, RI, 2002).
[8] J. Feldman, H. Knörrer and E. Trubowitz, A two dimensional Fermi liquid. Part 1: Overview, Comm. Math. Phys. 247 (2004) 1–47.
[9] J. Feldman, H. Knörrer and E. Trubowitz, A two dimensional Fermi liquid. Part 3: The Fermi surface, Comm. Math. Phys. 247 (2004) 113–177.
[10] R. L. Graham, D. E. Knuth and O. Patashnik, Concrete Mathematics: A Foundation for Computer Science, 2nd edn. (Addison-Wesley Professional, Reading, MA, 1994).
[11] I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, 7th edn. (Academic Press, Boston, MA, 2000).
[12] Y. Kashima, On the double critical-state model for type-II superconductivity in 3D, M2AN Math. Model Numer. Anal. 42 (2008) 333–374.
[13] T. Koma and H. Tasaki, Decay of superconducting and magnetic correlations on one- and two-dimensional Hubbard models, Phys. Rev. Lett. 68 (1992) 3248–3251.
[14] S. G. Krantz and H. R. Parks, The Implicit Function Theorem: History, Theory, and Applications (Birkhäuser, Boston-Basel-Berlin, 2002).
[15] S. Lang, Complex Analysis, 4th edn. (Springer-Verlag, Berlin-Heidelberg-New York, 1999).
[16] D. Lehmann, Mathematical Methods of Many-Body Quantum Field Theory, Chapman & Hall/CRC Research Notes in Mathematics, Vol. 436 (Chapman & Hall/CRC, Boca Raton, FL, 2005).
[17] G. D. Mahan, Many-Particle Physics, 3rd edn. (Kluwer Academic/Plenum Publishers, New York, 2000).
[18] L. Prigozhin, The Bean model in superconductivity: Variational formulation and numerical solution, J. Comput. Phys. 129 (1996) 190–200.
[19] W. Pedra and M. Salmhofer, Determinant bounds and the Matsubara UV problem of many-fermion systems, Comm. Math. Phys. 282 (2008) 797–818.
[20] V. Rivasseau, The two dimensional Hubbard model at half-filling: I. Convergent contributions, J. Stat. Phys. 106 (2002) 693–722.
[21] M. Salmhofer, Renormalization: An Introduction, Springer Texts and Monographs in Physics (Springer-Verlag, Berlin-Heidelberg-New York, 1999).
[22] M. Salmhofer, Clustering of fermionic truncated expectation values via functional integration, J. Stat. Phys. 134 (2009) 941–952.
[23] M. Salmhofer and C. Wieczerkowski, Positivity and convergence in fermionic quantum field theory, J. Stat. Phys. 99 (2000) 557–586.
[24] D. B. West, Introduction to Graph Theory, 2nd edn. (Prentice Hall, Upper Saddle River, NJ, 2002).
[25] C. N. Yang, Concept of off-diagonal long-range order and the quantum phases of liquid He and of superconductors, Rev. Mod. Phys. 34 (1962) 694–704.
[26] K. Yosida, Functional Analysis, 6th edn. (Springer-Verlag, Berlin-Heidelberg-New York, 1998).
September 14, 2009 15:49 WSPC/148-RMP
J070-00381
Reviews in Mathematical Physics Vol. 21, No. 8 (2009) 1045–1080 c World Scientific Publishing Company
QUANTIZATION OF THE HALL CONDUCTANCE AND DELOCALIZATION IN ERGODIC LANDAU HAMILTONIANS
FRANÇOIS GERMINET∗, ABEL KLEIN† and JEFFREY H. SCHENKER‡

∗Université de Cergy-Pontoise, Département de Mathématiques, CNRS UMR 8088, IUF, 95000 Cergy-Pontoise, France
[email protected]

†University of California, Irvine, Department of Mathematics, Irvine, CA 92697-3875, USA
[email protected]

‡Michigan State University, Department of Mathematics, East Lansing, MI 48823, USA
jeff [email protected]

Received 2 December 2008
Revised 15 July 2009

We prove quantization of the Hall conductance for continuous ergodic Landau Hamiltonians under a condition on the decay of the Fermi projections. This condition and continuity of the integrated density of states are shown to imply continuity of the Hall conductance. In addition, we prove the existence of delocalization near each Landau level for these two-dimensional Hamiltonians. More precisely, we prove that for some ergodic Landau Hamiltonians, there exists an energy E near each Landau level where a “localization length” diverges. For the Anderson–Landau Hamiltonian, we also obtain a transition between dynamical localization and dynamical delocalization in the Landau bands, with a minimal rate of transport, even in cases when the spectral gaps are closed.

Keywords: Ergodic Landau Hamiltonians; random Landau Hamiltonians; quantum Hall effect; Hall conductance; dynamical delocalization.

Mathematics Subject Classification 2000: 82B44, 47B80, 60H25
Contents

1. Introduction
2. Definitions and Main Results
3. Technicalities
4. Existence and Quantization of the Hall Conductance
5. Continuity of the Hall Conductance
6. Delocalization for Ergodic Landau Hamiltonians with Open Gaps
7. Dynamical Delocalization for the Anderson–Landau Hamiltonian with Closed Gaps
Appendix A. The Spectrum of Landau Hamiltonians with Bounded Potentials
Appendix B. The Spectrum of Anderson–Landau Hamiltonians
1. Introduction

Ergodic Landau Hamiltonians describe an electron moving in a very thin flat conductor with impurities under the influence of a constant magnetic field perpendicular to the plane of the conductor. They play an important role in the understanding of the quantum Hall effect [34, 2, 39, 25, 36, 33, 5, 3, 6]. Laughlin’s argument relies on the assumption that under weak disorder and strong magnetic field the energy spectrum consists of bands of extended states separated by energy regions of localized states and/or energy gaps [34, 25, 2, 39]. Kunz [33] formulated assumptions under which he derived the divergence of a “localization length” near each Landau level at weak disorder.

Prior to our recent paper [24], there had been no rigorous results concerning delocalization for continuous ergodic Landau Hamiltonians. Divergence of a “localization length” had only been proved for an ergodic Landau Hamiltonian in a tight-binding approximation, a discrete ergodic Schrödinger operator. The first results were obtained by Bellissard, van Elst and Schulz-Baldes [6], who proved that, for an ergodic Landau Hamiltonian in a tight-binding approximation, if the Hall conductance jumps from one integer value to another between two Fermi energies, then there is an energy between these Fermi energies at which a certain localization length diverges. Their results relied on a proof of the quantization of Hall conductance (the quantum Hall effect) for ergodic Landau Hamiltonians in a tight-binding representation (discrete ergodic Landau Hamiltonians) in energy intervals characterized by a condition on the decay of the Fermi projections. Their proof relies on noncommutative geometry and the Dixmier trace. Aizenman and Graf [1] gave a more elementary derivation of this result, incorporating ideas of Avron, Seiler and Simon [3], paying the price of a slightly stronger condition on the decay of the Fermi projections.

In [24], we proved that the (continuous) Anderson–Landau Hamiltonian (the random Landau Hamiltonian in [24]) exhibits dynamical delocalization in each Landau band. More precisely, under the disjoint bands condition (open spectral gaps between Landau bands), which holds for bounded potentials under weak disorder and/or strong magnetic field, we proved the existence of a transition between dynamical localization and dynamical delocalization in each Landau band, with a lower bound on the rate of transport. We used nontrivial consequences of the multiscale analysis for random Schrödinger operators to prove that the Hall conductance
for the Anderson–Landau Hamiltonian is well defined and constant in intervals of dynamical localization. We used the knowledge of the precise values of the Hall conductance for the (free) Landau Hamiltonian: it is constant between Landau levels and jumps by one at each Landau level, a well known fact (e.g., [3, 6]). In addition, we showed that the Hall conductance is constant as a function of the disorder parameter in the gaps between the Landau bands, a result previously derived by Elgart and Schlein [14] for smooth potentials. Under the disjoint bands conditions (open spectral gaps), we combined these ingredients to conclude that there must be dynamical delocalization as we cross a Landau band. Moreover, since the existence of dynamical localization at the edges of these Landau bands was known [9, 40, 20], we proved the existence of dynamical mobility edges. In [24], we circumvented the use of the quantization of the Hall conductance. For continuous Landau Hamiltonians quantization of the Hall conductance had only been known on spectral gaps [3]. A proof of quantization of the Hall conductance inside the spectrum of continuous ergodic Landau Hamiltonians has been a long-standing open problem. Although it was promised in 1994 [6], the proof never appeared. (As mentioned in [6], in the discrete case, their proof studies a compact noncommutative manifold, while in the continuous case the corresponding noncommutative manifold is locally compact, but not compact.) In this article, we prove quantization of the Hall conductance for continuous ergodic Landau Hamiltonians under a condition on the decay of the Fermi projections. We also show that this condition and continuity of the integrated density of states imply continuity of the Hall conductance. In particular, we get quantization and continuity of the Hall conductance for the Anderson–Landau Hamiltonian in the region of localization. 
Our condition on the decay of the Fermi projections is reminiscent of the condition used in [1], but it is not the same because of differences between the continuous and the discrete cases. Although the weaker condition given in [6] is very natural (it was shown by Bouclet and the authors [7] to be sufficient for a rigorous derivation of the Kubo–Středa formula for the Hall conductance in continuous ergodic Landau Hamiltonians), its use for a derivation of the quantization of the Hall conductance seems to require methods of noncommutative geometry and the Dixmier trace that have not been extended to the continuous case.

In [24], we did not use the quantization of the Hall conductance, but required the disjoint bands condition. The results in this paper not only give a new proof of the delocalization results in [24], but they allow the extension of those results to ergodic Landau Hamiltonians, in the sense of divergence of a “localization length”.

In this paper, we go beyond the disjoint bands condition, proving dynamical delocalization in the Landau bands for the Anderson–Landau Hamiltonian in cases where the spectral gaps are closed. Using our results on the quantization of the Hall conductance, we prove the existence of a transition between dynamical localization and dynamical delocalization in a Landau band, with a lower bound on the rate of transport, for Anderson–Landau Hamiltonians with closed spectral gaps. Although
September 14, 2009 15:49 WSPC/148-RMP
1048
J070-00381
F. Germinet, A. Klein & J. H. Schenker
in this paper we assume, as in [24], that the potentials are bounded, this restriction can be removed. This extension appears in a companion article [23], which considers an Anderson–Landau Hamiltonian with unbounded random amplitudes (e.g., with a Gaussian distribution), where all the gaps close as soon as the disorder is turned on. The main results of this paper still hold for such unbounded Anderson–Landau Hamiltonians; the theorem concerning the existence of a dynamical transition is stated below for completeness.

2. Definitions and Main Results

We consider a $\mathbb{Z}^2$-ergodic Landau Hamiltonian
\[ H_{B,\lambda,\omega} = H_B + \lambda V_\omega \quad \text{on } L^2(\mathbb{R}^2, dx), \tag{2.1} \]
where $H_B$ is the (free) Landau Hamiltonian,
\[ H_B = (-i\nabla - A)^2 \quad \text{with } A = \frac{B}{2}(x_2, -x_1) \tag{2.2} \]
($A$ is the vector potential and $B > 0$ is the strength of the magnetic field; we use the symmetric gauge and have incorporated the charge of the electron in the vector potential), $\lambda \ge 0$ is the disorder parameter, and $V_\omega$ is a bounded ergodic (real) potential. Thus, there is a probability space $(\Omega, \mathbb{P})$ equipped with an ergodic group $\{\tau(a);\ a \in \mathbb{Z}^2\}$ of measure preserving transformations, and a potential-valued map $V_\omega$ on $\Omega$, measurable in the sense that $\langle \phi, V_\omega \phi \rangle$ is a measurable function of $\omega$ for all $\phi \in C_c^\infty(\mathbb{R}^2)$. Such a family of potentials includes random as well as quasiperiodic potentials. We assume that
\[ -M_1 \le V_\omega(x) \le M_2, \quad \text{where } M_1, M_2 \in [0, \infty) \text{ with } M_1 + M_2 > 0, \tag{2.3} \]
and
\[ V_\omega(x - a) = V_{\tau_a \omega}(x) \quad \text{for all } a \in \mathbb{Z}^2. \tag{2.4} \]
An important example of an ergodic Landau Hamiltonian is the Anderson–Landau Hamiltonian
\[ H^{(A)}_{B,\lambda,\omega} := H_B + \lambda V^{(A)}_\omega, \tag{2.5} \]
where $V^{(A)}_\omega$ is the random potential
\[ V^{(A)}_\omega(x) = \sum_{i \in \mathbb{Z}^2} \omega_i\, u(x - i), \tag{2.6} \]
with $u(x) \ge 0$ a bounded measurable function with compact support, $u(x) \ge u_0$ on some nonempty open set for some constant $u_0 > 0$, and $\omega = \{\omega_i;\ i \in \mathbb{Z}^2\}$ a family of independent, identically distributed random variables taking values in a bounded interval $[-M_1, M_2]$ ($0 \le M_1, M_2 < \infty$, $M_1 + M_2 > 0$), whose common probability distribution $\mu$ has a bounded density $\rho$. Without loss of generality we set $\bigl\| \sum_{i \in \mathbb{Z}^2} u(x - i) \bigr\|_\infty = 1$, and hence $-M_1 \le V^{(A)}_\omega(x) \le M_2$.
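The bound $-M_1 \le V^{(A)}_\omega(x) \le M_2$ can be illustrated with a minimal sketch. The choices below are assumptions made only for illustration and do not come from the paper: the single-site bump `single_site_u` is the indicator of the unit square $[-1/2, 1/2)^2$ (so that $\sum_i u(x - i) = 1$ everywhere), the $\omega_i$ are uniform on $[-M_1, M_2]$, and all function names are hypothetical.

```python
import math
import random

M1, M2 = 0.5, 1.0  # assumed endpoints of the support [-M1, M2] of the omega_i


def single_site_u(x1, x2):
    """Hypothetical single-site bump: indicator of [-1/2, 1/2)^2."""
    return 1.0 if (-0.5 <= x1 < 0.5 and -0.5 <= x2 < 0.5) else 0.0


def make_omega(seed=0):
    """Lazily sample i.i.d. omega_i, uniform on [-M1, M2] (a bounded density)."""
    rng = random.Random(seed)
    cache = {}

    def omega(i1, i2):
        if (i1, i2) not in cache:
            cache[(i1, i2)] = rng.uniform(-M1, M2)
        return cache[(i1, i2)]

    return omega


def anderson_potential(x1, x2, omega, radius=2):
    """Evaluate V(x) = sum_i omega_i u(x - i) over lattice sites near x."""
    c1, c2 = math.floor(x1 + 0.5), math.floor(x2 + 0.5)
    total = 0.0
    for i1 in range(c1 - radius, c1 + radius + 1):
        for i2 in range(c2 - radius, c2 + radius + 1):
            total += omega(i1, i2) * single_site_u(x1 - i1, x2 - i2)
    return total


omega = make_omega()
rng = random.Random(1)
samples = [anderson_potential(rng.uniform(-5, 5), rng.uniform(-5, 5), omega)
           for _ in range(1000)]
```

With this particular bump the potential is piecewise constant, equal to $\omega_i$ on the unit square centered at $i$, so the two-sided bound is immediate; a smoother $u$ with overlapping translates would need the normalization of $\sum_i u(x - i)$ stated above.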
Ergodic Landau Hamiltonians
An ergodic Landau Hamiltonian $H_{B,\lambda,\omega}$ is a self-adjoint measurable operator, i.e. with probability one $H_{B,\lambda,\omega}$ is a self-adjoint operator and the mappings $\omega \to f(H_{B,\lambda,\omega})$ are strongly measurable for all bounded measurable functions $f$ on $\mathbb{R}$ (cf. [38]). The magnetic translations $U_a = U_a(B)$, $a \in \mathbb{R}^2$, defined by
\[ (U_a \psi)(x) = e^{-i \frac{B}{2}(x_2 a_1 - x_1 a_2)}\, \psi(x - a), \tag{2.7} \]
give a projective unitary representation of $\mathbb{R}^2$ on $L^2(\mathbb{R}^2, dx)$:
\[ U_a U_b = e^{i \frac{B}{2}(a_2 b_1 - a_1 b_2)}\, U_{a+b} = e^{i B (a_2 b_1 - a_1 b_2)}\, U_b U_a, \quad a, b \in \mathbb{Z}^2. \tag{2.8} \]
We have $U_a H_B U_a^* = H_B$ for all $a \in \mathbb{R}^2$, and the following covariance relation for magnetic translation by elements of $\mathbb{Z}^2$:
\[ U_a H_{B,\lambda,\omega} U_a^* = H_{B,\lambda,\tau_a \omega} \quad \text{for all } a \in \mathbb{Z}^2. \tag{2.9} \]
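The composition rule (2.8) can be checked pointwise from the definition (2.7). The following is a minimal numerical sketch, not part of the paper: the field strength `B`, the test function `psi`, the translation vectors and the evaluation point are all arbitrary illustrative choices.

```python
import cmath
import math

B = 1.3  # assumed magnetic field strength (illustrative value)


def translate(a, psi):
    """Magnetic translation U_a of (2.7), applied to a function psi on R^2."""
    a1, a2 = a

    def shifted(x1, x2):
        phase = cmath.exp(-1j * (B / 2) * (x2 * a1 - x1 * a2))
        return phase * psi(x1 - a1, x2 - a2)

    return shifted


def psi(x1, x2):
    """Hypothetical test function: a Gaussian with a complex linear prefactor."""
    return (1 + x1 + 2j * x2) * math.exp(-(x1 ** 2 + x2 ** 2) / 4)


a, b = (1.0, -2.0), (3.0, 1.0)
x = (0.4, -0.7)
cross = a[1] * b[0] - a[0] * b[1]  # a_2 b_1 - a_1 b_2

# (U_a U_b psi)(x)
lhs = translate(a, translate(b, psi))(*x)
# e^{i (B/2)(a_2 b_1 - a_1 b_2)} (U_{a+b} psi)(x)
rhs_sum = cmath.exp(1j * (B / 2) * cross) * translate((a[0] + b[0], a[1] + b[1]), psi)(*x)
# e^{i B (a_2 b_1 - a_1 b_2)} (U_b U_a psi)(x)
rhs_swap = cmath.exp(1j * B * cross) * translate(b, translate(a, psi))(*x)
```

The three quantities `lhs`, `rhs_sum` and `rhs_swap` agree up to rounding, which is the content of (2.8) evaluated at one point.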
It follows from ergodicity that $H_{B,\lambda,\omega}$ has a nonrandom spectrum: there exists a nonrandom set $\Sigma_{B,\lambda}$ such that $\sigma(H_{B,\lambda,\omega}) = \Sigma_{B,\lambda}$ with probability one. Moreover, the decomposition of $\sigma(H_{B,\lambda,\omega})$ into pure point spectrum, absolutely continuous spectrum, and singular continuous spectrum is also independent of the choice of $\omega$ with probability one [29, 8, 38]. In addition, the integrated density of states $N(B, \lambda, E)$ is well defined and may be written as (cf. [26])
\[ N(B, \lambda, E) = \mathbb{E}\{\operatorname{tr}\{\chi_0 P_{B,\lambda,E,\omega} \chi_0\}\}. \tag{2.10} \]
Here and throughout the paper, $\chi_x$ denotes the characteristic function of a cube of side length 1 centered at $x \in \mathbb{Z}^2$. The spectrum of the Landau Hamiltonian $H_B$, denoted by $\Sigma_B$, consists of a sequence of infinitely degenerate eigenvalues, the Landau levels:
\[ \Sigma_B = \{B_n := (2n - 1)B,\ n = 1, 2, \ldots\}. \tag{2.11} \]
We also set $B_0 = -\infty$ for convenience. Standard arguments (see Appendix A) show that
\[ \Sigma_{B,\lambda} \subset \bigcup_{n=1}^{\infty} \mathcal{B}_n(B, \lambda), \quad \text{where } \mathcal{B}_n(B, \lambda) = [B_n - \lambda M_1,\ B_n + \lambda M_2]. \tag{2.12} \]
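A short sketch may help to see when the enclosing bands in (2.12) leave gaps between them. Consecutive Landau levels are $2B$ apart, so the intervals $\mathcal{B}_n(B, \lambda)$ and $\mathcal{B}_{n+1}(B, \lambda)$ are disjoint exactly when $\lambda(M_1 + M_2) < 2B$. The function names and parameter values below are illustrative assumptions; the code only manipulates the intervals of (2.12), not the actual spectrum.

```python
def landau_level(n, B):
    """n-th Landau level B_n = (2n - 1) B, n = 1, 2, ..."""
    return (2 * n - 1) * B


def band(n, B, lam, M1, M2):
    """Enclosing interval [B_n - lam*M1, B_n + lam*M2] from (2.12)."""
    level = landau_level(n, B)
    return (level - lam * M1, level + lam * M2)


def enclosing_bands_disjoint(B, lam, M1, M2, n_max=10):
    """True if consecutive enclosing bands do not touch for n = 1..n_max."""
    return all(
        band(n, B, lam, M1, M2)[1] < band(n + 1, B, lam, M1, M2)[0]
        for n in range(1, n_max + 1)
    )


# (B, lam, M1, M2) test cases, chosen away from the borderline lam*(M1+M2) == 2B
cases = [(1.0, 0.5, 1.0, 1.0), (1.0, 1.5, 1.0, 1.0),
         (2.0, 1.0, 0.5, 2.5), (0.5, 2.0, 0.3, 0.3)]
results = [enclosing_bands_disjoint(*c) for c in cases]
```

Since the intervals only contain the spectrum, overlapping intervals do not by themselves show that a gap has closed; they only show that (2.12) no longer excludes it.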
For a given magnetic field $B > 0$, disorder $\lambda \ge 0$ and energy $E \in \mathbb{R}$, the Fermi projection $P_{B,\lambda,E,\omega}$ is just the spectral projection of the ergodic Landau Hamiltonian $H_{B,\lambda,\omega}$ onto energies $\le E$, i.e.
\[ P_{B,\lambda,E,\omega} = \chi_{(-\infty, E]}(H_{B,\lambda,\omega}). \tag{2.13} \]
Estimates on the decay of the operator kernel of the Fermi projection, $\{\chi_x P_{B,\lambda,E,\omega} \chi_y\}_{x,y \in \mathbb{Z}^2}$, play an important role in the study of the Hall conductance.
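To state these estimates we introduce norms on random operators (see Sec. 3.1 for more details). A random operator $S_\omega$ is a strongly measurable map from the probability space $(\Omega, \mathbb{P})$ to bounded operators on $L^2(\mathbb{R}^2, dx)$. We set
\[ |||S_\omega|||_p := \{\mathbb{E}\{\operatorname{tr}|S_\omega|^p\}\}^{\frac{1}{p}} = \bigl\| \|S_\omega\|_p \bigr\|_{L^p(\Omega, \mathbb{P})} \ \text{ for } p \in [1, \infty), \qquad |||S_\omega|||_\infty := \bigl\| \|S_\omega\| \bigr\|_{L^\infty(\Omega, \mathbb{P})}. \tag{2.14} \]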
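The Hall conductance $\sigma_H(B, \lambda, E)$ is given by
\[ \sigma_H(B, \lambda, E) = -2\pi i\, \mathbb{E}\{\operatorname{tr}\{\chi_0 P_{B,\lambda,E,\omega} [[P_{B,\lambda,E,\omega}, X_1], [P_{B,\lambda,E,\omega}, X_2]] \chi_0\}\}, \tag{2.15} \]
defined for $B > 0$, $\lambda \ge 0$ and energy $E \in \mathbb{R}$ such that
\[ |||\chi_0 P_{B,\lambda,E,\omega} [[P_{B,\lambda,E,\omega}, X_1], [P_{B,\lambda,E,\omega}, X_2]] \chi_0|||_1 < \infty. \tag{2.16} \]
($X_i$ denotes the operator given by multiplication by the coordinate $x_i$, $i = 1, 2$, and $|X|$ the operator given by multiplication by $|x|$.) A natural condition for (2.16) and quantization of the Hall conductance was given by Bellissard et al. [6]:
\[ \sum_{x \in \mathbb{Z}^2} |x|^2\, |||\chi_x P_{B,\lambda,E,\omega} \chi_0|||_2^2 < \infty. \tag{2.17} \]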
They showed the sufficiency of this condition in an abstract $C^*$-algebra setting, from which they obtained existence and quantization of the Hall conductance for ergodic Landau Hamiltonians in a tight binding representation (ergodic Landau Hamiltonians). This condition was also shown by Bouclet and the authors [7] to be sufficient for a rigorous derivation of (2.15) for ergodic Landau Hamiltonians as a Kubo formula. Aizenman and Graf [1] gave a more elementary derivation of the existence and quantization of the Hall conductance for an ergodic Landau Hamiltonian $H_{B,\lambda,\omega}$ on $\ell^2(\mathbb{Z}^2)$, under the condition [1, Condition (5.4)], namely
\[ \sum_{x \in \mathbb{Z}^2} |x|\, \{\mathbb{E}\{|\langle \delta_x, P_{B,\lambda,E,\omega} \delta_0 \rangle|^q\}\}^{\frac{1}{q}} < \infty \quad \text{for some } q > 2, \tag{2.18} \]
which implies (2.17) in the discrete setting. In the discrete setting, given an interval where the integrated density of states is continuous, constancy of the Hall conductance follows if either (2.17) or (2.18) holds with a uniform bound in the interval [6, 1]. On the continuum, it is natural to work with estimates on the decay of $|||\chi_x P_{B,\lambda,E,\omega} \chi_0|||_2$. In fact, it is known that for the Anderson–Landau Hamiltonian $|||\chi_x P_{B,\lambda,E,\omega} \chi_0|||_2$ exhibits sub-exponential decay in $x$ in the region of localization
[22, Theorem 3], [24, Eq. (3.2)]. We will prove that a sufficient condition for the existence and quantization of the Hall conductance for ergodic Landau Hamiltonians is given by
\[ \sum_{x \in \mathbb{Z}^2} |x|\, |||\chi_x P_{B,\lambda,E,\omega} \chi_0|||_2^{\beta} < \infty \quad \text{for some } \beta \in (0, 1). \tag{2.19} \]
We will also show that for an interval where the integrated density of states is continuous, we have constancy of the Hall conductance if (2.19) holds with a locally bounded bound. Note that (2.19) implies (2.17). We consider the magnetic field-disorder-energy parameter space
\[ \Xi = \{(0, \infty) \times [0, \infty) \times \mathbb{R}\} \setminus \bigcup_{B \in (0, \infty)} \{(B, 0) \times \Sigma_B\}; \tag{2.20} \]
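we exclude the Landau levels at no disorder. We give $\Xi$ the relative topology as a subset of $\mathbb{R}^3$. Given a subset $\Phi \subset \Xi$, we set
\[ \Phi^{(B,\lambda)} := \{E \in \mathbb{R};\ (B, \lambda, E) \in \Phi\}, \tag{2.21} \]
with a similar definition for $\Phi^{(B,E)}$. We now introduce a (generalized) “localization length” $L(B, \lambda, E)$, based on (2.19). Given $\beta \in (0, 1]$ and $(B, \lambda, E) \in \Xi$, we set
\[ L(B, \lambda, E) := \lim_{\beta \uparrow 1} L_\beta(B, \lambda, E), \tag{2.22} \]
where
\[ L_\beta(B, \lambda, E) := \sum_{x \in \mathbb{Z}^2} |x|\, |||\chi_x P_{B,\lambda,E,\omega} \chi_0|||_2^{\beta} \quad \text{for } \beta \in (0, 1]. \tag{2.23} \]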
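We will also need “localization lengths” that take into account what happens near $(B, \lambda, E)$. We let
\[ L_+(B, \lambda, E) := \lim_{\beta \uparrow 1} L_{\beta+}(B, \lambda, E), \tag{2.24} \]
\[ L_+^{(B,\lambda)}(E) := \lim_{\beta \uparrow 1} L_{\beta+}^{(B,\lambda)}(E), \tag{2.25} \]
where
\[ L_{\beta+}(B, \lambda, E) := \inf_{\substack{\Phi \ni (B,\lambda,E) \\ \Phi \subset \Xi \text{ open}}}\ \sup_{(B', \lambda', E') \in \Phi} L_\beta(B', \lambda', E'), \tag{2.26} \]
\[ L_{\beta+}^{(B,\lambda)}(E) := \inf_{\substack{I \ni E \\ I \subset \mathbb{R} \text{ open}}}\ \sup_{E' \in I} L_\beta(B, \lambda, E'). \tag{2.27} \]
The justification of the definitions (2.22), (2.24) and (2.25), that is, the existence of the limits, is found in Sec. 3.3. Note that $L_1(B, \lambda, E) < \infty$ implies (2.17), and that in general we only have $L_1(B, \lambda, E) \le L(B, \lambda, E)$.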
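We also define the subsets of $\Xi$ where these “localization lengths” are finite:
\[ \Xi_{\#} := \{(B, \lambda, E) \in \Xi;\ \#(B, \lambda, E) < \infty\}, \quad \# = L, L_+, L_\beta, L_{\beta+}, \]
\[ \Xi_{\#}^{\{B,\lambda\}} := \{E \in \mathbb{R};\ \#^{(B,\lambda)}(E) < \infty\}, \quad \# = L, L_+, L_\beta, L_{\beta+}. \tag{2.28} \]
$\Xi_{L_+}$ is, by definition, a relatively open subset of $\Xi$, and $\Xi_{L_+}^{\{B,\lambda\}}$ is an open subset of $\mathbb{R}$. Note that $\Xi_{\#}^{\{B,\lambda\}} \supset \Xi_{\#}^{(B,\lambda)}$, with $\Xi_{\#}^{(B,\lambda)}$ defined as in (2.21), but we may not have equality.

In Sec. 3.3, we show that the sets $\Xi_{\#}$ and $\Xi_{\#}^{\{B,\lambda\}}$, $\# = L_\beta, L_{\beta+}$, are monotone increasing in $\beta \in (0, 1]$, with
\[ \Xi_L = \bigcup_{\beta \in (0,1)} \Xi_{L_\beta}, \quad \Xi_{L_+} = \bigcup_{\beta \in (0,1)} \Xi_{L_{\beta+}}, \quad \Xi_{L_+}^{\{B,\lambda\}} = \bigcup_{\beta \in (0,1)} \Xi_{L_{\beta+}}^{\{B,\lambda\}}. \tag{2.29} \]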
Note that
\[ \Xi_{NS} := \{(B, \lambda, E) \in \Xi;\ E \notin \Sigma_{B,\lambda}\} \subset \Xi_{L_+}; \tag{2.30} \]
$\Xi_{NS}$ being the region of no spectrum. We are now ready to state our main results.

Theorem 2.1. Let $H_{B,\lambda,\omega}$ be an ergodic Landau Hamiltonian. Then the Hall conductance $\sigma_H(B, \lambda, E)$ is defined and integer valued on $\Xi_L$. In addition, $\sigma_H(B, \lambda, E)$ is locally bounded on $\Xi_{L_+}$ and on each $\Xi_{L_+}^{\{B,\lambda\}}$.

We set $\sigma_H^{(B,\lambda)}(E) := \sigma_H(B, \lambda, E)$, $N^{(B,\lambda)}(E) := N(B, \lambda, E)$, and $L_+^{(B,\lambda)}(E) := L_+(B, \lambda, E)$.

Theorem 2.2. Let $H_{B,\lambda,\omega}$ be an ergodic Landau Hamiltonian. If for a given $(B, \lambda) \in (0, \infty) \times [0, \infty)$, the integrated density of states $N^{(B,\lambda)}(E)$ is continuous in $E$, then the Hall conductance $\sigma_H^{(B,\lambda)}(E)$ is continuous on $\Xi_{L_+}^{\{B,\lambda\}}$. In particular, $\sigma_H^{(B,\lambda)}(E)$ is constant on each connected component of $\Xi_{L_+}^{\{B,\lambda\}}$.
If we have
\[ \lambda(M_1 + M_2) < 2B, \tag{2.31} \]
it follows from (2.12) that the bands $\mathcal{B}_n(B, \lambda)$ are disjoint, and the spectral gaps remain open. We will refer to (2.31) as the disjoint bands condition; it clearly holds under weak disorder and/or strong magnetic field.

Corollary 2.3. Let $H_{B,\lambda,\omega}$ be an ergodic Landau Hamiltonian. Suppose the integrated density of states $N^{(B,\lambda)}(E)$ is continuous in $E$ for all $(B, \lambda) \in (0, \infty) \times [0, \infty)$ satisfying the disjoint bands condition (2.31). Then for all such $(B, \lambda)$ the “localization length” $L_+^{(B,\lambda)}(E)$ diverges near each Landau level: for each $n = 1, 2, \ldots$ there exists an energy $E_n(B, \lambda) \in \mathcal{B}_n(B, \lambda)$ such that
\[ L_+^{\{B,\lambda\}}(E_n(B, \lambda)) = \infty. \tag{2.32} \]

For the Anderson–Landau Hamiltonian $H^{(A)}_{B,\lambda,\omega}$ we can say more. Following [21, 22, 24], we introduce the region of dynamical localization. (It was called the strong
insulator region in [21] and the region of complete localization in [22].) This can be done in many equivalent ways, as shown in [21, 22], but for the purposes of this paper we define it by the decay of the Fermi projection, using [22, Theorem 3 and following comments]: The region of dynamical localization $\Xi_{DL}$ consists of those $(B, \lambda, E) \in \Xi$ for which there exists an open interval $I \ni E$ such that
\[ \sup_{E' \in I} |||\chi_x P_{B,\lambda,E',\omega} \chi_0|||_2 \le C_{I,B,\lambda} (1 + |x|)^{-\eta_1} \quad \text{for all } x \in \mathbb{Z}^2, \tag{2.33} \]
where $\eta_1 > 0$ is a fixed number that can be calculated from the proof of [22, Theorem 3]. (The condition stated in [22, Theorem 3] is of the form
\[ \mathbb{E}\Bigl\{ \sup_{E' \in I} \|\chi_x P_{B,\lambda,E',\omega} \chi_0\|_2^2 \Bigr\} \le C_{I,B,\lambda} (1 + |x|)^{-\eta_1} \quad \text{for all } x \in \mathbb{Z}^2, \tag{2.34} \]
but an inspection of the proof shows that it can be replaced by (2.33).) Its complement in $\Xi$ will be called the region of dynamical delocalization: $\Xi_{DD} := \Xi \setminus \Xi_{DL}$. (See [24] for background, definitions, and discussion.) It follows that there exists $\beta_1 \in (0, 1)$ such that
\[ \Xi_{DL}^{(B,\lambda)} = \Xi_{L_{\beta_1 +}}^{\{B,\lambda\}} \subset \Xi_{L_+}^{\{B,\lambda\}}. \tag{2.35} \]
Moreover, the integrated density of states $N(B, \lambda, E)$ of the Anderson–Landau Hamiltonian is jointly Hölder-continuous in $(B, E)$ for $\lambda > 0$ [12]. ($N(B, \lambda, E)$ is actually Lipschitz continuous in $E$ for fixed $(B, \lambda)$ [11].) Thus (2.32) implies [24, Eq. (2.20)], that is,
\[ \Xi_{DD}^{(B,\lambda)} \cap \mathcal{B}_n(B, \lambda) \neq \emptyset, \tag{2.36} \]
and hence Corollary 2.3 provides a new proof for [24, Theorems 2.1 and 2.2]. We actually have more. Using the characterization of $\Xi_{DL}$ as the region of applicability of the multiscale analysis [21], we can get the constant $C_{I,B,\lambda}$ in (2.33) locally bounded in $B$ and $\lambda$, obtaining
\[ \Xi_{DL} = \Xi_{L_{\beta_1 +}} \subset \Xi_{L_+}. \tag{2.37} \]
For the Anderson–Landau Hamiltonian we have a slightly stronger version of Theorems 2.1 and 2.2.

Theorem 2.4. Let $H^{(A)}_{B,\lambda,\omega}$ be the Anderson–Landau Hamiltonian. Then the Hall conductance $\sigma_H(B, \lambda, E)$ is defined and integer valued on $\Xi_L$, and Hölder-continuous on $\Xi_{L_+}$. In particular, $\sigma_H(B, \lambda, E)$ is constant on each connected component of $\Xi_{L_+}$.

It follows that on $\Xi_{DL}$, the region of dynamical localization, the Hall conductance $\sigma_H(B, \lambda, E)$ is defined, integer valued, and constant on each connected component. The results in this article for the Anderson–Landau Hamiltonian go beyond [24, Theorems 2.1 and 2.2]; they show the existence of a dynamical metal-insulator transition, in the sense of [21], inside the Landau bands of the Anderson–Landau
Hamiltonian in cases when the disjoint bands condition does not hold and the spectral gaps are closed. We give a simple example in the next theorem.

As shown in [21], the region of dynamical localization $\Xi_{DL}^{(B,\lambda)}$ can be characterized as follows. To measure “dynamical localization” we introduce
\[ M_{B,\lambda,\omega}(p, \mathcal{X}, t) = \bigl\| \langle x \rangle^{\frac{p}{2}} e^{-itH_{B,\lambda,\omega}} \mathcal{X}(H_{B,\lambda,\omega}) \chi_0 \bigr\|_2^2, \tag{2.38} \]
the random moment of order $p \ge 0$ at time $t$ for the time evolution in the Hilbert–Schmidt norm, initially spatially localized in the square of side one around the origin (with characteristic function $\chi_0$), and “localized” in energy by the function $\mathcal{X} \in C_{c,+}^\infty(\mathbb{R})$. (Notation: $\langle x \rangle := \sqrt{1 + |x|^2}$.) Its time averaged expectation is given by
\[ \mathcal{M}_{B,\lambda}(p, \mathcal{X}, T) = \frac{1}{T} \int_0^\infty \mathbb{E}\{M_{B,\lambda,\omega}(p, \mathcal{X}, t)\}\, e^{-\frac{t}{T}}\, dt. \tag{2.39} \]
It is proven in [21] that $\Xi_{DL}^{(B,\lambda)}$ is the set of energies $E$ for which there exists $\mathcal{X} \in C_{c,+}^\infty(\mathbb{R})$ with $\mathcal{X} \equiv 1$ on some open interval containing $E$, $\alpha \ge 0$, and $p > 4\alpha + 22$, such that
\[ \liminf_{T \to \infty} \frac{1}{T^\alpha} \mathcal{M}_{B,\lambda}(p, \mathcal{X}, T) < \infty, \tag{2.40} \]
in which case it is also shown in [21] that (2.40) holds for any $p \ge 0$ with $\alpha = 0$.
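Theorem 2.5. Let $H^{(A)}_{B,\lambda,\omega}$ be an Anderson–Landau Hamiltonian as in (2.5) and (2.6), where the common probability distribution $\mu$ has density
\[ \rho(s) = \frac{\eta + 1}{2} (1 - |s|)^\eta \chi_{[-1,1]}(s), \quad \eta > 0, \tag{2.41} \]
and the single-site potential $u$ satisfies
\[ 0 < U_- \le U(x) := \sum_{i \in \mathbb{Z}^2} u(x - i) \le 1, \quad \text{with } U_- \text{ a constant}. \tag{2.42} \]
Let $B > 0$. Then:

(i) The spectral gaps are all closed for $\lambda \ge \frac{1}{U_-} B$:
\[ \Sigma_{B,\lambda} = [E_0(B, \lambda), \infty) \quad \text{for } \lambda \ge \frac{1}{U_-} B, \tag{2.43} \]
where $E_0(B, \lambda) := \inf \Sigma_{B,\lambda} \in (B - \lambda, B - \lambda U_-)$.

(ii) Let $\bar{\lambda} > \frac{1}{U_-} B$, and $\delta \in (0, B)$. Set
\[ J_n(B) := (B_n + \delta, B_{n+1} - \delta), \quad n \in \mathbb{N}, \qquad J_0(B) := (-\infty, B - \delta) \subset (-\infty, B). \tag{2.44} \]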
Then for all $N \in \mathbb{N}$ there exists $\eta_N > 0$ such that, taking $\eta \ge \eta_N$, we have
\[ J_n(B) \subset \Xi_{DL}^{(B,\lambda)} \quad \text{for all } \lambda \in [0, \bar{\lambda}], \quad n = 0, 1, 2, \ldots, N. \tag{2.45} \]
Moreover, for all $\lambda \in [0, \bar{\lambda}]$ there exists
\[ E_n(B, \lambda) \in [B_n - \delta, B_n + \delta] \cap \Xi_{DD}^{(B,\lambda)} \quad \text{for } n = 1, 2, \ldots, N. \tag{2.46} \]
In particular, for $n = 1, 2, \ldots, N$ we have $L_+^{\{B,\lambda\}}(E_n(B, \lambda)) = \infty$, and for every $\mathcal{X} \in C_{c,+}^\infty(\mathbb{R})$ with $\mathcal{X} \equiv 1$ on some open interval $J \ni E_n(B, \lambda)$ and $p > 24$, we have
\[ \mathcal{M}_{B,\lambda}(p, \mathcal{X}, T) \ge C_{p,\mathcal{X}}\, T^{\frac{p}{4} - 6} \tag{2.47} \]
for all $T \ge 0$ with $C_{p,\mathcal{X}} > 0$.

Note that for all $\lambda \in [\frac{1}{U_-} B, \bar{\lambda}]$ all the spectral gaps are closed, but we still show existence of at least one dynamical mobility edge near the first $N$ Landau levels, namely a boundary point between the regions of dynamical localization and dynamical delocalization.

Another application of the results in this paper can be found in a companion article [23], which considers an Anderson–Landau Hamiltonian $H^{(A)}_{B,\lambda,\omega}$ as in (2.5) and (2.6), but with a common probability distribution $\mu$ which has a bounded density $\rho$ with $\operatorname{supp} \rho = \mathbb{R}$ and fast decay:
\[ \rho(\omega) \le \rho_0 \exp(-|\omega|^\alpha) \quad \text{for some } \rho_0 \in (0, +\infty) \text{ and } \alpha > 0. \tag{2.48} \]
(In particular, $\mu$ may have a Gaussian distribution.) The random potential $V_\omega$ is now an unbounded ergodic potential, but $H_{B,\lambda,\omega}$ is essentially self-adjoint on $C_c^\infty(\mathbb{R}^d)$ with probability one, and we have (see [4])
\[ \Sigma_{B,\lambda} = \mathbb{R} \quad \text{for all } \lambda > 0, \tag{2.49} \]
where $\Sigma_{B,\lambda}$ is the spectrum of $H_{B,\lambda,\omega}$ with probability one. It is shown in [23] that the main results of this paper, and in particular Theorems 2.1, 2.2 and 2.4, as well as the relevant results from [21], hold for these Anderson–Landau Hamiltonians with $\operatorname{supp} \mu = \mathbb{R}$ (and hence unbounded potentials). Note that (2.37) is still valid, although its proof must be modified, taking into account that the Wegner estimate can be controlled as $\lambda \to 0$ for intervals that do not contain Landau levels. The fact that the Landau gaps are immediately filled up as soon as the disorder is turned on implies that the approach used in [24] and in Corollary 2.3 is not applicable. Proving the existence of a dynamical transition in that case requires the full set of conclusions of Theorem 2.4, namely that the Hall conductance is integer valued and continuous on connected components of $\Xi_{L_+}$, as used in the proof of Theorem 2.5. The continuity of the Hall conductance for arbitrarily small $\lambda$ (in order to let $\lambda$ go to zero) given by Theorem 2.4 is required. A result similar to Theorem 2.5(ii) is proved in [23]: given $n \in \mathbb{N}$, there is at least one dynamical mobility edge near the first $N$ Landau levels for small $\lambda$. It can be stated as follows.
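Theorem 2.6 ([23]). Let $H_{B,\lambda,\omega}$ be a random Landau Hamiltonian as in (2.5) and (2.6), but with a common probability distribution $\mu$ which has a bounded density $\rho$ with $\operatorname{supp} \rho = \mathbb{R}$ and (2.48), so (2.49) holds for all $\lambda > 0$. Let $B > 0$. Then, for each $n \in \mathbb{N}$, there exists $\lambda(n) > 0$, such that for $\lambda \in (0, \lambda(n)]$ there exist $E_n^{(\pm)}(B, \lambda)$, with $B_n - B < E_n^{(-)}(B, \lambda) < B_n < E_n^{(+)}(B, \lambda) < B_n + B$,
\[ |E_n^{(\pm)}(B, \lambda) - B_n| \le K_n(B)\, \lambda\, |\log \lambda|^{\frac{1}{\alpha}} \to 0 \quad \text{as } \lambda \to 0, \tag{2.50} \]
with a finite constant $K_n(B)$, and
\[ (E_n^{(+)}(B, \lambda), E_{n+1}^{(-)}(B, \lambda)) \subset \Xi_{DL}^{(B,\lambda)}. \tag{2.51} \]
We also have $(-\infty, E_1^{(-)}(B, \lambda)) \subset \Xi_{DL}^{(B,\lambda)}$ for $\lambda \in (0, \lambda(0)]$, $\lambda(0) > 0$. Moreover, for $\lambda \in (0, \min\{\lambda(n - 1), \lambda(n)\})$ there exists
\[ E_n(B, \lambda) \in [E_n^{(-)}(B, \lambda), E_n^{(+)}(B, \lambda)] \cap \Xi_{DD}^{(B,\lambda)}, \tag{2.52} \]
and hence (2.47) holds for every $\mathcal{X} \in C_{c,+}^\infty(\mathbb{R})$ with $\mathcal{X} \equiv 1$ on some open interval $J \ni E_n(B, \lambda)$ and $p > 24$.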
We collect some technicalities in Sec. 3. In Sec. 4, we study the Hall conductance, proving Theorem 2.1. Section 5 is devoted to the continuity of the Hall conductance: Theorem 2.2 is proved in Sec. 5.1, and the stronger version for Anderson–Landau Hamiltonians, Theorem 2.4, is proved in Sec. 5.2. Corollary 2.3 is proven in Sec. 6. Dynamical delocalization (and a dynamical metal-insulator transition) for the Anderson–Landau Hamiltonians with closed spectral gaps is shown in Sec. 7, where we prove Theorem 2.5. In Appendix A, we prove a useful lemma about the spectrum of Landau Hamiltonians with bounded potentials. The spectrum of the Anderson–Landau Hamiltonian is discussed in Appendix B.

3. Technicalities

3.1. Norms on random operators and Fermi projections

Given $p \in [1, \infty)$, $\mathcal{T}_p$ will denote the Banach space of bounded operators $S$ on $L^2(\mathbb{R}^2, dx)$ with $\|S\|_{\mathcal{T}_p} = \|S\|_p := (\operatorname{tr}|S|^p)^{\frac{1}{p}} < \infty$. A random operator $S_\omega$ is a strongly measurable map from the probability space $(\Omega, \mathbb{P})$ to bounded operators on $L^2(\mathbb{R}^2, dx)$. Given $p \in [1, \infty)$, we set
\[ |||S_\omega|||_p := \{\mathbb{E}\{\|S_\omega\|_p^p\}\}^{\frac{1}{p}} = \bigl\| \|S_\omega\|_{\mathcal{T}_p} \bigr\|_{L^p(\Omega, \mathbb{P})}, \tag{3.1} \]
and
\[ |||S_\omega|||_\infty := \bigl\| \|S_\omega\| \bigr\|_{L^\infty(\Omega, \mathbb{P})}. \tag{3.2} \]
These are norms on random operators. Note that
\[ |||S_\omega|||_q \le |||S_\omega|||_\infty^{\frac{q-p}{q}}\, |||S_\omega|||_p^{\frac{p}{q}} \quad \text{for } 1 \le p \le q < \infty, \tag{3.3} \]
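and they satisfy Hölder's inequality:
\[ |||S_\omega T_\omega|||_r \le |||S_\omega|||_p\, |||T_\omega|||_q \quad \text{for } r, p, q \in [1, \infty] \text{ with } \frac{1}{r} = \frac{1}{p} + \frac{1}{q}. \tag{3.4} \]
In particular, if $|||S_\omega|||_\infty \le 1$, we have
\[ |||S_\omega|||_q \le |||S_\omega|||_2^{\frac{2}{q}} \quad \text{for } 2 \le p \le q < \infty. \tag{3.5} \]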
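The interpolation bound (3.3) can be sanity-checked on a toy model. In the sketch below, which is an illustration and not the setting of the paper, the probability space is finite and each realization of the random operator is represented only by its list of singular values, so that $\operatorname{tr}|S_\omega|^p$ is a finite sum. The ensemble and the function names are hypothetical.

```python
def triple_norm(ensemble, p):
    """|||S|||_p = (E tr|S|^p)^(1/p) for a finite ensemble.

    `ensemble` is a list of (probability, singular_values) pairs.
    """
    expectation = sum(w * sum(s ** p for s in svals) for w, svals in ensemble)
    return expectation ** (1.0 / p)


def triple_norm_inf(ensemble):
    """|||S|||_inf = essential supremum of the operator norm."""
    return max(max(svals) for w, svals in ensemble if w > 0)


# hypothetical ensemble: three realizations with their singular values
ensemble = [
    (0.5, [0.9, 0.4, 0.1]),
    (0.3, [2.0, 0.5]),
    (0.2, [0.05, 0.01, 0.3, 0.7]),
]

checks = []
for p, q in [(1, 2), (2, 3), (1.5, 4), (2, 2)]:
    lhs = triple_norm(ensemble, q)
    rhs = (triple_norm_inf(ensemble) ** ((q - p) / q)
           * triple_norm(ensemble, p) ** (p / q))
    checks.append(lhs <= rhs * (1 + 1e-12))
```

The inequality holds because each singular value satisfies $s^q \le \|S\|^{q-p} s^p$; for $p = q$ both sides coincide, which is why a small relative tolerance is used in the comparison.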
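3.2. Operator kernels of Fermi projections

Let $H_{B,\lambda,\omega}$ be an ergodic Landau Hamiltonian for a given magnetic field $B > 0$, disorder $\lambda \ge 0$ and energy $E \in \mathbb{R}$. We consider the operator kernel of the Fermi projection $P_{B,\lambda,E,\omega} = \chi_{(-\infty,E]}(H_{B,\lambda,\omega})$, $\{\chi_x P_{B,\lambda,E,\omega} \chi_y\}_{x,y \in \mathbb{Z}^2}$, and set
\[ \kappa_p(B, \lambda, E) \equiv |||\chi_0 P_{B,\lambda,E,\omega} \chi_0|||_p \ \text{ for } p \in [1, \infty], \qquad \kappa_{1,\infty}(B, \lambda, E) \equiv \|\operatorname{tr}\{\chi_0 P_{B,\lambda,E,\omega} \chi_0\}\|_{L^\infty(\Omega, \mathbb{P})}. \tag{3.6} \]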
Note that κ1,∞ (B, λ, E) is locally bounded on Ξ (e.g., [7]), and hence also κp (B, λ, E), since κ∞ (B, λ, E) ≤ 1 and for p ∈ [1, ∞) we have 1
1
κp (B, λ, E) ≤ χ0 PB,λ,E,ω χ0 1p ≤ {κ1,∞ (B, λ, E)} p . In addition, we have 1 1 = χ P 2 (B, λ, E)} 2 0 B,λ,E,ω χ0 p = {κ p 2 2 χ0 PB,λ,E,ω p 1 = χ P 2 0 B,λ,E,ω | 2 2p ≤ κp (B, λ, E)
if p ∈ [2, ∞) if p ∈ [1, ∞)
(3.7)
, (3.8)
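and thus, given $x \in \mathbb{Z}^2$, for all $p \in [1, \infty)$ we have
\[ |||\chi_0 P_{B,\lambda,E,\omega} \chi_x|||_p \le |||\chi_0 P_{B,\lambda,E,\omega}|||_{2p}\, |||P_{B,\lambda,E,\omega} \chi_x|||_{2p} = \kappa_p(B, \lambda, E). \tag{3.9} \]
It follows from (2.10) that
\[ N(B, \lambda, E) = \kappa_1(B, \lambda, E). \tag{3.10} \]
Note that
\[ N(B, \lambda, E) = 0 \iff |||\chi_x P_{B,\lambda,E,\omega} \chi_0|||_2 = 0 \ \text{ for all } x \in \mathbb{Z}^2. \tag{3.11} \]

3.3. Localization lengths

We will now justify the definitions (2.22), (2.24) and (2.25). To justify (2.22), we must show that the limit exists in $[0, \infty)$. Given $\beta \in (0, 1]$ and $(B, \lambda, E) \in \Xi$, let
\[ \widetilde{L}_\beta(B, \lambda, E) := N(B, \lambda, E)^{1-\beta} L_\beta(B, \lambda, E), \tag{3.12} \]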
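where $N(B, \lambda, E)$ is as in (2.10). It follows from (3.9) that $\widetilde{L}_\beta(B, \lambda, E)$ is monotone decreasing in $\beta \in (0, 1]$, so we can define
\[ \widetilde{L}(B, \lambda, E) := \inf_{\beta \in (0,1)} \widetilde{L}_\beta(B, \lambda, E) = \lim_{\beta \uparrow 1} \widetilde{L}_\beta(B, \lambda, E). \tag{3.13} \]
It is an immediate consequence of (3.12) and (3.13) (cf. (3.11)) that $L(B, \lambda, E)$ is well defined and
\[ L(B, \lambda, E) = \widetilde{L}(B, \lambda, E). \tag{3.14} \]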
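The definitions (2.24) and (2.25) are justified in a similar way. As before,
\[ \widetilde{L}_{\beta+}(B, \lambda, E) := N(B, \lambda, E)^{1-\beta} L_{\beta+}(B, \lambda, E), \qquad \widetilde{L}_{\beta+}^{(B,\lambda)}(E) := N(B, \lambda, E)^{1-\beta} L_{\beta+}^{(B,\lambda)}(E), \tag{3.15} \]
are seen to be monotone decreasing in $\beta \in (0, 1]$, so we have
\[ L_+(B, \lambda, E) = \inf_{\beta \in (0,1)} \widetilde{L}_{\beta+}(B, \lambda, E) = \lim_{\beta \uparrow 1} \widetilde{L}_{\beta+}(B, \lambda, E), \tag{3.16} \]
\[ L_+^{(B,\lambda)}(E) = \inf_{\beta \in (0,1)} \widetilde{L}_{\beta+}^{(B,\lambda)}(E) = \lim_{\beta \uparrow 1} \widetilde{L}_{\beta+}^{(B,\lambda)}(E). \tag{3.17} \]
It follows that the sets $\Xi_{\#}$ and $\Xi_{\#}^{\{B,\lambda\}}$, $\# = L_\beta, L_{\beta+}$, are monotone increasing in $\beta \in (0, 1]$, and we have (2.29).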
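3.4. Auxiliary “localization lengths”

Although the “localization lengths” $L(B, \lambda, E)$ and $L_+(B, \lambda, E)$ give a convenient way to write our main theorems, in the proofs it will be more convenient to work with auxiliary “localization lengths” based on the norms for random operators introduced in (2.14) with $p \in [2, \infty)$. They can be thought of as an adaptation to the continuum (and to two parameters) of [1, condition (5.4)]. If $q \in [1, \infty)$, $J \subset [1, \infty)$, we define the following “localization lengths” for $(B, \lambda, E) \in \Xi$:
\[ \begin{aligned}
\ell_q(B, \lambda, E) &:= \sum_{x \in \mathbb{Z}^2} \max\{|x|, 1\}\, |||\chi_x P_{B,\lambda,E,\omega} \chi_0|||_q, \\
\ell_{q+}(B, \lambda, E) &:= \inf_{\substack{\Phi \ni (B,\lambda,E) \\ \Phi \subset \Xi \text{ open}}}\ \sup_{(B', \lambda', E') \in \Phi} \ell_q(B', \lambda', E'), \\
\ell_{q+}^{(B,\lambda)}(E) &:= \inf_{\substack{I \ni E \\ I \subset \mathbb{R} \text{ open}}}\ \sup_{E' \in I} \ell_q(B, \lambda, E'), \\
\ell_J(B, \lambda, E) &:= \inf_{q \in J} \ell_q(B, \lambda, E), \\
\ell_{J+}(B, \lambda, E) &:= \inf_{q \in J} \ell_{q+}(B, \lambda, E), \\
\ell_{J+}^{(B,\lambda)}(E) &:= \inf_{q \in J} \ell_{q+}^{(B,\lambda)}(E).
\end{aligned} \tag{3.18} \]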
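While the quantity in [1, condition (5.4)] is monotone increasing in $q \in [1, \infty)$, the “localization lengths” $\ell_q(B, \lambda, E)$ cannot be compared for different $q$'s. Another difference is that [1, condition (5.4)] implies the equivalent of (2.17) in the lattice, but $\ell_q(B, \lambda, E) < \infty$ only implies (2.17) if $q = 2$. We also define the subsets of $\Xi$ where these “localization lengths” are finite:
\[ \begin{aligned}
\Xi_{\#} &= \{(B, \lambda, E) \in \Xi;\ \ell_{\#}(B, \lambda, E) < \infty\}, \quad \# = q, q+, J, J+, \\
\Xi_{\#}^{\{B,\lambda\}} &= \{E \in \mathbb{R};\ \ell_{\#}^{(B,\lambda)}(E) < \infty\}, \quad \# = q+, J+.
\end{aligned} \tag{3.19} \]
Note that we may have $\Xi_{\#}^{\{B,\lambda\}} \neq \Xi_{\#}^{(B,\lambda)}$, with $\Xi_{\#}^{(B,\lambda)}$ defined as in (2.21). However, $\Xi_{\#}^{\{B,\lambda\}} \supset \Xi_{\#}^{(B,\lambda)}$ and
\[ \Xi_J = \bigcup_{q \in J} \Xi_q, \quad \Xi_{J+} = \bigcup_{q \in J} \Xi_{q+}, \quad \Xi_{J+}^{\{B,\lambda\}} = \bigcup_{q \in J} \Xi_{q+}^{\{B,\lambda\}}. \tag{3.20} \]
$\Xi_{J+}$ is, by definition, a relatively open subset of $\Xi$, and $\Xi_{J+}^{\{B,\lambda\}}$ is an open subset of $\mathbb{R}$. If $q \in [2, \infty)$, it follows immediately from (3.5) and (3.6) that for all $(B, \lambda, E) \in \Xi$ we have
\[ \ell_q(B, \lambda, E) \le \kappa_q(B, \lambda, E) + L_{\frac{2}{q}}(B, \lambda, E), \tag{3.21} \]
\[ \ell_{q+}(B, \lambda, E) \le \kappa_q(B, \lambda, E) + L_{\frac{2}{q}+}(B, \lambda, E), \tag{3.22} \]
\[ \ell_{q+}^{(B,\lambda)}(E) \le \kappa_q(B, \lambda, E) + L_{\frac{2}{q}+}^{(B,\lambda)}(E). \tag{3.23} \]
It follows that
\[ \Xi_L \subset \bigcap_{r > 2} \Xi_{(2,r]} \quad \text{and} \quad \Xi_{L_+} \subset \bigcap_{r > 2} \Xi_{(2,r]+}. \tag{3.24} \]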
r>2 (A)
For the Anderson–Landau Hamiltonian HB,λ,ω the following holds for all large q0 (recall (2.33)–(2.37)):
Ξq+ = Ξq0 + , ΞDL = q∈[1,∞) (B,λ)
ΞDL
=
{B,λ}
Ξq+
{B,λ}
(3.25)
= Ξq0 + .
q∈[1,∞)
4. Existence and Quantization of the Hall Conductance

Theorem 2.1 is an immediate consequence of the following theorem.

Theorem 4.1. Let $H_{B,\lambda,\omega}$ be an ergodic Landau Hamiltonian. Then the Hall conductance $\sigma_H(B, \lambda, E)$ is defined on $\Xi_{[2,\infty)}$ with the bound
\[ |\sigma_H(B, \lambda, E)| \le 4\pi \inf_{\substack{q \in [2,\infty) \\ \frac{1}{p} + \frac{2}{q} = 1}} \bigl\{ \kappa_p(B, \lambda, E) \{\ell_q(B, \lambda, E)\}^2 \bigr\} < \infty. \tag{4.1} \]
It follows that $\sigma_H(B, \lambda, E)$ is locally bounded on $\Xi_{[2,\infty)+}$ and on each $\Xi_{[2,\infty)+}^{\{B,\lambda\}}$. Moreover, the Hall conductance $\sigma_H(B, \lambda, E)$ is integer valued on $\Xi_{(2,3]}$.

Theorem 4.1 will be proved by the following lemmas. Given $x \in \mathbb{R}^2$, we set $\hat{x}$ to be the discretization of $x$, i.e. the unique element of $\mathbb{Z}^2$ such that $x_i \in [\hat{x}_i - \frac{1}{2}, \hat{x}_i + \frac{1}{2})$, $i = 1, 2$. We let $\hat{X}_i$ denote the operator given by multiplication by $\hat{x}_i$, and note that $\hat{X}_i \chi_u = u_i \chi_u$ for each $u \in \mathbb{Z}^2$, i.e. $\hat{X}_i = \sum_{x \in \mathbb{Z}^2} x_i \chi_x$, and note
\[ |X_i - \hat{X}_i| \le \frac{1}{2}, \qquad \bigl| |X| - |\hat{X}| \bigr| \le \frac{\sqrt{2}}{2}. \tag{4.2} \]
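The discretization and the two bounds in (4.2) are elementary, and a minimal sketch makes the half-open convention explicit. The helper names and the random sample below are illustrative assumptions; rounding with `floor(x + 1/2)` is one way to realize $x_i \in [\hat{x}_i - \frac{1}{2}, \hat{x}_i + \frac{1}{2})$.

```python
import math
import random


def discretize(x):
    """Return x_hat in Z^2 with x_i in [x_hat_i - 1/2, x_hat_i + 1/2)."""
    return tuple(math.floor(xi + 0.5) for xi in x)


def norm(v):
    """Euclidean norm on R^2."""
    return math.hypot(v[0], v[1])


rng = random.Random(0)
points = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(2000)]

# largest coordinate-wise deviation |x_i - x_hat_i| over the sample
coord_gap = max(abs(xi - hi)
                for x in points
                for xi, hi in zip(x, discretize(x)))
# largest deviation | |x| - |x_hat| | over the sample
norm_gap = max(abs(norm(x) - norm(discretize(x))) for x in points)
```

The second bound follows from the first by the reverse triangle inequality, since $|x - \hat{x}| \le \sqrt{(1/2)^2 + (1/2)^2} = \sqrt{2}/2$.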
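If $(B, \lambda, E) \in \Xi$ and $q \in [1, \infty)$, it follows that
\[ ||| |\hat{X}| P_{B,\lambda,E,\omega} \chi_0 |||_q \le \ell_q(B, \lambda, E), \tag{4.3} \]
and hence, using (4.2) and (3.8), we get
\[ ||| |X| P_{B,\lambda,E,\omega} \chi_0 |||_q \le \ell_q(B, \lambda, E) + \kappa_q(B, \lambda, E) \le 2 \ell_q(B, \lambda, E). \tag{4.4} \]
It follows that, with $i = 1, 2$,
\[ |||[P_{B,\lambda,E,\omega}, \hat{X}_i] \chi_0|||_q \le \ell_q(B, \lambda, E), \tag{4.5} \]
\[ |||[P_{B,\lambda,E,\omega}, X_i] \chi_0|||_q \le 3 \ell_q(B, \lambda, E). \tag{4.6} \]
We conclude, using covariance, that for $\mathbb{P}$-a.e. $\omega$, $\hat{X}_i P_{B,\lambda,E,\omega} \chi_u$ and $X_i P_{B,\lambda,E,\omega} \chi_u$, and hence also $[P_{B,\lambda,E,\omega}, \hat{X}_i] \chi_u$ and $[P_{B,\lambda,E,\omega}, X_i] \chi_u$, are bounded operators for all $(B, \lambda, E) \in \Xi_{[1,\infty)}$, $u \in \mathbb{Z}^2$, $i = 1, 2$.

We now define a modified Hall conductance, with $\hat{X}_i$ substituted for $X_i$:
\[ \hat{\sigma}_H(B, \lambda, E) = -2\pi i\, \mathbb{E}\{\operatorname{tr}\{\chi_0 P_{B,\lambda,E,\omega} [[P_{B,\lambda,E,\omega}, \hat{X}_1], [P_{B,\lambda,E,\omega}, \hat{X}_2]] \chi_0\}\}, \tag{4.7} \]
defined for $(B, \lambda, E) \in \Xi$ such that
\[ |||\chi_0 P_{B,\lambda,E,\omega} [[P_{B,\lambda,E,\omega}, \hat{X}_1], [P_{B,\lambda,E,\omega}, \hat{X}_2]] \chi_0|||_1 < \infty. \tag{4.8} \]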
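Lemma 4.2. The Hall conductances $\sigma_H(B, \lambda, E)$ and $\hat{\sigma}_H(B, \lambda, E)$ are defined on the set $\Xi_{[2,\infty)}$. Moreover, for all $(B, \lambda, E) \in \Xi_{[2,\infty)}$ we have
\[ \sigma_H(B, \lambda, E) = \hat{\sigma}_H(B, \lambda, E) = -2\pi i \sum_{u,v \in \mathbb{Z}^2} (u_1 v_2 - u_2 v_1)\, \mathbb{E}\{\operatorname{tr}\{\chi_0 P_{B,\lambda,E,\omega} \chi_u P_{B,\lambda,E,\omega} \chi_v P_{B,\lambda,E,\omega} \chi_0\}\}, \tag{4.9} \]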
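with
\[ |\sigma_H(B, \lambda, E)| \le 4\pi \sum_{u,v \in \mathbb{Z}^2} |u||v|\, |||\chi_0 P_{B,\lambda,E,\omega} \chi_u P_{B,\lambda,E,\omega} \chi_v P_{B,\lambda,E,\omega} \chi_0|||_1 \le 4\pi\, \kappa_p(B, \lambda, E) \{\ell_q(B, \lambda, E)\}^2 < \infty \tag{4.10} \]
for all $q \in [2, \infty)$ and $\frac{1}{p} + \frac{2}{q} = 1$.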
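Proof. Let $(B, \lambda, E) \in \Xi_q$ for some $q \in [1, \infty)$. Writing $P_\omega$ for $P_{B,\lambda,E,\omega}$, we have
\[ |||\chi_0 P_\omega [[P_\omega, X_1], [P_\omega, X_2]] \chi_0|||_1 \le \sum_{u \in \mathbb{Z}^2} \bigl\{ |||\chi_0 P_\omega [P_\omega, X_1] \chi_u [P_\omega, X_2] \chi_0|||_1 + |||\chi_0 P_\omega [P_\omega, X_2] \chi_u [P_\omega, X_1] \chi_0|||_1 \bigr\} < \infty, \tag{4.11} \]
since we may use Hölder's inequality (3.4) with $\frac{1}{p} + \frac{2}{q} = 1$ to get
\[ \begin{aligned}
\sum_{u \in \mathbb{Z}^2} |||\chi_0 P_\omega [P_\omega, X_i] \chi_u [P_\omega, X_j] \chi_0|||_1
&\le |||\chi_0 P_\omega|||_p \sum_{u \in \mathbb{Z}^2} |||[P_\omega, X_i] \chi_u|||_q\, (|u| + 1)\, |||\chi_u P_\omega \chi_0|||_q \\
&\le |||\chi_0 P_\omega|||_p\, |||[P_\omega, X_i] \chi_0|||_q \sum_{u \in \mathbb{Z}^2} (|u| + 1)\, |||\chi_u P_\omega \chi_0|||_q \\
&\le 4 \kappa_p(B, \lambda, E) \{\ell_q(B, \lambda, E)\}^2 < \infty
\end{aligned} \tag{4.12} \]
for $i, j = 1, 2$, where we used covariance, (3.8), (4.6), and (3.18). Thus $\sigma_H(B, \lambda, E)$ is defined on the set $\Xi_q$, and similarly for $\hat{\sigma}_H(B, \lambda, E)$.

We will now show that $\sigma_H(B, \lambda, E) = \hat{\sigma}_H(B, \lambda, E)$. To see that, note that
\[ \begin{aligned}
\sigma_H(B, \lambda, E) - \hat{\sigma}_H(B, \lambda, E)
&= -2\pi i\, \mathbb{E}\{\operatorname{tr}\{\chi_0 P_\omega [[P_\omega, X_1 - \hat{X}_1], [P_\omega, X_2]] \chi_0\}\} \\
&\quad + 2\pi i\, \mathbb{E}\{\operatorname{tr}\{\chi_0 P_\omega [[P_\omega, \hat{X}_1], [P_\omega, X_2 - \hat{X}_2]] \chi_0\}\}.
\end{aligned} \tag{4.13} \]
We have
\begin{align}
&\mathbb{E}\{\operatorname{tr}\{\chi_0 P_\omega [[P_\omega, X_1 - \hat{X}_1], [P_\omega, X_2]] \chi_0\}\} \notag \\
&\quad = \mathbb{E}\{\operatorname{tr}\{\chi_0 P_\omega (X_1 - \hat{X}_1)(1 - P_\omega)[P_\omega, X_2] \chi_0\}\} + \mathbb{E}\{\operatorname{tr}\{\chi_0 [P_\omega, X_2](1 - P_\omega)(X_1 - \hat{X}_1) P_\omega \chi_0\}\} \tag{4.14} \\
&\quad = \mathbb{E}\{\operatorname{tr}\{\chi_0 (X_1 - \hat{X}_1)(1 - P_\omega)[P_\omega, X_2] P_\omega \chi_0\}\} + \mathbb{E}\{\operatorname{tr}\{\chi_0 (X_1 - \hat{X}_1) P_\omega [P_\omega, X_2](1 - P_\omega) \chi_0\}\} \tag{4.15} \\
&\quad = \mathbb{E}\{\operatorname{tr}\{\chi_0 (X_1 - \hat{X}_1)[P_\omega, X_2] \chi_0\}\} = 0, \tag{4.16}
\end{align}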
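where in (4.16) we used centrality of trace, justified since $X_2 \chi_0$ is a bounded operator, to go from (4.15) to (4.16) we used
\[ (1 - P_\omega)[P_\omega, X_2] P_\omega + P_\omega [P_\omega, X_2](1 - P_\omega) = [P_\omega, X_2], \tag{4.17} \]
and the passage from (4.14) to (4.15) can be justified as follows:
\[ \begin{aligned}
\mathbb{E}\{\operatorname{tr}\{\chi_0 P_\omega (X_1 - \hat{X}_1)(1 - P_\omega)[P_\omega, X_2] \chi_0\}\}
&= \sum_{u \in \mathbb{Z}^2} \mathbb{E}\{\operatorname{tr}\{\chi_0 P_\omega \chi_u (X_1 - \hat{X}_1)(1 - P_\omega)[P_\omega, X_2] \chi_0\}\} \\
&= \sum_{u \in \mathbb{Z}^2} \mathbb{E}\{\operatorname{tr}\{\chi_u (X_1 - \hat{X}_1)(1 - P_\omega)[P_\omega, X_2] \chi_0 P_\omega \chi_u\}\} \\
&= \sum_{u \in \mathbb{Z}^2} \mathbb{E}\{\operatorname{tr}\{\chi_0 (X_1 - \hat{X}_1)(1 - P_\omega)[P_\omega, X_2] \chi_{-u} P_\omega \chi_0\}\} \\
&= \mathbb{E}\{\operatorname{tr}\{\chi_0 (X_1 - \hat{X}_1)(1 - P_\omega)[P_\omega, X_2] P_\omega \chi_0\}\},
\end{aligned} \tag{4.18} \]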
with a similar calculation for the other term in (4.15), where we used the centrality of the trace and covariance (the absolute summability of all series can be verified as in (4.12)). The second term in the right-hand side of (4.13) is also equal to 0 by a similar calculation, so we conclude that $\sigma_H(B, \lambda, E) = \hat{\sigma}_H(B, \lambda, E)$. Since, with $\frac{1}{p} + \frac{2}{q} = 1$, we have
\[ |u||v|\, |||\chi_0 P_\omega \chi_u P_\omega \chi_v P_\omega \chi_0|||_1 \le |u|\, |||\chi_0 P_\omega \chi_u|||_q\, |||\chi_0 P_\omega|||_p\, |v|\, |||\chi_v P_\omega \chi_0|||_q, \tag{4.19} \]
the estimate (4.10) follows from (3.18) and (3.8). The expression (4.9) then follows for $\sigma_H(B, \lambda, E) = \hat{\sigma}_H(B, \lambda, E)$ from (4.7).

Next, we will show that the Hall conductance $\sigma_H(B, \lambda, E)$ takes integer values on $\Xi_{(2,3]}$, following the approach of Avron, Seiler and Simon [3], as modified by Aizenman and Graf [1]. Avron, Seiler and Simon proved the result for random Landau Hamiltonians at energies outside the spectrum, i.e. on $\Xi_{NS}$. Their argument was adapted to the lattice by Aizenman and Graf, who proved that the Hall conductance for the lattice model takes integer values in the region where [1, condition (5.4)] holds, i.e. on the lattice equivalent of $\Xi_{(2,3]}$. (On the lattice this result had been proved earlier under the lattice equivalent of condition (2.17) by Bellissard, van Elst and Schulz-Baldes [6].) We complete the circle by adapting Aizenman and Graf's argument back to the continuum.

Let $\mathbb{Z}^{2*} = (\frac{1}{2}, \frac{1}{2}) + \mathbb{Z}^2$ denote the dual lattice to $\mathbb{Z}^2$. Given $a \in \mathbb{Z}^{2*}$ we define the complex valued function $\gamma_a(x)$ on $\mathbb{R}^2$ by
\[ \gamma_a(x) = \frac{\hat{x}_1 - a_1 + i(\hat{x}_2 - a_2)}{|\hat{x} - a|}, \tag{4.20} \]
and let $\Gamma_a$ denote the unitary operator given by multiplication by the function $\gamma_a(x)$. Note that $|\hat{x} - a| \ge \frac{\sqrt{2}}{2}$ for all $x \in \mathbb{R}^2$. We have the following estimate:
\[ |\gamma_a(x) - \gamma_a(y)| \le \min\Bigl\{ |\hat{x} - \hat{y}| \max\Bigl\{ \frac{1}{|\hat{x} - a|}, \frac{1}{|\hat{y} - a|} \Bigr\},\ 2 \Bigr\} \le \min\Bigl\{ 4 \frac{|\hat{x} - \hat{y}|}{|\hat{x} - a|},\ 2 \Bigr\}. \tag{4.21} \]
(The first inequality can be found in [3]. The second inequality can be seen as follows: if $|\hat{x} - \hat{y}| \le \frac{1}{2}|\hat{x} - a|$ we have $|\hat{x} - a| - |\hat{y} - a| \le |\hat{x} - \hat{y}| \le \frac{1}{2}|\hat{x} - a|$, and hence $|\hat{x} - a| \le 2|\hat{y} - a|$; if $|\hat{x} - \hat{y}| > \frac{1}{2}|\hat{x} - a|$ we have $\frac{|\hat{x} - \hat{y}|}{|\hat{x} - a|} > \frac{1}{2}$, and hence $4 \frac{|\hat{x} - \hat{y}|}{|\hat{x} - a|} > 2$.)

Given two orthogonal projections $P$ and $Q$ in a Hilbert space, such that $P - Q$ is compact, the index of $P$ and $Q$ is defined by (cf. [3, Sec. 2])
\[ \operatorname{Index}(P, Q) := \dim \operatorname{Ker}(P - Q - 1) - \dim \operatorname{Ker}(Q - P - 1). \tag{4.22} \]
The index is a well defined integer since $P - Q$ compact implies that $\dim \operatorname{Ker}(P - Q \pm 1)$ are both finite. Note that in the case $P$ and $Q$ have finite rank we have
\[ \operatorname{Index}(P, Q) = \dim \operatorname{Ran} P - \dim \operatorname{Ran} Q = \operatorname{tr}(P - Q). \tag{4.23} \]
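Before turning to the proof, the two inequalities in (4.21) can be sanity-checked numerically. The sketch below is only an illustration: since $\gamma_a$ depends on $x$ through $\hat{x}$, it samples pairs of lattice points $\hat{x}, \hat{y} \in \mathbb{Z}^2$ directly, with the dual-lattice point `a`, the sampling window and the tolerance chosen arbitrarily.

```python
import math
import random


def gamma(a, xhat):
    """gamma_a of (4.20) at a point whose discretization is xhat in Z^2."""
    z = complex(xhat[0] - a[0], xhat[1] - a[1])
    return z / abs(z)


def dist(u, v):
    """Euclidean distance between two points of R^2."""
    return math.hypot(u[0] - v[0], u[1] - v[1])


rng = random.Random(0)
a = (0.5, 0.5)  # a point of the dual lattice (1/2, 1/2) + Z^2
tol = 1e-12
violations = []
min_dist = float("inf")

for _ in range(5000):
    xh = (rng.randint(-20, 20), rng.randint(-20, 20))
    yh = (rng.randint(-20, 20), rng.randint(-20, 20))
    min_dist = min(min_dist, dist(xh, a))
    lhs = abs(gamma(a, xh) - gamma(a, yh))
    d = dist(xh, yh)
    first = min(d * max(1 / dist(xh, a), 1 / dist(yh, a)), 2.0)
    second = min(4 * d / dist(xh, a), 2.0)
    # record any pair for which either inequality of (4.21) fails
    if lhs > first + tol or first > second + tol:
        violations.append((xh, yh))
```

The list `violations` stays empty on the sample, and `min_dist` never drops below $\sqrt{2}/2$, which is the distance from a lattice point to the nearest dual-lattice point. The decay of the right-hand side in $|\hat{x} - a|$ is what makes the sums over $x$ in the proof of Lemma 4.3 converge.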
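Lemma 4.3. The Hall conductance $\sigma_H(B, \lambda, E)$ takes integer values on $\Xi_{(2,3]}$.

Proof. Let $(B, \lambda, E) \in \Xi_q$ for some $q \in (2, 3]$, and write $P_\omega$ for $P_{B,\lambda,E,\omega}$. As in [3, 1], we prove that for all $a \in \mathbb{Z}^{2*}$ we have
\[ \mathbb{E}(\|P_\omega - \Gamma_a P_\omega \Gamma_a^*\|_3) < \infty, \tag{4.24} \]
and hence for $\mathbb{P}$-a.e. $\omega$ the index of the orthogonal projections $P_\omega$ and $\Gamma_a P_\omega \Gamma_a^*$ (see [3, Sec. 2]), $\operatorname{Index}(P_\omega, \Gamma_a P_\omega \Gamma_a^*)$, is the finite integer given by
\[ \operatorname{Index}(P_\omega, \Gamma_a P_\omega \Gamma_a^*) = \operatorname{tr}(P_\omega - \Gamma_a P_\omega \Gamma_a^*)^3. \tag{4.25} \]
Note that $\operatorname{Index}(P_\omega, \Gamma_a P_\omega \Gamma_a^*)$ is independent of $a \in \mathbb{Z}^{2*}$ [3, Proposition 3.8], and hence it follows from the covariance relation (2.9) and properties of the index (use [3, Proposition 2.4]) that for all $b \in \mathbb{Z}^2$ we have
\[ \operatorname{Index}(P_{\tau_b \omega}, \Gamma_a P_{\tau_b \omega} \Gamma_a^*) = \operatorname{Index}(U_b P_\omega U_b^*, \Gamma_a U_b P_\omega U_b^* \Gamma_a^*) = \operatorname{Index}(P_\omega, \Gamma_{a+b} P_\omega \Gamma_{a+b}^*) = \operatorname{Index}(P_\omega, \Gamma_a P_\omega \Gamma_a^*). \tag{4.26} \]
Since $\operatorname{Index}(P_\omega, \Gamma_a P_\omega \Gamma_a^*)$ is a measurable function by (4.25), it follows from ergodicity that it must be constant almost surely (see [3, Proposition 8.1]). In particular, this constant must be an integer, and, since constants are integrable,
\[ \mathbb{E}\{\operatorname{Index}(P_\omega, \Gamma_a P_\omega \Gamma_a^*)\} = \operatorname{Index}(P_\omega, \Gamma_a P_\omega \Gamma_a^*) \quad \text{for } \mathbb{P}\text{-a.e. } \omega \tag{4.27} \]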
is an integer, and the lemma will follow if we show σH (B, λ, E) = E{Index(Pω , Γa Pω Γ∗a )}.
(4.28)
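Let $T_\omega = P_\omega - \Gamma_a P_\omega \Gamma_a^*$. We have

$$\| T_\omega \|_q \le \sum_{y \in \mathbb{Z}^2} \Big\| \sum_{x \in \mathbb{Z}^2} \chi_{x+y} T_\omega \chi_x \Big\|_q, \tag{4.29}$$

where

$$\Big\| \sum_{x \in \mathbb{Z}^2} \chi_{x+y} T_\omega \chi_x \Big\|_q^q = \operatorname{tr} \Big( \sum_{x \in \mathbb{Z}^2} \chi_x T_\omega^* \chi_{x+y} T_\omega \chi_x \Big)^{\frac{q}{2}} = \sum_{x \in \mathbb{Z}^2} \operatorname{tr} \big| \chi_x T_\omega^* \chi_{x+y} T_\omega \chi_x \big|^{\frac{q}{2}} = \sum_{x \in \mathbb{Z}^2} \| \chi_{x+y} T_\omega \chi_x \|_q^q, \tag{4.30}$$

and hence

$$\| T_\omega \|_q \le \sum_{y \in \mathbb{Z}^2} \Big( \sum_{x \in \mathbb{Z}^2} \| \chi_{x+y} T_\omega \chi_x \|_q^q \Big)^{\frac{1}{q}}, \tag{4.31}$$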
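which is the extension of [1, Lemma 1] to the continuum. Note that if the right-hand side of (4.31) is finite, then

$$T_\omega = \sum_{y \in \mathbb{Z}^2} \sum_{x \in \mathbb{Z}^2} \chi_{x+y} T_\omega \chi_x \quad \text{in } \mathcal{T}_q, \tag{4.32}$$

where $\mathcal{T}_q$ is the Banach space of compact operators with the norm $\|\cdot\|_q$, in the sense that for each $y \in \mathbb{Z}^2$ the series $\sum_{x \in \mathbb{Z}^2} \chi_{x+y} T_\omega \chi_x$ converges in $\mathcal{T}_q$, to, say, $T^{(y)}$ (but the series is not necessarily absolutely summable), the series $\sum_{y \in \mathbb{Z}^2} T^{(y)}$ converges absolutely in $\mathcal{T}_q$, and $T = \sum_{y \in \mathbb{Z}^2} T^{(y)}$.

It follows from (4.21) that

$$||| \chi_{x+y} T_\omega \chi_x |||_q \le 4\, \frac{|y|}{|x - a|}\, ||| \chi_y P_\omega \chi_0 |||_q, \tag{4.33}$$

and hence

$$\mathbb{E}\left( \| T_\omega \|_q \right) \le \sum_{y \in \mathbb{Z}^2} \Big( \sum_{x \in \mathbb{Z}^2} ||| \chi_{x+y} T_\omega \chi_x |||_q^q \Big)^{\frac{1}{q}} \le 4 \Big( \sum_{x \in \mathbb{Z}^2} \frac{1}{|x - a|^q} \Big)^{\frac{1}{q}} \ell_q(B, \lambda, E) < \infty, \tag{4.34}$$

where we used $q > 2$. Since we also have $q \le 3$, and $\| S \|_r \le \| S \|_s$ for any $1 \le s \le r < \infty$, we note that (4.24) follows from (4.34).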
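The monotonicity of the Schatten norms in the exponent, used in the last step, can be illustrated on a matrix. The following is a minimal sketch, assuming NumPy and a randomly generated $8 \times 8$ matrix; the function name `schatten_norm` is hypothetical.

```python
import numpy as np

def schatten_norm(s, p):
    """Schatten p-norm: the l^p norm of the singular values of s."""
    sv = np.linalg.svd(s, compute_uv=False)
    return float(np.sum(sv ** p) ** (1.0 / p))

rng = np.random.default_rng(1)
S = rng.standard_normal((8, 8))

# ||S||_r <= ||S||_s whenever 1 <= s <= r < infinity.
norms = {p: schatten_norm(S, p) for p in (1, 2, 2.5, 3, 6)}
```

In particular the 3-norm is dominated by the $q$-norm for every $q \in (2, 3]$, which is how a bound on the $q$-norm yields a bound on the 3-norm.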
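It remains to prove (4.28). To do so, note that it follows from (4.32) and (4.25) that

$$\operatorname{Index}(P_\omega, \Gamma_a P_\omega \Gamma_a^*) = \operatorname{tr} T_\omega^3 = \sum_{u,v \in \mathbb{Z}^2} \sum_{x \in \mathbb{Z}^2} \operatorname{tr}(\chi_x T_\omega \chi_{x+u} T_\omega \chi_{x+v} T_\omega \chi_x), \tag{4.35}$$

where the series in $x$ is at first only known to be convergent for each $u, v$, but not absolutely convergent, to, say, $\zeta(u, v)$, and $\sum_{u,v \in \mathbb{Z}^2} |\zeta(u, v)| < \infty$. To show that the series is actually absolutely convergent, we let $r$ be given by $\frac{1}{r} + \frac{2}{q} = 1$, so in particular $q < r$, and note that, using (4.21), we have

$$\sum_{u,v,x \in \mathbb{Z}^2} \mathbb{E}\{ \operatorname{tr} |\chi_x T_\omega \chi_{x+u} T_\omega \chi_{x+v} T_\omega \chi_x| \} \le \sum_{u,v,x \in \mathbb{Z}^2} \frac{4|u|}{|x - a|}\, 2^{1 - \frac{q}{r}} \left( \frac{4|u - v|}{|x + u - a|} \right)^{\frac{q}{r}} \frac{4|v|}{|x - a|}\, ||| \chi_0 P_\omega \chi_u P_\omega \chi_v P_\omega \chi_0 |||_1$$

$$\le 64 \sum_{u,v \in \mathbb{Z}^2} |u|\,|u - v|^{\frac{q}{r}}\,|v|\; ||| \chi_0 P_\omega \chi_u P_\omega \chi_v P_\omega \chi_0 |||_1 \sum_{a \in \mathbb{Z}^{2*}} \frac{1}{|a|^2 |u - a|^{\frac{q}{r}}} < \infty, \tag{4.36}$$

since

$$\sum_{a \in \mathbb{Z}^{2*}} \frac{1}{|a|^2 |u - a|^{\frac{q}{r}}} \le \Big( \sum_{a \in \mathbb{Z}^{2*}} \frac{1}{|a|^{\frac{6r}{3r - q}}} \Big)^{\frac{3r - q}{3r}} \Big( \sum_{a \in \mathbb{Z}^{2*}} \frac{1}{|a|^3} \Big)^{\frac{q}{3r}} < \infty, \tag{4.37}$$

and

$$\sum_{u,v \in \mathbb{Z}^2} |u|\,|u - v|^{\frac{q}{r}}\,|v|\; ||| \chi_0 P_\omega \chi_u P_\omega \chi_v P_\omega \chi_0 |||_1 \le \Big( \sup_{x \in \mathbb{Z}^2} |x|^{\frac{q}{r}}\, ||| \chi_x P_\omega \chi_0 |||_r \Big) \{ \ell_q(B, \lambda, E) \}^2$$

$$\le \Big( \sup_{x \in \mathbb{Z}^2} |x|\, ||| \chi_x P_\omega \chi_0 |||_q \Big)^{\frac{q}{r}} \{ \ell_q(B, \lambda, E) \}^2 \le \{ \ell_q(B, \lambda, E) \}^{2 + \frac{q}{r}} < \infty. \tag{4.38}$$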
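We can thus take expectations in (4.35), obtaining

$$\mathbb{E}\{ \operatorname{Index}(P_\omega, \Gamma_a P_\omega \Gamma_a^*) \} = \sum_{u,v \in \mathbb{Z}^2} \mathbb{E}\{ \operatorname{tr}(\chi_0 P_\omega \chi_u P_\omega \chi_v P_\omega \chi_0) \} \times \sum_{x \in \mathbb{Z}^2} \big(1 - \gamma_a(x) \overline{\gamma_a(x + u)}\big)\big(1 - \gamma_a(x + u) \overline{\gamma_a(x + v)}\big)\big(1 - \gamma_a(x + v) \overline{\gamma_a(x)}\big). \tag{4.39}$$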
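On the other hand,

$$\sum_{x \in \mathbb{Z}^2} \big(1 - \gamma_a(x) \overline{\gamma_a(x + u)}\big)\big(1 - \gamma_a(x + u) \overline{\gamma_a(x + v)}\big)\big(1 - \gamma_a(x + v) \overline{\gamma_a(x)}\big) = \sum_{a \in \mathbb{Z}^{2*}} \big(1 - \gamma_a(0) \overline{\gamma_a(u)}\big)\big(1 - \gamma_a(u) \overline{\gamma_a(v)}\big)\big(1 - \gamma_a(v) \overline{\gamma_a(0)}\big) = -2\pi i (u_1 v_2 - u_2 v_1) \tag{4.40}$$

by Connes' formula as in [1, Appendix F]; see also [1, Eqs. (4.14) and (5.1)]. Thus (4.28) follows from (4.39), (4.40), and (4.9). This completes the proof of Theorem 4.1.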
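The lattice sum in (4.40) can be probed numerically. The following is a minimal sketch under stated assumptions: the phase `gamma` is taken to be the unit complex number pointing from the dual-lattice site $a$ to the lattice site $x$ (the definition of $\gamma_a$ lies outside this passage, so this choice and its orientation are assumptions), and the sum over the dual lattice is truncated to a finite box. Because the overall sign depends on the orientation convention, only the modulus $2\pi |u_1 v_2 - u_2 v_1|$ is compared.

```python
import math

def gamma(a, x):
    """Assumed phase: unit complex number pointing from dual site a to lattice site x."""
    z = complex(x[0] - a[0], x[1] - a[1])
    return z / abs(z)

def connes_sum(u, v, radius):
    """Truncated sum over dual-lattice sites a = (i + 1/2, j + 1/2), -radius <= i, j < radius."""
    total = 0j
    for i in range(-radius, radius):
        for j in range(-radius, radius):
            a = (i + 0.5, j + 0.5)
            g0, gu, gv = gamma(a, (0, 0)), gamma(a, u), gamma(a, v)
            total += ((1 - g0 * gu.conjugate())
                      * (1 - gu * gv.conjugate())
                      * (1 - gv * g0.conjugate()))
    return total

u, v = (1, 0), (0, 1)
S = connes_sum(u, v, 80)
expected_modulus = 2 * math.pi * abs(u[0] * v[1] - u[1] * v[0])
```

Each summand is purely imaginary and decays like $|a|^{-3}$, so the truncation error of the box sum is of order one over the box radius; the comparison is therefore only approximate.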
5. Continuity of the Hall Conductance

5.1. Ergodic Landau Hamiltonians

Theorem 2.2 follows immediately from the following theorem.

Theorem 5.1. Let $H_{B,\lambda,\omega}$ be an ergodic Landau Hamiltonian. If for a given $(B, \lambda) \in (0, \infty) \times [0, \infty)$ the integrated density of states $N^{(B,\lambda)}(E)$ is continuous in $E$, then the Hall conductance $\sigma_H^{(B,\lambda)}(E)$ is continuous on $\Xi^{\{B,\lambda\}}_{(2,\infty)+}$. In particular, $\sigma_H^{(B,\lambda)}(E)$ is constant on each connected component of $\Xi^{\{B,\lambda\}}_{(2,3]+}$.
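To prove Theorem 5.1, we will use the following lemma.

Lemma 5.2. Let $(B, E, \lambda) \in \Xi_{q+}$ with $q \in (2, \infty)$; set $\frac{1}{p} + \frac{2}{q} = 1$. Then there exists a neighborhood $\Phi$ of $(B, E, \lambda)$ in $\Xi$, such that $\Phi \subset \Xi_{q+}$, and for all $(B', \lambda', E') \in \Phi$ we have, with $\sigma_H, \sigma_H', P_\omega, P_\omega'$ for $\sigma_H(B, \lambda, E)$, $\sigma_H(B', \lambda', E')$, $P_{B,\lambda,E,\omega}$, $P_{B',\lambda',E',\omega}$, respectively,

$$|\sigma_H - \sigma_H'| \le C_{B,\lambda,E,q} \Big( \sup_{u \in \mathbb{Z}^2} ||| \chi_0 (P_\omega - P_\omega') \chi_u |||_1^{\frac{1}{p}} \Big) \{ \ell_{q+}(B, \lambda, E) \}^2. \tag{5.1}$$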
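Proof. Given $(B, E, \lambda) \in \Xi_{q+}$ with $q \in (2, \infty)$, there exists a neighborhood $\Phi$ of $(B, E, \lambda)$ in $\Xi$ such that

$$\ell_q(B', \lambda', E') \le 2\, \ell_{q+}(B, \lambda, E) < \infty \tag{5.2}$$

for any $(B', \lambda', E') \in \Phi$. (It follows that $\Phi \subset \Xi_{q+}$.) We write $\sigma_H, \sigma_H', P_\omega, P_\omega'$ for $\sigma_H(B, \lambda, E)$, $\sigma_H(B', \lambda', E')$, $P_{B,\lambda,E,\omega}$, $P_{B',\lambda',E',\omega}$, respectively. Using Lemma 4.2 and (4.7), we have

$$\frac{i}{2\pi} (\sigma_H' - \sigma_H) = \mathbb{E}\{ \operatorname{tr}\{ \chi_0 (P_\omega' - P_\omega) [[P_\omega', \hat{X}_1], [P_\omega', \hat{X}_2]] \chi_0 \} \} + \mathbb{E}\{ \operatorname{tr}\{ \chi_0 P_\omega [[(P_\omega' - P_\omega), \hat{X}_1], [P_\omega', \hat{X}_2]] \chi_0 \} \} + \mathbb{E}\{ \operatorname{tr}\{ \chi_0 P_\omega [[P_\omega, \hat{X}_1], [(P_\omega' - P_\omega), \hat{X}_2]] \chi_0 \} \} \equiv \sigma_1 + \sigma_2 + \sigma_3, \tag{5.3}$$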
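where $\sigma_1, \sigma_2, \sigma_3$ can be shown to be well defined as in the proof of Lemma 4.2, and can be written similarly to (4.9). Thus, with $\frac{1}{p} + \frac{2}{q} = 1$, where $p < \infty$ since $q > 2$, we have

$$|\sigma_1| \le \sum_{u,v \in \mathbb{Z}^2} |(u_1 - v_1) v_2 - (u_2 - v_2) v_1|\; \mathbb{E}\{ \operatorname{tr} |\chi_0 (P_\omega' - P_\omega) \chi_u P_\omega' \chi_v P_\omega' \chi_0| \}$$

$$\le 8 \Big( \sup_{u \in \mathbb{Z}^2} ||| \chi_0 (P_\omega' - P_\omega) \chi_u |||_p \Big) \{ \ell_{q+}(B, \lambda, E) \}^2 \le 16 \Big( \sup_{u \in \mathbb{Z}^2} ||| \chi_0 (P_\omega' - P_\omega) \chi_u |||_1^{\frac{1}{p}} \Big) \{ \ell_{q+}(B, \lambda, E) \}^2, \tag{5.4}$$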
with similar estimates for $|\sigma_2|$ and $|\sigma_3|$. The desired estimate (5.1) now follows from (5.3) and (5.4).

Proof of Theorem 5.1. In view of Theorem 4.1, it suffices to show that if for a given $(B, \lambda) \in (0, \infty) \times [0, \infty)$ the integrated density of states $N^{(B,\lambda)}(E)$ is continuous in $E$, then the Hall conductance $\sigma_H^{(B,\lambda)}(E)$ is continuous on $\Xi^{\{B,\lambda\}}_{(2,\infty)+}$. This follows immediately from Lemma 5.2, since for $E_1 \le E_2$ we have, for all $u \in \mathbb{Z}^2$,

$$||| \chi_0 (P_{B,\lambda,E_2,\omega} - P_{B,\lambda,E_1,\omega}) \chi_u |||_1 \le ||| \chi_0 (P_{B,\lambda,E_2,\omega} - P_{B,\lambda,E_1,\omega}) \chi_0 |||_1 = N^{(B,\lambda)}(E_2) - N^{(B,\lambda)}(E_1). \tag{5.5}$$
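The decomposition (5.3) is a telescoping identity for a form that is linear in each of its three projection slots. The following is a minimal sketch, assuming NumPy and random Hermitian matrices standing in for $P_\omega$, $P_\omega'$, $\hat{X}_1$, $\hat{X}_2$ (the matrices and the helper names `comm`, `F`, `herm` are hypothetical, and the expectation and the cutoffs $\chi_0$ are dropped since they play no role in the algebra).

```python
import numpy as np

def comm(a, b):
    """Commutator [a, b]."""
    return a @ b - b @ a

def F(a, b, c, x1, x2):
    """Trilinear form tr(a [[b, x1], [c, x2]]) in the slots (a, b, c)."""
    return np.trace(a @ comm(comm(b, x1), comm(c, x2)))

rng = np.random.default_rng(2)

def herm(n):
    """Random n x n Hermitian matrix."""
    m = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (m + m.conj().T) / 2

P, Pp, X1, X2 = (herm(5) for _ in range(4))   # Pp plays the role of P'

lhs = F(Pp, Pp, Pp, X1, X2) - F(P, P, P, X1, X2)
s1 = F(Pp - P, Pp, Pp, X1, X2)   # difference in the first slot
s2 = F(P, Pp - P, Pp, X1, X2)    # difference in the second slot
s3 = F(P, P, Pp - P, X1, X2)     # difference in the third slot
```

The three terms `s1`, `s2`, `s3` mirror $\sigma_1$, $\sigma_2$, $\sigma_3$, each containing exactly one factor of the difference $P_\omega' - P_\omega$.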
5.2. The Anderson–Landau Hamiltonian

Theorem 2.4 follows from the following theorem.

Theorem 5.3. Let $H^{(A)}_{B,\lambda,\omega}$ be the Anderson–Landau Hamiltonian. Then the Hall conductance $\sigma_H(B, \lambda, E)$ is defined on $\Xi_{[2,\infty)}$, integer valued on $\Xi_{(2,3]}$, and Hölder-continuous on $\Xi_{(2,\infty)+}$. In particular, $\sigma_H(B, \lambda, E)$ is constant on each connected component of $\Xi_{(2,3]+}$.

In view of Theorems 4.1 and 5.1, all that remains to finish the proof of Theorem 5.3 is to show that for the Anderson–Landau Hamiltonian the Hall conductance $\sigma_H(B, \lambda, E)$ is Hölder-continuous on $\Xi_{(2,\infty)+}$. This will follow from Lemma 5.2 and the following lemma, which improves on a result of Combes, Hislop, Klopp and Raikov [12]: the integrated density of states of the Anderson–Landau Hamiltonian $N(B, \lambda, E)$ is jointly Hölder continuous in $(B, E)$ for $\lambda > 0$. More precisely, they proved that given $\lambda > 0$, $\alpha, \delta \in (0, 1)$, and a compact set $Y \subset (0, \infty) \times \mathbb{R}$, there exists a constant $C_{Y,\alpha,\delta}(\lambda)$ such that

$$|N(B', \lambda, E') - N(B, \lambda, E)| \le C_{Y,\alpha,\delta}(\lambda) \big( |B' - B|^{\frac{\alpha}{4}} + |E' - E|^{\delta} \big) \tag{5.6}$$

for all $(B, E), (B', E') \in Y$, and the constant $C_{Y,\alpha,\delta}(\lambda)$ is locally bounded for $\lambda > 0$. (Although the fact that $C_{Y,\alpha,\delta}(\lambda)$ is locally bounded is not explicitly stated in [12], it is implicit in the proof.) Hölder continuity in the energy was previously known in special cases [9,40,27,10]. We strengthen this result, proving joint Hölder-continuity of $\chi_0 P_{B,\lambda,E,\omega} \chi_0$ in the $|||\cdot|||_1$ norm with respect to $(B, E, \lambda)$.

Lemma 5.4. Let $H^{(A)}_{B,\lambda,\omega}$ be the Anderson–Landau Hamiltonian. Fix $\alpha, \delta, \eta \in (0, 1)$. Then, given a compact subset $K$ of $\Xi$, there exists a constant $C_{K,\alpha,\delta,\eta}$ such that

$$\sup_{u \in \mathbb{Z}^2} ||| \chi_0 (P_{B',\lambda',E',\omega} - P_{B'',\lambda'',E'',\omega}) \chi_u |||_1 \le C_{K,\alpha,\delta,\eta} \big( |B' - B''|^{\frac{\alpha}{5}} + |E' - E''|^{\delta} + |\lambda' - \lambda''|^{\frac{\eta}{3}} \big) \tag{5.7}$$

for all $(B', \lambda', E'), (B'', \lambda'', E'') \in K$.

Lemma 5.4 will follow from the above stated result of [12] and Lemma 5.5 below. Note that if $E' \le E''$ we have $P_{B,\lambda,E'',\omega} - P_{B,\lambda,E',\omega} \ge 0$, so the hypotheses of Lemma 5.5 follow from (5.6).

Lemma 5.5. Let $H^{(A)}_{B,\lambda,\omega}$ be the Anderson–Landau Hamiltonian. Let $\delta \in (0, 1)$. Suppose that for every bounded interval $I$ and $(B, \lambda) \in (0, \infty)^2$ there exists a constant $C_I(B, \lambda)$, locally bounded in $(B, \lambda)$, such that for all $E', E'' \in I$ we have

$$||| \chi_0 (P_{B,\lambda,E'',\omega} - P_{B,\lambda,E',\omega}) \chi_0 |||_1 \le C_I(B, \lambda) |E'' - E'|^{\delta}. \tag{5.8}$$

Given $K = [B_1, B_2] \times [\lambda_1, \lambda_2] \times [E_1, E_2] \subset \Xi$, there is a constant $C_K$, such that for all $E \in [E_1, E_2]$ and $u \in \mathbb{Z}^2$ we have

$$||| \chi_0 (P_{B,\lambda'',E,\omega} - P_{B,\lambda',E,\omega}) \chi_u |||_1 \le C_K |\lambda'' - \lambda'|^{\frac{\delta}{\delta + 2}}, \tag{5.9}$$

for all $B \in [B_1, B_2]$ and $\lambda', \lambda'' \in [\lambda_1, \lambda_2]$, and

$$||| \chi_0 (P_{B'',\lambda,E,\omega} - P_{B',\lambda,E,\omega}) \chi_u |||_1 \le C_K |B'' - B'|^{\frac{\delta}{\delta + 4}}, \tag{5.10}$$

for all $B', B'' \in [B_1, B_2]$ and $\lambda \in [\lambda_1, \lambda_2]$.

Proof. It suffices to consider the case when $B_2 - B_1 < 1$ and $\lambda_2 - \lambda_1 < 1$. We set $I = [E_1 - 1, E_2]$. Note that (5.8) holds for $(B, \lambda) \in [B_1, B_2] \times [\lambda_1, \lambda_2]$ and $E', E'' \in I$ with $C_I \equiv \sup_{(B,\lambda) \in [B_1,B_2] \times [\lambda_1,\lambda_2]} C_I(B, \lambda) < \infty$. (This includes the case $\lambda_1 = 0$ with a slightly modified interval $I$, although this case is not included in the hypothesis (5.8). The reason is that since $K \subset \Xi$, if $\lambda_1 = 0$ the interval $[E_1, E_2]$ cannot contain any Landau level for $B \in [B_1, B_2]$. In this case, we set $I = [E_1 - \rho, E_2]$, where $0 < \rho \le 1$ is chosen so $I$ also does not contain a Landau level for some $B \in [B_1, B_2]$. The proof applies also in this case except that we take $B_2 - B_1 < \rho$ and $\lambda_2 - \lambda_1 < \rho$.)
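We fix a function $f \in C^\infty(\mathbb{R})$, such that $0 \le f(t) \le 1$, $f(t) = 1$ if $t \le 0$, and $f(t) = 0$ if $t \ge 1$.

We prove (5.9) first. Let $E \in [E_1, E_2]$, $B \in [B_1, B_2]$, and $\lambda', \lambda'' \in [\lambda_1, \lambda_2]$. We let $\gamma = |\lambda'' - \lambda'|^{\alpha}$, where $\alpha \in (0, 1)$ will be chosen later. We set $g(t) = f\big( \frac{t - (E - \gamma)}{\gamma} \big)$; note $g \in C^\infty(\mathbb{R})$, with $0 \le g(t) \le 1$, $g(t) = 1$ if $t \le E - \gamma$, $g(t) = 0$ if $t \ge E$. We write

$$P_{B,\lambda'',E,\omega} - P_{B,\lambda',E,\omega} = \{ P_{B,\lambda'',E,\omega} - g(H_{B,\lambda'',\omega}) \} + \{ g(H_{B,\lambda'',\omega}) - g(H_{B,\lambda',\omega}) \} + \{ g(H_{B,\lambda',\omega}) - P_{B,\lambda',E,\omega} \}. \tag{5.11}$$

By construction, for any $\lambda \ge 0$ we have

$$0 \le P_{B,\lambda,E,\omega} - g(H_{B,\lambda,\omega}) \le P_{B,\lambda,E,\omega} - P_{B,\lambda,E-\gamma,\omega}, \tag{5.12}$$

and thus, for $\lambda^{\#} = \lambda', \lambda''$ and any $u \in \mathbb{Z}^2$, we have

$$||| \chi_0 (P_{B,\lambda^{\#},E,\omega} - g(H_{B,\lambda^{\#},\omega})) \chi_u |||_1 \le ||| \chi_0 (P_{B,\lambda^{\#},E,\omega} - g(H_{B,\lambda^{\#},\omega}))^{\frac{1}{2}} |||_2 \; ||| (P_{B,\lambda^{\#},E,\omega} - g(H_{B,\lambda^{\#},\omega}))^{\frac{1}{2}} \chi_u |||_2$$

$$= ||| \chi_0 (P_{B,\lambda^{\#},E,\omega} - g(H_{B,\lambda^{\#},\omega})) \chi_0 |||_1 \le ||| \chi_0 (P_{B,\lambda^{\#},E,\omega} - P_{B,\lambda^{\#},E-\gamma,\omega}) \chi_0 |||_1 \le C_I \gamma^{\delta}. \tag{5.13}$$

We now estimate the middle term in the right-hand side of (5.11). Let $R_{B,\lambda,\omega}(z) = (H_{B,\lambda,\omega} - z)^{-1}$ be the resolvent. Recall (e.g., [7]) that

$$\| \chi_v R_{B,\lambda,\omega}(z) \|_2 \le c_\lambda\, \frac{1 + |z|}{\operatorname{Im} z}, \tag{5.14}$$

with a constant $c_\lambda$ independent of $B$, $v \in \mathbb{Z}^2$, and $\omega$, and locally bounded in $\lambda$. The Helffer–Sjöstrand formula with a quasi-analytic extension of $g$ of order 3 (e.g., [13]), combined with the resolvent equation and (5.14), yields

$$||| \chi_0 (g(H_{B,\lambda'',\omega}) - g(H_{B,\lambda',\omega})) \chi_u |||_1 \le C\, \frac{|\lambda'' - \lambda'|}{\gamma^2}, \tag{5.15}$$

where the constant $C$ depends only on $E_1, E_2, \lambda_1, \lambda_2$, our choice of the function $f$, and fixed parameters. Thus, combining (5.11), (5.13), and (5.15), we get

$$||| \chi_0 (P_{\lambda'',E,\omega} - P_{\lambda',E,\omega}) \chi_u |||_1 \le 2 C_I \gamma^{\delta} + C\, \frac{|\lambda'' - \lambda'|}{\gamma^2} = 2 C_I |\lambda'' - \lambda'|^{\alpha\delta} + C |\lambda'' - \lambda'|^{1 - 2\alpha} = (2 C_I + C) |\lambda'' - \lambda'|^{\frac{\delta}{\delta + 2}}, \tag{5.16}$$

where we chose $\alpha = \frac{1}{\delta + 2}$ to optimize the bound.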
To prove (5.10), we start by repeating the above proof varying $B$ instead of $\lambda$. The only difference is in the equivalent of the estimate (5.15). Here we use [12, Proposition 5.1], observing that its proof (note [12, Eqs. (5.12) and (5.13)]) actually proves the stronger result

$$||| \chi_0 (g(H_{B'',\lambda,\omega}) - g(H_{B',\lambda,\omega})) \chi_u |||_1 \le \tilde{C}\, \frac{|B'' - B'|}{\gamma^4}, \tag{5.17}$$

where now $\gamma = |B'' - B'|^{\alpha}$, and the constant $\tilde{C}$ depends only on $E_1, E_2, \lambda_1, \lambda_2, B_1, B_2$, our choice of the function $f$, and fixed parameters. Proceeding as before, we see that in this case we should choose $\alpha = \frac{1}{\delta + 4}$, in which case we get (5.10).
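The choice of $\alpha$ in both cases comes from balancing the two exponents $\alpha\delta$ and $1 - k\alpha$, with $k = 2$ for (5.9) and $k = 4$ for (5.10). The following is a minimal sketch in exact rational arithmetic; the sample value $\delta = 1/2$ and the function name `balanced_exponent` are assumptions made for illustration.

```python
from fractions import Fraction

def balanced_exponent(delta, k):
    """Return (alpha, exponent), where alpha solves alpha * delta == 1 - k * alpha."""
    alpha = 1 / (delta + k)
    return alpha, alpha * delta

delta = Fraction(1, 2)                                   # sample Hoelder exponent in (0, 1)
alpha_lambda, exp_lambda = balanced_exponent(delta, 2)   # bound with 1/gamma^2, cf. (5.15)
alpha_B, exp_B = balanced_exponent(delta, 4)             # bound with 1/gamma^4, cf. (5.17)
```

As $\delta \uparrow 1$ the two exponents approach $1/3$ and $1/5$, which matches the exponents $\eta/3$ and $\alpha/5$ appearing in Lemma 5.4.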
6. Delocalization for Ergodic Landau Hamiltonians with Open Gaps

We now prove Corollary 2.3 by proving the following theorem.

Theorem 6.1. Let $H_{B,\lambda,\omega}$ be an ergodic Landau Hamiltonian. Suppose the integrated density of states $N^{(B,\lambda)}(E)$ is continuous in $E$ for all $(B, \lambda) \in (0, \infty) \times [0, \infty)$ satisfying the disjoint bands condition (2.31). Then for all such $(B, \lambda)$ the "localization length" $\ell^{\{B,\lambda\}}_{(2,3]+}$ diverges near each Landau level: for each $n = 1, 2, \ldots$ there exists an energy $E_n(B, \lambda) \in \mathcal{B}_n(B, \lambda)$ such that

$$\ell^{\{B,\lambda\}}_{(2,3]+}(E_n(B, \lambda)) = \infty. \tag{6.1}$$

We start the proof of Theorem 6.1 by setting, for $n = 1, 2, \ldots$,

$$G_n = \{ (B, \lambda, E) \in \Xi;\ \lambda (M_1 + M_2) < 2B,\ E \in (B_{n-1} + \lambda M_2, B_n - \lambda M_1) \}. \tag{6.2}$$

In view of (2.12) and (2.30), we have

$$\bigcup_{n=1}^{\infty} G_n = \Xi \setminus \bigcup_{B \in (0,\infty)} \bigcup_{\lambda \in [0,\infty)} \bigcup_{n=1}^{\infty} \{ (B, \lambda) \} \times \mathcal{B}_n(B, \lambda) \subset \Xi_{NS} \subset \Xi_{(2,3]+}. \tag{6.3}$$

It is well known that $\sigma_H(B, 0, E) = n$ if $E \in\, ]B_n, B_{n+1}[$ for all $n = 0, 1, 2, \ldots$ [3, 6]. Given $n \in \mathbb{N}$ and $(B, \lambda_1, E) \in G_n$, we can find $\lambda_E > \lambda_1$ such that $E \in G_n^{(B,\lambda)}$ for all $\lambda \in I = [0, \lambda_E[$. It follows that, with probability one,

$$P_\lambda = -\frac{1}{2\pi i} \oint_\Gamma R_\lambda(z)\, dz \quad \text{for all } \lambda \in I, \tag{6.4}$$

where $P_\lambda = P_{B,\lambda,E,\omega}$, $R_\lambda(z) = (H_{B,\lambda,\omega} - z)^{-1}$, and $\Gamma$ is a bounded contour such that $\operatorname{dist}(\Gamma, \sigma(H_{B,\lambda,\omega})) \ge \eta > 0$ for all $\lambda \in I$. Note $H_{B,\lambda,\omega} \ge B - \lambda_E M_1$ for all $\lambda \in I$. It follows that there is a constant $K$ such that (cf. [7, Proposition 2.1])

$$\| R_\lambda(z) \chi_x \|_2 \le K \quad \text{for all } x \in \mathbb{Z}^2,\ z \in \Gamma,\ \lambda \in I. \tag{6.5}$$
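Given $\lambda, \xi \in I$, it follows from (6.4) and the resolvent identity that

$$Q_{\lambda,\xi} := P_\xi - P_\lambda = \frac{(\xi - \lambda)}{2\pi i} \oint_\Gamma R_\lambda(z) V R_\xi(z)\, dz, \tag{6.6}$$

with $V = V_\omega$ (recall $\| V \| \le M := \max\{ M_1, M_2 \}$). Letting $\sigma_\lambda = \sigma_H(B, \lambda, E)$, it follows from Lemma 5.2 that for all $\lambda \in I$, taking $\xi \in I$ in a suitable neighborhood of $\lambda$, we have

$$|\sigma_\lambda - \sigma_\xi| \le C_{B,\lambda,E} \sup_{u \in \mathbb{Z}^2} ||| \chi_0 Q_{\lambda,\xi} \chi_u |||_1^{\frac{1}{3}} \le C_{B,\lambda,E} \left( \frac{|\xi - \lambda|}{2\pi}\, M |\Gamma| K^2 \right)^{\frac{1}{3}}, \tag{6.7}$$

so $\sigma_\lambda$ is a continuous function of $\lambda$ in the interval $I$. By Theorem 4.1, $\sigma_\lambda$ is constant in $I$, and hence we conclude that

$$\sigma_H(B, \lambda, E) = \sigma_H(B, 0, E) = n \quad \text{for all } (B, \lambda, E) \in G_n. \tag{6.8}$$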
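The contour representations (6.4) and (6.6) can be tried out on a finite matrix. The following is a minimal sketch, assuming NumPy, a $5 \times 5$ diagonal matrix with levels at 1, 3, 5 standing in for the Landau Hamiltonian, a random symmetric perturbation $V$ with $\|V\| = 1$, and a circular contour discretized by the trapezoid rule; all of these choices, and the helper names, are assumptions made only for illustration.

```python
import numpy as np

def contour_points(center, radius, n):
    """Offsets w_k = radius * exp(2*pi*i*k/n) and nodes z_k = center + w_k on a circle."""
    w = radius * np.exp(2j * np.pi * np.arange(n) / n)
    return center + w, w

def riesz_projection(h, center, radius, n=400):
    """Trapezoid rule for -(1/(2*pi*i)) * contour integral of (h - z)^{-1} dz, dz = i*w*dtheta."""
    z, w = contour_points(center, radius, n)
    eye = np.eye(h.shape[0])
    return -sum(np.linalg.inv(h - zk * eye) * wk for zk, wk in zip(z, w)) / n

rng = np.random.default_rng(3)
H0 = np.diag([1.0, 1.0, 3.0, 3.0, 5.0])    # toy levels at 1, 3, 5
A = rng.standard_normal((5, 5))
V = (A + A.T) / 2
V /= np.linalg.norm(V, 2)                   # normalise so that ||V|| = 1

lam, xi = 0.2, 0.3
H_lam, H_xi = H0 + lam * V, H0 + xi * V
center, radius, n = 1.0, 1.0, 400           # circle through 0 and 2, enclosing the lowest band

P_lam = riesz_projection(H_lam, center, radius, n)
P_xi = riesz_projection(H_xi, center, radius, n)

# Right-hand side of (6.6): ((xi - lam)/(2*pi*i)) * contour integral of R_lam V R_xi dz.
z, w = contour_points(center, radius, n)
eye = np.eye(5)
Q_contour = (xi - lam) * sum(
    np.linalg.inv(H_lam - zk * eye) @ V @ np.linalg.inv(H_xi - zk * eye) * wk
    for zk, wk in zip(z, w)
) / n
Q_direct = P_xi - P_lam
```

Since the perturbation has norm at most 0.3 here, the contour stays at distance at least 0.7 from the spectrum for both couplings, which plays the role of the gap condition on $\Gamma$.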
Now, let $(B, \lambda)$ satisfy (2.31), and suppose $\mathcal{B}_n(B, \lambda) \subset \Xi^{\{B,\lambda\}}_{(2,3]+}$ for some $n \in \mathbb{N}$. We then have

$$(B_{n-1} + \lambda M_1, B_{n+1} - \lambda M_2) = G^{(B,\lambda)}_{n-1} \cup \mathcal{B}_n(B, \lambda) \cup G^{(B,\lambda)}_n \subset \Xi^{\{B,\lambda\}}_{(2,3]+}. \tag{6.9}$$

Since the integrated density of states $N^{(B,\lambda)}(E)$ is assumed to be continuous in $E$, it follows from Theorem 5.1 that the Hall conductance $\sigma_H(B, \lambda, E)$ is constant on the interval $(B_{n-1} + \lambda M_1, B_{n+1} - \lambda M_2)$, and hence has the same value on the spectral gaps $G^{(B,\lambda)}_{n-1}$ and $G^{(B,\lambda)}_n$, which contradicts (6.8). Thus we conclude that $\mathcal{B}_n(B, \lambda)$ cannot be a subset of $\Xi^{\{B,\lambda\}}_{(2,3]+}$, which proves Theorem 6.1.

7. Dynamical Delocalization for the Anderson–Landau Hamiltonian with Closed Gaps

In this section, we prove Theorem 2.5.

Let $H^{(A)}_{B,\lambda,\omega}$ be an Anderson–Landau Hamiltonian as in (2.5) and (2.6), with a common probability distribution $\mu$ with $\operatorname{supp} \mu = [-M_1, M_2]$ with $M_1, M_2 \in (0, \infty)$. As shown in Appendix B, we have

$$\Sigma_{B,\lambda} = \bigcup_{n \in \mathbb{N}} I_n(B, \lambda), \quad \text{where } I_n(B, \lambda) = [E_-(n, B, \lambda), E_+(n, B, \lambda)], \tag{7.1}$$

where, for all $B > 0$ and $n \in \mathbb{N}$, $\pm E_\pm(n, B, \lambda)$ are increasing, continuous functions of $\lambda > 0$, depending on $u$ and $M_1, M_2$, but not on other details of the measure $\mu$.
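We set $E_+(0, B, \lambda) = -\infty$. We have

$$B_{n-1} \le E_-(n, B, \lambda) < B_n < E_+(n, B, \lambda) \le B_{n+1} \quad \text{for all } n \in \mathbb{N},$$

$$B - \lambda M_1 \le E_-(1, B, \lambda) = E_0(B, \lambda) := \inf \Sigma_{B,\lambda} < B. \tag{7.2}$$

(Note that $B - \lambda M_1 \le E_0(B, \lambda)$ follows from (2.12).) If (2.31) holds, then $E_+(n, B, \lambda) < E_-(n + 1, B, \lambda)$ for all $n \in \mathbb{N}$ and the spectral gaps do not close. If for some $n \in \mathbb{N}$ we have $E_+(n, B, \lambda) \ge E_-(n + 1, B, \lambda)$, the $n$th spectral gap $(B_n, B_{n+1})$ has closed, i.e. $[B_n, B_{n+1}] \subset \Sigma_{B,\lambda}$.

Let us now assume that the single-site potential $u$ in (2.6) satisfies

$$0 < U_- \le U(x) := \sum_{i \in \mathbb{Z}^2} u(x - i) \le 1, \tag{7.3}$$

for some constant $U_-$. (The upper bound is simply a normalization we had already assumed.) Then, as shown in Appendix B, we have

$$B_n + \lambda M_2 U_- \le E_+(n, B, \lambda) \quad \text{for } \lambda \in \left( 0, \frac{2B}{M_2 U_-} \right), \tag{7.4}$$

$$B_n - \lambda M_1 U_- \ge E_-(n, B, \lambda) \quad \text{for } \lambda \in \left( 0, \frac{2B}{M_1 U_-} \right), \tag{7.5}$$

$$B - \lambda M_1 U_- \ge E_-(1, B, \lambda) = E_0(B, \lambda) \quad \text{for all } \lambda \ge 0. \tag{7.6}$$

It follows that if

$$\lambda (M_1 + M_2) U_- \ge 2B, \tag{7.7}$$

all the internal spectral gaps close, i.e. $\Sigma_{B,\lambda} = [E_0(B, \lambda), \infty)$. Theorem 2.5(i) is proven.

To prove Theorem 2.5(ii), we assume (2.41) and fix

$$\hat{\lambda} > \frac{1}{U_-} B, \tag{7.8}$$

and $\delta \in (0, B)$. Let $J_n(B)$ be as in (2.44), i.e. we set

$$J_n(B) := \left( B_n + \frac{\delta}{2}, B_{n+1} - \frac{\delta}{2} \right), \quad n \in \mathbb{N}, \qquad J_0(B) := \left( -\infty, B - \frac{\delta}{2} \right) \subset (-\infty, B). \tag{7.9}$$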
We will prove (2.45) by a multiscale analysis. The multiscale analysis is carried out for the finite volume operators defined in [24, Secs. 4 and 5]; the Anderson–Landau Hamiltonian satisfies all the requirements for the multiscale analysis plus a Wegner estimate [24, Secs. 4 and 5]. We take scales $L \in L_B \mathbb{N}$, where $L_B \ge 1$ is defined in [24, Eq. (5.1)], and consider boxes $\Lambda_L(x) = x + [-\frac{L}{2}, \frac{L}{2})^2$, $x \in \mathbb{R}^2$, and let $\tilde{\Lambda}_L(x) = \Lambda_L(x) \cap \mathbb{Z}^2$. We define finite volume operators $H_{B,\lambda,0,L,\omega}$ on $L^2(\Lambda_L(0))$ as in [24, Eq. (5.2)]:

$$H_{B,\lambda,0,L,\omega} = H_{B,0,L} + \lambda V_{0,L,\omega} \quad \text{on } L^2(\Lambda_L(0)), \qquad V_{0,L,\omega}(x) = \sum_{i \in \tilde{\Lambda}_{L - \delta_u}(0)} \omega_i\, u(x - i), \tag{7.10}$$

where $H_{B,0,L}$ is defined in [24, Sec. 5] and $\operatorname{supp} u \subset (-\frac{\delta_u}{2}, \frac{\delta_u}{2})^2$, and then define $H_{B,\lambda,\omega,x,L}$ for all $x \in \mathbb{Z}^2$ by [24, Eq. (4.3)]. (We prescribed periodic boundary condition for the (free) Landau Hamiltonian at the square centered at 0, and used the magnetic translations to define the finite volume operators in all other squares by [24, Eq. (4.3)]; in the square centered at $x \in \mathbb{Z}^2$ the potential $V_{x,L,\omega}$ is exactly as in (7.10) except that the sum is now over $i \in \tilde{\Lambda}_{L - \delta_u}(x)$.)

A Wegner estimate is given in [24, Theorem 5.1] and extended in [11, Theorem 4.3]; note that the constants in the Wegner estimate can be chosen uniformly in $\lambda \in [\lambda_1, \lambda_2]$ if $\lambda_1 > 0$. It follows that for a closed interval $I \subset (B_n, B_{n+1})$, $n = 0, 1, 2, \ldots$, they can be chosen uniformly in $\lambda \in [0, \hat{\lambda}]$. (But note that the constants will depend on the interval $I$, and hence for $I = J_n(B)$ they will depend on $n$.) But one has to be careful in the multiscale analysis, since $\rho_\infty$ appears in the Wegner estimate, (2.41) gives $\rho_\infty = \frac{\eta + 1}{2}$, and we will prove (2.45) for $\eta$ sufficiently large. All these issues can be taken into consideration by applying the finite volume criterion for localization given in [20, Theorem 2.4], in a similar way to the application in [20, Proof of Theorem 3.1].

We write $\Lambda = \Lambda_L(x)$, $H_{B,\lambda,L,\omega} = H_{B,\lambda,x,L,\omega}$, etc. If $\lambda |\omega_i| \le \frac{\delta}{2}$ for all $i \in \tilde{\Lambda}$, then we have by Lemma A.1 (it also applies to finite volume operators) that

$$\sigma(H_{B,\lambda,L,\omega}) \subset \bigcup_{n=1}^{\infty} \left[ B_n - \frac{\delta}{2}, B_n + \frac{\delta}{2} \right]. \tag{7.11}$$

We have

$$\inf_{\lambda \in [0,\hat{\lambda}]} \mathbb{P}\left\{ \lambda |\omega_i| \le \frac{\delta}{2} \text{ for all } i \in \tilde{\Lambda} \right\} \ge 1 - L^2\, \mathbb{P}\left\{ \hat{\lambda} |\omega_0| > \frac{\delta}{2} \right\} = 1 - L^2 \left( 1 - \frac{\delta}{2\hat{\lambda}} \right)^{\eta}, \tag{7.12}$$

where $\frac{\delta}{2\hat{\lambda}} < \frac{U_-}{2} \le \frac{1}{2}$.

Given $\omega$ satisfying (7.11), $E \in J_n(B)$ implies $\operatorname{dist}(E, \sigma(H_{B,\lambda,L,\omega})) > \frac{\delta}{2}$. Let $R_{B,\lambda,L,\omega}(E) = (H_{B,\lambda,L,\omega} - E)^{-1}$. It follows from the Combes–Thomas estimate (cf. [19, Theorem 1]; note that the estimate holds for finite volume operators with periodic boundary condition with uniform constants for large enough volumes using the distance on the torus, cf. [16, Lemma 18] and [31, Theorem 3.6]) that

$$\| \chi_x R_{B,\lambda,L,\omega}(E) \chi_y \| \le \frac{C_1}{\delta}\, e^{-C_2 \delta L} \quad \text{for all } x, y \in \tilde{\Lambda} \text{ with } |x - y| \ge \frac{L}{10}, \tag{7.13}$$
where $C_1, C_2 > 0$ are constants, depending only on $n, B, u$.

Let us fix $n \in \mathbb{N}$ and prove that $J_n(B) \subset \Xi^{(B,\lambda)}_{DL}$ for all $\lambda \in [0, \hat{\lambda}]$. (The case $n = 0$ can be handled in a similar manner.) We take the constants in the Wegner estimate valid for subintervals of $J_n(B)$, uniformly in $\lambda \in [0, \hat{\lambda}]$. Thus, if we have (7.11), we will have the condition whose probability is estimated in [24, Eq. (2.17)] if

$$L^9\, \frac{C_1}{\delta}\, e^{-C_2 \delta L} < \frac{C_3}{\eta + 1}, \tag{7.14}$$

where $C_3$ is another constant depending only on $n, B, u$, and $\delta$. We now take $L_0(n)$ satisfying [24, Eq. (2.16)] and large enough for the Wegner estimate, and for $L_0 \ge L_0(n)$ we set

$$\eta(n, L_0) = 1 + \frac{C_3 \delta}{2 C_1}\, L_0^{-9}\, e^{C_2 \delta L_0}, \tag{7.15}$$

so (7.14) holds with $L = L_0$ and $\eta = \eta(n, L_0)$. Since

$$\lim_{L_0 \to \infty} L_0^2 \left( 1 - \frac{\delta}{2\hat{\lambda}} \right)^{\eta(n, L_0)} = 0, \tag{7.16}$$

we can find $\eta(n) > 0$ such that for all $\eta \ge \eta(n)$ there exists $L_0(\eta) \ge L_0(n)$ for which we have [24, Eq. (2.17)], so $E \in J_n(B)$ implies $E \in \Xi^{(B,\lambda)}_{DL}$. Thus given $N \in \mathbb{N}$, letting $\eta_N = \max_{n = 0, 1, 2, \ldots, N} \eta(n)$, we have (2.45) for $\eta \ge \eta_N$. Since the Hall conductance $\sigma_H(B, 0, E) = n$ if $E \in (B_n, B_{n+1})$ for all $n = 0, 1, 2, \ldots$ [3, 6], it follows from Theorem 2.4 that for $\eta \ge \eta_N$ we have

$$\sigma_H(B, \lambda, E) = n \quad \text{for all } (\lambda, E) \in [0, \hat{\lambda}] \times J_n(B). \tag{7.17}$$
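The interplay between (7.14), (7.15) and (7.16) can be made concrete with sample numbers. The following is a minimal sketch; the constants $C_1 = C_2 = C_3 = 1$, $\delta = 1/2$, $\hat{\lambda} = 2$ and the two scales are hypothetical values chosen only to show the mechanism, not constants from the paper, and the function names are likewise hypothetical.

```python
import math

def eta_of_scale(L0, C1, C2, C3, delta):
    """eta(n, L0) as in (7.15); the dependence on n is carried by the constants."""
    return 1 + (C3 * delta / (2 * C1)) * L0 ** -9 * math.exp(C2 * delta * L0)

def condition_7_14(L, eta, C1, C2, C3, delta):
    """The inequality L^9 (C1/delta) exp(-C2 delta L) < C3 / (eta + 1)."""
    return L ** 9 * (C1 / delta) * math.exp(-C2 * delta * L) < C3 / (eta + 1)

def bad_event_bound(L, eta, delta, lam_hat):
    """L^2 (1 - delta/(2 lam_hat))^eta, the quantity subtracted from 1 in (7.12)."""
    return L ** 2 * math.exp(eta * math.log(1 - delta / (2 * lam_hat)))

C1 = C2 = C3 = 1.0            # hypothetical constants
delta, lam_hat = 0.5, 2.0     # hypothetical parameters

eta_large = eta_of_scale(200, C1, C2, C3, delta)
ok_large = condition_7_14(200, eta_large, C1, C2, C3, delta)
tail_large = bad_event_bound(200, eta_large, delta, lam_hat)

eta_small = eta_of_scale(10, C1, C2, C3, delta)
ok_small = condition_7_14(10, eta_small, C1, C2, C3, delta)
```

With these sample values the inequality holds at the large scale and fails at the small one, which illustrates why $L_0$ must be taken large; at the large scale $\eta(n, L_0)$ is so big that the probability bound in (7.12) is numerically indistinguishable from 1.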
We now proceed as in [24, Proof of Theorem 2.2], using again Theorem 2.4 (here we could also use Theorem 2.2), to conclude that for $n = 1, 2, \ldots, N$ we have $E_n(B, \lambda) \in [B_n - \delta, B_n + \delta]$ with $L_+^{\{B,\lambda\}}(E_n(B, \lambda)) = \infty$, so we have (2.46), and (2.47) follows from [21, Theorem 2.11], as in [24, Theorem 2.2]. Theorem 2.5 is proven.
Acknowledgment F. Germinet was supported in part by ANR 08 BLAN 0261; A. Klein was supported in part by NSF Grant DMS-0457474.
Appendix A. The Spectrum of Landau Hamiltonians with Bounded Potentials

In this appendix, we justify (2.12).

Lemma A.1. Let $H = H_B + W$, where $H_B$ is the free Landau Hamiltonian as in (2.2), and $-M_1 \le W \le M_2$, where $M_1, M_2 \in [0, \infty)$. Then

$$\sigma(H) \subset \bigcup_{n=1}^{\infty} [B_n - M_1, B_n + M_2]. \tag{A.1}$$

Proof. The lemma follows from [28, Theorem V.4.10] by writing

$$H = \left( H_B - \frac{M_1 - M_2}{2} \right) + \left( W + \frac{M_1 - M_2}{2} \right). \tag{A.2}$$
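A finite-dimensional analogue of Lemma A.1 can be checked directly. The following is a minimal sketch, assuming NumPy, a diagonal matrix with degenerate levels $B_n = (2n - 1)B$ for $n = 1, 2, 3$ standing in for $H_B$ (the level formula and the degeneracy 4 are assumptions made for this toy model), and a random symmetric $W$ rescaled so that $-M_1 \le W \le M_2$.

```python
import numpy as np

B, M1, M2 = 1.0, 0.3, 0.5
levels = np.array([(2 * n - 1) * B for n in (1, 2, 3)])   # toy levels B, 3B, 5B
H_B = np.diag(np.repeat(levels, 4))                        # each level 4-fold degenerate

rng = np.random.default_rng(4)
A = rng.standard_normal((12, 12))
w, U = np.linalg.eigh((A + A.T) / 2)
# Rescale the spectrum of the random symmetric matrix to the interval [-M1, M2].
w_scaled = -M1 + (M1 + M2) * (w - w.min()) / (w.max() - w.min())
W = U @ np.diag(w_scaled) @ U.T

ev = np.linalg.eigvalsh(H_B + W)
inside = [
    any(Bn - M1 - 1e-10 <= e <= Bn + M2 + 1e-10 for Bn in levels)
    for e in ev
]
```

In finite dimensions the inclusion is an instance of Weyl's eigenvalue inequalities; here the perturbed eigenvalues stay in the three bands around $B$, $3B$ and $5B$, and the gaps between the bands remain open because $M_1 + M_2 < 2B$.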
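Appendix B. The Spectrum of Anderson–Landau Hamiltonians

Consider an Anderson–Landau Hamiltonian $H_{B,\lambda,\omega} = H^{(A)}_{B,\lambda,\omega}$ as in (2.5) and (2.6), and suppose that

$$\operatorname{supp} \mu = [-M_1, M_2] \quad \text{with } M_1, M_2 \in (0, \infty). \tag{B.1}$$

(The argument applies also to the case $M_1, M_2 \in [0, \infty)$ with $M_1 + M_2 > 0$, with the obvious modifications.) In this appendix, we make no other hypotheses on the common probability distribution $\mu$. It follows from [30, Theorem 4], which applies also to Anderson–Landau Hamiltonians, that under these hypotheses we have

$$\Sigma_{B,\lambda} = \overline{\bigcup_{\omega \in \Omega_{\mathrm{supp}}} \sigma(H_{B,\lambda,\omega})}, \quad \text{where } \Omega_{\mathrm{supp}} := [-M_1, M_2]^{\mathbb{Z}^2}. \tag{B.2}$$

We consider squares $\Lambda_L := [-\frac{L}{2}, \frac{L}{2})^2$ centered at the origin with side $L > 0$. Given such a square $\Lambda$, we define $\omega^{(\Lambda)}$ by $\omega^{(\Lambda)}_j = \omega_j$ if $j \in \Lambda$ and $\omega^{(\Lambda)}_j = 0$ otherwise, and set

$$H^{(\Lambda)}_{B,\lambda,\omega} := H_B + \lambda V^{(\Lambda)}_\omega, \quad \text{where } V^{(\Lambda)}_\omega = V_{\omega^{(\Lambda)}}. \tag{B.3}$$

Note that $V^{(\Lambda)}_\omega$ is relatively compact with respect to $H_B$, so $\Sigma_B$ is also the essential spectrum of $H^{(\Lambda)}_{B,\lambda,\omega}$. In particular, $H^{(\Lambda)}_{B,\lambda,\omega}$ has discrete spectrum in the spectral gaps $\{ G_n(B) := (B_n, B_{n+1}),\ n = 0, 1, \ldots \}$ of $H_B$. Since $\omega^{(\Lambda)} \in \Omega_{\mathrm{supp}}$ if $\omega \in \Omega_{\mathrm{supp}}$, it follows that

$$\Sigma_B \subset \Sigma_{B,\lambda} = \overline{\bigcup_{n=1}^{\infty} \bigcup_{\omega \in \Omega_{\mathrm{supp}}} \sigma\big( H^{(\Lambda_{L_n})}_{B,\lambda,\omega} \big)}, \tag{B.4}$$

for any $L_n \to \infty$. (This uses (B.2) plus the fact that $H^{(\Lambda_{L_n})}_{B,\lambda,\omega}$ converges to $H_{B,\lambda,\omega}$ in the strong resolvent sense.) In particular, it follows from (B.1) that $\Sigma_{B,\lambda}$ is increasing with $\lambda$.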
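Let $\omega \in \Omega_{\mathrm{supp}}$, $\omega^{(\Lambda)} > 0$, that is, $\omega_j \ge 0$ for all $j \in \Lambda$ and $\sum_{j \in \Lambda} \omega_j > 0$. In this case $V^{(\Lambda)}_\omega \ge 0$, and

$$\Sigma_B \subset \sigma(H^{(\Lambda)}_{B,\lambda,\omega}) \subset \bigcup_{n=1}^{\infty} [B_n, B_n + \lambda M_2]. \tag{B.5}$$

We now use a modified Birman–Schwinger method, following [18, Sec. 4]. We fix $n \in \mathbb{N}$ and set

$$R(E) = -\sqrt{V^{(\Lambda)}_\omega}\, (H_B - E)^{-1} \sqrt{V^{(\Lambda)}_\omega} \quad \text{for } E \in (B_n, B_{n+1}), \tag{B.6}$$

a compact self-adjoint operator. Let $r_+(E) = \max \sigma(R(E))$. We claim

$$\lim_{E \downarrow B_n} r_+(E) = \infty. \tag{B.7}$$

To see this, let $\Pi_n = \chi_{\{B_n\}}(H_B)$. Then

$$R(E) = \frac{1}{E - B_n} \sqrt{V^{(\Lambda)}_\omega}\, \Pi_n \sqrt{V^{(\Lambda)}_\omega} - \sqrt{V^{(\Lambda)}_\omega}\, (1 - \Pi_n)(H_B - E)^{-1} \sqrt{V^{(\Lambda)}_\omega}. \tag{B.8}$$

Since

$$\Big\| \sqrt{V^{(\Lambda)}_\omega}\, (1 - \Pi_n)(H_B - E)^{-1} \sqrt{V^{(\Lambda)}_\omega} \Big\| \le \frac{M_2}{B} \quad \text{for } E \in (B_n, B_n + B), \tag{B.9}$$

(B.7) follows if we show that $\sqrt{V^{(\Lambda)}_\omega}\, \Pi_n \sqrt{V^{(\Lambda)}_\omega} \ne 0$. But otherwise, we would conclude that $\sqrt{V^{(\Lambda)}_\omega}\, \Pi_n = 0$ ($A^* A = 0$ implies $A = 0$), and, since $V^{(\Lambda)}_\omega > 0$ in a nonempty open set, we would contradict the unique continuation principle. Now, using (B.7), we conclude, as in [18, Proposition 4.3], that $H^{(\Lambda)}_{B,\lambda,\omega}$ has an eigenvalue in $(B_n, B_n + \lambda M_2]$ for all sufficiently small $\lambda > 0$.

Now, let us replace $\omega$ by $M_2$ in the notation if $\omega_j = M_2$ for all $j$, and consider $H^{(\Lambda)}_{B,\lambda,M_2}$. Fix $n \in \mathbb{N}$, and let $E^{(\Lambda)}_+(n, B, \lambda)$ denote the biggest eigenvalue of $H^{(\Lambda)}_{B,\lambda,M_2}$ in the open interval $(B_n, B_{n+1})$. We have shown the existence of $E^{(\Lambda)}_+(n, B, \lambda)$ for small $\lambda > 0$. By the argument in [28, Sec. VII.3.2], $E^{(\Lambda)}_+(n, B, \lambda)$ then exists for $\lambda \in (0, \lambda^{(\Lambda)}_+(n, B))$, with $\lambda^{(\Lambda)}_+(n, B) > 0$, where it is continuous and increasing in $\lambda$. In view of (B.5), we have $\lim_{\lambda \downarrow 0} E^{(\Lambda)}_+(n, B, \lambda) = B_n$ and $\lambda^{(\Lambda)}_+(n, B) \ge \frac{2B}{M_2}$. In addition, we must either have $\lambda^{(\Lambda)}_+(n, B) = \infty$ or $\lim_{\lambda \uparrow \lambda^{(\Lambda)}_+(n,B)} E^{(\Lambda)}_+(n, B, \lambda) = B_{n+1}$. In the latter case we may thus extend $E^{(\Lambda)}_+(n, B, \lambda)$ as an increasing, continuous function for $\lambda \in (0, \infty)$ by setting $E^{(\Lambda)}_+(n, B, \lambda) = B_{n+1}$ for $\lambda \ge \lambda^{(\Lambda)}_+(n, B)$.

A similar argument produces a smallest eigenvalue $E^{(\Lambda)}_-(n, B, \lambda) \in [B_{n-1}, B_n)$ of $H^{(\Lambda)}_{B,\lambda,-M_1}$ in $(B_{n-1}, B_n)$ for $\lambda \in (0, \lambda^{(\Lambda)}_-(n, B))$, where $\lambda^{(\Lambda)}_-(n, B) \ge \frac{2B}{M_1}$, continuous and decreasing in $\lambda$, with $\lim_{\lambda \downarrow 0} E^{(\Lambda)}_-(n, B, \lambda) = B_n$. Moreover, $\lambda^{(\Lambda)}_-(1, B) = \infty$, and, for $n = 2, 3, \ldots$, either $\lambda^{(\Lambda)}_-(n, B) = \infty$ or $\lim_{\lambda \uparrow \lambda^{(\Lambda)}_-(n,B)} E^{(\Lambda)}_-(n, B, \lambda) = B_{n-1}$. In the latter case, we extend $E^{(\Lambda)}_-(n, B, \lambda)$ as a decreasing, continuous function for $\lambda \in (0, \infty)$ by setting $E^{(\Lambda)}_-(n, B, \lambda) = B_{n-1}$ for $\lambda \ge \lambda^{(\Lambda)}_-(n, B)$.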
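For an arbitrary $\omega \in \Omega_{\mathrm{supp}}$ and $\lambda > 0$, the eigenvalues of $H^{(\Lambda)}_{B,\lambda,\omega}$ in the intervals $(B_n, B_n + \lambda M_2)$ and $(B_n - \lambda M_1, B_n)$ (if they exist) are separately continuous and increasing in each $\omega_j \in [-M_1, M_2]$, $j \in \Lambda$, and hence they must be in the interval $I^{(\Lambda)}_n(B, \lambda) = [E^{(\Lambda)}_-(n, B, \lambda), E^{(\Lambda)}_+(n, B, \lambda)]$. Thus we conclude that for each square $\Lambda$ we have

$$\bigcup_{\omega \in \Omega_{\mathrm{supp}}} \sigma(H^{(\Lambda)}_{B,\lambda,\omega}) = \bigcup_{n \in \mathbb{N}} I^{(\Lambda)}_n(B, \lambda). \tag{B.10}$$

In addition, the same argument shows that for fixed $\lambda$ and $B$ we have $\pm E^{(\Lambda)}_\pm(n, B, \lambda)$ increasing with $\Lambda$. We set $E_+(n, B, \lambda) := \sup_\Lambda E^{(\Lambda)}_+(n, B, \lambda) \le B_{n+1}$, $E_-(n, B, \lambda) := \inf_\Lambda E^{(\Lambda)}_-(n, B, \lambda) \ge B_{n-1}$, and conclude from (B.4) and (B.10) that (cf. [24, Eq. (2.11)])

$$\Sigma_{B,\lambda} = \bigcup_{n \in \mathbb{N}} I_n(B, \lambda), \quad \text{where } I_n(B, \lambda) = [E_-(n, B, \lambda), E_+(n, B, \lambda)]. \tag{B.11}$$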
Note that the intervals $I_n(B, \lambda)$ depend on $\operatorname{supp} \mu = [-M_1, M_2]$, but not on other details of the measure $\mu$.

Now assume that $u$ in (2.6) satisfies

$$0 < U_- \le U(x) := \sum_{i \in \mathbb{Z}^2} u(x - i) \le 1, \tag{B.12}$$

for some constant $U_-$. (The upper bound is simply a normalization we had already assumed.) In this case, for all $n \in \mathbb{N}$ we have

$$B_n + \lambda M_2 U_- \le E_+(n, B, \lambda) \quad \text{for } \lambda \in \left( 0, \frac{2B}{M_2 U_-} \right), \tag{B.13}$$

$$B_n - \lambda M_1 U_- \ge E_-(n, B, \lambda) \quad \text{for } \lambda \in \left( 0, \frac{2B}{M_1 U_-} \right). \tag{B.14}$$

We also have

$$B - \lambda M_1 U_- \ge E_-(1, B, \lambda) \quad \text{for all } \lambda \ge 0. \tag{B.15}$$

This can be seen as follows. Take $\lambda \in (0, \frac{2B}{M_2 U_-})$, then

$$H_{B,\lambda,M_2} = H_B + \lambda M_2 U_- + \lambda M_2 (U - U_-), \quad \text{with } 0 \le U - U_- \le 1 - U_-. \tag{B.16}$$

Since $\sigma(H_B + \lambda M_2 U_-) = \Sigma_B + \lambda M_2 U_- = \{ B_n + \lambda M_2 U_-;\ n \in \mathbb{N} \}$, it follows from [28, Theorem V.4.10] (as in Lemma A.1), and the definition of $E_+(n, B, \lambda)$, that

$$\sigma(H_{B,\lambda,M_2}) \subset \bigcup_{n=1}^{\infty} [B_n + \lambda M_2 U_-, E_+(n, B, \lambda)]. \tag{B.17}$$

Since by the same argument

$$\Sigma_B + \lambda M_2 U_- \subset \bigcup_{n \in \mathbb{N}_{\ne \emptyset}} [B_n + \lambda M_2 U_- - \lambda M_2 (1 - U_-), E_+(n, B, \lambda)], \tag{B.18}$$

where $\mathbb{N}_{\ne \emptyset} := \{ n \in \mathbb{N};\ \sigma(H_{B,\lambda,M_2}) \cap [B_n + \lambda M_2 U_-, E_+(n, B, \lambda)] \ne \emptyset \}$, we conclude that $\mathbb{N}_{\ne \emptyset} = \mathbb{N}$. It then follows from (B.11) that (B.13) holds. (B.14) and (B.15) are proved in a similar manner.

Under the condition (2.31) the spectral gaps never close. On the other hand, if we have (B.12), then if

$$\lambda U_- (M_1 + M_2) \ge 2B, \tag{B.19}$$

all the internal spectral gaps close, i.e.

$$\Sigma_{B,\lambda} = [E_-(1, B, \lambda), \infty). \tag{B.20}$$
References

[1] M. Aizenman and G. M. Graf, Localization bounds for an electron gas, J. Phys. A 31 (1998) 6783–6806.
[2] H. Aoki and T. Ando, Effects of localization on the Hall conductivity in the two-dimensional system in strong magnetic field, Solid State Commun. 38 (1981) 1079–1082.
[3] J. Avron, R. Seiler and B. Simon, Charge deficiency, charge transport and comparison of dimensions, Comm. Math. Phys. 159 (1994) 399–422.
[4] J. M. Barbaroux, J. M. Combes and P. D. Hislop, Landau Hamiltonians with unbounded random potentials, Lett. Math. Phys. 40 (1997) 355–369.
[5] J. Bellissard, Ordinary quantum Hall effect and noncommutative cohomology, in Proc. Localization in Disordered Systems (Bad Schandau, 1986), Teubner-Texte Phys., Vol. 16 (Teubner Publ., Leipzig, 1988), pp. 61–74.
[6] J. Bellissard, A. van Elst and H. Schulz-Baldes, The non commutative geometry of the quantum Hall effect, J. Math. Phys. 35 (1994) 5373–5451.
[7] J. M. Bouclet, F. Germinet, A. Klein and J. Schenker, Linear response theory for magnetic Schrödinger operators in disordered media, J. Funct. Anal. 226 (2005) 301–372.
[8] R. Carmona and J. Lacroix, Spectral Theory of Random Schrödinger Operators (Birkhäuser, 1990).
[9] J. M. Combes and P. D. Hislop, Landau Hamiltonians with random potentials: Localization and the density of states, Comm. Math. Phys. 177 (1996) 603–629.
[10] J. M. Combes, P. D. Hislop and F. Klopp, Hölder continuity of the integrated density of states for some random operators at all energies, Int. Math. Res. Not. 4 (2003) 179–209.
[11] J. M. Combes, P. D. Hislop and F. Klopp, An optimal Wegner estimate and its application to the global continuity of the integrated density of states for random Schrödinger operators, Duke Math. J. 140 (2007) 469–498.
[12] J. M. Combes, P. D. Hislop, F. Klopp and G. Raikov, Global continuity of the integrated density of states for random Landau Hamiltonians, Comm. Partial Differential Equations 29 (2004) 1187–1213.
[13] E. B. Davies, Spectral Theory and Differential Operators (Cambridge University Press, 1995).
[14] A. Elgart and B. Schlein, Adiabatic charge transport and the Kubo formula for Landau-type Hamiltonians, Comm. Pure Appl. Math. 57 (2004) 590–615.
[15] A. Figotin and A. Klein, Localization phenomenon in gaps of the spectrum of random lattice operators, J. Stat. Phys. 75 (1994) 997–1021.
[16] A. Figotin and A. Klein, Localization of classical waves I: Acoustic waves, Comm. Math. Phys. 180 (1996) 439–482.
[17] A. Figotin and A. Klein, Localization of classical waves II: Electromagnetic waves, Comm. Math. Phys. 184 (1997) 411–441.
[18] A. Figotin and A. Klein, Midgap defect modes in dielectric and acoustic media, SIAM J. Appl. Math. 58 (1998) 1748–1773.
[19] F. Germinet and A. Klein, Operator kernel estimates for functions of generalized Schrödinger operators, Proc. Amer. Math. Soc. 131 (2003) 911–920.
[20] F. Germinet and A. Klein, Explicit finite volume criteria for localization in continuous random media and applications, Geom. Funct. Anal. 13 (2003) 1201–1238.
[21] F. Germinet and A. Klein, A characterization of the Anderson metal-insulator transport transition, Duke Math. J. 124 (2004) 309–350.
[22] F. Germinet and A. Klein, New characterizations of the region of complete localization for random Schrödinger operators, J. Stat. Phys. 122 (2006) 73–94.
[23] F. Germinet, A. Klein and B. Mandy, Dynamical delocalization in random Landau Hamiltonians with unbounded random couplings, in Spectral and Scattering Theory for Quantum Magnetic Systems, Contemp. Math., Vol. 500 (Amer. Math. Soc., Providence, RI, 2009), pp. 87–100.
[24] F. Germinet, A. Klein and J. Schenker, Dynamical delocalization in random Landau Hamiltonians, Ann. of Math. 166 (2007) 215–244.
[25] B. Halperin, Quantized Hall conductance, current-carrying edge states, and the existence of extended states in a two-dimensional disordered potential, Phys. Rev. B 25 (1982) 2185–2190.
[26] T. Hupfer, H. Leschke, P. Müller and S. Warzel, Existence and uniqueness of the integrated density of states for Schrödinger operators with magnetic fields and unbounded random potentials, Rev. Math. Phys. 13 (2001) 1547–1581.
[27] T. Hupfer, H. Leschke, P. Müller and S. Warzel, The absolute continuity of the integrated density of states for magnetic Schrödinger operators with certain unbounded potentials, Comm. Math. Phys. 221 (2001) 229–254.
[28] T. Kato, Perturbation Theory for Linear Operators (Springer-Verlag, 1976).
[29] W. Kirsch and F. Martinelli, On the ergodic properties of the spectrum of general random operators, J. Reine Angew. Math. 334 (1982) 141–156.
[30] W. Kirsch and F. Martinelli, On the spectrum of Schrödinger operators with a random potential, Comm. Math. Phys. 85 (1982) 329–350.
[31] A. Klein and A. Koines, A general framework for localization of classical waves: I. Inhomogeneous media and defect eigenmodes, Math. Phys. Anal. Geom. 4 (2001) 97–130.
[32] A. Klein and A. Koines, A general framework for localization of classical waves: II. Random media, Math. Phys. Anal. Geom. 7 (2004) 151–185.
[33] H. Kunz, The quantum Hall effect for electrons in a random potential, Comm. Math. Phys. 112 (1987) 121–145.
[34] R. B. Laughlin, Quantized Hall conductivity in two dimensions, Phys. Rev. B 23 (1981) 5632–5633.
[35] I. M. Lifshits, A. G. Gredeskul and L. A. Pastur, Introduction to the Theory of Disordered Systems (Wiley-Interscience, New York, 1988).
[36] Q. Niu and D. J. Thouless, Quantum Hall effect with realistic boundary conditions, Phys. Rev. B 35 (1987) 2188–2197.
[37] L. Pastur, Spectral properties of disordered systems in one-body approximation, Comm. Math. Phys. 75 (1980) 179–196.
[38] L. Pastur and A. Figotin, Spectra of Random and Almost-Periodic Operators (Springer-Verlag, 1992).
[39] D. J. Thouless, Localisation and the two-dimensional Hall effect, J. Phys. C 14 (1981) 3475–3480.
[40] W.-M. Wang, Microlocalization, percolation, and Anderson localization for the magnetic Schrödinger operator with a random potential, J. Funct. Anal. 146 (1997) 1–26.
October 23, 2009 12:3 WSPC/148-RMP
J070-00382
Reviews in Mathematical Physics Vol. 21, No. 9 (2009) 1081–1090 © World Scientific Publishing Company
METHOD OF CONSTRUCTING BRAID GROUP REPRESENTATION AND ENTANGLEMENT IN A 9 × 9 YANG–BAXTER SYSTEM
TAOTAO HU∗, GANGCHENG WANG, CHUNFANG SUN, CHENGCHENG ZHOU, QINGYONG WANG and KANG XUE†
School of Physics, Northeast Normal University, Changchun 130024, P. R. China
∗[email protected]
†[email protected]
Received 2 April 2009
In this paper, we present a reducible $n^2$-dimensional representation of the braid group, constructed on the tensor product of $n$-dimensional spaces. Specifically, we show that a combining method yields further $n^2$-dimensional braiding S-matrices which satisfy the braid relations. By the Yang–Baxterization approach, we derive a 9 × 9 unitary $\breve{R}$-matrix from a 9 × 9 braiding S-matrix that we have constructed. The entanglement properties of the $\breve{R}$-matrix are investigated, and it is shown that an arbitrary degree of entanglement for two-qutrit entangled states can be generated by the $\breve{R}$-matrix acting on the standard basis.
Keywords: Braid group representation; quantum entanglement; Yang–Baxter equation.
Mathematics Subject Classification 2000: 20F36, 20G05, 20G45, 81R10, 47N50
1. Introduction
Quantum entanglement is the most surprising nonclassical property of composite quantum systems, which Schrödinger singled out many decades ago as “the characteristic trait of quantum mechanics”. Recently, entanglement has become one of the most fascinating topics in quantum information theory. It is recognized as an essential resource for quantum processing and quantum communications [1–3], and it plays a crucial role in quantum computation [4–6]. It is believed that protocols based on entangled states have an exponential speedup over the classical ones. Besides, in highly correlated states in condensed-matter systems such as superconductors [7, 8] and fractional quantum Hall liquids [9], the entanglement serves as a unique measure of quantum correlations between degrees of freedom. By exploiting entanglement and quantum coherence, certain problems may be solved faster by a quantum computer than by a classical one.
Recently, it has been revealed that there are natural and profound connections between quantum computation, braid group theory and the Yang–Baxter equation (YBE) [10–18]. While investigating the relationships among quantum entanglement, topological entanglement and quantum computation, Kauffman and Lomonaco [11] explored the important role of unitary braiding operators. It was shown that the braid matrix can be identified as a universal quantum gate [11, 13]. This motivates a novel way to study quantum entanglement based on the theory of braiding operators, as well as the YBE. The first step in this direction was taken by Zhang et al. [13]. In [11], the Bell matrix generating two-qubit entangled states was recognized to be a unitary braid transformation. Later on, an approach to describing Greenberger–Horne–Zeilinger (GHZ) states, or N-qubit entangled states, based on the theory of unitary braid representations was presented in [19]. Chen and his co-workers [20, 21] used unitary braiding operators to realize entanglement swapping and to generate the GHZ states, as well as the linear cluster states. These works introduced braiding operators and Yang–Baxter equations to the field of quantum information and quantum computation. In a very recent work [22], it was found that any pure two-qudit entangled state can be achieved by a universal Yang–Baxter matrix.
In this paper, we present a method of constructing $n^2$-dimensional matrix solutions of the braid group relations. The paper is organized as follows. In Sec. 2, we present reducible $n^2$-dimensional representations of the braid group algebra. Specifically, further $n^2$-dimensional braiding S-matrices which satisfy the braid relations can be obtained by the combining method, and we obtain some well-known and some new braiding matrices S. In Sec. 3, by the Yang–Baxterization approach, we derive a 9 × 9 unitary $\breve{R}$-matrix from a 9 × 9 S-matrix that we have constructed. We investigate the entanglement properties of the $\breve{R}$-matrix. It is shown that an arbitrary degree of entanglement for two-qutrit entangled states can be generated by the unitary $\breve{R}$-matrix acting on the standard basis. A summary is given in the last section.
2. Method of Constructing Braiding S-Matrices
In a recent paper [23], a reducible representation of the Temperley–Lieb algebra is constructed on the tensor product of n-dimensional spaces. In fact, the Temperley–Lieb algebra is a subalgebra of the braid algebra. Motivated by this, we investigate methods of constructing braid representations in order to obtain more useful braid representations conveniently. We first review the theory of braid groups. Let $B_n$ denote the braid group on n strands. $B_n$ is generated by the elementary braids $\{b_1, b_2, \ldots, b_{n-1}\}$ with the braid relations
\[
\begin{aligned}
b_i b_{i+1} b_i &= b_{i+1} b_i b_{i+1}, & 1 \le i \le n-2,\\
b_i b_j &= b_j b_i, & |i-j| \ge 2,
\end{aligned}
\tag{2.1}
\]
where the notation $b_i \equiv b_{i,i+1}$ is used, $b_{i,i+1}$ represents $1_1 \otimes 1_2 \otimes 1_3 \otimes \cdots \otimes S_{i,i+1} \otimes \cdots \otimes 1_n$, and $1_j$ is the unit matrix of the jth particle. By calculation, we get reducible representations of the braiding matrix, defined by two $n \times n$ matrices $A, B \in GL(n, \mathbb{C})$, which can also be seen as $n^2$-dimensional vectors $\{A^{a}_{b}, B^{d}_{c}\} \in \mathbb{C}^n \otimes \mathbb{C}^n$. The braiding matrix S can be expressed as
\[
S^{ab}_{cd} = A^{a}_{d} B^{b}_{c} \in \mathrm{Mat}(\mathbb{C}^n_i \otimes \mathbb{C}^n_{i+1}),
\tag{2.2}
\]
where we explicitly write the indices corresponding to the factors in the tensor product space $H = \bigotimes_{1}^{N} \mathbb{C}^n$. Substituting this relation into the braid relations (2.1), the constraints on A and B can be derived. The S in Eq. (2.2) is a solution of the braid relation if and only if (the detailed calculation is given in the Appendix)
\[
AB = BA, \quad \text{namely} \quad [A, B] = 0_{n \times n}.
\tag{2.3}
\]
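The criterion in Eq. (2.3) can be checked numerically for small cases. The following is a minimal sketch in plain Python, not taken from the paper: the helper names are illustrative, and it assumes that the index form $S^{ab}_{cd}$ is laid out as a matrix with rows labelled by (a, b) and columns by (c, d). It builds S from a pair (A, B) and tests $S_{12} S_{23} S_{12} = S_{23} S_{12} S_{23}$ on three factors.

```python
# Minimal sketch; helper names are illustrative, not from the paper.

def matmul(X, Y):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def kron(X, Y):
    """Kronecker (tensor) product with lexicographic row/column order."""
    return [[x * y for x in row_x for y in row_y]
            for row_x in X for row_y in Y]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def braiding_matrix(A, B):
    """S with entries S^{ab}_{cd} = A^a_d B^b_c, as in Eq. (2.2).

    Assumption: rows are labelled by (a, b) and columns by (c, d).
    """
    n = len(A)
    return [[A[a][d] * B[b][c] for c in range(n) for d in range(n)]
            for a in range(n) for b in range(n)]

def satisfies_braid_relation(S, n):
    """Check S12 S23 S12 == S23 S12 S23 on three n-dimensional factors."""
    S12 = kron(S, identity(n))
    S23 = kron(identity(n), S)
    return matmul(matmul(S12, S23), S12) == matmul(matmul(S23, S12), S23)

I2 = identity(2)
X = [[0, 1], [1, 0]]   # the matrix B of Eq. (2.4)
N = [[1, 1], [0, 1]]   # invertible, but does not commute with X

swap = braiding_matrix(I2, I2)   # A = B = identity
flip = braiding_matrix(X, X)     # A = B = X
commuting_ok = satisfies_braid_relation(braiding_matrix(I2, X), 2)
noncommuting_ok = satisfies_braid_relation(braiding_matrix(N, X), 2)
print(swap)
print(commuting_ok, noncommuting_ok)
```

With this layout, taking A = B equal to the identity gives the swap gate, and the check should pass for the commuting pair and fail for the non-commuting pair (N, X), in line with Eq. (2.3). Integer entries are used so that the matrix comparison is exact.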
For example, for n = 2, in order to get a significant result, we choose for convenience two commuting 2 × 2 matrices A and B in which every row and every column has only one nonzero element, equal to 1:
\[
A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
B = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
\tag{2.4}
\]
Substituting Eq. (2.4) into Eq. (2.2), we get
\[
S = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\quad \text{and} \quad
S = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}.
\tag{2.5}
\]
The first S is the standard swap gate [11]. In order to obtain more useful braiding S-matrices, we take the further combination
\[
S = \sum_{i=1}^{2} a_i S^{(i)},
\tag{2.6}
\]
where $S^{(1)}$ and $S^{(2)}$ both have the reducible form of Eq. (2.2), and $a_1$ and $a_2$ are the corresponding coefficients:
\[
(S^{(1)})^{ab}_{cd} = (A^{(1)})^{a}_{d} (B^{(1)})^{b}_{c}, \qquad
(S^{(2)})^{ab}_{cd} = (A^{(2)})^{a}_{d} (B^{(2)})^{b}_{c}.
\tag{2.7}
\]
According to Eqs. (2.1), (2.6) and (2.7), when $[A^{(i)}, B^{(i)}] = 0$, $[A^{(i)}, A^{(j)}] = 0$ and $[B^{(i)}, B^{(j)}] = 0$ (i, j = 1, 2), the S-matrix constructed in Eq. (2.6) is a braiding matrix which satisfies the braid relation (2.1). In accordance with this restriction, we set
\[
A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
B = \begin{pmatrix} 0 & i \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad
C = \begin{pmatrix} 0 & 1 \\ -i & 0 \end{pmatrix}.
\tag{2.8}
\]
The coefficients $a_1$ and $a_2$ are not restricted, and we set them equal to $1/\sqrt{2}$ for convenience. We let $A^{(1)} = B^{(2)} = A$, $B^{(1)} = B$, $A^{(2)} = C$ and $A^{(1)} = B^{(1)} = A$, $A^{(2)} = B$, $B^{(2)} = C$, respectively. With this combining method, we get two 4 × 4 models as follows:
\[
S = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 1 & i & 0 \\ 1 & 0 & 0 & 1 \\ -i & 0 & 0 & i \\ 0 & 1 & -i & 0 \end{pmatrix}
\quad \text{and} \quad
S = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 0 & 0 & i \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ -i & 0 & 0 & 1 \end{pmatrix}.
\tag{2.9}
\]
By this combination we obtain two 4 × 4 braiding models; the first S-matrix is a new braiding model, which is found to be locally equivalent to the DCNOT gate [24]. This motivates us to find generalized $n^2$-dimensional (n ≥ 2) braiding-matrix representations by the combining method. We take the similar combination
\[
S = \sum_{i=1}^{n} a_i S^{(i)},
\tag{2.10}
\]
where each $S^{(i)}$ also has the reducible form
\[
(S^{(i)})^{ab}_{cd} = (A^{(i)})^{a}_{d} (B^{(i)})^{b}_{c}.
\tag{2.11}
\]
Substituting Eqs. (2.10) and (2.11) into Eq. (2.1), we find that $A^{(i)}$ and $B^{(i)}$ are subject to the following conditions (the detailed calculation is given in the Appendix):
\[
[A^{(i)}, B^{(i)}] = 0, \qquad [A^{(i)}, A^{(j)}] = 0, \qquad [B^{(i)}, B^{(j)}] = 0 \qquad (i, j = 1, 2, 3, \ldots, n)
\tag{2.12}
\]
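As a concrete check of the combining method, the sketch below (plain Python, illustrative helper names, not the authors' code) rebuilds the two 4 × 4 models from the matrices of Eq. (2.8) and tests the braid relation. It assumes the row (a, b), column (c, d) layout, reads Eq. (2.8) as $B = \begin{pmatrix} 0 & i \\ 1 & 0 \end{pmatrix}$, $C = B^{-1}$, and sets both coefficients to 1 rather than $1/\sqrt{2}$, which keeps the arithmetic exact and does not affect the braid relation because both sides are homogeneous of degree three. The last example is a hypothetical family in which the cross commutator $[A^{(2)}, B^{(1)}]$ does not vanish; the Appendix computation uses $A^{(\lambda)} B^{(\nu)} = B^{(\nu)} A^{(\lambda)}$ for all pairs, and the check indeed fails in that case.

```python
# Minimal sketch; helper names are illustrative, not from the paper.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def kron(X, Y):
    return [[x * y for x in row_x for y in row_y]
            for row_x in X for row_y in Y]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def combined_braiding_matrix(pairs, coeffs):
    """S = sum_i a_i S^(i), with (S^(i))^{ab}_{cd} = (A^(i))^a_d (B^(i))^b_c.

    Assumption: rows are labelled by (a, b) and columns by (c, d).
    """
    n = len(pairs[0][0])
    return [[sum(coef * A[a][d] * B[b][c]
                 for (A, B), coef in zip(pairs, coeffs))
             for c in range(n) for d in range(n)]
            for a in range(n) for b in range(n)]

def satisfies_braid_relation(S, n):
    S12 = kron(S, identity(n))
    S23 = kron(identity(n), S)
    return matmul(matmul(S12, S23), S12) == matmul(matmul(S23, S12), S23)

I2 = identity(2)
B = [[0, 1j], [1, 0]]    # B of Eq. (2.8), as read here
C = [[0, 1], [-1j, 0]]   # C of Eq. (2.8); C is the inverse of B

# Unnormalised versions (sqrt(2) times the matrices of Eq. (2.9)).
model_1 = combined_braiding_matrix([(I2, B), (C, I2)], [1, 1])
model_2 = combined_braiding_matrix([(I2, I2), (B, C)], [1, 1])

# Hypothetical family: same-index commutators vanish, but [A(2), B(1)] != 0.
X = [[0, 1], [1, 0]]
Y = [[0, 1j], [-1j, 0]]
cross_case = combined_braiding_matrix([(I2, X), (Y, I2)], [1, 1])

print(satisfies_braid_relation(model_1, 2),
      satisfies_braid_relation(model_2, 2),
      satisfies_braid_relation(cross_case, 2))
```

The entries are small Gaussian integers, so the floating-point comparison of the two sides is exact here; for generic coefficients a tolerance-based comparison would be the safer choice.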
The coefficients $a_i$ are not restricted: whenever Eq. (2.12) is satisfied, the S-matrix in Eq. (2.10) satisfies the braid relation (2.1). Namely, we can obtain further $n^2$-dimensional braiding-matrix representations by this combining method.
3. A 9 × 9 Braiding S-Matrix, Yang–Baxterization and Entanglement
In Sec. 2, we showed that $n^2$-dimensional braiding-matrix representations can be obtained from the reducible representation and the combining method. We now focus on one 9 × 9 braiding S-matrix that we have constructed, in order to investigate its application to quantum entanglement. For n = 3, let the three 3 × 3 matrices A, B and C be as follows (we choose $\{|0\rangle, |1\rangle, |2\rangle\}$ as the standard basis):
\[
A = \begin{pmatrix} 0 & 0 & e^{i\varphi_1} \\ 1 & 0 & 0 \\ 0 & e^{-i\varphi_2} & 0 \end{pmatrix}, \qquad
B = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & e^{i\varphi_2} \\ e^{-i\varphi_1} & 0 & 0 \end{pmatrix}, \qquad
C = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\tag{3.1}
\]
Here $[A, B] = 0$, $[A, C] = 0$ and $[B, C] = 0$, so that Eq. (2.12) is satisfied; the parameters $\varphi_1$ and $\varphi_2$ are both real. We let $A^{(1)} = B^{(2)} = A$, $A^{(2)} = B^{(1)} = B$ and $A^{(3)} = B^{(3)} = C$, and we set the coefficients $a_i$ (i = 1, 2, 3) all equal to 1 for convenience. By combination, in terms of the standard basis $\{|00\rangle, |01\rangle, |02\rangle, |10\rangle, |11\rangle, |12\rangle, |20\rangle, |21\rangle, |22\rangle\}$, we get a 9 × 9 braiding S-matrix as follows:
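\[
S = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & q_1 & 0 & q_1 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & Q \\
0 & 0 & 1 & 0 & q_2^{-1} & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & Q \\
0 & 0 & q_2 & 0 & 1 & 0 & q_2 & 0 & 0 \\
q_1^{-1} & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & q_2^{-1} & 0 & 1 & 0 & 0 \\
q_1^{-1} & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\
0 & Q^{-1} & 0 & Q^{-1} & 0 & 0 & 0 & 0 & 1
\end{pmatrix}.
\tag{3.2}
\]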
Here $q_1 = e^{i\varphi_1}$, $q_2 = e^{i\varphi_2}$ and $Q = q_1 q_2$, and one can easily check that $S^2 = 3S$ and $S^{\dagger} = S$. The usual YBE takes the form [20]:
\[
\breve{R}_i(x) \breve{R}_{i+1}(xy) \breve{R}_i(y) = \breve{R}_{i+1}(y) \breve{R}_i(xy) \breve{R}_{i+1}(x).
\tag{3.3}
\]
The spectral parameters x and y, which are related to the one-dimensional momentum, play an important role in some typical models [16, 17]. The asymptotic behavior of $\breve{R}(x, \varphi_1, \varphi_2)$ is x-independent, i.e. $\lim \breve{R}_{i,i+1}(x, \varphi_1, \varphi_2) \propto b_i$, where the $b_i$ are braiding operators satisfying the braid relations (2.1). From a given solution S of the braid relation, an $\breve{R}(x)$ can be constructed by the Yang–Baxterization approach. Let the unitary Yang–Baxter matrix take the form
\[
\breve{R}(x) = \rho(x) (I + G(x) S).
\tag{3.4}
\]
This is a trigonometric solution of the YBE, where $\rho(x)$ is a normalization factor. One can choose an appropriate $\rho(x)$ to ensure that $\breve{R}(x)$ is unitary. Substituting Eq. (3.4) into Eq. (3.3) and using $S^2 = 3S$, one has $G(x) + G(y) + 3G(x)G(y) = G(xy)$. In addition, the initial condition $\breve{R}_i(x = 1) = I_i$ yields $G(x = 1) = 0$ and $\rho(x = 1) = 1$. The unitarity condition (i.e. $\breve{R}_i^{\dagger}(x) = \breve{R}_i^{-1}(x) = \breve{R}_i(x^{-1})$) can hold only if $\rho(x)\rho(x^{-1})\,(G(x) + G(x^{-1}) + 3G(x)G(x^{-1})) = 0$. Taking these conditions into account, we obtain a solution for $G(x)$ and $\rho(x)$:
\[
\rho(x) = x, \qquad G(x) = -\frac{x - x^{-1}}{3x}.
\tag{3.5}
\]
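The properties quoted for S and the Yang–Baxterization of Eqs. (3.4) and (3.5) can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it takes A, B and C in the commuting form read here for Eq. (3.1) (with $B = A^{-1}$ and C the identity), uses the row (a, b), column (c, d) layout, and picks arbitrary sample values for $\varphi_1$, $\varphi_2$, x and y; all helper names are hypothetical. It tests $S^{\dagger} = S$, $S^2 = 3S$, the braid relation, the YBE (3.3) and the unitarity of $\breve{R}(x)$ for $|x| = 1$.

```python
import cmath

# Minimal sketch; helper names and sample parameter values are illustrative.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def kron(X, Y):
    return [[x * y for x in rx for y in ry] for rx in X for ry in Y]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def dagger(X):
    return [[X[j][i].conjugate() for j in range(len(X))]
            for i in range(len(X[0]))]

def close(X, Y, tol=1e-9):
    return all(abs(x - y) < tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

phi1, phi2 = 0.7, 1.9                        # arbitrary real sample phases
q1, q2 = cmath.exp(1j * phi1), cmath.exp(1j * phi2)

A = [[0, 0, q1], [1, 0, 0], [0, 1 / q2, 0]]  # assumed reading of Eq. (3.1)
B = [[0, 1, 0], [0, 0, q2], [1 / q1, 0, 0]]  # inverse of A
C = identity(3)
pairs = [(A, B), (B, A), (C, C)]             # A(1)=B(2)=A, A(2)=B(1)=B, A(3)=B(3)=C

# S^{ab}_{cd} = sum_i (A^(i))^a_d (B^(i))^b_c, with all coefficients a_i = 1.
S = [[sum(P[a][d] * Q[b][c] for P, Q in pairs)
      for c in range(3) for d in range(3)]
     for a in range(3) for b in range(3)]

def R(x):
    """R(x) = rho(x) (I + G(x) S) with rho and G from Eq. (3.5)."""
    rho, G = x, -(x - 1 / x) / (3 * x)
    return [[rho * ((r == c) + G * S[r][c]) for c in range(9)] for r in range(9)]

I3 = identity(3)

def on12(M):
    return kron(M, I3)

def on23(M):
    return kron(I3, M)

def triple(X, Y, Z):
    return matmul(matmul(X, Y), Z)

hermitian_ok = close(S, dagger(S))
three_s_ok = close(matmul(S, S), [[3 * s for s in row] for row in S])
braid_ok = close(triple(on12(S), on23(S), on12(S)),
                 triple(on23(S), on12(S), on23(S)))

x, y = cmath.exp(0.4j), cmath.exp(1.1j)      # sample spectral parameters, |x| = |y| = 1
ybe_ok = close(triple(on12(R(x)), on23(R(x * y)), on12(R(y))),
               triple(on23(R(y)), on12(R(x * y)), on23(R(x))))
unitary_ok = close(matmul(R(x), dagger(R(x))), identity(9))

print(hermitian_ok, three_s_ok, braid_ok, ybe_ok, unitary_ok)
```

Since $S^2 = 3S$ and $S^{\dagger} = S$, the matrix S/3 is a projector, and with Eq. (3.5) one has $\breve{R}(x) = x\,(I - S/3) + x^{-1}\,S/3$, which makes the unitarity for $|x| = 1$ transparent.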
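Substituting Eqs. (3.2) and (3.5) into Eq. (3.4), the unitary solution of the YBE is obtained as follows:
\[
\breve{R}_i(x, \varphi_1, \varphi_2) = \frac{1}{3} \begin{pmatrix}
b & 0 & 0 & 0 & 0 & a q_1 & 0 & a q_1 & 0 \\
0 & b & 0 & a & 0 & 0 & 0 & 0 & a Q \\
0 & 0 & b & 0 & a q_2^{-1} & 0 & a & 0 & 0 \\
0 & a & 0 & b & 0 & 0 & 0 & 0 & a Q \\
0 & 0 & a q_2 & 0 & b & 0 & a q_2 & 0 & 0 \\
a q_1^{-1} & 0 & 0 & 0 & 0 & b & 0 & a & 0 \\
0 & 0 & a & 0 & a q_2^{-1} & 0 & b & 0 & 0 \\
a q_1^{-1} & 0 & 0 & 0 & 0 & a & 0 & b & 0 \\
0 & a Q^{-1} & 0 & a Q^{-1} & 0 & 0 & 0 & 0 & b
\end{pmatrix},
\tag{3.6}
\]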
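where $a = x^{-1} - x$ and $b = 2x + x^{-1}$. The Gell-Mann matrices $\lambda_\mu$, a basis for the Lie algebra su(3) [25], satisfy $[I_\lambda, I_\mu] = i f_{\lambda\mu\nu} I_\nu$ ($\lambda, \mu, \nu = 1, \ldots, 8$), where $I_\mu = \frac{1}{2}\lambda_\mu$. As in a recent paper, we denote $I_\pm = I_1 \pm i I_2$, $V_\pm = I_4 \mp i I_5$, $U_\pm = I_6 \pm i I_7$ and $Y = \frac{2}{\sqrt{3}} I_8$. We also generate three sets of realizations of SU(3), in which the subscripts 1 and 2 on the right-hand sides label the two particles and $I^3$ denotes the third component:
\[
\begin{aligned}
I^{(1)}_\pm &= I_1^\pm I_2^\mp, \qquad U^{(1)}_\pm = U_1^\pm V_2^\mp, \qquad V^{(1)}_\pm = V_1^\pm U_2^\mp,\\
I^{(1)}_3 &= \tfrac{1}{3}(I_1^3 - I_2^3) + \tfrac{1}{2}(I_1^3 Y_2 - Y_1 I_2^3),\\
Y^{(1)} &= \tfrac{1}{3}(Y_1 + Y_2) - \tfrac{2}{3} I_1^3 I_2^3 - \tfrac{1}{2} Y_1 Y_2;\\[4pt]
I^{(2)}_\pm &= U_1^\pm U_2^\mp, \qquad U^{(2)}_\pm = V_1^\pm I_2^\mp, \qquad V^{(2)}_\pm = I_1^\pm V_2^\mp,\\
I^{(2)}_3 &= \tfrac{1}{2}\left[-\tfrac{1}{3}(I_1^3 - I_2^3) + \tfrac{1}{2}(Y_1 - Y_2) + I_1^3 Y_2 - Y_1 I_2^3\right],\\
Y^{(2)} &= -\left[\tfrac{1}{3}(I_1^3 + I_2^3) + \tfrac{1}{6}(Y_1 + Y_2) + \tfrac{2}{3} I_1^3 I_2^3 + \tfrac{1}{2} Y_1 Y_2\right];\\[4pt]
I^{(3)}_\pm &= V_1^\pm V_2^\mp, \qquad U^{(3)}_\pm = I_1^\pm U_2^\mp, \qquad V^{(3)}_\pm = U_1^\pm I_2^\mp,\\
I^{(3)}_3 &= \tfrac{1}{2}\left[-\tfrac{1}{3}(I_1^3 - I_2^3) - \tfrac{1}{2}(Y_1 - Y_2) + I_1^3 Y_2 - Y_1 I_2^3\right],\\
Y^{(3)} &= \tfrac{1}{3}(I_1^3 + I_2^3) - \tfrac{1}{6}(Y_1 + Y_2) - \tfrac{2}{3} I_1^3 I_2^3 - \tfrac{1}{2} Y_1 Y_2.
\end{aligned}
\]
We denote $I^{(k)}_\pm = I^{(k)}_1 \pm i I^{(k)}_2$, $V^{(k)}_\pm = I^{(k)}_4 \mp i I^{(k)}_5$, $U^{(k)}_\pm = I^{(k)}_6 \pm i I^{(k)}_7$ and $Y^{(k)} = \frac{2}{\sqrt{3}} I^{(k)}_8$ (k = 1, 2, 3). These realizations satisfy the commutation relations $[I^{(i)}_\lambda, I^{(j)}_\mu] = i \delta_{ij} f_{\lambda\mu\nu} I^{(i)}_\nu$ ($\lambda, \mu, \nu = 1, \ldots, 8$; i, j = 1, 2, 3).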
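For the ith and (i + 1)th lattice sites, the $\breve{R}$-matrix can be expressed in terms of the above operators:
\[
\begin{aligned}
\breve{R}(x, \varphi_1, \varphi_2) = {} & \tfrac{1}{3} a \big[ I^{(1)}_+ + I^{(1)}_- + Q (V^{(1)}_- + U^{(1)}_+) + Q^{-1} (U^{(1)}_- + V^{(1)}_+) \\
& + I^{(2)}_+ + I^{(2)}_- + q_1 (V^{(2)}_+ + U^{(2)}_-) + q_1^{-1} (V^{(2)}_- + U^{(2)}_+) \\
& + I^{(3)}_+ + I^{(3)}_- + q_2 (V^{(3)}_+ + U^{(3)}_-) + q_2^{-1} (V^{(3)}_- + U^{(3)}_+) \big] + \tfrac{b}{3} (I \otimes I).
\end{aligned}
\]
So the whole tensor space $\mathbb{C}^3 \otimes \mathbb{C}^3$ is completely decomposed, i.e. $\mathbb{C}^3 \otimes \mathbb{C}^3 = \mathbb{C}^3 \oplus \mathbb{C}^3 \oplus \mathbb{C}^3$. In addition, each block of the $\breve{R}$-matrix can be represented by the fundamental representation of the SU(3) algebra.
According to the condition $\breve{R}_i^{\dagger}(x) = \breve{R}_i^{-1}(x)$, one gets $x^{*} = x^{-1}$, so we can introduce a new parameter $\theta$ with $x = e^{i\theta}$, and $\theta$ may be related to the degree of entanglement. When $\breve{R}(\theta, \varphi_1, \varphi_2)$ acts on the separable state $|mn\rangle$, it yields the following family of states: $|\psi_{mn}\rangle = \sum_{ij=00}^{22} \breve{R}^{ij}_{mn} |ij\rangle$ (m, n = 0, 1, 2). For example, if m = 0 and n = 0,
\[
|\psi_{00}\rangle = \tfrac{1}{3} \left( b |00\rangle + a q_1^{-1} |12\rangle + a q_1^{-1} |21\rangle \right).
\tag{3.7}
\]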
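The state in Eq. (3.7) is already in Schmidt form, since $|00\rangle$, $|12\rangle$ and $|21\rangle$ involve distinct basis vectors in each factor. As a rough numerical illustration (hypothetical helper names, plain Python), the sketch below evaluates the purity $I_1 = \sum_j |\kappa_j|^4$ of the reduced state and the quantity $\sqrt{\frac{d}{d-1}(1 - I_1)}$ with d = 3, which is the generalized concurrence that the text goes on to quote in Eqs. (3.8) and (3.9), and compares it with the closed form given there.

```python
import cmath
import math

# Minimal sketch; function names are illustrative, not from the paper.

def psi00_amplitudes(theta, phi1=0.0):
    """Amplitudes of |00>, |12>, |21> in Eq. (3.7) for x = exp(i*theta)."""
    x = cmath.exp(1j * theta)
    a = 1 / x - x
    b = 2 * x + 1 / x
    q1_inv = cmath.exp(-1j * phi1)
    return [b / 3, a * q1_inv / 3, a * q1_inv / 3]

def generalized_concurrence(amplitudes, d=3):
    """sqrt(d/(d-1) * (1 - I1)) with I1 the sum of |kappa_j|^4."""
    probs = [abs(k) ** 2 for k in amplitudes]
    purity = sum(p * p for p in probs)
    return math.sqrt(max(0.0, d / (d - 1) * (1.0 - purity)))

def closed_form(theta):
    """The closed form (2*sqrt(2)/3) |sin(theta)| sqrt(2 cos^2(theta) + 1)."""
    return (2 * math.sqrt(2) / 3) * abs(math.sin(theta)) \
        * math.sqrt(2 * math.cos(theta) ** 2 + 1)

for theta in (0.0, math.pi / 6, math.pi / 3, math.pi / 2, 2 * math.pi / 3):
    amps = psi00_amplitudes(theta, phi1=0.7)
    print(round(theta, 4),
          round(generalized_concurrence(amps), 6),
          round(closed_form(theta), 6))
```

The phase $\varphi_1$ only multiplies two amplitudes by a unit-modulus factor, so it drops out of the result; the value 0.7 used above is an arbitrary sample.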
In [26], the generalized concurrence (or the degree of entanglement [27]) for two qudits is given by
\[
C = \sqrt{\frac{d}{d-1} (1 - I_1)},
\tag{3.8}
\]
where $I_1 = \mathrm{Tr}[\rho_A^2] = \mathrm{Tr}[\rho_B^2] = |\kappa_0|^4 + |\kappa_1|^4 + \cdots + |\kappa_{d-1}|^4$, $\rho_A$ and $\rho_B$ are the reduced density matrices for the sub-systems, and the $\kappa_j$ (j = 0, 1, . . . , d − 1) are the Schmidt coefficients. Then we obtain the generalized concurrence of the state $|\psi_{00}\rangle$ as
\[
C = \sqrt{\frac{3}{2} \left( 1 - \frac{1}{81} |2x + x^{-1}|^4 - \frac{2}{81} |x - x^{-1}|^4 \right)}
= \frac{2\sqrt{2}}{3} |\sin\theta| \sqrt{2\cos^2\theta + 1}.
\tag{3.9}
\]
One finds that when $\theta = \frac{\pi}{3}$, the state $|\psi_{00}\rangle$ becomes the maximally entangled state of two qutrits, $|\psi_{00}\rangle = \frac{1}{\sqrt{3}} (e^{i\pi/6} |00\rangle - i q_1^{-1} |12\rangle - i q_1^{-1} |21\rangle)$. In general, if one acts with the unitary Yang–Baxter matrix $\breve{R}(x)$ on the basis $\{|00\rangle, |01\rangle, |02\rangle, |10\rangle, |11\rangle, |12\rangle, |20\rangle, |21\rangle, |22\rangle\}$, the same generalized concurrence as in Eq. (3.9) is obtained. It is easy to check that the generalized concurrence ranges from 0 to 1 when the parameter $\theta$ runs from 0 to $\pi$. However, for $\theta \in [0, \pi]$, the generalized concurrence is not a monotonic function of $\theta$. When $x = e^{i\pi/3}$, nine complete and orthogonal maximally entangled states for two qutrits are generated. The entanglement does not depend on the parameters $\varphi_1$ and $\varphi_2$, so one can verify that $\varphi_1$ and $\varphi_2$ may be absorbed into a local operation.
4. Summary
In this paper, we have presented a reducible representation of the braid group algebra. Specifically, by the further combining method, we can get more $n^2$-dimensional braiding S-matrices and obtain some well-known and some new braiding models. From a 9 × 9 braiding S-matrix that we have constructed, which satisfies the braid relations, we derived a unitary $\breve{R}$-matrix via Yang–Baxterization. We have shown that an arbitrary degree of entanglement for two-qutrit entangled states can be generated by the unitary $\breve{R}$-matrix acting on the standard basis.
Acknowledgments
This work was supported by the NSF of China (Grant No. 10875026).
Appendix. The Two Limited Conditions
The two conditions in Sec. 2 are derived in detail as follows. If we substitute $S^{ab}_{cd} = A^{a}_{d} B^{b}_{c}$ into the braid relation in Eq. (2.1) (i.e. $S_{12} S_{23} S_{12} = S_{23} S_{12} S_{23}$), we obtain
\[
\begin{aligned}
[S_{12} S_{23} S_{12}]^{abc}_{def} &= [S_{12}]^{abc}_{ijk} [S_{23}]^{ijk}_{\alpha\beta\gamma} [S_{12}]^{\alpha\beta\gamma}_{def}
= S^{ab}_{ij} S^{jc}_{\beta f} S^{i\beta}_{de} \\
&= A^{a}_{j} B^{b}_{i} A^{j}_{f} B^{c}_{\beta} A^{i}_{e} B^{\beta}_{d}
= (BA)^{b}_{e} (A^2)^{a}_{f} (B^2)^{c}_{d},
\end{aligned}
\tag{A.1}
\]
\[
\begin{aligned}
[S_{23} S_{12} S_{23}]^{abc}_{def} &= [S_{23}]^{abc}_{ijk} [S_{12}]^{ijk}_{\alpha\beta\gamma} [S_{23}]^{\alpha\beta\gamma}_{def}
= S^{bc}_{jk} S^{aj}_{d\beta} S^{\beta k}_{ef} \\
&= A^{b}_{k} B^{c}_{j} A^{a}_{\beta} B^{j}_{d} A^{\beta}_{f} B^{k}_{e}
= (AB)^{b}_{e} (A^2)^{a}_{f} (B^2)^{c}_{d}.
\end{aligned}
\tag{A.2}
\]
According to Eqs. (A.1) and (A.2), one can see that if AB = BA, namely [A, B] = 0, the braid relation $S_{12} S_{23} S_{12} = S_{23} S_{12} S_{23}$ holds.
Substituting $S = \sum_{i=1}^{n} a_i S^{(i)}$ into $[S_{12} S_{23} S_{12}]^{abc}_{def}$ and $[S_{23} S_{12} S_{23}]^{abc}_{def}$, respectively, one has
\[
\begin{aligned}
[S_{12} S_{23} S_{12}]^{abc}_{def} &= \sum_{g,h,l}^{n} a_g a_h a_l [S^{(g)}_{12} S^{(h)}_{23} S^{(l)}_{12}]^{abc}_{def}
= \sum_{g,h,l}^{n} a_g a_h a_l (S^{(g)})^{ab}_{ij} (S^{(h)})^{jc}_{\beta f} (S^{(l)})^{i\beta}_{de} \\
&= \sum_{g,h,l}^{n} a_g a_h a_l (B^{(g)} A^{(l)})^{b}_{e} (A^{(g)} A^{(h)})^{a}_{f} (B^{(h)} B^{(l)})^{c}_{d},
\end{aligned}
\tag{A.3}
\]
\[
\begin{aligned}
[S_{23} S_{12} S_{23}]^{abc}_{def} &= \sum_{\lambda,\mu,\nu}^{n} a_\lambda a_\mu a_\nu [S^{(\lambda)}_{23} S^{(\mu)}_{12} S^{(\nu)}_{23}]^{abc}_{def}
= \sum_{\lambda,\mu,\nu}^{n} a_\lambda a_\mu a_\nu (S^{(\lambda)})^{bc}_{jk} (S^{(\mu)})^{aj}_{d\beta} (S^{(\nu)})^{\beta k}_{ef} \\
&= \sum_{\lambda,\mu,\nu}^{n} a_\lambda a_\mu a_\nu (A^{(\lambda)} B^{(\nu)})^{b}_{e} (A^{(\mu)} A^{(\nu)})^{a}_{f} (B^{(\lambda)} B^{(\mu)})^{c}_{d},
\end{aligned}
\tag{A.4}
\]
where g, h, l = 1, 2, 3, . . . , n and $\lambda, \mu, \nu$ = 1, 2, 3, . . . , n, respectively. We can therefore let $g = \nu$, $h = \mu$ and $l = \lambda$; then, according to Eqs. (A.3) and (A.4), we require $A^{(\lambda)} B^{(\nu)} = B^{(\nu)} A^{(\lambda)}$, $A^{(\nu)} A^{(\mu)} = A^{(\mu)} A^{(\nu)}$ and $B^{(\mu)} B^{(\lambda)} = B^{(\lambda)} B^{(\mu)}$ ($\lambda, \mu, \nu$ = 1, 2, 3, . . . , n). Under this condition, Eq. (A.3) is equal to Eq. (A.4); namely, when the condition (2.12) is satisfied, the braid relation $S_{12} S_{23} S_{12} = S_{23} S_{12} S_{23}$ holds.
References
[1] A. K. Ekert, Quantum cryptography based on Bell’s theorem, Phys. Rev. Lett. 67 (1991) 661–663.
[2] C. H. Bennett and S. J. Wiesner, Communication via one- and two-particle operators on Einstein–Podolsky–Rosen states, Phys. Rev. Lett. 69 (1992) 2881–2884.
[3] C. H. Bennett, G. Brassard, C. Crépeau, R. Jozsa, A. Peres and W. K. Wootters, Teleporting an unknown quantum state via dual classical and Einstein–Podolsky–Rosen channels, Phys. Rev. Lett. 70 (1993) 1895–1899.
[4] C. H. Bennett and D. P. DiVincenzo, Quantum information and computation, Nature 404 (2000) 247–255.
[5] R. Raussendorf and H. J. Briegel, A one-way quantum computer, Phys. Rev. Lett. 86 (2001) 5188–5191.
[6] S.-S. Li, G.-L. Long, F.-S. Bai, S.-L. Feng and H.-Z. Zheng, Proc. Natl. Acad. Sci. USA 98 (2001) 11847–11848.
[7] S. Oh and J. Kim, Entanglement of electron spins in superconductors, Phys. Rev. B 71 (2005) 144523, 4 pp.
[8] V. Vedral, High temperature macroscopic entanglement, New J. Phys. 6 (2004) 102–120.
[9] X. G. Wen, Quantum order and symmetric spin liquids, Phys. Lett. A 300 (2002) 175–181.
[10] H. A. Dye, Unitary solutions to the Yang–Baxter equation in dimension four, Quant. Inf. Proc. 2 (2003) 117–150; arXiv:quant-ph/0211050.
[11] L. H. Kauffman and S. J. Lomonaco Jr., Braiding operators are universal quantum gates, New J. Phys. 6 (2004) 134–173; arXiv:quant-ph/0401090.
[12] Y. Zhang, L. H. Kauffman and M. L. Ge, Universal quantum gate, Yang–Baxterization and Hamiltonian, Int. J. Quant. Inform. 3(4) (2005) 669–678; arXiv:quant-ph/0412095.
[13] Y. Zhang, L. H. Kauffman and M. L. Ge, Yang–Baxterization, universal quantum gate and Hamiltonians, Quant. Inf. Proc. 4 (2005) 159–197; arXiv:quant-ph/0502015.
[14] J. Franko, E. C. Rowell and Z. Wang, Extraspecial 2-groups and images of braid group representations, J. Knot Theory Ramifications 15 (2006) 413–428; arXiv:math.RT/0503435.
[15] M. Nielsen and I. Chuang, Quantum Computation and Quantum Information (Cambridge University Press, 1999).
[16] C. N. Yang, Some exact results for the many-body problem in one dimension with repulsive delta-function interaction, Phys. Rev. Lett. 19 (1967) 1312–1315.
[17] C. N. Yang, S matrix for the one-dimensional N-body problem with repulsive or attractive δ-function interaction, Phys. Rev. 168 (1968) 1920–1923.
[18] R. J. Baxter, Partition function of the eight-vertex lattice model, Ann. Phys. 70 (1972) 193–228.
[19] Y. Zhang, E. C. Rowell, Y. S. Wu, Z. H. Wang and M. L. Ge, From extraspecial two-groups to GHZ states (2007); arXiv:quant-ph/0706.1761.
[20] J. L. Chen, K. Xue and M. L. Ge, Braiding transformation, entanglement swapping, and Berry phase in entanglement space, Phys. Rev. A 76 (2007) 042324, 6 pp.
[21] J. L. Chen, K. Xue and M. L. Ge, Berry phase and quantum criticality in Yang–Baxter systems, Ann. Phys. 323 (2008) 2614–2623.
[22] J. L. Chen, K. Xue and M. L. Ge, All pure two-qudit entangled states can be generated via a universal Yang–Baxter matrix assisted by local unitary transformations (2008); arXiv:0809.2321.
[23] P. P. Kulish, N. Manojlovic and Z. Nagy, Quantum symmetry algebras of spin systems related to Temperley–Lieb R-matrices, J. Math. Phys. 49 (2008) 023510, 9 pp.
[24] G. Wang, K. Xue, C. Wu, H. Liang and C. H. Oh, Entanglement and the Berry phase in a new Yang–Baxter system, J. Phys. A 42 (2009) 125207, 8 pp.
[25] W. Pfeifer, The Lie Algebras su(N). An Introduction (Birkhäuser Verlag, 2003).
[26] S. Albeverio and S. M. Fei, A note on invariants and entanglements, J. Opt. B: Quantum Semiclass. Opt. 3 (2001) 223–227.
[27] S. Hill and W. K. Wootters, Entanglement of a pair of quantum bits, Phys. Rev. Lett. 78 (1997) 5022–5025.
[28] W. K. Wootters, Entanglement of formation of an arbitrary state of two qubits, Phys. Rev. Lett. 80 (1998) 2245–2248.
[29] M. V. Berry, Quantal phase factors accompanying adiabatic changes, Proc. Roy. Soc. London Ser. A 392 (1984) 45–57.
October 23, 2009 12:9 WSPC/148-RMP
J070-00383
Reviews in Mathematical Physics Vol. 21, No. 9 (2009) 1091–1143 © World Scientific Publishing Company
KO-HOMOLOGY AND TYPE I STRING THEORY
RUI M. G. REIS Department of Mathematical Sciences, University of Aberdeen, King’s College, Aberdeen AB24 3UE, UK
[email protected] RICHARD J. SZABO Department of Mathematics and Maxwell Institute for Mathematical Sciences, Heriot-Watt University, Colin Maclaurin Building, Riccarton, Edinburgh EH14 4AS, UK
[email protected] ALESSANDRO VALENTINO Courant Research Center “Higher Order Structures” and Mathematisches Institut, Georg-August-Universit¨ at G¨ ottingen, Bunsenstr. 3-5, D-37073 G¨ ottingen, Germany
[email protected] Received 5 December 2008 Revised 29 June 2009
We study the classification of D-branes and Ramond–Ramond fields in Type I string theory by developing a geometric description of KO-homology. We define an analytic version of KO-homology using KK-theory of real C∗-algebras, and construct explicitly the isomorphism between geometric and analytic KO-homology. The construction involves recasting the $C\ell_n$-index theorem and a certain geometric invariant into a homological framework which is used, along with a definition of the real Chern character in KO-homology, to derive cohomological index formulas. We show that this invariant also naturally assigns torsion charges to non-BPS states in Type I string theory, in the construction of classes of D-branes in terms of topological KO-cycles. The formalism naturally captures the coupling of Ramond–Ramond fields to background D-branes which cancel global anomalies in the string theory path integral. We show that this is related to a physical interpretation of bivariant KK-theory in terms of decay processes on spacetime-filling branes. We also provide a construction of the holonomies of Ramond–Ramond fields in Type II string theory in terms of topological K-chains.
Keywords: KO-homology; index theory; classification of D-branes; Type I string theory.
Mathematics Subject Classification 2000: 55N20, 81T30
0. Introduction
This paper continues the development and applications of the topological classification of D-branes in string theory using generalized homology theories. As explained by [48, 61, 37, 29, 49, 28], and reviewed in [52, 20, 62], D-brane charges and Ramond–Ramond fluxes are necessarily classified by the K-theory of spacetime in order to explain certain dynamical processes that cannot be accounted for by ordinary cohomology theory alone. However, as emphasized by [53, 46, 32, 59, 1, 60], a much more natural description of D-branes is provided by K-homology, which at the analytic level links them to Fredholm modules and spectral triples. This point of view was exploited in great detail in [54] to provide a rigorous geometric description of D-branes in Type II string theory using the Baum–Douglas construction of K-homology [10, 11].
In this paper, we extend this description to D-branes and Ramond–Ramond fields in Type I string theory. The classification using KO-theory is explored extensively in [61, 15, 51, 52, 49, 14, 2, 45]. We use this and Jakob’s approach [38] to construct a geometric realization of KO-homology as the homology theory dual to KO-theory, and describe various implications for the classification of Type I Ramond–Ramond charges and fluxes. As in [54], we simplify our treatment by dealing only with topologically trivial B-fields, and by ignoring the square-root of the Atiyah–Hirzebruch genus which naturally appears in the cohomological formula for D-brane charge [48, 52, 22]. Throughout we will compare and contrast with the complex case of Type II D-branes. We will also develop the analytic description of KO-homology. We define this using Kasparov’s KK-theory for real C∗-algebras, which also encompasses the analytic KR-homology theories appropriate to D-branes in orientifold backgrounds.
Generally, there is a description of the KK-theory group KK(A, B) in terms of an additive category whose objects are separable C∗-algebras and whose morphisms A → B are precisely the elements of KK(A, B), with the intersection product given by composition of morphisms. This category may be viewed as a certain completion of the stable homotopy category of separable C∗-algebras [16]. We use this description to provide a physical interpretation of KK-theory in terms of what we call “generalized D9-brane decay”, which unifies the description of charges in terms of tachyon condensation with the description of fluxes in terms of holonomies over anomaly-canceling background D-branes. In particular, we find a certain bound state obstruction to measuring the KO-theory class of a Ramond–Ramond field analogous to that found recently in [30]. Our physical interpretation of Kasparov’s theory is different from the proposal of [1, 2] (see also [53]) and is better suited to the global constructions of D-branes that we present. The use of KK-theory in string theory has also been exploited in the context of string and other dualities in [22].
One of the main technical achievements of this paper is an explicit, detailed proof of the equivalence between the topological and analytic definitions of KO-homology, an ingredient missing from the original Baum–Douglas construction. In the course of working out the details, we came across the unpublished recent preprint [12]
in which a proof is also given. While having some overlap with the present work, our proof is fundamentally different. Our approach is more tailored to the physical applications that we have in mind, as it employs the construction of a certain geometric invariant which is related later on to D-brane charges and Ramond–Ramond fluxes. This invariant gives a rigorous definition to the Z2 Wilson lines which are used in physical constructions of Type I D-branes with torsion charges through tachyon condensation [57, 61, 14], and it is related to the mod 2 index that appears in the phase of the Type IIA partition function [49, 25, 50]. It is also related to the homological invariants that we construct in our description of fluxes as holonomies over background D-branes. Mathematically, our technique leads to a straightforward derivation of index formulas in the real case, whose proof is also missing from [10, 11] and which we provide in detail here. On the other hand, in contrast to our approach, the method of proof given in [12] has the virtue of being applicable to a potentially wider class of generalized homology theories.
The first four sections of this paper present most of the technical details of the construction of KO-homology and its applications, a lot of which have not appeared in completeness anywhere in the literature and contain mathematical results of independent interest. Our exposition begins in Sec. 1 with a self-contained description of analytic KO-homology using Kasparov’s KK-theory for real C∗-algebras. Section 2 details a Baum–Douglas type construction of geometric KO-homology. Using the approach of [38], we prove that this theory is equivalent to the usual definition provided by the spectrum of KO-theory, and thereby establish that the geometric definition really is a generalized homology theory. The content of Sec. 3 is the crux of our mathematical results, the detailed proof of the isomorphism between geometric and analytic KO-homology.
This is done by recasting the $C\ell_n$-index theorem into a homological setting and thereby obtaining the associated homological invariant. (This is where our proof differs from that of [12].) In Sec. 4, we construct the Chern character in KO-homology and use it to derive cohomological formulas for the topological index (in the appropriate dimensionalities).
The final two sections of the paper turn to more physical applications of the geometric KO-homology framework. It is well known that the K-theory framework naturally accounts for certain properties of D-branes and Ramond–Ramond fields that would not be realized if these objects were classified by ordinary cohomology or homology alone. For example, it explains the appearance of stable but non-BPS branes carrying torsion charges, and correctly incorporates both the self-duality and quantization conditions on Ramond–Ramond fields. It has also led to a variety of new predictions concerning the spectrum of superstring theory, such as the instability of D-branes wrapping non-contractible cycles in certain instances due to the fact that their cohomology classes do not “lift” to K-theory [25], and the obstruction to simultaneous measurement of electric and magnetic Ramond–Ramond fluxes when torsion fluxes are included [30]. Moreover, certain properties of the string theory path integral, such as worldsheet anomalies and certain subtle phase factor contributions from the Ramond–Ramond fields, are most naturally formulated within
the context of K-theory [25, 29]. With these considerations in mind, we illustrate what the formalism of KO-homology we have developed tells us about the structure of D-branes and Ramond–Ramond fields in Type I string theory, extending previous work which is mostly carried out in the Type II setting. In Sec. 5, we explain the virtues of classifying Type I D-branes within the homological framework, and adapt some of the results of [25, 54] concerning the stability of brane constructions to the real case. The precise definitions involving D-branes, along with further motivation and results from the geometric K-homology formalism, can be found in [54] and will not be repeated here. We also examine in detail the problem of constructing torsion D-branes. In the real case this turns out to be much more involved than in the complex case, but we nevertheless formally give the constructions using the invariant built in Sec. 3 and suspension techniques. Finally, in Sec. 6, we demonstrate that the topological classification of Ramond–Ramond fields in Type II string theory is also much more natural within the context of geometric K-homology. We show that the pertinent differential K-theory group, which normally classifies fluxes, naturally describes the holonomies on background D-branes which are used to cancel the topological anomaly in the string theory path integral. This relation may be tied to the generalized D9-brane decay which lends a physical interpretation to KK-theory. We then provide a construction of the holonomies in terms of a geometric invariant defined on K-chains representing the background D-branes, and describe some of their properties.

1. Analytic KO-Homology

In this section we will give a detailed overview of the definition of KO-homology in terms of Kasparov's KK-theory for real C∗-algebras [40], and describe various properties that we will need in subsequent sections of this paper.

1.1. Real C∗-algebras

We begin with an overview of the theory of real C∗-algebras. The main references are [31, 42].

Definition 1.1. A real algebra is a ring A which is also an R-vector space such that λ(xy) = (λx)y = x(λy) for all λ ∈ R and all x, y ∈ A. A real ∗-algebra is a real algebra A equipped with a linear involution ∗ : A → A such that (xy)∗ = y∗x∗ for all x, y ∈ A. A real Banach algebra is a real algebra A equipped with a norm ‖−‖ : A → R such that ‖xy‖ ≤ ‖x‖ ‖y‖ and such that A is complete in the norm topology. If A is a unital algebra then we assume ‖1‖ = 1. A real Banach ∗-algebra is a real Banach algebra which is also a real ∗-algebra. A real C∗-algebra is a real Banach ∗-algebra such that
(i) The involution is an isometry, i.e. ‖x∗‖ = ‖x‖ for all x ∈ A; and
(ii) 1 + x∗x is invertible in A for all x ∈ A.
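As a concrete illustration of Definition 1.1, the following sketch checks numerically that the quaternions H, with quaternionic conjugation as involution and the Euclidean norm, satisfy the identities required of a real C∗-algebra. The function names (`qmul`, `qconj`, `qnorm`, `check_axioms`) and the random sampling are assumptions made for this illustration only; a finite sample is a sanity check, not a proof.

```python
import math
import random

def qmul(x, y):
    """Hamilton product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def qconj(x):
    """Quaternionic conjugation, used here as the involution x -> x*."""
    a, b, c, d = x
    return (a, -b, -c, -d)

def qnorm(x):
    """Euclidean norm on R^4."""
    return math.sqrt(sum(t * t for t in x))

def close(u, v, tol=1e-9):
    return all(abs(s - t) <= tol for s, t in zip(u, v))

def check_axioms(x, y):
    """Check the identities of Definition 1.1 on one sampled pair (x, y)."""
    anti = close(qconj(qmul(x, y)), qmul(qconj(y), qconj(x)))  # (xy)* = y* x*
    submult = qnorm(qmul(x, y)) <= qnorm(x) * qnorm(y) + 1e-9   # ||xy|| <= ||x|| ||y||
    isometry = abs(qnorm(qconj(x)) - qnorm(x)) <= 1e-12          # ||x*|| = ||x||
    # x* x equals ||x||^2 times the unit, so 1 + x* x is a real number >= 1,
    # hence invertible: this is condition (ii).
    positive = close(qmul(qconj(x), x), (qnorm(x) ** 2, 0.0, 0.0, 0.0))
    return anti and submult and isometry and positive

rng = random.Random(0)  # fixed seed so the check is reproducible
samples = [tuple(rng.uniform(-2, 2) for _ in range(4)) for _ in range(200)]
all_ok = all(check_axioms(x, y) for x, y in zip(samples[::2], samples[1::2]))
print(all_ok)
```

The key point is the last check: because x∗x is a non-negative multiple of the unit, 1 + x∗x can never vanish, which is exactly what condition (ii) demands.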
Remark 1.2. Although in the complex case invertibility of 1 + x∗x for all x ∈ A would follow immediately from the C∗-algebra structure, in the real case this is no longer true. For example, consider the real Banach ∗-algebra C with involution given by the identity map. Then 1 + i∗i = 1 + i² = 0 is not invertible, where i := √−1. This invertibility condition is fundamental to obtaining the usual representation theorem below for C∗-algebras in terms of bounded self-adjoint operators on a real Hilbert space. However, C with involution given by complex conjugation is a real C∗-algebra. Since the only R-linear involutions of C are the identity and complex conjugation, when we consider C as a real C∗-algebra the involution will always be implicitly assumed to be complex conjugation. More generally any complex C∗-algebra, regarded as a real vector space and with the same operations, is a real C∗-algebra.

Let us now give a number of examples of real C∗-algebras, some of which we will use later on in representation theorems.

Example 1.3. Let H_R be a real Hilbert space. Then the set of bounded linear operators B(H_R) with the usual operations is a real C∗-algebra. Any closed self-adjoint subalgebra of B(H_R) is also a real C∗-algebra. More generally, any closed self-adjoint subalgebra of a real C∗-algebra is always a real C∗-algebra.

Example 1.4. Let X be a locally compact Hausdorff space and C_0(X, R) the space of real-valued continuous functions vanishing at infinity. Then C_0(X, R) with pointwise operations, the supremum norm and involution given by the identity map is a real C∗-algebra. As in the complex case, C_0(X, R) is unital if and only if X is compact.

Example 1.5. With X as in Example 1.4 above, let Y be a closed subspace of X and C_0(X, Y; R) the subspace of C_0(X, C) consisting of maps f : X → C such that f(Y) ⊂ R. Then with the operations inherited from C_0(X, C), the subspace C_0(X, Y; R) is a real C∗-algebra.

Example 1.6. Let X be a locally compact Hausdorff space with involution τ : X → X, i.e. a homeomorphism such that τ ◦ τ = id_X, and consider the subset C_0(X, τ) of C_0(X, C) consisting of maps f such that f ◦ τ = f∗ = $\bar{f}$. Then C_0(X, τ), with the operations inherited from C_0(X, C), is a real C∗-algebra. If τ = id_X then C_0(X, τ) = C_0(X, R). If X is compact and Y is a closed subspace of X, then there is a compact Hausdorff space Z with an involution τ such that C(X, Y; R) ≅ C(Z, τ). However, the converse does not hold in general.

Example 1.7. Let V be a real vector space equipped with a quadratic form Q, and consider the associated real Clifford algebra C(V, Q). Assume, without loss of generality, that Q(v) = ⟨v, φ(v)⟩ for all v ∈ V with respect to an inner product on V, where the linear operator φ ∈ L(V) is symmetric and orthogonal. We can then define an involution on C(V, Q) by (v_1 · · · v_k)∗ = φ(v_k) · · · φ(v_1), i.e. if v ∈ V
then v∗ = φ(v). The isomorphism Φ : C(V ⊕ V, Q ⊕ −Q) → L(ΛV) induces a norm on C(V ⊕ V, Q ⊕ −Q) by pullback of the operator norm on L(ΛV), and the inclusion C(V, Q) → C(V, Q) ⊗ C(V, −Q) ≅ C(V ⊕ V, Q ⊕ −Q) given by x ↦ x ⊗ 1 thereby induces a norm on C(V, Q). Then C(V, Q) with its algebra structure, this involution and norm is a real C∗-algebra.

If A, B are real ∗-algebras then a real ∗-algebra homomorphism is a real algebra map φ : A → B, i.e. an R-linear ring homomorphism, such that φ(x∗) = φ(x)∗ for all x ∈ A. The homomorphism is assumed to be unital if both algebras are unital. We now come to the most general representation theorems for real C∗-algebras. If A is an algebra, we denote by M_n(A) the algebra of n × n matrices with entries in A.

Theorem 1.8. Let A be a finite-dimensional real C∗-algebra. Then there exist k, n_1, . . . , n_k ∈ N such that A ≅ M_{n_1}(A_1) × · · · × M_{n_k}(A_k) as real C∗-algebras with A_1, . . . , A_k ∈ {R, C, H}.

Proof. Let x ∈ A. If x∗x = xx∗ and x^n = 0 for some n ∈ N, then x = 0. This implies that A has no non-zero nilpotent two-sided ideals. Wedderburn's theorem on the representation of finite-dimensional real algebras states that any real algebra with no non-zero nilpotent two-sided ideals is isomorphic (as a real algebra) to a finite direct product of R-algebras of the form M_k(D), with k ∈ N and D a finite-dimensional division algebra over R. The only finite-dimensional division R-algebras are R, C and H. The direct product, with direct product operations, supremum norm
\[
\|(a_{ij})\| = \sup_{i,j} \|a_{ij}\|_A
\]
and involution (a_{ij})∗ = (a∗_{ji}), is a real C∗-algebra. One then shows, as in the complex case, that these two algebras are isomorphic as real C∗-algebras.

Analogously to the complex case, one also has the following result.

Theorem 1.9. Let A be any real C∗-algebra. Then there exists a real Hilbert space H_R such that A is isomorphic as a real C∗-algebra to a closed self-adjoint subalgebra of B(H_R).

Let A be a real C∗-algebra. We denote by A_C := A ⊗ C the complexification of A, which is a complex algebra containing A as a real algebra. We can define a map J_A : A_C → A_C by J_A(x + iy) = x − iy for all x, y ∈ A. The map J_A is a conjugate linear ∗-isomorphism of the complex C∗-algebra A_C. If φ : A → A is a continuous ∗-homomorphism, then the map J_A(φ) : A_C → A_C defined by J_A(φ)(x + iy) = φ(x) + iφ(y) is a continuous ∗-homomorphism such that J_A ◦ J_A(φ) = J_A(φ) ◦ J_A. Conversely, if J is a conjugate linear ∗-isomorphism of a complex C∗-algebra B, then A = {x ∈ B | J(x) = x} is a real C∗-algebra. This implies the following result.
Proposition 1.10. Let C∗_R be the category of real C∗-algebras and continuous ∗-algebra homomorphisms. Let C∗_{C,cl} be the category of pairs (A, J), where A is a complex C∗-algebra and J is a conjugate linear ∗-isomorphism of A, and continuous ∗-homomorphisms commuting with J. Then the assignments A ↦ (A_C, J_A), φ ↦ J_A(φ) define a functor J : C∗_R → C∗_{C,cl} which is an equivalence of categories.

1.2. Commutative real C∗-algebras

We will now specialize to the case of commutative algebras. As with complex Banach algebras, a maximal two-sided ideal in a real Banach algebra A is closed in A. If M is a maximal two-sided ideal of a real Banach algebra A, then A/M is isomorphic to one of R or C as real algebras. A character on a real algebra A is a non-zero real algebra map χ : A → C, assumed unital if A is unital. Let Ω_A be the space of characters of A. This can be given, as in the complex case, a locally compact Hausdorff space topology such that Ω_A is homeomorphic to Ω_{A_C}. Furthermore, A is unital if and only if Ω_A is compact. Given x ∈ A, evaluation at x gives a continuous map Γ(x) : Ω_A → C called the Gel'fand transform of x. From this, we obtain the Gel'fand transform of A, Γ : A → C_0(Ω_A, C), which is a continuous real algebra homomorphism of unit norm. If A is a real ∗-algebra, then Γ is a ∗-algebra homomorphism. The most important results on the representation of commutative real C∗-algebras are the following.

Theorem 1.11. Let A be a commutative real C∗-algebra. Then:
(i) The map τ : Ω_A → Ω_A defined by τ(χ) = $\bar{\chi}$ is an involution; and
(ii) The Gel'fand transform Γ : A → C_0(Ω_A, τ) is a real C∗-algebra isomorphism.

Proof. (i) The map τ is a bijection. The collection of sets U_{x,V} = {χ ∈ Ω_A | χ(x) ∈ V} for every x ∈ A and V open in C is a sub-basis for the topology of Ω_A. The complex conjugate $\bar{V}$ of V is an open set and $\tau^{-1}(U_{x,V}) = U_{x,\bar{V}}$. Thus τ is continuous. (ii) The map Γ is a real ∗-algebra map with ‖Γ(x)‖ = ‖x‖. One also has $\Gamma(x) \circ \tau(\chi) = \Gamma(x)(\bar{\chi}) = \overline{\chi(x)} = \Gamma(x)^*(\chi)$, and so Γ(x) ◦ τ = Γ(x)∗ and Γ(A) ⊂ C_0(Ω_A, τ). Let θ : A → A_C be the C∗-algebra embedding of A into its complexification. The map ϑ : Ω_{A_C} → Ω_A
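given by ϑ(f) = f ◦ θ is a homeomorphism and there is a commutative diagram
\[
\begin{array}{ccc}
A & \xrightarrow{\ \Gamma\ } & C_0(\Omega_A, \mathbb{C}) \\
{\scriptstyle\theta}\downarrow & & \downarrow{\scriptstyle\vartheta^*} \\
A_{\mathbb{C}} & \xrightarrow{\ \Gamma\ } & C_0(\Omega_{A_{\mathbb{C}}}, \mathbb{C}).
\end{array}
\]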
Using this one then shows that Γ(A) = C_0(Ω_A, τ).

Corollary 1.12. Let A be a commutative real C∗-algebra with trivial involution. Then A is ∗-isomorphic to C_0(Ω_A, R).

1.3. Hilbert modules

We will now start presenting an overview of KK-theory for real C∗-algebras. The basic references are [56, 16]. We begin by generalizing the notion of Hilbert space.

Definition 1.13. Let A be a (not necessarily commutative) real C∗-algebra. A pre-Hilbert module over A is a (right) A-module E equipped with an A-valued inner product, i.e. a bilinear map (−, −) : E × E → A such that
(i) (x, x) ≥ 0 for all x ∈ E and (x, x) = 0 if and only if x = 0;
(ii) (x, y) = (y, x)∗ for all x, y ∈ E; and
(iii) (x, y a) = (x, y) a for all x, y ∈ E, a ∈ A.
For x ∈ E, we define ‖x‖_E := ‖(x, x)‖^{1/2}. This defines a norm on E satisfying the Cauchy–Schwarz inequality. If E is complete under this norm, then it is called a Hilbert module over A.

We can define tensor products of C∗-algebras and Hilbert modules in the usual way (see [16, 56] for the constructions). If E is a pre-Hilbert module over the real C∗-algebra A, we assume that the complexification E ⊗ C is a pre-Hilbert module over A_C. This means that the A-valued inner product extends to a sesquilinear map. We assume that sesquilinear maps are linear in the second variable.

Let E, F be Hilbert A-modules and T : E → F an A-linear map. We call a map T∗ : F → E such that (T x, y)_F = (x, T∗y)_E for all x ∈ E, y ∈ F the adjoint of T. If it exists, the adjoint is unique by Definition 1.13(i). Not every A-linear map between Hilbert A-modules has an adjoint. We denote the set of all A-linear maps T : E → F admitting an adjoint by L(E, F). Elements of L(E, F) are bounded A-linear maps and L(E) := L(E, E) is a C∗-algebra with the operator norm and involution given by the adjoint. Given x ∈ F, y ∈ E, we define an operator θ_{x,y} ∈ L(E, F) by θ_{x,y}(z) = x (y, z)_E. These operators generate an L(E)–L(F)-bimodule whose norm closure in L(E, F) is denoted K(E, F). Elements of K(E, F) are called generalized compact operators.
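The operators θ_{x,y} can be made concrete in the simplest case A = R, E = R^n, F = R^m, where the A-valued inner product is the ordinary dot product. The sketch below, whose helper names (`inner`, `theta`) and sample vectors are assumptions chosen for illustration, checks on one example that θ_{y,x} serves as the adjoint of θ_{x,y}, i.e. (θ_{x,y} z, w)_F = (z, θ_{y,x} w)_E.

```python
def inner(x, y):
    """Dot product on R^n, standing in for the A-valued inner product when A = R."""
    return sum(a * b for a, b in zip(x, y))

def theta(x, y):
    """Return the rank-one operator z -> x * (y, z), mapping the space of y, z into that of x."""
    return lambda z: [a * inner(y, z) for a in x]

x = [1.0, -2.0]        # element of F = R^2
y = [3.0, 0.0, 1.0]    # element of E = R^3
z = [2.0, 5.0, -1.0]   # test vector in E
w = [4.0, 1.0]         # test vector in F

lhs = inner(theta(x, y)(z), w)   # (theta_{x,y} z, w)_F
rhs = inner(z, theta(y, x)(w))   # (z, theta_{y,x} w)_E
print(lhs, rhs)
```

Both sides reduce to the product (y, z)(x, w), which is why the two numbers agree for any choice of vectors.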
If E = H_R is a real Hilbert space, then L(E) is the usual space of bounded linear operators and K(E) is the space of compact operators. If n ∈ N ∪ {∞}, then A^n with inner product
\[
(x, y) := \sum_{i=1}^{n} x_i^* \, y_i
\]
for all x = (x_i)_{1≤i≤n}, y = (y_i)_{1≤i≤n} is a Hilbert module. One has K(A) ≅ A and K(A^∞) ≅ A ⊗ K_R where K_R := K(H_R).

Definition 1.14. Let A be a real C∗-algebra. The multiplier algebra of A, M(A), is the maximal C∗-algebra containing A as an essential ideal. Equivalently, by representing A ⊂ L(H_R) one has M(A) = {T ∈ L(H_R) | TS, ST ∈ A for all S ∈ A}.

The multiplier algebra M(A) is a C∗-algebra which is ∗-isomorphic to the C∗-algebra of double centralizers, i.e. pairs (T_1, T_2) ∈ L(A) × L(A) such that a T_1(b) = T_2(a) b, T_1(ab) = T_1(a) b and T_2(ab) = a T_2(b) for all a, b ∈ A. If A is unital, then M(A) = A. Furthermore, M(K_R) = L(H_R), and M(C_0(X, R)) = C_b(X, R) is the C∗-algebra of real-valued bounded continuous functions on a locally compact Hausdorff space X.

Proposition 1.15. Let E be a Hilbert A-module. Then there is an isomorphism L(E) ≅ M(K(E)).

1.4. KKO-theory

We will now define the KKO-theory groups using Kasparov's approach [40]. A useful survey of Kasparov's theory can be found in [33]. We assume that a real C∗-algebra A is separable and a real C∗-algebra B is σ-unital.

Definition 1.16. A (Kasparov) (A, B)-module is a triple (E, ρ, T), where E is a countably generated Hilbert B-module, ρ : A → L(E) is a ∗-homomorphism and T ∈ L(E) such that
\[
(T - T^*)\rho(a), \quad (T^2 - 1)\rho(a), \quad [T, \rho(a)] \in K(E) \tag{1.1}
\]
for all a ∈ A. A Kasparov module (E, ρ, T) is called degenerate if all operators in (1.1) are zero.

Two Kasparov modules (E_i, ρ_i, T_i), i = 1, 2 are said to be orthogonally equivalent if there is an isometric isomorphism U ∈ L(E_1, E_2) such that T_1 = U∗T_2U and ρ_1(a) = U∗ρ_2(a)U for all a ∈ A. Orthogonal equivalence is an equivalence relation on the set of Kasparov modules. We denote the set of equivalence classes by E(A, B). The subset containing degenerate modules is denoted D(A, B). Direct sum makes E(A, B) and D(A, B) into monoids.
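A finite-dimensional toy case may help fix the conditions in (1.1). Take A = B = R and E = R^2, so that K(E) = L(E) and every triple satisfies (1.1) automatically; what can still be tested is degeneracy. In the sketch below the representation `rho` (scalar multiples of the identity), the operators `FLIP` and `ZERO`, and all helper names are hypothetical choices made for illustration and do not come from the text.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    """Transpose, which is the adjoint for real matrices."""
    return [list(row) for row in zip(*a)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def is_zero(a):
    return all(x == 0 for row in a for x in row)

IDENTITY = [[1, 0], [0, 1]]

def rho(a):
    """Hypothetical representation of A = R on E = R^2 by scalar multiples of the identity."""
    return scale(a, IDENTITY)

def defects(T, a):
    """The three operators appearing in condition (1.1), evaluated at a."""
    r = rho(a)
    return [
        matmul(sub(T, transpose(T)), r),          # (T - T*) rho(a)
        matmul(sub(matmul(T, T), IDENTITY), r),   # (T^2 - 1) rho(a)
        sub(matmul(T, r), matmul(r, T)),          # [T, rho(a)]
    ]

def is_degenerate(T, samples=(-1.5, 1.0, 2.0)):
    """True if all three defects vanish at the sampled a (a = 1 suffices by linearity)."""
    return all(is_zero(d) for a in samples for d in defects(T, a))

FLIP = [[0, 1], [1, 0]]   # self-adjoint with FLIP^2 = 1
ZERO = [[0, 0], [0, 0]]   # still a Kasparov module here, since K(E) = L(E)

flip_degenerate = is_degenerate(FLIP)
zero_degenerate = is_degenerate(ZERO)
print(flip_degenerate, zero_degenerate)
```

With T = FLIP all three operators vanish, so the module is degenerate; with T = 0 the operator (T^2 − 1)ρ(a) = −ρ(a) is non-zero, so the module is not degenerate even though it satisfies (1.1).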
Definition 1.17. Let (E_i, ρ_i, T_i) ∈ E(A, B) for i = 0, 1, (E, ρ, T) ∈ E(A, B ⊗ C([0, 1], R)), and let f_t : B ⊗ C([0, 1], R) → B be the evaluation map f_t(g) = g(t). Then (E_0, ρ_0, T_0) and (E_1, ρ_1, T_1) are said to be homotopic and (E, ρ, T) is called a homotopy if $(E \otimes_{f_i} B, \, f_i \circ \rho, \, f_{i*}(T))$ is orthogonally equivalent to (E_i, ρ_i, T_i) for i = 0, 1, where $f_{i*}(T)(a) := f_i(T(a))$. Homotopy is an equivalence relation on E(A, B) and we denote the equivalence classes by [E, ρ, T].

It is useful to consider special kinds of homotopy. If E = C([0, 1], E_0), E_0 = E_1 and the induced maps t ↦ T_t, t ↦ ρ_t(a) for all a ∈ A are strongly ∗-continuous, then we call (E, ρ, T) a standard homotopy. If in addition ρ_t = ρ is constant and T_t is norm continuous, then (E, ρ, T) is called an operator homotopy. Any degenerate module is homotopic to the zero module. The quotient Q(E) := L(E)/K(E) is a generalization of the Calkin algebra. If ρ(a)[T_1, T_2]ρ(a)∗ ≥ 0 in Q(E), then (E, ρ, T_1) and (E, ρ, T_2) are operator homotopic.

Definition 1.18. The set of equivalence classes in E(A, B) with respect to homotopy of (A, B)-modules is denoted KKO(A, B) or KKO_0(A, B). For p, q ≥ 0 we define
\[
KKO_{p,q}(A, B) = KKO(A, B \otimes C_{p,q}),
\]
where C_{p,q} := C(R^{p,q}) is the real Clifford algebra of the vector space R^{p+q} with quadratic form of signature (p, q).

The equivalence relation allows us to simplify the (A, B)-modules required to define KKO(A, B). We need only consider modules of the form (B^∞, ρ, T) with T = T∗. If A is unital, we can further assume that ‖T‖ ≤ 1 and T^2 − 1 ∈ K(B^∞). There is another equivalence relation that we can define on E(A, B). We say that two (A, B)-modules (E_i, ρ_i, T_i), i = 0, 1 are stably operator homotopic, $(E_0, \rho_0, T_0) \sim_{\rm oh} (E_1, \rho_1, T_1)$, if there exist (E'_i, ρ'_i, T'_i) ∈ D(A, B) such that (E_0 ⊕ E'_0, ρ_0 ⊕ ρ'_0, T_0 ⊕ T'_0) and (E_1 ⊕ E'_1, ρ_1 ⊕ ρ'_1, T_1 ⊕ T'_1) are operator homotopic up to orthogonal equivalence. The set of equivalence classes with respect to $\sim_{\rm oh}$ coincides with the set KKO(A, B) defined above.

Proposition 1.19. The set KKO(A, B) enjoys the following properties:
(i) KKO(A, B) is an abelian group.
(ii) KKO(−, −) is a covariant bifunctor from the category of separable C∗-algebras into the category of abelian groups which is additive: KKO(A_1 ⊕ A_2, B) = KKO(A_1, B) ⊕ KKO(A_2, B), KKO(A, B_1 ⊕ B_2) = KKO(A, B_1) ⊕ KKO(A, B_2).
(iii) Any two ∗-homomorphisms f : A_2 → A_1 and g : B_1 → B_2 induce group homomorphisms $f^* : KKO(A_1, B) \to KKO(A_2, B)$, $g_* : KKO(A, B_1) \to KKO(A, B_2)$
defined by $f^*[E, \rho, T] = [E, \rho \circ f, T]$, $g_*[E, \rho, T] = [E \otimes_g B_2, \rho \otimes 1, T \otimes 1]$; and
(iv) Any two homotopies f_t : A_2 → A_1 and g_t : B_1 → B_2 induce the same homomorphism for all t ∈ [0, 1], i.e. $f_t^* = f_0^*$ and $g_{t*} = g_{0*}$.

If we assume B unital, then we can identify L(B^∞) ≅ M_2(M(B ⊗ K_R)) where M(B ⊗ K_R) is the multiplier algebra of B ⊗ K_R. Thus we can give ρ and T the form
\[
\rho = \begin{pmatrix} \rho_0 & 0 \\ 0 & \rho_1 \end{pmatrix}, \qquad
T = \begin{pmatrix} 0 & T^* \\ T & 0 \end{pmatrix}
\]
with ρ_0(a), ρ_1(a), T ∈ M(B ⊗ K_R) ≅ L(B^∞), ‖T‖ ≤ 1, and
\[
T^* T - 1, \quad T T^* - 1, \quad T \rho_1(a) - \rho_0(a) T \in B \otimes K_{\mathbb{R}}
\]
for all a ∈ A.

1.5. Analytic KO-homology

Specializing all of our constructions to the case A = R and B unital we get the KO-theory groups KKO(R, B) ≅ KO_0(B) and KKO_{p,q}(R, B) ≅ KO_{p−q}(B). In particular, KKO(R, C(X, R)) ≅ KO_0(C(X, R)) ≅ KO^0(X) for any compact Hausdorff space X. On the other hand, using the Gel'fand transform the contravariant functor (X, τ) ↦ C(X, τ) induces an equivalence of categories between the category of compact Hausdorff spaces with involution and the category of commutative real C∗-algebras. Since KKO(−, R) is also a contravariant functor, it follows that their composition (X, τ) ↦ KKO(C(X, τ), R) is a covariant functor.

Definition 1.20. Let (X, τ) be a compact Hausdorff space with involution. The analytic KO-homology groups of (X, τ) are defined by
\[
KO^a_n(X, \tau) = KKO_{n,0}(C(X, \tau), \mathbb{R}) = KKO(C(X, \tau), C_n)
\]
where C_n := C_{n,0} = C(R^n).

It will be helpful in some of our later analysis to have a closer look at our definition of KO^n(A) = KKO_{n,0}(A, R) = KKO(A, C_n), the KO-homology of a real C∗-algebra A. Following through the definitions, this is based on triples (H_R, ρ, T) which are defined by the data:
(i) H_R is a separable real Hilbert space;
(ii) ρ : A → L(H_R) is a unital representation of A; and
(iii) T is a bounded linear operator on H_R.
These are assumed to satisfy the following conditions:
(i) H_R is equipped with a Z_2-grading such that ρ(a) is even for all a ∈ A and T is odd;
(ii) For all a ∈ A one has
\[
(T^2 - 1)\rho(a), \quad (T - T^*)\rho(a), \quad T\rho(a) - \rho(a)T \in K_{\mathbb{R}}; \tag{1.2}
\]
and
(iii) There are odd R-linear operators ε_1, . . . , ε_n on H_R with the C_n algebra relations
\[
\varepsilon_i = -\varepsilon_i^*, \quad \varepsilon_i^2 = -1, \quad \varepsilon_i \varepsilon_j + \varepsilon_j \varepsilon_i = 0 \tag{1.3}
\]
for i ≠ j such that T and ρ(a) commute with each ε_i.

From (1.2) it follows that T may be taken to be a Fredholm operator without loss of generality (see [41, Lemma 5.1]), and we shall refer to the triple (H_R, ρ, T) as an n-graded Fredholm module. Let us denote by ΓO_n(A) the set of all n-graded Fredholm modules over A. Consider the equivalence relation ∼ on ΓO_n(A) generated by the relations:

Orthogonal equivalence: (H_R, ρ, T) ∼ (H'_R, ρ', T') if and only if there exists an isometric degree-preserving linear operator U : H_R → H'_R such that Uρ(a) = ρ'(a)U for all a ∈ A, UT = T'U, and Uε_i = ε'_i U; and

Homotopy equivalence: (H_R, ρ, T) ∼ (H_R, ρ, T') if and only if there exists a norm continuous function t ↦ T_t such that (H_R, ρ, T_t) is a Fredholm module for all t ∈ [0, 1] with T_0 = T, T_1 = T'.

We define the direct sum of two Fredholm modules (H_R, ρ, T) and (H'_R, ρ', T') to be the Fredholm module (H_R ⊕ H'_R, ρ ⊕ ρ', T ⊕ T'). We may now define KO^n(A) as the free abelian group generated by elements in ΓO_n(A)/∼ and quotiented by the subgroup generated by the set {[x_0 ⊕ x_1] − [x_0] − [x_1] | [x_0], [x_1] ∈ ΓO_n(A)/∼}. In KO^n(A) the inverse of a class represented by the module (H_R, ρ, T) is given by (H_R^o, ρ, T), where H_R^o is the Hilbert space H_R with the opposite Z_2-grading and where the operators ε_i reverse their signs. For a compact Hausdorff space X we define KO^a_n(X) := KO^n(C(X, R)) = KKO(C(X, R), C_n). Of course, this construction is exactly the one given before, only spelled out in more detail here. For further details and properties of this construction in the complex case, see [12].

1.6. The intersection product

Let D be a real C∗-algebra. Then there is a natural homomorphism τ_D : KKO(A, B) → KKO(A ⊗ D, B ⊗ D) defined by τ_D[B^∞, ρ, T] = [B^∞ ⊗ D, ρ ⊗ 1, T ⊗ 1].
We can define in KKO-theory a product ⊗_D : KKO(A, D) × KKO(D, B) → KKO(A, B) called the intersection product by
\[
[E_1, \rho_1, T_1] \otimes_D [E_2, \rho_2, T_2] = [E_1 \otimes_{\rho_2} E_2, \; \rho_1 \otimes_{\rho_2} 1, \; T_1 \,\#\, T_2],
\]
where T_1 # T_2 ∈ L(E_1 ⊗_{ρ_2} E_2) is a suitably defined operator [33]. If all C∗-algebras involved are separable, then the intersection product extends to a bilinear map ⊗_D : KKO(A_1, B_1 ⊗ D) × KKO(D ⊗ A_2, B_2) → KKO(A_1 ⊗ A_2, B_1 ⊗ B_2) given by
\[
x \otimes_D y = \tau_{A_2}(x) \otimes_{B_1 \otimes D \otimes A_2} \tau_{B_1}(y)
\]
for all (x, y).

Proposition 1.21. Let A be a separable C∗-algebra and B, D_1, D_2 σ-unital algebras. Suppose there exist α ∈ KKO(D_1, D_2) and β ∈ KKO(D_2, D_1) with α ⊗_{D_2} β = 1_{D_1} and β ⊗_{D_1} α = 1_{D_2}. Then there are isomorphisms
\[
\otimes_{D_1} \alpha : KKO(A, B \otimes D_1) \to KKO(A, B \otimes D_2), \qquad
\otimes_{D_2} \beta : KKO(A, B \otimes D_2) \to KKO(A, B \otimes D_1).
\]
If D_1, D_2 are separable, then one has isomorphisms
\[
\alpha \otimes_{D_2} : KKO(A \otimes D_2, B) \to KKO(A \otimes D_1, B), \qquad
\beta \otimes_{D_1} : KKO(A \otimes D_1, B) \to KKO(A \otimes D_2, B).
\]
If in addition there exist α ∈ KKO(D_1 ⊗ D_2, R) and β ∈ KKO(R, D_1 ⊗ D_2) such that β ⊗_{D_1} α = 1_{D_2} and β ⊗_{D_2} α = 1_{D_1}, then there are isomorphisms
\[
\begin{aligned}
\otimes_{D_1} \alpha &: KKO(A, B \otimes D_1) \to KKO(A \otimes D_2, B), &
\otimes_{D_2} \alpha &: KKO(A, B \otimes D_2) \to KKO(A \otimes D_1, B), \\
\beta \otimes_{D_1} &: KKO(A \otimes D_1, B) \to KKO(A, B \otimes D_2), &
\beta \otimes_{D_2} &: KKO(A \otimes D_2, B) \to KKO(A, B \otimes D_1).
\end{aligned}
\]

The last result in Proposition 1.21 allows us to conclude that the KKO-groups are stable, i.e. there are isomorphisms KKO(A ⊗ K_R, B) ≅ KKO(A, B) ≅ KKO(A, B ⊗ K_R). One also has the isomorphisms
\[
KKO(A \otimes C_{p,q}, B \otimes C_{r,s}) \cong KKO(A \otimes C_{p,q} \otimes C_{r,s}, B) \cong KKO(A \otimes C_{p-q+s-r,0}, B)
\]
along with symmetric isomorphisms. Since KKO_n(R, A) is the operator algebraic KO-theory of A, these isomorphisms and the periodicity of real Clifford algebras immediately imply mod 8 real Bott periodicity. Analogously, we obtain from the symmetric isomorphism Bott periodicity in analytic KO-homology.
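The index bookkeeping in the last isomorphism can be sketched in a few lines. The function below is a hypothetical helper, not part of the text: it returns the combination p − q + s − r reduced mod 8, where the reduction relies on the mod 8 periodicity just mentioned. Using the reduction to make sense of negative differences is an assumption of this sketch.

```python
def clifford_degree(p, q, r=0, s=0):
    """Degree (p - q + s - r) mod 8 attached to KKO(A (x) C_{p,q}, B (x) C_{r,s})."""
    return (p - q + s - r) % 8

# Adding one positive and one negative generator leaves the degree unchanged.
same_after_hyperbolic = clifford_degree(3, 1) == clifford_degree(4, 2)

# Shifting p by 8 leaves the degree unchanged (mod 8 periodicity).
same_after_period = clifford_degree(3, 1) == clifford_degree(11, 1)

# Moving C_{r,s} from the second argument to the first acts like C_{s,r} there.
same_after_transfer = clifford_degree(2, 0, 1, 3) == clifford_degree(2 + 3, 0 + 1)

print(same_after_hyperbolic, same_after_period, same_after_transfer)
```

Only the single integer p − q + s − r (mod 8) survives, which is the sense in which the four Clifford indices collapse to one degree.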
2. Geometric KO-Homology

We will now define geometric KO-homology, analogously to the Baum–Douglas construction of K-homology [10, 11, 54], and describe the basic properties of the topological KO-homology groups of a topological space that we will need later on. We will prove directly that this is a homology theory by comparing it with other formulations of KO-homology as the dual theory to KO-theory. In particular, in the next section we will show that this homology theory is equivalent to the analytic homology theory of the previous section.

2.1. Spin bordism

Throughout X will denote a finite CW-complex.

Definition 2.1. A KO-cycle on X is a triple (M, E, φ) where
(i) M is a compact spin manifold without boundary;
(ii) E is a real vector bundle over M; and
(iii) φ : M → X is a continuous map.

There are no connectedness requirements made upon M, and hence the bundle E can have different fiber dimensions on the different connected components of M. It follows that disjoint union (M_1, E_1, φ_1) ⊔ (M_2, E_2, φ_2) := (M_1 ⊔ M_2, E_1 ⊔ E_2, φ_1 ⊔ φ_2) is a well-defined operation on the set of KO-cycles on X.

Definition 2.2. Two KO-cycles (M_1, E_1, φ_1) and (M_2, E_2, φ_2) on X are isomorphic if there exists a diffeomorphism h : M_1 → M_2 such that
(i) h preserves the spin structures;
(ii) h∗(E_2) ≅ E_1 as real vector bundles; and
(iii) The diagram
\[
\begin{array}{ccc}
M_1 & \xrightarrow{\ h\ } & M_2 \\
 & {\scriptstyle\phi_1}\searrow & \downarrow{\scriptstyle\phi_2} \\
 & & X
\end{array}
\]
commutes.
The set of isomorphism classes of KO-cycles on X is denoted ΓO(X).

Definition 2.3. Two KO-cycles (M_1, E_1, φ_1) and (M_2, E_2, φ_2) on X are spin bordant if there exist a compact spin manifold W with boundary, a real vector bundle E → W, and a continuous map φ : W → X such that the two KO-cycles (∂W, E|_{∂W}, φ|_{∂W}), (M_1 ⊔ (−M_2), E_1 ⊔ E_2, φ_1 ⊔ φ_2)
are isomorphic, where −M_2 denotes M_2 with the spin structure on its tangent bundle TM_2 reversed. The triple (W, E, φ) is called a spin bordism of KO-cycles.

2.2. Real vector bundle modification

Let M be a spin manifold and F → M a C^∞ real spin vector bundle with fibers of dimension n := dim_R F_p ≡ 0 mod 8 for p ∈ M. Let 1^R_M := M × R denote the trivial real line bundle over M. Then F ⊕ 1^R_M is a real vector bundle over M with fibers of dimension n + 1 and projection map λ. By choosing a C^∞ metric on it, we may define the unit sphere bundle
\[
\widehat{M} = S(F \oplus 1^{\mathbb{R}}_M) \tag{2.1}
\]
by restricting the set of fiber vectors of F ⊕ 1^R_M to those which have unit norm. The tangent bundle of F ⊕ 1^R_M fits into an exact sequence of bundles given by
\[
0 \to \lambda^*(F \oplus 1^{\mathbb{R}}_M) \to T(F \oplus 1^{\mathbb{R}}_M) \to \lambda^*(TM) \to 0.
\]
Upon choosing a splitting, the spin structures on TM and F induce a spin structure on TM̂, and hence M̂ is a compact spin manifold. By construction, M̂ is a sphere bundle over M with n-dimensional spheres S^n as fibers. We denote the bundle projection by
\[
\pi : \widehat{M} \to M. \tag{2.2}
\]
We may regard the total space M̂ as consisting of two copies B_±(F), with opposite spin structures, of the unit ball bundle B(F) of F glued together by the identity map id_{S(F)} on its boundary so that
\[
\widehat{M} = B_+(F) \cup_{S(F)} B_-(F). \tag{2.3}
\]
Since n ≡ 0 mod 8, the group Spin(n) has two irreducible real half-spin representations. The spin structure on F associates to these representations real vector bundles S_0(F) and S_1(F) of equal rank 2^{n/2} over M. Their Whitney sum S(F) = S_0(F) ⊕ S_1(F) is a bundle of real Clifford modules over TM such that C(F) ≅ End S(F), where C(F) is the real Clifford algebra bundle of F. Let $\slashed{S}_+(F)$ and $\slashed{S}_-(F)$ be the real spinor bundles over F obtained from pullbacks to F by the bundle projection F → M of S_0(F) and S_1(F), respectively. Clifford multiplication induces a bundle map F ⊗ S_0(F) → S_1(F) that defines a vector bundle map $\sigma : \slashed{S}_+(F) \to \slashed{S}_-(F)$ covering id_F which is an isomorphism outside the zero section of F. Since the ball bundle B(F) is a sub-bundle of F, we may form real spinor bundles over B_±(F) as the restriction bundles $\Delta_\pm(F) = \slashed{S}_\pm(F)|_{B_\pm(F)}$. We can then glue ∆_+(F) and ∆_−(F) along S(F) = ∂B(F) by the Clifford multiplication map σ, giving a real vector bundle over M̂ defined by
\[
H(F) = \Delta_+(F) \cup_\sigma \Delta_-(F). \tag{2.4}
\]
For each p ∈ M, the bundle H(F)|_{π^{−1}(p)} is the real Bott generator vector bundle over the n-dimensional sphere π^{−1}(p) [10].
Definition 2.4. Let (M, E, φ) be a KO-cycle on X and F a C^∞ real spin vector bundle over M with fibers of dimension dim_R F_p ≡ 0 mod 8 for p ∈ M. Then the process of obtaining the KO-cycle (M̂, H(F) ⊗ π∗(E), φ ◦ π) from (M, E, φ) is called real vector bundle modification.

2.3. Topological KO-homology

We are now ready to define the topological KO-homology groups of the space X.

Definition 2.5. The topological KO-homology group of X is the abelian group obtained from quotienting ΓO(X) by the equivalence relation ∼ generated by the relations of
(i) spin bordism;
(ii) direct sum: if E = E_1 ⊕ E_2, then (M, E, φ) ∼ (M, E_1, φ) ⊔ (M, E_2, φ); and
(iii) real vector bundle modification.
The group operation is induced by disjoint union of KO-cycles. We denote this group by KO^t(X) := ΓO(X)/∼, and the homology class of the KO-cycle (M, E, φ) by [M, E, φ] ∈ KO^t(X).

Since the equivalence relation on ΓO(X) preserves the dimension of M mod 8 in KO-cycles (M, E, φ), one can define the subgroups KO^t_n(X) consisting of classes of KO-cycles (M, E, φ) for which all connected components M_i of M are of dimension dim M_i ≡ n mod 8. Then
\[
KO^t(X) = \bigoplus_{n=0}^{7} KO^t_n(X) \tag{2.5}
\]
has a natural Z_8-grading.

The geometric construction of KO-homology is functorial. If f : X → Y is a continuous map, then the induced homomorphism f_∗ : KO^t(X) → KO^t(Y) of Z_8-graded abelian groups is given on classes of KO-cycles [M, E, φ] ∈ KO^t(X) by f_∗[M, E, φ] := [M, E, f ◦ φ]. One has (id_X)_∗ = id_{KO^t(X)} and (f ◦ g)_∗ = f_∗ ◦ g_∗. Since real vector bundles over M extend to real vector bundles over M × [0, 1], it follows by spin bordism that induced homomorphisms depend only on their homotopy classes.

If pt denotes a one-point topological space, then the collapsing map ζ : X → pt induces an epimorphism
\[
\zeta_* : KO^t(X) \to KO^t(\mathrm{pt}). \tag{2.6}
\]
The reduced topological KO-homology group of X is
\[
\widetilde{KO}{}^t(X) := \ker \zeta_*. \tag{2.7}
\]
Since the map (2.6) is an epimorphism with left inverse induced by the inclusion of a point ι : pt → X, one has $KO^t(X) \cong KO^t(\mathrm{pt}) \oplus \widetilde{KO}{}^t(X)$ for any space X. As in the complex case [54], one has the following basic calculational tools for computing the geometric KO-homology groups.

Proposition 2.6. The abelian group KO^t(X) enjoys the following properties:
(i) KO^t(X) is generated by classes of KO-cycles [M, E, φ] where M is connected.
(ii) If {X_j}_{j∈J} is the set of connected components of X then
\[
KO^t(X) = \bigoplus_{j \in J} KO^t(X_j).
\]
(iii) The homology class of a KO-cycle (M, E, φ) on X depends only on the KO-theory class of E in KO^0(M); and
(iv) The homology class of a KO-cycle (M, E, φ) on X depends only on the homotopy class of φ in [M, X].
2.4. Homological properties

We have not yet established that the geometric definition of KO-homology above is actually a (generalized) homology theory. Defining KO^t_{i+8k}(X) := KO^t_i(X) for all k ∈ Z, 0 ≤ i ≤ 7, we will now show that KO^t(X) is an 8-periodic unreduced homology theory.

We know that KO-theory is an 8-periodic cohomology theory which can be defined in terms of its spectrum KO∞. For n ≥ 1, let H_R be a real Z_2-graded separable Hilbert space which is a ∗-module for the real Clifford algebra C_{n−1} = C(R^{n−1}) as in Sec. 1.5. Let Fred_n be the space of all Fredholm operators on H_R which are odd, C_{n−1}-linear and self-adjoint. Then Fred_n is the classifying space for KO^n [4]. For n ≤ 0, we choose k ∈ N such that 8k + n ≥ 1 and define Fred_n := Fred_{8k+n}. One then has KO∞ = {Fred_n}_{n∈Z}, and so we can define [58] a homology theory related to KO by the inductive limit
\[
KO^s_i(X, Y) := \varinjlim_{n} \, \pi_{n+i}\big( (X/Y) \wedge \mathrm{Fred}_n \big) \tag{2.8}
\]
for all i ∈ Z, where Y is a closed subspace of the topological space X and ∧ denotes the smash product. Bott periodicity then implies that this is an 8-periodic homology theory.

One can give a definition of relative KO-homology groups KO^t_i(X, Y) in such a way that there is a map μ^s : KO^t_i(X, Y) → KO_i(X, Y) which defines a natural equivalence between functors on the category of topological spaces having the homotopy type of finite CW-pairs (X, Y), where KO_i(X, Y) is Jakob's realization of KO-homology [38]. The building blocks of KO_i(X) are triples (M, x, φ) as in Definition 2.1 but now x ∈ KO^n(M) is a KO-theory class over M such that dim M + n ≡ i mod 8. The equivalence relations are as in Definition 2.5 with real
vector bundle modification modified from Definition 2.4 as follows. The nowhere zero section Σ^F : M → F ⊕ 1^R_M defined by Σ^F(p) = 0_p ⊕ 1 for p ∈ M induces an embedding
\[
\Sigma^F : M \to \widehat{M}. \tag{2.9}
\]
Then real vector bundle modification is replaced by the relation
\[
(M, x, \phi) \sim (\widehat{M}, \Sigma^F_!(x), \phi \circ \pi),
\]
where the functorial homomorphism Σ^F_! : KO^n(M) → KO^n(M̂) is the Gysin map induced by the embedding (2.9). On stable isomorphism classes of real vector bundles [E] ∈ KO^0(M) one has
\[
\Sigma^F_! [E] = [H(F) \otimes \pi^*(E)]. \tag{2.10}
\]
In the present category, KO_i(X, Y) is naturally equivalent to KO^s_i(X, Y). It is important to notice that this is quite a nontrivial result, the validity of which has been established in [38].

One can give a spin bordism description of KO^t(X, Y) as follows. We consider the set ΓO(X, Y) of isomorphism classes of triples (M, E, φ) where
(i) M is a compact spin manifold with (possibly empty) boundary;
(ii) E is a real vector bundle over M; and
(iii) φ : M → X is a continuous map with φ(∂M) ⊂ Y.
The set ΓO(X, Y) is then quotiented by relations of relative spin bordism, which is modified from Definition 2.3 by the requirement that M_1 ⊔ (−M_2) ⊂ ∂W is a regularly embedded submanifold of codimension 0 with φ(∂W \ (M_1 ⊔ (−M_2))) ⊂ Y, direct sum, and real vector bundle modification, which is applicable in this case since S(F ⊕ 1^R_M) is a compact spin manifold with boundary S(F ⊕ 1^R_M)|_{∂M}. The collection of equivalence classes is a Z_8-graded abelian group with operation induced by disjoint union of relative KO-cycles. One has KO^t_i(X, ∅) = KO^t_i(X).

Theorem 2.7. The map μ^s : KO^t_i(X, Y) → KO_i(X, Y) defined on classes of KO-cycles by μ^s[M, E, φ]_t = [M, [E], φ]_s is an isomorphism of abelian groups which is natural with respect to continuous maps of pairs.
October 23, 2009 12:9 WSPC/148-RMP
J070-00383
KO-Homology and Type I String Theory
1109
Proof. Taking into account the equivalence relations on $\Gamma O(X,Y)$ used to define both KO-homology groups, the map $\mu^s$ is well-defined and a group homomorphism. Let $[M,x,\phi]_s \in KO^s_n(X,Y)$ with $m := \dim M$. We may assume that $M$ is connected and $x$ is non-zero in $KO^i(M)$. Then $m - i \equiv n \bmod 8$. Consider the trivial spin vector bundle $F = M \times \mathbb{R}^{n+7m+1}$ over $M$. In this case the sphere bundle (2.1) is $\widehat{M} = M \times S^{n+7m+1}$ and the associated Gysin homomorphism in KO-theory is a map
$$\Sigma^F_! : KO^i(M) \to KO^{i+7m+n}(\widehat{M}).$$
Since $i + 7m + n \equiv (i + 7m + m - i) \bmod 8 \equiv 0 \bmod 8$, one has $KO^{i+7m+n}(\widehat{M}) \cong KO^0(\widehat{M})$. It follows that there are real vector bundles $E, H \to \widehat{M}$ such that $\Sigma^F_!(x) = [E] - [H]$, and so by real vector bundle modification one has $[M,x,\phi]_s = [\widehat{M},[E],\phi\circ\pi]_s - [\widehat{M},[H],\phi\circ\pi]_s$ in $KO^s_n(X,Y)$. Therefore $\mu^s([\widehat{M},E,\phi\circ\pi]_t - [\widehat{M},H,\phi\circ\pi]_t) = [M,x,\phi]_s$, and we conclude that $\mu^s$ is an epimorphism.

Now suppose that $\mu^s[M_1,E_1,\phi_1]_t = \mu^s[M_2,E_2,\phi_2]_t$ are identified in $KO^s_n(X,Y)$ through real vector bundle modification. Then, for instance, there is a real spin vector bundle $F \to M_1$ such that $M_2 = \widehat{M}_1$ and $[E_2] = \Sigma^F_![E_1]$. This implies that the Gysin homomorphism is a map
$$\Sigma^F_! : KO^0(M_1) \to KO^0(\widehat{M}_1) \cap KO^r(\widehat{M}_1),$$
where $r = \dim F_p$ for $p \in M_1$. Since $KO^0(\widehat{M}_1) \cap KO^r(\widehat{M}_1) \neq \{0\}$ in this case, we have $r \equiv 0 \bmod 8$, which implies that these two homology classes are also identified in $KO^t_n(X,Y)$ through real vector bundle modification. As this is the only relation in $KO^s_n(X,Y)$ that might identify these classes without identifying them as KO-cycles, we conclude that $\mu^s$ is a monomorphism and therefore an isomorphism.

Remark 2.8. Theorem 2.7 establishes the existence of a natural equivalence between covariant functors $KO^t_\sharp \cong KO^s_\sharp$. Since $KO^s_\sharp$ is a homological realization of the homology theory associated with KO-theory, it follows that the same is true of $KO^t_\sharp$. We have thus constructed an unreduced 8-periodic geometric homology theory dual to KO-theory. It is the periodicity mod 8 of the fiber dimensions of the spin vector bundle $F$ used for real vector bundle modification in $KO^t_\sharp$ that accounts for the isomorphism $KO^t_\sharp \cong KO^s_\sharp$.

Having established that KO-homology is a generalized homology theory, we may exploit standard homological properties throughout (see [58], for example). In particular, there is a long exact homology sequence for any pair $(X,Y)$. Because $KO^t_\sharp$ is an 8-periodic theory, this sequence truncates to a 24-term exact sequence. In the spin bordism description, the connecting homomorphism $\partial : KO^t_n(X,Y) \to KO^t_{n-1}(Y)$ is given by the boundary map
$$\partial[M,E,\phi] := [\partial M, E|_{\partial M}, \phi|_{\partial M}] \tag{2.11}$$
on classes of KO-cycles and extended by linearity. The map $\partial$ is natural and commutes with induced homomorphisms. Other homological properties are direct translations of those of the complex case provided by [54], where a more extensive treatment can be found. For example, one has the usual excision property. If $U \subset Y$ is a subspace whose closure lies in the interior of $Y$, then the inclusion $\varsigma^U : (X \setminus U, Y \setminus U) \hookrightarrow (X,Y)$ induces an isomorphism
$$\varsigma^U_* : KO^t_\sharp(X \setminus U, Y \setminus U) \xrightarrow{\ \approx\ } KO^t_\sharp(X,Y)$$
of $\mathbb{Z}_8$-graded abelian groups.

2.5. Products

There are two important products that can be defined on topological KO-homology groups. The cap product is the $\mathbb{Z}_8$-degree preserving bilinear pairing
$$\frown\ : KO^0(X) \otimes KO^t_\sharp(X) \to KO^t_\sharp(X)$$
given for any real vector bundle $F \to X$ and KO-cycle class $[M,E,\phi] \in KO^t_\sharp(X)$ by
$$[F] \frown [M,E,\phi] := [M, \phi^*F \otimes E, \phi]$$
and extended linearly. It makes $KO^t_\sharp(X)$ into a module over the ring $KO^0(X)$. As in the complex case, this product can be extended to a bilinear form
$$\frown\ : KO^i(X) \otimes KO^t_j(X) \to KO^t_{i+j}(X).$$
The construction utilizes Bott periodicity and the isomorphism $KO^{-n}(X) \cong KO^0(\Sigma^n X)$, where $\Sigma^n X = S^n \wedge X$ is the $n$th iterated reduced suspension of the space $X$. The product $\frown : KO^n(X) \otimes KO^t_i(X) \to KO^t_{i+n}(X)$ is given by the pairing $\frown : KO^0(\Sigma^n X) \otimes KO^t_{i-n}(\Sigma^n X) \to KO^t_{i-n}(\Sigma^n X)$.

If $X$ and $Y$ are spaces, then the exterior product
$$\times : KO^t_i(X) \otimes KO^t_j(Y) \to KO^t_{i+j}(X \times Y)$$
is given for classes of KO-cycles $[M,E,\phi] \in KO^t_i(X)$ and $[N,F,\psi] \in KO^t_j(Y)$ by
$$[M,E,\phi] \times [N,F,\psi] := [M \times N, E \boxtimes F, (\phi,\psi)],$$
where $M \times N$ has the product spin structure uniquely induced by the spin structures on $M$ and $N$, and $E \boxtimes F$ is the real vector bundle over $M \times N$ with fibers $(E \boxtimes F)_{(p,q)} = E_p \otimes F_q$ for $(p,q) \in M \times N$. This product is natural with respect to continuous maps.

Unfortunately, in contrast to the complex case, we do not have a version of the Künneth theorem for KO-homology. Indeed, should such a formula exist, one could use it to show that $KO_\sharp(\mathrm{pt}) \otimes KO_\sharp(\mathrm{pt})$ has to be a tensor product as modules over the ring $KO_\sharp(\mathrm{pt})$. But this does not work correctly, as pointed out by Atiyah in [3]. Moreover, for $A = B = \mathbb{C}$ considered as a real $C^*$-algebra, one has that the map
$$K_\sharp(A) \otimes K_\sharp(B) \to K_\sharp(A \otimes B)$$
is not surjective. The correct framework for a Künneth formula for real K-theory is united K-theory [21, 18], a machinery that involves real K-theory, complex K-theory and self-conjugate K-theory, and whose homological algebra behaves better. We will return to this point in Sec. 6.2.

2.6. The Thom isomorphism

Let $X$ be an $n$-dimensional compact manifold with (possibly empty) boundary, and $B(TX) \to X$ and $S(TX) \to X$ the unit ball and sphere bundles of $X$. An element $\tau \in KO^n(B(TX), S(TX))$ is called a Thom class or an orientation for $X$ if
$$\tau|_{(B(TX)_x, S(TX)_x)} \in KO^n(B(TX)_x, S(TX)_x) \cong KO^0(\mathrm{pt})$$
is a generator for all $x \in X$ [39]. The manifold $X$ is said to be KO-orientable if it has a Thom class. In that case the usual cup product on the topological KO-theory ring yields the Thom isomorphism
$$T_X : KO^i(X) \xrightarrow{\ \approx\ } KO^{i+n}(B(TX), S(TX))$$
given for $i = 0, 1, \dots, 7$ and $\xi \in KO^i(X)$ by
$$T_X(\xi) := \pi^*_{B(TX)}(\xi) \smile \tau,$$
where $\pi_{B(TX)} : B(TX) \to X$ is the bundle projection. This construction also works by replacing the tangent bundle of $X$ with any $O(r)$ vector bundle $V \to X$, defining a Thom isomorphism
$$T_{X,V} : KO^i(X) \xrightarrow{\ \approx\ } KO^{i+r}(B(V), S(V))$$
given by
$$T_{X,V}(\xi) := \pi^*_{B(V)}(\xi) \smile \tau_V, \tag{2.12}$$
where the element $\tau_V \in KO^r(B(V), S(V))$ is called the Thom class of $V$.

Indeed, for a manifold $X$, the KO-orientability condition (existence of a Thom class) described above is equivalent to the existence of a spin structure on the stable normal bundle of the manifold [6, 38]. Any KO-oriented manifold $X$ of dimension $n$ has a uniquely determined fundamental class $[X]_s \in KO^s_n(X, \partial X)$, which is represented by the element $[X, 1^{\mathbb{R}}_X, \mathrm{id}_X]$ in $KO^t_n(X, \partial X)$. One then has the Poincaré duality isomorphism
$$\Phi_X : KO^i(X) \xrightarrow{\ \approx\ } KO^s_{n-i}(X, \partial X)$$
given for $i = 0, 1, \dots, 7$ and $\xi \in KO^i(X)$ by taking the cap product
$$\Phi_X(\xi) := \xi \frown [X]_s. \tag{2.13}$$
In particular, if $X$ is a compact spin manifold of dimension $n$ without boundary, then $X$ is KO-oriented and so in this case we have a Poincaré duality isomorphism [38, 54, 58] giving
$$KO^t_i(X) \cong KO^{n-i}(X). \tag{2.14}$$
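As a quick consistency check of (2.14), one may take $X = \mathrm{pt}$ with $n = 0$, where the duality reduces to $KO^t_i(\mathrm{pt}) \cong KO^{-i}(\mathrm{pt})$. The table below is only a sketch: the values are the standard coefficient groups from real Bott periodicity, quoted here as background rather than taken from the text above.

```latex
% Coefficient groups of KO-theory (standard real Bott periodicity values),
% read through (2.14) with X = pt and n = 0:  KO^t_i(pt) \cong KO^{-i}(pt).
\[
\begin{array}{c|cccccccc}
i \bmod 8 & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ \hline
KO^t_i(\mathrm{pt}) \cong KO^{-i}(\mathrm{pt})
  & \mathbb{Z} & \mathbb{Z}_2 & \mathbb{Z}_2 & 0 & \mathbb{Z} & 0 & 0 & 0
\end{array}
\]
```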
The isomorphism (2.14) may be compared with the universal coefficient theorem for KO-theory [63, 30], which asserts that there is an exact sequence
$$0 \to \mathrm{Ext}(KO^t_{i-1}(X), \mathbb{Z}) \to KO^{i+4}(X) \to \mathrm{Hom}(KO^t_i(X), \mathbb{Z}) \to 0 \tag{2.15}$$
for all $i \in \mathbb{Z}$. The degree shift by 4 arises from the fact that $KO^{-3}(\mathrm{pt}) = 0$ and that there is a cup product pairing $KO^{i-4}(\mathrm{pt}) \otimes KO^{-i}(\mathrm{pt}) \to KO^{-4}(\mathrm{pt}) \cong \mathbb{Z}$.

Under the same conditions as above, one then also has the Thom isomorphism in KO-homology
$$T^*_{X,V} : KO^t_i(X) \xrightarrow{\ \approx\ } KO^t_{i+r}(B(V), S(V)). \tag{2.16}$$
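Returning to the universal coefficient sequence (2.15), the degree shift by 4 can be sanity-checked for $X = \mathrm{pt}$. The following is a minimal sketch, not part of the original argument: it assumes the standard coefficient groups $KO^{-n}(\mathrm{pt})$ and the elementary values of $\mathrm{Ext}(G,\mathbb{Z})$ and $\mathrm{Hom}(G,\mathbb{Z})$ for $G \in \{0, \mathbb{Z}, \mathbb{Z}_2\}$. The names `KO_POINT`, `uct_terms` and `consistent` are hypothetical helpers introduced only for this illustration.

```python
# Standard coefficient groups KO^{-n}(pt) for n mod 8 (assumed background fact).
KO_POINT = ["Z", "Z2", "Z2", "0", "Z", "0", "0", "0"]

def ko_homology(i):
    """KO^t_i(pt), identified with KO^{-i}(pt)."""
    return KO_POINT[i % 8]

def ko_cohomology(j):
    """KO^j(pt), i.e. KO^{-n}(pt) with n = -j mod 8."""
    return KO_POINT[(-j) % 8]

# Ext(G, Z) and Hom(G, Z) for the three groups that occur.
EXT = {"0": "0", "Z": "0", "Z2": "Z2"}
HOM = {"0": "0", "Z": "Z", "Z2": "0"}

def uct_terms(i):
    """Outer and middle groups of the sequence (2.15) for X = pt."""
    left = EXT[ko_homology(i - 1)]
    middle = ko_cohomology(i + 4)
    right = HOM[ko_homology(i)]
    return left, middle, right

def consistent(left, middle, right):
    # Over a point at most one outer term is non-zero, so exactness forces
    # the middle group to coincide with the other outer term.
    if left == "0":
        return middle == right
    if right == "0":
        return middle == left
    return False  # both outer terms non-zero: does not occur for X = pt

for i in range(8):
    print(i, uct_terms(i), consistent(*uct_terms(i)))
```

In every degree the check passes only because of the shift by 4; replacing `i + 4` by `i` in `uct_terms` makes it fail already at `i = 0`.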
3. The Isomorphism

One of the main results of this paper is an explicit realization of the isomorphism between topological and analytic KO-homology. The primary goal of this section is to prove the following result.

Theorem 3.1. There is a natural equivalence
$$\mu^a : KO^t_\sharp \xrightarrow{\ \approx\ } KO^a_\sharp$$
between the topological and analytic KO-homology functors.

As for any (generalized) homology theory, there is a uniqueness theorem for homology theories [58] on the category of finite CW-complexes. More precisely, one has the following result.

Theorem 3.2. Let $h_\sharp$ and $k_\sharp$ be generalized homology theories defined on the category of finite CW-pairs, and let $\phi : h_\sharp \to k_\sharp$ be a natural transformation of homology theories such that $\phi : h_n(\mathrm{pt}) \to k_n(\mathrm{pt})$ is an isomorphism for any $n \in \mathbb{Z}$. Then $\phi$ is a natural equivalence.

Taking into account the uniqueness theorem stated above, the proof of Theorem 3.1 is tantamount to proving that the map $\mu^a : KO^t_n(\mathrm{pt}) \to KO^a_n(\mathrm{pt})$, induced by the natural transformation $\mu^a$, is an isomorphism for $n = 0, 1, \dots, 7$. From the realization (2.8) it follows that
$$KO^t_n(\mathrm{pt}) \cong \varinjlim_k \pi_{n+8k}(\mathrm{Fred}_0) = \pi_n(\mathrm{Fred}_0) \cong \widetilde{KO}{}^0(S^n).$$
The main idea behind our proof is to show that there exist surjective “index” homomorphisms $\mathrm{ind}^t_n$ and $\mathrm{ind}^a_n$ such that the diagram
$$
\begin{array}{ccc}
KO^t_n(\mathrm{pt}) & \xrightarrow{\ \ \mu^a\ \ } & KO^a_n(\mathrm{pt}) \\
{\scriptstyle \mathrm{ind}^t_n}\searrow & & \swarrow{\scriptstyle \mathrm{ind}^a_n} \\
 & KO^{-n}(\mathrm{pt}) &
\end{array}
\tag{3.1}
$$
commutes for every $n$. The KO-theory groups $KO^{-n}(\mathrm{pt})$ appear here because they are the coefficient groups of the $KO^t_n$ and $KO^a_n$ homology theories. This setup is motivated by the fact [10] that the map $\mu^a$ and the commutativity of the diagram (3.1) are intimately related to an index theorem, as we demonstrate explicitly in Sec. 4.3; this is the motivation behind our terminology above. Since the groups $KO^{-n}(\mathrm{pt})$ are equal to either $0$, $\mathbb{Z}$ or $\mathbb{Z}_2$ depending on the particular value of $n$, the commutativity of the diagram (3.1) along with surjectivity of the index maps are sufficient to prove that $\mu^a$ is an isomorphism.

For clarity and later use, we will divide the proof into four parts. We will first give the constructions of the three maps in (3.1) each in turn, and then present the proof of commutativity of the diagram. In the following section we proceed to construct the map $\mu^a$ referred to in Theorem 3.1. This map is the natural counterpart for the real case of the complex version built in [10]. For an equivalent definition, see [12, 23].

3.1. The map $\mu^a$

Let $(M,E,\phi)$ be a topological KO-cycle on $X$ with $\dim M = n$. We construct a corresponding class in $KO^a_n(X)$ as follows. Consider the Clifford bundle
$$\slashed{S}(M) := P_{\mathrm{Spin}}(M) \times_{\lambda_n} C\ell_n,$$
where $C\ell_n = C\ell(\mathbb{R}^n)$, $\lambda_n : \mathrm{Spin}(n) \to \mathrm{End}(C\ell_n)$ is given by left multiplication with $\mathrm{Spin}(n) \subset C\ell^0_n \subset C\ell_n$, and $P_{\mathrm{Spin}}(M)$ is the principal $\mathrm{Spin}(n)$-bundle over $M$ associated to the spin structure on the tangent bundle $TM$. Since $C\ell_n = C\ell^0_n \oplus C\ell^1_n$ is a $\mathbb{Z}_2$-graded algebra, it follows that
$$\slashed{S}(M) = \slashed{S}^0(M) \oplus \slashed{S}^1(M) \tag{3.2}$$
is a $\mathbb{Z}_2$-graded real vector bundle over $M$ with respect to the $C\ell(TM)$-action. The Clifford algebra $C\ell_n$ acts by right multiplication on the fibers whilst preserving the bundle grading (3.2).

Choose a $C^\infty$ Riemannian metric $g^M$ on $TM$. Let $\slashed{D}^M : C^\infty(M, \slashed{S}(M)) \to C^\infty(M, \slashed{S}(M))$ be the canonical Atiyah–Singer operator [4] defined locally by
$$\slashed{D}^M = \sum_{i=1}^{n} e_i \cdot \nabla^M_{e_i}, \tag{3.3}$$
where $\{e_i\}_{1 \le i \le n}$ is a local basis of sections of the tangent bundle $TM$, $\nabla^M_{e_i}$ are the corresponding components of the spin connection $\nabla^M$, and the dot denotes Clifford multiplication. The operator $\slashed{D}^M$ is a $C\ell_n$-operator [41], i.e. one has
$$\slashed{D}^M(\Psi \cdot \varphi) = \slashed{D}^M(\Psi) \cdot \varphi$$
for all $\Psi \in C^\infty(M, \slashed{S}(M))$ and all $\varphi \in C\ell_n$, where $\cdot\,\varphi$ denotes right multiplication by $\varphi$. Since $\slashed{D}^M$ commutes with the $C\ell_n$-action, the vector space $\ker \slashed{D}^M$ is a $C\ell_n$-module.

We now construct a triple $(\mathcal{H}^M_E, \rho^M_E, T^M_E)$ comprising the following data:
(i) The separable real Hilbert space $\mathcal{H}^M_E := L^2_{\mathbb{R}}(M, \slashed{S}(M) \otimes E; dg^M)$;
(ii) The $*$-homomorphism $\rho^M_E : C(M,\mathbb{R}) \to \mathcal{L}(\mathcal{H}^M_E)$ defined by
$$(\rho^M_E(f)(\Psi))(p) = f(p)\,\Psi(p)$$
for $f \in C(M,\mathbb{R})$, $\Psi \in C^\infty(M, \slashed{S}(M) \otimes E)$ and $p \in M$; and
(iii) The bounded Fredholm operator
$$T^M_E := \frac{\slashed{D}^M_E}{\sqrt{1 + (\slashed{D}^M_E)^2}} \tag{3.4}$$
acting on $\mathcal{H}^M_E$, where $\slashed{D}^M_E$ is the Atiyah–Singer operator (3.3) twisted by the real vector bundle $E \to M$.

This triple satisfies the following properties:
(i) $\mathcal{H}^M_E$ is $\mathbb{Z}_2$-graded according to the splitting (3.2) of the Clifford bundle;
(ii) $\rho^M_E(f)$ is an even operator on $\mathcal{H}^M_E$ for all $f \in C(M,\mathbb{R})$;
(iii) Since $M$ is compact, $T^M_E$ is an odd Fredholm operator which obeys the compactness conditions (1.2) with $\rho^M_E(f)$; and
(iv) There are odd operators $\varepsilon_i$, $i = 1, \dots, n$, commuting with both $\rho^M_E(f)$ and $T^M_E$, which generate a $C\ell_n$-action on $\mathcal{H}^M_E$ as in (1.3), and which are given explicitly as right multiplication by elements $e_i$ of a basis of the vector space $\mathbb{R}^n$.

It follows that $(\mathcal{H}^M_E, \rho^M_E, T^M_E)$ is a well-defined $n$-graded Fredholm module over the real $C^*$-algebra $C(M,\mathbb{R})$. We now define the map $\mu^a$ in (3.1) by
$$\mu^a(M,E,\phi) := \phi_*(\mathcal{H}^M_E, \rho^M_E, T^M_E) = (\mathcal{H}^M_E, \rho^M_E \circ \phi^*, T^M_E), \tag{3.5}$$
where $\phi^* : C(X,\mathbb{R}) \to C(M,\mathbb{R})$ is the real $C^*$-algebra homomorphism induced by the map $\phi$. At this stage the map $\mu^a$ is only defined on KO-cycles. We then have the following result.

Proposition 3.3. The map $\mu^a : KO^t_n(X) \to KO^a_n(X)$ induced by (3.5) is a well-defined homomorphism of abelian groups for any $n \in \mathbb{N}$.

Proof. Let $(M,E,\phi), (N,F,\psi) \in \Gamma O(X)$, and consider their disjoint union. The Clifford bundle of the disjoint union manifold splits as $\slashed{S}(M \sqcup N) = \slashed{S}(M) \sqcup \slashed{S}(N)$,
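and therefore the twisted Clifford bundle has a corresponding splitting $\slashed{S}(M \sqcup N) \otimes (E \sqcup F) = (\slashed{S}(M) \otimes E) \sqcup (\slashed{S}(N) \otimes F)$, giving rise to a splitting of the space of sections $C^\infty(M \sqcup N, \slashed{S}(M \sqcup N) \otimes (E \sqcup F)) = C^\infty(M, \slashed{S}(M) \otimes E) \oplus C^\infty(N, \slashed{S}(N) \otimes F)$, and therefore of the corresponding spaces of $L^2$-sections:
$$\mathcal{H}^{M \sqcup N}_{E \sqcup F} = \mathcal{H}^M_E \oplus \mathcal{H}^N_F. \tag{3.6}$$
The algebras of functions also split as $C(M \sqcup N, \mathbb{R}) = C(M,\mathbb{R}) \oplus C(N,\mathbb{R})$, and this together with the structure of the Hilbert spaces of sections (3.6) implies that
$$\rho^{M \sqcup N}_{E \sqcup F} = \begin{pmatrix} \rho^M_E & 0 \\ 0 & \rho^N_F \end{pmatrix}, \qquad T^{M \sqcup N}_{E \sqcup F} = \begin{pmatrix} T^M_E & 0 \\ 0 & T^N_F \end{pmatrix},$$
which immediately implies that
$$(\mathcal{H}^{M \sqcup N}_{E \sqcup F}, \rho^{M \sqcup N}_{E \sqcup F} \circ (\phi \sqcup \psi)^*, T^{M \sqcup N}_{E \sqcup F}) = (\mathcal{H}^M_E, \rho^M_E \circ \phi^*, T^M_E) + (\mathcal{H}^N_F, \rho^N_F \circ \psi^*, T^N_F), \tag{3.7}$$
showing that the map $\mu^a$ preserves disjoint union of cycles, and so it is a homomorphism of (unital) abelian monoids.

Let us now consider the direct sum relation. Since $\mu^a$ is a monoid morphism, we have
$$\mu^a((M,E_1,\phi) \sqcup (M,E_2,\phi)) = \mu^a(M,E_1,\phi) + \mu^a(M,E_2,\phi) = \phi_*\left( \mathcal{H}^M_{E_1} \oplus \mathcal{H}^M_{E_2},\ \rho^M_{E_1} \oplus \rho^M_{E_2},\ \begin{pmatrix} T^M_{E_1} & 0 \\ 0 & T^M_{E_2} \end{pmatrix} \right).$$
As above, the space of sections splits as
$$C^\infty(M, \slashed{S}(M) \otimes (E_1 \oplus E_2)) = C^\infty(M, \slashed{S}(M) \otimes E_1) \oplus C^\infty(M, \slashed{S}(M) \otimes E_2),$$
giving rise to a splitting of the Hilbert spaces $\mathcal{H}^M_{E_1 \oplus E_2} = \mathcal{H}^M_{E_1} \oplus \mathcal{H}^M_{E_2}$. This leads to the conclusion that
$$[\mathcal{H}^M_{E_1 \oplus E_2}, \rho^M_{E_1 \oplus E_2}, T^M_{E_1 \oplus E_2}] = \left[ \mathcal{H}^M_{E_1} \oplus \mathcal{H}^M_{E_2},\ \rho^M_{E_1} \oplus \rho^M_{E_2},\ \begin{pmatrix} T^M_{E_1} & 0 \\ 0 & T^M_{E_2} \end{pmatrix} \right],$$
which therefore implies that $\mu^a(M, E_1 \oplus E_2, \phi) = \mu^a((M,E_1,\phi) \sqcup (M,E_2,\phi))$, showing that our map preserves the direct sum relation.

Let us now suppose that the cycle $(M,E,\phi)$ is a boundary, i.e. that there exists $(W,F,\psi)$ such that $(\partial W, F|_{\partial W}, \psi|_{\partial W}) = (M,E,\phi)$. The inclusion of the boundary $\iota : \partial W \hookrightarrow W$ induces a commutative diagram
$$
\begin{array}{ccc}
KO^a_n(\partial W) & \xrightarrow{\ \ \iota_*\ \ } & KO^a_n(W) \\
{\scriptstyle (\psi|_{\partial W})_*}\searrow & & \swarrow{\scriptstyle \psi_*} \\
 & KO^a_n(X) &
\end{array}
$$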
so, denoting by $[\slashed{D}^M_E]$ the element $\mu^a_M(M,E,\mathrm{id}_M)$, we have
$$(\psi|_{\partial W})_*([\slashed{D}^{\partial W}_{F|_{\partial W}}]) = \psi_*(\iota_*([\slashed{D}^{\partial W}_{F|_{\partial W}}])) = \psi_*(\iota_*([\slashed{D}^{\partial W}] \cap [F|_{\partial W}])). \tag{3.8}$$
A result of Higson and Roe [34, Proposition 11.2.15] states that, in analytic K-homology, $[\slashed{D}^{\partial W}] = \partial[\slashed{D}^{W - \partial W}]$, and considering the long exact homology sequence of the pair $(W, \partial W)$,
$$\cdots \longrightarrow KO^a_n(\partial W) \xrightarrow{\ \iota_*\ } KO^a_n(W) \longrightarrow KO^a_n(W - \partial W) \xrightarrow{\ \partial\ } KO^a_{n-1}(\partial W) \longrightarrow \cdots,$$
it follows that $\iota_*([\slashed{D}^{\partial W}]) = \iota_* \circ \partial([\slashed{D}^{W - \partial W}]) = 0$, by exactness of the sequence. The above discussion therefore implies that $\mu^a(\partial W, F|_{\partial W}, \psi|_{\partial W}) = 0$, and so the map preserves the bordism relation.

The only relation remaining now is vector bundle modification. Let $(M,E,\phi) \in \Gamma O(X)$ and let $F \to M$ be a spin vector bundle with $\mathrm{rk}(F) = 8k$. Assume $n = \dim M$. We want to show that the equality
$$\mu^a(M,E,\phi) = \mu^a(\widehat{M}, H(F) \otimes \pi^*(E), \phi \circ \pi) \tag{3.9}$$
holds. Some elementary calculations show that
$$
\begin{aligned}
\mu^a(\widehat{M}, H(F) \otimes \pi^*(E), \phi \circ \pi) &= \phi_*(\pi_*([\slashed{D}^{\widehat{M}}_{H(F) \otimes \pi^*[E]}])) \\
&= \phi_*(\pi_*([\slashed{D}^{\widehat{M}}] \cap [H(F) \otimes \pi^* E])) \\
&= \phi_*(\pi_*([\slashed{D}^{\widehat{M}}] \cap ([H(F)] \cup [\pi^* E]))) \\
&= \phi_*(\pi_*(([\slashed{D}^{\widehat{M}}] \cap H(F)) \cap \pi^*[E])) \\
&= \phi_*(\pi_*([\slashed{D}^{\widehat{M}}] \cap H(F)) \cap [E]).
\end{aligned}
$$
So, if we show that
$$[\slashed{D}^M] = \pi_*([\slashed{D}^{\widehat{M}}] \cap H(F)), \tag{3.10}$$
equality (3.9) will follow immediately from it. By the remarks in Sec. 2.6, we know that $[H(F)]$ is the Thom class of $F$, and taking a closer look at the right-hand side of (3.10), we see that it is just the image of $[\slashed{D}^{\widehat{M}}]$, the fundamental class of $\widehat{M}$ in analytic K-homology, through the homology Thom isomorphism of the spherical fibration $\pi : \widehat{M} \to M$. So (3.10) is equivalent to the fact that the Thom isomorphism
$$T_{M,F} : KO^a_{q+8k}(\widehat{M}) \to KO^a_q(M) \tag{3.11}$$
maps the fundamental class of $\widehat{M}$ to the fundamental class of $M$. This follows from the way one constructs the spin structure of the sphere bundle $\widehat{M}$ from the ones in $M$ and $F$. It is easy to show that $\mu^a$ is natural with respect to continuous maps of spaces.
The last step required is to show that $\mu^a$ is a natural transformation of homology theories, namely that it commutes with the boundary operators of the homology theories in question. To achieve this, one needs the following nontrivial result describing the boundary map in analytic KO-homology [34, 12].

Theorem 3.4. Let $M - \partial M$ be the interior of a spin manifold $M$ of dimension $n$ with boundary $\partial M$, and let $E$ be a real vector bundle on $M$. Equip the boundary $\partial M$ with the spin structure induced by that on $M$. Then
$$\partial[\slashed{D}^{M - \partial M}_{E|_{M - \partial M}}] = [\slashed{D}^{\partial M}_{E|_{\partial M}}],$$
where $\partial : KO^a_n(M - \partial M) \to KO^a_{n-1}(\partial M)$ is the boundary homomorphism.

Finally, one can prove that the class $[\slashed{D}^M] := [\slashed{D}^M_{1}]$ represents the fundamental class of $M$ in $KO^a_n(M)$, and that [23]
$$[\slashed{D}^M_E] = [E] \cap [\slashed{D}^M].$$
Combining these results, one can conclude that the map $\mu^a$ commutes with the appropriate boundary maps, therefore showing that it is a natural transformation of homology theories.

3.2. The map $\mathrm{ind}^a_n$

Let $(\mathcal{H}_{\mathbb{R}}, \rho, T)$ be an $n$-graded Fredholm module over the real $C^*$-algebra $C(X,\mathbb{R})$. Since the Fredholm operator $T$ commutes with $\varepsilon_i$ for $i = 1, \dots, n$, the kernel $\ker T \subset \mathcal{H}_{\mathbb{R}}$ is a real $C\ell_n$-module with $\mathbb{Z}_2$-grading induced by the grading of $\mathcal{H}_{\mathbb{R}}$. Thus we can define
$$\mathrm{ind}^a_n(T) := [\ker T] \in \widehat{\mathfrak{M}}_n / \imath^* \widehat{\mathfrak{M}}_{n+1} \cong KO^{-n}(\mathrm{pt}), \tag{3.12}$$
where $\widehat{\mathfrak{M}}_n$ is the Grothendieck group of real graded $C\ell_n$ representations and $\imath^*$ is induced by the natural inclusion $\imath : C\ell_n \hookrightarrow C\ell_{n+1}$. We will call (3.12) the analytic or Clifford index of the Fredholm operator $T$. An important property of this definition is the following result [41].

Theorem 3.5. The analytic index $\mathrm{ind}^a_n : \mathrm{Fred}_n \to KO^{-n}(\mathrm{pt})$ is constant on the connected components of $\mathrm{Fred}_n$.

Given two Fredholm modules $(\mathcal{H}_{\mathbb{R}}, \rho, T)$ and $(\mathcal{H}_{\mathbb{R}}, \rho, T')$ over a real $C^*$-algebra $A$, we will say that $T'$ is a compact perturbation of $T$ if $(T - T')\rho(a) \in \mathcal{K}_{\mathbb{R}}$ for all $a \in A$. We then have the following elementary result.

Lemma 3.6. If $T'$ is a compact perturbation of $T$, then the Fredholm modules $(\mathcal{H}_{\mathbb{R}}, \rho, T)$ and $(\mathcal{H}_{\mathbb{R}}, \rho, T')$ are operator homotopic over $A$.
Proof. Consider the path $T_t = (1-t)T + tT'$ for $t \in [0,1]$. Then the map $t \mapsto T_t$ is norm continuous. We will show that for any $t \in [0,1]$, the triple $(\mathcal{H}_{\mathbb{R}}, \rho, T_t)$ is a Fredholm module over $A$, i.e. that the operator $T_t$ satisfies
$$(T_t^2 - 1)\rho(a), \quad (T_t - T_t^*)\rho(a), \quad T_t\rho(a) - \rho(a)T_t \ \in \mathcal{K}_{\mathbb{R}} \tag{3.13}$$
for all $a \in A$. The last two inclusions in (3.13) are easily proven because the path $T_t$ is “linear” in the operators $T$ and $T'$. To establish the first one, for any $t \in [0,1]$ and $a \in A$ we compute
$$(T_t^2 - 1)\rho(a) = \left[ (T^2 - 1) + t^2(T - T')^2 - t(T^2 - 1) - t(T - T')^2 + t(T'^2 - 1) \right]\rho(a). \tag{3.14}$$
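The identity (3.14) can be checked by direct expansion. The short derivation below is a sketch of that computation; it uses only the rewriting $T_t = T - t(T - T')$ and makes no assumption beyond the definitions above (in particular, $T$ and $T'$ are not assumed to commute).

```latex
\begin{align*}
T_t^2 &= \bigl(T - t\,(T-T')\bigr)^2
       = T^2 - t\,\bigl[\,T(T-T') + (T-T')T\,\bigr] + t^2 (T-T')^2 , \\
T(T-T') + (T-T')T &= 2T^2 - TT' - T'T
       = (T-T')^2 + T^2 - T'^2 , \\
T_t^2 - 1 &= (T^2-1) + t^2 (T-T')^2 - t\,(T-T')^2
             - t\,(T^2-1) + t\,(T'^2-1) .
\end{align*}
```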
By using the fact that $(\mathcal{H}_{\mathbb{R}}, \rho, T)$ and $(\mathcal{H}_{\mathbb{R}}, \rho, T')$ are Fredholm modules, that $T'$ is a compact perturbation of $T$, and that $\mathcal{K}_{\mathbb{R}}$ is an ideal in $\mathcal{L}(\mathcal{H}_{\mathbb{R}})$, one easily verifies that the right-hand side of (3.14) is a compact operator. This implies that $(\mathcal{H}_{\mathbb{R}}, \rho, T_t)$ is a well-defined family of Fredholm modules over $A$.

Proposition 3.7. The induced map $\mathrm{ind}^a_n : KO^a_n(X) \to KO^{-n}(\mathrm{pt})$ given on classes of $n$-graded Fredholm modules by $\mathrm{ind}^a_n[\mathcal{H}_{\mathbb{R}}, \rho, T] = [\ker T]$ is a well-defined surjective homomorphism for any $n \in \mathbb{N}$.

Proof. We first show that to the direct sum of two Fredholm modules $(\mathcal{H}_{\mathbb{R}}, \rho, T)$ and $(\mathcal{H}'_{\mathbb{R}}, \rho', T')$ over $A = C(X,\mathbb{R})$, the map $\mathrm{ind}^a_n$ associates the class $[\ker T] + [\ker T'] \in \widehat{\mathfrak{M}}_n / \imath^* \widehat{\mathfrak{M}}_{n+1} \cong KO^{-n}(\mathrm{pt})$. The kernel $\ker(T \oplus T') = \ker(T) \oplus \ker(T')$ is a real graded $C\ell_n$-module. By the definition of the group $\widehat{\mathfrak{M}}_n$ and of its quotient by $\imath^* \widehat{\mathfrak{M}}_{n+1}$, one thus has $\mathrm{ind}^a_n(T \oplus T') = [\ker T] + [\ker T']$ and so the map $\mathrm{ind}^a_n$ respects the algebraic structure on $\Gamma O_n(A)$.

Consider now two Fredholm modules $(\mathcal{H}_{\mathbb{R}}, \rho, T)$ and $(\mathcal{H}'_{\mathbb{R}}, \rho', T')$ which are orthogonally equivalent. Then there exists an even isometry $U : \mathcal{H}_{\mathbb{R}} \to \mathcal{H}'_{\mathbb{R}}$ such that
$$T' = U T U^*, \qquad \varepsilon'_i = U \varepsilon_i U^*.$$
This implies that $\ker T' = U(\ker T)$, and that the graded $C\ell_n$ representations given respectively by $\varepsilon_i$ and $\varepsilon'_i$ are equivalent. In particular, they represent the same class in $\widehat{\mathfrak{M}}_n / \imath^* \widehat{\mathfrak{M}}_{n+1}$.

Finally, consider two homotopic $n$-graded Fredholm modules $(\mathcal{H}_{\mathbb{R}}, \rho, T)$ and $(\mathcal{H}_{\mathbb{R}}, \rho, T')$ over $A$. In general, $T$ and $T'$ are not elements of $\mathrm{Fred}_n$ because they need not be self-adjoint. However, one can always perform a compact perturbation to obtain an equivalent Fredholm module whose operator is self-adjoint by
simply replacing $T$ with $\tilde{T} := \frac{1}{2}(T + T^*)$. By Lemma 3.6, compact perturbation implies operator homotopy and so there is no loss of generality in considering only homotopy of “self-adjoint” Fredholm modules. Then the function $t \mapsto T_t$ gives a homotopy $\tilde{T}_t = \frac{1}{2}(T_t + T_t^*)$ in $\mathrm{Fred}_n$ connecting $T$ and $T'$. The proposition now follows from Theorem 3.5.

3.3. The map $\mathrm{ind}^t_n$

Given a KO-cycle $(M,E,\phi)$ on $X$ with $M$ an $n$-dimensional compact spin manifold, we can assign to it the Atiyah–Milnor–Singer (AMS) invariant [4] defined by
$$\widehat{\mathcal{A}}_E(M) = \beta \circ \iota^* \circ \varsigma^*(\tau_\nu(E)) \in KO^{-n}(\mathrm{pt}) \tag{3.15}$$
where:
(i) $\nu$ is the normal bundle $N(S^{n+8k}/M)$, with projection $\varrho : \nu \to M$, identified with a tubular neighborhood of an embedding
$$f : M \hookrightarrow S^{n+8k} \tag{3.16}$$
for some $k \in \mathbb{N}$ sufficiently large;
(ii) $\tau_\nu(E) = \tau_\nu \smile [\varrho^* E] \in KO^0(\nu, \nu \setminus M)$, where
$$\tau_\nu := [\varrho^* \slashed{S}^+(\nu), \varrho^* \slashed{S}^-(\nu); \sigma] \in KO^0(\nu, \nu \setminus M)$$
is the Thom class of $\nu$, with $\sigma : \varrho^* \slashed{S}^+(\nu) \to \varrho^* \slashed{S}^-(\nu)$ given by Clifford multiplication;
(iii) $\varsigma^* : KO^0(\nu, \nu \setminus M) \to KO^0(S^{n+8k}, S^{n+8k} \setminus M)$ is given by the excision theorem;
(iv) $\iota^* : KO^0(S^{n+8k}, S^{n+8k} \setminus M) \to \widetilde{KO}{}^0(S^{n+8k})$ is given by the inclusion $\iota : (S^{n+8k}, \mathrm{pt}) \hookrightarrow (S^{n+8k}, S^{n+8k} \setminus M)$; and
(v) $\beta : \widetilde{KO}{}^0(S^{n+8k}) \to \widetilde{KO}{}^0(S^n) := KO^{-n}(\mathrm{pt})$ is given by Bott periodicity.

This definition does not depend on the embedding (3.16) nor on the integer $k$. We define
$$\mathrm{ind}^t_n(M,E,\phi) := \widehat{\mathcal{A}}_E(M). \tag{3.17}$$
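The five steps (i)–(v) entering the AMS invariant can be read as a single composite of maps. The display below is only a schematic summary of the definition (3.15); the symbol $\varrho$ for the projection $\nu \to M$ is a label chosen here for illustration, and $\smile$ denotes the cup product with the Thom class.

```latex
% Schematic composite behind the AMS invariant (3.15).
% \varrho : \nu \to M is the projection of the normal bundle (label assumed here).
\[
KO^0(M)
 \xrightarrow{\ \tau_\nu \,\smile\, [\varrho^*(\,\cdot\,)]\ }
 KO^0(\nu,\nu\setminus M)
 \xrightarrow{\ \varsigma^*\ }
 KO^0(S^{n+8k},S^{n+8k}\setminus M)
 \xrightarrow{\ \iota^*\ }
 \widetilde{KO}{}^0(S^{n+8k})
 \xrightarrow{\ \beta\ }
 \widetilde{KO}{}^0(S^n) = KO^{-n}(\mathrm{pt})
\]
```

Applied to the class $[E]$, this composite returns the invariant assigned to the pair $(M,E)$; the map $\phi$ of the KO-cycle plays no role in it.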
Proposition 3.8. The map $\mathrm{ind}^t_n : KO^t_n(X) \to KO^{-n}(\mathrm{pt})$ induced by (3.17) is a well-defined surjective homomorphism for any $n \in \mathbb{N}$.

Proof. We first prove that the map $\mathrm{ind}^t_n$ respects the algebraic structure on the abelian group $KO^t_n(X)$. Given two $n$-dimensional compact spin manifolds $M_1$ and $M_2$, let $M = M_1 \sqcup M_2$. Embed $M$ in the sphere $S^{n+8k}$ for some $k$ sufficiently large as in (3.16). Then the normal bundle $\nu$ of this embedding is given by $N(S^{n+8k}/M_1) \sqcup N(S^{n+8k}/M_2)$. Identify $\nu$ with a tubular neighborhood of the embedding given by $\nu_1 \sqcup \nu_2$, with projection $\varrho = \varrho_1 \sqcup \varrho_2 : \nu_1 \sqcup \nu_2 \to M_1 \sqcup M_2$. The Thom class of $\nu$
is given by
$$
\begin{aligned}
\tau_\nu &:= [\varrho^* \slashed{S}^+(\nu), \varrho^* \slashed{S}^-(\nu); \sigma] \\
&= [\varrho_1^* \slashed{S}^+(\nu_1) \sqcup \varrho_2^* \slashed{S}^+(\nu_2),\ \varrho_1^* \slashed{S}^-(\nu_1) \sqcup \varrho_2^* \slashed{S}^-(\nu_2);\ \sigma_1 \sqcup \sigma_2] \\
&= \tau_{\nu_1} + \tau_{\nu_2} \in KO^0(\nu, \nu \setminus M) \cong KO^0(\nu_1, \nu_1 \setminus M_1) \oplus KO^0(\nu_2, \nu_2 \setminus M_2).
\end{aligned}
$$
Let $E_1$ and $E_2$ be real vector bundles over $M_1$ and $M_2$, respectively, and let $E = E_1 \sqcup E_2$. Then in $KO^0(\nu, \nu \setminus M)$ one has
$$\tau_\nu(E) = \tau_\nu \smile [\varrho^* E] = \tau_\nu \smile [\varrho_1^* E_1 \sqcup \varrho_2^* E_2] = \tau_{\nu_1} \smile [\varrho_1^* E_1] + \tau_{\nu_2} \smile [\varrho_2^* E_2] = \tau_{\nu_1}(E_1) + \tau_{\nu_2}(E_2).$$
Using the fact that the maps $\varsigma^*$, $\iota^*$ and $\beta$ are group homomorphisms, one then finds
$$
\begin{aligned}
\mathrm{ind}^t_n((M_1,E_1,\phi_1) \sqcup (M_2,E_2,\phi_2)) &:= \widehat{\mathcal{A}}_E(M) \\
&= \widehat{\mathcal{A}}_{E_1}(M_1) + \widehat{\mathcal{A}}_{E_2}(M_2) \\
&= \mathrm{ind}^t_n(M_1,E_1,\phi_1) + \mathrm{ind}^t_n(M_2,E_2,\phi_2) \in KO^{-n}(\mathrm{pt}),
\end{aligned}
$$
showing that $\mathrm{ind}^t_n$ is a homomorphism of abelian groups.

Next we have to check that the map $\mathrm{ind}^t_n$ is independent of the choice of representative of a homology class in $KO^t_n(X)$. The independence of the direct sum relation follows from the discussion above, while spin bordism independence is guaranteed by the property that the AMS invariant $\widehat{\mathcal{A}}_E(M)$ is a spin cobordism invariant [4].

Finally, we have to verify that the map $\mathrm{ind}^t_n$ does not depend on real vector bundle modification. We will give a fairly detailed proof of this result, as we believe it is instructive. Let $M$ be a smooth $n$-dimensional compact spin manifold and let $E \to M$ be a smooth real vector bundle. Given an embedding (3.16), the AMS invariant of the pair $(M,E)$ may be written as [41]
$$\widehat{\mathcal{A}}_E(M) = \beta \circ f_![E],$$
where $f_! : KO^0(M) \to KO^0(S^{n+8k})$ is the Gysin homomorphism of the embedding $f$. Let $F$ be a real spin vector bundle over $M$ with fibers of real dimension $8l$ for some $l \in \mathbb{N}$. Consider the corresponding sphere bundle (2.1) with projection (2.2). As discussed in Sec. 2.4 (see (2.9) and (2.10)), real vector bundle modification of a KO-cycle $(M,E,\phi)$ on $X$ induced by $F$ produces the KO-cycle $(\widehat{M}, \widehat{E}, \phi \circ \pi)$, where $\widehat{E} = H(F) \otimes \pi^*(E)$ is the real vector bundle over $\widehat{M}$ such that
$$[\widehat{E}] = \Sigma^F_![E]$$
with $[E] \in KO^0(M)$ and $[\widehat{E}] \in KO^0(\widehat{M})$. We may compute the AMS invariant for the pair $(\widehat{M}, \widehat{E})$ by choosing an embedding
$$\widehat{f} : \widehat{M} \hookrightarrow S^{n+8k+8l}$$
so that
$$\widehat{\mathcal{A}}_{\widehat{E}}(\widehat{M}) = \beta \circ \widehat{f}_!([\widehat{E}]) = \beta \circ \widehat{f}_! \circ \Sigma^F_![E] = \beta \circ (\widehat{f} \circ \Sigma^F)_![E],$$
where in the last equality we have used functoriality of the Gysin map. Notice that
$$\widehat{f} \circ \Sigma^F : M \hookrightarrow S^{n+8k+8l} =: S^{n+8m}$$
is an embedding of $M$ into a “large enough” sphere. Since $\widehat{\mathcal{A}}_E(M)$ is independent of the embedding and the integer $m$, we have
$$\widehat{\mathcal{A}}_{\widehat{E}}(\widehat{M}) = \widehat{\mathcal{A}}_E(M)$$
as required.

3.4. The isomorphism theorem

We can now assemble the constructions of Secs. 3.1–3.3 above to finally establish our main result. Notice first of all that since $\ker \slashed{D}^M_E \cong \ker T^M_E$, one has
$$\mathrm{ind}^a_n \circ \mu^a(M,E,\phi) = \mathrm{ind}^a_n(\slashed{D}^M_E) \tag{3.18}$$
for any KO-cycle $(M,E,\phi)$ on $X$ with $\dim M = n$. At this point we can use an important result from spin geometry called the $C\ell_n$-index theorem [41].

Theorem 3.9. Let $M$ be a compact spin manifold of dimension $n$ and let $E$ be a real vector bundle over $M$. Let
$$\slashed{D}^M_E : C^\infty(M, \slashed{S}(M) \otimes E) \to C^\infty(M, \slashed{S}(M) \otimes E)$$
be the $C\ell_n$-linear Atiyah–Singer operator with coefficients in $E$. Then
$$\mathrm{ind}^a_n(\slashed{D}^M_E) = \widehat{\mathcal{A}}_E(M).$$

The proof of Theorem 3.1 is now completed once we establish the following result.

Proposition 3.10. The map $\mu^a : KO^t_n(\mathrm{pt}) \to KO^a_n(\mathrm{pt})$ is an isomorphism for any $n \in \mathbb{N}$.

Proof. As noticed at the beginning of this section, it suffices to establish the commutativity of the diagram (3.1), i.e. that
$$\mathrm{ind}^t_n = \mathrm{ind}^a_n \circ \mu^a.$$
Let $[M,E,\phi]$ be the class of a KO-cycle over $\mathrm{pt}$ with $\dim M = n$. Using Theorem 3.9 and (3.18) we have
$$\mathrm{ind}^t_n[M,E,\phi] := \widehat{\mathcal{A}}_E(M) = \mathrm{ind}^a_n(\slashed{D}^M_E) = \mathrm{ind}^a_n \circ \mu^a[M,E,\phi]$$
as required.

4. The Real Chern Character

In this section we will describe the natural complexification map from geometric KO-homology to geometric K-homology and use it to define the Chern character homomorphism in topological KO-homology. We describe various properties of this homomorphism, most notably its intimate connection with the AMS invariant which was the crux of the isomorphism of the previous section.

4.1. The complexification homomorphism

Let $X$ be a compact topological space. Consider the topological, generalized homology groups $K^t_\sharp(X)$ and $KO^t_\sharp(X)$, along with the corresponding K-theory and KO-theory groups. The complexification of a real vector bundle over $X$ is a complex vector bundle over $X$ which is isomorphic to its own conjugate vector bundle. The complexification map is compatible with stable isomorphism of real and complex vector bundles, and thus defines a homomorphism from stable equivalence classes of real vector bundles to stable equivalence classes of complex vector bundles. It thereby induces a natural transformation of cohomology theories
$$(\otimes\, \mathbb{C})^* : KO^\sharp(X) \to K^\sharp(X)$$
given by $[E] - [F] \mapsto [E_{\mathbb{C}}] - [F_{\mathbb{C}}]$, where $E_{\mathbb{C}} := E \otimes \mathbb{C}$ is the complexification of the real vector bundle $E \to X$. We can also define a complexification morphism relating the homology theories
$$(\otimes\, \mathbb{C})_* : KO^t_\sharp(X) \to K^t_\sharp(X) \tag{4.1}$$
by $[M,E,\phi] \otimes \mathbb{C} := [M, E_{\mathbb{C}}, \phi]$ and extended by linearity, where on the right-hand side we regard $M$, endowed with the spin$^c$ structure induced by its spin structure, as part of a K-cycle. One can easily see that
$$[M,E,\phi] \otimes \mathbb{C} = \phi_*([E_{\mathbb{C}}] \frown [M]_K), \tag{4.2}$$
where $[M]_K \in K_\sharp(M)$ denotes the K-theory fundamental class of $M$. Thus in the case when $X$ is KO-oriented (and therefore K-oriented), i.e. $X$ is a compact spin
manifold, the homomorphism $(\otimes\, \mathbb{C})_*$ is just the Poincaré dual of $(\otimes\, \mathbb{C})^*$. This is clearly a natural transformation of homology theories.

A related natural transformation between cohomology theories is the realification morphism
$$(\,\cdot_{\mathbb{R}})^* : K^\sharp(X) \to KO^\sharp(X)$$
induced by assigning to a complex vector bundle over $X$ the underlying real vector bundle over $X$. Because a spin$^c$ manifold is not necessarily spin, we cannot implement this transformation in the homological setting in general. Rather, we must assume that $X$ is a compact spin manifold. In this case the K-homology group $K^t_\sharp(X)$ has generators [54]
$$[X \times S^n, E_i, \mathrm{pr}_1] - [X \times S^n, F_i, \mathrm{pr}_1], \qquad 0 \le n \le 7,$$
where $\mathrm{pr}_1 : X \times S^n \to X$ is the projection onto the first factor. We can then define the morphism
$$(\,\cdot_{\mathbb{R}})_* : K^t_\sharp(X) \to KO^t_\sharp(X)$$
by
$$(\,\cdot_{\mathbb{R}})_*\bigl([X \times S^n, E_i, \mathrm{pr}_1] - [X \times S^n, F_i, \mathrm{pr}_1]\bigr) := [X \times S^n, (E_i)_{\mathbb{R}}, \mathrm{pr}_1] - [X \times S^n, (F_i)_{\mathbb{R}}, \mathrm{pr}_1]$$
and extending by linearity, where $(E_i)_{\mathbb{R}}$ denotes the real vector bundle underlying $E_i$. Since this definition depends on a choice of generators for $K^t_\sharp(X)$, the transformation is not natural. As for the complexification morphism, the morphism $(\,\cdot_{\mathbb{R}})_*$ thus defined is Poincaré dual to $(\,\cdot_{\mathbb{R}})^*$. It follows that the composition $(\,\cdot_{\mathbb{R}})_* \circ (\otimes\, \mathbb{C})_*$ is multiplication by 2.

4.2. Chern character in KO-homology

We can use the natural transformation provided by the complexification homomorphism (4.1) to define a real homological Chern character
$$\mathrm{ch}^{\mathbb{R}}_\bullet : KO^t_\sharp(X) \to H_\sharp(X, \mathbb{Q}) \tag{4.3}$$
by $\mathrm{ch}^{\mathbb{R}}_\bullet(\xi) = \mathrm{ch}_\bullet(\xi \otimes \mathbb{C})$ for $\xi \in KO^t_\sharp(X)$, where on the right-hand side we use the K-homology Chern character $\mathrm{ch}_\bullet : K^t_\sharp(X) \to H_\sharp(X, \mathbb{Q})$. Tensoring with $\mathbb{Q}$ gives a map
$$\mathrm{ch}^{\mathbb{R}}_\bullet \otimes \mathrm{id}_{\mathbb{Q}} = (\mathrm{ch}_\bullet \otimes \mathrm{id}_{\mathbb{Q}}) \circ ((\otimes\, \mathbb{C})_* \otimes \mathrm{id}_{\mathbb{Q}}) : KO^t_\sharp(X) \otimes_{\mathbb{Z}} \mathbb{Q} \to H_\sharp(X, \mathbb{Q})$$
with $\mathrm{ch}_\bullet \otimes \mathrm{id}_{\mathbb{Q}} : K^t_\sharp(X) \otimes_{\mathbb{Z}} \mathbb{Q} \to H_\sharp(X, \mathbb{Q})$ an isomorphism. The real Chern character (4.3) is a natural transformation of homology theories.

An important point here is that the real Chern character requires a somewhat finer analysis than the usual Chern character. Although it detects all the homology classes, there can be KO-homology elements which have the same image under it because of the complexification map and the different periodicities of K-theory and
KO-theory. For example, consider the KO-cycles $[\mathrm{pt}, 1^{\mathbb{R}}_{\mathrm{pt}}, \mathrm{id}_{\mathrm{pt}}]$ and $[S^4, 1^{\mathbb{R}}_{S^4}, \zeta]$ over $\mathrm{pt}$. They have the same image through $\mathrm{ch}^{\mathbb{R}}_\bullet$, namely the generator of $H_0(\mathrm{pt}, \mathbb{Q})$. But since they belong to different subgroups $KO^t_i(\mathrm{pt})$ with respect to the grading of $KO^t_\sharp(\mathrm{pt})$, we conclude that these are the generators of the lattice $\Lambda_{KO^t_\sharp(\mathrm{pt})} := KO^t_\sharp(\mathrm{pt}) / \mathrm{tor}_{KO^t_\sharp(\mathrm{pt})}$. This fact will be important when we study brane constructions in the next section.

We can give a characteristic class description of $\mathrm{ch}^{\mathbb{R}}_\bullet$ as follows. Let $\tau^{KO}_E$ be the KO-theory Thom class and $\tau^H_E$ the cohomology Thom class of a real spin vector bundle $E$ over $X$. Let $\mathrm{ch}^\bullet : K^\sharp(X) \to H^\sharp(X, \mathbb{Q})$ be the usual cohomology Chern character, which is a multiplicative $\mathbb{Z}_2$-degree preserving natural transformation of cohomology theories. Denote by $\widehat{A}(E) \in H^{\mathrm{even}}(X, \mathbb{Q})$ the Atiyah–Hirzebruch class of $E$. By using the analysis of natural transformations given in [38], along with the Hirzebruch formulation of the Riemann–Roch formula
$$\mathrm{ch}^\bullet(\tau^{KO}_E \otimes \mathbb{C}) = \tau^H_E \smile \widehat{A}(E)^{-1}$$
and (4.2), one then has
$$\mathrm{ch}^{\mathbb{R}}_\bullet(M,E,\phi) = \phi_*(\mathrm{ch}^\bullet(E_{\mathbb{C}}) \smile \widehat{A}(TM) \frown [M]) \tag{4.4}$$
where [M ] ∈ H (M, Z) is the orientation cycle of the compact spin manifold M . Since EC ∼ = EC for any real vector bundle E → X, one has ch• (EC ) = ch• (EC ). Thus all components of the cohomology Chern character in the formula (4.4) of degree 4j + 2 vanish. 4.3. Cn -index theorems We will now explore the relation between the homological real Chern character and the topological index defined in (3.17). We first show that up to Poincar´e duality the topological index is the homological morphism induced by the collapsing map. Recall that up to isomorphism, the AMS invariant is given by E (M ) = ζ˜ KO [E] A !
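where M is a compact spin manifold of dimension n, E is a real vector bundle over M, $\tilde\zeta : M\to\mathrm{pt}$ is the collapsing map on M, and $\tilde\zeta^{KO}_!$ is the corresponding Gysin homomorphism on KO-theory. In this case, we have

$$\tilde\zeta^{KO}_! = \Phi_{\mathrm{pt}}\circ\tilde\zeta^{KO}_*\circ\Phi_M^{-1}$$

where $\tilde\zeta^{KO}_*$ is the induced morphism on $KO^t_\sharp(M)$, and $\Phi_{\mathrm{pt}}$ and $\Phi_M$ are the Poincaré duality isomorphisms on pt and M, respectively. It then follows that

$$\begin{aligned}
\Phi_{\mathrm{pt}}\circ\mathrm{ind}^t_n(M,E,\phi) &= \Phi_{\mathrm{pt}}\circ\tilde\zeta^{KO}_![E] \\
&= \Phi_{\mathrm{pt}}\circ\tilde\zeta^{KO}_!\circ\Phi_M^{-1}(M,E,\mathrm{id}_M) \\
&= \tilde\zeta^{KO}_*[M,E,\mathrm{id}_M] = [M,E,\tilde\zeta] = \zeta^{KO}_*[M,E,\phi]
\end{aligned} \tag{4.5}$$

where $\zeta : X\to\mathrm{pt}$ is the collapsing map on X with $\tilde\zeta = \zeta\circ\phi$.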
October 23, 2009 12:9 WSPC/148-RMP
J070-00383
KO-Homology and Type I String Theory
1125
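We will next describe how the real Chern character can be used to give a characteristic class description of the map $\mathrm{ind}^t_n$ in the torsion-free cases. Consider first the case n ≡ 4 mod 8. We begin by showing that there is a commutative diagram

$$\begin{array}{ccc}
KO^t_4(X) & \xrightarrow{\ \zeta^{KO}_*\ } & KO^t_4(\mathrm{pt}) \\
 & {\scriptstyle \zeta^H_*\circ\,\mathrm{ch}^{\mathbb R}_\bullet}\searrow & \big\downarrow{\scriptstyle \mathrm{ch}^{\mathbb R}_\bullet} \\
 & & H_0(\mathrm{pt},\mathbb Q)
\end{array} \tag{4.6}$$

where $\zeta^H_*$ is the induced morphism on homology. Recall that $\mathrm{ch}^{\mathbb R}_\bullet = \mathrm{ch}_\bullet\circ(\otimes\,\mathbb C)_*$, where $(\otimes\,\mathbb C)_*$ is the complexification map (4.1). Then one has

$$\begin{aligned}
\zeta^H_*\circ\mathrm{ch}^{\mathbb R}_\bullet(M,E,\phi) &= \zeta^H_*\circ\phi_*\big(\mathrm{ch}^\bullet(E)\smile\widehat A(TM)\frown[M]\big) \\
&= (\zeta\circ\phi)_*\big(\mathrm{ch}^\bullet(E)\smile\widehat A(TM)\frown[M]\big) \\
&= \mathrm{ch}^{\mathbb R}_\bullet(M,E,\tilde\zeta) = \mathrm{ch}^{\mathbb R}_\bullet\circ\zeta^{KO}_*[M,E,\phi].
\end{aligned}$$

Now recall from Sec. 4.2 above that the map $\mathrm{ch}^{\mathbb R}_\bullet : KO^t_4(\mathrm{pt})\to H_0(\mathrm{pt},\mathbb Q)$ sends $\mathbb Z\to 2\mathbb Z\subset\mathbb Q$. On its image, the homomorphism $\mathrm{ch}^{\mathbb R}_\bullet$ is thus invertible and its inverse is given by division by 2. An explicit realization is obtained by noticing that

$$\begin{aligned}
\zeta^H_*\circ\mathrm{ch}^{\mathbb R}_\bullet(M,E,\phi) &= \zeta^H_*\circ\Phi_M\big(\mathrm{ch}^\bullet(E_{\mathbb C})\smile\widehat A(TM)\big) \\
&= \Phi_{\mathrm{pt}}\circ\zeta^H_!\big(\mathrm{ch}^\bullet(E_{\mathbb C})\smile\widehat A(TM)\big) \\
&= \big\langle\mathrm{ch}^\bullet(E_{\mathbb C})\smile\widehat A(TM),[M]\big\rangle,
\end{aligned} \tag{4.7}$$

where $\langle-,-\rangle : H^\sharp(M,\mathbb Q)\times H_\sharp(M,\mathbb Q)\to\mathbb Q$ is the canonical dual pairing between cohomology and homology. In (4.7), we have used the fact that $\Phi_{\mathrm{pt}}$ is the identity on $H^0(\mathrm{pt},\mathbb Q)\cong\mathbb Q$, and the proof of the last equality uses the Atiyah–Hirzebruch version of the Grothendieck–Riemann–Roch theorem and can be found in [39, Sec. V4.20]. Recall that for a spin manifold M of dimension 8k + 4, one has $\langle\mathrm{ch}^\bullet(E_{\mathbb C})\smile\widehat A(TM),[M]\rangle\in 2\mathbb Z$. After using the isomorphism $KO^t_4(\mathrm{pt})\cong\mathbb Z$, we thus deduce that $\zeta^{KO}_*[M,E,\phi] = \frac12\langle\mathrm{ch}^\bullet(E_{\mathbb C})\smile\widehat A(TM),[M]\rangle$, and from (4.5) we arrive finally at

$$\mathrm{ind}^t_n(M,E,\phi) = \tfrac12\big\langle\mathrm{ch}^\bullet(E_{\mathbb C})\smile\widehat A(TM),[M]\big\rangle.$$

When n ≡ 0 mod 8, one obtains a similar result but now without the factor $\frac12$, since in that case $\mathrm{ch}^{\mathbb R}_\bullet : KO^t_0(\mathrm{pt})\to H_0(\mathrm{pt},\mathbb Q)$ is the inclusion $\mathbb Z\hookrightarrow\mathbb Q$. In the remaining non-trivial cases n ≡ 1, 2 mod 8 the homological Chern character is of no use, as $KO^{-n}(\mathrm{pt})$ is the pure torsion group $\mathbb Z_2$, and there is no cohomological formula for the AMS invariant in these instances. However, by using Theorem 3.9 one still has an interesting mod 2 index formula for the topological index in these cases as well [4]. We can summarize our homological derivations of these index formulas as follows.

Theorem 4.1. Let $[M,E,\phi]\in KO^t_n(X)$, and let $\slashed D^M_E$ be the Atiyah–Singer operator on M with coefficients in E. Let $\mathcal H^M_E := \ker\slashed D^M_E$ denote the vector space of real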
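harmonic E-valued spinors on M. Then one has the $C\ell_n$-index formulas

$$\mathrm{ind}^t_n(M,E,\phi) = \begin{cases}
\big\langle\mathrm{ch}^\bullet(E_{\mathbb C})\smile\widehat A(TM),[M]\big\rangle, & n\equiv 0 \bmod 8, \\
\dim_{\mathbb C}\mathcal H^M_E \bmod 2, & n\equiv 1 \bmod 8, \\
\dim_{\mathbb H}\mathcal H^M_E \bmod 2, & n\equiv 2 \bmod 8, \\
\tfrac12\big\langle\mathrm{ch}^\bullet(E_{\mathbb C})\smile\widehat A(TM),[M]\big\rangle, & n\equiv 4 \bmod 8, \\
0, & \text{otherwise.}
\end{cases}$$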
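As a sanity check of the n ≡ 4 mod 8 case, the formula can be evaluated on a standard example. The sketch below assumes that M is a K3 surface with its unique spin structure and that E is the trivial real line bundle on M; these choices are purely illustrative and do not appear in the text above. It uses the standard facts that $p_1[K3] = -48$ and that the degree-four part of the $\widehat A$-class is $-p_1/24$.

```latex
% Illustrative assumption: M = K3 surface, E = trivial real line bundle.
% Standard inputs: p_1[K3] = -48, \widehat{A}_1 = -p_1/24.
\begin{align*}
\mathrm{ch}^\bullet(E_{\mathbb C}) &= 1, \\
\big\langle \widehat{A}(TM), [M] \big\rangle &= -\tfrac{1}{24}\, p_1[M] = 2, \\
\mathrm{ind}^t_4(M, E, \phi) &= \tfrac12 \cdot 2 = 1 .
\end{align*}
```

The pairing is even, in agreement with the divisibility property used in the derivation, and the resulting index is the integer 1.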
5. Brane Constructions in Type I String Theory

In Type I superstring theory with topologically trivial B-field, a D-brane in an oriented ten-dimensional spin manifold X is usually described by a spin submanifold $W\hookrightarrow X$, together with a Chan–Paton bundle which is equipped with a superconnection and defined by an element $\xi\in KO^0(X)$ [61] (see [54] for a more precise treatment). In this section we will apply the mathematical formalism developed thus far to the classification and construction of Type I D-branes in topological KO-homology. The main new impetus that we will emphasize is the role of the AMS geometric invariant, which was the crucial ingredient in the proof of Sec. 3. It will provide a precise, rigorous framework for certain physical aspects of Type I brane constructions. We will begin by demonstrating how geometric K-homology can be used to describe D-branes in Type II and Type I string theory in a topologically nontrivial spacetime. We will introduce the notion of a wrapped D-brane on a given submanifold of spacetime, define the group of charges of wrapped D-branes, and construct explicit examples of wrapped D-branes which have torsion charge.

5.1. Classification of Type I D-branes

By the Sen–Witten construction [57, 61], the group of topological charges of a Type II Dp-brane realized as a spin$^c$ submanifold $W\subset X$ is given by $K^0_{\mathrm{cpt}}(\nu_W)\cong K^0(B(\nu_W),S(\nu_W))$, where $\nu_W$ is the normal bundle $N(X/W)\to W$ of $W\subset X$, equipped with the spin$^c$ structure induced by the spin$^c$ structure on W and the spin structure on X, and $B(\nu_W)$ and $S(\nu_W)$ are respectively the unit ball and sphere bundle of the normal bundle. Let $W'$ be a tubular neighborhood of W in X, which we can identify with the interior of the ball bundle $B(\nu_W)\setminus S(\nu_W)$. By Poincaré duality, it follows that the elements of the K-theory group $K^0_{\mathrm{cpt}}(\nu_W)\cong K^{\mathrm{cpt}}_{10}(\nu_W)$ can be naturally interpreted as spacetime-filling D9-branes (or D9 brane-antibrane pairs) in X.
This requires extending the Chan–Paton bundles over $W'\cong B(\nu_W)$ to $X\setminus W'$ in the standard way [6, 61, 52, 54], possibly by stabilizing with the addition of extra brane–antibrane pairs. In the following, we will always assume that this has been implicitly done and identify the normal bundle $\nu_W$ with spacetime X itself.
According to the Sen–Witten construction, the classes in $K^0_{\mathrm{cpt}}(\nu_W)$ are interpreted as systems of D9–$\overline{\mathrm{D9}}$ branes which are unstable, and will decay onto the worldvolume W, which corresponds to the zero locus of the appropriate tachyon field. In particular, this process happens in spacetime, and it depends on how the worldvolume is embedded in it. On the other hand, the role played by the Chan–Paton vector bundle on the Dp-brane is not manifest in this classification. However, there is a natural way of classifying the Dp-branes on W which manifestly takes into account the Chan–Paton bundle contribution. Indeed, from the Dp-brane data, we can naturally construct the Baum–Douglas cycle $(W,E,\mathrm{id})$, where E denotes the Chan–Paton bundle, and declare that its charge is given by the class $[W,E,\mathrm{id}]\in K^t_{p+1}(W)$. As the group $K^t_{p+1}(W)$ contains no information about the embedding of the worldvolume W in X, we can intuitively think that the charge $[W,E,\mathrm{id}]$ takes into account how the D-brane wraps the submanifold W. Notice that this is analogous to the charge classification of an extended object in an abelian gauge theory via the homology cycle of its worldvolume. At this point, we notice that by definition the elements of $K^t_{p+1}(W)$ are given by (differences of) classes $[M,E,\phi]$ where M is a (p + 1)-dimensional manifold. However, it is not always possible to choose the map $\phi$ in $[M,W]$ in such a way that $\phi$ is a diffeomorphism. This motivates the following definition.

Definition 5.1. Let X be a Type II string background, described by a ten-dimensional spin manifold, and let $W\subset X$ be a spin$^c$ submanifold. A Dp-brane wrapping the worldvolume W is defined as the K-cycle $(M,E,\phi)$, where $\dim M = p+1$ and $\phi(M)\subset W$. We will call E the Chan–Paton bundle associated to the wrapped Dp-brane, and we will say that the Dp-brane fills W if $\phi(M) = W$. The charge of the wrapped Dp-brane is given by the class $[M,E,\phi]$ in the group $K^t_{p+1}(W)$.
Notice that in the above definition we have relaxed the condition that $\dim W = p+1$, as we are not requiring that the wrapping preserves the dimension of the D-brane. This is an attempt to take into account, at least at the topological level, the well-known fact that D-branes are not always representable as submanifolds equipped with vector bundles, since they are boundary conditions for a superconformal field theory, and that a distinction should be made between the wrapping D-brane, in this case identified with a K-cycle representing a particular type of boundary conditions, and the worldvolume it wraps. Notice also that the group of charges of wrapped Dp-branes does not depend on how the manifold W is embedded into the spacetime, and hence it seems to represent a genuine worldvolume concept. In particular, as mentioned above, the wrapped D-brane definition is very natural in the ordinary case of a D-brane realized as a submanifold W of spacetime equipped with a Chan–Paton bundle E, as it only depends on how the vector bundle is defined on the submanifold, and not on the procedure used to “extend” it to the spacetime. Finally, in the case of ordinary D-branes wrapping W with $\dim W = p+1$, the group $K^t_{p+1}(W)$ coincides with the group of charges of Type IIB Dp-branes that
can be obtained via the Sen–Witten construction, i.e. via brane-antibrane decay. This can be shown as follows. Since the normal bundle $N(X/W)\to W$ is a spin$^c$ vector bundle of even rank 9 − p in this case, we can use the Thom isomorphism in K-theory to establish that $K^0_{\mathrm{cpt}}(\nu_W)\cong K^0(W)$. As W is a spin$^c$ submanifold of the spacetime, we can use Poincaré duality to get $K^0(W)\cong K^t_{p+1}(W)$ where $p+1 = \dim W$. This suggests that for ordinary Type II Dp-branes the wrapping charge is completely determined by the decay of the tachyon field. It is natural at this point to extend the notion of wrapped D-brane and of wrapping charge to Type I string theory. In this case, though, the two notions of charge do not coincide, as we will show in the following. Recall that in Type I string theory the group of topological charges of a Dp-brane realized as a spin submanifold $W\subset X$ is given by

$$KO^0_{\mathrm{cpt}}(\nu_W)\cong KO^0(B(\nu_W),S(\nu_W)). \tag{5.1}$$

By using the Thom isomorphism in KO-theory, we have $KO^0_{\mathrm{cpt}}(\nu_W)\cong KO^{p-9}(W)$. Finally, by Poincaré duality, we get $KO^{p-9}(W)\cong KO^t_{10}(W)$. The group $KO^t_{10}(W)$ is in general not isomorphic to the group $KO^t_{p+1}(W)$, and explicitly depends on the dimension of the spacetime. We can physically interpret the elements of $KO^t_{10}(W)$ as equally charged systems of wrapping D9–$\overline{\mathrm{D9}}$-branes decaying on the submanifold W, and via the inclusion $i : W\hookrightarrow X$ they can be related to the D9-branes used in the Sen–Witten construction. This is not surprising, as the decay mechanism is somehow at the heart of the spacetime D-brane charge classification, and it reinforces the statement that the group (5.1) encodes spacetime properties of the Dp-brane, i.e. as vector bundles defined over the spacetime. (Higher-degree KO-theory groups, while having no natural interpretations in terms of D-branes, in fact arise through the chain of orientifolds one encounters when taking T-duals of the Type I theory and require the use of KR-theory [15, 51].) The topological charges of the D-branes arising in this way are provided by the AMS invariant (3.15), or equivalently by the topological index as computed in the $C\ell_n$-index Theorems 3.9 and 4.1. This naturally links the D-brane charge to a fermionic field theory on the brane worldvolume, as $\slashed D^M_E$ is the Atiyah–Singer operator defined on sections of the irreducible spinor bundle over M coupled to the real vector bundle $E\to M$. The precise form of the charge in Theorem 4.1 is dictated by whether the corresponding spinor representations are real, complex or pseudo-real. Most noteworthy are the (non-BPS) torsion charges. The AMS
invariant in these instances gives a precise realization to the notion of a “$\mathbb Z_2$ Wilson line” which is usually used in the physics literature for the construction of torsion D-branes in Type I string theory [57, 61, 14]. It is defined as a non-trivial element in the set of $\mathbb R/\mathbb Z$-valued gauge holonomies on M which are invariant under the involution which sends a complex vector bundle V to its complex conjugate $\overline V$. Within our framework, it is determined instead by the coupling of the branes to the worldvolume fermions $\psi$, valued in E, which are solutions of the harmonic equation $\slashed D^M_E\psi = 0$. This provides a rigorous framework for describing the torsion charges, and moreover identifies the bundles used in tachyon condensation processes as the usual spinor bundles coupled to the Chan–Paton bundle E. We will see some explicit examples in Sec. 5.3 below.

5.2. Wrapped branes

Let us now make some of these constructions more explicit. Given the real Chern character, we can mimic some (but not all) of the constructions of Type II D-branes in complex K-homology. However, in light of the remarks made in Sec. 4.2, special care must be taken as the Chern character in the real case is not a rational injection. With this in mind, we have the following adaptation of [54, Theorem 2.1].

Theorem 5.2. Let X be a compact connected finite CW-complex of dimension n whose rational homology can be presented as

$$H_\sharp(X,\mathbb Q) = \bigoplus_{p=0}^{n}\ \bigoplus_{i=1}^{m_p}\ [M^p_i]\,\mathbb Q,$$

where $M^p_i$ is a p-dimensional compact connected spin submanifold of X without boundary and with orientation cycle $[M^p_i]$ given by the spin structure. Suppose that the canonical inclusion map $\iota^p_i : M^p_i\hookrightarrow X$ induces, for each i, p, a homomorphism $(\iota^p_i)_* : H_p(M^p_i,\mathbb Q)\to H_p(X,\mathbb Q)\cong\mathbb Q^{m_p}$ with the property

$$(\iota^p_i)_*[M^p_i] = \kappa_{ip}\,[M^p_i] \tag{5.2}$$

for some $\kappa_{ip}\in\mathbb Q\setminus\{0\}$. Then the KO-homology lattice $\Lambda_{KO^t_\sharp(X)} := KO^t_\sharp(X)/\mathrm{tor}_{KO^t_\sharp(X)}$ contains a set of linearly independent elements given by the classes of KO-cycles

$$[M^p_i, 1^{\mathbb R}_{M^p_i}, \iota^p_i], \qquad 0\le p\le n, \qquad 1\le i\le m_p.$$

Proof. By [54] the cycles $[M^p_i, 1^{\mathbb C}_{M^p_i}, \iota^p_i]$ form a basis for the lattice $\Lambda_{K^t_\sharp(X)} := K^t_\sharp(X)/\mathrm{tor}_{K^t_\sharp(X)}$ in K-homology. The conclusion follows from the fact that

$$[M^p_i, 1^{\mathbb R}_{M^p_i}, \iota^p_i]\otimes\mathbb C = [M^p_i, 1^{\mathbb C}_{M^p_i}, \iota^p_i],$$

i.e. that the elements

$$\mathrm{ch}^{\mathbb R}_\bullet(M^p_i, 1^{\mathbb R}_{M^p_i}, \iota^p_i)$$

form a set of generators of $H_\sharp(X,\mathbb Q)$.
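To illustrate how the hypotheses of Theorem 5.2 are checked in practice, consider the following minimal sketch. It assumes $X = S^n$ with $n\ge 2$ (an illustrative choice that is not made in the text above), so that $m_0 = m_n = 1$ with $M^0_1 = \mathrm{pt}$ and $M^n_1 = S^n$, and it uses the standard fact that $\widehat A(TS^n) = 1$ because the tangent bundle of a sphere is stably trivial.

```latex
% Illustrative assumption: X = S^n with n >= 2, M^0_1 = pt, M^n_1 = S^n.
\begin{align*}
H_\sharp(S^n,\mathbb Q) &= [\mathrm{pt}]\,\mathbb Q \oplus [S^n]\,\mathbb Q,
  \qquad \kappa_{10} = \kappa_{1n} = 1, \\
\mathrm{ch}^{\mathbb R}_\bullet(\mathrm{pt}, 1^{\mathbb R}_{\mathrm{pt}}, \iota^0_1)
  &= (\iota^0_1)_*[\mathrm{pt}] = [\mathrm{pt}], \\
\mathrm{ch}^{\mathbb R}_\bullet(S^n, 1^{\mathbb R}_{S^n}, \mathrm{id}_{S^n})
  &= \mathrm{ch}^\bullet(1^{\mathbb C}_{S^n}) \smile \widehat{A}(TS^n) \frown [S^n] = [S^n].
\end{align*}
```

Under these assumptions the two images are linearly independent over $\mathbb Q$, so the corresponding KO-cycles are linearly independent in the lattice, as the theorem asserts.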
Theorem 5.2 provides sufficient combinatorial criteria on the rational homology of X which ensure that torsion-free D-branes can wrap non-trivial spin cycles of the spacetime X. As in the complex case, this is related to an analogous problem for the spin bordism group $MSpin_\sharp(X)$, which can also be defined in terms of a spectrum $MSpin^\infty$. Just as in K-theory, the Atiyah–Bott–Shapiro (ABS) orientation map [6] $MSpin^\infty\to KO^\infty$ induces an $MSpin_\sharp(\mathrm{pt})$-module structure on $KO^t_\sharp(\mathrm{pt})$. Then analogously to the complex case we have the following result [35].

Theorem 5.3. The map

$$MSpin_\sharp(X)\otimes_{MSpin_\sharp(\mathrm{pt})} KO^t_\sharp(\mathrm{pt}) \longrightarrow KO^t_\sharp(X), \qquad [M,\phi]\longmapsto[M, 1^{\mathbb R}_M, \phi]$$

induced by the ABS orientation is a natural isomorphism of $KO^t_\sharp(\mathrm{pt})$-modules for any finite CW-complex X.

This immediately implies the following result, reducing the problem of calculating the KO-homology generators to the analogous problem in spin bordism.

Theorem 5.4. Let X be a finite CW-complex. Suppose that $[M_i,\phi_i]$, $1\le i\le m$, are the generators of $MSpin_\sharp(X)$ as an $MSpin_\sharp(\mathrm{pt})$-module. Then $[M_i, 1^{\mathbb R}_{M_i}, \phi_i]$, $1\le i\le m$, generate $KO^t_\sharp(X)$ as a $KO^t_\sharp(\mathrm{pt})$-module. In other words, for each n = 0, 1, . . . , 7 the group $KO^t_n(X)$ is generated by elements $[M_i, 1^{\mathbb R}_{M_i}, \phi_i]$, $1\le i\le m$, with $\dim M_i = n$.

5.3. Torsion branes

We now describe a geometrical approach to the computation of torsion KO-cycle generators, thus elucidating the role of the AMS invariant in the construction of torsion D-branes. The general problem in KO-homology turns out to be much more involved than in the complex case. We discuss this further in Sec. 5.4 below. For now we will content ourselves with finding explicit representatives for the generators of the non-trivial groups $KO^t_n(\mathrm{pt})$ with n = 0, 1, 2, 4. This entails instructive exercises in the computations of topological indices which aid in better understanding the origins of Type I torsion D-brane charges. Recall from Sec. 4.2 that for the non-torsion cases n = 0 and n = 4, using the real Chern character one finds that the classes $[\mathrm{pt}, 1^{\mathbb R}_{\mathrm{pt}}, \mathrm{id}_{\mathrm{pt}}]$ and $[S^4, 1^{\mathbb R}_{S^4}, \zeta]$ are generators of the groups $KO^t_0(\mathrm{pt})\cong\mathbb Z$ and $KO^t_4(\mathrm{pt})\cong\mathbb Z$, respectively. We begin with the group $KO^t_1(\mathrm{pt})$. Consider the circle $S^1$ and assign to it a Riemannian metric. Since there is only one unit tangent vector at any point of $S^1$, one has $P_{SO}(S^1)\cong S^1$. A spin structure on $S^1$ is thus given by a double covering

$$P_{Spin}(S^1)\longrightarrow S^1$$
and by the fibration

$$\mathbb Z_2\longrightarrow P_{Spin}(S^1)\longrightarrow S^1.$$

There are only two double coverings of the circle, one disconnected and the other connected, given, respectively, by

$$S^1\times\mathbb Z_2\longrightarrow S^1, \qquad S^1_M\longrightarrow S^1$$

where $S^1_M$ is the total space of the principal $\mathbb Z_2$-bundle associated to the Möbius strip. We will call these two spin structures the “interesting” and the “uninteresting” spin structures, respectively. Corresponding to these two spin structures (labeled “i” and “u”, respectively), we construct classes in $KO^t_1(\mathrm{pt})$ given by $[S^1_i, 1^{\mathbb R}_{S^1}, \zeta]$ and $[S^1_u, 1^{\mathbb R}_{S^1}, \zeta]$ where $\zeta : S^1\to\mathrm{pt}$ is as usual the collapsing map. We will now compute the topological indices in detail, finding the AMS invariants [41]

$$\widehat{\mathcal A}_{1^{\mathbb R}_{S^1}}(S^1_i) = 1, \qquad \widehat{\mathcal A}_{1^{\mathbb R}_{S^1}}(S^1_u) = 0$$

in $KO^{-1}(\mathrm{pt})\cong\mathbb Z_2$. Hence the two classes above represent the elements of $KO^t_1(\mathrm{pt})\cong\mathbb Z_2$. In particular, $[S^1_i, 1^{\mathbb R}_{S^1}, \zeta]$ is a generator, analogous to the non-BPS Type I D-particle that arises from tachyon condensation on the Type I D1 brane–antibrane system with a $\mathbb Z_2$ Wilson line [57, 61, 14].

Let us first consider the circle with the interesting spin structure. Since $C\ell_1\cong\mathbb C$, one has $\slashed S(S^1) := P_{Spin}(S^1)\times_{\mathbb Z_2}C\ell_1\cong S^1\times\mathbb C$. By decomposing $\mathbb C = \mathbb R\oplus i\mathbb R$, one has the identifications $\slashed S^0(S^1) = S^1\times\mathbb R$ and $\slashed S^1(S^1) = S^1\times i\mathbb R$. As the Clifford bundle is trivial, its space of sections is given by $C^\infty(S^1,\slashed S(S^1)) = C^\infty(S^1,\mathbb C)$. By coordinatizing the circle $S^1$ with arc length s, the Atiyah–Singer operator (3.3) can be expressed as

$$\slashed D^{S^1} = i\,\frac{d}{ds} \tag{5.3}$$

where $e_1 = i$ is a generator of the Clifford algebra $C\ell_1$. To compute the topological index $\widehat{\mathcal A}_{1^{\mathbb R}_{S^1}}(S^1_i)$, we use the $C\ell_1$-index Theorem 3.9 and hence determine the vector space $\ker\slashed D^{S^1}$, or equivalently the chiral subspace $\ker(\slashed D^{S^1})^0$. Since $C^\infty(S^1,\slashed S^0) = C^\infty(S^1,\mathbb R)$, the kernel of the chiral Atiyah–Singer operator $(\slashed D^{S^1})^0 : C^\infty(S^1,\slashed S^0)\to C^\infty(S^1,\slashed S^1)$ is given by the space of real-valued constant functions on $S^1$. The dimension of this vector space, as a module over $C\ell^0_1\cong\mathbb R$, is 1 and hence

$$\mathrm{ind}^t_1(S^1_i, 1^{\mathbb R}_{S^1}, \zeta) = [\ker(\slashed D^{S^1})^0] = 1$$

in $\mathfrak M_0/\imath^*\mathfrak M_1\cong KO^{-1}(\mathrm{pt})\cong\mathbb Z_2$. (Note that here we are using ungraded Clifford modules.)

We now turn to the uninteresting spin structure on $S^1$. This time the bundle $\slashed S(S^1)$ is the (infinite complex) Möbius bundle. It can be described by a trivialization
made of three charts $U_1$, $U_2$ and $U_3$ with $\mathbb Z_2$-valued transition functions $g_{12} = 1$, $g_{23} = 1$ and $g_{31} = -1$. In this case, the vector space $\ker(\slashed D^{S^1})^0$ consists of locally constant real-valued functions $\psi_i$ defined on $U_i$ which satisfy $\psi_j = g_{ji}\psi_i$ on the intersections $U_i\cap U_j\neq\emptyset$. Because of the non-trivial transition function $g_{31}$, there are no non-zero solutions $\psi$ to the equation $(\slashed D^{S^1})^0\psi = 0$. The kernel $\ker(\slashed D^{S^1})^0$ is thus trivial, and so $\mathrm{ind}^t_1(S^1_u, 1^{\mathbb R}_{S^1}, \zeta) = 0$.

Let us now consider the structure of the group $KO^t_2(\mathrm{pt})$. Analogously to the construction above, one can equip the torus $T^2 = S^1\times S^1$ with an “interesting” spin structure and show that

$$\widehat{\mathcal A}_{1^{\mathbb R}_{T^2}}(T^2) = 1,$$

and also that

$$\widehat{\mathcal A}_{1^{\mathbb R}_{S^2}}(S^2) = 0$$

in $KO^{-2}(\mathrm{pt})\cong\mathbb Z_2$. It follows that the classes $[T^2, 1^{\mathbb R}_{T^2}, \zeta]$ and $[S^2, 1^{\mathbb R}_{S^2}, \zeta]$ represent the elements of the group $KO^t_2(\mathrm{pt})\cong\mathbb Z_2$. In particular, $[T^2, 1^{\mathbb R}_{T^2}, \zeta]$ is a generator, and it is analogous to the Type I non-BPS D-instanton which is usually constructed as the Ω-projection of the Type IIB D(−1) brane-antibrane system [61, 14]. We will now give some details of these results.

Equip $T^2$ with the flat metric $d\theta_1\otimes d\theta_1 + d\theta_2\otimes d\theta_2$, where $(\theta_1,\theta_2)$ are angular coordinates on $S^1\times S^1$. Since $T^2$ is a Lie group, its tangent bundle is trivializable, and hence the oriented orthonormal frame bundle is canonically given by $P_{SO}(T^2) = T^2\times S^1$. Consider the spin structure on $T^2$ given by

$$P_{Spin}(T^2) = T^2\times S^1 \xrightarrow{\ \mathrm{id}_{T^2}\times z^2\ } T^2\times S^1.$$

Since $C\ell_2\cong\mathbb H$ and $C\ell^0_2\cong\mathbb C$, the corresponding Clifford bundles are $\slashed S(T^2) = T^2\times\mathbb H$ and $\slashed S^0(T^2) = T^2\times\mathbb C$. In the Riemannian coordinates $(\theta_1,\theta_2)$, the Atiyah–Singer operator (3.3) can be expressed as

$$\slashed D^{T^2} = \sigma_1\,\frac{\partial}{\partial\theta_1} + \sigma_2\,\frac{\partial}{\partial\theta_2}$$

where the Pauli spin matrices

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$$

represent the generators $e_1$, $e_2$ of $C\ell_2$, acting by left multiplication. The chiral operator $(\slashed D^{T^2})^0$ is locally the Cauchy–Riemann operator, and hence its kernel consists of holomorphic sections of the chiral Clifford bundle $\slashed S^0(T^2)$. These are simply the complex-valued constant functions on $T^2$, as the torus is a compact complex manifold. As a module over $C\ell^0_2$, this vector space is one-dimensional and so

$$\mathrm{ind}^t_2(T^2, 1^{\mathbb R}_{T^2}, \zeta) = [\ker(\slashed D^{T^2})^0] = 1$$

in $\mathfrak M_1/\imath^*\mathfrak M_2\cong KO^{-2}(\mathrm{pt})\cong\mathbb Z_2$.
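The statement that the kernel on the torus consists only of constants can be verified directly with Fourier series. The following is a sketch under two assumptions that are not spelled out in the text: that, up to an overall factor and sign conventions, the chiral operator acts on complex-valued functions as $\partial/\partial\theta_1 + i\,\partial/\partial\theta_2$, and that for the chosen spin structure the sections are genuinely doubly periodic functions on $T^2$.

```latex
% Assumptions: chiral operator ~ \partial_{\theta_1} + i \partial_{\theta_2},
% acting on doubly periodic functions \psi on T^2.
\begin{align*}
\psi(\theta_1,\theta_2) &= \sum_{(m,n)\in\mathbb Z^2} c_{mn}\, e^{i(m\theta_1+n\theta_2)}, \\
\Big(\frac{\partial}{\partial\theta_1} + i\,\frac{\partial}{\partial\theta_2}\Big)\psi
  &= \sum_{(m,n)\in\mathbb Z^2} (i m - n)\, c_{mn}\, e^{i(m\theta_1+n\theta_2)} = 0 \\
  &\Longrightarrow\quad c_{mn} = 0 \ \text{ unless } \ m = n = 0 .
\end{align*}
```

Since $im - n$ vanishes for integers m, n only when both are zero, the kernel is spanned by the constant mode $c_{00}$, which is one-dimensional over $\mathbb C$.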
Consider now the two-sphere $S^2$ as a Riemannian manifold. It is not difficult to see that $P_{SO}(S^2) = SO(3)\to SO(3)/SO(2)\cong S^2$ is the oriented orthonormal frame bundle over $S^2$. The (unique) spin structure on $S^2$ is thus given by

$$\begin{array}{ccc}
P_{Spin}(S^2)\cong SU(2) & \xrightarrow{\ h\ } & P_{SO}(S^2)\cong SO(3) \\
{\scriptstyle U(1)}\searrow & & \swarrow{\scriptstyle SO(2)} \\
 & S^2 &
\end{array}$$

with $h : SU(2)\to SO(3)$ the usual double covering, and by

$$U(1)\longrightarrow P_{Spin}(S^2)\longrightarrow S^2$$

which is the Hopf fibration of $S^2$. Recall that the group $Spin(2)\cong U(1)\cong SO(2)$ acts on $C\ell_2\cong\mathbb H$ as multiplication by

$$\begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix}, \qquad \theta\in[0,2\pi).$$

If one gives the sphere $S^2$ the structure of the complex projective line $\mathbb{CP}^1$, then there is an isomorphism $\slashed S^0(S^2) = P_{Spin}(S^2)\times_{U(1)}\mathbb C\cong T^{1,0}\mathbb{CP}^1$ since the bundle $\slashed S^0(S^2)$ has the same transition functions as the Hopf fibration. In other words, $\slashed S^0(S^2)$ is isomorphic to the canonical line bundle $L_{\mathbb C}$ over $\mathbb{CP}^1$. The vector space $\ker(\slashed D^{S^2})^0$ thus consists of the holomorphic sections of $L_{\mathbb C}$. The only such section on $\mathbb{CP}^1$ is the zero section, and we finally find

$$\mathrm{ind}^t_2(S^2, 1^{\mathbb R}_{S^2}, \zeta) = [\ker(\slashed D^{S^2})^0] = 0$$

in $\mathfrak M_1/\imath^*\mathfrak M_2\cong\mathbb Z_2$.

5.4. General constructions

The analysis of Sec. 5.3 above shows that the problem of finding generators of the geometric KO-homology groups of a space X, representing the Type I D-branes in X, becomes increasingly involved at a very rapid rate. Even in the case of spherical D-branes, we have not been able to find a nice explicit solution in the same way that can be done in the complex case [54]. Nevertheless, at least in these cases we can find a formal solution as follows, which also illustrates the generic problems at hand.
Suppose that we want to construct generating branes for the group $KO^t_k(S^n)$ for some n > 0. Poincaré duality gives the map

$$KO^{n-k}(S^n)\longrightarrow KO^t_k(S^n), \qquad \xi\longmapsto\xi\frown[S^n, 1^{\mathbb R}_{S^n}, \mathrm{id}_{S^n}]. \tag{5.4}$$

As Poincaré duality is a group isomorphism, picking a generator in $KO^{n-k}(S^n)$ will give a generator in $KO^t_k(S^n)$. But the problem is that the class $\xi$ is not a (virtual or stable) vector bundle over $S^n$ in the cases of interest k < n. To this end, we rewrite the cap product in (5.4) by using the suspension isomorphism $\Sigma$ and the desuspension $\Sigma^{-1}$ to get

$$\xi\frown[S^n, 1^{\mathbb R}_{S^n}, \mathrm{id}_{S^n}] = \Sigma^{-1}\big(\Sigma(\xi)\frown\Sigma[S^n, 1^{\mathbb R}_{S^n}, \mathrm{id}_{S^n}]\big).$$

As we are interested only in generators, we can substitute $\Sigma(\xi)$ with the generators of the KO-theory group $KO^0(\Sigma^{n-k}S^n) = \widetilde{KO}{}^0(S^{2n-k})$. The generators of the latter groups are given by [39] the canonical line bundle $L_{\mathbb F}$ over the projective line $\mathbb{FP}^1$, with $\mathbb F$ the reals $\mathbb R$ for k = 2n − 1, the complex numbers $\mathbb C$ for k = 2n − 2, the quaternions $\mathbb H$ for k = 2n − 4 and the octonions $\mathbb O$ for k = 2n − 8 (the remaining groups are trivial up to Bott periodicity).

6. Fluxes

In this final section, we shall explore the classification of Type II Ramond–Ramond (RR) fields, in the absence of D-branes, using the language of topological K-homology. We will find again a crucial role played by a certain invariant, analogous to the AMS invariant but this time determined by the holonomy of RR-fields over background D-branes. We will see that these holonomies find their most natural interpretation within the context of geometric K-homology. Along the way we will also propose a physical interpretation of KK-theory.

6.1. Classification of Type IIA Ramond–Ramond fields

We will start with a description of how the Ramond–Ramond fluxes in Type IIA string theory naturally fit into the framework of topological K-homology, and then propose in Sec. 6.2 below a unified description of the couplings of D-branes to RR fields using bivariant K-theory. The Type IIA Ramond–Ramond fields are classified by a local formulation of K-theory called “differential KO-theory”, a specific instance of a generalized differential cohomology theory which provides a characterization in terms of bundles with connection [28, 36, 30]. Consider the short exact sequence of coefficient groups given by

$$1\longrightarrow\mathbb Z\longrightarrow\mathbb R\longrightarrow\mathbb R/\mathbb Z\longrightarrow 1,$$

and use it to define the K-theory groups $K^i_{\mathbb R/\mathbb Z}(X)$ of a space X with coefficients in the circle group $\mathbb R/\mathbb Z\cong S^1$ as the $K^\sharp(\mathrm{pt})$-module theory which fits into the
corresponding long exact sequence

$$\cdots\longrightarrow K^i(X)\longrightarrow K^i(X)\otimes\mathbb R\longrightarrow K^i_{\mathbb R/\mathbb Z}(X)\longrightarrow K^{i+1}(X)\longrightarrow\cdots.$$

Then the flat RR fields in Type IIA string theory on X, in the absence of D-branes, are classified by the group $K^{-1}_{\mathbb R/\mathbb Z}(X)$ [49]. If X is a finite-dimensional smooth spin manifold of dimension ten, then by using the Chern character the RR phases are described by the short exact sequence

$$0\longrightarrow H^{\mathrm{odd}}(X,\mathbb R)/\Lambda_{K^{-1}(X)}\longrightarrow K^{-1}_{\mathbb R/\mathbb Z}(X)\xrightarrow{\ \beta\ }\mathrm{tor}_{K^0(X)}\longrightarrow 0 \tag{6.1}$$

where $\beta$ is the Bockstein homomorphism. Thus the identity component of the circle coefficient K-theory group $K^{-1}_{\mathbb R/\mathbb Z}(X)$ is the torus $H^{\mathrm{odd}}(X,\mathbb R)/\Lambda_{K^{-1}(X)}$. The cohomology class of an element in this component is determined by the Chern character $\mathrm{ch}^\bullet$, which is an epimorphism on $\Lambda_{K^{-1}(X)}\to H^{\mathrm{odd}}(X,\mathbb R)$. Suppose now that $K^{-1}(X)$ is pure torsion. In this case, $K^{-1}(X;\mathbb R/\mathbb Z)\cong\mathrm{Tor}(K^0(X))$, and the corresponding flat Ramond–Ramond fields can be represented by virtual vector bundles over X. A torsion RR flux $\xi\in K^0(X)$ gives an additional phase factor to a D-brane in the string theory path integral, which we will realize in Sec. 6.3 below in terms of the η-invariant of a suitable Dirac operator. For this, we exploit Pontrjagin duality of K-theory [30].

Proposition 6.1. There is a natural isomorphism $K^i_{\mathbb R/\mathbb Z}(X)\cong\mathrm{Hom}(K^t_i(X),\mathbb R/\mathbb Z)$ for all $i\in\mathbb Z$.

Proof. Apply the universal coefficient theorem for K-theory, and use the fact that the circle group $\mathbb R/\mathbb Z$ is divisible, which implies $\mathrm{Ext}(K^t_j(X),\mathbb R/\mathbb Z) = 0$ for all $j\in\mathbb Z$.

Let W be a compact spin submanifold of X of dimension p + 1. We could then identify the spacetime X with the normal bundle $\nu_W$ as in Sec. 5.1. However, for the present discussion it is more convenient to work with compact closed manifolds X, so we replace $\nu_W$ with its sphere bundle $S(\nu_W)$. Thus in the following spacetime is regarded as a spin fibration $\pi : X\to W$ whose fibers are spheres $X/W\cong S^{9-p}$. By Proposition 6.1 and the Thom isomorphism (2.16), the group of RR fluxes is given by

$$K^{-1}_{\mathbb R/\mathbb Z}(X)\cong\mathrm{Hom}(K^t_{-1}(X),\mathbb R/\mathbb Z) = \mathrm{Hom}(K^t_{p-10}(W),\mathbb R/\mathbb Z)$$

and by Bott periodicity we have finally

$$K^{-1}_{\mathbb R/\mathbb Z}(X)\cong\mathrm{Hom}(K^t_{p+2}(W),\mathbb R/\mathbb Z). \tag{6.2}$$
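The two algebraic steps behind Proposition 6.1 and (6.2) can be spelled out as follows. This is only a sketch: it assumes that the universal coefficient sequence for K-theory with coefficients in an abelian group takes its usual form, with the indexing of the Ext term following the standard convention rather than anything stated in the text.

```latex
% Assumption: standard universal coefficient sequence with coefficients in R/Z.
\begin{align*}
0 \to \mathrm{Ext}\big(K^t_{i-1}(X), \mathbb R/\mathbb Z\big)
  \to K^i_{\mathbb R/\mathbb Z}(X)
  \to \mathrm{Hom}\big(K^t_i(X), \mathbb R/\mathbb Z\big) \to 0, \\
\mathbb R/\mathbb Z \ \text{divisible}
  \ \Longrightarrow\ \mathrm{Ext}(\,\cdot\,, \mathbb R/\mathbb Z) = 0
  \ \Longrightarrow\ K^i_{\mathbb R/\mathbb Z}(X) \cong \mathrm{Hom}\big(K^t_i(X), \mathbb R/\mathbb Z\big), \\
K^t_{p-10}(W) \cong K^t_{p-10+2\cdot 6}(W) = K^t_{p+2}(W)
  \quad \text{(period-two Bott periodicity applied six times).}
\end{align*}
```

Because the Ext term vanishes for every index, the conclusion does not depend on the indexing convention assumed for it.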
The K-homology group $K^t_{p+2}(W)$ consists of wrapped Type II D-branes $[M,E,\phi]$ with the properties $\dim M = p+2$ and $\phi(M)\subset W$. The dimension shift is related to the topological anomaly in the worldvolume fermion path integral [49], as we now explain.
Consider a one-parameter family of (p + 1)-dimensional Type II brane worldvolumes specified by a circle bundle $U\to W$ whose total space U is generically a (p + 2)-dimensional submanifold of spacetime X with the topology of $W\times S^1$. Complex vector bundles $E_g$ of rank n over generic fibers $U/W\cong S^1$ are determined by elements $g\in U(n)$ by the clutching construction (analogously to Sec. 2.2). Thus the family of twisted Atiyah–Singer operators $\slashed D^{S^1}_{E_g}$ is parametrized by the group U(n). The anomaly [29] arises as the determinant line bundle of this family, which is essentially defined as the highest exterior power of the kernel of the family. This defines a non-trivial real line bundle on the group U(n) called the Pfaffian line bundle, which has the property that its lift to $Spin^c(n)$ is the trivial complex line bundle. One can also construct a connection and holonomy of the Pfaffian line bundle [29]. As in Sec. 5.1, U is wrapped by D-branes in $K^t_{p+2}(U)$. One can now restrict to the subgroup $K^t_{p+2}(W)\subset K^t_{p+2}(U)$ by keeping only those D-branes which are wrapped on the embedding $W\hookrightarrow U$ by the zero section of $U\to W$. The isomorphism (6.2) reflects the fact that the topological anomaly is canceled by coupling D-branes to the RR fields through the RR phase factors. This cancellation necessitates that the worldvolume W be a spin$^c$ manifold [61, 29].
6.2. Generalized D9-brane decay

The couplings described in Sec. 6.1 above are intimately related to a topological classification of the D9-brane decay described in Sec. 5.1, which lends a physical interpretation to the bivariant KO-theory groups introduced in Sec. 1.4. Let us explain this first for the simpler case of Type II D-branes and complex KK-theory. Consider the KK-theory groups $KK_i(X,W) := KK_{i,0}(C(X,\mathbb C),C(W,\mathbb C))$. By the Rosenberg–Schochet universal coefficient theorem [55], one then has a split short exact sequence of abelian groups given by

$$0\longrightarrow\mathrm{Ext}_{\mathbb Z}\big(K^{i+1}(X),K^i(W)\big)\longrightarrow KK_i(X,W)\longrightarrow\mathrm{Hom}_{\mathbb Z}\big(K^i(X),K^i(W)\big)\longrightarrow 0$$

for all $i\in\mathbb Z$. Composition of group morphisms with Poincaré duality $K^i(W)\cong K^t_{p+1-i}(W)$ gives

$$0\longrightarrow\mathrm{Ext}_{\mathbb Z}\big(K^{i+1}(X),K^t_{p+1-i}(W)\big)\longrightarrow KK_i(X,W)\longrightarrow\mathrm{Hom}_{\mathbb Z}\big(K^i(X),K^t_{p+1-i}(W)\big)\longrightarrow 0. \tag{6.3}$$

For i = 0 the sequence (6.3) expresses the fact that the elements of the free part of the abelian group $KK_0(X,W)$ correspond to classes of morphisms $K^t_{10}(X)\cong K^0(X)\to K^t_{p+1}(W)$, generalizing the brane decay in K-homology. We may thus interpret $KK_0(X,W)$ as the group of “generalized D9-brane decays”. An example of such a generalized decay can be straightforwardly given in the case p ≡ 1 mod 2 (for p = 1, W is the worldvolume of a D-string). Moreover, suppose that W is a
spin manifold. Then there is a direct image map on K-theory π! : K0 (X) → K0 (W )
(6.4)
given by taking the intersection product of Sec. 1.6 (see Proposition 1.21) by the longitudinal element in KK0 (X, W ) [24], defined by the fiberwise Atiyah–Singer operator on the spin fibration π : X → W as follows. Fix a spin structure and a Riemannian metric g X/W on the relative tangent bundle T (X/W ). This determines a bundle S(X/W / ) → X of Clifford algebras. Let HX be a horizontal distribution of planes on X, so that HX ⊕ T (X/W ) = T X, which together with the metric deter/ X/W be mines a spin connection ∇X/W on T (X/W ) → X. For any w ∈ W we let D w −1 2 ∼ the corresponding Atiyah–Singer operator (3.3) along the fiber π (w) = S actX/W / )). Define the corresponding closure Tw analogously ing on C∞ (X/W, S(X/W X/W }w∈W of bounded Fredholm operto (3.4). This defines a continuous family {Tw ators over W acting on an infinite-dimensional Hilbert bundle HX/W → W , whose X/W = L2 (X/W, S(X/W / ); dg X/W ). By the Atiyah–Singer fiber at w ∈ W is Hw index theorem [5], the topological index π! (ξ) is equal to the analytic index of the }w∈W on X/W appropriately twisted by family of Atiyah–Singer operators {D / X/W w 0 ξ ∈ K (X). On the other hand, for i = −1 one sees from (6.3) that torsion-free elements of the group KK−1 (X, W ) correspond to classes of morphisms K−1 (X) → Ktp+2 (W ) linking RR fields to anomaly canceling D-branes. Any such morphism gives an element of KK−1 (X, W ), but not conversely. The obstruction consists of classes of group extensions of Ktp+2 (W ) by K0 (X), which we may interpret as bound states of anomaly cancelling D-branes wrapping the worldvolume W and D9-branes wrapped on spacetime X. This property seems to reflect the fact [30] that flux operators which correspond to torsion elements of K-theory do not all commute among themselves, as a result of the torsion link pairing provided by Pontrjagin duality. 
In this way the KK-theory group $KK_{-1}(X, W)$ naturally captures the correct topological classification of RR fluxes after quantization. Note that if we disregard the ambient spacetime by setting $X = \mathrm{pt}$, then we recover the group $KK_{-1,0}(\mathbb{C}, C(W, \mathbb{C})) \cong K^{-1}(W) \cong K^t_{p+2}(W)$ which relates to anomaly cancelling D-branes wrapped on the worldvolume $W$.
As in Sec. 2.5, the Type I case is more subtle. Indeed, the universal coefficient theorem proven in [55] is not valid in the case of real $C^*$-algebras, due to obstructions that lie in the homological algebra [3]. One still has the homomorphism
$$KKO_i(X, W) \to \mathrm{Hom}_{\mathbb{Z}}\big(KO^i(X), KO^{i+1}(W)\big)$$
but this is no longer surjective. Again, a universal coefficient theorem exists in united KK-theory [19], giving rise to a homomorphism
$$KKO_0(X, W) \to \big[K^{\mathrm{crt}}(X), K^{\mathrm{crt}}(W)\big]$$
where $K^{\mathrm{crt}}$ is united K-theory and $[-, -]$ is given by all CRT-module homomorphisms of degree 0. Most probably, this can have an interpretation in terms of generalized D9-brane decay in Type I string theory, though we have not investigated the details of this.
6.3. Holonomy over Type II D-branes
To make the discussion at the end of Sec. 6.1 above more precise, we need to refine our analysis by considering a larger collection of triples and finding an appropriate invariant substituting the usual index morphism. This is necessary to take into account the particular role played by the RR fluxes in the string theory path integral.
To give a homological description of the coupling of D-branes to RR fields, we must first of all remember that the topological classification given in Sec. 6.1 above is valid only for Type IIA RR fields in spacetime which are not sourced by D-branes. Thus given a K-cycle $(M, E, \phi)$ on $X$ wrapping $W$, instead of considering the one-parameter family $U \to W$ of brane worldvolumes above, we will assume the existence of a compact smooth spin$^c$ manifold $\widetilde{M}$ with boundary $\partial\widetilde{M} = M$ and dimension $n + 1$ when $\dim M = n$. Suppose in addition that there exists a vector bundle $\widetilde{E} \to \widetilde{M}$ with $\widetilde{E}|_{\partial\widetilde{M}} \cong E$, and a continuous map $\widetilde{\phi} : \widetilde{M} \to X$ such that $\widetilde{\phi}|_{\partial\widetilde{M}} = \phi$. Then $(M, E, \phi)$ is spin$^c$ bordant to the trivial K-cycle $(\emptyset, \emptyset, \emptyset)$, and so $[M, E, \phi] = 0$ in $K^t(X)$. The charge of this D-brane thus vanishes and so it cannot source any RR fields, as required. We call such a triple $(M, E, \phi)$ a “background D-brane”, because it should be regarded as equivalent to the closed string vacuum. Any neighborhood of the boundary in $\widetilde{M}$ looks like a product $M \times I$, with $I = [0, 1]$ the unit interval, and so locally the extension of $M$ mimics the fibrations $U \to W$ considered previously.
By (6.1), and in the same hypothesis on $K^0(X)$, the holonomy of flat RR fields over such a brane can be represented in terms of a virtual flat vector bundle $\xi = [E_0] - [E_1] \in K^0(X)$ of rank 0, restricted to $M$ as follows. Fix a spin$^c$ structure and Riemannian metric on $\widetilde{M}$ which coincide with those of the product $M \times I$ in a neighborhood of the boundary. Let $\slashed{D}^{\widetilde{M}}_{\widetilde{E}} : C^\infty(\widetilde{M}, \slashed{S}(\widetilde{M}) \otimes \widetilde{E}) \to C^\infty(\widetilde{M}, \slashed{S}(\widetilde{M}) \otimes \widetilde{E})$ be the canonical Atiyah–Singer operator of $\widetilde{M}$ with coefficients in $\widetilde{E}$, defined with respect to the global Szegö boundary conditions considered by Atiyah–Patodi–Singer (APS) [7]. Then the restriction of the Clifford algebra bundle $\slashed{S}(\widetilde{M})$ to $M$ may be identified with $\slashed{S}(M)$. Near the boundary, in $M \times I$, we have
$$\slashed{D}^{\widetilde{M}}_{\widetilde{E}} = \sigma \cdot \Big(\frac{\partial}{\partial u} + \slashed{D}^M_E\Big)$$
where $u$ is the inward normal coordinate and $\sigma\cdot$ is Clifford multiplication by the unit inward normal vector.
Let $\mathrm{spec}^0(T^M_E)$ denote the spectrum of the closure (3.4) of the twisted Atiyah–Singer operator on the chiral Hilbert space $(\mathcal{H}^M_E)^0 = L^2_{\mathbb{R}}(M, \slashed{S}^0(M) \otimes E; dg^M)$. It is a discrete unbounded subset of $\mathbb{R}$ with no accumulation points such that the
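eigenspaces are finite-dimensional subspaces of $(\mathcal{H}^M_E)^0$. An eigenvalue $\lambda$ is repeated in $\mathrm{spec}^0(T^M_E)$ according to its multiplicity. For $s \in \mathbb{C}$ with $\mathrm{Re}(s) \gg 0$, define the absolutely convergent series
$$\eta(s, \slashed{D}^M_E) = \sum_{\lambda \in \mathrm{spec}^0(T^M_E) \setminus \{0\}} \lambda |\lambda|^{-s-1}. \tag{6.5}$$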
Let $\eta(\slashed{D}^M_E)$ be the value of the meromorphic continuation of (6.5) at $s = 0$. This is called the APS eta-invariant [7] and it is a measure of the spectral asymmetry of the Atiyah–Singer operator $\slashed{D}^M_E$. The reduced eta-invariant is the geometric invariant defined by [8]
$$\Xi(\widetilde{M}, \widetilde{E}, \widetilde{\phi}) = \frac{\dim_{\mathbb{R}} \slashed{H}^M_E + \eta(\slashed{D}^M_E)}{2} \mod \mathbb{Z}, \tag{6.6}$$
where $\slashed{H}^M_E$ is the vector space of harmonic $E$-valued spinors on $M$ as in Theorem 4.1. Under an operator homotopy $t \mapsto (T^M_E)_t$, the quantity (6.6) is not a continuous function of $t$ but its jumps are due to eigenvalues $\lambda$ changing sign as they cross zero, and so it has only integer jump discontinuities. As a consequence, $\Xi(\widetilde{M}, \widetilde{E}, \widetilde{\phi})$ takes values in $\mathbb{R}/\mathbb{Z}$. By exponentiating we obtain a geometric invariant valued in the unit circle group $U(1) \subset \mathbb{C}$ defined by
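$$\Omega(\widetilde{M}, \widetilde{E}, \widetilde{\phi}) = \exp\big(2\pi i\, \Xi(\widetilde{M}, \widetilde{E}, \widetilde{\phi})\big). \tag{6.7}$$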
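The definitions (6.5)–(6.7) can be tried out on a toy spectrum. The following sketch is not taken from the text: it assumes the spectrum $\lambda_n = n + a$, $n \in \mathbb{Z}$ (the kind of spectrum produced by a flat twist on a circle), and it replaces the analytic continuation in $s$ by the damped sum $\sum_\lambda \mathrm{sign}(\lambda)\, e^{-t|\lambda|}$ at small $t$. For this particular spectrum both procedures give $1 - 2a$ when $0 < a < 1$. The function names are hypothetical.

```python
import cmath
import math


def eta_heat(a, t=1e-3, cutoff=40.0):
    """Damped spectral asymmetry sum_{lam != 0} sign(lam) * exp(-t * |lam|)
    for the toy spectrum lam_n = n + a, n in Z (an illustrative assumption)."""
    n_max = int(cutoff / t)  # terms beyond this are suppressed by about exp(-cutoff)
    total = 0.0
    for n in range(-n_max, n_max + 1):
        lam = n + a
        if lam == 0.0:  # zero modes are excluded from the sum, as in (6.5)
            continue
        total += math.copysign(1.0, lam) * math.exp(-t * abs(lam))
    return total


def kernel_dim(a):
    """Number of zero modes of the toy spectrum: one if a is an integer, else none."""
    return 1 if float(a).is_integer() else 0


def reduced_eta(a):
    """Toy analogue of (6.6): (dim ker + eta) / 2, reduced mod Z."""
    return ((kernel_dim(a) + eta_heat(a)) / 2.0) % 1.0


def holonomy(a):
    """Toy analogue of (6.7): exp(2 * pi * i * Xi)."""
    return cmath.exp(2j * math.pi * reduced_eta(a))


if __name__ == "__main__":
    for a in (0.999, 0.0, 0.001, 0.25):
        print(a, eta_heat(a), reduced_eta(a), holonomy(a))
```

In this toy model the asymmetry jumps by 2 as $a$ passes through an integer and an eigenvalue crosses zero, while the reduced quantity stays near $1/2$ on both sides, which is the integer-jump behavior described above.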
Consider the collection of K-chains $(\widetilde{M}, \widetilde{E}, \widetilde{\phi})$, where now the spin$^c$ manifolds $\widetilde{M}$ can have boundary. The boundary of a K-chain is defined as $\partial(\widetilde{M}, \widetilde{E}, \widetilde{\phi}) = (M, E, \phi)$ in the notation above. The difference here from the definition of relative K-cycles $\Gamma(X, Y)$ is that the background D-branes are free to live anywhere in $X$, i.e. $\widetilde{\phi}(\widetilde{M}) \subset X$. In other words, we take $Y = X$ and define K-chains to be the relative K-cycles $\Gamma(X, X)$. Two isomorphic K-chains $(\widetilde{M}_1, \widetilde{E}_1, \widetilde{\phi}_1)$ and $(\widetilde{M}_2, \widetilde{E}_2, \widetilde{\phi}_2)$ yield conjugate Atiyah–Singer operators, and so $\Xi$ is well-defined on the set of isomorphism classes $\Gamma(X, X)$. One has then the following behavior of (6.6) under the equivalence relations on K-chains described in Sec. 2.4.
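Proposition 6.2 ([13]). The map $\Xi : \Gamma(X, X) \to \mathbb{R}/\mathbb{Z}$ induced by (6.6) respects:
(i) Algebraic operation: $\Xi\big((\widetilde{M}_1, \widetilde{E}_1, \widetilde{\phi}_1) \sqcup (\widetilde{M}_2, \widetilde{E}_2, \widetilde{\phi}_2)\big) = \Xi(\widetilde{M}_1, \widetilde{E}_1, \widetilde{\phi}_1) + \Xi(\widetilde{M}_2, \widetilde{E}_2, \widetilde{\phi}_2)$;
(ii) Direct sum: $\Xi(\widetilde{M}, \widetilde{E}_1 \oplus \widetilde{E}_2, \widetilde{\phi}) = \Xi(\widetilde{M}, \widetilde{E}_1, \widetilde{\phi}) + \Xi(\widetilde{M}, \widetilde{E}_2, \widetilde{\phi})$; and
(iii) Vector bundle modification: $\Xi(\widehat{\widetilde{M}}, \widehat{\widetilde{E}}, \widetilde{\phi} \circ \pi) = \Xi(\widetilde{M}, \widetilde{E}, \widetilde{\phi})$.
Note that one does not say anything about the spin$^c$ bordism relation in Proposition 6.2, and in fact the eta-invariant $\eta(\slashed{D}^M_E)$ is not a spin$^c$ cobordism invariant [8]. Indeed, taking the quotient of $\Gamma(X, X)$ by the spin$^c$ bordism relation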
along with the relations of Proposition 6.2 gives the trivial K-homology group $K^t(X, X) = 0$, consistent with the assumptions made on the D-brane background $[M, E, \phi]$ above.
Given the flat RR-flux $\xi = [E_0] - [E_1]$ in $K^{-1}_{\mathbb{R}/\mathbb{Z}}(X)$, we can define classes $[M, \xi, \phi] := [M, F_0, \phi] - [M, F_1, \phi]$ in the K-homology of $W$ where $F_i := \phi^* \circ \pi_!(E_i)$ for $i = 0, 1$. The corresponding invariant
$$\Omega(\widetilde{M}, \widetilde{\xi}, \widetilde{\phi}) = \exp\big[2\pi i\, \big(\Xi(\widetilde{M}, \widetilde{F}_0, \widetilde{\phi}) - \Xi(\widetilde{M}, \widetilde{F}_1, \widetilde{\phi})\big)\big] \tag{6.8}$$
is then the holonomy [27] over the D-brane background with the given virtual Chan–Paton bundle. The above construction gives a K-homological description of the usual couplings that are inserted into the Type II string theory path integral [17].
Remark 6.3. Just as we arrived at the $C\ell_n$-Index Theorem 3.9, it is possible to extract a K-homology version of the APS index theorem in certain dimensionalities, whose reduction mod $\mathbb{Z}$ evaluated on differences of bundles $E_0$ and $E_1$ then yields the same holonomy (6.8). This is essentially a K-theory version [27, 29] of the index theorem for flat bundles [9, 43], which provides a topological formula for differences of the reduced eta-invariants (6.6) in terms of the direct image of the collapsing map $\zeta_! : K^{-1}_{\mathbb{R}/\mathbb{Z}}(W) \to \mathbb{R}/\mathbb{Z}$. In particular, in these dimensions $\Xi(\widetilde{M}, \widetilde{\xi}, \widetilde{\phi})$ is a spin$^c$ cobordism invariant. It is not clear how to use these couplings to cancel the worldvolume anomalies in the path integral, which arise in the low-energy effective field theory on the D-brane. In this regime the D-branes are genuinely described as spin$^c$ submanifolds of the spacetime $X$. On the other hand, the geometric K-homology formalism includes non-representable D-branes, which do not wrap homology cycles of spacetime represented by non-singular spin$^c$ submanifolds [54, 26], and thereby provides a description of the D-brane physics deeper into the stringy regime.
Acknowledgments
We thank G. Landi and V. Mathai for helpful discussions. We are grateful to J. Boersema for pointing out some errors in an earlier version of this manuscript. This work was supported in part by the EU-RTN Network Grant MRTN-CT-2004-005104. The work of R.M.G.R. was supported in part by FCT grant SFRH/BD/12268/2003. The work of A.V. was supported by the Mathematics Department at Heriot-Watt University, and in part by the German Research Foundation (Deutsche Forschungsgemeinschaft (DFG)) through the Institutional Strategy of the University of Göttingen.
References
[1] T. Asakawa, S. Sugimoto and S. Terashima, D-Branes, Matrix Theory and K-Homology, J. High Energy Phys. 0203 (2002) 034; arXiv:hep-th/0108085.
[2] T. Asakawa, S. Sugimoto and S. Terashima, D-Branes and KK-Theory in Type I String Theory, J. High Energy Phys. 0205 (2002) 007; arXiv:hep-th/0202165.
[3] M. F. Atiyah, Vector bundles and the Künneth formula, Topology 1 (1962) 245–248.
[4] M. F. Atiyah and I. M. Singer, Index theory for skew-adjoint Fredholm operators, Publ. Math. IHES 37 (1969) 5–26.
[5] M. F. Atiyah and I. M. Singer, The index of elliptic operators V, Ann. Math. 93 (1971) 139–149.
[6] M. F. Atiyah, R. Bott and A. Shapiro, Clifford modules, Topology 3 (1964) 3–38.
[7] M. F. Atiyah, V. K. Patodi and I. M. Singer, Spectral asymmetry and Riemannian geometry I, Math. Proc. Cambridge Phil. Soc. 77 (1975) 43–69.
[8] M. F. Atiyah, V. K. Patodi and I. M. Singer, Spectral asymmetry and Riemannian geometry II, Math. Proc. Cambridge Phil. Soc. 78 (1975) 405–432.
[9] M. F. Atiyah, V. K. Patodi and I. M. Singer, Spectral asymmetry and Riemannian geometry III, Math. Proc. Cambridge Phil. Soc. 79 (1976) 71–99.
[10] P. Baum and R. G. Douglas, K-homology and index theory, Proc. Symp. Pure Math. 38 (1982) 117–173.
[11] P. Baum and R. G. Douglas, Index theory, bordism and K-homology, Contemp. Math. 10 (1982) 1–33.
[12] P. Baum, N. Higson and T. Schick, On the equivalence of geometric and analytic K-homology, Pure Appl. Math. Quart. 3 (2007) 1–24; arXiv:math/0701484 [math.KT].
[13] M.-T. Benameur and M. Maghfoul, Differential characters in K-theory, Diff. Geom. Appl. 24 (2006) 417–432.
[14] O. Bergman, Tachyon condensation in unstable Type I D-brane systems, J. High Energy Phys. 0011 (2000) 015; arXiv:hep-th/0009252.
[15] O. Bergman, E. G. Gimon and P. Hořava, Brane transfer operations and T-duality of non-BPS states, J. High Energy Phys. 9904 (1999) 010; arXiv:hep-th/9902160.
[16] B. Blackadar, K-Theory for Operator Algebras (Cambridge University Press, 1998).
[17] J. de Boer, R. Dijkgraaf, K. Hori, A. Keurentjes, J. Morgan, D. R. Morrison and S. Sethi, Triples, fluxes and strings, Adv. Theor. Math. Phys. 4 (2002) 995–1186; arXiv:hep-th/0103170.
[18] J. L. Boersema, Real $C^*$-algebras, united K-theory, and the Künneth formula, K-Theory 26 (2002) 345–402; arXiv:math.OA/0208068.
[19] J. L. Boersema, Real $C^*$-algebras, united KK-theory, and the universal coefficient theorem, K-Theory 33 (2004) 107–149; arXiv:math.OA/0302335.
[20] L. Bonora and A. A. Bytsenko, Fluxes, brane charges and Chern morphisms of hyperbolic geometry, Class. Quant. Grav. 23 (2006) 3895–3916; arXiv:hep-th/0602162.
[21] A. K. Bousfield, A classification of K-local spectra, J. Pure Appl. Algebra 66 (1990) 121–163.
[22] J. Brodzki, V. Mathai, J. Rosenberg and R. J. Szabo, D-branes, RR-fields and duality on noncommutative manifolds, Comm. Math. Phys. 277 (2008) 643–706; arXiv:hep-th/0607020.
[23] U. Bunke, A K-theoretic relative index theorem and Callias-type Dirac operators, Math. Ann. 303 (1995) 241–279.
[24] A. Connes and G. Skandalis, The longitudinal index theorem for foliations, Publ. Res. Inst. Math. Sci. 20 (1984) 1139–1183.
[25] D.-E. Diaconescu, G. W. Moore and E. Witten, $E_8$ gauge theory and a derivation of K-theory from M-theory, Adv. Theor. Math. Phys. 6 (2003) 1031–1134; arXiv:hep-th/0005090.
[26] J. Evslin and H. Sati, Can D-branes wrap nonrepresentable cycles?, J. High Energy Phys. 0610 (2006) 050; arXiv:hep-th/0607045.
[27] D. S. Freed, On determinant line bundles, in Mathematical Aspects of String Theory, ed. S.-T. Yau (World Scientific Publishing, 1987), pp. 189–238.
[28] D. S. Freed and M. J. Hopkins, On Ramond–Ramond fields and K-theory, J. High Energy Phys. 0005 (2000) 044; arXiv:hep-th/0002027.
[29] D. S. Freed and E. Witten, Anomalies in string theory with D-branes, Asian J. Math. 3 (1999) 819–851; arXiv:hep-th/9907189.
[30] D. S. Freed, G. W. Moore and G. Segal, The uncertainty of fluxes, Comm. Math. Phys. 271 (2007) 247–274; arXiv:hep-th/0605198.
[31] K. R. Goodearl, Notes on Real and Complex $C^*$-Algebras (Shiva Publishing, 1982).
[32] J. A. Harvey and G. W. Moore, Noncommutative tachyons and K-theory, J. Math. Phys. 42 (2001) 2765–2780; arXiv:hep-th/0009030.
[33] N. Higson, A primer on KK-theory, Proc. Symp. Pure Math. 51 (1990) 239–283.
[34] N. Higson and J. Roe, Analytic K-Homology (Oxford University Press, 2000).
[35] M. J. Hopkins and M. A. Hovey, Spin cobordism determines real K-theory, Math. Z. 210 (1992) 181–196.
[36] M. J. Hopkins and I. M. Singer, Quadratic functions in geometry, topology and M-theory, J. Diff. Geom. 70 (2005) 329–452; arXiv:math.AT/0211216.
[37] P. Hořava, Type IIA D-branes, K-theory and matrix theory, Adv. Theor. Math. Phys. 2 (1999) 1373–1404; arXiv:hep-th/9812135.
[38] M. Jakob, A bordism type description of homology, Manuscripta Math. 96 (1998) 67–80.
[39] M. Karoubi, K-Theory. An Introduction (Springer-Verlag, 1978).
[40] G. G. Kasparov, The operator K-functor and extensions of $C^*$-algebras, Math. USSR Izv. 16 (1981) 513–572.
[41] H. B. Lawson Jr. and M.-L. Michelsohn, Spin Geometry (Princeton University Press, 1989).
[42] B.-R. Li, Introduction to Operator Algebras (World Scientific Publishing, 1992).
[43] J. Lott, R/Z index theory, Comm. Anal. Geom. 2 (1994) 279–311.
[44] I. Madsen and J. Rosenberg, The universal coefficient theorem for equivariant K-theory of real and complex $C^*$-algebras, Contemp. Math. 70 (1988) 145–173.
[45] V. Mathai, M. K. Murray and D. Stevenson, Type I D-branes in an H-flux and twisted KO-theory, J. High Energy Phys. 0311 (2003) 053; arXiv:hep-th/0310164.
[46] Y. Matsuo, Topological charges of noncommutative soliton, Phys. Lett. B 499 (2001) 223–228; arXiv:hep-th/0009002.
[47] M. Matthey, Mapping the homology of a group to the K-theory of its $C^*$-algebra, Illinois Math. J. 46 (2002) 953–977.
[48] R. Minasian and G. W. Moore, K-theory and Ramond–Ramond charge, J. High Energy Phys. 9711 (1997) 002; arXiv:hep-th/9710230.
[49] G. W. Moore and E. Witten, Self-duality, Ramond–Ramond fields and K-theory, J. High Energy Phys. 0005 (2000) 032; arXiv:hep-th/9912279.
[50] G. W. Moore and N. Saulina, T-duality and the K-theoretic partition function of type IIA superstring theory, Nucl. Phys. B 670 (2003) 27–89; arXiv:hep-th/0206092.
[51] K. Olsen and R. J. Szabo, Brane descent relations in K-theory, Nucl. Phys. B 566 (2000) 562–598; arXiv:hep-th/9904153.
[52] K. Olsen and R. J. Szabo, Constructing D-branes from K-theory, Adv. Theor. Math. Phys. 4 (2000) 889–1025; arXiv:hep-th/9907140.
[53] V. Periwal, D-brane charges and K-homology, J. High Energy Phys. 0007 (2000) 041; arXiv:hep-th/0006223.
[54] R. M. G. Reis and R. J. Szabo, Geometric K-homology of flat D-branes, Comm. Math. Phys. 266 (2006) 71–122; arXiv:hep-th/0507043.
[55] J. Rosenberg and C. Schochet, The Künneth theorem and the universal coefficient theorem for Kasparov’s generalized K-functor, Duke Math. J. 55 (1987) 431–474.
[56] H. Schröder, K-Theory for Real $C^*$-Algebras and Applications (Wiley, 1993).
[57] A. Sen, SO(32) spinors of Type I and other solitons on brane-antibrane pair, J. High Energy Phys. 9809 (1998) 023; arXiv:hep-th/9808141.
[58] R. M. Switzer, Algebraic Topology. An Introduction (Springer-Verlag, 1978).
[59] R. J. Szabo, Superconnections, anomalies and non-BPS brane charges, J. Geom. Phys. 43 (2002) 241–292; arXiv:hep-th/0108043.
[60] R. J. Szabo, D-branes, tachyons and K-homology, Mod. Phys. Lett. A 17 (2002) 2297–2316; arXiv:hep-th/0209210.
[61] E. Witten, D-branes and K-theory, J. High Energy Phys. 9812 (1998) 019; arXiv:hep-th/9810188.
[62] E. Witten, Overview of K-theory applied to strings, Int. J. Mod. Phys. A 16 (2001) 693–706; arXiv:hep-th/0007175.
[63] Z. Yosimura, Universal coefficient sequences for cohomology theories of CW-spectra, Osaka J. Math. 16 (1979) 201–217.
October 26, 2009 11:30 WSPC/148-RMP
J070-00385
Reviews in Mathematical Physics Vol. 21, No. 9 (2009) 1145–1195 © 2009 by Michael K.-H. Kiessling
THE VLASOV CONTINUUM LIMIT FOR THE CLASSICAL MICROCANONICAL ENSEMBLE∗
MICHAEL K.-H. KIESSLING Department of Mathematics, Rutgers University, Piscataway NJ 08854, USA
[email protected]
Received 12 February 2009
Revised 11 August 2009
For classical Hamiltonian $N$-body systems with mildly regular pair interaction potential (in particular, $L^2_{\mathrm{loc}}$ integrability is required), it is shown that when $N \to \infty$ in a fixed bounded domain $\Lambda \subset \mathbb{R}^3$, with energy $E$ scaling as $E \propto N^2$, then Boltzmann’s ergodic ensemble entropy $S_\Lambda(N, E)$ has the asymptotic expansion $S_\Lambda(N, N^2\varepsilon) = -N \ln N + s_\Lambda(\varepsilon) N + o(N)$. Here, the $N \ln N$ term is combinatorial in origin and independent of the rescaled Hamiltonian, while $s_\Lambda(\varepsilon)$ is the system-specific Boltzmann entropy per particle, i.e. $-s_\Lambda(\varepsilon)$ is the minimum of Boltzmann’s $H$ function for a perfect gas of energy $\varepsilon$ subjected to a combination of externally and self-generated fields. It is also shown that any limit point of the $n$-point marginal ensemble measures is a linear convex superposition of $n$-fold products of the $H$-function-minimizing one-point functions. The proofs are direct, in the sense that (a) the map $E \mapsto S(E)$ is studied rather than its inverse $S \mapsto E(S)$; (b) no regularization of the microcanonical measure $\delta(E - H)$ is invoked; and (c) no detour via the canonical ensemble is taken. The proofs hold irrespective of whether microcanonical and canonical ensembles are equivalent or not.
Keywords: Classical statistical mechanics; microcanonical ensemble; unstable interactions; Vlasov continuum limit; entropy; n-point functions.
Mathematics Subject Classification 2000: 82B03, 82B05, 82B21
∗ © 2009 by Michael K.-H. Kiessling. This paper may be reproduced, in its entirety, for noncommercial purposes.
1. Introduction
The rigorous foundations of equilibrium statistical mechanics have largely been laid long ago [40, 36, 29, 30], but the most basic problem in classical statistical mechanics, namely the rigorous asymptotic evaluation of Gibbs’ microcanonical ensemble [15] in the limit of a large number $N$ of particles, has only been treated in an approximate way. The standard way of dealing with the microcanonical ensemble (a.k.a. Boltzmann’s ergodic ensemble [2]) in a rigorous manner [40, 28, 30] has been
to replace its singular ensemble measure by a regularized measure (usually also referred to as microcanonical, although quasi-microcanonical would seem a better name). In these approaches, one cannot take the limit of vanishing regularization; yet, since one can approximate the singular measure as closely as one pleases, “this is not completely unsatisfactory from a conceptual point of view” ([28, p. 4]). All the same, Lanford’s wording makes it plain that it is desirable to find a way to remove the regularization or to avoid it altogether.
Recently, the author noticed that after only minor modifications, Ruelle’s method [40] to establish the thermodynamic limit for Boltzmann’s ergodic ensemble entropy, taken per volume (or per particle), works without the need for any regularization of the ensemble measure [22]; a follow-up work on the thermodynamic limit of the correlation functions is planned. Taking “the thermodynamic limit” [40] means that the domain $\Lambda$ grows “evenly” with $N$ and such that $N/\mathrm{Vol}(\Lambda) \to \rho$ with $\rho$ a fixed number density, and the energy $E$ scales such that $E/\mathrm{Vol}(\Lambda) \to \varepsilon$ (or $E/N \to \varepsilon$, abusing notation), with $\varepsilon \in \mathbb{R}$ a fixed energy density (or energy per particle); this limit covers systems of interest in condensed matter physics or chemical physics, such as those with hard core or Lennard–Jones interactions.
In the present paper, we will be concerned with another limit $N \to \infty$, where $\Lambda$ is fixed and $E$ scales such that $E/N^2 \to \varepsilon$. This limit covers systems of interest in plasma and astrophysics, such as those with Coulomb or (mollified) Newton interactions. It is variably known$^{a}$ as a “thermodynamic mean-field limit”, a “self-averaging limit”, or “Vlasov limit”. We will study the Boltzmann ensemble entropy and the correlation functions.
The remainder of this paper is structured as follows. In Sec. 2, we collect the defining formulas of the ergodic/microcanonical ensemble for finite $N$ and explain which probabilistic quantities are of physical interest. In Sec.
3, we give a heuristic motivation for the Vlasov limit. In Sec. 4, we state our main theorems, ordered by increasing depth. Their proofs are given in Secs. 5.1–5.3. Section 6 lists some spin-offs of our results, and Sec. 7 closes our paper with an outlook on some open problems.
$^{a}$ The first two names refer to a Weiss-type “mean-field approximation” becoming exact in the limit, but we will not invoke any such approximation and speak of the Vlasov limit.
2. A Brief Review of the Ergodic Ensemble
For a Newtonian $N$-body system$^{b}$ in a domain $\Lambda \subset \mathbb{R}^3$ with Hamiltonian$^{c}$
$$H^{(N)}_\Lambda(p_1, \ldots, q_N) = \sum_{1 \le i \le N} \frac{1}{2} |p_i|^2 + \sum_{1 \le i < j \le N} W_\Lambda(q_i, q_j) + \sum_{1 \le i \le N} V^{(N)}_\Lambda(q_i), \tag{1}$$
$^{b}$ All particles belong to a single species. We use units of $mc^2$ for energy, $mc$ for momentum, and $h/mc$ for length, where $m$ is particle mass, $c$ the speed of light, $h$ Planck’s constant.
$^{c}$ It is understood that $W_\Lambda$ is symmetric, i.e. $W_\Lambda(q, \tilde{q}) = W_\Lambda(\tilde{q}, q)$, and not reducible to a sum of one-body terms; other details of $W$ and $V$ will be specified in the next section.
the ergodic/microcanonical ensemble is a family $\{\mathbf{X}^{(N)}_k \mid k \in \mathbb{N}\}$ of i.i.d. copies of a random vector $\mathbf{X}^{(N)} = (P_1, Q_1, \ldots, P_N, Q_N) \in (\mathbb{R}^3 \times \Lambda)^N$ distributed according to the stationary$^{d}$ single-system a priori probability measure
$$\mu^{(N)}_E(d^{6N}X) = \big(N!\, \Omega'_{H^{(N)}_\Lambda}(E)\big)^{-1}\, \delta\big(E - H^{(N)}_\Lambda(X^{(N)})\big)\, d^{6N}X, \tag{2}$$
where $X^{(N)} := (p_1, q_1, \ldots, p_N, q_N) \in (\mathbb{R}^3 \times \Lambda)^N$ and $d^{6N}X$ is $6N$-dimensional Lebesgue measure, and where
$$\Omega'_{H^{(N)}_\Lambda}(E) = \frac{1}{N!} \int \delta\big(E - H^{(N)}_\Lambda(X^{(N)})\big)\, d^{6N}X \tag{3}$$
is known as the structure function$^{e}$; here, the $'$ means derivative with respect to $E$ of
$$\Omega_{H^{(N)}_\Lambda}(E) = \frac{1}{N!} \int \chi_{\{H^{(N)}_\Lambda < E\}}\, d^{6N}X, \tag{4}$$
where $\chi_{\{H^{(N)}_\Lambda < E\}}$ is the characteristic function of the set $\{H^{(N)}_\Lambda(X^{(N)}) < E\} \subset (\mathbb{R}^3 \times \Lambda)^N$, over which the integrals extend. Thus, if $\mathcal{B}$ denotes the Borel sets of $(\mathbb{R}^3 \times \Lambda)^N \subset \mathbb{R}^{6N}$, then $((\mathbb{R}^3 \times \Lambda)^N, \mathcal{B}, \mu^{(N)}_E)$ is the single-system probability space; so if $B \in \mathcal{B}$ is a Borel set, then the probability of $\mathbf{X}^{(N)}$ being in $B$ is$^{f}$
$$\mathrm{Prob}(\mathbf{X}^{(N)} \in B) = \mu^{(N)}_E(B). \tag{5}$$
Clearly, $\mathrm{Prob}(\mathbf{X}^{(N)} \in B) = 0$ unless $B \cap \{H^{(N)}_\Lambda = E\} \neq \emptyset$; put differently, $\mathrm{Prob}(H^{(N)}_\Lambda(\mathbf{X}^{(N)}_k) = E) = 1$ $\forall\, k \in \mathbb{N}$. Moreover, $\mathrm{Prob}(\mathbf{X}^{(N)} \in d^{6N}X) = \mu^{(N)}_E(d^{6N}X)$ is the a priori probability for $\mathbf{X}^{(N)}$ to be in $d^{6N}X$ about $X^{(N)}$.
$^{d}$ Stationarity is defined with respect to the flow generated by the Hamiltonian $H^{(N)}_\Lambda(p_1, \ldots, q_N)$.
$^{e}$ The $N!$ term was supplied by Gibbs to resolve Gibbs’ paradox. It cancels out in (2).
$^{f}$ It is tacitly understood that whenever one encounters a physically interesting subset $L$ of a Borel null-set which is not itself Borel measurable, then we use the Lebesgue $\sigma$-algebra.
The ergodic ensemble is probabilistically meaningful for all $N \in \mathbb{N}$, yet its thermodynamic significance emerges only in the large $N$ regime (Avogadro’s $N \approx 10^{23}$) when it makes sense to speak of a solid, a liquid, a plasma (etc.) on macroscopic scales of space and time. Since the typical physical characteristics of solids and liquids (etc.) are not revealed by “picturing” such systems as individual points in $\mathbb{R}^{6N}$, one associates each microstate $X^{(N)}$ with a unique family of empirical $n$-point “densities” on $(\mathbb{R}^3 \times \Lambda)^n$, $n = 1, 2, \ldots, N$. The normalized one-point “density” with $N$ atoms (empirical measure) is given by
$$\Delta^{(1)}_{X^{(N)}}(p, q) = \frac{1}{N} \sum_{1 \le i \le N} \delta(p - p_i)\, \delta(q - q_i) \tag{6}$$
and the normalized two-point density with $N$ atoms (U-statistic of order 2) by
$$\Delta^{(2)}_{X^{(N)}}(p, q; p', q') = \frac{1}{N(N-1)} \sum_{1 \le i \neq j \le N} \delta(p - p_i)\, \delta(q - q_i)\, \delta(p' - p_j)\, \delta(q' - q_j); \tag{7}$$
similarly the empirical $n$-point densities with $n = 3, \ldots, N$ are defined. The map $X^{(N)} \mapsto \{\Delta^{(n)}_{X^{(N)}}\}_{n=1}^{N}$ is bijective if we insist that the particular labeling given to us algebraically with the right-hand side of (6) or the right-hand side of (7) etc. has an intrinsic meaning; however, considered purely measure theoretically as “density” on $\mathbb{R}^{6n}$ each $\Delta^{(n)}_{X^{(N)}}$ is invariant under the permutation group applied to the particular labeling, and since there are $N!$ distinct $X^{(N)}$s obtained by permuting the particle labels, the map $X^{(N)} \mapsto \{\Delta^{(n)}_{X^{(N)}}\}_{n=1}^{N}$ is many-to-one in this sense. Understood in this measure theoretic way the empirical $n$-point densities do not depend on the unphysical (though mathematically convenient) labeling of the particles,$^{g}$ and so are physically more natural than points in $\mathbb{R}^{6N}$; when $n$ is small, say $n = 1$ or 2, then $\Delta^{(n)}_{X^{(N)}}$ is also “physically more manifest” than a point in $\mathbb{R}^{6N}$. Hence, the probabilities of interest to physicists will be of the form
$$\mathrm{Prob}\big(\Delta^{(n)}_{\mathbf{X}^{(N)}} \in B\big) \tag{8}$$
for physically significant measurable sets $B$ in $P^s((\mathbb{R}^3 \times \Lambda)^n)$, the permutation-symmetric probability measures on $(\mathbb{R}^3 \times \Lambda)^n$. Among the physically significant sets are balls (with respect to a suitable topology, still to be chosen) centered at a representative $n$-point density function for a solid, liquid, . . . , or complements of such balls. As for the topology, the fine (TV) topology for $P^s((\mathbb{R}^3 \times \Lambda)^n)$ is not suitable as it is equivalent to discriminating between different $\Delta^{(n)}_{X^{(N)}}$ with respect to the Borel sigma algebra of $\mathbb{R}^{6N}_{\neq}/S_N$ (see footnote g). Practically accessible$^{h}$ are only some considerably less finely resolved events, such as the empirical $n$-point densities $\Delta^{(n)}_{X^{(N)}}$ distinguished with respect to the weak topology, quantified by a convenient Kantorovich–Rubinstein metric $d_{KR}$ on $P^s((\mathbb{R}^3 \times \Lambda)^n)$. Two very different points in $\mathbb{R}^{6N}_{\neq}/S_N$, say $\widetilde{X}^{(N)}$ and $\widetilde{Y}^{(N)}$, can map into two densities $\Delta^{(n)}_{\widetilde{X}^{(N)}}$ and $\Delta^{(n)}_{\widetilde{Y}^{(N)}}$ which in weak topology on $P^s((\mathbb{R}^3 \times \Lambda)^n)$ are virtually indistinguishable; here $X^{(N)}$ and $Y^{(N)}$ are any two representative points out of the $N!$ points each, which constitute the pre-image in $\mathbb{R}^{6N}$ of $\widetilde{X}^{(N)}$, respectively $\widetilde{Y}^{(N)}$. So the probabilities of interest to physicists are typically of the form
$$\mathrm{Prob}\big(d_{KR}(\Delta^{(n)}_{\mathbf{X}^{(N)}}, f^{(n)}_{\mathrm{eq}}) > \delta\big), \tag{9}$$
where $f^{(n)}_{\mathrm{eq}}(p^{(1)}, q^{(1)}; \ldots; p^{(n)}, q^{(n)}) \in (P \cap C^0_b)((\mathbb{R}^3 \times \Lambda)^n)$ is an equilibrium density function, defined (in the simplest of all cases) implicitly as the unique function for which (after rescaling of variables and parameters, if necessary)
$$\mathrm{Prob}\big(d_{KR}(\Delta^{(n)}_{\mathbf{X}^{(N)}}, f^{(n)}_{\mathrm{eq}}) > \delta\big) \xrightarrow{N \to \infty} 0 \quad \forall\, \delta > 0. \tag{10}$$
$^{g}$ So, physically we can identify these $N!$ distinct $X^{(N)}$s with a single $N$-point configuration in $\mathbb{R}^3 \times \Lambda$, which is a point $\widetilde{X}^{(N)} \in \mathbb{R}^{3N} \times \Lambda^N_{\neq}/S_N$. The subscript $\neq$ means that coincidence points are removed, and $S_N$ is the symmetric group of order $N$. We should also write $\Delta^{(n)}_{\widetilde{X}^{(N)}}$, with the understanding that as measure $\Delta^{(n)}_{\widetilde{X}^{(N)}}$ is given by $\Delta^{(n)}_{X^{(N)}}$ for any of the $N!$ points $X^{(N)}$ in the pre-image in $\mathbb{R}^{6N}$ of $\widetilde{X}^{(N)}$. The map $\widetilde{X}^{(N)} \mapsto \{\Delta^{(n)}_{\widetilde{X}^{(N)}}\}_{n=1}^{N}$ is bijective.
$^{h}$ Even if the balls in TV topology were practically accessible, for $N \gg 1$ the amount of information would be simply overwhelming and not very illuminating.
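The law-of-large-numbers statement (10) and the permutation invariance of the empirical measure can be illustrated in the simplest possible setting. The sketch below is an illustration under stated assumptions, not the construction used in the paper: it takes a single position coordinate in $\Lambda = [0, 1]$, draws the $q_i$ i.i.d. uniformly (which is the position marginal of (2) for a perfect gas in a box), and uses the standard one-dimensional fact that the Kantorovich–Rubinstein distance equals the integral of the absolute difference of the two distribution functions. The function names are hypothetical.

```python
import random


def _abs_linear_integral(level, lo, hi):
    """Exact integral of |level - q| for q in [lo, hi]."""
    if hi <= lo:
        return 0.0
    if level <= lo:  # integrand is q - level on the whole interval
        return 0.5 * ((hi - level) ** 2 - (lo - level) ** 2)
    if level >= hi:  # integrand is level - q on the whole interval
        return 0.5 * ((level - lo) ** 2 - (level - hi) ** 2)
    # the sign changes at q = level inside the interval
    return 0.5 * ((level - lo) ** 2 + (hi - level) ** 2)


def kr_distance_to_uniform(samples):
    """Kantorovich-Rubinstein (Wasserstein-1) distance on [0, 1] between the
    empirical measure (1/N) sum_i delta(q - q_i) and the uniform density,
    computed as the integral of |F_N(q) - q| with F_N the empirical CDF."""
    xs = sorted(samples)  # sorting discards the particle labels
    n = len(xs)
    total = 0.0
    left = 0.0
    for k, right in enumerate(xs + [1.0]):
        level = k / n  # F_N equals k/N between the k-th and (k+1)-th sorted point
        total += _abs_linear_integral(level, left, right)
        left = right
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (100, 10_000):
        q = [rng.random() for _ in range(n)]
        print(n, kr_distance_to_uniform(q))
```

Because the distance depends on the sample only through its sorted values, relabeling the particles leaves it unchanged, and for i.i.d. draws it shrinks roughly like $N^{-1/2}$.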
In these simplest of all cases, (10) also explains what is meant by a “representative $n$-point density function”; and whether $f^{(n)}_{\mathrm{eq}}$ represents a solid, liquid, gas, etc., depends on the specific configurational correlations exhibited by $f^{(n)}_{\mathrm{eq}}$. In more complicated (and more interesting) situations, several “competing” equilibrium functions $f^{(n)}_{\mathrm{eq}}$ may exist, and (10) has to be modified accordingly.
The “simplest case” scenario just described was discovered by Boltzmann ([2, p. 442]), based on his explicit evaluation of (2) for the perfect gas. He realized that when $H^{(N)}_\Lambda$ is the perfect gas Hamiltonian and $N \gg 1$, then basically every point of $\{H^{(N)}_\Lambda = E\}$ (identified with an $n$-point density through the map $X^{(N)} \mapsto \Delta^{(n)}_{X^{(N)}}$) lies in the vicinity (with respect to weak topology) of one and the same equilibrium density function $f^{(n)}_{\mathrm{eq}}$ at that energy $E$, and given $n$. When $H^{(N)}_\Lambda$ sports non-trivial pair interactions, Boltzmann’s description needs to be modified slightly to account for the phenomenon of phase transitions. While there can hardly be a doubt that Boltzmann’s insight into (2) is correct, the rigorous results which support his assessment have been obtained not for (2) but for some regularized approximation of this singular measure [40, 29, 30]. In this paper we will finally vindicate Boltzmann’s ideas in the Vlasov regime of the relevant class of Hamiltonians (1).
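Boltzmann’s perfect-gas scenario can be probed numerically. For $H = \sum_i \frac{1}{2}|p_i|^2$ the momentum part of (2) is the uniform measure on the sphere $|p|^2 = 2E$ in $\mathbb{R}^{3N}$, which can be sampled by normalizing a Gaussian vector. The sketch below is an illustration only: the choice $E = N\varepsilon$ with $\varepsilon = 3/2$, the particle number, and the function names are all assumptions. It checks that the momentum components of a single sampled microstate already look Maxwellian, in that the second moment equals $2E/(3N)$ by construction and the fourth moment is close to three times its square.

```python
import math
import random


def sample_energy_shell(n_particles, energy, rng):
    """Draw momenta uniformly from the sphere sum_i |p_i|^2 / 2 = energy in R^{3N}
    by normalizing a standard Gaussian vector (perfect gas, no interactions)."""
    dim = 3 * n_particles
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in g))
    radius = math.sqrt(2.0 * energy)
    return [radius * x / norm for x in g]


def moments(components):
    """Second and fourth empirical moments of the momentum components."""
    n = len(components)
    m2 = sum(x ** 2 for x in components) / n
    m4 = sum(x ** 4 for x in components) / n
    return m2, m4


if __name__ == "__main__":
    rng = random.Random(0)
    n_particles = 10_000
    energy = 1.5 * n_particles  # assumed scaling E = N * eps with eps = 3/2
    p = sample_energy_shell(n_particles, energy, rng)
    m2, m4 = moments(p)
    print("second moment:", m2, "expected", 2.0 * energy / (3 * n_particles))
    print("m4 / m2^2:", m4 / m2 ** 2, "Gaussian value is 3")
```

A single draw from the energy shell suffices here, which is the point of Boltzmann’s observation: for large $N$ almost every microstate on the shell has nearly the same empirical one-point density.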
3. Heuristic Considerations on the Vlasov Limit
For the ergodic ensemble to exhibit a Vlasov regime, the Hamiltonian (1) needs to satisfy additional conditions. In particular, a necessary condition on the symmetric and irreducible pair potential $W_\Lambda$ is local integrability, i.e. $W_\Lambda(q, \cdot) \in L^1(B_r(q) \cap \Lambda)$ $\forall\, q \in \Lambda$. We remark that for the existence of a dynamical Vlasov regime the local integrability of the forces derived from $W_\Lambda$ is mandatory, viz. $\nabla_q W_\Lambda(q, \cdot) \in L^1(B_r(q) \cap \Lambda)$ $\forall\, q \in \Lambda$. Coulomb’s electrical and Newton’s gravitational interactions belong in either class. Physically meaningful external potentials $V^{(N)}_\Lambda$ are continuous for $q \in \Lambda$; it has minor technical advantages to assume that $V^{(N)}_\Lambda$ is actually continuous also at the boundary, i.e. $\lim_{q' \to q} V^{(N)}_\Lambda(q') = V^{(N)}_\Lambda(q)$ for all $q \in \partial\Lambda$ and $q' \in \Lambda$. For convenience we assume that $\inf H^{(N)}_\Lambda(p_1, \ldots, q_N) = \min H^{(N)}_\Lambda(p_1, \ldots, q_N) = E_g(N) > -\infty$, and call $E_g(N)$ the $N$-body ground state energy;$^{i}$ Newton’s gravitational interactions need to be regularized to achieve $E_g(N) > -\infty$.
$^{i}$ Presumably boundedness below is not technically necessary. We expect that pair interactions which diverge logarithmically to $-\infty$ can be accommodated but require additional weak compactness estimates, e.g. in some $L^p$ space; cf. [25].
In the introduction, we have already mentioned that the Vlasov limit scaling for such interactions is $E \sim N^2 \varepsilon$ for $N \gg 1$. We now explain why. Integrating (6) over $p$-space $\mathbb{R}^3$ gives a normalized one-point “density” (empirical measure) on $\Lambda$
with $N$ atoms, which by abuse of notation we denote as follows,
$$\Delta^{(1)}_{X^{(N)}}(q) \equiv \int_{\mathbb{R}^3} \Delta^{(1)}_{X^{(N)}}(p, q)\, d^3p = \frac{1}{N} \sum_{1 \le i \le N} \delta(q - q_i). \tag{11}$$
Whenever Boltzmann’s simplest scenario holds, then there is an equilibrium density $\rho_{E,N} \in (P \cap C^0_b)(\Lambda)$, depending on $N$ ($\gg 1$) and $E$, such that $\Delta^{(1)}_{X^{(N)}}(q) \approx \rho_{E,N}(q)$ for overwhelmingly most $X^{(N)}$ distributed by (2), where “$\approx$” means the two “densities” do not differ by much in a conventional Kantorovich–Rubinstein metric $d_{KR}$. This suggests that when $\Lambda \subset \mathbb{R}^3$ is fixed and $N \to \infty$ together with $E \to \infty$ such that $E/N^\alpha \to \varepsilon$ for a yet-to-be determined $\alpha$, then $\rho_{E,N} \xrightarrow{N \to \infty} \rho_\varepsilon \in (P \cap C^0_b)(\Lambda)$ and $\int_{\mathbb{R}^3} \Delta^{(1)}_{X^{(N)}}(p, q)\, d^3p \xrightarrow{N \to \infty} \rho_\varepsilon(q)$, weakly. Implementing this law-of-large-numbers type scenario inevitably leads to $\alpha = 2$, as is most easily seen if we assume for a moment that $W_\Lambda \in C^0_b(\Lambda \times \Lambda)$. Then $q \mapsto W_\Lambda(q, q)$ is a bounded continuous function in $\Lambda$ and we can write
$$\begin{aligned} H^{(N)}_\Lambda(X^{(N)}) ={}& N \iint \frac{1}{2} |p|^2\, \Delta^{(1)}_{X^{(N)}}(p, q)\, d^3p\, d^3q \\ &+ N \iint \Big(V^{(N)}_\Lambda(q) - \frac{1}{2} W_\Lambda(q, q)\Big) \Delta^{(1)}_{X^{(N)}}(p, q)\, d^3p\, d^3q \\ &+ N^2 \iiiint \frac{1}{2} W_\Lambda(q, \tilde{q})\, \Delta^{(1)}_{X^{(N)}}(p, q)\, d^3p\, d^3q\, \Delta^{(1)}_{X^{(N)}}(\tilde{p}, \tilde{q})\, d^3\tilde{p}\, d^3\tilde{q}, \end{aligned} \tag{12}$$
and when $\int_{\mathbb{R}^3} \Delta^{(1)}_{X^{(N)}}(p, q)\, d^3p \approx \rho_\varepsilon(q)$, we find
$$\begin{aligned} H^{(N)}_\Lambda(X^{(N)}) \approx{}& N \iint \frac{1}{2} |p|^2\, \Delta^{(1)}_{X^{(N)}}(p, q)\, d^3p\, d^3q \\ &+ N \int \Big(V^{(N)}_\Lambda(q) - \frac{1}{2} W_\Lambda(q, q)\Big) \rho_\varepsilon(q)\, d^3q \\ &+ N^2 \iint \frac{1}{2} W_\Lambda(q, \tilde{q})\, \rho_\varepsilon(q)\, \rho_\varepsilon(\tilde{q})\, d^3q\, d^3\tilde{q}. \end{aligned} \tag{13}$$
The last term clearly scales $\propto N^2$ because $W_\Lambda$ and $\rho_\varepsilon$ are independent of $N$. In a sense this already establishes the $E \propto N^2$ scaling. However, we have yet to consider the terms on the first two lines on the right-hand side of (13). It would seem that these scale $\propto N$ and so, for large $N$, would become insignificant as compared to the one in the last line, but only the $N \int \frac{1}{2} W_\Lambda(q, q)\, \rho_\varepsilon(q)\, d^3q$ contribution will surely become insignificant$^{j}$ for large $N$, for the same reasons for why the last one scales $\propto N^2$ ($W_\Lambda$ and $\rho_\varepsilon$ do not depend on $N$).
$^{j}$ Incidentally, this indicates that the Vlasov limit does not require the continuity of $W_\Lambda$, the only purpose of which was to furnish identity (12) which involves $W_\Lambda(q, q)$.
As for the external potential $V^{(N)}_\Lambda(q)$, the superscript $(N)$ indicates that we may want to adjust it to the number of particles in the system on which it acts in order to retain a noticeable effect
October 26, 2009 11:30 WSPC/148-RMP
J070-00385
The Vlasov Continuum Limit for the Classical Microcanonical Ensemble
1151
(N )
when N becomes large. So, in particular, we can set VΛ (q) = N VΛ (q) [or = (N ) (N − 1)VΛ (q)], with VΛ (q) independent of N , and find N VΛ (q)ρε (q)d3 q = N 2 VΛ (q)ρε (q)d3 q [+O(N )], scaling ∝ N 2 (in leading order), hence remaining significant in (13) as N becomes large. And as to the kinetic energy term, it is (1) important to realize that R3 ∆X (N ) (p, q)d3 p ≈ ρε (q) ∈ (P ∩ C0b )(R3 ) does not (1)
imply that ∆X (N ) (p, q) ≈ fε (p, q) ∈ (P ∩ C0b )(R3 × Λ). For instance, we can have (1)
that N 3/2 ∆X (N ) (N 1/2 p, q) ≈ fε (p, q) ∈ (P ∩ C0b )(R3 × Λ) so that a significant fraction of the energy will be distributed over the kinetic degrees of freedom,k and then, up to terms of O(N ), we find 1 2 |p| + VΛ (q) fε (p, q)d3 p d3 q H (N ) (X (N ) ) ≈ N 2 2 1 ˜ )fε (p, q)fε (˜ ˜ )d3 p d3 q d3 p˜ d3 q˜ . (14) WΛ (q, q p, q + 2 This scaling scenario can be verified explicitly for the perfect gas (WΛ ≡ 0) by inspecting Boltzmann’s calculations, and it is reasonable to expect that it will continue to hold for a physically interesting class of WΛ ≡ 0. To summarize, the Vlasov limit for the Hamiltonian (1) with V (N ) = N V means N →∞ (1) that N 3/2 ∆X(N ) (N 1/2 p, q) −−−−→ fε (p, q) weakly in P(R3 × Λ), with fε (p, q) ∈ N →∞
(P ∩ C0b )(R3 × Λ), and N −2 H (N ) (X (N ) ) −−−−→ E(fε ) = ε > εg , where 1 2 |p| + VΛ (q) f (p, q)d3 p d3 q E(f ) = 2 1 ˜ )d3 p d3 q d3 p˜ d3 q˜ + WΛ (q, q˜ )f (p, q)f (˜ p, q 2
is the “energy of f ”, and where εg = inf f ∈P(R3 ×Λ) E(f ) is given by 1 WΛ (q, q˜ )ρ(q)ρ(˜ q )d3 q d3 q˜ . εg = inf VΛ (q)ρ(q)d3 q + ρ∈P(Λ) 2
(15)
(16)
4. The Vlasov Limit for Boltzmann’s Ergode

We now state our main results about the Vlasov scaling limit for Boltzmann’s ergodic ensemble of $N$-body systems in a format which will be recognized as the familiar folklore by anyone with a joint expertise in Vlasov theory and statistical mechanics. We will also utilize some less familiar notions.

In the following, $\Lambda\subset\mathbb{R}^3$ is a bounded, connected domain (open) which does not depend on $N$. The upshot of the previous section is that if we want the external potential to remain significant when $N$ gets large, then our $N$-body dynamics in $\Lambda$ will be governed by Hamiltonians (1) of the special type
\[
H^{(N)}_\Lambda(p_1,\dots,q_N) = \sum_{1\le i\le N}\Big(\tfrac12|p_i|^2+(N-1)V_\Lambda(q_i)\Big)+\sum_{1\le i<j\le N}W_\Lambda(q_i,q_j), \tag{17}
\]

^k Unless $E$ is the ground state energy for which all particle momenta vanish, indeed.
with the single particle potential $V_\Lambda$ and the pair interaction $W_\Lambda$ independent of $N$. We choose $N-1$ rather than $N$ as scaling for $V^{(N)}_\Lambda$ because then we can absorb $V_\Lambda$ and $W_\Lambda$ together in a new $N$-independent effective pair interaction $U_\Lambda(q,\tilde q) := W_\Lambda(q,\tilde q)+V_\Lambda(q)+V_\Lambda(\tilde q)$. This does not affect any of our results (as we will prove), but the Hamiltonian (17) can be recast shorter as
\[
H^{(N)}_\Lambda(p_1,\dots,q_N) = \sum_{1\le i\le N}\tfrac12|p_i|^2+\sum_{1\le i<j\le N}U_\Lambda(q_i,q_j). \tag{18}
\]
Since heuristically we expect for a Hamiltonian system with Hamiltonian (18) under Vlasov scaling that $N^{3/2}\Delta^{(1)}_{X^{(N)}}(N^{1/2}p,q)\xrightarrow{N\to\infty}f_\varepsilon(p,q)$ weakly, with $f_\varepsilon(p,q)\in(\mathfrak{P}\cap C^0_b)(\mathbb{R}^3\times\Lambda)$, we also expect that the rescaled particle momentum random vectors $N^{-1/2}P_i$ converge in distribution, implying that $\sum_i\tfrac12|P_i|^2\approx N^2\varepsilon_{\rm kin}$ for $\mu^{(N)}_E$-most $X^{(N)}$, where $\varepsilon_{\rm kin}$ is the kinetic energy contribution to $\varepsilon$. We find it more convenient to work with random variables which themselves converge in distribution and so re-scale the momentum variables as $p_k = N^{1/2}\tilde p_k$ in (18); or in more economical notation: we replace $p_k\to N^{1/2}p_k$ in (18). With this minor additional abuse of notation our Hamiltonian finally reads
\[
H^{(N)}_\Lambda(p_1,\dots,q_N) = N\sum_{1\le i\le N}\tfrac12|p_i|^2+\sum_{1\le i<j\le N}U_\Lambda(q_i,q_j). \tag{19}
\]
Our main results will be proved under the following hypotheses on $U_\Lambda(\check q,\hat q)$:

(H1) Symmetry: $U_\Lambda(\check q,\hat q) = U_\Lambda(\hat q,\check q)$.
(H2) Lower Semi-Continuity: $U_\Lambda(\check q,\hat q)$ is l.s.c. on $\overline{\Lambda}\times\overline{\Lambda}$.
(H3) Sublevel Set Regularity: $\iint\chi_{\{U_\Lambda(\check q,\hat q)-\min U_\Lambda<\epsilon\}}\,d^3\check q\,d^3\hat q>0$.
(H4) Local Square Integrability: $U_\Lambda(q,\cdot)\in L^2(B_r(q)\cap\Lambda)$ $\forall\, q\in\Lambda$.
(H5) Confinement: $U_\Lambda(\check q,\hat q) = +\infty$ whenever $\check q\notin\overline{\Lambda}$ or $\hat q\notin\overline{\Lambda}$.
Hypothesis (H1) is a consequence for $W_\Lambda$ of Newton’s “actio equals re-actio”, plus the symmetrized added contribution of $V_\Lambda$, both of which need no further commentary. Hypothesis (H2) is satisfied by many important pair interactions invoked in physics, though not by all. For instance, the Coulomb pair potential $U^{\rm Coul}_\Lambda(\check q,\hat q) = 1/|\check q-\hat q|$ for $\check q\neq\hat q$ satisfies (H2) after also setting $U^{\rm Coul}_\Lambda(q,q)\equiv u$ for any particular $u\in\mathbb{R}$. On the other hand, the Newton pair potential $U^{\rm Newt}_\Lambda(\check q,\hat q) = -U^{\rm Coul}_\Lambda(\check q,\hat q)$ does not satisfy (H2) for any choice of $u$; however, the regularized Newton pair potential $U^{\rm Newt}_{\Lambda,{\rm reg}}(\check q,\hat q) = -(\chi_{B_r}*U^{\rm Coul}_\Lambda*\chi_{B_r})(\check q,\hat q)$ (where $f*g$ denotes the conventional convolution product of $f$ and $g$) does satisfy (H2). By (H2), there exists an $N$-dependent ground state energy $E_g(N)$, i.e. $H^{(N)}_\Lambda\ge E_g(N)>-\infty$, but the ground state configuration can have some unwanted features.^l Hypothesis (H3) eliminates the possibility of energetically isolated ground states, thus guaranteeing the existence of a fat set of minimizing sequences of configurations. Hypothesis (H4) is a little stronger than necessary, but it allows us to make convenient use of Chebychev’s inequality to prove a law of large numbers for the pair-specific interaction energy; the important Coulomb potential satisfies (H4). Note that (H4) implies local $L^1$ integrability of $U_\Lambda$, which is needed in various integrals featuring in the Vlasov limit. Note also that by (H2) and (H4) there exists an $N$-independent $\varepsilon_g\in\mathbb{R}$ defined by (16). In Appendix A, we show that (H1) and (H2) guarantee that the pair-specific ground state energy $E_g(N)/[N(N-1)]\equiv\varepsilon_g(N)$ is monotonic increasing with $N$, and using also (H3) and (H4) we show that $\varepsilon_g(N)\nearrow\varepsilon_g$ as $N\to\infty$. In Appendix A, we also show that if $U_\Lambda\ge 0$, then also $E_g(N)/N^2\equiv\tilde\varepsilon_g(N)\nearrow\varepsilon_g$ as $N\to\infty$. Hypothesis (H5) is really inherited from the dynamical theory of $N$ particles in $\Lambda\subset\mathbb{R}^3$, where one sets $V^{(N)}_\Lambda = +\infty$ for $q\notin\overline{\Lambda}$ to dynamically model confinement in a container; (H5) has a minor notational advantage by allowing us to treat physical space integrals like momentum space integrals as over all $\mathbb{R}^3$, the spatial cutoff to $\Lambda$ automatically being provided by the potential $V_\Lambda$ through $U_\Lambda$. Usually, (H5) is not listed explicitly as a hypothesis on the interactions even when spatial integrations are explicitly restricted to $\Lambda$. This concludes our commentary on the list of hypotheses (H1)–(H5).

All our results (except Proposition 7 in Appendix A) will be formulated and proved under the convenient assumption that $U_\Lambda\ge 0$, so that $\varepsilon_g\ge 0$. Since $U_\Lambda$ has a minimum in $\overline{\Lambda}^2$, by (H2), and since the physics of our dynamical system does not change if we simply add a constant to $U_\Lambda$, we may assume that $U_\Lambda\ge 0$ without loss of generality.
We emphasize that this choice is merely for convenience, given (H2), and so is not listed as another hypothesis.

The simplest objects of interest are the thermodynamic functions. Since the 1960s, techniques based on monotonicity, convexity and super-additivity estimates have been developed to prove their existence and regularity in the limit $N\to\infty$, which avoids having to control the more sophisticated objects of interest, the correlation functions. For the traditional thermodynamic limit scaling, see Ruelle’s book [40] and [22] for a recent extension of Ruelle’s arguments to Boltzmann’s Ergode proper. For the Vlasov scaling of the canonical ensemble, see [19]. To extend these arguments to Boltzmann’s Ergode proper with Vlasov scaling, our first goal is to show that the logarithm of the structure function (3) for the Hamiltonian (19), which yields Boltzmann’s ergodic ensemble entropy^m (cf. [15, Eq. (305)]),
\[
S_{H^{(N)}_\Lambda}(E) = \ln\Omega_{H^{(N)}_\Lambda}(E), \tag{20}
\]
admits the correct type of asymptotic expansion for $N\to\infty$ with $E = N^2\varepsilon$, and has the correct qualitative $\varepsilon$ dependence. The usual strategy can be put to work if we assume just a little more than (H1)–(H5). In this vein, we state:

^l For instance, in our example of the amended Coulomb pair potential one can choose $u = 0$, but then Thomson’s problem on $\mathbb{S}^2\subset\mathbb{R}^3$ [43] yields as ground state configuration always the spurious one (up to SO(3) action) for which all particle positions coincide. To avoid these spurious ground state configurations it is advisable to choose $u > 0$ huge.
^m Entropy is measured in units of $k_B$, where $k_B$ is Boltzmann’s constant.
Theorem 1. Let $H^{(N)}_\Lambda$ be given in (19), with $U_\Lambda$ satisfying conditions (H1) and (H5), but with (H2)–(H4) replaced by the single stronger condition:

(H6) Continuity: $U_\Lambda(\check q,\hat q)$ is continuous on $\overline{\Lambda}\times\overline{\Lambda}$. (21)

Let $\varepsilon>\varepsilon_g$, with $\varepsilon_g\ge 0$ defined as before. Then the ergodic ensemble entropy (20) has the following asymptotic expansion for $N\gg 1$,
\[
S_{H^{(N)}_\Lambda}(N^2\varepsilon) = -N\ln N + N s_\Lambda(\varepsilon) + o(N), \tag{22}
\]
where $s_\Lambda(\varepsilon)$ is the system-specific Boltzmann entropy per particle. The function $\varepsilon\mapsto s_\Lambda(\varepsilon)$ is continuous and strictly increasing for $\varepsilon>\varepsilon_g$.

We remark that the leading term of the right-hand side of (22) is purely combinatorial in origin and independent of the Hamiltonian $H^{(N)}_\Lambda$; it is solely due to the $N!$ in (3). System-specific information begins to show in the next to leading term, which is $O(N)$. The $o(N)$ term in (22) is presumably $O(\ln N)$.

We will also prove two upgrades of Theorem 1 (Theorems 1$^+$ and 1$^{++}$) which involve the decomposition of the system-specific Boltzmann entropy per particle $s_\Lambda(\varepsilon)$ into a “kinetic” and an “interaction” contribution. The discussion of this more technical material is postponed until Sec. 5.1.

While they do yield valuable qualitative information about the thermodynamic functions for the systems under study, in this case $s_\Lambda(\varepsilon)$, existence theorems such as Theorem 1 and their “proofs by sub-additivity” have the disadvantage that they do not characterize the limit objects in a way which would allow their systematic evaluation for physically interesting irreducible pair potentials $W_\Lambda$ and external one-body potentials $V_\Lambda$. It is this type of characterization that we are after, and in Sec. 5.2, we prove that $s_\Lambda(\varepsilon)$ satisfies the familiar maximum entropy variational principle for the entropy per particle of a perfect gas in a combination of self- and externally generated fields. More precisely, we prove the following strengthening of Theorem 1.
Theorem 2. Let $H^{(N)}_\Lambda$ be given in (19), with $U_\Lambda\ge 0$ satisfying (H1)–(H5). Let $\varepsilon>\varepsilon_g$. Then the Boltzmann entropy (20) has the asymptotic expansion
\[
S_{H^{(N)}_\Lambda}(N^2\varepsilon) = -N\ln N + N s_\Lambda(\varepsilon) + o(N) \tag{23}
\]
for $N\gg 1$, and the system-specific Boltzmann entropy per particle is given by
\[
s_\Lambda(\varepsilon) = -H_B(f_\varepsilon), \tag{24}
\]
where $H_B(f)$ is “Boltzmann’s H function” of $f$, which reads^n
\[
H_B(f) = \iint f(p,q)\ln\big(f(p,q)/e\big)\,d^3p\,d^3q, \tag{25}
\]
and where $f_\varepsilon$ is any minimizer of this H functional over the set of trial densities $\mathcal{A}_\varepsilon = \{f\in(\mathfrak{P}\cap L^1\cap L^1\ln L^1)(\mathbb{R}^3\times\Lambda) : \mathcal{E}(f)=\varepsilon\}$, where $\mathcal{E}(f)$ now reads
\[
\mathcal{E}(f) = \iint\tfrac12|p|^2 f(p,q)\,d^3p\,d^3q+\frac12\iiiint U_\Lambda(q,\tilde q)\,f(p,q)f(\tilde p,\tilde q)\,d^3p\,d^3q\,d^3\tilde p\,d^3\tilde q. \tag{26}
\]
Any minimizer $f_\varepsilon$ of $H_B(f)$ over the set $\mathcal{A}_\varepsilon$ is of the form
\[
f_\varepsilon(p,q) = \sigma_\varepsilon(p)\rho_\varepsilon(q), \tag{27}
\]
where $\rho_\varepsilon(q)$ solves the following fixed point equation on $q$ space,
\[
\rho_\varepsilon(q) = \frac{\exp\Big(-\vartheta_\varepsilon(\rho_\varepsilon)^{-1}\int_\Lambda U_\Lambda(q,\tilde q)\rho_\varepsilon(\tilde q)\,d^3\tilde q\Big)}{\int_\Lambda\exp\Big(-\vartheta_\varepsilon(\rho_\varepsilon)^{-1}\int_\Lambda U_\Lambda(\hat q,\tilde q)\rho_\varepsilon(\tilde q)\,d^3\tilde q\Big)\,d^3\hat q}, \tag{28}
\]
with $\vartheta_\varepsilon(\rho)$ given by
\[
\frac32\vartheta_\varepsilon(\rho) = \varepsilon-\frac12\iint_{\Lambda\times\Lambda}U_\Lambda(q,\tilde q)\rho(q)\rho(\tilde q)\,d^3q\,d^3\tilde q, \tag{29}
\]
and where $\sigma_\varepsilon(p) = \sigma(\rho_\varepsilon)(p)$, with $\sigma(\rho)(p)$ defined whenever $\vartheta_\varepsilon(\rho)>0$, by
\[
\sigma(\rho)(p) = \big(2\pi\vartheta_\varepsilon(\rho)\big)^{-\frac32}\exp\Big(-\tfrac12|p|^2/\vartheta_\varepsilon(\rho)\Big). \tag{30}
\]

Evidently, every minimizer of $H_B(f)$ over $\mathcal{A}_\varepsilon$ factors into a product of a Maxwellian on $p$ space and a purely space-dependent “self-consistent Boltzmann factor”.^o However, the Maxwellian in (27) is not autonomous from the Boltzmann factor in (27), as is manifest by the functional dependence of the (rescaled) temperature $\vartheta = \vartheta_\varepsilon(\rho_\varepsilon)$ on $\rho_\varepsilon$, see (30).

For a subset of $\varepsilon$ values the minimizer of $H_B(f)$ over $\mathcal{A}_\varepsilon$ may not be unique, but all minimizers produce the same asymptotic formula (23). In such a case of non-uniqueness of minimizers, they always seem to constitute either a finite set (typically a first order phase transition) or a continuous group orbit of a compact group (e.g., when $\Lambda$ is invariant under SO(2) or SO(3) and a minimizer breaks that symmetry), to the best of our knowledge; this seems to cover all physically relevant possibilities. In addition to the minimizers of $H_B(f)$ there may be non-minimizing critical points of $H_B(f)$ satisfying (27)–(30), but these are irrelevant for (23).

^n We remark that Euler’s number $e$ in (25) is inherited from the $N!$ term in (20).
^o The expression conventionally known as “Boltzmann factor” results when $W_\Lambda\equiv 0$ so that $U_\Lambda(q,\tilde q) = V_\Lambda(q)+V_\Lambda(\tilde q)$, i.e. for the perfect gas acted on by an external potential $V_\Lambda$.
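The self-consistent structure of the fixed point problem for $\rho_\varepsilon$ in Theorem 2 (density, temperature $\vartheta_\varepsilon(\rho)$, normalization) can be illustrated numerically. The following is only a minimal sketch under assumptions that are not part of the paper: the domain is replaced by a one-dimensional grid on $[0,1]$ (keeping the factor $3/2$ from the three-dimensional momentum space), the kernel is a hypothetical weak Gaussian repulsion, and a damped Picard iteration is used, which need not converge for strong interactions or for $\varepsilon$ close to $\varepsilon_g$. The names `solve_density`, `damping`, `tol` and `max_iter` are illustrative.

```python
import numpy as np

def solve_density(U, w, eps, damping=0.5, tol=1e-12, max_iter=10_000):
    """Damped fixed-point iteration for rho_eps on a quadrature grid.

    U   : (M, M) symmetric array, U[a, b] approximates U_Lambda(q_a, q_b)
    w   : (M,) quadrature weights (cell volumes)
    eps : energy parameter; must exceed the interaction energy of the iterates
    """
    rho = np.full(len(w), 1.0 / w.sum())      # start from the uniform density
    for _ in range(max_iter):
        field = U @ (w * rho)                  # integral of U(q, q~) rho(q~)
        energy = 0.5 * np.sum(w * rho * field)
        theta = (2.0 / 3.0) * (eps - energy)   # (3/2) theta = eps - interaction energy
        if theta <= 0:
            raise ValueError("theta <= 0: eps is too small for this density")
        new = np.exp(-field / theta)           # self-consistent Boltzmann factor
        new /= np.sum(w * new)                 # normalize to a probability density
        if np.max(np.abs(new - rho)) < tol:
            return new, theta
        rho = (1 - damping) * rho + damping * new
    raise RuntimeError("fixed-point iteration did not converge")

# Hypothetical toy setup: midpoint grid on [0, 1], weak Gaussian repulsion.
M = 50
x = (np.arange(M) + 0.5) / M
w = np.full(M, 1.0 / M)
U = 0.1 * np.exp(-(x[:, None] - x[None, :]) ** 2 / 0.1)
rho, theta = solve_density(U, w, eps=1.0)
```

For such a weak kernel the iteration map is a contraction, so the damped iteration settles on the unique solution in this toy setting; in regimes with several minimizers the result would depend on the starting density.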
Our Theorem 3, proved in Sec. 5.3 with input from Sec. 5.2, characterizes the Vlasov limit $N\to\infty$ of the marginal measures
\[
{}^n\mu^{(N)}_E(d^{6n}X) = \mu^{(N)}_E\big(d^{6n}X\times(\mathbb{R}^3\times\Lambda)^{N-n}\big),\qquad n = 1,2,\dots\ (n\ \text{fixed}) \tag{31}
\]
in terms of the $f_\varepsilon$. We note that the object of interest in (mathematical) physics is not (2) itself but only the collection of its first few marginal measures (31). To state our theorem, we introduce $\mathfrak{P}^s((\mathbb{R}^3\times\Lambda)^{\mathbb{N}})$, the permutation-symmetric probability measures on the set of infinite sequences in $\mathbb{R}^3\times\Lambda$. A theorem of de Finetti [14], Dynkin [9], and Hewitt–Savage [17] (see also [11, App. A.9]) states that $\mathfrak{P}^s((\mathbb{R}^3\times\Lambda)^{\mathbb{N}})$ is uniquely presentable as an average of infinite product measures; i.e. for each $\mu\in\mathfrak{P}^s((\mathbb{R}^3\times\Lambda)^{\mathbb{N}})$ there exists a unique probability measure $\nu(d\tau|\mu)$ on $\mathfrak{P}(\mathbb{R}^3\times\Lambda)$, such that
\[
{}^n\mu(d^{3n}p\,d^{3n}q) = \int_{\mathfrak{P}(\mathbb{R}^3\times\Lambda)}\tau^{\otimes n}(d^3p_1\,d^3q_1\cdots d^3p_n\,d^3q_n)\,\nu(d\tau|\mu)\qquad\forall\, n\in\mathbb{N}, \tag{32}
\]
where ${}^n\mu$ is the $n$th marginal measure of $\mu$, and $\tau^{\otimes n}(d^3p_1\,d^3q_1\cdots d^3p_n\,d^3q_n)\equiv\tau(d^3p_1\,d^3q_1)\otimes\cdots\otimes\tau(d^3p_n\,d^3q_n)$. Equation (32) is also the extremal decomposition for the convex set $\mathfrak{P}^s((\mathbb{R}^3\times\Lambda)^{\mathbb{N}})$, see [17].

Theorem 3. Under the same assumptions as in Theorem 2, consider (2) with Hamiltonian (19) as extended to a probability on $(\mathbb{R}^3\times\Lambda)^{\mathbb{N}}$. Then the sequence $\{\mu^{(N)}_{N^2\varepsilon}\}_{N\in\mathbb{N}}$ is tight, so one can extract a subsequence $\{\mu^{(\dot N[N])}_{\dot N^2\varepsilon}\}_{N\in\mathbb{N}}$ such that
\[
\lim_{N\to\infty}{}^n\mu^{(\dot N[N])}_{\dot N^2\varepsilon}(d^{3n}p\,d^{3n}q) = {}^n\dot\mu_\varepsilon(d^{3n}p\,d^{3n}q)\in\mathfrak{P}^s((\mathbb{R}^3\times\Lambda)^n)\qquad\forall\, n\in\mathbb{N}. \tag{33}
\]
The decomposition measure $\nu(d\tau|\dot\mu_\varepsilon)$ of each such limit point $\dot\mu_\varepsilon$ is supported by the subset of $\mathfrak{P}(\mathbb{R}^3\times\Lambda)$ which consists of the probability measures $\tau_\varepsilon(d^3p\,d^3q) = f_\varepsilon(p,q)\,d^3p\,d^3q$ which minimize the H functional $H_B(f)$ over $\mathcal{A}_\varepsilon$.

5. Proofs

We have stated our Theorems 1–3 entirely in terms of the familiar quantities of kinetic theory. These are the one-body density function $f_\varepsilon(p,q)$ which minimizes Boltzmann’s H-function $H_B(f)$ under the familiar energy functional constraint $\mathcal{E}(f)=\varepsilon$, and the system-specific Boltzmann entropy per particle $s_\Lambda(\varepsilon)$ which is given as the negative of Boltzmann’s H-function evaluated with $f_\varepsilon$. However, in this format our theorems give essentially symmetric weight to the $p$ and $q$ variables, which ignores the fact that the $p$-space integrations involved in (31) and (20) can be carried out explicitly in the same fashion as for the perfect gas. As a consequence the problem reduces to studying the large $N$ asymptotics of the expressions which result from these $p$-space integrations.^p In fact, all the hard analytical work goes into controlling the $q$-space integrations. This is certainly the case as far as the entropy per particle goes, yet also each minimizer $f_\varepsilon$ of $H_B(f)$ over the set $\mathcal{A}_\varepsilon$ is uniquely determined by $\rho_\varepsilon$, which signals that all of our Theorems 1–3 will be essentially straightforward corollaries of theorems about certain $q$-space expressions. Those theorems take a less familiar form, presumably, which is why their statements have been relegated into this section where we prove Theorems 1–3.

^p All Boltzmann needed for this was that $(1+x/n)^n\to e^x$; cf. [2, Part II, Chap. 3]. Of course, things are not quite as straightforward with an irreducible $W_\Lambda\not\equiv 0$, or else Boltzmann would not have had to have $W_\Lambda\not\equiv 0$ excluded from his analysis.

5.1. Proof of Theorem 1 and its two upgrades

To prove Theorem 1 we first formulate and then prove an upgraded version (Theorem 1$^+$), whose proof also proves Theorem 1.

5.1.1. Theorem 1$^+$ and its proof

Carrying out the $p$ integrations^q in $\Omega_{H^{(N)}_\Lambda}(E)$ given by (3), with $H^{(N)}_\Lambda$ given in (19), Boltzmann’s ergodic ensemble entropy (20) becomes
\[
S_{H^{(N)}_\Lambda}(E) = \ln\bigg(\frac{(2/N)^{3N/2}}{3N}\,|\mathbb{S}^{3N-1}|\,\Psi_{I^{(N)}_\Lambda}(E)\bigg) \tag{34}
\]
with $|\mathbb{S}^{3N-1}|$ the standard measure of the unit $3N-1$ sphere $\mathbb{S}^{3N-1}$, and with
\[
\Psi_{I^{(N)}_\Lambda}(E) = \frac{(3/2)}{(N-1)!}\int\big(E-I^{(N)}_\Lambda(q_1,\dots,q_N)\big)^{\frac{3N}{2}-1}\chi_{\{I^{(N)}_\Lambda<E\}}\,d^{3N}q, \tag{35}
\]
where we introduced the interaction Hamiltonian
\[
I^{(N)}_\Lambda(q_1,\dots,q_N) = \sum_{1\le i<j\le N}U_\Lambda(q_i,q_j). \tag{36}
\]
Implementing the Vlasov limit scaling, i.e. setting $E = N^2\varepsilon$ with $\varepsilon>\varepsilon_g\ge 0$, recalling that $|\mathbb{S}^{3N-1}| = 2\pi^{3N/2}/\Gamma(3N/2)$, and using Stirling’s formula for Euler’s $\Gamma$ function, we obtain the following asymptotic expansion for (34),
\[
\begin{aligned}
S_{H^{(N)}_\Lambda}(N^2\varepsilon) ={}& -N\ln N + N\ln\bigg(|\Lambda|\Big(\frac{4\pi e}{3}\varepsilon\Big)^{3/2}\bigg) + O(\ln N)\\
&+\ln\int\Big(1-\frac{1}{\varepsilon N^2}I^{(N)}_\Lambda(q_1,\dots,q_N)\Big)_+^{\frac{3N}{2}-1}\lambda(d^{3N}q),
\end{aligned}
\tag{37}
\]
where $(\cdots)_+$ means the positive part of $(\cdots)$; moreover, $\lambda(d^{3N}q)$ is the $N$-fold product of the normalized Lebesgue measure $\lambda(d^3q) = |\Lambda|^{-1}d^3q$ on $\Lambda$. For brevity we wrote $|\Lambda|$ for the volume ${\rm Vol}(\Lambda)$ of $\Lambda$.

When $I^{(N)}_\Lambda\equiv 0$ in $\Lambda^N$, then $H^{(N)}_\Lambda$ becomes the Hamiltonian of the perfect gas without external fields,^r abbreviated as $K^{(N)}_\Lambda$ (for kinetic Hamiltonian). In this case the second line in (37) vanishes, and (37) becomes the asymptotic expansion of the entropy of the spatially uniformly distributed perfect gas, viz.
\[
S_{K^{(N)}_\Lambda}(N^2\varepsilon) = -N\ln N + N\ln\bigg(|\Lambda|\Big(\frac{4\pi e}{3}\varepsilon\Big)^{3/2}\bigg) + O(\ln N). \tag{38}
\]
The coefficient of the $O(N)$ term in (38) gives the system-specific Boltzmann entropy per particle of the spatially uniform perfect gas, which we denote by
\[
s_{\Lambda,K}(\varepsilon) = \ln\bigg(|\Lambda|\Big(\frac{4\pi e}{3}\varepsilon\Big)^{3/2}\bigg). \tag{39}
\]

^q It is understood that $d^{6N}X$ etc. now involves the $p$ variables used in (19).
^r It is tacitly understood that the cutoff provided by $I^{(N)}_\Lambda$ remains effective, so that the configurational integrations in (37) are still over $\Lambda^N$.
Whenever interactions $I^{(N)}_\Lambda\not\equiv 0$ of the admitted type are present, Theorem 1 follows if we can show that the second line in (37) is $O(N)$ and so contributes additively to the system-specific Boltzmann entropy per particle, and provided it has the right monotonicity and regularity. This is expressed in:

Proposition 1. Under the assumptions stated in Theorem 1, there holds
\[
\lim_{N\to\infty}\frac1N\ln\int\Big(1-\frac{1}{\varepsilon N^2}I^{(N)}_\Lambda(q_1,\dots,q_N)\Big)_+^{\frac{3N}{2}-1}\lambda(d^{3N}q) = s_{\Lambda,I}(\varepsilon). \tag{40}
\]
The function $\varepsilon\mapsto s_{\Lambda,I}(\varepsilon)$ is continuous and increasing for $\varepsilon>\varepsilon_g\ge 0$.

This concludes the pretext for our first upgrade of Theorem 1, stated next.

Theorem 1$^+$. Theorem 1 holds, with
\[
s_\Lambda(\varepsilon) = s_{\Lambda,K}(\varepsilon)+s_{\Lambda,I}(\varepsilon), \tag{41}
\]
where $s_{\Lambda,K}(\varepsilon)$ is given in (39), and $s_{\Lambda,I}(\varepsilon)$ in (40).

Proof of Theorem 1$^+$. Clearly, Proposition 1 and formula (37) imply Theorem 1 and the splitting of the system-specific Boltzmann entropy per particle $s_\Lambda(\varepsilon)$ in (22) into a sum of a kinetic and an interaction component, (41). Proposition 1 also adds a piece of information about $s_{\Lambda,I}(\varepsilon)$ which does not just re-express what is stated in Theorem 1. In fact, by the known strict increase of $\varepsilon\mapsto\ln\varepsilon$, the increase of $\varepsilon\mapsto s_{\Lambda,I}(\varepsilon)$ implies the strict increase of $\varepsilon\mapsto s_\Lambda(\varepsilon)$, but the increase of $\varepsilon\mapsto s_{\Lambda,I}(\varepsilon)$ does not follow from the properties of $\varepsilon\mapsto\ln\varepsilon$ and the strict increase of $\varepsilon\mapsto s_\Lambda(\varepsilon)$. So Theorem 1$^+$ holds and extends Theorem 1.

Proof of Proposition 1. By hypothesis (H6), $U_\Lambda$ is bounded continuous on $\overline{\Lambda}\times\overline{\Lambda}$, so we can write
\[
N^{-2}I^{(N)}_\Lambda(q_1,\dots,q_N) = \frac12\iint U_\Lambda(\check q,\hat q)\,\Delta^{(1)}_{X^{(N)}}(\check q)\Delta^{(1)}_{X^{(N)}}(\hat q)\,d^3\check q\,d^3\hat q
-\frac1N\,\frac12\int U_\Lambda(q,q)\,\Delta^{(1)}_{X^{(N)}}(q)\,d^3q, \tag{42}
\]
and we may abbreviate the first term of the right-hand side of (42) in bilinear form notation,
\[
\frac12\iint U_\Lambda(\check q,\hat q)\,\Delta^{(1)}_{X^{(N)}}(\check q)\Delta^{(1)}_{X^{(N)}}(\hat q)\,d^3\check q\,d^3\hat q\equiv\big\langle\Delta^{(1)}_{X^{(N)}},\Delta^{(1)}_{X^{(N)}}\big\rangle. \tag{43}
\]
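Identity (42) is pure bookkeeping: the double sum over all pairs $(i,j)$, diagonal included, counts each pair $i<j$ twice and each diagonal term once. As a sanity check, it can be verified numerically for a random configuration. The sketch below is only an illustration; the Gaussian kernel `gaussian_U`, the unit cube as domain, and the function names are assumptions made for this example, not objects from the text.

```python
import numpy as np

def pair_energy(q, U):
    """Interaction energy: sum over i < j of U(q_i, q_j), q of shape (N, 3)."""
    n = len(q)
    return sum(U(q[i], q[j]) for i in range(n) for j in range(i + 1, n))

def bilinear_form(q, U):
    """<Delta, Delta> = (1/2) N^{-2} sum over all (i, j), diagonal included."""
    n = len(q)
    total = sum(U(q[i], q[j]) for i in range(n) for j in range(n))
    return 0.5 * total / n**2

def diagonal_term(q, U):
    """(1/N) * (1/2) * integral of U(q, q) against the empirical measure."""
    n = len(q)
    return 0.5 * sum(U(x, x) for x in q) / n**2

def gaussian_U(x, y):
    # Hypothetical bounded continuous kernel, chosen only for illustration.
    return float(np.exp(-np.sum((x - y) ** 2)))

rng = np.random.default_rng(0)
q = rng.uniform(0.0, 1.0, size=(12, 3))   # 12 random points in the unit cube

lhs = pair_energy(q, gaussian_U) / len(q) ** 2
rhs = bilinear_form(q, gaussian_U) - diagonal_term(q, gaussian_U)
```

The two sides agree up to floating-point rounding for any symmetric kernel with finite diagonal values, which is all the identity requires.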
The above integrals extend over $\mathbb{R}^3$, and we set $I^{(N)}_\Lambda(q_1,\dots,q_N) = \infty$ as well as $\langle\Delta^{(1)}_{X^{(N)}},\Delta^{(1)}_{X^{(N)}}\rangle = \infty$ if any $q_k\notin\overline{\Lambda}$. Also by (H6), the term in the second line of the right-hand side of (42) is $O(N^{-1})$ for all $X^{(N)}\in\overline{\Lambda}^N$. Recalling our claim (which we promised to prove) that the limit $N\to\infty$ for the ensemble does not change if the Hamiltonian is changed by an additive term of order $O(N^{-1})$ relative to the leading terms, we now introduce the configurational integral
\[
\Upsilon^{(N)}_\Lambda(\varepsilon)\equiv\ln\int\Big(1-\frac1\varepsilon\big\langle\Delta^{(1)}_{X^{(N)}},\Delta^{(1)}_{X^{(N)}}\big\rangle\Big)_+^{\frac{3N}{2}-1}\lambda(d^{3N}q) \tag{44}
\]
for all $N>N_U(\varepsilon)$ (to be defined). Note that the integral (44) is generally not well-defined for all $N\in\mathbb{N}$ because $\langle\Delta^{(1)}_{X^{(N)}},\Delta^{(1)}_{X^{(N)}}\rangle$ is bigger than $N^{-2}I^{(N)}_\Lambda(q_1,\dots,q_N)$ by the absolute value of the second line of the right-hand side of (42), which reads precisely $N^{-2}\sum_{k=1}^N\tfrac12 U_\Lambda(q_k,q_k)$. And while this term $= O(1/N)$, when $N$ is not large enough then it is possible that $\langle\Delta^{(1)}_{X^{(N)}},\Delta^{(1)}_{X^{(N)}}\rangle>\varepsilon$ everywhere in $\overline{\Lambda}^N$, in which case the integral in (44) vanishes, and its logarithm $= -\infty$, then. Yet, when $N>N_U(\varepsilon)$ the integral (44) is well-defined, and we conclude that (modulo the proof of precisely the just re-uttered claim that $O(1/N)$ contributions to the Hamiltonian drop out when $N\to\infty$) our Proposition 1 is proved if we can prove the following proposition.

Proposition 2. Under the hypotheses on $U_\Lambda$ in Theorem 1, when $N\gg N_U(\varepsilon)$ then
\[
\Upsilon^{(N)}_\Lambda(\varepsilon) = N\gamma_\Lambda(\varepsilon)+o(N). \tag{45}
\]
The function $\varepsilon\mapsto\gamma_\Lambda(\varepsilon)$ is continuous and increasing for $\varepsilon>\varepsilon_g\ge 0$.

Proof of Proposition 2. We will establish uniform bounds and super-additivity estimates. For $0<n<N$, we set $X^{(N)}\equiv(X^{(n)},Y^{(N-n)})$, which also defines $Y^{(N-n)}$. We note the convex linear decomposition
\[
\Delta^{(1)}_{X^{(N)}}(q) = \frac nN\,\Delta^{(1)}_{X^{(n)}}(q)+\Big(1-\frac nN\Big)\Delta^{(1)}_{Y^{(N-n)}}(q). \tag{46}
\]
Since $U_\Lambda\ge 0$ is the kernel of a bilinear form which is positive definite when restricted to the set of probability measures on $\Lambda$, Jensen’s inequality gives us
\[
\big\langle\Delta^{(1)}_{X^{(N)}},\Delta^{(1)}_{X^{(N)}}\big\rangle\le\frac nN\big\langle\Delta^{(1)}_{X^{(n)}},\Delta^{(1)}_{X^{(n)}}\big\rangle+\Big(1-\frac nN\Big)\big\langle\Delta^{(1)}_{Y^{(N-n)}},\Delta^{(1)}_{Y^{(N-n)}}\big\rangle. \tag{47}
\]
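The convexity estimate (47) rests on positive definiteness: for a convex combination $\alpha\mu+(1-\alpha)\nu$ the gap between the two sides equals $\alpha(1-\alpha)\langle\mu-\nu,\mu-\nu\rangle\ge 0$. The following sketch checks this for empirical measures. It assumes a Gaussian kernel, which is positive definite but is not a kernel taken from the text; the names `form`, `gaussian_kernel` and `uniform` are illustrative.

```python
import numpy as np

def form(weights_a, pts_a, weights_b, pts_b, U):
    """<mu, nu> = (1/2) * sum_{i,j} a_i b_j U(x_i, y_j) for atomic measures."""
    return 0.5 * weights_a @ U(pts_a, pts_b) @ weights_b

def gaussian_kernel(x, y):
    # Hypothetical positive definite kernel, used only for this illustration.
    d2 = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2)

def uniform(m):
    """Weights 1/m of an empirical measure with m atoms."""
    return np.full(m, 1.0 / m)

rng = np.random.default_rng(1)
N, n = 20, 7
X = rng.uniform(0.0, 1.0, size=(n, 3))        # configuration X^(n)
Y = rng.uniform(0.0, 1.0, size=(N - n, 3))    # configuration Y^(N-n)
XY = np.vstack([X, Y])                        # full configuration X^(N)

lhs = form(uniform(N), XY, uniform(N), XY, gaussian_kernel)
rhs = (n / N) * form(uniform(n), X, uniform(n), X, gaussian_kernel) \
    + (1 - n / N) * form(uniform(N - n), Y, uniform(N - n), Y, gaussian_kernel)
gap = rhs - lhs   # equals (n/N)(1 - n/N) <mu - nu, mu - nu>, hence >= 0
```

For a kernel that is not positive definite the gap can be negative, which is why the positive definiteness assumption matters in this step.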
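We of course also have $1 = \frac nN+(1-\frac nN)$, and so we conclude that
\[
\Big(1-\frac1\varepsilon\big\langle\Delta^{(1)}_{X^{(N)}},\Delta^{(1)}_{X^{(N)}}\big\rangle\Big)_+
\ge\bigg(\frac nN\Big[1-\frac1\varepsilon\big\langle\Delta^{(1)}_{X^{(n)}},\Delta^{(1)}_{X^{(n)}}\big\rangle\Big]
+\Big(1-\frac nN\Big)\Big[1-\frac1\varepsilon\big\langle\Delta^{(1)}_{Y^{(N-n)}},\Delta^{(1)}_{Y^{(N-n)}}\big\rangle\Big]\bigg)_+. \tag{48}
\]
Next we recall that, if $\varphi$ is some function on a domain $D$, and if $\Sigma(\varphi_+)$ denotes the support of its positive part, and $\chi_{\Sigma(\varphi_+)}$ is the characteristic function of $\Sigma(\varphi_+)$, then the inclusion $\Sigma(\varphi_+)\cap\Sigma(\vartheta_+)\subset\Sigma((\varphi+\vartheta)_+)$ for any two such functions $\varphi$ and $\vartheta$ yields the estimate
\[
(\varphi+\vartheta)_+ = (\varphi+\vartheta)\chi_{\Sigma((\varphi+\vartheta)_+)}\ge(\varphi+\vartheta)\chi_{\Sigma(\varphi_+)}\chi_{\Sigma(\vartheta_+)} = (\varphi_++\vartheta_+)\chi_{\Sigma(\varphi_+)}\chi_{\Sigma(\vartheta_+)}. \tag{49}
\]
Set $\varphi = \frac nN\big[1-\frac1\varepsilon\langle\Delta^{(1)}_{X^{(n)}},\Delta^{(1)}_{X^{(n)}}\rangle\big]$ and $\vartheta = (1-\frac nN)\big[1-\frac1\varepsilon\langle\Delta^{(1)}_{Y^{(N-n)}},\Delta^{(1)}_{Y^{(N-n)}}\rangle\big]$. Then inequality (49) applies to the right-hand side of (48). Applying next the classical inequality between the arithmetic and the geometric means of any two positive numbers $A$ and $B$, viz. $\alpha A+(1-\alpha)B\ge A^\alpha B^{(1-\alpha)}$ for any $\alpha\in[0,1]$, we get
\[
\Big(1-\frac1\varepsilon\big\langle\Delta^{(1)}_{X^{(N)}},\Delta^{(1)}_{X^{(N)}}\big\rangle\Big)_+
\ge\Big(1-\frac1\varepsilon\big\langle\Delta^{(1)}_{X^{(n)}},\Delta^{(1)}_{X^{(n)}}\big\rangle\Big)_+^{\frac nN}
\Big(1-\frac1\varepsilon\big\langle\Delta^{(1)}_{Y^{(N-n)}},\Delta^{(1)}_{Y^{(N-n)}}\big\rangle\Big)_+^{1-\frac nN}. \tag{50}
\]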
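The step from (48) to (50) combines the support estimate (49) with the weighted arithmetic-geometric mean inequality. In scalar form it says that $(\alpha a+(1-\alpha)b)_+\ge a_+^{\alpha}\,b_+^{1-\alpha}$ for real $a$, $b$ and $0<\alpha<1$: if both numbers are positive this is the mean inequality, and otherwise the right-hand side vanishes. A quick randomized check of this scalar statement, as a sketch only (the sampling ranges and names are arbitrary choices for the example):

```python
import numpy as np

def pos(x):
    """Positive part x_+ = max(x, 0)."""
    return np.maximum(x, 0.0)

rng = np.random.default_rng(2)
a = rng.uniform(-1.0, 1.0, size=10_000)        # plays the role of 1 - <.,.>/eps for X^(n)
b = rng.uniform(-1.0, 1.0, size=10_000)        # same for Y^(N-n)
alpha = rng.uniform(0.01, 0.99, size=10_000)   # plays the role of n/N

lhs = pos(alpha * a + (1 - alpha) * b)
rhs = pos(a) ** alpha * pos(b) ** (1 - alpha)
worst = np.min(lhs - rhs)   # smallest gap over all samples
```

A non-negative `worst` (up to rounding) is consistent with the inequality; it is a spot check, not a proof.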
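We now use (50) to estimate the right-hand side of (44). For this, let $N\gg N_U(\varepsilon)$ and let $N_U(\varepsilon)<n<N-N_U(\varepsilon)$. Noting that the resulting integral over $\Lambda^N$ factors into two integrals, one over $\Lambda^n$ and another over $\Lambda^{N-n}$, and working out the powers, we find
\[
\begin{aligned}
\ln\int\Big(1-\frac1\varepsilon\big\langle\Delta^{(1)}_{X^{(N)}},\Delta^{(1)}_{X^{(N)}}\big\rangle\Big)_+^{\frac{3N}{2}-1}\lambda(d^{3N}q)
\ge{}&\ln\int\Big(1-\frac1\varepsilon\big\langle\Delta^{(1)}_{X^{(n)}},\Delta^{(1)}_{X^{(n)}}\big\rangle\Big)_+^{\frac{3n}{2}-\frac nN}\lambda(d^{3n}q)\\
&+\ln\int\Big(1-\frac1\varepsilon\big\langle\Delta^{(1)}_{X^{(N-n)}},\Delta^{(1)}_{X^{(N-n)}}\big\rangle\Big)_+^{\frac{3(N-n)}{2}-1+\frac nN}\lambda(d^{3(N-n)}q),
\end{aligned}
\tag{51}
\]
where we also relabeled the integration variables under the second integral on the right-hand side of (51) from $Y^{(N-n)}$ to $X^{(N-n)}$. Noting next that $0<\frac nN<1$, we resort again to Jensen’s inequality, this time with respect to the $\lambda$ measures in the two integrals on the right-hand side of (51). Also using $\ln(\cdots)^a = a\ln(\cdots)$, we arrive at
\[
\begin{aligned}
\ln\int\Big(1-\frac1\varepsilon\big\langle\Delta^{(1)}_{X^{(N)}},\Delta^{(1)}_{X^{(N)}}\big\rangle\Big)_+^{\frac{3N}{2}-1}\lambda(d^{3N}q)
\ge{}&\Big(1+\frac{2-2n/N}{3n-2}\Big)\ln\int\Big(1-\frac1\varepsilon\big\langle\Delta^{(1)}_{X^{(n)}},\Delta^{(1)}_{X^{(n)}}\big\rangle\Big)_+^{\frac{3n}{2}-1}\lambda(d^{3n}q)\\
&+\Big(1+\frac{2n/N}{3(N-n)-2}\Big)\ln\int\Big(1-\frac1\varepsilon\big\langle\Delta^{(1)}_{X^{(N-n)}},\Delta^{(1)}_{X^{(N-n)}}\big\rangle\Big)_+^{\frac{3(N-n)}{2}-1}\lambda(d^{3(N-n)}q).
\end{aligned}
\tag{52}
\]
Formula (52) writes shorter thusly,
\[
\Upsilon^{(N)}_\Lambda(\varepsilon)\ge\Big(1+\frac{2-2n/N}{3n-2}\Big)\Upsilon^{(n)}_\Lambda(\varepsilon)+\Big(1+\frac{2n/N}{3(N-n)-2}\Big)\Upsilon^{(N-n)}_\Lambda(\varepsilon). \tag{53}
\]
So $N\mapsto\Upsilon^{(N)}_\Lambda(\varepsilon)$ is almost super-additive.

To be able to create a properly super-additive function we establish upper and lower bounds of $\ell\mapsto\Upsilon^{(\ell)}_\Lambda(\varepsilon)$ which are linear in $\ell$, whenever $\ell>N_U(\varepsilon)$; we will need those bounds with $\ell\in\{n,N-n\}$, with $\ell>1$. As a by-product, the upper bound with $\ell = N$ will also guarantee convergence of the constructed super-additive function.

The upper bound is trivial. Recall that by hypothesis $\langle\Delta^{(1)}_{X^{(\ell)}},\Delta^{(1)}_{X^{(\ell)}}\rangle\ge 0$ for all $\ell\in\mathbb{N}$. So for $\ell>N_U(\varepsilon)$ and $\varepsilon>\varepsilon_g\ge 0$ we find
\[
\frac{2}{3\ell-2}\ln\int\Big(1-\frac1\varepsilon\big\langle\Delta^{(1)}_{X^{(\ell)}},\Delta^{(1)}_{X^{(\ell)}}\big\rangle\Big)_+^{\frac{3\ell}{2}-1}\lambda(d^{3\ell}q)\le 0. \tag{54}
\]
As for the lower bound, we distinguish two cases, (a): $\langle|\Lambda|^{-1},|\Lambda|^{-1}\rangle<\varepsilon$, and (b): $\langle|\Lambda|^{-1},|\Lambda|^{-1}\rangle\ge\varepsilon$.

In case (a) we apply Jensen’s inequality with respect to $\lambda$ to the convex map $x\mapsto(1-x)^\theta_+$ (for $\theta\ge 1$), and also use $\ell-1<\ell$, to get
\[
\frac{2}{3\ell-2}\ln\int\Big(1-\frac1\varepsilon\big\langle\Delta^{(1)}_{X^{(\ell)}},\Delta^{(1)}_{X^{(\ell)}}\big\rangle\Big)_+^{\frac{3\ell}{2}-1}\lambda(d^{3\ell}q)
\ge\ln\Big(1-\frac1\ell\,\frac1\varepsilon\int\frac12 U_\Lambda(q,q)\lambda(d^3q)-\frac1\varepsilon\iint\frac12 U_\Lambda(\check q,\hat q)\lambda(d^3\check q)\lambda(d^3\hat q)\Big)_+, \tag{55}
\]
and the right-hand side of (55) $\ge -C>-\infty$ when $\ell>\ell_{\rm crit}(\varepsilon)$ (given $U_\Lambda$), with $C>0$ independent of $\ell$. Since the interaction entropy exists when $\ell>N_U(\varepsilon)$, clearly $\ell_{\rm crit}\ge N_U(\varepsilon)$, but after at most an adjustment of $C$, we can conclude that the left-hand side of (55) $\ge -C>-\infty$ when $\ell>N_U(\varepsilon)$, with $C>0$ independent of $\ell$.

In case (b), inequality (55) is still true but now trivial, for the right-hand side of (55) $= -\infty$ for all $\ell>1$, then. So instead we now proceed as follows. By hypothesis (H6), the bilinear form $\langle\Delta^{(1)}_{X^{(\ell)}},\Delta^{(1)}_{X^{(\ell)}}\rangle$ takes its minimum $\varepsilon^*_g(\ell)\ge\varepsilon_g$. Clearly, $\varepsilon^*_g(\ell) = \tilde\varepsilon_g(\ell)+O(\ell^{-1})$, where $\tilde\varepsilon_g(\ell) := \min\ell^{-2}I^{(\ell)}_\Lambda(q_1,\dots,q_\ell)$, and since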
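$\tilde\varepsilon_g(\ell)\le\varepsilon_g$ (as proved in Appendix A), we have that $\varepsilon^*_g(\ell)\le\varepsilon_g+O(\ell^{-1})$; of course, we also assume that $\ell>N_U(\varepsilon)$ so that $\varepsilon^*_g(\ell)<\varepsilon$. By permutation symmetry there are many equivalent minimizers, but possibly also several distinct permutation group orbits of minimizers. We pick any particular minimizer $Q^{(\ell)}_g$ and let $q^{(\ell)}_{g,k}\in\overline{\Lambda}$ denote the $k$th coordinate vector in $Q^{(\ell)}_g$. By (H6) again, we can vary all the $q_k$ in the minimizing configuration a little bit, say, each $q_k$ in $B_\delta(q^{(\ell)}_{g,k})\cap\Lambda$, where $B_\delta(q)$ is a ball centered at $q$, with radius $\delta>0$ independent of $k$ and $\ell$ but chosen small enough (given $\varepsilon$) so that $\langle\Delta^{(1)}_{X^{(\ell)}},\Delta^{(1)}_{X^{(\ell)}}\rangle$ does not change by more than $(\varepsilon-\varepsilon_g+O(\ell^{-1}))/2$. For brevity we write $B_\delta[k]$ for $B_\delta(q^{(\ell)}_{g,k})$; let $\chi_{B_\delta[k]}$ be the characteristic function of $B_\delta[k]$. We use that $\lambda(d^3q_k) = \chi_{B_\delta[k]}\lambda(d^3q_k)+\chi_{B^c_\delta[k]}\lambda(d^3q_k)$ where $B^c_\delta[k] = \Lambda\backslash B_\delta[k]$ is the complement in $\Lambda$ of $B_\delta[k]$, then use that both terms in this decomposition are non-negative so that we get a lower estimate by dropping the contribution from $\chi_{B^c_\delta[k]}\lambda(d^3q_k)$ for each $k$. After this step the restriction to the positive part of $(1-\frac1\varepsilon\langle\,.\,,\,.\,\rangle)$ is eventually tautological when $\ell$ is sufficiently large so that the $O(\ell^{-1})$ term has gotten sufficiently small. We next apply Jensen’s inequality with respect to the probability measure $\prod_{1\le k\le\ell}\big(\int_{B_\delta[k]\cap\Lambda}\lambda(d^3q)\big)^{-1}\chi_{B_\delta[k]}\lambda(d^3q_k)$ to the convex map $x_+\mapsto x^\theta_+$ (for $\theta\ge 1$), finally recall that $0\le\varepsilon_g<\varepsilon$, and get
\[
\begin{aligned}
\bigg(\int\Big(1-\frac1\varepsilon\big\langle\Delta^{(1)}_{X^{(\ell)}},\Delta^{(1)}_{X^{(\ell)}}\big\rangle\Big)_+^{\frac{3\ell}{2}-1}\lambda(d^{3\ell}q)\bigg)^{\frac{2}{3\ell-2}}
&\ge\bigg(\int\Big(1-\frac1\varepsilon\big\langle\Delta^{(1)}_{X^{(\ell)}},\Delta^{(1)}_{X^{(\ell)}}\big\rangle\Big)^{\frac{3\ell}{2}-1}\prod_{1\le k\le\ell}\chi_{B_\delta[k]}\lambda(d^3q_k)\bigg)^{\frac{2}{3\ell-2}}\\
&\ge|C_\delta|^{\frac{2}{3-2/\ell}}\int\Big(1-\frac1\varepsilon\big\langle\Delta^{(1)}_{X^{(\ell)}},\Delta^{(1)}_{X^{(\ell)}}\big\rangle\Big)\prod_{1\le k\le\ell}\frac{\chi_{B_\delta[k]}\lambda(d^3q_k)}{\int_{B_\delta[k]\cap\Lambda}\lambda(d^3q)}\\
&\ge|C_\delta|^{\frac{2}{3-2/\ell}}\Big[1-\frac12\Big(1+\frac{\varepsilon_g}{\varepsilon}\Big)+O(\ell^{-1})\Big]\ge C>0
\end{aligned}
\tag{56}
\]
for $\ell$ large enough; here
\[
C_\delta = \min_{q'}\int_{B_\delta(q')\cap\Lambda}\lambda(d^3q)>0. \tag{57}
\]
In summary, our list of inequalities (54)–(56), and the finiteness of the number of $\ell$ until “$\ell$ is large enough”, establishes that when $\ell>N_U(\varepsilon)$, then for some $\ell$-independent constant $C_*>0$,
\[
-\Big(\frac{3\ell}{2}-1\Big)C_*\le\Upsilon^{(\ell)}_\Lambda(\varepsilon)\le 0; \tag{58}
\]
incidentally, our (55) and (56) produce an upper estimate for $N_U(\varepsilon)$.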
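Recall that in this proof we assume that $N\gg N_U(\varepsilon)$, and that $N_U(\varepsilon)<n<N-N_U(\varepsilon)$. With the help of (58), for $\ell\in\{n,N-n\}$, we conclude from (53) that there exists a $C\in\mathbb{R}$ independent of $n$ and $N$ such that
\[
\Upsilon^{(N)}_\Lambda(\varepsilon)\ge\Upsilon^{(n)}_\Lambda(\varepsilon)+\Upsilon^{(N-n)}_\Lambda(\varepsilon)+C. \tag{59}
\]
Adding that constant $C$ to both sides of the inequality (59) shows that $N\mapsto\Upsilon^{(N)}_\Lambda(\varepsilon)+C$ is a super-additive function for all $\varepsilon>\varepsilon_g\ge 0$. And using (54) with $\ell = N$, we also see that $N^{-1}(\Upsilon^{(N)}_\Lambda(\varepsilon)+C)$ is bounded above, and so, by standard facts about super-additive functions, $N^{-1}(\Upsilon^{(N)}_\Lambda(\varepsilon)+C)$ converges as $N\to\infty$,
\[
\lim_{N\to\infty}N^{-1}\big(\Upsilon^{(N)}_\Lambda(\varepsilon)+C\big) = \sup_{N\in\mathbb{N}}N^{-1}\big(\Upsilon^{(N)}_\Lambda(\varepsilon)+C\big), \tag{60}
\]
and since $N^{-1}C\xrightarrow{N\to\infty}0$, we conclude that $N^{-1}\Upsilon^{(N)}_\Lambda(\varepsilon)$ converges as well, i.e.
\[
\lim_{N\to\infty}\frac1N\Upsilon^{(N)}_\Lambda(\varepsilon) = \gamma_\Lambda(\varepsilon). \tag{61}
\]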
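This proves (45).

To prove continuity of $\gamma_\Lambda(\varepsilon)$, we establish upper and lower bounds on the derivative of the functions $\varepsilon\mapsto N^{-1}\Upsilon^{(N)}_\Lambda(\varepsilon)$ which are uniform in $N>N_U(\varepsilon)$. Differentiating the functions $\varepsilon\mapsto N^{-1}\Upsilon^{(N)}_\Lambda(\varepsilon)+(\frac32-\frac1N)\ln\varepsilon$, we obtain
\[
\frac1N\Upsilon^{(N)\,\prime}_\Lambda(\varepsilon) = \Big(\frac32-\frac1N\Big)\frac1\varepsilon\left[\frac{\int\big(1-\frac1\varepsilon\langle\Delta^{(1)}_{X^{(N)}},\Delta^{(1)}_{X^{(N)}}\rangle\big)_+^{\frac{3N}{2}-2}\lambda(d^{3N}q)}{\int\big(1-\frac1\varepsilon\langle\Delta^{(1)}_{X^{(N)}},\Delta^{(1)}_{X^{(N)}}\rangle\big)_+^{\frac{3N}{2}-1}\lambda(d^{3N}q)}-1\right]. \tag{62}
\]
To get a lower bound, we split off a factor $(1-\frac1\varepsilon\langle\Delta^{(1)}_{X^{(N)}},\Delta^{(1)}_{X^{(N)}}\rangle)_+$ in the integrand of the denominator of the right-hand side of (62), and using that $\varepsilon>\varepsilon_g\ge 0$, the positivity of the bilinear form now gives $(1-\frac1\varepsilon\langle\Delta^{(1)}_{X^{(N)}},\Delta^{(1)}_{X^{(N)}}\rangle)_+\le 1$, and so
\[
\frac1N\Upsilon^{(N)\,\prime}_\Lambda(\varepsilon)\ge\Big(\frac32-\frac1N\Big)\frac1\varepsilon\,[1-1] = 0; \tag{63}
\]
incidentally, this shows once again monotonicity $\uparrow$ of $\varepsilon\mapsto\gamma_\Lambda(\varepsilon)$. To get an $N$-independent upper bound to (62), note that $\frac{3N}{2}-2 = (\frac{3N}{2}-1)(1-\frac{2}{3N-2})$ and that $0<(1-\frac{2}{3N-2})<1$ for $N>1$, then apply Jensen’s inequality with respect to $\lambda$ to pull the power $(1-\frac{2}{3N-2})$ out of the integral in the numerator, then note a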
M. K.-H. Kiessling
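cancellation versus the denominator. Since $0 < (1 - \frac{2}{3N-2}) < 1$ for $N > 1$,
$$
\frac{1}{N}\Upsilon^{(N)\prime}_\Lambda(\varepsilon) \le \Big(\frac{3}{2} - \frac{1}{N}\Big)\frac{1}{\varepsilon}\left[\left(\int\Big(1 - \frac{1}{\varepsilon}\langle\Delta^{(1)}_{X^{(N)}},\Delta^{(1)}_{X^{(N)}}\rangle\Big)_+^{\frac{3N}{2}-1}\lambda(d^{3N}q)\right)^{-\frac{2}{3N-2}} - 1\right] \tag{64}
$$
whenever $N > N_U(\varepsilon)$ (so that the integral is non-zero). By the first inequality in (58) with $\ell = N$, the right-hand side of (64) is bounded above independently of $N$. The continuity of $\varepsilon \mapsto \gamma_\Lambda(\varepsilon)$ follows. Proposition 2 is proved.

To complete the proof of Proposition 1 we still need to show that the omission of $\frac{1}{N}\frac{1}{2}\int U_\Lambda(q,q)\Delta^{(1)}_{X^{(N)}}(q)\,d^3q$ from (42) was justified. This is now straightforward. By hypothesis (H6), $U_\Lambda$ $(\ge 0)$ is a bounded continuous function on $\Lambda\times\Lambda$. So there exists an $N$-independent constant $B > 0$ such that
$$
0 \le \frac{1}{2}\int U_\Lambda(q,q)\Delta^{(1)}_{X^{(N)}}(q)\,d^3q \le B, \tag{65}
$$
as long as $X^{(N)} \in \Lambda^N$. Thus, and abbreviating the expression in the second line on the right-hand side of (37) by $S_{I^{(N)}_\Lambda}(N^2\varepsilon)$, we have the two-sided estimate
$$
\Upsilon^{(N)}_\Lambda(\varepsilon) \le S_{I^{(N)}_\Lambda}(N^2\varepsilon) \le \Upsilon^{(N)}_\Lambda(\varepsilon + BN^{-1}). \tag{66}
$$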
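But
$$
\big|\Upsilon^{(N)}_\Lambda(\varepsilon + BN^{-1}) - \Upsilon^{(N)}_\Lambda(\varepsilon)\big| \le \int_\varepsilon^{\varepsilon + BN^{-1}}\big|\Upsilon^{(N)\prime}_\Lambda(\varsigma)\big|\,d\varsigma \le BC, \tag{67}
$$
the last inequality by (64) and by the first inequality in (58), with $\ell = N$, and by $\varepsilon \le \varsigma \le 2\varepsilon$. So we conclude that for any $B > 0$ we have
$$
\lim_{N\to\infty}\frac{1}{N}\Upsilon^{(N)}_\Lambda(\varepsilon + BN^{-1}) = \gamma_\Lambda(\varepsilon). \tag{68}
$$
Hence, and by (66),
$$
\lim_{N\to\infty}\frac{1}{N}S_{I^{(N)}_\Lambda}(N^2\varepsilon) = \gamma_\Lambda(\varepsilon), \tag{69}
$$
and Proposition 1 is proved, with $s_{\Lambda,I}(\varepsilon) = \gamma_\Lambda(\varepsilon)$. This also completes the proof of Theorem 1.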
5.1.2. Theorem 1++ and its proof

Ruelle's proof [40] of the traditional thermodynamic limit for (20) per volume$^{s}$ proceeded along somewhat different lines, and when adapted to the Vlasov scaling it yields an interesting alternate proof of Theorem 1 which characterizes $s_\Lambda(\varepsilon)$ in terms of a variational principle (VP) involving $s_{\Lambda,K}(\varepsilon)$ and yet another (auxiliary) "interaction entropy", which we denote by $s_{\Lambda,I}(\varepsilon)$. For technical reasons we now need to assume that $\varepsilon_g > 0$ (rather than $\varepsilon_g \ge 0$). So, following Ruelle [40] we introduce the configurational integral$^{t}$
$$
\Xi_{I^{(N)}_\Lambda}(E) = \int\chi_{\{I^{(N)}_\Lambda < E\}}\,\lambda(d^{3N}q). \tag{70}
$$
Up to a purely numerical factor, (70) is quasi the "$3N/2$-th derivative" with respect to $E$ of $\Psi_{I^{(N)}_\Lambda}(E)$, the first derivative of which is given in (35). For convenience we rewrite (70), with $E = N^2\varepsilon$, as
$$
\Xi_{I^{(N)}_\Lambda}(N^2\varepsilon) = \int\big(\varepsilon - N^{-2}I^{(N)}_\Lambda(q_1,\dots,q_N)\big)^0_+\,\lambda(d^{3N}q). \tag{71}
$$

Proposition 3. Assume the hypotheses of Theorem 1, but now let $\varepsilon_g > 0$. Then the following limit exists,
$$
\lim_{N\to\infty}\frac{1}{N}\ln\Xi_{I^{(N)}_\Lambda}(N^2\varepsilon) = s_{\Lambda,I}(\varepsilon), \tag{72}
$$
and $s_{\Lambda,I}(\varepsilon) \le 0$ is an increasing, right-continuous function of $\varepsilon > \varepsilon_g$.

Proof of Proposition 3. Simplest things first, we note that $\Xi_{I^{(N)}_\Lambda}(E) \le 1$ (obviously), which proves that $\ln\Xi_{I^{(N)}_\Lambda}(N^2\varepsilon) \le 0$ for all $N$, and so $s_{\Lambda,I}(\varepsilon) \le 0$ whenever this limit exists. The proof that this limit exists and is a monotonically increasing right-continuous function of $\varepsilon > \varepsilon_g > 0$ consists of two main steps.

First, as in our proof of Theorem 1, we temporarily replace $N^{-2}I^{(N)}_\Lambda(q_1,\dots,q_N)$ by $\langle\Delta^{(1)}_{X^{(N)}},\Delta^{(1)}_{X^{(N)}}\rangle$ in (71) and study its logarithm. For this we need once again to assume that $N \gg N_U(\varepsilon)$. Inspection of our proof of Proposition 2 reveals that we can recycle inequality (50), take its vanishing power, integrate and take
s Actually, Ruelle discussed the entropy of a regularized microcanonical ensemble measure [40]. In [22] the author showed that a minor modification of Ruelle’s approach establishes the thermodynamic limit for (20) per volume without regularization. t Instead of the normalized Lebesgue measure λ(d3N q), Ruelle [40] uses N !−1 d3N q which gives equivalent results in the thermodynamic limit; not so in the Vlasov limit.
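logarithms, and for $N_U(\varepsilon) < n < N - N_U(\varepsilon)$, in place of (51) we now find
$$
\begin{aligned}
\ln\int\big(\varepsilon - \langle\Delta^{(1)}_{X^{(N)}},\Delta^{(1)}_{X^{(N)}}\rangle\big)^0_+\lambda(d^{3N}q) \ge{}& \ln\int\big(\varepsilon - \langle\Delta^{(1)}_{X^{(n)}},\Delta^{(1)}_{X^{(n)}}\rangle\big)^0_+\lambda(d^{3n}q) \\
&+ \ln\int\big(\varepsilon - \langle\Delta^{(1)}_{X^{(N-n)}},\Delta^{(1)}_{X^{(N-n)}}\rangle\big)^0_+\lambda(d^{3(N-n)}q),
\end{aligned}
\tag{73}
$$
which proves super-additivity of $N \mapsto \ln\int(\varepsilon - \langle\Delta^{(1)}_{X^{(N)}},\Delta^{(1)}_{X^{(N)}}\rangle)^0_+\lambda(d^{3N}q)$ without further ado. Furthermore, since $(\cdots)^0_+$ is either 1 or 0, we conclude that $\ln\int(\varepsilon - \langle\Delta^{(1)}_{X^{(N)}},\Delta^{(1)}_{X^{(N)}}\rangle)^0_+\lambda(d^{3N}q) \le 0$. This upper bound and super-additivity now yield that the following limit exists,
$$
\lim_{N\to\infty}\frac{1}{N}\ln\int\big(\varepsilon - \langle\Delta^{(1)}_{X^{(N)}},\Delta^{(1)}_{X^{(N)}}\rangle\big)^0_+\lambda(d^{3N}q) = \tilde{s}_{\Lambda,I}(\varepsilon); \tag{74}
$$
moreover, $\tilde{s}_{\Lambda,I}(\varepsilon) \le 0$ is monotonic increasing, since the left-hand side of (74) is.

Next we would like to prove continuity of $\tilde{s}_{\Lambda,I}(\varepsilon)$ as a function of $\varepsilon$ and then conclude the proof as at the end of the proof of Theorem 1, but so far a proof of continuity of $\tilde{s}_{\Lambda,I}(\varepsilon)$ has eluded us. Fortunately we can bypass this obstacle because $\tilde{s}_{\Lambda,I}(\varepsilon)$ is a monotonic increasing function of $\varepsilon$. We define
$$
\tilde{s}_{\Lambda,I}(\varepsilon^+) = \inf_{x>1}\tilde{s}_{\Lambda,I}(x\varepsilon) \tag{75}
$$
and show that
$$
\lim_{N\to\infty}\frac{1}{N}\ln\int\Big(1 - \frac{1}{\varepsilon N^2}I^{(N)}_\Lambda(q_1,\dots,q_N)\Big)^0_+\lambda(d^{3N}q) = \tilde{s}_{\Lambda,I}(\varepsilon^+), \tag{76}
$$
which proves Proposition 3, with $s_{\Lambda,I}(\varepsilon) = \tilde{s}_{\Lambda,I}(\varepsilon^+)$. To accomplish this, we recall (42) and (43) and rewrite (71) as
$$
\Xi_{I^{(N)}_\Lambda}(N^2\varepsilon) = \int\Big(\varepsilon - \langle\Delta^{(1)}_{X^{(N)}},\Delta^{(1)}_{X^{(N)}}\rangle + \frac{1}{N}\widetilde{\Delta}^{(1)}_{X^{(N)}}\Big)^0_+\lambda(d^{3N}q), \tag{77}
$$
where we also introduced the abbreviation
$$
\widetilde{\Delta}^{(1)}_{X^{(N)}} = \frac{1}{2}\int U_\Lambda(q,q)\Delta^{(1)}_{X^{(N)}}(q)\,d^3q. \tag{78}
$$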
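Since now $\varepsilon_g > 0$, there exist constants $\underline{B}, \overline{B}$ satisfying $0 < \underline{B} < \overline{B} < \infty$ so that
$$
\underline{B} \le \widetilde{\Delta}^{(1)}_{X^{(N)}} \le \overline{B}. \tag{79}
$$
But then, for all $N > N_U(\varepsilon)$ big enough, we have
$$
\frac{1}{N}\ln\Xi_{I^{(N)}_\Lambda}(N^2\varepsilon) \ge \tilde{s}_{\Lambda,I}(\varepsilon + N^{-1}\underline{B}) + o(1) \ge \tilde{s}_{\Lambda,I}(\varepsilon^+) + o(1) \tag{80}
$$
where $o(1) \to 0$ as $N \to \infty$. So
$$
\liminf_{N\to\infty}\frac{1}{N}\ln\Xi_{I^{(N)}_\Lambda}(N^2\varepsilon) \ge \tilde{s}_{\Lambda,I}(\varepsilon^+). \tag{81}
$$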
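On the other hand, for all $N > N_U(\varepsilon)$ we also have that
$$
\frac{1}{N}\ln\Xi_{I^{(N)}_\Lambda}(N^2\varepsilon) \le \tilde{s}_{\Lambda,I}(\varepsilon + N^{-1}\overline{B}) + o(1), \tag{82}
$$
and so
$$
\limsup_{N\to\infty}\frac{1}{N}\ln\Xi_{I^{(N)}_\Lambda}(N^2\varepsilon) \le \tilde{s}_{\Lambda,I}(\varepsilon^+). \tag{83}
$$
The estimates (81) and (83) prove (76). So $s_{\Lambda,I}(\varepsilon) = \tilde{s}_{\Lambda,I}(\varepsilon^+)$. Of course, $s_{\Lambda,I}(\varepsilon) = \tilde{s}_{\Lambda,I}(\varepsilon)$ at all $\varepsilon$ which are points of continuity of $\tilde{s}_{\Lambda,I}(\varepsilon)$, and the two functions share their points of discontinuity. At such points $s_{\Lambda,I}(\varepsilon)$ is right-continuous and may or may not agree with $\tilde{s}_{\Lambda,I}(\varepsilon)$. Proposition 3 is proved.

We are now ready to state our second upgrade of our Theorem 1.

Theorem 1++. Under the hypotheses of Proposition 3, Theorem 1 holds and the system-specific Boltzmann entropy per particle $s_\Lambda(\varepsilon)$ given in (22) satisfies the variational principle
$$
s_\Lambda(\varepsilon) = \sup_{0\le x\le 1}\big(s_{\Lambda,K}(x\varepsilon) + s_{\Lambda,I}([1-x]\varepsilon)\big). \tag{84}
$$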
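Proof of Theorem 1++. Integration by parts yields, for any $\kappa > 0$ and $\varepsilon > \varepsilon_g$,
$$
\int\Big(1 - \frac{1}{\varepsilon N^2}I^{(N)}_\Lambda\Big)^{\kappa}_+\lambda(d^{3N}q) = \kappa\int_0^1\int\Big(1 - \frac{1}{[1-x]\varepsilon N^2}I^{(N)}_\Lambda\Big)^0_+\lambda(d^{3N}q)\,x^{\kappa-1}\,dx, \tag{85}
$$
where we suppressed the arguments $(q_1,\dots,q_N)$ from $I^{(N)}_\Lambda(q_1,\dots,q_N)$. Setting $\kappa = \frac{3N}{2} - 1$, recalling (71), and using that $N^{-1}\ln(\frac{3N}{2} - 1) \to 0$, we find
$$
s_{\Lambda,I}(\varepsilon) = \lim_{N\to\infty}\frac{1}{N}\ln\int_0^1\Xi_{I^{(N)}_\Lambda}(N^2[1-x]\varepsilon)\,x^{\frac{3N}{2}-2}\,dx. \tag{86}
$$
Proposition 3 and Laplace's method (cf. [11, Sec. II.7]) now yield
$$
s_{\Lambda,I}(\varepsilon) = \sup_{0\le x\le 1}\Big\{\frac{3}{2}\ln x + s_{\Lambda,I}([1-x]\varepsilon)\Big\}; \tag{87}
$$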
note that (87) implies that ε → sΛ,I (ε) is continuous even when sΛ,I (ε) is not. Recalling next the definition (39) of sΛ,K (ε) as well as (41) of Theorem 1+ , we see that Theorem 1++ is proved. We end this subsection by pointing out that our method of proving Theorem 1++ not only avoids the regularization of Dirac’s δ measure, we also tackled the map E → S(E) directly rather than its inverse S → E(S) [40]. The strategy to tackle S → E(S) is due to Griffiths [16].
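The passage from (86) to (87) rests on Laplace's method: $\frac{1}{N}\ln\int_0^1 e^{N g(x)}dx \to \sup_x g(x)$. A minimal numerical sketch follows; the function `s_aux` is a hypothetical stand-in for the auxiliary interaction entropy (chosen only to be non-positive and increasing), and the midpoint quadrature is an illustrative choice, not part of the proof.

```python
import math

def log_integral_over_unit_interval(g, N, grid=20000):
    """Return (1/N) * ln of the midpoint-rule approximation to
    int_0^1 exp(N * g(x)) dx, using a log-sum-exp shift for stability."""
    xs = [(i + 0.5) / grid for i in range(grid)]
    vals = [N * g(x) for x in xs]
    m = max(vals)
    total = sum(math.exp(v - m) for v in vals) / grid
    return (m + math.log(total)) / N

def s_aux(e):
    """Hypothetical auxiliary entropy (assumption): increasing and <= 0 for e > 0."""
    return -1.0 / e

eps = 2.0

def g(x):
    """Exponent per particle, modelled on the integrand of (86):
    (3/2) ln x + s_aux([1 - x] eps)."""
    return 1.5 * math.log(x) + s_aux((1.0 - x) * eps)

for N in (50, 500, 2000):
    print(N, log_integral_over_unit_interval(g, N))
```

For this particular `g` the maximizer solves $3x^2 - 7x + 3 = 0$, i.e. $x_* = (7 - \sqrt{13})/6$, so the printed values can be compared with $g(x_*)$; the gap shrinks like $O(N^{-1}\ln N)$.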
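5.2. Proof of Theorem 2

Since formula (37) holds also under the assumptions (H1)–(H5) on the interactions, and since it is well-known that the system-specific Boltzmann entropy per particle of the perfect gas (39) minimizes Boltzmann's $H$ functional under the constraint of prescribing the value of the kinetic Hamiltonian, it suffices to study the interaction entropy of Boltzmann's ergodic ensemble,
$$
S_{I^{(N)}_\Lambda}(E) = \ln\int\Big(1 - \frac{1}{E}I^{(N)}_\Lambda(q_1,\dots,q_N)\Big)_+^{\frac{3N}{2}-1}\lambda(d^{3N}q). \tag{88}
$$
Note that (88) is non-positive, and under hypotheses (H1)–(H5) we also have
$$
\begin{aligned}
\left[\int\Big(1 - \frac{1}{\varepsilon}\frac{1}{N^2}I^{(N)}_\Lambda(q_1,\dots,q_N)\Big)_+^{\frac{3N}{2}-1}\lambda(d^{3N}q)\right]^{\frac{2}{3N-2}}
&\ge \left[\int\Big(1 - \frac{1}{\varepsilon}\frac{1}{N^2}I^{(N)}_\Lambda(q_1,\dots,q_N)\Big)^{\frac{3N}{2}-1}\prod_{1\le k\le N}\chi_{B_\delta[k]}\lambda(d^3q_k)\right]^{\frac{2}{3N-2}} \\
&\ge |C_\delta|^{\frac{2}{3-2/N}}\int\Big(1 - \frac{1}{\varepsilon}\frac{1}{N^2}I^{(N)}_\Lambda(q_1,\dots,q_N)\Big)\prod_{1\le k\le N}\frac{\chi_{B_\delta[k]}\lambda(d^3q_k)}{\int_{B_\delta[k]\cap\Lambda}\lambda(d^3q)} \\
&\ge |C_\delta|^{\frac{2}{3}}\left[1 - \frac{1}{2}\Big(1 + \frac{\varepsilon_g}{\varepsilon}\Big)\right] > 0,
\end{aligned}
\tag{89}
$$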
where again $C_\delta$ is given in (57), but now with $\delta(\varepsilon)$ independent of $k$ and $N$ chosen so that $N^{-2}I^{(N)}_\Lambda(q_1,\dots,q_N) \le \tilde\varepsilon_g(N) + (\varepsilon - \varepsilon_g)/2$ when the $q_k$ vary in $B_\delta(q_{g,k})\cap\Lambda$, where $(q_{g,1},\dots,q_{g,N})$ is a ground state configuration for $I^{(N)}_\Lambda(q_1,\dots,q_N)$ with a fat neighborhood, which exists by (H2) and (H3). We also used that $\tilde\varepsilon_g(N) = \min N^{-2}I^{(N)}_\Lambda(q_1,\dots,q_N) \le \varepsilon_g$ (see Appendix A). So
$$
\frac{2}{3N-2}S_{I^{(N)}_\Lambda}(E) \ge \ln\left(|C_\delta|^{\frac{2}{3}}\left[1 - \frac{1}{2}\Big(1 + \frac{\varepsilon_g}{\varepsilon}\Big)\right]\right) > -\infty \tag{90}
$$
for all $N > 1$. The estimate (90) guarantees the existence of limit points of the (negative) interaction entropy per particle as $N \to \infty$. We want to show that the interaction entropy per particle actually has a limit and characterize the limit by the variational principle stated in Theorem 2.

We begin by characterizing (88) by its own maximum entropy principle. We introduce the quasi-interaction energy of $\varrho^{(N)} \in P^s(\Lambda^N)$, defined by
$$
\mathcal{Q}^{(N)}_{I/\varepsilon}(\varrho^{(N)}) = \frac{3N-2}{2}\int\ln\Big(1 - \frac{1}{\varepsilon N^2}I^{(N)}_\Lambda(q_1,\dots,q_N)\Big)_+\varrho^{(N)}(d^{3N}q) \tag{91}
$$
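whenever $\mathrm{supp}\,\varrho^{(N)} \subset \mathrm{supp}\,(\varepsilon - N^{-2}I^{(N)}_\Lambda)_+$; else we set $\mathcal{Q}^{(N)}_{I/\varepsilon}(\varrho^{(N)}) = -\infty$. The entropy of $\varrho^{(N)}$ relative to $\varrho^{(N)}_{ap} \in P^s(\Lambda^N)$ is defined as usual$^{u}$ by
$$
\mathcal{R}^{(N)}(\varrho^{(N)}|\varrho^{(N)}_{ap}) = -\int\ln\left(\frac{d\varrho^{(N)}}{d\varrho^{(N)}_{ap}}\right)\varrho^{(N)}(d^{3N}q) \tag{92}
$$
if $\varrho^{(N)}$ is absolutely continuous with respect to the a priori measure $\varrho^{(N)}_{ap}$, and provided the integral in (92) exists. In all other cases, $\mathcal{R}^{(N)}(\varrho^{(N)}|\varrho^{(N)}_{ap}) = -\infty$. Finally, we define what we call the interaction entropy of $\varrho^{(N)}$ by
$$
\mathcal{S}^{(N)}_{I/\varepsilon}(\varrho^{(N)}) \equiv \mathcal{R}^{(N)}(\varrho^{(N)}|\lambda) + \mathcal{Q}^{(N)}_{I/\varepsilon}(\varrho^{(N)}). \tag{93}
$$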
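We are now ready to state our variational principle.

Proposition 4. For $\varepsilon > \varepsilon_g \ge 0$, the interaction entropy functional (93) achieves its supremum. The maximizer is the unique probability measure
$$
\varrho^{(N)}_{N^2\varepsilon}(d^{3N}q) = \frac{\Big(1 - \frac{1}{\varepsilon N^2}I^{(N)}_\Lambda(q_1,\dots,q_N)\Big)_+^{\frac{3N}{2}-1}d^{3N}q}{\int\Big(1 - \frac{1}{\varepsilon N^2}I^{(N)}_\Lambda(\tilde q_1,\dots,\tilde q_N)\Big)_+^{\frac{3N}{2}-1}d^{3N}\tilde q} \in (P^s\cap L^\infty)(\Lambda^N); \tag{94}
$$
thus
$$
\max_{\varrho^{(N)}\in P^s(\Lambda^N)}\mathcal{S}^{(N)}_{I/\varepsilon}(\varrho^{(N)}) = \mathcal{S}^{(N)}_{I/\varepsilon}(\varrho^{(N)}_{N^2\varepsilon}). \tag{95}
$$
Moreover,
$$
\mathcal{S}^{(N)}_{I/\varepsilon}(\varrho^{(N)}_{N^2\varepsilon}) = S_{I^{(N)}_\Lambda}(N^2\varepsilon). \tag{96}
$$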
Proof of Proposition 4. Under our hypotheses on $I^{(N)}_\Lambda$ the measure $\varrho^{(N)}_{N^2\varepsilon}$ is absolutely continuous with respect to $\lambda$ and bounded whenever $\varepsilon > \varepsilon_g$, so the standard convexity argument due to Boltzmann [2], cf. [40, 11], applies and shows that $\mathcal{S}^{(N)}_{I/\varepsilon}(\varrho^{(N)}) - \mathcal{S}^{(N)}_{I/\varepsilon}(\varrho^{(N)}_{N^2\varepsilon}) \le 0$, with equality holding if and only if $\varrho^{(N)} = \varrho^{(N)}_{N^2\varepsilon}$. Identity (96) is verified by explicit calculation.

$^{u}$ Our physicists' sign convention of relative entropy is opposite to the probabilists' one.

Since ultimately we are interested in the limit $N \to \infty$ of our finite-$N$ results, we recall the formalism of probabilities on infinite sequences $\Lambda^{\mathbb{N}}$, as encountered already in Sec. 3 for $(\Lambda\times\mathbb{R}^3)^{\mathbb{N}}$. Thus, by $P^s(\Lambda^{\mathbb{N}})$ we denote the permutation-symmetric probability measures on the set of infinite exchangeable sequences in $\Lambda$. Let $\{{}^n\varrho\}_{n\in\mathbb{N}}$ denote the sequence of marginals of any $\varrho \in P^s(\Lambda^{\mathbb{N}})$. The de Finetti [14], Dynkin [9] and Hewitt–Savage [17] decomposition theorem for $P^s(\Lambda^{\mathbb{N}})$ states that every $\varrho \in P^s(\Lambda^{\mathbb{N}})$ is uniquely presentable as a linear convex superposition of infinite product
measures, i.e. for each $\varrho \in P^s(\Lambda^{\mathbb{N}})$ there exists a unique probability measure $\varsigma(d\rho|\varrho)$ on $P(\Lambda)$, such that for each $n \in \mathbb{N}$,
$$
{}^n\varrho(d^{3n}q) = \int_{P(\Lambda)}\rho^{\otimes n}(d^3q_1\cdots d^3q_n)\,\varsigma(d\rho|\varrho), \tag{97}
$$
where ${}^n\varrho$ is the $n$th marginal measure of $\varrho$, and where $\rho^{\otimes n}(d^3q_1\cdots d^3q_n) \equiv \rho(d^3q_1)\times\cdots\times\rho(d^3q_n)$. Also, (97) expresses the extreme point decomposition of the convex set $P^s(\Lambda^{\mathbb{N}})$, see [17].

Next we would like to formulate the $N = \infty$ analogue of (93), but the naive manipulation of the formulas is not recommended. The functional $\mathcal{Q}^{(N)}_{I/\varepsilon}$ is well-defined by (91) and its accompanying text for all $N \in \mathbb{N}$; however, since our conditions on $I^{(N)}_\Lambda(q_1,\dots,q_N)$ allow it to be unbounded above when two positions $q_k$ and $q_l$ approach each other (for example: Coulomb interactions), we find that $\mathcal{Q}^{(N)}_{I/\varepsilon}(\rho^{\otimes N}) = -\infty$ for all product measures $\rho^{\otimes N}$, but these are exactly the $N$-point marginals of the extreme points of our set of exchangeable measures on the infinite Cartesian product $\Lambda^{\mathbb{N}}$. This obstacle can be circumvented by noting that the finite-$N$ quasi-interaction energy defined in (91) and the line ensuing (91) is the monotone limit of a family of concave functionals in which the integrand function $\ln(1-x)_+$ (with $\ln 0 = -\infty$ understood) is replaced by $\ln(1-x)\chi_{\{x<1-\alpha\}} + [\ln\alpha + (1-\alpha-x)/\alpha]\chi_{\{x\ge 1-\alpha\}}$; thus
$$
{}^{\alpha}\mathcal{Q}^{(N)}_{I/\varepsilon}(\varrho^{(N)}) = \frac{3N-2}{2}\int\left[\ln\Big(1 - \frac{1}{\varepsilon N^2}I^{(N)}_\Lambda\Big)\chi_{\{I^{(N)}_\Lambda<\varepsilon N^2(1-\alpha)\}} + \Big(\ln\alpha + \frac{1}{\alpha}\Big(1 - \alpha - \frac{1}{\varepsilon N^2}I^{(N)}_\Lambda\Big)\Big)\chi_{\{I^{(N)}_\Lambda\ge\varepsilon N^2(1-\alpha)\}}\right]\varrho^{(N)}(d^{3N}q), \tag{98}
$$
where we omitted the argument $(q_1,\dots,q_N)$ from $I^{(N)}_\Lambda$, for brevity, and
$$
\mathcal{Q}^{(N)}_{I/\varepsilon}(\varrho^{(N)}) = \lim_{\alpha\downarrow 0}{}^{\alpha}\mathcal{Q}^{(N)}_{I/\varepsilon}(\varrho^{(N)}). \tag{99}
$$
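The regularization behind (98) replaces $\ln(1-x)_+$ by its tangent-line continuation beyond $x = 1-\alpha$. The sketch below (function names are ours, introduced only for illustration) evaluates both integrand functions so that one can see the regularized version is finite everywhere, dominates $\ln(1-x)_+$, and decreases monotonically as $\alpha \downarrow 0$, as required for the monotone limit (99).

```python
import math

def log_one_minus_plus(x):
    """ln(1 - x)_+ with the convention ln 0 = -infinity."""
    return math.log(1.0 - x) if x < 1.0 else -math.inf

def truncated_log(x, alpha):
    """Concave regularization of the integrand in (98): ln(1 - x) below the
    threshold 1 - alpha, continued above it by the tangent line at 1 - alpha."""
    if x < 1.0 - alpha:
        return math.log(1.0 - x)
    return math.log(alpha) + (1.0 - alpha - x) / alpha

# Compare the regularized integrand for decreasing alpha at a few sample points.
for x in (0.0, 0.5, 0.95, 1.0, 1.5):
    row = [truncated_log(x, a) for a in (0.5, 0.1, 0.01)]
    print(x, row, log_one_minus_plus(x))
```

Value and slope agree at $x = 1-\alpha$ (both branches give $\ln\alpha$ and slope $-1/\alpha$), which is why concavity is preserved.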
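We also define ${}^{\alpha}\mathcal{S}^{(N)}_{I/\varepsilon}(\varrho^{(N)})$ precisely like $\mathcal{S}^{(N)}_{I/\varepsilon}(\varrho^{(N)})$ except that $\mathcal{Q}^{(N)}_{I/\varepsilon}(\varrho^{(N)})$ is replaced by ${}^{\alpha}\mathcal{Q}^{(N)}_{I/\varepsilon}(\varrho^{(N)})$. We have ${}^{\alpha}\mathcal{S}^{(N)}_{I/\varepsilon}(\rho^{\otimes N}) > -\infty$ for all $\rho \in (P\cap L^1\ln L^1)(\Lambda)$, and $\lim_{\alpha\downarrow 0}{}^{\alpha}\mathcal{S}^{(N)}_{I/\varepsilon}(\rho^{\otimes N}) = -\infty$ whenever $I/\varepsilon \le 1$. By ${}^{\alpha}\varrho^{(N)}_{N^2\varepsilon}$ we denote the unique maximizer of ${}^{\alpha}\mathcal{S}^{(N)}_{I/\varepsilon}(\varrho^{(N)})$, easily proven to exist as done for $\mathcal{S}^{(N)}_{I/\varepsilon}(\varrho^{(N)})$. Equally easily we find $\lim_{\alpha\downarrow 0}{}^{\alpha}\mathcal{S}^{(N)}_{I/\varepsilon}({}^{\alpha}\varrho^{(N)}_{N^2\varepsilon}) = \mathcal{S}^{(N)}_{I/\varepsilon}(\varrho^{(N)}_{N^2\varepsilon})$.

We are now ready to formulate the $N = \infty$ analogue of (93). To define the mean quasi-interaction energy of $\varrho \in P^s(\Lambda^{\mathbb{N}})$, we introduce the subset $P^s_{U^2_\Lambda}(\Lambda^{\mathbb{N}}) \subset P^s(\Lambda^{\mathbb{N}})$ for which the expected value of $U^2_\Lambda$ is finite; i.e. $\int\!\!\int U^2_\Lambda(q,q')\,{}^2\varrho(d^3q\,d^3q') < \infty$, where ${}^2\varrho(d^3q\,d^3q')$ is the second marginal measure of $\varrho \in P^s_{U^2_\Lambda}(\Lambda^{\mathbb{N}})$. Also, by $P_{U^2_\Lambda}(\Lambda)$ we denote the subset of $P(\Lambda)$ which consists of Lebesgue-absolutely continuous probability measures $\rho$ for which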
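$\int\!\!\int U^2_\Lambda(q,q')\rho^{\otimes 2}(d^3q\,d^3q') < \infty$, which implies $\langle\rho,\rho\rangle < \infty$; here we recycled the bilinear form notation (43) for lower semi-continuous (rather than continuous) $U_\Lambda$. If $\varrho \in P^s_{U^2_\Lambda}(\Lambda^{\mathbb{N}})$, then the decomposition measure $\varsigma(d\rho|\varrho)$ is concentrated on $P_{U^2_\Lambda}(\Lambda)$; this can be shown by adapting arguments from [17]; cf. also [31]. The mean quasi-interaction energy of $\varrho \in P^s_{U^2_\Lambda}(\Lambda^{\mathbb{N}})$ is defined as
$$
\mathcal{Q}_{I/\varepsilon}(\varrho) \equiv \lim_{\alpha\downarrow 0}\lim_{n\to\infty}\frac{1}{n}\,{}^{\alpha}\mathcal{Q}^{(n)}_{I/\varepsilon}({}^n\varrho). \tag{100}
$$
We show that $\mathcal{Q}_{I/\varepsilon}(\varrho)$ is well-defined. By the linearity of ${}^n\varrho \mapsto {}^{\alpha}\mathcal{Q}^{(n)}_{I/\varepsilon}({}^n\varrho)$, the presentation (97) yields
$$
{}^{\alpha}\mathcal{Q}^{(n)}_{I/\varepsilon}({}^n\varrho) = \int{}^{\alpha}\mathcal{Q}^{(n)}_{I/\varepsilon}(\rho^{\otimes n})\,\varsigma(d\rho|\varrho), \tag{101}
$$
and on $P^s_{U^2_\Lambda}(\Lambda^{\mathbb{N}})$ the conventional law of large numbers for U-statistics applies (see [18]) and yields
$$
\lim_{n\to\infty}\frac{1}{n}\,{}^{\alpha}\mathcal{Q}^{(n)}_{I/\varepsilon}(\rho^{\otimes n}) = \frac{3}{2}\left[\ln\Big(1 - \frac{1}{\varepsilon}\langle\rho,\rho\rangle\Big)\chi_{\{\langle\rho,\rho\rangle<\varepsilon(1-\alpha)\}} + \Big(\ln\alpha + \frac{1}{\alpha}\Big(1 - \alpha - \frac{1}{\varepsilon}\langle\rho,\rho\rangle\Big)\Big)\chi_{\{\langle\rho,\rho\rangle\ge\varepsilon(1-\alpha)\}}\right]. \tag{102}
$$
Clearly, when $\alpha \downarrow 0$ in (102) the "value" $-\infty$ is assigned to all $\rho$ for which $\langle\rho,\rho\rangle \ge \varepsilon$; the $\alpha \downarrow 0$ limit is finite when $\langle\rho,\rho\rangle < \varepsilon$. We conclude with:

Lemma 1. The mean quasi-interaction energy (100) is well-defined and affine linear. For $\varrho \in P^s_{U^2_\Lambda}(\Lambda^{\mathbb{N}})$ having decomposition measure $\varsigma(d\rho|\varrho)$ supported entirely by $\rho$ for which $\langle\rho,\rho\rangle < \varepsilon$, we have (100) given by
$$
\mathcal{Q}_{I/\varepsilon}(\varrho) = \int\mathcal{Q}_{I/\varepsilon}(\rho)\,\varsigma(d\rho|\varrho), \tag{103}
$$
where
$$
\mathcal{Q}_{I/\varepsilon}(\rho) \equiv \frac{3}{2}\ln\Big(1 - \frac{1}{\varepsilon}\langle\rho,\rho\rangle\Big); \tag{104}
$$
otherwise, $\mathcal{Q}_{I/\varepsilon}(\varrho) = -\infty$.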
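The $N = \infty$ analogue of (92) is the well-known mean (relative) entropy of $\varrho \in P^s(\Lambda^{\mathbb{N}})$, which is well-defined as limit
$$
\mathcal{R}(\varrho) \equiv \lim_{n\to\infty}\frac{1}{n}\mathcal{R}^{(n)}({}^n\varrho|\lambda). \tag{105}
$$
Here, $\mathcal{R}^{(n)}({}^n\varrho|\lambda)$, $n \in \{0, 1, \dots\}$, is the relative entropy of ${}^n\varrho$, as defined in (92); we also set $\mathcal{R}^{(-k)}({}^{-k}\varrho|\lambda) \equiv 0$ for all $k \in \mathbb{N}$. The limit (105) exists or is $-\infty$. This is a consequence of the next lemma, which holds for $\varrho \in P^s(\Lambda^{\mathbb{N}})$ or $\varrho^{(N)} \in P^s(\Lambda^N)$. If $\varrho = \varrho^{(N)}$, it is understood that $k \le N$ in ${}^k\varrho$.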
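Lemma 2. Relative entropy $n \mapsto \mathcal{R}^{(n)}({}^n\varrho|\lambda)$ has the following properties:

(a) Non-positivity: For all $n$,
$$
\mathcal{R}^{(n)}({}^n\varrho|\lambda) \le 0; \tag{106}
$$
(b) Monotonic decrease: If $n > m$ then
$$
\mathcal{R}^{(n)}({}^n\varrho|\lambda) \le \mathcal{R}^{(m)}({}^m\varrho|\lambda); \tag{107}
$$
(c) Strong sub-additivity: For $m, n \le \ell$, and $k = \ell - m - n$,
$$
\mathcal{R}^{(\ell)}({}^{\ell}\varrho|\lambda) \le \mathcal{R}^{(m)}({}^m\varrho|\lambda) + \mathcal{R}^{(n)}({}^n\varrho|\lambda) + \mathcal{R}^{(k)}({}^k\varrho|\lambda) - \mathcal{R}^{(-k)}({}^{-k}\varrho|\lambda). \tag{108}
$$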
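Properties (a)–(c) can be checked by hand on a toy exchangeable law. The sketch below is an illustration under stated assumptions, not part of the proof: it replaces $\Lambda$ by the two-point set $\{0,1\}$, takes $\lambda$ uniform, and builds the marginals from a hypothetical two-atom decomposition measure, a discrete analogue of (97).

```python
import math
from itertools import product

def mixture_marginal(n, components):
    """n-th marginal of an exchangeable law on {0,1}^N written as a finite
    mixture of i.i.d. Bernoulli product measures (discrete analogue of (97)).
    components: list of (weight, p) with p the probability of outcome 1."""
    law = {}
    for config in product((0, 1), repeat=n):
        k = sum(config)
        law[config] = sum(w * p**k * (1 - p)**(n - k) for w, p in components)
    return law

def relative_entropy(law, n):
    """Physicists' sign convention: R = -sum rho ln(rho / lambda) <= 0,
    with lambda the uniform probability on {0,1}^n."""
    lam = 0.5**n
    return -sum(p * math.log(p / lam) for p in law.values() if p > 0)

# Hypothetical decomposition measure: equal weights on Bernoulli(0.2), Bernoulli(0.8).
components = [(0.5, 0.2), (0.5, 0.8)]
R = {n: relative_entropy(mixture_marginal(n, components), n) for n in range(1, 7)}
for n in sorted(R):
    print(n, R[n], R[n] / n)
```

In this example the one-point marginal is uniform, so $\mathcal{R}^{(1)} = 0$, while the higher marginals have strictly negative relative entropy because the mixture correlates the sites.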
The proof of Lemma 2 is a straightforward adaptation from a proof by Robinson and Ruelle [39, Sec. 2, proof of Proposition 1] for the standard-thermodynamic-limit problem to the Vlasov limit, studied here, cf. [19]. The next lemma also has an elementary proof which likewise is an adaptation from [39], proof of their Proposition 3, cf. [19].

Lemma 3. The mean entropy functional (105) is affine linear.

Lemma 3 in conjunction with the de Finetti [14], Dynkin [9] and Hewitt–Savage [17] decomposition theorem for $P^s(\Lambda^{\mathbb{N}})$ yields a key formula for the mean entropy which does not hold for the finite-$N$ entropy. Namely, as a consequence of Lemma 3, the extremal decomposition of $\varrho$ yields
$$
\mathcal{R}(\varrho) = \int\mathcal{R}(\rho|\lambda)\,\varsigma(d\rho|\varrho), \tag{109}
$$
where we also set $\mathcal{R}(\rho|\lambda) \equiv \mathcal{R}^{(1)}(\rho|\lambda)$. Lemma 4, also proved by adaptation of a corresponding proof in [39, Proposition 4], ends the listing of properties of mean relative entropy (105).

Lemma 4. The mean entropy functional is weakly upper semi-continuous.

Finally we define the mean interaction entropy of $\varrho \in P^s(\Lambda^{\mathbb{N}})$,
$$
\mathcal{S}_{I/\varepsilon}(\varrho) \equiv \mathcal{R}(\varrho) + \mathcal{Q}_{I/\varepsilon}(\varrho). \tag{110}
$$
By (109) and (103) we have
$$
\mathcal{S}_{I/\varepsilon}(\varrho) = \int_{P(\Lambda)}\mathcal{S}_{I/\varepsilon}(\rho)\,\varsigma(d\rho|\varrho), \tag{111}
$$
where we introduced the functional
$$
\mathcal{S}_{I/\varepsilon}(\rho) \equiv \mathcal{R}(\rho|\lambda) + \mathcal{Q}_{I/\varepsilon}(\rho), \tag{112}
$$
which is well-defined and finite whenever $\rho \in (P\cap L^1\ln L^1)(\Lambda)$ and $\langle\rho,\rho\rangle < \varepsilon$; else we have $\mathcal{S}_{I/\varepsilon}(\rho) = -\infty$. Note that $\mathcal{S}_{I/\varepsilon}(\rho) \le 0$, for $\mathcal{R}(\rho|\lambda) \le 0$ and $\mathcal{Q}_{I/\varepsilon}(\rho) \le 0$, the latter because $U_\Lambda \ge 0$ by hypothesis.
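Because of (111) the problem of maximizing $\mathcal{S}_{I/\varepsilon}(\varrho)$ reduces to maximizing $\mathcal{S}_{I/\varepsilon}(\rho)$ given in (112).

Proposition 5. $\mathcal{S}_{I/\varepsilon}(\rho)$ is weakly upper semi-continuous for $\varepsilon > \varepsilon_g \ge 0$ and takes its finite non-positive maximum at a solution of the fixed point equation
$$
\rho(q) = \frac{\exp\Big(-\vartheta_\varepsilon^{-1}(\rho)\int_\Lambda U_\Lambda(q,\tilde q)\rho(\tilde q)\,d^3\tilde q\Big)}{\int_\Lambda\exp\Big(-\vartheta_\varepsilon^{-1}(\rho)\int_\Lambda U_\Lambda(\hat q,\tilde q)\rho(\tilde q)\,d^3\tilde q\Big)d^3\hat q}, \tag{113}
$$
where
$$
\vartheta_\varepsilon(\rho) = \frac{2}{3}\Big(1 - \frac{1}{\varepsilon}\langle\rho,\rho\rangle\Big)\varepsilon > 0. \tag{114}
$$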
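To get a feeling for the fixed point problem (113)–(114), one can discretize it. The following is a rough sketch under several assumptions that are ours and not the paper's: $\Lambda$ is replaced by the unit interval with a uniform grid, $U_\Lambda$ is a hypothetical bounded Gaussian kernel, $\langle\rho,\rho\rangle$ is taken as $\frac{1}{2}\int\!\!\int U_\Lambda\rho\rho$, and a damped Picard iteration is used without any claim that it converges in general.

```python
import math

def apply_map(U, dq, eps, rho):
    """One application of the right-hand side of (113) on a uniform grid.
    Returns the new density and theta = (2/3) (eps - <rho,rho>), cf. (114)."""
    n = len(rho)
    phi = [sum(U[i][j] * rho[j] for j in range(n)) * dq for i in range(n)]
    energy = 0.5 * sum(phi[i] * rho[i] for i in range(n)) * dq   # <rho,rho>
    theta = (2.0 / 3.0) * (eps - energy)
    if theta <= 0:
        raise ValueError("need <rho,rho> < eps")
    w = [math.exp(-p / theta) for p in phi]
    Z = sum(w) * dq
    return [x / Z for x in w], theta

def solve_fixed_point(U, dq, eps, damping=0.5, iters=300):
    """Damped Picard iteration started from the uniform density."""
    n = len(U)
    rho = [1.0 / (n * dq)] * n
    for _ in range(iters):
        rho_new, _ = apply_map(U, dq, eps, rho)
        rho = [(1 - damping) * r + damping * t for r, t in zip(rho, rho_new)]
    return rho

# Hypothetical setup: 1D grid on [0, 1], repulsive bounded kernel, energy eps = 2.
n = 40
dq = 1.0 / n
q = [(i + 0.5) * dq for i in range(n)]
U = [[0.5 * math.exp(-(a - b) ** 2 / 0.1) for b in q] for a in q]
eps = 2.0
rho = solve_fixed_point(U, dq, eps)
print(rho[0], rho[n // 2], apply_map(U, dq, eps, rho)[1])
```

In this toy setting the repulsive kernel pushes mass towards the boundary of the interval, and the returned `theta` plays the role of $\vartheta_\varepsilon(\rho)$.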
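Proof of Proposition 5. Since relative entropy $\mathcal{R}(\rho|\lambda)$ is weakly upper semi-continuous ([38, Suppl. to IV.5]; [11, Chap. VIII]), and since the functional $\mathcal{Q}_{I/\varepsilon}(\rho)$ is weakly upper semi-continuous as a consequence of hypothesis (H2) and the positivity of $U_\Lambda$, so is $\mathcal{S}_{I/\varepsilon}(\rho)$. Since $\Lambda$ is compact, $\mathcal{S}_{I/\varepsilon}(\rho)$ now takes its maximum, which is non-positive because $\mathcal{S}_{I/\varepsilon}(\rho) \le 0$, and finite (i.e. $> -\infty$) because of the following. Let $k \mapsto \rho^{(k)}$ in $(P\cap C^\infty_0)(\Lambda)$ be a minimizing sequence for $\langle\rho,\rho\rangle$. Since $\varepsilon > \varepsilon_g \ge 0$, by (H3) there is a $K$ such that $\varepsilon_g < \langle\rho^{(k)},\rho^{(k)}\rangle < \varepsilon$ for all $k \ge K$. Then $\max_\rho\mathcal{S}_{I/\varepsilon}(\rho) \ge \mathcal{S}_{I/\varepsilon}(\rho^{(K)}) = \mathcal{R}(\rho^{(K)}|\lambda) + \mathcal{Q}_{I/\varepsilon}(\rho^{(K)}) > -\infty$.

Let $q \mapsto \rho_\varepsilon(q)$ denote any maximizer for $\mathcal{S}_{I/\varepsilon}(\rho)$. Suppose $\langle\rho_\varepsilon,\rho_\varepsilon\rangle \ge \varepsilon$. Then $\mathcal{Q}_{I/\varepsilon}(\rho_\varepsilon) = -\infty$, and because $\mathcal{R}(\rho_\varepsilon|\lambda) \le 0$ then also $\mathcal{S}_{I/\varepsilon}(\rho_\varepsilon) = -\infty$. Therefore $\langle\rho_\varepsilon,\rho_\varepsilon\rangle < \varepsilon$ strictly, and since $\varepsilon > 0$, this proves (114). The standard variational argument now shows that the maximizer satisfies the Euler–Lagrange equation for $\mathcal{S}_{I/\varepsilon}(\rho)$, which is (113).

Corollary 1. The functional $\mathcal{S}_{I/\varepsilon}(\varrho)$ given in (110) achieves its supremum. If $\varrho_\varepsilon$ is a maximizer of $\mathcal{S}_{I/\varepsilon}(\varrho)$, then the support of its decomposition measure $\varsigma(d\rho|\varrho_\varepsilon)$ is the set of maximizers $\{\rho_\varepsilon\}$ of the functional $\mathcal{S}_{I/\varepsilon}(\rho)$ given in (112).

Proof of Corollary 1. Abstractly, by Lemma 4 and the linearity of the mean quasi-interaction energy functional, the mean interaction entropy functional $\mathcal{S}_{I/\varepsilon}(\varrho)$ given in (110) is weakly upper semi-continuous, and so achieves its supremum over the compact set of permutation symmetric probabilities $P^s_{U^2_\Lambda}(\Lambda^{\mathbb{N}})$.

Alternatively, by (111) and two obvious estimates, we have right away that
$$
\mathcal{S}_{I/\varepsilon}(\rho_\varepsilon) = \mathcal{S}_{I/\varepsilon}(\rho_\varepsilon^{\otimes\mathbb{N}}) \le \sup_{\varrho}\mathcal{S}_{I/\varepsilon}(\varrho) \le \max_\rho\mathcal{S}_{I/\varepsilon}(\rho) = \mathcal{S}_{I/\varepsilon}(\rho_\varepsilon), \tag{115}
$$
so $\sup\mathcal{S}_{I/\varepsilon}(\varrho) = \max\mathcal{S}_{I/\varepsilon}(\varrho) = \mathcal{S}_{I/\varepsilon}(\rho_\varepsilon^{\otimes\mathbb{N}})$. Now let $\varrho_\varepsilon$ maximize $\mathcal{S}_{I/\varepsilon}(\varrho)$ and suppose that $\mathrm{supp}\,\varsigma(d\rho|\varrho_\varepsilon)$ is not a subset of the maximizers $\{\rho_\varepsilon\}$ of $\mathcal{S}_{I/\varepsilon}(\rho)$. Then
$$
\mathcal{S}_{I/\varepsilon}(\varrho_\varepsilon) = \int_{P(\Lambda)}\mathcal{S}_{I/\varepsilon}(\rho)\,\varsigma(d\rho|\varrho_\varepsilon) < \max_\rho\mathcal{S}_{I/\varepsilon}(\rho) = \mathcal{S}_{I/\varepsilon}(\rho_\varepsilon^{\otimes\mathbb{N}}), \tag{116}
$$
so $\varrho_\varepsilon$ is not a maximizer, a contradiction to the supposition.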
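We now relate the sequence of maximizers $\{\varrho^{(N)}_{N^2\varepsilon}\}_{N\in\mathbb{N}}$ of $\{\mathcal{S}^{(N)}_{I/\varepsilon}\}$ to the set of maximizers $\{\rho_\varepsilon\}$ of $\mathcal{S}_{I/\varepsilon}$. We begin with the maxima of $\mathcal{S}^{(N)}_{I/\varepsilon}(\varrho^{(N)})$ and $\mathcal{S}_{I/\varepsilon}(\rho)$.

Proposition 6. We have
$$
\lim_{N\to\infty}\frac{1}{N}\mathcal{S}^{(N)}_{I/\varepsilon}(\varrho^{(N)}_{N^2\varepsilon}) = \mathcal{S}_{I/\varepsilon}(\rho_\varepsilon). \tag{117}
$$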
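Proof of Proposition 6. For all $\alpha \in (0,1)$, we have
$$
{}^{\alpha}\mathcal{S}^{(N)}_{I/\varepsilon}({}^{\alpha}\varrho^{(N)}_{N^2\varepsilon}) \ge {}^{\alpha}\mathcal{S}^{(N)}_{I/\varepsilon}(\rho_\varepsilon^{\otimes N}). \tag{118}
$$
We compute
$$
{}^{\alpha}\mathcal{S}^{(N)}_{I/\varepsilon}(\rho_\varepsilon^{\otimes N}) = N\mathcal{R}^{(1)}(\rho_\varepsilon|\lambda) + {}^{\alpha}\mathcal{Q}^{(N)}_{I/\varepsilon}(\rho_\varepsilon^{\otimes N}). \tag{119}
$$
Since $\langle\rho_\varepsilon,\rho_\varepsilon\rangle < \varepsilon$, when $\alpha \in (0,1)$ is sufficiently small we have by (H4) and Proposition 5 that
$$
\lim_{N\to\infty}\frac{1}{N}\,{}^{\alpha}\mathcal{Q}^{(N)}_{I/\varepsilon}(\rho_\varepsilon^{\otimes N}) = \frac{3}{2}\ln\Big(1 - \frac{1}{\varepsilon}\langle\rho_\varepsilon,\rho_\varepsilon\rangle\Big). \tag{120}
$$
Hence, for all sufficiently small $\alpha \in (0,1)$,
$$
\lim_{N\to\infty}\frac{1}{N}\,{}^{\alpha}\mathcal{S}^{(N)}_{I/\varepsilon}(\rho_\varepsilon^{\otimes N}) = \mathcal{S}_{I/\varepsilon}(\rho_\varepsilon). \tag{121}
$$
Thus
$$
\liminf_{N\to\infty}\frac{1}{N}\,{}^{\alpha}\mathcal{S}^{(N)}_{I/\varepsilon}({}^{\alpha}\varrho^{(N)}_{N^2\varepsilon}) \ge \mathcal{S}_{I/\varepsilon}(\rho_\varepsilon) \tag{122}
$$
for all sufficiently small $\alpha \in (0,1)$, and this yields the first desired estimate
$$
\liminf_{N\to\infty}\frac{1}{N}\mathcal{S}^{(N)}_{I/\varepsilon}(\varrho^{(N)}_{N^2\varepsilon}) \ge \mathcal{S}_{I/\varepsilon}(\rho_\varepsilon). \tag{123}
$$
Now consider (94) as extended to a probability on $\overline{\Lambda}^N$. Since $\Lambda$ is bounded, $\overline{\Lambda}$ is compact, and then the sequence $\{\varrho^{(N)}_{N^2\varepsilon}\}_{N\in\mathbb{N}}$ is weakly compact, so
$$
\lim_{N\to\infty}{}^n\varrho^{(\dot N[N])}_{\dot N^2\varepsilon} = {}^n\dot\varrho_\varepsilon \in P^s(\overline{\Lambda}^n) \quad \forall\, n \in \mathbb{N}, \tag{124}
$$
after extraction of a subsequence $\{\varrho^{(\dot N[N])}\}_{N\in\mathbb{N}}$; note that the $\{{}^n\dot\varrho_\varepsilon\}_{n\in\mathbb{N}}$ form a compatible sequence of marginals. Furthermore, we have $\int_{\partial\Lambda}{}^1\dot\varrho_\varepsilon(d^3q) = 0$, or else $\mathcal{R}(\dot\varrho_\varepsilon) = -\infty$, a contradiction; so ${}^n\dot\varrho_\varepsilon \in P^s(\Lambda^n)$.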
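Following [31, 19] we now use sub-additivity of relative entropy (property (c) in Lemma 2) and then negativity of relative entropy (property (a) in Lemma 2) (valid also with $\varrho^{(\dot N)}_{\dot N^2\varepsilon}$ in place of $\varrho$), and obtain
$$
\begin{aligned}
\mathcal{R}^{(\dot N)}(\varrho^{(\dot N)}_{\dot N^2\varepsilon}|\lambda) &\le \Big\lfloor\frac{\dot N}{n}\Big\rfloor\mathcal{R}^{(n)}({}^n\varrho^{(\dot N)}_{\dot N^2\varepsilon}|\lambda) + \mathcal{R}^{(m)}({}^m\varrho^{(\dot N)}_{\dot N^2\varepsilon}|\lambda) \\
&\le \Big\lfloor\frac{\dot N}{n}\Big\rfloor\mathcal{R}^{(n)}({}^n\varrho^{(\dot N)}_{\dot N^2\varepsilon}|\lambda)
\end{aligned}
\tag{125}
$$
where $\lfloor a/b\rfloor$ is the integer part of $a/b$, and where $m < n$. Upper semi-continuity for the relative entropy gives
$$
\limsup_{N\to\infty}\mathcal{R}^{(n)}({}^n\varrho^{(\dot N[N])}_{\dot N^2\varepsilon}|\lambda) \le \mathcal{R}^{(n)}({}^n\dot\varrho_\varepsilon|\lambda), \tag{126}
$$
while $\frac{1}{\dot N}\lfloor\frac{\dot N}{n}\rfloor \to \frac{1}{n}$. Hence, dividing (125) by $\dot N[N]$ and letting $N \to \infty$ gives
$$
\limsup_{N\to\infty}\frac{1}{\dot N}\mathcal{R}^{(\dot N)}(\varrho^{(\dot N)}_{\dot N^2\varepsilon}|\lambda) \le \frac{1}{n}\mathcal{R}^{(n)}({}^n\dot\varrho_\varepsilon|\lambda) \quad \forall\, n \in \mathbb{N}, \tag{127}
$$
and now taking the supremum over $n$ (equivalently: the limit $n \to \infty$) we get
$$
\limsup_{N\to\infty}\frac{1}{\dot N}\mathcal{R}^{(\dot N)}(\varrho^{(\dot N)}_{\dot N^2\varepsilon}|\lambda) \le \mathcal{R}(\dot\varrho_\varepsilon). \tag{128}
$$
Lastly, using (109) in (128) yields
$$
\limsup_{N\to\infty}\frac{1}{\dot N}\mathcal{R}^{(\dot N)}(\varrho^{(\dot N)}_{\dot N^2\varepsilon}|\lambda) \le \int\mathcal{R}(\rho|\lambda)\,\varsigma(d\rho|\dot\varrho_\varepsilon) \tag{129}
$$
where $\varsigma(d\rho|\dot\varrho_\varepsilon)$ is the Hewitt–Savage decomposition measure for $\dot\varrho_\varepsilon$. For each $\rho \in \mathrm{supp}\,\varsigma(d\rho|\dot\varrho_\varepsilon)$ we can choose a family of $\varrho^{(\dot N)}[\rho] \in P^s(\Lambda^{\dot N})$ satisfying
$$
\lim_{N\to\infty}{}^n\varrho^{(\dot N)}[\rho] = \rho^{\otimes n} \tag{130}
$$
for each $n \in \mathbb{N}$, such that for each $\dot N[N]$, with $N \in \mathbb{N}$, we have
$$
\varrho^{(\dot N)}_{\dot N^2\varepsilon} = \int\varrho^{(\dot N)}[\rho]\,\varsigma(d\rho|\dot\varrho_\varepsilon). \tag{131}
$$
In contrast to the de Finetti–Dynkin–Hewitt–Savage decomposition, this finite-$N$ decomposition is not unique, but this is immaterial. We remark that in the physically (presumably) most important situations, namely when $\mathrm{supp}\,\varsigma(d\rho|\varrho_\varepsilon)$ is either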
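a finite set or a continuous group orbit of a compact group, then a decomposition (131) satisfying (130) can easily be constructed explicitly, as shown in Appendix B.

By (131), the linearity of the map $\varrho^{(\dot N)} \mapsto {}^{\alpha}\mathcal{Q}^{(\dot N)}_{I/\varepsilon}(\varrho^{(\dot N)})$ gives
$$
{}^{\alpha}\mathcal{Q}^{(\dot N)}_{I/\varepsilon}(\varrho^{(\dot N)}_{\dot N^2\varepsilon}) = \int{}^{\alpha}\mathcal{Q}^{(\dot N)}_{I/\varepsilon}(\varrho^{(\dot N)}[\rho])\,\varsigma(d\rho|\dot\varrho_\varepsilon), \tag{132}
$$
and by the concavity of the map $I \mapsto {}^{\alpha}\mathcal{Q}^{(\dot N)}_{I/\varepsilon}(\varrho^{(\dot N)})$, Jensen's inequality gives
$$
{}^{\alpha}\mathcal{Q}^{(\dot N)}_{I/\varepsilon}(\varrho^{(\dot N)}[\rho]) \le \frac{3\dot N-2}{2}\left[\ln\Big(1 - \frac{1}{\varepsilon}\mathcal{U}^{(\dot N)}(\varrho^{(\dot N)})\Big)\chi_{\{\mathcal{U}^{(\dot N)}(\varrho^{(\dot N)})<\varepsilon(1-\alpha)\}} + \Big(\ln\alpha + \frac{1}{\alpha}\Big(1 - \alpha - \frac{1}{\varepsilon}\mathcal{U}^{(\dot N)}(\varrho^{(\dot N)})\Big)\Big)\chi_{\{\mathcal{U}^{(\dot N)}(\varrho^{(\dot N)})\ge\varepsilon(1-\alpha)\}}\right], \tag{133}
$$
where
$$
\mathcal{U}^{(\dot N)}(\varrho^{(\dot N)}) = (1 - \dot N^{-1})\frac{1}{2}\int\!\!\int U_\Lambda(\check q,\hat q)\,{}^2\varrho^{(\dot N)}[\rho](d^3\check q\,d^3\hat q). \tag{134}
$$
The weak lower semi-continuity of $U_\Lambda$ now gives
$$
\liminf_{N\to\infty}\frac{1}{2}\int\!\!\int U_\Lambda\,{}^2\varrho^{(\dot N)}[\rho]\,d^6q \ge \langle\rho,\rho\rangle, \tag{135}
$$
and since $\dot N^{-1} \to 0$, we find for each convergent subsequence of measures that
$$
\limsup_{N\to\infty}\frac{1}{\dot N}\,{}^{\alpha}\mathcal{Q}^{(\dot N)}_{I/\varepsilon}(\varrho^{(\dot N)}[\rho]) \le \frac{3}{2}\left[\ln\Big(1 - \frac{1}{\varepsilon}\langle\rho,\rho\rangle\Big)\chi_{\{\langle\rho,\rho\rangle<\varepsilon(1-\alpha)\}} + \Big(\ln\alpha + \frac{1}{\alpha}\Big(1 - \alpha - \frac{1}{\varepsilon}\langle\rho,\rho\rangle\Big)\Big)\chi_{\{\langle\rho,\rho\rangle\ge\varepsilon(1-\alpha)\}}\right] \tag{136}
$$
for each $\alpha \in (0,1)$. Now suppose that $\langle\rho,\rho\rangle \ge \varepsilon$; then the right-hand side of (136) $\downarrow -\infty$ as $\alpha \downarrow 0$, in which case by (136) and (132) also ${}^{\alpha}\mathcal{Q}^{(\dot N)}_{I/\varepsilon}(\varrho^{(\dot N)}_{\dot N^2\varepsilon}) \downarrow -\infty$ as $\alpha \downarrow 0$, and by (99) and (93) and property (a) in Lemma 2, and then Proposition 4, this contradicts the lower bound (90). Therefore, $\langle\rho,\rho\rangle < \varepsilon$ for every $\rho \in \mathrm{supp}\,\varsigma(d\rho|\dot\varrho_\varepsilon)$, and so, recalling (104), we conclude that
$$
\limsup_{N\to\infty}\frac{1}{\dot N}\mathcal{Q}^{(\dot N)}_{I/\varepsilon}(\varrho^{(\dot N)}_{\dot N^2\varepsilon}) \le \int\frac{3}{2}\ln\Big(1 - \frac{1}{\varepsilon}\langle\rho,\rho\rangle\Big)\varsigma(d\rho|\dot\varrho_\varepsilon) = \int\mathcal{Q}_{I/\varepsilon}(\rho)\,\varsigma(d\rho|\dot\varrho_\varepsilon). \tag{137}
$$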
The estimates (137) and (129) and two obvious estimates now give
$$
\begin{aligned}
\limsup_{N\to\infty}\frac{1}{\dot N}\mathcal{S}^{(\dot N)}_{I/\varepsilon}(\varrho^{(\dot N)}_{\dot N^2\varepsilon}) &\le \limsup_{N\to\infty}\frac{1}{\dot N}\mathcal{R}^{(\dot N)}(\varrho^{(\dot N)}_{\dot N^2\varepsilon}|\lambda) + \limsup_{N\to\infty}\frac{1}{\dot N}\mathcal{Q}^{(\dot N)}_{I/\varepsilon}(\varrho^{(\dot N)}_{\dot N^2\varepsilon}) \\
&\le \int\mathcal{R}(\rho|\lambda)\,\varsigma(d\rho|\dot\varrho_\varepsilon) + \int\mathcal{Q}_{I/\varepsilon}(\rho)\,\varsigma(d\rho|\dot\varrho_\varepsilon) \\
&= \int\mathcal{S}_{I/\varepsilon}(\rho)\,\varsigma(d\rho|\dot\varrho_\varepsilon) \\
&\le \max_\rho\mathcal{S}_{I/\varepsilon}(\rho),
\end{aligned}
\tag{138}
$$
and since this holds for each limit point $\dot\varrho_\varepsilon$ we can drop the dot to get
$$
\limsup_{N\to\infty}\frac{1}{N}\mathcal{S}^{(N)}_{I/\varepsilon}(\varrho^{(N)}_{N^2\varepsilon}) \le \mathcal{S}_{I/\varepsilon}(\rho_\varepsilon). \tag{139}
$$
By (123) and (139), Proposition 6 is proved.

By Propositions 4–6, the interaction entropy per particle for Boltzmann's ergodic ensemble converges as follows,
$$
s_{\Lambda,I}(\varepsilon) \equiv \lim_{N\to\infty}\frac{1}{N}S_{I^{(N)}_\Lambda}(N^2\varepsilon) = \mathcal{S}_{I/\varepsilon}(\rho_\varepsilon), \tag{140}
$$
and $s_{\Lambda,I}(\varepsilon)$ is characterized by its own variational principle expressed in Proposition 5. Moreover, by formula (37), which holds under assumptions (H1)–(H5), the expansion (23) now follows, with
$$
s_\Lambda(\varepsilon) = s_{\Lambda,K}(\varepsilon) + s_{\Lambda,I}(\varepsilon), \tag{141}
$$
where $s_{\Lambda,K}(\varepsilon)$ is given in (39), and $s_{\Lambda,I}(\varepsilon)$ in (140) and Proposition 5. By Proposition 5, any maximizer $\rho_\varepsilon$ of $\mathcal{S}_{I/\varepsilon}(\rho)$ satisfies (28) and (29).

Lastly, one readily verifies that the just described $s_\Lambda(\varepsilon)$ equals the negative minimum of Boltzmann's $H$ functional over the set of trial densities $A_\varepsilon = \{f \in (P\cap L^1\cap L^1\ln L^1)(\mathbb{R}^3\times\Lambda) : E(f) = \varepsilon\}$. This is done by explicitly carrying out the standard variational argument for $H(f)$, taking the constraints into account with the help of Lagrange multipliers which are then eliminated with the help of the very functionals of $\rho_\varepsilon$ displayed in Theorem 2. This completes the proof of Theorem 2.

5.3. Proof of Theorem 3

We begin with the observation that (94) is clearly the $N$th configurational marginal measure of (2), i.e. (94) is (2) with the Hamiltonian given by (19), integrated over all the $p$ variables in (19). Put differently, (94) is the joint $N$-point distribution on configuration space $\Lambda^N$ of an $N$-body system with Hamiltonian (19) chosen with respect to the a priori measure (2) on $(\mathbb{R}^3\times\Lambda)^N$. Hence, our proof of Theorem 2 also proves the following weaker version of Theorem 3.
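Theorem 3$^-$. Under the same assumptions as in Theorem 2, consider (2) for the Hamiltonian (19) as extended to a probability on $(\mathbb{R}^3\times\Lambda)^N$. Then the sequence $\{\varrho^{(N)}_{N^2\varepsilon}\}_{N\in\mathbb{N}}$ of its configuration space marginals, obtained by integrating over all the $p$ variables in (19) and given in (94), is weakly compact in $P^s(\overline{\Lambda}^{\mathbb{N}})$, so one can extract a subsequence $\{\varrho^{(\dot N[N])}_{\dot N^2\varepsilon}\}_{N\in\mathbb{N}}$ such that
$$
\lim_{N\to\infty}\varrho^{(\dot N[N])}_{\dot N^2\varepsilon} = \dot\varrho_\varepsilon \in P^s(\Lambda^{\mathbb{N}}), \tag{142}
$$
in the sense that
$$
\lim_{N\to\infty}{}^n\varrho^{(\dot N[N])}_{\dot N^2\varepsilon} = \int_{P(\Lambda)}\prod_{1\le k\le n}\rho(q_k)\,d^3q_k\;\varsigma(d\rho|\dot\varrho_\varepsilon) \quad \forall\, n \in \mathbb{N}. \tag{143}
$$
The decomposition measure $\varsigma(d\rho|\dot\varrho_\varepsilon)$ of each such limit point $\dot\varrho_\varepsilon$ is supported on the subset of $P(\Lambda)$ which consists of the probability measures $\rho_\varepsilon(q)\,d^3q$ which maximize the functional $\mathcal{S}_{I/\varepsilon}(\rho)$.

Since each limit point $\dot\varrho_\varepsilon$ of (94) is a convex linear superposition of infinite product measures on $\Lambda^{\mathbb{N}}$ consisting of "Boltzmann factors" $\rho_\varepsilon(q)$ on $\Lambda$, satisfying (28) with (29) and maximizing the interaction entropy functional $\mathcal{S}_{I/\varepsilon}(\rho)$, and since each such Boltzmann factor is associated with a unique "Maxwellian" $\sigma_\varepsilon(p)$ on $\mathbb{R}^3$ through (30), each such Boltzmann factor thereby defines a unique Maxwell–Boltzmann distribution $\sigma_\varepsilon(p)\rho_\varepsilon(q)$ on $\mathbb{R}^3\times\Lambda$ given by the product of this Boltzmann factor with its associated Maxwellian. So the very decomposition measure $\varsigma(d\rho|\dot\varrho_\varepsilon)$ of each limit point $\dot\varrho_\varepsilon$ on $\Lambda^{\mathbb{N}}$ allows us to define a unique probability measure $\dot\mu_\varepsilon$ on $(\mathbb{R}^3\times\Lambda)^{\mathbb{N}}$, viz.
$$
{}^n\dot\mu_\varepsilon(d^{3n}p\,d^{3n}q) = \int_{P(\Lambda)}\prod_{1\le k\le n}\sigma(\rho)(p_k)\rho(q_k)\,d^3p_k\,d^3q_k\;\varsigma(d\rho|\dot\varrho_\varepsilon) \quad \forall\, n \in \mathbb{N}, \tag{144}
$$
and this measure $\varsigma(d\rho|\dot\varrho_\varepsilon)$ on $P(\Lambda)$ can be mapped into a unique measure $\nu(d\tau|\dot\mu_\varepsilon)$ on $P(\mathbb{R}^3\times\Lambda)$ which is concentrated on those $\tau \in P(\mathbb{R}^3\times\Lambda)$ which are of the form $\tau(d^3p\,d^3q) = \sigma(\rho)(p)\rho(q)\,d^3p\,d^3q$, with $\sigma(\rho)$ given by (30) and $\rho$ satisfying (28) with (29) and maximizing $\mathcal{S}_{I/\varepsilon}(\rho)$, thus
$$
{}^n\dot\mu_\varepsilon(d^{3n}p\,d^{3n}q) = \int_{P(\mathbb{R}^3\times\Lambda)}\prod_{1\le k\le n}\tau(d^3p_k\,d^3q_k)\;\nu(d\tau|\dot\mu_\varepsilon) \quad \forall\, n \in \mathbb{N}. \tag{145}
$$
The corresponding infinite product measures $\prod_{1\le k\le\infty}\tau(d^3p_k\,d^3q_k)$ on $(\mathbb{R}^3\times\Lambda)^{\mathbb{N}}$ are extreme points of $P^s((\mathbb{R}^3\times\Lambda)^{\mathbb{N}})$, and so (145) is the extremal representation of $\dot\mu_\varepsilon$.

So, having Theorem 3$^-$ and its consequence (145), all we need to do to finish the proof of Theorem 3 is to show that each such defined $\dot\mu_\varepsilon$ is indeed a limit point of (2) under the stated hypotheses.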
To see this, we use that by Theorem 3$^-$ we already know that (143) holds, and we know the support of $\varsigma(d\rho|\dot\varrho_\varepsilon)$. Writing ${}^n\varrho^{(\dot N[N])}_{\dot N^2\varepsilon}$ explicitly gives
$$
{}^n\varrho^{(\dot N)}_{\dot N^2\varepsilon}(d^{3n}q) = \frac{\int\Big(1 - \frac{1}{\varepsilon\dot N^2}I^{(\dot N)}_\Lambda(q_1,\dots,q_{\dot N})\Big)_+^{\frac{3\dot N}{2}-1}d^{3(\dot N-n)}q}{\int\Big(1 - \frac{1}{\varepsilon\dot N^2}I^{(\dot N)}_\Lambda(\tilde q_1,\dots,\tilde q_{\dot N})\Big)_+^{\frac{3\dot N}{2}-1}d^{3\dot N}\tilde q}\,d^{3n}q, \tag{146}
$$
where the integral in the numerator runs over the variables $q_{n+1}$ to $q_{\dot N}$. For any $\dot N$ and $1 \le n < \dot N$ we now write
$$
I^{(\dot N)}_\Lambda(q_1,\dots,q_{\dot N}) = I^{(n)}_\Lambda(q_1,\dots,q_n) + I^{(n|\dot N)}_\Lambda(q_1,\dots,q_{\dot N}) + I^{(\dot N-n)}_\Lambda(q_{n+1},\dots,q_{\dot N}) \tag{147}
$$
which defines $I^{(n|\dot N)}_\Lambda(q_1,\dots,q_{\dot N})$. Henceforth we omit the arguments from the $I$s to keep the formulas within sight; by (147) the superscripts convey which variables are used. With the help of (147) we rewrite the integrands thusly,$^{v}$
$$
\Big(1 - \frac{1}{\varepsilon\dot N^2}I^{(\dot N)}_\Lambda\Big)_+ = \left(1 - \frac{I^{(n)}_\Lambda + I^{(n|\dot N)}_\Lambda}{\dot N^2(\varepsilon - \dot N^{-2}I^{(\dot N-n)}_\Lambda)}\right)_+\Big(1 - \frac{1}{\varepsilon\dot N^2}I^{(\dot N-n)}_\Lambda\Big)_+. \tag{148}
$$
Now $(1 - (\varepsilon\dot N^2)^{-1}I^{(\dot N)}_\Lambda)_+$ vanishes in an open neighborhood of configurations $(q_1,\dots,q_n)_\infty$ for which $I^{(n)}_\Lambda = \infty$, so that $I^{(n)}_\Lambda < \infty$ on the support of (148). And for any configuration $(q_1,\dots,q_n)$ for which $I^{(n)}_\Lambda < \infty$, we have $\dot N^{-2}I^{(n)}_\Lambda \to 0$ as $\dot N[N] \to \infty$. Moreover, by our Theorem 3$^-$ and its explication (143), we have that $(1 - \frac{1}{\varepsilon\dot N^2}I^{(\dot N-n)}_\Lambda)_+^{\frac{3\dot N}{2}-1}d^{3(\dot N-n)}q$, interpreted as a measure on the convex set of probability measures $P(\Lambda)$ with support in the set of empirical one-point "densities" with $\dot N - n$ atoms,$^{w}$ converges (up to normalization) to $\varsigma(d\rho|\dot\varrho_\varepsilon)$, and for any $\rho$ in the support of $\varsigma(d\rho|\dot\varrho_\varepsilon)$ we have that $\dot N^{-2}I^{(\dot N-n)}_\Lambda \to \langle\rho,\rho\rangle$ while $\dot N^{-1}I^{(n|\dot N)}_\Lambda \to \sum_{1\le k\le n}\int_\Lambda U_\Lambda(q_k,\tilde q)\rho(\tilde q)\,d^3\tilde q$ when $\dot N[N] \to \infty$.

$^{v}$ We are using that $(fg)_+ = f_+g_+ + f_-g_-$ for two arbitrary functions $f$ and $g$, and that in our case $f$ cannot be strictly negative if $g$ is, giving $(fg)_+ = f_+g_+$; in our case, $f$ and $g$ are the respective expressions between the two pairs of big parentheses at the right-hand side of (148).

$^{w}$ If $U_\Lambda$ is bounded continuous on $\Lambda^2$, then we already know that we can rewrite $I^{(N)}_\Lambda$ as a sum of a bilinear and a linear form on $P(\Lambda)$ evaluated at a normalized empirical one-point "density" with $N$ atoms; see (42). If $U_\Lambda$ is only lower semi-continuous this particular identification ceases to make sense, but happily we can always interpret $I^{(N)}_\Lambda$ as a linear form on the convex set of probability measures $P(\Lambda^2)$ (cf. (133) and (135)), evaluated at a normalized empirical two-point "density" with $N$ atoms (7), and we note that any empirical two-point "density" with $N$ atoms (7) is uniquely determined by its associated empirical one-point "density" with $N$ atoms (6).

So for any $\rho$ in the
October 26, 2009 11:30 WSPC/148-RMP
1180
J070-00385
M. K.-H. Kiessling
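support of the decomposition measure $\varsigma(d\rho|\dot\varrho_\varepsilon)$ we have
\[
\Biggl(1-\frac{I_\Lambda^{(n)}+I_\Lambda^{(n|\dot N)}}{\dot N^{2}\bigl(\varepsilon-\dot N^{-2}I_\Lambda^{(\dot N-n)}\bigr)}\Biggr)_+^{\frac{3\dot N}{2}-1}
\to\exp\Biggl(-\frac{1}{\vartheta_\varepsilon(\rho)}\sum_{1\le k\le n}\int_\Lambda U_\Lambda(q_k,\tilde q)\rho(\tilde q)\,d^3\tilde q\Biggr)
\tag{149}
\]
with $\frac32\vartheta_\varepsilon(\rho)=\varepsilon-\langle\rho,\rho\rangle$. After this preparation, we now explicitly compute the marginal ${}^n\mu^{(\dot N)}_{\dot N^{2}\varepsilon}$ and find
\[
{}^n\mu^{(\dot N)}_{\dot N^{2}\varepsilon}(d^{3n}p\,d^{3n}q)
=\frac{\displaystyle\int\Bigl(1-\frac{1}{\varepsilon\dot N^{2}}\bigl(K^{(n|\dot N)}+I_\Lambda^{(\dot N)}\bigr)\Bigr)_+^{\frac{3(\dot N-n)}{2}-1}d^{3(\dot N-n)}q}
{\displaystyle\iint\Bigl(1-\frac{1}{\varepsilon\dot N^{2}}\bigl(K^{(n|\dot N)}+I_\Lambda^{(\dot N)}\bigr)\Bigr)_+^{\frac{3(\dot N-n)}{2}-1}d^{3\dot N}\tilde q\,d^{3n}\tilde p}\;d^{3n}p\,d^{3n}q,
\tag{150}
\]
where $K^{(n|N)}(p_1,\dots,p_n)=N\sum_{1\le k\le n}\frac12|p_k|^2$. Using (147) we factor the integrands as in (148), though now we get
\[
\Bigl(1-\frac{K^{(n|\dot N)}+I_\Lambda^{(\dot N)}}{\varepsilon\dot N^{2}}\Bigr)_+
=\Biggl(1-\frac{K^{(n|\dot N)}+I_\Lambda^{(n)}+I_\Lambda^{(n|\dot N)}}{\dot N^{2}\bigl(\varepsilon-\dot N^{-2}I_\Lambda^{(\dot N-n)}\bigr)}\Biggr)_+
\Bigl(1-\frac{1}{\varepsilon\dot N^{2}}\,I_\Lambda^{(\dot N-n)}\Bigr)_+
\tag{151}
\]
and by following essentially verbatim the arguments which lead from (148) to (149), we now find that for any $\rho\in\operatorname{supp}\varsigma(d\rho|\dot\varrho_\varepsilon)$,
\[
\Biggl(1-\frac{K^{(n|\dot N)}+I_\Lambda^{(n)}+I_\Lambda^{(n|\dot N)}}{\dot N^{2}\bigl(\varepsilon-\dot N^{-2}I_\Lambda^{(\dot N-n)}\bigr)}\Biggr)_+^{\frac{3(\dot N-n)}{2}-1}
\to\exp\Biggl(-\frac{1}{\vartheta_\varepsilon(\rho)}\sum_{1\le k\le n}\Bigl(\tfrac12|p_k|^2+\int_\Lambda U_\Lambda(q_k,\tilde q)\rho(\tilde q)\,d^3\tilde q\Bigr)\Biggr).
\tag{152}
\]
Our Theorem 3 is proved.

6. Spin-Offs of Our Results

In this section we list a number of corollaries of our results.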
The Vlasov Continuum Limit for the Classical Microcanonical Ensemble
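6.1. A weak law of large numbers/ergodic theorem

Whenever $H_B(f)$ has a unique minimizer $f_\varepsilon$ over $A_\varepsilon$, then necessarily all limit points in (33) coincide, i.e. any $\dot\mu_\varepsilon=\mu_\varepsilon$. By the weak compactness of $P^s(\Lambda^{\mathbb N})$ (in product topology) we then in fact do have weak convergence,
\[
\lim_{N\to\infty}{}^n\mu^{(N)}_{N^{2}\varepsilon}(d^{3n}p\,d^{3n}q)={}^n\mu_\varepsilon(d^{3n}p\,d^{3n}q)\in P^s((\mathbb R^3\times\Lambda)^n)\quad\forall\,n\in\mathbb N.
\tag{153}
\]
Since in this case the decomposition measure $\nu(d\tau|\mu_\varepsilon)$ is a singleton, the limit $\mu_\varepsilon=\{{}^n\mu_\varepsilon\}_{n\in\mathbb N}$ is of the form
\[
{}^n\mu_\varepsilon(d^{3n}p\,d^{3n}q)=\prod_{1\le k\le n}f_\varepsilon(p_k,q_k)\,d^3p_k\,d^3q_k
\tag{154}
\]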
with $f_\varepsilon(p,q)=\sigma_\varepsilon(p)\rho_\varepsilon(q)$ as defined in Theorem 2. As discussed in [41], the factorization property (154) is equivalent to a weak law of large numbers, or to an ergodic theorem, depending on one's point of view. Since the single particle momentum $P$ and position $Q$ of an individual $N$-body system picked from Boltzmann's Ergode (2), with Hamiltonian (19), are random variables, any bounded continuous single-particle test function $\theta$ on $\mathbb R^3\times\Lambda$ defines a new random variable $\Theta=\theta(P,Q)$, and so does its sample mean over a single $N$-body system,
\[
\langle\Theta\rangle_N\equiv\frac1N\sum_{j=1}^{N}\theta(P_j,Q_j).
\tag{155}
\]
Theorem 3 in the special case (154) implies that, for all such $\theta$,
\[
\lim_{N\to\infty}\langle\Theta\rangle_N=\int_{\mathbb R^3\times\Lambda}\theta(p,q)f_\varepsilon(p,q)\,d^3p\,d^3q,
\tag{156}
\]
in probability. The generalization to $n$-body test functions holds as well.

6.2. The Vlasov limit for other thermodynamic potentials

A second corollary, or actually a whole family of corollaries, is the existence of the Vlasov limit for the thermodynamic potentials of the canonical and grand canonical ensembles under the same hypotheses. We only discuss the Vlasov limit for the thermodynamic potential of the canonical ensemble. Thus, taking the Laplace transform of (3), i.e. multiplying by $e^{-\beta E}$ and integrating over $E$, yields what is known as the canonical partition function,
\[
Z_{H_\Lambda^{(N)}}(\beta)=\frac{1}{N!}\int\exp\bigl(-\beta H_\Lambda^{(N)}(X^{(N)})\bigr)\,d^{6N}X.
\tag{157}
\]
The Hamiltonian $H_\Lambda^{(N)}(X^{(N)})$ is given in (19). Clearly (157) factors as follows,
\[
Z_{H_\Lambda^{(N)}}(\beta)=Z_{K^{(N)}}(\beta)\,Z_{I_\Lambda^{(N)}}(\beta)
\tag{158}
\]
where
\[
Z_{I_\Lambda^{(N)}}(\beta)=\int\exp\bigl(-\beta I_\Lambda^{(N)}(q_1,\dots,q_N)\bigr)\,\lambda(d^{3N}q)
\tag{159}
\]
is the canonical configurational integral, with $\lambda(d^3q)=|\Lambda|^{-1}d^3q$ the normalized Lebesgue measure introduced in Sec. 5, and $\lambda(d^{3N}q)$ its $N$-fold product, and
\[
Z_{K^{(N)}}(\beta)=\frac{|\Lambda|^N}{N!}\int\exp\bigl(-\beta K^{(N)}(p_1,\dots,p_N)\bigr)\,d^{3N}p
\tag{160}
\]
is the canonical partition function of a spatially uniform perfect gas in $\Lambda$, a Gaussian on the Cartesian product of the $p$ spaces, which evaluates to
\[
Z_{K^{(N)}}(\beta)=\frac{|\Lambda|^N}{N!}(2\pi\vartheta)^{3N/2};
\tag{161}
\]
here, we introduced $N\vartheta=\beta^{-1}$, with $\vartheta$ independent of $N$, not to be confused with $\vartheta_\varepsilon$ which is a functional of $\rho$. Since $\beta^{-1}$ receives the meaning of a temperature of a heat bath (up to the absorbed factor $k_B$), it needs to grow $\propto N$ to compensate for the growth of the system's energy $E\propto N^2$. Taking the logarithm of (157) gives what we call the canonical thermodynamic potential (canonical $T$-potential, for short)^x $\Phi_{H_\Lambda^{(N)}}(\beta)$. Using (158) and (161) as well as $\beta=\frac{1}{N\vartheta}$ yields the asymptotic expansion
\[
\Phi_{H_\Lambda^{(N)}}\Bigl(\frac{1}{N\vartheta}\Bigr)=-N\ln N+N\ln\bigl(e|\Lambda|(2\pi\vartheta)^{3/2}\bigr)+O(\ln N)+\ln Z_{I_\Lambda^{(N)}}\Bigl(\frac{1}{N\vartheta}\Bigr).
\tag{162}
\]
Again, the $N\ln N$ term is due to Gibbs' $N!$ and purely combinatorial in origin. In the absence of interactions (save the confinement to $\Lambda$) (162) reduces to
\[
\Phi_{K^{(N)}}\Bigl(\frac{1}{N\vartheta}\Bigr)=-N\ln N+N\ln\bigl(e|\Lambda|(2\pi\vartheta)^{3/2}\bigr)+O(\ln N),
\tag{163}
\]
the asymptotic expansion of the canonical $T$-potential of the spatially uniform perfect gas. The coefficient of the $O(N)$ term in (163) is the system-specific Helmholtz $T$-potential per particle of the uniform perfect gas in $\Lambda$, denoted by
\[
\phi_{\Lambda,K}(\vartheta)=\ln\bigl(e|\Lambda|(2\pi\vartheta)^{3/2}\bigr).
\tag{164}
\]
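The size of the $O(\ln N)$ remainder in (163) can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names and the sample values of $\vartheta$ and $|\Lambda|$ are illustrative assumptions. It compares the exact logarithm of (161) with the two explicit terms of (163); by Stirling's formula the difference should be close to $-\frac12\ln(2\pi N)$.

```python
import math

def log_Z_K(N, theta, volume):
    """Exact ln Z_K from (161): N ln(|Lambda| (2 pi theta)^{3/2}) - ln N!."""
    return N * math.log(volume * (2.0 * math.pi * theta) ** 1.5) - math.lgamma(N + 1)

def leading_terms(N, theta, volume):
    """The two explicit terms of (163): -N ln N + N ln(e |Lambda| (2 pi theta)^{3/2})."""
    return -N * math.log(N) + N * math.log(math.e * volume * (2.0 * math.pi * theta) ** 1.5)

def remainder(N, theta=0.7, volume=2.0):
    """Difference between the exact value and the leading terms (sample theta, |Lambda| are assumptions)."""
    return log_Z_K(N, theta, volume) - leading_terms(N, theta, volume)

if __name__ == "__main__":
    for N in (10, 100, 1000):
        # Second column: remainder; third column: Stirling prediction -ln(2 pi N)/2.
        print(N, remainder(N), -0.5 * math.log(2.0 * math.pi * N))
```

The remainder does not depend on $\vartheta$ or $|\Lambda|$, which is consistent with its purely combinatorial origin in Gibbs' $N!$.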
The system-specific interaction Helmholtz $T$-potential per particle is defined by
\[
\phi_{\Lambda,I}(\vartheta)=\lim_{N\to\infty}\frac1N\ln Z_{I_\Lambda^{(N)}}\Bigl(\frac{1}{N\vartheta}\Bigr).
\tag{165}
\]
The limit (165) exists for Hamiltonians satisfying (H1)–(H5), as follows by corollary from Theorem 2; if (H2) is replaced by bounded continuity of the interaction, as explained earlier, then we can also infer the existence of the limit (165) from our Theorem 1. The argument is quite standard, cf. [40]. Namely, note that
\[
\frac1N\ln Z_{H_\Lambda^{(N)}}(\beta)=\frac1N\ln\int e^{-\beta E+S(E)}\,dE
\tag{166}
\]

^x Multiplying the canonical $T$-potential by the temperature of the heat bath yields the negative of what is usually called the canonical free energy, which in the thermodynamic limit yields the Helmholtz free energy of the physical systems.
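where $S(E)$ is shorthand for $S_{H_\Lambda^{(N)}}(E)$. Setting $E=N^2\varepsilon$ and $\beta=\frac{1}{N\vartheta}$ and expanding $S(E)$ using (23) (if $U_\Lambda$ is bounded continuous on $\Lambda^2$ we can alternately use (22)), we find
\[
\frac1N\ln Z_{H_\Lambda^{(N)}}\Bigl(\frac{1}{N\vartheta}\Bigr)
=-\ln N+\ln\Bigl(\int e^{N(-\vartheta^{-1}\varepsilon+s_\Lambda(\varepsilon))+o(N)}\,d\varepsilon\Bigr)^{\frac1N}+O\Bigl(\frac{\ln N}{N}\Bigr).
\tag{167}
\]
Clearly, $\|g\|_N\to\|g\|_\infty$ as $N\to\infty$, and so the following asymptotic expansion for the canonical $T$-potential results,
\[
\Phi_{H_\Lambda^{(N)}}\Bigl(\frac{1}{N\vartheta}\Bigr)=-N\ln N+N\phi_\Lambda(\vartheta)+o(N)
\tag{168}
\]
with
\[
\phi_\Lambda(\vartheta)\equiv\sup_{\varepsilon>\varepsilon_g}\bigl(-\vartheta^{-1}\varepsilon+s_\Lambda(\varepsilon)\bigr).
\tag{169}
\]
By (162) and (168), we also have $N^{-1}\ln Z_{I_\Lambda^{(N)}}\bigl(\frac{1}{N\vartheta}\bigr)\xrightarrow{N\to\infty}\phi_{\Lambda,I}(\vartheta)$, with
\[
\phi_{\Lambda,I}(\vartheta)=\phi_\Lambda(\vartheta)-\phi_{\Lambda,K}(\vartheta).
\tag{170}
\]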
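The passage from (167) to (169) rests on the fact that $\frac1N\ln\int e^{Nh(\varepsilon)}d\varepsilon$ tends to $\sup h$. A minimal numerical sketch of this step follows. It is an illustration under stated assumptions, not the paper's computation: the toy entropy $s(\varepsilon)=\frac32\ln\varepsilon$ (with $\varepsilon_g=0$), the cutoff `eps_max`, and all function names are hypothetical choices. For this toy entropy the supremum in (169) is attained at $\varepsilon=\frac32\vartheta$ and equals $-\frac32+\frac32\ln(\frac32\vartheta)$.

```python
import math

def s_toy(eps):
    """Toy concave entropy per particle (an assumption made only for this sketch)."""
    return 1.5 * math.log(eps)

def phi_exact(theta):
    """Closed-form supremum of -eps/theta + s_toy(eps), attained at eps = 1.5 * theta."""
    return -1.5 + 1.5 * math.log(1.5 * theta)

def phi_grid(theta, eps_max=50.0, n=20_000):
    """Legendre-Fenchel transform as in (169), evaluated by brute force on a uniform grid."""
    return max(-(eps_max * i / n) / theta + s_toy(eps_max * i / n) for i in range(1, n + 1))

def phi_laplace(theta, N, eps_max=50.0, n=20_000):
    """(1/N) ln of the integral of exp(N h) over (0, eps_max), midpoint rule with log-sum-exp."""
    d = eps_max / n
    vals = [N * (-(d * (i + 0.5)) / theta + s_toy(d * (i + 0.5))) for i in range(n)]
    m = max(vals)
    return (m + math.log(sum(math.exp(v - m) for v in vals) * d)) / N

if __name__ == "__main__":
    theta = 1.0
    print("sup on grid :", phi_grid(theta), " exact:", phi_exact(theta))
    for N in (10, 100, 1000):
        print(N, phi_laplace(theta, N))
```

The finite-$N$ values approach the supremum with an error of order $\ln N/N$, which is the same order as the correction term displayed in (167).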
This concludes our demonstration that the Vlasov limit for the system-specific Helmholtz $T$-potential per particle follows from our theorems about the Vlasov limit of the system-specific Boltzmann entropy per particle.

Next we notice that also the familiar "minimum free energy principle" for $-\phi_\Lambda(\vartheta)$ follows from combining the Legendre–Fenchel transform (169) with our "maximum entropy principle" in Theorem 2. Thus, for the system-specific Helmholtz $T$-potential per particle we find the variational principle
\[
-\vartheta\phi_\Lambda(\vartheta)=\inf_{f\in A}F_\vartheta(f),
\tag{171}
\]
with $A=\{f\in(P_{U_\Lambda}\cap L^1\cap L^1\ln L^1)(\mathbb R^3\times\Lambda)\}$ the admissible trial densities, and
\[
F_\vartheta(f)=E(f)+\vartheta H_B(f),
\tag{172}
\]
the Helmholtz free energy functional of $f$, where $H_B(f)$ is Boltzmann's $H$ function of $f$, given in (25), and $E(f)$ is the energy functional given in (26). It also follows directly from our results that $F_\vartheta(f)$ takes its infimum over the set $A$, and that any minimizer $f_\vartheta$ of $F_\vartheta(f)$ over $A$ is of the form
\[
f_\vartheta(p,q)=\sigma_\vartheta(p)\rho_\vartheta(q),
\tag{173}
\]
where
\[
\sigma_\vartheta(p)=(2\pi\vartheta)^{-\frac32}\exp\bigl(-\vartheta^{-1}\tfrac12|p|^2\bigr),
\tag{174}
\]
while $\rho_\vartheta(q)$ now solves the following fixed point equation on $q$ space,
\[
\rho_\vartheta(q)=\frac{\exp\Bigl(-\dfrac1\vartheta\displaystyle\int_\Lambda U_\Lambda(q,\tilde q)\rho_\vartheta(\tilde q)\,d^3\tilde q\Bigr)}
{\displaystyle\int_\Lambda\exp\Bigl(-\dfrac1\vartheta\int_\Lambda U_\Lambda(\hat q,\tilde q)\rho_\vartheta(\tilde q)\,d^3\tilde q\Bigr)d^3\hat q}
\tag{175}
\]
with $\vartheta>0$ prescribed.

We remark that the various possible relationships between the set of maximizers of the maximum entropy variational principle and the set of minimizers of the minimum free energy variational principle have been discussed in great detail in [12, 8]. Note that this can be (and was) done without proving that the maximum entropy variational principle characterizes the limit points of Boltzmann's Ergode (2) proper.

We also remark that the existence of the system-specific Helmholtz $T$-potential per particle in the Vlasov limit for the canonical ensemble was shown previously by various techniques. Sub-additivity arguments, such as those used to prove Theorem 1, are used in [19]. The very strategy which we applied to prove Theorems 2 and 3, which not only yields the variational principle for the system-specific Boltzmann entropy but also identifies the limit points of the sequence of ergodic ensemble measures as convex linear superpositions of infinite products of the optimizers for this maximum entropy principle, was originally applied in [31] to the canonical ensemble for Lipschitz continuous interactions $I_\Lambda^{(N)}$; subsequently in [19, 3] this approach to the canonical ensemble was generalized to less regular interactions including the ones studied here; and in [26] the limit $N\to\infty$ of $N^{-1}\ln Z_{I_\Lambda^{(N)}}(1/\vartheta)$ was obtained by adapting this strategy (note the different $N$ scaling of $\beta$). We emphasize that none of these canonical results implies the existence of the Vlasov limit for the system-specific Boltzmann entropy per particle, nor captures the limit points of the ergodic ensemble measures, unless it is a priori known that the ensembles are (convexly) equivalent, i.e. unless it is known that $\varepsilon\mapsto s_\Lambda(\varepsilon)$ is concave (more on that in Sec. 7). Our results, by contrast, hold irrespective of whether $\varepsilon\mapsto s_\Lambda(\varepsilon)$ is concave or not.

6.3. The Vlasov limit for subergodic ensembles

Another spin-off, or in this case rather a variation on the theme of our microcanonical results, is the straightforward generalization of our theorems to subensembles whose invariant measures are concentrated on sub-manifolds of $\{H=E\}$ determined by further isolating integrals of the Hamiltonian (19), such as angular momentum if the domain $\Lambda$ is rotationally symmetric, or the Lynden–Bells' invariant [32, 33] which occurs in a generalization of the Calogero–Moser model to particles moving in $\mathbb R^3$ confined by a quadratic potential. Hypothesis (H4) does not hold for these interactions, but can be replaced by a weaker one at the expense of some extra work. In those cases the entropy maximizer factors into a product of a
locally (at $q$) shifted Maxwellian on $p$ space and a purely space-dependent Boltzmann factor. The shifted Maxwellian which generalizes (27) to include angular momentum is known as a "rotating Maxwellian;" in the case of the Lynden–Bells' Hamiltonian one finds a "rotating-dilating Maxwellian." An announcement of these results was made in [21]; details will appear in [24].

7. Unfinished Business

In this last section of our paper we point out some open problems related to the ones treated here.

7.1. The maximum interaction entropy principle

To the best of the author's knowledge, the maximum interaction entropy principle formulated in Proposition 5 is new. As made clear in Theorem 2 it offers a way to directly evaluate the usual variational principle of maximum entropy with energy constraint. By contrast, the standard approach to evaluate this constrained maximum entropy principle has been rather indirect. Namely, a Lagrange parameter (basically $\vartheta$) is introduced for the energy constraint, yielding the corresponding fixed point equation (175) for the stationary points of the free energy functional. After finding all solution families (not just the minimizers of the free energy functional), a parameter representation of energy and entropy along the various solution families of (175) results, among which the one with highest entropy for given energy has then to be selected. Our new variational approach appears to be more economical than that.

One of the simplest tasks would be to prove the existence of a unique solution to (28) at sufficiently high energies $\varepsilon$. For Coulomb interactions a unique solution is expected for all energies, while for (regularized) Newton interactions multiplicity of solutions is expected for sufficiently low energies. This is suggested by the detailed numerical evaluations of the standard principle of maximum entropy with constraints for related equations, cf. [42, 7].

7.2. Convergence of the ergodic ensemble measures

We already pointed out in Sec. 6.1 that the sequence of ergodic ensemble measures converges whenever a unique optimizer exists for the maximum interaction entropy variational principle in Theorem 2 and Proposition 5. We do not see any reason why the sequence of ergodic ensemble measures should not converge when the entropy maximizer is not unique, and so we expect that the mere existence of limit points concluded in this paper by using weak compactness can actually be upgraded to the existence of a limit.

7.3. Characterization of the de Finetti–Dynkin measure

As also noted in Sec. 6.1, the decomposition measure $\nu(d\tau|\mu_\varepsilon)$ is a singleton whenever a unique optimizer exists for the maximum interaction entropy variational
principle in Theorem 2. In more general situations we have little information on the decomposition measure $\nu(d\tau|\mu_\varepsilon)$, beyond knowing that it reduces to $\varsigma(d\rho|\varrho_\varepsilon)$ and that $\varsigma(d\rho|\varrho_\varepsilon)$ is supported on the maximizers of the maximum interaction entropy principle formulated in Proposition 5. Of course, we already mentioned earlier that experience with explicitly studied physical systems suggests that $\operatorname{supp}\varsigma(d\rho|\varrho_\varepsilon)$ is either a finite set or a continuous group orbit of a compact group, but a general proof or disproof seems not available. More is known for the canonical ensemble [27], and their approach should apply to the microcanonical ensemble to determine $\nu(d\tau|\mu_\varepsilon)$.

7.4. Large deviation principles

Whenever $H_B(f)$ has a unique minimizer $f_\varepsilon$ over $A_\varepsilon$, then Theorems 2 and 3 imply that
\[
\mathrm{Prob}\bigl(d_{KR}(\Delta^{(n)}_{X^{(N)}},f_\varepsilon^{\otimes n})>\delta\bigr)\xrightarrow{N\to\infty}0\quad\forall\,\delta>0,
\tag{176}
\]
where "Prob" refers to the ensemble measure (2) with Hamiltonian (19). It is desirable to improve (176) to a large deviation principle, a rigorous variation on the theme of Einstein's fluctuation formula. Heuristically we expect
\[
\mathrm{Prob}\bigl(d_{KR}(\Delta^{(n)}_{X^{(N)}},f_\varepsilon^{\otimes n})>\delta\bigr)\asymp\sup_{f\in A_\varepsilon^\delta}e^{-N(H_B(f)-H_B(f_\varepsilon))}\quad\forall\,\delta>0,
\tag{177}
\]
where $A_\varepsilon^\delta=\{f\in(P\cap L^1\cap L^1\ln L^1)(\mathbb R^3\times\Lambda):E(f)=\varepsilon\}\setminus B_\delta(f_\varepsilon)$. In [13, 12, 8] such a feat was accomplished for the regularized microcanonical ensembles at the level of the 1-point functions. The recent article [10] establishes some nice large deviation principles for the $n$-point functions in a strong topology which allows one to handle some singular interactions. We expect that the conjectured large deviation principle can be proved along their lines. We also refer to Lanford's article [28] and the books by Varadhan [45] and Ellis [11] for mathematical background on large deviation principles and their applications to statistical mechanics, and to [44] for a more recent review.

7.5. Vlasov limit for the canonical ensemble measures

Using the very strategy used in this paper to prove our Theorems 2 and 3, the Vlasov limit for the canonical ensemble measures associated with (157) was established in [31, 3, 19] under various hypotheses on the interactions, covering our (H1)–(H5). This raises the question of whether one can conclude the convergence of the canonical ensemble measures associated with (157) from the convergence of the microcanonical ensemble measures (or, if convergence cannot be shown, the analog for the limit points). Put differently, we ask to extend the conclusions reached at the level of the thermodynamic functions to the level of the measures. In [12, 8] such a feat was accomplished for the canonical ensemble measures in terms of regularized microcanonical ensemble measures, using large deviation principle techniques, and issues of equivalence of ensembles were addressed.
7.6. Interactions without lower bound

By hypothesis (H2) we allow the pair interactions to diverge when two particles approach each other infinitely closely. However, $W_\Lambda(q,\tilde q)$ is only allowed to diverge to $+\infty$, which happens with the repulsive Coulomb interactions when $q\to\tilde q$. Divergence of $W_\Lambda(q,\tilde q)$ to $-\infty$ is excluded from our analysis, because our postulates imply that $I_\Lambda^{(N)}$ is bounded below by $E_g(N)>-\infty$. In particular, the $-\infty$ singularity of the attractive Newton interactions in $\mathbb R^3$ will have to be regularized. The canonical ensemble and regularized microcanonical ensembles have been controlled under weaker hypotheses, allowing in particular the interactions to diverge logarithmically to $-\infty$, see [3, 19] for the canonical and [4, 25, 20] for the regularized microcanonical ensembles. It should be possible to adapt the technical arguments in these papers to establish the Vlasov limit for (2) for negative logarithmically singular interactions.

7.7. Unbounded domains

In [26, 6], unbounded $\Lambda$ were allowed for the canonical ensemble, and our microcanonical theorems should similarly be extendible to unbounded domains under a suitable confinement hypothesis which replaces hypothesis (H5), presumably
\[
\text{(H5$'$) Confinement:}\quad e^{-U_\Lambda(q,\tilde q)}\in L^1(\Lambda\times\Lambda).
\tag{178}
\]
Incidentally, (H5$'$) not only imposes a condition on the behavior of $U_\Lambda$ as either of its two arguments is sent to infinity, it also restricts the manner in which $U_\Lambda$ can diverge to $-\infty$, e.g. when its two arguments approach each other infinitely closely, allowing logarithmic divergence.

7.8. Ergodic ensembles of quasi-particles

Our analysis does not cover ergodic ensembles of quasi-particle systems like point vortices moving in two dimensions whose Kirchhoff Hamiltonian is of the type (1) without the sum of $|p|^2$ terms. The ergodic point vortex ensemble measures are of the type
\[
\mu_E^{(N)}(d^{2N}X)=\bigl(N!\,\Omega'_{I_\Lambda^{(N)}}(E)\bigr)^{-1}\delta\bigl(E-I_\Lambda^{(N)}(X^{(N)})\bigr)\,d^{2N}X,
\tag{179}
\]
where $X^{(N)}:=(q_1,\dots,q_N)\in\Lambda^N$, where now $\Lambda\subset\mathbb R^2$, and $d^{2N}X$ is $2N$-dimensional Lebesgue measure, and the pair interactions now feature positive logarithmic singularities (for a single species of point vortices). Onsager [35] observed that for such systems a critical $E$ value exists such that the map $E\mapsto S(E)$ is decreasing when $E>E_{\rm crit}$, giving rise to negative ensemble temperatures. Regularized microcanonical measures for such vortex Hamiltonians have been analyzed in [4] under an equivalence assumption to the canonical ensemble, and in [25, 20] without such an
equivalence assumption.^y It is desirable to find a way to handle the proper ergodic ensemble for point vortex and other quasi-particle systems for which the sum of squares of kinematical momenta is absent from their Hamiltonian, but clearly this will require the introduction of new technical ideas. Incidentally, this last sentence applies verbatim also to other scalings than Vlasov scaling, in particular to the conventional thermodynamic limit scaling explained in the introduction.

There is one exception to what we just wrote: precisely at the critical energy $E_{\rm crit}$ of a point vortex system it is a priori known that all the $n$-point measures have densities given by $(1/|\Lambda|)^{\otimes n}$. Taking advantage of this fact, O'Neil and collaborators [34, 5] found that for a neutral two-species system the vicinity of $E_{\rm crit}\propto N\ln N$ can be analyzed directly using $\delta(I-E)$; it turns out to be a small-entropy regime where $S$, not $S/N$, converges to a limit when $N\to\infty$, with $E-CN\ln N\propto N$. Interestingly enough, this scaling falls in between the conventional thermodynamic limit and the Vlasov scaling. To the author's knowledge, so far these are the only results for point vortices obtained for $\delta(I-E)$ proper, i.e. without regularization of the Dirac measure.

Acknowledgment

The author thanks Carlo Lancellotti for his careful reading of the manuscript and for his comments. This paper was written with support from the NSF under grant DMS-0807705. Any opinions expressed in this paper are entirely those of the author and not necessarily those of the NSF.

Appendix A. Monotonicity of the Ground State Energy

In this appendix, we will prove two monotonic convergence results about the ground state energy which are used in the setup of our construction of the Vlasov limit $N\to\infty$. The results and their proofs are rather elementary and presumably known, and quite likely to be found in the vast literature on $U$ statistics; however, my (certainly incomplete) perusal of the pertinent literature has not yet met with success.^z

^y The authors of [4] use the primitive $\Omega_{I_\Lambda^{(N)}}(E)$ of $\Omega'_{I_\Lambda^{(N)}}(E)$ (i.e. (3) with $H\equiv I$) to define a quasi-microcanonical ensemble entropy when $E<E_{\rm crit}$, and for $E>E_{\rm crit}$ they use $\Omega_{I_\Lambda^{(N)}}(\infty)-\Omega_{I_\Lambda^{(N)}}(E)$. In [25, 20] a Gaussian approximation to $\delta(I-E)$ is used. We also mention [13] where the approximation $\Omega_{I_\Lambda^{(N)}}(E)-\Omega_{I_\Lambda^{(N)}}(E-\Delta E)$ is used; these authors also regularize the logarithmic singularity of the interactions.

^z In fact, we originally did not expect monotonicity results of the type proved here to hold at all. We were prompted to conjecture the results, and then to prove them, by analyzing the numerical results of the computations of the (conjectured) ground state energies $E_g(N)$ for Thomson's problem [43] reported in [1, 37], which, divided by either $N^2$ or $N(N-1)$, arranged themselves monotonically increasing when plotted versus $N$. An interesting spin-off of the monotonicity of the pair-specific Thomson energies is a necessary criterion for minimality which can be used as a test for the empirical numerical experiments. After the present paper was submitted we successfully carried out such a test; see [23].
Here is our first proposition.

Proposition 7. Let $\Lambda\subset\mathbb R^D$ be a bounded and connected domain. Assume the following hypotheses regarding $U_\Lambda(q,\tilde q)$:

(H1) Symmetry: $U_\Lambda(\check q,\hat q)=U_\Lambda(\hat q,\check q)$
(H2) Lower Semi-Continuity: $U_\Lambda(\check q,\hat q)$ is l.s.c. on $\Lambda\times\Lambda$
(H3) Sublevel Set Regularity: $\lambda^{\otimes2}(\{U_\Lambda(\check q,\hat q)-\min U_\Lambda<\epsilon\})>0$
(H4) Local Square Integrability: $U_\Lambda(q,\cdot\,)\in L^2(B_r(q)\cap\Lambda)\ \forall\,q\in\Lambda$

where $\lambda$ is normalized Lebesgue measure for $\Lambda$. For $N\ge2$ define the pair-specific ground state energy by
\[
\varepsilon_g(N)\equiv\min_{\{q_1,\dots,q_N\}}\frac{1}{N(N-1)}\sum_{1\le i<j\le N}U_\Lambda(q_i,q_j).
\tag{180}
\]
Then the sequence $N\mapsto\varepsilon_g(N)$ so defined is monotonic increasing and converges to $\varepsilon_g<\infty$ defined by
\[
\varepsilon_g=\min_{\rho\in P(\Lambda)}\frac12\iint U_\Lambda(q,\tilde q)\rho(q)\rho(\tilde q)\,d^Dq\,d^D\tilde q.
\tag{181}
\]
Note that $\varepsilon_g$ as defined in (181) coincides with $\varepsilon_g$ as defined in (16) when $D=3$ and $U_\Lambda$ is decomposed into the earlier stipulated sum of $V_\Lambda$ and $W_\Lambda$.

Proof of Proposition 7. We begin with the mandatory observation that under hypotheses (H1) and (H2) the pair-specific ground state energy $\varepsilon_g(N)$ defined in (180) is well-defined; i.e. $\varepsilon_g(N)\in\mathbb R$ (note that (H3) and (H4) are immaterial here).

We next prove the monotonicity of $N\mapsto\varepsilon_g(N)$, with $N\ge2$. Elementary (combinatorial) identities and the single inequality that the minimum of a sum is not less than the sum of the minima show that $\varepsilon_g(N+1)\ge\varepsilon_g(N)$, viz.
\[
\begin{aligned}
\varepsilon_g(N+1)
&=\min_{\{q_1,\dots,q_{N+1}\}}\frac{1}{(N+1)N}\sum_{1\le i<j\le N+1}U_\Lambda(q_i,q_j)\\
&=\min_{\{q_1,\dots,q_{N+1}\}}\frac{1}{(N+1)N}\sum_{1\le k\le N+1}\frac{1}{N-1}\sum_{\substack{1\le i<j\le N+1\\ i\ne k\ne j}}U_\Lambda(q_i,q_j)\\
&\ge\frac{1}{(N+1)N}\sum_{1\le k\le N+1}\ \min_{\{q_1,\dots,q_{N+1}\}\setminus\{q_k\}}\frac{1}{N-1}\sum_{\substack{1\le i<j\le N+1\\ i\ne k\ne j}}U_\Lambda(q_i,q_j)\\
&=\frac{1}{(N+1)N}\,(N+1)\min_{\{q_1,\dots,q_N\}}\frac{1}{N-1}\sum_{1\le i<j\le N}U_\Lambda(q_i,q_j)\\
&=\min_{\{q_1,\dots,q_N\}}\frac{1}{N(N-1)}\sum_{1\le i<j\le N}U_\Lambda(q_i,q_j)\\
&=\varepsilon_g(N),
\end{aligned}
\tag{182}
\]
and the proof of monotonicity of $N\mapsto\varepsilon_g(N)$ is complete.

Next, to prove convergence to $\varepsilon_g$ given by (181) we begin by noting that under hypotheses (H1), (H2) and (H4), the ground state energy $\varepsilon_g$ defined in (181) is well-defined; actually, for this issue we can even relax (H4) to the weaker $L^1_{\rm loc}(\Lambda)$ condition which is implied by (H4). We now use the density of empirical $N$-point measures in the weakly compact set of all probability measures on $\Lambda^2$, and the existence (by (H2) and (H3)) of a minimizing sequence $\in C^0_b(\Lambda)$ for $\langle\rho,\rho\rangle$, to prove convergence $\varepsilon_g(N)\nearrow\varepsilon_g$. We let $\Delta^{(2)}_{X_g^{(N)}}$ denote the 2-point measure in $\Lambda^2$ for a ground state $X_g^{(N)}=(0_1,q_1,\dots,0_N,q_N)_g$ of $N$ points in $\Lambda$ (which need not be unique), and let $\Delta^{(2)}_{X^{(N)}}$ be any other 2-point measure on $\Lambda^2$ with $N$ support points. We define the linear functional ${}^2\rho\mapsto\mathcal U({}^2\rho)$ by
\[
\mathcal U({}^2\rho)=\frac12\iint U_\Lambda(\check q,\hat q)\,{}^2\rho(d^D\check q\,d^D\hat q).
\tag{183}
\]
Note that for product measures ${}^2\rho=\rho^{\otimes2}$ we have
\[
\mathcal U(\rho^{\otimes2})=\langle\rho,\rho\rangle.
\tag{184}
\]
Note furthermore that the functional ${}^2\rho\mapsto\mathcal U({}^2\rho)$ is generally not continuous, because we have only (weak) lower semi-continuity of $U_\Lambda$. In particular, while any continuous change in the supporting points of the 2-point measure $\Delta^{(2)}_{X^{(N)}}$ on $\Lambda^2$ results in a weakly continuous change of the 2-point measure, the functional $\mathcal U$ evaluated at these 2-point measures, i.e. $\mathcal U(\Delta^{(2)}_{X^{(N)}})$, generally changes discontinuously. However, we do have
\[
\varepsilon_g(N)=\mathcal U\bigl(\Delta^{(2)}_{X_g^{(N)}}\bigr)\le\mathcal U\bigl(\Delta^{(2)}_{X^{(N)}}\bigr).
\tag{185}
\]
Now let $\{\rho_n\}_{n\in\mathbb N}$ be a minimizing sequence in $(P\cap C^0_b)(\Lambda)$ for $\langle\rho,\rho\rangle=\mathcal U(\rho^{\otimes2})$; note that it is not necessary to postulate also that $\rho_n\to\rho$ for any actual minimizer $\rho$, as this will follow automatically from the proof. Then, by (H3), for any $\epsilon>0$ we can find an $n_\epsilon$ such that $\mathcal U(\rho_n^{\otimes2})\le\varepsilon_g+\epsilon$ whenever $n\ge n_\epsilon$. So pick any $\epsilon>0$, let $n=n_\epsilon$, and let $\{q_k\}_{k\in\mathbb N}$ be i.i.d. with a priori measure $\rho_n\in(P\cap C^0_b)(\Lambda)$ for each $q_k$. Then by (H4) the weak law of large numbers for $U$ statistics (of order 2) holds [18], and so, in probability,
\[
\mathcal U\bigl(\Delta^{(2)}_{X^{(N)}}\bigr)\xrightarrow{N\to\infty}\langle\rho_n,\rho_n\rangle\le\varepsilon_g+\epsilon
\tag{186}
\]
for each $\epsilon>0$. By (186) and (185) we have
\[
\limsup_{N\to\infty}\varepsilon_g(N)=\limsup_{N\to\infty}\mathcal U\bigl(\Delta^{(2)}_{X_g^{(N)}}\bigr)\le\varepsilon_g.
\tag{187}
\]
On the other hand, by the compactness of $\Lambda$ and the weak$^*$ compactness of $P(\Lambda^2)$ we can extract a weak$^*$ convergent subsequence $\Delta^{(2)}_{X_g^{(\dot N)}}\to{}^2\dot\rho\in P(\Lambda^2)$. Moreover, since any convergent sequence of $n$-point measures $\Delta^{(n)}_{X_g^{(\dot N)}}$ necessarily converges to an $n$-fold product measure, we have ${}^2\dot\rho=\dot\rho^{\otimes2}$. Now the weak lower semi-continuity of $\mathcal U$ gives
\[
\liminf_{\dot N\to\infty}\mathcal U\bigl(\Delta^{(2)}_{X_g^{(\dot N)}}\bigr)\ge\langle\dot\rho,\dot\rho\rangle\ge\varepsilon_g.
\tag{188}
\]
Estimates (187) and (188) prove convergence $\varepsilon_g(N)\to\varepsilon_g$. Convergence and the earlier proved monotonicity of $N\mapsto\varepsilon_g(N)$ complete the proof of Proposition 7.

We notice that our proof of Proposition 7 yields as a "byproduct" that $\langle\dot\rho,\dot\rho\rangle=\varepsilon_g$. Thus we have the following noteworthy corollary:

Corollary 2. Any limit point $\dot\rho^{\otimes2}$ of the sequence of ground state 2-point measures $\{\Delta^{(2)}_{X_g^{(N)}}\}_{N\in\mathbb N}$ minimizes the bilinear form $\mathcal U(\rho^{\otimes2})=\langle\rho,\rho\rangle$.
Here is our second proposition.

Proposition 8. Assume the hypotheses on $U_\Lambda(q,\tilde q)$ stated in the previous proposition, and in addition assume that $U_\Lambda\ge0$. Then the quasi pair-specific ground state energy, defined by
\[
\tilde\varepsilon_g(N)\equiv\min_{\{q_1,\dots,q_N\}}\frac{1}{N^2}\sum_{1\le i<j\le N}U_\Lambda(q_i,q_j),
\tag{189}
\]
is a strictly increasing function of $N$ which converges to $\varepsilon_g$ defined in (181).

Proof of Proposition 8. First of all, $\tilde\varepsilon_g(N)$ is as well-defined as $\varepsilon_g(N)$. Next, inspection of the monotonicity part of the proof of Proposition 7 reveals that the same steps as in (182) now yield
\[
\tilde\varepsilon_g(N+1)\ge\frac{N^2}{(N+1)(N-1)}\,\tilde\varepsilon_g(N),
\tag{190}
\]
and $\tilde\varepsilon_g(N)\ge0$ because of the here assumed positivity of $U_\Lambda$. The strict monotonicity of $N\mapsto\tilde\varepsilon_g(N)$ now follows because
\[
(N+1)(N-1)<N^2.
\tag{191}
\]
Lastly, since $1-N^{-1}\to1$, the limit of $\tilde\varepsilon_g(N)$ coincides with that of $\varepsilon_g(N)$. This concludes the proof of Proposition 8.
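The monotonicity statements of Propositions 7 and 8 can be observed in a small brute-force experiment. The sketch below is only an illustration under assumptions that are not in the paper: the domain is replaced by a five-point grid on $[0,1]$ (the combinatorial argument of (182) does not depend on the nature of the configuration set), and the toy interactions $U(q,\tilde q)=-|q-\tilde q|$ and $U(q,\tilde q)=1-|q-\tilde q|\ge0$ as well as all function names are hypothetical choices.

```python
from itertools import combinations, combinations_with_replacement

def min_pair_sum(U, grid, N):
    """Minimum over N-point configurations on `grid` (repeats allowed) of sum_{i<j} U(q_i, q_j)."""
    return min(
        sum(U(a, b) for a, b in combinations(config, 2))
        for config in combinations_with_replacement(grid, N)
    )

# Five-point grid standing in for the domain Lambda = [0, 1] (an assumption of this sketch).
GRID = [0.0, 0.25, 0.5, 0.75, 1.0]

def U_prop7(a, b):
    """Toy symmetric, continuous pair interaction for (180)."""
    return -abs(a - b)

def U_prop8(a, b):
    """Toy nonnegative pair interaction for (189)."""
    return 1.0 - abs(a - b)

N_VALUES = range(2, 8)
# Pair-specific ground state energy, normalized by N(N-1) as in (180).
eps_g = {N: min_pair_sum(U_prop7, GRID, N) / (N * (N - 1)) for N in N_VALUES}
# Quasi pair-specific ground state energy, normalized by N^2 as in (189).
eps_tilde = {N: min_pair_sum(U_prop8, GRID, N) / N**2 for N in N_VALUES}

if __name__ == "__main__":
    for N in N_VALUES:
        print(N, eps_g[N], eps_tilde[N])
```

For the first toy interaction the minimizers put the points at the two ends of the interval, and the computed sequence is nondecreasing but not strictly increasing (the values for $N=3$ and $N=4$ coincide), while the sequence normalized by $N^2$ for the nonnegative interaction increases strictly, in line with (190) and (191).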
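Appendix B. Decomposition of the Finite N Measures

Let $\varrho_\varepsilon\in P^s(\Lambda^{\mathbb N})$ be the weak limit of $\{\varrho^{(N)}_{N^2\varepsilon}\in P^s(\Lambda^N)\}_{N\in\mathbb N}$, and let $\varsigma(d\rho|\varrho_\varepsilon)$ be its unique de Finetti–Dynkin–Hewitt–Savage decomposition measure. (If $\{\varrho^{(N)}_{N^2\varepsilon}\}_{N\in\mathbb N}$ has several limit points, as accounted for in the main text, the following considerations are valid for the associated converging subsequences of finite $N$ measures.) We now show that if $\operatorname{supp}\varsigma(d\rho|\varrho_\varepsilon)$ is either a finite set or a continuous group orbit of a compact group, then for each $\rho\in\operatorname{supp}\varsigma(d\rho|\varrho_\varepsilon)$ we can explicitly construct a family of $\varrho^{(N)}[\rho]\in P^s(\Lambda^N)$ satisfying
\[
\lim_{N\to\infty}{}^n\varrho^{(N)}[\rho]=\rho^{\otimes n}
\tag{192}
\]
for each $n\in\mathbb N$, such that for each $N\in\mathbb N$,
\[
\varrho^{(N)}_{N^2\varepsilon}=\int\varrho^{(N)}[\rho]\,\varsigma(d\rho|\varrho_\varepsilon).
\tag{193}
\]

B.1. The support of $\varsigma(d\rho|\varrho_\varepsilon)$ is a finite set

In the simplest case $\varsigma(d\rho|\varrho_\varepsilon)$ is a singleton, so that $\varrho_\varepsilon=\rho_\varepsilon^{\otimes\mathbb N}$, i.e.
\[
\lim_{N\to\infty}{}^n\varrho^{(N)}_{N^2\varepsilon}=\rho_\varepsilon^{\otimes n}\quad\forall\,n\in\mathbb N.
\tag{194}
\]
In this case $\int\varrho^{(N)}[\rho]\,\varsigma(d\rho|\varrho_\varepsilon)=\varrho^{(N)}[\rho_\varepsilon]=\varrho^{(N)}_{N^2\varepsilon}$, and we are done.

Next, assume that $\varsigma(d\rho|\varrho_\varepsilon)$ is an arithmetic mean of two singletons, viz.
\[
\varsigma(d\rho|\varrho_\varepsilon)=\nu_1\delta_{\rho_1}(d\rho)+\nu_2\delta_{\rho_2}(d\rho)
\tag{195}
\]
with $0<\nu_1=1-\nu_2<1$, and let $d_{KR}(\rho_1,\rho_2)=D>0$ be the usual Kantorovich–Rubinstein distance between $\rho_1$ and $\rho_2$. Let $B_{D/2}(\rho_k)$ be the KR-open ball in $P(\Lambda)$ which is centered at $\rho_k$ and has radius $D/2$. Now decompose $\Lambda^N=\Lambda_1^N\cup\Lambda_2^N$, where $\Lambda_1^N\cap\Lambda_2^N=\emptyset$ and $\varrho^{(N)}_{N^2\varepsilon}(\Lambda_k^N)=\nu_k$, such that $\Lambda_k^N$ contains all points for which ${}^1\Delta^{(N)}\in B_{D/2}(\rho_k)$; when $N$ is too small there may be no such points, but by the weak density in $P(\Lambda)$ of the empirical one-point measures the set of such points $\in\Lambda^N$ has positive $\varrho^{(N)}_{N^2\varepsilon}$ measure when $N$ is large enough. In fact, since by hypothesis the weak limit of $\{\varrho^{(N)}_{N^2\varepsilon}\in P^s(\Lambda^N)\}_{N\in\mathbb N}$ is given by $\varrho_\varepsilon=\nu_1\rho_1^{\otimes\mathbb N}+\nu_2\rho_2^{\otimes\mathbb N}\in P^s(\Lambda^{\mathbb N})$, it follows that when $N\nearrow\infty$ then the probability with respect to $\varrho^{(N)}_{N^2\varepsilon}$ that ${}^1\Delta^{(N)}\in B_{D/2}(\rho_k)$ approaches $\nu_k$. So if we define
\[
\varrho^{(N)}[\rho_k]=\nu_k^{-1}\,\varrho^{(N)}_{N^2\varepsilon}\,\chi_{\Lambda_k^N}
\tag{196}
\]
and recall that $\varrho^{(N)}_{N^2\varepsilon}(\Lambda_k^N)=\nu_k$, it follows that
\[
\lim_{N\to\infty}{}^n\varrho^{(N)}[\rho_k]=\rho_k^{\otimes n}
\tag{197}
\]
for each $n\in\mathbb N$ and $k=1$ or $2$, and such that for each $N\in\mathbb N$,
\[
\varrho^{(N)}_{N^2\varepsilon}=\nu_1\varrho^{(N)}[\rho_1]+\nu_2\varrho^{(N)}[\rho_2],
\tag{198}
\]
which is (193) in the case that $\varsigma$ is the arithmetic mean of two singletons.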
The general case of $\operatorname{supp}\varsigma$ being a finite set is treated similarly in an obvious manner, with $D$ now the minimum of the set of distances between any pair $(\rho_k,\rho_l)$ picked from the support of $\varsigma$.

B.2. The support of $\varsigma(d\rho|\varrho_\varepsilon)$ is a continuous group orbit

For simplicity we assume that we are dealing with a one-parameter continuous group $G$ acting on the base space, like SO(2) acting on $\Lambda$; the generalization to more complicated situations (e.g. SO(3) acting on $\Lambda$) is straightforward. In this case we can pick any particular $\rho_0\in\operatorname{supp}\varsigma$ and obtain every other (say) $\rho_\theta\in\operatorname{supp}\varsigma$ by acting with a group element $g_\theta\in G$ thusly, $\rho_\theta=\rho_0\circ g_\theta$. The de Finetti etc. decomposition of $\varrho_\varepsilon$ can then be written as an integral with respect to Haar measure over the group $G$ of the infinite product measures $\rho_\theta^{\otimes\mathbb N}$. The corresponding finite $N$ presentation is simply obtained by change of variables for $\varrho^{(N)}_{N^2\varepsilon}$ through factoring out the group $G$, which gives each $\varrho^{(N)}[\rho_\theta]$ uniquely.

References

[1] E. L. Altschuler, T. J. Williams, E. R. Ratner, R. Tipton, R. Stong, F. Dowla and F. Wooten, Possible global minimum lattice configurations for Thomson's problem of charges on the sphere, Phys. Rev. Lett. 78 (1997) 2681–2685.
[2] L. Boltzmann, Vorlesungen über Gastheorie (J. A. Barth, Leipzig, 1896); Lectures on Gas Theory (Univ. California Press, Berkeley, 1964), translated into English by S. G. Brush.
[3] E. Caglioti, L. P. Lions, C. Marchioro and M. Pulvirenti, A special class of stationary flows for two-dimensional Euler equations: A statistical mechanics description, Comm. Math. Phys. 143 (1992) 501–525.
[4] E. Caglioti, L. P. Lions, C. Marchioro and M. Pulvirenti, A special class of stationary flows for two-dimensional Euler equations: A statistical mechanics description. II, Comm. Math. Phys. 174 (1995) 229–260.
[5] J. L. Campbell and K. O'Neil, Statistics of two-dimensional point vortices and high energy vortex states, J. Stat. Phys. 65 (1991) 495–529.
[6] S. Chanillo and M. K.-H. Kiessling, Surfaces with prescribed Gauss curvature, Duke Math. J. 105 (2000) 309–353.
[7] P. H. Chavanis, Phase transitions in self-gravitating systems. Self-gravitating fermions and hard spheres models, Phys. Rev. E 65 (2002) 056123.
[8] M. Costeniuc, R. S. Ellis, H. Touchette and B. Turkington, The generalized canonical ensemble and its universal equivalence with the microcanonical ensemble, J. Stat. Phys. 119 (2005) 1283–1329.
[9] E. B. Dynkin, Klassy ekvivalentnyh slučaĭnyh veličin, Uspekhi Mat. Nauk. 6 (1953) 125–134.
[10] P. Eichelsbacher and U. Schmock, Large Deviations of U-empirical measures in strong topologies and applications, Ann. Inst. Henri Poincaré 38 (2002) 779–797.
[11] R. S. Ellis, Entropy, Large Deviations, and Statistical Mechanics (Springer-Verlag, New York, 1985).
[12] R. S. Ellis, K. Haven and B. Turkington, Large deviation principles and complete equivalence and nonequivalence results for pure and mixed ensembles, J. Stat. Phys. 101 (2000) 999–1064.
October 26, 2009 11:30 WSPC/148-RMP
J070-00385
M. K.-H. Kiessling
[13] G. Eyink and H. Spohn, Negative temperature states and large-scale, long-lived vortices in two-dimensional turbulence, J. Stat. Phys. 70 (1993) 833–886.
[14] B. de Finetti, La prévision: Ses lois logiques, ses sources subjectives, Ann. Inst. Henri Poincaré 7 (1937) 1–68.
[15] J. W. Gibbs, Elementary Principles in Statistical Mechanics (Yale Univ. Press, New Haven, 1902); reprint (Dover, New York, 1960).
[16] R. B. Griffiths, Microcanonical ensemble in quantum statistical mechanics, J. Math. Phys. 6 (1965) 1447–1461.
[17] E. Hewitt and J. L. Savage, Symmetric measures on Cartesian products, Trans. Amer. Math. Soc. 80 (1955) 470–501.
[18] W. Hoeffding, A class of statistics with asymptotically normal distributions, Ann. Statist. 19 (1948) 293–325; reprinted (except for Sec. 9.e–9.h) in Breakthroughs in Statistics, Vol. I, eds. S. Kotz and N. L. Johnson (Springer-Verlag, New York, 1992).
[19] M. K.-H. Kiessling, Statistical mechanics of classical particles with logarithmic interactions, Commun. Pure Appl. Math. 47 (1993) 27–56.
[20] M. K.-H. Kiessling, Statistical mechanics approach to some problems in conformal geometry, Physica A 297 (2000) 353–368.
[21] M. K.-H. Kiessling, Statistical equilibrium dynamics, in Dynamics and Thermodynamics of Systems with Long-Range Interactions: Theory and Experiments, eds. A. Campa, A. Giansanti, G. Morigi and F. Sylos Labini, AIP Conf. Proc., Vol. 970 (American Inst. Phys., 2008), pp. 91–108.
[22] M. K.-H. Kiessling, On Ruelle's construction of the thermodynamic limit for the classical microcanonical entropy, J. Stat. Phys. 134 (2009) 19–25.
[23] M. K.-H. Kiessling, A note on classical ground state energies, J. Stat. Phys. 136 (2009) 275–284.
[24] M. K.-H. Kiessling and C. Lancellotti, in preparation (2009).
[25] M. K.-H. Kiessling and J. L. Lebowitz, The microcanonical point vortex ensemble: Beyond equivalence, Lett. Math. Phys. 42 (1997) 43–56.
[26] M. K.-H. Kiessling and H. Spohn, A note on the eigenvalue density of random matrices, Comm. Math. Phys. 199 (1999) 683–695.
[27] S. Kusuoka and Y. Tamura, Gibbs measures for mean field potentials, J. Fac. Sci. Univ. Tokyo, Sec. IA, Math. 31 (1984) 223–245.
[28] O. E. Lanford III, Entropy and equilibrium states in classical statistical physics, in Statistical Mechanics and Mathematical Problems, Conf. Proc. Battelle Seattle Rencontres (1971), ed. A. Lenard, Lecture Notes in Physics, Vol. 20 (Springer, 1973), pp. 1–113.
[29] A. Lenard (ed.), Statistical Mechanics and Mathematical Problems, Conf. Proc. Battelle Seattle Rencontres (1971), Lecture Notes in Physics, Vol. 20 (Springer, 1973).
[30] A. Martin-Löf, Statistical Mechanics and the Foundations of Thermodynamics, eds. J. Ehlers et al., Lecture Notes in Physics, Vol. 101 (Springer, 1979).
[31] J. Messer and H. Spohn, Statistical mechanics of the isothermal Lane–Emden equation, J. Stat. Phys. 29 (1982) 561–578.
[32] D. Lynden-Bell and R. M. Lynden-Bell, Exact general solutions to extraordinary N-body problems, Proc. R. Soc. Lond. Ser. A 445 (1999) 475–489.
[33] D. Lynden-Bell and R. M. Lynden-Bell, Relaxation to a perpetually pulsating equilibrium, J. Stat. Phys. 117 (2004) 199–209.
[34] K. O'Neil and A. R. Redner, On the limiting distribution of pair-summable potential functions in many-particle systems, J. Stat. Phys. 62 (1991) 399–410.
[35] L. Onsager, Statistical hydrodynamics, Nuovo Cimento Suppl. 6 (1949) 279–287.
[36] O. Penrose, Foundations of Statistical Mechanics: A Deductive Treatment (Pergamon Press, Oxford, 1970); reprinted (Dover, 2005).
[37] A. Pérez-Garrido, M. J. W. Dodgson, M. A. Moore, M. Ortuño and A. Díaz-Sánchez, Comment on "Possible global minimum lattice configurations for Thomson's problem of charges on a sphere", Phys. Rev. Lett. 79 (1997) 1417.
[38] M. Reed and B. Simon, Methods of Modern Mathematical Physics I (Academic Press, New York, 1980).
[39] W. D. Robinson and D. Ruelle, Mean entropy of states in classical statistical mechanics, Comm. Math. Phys. 5 (1967) 288–300.
[40] D. Ruelle, Statistical Mechanics: Rigorous Results (Benjamin, New York, 1969); reprinted, Advanced Book Classics series (Addison-Wesley, Reading, 1989).
[41] H. Spohn, Large Scale Dynamics of Interacting Particles, Texts and Monographs in Physics (Springer, 1991).
[42] B. Stahl, M. K. H. Kiessling and K. Schindler, Phase transitions in gravitating systems and the formation of condensed objects, Planet. Space Sci. 43 (1995) 271–282.
[43] J. J. Thomson, On the structure of the atom: An investigation of the stability and periods of oscillation of a number of corpuscles arranged at equal intervals around the circumference of a circle; with application of the results to the theory of atomic structure, Philos. Mag. 7 (1904) 237–265.
[44] H. Touchette, Review on large deviation theory and statistical mechanics, Phys. Rep. 478 (2009) 1–69.
[45] S. R. S. Varadhan, Large Deviations and Applications, CBMS-NSF Regional Conf. Ser. Appl. Math., Vol. 46 (SIAM, 1984).
November 17, 2009 15:39 WSPC/148-RMP
J070-00384
Reviews in Mathematical Physics Vol. 21, No. 10 (2009) 1197–1240 © World Scientific Publishing Company
FIVEBRANE STRUCTURES
HISHAM SATI∗,§, URS SCHREIBER†,¶ and JIM STASHEFF‡

∗Department of Mathematics, Yale University, 10 Hillhouse Avenue, New Haven, CT 06511, USA
†Fachbereich Mathematik, Schwerpunkt Algebra und Zahlentheorie, Universität Hamburg, Bundesstraße 55, D-20146 Hamburg, Germany
‡Department of Mathematics, University of Pennsylvania, David Rittenhouse Lab., 209 South 33rd Street, Philadelphia, PA 19104-6395, USA
§[email protected]
¶[email protected]
[email protected]

Received 9 April 2009
Revised 4 August 2009
We study the cohomological physics of fivebranes in type II and heterotic string theory. We give an interpretation of the one-loop term in type IIA, which involves the first and second Pontrjagin classes of spacetime, in terms of obstructions to having bundles with certain structure groups. Using a generalization of the Green–Schwarz anomaly cancellation in heterotic string theory which demands the target space to have a String structure, we observe that the “magnetic dual” version of the anomaly cancellation condition can be read as a higher analog of String structure, which we call Fivebrane structure. This involves lifts of orthogonal and unitary structures through higher connected covers which are not just 3- but even 7-connected. We discuss the topological obstructions to the existence of Fivebrane structures. The dual version of the anomaly cancellation points to a relation of string and Fivebrane structures under electric-magnetic duality. Keywords: Anomaly; obstruction theory; generalizations of fiber spaces and bundles; string theory; superstring theory; fivebrane. Mathematics Subject Classification 2000: 81T50, 55S35, 55R65, 81T30
1. Introduction

Cohomological physics began with Gauss in 1833, if not sooner (cf. Kirchhoff's laws). The cohomology referred to in Gauss' work was that of differential forms, div, grad, curl and especially Stokes' theorem (the de Rham complex). Gauss explicitly defined the linking number of two circles imbedded in 3-space by an integral defined in terms of the electromagnetic effect of a current circulating in one of the circles. Although Maxwell's equations were for a long time expressed only in coordinate dependent form, they were recast in a particularly attractive way in terms
of differential forms on Minkowski space. More subtle differential geometry and implicitly characteristic classes occurred visibly in Dirac's magnetic monopole [24], which lived in a U(1) bundle over R^3 − 0. The magnetic charge was given by the first Chern number; for magnetic charge 1, the monopole lived in the Hopf bundle, introduced that same year 1931 by Hopf [45], though it seems to have taken some decades for that coincidence to be recognized [42]. Thus were characteristic classes (and by implication the cohomology of Lie algebras and of Lie groups) introduced into physics. In the case of electromagnetic theory, the Lie group was just U(1), but more general Lie groups were involved in Yang–Mills theory. There was a linguistic barrier between physicists and mathematicians (see Yang's reminiscences [75]), which was breached when it was realized that the physicist's gauge potential and field strength were, respectively, the mathematician's connection and curvature from differential geometry. A major development occurred when Dirac's theory of the electron required, in modern language, lifting from the special orthogonal group SO(n) to the spinor group Spin(n), which corresponds to "killing" the first homotopy group π1(SO(n)) of SO(n). Much later it was found that in string theory a further step is needed, namely lifting the spinor group to the string group String(n), giving rise to string structures. Such structures, interpreted in terms of the vanishing condition of the worldsheet anomaly of a superstring, were first identified by Killingback [49] and shortly afterwards amplified by Witten [72] in the context of the index theory of Dirac operators on loop space. A nice review and differential geometric description can be found in [55]. In all these articles, the String structure is regarded as a lift of an LSpin(n)-bundle over the free loop space LX through the Kac–Moody central extension to an L̂Spin(n)-bundle.
It was in [66] where it was pointed out that this lift can also be interpreted as a lift of the original Spin(n)-bundle down on target space X to a principal bundle for the topological group called String(n). Then it was realized in [7] (see also [43]) that this topological group is in fact the realization of the nerve of a smooth (as opposed to just topological) albeit categorified group: the string 2-group. This paves the way for a differential geometric treatment of string structures on the target space X. Higher categorical generalizations of bundles are called n-bundles, whose structure groups are n-groups, the n-fold categorifications of the structure groups of the ordinary bundles. It was known early on that 2-bundles [10,8] (aka “crossed module bundle gerbes” [47]) with structure 2-group the string 2-group have the same classification as that of the string-bundles considered by Stolz–Teichner (compare [47] and [7]), and hence, rationally [22], as that of the bundles on loop space originally considered by Killingback and by Witten. Detailed proofs of this have recently appeared [6, 9]. Note that the string condition is closely related to the condition on vanishing of the integral seventh Stiefel–Whitney class, W7 = 0, observed in [23] and studied in connection to generalized cohomology in [50]. In particular, the
[Fig. 1. Homotopy groups of O(n) and its higher connected covers for n > k + 1. The figure tabulates πk(O(n)) for k = 0, ..., 7 alongside the tower of higher connected covers: O(n), SO(n) (kill π0), Spin(n) (kill π1), String(n) (kill π3), Fivebrane(n) (kill π7).]
string condition implies the vanishing of W7 as $W_7 = Sq^3(\tfrac{1}{2}p_1)$, where $Sq^3$ is the Steenrod square that raises the cohomology degree by three.

The word string being well established for maps from 1-dimensional manifolds, higher-dimensional analogs are referred to as branes (originally, membranes). The "surface" formed by an evolving string is called the worldsheet and, analogously, the higher-dimensional volumes of evolving branes are referred to as the worldvolumes. The most studied types of branes in string theory are the famous D-branes, which couple to the Ramond–Ramond (RR) fields. We will not be dealing with D-branes, and hence with RR fields, in this paper. In addition to D-branes, there are the Neveu–Schwarz (NS) branes that couple to the NSNS fields. The fundamental string couples to the B-field B2, whose curvature is H3 (= dB2 + non-exact, locally). The Hodge dual of H3 in ten dimensions is H7, which can be viewed as the curvature of a degree six "potential" B6, to which couples an extended object, called the NS 5-brane. Thus, the 5-brane is the magnetic dual to the string (in the sense familiar in string theory, which mathematicians can find nicely described in [35] in terms of differential characters, which are just another way of talking about higher line bundles with connections).

It is to be expected that anomaly freedom of the spinors on the fivebrane's worldvolume requires the target to carry an analog of a Spin structure even more strict than a String structure. There is a known formula for the dual to the Green–Schwarz anomaly. This formula is discussed in Sec. 3. The same formula can be deduced from anomaly freedom of the worldvolume theory of the super 5-brane [51, 28]. We observe that this known formula can be read as saying that target space X needs to admit a lift of the structure group of TX from SO(n) through Spin(n), through String(n) and then further to the 7-connected cover of SO(n) which we dub Fivebrane(n).
We define a Fivebrane structure, which is obtained by killing all homotopy groups of the orthogonal group up to and including the seventh. This is entirely analogous to the process of killing the third homotopy group in going from the spin group to the string group. Generally, one could wonder what the even higher connected covers of SO(n) would correspond to in terms of higher brane physics. While the notions of higher lifts are of course known in homotopy theory, we here discuss the topological structure associated with the physics of fivebranes, much the same as in the known string case.

Definition 1. An n-dimensional manifold X has a fivebrane structure if the classifying map f : X → BO(n) of the tangent bundle TX lifts to BFivebrane := BO⟨9⟩:

               BO(n)⟨9⟩
         f̂ ↗      ↓
       X ———f———→ BO(n).
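
The pattern of homotopy groups behind Definition 1 and Fig. 1 can be tabulated in a few lines. The following is a minimal sketch and not part of the original argument: it assumes the stable range n > k + 1, where πk(O(n)) is given by Bott periodicity with period 8, and the names `BOTT`, `COVERS`, `pi_k_O`, `pi_k_cover` and `first_nonvanishing` are hypothetical helpers introduced only for illustration.

```python
# Stable homotopy groups pi_k(O(n)), valid for n > k + 1 (Bott periodicity, period 8).
BOTT = ["Z2", "Z2", "0", "Z", "0", "0", "0", "Z"]

# Each cover kills all homotopy groups up to and including the listed degree.
# (-1 means that nothing has been killed yet.)
COVERS = {"O": -1, "SO": 0, "Spin": 1, "String": 3, "Fivebrane": 7}


def pi_k_O(k: int) -> str:
    """Return pi_k(O(n)) in the stable range as a string label."""
    if k < 0:
        raise ValueError("k must be non-negative")
    return BOTT[k % 8]


def pi_k_cover(cover: str, k: int) -> str:
    """Return pi_k of the given connected cover of O(n) in the stable range."""
    return "0" if k <= COVERS[cover] else pi_k_O(k)


def first_nonvanishing(cover: str) -> int:
    """Return the lowest degree k in which the cover has a nonzero homotopy group."""
    k = COVERS[cover] + 1
    while pi_k_cover(cover, k) == "0":
        k += 1
    return k


if __name__ == "__main__":
    for cover in COVERS:
        row = [pi_k_cover(cover, k) for k in range(9)]
        print(f"{cover:>10}: {row}, first nonzero degree {first_nonvanishing(cover)}")
```

In this sketch the first nonvanishing homotopy group of String(n) sits in degree 7, which is exactly the group killed in passing to Fivebrane(n). Since passing to classifying spaces shifts homotopy degrees up by one, the first nonvanishing group of the classifying space of Fivebrane(n) sits in degree 9, in line with the notation BO⟨9⟩.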
One point we make is that the fivebrane — as opposed to branes of other dimensions — is distinguished in 10-dimensional spacetime since it does indeed lead
to a structure that naturally generalizes the String structure, due to the existence of the field H7, the dual field to H3. The importance of H7 will be highlighted in Sec. 3. Notice that the existence of this lift implies lifts through all the lower connected covers, which says that for X to have a Fivebrane structure it must also have an orientation, a Spin structure and a String structure.

Our aim is to
• understand the topological nature of such Fivebrane structures, i.e. identify the relevant bundles and the corresponding characteristic classes;
• understand their differential geometric nature in terms of characteristic forms of such bundles (e.g. generalized connections) analogous to what is done for string theory.
This paper will focus more on the first, while a sequel [62] will emphasize the second. We shall show the following:

Theorem 1.
• The obstruction to lifting a String structure on X to a Fivebrane structure on X is the fractional second Pontrjagin class $\tfrac{1}{6}p_2(TX)$. This is our Proposition 1. Here the fractional coefficient 1/6 is the crucial subtle point. It is explained in Sec. 4.4.2.
• Inequivalent Fivebrane structures on X are classified by a quotient of $H^7(X,\mathbb{Z})$. This is Proposition 3.

But this is not the full story yet. String structures are in general carried not just by a manifold X, but by a complex vector bundle E → X (e.g. the gauge bundle on the target space of the heterotic string): this vector bundle has a String structure if its second Chern classᵃ cancels the fractional first Pontrjagin class of the spin bundle X:

E → X has a Spin structure ⇔ E is orientable and $w_2(E) = 0$,
E → X has a String structure ⇔ E has Spin structure and $\tfrac{1}{2}p_1(TX) = \mathrm{ch}_2(E)$.

If the second Chern character of E vanishes, this reduces to saying that X itself has a String structure. (The fractional first Pontrjagin class is reviewed in Sec. 4.4.1.)

ᵃ We review characteristic classes and characters in Appendix A.

This situation generalizes. A fivebrane structure should be assigned, more generally, to a vector bundle E → X. The dual Green–Schwarz mechanism indicates that the condition on E to have a fivebrane structure is essentially that the fourth
Table 1. Higher spin-like structures on manifolds and on "gauge bundles" over a manifold X. Here "gauge bundle" technically just means: complex vector bundle. The first four entries are well established; we are concerned here with the last two entries.

Higher spin-like structure                        Defining condition
Spin structure on manifold X                      orientation and $w_2(TX) = 0$
Spin structure on gauge bundle E → X              orientation and $w_2(E) = 0$
String structure on manifold X                    Spin structure and $\tfrac{1}{2}p_1(TX) = 0$
String structure on gauge bundle E → X            Spin structure and $\tfrac{1}{2}p_1(TX) = \mathrm{ch}_2(E)$
Fivebrane structure on manifold X                 String structure and $\tfrac{1}{6}p_2(TX) = 0$
Fivebrane structure on gauge bundle E → X         String structure and $\tfrac{1}{6}p_2(TX) = 8\,\mathrm{ch}_4(E)$ + decomposables
Chern class of E cancels the second Pontrjagin class of X, but there are corrections by decomposable classes (because ch2 = c2 + decomposable classes):

E → X has Fivebrane structure ⇔ E has String structure and $\tfrac{1}{6}p_2(TX) = \mathrm{ch}_4(E)$ + decomposable classes.

We will study Fivebrane structures arising from anomaly polynomials in two theories: in type IIA string theory and in heterotic string theory. In the first, we will have a fivebrane structure for the tangent bundle of the space itself, since there is no gauge bundle. We address this in full detail. In the second theory, we have, in addition, a gauge bundle. The corresponding condition will involve both the tangent bundle and this gauge bundle. We are still able to give a description when the two separate conditions are imposed: that separately the tangent bundle and the gauge bundle admit a fivebrane structure. In the general case, we encounter a difficulty, namely that the factors (the second Pontrjagin classes) come with different coefficients. We explain this difficulty and provide (partial) solutions in Sec. 4.6 and towards the end of Sec. 4.7. We will be concerned in this paper primarily with the cohomological aspects of Fivebrane structures rather than the more subtle aspects related to K-theory which we hope to address elsewhere.

The presence of the decomposable cohomology classes (products of nontrivial cohomology classes) appearing on the right can be deduced from the dual Green–Schwarz formula. Notice that a decomposable characteristic class necessarily suspends to a trivial group cocycle. So despite the appearance of those decomposable classes, Fivebrane structures are essentially controlled just by the seventh Lie algebra cohomology of so(n), just as String structures are controlled by the third Lie algebra cohomology (compare [61]). We shall discuss in [62] that the decomposable
characteristic classes in the dual Green–Schwarz formula affect the differential class (the connection) but not the integral class of the smooth Chern–Simons 7-bundle whose nontriviality obstructs the existence of Fivebrane structures.

The description in terms of anomalies will be given in Sec. 3.1. The notion of and notation for connected covers is explained towards the end of Sec. 4.1. A recollection on characteristic classes is given in Appendix A.

Outlook: Differential geometric interpretation. Just as string 2-bundles with connection provide a differential geometric realization of String structure, the higher analog of a spin bundle with connection, there are analogous higher bundles with connection giving a differential geometric realization of Fivebrane structures. Following the counting pattern, these deserve to be called fivebrane-6-bundles or fivebrane-5-gerbes with connection, but the reader might want to think of them just as a nonabelian version of differential cocycles with top degree curvature form a 7-form. As will be described in [62], these nonabelian differential cocycles can be obtained from integrating the L∞-algebra connections of [61]. There the Lie 2-algebra string(n) (an L∞-algebra concentrated in the lowest two degrees) governing string bundles with connection had been accompanied by the Lie 6-algebra fivebrane(n) (concentrated in the lowest six degrees) which plays the corresponding role for Fivebrane structures.

2. The Context

Before getting to our main discussion, we first indicate the general context in which these questions arise. The word string being well established for maps from 1-dimensional manifolds, higher-dimensional analogs are referred to as branes (originally, membranes). The "surface" formed by an evolving string is called the worldsheet and, analogously, the higher-dimensional volumes of evolving branes are referred to as the worldvolumes.
We have an "n-particle", otherwise known as an (n − 1)-brane, whose worldvolume is an n-dimensional manifold Σ, or rather the image of that n-dimensional manifold under a sufficiently well behaved map

φ : Σ → X        (1)

into a target space (usually to be thought of as physical spacetime) X. On that spacetime, we have a (generalized, higher) bundle (thought of as an n-bundle or (n − 1)-gerbe) with (generalized) connection. Henceforth, we will drop the “generalized” and the n unless crucial. Thinking of an ordinary bundle with connection should be sufficient. This bundle with connection encodes the data specifying the “background field” to which that (n − 1)-brane “couples”. Just as with ordinary connections, connections on an n-bundle can be expressed either in terms of local (p ≤ n)-forms on the base manifold or in terms of global (p ≤ n)-forms on the total space [61]. The local representatives of these forms are often referred to as a “background field”, though technically speaking the background field is that differential form datum together with the descent/gluing data that makes it a differential
cocycle. In particular, we have the 3-form curvature H3 of the Kalb–Ramond field in string theory, which is, in this sense, the curvature of a "background field", the "B-field", of a string theory. Likewise, we have the 7-form H7 which is the curvature of a background field for the fivebrane.

2.1. Σ-models

We are concerned with the mathematical structure which is supposed to model the physics of charged n-particles, usually known as charged (n − 1)-branes or as quantum field theories of Σ-model type. Such a Σ-model is specified by choosing
• a "space" X, called the target space;
• a "space" (or class of such) Σ, called the parameter space or the worldvolume;
• the mapping space Maps(Σ, X), called the space of fields or the configuration space or sometimes the moduli space (the latter is usually a quotient);
• on the target space a differential n-cocycle ∇, i.e. a higher generalization of a fiber bundle with connection, called the background field;
• a prescription for how to interpret the push-forward of the pullback ev*∇ along the projection pr1 onto Σ in the correspondence diagram

              Σ × Maps(Σ, X)
          pr1 ↙          ↘ ev        (2)
            Σ                X

  called the path integral or the quantization of the Σ-model.

When the parameter space Σ is n-dimensional, one thinks of this data as encoding the physics of n-fold higher analogs of particles, "n-particles", that propagate on X. The field configuration (1) is thought of as the trajectory of such an n-particle in X. One says that the n-particle couples to the background field ∇ or that it is charged under the background field. The terminology is entirely motivated from the familiar case of ordinary electromagnetically charged 1-particles: the electromagnetic background field ∇ which they couple to is modeled by a vector bundle (a line bundle in this case) with connection.

For n = 2 one speaks of "strings". String theory strictly is the study of those n = 2 Σ-models with a special restriction for what the "path integral" is allowed to be. Technically, string theory is required to encode a 2-dimensional superconformal field theory of central charge 15. This condition, however, is of no real relevance for our discussion here, which pertains to all Σ-models which generalize the "spinning 1-particle".

Some of the most interesting ideas concerning such Σ-models are originally due to Dan Freed: The interpretation of background fields and of charges as differential cocycles is nicely described and worked out in [35, 46], where the mathematically inclined reader can find rigorous interpretations, in terms of differential cohomology,
of the abelian kinds of "background fields" and related "anomalies" in string theory with which we are concerned here, which include the Chern–Simons 3- and 7-bundles with connection obstructing the String(n) and Fivebrane(n) lifts, but not these lifts themselves. (For those, one would need nonabelian differential cohomology [62].)

Integration as push-forward. The interpretation of quantization and of the path integral as an operation on higher categorical structures was first explored in [33, 34]. Integration as a push-forward operation plays a prominent role in recent developments by Stolz and Teichner and by Hopkins et al. Let us take for instance the simple toy example case where the background field ∇ is a vector bundle (without connection) and where Σ is a point, i.e. n = 0. In this case pr1 is the map from pt × Maps(pt, X) = X to the point. Integration over the fiber in this case is just integration over X itself, and the ordinary push-forward gives the space of sections of the original vector bundle, as the point is varied over X. That reproduces indeed the desired "quantization over the point" and can, following [33, 34], be regarded as the codimension 1 part of the full path integral for n = 1. Stolz and Teichner describe a variation of this which involves push-forward of K-theory classes to the point, which then classifies connected components of all (supersymmetric) 1-dimensional Σ-models.

This shows that, while a fully satisfactory mathematical interpretation of the quantization of Σ-models is to date still an open question, a coherent picture, revolving around the correspondence (2), is beginning to emerge. The "higher spin-like structures" on target space X discussed here are believed to ensure the existence of the quantization step in the case that the Σ-model generalizes that describing spinning 1-particles.

The sigma model for the string (i.e. Σ = string worldsheet) has been studied extensively in the literature and provides a consistent model both at the classical and the quantum levels. Due to string/fivebrane duality in ten dimensions [69, 16, 31, 32], one expects that there should be a formulation of string theory via fivebrane sigma models (i.e. Σ = fivebrane worldvolume). While this is expected, we point out that such a program has not been fully completed. However, there are works that point in that direction [28]. At least for the gravitational parts of the fivebrane (i.e. without the gauge bundle), there are models in which the anomalies from the worldvolume theory match the expression of the polynomials in the Pontrjagin classes [28, 51]. That is enough for our purposes since our main focus is the cohomological structure resulting from lifts of the tangent bundle of spacetime and, after all, it is not our aim to write down a full quantum fivebrane sigma model action. The paper [51] also highlights some of the difficulties encountered, and gives partial resolutions. The reader not further concerned with string theoretic reasoning might proceed to Sec. 4.

2.2. Background fields

Independently of how the "background field" ∇ is modeled, it should locally be encoded by differential form data. See Table 2.
Table 2. Simple (abelian) examples for n-particles and the background fields they couple to. The background fields are often addressed in terms of the symbols used for their local form data: the Kalb–Ramond field is known as the "B-field" with its "H3 field strength". Similarly one speaks of the "C3-field" and its field strength "G4", etc. This reflects the historical development, where the local differential form data was discovered first and its global interpretation only much later. (See also the remark on anomalies at the beginning of Sec. 3.)

n-Particle: (1-)particle
  Background field: Electromagnetic field
  Global model: Line bundle with connection / Cheeger–Simons differential 2-character / Deligne 2-cocycle
  Local differential form data: Connection 1-form $A \in \Omega^1(Y)$; curvature 2-form $F_2 := dA \in \Omega^2_{\mathrm{closed}}(Y)$

n-Particle: string (2-particle) (1-brane)
  Background field: Kalb–Ramond field
  Global model: Line 2-bundle with connection / bundle gerbe with connection ("and curving") / Cheeger–Simons differential 3-character / Deligne 3-cocycle
  Local differential form data: Connection 2-form $B \in \Omega^2(Y)$; curvature 3-form $H_3 := dB \in \Omega^3_{\mathrm{closed}}(Y)$

n-Particle: membrane (3-particle) (2-brane)
  Background field: Supergravity 3-form field
  Global model: Line 3-bundle with connection / bundle 2-gerbe with connection ("and curving") / Cheeger–Simons differential 4-character / Deligne 4-cocycle
  Local differential form data: Connection 3-form $C \in \Omega^3(Y)$; curvature 4-form $G_4 := dC \in \Omega^4_{\mathrm{closed}}(Y)$
All the relevant background fields that have been considered are locally controlled by some L∞-algebra g, and the local differential form data can always be considered as encoding differential forms $A \in \Omega^\bullet(Y, g)$ with values in the L∞-algebra g [61]. In the case of abelian differential cocycles, these L∞-algebras are all of the form $b^{n-1}u(1)$: the higher-dimensional versions of u(1).
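
The counting pattern of Table 2 can be summarized in a few lines of code. This is only an illustrative sketch: the function name `abelian_background` and the dictionary keys are hypothetical, and extrapolating the pattern beyond the three tabulated rows (for instance to n = 6, the abelian data of the fivebrane with potential B6 and curvature H7) is an assumption based on the counting used in the text.

```python
def abelian_background(n: int) -> dict:
    """Degrees attached to the abelian background field of an n-particle.

    Follows the pattern of Table 2: an n-particle, i.e. an (n - 1)-brane,
    couples to a line n-bundle with connection, whose local data is a
    connection n-form with a closed curvature (n + 1)-form.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return {
        "n_particle": f"{n}-particle",
        "brane": f"{n - 1}-brane",
        "bundle": f"line {n}-bundle with connection",
        "connection_degree": n,
        "curvature_degree": n + 1,
        "deligne_cocycle_degree": n + 1,
        "linfty_algebra": f"b^{n - 1} u(1)",
    }


if __name__ == "__main__":
    # n = 1, 2, 3 reproduce the rows of Table 2; n = 6 is the extrapolation
    # to the fivebrane mentioned in the lead-in.
    for n in (1, 2, 3, 6):
        print(abelian_background(n))
```

For n = 2 this returns a connection 2-form and a curvature 3-form, matching the Kalb–Ramond row (B and H3).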
2.3. Charges

Just as an ordinary 1-bundle may be trivialized by a nontrivial section, which one may think of as a "twisted 0-bundle", higher n-bundles may be trivialized by "higher sections" which are called "twisted (n − 1)-bundles". One says the twisted (n − 1)-bundle is "twisted by" the corresponding n-bundle. A beautiful description of this situation for abelian n-bundles with connection in terms of differential characters is given in [35, 46]. Twisted nonabelian 1-bundles have been studied in detail under the term "bundle gerbe modules" [13]. Twisted nonabelian 2-bundles have first been considered in [5, 47] under the name "twisted crossed module bundle gerbes". In terms of the L∞-connections considered in [61], twisted n-bundles with connections are the connections for L∞-algebras arising as mapping cone L∞-algebras $(b^{n-1}u(1) \to \hat{g})$.

By comparing the formalism here with the situation of ordinary electromagnetism, one can regard the twisting n-bundle as encoding the presence of "magnetic charge". This, too, is nicely explained at the beginning of [35]. Accordingly, where an untwisted (n − 1)-bundle has a curvature n-form Hn which is closed, a twisted (n − 1)-bundle has a curvature n-form which is "twisted by" the curvature
(n + 1)-form $G_{n+1}$ of the twisting n-bundle:

$$dH_n = G_{n+1}. \tag{3}$$
Indeed, for a twisted (n − 1)-bundle the curvature $H_n$ is locally no longer the differential of the connection, $dB_{n-1} = H_n$, but receives a contribution from the connection n-form $B_n$ of the twisting n-bundle:

$$H_n = dB_{n-1} + B_n. \tag{4}$$
The archetypical example is that of ordinary magnetic charge: as Maxwell discovered in the 19th century, in the presence of magnetic charge, which in four dimensions is modeled by a 3-form $H_3 = \star j_1$, the electric field strength 2-form $F_2$ is no longer closed:

$$dF_2 = H_3. \tag{5}$$
When Dirac later discovered at the beginning of the 20th century that $H_3$ has to have integral periods (“quantization of magnetic charge”), the first 2-categorical structure in physics had appeared: the magnetic torsion 2-bundle/bundle-gerbe with de Rham class $H_3$. It seems that this was first explicitly realized in [35].

The next example of this kind received such a great amount of attention that it came to be known as the initiation of the “first superstring revolution”: the Green–Schwarz anomaly cancellation mechanism [40]. The interaction of gauge theory with type I supergravity theory leads to the (low energy limit of the) heterotic theory, which has a rich mathematical structure. The Chapline–Manton coupling [19] amounts essentially to equating the curvature $H_3$ of the B-field in type I with the Chern–Simons 3-form of the connection on the gauge bundle. More precisely, in terms of the difference of two Chern–Simons 3-bundles (Chern–Simons 2-gerbes), locally, one has

$$dH_3 = d\,\mathrm{CS}(\omega) - d\,\mathrm{CS}(A) \tag{6}$$
for ω and A the local connection 1-forms of a spin and complex vector bundle, respectively, and CS(−) denoting the corresponding Chern–Simons 3-forms. Thus this leads to nontrivial and rich structures both physically and mathematically. See for instance [35, 12, 21] for geometric treatments. As nicely explained by [35], in the “higher gauge theory” given by the effective supergravity target space theory of the heterotic string, the supergravity C-field with curvature 4-form $G_4$ (see footnote b) had to be “trivialized” by the Kalb–Ramond field with curvature 3-form $H_3$, or conversely the Kalb–Ramond field had to be “twisted” by the supergravity curvature 4-form:

$$dH_3 = G_4.$$

Footnote b: In all of this paragraph, what we mean by the C-field and its corresponding curvature $G_4$ is their topological part, i.e. the shift of $G_4$ in the formula $[G_4] - \frac{1}{4}p_1 = a \in H^4(Y^{11}, \mathbb{Z})$ [74], and not the fields themselves, that is, the “$p_1$ part” and not the “$G_4$ part” of a.

Moreover, $G_4$ has to be the curvature
of the virtual difference implicit in (6). The Green–Schwarz anomaly cancellation condition can hence be read, equivalently, as saying that

• the supergravity C-field trivializes over the 10-dimensional target of the heterotic string;
• $G_4$ is the magnetic 5-brane charge which the electric heterotic string couples to;
• the Kalb–Ramond field is twisted by the supergravity C-field.

This anomaly cancellation has an interesting description in terms of a String structure [49, 22]. This can also be interpreted as a Spin structure on the (free) loop space of the 10-dimensional spacetime [72, 64, 66]. This free loop space can be viewed as the configuration space of the string. From a topological point of view, the String structure is equivalent to lifting the structure group of the tangent bundle of spacetime from Spin(10) to its three-connected cover String(10), obtained by killing the first three homotopy groups. The latter is infinite-dimensional but can be captured by some finite-dimensional constructions [66, 7].

There is no particular reason to prefer “electric charge” over “magnetic charge”: in the presence of a Riemannian structure, the Hodge star dual of an “electric” field strength $H_{n+1}$ may be interpreted as a field strength itself, in which case it is called the “magnetic field strength” $H_{d-n-1} := \star H_{n+1}$. Just as the original field strength $H_{n+1}$ coupled to an “electric” n-particle, the dual field strength couples to a “magnetic” (d − n − 2)-particle. Such electric-magnetic duality is at the heart of what is known as “S-duality” for super Yang–Mills theory, which has recently been argued [48] to be the heart of geometric Langlands duality. It is only for electric 1-particles in d = 4 dimensions that their magnetic dual is again a 1-particle. The magnetic dual of the 2-particle in 10 dimensions is the 6-particle. In other words: the magnetic dual of the string is the 5-brane.
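The degree counting behind these statements can be checked mechanically. The following is a minimal sketch, not part of the original text; the function names are hypothetical. It assumes the convention used above, namely that an n-particle couples to a field strength of degree n + 1 and that Hodge duality in d dimensions sends a k-form to a (d − k)-form.

```python
def field_strength_degree(n: int) -> int:
    """Degree of the field strength coupled to an n-particle (an (n-1)-brane)."""
    return n + 1


def magnetic_dual_particle(d: int, n: int) -> int:
    """Return m such that the magnetic dual of an electric n-particle
    in d dimensions is an m-particle (assumed convention, see lead-in)."""
    # Hodge star maps the (n+1)-form field strength to a (d-n-1)-form.
    dual_degree = d - field_strength_degree(n)
    # An m-particle couples to an (m+1)-form field strength.
    return dual_degree - 1


if __name__ == "__main__":
    # Electric 1-particle in d = 4: self-dual case.
    print(magnetic_dual_particle(4, 1))
    # String (2-particle) in d = 10: dual is the 6-particle, i.e. the 5-brane.
    print(magnetic_dual_particle(10, 2))
```

With these conventions the formula reduces to m = d − n − 2, which reproduces the two cases quoted in the text.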
Here our starting point is to look at the above situation with the electric string in the presence of magnetic 5-brane charge in the dual formulation, where the magnetic 5-brane couples to a 6-form field with field strength $H_7$ in the presence of electric string charge, which is then given by an 8-form $I_8$. Type I supergravity admits a formulation in terms of the potential $B_6$ corresponding to the field $H_7$, which is Hodge dual to $H_3$ in ten dimensions [18]. There is also a corresponding anomaly cancellation procedure for this dual theory which makes use of a degree seven analog of the Chapline–Manton coupling [38, 59]. This process is also mathematically rich and has been treated in [35] from a K-theoretic point of view and, in fact, in a duality-symmetric fashion, i.e. including both fields on equal footing and at the same time. The magnetic dual discussion of the Green–Schwarz mechanism [38, 59] leads us to consideration of a twisted 6-bundle with field strength $H_7 = \star H_3$, which is twisted by a certain 7-bundle whose field strength eight-form is a sum of two higher characteristic classes plus some mixed terms (see Eq. (12)). This is the formula which we shall refer to as the dual Green–Schwarz anomaly cancellation condition and take as the starting point of our discussion.
Table 3. The Green–Schwarz formula and its dual version, with its interpretation in terms of electric strings and their magnetic 5-brane duals. The electric current $I_4$ and the magnetic current $I_8$ are both fixed such that the anomaly they produce cancels the anomaly from the fermions in the theory (see Sec. 3.1).

General case:
  $dF_{n+1} = j_B$, where $F_{n+1}$ is the electric field strength coupled to the fundamental electric (n − 1)-brane and $j_B$ is the magnetic (d − n − 3)-brane current.
  $d\star F_{n+1} = j_E$, where $\star F_{n+1}$ is the magnetic field strength coupled to the fundamental magnetic (d − n − 3)-brane and $j_E$ is the electric (n − 1)-brane current.

Green–Schwarz case:
  $dH_3 = \frac{1}{2}p_1(\omega) - \mathrm{ch}_2(A)$, where $H_3$ is the electric KR field coupled to the fundamental string and the right-hand side, denoted $I_4$, is the magnetic 5-brane current.
  $dH_7 = \frac{1}{48}p_2(\omega) - \mathrm{ch}_4(A) + \text{decomposables}$, where $H_7$ is the magnetic dual KR field coupled to the fundamental 5-brane and the right-hand side, denoted $I_8$, is the electric string current.
3. The Dual Green–Schwarz Mechanism and Higher Chern–Simons Forms

Here we recall the string theoretic results which indicate which characteristic classes of a manifold X have to vanish in order for the manifold to qualify as the target for the propagation of a 5-brane. For the main mathematical point that we make in Sec. 4, it is sufficient to note here that there is motivation, from formal high energy physics, for studying the condition (12), below, on characteristic classes of complex vector bundles over X. The reader not further concerned with string theoretic reasoning might just want to note this equation and the observations following it and then proceed to Sec. 4.

3.1. Anomaly cancellation in string theory

There are several somewhat different phenomena which are called anomalies in physics, but they usually all refer to issues of global topological twists. Physicists are used to developing their concepts in terms of local data and many times implicitly assume that this is sufficient. Generically, the anomaly in our context refers to an inconsistency in the topological assumptions taken for the underlying space or for bundles on that space. For instance, if one accepts that spinors are sections of spin bundles, then it is obvious that their existence requires the underlying manifold to have a Spin structure. But one way to discover this from the point of view of physics is, as nicely described in [73], to start with a naive action functional for a spinning particle and then to discover that it is ill defined globally unless the target has a Spin structure. Entirely analogous considerations lead to String structures as the “anomaly cancellation conditions” for superstrings, known as the Green–Schwarz anomaly cancellation mechanism.
From the target space perspective, these kinds of anomalies manifest themselves in the fact that the action functional of the theory, supposed to be a function on configuration space, happens to be, in fact, a section of a line bundle. There are (at least) two reasons why this may happen:

• the path integral over the fermionic fields is to be interpreted not as a function over the configuration space of the remaining bosonic fields, but as a section in a Pfaffian line bundle over that space (reviewed in [36]);
• the standard action functional for higher abelian gauge fields in the presence of electric and magnetic charges is also in general just a section of a line bundle over configuration space (discussed in [35]).

If the tensor product of these two line bundles, namely of the Pfaffian and Charge line bundles, is a nontrivial bundle with nontrivial connection then the action is anomalous. The Green–Schwarz anomaly cancellation mechanism is to introduce electric string and magnetic 5-brane charges in precisely such a way that the line bundle on configuration space thus introduced cancels the nontriviality of the given Pfaffian line bundle due to the fermions in the theory.

Table 4. Anomalies arising from the fact that the bosonic action functional is, a priori, not a function on the bosonic configuration space of fields $\mathrm{conf}_{\mathrm{bos}}$, but a section of a line bundle over that space.

  Anomaly line bundle: The complex anomaly line bundle with connection over the space $\mathrm{conf}_{\mathrm{bos}}$ of bosonic fields is the tensor product of a Pfaffian line bundle Pfaff from the fermionic path integral and another line bundle, Charge, due to the presence of electric and magnetic charges. [Diagram: $\mathrm{Pfaff} \otimes \mathrm{Charge} \to \mathrm{conf}_{\mathrm{bos}}$.]
  Action functional: The action functional $e^{-S}$ is supposed to be a complex function on $\mathrm{conf}_{\mathrm{bos}}$, but is in general, in fact, a section of the anomaly line bundle. [Diagram: $e^{-S}$ as a section of $\mathrm{Pfaff} \otimes \mathrm{Charge} \to \mathrm{conf}_{\mathrm{bos}}$.]
  Anomaly cancellation: In order for the starting point of quantization to be well defined one needs anomaly cancellation: the anomaly line bundle needs to be trivializable and one needs a choice of trivialization that identifies it with the trivial line bundle with trivial connection. The obstruction to this trivialization is the anomaly. [Diagram: $\mathrm{Pfaff} \otimes \mathrm{Charge} \xrightarrow{\;\simeq\;} (\mathrm{conf}_{\mathrm{bos}} \times \mathbb{C}, \nabla = 0)$.]
  Local and global anomaly: The curvature of the connection on $\mathrm{Pfaff} \otimes \mathrm{Charge}$ is called the local anomaly, its holonomy the global anomaly. Local anomaly: $\mathrm{curv}(\mathrm{Pfaff} \otimes \mathrm{Charge})$. Global anomaly: $\mathrm{Hol}(\mathrm{Pfaff} \otimes \mathrm{Charge})$.

In the following, we describe the anomaly cancellation condition known as the dual Green–Schwarz mechanism [38, 59, 35] and related to super 5-branes [51]. It can
be obtained from the worldvolume perspective of the super 5-brane, again as a generalization of how the spin condition for the target of a spinning particle is found. Alternatively, it can be found from the condition that the index of the total Dirac operator (on the fermionic fields called “dilatino”, “gravitino” and “gaugino”) of the effective target space field theory of the heterotic string vanishes.

In string theory, the need for String structures was originally found in terms of anomaly cancellations, either from the target space perspective or from the worldsheet perspective. The effective field theory of the heterotic string on a spin target space X involves the (pseudo-)Riemannian metric structure on X with Levi–Civita SO(10) connection ω (needed for gravity) and a gauge bundle E → X with connection A. The spinorial field content of the heterotic background theory consists of sections of three different bundles: the spin bundle S associated to the principal spin bundle over the Spin manifold X, as well as its tensor products with TX and with the gauge bundle E. Sections of S are states of the dilatino field, those of S ⊗ TX correspond to the gravitino field, and those of S ⊗ E correspond to the gaugino field. There is a Dirac operator associated with each of these three fields, denoted $D$, $D_{TX}$ and $D_E$, respectively. The anomaly cancellation condition, which ensures that the action functional for these fields is a well defined function on configuration space, is that a particular linear combination of the indices of these Dirac operators vanishes. In the notation of Table 5, the anomaly cancellation formula involves the degree twelve form part of the identity [3, 40, 41]

$$I := I_{3/2}(R) - I_{1/2}(R) + I_{1/2}(F, R) = 0. \tag{7}$$
Here R and F are the curvature 2-forms of the tangent bundle TX and of the gauge bundle E, respectively. The second of the three terms in the middle is $\mathrm{Index}(D) = \hat{A}$, the index of the uncoupled Dirac operator given in terms of the $\hat{A}$-genus via the index theorem. The first term is $\mathrm{Index}(D_{TM})$, the index of the Dirac operator coupled to the tangent vector bundle, i.e. S ⊗ TM. The third term, $\mathrm{Index}(D_E)$, is the index of the Dirac operator coupled to the gauge vector bundle, whose curvature is F, i.e. the vector bundle is Spin(M) ⊗ E. This is equal to $\mathrm{ch}(E) \wedge \hat{A}$ by the index theorem.

Table 5. Anomaly contributions from the three different fermionic fields (sections of spin bundles) of the target space theory. L(R) is the Hirzebruch L-polynomial. The Is are the anomaly polynomials given by the (topological) indices indicated in the table.

Spin bundle | Name of field | Symbol for Dirac operator | Contribution to anomaly
S | dilatino | $D$ | $I_{1/2}(R) := \mathrm{Index}(D) = \hat{A}$
S ⊗ TX | gravitino | $D_{TX}$ | $I_{3/2}(R) := \mathrm{Index}(D_{TX}) = \frac{1}{8}L(R) + I_{1/2}(R)$
S ⊗ E | gaugino | $D_E$ | $I_{1/2}(R, F) := \mathrm{Index}(D_E) = \mathrm{ch}(E) \wedge \hat{A}$

This fermionic anomaly corresponds to a line bundle with connection on configuration space whose curvature 2-form is the integral of a certain 12-form (see [63] for more detail) over target space X:

$$\mathrm{curv}(\mathrm{Pfaff}) = -\int_X I_4 \wedge I_8. \tag{8}$$
A similar integral encodes the curvature of the anomaly line bundle due to electric string current $j_E$ (a 4-form) and magnetic current $j_B$ (an 8-form):

$$\mathrm{curv}(\mathrm{Charge}) = \int_X j_E \wedge j_B. \tag{9}$$
Anomaly cancellation demands that we identify $I_4$ (right-hand side of Eq. (10) below) and $I_8$ (right-hand side of Eq. (12) below), respectively, as the magnetic currents for the field strengths $H_3$ and $H_7$ that appear in the direct and the dual Green–Schwarz anomaly cancellation conditions.

3.2. Anomalies in terms of $H_3$ and $H_7$

Here we summarize the two pictures we have emphasized.

The standard Green–Schwarz mechanism via $H_3$. The H-fields appear in two theories of interest to us: type II and heterotic theories on 10-dimensional X, respectively. In type II, the expression actually involves the Ramond–Ramond (RR) fields, but we are setting those to zero. The direct Green–Schwarz formalism [40] on a 10-dimensional manifold X leads to the appearance of the 3-dimensional Chern–Simons term via the Chapline–Manton coupling [19], which makes $H_3$ no longer closed. Mathematically, this means we assume in that case that in the heterotic theory there is the usual H-field, a priori an $\mathbb{R}$-valued differential form (as presented by supergravity), which gets modified from being closed to

$$dH_3 = \mathrm{ch}_2(A) - \frac{1}{2}p_1(\omega), \tag{10}$$

where A is the gauge connection for an $E_8 \times E_8$ or $\mathrm{Spin}(32)/\mathbb{Z}_2$ vector bundle E and ω is the metric connection, so that $p_1(\omega)$ is the first Pontrjagin form of the tangent bundle TX. Recall [49] that the right-hand side of Eq. (10) is the (image in real cohomology of the) obstruction to having a String structure on the virtual difference bundle TM − E, and that (10) therefore says that this obstruction class has to vanish. Note the contrast of (10) in heterotic string theory to the case of type II string theory on X, where $dH_3 = 0$.

The dual Green–Schwarz mechanism via $H_7$. The H-field $H_3$ can be viewed as the dual of a field $H_7$ where, rationally, this is just Hodge duality, $H_7 := \star H_3$. The “Bianchi identity”, i.e. the formula for $dH_7$, depends on the specific string theory, i.e. heterotic versus type II, as was the case for $H_3$.
The reason we say “rationally” is because these fields, like other fields in string theory, can give integral and even torsion elements in cohomology in the quantum theory. In such a case, appropriate notions of Hodge duality will be needed to clarify the relationship between H3 and H7 . We address this in detail in a separate paper.
Let us now also consider type II string theory on X in the absence of any Ramond–Ramond fields. This theory also has a degree seven dual, $H_7$, of the H-field $H_3$. While $H_3$ in this case is closed, $H_7$ is not. Instead, from the dimensional reduction from M-theory, $H_7$ satisfies (cf. [30, 37, 52])

$$dH_7 = \frac{p_2(\omega) - \left(\frac{1}{2}p_1(\omega)\right)^2}{48}, \tag{11}$$

where $p_i$ are the Pontrjagin classes of the tangent bundle TX. Observe that this only involves the topology of spacetime without any gauge bundles. On the other hand, in the heterotic case, where we have a principal bundle with connection, the corresponding Bianchi identity gets modified by the Chern characters of E. The expression is (see [35])

$$dH_7 = \mathrm{ch}_4(A) - \frac{1}{48}p_1(\omega)\,\mathrm{ch}_2(A) + \frac{1}{64}p_1(\omega)^2 - \frac{1}{48}p_2(\omega). \tag{12}$$
Here ω denotes the Levi–Civita connection on the spin-lift of the tangent bundle and A is the connection on the gauge vector bundle E. We interpret this “dual Green–Schwarz mechanism” as saying that $H_7$ trivializes the obstruction to having a fivebrane structure on the pair (TX, E).

Observations: It is useful to see how (12) simplifies in various special cases:

(1) If we have a String structure on TX − E coming from String structures on both bundles separately, in that $p_1(TX) = 0 = \mathrm{ch}_2(E)$, then, at the level of cohomology, Eq. (12) is replaced by

$$\mathrm{ch}_4(E) - \frac{1}{48}p_2(TX) = 0. \tag{13}$$

This holds in general when X is 4-connected, in which case the cohomology of X is only in degrees 0, 5 and 10.

(2) If in addition to $\mathrm{ch}_2(E) = 0$ we require that $c_1^2$ and $c_2$ be equal to zero, then in this case, using

$$\mathrm{ch}_4 = \frac{1}{24}\left(c_1^4 - 4c_1^2 c_2 + 4c_1 c_3 + 2c_2^2 - 4c_4\right), \tag{14}$$

we have that $\mathrm{ch}_4(E) = -\frac{1}{6}c_4(E)$, so that

$$-\frac{1}{6}c_4(E) - \frac{1}{48}p_2(TX) = 0. \tag{15}$$
(3) We can write a formula in terms of either the Chern classes or the Pontrjagin classes for both factors in Eq. (15), thus giving a specialization of the general formula. For this we can consider the complexification of the tangent bundle to the 10-dimensional spacetime. We use

$$p_j(TX) = (-1)^j c_{2j}(TX_{\mathbb{C}}) \tag{16}$$

to write the differential form representative of Eq. (15) as

$$dH_7 = \frac{-2\pi}{6}\left\{c_4(A) + \frac{1}{8}c_4(\omega)\right\}, \tag{17}$$
where now it is understood that $c_4(\omega)$ is the fourth Chern class of the complexified tangent bundle with corresponding connection ω, for which, with an abuse of notation, we use the same symbol as for the real connection.

It will follow from our results in Section 4 that Fivebrane structures for a pair (TM, E) with connections (ω, A) require $\mathrm{ch}_4(A) - \frac{1}{48}p_2(\omega)$ to vanish on top of the String structure.

3.3. The Chern–Simons forms

For a principal G-bundle p : P → X, a characteristic class $K_j(F)$, expressed as a polynomial in the $\mathfrak{g}$-valued curvature F of polynomial degree j, is closed and, when pulled back to the total space, is exact:

$$K_j(F) = dQ_{2j-1}(A, F), \tag{18}$$
where $Q_{2j-1}(A, F) \in \mathfrak{g} \otimes \Lambda^{2j-1}(M)$ is a “Chern–Simons form” for $K_j(F)$. This applies to the Chern character as well as the Pontrjagin classes. Thus, we can use this to solve Eq. (12) on the total space. We denote a specific choice (e.g. by the specific homotopy in the remark below) as $CS_7(A, F)$. Using the expressions

$$\mathrm{ch}_4(F) = dCS_7(A, F), \tag{19}$$

$$\mathrm{ch}_4(R) = dCS_7(\omega, R), \tag{20}$$

Eq. (17) becomes

$$dH_7 = 2\pi\left\{dCS_7(A, F) + \frac{1}{8}dCS_7(\omega, R)\right\}. \tag{21}$$

This means that $H_7$ can be taken to be of the form

$$H_7 = 2\pi\left\{CS_7(A, F) + \frac{1}{8}CS_7(\omega, R)\right\}. \tag{22}$$
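As a lower-degree illustration of how a Chern–Simons form of this kind is produced, the following worked computation treats the case $j = 2$. It is a sketch that is not part of the original text. It assumes the normalization $\mathrm{ch}_j(F) = \frac{1}{j!}\left(\frac{i}{2\pi}\right)^j \mathrm{tr}\,F^j$ and the straight-line path $A_t = tA$ from the zero connection to $A$, whose curvature is $F_t = tF + (t^2 - t)A^2$ with $F = dA + A^2$.

```latex
\begin{align*}
CS_3(A,F) &= \left(\frac{i}{2\pi}\right)^{2} \int_0^1 dt\; \mathrm{tr}\!\left(A\,F_t\right)
  = -\frac{1}{4\pi^2}\,\mathrm{tr}\!\left(\tfrac{1}{2}\,A F - \tfrac{1}{6}\,A^3\right) \\
 &= -\frac{1}{8\pi^2}\,\mathrm{tr}\!\left(A\,dA + \tfrac{2}{3}\,A^3\right), \\[4pt]
d\,CS_3(A,F) &= -\frac{1}{8\pi^2}\,\mathrm{tr}\!\left(F^2\right)
  = \frac{1}{2!}\left(\frac{i}{2\pi}\right)^{2}\mathrm{tr}\!\left(F^2\right)
  = \mathrm{ch}_2(F).
\end{align*}
```

The first line uses $\int_0^1 t\,dt = \frac{1}{2}$ and $\int_0^1 (t^2 - t)\,dt = -\frac{1}{6}$; the last line uses $\mathrm{tr}(A^4) = 0$. Under the same assumptions, the degree seven form $CS_7$ is obtained by the analogous integral with $j = 4$.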
We view this setting of $H_7$ equal to the degree seven Chern–Simons forms as the degree seven analog of the Chapline–Manton coupling.

Remarks.

(1) A specific formula for the Chern–Simons form corresponding to the Chern character can be obtained by using the homotopy formula [20] (or in the physics literature [76, 2])

$$CS_{2j-1}(A, F) = \frac{1}{(j-1)!}\left(\frac{i}{2\pi}\right)^j \int_0^1 dt\; \mathrm{Str}(A, F_t^{j-1}), \tag{23}$$
where Str is the symmetrized trace and $A_t$ is a connection that interpolates between connections $A_0$ and $A_1$,

$$A_t = A_0 + t(A_1 - A_0), \tag{24}$$

with corresponding curvature

$$F_t = dA_t + A_t^2 = t\,dA + t^2A^2 = tF + (t^2 - t)A^2. \tag{25}$$
For j = 4,

$$CS_7(A, F) = \frac{1}{6(2\pi)^4}\int_0^1 dt\; \mathrm{Str}(A, F_t^3), \tag{26}$$
where $F_t$ is the curvature of the connection $A_t = tA$ that interpolates between the zero connection at t = 0 and the connection A at t = 1. An analogous formula holds for $CS_7(\omega, R)$.

(2) Unlike the Chern character, the Chern–Simons form is not gauge-invariant. Under a transformation $A \to A^g = g^{-1}(A + d)g$,

$$CS_7(A^g, F^g) - CS_7(A, F) = -\frac{3!}{7!}\left(\frac{i}{2\pi}\right)^4 \mathrm{tr}\left[(g^{-1}dg)^7\right] + d\beta_6, \tag{27}$$

where $\beta_6$ is a six-form which can be chosen by applying the chosen homotopy operator on the gauge-transformed Chern–Simons form.

(3) Alternatively, we see that Eq. (15) is obtained from Eq. (12) by setting all decomposable characteristic forms (all nontrivial wedge products of two characteristic forms) to 0, which is the same as saying that all characteristic classes suspending to 0 are set to 0. Recall that a characteristic form $P(F_A)$ on a G-bundle p : P → X is said to suspend to the form $\mu(i^*A)$ on $G \xrightarrow{\;i\;} P$ if there is a form $CS_P(A, F_A)$ on P such that

$$dCS_P(A, F_A) = p^*P(F_A) \tag{28}$$

and

$$i^*CS_P(A, F_A) = \mu(i^*A). \tag{29}$$

Put more simply, recall that a form $\omega \in H^*(X)$ on the base of a bundle p : E → X is said to suspend to the form µ on $F \xrightarrow{\;i\;} E$ if there is a form ν on E such
that $d\nu = p^*\omega$ and $i^*\nu = \mu$. A decomposable characteristic form $P(F_A) \wedge P(F_A)$ necessarily suspends to a 0-form, since

$$p^*(P(F_A) \wedge P(F_A)) = dCS_{P \wedge P}(A, F_A) := d(CS_P(A, F_A) \wedge P(F_A)) \tag{30}$$

and since

$$i^*(CS_P(A, F_A) \wedge P(F_A)) = 0, \tag{31}$$
because $i^*F_A = 0$, and hence $i^*P(F_A) = 0$.

4. Fivebrane Structures

We recall how String structures are lifts of spin bundles through the 3-connected cover of Spin(n) and how this lift is obstructed by the fractional first Pontrjagin class called $\frac{1}{2}p_1$. Then we define Fivebrane structures as lifts of the resulting string bundles through the 7-connected cover of Spin(n). We demonstrate that this lift is obstructed by a degree eight characteristic class of string bundles, which is the fractional second Pontrjagin class, $\frac{1}{6}p_2$, of the natural string bundle of the underlying manifold.

4.1. The homotopy groups of SO(n)

The homotopy groups of the orthogonal group O(n), for n sufficiently large, are

$$\pi_k(O(n)) = \begin{cases} \mathbb{Z}_2 & \text{for } k = 0, 1 \bmod 8 \\ \mathbb{Z} & \text{for } k = 3, 7 \bmod 8 \\ 0 & \text{otherwise.} \end{cases} \tag{32}$$

See also the first two rows of Fig. 1. The condition on n is best understood by considering the stable orthogonal group O, also known as the infinite orthogonal group, which is defined as the direct limit of the sequence of inclusions

$$O(1) \subset O(2) \subset \cdots \subset O = \bigcup_{k=0}^{\infty} O(k). \tag{33}$$
That the homotopy groups of O(n) stabilize, i.e. that for n > k + 1 one has $\pi_k(O) = \pi_k(O(n))$, follows from the fact that the inclusion O(n) → O(n + 1) is (n − 1)-connected (see below for notation), as we have the fiber bundle

$$O(n) \to O(n+1) \to S^n, \tag{34}$$

i.e. $S^n$ is the homogeneous space O(n + 1)/O(n). Below the stable range, i.e. for n ≤ k + 1, the description of the homotopy groups of SO(n) becomes incomplete because one is essentially looking at the homotopy groups of spheres, which are not completely known. For example, the homotopy groups of SO(3) are the same as the homotopy groups of the sphere $S^3$, which are known only in specific degrees (but nonetheless in a range sufficient for most
practical purposes). But in the applications of all these considerations to string theory which we have in mind, the base manifold X from which one wishes to consider homotopy classes of maps into $B(SO(n)\langle k\rangle)$ is (n = 10)-dimensional. Therefore in this application Fig. 1 is fully applicable.

One obtains a sequence of topological groups from O(n) by successively passing to its k-connected covers. This process is known as the “killing of homotopy groups”. See the third row of Fig. 1. An explanation of the topological group structure is given in Sec. 4.5 below.

The description of the unitary group is analogous but much simpler due to the fact that Bott periodicity in this case has period two instead of period eight. In the stable range, i.e. for i < 2n, the inclusion of U(n) into U(n + 1) induces an isomorphism between the homotopy groups $\pi_i(U(n))$ and $\pi_i(U(n+1))$. The infinite unitary group U is defined in the same way as the infinite orthogonal group above. The Bott periodicity theorem implies that the homotopy groups of U are particularly simple: $\pi_i(U)$ is trivial if i is even and isomorphic to $\mathbb{Z}$ if i is odd.

The standard notation for the k-connected cover of a space X, for which all the homotopy groups vanish up to and including $\pi_k$, is $X\langle k+1\rangle$. Thus the groups $U\langle 2k\rangle$ and $O\langle 2k\rangle$ denote the connected covers of the unitary and orthogonal group, respectively, having the first potentially nontrivial homotopy group in dimension 2k. For example, $O\langle 3\rangle$ refers to the orthogonal group having first homotopy group in dimension three, which means that the first three homotopy groups are killed (starting with $\pi_0$) and therefore is 2-connected.

The results of killing the first two homotopy groups of O(n) are very familiar: these are the groups SO(n) and Spin(n), respectively. The group SO(n) is the connected component of the group O(n) and Spin(n) is the simply-connected cover of SO(n).
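The periodicity statements above lend themselves to a small lookup. The following is a minimal sketch, not part of the original text, with hypothetical function names. It encodes Eq. (32) for the stable orthogonal group, the period-two pattern for the stable unitary group, and the stable-range conditions n > k + 1 and i < 2n quoted above.

```python
def pi_stable_O(k: int) -> str:
    """k-th homotopy group of the stable orthogonal group O, following Eq. (32)."""
    r = k % 8
    if r in (0, 1):
        return "Z2"
    if r in (3, 7):
        return "Z"
    return "0"


def pi_stable_U(i: int) -> str:
    """i-th homotopy group of the stable unitary group U (Bott period two)."""
    return "Z" if i % 2 == 1 else "0"


def in_stable_range_O(n: int, k: int) -> bool:
    """True if pi_k(O(n)) agrees with pi_k(O), i.e. n > k + 1."""
    return n > k + 1


def in_stable_range_U(n: int, i: int) -> bool:
    """True if pi_i(U(n)) agrees with pi_i(U), i.e. i < 2n."""
    return i < 2 * n


if __name__ == "__main__":
    # Degrees of the nontrivial homotopy groups of O in one Bott period.
    nontrivial = [k for k in range(8) if pi_stable_O(k) != "0"]
    print(nontrivial)
    # For a 10-dimensional target, pi_7 of O(10) is already stable.
    print(in_stable_range_O(10, 7))
```

The nontrivial degrees in the first period are the ones successively killed when passing from O(n) to its higher connected covers.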
Less familiar but by now well known is the result of killing the next nontrivial homotopy group beyond that, $\pi_3$: this is the string group String(n). It is noteworthy that O(n), SO(n) and Spin(n) are not just topological groups, but of course carry the structure of a Lie group. On the other hand, there is no known model for String(n) as a group with a manifold structure. Notice that since all compact Lie groups G have $\pi_3(G) = \mathbb{Z}$ or several copies of $\mathbb{Z}$, String(n) has no chance of being a compact Lie group. The known models for String(n) are “huge” topological spaces. Notice however that [7] gives a Lie model for String(n) when regarded not as a mere group, but as a Lie 2-group (a 2-group whose space of objects and of morphisms is a Fréchet manifold).

Caveat: Notation for higher connected covers. A comment on the notation used for the connected covers of the corresponding classifying spaces is in order. In another approach, the connected covers of the groups are defined not directly but via the corresponding classifying space

$$G\langle n\rangle := \Omega BG\langle n\rangle, \tag{35}$$
where Ω denotes the based loop space. However, for our purposes we are also interested in the classifying space itself. To avoid confusion (hopefully), we will use the notation

$$B(G\langle n\rangle) = (BG)\langle n+1\rangle. \tag{36}$$

Note the shift in n compared to Eq. (35). For example, $(BO)\langle 4\rangle = B(O\langle 3\rangle)$ refers to the classifying space having first homotopy group in dimension four, which means that the first four homotopy groups are killed, and therefore is 3-connected.
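The index shift in Eq. (36) can be made concrete with a short sketch. It is not part of the original text and its function names are hypothetical. It assumes the stable homotopy groups of Eq. (32) together with the standard identification of the k-th homotopy group of BO with the (k − 1)-th homotopy group of O for k ≥ 1.

```python
def pi_stable_O(k: int) -> str:
    """k-th homotopy group of the stable orthogonal group O, following Eq. (32)."""
    r = k % 8
    if r in (0, 1):
        return "Z2"
    if r in (3, 7):
        return "Z"
    return "0"


def pi_BO(k: int) -> str:
    """k-th homotopy group of BO, assuming pi_k(BO) = pi_{k-1}(O) for k >= 1."""
    return pi_stable_O(k - 1) if k >= 1 else "0"


def classifying_space_index(n: int) -> int:
    """Index m with B(G<n>) = (BG)<m>, following Eq. (36)."""
    return n + 1


def first_nontrivial_degree_BO_cover(m: int) -> int:
    """Lowest degree k >= m in which (BO)<m> has a nontrivial homotopy group."""
    k = max(m, 1)
    while pi_BO(k) == "0":
        k += 1
    return k


if __name__ == "__main__":
    # B(O<3>) = (BO)<4>, whose first homotopy group sits in dimension four.
    m = classifying_space_index(3)
    print(m, first_nontrivial_degree_BO_cover(m), pi_BO(m))
```

In this bookkeeping the covers with indices 1, 2, 4 and 8 are exactly those whose index coincides with the degree of their first nontrivial homotopy group.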
4.2. String structures revisited

Recall how the group String(n) can be constructed from Spin(n) via the sequence

$$1 \to K(\mathbb{Z}, 2) \to \mathrm{String}(n) \to \mathrm{Spin}(n) \to 1, \tag{37}$$
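where Spin(n) is the simply connected cover of SO(n) and, in fact, the first nonzero homotopy group of the Lie group Spin(n) is $\pi_3 = \mathbb{Z}$. It follows from the Hurewicz theorem that the first nonzero cohomology group occurs at the same degree, i.e. $H^3(\mathrm{Spin}(n), \mathbb{Z}) = \mathbb{Z}$. This singles out the (homotopy class of a) map

$$f : \mathrm{Spin}(n) \to K(\mathbb{Z}, 3) \simeq BK(\mathbb{Z}, 2) \tag{38}$$

that generates $H^3(\mathrm{Spin}(n), \mathbb{Z})$ under the identification $H^3(X, \mathbb{Z}) \simeq [X, K(\mathbb{Z}, 3)]$, the set of homotopy classes of maps $X \to K(\mathbb{Z}, 3)$. This map classifies a principal $K(\mathbb{Z}, 2)$-bundle over Spin(n),

$$\begin{array}{ccc}
\mathrm{String}(n) := f^*EK(\mathbb{Z}, 2) & \longrightarrow & EK(\mathbb{Z}, 2) \\
\downarrow & & \downarrow \\
\mathrm{Spin}(n) & \xrightarrow{\;f\;} & K(\mathbb{Z}, 3) \simeq BK(\mathbb{Z}, 2)
\end{array} \tag{39}$$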
and this bundle is the extension (37). Applying the classifying functor on (37) leads to a weakly exact (homotopy exact) sequence

$$K(\mathbb{Z}, 3) \to B\mathrm{String}(n) \to B\mathrm{Spin}(n). \tag{40}$$

The corresponding complex analog is

$$K(\mathbb{Z}, 3) \to (BU)\langle 6\rangle \to BSU \tag{41}$$
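and on into $K(\mathbb{Z}, 4)$ obtained by mapping from BSU by the Chern class $c_2$. The structures are summarized in the following diagram, in which the vertical maps are the connected-cover projections:

$$\begin{array}{ccccc}
(BU)\langle 6\rangle & \longrightarrow & B\mathrm{String} = (BO)\langle 8\rangle & & \\
\downarrow & & \downarrow & & \\
BSU = (BU)\langle 4\rangle & \longrightarrow & B\mathrm{Spin} = (BO)\langle 4\rangle & \xrightarrow{\;\frac{1}{2}p_1\;} & K(\mathbb{Z}, 4) \\
\downarrow & & \downarrow & & \\
BU & \longrightarrow & BSO = (BO)\langle 2\rangle & \xrightarrow{\;w_2\;} & K(\mathbb{Z}_2, 2) \\
 & & \downarrow & & \\
 & & BO = (BO)\langle 1\rangle & \xrightarrow{\;w_1\;} & K(\mathbb{Z}_2, 1)
\end{array} \tag{42}$$

together with a diagonal map $BU \to K(\mathbb{Z}, 2)$ followed by reduction mod 2, $K(\mathbb{Z}, 2) \to K(\mathbb{Z}_2, 2)$.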
Remarks.

(1) $(BU)\langle 6\rangle$ and $(BO)\langle 8\rangle$ are the homotopy fibers of the respective maps to $K(\mathbb{Z}, 4)$.

(2) The map from BSU to BSpin is an isomorphism on fourth cohomology $H^4$. This comes from the isomorphism

$$\mathrm{Spin}(3) \cong SU(2) \cong S^3. \tag{43}$$

(3) The composite map from BSU to $K(\mathbb{Z}, 4)$ in the second row is given by the second Chern class $c_2$.

(4) The composite map from BU to $K(\mathbb{Z}_2, 2)$ in the third row is given by $c_1$ followed by reduction mod 2. That is, we have the following commutative diagram:

$$\begin{array}{ccc}
BU & \xrightarrow{\;c_1\;} & K(\mathbb{Z}, 2) \\
\| & & \downarrow {\scriptstyle \bmod 2} \\
BU & \xrightarrow{\;w_2\;} & K(\mathbb{Z}_2, 2).
\end{array} \tag{44}$$
4.3. Cohomology of the connected covers

4.3.1. In characteristic 0

In this section, we work in characteristic zero and we (briefly) address torsion in Sec. 4.3.2. The definition of String(n) as the extension of Spin(n) in (37) induced by the map (38) allows us to compute the cohomology of String(n). It is easier to compute first the cohomology of BString(n) from the homotopy fibration sequence

$$K(\mathbb{Z}, 3) \to B\mathrm{String}(n) \to B\mathrm{Spin}(n), \tag{45}$$
since in characteristic 0 we have (all coefficients are rational)

$$H^*(K(\mathbb{Z}, 3)) \simeq H^*(S^3) \tag{46}$$

and we can apply the long exact Gysin sequence

$$\cdots \to H^n(E) \xrightarrow{\;\pi_*\;} H^{n-k}(M) \xrightarrow{\;e\wedge\;} H^{n+1}(M) \xrightarrow{\;\pi^*\;} H^{n+1}(E) \to \cdots \tag{47}$$
where $e\wedge$ is the wedge product of a differential form with the Euler class e. The result is

$$H^*(B\mathrm{String}) \simeq P[p_2, p_3, \ldots], \tag{48}$$

from which it readily follows that $H^*(\mathrm{String})$ is an exterior algebra on generators $x_i$ of degree 4i − 1 starting with i = 2. A very close analog of the above can be carried over starting with BU. The result is

$$H^*((BU)\langle 2k\rangle) \simeq P[c_k, c_{k+1}, \ldots]/(c_k) = P[c_{k+1}, \ldots]. \tag{49}$$
Except for the change in indexing, the proof is the same as for $B\mathrm{String} = B\mathrm{Spin}\langle 4\rangle$.

Remarks.

(1) The rational cohomology of $(BU)\langle k\rangle$ can be calculated in several ways, for example using the Gysin sequence, as we saw above, or by induction on k using the Leray–Serre spectral sequence. While the first method is more straightforward, we include the second one in Appendix B because it is likely to be needed for applications in future work where torsion is important.

(2) In the geometric description of $K(\mathbb{Z}, 2)$ as $PU(\mathcal{H})$, the projective unitary group on an infinite-dimensional Hilbert space $\mathcal{H}$, the H-field $H_3$ occurs as the canonical degree three class $x_1$ of $PU(\mathcal{H})$ bundles. In the operator algebra language, this is called the Dixmier–Douady class [25]. From a physical point of view, if X is 10-dimensional, it is natural to ask whether a similar interpretation of the dual field $H_7$ can be given. From a mathematical point of view, it is natural to ask whether a similar construction to the degree
November 17, 2009 15:39 WSPC/148-RMP
J070-00384
Fivebrane Structures
1221
three case can be carried out with the first nontrivial homotopy group $\pi_7$ of String.

4.3.2. Over the integers and the integers mod p

(1) The cohomology of all connected covers of the classifying spaces of the orthogonal and the unitary groups is also known. At p = 2, this is a result of Stong [68]. For general p, this is a result of Singer [65], in terms of generators and relations, and Giambalvo [39].
(2) $H_*((BU)\langle 6\rangle; \mathbb{Z})$ is calculated as a ring in [4]. It is torsion-free and concentrated in even degrees. From this the cohomology can be read via

$$H^* = \hom(H_*((BU)\langle 6\rangle; \mathbb{Z}), -), \qquad (50)$$

as a map from rings to sets. This is related in an interesting way to what are known as cubical structures on the additive group [4].

4.4. Fractional Pontrjagin classes

When our manifold X has extra structure, such that it admits a lift of the structure group of its tangent bundle to a higher connected cover of O(n), we can refine the Pontrjagin classes of X to fractional Pontrjagin classes. For X a d-dimensional orientable manifold, its kth Pontrjagin class $p_k$ is taken to be the kth Pontrjagin class of the tangent bundle TX, regarded as an associated SO(d)-bundle. So it is given by a map

$$X \xrightarrow{\ f\ } B\mathrm{SO}(d) \xrightarrow{\ p_k\ } K(\mathbb{Z}, 4k). \qquad (51)$$
4.4.1. Spin structures and the first Pontrjagin class

Saying that a d-dimensional manifold X is spin means we can lift the classifying map f of its tangent bundle TX to a map $\hat f : X \to B\mathrm{Spin}(d)$. Since Spin is 2-connected and $\pi_3(\mathrm{Spin}) = \mathbb{Z}$, it follows that $H^4(B\mathrm{Spin}, \mathbb{Z}) = \mathbb{Z}$. If we denote the generator of this fourth cohomology group by $\omega$, the situation looks as follows:

$$\begin{array}{ccccc}
 & & B\mathrm{Spin}(d) & \xrightarrow{\ \omega\ } & K(\mathbb{Z}, 4) \\
 & {\scriptstyle \hat f}\nearrow & \downarrow{\scriptstyle \pi} & & \\
X & \xrightarrow{\ f\ } & B\mathrm{SO}(d) & \xrightarrow{\ p_1\ } & K(\mathbb{Z}, 4).
\end{array} \qquad (52)$$

Since, by assumption, $\omega$ is a generator, it must be true that $\pi^* p_1$ is an integral multiple of $\omega$. One finds that $\omega$ can be chosen so that

$$\pi^* p_1 = 2\omega, \qquad (53)$$
H. Sati, U. Schreiber & J. Stasheff
i.e.

$$\begin{array}{ccccc}
 & & B\mathrm{Spin}(d) & \xrightarrow{\ \omega\ } & K(\mathbb{Z}, 4) \\
 & {\scriptstyle \hat f}\nearrow & \downarrow{\scriptstyle \pi} & & \downarrow{\scriptstyle \cdot 2} \\
X & \xrightarrow{\ f\ } & B\mathrm{SO}(d) & \xrightarrow{\ p_1\ } & K(\mathbb{Z}, 4).
\end{array} \qquad (54)$$

This motivates the notation

$$\omega := \tfrac{1}{2} p_1 \qquad (55)$$

for the generator of $H^4(B\mathrm{Spin}, \mathbb{Z})$. Accordingly, the pullback

$$\tfrac{1}{2} p_1(X) := \hat f^* \tfrac{1}{2} p_1 : X \xrightarrow{\ \hat f\ } B\mathrm{Spin} \xrightarrow{\ \frac{1}{2} p_1\ } K(\mathbb{Z}, 4) \qquad (56)$$

is "half the Pontrjagin class" of the Spin manifold X. Notice that for $\tfrac{1}{2} p_1(X)$ to be zero, the vanishing of $p_1(X)$ is a necessary but not a sufficient condition: $\tfrac{1}{2} p_1(X)$ might be non-vanishing but 2-torsion.

4.4.2. String structures and the second Pontrjagin class

The same kind of reasoning continues to apply as we keep killing homotopy groups of O(d). We say that X admits a String structure or that X is string if the classifying map f for TX lifts to BString(d). Now BString is 7-connected and $H^8(B\mathrm{String}, \mathbb{Z}) \cong \mathbb{Z}$. Let $\nu$ denote the corresponding generator. The pullback $\pi^* p_2$ of the second Pontrjagin class has to be an integer multiple of this generator. In the next section (see Proposition 1, Sec. 4.5), we will show that the integer multiple is 6:

$$\begin{array}{ccccc}
 & & B\mathrm{String}(d) & \xrightarrow{\ \nu\ } & K(\mathbb{Z}, 8) \\
 & {\scriptstyle \hat f}\nearrow & \downarrow{\scriptstyle \pi} & & \downarrow{\scriptstyle \cdot 6} \\
X & \xrightarrow{\ f\ } & B\mathrm{SO}(d) & \xrightarrow{\ p_2\ } & K(\mathbb{Z}, 8).
\end{array} \qquad (57)$$

Therefore we should give the generator $\nu$ the name $p_2/6$ and define

$$\tfrac{1}{6} p_2(X) := \hat f^* \tfrac{1}{6} p_2, \qquad (58)$$

the fractional second Pontrjagin class of the String manifold X. Later (see Eq. (89)) we will make use of the spin characteristic classes, which better describe spin bundles than do Pontrjagin classes.
4.5. Fivebrane structures

The point of view we take is that, in the same way that $H_3$ was part of the obstruction to lifting a spin bundle to a string bundle, we would like to interpret $H_7$ as another obstruction. Further, $H_7$ serves as a higher obstruction in the sense that it makes sense to talk about it once the "lower" obstructions vanish. The next task is to make this more precise.

The first nonzero homotopy group of the topological group String(n) is $\pi_7 \cong \mathbb{Z}$. Then, again, the Hurewicz theorem implies that the first nonzero cohomology group occurs in degree 7. As $H^7(X, \mathbb{Z}) = [X, K(\mathbb{Z}, 7)]$ and $K(\mathbb{Z}, 7) = BK(\mathbb{Z}, 6)$, it follows that $K(\mathbb{Z}, 6)$ is a fiber in a nontrivial fibration sequence with String(n) as the base. From the structure of the homotopy groups (32), the extension will be $O\langle 8\rangle$. Thus (compare also Definition 1):

Definition 2. The extension

$$1 \to K(\mathbb{Z}, 6) \to O\langle 8\rangle \to \mathrm{String} \to 1,$$

classified by the canonical map $\mathrm{String} \to K(\mathbb{Z}, 7)$, which replaces the sequence (37), we call the fivebrane-extension $\mathrm{Fivebrane} := O\langle 8\rangle$.

Remarks.
(1) The fact that String occurs as the base of the sequence in Definition 2 is compatible with the interpretation of the classes of $H_3$ and $H_7$ as corresponding to the fibrations being pulled back from $K(\mathbb{Z}, 3)$ and $K(\mathbb{Z}, 7)$ respectively.
(2) When the space is spin, the first Pontrjagin class $p_1$ is divisible by 2, and the obstruction to lifting spin to string is $\tfrac{1}{2} p_1$. When the space is string, the second Pontrjagin class $p_2$ is divisible by 6, and the obstruction to lifting string to fivebrane is $\tfrac{1}{6} p_2$.
(3) $\mathrm{Fivebrane}(n) = O(n)\langle 8\rangle$, which is a priori an H-space, can actually be equipped with the structure of a topological group. This follows from Milnor's construction [53] of a simplicial set with a group structure for the based loop space of any "reasonable" space, e.g. a CW complex. Fivebrane(n) can be realized as a simplicial group using loop spaces as follows: looping

$$B\mathrm{Fivebrane}(n) \to B\mathrm{String}(n) \to B^7 U(1)$$

once gives

$$\mathrm{Fivebrane}(n) \to \mathrm{String}(n) \to B^6 U(1).$$

Thus Fivebrane(n), being defined as $\Omega(B\mathrm{Fivebrane}(n))$, where the classifying space in brackets is defined by a homotopy pullback, is a topological group like every loop space.

Proposition 1. The obstruction to lifting a String bundle to an $O\langle 8\rangle$ bundle is given by $\tfrac{1}{6} p_2$.

Proof. The classifying spaces of the above sequences induce a map $B\mathrm{String} \to B\mathrm{Spin}$. The composite map $B\mathrm{String} \to B\mathrm{Spin} \to B\mathrm{SO}$ will map $p_2$ to a multiple of the generator of $H^8(B\mathrm{String}, \mathbb{Z})$. This multiple is obtained by noticing that the map from BString to $(BU)\langle 7\rangle$ is an isomorphism on $\pi_8$ and that the fourth Chern class $c_4$ restricts to 6 times the generator of $\pi_8$ on the complex side.
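This works as follows. The complex vector bundle on $S^{2k}$ that represents the generator of $\pi_{2k}(BU)$ is the k-fold (external) tensor product of $(1 - L)$, where 1 is the tautological (trivial) line bundle and L is the Hopf line bundle on $S^2$. All that needs to be done is to find the value of $c_k$ on this bundle. Using the multiplicative properties of the Chern character, this is just $(k - 1)!$ (see e.g. [57, Theorem 5.1]). Thus for k = 4, the answer is 6 as claimed.

It follows from connectivity that there is a commutative pullback diagram $(BO)\langle 9\rangle \to (BU)\langle 10\rangle$ sitting over $(BO)\langle 8\rangle \to (BU)\langle 8\rangle$. The map $(BU)\langle 10\rangle \to (BU)\langle 8\rangle$ is the fibration corresponding to $\tfrac{1}{6} c_4$, which restricts to $\tfrac{1}{6} p_2$ in $(BO)\langle 8\rangle$. The map $(BO)\langle 8\rangle \to (BU)\langle 8\rangle$ is an isomorphism on $\pi_8$ and lifts to $(BO)\langle 9\rangle \to (BU)\langle 10\rangle$, which is also an isomorphism on $\pi_8$. Thus we have the following diagram, where BFivebrane is as in Definition 1,

$$\begin{array}{ccccc}
B\mathrm{Fivebrane} = (BO)\langle 9\rangle & \to & (BU)\langle 10\rangle & & \\
\downarrow & & \downarrow & & \\
B\mathrm{String} = (BO)\langle 8\rangle & \to & (BU)\langle 8\rangle & \xrightarrow{\ \frac{1}{6} c_4\ } & K(\mathbb{Z}, 8) \\
\downarrow & & \downarrow & & \\
B\mathrm{Spin} = (BO)\langle 4\rangle & \to & B\mathrm{SU} = (BU)\langle 4\rangle & \xrightarrow{\ c_2\ } & K(\mathbb{Z}, 4) \\
\downarrow & & \downarrow & & \\
B\mathrm{SO} = (BO)\langle 2\rangle & \to & BU & & \\
\downarrow & & & & \\
BO = (BO)\langle 1\rangle. & & & &
\end{array} \qquad (59)$$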
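The value $(k-1)!$ used in the proof can be checked with Newton's identities. The following is a minimal sketch, not the argument of [57]: it takes as input the standard fact that the Chern character of the generator of $\pi_{2k}(BU)$ is $\pm 1$ times the generator of $H^{2k}(S^{2k}; \mathbb{Z})$, and that all lower Chern classes vanish on $S^{2k}$ for degree reasons. The function name `top_chern_number` is a hypothetical helper introduced only for this illustration.

```python
from fractions import Fraction
from math import factorial


def top_chern_number(k, ch_k=1):
    """c_k (as a multiple of the generator of H^{2k}(S^{2k})) of a bundle
    whose Chern character has top component ch_k times that generator.

    Assumption: on S^{2k} the only non-zero power sum of Chern roots is
    p_k = k! * ch_k, since ch_j = p_j / j! and H^{2j}(S^{2k}) = 0 for 0 < j < k.
    """
    p = [Fraction(0)] * (k + 1)
    p[k] = Fraction(factorial(k) * ch_k)
    e = [Fraction(1)] + [Fraction(0)] * k  # e[0] = 1, e[n] = c_n
    for n in range(1, k + 1):
        # Newton's identity: n e_n = sum_{i=1}^{n} (-1)^(i-1) e_{n-i} p_i
        e[n] = sum((-1) ** (i - 1) * e[n - i] * p[i] for i in range(1, n + 1)) / n
    return e[k]


if __name__ == "__main__":
    for k in range(1, 6):
        print(k, top_chern_number(k))
```

Up to sign, the result is $(k-1)!$; for k = 4 the absolute value is 6, in agreement with the factor in Proposition 1.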
Remarks.
(1) In order for a bundle to be lifted from string to $O\langle 8\rangle$, it has already to be spin. Thus the first Pontrjagin classes in expressions (11) and (12) are set to zero and then $dH_7$ in both cases is some linear combination of the second Pontrjagin classes, corresponding to the (lifts of the) tangent and the gauge bundles, TX and E (cf. [49]).
(2) That the first column in (59) is real and the second column is complex reflects the complexification map $KO \to KU$ on bundles, since we are interested in the divisibility properties of the Pontrjagin classes, and those are considered as the pullback of the complexification, i.e. the Chern classes.
(3) The map from BO to BU is an isomorphism on $\pi_{8k}$ and is multiplication by 2 on $\pi_{8k+4}$. In particular, the map is an isomorphism on $\pi_8$ and is multiplication by 2 on $\pi_4$ (see e.g. [58]).
(4) In the second row, we could have written $(BU)\langle 6\rangle$ in the second entry. We wrote $(BU)\langle 8\rangle$ instead so that the map from the real to the complex side, $(BO)\langle 8\rangle \to (BU)\langle 8\rangle$, is an isomorphism on $\pi_8$. In contrast, the map $B\mathrm{Spin} \to B\mathrm{SU}$ is not an isomorphism on $\pi_4$ but is in fact given by multiplication by 2 (by part (3) above); see [58], for instance.
(5) In the first row, we wrote $(BU)\langle 10\rangle$ instead of $(BU)\langle 9\rangle$ because there are no homotopy groups of odd degree in BU, i.e. killing the homotopy in degree eight automatically gets us to degree ten.
(6) In both cases, BString and BFivebrane, we are killing a $\mathbb{Z}$ in the homotopy groups.
(7) The map from $(BO)\langle 8\rangle$ to $(BU)\langle 8\rangle$ in the second row is an isomorphism in degree eight because the generator of $\pi_8$ for both spaces is $v_1^4$, where $v_1$ is the Bott generator whose degree is two.
(8) The map from BSpin to BSU in the third row is given by multiplication by two.
Real versus Complex. In diagram (42) we had the complex spaces in the first column and the real spaces in the second column. In diagram (59) we had instead the real spaces in the first column and the complex spaces in the second column. We describe this further here. Consider the composite map from BSU to itself factoring through BSpin

$$B\mathrm{SU} \xrightarrow{\ \cong \text{ in deg } 4\ } B\mathrm{Spin} \xrightarrow{\ \times 2\ } B\mathrm{SU}. \qquad (60)$$

The composition is given by multiplication by 2 in degree 4 and acts on vector bundles via

$$V \mapsto V \oplus V = 2V, \qquad (61)$$
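so that $c_2(2V) = 2c_2(V)$. Note that since we have SU bundles, $c_1(V) = 0$. Next, going from BU to BSO we have

$$\mathbb{Z} \times BU \xrightarrow[\ \times 2,\ \cong \text{ in deg } 0\ ]{\ \text{forget}\ } \mathbb{Z} \times B\mathrm{SO} \xrightarrow[\ \cong \text{ deg } 0\ ]{\ \text{complexify}\ } \mathbb{Z} \times BU \longrightarrow \mathbb{Z} \times B\mathrm{SO}. \qquad (62)$$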
The first map amounts to forgetting the complex structure and the second to complexifying. The map from the second to the fourth term is $V \mapsto 2V$, and that from the first to the third is $V \mapsto V \oplus V$.

4.6. Congruence

Remarks.
(1) At the integral level, there is a crucial difference between $p_2$ being zero and $\tfrac{1}{6} p_2$ being zero, due to the possible existence of 2- and 3-torsion. In other words, $\tfrac{1}{6} p_2 = 0$ certainly implies that $p_2$ is zero, but the converse is not necessarily true, and in fact in most interesting cases it is not true. This gives us the important conclusion that, unlike the string case, both 2- and 3-torsion are important for $(BO)\langle 9\rangle$ structures.
(2) Note that $\pi_* O$ has only 2-torsion so that in the stable range there is no 4-torsion, so there is no difference between $\tfrac{1}{2} p_1$ and $\tfrac{1}{4} p_1$ as obstructions. The latter is the shift in the quantization condition of the field strength $G_4$ in M-theory [74] (cf. footnote b).
(3) Likewise, in the fivebrane case, again assuming the stable range, there is no difference, as far as the type of torsion is concerned, between the obstructions $\tfrac{1}{6} p_2$ and $\tfrac{1}{48} p_2$. This is because they are both of the form $\tfrac{1}{2^n 3^m} p_2$, i.e. involve 2- and 3-torsion. This leads to the following conclusion: in the stable range (hence no 8-torsion) for the tangent bundle, our definition of the fivebrane structure captures the physical condition for a fivebrane structure, i.e. $\tfrac{1}{6} p_2$ is essentially the same as $\tfrac{1}{48} p_2$ in the sense that has just been explained.
(4) In both cases we can avoid subtleties of division by 2 and 3 by inverting those two primes. Thus if we invert 2 in the string case, starting with a ring R (e.g. $\mathbb{Z}$), we can work with the ring $R[\tfrac{1}{2}]$, and if we invert $6 = 2 \times 3$ in the fivebrane case, we can work with the ring $R[\tfrac{1}{6}]$.

The division by 8. We have seen that the obstruction to the fivebrane structure is given by $\tfrac{1}{6} p_2$, while the expression appearing in the anomaly is $\tfrac{1}{48} p_2$. Here we give a description of this further division by 8, which, by the above remark, does not generate any new kind of torsion (i.e. no new primes). In particular this means that, as far as anomalies are concerned, the situation in the fivebrane case is in a sense better than that in the string case, where there was a crucial difference between $p_1$ vanishing and $\tfrac{1}{2} p_1$ vanishing, namely that coming from 2-torsion. What we would like to describe is a space, which we will call F, that sits in the fibration

$$F \to (BO)\langle 9\rangle \xrightarrow{\ \times 8\ } (BO)\langle 9\rangle \qquad (63)$$

where the second map takes $\tfrac{1}{48} p_2$ to $\tfrac{1}{6} p_2$. The question is then: what is F? As a warm-up, consider a degree two class, in which case we have

$$F \to K(\mathbb{Z}, 2) \xrightarrow{\ \times n\ } K(\mathbb{Z}, 2), \qquad (64)$$

and the answer is $F = K(\mathbb{Z}_n, 1)$. We then seek a lift

$$\begin{array}{ccccc}
 & & K(\mathbb{Z}, 2) & & \\
 & {\scriptstyle ?}\nearrow & \downarrow{\scriptstyle \times n} & & \\
X & \longrightarrow & K(\mathbb{Z}, 2) & \xrightarrow{\ \alpha_n\ } & K(\mathbb{Z}_n, 2),
\end{array} \qquad (65)$$

which exists if and only if the pullback of the map $\alpha_n$ to X is zero, where $\alpha_n$ is "reduction mod n".
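The lifting criterion in the warm-up can be read off from the long exact sequence induced by the coefficient sequence $0 \to \mathbb{Z} \xrightarrow{\times n} \mathbb{Z} \to \mathbb{Z}_n \to 0$. The following is a sketch of that standard argument in the notation of (64) and (65); the symbols $y$ (the given degree two class), $x$ (a candidate lift) and $\beta$ (the Bockstein) are introduced here only for illustration.

```latex
\cdots \to H^1(X;\mathbb{Z}_n) \xrightarrow{\ \beta\ } H^2(X;\mathbb{Z})
\xrightarrow{\ \times n\ } H^2(X;\mathbb{Z})
\xrightarrow{\ \alpha_n\ } H^2(X;\mathbb{Z}_n) \to \cdots
% Exactness at the third term gives the lifting criterion:
y = n\,x \ \text{ for some } x \in H^2(X;\mathbb{Z})
\iff \alpha_n(y) = 0 \in H^2(X;\mathbb{Z}_n).
% Exactness at the second term describes the ambiguity: two choices of x
% differ by an element in the image of \beta, i.e. by a class coming from
% H^1(X;\mathbb{Z}_n) = [X, K(\mathbb{Z}_n, 1)], the fiber F of (64).
```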
For a String structure on a space Y we have the following diagram

$$\begin{array}{ccccccc}
Y & \longrightarrow & F = \mathrm{String}^F & \xrightarrow{\ x\ } & K(\mathbb{Z}, 8) & & \\
 & \searrow & \downarrow & & \downarrow{\scriptstyle \times 8} & & \\
 & & (BO)\langle 8\rangle & \xrightarrow{\ \frac{1}{6} p_2\ } & K(\mathbb{Z}, 8) & \longrightarrow & K(\mathbb{Z}_8, 8),
\end{array} \qquad (66)$$

where x is our class $\tfrac{1}{48} p_2$, which naturally lives not in $(BO)\langle 8\rangle$ but rather in the desired space F. The above diagram specifies F, which is the part within $BO\langle 8\rangle$ in which further division by 8 is allowed. By $\mathrm{String}^F$ we mean a bundle with fiber F. The modification of Proposition 1 is then

Proposition 2. The class $\tfrac{1}{48} p_2$ is the obstruction to lifting a $\mathrm{String}^F$ bundle, defined by diagram (66), to a fivebrane bundle.

We conclude this section with a caveat. The formulae that arise from the anomaly cancellation, e.g. (12), involve different coefficients for each of the second Pontrjagin classes of the two bundles. Had we had the same factor or just a minus sign, then we could have simply applied the formulae in the discussion at the end of Sec. 4.7. However, we have relative factors of 48, so how can one make sense of such factors? For instance, in the case when the string condition applies to all bundles, is $p_2(E) + \tfrac{1}{n} p_2(F)$ the second Pontrjagin class of some combination of the bundles E and F? It does not make sense to talk about $E + \tfrac{1}{n} F$, but one can talk about $nE + F$. But then we would have the issue of torsion, especially since n = 48 involves both the important 2- and 3-torsion. We can thus say that, away from such torsion, the fivebrane class of $48\,TM - E$ is equal to 48 times the fivebrane class of TM minus the fivebrane class of E. In the case when $H^8(X, \mathbb{Z})$ has 2- or 3-torsion we cannot draw such a conclusion.
4.7. Inequivalent Fivebrane structures: The fivebrane class

In this section we consider the set of inequivalent Fivebrane structures on our manifold or on a pair consisting of a manifold with a gauge bundle. In order to do so, let us first recall the situation for String structures on manifolds. The complex version of the fibration

$$K(\mathbb{Z}, 3) \to B\mathrm{String} \to B\mathrm{Spin} \qquad (67)$$

is

$$K(\mathbb{Z}, 3) \to (BU)\langle 6\rangle \to B\mathrm{SU}, \qquad (68)$$

where the space $(BU)\langle 6\rangle$ is the 5-connected cover of BU. A choice of String structure on a space X is a lift $\hat f$ of the classifying map f in the diagram

$$\begin{array}{ccccc}
 & & (BO)\langle 8\rangle & & \\
 & {\scriptstyle \hat f}\nearrow & \downarrow & & \\
X & \xrightarrow{\ f\ } & (BO)\langle 4\rangle & \xrightarrow{\ \frac{1}{2} p_1\ } & K(\mathbb{Z}, 4).
\end{array} \qquad (69)$$

A choice of $(BU)\langle 6\rangle$ structure on a space X is a lift in the diagram

$$\begin{array}{ccccc}
 & & (BU)\langle 6\rangle & & \\
 & {\scriptstyle \hat f}\nearrow & \downarrow & & \\
X & \xrightarrow{\ f\ } & B\mathrm{SU} & \xrightarrow{\ c_2\ } & K(\mathbb{Z}, 4).
\end{array} \qquad (70)$$

The appearance of the factor $\tfrac{1}{2}$ in (69) versus (70) is a reflection of the fact mentioned earlier that the map from BSpin to BSU is given by multiplication by two. The fibration sequences

$$K(\mathbb{Z}, 3) \to (BO)\langle 8\rangle \to (BO)\langle 4\rangle, \qquad K(\mathbb{Z}, 3) \to (BU)\langle 6\rangle \to B\mathrm{SU} \qquad (71)$$

show that two string or $(BU)\langle 6\rangle$ structures differ by a map to $K(\mathbb{Z}, 3)$ for a fixed spin or BSU structure, respectively. Therefore, the set of lifts, i.e. the set of string structures for a fixed Spin structure in the real case, or the set of $(BU)\langle 6\rangle$ structures for a fixed BSU structure in the complex case, is a torsor (see footnote c) for a quotient of the third integral cohomology group $H^3(X; \mathbb{Z})$. The elements of the torsor are the string classes, corresponding to the NS degree three H-field in string theory.

Footnote c. Here a torsor is a set with a group action which is free and transitive. So as a set it is the same as the group. The group acts but there is no canonical identification of the set with the group.

Now we are ready to study the set of Fivebrane structures. The complex version of the fibration

$$K(\mathbb{Z}, 7) \to B\mathrm{Fivebrane} \to B\mathrm{String}, \qquad (72)$$

obtained by applying the classifying functor on the sequence in Definition 2, is

$$K(\mathbb{Z}, 7) \to (BU)\langle 10\rangle \to (BU)\langle 8\rangle, \qquad (73)$$
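where the space $(BU)\langle 8\rangle$ is the 7-connected cover of BU. A choice of fivebrane structure on a space X is a lift in the diagram

$$\begin{array}{ccccc}
 & & (BO)\langle 9\rangle & & \\
 & {\scriptstyle \hat f}\nearrow & \downarrow & & \\
X & \xrightarrow{\ f\ } & (BO)\langle 8\rangle & \xrightarrow{\ \frac{1}{6} p_2\ } & K(\mathbb{Z}, 8).
\end{array} \qquad (74)$$

A choice of $(BU)\langle 9\rangle$ structure on a space X is a lift in the diagram

$$\begin{array}{ccccc}
 & & (BU)\langle 10\rangle & & \\
 & {\scriptstyle \hat f}\nearrow & \downarrow & & \\
X & \xrightarrow{\ f\ } & (BU)\langle 8\rangle & \xrightarrow{\ \frac{1}{6} c_4\ } & K(\mathbb{Z}, 8).
\end{array} \qquad (75)$$

The fibration sequences (72), (73) show that two fivebrane or $(BU)\langle 9\rangle$ structures differ by a map to $K(\mathbb{Z}, 7)$ for a fixed string or $(BU)\langle 7\rangle$ structure, respectively.

Remarks.
(1) A degree seven class $X \to K(\mathbb{Z}, 7)$ that corresponds to the fivebrane structure cannot be specified. Consider the diagram

$$\begin{array}{ccc}
 & & K(\mathbb{Z}, 7) \\
 & {\scriptstyle g}\nearrow & \downarrow \\
X & \xrightarrow{\ \hat f\ } & (BO)\langle 9\rangle \\
 & {\scriptstyle f}\searrow & \downarrow \\
 & & (BO)\langle 8\rangle
\end{array} \qquad (76)$$

and note that f and g do not determine $\hat f$ since $BO\langle 9\rangle \neq BO\langle 8\rangle \times K(\mathbb{Z}, 7)$. This means that there is no degree seven class that picks $\hat f$.
(2) For a given String structure, the set of compatible Fivebrane structures has a transitive action by degree seven cohomology.

We have

Proposition 3. The set of lifts, i.e. the set of Fivebrane structures for a fixed String structure in the real case, or the set of $(BU)\langle 9\rangle$ structures for a fixed $(BU)\langle 7\rangle$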
structure in the complex case, is a torsor for a quotient (to be described below) of the seventh integral cohomology group $H^7(X; \mathbb{Z})$. We call the elements of the latter the fivebrane classes, corresponding to the (dual) NS degree seven H-field in string theory.

We now explain the quotient in the proposition. Whether or not we have a free action depends on whether the map

$$[X, \Omega(BO)\langle 8\rangle] \to [X, K(\mathbb{Z}, 7)] \qquad (77)$$

coming from the diagram

$$\begin{array}{ccc}
\Omega(BO)\langle 8\rangle & \longrightarrow & K(\mathbb{Z}, 7) \\
 & {\scriptstyle g}\nearrow & \downarrow \\
X & \xrightarrow{\ \hat f\ } & (BO)\langle 9\rangle \\
 & {\scriptstyle f}\searrow & \downarrow \\
 & & BO\langle 8\rangle
\end{array} \qquad (78)$$

is the zero map. Note that (77) is just the map of homotopy classes induced by the top arrow in (78). However, the intention is to consider two liftings $\hat f_i$ for i = 1, 2 which therefore differ by a map g as in (76). Since the maps of homotopy classes give long exact sequences for any three of the target spaces, if (77) were the zero map then the next one would be onto. Failure to be onto is measured by the failure of $[X, (BO)\langle 8\rangle] \to [X, K(\mathbb{Z}, 7)]$ to be zero. The set of lifts is a torsor over the quotient

$$H^7(X, \mathbb{Z}) / [X, \Omega(BO)\langle 8\rangle]. \qquad (79)$$

(3) The fibration (72) is a map of infinite loop spaces so we can think of it in terms of a map

$$K(\mathbb{Z}, 7) \times B\mathrm{Fivebrane} \to B\mathrm{Fivebrane}, \qquad (80)$$

which is the action of the fiber $K(\mathbb{Z}, 7)$ on the total space. This realizes an action of $K(\mathbb{Z}, 7)$ on the space of inequivalent Fivebrane structures, where the latter is the seventh integral cohomology viewed as the space of maps with target $K(\mathbb{Z}, 7)$, and the map to BFivebrane is the classifying map.
(4) At the level of spectra, which are objects whose homotopy groups represent generalized cohomology theories [1], we have

$$\Sigma^7 H\mathbb{Z} \times ko\langle 9\rangle \to ko\langle 9\rangle, \qquad (81)$$

where $ko\langle 9\rangle$ is the connective version of real K-theory KO, i.e. the version with no negative degrees, of the K-theory corresponding to $BO\langle 9\rangle$.
(5) Addition of vector bundles is encoded in a product

$$B\mathrm{SO} \times B\mathrm{SO} \to B\mathrm{SO}, \qquad (82)$$

which has a lift to

$$B\mathrm{Fivebrane} \times B\mathrm{Fivebrane} \to B\mathrm{Fivebrane}. \qquad (83)$$

This gives us a way of adding Fivebrane bundles; for example, in our setting of heterotic string theory, we have the virtual difference of the tangent bundle and the gauge bundle. In fact lifts such as (83) work for even higher connected covers as well.
(6) The bundles we have are the tangent bundle and the gauge vector bundle E, with the total bundle being the K-theoretic virtual difference. Hence it is natural to ask about the Fivebrane structure of sums and differences of bundles. There is a distinction here according to whether the bundles are spin or not. For the two possible gauge bundles, the $E_8 \times E_8$ bundle is spin, while the $\mathrm{Spin}(32)/\mathbb{Z}_2$ bundle is not, as its second Stiefel–Whitney class is non-vanishing. The (integral) Pontrjagin classes for two bundles $E_1$ and $E_2$ satisfy [54]

$$p_k(E_1 \oplus E_2) = \sum_{i+j=k} p_i(E_1) \cup p_j(E_2) \quad \text{mod elements of order 2}. \qquad (84)$$

Thus, applying to our case,

$$p_2(TM \oplus E) = p_1(TM) \cup p_1(E) + p_2(TM) + p_2(E) \quad \text{mod 2-torsion}. \qquad (85)$$

But we are assuming $p_1(TM)$ to be zero. Recall that in general, we have to have a String structure on a bundle to talk about a Fivebrane structure, as we have to have a Spin structure on a bundle to talk about a String structure. Hence we get

$$p_2(TM \oplus E) = p_2(TM) + p_2(E) \quad \text{mod 2-torsion}, \qquad (86)$$

so

$$2[p_2(TM \oplus E) - p_2(TM) - p_2(E)] = 0. \qquad (87)$$

We have worked out the case of a direct sum, but the case of virtual difference should be analogous. Indeed, since $V \oplus (-V)$ is equivalent to a trivial bundle, $p_1(E_1 - E_2) = p_1(E_1) - p_1(E_2)$ and $p_2(E_1 - E_2) = p_2(E_1) - p_2(E_2)$, and hence we get

$$2[p_2(TM - E) - p_2(TM) + p_2(E)] = 0. \qquad (88)$$
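The step from (84) to (88) can be spelled out as follows. This is a sketch, working modulo elements of order 2 throughout and assuming that the virtual difference satisfies $(E_1 - E_2) \oplus E_2 \simeq E_1$ stably, so that (84) applies to it.

```latex
% Apply (84) with k = 1, 2 to (E_1 - E_2) \oplus E_2 = E_1:
p_1(E_1) = p_1(E_1 - E_2) + p_1(E_2), \\
p_2(E_1) = p_2(E_1 - E_2) + p_1(E_1 - E_2) \cup p_1(E_2) + p_2(E_2).
% Solve for the classes of the difference:
p_1(E_1 - E_2) = p_1(E_1) - p_1(E_2), \\
p_2(E_1 - E_2) = p_2(E_1) - p_2(E_2)
  - \bigl(p_1(E_1) - p_1(E_2)\bigr) \cup p_1(E_2).
% When p_1(E_2) vanishes (as it does once all bundles satisfy the
% string condition), the cross term drops out:
p_2(E_1 - E_2) = p_2(E_1) - p_2(E_2)
  \quad \text{mod elements of order 2},
% which, with E_1 = TM and E_2 = E, is the statement (88).
```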
We can actually give a better description by using special characteristic classes more adapted for covers of the tangent bundle and, in addition, use integral coefficients.
Note that, making use of the fact that the bundles are not just real but also spin,

$$H^*(B\mathrm{Spin}; \mathbb{Z}) = \mathbb{Z}[Q_1, Q_2, \ldots] \oplus \gamma, \qquad (89)$$

with $\gamma$ a 2-torsion factor, i.e. $2\gamma = 0$ [71], concentrated in degrees not congruent to 0 mod 4. The two degrees relevant to our discussion are

$$H^4(B\mathrm{Spin}; \mathbb{Z}) \cong \mathbb{Z} \text{ with generator } Q_1, \qquad H^8(B\mathrm{Spin}; \mathbb{Z}) \cong \mathbb{Z} \oplus \mathbb{Z} \text{ with generators } Q_1^2,\ Q_2, \qquad (90)$$

where $Q_1$ and $Q_2$ are determined by their relation to the Pontrjagin classes

$$p_1 = 2Q_1, \qquad p_2 = Q_1^2 + 2Q_2. \qquad (91)$$
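The relations (91) can be combined mechanically with the Whitney sum formula (84) to see how $Q_2$ behaves on a direct sum. The sketch below works modulo 2-torsion and with 2 inverted, treating $a = Q_1(E)$, $b = Q_1(F)$, $c = Q_2(E)$, $d = Q_2(F)$ as formal symbols; the vector encoding and all function names are assumptions made for this illustration only.

```python
from fractions import Fraction

# Degree-4 classes are coefficient vectors over the basis (a, b);
# degree-8 classes are coefficient vectors over (a^2, ab, b^2, c, d),
# with a = Q1(E), b = Q1(F), c = Q2(E), d = Q2(F) formal symbols.


def vec(*xs):
    return tuple(Fraction(x) for x in xs)


def add(u, v):
    return tuple(x + y for x, y in zip(u, v))


def scale(k, u):
    return tuple(Fraction(k) * x for x in u)


def cup44(u, v):
    """Cup product of two degree-4 classes, landing in degree 8."""
    (a1, b1), (a2, b2) = u, v
    return vec(a1 * a2, a1 * b2 + b1 * a2, b1 * b2, 0, 0)


def pontrjagin_from_q(q1, q2):
    """Relations (91): p1 = 2 Q1, p2 = Q1^2 + 2 Q2."""
    return scale(2, q1), add(cup44(q1, q1), scale(2, q2))


def q_from_pontrjagin(p1, p2):
    """Inverse of (91), valid once 2 is inverted."""
    q1 = scale(Fraction(1, 2), p1)
    q2 = scale(Fraction(1, 2), add(p2, scale(-1, cup44(q1, q1))))
    return q1, q2


def whitney_sum(pE, pF):
    """Formula (84) in degrees 4 and 8, ignoring elements of order 2."""
    (p1E, p2E), (p1F, p2F) = pE, pF
    return add(p1E, p1F), add(add(p2E, p2F), cup44(p1E, p1F))


def q_of_sum(q1E, q2E, q1F, q2F):
    """(Q1, Q2) of E + F computed via the Pontrjagin classes."""
    pE = pontrjagin_from_q(q1E, q2E)
    pF = pontrjagin_from_q(q1F, q2F)
    return q_from_pontrjagin(*whitney_sum(pE, pF))


if __name__ == "__main__":
    c, d = vec(0, 0, 0, 1, 0), vec(0, 0, 0, 0, 1)
    print(q_of_sum(vec(1, 0), c, vec(0, 1), d))  # general case
    print(q_of_sum(vec(0, 0), c, vec(0, 1), d))  # Q1(E) = 0
```

In the general case the degree-8 output has coefficients (0, 1, 0, 1, 1), i.e. $Q_2(E \oplus F) = Q_1(E) \cup Q_1(F) + Q_2(E) + Q_2(F)$; setting $Q_1(E) = 0$ removes the cross term.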
Then the spin generators are given in terms of the Pontrjagin classes by $Q_1 = \tfrac{1}{2} p_1$ and $Q_2 = \tfrac{1}{2} p_2 - \tfrac{1}{2} (\tfrac{1}{2} p_1)^2$. Note that $Q_2 = \tfrac{1}{2} p_2$ holds when $Q_1$ vanishes. Their importance in the study of anomalies was emphasized in [60]. For two Spin bundles E and F, such as the spinor bundle and the $E_8 \times E_8$ bundle,

$$Q_2(E \oplus F) = \sum_{i+j=2} Q_i(E) \cup Q_j(F) = Q_2(E) + Q_2(F), \qquad (92)$$
where again the condition $Q_1(TM) = 0$ is used.

4.8. Summary

• We considered the notion of a "Fivebrane structure" on a manifold, which generalizes that of a String structure: as a String structure is defined to be a lift of the tangent bundle of a Spin manifold to the 3-connected cover of Spin, a Fivebrane structure is the further lift to a 7-connected cover. We showed that Fivebrane structures exist when a fractional second Pontrjagin class vanishes. This holds for the tangent bundle as well as gauge bundles. When considering direct sums or K-theoretic virtual bundles, as in the case of heterotic string theory where we have both TX and E, the obstructions are given by the sums and differences of the individual Fivebrane structures, respectively.
• In analogy to how String structures appear in terms of the vanishing of a worldsheet anomaly for the superstring, Fivebrane structures are related to the vanishing of an anomaly in the six-dimensional worldvolume theory of the fivebranes. In M-theory and type IIA string theory, this is the anomaly of the M-theory fivebrane [26, 27, 56, 11, 17], while in heterotic string theory this is the worldvolume anomaly of the heterotic fivebrane [51]. Note that the electric-magnetic duality between strings and 5-branes in ten dimensions relates String structures and Fivebrane structures:
• We notice that string theory suggests that String structures and Fivebrane structures are related by a duality which generalizes Hodge duality. Apart from the well-known fact (see, for instance, [29] for a survey) that NS 5-branes are magnetic duals to strings, we notice that there is a known formula for the Hodge dual of the 3-form curvature which appears in the Green–Schwarz mechanism which relates to Fivebrane structures as the former relates to String structures. We leave the detailed study of this duality for a separate treatment.

Acknowledgments

H. S. thanks Matthew Ando and Nitu Kitchloo for very useful discussions and acknowledges the hospitality of the mathematics department at the University of Illinois at Urbana-Champaign. We thank Dan Freed and Jacques Distler for helpful discussion. We thank the anonymous referee for very useful comments and suggestions on improving the presentation.

Appendix A. Recollection of Characteristic Classes

We recall elements of the theory of characteristic classes.

A.1. Universal characteristic classes

For any topological group G, there is a classifying space BG and a universal principal G-bundle

$$EG \to BG \qquad (93)$$

such that equivalence classes of (numerable) G-bundles $E \to B$ are in 1-1 correspondence with homotopy classes of maps $B \to BG$. For G a Lie group, elements of the cohomology $H^*(BG)$ (with any coefficients) are called the universal characteristic classes for G-bundles. Given a G-bundle $E \to B$, its characteristic classes are the pull-backs of the universal characteristic classes via the classifying map $B \to BG$. It is a classical theorem that, for G compact connected, the image in real cohomology, $H^*(BG, \mathbb{R})$, of this cohomology ring is finitely freely generated in even degrees and is isomorphic to the ring of invariant polynomials on $\mathfrak{g} = \mathrm{Lie}(G)$. Moreover, the real cohomology ring of G itself, $H^*(G, \mathbb{R}) \cong H^*(\mathfrak{g}, \mathbb{R})$, is isomorphic to the Lie algebra cohomology ring of $\mathfrak{g}$, which is generated in odd degree. The relation between $H^*(G, \mathbb{R})$ and $H^*(BG, \mathbb{R})$ is as follows:
• $H^*(G)$ is isomorphic to an exterior algebra $\Lambda(x_1, \ldots, x_n)$, where the $x_i$ are all of odd degree.
• $H^*(BG)$ is isomorphic to a polynomial algebra $P[y_1, \ldots, y_n]$ where

$$\text{degree } y_i = \text{degree } x_i + 1. \qquad (94)$$
The isomorphism of the respective vector spaces of generators can in fact be expressed invariantly: Let $PH^*(G)$ denote the subspace of primitive elements, i.e. those $x \in H^*(G)$ such that

$$m^*(x) = 1 \otimes x + x \otimes 1 \qquad (95)$$

where $m : G \times G \to G$ is the group multiplication. Let $QH^*(BG)$ denote the quotient space of indecomposables, i.e. $H^+(BG)/H^+(BG) \cdot H^+(BG)$, where $H^+$ denotes the subalgebra of elements of positive degree. The isomorphism we have described in terms of generators is in fact

$$\tau : PH^*(G) \to QH^{*+1}(BG). \qquad (96)$$

The isomorphism says that the two vector spaces are transgressively related. Unfortunately the name transgression is sometimes applied in algebraic topology to $\tau$ (cf. Cartan, Borel, Serre) and sometimes in differential geometry to its inverse (cf. Chern and disciples). In algebraic topology, the map $\sigma : H^*(BG) \to H^{*-1}(G)$ is always defined whereas $\tau$ is defined only on primitive elements.

An element c in $H^n(BG, \mathbb{Z})$ can be identified with a (homotopy class of a) map from BG to the Eilenberg–MacLane space $K(\mathbb{Z}, n)$ (and similarly for $\mathbb{R}$), so that the characteristic class $f^* c$ of a G-bundle E with classifying map f, corresponding to the universal characteristic class c, is given simply by the composite map

$$B \xrightarrow{\ f\ } BG \xrightarrow{\ c\ } K(\mathbb{Z}, n). \qquad (97)$$

A.2. The cohomology of BSO and BU

We are particularly concerned in this paper with $H^*(B\mathrm{SO})$ and $H^*(BU)$. See [54]. In characteristic 0,

$$H^*(B\mathrm{SO}) \cong P[y_4, y_8, \ldots], \qquad (98)$$

where now the indexing is by the degree of the (Pontrjagin) class. In any characteristic,

$$H^*(BU) \cong P[c_1, c_2, \ldots], \qquad (99)$$

where we have used the traditional notation for the Chern classes $c_i$. The basic definitions and properties of Chern and Pontrjagin classes are recalled next.

A.3. Chern–Weil homomorphism

Given a G-bundle $P \to X$ with connection A, the Chern–Weil homomorphism

$$\mathrm{inv}(\mathfrak{g}) \to H^*(X, \mathbb{R}) \qquad (100)$$
sends each degree n invariant polynomial $k \in \mathrm{inv}(\mathfrak{g})$ on the Lie algebra $\mathfrak{g} = \mathrm{Lie}(G)$ to the differential form

$$k \mapsto k(F_A) := k(F_A \wedge \cdots \wedge F_A) \qquad (101)$$

obtained by wedging n copies of the curvature 2-form and evaluating its Lie algebra value in k. This $k(F_A)$ is a closed form. The corresponding class $[k(F_A)]$ in de Rham cohomology is the characteristic class of P corresponding to k. This class is independent of the choice of connection on P.

Table 6. The Chern–Weil homomorphism sends, for each G-bundle $P \to X$, any degree n invariant polynomial on $\mathfrak{g} = \mathrm{Lie}(G)$ to the de Rham class of the differential form $k(F_A) = k(F_A \wedge \cdots \wedge F_A)$ obtained by inserting the curvature 2-form of any connection on P into k.

$$\begin{array}{ccccc}
\mathrm{inv}(\mathfrak{g}) & \longrightarrow & \Omega^*(X) & \longrightarrow & H^*(X, \mathbb{R}) \\
k & \longmapsto & k(F_A) & \longmapsto & [k(F_A)] \\
\text{invariant polynomial} & & \text{characteristic form} & & \text{characteristic class}
\end{array}$$

(The composite $\mathrm{inv}(\mathfrak{g}) \to H^*(X, \mathbb{R})$ is the Chern–Weil homomorphism.)
A.4. Polynomials and classes for matrix Lie algebras

Given a matrix Lie algebra $\mathfrak{g} \subset \mathfrak{gl}(n)$, the trace and the determinant operation on matrices provide families of $\mathfrak{g}$-invariant polynomials.

• The assignment

$$X \mapsto \mathrm{ch}(X) := \mathrm{Tr}(\exp(X)) \qquad (102)$$

defines the invariant polynomials $\mathrm{ch}_k(X)$ as

$$\mathrm{ch}(X) := \sum_{k=0}^{\infty} t^k\, \mathrm{ch}_k(X, \ldots, X). \qquad (103)$$

The expression $\mathrm{ch}(X)$ is the total Chern character.
• Let $\mathfrak{g} \subset \mathfrak{gl}(n, \mathbb{C})$ be a Lie algebra of complex matrices. Then the assignment

$$X \mapsto c(X) := \det(t + iX) \qquad (104)$$

defines the invariant polynomials $c_k$ as

$$c(X) = \sum_{k=0}^{n} t^{n-k} c_k(X, \ldots, X). \qquad (105)$$

These are the Chern polynomials. The corresponding class $[c_k(F_A)]$ is the kth Chern class.
Table 7. Characteristic classes for matrix Lie algebras obtained from the trace and the determinant.

Characteristic class | Definition | Invariant polynomial | Lie algebra
Chern character | $\mathrm{ch}(X) = \mathrm{tr}(\exp(itX)) = \sum_k t^k\, \mathrm{ch}_k(X, \ldots, X)$ | $\mathrm{ch}_k$ | $\mathfrak{g} \subset \mathfrak{gl}(n, \mathbb{C})$
Chern class | $c(X) = \det(t + iX) = \sum_k t^{n-k} c_k(X, \ldots, X)$ | $c_k$ | $\mathfrak{g} \subset \mathfrak{gl}(n, \mathbb{C})$
Pontrjagin class | $p(X) = \det(t - X) = \sum_k t^{n-k} p_{k/2}(X, \ldots, X)$ | $p_{k/2}$ | $\mathfrak{g} \subset \mathfrak{gl}(n, \mathbb{R})$
• Let $\mathfrak{g} \subset \mathfrak{gl}(n, \mathbb{R})$ be a Lie algebra of real matrices. Then the assignment

$$X \mapsto p(X) := \det(t - X) \qquad (106)$$

defines the invariant polynomials $p_{k/2}$ as

$$p(X) = \sum_{k=0}^{n} t^{n-k} p_{k/2}(X, \ldots, X). \qquad (107)$$

These are the Pontrjagin polynomials. The corresponding classes $[p_{k/2}(F_A)]$ are the Pontrjagin classes.
• The restrictions of the $c_k$ from $\mathfrak{gl}(n, \mathbb{C})$ to $\mathfrak{gl}(n, \mathbb{R})$ satisfy

$$i^k c_k(X, \ldots, X) = p_{k/2}(X, \ldots, X). \qquad (108)$$
• When $\mathfrak{g} \subset \mathfrak{sl}(n)$ all elements are traceless and various cancellations occur. In particular for $\mathfrak{g} = \mathfrak{so}(n)$ the Pontrjagin classes have a relatively simple relation to the Chern characters.

Appendix B. The Leray–Serre Spectral Sequence

Suppose that $H^*((BU)\langle 2k\rangle; \mathbb{Q}) = P[c_k, c_{k+1}, \ldots]$. Corresponding to the fibration

$$\begin{array}{ccc}
K(\mathbb{Z}, 2k-1) & \longrightarrow & (BU)\langle 2k+2\rangle \\
 & & \downarrow \\
 & & (BU)\langle 2k\rangle & \longrightarrow & K(\mathbb{Z}, 2k)
\end{array} \qquad (109)$$

we have a spectral sequence

$$H^*((BU)\langle 2k\rangle; H^*(K(\mathbb{Z}, 2k-1); \mathbb{Q})) \Rightarrow H^*((BU)\langle 2k+2\rangle). \qquad (110)$$

Here the shift by 2 in the degree of the Eilenberg–MacLane space comes from Bott periodicity. The cohomology $H^*(K(\mathbb{Z}, 2k-1); \mathbb{Q})$ is an exterior algebra $\Lambda(e_{2k-1})$ on a generator $e_{2k-1}$ of degree $2k - 1$, which satisfies $e_{2k-1}^2 = 0$. The differential in the sequence takes this generator to the kth Chern class $c_k$. Modding out by this class gives (49). In fact, the spectral sequence collapses from here and there is no extension problem for the $E_\infty$ term.

References

[1] J. F. Adams, Stable Homotopy and Generalised Homology (University of Chicago Press, 1974).
[2] L. Alvarez-Gaumé and P. H. Ginsparg, The structure of gauge and gravitational anomalies, Ann. Phys. 161 (1985) 423–490; ibid., Erratum 171 (1986) 233.
[3] L. Alvarez-Gaumé and E. Witten, Gravitational anomalies, Nucl. Phys. B 234 (1984) 269–330.
[4] M. Ando, M. J. Hopkins and N. P. Strickland, Elliptic spectra, the Witten genus and the theorem of the cube, Invent. Math. 146 (2001) 595–687.
[5] P. Aschieri and B. Jurčo, Gerbes, M5-brane anomalies and E8 gauge theory, J. High Energy Phys. 0410 (2004) 068, 22 pp.; arXiv:hep-th/0409200.
[6] N. A. Baas, M. Bökstedt and T. A. Kro, 2-Categorical K-theory, arXiv:math/0612549.
[7] J. Baez, A. Crans, U. Schreiber and D. Stevenson, From loop groups to 2-groups, Homology Homotopy Appl. 9(2) (2007) 101–135; arXiv:math/0504123 [math.QA].
[8] J. Baez and U. Schreiber, Higher gauge theory, in Categories in Algebra, Geometry and Mathematical Physics, Contemp. Math., Vol. 431 (Amer. Math. Soc., Providence, RI, 2007), pp. 7–30; arXiv:math/0511710v2 [math.DG].
[9] J. Baez and D. Stevenson, The classifying space of a topological 2-group, arXiv:0801.3843 [math.AT].
[10] T. Bartels, 2-Bundles, arXiv:math/0410328 [math.CT].
[11] E. Bergshoeff, R. Percacci, E. Sezgin, K. S. Stelle and P. K. Townsend, U(1) extended gauge algebras in p-loop space, Nucl. Phys. B 398 (1993) 343–358.
[12] L. Bonora, P. Cotta-Ramusino, M. Rinaldi and J. Stasheff, The evaluation map in field theory, sigma-models and strings I, Comm. Math. Phys. 112 (1987) 237–282.
[13] P. Bouwknegt, A. Carey, V. Mathai, M. Murray and S. Stevenson, Twisted K-theory and K-theory of bundle gerbes, Comm. Math. Phys. 228 (2002) 17–49; arXiv:hep-th/0106194.
[14] J.-L. Brylinski and D. McLaughlin, A geometric construction of the first Pontryagin class, in Quantum Topology (World Scientific, Singapore, 1993), pp. 209–220.
[15] J.-L. Brylinski and D. McLaughlin, Čech cocycles for characteristic classes, Comm. Math. Phys. 178 (1996) 225–236.
[16] C. G. Callan, Jr., J. A. Harvey and A. Strominger, Worldbrane actions for string solitons, Nucl. Phys. B 367 (1991) 60–82.
[17] M. Cederwall, G. Ferretti, B. E. W. Nilsson and A. Westerberg, Higher dimensional loop algebras, nonabelian extensions and p-branes, Nucl. Phys. B 424 (1994) 97–123.
[18] A. H. Chamseddine, Interacting supergravity in ten-dimensions: The role of the six-index gauge field, Phys. Rev. D 24 (1981) 3065–3072.
[19] G. F. Chapline and N. S. Manton, Unification of Yang–Mills theory and supergravity in ten-dimensions, Phys. Lett. B 120 (1983) 105–109.
[20] S. S. Chern and J. Simons, Characteristic forms and geometric invariants, Ann. Math. 99(2) (1974) 48–69.
[21] A. Clingher, Heterotic string data and theta functions, Adv. Theor. Math. Phys. 9 (2005) 173–252; arXiv:math/0110320.
[22] R. Coquereaux and K. Pilch, String structures on loop bundles, Comm. Math. Phys. 120 (1989) 353–378.
[23] E. Diaconescu, G. Moore and E. Witten, E8 gauge theory, and a derivation of K-theory from M-theory, Adv. Theor. Math. Phys. 6 (2003) 1031–1134; arXiv:hep-th/0005090.
[24] P. A. M. Dirac, Quantized singularities in the electromagnetic field, Proc. Royal Soc. A 133 (1931) 60–72.
[25] J. Dixmier and A. Douady, Champs continus d'espaces hilbertiens et de C*-algèbres, Bull. Soc. Math. France 91 (1963) 227–284.
[26] J. A. Dixon, M. J. Duff and E. Sezgin, The coupling of Yang–Mills to extended objects, Phys. Lett. B 279 (1992) 265–271.
[27] J. A. Dixon and M. J. Duff, Chern–Simons forms, Mickelsson–Fadeev algebras and the p-branes, Phys. Lett. B 296 (1992) 28–32.
November 17, 2009 15:39 WSPC/148-RMP
J070-00384
Fivebrane Structures
1239
[28] J. A. Dixon, M. J. Duff and J. C. Plefka, Putting string/fivebrane duality to the test, Phys. Rev. Lett. 69 (1992) 3009–3012; arXiv:hep-th/9208055v1. [29] M. J. Duff, R. R. Khuri and J. X. Lu, String solitons, Phys. Rept. 259 (1995) 213–326; arXiv:hep-th/9412184. [30] M. J. Duff, J. T. Liu and R. Minasian, Eleven-dimensional origin of string-string duality: A one loop test, Nucl. Phys. B 452 (1995) 261–282; arXiv:hep-th/9506126. [31] M. J. Duff and J. X. Lu, Strings from five-branes, Phys. Rev. Lett. 66 (1991) 1402– 1405. [32] M. J. Duff and J. X. Lu, Remarks on string/five-brane duality, Nucl. Phys. B 354 (1991) 129–140. [33] D. S. Freed, Higher algebraic structures and quantization, Comm. Math. Phys. 159 (1994) 343–398; arXiv:hep-th/9212115. [34] D. S. Freed, Quantum groups from path integrals, in Proc. Particles and Fields (Banff, 1994), CRM Ser. Math. Phys. (Springer, New York, 1999), pp. 63–107; arXiv:q-alg/9501025. [35] D. S. Freed, Dirac charge quantization and generalized differential cohomology, in Surv. Differ. Geom., VII (Int. Press, Somerville, MA, 2000), pp. 129–194; arXiv:hepth/0011220. [36] D. S. Freed, K-theory in quantum field theory, in Current Developments in Mathematics (Int. Press, Somerville, MA, 2002), pp. 41–87; arXiv:math-ph/0206031. [37] J. Figueroa-O’Farrill and J. Simon, Supersymmetric Kaluza–Klein reductions of M2-branes and and M5-branes, Adv. Theor. Math. Phys. 6 (2003) 703–793; arXiv: hep-th/0208107. [38] S. J. Gates, Jr. and H. Nishino, New D = 10, N = 1 supergravity coupled to Yang– Mills supermultiplet and anomaly cancellations, Phys. Lett. B 157 (1985) 157–163. [39] V. Giambalvo, The mod p cohomology of BO4k, Proc. Amer. Math. Soc. 20 (1969) 593–597. [40] M. B. Green and J. H. Schwarz, Anomaly cancellation in supersymmetric D = 10 gauge theory and superstring theory, Phys. Lett. B 149 (1984) 117–122. [41] M. B. Green, J. H. Schwarz and E. Witten, Superstring Theory, Vol. 2 (Cambridge University Press, Cambridge, 1988). 
[42] W. Greub and H.-R. Petry, Minimal coupling and complex line bundles, J. Math. Phys. 16 (1975) 1347–1351. [43] A. Henriques, Integrating L∞ -algebras; arXiv:math/0603563 [math.AT]. [44] H. Hohnhold, S. Stolz and P. Teichner, From minimal geodesics to super symmetric field theories, http://math.berkeley.edu/ teichner/Papers/Bott-EFT.pdf. ¨ [45] H. Hopf, Uber die Abbildungen der dreidimensionalen Sph¨ are auf die Kugelfl¨ ache, Math. Ann. 104(1) (1931) 637–665. [46] M. J. Hopkins and I. M. Singer, Quadratic functions in geometry, topology,and M -theory, J. Diff. Geom. 70 (2005) 329–452; arXiv:math/0211216. [47] B. Jurˇco, Crossed module bundle gerbes; classification, string group and differential geometry, arXiv:math/0510078 [math.DG]. [48] A. Kapustin and E. Witten, Electric-magnetic duality and the Geometric Langlands Program, Commun. Number Theory Phys. 1(1) (2007) 1–236; arXiv:hep-th/0604151. [49] T. P. Killingback, World-sheet anomalies and loop geometry, Nucl. Phys. B 288 (1987) 578–588. [50] I. Kriz and H. Sati, M Theory, type IIA superstrings, and elliptic cohomology, Adv. Theor. Math. Phys. 8 (2004) 345–394; arXiv:hep-th/0404013. [51] K. Lechner and M. Tonin, World volume and target space anomalies in the D = 10 superfivebrane sigma model, Nucl. Phys. B 475 (1996) 545–561; arXiv: hep-th/9603094.
November 17, 2009 15:39 WSPC/148-RMP
1240
J070-00384
H. Sati, U. Schreiber & J. Stasheff
[52] V. Mathai and H. Sati, Some relations between twisted K-theory and E8 gauge theory, J. High Energy Phys. 0403 (2004) 016, 20 pp.; arXiv:hep-th/0312033. [53] J. Milnor, Cnstruction of universal bundles I, Ann. Math. 63(2) (1956) 272–284. [54] J. W. Milnor and J. D. Stasheff, Characteristic Classes, Annals of Mathematics Studies, No. 76 (Princeton University Press, Princeton, NJ, 1974). [55] M. K. Murray and D. Stevenson, Higgs fields, bundle gerbes and string structures, Comm. Math. Phys. 243(3) (2003) 541–555. [56] R. Percacci and E. Sezgin, Symmetries of p-branes, Int. J. Mod. Phys. A 8 (1993) 5367–5382. [57] F. P. Peterson, Some remarks on Chern classes, Ann. Math. 69(2) (1959) 414–420. [58] Y. Rudyak, On Thom Spectra, Orientability, and Cobordism (Springer, New York, 1998). [59] A. Salam and E. Sezgin, Anomaly freedom in chiral supergravities, Phys. Scripta 32 (1985) 283–285. [60] H. Sati, An approach to anomalies in M -theory via KSpin, J. Geom. Phys. 58 (2008) 387–401; arXiv:0705.3484. [61] H. Sati, U. Schreiber and J. Stasheff, L∞ -connections and applications to String- and Chern–Simons n-transport, in Recent Developments in QFT, eds. B. Fauser et al. (Birkh¨ auser, Basel, 2008); arXiv:0801.3480 [math.DG]. [62] H. Sati, U. Schreiber and J. Stasheff, Twisted differential string and fivebrane structures, in preparation. [63] J. H. Schwarz, Anomaly cancellation: A retrospective from a modern perspective, Int. J. Mod. Phys. A 17S1 (2002) 157–166; arXiv:hep-th/0107059v1. [64] G. Segal, What is an elliptic object?, in Elliptic Cohomology (Cambridge Univ. Press, Cambridge, 2007), pp. 306–317. [65] W. M. Singer, Connective fiberings over BU and U, Topology 7 (1968) 271–303. [66] S. Stolz and P. Teichner, What is an elliptic object?, in Topology, Geometry and Quantum Field Theory (Cambridge Univ. Press, Cambridge, 2004), pp. 247–343. [67] S. Stolz and P. 
Teichner, Super symmetric field theories and integral modular functions, http://math.berkeley.edu/ teichner/Papers/SusyQFT.pdf. [68] R. Stong, Determination of H ∗ (BO(k, . . . , ∞), Z2 ) and H ∗ (BU(k, . . . , ∞), Z2 ), Trans. Amer. Math. Soc. 107 (1963) 526–544. [69] A. Strominger, Heterotic solitons, Nucl. Phys. B 343 (1990) 167–184, ibid., Erratum 353 (1991) 565. ´ [70] D. Sullivan, Infinitesimal computations in topology, Inst. Hautes Etudes Sci. Publ. Math. 47 (1977) 269–331. [71] E. Thomas, On the cohomology groups of the classifying space for the stable spinor groups, Bol. Soc. Mat. Mexicana 7(2) (1962) 57–69. [72] E. Witten, The index of the Dirac operator in loop space, in Elliptic Curves and Modular Forms in Algebraic Topology (Princeton, NJ, 1986), Lecture Notes in Math., Vol. 1326 (Springer, Berlin, 1988), pp. 161–181. [73] E. Witten, Unification in ten dimensions, in Workshop on Unified String Theories, eds. M. Green and D. Gross (World Scientific, Singapore, 1986), pp. 438–456. [74] E. Witten, On flux quantization in M -theory and the effective action, J. Geom. Phys. 22 (1997) 1–13; arXiv:hep-th/9609122v2. [75] C. N. Yang, Selected Papers (1945–1980) (World Scientific, Singapore, 2005). [76] B. Zumino, Y.-S. Wu and A. Zee, Chiral anomalies, higher dimensions, and differential geometry, Nucl. Phys. B 239 (1984) 477–507.
November 18, 2009 11:47 WSPC/148-RMP
J070-00386
Reviews in Mathematical Physics Vol. 21, No. 10 (2009) 1241–1312 © World Scientific Publishing Company
THE EXTENDED ALGEBRA OF OBSERVABLES FOR DIRAC FIELDS AND THE TRACE ANOMALY OF THEIR STRESS-ENERGY TENSOR
CLAUDIO DAPPIAGGI∗, THOMAS-PAUL HACK† and NICOLA PINAMONTI‡
II. Institut für Theoretische Physik, Universität Hamburg, Luruper Chaussee 149, D-22761 Hamburg, Germany
∗[email protected] †[email protected] ‡[email protected]
Received 22 April 2009 Revised 13 September 2009
Dedicated to the memory of our friend and colleague Raffaele Punzi
We discuss from scratch the classical structure of Dirac spinors on an arbitrary globally hyperbolic, Lorentzian spacetime, their formulation as a locally covariant quantum field theory, and the associated notion of a Hadamard state. Eventually, we develop the notion of Wick polynomials for spinor fields, and we employ the latter to construct a covariantly conserved stress-energy tensor suited for back-reaction computations. We shall explicitly calculate its trace anomaly in particular.
Keywords: Quantum field theory over curved backgrounds; spinor fields; conformal anomaly.
Mathematics Subject Classification 2000: 81T20, 81T05
Contents
1. Introduction
2. Dirac Fields: A Classical Overview
2.1. On the spin structure and related geometric entities
2.2. On the dynamics of a classical Dirac field
2.3. The Cauchy problem and the fundamental solutions
3. Dirac Fields: Quantum Point of View
3.1. The local algebras of fields and observables
3.2. Locality and general covariance
3.3. Spinors and Hadamard states
3.4. On the notion of Wick polynomials: Rewriting the field algebra
3.5. On the notion of Wick polynomials: The enlarged algebra and its states
4. The Stress-Energy Tensor of Dirac Fields
4.1. The classical stress-energy tensor
4.2. The quantum stress-energy tensor: The problem
4.3. The quantum stress-energy tensor: The solution and its trace anomaly
5. Conclusions and Outlook
Appendix A. Useful Tools and Necessary Calculations
A.1. Notations, conventions, identities
A.2. On the calculus of bispinor-tensors
A.3. On the Hadamard recursion relations and related results
A.4. Conserved local curvature tensors
1. Introduction
In the last few decades, we have witnessed an amazing leap forward in our understanding of the formulation of quantum field theory on curved backgrounds, thanks to the efforts of many research groups who often tackled this topic by means of the algebraic formalism. A careful analysis of all related achievements, though tempting, would require a review of its own and, instead, we shall restrict ourselves to briefly mentioning the results of a recent manuscript, which has prompted our interest in the topic discussed in this paper. To wit, in [24], it was shown that, in the framework of the semiclassical Einstein equations, it is possible to construct explicit solutions for a homogeneous and isotropic Friedmann–Robertson–Walker spacetime with flat spatial sections, under the assumption that the matter content is described by a suitably quantized free massive scalar field. In order to prove this result, two ingredients have played a key role, namely, Hadamard states, as the natural candidates for a ground state, and the quantum behavior of the regularized stress-energy tensor $T_{\mu\nu}$. In particular, as the end point of the above cited paper, a late-time stable de Sitter solution has been displayed and, hence, an effective cosmological constant has arisen without inserting it from the very beginning as an input datum. If one seeks the origin of this genuine quantum effect, one realizes that it is ultimately rooted in the so-called trace anomaly. In the case of a massless field conformally coupled to the scalar curvature, this is tantamount to claiming that the expectation value of the regularized trace does not vanish on Hadamard states, even though the trace does vanish at a classical level.
Although interesting, the derivation of the aforementioned result raises naturally the question about its robustness since one could wonder if this behavior is a feature pertaining only to scalar fields or if it holds true for any kind of matter constituent. It thus seems advisable to try to apply the same scheme of reasoning in the context of a free Dirac field. To our utmost surprise, we have realized that the accomplishment of this goal has not been as easy as one might a priori believe since,
just by a quick scan of the available literature, it is manifest that spinor fields play a somewhat ancillary role in the arena of algebraic quantum field theory. In fact, there are only a few mathematically sound results and, at the same time, many tools and concepts, which have been thoroughly discussed for scalar fields, have barely been touched upon for spinors. As an example, let us recall that we are mostly concerned with the trace anomaly for Dirac fields; this quantity was indeed already investigated in the late seventies in [18, 19], though by means of the so-called DeWitt–Schwinger expansion, which lacks mathematical rigor, and, therefore, we cannot call our understanding of this anomaly complete yet. With this in mind, before we can tackle any specific topic in a cosmological scenario, our first concern must lie in the amendment of the above-mentioned problem. This will indeed be the main point of the paper and we shall discuss it within the framework of the algebraic approach to quantum field theory. To this end, we must also take into account that the scientific community interested in this problem might not be acquainted with the formulation of a spinor field theory on a curved background, which, already at a classical level, turns out to be rather different from and more complicated than the scalar counterpart. Hence, as a starting point, we shall review the construction of a Dirac field in a classical framework, emphasizing the role of the underlying geometric structures which are needed in order to fully describe both the kinematically and the dynamically allowed field configurations. To this effect, our analysis will benefit from earlier works which have already dwelled on this topic and, most notably, we shall refer to the seminal paper of Lichnerowicz [51] as well as to [25, 66].
Subsequently, we shall discuss the quantization of a Dirac field on a curved background and, in this respect, one should mention that there are several possibilities at our disposal. On the one hand one could follow the point of view already suggested in [25], while, on the other hand, one could also analyze the problem from the perspective of Araki [2], whose scheme has the peculiarity of unifying spinors and cospinors in a single body before quantizing them. This leads to a natural definition both of a CAR ∗-algebra of fields and of a subalgebra of observables, once we require at least that elements whose supports are spacelike separated must commute. Furthermore, this scheme, also at the heart of [27, 37, 48, 49, 66, 75], has the remarkable advantage of being well suited to recast the quantum theory of Dirac fields in the language of Locally Covariant Quantum Field Theory [15], as we will point out. In order to fully control the machinery of a quantum field, the scalar field scenario already taught us that one has to understand well which algebraic states one should use and, to this avail, the ones of Hadamard type are the natural choice in the context of Dirac fields, too; these play the role of ground states in a curved background and their ultraviolet behavior closely mimics that of the Minkowski spacetime vacuum state. Consequently, the smeared fluctuations of the components of the stress-energy tensor on these states are bounded, a property which is vital in the context of semiclassical Einstein’s equations. Hadamard states for Dirac fields have already been discussed in [21, 37, 48, 49, 65, 75] and we will review their properties in detail before employing them to achieve the first of our main
results, namely, the construction of the extended algebra of Wick polynomials in a spinor framework. To this end, we will follow the path paved in the scalar scenario in [12, 13, 39, 40], where it has been shown that such polynomials lie at the basis of a sound S-matrix formalism for interacting field theories on a globally hyperbolic curved background. Nevertheless, Wick polynomials are already valuable in free field theories and, as already anticipated, we will use them to achieve our second main result, i.e. the definition of a well-behaved quantum stress-energy tensor operator $T_{\mu\nu}$ for Dirac fields. To achieve this goal, we shall follow a procedure similar to that discussed in [57] (see also the related work [42]) for a free scalar field, i.e. we shall introduce an improved point-splitting procedure to define $T_{\mu\nu}$ evaluated on a Hadamard state. This leads to a new overall stress-energy tensor which does not alter classical dynamics and is ultimately conserved at the quantum level. Furthermore, as a byproduct of our analysis, we shall also be able to explicitly compute the expectation value of its trace which will agree, up to terms proportional to R, with previously found results while being derived in a rigorous framework.
2. Dirac Fields: A Classical Overview
Since, as we have outlined in the introduction, the aim of this paper is to provide an approach that is as self-contained as possible to some topics related to the quantum description of Dirac fields in curved backgrounds, we will start with a description of Dirac spinors in a classical framework. Although this topic has already been discussed both from a geometrical and from an analytical point of view by many authors, we reckon that we should try to recall the main features of the classical approach in order both to facilitate the understanding of the quantum aspects and to fix some subtleties which ubiquitously arise in these scenarios.
2.1. On the spin structure and related geometric entities
Bearing in mind this overall philosophy, we shall mainly devote this subsection to the introduction of spin structures and of Dirac bundles, in order to characterize (co)spinors as suitable sections. We shall not dwell too much on the geometrical contents and, to the potential readers who might find our approach too shallow, we present our apologies and point them to [50] for a careful discussion of most of the forthcoming concepts and applications. As a starting point, let us fix that, in this paper, a spacetime is meant to be a four-dimensional, Hausdorff, connected, smooth manifold endowed with a Lorentzian metric, whose signature is chosen as (−, +, +, +). As proven in [33, 34], under these assumptions second countability and, hence, paracompactness are also properties of the underlying background. Furthermore, since it is common wisdom to associate Dirac fields to the notion of spin, we need a few definitions as a first step:
Definition 2.1. We call spin group Spin(p, q) with p, q ∈ N the double cover of SO(p, q), i.e. there exists the following short exact sequence of Lie group
homomorphisms:
$$\{e\} \to \mathbb{Z}_2 \to \mathrm{Spin}(p, q) \to SO(p, q) \to \{e\},$$
where {e} stands for the trivial group, whereas $\mathbb{Z}_2 := \{\pm 1\}$ is the cyclic group of order 2. Therefore, the third map of the above sequence, which we denote by Π: Spin(p, q) → SO(p, q), is a surjective Lie-group homomorphism whose kernel is isomorphic to $\mathbb{Z}_2$. Notice that Spin(p, q) is here defined up to isomorphism and that the covering homomorphism Π is not concretely realized. To strengthen this definition, one needs the notion of Clifford algebra, which will be discussed later.
Remark. As a consequence of the above definition, there exists an isomorphism between Spin(p, q) and Spin(q, p) for any possible value of both p and q. Furthermore, it is also possible to talk about the dimension of such classical Lie groups which, by direct inspection, is
$$\dim(\mathrm{Spin}(p, q)) = \frac{1}{2}(p + q)(p + q - 1).$$
Furthermore, for all p, q > 0, the spin group has two connected components, where we denote the one connected to the identity as $\mathrm{Spin}_0(p, q)$. The latter insight entails that the scenario with p = 3 and q = 1 is of great interest since, in this case, $\mathrm{Spin}_0(3, 1)$ is isomorphic to SL(2, C).
The above definition represents only the first step towards the definition of a Dirac field since, in modern classical field theory, the geometric interpretation of a kinematically allowed configuration is that of a section of a suitable associated bundle. In this respect, one should notice that, (un)fortunately, in the often analyzed case of a scalar field, such a nice perspective does not really enter the fray, whereas, in the case discussed here, such a luxury is not at our disposal, a spinor being intrinsically a vector-valued field. Hence, we shall now show how the notion of spin group can be intertwined with that of a classical field. As a first step we need further definitions based on the analysis in [45, Chap. 4].
Definition 2.2.
We call P[V, π′, M] a principal bundle over M with typical fiber V and projection map π′: P → M, if there exists a topological group G such that P is a principal G-space, namely:
• P is a topological space endowed with a free right action P × G → P which maps (p, g) to $R_g(p) := pg \in P$ for all p ∈ P and g ∈ G,
• if we call P∗ the subspace of P × P composed of all elements of the form (p, pg) with p ∈ P and g ∈ G, then there exists a continuous map τ: P∗ → G such that pτ(p, p′) = p′ for all (p, p′) ∈ P∗.
It is important to notice that, under the hypotheses of this last definition, one can always introduce a bijection between the typical fiber V and the group G as
proved in [45, Proposition 2.6]. Therefore, we will denote principal bundles (and vector bundles as well) by replacing the typical fiber with the isomorphic group (respectively, vector space) in the following.
Definition 2.3. Given a vector bundle E := E[V, π, M] over a Lorentzian manifold M with the vector space V over the field $\mathbb{K}$ as typical fiber and projection map π: E → M, we call frame F(x) over the point x the assignment of an ordered basis to the fiber $\pi^{-1}(x) \equiv V$, i.e. a map $p: \mathbb{K}^k \to \pi^{-1}(x)$, k being the dimension of V and $\mathbb{K} \in \{\mathbb{R}, \mathbb{C}\}$.
Remark. This last definition is at the heart of the widely exploited concept of tetrads in general relativity and, on practical grounds, it is remarkable since it guarantees that, whenever a vector bundle is associated to the underlying spacetime, one is free to choose a basis of such a space and that the rules for changing the basis are left untouched. The latter property is encoded in a right action of $GL(k, \mathbb{K})$ on V, i.e. each element of this group transforms a basis into another one, and it follows that this right action is both free and transitive, since it is always possible to transform any chosen basis into any other one and only the identity element leaves a chosen basis unchanged. The most notable and ubiquitously used application of such a definition is the tangent bundle where $V \equiv \mathbb{R}^k$, with k = dim M. In the following, we shall have this case in mind and hence we shall identify E as $E \equiv TM = TM[\mathbb{R}^k, \pi, M]$. Therefore, if we call $F_x M$ the set of all possible frames over a point x ∈ M, this is isomorphic to $GL(k, \mathbb{R})$ and, thus, we can gather all this information into a unique object:
Definition 2.4. Given a spacetime M, a frame bundle associated to TM is the principal bundle $FM = FM[GL(4, \mathbb{R}), \pi', M]$ built up from the disjoint union $\bigsqcup_x F_x M$, where $F_x M$ is identified with the typical fiber $GL(4, \mathbb{R})$ and where π′: FM → M is the projection map.
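The statement in the Remark above, that the right action of GL(k, R) on frames is free and transitive, can be checked on a small numerical example. The following is only a minimal sketch under stated assumptions: a frame is represented by an invertible matrix whose columns are the basis vectors, and the helper names (`random_frame`, `act`, `translation`) are hypothetical and not taken from the paper.

```python
import numpy as np

def random_frame(k, rng):
    """Random ordered basis of R^k, stored as the columns of an invertible matrix."""
    while True:
        p = rng.normal(size=(k, k))
        if abs(np.linalg.det(p)) > 1e-3:  # reject (nearly) singular draws
            return p

def act(p, g):
    """Right action of g in GL(k, R) on the frame p: the new basis vectors are
    linear combinations of the old ones with coefficients taken from g."""
    return p @ g

def translation(p, p_prime):
    """The unique g with act(p, g) == p_prime (the role played by tau in Definition 2.2)."""
    return np.linalg.solve(p, p_prime)

rng = np.random.default_rng(0)
k = 4
p, p_prime = random_frame(k, rng), random_frame(k, rng)
g = translation(p, p_prime)

# Transitivity: any frame can be reached from any other one.
transitive = np.allclose(act(p, g), p_prime)
# Freeness: the only group element mapping p to itself is the identity.
free = np.allclose(translation(p, p), np.eye(k))
# Compatibility with the group law: (p g) g' = p (g g').
right_action = np.allclose(act(act(p, g), g), act(p, g @ g))
```

In this representation the fiber of frames over a point is a copy of the group itself, which is the identification of $F_x M$ with GL(k, R) used in Definition 2.4.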
Whenever M is oriented and time oriented, we can reduce $GL(4, \mathbb{R})$ to $SO_0(3, 1)$. We shall always consider this case in the following. We are now in the position to introduce the main geometric structure of this paper, which lies at the heart of the construction and of the analysis of a Dirac (co)spinor:
Definition 2.5. Given an oriented and time oriented spacetime M, a spin structure is the pair (SM, ρ) where $SM := SM[\mathrm{Spin}_0(3, 1), \tilde\pi, M]$ is a principal fiber bundle over M with the identity component of the spin group as typical fiber. Moreover, ρ is a smooth equivariant bundle morphism from SM to FM, that is, the following two conditions hold:
(1) ρ is base point preserving, namely, with π′ denoting the projection map of FM,
$$\pi' \circ \rho = \tilde\pi,$$
(2) ρ must be equivariant, i.e. calling $R_{\tilde\Lambda}$ and $R_\Lambda$ the natural right actions of $\mathrm{Spin}_0(3, 1)$ on SM and of $SO_0(3, 1)$ on FM, respectively, we require that
$$\rho \circ R_{\tilde\Lambda} = R_\Lambda \circ \rho, \qquad \forall\, \Lambda \in SO_0(3, 1),$$
where $\Lambda = \Pi(\tilde\Lambda)$, Π being as in Definition 2.1.
Remark. It is remarkable that spin structures are only unique$^{a}$ on simply connected spacetimes [33, 34].
Remark. A natural, apparently naïve, but ultimately rather complicated question a potential reader might ask is why one should cope with such complicated geometric structures to deal with a somehow natural concept such as that of spin. In the common four-dimensional flat spacetime a spin-$\frac{1}{2}$ field is nothing but a suitable function satisfying the Dirac equation and transforming under a unitary and parity invariant representation of the universal cover of the Poincaré group. Such a representation is induced from the $j = \frac{1}{2}$ one of SU(2) (see [5, Chap. 21] for more details) and it can be restricted to the $D^{(\frac{1}{2},0)} \oplus D^{(0,\frac{1}{2})}$ representation of SL(2, C). If one considers a generic four-dimensional spacetime, this last remark plays a pivotal role since, although Poincaré invariance is not present, a notion of SL(2, C) group can be nonetheless naturally introduced out of a spin structure. This will be used as the building block of a Dirac field in a curved background.
Hence, though not sufficient to determine the full dynamics of a Dirac spinor, our philosophy will be to seek to characterize the kinematically allowed configurations of such a field by means of the mentioned $D^{(\frac{1}{2},0)} \oplus D^{(0,\frac{1}{2})}$ representation, while remembering at the same time that, classically, fields should be understood as sections of a suitable vector bundle. To combine these two concepts, we proceed in the following way:
Definition 2.6. We call Dirac bundle of a four-dimensional Lorentzian spacetime M with respect to the representation $T := D^{(\frac{1}{2},0)} \oplus D^{(0,\frac{1}{2})}$ of SL(2, C) on $\mathbb{C}^4$ the associated bundle $DM := SM \times_T \mathbb{C}^4$. This is the set of equivalence classes [(p, z)], where p ∈ SM, $z \in \mathbb{C}^4$ and equivalence is defined out of the relation $(p_1, z_1) \sim (p_2, z_2)$ if and only if there exists an element A of SL(2, C) such that $R_A(p_1) = p_2$ and $T(A^{-1}) z_1 = z_2$. The global structure of DM is that of a fiber bundle over M with typical fiber $\mathbb{C}^4$, and the projection map $\pi_D$ is traded from the one of SM, namely, ∀ [(p, z)] ∈ DM, it holds
$$\pi_D[(p, z)] := \tilde\pi(p).$$
$^{a}$ Uniqueness is meant here up to equivalence; cf. [25] for the suitable definition of equivalence in this context.
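The covering Π of Definition 2.1 and the isomorphism between $\mathrm{Spin}_0(3, 1)$ and SL(2, C) used in Definition 2.6 can be made concrete numerically. The sketch below is a hedged illustration, not the construction of the paper, which realizes Π later via Clifford algebras. It assumes the standard textbook realization in which a vector x is identified with the Hermitian matrix $x^\mu \sigma_\mu$ and A ∈ SL(2, C) acts as $X \mapsto A X A^\dagger$; the function names are hypothetical.

```python
import numpy as np

# Basis sigma_mu = (identity, Pauli matrices) of the Hermitian 2x2 matrices.
SIGMA = np.array([
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)
ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # signature (-, +, +, +) as in the text

def random_sl2c(rng):
    """Random complex 2x2 matrix rescaled to unit determinant."""
    m = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    return m / np.sqrt(np.linalg.det(m))

def covering(a):
    """Lorentz matrix Lambda(A) defined by A sigma_nu A^dagger = Lambda^mu_nu sigma_mu."""
    lam = np.empty((4, 4))
    for mu in range(4):
        for nu in range(4):
            lam[mu, nu] = 0.5 * np.trace(SIGMA[mu] @ a @ SIGMA[nu] @ a.conj().T).real
    return lam

def dim_spin(p, q):
    """Dimension formula (p + q)(p + q - 1)/2 quoted in the Remark after Definition 2.1."""
    n = p + q
    return n * (n - 1) // 2

rng = np.random.default_rng(1)
a, b = random_sl2c(rng), random_sl2c(rng)
lam = covering(a)

preserves_metric = np.allclose(lam.T @ ETA @ lam, ETA)           # Lambda is a Lorentz matrix
same_image_for_pm_a = np.allclose(covering(-a), lam)             # A and -A cover the same Lambda
homomorphism = np.allclose(covering(a @ b), lam @ covering(b))   # group law is respected
identity_component = bool(np.linalg.det(lam) > 0 and lam[0, 0] >= 1 - 1e-12)
dimension_matches = dim_spin(3, 1) == 6                          # real dimension of SL(2, C)
```

The check `same_image_for_pm_a` reflects the $\mathbb{Z}_2$ kernel of the short exact sequence in Definition 2.1, while `identity_component` reflects that the image lies in $SO_0(3, 1)$.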
Furthermore, if we endow $\mathbb{C}^4$ with the standard non-degenerate internal product, we can construct the dual Dirac bundle $D^*M$ as the $\mathbb{C}^{4*}$-bundle associated to SM requiring that $(p_1, z_1^*)$ and $(R_A(p_1), z_1^* T(A))$ are equivalent, where ∗ denotes the adjoint with respect to the inner product on $\mathbb{C}^4$ and elements of $\mathbb{C}^{4*}$ are understood as row vectors. Consequently, the dual pairing of $\mathbb{C}^4$ and $\mathbb{C}^{4*}$ extends in a well-defined way to a fiberwise dual pairing of DM and $D^*M$. For the vector bundles we shall use, we also need to define suitable spaces of sections together with a sensible notion of topology. To this end, we shall adapt the definitions in [68, Secs. 2.6.3 and 2.6.4] while, for the geometrical properties of a vector bundle over a differentiable manifold, a potential reader can refer to [45, Chap. 18]; all the definitions are given for the tangent bundle, but they can be straightforwardly adapted to the context we consider.
Definition 2.7. Let E be an arbitrary vector bundle over a spacetime M, the latter considered together with a covering made of coordinate patches. With $\mathcal{E}(E) := C^\infty(M, E)$, we denote the space of smooth sections of E, endowed with the topology induced by the family of seminorms
$$\|f\|_{k,C} := \sup\{|f^{(k)}(x)|,\ x \in C\}, \qquad f \in \mathcal{E}(E),$$
where C ranges over the compact subsets of M and $\cdot^{(k)}$ denotes a derivative of kth order (k ≥ 0), performed in the above mentioned coordinate patch. Furthermore, we introduce the space of smooth sections with compact support $\mathcal{D}(E) := C_0^\infty(M, E)$, equipped with the topology induced by the family of seminorms
$$\|f\|_k := \sup\{|f^{(k)}(x)|\}, \qquad f \in \mathcal{D}(E).$$
In the case E = DM or E = $D^*M$, we can define a global pairing of $\mathcal{E}(D^*M)$ and $\mathcal{D}(DM)$ or $\mathcal{D}(D^*M)$ and $\mathcal{E}(DM)$ by integrating the local pairing induced by the inner product on $\mathbb{C}^4$, i.e.
$$\langle f, g \rangle := \int_M d\mu(x)\, f(x)(g(x)),$$
for all $f \in \mathcal{E}(D^*M)$, $g \in \mathcal{D}(DM)$, where dµ(x) is the canonical, metric induced, volume measure on M. We are finally in the position to define the key object of our analysis:
Definition 2.8. A Dirac spinor is a smooth global section of the Dirac bundle, i.e. $\psi \in \mathcal{E}(DM)$ or, equivalently, if we consider a sufficiently small neighborhood U of any point x ∈ M, ψ is (diffeomorphic to) a 4-vector field $\psi_U: U \to \mathbb{C}^4$, since $DM|_U$ trivializes as $U \times \mathbb{C}^4$. Analogously, we call Dirac cospinor a smooth global section of the dual Dirac bundle, namely, $\psi \in \mathcal{E}(D^*M)$ or, in a sufficiently small neighborhood U, ψ is diffeomorphic to $\psi_U: U \to \mathbb{C}^{4*}$.
We would like to stress that our definition of Dirac spinor fields does not include any equation of motion.
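The claim that the dual pairing of $\mathbb{C}^4$ and $\mathbb{C}^{4*}$ descends to a well-defined fiberwise pairing of DM and $D^*M$ amounts to the identity $(z^* T(A))(T(A^{-1}) z) = z^* z$. The sketch below checks this numerically. It is an illustration under an explicit assumption: the representation T is taken in the block form A ⊕ $(A^\dagger)^{-1}$, one common convention for $D^{(\frac{1}{2},0)} \oplus D^{(0,\frac{1}{2})}$, which the text above does not fix; the function names are hypothetical.

```python
import numpy as np

def random_sl2c(rng):
    """Random complex 2x2 matrix rescaled to unit determinant."""
    m = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    return m / np.sqrt(np.linalg.det(m))

def dirac_rep(a):
    """Assumed block form T(A) = A (+) (A^dagger)^{-1} acting on C^4 = C^2 (+) C^2."""
    t = np.zeros((4, 4), dtype=complex)
    t[:2, :2] = a
    t[2:, 2:] = np.linalg.inv(a.conj().T)
    return t

rng = np.random.default_rng(2)
a, b = random_sl2c(rng), random_sl2c(rng)

# T respects the group law, so it is a representation of SL(2, C).
is_representation = np.allclose(dirac_rep(a @ b), dirac_rep(a) @ dirac_rep(b))

z = rng.normal(size=4) + 1j * rng.normal(size=4)  # fiber component of a spinor
w = rng.normal(size=4) + 1j * rng.normal(size=4)  # fiber component of a cospinor (row vector)

# Change of representative: (p, z) ~ (R_A p, T(A^{-1}) z) and (p, w) ~ (R_A p, w T(A)).
z_new = dirac_rep(np.linalg.inv(a)) @ z
w_new = w @ dirac_rep(a)

# The pairing w(z) does not depend on the chosen representatives.
pairing_invariant = bool(np.isclose(w_new @ z_new, w @ z))
```

The invariance only uses $T(A) T(A^{-1}) = 1$, so it holds for any representation T and not just for the block form assumed here.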
It is quite safe to admit that the introduced geometric objects are rather technical, and one might wonder if the class of spacetimes admitting them is not very restricted. Particularly, from a physical point of view, one is interested in introducing spinors and fields in general as global objects. There is thus a compelling need to understand which set of backgrounds we can really work with, and the answer is surprising and reassuring at the same time. In fact, the following theorem and lemma can be proved (see [8] for the original proof and also [33, 34] for a more physically oriented analysis and proof):
Theorem 2.1. An oriented, time oriented, four-dimensional manifold M admits a spin structure if and only if it has a vanishing second Stiefel–Whitney class or, in other words, if the second de Rham$^{b}$ cohomology class $H^2(M, \mathbb{Z}_2)$ is trivial.
Lemma 2.1. A four-dimensional, globally hyperbolic, as well as oriented and time oriented spacetime M has a trivial second Stiefel–Whitney class.
Proof. We briefly sketch the needed argument. Since the spacetime is globally hyperbolic, it can be split as Σ × R where Σ is a three-dimensional oriented spacelike Cauchy surface. A standard theorem in differential geometry states that all oriented three-dimensional manifolds are parallelizable and this straightforwardly extends to a four-dimensional globally hyperbolic oriented and time-oriented spacetime, thanks to the mentioned splitting. To conclude, one can use the fact, proved, e.g. in [45, Chap. 18], that parallelizability implies the vanishing of the second Stiefel–Whitney class.
This theorem is rather useful because globally hyperbolic spacetimes are the most interesting and natural class of manifolds whenever one deals with both classical and quantum field theories on curved backgrounds. As anticipated in the above proof, each of these spacetimes can be foliated as Σ × R, Σ being a smooth Cauchy surface [6].
Therefore, hereupon one can state precisely the notion of initial value problem for the equations of motion, hence determining the classical dynamically allowed configurations of a field as the solution of a certain partial differential equation.

The parallelizability of globally hyperbolic, four-dimensional spacetimes $M$, i.e. the fact that there always exists a global orthogonal frame on them, implies that $FM$ is a trivial bundle in that case. Consequently, this property extends to $SM$, $TM$, $T^*M$, $DM$, and $D^*M$ as well, where the extension to the spinor-related bundles follows from the results of [33]. We are thus in the position to introduce global frames of the latter four bundles.

(1) Employing a global section $E$ of $SM$, we can define a spin frame $\{E_A\}_{A=1,\dots,4}$, i.e. a set of four global sections of $DM$ as $E_A(x) \doteq [(E(x), z_A)]$, being $z_A$

$^{b}$ In the most general framework, one should consider the second Čech cohomology class, but this coincides with the de Rham one for differentiable manifolds. In this paper, we deal solely with spacetimes which, according to the definition stated at the beginning of the section, are differentiable.
November 18, 2009 11:47 WSPC/148-RMP
1250
J070-00386
C. Dappiaggi, T.-P. Hack & N. Pinamonti
the standard basis of $\mathbb{C}^4$. Hence, any Dirac spinor $\psi$ can be decomposed as $\psi(x) = \psi^A(x) E_A(x)$ where now $\psi^A \in C^\infty(M)$.

(2) A dual spin frame $\{E^B\}_{B=1,\dots,4}$, i.e. a set of four global sections of $D^*M$ can then be automatically constructed out of the frame by means of the natural action of dual vectors on vectors, namely, $E_A(E^B) = \delta_A^B$. From now on capital letter indices will refer to quantities expressed with respect to these bases.

(3) Exploiting Definition 2.5, we can project $E$ to $FM$, hence obtaining a global section $e \doteq \rho \circ E$ of $FM$. Employing this, we can define a Lorentz frame $\{e_a\}_{a=0,\dots,3}$, i.e. a set of four global sections of $TM$, by realizing that $TM$ can be understood as the $\mathbb{R}^4$-bundle associated to $FM$. Such a fiberwise basis is orthonormal in the sense that $g(e_a, e_b) = \eta_{ab}$, where $\eta_{ab}$ is the Minkowski metric (the diagonal metric with $\eta_{00} = -1$ and $\eta_{11} = \eta_{22} = \eta_{33} = 1$), and it is often referred to as the non-holonomic basis of the base manifold; this is in sharp contrast with the standard (holonomic) coordinate basis $\partial_\mu$ which is related to the basis $e_a$ by means of a basis change or, in other words, by the matrix $e^\mu_a$ denoting the coefficients of $e_a$ in their expansion with respect to $\partial_\mu$.

(4) Analogous to the spin case, one can now straightforwardly define a dual Lorentz frame $\{e^b\}_{b=0,\dots,3}$ constructed out of $e_a$, again by the natural action of dual vectors on vectors, as $e_a(e^b) = \delta_a^b$.

From now on, lower-case Roman letters will always refer to quantities expressed with respect to the non-holonomic basis, whereas lower-case Greek ones will indicate those with respect to the holonomic basis. Non-holonomic indices will be "raised" and "lowered" with $\eta_{ab}$, while the same operations will be performed on the holonomic ones using $g_{\mu\nu} = g(\partial_\mu, \partial_\nu)$. The most notable consequence of this new data is that we can decompose every spinor-tensor

$$f \in \mathcal{E}\big(\underbrace{TM \otimes \cdots \otimes TM}_{i} \otimes \underbrace{T^*M \otimes \cdots \otimes T^*M}_{j} \otimes \underbrace{DM \otimes \cdots \otimes DM}_{k} \otimes \underbrace{D^*M \otimes \cdots \otimes D^*M}_{l}\big)$$

as follows

$$f = f^{a_1 \cdots a_i A_1 \cdots A_k}_{b_1 \cdots b_j B_1 \cdots B_l}\; e_{a_1} \otimes \cdots \otimes e_{a_i} \otimes e^{b_1} \otimes \cdots \otimes e^{b_j} \otimes E_{A_1} \otimes \cdots \otimes E_{A_k} \otimes E^{B_1} \otimes \cdots \otimes E^{B_l}.$$
One could in principle certainly choose a different global section $E$ of $SM$ and thus obtain different spin and Lorentz frames which are related to the previous ones by local spin and Lorentz transformations. On the level of coefficients, such a change of frames results in

$$f'^{\,a_1,\dots,a_i,A_1,\dots,A_k}_{\,b_1,\dots,b_j,B_1,\dots,B_l} = (\Lambda^{-1})^{a_1}{}_{a'_1} \cdots (\Lambda^{-1})^{a_i}{}_{a'_i}\, (\tilde\Lambda^{-1})^{A_1}{}_{A'_1} \cdots (\tilde\Lambda^{-1})^{A_k}{}_{A'_k}\, \Lambda^{b'_1}{}_{b_1} \cdots \Lambda^{b'_j}{}_{b_j}\, \tilde\Lambda^{B'_1}{}_{B_1} \cdots \tilde\Lambda^{B'_l}{}_{B_l}\; f^{a'_1 \cdots a'_i A'_1 \cdots A'_k}_{b'_1 \cdots b'_j B'_1 \cdots B'_l},$$

where $\tilde\Lambda \in C^\infty(M, \mathrm{Spin}_0(3,1))$, whereas $\Lambda = \rho_*(\tilde\Lambda) \in C^\infty(M, SO_0(3,1))$.
2.2. On the dynamics of a classical Dirac field

Since we have by now assured ourselves of the existence and well-posedness of a global structure of Dirac fields in a globally hyperbolic time and space-oriented manifold, we shall next proceed to introduce the natural evolution operator out of which one can describe the classical dynamical content of our theory. It is imperative to stress a sharp difference between the previous and this subsection; in the preceding discussion, all the introduced geometric structures have been somehow natural and intrinsic, i.e. no special choice has been performed with the due exception of the $D^{(\frac12,0)} \oplus D^{(0,\frac12)}$ representation to define the Dirac bundle and, eventually, the choice of spin structure in case it is non-unique. Conversely, in the forthcoming discussion, some arbitrariness appears and we shall try to emphasize it to a potential reader, since we have to keep track of it to ensure that it does not play a distinguished role in the forthcoming discussion of the quantum fields. To wit, we are referring to the definition of the so-called γ-matrices. To obtain them, we can proceed as follows (still refer to [50] for more details):

Definition 2.9. We call Clifford algebra $Cl(p,q)$ the real associative algebra generated by the identity $I$ and a set of elements $\gamma_a$ with $a = 1, \dots, p+q$, subject to the relations

$$\{\gamma_a, \gamma_b\} \doteq \gamma_a\gamma_b + \gamma_b\gamma_a = 2\eta_{ab} I, \tag{1}$$

where $\eta$ is the standard diagonal matrix of signature $(p,q)$. Particularly, if $p = 3$ and $q = 1$ or vice-versa, one can refer to $Cl(3,1)$ or $Cl(1,3)$ as Dirac–Clifford algebra.

Remark. It is a direct consequence of this definition and of (1) in particular that the elements $\gamma_a$ are linearly independent and that a basis for the Clifford algebra is given by the identity and by all products $\gamma_{a_1} \cdots \gamma_{a_n}$ with $a_1 < \cdots < a_n$ and $n \le p+q$. This entails that $\dim(Cl(p,q)) = 2^{p+q}$.
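The relations (1) and the basis count in the remark can be checked numerically. The following is a minimal sketch, assuming the explicit 4 × 4 matrices that the text introduces later in (2) and the signature $\eta = \mathrm{diag}(-1,1,1,1)$; the helper names `clifford_defect` and `ordered_products` are illustrative and not taken from the paper.

```python
import itertools
import numpy as np

# Pauli matrices and 2x2 building blocks.
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# gamma_0, ..., gamma_3 as in (2) (assumed reading of that equation).
gamma = [1j * np.block([[I2, Z2], [Z2, -I2]])]
gamma += [1j * np.block([[Z2, s], [-s, Z2]]) for s in sigma]
eta = np.diag([-1.0, 1.0, 1.0, 1.0])


def clifford_defect(gamma, eta):
    """Largest entry of {gamma_a, gamma_b} - 2 eta_ab I over all a, b."""
    n = gamma[0].shape[0]
    return max(
        np.abs(gamma[a] @ gamma[b] + gamma[b] @ gamma[a]
               - 2 * eta[a, b] * np.eye(n)).max()
        for a in range(len(gamma)) for b in range(len(gamma)))


def ordered_products(gamma):
    """Identity plus all products gamma_{a1}...gamma_{an} with a1 < ... < an."""
    n = gamma[0].shape[0]
    products = []
    for r in range(len(gamma) + 1):
        for idx in itertools.combinations(range(len(gamma)), r):
            m = np.eye(n, dtype=complex)
            for a in idx:
                m = m @ gamma[a]
            products.append(m)
    return products


basis = ordered_products(gamma)
# Flatten each product to a vector; full rank means linear independence.
rank = np.linalg.matrix_rank(np.array([m.reshape(-1) for m in basis]))
print(clifford_defect(gamma, eta), len(basis), rank)
```

The rank computation tests linear independence of the represented products over the complex numbers, which in particular implies independence over the reals as stated in the remark.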
As a further important datum, we wish to underline that $Cl(p,q)$ is a $\mathbb{Z}_2$-graded algebra; this arises if we introduce the automorphism $\alpha : Cl(p,q) \to Cl(p,q)$ such that $\alpha(\gamma_a) = -\gamma_a$ for all possible $a$. Since $\alpha^2$ coincides with the identity map, we can always decompose: $Cl(p,q) = Cl^0(p,q) \oplus Cl^1(p,q)$, where $Cl^i(p,q) = \{a \in Cl(p,q) \mid \alpha(a) = (-)^i a\}$. By direct inspection, one can realize that $Cl^0(p,q)$ is the subalgebra of the full Clifford algebra generated by products of even numbers of $\gamma_a$.

The Dirac–Clifford algebra enjoys many relevant properties of great interest for our discussion. As a first step, using the Periodicity Theorem 4.1 in [50], one can prove per direct inspection that $Cl(1,3)^{\mathbb{C}} \doteq Cl(1,3) \otimes \mathbb{C}$ is isomorphic to the algebra $M(4,\mathbb{C})$ of 4 × 4 complex matrices. This entails that there is a complex representation $T : Cl(1,3)^{\mathbb{C}} \to \mathrm{Hom}(\mathbb{C}^4, \mathbb{C}^4)$ and, according to Theorem 5.7, still
in [50], up to equivalence, there is only one of these which is irreducible. Its matrix form can be described by the matrices $\gamma_{aB}^{A}$, which we choose in such a way that $(\gamma_0^*)^{A}{}_{B} = -\gamma_{0B}^{A}$ whereas $(\gamma_a^*)^{A}{}_{B} = \gamma_{aB}^{A}$ for $a = 1, 2, 3$, i.e.

$$\gamma_0 = i\begin{pmatrix} I_2 & 0 \\ 0 & -I_2 \end{pmatrix}, \qquad \gamma_a = i\begin{pmatrix} 0 & \sigma_a \\ -\sigma_a & 0 \end{pmatrix}, \tag{2}$$

where $a = 1, \dots, 3$ and $\sigma_a$ is a Pauli matrix, whereas $I_n$ denotes the $n \times n$-identity matrix. Furthermore, these matrices, independently of the choice in (2), are the so-called γ-matrices and they can always be interpreted as the coefficients of a global (spinor-)tensor $\gamma \in \mathcal{E}(T^*M \otimes DM \otimes D^*M)$ admitting the following expansion:

$$\gamma = \gamma_{aB}^{A}\; e^a \otimes E_A \otimes E^B. \tag{3}$$

This identity yields that, if we use the tetrad coefficients $e^a_\mu$ to define $\gamma_\mu \doteq \gamma_a e^a_\mu$, we recover the standard anticommutation relations for the γ-matrices in curved backgrounds

$$\{\gamma_\mu, \gamma_\nu\} = 2 g_{\mu\nu} I_4. \tag{4}$$
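The passage from the flat relations to (4) is purely algebraic and can be illustrated pointwise. The sketch below assumes the representation (2), the signature $\eta = \mathrm{diag}(-1,1,1,1)$, and an arbitrary made-up tetrad matrix `tetrad[a, mu]` standing for $e^a_\mu$ at a single point; the metric is then built as $g_{\mu\nu} = \eta_{ab} e^a_\mu e^b_\nu$.

```python
import numpy as np


def flat_gammas():
    """gamma_0..gamma_3 in the representation (2), signature (-,+,+,+)."""
    s1 = np.array([[0, 1], [1, 0]], dtype=complex)
    s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
    s3 = np.array([[1, 0], [0, -1]], dtype=complex)
    i2 = np.eye(2, dtype=complex)
    z2 = np.zeros((2, 2), dtype=complex)
    g0 = 1j * np.block([[i2, z2], [z2, -i2]])
    return [g0] + [1j * np.block([[z2, s], [-s, z2]]) for s in (s1, s2, s3)]


eta = np.diag([-1.0, 1.0, 1.0, 1.0])
gamma = flat_gammas()

# Hypothetical tetrad at one point: tetrad[a, mu] = e^a_mu.
rng = np.random.default_rng(0)
tetrad = np.eye(4) + 0.3 * rng.standard_normal((4, 4))

# Metric components g_{mu nu} = eta_{ab} e^a_mu e^b_nu.
metric = tetrad.T @ eta @ tetrad

# Holonomic gamma matrices gamma_mu = gamma_a e^a_mu.
gamma_hol = [sum(tetrad[a, mu] * gamma[a] for a in range(4))
             for mu in range(4)]

# Largest deviation from {gamma_mu, gamma_nu} = 2 g_{mu nu} I_4.
defect = max(
    np.abs(gamma_hol[m] @ gamma_hol[n] + gamma_hol[n] @ gamma_hol[m]
           - 2 * metric[m, n] * np.eye(4)).max()
    for m in range(4) for n in range(4))
print(defect)
```

The check only uses bilinearity of the anticommutator, so it holds for any real matrix `tetrad`; invertibility matters for the geometry but not for the identity itself.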
This is the net effect of changing from a non-holonomic to a holonomic basis. Let us stress at this point that we have chosen a specific representation of the Dirac–Clifford algebra out of the possible equivalent ones and indeed this is the arising arbitrariness announced at the beginning of this subsection. The choice of different representations of the Dirac–Clifford algebra can be shown to lead to quantum field theories which are equivalent up to gauge transformations [66] and, moreover, if one restricts to the quantum observables, the choice of a representation becomes even irrelevant, as we will discuss in the next section.

Remark. A second interesting property of the Clifford algebra arises from the realization that it contains the spin group (see [50, §2]) and, most notably, $\mathrm{Spin}(p,q) \subset Cl^0(p,q)$. More precisely, one can introduce the multiplicative group of units in the Clifford algebra as the set of elements $a \in Cl(p,q)$ such that there exists $a' \in Cl(p,q)$ fulfilling $aa' = a'a = 1$ with respect to the algebra multiplication. The subset of these elements whose square, out of (1), is equal to $\pm 1$ is called the “Pin group” $\mathrm{Pin}(p,q)$ and $\mathrm{Spin}(p,q) \equiv \mathrm{Pin}(p,q) \cap Cl^0(p,q)$. If we now recall Definition 2.1, this realization of the spin group allows a somehow preferred choice for the surjective covering homomorphism $\Pi$ (still see [50, §2]) as the map $\widetilde{\mathrm{Ad}} : \mathrm{Spin}(p,q) \to SO(p,q)$ defined out of

$$\widetilde{\mathrm{Ad}}_g(g') \doteq g' - 2\,\frac{\eta(g,g')}{\eta(g,g)}\, g.$$

Here, the operation is meant with respect to the identification of the involved subgroups as subsets of the Clifford algebra, on which the metric $\eta$ is meaningful as per Definition 2.9. Notice that, with respect to this choice, $\widetilde{\mathrm{Ad}}$ yields the universal covering homomorphism between $\mathrm{Spin}(p,q)$ and $SO(p,q)$ whenever either $q = 0$ and $p \ge 3$ or vice versa. The same holds between the components connected to the respective identities $\mathrm{Spin}_0(p,q)$ and $SO_0(p,q)$ whenever $q = 1$ and $p \ge 3$ or vice versa.
It is thus natural to wonder how the above introduced representation yielding the γ-matrices restricts to the spin group, and the answer to this query comes from Proposition 5.15, still in [50], which guarantees us that each irreducible complex representation of the Clifford algebra on a vector space can be restricted to the sum of two inequivalent irreducible representations of the spin group. In the case under analysis, one can by direct inspection realize that the restriction of $T$ to $\mathrm{Spin}(3,1)$ yields the sum of two non-equivalent irreducible representations, which, on $SL(2,\mathbb{C}) \simeq \mathrm{Spin}_0(3,1)$, coincide with the previously mentioned $D^{(\frac12,0)} \oplus D^{(0,\frac12)}$ representation.

According to our analysis, for any vector field $v \in \mathcal{E}(TM)$, we can meaningfully introduce $\slashed{v} \doteq v^a\gamma_a$, which is an element of $\mathcal{E}(DM \otimes D^*M)$, such that its coefficients $\slashed{v}^{A}{}_{B}$ form a 4 × 4 complex matrix. Notice that, from now on, we will not specify explicitly when we shall deal with abstract Dirac–Clifford algebra elements $\gamma_a$ or with their matrix representations. It is understood that, whenever we either contract γ-matrices with a vector field or these matrices are applied to a vector in $\mathbb{C}^4$, we refer to the latter case.

The last ingredient we need to specify the dynamics of Dirac fields is a parallel transport on the Dirac bundle. The grand strategy is rather simple, namely, we introduce the standard metric connection, interpret it on the frame bundle and eventually lift it to both the spin and the Dirac bundle:

Definition 2.10. Let $M$ be a spacetime and let us introduce the standard covariant derivative on the tangent bundle $TM$,

$$\nabla : \mathcal{E}(TM) \to \mathcal{E}(TM \otimes T^*M), \qquad \nabla e_c = \Gamma^b_{ac}\, e_b \otimes e^a,$$

which is constructed out of the Levi–Civita connection $\Gamma^b_{ac}$. Then we call $\omega : FM \to T^*FM \otimes o(3,1)$ the connection 1-form on $FM$ induced out of $\Gamma$ as $\Gamma^b_{ac} \doteq e^b(e_*\omega(e_a) e_c)$, with $e_* : T^*M \to T^*FM$ denoting the push-forward of $e$ in the sense of cotangent vectors. At the same time, if we denote as $d\Pi$ the derivative of the covering map $\Pi : SL(2,\mathbb{C}) \to SO(3,1)$ calculated at the identity, we shall call the pull-back $\Omega \doteq (d\Pi)^{-1} \circ \rho^* \circ \omega$ of $\omega$ to $SM$ the spin connection which is related to the following covariant derivative

$$\nabla : \mathcal{E}(DM) \to \mathcal{E}(T^*M \otimes DM), \qquad \nabla E_A = \sigma_{aA}^{B}\, e^a \otimes E_B,$$

out of the defining equality

$$\sigma_{aA}^{B} \doteq E^B(E_*\Omega(e_a) E_A), \tag{5}$$
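where $E_* : T^*M \to T^*SM$ denotes the push-forward of $E$ in the sense of cotangent vectors.

We will soon prove that the coefficients $\sigma_{aA}^{B}$ can be expressed in a simple way by means of both the coefficients $\Gamma^b_{ac}$ and the γ-matrices. Before we tackle such problem, let us stress that the covariant derivatives $\nabla$ from Definition 2.10 can be straightforwardly extended to any spinor-tensor

$$f \in \mathcal{E}\big(\underbrace{TM \otimes \cdots \otimes TM}_{i} \otimes \underbrace{T^*M \otimes \cdots \otimes T^*M}_{j} \otimes \underbrace{DM \otimes \cdots \otimes DM}_{k} \otimes \underbrace{D^*M \otimes \cdots \otimes D^*M}_{l}\big)$$

by defining

$$\nabla : \mathcal{E}\big(\underbrace{TM \otimes \cdots \otimes TM}_{i} \otimes \underbrace{T^*M \otimes \cdots \otimes T^*M}_{j} \otimes \underbrace{DM \otimes \cdots \otimes DM}_{k} \otimes \underbrace{D^*M \otimes \cdots \otimes D^*M}_{l}\big) \to \mathcal{E}\big(\underbrace{TM \otimes \cdots \otimes TM}_{i} \otimes \underbrace{T^*M \otimes \cdots \otimes T^*M}_{j+1} \otimes \underbrace{DM \otimes \cdots \otimes DM}_{k} \otimes \underbrace{D^*M \otimes \cdots \otimes D^*M}_{l}\big). \tag{6}$$

At a level of components and, for notational simplicity, only in the case $i = j = l = k = 1$, (6) reads

$$\nabla f = e^c \nabla_c\big(f^{aA}_{bB}\; e_a \otimes e^b \otimes E_A \otimes E^B\big) = \big[\partial_c f^{aA}_{bB} - \Gamma^d_{cb} f^{aA}_{dB} + \Gamma^a_{cd} f^{dA}_{bB} + \sigma_{cC}^{A} f^{aC}_{bB} - \sigma_{cB}^{C} f^{aA}_{bC}\big]\, e^c \otimes e_a \otimes e^b \otimes E_A \otimes E^B.$$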
As promised, we shall now give an explicit expression for the σ-coefficients. Although a demonstration of the next lemma is already present in [51] (see also [81, Chap. 13]), we shall prove it again due to ubiquitous sign subtleties arising from the choice of the metric signature and possibly leading to confusions when comparing with the literature, [25] in particular.

Lemma 2.2. The connection coefficients of the spin connection can be expressed as

$$\sigma_{aA}^{B} = \frac14\, \Gamma^b_{ad}\, \gamma_{bC}^{B}\, \gamma^{dC}{}_{A}. \tag{7}$$

Proof. The strategy of the proof is the following: we first derive an explicit expression for the double covering homomorphism $\Pi : \mathrm{Spin}_0(3,1) \simeq SL(2,\mathbb{C}) \to SO_0(3,1)$. From this we obtain an expression for its derivative at the identity $d\Pi : sl(2,\mathbb{C}) \to o(3,1)$, which, inserted in (5), yields the wished-for result.
Let us thus recall that $\mathrm{Spin}_0(3,1)$ can be understood as a subgroup of the algebra $Cl(3,1)$. This entails that, for any $\tilde\Lambda \in SL(2,\mathbb{C})$, we can define the adjoint action

$$\mathrm{Ad}_{\tilde\Lambda}(\gamma_a) \doteq \tilde\Lambda \gamma_a \tilde\Lambda^{-1}.$$

Being $SL(2,\mathbb{C})$ a finite cover of $SO_0(3,1)$, applying an induction-reduction mechanism, we know there exists a representation $T$ of $SO_0(3,1)$ on Minkowski space such that, setting $\Lambda \doteq \Pi(\tilde\Lambda)$, we have (see also a direct proof in the theorem at [51, p. 29])

$$T(\Lambda)\gamma_a = \mathrm{Ad}_{\tilde\Lambda}(\gamma_a). \tag{8}$$
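Relation (8) can be illustrated for a concrete group element. The sketch below assumes the representation (2), writes $T(\Lambda)\gamma_a = \Lambda^b{}_a\gamma_b$ as the proof does next, and anticipates the generator $\lambda = \frac14\mu_{ab}\gamma^a\gamma^b$ obtained at the end of the proof; the antisymmetric matrix `mu_low` is random test data and `expm_series` is a hand-written truncated exponential, not a library routine.

```python
import numpy as np


def flat_gammas():
    """gamma_0..gamma_3 in the representation (2), signature (-,+,+,+)."""
    s1 = np.array([[0, 1], [1, 0]], dtype=complex)
    s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
    s3 = np.array([[1, 0], [0, -1]], dtype=complex)
    i2 = np.eye(2, dtype=complex)
    z2 = np.zeros((2, 2), dtype=complex)
    g0 = 1j * np.block([[i2, z2], [z2, -i2]])
    return [g0] + [1j * np.block([[z2, s], [-s, z2]]) for s in (s1, s2, s3)]


def expm_series(a, terms=60):
    """Matrix exponential by a truncated power series (fine for small norms)."""
    result = np.eye(a.shape[0], dtype=complex)
    term = np.eye(a.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ a / k
        result = result + term
    return result


eta = np.diag([-1.0, 1.0, 1.0, 1.0])
eta_inv = np.linalg.inv(eta)
gamma = flat_gammas()
gamma_up = [sum(eta_inv[a, b] * gamma[b] for b in range(4)) for a in range(4)]

rng = np.random.default_rng(1)
m = 0.5 * rng.standard_normal((4, 4))
mu_low = m - m.T                 # mu_{ab}, antisymmetric
mu_mixed = eta_inv @ mu_low      # mu^b_a, an element of o(3,1)

# Generator in the spinor representation: lambda = (1/4) mu_ab gamma^a gamma^b.
lam = 0.25 * sum(mu_low[a, b] * gamma_up[a] @ gamma_up[b]
                 for a in range(4) for b in range(4))

big_lambda = expm_series(mu_mixed).real   # Lorentz transformation Lambda
lambda_tilde = expm_series(lam)           # its lift to the spin group
lambda_tilde_inv = expm_series(-lam)

# Deviation from  Lambda~ gamma_a Lambda~^{-1} = Lambda^b_a gamma_b.
covering_defect = max(
    np.abs(lambda_tilde @ gamma[a] @ lambda_tilde_inv
           - sum(big_lambda[b, a] * gamma[b] for b in range(4))).max()
    for a in range(4))
# Deviation from  Lambda^T eta Lambda = eta.
lorentz_defect = np.abs(big_lambda.T @ eta @ big_lambda - eta).max()
print(covering_defect, lorentz_defect)
```

Replacing `lam` by `-lam` in both exponentials changes nothing on the left-hand side only if the sign of `mu_mixed` is flipped as well, which is a quick way to see that the sign conventions on both sides are tied together.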
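Since we are dealing with Lie groups, we recall that all finite-dimensional representations are matrix representations, i.e. we can simply write $T(\Lambda)\gamma_a \doteq \Lambda^b{}_a \gamma_b$. Hence, $\tilde\Lambda \gamma_a \tilde\Lambda^{-1} = \Lambda^b{}_a \gamma_b$, where we notice the invariance of the left-hand side under the $\mathbb{Z}_2$-action sending $\tilde\Lambda \to \pm\tilde\Lambda$, as one could have expected, being $SL(2,\mathbb{C})$ the double cover of $SO_0(3,1)$.

Let us now take an arbitrary differentiable path $t \mapsto \tilde\Lambda(t)$ in $SL(2,\mathbb{C})$ whose projection on $SO_0(3,1)$ is differentiable; the following identity holds

$$\tilde\Lambda(t)\gamma_a\tilde\Lambda(t)^{-1} = \Lambda(t)^b{}_a \gamma_b.$$

If we derive with respect to $t$, an identity between algebra representations arises, namely,

$$\frac{d\tilde\Lambda(t)}{dt}\gamma_a\tilde\Lambda(t)^{-1} + \tilde\Lambda(t)\gamma_a\frac{d\tilde\Lambda(t)^{-1}}{dt} = \left(\frac{d\Lambda(t)}{dt}\right)^{b}{}_{a}\gamma_b$$

$$\Leftrightarrow\quad \frac{d\tilde\Lambda(t)}{dt}\gamma_a\tilde\Lambda(t)^{-1} - \tilde\Lambda(t)\gamma_a\tilde\Lambda(t)^{-1}\frac{d\tilde\Lambda(t)}{dt}\tilde\Lambda(t)^{-1} = \left(\frac{d\Lambda(t)}{dt}\right)^{b}{}_{a}\gamma_b, \tag{9}$$

where we have exploited that the derivation of $\tilde\Lambda(t)\tilde\Lambda(t)^{-1} = I$ yields

$$\frac{d\tilde\Lambda(t)^{-1}}{dt} = -\tilde\Lambda(t)^{-1}\frac{d\tilde\Lambda(t)}{dt}\tilde\Lambda(t)^{-1}.$$

If we apply the adjoint action of $\tilde\Lambda(t)^{-1}$ to (9), we end up with

$$\tilde\Lambda(t)^{-1}\frac{d\tilde\Lambda(t)}{dt}\gamma_a - \gamma_a\tilde\Lambda(t)^{-1}\frac{d\tilde\Lambda(t)}{dt} = \mathrm{Ad}_{\tilde\Lambda(t)^{-1}}\left[\left(\frac{d\Lambda(t)}{dt}\right)^{b}{}_{a}\gamma_b\right].$$

We use (8) as well as the basic property of a representation, namely, $\mathrm{Ad}(\tilde\Lambda^{-1}) \equiv \mathrm{Ad}(\tilde\Lambda)^{-1}$, to derive

$$\tilde\Lambda(t)^{-1}\frac{d\tilde\Lambda(t)}{dt}\gamma_a - \gamma_a\tilde\Lambda(t)^{-1}\frac{d\tilde\Lambda(t)}{dt} = (\Lambda^{-1})^{c}{}_{b}\left(\frac{d\Lambda(t)}{dt}\right)^{b}{}_{a}\gamma_c = \left(\Lambda^{-1}\frac{d\Lambda(t)}{dt}\right)^{b}{}_{a}\gamma_b.$$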
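If we call $\lambda \doteq \tilde\Lambda(t)^{-1}\frac{d\tilde\Lambda(t)}{dt}$ and $\mu \doteq \Lambda(t)^{-1}\frac{d\Lambda(t)}{dt}$, it holds $\lambda\gamma_a - \gamma_a\lambda = \mu^b{}_a\gamma_b$. Let us multiply this identity on the right with $\gamma^a$ and, if we bear in mind that $\gamma^a\gamma_a = \eta^{ab}\gamma_a\gamma_b = \eta^{ab}\eta_{ab} = 4$, we end up with

$$4\lambda - \gamma_a\lambda\gamma^a = \mu_{ab}\gamma^a\gamma^b. \tag{10}$$

Taking into account the antisymmetry of $\mu \in o(3,1)$ and the identity $\gamma^a\gamma^{[b}\gamma^{c]}\gamma_a = 0$, a possible solution of (10) is

$$\lambda = \frac14\,\mu_{ab}\gamma^a\gamma^b.$$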
The uniqueness of this solution is not guaranteed, since, being the left-hand side of (10) linear in $\lambda$, we have merely found a particular solution, and we are free to add any further solution of the homogeneous counterpart, i.e. any $\bar\lambda$ such that $4\bar\lambda - \gamma_a\bar\lambda\gamma^a = 0$. This implies that $\gamma^a[\gamma_a, \bar\lambda] = 0$, but, since the γ-matrices are linearly independent, the last equality yields $[\gamma_a, \bar\lambda] = 0$. Therefore we can apply Schur's lemma (see [36, Theorem 4.26]) which entails that $\bar\lambda = kI$, being $I$ the identity.

The value of $k$ can be unambiguously determined if we notice that, according to our previous construction, $\bar\lambda$ is an element in the algebra of $SL(2,\mathbb{C})$, which consists of matrices with vanishing trace. Therefore, if we impose $\mathrm{Tr}(\bar\lambda) = \mathrm{Tr}(kI) = 0$, the only possibility is $k = 0$.

Since the differentiable path chosen in the proof is arbitrary, we have now proven the explicit form of $d\Pi$ in terms of its inverse, namely,

$$(d\Pi)^{-1} : o(3,1) \to sl(2,\mathbb{C}), \qquad (d\Pi)^{-1}(\mu_{ab}) = \frac14\,\mu_{ab}\gamma^a\gamma^b \quad \forall \mu_{ab} \in o(3,1).$$

Remembering the definition of $\Omega$, recalling $e = \rho \circ E$, and inserting the expression of $(d\Pi)^{-1}$ into (5), we finally obtain

$$\sigma_{aA}^{B} = \frac14\, \Gamma^b_{ad}\, \gamma_{bC}^{B}\, \gamma^{dC}{}_{A}.$$
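The algebraic core of the proof lends itself to a numerical cross-check. The sketch below assumes the representation (2) and a random antisymmetric `mu_low` playing the role of $\mu_{ab}$; it tests the commutator identity $\lambda\gamma_a - \gamma_a\lambda = \mu^b{}_a\gamma_b$, equation (10), the vanishing trace of $\lambda$, and the claim that the homogeneous equation $4\bar\lambda - \gamma_a\bar\lambda\gamma^a = 0$ has a one-dimensional solution space. The last step represents the linear map $X \mapsto 4X - \gamma_a X\gamma^a$ as a 16 × 16 matrix via Kronecker products, which is a choice of this sketch rather than the argument used in the text.

```python
import numpy as np


def flat_gammas():
    """gamma_0..gamma_3 in the representation (2), signature (-,+,+,+)."""
    s1 = np.array([[0, 1], [1, 0]], dtype=complex)
    s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
    s3 = np.array([[1, 0], [0, -1]], dtype=complex)
    i2 = np.eye(2, dtype=complex)
    z2 = np.zeros((2, 2), dtype=complex)
    g0 = 1j * np.block([[i2, z2], [z2, -i2]])
    return [g0] + [1j * np.block([[z2, s], [-s, z2]]) for s in (s1, s2, s3)]


eta = np.diag([-1.0, 1.0, 1.0, 1.0])
eta_inv = np.linalg.inv(eta)
gamma = flat_gammas()
gamma_up = [sum(eta_inv[a, b] * gamma[b] for b in range(4)) for a in range(4)]

rng = np.random.default_rng(2)
m = rng.standard_normal((4, 4))
mu_low = m - m.T               # mu_{ab}, antisymmetric
mu_mixed = eta_inv @ mu_low    # mu^b_a

lam = 0.25 * sum(mu_low[a, b] * gamma_up[a] @ gamma_up[b]
                 for a in range(4) for b in range(4))

# lambda gamma_a - gamma_a lambda = mu^b_a gamma_b
commutator_defect = max(
    np.abs(lam @ gamma[a] - gamma[a] @ lam
           - sum(mu_mixed[b, a] * gamma[b] for b in range(4))).max()
    for a in range(4))

# Equation (10): 4 lambda - gamma_a lambda gamma^a = mu_ab gamma^a gamma^b
lhs10 = 4 * lam - sum(gamma[a] @ lam @ gamma_up[a] for a in range(4))
rhs10 = sum(mu_low[a, b] * gamma_up[a] @ gamma_up[b]
            for a in range(4) for b in range(4))
eq10_defect = np.abs(lhs10 - rhs10).max()

trace_lam = abs(np.trace(lam))

# Homogeneous map X -> 4X - gamma_a X gamma^a on row-major flattened X:
# vec(A X B) = kron(A, B^T) vec(X).
hom = 4 * np.eye(16, dtype=complex) - sum(
    np.kron(gamma[a], gamma_up[a].T) for a in range(4))
kernel_dim = 16 - np.linalg.matrix_rank(hom)
print(commutator_defect, eq10_defect, trace_lam, kernel_dim)
```

A kernel of dimension one is consistent with $\bar\lambda = kI$, after which the trace condition forces $k = 0$ as in the proof.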
As a direct application of the above result, we can prove the following important lemma. Note that its result is sometimes used to define the spin connection in more pragmatic textbooks.

Lemma 2.3. The Dirac–Clifford γ-matrices are covariantly constant, i.e. $\nabla\gamma = 0$.

Proof. This is a straightforward calculation, once a subtlety has been clarified: the matrices $\gamma$ are constructed as the matrix form of an irreducible representation of the Clifford algebra and then subsequently glued to each point of the underlying base manifold $M$ via (3). This prescription entails that $\partial_a\gamma_b = 0$ for any $a, b = 0, \dots, 3$. We can now compute

$$\nabla\gamma = e^b\,\nabla_b\big(\gamma_{aB}^{A}\; e^a \otimes E_A \otimes E^B\big),$$
which, exploiting Definition 2.10 and the antisymmetry of the matrices $\Gamma_a$, seen as elements in $o(3,1)$, becomes

$$\nabla\gamma = e^b\big[\partial_b\gamma_{aB}^{A}\; e^a \otimes E_A \otimes E^B + \gamma_{aB}^{A}\big({-\Gamma^a_{bc}}\, e^c \otimes E_A \otimes E^B + \sigma_{bA}^{C}\, e^a \otimes E_C \otimes E^B - \sigma_{bC}^{B}\, e^a \otimes E_A \otimes E^C\big)\big].$$

If we apply formula (6) to sections of $T^*M \otimes DM \otimes D^*M$ and consider that the matrix elements are constant functions, the lemma is proved out of direct substitution.

We are ready, at last, to describe the dynamically allowed configurations of a spinor field in a curved background. Particularly, if we now understand multiplication with the γ-matrices to act from the left on spinors and from the right on cospinors, we shall call dynamically allowed a Dirac (co)spinor $\psi^{(\prime)}$ which satisfies the Dirac equation

$$D\psi \doteq (-\slashed{\nabla} + mI)\psi = 0, \tag{11}$$

$$D'\psi' \doteq (\slashed{\nabla} + mI)\psi' = 0, \tag{12}$$

where $I$ stands for the identity on the relevant spaces, $\psi \in \mathcal{E}(DM)$, whereas $\psi' \in \mathcal{E}(D^*M)$. Bearing in mind the necessary differences in the matrix multiplication order, we define both $D$ and $D'$ also on $\mathcal{E}(D^*M)$ and $\mathcal{E}(DM)$, respectively, by the same expressions as above.

Remark. Usually, in Minkowski spacetime, the Dirac equation is most notable for having a purely imaginary coefficient $i$ in front of the Dirac operator. Here, such a number does not appear and the underlying reason is rooted in the employed convention for both the metric signature and the sign of the defining anticommutation relations for the Clifford algebra. To wit, $i$ is absorbed in the definition of the γ-matrices, as stated in (2).

2.3. The Cauchy problem and the fundamental solutions

In this subsection we shall discuss the classical initial value problem for Dirac fields. As it is well known from flat spacetimes, a solution of the Dirac equation is usually related to a solution of a second order hyperbolic differential equation.

Lemma 2.4. The following two assertions hold:

• $DD' = D'D = -P \doteq -\nabla_a\nabla^a + \left(\frac{R}{4} + m^2\right) I_4$,

• every solution $\psi$ of (11) is also a solution of the spinorial Klein–Gordon equation

$$P\psi = \left(\nabla_a\nabla^a - \frac{R}{4} - m^2\right)\psi = 0, \tag{13}$$

where $R$ is the scalar curvature of $(M, g)$. A similar statement holds for each cospinor $\psi'$, solution of (12).
Proof. Let us consider the first of the two statements, whose proof entails immediately also the second one. To wit, let us take

$$D'D = DD' = (\slashed{\nabla} + mI_4)(-\slashed{\nabla} + mI_4) = (-\slashed{\nabla}\slashed{\nabla} + m^2 I_4),$$

which means that we need to prove that $\slashed{\nabla}^2 = \nabla_a\nabla^a - \frac{R}{4} I_4$. To this end, let us write

$$\slashed{\nabla}^2 = \gamma^a\nabla_a\gamma^b\nabla_b = \gamma^a\gamma^b\nabla_a\nabla_b = \gamma^{(a}\gamma^{b)}\nabla_a\nabla_b + \gamma^{[a}\gamma^{b]}\nabla_a\nabla_b,$$

where in the second equality we have exploited Lemma 2.3, whereas in the third one we split the expression in its symmetric and antisymmetric part. Since $\gamma^{(a}\gamma^{b)} = \frac12\{\gamma^a, \gamma^b\}$ is equal to the metric times the identity, it holds

$$\slashed{\nabla}^2 = \nabla_a\nabla^a + \gamma^{[a}\gamma^{b]}\nabla_a\nabla_b = \nabla_a\nabla^a + \gamma^a\gamma^b\nabla_{[a}\nabla_{b]} = \nabla_a\nabla^a + \frac12\gamma^a\gamma^b C_{ab}, \tag{14}$$

with $C_{ab}$ denoting the curvature tensor of the spin connection as defined in (42). Let us briefly state some properties of $C_{ab}$, which are examined in more detail in Appendix A.1: firstly, from Lemma 2.2 one can infer that

$$C_{ab} = \frac14 R_{abcd}\gamma^c\gamma^d.$$

Employing the Clifford relations and the symmetry properties of the Riemann tensor $R_{abcd}$, it is straightforward to show that

$$\gamma^a\gamma^b C_{ab} = -\gamma^b\gamma^a C_{ab} = -\frac12\gamma^b\gamma^a R_{ab} = -\frac{R}{2}.$$

Inserting this into (14), we finally obtain

$$\slashed{\nabla}^2 = \nabla_a\nabla^a - \frac{R}{4} I_4$$

and thus

$$D'D = DD' = -P = -\nabla_a\nabla^a + \left(\frac{R}{4} + m^2\right) I_4.$$
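The step from $C_{ab} = \frac14 R_{abcd}\gamma^c\gamma^d$ to $\frac12\gamma^a\gamma^b C_{ab} = -\frac{R}{4}$ depends only on the Clifford relations and on the algebraic symmetries of the Riemann tensor, so it can be tested on any tensor with those symmetries. The sketch below assumes the representation (2), builds a made-up algebraic curvature tensor as a Kulkarni–Nomizu product of two random symmetric matrices `h` and `k`, and assumes the contraction convention $R = \eta^{ac}\eta^{bd}R_{abcd}$; none of these choices comes from the paper.

```python
import numpy as np


def flat_gammas():
    """gamma_0..gamma_3 in the representation (2), signature (-,+,+,+)."""
    s1 = np.array([[0, 1], [1, 0]], dtype=complex)
    s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
    s3 = np.array([[1, 0], [0, -1]], dtype=complex)
    i2 = np.eye(2, dtype=complex)
    z2 = np.zeros((2, 2), dtype=complex)
    g0 = 1j * np.block([[i2, z2], [z2, -i2]])
    return [g0] + [1j * np.block([[z2, s], [-s, z2]]) for s in (s1, s2, s3)]


eta = np.diag([-1.0, 1.0, 1.0, 1.0])
eta_inv = np.linalg.inv(eta)
gamma = flat_gammas()
# Stack of gamma^a = eta^{ab} gamma_b, shape (4, 4, 4).
gamma_up = np.stack([sum(eta_inv[a, b] * gamma[b] for b in range(4))
                     for a in range(4)])

# Hypothetical curvature data: Kulkarni-Nomizu product of symmetric h, k.
rng = np.random.default_rng(3)
h = rng.standard_normal((4, 4)); h = h + h.T
k = rng.standard_normal((4, 4)); k = k + k.T
riemann = (np.einsum('ac,bd->abcd', h, k) + np.einsum('bd,ac->abcd', h, k)
           - np.einsum('ad,bc->abcd', h, k) - np.einsum('bc,ad->abcd', h, k))

# First Bianchi identity R_abcd + R_acdb + R_adbc = 0 as a sanity check.
bianchi_defect = np.abs(riemann
                        + np.einsum('acdb->abcd', riemann)
                        + np.einsum('adbc->abcd', riemann)).max()

# Scalar curvature under the assumed convention R = eta^{ac} eta^{bd} R_abcd.
scalar = np.einsum('abcd,ac,bd->', riemann, eta_inv, eta_inv)

# (1/2) gamma^a gamma^b C_ab with C_ab = (1/4) R_abcd gamma^c gamma^d.
half_term = 0.125 * np.einsum('abcd,aij,bjk,ckl,dlm->im',
                              riemann, gamma_up, gamma_up, gamma_up, gamma_up)
lichnerowicz_defect = np.abs(half_term + 0.25 * scalar * np.eye(4)).max()
print(bianchi_defect, lichnerowicz_defect)
```

With the opposite contraction convention for the Ricci tensor the sign of `scalar` flips, so agreement here should be read as a consistency check of the stated conventions and not as an independent determination of them.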
Remarks. One should notice that the above lemma can be seen as a particular application of Weitzenböck's formula [50]. It is furthermore worth noting that $\nabla_a\nabla^a$ in the above expression is not diagonal in the spinor indices, and thus it is not the wave operator times the identity. Its principal symbol, however, is indeed diagonal and even of metric type: $g^{\mu\nu}k_\mu k_\nu$. Let us also stress that, in sharp contrast to the scalar Klein–Gordon equation in four spacetime dimensions, there is no freedom to select a coupling for Dirac fields to the scalar curvature since this is universally fixed to $\frac14$.

The introduction of the Dirac operator and the discussion of its main properties allow us to state and to prove the main theorem related to the classical dynamical
behavior of spinors on curved backgrounds:

Theorem 2.2. Let $(M, g)$ be a four-dimensional, globally hyperbolic, oriented and time oriented spacetime and let $\Sigma$ be a smooth three-dimensional Cauchy hypersurface, whose smooth embedding in $M$ is denoted $\iota : \Sigma \to M$. If we refer to $D\Sigma$ as the bundle constructed out of the pull-back of $DM$ via $\iota$, then the following Cauchy problem admits a unique solution

$$\begin{cases} D\psi = 0, \\ \iota^*\psi \equiv \psi_0, \end{cases} \tag{15}$$

where $\psi \in \mathcal{E}(DM)$ and $\psi_0 \in \mathcal{D}(D\Sigma)$.

Proof. For notational simplicity, we will omit the embedding $\iota$ and just write $\psi|_\Sigma$ instead of $\iota^*\psi$ for any $\psi \in \mathcal{E}(DM)$. According to Lemma 2.4, each solution of the Dirac equation also solves a spinorial counterpart of the Klein–Gordon equation, namely, $P\psi = (\nabla_a\nabla^a - \frac{R}{4} - m^2)\psi = 0$ where $P$ has metric principal symbol and is thus a normally hyperbolic differential operator. Hence, we can invoke the results on hyperbolic partial differential equations (see [4, Theorem 3.2.11] in particular), which guarantee us that any Cauchy problem for compactly supported initial data for the considered partial differential equation admits a unique smooth solution supported in the causal past and future of the initial datum. The only problem left is to give a prescription on how to switch from a Cauchy problem for the Dirac equation, hence with only one given initial datum, to one for a second order hyperbolic partial differential equation, where two data on the Cauchy surface have to be prescribed. To solve this dilemma, let us not deal with (15), but with the following system:

$$\begin{cases} \left(\nabla_a\nabla^a - \frac{R}{4} - m^2\right) u = 0, \\ u|_\Sigma = 0, \quad -\left.\frac{\partial u}{\partial n}\right|_\Sigma \equiv \slashed{n}\psi_0, \end{cases}$$

where $u \in \mathcal{E}(DM)$ and $n$ denotes the vector field normal to $\Sigma$ such that $\eta^{ab}n_a n_b = -1$. As stated before, such a system admits a unique global smooth solution $\tilde u$ and, hence, let us introduce the smooth section $\tilde\psi \doteq D'\tilde u$. It is immediate to see that $\tilde\psi$ is a solution of the Dirac equation $D\psi = 0$. Since $\tilde\psi$ is smooth, it can be directly evaluated on the Cauchy surface $\Sigma$ where $\tilde\psi|_\Sigma = D'\tilde u|_\Sigma = (\slashed{\nabla} + m)\tilde u|_\Sigma$. Inserting the initial condition for $\tilde u$, the second term vanishes, whereas the first reads

$$\gamma^a\nabla_a\tilde u|_\Sigma = \gamma^a n_a \left.\frac{\partial\tilde u}{\partial n}\right|_\Sigma = -\gamma^a n_a\gamma^b n_b\psi_0 = -\eta^{ab}n_a n_b\psi_0 = \psi_0.$$

Hence, the section $\tilde\psi$ constitutes a solution of the Cauchy problem in the thesis of the theorem. Furthermore it is also unique, since, if a second solution, say $\tilde\psi'$, with
the same initial data would exist, then also $\tilde\psi - \tilde\psi'$ would, per linearity, solve the Dirac equation, though with vanishing Cauchy data; the only possibility is thus that the above difference vanishes.

Since our aim is to deal with quantum field theory over curved backgrounds in the long run, it is more useful to prove not only the existence and uniqueness of the solution of a Cauchy problem, but also the existence of the so-called fundamental solution, which is the last relevant theorem of the classical theory that we shall need:

Theorem 2.3. The Dirac operator admits unique advanced ($-$) and retarded ($+$) fundamental solutions, i.e. continuous linear maps $S^\pm : \mathcal{D}(DM) \to \mathcal{E}(DM)$ fulfilling $DS^\pm = I = S^\pm D$. These maps are determined by their support properties

$$\mathrm{supp}(S^\pm f) \subset J^\pm(\mathrm{supp}\, f), \qquad \forall f \in \mathcal{D}(DM),$$

with $J^\pm(U)$ denoting the causal future/past of the set $U$. Similarly, there exist unique advanced and retarded fundamental solutions $S_*^\pm : \mathcal{D}(D^*M) \to \mathcal{E}(D^*M)$ of $D'$.

Proof. The strategy will be similar to the one employed in the proof of the existence of the solution of the Cauchy problem. Thus, let us start from $P = -D'D = -DD'$; since this is a normally hyperbolic differential operator, we already know (see [4, 44]) that there exists a unique retarded, say $E^+$, and a unique advanced, say $E^-$, fundamental solution for $P$ on $\mathcal{D}(DM)$. Hence, for any $f \in \mathcal{D}(DM)$, we know that

$$PE^\pm = I = E^\pm P \qquad\text{and}\qquad \mathrm{supp}(E^\pm f) \subset J^\pm(\mathrm{supp}\, f).$$

We can now define $S^\pm \doteq -D'E^\pm$ which, per direct inspection, satisfies $DS^\pm = I$ and is furthermore continuous and has the correct support properties, since the application of $D'$ preserves these features. In the same way, we can construct advanced and retarded right fundamental solutions for the dual Dirac operator as $S_*^\pm \doteq -DE_*^\pm$, with $E_*^\pm$ being the fundamental solutions of $P$ on $\mathcal{D}(D^*M)$.

The next step consists of proving that the right fundamental solution is “also a left one” and we only show this for $S^\pm$, since the proof for $S_*^\pm$ is analogous. Consider any $h \in \mathcal{D}(D^*M)$ and any $f \in \mathcal{D}(DM)$. Since $Df \in \mathcal{D}(DM)$, we end up with

$$\langle h, S^\pm Df\rangle = \langle D'S_*^\mp h, S^\pm Df\rangle = \langle S_*^\mp h, DS^\pm Df\rangle = \langle S_*^\mp h, Df\rangle = \langle D'S_*^\mp h, f\rangle = \langle h, f\rangle,$$

where all expressions are well-defined since $\mathrm{supp}\, S^\pm f \cap \mathrm{supp}\, S_*^\mp h$ is compact due to global hyperbolicity of $M$.

It remains to be shown that the fundamental solutions are unique and again we only prove this for $S^\pm$ here. To this end, let us take any $h \in \mathcal{E}(D^*M)$ which becomes
compactly supported upon application of $D'$ and any $f \in \mathcal{D}(DM)$. Suppose two different sets of fundamental solutions, say $S^\pm$ and $\tilde S^\pm$, exist. Starting from

$$0 = \langle h, If - If\rangle = \langle h, D(S^\pm - \tilde S^\pm)f\rangle = \langle D'h, (S^\pm - \tilde S^\pm)f\rangle,$$

uniqueness of the left fundamental solutions follows from the non-degeneracy of $\langle\cdot,\cdot\rangle$ in combination with the fact that every element of $\mathcal{D}(D^*M)$ can be written as $D'h$ with a suitable $h \in \mathcal{E}(D^*M)$. Uniqueness in the sense of right fundamental solutions then follows from every left fundamental solution also being a right one and vice versa.

In analogy with the scalar case, we shall from now on call $S = S^+ - S^-$ the causal propagator for the Dirac operator $D$ and $S_* = S_*^+ - S_*^-$ the causal propagator for $D'$.

To conclude the section, we wish to underline that, up to this point, we have basically considered the Dirac spinor and cospinor fields as completely distinct objects. As it is usually done in Minkowski space, however, we can define a well-behaved Dirac conjugation map mapping spinors into cospinors and vice versa. Furthermore, this conjugation turns out to map any dynamically allowed configurations into another one.

Definition 2.11. We call Dirac conjugation matrix the unique matrix $\beta \in SL(4,\mathbb{C})$ such that

$$\beta^* = \beta, \qquad \gamma_a^* = -\beta\gamma_a\beta^{-1} \quad \forall a = 0, \dots, 3,$$

and furthermore, $i\beta n^a\gamma_a$ is a positive definite matrix, being $n$ timelike and future-directed. Starting from this object, we can define Dirac conjugation maps:

$$\cdot^\dagger : \mathcal{E}(DM) \to \mathcal{E}(D^*M), \qquad f^\dagger \doteq f^*\beta,$$

$$\cdot^\dagger : \mathcal{E}(D^*M) \to \mathcal{E}(DM), \qquad h^\dagger \doteq \beta^{-1}h^*,$$

where $^*$ denotes the adjoint with respect to the Hermitian inner product on $\mathbb{C}^4$.
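The two algebraic conditions of Definition 2.11 and the involutive character of the conjugation maps can be examined in the representation (2). The sketch below takes $\beta = -i\gamma_0$, as stated in the remark that follows, and treats spinors and cospinors at a point as plain vectors in $\mathbb{C}^4$; the helper names are illustrative, and the positivity condition of the definition is deliberately not examined here.

```python
import numpy as np


def flat_gammas():
    """gamma_0..gamma_3 in the representation (2), signature (-,+,+,+)."""
    s1 = np.array([[0, 1], [1, 0]], dtype=complex)
    s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
    s3 = np.array([[1, 0], [0, -1]], dtype=complex)
    i2 = np.eye(2, dtype=complex)
    z2 = np.zeros((2, 2), dtype=complex)
    g0 = 1j * np.block([[i2, z2], [z2, -i2]])
    return [g0] + [1j * np.block([[z2, s], [-s, z2]]) for s in (s1, s2, s3)]


gamma = flat_gammas()
beta = -1j * gamma[0]          # candidate Dirac conjugation matrix
beta_inv = np.linalg.inv(beta)


def conj_spinor(f):
    """f^dagger = f^* beta: spinor components -> cospinor components."""
    return f.conj() @ beta


def conj_cospinor(h):
    """h^dagger = beta^{-1} h^*: cospinor components -> spinor components."""
    return beta_inv @ h.conj()


# beta^* = beta and gamma_a^* = -beta gamma_a beta^{-1}.
hermitian_defect = np.abs(beta.conj().T - beta).max()
relation_defect = max(
    np.abs(g.conj().T + beta @ g @ beta_inv).max() for g in gamma)

# Applying both conjugations in either order gives back the input.
rng = np.random.default_rng(4)
f = rng.standard_normal(4) + 1j * rng.standard_normal(4)
h = rng.standard_normal(4) + 1j * rng.standard_normal(4)
roundtrip_defect = max(
    np.abs(conj_cospinor(conj_spinor(f)) - f).max(),
    np.abs(conj_spinor(conj_cospinor(h)) - h).max())
print(hermitian_defect, relation_defect, roundtrip_defect)
```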
Remarks. $\beta$ is only unique, once a representation of the Dirac–Clifford algebra has been chosen. A direct inspection of the above identities shows that, with the definition of γ-matrices as in (2), $\beta = -i\gamma_0$ and thus $\beta = \beta^{-1}$. It furthermore follows that if one applies subsequently and in any order both Dirac conjugations, this gives the identity map in the relevant spaces and that, as already anticipated, $\cdot^\dagger$ preserves the Dirac equations in the sense that $D'f^\dagger = (Df)^\dagger$, $Dh^\dagger = (D'h)^\dagger$ for any $f \in \mathcal{E}(DM)$, $h \in \mathcal{E}(D^*M)$.

3. Dirac Fields: Quantum Point of View

The aim of this section is twofold: on the one hand, we shall discuss the already available formulation of a quantum theory for Dirac fields on a curved background
while, on the other hand, we shall show for the first time that the notion of Wick polynomials can be coherently introduced also in this scenario, giving rise to an enlarged algebra of observables. Particularly, we shall point out how all these topics fit into the framework of the locally covariant formulation of quantum field theory. To achieve our goal, we shall refer to an earlier work due to Araki [2], though we shall also profit from [25, 27, 37, 48, 49, 65, 66, 76].

3.1. The local algebras of fields and observables

Although the first paper formulating a quantum theory of Dirac fields on curved spacetimes in the algebraic framework is [25], its underlying approach is slightly different from the one we shall employ, albeit ultimately fully equivalent. Particularly, in the aforementioned paper, the quantisation scheme calls for the choice of both a Cauchy surface $\Sigma$ and initial data on $\Sigma$ as building blocks of the quantum theory. We shall not dwell on the details of this method since we reckon that, in the spirit of local covariance, it is not best suited for our later purposes.

Although the standard paradigm in particle physics calls for the treatment of particles and antiparticles as distinct, albeit related objects, in this paper we shall, as it has been done by most of the authors mentioned in the introduction of this section, bear in mind the lessons from [2] and, thus, we shall consider spinors and cospinors as part of a single entity, since it will turn out to be more convenient for our later purposes.

On a practical ground, the building blocks of our discussion will be three. The first one arises out of the direct sum $DM \oplus D^*M$, namely, $\mathcal{D} \doteq \mathcal{D}(M) \doteq \mathcal{D}(DM \oplus D^*M)$, the space of compactly supported smooth sections of $DM \oplus D^*M$ with the topology stated in Definition 2.3, i.e. the one induced by the family of seminorms $\|f\|_k \doteq \sup|f^{(k)}(x)|$, $f \in \mathcal{D}$, where $\cdot^{(k)}$ again denotes a derivative of $k$th order ($k \ge 0$), performed in a coordinate patch of $M$, while the second is the antilinear involution map $\Gamma : \mathcal{D} \to \mathcal{D}$ defined as

$$\Gamma(f \oplus h) \doteq h^\dagger \oplus f^\dagger \quad \forall f \oplus h \in \mathcal{D}, \tag{16}$$

being $\cdot^\dagger$ the Dirac conjugation introduced in Definition 2.11. Furthermore, in order to eventually impose the anticommutation relations, we need a third datum, namely, a sesquilinear form on $\mathcal{D}^2 \doteq \mathcal{D}((DM \oplus D^*M)^2)$. Let $f = f_1 \oplus f_2$ and $h = h_1 \oplus h_2$ be two elements of $\mathcal{D}$, then $(\cdot,\cdot) : \mathcal{D}^2 \to \mathbb{C}$ is defined as:

$$(f, h) \doteq -i\langle f_1^\dagger, Sh_1\rangle + i\langle S_*h_2, f_2^\dagger\rangle, \tag{17}$$

which is positive semidefinite, as one can infer with minor modifications either from [66, Lemma 4.2.4] or, with little more effort, from [25, Proposition 1.1]; let us also note that $(\Gamma f, \Gamma h) = (h, f)$. Furthermore, using all the afore introduced tools, one can define the following algebra:

Definition 3.1. We call algebra of fields the unital ∗-algebra $\mathcal{F}(M, g)$ generated by the identity and the abstract elements $B(f)$ with $f \in \mathcal{D}$ satisfying the following
November 18, 2009 11:47 WSPC/148-RMP
J070-00386
Extended Algebra of Observables for Dirac Fields and Trace Anomaly
1263
requirements:
(i) the map $f \mapsto B(f)$ is linear,
(ii) $B(Df_1 \oplus D'f_2) = 0$ for all $f_1 \oplus f_2 \in \mathcal{D}$,
(iii) $B(\Gamma f) = B(f)^*$ for all $f \in \mathcal{D}$, with $\Gamma$ defined as in (16),
(iv) $\{B(f)^*, B(h)\} \doteq B(f)^* B(h) + B(h) B(f)^* = (f, h)$, where the right-hand side is given by (17).
Before providing a constructive example for the object defined above, let us discuss how the quantum counterparts of the classical single Dirac fields are contained in such an algebra.

Remark. It is possible to recover the standard notion of a spinor and cospinor quantum field starting from the $B$-generators as follows:
• the spinor arises as
$$\psi(h) \doteq B(0 \oplus h), \qquad (18)$$
• the cospinor is given by
$$\psi^\dagger(f) \doteq B(f \oplus 0). \qquad (19)$$
In particular, to be convinced of the self-consistency of this statement, one should notice that the spinor and cospinor fields are related due to property (iii) and that they respectively satisfy the Dirac and the dual Dirac equation of motion in the distributional sense thanks to (ii). Finally, it is (iv) which corresponds to the usual anticommutation relations between $\psi$ and $\psi^\dagger$, namely,
$$\{\psi(h), \psi^\dagger(f)\} = -i\langle h, Sf \rangle, \qquad \{\psi(h_1), \psi(h_2)\} = \{\psi^\dagger(f_1), \psi^\dagger(f_2)\} = 0.$$
Let us proceed to show that Definition 3.1 is not empty by constructing an example, namely, the so-called Borchers–Uhlmann algebra for Dirac fields, which is explicitly discussed in [66]. The starting point of this construction is the direct sum
$$\tilde{\mathcal{F}}(M, g) \doteq \bigoplus_{n=0}^{\infty} \mathcal{D}^n,$$
where $\mathcal{D}^n \doteq \mathcal{D}((DM \oplus D^*M)^n)$ denotes the compactly supported sections of the $n$-fold outer tensor product of $DM \oplus D^*M$, $\mathcal{D}^0 \doteq \mathbb{C}$, and, moreover, elements of $\tilde{\mathcal{F}}(M, g)$ are understood to be finite sequences of vectors $\bigoplus_n^{N<\infty} f_n$ with $f_n \in \mathcal{D}^n$. Subsequently, we equip $\tilde{\mathcal{F}}(M, g)$ with the product
$$\Big(\bigoplus_n f_n\Big)\Big(\bigoplus_m g_m\Big) \doteq \bigoplus_l \sum_{n+m=l} f_n \otimes g_m,$$
where $(f_n \otimes g_m)(x_1, \dots, x_{n+m}) \doteq f_n(x_1, \dots, x_n)\, g_m(x_{n+1}, \dots, x_{n+m})$.
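The product rule can be illustrated in a toy model. The following is only a minimal sketch, not the construction of [66]: test sections are replaced by formal basis labels, so that a degree-$n$ component becomes a finite combination of words of length $n$ and the outer tensor product becomes concatenation of words. The names `bu_product` and `UNIT` are illustrative assumptions.

```python
from collections import defaultdict

# Toy model (an assumption for illustration): an element of the tensor algebra
# is a dict mapping a word (tuple of basis labels) to a coefficient. Words of
# length n play the role of the degree-n component f_n.
UNIT = {(): 1}  # the degree-0 component, i.e. the identity in D^0 = C


def bu_product(f, g):
    """Return the product with components (f g)_l = sum_{n+m=l} f_n (x) g_m."""
    out = defaultdict(complex)
    for word_f, coeff_f in f.items():
        for word_g, coeff_g in g.items():
            # Concatenating words mirrors
            # (f_n (x) g_m)(x_1..x_{n+m}) = f_n(x_1..x_n) g_m(x_{n+1}..x_{n+m}).
            out[word_f + word_g] += coeff_f * coeff_g
    return {word: coeff for word, coeff in out.items() if coeff != 0}


f = {(): 1, ("a",): 2}            # f_0 = 1, f_1 = 2a
g = {("b",): 3, ("a", "b"): 1}    # g_1 = 3b, g_2 = a (x) b
fg = bu_product(f, g)
```

In this toy model the degree-2 part of `fg` collects both $f_0 \otimes g_2$ and $f_1 \otimes g_1$, which is exactly the sum over $n + m = l$ in the product formula.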
For later purposes, we would like the Borchers–Uhlmann algebra to enjoy a property not stated in Definition 3.1, namely, we want it to be a topological $*$-algebra. Therefore, we endow $\tilde{\mathcal{F}}(M, g)$ with the following topology: a sequence $\tilde{\mathcal{F}}(M, g) \ni f_j \doteq \oplus_n f_{j,n}$ is said to converge to $\tilde{\mathcal{F}}(M, g) \ni f \doteq \oplus_n f_n$ if and only if
(1) for all $n$, $f_{j,n} \to f_n$ in $\mathcal{D}^n$ with respect to the topology introduced at the beginning of this subsection,
(2) there exists an $N \in \mathbb{N}$ such that $f_{j,n}$ vanishes for every $n > N$ and for every $j$.

To promote $\tilde{\mathcal{F}}(M, g)$ to a $*$-algebra enjoying the properties (ii)–(iv) stated in Definition 3.1, we divide out the (closed) ideal $\mathcal{I}$ generated by elements of the form $Df_1 \oplus D'f_2$ with $f_1 \oplus f_2 \in \mathcal{D}$, by those of the form $(f_1 \oplus f_2)^* - \Gamma(f_1 \oplus f_2)$, and finally by those of the form $f \otimes h + h \otimes f - (f, h)$ with $f, h \in \mathcal{D}$. Hence, we can define the Borchers–Uhlmann algebra as the quotient
$$\mathcal{F}(M, g) \doteq \tilde{\mathcal{F}}(M, g)/\mathcal{I}, \qquad (20)$$
where the product and topology of $\mathcal{F}(M, g)$ descend from those on $\tilde{\mathcal{F}}(M, g)$. In the forthcoming discussions, it will sometimes be possible and even advantageous to use a weaker version of $\mathcal{F}(M, g)$, which is defined in the same way as $\mathcal{F}(M, g)$ but without including the Dirac equations in the construction of the ideal $\mathcal{I}$; we shall refer to this case as the off-shell formalism.

For the sake of completeness, let us note that the sesquilinear form $(\cdot, \cdot)$ can be promoted to a genuine non-degenerate scalar product on the coset space $\mathcal{D}/\ker(S \oplus S_*)$, which, in turn, can be completed to a Hilbert space $\mathcal{H}$ with respect to the said scalar product. As a by-product, this entails the possibility to extend $\mathcal{F}(M, g)$ to a $C^*$-algebra $\mathfrak{F}(M, g)$, representing the elements as bounded operators on the antisymmetric Fock space $F(\mathcal{H})$ built out of tensor powers of $\mathcal{H}$ itself. Following [2], we shall refer to this scenario as the assignment of the $C^*$-algebra of Dirac fields $\mathfrak{F}(M, g)$ to the pair $(\mathcal{H}, \Gamma)$. Note that the above discussion implies that there is a second topology with which $\mathcal{F}(M, g)$ can be endowed, namely the one stemming from the scalar product $(\cdot, \cdot)$. However, we will not use this or the related Hilbert space construction in the remainder of this work.

The construction of $\mathcal{F}(M, g)$ can be performed in the same way for any open subset $O \subset M$. In the case of relatively compact subsets, one would like to interpret the resulting algebras $\mathcal{F}(O, g|_O)$ as the algebra of local(ized) observables of the free Dirac field. However, from a physical point of view, observables are required to commute at spacelike separations, and the algebras constructed up to now do not fulfill this requirement. As a first step towards a definition of the algebra of local observables of a Dirac field, we can restrict our attention to
$$\mathcal{F}_{\mathrm{even}}(M, g) \doteq \text{even subalgebra of } \mathcal{F}(M, g),$$
which can, e.g., be defined as the subalgebra invariant under $B(f) \mapsto -B(f)$ [25, 37, 66].
The reason to choose such a subalgebra stems from the fact that any two
elements of $\mathcal{F}_{\mathrm{even}}(M, g)$ indeed commute for spacelike separations:

Proposition 3.1. Let $A_i$, $i \in \{1, 2\}$, be two elements of $\mathcal{F}_{\mathrm{even}}(M, g)$ which arise as finite linear combinations of smeared $B(f)$ generators
$$A_i \doteq \sum_n B(f^i_{n,1}) \cdots B(f^i_{n,2k_n})$$
such that
$$\bigcup_{n,j} \operatorname{supp} f^1_{n,j} \quad \text{and} \quad \bigcup_{n,j} \operatorname{supp} f^2_{n,j}$$
are spacelike separated. Then $[A_1, A_2] = 0$.

Proof. The proof descends from two key observations. On the one hand, we know the following relation between the commutator and anticommutator of four operators $A$, $B$, $C$ and $D$:
$$[AB, CD] = A\{B, C\}D - AC\{B, D\} - C\{A, D\}B + \{A, C\}DB. \qquad (21)$$
On the other hand, we know that, given two field algebra elements $B(f)$ and $B(h)$ with the supports of $f$ and $h$ spacelike separated, condition (iv) in Definition 3.1 together with the support properties of the causal propagator, proved in Theorem 2.3, entails that $B(f)^* B(h) + B(h) B(f)^* = 0$. To conclude the proof, one needs to notice that, since only products of an even number of fields appear, the properties of the commutator allow one to reduce $[A_1, A_2]$ to a linear combination of commutators, all of the form (21) with $AB$ and $CD$ of the form $B(f^1_{n,j_1}) B(f^1_{n,j_2})$ and $B(f^2_{n,j_1}) B(f^2_{n,j_2})$, respectively. This operation, together with the requirement on the supports of the test sections defining $A_1$ and $A_2$, concludes the proof.

With the restriction to $\mathcal{F}_{\mathrm{even}}(M, g)$ we have been able to assure local commutativity. This criterion is, however, not sufficient to extract the observable elements out of $\mathcal{F}(M, g)$. $\mathcal{F}_{\mathrm{even}}(M, g)$ is thus still too large to be a candidate for the algebra of local observables. To obtain such a good candidate, we have to take only the so-called “gauge invariant” elements of $\mathcal{F}_{\mathrm{even}}(M, g)$ into account, cf., e.g. [2, 25, 37, 66], and we denote the resulting subalgebra by $\mathcal{A}(M, g)$. In particular, if we consider any $A \doteq \sum_n B(f_{n,1}) \cdots B(f_{n,2k_n})$ in $\mathcal{F}_{\mathrm{even}}(M, g)$, it lies in $\mathcal{A}(M, g)$ if and only if it is (pointwise) invariant under a particular “action” $\tilde{L}_z(S)$ of any $S \in \mathrm{Spin}_0(3, 1)$; given an arbitrary but fixed $z \in \mathbb{C}^4$ and any $p \in SM$, we first define $\tilde{L}_z(S)$ on $[(p, z)] \in DM$ and $[(p, z^*)] \in D^*M$ as
$$\tilde{L}_z(S)([(p, z)]) \doteq [(p, T(S)z)], \qquad \tilde{L}_z(S)([(p, z^*)]) \doteq [(p, z^* T(S^{-1}))].$$
$\tilde{L}_z(S)$ can then be straightforwardly extended to $DM \oplus D^*M$, subsequently to arbitrary outer tensor products of the latter, and finally to the test sections $f_{n,1} \otimes \cdots \otimes f_{n,2k_n}$ determining $A$. Since $\tilde{L}_z(S)$ depends on $z$, it is of course not a well-defined action in the strict sense, cf. the footnote on [66, p. 74]. However, it is sufficient to define the invariance we are seeking, since invariance with respect to one specific $z$ implies invariance with respect to $T(\mathrm{Spin}_0(3, 1))z$. One could also define invariance employing changes of the representation of the Clifford algebra; we have chosen the one described above since it is closer to the physical intuition that “an observable is a scalar in spinor space”. The above definition of an observable is compatible with the product of algebra elements, and thus defines a subalgebra of $\mathcal{F}_{\mathrm{even}}(M, g)$ in a well-defined way. In the rest of the paper, we shall always work with $\mathcal{F}_{\mathrm{even}}(M, g)$, though all our results can be applied to $\mathcal{A}(M, g)$ as well.

It is remarkable that, in order to get to the definition of the various algebras introduced above, once a particular representation of the Clifford algebra has been chosen, the only other necessary datum is the geometry of the underlying manifold and, in the case of a non-simply connected spacetime, the spin structure.$^c$ This can be understood by realizing that, besides the Dirac bundles $DM$ and $D^*M$ themselves, the overall analysis relies on the causal propagators $S$ and $S_*$, which are unique in a globally hyperbolic spacetime with spin structure. This apparently innocuous observation will play an important role in identifying the quantisation of the Dirac field as a particular locally covariant quantum field theory, as we will explain in the next subsection.
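Identity (21), on which the proof of Proposition 3.1 rests, is purely algebraic: it holds in any associative algebra. As a sanity check, and only as a sketch, it can be tested on random integer matrices, which stand in here for the abstract operators $A$, $B$, $C$, $D$. All helper names below are illustrative assumptions.

```python
import random


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(x, y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def sub(x, y):
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def commutator(x, y):
    return sub(matmul(x, y), matmul(y, x))


def anticommutator(x, y):
    return add(matmul(x, y), matmul(y, x))


def product(*factors):
    """Ordered product of several square matrices."""
    result = factors[0]
    for factor in factors[1:]:
        result = matmul(result, factor)
    return result


def identity_21_holds(a, b, c, d):
    """Check [AB, CD] = A{B,C}D - AC{B,D} - C{A,D}B + {A,C}DB exactly."""
    lhs = commutator(matmul(a, b), matmul(c, d))
    rhs = product(a, anticommutator(b, c), d)
    rhs = sub(rhs, product(a, c, anticommutator(b, d)))
    rhs = sub(rhs, product(c, anticommutator(a, d), b))
    rhs = add(rhs, product(anticommutator(a, c), d, b))
    return lhs == rhs  # integer arithmetic, so the comparison is exact


def random_matrix(n, rng):
    return [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]


rng = random.Random(0)
a, b, c, d = (random_matrix(3, rng) for _ in range(4))
print(identity_21_holds(a, b, c, d))  # prints True
```

The right-hand side of (21) contains only anticommutators between one factor of the first product and one factor of the second. When all four of them vanish, as they do for spacelike separated supports, the commutator vanishes as well, which is the mechanism used in the proof.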
$^c$ Let us recall that, on simply connected spacetimes, the spin structure is unique [33, 34].

3.2. Locality and general covariance

In order to establish a connection between the previous discussion and the modern interpretation of quantum field theory over curved backgrounds, it is mandatory to address the question whether the axioms of a locally covariant theory, as proposed by Brunetti, Fredenhagen, and Verch in [15], are fulfilled for the algebraic quantization of Dirac fields displayed above. An affirmative answer has indeed been given in [66], though one should also consider the earlier works [20, 25]. We shall not dwell on a recapitulation of the precise definition of all the needed tools, e.g. the categorical notions involved; instead, we choose to provide a short overview and we refer an interested reader to [14, 15, 66] for further details. That said, by direct inspection of the previous analysis, we can infer that the following axioms of a locally covariant theory are satisfied:

(1) It is possible to associate to every globally hyperbolic spacetime $(M, g)$ with spin structure $(SM, \rho)$ the corresponding $*$-algebra $\mathcal{F}(M, g)$ of fields in a unique way, once a representation of the Clifford algebra has been fixed for all spin spacetimes.
(2) To every map $\chi$ which is an isometric, orientation and causality preserving embedding of $(M_1, g_1)$ into $(M_2, g_2)$ and at the same time maps $(SM_1, \rho_1)$ to $(SM_2, \rho_2)$ in a coherent and equivariant way (cf. [66, Definition 2.3.1]), one can associate an injective, unit-preserving $*$-homomorphism $\alpha_\chi$ between the corresponding field algebras $\mathcal{F}(M_1, g_1)$ and $\mathcal{F}(M_2, g_2)$. Here, we call a map $\chi : (M_1, g_1) \to (M_2, g_2)$ causality preserving if and only if every causal curve starting and ending in $\chi(M_1)$ is an image of a causal curve in $(M_1, g_1)$.
(3) Let us choose two maps as above, namely, $\chi_1 : (M_1, g_1, SM_1, \rho_1) \to (M_2, g_2, SM_2, \rho_2)$ and $\chi_2 : (M_2, g_2, SM_2, \rho_2) \to (M_3, g_3, SM_3, \rho_3)$; then the following composition law is satisfied for the corresponding algebra morphisms:
$$\alpha_{\chi_2 \circ \chi_1} = \alpha_{\chi_2} \circ \alpha_{\chi_1}.$$

Let us note that the above axioms are also fulfilled in the off-shell formalism, i.e. for Dirac spinor fields not subject to the Dirac equations. We can, furthermore, add another two axioms in special cases. On the one hand, if we restrict the construction to $\mathcal{F}_{\mathrm{even}}(M, g)$, the axiom of Einstein causality is fulfilled on account of Proposition 3.1:

(4) Consider two globally hyperbolic spacetimes with spin structure $(M_1, g_1, SM_1, \rho_1)$ and $(M_2, g_2, SM_2, \rho_2)$ together with $\chi_1$ and $\chi_2$, respectively, two embeddings of the aforementioned kind into a third spacetime with spin structure $(M_3, g_3, SM_3, \rho_3)$. Under the assumption that $\chi_1(M_1)$ and $\chi_2(M_2)$ are spacelike separated in $M_3$, it holds that for every $A_1 \in \mathcal{F}_{\mathrm{even}}(M_1, g_1)$ and $A_2 \in \mathcal{F}_{\mathrm{even}}(M_2, g_2)$,
$$[\alpha_{\chi_1}(A_1), \alpha_{\chi_2}(A_2)] = 0.$$

At the same time, in the on-shell formalism, the time slice axiom (cf. [66, Proposition 4.2.22]) holds:

(5) Let $\chi : M_1 \to M_2$ be a map between two globally hyperbolic spacetimes with spin structure with the properties already discussed. If a Cauchy surface of $M_2$ is contained in $\chi(M_1)$, then $\alpha_\chi$ is an isomorphism.

Let us remark at this point that the time slice axiom holds essentially due to Theorem 2.2 and that it does not automatically hold for the extended algebras of Wick polynomials we are going to define soon, though we expect that such a result can be proven along the lines displayed in [16].

The aforementioned axioms state properties of the full field algebras, but one can refine these statements and identify fields with a special behavior under the maps $\chi$ and $\alpha_\chi$, the so-called locally covariant fields [15, 39]. In fact, as discussed by Sanders in [66], the field $B(\cdot)$, and thus also the single fields $\psi(\cdot)$, $\psi^\dagger(\cdot)$, are locally covariant fields. This entails that $B(\cdot)$ can be understood as a family of continuous
maps, indexed by spacetimes with spin structure $M$,$^d$ $B_M : \mathcal{D}(M) \to \mathcal{F}(M, g)$, such that, given two spacetimes with spin structure $M_1$ and $M_2$ and a map $\chi : M_1 \to M_2$ with the properties discussed in axiom (2), one gets the same result by either building a quantum field out of a test section $f_{M_1} \in \mathcal{D}(M_1)$ and then mapping this field to $\mathcal{F}(M_2, g_2)$ via $\alpha_\chi$, or by mapping the test section $f_{M_1}$ to $f_{M_2} \doteq \chi_*(f_{M_1}) \in \mathcal{D}(M_2)$ via the push-forward $\chi_*$ of $\chi$ and then building $B_{M_2}(f_{M_2})$ out of it. On the level of maps, we thus have:
$$\alpha_\chi \circ B_{M_1} = B_{M_2} \circ \chi_*.$$
Similarly, one can identify certain observable composite fields as locally covariant quantum fields via specific choices of test section spaces and, furthermore, the Wick polynomials we shall discuss later fit into the same framework as well.

Let us stress once more that the above discussion can be completely recast in terms of category theory. In particular, the locally covariant fields mentioned above constitute covariant functors from the category of globally hyperbolic spin spacetimes to the category of topological unital $*$-algebras. However, since our aim has been merely to give a brief overview, we have chosen to recall the axioms of locally covariant quantum field theory without having direct recourse to category theory, and we kindly refer the interested reader to the well-written works [14, 15, 66] for an extensive introduction and full-fledged discussion of the topic.

$^d$ We omit the other data determining a spacetime with spin structure in the remainder of this paragraph in favour of notational simplicity.

3.3. Spinors and Hadamard states

The algebra $\mathcal{A}(M, g) \subset \mathcal{F}_{\mathrm{even}}(M, g)$ is, at this point of our discussion, the best candidate to play the role of an algebra of observables for a free Dirac field theory. Unfortunately, this situation is far from satisfactory because objects such as all the Dirac bispinors, the current in particular, and the (components of the) stress-energy tensor are not contained in $\mathcal{A}(M, g)$ or $\mathcal{F}_{\mathrm{even}}(M, g)$. Since we want to consider these as genuine observables, the best option is to solve this problem along the same lines employed in the scalar case, namely, we shall suitably enlarge $\mathcal{F}_{\mathrm{even}}(M, g)$ to include all the wanted elements. Although reasonable and, to a certain extent, natural, this idea comes with a price to pay: not all the well-behaved algebraic states for $\mathcal{F}_{\mathrm{even}}(M, g)$ are admissible for the extended algebra; in fact, we will have to restrict ourselves to so-called Hadamard states for that purpose. Hadamard states can be characterized in many ways that are presumably (and, as we will see to some extent, definitely) related by internal consistency of the theory, which makes them, from both the physical and the mathematical perspective, the most sensible family of states for a quantum field theory on curved spacetimes.
From the more mathematical angle, it is very important that all Hadamard states have the same singularity structure, while in more physical terms it is crucial that this structure, i.e. their UV-behavior, mimics the one of the Minkowskian vacuum state. Altogether this implies that Hadamard states can be employed to obtain a sensible definition of the expected stress-energy tensor [78]; this is in fact the context in which Hadamard states first appeared. Moreover, the Hilbert space representations of quasi-free Hadamard states are unique in a certain weak sense, as has been proven in [75] for scalar fields and in [21] for Dirac fields; this feature is desirable from both the physical and the mathematical point of view. The available literature on Hadamard states is immense and we point a reader interested in details beyond the forthcoming discussion to [47, 49, 62, 63] for scalar fields or to [37, 48, 65] for a treatment related to spinors.

As a first step and as the main topic of this section, we shall introduce the notion of Hadamard states for the full algebra $\mathcal{F}(M, g)$ and only later restrict them to $\mathcal{F}_{\mathrm{even}}(M, g)$. The already anticipated enlargement of the algebra to include all interesting observables of the free field will then be the core of a subsequent discussion. That said, henceforth, we shall consider a state $\omega$ to be a continuous, complex-valued linear functional on $\mathcal{F}(M, g)$ which is normalized, i.e. $\omega(1) = 1$, and fulfills positivity, viz.,
$$\omega(A^* A) \geq 0 \qquad \forall A \in \mathcal{F}(M, g).$$
Since this algebra is generated, according to Definition 3.1, by the abstract elements $B(f)$ with $f \in \mathcal{D}$, every such state is uniquely determined by the set of its $n$-point functions, namely,
$$\omega_n(f_1, \dots, f_n) \doteq \omega(B(f_1) \cdots B(f_n)).$$
The continuity of the state required above refers to the Borchers–Uhlmann topology of $\mathcal{F}(M, g)$. Consequently, each $\omega_n$ is a distribution on $\mathcal{D}^n$. The bridge between the algebraic formulation of quantum field theory employed in this work and its usual Hilbert space description is provided, in the non-trivial direction, by the Gelfand–Naimark–Segal (GNS) construction (cf., e.g., [35]), which yields a representation of an algebraic state and a field algebra in terms of a Hilbert space vector and operators on the same Hilbert space, respectively.

Among all possible algebraic states, a distinguished role is played by the so-called quasi-free ones, whose $n$-point functions can be determined fully out of $\omega_2$. Following [2], we recall:

Definition 3.2. A state $\omega$ on $\mathcal{F}(M, g)$ is called quasi-free if and only if, given any set of $f_i \in \mathcal{D}$ with $i \in \{1, \dots, n\}$, $\omega(B(f_1) \cdots B(f_n))$ vanishes for odd $n$, while for even $n$ it holds that
$$\omega(B(f_1) \cdots B(f_n)) = \sum_{\pi_n \in S_n} (-1)^{|\pi_n|} \prod_{i=1}^{n/2} \omega_2(f_{\pi_n(2i-1)}, f_{\pi_n(2i)}).$$
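The combinatorics of this formula can be made explicit in a short sketch. It assumes, as is standard but not spelled out in the text, that $|\pi_n|$ denotes the parity of $\pi_n$, and it uses the two ordering conditions that characterise the ordered permutations in $S_n$, namely $\pi_n(2i-1) < \pi_n(2i)$ and $\pi_n(2i-1) < \pi_n(2i+1)$. The two-point function is replaced by an arbitrary Python callable `w2` acting on indices (shifted to start at 0); all function names are illustrative.

```python
from itertools import permutations


def ordered_permutations(n):
    """Yield permutations pi of (0, ..., n-1), n even, such that each pair
    satisfies pi[2k] < pi[2k+1] and the first entries of consecutive pairs
    increase, pi[2k] < pi[2k+2]."""
    if n % 2:
        raise ValueError("n must be even")
    for pi in permutations(range(n)):
        pairs_sorted = all(pi[k] < pi[k + 1] for k in range(0, n, 2))
        firsts_sorted = all(pi[k] < pi[k + 2] for k in range(0, n - 2, 2))
        if pairs_sorted and firsts_sorted:
            yield pi


def sign(pi):
    """Parity of a permutation, computed by counting inversions."""
    inversions = sum(1 for a in range(len(pi))
                     for b in range(a + 1, len(pi)) if pi[a] > pi[b])
    return -1 if inversions % 2 else 1


def quasi_free_n_point(w2, n):
    """Evaluate the quasi-free expansion of an n-point function from w2."""
    if n % 2:
        return 0  # odd n-point functions vanish for quasi-free states
    total = 0
    for pi in ordered_permutations(n):
        term = sign(pi)
        for k in range(0, n, 2):
            term *= w2(pi[k], pi[k + 1])
        total += term
    return total


def w2(i, j):
    """Toy stand-in for omega_2(f_i, f_j); the values carry no physics."""
    return 10 * (i + 1) + (j + 1)


print(len(list(ordered_permutations(4))))  # prints 3
print(quasi_free_n_point(w2, 4))           # prints 418
```

For $n = 4$ the three ordered permutations reproduce the familiar combination $\omega_2(f_1, f_2)\omega_2(f_3, f_4) - \omega_2(f_1, f_3)\omega_2(f_2, f_4) + \omega_2(f_1, f_4)\omega_2(f_2, f_3)$; in general there are $(n-1)!!$ terms. The brute-force enumeration over all permutations is chosen for transparency, not efficiency.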
Here, $S_n$ denotes the set of ordered permutations of $n$ elements, namely, the following two conditions are satisfied for $\pi_n \in S_n$:
$$\pi_n(2i-1) < \pi_n(2i), \quad 1 \leq i \leq \frac{n}{2}, \qquad \pi_n(2i-1) < \pi_n(2i+1), \quad 1 \leq i < \frac{n}{2}.$$

One of the reasons why past works on Hadamard states have restricted their treatment to quasi-free states has been the ability to state the Hadamard property for the two-point function only, without having to consider higher order $n$-point functions. However, due to the works [66, 67, 71], it is known that this restriction is not necessary, as one can in fact state the Hadamard property in the same way for both quasi-free and non quasi-free states. The reason for this is the strong result that the Hadamard property of the two-point functions (in combination with the anticommutation relations) is already enough to determine the singularity structure of all $n$-point functions. For simplicity, i.e. to keep some of the following formulas short, we will sometimes restrict our discussion to quasi-free Hadamard states. However, whenever we speak of a state in the following without explicitly specifying it to be quasi-free, our statements will encompass non quasi-free states as well.

To discuss Hadamard states on the level of single Dirac fields, we will be concerned with two distinguished distributions:
$$\omega^+(f, h) \doteq \omega(\psi(h)\psi^\dagger(f)) \quad \text{and} \quad \omega^-(f, h) \doteq \omega(\psi^\dagger(f)\psi(h)), \qquad (22)$$
where $f \in \mathcal{D}(DM)$ whereas $h \in \mathcal{D}(D^*M)$, and where both $\psi^\dagger(f)$ and $\psi(h)$ are particular elements of $\mathcal{F}(M, g)$ as explained in Sec. 3.1. Hence, it turns out that both $\omega^+$ and $\omega^-$ can be understood as distributions on $\mathcal{D}(DM \boxtimes D^*M)$.

We can now introduce the notion of Hadamard state and, as in the scalar case, it is remarkable and useful that this concept can be illuminated in two equivalent ways. The first one has recourse to the notion of wavefront sets [26, 43], a concept which enables a refined formulation of the singularity structure of a distribution. To this end, one should take into account that an a priori obstacle lies in the nature of the vector-valued distributions appearing in the context of Dirac fields. In particular, since wavefront sets are more familiar in the context of scalar distributions, we need to specify how they can be defined for distributions with values in higher-dimensional spaces. To achieve this, it appears natural to define the wavefront set of a vector-valued distribution as the union of the wavefront sets of the coefficients with respect to a (possibly local) basis expansion, and indeed this turns out to be an invariant concept due to the properties of scalar wavefront sets [22, 48, 65]. Specifically, we can define the wavefront sets of$^e$ $\omega^\pm(x, y) = \omega^\pm{}_A{}^{B'}(x, y)\, E^A(x) \otimes E_{B'}(y)$ as
$$WF(\omega^\pm) \doteq \bigcup_{A=1}^{4} \bigcup_{B'=1}^{4} WF\big(\omega^\pm{}_A{}^{B'}(x, y)\big).$$

$^e$ We refer to Appendix A for a motivation and explanation of the primed index notation.
By defining wavefront sets in this way, we certainly lose information on the most singular “directions” of a vector-valued distribution. This information can be encoded in so-called polarized wavefront sets, as introduced in [22] and applied in [37, 48]. Though of high mathematical interest, this concept is of no explicit use in our approach (though it will sometimes be used implicitly) and we feel safe not to dwell on it since we would end up providing only shallow ideas. That said, let us state the first possible access to Hadamard states [37, 48, 62, 65]:

Definition 3.3. A state $\omega$ satisfies the Hadamard condition and is thus called a Hadamard state if and only if
$$WF(\omega_2) = \{(x, y, k_x, -k_y) \in T^*M^2 \setminus 0 \mid (x, k_x) \sim (y, k_y),\ k_x \triangleright 0\}.$$
Here, $(x, k_x) \sim (y, k_y)$ means that there exists a null geodesic $\gamma$ connecting $x$ to $y$ such that $k_x$ is coparallel and cotangent to $\gamma$ at $x$ and $k_y$ is the parallel transport of $k_x$ from $x$ to $y$ along $\gamma$. Finally, $k_x \triangleright 0$ means that the covector $k_x$ is future-directed.

Remarks. If a state $\omega$ fulfills the Hadamard condition, then $\omega^\pm$ possess the following wavefront sets:
$$WF(\omega^\pm) = \{(x, y, k_x, -k_y) \in T^*M^2 \setminus 0 \mid (x, k_x) \sim (y, k_y),\ k_x \mathrel{\substack{\triangleright \\ \triangleleft}} 0\}, \qquad (23)$$
where $k_x \triangleleft 0$ states that $k_x$ is past-directed. An even stronger relation between the two distributions $\omega^\pm$ arises if we employ the anticommutation relation, since it entails that
$$\omega^+(f, h) + \omega^-(f, h) = -i\langle h, Sf \rangle. \qquad (24)$$
Actually, this relation is at the heart of the set equality in (23). Initially, we only have that $WF(\omega^\pm)$ are subsets of the right-hand side of (23). However, since the integral kernel of $\langle \cdot, S\cdot \rangle$, with $S$ being a Dirac derivative of the causal propagator $E$, has the symmetrized version of $WF(\omega_2)$ as wavefront set (this follows by a direct computation similar to the one explained in [62, 65]), $WF(\omega^\pm)$ must be the maximal ones displayed in (23) to assure that (24) can hold.

By contrast, the distributions $\omega(\psi(h_1)\psi(h_2))$ and $\omega(\psi^\dagger(f_1)\psi^\dagger(f_2))$, which together with $\omega^\pm$ determine $\omega_2$, have smooth integral kernels. For $\omega(\psi(h_1)\psi(h_2))$, this can be proved employing a symmetry argument already used in a similar way in [62]: due to the anticommutation relations, we have
$$\omega(\psi(h_1)\psi(h_2)) = -\omega(\psi(h_2)\psi(h_1)).$$
Hence, if $(x, y, k_x, k_y)$ is an element of the wavefront set of the distribution on the right-hand side of the previous equation, then $(y, x, k_y, k_x)$ must lie in the wavefront set of the other one. At the same time, on account of the Hadamard condition, we know that $WF(\omega(\psi(x)\psi(y)))$ is not invariant under the exchange of coordinates. This entails that $WF(\omega(\psi(x)\psi(y))) = \emptyset$; hence $\omega(\psi(h_1)\psi(h_2))$, and analogously $\omega(\psi^\dagger(f_1)\psi^\dagger(f_2))$, possesses a smooth integral kernel.
Although highly elegant from a mathematical point of view and thus very helpful in abstract proofs, the microlocal definition of a Hadamard state is neither the first one introduced chronologically nor the easiest one to cope with on the level of explicit calculations. In fact, as already promised above, there is a different, more explicit definition of a Hadamard state via the so-called Hadamard form. For scalar fields, this has been rigorously introduced in [47], while for Dirac fields a similar concept has been proposed in [49, 75]. To introduce it, we need the notion of a convex normal neighborhood, which is an open subset $O$ of $M$ such that any two points $x, y \in O$ can be connected by a unique geodesic which lies completely in $O$. On any convex normal neighborhood, we can introduce the smooth halved squared geodesic distance $\sigma(x, y)$ and, finally, formulate the following definition:

Definition 3.4. A state $\omega$ is said to be of the (local) Hadamard form if and only if in any convex normal neighborhood the distribution kernels of $\omega^\pm$ can be written as
$$\omega^\pm(x, y) = \pm\frac{1}{8\pi^2} D'_y \big(H^\pm(x, y) + W(x, y)\big),$$
where the index $\cdot_y$ stresses that the dual Dirac operator $D'_y$ acts on the $y$-variable, and the singular Hadamard distribution kernels $H^\pm$ can be specified as
$$\frac{U(x, y)}{\sigma_\pm(x, y)} + V(x, y) \ln\frac{\sigma_\pm(x, y)}{\lambda^2} \xrightarrow[\varepsilon \to 0^+]{} H^\pm(x, y), \qquad (25)$$
where the limit has to be taken in the weak sense. Here, $U$, $V$, as well as $W$ are smooth bispinors and $V$ can be expanded in powers of $\sigma$, viz.,
$$V(x, y) \doteq \sum_{n=0}^{\infty} V_n(x, y)\, \sigma(x, y)^n,$$
where $\lambda$ is a reference length, and $\sigma_\pm(x, y) \doteq \sigma(x, y) \pm 2i\varepsilon(T(x) - T(y)) + \varepsilon^2$ with $\varepsilon > 0$. In the above formula, $T$ is a time function, such that $\nabla T$ is timelike and future pointing on the full spacetime $(M, g)$. We furthermore require $H^\pm$ to be bisolutions of the spinorial Klein–Gordon equations up to smooth terms, i.e.
$$P_x H^\pm(x, y) \in \mathcal{E}(DM \boxtimes D^*M), \qquad P_y H^\pm(x, y) \in \mathcal{E}(DM \boxtimes D^*M), \qquad (26)$$
and we demand that their difference is specified by the fundamental solution of $P$, viz.,
$$H^+(f, g) - H^-(f, g) = i\langle g, Ef \rangle,$$
where $f \in \mathcal{D}(DM)$ and $g \in \mathcal{D}(D^*M)$.

Remarks. The existence of a time function $T$ is guaranteed on any globally hyperbolic manifold [6, 7], as these can be decomposed as $\Sigma \times \mathbb{R}$, where $\Sigma$ is a smooth Cauchy surface and $\mathbb{R}$ is the very range of the time function $T$.
Furthermore, a completely satisfactory definition of the Hadamard form requires some more work to rule out spacelike singularities, to circumvent convergence problems of the series $V$, which is only asymptotic, and, finally, to assure that the definition depends neither on a special choice of the temporal function $T$ nor on the employed convex normal neighborhood. In particular, in strict terms, we have only defined the local Hadamard form here. A stronger and more satisfactory definition, the so-called global Hadamard form, has been introduced in [47]; it reinforces the local form as it extends it from the convex normal neighborhoods to certain “causally-shaped” neighborhoods of a Cauchy surface, thereby ruling out spacelike singularities. However, in [63], it has been shown that the local Hadamard form already implies the global Hadamard form. For further details and discussions of these aspects and the existence of states of the Hadamard form we refer an interested reader to [31, 47, 49, 65, 75].

To determine the so-called Hadamard coefficients $U$, $V$, and $W$, one has to exploit Eq. (26). As a result it turns out that, as explained in Appendix A.3, $U$ and $V$ depend solely on the mass $m$ and on the local curvature, while the full state dependence is encoded in $W$. At this point, we would like to stress a slight conceptual difference between Dirac spinors and scalar fields: in the case of scalar fields, the two-point function fulfills the Klein–Gordon equation in both entries, and this property is thus inherited by its singular Hadamard kernel up to smooth terms. By contrast, in the case of Dirac spinors, (26) does not follow straightforwardly from the fact that the two-point functions $\omega^\pm$ fulfill the Dirac equations. In more specific terms, if we recall the definition (22) of $\omega^\pm$, we know that they fulfill
$$D'_x \omega^\pm(x, y) = D_y \omega^\pm(x, y) = 0,$$
where the operators $D_y$ and $D'_x$, respectively for the Dirac spinor and cospinor in the $y$ and $x$ variables, are defined as in (11) and (12). Consequently,
$$D'_x D'_y \big(H^\pm(x, y) + W(x, y)\big) = 0, \qquad D_y D'_y \big(H^\pm(x, y) + W(x, y)\big) = -P_y \big(H^\pm(x, y) + W(x, y)\big) = 0,$$
and, thus, both $P_y H^\pm$ and $D'_x D'_y H^\pm$ are smooth. The smoothness of $P_x H^\pm$ does, however, not follow automatically from these considerations, but has to be required or proven in a way similar to the one displayed in [65, Lemma 5.4].

We shall explicitly discuss the computation of $U$, $V$ and $W$ in Appendix A.3. To this end, the following proposition will prove to be very helpful. Notice that the requirement of global hyperbolicity therein is not too restrictive since, ultimately, we will be interested in the coinciding point limits. Therefore, we are free to choose a sufficiently small domain which fulfills the said hypothesis.

Proposition 3.2. Let $H^\pm(x, y)$ be the Hadamard distribution kernels of a state introduced in Definition 3.4, where $x$ and $y$ are assumed to be contained in
Ox,y , a globally hyperbolic subset of a convex normal neighborhood. Then (Dx − Dy )H ± (x, y) = (Dy − Dx )H ± (x, y) are smooth.
Proof. The overall strategy calls for combining a deformation argument as devised in [31, Appendix C] together with the so-called theorem of propagation of singularitiesᶠ [22], here applied to the distributions on the globally hyperbolic spacetime (Ox,y , g). That said, let us proceed in logical sequential steps and consider any Cauchy surface Σ ↪ (Ox,y , g) of the spacetime we are interested in. Let us also choose an open neighborhood of Σ, say OΣ , such that it is a causal normal neighborhood of Σ, i.e. Σ is a Cauchy surface for OΣ and, for each p, q ∈ OΣ such that p ∈ J + (q), there exists a convex normal neighborhood containing J − (p) ∩ J + (q). The existence of such sets in a globally hyperbolic spacetime and for any Cauchy surface Σ was first proved in [47]. The above-mentioned deformation argument grants us that it is possible to construct an isometric, orientation and time-orientation-preserving embedding, say χ, of OΣ in a causal normal neighborhood OΣ of a Cauchy surface Σ of a second globally hyperbolic spacetime M . Furthermore, one can engineer M in such a way that, in the past of χ(OΣ ), there exists another Cauchy surface Σ with a neighborhood OΣ which contains the image of a suitable neighborhood of a Cauchy surface Σ in a globally hyperbolic subset of the Minkowski spacetime under an isometric, orientation preserving, embedding χ̃. It is straightforward to extend χ and χ̃ in such a way that they respect the spin structures. Since H ± on OΣ × OΣ are constructed only out of the local geometric data via (26), it is possible to build a second pair H̃ ± which coincides with the push-forward under χ of H ± in χ(OΣ ) × χ(OΣ ).
At this stage, per hypothesis, we can consider two distributions h± on χ(OΣ ) × χ(OΣ ) which solve weakly the second order hyperbolic partial differential equation of motion and whose singular part is equal to H̃ ± . Thanks to the time slice axiom, we can now extend h± to the whole globally hyperbolic spacetime M . Furthermore, due to the propagation of the Hadamard form as proved in [31, 62, 65], h± are also of Hadamard form in OΣ × OΣ and their pull-backs to OΣ × OΣ thus locally coincide with the Hadamard distribution kernels in Minkowski spacetime. Afterwards, we consider
$$u^\pm \doteq (D_x - D_y)\,h^\pm,$$
and we proceed to prove that these distributions have empty wavefront set. According to the above discussions, we can pull back u± to the globally hyperbolic subsets OΣ of Minkowski spacetime in a well-defined way. In this region, the pulled-back
ᶠ Here, the polarized wavefront sets, shortly mentioned before, play an important role, as one has only control on the propagation of the most singular (vector) directions of vector valued distributions due to possibly non-diagonal (matrix) terms in the partial differential operator under consideration.
November 18, 2009 11:47 WSPC/148-RMP
J070-00386
Extended Algebra of Observables for Dirac Fields and Trace Anomaly
1275
versions of u± have empty wavefront set, since the flat spacetime Hadamard kernel only depends on x − y due to translational invariance. To access the wavefront set of u± in χ(OΣ ) × χ(OΣ ), let us note that these distributions satisfy Px Py u± = 0, where the operator Px Py is properly supported, of real principal type, and homogeneous of degree 2 since it is the tensor product of two second order hyperbolic differential operators. From this it follows, by the propagation of singularities theorem (see [65] or [22]), that the wavefront set of u± on χ(OΣ ) × χ(OΣ ) can only contain elements of the form
$$(x, y, k_x, 0) \quad \text{or} \quad (x, y, 0, k_y). \tag{27}$$
Following a line of argument employed in the proof of [65, Theorem 5.8], we can infer that WF(u±) = ∅ in the following way: since u± are constructed as Dirac derivatives of h±, and properly supported partial differential operators like D and D′ do not increase the wavefront set, we know that WF(u±) ⊂ WF(h±). If we furthermore recall that h± have the “antisymmetric” wavefront set displayed in Definition 3.3, it follows that WF(u±) cannot even contain elements of the form (27) and is thus empty. The proof can be concluded by employing the contravariant transformation properties of wavefront sets under diffeomorphisms.
The “universality” of the singularity structure of states of the Hadamard form, i.e. the independence of U and V of the state, allows for a locally covariant definition of normal ordering, as we will see in the next subsection. To this end, it will be useful to combine the Hadamard distributions into a single object living on D ⊗ D, viz.,
$$H(f_1 \oplus f_2, h_1 \oplus h_2) \doteq (D_y H^+)(h_1, f_2) - (D_y H^-)(f_1, h_2) = H^+(h_1, D f_2) - H^-(f_1, D h_2), \tag{28}$$
where f1 ⊕ f2 , h1 ⊕ h2 ∈ D. Before we start working with Hadamard states, let us state the already anticipated and fruitful equivalence of the Hadamard form and the Hadamard condition, which was first proven for scalar fields in [62] and later extended to Dirac fields in [37, 48, 65]:
Theorem 3.1. Let us consider a state ω on F(M, g) with two-point function ω2 . This satisfies the Hadamard condition if and only if the distribution on D ⊗ D defined by
$$f \otimes h \mapsto \omega_2(f, h) - H(f, h) \tag{29}$$
has a smooth integral kernel, and, thus, ω± are of Hadamard form.
3.4. On the notion of Wick polynomials: Rewriting the field algebra
In the development of quantum field theory, a well-known obstruction arises whenever we consider the product of two fields, which, being distributions, cannot be
safely multiplied unless special conditions are met. Since, as already anticipated, our ultimate goal is to enlarge the algebras under consideration to include observables such as the stress-energy tensor for Dirac fields, we are led to tackle this problem. As in the case of scalar fields, this results in the introduction of Wick polynomials, and in the following we shall try to adopt an approach similar to the one discussed in the work of Brunetti, Duetsch, and Fredenhagen [11], which in turn is related to earlier works [13, 39, 40]. Unsurprisingly, in our scenario, there are differences with respect to the above mentioned works due to the vectorial nature of our fields and their anticommutativity. This necessitates a separate treatment of Wick polynomials of Dirac fields on curved spacetimes, and we will thus develop them in the following two subsections since they have not been treated in the literature in the past. As already anticipated, upon enlargement of the field algebra to include Wick polynomials we have to restrict our state space to Hadamard states, which seems not to be a real loss since these are already distinguished and presumably the only “physical” ones for the algebras discussed in Sec. 3.1.
As a starting point to define the extended algebra of fields, it will be more convenient not to start directly from F(M, g) or its subalgebras, but rather to consider the subspace of F̃(M, g) defined as
$$C(M, g) \doteq \bigoplus_{n=0}^{\infty} D^n_A,$$
where the subscript A indicates that, for n > 0, one takes into account only antisymmetric elements, while $D^0_A = D^0 = \mathbb{C}$. Notice that it is required that a generic element F ∈ C(M, g) is unambiguously determined by a finite sequence {F^(n)} of antisymmetric elements lying in D^n. Let us introduce $E_\Theta$, defined as $E_\Theta(x) \doteq E_A(x) \oplus 0$ if Θ = A and $E_\Theta(x) \doteq 0 \oplus E^B(x)$ if Θ = 4 + B, where Θ ranges from 1 to 8, while A and B range from 1 to 4. Then each element $F^{(n)} = F^{(n)}_{\Theta_1 \cdots \Theta_n} E_{\Theta_1} \otimes \cdots \otimes E_{\Theta_n}$ has antisymmetric coefficients, viz.,
$$F^{(n)}_{\Theta_1, \ldots, \Theta_k, \Theta_{k+1}, \ldots, \Theta_n}(x_1, \ldots, x_k, x_{k+1}, \ldots, x_n) = -F^{(n)}_{\Theta_1, \ldots, \Theta_{k+1}, \Theta_k, \ldots, \Theta_n}(x_1, \ldots, x_{k+1}, x_k, \ldots, x_n) \quad \forall\, 1 \le k \le n. \tag{30}$$
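To make the antisymmetry condition (30) concrete, the following minimal sketch works with a finite-dimensional toy model: coefficient arrays indexed by tuples of integers stand in for the coefficients of F^(n), and the spacetime arguments are dropped altogether. The function names `antisymmetrize` and `is_antisymmetric`, and the dictionary representation, are illustrative choices and do not come from the paper.

```python
from fractions import Fraction
from itertools import permutations, product


def sign(perm):
    """Sign of a permutation given as a tuple of 0..n-1, via inversion count."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def antisymmetrize(coeffs, n, dim):
    """Total antisymmetrization (1/n!) * sum over pi of sign(pi) * coeffs[pi(index)].

    coeffs maps index tuples of length n (entries in range(dim)) to numbers;
    missing tuples count as zero.
    """
    perms = list(permutations(range(n)))
    result = {}
    for index in product(range(dim), repeat=n):
        total = Fraction(0)
        for perm in perms:
            permuted = tuple(index[p] for p in perm)
            total += sign(perm) * Fraction(coeffs.get(permuted, 0))
        value = total / len(perms)
        if value != 0:
            result[index] = value
    return result


def is_antisymmetric(coeffs, n, dim):
    """Toy analogue of (30): swapping two adjacent indices flips the sign."""
    for index in product(range(dim), repeat=n):
        for k in range(n - 1):
            swapped = list(index)
            swapped[k], swapped[k + 1] = swapped[k + 1], swapped[k]
            if coeffs.get(index, 0) != -coeffs.get(tuple(swapped), 0):
                return False
    return True


# A rank-3 array with a single non-zero entry is not antisymmetric,
# but its antisymmetrization is, and antisymmetrizing twice changes nothing.
raw = {(0, 1, 2): 6}
anti = antisymmetrize(raw, n=3, dim=3)
```

In this toy model the antisymmetrization acts as a projector: applying `antisymmetrize` to `anti` returns `anti` again, which mirrors the role of antisymmetric elements in the definition of C(M, g).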
The subspace C(M, g) can be promoted to an algebra with respect to the following product, which we shall henceforth indicate as ·A ; let F ≐ {F^(n)} and G ≐ {G^(n)} be two generic elements in C(M, g), then
$$(F \cdot_A G)^{(n)} \doteq \sum_{p+q=n} \binom{n}{p}\, A(F^{(p)} \otimes G^{(q)}), \tag{31}$$
where $(F^{(p)} \otimes G^{(q)})(x_1, \ldots, x_{p+q}) \doteq F^{(p)}(x_1, \ldots, x_p) \otimes G^{(q)}(x_{p+1}, \ldots, x_{p+q})$ and A is the total antisymmetrization projector such that F ·A G is indeed an element of
C(M, g). Specifically, A leaves D^0 invariant, while, for an arbitrary $F^{(n)} \doteq f_1 \otimes \cdots \otimes f_n$ with f_i ∈ D and n > 0, the antisymmetrization reads
$$A(f_1 \otimes \cdots \otimes f_n) = \frac{1}{n!} \sum_{\pi_n \in S_n} (-1)^{|\pi_n|}\, f_{\pi_n(1)} \otimes \cdots \otimes f_{\pi_n(n)},$$
where the sum is taken over all permutationsᵍ π_n ∈ S_n and A can be extended to D^n by linearity. The algebra (C(M, g), ·A ) can be interpreted as the algebra of (not necessarily linear) functionals on the classical field configurations of Dirac spinors.
Remark. Notice that, if we had defined ·A without the combinatorial factor $\binom{n}{p}$, (C(M, g), ·A ) would have been a subalgebra of F̃(M, g). However, since we want to regard elements F = {F^(n)} ∈ C(M, g) as functionals on smooth test sections and F^(n) as being their nth (antisymmetrized) functional derivatives, we have introduced the factor $\binom{n}{p}$, as it implements the Leibniz rule for functional differentiation. Despite the previous comments, (C(M, g), ·A ) is indeed isomorphic to a subalgebra of F̃(M, g) upon identifying F^(n) ∈ F̃(M, g) with $\frac{1}{n!} F^{(n)}$ ∈ C(M, g).
The standard quantization scheme is eventually realized by changing the product ·A into a suitable ⋆-product compatible with the anticommutation relations. The overall procedure, once a functional ∆ : D^2 → C is selected, can be realized out of the map Γ∆ : D^n → D^(n−2) whose action on a generic element F^(n) of D^n is required to be trivial if n < 2, whereas, for n ≥ 2,
$$\Gamma_\Delta F^{(n)} \doteq \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \int_M d\mu(x_i) \int_M d\mu(x_j)\, (-1)^{j-i+1}\, \Delta_{\Theta_i \Theta_j}(x_i, x_j)\, F^{(n)}_{\Theta_1 \cdots \Theta_n}(x_1, \ldots, x_n) \times E_{\Theta_1}(x_1) \otimes \cdots \otimes \widehat{E_{\Theta_i}(x_i)} \otimes \cdots \otimes \widehat{E_{\Theta_j}(x_j)} \otimes \cdots \otimes E_{\Theta_n}(x_n). \tag{32}$$
Here, $\Delta(x, y) \doteq \Delta_{\Theta_1 \Theta_2}(x, y)\, E^{\Theta_1}(x) \otimes E^{\Theta_2}(y)$, with $E^\Theta(x)$ the dual of $E_\Theta(x)$, denotes the integral kernel of ∆, whereas the symbol $\widehat{E_{\Theta_i}(x_i)}$ indicates that $E_{\Theta_i}(x_i)$ must be omitted. On account of the regularity of the elements of C(M, g), we can now safely define a ⋆-product as
$$F \star_S G \doteq A\, \alpha_S(F \otimes G), \tag{33}$$
where αS is defined as a formal exponentialʰ
$$\alpha_S \doteq \exp\Big(\frac{1}{2}\, i\, \Gamma_{\tilde S}\Big);$$
ᵍ Of course not all permutations employed are necessary, since in (31) the constituents will already be antisymmetric. The antisymmetrization as defined here, however, is still valid and it constitutes the easiest way to write it without unnecessarily getting lost in combinatorics.
ʰ Since C(M, g) contains only finite sequences of test sections, the exponential series will always terminate after finitely many terms.
here, $\Gamma_{\tilde S}$ arises from (32) if one inserts for ∆ the functional
$$D^2 \ni (f_1 \oplus f_2) \otimes (g_1 \oplus g_2) \mapsto \tilde S(f_1 \oplus f_2, g_1 \oplus g_2) \doteq -\langle f_2, S g_1 \rangle + \langle S_* f_1^\dagger, g_2^\dagger \rangle,$$
with S and S∗ being the causal propagators constructed in Theorem 2.3. We would like to stress at this point that, on account of both the definition of Γ∆ in (32) and the symmetry property enjoyed by S̃, only mutual contractions between F, G ∈ C(M, g) survive in (33). The product ⋆S defined as above is a modification of the product ·A , i.e. for f, g in D ⊂ F(M, g),
$$f \star_S g = f \cdot_A g + \frac{i}{2}\, \tilde S(f, g),$$
with ħ playing the role of a deformation parameter; this parameter appears as a coefficient in front of S̃, but is not visible here, since it is set to 1 in natural units. Consequently, quantisation can be understood as deforming the algebra of classical anticommuting fields to one of quantum Dirac fields respecting the standard nontrivial anticommutation relations, i.e.
$$f \star_S g + g \star_S f = i \tilde S(f, g) = (\Gamma f, g);$$
this is accomplished by leaving the algebra elements unchanged and deforming only the product.
Remarks. If we introduce a ∗-operation on (C(M, g), ⋆S ) via the straightforward tensorialization of Γ (16), the result is naturally isomorphic to the off-shell version of F(M, g) with its standard product. In particular, B(f) ∈ F(M, g) corresponds to f ∈ C(M, g) and the equations of motion can then be implemented by dividing out a suitable ideal. Since the Dirac equations will not be necessary in the following discussion, we will denote both the on-shell and off-shell algebras with (C(M, g), ⋆S ) and we shall consider them as being isomorphic respectively to F(M, g) and to its off-shell version (see also the first remark of this subsection).
3.5. On the notion of Wick polynomials: The enlarged algebra and its states
Up to now we have focused on rather regular objects constructed out of D, but, alas, this does not suffice to reach our ultimate goals; as a matter of fact, we need to consider the spaces of compactly supported distributions as well:
$$E'^{\,0} \doteq \mathbb{C}, \qquad E'^{\,n} \doteq E'((D^*M \oplus DM)^n), \qquad E' \doteq E'^{\,1},$$
where we would like to point out that elements of E′ act on test sections of D∗M ⊕ DM and not of DM ⊕ D∗M. This “dual” notation, as employed, e.g., in [25], is used to stress that D ↪ E′. The underlying leitmotiv to move on from D to E′ is rooted in our interest in objects like $\int_M d\mu(x)\, f(x) :\!\psi^\dagger(x)\psi(x)\!:$, which will in the subsequent discussion correspond to distributions like f(x)δ(x, y); these are nothing but elements of $E'^{\,n}$
which are supported on the thin diagonal
$$\mathrm{Diag}_n \doteq \{(x_1, \ldots, x_n) \in M^n \mid x_1 = \cdots = x_n\}.$$
Since this amounts to potentially ill-defined operations such as taking the product of distributions at the same spacetime point, we cannot blindly extend (C(M, g), ⋆S ) (and equivalently F(M, g)) to incorporate these new objects into an enlarged ∗-algebra, but we have to require some suitable regularity conditions on both the distribution spaces and the ⋆-product. To fix the necessary regularity conditions on the distribution spaces, let us take an arbitrary open cone Γ ⊆ (T∗M)^n, i.e. an open subset of exterior tensor products of the cotangent bundle invariant under multiplication with positive scalars, and define
$$E'^{\,n}_\Gamma \doteq \{t \in E'^{\,n} \mid WF(t) \subset \Gamma\}.$$
This puts us in the position to state the following definition.
Definition 3.5. We call extended space of functionals the space
$$C_{\mathrm{ext}}(M, g) \doteq \bigoplus_{n=0}^{\infty} E'^{\,n}_{\Gamma, A},$$
where it is understood that elements of Cext(M, g) are finite sequences of distributions, the subscript A indicates restriction to antisymmetric elements in analogy to the previous subsection, and
$$\Gamma \doteq (T^*M)^n \setminus \big(M^n \times (\bar V_+^n \cup \bar V_-^n)\big) \tag{34}$$
with V̄+ and V̄− denoting the closure of the future and the past light cone, respectively, in the fiber of the cotangent bundle at each point of M; here, we choose the same symbol Γ for all relevant cones in favor of notational simplicity.
In order to promote Cext(M, g) to an algebra, we have to choose a product complementing the regularity of Cext(M, g). It is manifest that the product ⋆S is not up to the task since there are cases where it would lead us to pointwise products of causal propagators, which are ill-defined due to their wavefront set. In order to avoid the aforementioned problem, we can replace ⋆S by ⋆H , which is nothing but (33) with S̃ replaced by −2iH as defined in (28). Notice that such a replacement needs to be done after having explicitly expanded αS and $\Gamma_{\tilde S}$, in such a way that only mutual contractions between F and G in F ⋆H G are considered. In other words, in the computation of F ⋆H G, only operations of the form
$$\int_{M^k} d\mu(x_1) \cdots d\mu(x_k) \int_{M^k} d\mu(y_1) \cdots d\mu(y_k)\, H_{\Theta_1 \Theta_{p+1}}(x_1, y_1) \times \cdots \times H_{\Theta_k \Theta_{p+k}}(x_k, y_k)\, F^{(p)}_{\Theta_1 \cdots \Theta_p}(x_1, \ldots, x_p)\, G^{(q)}_{\Theta_{p+1} \cdots \Theta_{p+q}}(y_1, \ldots, y_q) \times E_{\Theta_{k+1}}(x_{k+1}) \otimes \cdots \otimes E_{\Theta_p}(x_p) \otimes E_{\Theta_{p+k+1}}(y_{k+1}) \otimes \cdots \otimes E_{\Theta_{p+q}}(y_q) \tag{35}$$
appear.ⁱ We can see that the new ⋆-product is equivalent to the old one when restricted to C(M, g) where it is well defined, being
$$F \star_H G = \alpha_H\big(\alpha_H^{-1}(F) \star_S \alpha_H^{-1}(G)\big), \qquad \alpha_H \doteq \exp \Gamma_H, \tag{36}$$
where ΓH is defined as in (32) upon inserting $H - \frac{i}{2}\tilde S$ for ∆. It remains to be shown that ⋆H is a good product in Cext(M, g). First of all, the wavefront sets of the factors of the integrand in (35), that is $WF(F^{(p)} \otimes G^{(q)})$ and $WF(I^{\otimes p+q-2k} \otimes H^{\otimes k})$, do not sum up to the zero section. Then, one can apply [43, Theorem 8.2.10] to conclude that the aforementioned integrand is indeed a distribution with compact support. Therefore, the overall integral is well-defined. Afterwards, one has also to make sure that Cext(M, g) is closed with respect to ⋆H . To this end, if we take into account the singular structure of H and the fact that the wavefront sets of F^(p) and G^(q) must be of the form (34), we can make use of [43, Theorem 8.2.13] in order to conclude that the distributions resulting from operations like (35) have again wavefront sets of the form (34). Hence, it turns out that the problem with ⋆S disappears. Owing to the above discussion, we can safely define the algebra (Cext(M, g), ⋆H ). Recalling that (C(M, g), ⋆S ) is isomorphic to F(M, g), we can reverse this viewpoint and just define the extended ∗-algebra Fext(M, g) ≐ (Cext(M, g), ⋆H ). Similarly, restricting the possible test sections and distributions taken into account, we can define the extended algebras Feven,ext(M, g) and Aext(M, g). That said, following closely the analysis of the scalar case in [39, 40], the product (36) rephrases the Wick formula in the Dirac scenario. Let us stress that (Cext(M, g), ⋆H ) is a proper extension, up to equivalence of the products, of (C(M, g), ⋆S ) in mathematical terms, as D ↪ E′ and the Hörmander product of distributions reduces to the standard product if the distributions are test sections. Before introducing Hadamard states on F(M, g) in Sec. 3.3, we already anticipated that among all admissible states on F(M, g), only they can be extended to states on the extended algebra Fext(M, g).
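For single test sections f, g ∈ D the replacement of S̃ by −2iH can be carried out explicitly. The following lines are only a sketch: they take the lowest-order expression for the product ⋆S given earlier in this subsection at face value and perform the substitution; the resulting form of ⋆H on such elements is an inference from (28) and (33), not a formula stated in the text.

```latex
\begin{align*}
f \star_S g &= f \cdot_A g + \frac{i}{2}\,\tilde S(f, g), \\
f \star_H g &= f \cdot_A g + \frac{i}{2}\,\bigl(-2i\,H(f, g)\bigr)
             = f \cdot_A g + H(f, g), \\
f \star_H g - f \star_S g &= \Bigl(H - \frac{i}{2}\,\tilde S\Bigr)(f, g).
\end{align*}
```

On such elements the two products therefore differ by a number built from the same functional, H − (i/2)S̃, that is inserted for ∆ in the definition of ΓH in (36).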
We would now like to make this statement more precise, which as a byproduct will allow us to actually construct states on Fext(M, g). To this end, we shall briefly recall the results of [38], which were initially proven for scalar fields, but straightforwardly extend to our case, as they rely solely on properties of wavefront sets and truncated n-point functions (cf., e.g., [38] for a definition of truncated n-point functions) that are equally present in the Dirac field scenario. To extend states from F(M, g) to Fext(M, g), we need a notion of continuity proper to our setting. Let us recall that firstly D ↪ E′, but secondly a restricted version of E′ and its powers has been used in constructing Fext(M, g), namely, $E'_\Gamma$
ⁱ Notice that, due to the antisymmetry of F and G, any k-fold contraction is up to a sign a contraction of the first k pairs of entries.
with Γ as defined in (34). Hence, we need a notion of convergence which keeps the wavefront set under control, as provided in [43, 38]:
Definition 3.6. Let Γ be an arbitrary but fixed open cone in (T∗M)^n and let t_k be a sequence in $E'^{\,n}_\Gamma$. We say that t_k → t in $E'^{\,n}_\Gamma$ if and only if
(1) $\bigcup_k \operatorname{supp} t_k$ is compact,
(2) t_k → t weakly in the sense of distributions,
(3) for every properly supported pseudodifferential operator P with P t_k ∈ D^n for all k and P t ∈ D^n, P t_k → P t in D^n.
This defines the Hörmander pseudo topology on $E'^{\,n}_\Gamma$. We can now employ it to straightforwardly induce a topology on Fext(M, g) (recall that elements of Fext(M, g) correspond to finite sequences of distributions) and in turn define a state ω on Fext(M, g) to be continuous if and only if ω(F_k) → ω(F) whenever Fext(M, g) ∋ F_k → F ∈ Fext(M, g). The most important property of the Hörmander pseudo topology for our purposes is the following result [43].
Proposition 3.3. Let Γ be an arbitrary but fixed open cone in (T∗M)^n. For every t ∈ $E'^{\,n}_\Gamma$, there is a sequence t_k in D^n such that t_k → t in $E'^{\,n}_\Gamma$.
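The weak convergence in item (2) of Definition 3.6, which underlies Proposition 3.3, can be illustrated numerically in a one-dimensional toy setting. The sketch below is an illustration under simplifying assumptions and not the construction used in [43]: it approximates the Dirac delta on the real line by rescaled, compactly supported bump functions t_k and checks that the pairing of t_k with a smooth function f approaches f(0). It says nothing about condition (3), which is the one controlling the wavefront set. The names `bump` and `pairing` and the midpoint quadrature are choices made here for illustration.

```python
import math


def bump(u):
    """Smooth function supported in (-1, 1); a test function up to normalization."""
    if abs(u) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - u * u))


def pairing(k, f, samples=20000):
    """Approximate the pairing <t_k, f> with t_k(x) = k * bump(k x) / Z.

    Z normalizes the bump to unit integral. After substituting u = k x the
    pairing becomes (1/Z) * integral of bump(u) * f(u / k) over (-1, 1),
    evaluated here with the midpoint rule.
    """
    step = 2.0 / samples
    weighted = 0.0
    norm = 0.0
    for i in range(samples):
        u = -1.0 + (i + 0.5) * step
        b = bump(u)
        weighted += b * f(u / k)
        norm += b
    return weighted / norm


# The pairings approach f(0) = cos(0) = 1 as k grows.
errors = [abs(pairing(k, math.cos) - 1.0) for k in (5, 50)]
```

The supports of the t_k shrink to the origin, so condition (1) of Definition 3.6 is also satisfied in this toy example.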
This proposition implies that D^n is dense in $E'^{\,n}_\Gamma$ for every open cone Γ, which in turn implies that F(M, g) is dense in Fext(M, g) with respect to the topology just defined. Armed with these topological insights, we can proceed to construct states on Fext(M, g). As final preparation, let us note that our construction of Fext(M, g) entails in particular the standard paradigm according to which the product of two fields, say ψ†(x)ψ(y), should be regularized as
$$:\!\psi^\dagger(x)\psi(y)\!: \;\doteq\; \psi^\dagger(x)\psi(y) + \frac{1}{8\pi^2}\, D_y H^-(x, y),$$
such that, for a Hadamard state ω, ω(: ψ†(x)ψ(y) :) = −(8π²)^(−1) D_y W(x, y). At the level of expectation values, this can be equivalently seen as leaving ψ†(x)ψ(y) unchanged while ω−(x, y) becomes ω−(x, y) + D_y H−(x, y). This somewhat heuristic comment prompts the following:
Definition 3.7. Consider a quasi-free Hadamard state ω, whose n-point function is indicated as ωn . One can define the regularized n-point function : ωn : as
$$:\!\omega_n\!: \;\doteq\; \omega_n = 0 \quad \text{if } n \text{ is odd},$$
$$:\!\omega_n\!:(x_1, \ldots, x_n) \doteq \sum_{\pi_n \in S'_n} (-1)^{|\pi_n|} \prod_{i=1}^{n/2} (\omega_2 - H)(x_{\pi_n(2i-1)}, x_{\pi_n(2i)}) \quad \text{if } n \text{ is even},$$
with H as in (28), whereas the set of ordered permutations S′n ⊂ Sn is the one introduced in Definition 3.2.
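The combinatorics of Definition 3.7 can be made concrete with a short sketch. Since Definition 3.2 lies outside this excerpt, the code assumes that the ordered permutations are those with π(2i−1) < π(2i) for every i and π(1) < π(3) < ··· < π(n−1), i.e. one representative per pairing; this is a common convention for quasi-free states, but it should be checked against Definition 3.2. A plain two-argument function `w` over integer labels stands in for the kernel (ω2 − H), and all function names are illustrative.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation of 0..n-1 computed from its inversion count."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def ordered_permutations(n):
    """Permutations with sorted pairs and increasing first entries of the pairs."""
    for perm in permutations(range(n)):
        pairs_sorted = all(perm[2 * i] < perm[2 * i + 1] for i in range(n // 2))
        firsts_sorted = all(perm[2 * i] < perm[2 * i + 2] for i in range(n // 2 - 1))
        if pairs_sorted and firsts_sorted:
            yield perm


def regularized_n_point(w, n):
    """Signed sum over pairings of products of w; zero for odd n."""
    if n % 2:
        return 0
    total = 0
    for perm in ordered_permutations(n):
        term = sign(perm)
        for i in range(n // 2):
            term *= w(perm[2 * i], perm[2 * i + 1])
        total += term
    return total


# Toy antisymmetric kernel on four labels, w(i, j) = -w(j, i).
table = [[0, 1, 2, 3], [-1, 0, 4, 5], [-2, -4, 0, 6], [-3, -5, -6, 0]]
value = regularized_n_point(lambda i, j: table[i][j], 4)
```

For n = 4 the sum runs over three pairings, and for the toy table it evaluates to 1·6 − 2·5 + 3·4 = 8, which is the Pfaffian of the antisymmetric matrix `table`.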
As a straightforward consequence of the last definition, we can form expectation values of all elements in Fext(M, g), since all : ωn : defined as above are manifestly smooth on account of Theorem 3.1. Specifically, for any F ≐ {F^(n)} ∈ Fext(M, g), we define
$$\omega(F) \doteq \sum_n \big\langle F^{(n)}, :\!\omega_n\!: \big\rangle.$$
The above equation defines a complex-valued, linear, normalized functional on Fext(M, g) which extends the action of the quasi-free Hadamard state ω under consideration from F(M, g) to Fext(M, g). The last property we have to check is positivity on the full Fext(M, g), but this follows from the positivity on F(M, g), since ω is continuous and F(M, g) is dense in Fext(M, g). Altogether we have just proved the second direction of the main result of [38] (for the special case of quasi-free states):
Theorem 3.2. Let ω± be defined as in (22).
(1) Let ω be a continuous state on Fext(M, g). Then ω± are of Hadamard form and the truncated n-point functions of ω on F(M, g) are smooth for n ≠ 2.
(2) Conversely, if a state ω on F(M, g) is such that ω± are of Hadamard form and the truncated n-point functions of ω are smooth for n ≠ 2, then ω extends to a continuous state on Fext(M, g).
For the full proof of the above theorem for the case of not necessarily quasi-free states we refer the interested reader to [38]. At this stage, we need to point out that there are still some ambiguities in the employed definition of H± and thus in the definition of both H and : ωn :; indeed, the reference length λ necessary to construct H± according to Definition 3.4 is in principle undetermined. This fact does not, however, hamper our analysis since different choices of λ and thus of H lead to isomorphic algebras.
Lemma 3.1. Suppose we choose two different H, say H1 and H2 , to construct the extended algebra (Cext(M, g), ⋆H ). Then the two resulting algebras (Cext(M, g), ⋆H1 ) and (Cext(M, g), ⋆H2 ) are isomorphic.
Proof. Due to the properties of the Hadamard distributions H±, one knows that the difference d ≐ H2 − H1 has a smooth antisymmetric integral kernel. The two products ⋆H1 and ⋆H2 are related by a deformation. They are thus equivalent and the ∗-algebra isomorphism relating them can be realized as
$$\alpha_d \doteq \exp(\Gamma_d),$$
where Γd is taken as in (32) with d being inserted in place of ∆. In particular,
$$F \star_{H_2} G = \alpha_d\big(\alpha_d^{-1} F \star_{H_1} \alpha_d^{-1} G\big),$$
which is well-defined and holds true since $\alpha_{H_2} \circ \alpha_{H_1}^{-1} = \alpha_d$ and d has a smooth integral kernel.
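The identity relating αH1 , αH2 and αd that closes the proof can be spelled out in two lines. The sketch below assumes, as seems implicit in the proof, that the map Γ∆ of (32) depends linearly on ∆ and that ΓH1 and ΓH2 commute, so that the formal exponentials combine.

```latex
\begin{align*}
\Gamma_{H_2} - \Gamma_{H_1}
  &= \Gamma_{\Delta}\big|_{\Delta = H_2 - \frac{i}{2}\tilde S}
   - \Gamma_{\Delta}\big|_{\Delta = H_1 - \frac{i}{2}\tilde S}
   = \Gamma_{\Delta}\big|_{\Delta = H_2 - H_1}
   = \Gamma_d, \\
\alpha_{H_2} \circ \alpha_{H_1}^{-1}
  &= \exp(\Gamma_{H_2})\,\exp(-\Gamma_{H_1})
   = \exp(\Gamma_{H_2} - \Gamma_{H_1})
   = \exp(\Gamma_d)
   = \alpha_d .
\end{align*}
```

The singular contribution of S̃ cancels in the difference, which is why only the smooth kernel d enters the isomorphism.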
To finish the preparations for the final section of this work, we have to address a last issue. At the moment we are falling one step short of our ultimate goal since, to study the regularization of the stress-energy tensor, one has to understand the treatment of differentiated fields. Hence, a small addendum to the above analysis is needed and we shall follow the procedure employed for scalar fields in [57], though adapted to our language. Thus, let us take a differential operator K on D(DM) of the form
$$K = a_0 + \nabla_{a_1} + \cdots + \nabla^R_{a_R}, \qquad R < \infty,$$
where
$$\nabla^k_{a_k} \doteq a_k^{\mu_1 \cdots \mu_k\, A}{}_{B}\, \nabla_{\mu_1} \cdots \nabla_{\mu_k},$$
and $a_k^{\mu_1 \cdots \mu_k\, A}{}_{B}$ for k ∈ {0, . . . , R} are the coefficients of an element of $\Gamma(\underbrace{TM \otimes \cdots \otimes TM}_{k} \otimes DM \otimes D^*M)$.
Notice that this class of differential operators encompasses both the Dirac operators and the spinorial Klein–Gordon operator, which will appear in the expression of the stress-energy tensor and of its trace. In an analogous way we can choose differential operators K′ on D(D∗M) and combine them with operators K as above to obtain operators K ⊕ K′ on D. If we now bear in mind Definition 3.5, we realize that the extended set of fields is defined out of a condition on the wavefront set of its elements. Thus, in order to fit any operator of the form K ⊕ K′ into the above discussion, we just need to recall a general result on wavefront sets (cf. [43, Chap. 8] or [26]) according to which a partial differential operator, being properly supported, does not increase the wavefront set of a distribution it is applied to. We can thus readily conclude that operators of the form K ⊕ K′ map Cext(M, g) to itself and that the previous discussion has already encompassed the treatment of differentiated fields. One could now prove several further properties of Wick polynomials of differentiated fields, but we will not indulge in this task since it will play no role in the forthcoming discussion and, furthermore, the results are by all means a straightforward extension, both as concepts and as technical proofs, of those discussed in [57] for the scalar case. At this point, it is worth stressing that, in principle, one might even wish to extend the results of this last cited paper to account for differentiation with respect to the frame. Nonetheless, we shall not pursue this, since it will not be needed in the next section. Before proceeding with the discussion of the stress-energy tensor, let us finally remark on how the extended algebra Fext(M, g) fits into the framework discussed in Sec. 3.2.
Without going much into details we would like to point out that any coinciding point limits of smooth objects constructed out of the Hadamard distributions are locally covariant, since the construction of H ± depends only on the mass and the local curvature. As a result, all elements of Fext (M, g) which correspond to distributions with support on the thin diagonal are locally covariant fields.
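As a concrete instance of the class of differential operators K considered earlier in this subsection, the Dirac operator itself is of the stated form with R = 1. The sketch assumes the convention Dψ = (−γ^µ∇_µ + m)ψ, which is the one suggested by the form of the action given in Sec. 4.1; the sign and index placement of definition (11), which is not reproduced in this excerpt, should be checked against it.

```latex
\begin{align*}
K = D &= a_0 + \nabla_{a_1},
& a_0 &= m\,\mathbb{I},
& a_1^{\mu} &= -\gamma^{\mu}, \\
\nabla_{a_1}\psi &= a_1^{\mu}\,\nabla_{\mu}\psi = -\gamma^{\mu}\,\psi_{;\mu}.
\end{align*}
```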
4. The Stress-Energy Tensor of Dirac Fields
The aim of the section is to focus on the structure of the stress-energy tensor and to study its quantum properties. In particular, we shall show that it is possible to introduce an improved tensor which is conserved also at the quantum level, although its trace acquires new and classically unexpected terms of geometric origin which lie at the heart of the so-called trace anomaly.
4.1. The classical stress-energy tensor
We start our analysis by reviewing the form and the properties of the stress-energy tensor for Dirac spinors in a classical framework. The Dirac equations (11) can be realized as the equations determining the extremum of the action functional
$$S \doteq \int_M d^4x\, \sqrt{|g|}\, L \doteq \int_M d^4x\, \sqrt{|g|}\, \Big( \frac{1}{2}\, \psi^\dagger (D\psi) + \frac{1}{2}\, (D'\psi^\dagger)\, \psi \Big).$$
A direct inspection of the above action shows us that, for compactly supported sections ψ, ψ†, it is, up to a total derivative term, identical to the more common expression
$$S = \int_M d^4x\, \sqrt{|g|}\, \psi^\dagger\big({-\gamma^\mu \psi_{;\mu}} + m\psi\big).$$
We define the (Hilbert) stress-energy tensor by the usual procedure, i.e.
$$T_{\mu\nu} \doteq \frac{2}{\sqrt{|g|}}\, \frac{\delta S}{\delta g^{\mu\nu}}.$$
An explicit realization of this last identity in the case of spinor fields is much more involved due to the underlying orthogonal Lorentz frames whose explicit dependence on the metric must be accounted for. Actually, it is also possible to define the stress-energy tensor as a variation with respect to the frame, but, ultimately, this yields the same result as that with respect to the metric. We shall stick to this last perspective also on account of the results at the end of the previous section where we have not discussed derivatives with respect to the frame. A lengthy and, to a certain extent, tedious calculation, fully developed in [30], yields
$$T_{\mu\nu} = \frac{1}{2}\big(\psi^\dagger_{;(\mu} \gamma_{\nu)} \psi - \psi^\dagger \gamma_{(\mu} \psi_{;\nu)}\big) - L\, g_{\mu\nu}, \tag{37}$$
where ( ) denotes idempotent symmetrisation; one should also notice that, the field being free, the Lagrangian vanishes on shell, i.e. if one imposes the Dirac equations. If we contract (37) with g^{µν}, we end up with the classical trace
$$T \doteq g^{\mu\nu} T_{\mu\nu} = \frac{1}{2}\big(\psi^\dagger_{;\mu} \gamma^\mu \psi - \psi^\dagger \gamma^\mu \psi_{;\mu}\big) - 4L = -m\, \psi^\dagger \psi,$$
where, in the second equality, we have evaluated the left-hand side on shell by means of (11). Hence, as expected, the trace vanishes on shell for conformally invariant, i.e. massless, Dirac fields. If we consider instead the covariant conservation, we need to calculate
$$\nabla^\mu T_{\mu\nu} = \frac{1}{4}\big\{ {-\psi^\dagger_{;\nu}}\, D\psi + [D'\psi^\dagger]_{;\nu}\, \psi - D'\psi^\dagger\, \psi_{;\nu} + \psi^\dagger [D\psi]_{;\nu} + P\psi^\dagger \gamma_\nu \psi - \psi^\dagger \gamma_\nu P\psi \big\} - L_{;\nu},$$
which vanishes on shell.
4.2. The quantum stress-energy tensor: The problem
In the next step, we would like to define the quantum version of the stress-energy tensor for Dirac fields. Since we have a well-defined notion of Wick polynomials at hand, it would be easy to just take the classical expression for the stress-energy tensor and replace the occurring field monomials with their normal ordered quantum counterparts. This way, one would easily get an element of Aext(M, g) which, GNS-represented with respect to a Hadamard state, would be a well-defined operator valued smooth function. As we will shortly see, however, this procedure would not yield a meaningful object. To understand this, let us take a slight detour and think about the properties we would like a quantum stress-energy tensor to have. From the point of view of quantum field theory on curved backgrounds, the most important entity to take into account as a guide in the search for a good quantum stress-energy tensor is of course the semi-classical Einstein's equation, viz.,
$$G_{\mu\nu}(x) = 8\pi G\, \omega(:\!T_{\mu\nu}(x)\!:), \tag{38}$$
where Gµν denotes the Einstein tensor $R_{\mu\nu} - \frac{1}{2} R\, g_{\mu\nu}$, G is the gravitational constant and : Tµν : is a suitably regularized expression for the quantum stress-energy tensor. This equation thus describes the back-reaction of the quantum field on the background, and it is worth noticing that : Tµν :, on the right-hand side, is not meant just as a normal ordered expression with respect to a reference state, but rather as the fully renormalized stress-energy tensor with respect to the Hadamard prescription introduced and discussed in the preceding section. The legitimate question which now arises is under which circumstances this equation makes sense at all. Regarding the form of the equation, we will restrict ourselves to pointing out that it can be derived by formally expanding a quantum metric and a quantum field about any classical vacuum solution of Einstein's equation and computing the equation of motion for the expected metric while keeping only “tree-level” graviton contributions and “loop-level” quantum field contributions. Since one discards “loop-level” graviton contributions, the equation derived in this manner can only make sense as a model equation, or maybe for special states. We refer the interested reader to [28] and the references cited therein for an exhaustive treatment of this topic while we shall continue dwelling upon the properties of the quantities appearing in (38).
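Since the trace of the stress-energy tensor is the central object of this work, it may help to record what (38) implies for it. Contracting both sides with g^{µν} in four spacetime dimensions gives the relation below. This is a standard consequence of the definition of the Einstein tensor and is added only as an illustration; the shorthand : T : for the contracted regularized tensor is introduced here and is not notation taken from the text.

```latex
\begin{align*}
g^{\mu\nu} G_{\mu\nu}
  &= R - \tfrac{1}{2}\, R\, g^{\mu\nu} g_{\mu\nu}
   = R - 2R = -R, \\
-R(x) &= 8\pi G\, \omega(: T(x) :), \qquad
: T : \;\doteq\; g^{\mu\nu} : T_{\mu\nu} : .
\end{align*}
```

Any contribution to the expectation value of the trace therefore feeds directly into the scalar curvature of a solution of (38).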
The first, and certainly obvious, observation is that we need a regularized expression of the stress-energy tensor to obtain a finite expectation value, i.e. a finite right-hand side. The next observation is that the left-hand side of (38) is a classical and “sharp” quantity, while the right-hand side is a probabilistic object. Such a situation can of course only be a sensible one if the fluctuations of the probabilistic quantity involved are small in comparison to the quantity itself, and finite in particular. These considerations are exactly those which lead one to consider and select the states of Hadamard type as the physical and reasonable ones among the myriad of states available in quantum field theory on curved spacetimes. As a matter of fact, if we define normal ordering by means of the Hadamard singularity and we evaluate the normal-ordered (and smeared) stress-energy tensor on Hadamard states, we automatically get a quantity with finite fluctuations. This stems from the fact that powers of the Hadamard bidistribution are again well-defined bidistributions, thanks to the special form of the Hadamard wavefront set. Regarding quantitative statements about the fluctuations of the expected stress-energy tensor, it seems that in general a priori statements are not possible and one has to look at solutions of the semiclassical Einstein’s equations to compute a posteriori the fluctuations on these solutions and inspect to what extent these are to be trusted. It will prove helpful to realise that ω(: Tµν (x) :), for a : Tµν (x) : given as some linear combination of the previously defined Wick monomials and evaluated on a Hadamard state ω, can be equivalently expressed as

\[
\omega(:T_{\mu\nu}(x):) \doteq \frac{1}{8\pi^2}\,\mathrm{Tr}\,[D_{\mu\nu}(x,y)\,W(x,y)],
\tag{39}
\]
where Dµν is some (bi)differential operator specified by the choice of linear combination of Wick monomials in the definition of : Tµν (x) :, Tr denotes the trace over spinor indices, and we refer the reader to Appendix A for the explanation of the possibly unfamiliar notations that arise in the context of bispinorial entities and that will be extensively used in the following. As we have already remarked, the obvious choice of expression for the stress-energy tensor in terms of Wick monomials will not turn out to be the best one. In terms of the above defined differential operator, this means that the canonical version derived from the classical stress-energy tensor (37),

\[
D^{\mathrm{can}}_{\mu\nu} \doteq -\tilde{D}^{\mathrm{can}}_{\mu\nu}\,D_y \doteq -\frac{1}{2}\,\gamma_{(\mu}\left(\nabla_{\nu)} - g_{\nu)}^{\nu'}\nabla_{\nu'}\right)D_y,
\]

is not well suited for defining a sensible ω(: Tµν (x) :). Since we have assured ourselves that a right-hand side of (38) obtained by expressing ω(: Tµν (x) :) as (39) is in principle well defined, we could seek additional physical and consistency requirements that lead to a potential refinement of that procedure, i.e. to a sensible choice of Dµν . Pursuing a comparable aim, Wald [78, 79] has set up five axioms that a meaningful expected stress-energy tensor should fulfill. These proved to be a valuable tool in a posteriori legitimating
known stress-energy tensor regularization schemes in curved spacetimes and stating to which extent they may differ from one another without raising doubts about their validity. For the convenience of the reader, we list them in the following.

Definition 4.1. We say that ω(: Tµν (x) :) fulfills (the strong version of) Wald’s axioms if it has the following five properties:
(1) Given two (not necessarily Hadamard) states ω1 and ω2 such that $\omega_1^-(x,y) - \omega_2^-(x,y)$ is a smooth bispinor, ω1 (: Tµν (x) :) − ω2 (: Tµν (x) :) is equal to $\mathrm{Tr}\,[\tilde{D}^{\mathrm{can}}_{\mu\nu}(\omega_1^-(x,y) - \omega_2^-(x,y))]$.
(2) ω(: Tµν (x) :) is locally covariant in the following sense: let χ : (M1 , g1 , SM1 , ρ1 ) → (M2 , g2 , SM2 , ρ2 ) and αχ : F(M1 , g1 ) → F(M2 , g2 ) be as in Sec. 3.2. If two states ω1 and ω2 on F(M1 , g1 ) and F(M2 , g2 ) are related by ω1 = ω2 ◦ αχ , then $\omega_2(:T_{\mu_2\nu_2}(x_2):) = \chi_*(\omega_1(:T_{\mu_1\nu_1}(x_1):))$, where χ∗ denotes the push-forward of χ in the sense of covariant tensors.
(3) $\nabla^\mu\omega(:T_{\mu\nu}(x):) = 0$.
(4) On Minkowski spacetime and in the Minkowski vacuum state ωMink , ωMink (: Tµν (x) :) = 0.
(5) ω(: Tµν (x) :) does not contain derivatives of the metric of order higher than 2.
ω(: Tµν (x) :) is said to fulfill the reduced version of Wald’s axioms if only the first four statements hold.

Wald originally stated the axioms for scalar fields, while the version we give here is modified to be suitable for Dirac fields. We reckon that a few comments both on the origin and on the meaning of the single axioms might be helpful for the reader in order to understand their relevance:
(1) In a given Fock representation of the quantum field, the non-diagonal matrix elements of the formal unrenormalized stress-energy tensor operator in the “mode basis” are already finite, because their calculation only involves “finite mode sums”, while the calculation of the diagonal matrix elements involves “infinite mode sums” [78, 79].
To regularize the formal stress-energy tensor operator, it is therefore only necessary to subtract an infinite part proportional to the identity operator, thus leaving the non-diagonal matrix elements unchanged. Axiom 1 amounts to requiring such a “minimal” regularization. This axiom is also related to the so-called relative Cauchy evolution of a locally covariant field [15, 66]; since the functional derivative of the relative Cauchy evolution involves the commutator with the stress-energy tensor operator, one could reformulate this axiom on the operator level, requiring that any regularization prescription yields the same relative Cauchy evolution. If we consider Hadamard states, such that $\omega_i^-(x,y)$ is locally given by $-D_y(H^-(x,y) + W_i(x,y))/(8\pi^2)$ for i = 1, 2, the requirement is equivalent to demanding that the differential operator used in (39) is given by $D^{\mathrm{can}}_{\mu\nu}$ plus a term which does not influence the state dependence of ω(: Tµν (x) :).
(2) Taking the locality principle of quantum field theory and the covariance principle of general relativity seriously, we would like to have a ω(: Tµν (x) :) which describes the back-reaction of the quantum field on the spacetime in a local and covariant way. In fact, this axiom seems to have been an inspiration towards the formulation of locally covariant QFT, as described in the seminal paper [15].
(3) This axiom basically points out a necessary condition for the well-posedness of the semiclassical Einstein’s equations; namely, since the geometric left-hand side of (38) is conserved due to the Bianchi identities, also the right-hand side should vanish under the action of the covariant derivative.
(4) It is a sensible prerequisite of any regularization scheme for a field theory on a curved background that it should be possible to read it as an “extension” of the standard normal ordering in Minkowski spacetime, but there are good reasons to skip this axiom, cf. [32] and the second remark after Theorem 4.1.
(5) Wald originally proposed this axiom in a rather technical and stricter way [78, 79], essentially requiring that ω(: Tµν (x) :) does not depend on derivatives of the metric of order higher than the first. The underlying motivation is rooted, on the one hand, in the request of well-posedness of the Cauchy problem for the Einstein’s equations even with a non-vanishing source and, on the other hand, in the need for a sensible “classical” limit of the semiclassical Einstein’s equations (see the enlightening discussion in Wald’s original paper [78, 79]). Wald himself realized, however, that the strict version of this axiom could not even be satisfied in the classical theory and has, thus, proposed the weaker one stated here. Unfortunately, further examinations revealed that even this weaker version does not seem possible to fulfill in massless theories without introducing an artificial length scale into the theory; therefore, the axiom has been discarded. We still believe, however, that it could be fulfilled, though only under special circumstances. We shall comment on this issue at a later stage of the paper.
Using these axioms, Wald could prove a uniqueness result for ω(: Tµν (x) :). The first two axioms already imply that the results from two different sensible regularization schemes can only differ by a local curvature tensor. The third and fourth axioms then imply that this local curvature tensor is conserved and vanishes if the spacetime is locally flat. Requiring that this term has the correct dimension of m⁴, the possible tensors are presumably only the ones obtained by varying a Lagrangian of the form

\[
m^4\left[F_1\!\left(\frac{R}{m^2}\right) + F_2\!\left(\frac{R_{\mu\nu}R^{\mu\nu}}{m^4}\right) + F_3\!\left(\frac{R_{\mu\nu\rho\tau}R^{\mu\nu\rho\tau}}{m^4}\right)\right]
\]
with respect to the metric, with some dimensionless functions Fi (x). Requiring suitable analyticity properties with respect to the curvature tensors and m, it has been shown in [42] that the only possibilities are m⁴gµν (F1 = 1; see footnote j), m²Gµν (F1 = x) and the three local curvature tensors Iµν (F1 = x²), Jµν (F2 = x), Kµν (F3 = x), cf. Appendix A for their full expression. In fact, we will later show that changing the scale λ in the regularizing Hadamard bidistribution $H^-$ amounts to changing ω(: Tµν (x) :) exactly by a tensor of this form and, furthermore, the attempt to regularize Einstein–Hilbert quantum gravity at one-loop order automatically yields a renormalization freedom in the form of such a tensor as well [73]. Having in mind how the semiclassical Einstein’s equation may be derived, these two arguments are of course related by means of internal consistency (see footnote k). Using the Gauss–Bonnet–Chern theorem in four dimensions, which states that

\[
\int_M d\mu(x)\,\left(R_{\mu\nu\rho\tau}R^{\mu\nu\rho\tau} - 4R_{\mu\nu}R^{\mu\nu} + R^2\right)
\]
is a topological invariant and, therefore, has a vanishing functional derivative with respect to the metric [1, 73], one can restrict the freedom even further by removing Kµν from the list of allowed local curvature tensors.

Footnote j: A term proportional to the metric is not allowed if one seeks to fulfill the third axiom. As we will see later, however, it does not seem to be possible to fix this term in a way that is compatible with analyticity in m. Furthermore, the results of Hollands and Wald regarding the restriction of the possible regularization freedom by demanding analytic dependence on curvature and mass have only been obtained for scalar fields. Since the stress-energy tensor for Dirac fields is an observable and thus still a “scalar” field, their results can, nonetheless, presumably be extended to this case.

Footnote k: In fact, at least in the case of scalar fields, the combination of the local curvature tensors appearing as the finite renormalization freedom in [73] is, up to a term which seems to be an artefact of the dimensional regularization employed in that paper, the same that one gets via changing the scale in the regularizing Hadamard bidistribution.

4.3. The quantum stress-energy tensor: The solution and its trace anomaly

We now seek to exploit the above axioms in order to specify a sensible choice of differential operator Dµν . Looking at our proposed regularization procedure (39), the first obstacle to overcome seems to be the covariant conservation axiom. As we have seen above, conservation of the stress-energy tensor in the classical case is a direct consequence of the equation of motion. Since we are regularizing by subtracting from the two-point function the Hadamard bidistribution, which is in general not a solution of the Dirac equation(s), $D^{\mathrm{can}}_{\mu\nu}$ applied to the thereby obtained smooth bispinor will in general not yield a conserved quantity. A viable solution to such problems (cf. the last remark of this section for the comparison to a different solution proposed in [42]), first employed for scalar fields in [57], calls for the modification of the classical stress-energy tensor (37) by terms which vanish on shell, while, at the same time, they help to restore covariant conservation on the quantum side. As in the case of scalar fields, it seems that the only possible option is to add multiples of the Lagrangian to the classical expression of the stress-energy tensor. We thus propose the following classical stress-energy tensor as a starting point

\[
T_{\mu\nu} = \frac{1}{2}\left(\psi^\dagger_{;(\mu}\gamma_{\nu)}\psi - \psi^\dagger\gamma_{(\mu}\psi_{;\nu)}\right) + c\,L\,g_{\mu\nu}
\]

and we look for a c ∈ R that yields a sensible ω(: Tµν (x) :). This is tantamount to the choice of the following differential operator for the point-splitting process (39):

\[
\begin{aligned}
D^{c}_{\mu\nu} &\doteq D^{\mathrm{can}}_{\mu\nu} - \frac{c}{2}\,g_{\mu\nu}\,(D_x + D_y)\,D_y \\
&= -\frac{1}{2}\,\gamma_{(\mu}\left(\nabla_{\nu)} - g_{\nu)}^{\nu'}\nabla_{\nu'}\right)D_y - \frac{c}{2}\,g_{\mu\nu}\,(D_x D_y - P_y).
\end{aligned}
\tag{40}
\]
Before proceeding to prove that there indeed exists a suitable choice of c, we would like to anticipate another result, which can be easily understood from the aforementioned line of argument. Let us recall that there is another property of the classical stress-energy tensor stemming from the equations of motion: it has vanishing trace in the massless (and therefore conformally invariant) case. Following the above discussion, on the one hand, it seems that one might need to give up this property at the quantum level, while, on the other hand, one could still hope that the choice of c also provides a vanishing trace. Alas, it will turn out that this is not the case. One can indeed fix c in such a way that ω(: Tµν (x) :) has vanishing trace for m = 0, but conservation is then inevitably spoilt. Since we have already realized that conservation is an essential requirement for the right-hand side of the semiclassical Einstein’s equations, we will have to accept that $g^{\mu\nu}\omega(:T_{\mu\nu}(x):)$ is not vanishing in the massless case. This goes under the name of trace anomaly.

Theorem 4.1. Let $\lambda_m^2 \doteq 2\exp\!\left(\frac{3}{2} - 2\gamma\right)m^{-2}$ for m ≠ 0 and λm arbitrary for m = 0, where γ denotes the Euler–Mascheroni constant, choose the Hadamard bidistribution to be the one with λ = λm and let ω(: Tµν (x) :) be defined as in (39), with the differential operator $D_{\mu\nu} = D^{-1/6}_{\mu\nu}$ defined as in (40). Then ω(: Tµν (x) :) fulfills the reduced version of Wald’s axioms. Furthermore, it exhibits the following trace (anomaly):

\[
\begin{aligned}
g^{\mu\nu}\omega(:T_{\mu\nu}(x):) &= -\frac{1}{\pi^2}\left(\frac{1}{1152}R^2 + \frac{1}{480}\Box R - \frac{1}{720}R_{\mu\nu}R^{\mu\nu} - \frac{7}{5760}R_{\mu\nu\rho\tau}R^{\mu\nu\rho\tau}\right) \\
&\quad - \frac{1}{\pi^2}\left(\frac{m^4}{8} + \frac{m^2R}{48}\right) + m\,\mathrm{Tr}\,[D_y W(x,y)] \\
&= \frac{1}{2880\pi^2}\left(\frac{7}{2}\,C_{\mu\nu\rho\tau}C^{\mu\nu\rho\tau} + 11\left(R_{\mu\nu}R^{\mu\nu} - \frac{1}{3}R^2\right) - 6\,\Box R\right) \\
&\quad - \frac{1}{\pi^2}\left(\frac{m^4}{8} + \frac{m^2R}{48}\right) + m\,\mathrm{Tr}\,[D_y W(x,y)].
\end{aligned}
\tag{41}
\]
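The two curvature expressions in (41) can be cross-checked with exact rational arithmetic. The sketch below is only an illustration and not part of the original derivation: it assumes the four-dimensional identity C² = Riem² − 2 Ric² + R²/3 for the squared Weyl tensor, and the function names and the ordering of the basis (R², □R, RµνRµν, RµνρτRµνρτ) are choices made here for the example.

```python
from fractions import Fraction as F

# Coefficients are listed in the (assumed) basis order
# (R^2, Box R, R_{mu nu} R^{mu nu}, R_{mu nu rho tau} R^{mu nu rho tau}),
# with the common factor 1/pi^2 stripped off.

def first_form():
    """Curvature part of the first line of (41)."""
    inner = (F(1, 1152), F(1, 480), -F(1, 720), -F(7, 5760))
    return tuple(-x for x in inner)

def second_form():
    """Curvature part of the second line of (41), expanded with
    C^2 = Riem^2 - 2 Ric^2 + R^2 / 3 (valid in four dimensions)."""
    pref = F(1, 2880)
    r2 = pref * (F(7, 2) * F(1, 3) - F(11, 3))   # from C^2 and from -11/3 R^2
    box_r = pref * (-6)
    ric2 = pref * (F(7, 2) * (-2) + 11)          # from C^2 and from 11 Ric^2
    riem2 = pref * F(7, 2)
    return (r2, box_r, ric2, riem2)

if __name__ == "__main__":
    print(first_form() == second_form())
```

Both tuples agree term by term, which is all the check is meant to show; it says nothing about the mass-dependent terms.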
Proof. We begin by computing $\nabla^\mu\omega(:T_{\mu\nu}(x):)$ and $g^{\mu\nu}\omega(:T_{\mu\nu}(x):)$, leaving c unspecified for the moment. Applying Synge’s rule and taking into account that $\{C_{\mu\nu}, \gamma^\mu\} = 0$ and $[g_\nu{}^{\nu'}{}_{;\mu}] = 0$ (cf. Appendix A), we get

\[
\begin{aligned}
8\pi^2\,\nabla^\mu\omega(:T_{\mu\nu}(x):) &= \nabla^\mu\,\mathrm{Tr}\,[D^c_{\mu\nu}(x,y)W(x,y)] \\
&= \mathrm{Tr}\,[(\nabla^\mu + g^\mu{}_{\mu'}\nabla^{\mu'})D^c_{\mu\nu}(x,y)W(x,y)] \\
&= \mathrm{Tr}\,\Big[\Big(\frac{1}{4}(g_\nu{}^{\nu'}\nabla_{\nu'} - \nabla_\nu)(D_xD_y + P_y) + \frac{1}{4}\gamma_\nu D_y(P_y - P_x) \\
&\qquad\quad - \frac{c}{2}(g_\nu{}^{\nu'}\nabla_{\nu'} + \nabla_\nu)(D_xD_y - P_y)\Big)W(x,y)\Big].
\end{aligned}
\]

Remembering that $-D_y(H^- + W)$ is the local two-point distribution of a state, it follows that $H^- + W$ is subject to the distributional differential equations $D_xD_y(H^- + W) = 0 = P_y(H^- + W)$. Thus, we can safely replace W in the above equation by $-H^-$, since every appearing term involves one of the two aforementioned differential operators. Such a procedure yields

\[
\begin{aligned}
8\pi^2\,\nabla^\mu\omega(:T_{\mu\nu}(x):) = \mathrm{Tr}\,\Big[\Big(&\frac{1}{4}(\nabla_\nu - g_\nu{}^{\nu'}\nabla_{\nu'})(D_xD_y + P_y) + \frac{1}{4}\gamma_\nu(\gamma^\mu\nabla_\mu + m)(P_x - P_y) \\
&- \frac{c}{2}(g_\nu{}^{\nu'}\nabla_{\nu'} + \nabla_\nu)(P_y - D_xD_y)\Big)H^-(x,y)\Big].
\end{aligned}
\]

Now we can insert the various coincidence point limits of the differentiated Hadamard bidistribution $H^-$ computed in Proposition A.1 of Appendix A to obtain

\[
8\pi^2\,\nabla^\mu\omega(:T_{\mu\nu}(x):) = -(1 + 6c)\,\mathrm{Tr}\,[V_1(x,y)]_{;\nu}.
\]

For the trace, we use both the insights on the parallel transport of gamma matrices from Appendix A and the arguments already employed in the computation of the conservation to get

\[
\begin{aligned}
8\pi^2\,g^{\mu\nu}\omega(:T_{\mu\nu}(x):) &= g^{\mu\nu}\,\mathrm{Tr}\,[D^c_{\mu\nu}(x,y)W(x,y)] \\
&= \mathrm{Tr}\,\Big[\Big(-\Big(2c + \frac{1}{2}\Big)(D_xD_y - P_y) + mD_y\Big)W(x,y)\Big] \\
&= \mathrm{Tr}\,\Big[\Big(2c + \frac{1}{2}\Big)(D_xD_y - P_y)H^-(x,y) + mD_yW(x,y)\Big] \\
&= -6(4c + 1)\,\mathrm{Tr}\,[V_1(x,y)] + m\,\mathrm{Tr}\,[D_yW(x,y)].
\end{aligned}
\]

If we look at the above two results and we use the data from Appendix A, we realize that, since Tr[V1 (x, y)] is in general neither vanishing nor constant, we need to set c = −1/6 to ensure conservation, thus yielding the asserted trace and, in particular, the trace anomaly in the massless case.
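The competition between conservation and tracelessness can be made explicit by evaluating the two prefactors obtained above for the two natural candidate values of c. The following sketch is illustrative only; the function names are hypothetical, and the only inputs are the coefficients −(1 + 6c) and −6(4c + 1) that multiply Tr[V1];ν and Tr[V1] in the two results of the proof.

```python
from fractions import Fraction

def conservation_coefficient(c):
    """Prefactor of Tr[V1]_{;nu} in 8 pi^2 nabla^mu omega(:T_{mu nu}:)."""
    return -(1 + 6 * c)

def trace_coefficient(c):
    """Prefactor of Tr[V1] in 8 pi^2 g^{mu nu} omega(:T_{mu nu}:)."""
    return -6 * (4 * c + 1)

if __name__ == "__main__":
    for c in (Fraction(-1, 6), Fraction(-1, 4)):
        print(c, conservation_coefficient(c), trace_coefficient(c))
```

For c = −1/6 the conservation prefactor vanishes while the trace prefactor equals −2; for c = −1/4 the trace prefactor vanishes but the conservation prefactor equals 1/2. No single c removes both, which is the content of the statement that the anomaly cannot be avoided once conservation is imposed.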
Let us now proceed to check the validity of the first two axioms. From the above two calculations we can extract that the term we have added to the canonical differential operator $D^{\mathrm{can}}_{\mu\nu}$ to achieve the looked-for differential operator $D^c_{\mu\nu}$ contributes to ω(: Tµν (x) :) a term proportional to

\[
g_{\mu\nu}\,\mathrm{Tr}\,[(D_xD_y - P_y)H^-(x,y)] = -12\,g_{\mu\nu}\,\mathrm{Tr}\,[V_1(x,y)].
\]

This term is clearly independent of the state, i.e. of W (x, y); thus, axiom 1 holds for our regularization scheme and two Hadamard states ω1 , ω2 . Furthermore, since the Wick monomials are locally covariant quantum fields as discussed in the previous section, the same holds true for : Tµν (x) : and axiom 2 is straightforwardly fulfilled. What now remains to be shown is the vanishing of ω(: Tµν (x) :) on Minkowski spacetime and in the Minkowski vacuum state, provided we choose a suitable scale λ in the definition of $H^-$. In this setting, we have $\omega^-_{\mathrm{Mink}}(x,y) = -D_y\,\omega^+_{2,\mathrm{Mink}}(x,y)\,I_4$, with the scalar two-point function $\omega^+_{2,\mathrm{Mink}}$ [69] (recall that, according to our definition, $\omega^-$ is of “positive frequency type”). The latter can be specified as

\[
\omega^+_{2,\mathrm{Mink}}(x,y) = \lim_{\epsilon\to 0^+}\frac{4m}{(4\pi)^2\sqrt{2\sigma_\epsilon(x,y)}}\,K_1\!\left(m\sqrt{2\sigma_\epsilon(x,y)}\right),
\]

where K1 denotes a modified Bessel function, the expression is to be understood in the sense of analytic continuation for negative values of the geodesic distance, and the limit specifies how to approach the branch cut of the square root [57]. Expanding this in terms of σ (suppressing the ε for simplicity) we have

\[
8\pi^2\,\omega^+_{2,\mathrm{Mink}}(x,y) = \left\{\frac{1}{\sigma} + \left(\frac{m^2}{2} + \frac{m^4}{8}\sigma + f_1(\sigma)\sigma^2\right)\ln\frac{e^{2\gamma}m^2\sigma}{2} - \frac{m^2}{2}\left(1 + \frac{5m^2\sigma}{8}\right) + \sigma^2 f_2(\sigma^2)\right\}I_4,
\]

where the fi appearing in this paragraph are smooth functions. In Minkowski spacetime, the Dirac Hadamard bidistributions $H^\pm$ are simply the scalar ones times the unit matrix. This is not surprising since, as visible in Appendix A, the nontrivial matrix part of the Dirac Hadamard coefficients stems from the curvature of the spin connection, which vanishes in flat spacetimes. We thus have

\[
H^-_{\mathrm{Mink}}(x,y) = \left\{\frac{1}{\sigma} + \left(\frac{m^2}{2} + \frac{m^4}{8}\sigma + f_1(\sigma)\sigma^2\right)\ln\frac{\sigma}{\lambda^2}\right\}I_4.
\]

If we take into account this singular part, a short computation yields

\[
W_{\mathrm{Mink}}(x,y) = \left\{\frac{m^2}{2}\left(\ln\frac{e^{2\gamma}m^2\lambda^2}{2} - 1\right) + \left(\frac{m^4}{8}\ln\frac{e^{2\gamma}m^2\lambda^2}{2} - \frac{5m^4}{16}\right)\sigma + f_3(\sigma)\sigma^2\right\}I_4
\]

and we can straightforwardly compute

\[
\omega_{\mathrm{Mink}}(:T_{\mu\nu}(x):) = \frac{m^4}{4}\,g_{\mu\nu}\left(2\ln\frac{e^{2\gamma}m^2\lambda^2}{2} - 3\right),
\]

which vanishes for $\lambda^2 = \lambda_m^2 = 2\exp\!\left(\frac{3}{2} - 2\gamma\right)m^{-2}$. In the massless case, both $H^-_{\mathrm{Mink}}$ and $\omega_{2,\mathrm{Mink}}$ only consist of the $\sigma^{-1}$ term, such that ωMink (: Tµν (x) :) is trivially vanishing, independent of any scale λ.
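The choice of scale can be checked numerically. The sketch below is an illustration under stated assumptions: it evaluates only the bracket 2 ln(e^{2γ} m² λ² / 2) − 3 appearing in the Minkowski expectation value, the numerical value of the Euler–Mascheroni constant is hard-coded, and the function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant (hard-coded)

def lambda_m_squared(m):
    """Scale lambda_m^2 = 2 exp(3/2 - 2 gamma) / m^2 from Theorem 4.1 (m != 0)."""
    return 2.0 * math.exp(1.5 - 2.0 * EULER_GAMMA) / m ** 2

def minkowski_bracket(m, lam_sq):
    """The factor 2 ln(e^{2 gamma} m^2 lambda^2 / 2) - 3 that multiplies
    m^4 g_{mu nu} in the Minkowski expectation value."""
    return 2.0 * math.log(math.exp(2.0 * EULER_GAMMA) * m ** 2 * lam_sq / 2.0) - 3.0

if __name__ == "__main__":
    for m in (0.1, 1.0, 7.5):
        print(m, minkowski_bracket(m, lambda_m_squared(m)))
```

Up to floating-point rounding the bracket is zero for every nonzero mass once λ² = λm², whereas a generic scale (for instance λ² = 1 with m = 1) leaves a nonzero remainder.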
It is interesting to notice that the above analysis seems to suggest that the common idea which associates the emergence of the trace anomaly with the conservation of the stress-energy tensor is somehow inappropriate, since such an anomaly seems rooted in the loss of the equations of motion at the quantum level alone. As a matter of fact, the needed modification of the classical stress-energy tensor does not evoke the trace anomaly, but only modifies it. To conclude the section, a few further remarks are in order:

Remark. Our result for the trace anomaly coincides with previous ones obtained by means of gravitational index theorems [19] and point-splitting techniques [18]. Both approaches have made use of the DeWitt–Schwinger expansion, which cannot be defined rigorously [10, 32, 57, 80]. Moreover, even if this expansion reproduces the Hadamard singularity structure, it seems that calculations are much shorter if one expresses it directly through the Hadamard series. The previous attempts had, however, one advantage, namely, they had expressed the expected stress-energy tensor as the functional derivative of a (diffeomorphism-invariant) effective action, such that the result has been manifestly conserved. The nice discussion which follows [57, Definition 2.1] gives an explanation of how the extra term in our derivative operator $D^c_{\mu\nu}$ can be understood in this context.

Remark. By changing the scale λ in the Hadamard bidistribution $H^-$ to λ′, one modifies the (definition of the) smooth part W by 2V ln(λ′/λ), such that ω(: Tµν (x) :) changes by a term proportional to $\mathrm{Tr}\,[D^c_{\mu\nu}V]$. Since we know from Proposition 3.2 that Dy V fulfills both Dirac equations of motion, we can a priori deduce that this term is automatically conserved and furthermore traceless in the conformal case. Thus, it follows that both the determination of the correct c to be inserted in $D^c_{\mu\nu}$ and the trace anomaly are independent of the scale λ. Even if we already know the properties of $\mathrm{Tr}\,[D^c_{\mu\nu}V]$ beforehand, it is enlightening to calculate its explicit form. The result is

\[
\mathrm{Tr}\,[D^c_{\mu\nu}V] = \frac{m^4}{2}\,g_{\mu\nu} - \frac{m^2}{6}\,G_{\mu\nu} + \frac{1}{60}\,(I_{\mu\nu} - 3J_{\mu\nu}),
\]

where the linear combination of Iµν and Jµν appearing in the above formula is traceless, cf. Appendix A. This term is well within and even exhausts the regularization freedom discussed after Definition 4.1. Furthermore, one should recall that, in the massive case, λ had to be fixed in terms of inverse powers of m to assure vanishing of ω(: Tµν (x) :) in Minkowski
spacetime. Hence, if one demands continuous dependence of ω(: Tµν (x) :) on m, it does not seem possible to fulfill the third axiom of Definition 4.1 in this way.

Remark. Even if we are able to fix the scale λ, we could in principle still add multiples of m²Gµν , Iµν , and Jµν to ω(: Tµν (x) :) without spoiling the validity of the first four of Wald’s axioms, though possibly modifying the □R term in the trace anomaly. This freedom can be derived in a more general sense, by already viewing all Wick products as being uniquely defined only up to terms depending on the mass and the local curvature, where the possible regularization freedom is partially restricted by suitable consistency conditions, e.g. “Leibniz rules”. This approach has been developed and pursued successfully by Hollands and Wald in [42] and has the advantage of also encompassing interacting fields, while our treatment along the lines of [56] is restricted to free fields. A treatment as in [42] will, as already remarked in Sec. 4.2, presumably yield the same renormalization freedom for the stress-energy tensor as the one found here, as it happens in the scalar case [42]. Furthermore, the appearance of the conserved local tensors Iµν and Jµν puts us in the position to understand why the fifth of Wald’s axioms in Definition 4.1 is problematic. Let us consider the massive case, where we can fix the scale λ. Since Iµν and Jµν contain terms involving fourth-order derivatives of the metric, we may hope to cancel terms of that type occurring in ω(: Tµν (x) :) by adding a fixed linear combination of those two tensors. In the massless case, however, there is, to our knowledge, no physically sensible way to fix λ. Therefore, one has no control on the multiples of Iµν and Jµν occurring in ω(: Tµν (x) :) and thus no way to cancel them. Nonetheless, there are scenarios where the situation with respect to the mentioned axiom is not that pernicious. On cosmological, i.e. Friedmann–Robertson–Walker backgrounds, the semiclassical Einstein’s equations (38) can be reduced to an equation for the traces of both sides plus a conservation equation for the right-hand side [24]. Thus, it seems that one has the chance to fulfill the fifth axiom for both massive and massless fields in this simplified setting, since a change of scale does not add fourth-order derivative terms to the trace of the expected stress-energy tensor. In fact, as already explained in the introduction, this observation has been used in [24] to obtain stable solutions of the semiclassical Einstein’s equation at late times.

5. Conclusions and Outlook

We have extensively discussed the structure of free Dirac fields both at a classical and at a quantum level. While, in the first case, we have mostly reviewed standard approaches, in the latter scenario we have achieved a twofold goal. In particular, we have started the discussion of quantized Dirac spinors by exploiting the self-dual framework introduced by Araki, which treats spinor and cospinor fields as a combined single object and allows one to formulate the quantization procedure in a locally covariant way. This step has been fully undertaken by Sanders for the first
time and we have recalled the essential steps and features of this construction. Employing already available and known properties of Hadamard states, we have subsequently been able to introduce the extended algebra of Wick polynomials, the topic of Sec. 3.5 and the first of our main results. As a second one, we have shown that, as in the scalar case, a physically sensible definition of the stress-energy tensor for Dirac fields on a curved background in terms of Wick polynomials is indeed possible with just one caveat: one has to add to the classical expression a suitable term which vanishes on-shell, and hence does not alter classical dynamics, to obtain a conserved stress-energy tensor on the quantum side. Some new insights on Diracian Hadamard forms have constituted a prerequisite of this result, while one of its consequences is the emergence of a non-vanishing quantum trace of the stress-energy tensor, even if its value at a classical level is zero in the conformally invariant case. This result, which goes under the name of trace anomaly, has been previously known, but only as a result of formal calculations; it is thus derived here rigorously for the first time. Overall, we reckon that this paper also accomplishes a further task, namely, it adds the insight that it is possible, interesting, but by no means straightforward to recast many of the already known rigorous results for scalar fields for the spinor ones as well. Furthermore, our analysis opens several interesting questions to be tackled in future lines of research: the first one, which arises out of Sec. 3.2, concerns the possibility of proving the time slice axiom for the extended algebra of fields (as well as for interacting field theories) in the scenario considered in the paper. If one follows the path paved in the scalar case in [16], a positive answer seems definitely within our grasp. A further interesting problem originates from Sec. 3.3, in which Hadamard states are introduced and discussed; the Hadamard coefficients appearing in the singularity structure of such states are smooth bispinors and the question arises whether their most remarkable feature in the scalar case, namely, their symmetry as proved in [55, 56], also appears in the spinor scenario. Such a property would be desirable since, for example, it would lead to many simplifications in the demanding calculations necessary in the construction of the conserved stress-energy tensor. Although there are hints pointing in this direction (see also [65]), we are far from a complete proof of such a symmetry and we thus feel this would be another rather interesting problem to tackle in the near future. Besides these rather formal lines of research, our results also have some remarkable consequences at a physical level. On the one hand, we are now ready to answer the question posed in the introduction on the robustness of the results in [24]; preliminary considerations seem to point in this direction, though we leave a definitive answer to a future analysis. On the other hand, since our approach allows us to control the behavior of free Dirac fields at a cosmological level, it is interesting to point out that free or perturbatively self-interacting fields with half-integer spin in cosmology arise in many models, such as baryogenesis through leptogenesis, where they often play a pivotal role. In these scenarios there are still many open questions to be answered and it seems that, often, the role of spacetime
curvature effects is a priori discarded as negligible. Our experience suggests that this approximation might be too crude and, therefore, we would like to investigate these models in more detail in the framework of quantum field theory in curved spacetimes, with the hope that such an analysis might lead to new and interesting physical consequences.

Acknowledgments

The work of C. D. is supported by the von Humboldt Foundation, that of T. H. by the German DFG Graduate School GRK 602, whereas N. P. gratefully acknowledges support by the German DFG Research Program SFB 676. We would like to thank K. Fredenhagen, V. Moretti, and R. Punzi for useful discussions and K. Sanders for carefully reading the paper and providing valuable comments. T. H. is especially grateful to R. Punzi for suggesting [64] to him.

Appendix A. Useful Tools and Necessary Calculations

The aim of the appendix is to recollect, to clarify, and, occasionally, also to prove useful formulas which are needed in the main body of the paper and which are subject to potential ambiguities. Such ambiguities often lead to grievous sign mistakes or misunderstandings, which we wish to spare the reader.

A.1. Notations, conventions, identities

As a starting point, we would like to recollect our basic conventions regarding some symbols whose exact definition often varies across the literature. In accord with Sec. 2, we work with spacetimes thought of as four-dimensional, connected, Hausdorff, smooth manifolds, endowed with a Lorentzian metric gµν with signature (−, +, +, +). At the same time, other notable geometric quantities, namely, the Riemann and the Ricci tensor as well as the Ricci scalar, are defined via their components as follows:

\[
v_{\alpha;\beta\gamma} - v_{\alpha;\gamma\beta} \doteq R_\alpha{}^\lambda{}_{\beta\gamma}\,v_\lambda, \qquad R_{\alpha\beta} \doteq R_\alpha{}^\lambda{}_{\beta\lambda}, \qquad R \doteq R^\alpha{}_\alpha,
\]

where vα are the components of an arbitrary covector; the extension to vectors and tensors of higher rank is then straightforward. As a last remark, we underline that the Riemann tensor possesses the symmetries Rαβγδ = −Rβαγδ = −Rαβδγ = Rγδαβ and fulfills Rαβγδ + Rαδβγ + Rαγδβ = 0. Finally, we define the Weyl tensor, as is usually done, by the following expression:

\[
C_{\alpha\beta\gamma\delta} = R_{\alpha\beta\gamma\delta} - \frac{1}{6}\left(g_{\alpha\delta}g_{\beta\gamma} - g_{\alpha\gamma}g_{\beta\delta}\right)R - \frac{1}{2}\left(g_{\beta\delta}R_{\alpha\gamma} - g_{\beta\gamma}R_{\alpha\delta} - g_{\alpha\delta}R_{\beta\gamma} + g_{\alpha\gamma}R_{\beta\delta}\right).
\]
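As a quick consistency check of the sign conventions, one can verify that the Weyl tensor defined above is traceless. The following is a sketch that uses only the conventions of this subsection (in particular $R_{\alpha\beta} \doteq R_\alpha{}^\lambda{}_{\beta\lambda}$, the stated symmetries of the Riemann tensor, and $g^{\alpha\gamma}g_{\alpha\gamma} = 4$ in four dimensions); it is added for illustration and is not part of the original text.

```latex
% Contract C_{\alpha\beta\gamma\delta} with g^{\alpha\gamma}, term by term.
\begin{align*}
g^{\alpha\gamma}R_{\alpha\beta\gamma\delta}
  &= g^{\alpha\gamma}R_{\beta\alpha\delta\gamma}
   = R_\beta{}^\gamma{}_{\delta\gamma} = R_{\beta\delta}, \\
g^{\alpha\gamma}\left(g_{\alpha\delta}g_{\beta\gamma} - g_{\alpha\gamma}g_{\beta\delta}\right)
  &= g_{\beta\delta} - 4g_{\beta\delta} = -3g_{\beta\delta}, \\
g^{\alpha\gamma}\left(g_{\beta\delta}R_{\alpha\gamma} - g_{\beta\gamma}R_{\alpha\delta}
  - g_{\alpha\delta}R_{\beta\gamma} + g_{\alpha\gamma}R_{\beta\delta}\right)
  &= g_{\beta\delta}R - R_{\beta\delta} - R_{\beta\delta} + 4R_{\beta\delta}
   = g_{\beta\delta}R + 2R_{\beta\delta}, \\
g^{\alpha\gamma}C_{\alpha\beta\gamma\delta}
  &= R_{\beta\delta} + \tfrac{1}{2}g_{\beta\delta}R
   - \tfrac{1}{2}g_{\beta\delta}R - R_{\beta\delta} = 0 .
\end{align*}
```

The first line uses the two antisymmetries of the Riemann tensor simultaneously, so no extra sign appears. The vanishing of the last line confirms that the relative signs and the factors 1/6 and 1/2 in the definition are mutually consistent.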
Similarly, we need to cope with geometric quantities related to the spin structure introduced in Definition 2.5. Most of these are constructed out of the so-called γ-matrices, which satisfy the standard anticommutation relations (4), i.e. {γµ , γν } = 2gµν . Our choice of the metric signature entails that the γ-matrices differ from the standard ones employed in quantum field theory books by an overall multiplicative factor ±i. Consistently also with (2), we stick to +i and, therefore, the Dirac operator appearing in the Dirac equation for spinors becomes $D \doteq -\slashed{\nabla} + m$, whereas the Dirac operator for cospinors is $D' \doteq \slashed{\nabla} + m$. That said, apologising in advance for assigning the letter C to two different objects, we define the components of the curvature tensor C of the spin connection as

\[
V_{A;\beta\gamma} - V_{A;\gamma\beta} \doteq C_A{}^B{}_{\beta\gamma}\,V_B,
\tag{42}
\]

where VA are the components of an arbitrary cospinor; as previously, the extension of this definition to spinors, also of higher rank, as well as of that for the Riemann and the spin curvature tensor in presence of mixed spinor-tensors, is straightforward. It follows from Lemma 2.2 that the relation between the two curvature tensors is

\[
C_A{}^B{}_{\mu\nu} = \frac{1}{4}\,R_{\mu\nu\alpha\beta}\,\gamma^{\alpha}{}_A{}^C\,\gamma^{\beta}{}_C{}^B.
\]

Thus, C possesses the symmetries CABµν = −CBAµν = −CABνµ . We also use the notational convention that a matrix acts from the left on spinors and from the right on cospinors, e.g., we resolve the Dirac operators as

\[
D'\psi^\dagger = \psi^\dagger_{;\mu}\gamma^\mu + m\psi^\dagger, \qquad D\psi = -\gamma^\mu\psi_{;\mu} + m\psi.
\]

If one strictly sticks to such a convention, spinor indices can be safely suppressed, as we have already done in the main body of the paper and as we will often do in the remainder of this appendix. To conclude this subsection, we point out a few useful identities between the objects we have previously introduced. Starting from the gamma matrices, the product of an odd number of them has a vanishing trace. At the same time, if we consider an even number,

\[
\begin{aligned}
\mathrm{Tr}\,\gamma_\mu\gamma_\nu &= 4g_{\mu\nu}, \\
\mathrm{Tr}\,\gamma_\mu\gamma_\nu\gamma_\alpha\gamma_\beta &= 4\left(g_{\mu\nu}g_{\alpha\beta} - g_{\mu\alpha}g_{\nu\beta} + g_{\mu\beta}g_{\nu\alpha}\right), \\
\mathrm{Tr}\,\gamma_{[\alpha}\gamma_{\beta]}\gamma_{[\gamma}\gamma_{\delta]}\gamma_\varepsilon\gamma_\varphi &= 4\left(g_{\varphi[\alpha}g_{\beta][\gamma}g_{\delta]\varepsilon} + g_{\varepsilon[\alpha}g_{\beta][\delta}g_{\gamma]\varphi} + g_{\delta[\alpha}g_{\beta]\gamma}g_{\varepsilon\varphi}\right),
\end{aligned}
\]

where [ ] here denotes idempotent antisymmetrization. Furthermore,

\[
\begin{aligned}
\gamma^\mu\gamma_\mu &= 4I_4, & \gamma^\mu\gamma^\alpha\gamma_\mu &= -2\gamma^\alpha, \\
\gamma^\mu\gamma^\alpha\gamma^\beta\gamma_\mu &= 4g^{\alpha\beta}I_4, & \gamma^\mu\gamma^\alpha\gamma^\beta\gamma^\gamma\gamma_\mu &= -2\gamma^\gamma\gamma^\beta\gamma^\alpha, \\
\gamma^\mu\gamma^\alpha\gamma^\beta\gamma^\gamma\gamma^\delta\gamma_\mu &= 2\left(\gamma^\delta\gamma^\alpha\gamma^\beta\gamma^\gamma + \gamma^\gamma\gamma^\beta\gamma^\alpha\gamma^\delta\right).
\end{aligned}
\tag{43}
\]
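The first four contraction identities in (43), together with Tr γµγν = 4gµν, can be confirmed numerically for one explicit representation. The sketch below is an illustration, not the authors' construction: it assumes that the γ-matrices are taken as i times the standard Dirac-representation matrices (one admissible realisation of the factor +i mentioned above), works in flat coordinates with g = diag(−1, +1, +1, +1), and uses only plain Python lists; all helper names are hypothetical.

```python
# Assumption: gamma^mu = i * (standard Dirac matrices), so that
# {gamma^mu, gamma^nu} = 2 g^{mu nu} with g = diag(-1, +1, +1, +1).
G_DIAG = [-1, 1, 1, 1]  # diagonal of g_{mu nu} (equal to g^{mu nu} here)

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def equal(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def trace(a):
    return sum(a[i][i] for i in range(4))

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]

def neg2(a):
    return [[-x for x in row] for row in a]

IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ONE2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

STANDARD = [block(ONE2, ZERO2, ZERO2, neg2(ONE2))] + [
    block(ZERO2, s, neg2(s), ZERO2) for s in PAULI]
GAMMA_UP = [scale(1j, g) for g in STANDARD]                        # gamma^mu
GAMMA_DOWN = [scale(G_DIAG[mu], GAMMA_UP[mu]) for mu in range(4)]   # gamma_mu

def contract(inner):
    """Return gamma^mu * inner * gamma_mu, summed over mu."""
    total = [[0] * 4 for _ in range(4)]
    for mu in range(4):
        total = add(total, matmul(matmul(GAMMA_UP[mu], inner), GAMMA_DOWN[mu]))
    return total

def check_identities():
    """Check the Clifford relation, the trace of two gammas and the
    contractions of zero to three gammas between gamma^mu and gamma_mu."""
    ok = equal(contract(IDENTITY), scale(4, IDENTITY))
    for a in range(4):
        ga = GAMMA_UP[a]
        ok &= equal(contract(ga), scale(-2, ga))
        for b in range(4):
            gb = GAMMA_UP[b]
            g_ab = G_DIAG[a] if a == b else 0
            anti = add(matmul(ga, gb), matmul(gb, ga))
            ok &= equal(anti, scale(2 * g_ab, IDENTITY))
            ok &= abs(trace(matmul(ga, gb)) - 4 * g_ab) < 1e-12
            ok &= equal(contract(matmul(ga, gb)), scale(4 * g_ab, IDENTITY))
            for c in range(4):
                gc = GAMMA_UP[c]
                lhs = contract(matmul(matmul(ga, gb), gc))
                ok &= equal(lhs, scale(-2, matmul(matmul(gc, gb), ga)))
    return bool(ok)

if __name__ == "__main__":
    print(check_identities())
```

Since the identities follow from the Clifford relation alone, the check does not depend on the particular representation; the explicit matrices merely make the signs for the signature (−, +, +, +) tangible.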
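The last equalities we shall need are

\[
\begin{aligned}
-\gamma^\beta C_{\alpha\beta} &= C_{\alpha\beta}\gamma^\beta = \frac{1}{2}R_{\alpha\beta}\gamma^\beta, &
[C_{\alpha\beta}, \gamma_\gamma] &= R_{\alpha\beta\rho\gamma}\gamma^\rho, \\
\mathrm{Tr}\,C_{\alpha\beta}\gamma_\gamma\gamma_\delta &= -2R_{\alpha\beta\gamma\delta}, &
C^{\alpha\beta}{}_{;\alpha\beta} &= 0, \\
\mathrm{Tr}\,C_{\alpha\beta}C^{\alpha\beta} &= -\frac{1}{2}R_{\alpha\beta\gamma\delta}R^{\alpha\beta\gamma\delta}, &
\mathrm{Tr}\,C_{\alpha\beta}C^{\alpha\beta}\gamma_\rho\gamma_\tau &= \mathrm{Tr}\,C_{\alpha\beta}C^{\alpha\beta}\,g_{\rho\tau},
\end{aligned}
\]

where the equalities not involving a trace can be proved by combining the symmetry properties of the Riemann tensor with the anticommutation relations of the γ-matrices, e.g.,

\[
\begin{aligned}
C_{\alpha\beta}\gamma^\beta &= \frac{1}{4}R_{\alpha\beta\rho\tau}\gamma^\rho\gamma^\tau\gamma^\beta \\
&= \frac{1}{4}\left(R_{\alpha\rho\beta\tau} + R_{\alpha\tau\rho\beta}\right)\gamma^\rho\gamma^\tau\gamma^\beta \\
&= \frac{1}{4}R_{\alpha\beta\rho\tau}\left(\gamma^\beta\gamma^\tau\gamma^\rho + \gamma^\rho\gamma^\beta\gamma^\tau\right) \\
&= \frac{1}{4}R_{\alpha\beta\rho\tau}\left(\gamma^\tau\gamma^\rho\gamma^\beta + 2g^{\beta\tau}\gamma^\rho - 2g^{\beta\rho}\gamma^\tau - \gamma^\rho\gamma^\tau\gamma^\beta + 2g^{\beta\tau}\gamma^\rho\right) \\
&= \frac{3}{2}R_{\alpha\rho}\gamma^\rho - 2C_{\alpha\beta}\gamma^\beta
\quad\Leftrightarrow\quad C_{\alpha\beta}\gamma^\beta = \frac{1}{2}R_{\alpha\beta}\gamma^\beta.
\end{aligned}
\]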
A.2. On the calculus of bispinor-tensors

The notion of bispinor-tensors heuristically boils down to considering objects which simultaneously transform as spinor-tensors at two spacetime points. In more sound language, they are sections of an outer tensor product $VM \boxtimes WN$ of two vector bundles VM, WN, respectively, over M and N. $VM \boxtimes WN$ is nothing but a vector bundle over M × N with, calling V and W the typical fibers of VM and WN, V ⊗ W as a typical fiber. Such a construction may seem awkward, but, in case M = N, it is indeed more fundamental than the familiar tensor product bundle VM ⊗ WM, the latter being constructed out of $VM \boxtimes WM$ by pulling back via the map $M \ni x \mapsto (x, x) \in M \times M$. For simplicity we will choose to collect all possible (bi)spinor-tensorial objects under the name of (bi)tensor, except in special cases where we want to stress the character of the involved vector spaces. The bitensors occurring in this work are all defined only on a convex normal neighborhood, since we need a unique geodesic to connect the two points our bitensors depend on. We use unprimed indices to indicate components stemming from tensorial properties at x and primed indices for those rooted in such properties at y. Furthermore, we shall use the bracket notation introduced by Synge to denote coincidence point limits of bitensors, namely,

$$[B(x, y)] := \lim_{y \to x} B(x, y),$$
where B is some smooth bitensor such that the limit is well defined. Let us now recall the bitensors used in this work and examine their properties. We will only mention the basic points and refer to the works of DeWitt and Brehme, Fulling, Christensen, and to the review by Poisson for further, more exhaustive, details [17, 23, 32, 61]. As a starting point, we consider the halved squared geodesic distance σ(x, y) taken with sign, sometimes also called Synge's world function. Even if the geodesic distance itself might not be globally smooth, it is smooth on geodesically convex normal neighborhoods (provided the metric is smooth) and it furthermore fulfills $\sigma_{;\mu}\sigma^{;\mu} = 2\sigma$, an identity which can be either explicitly computed or derived from geometric considerations. In the following, we will, as is customary, drop the semicolon when indicating covariant derivatives of σ. The aforementioned equation together with $[\sigma] = [\sigma_\mu] = 0$ and $[\sigma_{\mu\nu}] = g_{\mu\nu}$, two identities arising out of the defining properties of the geodesic distance, completely suffice to determine σ, as well as all the properties we need, namely, the coinciding point limits of its higher derivatives. These can be obtained by means of an inductive procedure; as an example, in the case of $[\sigma_{\mu\nu\rho}]$, one differentiates $\sigma_\mu\sigma^\mu = 2\sigma$ three times and then takes the coinciding point limit. Together with the already known relations, one obtains $[\sigma_{\mu\nu\rho}] = 0$. At the level of the fourth derivative, a new feature enters the fray, namely, one gets a linear combination of three coinciding fourth derivatives, though with different index orders. To relate those, one has to commute derivatives to rearrange the indices in the looked-for fashion, and this ultimately leads to the appearance of Riemann curvature tensors, i.e.

$$[\sigma] = [\sigma_\mu] = [\sigma_{\mu\nu\rho}] = 0, \qquad [\sigma_{\mu\nu}] = g_{\mu\nu}, \qquad [\sigma_{\mu\nu\rho\tau}] = -\frac{1}{3}(R_{\mu\rho\nu\tau} + R_{\mu\tau\nu\rho}).$$
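In flat spacetime the world function is known in closed form, which allows a quick sanity check of the identities quoted above. The sketch below assumes Minkowski space with metric diag(−1, 1, 1, 1), where $\sigma(x, y) = \frac{1}{2}\eta_{\mu\nu}(x - y)^\mu (x - y)^\nu$; this special case and all function names are illustrative assumptions, and no curvature effects are probed. Exact rational arithmetic is used because central differences are exact for a quadratic function.

```python
from fractions import Fraction

ETA = [-1, 1, 1, 1]  # assumed flat metric diag(-1, 1, 1, 1)

def sigma(x, y):
    """Half the squared geodesic distance in Minkowski space (with sign)."""
    return sum(ETA[m] * (x[m] - y[m]) ** 2 for m in range(4)) / Fraction(2)

def shift(p, m, h):
    q = list(p)
    q[m] += h
    return q

def grad_x(x, y, h=Fraction(1)):
    """sigma_mu: derivative at x via central differences (exact for quadratics)."""
    return [(sigma(shift(x, m, h), y) - sigma(shift(x, m, -h), y)) / (2 * h) for m in range(4)]

def grad_y(x, y, h=Fraction(1)):
    """sigma_mu': derivative at y via central differences."""
    return [(sigma(x, shift(y, m, h)) - sigma(x, shift(y, m, -h))) / (2 * h) for m in range(4)]

def box_x(x, y, h=Fraction(1)):
    """Box_x sigma = eta^{mu nu} sigma_{mu nu} via second central differences."""
    total = Fraction(0)
    for m in range(4):
        second = (sigma(shift(x, m, h), y) - 2 * sigma(x, y) + sigma(shift(x, m, -h), y)) / h ** 2
        total += ETA[m] * second
    return total

x = [Fraction(1, 2), Fraction(3), Fraction(-2), Fraction(5, 7)]
y = [Fraction(-1), Fraction(1, 3), Fraction(4), Fraction(0)]

gx = grad_x(x, y)
gy = grad_y(x, y)
norm = sum(ETA[m] * gx[m] ** 2 for m in range(4))  # sigma_mu sigma^mu

print(norm == 2 * sigma(x, y))       # sigma_mu sigma^mu = 2 sigma
print(gy == [-v for v in gx])        # flat-space version of g^{mu'}_nu sigma_mu' = -sigma_nu
print(box_x(x, y) == 4)              # Box sigma = 4, consistent with [sigma_{mu nu}] = g_{mu nu}
print(sigma(x, x) == 0, grad_x(x, x) == [0, 0, 0, 0])  # [sigma] = [sigma_mu] = 0
```

In a curved background the same identities hold only with the corrections encoded in the higher coinciding point limits, such as the fourth-derivative limit involving the Riemann tensor.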
We stress that the discussion of these few identities is indeed much more valuable than just yielding the stated results since a potential reader is now able to calculate coinciding point limits both of derivatives of arbitrarily high order and of any bitensor; this holds true provided he is given the limits of lower order derivatives, an equation relating them to the higher ones, as well as the information of appropriate curvature tensors. We would like to remark at this point that, since one is, most of the time, ultimately interested in the coinciding point limits of certain bitensors, the in between computational steps often only require the knowledge of coinciding point limits of hierarchically lower objects, in contrast to having the necessity to know their full form.
The next interesting bitensor is that of parallel transport along a geodesic, an object depending both on the underlying vector bundle and on the considered linear connection. We will denote the parallel transport relating the tangent spaces at x and y as $g^{\mu}{}_{\nu'}$, while the one relating spinors at those same points is denoted as $I^{A}{}_{B'}$. With them at hand, parallel transporting a spinor-tensor $T = T^{A'\mu'} E_{A'} \otimes \partial_{\mu'}$ along the geodesic connecting y to x amounts to the following identity

$$T^{A\mu} = I^{A}{}_{B'}\, g^{\mu}{}_{\nu'}\, T^{B'\nu'},$$

and a similar rule applies to higher spinor-tensors. One can reverse the role of x and y, introducing the inverses of the above two parallel transports, say $g^{\mu'}{}_{\nu}$ and $(I^{-1})^{A'}{}_{B}$. On practical grounds, the construction of these two quantities boils down to solving the partial differential equations

$$g^{\mu}{}_{\nu';\alpha}\sigma^{\alpha} = I^{A}{}_{B';\alpha}\sigma^{\alpha} = 0$$

with the initial conditions

$$[g^{\mu}{}_{\nu'}] = g^{\mu}{}_{\nu}, \qquad [I^{A}{}_{B'}] = (I_4)^{A}{}_{B},$$

$I_4$ being the 4 × 4 identity matrix. These identities, together with the properties of σ and the inductive procedure described at the beginning of this paragraph, allow us to explicitly compute the derivatives of the parallel transports, the lowest ones being

$$[g^{\mu}{}_{\nu';\alpha}] = [I^{A}{}_{B';\alpha}] = 0, \qquad [g^{\mu}{}_{\nu';\alpha\beta}] = \frac{1}{2} R^{\mu}{}_{\nu\alpha\beta}, \qquad [I^{A}{}_{B';\alpha\beta}] = \frac{1}{2} C^{A}{}_{B\alpha\beta}. \qquad (44)$$

We shall henceforth suppress spinor indices, taking care to follow the conventions described above, and, to conclude the section, we would like to point out the special parallel transport properties of both σ and the gamma matrices. For the former we have, due to its geometric meaning,

$$g^{\mu'}{}_{\nu}\,\sigma_{\mu'} = -\sigma_{\nu},$$

whereas, for the latter, being covariantly constant, we have

$$I\gamma_{\mu'}I^{-1} g^{\mu'}{}_{\nu} = \gamma_{\nu}.$$

In this paper, we need to cope with the coinciding point limits of bitensors differentiated at both x and y. The first, and maybe obvious, related statement is that derivatives at different points commute, so that we can always rearrange derivative indices in such a way that the unprimed ones are always on the left whereas the primed ones are always on the right. As a subsequent step, one notices that mixed coinciding point limits can also be calculated by inductive procedures. If one knows the coinciding point limits of the derivatives at the point x, however, one can extend this knowledge to those at y by means of Synge's rule:

Lemma A.1. Let T be a smooth bitensor of arbitrary order; then its covariant derivatives possess the following property in the coinciding point limit (here suppressing all unessential indices):

$$[T_{;\mu'}] = [T]_{;\mu} - [T_{;\mu}].$$

This has been proven by Synge for σ exclusively, while, for the proof of an extension to arbitrary bitensors, one can refer to [61, Sec. 2.2] or to [17].
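Synge's rule can be illustrated in the simplest conceivable setting. The sketch below assumes a flat one-dimensional manifold and a polynomial biscalar T(x, y), for which the rule reduces to the chain rule applied to T(x, x); the polynomial representation and all function names are hypothetical and serve only to make the bookkeeping of primed and unprimed derivatives explicit.

```python
from fractions import Fraction

# A biscalar T(x, y) = sum_{i,j} c_ij x^i y^j is stored as {(i, j): c_ij}.
# A function of one variable p(x) = sum_k c_k x^k is stored as {k: c_k}.

def d_x(t):
    """Unprimed derivative: differentiate T with respect to x."""
    return {(i - 1, j): c * i for (i, j), c in t.items() if i > 0}

def d_y(t):
    """Primed derivative: differentiate T with respect to y."""
    return {(i, j - 1): c * j for (i, j), c in t.items() if j > 0}

def coincide(t):
    """Coincidence limit [T](x) = T(x, x)."""
    out = {}
    for (i, j), c in t.items():
        out[i + j] = out.get(i + j, 0) + c
    return {k: v for k, v in out.items() if v != 0}

def d(p):
    """Derivative of a one-variable polynomial."""
    return {k - 1: c * k for k, c in p.items() if k > 0}

def sub(p, q):
    out = dict(p)
    for k, c in q.items():
        out[k] = out.get(k, 0) - c
    return {k: v for k, v in out.items() if v != 0}

# Example biscalar (an arbitrary choice): T = 3 x^2 y + 5 x y^3 - 2 y + 7 x^4.
T = {(2, 1): Fraction(3), (1, 3): Fraction(5), (0, 1): Fraction(-2), (4, 0): Fraction(7)}

lhs = coincide(d_y(T))                      # [T_{;mu'}]
rhs = sub(d(coincide(T)), coincide(d_x(T)))  # [T]_{;mu} - [T_{;mu}]
print(lhs == rhs)
```

On a curved background the derivatives are covariant and T may carry indices at both points, so the statement is no longer a plain chain rule, but the structure of the identity is the same.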
A.3. On the Hadamard recursion relations and related results

As we have seen in Sec. 3.3, in order to "construct" the two-point functions $\omega^\pm(x, y)$ of a Hadamard state, we need to specify the distribution kernels $H^\pm(x, y)$ and the smooth bispinor $W(x, y)$, which must satisfy

$$P_x H^\pm(x, y) \in \mathcal{E}(DM \otimes D^*M), \qquad P_y H^\pm(x, y) \in \mathcal{E}(DM \otimes D^*M)$$

and

$$D_x D_y (H^\pm(x, y) + W(x, y)) = P_y (H^\pm(x, y) + W(x, y)) = 0.$$

From this it follows that $D_x D_y H^\pm(x, y)$ is a smooth bispinor as well. Furthermore, due to Proposition 3.2, there are even more differential operators which, applied to $H^\pm(x, y)$, yield a smooth bispinor. Let us collect them all in the following:

$$P_x = -D_x D'_x, \qquad P_y = -D_y D'_y, \qquad D'_x D'_y, \qquad D_x D_y \qquad \text{and} \qquad D_x - D'_y = D_y - D'_x. \qquad (45)$$
The aim of this section is to use these data to determine the Hadamard bidistributions $H^\pm(x, y)$ and to calculate the various coinciding point limits of their derivatives which are necessary for the proof of Theorem 4.1. Following the path paved in the preceding sections, let us recall that the index structure of $\omega^\pm$ is

$$\omega^\pm(x, y) = \omega^\pm(x, y)^{A}{}_{B'}\, E_A(x) \otimes E^{B'}(y),$$

and that $H^\pm$ and W inherit this structure, and let us suppress spinor indices in the following. Although $H^\pm(x, y)$ and $W(x, y)$ are bispinors, we recall from the main body of the paper that their form slavishly mimics that of the kernels specifying the two-point function in the theory of scalar fields, viz.,

$$H^\pm(x, y) = \frac{U(x, y)}{\sigma_\pm(x, y)} + V(x, y) \ln\frac{\sigma_\pm(x, y)}{\lambda^2}, \qquad (46)$$

$$V(x, y) := \sum_{n=0}^{\infty} V_n(x, y)\,\sigma(x, y)^n, \qquad (47)$$

where $\sigma_\pm(x, y) := \sigma(x, y) \pm 2i\varepsilon(T(x) - T(y)) + \varepsilon^2$, with $\varepsilon > 0$ and T being a temporal function whose existence is guaranteed since the background is globally hyperbolic [6, 7]. As already commented in the main text, λ is a reference distance employed to make the argument of the logarithm dimensionless, while the remaining objects, the so-called Hadamard coefficients U and V, are smooth bispinors. As we will see shortly, U as well as V depend only on the geometry of the underlying background and on the mass, whereas W fully characterizes the state, namely, the two-point functions of two Hadamard states differ only by a smooth function and such a difference is indeed encoded in W. To determine U, V, and W, we need to use the knowledge of the differential operators (45) which, once applied to $H^\pm$, give smooth bispinors. To make the
following formulas more readable, we choose to omit the regularising ε — and thus the ± index of H ± —, the reference length λ, and the dependence of the kernels on the spacetime points (x, y). That said, we can in principle take either of the second order differential operators listed in (45) to recursively calculate U and V ; we will employ Px , as this is familiar from the computations in the scalar case. Applying Px to H, we obtain potentially singular terms proportional to σ −n for n = 1, 2, 3 and to ln σ as well as smooth terms proportional to positive powers of σ. We know, however, that the total result is smooth and one possible way to achieve this is to demand that the coefficients of the potentially singular terms are identically vanishing. Let us stress that, since we do a priori not know if U contains positive powers of σ, the terms proportional to negative powers of σ could in principle cancel each other to yield a smooth result. It is therefore a choice and not a necessity to require the coefficients of the inverse powers of σ to vanish, and it is, furthermore, not guaranteed that the result of this procedure does not depend on the choice of the second order differential operator out of the possible ones listed in (45). The afore laid down line of argument does, however, not hold for the coefficients of Px H proportional to ln σ; since U and V are required to be smooth, they can not contain a logarithmic dependence on σ and the terms proportional to ln σ have to vanish necessarily. The result of the previously described procedure are the so-called Hadamard recursive relations, which, in the scalar case, have been studied by several authors (see for example [56]). 
In the case of Dirac fields, there are results on the coinciding point limits of the Hadamard coefficients up to $V_1$ computed in [18]; the form of the Hadamard singularity employed in that work is, however, a different one related to the non-rigorous DeWitt–Schwinger expansion, but formally, the relation between the different recursion relations arising in the two constructions is well known. After having discussed the Hadamard recursion relations, we shall show how they arise explicitly. Let us thus examine the terms U/σ and V ln σ individually. Starting with the latter, we have

$$P_x(V \ln\sigma) = (P_x V)\ln\sigma + \sum_{n=0}^{\infty} \left(V_n(\Box_x\sigma - 2 + 4n) + 2\sigma^\mu V_{n;\mu}\right)\sigma^{n-1},$$

where we have employed the identity $\sigma^\mu\sigma_\mu = 2\sigma$. Remembering our previous discussion, we can now extract our first differential equation by requiring the coefficient of ln σ to vanish. Since this requirement has to hold independently of the differential operator chosen out of (45), we have

$$P_x V = P_y V = D'_x D'_y V = (D_x - D'_y)V = (D_y - D'_x)V = 0. \qquad (48)$$

To obtain further differential equations, we need to look at the terms involving U, viz.,

$$P_x \frac{U}{\sigma} = \frac{(P_x U)}{\sigma} - \frac{2\sigma_\mu U^{;\mu} + (\Box_x\sigma - 4)U}{\sigma^2},$$
which, combined with the $\sigma^{-1}$ coefficient coming from the series obtained out of differentiating V ln σ, leads us to the following two identities:

$$P_x U + 2V_{0;\mu}\sigma^\mu + (\Box_x\sigma - 2)V_0 = 0, \qquad (49)$$

$$2U_{;\mu}\sigma^\mu + (\Box_x\sigma - 4)U = 0, \qquad (50)$$

referring to the $\sigma^{-1}$ and $\sigma^{-2}$ coefficients, respectively. Let us now focus on (50); one can infer that U is subject to a linear partial differential equation which, according to standard theorems, provides a unique solution once a suitable initial condition is given. The latter is usually chosen in such a way that $[U] = I_4$, and, hence, the Cauchy problem associated to the U-bispinor strongly suggests the ansatz $U = uI$, with a smooth biscalar u satisfying [u] = 1. Plugging this ansatz into (50) and recalling the properties of I, it holds that uI is the solution we are seeking if and only if u satisfies the partial differential equation

$$2u_{;\mu}\sigma^\mu + (\Box_x\sigma - 4)u = 0.$$

Hence, it turns out that u fulfills the same transport equation as the $\sigma^{-1}$ coefficient of the Hadamard bidistribution encoding the singularity of the two-point function for a scalar quantum field and is thus given by the square root of the so-called Van Vleck–Morette determinant. It would be tempting to think that a similar result and interpretation holds for V, but, alas, this is far from being the truth, as one can realise by direct inspection of (49), since spin curvature terms not proportional to the identity enter the arena via derivatives of I. The enlarged complexity of the Diracian Hadamard coefficient V, however, is compensated by the increased number of differential equations (48) fulfilled by V. Of course, any of them is enough to determine V, but computations are still considerably easier if one employs all of them. To obtain differential equations for the $V_n$, one has to combine (47) with (48). After a few formal manipulations, one arrives at

$$\sum_{n=0}^{\infty} (P_x V_n)\sigma^n + \sum_{l=1}^{\infty} \left(2l V_{l;\mu}\sigma^\mu + (l\,\Box_x\sigma + 2l(l-1))V_l\right)\sigma^{l-1} = 0,$$

and, if we require this identity to hold true at each order in σ, at

$$P_x V_0 + 2V_{1;\mu}\sigma^\mu + (\Box_x\sigma)V_1 = 0, \qquad (51)$$

$$P_x V_n + 2(n+1)V_{n+1;\mu}\sigma^\mu + ((n+1)\Box_x\sigma + 2n(n+1))V_{n+1} = 0, \qquad \forall n \geq 1. \qquad (52)$$
At this point it is clear how to determine U , V and W explicitly: the starting point is (50), which, as we have explained above, gives us U once an initial condition has been assigned. Afterwards one can plug the result in (49) in order to obtain V0 ,
though one needs to specify an initial condition. This is already included in (49), however, since, if we take the coincidence point limit of (49) and recall the properties of σ, we end up with

$$[V_0] = -\frac{1}{2}[P_x U].$$

Hence, we can now proceed iteratively, namely, we exploit (51) to construct $V_1$ once we have specified the initial condition by taking the coincidence point limit, i.e.

$$[V_1] = -\frac{1}{4}[P_x V_0].$$

Similarly, (52) grants us that the same procedure allows us to express $V_{n+1}$ out of the preceding term $V_n$ together with the initial condition

$$[V_{n+1}] = -\frac{1}{2(n+1)(n+2)}[P_x V_n].$$
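The numerical prefactors in these initial conditions follow from the coincidence limits of (49), (51) and (52) alone: the terms containing $\sigma^\mu$ drop out because $[\sigma^\mu] = 0$, and $[\Box_x\sigma] = 4$ in four spacetime dimensions because $[\sigma_{\mu\nu}] = g_{\mu\nu}$. The following sketch merely redoes this arithmetic with exact fractions; the function names are hypothetical.

```python
from fractions import Fraction

BOX_SIGMA = 4  # [Box_x sigma] = g^{mu nu} [sigma_{mu nu}] = 4 in four dimensions

def v0_factor():
    """Factor c with [V_0] = c [P_x U], from the coincidence limit of (49)."""
    # (49) at coincidence: [P_x U] + ([Box sigma] - 2) [V_0] = 0.
    return Fraction(-1, BOX_SIGMA - 2)

def next_factor(n):
    """Factor c_n with [V_{n+1}] = c_n [P_x V_n], from (51) for n = 0 and (52) for n >= 1."""
    # At coincidence: [P_x V_n] + ((n + 1) [Box sigma] + 2 n (n + 1)) [V_{n+1}] = 0.
    return Fraction(-1, (n + 1) * BOX_SIGMA + 2 * n * (n + 1))

print(v0_factor())     # prefactor of [P_x U] in [V_0]
print(next_factor(0))  # prefactor of [P_x V_0] in [V_1]
print(all(next_factor(n) == Fraction(-1, 2 * (n + 1) * (n + 2)) for n in range(50)))
```

Note that (51) is simply the n = 0 instance of the general coefficient $(n+1)\Box_x\sigma + 2n(n+1)$, which is why a single formula covers both cases.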
Let us remember that all these results can be obtained starting from $[U] = I_4$ and employing differential equations which only involve local curvature terms and the mass m. Thus, both U and V indeed only depend on these data and are independent of the state under consideration. This of course does not hold for W: as U and V are state-independent, the full state dependence must be encoded in W. It seems that we finally have all ingredients necessary to calculate the sought coinciding point limits used in Theorem 4.1. There is one potential feature of the Hadamard coefficients, however, which greatly simplifies calculations and should therefore be discussed before starting them, namely, their symmetry. Indeed, such a property has been proven in [56] for the scalar case, but, unfortunately, a similar result does not exist for Dirac fields, and even understanding the correct notion of "symmetry" in our framework is a rather challenging task. We shall leave the tantalizing endeavor to prove the symmetry of the Diracian Hadamard coefficients for possible future work and circumvent, for the time being, this gap with more explicit calculations. The following lemma will turn out to be rather useful in general and in the context of coping with the lack of (proven) symmetry in particular:

Lemma A.2. Given a smooth bitensor B(x, y) such that $\frac{B(x, y)}{\sigma^n(x, y)}$ is a smooth bitensor, it holds

$$\left[\frac{B}{\sigma^n}\right] = \frac{[\Box^n B]}{[\Box^n(\sigma^n)]}.$$
Proof. Since B and B/σ n are smooth, their coinciding point limits depend neither on y nor on the path along which one approaches x. Thus, we can apply de l’Hospital’s rule to our smooth bitensors restricted to arbitrary smooth curves, in particular coordinate curves, thereby expressing coinciding point limits of fractions
via such limits of fractions of covariant derivatives with respect to y, e.g.,

$$\left[\frac{B}{\sigma^n}\right] = \left[\frac{B_{;\mu'}}{(\sigma^n)_{;\mu'}}\right].$$

As the coinciding point limits of σ and of its first covariant derivative vanish, that of the kth covariant derivative of $\sigma^n$ also vanishes for all 0 ≤ k ≤ 2n − 1. Due to the smoothness of $B/\sigma^n$ and de l'Hospital's rule, the same must hold for the kth covariant derivative of B as well. Consequently, a multiple application of de l'Hospital's rule yields

$$\left[\frac{B}{\sigma^n}\right] = \left[\frac{B_{;\mu'_1\cdots\mu'_{2n}}}{(\sigma^n)_{;\mu'_1\cdots\mu'_{2n}}}\right] = \frac{[B_{;\mu'_1\cdots\mu'_{2n}}]}{[(\sigma^n)_{;\mu'_1\cdots\mu'_{2n}}]} = \frac{[B_{;\mu_1\cdots\mu_{2n}}]}{[(\sigma^n)_{;\mu_1\cdots\mu_{2n}}]},$$

where the third equality holds due to Synge's rule and the vanishing of lower order derivatives of both B and $\sigma^n$ in the coinciding point limit. Since the above equalities do not depend on the choice of $\mu_1 \cdots \mu_{2n}$, it holds (even for the vanishing components)

$$[B_{;\mu_1\cdots\mu_{2n}}] = \left[\frac{B}{\sigma^n}\right][(\sigma^n)_{;\mu_1\cdots\mu_{2n}}],$$

and appropriate contractions with the metric yield

$$[\Box^n B] = \left[\frac{B}{\sigma^n}\right][\Box^n(\sigma^n)],$$

which closes the proof. Note that $[\Box^n(\sigma^n)]$ can be expressed solely in terms of traces of the metric and is, in particular, non-vanishing; further applications of de l'Hospital's rule are thus not necessary.

The last tool worthy of mention to perform the calculations whose results we will display shortly is the computer. It should be clear at this point that there are a lot of recursion relations to solve to achieve the wished-for results. Thus, at the least as a means of backing up manual calculations, computer algebra systems are a valuable instrument. To this end, we have chosen to work with Mathematica and the free package [64], suitable for performing calculations with vector bundles. The codes we have used to implement the recursive procedures and coinciding point limits are available upon request from [email protected].

We can now finally state the main proposition of the appendix:

Proposition A.1. The Hadamard bidistribution H fulfills
(1) $[P_x H] = 6[V_1]$, $[(P_x H)_{;\mu}] = 8[V_{1;\mu}]$, $[(P_x H)_{;\mu'}] = -8[V_{1;\mu}] + 6[V_1]_{;\mu}$,
(2) $[P_y H] = 6[V_1]$, $[(P_y H)_{;\mu}] = 8[V_{1;\mu}] - 2[V_1]_{;\mu}$, $[(P_y H)_{;\mu'}] = -8[V_{1;\mu}] + 8[V_1]_{;\mu}$,
(3) $\mathrm{Tr}[D'_x D'_y H] = -\mathrm{Tr}[P_x H]$, $\mathrm{Tr}[(D'_x D'_y H)_{;\mu}] = -\mathrm{Tr}[(P_x H)_{;\mu}] + [V_1]_{;\mu}$, $\mathrm{Tr}[(D'_x D'_y H)_{;\mu'}] = -\mathrm{Tr}[(P_x H)_{;\mu'}] - [V_1]_{;\mu}$,
(4) $\mathrm{Tr}[(P_y H - P_x H)_{;\mu'}]\gamma^\mu\gamma_\nu = 2\,\mathrm{Tr}[V_1]_{;\nu}$.
Proof. (1) We shall employ (48)–(50). These data entail

$$P_x H = 2V_{1;\rho}\sigma^\rho + V_1(\Box_x\sigma + 2) + O(\sigma), \qquad (53)$$

and thus, taking the coinciding point limit and recalling the limits of σ computed in the previous section, $[P_x H] = 6[V_1]$. Similarly, differentiating (53) once and performing the limit, one gets $[(P_x H)_{;\mu}] = 8[V_{1;\mu}]$. By means of Synge's rule we finally have $[(P_x H)_{;\mu'}] = -8[V_{1;\mu}] + 6[V_1]_{;\mu}$.

(2) We would of course like to compute $P_y H$, but without any knowledge of the symmetries of the Diracian Hadamard coefficients, we have to verify the transport equations for $P_y$, which otherwise would follow automatically from those for $P_x$, as happens in the scalar case for the scalar Hadamard coefficients u and v [56]. To wit,

$$2U_{;\mu'}\sigma^{\mu'} + U(\Box_y\sigma - 4) = I\left(2u_{;\mu'}\sigma^{\mu'} + u(\Box_y\sigma - 4)\right) = 0,$$

where the first equality holds since the derivative of I vanishes along the geodesic connecting x and y and the second one holds since $u(x, y) = u(y, x)$ (see footnote m) and u is thus subject to transport equations for both $P_x$ and $P_y$. Since $P_y H$ is smooth, we now know that

$$Z_1 := \frac{Y_1}{\sigma} := \frac{P_y U + 2V_{0;\mu'}\sigma^{\mu'} + V_0(\Box_y\sigma - 2)}{\sigma}$$

must be smooth, too. Alas, it does not factorise into a term only involving the scalar coefficients u and v times I and, up to now, we are unaware of a way to prove that it is identically vanishing. But we can try to compute whether it vanishes up to the derivative order we need for our purposes. To this end, it helps to split V into $vI + \tilde{V}$, where $\tilde{V}$ is the non-trivial matrix part of V stemming from the spin curvature. This way one can separate from $Y_1$ a term which vanishes due to the transport equation for v, so that one has to cope with the remainder only. Involved calculations yield

$$[Y_1] = [Y_{1;\mu}] = [\Box Y_1] = [(\Box Y_1)_{;\mu}] = 0,$$

and thus, employing Lemma A.2, $[Z_1] = [Z_{1;\mu}] = 0$. Consequently,

$$P_y H = 2V_{1;\rho'}\sigma^{\rho'} + V_1(\Box_y\sigma + 2) + \text{terms vanishing in the limit}$$

and

$$(P_y H)_{;\mu} = 2V_{1;\rho'}\sigma^{\rho'}{}_{;\mu} + V_{1;\mu}(\Box_y\sigma + 2) + \text{terms vanishing in the limit}.$$

One can now straightforwardly obtain $[P_y H] = 6[V_1]$, $[(P_y H)_{;\mu}] = 8[V_{1;\mu}] - 2[V_1]_{;\mu}$, and, by Synge's rule, $[(P_y H)_{;\mu'}] = -8[V_{1;\mu}] + 8[V_1]_{;\mu}$.

Footnote m: The symmetry of u does not have to be proved in the same long way as that of v (see [56]), but it follows automatically from its explicit form $u(x, y) = \sqrt{\det(\sigma_{\mu\nu'}(x, y))\, \sqrt{|g(x)|}^{\,-1} \sqrt{|g(y)|}^{\,-1}}$.
(3) Let us define

$$Z_2 := (D_x - D'_y)H = (D_y - D'_x)H.$$

By direct inspection, $D'_x D'_y H = -P_x H - D'_x Z_2$. Again we know that $Z_2$ is smooth and, alas, neither this quantity nor $D'_x Z_2$ turns out to be vanishing. Luckily enough, we can still extract some useful results at the level of traced coinciding point limits, at an order of derivatives high enough for our purposes. One computes

$$Z_2 = -\frac{U(D_x - D'_y)\sigma}{\sigma^2} + \frac{(D_x - D'_y)U - V(D_x - D'_y)\sigma}{\sigma} + \ln(\sigma)(D_x - D'_y)V$$
$$=: -\frac{U(D_x - D'_y)\sigma}{\sigma^2} + \frac{Y_2}{\sigma} + \ln(\sigma)(D_x - D'_y)V. \qquad (54)$$

As already discussed, the last term vanishes identically and so does the first term on account of

$$U(D_x - D'_y)\sigma = -u(I\gamma^{\mu'}\sigma_{\mu'} + \gamma^\mu I\sigma_\mu) = -u(I\gamma^{\mu'}\sigma_{\mu'} + I\gamma^{\mu'}g^{\mu}{}_{\mu'}\sigma_\mu) = 0.$$

This leaves us with $Z_2 = Y_2/\sigma$. Involved computations, employing $(D_x - D'_y)V = P_x V = 0$ to exchange higher derivative terms with those of lower derivative order in the appearing commutators with γ-matrices, yield

$$[Y_2] = [Y_{2;\mu}] = [\Box Y_2] = 0, \qquad [(\Box Y_2)_{;\mu}] = 6[[V_1], \gamma_\mu].$$

After a few rearrangements and by means of Lemma A.2, one gets

$$[Z_2] = 0, \qquad [Z_{2;\mu}] = [[V_1], \gamma_\mu].$$

Hence, $[Z_{2;\mu}]$ is traceless due to the antisymmetry of the commutator. By means of formula (43), one can show by direct inspection that $\mathrm{Tr}[V_1]\gamma_\mu\gamma_\nu = \mathrm{Tr}[V_1]g_{\mu\nu}$, which entails that even $D'_x Z_2$ is traceless and, thus, $\mathrm{Tr}[D'_x D'_y H] = -\mathrm{Tr}[P_x H]$. In order to compute $\mathrm{Tr}[(D'_x D'_y H)_{;\mu}]$ and $\mathrm{Tr}[(D'_x D'_y H)_{;\mu'}]$, let us consider that $D'_y Z_2 = D'_x Z_2 + P_x H - P_y H$. Employing this as well as the previous results and tricks we have discussed in this proof, one obtains the following chain of
identities Tr[(Dx Z2 );µ ] = Tr γ ν [Z2;νµ ] = −Tr γ ν [Z2;ν µ ] + Tr γ ν [Z2;µ ]ν = Tr[(Dy Z2 );µ ] = Tr[(Dx Z2 );µ ] − Tr[(Py H − Px H);µ ] 1 = − Tr[(Py H − Px H);µ ] = Tr[V1 ];µ 2 = −Tr[(Dy Z2 );µ ].
(55)
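We can finally use this last calculation to obtain

$$\mathrm{Tr}[(D'_x D'_y H)_{;\mu}] = -\mathrm{Tr}[(P_x H)_{;\mu}] + [V_1]_{;\mu}, \qquad \mathrm{Tr}[(D'_x D'_y H)_{;\mu'}] = -\mathrm{Tr}[(P_x H)_{;\mu'}] - [V_1]_{;\mu}.$$

(4) Inserting the previous results, we have $[(P_y H - P_x H)_{;\nu'}] = 2[V_1]_{;\nu}$. As already discussed, due to γ-matrix identities, tracing $[V_1]$ with two γ-matrices amounts to a multiplication with the metric. Since the operations of trace and covariant derivation commute, we have $\mathrm{Tr}[V_1]_{;\nu}\gamma^\nu\gamma_\mu = \mathrm{Tr}[V_1]_{;\mu}$ and thus $\mathrm{Tr}[(P_y H - P_x H)_{;\nu'}]\gamma^\nu\gamma_\mu = 2\,\mathrm{Tr}[V_1]_{;\mu}$.

We would like to conclude this section by stating the last ingredient necessary for proving Theorem 4.1, the coinciding point limit of $V_1$, viz.,

$$[V_1] = \left(\frac{m^4}{8} + \frac{m^2 R}{48} + \frac{R^2}{1152} + \frac{\Box R}{480} - \frac{R_{\alpha\beta}R^{\alpha\beta}}{720} + \frac{R_{\alpha\beta\gamma\delta}R^{\alpha\beta\gamma\delta}}{720}\right) I_4 + \frac{C_{\alpha\beta}C^{\alpha\beta}}{48}.$$

A.4. Conserved local curvature tensors

The explicit form of the conserved local curvature tensors spanning the regularization freedom of the expected stress-energy tensor is:

$$I_{\mu\nu} := \frac{1}{\sqrt{|g|}} \frac{\delta}{\delta g_{\mu\nu}} \int_M R^2\, d\mu_g = g_{\mu\nu}\left(\frac{1}{2}R^2 + 2\Box R\right) - 2R_{;\mu\nu} - 2RR_{\mu\nu},$$

$$J_{\mu\nu} := \frac{1}{\sqrt{|g|}} \frac{\delta}{\delta g_{\mu\nu}} \int_M R_{\alpha\beta}R^{\alpha\beta}\, d\mu_g = \frac{1}{2} g_{\mu\nu}(R_{\alpha\beta}R^{\alpha\beta} + \Box R) - R_{;\mu\nu} + \Box R_{\mu\nu} - 2R_{\alpha\beta}R^{\alpha}{}_{\mu}{}^{\beta}{}_{\nu},$$

$$K_{\mu\nu} := \frac{1}{\sqrt{|g|}} \frac{\delta}{\delta g_{\mu\nu}} \int_M R_{\alpha\beta\gamma\delta}R^{\alpha\beta\gamma\delta}\, d\mu_g = -\frac{1}{2} g_{\mu\nu}R_{\alpha\beta\gamma\delta}R^{\alpha\beta\gamma\delta} + 2R_{\alpha\beta\gamma\mu}R^{\alpha\beta\gamma}{}_{\nu} + 4R_{\alpha\beta}R^{\alpha}{}_{\mu}{}^{\beta}{}_{\nu} - 4R_{\alpha\mu}R^{\alpha}{}_{\nu} - 4\Box R_{\mu\nu} + 2R_{;\mu\nu}.$$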
As already stated, in four spacetime dimensions, these are related as Kµν = Iµν −4Jµν via the generalized Gauss–Bonnet–Chern theorem [1,73]. Furthermore, in this case, they all have a trace proportional to R and, thus, the linear combination Iµν − 3Jµν is traceless. References [1] L. J. Alty, The generalized Gauss–Bonnet–Chern theorem, J. Math. Phys. 36 (1995) 3094–3105. [2] H. Araki, On quasifree states of CAR and Bogoliubov automorphisms, Publ. RIMS Kyoto Univ. 6 (1970/71) 385–442. [3] H. Araki, Mathematical Theory of Quantum Fields, International Series of Monographs on Physics, Vol. 101 (Oxford University Press, 1999). [4] C. B¨ ar, N. Ginoux and F. Pf¨ affle, Wave Equations on Lorentzian Manifolds and Quantization, ESI Lectures in Mathematics and Physics (European Mathematical Society, 2007). [5] A. O. Barut and R. Raczka, Theory of Group Representations and Applications (World Scientific, 1986). [6] A. N. Bernal and M. Sanchez, Smoothness of time functions and the metric splitting of globally hyperbolic spacetimes, Comm. Math. Phys. 257 (2005) 43–50; arXiv: gr-qc/0401112. [7] A. N. Bernal and M. Sanchez, Further results on the smoothability of Cauchy hypersurfaces and Cauchy time functions, Lett. Math. Phys. 77 (2006) 183–197; arXiv: gr-qc/0512095. [8] A. Borel and F. Hirzebruch, Characteristic classes and homgeneous spaces. I, Amer. J. Math. 80 (1958) 458–538. [9] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics. Vol. 2: Equilibrium States. Models in Quantum Statistical Mechanics (Springer, 1996). [10] M. R. Brown and A. C. Ottewill, Photon propagators and the definition and approximation of renormalized stress tensors in curved space-time, Phys. Rev. D 34 (1986) 1776–1786. [11] R. Brunetti, M. Duetsch and K. Fredenhagen, Perturbative algebraic quantum field theory and the renormalization groups; arXiv:0901.2038[math-ph]. [12] R. Brunetti, K. Fredenhagen and M. 
K¨ ohler, The microlocal spectrum condition and Wick polynomials of free fields on curved spacetimes, Comm. Math. Phys. 180 (1996) 633–652; arXiv:gr-qc/9510056. [13] R. Brunetti and K. Fredenhagen, Microlocal analysis and interacting quantum field theories: Renormalization on physical backgrounds, Comm. Math. Phys. 208 (2000) 623–661; arXiv:math-ph/9903028. [14] R. Brunetti and K. Fredenhagen, Quantum field theory on curved backgrounds; arXiv:0901.2063[gr-qc]. [15] R. Brunetti, K. Fredenhagen and R. Verch, The generally covariant locality principle: A new paradigm for local quantum physics, Comm. Math. Phys. 237 (2003) 31–68; arXiv:math-ph/0112041. [16] B. Chilian and K. Fredenhagen, The time slice axiom in perturbative quantum field theory on globally hyperbolic spacetimes, Comm. Math. Phys. 287 (2009) 513–522; arXiv:0802.1642[math-ph]. [17] S. M. Christensen, Vacuum expectation value of the stress tensor in an arbitrary curved background: The covariant point separation method, Phys. Rev. D 14 (1976) 2490–2501.
[18] S. M. Christensen, Regularization, renormalization, and covariant geodesic point separation, Phys. Rev. D 17 (1978) 946–963. [19] S. M. Christensen and M. J. Duff, New gravitational index theorems and supertheorems, Nucl. Phys. B 154 (1979) 301–342. [20] S. P. Dawson and C. J. Fewster, An explicit quantum weak energy inequality for Dirac fields in curved spacetimes, Class. Quant. Grav. 23 (2006) 6659–6681. [21] C. D’Antoni and S. Hollands, Nuclearity, local quasiequivalence and split property for Dirac quantum fields in curved spacetime, Comm. Math. Phys. 261 (2006) 133– 159; arXiv:math-ph/0106028. [22] N. Dencker, On the propagation of polarization sets for systems of real principal type, J. Funct. Anal. 46 (1982) 351–372. [23] B. S. DeWitt and R. W. Brehme, Radiation damping in a gravitational field, Ann. Phys. 9 (1960) 220–259. [24] C. Dappiaggi, K. Fredenhagen and N. Pinamonti, Stable cosmological models driven by a free quantum scalar field, Phys. Rev. D 77 (2008) 104015, 8 pp.; arXiv:0801.2850[gr-qc]. [25] J. Dimock, Dirac quantum fields on a manifold, Trans. Amer. Math. Soc. 269 (1982) 133–147 [26] J. J. Duistermaat and L. H¨ ormander, Fourier integral operators II, Acta Math. 128 (1972) 183–269. [27] C. J. Fewster and R. Verch, A quantum weak energy inequality for Dirac fields in curved spacetime, Comm. Math. Phys. 225 (2002) 331–359; arXiv:math-ph/ 0105027. [28] E. E. Flanagan and R. M. Wald, Does backreaction enforce the averaged null energy condition in semiclassical gravity? Phys. Rev. D 54 (1996) 6233–6283; arXiv:gr-qc/ 9602052. [29] L. H. Ford, Quantum instability of de Sitter space-time, Phys. Rev. D 31 (1985) 710–717. [30] M. Forger and H. R¨ omer, Currents and the energy-momentum tensor in classical field theory: A fresh look at an old problem, Ann. Phys. 309 (2004) 306–389; arXiv:hep-th/0307199. [31] S. A. Fulling, F. J. Narcowich and R. M. Wald, Singularity structure of the two point function In quantum field theory in curved space-time. 
II, Ann. Phys. 136 (1981) 243–272. [32] S. A. Fulling, Aspects of Quantum Field Theory in Curved Spacetime, London Math. Soc. Student Texts, Vol. 17 (Cambridge University Press, 1989). [33] R. P. Geroch, Spinor structure of space-times in general Relativity. I, J. Math. Phys. 9 (1968) 1739–1744. [34] R. P. Geroch, Spinor structure of space-times in general relativity. II, J. Math. Phys. 11 (1970) 343–348. [35] R. Haag, Local Quantum Physics: Fields, Particles, Algebras, Texts and Monographs in Physics (Springer, 1992). [36] B. C. Hall, Lie Groups, Lie Algebras, and Representations (Springer, 2000). [37] S. Hollands, The Hadamard condition for Dirac fields and adiabatic states on Robertson–Walker spacetimes, Comm. Math. Phys. 216 (2001) 635–661; arXiv:grqc/9906076. [38] S. Hollands and W. Ruan, The state space of perturbative quantum field theory in curved space times, Ann. Henri Poincar´e 3 (2002) 635–657. [39] S. Hollands and R. M. Wald, Local Wick polynomials and time ordered products of quantum fields in curved spacetime, Comm. Math. Phys. 223 (2001) 289–326.
[40] S. Hollands and R. M. Wald, Existence of local covariant time ordered products of quantum fields in curved spacetime, Comm. Math. Phys. 231 (2002) 309–345; arXiv:gr-qc/0111108.
[41] S. Hollands and R. M. Wald, On the renormalization group in curved spacetime, Comm. Math. Phys. 237 (2003) 123–160; arXiv:gr-qc/0209029.
[42] S. Hollands and R. M. Wald, Conservation of the stress tensor in interacting quantum field theory in curved spacetimes, Rev. Math. Phys. 17 (2005) 227–311; arXiv:gr-qc/0404074.
[43] L. Hörmander, The Analysis of Linear Partial Differential Operators I (Springer, 2000).
[44] L. Hörmander, The Analysis of Linear Partial Differential Operators III (Springer, 1994).
[45] D. Husemoller, Fibre Bundles, 3rd edn. (Springer-Verlag, 1996).
[46] W. Junker and E. Schrohe, Adiabatic vacuum states on general spacetime manifolds: Definition, construction, and physical properties, Ann. Henri Poincaré 3 (2002) 1113–1181.
[47] B. S. Kay and R. M. Wald, Theorems on the uniqueness and thermal properties of stationary, nonsingular, quasifree states on space-times with a bifurcate Killing horizon, Phys. Rept. 207 (1991) 49–136.
[48] K. Kratzert, Singularity structure of the two point function of the free Dirac field on a globally hyperbolic spacetime, Ann. Phys. 9 (2000) 475–498; arXiv:math-ph/0003015.
[49] M. Köhler, The stress-energy tensor of a locally supersymmetric quantum field theory on a curved spacetime, Ph.D. Thesis, Universität Hamburg (1995).
[50] H. B. Lawson and M.-L. Michelsohn, Spin Geometry (Princeton University Press, 1989).
[51] A. Lichnerowicz, Champs spinoriels et propagateurs en relativité générale, Bull. Soc. Math. France 92 (1964) 11–100.
[52] A. D. Linde, The new inflationary universe scenario, in The Very Early Universe (Cambridge, 1982) (Cambridge University Press, 1983), pp. 205–249.
[53] A. D. Linde, Quantum creation of the inflationary universe, Lett. Nuovo Cimento 39 (1984) 401–405.
[54] C. Lüders and J. E. Roberts, Local quasiequivalence and adiabatic vacuum states, Comm. Math. Phys. 134 (1990) 29–63.
[55] V. Moretti, Proof of the symmetry of the off-diagonal heat-kernel and Hadamard’s expansion coefficients in general C^∞ Riemannian manifolds, Comm. Math. Phys. 208 (1999) 283–308; arXiv:gr-qc/9902034.
[56] V. Moretti, Proof of the symmetry of the off-diagonal Hadamard/Seeley–deWitt’s coefficients in C^∞ Lorentzian manifolds by a ‘local Wick rotation’, Comm. Math. Phys. 212 (2000) 165–189; arXiv:gr-qc/9908068.
[57] V. Moretti, Comments on the stress-energy tensor operator in curved spacetime, Comm. Math. Phys. 232 (2003) 189–221.
[58] A. H. Najmi and A. D. Ottewill, Quantum states and the Hadamard form. II. Energy minimization for spin 1/2 fields, Phys. Rev. D 30 (1984) 2573–2578.
[59] H. Olbermann, States of low energy on Robertson–Walker spacetimes, Class. Quantum. Grav. 24 (2007) 5011–5030.
[60] L. Parker and D. J. Toms, Renormalization group analysis of grand unified theories in curved space-time, Phys. Rev. D 29 (1984) 1584–1608.
[61] E. Poisson, The motion of point particles in curved spacetime, Living Rev. Rel. 7 (2004) 6; arXiv:gr-qc/0306052.
[62] M. J. Radzikowski, Micro-local approach to the Hadamard condition in quantum field theory on curved space-time, Comm. Math. Phys. 179 (1996) 529–553.
[63] M. J. Radzikowski, A local to global singularity theorem for quantum field theory on curved space-time, Comm. Math. Phys. 180 (1996) 1–22.
[64] J. M. Lee, Ricci - A Mathematica package for doing tensor calculations in differential geometry; www.math.washington.edu/~lee/Ricci.
[65] H. Sahlmann and R. Verch, Microlocal spectrum condition and Hadamard form for vector valued quantum fields in curved space-time, Rev. Math. Phys. 13 (2001) 1203–1246; arXiv:math-ph/0008029.
[66] J. A. Sanders, Aspects of locally covariant quantum field theory, Ph.D. Thesis (University of York, 2008); arXiv:0809.4828[math-ph].
[67] K. Sanders, Equivalence of the (generalised) Hadamard and microlocal spectrum condition for (generalised) free fields in curved spacetime, arXiv:0903.1021[math-ph].
[68] H. H. Schaefer, Topological Vector Spaces (Springer, 1999).
[69] G. Scharf, Finite Quantum Electrodynamics: The Causal Approach. Texts and Monographs in Physics, 2nd edn. (Springer-Verlag, 1995).
[70] A. A. Starobinsky, A new type of isotropic cosmological models without singularity, Phys. Lett. B 91 (1980) 99–102.
[71] A. Strohmaier, R. Verch and M. Wollenberg, Microlocal analysis of quantum fields on curved spacetimes: Analytic wavefront sets and Reeh-Schlieder theorems, J. Math. Phys. 43 (2002) 5514–5530; arXiv:math-ph/0202003.
[72] S. Tadaki, Hadamard regularization and conformal transformation, Progr. Theoret. Phys. 81 (1989) 891–903.
[73] G. ’t Hooft and M. J. G. Veltman, One loop divergencies in the theory of gravitation, Ann. Inst. H. Poincaré Sect. A (N.S.) 20 (1974) 69–94.
[74] R. Verch, Local definiteness, primarity and quasiequivalence of quasifree Hadamard quantum states in curved space-time, Comm. Math. Phys. 160 (1994) 507–536.
[75] R. Verch, Scaling analysis and ultraviolet behaviour of quantum field theories in curved spacetimes, Ph.D. Thesis, Universität Hamburg (1996).
[76] R. Verch, A spin-statistics theorem for quantum fields on curved spacetime manifolds in a generally covariant framework, Comm. Math. Phys. 223 (2001) 261–288; arXiv:math-ph/0102035.
[77] A. Vilenkin, Classical and quantum cosmology of the Starobinsky inflationary model, Phys. Rev. D 32 (1985) 2511–2521.
[78] R. M. Wald, The back reaction effect in particle creation in curved space-time, Comm. Math. Phys. 54 (1977) 1–19.
[79] R. M. Wald, Trace anomaly of a conformally invariant quantum field in curved space-time, Phys. Rev. D 17 (1978) 1477–1484.
[80] R. M. Wald, On the Euclidean approach to quantum field theory in curved spacetime, Comm. Math. Phys. 70 (1979) 221–242.
[81] R. M. Wald, General Relativity (Chicago University Press, 1984).
[82] R. M. Wald, Quantum Field Theory in Curved Space-Time and Black Hole Thermodynamics, Chicago Lectures in Physics (University of Chicago Press, 1994).
November 19, 2009 13:7 WSPC/148-RMP J070-00389
Reviews in Mathematical Physics Vol. 21, No. 10 (2009) 1313–1315 © World Scientific Publishing Company
REVIEWS IN MATHEMATICAL PHYSICS Author Index Volume 21 (2009)
Aftalion, A. & Helffer, B., On mathematical models for Bose–Einstein condensates in optical lattices, 2 (2009) 229
Bates, L., Cushman, R., Hamilton, M. & Śniatycki, J., Quantization of singular reduction, 3 (2009) 315
Bojowald, M., Sandhöfer, B., Skirzewski, A. & Tsobanjan, A., Effective constraints for quantum systems, 1 (2009) 111
Breuer, J. & Frank, R. L., Singular spectrum for radial trees, 7 (2009) 929
Ciolli, F., Massless scalar free field in 1+1 dimensions I: Weyl algebras products and superselection sectors, 6 (2009) 735
Cuevas, J., see James, G., 1 (2009) 1
Cushman, R., see Bates, L., 3 (2009) 315
Dappiaggi, C., Hack, T.-P. & Pinamonti, N., The extended algebra of observables for Dirac fields and the trace anomaly of their stress-energy tensor, 10 (2009) 1241
De Roeck, W., Large deviation generating function for currents in the Pauli–Fierz model, 4 (2009) 549
Evans, D. E. & Pugh, M., SU(3)-Goodman–de la Harpe–Jones subfactors and the realization of SU(3) modular invariants, 7 (2009) 877
Frank, R. L., see Breuer, J., 7 (2009) 929
Froese, R., Hasler, D. & Spitzer, W., Absolutely continuous spectrum for a random potential on a tree with strong transverse correlations and large weighted loops, 6 (2009) 709
Fröhlich, J., Griesemer, M. & Sigal, I. M., On spectral renormalization group, 4 (2009) 511
Gérard, C. & Panati, A., Spectral and scattering theory for some abstract QFT Hamiltonians, 3 (2009) 373
Germinet, F., Klein, A. & Schenker, J. H., Quantization of the Hall conductance and delocalization in ergodic Landau Hamiltonians, 8 (2009) 1045
Griesemer, M., see Fröhlich, J., 4 (2009) 511
Grundling, H. & Neeb, K.-H., Full regularity for a C*-algebra of the canonical commutation relations, 5 (2009) 587
Hack, T.-P., see Dappiaggi, C., 10 (2009) 1241
Hagedorn, G. A. & Joye, A., A mathematical theory for vibrational levels associated with hydrogen bonds II: The non-symmetric case, 2 (2009) 279
Hamilton, M., see Bates, L., 3 (2009) 315
Hasler, D., see Froese, R., 6 (2009) 709
Helffer, B., see Aftalion, A., 2 (2009) 229
Hu, T., Wang, G., Sun, C., Zhou, C., Wang, Q. & Xue, K., Method of constructing braid group representation and entanglement in a 9 × 9 Yang–Baxter system, 9 (2009) 1081
Jaffe, A. & Moser, D., Replica condensation and tree decay, 3 (2009) 439
James, G., Sánchez-Rey, B. & Cuevas, J., Breathers in inhomogeneous nonlinear lattices: An analysis via center manifold reduction, 1 (2009) 1
Joye, A., see Hagedorn, G. A., 2 (2009) 279
Kashima, Y., A rigorous treatment of the perturbation theory for many-electron systems, 8 (2009) 981
Kiessling, M. K.-H., The Vlasov continuum limit for the classical microcanonical ensemble, 9 (2009) 1145
Klein, A., see Germinet, F., 8 (2009) 1045
Kopper, C. & Müller, V. F., Renormalization of spontaneously broken SU(2) Yang–Mills theory with flow equations, 6 (2009) 781
Krüger, H. & Teschl, G., Long-time asymptotics of the Toda lattice for decaying initial data revisited, 1 (2009) 61
Long, E. & Stuart, D., Effective dynamics for solitons in the nonlinear Klein–Gordon–Maxwell system and the Lorentz force law, 4 (2009) 459
Moser, D., see Jaffe, A., 3 (2009) 439
Müller, V. F., see Kopper, C., 6 (2009) 781
Neeb, K.-H., see Grundling, H., 5 (2009) 587
Oikonomou, V. K., Report on the detailed calculation of the effective potential in spacetimes with S^1 × R^d topology and at finite temperature, 5 (2009) 615
Panati, A., see Gérard, C., 3 (2009) 373
Pinamonti, N., see Dappiaggi, C., 10 (2009) 1241
Pugh, M., see Evans, D. E., 7 (2009) 877
Reis, R. M. G., Szabo, R. J. & Valentino, A., KO-homology and Type I string theory, 9 (2009) 1091
Sánchez-Rey, B., see James, G., 1 (2009) 1
Sandhöfer, B., see Bojowald, M., 1 (2009) 111
Sati, H., Schreiber, U. & Stasheff, J., Fivebrane structures, 10 (2009) 1197
Saussol, B., An introduction to quantitative Poincaré recurrence in dynamical systems, 8 (2009) 949
Schenker, J. H., see Germinet, F., 8 (2009) 1045
Schreiber, U., see Sati, H., 10 (2009) 1197
Sigal, I. M., see Fröhlich, J., 4 (2009) 511
Skirzewski, A., see Bojowald, M., 1 (2009) 111
Śniatycki, J., see Bates, L., 3 (2009) 315
Speck, J., The non-relativistic limit of the Euler–Nordström system with cosmological constant, 7 (2009) 821
Spitzer, W., see Froese, R., 6 (2009) 709
Stasheff, J., see Sati, H., 10 (2009) 1197
Stuart, D., see Long, E., 4 (2009) 459
Sun, C., see Hu, T., 9 (2009) 1081
Szabo, R. J., see Reis, R. M. G., 9 (2009) 1091
Teschl, G., see Krüger, H., 1 (2009) 61
Tiedra de Aldecoa, R., Time delay for dispersive systems in quantum scattering theory, 5 (2009) 675
Tsobanjan, A., see Bojowald, M., 1 (2009) 111
Tumulka, R., The point processes of the GRW theory of wave function collapse, 2 (2009) 155
Valentino, A., see Reis, R. M. G., 9 (2009) 1091
Wang, G., see Hu, T., 9 (2009) 1081
Wang, Q., see Hu, T., 9 (2009) 1081
Xue, K., see Hu, T., 9 (2009) 1081
Zhou, C., see Hu, T., 9 (2009) 1081